Big data development and data warehouse development both require programming, and big data analytics involves programming too, though for pure analysis you can get by with just Hive and SQL.
Which of these matters most depends on the direction you want to take. Here are the technologies I use most often in my day-to-day work (rough code sketches for several of them follow the list):
1. Java doesn't have to go deep; thoroughly mastering the JavaSE part is enough (see the first sketch after this list).
2. The Hadoop ecosystem: understand the underlying principles of YARN, ZooKeeper, and HDFS (an HDFS client sketch follows the list).
3. MapReduce and Spark development (a Spark word-count sketch follows the list).
4. HBase and Hive: if you work in big data without knowing these, you really can't call yourself a big data developer (a basic HBase read/write sketch follows the list).
5. MySQL, Oracle, and Postgres: know how to operate these databases and how to write SQL (a JDBC sketch follows the list).
6. The Linux operating system: you must know the basic commands, and being able to write shell scripts is even better.
7. Data processing tools such as Kettle or Sqoop: know at least one of them.
8. Spark SQL and Spark Streaming: the underlying principles, the internals, the task submission process, and so on; try to dig as deep as you can (a small Spark SQL sketch follows the list). Of course, you also need to understand Storm and Flink; Flink is getting hotter and hotter these days.
9. Redis, Kafka, and ElasticSearch: you have to understand them, use them, operate them, and tune them (a Kafka producer sketch follows the list).
10. Impala and Kylin: try to understand these and be able to use them as well.
11. Python: if you have the ability and the energy, I recommend learning it in depth as well; I'm currently teaching myself.
12. Cluster issues, including some basic operations and maintenance knowledge.
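For item 1, here is a minimal sketch of what "eating through JavaSE" looks like in practice: collections, generics, lambdas, and streams, with no framework involved. The word list is made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class JavaSeDemo {
    public static void main(String[] args) {
        // Word count using only core JavaSE: collections, generics, lambdas, streams.
        List<String> words = Arrays.asList("spark", "hive", "spark", "kafka", "hive", "spark");
        Map<String, Long> counts = words.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        counts.forEach((word, count) -> System.out.println(word + " -> " + count));
    }
}
```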
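For item 2, a minimal HDFS client sketch using the standard Hadoop FileSystem API, assuming the Hadoop client libraries are on the classpath. The NameNode address and file path are placeholders, not real cluster settings.

```java
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // hdfs://localhost:9000 is a placeholder; point it at your NameNode.
        conf.set("fs.defaultFS", "hdfs://localhost:9000");
        try (FileSystem fs = FileSystem.get(conf)) {
            Path path = new Path("/tmp/hello.txt");
            // Write a small file, then read its status back.
            try (FSDataOutputStream out = fs.create(path, true)) {
                out.write("hello hdfs".getBytes(StandardCharsets.UTF_8));
            }
            System.out.println("File size: " + fs.getFileStatus(path).getLen());
        }
    }
}
```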
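For item 3, the classic word count expressed with the Spark Java API (Spark 2.x or later, where flatMap returns an Iterator). The input path is a placeholder; local[*] runs everything in-process, which is fine for learning.

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class WordCount {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("WordCount").setMaster("local[*]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            // input.txt is a placeholder path.
            JavaRDD<String> lines = sc.textFile("input.txt");
            JavaPairRDD<String, Integer> counts = lines
                    // Split each line into words, emit (word, 1), then sum per word.
                    .flatMap(line -> Arrays.asList(line.split("\\s+")).iterator())
                    .mapToPair(word -> new Tuple2<>(word, 1))
                    .reduceByKey(Integer::sum);
            counts.collect().forEach(t -> System.out.println(t._1 + "\t" + t._2));
        }
    }
}
```

The same map-then-reduce-by-key shape is exactly what a hand-written MapReduce job does with a Mapper and Reducer class, just with far more boilerplate.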
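For item 4, a basic HBase read/write sketch with the standard Java client. The ZooKeeper quorum is a placeholder, and the `user` table with column family `info` is assumed to exist already.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // localhost is a placeholder; set your real ZooKeeper quorum.
        conf.set("hbase.zookeeper.quorum", "localhost");
        try (Connection conn = ConnectionFactory.createConnection(conf);
             // Assumes a table 'user' with column family 'info' already exists.
             Table table = conn.getTable(TableName.valueOf("user"))) {
            // Write one cell, then read it back by row key.
            Put put = new Put(Bytes.toBytes("row1"));
            put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("alice"));
            table.put(put);

            Result result = table.get(new Get(Bytes.toBytes("row1")));
            byte[] value = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"));
            System.out.println(Bytes.toString(value));
        }
    }
}
```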
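For item 5, a plain JDBC sketch. The connection URL, credentials, and the `orders` table are all made up for illustration; swapping the JDBC URL and driver jar is essentially all it takes to point the same code at MySQL, Oracle, or Postgres.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class JdbcDemo {
    public static void main(String[] args) throws Exception {
        // URL, credentials, and the orders table are placeholders.
        String url = "jdbc:mysql://localhost:3306/test";
        try (Connection conn = DriverManager.getConnection(url, "root", "password");
             PreparedStatement ps = conn.prepareStatement(
                     "SELECT user_id, COUNT(*) AS cnt FROM orders " +
                     "WHERE status = ? GROUP BY user_id ORDER BY cnt DESC")) {
            // Bind the parameter instead of concatenating strings into SQL.
            ps.setString(1, "paid");
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("user_id") + " -> " + rs.getLong("cnt"));
                }
            }
        }
    }
}
```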
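For item 8, a small Spark SQL sketch: load a file into a DataFrame, register it as a temporary view, and query it with SQL. The `users.json` file is a placeholder (one JSON object per line).

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SparkSqlDemo {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("SparkSqlDemo")
                .master("local[*]")
                .getOrCreate();
        // users.json is a placeholder: one JSON object per line, e.g. {"name":"a","age":30}
        Dataset<Row> users = spark.read().json("users.json");
        users.createOrReplaceTempView("users");
        // Regular SQL runs against the registered view.
        Dataset<Row> adults = spark.sql("SELECT name, age FROM users WHERE age >= 18");
        adults.show();
        spark.stop();
    }
}
```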
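For item 9, a minimal Kafka producer using the official Java client. The broker address and the `events` topic are placeholders; in real use you'd also set acks, retries, and so on for the tuning the item mentions.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProducerDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        // localhost:9092 and the 'events' topic are placeholders.
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Sends are asynchronous; flush() forces delivery before the producer closes.
            producer.send(new ProducerRecord<>("events", "key1", "hello kafka"));
            producer.flush();
        }
    }
}
```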
In my day-to-day work, the ones I touch most often are MapReduce, Spark, Kafka, HBase, Hive, ES, and database operations; knowing these common ones is enough to land a job.
You don't have to master every single one of them before you can start working.