Java: You only need the basics. Big data work does not require deep Java expertise; learning Java SE gives you enough of a foundation to start on big data.
Linux: Big data software all runs on Linux, so learn Linux solidly. A good grasp of Linux will help you pick up big data technologies quickly and understand the runtime and network environment configuration of Hadoop, Hive, HBase, Spark, and similar software, so you hit far fewer pitfalls. Learn shell as well: once you can read scripts, big data clusters become easier to understand and configure.
Hadoop: This popular big data processing platform has almost become synonymous with big data, so it is a must-learn.
Zookeeper: This is a jack-of-all-trades. You will use it when setting up Hadoop HA, and HBase will use it later as well.
Mysql: Having covered big data processing, next learn a tool for small data, the MySQL database, because you will need it shortly when installing Hive. How well do you need to know MySQL? Enough to install it on Linux, get it running, configure simple permissions, change the root password, and create a database.
Sqoop: this is used to import data from Mysql into Hadoop.
Hive: This is a godsend for anyone who knows SQL syntax; it makes handling big data easy.
Oozie: Now that you've learned Hive, you will likely need this. It helps you manage your Hive, MapReduce, and Spark scripts, and it can also check whether your programs executed correctly.
Hbase: This is the NoSQL database of the Hadoop ecosystem. Its data is stored as key-value pairs and each key is unique, so it can be used for data deduplication, and it can hold far more data than MySQL.
Kafka: This is a fairly good message queue tool.
Spark: It makes up for the slow data processing speed of MapReduce.
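
If you have never seen MapReduce, which Hadoop, Hive, and Spark above all build on or react to, here is a minimal sketch of the idea in plain Python. It is only an illustration, not Hadoop's API: the function names map_phase, shuffle_phase, reduce_phase, and word_count are made up for this example, and everything runs in memory on one machine. A real Hadoop job spreads these phases across a cluster and writes intermediate results to disk between them, which is a large part of the speed gap Spark narrows by keeping data in memory.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the grouped values to get one count per word."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Chain the three phases (hypothetical helper, single process only)."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    sample = ["hive runs on hadoop", "spark runs on hadoop too"]
    print(word_count(sample))
```

Word count is the usual first exercise because each phase is easy to see on its own: map produces pairs, shuffle groups them by key, and reduce collapses each group to one value.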