What software do you need to learn for big data?

Java: Understanding the basics is enough. Big data work does not require very deep Java expertise, and learning Java SE is sufficient for learning big data.

Linux: Big data software runs on Linux, so build a solid Linux foundation. Knowing Linux well helps you pick up big data technologies quickly and gives you a better understanding of the runtime and network environment configuration of Hadoop, Hive, HBase, Spark and other big data software, so you hit far fewer pitfalls. Once you learn shell you can read scripts, which makes it easier to understand and configure big data clusters.

Hadoop: This popular big data processing platform has become almost synonymous with big data, so it is a must-learn.
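Hadoop's classic processing model is MapReduce, which comes up again further down this list. As a rough illustration of the idea only, here is a minimal single-machine sketch in plain Python. It does not use Hadoop itself, and the function names (map_phase, shuffle_phase, reduce_phase, word_count) are made up for this example.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group emitted values by key (the step that sits between map and reduce)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines):
    """Chain the three phases; a real cluster spreads them across many machines."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    sample = ["big data on hadoop", "hadoop runs on linux"]
    print(word_count(sample))
```

On a real cluster the map and reduce steps run in parallel on many nodes over data stored in HDFS; the sketch only shows how the work is divided.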

Zookeeper: This is a jack-of-all-trades. You will use it when installing Hadoop HA, and HBase will use it later as well.

MySQL: Having covered tools for processing big data, next learn a tool for processing small data, the MySQL database, because you will need it shortly when installing Hive. How well do you need to know MySQL? Enough to install it on Linux, get it running, configure simple permissions, change the root password, and create a database.

Sqoop: This is used to import data from MySQL into Hadoop.

Hive: This is a godsend for anyone who knows SQL syntax, because it makes handling big data easy.
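To give a feel for the kind of SQL involved, the sketch below runs a simple aggregation with Python's built-in sqlite3 module. SQLite is only a stand-in here so the example can run anywhere; it is not Hive, and the orders table, its columns, and the function name are hypothetical. A query of this shape is the sort of thing you would write in Hive, where it runs as distributed jobs on the cluster instead of on one machine.

```python
import sqlite3


def total_amount_per_user(rows):
    """Load (user_id, amount) rows into a table and total them per user with SQL."""
    conn = sqlite3.connect(":memory:")  # in-memory stand-in, not a Hive connection
    conn.execute("CREATE TABLE orders (user_id TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    query = """
        SELECT user_id, SUM(amount) AS total
        FROM orders
        GROUP BY user_id
        ORDER BY user_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    sample = [("alice", 10.0), ("bob", 5.0), ("alice", 2.5)]
    for user_id, total in total_amount_per_user(sample):
        print(user_id, total)
```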

Oozie: Now that you have learned Hive, you will almost certainly need this. It helps you manage your Hive, MapReduce, and Spark scripts, and it can also check whether your programs executed correctly.

HBase: This is the NoSQL database of the Hadoop ecosystem. Its data is stored as key-value pairs and each key is unique, so it can be used to deduplicate data. It can also store far more data than MySQL.
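The effect of unique keys can be sketched with a tiny in-memory table. This is plain Python, not the HBase client API; the KeyValueTable class and the deduplicate function are hypothetical names invented for the illustration. Writing the same key twice leaves a single entry, and reading the table back in key order loosely mirrors how HBase keeps rows sorted by row key.

```python
class KeyValueTable:
    """Toy key-value table: one value per unique row key (not the real HBase API)."""

    def __init__(self):
        self._rows = {}

    def put(self, row_key, value):
        # A second put with the same key overwrites the first, so keys stay unique.
        self._rows[row_key] = value

    def get(self, row_key):
        return self._rows.get(row_key)

    def scan(self):
        # Return rows in sorted row-key order.
        return sorted(self._rows.items())


def deduplicate(records):
    """Keep only the last value seen for each row key."""
    table = KeyValueTable()
    for row_key, value in records:
        table.put(row_key, value)
    return table.scan()


if __name__ == "__main__":
    events = [("user2", "login"), ("user1", "login"), ("user2", "logout")]
    print(deduplicate(events))
```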

Kafka: This is a fairly handy queuing tool.
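The basic idea of a queue sitting between a producer and a consumer can be sketched with Python's standard queue and threading modules. This is only an in-process analogy under that assumption, not Kafka or its client API, and run_pipeline is a hypothetical name.

```python
import queue
import threading


def run_pipeline(messages):
    """Send messages from a producer thread to a consumer thread through a queue."""
    buffer = queue.Queue()
    received = []
    sentinel = object()  # marks the end of the stream

    def producer():
        for message in messages:
            buffer.put(message)
        buffer.put(sentinel)

    def consumer():
        while True:
            item = buffer.get()
            if item is sentinel:
                break
            received.append(item)

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return received


if __name__ == "__main__":
    print(run_pipeline(["event-1", "event-2", "event-3"]))
```

The producer never calls the consumer directly; both sides only talk to the queue, which is what lets them run at different speeds.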

Spark: This makes up for the slow data processing speed of MapReduce.