What is Hadoop's core technology in big data?
The Hadoop project is open-source software developed for reliable, scalable, distributed computing.

Reliable: data is backed up, so it is not easily lost; HDFS replicates data across machines.

Scalable: if storage is not enough, add disks or add machines with more disks; if CPU or memory is not enough for analysis, add machines or add memory.

Distributed computing: multiple machines each compute one part of a task at the same time, and then the partial results are combined into a final result (a minimal sketch follows).
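To make the split/compute/combine idea concrete, here is a minimal single-machine sketch in plain Java (not Hadoop itself): four workers each sum one slice of a range in parallel, and the partial sums are then combined into the final answer. The range size and worker count are arbitrary illustration values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SplitAndCombine {
    public static void main(String[] args) throws Exception {
        // The "task": sum the numbers 1..1_000_000, split across 4 workers.
        int n = 1_000_000;
        int workers = 4;
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        List<Future<Long>> partials = new ArrayList<>();

        int chunk = n / workers;
        for (int w = 0; w < workers; w++) {
            final int start = w * chunk + 1;
            final int end = (w == workers - 1) ? n : start + chunk - 1;
            // Each worker computes its part of the task at the same time...
            partials.add(pool.submit(() -> {
                long sum = 0;
                for (int i = start; i <= end; i++) sum += i;
                return sum;
            }));
        }

        // ...and the partial results are combined into the final answer.
        long total = 0;
        for (Future<Long> p : partials) total += p.get();
        pool.shutdown();

        System.out.println("total = " + total); // 500000500000
    }
}
```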

Hadoop's core components are used to solve two core problems: storage and computation. The core components are:

1) Hadoop Common: a set of components and interfaces (serialization, Java RPC, and persistent data structures) for distributed file systems and common I/O (a Writable serialization sketch follows after this list).

2) Hadoop Distributed File System (HDFS): HDFS is where the data is stored, much like the hard disk of a computer stores its files (see the HDFS client sketch after this list).

3) Hadoop MapReduce (distributed computing framework): MapReduce does the data processing and computation. Its characteristic is that no matter how big the data is, given enough time it will finish processing it, though not necessarily quickly, which is why it is called batch processing (see the WordCount sketch after this list).

4) Hadoop YARN (distributed resource manager): YARN is what turns Hadoop into a platform; with it, other software in the big data ecosystem can run on Hadoop, making better use of HDFS's large storage capacity and sharing cluster resources more efficiently (see the YARN configuration sketch after this list).
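For item 1, a hedged sketch of the serialization interface that Hadoop Common provides: a hypothetical PageView record implements Writable, the compact serialization contract Hadoop uses when data is moved between machines or persisted. The class name and fields are made up for illustration; Writable, Text, and the DataInput/DataOutput methods are the real interfaces.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;

// Hypothetical record type; Writable is Hadoop Common's compact
// serialization interface for data exchanged between machines.
public class PageView implements Writable {
    private Text url = new Text();
    private long count;

    @Override
    public void write(DataOutput out) throws IOException {
        url.write(out);      // delegate to the built-in Text Writable
        out.writeLong(count);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        url.readFields(in);  // read the fields back in the same order
        count = in.readLong();
    }
}
```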
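For item 2, a sketch of storing and reading a file through the HDFS Java client. The NameNode address and the file path are assumptions for illustration; in a real cluster, fs.defaultFS normally comes from core-site.xml.

```java
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Assumed NameNode address; in practice this comes from core-site.xml.
        conf.set("fs.defaultFS", "hdfs://namenode:9000");
        FileSystem fs = FileSystem.get(conf);

        // Write a small file; HDFS replicates its blocks across DataNodes.
        Path path = new Path("/demo/hello.txt"); // hypothetical path
        try (FSDataOutputStream out = fs.create(path, true)) {
            out.write("hello hdfs".getBytes(StandardCharsets.UTF_8));
        }

        // Read it back, just like reading a file from a local disk.
        try (FSDataInputStream in = fs.open(path)) {
            IOUtils.copyBytes(in, System.out, 4096, false);
        }
        fs.close();
    }
}
```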
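For item 3, the classic WordCount job is a reasonable sketch of MapReduce batch processing: the map phase emits a (word, 1) pair for every word, and the reduce phase sums the pairs per word. The input and output paths come from the command line and are placeholders.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Map: split each line into words and emit (word, 1).
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reduce: sum the counts for each word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input dir
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output dir
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Packaged into a jar and submitted with the hadoop jar command, the job reads its input from HDFS and, on a YARN cluster, runs its map and reduce tasks inside YARN containers.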
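For item 4, a sketch of what "running on YARN" looks like from the client side: the configuration below tells MapReduce to use YARN for resource management and requests container memory per task. The ResourceManager host name and memory figures are assumptions; in practice these settings normally live in mapred-site.xml and yarn-site.xml.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class YarnSubmitSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Run MapReduce on YARN instead of the local framework.
        conf.set("mapreduce.framework.name", "yarn");
        // Assumed ResourceManager host (normally set in yarn-site.xml).
        conf.set("yarn.resourcemanager.hostname", "resourcemanager");

        // Ask YARN for container memory per map/reduce task (illustrative numbers).
        conf.set("mapreduce.map.memory.mb", "1024");
        conf.set("mapreduce.reduce.memory.mb", "2048");

        Job job = Job.getInstance(conf, "runs-on-yarn");
        // ...set the mapper, reducer, and paths as in the WordCount sketch above...
        // job.waitForCompletion(true);
    }
}
```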