Why did the Hadoop ecosystem appear?
MapReduce: MapReduce is a programming model that processes large data sets with a parallel, distributed algorithm on a cluster. Apache MapReduce is derived from Google's MapReduce, described as simplified data processing on large clusters. The current Apache MapReduce version is built on the Apache YARN framework. YARN stands for "Yet Another Resource Negotiator". YARN can also run applications that do not follow the MapReduce model; it is Apache Hadoop's attempt to take data processing beyond what MapReduce alone can do.

HDFS: The Hadoop Distributed File System (HDFS) provides a way to store large files across multiple machines. HDFS is modeled on the Google File System (GFS). Before Hadoop 2.0.0, the NameNode was a single point of failure (SPOF) in an HDFS cluster. The HDFS High Availability feature, which relies on ZooKeeper, addresses this problem by offering the option of running two redundant NameNodes in the same cluster in an active/passive configuration.

HBase: HBase is inspired by Google Bigtable and is an open source implementation of it. Where Bigtable uses GFS as its file storage system, HBase uses Hadoop HDFS. Google runs MapReduce to process the massive data in Bigtable, and HBase likewise uses Hadoop MapReduce to process the massive data in HBase. Bigtable uses Chubby as its coordination service, and HBase uses ZooKeeper for the same purpose.

Hive: Hive is a data warehouse infrastructure developed by Facebook for data summarization, query, and analysis. Hive provides a SQL-like language, HiveQL, which is not fully compatible with SQL-92.

Pig: Pig provides an engine for executing data flows in parallel on Hadoop. It includes a language, Pig Latin, for expressing these data flows. Pig Latin includes many traditional data operations (join, sort, filter, etc.) and also lets users develop their own functions for reading, processing, and writing data. Pig runs on Hadoop and makes use of both the Hadoop Distributed File System (HDFS) and Hadoop's processing system, MapReduce.
Pig uses MapReduce to perform all of its data processing: it compiles the Pig Latin scripts that users write into a series of one or more MapReduce jobs and then executes them.
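
Since HBase, Hive, and Pig all lean on the MapReduce model described above, it may help to see the model in miniature. The following is only a single-process sketch in plain Python, not Hadoop's API: the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical, and a real cluster would run the map and reduce steps on many machines with the input stored in HDFS.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence (a cluster would run them in parallel)."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["Pig runs on Hadoop", "Hive runs on Hadoop"]
    print(word_count(docs))
```

The point of the split is that each map call sees only one record and each reduce call sees only one key, which is what allows a framework to spread the work across a cluster.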