How to do big data processing?

Big Data Processing I: Collection

Big data collection means using multiple databases to receive data from clients (Web, App, sensors, etc.); users can run simple queries and processing against these databases. The main characteristic and challenge of the collection stage is high concurrency, because thousands of users may be accessing and operating on the system at the same time. A minimal sketch of this idea appears below.
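
The following sketch illustrates one simple way a collection layer can absorb many concurrent writes: client requests push events into a thread-safe queue, and a single writer flushes them in batches. This is only an illustration under assumptions of my own; sqlite3 stands in for the real clustered collection databases, and the collected.db file, raw_events table, and collect/writer helpers are hypothetical names, not part of any particular platform.

```python
# Sketch of a collection endpoint: many producers push events concurrently
# into a thread-safe queue, and a single writer batches them into a database.
# sqlite3 is only a stand-in for a real clustered collection store.
import json
import queue
import sqlite3
import threading
import time

events = queue.Queue()          # absorbs concurrent writes from clients

def collect(event: dict) -> None:
    """Called once per client request (Web, App, sensor, ...)."""
    events.put(json.dumps(event))

def writer(db_path: str = "collected.db", batch_size: int = 100) -> None:
    """Single consumer: drain the queue and persist events in batches."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS raw_events (payload TEXT)")
    while True:
        batch = []
        try:
            while len(batch) < batch_size:
                batch.append(events.get(timeout=1))
        except queue.Empty:
            pass                # flush whatever arrived within the timeout
        if batch:
            conn.executemany("INSERT INTO raw_events VALUES (?)",
                             [(e,) for e in batch])
            conn.commit()

# Simulate many users hitting the collection layer at the same time.
threading.Thread(target=writer, daemon=True).start()
for i in range(1000):
    threading.Thread(target=collect, args=({"user": i, "ts": time.time()},)).start()
time.sleep(3)                   # give the writer time to flush before exiting
```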

Big Data Processing II: Import/Preprocessing

Although there are many databases at the collection end, to analyze the massive data effectively you still need to import it from the front end into a centralized, large-scale distributed database or distributed storage cluster, and you can carry out some simple cleaning and preprocessing on top of the import. The main characteristic and challenge of the import and preprocessing stage is the sheer volume of imported data: the import volume per second often reaches hundreds of megabytes, or even the gigabyte level. A small sketch of this stage follows.
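
As an illustration of import plus simple preprocessing, the sketch below pulls raw events out of the hypothetical collected.db/raw_events store from the previous example, drops malformed records, normalizes the fields, and loads the result into a central table. The warehouse.db file and events table are likewise made-up names, and sqlite3 again stands in for a large distributed database.

```python
# Sketch of import/preprocessing: read raw events from the front-end
# collection store, clean them, and load them into a central table.
import json
import sqlite3

def clean(payload: str):
    """Return a normalized (user_id, ts) record, or None if malformed."""
    try:
        event = json.loads(payload)
    except json.JSONDecodeError:
        return None                      # drop records that fail to parse
    if "user" not in event or "ts" not in event:
        return None                      # drop records missing required fields
    return (int(event["user"]), float(event["ts"]))

src = sqlite3.connect("collected.db")    # front-end collection database
dst = sqlite3.connect("warehouse.db")    # stand-in for the central store
dst.execute("CREATE TABLE IF NOT EXISTS events (user_id INTEGER, ts REAL)")

rows = (clean(p) for (p,) in src.execute("SELECT payload FROM raw_events"))
dst.executemany("INSERT INTO events VALUES (?, ?)",
                (r for r in rows if r is not None))
dst.commit()
```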

Big Data Processing III: Statistics/Analysis

Statistics and analysis mainly use distributed databases or distributed computing clusters to perform ordinary analysis, classification, and summarization of the massive data stored within them, in order to meet most common analysis needs. For some real-time requirements, tools such as EMC's GreenPlum, Oracle's Exadata, and the MySQL-based columnar storage Infobright are used; for batch processing, or for requirements based on semi-structured data, Hadoop can be used. The main characteristic and challenge of this stage is that the analysis involves a very large volume of data, which places a heavy load on system resources, especially I/O. A small example of the kind of summarization this stage performs is given below.
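
To make the "classification and summarization" concrete, here is a minimal aggregation over the hypothetical warehouse.db/events table from the previous sketch: events per user, ordered by activity. In practice the same GROUP BY would run on a distributed engine such as GreenPlum, Exadata, Infobright, or Hive on Hadoop; sqlite3 is used only to keep the example self-contained.

```python
# Sketch of statistics/analysis: a summary query over the central store.
import sqlite3

conn = sqlite3.connect("warehouse.db")

# Classify and summarize: events per user, ordered by activity.
query = """
    SELECT user_id, COUNT(*) AS event_count, MIN(ts) AS first_seen
    FROM events
    GROUP BY user_id
    ORDER BY event_count DESC
    LIMIT 10
"""
for user_id, event_count, first_seen in conn.execute(query):
    print(f"user {user_id}: {event_count} events, first seen at {first_seen}")
```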

Big Data Processing IV: Mining

Mining mainly runs computations based on various algorithms on top of the existing data in order to produce predictions (Predict), thereby meeting some higher-level data analysis needs. The main tools used include Hadoop's Mahout. The characteristic and challenge of this stage is that the algorithms used for mining are very complex, the volume of data and the amount of computation involved are both very large, and commonly used data mining algorithms are mostly single-threaded. A simple example of such an algorithm is sketched below.
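
To show the kind of algorithm this stage runs, here is a deliberately simple, single-threaded k-means clustering on synthetic data. This is not Mahout's implementation and the data is invented for illustration; it only demonstrates the flavor of computation (clustering as a basis for prediction) that a mining stage performs.

```python
# Sketch of the mining stage: a plain single-threaded k-means on synthetic data.
import random

def kmeans(points, k, iterations=20):
    """Cluster 2-D points into k groups; returns the final centroids."""
    centroids = random.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for x, y in points:
            # assign each point to its nearest centroid
            nearest = min(range(k),
                          key=lambda i: (x - centroids[i][0]) ** 2 +
                                        (y - centroids[i][1]) ** 2)
            clusters[nearest].append((x, y))
        # move each centroid to the mean of its assigned points
        centroids = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids

# Two synthetic blobs of user behaviour; k-means should recover their centers.
data = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)] + \
       [(random.gauss(5, 1), random.gauss(5, 1)) for _ in range(200)]
print(kmeans(data, k=2))
```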

That is Qingteng Xiaobo's summary of how to carry out big data processing. If you have a strong interest in big data engineering, I hope this article has been helpful. If you want to learn more tips and materials for data analysts and big data engineers, you can read the other articles on this site.