Data collection refers to acquiring massive volumes of structured, semi-structured (also called weakly structured), and unstructured data from sources such as RFID readers, sensors, social network interactions, and the mobile Internet. It is the foundation of the big data knowledge service model. The focus should be on breakthroughs in big data collection technologies, such as distributed, high-speed, highly reliable data crawling or collection and high-speed full data mirroring; on breakthroughs in big data integration technologies, such as high-speed data parsing, conversion, and loading; and on designing quality assessment models and developing data quality technologies.
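The parsing, conversion, and quality assessment steps can be illustrated on a very small scale. The following is a minimal sketch, not a production pipeline: the three-field schema, the function names `parse_record` and `completeness`, and the sample lines are all hypothetical, and "quality" is reduced here to a single completeness ratio.

```python
import json

# Hypothetical common schema that every record is converted into.
REQUIRED_FIELDS = ("source", "timestamp", "value")


def parse_record(raw: str) -> dict:
    """Convert one raw input line into the common schema.

    JSON objects are treated as semi-structured input, lines with exactly
    three comma-separated parts as structured input, and anything else as
    unstructured free text. (A real parser would need stricter rules.)
    """
    raw = raw.strip()
    if raw.startswith("{"):
        try:
            obj = json.loads(raw)
            return {field: obj.get(field) for field in REQUIRED_FIELDS}
        except json.JSONDecodeError:
            pass  # not valid JSON: fall through and treat as plain text
    parts = raw.split(",")
    if len(parts) == len(REQUIRED_FIELDS):
        return dict(zip(REQUIRED_FIELDS, (p.strip() or None for p in parts)))
    return {"source": None, "timestamp": None, "value": raw or None}


def completeness(records: list) -> float:
    """Fraction of required fields that are filled in across all records."""
    total = len(records) * len(REQUIRED_FIELDS)
    if total == 0:
        return 0.0
    filled = sum(
        1 for r in records for f in REQUIRED_FIELDS if r.get(f) is not None
    )
    return filled / total


raw_lines = [
    "rfid-7,2024-01-01T00:00:00,42",                                  # structured
    '{"source": "sensor-3", "timestamp": "2024-01-01T00:00:05", "value": 17.5}',
    "user posted: great product!",                                     # unstructured
]
records = [parse_record(line) for line in raw_lines]
score = completeness(records)  # 7 of the 9 fields are filled
```

The free-text line keeps only its `value` field, which lowers the completeness score. A fuller quality model would likely also track accuracy, consistency, and timeliness.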
Big data storage and management means holding the collected data on storage devices, building the corresponding databases, and managing and retrieving that data. The focus is on technologies for managing and processing complex structured, semi-structured, and unstructured big data. The key problems are how to make big data storable, representable, and processable, and how to transmit it reliably and efficiently. Priorities include developing reliable distributed file systems (DFS), energy-efficient storage, in-storage computation, big data de-redundancy (deduplication), and efficient, low-cost big data storage technology; achieving breakthroughs in distributed non-relational big data management and processing, heterogeneous data fusion, data organization, and big data modeling; achieving breakthroughs in big data indexing; achieving breakthroughs in big data movement, backup, and replication; and developing big data visualization technology.
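De-redundancy is often implemented by storing each distinct piece of content only once and referring to it by a hash. The sketch below assumes fixed-size chunks and SHA-256 digests; the class name `DedupStore`, the tiny chunk size, and the file names are hypothetical and chosen only to keep the example readable.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical chunks are kept only once."""

    def __init__(self, chunk_size: int = 4):
        self.chunk_size = chunk_size
        self.chunks = {}  # digest -> chunk bytes (each unique chunk stored once)
        self.files = {}   # file name -> ordered list of chunk digests

    def put(self, name: str, data: bytes) -> None:
        """Split data into fixed-size chunks and store only unseen chunks."""
        digests = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)
            digests.append(digest)
        self.files[name] = digests

    def get(self, name: str) -> bytes:
        """Reassemble a file from its chunk digests."""
        return b"".join(self.chunks[d] for d in self.files[name])

    def stored_bytes(self) -> int:
        """Physical bytes held after deduplication."""
        return sum(len(c) for c in self.chunks.values())


store = DedupStore(chunk_size=4)
store.put("a.log", b"AAAABBBBAAAA")
store.put("b.log", b"BBBBCCCC")
# 20 logical bytes were written, but only the three unique chunks are stored.
```

Real systems typically use much larger or variable-size (content-defined) chunks so that an insertion near the start of a file does not shift every later chunk boundary.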
Big data analysis technology should improve existing data mining and machine learning techniques; develop new data mining techniques such as network mining, specific-group mining, and graph mining; achieve breakthroughs in big data fusion technologies such as object-based data joins and similarity joins; and achieve breakthroughs in domain-oriented big data mining technologies such as user interest analysis, network behavior analysis, and sentiment (emotional semantic) analysis.
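One common reading of similarity-based fusion is a similarity join: linking records from different sources whose content is close enough under some measure. The sketch below assumes Jaccard similarity over whitespace-separated tokens and a brute-force comparison of all pairs; the function names, record keys, and threshold are hypothetical.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity_join(records: dict, threshold: float) -> list:
    """Return (key, key, score) for every pair scoring at or above threshold."""
    tokens = {key: set(text.lower().split()) for key, text in records.items()}
    pairs = []
    # Brute force over all pairs: O(n^2), fine for a sketch only.
    for left, right in combinations(sorted(tokens), 2):
        score = jaccard(tokens[left], tokens[right])
        if score >= threshold:
            pairs.append((left, right, score))
    return pairs


records = {
    "r1": "big data storage system",
    "r2": "distributed big data storage",
    "r3": "emotional semantic analysis",
}
matches = similarity_join(records, threshold=0.5)
# r1 and r2 share 3 of 5 distinct tokens, so they are the only matching pair.
```

At big data scale the all-pairs loop is not practical; approaches such as prefix filtering or locality-sensitive hashing are usually used to avoid comparing every pair.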
Big data technology can uncover the information and knowledge hidden in massive data and provide a basis for human social and economic activity, improving operational efficiency across fields and making the economy as a whole considerably more intensive, that is, more resource-efficient.