The evolution of big data can be divided into stages by time. The main milestones in the development of the big data era are as follows:
In 2005 the Hadoop project was born. Hadoop was initially just a project used by Yahoo to solve the problem of web search; because of the efficiency of its technology, it was later adopted by the Apache Software Foundation and became an open-source application.
Hadoop is not a product in itself, but an ecosystem of software products that work together to achieve full-featured and flexible big data analytics. Technically, Hadoop consists of two key services: reliable data storage using the Hadoop Distributed File System (HDFS), and high-performance parallel data processing using a technology called MapReduce. Both services share the goal of providing a foundation that makes fast, reliable analysis of both structured and complex data a reality.
In late 2008, "Big Data" was recognized by some of the leading computer science researchers in the United States when the industry group Computing Community Consortium published an influential white paper, "Big Data Computing: Creating Revolutionary Breakthroughs in Commerce, Science, and Society." It pushes thinking beyond data-processing machines and argues that what really matters about big data is the new uses and insights, not the data itself. This organization was arguably the first to introduce the concept of big data.
In 2009, the Indian government set up a biometric database for identity management, and the United Nations Global Pulse project examined how data feeds from cell phones and social networking sites could be used to analyze and predict issues ranging from spiraling prices to disease outbreaks. In the same year, the U.S. government opened the door to public data further by launching the Data.gov website, making a wide range of government data available to the public. The site's more than 44,500 datasets feed websites and smartphone apps that track everything from flights to product recalls to unemployment rates in a given region, a move that inspired governments from Kenya to the U.K. to launch similar initiatives.
In 2009, some of Europe's leading research libraries and scientific and technical information research organizations formed a partnership to improve the ease of access to scientific data on the Internet.
In February 2010, Kenneth Cukier published a 14-page special report on big data, "Data, Data Everywhere," in The Economist. In the report, Cukier wrote that the world holds an unimaginably vast amount of digital information, which is growing at a breakneck pace; the impact of this huge amount of information is already being felt in many fields, from economics to science, from the government sector to the arts; and scientists and computer engineers have coined a new term for the phenomenon: "big data." Cukier thus became one of the first data scientists to gain insight into the trends of the big data era.
In February 2011, IBM's Watson supercomputer, which can scan and analyze 4 terabytes (about 200 million pages of text) of data per second, beat two human contestants to win the prestigious U.S. quiz show Jeopardy!. The New York Times later recognized this moment as a "triumph of big data computing." In May of the same year, the McKinsey Global Institute (MGI), the research arm of the consulting firm McKinsey & Company, released the report "Big Data: The Next Frontier for Innovation, Competition, and Productivity," the first comprehensive introduction to and outlook on big data from a professional organization, and big data began to attract wide attention. The report points out that big data has penetrated every industry and business function and become an important factor of production, and that the mining and utilization of massive amounts of data heralds a new wave of productivity growth and consumer surplus. The report also notes that "big data" stems from a dramatic increase in the capacity and speed of data production and collection: as more and more people, devices, and sensors are connected through digital networks, the ability to generate, transmit, share, and access data has been revolutionized.
In December 2011, China's Ministry of Industry and Information Technology (MIIT) released the 12th Five-Year Plan for the Internet of Things (IoT), in which information processing technology was proposed as one of four key technological innovation projects, covering massive data storage, data mining, and intelligent analysis of images and video, all of which are important components of big data.
In January 2012, big data was one of the themes of the World Economic Forum held in Davos, Switzerland, where the report Big Data, Big Impact was released, declaring that data has become a new class of economic assets, just like money or gold.
In March 2012, the U.S. Obama administration released the "Big Data Research and Development Initiative" on the White House website, an initiative signaling that big data had become an important feature of the times. On March 22, 2012, the Obama administration announced a $200 million investment in the field of big data, a watershed moment in the rise of big data technology from a business practice to a national science and technology strategy. In a conference call the following day, the administration defined data as "the new oil of the future" and said that competition in big data technology is a matter of national security and the nation's future. It added that competitiveness at the national level will be partly reflected in the scale and activity of a country's data holdings and in its ability to interpret and utilize them, and that a country's digital sovereignty reflects its possession and control of data. Digital sovereignty will be another arena for great-power competition after border, sea, and air defense.
In April 2012, the U.S. software company Splunk listed successfully on Nasdaq on the 19th, becoming the first publicly listed big data processing company. Against the backdrop of a continuing economic downturn and a volatile stock market in the United States, Splunk's first-day trading performance was particularly impressive, with its share price more than doubling on the opening day. Founded in 2003, Splunk is a leading provider of big data monitoring and analytics software. Its successful listing drew the attention of the capital markets to big data and prompted the IT industry to accelerate its big data efforts.
In July 2012, the United Nations released a white paper on big data governance in New York, summarizing how governments can use big data to better serve and protect their people. The white paper illustrates the respective roles, motivations, and needs of individuals, the public sector, and the private sector in a data ecosystem: individuals, driven by concern about prices and a desire for better service, provide data and crowdsourced information while demanding privacy and the power to opt out; the public sector, in order to improve service delivery and efficiency, provides statistics, device information, health indicators, and tax and consumption information; and the private sector, in order to improve customer awareness and predict trends, provides aggregated data and consumption and usage information, with growing attention to sensitive data ownership and business models. The white paper also points to the vastly enriched data resources, both old and new, that allow people today to analyze social demographics in real time as never before.
The UN also cites the growth of social network activity in Ireland and the United States as an early indicator of rising unemployment, demonstrating that governments that analyze the data resources at their disposal can "keep up with the numbers" and respond quickly. In July of the same year, in order to tap the value of big data, Alibaba Group created the position of Chief Data Officer at the management level, responsible for comprehensively driving the "Data Sharing Platform" strategy, and launched Jushita, a large-scale data-sharing platform that provides data cloud services for e-commerce companies and e-commerce service providers on Tmall and Taobao. Subsequently, Jack Ma, chairman of Alibaba's board of directors, said in a speech at the 2012 Online Business Conference that from January 1, 2013, the company would transform and reshape itself around three major businesses: platform, finance, and data. Ma emphasized: "If we have a data forecasting station, it is like installing GPS and radar for enterprises; you will be more confident going out to sea." Alibaba Group thus hopes to create value for the country and for small and medium-sized enterprises by sharing and mining massive data. The move was one of the first major milestones in a domestic company elevating big data to the level of corporate management, and Alibaba was also among the first companies to propose running an enterprise on data.
In April 2014, the World Economic Forum (WEF) released the 13th edition of its Global Information Technology Report on the theme "The Rewards and Risks of Big Data." The report concludes that policies for various ICTs will become even more important in the coming years, with active debate expected on issues such as data privacy and cyber regulation. The growing activity of the global big data industry and the accelerating pace of technological evolution and application innovation led governments to gradually recognize the significance of big data in promoting economic development, improving public services, enhancing people's well-being, and even safeguarding national security. In May 2014, the U.S. White House released the research report "Big Data: Seizing Opportunities, Preserving Values." The report encourages the use of data to drive social progress, especially in areas where markets and existing institutions would not otherwise support such progress; at the same time, it calls for frameworks, structures, and research to help protect Americans' firmly held beliefs about protecting individual privacy, ensuring fairness, and preventing discrimination. Also in 2014, "big data" appeared for the first time in China's Government Work Report, which stated that platforms for entrepreneurship and innovation in emerging industries should be established to catch up with advances in areas such as big data and lead the development of future industries. "Big data" immediately became a hot term in domestic discourse.
In 2015, the Chinese government officially issued the "Outline of Action for Promoting the Development of Big Data," which makes clear that, over the next five to ten years, promoting the development and application of big data should create a new model of social governance featuring precise governance and multi-party collaboration, establish a new mechanism for smooth, safe, and efficient economic operation, build a new people-oriented system of services benefiting all, open a new innovation-driven pattern of mass entrepreneurship and innovation, and cultivate a new ecology of high-end, intelligent, and emerging industrial development. It marks the official rise of big data to a national strategy.
In 2016, the 13th Five-Year Plan for big data was introduced, after the plan had solicited the views of experts and undergone centralized discussion and revision. The plan covers, among other things, promoting the application of big data in industrial R&D, manufacturing, and all aspects of the whole industrial chain, and supporting the service industry in using big data to build brands and deliver precision marketing and customized services.
Key technologies of big data:
1. Hadoop
Hadoop was born in 2005. It began as a project used by Yahoo to solve the problem of web search and, owing to the high efficiency of its technology, was later adopted by the Apache Software Foundation and became an open-source application.
Hadoop is not a product per se, but rather an ecosystem of software products that work together to enable full-featured and flexible big data analytics. Technically, Hadoop consists of two key services: reliable data storage using the Hadoop Distributed File System (HDFS), and high-performance parallel data processing using a technology called MapReduce.
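The MapReduce model mentioned above can be illustrated with a minimal, single-process sketch. The classic word-count example below simulates the three framework stages (map, shuffle, reduce) in plain Python; it shows the programming model only, not Hadoop's actual Java API, and all function names here are illustrative.

```python
from collections import defaultdict

def map_phase(document: str):
    """Map: emit a (word, 1) pair for every word in the input split."""
    for word in document.split():
        yield word.lower(), 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework
    does between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: aggregate the counts for one word."""
    return key, sum(values)

def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(k, v) for k, v in grouped.items())

counts = word_count(["big data big insights", "data pipelines"])
print(counts)  # {'big': 2, 'data': 2, 'insights': 1, 'pipelines': 1}
```

In real Hadoop, the map and reduce functions run on many machines in parallel over HDFS blocks, and the shuffle moves intermediate pairs across the network; the logic, however, follows this same shape.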
2. Hive
Hive is a data warehouse architecture built on the Hadoop file system that enables the analysis and management of data stored in HDFS. It was originally created and developed by Facebook in response to the need to manage and learn from the massive amounts of social network data the company generates every day. Other companies, such as Netflix and Amazon, have since adopted and contributed to Apache Hive.
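Hive's key idea is to let analysts express analysis over HDFS files in a SQL-like language (HiveQL), which Hive compiles into distributed jobs. The sketch below is a toy illustration in plain Python: the rows, the page-view schema, and the query are all hypothetical, and the loop simply shows the aggregation that such a GROUP BY query expresses.

```python
from collections import Counter

# Illustrative HiveQL (not executed here):
#   SELECT page, COUNT(*) AS views
#   FROM page_views
#   GROUP BY page;

# Toy rows standing in for a table stored as files in HDFS.
page_views = [
    {"user": "a", "page": "/home"},
    {"user": "b", "page": "/home"},
    {"user": "a", "page": "/search"},
]

# The aggregation the query above describes: count rows per page.
views = Counter(row["page"] for row in page_views)
print(dict(views))  # {'/home': 2, '/search': 1}
```

On a real cluster, Hive would parse the query, plan it as one or more MapReduce (or Tez/Spark) jobs, and run the grouping across many nodes rather than in a single in-memory pass.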
3. Storm
Storm is a distributed computing framework written primarily in the Clojure programming language. It was originally created by Nathan Marz and his team at BackType, a marketing intelligence business acquired by Twitter in 2011. Twitter then open-sourced the project and pushed it to GitHub; Storm eventually joined the Apache Incubator and in September 2014 officially became one of the top-level projects under the Apache umbrella.
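Conceptually, a Storm application is a "topology": a spout emits an unbounded stream of tuples, and bolts transform or aggregate them as they flow past. The single-process sketch below mimics that shape for a streaming word count; the class and method names are illustrative, not Storm's actual API, and a real topology would run its spouts and bolts in parallel across a cluster.

```python
from collections import Counter

class SentenceSpout:
    """Spout: the source of the stream (e.g. a live feed or queue)."""
    def __init__(self, sentences):
        self.sentences = sentences

    def emit(self):
        yield from self.sentences

class SplitBolt:
    """Bolt: split each incoming sentence tuple into word tuples."""
    def process(self, stream):
        for sentence in stream:
            yield from sentence.split()

class CountBolt:
    """Bolt: keep running counts that update as each tuple arrives,
    rather than waiting for a finished batch."""
    def __init__(self):
        self.counts = Counter()

    def process(self, stream):
        for word in stream:
            self.counts[word] += 1

# Wire the topology: spout -> split bolt -> count bolt.
spout = SentenceSpout(["storm processes streams", "streams of tuples"])
counter = CountBolt()
counter.process(SplitBolt().process(spout.emit()))
print(counter.counts["streams"])  # 2
```

The contrast with the MapReduce sketch earlier in this section is the point: MapReduce computes over a finite stored dataset, while Storm's counts are continuously updated state over a stream that never ends.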