Among the main risks faced by big data, the one not mentioned in this lecture is ( ).
A. security risk
B. ethical risks
C. Ethical risks
D. military risk
D.
Big data
Big data refers to information so large in volume that mainstream software tools cannot capture, manage, process, and organize it within a reasonable amount of time into a form that helps businesses make better decisions.
Put another way, the term describes data that the conventional tools readily available on the market cannot handle.
In The Age of Big Data by Viktor Mayer-Schönberger and Kenneth Cukier, big data refers to analyzing all of the data rather than taking the shortcut of random sampling.
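The contrast between analyzing all of the data and relying on a random sample can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy data set, and the threshold are hypothetical and do not come from the book or the lecture.

```python
import random


def rare_event_rate(values, threshold):
    """Return the fraction of values strictly above the threshold."""
    if not values:
        return 0.0
    return sum(1 for v in values if v > threshold) / len(values)


def compare_full_and_sample(population, sample_size, threshold, seed=0):
    """Compute the same statistic on all data and on a random sample.

    Returns a pair (rate_over_all_data, rate_over_sample).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample = rng.sample(population, sample_size)
    return (
        rare_event_rate(population, threshold),
        rare_event_rate(sample, threshold),
    )


if __name__ == "__main__":
    # Hypothetical data: 10 rare large values hidden among 9,990 zeros.
    population = [0] * 9990 + [100] * 10
    full_rate, sampled_rate = compare_full_and_sample(population, 100, 50)
    print("all data:", full_rate)
    print("sample of 100:", sampled_rate)
```

Scanning every record gives the exact rate of the rare event, while a small sample can over- or underestimate it, or miss it entirely.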
The 5V characteristics of Big Data (proposed by IBM): Volume, Velocity, Variety, Value, Veracity.
The research firm Gartner defines "big data" as massive, high-growth, and diverse information assets that require new processing models in order to deliver stronger decision-making power, insight discovery, and process optimization.
The McKinsey Global Institute defines it as a collection of data so large that it is well beyond the capabilities of traditional database software tools to acquire, store, manage, and analyze, and is characterized by massive data size, rapid data flow, diverse data types, and low value density.
The strategic significance of big data technology lies not in holding huge volumes of data, but in the specialized processing of the meaningful data they contain. In other words, if big data is compared to an industry, the key to that industry's profitability is to improve the "processing capability" applied to data and to realize the data's "added value" through that processing.
From a technical point of view, big data and cloud computing are as inseparable as the two sides of a coin. Big data cannot be processed by a single computer and must use a distributed architecture. Its defining feature is distributed data mining over massive amounts of data, which in turn relies on cloud computing's distributed processing, distributed databases, cloud storage, and virtualization technology.
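The idea of splitting work across machines and merging the partial results can be sketched as a tiny map-and-reduce word count. This is a minimal illustration, not a description of any particular cloud platform: the function names are hypothetical, and threads on one machine stand in for the separate nodes a real cluster would use.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_partitions(records, num_partitions):
    """Deal records round-robin into lists that stand in for cluster nodes."""
    partitions = [[] for _ in range(num_partitions)]
    for index, record in enumerate(records):
        partitions[index % num_partitions].append(record)
    return partitions


def map_partition(partition):
    """Map step: each 'node' counts the words in its own partition only."""
    counts = Counter()
    for line in partition:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Reduce step: combine two partial results into one."""
    return left + right


def distributed_word_count(records, num_partitions=3):
    """Count words by processing partitions in parallel, then merging."""
    partitions = split_into_partitions(records, num_partitions)
    # Threads stand in for separate machines; a real cluster would ship
    # each partition to a different worker over the network.
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partial_counts = list(pool.map(map_partition, partitions))
    return reduce(merge_counts, partial_counts, Counter())


if __name__ == "__main__":
    records = ["big data needs cloud", "cloud storage", "big cloud"]
    print(distributed_word_count(records, num_partitions=2))
```

No single worker ever sees the whole data set. Each one processes only its partition, and the final answer comes from merging the partial counts, which is why the approach can scale beyond the capacity of one computer.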