Today, computers have grown to unprecedented power, and they collect huge amounts of data, including personal information, into large databases. Most of this data is used normally and legally, but there is also a danger of misuse.
The predictive power of big data gives it enormous potential to revolutionize our lives. With its support, weather forecasts for the next two days can be 95 percent accurate. However, when big data is misused, the privacy and security of users are at risk, especially for those who use the Internet regularly.
How do these threats arise? And how can we address them while ensuring that big data benefits society?
Scope of potential problems
First of all, the impact of big data security incidents keeps growing, judged purely by the number of people affected.
The breach of the University of Arkansas' professional development system in 2014 resulted in the disclosure of the identities of 50,000 people. That is not a small number, but it is small compared to the 145,000,000 people whose birthdays, addresses, emails and other information were stolen in the eBay Inc. data breach that same year.
From the standpoint of security professionals, the outlook for protecting information in large databases from theft is even less optimistic. In part, this has to do with inherent flaws in the underlying technology used to store and process the information.
Big data companies like Amazon rely so heavily on distributed computing that they often have data processing centers scattered around the world. Amazon.com operates its global business in twelve regions, each of which has a large number of data centers that are under constant physical and cyber attack, mainly from hundreds of hidden, independent servers.
The access control conundrum
The best strategy for controlling access to information or Web pages is to set up a single point of access, which is much easier to control than the hundreds of interfaces that exist today. In practice, however, big data is stored in many different places. Its sheer volume, distribution, and accessibility also make it more vulnerable to threats.
In addition, many companies don't pay enough attention to the security of their sophisticated software components and big data infrastructure. This opens the door to potential attacks.
A prime example is Hadoop, a distributed systems infrastructure developed by the Apache Software Foundation that allows users to develop distributed programs without knowing the underlying details of the distribution. Its large number of software components gives programmers access to large amounts of data in a distributed computing system. When it was first introduced, Hadoop had only limited security and was not suited to use by many people at the same time. Despite this shortcoming, many large companies continued to adopt Hadoop as their company-wide data platform.
User demand drives data security
From the user's perspective, it is critical to push for better security in big data products in a variety of ways, such as terms and conditions agreements, service level agreements, and security seals from the organizations that collect and use big data.
What can big data companies do to protect users' personal information from being leaked? To keep information from falling into the hands of unscrupulous parties, they can adopt strategies including encryption, access control, intrusion detection, data backup, and auditing of how data is used. In this way, data security is improved and the privacy of our personal information is enhanced.
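Two of the strategies listed above, access control and review of the usage process, can be illustrated with a minimal Python sketch. The role names, permission names, and the AuditedStore class are hypothetical and invented for illustration only; a real system would define its own roles and keep the log in tamper-resistant storage.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical role-to-permission table (an assumption, not a standard).
PERMISSIONS = {
    "analyst": {"read_aggregates"},
    "support": {"read_aggregates", "read_profile"},
    "admin": {"read_aggregates", "read_profile", "export_data"},
}


@dataclass
class AuditedStore:
    """Wraps a record store so every access is checked and logged."""
    records: dict
    audit_log: list = field(default_factory=list)

    def access(self, user, role, action, record_id):
        # Access control: unknown roles get an empty permission set.
        allowed = action in PERMISSIONS.get(role, set())
        # Usage review: log every attempt, whether allowed or denied.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "action": action,
            "record_id": record_id,
            "allowed": allowed,
        })
        if not allowed:
            raise PermissionError(f"role {role!r} may not {action!r}")
        return self.records[record_id]
```

In this sketch a denied request still leaves a log entry, so a later review can show not only who read a record but also who tried to and was refused.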
However, an overemphasis on security may infringe on your privacy: law enforcement agencies can cite security as a reason to collect more personal information, such as the browsing history on an employee's computer.
Under the guise of increased security, law enforcement agencies treat everyone as a potential criminal or terrorist, collecting information that could someday be used to prove their guilt. Not only does the government hold vast amounts of information on us this way, but companies like Apple, Google, and Amazon are also asked to provide further information, including our online shopping records, web browsing history, and various decrypted data.
The basic principle behind this kind of surveillance is that no one can be trusted, and big data technology has made such surveillance far less expensive and more feasible. However, the information collected is likely to be leaked and misused, as in the case of NSA employees who abused their authority to listen in on other people's phone calls.
In fact, if properly utilized, big data can help us obtain more information and improve the quality (and especially the accuracy) of intelligence about potential computer attacks and attackers. In this way, your privacy is better protected.
Ideally, for example, we would never have to worry about phishing emails again if big data analytics engines could accurately pick out the fraudulent ones from the flood of messages.
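To give a rough sense of how such an engine might separate fraudulent messages from legitimate ones, here is a toy naive Bayes text classifier in Python. It is only a sketch: the NaiveBayes class and the labels "phish" and "ham" are assumptions made for illustration, and a real engine would learn from far more messages and many more signals than word counts.

```python
import math
from collections import Counter


def tokenize(text):
    # Simplest possible tokenizer: lowercase, split on whitespace.
    return text.lower().split()


class NaiveBayes:
    """Toy two-class classifier; both labels must be trained before scoring."""

    def __init__(self):
        self.word_counts = {"phish": Counter(), "ham": Counter()}
        self.doc_counts = Counter()

    def train(self, text, label):
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def score(self, text, label):
        vocab = set(self.word_counts["phish"]) | set(self.word_counts["ham"])
        total_words = sum(self.word_counts[label].values())
        # Start from the log prior: the share of training messages with this label.
        log_prob = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for word in tokenize(text):
            # Laplace (add-one) smoothing so unseen words do not zero out the score.
            count = self.word_counts[label][word] + 1
            log_prob += math.log(count / (total_words + len(vocab)))
        return log_prob

    def is_phishing(self, text):
        return self.score(text, "phish") > self.score(text, "ham")


# Tiny made-up training set, for illustration only.
model = NaiveBayes()
model.train("verify your account password now", "phish")
model.train("urgent click this link to confirm your bank password", "phish")
model.train("meeting notes attached for tomorrow", "ham")
model.train("lunch tomorrow at noon with the team", "ham")
```

With this training set, model.is_phishing("urgent verify your password") returns True and model.is_phishing("meeting tomorrow at noon") returns False. The more labeled messages such a model sees, the better its word statistics become, which is where the scale of big data helps.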
How big data is used -- for your benefit or for your detriment
Other issues related to the use of big data include the fact that some companies keep track of your browsing history in order to tailor ads to your habits and preferences. Big data facilitates this kind of behavior by making the analysis cheaper and easier.
IBM's "Personality Insights" service can "profile" you based on your online habits. This goes well beyond having your identity compromised. Your personality traits, such as whether you're outgoing, environmentally conscious, politically conservative or innovative, and even whether you're willing to travel to Africa, all appear in the results of the analysis.
The companies publicly claim that this technology can greatly improve the Internet experience. It sounds like they have the user in mind, but on the flip side, it is easy to see how the same information could be used to our detriment. Already, for example, insurance companies are using user profiles produced by big data analysis to charge different rates.
Banning large-scale data collection to solve this problem is clearly unrealistic. Whether we like it or not, the age of big data is here. Finding ways to protect privacy while allowing legitimate use of big data is what will make our lives safer, richer, and more productive.
When used legally and safely, for example, big data technology can dramatically improve the efficiency of counter-surveillance, which in turn will help us avoid identity theft and potential financial loss.
The key to solving the challenge of ensuring security and privacy while enjoying the convenience of big data is openness and transparency in the use of information. Operators of big data must be open about the content and use of the data they collect.
In addition, users must have the right to know how the data is stored, who can use it, and how access to it is authorized. Lastly, big data companies must win the public's trust by explaining the security controls they have in place to safeguard users.
Note: All articles are published under the authorization of China Digital Science and Technology Museum (CDSTM) partner organizations or individuals.