On October 23rd, the Yangtze River Delta Data Compliance Forum (Phase III) and a seminar on the legal regulation of data crawlers were held in Shanghai. Legal experts, judicial officials and business representatives discussed the impact of crawler technology on the digital industry, and the legal boundaries and regulation of crawling other people's data.
In the era of big data, as the value of data has become more prominent, data crawlers have been applied ever more widely. Many experts at the meeting noted that crawler technology itself is neutral, but its application usually serves a purpose, so it is necessary to consider whether both the crawling and the subsequent use of the data are justified.
"Fierce" web crawlers add to the burden of website operations.
From a technical point of view, a crawler is a program that simulates a person browsing the web or using an app in order to capture online information efficiently. Not everyone welcomes the technology.
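The behavior described above can be sketched in a few lines. The code below is a minimal illustration using only the Python standard library; the fetching function and the page structure it expects are generic assumptions, not any site mentioned in the article.

```python
# Minimal web-crawler sketch: fetch one page, collect the links on it.
# Uses only the Python standard library; illustrative, not production code.
from html.parser import HTMLParser
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href attribute of every <a> tag fed to it."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(url):
    """Fetch a single page and return the links found on it."""
    with urlopen(url) as response:
        html = response.read().decode("utf-8", errors="replace")
    parser = LinkParser()
    parser.feed(html)
    return parser.links
```

A real crawler would loop over the returned links, deduplicate them, and throttle its requests; this sketch shows only the core fetch-and-parse step.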
Liu, head of digitalization for L'Oréal China, said at the seminar that most websites refuse crawler access, both to protect commercial interests and to keep their own operations safe. A crawler's automated, continuous, high-frequency visits can send a site's server load soaring, leaving some small and medium-sized platforms facing pages that will not open, load slowly, or crash outright. Hence, "website operators often suffer from 'fierce' web crawlers."
Although websites can adopt strategies or technical measures to keep their data from being crawled, crawlers in turn have ever more technical means of defeating those anti-crawling defenses. According to Liu, crawling and anti-crawling techniques have been updated in iterative rounds: whether data can be crawled is not the question; the question is whether you are willing to crawl it and how hard it is. Generally, the harder a major company's app or website is to crawl, the more anti-crawling mechanisms it has.
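The defensive side of this arms race often starts with throttling the "automated, continuous, high-frequency access" described earlier. The sketch below shows one common approach, a sliding-window rate limiter; the thresholds and the idea of keying on a client identifier are illustrative assumptions, not any platform's actual mechanism.

```python
# Sliding-window rate limiter: a simple server-side anti-crawling measure
# that flags clients whose request rate looks automated.
from collections import deque


class RateLimiter:
    def __init__(self, max_requests=10, window_seconds=1.0):
        self.max_requests = max_requests
        self.window = window_seconds
        self.history = {}  # client id -> deque of request timestamps

    def allow(self, client_id, now):
        """Return True if this request is within the allowed rate."""
        q = self.history.setdefault(client_id, deque())
        # Drop timestamps that have fallen outside the window.
        while q and now - q[0] > self.window:
            q.popleft()
        if len(q) >= self.max_requests:
            return False  # high-frequency access: likely a crawler
        q.append(now)
        return True
```

Real anti-crawling stacks layer many such signals (IP reputation, browser fingerprints, CAPTCHAs), which is exactly why crawler operators keep iterating countermeasures.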
Ceng Xiang, legal director of Xiaohongshu, observed that malicious crawler cases occur mostly on content platforms and e-commerce platforms. On content platforms, what gets scraped is mainly videos, images, text and user behavior data; on e-commerce platforms, it is mainly merchant and product information.
"Generally speaking, a content platform will stipulate that the intellectual property rights in the content belong to the publisher, or jointly to the publisher and the platform. Scraping it without consent is suspected infringement of intellectual property rights," Ceng Xiang said. The platform stimulates creators' creativity through investment; if someone uses crawler technology to obtain content cheaply and then copies or adapts it, the platform's interests are harmed.
No discussion of web crawlers can avoid the Robots protocol, whose full name is the "web crawler exclusion standard." Through the Robots protocol, a website explicitly tells search engines which pages may be crawled and which may not. The industry also calls it the "gentleman's agreement" of the search field.
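A well-behaved crawler consults this "sign" before fetching anything. Below is a minimal sketch using the Python standard library's robots.txt parser; the robots.txt content and crawler name are illustrative examples, not any real site's policy.

```python
# Checking the Robots protocol before crawling, via the standard library.
# The robots.txt below is an illustrative example, not a real site's policy.
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Disallow: /private/
Allow: /
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# A compliant crawler skips any URL the protocol disallows.
print(rp.can_fetch("MyCrawler", "https://example.com/public/page"))   # True
print(rp.can_fetch("MyCrawler", "https://example.com/private/data"))  # False
```

Note that nothing technically enforces the check: a crawler can simply ignore robots.txt, which is why the protocol is only a "gentleman's agreement."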
Xu Hongtao, a judge of the Intellectual Property Tribunal of the Shanghai Pudong Court, put it this way: crawlers are visitors, and the Robots protocol is the sign hung on the door. A gentleman who walks up and sees the sign will stop at the door, but the lawless may break in anyway.
Surveying relevant precedents, Xu Hongtao pointed out that the Robots protocol is a rule generally observed across the Internet industry. If a search engine crawls a website's content in violation of the Robots protocol, it may be found to have breached business ethics and committed unfair competition. The Robots protocol settles only the upstream question of whether the crawling itself is proper; it does not settle whether the data is used properly after being crawled.
He further analyzed that, in judging such cases, courts tend to treat crawler technology as neutral and to respect how a website has configured its Robots protocol. If a crawler forcibly scrapes in violation of the Robots protocol, that may count against it in the assessment of legality. Moreover, while the Robots protocol bears on the legality of the conduct, it is not the only criterion: even crawling that complies with the Robots protocol may be ruled unlawful because of how the data is later used.
It is worth noting that, when defending their crawling, crawler operators often invoke the free circulation of data against the Robots protocol.
Xu Hongtao believes that in the context of "interconnection," "order" and "circulation" matter equally. This requires striking the right balance between interconnection and data sharing, while also considering whether the Robots protocol strategies adopted by operators across the Internet industry might give rise to data silos.
Judging the legitimacy of crawling requires weighing multiple factors.
At the seminar, Zhang Yong, a professor at East China University of Political Science and Law, classified the harmful behaviors of data crawlers.
He said that, by type of data, the rights and interests that scraping may infringe include computer system security, personal information, copyright, state secrets, trade secrets and the order of market competition. By method of crawling, scraping may endanger the security of computer information systems, illegally obtain citizens' personal information or trade secrets, and circumvent the technical protection measures of copyright. By result, it raises problems such as unfair competition, copyright infringement and infringement of personality rights.
Now that data has become a factor of production, data-scraping technology is applied in ever more scenarios, and disputes are multiplying accordingly. Existing cases offer some answers on how to judge the legitimacy of crawling.
On September 14 this year, the Hangzhou Internet Court ruled on an unfair competition case over the scraping of data from the WeChat official account platform, ordering the defendant to stop the scraping and to compensate WeChat for losses of 600,000 yuan.
The court held that the defendant had violated the principle of good faith by using, without the users' consent, the commercially valuable data collected by the plaintiff, in a way sufficient to substantially substitute for some of the products or services provided by other operators; this damaged the market order of fair competition and constituted unfair competition.
In this case, the court also analyzed whether the scraping was justified from the perspective of a "ternary superposition of objectives."
Citing it as an example, Xu Hongtao noted that the legitimacy of a non-search-engine crawler turns mainly on whether the defendant respected the Robots protocol preset by the crawled website, whether it destroyed the crawled website's technical measures, whether the security of user data was safeguarded, and how innovation and the public interest were weighed.
He pointed out that if data is scraped at the cost of endangering user data security, and the use of crawler technology creates no new, higher-quality resources but merely adds to the burden on someone else's servers, then the conduct is likely to receive a negative legality evaluation.