Once you are sure that your scraping does not hurt anyone, you need to analyze the regulations that apply to you. If you are a company in the EU, the GDPR applies to you, even if you want to collect personal data from people elsewhere in the world. As an EU company, you need to do your research. Sometimes it is acceptable to proceed on the basis of a legitimate interest, but in most cases you will need to pass a personal data scraping project on to your non-European partners or competitors. On the other hand, if you are not an EU company, you do not do business in the EU, and you do not target people who are in the EU, you will probably be fine. Also, be sure to check your local regulations, such as the CCPA.

You don't need to know how to code: a web scraping tool is very useful for non-technical professionals such as marketers, statisticians, financial advisors, Bitcoin investors, researchers, journalists, etc.

In the following article, we'll look at web scraping laws, why they're important, and how your website's terms and conditions (also known as terms of service or terms of use) can limit other people's web scraping activity. We're ParseHub, and we'll go through some notable legal cases and the insights of a tech lawyer to break down the topic and answer the question of whether web scraping is legal. Although in some regions a company's terms and conditions agreement may impose certain limits on scraping activity, the US court essentially ruled that it is not “theft” for a company to scrape information such as open user profiles, ticket prices, etc.

It really depends on the particular situation and the definition of web scraping you are using. Here we define web scraping simply as the process of collecting data from the Internet.

Scraping data from other websites is a useful and essential part of many legitimate data analysis operations. Web scraping itself isn't illegal, but it can be illegal (or fall into a gray area) depending on what you scrape, how you scrape it, and how you use the data. The U.S. Supreme Court also has the power to overrule the Court of Appeals and could overturn the decision that scraping publicly available, non-copyrighted data is legal.

Before we begin, let's clear up some misconceptions. We sometimes hear that “scrapers operate in a gray area of the law”, or that “web scraping is illegal, but no one enforces the law because it's difficult”. Sometimes we even hear that “web scraping is hacking” or that “web scrapers steal our data”. We've heard this from customers, friends, interviewees, and other businesses. The fact is that none of this is true.

Not at all. Legitimate web scraping companies are ordinary businesses that follow the same rules and regulations everyone must follow to do business. It's true that web scraping is not heavily regulated, but that doesn't make it illegal. Quite the contrary.

The second type of data to watch out for when scraping is copyrighted data. The aforementioned HiQ injunction against LinkedIn is, we think, a good guideline for how unilateral scraping bans by website owners should be addressed. In the US, scraping copyrighted content is allowed under the fair use doctrine. The rules are somewhat similar to the European rules, but they do not make a clear distinction between scientific research and for-profit scraping. The basic case law for applying fair use to scraping is Authors Guild v. Google (the Google Books case), in which the court found that digital copies of copyrighted content, in that case entire books, were permitted under fair use.

This is very important because it means that scraping copyrighted content is only allowed for the purpose of generating information. For example, you can scrape a web page to extract prices, or books for natural language analysis, but you can't scrape news articles and republish them on your own website.

Many are unaware that the end use of the data often has a significant impact on whether or not the scraping is legal. Sometimes it can be perfectly legal to scrape a website, but the way you want to use the data can make it illegal. While it's perfectly legal to scrape publicly available data, there are two types of information you should be wary of.

The question of whether web scraping is legal is often asked. According to Google Trends, searches for the term “web scraping legal” have steadily increased over the past 4 years.

“We are disappointed with the court's decision. This is a preliminary decision and the matter is far from over,” LinkedIn spokesman Greg Snapper said in a statement.

“We will continue to fight to protect our members' ability to control the information they provide on LinkedIn. If your data is taken without permission and used in a way to which you have not consented, this is not acceptable. On LinkedIn, our members entrust us with their information, which is why we prohibit unauthorized scraping on our platform.”

Court cases are among the best resources when investigating the legality of an activity. We will review 3 current and notable legal cases concerning web scraping. The main question in all these cases is whether the terms of use that many websites list to prohibit web scraping (or automated access) are legally enforceable. Of course, there are no problems with websites that allow web scraping.

On April 18, 2022, the Ninth Circuit reaffirmed that scraping publicly available data does not violate the CFAA. The Ninth Circuit relied on Van Buren, where the U.S. Supreme Court introduced the “gates-up-or-down” inquiry: if authorization is required and has been granted, the gates are up; if authorization is required and has not been granted, the gates are down. In its recent judgment in HiQ v. LinkedIn, the Ninth Circuit emphasized that a defining feature of public websites is the absence of access restrictions; therefore, following the gates analogy, there were no gates to lift or lower. In other words, where no authorization is required, there is nothing to revoke later. The CFAA concept of “without authorization” simply does not apply to public websites.

It all depends on what you scrape and how you scrape it. It's quite similar to taking pictures with your phone. In most cases, it's completely legal, but photographing a military base or confidential documents can get you in trouble. Web scraping is the same. There is no law or rule prohibiting web scraping, but that doesn't mean you can scrape everything.

This is not surprising given the growth of web scraping and the many ongoing legal cases related to it. In the EU, scraping of copyright-protected content is permitted by Articles 3 and 4 of Directive 2019/790 on copyright and related rights in the Digital Single Market (DSM Directive).

The DSM Directive allows text and data mining. Anti-scraping techniques are typically used to stop malicious bots that overload and block a website, but they can also be used more broadly to make automated scraping less profitable.

Web scraping is legal if you retrieve publicly available data from the Internet. However, you should avoid scraping personal data or intellectual property. We cover the confusion surrounding the legality of web scraping and give you tips for compliant and ethical scraping.

Octoparse has introduced a unique feature, web scraper templates: pre-formatted scrapers that cover more than 14 categories on more than 30 websites, including Facebook, Twitter, Amazon, eBay, Instagram and more.
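As a rough illustration of the advice to avoid personal data, here is a minimal Python sketch that drops fields likely to contain personal data from scraped records before they are stored. The field names in `PERSONAL_FIELDS`, the function name `strip_personal_fields`, and the sample records are all hypothetical assumptions, not part of any particular scraping tool; a name-based filter like this is only a first pass and will not catch personal data hidden in free-text fields.

```python
# Hypothetical list of keys that often hold personal data; adjust to your own data.
PERSONAL_FIELDS = {"name", "email", "phone", "address"}


def strip_personal_fields(record, personal_fields=PERSONAL_FIELDS):
    """Return a copy of `record` without keys whose lowercase name is in `personal_fields`."""
    return {
        key: value
        for key, value in record.items()
        if key.lower() not in personal_fields
    }


# Made-up sample records standing in for scraped output.
scraped = [
    {"product": "Widget", "price": "9.99", "Email": "seller@example.com"},
    {"product": "Gadget", "price": "19.50", "phone": "555-0100"},
]

cleaned = [strip_personal_fields(record) for record in scraped]
print(cleaned)
```

Filtering before storage, rather than after, keeps the personal fields out of any files or databases the scraper writes.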
