{"id":22104,"date":"2023-09-08T17:33:23","date_gmt":"2023-09-08T17:33:23","guid":{"rendered":"https:\/\/nftandcrypto-news.com\/nft\/x-updates-its-terms-bans-data-scraping-crawling\/"},"modified":"2023-09-08T17:33:24","modified_gmt":"2023-09-08T17:33:24","slug":"x-updates-its-terms-bans-data-scraping-crawling","status":"publish","type":"post","link":"https:\/\/nftandcrypto-news.com\/nft\/x-updates-its-terms-bans-data-scraping-crawling\/","title":{"rendered":"X Updates its Terms, Bans Data Scraping & Crawling"},"content":{"rendered":"
\n

X, formerly known as Twitter, has just updated its terms of service<\/a> (again) to explicitly forbid scraping and crawling of its platform without prior written consent.\u00a0<\/p>\n

The updated terms, set to take effect on September 29, 2023, introduce strict controls on unauthorized data collection. They come just eight days after X amended its Privacy Policy to state that the platform will begin collecting users\u2019 biometric data, along with their education and employment history.\u00a0<\/p>\n

The previous version of the terms permitted crawling as long as it adhered to the guidelines outlined in the robots.txt<\/em><\/strong> file \u2013 a set of instructions that tells \u201ccrawlers\u201d (automated programs) which parts of a website they are allowed to visit. The revised terms eliminate this provision: any form of scraping or crawling now requires explicit written consent from X.<\/p>\n
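As a rough illustration of how a crawler consults robots.txt rules, the sketch below uses Python\u2019s standard urllib.robotparser module. The rules, the ExampleBot user agent, and the example.com URLs are hypothetical placeholders, not X\u2019s actual robots.txt; under the revised terms, passing a check like this would no longer be sufficient without written consent.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; these are NOT X's real robots.txt.
EXAMPLE_RULES = '''
User-agent: *
Disallow: /private/
Allow: /
'''


def build_parser(rules_text):
    # Parse robots.txt text that is already in memory (no network request).
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser


def is_allowed(parser, user_agent, url):
    # True if the parsed rules permit this user agent to fetch the URL.
    return parser.can_fetch(user_agent, url)


parser = build_parser(EXAMPLE_RULES)
public_ok = is_allowed(parser, 'ExampleBot', 'https://example.com/news/story')
private_ok = is_allowed(parser, 'ExampleBot', 'https://example.com/private/data')
print(public_ok, private_ok)
```

With these sample rules, the first URL is allowed and the second is blocked, because the Disallow line for the private path matches before the general Allow line.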

Web Crawling vs. Web Scraping<\/h2>\n

While the two terms sound similar, they serve different purposes.\u00a0<\/p>\n

Web \u201ccrawling\u201d grabs other web pages to create indices or collections of data, while web \u201cscraping\u201d downloads webpages to extract a specific set of data for analysis \u2013 e.g. product details, pricing information, SEO data, etc<\/em>.\u00a0<\/p>\n

Essentially, \u201cweb scraping\u201d extracts publicly available data from a website and saves it to a local file or folder on your computer. It often relies on a \u201ccrawler\u201d program that looks for the specific data the user wants, along with additional targets to crawl. \u201cWeb crawling\u201d, by contrast, discovers target URLs and other links in order to build one or more indices of data.\u00a0<\/p>\n
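The difference can be sketched in a few lines of Python. The example below is a simplified illustration, not a real crawler: FAKE_SITE is a hypothetical in-memory stand-in for a website, and crawl and scrape_prices are names invented for this sketch. No network requests are made.

```python
from collections import deque

# Hypothetical in-memory 'website' for illustration; no network access.
# Each page has a title, an optional price, and links to other pages.
FAKE_SITE = {
    '/': {'title': 'Home', 'price': None, 'links': ['/widgets', '/about']},
    '/widgets': {'title': 'Widget', 'price': 19.99, 'links': ['/gadgets', '/']},
    '/gadgets': {'title': 'Gadget', 'price': 5.5, 'links': []},
    '/about': {'title': 'About', 'price': None, 'links': ['/']},
}


def crawl(site, start):
    # Crawling: follow links breadth-first and build an index of pages found.
    index = []
    queue = deque([start])
    seen = {start}
    while queue:
        url = queue.popleft()
        index.append(url)
        for link in site[url]['links']:
            if link in site and link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def scrape_prices(site, urls):
    # Scraping: extract one specific field (the price) from the given pages.
    return {url: site[url]['price'] for url in urls
            if site[url]['price'] is not None}


index = crawl(FAKE_SITE, '/')
prices = scrape_prices(FAKE_SITE, index)
print(index)
print(prices)
```

Here the crawl step only discovers and lists pages, while the scrape step pulls a targeted piece of data out of them; real tools typically combine the two.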

Data scraping is one of the most effective ways to extract data from the web. Fetching the pages requires an internet connection, but once the data has been saved locally it can be analyzed offline.\u00a0<\/p>\n

In conjunction with the updated terms of service, X has recently altered its robots.txt file, which tells web crawlers, including those from Google, which sections of the site they are permitted to access. The amendments curtail access to specific types of data, including the likes and retweets associated with particular posts, as well as account-level information such as likes, media, and photos.<\/p>\n

The decision to tighten restrictions on scraping and data access comes on the heels of X\u2019s recent platform modifications. These included temporarily preventing logged-out users from viewing posts, a login requirement that was subsequently lifted.\u00a0<\/p>\n

X\u2019s owner, Elon Musk, said these measures were needed in response to excessive data scraping, which was adversely affecting the platform\u2019s performance for regular users.<\/p>\n

Musk has been a vocal opponent of companies scraping Twitter\/X data to train AI models. He previously threatened legal action against Microsoft, alleging that it had unlawfully used the platform\u2019s data for AI training.\u00a0<\/p>\n

In July, Musk initiated legal action against \u201cJohn Doe\u201d defendants alleged to be involved in unauthorized data collection.<\/p>\n

The impact of these stringent measures on data accessibility and X\u2019s relationship with web crawlers, including those from tech giants like Google, remains to be seen.<\/p>\n

Editor\u2019s note: This article was written by an nft now staff member in collaboration with OpenAI\u2019s GPT-3.<\/em><\/p>\n<\/p><\/div>\n