One watchdog argues the limit will help curtail the advancement of AI-trained mass-censorship models, while others are convinced the move is no more than a bid to pump Twitter Blue subscriptions.
To the dismay of countless Twitter users, CEO Elon Musk has imposed a temporary limit on how many tweets users can read. In a July 1 tweet, Musk explained the limit was introduced to address “extreme levels of data scraping” – a term that describes the automated process of ingesting huge volumes of public data for various purposes, including training AI models.
Verified users are now limited to reading 10,000 tweets a day, while unverified users are limited to 1,000. New unverified accounts are limited to just 500. (Initially, these limits were 6,000, 600 and 300 respectively.)
Twitter, as you’d expect, tore its hair out over the announcement. “We are tired of censorship, Elon!”, one user wrote, adding “some of us thought this social network was free, but oh! Limits!”.
Others confidently argued that Musk’s motive was simply to increase the subscriber base of Twitter Blue, rather than to curb data scraping.
Data scraping: is it something to worry about?
Possibly, if you value a free and uncensored internet. In the context of AI, data scraping is the process of automatically extracting data from various online sources, including social media sites, to gather huge pools of data that can be analysed and applied to language processing and machine learning. It also has more banal purposes, like analysing market trends.
The human feel of chatbots like ChatGPT and Bard comes from somewhere, after all.
Yoel Roth, a former Twitter employee, wrote on Bluesky that “scraping was the open secret of Twitter data access. We knew about it, it was fine.”
He also wrote that there is “some legitimacy” to Twitter’s concerns about AI companies “slurping up social data gratis in order to train commercially lucrative models”, but that the company ought not to forget that the data belongs to users, rather than to Twitter. “A solution to parasitic AI needs to be user-centric, not profit-centric,” he said.
Twitter is a far more lucrative platform to scrape than Meta’s products because its profiles are far more public.
It gets weirder, though.
Mike Benz, executive director at the Foundation For Freedom Online, a free speech watchdog, offered a compelling take on Twitter yesterday. He pointed out that data scraping has enabled third-party companies to become exceptionally effective at censoring tweets – even more so than the FBI.
He said, “The Twitter files showed how the FBI might come in and get 22 tweets censored. AI technology, such as Enterprise Intelligence Platform (EIP) and other types of third party censorship groups, were able to get 22 million tweets censored […] it’s a completely different animal”.
Most interesting of Benz’s comments was that excessive data scraping aids in the construction of a “social media censorship death star” – an AI-trained censorship and surveillance model that has been in the works for seven years.
The goal of this tax-funded ‘death star’, he argued, is ultimately to surveil and control public conversation.
“Political communities, social or public health communities, climate communities, you name it – whatever the sensitive policy issue of the day is, you can use this massive Twitter scraping capacity to ingest everyone’s tweets and then disambiguate out the words they’re using, the hashtags, the themes, the memes, to build this sort of code book of online communities that can be used for mass censorship, that is used for mass censorship,” he detailed.
He added that such models have been used by the Central Intelligence Agency (CIA), the State Department, the Defense Department, hundreds of government-funded NGOs, and more.
Benz asserted that this ‘death star’ is tax-funded “to the tune of tens of millions of dollars from DARPA” – the research and development agency of the US Department of Defense responsible for the development of emerging technologies for use by the military – “and the National Science Foundation, to say nothing about the State Department and the National Endowment for Democracy grants.”
“Musk has no idea the DARPA rattlesnake he just stepped on by doing this,” he said.
Musk tweeted an approving ‘spot on’ emoji in response to Benz’s theory.
So, while the new viewing limit on Twitter does in one way limit the openness of the internet, it also potentially preserves the openness of the internet by preventing the construction of this censorship ‘death-star’. (According to Benz, at least.)
Benz also acknowledged that even if Musk’s motive is to pump up the subscription base of Twitter Blue, there will nonetheless be “hundreds of censorship operatives, housed within the university research centres, this week howling at the moon that this is an attack on democracy for Musk to limit their access.”
He explained, “If they lose access to the underlying data on which their AI censorship models are built, then they will not be able to do their jobs as effective, fast, precise, and comprehensive as social media censors.”
Regardless of Musk’s motives, the rate limit already appears to be curbing excessive data scraping. ChatGPT’s web browser extension can no longer access Twitter’s data.
Binance CEO Changpeng Zhao (CZ) chimed in on the discussion, writing that “viewing should not be limited. Posting and commenting should be. No humans post more than 800 posts per day.”
In a follow-up tweet, he revised his advice, stating, “actually, only commenting need to be limited. IMO.”
It’s unclear when the restrictions will be lifted.
Disclaimer: CryptoPlug does not recommend that any cryptocurrency should be bought, sold, or held by you. Do conduct your own due diligence and consult your financial advisor before making any investment decisions.