(Photo credit: Reuters)

Between February 2, 2012 and May 30, 2018, 2,973,371 tweets from 2,848 Twitter accounts spread misinformation and discord like a cancer through cyberspace. Many of these tweets were meant to divide the left while sparking rage and galvanizing the Republican base with political attacks that echoed the talking points of MAGA Trump supporters. They came from a dubious “Internet Research Agency” (IRA) attempting to interfere with the US presidential election, one which eventually saw Donald Trump take the White House.

The tweets were harvested from the internet by researchers Darren Linvill and Patrick Warren at Clemson University and the school’s Social Media Listening Centre, “an interdisciplinary lab that captures ‘more than 650 million sources of social media conversations.’”

“Earlier this year, as part of special counsel Robert Mueller’s investigation, the Justice Department charged 13 Russian nationals with interfering in American electoral and political processes. The defendants worked for a well-funded ‘troll factory’ called the Internet Research Agency, which had 400 employees, according to one Russian news report.” (FiveThirtyEight)

The data was shared with FiveThirtyEight, which published an excellent story (via Oliver Roeder) detailing the scale of this “troll factory” churning out fake content, and reporters there made the data publicly available for further analysis.

The bulk of the tweets were sent between November 2014 and December 2017 with several pronounced spikes in volume within that period. October 6, 2016 and August 16, 2017 recorded the highest numbers, with 18,722 and 14,720 tweets, retweets, or quoted retweets published respectively. As first reported by the Washington Post, the October 6 spike in activity preceded the release of the Clinton campaign emails by WikiLeaks.
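Daily totals like the ones above can be tallied with a few lines of Python. The sketch below is not the researchers' or FiveThirtyEight's method. It assumes each row carries a `publish_date` field formatted like `10/6/2016 14:30` (an assumption about the published files), and it uses a handful of made-up rows in place of the real data.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample rows. Real rows would be read from the published
# CSV files, for example with csv.DictReader. The field name and the
# date format are assumptions, not confirmed by the article.
SAMPLE_ROWS = [
    {"publish_date": "10/6/2016 14:30"},
    {"publish_date": "10/6/2016 9:05"},
    {"publish_date": "8/16/2017 21:12"},
]


def tweets_per_day(rows, date_field="publish_date", fmt="%m/%d/%Y %H:%M"):
    """Count how many rows fall on each calendar day."""
    counts = Counter()
    for row in rows:
        day = datetime.strptime(row[date_field], fmt).date()
        counts[day] += 1
    return counts


def busiest_days(rows, n=2):
    """Return the n days with the most posts as (date, count) pairs."""
    return tweets_per_day(rows).most_common(n)


for day, count in busiest_days(SAMPLE_ROWS):
    print(day.isoformat(), count)
```

Run against the full data set, the same tally would surface the spikes and make it easy to line them up against the news calendar.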

Most of the content was sent between 2015 and 2017, but the record extends as far back as 2012 (488 posts). Linvill and Warren have published a paper, Troll Factories: The Internet Research Agency and State-Sponsored Agenda Building, in which they organize the IRA’s troll accounts into five categories: Right Troll, Left Troll, News Feed, Hashtag Gamer, and Fearmonger.

As explained by FiveThirtyEight, Right Trolls behave and communicate like typical Donald Trump supporters, often espousing bigoted or intolerant views towards immigrants and posting highly political content. They generated the largest share of partisan messages, with just under a quarter (24%) of total tweets, retweets, and quoted retweets. Left Trolls tend to emulate progressive activist accounts and try to divide the left, for example by supporting Bernie Sanders and attacking Hillary Clinton. Their content made up 14% of the total. News Feeds aggregated related news articles, Hashtag Gamers invited users to engage in wordplay games, and Fearmongers “spread news about a fake crisis, such as salmonella-contaminated turkeys around Thanksgiving.”

Based on the volume of content, it appears the strategy may have been to attack the left directly through the highly partisan content of Right Trolls while “dividing and conquering” by using the Left Trolls to splinter progressives. News Feeds may have served to signal-boost specific content, and Hashtag Gamers to increase engagement and message reach.

Three distinct types of content were measured: original, retweets, and quoted retweets. The majority of content was in the first category (55%), followed by direct retweets (42%). The mix differed noticeably by type of account. Right Trolls published original material and retweeted others in roughly equal measure: 24% of original content from the IRA was published by Right Trolls, which were also behind 23% of the retweets. Left Trolls, on the other hand, focused more on retweeting: 26% of total retweets came from Left Trolls, whereas these accounts generated only 4% of total original content from the IRA. News Feeds, by contrast, focused almost exclusively on original content, i.e., non-retweets.
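A breakdown of this kind, the share of each content type attributable to each account category, can be sketched as follows. The field names `account_category` and `post_type`, the category labels, and the convention that an empty `post_type` means an original tweet are all assumptions made for illustration. The sample rows are invented and do not reproduce the percentages quoted above.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows. Field names and labels are assumptions.
SAMPLE_ROWS = [
    {"account_category": "RightTroll", "post_type": ""},
    {"account_category": "RightTroll", "post_type": "RETWEET"},
    {"account_category": "LeftTroll", "post_type": "RETWEET"},
    {"account_category": "LeftTroll", "post_type": "RETWEET"},
    {"account_category": "NewsFeed", "post_type": ""},
    {"account_category": "NewsFeed", "post_type": ""},
    {"account_category": "NewsFeed", "post_type": ""},
    {"account_category": "HashtagGamer", "post_type": "QUOTE_TWEET"},
]


def share_by_category(rows):
    """For each post type, return each category's fraction of that type."""
    counts = defaultdict(Counter)
    for row in rows:
        # Treat an empty post_type as an original tweet (assumption).
        post_type = row["post_type"] or "ORIGINAL"
        counts[post_type][row["account_category"]] += 1

    shares = {}
    for post_type, by_category in counts.items():
        total = sum(by_category.values())
        shares[post_type] = {cat: n / total for cat, n in by_category.items()}
    return shares


for post_type, by_category in share_by_category(SAMPLE_ROWS).items():
    for category, share in sorted(by_category.items()):
        print(f"{post_type:12} {category:13} {share:.0%}")
```

Dividing within each post type, rather than across the whole data set, is what yields statements of the form "X% of all retweets came from Left Trolls".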

Digging Deeper

This data was generously made public by FiveThirtyEight so that others could gain more insight into the purpose and methods behind the IRA and the degree to which it may have influenced the outcome of the US presidential election. The amount of data is massive, split into nine separate files that together hold nearly three million rows. The potential to explore and discover is huge. Here are a few areas one could consider:

  • Translating the content from non-English accounts, which constituted a significant percentage of the entire data set – What were they saying?
  • Analysis of the content shared on days with heavier volume – What was being shared? Were there important events in the news triggering the surge? Was the IRA able to anticipate or detect when people were surfing online so that they could capitalize on audience size?
  • What was the role of News Feeds? What type of content were they sharing?
  • How were the 2,848 troll accounts communicating with and signal-boosting one another? Was there an “order of operations” driving how content was disseminated?
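
One rough way to start on the last question is to count how often troll accounts retweeted each other. The sketch below assumes each row has `author` and `content` fields and that a retweet's text begins with `RT @handle`. Both are assumptions about the data, and the sample rows and handles are invented.

```python
import re
from collections import Counter

# Matches the handle at the start of a conventional "RT @handle: ..." retweet.
RT_PATTERN = re.compile(r"^RT @(\w+)")

# Hypothetical sample rows. Field names and handles are assumptions.
SAMPLE_ROWS = [
    {"author": "TROLL_A", "content": "RT @troll_b: some message"},
    {"author": "TROLL_A", "content": "RT @troll_b: another message"},
    {"author": "TROLL_B", "content": "an original post"},
    {"author": "TROLL_C", "content": "RT @outside_user: unrelated"},
]


def internal_retweet_edges(rows):
    """Count (retweeter, retweeted) pairs where both handles are in the data set."""
    known = {row["author"].lower() for row in rows}
    edges = Counter()
    for row in rows:
        match = RT_PATTERN.match(row["content"])
        if match is None:
            continue  # not a retweet in the assumed format
        source = match.group(1).lower()
        if source in known:
            edges[(row["author"].lower(), source)] += 1
    return edges


for (retweeter, source), count in internal_retweet_edges(SAMPLE_ROWS).most_common():
    print(f"{retweeter} retweeted {source} {count} time(s)")
```

Sorting the resulting pairs by timestamp as well as by count would be a natural next step for anyone looking for an “order of operations”.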

It would also be interesting to learn which accounts were the most prolific and uncover more information about the users behind them. Many of the accounts have since been deleted.

This data reveals the degree to which the underbelly of social media can be organized, sophisticated, and fake, which is something to consider the next time political discourse devolves online. Perhaps the views and opinions that seem to be shared by a majority are not shared by a majority at all. This too is worth reflecting on the next time a topic “trends” on social feeds and digital platforms.

In this climate of fake and parody profiles, it would be understandable to view with suspicion accounts that don’t use real photos, post with a particularly strident tone, and have few followers and inconsistent posting histories. Often these accounts follow each other, which can be a sign they are controlled by the same person—someone with an agenda trying to create the illusion of support and consensus where there is none.