What Were Russian Trolls Saying on Twitter? How to Use a Text Mining, Content Analysis Approach to Find Out

As we have learned, during the 2016 US Presidential Election there were significant efforts by outside actors to influence the vote. Those efforts have been the subject of several investigations and reports. The following is a blog post by Ryan Atkinson explaining how he took a content analysis approach to examining more than 200,000 tweets and how he used WordStat software in his work.

The project that Professor Joanne Miller and I worked on involved analyzing a dataset of over 200,000 tweets that were either sent or retweeted by Internet Research Agency-affiliated trolls around the 2016 United States presidential election. This was my first time as a student dealing with such a large dataset, so it was intimidating to even know where to start analyzing its content. However, Provalis has a YouTube channel and a comprehensive user manual on their website for both QDA Miner and WordStat, and I used these resources to build categories, organize tweets, and discover thematic relationships.

The paper addresses four general contextual themes in the dataset. The first theme involves superficial identities and their support for or opposition to the Republican or Democratic candidate in the 2016 election. The second theme involves general troll tactics. The third theme involves message salience as observed through retweet counts. The fourth theme involves explicit support for either Hillary Clinton or Donald Trump.
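
For readers who want a feel for how themes like these can be coded outside of WordStat, here is a minimal Python sketch of dictionary-based coding combined with retweet totals. It is not the method used in the paper: the category keywords, the sample tweets, and the retweet counts are all made up for illustration, and matching a candidate's name only shows that a tweet mentions the candidate, not whether it supports or opposes them.

```python
from collections import defaultdict

# Hypothetical category dictionary: category name -> keywords that trigger it.
CATEGORIES = {
    "clinton": {"hillary", "clinton"},
    "trump": {"donald", "trump"},
}

# Hypothetical sample rows (tweet text, retweet count); not from the real dataset.
tweets = [
    ("Trump rally tonight", 120),
    ("Clinton speaks in Ohio", 45),
    ("Trump and Clinton debate", 300),
    ("Nice weather today", 2),
]

def code_tweet(text):
    """Return the sorted list of categories whose keywords appear in the tweet."""
    words = set(text.lower().split())
    return sorted(name for name, keywords in CATEGORIES.items() if words & keywords)

# Sum retweets per category as a rough stand-in for message salience.
retweets_by_category = defaultdict(int)
for text, retweet_count in tweets:
    for category in code_tweet(text):
        retweets_by_category[category] += retweet_count

for category, total in sorted(retweets_by_category.items()):
    print(category, total)
```

A tweet that matches more than one category is counted in each of them, which is a design choice for this sketch; a real coding scheme would also need to handle punctuation, hashtags, and stance.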

The visualization tools helped guide the narrowing process to these four themes by revealing word co-occurrences and relationships within the dataset before I coded for the four themes. One of the most helpful tools allowed me to filter tweet counts based on a variable, which I used to discover the most prolific tweeters and the content of their tweets. Furthermore, narrowing categories into subcategories revealed patterns of troll tactics that, without the assistance of the software, would have been incredibly difficult and time-consuming to find.
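
To make the two ideas in this paragraph concrete, the following Python sketch counts tweets per account and tallies which words appear together in the same tweet. It only approximates what the software does behind its visualizations: the account handles, tweet texts, and stopword list are hypothetical, and co-occurrence is defined here simply as two words appearing in the same tweet.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical sample rows (account handle, tweet text); not from the real dataset.
tweets = [
    ("account_a", "Vote early and bring a friend"),
    ("account_a", "Bring a friend to vote"),
    ("account_b", "Election news and vote updates"),
]

# A tiny illustrative stopword list; a real analysis would use a fuller one.
STOPWORDS = {"a", "and", "to", "the"}

def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]

# Tweets per account: the basis for finding the most prolific tweeters.
tweets_per_account = Counter(handle for handle, _ in tweets)

# Word co-occurrence: count each unordered pair of distinct words once per tweet.
cooccurrence = Counter()
for _, text in tweets:
    unique_words = sorted(set(tokenize(text)))
    cooccurrence.update(combinations(unique_words, 2))

print(tweets_per_account.most_common(1))
print(cooccurrence[("bring", "friend")])
```

Sorting the words before pairing them keeps each pair in a single canonical order, so ("bring", "friend") and ("friend", "bring") are never counted separately.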

I had only three months to learn how to use the software, develop the scope of the research questions, and write a final paper on the results. Both QDA Miner and WordStat offered a low barrier to entry and an extensive toolkit, both of which I needed on such a short turnaround. I was only able to scratch the surface of the potential uses of the software, which provides useful tools for beginner and advanced users alike. The finished paper and results can be found at the following University of Minnesota Digital Conservancy link: http://hdl.handle.net/11299/199858.