Analyzing Arabic with WordStat
WordStat text mining software can help you analyze more than 60 languages, including Chinese, Japanese, Korean, Russian, Turkish and Arabic. Even though this information is available on our website, prospective customers often ask us about WordStat’s language capabilities beyond the obvious ones like English and the main European languages. To demonstrate WordStat’s linguistic diversity, we will be presenting a series of blogs highlighting researchers’ analysis of different languages. In this blog we showcase Arabic. It is a form of qualitative, mixed methods research where the story is the data.
The brutal killing of journalist Jamal Khashoggi at the Saudi Arabian consulate in Istanbul in 2018 caused widespread outrage. After his disappearance was reported, many news and social media reports commented negatively on Khashoggi’s journalism and character. These appeared on Twitter, Facebook, Instagram and many other platforms. How could so much of the information concerning this well-known journalist have been so at odds with the established narrative of his career and character? A large part of the answer is found on social media and the internet, in what seems to have been a concerted, organized, state-sponsored campaign.
We have seen state-sponsored digital disinformation campaigns waged by various authoritarian governments in the recent past. Actors include Russia, China, Turkey, and others. MacKinnon (2011) coined a term for this kind of activity: networked authoritarianism. Part of the aim of the study this blog focuses on, Al-Rawi (2021), was to see how this concept applies to state-run disinformation campaigns against individuals.
Dr. Ahmed Al-Rawi is an Assistant Professor of News, Social Media, and Public Communication at the School of Communication at Simon Fraser University, Canada. He is the Director of the Disinformation Project, which empirically examines fake news discourses in Canada on social media and news media. His research expertise is in social media, news, and global communication, with an emphasis on the Middle East. In Al-Rawi (2021), he explores a coordinated disinformation campaign waged against Khashoggi and his fiancée by Saudi Arabia and its agents. Much of the internet and social media text data (Twitter, Facebook) was in Arabic. The author used a variety of means to collect this data, and he analyzed the text component using QDA Miner and WordStat.
Al-Rawi (2021, p. 144) describes the approach as follows: “In order to examine the trolls’ textual dataset as a whole, I used QDA Miner—WordStat 8 to identify the most frequent words and their association with other terms. I also conducted topic modeling analysis with the use of factor analysis arranged based on their eigenvalues. I used this software because it allows topic modeling analysis of non-English texts (Al-Rawi, Kane & Bizimana, 2021).”
The author used WordStat’s link analysis to examine the relationships among the top 100 keywords (Fig. 1). He also used topic modeling, which largely confirmed the relationships between word groups found in the link analysis.
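For readers who want a feel for what "most frequent words and their association with other terms" means in practice, here is a minimal Python sketch of word-frequency and co-occurrence counting on Arabic text. It is not WordStat's implementation and not the author's pipeline: the mini-corpus, the function names, and the very simple tokenizer (runs of characters from the basic Arabic Unicode block) are all assumptions made for illustration.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical mini-corpus for illustration only; the study's data is not reproduced here.
DOCS = [
    "الاعلام ينشر اخبار كاذبة",
    "اخبار كاذبة عن الصحفي",
    "الصحفي ينشر اخبار",
]

# Deliberately simple tokenizer (an assumption): runs of characters
# from the basic Arabic Unicode block, with no stemming or stop-word removal.
TOKEN_RE = re.compile(r"[\u0600-\u06FF]+")


def tokenize(text):
    """Split a text into Arabic-script tokens."""
    return TOKEN_RE.findall(text)


def word_frequencies(docs):
    """Count how often each token occurs across all documents."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    return counts


def cooccurrences(docs, vocabulary):
    """Count, for each pair of vocabulary words, the documents containing both."""
    pairs = Counter()
    for doc in docs:
        present = sorted(set(tokenize(doc)) & vocabulary)
        pairs.update(combinations(present, 2))
    return pairs


freqs = word_frequencies(DOCS)
top_words = {word for word, _ in freqs.most_common(3)}
links = cooccurrences(DOCS, top_words)

for (first, second), n_docs in links.most_common():
    print(first, second, n_docs)
```

The pair counts play the role of edge weights in a link diagram: the more documents two keywords share, the stronger the link between them. A real analysis would use the top 100 keywords rather than three, and would normally add Arabic-specific normalization before counting.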
In addition to identifying the troll-generated content, the study also analyzes its relative effectiveness. Its conclusions shed light on networked authoritarianism, and the study is a good example of how to approach and analyze this type of event, of which we are likely to see more in the future.
Al-Rawi, A. (2021). Disinformation under a networked authoritarian state: Saudi trolls’ credibility attacks against Jamal Khashoggi. Open Information Science, 5(1), 140–162.
MacKinnon, R. (2011). China’s “Networked Authoritarianism”. Journal of Democracy, 22(2), 32–46.