Using Content Analysis Software in Bibliometrics and Scientometrics

Bibliometrics and scientometrics often involve monitoring research, assessing the scientific contribution of authors, journals, specific works, or citations, and analyzing how scientific knowledge is disseminated. The discipline has grown rapidly over the last 10 to 20 years. One reason is the advent of readily available computer software that can analyze large amounts of text data, together with the availability of books, journal articles and other data over the internet. In addition, governments and other institutions increasingly want to know how influential or far-reaching the research they have funded is.

As an example, a simple search on Google Scholar turned up more than 50 articles on bibliometrics and scientometrics, published between 2013 and 2020, that used our content analysis software WordStat. The topics vary widely and include health science, marine science, management, marketing, communications, information science, agriculture, sustainable development and more.

Both computer-assisted qualitative software like QDA Miner and text mining software like WordStat are highly effective in pursuing this form of research. For example, Dabic, González-Loureiro, & Furrer (2014) performed content analysis on 1116 papers to analyze strategies of multinational enterprises. Gora (2019) analyzed 1516 selected papers, the citations in those articles and the keywords to explore management decision making and performance. Milojevic, Sugimoto, Larivière, Thelwall, & Ding (2014) analyzed the full text of five handbooks (500,000 words) and a well-defined set of 11,700 science and technology study articles to explore the role handbooks play in knowledge creation and diffusion and their relationship with the genre of journal articles, particularly in highly interdisciplinary and emergent social science and humanities disciplines. Reinhold, Laesser, and Bazzi (2015) selected 900 scholarly articles published between 1946 and 2012 and, in their study on transportation management research, identified 55 topical clusters that delineate and weight the dominant associations with the term “transportation management.”

The subject matter varies, but the common factor is quantity. The practice of bibliometrics and scientometrics usually requires analyzing very large numbers of articles, abstracts, books, web pages or other forms of text-based scientific information. WordStat is built to help researchers in this area. Its ability to quickly perform topic modelling, phrase extraction, cluster extraction, keyword-in-context, keyword retrieval, correspondence analysis and more allows researchers to analyze very large datasets and get an understanding of the common themes and patterns, how they relate to each other and how they don’t.

The following is a more detailed description of how one author, Rudd (2017), used QDA Miner and WordStat to find trends in ocean and coastal sustainability research from journal abstracts. In the initial topic/keyword search, the author found 203,348 articles.

“I first used topic modeling to filter research topics clearly outside the scope of this research and then iteratively used a variety of tools (e.g., phrase extraction, “query by example,” thesaurus searches) to further winnow the selection to articles that dealt with ecologically-oriented ocean sustainability challenges and their potential solutions. At the start, the abstracts contained >8.1 million words; a stop list […] was constructed for common words and phrases [..] eliminating 4.05 million words from the analysis.” (p.4)

The author then proceeded to perform coding by paragraph and developed a content analysis dictionary to analyze the abstracts. The full paper is available online.
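
The two steps described above, applying a stop list and then coding text with a content analysis dictionary, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the author's actual procedure or WordStat's implementation: the stop list, the dictionary categories and terms, and the sample abstracts are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop list; a real one would be far longer.
STOP_LIST = {"the", "of", "and", "in", "a", "to", "for", "is", "are"}

# Hypothetical content analysis dictionary: category -> terms.
DICTIONARY = {
    "fisheries": {"fishery", "fisheries", "overfishing", "catch"},
    "pollution": {"pollution", "plastic", "plastics", "runoff"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens, stop_list=STOP_LIST):
    """Drop every token that appears in the stop list."""
    return [t for t in tokens if t not in stop_list]


def code_text(tokens, dictionary=DICTIONARY):
    """Count how many tokens fall under each dictionary category."""
    counts = Counter()
    for t in tokens:
        for category, terms in dictionary.items():
            if t in terms:
                counts[category] += 1
    return counts


# Made-up abstracts for demonstration only.
abstracts = [
    "Overfishing reduces the catch of coastal fisheries.",
    "Plastic pollution and runoff are threats to the ocean.",
]

for text in abstracts:
    tokens = tokenize(text)
    kept = remove_stop_words(tokens)
    # Words before filtering, words after filtering, category counts.
    print(len(tokens), len(kept), dict(code_text(kept)))
```

Comparing the token counts before and after filtering shows, on a small scale, how a stop list shrinks the volume of text to be analyzed, while the category counts give a simple per-abstract profile that can be compared across a large collection.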

