Using Text Mining to Suggest What’s Next for Travel after Covid-19

We all know that Covid-19 has decimated the travel industry. Most experts believe it will take years before travel returns to 2019 levels. What will it take to get there? In their paper, Uğur & Akbiyik (2020) use text mining to look for some of the potential answers.

The paper uses data from early in the pandemic. From our perspective, its conclusions on some of the strategies to help rebuild the travel industry are interesting, but what is most instructive is how the authors use text mining and the features of WordStat to explore and analyze their data.

Three keywords (coronavirus, coronavirus, COVID) were chosen to create the dataset. After a sorting process, the preliminary dataset (captured between December 30, 2019 and March 15, 2020) comprised 23,515 comments from the US, Asian, and European Trip Advisor forums. This included a total of 74,768 sentences containing 1,329,825 words, with 844,253 words removed by way of lemmatization because they were not meaningful (e.g., I, or).
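To make the dataset step concrete, here is a minimal Python sketch of keyword filtering followed by removal of non-meaningful words. It is not the authors' pipeline or WordStat's implementation: the keyword tuple, the tiny stop-word list, the tokenizer, and the function names are all assumptions made for illustration.

```python
import re

# Illustrative assumptions: a short keyword tuple and a tiny stop-word list.
KEYWORDS = ("coronavirus", "covid")
STOP_WORDS = {"i", "or", "the", "a", "is", "to", "my", "and"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def mentions_keyword(comment, keywords=KEYWORDS):
    """Return True if the comment contains any keyword (case-insensitive)."""
    lowered = comment.lower()
    return any(keyword in lowered for keyword in keywords)

def summarize(comments):
    """Keep keyword-matching comments and count total and removed words."""
    kept = [c for c in comments if mentions_keyword(c)]
    tokens = [t for c in kept for t in tokenize(c)]
    meaningful = [t for t in tokens if t not in STOP_WORDS]
    return {
        "comments": len(kept),
        "words": len(tokens),
        "removed": len(tokens) - len(meaningful),
    }
```

Calling summarize on a list of forum comments returns the number of retained comments, the total word count, and how many words the stop-word filter dropped, which mirrors the kind of figures reported above.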

The authors examined the 500 words with the highest TF*IDF values, applying certain criteria for frequency, appearance in a certain number of cases, and repetition. They had the software create a word cloud so they could visualize the results.
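For readers unfamiliar with TF*IDF, the following Python sketch ranks words by one common variant of the score: total term frequency multiplied by the natural log of (number of cases / number of cases containing the word). The exact weighting WordStat applies is not stated here, so the formula, the function name, and the min_cases parameter are assumptions for illustration.

```python
import math
from collections import Counter

def top_tfidf_words(docs, n=500, min_cases=1):
    """Rank words by term frequency times inverse document frequency.

    docs is a list of token lists, one per comment (a "case").
    Words appearing in fewer than min_cases cases are skipped.
    """
    n_docs = len(docs)
    tf = Counter()  # total occurrences of each word across all cases
    df = Counter()  # number of cases that contain each word
    for tokens in docs:
        tf.update(tokens)
        df.update(set(tokens))
    scores = {
        word: tf[word] * math.log(n_docs / df[word])
        for word in tf
        if df[word] >= min_cases
    }
    # Highest score first; ties are broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:n]
```

A word that appears in every case scores zero under this variant, which is why very common words drop to the bottom of the ranking while words concentrated in a few cases rise to the top.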

They next deployed the phrase extraction feature to see the most frequent phrases, again using some pre-set conditions for frequency. In the paper, they present a table of the most frequent phrases. This gave them even more insight into the comments. They began to discover, among other things, that many of the phrases relate to insurance, refunds, and cancelations, or to people commenting about how to be compensated for travel disruption caused by the pandemic. They subsequently used the topic extraction feature of WordStat and saw that some of the most frequent topics were about refunds, travel insurance, cancelation, risk of cancelation, and so on. Of course, there were also the expected topics relating to hand washing, masks, and other safety-related issues. The authors continued to explore their dataset, using the dendrogram feature to see the clustering of words and how it related to the automatic topic extraction, and the cross-tab and link analysis features to look at other elements in their data.
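The idea behind phrase extraction can be sketched as counting word sequences (n-grams) and keeping those that meet a minimum frequency. The Python below is a simplified illustration, not WordStat's algorithm; the function name, the phrase length limits, and the min_freq threshold are assumptions.

```python
from collections import Counter

def frequent_phrases(docs, min_len=2, max_len=3, min_freq=2):
    """Count phrases of min_len to max_len words and keep the frequent ones.

    docs is a list of token lists, one per comment.
    Returns (phrase, count) pairs, most frequent first.
    """
    counts = Counter()
    for tokens in docs:
        for size in range(min_len, max_len + 1):
            # Slide a window of `size` words across the comment.
            for start in range(len(tokens) - size + 1):
                counts[" ".join(tokens[start:start + size])] += 1
    return [(p, c) for p, c in counts.most_common() if c >= min_freq]
```

Run on tokenized travel forum comments, a function like this would surface recurring phrases such as "travel insurance" once they cross the chosen frequency threshold.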

We will let you read the complete paper to examine the discussion, conclusions, and what it could mean for the future of the global travel industry.

In these uncertain times of COVID-19, things are changing quickly. New tactics and strategies must be developed and deployed to keep people safe and help businesses adapt to survive. Some of the answers are found in customer comments, social media posts, surveys, employee comments, and many other forms of text data. This paper is a good step-by-step example of how to use a text-mining tool to help researchers find meaning and answers in that data.

One of the authors of the paper, Adem Akbiyik, has written a book in Turkish that describes the basic concepts of text mining and its applications for creating projects with WordStat in the field of social science.


Uğur, N. G., & Akbiyik, A. (2020). Impacts of COVID-19 on global tourism industry: A cross-regional comparison. Tourism Management Perspectives, 100744.