Using Text Mining of Big Data for Prediction

From time to time in this blog we draw your attention to some of the applications of text analytics. In the last few years one of those areas has been predictive analysis, using text mining to explore large databases. Using text mining to learn from past behavior, discover patterns and identify trends allows researchers to make predictions in different fields. This blog is based on the paper Big Data for Prediction: Patent Analysis. The analysis of patents using text mining can help investors and inventors get a better sense of where a...

PART I: Automatic Machine Learning Document Classification – An Introduction

This blog focuses on Automatic Machine Learning Document Classification (AML-DC), which is part of the broader topic of Natural Language Processing (NLP). NLP itself can be described as “the application of computation techniques on language used in the natural form, written text or speech, to analyse and derive certain insights from it” (Arun, 2018). AML-DC aims to automatically assign ‘a data-point to a predefined class or group according to its predictive characteristics’ (Kabir et al., 2018). AML-DC is the essence of text mining as it transects,...

What were Russian Trolls saying on Twitter? How to Use a Text Mining, Content Analysis Approach to Find Out

As we have learned, during the 2016 US Presidential Election there were significant efforts by outside actors to influence the vote. Several investigations and reports have examined those efforts. The following is a blog by Ryan Atkinson explaining how he took a content analysis approach to examining a large volume of tweets and how he used WordStat software in his work. The project that Professor Joanne Miller and I worked on involved analyzing a dataset of over 200,000 tweets that were either sent or retweeted by Internet...

WordStat 8 New Deviation Table Instantly Compares Words/Phrases Used by Different Variables

We have introduced a new feature in WordStat 8 that we believe you will find quite useful: the Deviation Table. It allows you to see which words, phrases or topics are used more or less by different values of a categorical variable. You can also think of it as another way to view the Crosstab feature. The Deviation Table is simple to use. All you do is perform a word frequency or phrase extraction in WordStat. Click on the Cross Tab button in the toolbar and then click on the...

Directly Import Text from Social Media with New WordStat for Stata

Stata users can now directly import text data from many different platforms using WordStat for Stata. The latest version of WordStat for Stata (8) can be run as stand-alone software or integrated with Stata. This means Stata users can use WordStat to directly import text data related to their quantitative data and create projects in WordStat for analysis. Stata users can now use WordStat for Stata to import text data from social media (Twitter, Facebook, RSS Feeds, YouTube, Reddit), email platforms (Outlook, Gmail, Mbox Files), web...

More Fun, More Features in WordStat 8 Text Mining Software

We are always looking for ways to make your text mining experience easier, better, more precise and... more fun. It is what drives us. We think you will find the recently released WordStat 8 checks all these boxes. Easier. Import documents and create projects directly in WordStat. Our new Explorer mode is designed for those of you with limited experience with text analytics software. More Precise. New enriched Topic Modeling moves beyond the traditional approach by providing you with suggestions, exceptions and spelling corrections. ...
