INTRODUCTION: Part I of our blog series introduced Automatic Machine Learning Document Classification (AML-DC). Part II provides a practical and detailed walkthrough of the development and implementation of a supervised AML-DC model in a fast, reproducible, reliable and auditable way. RESEARCH PROBLEM: How does one classify a large corpus of survey results into positive and negative sentiment classes in a fast, reproducible, reliable and auditable way? METHOD: Our solution...

How Text Analytics Adds Value to NPS

Companies are always looking for easy and accurate ways to measure customer satisfaction or customer sentiment. In the last decade, Net Promoter Score (NPS) has emerged as a favorite, and many companies have adopted it as a standard. The one-question survey developed by Fred Reichheld of Bain & Company, “How likely are you to recommend this product/brand/service to your friends or colleagues?”, is straightforward, as is the 0-10 scoring system that accompanies it. Companies can ask the question and over time see how their score moves up or...

Using Text Mining of Big Data for Prediction

From time to time in this blog we draw your attention to some of the applications of text analytics. In the last few years one of those areas has been predictive analysis, using text mining to explore large databases. Using text mining to learn from past behavior, discover patterns and identify trends allows researchers to make predictions in different fields. This blog is based on the paper Big Data for Prediction: Patent Analysis. The analysis of patents using text mining can help investors and inventors get a better sense of where a...

PART I: Automatic Machine Learning Document Classification – An Introduction

This blog focuses on Automatic Machine Learning Document Classification (AML-DC), which is part of the broader topic of Natural Language Processing (NLP). NLP itself can be described as “the application of computation techniques on language used in the natural form, written text or speech, to analyse and derive certain insights from it” (Arun, 2018). AML-DC aims to automatically assign ‘a data-point to a predefined class or group according to its predictive characteristics’ (Kabir et al., 2018). AML-DC is the essence of text mining as it transects,...

What were Russian Trolls saying on Twitter? How to Use a Text Mining, Content Analysis Approach to Find Out

As we have learned, during the 2016 US Presidential Election there were significant efforts by outside actors to influence the vote. There have been several investigations into those efforts, and several reports written about them. The following is a blog post by Ryan Atkinson explaining how he took a content analysis approach to examining a large volume of tweets and how he used WordStat software in his work. The project that Professor Joanne Miller and I worked on involved analyzing a dataset of over 200,000 tweets that were either sent or retweeted by Internet...

WordStat 8 New Deviation Table Instantly Compares Words/Phrases Used by Different Variables

We have introduced a brand new feature in WordStat 8 that we believe you will find quite useful. We are calling it the Deviation Table. It allows you to see which words, phrases or topics are used more or less by different values of a categorical variable. You can also think of it as another way to view the Crosstab feature. The Deviation Table is simple to use. All you do is perform a word frequency or phrase extraction in WordStat. Click on the Cross Tab button in the toolbar and then click on the...
