Amazon Review Analysis

Analyzing customer reviews allows brands to identify potential areas of improvement, gauge overall customer satisfaction, communicate brand messaging, identify products with the most reviews, and much more. However, as the volume and variety of review data continue to grow, effective analysis becomes more challenging.

When done right, a customer feedback analytics dashboard for your brand will look something like this:

Review Analysis Dashboard Example

In this article, we outline current state-of-the-art methods for analyzing reviews and look at AI text mining solutions for customer feedback analytics.

Review Analysis Process

Broadly, we can break down the review analysis process into three steps: Data Gathering, Analysis, and Reporting.

Step 1: Data Gathering

This involves extracting data from e-commerce review sites like Amazon, BestBuy, Walmart, Bed Bath & Beyond, etc. This can be tricky because:

Each site displays reviews differently, so a custom parser is required for each site.
Some sites (like Amazon) are very difficult to crawl.
Custom parsers need to be kept up to date with changes on the marketplaces.

Step 2: Review Analysis

In order to detect patterns and actionable insights from a large volume of review data, we need to derive structure out of the unstructured text.

Topic Detection, or Review Aspect Classification, is the process of understanding which topic is being discussed. This can be very tricky to get right. Consider the word “light”: its meaning changes completely depending on context. For example, “the headphones are very light”, “the burner is easy to light”, “screen is not visible under the light”.
Sentiment Detection needs to take the context of the category into account. For example, the word “loud” can be positive for a speaker or a headphone, but negative for an appliance like a washing machine.
Likewise, we also need to take the context of the review into account: the word “cheap” is positive in “i got it for cheap” and negative in “it is cheaply made”.
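
To make the ambiguity concrete, here is a toy Python sketch that guesses which sense of “light” a sentence uses from the word immediately before it. The cue word lists, the aspect labels and the `guess_light_aspect` function are illustrative assumptions, not how a production model works. Real systems learn this from context rather than from hand-written rules.

```python
# Toy rule-based disambiguation of the word "light".
# The cue words and aspect labels below are illustrative assumptions.
INTENSIFIERS = {"very", "too", "so", "really", "extremely"}
DETERMINERS = {"the", "a", "this", "that", "led"}


def guess_light_aspect(sentence: str) -> str:
    """Guess the aspect that "light" refers to from the preceding word."""
    words = sentence.lower().replace(".", "").replace(",", "").split()
    if "light" not in words:
        return "not mentioned"
    idx = words.index("light")
    previous = words[idx - 1] if idx > 0 else ""
    if previous == "to":
        return "ignition"   # verb: "easy to light"
    if previous in INTENSIFIERS:
        return "weight"     # adjective: "very light"
    if previous in DETERMINERS:
        return "lighting"   # noun: "under the light"
    return "unknown"


for text in [
    "the headphones are very light",
    "the burner is easy to light",
    "screen is not visible under the light",
]:
    print(text, "->", guess_light_aspect(text))
```

Rules like these break down quickly on real reviews, which is why learned models are generally preferred for this step.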

Step 3: Reporting

Though it might seem straightforward, doing justice to review data in a report can be quite challenging:

The report needs to be simple and easy to understand, but comprehensive.
Depth and Breadth - It needs to be broad enough to cover the whole category, but should also be able to drill down into the root cause of any event.
Many top companies have already adopted best practices - gaining insight into their processes can jump-start the report generation process.

Let's dig deeper into each of these stages.

Step 1: Data Gathering

Marketplace Sources

Extracting this review data is a time-consuming process and requires constant maintenance.

Customers speak freely about their product experiences in the places where they buy the product. Amazon is one of the largest online retailers today, so Amazon reviews play a crucial role in gauging customer sentiment. In addition to Amazon review data, other marketplaces like BestBuy, Target, HomeDepot, Bed Bath & Beyond, etc. may also contain large volumes of reviews depending on the category.

In this step, we look at how to gather data from the various marketplace sources. The following tools are commonly used for web scraping:

Selenium - Browser automation based scraping
Python
- Scrapy - Crawling and scraping framework
- LXML - Library for HTML and XML parsing
- Beautiful Soup - HTML/XML parsing library that can use LXML as its underlying parser
- Selenium Python library - Works in conjunction with the Selenium standalone tool
JavaScript
- Jsdom - DOM implementation for JS, usable for HTML extraction
- Cheerio - Lighter weight library compared to Jsdom
- Puppeteer - Headless browser tool much like Selenium
Ruby
- Nokogiri - HTML/DOM parsing

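The right tool set depends on the technology the website uses: for example, static HTML pages can be handled with a parsing library, while pages rendered by JavaScript typically need a browser automation tool such as Selenium or Puppeteer.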
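
As a rough illustration of what a site-specific parser does, the sketch below pulls review text out of a static HTML snippet using `html.parser` from Python's standard library. It is used here only as a dependency-free stand-in for the parsing libraries listed above. The markup, the `review-text` class name and the `ReviewExtractor` class are hypothetical. Every real marketplace uses its own structure.

```python
from html.parser import HTMLParser

# Hypothetical markup; real marketplaces each use their own structure.
SAMPLE_HTML = """
<div class="review"><span class="review-text">Great sound, very light.</span></div>
<div class="review"><span class="review-text">Stopped working after a week.</span></div>
"""


class ReviewExtractor(HTMLParser):
    """Collect the text of every <span class="review-text"> element.

    Assumes review spans are not nested inside other spans.
    """

    def __init__(self):
        super().__init__()
        self.reviews = []
        self._in_review = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples.
        if tag == "span" and ("class", "review-text") in attrs:
            self._in_review = True
            self.reviews.append("")

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_review = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so append to the open review.
        if self._in_review:
            self.reviews[-1] += data


def extract_reviews(html: str) -> list[str]:
    parser = ReviewExtractor()
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.reviews]


print(extract_reviews(SAMPLE_HTML))
```

A parser like this is tied to one page layout, which is why it has to be revisited whenever the marketplace changes its markup.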

Step 2: Review and Rating Analysis

In the context of review analysis, the biggest problems to solve are Aspect Classification (identifying the topic being spoken about) and Sentiment Classification (inferring whether the user is complaining or praising).

To understand the text, Natural Language Processing (NLP) is used. NLP is a field that deals with extracting information from unstructured text. The field has experienced a renaissance in the past few years with the application of Neural Networks and Transfer Learning.

Neural Networks is a field of Machine Learning that draws inspiration from the inner workings of the brain to solve problems. Transfer Learning is a sub-field that specializes in leveraging learning from one domain to solve problems in another.

Among Neural Networks, Convolutional Neural Networks, LSTMs and, more recently, Transformers are state-of-the-art techniques that can be applied to problems like sentiment analysis and topic detection. Most neural networks these days are built with either PyTorch or TensorFlow.
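
To make the idea of learning a classifier from labeled examples concrete, the sketch below trains a perceptron, the simplest single-neuron network, on a handful of made-up review snippets. It is a deliberately tiny stand-in: it is not a CNN, LSTM or Transformer, it uses neither PyTorch nor TensorFlow, and the training data and function names are assumptions for illustration only.

```python
# Made-up labeled snippets: +1 = positive, -1 = negative.
TRAINING_DATA = [
    ("great sound love it", 1),
    ("poor sound broke quickly", -1),
    ("battery life is great", 1),
    ("battery died terrible", -1),
    ("love the comfort", 1),
    ("terrible fit poor quality", -1),
]


def predict(weights, bias, text):
    """Return +1 or -1 from a weighted sum over the words in the text."""
    score = bias + sum(weights.get(word, 0) for word in text.lower().split())
    return 1 if score > 0 else -1


def train_perceptron(data, max_epochs=20):
    """Classic perceptron rule: adjust weights only on misclassified examples."""
    weights, bias = {}, 0
    for _ in range(max_epochs):
        mistakes = 0
        for text, label in data:
            if predict(weights, bias, text) != label:
                mistakes += 1
                for word in text.lower().split():
                    weights[word] = weights.get(word, 0) + label
                bias += label
        if mistakes == 0:  # every training example is classified correctly
            break
    return weights, bias


weights, bias = train_perceptron(TRAINING_DATA)
print(predict(weights, bias, "love the great sound"))  # 1
print(predict(weights, bias, "broke quickly"))         # -1
```

Modern models replace the word-count features and single neuron with learned representations and many layers, but the loop of predicting, comparing against a label and adjusting weights is the same basic idea.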

The most popular modern open source libraries used are:

The typical machine learning process usually looks like the following:

Neural Networks Block Diagram

Using an off-the-shelf SaaS solution usually offloads these steps and can save significant development costs.

Step 3: Report Generation

The first step here is to figure out the right views and visualizations to get the most out of the data. This is often the most challenging part, and it is frequently overlooked.

The data is then transferred to a visualization platform like Tableau, PowerBI, Qlik, ChartIO or Google Data Studio.

Instead of building a platform like this in-house with data scientists, engineers and data analysts, brands may be better served by a solution like FreeText AI.

FreeText's AI solution

FreeText AI is a turnkey solution for brands to gather all their feedback in one central location and make sense of it. A lot of effort has been invested in creating the right toolset, so that you don’t have to.

With FreeText AI, brands do not have to worry about data gathering, analysis and report generation. Instead, all the data is exposed through a simple, easy-to-use dashboard.

Add products by URL

To add a source in the FreeText AI dashboard, you just have to provide the URLs of the products.

Add Products Screenshot

Alternatively, an entire category can be added automatically instead of adding individual product SKUs.

FreeText AI is designed with scale in mind, so entire categories of products with hundreds of thousands of reviews can be analyzed.

Analysis comprises review sentiment analysis and topic detection using automated machine learning models.

Once the products are added and processed (this takes a few minutes, depending on the number of products), the data is ready to be explored!

Let’s take a look at the overview dashboard for an individual product.

Product view in FreeText AI Dashboard

Product Report Example

The goal of the product overview dashboard is to make all the high-level and root-cause analysis available in one go. A quick glance at the dashboard answers the following questions:

How many reviews came in for the given time period?
What was the rating breakdown (both star rating and review rating)?
Note that in many cases the star rating and the review rating differ. Review ratings tend to be more detailed and thought through, and hence are often lower.
What is the overall sentiment trend?
What are the topics that people are talking about?
What is the breakdown of the top positive and top negative topics that have been discussed?

We can also shift our focus and create this report for only negative reviews or only positive reviews, or focus on a particular time period.
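
Counting reviews in a period, breaking them down by rating, and narrowing the report to negative reviews all come down to filtering and counting. A minimal sketch, assuming each review is a small record with a date and a star rating (the `Review` fields and the sample data are hypothetical and do not describe FreeText AI's data model):

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class Review:
    posted: date
    stars: int  # 1 to 5


# Made-up sample data.
REVIEWS = [
    Review(date(2023, 1, 5), 5),
    Review(date(2023, 1, 20), 2),
    Review(date(2023, 2, 3), 5),
    Review(date(2023, 3, 15), 1),
]


def in_period(reviews, start, end):
    """Keep reviews posted between start and end, inclusive."""
    return [r for r in reviews if start <= r.posted <= end]


def star_breakdown(reviews):
    """Count how many reviews gave each star rating."""
    return Counter(r.stars for r in reviews)


selected = in_period(REVIEWS, date(2023, 1, 1), date(2023, 2, 28))
print(len(selected))             # 3
print(star_breakdown(selected))  # Counter({5: 2, 2: 1})

# Narrow the same report to negative reviews (assumed here to mean 2 stars or fewer).
negative_only = [r for r in selected if r.stars <= 2]
```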

Filter Records

Filters Screenshot

Zooming out, there could be trends on a product group - say a brand or a sub-category level - that could reveal interesting consumer patterns as well. Trends that might not be obvious on a product level could reveal themselves while looking at the category as a whole.

Product Group Report

Category Screenshot

Another way to look at this is to zoom in and understand what exactly users are saying. This is where the Explore view comes in - this view focuses on exploring just what the users are saying in the text.

Topics Explore Table

Topics Table Screenshot

As you can see, all the topics and sub-topics mentioned are categorized. The sentiment and volume bars show at a glance what the trend has been and what the positive/negative split of the mentions has been.

Clicking on a topic opens all the mentions inline, with highlights indicating why the AI classified a particular text into the topic. This also lets us read the text verbatim, exactly as the user wrote it.

Explore Mentions and Highlights

Review Explore Animation

In addition, searching for specific keywords or terms is supported, as is comparison between product groups, with the same exploration of volume trends and sentiment split reports.

Challenges

Let’s now take a look at some of the challenges of doing review analysis.

Data gathering challenges

The first step in review analysis is to gather the required data. This data may be spread across multiple places. For example, a brand could have reviews on Amazon, on its listings on BestBuy and Target, and on its own D2C website. Typical challenges include gathering reviews at scale and dealing with technical blockades. Another major challenge is keeping up with changes to the marketplace websites, which can be very frequent at times. Yet another is ensuring that the data is complete and converted to a standardized format - which is trickier than it sounds.
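
Converting to a standardized format usually means mapping each source's field names and rating scales onto one schema. The sketch below assumes two made-up raw record shapes. The field names (`stars`, `body`, `score_out_of_10`, `comment`), the source labels and the `NormalizedReview` type are hypothetical and do not reflect any marketplace's actual data.

```python
from dataclasses import dataclass


@dataclass
class NormalizedReview:
    source: str
    rating: float  # always on a 5-point scale
    text: str


def from_site_a(raw: dict) -> NormalizedReview:
    # Hypothetical site A already reports stars out of 5.
    return NormalizedReview("site_a", float(raw["stars"]), raw["body"].strip())


def from_site_b(raw: dict) -> NormalizedReview:
    # Hypothetical site B scores out of 10, so halve it for a 5-point scale.
    return NormalizedReview("site_b", raw["score_out_of_10"] / 2, raw["comment"].strip())


# One converter per source; adding a marketplace means adding one entry.
NORMALIZERS = {"site_a": from_site_a, "site_b": from_site_b}


def normalize(source: str, raw: dict) -> NormalizedReview:
    """Dispatch to the converter registered for the given source."""
    return NORMALIZERS[source](raw)


rows = [
    normalize("site_a", {"stars": 4, "body": " Works well. "}),
    normalize("site_b", {"score_out_of_10": 6, "comment": "Average."}),
]
print(rows)
```

Keeping one converter per source isolates marketplace-specific quirks, so a change on one site only touches one function.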

Text Analytics challenges

The next challenge is around text analytics. Sentiment analysis might seem straightforward until you consider the edge cases. For example, is being “light” a good thing or a bad thing? “The product was too light” could be a good thing if it is, say, a laptop, and a bad thing if it is a paperweight. Likewise, the same word “light” has several meanings: “light” as a noun, as in “LED light”, or as a verb, as in “light a fire”. Natural languages are full of such ambiguities, and not handling them correctly can lead to badly misleading results.
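
One simple way to picture category-dependent sentiment is a lookup keyed on both the word and the product category. This is only a toy sketch with a hand-written, hypothetical lexicon. Learned models pick this up from data rather than from a table.

```python
# Hypothetical polarity table: (word, category) -> +1 positive, -1 negative.
CONTEXT_POLARITY = {
    ("light", "laptop"): 1,
    ("light", "paperweight"): -1,
    ("loud", "speaker"): 1,
    ("loud", "washing machine"): -1,
}


def word_polarity(word: str, category: str) -> int:
    """Return the polarity of a word in a category, or 0 if unknown."""
    return CONTEXT_POLARITY.get((word.lower(), category.lower()), 0)


def review_polarity(text: str, category: str) -> int:
    """Sum the polarities of all known words in the review text."""
    return sum(word_polarity(w.strip(".,!?"), category) for w in text.split())


print(review_polarity("The product was too light.", "laptop"))       # 1
print(review_polarity("The product was too light.", "paperweight"))  # -1
```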

Reporting challenges

Even after gathering the data and analyzing the text, the problem is not fully solved. Then comes the next step: reporting. Most people understand that reporting involves creating visualizations that make sense. But it doesn’t end there. Perhaps the most important and most often overlooked aspect of reporting is Aggregation. Aggregation involves combining entities and units into groups that make sense for reporting. This can be tricky, especially considering that the data can be Big Data, with hundreds of thousands of pieces of feedback. Another tricky part of reporting is understanding the impact of each insight from the feedback.

A tool like FreeText AI was explicitly built with these considerations in mind, and draws on the core engineering expertise of its team to get around some of these challenges.
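
Aggregation can be as simple as grouping classified mentions by topic and computing the volume and sentiment split for each group. A minimal sketch, assuming each mention has already been tagged with a topic and a sentiment (the sample mentions and the `aggregate_by_topic` helper are illustrative assumptions):

```python
from collections import defaultdict

# (topic, sentiment) pairs as they might come out of the analysis step.
MENTIONS = [
    ("battery", "positive"),
    ("battery", "negative"),
    ("battery", "negative"),
    ("sound", "positive"),
    ("sound", "positive"),
    ("fit", "negative"),
]


def aggregate_by_topic(mentions):
    """Return {topic: {"volume": n, "positive_share": fraction}}."""
    counts = defaultdict(lambda: {"positive": 0, "negative": 0})
    for topic, sentiment in mentions:
        counts[topic][sentiment] += 1
    report = {}
    for topic, c in counts.items():
        volume = c["positive"] + c["negative"]
        report[topic] = {
            "volume": volume,
            "positive_share": c["positive"] / volume,
        }
    return report


report = aggregate_by_topic(MENTIONS)
# Rank topics by how often they are mentioned.
top_topics = sorted(report, key=lambda t: report[t]["volume"], reverse=True)
print(top_topics)
```

The same grouping can be applied one level up, for example by brand or sub-category, by swapping the grouping key.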

Takeaway

Understanding reviews is key to unlocking customer value. Modern review analysis tools leverage technologies like Neural Networks to enable mining insights from marketplace reviews.

