Ecommerce Use Case: How Predictive Analytics Can Increase Profits

Written by Taras on November 30th, 2016

Want to increase your sales? Here’s a detailed case study on how ecommerce companies can grow profits with machine learning algorithms.

Ecommerce is becoming a very crowded space. Competing businesses can sell their products all over the planet, and getting a good piece of the marketplace is harder and harder to accomplish.

Most ecommerce entrepreneurs have mastered content marketing. They understand the concepts of building relationships with customers and of keeping each content marketing platform engaging and up-to-date. They build their email lists and retain users with creative campaigns. They are even moving into geo-location and personalization with their content outreach. And still, for all of their efforts, they are not able to increase sales.

The Answer Lies in Ecommerce Analytics and Data Science

When e-commerce companies are asked how they are using analytics and data, there seems to be a disconnect. More than half of Fortune 500 companies are actually using big data to analyze their websites for traffic, user experience and behavior, in order to gather the information they need to influence user behavior. But small and mid-sized e-commerce businesses have not taken full advantage of the data science and big data analytics that are available. There are two reasons for this:

  • They may use Google data analytics and generate plenty of reports that show areas of weakness, but they are not sure how to effectively correct those weaknesses.
  • They may believe that they have to hire a data scientist like the “big boys” do: someone who can not only collect and analyze data, but also collaborate with marketing staff to develop complex and expensive strategies. Such strategies address challenges (traffic patterns, bounce spots and rates, etc.) and also analyze specific customer behaviors and how those behaviors can be targeted to increase sales in the future. It’s pretty amazing stuff, actually, and much of it is accomplished through machine learning, which lets machines use algorithms and math to solve specific problems better than humans can.

The truth is this: An ecommerce business of any size can take advantage of data science for business and use it to ramp up its customer base (and thus profits). For small and mid-sized businesses, this does not mean hiring expensive big data experts. It means contracting with a service whose data scientists can collect the data, organize and analyze it, develop models, and then work with the rest of their team to make recommendations to the ecommerce business, including how to improve its conversion rate.

Why Data Science?

If you have purchased anything on Amazon recently, you will have seen some interesting things pop up as you search for products and ultimately make a purchase. One of the most prominent features is the statement: “Other customers who purchased this product also purchased these.” Additional products are then displayed for your viewing.

Data science for ecommerce has been used to group you with customers who may be of the same age range, the same sex, and with the same interests that you have. Data science is tracking your behavior and offering other potential purchases to you, based upon all of these factors. Chances are you will look at those other products, may purchase one or two, or at least be aware that they exist so that you may return and purchase them. Big data analysis allowed Amazon to customize its website in real time, just for you. And it can do much more.

Data science techniques, indeed, are powerful tools, and all ecommerce businesses should be using them. Let me show you how:

What Data Science Can Fix For Your Business

The problems that ecommerce businesses face are pretty typical – low conversion rates, high bounce rates, cart abandonment, lack of customer loyalty, etc. Their own analytics will show this in the reports they generate. But those reports lack the in-depth analysis that data science can provide, so that individual solutions can be developed and implemented.

Romexsoft has the team and the tools for this kind of in-depth analysis – learning that can drive what a business does to increase its revenue, user by user, customer by customer.

Case Study: Boosting Customer Loyalty and The Average Check With Big Data

Recently an online retailer contacted us with the following problem. He has a large line of casual and sports clothing and shoes for people of all ages, both genders, and a range of style preferences.

What he was discovering was this: he could get a customer “in the door,” and often get a purchase. But most customers were not “coming back for more” and/or purchasing other products that would suit them.

What he wanted from Romexsoft was a full analysis of what he could do to change his customers’ behaviors and move them to purchase more.

Our process involved several steps, and in the end, we were able to make recommendations which, when implemented, increased his sales almost immediately. Here was the process:

Analysis of the Site Structure Itself


When our team reviewed the website, detailed research allowed us to make a few suggestions right away. Using basic analytics, we were able to locate the least popular pages, the pages with the highest bounce rates, and the most and least popular products, based upon the correlation between views and actual purchases.

For example, there were several shoe products that the retailer was considering discarding. While there were many views, the proportion of purchases was quite low. What we discovered through our analytics was that the problem was not the product – the problem was the pricing.
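The views-versus-purchases comparison described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ProductStats` record, the `flag_underperformers` helper, the thresholds, and all numbers are hypothetical and do not come from the project.

```python
from dataclasses import dataclass


@dataclass
class ProductStats:
    product_id: int
    views: int
    purchases: int

    @property
    def conversion_rate(self) -> float:
        # Share of product-page views that ended in a purchase.
        return self.purchases / self.views if self.views else 0.0


def flag_underperformers(stats, min_views, max_rate):
    """Return ids of products that are viewed often but rarely bought."""
    return [
        s.product_id
        for s in stats
        if s.views >= min_views and s.conversion_rate <= max_rate
    ]


# Hypothetical numbers, for illustration only.
catalog = [
    ProductStats(product_id=42, views=5000, purchases=40),
    ProductStats(product_id=45, views=4800, purchases=310),
    ProductStats(product_id=48, views=120, purchases=1),
]

flagged = flag_underperformers(catalog, min_views=1000, max_rate=0.02)
print(flagged)
```

A product flagged this way is a candidate for a closer look at pricing or presentation rather than an automatic candidate for removal.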

Our developers were able to remodel the structure of the site, revise groupings of products, and recommend the correct price points for “low sale” products.

But the real work to solve the problem was just beginning. The job ahead of us was ultimately to analyze the behavior of each individual customer and determine how to change that behavior to translate into more purchases. This information would be valuable for existing customers but also for new customers who visited.

Generating The Test Data

To prepare for deep analysis, we first had to organize products by type (e.g., shirt, shoes), sex, age group, purpose (casual or sport), and brand/pricing, together with a full history of the number of views of each product page and the information that was provided on that page. We generated more than 150,000 records of data to test.
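As a rough illustration of what such records might look like, here is a small Python sketch. The article does not say how the records were produced, so synthetic generation is an assumption, as are the `Product` fields, the helper names, and the attribute values.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    product_id: int
    brand: str
    category: str   # e.g. "shoes", "jacket"
    age_group: str  # "children" or "adult"
    gender: str     # "male" or "female"
    purpose: str    # "casual" or "sport"


def make_catalog(n_products, rng):
    """Build a synthetic catalog with randomly assigned attributes."""
    return [
        Product(
            product_id=i,
            brand=rng.choice(["Brand A", "Brand B", "Brand C"]),
            category=rng.choice(["shoes", "jacket", "shorts", "cap"]),
            age_group=rng.choice(["children", "adult"]),
            gender=rng.choice(["male", "female"]),
            purpose=rng.choice(["casual", "sport"]),
        )
        for i in range(1, n_products + 1)
    ]


def make_view_records(catalog, n_users, n_records, rng):
    """Return (user_id, product_id) pairs, one per simulated page view."""
    return [
        (rng.randint(1, n_users), rng.choice(catalog).product_id)
        for _ in range(n_records)
    ]


rng = random.Random(0)  # fixed seed so the sample is reproducible
catalog = make_catalog(n_products=50, rng=rng)
records = make_view_records(catalog, n_users=200, n_records=1000, rng=rng)
```

Counting how often the same (user, product) pair appears in `records` gives a simple per-user view count for each product.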

Statistical Analysis and Machine Learning

Using data science with Java and Apache Spark, we applied the item-to-item collaborative filtering approach popularized by Amazon. What this means is as follows:

  • Each product was described by its type, sex, age, brand and purpose.
  • We filtered by three variants – the item code, the product code, and the “rate” which we defined as click-throughs to that product.
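A minimal sketch of an item-to-item similarity computation, written in plain Python rather than the Java and Apache Spark stack used on the project. The article does not state which similarity measure was used, so cosine similarity is an assumption here, and the click-through counts are made up for illustration.

```python
from collections import defaultdict
from math import sqrt


def item_vectors(ratings):
    """Turn (user_id, product_id, rate) triples into {product: {user: rate}}."""
    vectors = defaultdict(dict)
    for user_id, product_id, rate in ratings:
        vectors[product_id][user_id] = rate
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two sparse {user: rate} vectors."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


# Hypothetical click-through counts, for illustration only.
ratings = [
    (1, 42, 3), (1, 45, 2),
    (2, 42, 1), (2, 45, 1), (2, 717, 4),
    (3, 717, 5),
]

vectors = item_vectors(ratings)
sim_42_45 = cosine_similarity(vectors[42], vectors[45])
sim_42_717 = cosine_similarity(vectors[42], vectors[717])
```

In this toy data, products 42 and 45 are clicked by the same users at similar rates, so they score as more similar to each other than 42 and 717 do.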

We were then able to generate data on actual customer taste. Here is a sampling of that data:

User id | Brand | Product id | Category of product | Product type by age | Product type by gender | Sport or casual wear
1 | Brand A | 42 | shoes | children | male | casual
1 | Brand A | 45 | shoes | children | male | casual
1 | Brand A | 48 | shoes | children | male | casual
1 | Brand B | 717 | jacket | children | male | sport
19761 | Brand H | 123 | shoes | children | female | casual
19761 | Brand B | 1186 | shorts | children | male | sport
19761 | Brand C | 1190 | shorts | children | male | sport
38335 | Brand H | 95 | shoes | adult | female | casual
38335 | Brand C | 1596 | cap | children | male | sport
38335 | Brand C | 1597 | cap | children | male | sport
39999 | Brand J | 41 | shoes | adult | male | casual
39999 | Brand E | 59 | shoes | children | male | casual
39999 | Brand E | 60 | shoes | children | male | casual
39999 | Brand E | 61 | shoes | children | male | casual
39999 | Brand E | 62 | shoes | children | male | casual
39999 | Brand E | 64 | shoes | children | male | casual

Establishing Predictions for Customer Rates Based Upon Actual Rates

Next, we wanted to generate data that would tell us the predicted rate (click throughs) of customers who looked at more than one product, if they were shown similar products. This is a sampling of that data:

This first chart shows a customer looking at a specific product and the actual product rate (number of times the customer actually clicked-through).

User id | Product id | Product rate (actual)

This next chart shows the same customer and the predicted product rate if shown similar items:

User id | Product id | Product rate (actual) | Product rate (predicted)

You can clearly see how close the actual and predicted rates are, and they are based upon proven predictive models. What this machine learning tells the business owner is that he should be showing individual customers similar products, which a customer might not even have heard about but which will suit him best. And this is the value of using data science in retail – informing the retailer of the potential for customers to click through to other products when presented with them. And because the data puts customers into groups, those groups of customers, with similar behavior and interests, can be shown the same similar products.
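One common way to produce such a predicted rate is a similarity-weighted average of the rates the customer already gave to related products, which can then be compared against the actual rate. The Python sketch below assumes that approach; the article does not describe the exact prediction formula, and the similarity scores and rates are hypothetical.

```python
def predict_rate(user_rates, similarities):
    """Predict a rate for one target product.

    user_rates:   {product_id: rate} for products the user already clicked.
    similarities: {product_id: similarity to the target product}.
    """
    weighted = sum(similarities.get(p, 0.0) * r for p, r in user_rates.items())
    total = sum(abs(similarities.get(p, 0.0)) for p in user_rates)
    return weighted / total if total else 0.0


def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted rates."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical values, for illustration only.
user_rates = {42: 4, 45: 2}
similarity_to_48 = {42: 0.9, 45: 0.3}

predicted = predict_rate(user_rates, similarity_to_48)
error = mean_absolute_error([3.0], [predicted])
```

A small mean absolute error across many held-out (user, product) pairs is what "close" means in practice for actual versus predicted rates.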

Predictions of Product Presentations/Ratings Based Upon Customer Groups

Now that the retailer knows he will be presenting similar products to his customers, the next data science challenge is to determine the products to present. Again, machine learning takes over based upon customer groups and past product rates of those groups, and then generates a listing of the similar products to which customers should be exposed.

The following chart is an example of what this data report will show, based upon six additional products that should be shown to each customer, along with predicted ratings.

User id | Product id | Rating | Product id | Rating | Product id | Rating | Product id | Rating | Product id | Rating | Product id | Rating
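Picking the six products to show each customer amounts to a top-N selection over predicted ratings, skipping anything the customer has already seen. A minimal Python sketch, with a hypothetical `top_n` helper and made-up ratings:

```python
import heapq


def top_n(predicted_rates, already_seen, n=6):
    """Return the n highest-rated (product_id, rating) pairs not yet seen."""
    candidates = (
        (pid, rate) for pid, rate in predicted_rates.items()
        if pid not in already_seen
    )
    return heapq.nlargest(n, candidates, key=lambda pair: pair[1])


# Hypothetical predicted ratings for one user, for illustration only.
predicted_rates = {42: 4.8, 45: 4.1, 48: 3.9, 59: 3.7, 60: 3.2,
                   61: 2.9, 62: 2.5, 64: 1.0}
already_seen = {42}

recommendations = top_n(predicted_rates, already_seen)
```

Each entry in `recommendations` corresponds to one Product id / Rating pair in a row of the report.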

Based on the existing data, we can also identify potential buyers for a certain group of products or a certain brand, even if they have not expressed any prior interest in that brand. Our model compares them against people who have similar shopping preferences and have previously purchased the brand in question. As a result, we can narrow down the segment of potential buyers likely to be interested in a certain group of products:

Product id | User id | Rating | User id | Rating | User id | Rating | User id | Rating | User id | Rating | User id | Rating | User id

The concept is simple: if a customer has completed specific purchases in the past, and those purchases are similar to those of a group of customers, then that customer’s future purchases can be predicted. Using real data on these purchases, and applying machine learning, the business owner can customize, personalize, and direct each customer’s experience and journey on his site.
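The idea of finding likely buyers for a brand among customers who have never bought it can be sketched as follows. This is a simplified Python illustration, not the project's model: it assumes Jaccard overlap of purchase histories as the similarity measure, and the `potential_buyers` helper, the threshold, and the purchase data are all hypothetical.

```python
def jaccard(a, b):
    """Overlap between two sets of purchased product ids (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def potential_buyers(purchases, brand_products, min_similarity=0.3):
    """Rank users who have not bought the brand by their closest brand buyer.

    purchases:      {user_id: set of purchased product ids}
    brand_products: set of product ids belonging to the brand in question
    """
    buyers = {u for u, items in purchases.items() if items & brand_products}
    scored = []
    for user, items in purchases.items():
        if user in buyers:
            continue
        # Compare on non-brand purchases only, so the brand itself
        # does not lower the similarity score.
        best = max(
            (jaccard(items, purchases[b] - brand_products) for b in buyers),
            default=0.0,
        )
        if best >= min_similarity:
            scored.append((user, best))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical purchase histories, for illustration only.
purchases = {
    1: {42, 45, 717},      # bought the brand (product 717)
    19761: {42, 45},       # same tastes, has not bought the brand
    38335: {95, 1596},     # unrelated tastes
}

ranked = potential_buyers(purchases, brand_products={717})
```

Here user 19761 shares the brand buyer's other purchases and is ranked as a potential buyer, while user 38335 is filtered out.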

The Benefits of This Model

For our client, the benefits were obvious. He will increase the potential for purchases and, as a result, increase ecommerce sales by displaying a larger assortment of similar products to each customer – products the customer didn’t even realize were on the site and products that will best suit the customer’s needs.

Another value of this model is that sales can be forecast more accurately. The business owner can then better manage his inventory – something that will certainly help to grow business profits. As outlined above, you can make more accurate predictions about which goods are likely to be purchased. A prediction can be as specific as stating that your company will sell 100-120 pairs of Nike Air Max shoes in the next week with 90% probability.
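To show what a statement like "100-120 units with 90% probability" means numerically, here is a deliberately naive Python sketch that derives a 90% range from past weekly sales using a normal approximation. The article does not describe the forecasting method actually used, so this is only an assumption for illustration, and the sales history is made up.

```python
from statistics import NormalDist, mean, stdev


def sales_interval(weekly_sales, coverage=0.90):
    """Return (low, high) bounds expected to contain next week's sales.

    Assumes weekly sales are roughly normal and stationary, which is a
    strong simplification (no trend, seasonality, or promotions).
    """
    dist = NormalDist(mean(weekly_sales), stdev(weekly_sales))
    tail = (1 - coverage) / 2
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)


# Hypothetical weekly unit sales for one product, for illustration only.
history = [104, 112, 109, 115, 107, 111, 113, 106, 110, 113]

low, high = sales_interval(history)
print(f"Expected weekly sales: {low:.0f} to {high:.0f} units")
```

A production forecast would also account for seasonality and the customer-level purchase predictions described above; the point here is only how a range and a probability fit together.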

What is more, our model makes it possible to determine which factors do (or do not) affect sales volumes. For instance, in most cases the frequency of visits to your website has no direct impact on sales: users may spend a lot of time browsing and comparing goods without committing to a purchase. Factors like age, seasonality and purchase history, on the other hand, have a significant impact on the probability of a purchase.

So What are Your Problems?

You may have the insight to know that you are not growing as you should. Knowing why is another matter. And that is where business analytics comes in. It is a complex matter, but data science case studies continue to show that big data and machine learning can provide the answers.

Romexsoft is ready to build a model for you, based upon your unique circumstances. Let’s discuss your problem today.



Hey! I’m Taras, Data Scientist at Romexsoft. Want to know more about big data, machine learning and other cool stuff? Then follow my posts on the Romexsoft blog.