Case Study Data Analysis Example

The best way to learn analytics is through experience and by solving case studies. Here, I present a complete business model and take you step by step through how analytics is set up in a new business, how it is used in daily processes, and some of the advanced analytics techniques a business can use to build meaningful segmentation and prediction to optimize its marketing and sales campaigns.

Background :

You have recently started a Video CD rental shop. After 2 months, you realize that there is tough competition in the market and that you need a more customer-centric strategy to stand out. Hence, you want to collect the most granular details of your customers' behavior and build your strategy accordingly.

Business case layout 

This business case has been broken down into 3 articles. Following is an outline of the articles; each article depends strongly on the findings of the previous ones:

1. How do you collect data so as to capture all the important information?

2. Deep dive into customer behavior, using basic data analysis with business knowledge to optimize daily operations (Part II: https://www.analyticsvidhya.com/blog/2014/03/learn-analytics-business-case-study-part-ii/)

3. How do you use data with advanced analytics to make your marketing/sales strategies more targeted?

Case study part I :

Did you ever wonder why you deal with so many datamarts in your company? Let’s try to understand, as the owner of the business, which data sources you need.

1. Transactions Table :

You rent out Video CDs, so the most important data for you will be transactional data. Transactional data is by far the richest data across industries. Each row in transactional data corresponds to one transaction. These transactions are mostly monetary. To identify each transaction, you need a distinct transaction code associated with it. What other fields can you think of to capture along with each transaction? Following is a small list of such variables :

Following is a  sample transactional data set :
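As a minimal sketch of such a table, the Python snippet below models one transaction per row. The field names (transaction_id, customer_id, product_id, rent_date, amount) and the sample rows are assumptions made up for illustration, not the shop's actual schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Transaction:
    # All field names below are hypothetical, chosen for illustration only.
    transaction_id: str   # distinct code identifying the transaction
    customer_id: str      # links to the customer table
    product_id: str       # links to the product table
    rent_date: date       # when the CD was rented
    amount: float         # money paid for this rental


# Made-up sample rows: one row per transaction.
transactions = [
    Transaction("T0001", "C001", "H0001", date(2014, 1, 5), 30.0),
    Transaction("T0002", "C002", "E0001", date(2014, 1, 6), 40.0),
    Transaction("T0003", "C001", "E0001", date(2014, 1, 9), 40.0),
]

# The table must be unique on transaction_id.
ids = [t.transaction_id for t in transactions]
assert len(ids) == len(set(ids)), "transaction_id must be unique"

total_revenue = sum(t.amount for t in transactions)
print(total_revenue)
```

Note that the rows carry only customer and product IDs; the descriptive details live in separate tables.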

2. Product Table :

If you have the transactions table, you basically have the linkage between the customers and the products. But why does the transactions table not have the description of the products? The simplest reason is that the total number of products is limited in any industry, and the same product is repeated throughout the transactions table. If we add the description to every single row, it adds enormously to the overall size of the transactions table, which is huge anyway. Hence, we keep the products table separate and merge it with the required transactions for specific analyses.

The product table is unique on product id, which maps to the transactions table. What other parameters can you think of that make sense for you to include? Following is a list of possible variables :

Note : Product ID generally can be decoded to know product details. For example, here H denotes “Hindi” and E denotes “English”. This coding makes the analysis simpler.
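The merge and the ID decoding described above can be sketched as follows. This is only an illustration: the ID layout (a language letter followed by digits), the column names, and the sample rows are all assumptions.

```python
# Hypothetical product table keyed on product_id; the ID format
# (language letter followed by digits) is an assumption.
products = {
    "H0001": {"title": "Movie A", "genre": "Drama"},
    "E0001": {"title": "Movie B", "genre": "Action"},
}

# Hypothetical transactions that carry only IDs, not descriptions.
transactions = [
    {"transaction_id": "T0001", "customer_id": "C001", "product_id": "H0001"},
    {"transaction_id": "T0002", "customer_id": "C002", "product_id": "E0001"},
    {"transaction_id": "T0003", "customer_id": "C001", "product_id": "E0001"},
]

LANGUAGE_CODES = {"H": "Hindi", "E": "English"}


def decode_language(product_id: str) -> str:
    """Decode the language from the first character of the product ID."""
    return LANGUAGE_CODES.get(product_id[:1], "Unknown")


def merge_with_products(rows, product_table):
    """Attach product details to each transaction only when needed."""
    merged = []
    for row in rows:
        details = product_table[row["product_id"]]
        merged.append({**row, **details,
                       "language": decode_language(row["product_id"])})
    return merged


enriched = merge_with_products(transactions, products)
print(enriched[0]["title"], enriched[0]["language"])
```

The description of each product is stored once in the product table and is attached to transactions only for the analysis that needs it.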

3. Customer Table :

The other side of the transactions table is the customer table. Using the above two tables, you have almost everything except the details of the customer. When building any kind of customer-centric strategy, it is essential to consider the customer profile. This table helps you find the customer profile. It is unique on customer id. What other parameters can you think of that make sense for you to include? Following is a list of possible variables :

Note : Similar to Product ID, Customer ID also generally can be decoded to know customer details.

4. Engagement Tracker :

All three tables together can be used to create any kind of analysis to build marketing and sales strategy. What they do not cover is the engagement you have had with your customers to date. Say I called Kunal a week ago to tell him about a movie X. It might not be the best idea to call Kunal again this week about the same movie. Hence, we need to keep track of all kinds of engagement we have with our customers on a daily basis. This is similar to the transactions table, but it includes all the non-monetary interactions we have had with our customers to date. These interactions can be inbound or outbound. This table is unique on engagement_id. What other parameters can you think of that make sense for you to include? Following is a list of possible variables :
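A rough sketch of how such a tracker could prevent repeat calls is shown below. The field names, the sample rows, and the 14-day cooldown are assumptions chosen only for illustration.

```python
from datetime import date, timedelta

# Hypothetical engagement log; field names are assumptions.
engagements = [
    {"engagement_id": "G0001", "customer_id": "C001", "channel": "call",
     "direction": "outbound", "topic": "Movie X", "date": date(2014, 3, 1)},
    {"engagement_id": "G0002", "customer_id": "C002", "channel": "sms",
     "direction": "outbound", "topic": "Movie Y", "date": date(2014, 2, 1)},
]


def recently_contacted(log, customer_id, topic, today, cooldown_days=14):
    """Return True if we reached out to this customer about this topic
    within the last `cooldown_days` days."""
    cutoff = today - timedelta(days=cooldown_days)
    return any(
        e["customer_id"] == customer_id
        and e["topic"] == topic
        and e["direction"] == "outbound"
        and cutoff <= e["date"] <= today
        for e in log
    )


today = date(2014, 3, 8)
print(recently_contacted(engagements, "C001", "Movie X", today))
print(recently_contacted(engagements, "C002", "Movie Y", today))
```

A check like this could run before every outbound campaign, so that a customer who was called last week about a movie is skipped this week.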

5. Derived tables :

Because data sizes become huge over time, it is always recommended to keep some monthly snapshots handy. One such table can be transactions data rolled up at the customer level. Following is a list of such possible variables :

Such derived tables come in very handy for quick analysis. Say you have acquired 10 new English movies and want to market them. You might want to market these movies to customers who watch English movies, who responded to recent engagements and who have made recent transactions. For such a targeting list, imagine the process you might need to follow. Following is a possible way to achieve the same :

Imagine how easy this analysis gets if you have the derived monthly snapshot handy.
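As a rough sketch of this idea, the snippet below rolls hypothetical transaction and engagement rows up to one record per customer and then builds the targeting list with a single filter. All field names, the "E" prefix for English movies, and the 30-day window are assumptions for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw tables; all field names and rows are made up.
transactions = [
    {"customer_id": "C001", "product_id": "E0001", "date": date(2014, 2, 20)},
    {"customer_id": "C001", "product_id": "H0001", "date": date(2014, 1, 5)},
    {"customer_id": "C002", "product_id": "H0002", "date": date(2014, 2, 25)},
    {"customer_id": "C003", "product_id": "E0002", "date": date(2013, 11, 2)},
]
engagements = [
    {"customer_id": "C001", "responded": True, "date": date(2014, 2, 15)},
    {"customer_id": "C003", "responded": True, "date": date(2014, 2, 10)},
]


def build_snapshot(txns, engs, as_of, window_days=30):
    """Roll raw rows up to one record per customer as of a given date."""
    snap = defaultdict(lambda: {"english_rentals": 0,
                                "recent_transactions": 0,
                                "recent_responses": 0})
    for t in txns:
        row = snap[t["customer_id"]]
        if t["product_id"].startswith("E"):  # assumes E prefix = English
            row["english_rentals"] += 1
        if 0 <= (as_of - t["date"]).days <= window_days:
            row["recent_transactions"] += 1
    for e in engs:
        if e["responded"] and 0 <= (as_of - e["date"]).days <= window_days:
            snap[e["customer_id"]]["recent_responses"] += 1
    return dict(snap)


snapshot = build_snapshot(transactions, engagements, as_of=date(2014, 2, 28))

# With the snapshot in hand, the targeting list is a single filter.
targets = sorted(
    cid for cid, row in snapshot.items()
    if row["english_rentals"] > 0
    and row["recent_transactions"] > 0
    and row["recent_responses"] > 0
)
print(targets)
```

The expensive pass over the raw tables happens once per snapshot; every later targeting question becomes a cheap filter on the rolled-up rows.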

Graph schemas:

The article so far focuses on the use of traditional relational databases. Graph-based databases (e.g. Neo4j) are a strong alternative to these traditional databases. They add a lot of flexibility to your database, because you can change the schema very easily.

This kind of flexibility is required when your data formats can change and you do not have much control over them. Also, you can add new structures and relationships very quickly. Before we go into these details, a typical graph schema in this case would look something like:

Blue nodes represent customers, red nodes represent movies and green nodes represent the various packages available. Every edge is a relationship between nodes. For example, if a customer rents a movie, we can draw an edge between the two.

Now, by calculating things like the number of edges from a node, you can find the most active customers and the most rented and least rented movies. You can also start looking at what kind of customers are renting what kind of movies.

P.S. Like all data model designs, there are various alternatives to this design and you should choose the best depending on your usage.
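Counting edges per node can be sketched in plain Python without a graph database. The edge list below is made up, and a real graph store such as Neo4j would answer the same question with its own query language instead.

```python
from collections import Counter

# Hypothetical "rented" edges: (customer node, movie node).
rented_edges = [
    ("C001", "Movie A"),
    ("C001", "Movie B"),
    ("C002", "Movie A"),
    ("C003", "Movie A"),
    ("C003", "Movie C"),
    ("C001", "Movie C"),
]

# The degree of a node is the number of edges touching it.
customer_degree = Counter(customer for customer, _ in rented_edges)
movie_degree = Counter(movie for _, movie in rented_edges)

most_active_customer = customer_degree.most_common(1)[0][0]
most_rented_movie = movie_degree.most_common(1)[0][0]
least_rented_movie = min(movie_degree, key=movie_degree.get)

print(most_active_customer, most_rented_movie, least_rented_movie)
```

One caveat of this sketch: a movie that has never been rented has no edges, so it does not appear in the counts at all and would need to be added from the list of movie nodes.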

End Notes :

We discussed relational and graph databases for representing a typical business problem. The data tables we discussed in this article are almost parallel to datamarts in any industry. We will look at some interesting strategies that can be derived using these data sources for the CD rental business case. Some of these strategies, which are very basic in nature and need more business sense than modelling, will be discussed in the next article. This will help you understand how effective strategies can be built if you mix business knowledge with simple data analysis. Knowledge of data is essential regardless of the industry you work in. To view the next part of this case study, see https://www.analyticsvidhya.com/blog/2014/03/learn-analytics-business-case-study-part-ii/



Exploratory data analysis for Soccer – by Roopam

For the last couple of weeks we have been working on a marketing analytics case study example (read Part 1 and Part 2). In the last part (Part 2) we defined a couple of advanced analytics objectives based on the business problem at an online retail company called DresSMart Inc. In this part, we will perform some exploratory data analysis as part of the same case study example. But before that, let’s explore the power of exploratory data analysis (EDA) to reveal hidden facts about the greatest game on the planet: soccer, or football.

Soccer – Exploratory Data Analysis

Soccer is undoubtedly the most popular game on the planet with over 200 nations having their official soccer teams. No other game has such a universal appeal with millions of hardcore followers.  Every detail of soccer is analyzed by the players, the coaches and the support staff. Despite this, a careful exploratory data analysis of the game could unravel match-winning secrets about the greatest game, as you will see in the next two example case studies.

Penalty Kicks

Let’s relive the first knockout (pre-quarterfinal) match of the Soccer World Cup 2014 between Brazil and Chile. The scores were level at 1-1 at the end of the allotted 90 minutes. Even the extra half an hour could not settle the match, with the scoreboard still reading 1-1. This sent the match to a penalty shoot-out to break the tie. After the Brazilian player Neymar scored in the penultimate penalty kick, Brazil were 3-2 ahead in the shoot-out. Chile still has a penalty kick left, from Gonzalo Jara, and the opportunity to extend the tie further, but if he misses, Chile’s campaign in the competition will be over. What should Gonzalo Jara do to extend the tie?

Gonzalo Jara’s Kick – Source: irishtimes.com

On average, at this level, around 75% of penalty kicks convert to goals. The odds, by this measure, are heavily in favor of Gonzalo Jara. Where should he kick the ball to improve his odds further? All the fans, coaches, and players will say to kick the ball into either corner, away from the goalkeeper who is standing in the center of the goal. They will also advise never to shoot the ball dead center, towards the goalkeeper. A group of researchers asked the same question and did an exploratory data analysis of penalty kicks at the elite level of soccer. Goalkeepers usually go by their instincts when the ball is kicked at them at an undecipherable pace. They jump either towards their left (57% of the time) or their right (41% of the time). This leaves them at the center just 2% of the time to stop a ball hit straight at them. Hence, a kick hit dead towards the center of the goal has a significantly higher chance of conversion than a kick to either corner at the same height.

Back to Gonzalo Jara: he hits the ball towards his right, in the direction of the diving goalkeeper, as shown in the picture above. He misses the shot; the ball hits the goal post and ricochets away from the goal. As a result, Chile got knocked out of the World Cup and Brazil advanced to the next stage. In Gonzalo Jara’s defense, the conversion rate for crucial penalty kicks like this one (to avoid elimination) drops to 44%. Yes, pressure is another beast to which even the best succumb.

Corner Kicks

In another case, a few years ago Manchester City’s soccer team was struggling with corner kicks and hence decided to do some exploratory data analysis to differentiate effective corner kicks from ineffective ones. The team of analysts analyzed hundreds of videos of corner kicks from the Premier League. They found that in-swinging kicks towards the goal were far more effective and dangerous than out-swinging kicks. They took their findings to Roberto Mancini, the coach of the Manchester City team at that time. Mancini, who has played and followed the game since his childhood, rejected the findings outright. He recalled all those memorable, picture-perfect goals scored by great headers from out-swingers. On the other hand, clumsy goals from in-swingers hardly created a lasting impression on the spectators’ minds. Mancini, it turned out, was wrong. All that looks great and memorable is not always optimal. This is a great case for how simple but sincere exploratory data analysis can challenge deeply ingrained beliefs developed over centuries (yes, soccer is a really old game).

Exploratory Data Analysis – Retail Case Study Example

Back to our case study example (read Part 1 and Part 2), in which you are the chief analytics officer & business strategy head at an online shopping store called DresSMart Inc. You are helping the CMO of the company to improve the results of the company’s campaigns. For the last few days, you have been playing around with the data as part of exploratory data analysis. The following is one of several interesting results and patterns you have noticed in the data. When you analyzed the distribution of customers by the number of product categories (men’s shirt, casual trousers, formal skirts etc.) purchased by each customer, you found the following pattern.

Exploratory data analysis – marketing analytics case study (retail)

The above distribution looks more or less as expected. However, there is an interesting peak for customers purchasing more than 50 product-categories. Who are these customers? Why are they buying so many product categories for their usage? You further analyzed this small set of customers and found that they are growing at a faster rate than the other set of customers. Since the inception of the company 7 years ago, the percentage of customers purchasing 50+ product categories in a year has exponentially gone up (currently at 2.1%). This set of customers also contributes to about 23% of all the sales for DresSMart Inc. The following graphs are part of your above analysis.

Exploratory data analysis
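The distribution of customers by number of distinct product categories, and the share of customers and sales coming from the high-category segment, could be computed along these lines. The order rows are made up, and the threshold of 3 stands in for the case study's 50 only because the toy data set is tiny.

```python
from collections import Counter, defaultdict

# Hypothetical order lines: (customer_id, product_category, sale_amount).
orders = [
    ("C1", "men's shirt", 40.0),
    ("C1", "casual trousers", 60.0),
    ("C2", "formal skirts", 50.0),
    ("C3", "men's shirt", 30.0),
    ("C3", "casual trousers", 30.0),
    ("C3", "formal skirts", 40.0),
    ("C3", "men's shirt", 50.0),
]

categories_by_customer = defaultdict(set)
sales_by_customer = defaultdict(float)
for customer_id, category, amount in orders:
    categories_by_customer[customer_id].add(category)
    sales_by_customer[customer_id] += amount

# Distribution: how many customers bought exactly k distinct categories.
distribution = Counter(len(c) for c in categories_by_customer.values())


def segment_share(threshold):
    """Share of customers and of sales from customers with at least
    `threshold` distinct categories."""
    segment = [c for c, cats in categories_by_customer.items()
               if len(cats) >= threshold]
    customer_share = len(segment) / len(categories_by_customer)
    sales_share = (sum(sales_by_customer[c] for c in segment)
                   / sum(sales_by_customer.values()))
    return customer_share, sales_share


# The case study uses a threshold of 50; 3 is used here only because
# the toy data set is tiny.
customer_share, sales_share = segment_share(threshold=3)
print(dict(distribution), customer_share, sales_share)
```

Comparing the two shares is what makes a small segment stand out: a small fraction of customers accounting for a much larger fraction of sales is worth a closer look.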

So, what is going on here? You further analyzed the patterns and size(s) of clothes these customers are buying and noticed that they are buying the same style in different sizes. Aha! Now you know who they are: small neighborhood retailers using DresSMart Inc as a wholesaler. The following is what you concluded from the above analysis:

  1. There is no point sending these retailers the same retail product catalog and campaign as to retail customers
  2. There is an opportunity to strengthen business ties with these mom-&-pop retailers and in turn, improve profitability of your company through a separate business program
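The size pattern that gave these retailers away could be checked with a sketch like the one below. The order rows, the field names, and the 0.5 cut-off are hypothetical; a real analysis would tune the threshold against known retailer accounts.

```python
from collections import defaultdict

# Hypothetical order lines: (customer_id, style_id, size).
order_lines = [
    ("C1", "S100", "M"),
    ("C1", "S200", "M"),
    ("C2", "S100", "S"),
    ("C2", "S100", "M"),
    ("C2", "S100", "L"),
    ("C2", "S300", "M"),
    ("C2", "S300", "XL"),
]


def multi_size_ratio(lines):
    """For each customer, the fraction of purchased styles that were
    bought in more than one size."""
    sizes = defaultdict(lambda: defaultdict(set))
    for customer_id, style_id, size in lines:
        sizes[customer_id][style_id].add(size)
    return {
        customer_id: sum(len(s) > 1 for s in styles.values()) / len(styles)
        for customer_id, styles in sizes.items()
    }


ratios = multi_size_ratio(order_lines)
# The 0.5 cut-off is an arbitrary illustrative threshold.
likely_resellers = sorted(c for c, r in ratios.items() if r >= 0.5)
print(ratios, likely_resellers)
```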

Additionally, your further analysis revealed that order fulfillment or delivery patterns (delivery quantity / charges etc.) for these retailers are similar to those of other customers. Your company is incurring additional delivery costs for these customers. You could plan the overall supply chain much better by keeping these small retailers in the equation. This exploratory data analysis has given you ideas for more low-hanging fruit to improve the company’s profitability.

Sign-off Note

Exploratory data analysis is a powerful tool. A diligent EDA is an absolute must to point your advanced business analytics in the right direction. EDA provides a great opportunity to test your simple business hypotheses and hunches before jumping into rigorous model building. Coming back to soccer, we are approaching the final stages of the World Cup. Enjoy the last few games and may the best team lift the prized trophy.

Posted in Marketing Analytics, Retail Case Study Example | Tags: Business Analytics, Marketing Analytics, Predictive Analytics, Retail Analytics, Roopam Upadhyay |
