Market Basket Analysis and Lift
Fries go with ketchup. Limes go with vodka. There are certain products consumer tend to place together in their baskets. If we can analyze the market baskets and identify pairs or sets of products that consumers tend to purchase together, we will be able to increase our profit.
We can also perform market basket analysis with python, but today I’ll practice with Excel, utilizing the powerful evolutionary solver. Our goal is to identify products purchased together so that we can lay out a store where products with high lifts are located near each other to optimized sales.
Lift is the most commonly used tool in market basket analysis. The lift for a combination of purchase products and/or day of week is defined by this equation: (The actual number of times combination occurs) divided by (Predicted number of times combination occurs if items in combination were independent).
Today, I will practice on Loblaw’s supermarket data of 2928 transactions. The screenshot on the right shows a subset of our data. We have our day of the week denoted by 1=Monday 2=Tuesday etc.
To calculate the lift between vegetables and lift, we will calculate the (Actual number of transactions where meat and vegetables were purchased)/ (Total # of transactions)(Fraction of times meat was purchased) *(Fraction of time vegetables were purchased).
Product combo with lifts much greater than 1 indicates that the products tend to be purchased together.
Step 1: Create Name Ranges
Now, back to our market basket analysis. Let’s set up a few name ranges. Use Name Box to assign the name data to the range B9:H2936, which in this case contains all of the transactions. Select INSERT NAME CREATE to name each column of data by its heading. Next we determine the fraction of all transactions involving each type of product and the fraction of transaction taking place on each day of the week. First compute the total number of transactions with =COUNT(B:B) then we use the formula =COUNTIF(INDIRECT(K9),1)/$L$7 TO L10:L14 to compute the fraction of transactions involving each product.
To determine the fraction of all transactions occuring on each day of the week, we use the formula =COUNTIF(day_week,K17)/COUNT(day_week)in L17:L23.
Step 2: Calculate Lifts for Two-Way Product combinations
Now that we have our ranges named, we can move on to calculating the lift for any product combination. To compute lift for two products combination, we’ll use a two-way data table.First we enter the name ranges for any two products. Here I’ll just use fruits and vegetables.
Then we use the formula =IF(N9<>O9,VLOOKUP(N9,K9:L14,2,FALSE)*L7*VLOOKUP(O9,K9:L14,2,FALSE),0) to compute the predicted number of transactions involving the two products assuming independence.
To compute the number of times the combination of veggies and fruits occur together, we use the formula =SUM((INDIRECT(N9)=1)*(INDIRECT(O9)=1)) This formula causes the arrays to be pairwise multiplied and then the entries in the resulting arrays will be added together.
Compute the total lift for all the categories with formula =IF(Q10=0,1,P10/Q10). Divide the actual occurrence of fruits and veggies together by the predicted number of occurrences. We take a step further and use a two-way data table to compute the lift for all two-product combinations simultaneously. My fav part obviously.
And wallah! We have our market basket analysis!