May 2, 2022   Finance /

Power BI: Cohort Analysis

In this article, you will learn how to run your Cohort Analysis in Power BI. 

The cohort analysis is a powerful customer analysis: it segments customers based on when they first purchased a product. Specifically, it answers the questions:

  • Are newer customers coming back more often than older customers?
  • Are newer customers spending more than older customers?

These questions are key to understand if the company is going in the right direction, with newer customers coming back more often and spending more money.

If you are looking for a simpler analysis of new existing customers, check out my former article here.

Let's now take the questions one by one.

Are newer customers coming back more often than older customers?

Below is the classical chart of the cohort analysis. On left is the month of customer acquisition. The numbers represent the percentage of customers coming back after x months.

For example, 12% of the customers that first purchased in January 2018 came back to buy a month later, in February 2018.

Cohort Analysis in Power BI – customers coming back in percentage

From this chart, it looks like customers acquired from May 2019 are coming back more often. This is great news! 

Nevertheless, the second semester of 2019 also has few 'empty spots' with zero customers coming back. Specifically, the new customers of April 2019 and June 2019 did not come back for 3 consecutive months (month 2 through month 4).

Looking at the totals at the bottom of the report, we see that on average 14% of customers buy again after 1 month, then only 8% in month 2. The remaining months are low, until a spike around month 10.

The spike in month 10 can be linked to a sales cycle: possibly the customers are replacing products after 10 months.

Let's move to the second question.

Are newer customers spending more than older customers?

From the table below, the average spend for new customers on month 0 (the first month) did not change across 2 years: it stays at about 65 dollars. There are however increases in average spend 2-3 months after acquisition, and 10 months after acquisition.

This chart highlights two factors:

  • we are not able to increase the average price spent by new customers in their first month.
  • we should look into the products purchased after 2-3 months and 10 months: identify if we can convince consumers to purchase these additional products earlier on.
Cohort Analysis in Power BI – new customers average spend

With the goal of the cohort analysis clear, it's time to get our hands dirty and start with Power BI!

Load the sales data for the cohort analysis in Power BI

To follow along with the tutorial, download the sample sales data from here.

You can load it by going to Get Data, then clicking on Text/CSV. Below is what the sales table looks like.

initial sales table

Once the table is loaded, select it.

select the sales table

With the table selected, add a new column.

add a new column

We add three columns this way. The first one is Month Year, and it will be used in the Cohort Analysis visualization.

The DAX formula is:

Month Year = FORMAT([date],"mmm yyyy")

The second and third columns are necessary for the cohort analysis calculations. Here they are:

Year Month Number = VALUE(FORMAT([date],"yyyymm"))

Year Month order = RANKX('Sales table',[Year Month Number],,ASC,Dense)

Once the three columns are added, the Sales table will look like this.

final sales table with three columns added

The last step for this table is to sort the Month Year column based on the Order column. Click on Month Year, then Sort by Column and finally on Year Month Order.

Sort Month Year

Create a table for the future months of the Cohort Analysis.

This is a simple table and will have rows for 'Month 0', 'Month 1', etc.

Click on New Table.

Create a new table

When the textbox appears, enter the following DAX formula:

Future months = GENERATESERIES(0, 12, 1)

The table will have 12 rows and look like this.

Future months initial table

Similarly as before, click on New Column and add the following column:

Months = "Month "&[Value]

We have now the final table for the future months.

final table for future months

Done with creating tables! We can move our focus to adding the measures that we want to track.

Create a Power BI measure for the customer count (Cohort Analysis).

This first measure will count the customers coming, based on their acquisition month.

For example, out of the 32 customers who purchased in May 2018 for their first time, 9 purchased again one month later.

Cohort Analysis – New customers count

Here is the DAX formula for it:

New customers count = 

var acquisitionMonth = SELECTEDVALUE('Sales table'[Year Month order])
var selectedFutureMonth = SELECTEDVALUE('Future months'[Value])

var currentCustomers = VALUES('Sales table'[customer])
var pastCustomers = CALCULATETABLE(VALUES('Sales table'[customer]), 
    ALL('Sales table'[Month Year],'Sales table'[Year Month Number])
    , 'Sales table'[Year Month order]<acquisitionMonth)
var newCustomers = EXCEPT(currentCustomers,pastCustomers)

var customersFuturePeriods = CALCULATETABLE(VALUES('Sales table'[customer]),
ALL('Sales table'[Month Year],
'Sales table'[Year Month Number]), 
'Sales table'[Year Month order] = acquisitionMonth+selectedFutureMonth )
var newCustomersBuyingInFuturePeriods = INTERSECT(newCustomers, customersFuturePeriods)

RETURN COUNTROWS(newCustomersBuyingInFuturePeriods)

There are three main parts to the formula. 

The first part defines the variables acquisitionMonth and selectedFutureMonth. Using the example of the 9 customers purchasing again after one month, acquisitionMonth would be May 2018 and futureMonth would be 1.

The second section has the variables currentCustomers, pastCustomers, and newCustomers. This section finds the customers buying for the first time in May 2018.

Finally, the variable newCustomersBuyingInFuturePeriods identifies the customers acquired in May 2018 that also bought in June 2018.

This formula is quite complicated so my recommendation is to play around with it, load easy data to Power BI and see the result.

Create a formula for the percentage of new customers coming back

We want to build the table below.

If you don't need the total at the bottom, then you can use a simple formula:

New customers % no total = 
[New customers count]
/
CALCULATE([New customers count],ALL('Future months'),'Future months'[Value]=0)

Using the same example as before, to calculate the percentage of customers acquired in May 2018 and coming back after 1 month, we divide 9 (customers coming back after 1 month) by 32 (customers acquired in May). The result is 28%.

This simpler formula, however, does not work for the total. To work on the total, we should use SUMX, which sums the values on different months. Below is the formula updated to work on the total. 

New customers % = 

var selectedFutureMonth = SELECTEDVALUE('Future months'[Value])
var maxRank = MAXX(ALL('Sales table'[Year Month order]),[Year Month order])
var months = VALUES('Sales table'[Year Month order])
var monthsFiltered = FILTER(months, [Year Month order]<=maxRank-selectedFutureMonth)

RETURN SUMX(monthsFiltered,
    [New customers count])
    / 
SUMX(monthsFiltered,
    CALCULATE([New customers count],ALL('Future months'),'Future months'[Value]=0))

Let's move to the last formula: the new customers average spend.

Create a Power BI formula for the average spend of customers (Cohort Analysis)

This last formula shows us how much returning customer spend. It allows us to create the table below.

The formula is similar to the first formula for the customer count, except for the last part.

New customers average spend = 

var acquisitionMonth = SELECTEDVALUE('Sales table'[Year Month order])
var selectedFutureMonth = SELECTEDVALUE('Future months'[Value])

var currentCustomers = VALUES('Sales table'[customer])
var pastCustomers = CALCULATETABLE(VALUES('Sales table'[customer]), 
    ALL('Sales table'[Month Year],'Sales table'[Year Month Number])
    , 'Sales table'[Year Month order]<acquisitionMonth)
var newCustomers = EXCEPT(currentCustomers,pastCustomers)

var customersFuturePeriods = CALCULATETABLE(VALUES('Sales table'[customer]),
ALL('Sales table'[Month Year],'Sales table'[Year Month Number]), 
'Sales table'[Year Month order] = acquisitionMonth+selectedFutureMonth )
var newCustomersBuyingInFuturePeriods = INTERSECT(newCustomers, customersFuturePeriods)

RETURN 
CALCULATE(SUM('Sales table'[sales]), 
ALL('Sales table'[Month Year],'Sales table'[Year Month Number]), 
'Sales table'[Year Month order] = acquisitionMonth+selectedFutureMonth,
'Sales table'[customer] IN newCustomersBuyingInFuturePeriods) 
/
COUNTROWS(newCustomersBuyingInFuturePeriods)

The last part (starting from RETURN) is different: there is a CALCULATE divided by a COUNTROWS.

The COUNTROWS is the same as before: it counts the number of customers returning after x months. The CALCULATE, on the other hand, finds how much these customers spend. 

With all the measures created, add a Matrix report and you have your Cohort Analysis.

Power BI Report Configuration for the Cohort Analysis

That's it! You can create a powerful Cohort Analysis in Power BI. If you have questions, do not hesitate to add comments below.

Enjoyed the article? Subscribe to the newsletter and download the Power BI of the article for free.

    Share this post

    AUTHORLuca
    I help SMEs who want to get a clear view on what drives their profit. After becoming a CPA and 10 years in Corporate Finance between the US, Europe, and Asia I discovered my real passion. I love helping Small and Medium Companies figure out how to improve their profit and be sustainable.

    Join the discussion

    39 Comments

    • Gugi says:

      THANK YOU SOOOO MUCH!!!!!!! THIS IS SOOO GREAT! I wonder how long you've spent learning DAX so you can come up with this amazing code?

      • Luca says:

        Wow, thank you Gugi for your comment!
        Comments like this are what motivates me to write articles.

        To answer your question, I started working with DAX about 3 years ago. I used to be 'stuck' a lot at first, but using it regularly really helped me to learn it.
        What about you: when did you start learning DAX?

    • Genilson says:

      Very nice job.
      Just a question, how the tables are relate?
      I didn't figure out how to work out the "Month 0"

    • Genilson says:

      Nice work.
      I just coundn't figure out how the tables are related.

      • Luca says:

        Hi Genilson,

        thank you for your comment!
        The tables are not related. If you wish, you can download the Power BI at the bottom of the article so you can check out in detail how I built it.

        Cheers,

        Luca

    • Tuan Anh says:

      Amazing, good job bro <3

    • Gonzo says:

      Hi Luca,

      You are an absolute legend! Your formula helped me quite a lot!
      I was wondering, how do you add the colour (pink in this case) inside each cell to graphically show how big the values are? I would love to include it in my graph!

      Best wishes,
      Gonzo

    • Sudarshan Seshadri says:

      Luca,
      Happy New Year! Your website is really amazing, we are learning a lot from all the content you have posted and we will be applying those in our dashboards.

      Thankyou so much again, very detailed, very valuable!

      • Luca says:

        It makes so happy to hear this!
        Glad I could help you in your learning.

        Best of luck, and feel free to send me an email if you wish to discuss any topics further

    • Jack says:

      One question – how can I convert this to weeks instead of months?

      • Luca says:

        Hi Jack,

        thank you for your question!
        You can definitely convert this to weeks.

        To set up the logic by month I used 3 columns:
        – Month Year. Example: Oct 2019
        – Year Month Number. Example: 201910
        – Year Month order. Example: 22 (October 2019 is the 22nd month)

        Regarding the Year Month order column, I use it to order the months.
        For example, the first month in my data is Jan 2018, so its order is 1. The second month is Feb 2018 so its order is 2. Jan 2019's order is 13 and so on.

        So you can recreate these 3 columns using weeks instead of months.
        And here are the new formulas:
        1) 'Month Year' becomes 'Week of Year'.
        Week of Year = "Week "&WEEKNUM([date])

        2) 'Year Month Number' becomes 'Year Week Number'
        Year Week Number = YEAR([date])*100+WEEKNUM([date])

        3) 'Year Month order' becomes 'Year Week order'
        Year Week order = RANKX('Sales table',[Year Week Number],,ASC,Dense)

        Does this help you?

        Luca

    • Frederik says:

      Hi Luke.

      Really appreciate your work here!

      Tried to apply dax formulas on our live data and completed the new customers count part just fine. When trying to add the new customers % column and formula the calculate part suddenly says 'New customer count' cannot be found or may not be used in this expression. You know why?

      It finds 'New customer count' just fine when applying it in the return sumx part.

    • William says:

      Hi Luca, I am trying several times to download your power BI files by writing my email but i am not receiving the download files. Can u please help me? Thanks.

      • Luca says:

        Hi William,

        thank you for your message. There should be a link that appears once you enter your email at the bottom of each article.
        I apologize if that was unclear or it did not work for you.

        To make sure you have the Power BI sample, I emailed it to you.
        I hope it helps,

        Luca

    • Mohammad says:

      Hello
      Thanks for your amazing job, i do the same steps, but when i want to create visuall " Cant display the data because power bi can not determin the realation between two or more fild"
      I use power bi desktop report server, would you tell me how can i fix this issue.
      I donwload your file and it ok, but in your file i want to crate the report i have the same erorr.
      Thanks

      • Luca says:

        Hi Mohammad,

        thank you for your message.
        You say that when you download the example, it works ok?
        If that is the case, the best would be to try to reproduce all steps in the example, step by step. That is what I would do to find the issue.

        I can also plan to do consulting time with you for your issue if you wish. In that case, feel free to email me at lc@finance-bi.com

        I hope the above helps, and wish you the best of luck.

        Luca

    • Hanan says:

      Hi Luca,
      First of all thanks for the amazing, easy and clear tutorial.
      I have a question –
      If I have customers who came back to buy in month 0, how can I adjust the dax measure to take this conversion as well? to show the % retention for month 0.

      • Luca says:

        Thank you Hanan for your nice words on the article!

        And for your question:
        Month 0 is the month in which a customer started buying, so it will always have 100% retention.

        For retention, I mean: number of customers active today / numbers of customers active on their first month.
        For Month 0, all the customers are active because it's the month when they started buying. Therefore the retention will always be 100%.

        Does the above make sense?
        Please let me know if I am understanding correctly your question.

        Or if maybe you have in mind a specific case, which is not covered by my understanding above.

        Regards,

        Luca

    • avi says:

      Hi Luca,

      First of all great article. Keep them coming!

      I tried to filter the cohort table by year. It seems like the numbers change when I do that. Can you please take a look? Thanks

      • Luca says:

        Thank you Avi for your comment.

        Regarding filtering by year, I understand why your numbers would change.
        For example, if you filter out 2018 data then Power BI would 'no longer see' 2018. Thus it would consider that all customers started buying in 2019, even customers that actually started buying in 2018.
        Therefore, as you say, the numbers change: the count of New customers becomes higher in 2019.

        Does the above make sense?

        For this type of issue, I usually create a separate table for the filtering.

        Here are the steps for the fix:
        1- Create a separate table for the filtering, for example 'Years for filtering'. This table will have all years, and it will not have any relationship with other tables.

        2- Integrate this table in the DAX calculations (for example, in the 'New customers count'):

        2a- Identify the years selected in the 'Years for filtering'.
        var filteredYears = VALUES('Years for filterings'[Year])
        2b- Identify the year of the current row of the table
        var currentYear = SELECTEDVALUE('Sales table'[Year])
        3b- Return the final result only if the condition is satisfied
        RETURN IF(currentYear IN filteredYears, COUNTROWS(newCustomersBuyingInFuturePeriods), BLANK())

        That's it. I hope this helps you!
        You can use this type of fix in lots of other situations too.

        Luca

    • Lina says:

      Thank you so much for this article, and for your great work, it really helps!!!

      How do you calculate the customer lifetime value from this?
      I was thinking of adding up the total revenue rows per cohort (Month0+Month1+Month2…), and dividing by the number of clients (number in Month 0) in that cohort. But I'm having trouble adding up the rows in the Dax measure. Can you help me with this?

      • Luca says:

        Thank you Lina for your reading the article!

        For that, I would create a new DAX measures CLTV and use the following:
        – first take the number of customers at month 0.
        The formula would look like something like this: CALCULATE([New customers count], 'Future months'[Value] = 0)
        – second, sum the average spend for all future month
        The formula can look like the following: SUMX(VALUES('Future months'[Value]), [New customers average spend])

        Does the above help you?
        Feel free to contact me by email lc@finance-bi.com if you wish to discuss this further.

        Luca

        • Luca says:

          One additional thing to consider: for some cohorts you will have more months.
          For example, for the Cohort of New customers which arrived in January 2020 you will have 15 months of data available (until end of March 2021)
          For the cohort of New customers which arrived in July 2020, you have only 9 months of data available (until end of March 2021)

          So you should probably decide on a specific number of months to take into account for the analysis.
          Otherwise, older cohorts will appear more profitable simply because they have more months of sales.

          Does the above make sense?

          Best of luck for your analysis,

          Luca

    • Nate says:

      Great article really helped me out a ton!

      One question, if I have a definition of a new customer as a customer that has purchased within the past 365 days from the current date. How can I implement this into the DAX code above?

      Such as, I have a customer counted as new 3 times within a 6 month period, I only want that customer to be counted as a new customer once, unless it has been more than 365 days since their last purchase.

      Thanks!

      • Luca says:

        Hi Nate,

        it's great to hear that the article helped you out!

        For the method explained by the article, a customer is considered as New on his month of purchase.
        Can you explain a bit more about your source data and your data model?
        The current method should not consider the same customer to be New more than once.

        Regarding the last part of your comment: 'unless it has been more than 365 days since their last purchase', below is how I would do it.

        The current method calculates:
        – currentCustomers -> these are all the customers buying in the selected period. For example: customers buying in June 2021
        – pastCustomers -> these are all the customers buying prior to the selected period. For example: customers that purchased in May 2021 or earlier
        – newCustomers -> this is calculated by difference. All the currentCustomers which were not pastCustomers are in effect new customers.

        You could modify the formula for pastCustomers, to only count customers that purchased from June 2020 until May 2021 (instead of the current formula that counts customer purchasing anytime before May 2021).
        The result will be that a customer that purchased in June 2021 and May 2020 will not be included in pastCustomers. And therefore it will be considered as New, based on the logic above.

        Does this make sense?

        You can add the additional filter in the CALCULATETABLE for pastCustomers.

        I hope this helps you,

        Luca

    • Chetna says:

      All of these formulas would have to be added in seperate columns right?

      • Luca says:

        Hi Chetna,

        the formulas should be added in measures, and not calculated columns.
        Does this answer your question?

        Luca

    • devesh says:

      Hi,

      My months are not in orders.

      month0,month 1, month 2, month 10,,month11,month12,month3……month 9

    • Karani says:

      Fantastic !

    • Siyanda says:

      Thank you so much for a great article – I just used it as a guide to analyze cohorts of customers by year! I am officially your student now! Please see my question below:

      My data does not have sales values and I am using product names to filter and analyse how long our customers remain loyal to our products and brands. This approach has limitations, especially in as far as showing me whether our customers have moved to consume other products in our store, or they simply stopped being our customers. Is there a solution whereby I can do this analysis by product, essentially showing how long our customers stay in a product before moving to the next?

      Hope my question is clear – thanking you in advance.

    • Devesh says:

      Please explain how are you calculating 13 months or above months total customers.

    • Surmai says:

      Thank You so much for such a detailed explanation! It helped me a lot and now I am inspired to deep-dive into DAX!

    • Devesh says:

      hi Luca,

      Can I calculate Prior months customer in this chart.

    Leave a Reply

    Your email address will not be published. Required fields are marked *

    LN Consulting
    Tired Of Building A Dashboard By Yourself?

    Continue Reading...

    May 2, 2022
    In this post, I will teach you how to track new and repeat customers in Power BI. While new customers are key to a company’s growth, the cost of acquiring them is usually higher compared to repeat customers. It is therefore important to have a balanced growth between new and repeat customers. Below is the ...
    Read More
    May 2, 2022
    In this post, I will teach you how to build a Pareto analysis dashboard in Power BI. Specifically, the Pareto principle states that 20% of the causes generate 80% of the effects, and vice-versa. We can apply this empirical principle to different topics, from work (20% of the efforts cause 80% of the results) to ...
    Read More
    March 22, 2020
    Xero is an online accounting tool for small businesses. I have been working with it for the past few months and the richness of its features, along with its simplicity is impressive. You can check it out at www.xero.com. In this post, I will show you how to connect Xero to Power BI Desktop using ...
    Read More