Attribution Modeling

How much credit should each touchpoint get?

Posted on September 24, 2018

What is attribution modeling?

Figure: Soccer field example

When a goal is scored in soccer, how much credit should each player get for it?

  • Should the striker get the most credit for the goal?
  • Should the last person who passed the ball to the striker get the most credit for the goal?
  • Should the first person who initiated that rhythm of passes get the most credit for the goal?
  • Should every person who passed the ball get equal credit?
  • Should the later ones get more credit for the goal?

In marketing we face a similar problem: when a conversion happens, how much credit should each marketing channel get?
Figure: A customer's journey to landing on the website


Why do we need an attribution model?

Companies today spend a large share of their budget on marketing, so they need attribution:

  • To gauge the effectiveness of channels
  • To measure impact of communication with customers
  • To determine ROI
  • To decide which actions to take





Types of attribution models:


  • Rule based attribution model
    • First Click Attribution Model

      This is the closest proxy we have for "how did they hear about us in the first place?"
      This is useful if you need to know which keywords genuinely helped make you known to the user.

      Pros:
      • Easy to implement
      • Helps identify new-customer acquisition channels
      • Gives insight into campaigns that drive awareness
      Cons:
      • Ignores the influence of subsequent touches
      • Gives too much credit to lead-generation programs

    • Last Click Attribution Model

      Most useful when the final touchpoint really was the deciding one, e.g. for impulse purchases or very price-sensitive decisions.

      Pros:
      • Easy to implement
      • Gives insight into campaigns that drive conversions
      Cons:
      • Ignores the influence of prior touches
      • Gives too much credit to converting campaigns

    • Equal weight Attribution Model

      Gives equal weight to every touchpoint, regardless of when the click happens.

      Pros:
      • No fighting over who gets credit
      • Helpful for longer revenue cycles with many clicks
      Cons:
      • Low-impact touchpoints get too much credit
      • High-impact touchpoints get no extra weight

    • Time Decay Attribution Model

      “The later you are in the click chain, the more credit you get.” If users can't remember who showed up in the first few clicks, then those clicks were probably worth less in the final decision.

      Pros:
      • Takes all touchpoints into account
      • Helpful for longer revenue cycles with many clicks
      Cons:
      • Artificially inflates the importance of later channels
      • Gives low credit to acquisition channels

    • Positional-Rule based Attribution Model

      “What made you aware of us, and what made you purchase from us?”
      If you needed to get on a shortlist, but you don't care whether the user interacted with you again until they made their final decision, this model replicates that level of credit.

      Pros:
      • High credit for acquiring and converting channels
      • Helpful for longer revenue cycles with many clicks
      Cons:
      • Less influence for middle channels
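
The five rule-based models above differ only in how they spread credit over the ordered touchpoints of a journey. The following is a minimal sketch of that idea, not a reference implementation. The function name `rule_based_weights`, the position-based 40/20/40 split (`first_last_share=0.4`), and the time-decay `half_life` measured in touchpoint positions rather than in days are all assumptions made for illustration.

```python
def rule_based_weights(n, model, half_life=2.0, first_last_share=0.4):
    """Return n credit weights (summing to 1) for touchpoints in journey order.

    half_life and first_last_share are illustrative assumptions,
    not values taken from the article.
    """
    if n <= 0:
        raise ValueError("a journey needs at least one touchpoint")

    if model == "first_click":
        raw = [1.0] + [0.0] * (n - 1)
    elif model == "last_click":
        raw = [0.0] * (n - 1) + [1.0]
    elif model == "equal_weight":
        raw = [1.0] * n
    elif model == "time_decay":
        # Credit halves every `half_life` positions back from the conversion.
        raw = [0.5 ** ((n - 1 - i) / half_life) for i in range(n)]
    elif model == "position_based":
        if n <= 2:
            raw = [1.0] * n
        else:
            # First and last touch share most of the credit,
            # the middle touches split the remainder equally.
            middle = (1.0 - 2 * first_last_share) / (n - 2)
            raw = [first_last_share] + [middle] * (n - 2) + [first_last_share]
    else:
        raise ValueError(f"unknown model: {model}")

    total = sum(raw)
    return [w / total for w in raw]


# Example journey (channel names are only for illustration).
journey = ["social", "display", "retargeting"]
for model in ("first_click", "last_click", "equal_weight",
              "time_decay", "position_based"):
    weights = rule_based_weights(len(journey), model)
    print(model, dict(zip(journey, (round(w, 3) for w in weights))))
```

Returning normalized weights keeps the models interchangeable: multiplying the weights by the revenue of a conversion gives the revenue credited to each touchpoint under the chosen rule.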



  • Data-driven attribution model: a second way of looking at attribution
    • "Proper distribution to each channel"

      Pros:
      • Helps find the proper attribution for each channel
      • No set of rules to skew the data
      • Less guesswork
      • Better decisions
      Cons:
      • Not easy to implement
      • Requires statistical knowledge
      • Requires a lot of data to establish statistical validity

    • Shapley Attribution Model

      How much value does each channel have?
      Or, in other words:
      How much value does a channel add to an already existing list of channels?

      Let's consider a customer journey.

      In this customer journey, the customer has a 1% chance of making a purchase.

      Now, when we add Display to the customer journey, the customer has a 2% chance of making a purchase. This means the chance of a purchase is 100% higher when we add Display as a channel to the current customer journey.

      Suppose we have 430 total conversions that happened through the paid channels. Of these, 50 conversions had only Social as the channel in the journey, 30 conversions had only Display, ..., 60 conversions had both Social and Display, ..., and 100 conversions had Social, Display and Retargeting.

      The contribution of each channel is the sum of its marginal contributions across the channel combinations it appears in. E.g. for Social it is
      [50 (where Social is the only channel in the customer's purchase journey)] +
      [60 (where Social and Display occur together) - 30 (where only Display occurs)]
      + .... +
      [100 (where Social, Display and Retargeting occur together) - 80 (where only Display and Retargeting occur)]



      The weight of each channel is the contribution of that channel divided by the total contribution of all the channels.
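
The marginal-contribution idea can be sketched in Python as follows. This sketch uses the standard Shapley value formula, which weights each marginal contribution by how many channel orderings produce it, whereas the walkthrough above adds the marginal contributions directly. The conversion counts for Social, Display, Social + Display, Display + Retargeting and all three channels come from the example above. The two counts marked hypothetical are placeholders chosen only so the code runs, and the function name `shapley_values` is likewise an assumption.

```python
from itertools import combinations
from math import factorial

# Conversions observed for each exact channel combination.
# Entries marked "hypothetical" are NOT from the article.
conversions = {
    frozenset(): 0,
    frozenset({"social"}): 50,
    frozenset({"display"}): 30,
    frozenset({"retargeting"}): 40,            # hypothetical placeholder
    frozenset({"social", "display"}): 60,
    frozenset({"social", "retargeting"}): 70,  # hypothetical placeholder
    frozenset({"display", "retargeting"}): 80,
    frozenset({"social", "display", "retargeting"}): 100,
}


def shapley_values(value, channels):
    """Standard Shapley value of each channel for the coalition values given."""
    n = len(channels)
    result = {}
    for channel in channels:
        others = [c for c in channels if c != channel]
        total = 0.0
        for size in range(n):
            # Share of channel orderings in which `channel` joins a
            # coalition of exactly `size` other channels.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                marginal = value[coalition | {channel}] - value[coalition]
                total += weight * marginal
        result[channel] = total
    return result


channels = ["social", "display", "retargeting"]
contribution = shapley_values(conversions, channels)

# Weight of a channel = its contribution / total contribution of all channels.
total_contribution = sum(contribution.values())
weights = {c: v / total_contribution for c, v in contribution.items()}
print(contribution)
print(weights)
```

The inner loop visits every subset of the other channels, which is why the cost grows as 2^n with the number of channels.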


      The resulting channel contributions differ from those of all the rule-based models.


    • Markov-chain Attribution Model

      Probabilistic Model
      Relies on current state to predict the next state

      Let's consider 3 customer journeys.


      Journey 1: The customer comes from the Social channel, then a few days later from the Display channel, then from Retargeting, and finally makes a purchase.

      Journey 2: The customer comes from the Display channel, then a few days later from the Retargeting channel, but never makes a purchase.

      Journey 3: The customer comes from the Display channel only and never makes a purchase.

      So we have 3 customer journeys, as follows:

      • Journey 1: Start → Social → Display → Retargeting → Conversion
      • Journey 2: Start → Display → Retargeting → No conversion
      • Journey 3: Start → Display → No conversion

      A Markov model is a state-transition model: the next state depends only on the current state. So we need to transform the data into transition probabilities, i.e. the probability of the customer moving from each state to the next one.
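
A minimal sketch of that transformation, assuming each journey is encoded as a list of states that begins with a `start` state and ends in either `conversion` or `null` (no purchase). These state names and the function name `transition_probabilities` are illustrative choices, not taken from the article.

```python
from collections import defaultdict

# The three example journeys, encoded as ordered lists of states.
journeys = [
    ["start", "social", "display", "retargeting", "conversion"],
    ["start", "display", "retargeting", "null"],
    ["start", "display", "null"],
]


def transition_probabilities(paths):
    """Estimate P(next state | current state) by counting observed transitions."""
    counts = defaultdict(lambda: defaultdict(int))
    for path in paths:
        # Pair each state with the state that follows it.
        for current, following in zip(path, path[1:]):
            counts[current][following] += 1
    return {
        current: {
            following: count / sum(following_counts.values())
            for following, count in following_counts.items()
        }
        for current, following_counts in counts.items()
    }


transitions = transition_probabilities(journeys)
for state, probabilities in transitions.items():
    print(state, probabilities)
```

Counting transitions pair by pair is what makes the model sequence-aware: the same channel can lead to different next states with different probabilities.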


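      Once the data is transformed, we get the following transition probabilities between the states in the customer journeys:

      • Start → Social: 33%
      • Start → Display: 66%
      • Social → Display: 100%
      • Display → Retargeting: 66%
      • Display → No conversion: 33%
      • Retargeting → Conversion: 50%
      • Retargeting → No conversion: 50%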

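      The total chance of making a purchase is the sum over the two paths that end in a conversion, Start → Social → Display → Retargeting → Conversion and Start → Display → Retargeting → Conversion: (33% * 100% * 66% * 50%) + (66% * 66% * 50%) = 33%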

      What is the impact of removing the Social channel from the customer journeys? The path that starts with Social can no longer convert, so only the second path remains: (66% * 66% * 50%) = 22%

      The removal effect of a channel is the share of conversions lost when that channel is removed. For Social it is 1 - (22.2% / 33.3%) = 0.33

      We find the removal effect of the other channels in the same way. Both converting paths pass through Display and Retargeting, so removing either one drops the chance of a purchase to 0.


      Effect of removing:

      • Social Media: 1 - (22.2% / 33.3%) = 0.33
      • Display: 1 - (0% / 33.3%) = 1
      • Retargeting: 1 - (0% / 33.3%) = 1


      Each channel's share of the conversions is its removal effect divided by the sum of all removal effects, multiplied by the total number of conversions in the journeys, which is one in this case. This can be calculated as:

      • Social Media: 0.33 / (0.33 + 1 + 1) = 0.143 * conversion (1) = 14.3%
      • Display: 1 / (0.33 + 1 + 1) = 0.429 * conversion (1) = 42.9%
      • Retargeting: 1 / (0.33 + 1 + 1) = 0.429 * conversion (1) = 42.9%
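
The removal-effect calculation can be sketched as below. The sketch makes two assumptions that should be checked against your own setup. First, it uses the common definition removal effect = 1 - (conversion probability without the channel / baseline conversion probability), under which Social's removal effect is 1 - 22.2% / 33.3% ≈ 0.33. Second, it assumes the journey graph has no loops, so a simple recursion over paths is enough. The state names and the function name `conversion_probability` are illustrative.

```python
# Transition probabilities derived from the three example journeys.
transitions = {
    "start": {"social": 1 / 3, "display": 2 / 3},
    "social": {"display": 1.0},
    "display": {"retargeting": 2 / 3, "null": 1 / 3},
    "retargeting": {"conversion": 0.5, "null": 0.5},
}


def conversion_probability(transitions, state="start", removed=None):
    """Probability of reaching 'conversion' from `state`.

    A removed channel is treated as a dead end (no conversion).
    Assumes the graph is acyclic; loops would need a linear solve instead.
    """
    if state == "conversion":
        return 1.0
    if state == "null" or state == removed:
        return 0.0
    return sum(
        probability * conversion_probability(transitions, following, removed)
        for following, probability in transitions.get(state, {}).items()
    )


channels = ["social", "display", "retargeting"]
baseline = conversion_probability(transitions)

# Share of conversions lost when a channel is removed.
removal_effect = {
    channel: 1 - conversion_probability(transitions, removed=channel) / baseline
    for channel in channels
}

# Split the observed conversions in proportion to the removal effects.
total_conversions = 1
total_effect = sum(removal_effect.values())
attributed = {
    channel: total_conversions * effect / total_effect
    for channel, effect in removal_effect.items()
}
print(baseline, removal_effect, attributed)
```

Normalizing by the sum of removal effects is needed because the effects themselves do not add up to 1: here two channels each have a removal effect of 1 on their own.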

      The resulting channel contributions differ from those of all the rule-based models.


    • When to choose which data-driven attribution model

      Shapley wins over Markov chain in:

      • Much broader industry adoption
      • Used successfully in attribution and auto-bidding platforms for years
      • Backed by Nobel Prize winning research
      • Slightly more straightforward approach to the attribution problem in which sequence doesn’t matter
      • Easier to implement
      • Results are usually more stable
      • Results are less sensitive to the input data


      Markov chain outshines Shapley value in:
      • Considers channel sequence as a fundamental part of the algorithm, which is more closely aligned with a customer’s journey
      • Potential to scale to a larger number of channels, whereas marginal contributions for the Shapley value must be calculated 2^n times (n being the number of marketing channels)
      • Can be applied at the level of individual marketing campaigns, which is more actionable for personalization



With these models, we can properly attribute revenue from new customer acquisition to the different channels.