Introduction
The need to go beyond propensity
When companies get started with data science, one of the first projects they tackle is often a customer intelligence project. These projects focus on predicting which customers are likely to attrite, which customers are the best targets for cross-selling, or which prospects are most likely to purchase the company’s products. We call these models, generically, targeting models, since they select which customers should be targeted by the company to encourage (or discourage) some response.
All of these models focus on the propensity that an individual will have a specific response (e.g., attrite, purchase an additional product, or purchase an initial product). These models are a helpful first step, but they do not answer a fundamental question: what is the value of taking different actions toward these individuals?
Instead of using models that predict the probability of an individual responding in a certain way, we propose building models that answer three pivotal questions:
- What is the probability that an individual will respond given different actions that the company can take? For example, what is the probability an individual will attrite if the company does nothing, and what is the probability the individual will attrite if the company places an outbound call to him?
- What is the current valuation of the individual’s remaining lifetime value?
- What is the cost of the different actions the company can take?
If we can provide answers to these questions, we can describe the value of making different decisions concerning any individual.
We call this process value-driven targeting. It consists of building models for the probability of response under different actions and for remaining lifetime value; determining the cost of each action; combining these pieces into measures of expected value under different actions; and selecting the best action to take. In the following sections, we describe the approach to each of these components.
What is the probability of response given different actions?
Answering the first question requires constructing a model that predicts the probability of an individual responding as a result of the different actions the company can take. To build such a model, we need access to the following types of data:
- Historical data at the individual level, including the responses those individuals have given (e.g., whether an individual attrited or stayed with the company)
- Actions taken by the company toward that individual. This typically consists of marketing campaigns (both mass marketing and direct marketing) or outbound calls.
These models are typically constructed to allow prediction of whether an individual will respond within a specified time window. For example, an insurance company might choose to predict whether a prospect will purchase auto insurance in the next six months.
Multiple modeling approaches can be taken to solve this problem. Statisticians might choose to use a survival model, which makes efficient use of observations of individuals over time. In contrast, data scientists might choose a machine learning model that predicts a binary response (individual purchased auto insurance or did not purchase) over a historical six-month window.
Regardless of the methodology used, there is one important element of these models that is often overlooked: they do not account for the impact of different actions the company might take. Does the company truly care that Rachel Simmons in Topeka, KS, has the highest probability among all prospects of purchasing auto insurance from them in the next six months if they can’t do anything about it? No. What the company cares about is how much they can impact the individual’s response by taking different actions. At a minimum, this means that the models need to include effects of different actions the company has taken in the past. Ideally, the models would also allow for those effects to differ from person to person (i.e., the variables representing actions would be interacted with variables related to individuals).
As an example, the following figure shows a scenario in which older individuals are more likely to respond, as shown by the positive slopes for the lines. Ignoring the actions the company can take, we might choose to focus on older individuals, those who have the highest probability of responding.
However, the most significant impact is to be had on younger individuals. Although they have a lower probability of responding, the company is able to affect them more through mailing, as shown by the larger gap between the probability of response without mail and the probability of response with mail.
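The scenario above can be sketched with a logistic model in which the mail indicator is interacted with age. The coefficients below are hypothetical values chosen only to reproduce the pattern described (they are not estimates from any data set), and the function names are illustrative.

```python
import math

# Hypothetical coefficients, chosen only to mimic the pattern in the text:
# response rises with age, but mail helps younger individuals more.
INTERCEPT = -4.0
B_AGE = 0.03
B_MAIL = 1.5
B_AGE_X_MAIL = -0.02  # interaction: the mail effect shrinks as age increases


def response_probability(age, mailed):
    """Probability of response from a logistic model with an age-by-mail interaction."""
    mail = 1 if mailed else 0
    logit = INTERCEPT + B_AGE * age + B_MAIL * mail + B_AGE_X_MAIL * age * mail
    return 1 / (1 + math.exp(-logit))


def mail_lift(age):
    """Change in response probability attributable to mailing."""
    return response_probability(age, True) - response_probability(age, False)


for age in (25, 65):
    print(age,
          round(response_probability(age, False), 3),
          round(response_probability(age, True), 3),
          round(mail_lift(age), 3))
```

With these illustrative numbers, ranking on the no-mail probability alone would favor the 65-year-old, while ranking on lift favors the 25-year-old.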
In cases where the company isn’t able to estimate the likelihood of individual behavior as the result of a possible action, it will be necessary to estimate the lift associated with the company’s action using other studies, testing, or ad hoc estimation. For example, a company could encounter this challenge when it is executing a new marketing tactic or entering a new channel for the first time. In these cases, it is advisable to engage a partner with experience in measuring lift from different marketing tactics.
There are numerous nuances that need to be considered in accurately estimating the probability of an individual responding to different actions the company can take. Within the tactic of direct mail, we must consider cumulative mailings over time, repeated mailings within a single campaign, attribution of response to an individual mail piece, and ad stock. Within the tactics of television and radio, we must consider the number of times an individual was exposed to the ad, whether the individual understood the message, and possible synergies among different tactics. And for all tactics, we must consider the impacts of changing brand health and brand awareness over time.
What is the remaining lifetime value of your customer?
Not all customers are equally valuable to a company, and the value of a customer is exhausted over time. For example, a bank might estimate the remaining lifetime value of a young doctor who has just finished his fellowship and landed a permanent job to be higher than the remaining lifetime value of a wealthy retiree intending to donate his fortune to charity. Although the retiree’s assets might have greater current value, the bank’s ability to receive ongoing income from him is limited, while building loyalty with the doctor could result in increased assets and a deepening relationship for many years.
Estimating lifetime value is challenging in all industries. It’s slightly more straightforward in financial services than in, say, retail or restaurants because information is easily tied to an individual. Although loyalty programs are making this sort of information more readily available in other industries, relying on loyalty program data is problematic for two reasons:
- Only a small share of customers participate.
- Those who do participate are systematically different from those who do not (i.e., they are not a random sample from the population of interest).
From a modeling perspective, there are several different approaches that can be taken to estimate remaining lifetime value. The simplest method is to estimate average annual (or daily, or monthly) revenue and expenses associated with an individual and to estimate the average lifetime with the company. For example, we might estimate that the average Netflix customer spends $12.99/month, incurs an average of $1.29/month in expenses due to usage and customer service, and retains a subscription for an average of five years. In this case, a customer who has been a member for four years has one year remaining and a remaining lifetime value of:
($12.99 - $1.29) per month × 12 months = $140.40
The limitations of this approach are somewhat obvious. For most companies, we know that the longer an individual has been with a company, the more likely she is to stay with the company. Also, the model does not account for individual characteristics, like age or historical viewing habits. As with models for probability of response above, survival models predicting the length of time the individual will remain with the company conditional on their historical behavior would be more useful. Also, we should calculate remaining lifetime value in today’s dollars, discounting revenue obtained in the future.
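As a rough sketch of the discounting point, the function below sums the monthly margin over the remaining months and discounts each month back to today. The function name, the 8 percent annual discount rate, and the assumption that margin is received at the end of each month are illustrative choices, not figures from this article; it also keeps the fixed remaining lifetime rather than replacing it with a survival model.

```python
def remaining_lifetime_value(monthly_revenue, monthly_expense,
                             months_remaining, annual_discount_rate=0.0):
    """Sum of monthly margins over the remaining months, discounted to today."""
    margin = monthly_revenue - monthly_expense
    # Convert the annual rate to an equivalent monthly rate.
    monthly_rate = (1 + annual_discount_rate) ** (1 / 12) - 1
    # Assumption: each month's margin arrives at the end of that month.
    return sum(margin / (1 + monthly_rate) ** month
               for month in range(1, months_remaining + 1))


# Subscription example from the text: $12.99 revenue, $1.29 expense, 12 months left.
undiscounted = remaining_lifetime_value(12.99, 1.29, 12)
# Hypothetical 8 percent annual discount rate.
discounted = remaining_lifetime_value(12.99, 1.29, 12, annual_discount_rate=0.08)
print(round(undiscounted, 2), round(discounted, 2))
```

With a discount rate of zero this reduces to the simple calculation of margin times months remaining; any positive rate lowers the result.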
What is the cost of various actions?
The cost of different actions is not something that is typically modeled statistically; companies know how much it costs to make a phone call to a customer or deliver a mail piece. However, companies often incorrectly calculate or ignore overhead costs associated with different actions. For example, a company might know that its mail campaign typically incurs a cost of $0.47 per mail piece printed and sent, but it does not account for the cost of the data scientist building the targeting models. Admittedly, such a cost is difficult to allocate to the individual actions the company can take (since the data scientist is a fixed cost), but if the company decides that direct mail is no longer a profitable action to take, the direct mail campaign might be eliminated, saving not only the cost of the actual mailings but also that of the data scientist performing the modeling.
Pulling the elements together: A practical example
Once models have been developed, calculating the value of different actions and selecting the best one for each individual is straightforward. As an example, suppose an individual has a 5 percent probability of attriting in the next six months (based on a targeting model). Based on historical tests of outreach effectiveness, the company estimates it can reduce that probability of attrition in the next six months to 3 percent by calling her, to 4 percent by sending her a letter, or to 4.5 percent by sending her an email. Also, suppose the individual’s remaining lifetime value, assuming she leaves, is $0, and her remaining lifetime value, assuming the company retains her, is $50. Finally, suppose that after considering all overhead, the cost of the outbound call is estimated to be $5, the letter costs $0.50, and the email costs $0.03.
In this case, sending an email is the right choice. It has the highest expected value, and the expected value is positive.
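The comparison can be reproduced with a few lines of code. This sketch assumes the expected value of an action is the probability of retention times the retained value, plus the probability of attrition times the value if she leaves, minus the cost of the action; the variable and function names are illustrative.

```python
RETAINED_VALUE = 50.00  # remaining lifetime value if she stays
LOST_VALUE = 0.00       # remaining lifetime value if she leaves

# action -> (probability of attrition, cost of the action), from the example
ACTIONS = {
    "nothing": (0.05, 0.00),
    "call": (0.03, 5.00),
    "letter": (0.04, 0.50),
    "email": (0.045, 0.03),
}


def expected_value(p_attrite, cost):
    """Expected remaining lifetime value net of the cost of the action."""
    return (1 - p_attrite) * RETAINED_VALUE + p_attrite * LOST_VALUE - cost


values = {name: expected_value(p, c) for name, (p, c) in ACTIONS.items()}
best_action = max(values, key=values.get)

for name, value in values.items():
    print(f"{name:8s} {value:6.2f}")
print("best:", best_action)
```

Under these assumptions the email comes out about $0.22 ahead of doing nothing, the letter roughly breaks even against doing nothing, and the call costs more than it recovers.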
Typically, companies are not interested in deciding how to reach one person but instead are interested in designing a campaign to address a large population of individuals. It is possible to calculate the expected value of each action for each person as above, but the company still needs to allocate its resources as effectively as possible. To maximize value, a company should rank individuals by the difference in expected value between the best action (excluding doing nothing) and doing nothing. In our example above, this value would be the expected value of sending an email, (1 - 0.045) × $50 - $0.03 = $47.72, minus the expected value of doing nothing, (1 - 0.05) × $50 = $47.50, or $0.22.
The next step is to assign actions to individuals, starting with the individual whose best action has the highest incremental expected value, until the marketing budget is exhausted or there are no longer any individuals with an incremental expected value greater than $0 for any possible action.
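A minimal sketch of this ranking-and-selection step follows. The individuals, their incremental values, and the budget are hypothetical, costs are held in cents to avoid rounding issues, and skipping an action that no longer fits the budget (rather than stopping) is one possible design choice.

```python
def allocate(candidates, budget_cents):
    """Greedy selection by incremental expected value.

    candidates: list of (person, best_action, incremental_value, cost_cents),
    where incremental_value is the expected value of the best action minus
    the expected value of doing nothing (already net of the action's cost).
    """
    plan = []
    remaining = budget_cents
    ranked = sorted(candidates, key=lambda c: c[2], reverse=True)
    for person, action, incremental_value, cost_cents in ranked:
        if incremental_value <= 0:
            break  # nobody further down the ranking is worth contacting
        if cost_cents > remaining:
            continue  # assumption: skip and keep looking for cheaper actions
        plan.append((person, action))
        remaining -= cost_cents
    return plan, remaining


# Hypothetical individuals, for illustration only.
candidates = [
    ("A", "email", 0.22, 3),
    ("B", "call", 3.10, 500),
    ("C", "letter", 1.40, 50),
    ("D", "email", -0.01, 3),
]
plan, left_over = allocate(candidates, budget_cents=551)
print(plan, left_over)
```

Here the call to B and the letter to C are selected, the email to A no longer fits the remaining budget, and D is never contacted because the incremental value is negative.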
Of course, such a strategy is not possible in all cases. A company may have resources allocated separately for each action (e.g., a certain number of call center employees who will make outbound calls). These cases require more sophisticated optimization algorithms to create the marketing plan.
Finally, we note that the strategies described here wholly ignore the uncertainty associated with estimates provided by statistical models. For example, when we estimate the probability of attrition for an individual, there is uncertainty attached to that probability. The same is true for estimates of remaining lifetime value. From a decision-theoretic standpoint, those uncertainties should be incorporated into the calculation of expected value from different actions, but the complexities of those calculations are not described here.
Don’t get in over your head
While companies are making good use of data science (and machine learning) to calculate the probabilities that individuals of interest will respond, they are not adequately optimizing their actions toward those individuals. Rather than merely calculating probabilities of different responses, it’s important to calculate those probabilities based on different actions the company can take, consider the remaining lifetime value of those individuals, and incorporate the costs of different actions. Once all of those components have been incorporated, making the right decision concerning each person is straightforward, and taking the right steps can have a meaningful impact on a company’s bottom line.
Admittedly, there are several challenges to overcome in adopting value-driven targeting. Understanding the probability of response given different actions requires bringing together data across silos; in other words, merging marketing data with customer transactional data. This can be a challenge for companies that do not have easy access to this data. Likewise, estimating lifetime value is difficult because many companies don’t track customer data across the entire lifecycle. Some industries, like banking and insurance, collect this data, but few are using it to estimate the lifetime value of their current and future customers. Finally, calculating the cost of different actions is challenging because overhead costs are often not allocated or recorded in a way that allows meaningful attribution to different actions.
Ready to dive in?
A good start to overcoming these challenges is to identify difficult questions you want to answer and actions that involve significant investment. If you are interested in helping your teams go beyond propensity and start optimizing the actions you take, please contact us to schedule a conversation.