Programmatic advertising continues to shift away from simply buying impressions toward a more advanced audience-targeted approach. Ultimately, advertisers running programmatic app marketing campaigns are interested in acquiring users with a high lifetime value (LTV). The definition of LTV varies from advertiser to advertiser, but in most cases, it is expressed as the revenue the user generates through in-app purchases or in-app advertising.

Ideally, when bidding in an RTB auction, we would be able to accurately estimate the LTV of the user. In practice, the task of LTV estimation is challenging for two primary reasons.

Sparse, noisy data

Peduzzi et al. (1996) suggest the following rule of thumb for choosing a sufficient sample for maximum likelihood estimation: given $$k$$ predictor variables, and a fraction $$p$$ of positive samples in the population, the minimum number of training samples is given by $N = \frac{10k}{p}$. Effectively, the minimum sample size grows linearly with the number of predictors, and inversely with the positive label frequency. Both of these pose significant challenges.

Take, for instance, the binary classification problem of predicting whether or not the user will install an app and eventually make a purchase. A typical campaign sees 1 in 1000 impressions convert to an install, and 1 in 10-100 installs convert to a purchase, giving a positive label frequency between $$10^{-4}$$ and $$10^{-5}$$. We take $$p \approx 10^{-5}$$, the sparse end of that range.

Additionally, most predictor variables available in the RTB auction setting are high-dimensional categorical variables; the publisher space alone is on the order of $$10^5$$ levels. Even with aggressive feature selection, it’s not uncommon to see datasets with $$k = 10^4$$ predictor variables.

With these parameters, Peduzzi’s formula puts the minimum sample size at 10 billion impressions. This is prohibitively expensive; obtaining 10 billion training impressions can cost millions of dollars, even at conservative CPMs.
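The arithmetic behind this estimate can be reproduced with a short sketch. The function names and the CPM of $1 are illustrative assumptions, not figures from a specific campaign; the rates and predictor count are the ones quoted above.

```python
def min_sample_size(k: int, p: float, events_per_variable: int = 10) -> float:
    """Rule-of-thumb minimum sample size N = events_per_variable * k / p."""
    if k <= 0 or not 0 < p <= 1:
        raise ValueError("k must be positive and p must lie in (0, 1]")
    return events_per_variable * k / p


def impression_cost(n_impressions: float, cpm_usd: float) -> float:
    """Cost of buying n_impressions at a given CPM (price per 1000 impressions)."""
    return n_impressions / 1000 * cpm_usd


install_rate = 1e-3    # 1 in 1000 impressions converts to an install
purchase_rate = 1e-2   # 1 in 100 installs converts to a purchase (sparse end)
p = install_rate * purchase_rate  # positive label frequency, about 1e-5
k = 10_000             # predictor variables after feature selection

n = min_sample_size(k, p)
cost = impression_cost(n, cpm_usd=1.0)  # assumed CPM of $1, for illustration only

print(f"minimum sample size: {n:.3g} impressions")
print(f"cost at $1 CPM: ${cost:,.0f}")
```

Under these assumptions the sketch gives roughly 10 billion impressions and a cost on the order of $10 million, consistent with the estimate above.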

Feedback delay

Lifetime value, by definition, has an extremely long time horizon (on the order of months or years). Therefore, naive LTV modeling is limited to mature installs. However, within the RTB setting, it is crucial to adapt bidding strategies to changes in the underlying landscape.

The figure below illustrates the problem of feedback delay. Users acquired more recently have had less time to mature, so the information feeding into our LTV model is censored. This makes it crucial to employ some form of survival modeling to jointly model the feedback delay along with the feedback distribution (LTV).
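To make the effect of censoring concrete, here is a minimal sketch using the Kaplan-Meier estimator, a standard survival-analysis tool chosen here for illustration rather than the joint model described above. The event is assumed to be a user's first purchase, and the data is made up.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each distinct event time.

    durations: days from install to first purchase (if observed), or days
               from install to today (if the user has not purchased yet).
    observed:  True if the purchase was seen, False if the user is censored.
    """
    event_counts = Counter(t for t, seen in zip(durations, observed) if seen)
    exit_counts = Counter(durations)  # every user leaves the risk set at their own duration
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exit_counts):
        events = event_counts.get(t, 0)
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= exit_counts[t]
    return curve


# Hypothetical cohort: three purchases observed, five users still censored.
durations = [2, 3, 3, 5, 7, 7, 10, 10]
observed = [True, False, True, False, True, False, False, False]

curve = kaplan_meier(durations, observed)
naive_rate = sum(observed) / len(observed)  # treats censored users as non-buyers
km_rate = 1 - curve[-1][1]                  # estimated conversion by the last event time

print(f"naive conversion rate: {naive_rate:.4f}")
print(f"Kaplan-Meier conversion by day {curve[-1][0]}: {km_rate:.4f}")
```

On this toy cohort the naive rate is 0.375, while the Kaplan-Meier estimate of conversion by day 7 is about 0.44: counting immature users as non-buyers biases the estimate downward.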

Common proxies

For the reasons outlined above, most advertiser KPIs hinge on a proxy metric for LTV. A developer will analyze the in-app conversion funnel and detect some user behaviors which are early predictors of high LTV. These proxies typically fall into three classes.

• Cost per click/install

Without any behavioral user analysis, it is common to run a campaign on a CPC/CPI basis. The DSP delivers clicks or installs at a certain cost determined by the advertiser; the advertiser determines this cost based on the average revenue per click/install (RPC/RPI) seen within their specific conversion funnel.

Once an industry standard, CPC/CPI KPIs are losing popularity. Despite having the advantage of immediate feedback and ample training data (essentially eliminating both problems outlined above), CPC/CPI targets are difficult to set due to the underlying RPC/RPI variance. For instance, the figure below illustrates the RPC distribution for a sample campaign.

Historically, some clicks have generated up to three orders of magnitude more revenue than others; buying all of these clicks at a single fixed price leads to extremely inefficient impression pricing.

• Cost per engagement (by day N)

Generally, advertisers are able to see a high correlation between lifetime value and some in-app behaviors (for example, completing a level, or making an in-app purchase within a certain time window after install). This type of KPI provides a “happy medium”: the feedback delay is typically reduced to a few days, while the label remains a high-fidelity proxy for future spend.

A simple recency/frequency analysis can unearth interesting behaviors which are early indicators of high-LTV users. These behavioral indicators can then be used as labels to train a predictive model, essentially inflating the positive sample rate.

• Day-N return on investment

By far the most popular choice of KPI is the return on investment by day N after install (DN ROI, where N can vary between 3 and 30). The choice of N comes with a trade-off: small values reduce feedback delay, but increase label sparsity; large values generate more labeled data, but increase feedback delay, requiring the use of survival modeling.
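As a rough sketch of how a DN ROI figure might be computed, the function below restricts itself to mature installs (those at least N days old), which is the naive way of handling feedback delay. The record layout, the function name, and the definition of ROI as revenue divided by spend are all assumptions made for illustration; advertisers may define ROI net of cost.

```python
from datetime import date, timedelta
from typing import Optional


def day_n_roi(installs, purchases, n_days: int, as_of: date) -> Optional[float]:
    """Revenue within n_days of install divided by acquisition spend.

    installs:  iterable of (user_id, install_date, acquisition_cost)
    purchases: iterable of (user_id, purchase_date, revenue)
    Only installs that are at least n_days old on `as_of` are counted.
    """
    mature = {
        uid: (installed, cost)
        for uid, installed, cost in installs
        if installed + timedelta(days=n_days) <= as_of
    }
    spend = sum(cost for _, cost in mature.values())
    if spend == 0:
        return None  # no mature installs yet, so the KPI is undefined
    revenue = sum(
        amount
        for uid, purchased, amount in purchases
        if uid in mature and 0 <= (purchased - mature[uid][0]).days <= n_days
    )
    return revenue / spend


# Made-up data: three installs at $2 each, four purchases.
as_of = date(2024, 1, 31)
installs = [
    ("a", date(2024, 1, 1), 2.0),
    ("b", date(2024, 1, 10), 2.0),
    ("c", date(2024, 1, 29), 2.0),
]
purchases = [
    ("a", date(2024, 1, 2), 1.0),
    ("a", date(2024, 1, 20), 5.0),
    ("b", date(2024, 1, 12), 3.0),
    ("c", date(2024, 1, 30), 4.0),
]

roi_d3 = day_n_roi(installs, purchases, n_days=3, as_of=as_of)
roi_d30 = day_n_roi(installs, purchases, n_days=30, as_of=as_of)
print(f"D3 ROI: {roi_d3}, D30 ROI: {roi_d30}")
```

In this toy data, two of the three installs are mature enough for D3, but only one is mature enough for D30, which shows how a larger N delays the point at which recent installs can contribute to the KPI.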

In either case, user purchase data is highly noisy and sparse. Adequately predicting in-app revenue generally requires modeling the post-install user conversion funnel separately, using a large seed dataset of all in-app behavioral data.

The correct estimation of the user LTV and the right optimization strategy are the cornerstones of a successful app marketing campaign.
