Machine Learning Publications

Behavioral Targeting

Online advertising is becoming increasingly competitive. To maximize their revenue, advertisers optimize KPIs (Key Performance Indicators) such as the CTR (Click-Through Rate) by using various targeting techniques. Targeting can be based on the contextual information of the web page the user is currently visiting, or it can be performed by profiling the user’s historical set of queries. The latter technique is known as Behavioral Targeting (BT). It is used to predict how likely a user is to respond to an advertisement. We present here a BT system based on logistic regression, after first introducing some technical concepts used in digital advertising.

Technical Definitions

At any time $t$, a user $u$ can be represented by a feature vector $x_u(t)$. The features can be one or a combination of the following types of information: demographic (age, gender), geographic (state, zip code, city), and web page browsing (history of web pages visited).
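As a rough illustration of how such a feature vector could be assembled, the sketch below one-hot encodes the categorical fields and counts page visits. The field names, the vocabulary, and the scaling of age are all assumptions made for the example; the article does not specify an encoding.

```python
def encode_user(profile, vocabulary):
    """Turn a (hypothetical) user profile into a flat numeric feature vector."""
    # Demographic: crude scaling of age to roughly [0, 1].
    features = [profile["age"] / 100.0]
    # Demographic and geographic: one-hot encoding of categorical fields.
    features += [1.0 if profile["gender"] == g else 0.0 for g in vocabulary["gender"]]
    features += [1.0 if profile["state"] == s else 0.0 for s in vocabulary["state"]]
    # Browsing history: number of visits to each known page.
    features += [float(profile["visited"].count(p)) for p in vocabulary["pages"]]
    return features


# Made-up vocabulary and profile, for illustration only.
vocabulary = {
    "gender": ["f", "m"],
    "state": ["CA", "NY"],
    "pages": ["sports.example/home", "news.example/tech"],
}
user = {
    "age": 30,
    "gender": "f",
    "state": "NY",
    "visited": ["news.example/tech", "news.example/tech"],
}
x = encode_user(user, vocabulary)
```

Real systems typically use sparse representations, since the number of distinct pages or queries is very large; the dense list here only keeps the example short.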

Given an advertisement $a$, we define the following functions:

$f(u, a)$: the targeting function, which quantifies the probability that user $u$ clicks on advertisement $a$.

$v(u, a)$: a function with a binary output indicating whether or not user $u$ views advertisement $a$.

$c(u, a)$: a function with a binary output indicating whether or not user $u$ clicked on advertisement $a$.

Among the different KPIs, the one advertisers use most often to maximize their revenue is the CTR.

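Denoting by $U$ a set of users, for any advertisement $a$ the reach function $R(U, a)$ and the CTR are defined as follows, where $v$ and $c$ are the view and click indicators:

$$R(U, a) = \sum_{u \in U} v(u, a)$$

$$\mathrm{CTR}(U, a) = \frac{\sum_{u \in U} c(u, a)}{\sum_{u \in U} v(u, a)}$$

Note that a given threshold $\tau$ on the targeting function $f$ induces a subset of users: $U_\tau = \{u \in U : f(u, a) \ge \tau\}$.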
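The following minimal sketch shows how reach, CTR, and the thresholded user subset could be computed, assuming reach counts the users who viewed the ad and CTR is clicks divided by views. The user identifiers, indicator values, and scores are made up for illustration.

```python
def reach(users, viewed):
    """Number of users in `users` who viewed the ad."""
    return sum(viewed[u] for u in users)


def ctr(users, viewed, clicked):
    """Clicks divided by views over `users`; 0.0 when nobody viewed the ad."""
    views = reach(users, viewed)
    clicks = sum(clicked[u] for u in users)
    return clicks / views if views else 0.0


def targeted_subset(users, score, threshold):
    """Users whose targeting score is at least `threshold`."""
    return [u for u in users if score[u] >= threshold]


# Toy data for a single advertisement (all values are hypothetical).
users = ["u1", "u2", "u3", "u4"]
viewed = {"u1": 1, "u2": 1, "u3": 1, "u4": 0}
clicked = {"u1": 1, "u2": 0, "u3": 0, "u4": 0}
score = {"u1": 0.9, "u2": 0.2, "u3": 0.6, "u4": 0.7}

all_reach = reach(users, viewed)
all_ctr = ctr(users, viewed, clicked)
subset = targeted_subset(users, score, 0.5)
subset_reach = reach(subset, viewed)
subset_ctr = ctr(subset, viewed, clicked)
```

In this toy data, restricting the audience to users scoring at least 0.5 raises the CTR while lowering the reach, which is the trade-off the threshold controls.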

Targeting strategies [1]

In online advertising, there are two main approaches to user targeting: one based on the context of the web pages, and one based on the users’ behavior. In contextual targeting, advertisers place their ads on web pages whose content is highly correlated with their advertisements. The underlying assumption is that users visit web pages related to their interests.

Behavioral targeting (BT), in contrast, targets users mainly on the basis of their historical data. It is performed by building a predictive model that uses historical data as a training set, where the labels are binary variables indicating whether or not the user clicked on the ad.
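To make the training-set construction concrete, here is a small sketch that pairs each ad impression with the user's historical features and a binary click label. The log format, identifiers, and feature values are hypothetical and only serve as an illustration.

```python
# Hypothetical impression log: (user_id, ad_id, clicked).
impressions = [
    ("u1", "ad7", True),
    ("u2", "ad7", False),
    ("u1", "ad9", False),
]

# Hypothetical historical feature vectors, one per user.
history = {
    "u1": [1.0, 0.0, 3.0],
    "u2": [0.0, 1.0, 0.0],
}


def build_training_set(impressions, history, ad_id):
    """Return (X, y) for one ad: feature vectors and 0/1 click labels."""
    X, y = [], []
    for user_id, shown_ad, clicked in impressions:
        # Keep only impressions of the requested ad for users with known history.
        if shown_ad != ad_id or user_id not in history:
            continue
        X.append(history[user_id])
        y.append(1 if clicked else 0)
    return X, y


X, y = build_training_set(impressions, history, "ad7")
```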

Logistic regression based BT

The objective is to model the targeting function $f$. Several types of supervised learning systems can be used for this. Since a click is a binary outcome, we use logistic regression, which also scales well to large data sets. We define the targeting function by $f(u, a) = P(y = 1 \mid x)$, which is the probability that user $u$ clicks on the ad (the binary value $y = 1$) given his or her historical information (query features), represented by the feature vector $x$ of the user. The idea is to use a function whose output values lie in the range $[0, 1]$. One function with this property is the inverse logit function [2]:

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$

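In practice, a more general form of the inverse logit function, parameterized by a vector $\theta$, is used to model the targeting function:

$$f_\theta(x) = P(y = 1 \mid x; \theta) = \frac{1}{1 + e^{-\langle \theta, x \rangle}}$$

The symbol $\langle \cdot, \cdot \rangle$ denotes the inner product and the components of $\theta$ are the logistic regression parameters.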
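The parameterized inverse logit can be sketched in a few lines of Python. The function names and the parameter and feature values below are assumptions chosen for illustration; the two-branch form of the inverse logit is a common way to avoid overflow for large negative inputs.

```python
import math


def inverse_logit(z):
    """Numerically stable evaluation of 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def targeting_function(theta, x):
    """Click probability: inverse logit of the inner product <theta, x>."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return inverse_logit(z)


# Hypothetical parameters and feature vector.
theta = [0.5, -1.0, 2.0]
x = [1.0, 2.0, 0.5]  # inner product: 0.5 - 2.0 + 1.0 = -0.5
p = targeting_function(theta, x)
```

Whatever the parameters, the output stays between 0 and 1, so it can be read as a probability.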

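Since $y$ is binary, its conditional distribution given $x$ is Bernoulli with parameter $f_\theta(x) = P(y = 1 \mid x; \theta)$, hence

$$P(y \mid x; \theta) = f_\theta(x)^{y} \, \bigl(1 - f_\theta(x)\bigr)^{1 - y}$$

The objective is then to find the parameters $\theta$ that maximize the likelihood function $P(y \mid x; \theta)$, usually denoted $L(\theta)$.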

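Let us assume that the training set consists of $n$ observations $(x_i, y_i)$, $i = 1, \dots, n$, whose labels $y_i$ are independent given the features. Hence, $L(\theta)$ becomes:

$$L(\theta) = \prod_{i=1}^{n} P(y_i \mid x_i; \theta) = \prod_{i=1}^{n} f_\theta(x_i)^{y_i} \, \bigl(1 - f_\theta(x_i)\bigr)^{1 - y_i}$$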

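Finally, the problem reduces to solving the Maximum Likelihood Estimation problem, which is usually stated on the logarithm of the likelihood:

$$\hat{\theta} = \arg\max_{\theta} L(\theta) = \arg\max_{\theta} \sum_{i=1}^{n} \Bigl[ y_i \log f_\theta(x_i) + (1 - y_i) \log\bigl(1 - f_\theta(x_i)\bigr) \Bigr]$$

The above optimization problem can be solved by applying the gradient descent algorithm to the negative log-likelihood, or by using its stochastic variant [3].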
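For logistic regression, the gradient of the log-likelihood with respect to component $\theta_j$ is $\sum_i (y_i - p_i)\, x_{ij}$, where $p_i$ is the predicted click probability for observation $i$. The sketch below uses this in a plain batch update, which is equivalent to gradient descent on the negative log-likelihood. The learning rate, iteration count, and toy data are assumptions chosen for illustration, not values from the article.

```python
import math


def sigmoid(z):
    """Numerically stable inverse logit."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(theta, x):
    """Predicted click probability for feature vector x."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))


def log_likelihood(theta, X, y):
    """Sum over observations of y*log(p) + (1 - y)*log(1 - p)."""
    eps = 1e-12  # clamps p away from 0 and 1 to avoid log(0)
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict(theta, xi), eps), 1.0 - eps)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total


def fit(X, y, learning_rate=0.1, n_iter=500):
    """Batch gradient descent on the mean negative log-likelihood."""
    theta = [0.0] * len(X[0])
    n = len(X)
    for _ in range(n_iter):
        grad = [0.0] * len(theta)
        for xi, yi in zip(X, y):
            error = yi - predict(theta, xi)
            for j, xij in enumerate(xi):
                grad[j] += error * xij  # gradient of the log-likelihood
        # Moving along +grad of the log-likelihood is the same step as
        # moving along -grad of the negative log-likelihood.
        theta = [t + learning_rate * g / n for t, g in zip(theta, grad)]
    return theta


# Toy data: the first component is a constant bias term, the second a
# made-up activity score. The labels are deliberately not separable.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0], [1.0, 5.0]]
y = [0, 0, 1, 0, 1, 1]
theta = fit(X, y)
```

On large click logs, the inner loop over all observations becomes the bottleneck, which is the motivation for the stochastic variant discussed in [3]: it updates the parameters from one observation (or a small batch) at a time.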

References

[1] Borko Furht. Handbook of Social Network Technologies and Applications. Springer, 2010.

[2] J.S. Cramer. “The origins and development of the logit model.” 2004.

[3] Léon Bottou. Large-Scale Machine Learning with Stochastic Gradient Descent. Proceedings of COMPSTAT’2010.
