Online advertising is becoming increasingly competitive. To maximize their revenue, advertisers optimize KPIs (Key Performance Indicators) such as the CTR (Click-Through Rate) by using various targeting techniques. On the one hand, targeting can be based on the contextual information of the web page currently visited by the user; on the other hand, it can be performed by profiling the user’s historical set of queries. The latter technique is known as Behavioral Targeting (BT). This approach is used to predict how responsive a user will be to advertising. We present here a BT system based on logistic regression. We begin, however, by presenting some technical concepts used in digital advertising.

At any time $t$, a user $u$ can be represented by a feature vector $x_u(t)$. The features can be one or a combination of the following types of information: demographic (age, gender), geographic (state, zip code, city), and web browsing (history of web pages visited).

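Given an advertisement $a$, we define the following functions of a user $u$:

$f_a(u)$: the targeting function, which quantifies the probability that user $u$ clicks on advertisement $a$.

$v_a(u) \in \{0, 1\}$: a function with a binary output indicating whether or not user $u$ viewed advertisement $a$.

$c_a(u) \in \{0, 1\}$: a function with a binary output indicating whether or not user $u$ clicked on advertisement $a$.

Among the different types of KPI, the one most commonly used by advertisers to maximize their revenue is the CTR.

Denoting by $U$ a set of users, for any advertisement $a$ the reach function and the CTR are defined as follows:

$$\mathrm{Reach}(a, U) = \sum_{u \in U} v_a(u), \qquad \mathrm{CTR}(a, U) = \frac{\sum_{u \in U} c_a(u)}{\sum_{u \in U} v_a(u)}$$

Note that a given threshold $\tau$ on the targeting function induces a subset of users:

$$U_\tau = \{\, u \in U : f_a(u) \geq \tau \,\}$$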
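The reach, the CTR, and the threshold-based user subset described above can be illustrated with a minimal Python sketch. It assumes that reach counts the users who viewed the ad and that the CTR is the number of clicks divided by the number of views. The `UserRecord` class, the function names, and the sample values are hypothetical and are introduced here only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserRecord:
    """Hypothetical per-user record for a single advertisement."""
    user_id: str
    score: float    # value of the targeting function (estimated click probability)
    viewed: bool    # True if the user viewed the ad
    clicked: bool   # True if the user clicked on the ad


def reach(users):
    """Number of users in the set who viewed the ad."""
    return sum(1 for u in users if u.viewed)


def ctr(users):
    """Clicks divided by views; 0.0 when nobody viewed the ad."""
    views = reach(users)
    clicks = sum(1 for u in users if u.clicked)
    return clicks / views if views else 0.0


def select_users(users, threshold):
    """Subset of users whose targeting score is at least `threshold`."""
    return [u for u in users if u.score >= threshold]


# Invented sample data, for illustration only.
users = [
    UserRecord("u1", 0.90, True, True),
    UserRecord("u2", 0.70, True, False),
    UserRecord("u3", 0.20, True, False),
    UserRecord("u4", 0.10, False, False),
]
targeted = select_users(users, threshold=0.5)

print("all users:", reach(users), ctr(users))
print("targeted :", reach(targeted), ctr(targeted))
```

On this toy data, restricting the audience to users above the threshold lowers the reach but raises the CTR, which is the trade-off the threshold is meant to control.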

In the field of online advertising, there are two main approaches to user targeting: one based on the context of the web pages, and one based on the users’ behavior. In contextual targeting, advertisers place their ads on web pages whose content is highly correlated with those ads, on the assumption that users visit web pages related to their interests.

Behavioral targeting (BT) aims at targeting users mainly on the basis of their historical data. It is performed by building a predictive model that uses this historical data as a training set, where the labels are binary variables indicating whether or not the user clicked on the ad.

The objective is to model the targeting function $f_a$. To do this, several types of supervised learning systems can be used. Here, since a click is a binary outcome, we use logistic regression, which also scales well to large datasets. We define the targeting function by:

$$f_a(x) = P(y = 1 \mid x)$$

which is the probability that the user clicks on the ad (the binary value $y = 1$) given their historical information (query features) represented by the vector $x$. The idea is to use a function whose output values lie in the range $[0, 1]$. One function with this property is the inverse logit function [2], which maps any real number $z$ into $(0, 1)$:

$$\operatorname{logit}^{-1}(z) = \frac{e^{z}}{1 + e^{z}} = \frac{1}{1 + e^{-z}}$$

Nevertheless, a more general form of the inverse logit function is used to model the targeting function:

$$f_a(x) = P(y = 1 \mid x; \theta) = \frac{1}{1 + e^{-\langle \theta, x \rangle}}$$

The symbol $\langle \cdot, \cdot \rangle$ denotes the inner product, and $\theta$ is the vector of logistic regression parameters.
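As an illustration of the targeting function described above, here is a minimal Python sketch of the inverse logit applied to the inner product of the parameters and the feature vector. The function names, the parameter values, and the feature values are hypothetical; the leading `1.0` in the feature vector is an assumed convention for carrying a bias term.

```python
import math


def inverse_logit(z):
    """Inverse logit 1 / (1 + exp(-z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def targeting_score(theta, x):
    """Estimated click probability: inverse logit of the inner product <theta, x>."""
    if len(theta) != len(x):
        raise ValueError("theta and x must have the same length")
    return inverse_logit(sum(t * xi for t, xi in zip(theta, x)))


theta = [-1.0, 2.0, 0.5]  # hypothetical parameters; the first acts as a bias
x = [1.0, 1.0, 0.0]       # hypothetical features; leading 1.0 multiplies the bias

print(targeting_score(theta, x))
```

Splitting the computation on the sign of `z` keeps `math.exp` from being called with a large positive argument, so the score stays well defined even for extreme inner products.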

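Since $y$ is binary, $P(y = 0 \mid x; \theta) = 1 - f_a(x)$, where $f_a(x) = P(y = 1 \mid x; \theta)$ is the targeting function; hence

$$P(y \mid x; \theta) = f_a(x)^{y} \, \bigl(1 - f_a(x)\bigr)^{1 - y}$$

Thereafter, the objective is to find the parameters $\theta$ that maximize the likelihood function, usually denoted $L(\theta)$, of the observed labels $y = (y_1, \dots, y_n)$ given the corresponding feature vectors $x_1, \dots, x_n$:

$$L(\theta) = P(y_1, \dots, y_n \mid x_1, \dots, x_n; \theta)$$

Let us assume that the label vector $y$ is an *n*-dimensional vector whose components are independent given the features. Hence, $L(\theta)$ becomes:

$$L(\theta) = \prod_{i=1}^{n} P(y_i \mid x_i; \theta) = \prod_{i=1}^{n} f_a(x_i)^{y_i} \, \bigl(1 - f_a(x_i)\bigr)^{1 - y_i}$$

Finally, the problem leads to solving the Maximum Likelihood Estimation problem, which is usually stated on the logarithm of the likelihood:

$$\hat{\theta} = \arg\max_{\theta} \log L(\theta) = \arg\max_{\theta} \sum_{i=1}^{n} \Bigl[ y_i \log f_a(x_i) + (1 - y_i) \log\bigl(1 - f_a(x_i)\bigr) \Bigr]$$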

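The above optimization problem can be solved by applying the gradient descent algorithm to the negative log-likelihood; at large scale, the stochastic variant of the algorithm is the usual choice [3].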
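To make the optimization step concrete, here is a minimal sketch of logistic regression fitted by gradient descent in plain Python. It relies on the fact that, for one training example with features $x$, label $y$, and predicted probability $p$, the gradient of the negative log-likelihood with respect to $\theta$ is $(p - y)\,x$. The sketch uses full-batch gradient descent rather than the stochastic variant discussed in [3]; the function names, the learning rate, the iteration count, and the toy data are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Inverse logit, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict(theta, x):
    """Estimated click probability for feature vector x."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))


def negative_log_likelihood(theta, xs, ys, eps=1e-12):
    """Negative log-likelihood of binary labels ys given features xs."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = min(max(predict(theta, x), eps), 1.0 - eps)  # keep log() finite
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


def fit_logistic(xs, ys, learning_rate=0.1, n_iters=5000):
    """Batch gradient descent on the average negative log-likelihood."""
    n, d = len(xs), len(xs[0])
    theta = [0.0] * d
    for _ in range(n_iters):
        grad = [0.0] * d
        for x, y in zip(xs, ys):
            err = predict(theta, x) - y  # derivative of the loss w.r.t. <theta, x>
            for j in range(d):
                grad[j] += err * x[j] / n
        theta = [t - learning_rate * g for t, g in zip(theta, grad)]
    return theta


# Invented toy data: [bias term, number of visits to pages related to the ad]
xs = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0], [1.0, 5.0]]
ys = [0, 0, 1, 0, 1, 1]  # 1 = clicked, 0 = did not click

theta = fit_logistic(xs, ys)
print("theta:", theta)
print("P(click | 5 visits):", predict(theta, [1.0, 5.0]))
```

The toy labels are deliberately not linearly separable, so the maximum likelihood estimate is finite and the iteration settles instead of pushing the parameters toward infinity.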

[1] Borko Furht. Handbook of Social Network Technologies and Applications. Springer, 2010.

[2] J.S. Cramer. The Origins and Development of the Logit Model. 2004.

[3] Léon Bottou. Large-Scale Machine Learning with Stochastic Gradient Descent. Proceedings of COMPSTAT’2010.
