
Signal Processing Applied To Video Mining: Video Boundaries Detection

Scene change detection aims to automatically identify the scene changes in a video. Assuming that a scene is defined by its audio and video signals, we present here scene change detection techniques based on each of these signals. For the audio signal, the techniques rely on abrupt variations of its frequency- and time-based features. For the video signal, the usual algorithms rely on variations of the Sum of Absolute Differences (SAD).

Scene change of an audio signal

The features of an audio signal belonging to the same scene are assumed to be uniform, which is generally the case in practice. Consequently, the objective of scene detection algorithms is to find segments of the audio signal whose features differ substantially from those of adjacent segments.

The audio signal is first cut into clips of arbitrary size (e.g. 1 second). For each clip, a vector of frequency- and time-based features is extracted. Statistics are then computed on these features. Finally, the clip features and their statistics are concatenated to form the final feature vector of clip $i$:

$$F_i = \left[ f_1^i, f_2^i, \dots, f_K^i \right]$$

where $K$ is the total number of features and $f_k^i$ is the $k$-th feature of clip $i$.
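The clip-level feature extraction described above can be sketched as follows. The text does not name the features or the statistics, so the choices here (short-time energy, zero-crossing rate, spectral centroid, summarised by mean and standard deviation) are assumptions made purely for illustration, as are the function names and the frame length.

```python
import numpy as np

def clip_features(clip, sample_rate, frame_len=1024):
    """Feature vector of one clip: mean and std of three frame-level features.

    The three features and the two statistics are illustrative assumptions,
    not the set used in the original study.
    """
    n_frames = len(clip) // frame_len
    frames = clip[: n_frames * frame_len].reshape(n_frames, frame_len)

    # Time-based features: short-time energy and zero-crossing rate.
    energy = np.mean(frames ** 2, axis=1)
    zcr = np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)

    # Frequency-based feature: spectral centroid in Hz.
    spectrum = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    centroid = (spectrum @ freqs) / (spectrum.sum(axis=1) + 1e-12)

    per_frame = np.stack([energy, zcr, centroid], axis=1)  # (n_frames, 3)

    # Concatenate the statistics into one vector (here 6 values per clip).
    return np.concatenate([per_frame.mean(axis=0), per_frame.std(axis=0)])

def feature_matrix(signal, sample_rate, clip_seconds=1.0):
    """Stack the feature vectors of all clips: one row per clip."""
    clip_len = int(clip_seconds * sample_rate)
    n_clips = len(signal) // clip_len
    return np.stack([
        clip_features(signal[i * clip_len:(i + 1) * clip_len], sample_rate)
        for i in range(n_clips)
    ])
```

Calling `feature_matrix(signal, sample_rate)` on a mono signal stored as a 1-D NumPy array returns one row per 1-second clip; any trailing samples that do not fill a whole clip are ignored in this sketch.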

Let us consider a signal $s$ split into $N$ clips, and let $F$ be its feature matrix. The rows and columns of $F$ correspond to the clips and to their feature values, respectively. A Principal Component Analysis (PCA) is applied to project the features onto a low-dimensional space whose dimensions are orthogonal. The PCA also retains the essential information carried by the features.
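A minimal sketch of this projection step, using a singular value decomposition from NumPy. Standardising the features before the PCA and keeping two components are assumptions; the text specifies neither, and the function name `pca_project` is hypothetical.

```python
import numpy as np

def pca_project(features, n_components=2):
    """Project a (clips x features) matrix onto its first principal components."""
    # Standardise each feature column (assumed preprocessing step).
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # leave constant features unscaled
    z = (features - mean) / std

    # Rows of vt are the orthogonal principal directions, sorted by variance.
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)

    # New coordinates of every clip in the reduced space.
    projected = z @ vt[:n_components].T
    return projected, explained[:n_components]
```

The returned `explained` ratios give a quick check on how much of the feature variance the retained dimensions keep, which is one way to choose `n_components`.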

Let $Y$ be the new feature matrix derived from the PCA. Each row $y_i$ of this matrix contains the new coordinates of clip $i$. An index change function $C(i)$, defined over a fixed-size sliding analysis window, is then designed. An optimal threshold $\theta$ is set empirically, such that a scene change is detected in clip $i$ when $C(i) > \theta$.
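The exact form of the index change function is not given in the text. The sketch below assumes one plausible definition: the Euclidean distance between the mean PCA coordinates of the clips just before and just after the clip under test. The window size, the threshold value, and both function names are hypothetical.

```python
import numpy as np

def index_change(projected, window=3):
    """Assumed index change function over a sliding analysis window.

    For clip i, compares the mean of the `window` clips before i with the
    mean of the `window` clips starting at i. Clips too close to either end
    of the signal keep a value of 0.
    """
    n_clips = len(projected)
    change = np.zeros(n_clips)
    for i in range(window, n_clips - window + 1):
        before = projected[i - window:i].mean(axis=0)
        after = projected[i:i + window].mean(axis=0)
        change[i] = np.linalg.norm(after - before)
    return change

def detect_changes(projected, threshold, window=3):
    """Indices of the clips whose index change exceeds the threshold."""
    return np.flatnonzero(index_change(projected, window) > threshold)
```

Because neighbouring windows overlap, the function also rises on the clips around a true boundary; the threshold therefore has to be tuned on representative data, in line with the empirical setting described above.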

Scene change of a video signal

The scene change detection algorithm for video signals presented in this study is based on the Sum of Absolute Differences (SAD), a metric that quantifies the difference between image blocks. The algorithm first splits each video into frames. The SAD of consecutive frames is then computed as follows:

$$\mathrm{SAD}(n) = \sum_{i=1}^{H} \sum_{j=1}^{W} \Big( \left| R_n(i,j) - R_{n-1}(i,j) \right| + \left| G_n(i,j) - G_{n-1}(i,j) \right| + \left| B_n(i,j) - B_{n-1}(i,j) \right| \Big)$$

•   $n$ is the frame index

•   $H$ and $W$ are, respectively, the height and width of frame $n$

•   $R_n(i,j)$, $G_n(i,j)$ and $B_n(i,j)$ are, respectively, the Red, Green and Blue components of the pixel whose coordinates are $i$ and $j$ in frame $n$.

Finally, we compare the derivative of $\mathrm{SAD}(n)$, denoted $\mathrm{SAD}'(n)$, to a threshold $T$ to detect a scene change.
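The video-based detection can be sketched with NumPy as follows. The sketch assumes the frames are already decoded into an array of shape (frames, height, width, 3) holding 8-bit RGB values, approximates the derivative by a first-order difference, and flags only sharp rises of the SAD. These choices, the threshold value, and the function names are assumptions made for illustration.

```python
import numpy as np

def sad_series(frames):
    """SAD between each frame and the previous one, summed over R, G and B.

    Element k of the result compares frame k + 1 with frame k.
    """
    frames = frames.astype(np.int64)  # avoid uint8 wrap-around on subtraction
    return np.abs(frames[1:] - frames[:-1]).sum(axis=(1, 2, 3))

def detect_cuts(frames, threshold):
    """Frame indices at which the SAD rises by more than the threshold."""
    sad = sad_series(frames)
    derivative = np.diff(sad)  # first-order difference of the SAD series
    # derivative[k] = sad[k + 1] - sad[k], and sad[k + 1] compares
    # frame k + 2 with frame k + 1, hence the offset of 2.
    return np.flatnonzero(derivative > threshold) + 2
```

For example, a clip made of five black frames followed by five white frames yields a single detection at frame index 5, the first frame of the new scene.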
