Scene change detection aims to automatically identify scene changes in a video. Assuming that a scene is defined by its audio and video signals, we present here scene change detection techniques based on each of these signals. For the audio signal, the techniques rely on abrupt variations of its frequency- and time-based features. For the video signal, the usual algorithms rely on variations of the Sum of Absolute Differences (SAD).
Features of an audio signal belonging to the same scene are assumed to be uniform, which is generally the case in practice. Consequently, the objective of scene detection algorithms is to find segments of the audio signal whose features differ substantially from those of adjacent segments.
The audio signal is first cut into segments (clips) of an arbitrary size (e.g. 1 second). For each clip, a vector of frequency- and time-based features extracted from that clip is built. Then, different statistics are computed on these features. Finally, the concatenation of the clip features and their statistics forms the final feature vector, defined as follows:
$$F_k = \left[ f_{k,1}, f_{k,2}, \dots, f_{k,N} \right]$$
where $N$ is the total number of features and $F_k$ is the feature vector for clip $k$.
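The construction of the per-clip feature vectors can be sketched as follows. This is a minimal illustration, not the method of the text: the specific features (short-time energy, zero-crossing rate, spectral centroid), the statistics (mean and standard deviation over sub-frames), and all function names are assumptions chosen for the example, since the text does not name them.

```python
import numpy as np

def split_into_clips(signal, sample_rate, clip_seconds=1.0):
    """Cut a 1-D signal into non-overlapping clips, dropping the incomplete tail."""
    clip_len = int(round(sample_rate * clip_seconds))
    n_clips = len(signal) // clip_len
    return signal[: n_clips * clip_len].reshape(n_clips, clip_len)

def basic_features(samples, sample_rate):
    """Assumed example features: energy and zero-crossing rate (time-based),
    spectral centroid (frequency-based)."""
    energy = np.mean(samples ** 2)
    zcr = np.mean(np.abs(np.diff(np.signbit(samples).astype(int))))
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    total = spectrum.sum()
    centroid = (freqs * spectrum).sum() / total if total > 0 else 0.0
    return np.array([energy, zcr, centroid])

def clip_feature_vector(clip, sample_rate, n_subframes=10):
    """Concatenate clip-level features with statistics over sub-frames."""
    clip_feats = basic_features(clip, sample_rate)
    sub = np.array([basic_features(f, sample_rate)
                    for f in np.array_split(clip, n_subframes)])
    # Assumed statistics: mean and standard deviation of each feature.
    return np.concatenate([clip_feats, sub.mean(axis=0), sub.std(axis=0)])

def feature_matrix(signal, sample_rate, clip_seconds=1.0):
    """One row per clip, one column per feature."""
    clips = split_into_clips(np.asarray(signal, dtype=float),
                             sample_rate, clip_seconds)
    return np.vstack([clip_feature_vector(c, sample_rate) for c in clips])
```

For a 3.5-second signal and 1-second clips, `feature_matrix` returns a matrix with 3 rows (the incomplete last clip is dropped) and 9 columns.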
Let us consider a signal split into $K$ clips. Let $X$ be its feature matrix. The rows and columns of $X$ correspond respectively to the clips and their feature values. A Principal Component Analysis (PCA) is applied in order to project $X$ onto a low-dimensional space whose dimensions are orthogonal. The PCA moreover retains the essential information provided by the features.
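The projection step can be sketched with a PCA computed from the singular value decomposition. The standardization of the features before the decomposition, the default of two components, and the name `pca_project` are assumptions made for this example; the text does not specify them.

```python
import numpy as np

def pca_project(features, n_components=2):
    """Project a (clips x features) matrix onto its first principal components.

    Returns the projected coordinates and the fraction of variance
    explained by each retained component.
    """
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant features
    # Assumption: features are standardized because they live on different scales.
    z = (features - mean) / std
    # Rows of vt are the orthogonal principal directions, sorted by variance.
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    projected = z @ vt[:n_components].T
    explained = s ** 2 / np.sum(s ** 2)
    return projected, explained[:n_components]
```

The explained-variance ratios give a simple way to check how much of the original information the retained dimensions keep.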
Let $Y$ be the new feature matrix derived from the PCA. Each row $y_k$ of this matrix contains the new coordinates of clip $k$. An index change function $I(k)$, defined over a fixed-size sliding analysis window, is then designed. An optimal threshold $\theta$ is then set empirically, such that a scene change is detected in clip $k$ when $I(k) > \theta$.
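Since the text does not give the form of the index change function, the sketch below uses one plausible choice, stated here as an assumption: the Euclidean distance between the mean PCA coordinates of the window just before a clip and those of the window starting at that clip. The names `change_index` and `detect_changes`, the window size, and the threshold value are all hypothetical.

```python
import numpy as np

def change_index(projected, window=3):
    """Assumed index: distance between the mean coordinates of the `window`
    clips before clip k and the `window` clips starting at clip k."""
    projected = np.asarray(projected, dtype=float)
    n = len(projected)
    index = np.zeros(n)
    for k in range(window, n - window + 1):
        before = projected[k - window:k].mean(axis=0)
        after = projected[k:k + window].mean(axis=0)
        index[k] = np.linalg.norm(after - before)
    return index

def detect_changes(projected, window=3, threshold=1.0):
    """Return the clip indices whose change index exceeds the threshold."""
    return np.flatnonzero(change_index(projected, window) > threshold)

# Toy example: 10 clips from one scene, then 10 clips from another.
coords = np.vstack([np.zeros((10, 2)), np.full((10, 2), 5.0)])
print(detect_changes(coords, window=3, threshold=5.0))  # prints [10]
```

In practice the threshold would be tuned empirically, as the text indicates; a lower value also flags the clips adjacent to the boundary.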
The scene change detection algorithm for video signals presented in this study is based on the Sum of Absolute Differences (SAD). The SAD is a metric quantifying the difference between image blocks. The algorithm first splits each video into frames. Then, the SAD of consecutive frames in the video signal is computed as follows:
$$\mathrm{SAD}(n) = \sum_{i=1}^{H} \sum_{j=1}^{W} \Big( \left| R_n(i,j) - R_{n-1}(i,j) \right| + \left| G_n(i,j) - G_{n-1}(i,j) \right| + \left| B_n(i,j) - B_{n-1}(i,j) \right| \Big)$$
where:
• $n$ is the frame index
• $H$ and $W$ are respectively the height and width of the frame
• $R_n(i,j)$, $G_n(i,j)$ and $B_n(i,j)$ are, respectively, the Red, Green and Blue components of the pixel whose coordinates are $i$ and $j$ in frame $n$.
Finally, the derivative of the SAD, denoted $\mathrm{SAD}'(n)$, is compared to a threshold $T$: a scene change is detected when $\mathrm{SAD}'(n) > T$.
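The SAD-based detection can be sketched as follows, assuming the frames are already decoded into an array of shape (frames, height, width, 3). Approximating the derivative by the first difference of the SAD series is an assumption of this example, as are the function names and the threshold value.

```python
import numpy as np

def sad_series(frames):
    """SAD between each frame and the previous one, summed over pixels and
    the R, G, B channels. Entry i compares frame i + 1 with frame i."""
    # Cast to a signed type so that uint8 subtraction cannot wrap around.
    frames = np.asarray(frames, dtype=np.int64)
    return np.abs(frames[1:] - frames[:-1]).sum(axis=(1, 2, 3))

def detect_cuts(frames, threshold):
    """Return indices of frames where the SAD derivative exceeds the threshold.

    Assumption: the derivative is approximated by the first difference
    SAD(n) - SAD(n - 1), which is defined from frame 2 onward.
    """
    derivative = np.diff(sad_series(frames))
    return np.flatnonzero(derivative > threshold) + 2

# Toy example: five black 4x4 frames followed by five white ones.
black = np.zeros((5, 4, 4, 3), dtype=np.uint8)
white = np.full((5, 4, 4, 3), 255, dtype=np.uint8)
video = np.concatenate([black, white])
print(detect_cuts(video, threshold=1000))  # prints [5]
```

Frame 5 is the first frame of the second scene, so the reported index marks the first frame after the cut.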