Detection of Abnormal Events via Optical Flow Feature Analysis

In this paper, a novel algorithm is proposed to detect abnormal events in video streams. The algorithm is based on the histogram of the optical flow orientation descriptor and a classification method. The details of the histogram of the optical flow orientation descriptor are illustrated for describing the movement information of the global video frame or the foreground frame. By combining one-class support vector machine and kernel principal component analysis methods, the abnormal events in the current frame can be detected after a learning period characterizing normal behaviors. The different abnormal detection results are analyzed and explained. The proposed detection method is tested on benchmark datasets, and the experimental results show the effectiveness of the algorithm.


Introduction
With the development of human society, the security challenges in public scenes have gradually increased. In the last several decades, the cost of cameras and network communication has been significantly reduced. Furthermore, video camera sensors are widely used in many areas of human life. However, traditional visual surveillance is labor-intensive work that requires non-stop human attention, and its efficiency is low. Thus, tackling visual surveillance problems automatically with video processing techniques plays a paramount role in computer vision research. The scientific challenges in this area include developing strategies to ensure public safety and detecting the abnormal behavior of an individual or a group.
Methods that model behavior with Bayesian networks were introduced in [1][2][3][4]. In [5], delta-dual hierarchical Dirichlet processes (dDHDP) were used to detect abnormal activity patterns in the field of visual features. By analyzing the statistical property, abnormal events were detected. Successful results were obtained on several scenes, but the prediction was based on a complicated probability model. Some researchers took notice of spatio-temporal features. In [6], the movement was represented by a co-occurrence matrix and modeled by a Markov random field model. Abnormal activities, which were the significant changes in the scene, were detected. The work was similar to the foreground subtraction method in a non-stable background scene.
Low-level motion features have also gained attention for detecting abnormal events. In [7], an algorithm was proposed to detect the action of a single individual, such as hand-waving, boxing, etc. In [8], bionics technology was applied to model the superior colliculus (SC) to discover abnormalities in the panoramic image. These methods were based on partial information, such as that contained in small observation windows of the image. In other words, they did not employ global information within the frame.
Based on feature representation and pattern classification, an abnormal detection method is proposed in this paper. The datasets used in our work are Performance Evaluation of Tracking and Surveillance (PETS2009) [9] and the University of Minnesota (UMN) dataset [10], as shown in Figure 1. A normal scene means that the individuals are promenading in different directions. In the abnormal scenes of the PETS dataset, people are moving (walking or running) in the same direction, while the UMN abnormal scene means that the individuals are running. The proposed algorithm is composed of two parts. Firstly, the visual features are extracted without object tracking. Secondly, abnormal events are detected by classifying the extracted features. Specifically, the one-class support vector machine (SVM) and kernel principal component analysis (KPCA) are used in this paper. By learning the normal behaviors, the classifiers detect the abnormal ones. The rest of the paper is organized as follows. In Section 2, the optical flow-based feature is proposed. In Section 3, a one-class SVM classification method and a kernel PCA for novelty detection method are presented, and the corresponding abnormal detection framework is described. In Section 4, the experimental results and the analysis are given. Finally, the paper is concluded with future work in Section 5.

Feature Selection for Abnormal Detection
Because optical flow can represent the movement information of actions, we choose the Horn-Schunck (HS) [11] method to compute it. The HS method formulates the optical flow as a global energy functional for the gray image sequence:
$$E = \iint \left[ (I_x u + I_y v + I_t)^2 + \alpha^2 \left( \|\nabla u\|^2 + \|\nabla v\|^2 \right) \right] \mathrm{d}x \, \mathrm{d}y$$
where $I_x$, $I_y$ and $I_t$ are the derivatives of the image intensity values along the horizontal direction $x$, the vertical direction $y$ and the time dimension $t$, respectively; $u$ and $v$ are the horizontal and vertical components of the optical flow; and $\alpha$ is a regularization constant. In [12], abnormal global frame detection was proposed, and the frame covariance matrix descriptor was constructed based on the optical flow.
In this paper, we analyze the details of the histogram of the optical flow orientation (HOFO) with different parameters. The optical flow orientation features of an image are extracted at a fixed resolution and then gathered into a high-dimensional feature vector. A 2 × 2 rectangular cell HOFO descriptor of the original image or the foreground image is shown in Figure 2. The orientation is computed from the horizontal and vertical optical flow components by a trigonometric function and is voted into $n$ bins over 0°-360° (denoted as the signed angle) or 0°-180° (denoted as the unsigned angle). Nine bins are chosen in this paper. The optical flow magnitude of a pixel is used as the weight coefficient in the voting process. A block contains $h_b \times w_b$ cells; it is set as 2 × 2 in this paper to present the spatial information of the HOFO. The HOFO dimension of one block is therefore 36 (9 × 2 × 2). The HOFO feature describes the global movement information of one frame (or foreground frame) by gathering the histograms of the optical flow orientation in the sub-frames (blocks). Because the movement in an abnormal frame usually has a larger optical flow magnitude and more directions, the elements of the HOFO vector of an abnormal frame are generally larger than those of a normal one.
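The voting procedure for one block can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `hofo_block` is hypothetical, the block is assumed to be split evenly into 2 × 2 cells, and the flow components `u` and `v` are assumed to be given as arrays of the same shape.

```python
import numpy as np

def hofo_block(u, v, n_bins=9, signed=True, cells=(2, 2)):
    """HOFO of one block: a magnitude-weighted orientation histogram per cell.

    u, v : 2-D arrays with the horizontal and vertical optical flow.
    Returns a vector of length n_bins * cells[0] * cells[1] (36 by default).
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    span = 360.0 if signed else 180.0              # signed or unsigned angle
    angle = np.degrees(np.arctan2(v, u)) % span    # orientation in [0, span)
    magnitude = np.hypot(u, v)                     # weight of each pixel's vote
    bins = np.floor(angle / (span / n_bins)).astype(int) % n_bins

    # Assumption: the block is divided evenly into cells[0] x cells[1] cells.
    rows = np.array_split(np.arange(u.shape[0]), cells[0])
    cols = np.array_split(np.arange(u.shape[1]), cells[1])
    histograms = []
    for r in rows:
        for c in cols:
            cell_bins = bins[np.ix_(r, c)].ravel()
            cell_weights = magnitude[np.ix_(r, c)].ravel()
            histograms.append(
                np.bincount(cell_bins, weights=cell_weights, minlength=n_bins)
            )
    return np.concatenate(histograms)
```

For a frame whose flow points uniformly to the right, every vote falls into the first bin of each cell, so only 4 of the 36 entries are non-zero. A frame-level descriptor would concatenate such block vectors; how the blocks tile the frame is left open here.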
Four normalization schemes are chosen when HOFO is calculated. In these schemes, $v$ is the HOFO descriptor vector before normalization, and $\varepsilon$ is a small constant that keeps the normalization well defined when the norm of $v$ is close to zero.
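As an illustration of the normalization step, the sketch below assumes the formulas commonly used for histogram-of-oriented-gradients descriptors (L1-norm, L2-norm and L2-Hys, plus no normalization). These definitions, the function name `normalize_hofo` and the reading of "L2hys-0.4" as an L2-Hys scheme with a clipping threshold of 0.4 are assumptions; they are not confirmed by the text.

```python
import numpy as np

def normalize_hofo(v, scheme="l2hys", eps=1e-6, clip=0.4):
    """Normalize a HOFO vector v (assumed HOG-style schemes, see lead-in)."""
    v = np.asarray(v, dtype=float)
    if scheme == "none":
        return v.copy()
    if scheme == "l1":
        # v / (||v||_1 + eps)
        return v / (np.abs(v).sum() + eps)
    if scheme == "l2":
        # v / sqrt(||v||_2^2 + eps^2)
        return v / np.sqrt((v ** 2).sum() + eps ** 2)
    if scheme == "l2hys":
        # L2-normalize, clip large entries, then L2-normalize again.
        w = v / np.sqrt((v ** 2).sum() + eps ** 2)
        w = np.minimum(w, clip)
        return w / np.sqrt((w ** 2).sum() + eps ** 2)
    raise ValueError(f"unknown scheme: {scheme}")
```

The constant `eps` only matters for near-empty histograms, for example a frame with almost no motion, where it prevents a division by zero.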

Abnormal Detection Method Based on Optical Flow Analysis
The objective of the abnormal event detection problem is to find the samples that are different from the training ones. Thus, two classification methods, the one-class support vector machine (one-class SVM) and kernel principal component analysis (KPCA) for novelty detection, suit this application. In this section, we first introduce these two methods and then propose the abnormal detection algorithm for video sequences.

One-Class Support Vector Machine
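Vapnik and Lerner initially proposed the support vector machine for classification or regression based on statistical learning theory [13]. Later, by adopting kernel methods, the support vector machine was extended to deal with non-linear problems [14][15][16]. The non-linear one-class support vector machine is one development of the basic SVM theory; it finds an appropriate region containing most of the data drawn from an unknown probability distribution. The problem of the non-linear one-class support vector machine can be presented as [17,18]:
$$\min_{w, \xi, \rho} \; \frac{1}{2} \|w\|^2 + \frac{1}{\nu n} \sum_{i=1}^{n} \xi_i - \rho \quad \text{subject to} \quad \langle w, \Phi(x_i) \rangle \geq \rho - \xi_i, \quad \xi_i \geq 0$$
where $x_i \in \mathcal{X}$, $i \in [1 \ldots n]$, are the $n$ training samples in the original data space $\mathcal{X}$, $w$ and $\rho$ are the normal vector and the offset of the separating hyperplane, and $\xi_i$ is the slack variable for penalizing the outliers. The hyperparameter $\nu \in (0, 1]$ weights the slack variables and thereby controls the fraction of training samples that may be treated as outliers. $\Phi$ is a map from the non-empty set of the original input data $\mathcal{X}$ to a feature space $\mathcal{H}$. For computing dot products in $\mathcal{H}$, the kernel function is defined as
$$\kappa(x, x') = \langle \Phi(x), \Phi(x') \rangle$$
The decision function is defined as:
$$f(x) = \operatorname{sgn}\left( \sum_{i=1}^{n} \alpha_i \kappa(x_i, x) - \rho \right)$$
where $x$ is a vector in the input data space $\mathcal{X}$, $\alpha_i$ are the Lagrange multipliers of the dual problem and $\kappa$ is the kernel function. The Gaussian kernel is used to deal with the non-linear problem in this paper.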
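To make the one-class SVM concrete, the following sketch trains it through the standard dual problem (minimize $\frac{1}{2}\alpha^{T} K \alpha$ subject to $0 \leq \alpha_i \leq 1/(\nu n)$ and $\sum_i \alpha_i = 1$) with a simple pairwise update, then evaluates the sign-based decision function. The solver, the function names and the kernel width `sigma` are assumptions made for illustration; they are not the solver used in the paper, and a library implementation would normally be preferred.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix exp(-||a - b||^2 / (2 sigma^2))."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def fit_one_class_svm(X, nu=0.1, sigma=1.0, tol=1e-6, max_iter=20000):
    """Solve the one-class SVM dual with pairwise (SMO-style) updates."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    upper = 1.0 / (nu * n)                 # box constraint on each alpha_i
    K = gaussian_kernel(X, X, sigma)
    alpha = np.full(n, 1.0 / n)            # feasible start: sums to one
    grad = K @ alpha                       # gradient of 0.5 * alpha^T K alpha
    for _ in range(max_iter):
        can_grow = np.flatnonzero(alpha < upper - 1e-12)
        can_shrink = np.flatnonzero(alpha > 1e-12)
        if can_grow.size == 0:
            break
        i = can_grow[np.argmin(grad[can_grow])]
        j = can_shrink[np.argmax(grad[can_shrink])]
        gap = grad[j] - grad[i]
        if gap < tol:                      # no violating pair left
            break
        curvature = max(K[i, i] + K[j, j] - 2.0 * K[i, j], 1e-12)
        step = min(gap / curvature, upper - alpha[i], alpha[j])
        alpha[i] += step                   # move weight from j to i,
        alpha[j] -= step                   # keeping sum(alpha) == 1
        grad += step * (K[:, i] - K[:, j])
    on_margin = (alpha > 1e-8) & (alpha < upper - 1e-8)
    support = on_margin if on_margin.any() else alpha > 1e-8
    rho = grad[support].mean()             # offset from margin support vectors
    return {"X": X, "alpha": alpha, "rho": rho, "sigma": sigma}

def predict(model, Z):
    """Return +1 for samples inside the learned region and -1 otherwise."""
    Z = np.atleast_2d(np.asarray(Z, dtype=float))
    k = gaussian_kernel(Z, model["X"], model["sigma"])
    score = k @ model["alpha"] - model["rho"]
    return np.where(score >= 0.0, 1, -1)
```

In the detection setting, `X` would hold the HOFO vectors of the normal training frames, and a prediction of -1 would flag a frame as abnormal.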

Kernel Principal Component Analysis
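Kernel principal component analysis [19,20] extends the standard PCA to non-linear data distributions. Before performing PCA, the $n$ data points $x_i \in \mathbb{R}^d$ are mapped to a higher-dimensional feature space $F$, where standard PCA is performed:
$$x_i \rightarrow \Phi(x_i)$$
In kernel PCA, an eigenvector $V$ of the covariance matrix in $F$ is a linear combination of the centered points $\tilde{\Phi}(x_i) = \Phi(x_i) - \frac{1}{n} \sum_{r=1}^{n} \Phi(x_r)$:
$$V = \sum_{i=1}^{n} \alpha_i \tilde{\Phi}(x_i)$$
where $\alpha_i$ is the $i$-th component of a vector $\alpha$. This vector is an eigenvector of the matrix
$$\tilde{K}_{ij} = \langle \tilde{\Phi}(x_i), \tilde{\Phi}(x_j) \rangle$$
For novelty detection [21], the reconstruction error $p(\tilde{\Phi})$ can be defined as:
$$p(\tilde{\Phi}) = \langle \tilde{\Phi}, \tilde{\Phi} \rangle - \sum_{l=1}^{m} f_l(x)^2$$
with:
$$f_l(x) = \langle \tilde{\Phi}(x), V^l \rangle$$
where $f_l(x)$ is the projection of $\tilde{\Phi}(x)$ on the eigenvector $V^l$, $m$ is the number of retained eigenvectors, and the index $l$ denotes the $l$-th eigenvector, with $l = 1$ for the eigenvector with the largest eigenvalue.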
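The reconstruction error can be computed entirely from kernel evaluations. The sketch below is one way to do this with a Gaussian kernel; the function names, the kernel width `sigma` and the number of retained components are illustrative assumptions, and the threshold applied to the error is left to the user.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix exp(-||a - b||^2 / (2 sigma^2))."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def fit_kpca(X, sigma=1.0, n_components=3):
    """Kernel PCA on normal training samples X (one sample per row)."""
    X = np.asarray(X, dtype=float)
    K = gaussian_kernel(X, X, sigma)
    row_mean = K.mean(axis=0)
    total_mean = K.mean()
    # Centered kernel matrix: dot products of the centered feature vectors.
    K_centered = K - row_mean[None, :] - row_mean[:, None] + total_mean
    eigvals, eigvecs = np.linalg.eigh(K_centered)       # ascending order
    order = np.argsort(eigvals)[::-1][:n_components]    # largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1e-10
    # Scale the coefficients so that each eigenvector V^l has unit length.
    coeffs = eigvecs[:, keep] / np.sqrt(eigvals[keep])
    return {"X": X, "sigma": sigma, "row_mean": row_mean,
            "total_mean": total_mean, "coeffs": coeffs}

def reconstruction_error(model, Z):
    """Squared distance of each row of Z to the principal subspace."""
    Z = np.atleast_2d(np.asarray(Z, dtype=float))
    k = gaussian_kernel(Z, model["X"], model["sigma"])
    k_mean = k.mean(axis=1)
    k_centered = (k - k_mean[:, None] - model["row_mean"][None, :]
                  + model["total_mean"])
    projections = k_centered @ model["coeffs"]           # f_l(z) for each l
    # <Phi~(z), Phi~(z)>, using kappa(z, z) = 1 for the Gaussian kernel.
    spherical = 1.0 - 2.0 * k_mean + model["total_mean"]
    return spherical - (projections ** 2).sum(axis=1)
```

A test sample would be declared abnormal when its error exceeds a threshold chosen from the errors of the training samples, for instance a high quantile of them.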

Abnormal Detection Algorithm Based on Optical Flow Feature Classification
By adopting the histogram of the optical flow orientation feature descriptor and these two novelty detection methods, the abnormal event detection method in video streams is summarized in Algorithm 1.
Algorithm 1 (PCA branch and final step):
(2) PCA method: compute the principal components by the KPCA method and measure the squared distance.
(2) PCA method: each incoming frame $H_{n,\ldots,q}$ is classified by KPCA.
5: The detection results are filtered by the state transition restriction.
Step 1: Compute the optical flow of each frame via the Horn-Schunck (HS) optical flow method in the gray scale.
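A compact version of the Horn-Schunck iteration is sketched below. It is an assumption-laden simplification rather than the exact scheme of [11]: spatial derivatives come from `np.gradient` on the mean of the two frames, the local flow average uses the four direct neighbors, and the values of `alpha` and `n_iter` are arbitrary.

```python
import numpy as np

def local_mean(f):
    """Average of the four direct neighbors, with replicated borders."""
    p = np.pad(f, 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]) / 4.0

def horn_schunck(frame1, frame2, alpha=10.0, n_iter=200):
    """Estimate the flow (u, v) between two gray-scale frames."""
    f1 = np.asarray(frame1, dtype=float)
    f2 = np.asarray(frame2, dtype=float)
    Iy, Ix = np.gradient((f1 + f2) / 2.0)   # derivatives along y (rows), x (cols)
    It = f2 - f1                            # temporal derivative
    u = np.zeros_like(f1)
    v = np.zeros_like(f1)
    denom = alpha ** 2 + Ix ** 2 + Iy ** 2
    for _ in range(n_iter):
        u_avg = local_mean(u)
        v_avg = local_mean(v)
        # Residual of the brightness constancy constraint at the averaged flow.
        common = (Ix * u_avg + Iy * v_avg + It) / denom
        u = u_avg - Ix * common
        v = v_avg - Iy * common
    return u, v
```

Larger values of `alpha` give smoother flow fields at the cost of blurring motion boundaries, so the constant should be tuned to the intensity range of the frames.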
Step 2: Calculate the histogram of the optical flow orientation (HOFO) of each frame. The sketch image for choosing the HOFO feature in the original image or in the foreground image is shown in Figure 2. If the HOFO descriptor is computed on the foreground image, the optical flow in the background is zero. Thus, the background area is not considered, and then, computing time is saved.
Step 3: The one-class support vector machine or kernel principal component analysis method is used to classify the feature samples of the incoming video frames. The flowchart of our method is shown in Figure 3. For the PCA method, the normal training feature samples for KPCA are mapped into a high-dimensional feature space. In this space, PCA extracts the principal components of the data distribution. Then, the squared distance of each testing sample to the corresponding principal subspace is measured for novelty detection [21].
Step 4: A normal or abnormal event usually lasts for several consecutive frames, so the video clip holds one state over that period. Thus, we use a state transition restriction method with a preset threshold N to filter out short fluctuations. If the number of predicted abnormal frames after a normal video clip is larger than N, the state of the abnormal detection system is changed from "normal" to "abnormal".
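The state transition restriction of Step 4 can be sketched as a small causal filter. The function name `filter_states` is hypothetical, and two details are assumptions: the same threshold is applied in both directions (the text only describes the switch from "normal" to "abnormal"), and the contradicting frames are required to be consecutive.

```python
def filter_states(predictions, threshold, initial_state=0):
    """Suppress short fluctuations in per-frame predictions.

    predictions   : sequence of 0 (normal) or 1 (abnormal), one per frame.
    threshold     : N; the state switches only after more than N consecutive
                    frames that contradict the current state.
    initial_state : state assumed before the first frame.
    """
    state = initial_state
    run_length = 0               # consecutive frames contradicting the state
    filtered = []
    for prediction in predictions:
        if prediction != state:
            run_length += 1
            if run_length > threshold:
                state = prediction
                run_length = 0
        else:
            run_length = 0
        filtered.append(state)
    return filtered


raw = [0, 0, 1, 0, 0, 1, 1, 1, 1, 0]
print(filter_states(raw, threshold=2))   # [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
```

In this example the isolated abnormal prediction at the third frame is discarded, and the switch to "abnormal" happens only at the third frame of the longer abnormal run, so the filter trades a short detection delay for fewer false alarms.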

Experimental Results and Analyses
This section presents the experimental results and the analyses of the proposed abnormal detection method. The datasets PETS [9] and UMN [10] are used.

PETS Dataset
The detection accuracies on the PETS dataset are compared under different features and classification methods. The feature samples are visualized in Figure 5 by projecting them onto the three largest principal components. The training normal data (labeled as blue crosses) overlap with both the testing normal data (labeled as cyan diamonds) and the testing abnormal data (labeled as red rectangles). In other words, the training data are mixed with the test-abnormal data. One-class SVM has slack variables, which tune the number of acceptable outliers in the training data. This soft margin strategy allows the one-class SVM to obtain higher precision. The results adjusted by the state transition restriction are shown in Figure 6. As shown in the figure, the fluctuations between the "abnormal" and "normal" states are reduced. The detection results of the PETS scene are shown in Figure 7.

UMN Dataset
The HOFO descriptor represents not only the optical flow orientation, but also the optical flow magnitude. The results on the benchmark dataset UMN are shown in Figure 8. The HOFO descriptor can deal with the abnormal scene in which people are running in all directions. For the lawn scene, the detection accuracies under different conditions are shown in Figure 9. The one-class SVM and KPCA classification methods achieve high accuracy without the state transition restriction strategy. For the indoor scene and the plaza scene, the detection accuracies under different conditions are shown in Figures 10 and 11, respectively. The restriction of the state transition improves the accuracy. In summary, the KPCA method is generally better than one-class SVM for abnormal detection in these experiments. Furthermore, the data distributions need to be considered.
The performance summary on the UMN dataset compared with state-of-the-art methods is shown in Table 1. The results in the table are not post-processed by the state transition restriction strategy. Our method obtains high accuracy for all three scenes in the UMN dataset. Abbreviations in Table 1: NN, nearest neighbor; SRC, sparse reconstruction cost; STCOG, spatial-temporal co-occurrence Gaussian mixture models.

Conclusions
We propose an abnormal detection method based on analyzing the optical flow feature. The method has two components: computing the histogram of the optical flow orientation (HOFO) and applying the one-class support vector machine and kernel principal component analysis for classification. The HOFO feature is computed on the original frame or the foreground image. Moreover, the details of the parameters are analyzed. The algorithm has been tested on several video sequences, and the experimental results show the effectiveness of the algorithm. From the experimental results, we can see that the normalization schemes none and L2hys-0.4 generally give the best performance. The detection results under the signed angle and original image condition are broadly acceptable. In general, the KPCA novelty detection method is as good as one-class SVM, but under a certain distribution of the data, the one-class SVM can obtain more accurate performance.
Future work will aim at reducing the false alarms and training the samples online. Two solutions are under consideration: capturing more efficient features based on the optical flow or replacing the optical flow by other approaches that can represent the information of the events. Online learning is also needed: because of the large number of normal examples, it is hard to learn the training samples in one batch. Moreover, our method focuses on detecting global abnormal events, but detecting local abnormal events is also important. Improving the method to detect global and local abnormal events jointly is therefore also necessary.