This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

Abnormal event detection is an important problem in real-time video surveillance. In this paper, we propose a novel online one-class classification algorithm, the online least squares one-class support vector machine (online LS-OC-SVM), together with its sparsified version (sparse online LS-OC-SVM). LS-OC-SVM extracts a hyperplane as an optimal description of the training objects in a regularized least squares sense. The online LS-OC-SVM first learns a training set with a limited number of samples to provide a basic normal model, then updates the model with the remaining data. In the sparse online scheme, the model complexity is controlled by the coherence criterion. The online LS-OC-SVM is applied to the abnormal event detection problem: each video frame is characterized by a covariance matrix descriptor encoding the motion information and is then classified as normal or abnormal. Experiments on a two-dimensional synthetic distribution dataset and a benchmark video surveillance dataset demonstrate the promising results of the proposed online LS-OC-SVM method.

Visual surveillance is one of the major research areas in computer vision. After events are recorded by a visual sensor, such as a camera, obtaining detailed information about individual or crowd behavior is a challenging objective in this area; automatic abnormal event detection is required to provide convenience, safety and an efficient lifestyle for humanity [

In [

On the other hand, low-level motion features were employed. In [

Trajectory information is also adopted to detect abnormal events. In [

We consider the model-free approach, which does not require an explicit statistical model. More precisely, this paper relies on the support vector machine (SVM) classification method. Inspired by the satisfactory performance of the covariance feature descriptor for representing objects in tracking problems, we use a covariance descriptor to characterize the motion information of a whole frame. In a tracking problem, the covariance descriptor is constructed from the blob intensity or color for template matching. In this paper, the covariance encodes the optical flow of the whole frame.

The rest of the paper is organized as follows. In Section 2, related works are briefly reviewed. In Section 3, the online least squares one-class support vector machine (online LS-OC-SVM) classification method is derived. In Section 4, a covariance matrix descriptor is described to provide feature vectors for the classification algorithm. In Section 5, we propose abnormal detection methods based on the online LS-OC-SVM. In Section 6, we present the results on synthetic data and real-world video scenes. Finally, Section 7 concludes the paper.

SVM is usually trained in a batch model,

Some online learning algorithms for SVM were derived based on analyzing the change of Karush-Kuhn-Tucker (KKT) conditions while updating the classifier. In [

In [

In [

Some online one-class SVM classification methods were proposed based on support vector data description (SVDD) [

In order to sidestep the difficulty in the nature of the constrained quadratic optimization problem, we derive an online version of the hyperplane one-class SVM [

In this section, we introduce the derivation of the proposed online least squares one-class support vector machine (online LS-OC-SVM). In abnormal detection problems, it is assumed that only samples from the positive (normal) class are obtainable. A density will only exist if the underlying probability measure possesses an absolutely continuous distribution function, but the general problem of estimating the measure for a large class of sets is not solvable [

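One-class SVM (OC-SVM) aims to determine a suitable region in the input data space that contains most of the training samples, $x_i$, $i = 1, \ldots, n$. In its standard hyperplane form, it separates the mapped samples, $\Phi(x_i)$, from the origin by solving:

$$\min_{w, \xi, \rho}\ \frac{1}{2}\|w\|^2 + \frac{1}{\nu n}\sum_{i=1}^{n}\xi_i - \rho \quad \text{subject to} \quad \langle w, \Phi(x_i)\rangle \ge \rho - \xi_i,\ \ \xi_i \ge 0$$

where $\xi_i$ are the slack variables, $\rho$ is the offset of the hyperplane and $\nu \in (0, 1]$ controls the fraction of samples allowed outside the region.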

Least squares SVM (LS-SVM) was proposed by Suykens in [

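The condition for the slack variables in OC-SVM, $\xi_i \ge 0$, is no longer needed in the least squares formulation: the slack enters the objective as a squared error, and the inequality constraint becomes an equality. With a regularization constant written here as $C$, the problem reads:

$$\min_{w, \xi, \rho}\ \frac{1}{2}\|w\|^2 - \rho + \frac{C}{2}\sum_{i=1}^{n}\xi_i^2 \quad \text{subject to} \quad \langle w, \Phi(x_i)\rangle = \rho - \xi_i$$

Setting derivatives of the Lagrangian with respect to $w$, $\xi_i$, $\rho$ and the multipliers, $\alpha_i$, to zero gives $w = \sum_i \alpha_i \Phi(x_i)$, $\alpha_i = C\xi_i$, $\sum_i \alpha_i = 1$ and $\sum_j \alpha_j \kappa(x_j, x_i) + \alpha_i / C - \rho = 0$, so the parameters, $\alpha$ and $\rho$, are obtained from a linear system instead of a constrained quadratic program.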
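The stationarity conditions of a least squares one-class SVM can be written as one linear system. The following is a minimal sketch, not the authors' implementation. It assumes the standard least squares one-class objective with a regularization constant `C` and a Gaussian kernel of width `sigma`; the function names and both parameters are illustrative assumptions.

```python
import numpy as np

def gaussian_gram(X, Y, sigma=1.0):
    """Gaussian kernel matrix between the rows of X and the rows of Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fit_ls_ocsvm(X, C=10.0, sigma=1.0):
    """Solve [[K + I/C, -1], [1^T, 0]] [alpha; rho] = [0; 1].

    Assumed objective: 1/2 ||w||^2 - rho + C/2 * sum(xi_i^2),
    subject to <w, phi(x_i)> = rho - xi_i.
    """
    n = X.shape[0]
    K = gaussian_gram(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = K + np.eye(n) / C   # stationarity w.r.t. the multipliers
    A[:n, n] = -1.0
    A[n, :n] = 1.0                  # constraint sum(alpha) = 1
    rhs = np.zeros(n + 1)
    rhs[n] = 1.0
    sol = np.linalg.solve(A, rhs)
    return sol[:n], sol[n]

def decision_values(X_train, alpha, rho, X_new, sigma=1.0):
    """Offset <w, phi(x)> - rho of each row of X_new from the hyperplane."""
    return gaussian_gram(X_new, X_train, sigma) @ alpha - rho
```

With this sketch, a sample whose offset has a large absolute value lies far from the learned hyperplane; comparing that magnitude with a threshold is one way to flag abnormal samples.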

In an online learning scheme, the training data continuously arrive. We thus need to tune hyperparameters in the objective function and the hypothesis class in an online manner [_{n}_{n}_{n}_{n}_{n}^{T} at the time step, _{n}_{n}_{+1} is the column vector with _{i}_{n}_{+1}), _{n}_{+1} = _{n}_{+1}, _{n}_{+1}). Based on

The procedures for calculating the parameters,

Instead of _{
},
_{wj}_{
} (_{wj}_{j}_{i}_{i}^{T} becomes:

After providing these relations with the dictionary, we now discuss the dictionary construction. The coherence criterion is adopted to characterize a dictionary in sparse approximation problems. It provides an elegant model reduction criterion with a less computationally-demanding procedure [

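In the online case, the coherence between a new datum and the current dictionary is calculated by:

$$\epsilon_t = \max_{w_j \in \mathcal{D}} \left|\kappa(x_t, x_{w_j})\right|$$

where $x_{w_j}$ is an element of the dictionary, $\mathcal{D}$, and $\kappa$ is a unit-norm kernel. Presetting a threshold, $\mu_0$, the new arrival sample, $x_t$, is tested by the coherence criterion: if $\epsilon_t \le \mu_0$, it will be included into the dictionary. Concretely, the algorithm is performed with two cases described herein below.

Case 1: $\epsilon_t > \mu_0$

In this case, the new datum at time step $n+1$, $x_{n+1}$, is not included into the dictionary. The Gram matrix of the dictionary, with the entries $\kappa(x_{w_i}, x_{w_j})$, is unchanged.

Case 2: $\epsilon_t \le \mu_0$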

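In this case, the new datum, $x_{n+1}$, is added into the dictionary, $\mathcal{D}$. Then, the Gram matrix should be changed by:

$$K_{\mathcal{D}}^{(n+1)} = \begin{bmatrix} K_{\mathcal{D}}^{(n)} & b \\ b^{\top} & \kappa(x_{n+1}, x_{n+1}) \end{bmatrix}$$

where $K_{\mathcal{D}}^{(n+1)}$ is the Gram matrix of the dictionary, including the new arrival dictionary sample, $x_{n+1}$, and $K_{\mathcal{D}}^{(n)}$ is the Gram matrix of the dictionary at the last time step, $\mathcal{D} = \{x_{w_1}, x_{w_2}, \ldots, x_{w_D}\}$; $b$ is the column vector with entries $[b]_j = \kappa(x_{w_j}, x_{n+1})$. By adopting the matrix inverse identity, the model over the enlarged dictionary, $x_{w_i}$, $i = 1, \ldots, D+1$, is obtained from the previous one, where $x_{w_{D+1}}$ is the new arrival datum, $x_{n+1}$, which is included into the dictionary. The matrix inverse in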

with proper choices of matrices ^{−1} + ^{−1}_{
}(^{⊤}), we regard two vectors, (_{
}(^{⊤}, as vector

Once knowing _{n}_{+1}, _{wj}_{wj}
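The two-case dictionary update of this section can be sketched as follows. This is a hedged illustration, not the paper's exact recursion: it assumes a unit-norm Gaussian kernel, it updates the inverse of the plain dictionary Gram matrix (the full LS-OC-SVM system may carry extra regularization and offset terms), and all function names and the parameters `mu0` and `sigma` are hypothetical.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """Unit-norm Gaussian kernel, kappa(x, x) = 1."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-diff @ diff / (2.0 * sigma ** 2)))

def coherence(dictionary, x_new, kernel):
    """Largest absolute kernel value between x_new and the dictionary elements."""
    return max(abs(kernel(x_w, x_new)) for x_w in dictionary)

def grow_inverse(K_inv, b, c):
    """Inverse of [[K, b], [b^T, c]] from K^{-1}, using the Schur complement.

    Assumes K (and hence K_inv) is symmetric.
    """
    u = K_inv @ b
    s = c - b @ u                      # scalar Schur complement
    D = K_inv.shape[0]
    out = np.empty((D + 1, D + 1))
    out[:D, :D] = K_inv + np.outer(u, u) / s
    out[:D, D] = -u / s
    out[D, :D] = -u / s
    out[D, D] = 1.0 / s
    return out

def update_dictionary(dictionary, K_inv, x_new, kernel, mu0):
    """Case 1: coherence above mu0, dictionary kept. Case 2: x_new is added."""
    if coherence(dictionary, x_new, kernel) > mu0:
        return dictionary, K_inv
    b = np.array([kernel(x_w, x_new) for x_w in dictionary])
    c = kernel(x_new, x_new)
    return dictionary + [x_new], grow_inverse(K_inv, b, c)
```

Updating the inverse through the Schur complement costs on the order of D squared operations per accepted sample, whereas re-inverting the enlarged matrix would cost on the order of D cubed.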

The optical flow is chosen as the basic low-level feature to represent the movement direction and amplitude. We apply the Horn-Schunck (HS) method to compute optical flow in this paper. The optical flow can provide important information about the spatial arrangement of the object and the change rate of this arrangement [_{x}_{y}_{t}
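A compact version of the Horn-Schunck iteration is sketched below for illustration. It is a simplified variant: it uses a four-neighbor average and `np.gradient` derivatives instead of the original stencils, and the smoothness weight `alpha` and iteration count `n_iter` are assumed values, not settings taken from the paper.

```python
import numpy as np

def neighbor_mean(f):
    """Mean of the four direct neighbors, replicating values at the border."""
    p = np.pad(f, 1, mode="edge")
    return 0.25 * (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:])

def horn_schunck(frame1, frame2, alpha=1.0, n_iter=100):
    """Estimate the flow (u along x, v along y) between two grayscale frames."""
    I1 = np.asarray(frame1, dtype=float)
    I2 = np.asarray(frame2, dtype=float)
    # Spatial derivatives of the mean frame: axis 0 is y (rows), axis 1 is x.
    Iy, Ix = np.gradient(0.5 * (I1 + I2))
    It = I2 - I1                       # temporal derivative
    u = np.zeros_like(I1)
    v = np.zeros_like(I1)
    denom = alpha ** 2 + Ix ** 2 + Iy ** 2
    for _ in range(n_iter):
        u_bar, v_bar = neighbor_mean(u), neighbor_mean(v)
        # Residual of the brightness constancy equation at the local averages.
        t = (Ix * u_bar + Iy * v_bar + It) / denom
        u = u_bar - Ix * t
        v = v_bar - Iy * t
    return u, v
```

The returned `u` and `v` arrays have the same shape as the input frames and can serve as the per-pixel motion features from which a frame descriptor is built.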

The covariance feature descriptor was originally proposed by Tuzel [_{i}_{k}_{x}_{x}_{y}_{y}_{xx}_{xx}_{yy}_{yy}
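As an illustration of the descriptor, the sketch below computes the sample covariance of per-pixel feature vectors over one frame. The particular feature layout (pixel coordinates, flow and first derivatives of the flow, giving an 8 × 8 matrix) is an assumption chosen for the example, and the function names are hypothetical.

```python
import numpy as np

def flow_features(u, v):
    """Stack per-pixel features: coordinates, flow and first flow derivatives."""
    H, W = u.shape
    y, x = np.mgrid[0:H, 0:W].astype(float)
    u_y, u_x = np.gradient(u)          # axis 0 is y, axis 1 is x
    v_y, v_x = np.gradient(v)
    return np.stack([y, x, u, v, u_x, u_y, v_x, v_y], axis=-1)

def covariance_descriptor(features):
    """Sample covariance (d x d) of an (H, W, d) array of feature vectors."""
    H, W, d = features.shape
    F = features.reshape(H * W, d)
    centered = F - F.mean(axis=0)
    return centered.T @ centered / (H * W - 1)
```

The descriptor size depends only on the number of features, not on the frame resolution, so frames of different sizes map to matrices of the same dimension.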

If proper parameters are given, classical kernels, such as Gaussian, polynomial and sigmoidal kernels, have similar performances [_{i}_{j}

In an abnormal event detection problem, it is assumed that a set of training frames, {_{1}_{2},…, _{n}

The general architecture of the abnormal event detection method via online least squares one-class SVM (online LS-OC-SVM) proposed in Section 3.3 is summarized in Algorithm 1; the flowchart is shown in

_{
}, offline.

_{n}

_{+l}is classified via LS-OC-SVM.

where {_{1},_{2},…, _{n}_{1},_{2},…, _{n}

^{T} are defined in

^{T}, are obtained. The online LS-OC-SVM method (Section 3.3) is applied to learn the remaining _{m}_{+1},_{m}_{+2}_{n}

^{T}, the distance of the training samples
_{n}_{+}_{l}

where _{n}_{+}_{l}_{i}_{dis}

The abnormal event detection via sparse online least squares one-class SVM (sparse online LS-OC-SVM) is introduced below. A subset of the samples is chosen to form the dictionary, which gives a sparse representation of the training data. The initial dictionary is learned offline. Each remaining training sample is then learned online, one by one, and is checked against the coherence criterion to decide whether it is included into the dictionary. The test datum is classified based on the dictionary. The feature extraction step (Step 1) and the detection step (Step 4) are the same as the ones presented in Section 5.1. Owing to the dictionary, the training steps are different.

_{
}. This step can be generalized as:

_{
}, including the first _{m}_{+1}, _{m}_{+2}, …, _{n}_{
} is the dictionary and _{k}_{k}_{
}.

This section presents the results of experiments conducted to illustrate the performance of the two proposed classification algorithms, online least square one-class SVM (online LS-OC-SVM) and sparse online least square one-class SVM (sparse online LS-OC-SVM). The two-dimensional synthetic distribution dataset and the University of Minnesota (UMN) [

Two synthetic data, “square” and “ring-line-square” [

The first sample is used for initializing the online LS-OC-SVM proposed in Section 3.3; the 399 remaining samples in “square” and the 849 remaining samples in “ring-line-square” are learned in the online manner.

With the sparse online LS-OC-SVM method proposed in Section 3.4, the first sample is trained offline and is taken as the initial dictionary. Then, each of the 399 remaining samples in “square” and the 849 remaining samples in “ring-line-square” is checked by the coherence criterion to determine whether the dictionary should be retained or updated by including the new element.

The distances are shown in contours illustrating the boundary. The contours of “square” and “ring-line-square” are shown in _{0} = 0.08. The detection results obtained by these two online training algorithms are the same as the ones when training data were learned in a batch model.

UMN dataset detection results via online LS-OC-SVM proposed in Section 3.3 are shown below. The UMN dataset consists of eleven sequences of crowded panic escape events, which are recorded in a lawn, an indoor and a plaza scene. A frame where the people are walking in different directions is considered as a normal sample for training or for normal testing. A scene where the people are running is taken as an abnormal sample for testing. The detection results of the lawn scene, the indoor scene and the plaza scene are shown in

The UMN dataset abnormal event detection results via the sparse online LS-OC-SVM proposed in Section 3.4 are presented next. Taking the lawn scene as an example, the first normal covariance matrix descriptor from the training samples is included into the dictionary first; then, the remaining training covariance descriptors are learned online by the sparse online LS-OC-SVM method. The ROC curves of the detection results of the lawn scene, the indoor scene and the plaza scene are shown in

The resulting performances when all training samples are learned offline via one-class SVM (OC-SVM), learned offline via least squares one-class SVM (LS-OC-SVM), learned via online least squares one-class SVM (online LS-OC-SVM) and learned via sparse online least squares one-class SVM (sparse online LS-OC-SVM) are shown in

The resulting performances of the covariance matrix descriptor-based online least squares one-class SVM method, and of state-of-the-art methods, are shown in
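The area under the ROC curve used in these comparisons can be computed directly from per-frame scores. The sketch below assumes that each frame has a scalar anomaly score (for example, its distance to the learned hyperplane) and a ground-truth label with 1 for abnormal frames; the function name is hypothetical, and the pairwise formulation is one standard way to obtain the AUC.

```python
import numpy as np

def roc_auc(scores, labels):
    """AUC as the probability that an abnormal frame scores above a normal one.

    Ties between an abnormal and a normal score count as one half.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]               # abnormal frames
    neg = scores[~labels]              # normal frames
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)

# Four frames, the last two abnormal.
print(roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # prints 0.75
```

This pairwise form compares every abnormal frame with every normal frame, which is convenient for short sequences; for very long videos a rank-based computation would use less memory.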

In this paper, we proposed a method to detect abnormal events via online least squares one-class SVM (online LS-OC-SVM) and sparse online least squares one-class SVM (sparse online LS-OC-SVM). Online LS-OC-SVM learns training samples sequentially; sparse online LS-OC-SVM incorporates the coherence criterion to form the dictionary for a sparse representation of the detector. The covariance matrix descriptor encodes the movement feature of the frame to distinguish between normal and abnormal events. The proposed detection algorithms have been tested on a synthetic dataset and a real-world video dataset yielding successful results in detecting abnormal events.

This work is partially supported by the China Scholarship Council of Chinese Government and the SURECAP CPER project (fonction de surveillance dans les réseaux de capteurs sans fil via contrat de plan Etat-Région) and the Platform CAPSEC (capteurs pour la sécurité) funded by Région Champagne-Ardenne and FEDER (fonds européen de développement régional).

The authors declare no conflicts of interest.

Examples of the normal and abnormal scenes. (

Covariance descriptor computation based on the features of a video frame.

Major processing states of the proposed abnormal frame event detection method. The covariance of the

Synthetic dataset. (

Offline, online LS-OC-SVM and sparse online LS-OC-SVM results of “square”. The figures might be viewed better electronically, in color and enlarged, (

Offline, online LS-OC-SVM and sparse online LS-OC-SVM results of “ring-line-square”, (

Detection results of the lawn scene, (

Detection results of the indoor scene. (

Detection results of the plaza scene. (

ROC curve of University of Minnesota (UMN) dataset. (

Different choices of feature

Feature | Components
---|---
F_{1} (6 × 6) | [… u_{x} u_{y}]
F_{2} (6 × 6) | [… υ_{x} υ_{y}]
F_{3} (8 × 8) | [… u_{x} u_{y} υ_{x} υ_{y}]
F_{4} (12 × 12) | [… u_{x} u_{y} υ_{x} υ_{y} u_{xx} u_{yy} υ_{xx} υ_{yy}]

AUC of the abnormal event detection method based on covariance descriptors constructed by different features

Features | Area under ROC (lawn) | Area under ROC (indoor) | Area under ROC (plaza)
---|---|---|---
Offline OC-SVM | | |
F_{1} (6 × 6) | 0.9474 | 0.8381 | 0.9148
F_{2} (6 × 6) | 0.9583 | 0.8410 | 0.9192
F_{3} (8 × 8) | 0.9656 | 0.8483 | 0.9367
F_{4} (12 × 12) | | |
Offline LS-OC-SVM | | |
F_{1} (6 × 6) | 0.9755 | 0.8605 | 0.9422
F_{2} (6 × 6) | 0.9738 | 0.8603 | 0.9489
F_{3} (8 × 8) | 0.9788 | 0.8662 | 0.9538
F_{4} (12 × 12) | | |
Online LS-OC-SVM | | |
F_{1} (6 × 6) | 0.9755 | 0.8616 | 0.9403
F_{2} (6 × 6) | 0.9720 | 0.8730 | 0.9517
F_{3} (8 × 8) | 0.9795 | 0.8670 | 0.9563
F_{4} (12 × 12) | | |
Sparse online LS-OC-SVM | | |
F_{1} (6 × 6) | 0.8840 | 0.8077 | 0.9245
F_{2} (6 × 6) | 0.9435 | |
F_{3} (8 × 8) | 0.9269 | 0.8266 | 0.9428
F_{4} (12 × 12) | | 0.8223 | 0.9501

The comparison of our proposed method with state-of-the-art methods for abnormal event detection in the UMN dataset. NN, nearest neighbor. SRC, sparse reconstruction cost. STCOG, spatial-temporal co-occurrence Gaussian mixture models.

Method | Area under ROC (lawn) | Area under ROC (indoor) | Area under ROC (plaza)
---|---|---|---
Social Force [ | 0.96 | |
Optical Flow [ | 0.84 | |
NN [ | 0.93 | |
SRC [ | 0.995 | 0.975 | 0.964
STCOG [ | 0.9362 | 0.7759 | 0.9661
LS-SVM (Ours) | 0.9874 | 0.8900 | 0.9800
Online (Ours) | 0.9874 | 0.8904 | 0.9839
Sparse Online (Ours) | 0.9510 | 0.8886 | 0.9515