Neural Network Model for Automatic Traffic Incident Detection

Artificial neural networks are known to be effective in solving problems involving pattern recognition and classification. The traffic incident detection problem can be viewed as distinguishing incident patterns from incident-free patterns. A neural network classifier first has to be trained using incident and incident-free traffic data. The dimensionality of the training input data is high and the embedded incident characteristics are not easily detectable. In this article we present a computational model for automatic traffic incident detection using discrete wavelet transform, linear discriminant analysis, and neural networks. Wavelet transform and linear discriminant analysis are used for feature extraction, de-noising, and effective preprocessing of data before an adaptive neural network model is used to detect traffic incidents. Simulated as well as actual traffic data are used to test the model. For incidents with a duration of more than five minutes, the incident detection model yields a detection rate of nearly 100% and a false alarm rate of about 1% for two- or three-lane freeways.


Executive Summary
Automatic freeway incident detection is an important component of advanced transportation management systems (ATMS) that provides information for emergency relief and traffic control and management purposes. Earlier algorithms for the freeway incident problem have produced unreliable results, especially in recurrent congestion and compression wave traffic conditions. In this research, a multi-paradigm intelligent system approach and several innovative algorithms were developed for solution of the freeway traffic incident detection problem employing advanced signal processing, pattern recognition, and classification techniques. The methodology effectively integrates fuzzy, wavelet, and neural computing techniques to improve reliability and robustness. The specific accomplishments of this research are:
A comprehensive parametric study of the performance of the single-station fuzzy-wavelet RBFNN freeway incident detection model and comparison with the benchmark California algorithm #8 based on three quantitative measures of detection rate, false alarm rate, and detection time, and the qualitative measure of algorithm portability, using both real and simulated data. The new algorithm outperformed the California algorithm consistently under various scenarios. False alarms are a major hindrance to the widespread implementation of automatic freeway incident detection algorithms. The false alarm rate ranges from 0 to 0.07% for the new algorithm and from 0.53 to 3.82% for the California algorithm.
A comprehensive evaluation of the single-station wavelet energy neural network freeway incident detection algorithm and comparison with the California algorithm #8.
Evaluation of the wavelet energy neural network freeway incident detection algorithm on rural freeways, where flow rates are low and detector stations are spaced further apart.
It is demonstrated that both the fuzzy-wavelet RBFNN and the wavelet energy neural network freeway incident detection algorithms are computationally efficient, produce excellent detection rates and very low false alarm rates on urban freeways, and can readily be implemented on-line in any ATMS without any need for re-calibration and without any performance deterioration. Considering the difficulty in automatic detection of incidents on rural freeways, the wavelet energy algorithm performs well on rural freeways as well. The algorithm is fast: it detects an incident on urban freeways in less than two minutes and on rural freeways in less than three minutes.

Summary and Organization of the Report
This report consists of seven parts presented as seven different manuscripts. Each manuscript is summarized in the following paragraphs.
Automatic freeway incident detection is an important component of advanced transportation management systems that provides information for emergency relief and traffic control and management purposes. Earlier algorithms for the freeway incident problem have produced unreliable results, especially in recurrent congestion and compression wave traffic conditions. Traffic incidents are non-recurrent and pseudo-random events that disrupt the normal flow of traffic and create a bottleneck in the road network. The probability of incidents is higher during peak flow rates, when their system-wide impact is most severe. Model-based solutions to the incident detection problem have not produced practically useful results, primarily because the complexity of the problem does not lend itself to accurate mathematical and knowledge-based representations.
To eliminate false alarms, an effective traffic incident detection algorithm must be able to extract incident-related features from the traffic patterns. A robust feature extraction algorithm also helps reduce the dimension of the input space for a neural network model without any significant loss of related traffic information, resulting in a substantial reduction in the network size, the effect of random traffic fluctuation, the number of required training samples, and the computational resources required to train the neural network. In Part 1, an effective traffic feature extraction model is presented using discrete wavelet transform (DWT) and linear discriminant analysis (LDA). The DWT is first applied to raw traffic data and the finest-resolution coefficients, which represent the random fluctuations of traffic, are discarded. Next, LDA is applied to the filtered signal for further feature extraction and to reduce the dimensionality of the problem. The results of LDA are used as input to a neural network model for traffic incident detection.
Artificial neural networks are known to be effective in solving problems involving pattern recognition and classification. The traffic incident detection problem can be viewed as distinguishing incident patterns from incident-free patterns. A neural network classifier first has to be trained using incident and incident-free traffic data. The dimensionality of the training input data is high and the embedded incident characteristics are not easily detectable. In Part 2, a computational model is presented for automatic traffic incident detection using discrete wavelet transform, linear discriminant analysis, and neural networks. Wavelet transform and linear discriminant analysis are used for feature extraction, de-noising, and effective preprocessing of data before an adaptive neural network model is used to detect traffic incidents. Simulated as well as actual traffic data are used to test the model. For incidents with a duration of more than five minutes, the incident detection model yields a detection rate of nearly 100% and a false alarm rate of about 1% for two- or three-lane freeways.
Researchers have presented freeway traffic incident detection algorithms that combine the adaptive learning capability of neural networks with the imprecision modeling capability of fuzzy logic. In Part 3, it is shown that the performance of a fuzzy neural network algorithm can be improved through preprocessing of data using a wavelet-based feature extraction model. In particular, the discrete wavelet transform de-noising and feature extraction model presented in Part 1 is combined with the fuzzy-neural network approach presented by those researchers. It is shown that substantial improvement can be achieved using the data filtered by DWT. Use of the wavelet theory to de-noise the traffic data increases the incident detection rate, reduces the false alarm rate and the incident detection time, and improves the convergence of the neural network training algorithm substantially.

In Part 4, a multi-paradigm intelligent system approach is presented for the solution of the freeway traffic incident detection problem employing advanced signal processing, pattern recognition, and classification techniques. The methodology effectively integrates fuzzy, wavelet, and neural computing techniques to improve reliability and robustness. A wavelet-based de-noising technique is employed to eliminate undesirable fluctuations in observed data from traffic sensors. Fuzzy c-means clustering is used to extract significant information from the observed data and to reduce its dimensionality. A radial basis function neural network is developed to classify the de-noised and clustered observed data. The new model produced excellent incident detection rates with no false alarms when tested using both real and simulated data.
In Part 5, a two-stage single-station freeway incident detection model is presented based on advanced wavelet analysis and pattern recognition techniques. Wavelet analysis is used to de-noise, cluster, and enhance the raw traffic data, which is then classified by a radial basis function neural network. An energy representation of the traffic pattern in the wavelet domain is found to best characterize incident and non-incident traffic conditions. False alarms during recurrent congestion and compression waves are eliminated by normalization of a sufficiently long time-series pattern. The model is tested under several traffic flow scenarios including compression wave conditions. It produced excellent detection and false alarm characteristics. The model is computationally efficient and can readily be implemented on-line in any ATMS without any need for re-calibration.
In Part 6, the performance of the fuzzy-wavelet radial basis function neural network (RBFNN) freeway incident detection model presented in Part 4 is evaluated and compared with the benchmark California algorithm #8 using both real and simulated data. The evaluation is based on three quantitative measures of detection rate, false alarm rate, and detection time, and the qualitative measure of algorithm portability. The new algorithm outperformed the California algorithm consistently under various scenarios. False alarms are a major hindrance to the widespread implementation of automatic freeway incident detection algorithms. The false alarm rate ranges from 0 to 0.07% for the new algorithm and from 0.53 to 3.82% for the California algorithm. The new fuzzy-wavelet RBFNN freeway incident detection model is a single-station pattern-based algorithm that is computationally efficient and requires no re-calibration. The new model can be readily transferred without re-training and without any performance deterioration.
In Part 7, a comprehensive evaluation of the single-station wavelet energy neural network freeway incident detection algorithm is presented. Quantitative performance measures of detection rate, false alarm rate, and detection time as well as the qualitative measure of portability are investigated for both urban and rural freeway conditions. Further, the performance of the algorithm is compared with that of the California algorithm #8. This research demonstrates the portability of the wavelet energy algorithm and its excellent performance for urban freeways across a wide range of traffic flow and roadway geometry conditions regardless of the density of the loop detectors. Rural freeways present additional challenges in that flow rates are low and detector stations are spaced further apart. Considering the difficulty in automatic detection of incidents on rural freeways, the new wavelet energy algorithm performs well on such freeways. The algorithm is fast: it detects an incident on urban freeways in less than two minutes and on rural freeways in less than three minutes.

Though these computational models were easy to implement, they could not achieve the desired level of accuracy. Consequently, new approaches such as artificial neural networks and fuzzy logic [7] have been investigated to improve the performance.
Research also has been carried out to filter out the random fluctuations of the traffic using moving average or median plus average methods [15] in an attempt to minimize the occurrence of false alarms (i.e., false detection of incidents), but with limited success. To eliminate the false alarms, an effective incident detection algorithm must be able to extract from the traffic patterns the features that are related to the incident. In this work we use the discrete wavelet transform (DWT) and linear discriminant analysis (LDA) for feature extraction. Actual traffic data obtained from the sensors on the freeways are not well suited as a direct input to a neural network model used to detect incidents. The dimensionality of the training input data is generally high, because various traffic parameters (e.g., traffic volume and occupancy) at different locations (e.g., upstream and downstream of the various incident locations) and at many instances of time have to be input, and the embedded incident characteristics may not be easily detectable. Also, the training of a neural network incident detection algorithm requires input patterns containing sufficient incident data. Thus, effective pre-processing of the sensor data is essential before they can be used in a neural network model. In this work, we perform feature extraction in two steps. In the first step, the data are filtered and the high-frequency signals representing noise, which may not be related to an incident, are removed using the wavelet transform. In the second step, the features are enhanced using LDA. The feature extraction algorithm also helps reduce the dimensionality of the input space to a neural network model without any significant loss of related traffic information. In the companion paper we use the feature extraction algorithm to develop a robust traffic incident detection model.

DISCRETE WAVELET TRANSFORM
The wavelet transform is an effective tool in signal and image processing because of its attractive properties: time-frequency localization (obtaining a signal at a particular time or frequency), multi-rate filtering (differentiating signals having various frequencies), scale-space analysis (extracting features at various locations in space at different scales), and multi-resolution analysis. Using these properties one can extract the desired features from an input signal characterized by certain local characteristics.
For the traffic incident detection problem, we consider various traffic data (e.g., traffic volumes and occupancies at various locations) recorded at a fixed time interval (e.g., 20-30 seconds). Each of these data series can be represented by $x[j]$, where $j \in Z$ and $Z$ is the set of integers (square brackets represent a series, a sequence, or a vector, and round brackets represent functions). The vector space of square-summable sequences is defined as follows:
$$L^2(Z) = \Big\{ y[j] : \sum_{j \in Z} |y[j]|^2 < \infty \Big\} \quad (1)$$
where $y$ represents a sequence of real numbers and $Z$ is the set of all integers. That means the inner product of a sequence with itself converges to a finite value.
The coordinates of a data series with respect to the wavelet bases are denoted by $\beta$ and $\lambda$ in Eqs. (2) and (3), where $\langle \cdot , \cdot \rangle$ denotes the inner product of two sequences in the vector space $L^2(Z)$, $k$ represents the total number of input data points, and $l$ represents the number of coefficients of each data series such as traffic volume or occupancy. The coordinates $\beta$ and $\lambda$ are, in fact, the low- and high-resolution coefficients of the given data series $x[k]$, respectively. The inner product of any two data series $f[n]$ and $g[n]$ is calculated as follows:
$$\langle f, g \rangle = \sum_{n \in Z} f[n]\, g[n]$$
In our traffic incident detection problem, we use 8-minute traffic patterns with data recorded at intervals of 30 seconds, because multi-resolution analysis using DWT requires at least 16 data points at a time. As such, in Eqs. (2) and (3), $k = 16$.
We choose $J = 2$ decomposition levels, which means the traffic patterns are divided into three types of signals: low-resolution ($\beta_2$), medium-resolution ($\lambda_2$), and high-resolution ($\lambda_1$). In this case, $l$ in Eq. (2) is $16/2^2 = 4$. Consequently, we have 4 low-resolution coefficients ($\beta_2[l]$). Similarly, $l$ in Eq. (3) is equal to 8 for $j = 1$ and equal to 4 for $j = 2$, which yields 8 fine-resolution coefficients ($\lambda_1[l]$) and 4 medium-resolution coefficients ($\lambda_2[l]$).
The coordinates of the wavelet bases (the $\beta$ and $\lambda$ coefficients) are computed using a concept called quadrature mirror filters. A quadrature filter is an operator that performs signal convolution and downsampling. We use mirror filters (a pair of filters) so that the original traffic signal can be reconstructed without any loss of related information: one filter yields the high-resolution components of the signal and the other yields the low-resolution components. The convolution of any two sequences $f[n]$ and $g[n]$ is calculated as follows:
$$(f * g)[n] = \sum_{m \in Z} f[m]\, g[n - m]$$
The downsampling part is discussed in the following section.
To extract the traffic incident pattern from the traffic data, we perform multi-resolution analysis of the wavelet transforms of traffic patterns. Multi-resolution analysis involves dividing the original signal (e.g., traffic volume or occupancy) into signals having different frequencies and time localizations and analyzing the signal at different scales.
To carry out a multi-resolution analysis of a traffic pattern we need to define a two-dimensional set of scaling functions, $\phi(t)$, and wavelet functions, $\psi(t)$. A two-dimensional family of scaling functions $\phi_{j,k}(t)$ is obtained by scaling and translating the basic scaling function $\phi(t)$.
The integers $j$ and $k$ are called the scaling and translation parameters, respectively. The subspaces spanned by $\phi_{j,k}(t)$ are denoted by $V_j$; the over-bar in their definition indicates that $V_j$ is a closed subspace (i.e., boundaries are included in the subspace). The scale $j$ can be varied from $-\infty$ to $+\infty$ to obtain signals having various resolutions.
Similar to the scaling functions, a two-dimensional family of wavelet functions $\psi_{j,k}(t)$ is obtained from the mother wavelet $\psi(t)$ by scaling and translation. The limiting subspaces are
$$V_{\infty} = \{0\} \quad (12)$$
(the trivial subspace containing only the zero signal) and
$$V_{0} = L^2(Z)$$
(which contains the original input signal). Equation (11) indicates that $V_{j+1}$ is a subset of $V_j$. The subspace $V_j$ contains all the signals included in $V_{j+1}$ plus additional high-resolution signals. These additional high-resolution signals are contained in the wavelet-spanned subspace $W_{j+1}$:
$$V_j = V_{j+1} \oplus W_{j+1} \quad (14)$$
where $\oplus$ denotes the direct sum of two subspaces. The number of decomposition levels $J$ can be varied to obtain the desired level of resolution. We choose $J = 2$, for the same reasons explained for Eqs. (2) and (3). In that case, the original signal is divided into three parts, each one lying in a different subspace as follows:
$$V_0 = V_2 \oplus W_2 \oplus W_1$$
where $V_2$, $W_2$, and $W_1$ contain the low-, medium-, and high-resolution signals, respectively.
The definition of $V_j$ and the scaling condition given by Eq. (6) ensure that elements in the two consecutive subspaces $V_j$ and $V_{j+1}$ are inter-related: membership of a signal in one subspace implies membership of its rescaled version in the other, and vice versa (mutual implication, denoted by $\Leftrightarrow$). The actual relationship [2] expresses the scaling function at one scale as a sum over $n \in Z$ of translated scaling functions at the adjacent scale, weighted by $h[n]$ and multiplied by the factor $1/\sqrt{2}$, where $h[n]$ is a sequence of real numbers known as the scaling function coefficients or low-pass filter coefficients. (The factor $1/\sqrt{2}$ keeps the norm of the scaling function equal to 1.)
Since the wavelet-spanned subspace at scale $j+1$ is a part of $V_j$ (the subspace spanned by the scaling functions at scale $j$, i.e., $W_{j+1} \subset V_j$), the wavelets at scale $j+1$ can be represented as a weighted sum of the translated scaling functions at scale $j$, with weights $h_1[n]$, a set of real numbers known as the wavelet function coefficients or high-pass filter coefficients. Due to the orthogonal relationship between the wavelet and scaling functions, the wavelet coefficients are related to the scaling coefficients as follows [2]:
$$h_1[n] = (-1)^n\, h[L - 1 - n]$$
where $L$ is the length of the filter used. In our traffic incident detection case, we use length-4 Daubechies filter coefficients ($L = 4$), as this filter is found to be accurate and efficient in the area of digital filtering. For this filter, the $h[n]$ values are found by solving the associated recursion. Any input data series $f[t]$ in $V_0$ (or $L^2(Z)$) can now be written in terms of the scaling and wavelet functions.

FILTERING AND DOWNSAMPLING
Digital filtering of the input signal is carried out by convolving the signal with another set of numbers known as the filter coefficients or impulse response [9]. Downsampling involves decimation of some of the input data: the input signal $x(n)$ is transformed into an output signal $y(n)$ such that $y(n) = x(2n)$. This means that alternate data points are discarded, as shown schematically in Figure 1.
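As a minimal sketch of one filtering-and-downsampling stage, the following Python code filters a signal with a low-pass and a high-pass filter and then keeps every other output sample, $y(n) = x(2n)$. The length-4 Daubechies values are the standard published coefficients. The periodic (wrap-around) boundary handling, the textbook alternating-sign mirror construction of the high-pass filter, and all function names are assumptions made for illustration and are not taken from the report.

```python
import numpy as np

SQRT3 = np.sqrt(3.0)
# Standard length-4 Daubechies low-pass coefficients, scaled so sum(h**2) == 1.
h = np.array([1 + SQRT3, 3 + SQRT3, 3 - SQRT3, 1 - SQRT3]) / (4 * np.sqrt(2.0))
# High-pass coefficients from the mirror relation h1[n] = (-1)**n * h[L-1-n].
h1 = np.array([(-1) ** n * h[len(h) - 1 - n] for n in range(len(h))])


def downsample(x):
    """Keep every other sample: y[n] = x[2n]."""
    return np.asarray(x)[::2]


def circular_filter(x, taps):
    """y[n] = sum_m taps[m] * x[(n + m) mod N], i.e. filtering with
    periodic extension (an assumed boundary rule). This is a correlation,
    equivalent to convolving with the time-reversed taps."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    offsets = np.arange(len(taps))
    return np.array([np.dot(taps, x[(i + offsets) % n]) for i in range(n)])


def analysis_stage(x):
    """One stage: filter with both filters, then discard alternate samples."""
    coarse = downsample(circular_filter(x, h))   # low-resolution part
    detail = downsample(circular_filter(x, h1))  # high-resolution part
    return coarse, detail


# Hypothetical 16-point pattern (8 minutes of 30-second readings).
pattern = np.arange(16.0)
coarse, detail = analysis_stage(pattern)
```

Each output has half the length of the input, and with these orthonormal filters the total energy of `coarse` and `detail` equals that of the input pattern.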
According to Eqs. (25) and (26), once filtering and downsampling are completed, the next low-resolution scaling coefficients and high-resolution wavelet coefficients are found. This is shown schematically in Figure 2, where $H_0$ and $H_1$ represent the two FIR filters. This splitting (dividing the signal into higher- and lower-resolution signals), filtering, and decimation (downsampling) can be repeated on the scaling coefficients to obtain a two- or three-stage two-scale filter (Figure 3).
Having found the relationship among the four sets of coefficients, we now describe how to obtain the input set of scaling coefficients ($\beta_0$) from the input signal. In the traffic incident detection model the traffic data are not continuous. We use the traffic volume and occupancy values at 30-second intervals, which means the data are prefiltered and can be used directly as input coefficients. As an example, if we use 8-minute traffic patterns, we will have 16 input values for each of the four input parameters: upstream and downstream occupancy and volume. After two stages of filtering and downsampling we will have 8 coefficients of the finest resolution, 4 coefficients of the medium resolution, and 4 coefficients of the coarse resolution for each traffic parameter. All 8 high-resolution coefficients are discarded as they represent the ordinary traffic fluctuations, which may not be related to traffic incidents. For the low- and medium-resolution coefficients, we take some or all of them and find the best combination. The signal is then regenerated using these medium- and low-resolution coefficients; the result is called the de-noised signal. To enhance the feature extraction, linear discriminant analysis is performed on the coefficients obtained from the wavelet transform and multi-resolution analysis. Linear discriminant analysis is discussed in the next section.
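The two-stage decomposition and de-noising described above can be sketched in Python as follows. The sketch splits a 16-point pattern into 4 coarse, 4 medium, and 8 fine coefficients, zeroes the 8 fine coefficients, and regenerates the signal. It assumes the standard length-4 Daubechies coefficients, a periodic boundary extension, and an orthogonal matrix formulation chosen for brevity; the function names and the random test pattern are hypothetical.

```python
import numpy as np

SQRT3 = np.sqrt(3.0)
H = np.array([1 + SQRT3, 3 + SQRT3, 3 - SQRT3, 1 - SQRT3]) / (4 * np.sqrt(2.0))
G = np.array([H[3], -H[2], H[1], -H[0]])  # mirror high-pass filter


def stage_matrices(n):
    """Low- and high-pass analysis matrices of shape (n // 2, n).

    Row k applies the 4 filter taps to samples 2k .. 2k+3 (wrapping around),
    so filtering and downsampling by 2 happen in one matrix product.
    """
    low = np.zeros((n // 2, n))
    high = np.zeros((n // 2, n))
    for k in range(n // 2):
        for m in range(4):
            low[k, (2 * k + m) % n] += H[m]
            high[k, (2 * k + m) % n] += G[m]
    return low, high


def decompose(x):
    """Two-stage transform of a 16-point pattern: (coarse4, medium4, fine8)."""
    x = np.asarray(x, dtype=float)
    low1, high1 = stage_matrices(len(x))
    stage1, fine = low1 @ x, high1 @ x
    low2, high2 = stage_matrices(len(stage1))
    return low2 @ stage1, high2 @ stage1, fine


def reconstruct(coarse, medium, fine):
    """Invert decompose(); the stacked matrices are orthogonal, so the
    inverse of each stage is its transpose."""
    low2, high2 = stage_matrices(2 * len(coarse))
    stage1 = low2.T @ coarse + high2.T @ medium
    low1, high1 = stage_matrices(2 * len(stage1))
    return low1.T @ stage1 + high1.T @ fine


def denoise(x):
    """Discard all fine-resolution coefficients and regenerate the signal."""
    coarse, medium, fine = decompose(x)
    return reconstruct(coarse, medium, np.zeros_like(fine))


rng = np.random.default_rng(0)
pattern = 20.0 + rng.normal(scale=2.0, size=16)  # hypothetical occupancy readings
coarse, medium, fine = decompose(pattern)
smooth = denoise(pattern)
```

Keeping only a subset of the 8 coarse and medium coefficients, as the text suggests, would amount to zeroing further entries before calling `reconstruct`.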

LINEAR DISCRIMINANT ANALYSIS
We use linear discriminant analysis (LDA) to reduce the dimensionality of the problem and to improve the generalization capability of the pattern classifier while at the same time reducing its computational processing requirements. This part of feature extraction can be formulated as a mapping from a d-dimensional input space ($R^d$) to an m-dimensional feature space ($R^m$) through a transformation matrix $T$ [6]. Linear discriminant analysis achieves feature extraction through linear mapping of the input space to the output space. The most popular and commonly used linear discriminant classifiers are the Fisher Linear Discriminant Classifier (FLD) and the Nearest Mean Classifier (NMC), or Euclidean Distance Classifier. The construction procedure for both classifiers is almost the same, with minor differences.
Let $(X_i)_j$ be a vector representing the $i$th training sample in class $j$ output by the discrete wavelet transform. If we use, as an example, 6 of the 8 medium- and low-resolution coefficients, then $(X_i)_j = (x_{i,j}^{1}, x_{i,j}^{2}, \ldots, x_{i,j}^{6})$. In the traffic incident detection case $j$ is either 1 or 2, where $j = 1$ indicates the incident-free samples and $j = 2$ indicates the incident samples. Also, $i = 1, 2, \ldots, n_j$, where $n_j$ is the number of training samples in class $j$, and $n = n_1 + n_2$ is the total number of training samples. The within-class covariance square matrix $C_W$ of dimension $d$ is defined in terms of the deviations of the samples from $m_j$, the mean vector for class $j$. Incident detection is a two-class problem involving classification of data between incident and incident-free regions. For this two-class problem, the between-class covariance square matrix $C_B$ of dimension $d$ is defined [6] in terms of the deviations of the class means from $m$, the mean vector of all the data. The goal of linear discriminant analysis is to find a $d \times m$ transformation matrix $T$ such that the within-class scatter is minimized and the between-class scatter is maximized.
This can be achieved by maximizing the sum of the eigenvalues ($J$) of the matrix product $C_W^{-1} C_B$ [6]. Simplifying Eq. (29) shows that $C_B$ is a function of only one vector, $(m_2 - m_1)$, so its rank (the number of independent rows or columns in the matrix) is one. And since $C_W$ has full rank, its inverse exists and the rank of $C_W^{-1} C_B$ is also equal to one. That is, it has only one non-zero eigenvalue. The eigenvector corresponding to this non-zero eigenvalue is [6]
$$E = \frac{C_W^{-1} (m_2 - m_1)}{\| C_W^{-1} (m_2 - m_1) \|}$$
where the constant denominator is chosen to make the norm of the eigenvector unity, i.e., $\|E\| = 1$. For our two-class incident detection problem the eigenvector is a function of one vector, $(m_2 - m_1)$, only, requiring one discriminating feature, and the mapping function yielding the output $Y$ is a linear function of the input vector defined by this eigenvector, up to a constant $c$.
In the incident detection problem, the value of d is varied from 3 to 6 when we apply LDA to a single data series at a time. If LDA is carried out using all the data series (upstream and downstream traffic volume and occupancy) together, then d is equal to the number of data series considered (4 in this particular case). On the other hand, m = 1 (Figure 4), and the single output represents one value of effective traffic occupancy or volume for a given time period of 8 minutes. Since the incident detection problem is a two-class classification problem, only one feature is sufficient to differentiate between the two classes. Consequently, the number of input nodes of the neural network is reduced to 4.
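To make the reduction from d coefficients to a single feature concrete, the following Python sketch computes a Fisher discriminant direction for two classes and projects each d-dimensional sample onto it. It is an illustration under stated assumptions: the within-class matrix is taken as the sum of the two class scatter matrices (a different normalization would not change the direction), the data are synthetic, and the function names are hypothetical.

```python
import numpy as np


def fisher_direction(class1, class2):
    """Unit vector proportional to C_W^{-1} (m2 - m1).

    class1, class2: arrays of shape (n_j, d), one row per training sample.
    """
    m1, m2 = class1.mean(axis=0), class2.mean(axis=0)
    dev1, dev2 = class1 - m1, class2 - m2
    c_w = dev1.T @ dev1 + dev2.T @ dev2      # within-class scatter (assumed form)
    e = np.linalg.solve(c_w, m2 - m1)        # avoids forming the explicit inverse
    return e / np.linalg.norm(e)             # normalize so that ||E|| = 1


def project(samples, direction):
    """Map each d-dimensional sample to a single discriminating feature."""
    return samples @ direction


# Synthetic stand-in for 6 wavelet coefficients per pattern (d = 6).
rng = np.random.default_rng(1)
incident_free = rng.normal(loc=0.0, size=(40, 6))   # class j = 1
incident = rng.normal(loc=1.0, size=(40, 6))        # class j = 2

E = fisher_direction(incident_free, incident)
y_free = project(incident_free, E)
y_incident = project(incident, E)
```

Applying the same projection separately to each of the four data series would yield the four features that feed the four input nodes mentioned above.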
Equation (35) represents the general FLD function. The NMC function does not consider the covariance part ($C_W^{-1}$). As the name suggests, the Nearest Mean Classifier (NMC) classifies the data on the basis of distance from the class means. Thus, in the two-class incident detection problem it generates the perpendicular bisector between the class means. This type of linear classification is ideal for classes with identical distribution of data around the class means. But in the incident detection case, the incident and incident-free data may not have identical distributions around their class means. Consequently, the covariance part has to be considered for optimal linear classification. The standard FLD takes into account the covariance part, but linear classification using FLD involves inversion of the within-class covariance matrix ($C_W$), which is often an ill-conditioned matrix. This problem can be overcome by adding some constant value ($\delta$) to the diagonal elements of the covariance matrix as follows:

$$\tilde{C}_W = C_W + \delta I \quad (34)$$
where $I$ is an identity matrix. In this case, the classifier is known as the Regularized FLD.
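The effect of the regularization can be illustrated with a short Python sketch that compares the condition number of the within-class matrix before and after adding $\delta I$, and compares the regularized FLD direction with the NMC direction $(m_2 - m_1)$. The data are synthetic, with fewer samples than dimensions so that the unregularized matrix is singular; the value of `delta`, the scatter-matrix normalization, and the function names are assumptions made for illustration.

```python
import numpy as np


def within_class_scatter(class1, class2):
    """Sum of the two class scatter matrices (assumed normalization)."""
    dev1 = class1 - class1.mean(axis=0)
    dev2 = class2 - class2.mean(axis=0)
    return dev1.T @ dev1 + dev2.T @ dev2


def nmc_direction(class1, class2):
    """Nearest Mean Classifier direction: difference of class means only."""
    diff = class2.mean(axis=0) - class1.mean(axis=0)
    return diff / np.linalg.norm(diff)


def regularized_fld_direction(class1, class2, delta):
    """Direction proportional to (C_W + delta * I)^{-1} (m2 - m1)."""
    c_w = within_class_scatter(class1, class2)
    c_reg = c_w + delta * np.eye(c_w.shape[0])
    e = np.linalg.solve(c_reg, class2.mean(axis=0) - class1.mean(axis=0))
    return e / np.linalg.norm(e)


# Three samples per class in six dimensions: C_W cannot have full rank.
rng = np.random.default_rng(2)
incident_free = rng.normal(size=(3, 6))
incident = rng.normal(size=(3, 6)) + 1.0

c_w = within_class_scatter(incident_free, incident)
delta = 1e-2  # hypothetical regularization constant
cond_raw = np.linalg.cond(c_w)
cond_reg = np.linalg.cond(c_w + delta * np.eye(6))

e_reg = regularized_fld_direction(incident_free, incident, delta)
e_nmc = nmc_direction(incident_free, incident)
# For very large delta the covariance term is swamped and FLD approaches NMC.
e_large = regularized_fld_direction(incident_free, incident, 1e8)
```

In this sketch a small `delta` keeps the covariance information while making the inversion well posed, whereas a very large `delta` effectively reduces the regularized FLD to the NMC.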

DATA ACQUISITION
Traffic incident detection is a real-life problem. Therefore it is quite essential to [...] The last method is based on using sensors placed along freeways at intervals of a few hundred meters to a few kilometers and computers to process the traffic flow data. In this approach, sensors detect the effects of an incident that occurred between two neighboring sets of detectors rather than the incident itself. In some incident detection algorithms an incident is detected only after a few minutes, which is a considerable delay.
One of our goals in this research is to minimize the detection time.
In an automated freeway incident detection system (AFIDS) an entire freeway system can be monitored continuously in a central office through the use of a network of sensors, without anyone actually observing the incidents. But first the AFIDS has to be trained using the data obtained from sensors. [...] and duration of the incident. The simulated data can be displayed graphically on the computer screen. An example is shown in Figure 5, displaying a straight four-lane freeway segment with two sets of entry and exit ramps. TSIS/CORSIM provides a comprehensive freeway incident simulation module called FRESIM (Freeway Simulation Package). An example of a simulation instance for the freeway of Figure 5 is shown in Figure 6, displaying the location of the accident and the traffic congestion after the incident.
We can specify either blockages in one or both lanes or rubbernecking, which is a reduction in the capacity of a lane without a blockage (defined as a percentage reduction in the capacity) due to a blockage in a neighboring lane or an incident on the shoulder. The user can specify the following for an incident: the longitudinal location on a freeway link, the length of the blockage, and the duration of the incident. The characteristics of an incident can be changed during the incident duration. For example, it is possible to specify a two-lane blockage turning into a one-lane blockage after a specified duration.
The lane from which the blockage is removed can then become unrestricted or subjected to rubbernecking. The simulation parameters to be chosen are the percentage reduction in the capacity of the freeway.

INTRODUCTION
Stephanedes et al. used a moving average method to reduce the effect of random fluctuations in the traffic on the incident detection algorithm. They average the differences in the occupancies at upstream and downstream locations over 3-minute periods using data recorded at 30-second intervals. Their comparison with other existing approaches showed improvement in reducing the false alarm rates. They report a detection rate of around 90% for a false alarm rate of about 1%. They also note that "the algorithm performance may exhibit varying degree of transferability across test locations". To take into account the uncertainty and imprecision inherent in incident detection, researchers have recently explored the use of new computing approaches such as fuzzy logic [9, 17] and neural networks to improve the incident detection rate with simultaneous reduction in false alarms. Neural networks are known as a powerful method for pattern recognition and classification [2]. The price to pay for their adaptive learning capability is often the need for large computational resources when the problem is complicated, requiring a large network and a large number of training instances. As an example, if we use an 8-minute traffic pattern with 30-second intervals and upstream and downstream traffic volumes and occupancies as input, then the number of input nodes for the neural network model will be 4 × 8 × 2 = 64. If we use one hidden layer with the same number of nodes as the input layer, then the number of links connecting the input layer to the hidden layer would be 64 × 64 = 4096. This means we have to solve a large optimization problem with 4096 + 64 = 4160 variables (assuming one output node) in order to find the 4160 weights of the network. Further, a few hundred training instances are needed to train such a large network.
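The averaging step attributed to Stephanedes et al. at the start of this section can be sketched in Python as follows. Only the smoothing of upstream-minus-downstream occupancy differences over a 3-minute window of 30-second readings is shown; the thresholds and decision logic of their algorithm are not reproduced, and the function name, parameter names, and sample values are hypothetical.

```python
import numpy as np


def smoothed_occupancy_difference(upstream, downstream,
                                  interval_s=30, window_s=180):
    """Moving average of (upstream - downstream) occupancy.

    With 30-second readings and a 3-minute window, each output value is the
    mean of 6 consecutive differences.
    """
    diff = np.asarray(upstream, dtype=float) - np.asarray(downstream, dtype=float)
    window = window_s // interval_s
    kernel = np.ones(window) / window
    # mode="valid" returns only windows that lie fully inside the data.
    return np.convolve(diff, kernel, mode="valid")


# Hypothetical occupancy readings (percent) over 6 minutes.
upstream = np.full(12, 30.0)
downstream = np.full(12, 20.0)
smoothed = smoothed_occupancy_difference(upstream, downstream)
```

A 6-minute record of 12 readings yields 7 smoothed values, each the average over one full 3-minute window.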
In order to reduce the high dimensionality of the network and improve its computational efficiency, we first employ a two-stage feature extraction model using the discrete wavelet transform and linear discriminant analysis, as described in the companion paper [14]. This reduces the number of nodes in the input and hidden layers for the aforementioned example to 4, thus reducing the size of the network substantially and resulting in significant computational efficiency (Figure 1).
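The size arithmetic for the example can be checked with a few lines of Python. The helper below counts only connection weights, with no bias terms, for a network with one hidden layer, which is the counting rule implied by the 4096 + 64 = 4160 figure; the function name is hypothetical, and the reduced-network count is simply what the same rule gives for 4 input and 4 hidden nodes.

```python
def count_weights(n_inputs, n_hidden, n_outputs=1):
    """Connection weights in a one-hidden-layer network, bias terms excluded."""
    return n_inputs * n_hidden + n_hidden * n_outputs


# 4 data series x 8 minutes x 2 readings per minute (30-second intervals).
raw_inputs = 4 * 8 * 2
raw_weights = count_weights(raw_inputs, raw_inputs)  # 64*64 + 64

# After feature extraction: 4 input nodes and 4 hidden nodes.
reduced_weights = count_weights(4, 4)                # 4*4 + 4

print(raw_inputs, raw_weights, reduced_weights)
```

Under this counting rule the number of variables in the training problem drops from 4160 to 20.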
A robust feature extraction algorithm also helps reduce the dimension of the input space for a neural network model without any significant loss of related traffic information. This results in a substantial reduction in the network size (i.e., the number of nodes in the input and hidden layers), the effect of random traffic fluctuation on the learning curve of the neural network, the computational resources required to train the network, and the required number of training samples (which means more accurate generalization). The learning curve of a neural network is defined as the relation between the mean squared error of the output and the number of iterations required for the training. As the random traffic fluctuations are reduced, the total number of iterations required for convergence is reduced too.

Backpropagation neural network [13] has been used to solve the traffic incident detection problem.
The attraction of backpropagation is its simplicity. But it suffers from a number of shortcomings:
1. It often requires a very large number of iterations for convergence.
2. Its convergence depends heavily on the selection of two problem-dependent parameters, learning and momentum ratios, that have to be selected by trial and error.
3. It suffers from the hill-climbing problem, that is, entrapment in a local minimum.
In this work we use the adaptive conjugate gradient neural network (ACGNN) learning algorithm of Adeli and Hung, which combines the conjugate gradient method originally proposed by Fletcher and Reeves [7] and modified by Powell [12] with an inexact line search with three criteria for finding the optimal search direction.

MODEL
The conjugate gradient method is based on the steepest descent method where weight changes are made along the direction resulting in the maximum decrease in the system error. Determination of the step length of a gradient-based optimization algorithm has a significant impact on its efficiency. A very accurate or "exact" line search requires many function evaluations, thus making the algorithm prohibitively and unnecessarily expensive. An appropriate inexact line search algorithm can determine the step length within a small percentage of that found based on an exact search. Adeli and Hung use the backtracking inexact line search algorithm of Dennis and Schnabel [6], the step length selection terminating criterion of Armijo [4] to ensure the step length is not too large, the terminating criterion of Goldstein [8] to ensure the step length is not too small, and the direction convergence criterion of Nocedal [11] to ensure that the descent direction is always generated.
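The roles of the two step length tests can be illustrated with a small sketch. It uses the textbook forms of the Armijo (upper bound) and Goldstein (lower bound) conditions with a single constant `c`; the function name `classify_step`, the constant `c = 0.25`, and the test function are assumptions made for illustration and are not taken from Adeli and Hung.

```python
def classify_step(f, x, d, slope, lam, c=0.25):
    """Classify a trial step length lam along direction d.

    slope is the directional derivative grad f(x) . d, which is
    negative for a descent direction. c is an illustrative constant.
    """
    f0 = f(x)
    f1 = f([xi + lam * di for xi, di in zip(x, d)])
    if f1 > f0 + c * lam * slope:
        return "too large"   # Armijo sufficient-decrease test fails
    if f1 < f0 + (1.0 - c) * lam * slope:
        return "too small"   # Goldstein lower-bound test fails
    return "ok"


def sum_of_squares(x):
    return sum(xi * xi for xi in x)


# f(x) = x^2 at x = 1 with the steepest descent direction d = -2.
x0, d0, slope0 = [1.0], [-2.0], -4.0
for lam in (1.0, 0.5, 0.1):
    print(lam, classify_step(sum_of_squares, x0, d0, slope0, lam))
```

For this one-dimensional example the acceptable step lengths lie between 0.25 and 0.75, so a full step of 1 is rejected as too large and a step of 0.1 as too small.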

The steps of the adaptive conjugate gradient algorithm for training of neural networks are presented here briefly. For a classification problem involving T decision variables, the training of the network is started using a randomly generated initial weight vector ($W \in R^T$). Two stopping criteria are provided for convergence, one for the gradient vector ($\varepsilon = 10^{-5}$ to …) and one for the minimum system error (0.01 or 0.001).
The minimum (minlen) and maximum (maxlen) step lengths are set to 0.0001 and 100, respectively. The initial search direction is set to 0. The two line search parameters are chosen equal to 0.9 and 0.01, respectively, per Adeli and Hung [2]. The outer iteration number, n, is set to 1. The decision variable counter, t, is set to 0.

1.
b) The system error is calculated for the kth training instance. In the traffic incident detection case, there is only one output node, so the error is just the square of the difference between the desired output ($y_k$) and the actual output ($o_k$).
c) The deltas in the output layer for the kth training instance are calculated.
d) Deltas for the hidden layers are then calculated by propagating the error back.
e) The gradient vector for the kth training instance is calculated.
2. The total system error is then calculated by adding up the individual errors from step 1(b). If the total system error satisfies the minimum error convergence criterion, the training is completed. Otherwise, the gradient vector for the total system error is calculated. A new search direction is assigned as the negative of the gradient vector: $d^{(n)} = -\nabla E(W^{(n)})$.
If the gradient vector satisfies the convergence criterion $|\nabla E(W^{(n)})| < \varepsilon$, then the training is stopped and the weight vector obtained is the final solution. Otherwise, the following steps are performed.
3. The decision variable counter (t) is increased: t = t + 1. If t > T, that is if t exceeds the number of decision variables, then it is set to 0 (t = 0). If t = 1, $\alpha_n$ is set to 0. Otherwise, a new conjugate direction is calculated.
4. $\lambda$ is initialized equal to one. Then, the Armijo [4] criterion is applied to ensure the step length is not too large. If the step length is too large, step 10 is carried out. Otherwise, the Goldstein [8] criterion is applied to ensure the step length is not too small. If this criterion is satisfied then step 8 is carried out, where a new search direction is calculated using the new value of $\lambda$. If the Goldstein [8] criterion is not satisfied then the value of $\lambda$ is checked. If its value has changed (that is, $\lambda \neq 1$), then step 6 is carried out. Otherwise, the next step is performed.

5. A new $\lambda$ value is set as follows: $\lambda = \min(2\lambda, \mathrm{maxlen})$. A new search direction $d^{(n+1)}$ is calculated (Eq. 8). Using this new search direction, the direction convergence criterion of Nocedal [11] is checked. If the direction convergence criterion is not satisfied then step 6 is carried out. If the direction convergence criterion is satisfied then the Goldstein [8] criterion is checked. If the Goldstein criterion is not satisfied or if the value of $\lambda$ becomes greater than maxlen, step 6 is carried out. Otherwise this step is repeated.
6. If $\lambda < 1$, or if $\lambda > 1$ and the direction convergence criterion of Nocedal [11] is not satisfied, then step 7 is carried out. Otherwise, step 12 is carried out directly.

7. A new value of $\lambda$ is calculated using backtracking and parabolic interpolation. A new search direction is calculated using Eq. (8). This is repeated until both the Nocedal [11] and Goldstein [8] criteria are satisfied simultaneously. Then, step 12 is carried out.

8. A new search direction is found and checked for the descent condition criterion of Nocedal [11]. If it is satisfied then step 12 is carried out directly. Otherwise, the next step is performed.

9. A new $\lambda$ is found by backtracking and a new search direction is computed (Eq. 8). This step is repeated until the gradient descent condition of Nocedal [11] is satisfied. Then step 12 is carried out directly.
10. If $\lambda$ < minlen, $\lambda$ is set to 0 and step 12 is performed. Otherwise, cubic interpolation is used to find a new $\lambda$.
12. If this step is executed directly after step 7, 8, 9 or 10 then the inexact line search algorithm is stopped and step 13 is performed. Otherwise, step 4 is carried out using a new value of $\lambda$.
13. The weight vector is updated along with the iteration counter. If n exceeds the specified maximum number of iterations the training is stopped.
14. If step 14 is executed directly after step 2 or step 3, then stop. In this case the weight vector obtained is the optimum weight vector.
This algorithm is restarted after every T iterations (for T decision variables), and $\alpha_n$ is set to zero for t = 1.
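The overall flow of the steps above (steepest descent restart when t = 1, a conjugate direction otherwise, and a step length found by an inexact line search) can be sketched as follows. This is a simplified stand-in and not the algorithm of Adeli and Hung: it uses the Fletcher-Reeves coefficient, replaces the three-criteria line search with one parabolic estimate followed by Armijo backtracking, and is run on a small quadratic error function chosen for illustration. All function names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def move(w, d, lam):
    return [wi + lam * di for wi, di in zip(w, d)]


def line_search(f, w, d, slope, minlen=1e-4, maxlen=100.0, c=0.01):
    """Parabolic estimate of the step length, then backtracking until the
    Armijo sufficient-decrease test holds. Returns 0 if the step length
    falls below minlen."""
    f0 = f(w)
    curvature = f(move(w, d, 1.0)) - f0 - slope
    lam = -slope / (2.0 * curvature) if curvature > 0 else 1.0
    lam = min(max(lam, minlen), maxlen)
    while f(move(w, d, lam)) > f0 + c * lam * slope:
        lam *= 0.5
        if lam < minlen:
            return 0.0
    return lam


def train(f, grad, w, eps=1e-5, max_iter=200):
    """Conjugate gradient descent restarted every T = len(w) iterations."""
    T = len(w)
    d = [0.0] * T              # initial search direction is zero
    g_prev = None
    t = 0                      # decision variable counter
    for n in range(max_iter):
        g = grad(w)
        if dot(g, g) ** 0.5 < eps:      # gradient convergence criterion
            return w, n
        t = t + 1 if t < T else 1       # restart cycle of length T
        alpha = 0.0 if t == 1 else dot(g, g) / dot(g_prev, g_prev)
        d = [-gi + alpha * di for gi, di in zip(g, d)]
        slope = dot(g, d)
        if slope >= 0:                  # not a descent direction: restart
            d, t = [-gi for gi in g], 1
            slope = dot(g, d)
        lam = line_search(f, w, d, slope)
        if lam == 0.0:
            return w, n
        w = move(w, d, lam)
        g_prev = g
    return w, max_iter


def error(w):
    return (w[0] - 1.0) ** 2 + 2.0 * (w[1] + 2.0) ** 2


def error_grad(w):
    return [2.0 * (w[0] - 1.0), 4.0 * (w[1] + 2.0)]


weights, updates = train(error, error_grad, [0.0, 0.0])
print(weights, updates)
```

On this quadratic the parabolic estimate is the exact minimizer along each direction, so the sketch reaches the minimum near (1, -2) after a handful of weight updates; on a real network error surface the backtracking loop would do more of the work.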

INCIDENT DETECTION RESULTS USING VARIOUS APPROACHES
As discussed in the companion paper [14], the actual data obtained from several state departments of transportation, including Minnesota DOT, were not sufficient to train the classifiers and the neural network. Consequently, the results presented in this section are based on simulated data using TSIS/CORSIM developed by ITT Systems and Sciences Corporation (http://www.fhwa-tsis.com). Three types of traffic data are used and investigated: traffic volume, traffic occupancy, and average vehicle speed.
Deciding on the data polling frequency, that is the data-recording interval, is crucial in developing an automated freeway incident detection and management system.
If the interval is very small, say 5 sec., then the change in the traffic data per interval may not be noticeable and the hardware and computational cost can become prohibitively high. The increase in the computational cost will be due to an increase in the size of the network as well as the required number of training instances. On the other hand, if this interval is made large, say 5 minutes, then it will take a relatively long time to detect the incident and take appropriate recovery measures such as re-routing the traffic or providing emergency medical assistance. A data polling period of 20-40 sec is commonly used in automatic traffic incident detection models. We have used 30-sec intervals for the simulated traffic data.
The distance between the sensors also affects the incident detection rate and especially the time to detect the incident. If the distance is too small, say a couple of hundred meters, the number and cost of sensors needed to cover the same segment of the freeway will increase. On the other hand, if this distance is too large, say a few kilometers, the sensors will take a long time to detect the incident and may not detect small incidents at all. The appropriate distance appears to be in the range of 2000-3000 ft (600-900 m). The lower end of the range can be used for the critical sections of the freeway where the probability of incident occurrence is high, such as before the exit ramp and after the entry ramp, or where there is a reduction in the number of lanes. These are

LDA
We will investigate the application of linear discriminant analysis in two different ways, as a linear classifier and as a feature enhancer. As a linear classifier, it is applied to all the data series simultaneously using a single data point from each data series, without using neural networks. As a feature enhancer, it is applied to each data series separately and the resulting traffic parameter values are used as input to the neural network model.

ACGNN
In this work we will investigate various combinations of different traffic data series such as traffic volume, occupancy and average vehicle speed at upstream and downstream stations. Parametric studies will be performed to find out the most effective combination of the traffic data series. Table 3 shows the results of traffic incident detection using three different types of traffic data and their combinations employing the ACGNN learning model. It is observed that the combination of all three parameters yields the best incident detection rate of 91.1% and the lowest false alarm rate of 5.1%. But the results are only slightly better than those obtained from the combination of the traffic volume and occupancy, with the corresponding numbers of 88.9% and 5.1%. Considering the fact that the three-parameter traffic data input increases the number of nodes in the input and hidden layers by a factor of 1.5 and the number of links (and the unknown weights) connecting the hidden layer to input and output layers by a factor of $1.5^2 = 2.25$, we will choose the traffic volume and occupancy as the input parameters for the final incident detection algorithm.
The results presented in Table 3 show that the ACGNN is superior to the combination of DWT and LDA (Table 2). However, the 5.1% rate of false alarm is still too high. This can be explained by the fact that the incident and incident-free domains are not easily separable using the original unfiltered data. Table 4 shows the incident detection results employing the ACGNN algorithm after the filtering and preprocessing of data by DWT and LDA using the traffic volume and occupancy as input data. As explained in Samant and Adeli [14], the traffic data are first filtered using DWT and multi-resolution analysis and the high-resolution components are discarded. The low and medium resolution components are found to be sufficient for representing the traffic flow.

DWT, LDA, and ACGNN
After the wavelet transform is performed, the resulting data can be applied to LDA or ACGNN in two different ways. Wavelet transform coefficients can be used directly as the input to LDA or ACGNN. Alternatively, the traffic signal can be regenerated using an inverse of the DWT and setting the high-resolution coefficients equal to zero. The results for both cases are shown in Table 4. The two methods yield comparable results. Re-generating the traffic signals is an additional and unnecessary computational burden. While the wavelet transform coefficients have no physical significance their use is adequate and therefore recommended for computational efficiency.
It is observed that the new computational model for traffic incident detection based on preprocessing of the traffic data by DWT and LDA followed by application of the ACGNN yields a high incident detection rate of 97.8% and a low false alarm rate of around 1%. Further, the mean time for detection is about 38 seconds.
The traffic data obtained from Minnesota DOT included only two incidents over a 150-min. period. We used these data to test the new incident detection model trained using the simulated data. The model detected both incidents with time to incident detection of less than a minute.

EFFECT OF DATA FILTERING USING DWT
In order to see the effect of DWT on improving the performance, the raw upstream and downstream traffic volume data obtained from Minnesota DOT as well as the data filtered by DWT are shown in Figures 2 and 3, respectively. These figures show that the incident and incident-free regions are more distinct after the data are filtered using DWT. This helps the neural network model classify the incident and incident-free regions more effectively, resulting in better incident detection and low false alarm rates. Further, this helps improve the convergence of the ACGNN learning model substantially, as shown in Figure 4.

EXTRACTION
Our feature extraction model is a two-step algorithm consisting of DWT and LDA. In order to investigate their relative contribution in feature extraction, we also used DWT as the sole feature extractor. The results are shown in Table 5. A comparison of the data in Tables 4 and 5 indicates that most of the feature extraction capability is due to DWT; LDA has a smaller contribution toward improving the incident detection. One can say it has a fine-tuning effect for reducing the false alarm rates.

EFFECTS OF FREEWAY GEOMETRY ON THE INCIDENT DETECTION
In order to show the efficacy and robustness of the new incident detection algorithm in various situations we performed a parametric study. To investigate the effect of various geometric changes on the incident detection algorithm, we used 65 incident test runs with minimum incident duration of 5 minutes and minimum traffic flow of 50% of the freeway capacity. Selected results of this study are presented here.

Effect of Curvature
Freeway geometric features such as grade, super-elevation, curvature, and pavement conditions do not affect the incident detection algorithm directly. They may have an indirect effect. For example, an incident on a curved freeway often causes more congestion than a similar incident on a straight segment. As a result, smaller duration incidents can cause sufficient congestion to get detected by the incident detection algorithm. As an example, Figure 5 displays an instance 45 seconds after a simulated incident on a curved freeway. Comparing the results obtained for a curved freeway segment with those obtained for the straight freeway segment in Table 6, it is concluded that the curvature does not have an appreciable effect on the incident detection and false alarm rates of the incident detection model. However, the detection time for the curved segment is lower than that for the straight segment, because the freeway gets congested faster.

Effect of Number of Lanes
The number of lanes in a freeway also affects the incident detection time and the detection rate of the incident detection algorithm. For similar incidents, having similar blockage characteristics as well as duration, the percentage changes in the traffic parameters are smaller for a larger freeway. Consequently, it takes more time to detect an incident as the number of lanes increases. An example of an incident on a five-lane freeway is shown in Figures 6(a) to 6(c). Figure 6(a) shows the traffic pattern 45 seconds after the incident. Normally, this type of incident involving a lane blockage on a two-lane freeway (in one direction) gets detected within this time range. But for an incident on a five-lane freeway (in one direction) two to four minutes may be required to detect the same. Figure 6(b) shows the traffic pattern 3 minutes after an incident. Figure 6(c) displays the incident characteristics.
A small-duration incident may not get detected at all. Thus, the number of lanes affects the detection rate of an incident detection algorithm. The detection rate computed for a five-lane freeway is about 94% and the average detection time is 2 minutes and 47 seconds. The false alarm rate remains practically the same. Figures 7(a) and 7(b) show the effect of the size of the freeway (number of lanes) on the incident detection rate and time for detection, respectively. It is observed that the change in the detection rate and time is much higher for ACGNN using raw data than for the ACGNN using data filtered by DWT and LDA.

CONCLUSION
In this and the companion papers, we presented a robust incident detection computational model and algorithm through adroit integration of three different computational approaches/disciplines: signal processing and wavelet transform, statistical linear discriminant analysis, and artificial neural networks. For incidents with duration of more than five minutes, the algorithm yields a detection rate of nearly 100% and false alarm rate of about 1% for two- or three-lane freeways. For incidents with duration of less than 5 minutes, the incident detection rate for two- or three-lane freeways is about 98% with a false alarm rate of about 1%.
For four-lane and five-lane freeways, the detection rate is reduced to 96% and 94%, respectively, but the false alarm rate remains around 1%. It is also observed that the freeway curvature does not affect the performance of the algorithm.
There is one type of incident that the new algorithm cannot detect, that is the so-

Engineering, Vol. 120, No. 5, 1994, pp. 753-771.
10. Kollias, S. and Anastassiou, D., "An adaptive least-squares algorithm for the efficient training of artificial neural networks", IEEE Transactions on Circuits and Systems, Vol. 36, No. 8, 1989, pp. 1092-1101.
11. Nocedal, J., "The performance of several algorithms for large scale unconstrained optimization", in Coleman T.F., and Li, Y., Eds.

membership functions. They test the model using an "empirical data base collected in Toronto, Canada". They report detection rates in the range of 54% (with a false alarm rate of 0%) to 90% (with a false alarm rate of 7.9%).

CAPTIONS FOR FIGURES
A main reason for the unreliability of traffic incident detection algorithms is the noise in the traffic data. In other words, the traffic data are often corrupted as they are collected by sensors and then transmitted to a central processing station. To eliminate false alarms, an effective traffic incident detection algorithm must be able to extract the features of the traffic patterns that are related to the incident. A robust feature extraction algorithm also helps reduce the dimension of the input space for a neural network model without any significant loss of related traffic information, resulting in a substantial reduction in the network size, effect of random traffic fluctuations, number of required training samples, and computational resources required to train the neural network. Samant and Adeli (2000) present an effective traffic data de-noising and feature extraction model using discrete wavelet transform (DWT) and linear discriminant analysis (LDA). The DWT is first applied to raw traffic data and the finest resolution coefficients representing the random fluctuations of traffic are discarded. Next, LDA is applied to the filtered signal for further feature extraction and for reducing the dimensionality of the problem. The results of LDA are used as input to a neural network model for traffic incident detection.
In this article it is shown that the performance of a fuzzy neural network algorithm can be improved through preprocessing of data using a wavelet-based feature extraction model. In particular, the DWT de-noising and feature extraction model proposed by Samant and Adeli (2000) is combined with an existing fuzzy neural network approach. It is shown that substantial improvement can be achieved using the data filtered by DWT.

DISCRETE WAVELET TRANSFORM
The wavelet transform is found to be an effective tool in signal and image processing due to its attractive properties such as time-frequency and multi-resolution analysis (Daubechies, 1992; Jameson et al., 1996; Mallat, 1998). Using these properties one can extract the desired features from an input signal characterized by certain local properties in time and space. A feature extraction approach using the wavelet transform is used to achieve a higher level of accuracy in the decision making process by the fuzzy neural network algorithm. The details of the feature extraction model for the traffic incident detection problem are presented in Samant and Adeli (2000). The basic idea is briefly described here in non-mathematical terms.
We represent scaling and wavelet functions (Daubechies, 1992; Farge et al., 1993), respectively, and I is a positive integer. We use the Daubechies wavelet function as it is found to be quite effective in digital signal processing. The value of I is chosen such that the desired level of resolution is obtained, and j = 1, 2, ..., I. The coordinates p and A are, in fact, the low and high-resolution coefficients of the given data series x[k], respectively.
The coordinates of the wavelet bases (ps and As) are computed using a concept called the quadrature mirror filters (Wickerhauser, 1994).
To extract the traffic incident pattern from the traffic data we perform multi-resolution analysis of the wavelet transforms of traffic patterns. Multi-resolution analysis involves dividing the original signal (e.g. traffic volume or occupancy) into signals having different frequencies and time localizations and analyzing the signal in different scales.

ARCHITECTURE
The architecture of the enhanced incident detection model is represented schematically in Figure …
… i = 1, 2, where $x_i$ is the degree of membership value for the ith output membership function.

Using these degree of membership values a crisp output is obtained by using a "center of

TRAINING OF THE NETWORK
The training of the network requires finding the means ($m_{ij}$, $m_j$) and variances ($\sigma_{ij}$, $\sigma_j$) of the input and output membership functions. Shortcomings of the BP learning algorithm, such as a very slow rate of learning and trial-and-error problem-dependent selection of learning and momentum ratios, have been discussed in the recent literature (Adeli and Hung, 1994). Since the objective of this article is to demonstrate how a fuzzy neural network incident detection model can be improved through a DWT feature extraction model, we use the same feed-forward BP learning rule as the original fuzzy neural network model to train the neural network.
The training is initialized by providing the desired initial ranges of input and output fuzzy partitions in the form of means and variances of the membership functions.
For example, for occupancy initial mean values of 0%, 50%, and 100% are provided for the three linguistic variables low, medium and high, with a variance value of 30% for each one. The initialization is done such that the linguistic variables cover the feasible region of the corresponding input/output space uniformly.
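The initial occupancy partition described above can be sketched as follows. The Gaussian form of the membership function and the treatment of the 30% value as its spread parameter are assumptions made for illustration; the names `membership`, `fuzzify`, `OCCUPANCY_SETS`, and `SPREAD` are hypothetical.

```python
import math

# Initial means (percent occupancy) for the three linguistic variables.
OCCUPANCY_SETS = {"low": 0.0, "medium": 50.0, "high": 100.0}
SPREAD = 30.0  # assumed spread parameter, from the 30% value in the text


def membership(x, mean, spread):
    """Gaussian-shaped degree of membership in [0, 1] (assumed form)."""
    return math.exp(-((x - mean) / spread) ** 2)


def fuzzify(occupancy):
    """Degree of membership of an occupancy value in each fuzzy set."""
    return {label: membership(occupancy, mean, SPREAD)
            for label, mean in OCCUPANCY_SETS.items()}


degrees = fuzzify(10.0)
print(max(degrees, key=degrees.get))  # low
```

Training would then adjust the means and spreads away from these initial values.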

After the initialization the mean and variance values are obtained by minimizing … in the following form: … where $\eta$ is the so-called learning rate parameter. Using Eqs. (6), (7) and (8)

FILTERING OF TRAFFIC DATA USING DWT
The raw traffic data are obtained through simulation of freeway traffic flow using the TSIS/CORSIM simulation package (http://www.fhwa-tsis.com). The traffic flow parameters (traffic volume, occupancy and vehicle speed) are recorded at 30-second intervals. DWT is then applied to each of the traffic data series separately. Eight-minute traffic patterns yielding 16 data points are used at a time for the filtering process. DWT divides the signal into two parts: a high-resolution signal and a low-resolution signal. Thus, a single stage DWT produces 8 high-resolution data points and 8 low-resolution data points. The high-resolution data points are discarded as they mainly represent the random fluctuations in the traffic. DWT is again applied to the remaining 8 low-resolution data points to obtain 4 medium-resolution and 4 low-resolution data points. The traffic signal is then regenerated using these medium and low-resolution data points, which carry the incident related information. This process is called multi-resolution analysis (MRA) as it extracts the signals having different resolutions. The new filtered signal is used as a direct input to the neural network. The linear discriminant analysis used in Samant and Adeli (2000) is not needed here for feature extraction as the means and variances of the traffic data are incorporated in the form of membership function parameters of the fuzzy sets.
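The filtering procedure described above can be sketched for a single 16-point pattern. The article uses a Daubechies wavelet; to keep the example short and free of external libraries, the sketch below substitutes the Haar wavelet, which changes the numerical values but not the structure of the two-stage decomposition. The function names are hypothetical.

```python
import math

ROOT2 = math.sqrt(2.0)


def haar_step(x):
    """One DWT stage: split x into low- and high-resolution halves."""
    pairs = [(x[2 * i], x[2 * i + 1]) for i in range(len(x) // 2)]
    low = [(a + b) / ROOT2 for a, b in pairs]
    high = [(a - b) / ROOT2 for a, b in pairs]
    return low, high


def haar_inverse(low, high):
    """Inverse of haar_step: rebuild the signal from its two halves."""
    x = []
    for a, d in zip(low, high):
        x.extend([(a + d) / ROOT2, (a - d) / ROOT2])
    return x


def filter_pattern(x):
    """De-noise a 16-point pattern by discarding the high-resolution part."""
    low8, high8 = haar_step(x)        # 8 low + 8 high-resolution points
    low4, medium4 = haar_step(low8)   # 4 low + 4 medium-resolution points
    rebuilt8 = haar_inverse(low4, medium4)
    # Regenerate the signal with the high-resolution coefficients set to zero.
    return haar_inverse(rebuilt8, [0.0] * len(high8))


pattern = [float(i % 5) for i in range(16)]   # made-up 8-minute pattern
filtered = filter_pattern(pattern)
print(len(filtered), filtered[:4])
```

With the Haar wavelet the filtered signal is simply the average of each pair of consecutive readings, repeated twice; a longer wavelet such as Daubechies produces a smoother result.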

INCIDENT DETECTION RESULTS
The fuzzy wavelet neural network is trained using the data obtained from 32 simulation runs, 25 of which include an incident. The network was then tested using 45 new simulated lane-blocking incidents on freeways with different numbers of lanes.
Figures 3a to 3c show the learned membership functions for traffic volume, occupancy and vehicle speed, respectively, for a two-lane freeway (in one direction) using the fuzzy wavelet neural network. We obtained similar curves when the data were not filtered by DWT. Table 1 shows the incident detection results for a two-lane freeway (in one direction) using the fuzzy wavelet neural network model as well as the original fuzzy neural network model. Use of the wavelet theory to de-noise the traffic data increases the incident detection rate from 86.7% to 97.8%, reduces the false alarm rate from 5.3% to 1.8%, and reduces the incident detection time from 63.6 seconds to 48.9 seconds. Figure 4 shows the training convergence curve with and without DWT. It is observed that use of DWT improves the convergence of the training algorithm substantially.

… wavelet, and neural computing techniques to improve reliability and robustness. A wavelet-based de-noising technique is employed to eliminate undesirable fluctuations in observed data from traffic sensors. Fuzzy c-means clustering is used to extract significant information from the observed data and to reduce its dimensionality. A radial basis function neural network is developed to classify the de-noised and clustered observed data. The new model produced excellent incident detection rates with no false alarms when tested using both real and simulated data.

INTRODUCTION
According to one estimate about 60 percent of the total vehicle-hours of delay on urban freeways is caused by traffic incidents (Lindley, 1987). In most urban areas the situation is worsening with increasing traffic and limited expansion of the existing highway infrastructure. In fact, most major urban freeways regularly operate at levels above their design capacities.
The Intermodal Surface Transportation Efficiency Act of 1991 and the National Highway System Designation Act of 1995 recognize the significance of the situation and require all urban areas with populations greater than 200,000 to implement a congestion management system (Cottrell, 1998). A number of major U.S. cities already have a freeway management system in place with remote detection of traffic characteristics and a central operations center. However, few make use of an automatic incident detection algorithm for rapid identification and localization of incidents. In most cases, detection of incidents is done by human operators monitoring video camera outputs and/or from information obtained from the news media.
Considerable research has been done on the development of traffic incident detection algorithms in the past three decades. The lack of their widespread use is primarily due to their unreliability. In the simplest case, incident detection is a classification problem with two desired output classes: incident detected and no incident detected. The misclassification of an incident into no incident detected and of no-incident conditions into incident detected (false alarm) reduces the reliability of the algorithm and makes it less effective for general use.
In this article, we present a new systematic approach to the traffic incident detection problem employing advanced signal processing, pattern recognition, and classification techniques. The developed model judiciously integrates fuzzy logic, wavelet theory, and neural network computation techniques into an efficient, reliable, and robust algorithm. One key feature of the new model is noise elimination and signal enhancement to improve detection and reduce false alarms. The collection and transmission of data introduces random noise that masks the observed signal and throws off any algorithm based on them. We present an advanced de-noising technique based on wavelet theory to overcome this problem and improve the efficiency and effectiveness of the algorithm.

INCIDENT DETECTION ALGORITHMS
Several algorithms have been suggested over the years for automatic freeway incident detection based on traffic data obtained from fixed detectors. The traffic characteristics obtained from these detectors and commonly used as input for the algorithms are the traffic occupancy (the fraction of time a location is occupied by a vehicle expressed as a percentage), flow rate (the number of vehicles passing a location in unit amount of time), and speed.
The approaches used for the incident detection algorithms range from simple magnitude comparisons to model-based predictions. The California algorithm is a popular algorithm that compares temporal and spatial occupancy data to predetermined thresholds in its algorithm logic. The thresholds are calibrated for each on-line implementation based on the trade-off desired between the detection rate and false alarm rate.
The California algorithm is an example of a multi-detector, comparative algorithm. On the other hand, the McMaster algorithm (Persaud et al., 1990) is a single detector algorithm that is based on a catastrophe theory model of the traffic flow. The traffic model partitions the flow rate-occupancy behavior among different traffic states. This information is then used in the algorithm logic together with the speed data to detect the onset of congestion due to a traffic incident.
Traffic data usually exhibit sudden and large changes in magnitude that reduce the reliability of algorithms. Statistical techniques for preprocessing the raw data have been proposed in the past (Dudek et al. 1974; Cook and Cleveland 1974; Ahmed and Cook, 1982; Stephanedes and Chassiakos 1993). More recently research has concentrated on model-free intelligent systems approaches to the solution of the incident detection problem. These algorithms are either based on fuzzy logic theory (Weil et al. 1998), neural network techniques (Amin et al., 1998), or hybrid fuzzy logic and neural network approaches (Geng and Lee, 1998). Fuzzy logic theory provides a tool for reasoning about complex systems that effectively utilizes imprecise and linguistic input (Zadeh, 1978). A fuzzy expert system approach has also been proposed for the incident detection problem. The idea is to build a fuzzy knowledge base from the raw data in the form of fuzzy rules that are then processed by a fuzzy inference system to identify and classify the relevant traffic states. The performance analysis and system output validation. The authors, however, do not present any numerical results.
A judicious combination of AI techniques and a multi-paradigm approach has the best potential to provide an effective solution to the incident detection problem. Work during the past 30 years on developing a model-based solution, either mathematical or symbolic, has not produced reliable solutions that can be adopted widely in practice. Currently available algorithms can miss up to 30 percent of incidents and can produce false alarms in a fraction of a percent of tests. These performance indicators may look good, but when the algorithm is implemented on an urban freeway management system with hundreds or even thousands of detector stations it can produce an unacceptable number of missed detections and false alarms. As a result, the total cost of operation of these algorithms in a practical environment is often too high to justify their deployment. The primary reason for the poor performance of incident detection algorithms is the complexity of the problem, which does not lend itself to accurate conventional mathematical and knowledge-based representation. On the other hand, ANN techniques are self-organizing and learn from examples. However, it is imprudent to ignore known behavior of traffic flow completely. Our new approach, to be described subsequently, is based on a judicious integration of various problem-solving paradigms.
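The scale effect mentioned above can be made concrete with a short calculation. The numbers used here (a false alarm rate of 0.1% of tests, 500 detector stations, and one test per station every 30 seconds) are hypothetical values chosen only for illustration, and the function name is likewise an assumption.

```python
def daily_false_alarms(false_alarm_rate_percent, n_stations, interval_sec=30):
    """Expected number of false alarms per day across all stations,
    assuming one incident test per station per polling interval."""
    tests_per_day = 24 * 3600 // interval_sec   # 2880 tests at 30 s
    return false_alarm_rate_percent / 100.0 * tests_per_day * n_stations


# Hypothetical system: 0.1% false alarm rate, 500 stations.
print(round(daily_false_alarms(0.1, 500)))  # 1440
```

Under these assumptions a rate that looks small per test corresponds to roughly one false alarm per minute for the operators of the system.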

Basic Concept
Wavelet analysis is a transformation method in which the original signal is transformed into and represented in a different domain that is more amenable to analysis and processing. The signal is expanded over a family of functions generated by scaling and translating a single function $\psi$, called the mother or generating wavelet. The integers j and k represent the scaling and translation values, respectively. In most practical uses, the scaling in Eq. (2) is done in powers of two. For this dyadic formulation Eq. (2) can be rewritten as
$$\psi_{j,k}(t) = 2^{j/2}\,\psi(2^{j}t - k)$$
When an orthonormal basis is used as the expansion set, the coefficients of the expansion can be computed by an inner product of the signal with the corresponding wavelet:
$$d_{j,k} = \langle x(t), \psi_{j,k}(t) \rangle = \int x(t)\,\psi_{j,k}(t)\,dt$$

Multiresolution Analysis
Multiresolution analysis provides a powerful framework for analyzing functions at various levels of detail or resolution (Mallat, 1989). Multiresolution analysis entails a sequence of nested closed approximation subspaces $V_m$ ($m \in Z$) for which there exists a scaling function $\phi \in V_0$ such that $\phi_{0,k}$ ($k \in Z$) forms a basis of $V_0$. The subspaces are related by $V_{j+1} = V_j \oplus W_j$, where $W_j$ is the space spanned by $\psi_{j,k}$ and $\oplus$ represents a direct sum. This means that, starting from a representation of a function belonging to a coarse subspace, higher detail or resolution can be obtained by adding spaces spanned by $\psi_{j,k}$ at a higher resolution (i.e., given by the next higher value of j).
The function x(t) can then be represented as
$$x(t) = \sum_k c_{j_0,k}\,\phi_{j_0,k}(t) + \sum_{j \ge j_0} \sum_k d_{j,k}\,\psi_{j,k}(t)$$
where the first term is a coarse resolution at scale $j_0$, and the second term adds details of increasing resolutions.

Computation of the DWT
In practical wavelet analysis of discrete signals we usually do not have to deal with the functions themselves but instead work with discrete coefficients. If $\{\phi_{j,k}\}$ and $\{\psi_{j,k}\}$ form an orthonormal basis of $L^2(R)$, which is true for most wavelet systems used in practice, the expansion coefficients $c_{j,k}$ and $d_{j,k}$ can be found by taking the inner products of the basis functions and the original signal. Using the properties of the wavelet system, Eq. (4) can be written in terms of the coefficients as follows (Burrus et al., 1998):

$$c_{j,k} = c_j[k] = \sum_m h_0[m - 2k]\,c_{j+1}[m] \qquad (13)$$
$$d_{j,k} = d_j[k] = \sum_m h_1[m - 2k]\,c_{j+1}[m] \qquad (14)$$

The sequences $h_0$ and $h_1$ are called filter coefficients, whose values are known for each type of wavelet system that may be used for analysis. The initial scaling coefficients $c_j$ are taken equal to the original discrete signal. Equations (13)-(14) provide a recursive way to compute the DWT of a signal. Note that these computations have a finite time complexity because the filter coefficients are of finite length. The inverse DWT is used to reconstruct the signal from the wavelet coefficients using Eq. (12). In this work we use the Daubechies wavelet system of length 8 (Daubechies, 1992). A more detailed coverage of DWT and its computation is available in the literature.
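The recursion for the scaling and wavelet coefficients can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this work: it uses the shorter Daubechies length-4 filter as a stand-in for the length-8 filter, it wraps the signal periodically at the ends, and the function names are hypothetical.

```python
import numpy as np

# Daubechies length-4 scaling filter (stand-in for the length-8 filter in the text).
_s3 = np.sqrt(3.0)
H0 = np.array([1 + _s3, 3 + _s3, 3 - _s3, 1 - _s3]) / (4 * np.sqrt(2.0))
# Wavelet filter from the quadrature-mirror relation h1[n] = (-1)^n h0[N-1-n].
H1 = np.array([(-1) ** n * H0[len(H0) - 1 - n] for n in range(len(H0))])

def analysis_step(c_fine, h):
    """One stage: out[k] = sum_m h[m - 2k] * c_fine[m], with periodic wrap-around."""
    n = len(c_fine)
    out = np.zeros(n // 2)
    for k in range(n // 2):
        for i, h_i in enumerate(h):  # i plays the role of m - 2k
            out[k] += h_i * c_fine[(2 * k + i) % n]
    return out

def dwt(signal, levels):
    """Return (coarsest scaling coefficients, wavelet coefficients per stage, finest first)."""
    c = np.asarray(signal, dtype=float)
    details = []
    for _ in range(levels):
        details.append(analysis_step(c, H1))  # wavelet coefficients d_j
        c = analysis_step(c, H0)              # scaling coefficients c_j
    return c, details

x = np.sin(np.linspace(0.0, 3.0, 16)) + 0.1 * np.cos(np.arange(16))
c2, d = dwt(x, levels=2)
print(len(c2), [len(level) for level in d])
```

For a 16-point signal and two stages, the sketch yields 4 scaling coefficients and 8 + 4 = 12 wavelet coefficients. Because the filters are orthonormal, the sum of squared coefficients equals the sum of squared signal values.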

SELECTION OF TYPE AND NUMBER OF TRAFFIC DATA
It is important to carefully choose the number, type, and format of input data to be used for the incident detection algorithm. Most currently used sensors provide the speed, the occupancy, and the flow rate values at a given location every 20 or 30 seconds. Therefore, the choice for the type of traffic data has to be restricted to these three types. From these three data types only those that exhibit consistently identifiable patterns for incident and non-incident traffic flow conditions should be selected.
In this work, a pattern consists of a time-history of data rather than a single-time data value. This pattern preserves the temporal nature of traffic flow and makes it easier to distinguish between patterns produced by incident and non-incident conditions. The distinguishing feature adopted in this work is the shape of the time-history and not any particular magnitude.
To achieve this, each pattern is normalized to eliminate the effect of data magnitudes on the classification process. This approach also eliminates algorithm calibration and transferability issues caused by location-specific conditions and temporal traffic flow variations. A single-station non-comparative approach is adopted in this research. This decision is based on the analysis of patterns on both the upstream and downstream side of an incident. The upstream and downstream patterns produced by an incident do not develop at the same time. Therefore, mixing them reduces the reliability of the algorithm. Furthermore, using patterns from adjacent stations makes the algorithm dependent on several factors such as incident characteristics, distance between stations, and existence of on- and off-ramps in between the stations. The result is calibration problems and poor performance of the algorithm.
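The effect of normalizing a pattern can be illustrated with a short sketch. The normalization formula is not given in this section, so the min-max scaling below is an assumption made for illustration, and the function name and sample values are hypothetical.

```python
import numpy as np

def normalize_pattern(values):
    """Scale a time-history to [0, 1] so only its shape remains (assumed min-max scheme)."""
    x = np.asarray(values, dtype=float)
    span = x.max() - x.min()
    if span == 0.0:
        return np.zeros_like(x)  # a flat series carries no shape information
    return (x - x.min()) / span

# Two hypothetical occupancy histories with the same shape but different magnitudes.
low = [10, 10, 12, 20, 30]
high = [40, 40, 48, 80, 120]
print(normalize_pattern(low))
print(normalize_pattern(high))
```

Both series map to the same normalized pattern, which is why a classifier trained on shape does not need to be recalibrated for locations with different traffic volumes.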
The speed and occupancy upstream of a capacity-reducing obstruction are found to exhibit the most significant and consistent change, relatively independent of the flow rate (Figure 2).

If the noise consists of localized high-frequency components in a predominantly low-frequency signal, then the signal can be de-noised by the following procedure. Take the DWT of the signal, selectively discard the higher scale coefficients, and then reconstruct the signal by taking the inverse DWT. This technique is neither optimal nor automatic enough for use in a real-time intelligent system environment. In particular, no definite criteria are available to determine which wavelet coefficients to discard in order to produce the best results.

In recent years, formal wavelet-based de-noising techniques have been presented in the literature (Polchlopek and Noonan, 1997; Donoho, 1993, 1995). In these techniques the wavelet coefficients are filtered and the inverse DWT is performed using the scaling and the filtered wavelet coefficients. The resulting signals will be cleaner versions of the original corrupted signal.

Data clustering techniques extract significant features from data based on given criteria.
The goal is to reduce the dimensionality of the data without losing important information needed for a particular problem. Dimensionality reduction is needed to reduce data processing complexity and increase robustness and efficiency. The fuzzy c-means (FCM) clustering algorithm (Bezdek, 1981; Cannon et al., 1986) performs a fuzzy partitioning of the data set into classes. This is in contrast to crisp partitioning: the membership of a vector in a given class is determined by its membership grade. In a general FCM formulation the membership grades are also optimization variables.
However, this formulation leads to a non-convex optimization problem that does not always produce a global optimal solution (Al-Sultan and Fediki, 1997). When using an iterative procedure for solving the optimization problem we use a membership grade function based on the Euclidean norm (Bezdek, 1981) and use the FCM algorithm in the following form.

1. Select an initial fuzzy c-partition by setting up the membership grades such that Eq. (16) is satisfied. Select a value for p > 1. Set the iteration counter t = 0.
2. Calculate the class centers for the traffic pattern X.
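The iteration outlined in these steps can be sketched as follows. This is a minimal illustration that assumes the standard Bezdek update rules (weighted-mean class centers and inverse-distance membership grades); the stopping rule, random initialization, function name, and test data are hypothetical choices, not taken from this report.

```python
import numpy as np

def fuzzy_c_means(x, c, p=2.0, tol=1e-5, max_iter=100, seed=0):
    """Cluster x (n, dim) into c fuzzy classes; returns (centers (c, dim), grades (c, n))."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    u = rng.random((c, len(x)))
    u /= u.sum(axis=0)                      # grades of each vector sum to 1
    for _ in range(max_iter):
        w = u ** p
        centers = (w @ x) / w.sum(axis=1, keepdims=True)
        dist = np.linalg.norm(x[None, :, :] - centers[:, None, :], axis=2)
        dist = np.fmax(dist, 1e-12)         # guard against division by zero
        inv = dist ** (-2.0 / (p - 1.0))
        u_new = inv / inv.sum(axis=0)       # membership grades from Euclidean distances
        converged = np.abs(u_new - u).max() < tol
        u = u_new
        if converged:
            break
    return centers, u

# Hypothetical data: two well-separated groups of 2-D points.
data = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]]
centers, grades = fuzzy_c_means(data, c=2)
print(np.round(centers, 2))
```

Each column of the grade matrix sums to one, so every vector is shared among the classes instead of being assigned to a single class.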

The radial basis function neural network (RBFNN) learns an input-output mapping by covering the input space with basis functions that transform a vector from the input space to the output space (Poggio and Girosi, 1990). Conceptually, the RBFNN is an abstraction of the observation that biological neurons exhibit a receptive field of activation such that the output is large when the input is closer to the center of the field and small when the input moves away from the center. Structurally, the RBFNN has a simple topology with a hidden layer of nodes having nonlinear basis transfer functions and an output layer of nodes with linear transfer functions. Figure 3 shows the topology of the RBFNN for the classification of traffic data into two states: incident and no incident. Therefore, only a single node in the output layer is required.
The input vector is denoted by x and the output is denoted by y. The number of input nodes is equal to N, which is the product of the number of clusters, c (equal to 4 in our test example), and the dimension of each cluster (equal to 2 when occupancy and speed are used, as in our example). The number of nodes in the hidden layer is equal to the number of cluster centers, $1 < N_h < N_p$, for the entire set of training instances, where $N_p$ is the total number of training instances. The cluster centers $\mu_i$ ($1 \le i \le N_h$) are obtained using the FCM algorithm.
The connection from the input node i to the hidden node j is assigned the weight $\mu_{ji}$ corresponding to the ith component of the vector $\mu_j$. Each hidden node produces an output that is a function of the Euclidean distance of the input vector x from the cluster center $\mu_j$. In this work, we use the Gaussian (bell-shaped) function as the transfer function for the hidden nodes, in which the factor $\sigma_j$ controls the spread or range of influence of the Gaussian function centered at $\mu_j$. The output y of the network is the weighted sum of the hidden node outputs, where $\lambda_j$ is the weight of the link from the hidden node j to the output node. An output value of 1 corresponds to an incident classification while a value of -1 corresponds to a no incident classification.
The variables $\lambda_j$ and $\mu_{ji}$ are found by training the neural network off-line. The FCM algorithm is used to obtain the $N_h$ cluster centers $\mu_j$ from the $N_p$ training instances x. The RBFNN is trained to find the weights $\lambda_j$ by minimizing the error between the network computed output y and the desired output $y_d$. In other words, to train the network for the weights $\lambda_j$ we solve an unconstrained optimization problem that minimizes this error. The gradient descent optimization algorithm is used to solve this optimization problem.
The spread parameters $\sigma_j$ can also be treated as variables. However, we found that there was no improvement in the performance of the classification when the spread parameter is allowed to adapt. At the same time, including the parameter in the learning process slows down the training. In this work, the spread parameter $\sigma_j$ is pre-assigned as one third of the mean distance between the cluster center j and all other cluster centers. In this way an adequate amount of overlap of the basis functions is achieved for classification purposes.
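The forward pass and the pre-assigned spread can be sketched as below. The exact Gaussian expression is not reproduced in this section, so the common form exp(-d^2 / (2 sigma^2)) is assumed here; the function names, cluster centers, and weights are hypothetical values for illustration.

```python
import numpy as np

def spread_parameters(centers):
    """sigma_j = one third of the mean distance from center j to every other center."""
    centers = np.asarray(centers, dtype=float)
    n = len(centers)
    dist = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    return dist.sum(axis=1) / (3.0 * (n - 1))  # self-distance is zero, so divide by n - 1

def rbf_hidden(x, centers, sigmas):
    """Gaussian activation of each hidden node (assumed form exp(-d^2 / (2 sigma^2)))."""
    d2 = ((np.asarray(centers, dtype=float) - np.asarray(x, dtype=float)) ** 2).sum(axis=1)
    return np.exp(-d2 / (2.0 * sigmas ** 2))

def rbf_output(x, centers, sigmas, weights):
    """Linear output node: weighted sum of the hidden-node activations."""
    return float(np.asarray(weights, dtype=float) @ rbf_hidden(x, centers, sigmas))

# Hypothetical cluster centers and output weights.
centers = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])
sigmas = spread_parameters(centers)
y = rbf_output([0.0, 0.0], centers, sigmas, weights=[1.0, -1.0, -1.0])
print(np.round(sigmas, 4), round(y, 4))
```

Fixing the spreads in advance leaves only the linear output weights to be learned, which keeps the training problem small.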

EXAMPLE
The new incident detection algorithm is tested using both simulated and real traffic data.
The simulated data is generated from the simulation software TSIS (Traffic Software Integrated System). To ensure that the incident patterns are consistent, they are extracted from the 800-second simulations such that the effect of the blockage is pronounced during the last few values of the sample. Figure 4 shows the normalized occupancy plots for two simulation runs. Note the similarity of the form of the two patterns. This pattern extraction is essential for robust classification. For the test example, the RBFNN learned the patterns with a cumulative mean square error of less than 0.003 in a few seconds on a Pentium II 400 MHz machine.

Testing of Algorithm Using Simulated Data
To test the algorithm, the output from the RBFNN is passed through a threshold, t, of 0.3. An output greater than or equal to 0.3 is classified as an incident. Otherwise, it is classified as a non-incident. The model is tested using the simulated data by presenting each of the ninety 800-second simulations as a continuous stream of data. An output is produced every 20 seconds after the first 320 seconds (16 data points). An incident is detected when the output becomes greater than or equal to the threshold for the first time. All 60 incidents were detected correctly during the testing of the model. Therefore, the detection rate is 100 percent. Also, none of the non-incident simulations or the incident simulations before the occurrence of the incident (a total of 360 patterns) were misclassified as an incident. Therefore, the false alarm rate is zero.
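The thresholding of a continuous output stream can be sketched as follows. The function name and the sample outputs are hypothetical; the threshold of 0.3, the 20-second interval, and the 16-point warm-up follow the description above.

```python
def first_detection(outputs, threshold=0.3, interval_s=20, warmup_points=16):
    """Return the elapsed time in seconds at the first output >= threshold, or None."""
    for i, y in enumerate(outputs):
        if y >= threshold:
            # The first output is available once warmup_points readings have arrived.
            return (warmup_points + i) * interval_s
    return None

# Hypothetical network outputs for one simulation run.
stream = [-0.9, -0.8, 0.1, 0.35, 0.9]
print(first_detection(stream))
```

Subtracting the known incident start time from the returned value gives the detection time for a simulated run.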
The time to detection tends to be somewhat large for flow rates less than the freeway capacity. Figure 6 shows the variation of the mean detection time of the algorithm with pre-incident flow rate and distance from the upstream detector station.

Testing of Algorithm Using Real Data
The I-880 database contains loop detector and incident data for a 14.8 km segment of freeway. Note that the model trained using simulated data is tested on both simulated and real data without modification. Also, the simulated data is available at a 20-second interval while the real data is sampled at a different interval.

Over the years researchers have developed numerous algorithms for the traffic incident detection (ID) problem (Cook and Cleveland, 1974; Ahmed and Cook, 1982; Chassiakos and Stephanedes, 1993; Lin and Daganzo, 1997; Ishak and Al-Deek, 1998). Changes in traffic conditions downstream of an incident are often regarded as less significant than those occurring upstream of the incident. It has been argued that an algorithm that uses only the downstream readings produces a high false alarm rate and has difficulty in distinguishing compression waves from incident-producing patterns (Weil et al., 1998).
This argument, however, is often based on using algorithms incapable of reliably distinguishing the patterns.
Recently, Adeli and Karim (2000) presented a computational model for automatic traffic incident detection using discrete wavelet transform, fuzzy logic, and neural networks. In their model, the upstream lane occupancy and speed time series data is adopted as the characterizing pattern for traffic state classification. The raw data is first de-noised by soft thresholding in the wavelet domain. Subsequently, the de-noised data is clustered by the fuzzy c-means technique to reduce data dimensionality and enhance feature separation. Finally, a radial basis function neural network is developed to reliably classify the de-noised and clustered pattern. The model is tested with both simulated and real traffic data, producing excellent incident detection and false alarm characteristics. However, the time to detection for the model is long and, depending on the traffic and incident characteristics, can be as large as 5 minutes.
In this article, a new traffic incident detection algorithm is presented that effectively distinguishes patterns produced by capacity-reducing incidents from those produced by compression waves.

Second, the selected patterns by and large should be independent of prevailing roadway and traffic conditions to avoid calibration problems. Third, the patterns should indicate an incident condition in less than one minute after the occurrence of an incident.

In this section, patterns in traffic data before, during, and after an incident are investigated to determine the most appropriate input for the incident detection algorithm. Note that raw traffic data are analyzed. The pattern identified from this analysis will be processed further to enhance desirable features. The data presented in this section are obtained from TSIS (http://www.fhwa-tsis.com), a traffic simulation software.

Single-Station Versus Two-Station Incident Detection Approaches
A capacity-reducing traffic incident will produce observable changes in flow conditions at the detector stations immediately upstream and downstream of the incident. In general, these changes consist of an increase in traffic congestion upstream and a decrease in traffic congestion downstream of the incident. Based on these observations, two different approaches, called the two-station comparative and single-station approaches, have been used to develop traffic incident detection algorithms. The single-station approach relies on data obtained from only one station while the two-station approach makes use of data from two adjacent stations.
The two-station comparative approach, exemplified by the California algorithm, employs both spatial and temporal data in its algorithm logic. The premise is that using spatial data will reduce false alarms that are produced as a result of changing roadway and traffic conditions because of the natural canceling effect of comparative analysis (Weil et al., 1998). The California algorithm is a simple threshold-based algorithm that uses only one flow parameter (occupancy). Also, because of its comparative approach it has to be calibrated at each station to optimize it for the particular roadway geometry.
The two-station comparative approach, in general, has several disadvantages even when advanced pattern recognition techniques are employed. Traffic incidents are temporal events whose effects develop over time both in the upstream and downstream directions. However, the characteristics of the traffic patterns developed in the upstream and downstream directions are different. Therefore, combining data from both stations is likely to produce less reliable detection of incidents because of the mixing of two different temporal patterns. Two-station comparative algorithms are also more difficult to calibrate because they are affected by the geometry of the roadway, the distance between the stations, the presence of on- and off-ramps, and the prevailing flow conditions.

DISCRETE WAVELET TRANSFORM AND SIGNAL ENERGY
The discrete wavelet transform (DWT) provides a powerful and efficient technique for analyzing, decomposing, de-noising, and compressing signals. In particular, the DWT of a signal breaks it down into several time-frequency components that enable the extraction of features desirable for signal identification and recognition. The DWT and wavelet theory in general have developed rapidly in the last 10 years (Daubechies, 1992; Burrus et al., 1998). In this section the basic concepts of DWT and its energy representation employed in this research are presented briefly. Additional details of DWT and its application in ITS problems are available in the literature. The scaling function $\phi(t)$ and the wavelet $\psi(t)$ satisfy the two-scale relations
$$\phi(t) = \sum_n h_0[n]\,\sqrt{2}\,\phi(2t - n), \qquad \psi(t) = \sum_n h_1[n]\,\sqrt{2}\,\phi(2t - n)$$
where $h_0$ and $h_1$ are filter coefficients and the constant $\sqrt{2}$ maintains the unity norm of the functions. In this work, the Daubechies wavelet system of order eight (Daubechies, 1992), defined by eight $h_1$ and $h_0$ coefficients, is used. This wavelet basis system is selected because of its orthonormality property and compact support, providing a DWT with a finite length and number of wavelet coefficients.
When an orthonormal basis is used, the coefficients $c_{j,k}$ and $d_{j,k}$ are given by the inner product of the signal with the appropriate function, which can be reduced to the following recursive equations (Burrus et al., 1998):
$$c_j[k] = \sum_m h_0[m - 2k]\,c_{j+1}[m], \qquad d_j[k] = \sum_m h_1[m - 2k]\,c_{j+1}[m]$$
In these equations it is assumed that the scaling coefficients of the signal at the highest resolution are known. The computation assumes a periodic boundary condition, but traffic data series, such as those shown in Figures 1 and 2, are not periodic. In other words, generally the end values $f[1]$ and $f[L]$ are not equal. As a result of the incompatibility of the traffic data with the periodic boundary condition, the wavelet representation can distort the shape of the original traffic pattern. To overcome this problem the traffic pattern is extended on either end before its DWT is found. This procedure is explained in detail in the next section.
An advantage of using an orthonormal basis to find the DWT of a signal is that the energy of the signal can be partitioned into its various time-frequency components. The energy contribution from each component is expressed as a function of the wavelet and scaling coefficients. This is known as Parseval's theorem and is expressed mathematically in the form of the following energy functional (Burrus et al., 1998):

$$\sum_i |f[i]|^2 = \sum_k |c_{j_0,k}|^2 + \sum_{j \ge j_0} \sum_k |d_{j,k}|^2 \qquad (10)$$
We use this functional to enhance the traffic data streams for the purpose of pronouncing the traffic incident patterns, as explained in the next section.

TRAFFIC PATTERN FEATURE ENHANCEMENT AND DE-NOISING
The length L of each data series now becomes 32 (i.e., $L = 2^J$ and J = 5). The need for extending the data series is shown in Figures 7a and b. Figure 7a shows a typical flow rate data series, $f_F[i]$ (solid line), on the downstream side of an incident and its scale 3 (i.e., j = 3) wavelet approximation (dashed line). Notice how the shape of the wavelet approximation is distorted at the left edge because of the periodic boundary condition assumption. Figure 7b shows the same data series extended using Eq. (11) (solid line) and its scale 3 wavelet approximation (dashed line). In this figure the wavelet distortion has been pushed aside to the outer edges, outside the usable region of data, the segment from data points 9 to 24. In this segment the basic shape of the original data series is preserved without distortions.
In the new traffic incident detection model, the DWT is employed to reduce the dimensionality of input data for the neural network pattern classifier, eliminate the traffic noise, and enhance the desirable features in each data series. The extended data series has a length of $2^5$ = 32 and is represented by scale J = 5 in Eq. (5). Equation (7) is applied two times recursively to calculate the scaling coefficients at scale j = 3. This operation corresponds to a two-stage low-pass filtering of $c_J[k]$ with $h_0$. At this reduced resolution the higher frequency noise-like components are eliminated, leaving a smoother de-noised shape or form. Also, through the two-stage low-pass filtering the 32-point time-series is now reduced to an 8-coefficient representation. However, this DWT is for the extended 32-point data series.
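The extension and two-stage low-pass filtering can be sketched as below. Two simplifications are assumed for illustration: the extension rule of Eq. (11) is not reproduced in this section, so the end values are simply repeated, and the two-tap Haar filter stands in for the Daubechies filter $h_0$. The function names and the sample series are hypothetical.

```python
import numpy as np

def extend_edges(series, pad=8):
    """Extend a series on both ends by repeating its end values (assumed rule)."""
    x = np.asarray(series, dtype=float)
    return np.concatenate([np.full(pad, x[0]), x, np.full(pad, x[-1])])

def lowpass_stage(c):
    """Haar low-pass filtering followed by downsampling by two."""
    return (c[0::2] + c[1::2]) / np.sqrt(2.0)

series = np.linspace(0.2, 0.8, 16)            # hypothetical 16-point traffic pattern
extended = extend_edges(series)               # 32 points, scale J = 5
coarse = lowpass_stage(lowpass_stage(extended))  # scaling coefficients at scale j = 3
print(len(extended), len(coarse))
```

With 8 points added on each side, the original readings occupy data points 9 to 24 of the extended series, so any edge distortion falls outside the usable region.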
The benefit of DWT-based de-noising and feature enhancement is demonstrated in Figures 8 and 9. Figure 8 is a scatter plot of $t_O[i]$ and $t_S[i]$ based on the same data used in Figure 3. Figure 9 is a scatter plot of $t_O[i]$ and $t_F[i]$ based on the same data used in Figure 6. Comparisons of Figure 3 with

PATTERN CLASSIFICATION USING RADIAL-BASIS FUNCTION NEURAL NETWORK
Neural networks are powerful model-free pattern classifiers. However, they can be computationally very expensive when the size or dimensionality of the input data is large, requiring a very large number of training instances. Training instances of the traffic patterns defined by Eqs. (13) and (14) are used to develop a mapping from an 8-dimensional space to a one-dimensional space. For this purpose, the radial basis function (RBF) neural network is adopted. The RBF neural network is an efficient universal classifier that has a simple topology consisting of a hidden layer of nodes with nonlinear transfer functions and an output layer of nodes with linear transfer functions.
The topology of the RBF neural network developed for the traffic pattern classification is shown in Figure 10. The input layer has 8 nodes corresponding to the eight data points in each pattern ($x_D[i]$ or $x_U[i]$, henceforth called vector x). The number of nodes in the hidden layer, $N_h$, is equal to the number of cluster centers used to characterize the input training space. The output layer has one node (y). The number of nodes in the hidden layer is chosen as a fraction of the total number of training instances. This choice is based on numerical experimentation to determine which number adequately covers the input space and produces the best mapping. We found a number within the range of 10 to 30% of the number of training instances to provide satisfactory results. The cluster centers $\mu_i$ ($1 \le i \le N_h$) are obtained using the fuzzy c-means algorithm (Bezdek, 1981; Cannon et al., 1986).
The connection from the input node i to the hidden node j is assigned the weight $\mu_{ji}$ corresponding to the ith component of the vector $\mu_j$. The output of a hidden node j is given by a Gaussian transfer function of the distance between x and $\mu_j$. The output weights are found by minimizing the error between the network output and the desired output. The gradient descent optimization algorithm is used to solve this optimization problem.
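Training the output weights by gradient descent can be sketched as follows. The error function is not reproduced in this section, so a mean squared error is assumed; the learning rate, iteration count, function name, and the small activation matrix are hypothetical values for illustration.

```python
import numpy as np

def train_output_weights(hidden, targets, lr=0.1, epochs=2000):
    """Fit linear output weights by gradient descent on the mean squared error.

    hidden:  (n_samples, n_hidden) Gaussian activations of the hidden nodes.
    targets: desired outputs, +1 for incident and -1 for no incident.
    """
    hidden = np.asarray(hidden, dtype=float)
    targets = np.asarray(targets, dtype=float)
    weights = np.zeros(hidden.shape[1])
    for _ in range(epochs):
        error = hidden @ weights - targets
        grad = hidden.T @ error / len(targets)  # gradient of mean(error**2) / 2
        weights -= lr * grad
    return weights

# Hypothetical activations for two training patterns and two hidden nodes.
activations = np.array([[1.0, 0.1], [0.1, 1.0]])
desired = np.array([1.0, -1.0])
w = train_output_weights(activations, desired)
print(np.round(activations @ w, 4))
```

Because the cluster centers and spreads are fixed before training, the error is quadratic in the output weights and gradient descent converges to the least-squares solution for a sufficiently small learning rate.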

Introduction
The new computational model for freeway incident detection is tested using both real and simulated traffic data. More than 40 hours of simulated traffic data is generated from the traffic simulation software TSIS/CORSIM while real traffic data is obtained from the freeway service patrol (FSP) project's I-880 database. A large portion of the simulated data is made up of incident or incident-like conditions on two- and three-lane freeways. This is an advantage of employing simulation software for testing purposes, as sufficient quantities of reliable real data with traffic incidents are not readily available. Furthermore, with data-generating software it is possible to study the performance of the model under various traffic flow scenarios. The real data is used for further validation of the model.

Training
The model is trained using a sample of 30 incident and 30 non-incident patterns extracted from the simulated data. Two RBF neural networks are trained: one for the upstream detector station and the other for the downstream detector station. Training is done only once and no re-calibration or retraining is needed. The RBF classifier can therefore be implemented on-line on all stations after the training is done off-line.

First Test Using Simulated Data: Two-lane Freeway
The performance of the incident detection model on a two-lane freeway (in each direction) is shown in Table 1. In the subsequent test scenarios the threshold value was taken as 0.2: an output greater than or equal to 0.2 was signaled as an incident, while a value less than 0.2 was labeled as a non-incident.
This was intended to eliminate the false alarms, but at the expense of slightly longer detection times (Table 2).

Table 3 shows the performance of the new incident detection model using real data. Both downstream and upstream stations produced a detection rate of 95.2 percent and a false alarm rate of zero. This result is identical to that reported previously. Accurate information for the time of occurrence of incidents is not available from the database. Thus, the detection times for the model cannot be computed.

Result Summary and Comparison
The results of the new incident detection model indicate that the downstream detector station data and logic by themselves provide satisfactory results. In an ATMS that does not provide speed data, the upstream station logic can be eliminated. However, in situations where the speed data is available, the upstream detector station logic provides an additional level of reliability without any significant increase in computation. The results also show the calibration-free transferability of the model: the model trained using simulated data performs reliably when tested using both real and simulated data. Compared with the fuzzy-wavelet RBFNN model presented by Adeli and Karim (2000), the new model produces significantly shorter detection times without any loss in detection and false alarm rate performance. Furthermore, the new model is computationally more efficient, as it does not require the computation of the inverse wavelet transform and the fuzzy c-means at each time interval.

CONCLUSION
A new traffic incident detection logic and computational model is presented that overcomes several shortcomings of earlier algorithms. The model uses a two-stage single-station detection logic. In the first stage a decision is made based on data obtained from the downstream detector station only, while in the second stage the decision is confirmed based on data obtained from the upstream detector station only. Wavelet domain processing is used to de-noise, compress, and enhance the raw traffic data for classification. It is found that an energy representation of the data best characterizes incident and non-incident conditions. The model determines the state of the traffic flow from the shape of the time-series data rather than the magnitude. A radial basis function neural network is developed to classify the processed traffic data into incident and non-incident states.

INTRODUCTION
In recent years, researchers have investigated neural network based incident detection algorithms with promising performance results. In the model presented here, the discrete wavelet transform and linear discriminant analysis are used for data de-noising and enhancement, respectively. The model is tested using simulated data for several geometric and traffic flow conditions.

A NEW TRAFFIC INCIDENT DETECTION METHODOLOGY
A freeway incident detection algorithm must produce consistently reliable results from remotely sensed data of traffic streams. This is a challenging problem especially considering the non-homogenous, turbulent, and often chaotic nature of traffic flow and the limited information available from sensors. This is further complicated by noise introduced in the data during its collection and transmission. This indicates that a wholly model-based approach is less likely to be successful than a model-free, adaptive pattern recognition approach. However, a pattern-based approach must not neglect traffic behavior information that can be used to improve the efficiency and performance of the algorithm. The pattern-based approaches presented in the literature often neglect this aspect and tend to be overly simplistic. To solve the complex freeway incident detection problem effectively, our approach is based on utilizing advanced signal processing, pattern recognition, and classification techniques with appropriate heuristics derived from known traffic flow behavior.
The rationale behind this methodology is: Traffic flow is highly complex and not amenable to accurate mathematical modeling.
Therefore, reliance must be made on adaptive algorithms that can learn and recognize patterns in an unsupervised manner.
Traffic data is often corrupted with noise. Noise elimination is essential to improve the performance of any algorithm.
The algorithm should require little or no calibration for its on-line implementation. That is, the algorithm's performance must be independent of roadway geometry, existence of on- and off-ramps, weather conditions, and changing traffic demand.
Traffic flow behavior and information from other sources must not be ignored. For example, knowledge of flow behavior should be used wherever possible to simplify the algorithm and improve performance.
The algorithm must be capable of real-time operation. Therefore, computationally intensive algorithms must be avoided.

Noise in the data can make it difficult for any algorithm to discriminate between an actual incident pattern and a noise-induced pattern. Noise can be effectively removed from a signal if it can be separated from the true signal. Transform-based techniques, such as the discrete wavelet transform, provide the best solution.
The third stage performs a feature extraction process. This stage reduces the dimensionality of the data and improves the performance of the following classification and decision-making stages. Several clustering techniques are available, including neural network, fuzzy logic, and statistical approaches. In general, the statistical discriminant analysis approaches are computationally intensive and require high CPU resources in order to be implemented in real-time, a requirement for effective incident detection algorithms. Fuzzy clustering techniques, such as the fuzzy c-means approach, are both computationally efficient and capable of handling imprecision.
The classification stage classifies patterns in the data into relevant categories. This stage determines whether the data represents an incident or not. Neural network models are most appropriate for this stage of processing. The clustering and classification stages may be combined in an algorithm.
The final decision is made in the decision making stage. This stage can be used to merge information available from other sources such as surveillance cameras before making a decision.
Techniques such as fuzzy logic and decision theory may be used in this stage, in addition to heuristics based on human judgement.

FUZZY-WAVELET RBFNN MODEL FOR INCIDENT DETECTION
Recently, Adeli and Karim (2000) developed a new multi-paradigm incident detection model for freeway incident detection. The model is based on the general methodology for the development of reliable, robust, and efficient incident detection algorithms presented above. The model is self-calibrating once it is trained and does not need to be modified for different roadway geometries and flow conditions. The new incident detection algorithm is described briefly in this section. For complete details, the reader should refer to Adeli and Karim (2000).
This model is a single-station time-series pattern recognition approach that uses advanced de-noising, clustering, and classification techniques.
The algorithm is shown schematically in Figure 2 and summarized succinctly in the following steps. These steps represent the processing that is needed at each decision interval (equal to the reporting interval for the sensors) and at each detector station.

(xs[n]).
When data are available every 20 s, for example, then this process is performed every 20 s by adding the new reading and dropping the last reading in the sequence.

4.
Daubechies wavelet system of length 8 (D8). The lowest scale resolved is 2. Therefore, the final number of scaling coefficients (c_{2,k}) obtained is 4 and the final number of wavelet coefficients (d_{j,k}) obtained is 12.
Filter the wavelet coefficients (d_{j,k}) using the soft-thresholding nonlinearity. Use the fuzzy c-means (FCM) algorithm to reduce the dimensionality of x from 16 x 2 to 4 x 2, denoted by x'. These 8 data points represent the de-noised and clustered pattern that is used in the next classification step.
Feed-forward the pattern through the trained radial basis function neural network (RBFNN).
If the output y is greater than a pre-selected threshold, then an incident condition is signaled.
Otherwise, no incident condition exists.
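The wavelet decomposition and soft-thresholding steps above can be sketched as follows. This is a simplified illustration under stated assumptions: it uses the Haar wavelet instead of the D8 system named in the text (to keep the code short), and the threshold value and sample readings are hypothetical. It does show how a 16-sample sequence resolved over two levels yields 4 scaling coefficients and 12 wavelet coefficients.

```python
import math
from collections import deque

def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (scaling, wavelet) coefficients."""
    s = 1.0 / math.sqrt(2.0)
    scaling = [s * (signal[i] + signal[i + 1]) for i in range(0, len(signal), 2)]
    wavelet = [s * (signal[i] - signal[i + 1]) for i in range(0, len(signal), 2)]
    return scaling, wavelet

def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold`; small ones become zero."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]

def decompose(signal, levels=2):
    """Repeated Haar steps; returns the final scaling coefficients and all wavelet coefficients."""
    scaling, details = list(signal), []
    for _ in range(levels):
        scaling, wavelet = haar_step(scaling)
        details.append(wavelet)
    return scaling, details

# Sliding window of the 16 most recent readings (hypothetical values).
window = deque([float(v % 5) for v in range(16)], maxlen=16)
window.append(3.0)  # the new reading enters and the oldest one is dropped
scaling, details = decompose(list(window))
denoised = [soft_threshold(d, threshold=0.5) for d in details]  # threshold is an assumption
```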
The RBFNN is trained off-line from representative incident and non-incident patterns. Each pattern is processed by following Steps 1-3 above. Note that the training has to be done only once. The trained RBFNN can then be implemented on all the detector stations in the freeway management system. This portability is possible because the algorithm depends on the shape of a pattern rather than on any magnitude to distinguish between incident and non-incident conditions. The RBFNN can even be trained using simulated data only and implemented on-line, which is the case in this evaluation.

CALIFORNIA ALGORITHM #8
The California Department of Transportation and its associates developed several algorithms for freeway incident detection in the 1970s that are collectively known as California algorithms.
As many as 10 variations of these algorithms were developed. All of these algorithms can be described by a binary tree structure where each node, except the leaf (end) nodes, performs a two-way decision made by comparing a traffic pattern (an occupancy-based value) with a preselected threshold (Levin and Krause, 1979). Starting from the root node, a sequence of such decisions is made until a leaf node is reached, which represents a traffic state. This algorithm needs six parameters for calibration. These are defined in Table 1.
Five of them (P1 to P5) are thresholds for occupancy-based values, while parameter P6 specifies the number of time periods the algorithm will wait for a compression wave condition to persist before signaling it.
The performance of the algorithm depends on the choice of these parameters. The parameters are determined in a trial-and-error fashion by testing the algorithm on a given data set to obtain the best trade-off between detection rate and false alarm rate. The calibrated parameters are data dependent and may not be optimal for other data sets. This in turn means that the performance of the algorithm will not be optimal at all locations and at all times in a freeway management system. Thus, California algorithms are not readily transferable and need re-calibration for their effective network-wide implementation. Despite this shortcoming, the California algorithms, especially algorithms #7 and #8, are the most widely known and accepted algorithms for traffic incident detection. They are often used as benchmarks for the evaluation of new algorithms. Both algorithms #7 and #8 are recognized as the "best" (Levin and Krause, 1979). However, algorithm #8, with its additional compression wave suppression logic, performs better in heavy traffic and produces fewer false alarms as compared to algorithm #7 (Levin and Krause, 1979). For these reasons, we adopt California algorithm #8 for the comparative evaluation of the new fuzzy-wavelet RBFNN incident detection model.
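To make the idea of a sequence of two-way threshold decisions concrete, here is a minimal sketch of the basic California algorithm as it is commonly described, with three occupancy-based tests. It is not algorithm #8, which adds further states and the compression wave suppression logic. The function name and the threshold values in the example are hypothetical.

```python
def basic_california_test(occ_up, occ_down, occ_down_prev, t1, t2, t3):
    """Return True when all three occupancy tests pass (incident signaled).

    occ_up        -- upstream station occupancy at the current interval (percent)
    occ_down      -- downstream station occupancy at the current interval (percent)
    occ_down_prev -- downstream station occupancy two intervals earlier (percent)
    t1, t2, t3    -- calibration thresholds (hypothetical values in the example)
    """
    occdf = occ_up - occ_down  # spatial occupancy difference
    if occdf < t1:
        return False
    occrdf = occdf / occ_up if occ_up > 0 else 0.0  # relative spatial difference
    if occrdf < t2:
        return False
    # Relative temporal drop in downstream occupancy.
    if occ_down_prev > 0:
        docctd = (occ_down_prev - occ_down) / occ_down_prev
    else:
        docctd = 0.0
    return docctd >= t3

# Hypothetical readings: congestion upstream, sudden drop downstream.
incident_like = basic_california_test(30.0, 10.0, 20.0, t1=8.0, t2=0.5, t3=0.3)
free_flow = basic_california_test(10.0, 9.0, 9.0, t1=8.0, t2=0.5, t3=0.3)
```

Each `if` corresponds to one internal node of the decision tree; the thresholds are exactly the quantities that have to be re-calibrated when the algorithm is moved to a different site.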

EVALUATION OF THE MODEL

Introduction
In general, there are two approaches to the evaluation of a new computational model. The first approach is to test the model using a standard representative data set and determine its performance. This data set should be recognized as the benchmark for comparative evaluations of such models. In the second approach, the model is evaluated using non-standard but representative data sets and its performance compared to that of a benchmark model on the same data set. Presently, a standard data set is not available for evaluating freeway incident detection algorithms. Furthermore, real traffic data is not available in sufficiently large and varied quantities to allow any meaningful evaluations. Therefore, freeway incident detection algorithms are usually evaluated using representative simulated data for which the performance of both the new and a benchmark algorithm (such as California algorithm #8) are compared. The use of simulated data has one more advantage not possible with real data: the algorithms can be tested and studied under different freeway traffic flow and geometric conditions. The fuzzy-wavelet RBFNN freeway incident detection model (also abbreviated as the new algorithm/model in the rest of this article) is tested using both simulated and real data. Simulated data is used for comparative evaluations with California algorithm #8 (also abbreviated as California algorithm), whereas real data is used to test model robustness and portability.

Evaluation Criteria
Three quantitative measures are commonly used to evaluate freeway incident detection algorithms.
Detection rate: The detection rate is defined as a percentage calculated by dividing the number of incidents correctly signaled by the algorithm by the total number of incidents in the data set. A value of 100 percent represents perfect performance.
False alarm rate: The false alarm rate is defined as the percentage calculated by dividing the number of incidents incorrectly signaled by the total number of decisions made by the algorithm. A value of zero represents perfect performance. As the ratio is calculated with respect to the total number of decisions made by the algorithm, even a small value for the false alarm rate can represent an unacceptable number of false alarms in practice. For example, a false alarm rate of 0.5% can produce 21.6 false alarms per day from a single station that reports every 20 seconds (4320 decisions per day). Urban freeway management systems usually have hundreds of detector stations, thus compounding the problem. Therefore, a very low false alarm rate is of utmost practical importance.
Detection time: The detection time is defined as the time it takes the algorithm to signal the incident after its occurrence. A consistently short detection time is desirable so that emergency support can be dispatched to the scene and appropriate traffic control measures can be taken quickly. An incident detection algorithm that correctly signals 100 percent of the incidents but takes a long time to do so is of little practical value.
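The two rate definitions above, and the 21.6 false alarms per day quoted in the example, can be checked with a few lines of arithmetic. The function names are chosen for this sketch only.

```python
def detection_rate(detected, total_incidents):
    """Percentage of incidents correctly signaled."""
    return 100.0 * detected / total_incidents

def false_alarm_rate(false_alarms, total_decisions):
    """Percentage of decisions that were incorrectly signaled incidents."""
    return 100.0 * false_alarms / total_decisions

def false_alarms_per_day(rate_percent, reporting_interval_s):
    """Expected daily false alarms at one station for a given false alarm rate."""
    decisions_per_day = 24 * 60 * 60 / reporting_interval_s  # 4320 at 20 s
    return rate_percent / 100.0 * decisions_per_day

per_station = false_alarms_per_day(0.5, 20)  # about 21.6, as stated in the text
```

Multiplying `per_station` by the number of detector stations in a network shows why even a fraction of a percent can be operationally unacceptable.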
The quantitative measures defined above, however, do not completely describe the performance of an incident detection algorithm in practice. These performance measures are often determined from off-line tests on data for which the algorithm is calibrated. Such calibrations, however, are not practically feasible when an algorithm is implemented on-line in a large freeway management system. Thus, the network wide performance degrades significantly from that reported in the tests. For this reason, the following qualitative measure must also be considered in the evaluation of freeway incident detection algorithms.
Portability: An algorithm is portable (transferable) if it performs at optimal or near optimal levels under different conditions without re-calibration or re-training. This qualitative measure is judged by the performance of the algorithm in terms of the three quantitative measures on different freeway traffic flow and geometric conditions. Ideally, an algorithm should not require any re-calibration for its network-wide on-line implementation.

Traffic Data
The new model is tested and evaluated using both simulated and real traffic data. Simulated traffic data is generated from the microscopic stochastic simulation

Training and Calibration
The new model is trained using simulated data. Following the procedure outlined in a previous section, 60 incident and 60 incident-free patterns are used for training. These patterns are selected randomly from all the different simulations performed for this evaluation. In particular, the incident-free patterns contain samples from traffic compression waves, stop-and-go traffic, and traffic affected by on- and off-ramps. This selection is done to provide added robustness to the trained network in distinguishing incident-free conditions from those caused by incidents. However, it should be noted that the model bases its decision on a pattern that is to a large extent independent of the prevailing traffic and freeway conditions. Once the network is trained and its weights established, the model is evaluated without any modifications.
The California algorithm is calibrated with the same 60 incident and 60 incident-free traffic samples used for the training of the fuzzy-wavelet RBFNN model. Threshold calibration is done in a trial-and-error manner whereby the thresholds are modified after each run through the data set based on the determined detection rate, false alarm rate, and detection time. There is a trade-off between the detection rate and the false alarm rate such that an increase in the detection rate results in an increase in the false alarm rate. In the calibration process, a ceiling for the detection rate is achieved and the thresholds are then modified to minimize the false alarm rate. This procedure is identical to that reported by Payne and Tignor (1978) and Levin and Krause (1979).
The set of parameters obtained is P1 = 13, P2 = -30, P3 = 30, P4 = 15, P5 = 30, and P6 = 2. Note that compression wave false alarm suppression is done for two time periods (40 or 60 seconds) unlike the 5 minutes used by . This low value is chosen to avoid unacceptably long detection times. This set is used throughout the evaluation without modification.
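The calibration rule described above (reach the ceiling of the detection rate, then minimize the false alarm rate) can be sketched as an exhaustive search over candidate thresholds. This replaces the manual trial-and-error procedure with a grid search purely for illustration; the `evaluate` callable, the toy scores, and the candidate values are hypothetical.

```python
from itertools import product

def calibrate(evaluate, candidates):
    """Pick the parameter set with the highest detection rate, then the lowest false alarm rate.

    evaluate   -- hypothetical callable mapping a parameter tuple to
                  (detection_rate, false_alarm_rate) on the calibration data
    candidates -- one list of candidate values per parameter
    """
    best_params, best_key = None, None
    for params in product(*candidates):
        dr, far = evaluate(params)
        key = (-dr, far)  # maximize detection rate first, then minimize false alarms
        if best_key is None or key < best_key:
            best_params, best_key = params, key
    return best_params

# Toy one-parameter detector: a score above the threshold signals an incident.
incident_scores = [0.9, 0.7, 0.6]
normal_scores = [0.1, 0.3, 0.5, 0.2]

def toy_evaluate(params):
    (threshold,) = params
    dr = 100.0 * sum(s > threshold for s in incident_scores) / len(incident_scores)
    far = 100.0 * sum(s > threshold for s in normal_scores) / len(normal_scores)
    return dr, far

best = calibrate(toy_evaluate, [[0.0, 0.25, 0.55, 0.8]])
```

With six parameters, as in California algorithm #8, the search space grows multiplicatively, which is one reason per-site re-calibration is costly.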

First Simulation Test - Parametric Evaluation
In this test, the new model is evaluated under different freeway geometric, traffic flow, and detector station location conditions. The general freeway layout and the locations of the detector stations and the incidents are shown in Figure 3. The detection time of the new model depends on the distance of the station from the incident, the prevailing flow rate, and the capacity reduction at the incident location. The detection times for the California algorithm also depend on the same factors. However, because the California algorithm has a two-station logic, its detection time variation with distance is less pronounced. This behavior is evident from Figure 4.
Both the new and California algorithms detected all incidents on a 2-lane freeway (Table 2), yielding a detection rate of 100 percent. On 3- and 4-lane freeways both algorithms failed to detect some incidents for the smallest flow rate of 1000 vph per lane (Tables 3 and 4). This is because the reduced capacity after the incident is still greater than the prevailing flow rate, and the impact on traffic on the upstream side is minimal. Both algorithms detected all five incidents when the incident is closest (152 m) to an upstream detector station. The new model, however, performed better on the 4-lane freeway where it also detected some incidents located at distances greater than 305 m (Table 4), yielding an overall detection rate of 83.3% as compared to 75% for the California algorithm.

In this test, the false alarm rate performance of the new and California algorithms is evaluated on a freeway with on- and off-ramps. The purpose of this test is to determine the portability of the algorithms to conditions of varying flow rates and freeway bottlenecks. These are due to the close proximity of the station to the off-ramp and the chaotic traffic situation at that station.

Test Using Real Data
To further evaluate the performance of the new algorithm, real traffic data from two sources are used for testing.  (Table 7). Moreover, in all cases the algorithm detected the incident before that reported by

INTRODUCTION
There are two major uses of automatic incident detection in an advanced traffic management system (ATMS motorists' safety. Second, it provides useful information to the routing control system to maintain and optimize system-wide performance. For the best performance, the incident detection system must provide quick and reliable information. The traffic incident detection system is a main component of an ATMS (Figure 1). The other components that make up the advanced traffic management system include the traffic routing and control system, the data archiving system, and the pre- and post-processing systems.
Traffic sensors provide the main source of data for analysis. Additionally, information may be obtained from the news media, special traffic probe vehicles, and motorists' call-ins. The goal of an ATMS is to maximize the system throughput. This is currently achieved by means of traffic control devices such as entry ramp access control and changeable message signs that guide and control traffic.
Recently,  presented a new multi-paradigm intelligent system approach to the solution of the freeway incident detection problem employing advanced signal processing, neural network pattern recognition, and classification techniques. This is a single-station algorithm that uses loop detector data upstream of the incident. A wavelet-based de-noising technique is employed to eliminate undesirable fluctuations in observed data from traffic sensors. Fuzzy c-means clustering is used to extract significant information from the observed data and to reduce its dimensionality. A radial basis function neural network (RBFNN) is then used to classify the processed pattern as an incident or incident-free condition.
The purpose of evaluating a new freeway incident detection algorithm is to determine its robustness under different traffic flow and roadway geometry conditions, and thus to assess its cost-effectiveness for practical network-wide implementation. Three quantitative performance measures are commonly used for this purpose. They are the detection rate (percentage of the number of correctly detected incidents to the total number of incidents in the data set), the false alarm rate (percentage of the number of false alarms signaled by the algorithm to the total number of decisions made), and the detection time (the time it takes for the algorithm to signal the incident after its occurrence).
These three quantitative measures, however, do not provide a complete picture of an algorithm's performance in practice. The qualitative measure of portability without re-calibration must also be considered in conjunction with the quantitative measures. This is because the cost of maintaining and re-calibrating the algorithm to perform acceptably at all locations in a large freeway system can make its network-wide implementation economically infeasible. There is a cost associated with every missed detection and every false alarm, the time taken to detect an incident, and the efforts exerted to maintain and calibrate the algorithm. These costs ultimately determine the success or failure of the algorithm in practice. As reported by Abdulhai and Ritchie (1999), traffic control centers place differing cost premiums on each performance measure whenever a trade-off is sought. In any case, a higher detection rate, a lower false alarm rate, and a shorter detection time are always desirable. Moreover, an algorithm that is readily portable is often preferred over one that performs excellently only at a given location.
All freeway incident detection algorithms reported in the literature have been developed and evaluated for urban freeway systems. This is understandable because of the negative impacts incidents create on congested urban freeways and the need to remove them as soon as possible. However, there is also a need to develop and evaluate incident detection algorithms for rural freeways. The vehicle-miles of rural freeways in the United States is much larger than that for urban freeways and there is indeed a need for automatic and rapid detection of incidents so that emergency/medical support can be dispatched.
In the following section, factors to consider in rural freeway incident detection are delineated. Then, the wavelet energy freeway incident detection algorithm is described step-by-step, followed by a comprehensive evaluation of the algorithm and discussions of the test results.

FACTORS TO CONSIDER IN RURAL FREEWAY INCIDENT DETECTION
Traffic on urban freeways is characterized by high demand and periodic congestion that reduces the level of service expected by motorists. Because of the high demand and insufficient capacity the level of service degrades dramatically when an obstructing incident occurs. Therefore, quick and reliable identification and localization of such incidents is essential to prevent unacceptable backups and delays caused by obstructions that are not cleared quickly. As such, an effective incident detection algorithm must be both reliable and fast in detecting an incident.
Traffic on rural freeways, on the other hand, is usually congestion-free under normal operating conditions. Furthermore, the impact of an obstructing incident is often less severe because traffic demands on rural freeways usually do not exceed the capacity.
Nevertheless, the need for reliable automatic incident detection still exists. Incidents in rural areas, unlike in urban areas, may go unreported for several minutes. Furthermore, the transit of emergency and medical support to rural locations can take more time.
Therefore, rapid automatic notification of an incident condition is very valuable.
Automatic incident detection on rural freeways is challenging because of low flow rates and large distances between detectors. Most of the incident detection algorithms

WAVELET ENERGY MODEL FOR FREEWAY INCIDENT DETECTION
The new single-station incident detection algorithm developed by Karim and Adeli (2001b)  This representation makes it possible to de-noise, enhance, and reduce the dimensionality of the patterns effectively and efficiently. The processed patterns are then classified into one of two states representing either an incident or incident-free condition by a radial basis function neural network. The key ideas are described in Karim and Adeli (2001b) in general terms. A complete detailed step-by-step algorithm is presented in this section.
Only the downstream station logic is implemented and tested in this evaluation. It was found that the upstream logic produced results almost identical (and in the case of detection time, slightly inferior) to those produced by the downstream logic. Therefore, the wavelet energy algorithm consists of the collection, processing, and classification of the downstream lane occupancy and flow rate time-series data. In a freeway management system, this algorithm is implemented at every detector station and reports on the presence or absence of an incident upstream of the station. The algorithm is shown schematically in Figure 2 and described in the following steps. These elements correspond to the input traffic data before it is extended for processing. Let the processed lane occupancy and flow rate data be denoted as cO[i] and cF[i], respectively.

3. Form the feature pattern by concatenating the processed lane occupancy and flow rate sequences. The 8-element sequence x[i] represents the de-noised, clustered, and enhanced pattern that is used in the subsequent step for classification.
4. Feed-forward the feature pattern x[i] through a trained radial-basis function neural network. The neural network has 8 input nodes, 12 hidden nodes with Gaussian transfer functions, and one output node with a linear transfer function. If the output is greater than a pre-selected threshold (a small positive value such as 0.2) then an incident is signaled; otherwise, the pattern represents an incident-free condition.
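The feed-forward step can be sketched as follows for a network with 8 inputs, 12 Gaussian hidden nodes, and one linear output node. This is a minimal illustration: the Gaussian form exp(-||x - c||^2 / (2 s^2)), the function names, and the centers, widths, and weights in the example are assumptions, not values from the trained network.

```python
import math

def rbfnn_output(x, centers, widths, weights, bias=0.0):
    """Gaussian hidden layer followed by one linear output node."""
    hidden = [
        math.exp(-math.dist(x, c) ** 2 / (2.0 * s ** 2))
        for c, s in zip(centers, widths)
    ]
    return bias + sum(w * h for w, h in zip(weights, hidden))

def is_incident(x, centers, widths, weights, threshold=0.2):
    """Signal an incident when the network output exceeds the threshold."""
    return rbfnn_output(x, centers, widths, weights) > threshold

# Hypothetical network: 12 hidden nodes, only the first one has a nonzero weight.
centers = [[1.0] * 8] + [[5.0] * 8 for _ in range(11)]
widths = [1.0] * 12
weights = [1.0] + [0.0] * 11

near_first_center = is_incident([1.0] * 8, centers, widths, weights)
far_from_first_center = is_incident([5.0] * 8, centers, widths, weights)
```

A pattern that coincides with the first center drives that hidden node to 1 and the output above 0.2, while a distant pattern leaves the output near zero.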
The RBFNN is trained with incident and incident-free patterns to determine the weights of the links connecting the input layer to the hidden layer and the links connecting the hidden layer to the output node. Training is done iteratively to minimize the output error.

Once the network is trained no further training is necessary. For further details, refer to Karim and Adeli (2001b).
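As a rough illustration of iterative training that reduces the output error, the sketch below adjusts only the hidden-to-output weights with least-mean-squares updates while keeping the Gaussian centers fixed. This is a simplification: the training described above also determines the input-to-hidden weights. The target coding (1 for incident, 0 for incident-free), the learning rate, the epoch count, and the two-dimensional toy patterns are all assumptions.

```python
import math

def hidden_activations(x, centers, width):
    """Gaussian activation of each hidden node for input pattern x."""
    return [math.exp(-math.dist(x, c) ** 2 / (2.0 * width ** 2)) for c in centers]

def train_output_weights(patterns, targets, centers, width=1.0, rate=0.1, epochs=200):
    """Least-mean-squares updates of the output weights; the centers stay fixed."""
    weights = [0.0] * len(centers)
    for _ in range(epochs):
        for x, t in zip(patterns, targets):
            h = hidden_activations(x, centers, width)
            error = t - sum(w * hi for w, hi in zip(weights, h))
            weights = [w + rate * error * hi for w, hi in zip(weights, h)]
    return weights

def predict(x, centers, weights, width=1.0):
    """Linear output node applied to the hidden activations."""
    return sum(w * h for w, h in zip(weights, hidden_activations(x, centers, width)))

# Toy data: one incident-free pattern (target 0) and one incident pattern (target 1).
patterns = [[0.0, 0.0], [3.0, 3.0]]
targets = [0.0, 1.0]
weights = train_output_weights(patterns, targets, centers=patterns)
```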

Goals
A comprehensive evaluation of the wavelet energy freeway incident detection algorithm is presented in this section. The goals of the evaluation are:

4. To perform a parametric evaluation of the algorithm, that is, to determine the sensitivity of the algorithm to variations in roadway geometry and traffic flow conditions.
5. To compare the performance of the algorithm with that of California algorithm #8.
The roadway geometry conditions evaluated are the number of lanes (

Data
The majority of the traffic data used in the evaluation are generated using the simulation software TSIS (http://www.fhwa-tsis.com/). TSIS is a microscopic simulation tool that considers each vehicle as a separate entity in a stochastic model of vehicles and their environment (roadway geometry, pavement conditions, proximity to other vehicles, etc).
In addition to simulated data, real data from the San Francisco Bay Area freeway service patrol project's I-880 database is also used for evaluation. This database is a collection of binary files of loop detector outputs collected over a period of about 2 months. A software program is used to process this database and extract selected information in a readable format for further processing. The database contains basic information such as lane occupancy, flow rate, and speed. The information on the location and time of incidents is recorded by human observers and has to be correlated to the loop data for analysis. Because this information is recorded by humans, it is not reliable and has to be verified by visual observation of the loop detector data. In all, data for 21 single-lane blocking incidents and four hours of incident-free conditions are extracted for evaluation in this research.

Training and Calibration
The wavelet energy freeway incident detection algorithm is trained with 60 incident and 60 incident-free patterns. These patterns are chosen randomly from all the simulated data generated for the evaluation. No real data is used in the training phase of the network. The training determines the weights for the RBFNN. Once the algorithm is trained no further training is done as it is evaluated using different sets of data.
The California algorithm #8  The same set of parameters is used throughout the evaluation without re-calibration.
This is done to test the portability property of the algorithm and compare it with that of the new wavelet energy algorithm.

The detection times reported by the new wavelet energy algorithm vary from 56 to 116 seconds. The detection time generally increases with an increase in the distance of the incident from the downstream detector station. However, this variation of the detection time with location of incident is substantially less pronounced than that for the California algorithm. This is evident from Figure 4, which compares the detection times for the wavelet energy and California algorithms on a 2-lane freeway. The detection time for the California algorithm is much longer, varying from 76 to 480 seconds; it increases substantially with a decrease in flow rate and distance of incident from downstream detector station. This is because the California algorithm is based on the formation of congestion on the upstream side of the incident, which takes more time to develop when the prevailing flow rate is low. The wavelet energy algorithm, on the other hand, does not exhibit this behavior as seen in Figure 4. The performance of the wavelet energy algorithm is also not greatly affected by changes in geometry such as the number of lanes, as noted in Figure 5. The relative independence of the wavelet energy algorithm from changes in flow rate and roadway geometry demonstrates its superior portability property as compared to the California algorithm.
False alarms generated by automatic freeway incident detection algorithms are often a major source of excessive operational costs. Traffic control centers would often prefer an algorithm that generates fewer false alarms over another one with better detection rate but higher false alarm rate. On urban freeway segments, the wavelet energy algorithm generated no false alarms, thus producing an overall false alarm rate of zero. In contrast, the California algorithm produced false alarm rates of 0.22, 0.11, and 0.28 percent on 2-, 3-, and 4-lane freeways, respectively. These false alarms are generated during moderate and heavy traffic flow conditions.

False Alarm Performance in the Vicinity of On- and Off-Ramps
Traffic flow in the vicinity of on- and off-ramps is often chaotic and marked by large fluctuations in occupancy, speed, and flow rate as vehicles maneuver to enter and exit the freeway. This is especially true for urban freeways where ramps are usually spaced closely together and the entering and exiting flow rates are high. On- and off-ramps are thus geometric bottlenecks that create non-homogeneities in traffic flow, and are responsible for generating a large number of false alarms from existing automatic freeway incident detection algorithms. To test the false alarm performance of the algorithms in such situations, a 3-lane urban freeway segment with two on- and off-ramps is modeled for simulation (Figure 6). For this freeway geometry four traffic flow scenarios are evaluated, as described in Table 4. Each scenario consists of three time periods of different mainline, on-, and off-ramp traffic flow rates. This is done to simulate sudden changes in entering and exiting flows on heavy traffic freeways that often cause automatic freeway incident detection algorithms to produce false alarms.
The false alarm performance of the wavelet energy algorithm and California algorithm #8 in the vicinity of on- and off-ramps is given in Table 5. The remarkable false alarm performance of the wavelet energy algorithm is evident; it produced no false alarms at all six detector station locations and in 27000 (4 x 6 x 1125) decisions. The California algorithm, on the other hand, produced numerous false alarms, ranging from 0.5% to 3.8%, especially for the roadway segment between detectors 4 and 5 (Figure 6). Note that neither algorithm is re-calibrated or retrained for this and all other evaluations. This is done to ascertain the portability property of the algorithms. The California algorithm #8 may be re-calibrated for each segment to produce fewer false alarms. However, this procedure is time consuming and expensive on a large urban freeway management system. Furthermore, this procedure may be required on a regular basis to ensure optimal performance with changing traffic flow conditions. The wavelet energy algorithm, on the other hand, performed excellently without any need for retraining and thus is readily transferable and portable for implementation on urban freeway systems.

Evaluation on Rural Freeways
Rural freeways present a challenge for passive automatic freeway incident detection algorithms that use loop detector data. As discussed earlier, it is economically infeasible to have closely spaced loop detectors on the large network of rural freeways in the U.S.
Thus, incident detection algorithms can only rely on sparse information to arrive at a decision. This is further complicated by the often low flow rates on rural freeways that are impacted little by an incident. As a result, passive automatic incident detection algorithms often perform poorly on rural freeways, making them impractical for traffic agencies to implement. Traffic agencies also desire algorithms that require little maintenance and no site-specific calibrations for their optimal performance on rural freeways.
To the best of the authors' knowledge, no automatic freeway incident detection algorithm has been evaluated for rural freeway conditions. In this section, the new for 600 seconds; however, no visible change in the occupancy pattern such as a persistent reduction in the occupancy during and after the incident is noticeable from the plot (the spike in the figure is an outlier due to an extraneous factor such as noise in the data and is not an indicator of any change in the occupancy pattern). The wavelet energy algorithm is able to detect some incidents because it considers both occupancy and flow rate readings to create an enhanced and de-noised pattern before classifying it. The increased sensitivity of the algorithm, however, does come with a higher false alarm rate. The number of false alarms can be reduced by increasing the threshold t (see Figure 2) used in the wavelet energy algorithm. This can be done easily and in real-time by an appropriate logic in the algorithm.
A flow rate of 1000 vph per lane is typical on many rural freeways under normal operational conditions. Under these conditions the wavelet energy algorithm detected 88 percent of the incidents with a false alarm rate of 0.08 percent. The California algorithm, on the other hand, produced detection and false alarm rates of 20 percent and zero, respectively. The California algorithm failed to detect any incident that is less than 2479 m from the downstream station. The wavelet energy algorithm is able to detect 85% of incidents for such distances from the downstream station. The California algorithm will require the detector stations to be spaced about 610 m apart for its performance to be on par with the wavelet energy algorithm. Such a high density of loop detectors is economically infeasible for rural freeways. Furthermore, the wavelet energy algorithm
Often an incident results in the blockage of a lane for only a short duration of time. For example, a disabled vehicle may block one lane for a few minutes before it is moved onto the shoulder. Detecting such incidents is often more challenging for incident detection algorithms as the impact of the incident lasts for a shorter period of time. In all the previous evaluations, the incident duration is equal to 10 minutes. Table 7 shows the performance of the wavelet energy algorithm and California algorithm #8 on a 2-lane rural freeway when the lane blockage lasts for 5 minutes only. The detection rate, false alarm rate, and detection times produced by the two algorithms for this scenario are similar to those produced for 10-minute incidents recorded in Table 6. This is because the maximum detection time for the wavelet energy algorithm in all cases is 160 seconds, which is substantially less than the 5-minute duration of the incident. As long as the duration of an incident is greater than the detection time, it does not affect the performance of the algorithm in any significant way. The same does not hold true for the California algorithm because its detection time is as large as 430 seconds. Consequently, as is the case for the 10-minute duration incidents, the performance of the wavelet energy algorithm is superior to that of California algorithm #8.
Sometimes incidents produce no lane blockage but only a reduction in the capacity of the lanes. This situation may occur when, for example, a disabled truck is parked on a shoulder, reducing the capacity of the lanes. To study such scenarios on rural freeways, a 40 percent reduction in capacity of both lanes that lasts for 10 minutes is modeled for evaluation. The performance of the wavelet energy and California algorithms under this scenario is given in Table 8. The detection rates produced by both wavelet energy and California algorithms dropped slightly as compared to the case when one lane is blocked (Table 7). This is because an incident that does not block any lanes produces a less severe disruption in traffic flow than an incident that blocks at least one lane. This is especially true when the flow rate is low (1000 vph per lane). For the same reason, the average detection time of the California algorithm is longer, as it takes more time for the congestion to develop and be detected by the algorithm. The detection time of the wavelet energy algorithm is in the range of 40-145 seconds while that of the California algorithm is in the range of 252-580 seconds.

Evaluation Using Real Data
Only a limited amount of usable real traffic data was available to the authors. Real traffic flow and incident data are extracted from the San Francisco Bay Area Freeway Service Patrol project's I-880 database for evaluation of the wavelet energy and California algorithms.
Data for 21 incidents that block at least one lane are used to determine detection rate performance, while 4 hours of incident-free data are used to ascertain the false alarm rate performance. The time-of-incident information in the database is inaccurate and therefore cannot be used to determine detection times. The performance of the wavelet energy and California algorithms using real data is shown in Table 9. The wavelet energy algorithm outperformed the California algorithm in both detection rate and false alarm rate. In particular, the wavelet energy algorithm did not signal any false alarms at all. In contrast, the California algorithm produced a false alarm rate of 0.63% for this small real data set. It should be noted that this evaluation was also done without re-calibrating or re-training the algorithms, which have been trained and calibrated using simulated data only. The detection rate of the wavelet energy incident detection algorithm can be improved when a sufficient amount of real data is available.

PERFORMANCE SUMMARY AND CONCLUSION
Transferability or portability is a qualitative property of a freeway incident detection algorithm that determines how well the algorithm performs across various traffic flow and roadway geometry conditions. In all the tests performed in this evaluation, the algorithms are not re-calibrated or retrained. Thus, a good way to assess the algorithms' portability is to compare their performance vectors across different test scenarios. A performance vector is defined as a vector with three performance elements: the percentage of missed detections (equal to 100 minus the detection rate), the false alarm rate, and the detection time. The smaller the value of each element, the better the performance.
Table 10 notes: the distance of the incident from the detector stations is 762 m; the distance between detector stations is 3048 m.
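The performance vector defined above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the class and function names are hypothetical, the element-wise "dominates" comparison is one possible way to compare two vectors rather than a method stated in the text, and the numeric values are invented placeholders, not results from any table.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class PerformanceVector:
    """Three performance elements; smaller is better for each."""
    missed_pct: float        # 100 minus the detection rate (%)
    false_alarm_pct: float   # false alarm rate (%)
    detection_time_s: float  # detection time (seconds)


def make_vector(detection_rate_pct: float,
                false_alarm_pct: float,
                detection_time_s: float) -> PerformanceVector:
    """Build a performance vector from the three reported measures."""
    return PerformanceVector(100.0 - detection_rate_pct,
                             false_alarm_pct,
                             detection_time_s)


def dominates(a: PerformanceVector, b: PerformanceVector) -> bool:
    """True if a is no worse than b in every element and better in at least one.

    This element-wise rule is an assumption made for illustration.
    """
    pairs = list(zip(astuple(a), astuple(b)))
    return all(x <= y for x, y in pairs) and any(x < y for x, y in pairs)


# Hypothetical placeholder values, not taken from the evaluation tables.
alg_a = make_vector(detection_rate_pct=95.0, false_alarm_pct=0.1, detection_time_s=120.0)
alg_b = make_vector(detection_rate_pct=80.0, false_alarm_pct=1.5, detection_time_s=400.0)

print(alg_a)
print(dominates(alg_a, alg_b))
```

Computing one such vector per test scenario, with no re-calibration between scenarios, gives a compact way to see how stable each algorithm's performance is as traffic flow and geometry change.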