Robust Feedback Zoom Tracking for Digital Video Surveillance

Zoom tracking is an important function in video surveillance, particularly in traffic management and security monitoring. It involves keeping an object of interest in focus during the zoom operation. Zoom tracking is typically achieved by moving the zoom and focus motors in lenses following the so-called “trace curve”, which shows the in-focus motor positions versus the zoom motor positions for a specific object distance. The main task of a zoom tracking approach is to accurately estimate the trace curve for the specified object. A proportional-integral-derivative (PID) controller has historically been considered the best controller in the absence of knowledge of the underlying process, and it performs well in motor control. In this paper, we therefore propose a novel feedback zoom tracking (FZT) approach based on geometric trace curve estimation and a PID feedback controller. The performance of this approach is compared with existing zoom tracking methods in digital video surveillance. The real-time implementation results obtained on an actual digital video platform indicate that the developed FZT approach not only solves the traditional one-to-many mapping problem without pre-training but also improves the robustness of tracking moving or switching objects, which is the key challenge in video surveillance.


Introduction
Due to the remarkable growth in the video surveillance market over the last few years [1][2][3], consumers now demand high-quality images from zoom operations [4,5], particularly in traffic management and security monitoring [6][7][8]. Maintaining image sharpness or focus during the entire zoom process is the main challenge of zoom tracking. Figure 1 shows the zoom tracking effect as the zoom is changed from a wide-angle zoom to a tele-angle zoom. As shown in this figure, with zoom tracking the plant remains in focus as the user changes the zoom. Without zoom tracking, the image goes out of focus during zooming and becomes sharp again only afterwards, once an auto-focusing (AF) algorithm [9] has run.

Zoom Tracking Principle
Users often utilise two different zoom options in a digital video system: optical zoom and digital zoom. Digital zoom works by cropping and subsequently enlarging a captured image, which produces an image of lower optical resolution. In contrast, optical zoom uses the optical lens to bring the subject closer [10]. In this paper, the zoom tracking problem is studied only for optical zoom. Figure 2(a) shows an actual zoom system, and its structure chart is shown in Figure 2(b). Figure 2(c) introduces the zoom tracking mechanism in detail. When the zoom is changed from wide-angle to tele-angle, the zoom lens focal length increases from $F_{\mathrm{wide}}$ to $F_{\mathrm{tele}}$, whereas the angle of view reduces from $\Phi_{\mathrm{wide}}$ to $\Phi_{\mathrm{tele}}$. In response to this change, the in-focus plane (image distance) shifts during this process. For an object at a distance $d$, $s_d(z_{\mathrm{wide}})$ and $s_d(z_{\mathrm{tele}})$ are defined as the image distances at wide-angle and tele-angle zooms, respectively. Thus, when the zoom is changed from wide-angle to tele-angle, to maintain image sharpness, the image sensor must be moved from the wide-angle in-focus plane at $s_d(z_{\mathrm{wide}})$ to the tele-angle in-focus plane at $s_d(z_{\mathrm{tele}})$. As the zoom lens focal length is altered via a zoom motor and the image sensor is moved by a focus motor, zoom tracking is typically achieved by following the so-called "trace curves", which show the in-focus motor positions versus the zoom motor positions for various object distances (Figure 3). Thus, trace curve estimation is a crucial problem for zoom tracking methods. A major challenge in this estimation is the one-to-many mapping problem [11], which becomes troublesome when the zoom is changed from wide-angle to tele-angle. This problem will be further described in Section 2.

Existing Zoom Tracking Methods
The existing zoom tracking methods can be divided into two categories: (1) geometric methods, such as geometric zoom tracking (GZT) and adaptive zoom tracking (AZT); (2) machine learning methods, such as relational zoom tracking (RZT) and predictive zoom tracking (PZT). The development of zoom tracking can be traced back to the look-up table method [12], which stores a large number of trace curves for various object distances in memory. The real trace curve is estimated by selecting the closest curve among the stored ones. However, this approach is not often used in practice because of its large memory requirement. To reduce the memory requirement, GZT [13,14] was proposed. The GZT approach obtains an estimate of a trace curve via linear interpolation based on only two stored trace curves, one for near objects and one for far objects. A drawback of this approach is that the offset between the estimated and the real trace curves gradually increases as the zoom is changed from wide-angle to tele-angle. This approach was later extended to the AZT method [15], which incorporates a recalibration procedure at the boundary zoom position where the trace curve changes from linear to non-linear.
The RZT [16] and PZT [11] methods were proposed later to improve the estimation accuracy through machine learning. RZT generates an estimate of the distance range in which the object resides by so-called "relational curves". This distance range is then used to estimate a trace curve. PZT uses an input-output model trained by a priori characteristic trace curves to generate an estimate of a trace curve. The trained model is often based on the Auto-Regression with Exogenous Inputs (ARX) model [17] or the Recurrent Neural Network (RNN) model [18]. Both RZT and PZT solve the one-to-many mapping problem well, but they require a significant amount of a priori knowledge for training. It is not always convenient to obtain these a priori trace curves in practical use. Furthermore, the errors in the learning step will also have an effect on the estimation. Because the variation of the lens or scenes often requires additional time for re-training, the adaptability of these two algorithms is relatively poor.

Zoom Tracking for Digital Video Surveillance
There are typically two occasions for which the optical zoom is used: (1) the enlarged occasion, in which an object at a constant distance is enlarged in the image so that it can be examined in detail; and (2) the telephoto occasion, in which an object moving away is tracked. In traffic management and security monitoring, telephoto occasions are often encountered, for example, when capturing the license plate of an escaping vehicle that has just run a red light. However, all of the existing zoom tracking methods mentioned previously were developed for digital still camera systems. These methods assume that the object distance is constant; thus, moving or switching objects in video surveillance [19] have not been considered. Figure 4(a) shows a moving object as the zoom is changed from wide-angle to tele-angle. The object distance changes as the car moves towards the video camera during zooming. In this situation, existing methods cannot produce an ideal result. There are several other situations in which these methods cannot function properly, even when the objects are stationary. Figure 4(b) illustrates a switching object during zooming. The computer box and network switch are shown as two stationary objects at different distances in the scene. When the zoom motor is moved from wide-angle to tele-angle, the main target in the video changes from the computer box to the network switch. The traditional zoom tracking methods also fail in this situation. To track moving and switching objects in digital video surveillance and to acquire better estimated results without pre-training the system, we propose the robust feedback zoom tracking (FZT) method to revise the estimated trace curve, which is based on traditional GZT estimation and utilises a proportional-integral-derivative (PID) closed-loop feedback controller [20][21][22]. In the absence of knowledge of the underlying process, a PID controller has historically been considered optimal [23].
The controller can provide control action for specific process requirements by tuning its parameters. This method compensates for errors along the estimated trace curve using the real-time focus value (FV), which is typically used in the auto-focusing function.

Contributions and Organisation
In this work: (1) we discuss the zoom tracking methods in video surveillance for the first time; (2) we propose a novel zoom tracking method called FZT, which is robust in tracking moving or switching objects in video surveillance; (3) we implement our FZT zoom tracking algorithm on real-time digital video hardware and compare it with commonly used algorithms. To the best of our knowledge, the focus value and real-time feedback mechanism have not yet been used in previous zoom tracking studies, and there have been no previous reports on the implementation of the zoom tracking method in video surveillance devices. This paper is organised as follows. Section 2 introduces our FZT method in detail. The FZT approach is then implemented on the hardware platform in Section 3. Our experimental results and comparisons between our algorithm and other existing methods in terms of accuracy and speed are reported in Section 4. Finally, conclusions are stated in Section 5.

Feedback Zoom Tracking
As mentioned above, zoom tracking is related to the zoom and focus motor positions. It is typically achieved by following a trace curve. If the motors are moved following the trace curve during zoom operation, the image will always stay sharp. Figure 3 shows the trace curves for an 18× zoom lens. Each trace curve corresponds to a certain object distance.

Trace Curve Estimation
The first goal in zoom tracking that we addressed is how to estimate the correct trace curve without any special distance measurement equipment. Let $f_d$ denote the real trace curve acquired by running the global search auto-focusing function [24,25]. Thus, $f_d(z_i)$ indicates the in-focus motor position for each zoom motor position $z_i$ at a given object distance $d$. For simplicity, let $z_1$ and $z_n$ denote the wide-angle zoom ($z_{\mathrm{wide}}$) and tele-angle zoom ($z_{\mathrm{tele}}$), respectively. As shown in Figure 3, all of the trace curves for various object distances have the same in-focus motor position at the wide-angle zoom $z_1$, that is, $f_{1\,\mathrm{m}}(z_1) = f_{1.5\,\mathrm{m}}(z_1) = \dots = f_{30\,\mathrm{m}}(z_1)$. However, it is difficult to determine which trace curve should be followed during zooming without the distance information, particularly when the zoom motor moves from the wide-angle towards the tele-angle. This issue is the so-called "one-to-many" mapping problem.
Thus, a zoom tracking approach is required to estimate a trace curve as close as possible to the real one. The classical GZT method estimates the trace curve via linear interpolation based on the stored trace curves for near and far objects. It obtains the estimated trace curve $\hat{f}_d(z)$ using Equation (1):
$$\hat{f}_d(z) = f_{\mathrm{far}}(z) - \frac{f_{\mathrm{far}}(z_{\mathrm{start}}) - f_d(z_{\mathrm{start}})}{f_{\mathrm{far}}(z_{\mathrm{start}}) - f_{\mathrm{near}}(z_{\mathrm{start}})}\,\bigl(f_{\mathrm{far}}(z) - f_{\mathrm{near}}(z)\bigr) \qquad (1)$$
where $f_{\mathrm{near}}(z)$ and $f_{\mathrm{far}}(z)$ denote in-focus motor positions at the zoom position $z$ for near and far objects, respectively, and $z_{\mathrm{start}}$ and $f_d(z_{\mathrm{start}})$ represent the initial zoom motor position and its corresponding in-focus motor position for an object at a distance $d$, respectively. The subscript "start" indicates that the in-focus motor position is obtained by performing auto-focusing before the zoom motor is moved. As shown in Figure 5, GZT actually uses the so-called GZT focus ratio $k$ described in Equation (2) to estimate the in-focus motor position:
$$k = \frac{f_{\mathrm{far}}(z_{\mathrm{start}}) - f_d(z_{\mathrm{start}})}{f_{\mathrm{far}}(z_{\mathrm{start}}) - f_{\mathrm{near}}(z_{\mathrm{start}})} \qquad (2)$$
Figure 6 shows the behaviour of the GZT focus ratio for targets at different distances. When the zoom is changed from wide-angle to tele-angle, the GZT focus ratio shows non-linear characteristics, resulting in large estimation errors when predicting the trace curves with GZT. Although AZT uses recalibration to improve its accuracy, it cannot completely avoid this type of error either, because it still fits a non-linear problem by linear interpolation.
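As an illustration, the GZT estimation can be sketched in a few lines of Python. The sketch assumes the linear-interpolation form in which the focus ratio is fixed once at the starting zoom position and then reused at every later zoom position. The function names and the curve values are hypothetical and are not measured lens data.

```python
def gzt_focus_ratio(f_near_start, f_far_start, f_d_start):
    """Focus ratio k fixed at the starting zoom position (assumed form)."""
    span = f_far_start - f_near_start
    if span == 0:
        # Near and far curves coincide (e.g., at the wide-angle end), so the
        # ratio is undefined: this is the one-to-many mapping problem.
        raise ValueError("near and far trace curves coincide at z_start")
    return (f_far_start - f_d_start) / span


def gzt_estimate(f_near_z, f_far_z, k):
    """Estimated in-focus motor position at a later zoom position z."""
    return f_far_z - k * (f_far_z - f_near_z)


# Hypothetical stored curves (focus motor steps) at three zoom positions.
f_near = [100.0, 120.0, 160.0]
f_far = [100.0, 110.0, 130.0]

# Auto-focusing at the second zoom position placed the object at 115 steps.
k = gzt_focus_ratio(f_near[1], f_far[1], 115.0)
estimate = gzt_estimate(f_near[2], f_far[2], k)
print(k, estimate)
```

Note that the ratio cannot be computed at the first zoom position of this example, where the two stored curves coincide; the sketch raises an error there rather than guessing a curve.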

Trace Curve Revision
To overcome the disadvantages of GZT and the issues associated with moving and switching objects, we use a feedback method to revise the estimated trace curve automatically in real time. The first step of the feedback method is to acquire the error from the system. We first consider the focus value (FV) [26][27][28], which is the measure of sharpness used in auto-focusing. As the focus value increases, the sharpness of the object's image increases. Figure 7 illustrates the focus values at each focus motor position and each zoom motor position, acquired using our digital video surveillance equipment, which is described in Section 3. Figure 7 shows that the highest focus values lie on the ridge of the mountain and that sharpness decreases gradually down the hillside. The ridge line is the real trace curve for the object in the experiment. Away from the trace curve, the corresponding focus value declines symmetrically on both sides of the mountain. Thus, the FV can be used as a measure of the offset between a test point and the real trace curve.
Using focus values, we propose the FZT method to maintain object sharpness during the entire zoom process, even when there are moving or switching objects in the scene. Figure 8 outlines the procedure. In the first stage, the initial estimated trace curve is given by the GZT model according to the geometric characteristics at the beginning of zooming. When the user changes the zoom from wide-angle to tele-angle, the approach requires a feedback period length fp to determine where it should revise the trace curve. If fp = 48, the system must detect the error once every 48 zoom motor steps. For example, if the first detection begins at motor position z = −2,536, the feedback mechanism will subsequently be run at z = −2,584, −2,632, −2,680, and so on.
As shown in Figure 8, if the current zoom position does not require revision, the zoom and focus motors are moved according to the current estimated trace curve without detection; otherwise, the system acquires the focus values at two corresponding probe points for real-time feedback revision. The probe points are the positions at which the focus values needed by FZT are measured, and they are symmetrically located on both sides of the current estimated trace curve. Figure 9 shows that the two probe point positions are calculated as $p_1 = p_0 + \mathit{ps}$ and $p_2 = p_0 - \mathit{ps}$, in which $p_0$ is the point on the current estimated trace curve at the corresponding zoom position, and ps is a probe step length parameter used to determine the positions $p_1$ and $p_2$. This parameter controls the detection boundary of the algorithm. A small ps may miss some tiny errors, whereas a large ps will increase the fluctuation of the trace curve. The ps can be either constant or variable. Here, we propose an adaptive selection mechanism: the ps is determined using the difference between the current and next focus motor positions on the estimated trace curve. The adaptive mechanism is described by Equation (3):
$$\mathit{ps} = \left| F_{\mathrm{next}} - F_{\mathrm{current}} \right| \qquad (3)$$
where $F_{\mathrm{current}}$ represents the focus motor position on the estimated trace curve at the current step, and $F_{\mathrm{next}}$ represents the focus motor position at the next step. Both of these are shown on the y-axis in Figure 9.
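The feedback schedule and the probe points can be sketched as follows. This is a minimal sketch under two assumptions: zoom motor positions decrease towards the tele-angle end, as in the fp = 48 example, and the adaptive probe step is taken as the absolute difference between the next and current focus positions on the estimated curve. The function names and the focus positions are hypothetical.

```python
def feedback_positions(z_first, fp, z_tele):
    """Zoom motor positions at which the feedback revision is run.

    Positions decrease by fp steps from z_first down to z_tele (inclusive).
    """
    return list(range(z_first, z_tele - 1, -fp))


def probe_points(f_current, f_next):
    """Probe positions p1 and p2 placed symmetrically around p0."""
    ps = abs(f_next - f_current)  # assumed adaptive probe step
    p0 = f_current                # point on the current estimated curve
    return p0 + ps, p0 - ps


schedule = feedback_positions(-2536, 48, -2700)
p1, p2 = probe_points(500, 512)
print(schedule, p1, p2)
```

With fp = 48 and a first detection at z = −2,536, the schedule reproduces the positions quoted in the text.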
In the second stage, the focus motor is moved from $p_1$ to $p_2$, and the corresponding focus values $e_1$ and $e_2$ are acquired at these two points, respectively. Because the focus value decreases symmetrically on both sides with increasing distance from the real trace curve, these two values are sufficient for the FZT algorithm to make the revision. If the estimated trace curve coincided with the real one, the probe points $p_1$ and $p_2$, being the same distance from the point $p_0$, would have approximately the same focus value. However, because the estimated trace curve often deviates from the real one, the focus values $e_1$ and $e_2$ are often different. By studying the relationship between these two values, we can determine in which direction the real trace curve lies. As illustrated in Figure 10, the red line represents the real focus value curve at the current zoom position, whereas the blue line represents the estimated focus value curve. In Figure 10, $e_2 > e_1$ indicates that the probe point $p_2$ is closer to the real trace curve than $p_1$ is. Thus, the estimated trace curve should be revised towards $p_2$ to approach the real one. In contrast, if $e_1 > e_2$, the estimated trace curve should be moved towards $p_1$. During the trace curve revision stage, revision is achieved by moving the next estimated position $p_e$ on the current estimated trace curve to $p_r$, as shown in Figure 9. The program then updates the GZT focus ratio $k$ from $p_r$ and rebuilds the estimated trace curve. The revision distance $\Delta S$, which will be discussed later, is calculated by the PID controller.
In addition to the feedback period fp, there are several other variable parameters in our FZT model. In the feedback area, the motors are moved following straight lines. The feedback area length fa, which consists of the front area length fra and the back area length bka, influences the fluctuations of the motor trace. A large fa value reduces the slope of the trace adjustment and causes less shaking in the image during the process. In other parts of the feedback period, the motors should be moved according to the current estimated trace curve.

The process of revision is given in Figure 9. When the zoom motor enters the first feedback area at point $p_s$, the probe positions $p_1$ and $p_2$ are calculated at the next step. The motors are then moved along straight lines to these positions to acquire the focus values $e_1$ and $e_2$. The error can then be obtained using Equation (4):
$$\Delta e = e_1 - e_2 \qquad (4)$$
Because $\Delta e < 0$ and $|\Delta e| > e_{\mathrm{thr}}$ in this example, the estimated trace curve should be revised towards $p_2$; here, $e_{\mathrm{thr}}$ is a threshold parameter that avoids system jittering. Next, the position $p_e$ on the current estimated trace curve is revised to $p_r = p_e + \Delta S_1$. The focus ratio $k$ is then recalculated from the position of $p_r$, and the new estimated trace curve $C_2$ is built by the classical GZT method.
The motors pass through the back area following the straight line from $p_2$ to $p_r$. They then move from $p_r$ to $p'_s$ following the curve $C_2$ without feedback and enter the second feedback area. Because $\Delta e > 0$ in this area, the estimated trace curve $C_2$ is judged to lie below the real trace curve. The position $p'_e$ is therefore revised to $p'_r$ on the curve $C_3$ by $\Delta S_2$. The green line in Figure 9 shows the actual motor trace during this process. The feedback mechanism is repeated throughout the entire zoom operation.
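A single revision step can be sketched as below. The sketch assumes that the error is the difference between the focus values at the two probe points (first minus second), so that a negative error moves the estimate towards the second probe point, and that the focus ratio is recomputed from the revised position by inverting the GZT linear interpolation. The revision distance is passed in as an argument, and all names and numbers are hypothetical.

```python
def probe_error(e1, e2, e_thr):
    """Signed focus-value error; errors within the threshold are ignored."""
    delta_e = e1 - e2
    if abs(delta_e) <= e_thr:
        return 0.0  # too small to act on: avoids jitter
    return delta_e


def revise_position(p_e, delta_s, f_near_z, f_far_z):
    """Move p_e by delta_s and recompute the GZT focus ratio from p_r."""
    p_r = p_e + delta_s
    k = (f_far_z - p_r) / (f_far_z - f_near_z)  # assumed inversion of GZT
    return p_r, k


# The second probe point is sharper, so the error is negative and the
# estimate is moved towards the lower focus position.
delta_e = probe_error(400.0, 520.0, 50.0)
p_r, k_new = revise_position(145.0, -5.0, 160.0, 130.0)
print(delta_e, p_r, k_new)
```

Passing the revision distance in separately keeps the direction logic independent of how the distance itself is computed.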

Revision Distance Control
The revision distance $\Delta S$ is a critical parameter that determines the regulating ability of the algorithm. If $\Delta S$ is smaller than the ideal value, the revised trace curve will not approach the real trace curve efficiently. However, if $\Delta S$ is too large, an overshoot error will occur. Because the revision is influenced not only by the current error but also by the previous errors, we use a proportional-integral (PI) controller to improve its accuracy. The PI controller, which is widely used in motor control, provides the control action according to the current and previous errors. Figure 11 shows the control structure of our FZT method. The controller action, which consists of proportional and integral components, can be written as Equation (5):
$$\Delta S(t) = K_P\left[\Delta e(t) + \frac{1}{T_I}\int_0^t \Delta e(\tau)\,d\tau\right] \qquad (5)$$
where $K_P$ is the proportional gain and $T_I$ is the integral time. The integral component accumulates all previous errors to compensate for the error value, with the intention of completely eliminating these errors in $T_I$ seconds. The resulting compensated error value is scaled by the proportional gain $K_P$.
Because Equation (5) applies only to continuous-time (analogue) systems, the integral component must be discretised for digital equipment. Equation (6) shows the conversion from the integral term to a sum of discrete errors:
$$\int_0^t \Delta e(\tau)\,d\tau \approx \sum_{j=0}^{i} \Delta e(j)\,\Delta t = T\sum_{j=0}^{i} \Delta e(j) \qquad (6)$$
where $\Delta t = T$ represents the sampling period and $i$ is the index of the current feedback step. In our experiments, the value of $T$ is set to 1. Equation (5) can then be rewritten in discrete form as Equation (7):
$$\Delta S(i) = K_P\left[\Delta e(i) + \frac{T}{T_I}\sum_{j=0}^{i} \Delta e(j)\right] \qquad (7)$$
Writing the same expression for the previous step gives Equation (8):
$$\Delta S(i-1) = K_P\left[\Delta e(i-1) + \frac{T}{T_I}\sum_{j=0}^{i-1} \Delta e(j)\right] \qquad (8)$$
Subtracting Equation (8) from Equation (7) converts Equation (7) to the incremental form in Equation (9), which simplifies the calculation and saves storage space. This equation only needs the last $\Delta S$ and the errors in the last two consecutive steps to calculate the revision distance:
$$\Delta S(i) = \Delta S(i-1) + K_P\left[\Delta e(i) - \Delta e(i-1)\right] + K_I\,\Delta e(i) \qquad (9)$$
where $K_I = K_P T / T_I$ is the integral coefficient.
Using the PI controller, FZT is able to complete its feedback procedure. However, the parameters $K_P$ and $T_I$ need tuning before use. Tuning a PI control loop involves adjusting these parameters to the optimum values for the desired control response. There are several methods for tuning a PI loop, including manual tuning, the Ziegler-Nichols method [29] and the Cohen-Coon method [30].
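The incremental PI update can be sketched as a small Python class. The sketch assumes the standard discrete PI relation in which the integral coefficient equals the proportional gain multiplied by the sampling period and divided by the integral time. The class name is hypothetical, the error sequence is invented for illustration, and the gains are those used for the stationary-object experiments (proportional gain 3, integral time 6, sampling period 1).

```python
class IncrementalPI:
    """Incremental PI controller: needs only the last output and last error."""

    def __init__(self, kp, ti, t=1.0):
        self.kp = kp
        self.ki = kp * t / ti  # assumed integral coefficient K_I
        self.prev_error = 0.0
        self.prev_output = 0.0

    def update(self, error):
        """Return the revision distance for the current feedback step."""
        output = (self.prev_output
                  + self.kp * (error - self.prev_error)
                  + self.ki * error)
        self.prev_error = error
        self.prev_output = output
        return output


def positional_pi(errors, kp, ti, t=1.0):
    """Positional form for comparison: uses the full error history."""
    return kp * (errors[-1] + (t / ti) * sum(errors))


errors = [-12.0, -4.0, 1.0]  # hypothetical focus-value errors
controller = IncrementalPI(kp=3.0, ti=6.0, t=1.0)
outputs = [controller.update(e) for e in errors]
print(outputs)
```

Both forms give the same revision distance at every step when the controller starts from zero state; the incremental form simply avoids storing the whole error history.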

Real-Time Hardware Implementation
The improved FZT algorithm and traditional methods were implemented on a high-speed TI TMS320DM365 digital video platform, and the focus value calculation for the 720p (1,280 × 720 pixels) image was simultaneously performed at 30 frames/s. Figure 12(a) shows the configuration of this platform. This platform consists of a zoom lens, CMOS chip, dedicated video capture board, lens control board, and PC.
For the high-speed camera head, we adopted a CHIOPT 18× zoom lens, whose zoom range is sufficiently large for the experiments. To increase motion accuracy, the zoom motor was driven by a program in four-subdivision mode, which divides each normal motor step into four smaller steps. Figure 12(b) shows an overview of the device. Twelve-bit RAW image data were produced by the 5-MP CMOS chip (MT9P031) and then transferred to the video capture board at 30 frames/s for 1,280 × 720 pixels. The video capture board is designed as a dedicated device for video capturing, transferring, processing, and focus value calculation. Figure 12 such as UART and Ethernet. Together, this board, the CMOS chip and the zoom lens form a standard internet protocol network camera (IPNC) system. The lens control board is another electronic module in this system. This board contains a C8051F microcontroller for estimating the trace curve. It receives zoom commands from the PC and focus values from the video capture board. The FZT algorithm is implemented here to determine the motor positions using focus values. The motor control signals are then produced by a dedicated motor control chip. The entire working procedure of our device is as follows:
(1) Receive zoom command from PC: The PC transfers the zoom command given by the user to the lens control board.
(2) Acquire motor position by estimated trace curve: Our FZT algorithm is an improved GZT that accounts for the focus value when revising the estimated trace curve. Before obtaining the corresponding focus value, the lens control board applies GZT to estimate the position of the focus motor. In the feedback area, the probe positions are also acquired on this board through FZT.
(3) Calculate the focus value: The focus value is calculated by the video capture board and sent to the lens control board. To meet the real-time requirement, we use analogue circuits. The corresponding analogue video signals are first output by the Video DAC in the TMS320DM365. Then, an analogue band-pass circuit is used to extract the high-frequency components. As the number of high-frequency components increases, the clarity of the image increases. A precision small-signal rectifier and an analogue integrator circuit are applied to build a voltage from the high-frequency components that represents the focus value. The 10-bit A/D converter embedded in the C8051F microcontroller is used to obtain the digital focus value from this voltage. Furthermore, the focus value can also be obtained digitally through the information contained in the H3A register in the TMS320DM365.
This hardware implementation can run as an IPNC video surveillance system that keeps the active object in focus during zoom operation. It can also serve as the base of various applications, including active tracking [31,32], salient recognition [33], speed estimation [34] and automatic driving [35,36]. In this paper, we use a PC with a 2.6-GHz Intel Pentium Dual-Core CPU and 2 GB of memory for observation and control.
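The idea that a sharper image contains more high-frequency content can be illustrated with a simple digital sharpness measure. The sketch below is not the analogue band-pass circuit or the H3A statistics used on the platform; it substitutes a common stand-in, the sum of squared differences between horizontally adjacent pixels. The function name and the two tiny test images are hypothetical.

```python
def focus_value(image):
    """Sum of squared horizontal pixel differences (gradient energy).

    image is a list of rows, each row a list of grey levels.
    """
    total = 0
    for row in image:
        for left, right in zip(row, row[1:]):
            total += (right - left) ** 2
    return total


# The same pattern with strong edges (in focus) and weak edges (defocused).
sharp = [[0, 255, 0, 255], [255, 0, 255, 0]]
blurred = [[96, 160, 96, 160], [160, 96, 160, 96]]

fv_sharp = focus_value(sharp)
fv_blurred = focus_value(blurred)
print(fv_sharp, fv_blurred)
```

Any measure that grows with edge contrast would serve the same illustrative purpose; only the ordering of the two values matters for the feedback step.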

Experimental Results and Discussion
In this section, we provide a comparison of our FZT with the traditional zoom tracking approaches of GZT, AZT, RZT and PZT. The performance measures considered include tracking accuracy, tracking speed, storage space and training requirements. Tracking accuracy was measured in terms of mean offsets between the estimated and real trace curves for stationary and moving objects, respectively. Tracking speed was measured in terms of the total zoom operation time, which is dependent on the lens' motor type. Training and storage requirements were measured using the demand of determining the optimal model parameters. The parameters of the PI controller are also discussed in this section. Finally, the drawbacks of our method observed in the experiments are discussed.
All of the experiments were realised by the digital video surveillance system described in Section 3. Due to the four-subdivision mode, the zoom motor position, which was four times the normal pattern (190 to −970), ranged from 760 to −3,880. This mode improved the precision of our experiments. Furthermore, all of the experiments described in this section were under the zoom direction of wide-angle to tele-angle because the reverse sequence does not cause the one-to-many mapping problem when applied.
Moreover, because there are many independent parameters in our proposed system, we discuss how to obtain these values here. The seven main parameters in our algorithm are $K_P$, $T_I$, $T$, fp, fra, bka and ps. The proportional gain $K_P$, integral time $T_I$ and sampling period $T$ are three important parameters for the PI controller. We propose a combined tuning method for setting these three values in our experiments, as it is relatively difficult to obtain satisfactory results using a single tuning method in complex surveillance environments. First, we use the Ziegler-Nichols method [29] to obtain the approximate values. Then, manual tuning is performed for further optimisation according to the actual effect of the algorithm. The different revision effects acquired by the various $K_P$ and $T_I$ values in our experiments will be discussed in Section 4.3 as a reference for the stage of manual tuning. Because fp, fra, bka and ps depend on the zoom lens, image sensor, control circuit and application environment, it is difficult to find a common setting method for them. They should be regulated based on the hardware and software conditions and the application environment, through several actual experiments in the user's specific working environment. We chose these values manually according to our digital surveillance platform and the scenes in our experiments. The feedback period fp controls the feedback frequency along the trace curve. A small fp value can increase the accuracy within a certain range through a frequent feedback procedure but causes increased time consumption and fluctuations on the trace curve. Thus, the fp value should achieve a balance between accuracy and user experience according to the specific application scene. Because user experience varies, this value setting mainly relies on actual tests and manual regulation.
When there are many high-speed moving objects or objects with complex movement, such as in traffic or outdoor video surveillance, the value of fp should be relatively reduced. Otherwise, the fp should be set relatively high for indoor surveillance. The effect of fp in our experiment will be further discussed in Section 4.3 for advanced reference. The front area length fra and back area length bka are two auxiliary parameters that also affect the user experience by influencing the motor trace fluctuations. Their values are often set to 1/4 or 1/5 of the feedback period fp, depending on the user experience. The probe step length parameter ps controls the detection boundary of the algorithm, for which we have proposed an adaptive mechanism to determine this boundary described in Section 2.2.
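The parameter choices discussed above can be collected in a small configuration object. This is a sketch only: the class and helper names are hypothetical, the integral coefficient assumes the usual discrete PI relation (proportional gain times sampling period divided by integral time), and the two example settings are the ones reported for the stationary-object and moving-object experiments in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FZTParams:
    kp: float        # proportional gain K_P
    ti: float        # integral time T_I
    fp: int = 96     # feedback period, in zoom motor steps
    fra: int = 24    # front area length
    bka: int = 24    # back area length
    t: float = 1.0   # sampling period T

    @property
    def fa(self):
        """Feedback area length: front area plus back area."""
        return self.fra + self.bka

    @property
    def ki(self):
        """Integral coefficient (assumed relation K_P * T / T_I)."""
        return self.kp * self.t / self.ti


def area_length(fp, divisor=4):
    """Front or back area length as 1/4 (or 1/5) of the feedback period."""
    return fp // divisor


stationary = FZTParams(kp=3.0, ti=6.0)
moving = FZTParams(kp=1.0, ti=8.0)
print(stationary.fa, stationary.ki, moving.ki, area_length(96), area_length(96, 5))
```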

Stationary Objects
The performance measures for tracking stationary objects during zoom operation were collected from 600 distinct scenes under different lighting conditions and various object distances. This evaluation was performed for the enlarged occasions in surveillance described in Section 1.3. The object distances were set to 2, 3, 5, 10 and 20 m. For each distance, 120 samples were obtained from the GZT, AZT, RZT, PZT (S = 5), PZT (S = 20) and FZT models (20 samples for each method). Because of its higher accuracy in comparison to the RNN model [11], we chose the ARX model for PZT in all of our experiments. PZT (S = 5) indicates that the PZT model was trained using only five characteristic trace curves before use, whereas PZT (S = 20) was trained using 20 curves. Figure 13 shows an example of the trace curve for a 3 m stationary object acquired using our FZT method. In this case, the parameters were set as follows: fp = 96, fra = bka = 24, $K_P$ = 3, $T_I$ = 6, $T$ = 1, and the adaptive probe step mechanism was applied to choose the ps. The real trace curve was obtained by running the global search auto-focusing function at each zoom motor position. The FZT trace curve was observed to fit the real trace curve tightly, with several small fluctuations.
Table 1 summarises the overall tracking accuracy of the developed FZT compared with the existing GZT, AZT, RZT and PZT approaches. From this table, it can be observed that FZT exhibits better tracking accuracy than most of the traditional methods. However, FZT does not improve on PZT trained with 20 trace curves, because a well-trained PZT adapts well to the one-to-many mapping problem. If PZT has not been trained sufficiently, as shown in the PZT (S = 5) results, it may lose this advantage. The distributions of offsets for all of the approaches in these experiments are shown in Figure 14. The cases are divided into two groups: 0 m to 10 m stationary objects and 10 m to 20 m stationary objects.
The offsets of most points on the FZT trace curve were within five steps. The experiments also showed that there was a tolerance threshold of focus position offset for human vision. If the offset stays below this threshold, the user will not experience discomfort. This threshold is not a constant value but a variable that gradually increases from 10 to 30 steps when the zoom is changed from wide-angle to tele-angle in our system. Thus, the small fluctuations from probe steps on the FZT trace curve did not cause user discomfort. To compare the approaches for situations involving the one-to-many mapping problem, a further study was performed for the different zooming sequences shown in Figure 15. As indicated in this figure, the four different zooming sequences depend on the location of the initial and stopping zoom motor positions with respect to the boundary zoom position. Zooming Sequence-3 (ZS-3) incorporates the sequences that generate the one-to-many mapping problem because the zoom motor is moved from the linear region to the non-linear region on the trace curves.
Table 2 provides the overall tracking accuracies for each sequence region. For stationary objects, PZT (S = 20) generated the smallest mean offset of 8.13 motor steps for ZS-3 compared to the other approaches and worked better for the other three sequences as well. By comparison, FZT exhibited a mean offset of 8.37 motor steps, which was slightly more than that of PZT (S = 20). The FZT model was found to work better than most of the existing methods for tracking stationary objects, with the exception of the PZT model with sufficient training. However, FZT does not require any specified training before tracking; thus, it is more suitable for use in complex environments in which the user is not able to acquire a sufficient number of accurate training trace curves.
It can also be applied to a video surveillance system with many different lens configurations in which the RZT or PZT models would have to be trained for every lens.
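The offset statistics discussed above can be illustrated with a short sketch. It is not the authors' evaluation code: the function names are hypothetical, the offset is taken as the absolute difference in focus motor steps between the estimated and real trace curves at each zoom position, and the tolerance threshold is assumed to rise linearly from 10 to 30 steps between the wide-angle and tele-angle ends (the text states only that it increases gradually).

```python
from typing import Sequence, Tuple


def tolerance_threshold(zoom_index: int, last_index: int,
                        wide: float = 10.0, tele: float = 30.0) -> float:
    """Tolerance (in focus motor steps) at a zoom position.

    Linear interpolation between the wide-angle and tele-angle values
    is an assumption made for illustration only.
    """
    return wide + (tele - wide) * (zoom_index / last_index)


def offset_stats(estimated: Sequence[float],
                 real: Sequence[float]) -> Tuple[float, float]:
    """Return (mean offset, fraction of points within the tolerance).

    Both curves hold one focus motor position per zoom motor position,
    ordered from wide-angle (index 0) to tele-angle (last index).
    """
    if len(estimated) != len(real) or len(real) < 2:
        raise ValueError("curves must have equal length of at least 2")
    last = len(real) - 1
    offsets = [abs(e - r) for e, r in zip(estimated, real)]
    mean_offset = sum(offsets) / len(offsets)
    within = sum(1 for i, off in enumerate(offsets)
                 if off <= tolerance_threshold(i, last))
    return mean_offset, within / len(offsets)
```

Calling `offset_stats(fzt_curve, real_curve)` on two sampled curves yields the mean offset of the kind reported in the accuracy tables, together with the share of zoom positions at which the offset would stay below the assumed tolerance.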

Moving and Switching Objects
Experiments were also performed to evaluate the robustness in tracking moving or switching objects. Figure 16 shows the focus values of an object moving from 6 m to 8 m. The focus values acquired by our equipment clearly reveal the real trace curve. Therefore, the feedback mechanism can use these values to revise the estimated trace curve. Figure 17 shows the FZT trace curves for an object moving from 3 m to 4 m and an object moving from 5 m to 8 m compared with the RZT and PZT models. In these cases, the FZT parameters were set as follows: fp = 96, fra = bka = 24, K_P = 1, T_I = 8, T = 1, and the adaptive ps was used. The FZT trace curve was observed to be closer to the real trace curve than the RZT and PZT curves because of its real-time revision based on the feedback mechanism. To further study the robustness in tracking moving or switching objects, we performed another 500 experiments for tracking objects moving from 2 m to 3 m, 5 m, 8 m, 10 m and 20 m. For each moving-distance group, 20 cases under different scenes were run for each tracking method. In these experiments, the FZT parameters were set as follows: fp = 96, fra = bka = 24, K_P = 1, T_I = 8, and T = 1. Table 3 provides the average tracking accuracy for these experiments. The FZT approach was more robust than the other existing approaches. Furthermore, the mean offset of FZT grew slowly as the moving distance increased. Both effects are attributable to the additional real-time revision of the estimate. Another 500 experiments under similar parameter conditions were performed to validate the robustness for tracking switching objects in various scenes. In the 2 m; 3 m group, we set two test objects at 2 m and 3 m. When the main target switched from 2 m to 3 m, the FZT model exhibited the smallest mean offset among the algorithms, 8.41 motor steps (Table 4).
Table 4 shows the overall accuracy results for this type of experiment. Unlike a moving object, a switching object produces a transition in the real trace curve because the switching targets in the scene are at different object distances. The focus value of the image increases as the motor positions approach the characteristic trace curve of the new target. Thus, the real-time feedback mechanism of FZT can gradually revise the estimated trace curve towards the trace curve of the new object by following the direction of increasing focus value. The revision mainly affects a small range of the trace curve, in which the main object of the image switches. Outside this range, FZT has little influence on the estimated trace curve. The experimental results show that FZT is more robust than the other existing methods in tracking switching objects. Figure 18 shows the offset distributions for the 2 m; 5 m and 2 m; 20 m groups in the experiments. Most of the offsets on the FZT trace curve were within 10 steps, whereas more than 40% of the offsets on the other trace curves exceeded 15 steps. Such large offsets may cause user discomfort. Thus, among the compared methods, FZT is the best choice for scenes that contain many moving or switching objects.

Control Parameters
The control parameter setting is an important problem in applications of FZT. In this section, we discuss the feedback period fp, the proportional gain K_P and the integral time T_I. The feedback period fp controls the feedback frequency along the estimated trace curve. Figure 19 shows the FZT trace curves for tracking an 8 m stationary object with different feedback periods under fra = bka = 24, K_P = 3, T_I = 6, and T = 1. A group of 20 experimental cases was performed for each fp value, and the average accuracies and time consumption are shown in Table 5. A small fp value caused the feedback procedure to occur frequently. This increased the accuracy within a certain range but also increased the time consumption and the fluctuations along the trace curve. Moreover, overly frequent revision might at times reduce the tracking accuracy because of the overshoot effect. Table 5 shows that fp = 96 was a suitable value for our device in this experiment because it gave high accuracy at a relatively low time consumption. After the discussion of fp, we consider the proportional gain K_P of the PI controller. The parameter K_P determines the revision magnitude. To show the magnitude clearly, we use feedback response curves, in which fp = fa and the motors are moved along the straight connections between probe points. For instance, to produce the feedback response curve in Figure 9, the motors should be moved using the following sequence: p_1, p_2, p'_1, p'_2, p''_1, p''_2. This type of curve causes the feedback operation to run throughout the time period and shows the revision distance ΔS directly through the amplitude of the curve. Figure 20 shows the feedback response curves for tracking the same 8 m stationary object with T_I = 6, T = 1 and K_P = 1, 3, 5, and 8. K_P was observed to influence the magnitude significantly.
A high K_P caused a large fluctuation on the response curve, which indicates a strong adjustment of the estimated trace curve. In contrast, a small K_P has a weak revision effect and is not able to compensate for the error in time. Thus, the choice of K_P should be based on the offset between the estimated and actual trace curves. For stationary objects, K_P can be set to a small value, whereas a larger K_P is necessary for tracking moving or switching objects. The integral term in the PI controller accumulates the past errors over time and adds them to the revision distance ΔS as a complementary effect. The parameter T_I controls the speed at which the accumulated errors are released to the revision distance. Figure 21 illustrates the feedback response curves for tracking the same 8 m stationary object with K_P = 5, T = 1 and T_I = 1, 3, and 8. As observed in Figure 21, a large T_I value reduced the fluctuations of the response curve. However, Figure 22 shows two additional cases, with T_I = 10 and 20, in which an excessively large T_I could not achieve a sufficient feedback result because it reduced the role of the integral term. Thus, T_I should be set according to the K_P value, balancing the revision effect against the fluctuation.
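The roles of the proportional gain, the integral time and the sampling period T can be illustrated with a minimal sketch of a textbook positional discrete PI law, ΔS_k = K_P (e_k + (T / T_I) Σ e_i). This is an assumption made for illustration: the exact revision formula and the way the error is derived from focus values are those defined earlier in the paper and are not reproduced here. The class name `PIRevision` and the treatment of the error as a ready-made input are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PIRevision:
    """Textbook positional discrete PI law (illustrative assumption)."""
    kp: float          # proportional gain K_P
    ti: float          # integral time T_I
    t: float = 1.0     # sampling period T
    acc: float = 0.0   # running sum of all errors seen so far

    def step(self, error: float) -> float:
        """Return the revision distance for the current feedback step."""
        self.acc += error
        proportional = error
        integral = (self.t / self.ti) * self.acc
        return self.kp * (proportional + integral)
```

Under this form, the whole revision scales with `kp`, which matches the observation that a high proportional gain produces large fluctuations. Increasing `ti` shrinks the integral contribution, which is consistent with the weakened feedback reported for excessively large integral times.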

Speed and Drawback
Because zoom tracking is a real-time application, tracking speed is also a key issue. Table 6 summarises the time consumption for the experiments with stationary, moving and switching targets. AZT took the largest amount of time because of its recalibration when crossing the boundary zoom position. FZT with fp = 96 had the second-highest time cost because of an additional 637 ms for feedback revision. Thus, FZT sacrifices speed in exchange for accuracy. The comparison of other performance measures is summarised in Table 7. The following observations are made from this table. (1) GZT, AZT and FZT do not require any training procedures, while RZT and PZT require a minimum of 20 trace curves to generate an acceptable tracking result; (2) To revise the estimate, FZT requires some additional memory, but this additional requirement does not grow as the number of discrete zoom motor positions N increases. Thus, FZT keeps the storage usage on the order of N, similar to GZT, AZT and RZT, as opposed to PZT, which requires storage on the order of N^2; (3) Unlike AZT, FZT does not cause discomfort when crossing the boundary zoom position. However, it can cause user discomfort when the fluctuations on its trace curve are severe. Fortunately, this rarely occurs when suitable parameters are chosen for the PI controller; (4) Because of its feedback, FZT demonstrates robustness for the moving or switching objects that often appear in video surveillance, while RZT and PZT have large offsets in these situations. Therefore, based on the above observations, FZT not only solves the one-to-many mapping problem but also improves the tracking robustness. Finally, it is also worth mentioning that, similar to GZT, AZT, RZT and PZT, our FZT method may fail in scenes in which there are two main targets at different distances because of an incorrect estimate acquired by auto-focusing at the beginning of the algorithms.
Figure 23 shows one example of this failure. There are two peak lines in the figure that indicate the two main targets, whereas only one line is present in the normal case, as shown in Figure 16. Because of its relatively high focus value, the additional peak line disturbs the auto-focusing program, which then builds an incorrect estimated trace curve. Because of this incorrect estimate at the beginning of the algorithm, FZT may be trapped in local adjustment along the wrong curve. It should also be noted that this drawback is caused not by the feedback mechanism but by the auto-focusing procedure. Thus, all of the existing zoom tracking approaches that use the auto-focusing program at the beginning of the algorithm share this drawback. Advanced auto-focusing techniques that take image content into account could be used to address this shortcoming.
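The two-peak situation described above can be made concrete with a small sketch that counts competing peaks in a focus value profile sampled over the focus motor range at a fixed zoom position. It is not part of the FZT method and is not the content-aware auto-focusing technique mentioned in the text. The function name and the 0.5 ratio are hypothetical choices for illustration.

```python
from typing import List, Sequence


def competing_peaks(focus_values: Sequence[float],
                    min_ratio: float = 0.5) -> List[int]:
    """Return indices of local maxima that rival the global maximum.

    A sample counts as a peak if it is higher than its left neighbour,
    not lower than its right neighbour, and at least `min_ratio` times
    the largest focus value in the profile.
    """
    if len(focus_values) < 3:
        return []
    top = max(focus_values)
    peaks = []
    for i in range(1, len(focus_values) - 1):
        v = focus_values[i]
        is_local_max = v > focus_values[i - 1] and v >= focus_values[i + 1]
        if is_local_max and v >= min_ratio * top:
            peaks.append(i)
    return peaks
```

A profile that returns more than one index would correspond to a scene with two main targets, in which the initial auto-focusing estimate may lock onto the wrong peak.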

Conclusions
In this paper, a robust feedback zoom tracking method has been introduced for digital video surveillance systems. This real-time method uses focus values and a closed-loop PI controller to revise the estimate of the trace curve. To assess performance, a real-time hardware implementation of the FZT algorithm, along with commonly used methods, was carried out on an actual digital video platform. Extensive experiments under different lighting conditions for both stationary and moving objects revealed that the proposed feedback method achieves better accuracy without pre-training than the commonly used approaches. The feedback mechanism may cause some fluctuations on the trace curve, but they typically stay within the tolerance level of human vision if the method parameters are properly chosen. Although it takes a little more time than traditional methods, the FZT method improves the robustness and adaptability of zoom tracking, particularly for moving or switching objects in video surveillance.