Adaptive Framework for Multi-Feature Hybrid Object Tracking

Abstract: Object tracking is a computer vision task deemed necessary for high-level intelligent decision-making algorithms. Researchers have merged different object tracking techniques into a new class of hybrid algorithms based on embedding a meanshift (MS) optimization procedure into the particle filter (PF), known as MSPF, to replace the PF's inaccurate and expensive particle validation process. These algorithms employ a combination of predetermined features, implicitly assuming that the background will not change. However, the assumption of a fully specified background may not often hold, especially in an uncontrolled environment. The first innovation of this research paper is the development of a dynamically adaptive multi-feature framework for MSPF (AMF-MSPF) in which features are ranked by a ranking module and the top features are selected on the fly. As a consequence, it improves local discrimination of the object from its immediate surroundings. It is also highly desirable to reduce the already complex framework of the MSPF to free resources for the feature ranking module. Thus, the second innovation of this research paper introduces a novel technique for the MS optimization method that reduces its traditional complexity by an order of magnitude. The proposed AMF-MSPF framework is tested on different video datasets that exhibit challenging constraints. Experimental results have shown robustness, tracking accuracy and computational efficiency against these constraints. Comparison with existing methods has shown significant improvements in terms of root mean square error (RMSE), false alarm rate (FAR), and F-score.


Introduction
The increase in the computational power of existing systems has led to huge investments in automated data analysis. Object tracking is one such class of algorithms; it automatically locates a region of interest, possibly obscured by challenging constraints [1]. These constraints define the requirements that should be considered while developing real-time, robust object tracking algorithms. Earlier methods successfully tracked objects from stationary cameras using background subtraction techniques [2], and, when combined with data association techniques, they could track multiple objects [3]. However, these methods require the scene structure to be known in advance. Fukunaga and Hostetler introduced a method based on meanshift (MS) gradient estimation that follows a reference template/model [4]. Later, Comaniciu, Ramesh, and Meer used MS to solve the tracking problem [5], and it has maintained a continuous presence in the computer vision community ever since [6-9].
MS is a non-parametric method that determines the location of the object by measuring the distance between the histograms of the target and candidate templates. This is achieved by maximizing a similarity measure, the Bhattacharyya coefficient, until convergence. The MS method is robust against partial occlusion and can adapt to size/scale variation, albeit through an expensive mixture of Gaussians (MoG) [8,10,11]. Nevertheless, since the histogram disassociates the target from its neighbourhood pixel information, MS fails under constraints such as fast object motion, cluttered background, and full occlusion. The method restricts the object to small displacements in consecutive frames, since some part of the object must lie within the basin of search. When the object is lost, MS either converges very slowly or loses the object entirely; MS also slows down as the size of the object increases.
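Since the similarity measure at the core of MS is the Bhattacharyya coefficient between m-bin histograms, a minimal sketch may help; the 4-bin histograms below are toy values for illustration, not taken from any tracker:

```python
import numpy as np

def bhattacharyya(p, q):
    """Bhattacharyya coefficient between two normalized m-bin histograms.
    Returns a value in [0, 1]; larger means more similar distributions."""
    return float(np.sum(np.sqrt(np.asarray(p, float) * np.asarray(q, float))))

p = np.array([0.5, 0.5, 0.0, 0.0])   # toy 4-bin target histogram
q = np.array([0.0, 0.0, 0.5, 0.5])   # toy candidate with disjoint support
same = bhattacharyya(p, p)           # → 1.0 (identical distributions)
diff = bhattacharyya(p, q)           # → 0.0 (no overlapping bins)
```

Maximizing this coefficient over candidate locations is what drives the MS iterations toward the target.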
On the other hand, the statistical category treats the tracking problem as a recursive computation in state space. Kalman filters (KF) and particle filters (PF) are the most popular methods in this category. KF gives an optimal estimate under the assumptions of a linear state transition and Gaussian measurement and process noise [12]. PF waives off these restrictive hypotheses and estimates the posterior density by combining random samples associated with weights [13-21]. Nummiaro et al. [16], Isard et al. [18] and Pérez et al. [19] are among the most prominent works that used color histograms with the PF method for object tracking. In [16], a color-based PF method was developed and compared with the MS method and with MS combined with KF; it showed good performance against occlusion and non-linear object motion. The method of [18] used color and shape to represent the target; it succeeded in multi-object tracking, but the tracker drifts from the target when multiple people cross it. Most of these methods use a global histogram for object representation, whereby spatial information is lost. Pérez et al. introduced a technique that models multiple parts of the target to recover some of the spatial information; it is robust against cluttered background, size/scale variation and occlusion [19]. All of this work has demonstrated the robustness and accuracy of the PF method over MS and KF; however, the accuracy depends on a large number of particles. This dependency is overcome by embedding the MS optimization into the PF framework, which gives rise to the class of hybrid object tracking algorithms discussed next.
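The PF recursion described above, propagate samples, weight them by a measurement likelihood, then resample, can be sketched in one dimension; the random-walk motion model and Gaussian likelihood are illustrative assumptions, not any particular paper's models:

```python
import numpy as np

rng = np.random.default_rng(0)

def resample(particles, weights):
    """Systematic resampling: duplicate high-weight particles, drop low-weight ones."""
    n = len(weights)
    positions = (rng.random() + np.arange(n)) / n       # one stratified draw per slot
    idx = np.searchsorted(np.cumsum(weights), positions)
    idx = np.minimum(idx, n - 1)                        # guard against rounding at 1.0
    return particles[idx], np.full(n, 1.0 / n)

# 1-D toy tracker step: propagate, weight by a Gaussian likelihood, resample.
particles = rng.normal(0.0, 5.0, size=200)              # prior samples of position
particles = particles + rng.normal(0.0, 1.0, size=200)  # random-walk state transition
measurement = 3.0
weights = np.exp(-0.5 * (particles - measurement) ** 2)
weights /= weights.sum()                                # normalized importance weights
particles, weights = resample(particles, weights)
estimate = particles.mean()                             # posterior mean, near 3.0
```

The accuracy of `estimate` improves with the particle count, which is exactly the cost that the hybrid MSPF methods below try to avoid.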

Materials and Methods
This section discusses the hybrid object tracking algorithm based on embedding MS into the PF methodology (MSPF). In this technique, the MS optimization procedure replaces the expensive and inaccurate particle validation process of the PF method. The MS optimization reduces the particle count by finding the local mode for each particle, which increases the accuracy of the particles' states and consequently reduces the need for a large number of particles.
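The local-mode-seeking step that MS contributes can be illustrated in one dimension; the flat-kernel window and bandwidth below are illustrative assumptions, not the kernel used later in this paper:

```python
import numpy as np

def mean_shift_1d(samples, start, bandwidth, iters=20):
    """Move `start` toward a local mode of the sample density (flat kernel)."""
    x = start
    for _ in range(iters):
        window = samples[np.abs(samples - x) <= bandwidth]  # samples in the basin
        if window.size == 0:
            break                                           # nothing in the basin
        x_new = window.mean()                               # shift to the local mean
        if abs(x_new - x) < 1e-6:
            break                                           # converged
        x = x_new
    return x

# Pixels clustered around 4.0; a particle at 2.5 whose basin overlaps the
# cluster is herded to the mode, refining its state.
samples = 4.0 + np.linspace(-0.5, 0.5, 50)
refined = mean_shift_1d(samples, start=2.5, bandwidth=1.5)  # converges near 4.0
```

Applying such a refinement to each particle is what lets MSPF reach a given accuracy with far fewer particles.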
There is a volume of research confirming the accuracy and robustness of MSPF-based hybrid tracking methods [22-29]. Shan et al. combined PF and MS to track a hand for an intelligent-wheelchair application, using color and motion cues in a complementary fashion to offset the error of a feature that fails to discriminate the object [23]; the results are robust against cluttered backgrounds and illumination variation. In the same hybrid category, Yao et al. introduced a particle-filter-based kernel object tracking (PFKBOT) algorithm that incorporates an incremental Bhattacharyya dissimilarity into the MS optimization method [24]. The idea is to continuously distinguish background particles from those lying in the target region; the results show robustness to background clutter and occlusion, however at the cost of large computational power. Likewise, color has been complemented with motion information to solve the problems of background clutter and mild illumination change [26]. A Two Stage Hybrid Tracker (TSHT) was developed that uses color histograms in combination with orientation edge histograms to capture the shape and inner edges of the object [27]. It fails under occlusion but is nevertheless robust to changes in size/scale and fast motion; however, TSHT uses 5 MS iterations per particle to compute probable locations and therefore does not meet real-time requirements. A summary of the literature review is given in Table 1.
Table 1. Summary of state-of-the-art hybrid object tracking algorithms.

Statistical (Particle Filters)
Ref. | Features | Constraints addressed
—    | Color | Non-rigid deformations, partial occlusions and cluttered background
[16] | Color | Cluttered background, occlusion, size/scale variation and light illumination
[17] | Color and edge orientation | Cluttered background and short-period occlusion
[18] | Color and shape | Large object motion, partial occlusion
[19] | Color | Cluttered background, occlusion and size/scale variation
[28] | Motion model | Illumination variation and partial occlusion
[20] | Color + motion cue | Fast motion, light clutter, illumination change

MSPF-based hybrid systems
Ref. | Features | Constraints addressed
[21] | Color | Multiple hypotheses, cluttered background
[22] | HSV color | Cluttered background, size and scale
[23] | Color + motion model | Occlusion, clutter and fast motion
[24] | Color + motion model | Fast object motion and cluttered background
[25] | Color | Background clutter, full occlusion
[26] | HSV color components | Size/scale, fast object motion, occlusion
[27] | Color + edge orientation histogram | Size/scale, fast motion
[29] | Color + motion model | Cluttered background, light illumination variation and full occlusion
[30] | Color + local integral orientation | Scale and pose variation

These approaches rely on a combination of predetermined, fixed features, neglecting changes in the background induced by the moving object. This creates an impasse when the object moves toward a background that camouflages it with a similar texture or any other abrupt change. Consequently, the maneuvering of the object constrains the algorithm. This research work takes the above-mentioned shortcomings into account and presents the following two novelties:
1.
The proposed AMF-MSPF framework implements a feature ranking module on top of the MSPF methodology, yielding an adaptive multi-feature framework that selects the required features on demand. This enables the object tracking algorithm to discriminate the object locally. The feature ranking module re-initializes the MS procedure with new features and is triggered by re-sampling: whenever re-sampling occurs, a new set of N_F features is selected by the ranking module and used to update the target model.


2.
As the PF algorithm is itself a very compute-intensive method, embedding MS into its particle validation process increases its computational load further. It is therefore pertinent to reduce the complexity of the MS method so that the proposed framework can run in real time. We propose a novel MS optimization method based on the observation that MS requires only a fraction of the samples to track accurately. This yields a large reduction in computational load without inducing any significant error.
The rest of the paper is organized as follows: The mathematical formulation of the proposed AMF-MSPF framework is explained in Section 3. The experimental results are presented in Section 4. The concluding remarks along with future directions are summarized in Section 5.

Proposed Framework
The AMF-MSPF hybrid framework is developed in this section. In this framework, multiple features are adaptively selected from a large pool and used by the MSPF method. Initially, a set of particles and their associated weights are generated using the dynamic state equations (i.e., Equations (16) and (17)). A template of the object is initialized from these particles and processed through the ranking module to select the top N_F

features. The MS optimization uses these features to validate the particles by herding them toward more precise locations, or states. The new states of the particles reinforce the PF with more accurate measurements. A mathematical formulation, as shown in Figure 1, is developed in this subsection.

Feature Ranking
Unlike the domains of image understanding and medical applications, where offline feature ranking and selection have been adopted successfully, the broad spectrum of applications in the computer vision domain is directly tied to real-time processing. We have used a simple ranking criterion adopted from Collins [9] to accommodate multiple features that are considered by the tracker on the fly. The ranking module ranks a pool of features based on the variance ratio between the background and foreground. The top-ranked features are input into the AMF-MSPF framework; as a result, it copes with changing conditions such as background clutter and abrupt illumination changes. In this subsection, we describe the feature ranking module.

Our feature space F is formed by linear combinations of the RGB color components with coefficients c1, c2, and c3:

    F = { c1 R + c2 G + c3 B },  ci ∈ {−2, −1, 0, 1, 2}    (1)

A combination of these coefficients produces 5^3 possible candidates in our feature space; filtering out the redundant and useless cases leaves around 50 features. Let fg(i) and bg(i) be the normalized discrete densities of the foreground and background respectively, discretized into 16 bins for efficiency. The log-likelihood ratio ℓ(i) of these densities is given by Equation (2):

    ℓ(i) = log( max{fg(i), δ} / max{bg(i), δ} )    (2)

where δ is a small constant that prevents division by zero. Multiple likelihood images are generated from these likelihood ratios for use by the MS procedure. We use the traditional definition of variance, var(x) = E(x^2) − [E(x)]^2, to compute the variance of ℓ(i) with respect to the object and background densities and thereby maximize the inter-class variance between them. Equations (3) and (4) compute these variances:

    var(ℓ; fg) = Σ_i fg(i) ℓ^2(i) − [ Σ_i fg(i) ℓ(i) ]^2    (3)

    var(ℓ; bg) = Σ_i bg(i) ℓ^2(i) − [ Σ_i bg(i) ℓ(i) ]^2    (4)

In the final ranking step, Equation (5) computes the variance ratio (VR), which maximizes the inter-class variance of the foreground and background over the feature space F:

    VR(ℓ; fg, bg) = var(ℓ; (fg + bg)/2) / [ var(ℓ; fg) + var(ℓ; bg) ]    (5)

The variance ratios are sorted and the top N_F features, i.e., those with the highest discrimination scores, are retained. The likelihoods corresponding to these features form new likelihood images, which initialize the MS optimization procedure that produces new N_F locations for each particle. Symbolically, Equations (1)-(5) are represented as F_N = featureRank(F): the ranking module produces the N_F top features. The MS optimization procedure is explained in the next subsection.
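Collins' variance-ratio criterion described above can be sketched as follows; the bin count, the δ floor inside the likelihood ratio, and the toy foreground/background pixels are assumptions for illustration:

```python
import numpy as np
from itertools import product

def log_likelihood_ratio(fg, bg, delta=1e-3):
    # Log-likelihood ratio of normalized fg/bg histograms; delta avoids log(0).
    return np.log(np.maximum(fg, delta) / np.maximum(bg, delta))

def variance(density, L):
    # var(x) = E(x^2) - E(x)^2 under a discrete density.
    return np.sum(density * L ** 2) - np.sum(density * L) ** 2

def variance_ratio(fg, bg):
    # Inter-class variance over total within-class variance.
    L = log_likelihood_ratio(fg, bg)
    return variance(0.5 * (fg + bg), L) / (variance(fg, L) + variance(bg, L) + 1e-12)

def rank_features(fg_pixels, bg_pixels, n_top, bins=16):
    """Rank candidate features w = c1*R + c2*G + c3*B by variance ratio."""
    candidates = [c for c in product([-2, -1, 0, 1, 2], repeat=3) if any(c)]
    scores = []
    for c in candidates:
        f_fg = fg_pixels @ np.array(c, dtype=float)
        f_bg = bg_pixels @ np.array(c, dtype=float)
        lo = min(f_fg.min(), f_bg.min())
        hi = max(f_fg.max(), f_bg.max()) + 1e-9
        fg_h = np.histogram(f_fg, bins, (lo, hi))[0].astype(float)
        bg_h = np.histogram(f_bg, bins, (lo, hi))[0].astype(float)
        scores.append((variance_ratio(fg_h / fg_h.sum(), bg_h / bg_h.sum()), c))
    scores.sort(reverse=True)
    return scores[:n_top]

# Toy regions: reddish foreground vs. gray background of equal brightness, so
# pure intensity (1, 1, 1) discriminates poorly while e.g. (2, -1, -1) ranks high.
rng = np.random.default_rng(0)
fg_pixels = np.array([200.0, 50.0, 50.0]) + rng.normal(0, 5, (300, 3))
bg_pixels = np.array([100.0, 100.0, 100.0]) + rng.normal(0, 5, (300, 3))
top = rank_features(fg_pixels, bg_pixels, n_top=3)
```

The returned list plays the role of featureRank(F): the highest-scoring linear combinations are handed to the MS stage.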

MS Optimization Procedure
The MS procedure is a non-parametric gradient descent method used to move each particle's state s_{k+1}^i to a new state ŝ_{k+1}^i that is more precise and accurate. MS is initialized in each of the N_F features and applied to every particle, which yields as many new locations per particle; these N_F locations are then merged, for each particle, by taking their median. Evaluating the MS procedure for every particle over multiple features is apparently an intensive computation. In the proposed MS optimization technique, however, the complexity is reduced by an order of magnitude: weights are estimated for each pixel of the object, which is first segmented using the normalized-cut algorithm of Shi and Malik [31], and only a fraction of the samples is picked from the reference and candidate regions according to these estimates. Essentially, the proposed MS operates only on a fraction of the samples, which reduces its complexity by an order of magnitude.

The heart of any MS procedure is the maximization of the Bhattacharyya coefficient, given by Equation (6):

    ρ(y) = Σ_{u=1}^{m} sqrt( p_u(y) q_u )    (6)

The Bhattacharyya coefficient ρ is maximized over all N_F features. q_u and p_u(y) are the m-bin discrete color histograms of the target and the candidate respectively; the bins are simply a series of small intervals that divide the whole range into smaller ones to reduce the computational load. These densities are given by Equations (7) and (8):

    q_u = C Σ_{i=1}^{n_s} k( ||x_i*||^2 ) δ[ b(x_i*) − u ]    (7)

    p_u(y) = C_h Σ_{i=1}^{n_h} k( ||(y − x_i)/h||^2 ) δ[ b(x_i) − u ]    (8)

where x_i*, i = 1, …, n_s, and x_i, i = 1, …, n_h, are the pixels sampled from the target and the candidate respectively, b(·) maps a pixel to its bin index, and δ[·] is the delta function, equal to 1 only at the particular bin u and 0 otherwise. n_s and n_h represent the fraction of random samples taken inside the target and candidate. To represent q_u and p_u as densities, the coefficients C and C_h restrict the summations Σ q_u and Σ p_u to 1, and k is a monotonically decreasing convex kernel. Traditionally, the Bhattacharyya coefficient is maximized using densities spatially weighted by an Epanechnikov kernel; however, such multivariate kernels do not cope well with non-rigid objects. We instead employ a monotonically decreasing function to select the fraction of samples picked from the object and its immediate surroundings for spatial weighting. By using only a fraction of the samples, the computational load is reduced considerably. Equation (9) computes the similarity measure as a distance between the target and the candidate using this fraction of samples:

    d(y) = sqrt( 1 − ρ(y) )    (9)

The MS procedure maximizes Equation (6) and consequently minimizes Equation (9). The optimization is iterative and is initialized in the previous frame at the target position y_0. The new location y_1 is evaluated based on the convergence of Equation (6). Expanding the Taylor series around ρ(y_0), a linear approximation of Equation (6) is obtained after some manipulation [6-8]:

    ρ(y) ≈ (1/2) Σ_{u=1}^{m} sqrt( p_u(y_0) q_u ) + (1/2) Σ_{u=1}^{m} p_u(y) sqrt( q_u / p_u(y_0) )

The MS iterations herd the particles to more probable points, and hence the tracker has a chance to recover the object. The proposed framework executes the feature ranking module whenever re-sampling occurs. The next subsection presents the pseudo code of the AMF-MSPF framework.
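The central claim, that a fraction of the samples suffices to estimate the histograms MS works with, can be checked with a quick sketch; a uniform random subsample stands in for the segmentation-guided sampling described above, which is an assumption made purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def hist_from_samples(values, bins=16, frac=1.0):
    """m-bin normalized histogram computed from a random fraction of the pixels."""
    n = len(values)
    if frac < 1.0:
        values = rng.choice(values, size=max(1, int(frac * n)), replace=False)
    h, _ = np.histogram(values, bins=bins, range=(0, 256))
    h = h.astype(float)
    return h / h.sum()

def bhattacharyya(p, q):
    return float(np.sum(np.sqrt(p * q)))

# Target and candidate regions drawn from similar intensity distributions.
target = rng.normal(120, 20, 4000).clip(0, 255)
candidate = rng.normal(125, 20, 4000).clip(0, 255)

rho_full = bhattacharyya(hist_from_samples(target), hist_from_samples(candidate))
rho_frac = bhattacharyya(hist_from_samples(target, frac=0.1),
                         hist_from_samples(candidate, frac=0.1))
# The subsampled estimate stays close to the full-histogram value while
# touching an order of magnitude fewer pixels.
```

Since every MS iteration re-evaluates these histograms for every particle and feature, the savings from the subsample multiply across the whole framework.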

Pseudo Code of AMF-MSPF
At time k, execute the following steps:
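The individual steps of the original listing were lost in extraction; based on the description in the preceding sections, one plausible reconstruction (the step order and the degeneracy test are assumptions) is:

```text
At time k:
1. Propagate every particle s_k^i through the dynamic state model
   (Equations (16) and (17)).
2. If re-sampling occurred at time k-1, run the ranking module,
   F_N = featureRank(F), keep the top N_F features, rebuild the
   likelihood images, and update the target model.
3. For each particle and each of the N_F features, run the
   reduced-sample MS optimization to a local mode; merge the N_F
   refined locations per particle by their median to obtain the
   new state of the particle.
4. Weight each particle by the similarity between the target and
   the candidate at its new state (Equation (9)).
5. Output the state estimate from the weighted particles, and
   re-sample when the particle set degenerates (e.g., a low
   effective sample size).
```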

Experimental Results
This section evaluates the proposed AMF-MSPF framework and compares it with the TSHT of [27] and the PFKBOT of [24]; these methods were selected because they also combine multiple features with an adaptive model-updating framework. The experiments were conducted using an Intel Core i5 2.60 GHz with 4 GB RAM. To evaluate AMF-MSPF, we applied it to video sequences in which the background changes introduced by the moving object make the experiments very difficult. The accuracy and robustness were evaluated on the WalkByShop1cor, CAVIAR, and PETS video sequences; the characteristics of these sequences are summarized next.
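The reported metrics can be computed as follows; the FAR convention and the toy center points are assumptions for illustration (the paper's exact definitions may differ):

```python
import numpy as np

def rmse(pred, gt):
    """Root mean square error between predicted and ground-truth centers (pixels)."""
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    return float(np.sqrt(np.mean(np.sum((pred - gt) ** 2, axis=1))))

def f_score(tp, fp, fn):
    """F-score from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def far(fp, n_detections):
    """False alarm rate as false positives over total detections
    (one common convention; the exact definition may differ)."""
    return fp / n_detections

pred = [(10, 10), (20, 21), (30, 29)]   # tracked centers, frame by frame
gt = [(10, 10), (20, 20), (30, 30)]     # ground-truth centers
err = rmse(pred, gt)                    # sqrt(2/3) ≈ 0.816 pixels
score = f_score(8, 2, 2)                # precision = recall = 0.8 → 0.8
```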
; the ranking module that produces N

MS Optimization Procedure
The MS procedure is a non-parametric gradient descent method used to move the particles' state $s^i_{k+1}$ to a new state $\hat{s}^i_{k+1}$ that is more precise and accurate. MS is initialized in each of the $N_F$ top features and applied to all the particles, which essentially gives as many new locations for each particle. These $N_F$ locations are merged, per particle, by taking their median. Evaluating the MS procedure for every particle using multiple features is apparently an intensive computation; however, in the proposed MS optimization technique the complexity is reduced by an order of magnitude. A weight is estimated for each pixel of the object, which is first segmented using the normalized-cut algorithm of Shi and Malik [31], and only a fraction of the samples from the reference and candidate regions is picked according to these estimates.

Essentially, the proposed MS operates only on a fraction of the samples, and this improvement reduces its complexity by an order of magnitude. This subsection describes our proposed innovation in the MS optimization method. The heart of any MS procedure is the maximization of the Bhattacharyya coefficient, given by Equation (6):

$$\rho[p(y), q] = \sum_{u=1}^{m} \sqrt{p_u(y)\, q_u} \qquad (6)$$

The Bhattacharyya coefficient $\rho$ is maximized over all $N_F$ features. $q_u$ and $p_u$ are $m$-bin discrete color histograms of the target and the candidates, respectively. These bins are simply a series of small intervals that divide the whole range into smaller ones to reduce the computational load. The densities are given by Equations (7) and (8):

$$q_u = C \sum_{i=1}^{N_h} k\big(\|x_i^*\|^2\big)\, \delta[b(x_i^*) - u] \qquad (7)$$

$$p_u(y) = C_h \sum_{i=1}^{N_h} k\left(\left\|\frac{y - x_i}{h}\right\|^2\right) \delta[b(x_i) - u] \qquad (8)$$

where $\{x_i^*\}_{i=1,\dots,N_h}$ and $\{x_i\}_{i=1,\dots,N_h}$ are the pixels of the target and the candidates, respectively; $\delta$ is the delta function, equal to 1 only at the particular bin $u$ and 0 otherwise; and $N_h$ represents a fraction of random samples inside the target and candidate regions. To represent $q_u$ and $p_u$ as densities, we multiply by coefficients $C$ and $C_h$ that restrict the summations $\sum_{u=1}^{m} q_u$ and $\sum_{u=1}^{m} p_u$ to 1. $k$ is a monotonically decreasing convex kernel. Traditionally, the Bhattacharyya coefficient is maximized using densities that are spatially weighted by an Epanechnikov kernel; however, such multivariate kernels are not good at dealing with non-rigid objects. We instead employ a monotonically decreasing function to select a fraction of samples, picked from the object and its immediate surroundings, for spatial weighting. By using only a fraction of the samples, the computational load is reduced considerably. Equation (9) computes the similarity measure based on the distance between the target and the candidates using this fraction of samples:

$$d(y) = \sqrt{1 - \rho[p(y), q]} \qquad (9)$$

The MS procedure maximizes Equation (6), which consequently minimizes Equation (9). The MS optimization is an iterative process, initialized in the previous frame with the target position $y_0$. The new location $y_1$ is evaluated based on the convergence of Equation (6). By expanding the Taylor series around the coefficient $p_u(y_0)$, the linear approximation of Equation (6) is obtained after some manipulation [6][7][8]:

$$\rho[p(y), q] \approx \frac{1}{2} \sum_{u=1}^{m} \sqrt{p_u(y_0)\, q_u} + \frac{1}{2} \sum_{u=1}^{m} p_u(y) \sqrt{\frac{q_u}{p_u(y_0)}} \qquad (10)$$
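As a rough illustration, Equations (6)–(8) can be sketched in NumPy. The function names and the simple intensity-to-bin mapping below are our own assumptions for the sketch, not the authors' implementation:

```python
import numpy as np

def weighted_histogram(pixels, weights, m=16):
    """m-bin kernel-weighted histogram (sketch of Eqs. (7)-(8)).

    `pixels` holds intensity values in [0, 1) for the fraction of N_h
    random samples drawn from the region; `weights` are the kernel
    values k(.) of those samples. The delta term delta[b(x_i) - u] is
    realized by assigning each sample to its bin index b(x_i).
    """
    bins = np.minimum((np.asarray(pixels) * m).astype(int), m - 1)  # b(x_i)
    hist = np.zeros(m)
    np.add.at(hist, bins, weights)   # accumulate k(.) per bin
    return hist / hist.sum()         # coefficient C: make the bins sum to 1

def bhattacharyya(p, q):
    """Similarity of candidate histogram p and target histogram q (Eq. (6))."""
    return float(np.sum(np.sqrt(p * q)))
```

For identical target and candidate histograms the coefficient evaluates to 1, its maximum, which is exactly the condition the MS iteration drives toward.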

By substituting Equation (8) in Equation (10), we obtain

$$\rho[p(y), q] \approx \frac{1}{2} \sum_{u=1}^{m} \sqrt{p_u(y_0)\, q_u} + \frac{C_h}{2} \sum_{i=1}^{N_h} w_i\, k\left(\left\|\frac{y - x_i}{h}\right\|^2\right) \qquad (11)$$

where $w_i = \sum_{u=1}^{m} \sqrt{q_u / p_u(y_0)}\, \delta[b(x_i) - u]$. The first term of Equation (11) is actually the previous location of the target, which does not depend on the new coordinate $y$; therefore, we maximize only the second term of the equation to get the new target location $y_1$:

$$y_1 = \frac{\sum_{i=1}^{N_h} x_i\, w_i\, g\!\left(\left\|\frac{y_0 - x_i}{h}\right\|^2\right)}{\sum_{i=1}^{N_h} w_i\, g\!\left(\left\|\frac{y_0 - x_i}{h}\right\|^2\right)} \qquad (12)$$
where $g(x) = -k'(x)$ is constant for the kernel profile, which reduces Equation (12) to a simple weighted distance average as in Equation (13):

$$y_1 = \frac{\sum_{i=1}^{N_h} x_i\, w_i}{\sum_{i=1}^{N_h} w_i} \qquad (13)$$
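The weights $w_i$ and the update of Equation (13) amount to a per-sample table lookup followed by a weighted average. A minimal sketch, with names and interfaces of our own choosing:

```python
import numpy as np

def ms_weights(sample_bins, q, p_y0):
    """w_i = sum_u sqrt(q_u / p_u(y0)) * delta[b(x_i) - u].

    `sample_bins` holds the bin index b(x_i) of each candidate sample,
    so the delta function reduces to indexing the ratio table.
    """
    ratio = np.sqrt(q / np.maximum(p_y0, 1e-12))  # guard against empty bins
    return ratio[sample_bins]

def ms_step(sample_xy, w):
    """New location y1 as the weighted average of Eq. (13)."""
    return (sample_xy * w[:, None]).sum(axis=0) / w.sum()
```

When target and candidate histograms agree, all weights are 1 and $y_1$ is simply the centroid of the sampled pixels; mismatched bins pull $y_1$ toward samples whose colors are under-represented in the candidate.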
We then update $p(y_1)$ and evaluate $\rho[p(y_1), q]$, iterating until $\|y_1 - y_0\| < \varepsilon$ for every feature; $\varepsilon$ is usually 1 pixel. The particles, moving along the gradient direction of the MS vector, converge to the local extreme of a probability density function, and that determines the best location, i.e., where the Bhattacharyya coefficient is the highest. The MS procedure is applied to every particle over all the top $N_F$ features, as given in Equation (14).
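Putting the pieces together, the per-feature iteration with the $\varepsilon$-convergence test and the median merge across the top features might look as follows. This is a sketch under assumed interfaces, not the authors' code: each `shift` callable stands for one feature's Equation-(13) update applied at the current position.

```python
import numpy as np

def ms_track_particle(y0, feature_shifts, eps=1.0, max_iter=20):
    """Run the MS iteration independently in each top feature, then
    merge the N_F converged locations with the per-particle median.

    `feature_shifts` is a list of callables, one per top feature;
    each maps a current position to the Eq.-(13) location
    (hypothetical interface, for illustration only).
    """
    locations = []
    for shift in feature_shifts:
        y = np.asarray(y0, dtype=float)
        for _ in range(max_iter):
            y_new = shift(y)
            if np.linalg.norm(y_new - y) < eps:  # eps is usually 1 pixel
                y = y_new
                break
            y = y_new
        locations.append(y)
    # median over the N_F per-feature locations (the merge step)
    return np.median(np.stack(locations), axis=0)
```

The median merge makes the fused location robust to a single feature converging to the wrong mode, which is the motivation stated above for running MS per feature rather than on a joint density.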

MS Optimization Procedure
The MS procedure is a non-parametric gradient descent method used to move the particles' state, to their new state ŝ that is more precise and accurate. MS is initialized in each ₣ features and applied to all the particles that essentially gives us as many new locations for each particle. These ₣ locations are merged, over each particle, using median over them. Evaluating the MS procedure for every particle using multi-features is apparently an intensive computation. However, in the proposed MS optimization technique, the complexity is reduced by an order of magnitude. The weights are estimated for each pixel of the object that is first segmented using normalized cut algorithm based on J. Shi and J. Malik [31]. We pick only a fraction of samples from the reference and candidate regions according to these estimates.
Essentially, the proposed MS operates only on a fraction of samples, and this improvement reduces its complexity by an order of magnitude. The heart of any MS procedure is the maximization of the Bhattacharyya coefficient given by Equation (6):

ρ(p̂(y), q̂) = Σ_{u=1..m} √( p̂_u(y) q̂_u )    (6)

The Bhattacharyya coefficient ρ is maximized over all N_F features, where q̂ = {q̂_u} and p̂(y) = {p̂_u(y)} are the m-bin discrete color histograms of the target and the candidates respectively. These bins are simply a series of small intervals that divide the whole feature range into smaller ones to reduce the computational load. The densities are given by Equations (7) and (8):

q̂_u = C Σ_{i=1..n} k(‖x*_i‖²) δ[b(x*_i) − u]    (7)

p̂_u(y) = C Σ_{i=1..n} k(‖(y − x_i)/h‖²) δ[b(x_i) − u]    (8)

where x*_i, i = 1, …, n and x_i, i = 1, …, n are the sampled pixels of the target and the candidates respectively, and δ[·] is the delta function, equal to 1 only at the particular bin u and 0 otherwise. Here n represents a fraction of random samples taken inside the target and candidate regions. In order to represent q̂ and p̂ as densities, we multiply by a coefficient C that restricts the summations Σ_u q̂_u and Σ_u p̂_u(y) to 1, and k is a monotonically decreasing convex kernel profile. Traditionally the Bhattacharyya coefficient is maximized using densities that are spatially weighted by an Epanechnikov kernel. However, such multivariate kernels are not good at dealing with non-rigid objects. We instead employ a monotonically decreasing function over a fraction of samples picked from the object and its immediate surroundings for spatial weighting; by using only a fraction of samples, the computational load is reduced considerably. Equation (9) computes the similarity measure based on the distance between the target and the candidates using a fraction of samples from them:

d(y) = √( 1 − ρ(p̂(y), q̂) )    (9)
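The kernel-weighted histograms and the Bhattacharyya coefficient can be sketched as below. The profile k(r) = max(1 − r, 0) is an illustrative monotonically decreasing convex choice, not necessarily the paper's, and all names are hypothetical.

```python
import numpy as np

def kernel_histogram(pixels, values, center, h, m=16):
    """m-bin kernel-weighted histogram in the spirit of Equations (7)-(8),
    computed over a (sub)sample of pixels. pixels: (n, 2) coordinates,
    values: (n,) feature values in [0, 1), center: region centre, h: bandwidth."""
    r = np.sum(((pixels - center) / h) ** 2, axis=1)
    k = np.clip(1.0 - r, 0.0, None)                 # spatial weight per sample
    bins = np.minimum((values * m).astype(int), m - 1)
    hist = np.bincount(bins, weights=k, minlength=m)
    return hist / hist.sum()                        # C normalises the sum to 1

def bhattacharyya(p_hat, q_hat):
    """Equation (6): similarity between candidate and target histograms."""
    return float(np.sum(np.sqrt(p_hat * q_hat)))
```

Identical histograms give ρ = 1, disjoint ones give ρ = 0, and the distance of Equation (9) follows as √(1 − ρ).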

represents a fraction of andom samples inside the target and candidate. In order to represent , as densities, we ultiply by a coefficient C to restrict the summations ∑ and ∑ to 1. k is a monotonic ecreasing convex kernel. Traditionally Bhattacharryya coefficient is maximized using densities that estimate locations, for each N s particles, that are combined to form new estimates using Equation (15 new N ₣ locations for each particle. Symbolically Equations (1)-(5) is represented as ∶ ƒeatureƦnk(₣); the ranking module that produces N ₣ top features. The MS optimization proce is explained in the next subsection.

MS Optimization Procedure
The MS procedure is a non-parametric gradient descent method used to move the particles' to their new state ŝ that is more precise and accurate. MS is initialized in each ₣ fea and applied to all the particles that essentially gives us as many new locations for each particle. T ₣ locations are merged, over each particle, using median over them. Evaluating the MS proce for every particle using multi-features is apparently an intensive computation. However, i proposed MS optimization technique, the complexity is reduced by an order of magnitude weights are estimated for each pixel of the object that is first segmented using normalized algorithm based on J. Shi and J. Malik [31]. We pick only a fraction of samples from the referenc candidate regions according to these estimates.
Essentially, the proposed MS operates only on a fraction of samples. This improvement red its complexity by an order of magnitude. This subsection describes our proposed innov introduced in the MS optimization method. The heart of any MS procedure is the maximizati the Bhattacharryya coefficient that is given by Equation (6):  are m discrete color histograms of the target and candidates respectively. These bins are simply a ser small intervals that divides the whole range into smaller ones to reduce computational load. T densities are given from Equation (7) through Equation (8): where, x * } … and x } … are the pixels of the target and candidates respectively. i delta function equal to 1 only at the particular bin u and 0 otherwise. represents a fracti random samples inside the target and candidate. In order to represent , as densities (15) s k+1 are the new estimate of the particles that are now closer to their local maxima than s k+1 . Figure 1 depicts the working of our framework. The top N new N ₣ locations for each particle. Symbolically Equations (1)-(5) is represented as ∶ ₣ = ƒeatureƦnk(₣); the ranking module that produces N ₣ top features. The MS optimization procedure is explained in the next subsection.

MS Optimization Procedure
The MS procedure is a non-parametric gradient descent method used to move the particles' state, to their new state ŝ that is more precise and accurate. MS is initialized in each ₣ features and applied to all the particles that essentially gives us as many new locations for each particle. These ₣ locations are merged, over each particle, using median over them. Evaluating the MS procedure for every particle using multi-features is apparently an intensive computation. However, in the proposed MS optimization technique, the complexity is reduced by an order of magnitude. The weights are estimated for each pixel of the object that is first segmented using normalized cut algorithm based on J. Shi and J. Malik [31]. We pick only a fraction of samples from the reference and candidate regions according to these estimates.
Essentially, the proposed MS operates only on a fraction of samples. This improvement reduces its complexity by an order of magnitude. This subsection describes our proposed innovation introduced in the MS optimization method. The heart of any MS procedure is the maximization of the Bhattacharryya coefficient that is given by Equation (6): The Bhattacharryya coefficient, ρ : ₣ , is maximized over all ₣ features.
and are m-bin discrete color histograms of the target and candidates respectively. These bins are simply a series of small intervals that divides the whole range into smaller ones to reduce computational load. These densities are given from Equation (7) through Equation (8): where, x * } and x } are the pixels of the target and candidates respectively. is the features along with the initialized particles serves as the input to the MS modules. This gives us new probable locations of the object for all the selected features. Or in other words, the particles after processing through MS, gives rise to N Appl. Sci. 2018, 8, x FOR PEER REVIEW new N ₣ locations for each particle. Symbolically Equations ƒeatureƦnk(₣); the ranking module that produces N ₣ top feature is explained in the next subsection.

MS Optimization Procedure
The MS procedure is a non-parametric gradient descent metho to their new state ŝ that is more precise and accurate. M and applied to all the particles that essentially gives us as many ne ₣ locations are merged, over each particle, using median over th for every particle using multi-features is apparently an intensi proposed MS optimization technique, the complexity is reduce weights are estimated for each pixel of the object that is first algorithm based on J. Shi and J. Malik [31]. We pick only a fraction candidate regions according to these estimates.
Essentially, the proposed MS operates only on a fraction of sa its complexity by an order of magnitude. This subsection de introduced in the MS optimization method. The heart of any MS the Bhattacharryya coefficient that is given by Equation (6): The Bhattacharryya coefficient, ρ : ₣ , is maximized over all discrete color histograms of the target and candidates respectivel small intervals that divides the whole range into smaller ones to densities are given from Equation (7) through Equation (8) locations for each particle. The median over all the locations gives us the particles that are more accurate i.e., s k+1 . These new hypotheses are used to estimate the posterior density through Equation (20) in the next subsection.
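A minimal sketch of one mean-shift update and the per-particle median fusion follows. With the illustrative profile k(r) = 1 − r used earlier, g = −k′ = 1, so the update of Equation (14) reduces to a w_i-weighted mean over the samples inside the kernel support; the function names are hypothetical.

```python
import numpy as np

def meanshift_step(samples, weights, y0, h):
    """One Equation (14)-style update: move the candidate centre y0 to the
    weighted mean of the sampled pixel locations inside the kernel support."""
    r = np.sum(((samples - y0) / h) ** 2, axis=1)
    w = weights * (r < 1.0)                  # g(.) = 1 inside support, 0 outside
    return (samples * w[:, None]).sum(axis=0) / w.sum()

def fuse_locations(per_feature_locations):
    """Equation (15)-style fusion: the median over the N_F per-feature
    mean-shift outputs of one particle."""
    return np.median(np.asarray(per_feature_locations), axis=0)
```

Iterating `meanshift_step` until the displacement is small gives the converged location for one feature; `fuse_locations` then merges the N_F results for that particle.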

MS Embedded Particle Filter
The MSPF hybrid method is based on the Bayesian framework, which basically deals with the evolution of the object state. The belief is reinforced by the measurement process through a set of dynamic equations:

s_{k+1} = f_k(s_k, ν_k)    (16)
z_{k+1} = h_k(s_{k+1}, η_k)    (17)

where s_k and s_{k+1} are the object states at times k and k + 1, f_k(·) and h_k(·) are the dynamic equations for drawing particles and taking new measurements z_{k+1}, and ν_k and η_k represent the process and measurement noise respectively. The object state distribution is estimated based on all the previous measurements, which theoretically is

p(s_{k+1} | z_{1:k}) = ∫ p(s_{k+1} | s_k) p(s_k | z_{1:k}) ds_k    (18)

This prior is used in the prediction step when the measurement z_{k+1} becomes available. Bayes' rule recursively updates this prior as in Equation (19):

p(s_{k+1} | z_{1:k+1}) = p(z_{k+1} | s_{k+1}) p(s_{k+1} | z_{1:k}) / p(z_{k+1} | z_{1:k})    (19)

where p(z_{k+1} | s_{k+1}) is the likelihood distribution and p(z_{k+1} | z_{1:k}) = ∫ p(z_{k+1} | s_{k+1}) p(s_{k+1} | z_{1:k}) ds_{k+1} is the normalization factor. The posterior density function is approximated through a weighted summation over all particles, as in Equation (20):

p(s_{k+1} | z_{1:k+1}) ≈ Σ_{i=1..N_s} w^i_{k+1} δ(s_{k+1} − ŝ^i_{k+1})    (20)

where ŝ^i_{k+1} are the new particles obtained by processing s^i_{k+1} through the MS procedure using Equations (14) and (15). So now, effectively, the PF particle-validation process is replaced with the MS procedure by inserting ŝ_{k+1} into the posterior distribution function in Equation (20). The w^i_{k+1} are the weights associated with each particle and are calculated through Equation (21).
Usually the likelihood distribution is used to calculate the weights of the particles:

w^i_{k+1} ∝ p(z_{k+1} | ŝ^i_{k+1})    (21)

In order to get normalized weights, w^i_{k+1} is divided by the sum of all weights:

w̃^i_{k+1} = w^i_{k+1} / Σ_{j=1..N_s} w^j_{k+1}    (22)

The expectation is then used to approximate the posterior density of Equation (20):

E[s_{k+1}] ≈ Σ_{i=1..N_s} w̃^i_{k+1} ŝ^i_{k+1}    (23)

We should be very careful about the degeneracy phenomenon, in which a few particles assume the leading role in approximating the posterior. This leads to a situation where many particles cluster together, so the sample set contains repeated particles and a large percentage of particles carry insignificant, tiny weights. Consequently, the tracker loses the target as the particles drift towards one side. To detect this we calculate the effective sample size N_eff = 1 / Σ_{i=1..N_s} (w̃^i_{k+1})².
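The weighting, normalization, expectation, and effective-sample-size computations above can be sketched in a few lines; the function name is hypothetical.

```python
import numpy as np

def posterior_estimate(particles, likelihoods):
    """Normalise the likelihood weights (Equations (21)-(22)), form the
    expectation of Equation (23) over the MS-refined particles, and return
    the effective sample size used to detect degeneracy."""
    w = np.asarray(likelihoods, dtype=float)
    w = w / w.sum()                                            # Equation (22)
    state = (np.asarray(particles) * w[:, None]).sum(axis=0)   # Equation (23)
    n_eff = 1.0 / np.sum(w ** 2)                               # effective sample size
    return state, w, n_eff
```

With N_s equally weighted particles N_eff equals N_s; as the weight mass concentrates on a single particle N_eff drops towards 1, signalling degeneracy.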


If N_eff ≤ N_T, we redistribute the clusters of particles so that new particles are generated out of these clusters, and we re-initialize their weights to 1/N_s. N_T is the threshold that triggers re-sampling whenever it is greater than N_eff.

The idea behind resampling is to get rid of particles with tiny weights and to generate more particles from the ones with greater weights. As resampling can only execute after the weight calculation and normalization are done, it becomes a bottleneck in parallelizing the PF computations. Although re-sampling reduces the effects of degeneracy, it introduces a loss of diversity among the particles, also called impoverishment, because the resultant particles are clustered in close vicinity [21,32]. Since redundant particles are chosen from the same points, the tracker has a chance of losing the object. To mitigate this problem, the proposed framework executes the feature ranking module to update the model of the object alongside re-sampling. The next subsection presents the pseudo code of the proposed AMF-MSPF framework.
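The resampling step can be sketched as follows. Systematic resampling is one standard scheme; the paper's exact redistribution of particle clusters may differ, and the function name is hypothetical.

```python
import numpy as np

def systematic_resample(particles, weights, rng):
    """Replace low-weight particles with copies of high-weight ones and
    reset all weights to 1/N_s, as done when N_eff <= N_T."""
    n = len(weights)
    positions = (rng.random() + np.arange(n)) / n   # one stratified draw
    idx = np.searchsorted(np.cumsum(weights), positions)
    return particles[idx], np.full(n, 1.0 / n)
```

Because the draw is stratified, a particle holding most of the weight mass is duplicated in nearly every slot, which is exactly the diversity loss the feature ranking module is meant to offset.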

Pseudo Code of AMF-MSPF
At time k, execute the following steps: draw the particles through the dynamic model of Equation (16); rank the features and select the top N_F; run the MS optimization step (Equations (14) and (15)); perform the weight calculation and normalization step (Equations (21) and (22)); perform the posterior estimation step (Equation (23)); and re-sample, updating the object model, whenever N_eff ≤ N_T.
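The steps above can be wired together as in the sketch below. The callables passed in (`refiners`, `propagate`, `likelihood`) are hypothetical stand-ins for the paper's per-feature MS modules, dynamic model f_k, and likelihood p(z|s); this is a structural sketch, not the published implementation.

```python
import numpy as np

def amf_mspf_step(particles, refiners, propagate, likelihood, n_thresh, rng):
    """One hypothetical AMF-MSPF iteration assembled from the steps above."""
    particles = propagate(particles, rng)                   # draw via f_k, Eq (16)
    # MS optimisation: refine every particle in each of the N_F selected
    # features, then fuse the per-feature outputs with a median (Eqs (14)-(15))
    refined = np.median(np.stack([r(particles) for r in refiners]), axis=0)
    w = likelihood(refined)                                 # Eq (21)
    w = w / w.sum()                                         # Eq (22)
    estimate = (refined * w[:, None]).sum(axis=0)           # Eq (23)
    if 1.0 / np.sum(w ** 2) <= n_thresh:                    # N_eff vs N_T
        idx = rng.choice(len(w), size=len(w), p=w)          # re-sample
        refined, w = refined[idx], np.full(len(w), 1.0 / len(w))
    return refined, w, estimate
```

On a toy problem where each "feature refiner" pulls particles halfway towards a target, one iteration already moves the state estimate markedly closer to the target than the raw particle mean.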


Experimental Results
This section evaluates and compares the proposed AMF-MSPF framework with the TSHT of [27] and the PFKBOT of [24]. These reference approaches were selected because they employ multiple features and adaptive model-updating techniques similar, in some ways, to our multi-feature framework. The experiments have been conducted on frames of different sizes, and processing is done on an Intel Core i5 at 2.60 GHz with 4 GB of RAM. To illustrate the efficiency of the proposed AMF-MSPF, we applied it to video sequences containing full occlusions, abrupt intensity changes, and background clutter introduced by the moving object; these challenging sequences evidently make the experiments very difficult. The accuracy and robustness of the proposed AMF-MSPF are tested on the WalkByShop1cor, CAVIAR, and PETS video data sets. Table 2 highlights some of the important characteristics of these sequences. These data sets were selected because they have rich characteristics such as full occlusion, clutter, and abrupt illumination change, and the computer vision research community widely uses them to evaluate algorithms. The prime goal of the simulations is to track manually initialized regions of interest during long video sequences under background clutter and occlusions by intensity. The ground-truth locations of the objects are recorded manually for each sequence. Red, blue, green, and yellow depict the results of the ground truth, the proposed AMF-MSPF, PFKBOT, and TSHT respectively. For a quantitative analysis, the F-SCORE, false alarm rate (FAR), and root mean square error (RMSE) are evaluated for all the sequences under test, as shown in Table 3.
Precision (Ƥ) and recall (Ʀ) are given by Equations (26) and (27):

Ƥ = TP / (TP + FP)    (26)

Ʀ = TP / (TP + FN)    (27)

where TP, FP, FN, and TN are the numbers of true positives, false positives, false negatives, and true negatives, respectively. The F-SCORE becomes trivial to compute once Ƥ and Ʀ are known, and is given by Equation (28):

F-SCORE = 2 Ƥ Ʀ / (Ƥ + Ʀ)    (28)

Similarly, the FAR is calculated as the ratio between the number of negative events wrongly categorized as positive and the total number of actual negative events, using Equation (29):

FAR = FP / (TN + FP)    (29)
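Equations (26)–(29) translate directly into code. The sketch below uses toy counts for illustration; the numbers are not from the paper's experiments:

```python
def precision(tp, fp):
    # Equation (26): fraction of detections that are correct.
    return tp / (tp + fp)

def recall(tp, fn):
    # Equation (27): fraction of true objects that are detected.
    return tp / (tp + fn)

def f_score(p, r):
    # Equation (28): harmonic mean of precision and recall.
    return 2 * p * r / (p + r)

def far(fp, tn):
    # Equation (29): negatives wrongly flagged, over all actual negatives.
    return fp / (tn + fp)

# Illustrative confusion counts.
tp, fp, fn, tn = 90, 10, 10, 890
p, r = precision(tp, fp), recall(tp, fn)
assert abs(p - 0.9) < 1e-9 and abs(r - 0.9) < 1e-9
assert abs(f_score(p, r) - 0.9) < 1e-9
assert abs(far(fp, tn) - 10 / 900) < 1e-9
```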

Visual Tracking Result
We start with the WalkByShop1cor dataset, a long video sequence exhibiting several constraints: it contains multiple full occlusions in addition to similar object appearance in frames 350-410 and 870-930. The results of AMF-MSPF are close to the ground truth and outperform the TSHT and PFKBOT methods, as shown in Figure 2. PFKBOT fails on encountering instances where the background is similar to the object; the error accumulates until PFKBOT loses the target, because false positives stray the particles toward a false mode. In TSHT, however, the object recovers from the occasional distraction by the background clutter, because the TSHT method takes into account the spatiotemporal aspects of the object, which leads the particles towards the more likely modes. Under mild but regular intensity change, TSHT is quite comparable to the proposed AMF-MSPF on the WalkByShop1cor dataset. Figure 2 highlights the visual tracking results.
Since the results of AMF-MSPF and the reference methods are in close proximity, the RMSE metric is evaluated for a closer look. In Table 3, the RMSE of AMF-MSPF, TSHT, and PFKBOT for the WalkByShop1cor dataset is 5.76, 7.76, and 30.70, respectively. Consequently, the FAR of the proposed method is the lowest (0.13, versus 0.24 and 0.35 for TSHT and PFKBOT) owing to very few false positives. Moreover, the F-SCORE of the proposed method is 0.93, compared with 0.86 and 0.79 for TSHT and PFKBOT.

In the next experiments, we consider two more sequences from the Browse4.mpg dataset. In these sequences, the object moves through severely intensity-occluded areas. Abrupt intensity changes occur in multiple instances: frames 230-410, frames 880-920, and frames 1005-1050. The experimental results show that, under abrupt intensity change, AMF-MSPF is robust and tracks the object, as highlighted in Figures 3 and 4. In Table 3, the RMSE of AMF-MSPF, TSHT, and PFKBOT is 9.18, 41.30, and 55.80, respectively. The low RMSE indicates a low FAR and, consequently, a higher F-SCORE, which is 0.91 for the proposed method.

In the final experiment, AMF-MSPF is tested on the PETS 2007 dataset, which exhibits continuous background change due to occlusion by intensity as well as by other objects. The results of AMF-MSPF are convincing compared with the TSHT and PFKBOT methods, as shown in Figure 5.

In contrast to TSHT and PFKBOT, the proposed AMF-MSPF framework selects a new set of ₣ features that are used to initialize the MS procedure for every particle. These features are selected by the ranking module, which is triggered whenever the re-sampling step occurs due to the degeneration problem. Compared with the other methods, ours is more accurate because it re-initializes the object model based on local features. Tables 3 and 4 provide a quantitative analysis and highlight some of the prominent characteristics of our method.
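The RMSE figures above are pixel distances between ground-truth and tracked object centers. A minimal sketch of that computation, with hypothetical toy trajectories:

```python
import math

def rmse(ground_truth, tracked):
    """Root mean square pixel error between ground-truth and tracked centers."""
    squared = [(gx - tx) ** 2 + (gy - ty) ** 2
               for (gx, gy), (tx, ty) in zip(ground_truth, tracked)]
    return math.sqrt(sum(squared) / len(squared))

# Toy trajectories: the tracker is offset by (3, 4) pixels in every frame,
# so the per-frame error is 5 pixels and the RMSE is exactly 5.
gt = [(10, 10), (20, 15), (30, 20)]
tr = [(13, 14), (23, 19), (33, 24)]
assert abs(rmse(gt, tr) - 5.0) < 1e-9
```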

Computational Complexity
After evaluating the proposed AMF-MSPF on several challenging video datasets, we obtained improved results in terms of robustness and accuracy. However, the MSPF hybrid methodology implicitly carries the additional cost of incorporating the MS optimization into the already complex PF algorithm. One innovation of the proposed research is the simplification of the MS optimization. The simplification comes from the observation that only a fraction of the samples is required by the MS optimization procedure. The computational efficiency of the proposed MS was evaluated on a sequence extracted from the Browse4 dataset. As can be seen in Figure 6, the error introduced by dropping more than 75% of the samples is negligible, and our simplified MS is able to track the object successfully in real time, even under a Matlab implementation. This has led to a huge reduction in the overall computational load.
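The fraction-of-samples idea can be illustrated on the histogram similarity at the heart of MS. The sketch below (helper names are illustrative, not the paper's implementation) builds an intensity histogram from only 25% of a region's pixels and shows that its Bhattacharyya similarity to the full histogram remains close to 1:

```python
import random

def histogram(pixels, bins=8, max_val=256):
    """Normalized intensity histogram over the given pixel samples."""
    h = [0.0] * bins
    for v in pixels:
        h[v * bins // max_val] += 1.0
    total = sum(h)
    return [count / total for count in h]

def bhattacharyya(p, q):
    """Similarity between two normalized histograms (1.0 means identical)."""
    return sum((pu * qu) ** 0.5 for pu, qu in zip(p, q))

random.seed(0)
region = [random.randrange(256) for _ in range(4000)]  # synthetic region pixels
full = histogram(region)
# Keep only 25% of the samples, mimicking the dropped-samples experiment.
subset = random.sample(region, len(region) // 4)
approx = histogram(subset)
# The similarity computed from the fraction stays very close to the full one.
assert bhattacharyya(full, approx) > 0.99
```

Since histogram construction dominates the per-iteration cost of MS, computing it over a fixed fraction of samples cuts the work proportionally while barely perturbing the similarity surface being climbed.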

Computational Complexity
After evaluating the proposed AMF-MSPF on several challenging video datasets, we obtained improved results in term of robustness and accuracy. However, the MSPF hybrid methodology implicitly has an additional cost of incorporating the MS optimization into the already complex PF algorithm. One innovation of the proposed research is the simplification of the MS optimization. The simplification comes through an observation, that only a fraction of samples was required by the MS optimization procedure. The computational efficiency of the proposed MS was evaluated on a sequence extracted from the Browse4 dataset. As can be seen in Figure 6, the error introduced, due to dropping more than 75% of the samples, is negligible. And our simplified MS is able to track the object successfully in real-time, even under a Matlab implementation. This has led to a huge

Computational Complexity
[Figure panels: Full Occlusion, Clutter Background, Intensity Change]

Computational Complexity
After evaluating the proposed AMF-MSPF on several challenging video datasets, we obtained improved results in terms of robustness and accuracy. However, the MSPF hybrid methodology implicitly carries the additional cost of incorporating the MS optimization into the already complex PF algorithm. One innovation of the proposed research is therefore the simplification of the MS optimization.
The simplification comes from the observation that only a fraction of the samples is required by the MS optimization procedure. The computational efficiency of the proposed MS was evaluated on a sequence extracted from the Browse4 dataset. As can be seen in Figure 6, the error introduced by dropping more than 75% of the samples is negligible, and our simplified MS is able to track the object successfully in real time, even under a Matlab implementation. This leads to a huge computational reduction in the MS method without significantly compromising its accuracy and, as a consequence, frees resources for the feature ranking module. Thereby, the proposed AMF-MSPF framework implements the feature ranking on top of the MSPF methodology without aggravating its computational complexity. The complexity of the proposed framework is considerably reduced compared with TSHT and PFKBOT, because those implementations run MS over all the samples of the object window. The AMF-MSPF framework can process 10-15 frames per second, which is perceived as fluid motion by humans, while none of the reference methods is able to run in real time.
Let us derive the overall computational cost of the proposed AMF-MSPF framework. Let I_ter be the total number of iterations for each MS procedure (in our case I_ter = 5), T_p_gen_wt be the execution time required for sample generation and the associated weights, and T_re-sampling, T_ms, and T_rnk be the execution times for the re-sampling, the MS procedure, and the ranking module, respectively. Then the execution time of the AMF-MSPF framework is given in Equation (29):
T_amf-mspf = T_p_gen_wt + T_ms + T_re-sampling + T_rnk (29)
Although MS is initialized in multiple features over all particles, down-sampling the number of pixels reduces its execution time by an order of magnitude. Thus, with the reduced pixel count N_h, the complexity of our proposed MS is O(m * N_₣ * N_h), where N_₣ is the number of selected features, which has no significant impact on Equation (29). This saves enormous computational power, which is utilized by the ranking module.
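As an illustration only, the per-frame budget of Equation (29) can be sketched as follows. The component timings below are hypothetical placeholders, not measurements from the paper; the point is that shrinking the MS term in proportion to the retained samples is what brings the total into the 10-15 fps range.

```python
# Illustrative per-frame time budget for AMF-MSPF, Equation (29).
# All timings are hypothetical placeholders in seconds, not measured values.

def frame_time(t_p_gen_wt, t_ms, t_re_sampling, t_rnk):
    """T_amf-mspf = T_p_gen_wt + T_ms + T_re-sampling + T_rnk."""
    return t_p_gen_wt + t_ms + t_re_sampling + t_rnk

# Assume keeping 25% of the window's pixels cuts the MS term roughly
# in proportion to the retained samples.
t_ms_full = 0.200            # hypothetical MS time over the full window
t_ms_reduced = t_ms_full * 0.25

t = frame_time(t_p_gen_wt=0.020, t_ms=t_ms_reduced,
               t_re_sampling=0.005, t_rnk=0.010)
print(f"frame time: {t:.3f} s -> {1.0 / t:.1f} fps")
```

With these placeholder numbers the MS term dominates the budget, so the order-of-magnitude reduction in T_ms is what makes room for T_rnk.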
Since the MS optimization moves each particle iteratively until convergence, it essentially yields as many new locations as there are particles; these are combined, over all particles, by taking the median over them. Evaluating the MS procedure using multiple features is apparently an intensive computation. However, in the proposed optimization technique, the complexity is reduced by an order of magnitude: weights are estimated for each pixel of the object, which is first segmented using the normalized cut algorithm of J. Shi and J. Malik [31], and only a fraction of the samples is picked from the reference and candidate regions according to these estimates. Essentially, the proposed MS operates only on a fraction of the samples.
The heart of any MS procedure is the maximization of the Bhattacharyya coefficient given by Equation (6):
ρ_1:₣ ≡ ρ[p(y), q] = Σ_u √(p_u(y) q_u) (6)
The Bhattacharyya coefficient ρ_1:₣ is maximized over all ₣ features. q and p(y) are the m-bin discrete color histograms of the target and the candidates, respectively; these bins are simply a series of small intervals that divide the whole range into smaller ones to reduce the computational load. The densities are given in Equations (7) and (8), where {x*_i}, i = 1, ..., N_h, and {x_i}, i = 1, ..., N_h, are the pixels of the target and the candidates, respectively, δ is the delta function equal to 1 only at the particular bin u and 0 otherwise, and N_h represents a fraction of random samples inside the target and candidate windows. In order to represent p and q as densities, we multiply by a coefficient C that restricts the summations Σ_u p_u and Σ_u q_u to 1, and k is a monotonically decreasing convex kernel. Traditionally, the Bhattacharyya coefficient is maximized using densities that are spatially weighted by an Epanechnikov kernel; however, such multivariate kernels are not good at dealing with non-rigid objects. We instead employ a monotonically decreasing function to select a fraction of samples, picked from the object and its immediate surroundings, for spatial weighting. By using only this fraction of the samples, the computational load is reduced considerably. Equation (9) computes the similarity measure based on the distance between the target and the candidates using a fraction of samples from them:
d(y) = √(1 − ρ[p(y), q]) (9)
The MS procedure maximizes Equation (6), which consequently minimizes Equation (9). The MS optimization is an iterative process initialized in the previous frame at the target position y_0; the new location y_1 is evaluated based on the convergence of Equation (6). By expanding the Taylor series around the coefficient ρ[p(y_0), q], a linear approximation of Equation (6) is obtained after some manipulation [6][7][8].
Since the MS optimization is applied multiple times to every particle until convergence, it moves the particles to new locations where they no longer conform to the posterior distribution. This is known as the degeneracy problem, in which a small number of particles dominate the weight race and the estimates are therefore heavily influenced by them. Consequently, the re-sampling stage is introduced to mitigate the degeneracy problem by replicating particles with larger weights and removing particles with negligible ones. This, however, constrains the system so that the re-sampling step must be carried out after the particle weighting and normalization steps, which makes it a bottleneck in parallelizing the PF computations. The complexity of the re-sampling module is equal to that of the particle generation and weighting step, i.e., O(N_s).
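As a sketch of the degeneracy mitigation described above, a standard systematic re-sampling pass replicates high-weight particles and drops negligible ones in O(N_s). This is our own illustrative implementation of textbook systematic re-sampling, not code from the paper:

```python
import random

def systematic_resample(particles, weights):
    """Systematic re-sampling in O(N_s): particles with large weights
    are replicated, particles with negligible weights are dropped."""
    n = len(particles)
    total = sum(weights)
    cum, c = [], 0.0
    for w in weights:            # cumulative normalized weights
        c += w / total
        cum.append(c)
    u0 = random.random() / n     # single random offset
    resampled, j = [], 0
    for i in range(n):
        u = u0 + i / n           # evenly spaced pointers into the CDF
        while j < n - 1 and cum[j] < u:
            j += 1
        resampled.append(particles[j])
    return resampled

# Example: the dominant particle gets replicated, negligible ones vanish.
parts = ["a", "b", "c", "d"]
wts = [0.70, 0.15, 0.10, 0.05]
print(systematic_resample(parts, wts))
```

Because the pass needs the normalized cumulative weights, it can only run after the weighting and normalization steps, which is exactly the parallelization bottleneck noted above.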
The ranking module is quadratic in the number of samples from the background and foreground, i.e., O(₣ * fg(j) * bg(j)); nevertheless, it is only triggered when the re-sampling step is required.
Since the re-sampling step is not carried out at every frame, most of the time the tracking is performed using only the novel optimized MSPF. Asymptotically speaking, this makes the overall complexity of our proposed AMF-MSPF framework equal to O(n) with high probability.
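The reduced-sample similarity computation at the core of the framework, Equations (6)-(9), can be sketched as follows. This is an illustrative reimplementation under our own simplifying assumptions (grayscale pixel intensities, uniform random subsampling to N_h), not the authors' code:

```python
import math
import random

def hist(pixels, m=16):
    """m-bin normalized intensity histogram (cf. Equations (7)-(8)):
    dividing by the sample count restricts the bin sum to 1."""
    h = [0.0] * m
    for x in pixels:
        h[min(int(x * m / 256), m - 1)] += 1.0
    return [v / len(pixels) for v in h]

def bhattacharyya(p, q):
    """Bhattacharyya coefficient rho (Equation (6))."""
    return sum(math.sqrt(pu * qu) for pu, qu in zip(p, q))

def distance(p, q):
    """Similarity distance d = sqrt(1 - rho) (Equation (9))."""
    return math.sqrt(max(0.0, 1.0 - bhattacharyya(p, q)))

# Keep only a fraction N_h of each window's pixels (here 25%),
# as in the proposed reduced-sample MS.
random.seed(1)
target = [random.randint(80, 120) for _ in range(400)]
candidate = [random.randint(85, 125) for _ in range(400)]
n_h = len(target) // 4
q = hist(random.sample(target, n_h))
p = hist(random.sample(candidate, n_h))
print(f"rho = {bhattacharyya(p, q):.3f}, d = {distance(p, q):.3f}")
```

Every histogram, coefficient, and distance evaluation inside each MS iteration then touches N_h samples instead of the full window, which is where the order-of-magnitude reduction comes from.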

Summary
In summary, this research work improves the field of robust object tracking in two novel ways. First, it develops an adaptive multi-feature framework on top of the MSPF methodology. A ranking module ranks the given feature space by maximizing the inter-class variance between the background and foreground; a high variance score signifies a better chance of target discrimination, and the top features are therefore selected for updating the target model. The likelihoods corresponding to these features are continuously used to form new likelihood images that dynamically initialize the MS, enabling tracking in the local context. Second, the computational cost of the MS optimization method of the proposed AMF-MSPF framework is reduced, which enables tracking in real time, unlike other methods in the hybrid tracking category that can only process a few frames.
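The variance-based ranking step can be sketched as below. The feature set, sample values, and the exact variance-ratio score are our own illustrative assumptions, not the paper's precise criterion; the sketch only shows the principle that features separating foreground from background score highest.

```python
def variance_ratio(fg_vals, bg_vals):
    """Inter-class variance score: large between-class spread relative
    to within-class spread means better target discrimination."""
    def mean(v):
        return sum(v) / len(v)
    def var(v):
        mu = mean(v)
        return sum((x - mu) ** 2 for x in v) / len(v)
    between = (mean(fg_vals) - mean(bg_vals)) ** 2
    within = var(fg_vals) + var(bg_vals) + 1e-12  # guard divide-by-zero
    return between / within

def rank_features(features):
    """features: {name: (fg_samples, bg_samples)} -> names, best first."""
    return sorted(features, key=lambda f: variance_ratio(*features[f]),
                  reverse=True)

# Hypothetical responses of three linear RGB combinations.
feats = {
    "R-G": ([0.9, 0.8, 0.85], [0.1, 0.2, 0.15]),   # well separated
    "R+G": ([0.5, 0.6, 0.55], [0.45, 0.5, 0.55]),  # overlapping
    "B":   ([0.7, 0.6, 0.65], [0.3, 0.4, 0.35]),
}
print(rank_features(feats))   # best-discriminating feature first
```

The top-ranked features would then drive the likelihood images that re-initialize the MS in the local context.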
The accuracy and robustness of the proposed AMF-MSPF were tested on the WalkByShop1cor, CAVIAR, and PETS video datasets. These sequences are known for challenging constraints that make the experiments very difficult. The AMF-MSPF framework was found to provide improved tracking performance compared with other conventional methods based on the hybrid object tracking methodology. The experimental results demonstrate successful visual tracking even under extreme intensity variations and full occlusion. The average F-SCORE (over all the video sets) of the AMF-MSPF is 0.92, as compared to 0.79 and 0.72 for TSHT and PFKBOT, respectively.
The proposed AMF-MSPF implements a ranking module on top of the MSPF methodology to give it an additional layer of capability. Consequently, this has increased the performance of the MSPF, which now takes into account any rapid movement of the object that changes its background. However, the improvement is tied to the assumption that the target reference model, updated at the sampling step, deviates significantly from its instance at initialization. This assumption needs to be carefully studied, and there is a need to research a more robust dynamic technique that switches the required features for the target update. In future research, we also plan to employ statistical estimation or deterministic background weighting mechanisms to boost the target in a cluttered background, since a cluttered background eclipses part of the target and can thereby hamper the performance of the feature evaluation process. We have employed a linear combination of the RGB color components; however, the availability of a potentially large spatio-temporal feature space can also be experimented with to further improve tracking using the AMF-MSPF framework.