A Dual-Attention Temporal Convolutional Network-Based Track Initiation Method for Maneuvering Targets
Abstract
1. Introduction
1.1. The Critical Role and Fundamental Challenges of Radar Track Initiation
1.2. From Traditional to Data-Driven Methods
1.3. Limitations in Existing Deep Learning Models
1.4. Contributions and Organization of This Paper
- DA-TCN directly processes a 2D tensor input of shape time steps × feature dimensions, which preserves both the temporal correlation and the feature semantics of the track data and fundamentally avoids the information loss caused by “feature flattening”.
- The model adopts a dual-branch attention mechanism to dynamically learn which dimensions of the trajectory data are most informative. The channel attention branch evaluates the importance of different kinematic features, while the temporal attention branch locates the key time steps that determine a trajectory’s authenticity.
- A gated fusion mechanism dynamically weights the outputs of the two attention branches, allowing the network to adjust its reliance on each branch according to the characteristics of the candidate trajectory and thus achieve optimal information fusion.
- Comprehensive evaluations on a simulated dataset containing various complex, high-speed maneuvering patterns validate the proposed method. The results demonstrate that its performance is significantly superior to that of multiple baseline models that rely on feature flattening, proving the method’s effectiveness, robustness, and practical value for track initiation in high-clutter and target-maneuvering scenarios.
2. Related Work
2.1. Foundational Deep Learning Architectures for Sequence Analysis
2.2. Classification of Attention Mechanisms for Feature Optimization
2.3. Comparative Analysis of Feature Fusion Strategies
- Element-wise Summation: Models in the style of the Dual Attention Network for Scene Segmentation (DA-Net) [22] assume that features from different sources are semantically aligned and can be directly added. This method is computationally simple, but if the feature scales or semantic meanings differ significantly, it can lead to information interference.
- Concatenation Fusion: Models like Densely Connected Convolutional Networks (DenseNet) [23] concatenate different feature vectors along their dimensions, preserving all original information and delegating the fusion task to subsequent fully connected layers. This is a robust baseline approach, but the subsequent layers must learn a fixed fusion weighting that is the same for all samples.
- Attentional Feature Fusion: Represented by the Attentional Feature Fusion (AFF) framework [24], this approach treats the fusion process itself as an attention process to be learned. Specifically, AFF uses a dedicated attention module (such as its proposed Multi-Scale Channel Attention Module, MS-CAM) that takes the features to be fused as input and outputs a set of weights. These weights are then used to perform a weighted sum of the original features. This elevates the fusion process from simple addition or concatenation to an intelligent, content-aware weighted average.
- Gated Fusion: This is a powerful and non-linear dynamic fusion mechanism. Its core is the introduction of a “gating unit,” typically a fully connected layer with a Sigmoid activation function. The output of this unit is a “gate” signal between 0 and 1. This signal controls the information flow from other branches through element-wise multiplication. A value of 1 means the information passes through completely, while a value of 0 means it is completely blocked. This mechanism endows the model with a “switch-like” capability to dynamically and non-linearly decide on the flow of information based on the input.
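The gating idea is easy to make concrete. Below is a minimal PyTorch sketch of a gated fusion unit of the kind described above; the module name, the feature dimension, and the complementary form g · a + (1 − g) · b are illustrative assumptions rather than the implementation of any cited model.

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Minimal gated fusion of two equally shaped feature vectors.

    A fully connected layer followed by a Sigmoid produces a gate g in (0, 1);
    the fused output is g * a + (1 - g) * b, so the network decides per sample
    (and per feature) how much to trust each branch.
    """

    def __init__(self, dim: int):
        super().__init__()
        # The gate sees both branches and emits one weight per feature.
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        g = self.gate(torch.cat([a, b], dim=-1))  # shape: (batch, dim)
        return g * a + (1.0 - g) * b


# Toy usage: fuse the outputs of two hypothetical attention branches.
fusion = GatedFusion(dim=64)
channel_feat = torch.randn(8, 64)   # e.g., channel-attention branch output
temporal_feat = torch.randn(8, 64)  # e.g., temporal-attention branch output
fused = fusion(channel_feat, temporal_feat)
print(fused.shape)  # torch.Size([8, 64])
```

The complementary gate keeps the fused output on the same scale as its inputs; designs with two independent gates, one per branch, are equally common.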
2.4. Current State of Deep Learning-Based Track Initiation Research
3. DA-TCN Framework
3.1. System Overview
- Candidate trajectory generation: to cope with the combinatorial explosion of measurement associations, this phase adopts a backtracking search strategy that combines spatial indexing with kinematic-constraint pruning. Discrete radar plots are first grouped by time, and a KD-Tree [30] spatial index is built for each group to accelerate neighborhood queries. Starting from the earliest time group, a depth-first recursive backtracking algorithm then extends each trace into the subsequent groups. During extension, the spatial search radius is bounded by the maximum target speed, and the KD-Tree efficiently screens candidate points, which are further pruned against kinematic thresholds such as velocity and acceleration. In this way, a collection of physically plausible candidate tracks is generated efficiently from a vast number of combinations (sketched after this list).
- Feature engineering: construct a 2D feature tensor for each candidate track that retains the complete spatial and temporal structure.
- DA-TCN classification and discrimination: Input the feature tensor into DA-TCN, the core model of this paper, to classify and discriminate between true and false tracks.
- Conflict resolution post-processing: because different candidate tracks may share the same radar plot, the high-confidence tracks output by the model are post-processed in this stage using a greedy, confidence-ranked strategy (also sketched after this list). All candidate tracks whose confidence exceeds a preset threshold are first selected and sorted from highest to lowest confidence. Starting with the highest-confidence track, each track is added to the final set, and every remaining candidate that shares a plot with it is removed from the candidate list. The process iterates until the candidate list is empty, guaranteeing that each output track is unique and unambiguous.
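The two algorithmic steps above can be illustrated with short, self-contained sketches. First, candidate generation: the snippet below assumes 2D Cartesian plots already grouped by scan, a fixed scan interval DT, and illustrative gating thresholds V_MAX and A_MAX; it is a simplified stand-in for the paper’s pipeline, not its actual code.

```python
import numpy as np
from scipy.spatial import cKDTree

V_MAX, A_MAX, DT = 1100.0, 45.0, 1.0   # illustrative limits (m/s, m/s^2, s)

def generate_candidates(groups):
    """Depth-first backtracking over time-grouped radar plots.

    `groups` is a list of (N_i, 2) position arrays, one per scan. A KD-Tree
    per scan restricts every extension to plots reachable within V_MAX * DT
    (the velocity gate); an acceleration gate prunes the survivors.
    """
    trees = [cKDTree(g) for g in groups]
    tracks = []

    def extend(track, t):
        if t == len(groups):                    # full-length candidate found
            tracks.append(list(track))
            return
        last = groups[t - 1][track[-1]]
        for j in trees[t].query_ball_point(last, r=V_MAX * DT):
            if len(track) >= 2:                 # acceleration gate
                prev = groups[t - 2][track[-2]]
                v_new = (groups[t][j] - last) / DT
                v_old = (last - prev) / DT
                if np.linalg.norm((v_new - v_old) / DT) > A_MAX:
                    continue
            track.append(j)
            extend(track, t + 1)
            track.pop()                         # backtrack

    for i in range(len(groups[0])):             # every first-scan plot seeds a search
        extend([i], 1)
    return tracks
```

Second, a sketch of the greedy, confidence-ranked conflict resolution described in the last bullet; the threshold value and data layout are assumptions.

```python
def resolve_conflicts(tracks, scores, threshold=0.5):
    """Greedy confidence-ranked selection of mutually plot-disjoint tracks."""
    order = sorted((i for i, s in enumerate(scores) if s >= threshold),
                   key=lambda i: scores[i], reverse=True)
    used, final = set(), []
    for i in order:
        plots = {(t, p) for t, p in enumerate(tracks[i])}  # (scan, plot index)
        if plots & used:                       # shares a plot with a kept track
            continue
        used |= plots
        final.append(tracks[i])
    return final
```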
3.2. Input Construction and Feature Engineering
3.3. DA-TCN Model Architecture
- The channel attention branch: this branch generates a weight vector for the feature channels by performing global average pooling and global max pooling along the time dimension, concatenating the two pooled results, and passing them through a nonlinear transformation. The resulting weights are finally applied to the input features by element-wise multiplication.
- The temporal attention branch: this branch first feeds the feature sequence into a Bi-GRU network to capture the complete bidirectional context of each time point, producing a sequence of hidden states. A self-attention mechanism is then applied to this sequence to compute an importance weight for each time step and to generate the weighted sequence representation. A minimal sketch of both branches follows this list.
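Below is a minimal PyTorch sketch of the two branches, assuming an input of shape (batch, time steps, feature channels); the layer sizes, the reduction ratio, and the additive-attention scoring form are illustrative choices, not the paper’s exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelAttention(nn.Module):
    """Dual-pooling channel attention over a (batch, T, D) sequence."""

    def __init__(self, dim: int, reduction: int = 4):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * dim, dim // reduction), nn.ReLU(),
            nn.Linear(dim // reduction, dim), nn.Sigmoid(),
        )

    def forward(self, x):                        # x: (B, T, D)
        avg = x.mean(dim=1)                      # global average pool over time
        mx, _ = x.max(dim=1)                     # global max pool over time
        w = self.mlp(torch.cat([avg, mx], dim=-1))  # (B, D) channel weights
        return x * w.unsqueeze(1)                # reweight every time step

class TemporalAttention(nn.Module):
    """Bi-GRU context encoding followed by additive self-attention."""

    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.gru = nn.GRU(dim, hidden, batch_first=True, bidirectional=True)
        self.score = nn.Linear(2 * hidden, 1)

    def forward(self, x):                        # x: (B, T, D)
        h, _ = self.gru(x)                       # (B, T, 2 * hidden) states
        alpha = F.softmax(self.score(h), dim=1)  # (B, T, 1) step weights
        return (alpha * h).sum(dim=1)            # (B, 2 * hidden) summary

# Toy usage on a backbone output of 4 time steps with 64 channels.
x = torch.randn(8, 4, 64)
ch = ChannelAttention(64)(x)                     # (8, 4, 64), channel-reweighted
tm = TemporalAttention(64)(x)                    # (8, 128), time-weighted summary
```

In the full model, the channel-branch output (still a sequence) would be pooled to a vector before being fused with the temporal summary by the gated fusion unit; that interface detail is an assumption of this sketch.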
4. Experimental Evaluation
4.1. Simulation Environment and Data Set
- Velocity Range (positive samples): [100, 1100] m/s, covering targets ranging from low-speed UAVs to high-speed reconnaissance aircraft.
- Tangential Acceleration Range (positive samples): [−0.5, 4.5] m/s², simulating typical aircraft acceleration and deceleration capabilities.
- Yaw Rate Range (positive samples): [−2, 2] deg/s, slightly below the standard-rate turn, to simulate sustained tactical maneuvers.
- Yaw Rate Range (negative samples): [−20, 20] deg/s, simulating physically impossible, drastic changes in direction.
- Tangential Acceleration Range (negative samples): [−5, 45] m/s², simulating velocity pulses that do not conform to the laws of inertia.
4.2. Baseline Modeling and Evaluation Metrics
4.3. Evaluation Metrics
- True Track Initiation Rate: This metric measures the algorithm’s ability to discover real targets and is defined as the ratio of the number of real trajectories successfully initiated to the total number of real trajectories in the scene.
- False Track Alarm Rate: This metric measures the algorithm’s ability to suppress clutter and false combinations and is defined as the ratio of the number of false trajectories incorrectly initiated as real to the total number of all initiated trajectories.
- Average Start Time: This metric measures the computational efficiency of the algorithm and is defined as the average time required to complete one full track initiation task. The two rate metrics reduce to simple ratios, as sketched after this list.
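The following sketch, with made-up counts, shows how the two rate metrics would be computed from a scored scene; the counts and variable names are hypothetical.

```python
def true_track_initiation_rate(true_initiated: int, true_total: int) -> float:
    """Real trajectories successfully initiated / real trajectories in scene."""
    return true_initiated / true_total

def false_track_alarm_rate(false_initiated: int, initiated_total: int) -> float:
    """Falsely initiated trajectories / all initiated trajectories."""
    return false_initiated / initiated_total

# Hypothetical counts: 478 of 500 real tracks initiated; 512 tracks initiated
# in total, so 512 - 478 = 34 of them are false.
print(f"TTIR = {true_track_initiation_rate(478, 500):.2%}")  # 95.60%
print(f"FTAR = {false_track_alarm_rate(34, 512):.2%}")       # 6.64%
```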
5. Results and Analysis
5.1. Comparative Performance in High Clutter Maneuvering Scenarios
5.2. Analysis of Model Component Effectiveness
- The TCN backbone alone already outperforms the baseline models, confirming the initial advantage of preserving the temporal structure.
- Analysis of the channel attention branch shows that adding a simple channel attention mechanism (using Global Average Pooling, GAP) to the TCN backbone significantly reduces the false track alarm rate from 14.0466% to 9.885%, demonstrating the effectiveness of dynamically weighting the 11 kinematic features. However, the dual-pooling channel attention module yields a better true track initiation rate of 95.04%. This suggests that for discriminating maneuvering targets, relying solely on global average pooling can obscure critical transient peaks; the dual-pooling mechanism more effectively captures abrupt changes in features such as acceleration or curvature within the short sequence, thereby enhancing the identification of true targets.
- In the temporal attention branch, introducing a Bi-GRU as the base reduces the false track alarm rate to 13.0058%, confirming the necessity of capturing the full sequence context: while the TCN excels at extracting local dynamics, the final judgment of a track’s authenticity depends on the overall evolutionary logic of the sequence, which the Bi-GRU provides by generating a context-aware representation for each time step. Adding a multi-head attention mechanism on top of the Bi-GRU degrades performance, suggesting that it is overly complex for such short sequences and hinders convergence. In contrast, integrating self-attention achieves a false track alarm rate of 10.2506%, showing that self-attention helps the model pinpoint the key time steps where sharp maneuvers or measurement anomalies occur and thus efficiently suppresses tracks that violate kinematic logic.
- Finally, the complete DA-TCN with gated fusion achieves the best performance on both metrics simultaneously, improving the true track initiation rate and the false track alarm rate at the same time. This decisively demonstrates the superiority of the gated fusion mechanism: rather than combining the two branches with fixed weights, it adaptively learns the fusion weights from the characteristics of the input trajectory and therefore makes more accurate and robust decisions.
5.3. Comparison with Other Attention and Fusion Mechanisms
5.4. Visualization and Interpretability Analysis of Attention Mechanisms
5.5. Model Configuration and Hyperparameter Sensitivity Analysis
- Impact of Network Width: reducing the channel width to 32 significantly decreases the parameter count and FLOPs but causes a noticeable performance drop, especially a sharp increase in the false track alarm rate, indicating insufficient model capacity. Increasing the width to 128 dramatically raises the parameter count and FLOPs, yielding a slight improvement in the true track initiation rate but a minor degradation in the false track alarm rate, together with a sharp rise in inference time; this points to diminishing returns and unnecessary complexity. A width of 64 therefore strikes an effective balance between performance and efficiency.
- Impact of Network Depth: decreasing the number of TCN blocks to 2 reduces complexity but clearly increases the false track alarm rate, suggesting that insufficient depth hinders the full extraction of dynamic features. Increasing the depth to 4 adds complexity and provides only marginal gains in the true track initiation rate while slightly increasing the false track alarm rate. For a very short sequence of length 4, a deeper network yields no significant benefit; three TCN blocks are sufficient.
- Impact of Dilated Convolutions: with nearly identical parameter counts and FLOPs, the model using dilated convolutions shows a significant performance degradation. This aligns with the ablation findings and confirms that for the short-sequence track initiation task, expanding the receptive field is not only unnecessary but can be detrimental: dilation skips adjacent time steps and thereby interferes with the capture of local, high-frequency dynamics (see the non-dilated block sketched after this list).
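For reference, here is a plausible PyTorch sketch of one non-dilated TCN block matching the configuration reported in the model-configuration table (3 blocks, 64 channels, kernel size 2, dropout 0.2); the causal padding and residual details are assumptions of this sketch, not the paper’s exact layer layout.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TCNBlock(nn.Module):
    """One causal convolution block without dilation (dilation = 1).

    With kernel size 2 and dilation 1, each output step sees only itself and
    its immediate predecessor, preserving local high-frequency dynamics on
    very short sequences.
    """

    def __init__(self, in_ch: int, out_ch: int, k: int = 2, p_drop: float = 0.2):
        super().__init__()
        self.pad = k - 1                          # left padding keeps causality
        self.conv = nn.Conv1d(in_ch, out_ch, kernel_size=k)
        self.relu = nn.ReLU()
        self.drop = nn.Dropout(p_drop)
        self.res = nn.Conv1d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()

    def forward(self, x):                         # x: (B, C, T)
        y = F.pad(x, (self.pad, 0))               # pad on the left only
        y = self.drop(self.relu(self.conv(y)))
        return y + self.res(x)                    # residual connection

# Backbone as configured: 3 blocks, 64 channels, kernel size 2, no dilation.
backbone = nn.Sequential(TCNBlock(11, 64), TCNBlock(64, 64), TCNBlock(64, 64))
out = backbone(torch.randn(8, 11, 4))             # (B, features, time steps)
print(out.shape)                                  # torch.Size([8, 64, 4])
```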
5.6. Computational Complexity and Efficiency Analysis
- Model Complexity: The parameter counts and FLOPs of DA-TCN are comparable to the DA-Net style model and are moderate compared to models with complex attention or modulation mechanisms like AFF and TCN-BiFA. However, DA-TCN is significantly more complex than lightweight single-path models (TCN-CBAM, TCN-ECA), the Transformer, and the traditional baselines. This is primarily due to its parallel dual-branch architecture and the additional computational load from the Bi-GRU and self-attention mechanisms in the temporal branch.
- Training Efficiency: DA-TCN requires 83 epochs to converge, exhibits the longest total training time, and trains more slowly than most competitors. This indicates that its more complex structure and gated fusion mechanism need more iterations for full parameter optimization.
- Inference Efficiency: DA-TCN’s single-sample inference time is 5.33 ms, with a throughput of 187 samples/s, which is similar to the DA-Net style model but significantly slower than more streamlined architectures like the Transformer and lightweight TCN variants. This further reflects the inference overhead introduced by the parallel structure and complex internal modules.
5.7. Section Summary
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Mayor, M.A.; Carroll, R.L. A Multi-Target Track Initiation Algorithm. In Proceedings of the 1987 American Control Conference, Minneapolis, MN, USA, 10–12 June 1987; pp. 1128–1130.
- Skolnik, M. Radar Handbook, 3rd ed.; McGraw-Hill Education: New York, NY, USA, 2008.
- Bar-Shalom, Y.; Li, X.R.; Kirubarajan, T. Estimation with Applications to Tracking and Navigation: Theory Algorithms and Software; John Wiley & Sons: Hoboken, NJ, USA, 2001.
- Trunk, G.V.; Wilson, J.D. Track initiation of occasionally unresolved radar targets. IEEE Trans. Aerosp. Electron. Syst. 1981, 17, 122–130.
- Wang, Z.; Zhang, W.; Pan, M.; Liu, D. A novel track initiation method based on rule knowledge and deep detection network. In Sixteenth International Conference on Signal Processing Systems (ICSPS 2024); SPIE: Bellingham, WA, USA, 2024; Volume 13559, pp. 1256–1267.
- Zhang, X.; Liang, F.; Chen, X.; Cheng, M.; Hu, Q.; He, S. Track Initiation Method Based on Deep Learning and Logic Method. In Proceedings of the 2023 7th International Conference on Machine Vision and Information Technology (CMVIT), Xiamen, China, 24–26 March 2023; pp. 57–61.
- Konopko, M.; Malanowski, M.; Hardejewicz, J. Multi-Hypothesis Track Initialization with the Use of Multiple Trajectory Models. In Proceedings of the 2021 21st International Radar Symposium (IRS), Berlin, Germany, 21–22 June 2021; pp. 1–10.
- Blackman, S.; Popoli, R. Design and Analysis of Modern Tracking Systems; Artech House: Norwood, MA, USA, 1999.
- Smith, M.C. Feature space transform for multitarget detection. In Proceedings of the 1980 19th IEEE Conference on Decision and Control including the Symposium on Adaptive Processes, Albuquerque, NM, USA, 10–12 December 1980; pp. 835–836.
- Dong, X.; Hao, C.; Chunsheng, X.; Jiwei, L. A new Hough transform applied in track initiation. In Proceedings of the 2011 International Conference on Consumer Electronics, Communications and Networks (CECNet), Xianning, China, 16–18 April 2011; pp. 30–33.
- Zeng, L.; Xiao, G.; Ding, C.; He, Y. Track initiation based on adaptive gates and fuzzy Hough transform. SIViP 2023, 17, 4057–4065.
- Wei, S.; Zhou, X.; Wang, J.; Pang, R.; Li, X.; Liu, Q. Adaptive Multi-Radar Anti-Bias Track Association Algorithm Based on Reference Topology Features. Remote Sens. 2025, 17, 1876.
- Lecun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
- Ige, A.O.; Sibiya, M. State-of-the-Art in 1D Convolutional Neural Networks: A Survey. IEEE Access 2024, 12, 144082–144105.
- Mienye, I.D.; Swart, T.G.; Obaido, G. Recurrent Neural Networks: A Comprehensive Review of Architectures, Variants, and Applications. Information 2024, 15, 517.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 1–11.
- Bai, S.; Kolter, J.Z.; Koltun, V. An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling. In Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018.
- Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 14–19 June 2020; pp. 11531–11539.
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-Excitation Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141.
- Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional Block Attention Module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018.
- Liu, C.; Mao, Z.; Liu, A.-A.; Zhang, T.; Wang, B.; Zhang, Y. Focus Your Attention: A Bidirectional Focal Attention Network for Image-Text Matching. In Proceedings of the 27th ACM International Conference on Multimedia (MM ’19), Nice, France, 21–25 October 2019; Association for Computing Machinery: New York, NY, USA, 2019; pp. 3–11.
- Fu, J.; Liu, J.; Tian, H.; Li, Y.; Bao, Y.; Fang, Z.; Lu, H. Dual Attention Network for Scene Segmentation. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 3141–3149.
- Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely Connected Convolutional Networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2261–2269.
- Dai, Y.; Gieseke, F.; Oehmcke, S.; Wu, Y.; Barnard, K. Attentional feature fusion. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Online, 5–9 January 2021; pp. 3560–3569.
- Xiong, F.; Wang, G.; Bai, Y.; Bian, L. Radar track initiation method based on the convolutional neural network. Electron. Meas. Technol. 2020, 43, 78–83.
- Shen, G.; Xu, X.; Fan, Y. A track initiation algorithm based on temporal-spatial characteristics of radar measurement. J. Terahertz Sci. Electron. Inf. Technol. 2022, 20, 1269–1276.
- Yang, F.; Xia, X.; Gao, T. A Deep Learning-Based Radar Track Initiation Algorithm with Spatiotemporal Feature Fusion. In Proceedings of the 2024 13th International Conference on Control, Automation and Information Sciences (ICCAIS), Ho Chi Minh City, Vietnam, 26–28 November 2024; pp. 1–6.
- Zhang, Y.; Yang, S.; Li, H.; Mu, H. A novel multi-target track initiation method based on convolution neural network. In Proceedings of the 2017 International Workshop on Remote Sensing with Intelligent Processing (RSIP), Shanghai, China, 18–21 May 2017; pp. 1–5.
- Yang, D.; Zhao, G.; Li, D.; Li, S. A track initiation method based on the combination of Kalman filtering and the transformer model. In Proceedings of the IET International Radar Conference (IRC 2023), Chongqing, China, 3–5 December 2023; pp. 980–986.
- Bentley, J.L. Multidimensional Binary Search Trees Used for Associative Searching. Commun. ACM 1975, 18, 509–517.
- Skolnik, M.I. Introduction to Radar Systems, 3rd ed.; McGraw-Hill: New York, NY, USA, 2001; pp. 385–435.
- Ross, S.M. Simulation, 5th ed.; Academic Press: Boston, MA, USA, 2013; pp. 69–73.
| Parameter Category | Parameter | Positive Sample (Kinematically Consistent) | Negative Sample (Kinematically Inconsistent) |
|---|---|---|---|
| Generalized parameters | Number of tracks | 50,000 (5 groups of 10,000) | 50,000 (5 groups of 10,000) |
| Kinematic parameters | Speed range (m/s) | [100, 1100] | Same as positive sample |
| | Acceleration range (m/s²) | [−0.5, 4.5] | [−5, 45] |
| | Yaw angular velocity range (deg/s) | [−2, 2] | [−20, 20] |
| Core generation mechanism | Parameter update method | Sampled once before track generation, held constant throughout | Acceleration and yaw rate re-sampled randomly at each time step |
| Model | Metric | Clutter = 50 | Clutter = 100 | Clutter = 150 | Clutter = 200 | Clutter = 250 |
|---|---|---|---|---|---|---|
| CNN-BASED | True Track Initiation Rate (%) | 92.24 | 88.24 | 84.24 | 81.76 | 78.00 |
| | False Track Alarm Rate (%) | 4.3947 | 9.516 | 18.2453 | 24.5199 | 42.9825 |
| | Average Start Time (s) | 0.058 | 0.067 | 0.0802 | 0.0936 | 0.1182 |
| 1DCNN-GRU | True Track Initiation Rate (%) | 96.72 | 95.76 | 94.32 | 93.76 | 92.32 |
| | False Track Alarm Rate (%) | 1.1447 | 1.5625 | 5.4531 | 8.8647 | 18.7324 |
| | Average Start Time (s) | 0.0682 | 0.092 | 0.1044 | 0.1344 | 0.218 |
| GRU-GRU | True Track Initiation Rate (%) | 96.00 | 95.12 | 94.56 | 92.88 | 91.68 |
| | False Track Alarm Rate (%) | 0.9083 | 1.5728 | 4.7542 | 7.6372 | 15.7972 |
| | Average Start Time (s) | 0.076 | 0.091 | 0.116 | 0.1546 | 0.2272 |
| Transformer | True Track Initiation Rate (%) | 97.12 | 96.96 | 96.12 | 95.20 | 93.52 |
| | False Track Alarm Rate (%) | 0.5733 | 0.9803 | 3.763 | 5.9289 | 13.2789 |
| | Average Start Time (s) | 0.0844 | 0.0982 | 0.133 | 0.1846 | 0.2926 |
| DA-TCN | True Track Initiation Rate (%) | 97.60 | 96.80 | 96.12 | 95.76 | 95.12 |
| | False Track Alarm Rate (%) | 0.4894 | 1.063 | 3.1426 | 4.6215 | 9.6505 |
| | Average Start Time (s) | 0.0758 | 0.0928 | 0.126 | 0.1746 | 0.2868 |
| Model Configuration | True Track Initiation Rate (%) | False Track Alarm Rate (%) | Average Start Time (s) | 
|---|---|---|---|
| TCN (Backbone) | 94.48 | 14.0466 | 0.2486 | 
| TCN + Channel Attention (GAP) | 94.50 | 9.885 | 0.2504 | 
| TCN + Dual-Pooling Channel Attention | 95.04 | 14.1618 | 0.2528 | 
| TCN + Bi-GRU (Temporal Branch Base) | 94.48 | 13.0058 | 0.2672 |
| TCN + Bi-GRU + Multi-Head Attention | 94.00 | 16.4296 | 0.2796 | 
| TCN + Bi-GRU + Self-Attention | 94.56 | 10.2506 | 0.2814 | 
| DA-TCN (Gated Fusion, Full Model) | 95.12 | 9.6505 | 0.2868 | 
| Model | Attention Mechanism | Fusion Method | True Track Initiation Rate (%) | False Track Alarm Rate (%) | Average Start Time (s) | 
|---|---|---|---|---|---|
| Transformer | Global Self-Attention | Single-Path | 93.52 | 13.2789 | 0.2926 | 
| TCN-CBAM | Sequential (Channel then Temporal) | Sequential Multiplication | 94.32 | 10.6818 | 0.2628 | 
| TCN-ECA | Efficient Channel Attention | Single-Path | 94.88 | 10.2874 | 0.2648 | 
| TCN-BiFA | Bidirectional Focal Attention | Gated Modulation | 94.08 | 9.5385 | 0.312 | 
| DA-Net Style | Parallel (Channel + Temporal) | Element-wise Sum | 94.40 | 11.2114 | 0.26776 |
| Concatenation Fusion | Parallel (Channel + Temporal) | Concatenation | 94.40 | 9.7859 | 0.2836 | 
| AFF Style | Parallel (Channel + Temporal) | Attentional Fusion (MS-CAM) | 94.24 | 10.8926 | 0.2926 | 
| DA-TCN | Parallel (Channel + Temporal) | Gated Fusion | 95.12 | 9.6505 | 0.2868 | 
| Parameter Category | Specific Setting | 
|---|---|
| Input Structure | 11 features × 4 time steps, shape (4, 11) | 
| Dataset Size | 100,000 (Training: 80,000, Validation: 20,000) | 
| Training Strategy | 200 epochs (Early Stopping with Patience = 20), Adam optimizer (LR = 0.001), Batch Size = 32 | 
| TCN Backbone | 3 TCN blocks, Channels = 64, Kernel Size = 2, Dropout = 0.2 | 
| Channel Attention Branch | Global Average Pooling + Global Max Pooling | 
| Temporal Attention Branch | Bi-GRU (Hidden Units = 64) + Self-Attention | 
| Fusion Mechanism | Gated Fusion | 
| Model Configuration (Width, Depth, Dilation) | True Track Initiation Rate (%) | False Track Alarm Rate (%) | Average Start Time (s) | Parameters | FLOPs (M) |
|---|---|---|---|---|---|
| Width = 32 | 94.8 | 12.5606 | 0.2772 | 32,527 | 0.14 | 
| Width = 128 | 95.36 | 9.7653 | 0.3158 | 486,319 | 2.09 | 
| Depth = 2 (TCN blocks) | 95.04 | 10.961 | 0.2806 | 107,375 | 0.41 | 
| Depth = 4 (TCN blocks) | 95.28 | 10.181 | 0.2972 | 141,423 | 0.67 | 
| TCN (with Dilated Convolution) | 94.48 | 13.92128 | 0.2928 | 124,399 | 0.54 | 
| Final Model (Width = 64, Depth = 3, No Dilation) | 95.12 | 9.6505 | 0.2868 | 124,270 | 0.54 | 
| Model | Parameters | FLOPs (M) | Convergence Epochs | Training Time (min) | Training Speed (Samples/s) | Single-Sample Inference (ms) | Inference Throughput (Samples/s) | 
|---|---|---|---|---|---|---|---|
| DA-Net Style | 124,270 | 0.54 | 43 | 27.95 | 2060 | 5.30 | 188 | 
| AFF Style | 141,438 | 0.67 | 33 | 24.78 | 1766 | 6.21 | 161 | 
| Transformer | 64,354 | 0.22 | 64 | 9.77 | 8738 | 1.32 | 758 | 
| TCN-CBAM | 47,281 | 0.36 | 43 | 16.23 | 3555 | 2.89 | 346 | 
| TCN-ECA | 45,232 | 0.35 | 44 | 13.15 | 4463 | 2.85 | 351 | 
| TCN-BiFA | 62,829 | 0.60 | 32 | 8.51 | 5022 | 4.32 | 231 | 
| 1DCNN-GRU | 18,073 | 0.13 | 137 | 13.75 | 13,308 | 1.88 | 532 | 
| GRU-GRU | 26,433 | 0.19 | 58 | 12.88 | 5993 | 3.12 | 321 | 
| CNN-BASED | 10,177 | 0.10 | 86 | 4.82 | 29,875 | 0.76 | 1315 | 
| DA-TCN | 124,399 | 0.54 | 83 | 46.85 | 2374 | 5.33 | 187 | 