Aggregation and Pruning for Continuous Incremental Multi-Task Inference
Abstract
1. Introduction
- We propose a general and adaptive pruning scheme, AP, for multi-task networks that handles the continuous, incremental addition of tasks during pruning.
- We propose a novel filter compression mechanism that minimizes redundancy between current and incremental tasks by adaptively aggregating similar filters into a new filter.
- Extensive experiments on several network architectures and multiple datasets show that our method effectively compresses the total parameter count of the whole network while maintaining the representation power of individual tasks.

2. Related Work
3. The Proposed Method
3.1. Problem Statement
- (1) Specialized filters for task a, i.e., filters that are relevant only to task a.
- (2) Specialized filters for task b, i.e., filters that are relevant only to task b.
- (3) Filter families, i.e., filters that are relevant to both task a and task b.
- (4) In a convolutional layer, each filter consists of convolutional kernels. Equation (1) gives the number of uncompressed convolutional kernels in the multi-task network in terms of the numbers of kernels specific to task networks a and b and the number of shared kernels. The total number of filters (i.e., output channels) in the layer then follows from the kernel-to-filter relation.
3.2. Individual Task Pruning
3.3. Buffer Area Build
1. For each buffer, we construct a mask of the same size as P to delete the shared filters. The mask is defined in terms of the i-th position in the buffers of the j-th and m-th tasks and the a-th position in the buffer of the n-th and j-th tasks. Notably, it is a vector of ones when first aggregated.
2. We then calculate a size term from the weight of the j-th task network in the l-th layer, using the Hadamard product ⊙.
3. Next, we capture similar filter pairs between tasks through a similarity measure, namely the cosine similarity between two filters, each being the i-th convolution filter of the l-th layer of a task network.
4. Next, we aggregate similar filter pairs to obtain a shared filter family with a strong representation. We set up a buffer consisting of P and B for the shared filters. P represents the position indices of the filters shared in the current iteration, and B represents the positions of the shared filters of the current iteration; both are initialized to empty matrices. We then introduce a mask matrix M that marks the filter coordinates of network a that need to be shared in the current iteration, where v is the threshold and ∅ denotes a vacancy.
5. The position index of the shared filter is obtained from the entry in the i-th row and m-th column of the similarity matrix S and is recorded at the i-th position of the location buffer of task j.
6. The value of the shared filter after aggregation is stored in the i-th entry of the filter buffer for task j. It is computed from the results of (1) for the parameters at the i-th position of the l-th layer of the task j network and for the parameters at the matched position of the l-th layer of the matched task network.
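The buffer-building steps above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the greedy one-to-one matching, and the element-wise mean used as the aggregation rule are hypothetical choices. Only the cosine similarity measure and the threshold v come from the text.

```python
import numpy as np

def cosine_similarity_matrix(filters_a, filters_b, eps=1e-12):
    """Pairwise cosine similarity between two banks of conv filters.

    Each bank has shape (n_filters, in_channels, kh, kw); every filter is
    flattened to a vector before comparison.
    """
    a = filters_a.reshape(filters_a.shape[0], -1)
    b = filters_b.reshape(filters_b.shape[0], -1)
    a = a / (np.linalg.norm(a, axis=1, keepdims=True) + eps)
    b = b / (np.linalg.norm(b, axis=1, keepdims=True) + eps)
    return a @ b.T  # S[i, m] = cos(filter i of task a, filter m of task b)

def aggregate_similar_filters(filters_a, filters_b, v=0.9):
    """Pair filters whose similarity exceeds v and merge each pair.

    Assumption: a pair is merged by averaging (hypothetical rule).
    Returns the list of (i, m) positions and the stacked shared filters.
    """
    sim = cosine_similarity_matrix(filters_a, filters_b)
    best_b = sim.argmax(axis=1)                       # best partner in b
    best_sim = sim[np.arange(sim.shape[0]), best_b]
    positions, shared, used_b = [], [], set()
    # Greedy matching: most similar pairs first, each b filter used once.
    for i in np.argsort(-best_sim):
        m = int(best_b[i])
        if best_sim[i] > v and m not in used_b:
            used_b.add(m)
            positions.append((int(i), m))
            shared.append(0.5 * (filters_a[i] + filters_b[m]))
    if shared:
        return positions, np.stack(shared)
    return positions, np.empty((0,) + filters_a.shape[1:])

# Toy layer: three 1x2x2 filters per task (values chosen for illustration).
a = np.eye(4)[:3].reshape(3, 1, 2, 2)
b = np.array([[0, 0, 0, 1],
              [2, 0, 0, 0],
              [0, 1, 0.1, 0]], dtype=float).reshape(3, 1, 2, 2)
positions, shared = aggregate_similar_filters(a, b, v=0.9)
print(positions, shared.shape)
```

In this toy layer, two of the three filters of task a have a close match in task b, so two shared filters are produced and the third filter of each task stays task-specific. Raising v makes sharing more conservative.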
3.4. Filter Update
3.5. Adaptive Learning Mechanism
4. Experiments
4.1. Performance on Uniform Task Groups
- Exp. A: independent labels with shared low-level features. We combine two classification tasks on Fashion-MNIST and MNIST. Both datasets use 28 × 28 grayscale images, sharing similar low-level feature spaces (e.g., edge and texture patterns), but their label spaces are semantically independent (clothing vs. digits). LeNet-5 is used as a lightweight baseline to focus on task-specific learning.
- Exp. B: aligned labels with domain-specific features. This scenario involves two classification tasks on the Office-Caltech Webcam (low-resolution images with environmental noise) and Amazon (high-resolution product images) subsets. While the label spaces are fully aligned, the feature distributions exhibit significant domain shifts. We adopt VGG-16 to evaluate compatibility with classical deep CNNs that lack residual connections.
- Exp. C: architecture compatibility validation. Using the same tasks as Exp. B (Webcam and Amazon), we replace VGG-16 with ResNet-50 to evaluate the method’s performance on modern architectures with residual connections, explicitly verifying its adaptability to advanced network designs.
- Exp. D: partial label alignment with mixed feature domains and incremental tasks. Exp. D extends Exp. C with two more datasets: Office-Caltech DSLR and the Art dataset. This introduces a more complex scenario with heterogeneous feature spaces (natural images vs. paintings) and partially overlapping labels (e.g., the shared “chair” category in Office-Caltech vs. unique art categories). Starting from the multi-task network trained on the Webcam and Amazon domains in Exp. C, we incrementally introduce two single-task networks, for DSLR and Art, one at a time in a predefined order. This lets us evaluate the scalability and effectiveness of our method in a continuous incremental learning setting, and it further tests the method’s adaptability to architectures with residual connections under the added complexity of new tasks with mixed feature domains and label alignments.
- Results on Exp. A: As shown in Table 1, tasks with independent labels but shared low-level features (Fashion-MNIST and MNIST) demonstrate that our method effectively aggregates filters, preserving task-specific features while reducing redundancy. This leads to improved accuracy and efficiency.
- Results on Exp. B and C: As shown in Table 2 and Table 3, our method improves accuracy and reduces parameters on both VGG-16 (Exp. B) and ResNet-50 (Exp. C), with similar trends across the two architectures. Moreover, on ResNet-50, cosine pruning yields marginally better parameter reduction, while the other pruning criterion is slightly more accurate. The results indicate that our approach handles domain-specific features with aligned labels effectively, regardless of the underlying model architecture, and that it generalizes across network designs while maintaining high performance and reducing computational overhead. Experiment B is also designed to evaluate robustness to shifts in task feature distributions, as it involves different domains of the Office-Caltech dataset. The consistent improvements in this setting further demonstrate the generalization capability of our approach under distributional changes across tasks.
- Results on Exp. D: As shown in Table 4, we evaluate a more complex scenario with four tasks, where partial label alignment and mixed feature domains are introduced. Our method maintains 87.33% accuracy while reducing parameters by 41.7%. This result demonstrates that, even with the increasing complexity of tasks and heterogeneity in label spaces, our method efficiently prunes redundant parameters. This highlights the effectiveness of our approach in maintaining task-specific accuracy while adapting to the addition of new tasks in a multi-task, incremental learning environment.
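The parameter reductions quoted for Exp. D can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper name `relative_change` is hypothetical, and the parameter counts (in millions) are copied from the first block of rows in the four-task results table.

```python
def relative_change(baseline, ours):
    """Percentage change from baseline to ours (negative means a reduction)."""
    return 100.0 * (ours - baseline) / baseline

# (baseline, ours) parameter counts in millions, copied from the table.
rows = {
    "A + B": (46.02, 31.35),
    "A + B + C": (69.03, 42.25),
    "A + B + C + D": (92.04, 53.63),
}

changes = {name: round(relative_change(b, o), 2) for name, (b, o) in rows.items()}
for name, pct in changes.items():
    print(f"{name}: {pct:+.2f}%")
```

The four-task row reproduces the 41.73% reduction reported in the table. The baseline grows linearly with the number of tasks, while the merged network grows more slowly, which is why the relative saving increases as tasks are added.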
4.2. Performance on Diverse Task Groups
4.3. Analysis
4.4. Ablation Studies
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Zhan, J.; Luo, Y.; Guo, C.; Wu, Y.; Meng, J.; Liu, J. YOLOPX: Anchor-free multi-task learning network for panoptic driving perception. Pattern Recognit. 2024, 148, 110152.
2. Tapu, R.; Mocanu, B.; Zaharia, T. Wearable assistive devices for visually impaired: A state of the art survey. Pattern Recognit. Lett. 2020, 137, 37–52.
3. Meshram, V.V.; Patil, K.; Meshram, V.A.; Shu, F.C. An astute assistive device for mobility and object recognition for visually impaired people. IEEE Trans. Hum.-Mach. Syst. 2019, 49, 449–460.
4. Krishna, S.; Little, G.; Black, J.; Panchanathan, S. A wearable face recognition system for individuals with visual impairments. In Proceedings of the 7th International ACM SIGACCESS Conference on Computers and Accessibility, Baltimore, MD, USA, 9–12 October 2005; pp. 106–113.
5. Poggi, M.; Mattoccia, S. A wearable mobility aid for the visually impaired based on embedded 3D vision and deep learning. In Proceedings of the 2016 IEEE Symposium on Computers and Communication (ISCC), Messina, Italy, 27–30 June 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 208–213.
6. Schmidhuber, J. Deep learning in neural networks: An overview. Neural Netw. 2015, 61, 85–117.
7. Samek, W.; Montavon, G.; Lapuschkin, S.; Anders, C.J.; Müller, K.R. Explaining deep neural networks and beyond: A review of methods and applications. Proc. IEEE 2021, 109, 247–278.
8. Li, Z.; Liu, F.; Yang, W.; Peng, S.; Zhou, J. A survey of convolutional neural networks: Analysis, applications, and prospects. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 6999–7019.
9. Jiang, W.; Luo, J. Graph neural network for traffic forecasting: A survey. Expert Syst. Appl. 2022, 207, 117921.
10. Sharma, K.; Lee, Y.C.; Nambi, S.; Salian, A.; Shah, S.; Kim, S.W.; Kumar, S. A survey of graph neural networks for social recommender systems. ACM Comput. Surv. 2024, 56, 265.
11. Liu, J.; Yang, C.; Lu, Z.; Chen, J.; Li, Y.; Zhang, M.; Bai, T.; Fang, Y.; Sun, L.; Yu, P.S.; et al. Towards graph foundation models: A survey and beyond. arXiv 2023, arXiv:2310.11829.
12. Wu, L.; He, X.; Wang, X.; Zhang, K.; Wang, M. A survey on accuracy-oriented neural recommendation: From collaborative filtering to information-rich recommendation. IEEE Trans. Knowl. Data Eng. 2022, 35, 4425–4445.
13. Han, S.; Mao, H.; Dally, W.J. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv 2015, arXiv:1510.00149.
14. Zhuang, W.; Wen, Y.; Lyu, L.; Zhang, S. MAS: Towards resource-efficient federated multiple-task learning. In Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023; pp. 23414–23424.
15. LeCun, Y.; Denker, J.; Solla, S. Optimal brain damage. In Advances in Neural Information Processing Systems; MIT Press: Cambridge, MA, USA, 1989; Volume 2, pp. 598–605.
16. Su, J.; Chen, Y.; Cai, T.; Wu, T.; Gao, R.; Wang, L.; Lee, J.D. Sanity-checking pruning methods: Random tickets can win the jackpot. In Advances in Neural Information Processing Systems; MIT Press: Cambridge, MA, USA, 2020; Volume 33, pp. 20390–20401.
17. Sanh, V.; Wolf, T.; Rush, A. Movement pruning: Adaptive sparsity by fine-tuning. In Advances in Neural Information Processing Systems; MIT Press: Cambridge, MA, USA, 2020; Volume 33, pp. 20378–20389.
18. Molchanov, D.; Ashukha, A.; Vetrov, D. Variational dropout sparsifies deep neural networks. In Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; pp. 2498–2507.
19. Chen, Y.; Zheng, B.; Zhang, Z.; Wang, Q.; Shen, C.; Zhang, Q. Deep learning on mobile and embedded devices: State-of-the-art, challenges, and future directions. ACM Comput. Surv. (CSUR) 2020, 53, 84.
20. Molchanov, P.; Tyree, S.; Karras, T.; Aila, T.; Kautz, J. Pruning convolutional neural networks for resource efficient inference. arXiv 2016, arXiv:1611.06440.
21. Kuutti, S.; Bowden, R.; Jin, Y.; Barber, P.; Fallah, S. A survey of deep learning applications to autonomous vehicle control. IEEE Trans. Intell. Transp. Syst. 2020, 22, 712–733.
22. Peng, H.; Gurevin, D.; Huang, S.; Geng, T.; Jiang, W.; Khan, O.; Ding, C. Towards sparsification of graph neural networks. In Proceedings of the 2022 IEEE 40th International Conference on Computer Design (ICCD), Olympic Valley, CA, USA, 23–26 October 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 272–279.
23. Luo, Y.; Behnam, P.; Thorat, K.; Liu, Z.; Peng, H.; Huang, S.; Zhou, S.; Khan, O.; Tumanov, A.; Ding, C.; et al. Codg-reram: An algorithm-hardware co-design to accelerate semi-structured gnns on reram. In Proceedings of the 2022 IEEE 40th International Conference on Computer Design (ICCD), Olympic Valley, CA, USA, 23–26 October 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 280–289.
24. Chen, T.; Sui, Y.; Chen, X.; Zhang, A.; Wang, Z. A unified lottery ticket hypothesis for graph neural networks. In Proceedings of the 38th International Conference on Machine Learning, Virtual Event, 18–24 July 2021; pp. 1695–1706.
25. Skarding, J.; Gabrys, B.; Musial, K. Foundations and modeling of dynamic networks using dynamic graph neural networks: A survey. IEEE Access 2021, 9, 79143–79168.
26. Chou, Y.M.; Chan, Y.M.; Lee, J.H.; Chiu, C.Y.; Chen, C.S. Unifying and merging well-trained deep neural networks for inference stage. arXiv 2018, arXiv:1805.04980.
27. Dellinger, F.; Boulay, T.; Barrenechea, D.M.; El-Hachimi, S.; Leang, I.; Bürger, F. Multi-task network pruning and embedded optimization for real-time deployment in adas. arXiv 2021, arXiv:2101.07831.
28. He, X.; Gao, D.; Zhou, Z.; Tong, Y.; Thiele, L. Pruning-aware merging for efficient multitask inference. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, Virtual Event, 14–18 August 2021; pp. 585–595.
29. Ye, H.; Zhang, B.; Chen, T.; Fan, J.; Wang, B. Performance-aware approximation of global channel pruning for multitask cnns. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 10267–10284.
30. He, Y.; Liu, P.; Zhu, L.; Yang, Y. Filter pruning by switching to neighboring CNNs with good attributes. IEEE Trans. Neural Netw. Learn. Syst. 2022, 34, 8044–8056.
31. Kanakis, M.; Bruggemann, D.; Saha, S.; Georgoulis, S.; Obukhov, A.; Van Gool, L. Reparameterizing convolutions for incremental multi-task learning without task interference. In Computer Vision – ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020, Proceedings, Part XX; Springer: Cham, Switzerland, 2020; pp. 689–707.
32. Han, S.; Pool, J.; Tran, J.; Dally, W. Learning both weights and connections for efficient neural network. In Advances in Neural Information Processing Systems; MIT Press: Cambridge, MA, USA, 2015; Volume 28.
33. Liu, Y.; Chen, K.; Liu, C.; Qin, Z.; Luo, Z.; Wang, J. Structured knowledge distillation for semantic segmentation. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 2604–2613.
34. Garg, S.; Zhang, L.; Guan, H. Structured pruning for multi-task deep neural networks. In Proceedings of the 2024 IEEE 7th International Conference on Multimedia Information Processing and Retrieval (MIPR), San Jose, CA, USA, 7–9 August 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 260–266.
35. Molchanov, P.; Mallya, A.; Tyree, S.; Frosio, I.; Kautz, J. Importance estimation for neural network pruning. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 11264–11272.
36. Hassibi, B.; Stork, D. Second order derivatives for network pruning: Optimal brain surgeon. In Advances in Neural Information Processing Systems; MIT Press: Cambridge, MA, USA, 1992; Volume 5.
37. Luo, J.H.; Wu, J. Neural network pruning with residual-connections and limited-data. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 1458–1467.
38. Malach, E.; Yehudai, G.; Shalev-Shwartz, S.; Shamir, O. Proving the lottery ticket hypothesis: Pruning is all you need. In Proceedings of the 37th International Conference on Machine Learning, Virtual Event, 13–18 July 2020; pp. 6682–6691.
39. Lee, N.; Ajanthan, T.; Torr, P.H. Snip: Single-shot network pruning based on connection sensitivity. arXiv 2018, arXiv:1810.02340.
40. Park, J.H.; Kim, Y.; Kim, J.; Choi, J.Y.; Lee, S. Dynamic structure pruning for compressing CNNs. In Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2023; Volume 37, pp. 9408–9416.
41. Chen, J.; Chen, S.; Pan, S.J. Storage efficient and dynamic flexible runtime channel pruning via deep reinforcement learning. In Advances in Neural Information Processing Systems; MIT Press: Cambridge, MA, USA, 2020; Volume 33, pp. 14747–14758.
42. Gurevin, D.; Shan, M.; Huang, S.; Hasan, M.A.; Ding, C.; Khan, O. Prunegnn: Algorithm-architecture pruning framework for graph neural network acceleration. In Proceedings of the 2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA), Edinburgh, UK, 2–6 March 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 108–123.
43. He, X.; Zhou, Z.; Thiele, L. Multi-task zipping via layer-wise neuron sharing. In Advances in Neural Information Processing Systems; MIT Press: Cambridge, MA, USA, 2018; Volume 31.
44. Chen, X.; Zhang, Y.; Wang, Y. MTP: Multi-task pruning for efficient semantic segmentation networks. In Proceedings of the 2022 IEEE International Conference on Multimedia and Expo (ICME), Taipei, Taiwan, 18–22 July 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1–6.
45. Xiang, M.; Tang, J.; Yang, Q.; Guan, H.; Liu, T. AdapMTL: Adaptive Pruning Framework for Multitask Learning Model. In Proceedings of the 32nd ACM International Conference on Multimedia, Melbourne, Australia, 28 October–1 November 2024; pp. 5121–5130.
46. Sun, X.; Hassani, A.; Wang, Z.; Huang, G.; Shi, H. Disparse: Disentangled sparsification for multitask model compression. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 12382–12392.
47. Xiao, H.; Rasul, K.; Vollgraf, R. Fashion-mnist: A novel image dataset for benchmarking machine learning algorithms. arXiv 2017, arXiv:1708.07747.
48. Deng, L. The mnist database of handwritten digit images for machine learning research [best of the web]. IEEE Signal Process. Mag. 2012, 29, 141–142.
49. Gong, B.; Shi, Y.; Sha, F.; Grauman, K. Geodesic flow kernel for unsupervised domain adaptation. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; IEEE: Piscataway, NJ, USA, 2012; pp. 2066–2073.
50. Tan, W.R.; Chan, C.S.; Aguirre, H.E.; Tanaka, K. Improved ArtGAN for conditional synthesis of natural image and artwork. IEEE Trans. Image Process. 2018, 28, 394–409.
51. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
52. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
53. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
54. Silberman, N.; Hoiem, D.; Kohli, P.; Fergus, R. Indoor segmentation and support inference from rgbd images. In Computer Vision – ECCV 2012, Proceedings of the 12th European Conference on Computer Vision, Florence, Italy, 7–13 October 2012, Part V; Springer: Berlin/Heidelberg, Germany, 2012; pp. 746–760.
55. Misra, I.; Shrivastava, A.; Gupta, A.; Hebert, M. Cross-stitch networks for multi-task learning. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 3994–4003.
56. Ruder, S.; Bingel, J.; Augenstein, I.; Søgaard, A. Latent multi-task architecture learning. In Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 4822–4829.
57. Ahn, C.; Kim, E.; Oh, S. Deep elastic networks with model selection for multi-task learning. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 6529–6538.
58. Frankle, J.; Dziugaite, G.K.; Roy, D.; Carbin, M. Linear mode connectivity and the lottery ticket hypothesis. In Proceedings of the 37th International Conference on Machine Learning, Vienna, Austria, 12–18 July 2020; pp. 3259–3269.
59. Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 834–848.




| Pruning | Tasks | Acc. Baseline (%) | Acc. Ours (%) | Δ Acc. | Params Baseline (M) | Params Ours (M) | Δ Params | FLOPs Baseline (×10^6) | FLOPs Ours (×10^6) | Δ FLOPs |
|---|---|---|---|---|---|---|---|---|---|---|
| | A | 88.94 | 90.19 | +1.25 | 2.61 | 2.56 | −1.92% | 28.34 | 27.73 | −2.15% |
| | B | 98.62 | 99.01 | +0.39 | 2.61 | 2.59 | −0.77% | 28.34 | 27.76 | −2.05% |
| | A + B | 93.78 | 94.67 | +0.89 | 5.21 | 3.18 | −38.96% | 56.69 | 48.16 | −15.05% |
| | A | 87.72 | 88.97 | +1.25 | 2.61 | 2.57 | −1.53% | 28.34 | 27.89 | −1.59% |
| | B | 98.03 | 98.11 | +0.08 | 2.61 | 2.57 | −1.53% | 28.34 | 27.64 | −2.47% |
| | A + B | 92.88 | 93.56 | +0.68 | 5.21 | 3.15 | −39.54% | 56.69 | 47.93 | −15.45% |

| Pruning | Tasks | Acc. Baseline (%) | Acc. Ours (%) | Δ Acc. | Params Baseline (M) | Params Ours (M) | Δ Params | FLOPs Baseline (×10^6) | FLOPs Ours (×10^6) | Δ FLOPs |
|---|---|---|---|---|---|---|---|---|---|---|
| | A | 69.50 | 90.25 | +20.75 | 23.01 | 22.97 | −0.17% | 3.18 | 3.07 | −3.46% |
| | B | 83.65 | 86.16 | +2.51 | 23.01 | 22.89 | −0.52% | 3.18 | 3.11 | −2.20% |
| | A + B | 76.58 | 88.21 | +11.63 | 46.02 | 31.35 | −31.88% | 6.36 | 6.18 | −2.83% |
| | A | 68.55 | 91.31 | +22.76 | 23.01 | 22.85 | −0.70% | 3.18 | 3.12 | −1.89% |
| | B | 83.01 | 85.01 | +2.00 | 23.01 | 22.87 | −0.61% | 3.18 | 3.07 | −3.46% |
| | A + B | 75.78 | 88.16 | +12.38 | 46.02 | 32.65 | −29.05% | 6.36 | 6.19 | −2.67% |

| Pruning | Tasks | Acc. Baseline (%) | Acc. Ours (%) | Δ Acc. | Params Baseline (M) | Params Ours (M) | Δ Params | FLOPs Baseline (×10^6) | FLOPs Ours (×10^6) | Δ FLOPs |
|---|---|---|---|---|---|---|---|---|---|---|
| | A | 60.64 | 90.25 | +29.61 | 124.52 | 124.47 | −0.04% | 14.06 | 14.05 | −0.07% |
| | B | 72.01 | 86.01 | +14.00 | 124.52 | 124.44 | −0.06% | 14.06 | 14.03 | −0.21% |
| | A + B | 66.32 | 88.13 | +21.81 | 249.04 | 176.52 | −29.12% | 28.12 | 28.08 | −0.14% |
| | A | 61.08 | 89.74 | +28.66 | 124.52 | 123.74 | −0.63% | 14.06 | 14.03 | −0.21% |
| | B | 71.33 | 85.54 | +14.21 | 124.52 | 123.98 | −0.43% | 14.06 | 14.02 | −0.28% |
| | A + B | 66.21 | 87.65 | +21.44 | 249.04 | 174.28 | −30.02% | 28.12 | 28.05 | −0.25% |

| Pruning | Tasks | Acc. Baseline (%) | Acc. Ours (%) | Δ Acc. | Params Baseline (M) | Params Ours (M) | Δ Params | FLOPs Baseline (×10^6) | FLOPs Ours (×10^6) | Δ FLOPs |
|---|---|---|---|---|---|---|---|---|---|---|
| | A + B | 76.58 | 88.21 | +11.63 | 46.02 | 31.35 | −31.88% | 6.36 | 6.18 | −2.83% |
| | A + B + C | 71.53 | 87.54 | +16.01 | 69.03 | 42.25 | −38.79% | 9.54 | 9.23 | −3.25% |
| | A + B + C + D | 58.44 | 87.33 | +28.89 | 92.04 | 53.63 | −41.73% | 12.72 | 12.28 | −3.46% |
| | A + B | 75.78 | 88.16 | +12.38 | 46.02 | 32.65 | −29.05% | 6.36 | 6.19 | −2.67% |
| | A + B + C | 67.54 | 84.98 | +17.44 | 69.03 | 43.83 | −36.51% | 9.54 | 9.29 | −2.62% |
| | A + B + C + D | 55.01 | 84.46 | +29.45 | 92.04 | 54.97 | −40.28% | 12.72 | 12.37 | −2.75% |

| Model | Seg. mIoU ↑ | Seg. Pixel Acc ↑ | Normal Mean Err. ↓ | Normal Median Err. ↓ | Normal within 11.25° ↑ | Normal within 22.5° ↑ | Normal within 30° ↑ | Depth Abs. Err. ↓ | Depth Rel. Err. ↓ | Depth δ < 1.25 ↑ | Depth δ < 1.25² ↑ | Depth δ < 1.25³ ↑ | Sparsity (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Deeplab [59] | 27.24 | 58.62 | 17.2 | 14.73 | 37.19 | 72.24 | 84.97 | 0.55 | 0.22 | 65.21 | 89.87 | 97.52 | 0 |
| Cross-Stitch [55] | 25.3 | 57.44 | 16.61 | 13.28 | 43.7 | 72.4 | 83.82 | - | - | - | - | - | 0 |
| Sluice [56] | 26.6 | 59.15 | 16.66 | 13.06 | 44.1 | 73.07 | 83.93 | - | - | - | - | - | 0 |
| DEN [57] | 26.3 | 58.8 | 17.03 | 14.39 | 39.52 | 72.23 | 84.76 | - | - | - | - | - | 0 |
| SNIP [39] | 26.57 | 59.85 | 16.91 | 13.55 | 42.01 | 71.72 | 82.01 | 0.6 | 0.23 | 61.35 | 87.73 | 96.87 | 30 |
| LTH [58] | 23.84 | 56.35 | 16.81 | 13.84 | 40.91 | 72.31 | 84.28 | 0.57 | 0.23 | 62.43 | 88.77 | 97.35 | 30 |
| IMP [13] | 28.15 | 59.43 | 16.72 | 13.57 | 43.16 | 72.41 | 86.15 | 0.56 | 0.22 | 64.85 | 89.32 | 96.93 | 30 |
| DiSparse [46] | 28.37 | 58.08 | 16.45 | 13.48 | 43.42 | 73.55 | 86.76 | 0.56 | 0.22 | 63.62 | 88.73 | 96.87 | 30 |
| AdapMTL [45] | 28.24 | 58.79 | 17.17 | 15.24 | 34.03 | 73.57 | 86.62 | 0.55 | 0.22 | 64.64 | 89.8 | 97.51 | 30 |
| Ours | 28.95 | 59.91 | 12.92 | 9.34 | 57.09 | 82.23 | 90.84 | 0.56 | 0.22 | 64.53 | 89.9 | 97.39 | 35 |

| Variant | Seg. mIoU ↑ | Seg. Pixel Acc ↑ | Normal Mean Err. ↓ | Normal Median Err. ↓ | Depth Abs. Err. ↓ | Depth Rel. Err. ↓ |
|---|---|---|---|---|---|---|
| Full networks | 27.24 | 58.62 | 17.2 | 14.73 | 0.55 | 0.22 |
| Random aggregation | 25.10 | 56.70 | 17.72 | 16.37 | 0.60 | 0.21 |
| w/o filter update | 26.12 | 57.71 | 13.12 | 10.47 | 0.57 | 0.23 |
| w/o adaptive learning mechanism | 27.54 | 58.96 | 16.75 | 14.36 | 0.57 | 0.24 |
| Ours | 28.95 | 59.91 | 12.92 | 9.34 | 0.56 | 0.22 |

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Li, L.; Cen, F.; Feng, Q.; Xu, J. Aggregation and Pruning for Continuous Incremental Multi-Task Inference. Mathematics 2025, 13, 1414. https://doi.org/10.3390/math13091414