Enhancing Few-Shot Learning in Lightweight Models via Dual-Faceted Knowledge Distillation
Abstract
1. Introduction
- (1) We apply dual-faceted knowledge distillation to few-shot model compression in order to improve the performance of lightweight models.
- (2) We develop a method for calibrating the feature error distribution that improves feature-based knowledge distillation, and we provide a theoretical proof to support it, which may also serve as a basis for future work in this area.
- (3) We validate the proposed method on three benchmark datasets (MiniImageNet, CIFAR-FS, and CUB), where it achieves the best accuracy among the compared methods.
2. Related Work
2.1. Few-Shot Classification
2.2. Knowledge Distillation
3. Main Approach
3.1. Overview
3.2. Teacher Model Construction
3.3. Dual-Faceted Knowledge Distillation Strategy
3.3.1. Theoretical Foundation
3.3.2. Intermediate Feature-Based Distillation
3.3.3. Output-Based Distillation
3.4. Model Evaluation and Testing
| Algorithm 1. Implementation of the few-shot model compression algorithm. |
| Input: the base class dataset, the validation dataset, and the novel class dataset; |
| the teacher network and the student network; |
| the temperature parameter τ and three hyperparameters. |
| Output: the predicted values of the query samples in the novel class dataset. |
| Stage 1: Teacher network pre-training |
| While epoch ≤ maximum number of iterations: |
| Randomly select a batch of images from the base class dataset. |
| Feed the images into the backbone of the teacher network to extract features. |
| Obtain the base class and rotation class probability values. |
| Pre-train the teacher network according to Equation (4). |
| Stage 2: Few-shot model compression |
| While epoch ≤ maximum number of iterations: |
| Randomly select a batch of images from the base class dataset. |
| Feed each image separately into the backbones of the teacher and student networks to extract features. |
| Obtain the base class probability values from the teacher network and the student network, respectively. |
| Calibrate the feature error distribution between the student network and the teacher network according to Equation (16). |
| Calculate the knowledge distillation loss for the intermediate features according to Equation (17). |
| Calculate the KL divergence-based loss between the predicted outputs of the student network and the teacher network according to Equation (20). |
| Calculate the cross-entropy loss of the student network according to Equation (21). |
| Train the student network according to Equation (22). |
| Stage 3: Few-shot model testing |
| While epoch ≤ maximum number of iterations: |
| Pass images from the novel class dataset through the feature extractor to obtain features. |
| Train a classifier for the novel classes. |
| Test on the query set from the novel class dataset. |
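As a rough illustration of how the Stage 2 losses in Algorithm 1 might be combined, the following Python sketch computes a feature term, a temperature-softened KL term, and a cross-entropy term, and sums them with weights. Equations (16)–(22) are not reproduced here, so every formula below is an assumption: mean squared error stands in for the intermediate-feature loss, the KL term uses the common temperature-softened form with a τ² scale, and the weight names `w_feat`, `w_kd`, and `w_ce` are hypothetical. The feature error calibration step of Equation (16) is omitted entirely.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in scaled))
    return [z - log_norm for z in scaled]


def feature_loss(student_feat, teacher_feat):
    """Mean squared error between features (assumed stand-in for Eq. (17))."""
    diffs = [(s - t) ** 2 for s, t in zip(student_feat, teacher_feat)]
    return sum(diffs) / len(diffs)


def kl_loss(student_logits, teacher_logits, tau):
    """KL(teacher || student) on softened outputs, scaled by tau**2 (assumed form)."""
    log_p = log_softmax(teacher_logits, tau)
    log_q = log_softmax(student_logits, tau)
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return tau ** 2 * kl


def cross_entropy(student_logits, label):
    """Cross-entropy of the student prediction against the true class index."""
    return -log_softmax(student_logits)[label]


def student_loss(student_feat, teacher_feat, student_logits, teacher_logits,
                 label, tau=4.0, w_feat=1.0, w_kd=1.0, w_ce=1.0):
    """Weighted sum of the three terms; the weights are hypothetical names."""
    return (w_feat * feature_loss(student_feat, teacher_feat)
            + w_kd * kl_loss(student_logits, teacher_logits, tau)
            + w_ce * cross_entropy(student_logits, label))


if __name__ == "__main__":
    # Toy values for a single image with three base classes.
    loss = student_loss(
        student_feat=[0.2, 0.9, -0.4],
        teacher_feat=[0.1, 1.0, -0.5],
        student_logits=[1.0, 1.5, 0.0],
        teacher_logits=[2.0, 0.5, -1.0],
        label=0,
    )
    print(f"combined student loss: {loss:.4f}")
```

In a real training loop these terms would be computed per batch on tensors in a deep learning framework; plain Python lists are used here only to keep the sketch dependency-free.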
4. Experiments
4.1. Dataset
4.2. Experimental Setup
4.3. Methodological Validation
4.3.1. Model Compression Efficacy
4.3.2. Feature Error Distribution Calibration
4.4. Comparative Studies
4.4.1. Comparison with Classical Knowledge Distillation Approaches
4.4.2. Comparison with Other Methods
4.5. Detailed Analysis
4.5.1. Parameter Analysis
4.5.2. Visualization Analysis
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Xie, J.; Long, F.; Lv, J.; Wang, Q.; Li, P. Joint distribution matters: Deep Brownian distance covariance for few-shot classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 19–24 June 2022.
2. Bateni, P.; Goyal, R.; Masrani, V.; Wood, F.; Sigal, L. Improved few-shot visual classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020.
3. Bogdan, T.N.; Bruno, M.; Rafael, W.; Victor, B.G.; Vanderlei, Z.; Lourival, L. A computer vision system for monitoring disconnect switches in distribution substations. IEEE Trans. Power Deliv. 2022, 37, 833–841.
4. Liu, Y.; Zhang, W.; Xiang, C.; Zheng, T.; Cai, D.; He, X. Learning to affiliate: Mutual centralized learning for few-shot classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 19–24 June 2022.
5. Yang, Z.; Wang, J.; Zhu, Y. Few-shot classification with contrastive learning. In Proceedings of the European Conference on Computer Vision (ECCV), Tel Aviv, Israel, 23–27 October 2022.
6. Lai, J.; Yang, S.; Zhou, J.; Wu, W.; Chen, X.; Liu, J.; Gao, B.; Wang, C. Clustered-patch element connection for few-shot learning. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), Macao, China, 19–25 August 2023.
7. Ma, J.; Zhang, Y.; Ma, Z.; Mao, K. Research progress of lightweight neural network convolution design. J. Front. Comput. Sci. Technol. 2022, 16, 512–528.
8. Song, Y.; Wang, T.; Cai, P.; Mondal, S.K.; Sahoo, J.P. A comprehensive survey of few-shot learning: Evolution, applications, challenges, and opportunities. ACM Comput. Surv. 2023, 55, 1–40.
9. Zhang, M.; Zhang, J.; Lu, Z.; Xiang, T.; Ding, M.; Huang, S. IEPT: Instance-level and episode-level pretext tasks for few-shot learning. In Proceedings of the International Conference on Learning Representations (ICLR), Virtual Conference, 3–7 May 2021.
10. Li, W.; Wang, Z.; Yang, X.; Dong, C.; Tian, P.; Qin, T.; Huo, J.; Shi, Y.; Wang, L.; Gao, Y.; et al. LibFewShot: A comprehensive library for few-shot learning. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 14938–14955.
11. Liu, Y.; Lei, Y.; Fan, J.; Wang, F.; Gong, Y.; Tian, Q. Survey on image classification technology based on small sample learning. Acta Autom. Sin. 2021, 42, 297–315.
12. Finn, C.; Abbeel, P.; Levine, S. Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the International Conference on Machine Learning (ICML), Sydney, Australia, 6–11 August 2017.
13. Huisman, M.; van Rijn, J.N.; Plaat, A. A survey of deep meta-learning. Artif. Intell. Rev. 2021, 54, 4483–4541.
14. Gidaris, S.; Bursuc, A.; Komodakis, N.; Pérez, P.; Cord, M. Boosting few-shot visual learning with self-supervision. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019.
15. Bouniot, Q.; Redko, I.; Audigier, R.; Loesch, A.; Habrard, A. Kernel relative-prototype spectral filtering for few-shot learning. In Proceedings of the European Conference on Computer Vision (ECCV), Tel Aviv, Israel, 23–27 October 2022.
16. Ma, R.; Fang, P.; Avraham, G.; Zuo, Y.; Zhu, T.; Drummond, T.; Harandi, M. Learning instance and task-aware dynamic kernels for few-shot learning. In Proceedings of the European Conference on Computer Vision (ECCV), Tel Aviv, Israel, 23–27 October 2022.
17. Chen, W.; Liu, Y.; Kira, Z.; Wang, Y.F.; Huang, J. A closer look at few-shot classification. In Proceedings of the International Conference on Learning Representations (ICLR), New Orleans, LA, USA, 6–9 May 2019.
18. Liu, B.; Cao, Y.; Lin, Y.; Zhang, Z.; Long, M.; Hu, H. Negative margin matters: Understanding margin in few-shot classification. In Proceedings of the European Conference on Computer Vision (ECCV), Glasgow, Scotland, 23–28 August 2020.
19. Mangla, P.; Singh, M.; Sinha, A.; Kumari, N.; Balasubramanian, V.; Krishnamurthy, B. Charting the right manifold: Manifold mixup for few-shot learning. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 3–7 January 2020.
20. Rizve, M.N.; Khan, S.; Khan, F.S.; Shah, M. Exploring complementary strengths of invariant and equivariant representations for few-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Virtual Conference, 19–25 June 2021.
21. Xu, J.; Pan, X.; Luo, X.; Pei, W.; Xu, Z. Exploring category-correlated feature for few-shot image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 19–24 June 2022.
22. Hinton, G.; Vinyals, O.; Dean, J. Distilling the knowledge in a neural network. arXiv 2015, arXiv:1503.02531.
23. Romero, A.; Ballas, N.; Kahou, S.E.; Chassang, A.; Gatta, C.; Bengio, Y. FitNets: Hints for thin deep nets. In Proceedings of the International Conference on Learning Representations (ICLR), San Diego, CA, USA, 5–7 May 2015.
24. Zagoruyko, S.; Komodakis, N. Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer. arXiv 2016, arXiv:1612.03928.
25. Tung, F.; Mori, G. Similarity-preserving knowledge distillation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019.
26. Park, W.; Kim, D.; Lu, Y.; Cho, M. Relational knowledge distillation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019.
27. Passalis, N.; Tzelepi, M.; Tefas, A. Probabilistic knowledge transfer for lightweight deep representation learning. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 2030–2039.
28. Huang, Z.; Wang, N. Like what you like: Knowledge distill via neuron selectivity transfer. arXiv 2017, arXiv:1707.01219.
29. Tian, Y.; Wang, Y.; Krishnan, D.; Tenenbaum, J.B.; Isola, P. Rethinking few-shot image classification: A good embedding is all you need? In Proceedings of the European Conference on Computer Vision (ECCV), Glasgow, Scotland, 23–28 August 2020.
30. Rajasegaran, J.; Khan, S.; Hayat, M.; Khan, F.S.; Shah, M. Self-supervised knowledge distillation for few-shot learning. arXiv 2020, arXiv:2006.09785.
31. Ma, J.; Xie, H.; Han, G.; Chang, S.F.; Galstyan, A.; Abd-Almageed, W. Partner-assisted learning for few-shot image classification. In Proceedings of the International Conference on Computer Vision (ICCV), Virtual Conference, 10–17 October 2021.
32. Zhou, Z.; Qiu, X.; Xie, J.; Wu, J.; Zhang, C. Binocular mutual learning for improving few-shot classification. In Proceedings of the International Conference on Computer Vision (ICCV), Virtual Conference, 10–17 October 2021.
33. Gomes, J.C.; Borges, L.d.A.B.; Borges, D.L. A multi-layer feature fusion method for few-shot image classification. Sensors 2023, 23, 6880.
34. Zhang, P.; Li, Y.; Wang, D.; Wang, J. RS-SSKD: Self-supervision equipped with knowledge distillation for few-shot remote sensing scene classification. Sensors 2021, 21, 1566.
35. Tukey, J.W. Exploratory Data Analysis; Addison-Wesley: Reading, MA, USA, 1977.
36. Li, K.; Zhang, Y.; Li, K.; Fu, Y. Adversarial feature hallucination networks for few-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020.
37. Ren, M.; Triantafillou, E.; Ravi, S.; Snell, J.; Swersky, K. Meta-learning for semi-supervised few-shot classification. In Proceedings of the International Conference on Learning Representations (ICLR), Vancouver, BC, Canada, 30 April–3 May 2018.
38. Hilliard, N.; Phillips, L.; Howland, S.; Yankov, A.; Corley, C.D.; Hodas, N.O. Few-shot learning with metric-agnostic conditional embeddings. arXiv 2018, arXiv:1802.04376.
39. Snell, J.; Swersky, K.; Zemel, R. Prototypical networks for few-shot learning. In Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS), Long Beach, CA, USA, 4–9 December 2017.
40. Sung, F.; Yang, Y.; Zhang, L.; Xiang, T.; Torr, P.H.; Hospedales, T.M. Learning to compare: Relation network for few-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 19–21 June 2018.
41. Bertinetto, L.; Henriques, J.F.; Torr, P.; Vedaldi, A. Meta-learning with differentiable closed-form solvers. In Proceedings of the International Conference on Learning Representations (ICLR), New Orleans, LA, USA, 6–9 May 2019.
42. Li, W.; Xu, J.; Huo, J.; Wang, L. Distribution consistency based covariance metric networks for few-shot learning. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), Honolulu, HI, USA, 27 January–1 February 2019.
43. Li, W.; Wang, L.; Xu, J.; Huo, J.; Gao, Y.; Luo, J. Revisiting local descriptor-based image-to-class measure for few-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019.
44. Baik, S.; Choi, J.; Kim, H.; Cho, D.; Min, J.; Lee, K.M. Meta-learning with task-adaptive loss function for few-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Virtual Conference, 19–25 June 2021.
45. Snell, J.; Zemel, R. Bayesian few-shot classification with one-vs-each Pólya-Gamma augmented Gaussian processes. arXiv 2020, arXiv:2007.10417.
46. Wang, Z.; Miao, Z.; Zhen, X.; Qiu, Q. Learning to learn dense Gaussian processes for few-shot learning. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Virtual Conference, 6–14 December 2021.
47. Chen, Z.Y.; Ge, J.X.; Zhan, H.S.; Huang, S.; Wang, D.L. Pareto self-supervised training for few-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Virtual Conference, 19–25 June 2021.
48. Yu, T.; He, S.; Song, Y.Z.; Xiang, T. Hybrid graph neural networks for few-shot learning. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), Vancouver, BC, Canada, 22 February–1 March 2022.
49. Gao, Z.; Wu, Y.; Jia, Y.; Harandi, M. Curvature generation in curved spaces for few-shot learning. In Proceedings of the International Conference on Computer Vision (ICCV), Virtual Conference, 10–17 October 2021.
50. van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605.






| Method | Backbone | MiniImageNet 1-shot | MiniImageNet 5-shot | CIFAR-FS 1-shot | CIFAR-FS 5-shot | CUB 1-shot | CUB 5-shot |
|---|---|---|---|---|---|---|---|
| Baseline | Conv4-64 | 53.36 | 70.51 | 63.83 | 78.14 | 64.77 | 79.17 |
| MC | Conv4-64 | 57.99 | 73.30 | 67.02 | 80.98 | 69.92 | 85.27 |

| Dataset | Backbone | Setting | KD-Plain | KD-Improved |
|---|---|---|---|---|
| MiniImageNet | Conv4-64 | 1-shot | 57.01 | 57.99 (0.98 ↑) |
| MiniImageNet | Conv4-64 | 5-shot | 72.69 | 73.30 (0.61 ↑) |
| CIFAR-FS | Conv4-64 | 1-shot | 65.98 | 67.02 (1.04 ↑) |
| CIFAR-FS | Conv4-64 | 5-shot | 80.02 | 80.98 (0.96 ↑) |
| CUB | Conv4-64 | 1-shot | 68.87 | 69.92 (1.05 ↑) |
| CUB | Conv4-64 | 5-shot | 84.24 | 85.27 (1.03 ↑) |

| Method | Backbone | MiniImageNet 1-shot | MiniImageNet 5-shot | CIFAR-FS 1-shot | CIFAR-FS 5-shot | CUB 1-shot | CUB 5-shot |
|---|---|---|---|---|---|---|---|
| HKD [22] | Conv4-64F | 57.33 ± 0.41 | 72.42 ± 0.34 | 66.27 ± 0.93 | 80.35 ± 0.69 | 68.28 ± 0.50 | 84.07 ± 0.28 |
| FitNet [23] | Conv4-64F | 57.43 ± 0.45 | 72.26 ± 0.34 | 66.40 ± 0.91 | 80.44 ± 0.67 | 68.45 ± 0.48 | 83.63 ± 0.29 |
| AT [24] | Conv4-64F | 57.86 ± 0.48 | 72.79 ± 0.35 | 66.75 ± 0.92 | 80.72 ± 0.68 | 67.90 ± 0.48 | 83.53 ± 0.30 |
| SP [25] | Conv4-64F | 57.26 ± 0.45 | 72.61 ± 0.35 | 66.72 ± 0.92 | 80.68 ± 0.69 | 68.37 ± 0.47 | 84.05 ± 0.29 |
| RKD [26] | Conv4-64F | 57.14 ± 0.45 | 72.89 ± 0.35 | 66.73 ± 0.91 | 80.70 ± 0.68 | 68.68 ± 0.47 | 84.03 ± 0.29 |
| PKT [27] | Conv4-64F | 57.61 ± 0.48 | 72.91 ± 0.31 | 66.84 ± 0.92 | 80.70 ± 0.67 | 68.13 ± 0.47 | 84.02 ± 0.29 |
| NST [28] | Conv4-64F | 57.82 ± 0.48 | 72.66 ± 0.35 | 66.83 ± 0.93 | 80.72 ± 0.68 | 68.07 ± 0.48 | 84.02 ± 0.29 |
| Ours | Conv4-64F | 57.99 ± 0.44 | 73.30 ± 0.36 | 67.02 ± 0.92 | 80.98 ± 0.69 | 69.92 ± 0.46 | 85.27 ± 0.28 |

| Method | Backbone | MiniImageNet 1-shot | MiniImageNet 5-shot | CIFAR-FS 1-shot | CIFAR-FS 5-shot | CUB 1-shot | CUB 5-shot |
|---|---|---|---|---|---|---|---|
| Meta-learning paradigms | | | | | | | |
| MAML [12] | Conv4-64F | 48.70 ± 1.75 | 63.11 ± 0.92 | 58.90 ± 1.90 | 71.50 ± 1.00 | 55.92 ± 0.95 | 72.09 ± 0.76 |
| Prototypical [39] | Conv4-64F | 49.42 ± 0.78 | 68.20 ± 0.66 | 51.31 ± 0.91 | 70.77 ± 0.69 | - | - |
| Relational [40] | Conv4-64F | 50.44 ± 0.82 | 65.32 ± 0.70 | 49.42 ± 0.78 | 68.20 ± 0.66 | 51.31 ± 0.91 | 70.77 ± 0.69 |
| MetaOpt SVM [41] | Conv4-64F | 52.87 ± 0.57 | 68.76 ± 0.48 | - | - | 62.45 ± 0.98 | 76.11 ± 0.69 |
| PN+rot [14] | Conv4-64F | 53.63 ± 0.43 | 71.70 ± 0.36 | - | - | - | - |
| CovaMNet [42] | Conv4-64F | 51.19 ± 0.76 | 67.65 ± 0.63 | - | - | 52.42 ± 0.76 | 63.76 ± 0.64 |
| DN4 [43] | Conv4-64F | 51.24 ± 0.74 | 71.02 ± 0.64 | - | - | 46.84 ± 0.81 | 74.92 ± 0.64 |
| MeTAL [44] | Conv4-64F | 52.63 ± 0.37 | 70.52 ± 0.29 | - | - | - | - |
| PL [45] | Conv4-64F | 48.00 ± 0.24 | 67.14 ± 0.23 | - | - | 60.11 ± 0.26 | 79.07 ± 0.25 |
| LLDGP [46] | Conv4-64F | - | - | 64.17 ± 0.31 | 78.42 ± 0.26 | - | - |
| PSST [47] | Conv4-64F | - | - | 64.37 ± 0.33 | 80.42 ± 0.32 | - | - |
| HGNN [48] | Conv4-64F | 55.63 ± 0.20 | 72.48 ± 0.16 | - | - | 69.02 ± 0.22 | 83.20 ± 0.15 |
| ProtoNet+norm [15] | Conv4-64F | 50.29 ± 0.41 | 67.13 ± 0.34 | - | - | - | - |
| MC [16] | Conv4-64F | 49.64 ± 0.83 | 65.67 ± 0.70 | - | - | - | - |
| Transfer learning paradigms | | | | | | | |
| Baseline++ [17] | Conv4-64F | 48.24 ± 0.75 | 66.43 ± 0.63 | - | - | 60.53 ± 0.83 | 79.34 ± 0.61 |
| Neg-Cosine [18] | Conv4-64F | 52.84 ± 0.76 | 70.41 ± 0.66 | - | - | - | - |
| RFS-distill [29] | Conv4-64F | 47.71 | 65.40 | - | - | - | - |
| SKD [30] | Conv4-64F | 48.14 | 66.36 | - | - | - | - |
| CGCS [49] | Conv4-64F | 55.53 ± 0.20 | 72.12 ± 0.16 | - | - | - | - |
| Ours | Conv4-64F | 57.99 ± 0.44 | 73.30 ± 0.36 | 67.02 ± 0.92 | 80.98 ± 0.69 | 69.92 ± 0.46 | 85.27 ± 0.28 |
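
The comparison tables report most accuracies as a mean followed by a ± margin. The source does not state how that margin is computed. A common convention in few-shot evaluation is a 95% confidence interval over the accuracies of many test episodes, and the Python sketch below shows that calculation under this assumption. The function name `mean_and_ci95` and the normal-approximation factor 1.96 are illustrative choices, not details taken from the paper.

```python
import math


def mean_and_ci95(accuracies):
    """Return (mean, half-width of a 95% normal-approximation interval).

    `accuracies` holds one accuracy value (in percent) per test episode.
    """
    n = len(accuracies)
    if n < 2:
        raise ValueError("need at least two episodes to estimate a margin")
    mean = sum(accuracies) / n
    # Unbiased sample variance of the per-episode accuracies.
    variance = sum((a - mean) ** 2 for a in accuracies) / (n - 1)
    # Standard error of the mean, widened to a 95% interval.
    half_width = 1.96 * math.sqrt(variance / n)
    return mean, half_width


if __name__ == "__main__":
    # Toy per-episode accuracies; real evaluations use far more episodes.
    episode_acc = [50.0, 60.0, 70.0, 80.0]
    mean, margin = mean_and_ci95(episode_acc)
    print(f"{mean:.2f} ± {margin:.2f}")
```

Because the margin shrinks with the square root of the number of episodes, margins from methods evaluated on different episode counts are not directly comparable.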
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhou, B.; Cheng, T.; Zhao, J.; Yan, C.; Jiang, L.; Zhang, X.; Gu, J. Enhancing Few-Shot Learning in Lightweight Models via Dual-Faceted Knowledge Distillation. Sensors 2024, 24, 1815. https://doi.org/10.3390/s24061815