Enhancing Model Agnostic Meta-Learning via Gradient Similarity Loss
Abstract
1. Introduction
2. Related Work
2.1. Model Agnostic Meta-Learning
Algorithm 1: Model Agnostic Meta-Learning [8]
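Algorithm 1 reproduces MAML as given by Finn et al. [8]: an inner loop adapts the parameters to each task on its support set, and an outer loop updates the shared initialization from the query-set losses of the adapted parameters. A minimal PyTorch sketch of one meta-update follows; the function and variable names and the use of `torch.func.functional_call` are our own choices, not the authors' code.

```python
import torch
import torch.nn.functional as F

def maml_meta_step(model, task_batch, meta_opt, inner_lr=0.01, inner_steps=5):
    """One meta-update following Algorithm 1 (Finn et al. [8]).

    `task_batch` is a list of ((x_support, y_support), (x_query, y_query))
    pairs, one per sampled task. Minimal sketch, not the authors' code.
    """
    meta_opt.zero_grad()
    meta_loss = 0.0
    for (x_s, y_s), (x_q, y_q) in task_batch:
        # Inner loop: adapt a functional copy of the parameters on the support set.
        fast = dict(model.named_parameters())
        for _ in range(inner_steps):
            loss = F.cross_entropy(
                torch.func.functional_call(model, fast, (x_s,)), y_s)
            # create_graph=True keeps the inner updates differentiable,
            # which is what makes MAML second order.
            grads = torch.autograd.grad(loss, list(fast.values()),
                                        create_graph=True)
            fast = {n: p - inner_lr * g
                    for (n, p), g in zip(fast.items(), grads)}
        # Outer loss: evaluate the adapted parameters on the query set.
        meta_loss = meta_loss + F.cross_entropy(
            torch.func.functional_call(model, fast, (x_q,)), y_q)
    # Backpropagate through the inner updates (the Hessian term enters here).
    (meta_loss / len(task_batch)).backward()
    meta_opt.step()
```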
2.2. Optimization-Based First-Order Meta-Learning
2.3. Cosine Similarity Loss
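A cosine similarity loss penalizes the angle between two vectors rather than their magnitudes. The works cited in this subsection specialize it in different ways; the generic form is sketched below (the helper name is ours).

```python
import torch
import torch.nn.functional as F

def cosine_similarity_loss(u: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """1 - cos(u, v): zero for parallel vectors, up to 2 for opposed ones."""
    return 1.0 - F.cosine_similarity(u.flatten(), v.flatten(), dim=0)

# Example: two nearly parallel vectors give a loss close to zero.
g1 = torch.tensor([1.0, 2.0, 3.0])
g2 = torch.tensor([1.1, 1.9, 3.2])
print(cosine_similarity_loss(g1, g2))
```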
3. Preliminaries
3.1. Second-Order Computation from MAML
3.2. Hessian-Vector Product
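The Hessian-vector products that MAML's meta-gradient requires can be evaluated by double backpropagation without ever materializing the Hessian, at a small constant multiple of the cost of one gradient evaluation (cf. Griewank [27]). A minimal sketch, with names of our choosing:

```python
import torch

def hessian_vector_product(loss, params, vec):
    """Compute H v by differentiating g(theta)^T v (Pearlmutter's trick).

    Two gradient passes, O(n) memory: the Hessian is never formed.
    Minimal sketch; names are ours.
    """
    grads = torch.autograd.grad(loss, params, create_graph=True)
    dot = sum((g * v).sum() for g, v in zip(grads, vec))
    return torch.autograd.grad(dot, params)

# Example: for f(theta) = theta^T A theta, H v equals (A + A^T) v.
theta = torch.randn(3, requires_grad=True)
A = torch.randn(3, 3)
loss = theta @ A @ theta
hv, = hessian_vector_product(loss, [theta], [torch.randn(3)])
```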
3.3. Setting the Second-Order Term to Zero Is Effective
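Setting the second-order term to zero is exactly what FOMAML does: the dependence of the adapted parameters on the initialization is severed, so the meta-gradient contains no Hessian term. A sketch of one such inner update (the helper name is ours):

```python
import torch

def first_order_adapt(loss, params, lr):
    """One inner update with the second-order term set to zero (FOMAML).

    create_graph=False plus detach() cuts the graph at the update, so
    backpropagating a query loss through the result yields only the
    first-order meta-gradient. Sketch only; names are our own.
    """
    grads = torch.autograd.grad(loss, params, create_graph=False)
    return [(p - lr * g).detach().requires_grad_(True)
            for p, g in zip(params, grads)]
```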
4. Method
4.1. Approximate Hessian Effect
4.2. How to Update Variables via Gradient Similarity Loss
4.2.1. Gradient Similarity Loss
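As a plausible reading of a gradient similarity loss (not necessarily the authors' exact formulation), one can penalize the misalignment between two gradient directions with a cosine term over the concatenated per-parameter gradients. All names and the precise combination below are assumptions, not the paper's code.

```python
import torch
import torch.nn.functional as F

def gradient_similarity_loss(grads_a, grads_b, eps=1e-8):
    """1 - cosine similarity between two lists of per-parameter gradients.

    Speculative sketch: flatten and concatenate each gradient list, then
    compare directions. Minimizing this term pushes the two gradients
    toward the same direction.
    """
    ga = torch.cat([g.flatten() for g in grads_a])
    gb = torch.cat([g.flatten() for g in grads_b])
    return 1.0 - F.cosine_similarity(ga, gb, dim=0, eps=eps)
```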
4.2.2. MAML via AHE Update
Algorithm 2: MAML via AHE update
4.3. Comparative Analysis of the Proposed Algorithm
5. Experiments
5.1. Implementation Details
5.1.1. Dataset
5.1.2. Experiment Setting
5.1.3. Evaluation Setup
5.1.4. Comparison Methods
5.2. Experimental Results
5.2.1. 5-Way 1-Shot Classification
5.2.2. 5-Way 5-Shot Classification
5.2.3. 20-Way 1-Shot Classification
5.2.4. 20-Way 5-Shot Classification
5.3. Additional Experiment Details
5.3.1. Adjustment of AHE Gradient Magnitude
5.3.2. Methods for Scheduling
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Vinyals, O.; Blundell, C.; Lillicrap, T.; Kavukcuoglu, K.; Wierstra, D. Matching networks for one-shot learning. Adv. Neural Inf. Process. Syst. 2016, 29, 3630–3638.
- Hospedales, T.M.; Antoniou, A.; Micaelli, P.; Storkey, A. Meta-Learning in Neural Networks: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 44, 5149–5169.
- Huisman, M.; van Rijn, J.N.; Plaat, A. A survey of deep meta-learning. Artif. Intell. Rev. 2021, 54, 4483–4541.
- Achille, A.; Lam, M.; Tewari, R.; Ravichandran, A.; Maji, S.; Fowlkes, C.C.; Soatto, S.; Perona, P. Task2Vec: Task Embedding for Meta-Learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019.
- Wu, Z.; Shi, X.; Lin, G.; Cai, J. Learning Meta-class Memory for Few-Shot Semantic Segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021.
- Santoro, A.; Bartunov, S.; Botvinick, M.; Wierstra, D.; Lillicrap, T. Meta-learning with memory-augmented neural networks. In Proceedings of the International Conference on Machine Learning, New York, NY, USA, 19–24 June 2016.
- Lee, K.; Maji, S.; Ravichandran, A.; Soatto, S. Meta-Learning With Differentiable Convex Optimization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019.
- Finn, C.; Abbeel, P.; Levine, S. Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017.
- Ravi, S.; Larochelle, H. Optimization as a model for few-shot learning. In Proceedings of the International Conference on Learning Representations, Toulon, France, 24–26 April 2017.
- Yuan, Y.; Zheng, G.; Wong, K.; Ottersten, B.; Luo, Z. Transfer Learning and Meta Learning-Based Fast Downlink Beamforming Adaptation. IEEE Trans. Wirel. Commun. 2020, 20, 1742–1755.
- Khadka, R.; Jha, D.; Hicks, S.; Thambawita, V.; Riegler, M.A.; Ali, S.; Halvorsen, P. Meta-learning with implicit gradients in a few-shot setting for medical image segmentation. Comput. Biol. Med. 2022, 143, 105227.
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv 2019, arXiv:1810.04805.
- Gu, J.; Wang, Y.; Chen, Y.; Cho, K.; Li, V. Meta-Learning for Low-Resource Neural Machine Translation. arXiv 2018, arXiv:1808.08437.
- Li, B.; Gan, Z.; Chen, D.; Aleksandrovich, D.S. UAV Maneuvering Target Tracking in Uncertain Environments Based on Deep Reinforcement Learning and Meta-Learning. Remote Sens. 2020, 12, 3789.
- Zhang, C.; Bengio, S.; Hardt, M.; Recht, B.; Vinyals, O. Understanding deep learning requires rethinking generalization. Commun. ACM 2021, 64, 107–115.
- Li, Z.; Zhou, F.; Chen, F.; Li, H. Meta-SGD: Learning to learn quickly for few-shot learning. arXiv 2017, arXiv:1707.09835.
- Nichol, A.; Achiam, J.; Schulman, J. On first-order meta-learning algorithms. arXiv 2018, arXiv:1803.02999.
- Triantafillou, E.; Zemel, R.; Urtasun, R. Few-shot learning through an information retrieval lens. Adv. Neural Inf. Process. Syst. 2017, 30, 2255–2265.
- Singh, R.; Bharti, V.; Purohit, V.; Kumar, A.; Singh, A.K.; Singh, S.K. MetaMed: Few-shot medical image classification using gradient-based meta-learning. Pattern Recognit. 2021, 120, 108111.
- Rajeswaran, A.; Finn, C.; Kakade, S.M.; Levine, S. Meta-learning with implicit gradients. Adv. Neural Inf. Process. Syst. 2019, 32, 113–124.
- Zhou, P.; Yuan, X.; Xu, H.; Yan, S.; Feng, J. Efficient meta learning via minibatch proximal update. Adv. Neural Inf. Process. Syst. 2019, 32, 1534–1544.
- Kedia, A.; Chinthakindi, S.C.; Ryu, W. Beyond Reptile: Meta-Learned Dot-Product Maximization between Gradients for Improved Single-Task Regularization. In Findings of the Association for Computational Linguistics: EMNLP 2021; Association for Computational Linguistics: Stroudsburg, PA, USA, 2021.
- Bai, Y.; Chen, M.; Zhou, P.; Zhao, T.; Lee, J.; Kakade, S.; Wang, H.; Xiong, C. How important is the train-validation split in meta-learning? In Proceedings of the International Conference on Machine Learning, Online, 18–24 July 2021; pp. 543–553.
- Fan, C.; Ram, P.; Liu, S. SignMAML: Efficient model-agnostic meta-learning by SignSGD. arXiv 2021, arXiv:2109.07497.
- Falato, M.J.; Wolfe, B.; Natan, T.M.; Zhang, X.; Marshall, R.; Zhou, Y.; Bellan, P.; Wang, Z. Plasma image classification using cosine similarity constrained convolutional neural network. J. Plasma Phys. 2022, 88, 895880603.
- Tao, Z.; Huang, S.; Wang, G. Prototypes Sampling Mechanism for Class Incremental Learning. IEEE Access 2023, 11, 81942–81952.
- Griewank, A. Some bounds on the complexity of gradients, Jacobians, and Hessians. In Complexity in Numerical Optimization; World Scientific: Singapore, 1993; pp. 128–162.
- Rusu, A.A.; Rao, D.; Sygnowski, J.; Vinyals, O.; Pascanu, R.; Osindero, S.; Hadsell, R. Meta-learning with latent embedding optimization. arXiv 2018, arXiv:1807.05960.
- Munkhdalai, T.; Yu, H. Meta networks. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; Volume 70, pp. 2554–2563.
- Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
- Arnold, S.M.; Mahajan, P.; Datta, D.; Bunner, I.; Zarkias, K.S. learn2learn: A library for Meta-Learning research. arXiv 2020, arXiv:2008.12284.
- Fallah, A.; Mokhtari, A.; Ozdaglar, A. Personalized federated learning with theoretical guarantees: A model-agnostic meta-learning approach. Adv. Neural Inf. Process. Syst. 2020, 33, 3557–3568.
- Finn, C.; Rajeswaran, A.; Kakade, S.; Levine, S. Online meta-learning. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; pp. 1920–1930.





| Notation |
|---|
| Task distribution |
| Learning rates of the inner and outer loops |
| Number of inner-loop update iterations |
| Meta-batch size of the outer loop |
| Gradient of the inner-loop loss |
| Variables approximated by the AHE update |
| Step size of the AHE update |
| Interpolation rate in (0, 1] |
| Learning rate of the gradient similarity loss |
| Method | Computational Complexity | Space Complexity |
|---|---|---|
| FOMAML | | |
| MAML | | |
| AHE (Ours) | | |
| Method | 5-Way 1-Shot | 5-Way 5-Shot | 20-Way 1-Shot | 20-Way 5-Shot |
|---|---|---|---|---|
| MAML | | | | |
| FOMAML | | | | |
| AHE (Ours) | | | | |