Self-Organizing Optimization Based on Caputo’s Fractional Order Gradients
Abstract
1. Introduction
2. Related Work
3. Mathematical Background
4. Analysis of Caputo’s Fractional Order Gradient Descent Method and Evaluation
4.1. Condition for Non-Divergence
4.2. Performance Evaluation
- (1) GSDM and C-FOG are both gradient-based self-organizing optimization algorithms that guarantee non-divergence. For the Caputo fractional-order gradient descent method, a fractional order 1 < α < 2 is a necessary condition for non-divergence (a hedged reconstruction of the update rule follows this list).
- (2) Compared with currently popular gradient-based optimization algorithms, C-FOG converges faster, and the fractional order gives it an additional degree of freedom: varying α lets C-FOG converge to different points in the neighborhood of the extreme point.
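The displayed formulas were lost in extraction, but the remarks above, together with the Section 6 observation that the method stores the last two iterates and takes an elementwise power of their absolute difference, suggest a Caputo-type update of the following form. This is a hedged reconstruction, not the paper's verbatim equation; μ (learning rate), f (objective), and x_k (the k-th iterate) are our notation:

```latex
% Hedged reconstruction of the C-FOG update; the product \odot and the
% power are taken elementwise, and 1 < \alpha < 2 is the claimed
% non-divergence range.
x_{k+1} \;=\; x_k \;-\; \frac{\mu}{\Gamma(2-\alpha)}\,
\nabla f(x_k) \odot \bigl|\,x_k - x_{k-1}\bigr|^{\,1-\alpha}
```

Under this form, 1 < α < 2 makes the exponent 1 − α negative, so large steps are self-damping (consistent with the non-divergence claim), while steps near the extreme point do not vanish (consistent with the oscillation behavior discussed in Section 6).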
5. Application of C-FOG
5.1. Framework for Generating Adversarial Samples With C-FOG
Algorithm 1. C-FOG
1. Input: Image , classifier , , the order
2. Output: the adversarial example
3. Initialize: ,
4. while :
       loss =
   end while
5. return
Algorithm 2. GSDM
1. Input: Image , classifier ,
2. Output: the adversarial example
3. Initialize: ,
4. while :
       loss =
   end while
5. return
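Because the inline symbols in the two algorithm boxes were lost to extraction, the following self-contained Python sketch shows the attack loop they describe, under explicit assumptions: a toy linear softmax model stands in for the classifier, the loss is cross-entropy with respect to the input, C-FOG scales the gradient elementwise by |x_k − x_{k−1}|^(1−α)/Γ(2−α) (the Caputo-style rule sketched in Section 4), GSDM takes the plain gradient step, and iteration stops once the predicted label changes. All names here (`attack`, `predict`, `loss_and_grad`) are illustrative, not the paper's.

```python
import numpy as np
from math import gamma

rng = np.random.default_rng(0)

W = rng.normal(size=(3, 4))          # toy 4-feature, 3-class "classifier"
b = rng.normal(size=3)

def predict(x):
    z = W @ x + b
    e = np.exp(z - z.max())          # numerically stable softmax
    return e / e.sum()

def loss_and_grad(x, y):
    """Cross-entropy loss and its gradient w.r.t. the INPUT x."""
    p = predict(x)
    onehot = np.eye(3)[y]
    return -np.log(p[y] + 1e-12), W.T @ (p - onehot)

def attack(x, y, lr=0.1, alpha=1.5, fractional=True, max_iter=500):
    """Untargeted attack: ascend the loss until the label flips."""
    x_prev, x_k = None, x.copy()
    for _ in range(max_iter):
        if predict(x_k).argmax() != y:
            break                                   # misclassified: done
        _, g = loss_and_grad(x_k, y)
        if fractional and x_prev is not None:       # C-FOG (Algorithm 1);
            scale = np.abs(x_k - x_prev) + 1e-8     # eps avoids 0 ** negative
            g = g * scale ** (1.0 - alpha) / gamma(2.0 - alpha)
        # The very first step (x_prev is None) and the GSDM branch
        # (Algorithm 2) both use the plain gradient.
        x_prev, x_k = x_k, x_k + lr * g
    return x_k

x0 = rng.normal(size=4)
y0 = int(predict(x0).argmax())
for frac, name in [(True, "C-FOG"), (False, "GSDM")]:
    x_adv = attack(x0, y0, fractional=frac)
    print(name, "-> label", int(predict(x_adv).argmax()), "(clean:", y0, ")")
```

In practice the classifier would be a deep network (resnet50, mobilenet_v2, vgg11 in the experiments below) and the gradient would come from backpropagation; only the update rule and stopping test differ between the two algorithms.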
5.2. Experimental Results
5.2.1. Evaluation of Attack Speed
5.2.2. Evaluation of Attack Strength
5.2.3. Evaluation of Attack Transferability
6. Conclusions and Outlook
- (1) Compared with classical optimization algorithms, the power function increases computational complexity. Each iteration must store the parameters produced in the last two iterations and compute, elementwise, the power of the absolute value of the difference between the two parameter vectors, which increases both the computational and the storage overhead (see the sketch after this list).
- (2) Choosing the learning rate, and adjusting it at the right time, determines both the speed of convergence and the accuracy of the converged result, which requires experience and trial and error. C-FOG guarantees non-divergence, but if the learning rate is too large it cannot guarantee convergence into the neighborhood of the extreme point, so an appropriate learning rate is needed to obtain both fast convergence and the desired accuracy. Adaptive-learning-rate algorithms such as Adam are much simpler in this regard.
- (3) Further questions about the convergence and divergence of C-FOG remain open, for example, its behavior in high-dimensional parameter spaces, the relationships among the learning rates of the individual parameters, and the law governing oscillations near the convergence points.
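To make point (1) concrete, here is a rough sketch, with illustrative names and shapes, of the extra per-step state and arithmetic relative to a plain SGD step; the exponent 1 − α follows the hedged update form sketched in Section 4:

```python
import numpy as np
from math import gamma

alpha, lr = 1.5, 0.01
rng = np.random.default_rng(1)

x_prev = np.ones(1_000_000)              # extra buffer: previous iterate
x_k = x_prev + 0.01                      # current iterate
grad = rng.normal(size=x_k.shape)

# Plain SGD: one scaled subtraction over the parameter vector.
sgd_next = x_k - lr * grad

# C-FOG-style step: an extra elementwise abs and elementwise power (the
# dominant added cost), plus a constant 1 / Gamma(2 - alpha) factor.
scale = np.abs(x_k - x_prev) ** (1.0 - alpha) / gamma(2.0 - alpha)
cfog_next = x_k - lr * grad * scale
```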
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Kingma, D.P.; Ba, J.L. Adam: A Method for Stochastic Optimization. In Proceedings of the 3rd International Conference on Learning Representations (ICLR), San Diego, CA, USA, 7–9 May 2015.
- Tas, E. Learning Parameter Optimization of Stochastic Gradient Descent with Momentum for a Stochastic Quadratic. In Proceedings of the 24th European Conference on Operational Research (EURO XXIV), Lisbon, Portugal, 11–14 July 2010.
- Duchi, J.C.; Hazan, E.; Singer, Y. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization. J. Mach. Learn. Res. 2011, 12, 2121–2159.
- Ruder, S. An Overview of Gradient Descent Optimization Algorithms. arXiv 2016, arXiv:1609.04747.
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. Commun. ACM 2017, 60, 84–90.
- Shamshirband, S.; Fathi, M.; Dehzangi, A.; Chronopoulos, A.T.; Alinejad-Rokny, H. A Review on Deep Learning Approaches in Healthcare Systems: Taxonomies, Challenges, and Open Issues. J. Biomed. Inform. 2020, 113, 103627.
- Dahl, G.E.; Yu, D.; Deng, L.; Acero, A. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition. IEEE Trans. Audio Speech Lang. Process. 2012, 20, 30–42.
- You, Y.B.; Qian, Y.M.; He, T.X.; Yu, K. An Investigation on DNN-Derived Bottleneck Features for GMM-HMM Based Robust Speech Recognition. In Proceedings of the 2015 IEEE China Summit and International Conference on Signal and Information Processing, Beijing, China, 12–15 July 2015; pp. 30–34.
- Aslan, S. A Deep Learning-Based Sentiment Analysis Approach (MF-CNN-BILSTM) and Topic Modeling of Tweets Related to the Ukraine–Russia Conflict. Appl. Soft Comput. 2023, 143, 110404.
- Alagarsamy, P.; Sridharan, B.; Kalimuthu, V.K. A Deep Learning Based Glioma Tumor Detection Using Efficient Visual Geometry Group Convolutional Neural Networks. Braz. Arch. Biol. Technol. 2024, 67, 267101018.
- Biggio, B.; Corona, I.; Maiorca, D.; Nelson, B.; Šrndić, N.; Laskov, P.; Giacinto, G.; Roli, F. Evasion Attacks Against Machine Learning at Test Time. arXiv 2017, arXiv:1708.06131.
- Szegedy, C.; Zaremba, W.; Sutskever, I.; Bruna, J.; Erhan, D.; Goodfellow, I.; Fergus, R. Intriguing Properties of Neural Networks. In Proceedings of the International Conference on Learning Representations (ICLR), Banff, AB, Canada, 14–16 April 2014.
- Kurakin, A.; Goodfellow, I.; Bengio, S. Adversarial Examples in the Physical World. In Proceedings of the International Conference on Learning Representations (ICLR) Workshop Track, Toulon, France, 24–26 April 2017.
- Machado, G.R.; Silva, E.; Goldschmidt, R.R. Adversarial Machine Learning in Image Classification: A Survey Toward the Defender's Perspective. ACM Comput. Surv. 2023, 55, 1–35.
- Moosavi-Dezfooli, S.M.; Fawzi, A.; Frossard, P. DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 2574–2582.
- Goodfellow, I.J.; Shlens, J.; Szegedy, C. Explaining and Harnessing Adversarial Examples. In Proceedings of the International Conference on Learning Representations (ICLR), San Diego, CA, USA, 7–9 May 2015.
- Brendel, W.; Rauber, J.; Bethge, M. Decision-Based Adversarial Attacks: Reliable Attacks Against Black-Box Machine Learning Models. In Proceedings of the International Conference on Learning Representations (ICLR), Vancouver, BC, Canada, 30 April–3 May 2018.
- Maho, T.; Furon, T.; Le Merrer, E. SurFree: A Fast Surrogate-Free Black-Box Attack. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Virtual, 19–25 June 2021.
- Rahmati, A.; Moosavi-Dezfooli, S.-M.; Frossard, P.; Dai, H. GeoDA: A Geometric Framework for Black-Box Adversarial Attacks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 14–19 June 2020.
- Chen, J.; Jordan, M.I.; Wainwright, M.J. HopSkipJumpAttack: A Query-Efficient Decision-Based Attack. In Proceedings of the IEEE Symposium on Security and Privacy (S&P), San Francisco, CA, USA, 18–21 May 2020.
- Shi, Y.C.; Han, Y.H.; Hu, Q.H. Query-Efficient Black-Box Adversarial Attack with Customized Iteration and Sampling. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 2226–2245.
- Qayyum, A.; Qadir, J.; Bilal, A. Secure and Robust Machine Learning for Healthcare: A Survey. IEEE Rev. Biomed. Eng. 2020, 14, 156–180.
- Zhang, Z.; Ma, L.; Liu, M.; Chen, Y.; Zhao, N. Adversarial Attacking and Defensing Modulation Recognition with Deep Learning in Cognitive-Radio-Enabled IoT. IEEE Internet Things J. 2023, 11, 14949–14962.
- Bai, Z.X.; Wang, H.J.; Guo, K.X. Summary of Adversarial Examples Techniques Based on Deep Neural Networks. Comput. Eng. Appl. 2021, 57, 61–70.
- Carlini, N.; Wagner, D. Towards Evaluating the Robustness of Neural Networks. In Proceedings of the IEEE Symposium on Security and Privacy, San Jose, CA, USA, 22–26 May 2017.
- Carlini, N.; Wagner, D. Defensive Distillation Is Not Robust to Adversarial Examples. arXiv 2016, arXiv:1607.04311.
- Papernot, N.; McDaniel, P.; Goodfellow, I. Transferability in Machine Learning: From Phenomena to Black-Box Attacks Using Adversarial Samples. arXiv 2016, arXiv:1605.07277.
- Iqbal, F.; Tufail, M.; Ahmed, S.; Akhtar, M.T. A Fractional Taylor Series-Based Least Mean Square Algorithm and Its Application to Power Signal Estimation. Signal Process. 2021, 193, 108405.
- Khan, Z.A.; Chaudhary, N.I.; Raja, M.A.Z. Generalized Fractional Strategy for Recommender Systems with Chaotic Ratings Behavior. Chaos Solitons Fractals 2022, 160, 112204.
- Chaudhary, N.I.; Raja, M.A.Z.; Khan, Z.A.; Mehmood, A.; Shah, S.M. Design of Fractional Hierarchical Gradient Descent Algorithm for Parameter Estimation of Nonlinear Control Autoregressive Systems. Chaos Solitons Fractals 2022, 157, 111913.
- Zeiler, M.D. AdaDelta: An Adaptive Learning Rate Method. arXiv 2012, arXiv:1212.5701.
- Loshchilov, I.; Hutter, F. Decoupled Weight Decay Regularization. In Proceedings of the International Conference on Learning Representations (ICLR), New Orleans, LA, USA, 6–9 May 2019.
- Tian, Y.J.; Zhang, Y.Q.; Zhang, H.B. Recent Advances in Stochastic Gradient Descent in Deep Learning. Mathematics 2023, 11, 682.
- Reddi, S.J.; Kale, S.; Kumar, S. On the Convergence of Adam and Beyond. In Proceedings of the International Conference on Learning Representations (ICLR), Vancouver, BC, Canada, 30 April–3 May 2018.
- Sutskever, I.; Martens, J.; Dahl, G.E.; Hinton, G. On the Importance of Initialization and Momentum in Deep Learning. In Proceedings of the 30th International Conference on Machine Learning (ICML), Atlanta, GA, USA, 16–21 June 2013.
- Podlubny, I. Fractional Differential Equations; Academic Press: San Diego, CA, USA, 1998; Volume 198.
- Miller, K.S.; Ross, B. An Introduction to the Fractional Calculus and Fractional Differential Equations; Wiley: New York, NY, USA, 1993.
- Oldham, K.B.; Spanier, J. The Fractional Calculus: Theory and Applications of Differentiation and Integration to Arbitrary Order; Academic Press: New York, NY, USA, 1974.
- Gorenflo, R.; Mainardi, F. Fractional Calculus: Integral and Differential Equations of Fractional Order. Mathematics 2008, 49, 277–290.
- Pu, Y.F.; Zhou, J.L.; Zhang, Y.; Zhang, N.; Huang, G.; Siarry, P. Fractional Extreme Value Adaptive Training Method: Fractional Steepest Descent Approach. IEEE Trans. Neural Netw. Learn. Syst. 2015, 26, 653–662.
- Cheng, S.; Wei, Y.; Chen, Y.; Li, Y.; Wang, Y. An Innovative Fractional Order LMS Based on Variable Initial Value and Gradient Order. Signal Process. 2017, 133, 260–269.
- Chen, Y.; Gao, Q.; Wei, Y.; Wang, Y. Study on Fractional Order Gradient Methods. Appl. Math. Comput. 2017, 314, 310–321.
- Sheng, D.; Wei, Y.; Chen, Y.; Wang, Y. Convolutional Neural Networks with Fractional Order Gradient Method. Neurocomputing 2020, 408, 42–50.
- Wang, J.; Yan, Q.W.; Gou, Y.D.; Ye, Z.; Chen, H. Fractional-Order Gradient Descent Learning of BP Neural Networks with Caputo Derivative. Neural Netw. 2017, 89, 19–30.
- Kennedy, J.; Eberhart, R. Particle Swarm Optimization. In Proceedings of the ICNN'95—International Conference on Neural Networks, Perth, WA, Australia, 27 November–1 December 1995.
- Zhu, Z.G.; Li, A.; Wang, Y. Study on Two-Stage Fractional Order Gradient Descend Method. In Proceedings of the 40th Chinese Control Conference (CCC), Shanghai, China, 26–28 July 2021.
- Hazan, E. Introduction to Online Convex Optimization, 2nd ed.; Now Foundations and Trends: Boston, MA, USA, 2019; pp. 41–48.
- Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2017; p. 306.
- Madry, A.; Makelov, A.; Schmidt, L.; Tsipras, D.; Vladu, A. Towards Deep Learning Models Resistant to Adversarial Attacks. In Proceedings of the 6th International Conference on Learning Representations (ICLR), Vancouver, BC, Canada, 30 April–3 May 2018.
- Papernot, N.; McDaniel, P.; Goodfellow, I.; Jha, S.; Celik, Z.B.; Swami, A. Practical Black-Box Attacks Against Machine Learning. In Proceedings of the 2017 ACM Asia Conference on Computer and Communications Security, Abu Dhabi, United Arab Emirates, 2–6 April 2017.
- Parkhi, O.M.; Vedaldi, A.; Zisserman, A.; Jawahar, C.V. Cats and Dogs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, USA, 16–21 June 2012; pp. 3498–3505.
- Vinyals, O.; Blundell, C.; Lillicrap, T.; Kavukcuoglu, K.; Wierstra, D. Matching Networks for One Shot Learning. In Proceedings of the 30th Annual Conference on Neural Information Processing Systems (NIPS), Barcelona, Spain, 5–10 December 2016.
- Wah, C.; Branson, S.; Welinder, P.; Perona, P.; Belongie, S. The Caltech-UCSD Birds-200-2011 Dataset; California Institute of Technology: Pasadena, CA, USA, 2011.
- Hinton, G.; Vinyals, O.; Dean, J. Distilling the Knowledge in a Neural Network. arXiv 2015, arXiv:1503.02531.
- Papernot, N.; McDaniel, P.; Wu, X.; Jha, S.; Swami, A. Distillation as a Defense to Adversarial Perturbations Against Deep Neural Networks. In Proceedings of the IEEE Symposium on Security and Privacy, San Jose, CA, USA, 22–26 May 2016.
| Algorithms | Hyperparameters |
|---|---|
| C-FOG | |
| Momentum | |
| AdaGrad | |
| AdaDelta | |
| RMSProp | |
| Adam | |
| Steps to reach error | C-FOG | Momentum | AdaGrad | RMSProp | AdaDelta | Adam |
|---|---|---|---|---|---|---|
| 1 × 10⁻⁵ | 32 | 104 | 3,788 | 74 | 260 | 931 |
| 1 × 10⁻¹⁰ | 61 | 222 | 7,586 | 86 | 11,909 | 1,547 |
| 1 × 10⁻²⁰ | 112 | 431 | 15,181 | 648 | -- | 2,280 |
| 1 × 10⁻³⁰ | 171 | 640 | 22,776 | 1,929 | -- | 2,750 |
| 1 × 10⁻⁴⁰ | 221 | 901 | 30,371 | 3,211 | -- | 3,081 |
| Model | C&W | C-FOG | PGD | DeepFool | GSDM |
|---|---|---|---|---|---|
| resnet50 | 2545.71 | 251.68 | 1064.32 | 481.13 | 524.43 |
| mobilenet_v2 | 1606.71 | 160.93 | 671.72 | 252.03 | 312.19 |
| vgg11 | 1383.99 | 137.97 | 556.00 | 264.46 | 266.88 |
| Model | C&W | C-FOG | PGD | DeepFool | GSDM |
|---|---|---|---|---|---|
| resnet50 | 98.3% | 98.5% | 98.8% | 98.2% | 98.7% |
| mobilenet_v2 | 98.7% | 99.5% | 99.5% | 99.1% | 98.8% |
| vgg11 | 97.5% | 99.4% | 99.3% | 99.4% | 99.1% |
| Model | Conv 1 | Conv 2 | Conv 3 | Conv 4 | Linear 1 | Linear 2 | Linear 3 |
|---|---|---|---|---|---|---|---|
| subs_01 | 32 | 32 | 64 | 64 | 200 | -- | 10 |
| subs_02 | 32 | 32 | 64 | 64 | 1000 | 200 | 10 |