Improving Age Estimation in Occluded Facial Images with Knowledge Distillation and Layer-Wise Feature Reconstruction
Abstract
1. Introduction
- Novel Research Focus: To the best of our knowledge, this is the first study to address age estimation under large-scale occlusions of the eye and mouth regions;
- New Architecture: The proposed model introduces a knowledge distillation framework that facilitates the transfer of unoccluded facial feature knowledge from a teacher model to a student model without the need to compress model parameters;
- New Training Strategy: The training process integrates layer-wise feature reconstruction and parameter freezing to ensure the accurate reconstruction of occluded facial features. Fine-tuning with original age labels is then performed to remove any noisy features.
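The two-stage training strategy in the last bullet can be illustrated with a deliberately tiny sketch. It is not the authors' implementation: each "block" is assumed to be a single scalar weight, occlusion is simulated by halving the input, and the names `forward`, `reconstruct_block`, and `fine_tune`, as well as all numeric values, are hypothetical. Stage 1 trains one student block at a time against the teacher's feature for the clean input while the other blocks stay frozen; stage 2 unfreezes everything and fits the age labels.

```python
# Toy sketch only: each "block" is one scalar weight (h_out = w * h_in) and
# occlusion is simulated by halving the input. All values are assumptions.

def forward(weights, x, upto):
    """Return the feature after blocks 0..upto (upto = -1 returns x)."""
    h = x
    for w in weights[: upto + 1]:
        h = w * h
    return h


def reconstruct_block(student, teacher, k, pairs, lr=0.05, steps=200):
    """Train only student[k]; every other student weight stays frozen.

    The loss is the MSE between the student's block-k feature on the
    occluded input and the teacher's block-k feature on the clean input.
    """
    for _ in range(steps):
        grad = 0.0
        for clean, occluded in pairs:
            target = forward(teacher, clean, k)
            h_in = forward(student, occluded, k - 1)  # input to block k
            grad += 2.0 * (student[k] * h_in - target) * h_in / len(pairs)
        student[k] -= lr * grad


def fine_tune(student, labelled, lr=0.01, steps=200):
    """Unfreeze all blocks and fit the final output to the age labels."""
    last = len(student) - 1
    for _ in range(steps):
        grads = [0.0] * len(student)
        for occluded, label in labelled:
            out = forward(student, occluded, last)
            for i, w in enumerate(student):
                # For a product of scalars, d(out)/d(w_i) = out / w_i.
                grads[i] += 2.0 * (out - label) * (out / w) / len(labelled)
        for i, g in enumerate(grads):
            student[i] -= lr * g


teacher = [1.5, 0.8, 1.2]   # frozen teacher, standing in for a clean-face model
student = list(teacher)     # student starts from the teacher weights
clean_inputs = [1.0, 2.0, 3.0]
pairs = [(c, 0.5 * c) for c in clean_inputs]  # (clean, occluded)

# Stage 1: reconstruct features block by block, from shallow to deep.
for k in range(len(student)):
    reconstruct_block(student, teacher, k, pairs)
reconstructed = list(student)

# Stage 2: global fine-tuning with hypothetical age labels.
labelled = [(occluded, 1.5 * clean) for clean, occluded in pairs]
fine_tune(student, labelled)
```

In this toy setting the first block absorbs the whole correction for the occlusion, so the deeper blocks already see teacher-like features and barely move. A real network would more plausibly spread the correction across several blocks, which is one reason to reconstruct them one at a time.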
2. Related Works
2.1. Age Estimation
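Categories | Method | Database | MAE |
---|---|---|---|
Classification of multi-class ages | DEX [26] | IMDB-WIKI + LAP2015 | 3.22 |
Classification of multi-class ages | DEX [26] | MORPH | 2.68 |
Classification of multi-class ages | DEX [26] | FG-NET | 3.09 |
Classification of multi-class ages | DEX [26] | CACD | 6.52 |
Regression based on metrics | OR-CNN [20] | AFAD | 3.34 |
Regression based on metrics | OR-CNN [20] | MORPH | 3.27 |
Regression based on metrics | VGG + BridgeNet [27] | MORPH | 2.38 |
Regression based on metrics | VGG + BridgeNet [27] | FG-NET | 2.56 |
Regression based on metrics | VGG + BridgeNet [27] | LAP2015 | 2.98 |
Learning by the distribution of deep label | DLDL-v2 [28] | LAP2015 | 3.14 |
Learning by the distribution of deep label | DLDL-v2 [28] | LAP2016 | 3.45 |
Learning by the distribution of deep label | DLDL-v2 [28] | MORPH | 1.97 |
Ranking | Ranking-CNN [21] | MORPH | 2.96 |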
2.2. Facial Occlusion in Age Estimation
2.3. Knowledge Distillation
3. Proposed Methods
3.1. Overview of Suggested Method
Algorithm 1 Layer-Wise Distillation with Feature Alignment and Fine-Tuning
Input: Teacher weights, student weights, number of blocks g
Output: Optimized student weights
3.2. Feature Alignment
3.3. Feature Distillation Mechanism
3.4. Layer-Wise Reconstruction and Global Fine-Tuning
4. Experimental Results
4.1. Dataset and Experimental Setup
4.2. Performance Verification Experiment
4.2.1. Neural Network Architectures
4.2.2. Implementation Details
4.2.3. Ablation Study
4.2.4. Comparison Experiment
4.2.5. Cross-Dataset Comparison with Multiple Advanced Models
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Nam, S.H.; Kim, Y.H.; Choi, J.; Park, C.; Park, K.R. LCA-GAN: Low-Complexity Attention-Generative Adversarial Network for Age Estimation with Mask-Occluded Facial Images. Mathematics 2023, 11, 1926.
- Angulu, R.; Tapamo, J.R.; Adewumi, A.O. Age estimation via face images: A survey. EURASIP J. Image Video Process. 2018, 2018, 42.
- Zeng, D.; Veldhuis, R.; Spreeuwers, L. A survey of face recognition techniques under occlusion. IET Biom. 2021, 10, 581–606.
- Song, L.; Gong, D.; Li, Z.; Liu, C.; Liu, W. Occlusion robust face recognition based on mask learning with pairwise differential siamese network. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 773–782.
- Farkas, J.P.; Pessa, J.E.; Hubbard, B.; Rohrich, R.J. The science and theory behind facial aging. Plast. Reconstr. Surg.—Glob. Open 2013, 1, e8–e15.
- Shou, Y.; Cao, X.; Liu, H.; Meng, D. Masked contrastive graph representation learning for age estimation. Pattern Recognit. 2025, 158, 110974.
- Wang, H.; Sanchez, V.; Li, C.T. Improving face-based age estimation with attention-based dynamic patch fusion. IEEE Trans. Image Process. 2022, 31, 1084–1096.
- Li, W.; Lu, J.; Wuerkaixi, A.; Feng, J.; Zhou, J. MetaAge: Meta-learning personalized age estimators. IEEE Trans. Image Process. 2022, 31, 4761–4775.
- Antipov, G.; Baccouche, M.; Berrani, S.A.; Dugelay, J.L. Apparent age estimation from face images combining general and children-specialized deep learning models. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 96–104.
- He, M.; Zhang, J.; Shan, S.; Liu, X.; Wu, Z.; Chen, X. Locality-aware channel-wise dropout for occluded face recognition. IEEE Trans. Image Process. 2021, 31, 788–798.
- Cho, Y.; Cho, H.; Hong, H.G.; Ahn, J.; Cho, D.; Chang, J.; Kim, J. Localization using multi-focal spatial attention for masked face recognition. In Proceedings of the 2023 IEEE 17th International Conference on Automatic Face and Gesture Recognition (FG), Waikoloa Beach, HI, USA, 5–8 January 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 1–6.
- Li, H.; Zhang, Y.; Wang, W.; Zhang, S.; Zhang, S. Recovery-Based Occluded Face Recognition by Identity-Guided Inpainting. Sensors 2024, 24, 394.
- Buolamwini, J.; Gebru, T. Gender shades: Intersectional accuracy disparities in commercial gender classification. In Proceedings of the Conference on Fairness, Accountability and Transparency, PMLR, New York, NY, USA, 23–24 February 2018; pp. 77–91.
- Hiba, S.; Keller, Y. Hierarchical attention-based age estimation and Bias estimation. arXiv 2021, arXiv:2103.09882.
- Nimhed, C. Estimation of Height, Weight, Sex and Age from Magnetic Resonance Images Using 3D Convolutional Neural Networks. 2022. Available online: https://www.diva-portal.org/smash/get/diva2:1667861/FULLTEXT01.pdf (accessed on 22 April 2025).
- Yaman, D.; Eyiokur, F.I.; Ekenel, H.K. Multimodal soft biometrics: Combining ear and face biometrics for age and gender classification. Multimed. Tools Appl. 2022, 81, 22695–22713.
- Agbo-Ajala, O.; Viriri, S. Deep learning approach for facial age classification: A survey of the state-of-the-art. Artif. Intell. Rev. 2021, 54, 179–213.
- Duan, M.; Li, K.; Li, K. An ensemble CNN2ELM for age estimation. IEEE Trans. Inf. Forensics Secur. 2017, 13, 758–772.
- Rothe, R.; Timofte, R.; Van Gool, L. Deep expectation of real and apparent age from a single image without facial landmarks. Int. J. Comput. Vis. 2018, 126, 144–157.
- Niu, Z.; Zhou, M.; Wang, L.; Gao, X.; Hua, G. Ordinal regression with multiple output cnn for age estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 4920–4928.
- Chen, S.; Zhang, C.; Dong, M.; Le, J.; Rao, M. Using ranking-CNN for age estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 5183–5192.
- Shen, W.; Guo, Y.; Wang, Y.; Zhao, K.; Wang, B.; Yuille, A.L. Deep regression forests for age estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 2304–2313.
- Cao, W.; Mirjalili, V.; Raschka, S. Rank consistent ordinal regression for neural networks with application to age estimation. Pattern Recognit. Lett. 2020, 140, 325–331.
- Gao, B.B.; Xing, C.; Xie, C.W.; Wu, J.; Geng, X. Deep label distribution learning with label ambiguity. IEEE Trans. Image Process. 2017, 26, 2825–2838.
- Shin, N.H.; Lee, S.H.; Kim, C.S. Moving window regression: A novel approach to ordinal regression. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 18760–18769.
- Rothe, R.; Timofte, R.; Van Gool, L. Dex: Deep expectation of apparent age from a single image. In Proceedings of the IEEE International Conference on Computer Vision Workshops, Santiago, Chile, 13–16 December 2015; pp. 10–15.
- Li, W.; Lu, J.; Feng, J.; Xu, C.; Zhou, J.; Tian, Q. Bridgenet: A continuity-aware probabilistic network for age estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 1145–1154.
- Gao, B.B.; Zhou, H.Y.; Wu, J.; Geng, X. Age Estimation Using Expectation of Label Distribution Learning. In Proceedings of the IJCAI, Stockholm, Sweden, 13–19 July 2018; Volume 1, p. 3.
- Dong, J.; Zhang, L.; Zhang, H.; Liu, W. Occlusion-aware gan for face de-occlusion in the wild. In Proceedings of the 2020 IEEE International Conference on Multimedia and Expo (ICME), London, UK, 6–10 July 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–6.
- Jabbar, A.; Li, X.; Assam, M.; Khan, J.A.; Obayya, M.; Alkhonaini, M.A.; Al-Wesabi, F.N.; Assad, M. AFD-StackGAN: Automatic mask generation network for face de-occlusion using StackGAN. Sensors 2022, 22, 1747.
- Ju, Y.J.; Lee, G.H.; Hong, J.H.; Lee, S.W. Complete face recovery gan: Unsupervised joint face rotation and de-occlusion from a single-view image. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–8 January 2022; pp. 3711–3721.
- Zhao, F.; Feng, J.; Zhao, J.; Yang, W.; Yan, S. Robust LSTM-autoencoders for face de-occlusion in the wild. IEEE Trans. Image Process. 2017, 27, 778–790.
- Hörmann, S.; Zhang, Z.; Knoche, M.; Teepe, T.; Rigoll, G. Attention-based partial face recognition. In Proceedings of the 2021 IEEE International Conference on Image Processing (ICIP), Anchorage, AK, USA, 19–22 September 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 2978–2982.
- Wen, R.; Yao, L.; Wan, W.; Chen, S. Occluded Face Recognition Based on Attention Mechanism and Damaged Feature Masking. In Proceedings of the 2023 16th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), Taizhou, China, 28–30 October 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 1–5.
- Din, N.U.; Javed, K.; Bae, S.; Yi, J. A novel GAN-based network for unmasking of masked face. IEEE Access 2020, 8, 44276–44287.
- Hinton, G. Distilling the Knowledge in a Neural Network. arXiv 2015, arXiv:1503.02531.
- Romero, A.; Ballas, N.; Kahou, S.E.; Chassang, A.; Gatta, C.; Bengio, Y. Fitnets: Hints for thin deep nets. arXiv 2014, arXiv:1412.6550.
- Chen, B.C.; Chen, C.S.; Hsu, W.H. Cross-age reference coding for age-invariant face recognition and retrieval. In Computer Vision–ECCV 2014: Proceedings of the 13th European Conference, Zurich, Switzerland, 6–12 September 2014; Proceedings, Part VI 13; Springer: Cham, Switzerland, 2014; pp. 768–783.
- Ricanek, K.; Tesafaye, T. Morph: A longitudinal image database of normal adult age-progression. In Proceedings of the 7th International Conference on Automatic Face and Gesture Recognition (FGR06), Southampton, UK, 10–12 April 2006; IEEE: Piscataway, NJ, USA, 2006; pp. 341–345.
- Rothe, R.; Timofte, R.; Gool, L. IMDB-WIKI–500k+ Face Images with Age and Gender Labels. 2015, Volume 4. Available online: https://data.vision.ee.ethz.ch/cvl/rrothe/imdb-wiki (accessed on 15 January 2024).
- Zhang, K.; Zhang, Z.; Li, Z.; Qiao, Y. Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Process. Lett. 2016, 23, 1499–1503.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Sharma, N.; Sharma, R.; Jindal, N. Face-based age and gender estimation using improved convolutional neural network approach. Wirel. Pers. Commun. 2022, 124, 3035–3054.
- Zhang, B.; Bao, Y. Age estimation of faces in videos using head pose estimation and convolutional neural networks. Sensors 2022, 22, 4171.
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
- Tan, M.; Le, Q. Efficientnet: Rethinking model scaling for convolutional neural networks. In Proceedings of the International Conference on Machine Learning, PMLR, Long Beach, CA, USA, 9–15 June 2019; pp. 6105–6114.
- Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; pp. 4278–4284.
- Li, S.; Deng, W.; Du, J. Reliable crowdsourcing and deep locality-preserving learning for expression recognition in the wild. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2852–2861.
- Isola, P.; Zhu, J.Y.; Zhou, T.; Efros, A.A. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1125–1134.
- Yu, J.; Lin, Z.; Yang, J.; Shen, X.; Lu, X.; Huang, T.S. Free-form image inpainting with gated convolution. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 4471–4480.
Age Group | 16–20 | 21–25 | 26–30 | 31–35 | 36–40 | 41–45 | 46–50 | 51–55 | 56–60 | 61–65 | 66–70 |
---|---|---|---|---|---|---|---|---|---|---|---|
Training Set | 7376 | 7189 | 5748 | 6477 | 6965 | 6914 | 4147 | 2903 | 2000 | 890 | 288 |
Testing Set | 820 | 800 | 650 | 780 | 750 | 790 | 380 | 180 | 70 | 30 | 20 |
Total | 8196 | 7989 | 6398 | 7257 | 7715 | 7704 | 4527 | 3083 | 2070 | 920 | 308 |
Occlusion Type | MAE (CORAL) | MAE (DLDL) | MAE (MWR) |
---|---|---|---|
Mask | 5.51 | 5.72 | 6.44 |
Sunglasses | 6.21 | 6.44 | 6.32 |
Mask + Sunglasses | 7.52 | 7.71 | 7.82 |
Mouth Color Block | 4.96 | 4.85 | 5.57 |
Eyes Color Block | 5.09 | 5.22 | 5.98 |
Mouth + Eyes Color Block | 6.62 | 6.31 | 7.02 |
Model | MAE (Baseline) | MAE (Layer-Wise Feature Reconstruction) |
---|---|---|
CORAL | 6.62 | 4.59 ↓ 2.03 |
DLDL | 6.88 | 4.91 ↓ 1.97 |
MWR | 6.85 | 5.08 ↓ 1.77 |
Model | MAE (Layer-Wise Feature Reconstruction) | MAE (Layer-Wise Feature Reconstruction + Age Label Fine-Tuning) |
---|---|---|
CORAL | 4.59 | 4.27 ↓ 0.32 |
DLDL | 4.91 | 4.43 ↓ 0.48 |
MWR | 5.08 | 4.87 ↓ 0.21 |
Model | Parameters (Millions) | Layers | MAE (CORAL) | MAE (DLDL) | MAE (MWR) |
---|---|---|---|---|---|
ResNet-34 | 21.8 | 34 | 4.27 | 4.43 | 4.87 |
ResNet-50 | 25.6 | 50 | 4.12 | 4.28 | 4.75 |
ResNet-101 | 44.5 | 101 | 3.95 | 4.07 | 4.62 |
VGG16 [45] | 138.0 | 16 | 4.30 | 4.47 | 4.84 |
EfficientNet-B0 [46] | 5.3 | 237 | 4.01 | 4.15 | 4.68 |
InceptionV4 [47] | 42.5 | 48 | 4.10 | 4.22 | 4.71 |
Occlusion Type | Proportion (%) |
---|---|
Mask | 15 |
Sunglasses | 15 |
Mask + Sunglasses | 20 |
Mouth Color Block | 15 |
Eyes Color Block | 15 |
Mouth + Eyes Color Block | 20 |
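As a rough illustration of how such a mix could be produced, the sketch below draws one occlusion type per image with probabilities equal to the listed proportions. The sampling scheme, the function name `assign_occlusions`, and the seed are assumptions; the source does not state whether types were sampled randomly or assigned by fixed quota.

```python
import random
from collections import Counter

# Proportions (percent) copied from the occlusion-type table.
OCCLUSION_PROPORTIONS = {
    "Mask": 15,
    "Sunglasses": 15,
    "Mask + Sunglasses": 20,
    "Mouth Color Block": 15,
    "Eyes Color Block": 15,
    "Mouth + Eyes Color Block": 20,
}


def assign_occlusions(num_images, seed=0):
    """Draw one occlusion type per image, weighted by the table's proportions.

    Hypothetical helper: random weighted sampling is an assumption, so the
    realised counts only approximate the target percentages.
    """
    rng = random.Random(seed)  # local generator keeps the draw reproducible
    types = list(OCCLUSION_PROPORTIONS)
    weights = list(OCCLUSION_PROPORTIONS.values())
    return rng.choices(types, weights=weights, k=num_images)


assignments = assign_occlusions(20000)
counts = Counter(assignments)
observed_percent = {t: 100.0 * counts[t] / len(assignments) for t in OCCLUSION_PROPORTIONS}
```

If exact proportions matter more than randomness, a quota-based assignment followed by a shuffle would be a reasonable alternative.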
Model | MAE (Occlusion Dataset Direct Training) | MAE (Layer-Wise Feature Reconstruction + Age Label Fine-Tuning) |
---|---|---|
CORAL | 5.61 | 4.27 |
DLDL | 5.91 | 4.43 |
MWR | 5.98 | 4.87 |
Method | AFAD MAE | AFAD CS (%) | CACD MAE | CACD CS (%) | IMDB-WIKI MAE | IMDB-WIKI CS (%) | Average MAE | Average CS (%) |
---|---|---|---|---|---|---|---|---|
Human workers | 8.53 | 48.26 | 8.95 | 46.18 | 9.58 | 44.37 | 9.02 | 46.27 |
DLP-CNN [48] | 7.86 | 56.58 | 7.98 | 52.33 | 8.92 | 49.37 | 8.25 | 52.76 |
Pix2Pix [49] | 6.18 | 65.22 | 6.33 | 63.36 | 6.81 | 61.62 | 6.44 | 63.40 |
DeepFill v2 [50] | 5.92 | 69.39 | 6.06 | 67.86 | 6.61 | 63.83 | 6.20 | 67.03 |
LCA-GAN [1] | 4.72 | 79.69 | 5.21 | 75.25 | 5.98 | 70.83 | 5.30 | 75.26 |
MCGRL [6] | 5.11 | 78.86 | 5.48 | 74.22 | 5.81 | 71.09 | 5.47 | 74.72 |
KD-CORAL (ours) | 4.83 | 77.11 | 5.15 | 74.29 | 5.71 | 72.16 | 5.23 | 74.52 |
KD-DLDL (ours) | 5.04 | 75.59 | 5.59 | 73.33 | 6.04 | 70.97 | 5.56 | 73.30 |
KD-MWR (ours) | 5.28 | 74.68 | 5.66 | 72.85 | 6.37 | 70.03 | 5.77 | 72.52 |
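For readers unfamiliar with the two metrics, the sketch below shows how MAE and a cumulative score (CS) are commonly computed. The 5-year CS threshold and the sample ages are assumptions, since the threshold used for this table is not given here. The last lines check that the Average MAE entry for KD-CORAL is consistent with an unweighted mean over the three datasets.

```python
def mean_absolute_error(true_ages, predicted_ages):
    """Average absolute difference between true and predicted ages, in years."""
    errors = [abs(t - p) for t, p in zip(true_ages, predicted_ages)]
    return sum(errors) / len(errors)


def cumulative_score(true_ages, predicted_ages, threshold=5):
    """Percentage of samples whose absolute error is at most `threshold` years.

    The default threshold of 5 is an assumption, not a value from the source.
    """
    hits = sum(1 for t, p in zip(true_ages, predicted_ages) if abs(t - p) <= threshold)
    return 100.0 * hits / len(true_ages)


# Made-up ages for illustration; absolute errors are 2, 1, 7 and 0 years.
true_ages = [20, 30, 40, 50]
predicted_ages = [22, 29, 47, 50]
mae = mean_absolute_error(true_ages, predicted_ages)  # 2.5
cs5 = cumulative_score(true_ages, predicted_ages)     # 75.0

# KD-CORAL MAE on AFAD, CACD and IMDB-WIKI, copied from the table.
kd_coral_mae = [4.83, 5.15, 5.71]
kd_coral_average = sum(kd_coral_mae) / len(kd_coral_mae)  # about 5.23
```

Lower MAE and higher CS are better, so the two columns should be read in opposite directions when comparing methods.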
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yu, S.; Zhao, Q. Improving Age Estimation in Occluded Facial Images with Knowledge Distillation and Layer-Wise Feature Reconstruction. Appl. Sci. 2025, 15, 5806. https://doi.org/10.3390/app15115806