Robust Active Shape Model via Hierarchical Feature Extraction with SFS-Optimized Convolution Neural Network for Invariant Human Age Classification

Abstract: The features and appearance of the human face are affected greatly by aging. A human face is an important aspect for human age identification from childhood through adulthood. Although many traits are used in human age estimation, this article discusses age classification using salient texture and facial landmark feature vectors. We propose a novel human age classification (HAC) model that can localize landmark points of the face. A robust multi-perspective view-based Active Shape Model (ASM) is generated and age classification is achieved using a Convolution Neural Network (CNN). The HAC model is subdivided into the following steps: (1) at first, a face is detected using a YCbCr color segmentation model; (2) landmark localization is done on the face using a connected components approach and a ridge contour method; (3) an Active Shape Model (ASM) is generated on the face using three-sided polygon meshes and perpendicular bisection of a triangle; (4) feature extraction is achieved using an anthropometric model, cranio-facial development, interior angle formulation, wrinkle detection and heat maps; (5) Sequential Forward Selection (SFS) is used to select the most ideal set of features; and (6) finally, the Convolution Neural Network (CNN) model is used to classify faces into the correct age group. The proposed system outperforms existing statistical state-of-the-art HAC methods in terms of classification accuracy, achieving 91.58% with The Images of Groups dataset, 92.62% with the OUI Adience dataset and 94.59% with the FG-NET dataset. The system is applicable to many research areas including access control, surveillance monitoring, interaction and self-identification.


Introduction
Despite changes in lifestyle and environment, the first signs of human facial aging show up between the ages of 25 and 30 years. Aging is a progressive transformation of the skin, soft tissue and skeletal structure of the face. The appearance of the face and neck typically changes in the late 30s. The face appears more flabby or drooping over time due to the loss of muscle tone and thinning skin, whereas in some people a double chin appears due to sagging jowls. To some extent, wrinkles cannot be avoided [1]. Skin smoothness and tightness decrease with age, which causes the appearance of wrinkles, while crow's feet around the eyes and bags under the eyes become more prominent. Normally, changes in the growth of the bones occur in childhood, whereas in puberty changes in the texture of the skin are the most intense and noticeable [2]. Aging-related changes allow humans to estimate the age of an individual just by looking at their face. However, some researchers in this field conclude that humans cannot predict a person's age accurately. Thus, an efficient automated system that can estimate human age correctly is required [3].
The aim of automated human age classification (HAC) models is to estimate a person's age from information derived from his/her face image using dedicated algorithms. Age classification shares similarities with other typical face image interpretation tasks: the execution phase includes face detection, locating facial characteristics, formulation of feature vectors and age classification. Depending on the application, the classification result can be an estimate of a person's exact age, the age group of an individual, or even a binary result indicating whether the age of a subject falls within a certain range. Among these three variations, classification into age groups is used in several applications, for which a rough estimate of an individual's age suffices rather than his/her exact age. Another prominent problem encountered during age estimation is the range of ages considered. This parameter is important because different aging signs appear in different age groups; hence, a system trained only on specific age groups may not be applicable to more diverse age ranges [4].
The information conveyed by the face has drawn attention in the image processing research field [5][6][7][8][9]. Human age estimation or age group classification [10] using images has vast applicability potential in age-invariant face identification, face recognition across ages, commercial and law enforcement areas [11][12][13][14][15], security control and surveillance, e-learning, biometrics, human-machine interaction [16][17][18][19][20] and electronic customer relationship management (ECRM). The main goal of HAC study is to draw out the different patterns and variations that occur in the appearance of the face from childhood till adulthood, so as to best characterize an aging face for accurate age or age group estimation. Although much research has been carried out to estimate human age, the accuracies of current automated age estimation systems remain far below those achieved by humans [21].
The facial aging of an individual depends on several aspects, from lifestyle, psychology and occupation to environmental factors. Aging factors can be categorized into two types: intrinsic and extrinsic. Extrinsic factors cover the effects of environment and occupation on the face, whereas intrinsic factors include components like bone symmetry and genetic influences which occur naturally over time [22]. During childhood, the main variations on the face are due to craniofacial development, which changes the shape of the face through the rapid growth and development of facial tissue and can alter the proportions of the face [23]. Facial bones lose volume as the years pass, which contributes to the appearance of aging on the face. In older females, the eye socket is larger than in younger years, and the angle of the lower jaw bone and the angle of the eyebrow are reduced. With the passage of time, the forehead slope releases space on the cranium. During childhood, the bone structure of the face drifts, causing changes in face shape. In adulthood, texture changes such as wrinkling and loss of skin elasticity occur on the face. Aging is common to all males and females, but it can appear faster in females than in males; however, it is not yet clear whether this is caused by the rate of aging or by sexual dimorphism [24]. Differences in male facial aging include manifestations of facial hair like beards, increased thickness of bones, facial vascularity, sebaceous content and potential differences in fat and bone absorption rates. The formation of deeper wrinkles around the mouth area is more common in women than in men, since women's skin has fewer skin appendages than men's [25].
Our proposed work is subdivided into the following steps. At first, the face is detected using the YCbCr color segmentation model. Second, 35 landmarks are plotted on the face using a connected components and ridge contour method. An Active Shape Model (ASM) is generated over the face using two approaches, namely, three-sided polygon mesh and perpendicular bisection of triangle. Feature extraction is achieved using two techniques, namely, image representation and aging patterns. The Sequential Forward Selection (SFS) technique is used before classification to select the ideal features for classification. Finally, CNN is used to achieve accurate age group classification.
The main contributions of our HAC system are listed here: 1. Texture feature vectors alone are not enough for good age classification accuracy; for accurate and robust age classification results, we have specified 35 landmark facial feature points. 2. We map a multi-perspective-view Active Shape Model (ASM) onto the face for better age classification accuracy. 3. Our salient texture and landmark localization feature vectors provide far better accuracy than other state-of-the-art techniques. 4. The selection of the ideal set of features is achieved using a Sequential Forward Selection (SFS) algorithm along with the CNN classifier for human age classification.
This article is subdivided into the following sections. Section 2 describes the previous literature on human age classification approaches. Section 3 describes the methods and materials used in our robust HAC model. Section 4 describes the performance evaluation and experimental results obtained on three benchmark datasets, while Section 5 presents the conclusion and future work.

Related Work
Recently, a lot of research has been carried out on the age classification of individuals and age groups using 2D and 3D images. Facial age estimation is divided into two stages. The first stage spans infancy to adulthood, when most shape changes occur; the second spans the teenage years to old age, when changes in skin color, texture and elasticity are most likely to occur. Various age estimation methods have been investigated, and recently age estimation using multiple faces has been carried out by researchers. In this research article, we discuss age classification using both single- and multi-face datasets.

Age Classification via Classical Machine Learning Algorithms
Several age classification studies have been done using single-face datasets, as in [26], where the authors proposed an age estimation model in which features are extracted using LBP, an active appearance model and HOG. Age groups are categorized into three classes: child, adult and senior. For classification, K-Nearest Neighbors, Support Vector Machine and Gradient Boosting Tree models are used. The system was evaluated on the FG-NET aging database and achieved an accuracy rate of 82%. In [27], Principal Component Analysis (PCA) was used to predict age. The system classified faces into 7 age classes ranging from 10 to 60 years old. PCA and geometric features were used to extract the features, and the system achieved 92.5% accuracy. The drawback of this system is that it only used the Euclidean distance between two points for feature extraction, which may vary between distant images and close-up images; its feature point localization technique does not give promising results on other datasets. To classify human age from facial images, biologically inspired features have been explored. Previously, a pyramid of Gabor filters was used in biologically inspired models at all image points, but the authors found that pre-trained models for the S2 layer, developed into the C2 layer, did not perform up to the mark in age estimation. In response, the authors in [28] proposed an STD operator to encode the ages of faces. The system was evaluated on two publicly available benchmark datasets, namely YGA and FG-NET, and the results improved using these advanced methods.
Using multi-face datasets, the authors in [29] estimated the age of real-life faces; Local Binary Pattern (LBP) and Gabor features were extracted and an SVM was used for classification. The system was evaluated using a large dataset named The Images of Groups and achieved an accuracy rate of 87.7%. The drawback of this model is that it can only deal with high-resolution single-face images. In [30], the authors used the Gallagher benchmark dataset to estimate age. LBP and FPLBP were used for feature extraction whereas an SVM was used to classify seven age classes, and an accuracy of 66.6% was achieved. The drawback of this model is that only MB-LBP is used for feature extraction, which is not sufficient to capture all the necessary information from the face image, and it thus achieves lower accuracy. In [31], the authors developed an automatic age and gender classification system using The Images of Groups dataset. The features were extracted using Multi-level Local Binary Pattern (ML-LBP); an SVM with a non-linear RBF kernel was used to classify the correct age.

Age Classification via Classical Deep Learning Algorithms
Age classification using deep learning methods promises better classification results than machine learning techniques. This section describes age estimation or classification using deep learning methods on both single- and multi-face age datasets. In [32], the authors proposed a system to estimate age groups categorized into 4 classes: babies, young, middle-aged and old adults. The system is divided into three phases, namely, location, feature extraction and classification. The location phase determines the positioning of the eyes, nose and mouth regions, while the Sobel edge detector is used to estimate the symmetry of the facial features and the face. As a result, three wrinkle features and two geometric features are obtained. A neural network is used for the classification of the age groups. Two classifiers are used in this system: the first uses geometric features to predict whether a face belongs to the baby class; if not, a second classifier uses wrinkle features to predict which of the remaining three age group classes it belongs to. The system achieved an accuracy rate of 81.58%. In [33], the authors developed an age estimation system based on age manifold analysis. Different learning methods are applied to achieve a sufficient embedding space, and a multiple linear regression function is applied to the low-dimensional manifold data. The experiments were done on a large number of dataset images and the results show that the system was very effective in age estimation. The drawback of this system is that it requires a large amount of data during age manifold analysis to train the system well enough to predict age correctly.
In [34], the authors proposed two novel approaches for age estimation. The first is a simple fusion of effective local appearance and texture descriptors to extract the features; the second is based on deep learning. The two approaches were evaluated using two large datasets, namely FRGC and MORPH, and experimental results show better accuracy compared to previous work. In [35], age estimation is based entirely on local and global features. An active appearance model is used to obtain the global features, while a discrete cosine transform is used to extract the local features. The local and global features extracted from the FG-NET aging database are combined and regression is used to predict the exact age of an individual. The drawback of this system is that the AAM can only fit frontal face images. In [36], the authors developed a system in which features are extracted and classified using a CNN. The system was tested on the OUI Adience dataset and achieved an accuracy rate of 62.34%. The system has some shortcomings in that it mislabels old people as young and young people as old; secondly, no pre-processing is done, which can lead to mis-prediction of human faces in an image.
Our proposed system provides promising results on both single- and multi-face datasets. We have pre-processed the datasets' images to remove noise in order to avoid misclassification. Our proposed robust ASM works well across different views of the face. Multi-texture-based and point-based feature extraction methods boost the classification accuracy of our system.

Materials and Methods
This section is subdivided into six phases. First, the face is detected using a YCbCr color segmentation model. Second, 35 landmarks are specified on the face using the connected components and ridge contour methods. Third, an Active Shape Model (ASM) is generated on the face using two approaches: three-sided polygon meshes and perpendicular bisection of a triangle. Fourth, feature extraction is done using two techniques: image representation, consisting of three sub-phases (an anthropometric model, cranio-facial development and interior angle formulation), and aging patterns, subdivided into two sub-phases (wrinkle detection and heat maps). Fifth, the Sequential Forward Selection (SFS) technique is used to select the most ideal set of features. Finally, the CNN model is used to classify faces into accurate age groups. Figure 1 depicts the overall proposed architecture of the HAC system.

Pre-Processing and Face Detection
The goal of image pre-processing is to avoid redundancy in the images and to suppress undesirable distortion. This enhances the image features required for further processing in later phases. In The Images of Groups and OUI Adience datasets, some of the images are not perfectly aligned. To horizontally align the faces of both datasets, we have used the code [37] available on GitHub. Secondly, background subtraction in the images of both datasets is done by median filtering with a 5 × 5 window to remove noise. The filtered image is then labeled and noisy regions are removed.
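As an illustrative sketch (not the authors' implementation), the 5 × 5 median filtering step above can be expressed with a simple NumPy sliding window; the function name `median_filter` and the edge-replication padding are assumptions:

```python
import numpy as np

def median_filter(img, k=5):
    """Apply a k x k median filter (edge-replicated padding) to a 2-D grayscale image."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            # Median of the k x k neighborhood centered at (i, j).
            out[i, j] = np.median(padded[i:i + k, j:j + k])
    return out

# A single salt-noise pixel in a flat region is suppressed by the 5 x 5 median.
img = np.zeros((9, 9))
img[4, 4] = 255.0          # isolated noise spike
filtered = median_filter(img, k=5)
print(filtered[4, 4])      # -> 0.0: the spike is removed
```

In practice a library routine (e.g., a dedicated median-filter function) would be used; the loop form is only meant to show the operation.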
The literature reveals that many segmentation techniques have been developed. A skin detector defines a binary decision in a color space representation: for a controlled acquisition, a threshold can be defined for each color component.
YCbCr color segmentation is not affected by variation of illumination, which is confined to the Y (luma) component. The description of the skin representation is based on the two chrominance components: Cb (blue difference) and Cr (red difference). Other color models like HSV and RGB are less appropriate for skin detection. In the RGB color model, the skin color region varies in all three channels; the histogram clearly shows that the skin color region is distributed across a large spectrum. The hue channel in HSV space does show discrimination of skin color regions: the H component (hue) for skin mostly lies between 0 and 0.1 and between 0.9 and 1.0 on a normalized scale of 0 to 1.
Each individual's skin color is unique. To get full coverage of skin pixels, the RGB image is converted to the YCbCr color space. This provides exact coverage of skin pixels and easily distinguishes between skin and non-skin pixels without being affected by illumination levels. The luma component of the YCbCr skin color segmentation model is formulated as [38]

Y = 0.299R + 0.287G + 0.11B (1)
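A minimal sketch of chrominance-based skin segmentation follows. It uses the standard JPEG/BT.601 conversion coefficients (0.587, 0.114), which differ slightly from the coefficients quoted from [38], and the Cb/Cr thresholds are commonly cited illustrative values, not taken from the paper:

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an (H, W, 3) uint8 RGB image to Y, Cb, Cr planes (JPEG convention)."""
    r, g, b = [rgb[..., i].astype(float) for i in range(3)]
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def skin_mask(rgb, cb_range=(77, 127), cr_range=(133, 173)):
    """Binary skin mask from chrominance thresholds; Y (luma) is deliberately ignored."""
    _, cb, cr = rgb_to_ycbcr(rgb)
    return ((cb >= cb_range[0]) & (cb <= cb_range[1]) &
            (cr >= cr_range[0]) & (cr <= cr_range[1]))

# A skin-like pixel vs. a pure green (non-skin) pixel.
img = np.array([[[200, 140, 120], [0, 255, 0]]], dtype=np.uint8)
print(skin_mask(img))  # -> [[ True False]]
```

Because the decision uses only Cb and Cr, changes in brightness (which affect only Y) do not move pixels in or out of the skin region.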

Landmark Localization
After face detection, the next phase is landmark localization on the face features, including the jaw area. Figure 2 shows the face detection and landmark localization results on the FG-NET dataset, the OUI Adience dataset and The Images of Groups dataset. To plot the landmarks on the eyebrows, eyes and lips, the image is first converted to a binary image [39]. A certain range is allotted to obtain the maximum connected face region. The above-mentioned facial features are then bounded by a blob, and the landmarks are localized on those features using the blob edge plotting technique [38] via Equation (3). For the nose, the ridge contour method [40] is used to mark the nose with 7 landmarks. For the chin, jaw area and head area, the middle point of the blob edges is evaluated using Equation (4). Figure 3 represents the landmark localization points, numbered 1 to 35.
In the following equations, p, q, r and s represent the edges of the bounding box; the midpoint of each edge, with endpoints (x1, y1) and (x2, y2), is calculated as [38]

(xm, ym) = ((x1 + x2)/2, (y1 + y2)/2) (3)

The midpoints between the above four points are calculated with the same averaging formula:

(xm, ym) = ((xp + xq)/2, (yp + yq)/2) (4)
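The edge-midpoint computation above can be sketched as follows; the function names and the axis-aligned box convention are assumptions for illustration:

```python
def edge_midpoints(x_min, y_min, x_max, y_max):
    """Midpoints p, q, r, s of the four edges of an axis-aligned bounding box."""
    cx, cy = (x_min + x_max) / 2.0, (y_min + y_max) / 2.0
    p = (cx, y_min)   # top edge midpoint
    q = (x_max, cy)   # right edge midpoint
    r = (cx, y_max)   # bottom edge midpoint
    s = (x_min, cy)   # left edge midpoint
    return p, q, r, s

def midpoint(a, b):
    """Midpoint between two landmark points: ((x1 + x2)/2, (y1 + y2)/2)."""
    return ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)

p, q, r, s = edge_midpoints(0, 0, 100, 60)
print(p, q, r, s)          # (50.0, 0) (100, 30.0) (50.0, 60) (0, 30.0)
print(midpoint(p, q))      # (75.0, 15.0)
```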

Active Shape Model
An Active Shape Model (ASM) is generated on the face using the 35 landmark points. In our proposed model, novel techniques are used for the generation of the ASM: three-sided polygon meshes and the perpendicular bisector of a triangle. An ASM is a robust technique for determining the age of an individual and has been used successfully in areas such as face recognition, face detection and face aging development. Given a face image and the corresponding landmark localization points, a multivariate model of shape variation can be generated from the points using polygon meshes and perpendicular bisectors of triangles. The shape of an Active Shape Model changes as age changes, and this fact can be used to accurately classify individuals into age groups. For facial features like the eyebrows, eyes and lips, the perpendicular bisection of a triangle is used to highlight the changes that occur in facial features from childhood till adulthood. Figure 4 shows the ASM on different images of the FG-NET dataset. Algorithm 1 shows the Active Shape Model formulation. After the Active Shape Model is mapped onto a face, we proceed to the feature extraction step. This includes ASM features, anthropometric features, cranio-facial development features, interior angle formulation features [41,42], heat maps and wrinkle detection. The next two sections describe feature extraction using image representation and aging patterns, respectively. Algorithm 2 explains the overall definition of the salient features extracted and the ensuing process.
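As one concrete realization of the perpendicular-bisector construction mentioned above (an illustrative sketch, not the authors' Algorithm 1): the perpendicular bisectors of a landmark triangle's sides intersect at its circumcenter, which can be computed in closed form:

```python
def circumcenter(a, b, c):
    """Intersection of the perpendicular bisectors of triangle abc (its circumcenter)."""
    ax, ay = a; bx, by = b; cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        raise ValueError("degenerate (collinear) triangle")
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
          + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
          + (cx**2 + cy**2) * (bx - ax)) / d
    return ux, uy

# Right triangle: the circumcenter lies at the midpoint of the hypotenuse.
print(circumcenter((0, 0), (4, 0), (0, 3)))  # -> (2.0, 1.5)
```

Tracking how such bisection points move relative to the landmark triangles gives a shape descriptor that varies with facial growth.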

Feature Extraction Using Image Representation
In this section, feature extraction using image representation is discussed in detail. This includes three modeling techniques: the anthropometric model, cranio-facial development and the interior angle formulation of polygons. Features are extracted based on the landmark localization points. These modeling techniques are described in detail in the subsequent sections.
In anthropometric modeling, the distances between the facial features are measured. Face anthropometry is the study of face sizes and proportions [43][44][45]. The landmark points are known by anatomical names. For instance, the midpoint between the inner corners of the eyebrows is known as the nasion, denoted by n; the nostril points are known as the alare, denoted by al; the point in front of the ear is known as the tragion, denoted by t; the lip corners are known as the cheilion, denoted by ch; and the chin point is known as the mental protuberance, denoted by mp.
Facial dimensions are taken by measuring the distances between the facial features and their angles of inclination. The shortest distance between two landmark points is the Euclidean distance. These distances cover eight facial feature dimensions: Bizygomatic breadth, Bigonial breadth, Menton-subnasale length, Subnasale-nasal root length, nose width, lip length, Bitragion-subnasale arc and Bitragion-menton arc [44]. Figure 5 shows all the anatomical names and facial dimensions. We take each landmark point as the reference point from which the distance to another landmark point is measured. The shortest distance between landmarks is calculated using the distance formula [44]

d = sqrt((x2 − x1)^2 + (y2 − y1)^2)

where (x1, y1) and (x2, y2) are the pixel locations along the x and y coordinates. The angle of inclination is the angle between a line and the x-axis, measured counterclockwise from the part of the x-axis to the right of the line [45,46]. The angle of inclination between the eyes and the chin bone point, and between the two chin points, is measured as

θ = tan−1((y2 − y1)/(x2 − x1))

where (x1, y1) and (x2, y2) are the coordinate points of the triangle. Figure 6 shows the angle of inclination.

Cranio-facial development theory is the study of human face growth from infancy to adulthood. With the passage of time, changes in bone growth and development, soft tissue morphology and skeletal patterns are clearly visible. The physiochemical process by which an organism becomes larger over time continues from childhood till adulthood. At birth, the dimensions of the face, from largest to smallest, are depth, height and width. Naturally, the growth of the face is most rapid in depth, followed by height, with the slowest rate of growth found in the width.
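The distance and angle-of-inclination measurements between landmarks can be sketched as follows (the landmark coordinates are illustrative values, not dataset measurements):

```python
import math

def euclidean(p1, p2):
    """Shortest (Euclidean) distance between two landmark points."""
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1])

def inclination_deg(p1, p2):
    """Angle (degrees) between the line p1->p2 and the positive x-axis,
    measured counterclockwise: theta = atan2(y2 - y1, x2 - x1)."""
    return math.degrees(math.atan2(p2[1] - p1[1], p2[0] - p1[0])) % 360.0

nasion, cheilion = (10.0, 20.0), (13.0, 24.0)   # illustrative pixel coordinates
print(euclidean(nasion, cheilion))      # -> 5.0  (a 3-4-5 triangle)
print(inclination_deg((0, 0), (1, 1)))  # -> 45.0
```

Using atan2 rather than a bare arctangent avoids division by zero for vertical lines and keeps the quadrant information.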
From infancy to adulthood, a person's facial development can be measured using the cardioidal strain transformation mathematical model. In this model, a circle is defined on the face to track its growth by finding variations in the radius and circumference of the face. Taking the nose tip as the central point of the face and finding the radius of the circle reveals major changes in the radius values across faces of individuals of all ages. The grown radius of the circle is measured as in [43]

R = C(1 + k(1 − cos θ))

where C is the circle's initial radius, k is a parameter that increases with time and θ is the initial angle formed with the vertical axis. Figure 7 shows cranio-facial development across ages.

An ASM is a mesh of polygons. From infancy to adulthood, the shape of these polygons changes along with their angles. To calculate the interior angles α, β and γ of these polygons, we use the law of cosines. For any number of given sides in a regular polygon, all the interior angles are the same, whereas for an irregular polygon the angles differ. By the law of cosines, c^2 = a^2 + b^2 − 2ab cos γ, so the interior angle can be formulated as [18]

γ = cos−1((a^2 + b^2 − c^2)/(2ab)) (8)

where a, b and c are the sides of the triangle. Figure 8 shows the different interior angle measurements for two different age groups of the OUI Adience dataset.
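The law-of-cosines angle computation and the cardioidal growth of the radius can be sketched as follows; the reconstructed form of the cardioidal strain, R = C(1 + k(1 − cos θ)), is an assumption based on the description of [43]:

```python
import math

def interior_angles(a, b, c):
    """Interior angles (degrees) of a triangle with side lengths a, b, c,
    via the law of cosines: gamma = arccos((a^2 + b^2 - c^2) / (2ab))."""
    gamma = math.degrees(math.acos((a**2 + b**2 - c**2) / (2 * a * b)))
    alpha = math.degrees(math.acos((b**2 + c**2 - a**2) / (2 * b * c)))
    beta  = 180.0 - alpha - gamma   # angles of a triangle sum to 180 degrees
    return alpha, beta, gamma

def cardioidal_radius(C, k, theta):
    """Grown radius under a cardioidal strain, R = C * (1 + k * (1 - cos(theta)))."""
    return C * (1.0 + k * (1.0 - math.cos(theta)))

print(interior_angles(3.0, 4.0, 5.0))          # -> approx. (36.87, 53.13, 90.0)
print(cardioidal_radius(100.0, 0.1, math.pi))  # -> 120.0 (maximal growth at theta = pi)
```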

Feature Extraction Using Aging Pattern
Over time, the texture of an individual's skin changes due to different factors like environment, stress, health and age. Age is the primary reason for texture changes, which can manifest as wrinkles, baggy skin, under-eye puffiness and many other signs. In this section, we discuss how changes in these aging patterns, from childhood till adulthood, can be measured by two techniques, namely, wrinkle detection and heat maps. These two aging detection techniques are described in detail in the subsequent sections.
For wrinkle detection, edges are the outer limits of structures in an image. Wrinkles are detected on face images using the Canny edge detection method, which yields the wrinkle edges in a binary image [47]. The wrinkles present in the binary image are the white pixels, so the quantity of white pixels corresponds to the quantity of edges in the regions of the face. Wrinkle information on the forehead area, around the left and right eyelids, under the eyes and around the lips is calculated first. These wrinkle features (WF) are calculated as in [48]

WF = {NF/N1, NLE/N2, NRE/N3, NU/N4, NL/N5}

where NF is the number of white pixels on the forehead, NLE is the number of white pixels around the left eyelid, NRE is the number of white pixels around the right eyelid, NU is the number of white pixels under the eyes, NL is the number of white pixels around the lips and N1, N2, N3, N4, N5 are the total numbers of pixels on the forehead, left eyelid, right eyelid, under the eyes and the lips, respectively. Figure 9 shows wrinkle formation on face images of The Images of Groups dataset.

Heat maps are an effective way to extract useful information that helps accurate age classification, whereas other global feature extraction methods lose useful information. The heat map HeatMap(z) captures changes in the texture of the human face as a matrix of values, categorized from dark colors to light colors over the identified face image. Dark colors identify parts of the image that contribute to the predicted class; light colors show parts contradicting the predicted class. After the extraction of the heat map values, we collect only the dark color values that exhibit the correct class prediction using a heuristic thresholding technique, and we place all extracted values in a 1-D array. The heat map is calculated using Equation (12) and example results are represented in Figure 10.
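The per-region wrinkle ratio (white edge pixels over total pixels) can be sketched on a toy binary edge map; the edge map here is hand-made for illustration, standing in for real Canny output:

```python
import numpy as np

def wrinkle_feature(edge_region):
    """Wrinkle density of one facial region: white (edge) pixels / total pixels,
    computed on a binary edge map such as Canny output."""
    region = np.asarray(edge_region, dtype=bool)
    return region.sum() / region.size

# Toy 4 x 5 forehead patch of a binary edge image: 6 white pixels out of 20.
forehead = np.array([[0, 1, 1, 0, 0],
                     [0, 0, 1, 1, 0],
                     [0, 0, 0, 1, 1],
                     [0, 0, 0, 0, 0]])
print(wrinkle_feature(forehead))  # -> 0.3
```

Applying the same ratio to the five regions (forehead, left/right eyelids, under-eyes, lips) yields the five-component WF vector.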
where HeatMap(z) specifies the heat map array vector and u expresses the index value of certain RGB pixels.

Feature Selection via Sequential Forward Selection

The SFS technique allows us to select the most distinctive features and to filter out irrelevant and redundant data that would decrease the overall classification accuracy of the age estimation model. It is a bottom-up approach that starts from an empty feature set S and gradually adds new features selected by an evaluation function. This reduces the mean square error and produces extended features. Feature selection methods have been widely used in biometrics and surveillance applications.
This feature selection technique helps to minimize the original feature space. The SFS feature selection procedure depends upon a selection criterion, or objective function, to extract the ideal features for the age estimation process. In this proposed work, two validation criteria are used. The first is used to approximate the separation between clusters of the age groups; for this, the Bhattacharyya distance measure is used. It measures the separation score S_{a,b} between two classes a and b, calculated as [49]

S_{a,b} = (1/8)(μ_a − μ_b)^T [(Σ_a + Σ_b)/2]^{−1} (μ_a − μ_b) + (1/2) ln(|(Σ_a + Σ_b)/2| / sqrt(|Σ_a||Σ_b|))

where μ_a and Σ_a are the mean and covariance of class a (and likewise μ_b and Σ_b for class b). For N classes, the overall separation score is computed as the sum of S_{a,b} over all pairs of classes [49]. For the estimation of different age groups for a given image, a validation-based evaluation criterion is proposed in order to derive feature subsets that reduce classification errors and provide higher inter-class separability across different age groups. For the OUI Adience dataset, the ideal features are: (1) ASM, (2) the anthropometric model, (3) cranio-facial development, (4) interior angles and (5) wrinkle detection. For The Images of Groups dataset, the ideal features selected are: (1) ASM, (2) the anthropometric model, (3) cranio-facial development, (4) wrinkle detection and (5) heat maps. Figure 11 shows the number of ideal features selected using the SFS algorithm.
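A minimal sketch of greedy SFS driven by a Bhattacharyya-style separation score follows. For simplicity it scores scalar (per-feature) Gaussians and sums over class pairs; treating features independently in this way is an assumption of the sketch, not the paper's exact criterion:

```python
import numpy as np

def bhattacharyya(mu_a, var_a, mu_b, var_b):
    """Bhattacharyya distance between two Gaussian classes (scalar features)."""
    v = (var_a + var_b) / 2.0
    return 0.125 * (mu_a - mu_b) ** 2 / v + 0.5 * np.log(v / np.sqrt(var_a * var_b))

def sfs(X_by_class, n_select):
    """Greedy Sequential Forward Selection: start from an empty set and
    repeatedly add the feature that maximizes the summed pairwise
    class-separation score of the selected subset."""
    n_feat = X_by_class[0].shape[1]
    selected = []

    def score(feats):
        total = 0.0
        for i in range(len(X_by_class)):
            for j in range(i + 1, len(X_by_class)):
                for f in feats:
                    a, b = X_by_class[i][:, f], X_by_class[j][:, f]
                    total += bhattacharyya(a.mean(), a.var() + 1e-9,
                                           b.mean(), b.var() + 1e-9)
        return total

    while len(selected) < n_select:
        best = max((f for f in range(n_feat) if f not in selected),
                   key=lambda f: score(selected + [f]))
        selected.append(best)
    return selected

rng = np.random.default_rng(0)
# Feature 1 separates the two "age groups"; features 0 and 2 are pure noise.
young = np.column_stack([rng.normal(0, 1, 200), rng.normal(0, 1, 200), rng.normal(0, 1, 200)])
old   = np.column_stack([rng.normal(0, 1, 200), rng.normal(5, 1, 200), rng.normal(0, 1, 200)])
print(sfs([young, old], 1))  # -> [1]: the discriminative feature is picked first
```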

Age Estimation Modeling
The salient features extracted by the above-mentioned techniques are passed to a Convolution Neural Network (CNN) to classify faces into the correct age groups over three benchmark datasets. CNNs are widely used for image-based deep learning applications [50][51][52] and provide higher classification accuracies than other deep learning methods due to their ability to extract and learn image-based features. A CNN also needs relatively few weights and biases to achieve high classification accuracy.
In our proposed 1-D CNN model, the feature set extracted from The Images of Groups dataset is organized as a 5080 × 536 matrix, which is used as the input to the CNN, where 5080 is the number of images and 536 is the length of each feature vector. Figure 12 depicts the structure of the 1-D CNN in the proposed work. The proposed CNN model has three convolution layers, three pooling layers and one fully connected layer; the prediction of the individual's age group is obtained via the fully connected layer. The input matrix in the first convolution layer Conv1 is convolved with 32 kernels of size 1 × 7, producing a 5080 × 530 × 32 matrix. The convolution is calculated as [53]

v_b^(a+1)(c, d) = ReLU( Σ_{ω∈Ω} Σ_y v_ω^(a)(c, d + y) k_b^(a)(y) + β_b^(a) )

where v_b^(a+1)(c, d) is the convolution result at coordinates (c, d) of the (a+1)th layer for the bth convolution map, Ω is the set of previous-layer maps and y ranges over the kernel positions [54]; k_b^(a) is the bth convolution kernel for the ath layer and β_b^(a) is the bth bias value for the ath layer. The ReLU activation function is applied to the weighted sum passed from the previous layer to the next layer. The second layer is the pooling layer Pool1, which down-samples the result of Conv1 to a 5080 × 265 × 32 matrix via 1 × 2 max-pooling: a 1 × 2 sliding window is applied to the result of the previous convolution layer, selecting the maximum value. The pooling result for the (a+1)th layer, bth kernel, row x and column y is represented as [53]

p_b^(a+1)(x, y) = max_{1 ≤ m ≤ z} v_b^(a)(x, (y − 1)z + m)

where z is the pooling window size. The same method is followed for the second convolution layer Conv2, which uses 64 convolution kernels of size 1 × 6, and 1 × 2 max-pooling is applied in the second and third pooling layers. The third pooling layer generates an output matrix of size 5080 × 63 × 128.
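The layer-by-layer feature lengths above can be checked with a short shape trace. The paper does not state Conv3's kernel size; a 1 × 5 kernel is assumed here because it is the size consistent with the stated 5080 × 63 × 128 output:

```python
def conv1d_out(n, kernel):
    """'Valid' 1-D convolution output length: n - kernel + 1."""
    return n - kernel + 1

def pool1d_out(n, window=2):
    """Non-overlapping 1-D max-pooling output length (stride = window)."""
    return n // window

n = 536                          # feature-vector length per image
n = conv1d_out(n, 7); print(n)   # Conv1 (32 kernels, 1 x 7)  -> 530
n = pool1d_out(n);    print(n)   # Pool1 (1 x 2 max-pooling)  -> 265
n = conv1d_out(n, 6); print(n)   # Conv2 (64 kernels, 1 x 6)  -> 260
n = pool1d_out(n);    print(n)   # Pool2                      -> 130
n = conv1d_out(n, 5); print(n)   # Conv3 (128 kernels; 1 x 5 assumed) -> 126
n = pool1d_out(n);    print(n)   # Pool3                      -> 63
```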
At the end, a fully connected layer is obtained as

$$x_{q}^{o+1}=f\Big(\sum_{a} w_{aq}^{o}\,x_{a}^{o}\Big),$$

where $w_{aq}^{o}$ is the matrix of weight values from the $a$th node of the $o$th layer to the $q$th node of the $(o+1)$th layer, and $x_{a}^{o}$ denotes the content of the $a$th node at the $o$th layer. Figure 13 shows the convergence plot of the CNN using 350 epochs over the age features of the different age groups.
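A minimal sketch of this final stage, assuming the 63 × 128 Pool3 output is flattened before the dense layer and a softmax produces the age-group probabilities (the text does not spell out either detail; the 7-way output matches The Images of Groups age bins):

```python
import numpy as np

def fully_connected(x, W):
    """Dense layer: W[a, q] weights node a of layer o to node q of layer o+1."""
    return x @ W

def softmax(v):
    """Numerically stable softmax over the class scores."""
    e = np.exp(v - v.max())
    return e / e.sum()

rng = np.random.default_rng(1)
pooled = rng.standard_normal(63 * 128)         # flattened Pool3 output (illustrative values)
W = rng.standard_normal((63 * 128, 7)) * 0.01  # 7 output nodes, one per age group
probs = softmax(fully_connected(pooled, W))
age_group = int(np.argmax(probs))              # predicted age-group index
```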

Experimental Results
This section is organized into five sub-sections. First, the three benchmark datasets are described in detail. Second, age classification accuracy is discussed. Third, our proposed work is compared with other state-of-the-art deep learning techniques. Fourth, the error resilience of our proposed Active Shape Model is compared with other ASM models and, finally, classification accuracy rates for the respective sets of features are reported using the CNN classifier.

Datasets' Descriptions
The Images of Groups dataset [55] is a multi-face dataset. It contains a total of 5080 images with 28,231 faces labeled with ages, making it the largest multi-face dataset. Each face is labeled according to one of seven age groups: 0-2, 3-7, 8-12, 13-19, 20-36, 37-65 and 66+. Some images of this dataset are shown in Figure 14. The second dataset used is the OUI Adience dataset [38], consisting of 26,580 cropped face images of 2284 individuals. Each face is labeled according to one of eight age groups: 0-2, 4-6, 8-13, 15-20, 25-32, 38-43, 48-53 and 60+. Figure 15 shows some examples from the OUI Adience dataset.
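For reference, the seven age bins of The Images of Groups dataset can be encoded as a simple lookup; this helper is purely illustrative and not part of the proposed pipeline (the final bin is open-ended):

```python
# Upper bounds of the Images of Groups age bins: 0-2, 3-7, 8-12, 13-19, 20-36, 37-65, 66+
GROUP_UPPER_BOUNDS = [2, 7, 12, 19, 36, 65]

def age_to_group(age: int) -> int:
    """Map an integer age to its 0-based group index (6 corresponds to '66+')."""
    for idx, upper in enumerate(GROUP_UPPER_BOUNDS):
        if age <= upper:
            return idx
    return 6
```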

Experiment I: Experimental Results Obtained Using the Proposed Model and the Other Three Competing Approaches over Benchmark Datasets
For age classification, we use a Convolution Neural Network as the age classifier, and the proposed system is evaluated with the Leave One Person Out (LOPO) cross-validation technique over three datasets, namely, The Images of Groups, OUI Adience and FG-NET. Under LOPO, the feature vectors of one person are held out for testing while the feature vectors of all remaining persons are used to train the model; this procedure is repeated until every person has served once as the test subject. In this way, the model is always evaluated on individuals it has not seen during training. We then applied the Convolution Neural Network and the other competing approaches to the OUI Adience dataset and the FG-NET dataset to obtain the age classification results. Table 2 shows the results of age classification over the OUI Adience dataset with a 92.62% mean accuracy rate, and Table 3 shows the results over the FG-NET dataset with a mean accuracy of 94.59%.

Experiment II: Comparison of the Proposed Active Shape Model with Other Models

In this experiment, our proposed Active Shape Model (ASM) is compared with other Active Appearance Models (AAM) and Active Shape Models (ASM). These models are compared in Table 4 on the basis of landmark points, model shape and mean absolute error (MAE). The results show that our proposed ASM yields a lower MAE than the other well-known models.
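The LOPO protocol used above can be sketched as a split generator keyed on person identity. The person IDs below are illustrative; any hashable label per sample works.

```python
import numpy as np

def leave_one_person_out(person_ids):
    """Yield (train_idx, test_idx) pairs: each unique person is held out
    once for testing while all remaining samples form the training set."""
    person_ids = np.asarray(person_ids)
    for person in np.unique(person_ids):
        test = np.flatnonzero(person_ids == person)
        train = np.flatnonzero(person_ids != person)
        yield train, test

# Toy example: 6 feature rows belonging to 3 people
ids = ["p1", "p1", "p2", "p3", "p3", "p3"]
splits = list(leave_one_person_out(ids))  # one split per unique person
```

Grouping the split by person, rather than by individual sample, is what guarantees the classifier is never tested on a face it was trained on.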

Experiment III: Comparison of Age Classification Performance Using CNN in Different Features Sets
In Experiment III, different sets of features are compared using the CNN classifier for age classification. Results show that combining ASM + anthropometric features (AF) + carnio-development features (CDF) + interior angle features (IAF) + wrinkle detection (WD) provides better accuracy than the other sets of features over the OUI Adience dataset and the FG-NET dataset. For The Images of Groups dataset, results show that combining ASM + anthropometric features (AF) + carnio-development features (CDF) + wrinkle detection (WD) + heat maps (HM) provides better accuracy than the other sets of features. Figures 18-20 show the results obtained for age classification using different sets of features over the OUI Adience dataset, the FG-NET dataset and The Images of Groups dataset, respectively.

Conclusions
In this research, we proposed a novel approach to determine and classify age from images of human faces. First, an image is pre-processed and faces are detected. After face detection, 35 landmark points are plotted on the face. With the help of these landmarks, an Active Shape Model (ASM) is mapped onto the face. Six feature sets are then used for feature extraction: the Active Shape Model (ASM), the anthropometric model, carnio-facial development, interior angle formulation, wrinkle detection and heat maps. After the feature extraction step, these features are passed to a Sequential Forward Selection algorithm so that the most ideal set of features can be selected for better age classification. Finally, a Convolution Neural Network (CNN) classifies each individual into the correct age group. The system was evaluated over three benchmark datasets, namely The Images of Groups dataset, which is a multi-face dataset, the OUI Adience dataset and the FG-NET dataset. The mean accuracies achieved on these datasets are 91.58%, 92.62% and 94.59%, respectively. In the future, we will evaluate these techniques for age estimation using RGB-D age datasets.