Multi-Class Classification and Multi-Output Regression of Three-Dimensional Objects Using Artificial Intelligence Applied to Digital Holographic Information

Digital holographically sensed 3D data processing, which is useful for AI-based vision, is demonstrated. Three prominent ways of learning from the data, namely from the sensed holograms themselves, from concatenated intensity-phase (whole information) images computationally retrieved from the holograms, and from phase-only (depth information) images, were utilized for the proposed multi-class classification and multi-output regression tasks of the chosen 3D objects in supervised learning. Each dataset comprised 2268 images obtained from the chosen eighteen 3D objects. The efficacy of our approaches was validated on experimentally generated digital holographic data and then further quantified and compared using specific evaluation metrics. The machine learning classifiers had better AUC values for different classes on the hologram and whole-information datasets compared to the CNN, whereas the CNN performed better on the phase-only image dataset than these classifiers. The MLP regressor was found to have a stable prediction in the test and validation sets, with a fixed EV regression score of 0.00, compared to the CNN and the other regressors for the hologram and phase-only image datasets, whereas the RF regressor showed a better performance in the validation set for the whole-information dataset, with a fixed EV regression score of 0.01, compared to the CNN and other regressors.


Introduction
Multi-class classification and multi-output regression tasks [1] are deep learning applications that produce a single output from multiple inputs. These supervised learning techniques [2] play a vital role in the development of artificial intelligence systems in which decision making is done through discrete and continuous labels by considering multiple inputs based on the criteria of the problem at hand. Studies have emerged in the areas of learning and decision making that used multi-class classification and multi-output regression tasks, such as Alzheimer's disease classification [3][4][5][6], food ingredient classification [7], river quality prediction [8], natural gas demand forecasting [9], drug efficacy prediction [10], prediction of the audio spectrum of wind noise (represented by several sound pressure variables) of a given vehicle component [11], real-time prediction of multiple gas tank levels of a Linz-Donawitz converter gas system [12], simultaneous estimation of different biophysical parameters from remote sensing images [13], and channel estimation via the prediction of several received signals [14]. However, these real-world problems [11][12][13][14] still face major challenges, such as missing feature/target values and the presence of noise due to the complexity of real domains. Despite these challenges, it has been proven that multi-output regression methods have a better predictive performance and computational efficiency [15]. Therefore, in the present work, we studied the implications of hologram datasets formed by recording 3D objects in an off-axis geometry at different distances; by post-processing the holograms, the datasets of the concatenated intensity-phase images and phase-only images were obtained. These three datasets were further passed through the deep CNN to perform five-class classification and regression tasks.
The five-class classification and regression tasks in supervised learning applied to the digital holographic information of the eighteen 3D objects using a deep CNN were equivalent to 3D object allocation and prediction performed on the digital holographic datasets that produced discrete and continuous labels as output, which justified the rationale behind the present work. The CNN was trained on all three datasets separately to generate the results. For the five-class classification, results such as loss/accuracy graphs of the training/validation sets, a confusion matrix, performance metrics, receiver operating characteristics (ROC), and precision-recall characteristics are shown for the validation of the present work. Similarly, for the five-class regression task, loss, mean square error (MSE), and mean absolute error (MAE) curves were plotted for both the training and validation sets, and performance metrics such as the MAE, R² score (coefficient of determination), and explained-variance (EV) regression score for the test/validation sets are shown for the confirmation of the work. Further, the CNN was compared with machine learning classifiers and regressors such as KNN, MLP, DT, RF, and ET separately using all three datasets for both of the tasks. The proof of the proposed concept was demonstrated using a real-time off-axis digital holographic experiment for sensing and retrieving the 3D object information, which was further processed using AI/ML techniques.

Theory
In this section, we describe the digital hologram sensing, retrieval, and processing of the 3D information using the deep CNN and machine learning algorithms.

Sensing and Retrieval of 3D Information of the Objects Using Off-Axis Digital Fresnel Holography
The construction and modeling of the 3D objects are shown in Appendix SA.1 of the Supplementary Materials. Figure 1 shows the schematic diagrams of four of the eighteen 3D objects used for the recording of the off-axis digital Fresnel holograms.
Figure 1. The 3D objects used in the off-axis digital holographic recording geometry: (a) circle-triangle; (b) square-rectangle; (c) square-pentagon; (d) pentagon-square. Circle: 2 mm in diameter; triangle: 2 mm in the x and y directions; square: 2 mm in the x and y directions; pentagon: 2 mm in the x and y directions; rectangle: 2 mm in the x direction and 1 mm in the y direction. The distance between the first plane and the second plane was 8 mm in the z direction.
The 3D objects shown in Figure 1 consisted of two different planes, namely the first plane and the second plane, which had different features and were separated by a distance of z = 8 mm. The construction of the remaining fourteen 3D objects was similar to that of the four 3D objects shown in Figure 1 but with different features on each plane [48]. In total, eighteen 3D objects were considered for the proposed five-class classification and regression tasks [49]. The 3D objects were characterized by their intensity and phase information. In the construction shown in Figure 1, when light passed through the first plane, the amplitude and phase information of the object with the features of the first plane were obtained. Then, after propagating by a distance of z using free-space propagation, the amplitude and phase information of the next object features in the second plane were also obtained.
Appendix SA.2 presents the details of the digital recording and numerical reconstruction of the holograms to obtain the complex 3D object wave information. Figure 2 describes the experimental setup (of the Mach-Zehnder digital holographic recording geometry in an off-axis scheme) used for the recording of the holograms of the 3D objects [49]. A He-Ne laser source with a wavelength λ = 632.8 nm was used here. The holograms were recorded by using a CMOS sensor with a square pixel pitch of 6 µm × 6 µm at an interference angle of θ = 1.4°. The size of each recorded hologram was 1600 × 1600 pixels. Next, the complex-wave retrieval method [50] was applied to the recorded holograms of the 3D objects to obtain the complex-wave fields of the objects at the recording plane. Further, an inverse Fresnel transform was applied on the retrieved complex-wave field to obtain a 2D digital complex-valued image at the object plane. The 2D digital complex-valued images contained 3D information in the form of the intensity and phase.
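The back-propagation step described above can be sketched numerically. The following is a minimal, illustrative NumPy sketch of a single-FFT Fresnel back-propagation from the recording plane to the object plane, assuming the complex field at the recording plane has already been obtained via the complex-wave retrieval method; the function name, the placeholder field, and the quadratic-phase formulation are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def inverse_fresnel(field, wavelength=632.8e-9, pitch=6e-6, z=0.180):
    """Illustrative single-FFT Fresnel back-propagation of a complex
    field from the hologram (recording) plane to the object plane."""
    n = field.shape[0]
    k = 2 * np.pi / wavelength
    x = (np.arange(n) - n // 2) * pitch
    X, Y = np.meshgrid(x, x)
    # Backward propagation: conjugate quadratic phase, then an inverse FFT.
    chirp = np.exp(-1j * k * (X**2 + Y**2) / (2 * z))
    return np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(field * chirp)))

recorded = np.ones((160, 160), dtype=complex)   # placeholder retrieved field
obj = inverse_fresnel(recorded)                 # d1 = 180 mm
intensity, phase = np.abs(obj) ** 2, np.angle(obj)
```

The wavelength (632.8 nm) and pixel pitch (6 µm) defaults match the experimental parameters quoted above.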
The intensity and phase information present in the 2D digital complex-valued images were extracted and united via concatenation to form concatenated intensity-phase (whole information) images; phase (depth)-only information was also extracted from the 2D digital complex-valued images to form phase-only images. The off-axis Mach-Zehnder holographic geometry was suitable for both transmitting and reflecting types of objects; the object beam arm could be modified appropriately for reflective objects or specimens. In the present paper, we modeled the 3D objects for use in transmission mode to demonstrate the proof of concept of the proposed application.
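As a concrete illustration of the concatenation step, the sketch below splits a 2D complex-valued image into its intensity and phase and joins them side by side; the horizontal axis of concatenation is inferred from the 1600 × 3200 concatenated image size reported later, and the random test field is a stand-in for a reconstructed object wave.

```python
import numpy as np

def whole_information(complex_img):
    """Form a concatenated intensity-phase ("whole information") image
    from a 2D complex-valued reconstruction."""
    intensity = np.abs(complex_img) ** 2        # amplitude information
    phase = np.angle(complex_img)               # depth information, in [-pi, pi]
    # Side-by-side concatenation: 1600 x 1600 -> 1600 x 3200.
    return np.concatenate([intensity, phase], axis=1)

rng = np.random.default_rng(0)
field = np.exp(1j * rng.uniform(0, 2 * np.pi, size=(1600, 1600)))
inph = whole_information(field)
print(inph.shape)  # (1600, 3200)
```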


Multi-Class Classification and Multi-Output Regression of 3D Objects Using Holographic Information
Appendix SA.3 shows the equations that governed the 3D object set formation of the datasets of the sensed holograms, concatenated intensity-phase images, and phase-only images. The holographic information of 3D objects can be processed using an AI-based approach in several ways. One method is to apply direct learning from the sensed hologram data. Another method is to learn the retrieved 3D object information from digital holograms; i.e., by forming a dataset of the retrieved intensities and phases combined to form concatenated intensity-phase images. Since the phase information contains the depth features, a phase-only information or phase-only image dataset can be learned to accomplish the tasks. In the present paper, we addressed the above approaches for the multi-class classification and multi-output regression tasks of the 3D objects in supervised learning by using a deep CNN and comparing the results with those of standard machine learning algorithms. The five-class classification and regression tasks in supervised learning applied to the digital holographic information of eighteen 3D objects using a deep CNN were equivalent to 3D object allocation and prediction performed on the digital holographic datasets that produced discrete and continuous labels as output, which justified the rationale behind the present work. The eighteen 3D objects considered for the above problem were classified into five different sub-classes (Class-a, Class-b, Class-c, Class-d, and Class-e) to perform the five-class classification and regression tasks using the following equations:
Class-a: {T_1} ∈ {a_di, b_di, c_di, e_di} (1)
Class-b: {T_2} ∈ {f_di, g_di, h_di, k_di} (2)
Class-c: {T_3} ∈ {l_di, m_di, n_di, o_di} (3)
Class-d: {T_4} ∈ {p_di, q_di, r_di} (4)
Class-e: {T_5} ∈ {s_di, t_di, u_di} (5)
where d_i represents the distance between the recording plane and the object plane, and i denotes the indices of the individual objects. The combined objects circle-pentagon (a_di), circle-triangle (b_di), circle-square (c_di), and circle-rectangle (e_di) were considered for Class-a.
The combined objects square-circle (f_di), square-triangle (g_di), square-rectangle (h_di), and square-pentagon (k_di) were considered for Class-b. The combined objects triangle-circle (l_di), triangle-square (m_di), triangle-rectangle (n_di), and triangle-pentagon (o_di) were considered for Class-c. The combined objects pentagon-circle (p_di), pentagon-square (q_di), and pentagon-triangle (r_di) were considered for Class-d. Finally, the combined objects rectangle-circle (s_di), rectangle-square (t_di), and rectangle-triangle (u_di) were considered for Class-e. The five-class classification and regression tasks of the 3D objects using the hologram dataset are shown in Equations (1)-(5). The five-class classification and regression tasks for the concatenated intensity-phase (whole information) image dataset were performed by using the following Equations (6)-(10), respectively:
Class-a: {RT_INPH1} ∈ {Ra_di,INPH, Rb_di,INPH, Rc_di,INPH, Re_di,INPH} (6)
Class-b: {RT_INPH2} ∈ {Rf_di,INPH, Rg_di,INPH, Rh_di,INPH, Rk_di,INPH} (7)
Class-c: {RT_INPH3} ∈ {Rl_di,INPH, Rm_di,INPH, Rn_di,INPH, Ro_di,INPH} (8)
Class-d: {RT_INPH4} ∈ {Rp_di,INPH, Rq_di,INPH, Rr_di,INPH} (9)
Class-e: {RT_INPH5} ∈ {Rs_di,INPH, Rt_di,INPH, Ru_di,INPH} (10)
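The class groupings above amount to a lookup table from combined-object name to class label; a minimal Python sketch follows (object names are taken from the prose, and the dictionary layout is illustrative).

```python
# Class membership of the eighteen combined 3D objects, mirroring the
# groupings described in the text; the dictionary layout is illustrative.
CLASS_MEMBERS = {
    "Class-a": ["circle-pentagon", "circle-triangle", "circle-square", "circle-rectangle"],
    "Class-b": ["square-circle", "square-triangle", "square-rectangle", "square-pentagon"],
    "Class-c": ["triangle-circle", "triangle-square", "triangle-rectangle", "triangle-pentagon"],
    "Class-d": ["pentagon-circle", "pentagon-square", "pentagon-triangle"],
    "Class-e": ["rectangle-circle", "rectangle-square", "rectangle-triangle"],
}

def label_of(obj_name):
    """Reverse lookup: combined-object name -> class label."""
    for label, members in CLASS_MEMBERS.items():
        if obj_name in members:
            return label
    raise KeyError(obj_name)

print(label_of("square-pentagon"))  # Class-b
```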
The five-class classification and regression tasks of the phase (depth)-only image dataset were performed by using Equations (11)-(15), respectively.
Class-a: {RT_PH1} ∈ {Ra_di,PH, Rb_di,PH, Rc_di,PH, Re_di,PH} (11)
Class-b: {RT_PH2} ∈ {Rf_di,PH, Rg_di,PH, Rh_di,PH, Rk_di,PH} (12)
Class-c: {RT_PH3} ∈ {Rl_di,PH, Rm_di,PH, Rn_di,PH, Ro_di,PH} (13)
Class-d: {RT_PH4} ∈ {Rp_di,PH, Rq_di,PH, Rr_di,PH} (14)
Class-e: {RT_PH5} ∈ {Rs_di,PH, Rt_di,PH, Ru_di,PH} (15)
The deep CNN was used to perform the five-class classification and regression tasks by employing the datasets of holograms, concatenated intensity-phase images, and phase-only images. Further, the five-class classification and regression tasks for the different digital holographic datasets performed using the deep CNN were compared with machine learning algorithms such as the K-nearest neighbor (KNN), multi-layer perceptron (MLP), decision tree (DT), random forest (RF), and extra trees (ET). In this way, the five-class classification and regression tasks were performed for the different digital holographic datasets using deep learning and machine learning frameworks. Figure 3 shows a block diagram of the CNN that was used to perform the five-class classification and regression tasks for the different digital holographic information, which consisted of datasets of holograms, concatenated intensity-phase (whole information) images, and phase (depth)-only images; i.e., the CNN took the inputs from all three datasets independently. Appendix SB.1 provides the details of the mathematical model of the CNN used for the multi-class classification and multi-output regression.

Architecture of CNN for Multi-Class Classification and Multi-Output Regression
Figure 3 describes the architecture of the CNN, which contained four convolutional layers, four pooling layers, fully connected layers, and an output layer. The classification stage was used for both the five-class classification and regression purposes; in the five-class regression, the classification stage was modified into a regression stage to implement the task. Each convolutional layer operated on its input and a kernel to generate the output, which was then further processed by the pooling layer. Here, each convolutional layer used the rectified linear unit (ReLU) activation function. The number of kernels was equal to 8 in the first convolutional layer, 16 in the second, 32 in the third, and 64 in the fourth. The kernel size was equal to three in all convolutional layers.
The pooling layer accepted the input from the convolutional layer to reduce the dimensionality of the feature map. The pooling technique used here was MaxPooling2D. The pooling layer did not affect the number of parameters because it only reduced the dimensionality of the feature map. After four successive stages of convolution and pooling operations in the feature-extraction stage, the final pooling-stage output was given to the classification stage to perform the five-class classification and regression tasks. The fully connected layer took the input from the fourth pooling stage; i.e., the 2D data were converted into 1D data through a flatten layer before being processed by the fully connected layer. The number of neurons in the fully connected layer was 16. The output layer received the input from the fully connected layer to perform the five-class classification and regression tasks. For the five-class classification, the softmax function was used in the output layer; for regression, the linear function was used. The number of neurons in the output layer for both the five-class classification and regression tasks was five. A summary of the proposed deep CNN model used for the five-class classification and regression tasks is shown in Table 1. Appendix SB.2 provides the details on the performance metrics used for the multi-class classification and multi-output regression tasks.
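The layer dimensions described above can be checked with a short back-of-the-envelope script. It assumes a single-channel 160 × 160 input, 'same' convolution padding, and 2 × 2 max pooling, none of which are stated explicitly in the text.

```python
# Back-of-the-envelope check of the feature-map sizes and parameter counts
# for the CNN described above. Assumptions (not stated in the text):
# single-channel 160x160 input, 'same' convolution padding, 2x2 max pooling.
def cnn_summary(size=160, in_ch=1, kernels=(8, 16, 32, 64), k=3,
                dense=16, classes=5):
    shapes, params, ch = [], 0, in_ch
    for out_ch in kernels:
        params += (k * k * ch + 1) * out_ch    # conv weights + biases
        size //= 2                             # 2x2 max pooling halves each side
        shapes.append((size, out_ch))
        ch = out_ch
    flat = size * size * ch                    # flatten layer: 2D maps -> 1D vector
    params += (flat + 1) * dense               # fully connected layer (16 neurons)
    params += (dense + 1) * classes            # 5-neuron softmax/linear output
    return shapes, flat, params

shapes, flat, params = cnn_summary()
print(shapes, flat)  # [(80, 8), (40, 16), (20, 32), (10, 64)] 6400
```

Under these assumptions, the feature map shrinks from 160 × 160 to 10 × 10 × 64 before flattening; the actual counts in Table 1 may differ if the paper used 'valid' padding or multi-channel inputs.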

Dataset Preparation
The datasets of the holograms, concatenated intensity-phase images, and phase (depth)-only images were created using the eighteen 3D objects considered in this study. Digital holograms of the eighteen 3D objects were created using the Mach-Zehnder off-axis digital holographic geometry shown in Figure 2. The eighteen 3D objects were used in the off-axis geometry to form 63 holograms with a size of 1600 × 1600 using a CMOS sensor at 15 different distances (d1 = 180 mm, d2 = 185 mm, …). Figure 4 shows the digital holograms of five of the 3D objects recorded at a distance of d3 = 200 mm.

The holograms of the eighteen 3D objects were used to obtain 2D digital complex-valued images using the complex-wave retrieval method [50]. The datasets of the holograms, concatenated intensity-phase images, and phase-only images were augmented by rotating each image in the respective datasets by 5°. Then, these three digital holographic datasets, each consisting of 2268 images, were passed through the deep CNN independently as shown in Figure 3 to perform the five-class classification and regression tasks. The hologram and phase-image size considered for the input of the CNN was 160 × 160, downsampled from 1600 × 1600. The concatenated intensity-phase image size considered for the input of the CNN was 160 × 160, downsampled from 1600 × 3200. The five-class classification and regression tasks of the hologram dataset were governed by Equations (1)-(5), respectively. The five-class classification and regression tasks of the concatenated intensity-phase image dataset were governed by Equations (6)-(10). Figure 5a shows the concatenated intensity-phase image of the object 'circle-pentagon' (Ra_d1,INPH) present in Class-a at a distance of d1 = 180 mm. Similarly, Figure 5b shows the concatenated intensity-phase image of the object 'square-rectangle' (Rh_d1,INPH) present in Class-b at a distance of d1 = 180 mm. Figure 6a shows the reconstructed phase image Rg_d1,PH of the object 'square-triangle' present in Class-b at a distance of d1 = 180 mm. Figure 6b shows the reconstructed phase image Rk_d1,PH of the object 'square-pentagon' present in Class-b at a distance of d1 = 180 mm. The five-class classification and regression tasks of the phase-only image dataset were also governed by Equations (11)-(15), respectively. Next, after performing the five-class classification and regression for the datasets of the holograms, concatenated intensity-phase images, and phase-only images separately using the deep CNN, the results were further correlated with those of machine learning algorithms such as KNN, MLP, DT, RF, and ET. The number of nearest neighbors considered for the KNN classifier and regressor was k = 5.
Figure 5. (a) Concatenated intensity-phase image of the circle-pentagon (Ra_d1,INPH) object, which belonged to Class-a; (b) concatenated intensity-phase image of the square-rectangle (Rh_d1,INPH) object, which belonged to Class-b.
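The dataset sizes are consistent with simple bookkeeping: 63 recorded holograms times 36 five-degree rotations gives 2268 images, and the per-class counts quoted for the split sum back to that total. The 36-rotation figure (5° steps over 0-175°) is an inference from these numbers, not a value stated in the text.

```python
# Bookkeeping for the dataset sizes quoted in the text. The number of
# rotations (36, i.e., 5-degree steps over 0-175 degrees) is an inference
# from 63 holograms x 36 = 2268 images, not a figure stated in the paper.
n_holograms = 63
n_rotations = len(range(0, 180, 5))        # 36 rotations of 5 degrees
n_images = n_holograms * n_rotations       # 2268 images per dataset

train = 340 * 4 + 341                      # Class-a..d: 340 each; Class-e: 341
val = 68 * 5                               # 68 images in each of the 5 classes
test = 45 * 4 + 47                         # Class-a..d: 45 each; Class-e: 47
assert train + val + test == n_images      # 1701 + 340 + 227 == 2268
print(round(train / n_images, 2),
      round(val / n_images, 2),
      round(test / n_images, 2))           # 0.75 0.15 0.1 (the 75:15:10 split)
```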
The MLP classifier and regressor, which consisted of a single hidden layer with ReLU as the activation function, were trained using an Adam optimizer with a learning rate and regularization rate (α) of 0.0003. The DT classifier and regressor were trained by setting max_depth = 2. The RF classifier and regressor were trained by setting n_estimators = 10, max_depth = None, and min_samples_split = 2. The ET classifier and regressor were also trained similarly to the RF classifier and regressor using the same parameters. For the five-class classification and regression tasks, all three digital holographic datasets were separated into training, validation, and test sets with respective proportions of 75:15:10. The training set consisted of 340 images each in Class-a, Class-b, Class-c, and Class-d, and 341 images in Class-e. The validation set consisted of 68 images each in all five classes. The test set consisted of 45 images each in Class-a, Class-b, Class-c, and Class-d, and 47 images in Class-e. For the five-class classification, the deep CNN was trained for 100 epochs using an Adam optimizer with a learning rate of 0.0003; categorical cross-entropy was used as the loss function. For the five-class regression, the training of the deep CNN was performed like that for the five-class classification with the mean square error (MSE) as the loss function; the metrics considered were the MSE and the mean absolute error (MAE). The learning rate of the deep CNN model for both tasks was fixed throughout the training. The deep CNN was implemented in Python using TensorFlow, and the machine learning classifiers and regressors were implemented using scikit-learn.
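Given the hyperparameters quoted above, the comparison models can be instantiated in scikit-learn roughly as follows; the MLP hidden-layer width is not stated in the text, so scikit-learn's default of 100 is used here as an assumption.

```python
# Instantiating the comparison classifiers with the hyperparameters quoted
# above. The MLP hidden-layer width is not given in the text; the value
# here (100) is scikit-learn's default and is an assumption.
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "MLP": MLPClassifier(hidden_layer_sizes=(100,), activation="relu",
                         solver="adam", learning_rate_init=0.0003,
                         alpha=0.0003),
    "DT": DecisionTreeClassifier(max_depth=2),
    "RF": RandomForestClassifier(n_estimators=10, max_depth=None,
                                 min_samples_split=2),
    "ET": ExtraTreesClassifier(n_estimators=10, max_depth=None,
                               min_samples_split=2),
}
```

The regressor counterparts (KNeighborsRegressor, MLPRegressor, DecisionTreeRegressor, RandomForestRegressor, ExtraTreesRegressor) accept the same hyperparameters.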

Training of CNN for Multi-Class Classification on Holograms
The CNN was trained on the hologram dataset using both the training and validation sets with a subset of 21/20 images in one epoch. The same process was repeated simultaneously with 81/17 steps for both sets in the remaining epochs. Figure 7 shows the loss/accuracy plot obtained by the CNN. Figure 7 shows that the validation error was higher than the training error and that the accuracy of the training set was greater than that of the validation set. This indicated that the CNN model did not fit the hologram dataset correctly (i.e., it was overfitting).
The testing of the CNN on the hologram dataset was performed separately with a batch size of 23 images. Figure 8 describes the multi-class confusion matrix obtained from the test set. Further, the performance of the multi-class classification was compared with that of machine learning classifiers such as KNN, MLP, DT, RF, and ET. The multi-class confusion matrices obtained by the KNN, MLP, DT, RF, and ET classifiers with a batch size of 23 images for the hologram dataset on the test set are also shown in Figure 8. In the figure, it can be observed that the confusion matrix was represented for multiple classes; i.e., for five classes. Appendix SB.3 provides the details on the general confusion matrix for the five classes.

The performance metrics obtained from the CNN, KNN, MLP, DT, RF, and ET classifiers for the hologram dataset are shown in Table 2. The macro average was obtained by averaging over all five classes for the respective labels. The micro average, weighted average, and samples average were calculated using the confusion matrix.
Table 2 shows that the CNN had a greater accuracy for Class-a compared to the other classes. The KNN and MLP classifiers had a greater accuracy for Class-b compared to the other classes.
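The averaging conventions used for these metrics are easy to make concrete. Below is a small NumPy sketch with an illustrative 3-class confusion matrix (rows = true class, columns = predicted class; this row/column convention is an assumption, and the matrix values are invented for the example).

```python
import numpy as np

def precision_averages(cm):
    """Per-class, macro-, and micro-averaged precision from a KxK
    confusion matrix (rows = true class, columns = predicted class)."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    predicted = cm.sum(axis=0)
    per_class = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    macro = per_class.mean()        # unweighted mean over classes
    micro = tp.sum() / cm.sum()     # pools all decisions (equals accuracy here)
    return per_class, macro, micro

cm = [[10, 2, 0], [1, 8, 3], [0, 2, 9]]   # toy 3-class confusion matrix
per_class, macro, micro = precision_averages(cm)
```

The weighted average follows the same pattern but weights each class by its number of true samples (the row sums).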
The DT, RF, and ET classifiers had a higher accuracy for Class-b and Class-e compared to the other classes. Table 3 shows the computational cost and complexity parameters, such as the floating-point operations (FLOPs), training time, and test time, for the CNN as well as for the other machine learning classifiers (KNN, MLP, DT, RF, and ET) on the hologram dataset. Table 3 shows that the number of FLOPs for the CNN was greater than that of the other machine learning classifiers; the training and test times of the CNN were also greater. The receiver operating characteristic (ROC) and precision-recall characteristic were also used to describe the performance of the five-class classification task. Figure 9 shows the ROCs obtained from the CNN, KNN, MLP, DT, RF, and ET classifiers for the hologram dataset. In Figure 9a, it can be seen that the CNN had a better area under the curve (AUC) value of 0.57 for Class-a compared to the other classes. Similarly, the KNN classifier had a better AUC value for Class-d compared to the other classes. The MLP classifier had equal AUC values for all five classes. The DT classifier had better AUC values for Class-a, -b, and -e compared to the other classes.
The RF and ET classifiers had better AUC values for Class-a, -b, -c, and -e compared to Class-d. Figure 10 describes the precision-recall characteristics obtained by the CNN, KNN, MLP, DT, RF, and ET classifiers for the hologram dataset. Figure 10a shows that the CNN had a lower precision as the recall approached unity for all five of the classes. Similarly, in the remaining precision-recall characteristics, the machine learning classifiers also had a lower precision as the recall approached unity for all five of the classes.
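The per-class AUC values discussed above come from one-vs-rest ROC analysis, which can be sketched as follows. The class scores below are randomly generated placeholders, not the paper's classifier outputs, so the printed AUC values are purely illustrative:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

# Synthetic one-vs-rest ROC computation for a 5-class problem; the class
# scores are random placeholders, not the paper's classifier outputs.
rng = np.random.default_rng(1)
n, n_classes = 100, 5
y_true = rng.integers(0, n_classes, n)
scores = rng.random((n, n_classes))
scores /= scores.sum(axis=1, keepdims=True)   # pseudo-probabilities

for k in range(n_classes):
    y_bin = (y_true == k).astype(int)         # class k vs. the rest
    fpr, tpr, _ = roc_curve(y_bin, scores[:, k])
    auc_k = roc_auc_score(y_bin, scores[:, k])
    print(f"Class-{'abcde'[k]}: AUC = {auc_k:.2f}")
```

With real classifier scores in place of the random ones, each `(fpr, tpr)` pair traces one curve of Figure 9 and `auc_k` is the reported per-class AUC.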

Training of CNN for Multi-Class Classification on Concatenated Intensity-Phase Image Dataset
The CNN was trained on the concatenated intensity-phase image dataset in the same manner as that of the hologram dataset. The loss/accuracy plot obtained by the CNN is shown in Figure 11, which depicts that the validation error was higher than the training error and the accuracy of the training set was greater than that of the validation set. Therefore, it can be said that the CNN model was overfitting.
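The overfitting diagnosis read off Figure 11 (and the analogous Figure 15) amounts to checking that the validation loss sits above the training loss and that the gap does not shrink. A minimal sketch with made-up loss histories shaped like the figure:

```python
import numpy as np

# Hypothetical loss histories (illustrative values, not the paper's):
# validation loss stays above training loss and the gap widens,
# a common signature of overfitting.
train_loss = np.array([1.5, 0.9, 0.5, 0.3, 0.2])
val_loss   = np.array([1.6, 1.2, 1.1, 1.15, 1.3])

gap = val_loss - train_loss
overfitting = bool(np.all(gap > 0) and gap[-1] > gap[0])
print("overfitting signature:", overfitting)
```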


The testing of the CNN on the concatenated intensity-phase image dataset was performed in the same manner as that of the hologram dataset. The multi-class confusion matrix obtained by the CNN is shown in Figure 12. Further, the CNN was compared with the machine learning classifiers, whose testing on the concatenated intensity-phase image dataset was performed in the same manner as that of the hologram dataset. Figure 12 also shows the confusion matrices obtained by the KNN, MLP, DT, RF, and ET classifiers for all five classes. Table 4 shows that the CNN, DT, RF, and ET classifiers had a higher accuracy for Class-a compared to the other classes, whereas the KNN and MLP classifiers had a greater accuracy for Class-b and -c.
Table 5 shows the computational costs and complexity parameters, such as the floating-point operations (FLOPs), training time, and test time, for the CNN and the machine learning classifiers (KNN, MLP, DT, RF, and ET) on the concatenated intensity-phase image dataset. In Table 5, it can be seen that the number of FLOPs for the CNN was high compared to that of the other machine learning classifiers, and that the training time and test time for the CNN were higher than those of the machine learning classifiers. Figure 13 shows the ROCs obtained from the CNN and the machine learning classifiers for all five classes on the concatenated intensity-phase image dataset.
Figure 14 shows the precision-recall characteristics obtained by the CNN and the machine learning classifiers for the concatenated intensity-phase image dataset.  Figure 14a shows that the CNN had a lower precision as the recall approached unity. Similarly, it can be said that the KNN, MLP, DT, RF, and ET classifiers had a lower precision as the recall approached unity.
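The drop in precision as recall approaches unity, visible in Figures 10 and 14, is expected for one-vs-rest curves on imbalanced splits: at full recall, precision falls toward the positive-class prevalence. A sketch with synthetic labels and scores (not the paper's outputs):

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# One-vs-rest precision-recall sketch (one class vs. the rest); the
# labels and scores are synthetic, not the paper's classifier outputs.
rng = np.random.default_rng(2)
y_true = (rng.integers(0, 5, 500) == 0).astype(int)   # ~20% positives
scores = 0.4 * y_true + 0.8 * rng.random(500)         # overlapping scores

precision, recall, _ = precision_recall_curve(y_true, scores)
# sklearn orders the curve from recall = 1 down to recall = 0, so the
# first precision value is the precision once every positive is recalled.
prec_at_full_recall = precision[0]
print(f"precision at full recall: {prec_at_full_recall:.2f}")
```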

Training of CNN for Multi-Class Classification on Phase-Only Information
The CNN was trained on the phase-only image dataset in the same manner as that of the hologram dataset. The loss/accuracy plot for both the training/validation sets obtained by the CNN for the phase-only image dataset is shown in Figure 15.

Figure 15 shows that the validation error was higher than the training error and that the accuracy of the training set was higher than that of the validation set, which indicates that the CNN model was overfitting. The testing of the CNN for the phase-only image dataset on the test set was performed in the same manner as that of the hologram dataset. The confusion matrix obtained by the CNN for the five classes is shown in Figure 16.
Further, the performance of the five-class classification task was described with the machine learning classifiers. The testing of the machine learning classifiers for the phase-only image dataset on the test set was performed in the same manner as that of the hologram dataset. Figure 16 also shows the confusion matrices for all five classes obtained by the KNN, MLP, DT, RF, and ET classifiers for the phase-only image dataset. The performance metrics obtained by the CNN, KNN, MLP, DT, RF, and ET classifiers on the phase-only image dataset are shown in Table 6. Table 6 shows that the CNN and RF classifiers had a greater accuracy for Class-e and Class-c compared to the other classes. The KNN and DT classifiers had a higher accuracy for Class-d compared to the other classes. The MLP classifier had a higher accuracy for Class-b compared to the other classes. The ET classifier achieved a higher accuracy for Class-a and Class-c compared to the other classes. Table 7 shows the computational costs and complexity parameters for the CNN and the machine learning classifiers on the phase-only image dataset. In Table 7, it can be seen that the number of FLOPs for the CNN was higher compared to those of the machine learning classifiers, and that the training time and test time for the CNN were greater than those of the machine learning classifiers. Figure 17 depicts the ROCs obtained from the CNN, KNN, MLP, DT, RF, and ET classifiers for the phase-only image dataset. Figure 18a shows that the CNN had a lower precision when the recall approached unity for all the classes; similarly, in the remaining precision-recall characteristics, the KNN, MLP, DT, RF, and ET classifiers also had a lower precision when the recall approached unity for all the classes.
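Per-class accuracies of the kind reported in Tables 2, 4, and 6 can be derived from a multi-class confusion matrix by treating each class one-vs-rest. The matrix below is a made-up 5-class example, not the data behind Figure 16:

```python
import numpy as np

# Hypothetical 5-class confusion matrix (rows: true, cols: predicted);
# the values are illustrative, not taken from the paper's figures.
cm = np.array([[40,  3,  2,  0,  5],
               [ 4, 38,  5,  2,  1],
               [ 1,  6, 35,  6,  2],
               [ 0,  4,  7, 36,  3],
               [ 3,  1,  2,  4, 40]])

# Per-class (one-vs-rest) accuracy: (TP + TN) / total.
total = cm.sum()
for k, label in enumerate("abcde"):
    tp = cm[k, k]
    fn = cm[k].sum() - tp          # true class k, predicted otherwise
    fp = cm[:, k].sum() - tp       # predicted class k, true otherwise
    tn = total - tp - fn - fp
    acc_k = (tp + tn) / total
    print(f"Class-{label}: accuracy = {acc_k:.3f}")
```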

Training of CNN for Multi-Output Regression on Holograms
The training of the CNN for the five-class regression on the hologram dataset was performed in the same manner as that of the five-class classification on the hologram dataset. Figure 19 shows the loss/MSE/MAE plot obtained by the CNN for the training/validation sets; it can be seen that the validation error was higher than the training error. The loss and MSE plots were the same for both training/validation sets. Further, Figure 19 shows that the validation MAE was greater than the training MAE. This showed that the CNN was not fitting correctly. The testing of the CNN was performed separately with a batch size of 23 images on the test set of the hologram dataset. Evaluation metrics such as the mean absolute error (MAE), R2 score (coefficient of determination), and explained-variance (EV) regression score were used to measure the performance of the five-class regression task. The evaluation metrics obtained by the CNN are shown in Table 8, where they are compared with those of the machine learning regressors; the evaluation metrics obtained by the KNN, MLP, DT, RF, and ET regressors with a batch size of 23 images on the hologram test set are also shown in Table 8. Table 8 shows that the MLP regressor had a better performance for the five-class regression tasks on the test set, with a stable EV regression score of 0.00, compared to the CNN and other regressors. The testing of the CNN on the validation set of the hologram dataset was also performed, with a batch size of 20 images. The evaluation metrics obtained by the CNN and the other machine learning regressors for the validation set are shown in Table 9; the machine learning regressors were likewise tested on the validation set with a batch size of 20 images for the hologram dataset.
Table 9 shows that the MLP regressor had a consistent performance on the validation set compared to the CNN and the other machine learning regressors, with a fixed EV regression score of 0.00.
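The three evaluation metrics of Tables 8 and 9 can be computed directly from their definitions; the targets and predictions below are invented for illustration only. Note that an EV score of 0.00 corresponds to predictions whose errors have the same variance as the targets (e.g., a constant prediction of the mean):

```python
import numpy as np

# Illustrative targets and predictions (not the paper's data).
y_true = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_pred = np.array([0.2, 0.9, 2.3, 2.8, 4.1])

mae = np.mean(np.abs(y_true - y_pred))               # mean absolute error
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot                           # coefficient of determination
ev = 1.0 - np.var(y_true - y_pred) / np.var(y_true)  # explained-variance score
print(f"MAE={mae:.3f}  R2={r2:.3f}  EV={ev:.3f}")
```

Since the EV score ignores the mean of the errors while R2 penalizes it, R2 is never larger than the EV score for the same predictions.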

Training of CNN for Multi-Output Regression on Concatenated Intensity-Phase Image Dataset
The training of the CNN for the five-class regression on the concatenated intensity-phase image dataset was performed in the same manner as that of the five-class classification on the hologram dataset. Figure 20 shows the loss/MSE/MAE plot obtained by the CNN on the training/validation sets. Figure 20 shows that the validation error was greater than the training error. The loss and MSE plots were the same for both the training/validation sets. Further, it can be seen that the validation MAE was greater than the training MAE. Therefore, it can be said that the CNN model was not fitting correctly. The testing of the CNN for the concatenated intensity-phase image dataset on the test set was performed in the same manner as that of the hologram dataset. The evaluation metrics obtained by the CNN are shown in Table 10, where they are compared with those of the machine learning regressors; the testing of the machine learning regressors for the concatenated intensity-phase image dataset on the test set was performed in the same manner as that of the hologram dataset. Table 10 shows that the MLP regressor had a better performance for the five-class regression tasks on the test set, with a fixed EV regression score of 0.00, compared to the CNN and the other machine learning regressors. The testing of the CNN for the concatenated intensity-phase image dataset on the validation set was performed in the same manner as that of the hologram dataset. The evaluation metrics obtained by the CNN and the other machine learning regressors on the validation set are shown in Table 11; the testing of the machine learning regressors for the whole information dataset on the validation set was performed in the same manner as that of the hologram dataset.
Table 11 shows that the RF regressor had a better performance on the validation set compared to the CNN and the other machine learning regressors, with a stable EV regression score of 0.01.
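Multi-output regression of the kind summarized in Tables 10 and 11 can be sketched with scikit-learn, whose tree-based regressors accept a 2-D target matrix natively. The data, shapes, and hyperparameters here are illustrative stand-ins, not the paper's:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import explained_variance_score

# Toy multi-output regression: each flattened input image maps to five
# continuous targets (data and shapes are illustrative placeholders).
rng = np.random.default_rng(3)
X = rng.random((120, 32))
Y = rng.random((120, 5))           # five outputs per sample

reg = RandomForestRegressor(n_estimators=20, random_state=0)
reg.fit(X[:100], Y[:100])          # 2-D targets handled natively
Y_pred = reg.predict(X[100:])
ev = explained_variance_score(Y[100:], Y_pred)
print(f"EV regression score: {ev:.2f}")
```

By default `explained_variance_score` averages uniformly over the five outputs, giving the single EV value reported per regressor in the tables.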

Training of CNN for Multi-Output Regression on Phase-Only Information
The training of the CNN for the five-class regression on the phase-only image dataset was performed in the same manner as that of the five-class classification on the hologram dataset. The loss/MSE/MAE plot for the training/validation sets is provided in Figure 21, which shows that the error for the training set was lower than the error for the validation set. The loss and MSE plots were the same, as depicted in Figure 21. Further, it can also be seen in Figure 21 that the validation MAE was higher than the training MAE. This showed that the CNN model was overfitting. The testing of the CNN for the phase-only image dataset on the test set was performed in the same manner as that of the hologram dataset. Next, the evaluation metrics obtained by the CNN on the test set were compared with those of the machine learning regressors; the evaluation metrics obtained from the CNN and the machine learning regressors on the test set are shown in Table 12. The testing of the machine learning regressors for the phase-only image dataset on the test set was performed in the same manner as that of the hologram dataset. Table 12 shows that the MLP regressor had a better performance for the five-class regression tasks on the test set, with a fixed EV regression score of 0.00, compared to the CNN, KNN, DT, RF, and ET regressors. The testing of the CNN for the phase-only image dataset on the validation set was performed in the same manner as that of the hologram dataset. The evaluation metrics obtained by the CNN and the other machine learning regressors on the validation set are shown in Table 13; the testing of the machine learning regressors on the validation set for the phase-only image dataset was performed in the same manner as that of the hologram dataset.

Table 13 shows that the MLP regressor had a good performance on the validation set compared to the CNN and the other machine learning regressors, with a stable EV regression score of 0.00.

Conclusions
In this paper, digital holographic information in datasets comprising holograms, reconstructed intensity and phase images combined to form concatenated intensity-phase images, and phase-only images was used for the proposed multi-class classification and multi-output regression tasks by using deep learning and machine learning techniques. Each dataset separately comprised 2268 images for the multi-class classification and multi-output regression tasks. A deep CNN was used on all three datasets independently to perform the five-class classification and regression tasks. The five-class classification and regression tasks in supervised learning, applied to the digital holographic information of eighteen 3D objects using a deep CNN, were equivalent to 3D object allocation and prediction performed on the digital holographic datasets, producing discrete and continuous labels as output, which justified the rationale behind the present work. For the five-class classification task, results such as the error/accuracy plots, error matrices, evaluation metrics, receiver operating characteristics (ROCs), and precision-recall characteristics were shown to confirm the work. Similarly, for the five-class regression task, results such as the error/mean square error (MSE)/mean absolute error (MAE) plots and evaluation metrics were shown to confirm the work. The CNN overfitted all three datasets, as shown by the error/accuracy graphs. The ML classifiers had better AUC values for different classes on the datasets of holograms and concatenated intensity-phase images when compared to the CNN. Further, the CNN was found to have a higher AUC value for all five classes on the phase-only image dataset when compared to the other machine learning classifiers. Similarly, the CNN overfitted all three datasets in the regression task, as shown by the loss/MSE/MAE curves on the training/validation sets.
Further, the MLP regressor had a better performance on the test/validation sets for the hologram and phase-only image datasets, with a fixed EV regression score of 0.00, compared to the CNN and the other machine learning regressors [51]. The RF regressor had a better performance on the validation set for the concatenated intensity-phase image dataset, with a stable EV regression score of 0.01, compared to the CNN and the other regressors [52]. Therefore, we conclude that the CNN and the machine learning classifiers and regressors (KNN, MLP, DT, RF, and ET) each performed well in both the five-class classification and regression tasks for all three datasets.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/s23031095/s1, Figure S1: 3D object used in the off-axis digital holographic recording geometry; Figure S2: Off-axis digital holographic recording geometry used for the recording of hologram of 3D object; Figure S3: General Multi-Class Confusion Matrix.
Author Contributions: U.M.R.N.: CNN architecture design, data collection, analysis and interpretation of results, manuscript preparation; A.N.: supervision, conceptualization, digital holographic experiment and informational retrieval, manuscript review and editing. All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by the Science and Engineering Research Board (SERB), Department of Science and Technology, Government of India, grant number CRG/2018/003906 and the APC was funded by VIT Chennai.
Institutional Review Board Statement: Not applicable.