Classification of LED Packages for Quality Control by Discriminant Analysis, Neural Network and Decision Tree

This study investigates supervised learning to improve the classification of LED packages for quality control. A hardware system for testing was built, and the data for learning were acquired and analyzed to show their characteristics. Each LED was tested, and the results were categorized into three defective LED groups and one normal LED group. Before classification, electrical and optical data were examined to identify their characteristics. To identify the best approach for quality control, three methods were compared. First, discriminant analysis achieved a 77.9% true positive rate for normal products on the validation data, which is inadequate for quality control. Second, neural network-based learning raised this rate to 97.8%, but the 2.2% false negative rate remained problematic. Finally, a binary decision tree was constructed, achieving a 99.4% true positive rate with just 14 splits, proving highly effective in product classification. The training times were 8.1, 18.2 and 8.2 s for discriminant analysis, the neural network and the decision tree, respectively. This work found that the binary decision tree is advantageous considering both learning and classification efficiencies.


Introduction
In recent years, in conjunction with the growth of the LED (light emitting diode) market for various applications, there has been an expansion into the market for LED lighting [1][2][3][4][5][6]. Consequently, the inspection of LED packages has become a pivotal concern within manufacturing processes [7]. The demands for increased speed and enhanced reliability persist in these inspection procedures. As high-power LED packages continue to be developed, various LED types cater to specific applications. This phase aligns with the preparations for mass producing high-efficiency LED products designed for illumination. As production scales, the necessity for deploying inspection equipment becomes increasingly vital, representing a pivotal moment for advancing reliability measurement tools, enhancing quality standards, and ensuring the objectivity of product quality validation.
There are several advanced LED packages such as flip-chip and wafer-level packages, but SMD (surface mount device) LEDs are still popular for a variety of applications. They have achieved a high-speed processing rate of approximately 20,000 units per hour (UPH). Packaged LEDs should go through inspection before shipping. To further improve speed and reliability, there is a pressing need for enhancements in inspection.
A decade ago, Perng et al. introduced an automated method for inspecting SMD LEDs using machine vision [8]. In a similar vein, Kuo et al. achieved an impressive recognition rate of 97.83% by first extracting features from images through clustering and then training a neural network [9]. More recently, Lin et al. introduced a method for detecting defective LEDs using convolutional neural networks (CNNs) from images of operational LEDs [10]. These techniques offer valuable approaches, streamlining the inspection process and extracting rich information from images. However, it is important to note that real-time image processing demands substantial computational power [11], leading to the continued reliance on classical optical and electrical measurement methods in actual mass production. While these methods alone may not suffice to ensure reliability, they can be complemented by other approaches that facilitate real-time measurements [12].
Smart manufacturing heavily relies on intelligent algorithms for decisions [13]. Recently, data-based quality management techniques have been applied in various cases, and their utility continues to rise. Among these techniques, machine learning is the most popular approach in fields such as design automation [14], semiconductor analysis [15] and modeling [16]. Especially for classifications, machine learning can be a reliable and scalable solution. The classification methods by machine learning have been implemented and tested for recycling [17], precision machining [18], welding [19,20], tools [21], bearing diagnostics [22], consumer parts [23], additive manufacturing [24], human action in manufacturing [25] and polymer processing [26].
In practice, quality controls of SMD LEDs are still conducted using rule-based expert systems. This prevents scale-up and automation of system building. To enhance efficiency, this study specifically examines three key algorithms: discriminant analysis (DA), a well-established statistical technique with a long history of application in data mining; neural networks (NNs), the predominant choice for artificial intelligence-based decision-making; and decision trees (DTs), known for their ability to perform systematic classification through learning. These methods have been applied to classification in manufacturing for automation and smart processes [27,28].
A dedicated automated testing apparatus, custom-tailored for the examination of SMD LEDs for lighting, has been developed for integration into the inspection line, in alignment with the methodology devised in this study. The SMD package in this work includes two LED devices. The package is considered normal when both the LEDs are bright as designed. However, some packages can be fully dark, evenly dimmed, or partly dark. Each defective LED should be sorted into its respective category, while normal ones are classified accordingly. After inspection, the defective LEDs are either repaired, repurposed, or discarded.
Datasets for training and validation were collected to create a robust classification system for the inspection process. Before learning, the data were scrutinized in detail to understand the characteristics of the measurements. Once the classification systems based on the aforementioned three methods were learned on the training dataset, their performance was evaluated on a separate validation dataset.

Experimental System
As previously mentioned, this study aimed to transform the pre-existing expert system for quality control into a learning-based system that prioritizes scalability and reliability. Thus, we employed conventional measurement methods to gather learning data, ensuring seamless integration with the existing inspection line. An instrument for automated LED inspection was built as shown in Figure 1. Conventionally, both electrical and optical tests are conducted to check the quality of an LED package [29]. Commercialized electrical and optical testers (LX4652C, SR2000A, Teknologue, Kawasaki, Japan) were chosen and assembled into the system. The electrical tester was a voltage measurement device with precision current source control. In most cases, a tester with the µA-A range is sufficient for LEDs. A specified operational current rating is provided as a dominant specification because LEDs operate based on current, and the forward voltage is a resulting property as part of the product data [6]. Note that there have been attempts to use CCD images only [30], but spectrometers are usually required for reliable inspections.
In a mass production line, a pick-and-place device is required for inspection [31]. An automated pick-and-place device connected to a feeder was designed and implemented for automatically measuring the electrical and optical features. This work considered a white LED package with dual chips. This LED had only two electrodes exposed, allowing it to receive power for operation. Figure 2 shows the insulated jig that handles the DUT (device under test). The spring inside the jig applied the required force for contact between the LED electrodes and the test probes. With a durable palladium-coated electrical sensing probe, it was possible to quickly measure and record voltage in situations where current is controlled. As shown in Figure 2, the electrodes of the probe and LED were pressed together using spring force to measure the voltage under reliable conditions. Pneumatic control allowed for either contact or disconnection between the LED and probe electrodes. As mentioned earlier, the electrical tester used in this experiment could rapidly measure voltage under controlled current conditions, enabling voltage measurements across various current conditions. Additionally, the optical characteristics of the LED package could be simultaneously measured during operation voltage application. Further details of the equipment are omitted here. This sort of inspection system is quite widely used, but the classification algorithm was constructed manually based on the expertise of engineers. Conventionally, both electrical and optical data are used in LED inspection for quality control according to IEC 60081 [32].
Micromachines 2024, 15, x FOR PEER REVIEW 3 of 22

Electrical Test
The current-voltage characteristic is the most fundamental and informative electrical characterization of an LED. The measured electrical data represents the device health, which is crucial for both development and manufacturing processes. During device development, the full I-V curve is measured by sweeping the current over the range of interest. This is required to understand the device's behavior and performance [33]. For a single junction device, the current range can be very wide, spanning from tens to thousands of mA depending on the LED power, while the operating voltage ranges from about 1.5 to 4 V. The dynamic resistance is defined as the reciprocal of the slope of the I-V curve above the turn-on voltage. It is a measure of the bulk resistances of the semiconductors and metals in the LED device. In manufacturing, testing throughput remains a significant concern. To expedite testing, only specific predefined points on the I-V curve are measured. If these measured data points fall outside the specified range, the device is deemed defective. In this study, we measured voltages at six controlled currents, as illustrated in Figure 3. The details about the six measurement points are presented below.
A PN junction in an LED requires a certain turn-on (knee, threshold) voltage to begin conducting substantial current. The LED is operated at a forward voltage beyond the turn-on voltage while current flows through it. Typically, for standard silicon-based LEDs, the turn-on voltage is around 0.7 to 1.2 volts, reflecting interfacial properties in the LED device. Below the turn-on voltage, the LED behaves like an open circuit, and virtually no current flows. If an LED is defective, causing noticeable current flow below the turn-on (usually between 0.5 and 1.2 V), the LED is regarded as leaky [34]. The voltage at 1 mA denoted as VF1 in Figure 3 is used to judge the leakiness. On the reverse voltage side, little to no current may flow. LEDs are not purposely operated in this region (reverse bias) [31]. However, this reverse voltage can be used in testing as a metric of defectiveness. The negative voltage, VR, is measured at −5 mA as shown in Figure 3 [35]. In addition, four more forward voltages, VF2, VF3, VF4 and VF5, are measured at the currents indicated in Figure 3. The resistive region where an increase in voltage leads to a decrease in current increase rate is often observed in curves, and this is explained by the effect of an increase in resistance due to elevated temperature [36]. In this way, measuring the forward voltage for various current values has been proven useful for verifying the normality of the operating voltage. If the LED is functioning correctly, it is expected that VF2, VF3 and VF4 will be measured within a certain range, ideally reproducing its characteristic curve like Figure 3.
Additionally, for a healthy LED, an increase in current beyond the operating voltage range does not significantly affect the forward voltage. Therefore, if significant variations are observed in VF4 and VF5, there is a higher likelihood that the LED is defective.
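The range check and the dynamic resistance described above can be sketched in a few lines of Python. This is only an illustration: the voltage windows in `SPEC_WINDOWS`, the function names and the sample currents are hypothetical placeholders, since the limits used on the actual inspection line are not given here.

```python
# Hypothetical spec windows (volts) for the six measured points.
# These numbers are placeholders for illustration, not values from the study.
SPEC_WINDOWS = {
    "VF1": (2.3, 2.7),
    "VF2": (2.6, 2.9),
    "VF3": (2.7, 3.0),
    "VF4": (2.9, 3.3),
    "VF5": (2.9, 3.3),
    "VR": (-6.0, -4.0),
}


def out_of_spec(measured, windows=SPEC_WINDOWS):
    """Return the names of the measurement points outside their window."""
    failed = []
    for name, (low, high) in windows.items():
        value = measured[name]
        if not (low <= value <= high):
            failed.append(name)
    return failed


def dynamic_resistance(v_a, i_a, v_b, i_b):
    """Approximate dV/dI (ohms) from two I-V points above the turn-on voltage."""
    if i_a == i_b:
        raise ValueError("the two currents must differ")
    return (v_b - v_a) / (i_b - i_a)


if __name__ == "__main__":
    sample = {"VF1": 1.9, "VF2": 2.7, "VF3": 2.8, "VF4": 3.0, "VF5": 3.05, "VR": -5.0}
    # A low VF1 would be flagged, which is how a leaky device shows up.
    print(out_of_spec(sample))
    print(dynamic_resistance(3.0, 0.10, 3.1, 0.15))
```

A rule-based expert system is essentially a hand-tuned collection of such windows, which is the part the learning-based classifiers in this work aim to replace.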

Optical Test
In this work, the peak wavelength λp and the luminous intensity Iv were measured [37]. The peak wavelength signifies the wavelength at which an LED emits the greatest power [38]. It is determined from a chart depicting power density versus wavelength, where the peak wavelength corresponds to the point where the radiation energy in the spectrum is at its maximum. The popularity of the peak wavelength arises from its visually intuitive nature, making it easy to identify. Naturally, any alteration in this value can lead to variations in the characteristics of the LED.
The luminous intensity is the luminous flux emitted per unit solid angle in a given direction from the DUT and is conventionally expressed in candela (cd) [6,39]. Additionally, x and y values in the CIE 1931 diagram, as presented in Figure 4, were measured to capture the variation in colors. As a result, four optical measurement values were obtained, and when combined with the electrical test results for a single LED, a total of ten features were acquired.
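As a small illustration of how the peak wavelength is read from a spectrum, the following Python sketch picks the wavelength at which the sampled power density is largest. The function name and the sample spectrum are hypothetical, and a real spectrometer would typically interpolate between samples.

```python
def peak_wavelength(wavelengths_nm, power_density):
    """Return the wavelength (nm) at which the sampled power density is largest."""
    if not wavelengths_nm or len(wavelengths_nm) != len(power_density):
        raise ValueError("inputs must be non-empty and of equal length")
    peak_index = max(range(len(power_density)), key=power_density.__getitem__)
    return wavelengths_nm[peak_index]


if __name__ == "__main__":
    # Hypothetical coarse spectrum of a white LED (blue pump plus phosphor band).
    wavelengths = [440, 450, 460, 550]
    density = [0.2, 1.0, 0.6, 0.4]
    print(peak_wavelength(wavelengths, density))
```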

Learning Data
Differentiating between satisfactory and faulty LED packages relies on the qualitative standards established for these components. In the context of crafting measurement devices for these components, only imperfections such as component breakage and insufficient luminosity were regarded as defects. This investigation encompassed three grades of defects, designated as D1, D2 and D3, as depicted in Figure 5. LED packages that do not respond to the current were categorized as D1. Partially luminous packages were classified as D2, while uniformly dim ones fell under the D3 category. Other LEDs were classified in the Normal category. The initial categorization was performed using a rough criterion based on Iv. Subsequently, multiple technicians conducted exhaustive examinations at least twice to finalize the categorization of both the training and validation data.
In the process of gathering training and validation data, the classification procedure was manually executed, with automated image acquisition with a CCD camera providing additional support. These four categories aligned with the classification outcomes to be predicted. Within this work, supervised learning was employed, utilizing the ten previously mentioned electrical and optical test features, to predict the specific category to which an LED belongs. A total of 8600 tests were conducted to collect data for both learning and validation purposes, with 6000 tests allocated for learning and 2600 for validation. While learning, K-fold cross-validation was conducted with K = 5. Note that the separate validation data were not used during K-fold cross-validation.
In the current investigation, the classification methods to be presented in the following sections were implemented by the ClassificationLearner and Deep Learning Toolbox in MATLAB 2023a. The learning and classification were conducted on a Windows 11 machine with an x64-based CPU (AMD Ryzen Threadripper PRO 3995WX, 16 physical cores, 3.9 GHz).
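The 5-fold cross-validation on the 6000 learning samples can be pictured with the following Python sketch. It is not the MATLAB implementation used in this work; the function name, the shuffling and the seed are assumptions made for illustration. The 2600 validation samples are simply never passed to it.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, held_out_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Distribute the shuffled samples as evenly as possible over the k folds.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        held_out = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, held_out


if __name__ == "__main__":
    for train, held_out in k_fold_indices(6000, k=5):
        # Each round trains on 4800 samples and checks on the remaining 1200.
        print(len(train), len(held_out))
```

Each learning sample is held out exactly once, so the cross-validated score uses all 6000 samples while the separate validation set stays untouched for the final evaluation.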

Discriminant Analysis
In order to check if the tested LEDs can be classified by a simple statistical method, discriminant analysis (DA) was employed. Discriminant analysis has been adopted in classification processes for manufacturing. The method is renowned for classification of the new observations based on the prebuilt predictor variables. Several studies have utilized LDA (linear discriminant analysis) and QDA (quadratic discriminant analysis) for the purpose of categorizing defects based on data collected during the manufacturing process [18,40–42].
Linear DA (LDA) can be described briefly as follows. It forms new variables, called canonical variables, the k-th of which is [43,44]
$$V_k = \sum_{i=1}^{q} u_{ik} x_i = u_{1k} x_1 + u_{2k} x_2 + \cdots + u_{qk} x_q \qquad (1)$$
where $\mathbf{x} = [x_1, x_2, \ldots, x_q]^{T}$ is an observation vector with q members for a test. In this work, q = 10, and $\mathbf{x}$ collects the six electrical and four optical features:
$$\mathbf{x} = [\mathrm{VF}_1, \mathrm{VF}_2, \mathrm{VF}_3, \mathrm{VF}_4, \mathrm{VF}_5, \mathrm{VR}, \lambda_p, I_v, x_{\mathrm{CIE}}, y_{\mathrm{CIE}}]^{T}$$
The coefficients $u_{ik}$ in Equation (1) are set to find the linear combination of the q variables that maximizes the ratio of between-group to within-group variation. The formed canonical variates can then be used to discriminate between groups. In LDA, the number of the canonical variables is usually set as one smaller than the number of classes, N. As there are four classes shown in Figure 5, three variables, $V_1$, $V_2$ and $V_3$, were evaluated during the LDA. This work also performed QDA, but the details are omitted here [44].
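To make Equation (1) concrete, the Python sketch below evaluates the canonical variables for one observation once the coefficients are known. Finding the coefficients themselves (maximizing the between-group to within-group ratio) is not shown; the function name and the small numbers in the example are hypothetical.

```python
def canonical_variables(x, u):
    """Compute V_k = sum_i u[i][k] * x[i] for every canonical variable k.

    x is a list of q features; u is a q-by-m table of coefficients,
    where m is the number of canonical variables (N - 1 for N classes).
    """
    q = len(x)
    if len(u) != q:
        raise ValueError("u must have one row per feature in x")
    n_vars = len(u[0])
    return [sum(u[i][k] * x[i] for i in range(q)) for k in range(n_vars)]


if __name__ == "__main__":
    # Toy case with q = 3 features and 2 canonical variables (placeholder values).
    x = [1.0, 2.0, 3.0]
    u = [[1.0, 0.0],
         [0.0, 1.0],
         [1.0, 1.0]]
    print(canonical_variables(x, u))
```

In this work the same projection would use q = 10 features and three columns of coefficients, giving the three canonical variables for each tested LED.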

Neural Network
Neural networks for classification are a type of machine learning model that can be trained to categorize input data into different classes. In this context, the network learns to establish a relationship between input features and class labels. A common architectural setup includes using the rectified linear unit (ReLU) as the activation function to mitigate the vanishing gradient problem [45] and employing Softmax at the output layer to transform the outputs into meaningful probability values [46]. Figure 6 shows a typical shallow NN for classification, which was employed in this work [47].
Furthermore, in the final classification layer, the cross-entropy method was employed. Cross-entropy is a widely used loss function for classification tasks. It quantifies the dissimilarity between the predicted probability distribution of a class and the actual probability distribution of that class. When training a classification model, the objective is to minimize the cross-entropy loss, signifying that the model should learn to predict the correct class label with the highest probability. If the value of the j-th output neuron is $Y_j$, and its corresponding target value is $T_j$, the loss function is expressed by the following equation [48]:
$$L = -\sum_{j=1}^{N} T_j \ln Y_j$$
Here, we present the classification results when $L$ is minimized.
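The three building blocks named above (ReLU, Softmax and the cross-entropy loss) can be written out directly. The Python sketch below is a minimal illustration for a single observation with four classes; it is not the Deep Learning Toolbox implementation, and the function names and example scores are assumptions.

```python
import math


def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw output scores into probabilities that sum to one."""
    shift = max(values)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(y, t):
    """L = -sum_j T_j * ln(Y_j) for predicted probabilities y and targets t."""
    return -sum(tj * math.log(yj) for yj, tj in zip(y, t) if tj > 0)


if __name__ == "__main__":
    scores = [2.0, 1.0, 0.1, -1.0]   # hypothetical outputs for D1, D2, D3, Normal
    target = [1.0, 0.0, 0.0, 0.0]    # one-hot label: the true class is the first
    probabilities = softmax(scores)
    print(probabilities, cross_entropy(probabilities, target))
```

With a one-hot target, the loss reduces to the negative log of the probability assigned to the true class, so it approaches zero as that probability approaches one.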

Decision Tree
A decision tree is a supervised learning algorithm primarily utilized in classification tasks [49]. It involves a predefined target variable and is tailored for solving such problems. Especially in manufacturing processes, DTs are particularly well suited for classification tasks as they are purposefully designed for this specific purpose [50][51][52][53][54]. The data need to be labeled with a single categorical response variable, which corresponds to a process result. As previously described, each LED examined in this study must be assigned one of the categories: D1, D2, D3 or Normal. Leaf nodes correspond to potential values of the response variable, while non-leaf nodes correspond to a condition by a predictor feature. Each feature node divides a set of instances into subsets with the root node encompassing all instances. The key question is how this identification process takes place, including variable selection and partitioning. Decision trees employ various algorithms to accomplish this task.
During tree construction, the primary aim is to minimize the complexity, or impurity, at each node. In essence, the goal is to create a tree with a low impurity index, which quantifies the mixing degree within the leaf (result) node. Three common impurity indices are utilized: entropy, the Gini diversity index and the information gain ratio. The Gini diversity index is frequently favored due to its sensitivity to misclassification, computational efficiency and robustness in response to outliers. This work employed the following Gini index to check the impurity [55]:
$$G = 1 - \sum_{k=1}^{N} p_k^{2}$$
Here, $p_k$ is the probability of being classified into the k-th class. In this work, a binary tree was constructed by splitting the observations at each node of the tree based on the choice that minimizes the aforementioned value, following the standard CART algorithm [56,57]. Through the repetition of this process, a classification tree that is well fitted to the training data could ultimately be generated. At a specific node, the split was performed through the application of a binary judgment to a single measurement variable. Once the tree was fully built, if new data were input, the classification response was obtained by following the tree until the leaf node was reached.
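The Gini index and a single CART-style split can be sketched as follows. This Python fragment is only a simplified illustration of one node: it scans one feature for the threshold that minimizes the weighted Gini index of the two children, whereas the full algorithm repeats this over all features and nodes. The function names and sample values are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini diversity index G = 1 - sum_k p_k^2 of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best split 'value < threshold'.

    The threshold is None when no split improves on the unsplit node.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical values cannot be separated by a threshold
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


if __name__ == "__main__":
    # Hypothetical normalized luminous intensities with their labels.
    iv = [0.1, 0.2, 0.9, 1.0]
    classes = ["D1", "D1", "Normal", "Normal"]
    print(best_split(iv, classes))
```

A pure node has G = 0, so a split that separates the classes completely, as in this toy example, yields a weighted index of zero and needs no further splitting.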
The current configuration employed a maximum limit of 100 splits. This work additionally examined whether the number of measurement variables could be reduced by performing the classification with only the optical or only the electrical data. Every measurement carries a cost, so reducing the number of measurements can lower the cost of quality control.

Observation of Learning Data
In conventional quality control frameworks, human-derived rule-based expert systems are prevalent. However, they struggle with scalability, timely updates and full automation. At times, the data inherently exhibit patterns that suggest rules, which can streamline the construction of an expert system. Thus, to assess the viability of a learning-based model, it is essential to analyze the correlations within the gathered data. A comprehensive analysis is not required, since specific subsets of the data are enough to describe their characteristics.
First, the electrical measurement data are examined. As plotted in Figure 7, a notable increase in current is evident among VF1, VF2 and VF3, accompanied by a slight rise in voltage difference. For VF4, the voltage exhibits a relatively substantial increase in comparison to VF3. VF4 and VF5 are difficult to distinguish on the plot due to the low current difference. In general, a resemblance to the pattern illustrated in Figure 3 is observed. In Figure 8, VF2 is plotted relative to VF1. In the range measured with low voltage, there appears to be a correlation between the two voltages. However, near the normal operating range, VF2 maintains a consistent voltage even as VF1 varies. Within the normal operating range, VF1 is observed near its maximum value, while VF2 is predominantly clustered around the observed maximum voltage. The normal value in the figure designates the target operational point of the device. The variation in VF3 with respect to VF1, presented in Figure 9, is similar to that depicted in Figure 8.
From this, it can be predicted that the behaviors of VF3 and VF2 will be generally alike. This reflects the fact that VF1 is measured at low current, where behavior varies substantially with the manufacturing characteristics of the LED. As can be seen in Figure 10, VF3 and VF2 exhibit a strong correlation at low voltages. Furthermore, VF3 maintains a relatively consistent behavior near the maximum voltage during normal operation. In Figure 11, the variation in VR with respect to VF4 is shown. VR remains relatively constant while VF4 varies over a wide range of voltages. From these characteristics, it can be predicted that a device is likely to be considered defective if VR deviates from the narrow range indicated, whereas VF4 may need to be assessed over a wider range.
On the other hand, as mentioned earlier, VF1 exhibits a wider range of variation than VF4. It is particularly noteworthy that voltages deviating significantly from the normal range can be observed in Figure 12. Values lower than the normal voltage are mostly presumed to be related to leakiness. It can be predicted that VF4 and VF5, which are measured within very close current ranges, will be strongly correlated with each other. As presented in Figure 13, VF5 and VF4 exhibit a highly linear relationship, and data points deviating from this correlation are expected to have a high likelihood of being defective. Intuitively classifying LEDs based solely on the distribution of voltage data appears to be quite challenging. Given the significant variability in voltage measurement values, it is predicted that the criteria will vary from VR to VF5. For instance, if VF1 is measured at a very low level, there is a high likelihood that VF3 is also in the abnormal range, but even if VF3 is within the normal range, VF1 is likely to be in the abnormal range.
As can be observed in Figure 14, the x and y CIE1931 values exhibit a strong linear correlation with each other. Many data points are densely clustered near the target values presented in Figure 4. The data to be used for training are presented in Figure 15, showing luminous intensity as a function of wavelength. On the graph, the data points are divided into three distinct groups. While a specific group may not necessarily signify good or defective products, there are evident differences between the groups. In particular, products with different Iv values can be classified into different grades, but this study does not consider grading for defect-free products.
Luminous intensity, as seen in Figure 16, varies according to VR, with two distinct groups forming and values outside the range of the scattered group being visible in the vicinity. In Figure 17, the relationship between Iv and VF1 is plotted. When VF1 deviates from the normal range and records a low voltage, Iv values tend to appear in the lower group. In cases where Iv values are high, VF1 is typically observed to be close to the normal voltage range. Taken together with the observations regarding voltage and optical measurements presented earlier, it appears to be quite challenging to establish quality criteria solely based on the values observed in the plots.
Setting criteria for an expert system from such data may be achievable for engineers with a high level of understanding of both the electrical and optical characteristics of LEDs, coupled with deep insight and experience, but it is expected to be a time-consuming process to ensure reliability. As evident from the results presented earlier, it seems difficult to express the relationships between the data as simple functions. Moreover, if quality judgments are made based on the range of each feature, many defect-free products may be erroneously rejected, which, as mentioned earlier, could significantly reduce yield. Therefore, it appears necessary to establish a quality control process through automated methodologies such as DA, NNs and DTs.

Discriminant Analysis
The results of discriminant analysis are presented in Table 1. For each learning dataset, the coefficients from Table 1 were substituted into Equation (1), and canonical variables 1 and 2 were plotted using the measured values, as depicted in Figure 18. The displayed points evidently form distinct groups. However, even points belonging to the Normal group are widely scattered, and points from the D1, D2 and D3 groups are also highly dispersed. Figure 19 represents the data in three-dimensional space for the three canonical variables. The Normal group clearly divides into two distinct groups, and the points corresponding to D2 are widely spread, indicating the difficulty of discrimination using linear discriminant analysis (LDA).
The evaluation results using validation data are presented in Table 2, where the discriminative success rate for the Normal category reaches only 77.92%. As evident from Figures 19 and 20, it can be inferred that the distribution of data may pose challenges for successful classification based solely on variance. The receiver operating characteristic (ROC) curve shows the trade-off between the true positive rate (TPR) and the false positive rate (FPR). The ROC curve shown in Figure 20 illustrates that there is some discernible resolving power; however, it falls short of meeting the requirements for an effective quality control process.
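As a minimal sketch of how the TPR and FPR for one category can be obtained from a four-class prediction table, the following treats the chosen category as positive and all others as negative (one-vs-rest). The function name `rates` and every count in the example matrix are hypothetical and are not taken from Table 2.

```python
def rates(confusion, classes, positive):
    """One-vs-rest TPR and FPR for `positive`, given a confusion matrix
    indexed as confusion[true_class][predicted_class]."""
    tp = confusion[positive][positive]
    fn = sum(confusion[positive][p] for p in classes if p != positive)
    fp = sum(confusion[t][positive] for t in classes if t != positive)
    tn = sum(confusion[t][p]
             for t in classes for p in classes
             if t != positive and p != positive)
    return tp / (tp + fn), fp / (fp + tn)

classes = ["D1", "D2", "D3", "Normal"]
# Hypothetical counts for illustration only.
confusion = {
    "D1":     {"D1": 40, "D2": 5,  "D3": 0,  "Normal": 5},
    "D2":     {"D1": 2,  "D2": 30, "D3": 3,  "Normal": 15},
    "D3":     {"D1": 0,  "D2": 4,  "D3": 42, "Normal": 4},
    "Normal": {"D1": 3,  "D2": 10, "D3": 7,  "Normal": 80},
}
tpr, fpr = rates(confusion, classes, "Normal")
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```

With Normal as the positive category, a false negative is a good LED rejected as defective, and a false positive is a defective LED accepted as Normal. An ROC curve is traced by repeating this calculation while sweeping the score threshold of the classifier.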
When QDA was employed, markedly different results were observed. No distinct improvement was seen in the Normal category, while a 39% reduction in prediction accuracy for D1 was observed, as shown in Table 3. This dramatic difference may have been caused by the complex distribution of data in both the Normal and D1 categories. Notably, the presence of numerous outliers in the D1 category would have rendered the prediction task quite challenging. A previous study on turning process analysis reported an accuracy of 86.9% when employing QDA [18]. However, the features used in this study may not be well suited for discriminant analysis (DA). Nevertheless, DA provides valuable insights into data distribution, even if it has limited discriminatory power.

Neural Network
Multiple configurations of NNs with additional layers and neurons were tested, yet the arrangement presented in Figure 6 emerged as the most effective for this dataset. The learning process was conducted with the training dataset. An evaluation with 5-fold cross-validation yielded a 97.4% true positive rate for normal data (see Figure S1 in the supplemental data). The results using the validation data are presented in Table 4. The accuracy in correctly classifying normal data reached 97.8%, signifying a substantial improvement when compared to the results obtained using LDA. The ROC curve in Figure 21 shows very high operating points with AUC over 0.99.
Nevertheless, this level of accuracy for the category Normal still falls short of meeting the stringent requirements for effective deployment within quality control of LEDs. In addition to the setup shown in Figure 6, various NN designs were tried. In previous works, there have been successful applications of deep NNs in classification or detection [19,58-61].
Intriguingly, increasing the number of fully connected layers in this work reduced the recognition rate, and various combinations failed to yield discernibly improved outcomes. This might be attributed to overfitting [62], data deficiency or vanishing gradients [63], which are persistent issues in deep learning [64]. Given the limited batch size of the current data, applying convolution becomes challenging, making it difficult to enhance the performance of the NN further.
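The 5-fold cross-validation used for the evaluation in this section can be sketched as an index-partitioning routine. This is only an illustration under stated assumptions: the helper `k_fold_indices`, the shuffling seed and the interleaved fold assignment are hypothetical, and the sample count of 6000 is the training set size reported in the Conclusions.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]   # k nearly equal folds
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

splits = list(k_fold_indices(6000, k=5))
for fold_number, (train, test) in enumerate(splits, start=1):
    # A model would be fitted on `train` and scored on `test` here.
    print(fold_number, len(train), len(test))
```

Each sample is held out exactly once, so the reported rate is an average over five models that never saw their own test fold during training.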

Decision Trees
The training process using the training data produced the DT illustrated in Figure 22. While the tree is not exceedingly simple, it achieves classification with a maximum of 14 splits. The evaluation by 5-fold cross-validation reached a 98.9% true positive rate for normal data (see Figure S2 in the supplemental data). Additionally, the evaluation results using validation data are presented in Table 5. These results demonstrate that 99.4% of Normal LEDs were correctly classified as Normal, and D1 achieved a perfect 100% classification rate. However, D2 exhibited a 1.7% misclassification rate. The ROC curve presented in Figure 23 lies notably close to TPR = 1, which is a rare occurrence in machine learning. As a result, the obtained AUC (area under the curve) value is 0.9956 for the Normal category.
The DT shown in Figure 22 is relatively straightforward to implement on a computer and exhibits a high success rate in classification. Consequently, it is anticipated that it can be employed for quality management purposes. In light of these findings, it is believed that constructing a DT through training can enable the implementation of classification logic without the need for expert assistance. A brief analysis of the resulting tree follows. In the first and second splits, the tree identifies all instances of D1. As anticipated, there may be some ambiguity between D2, D3 and Normal. Further down the tree, D2 and D3 are successfully classified. Ultimately, any remaining DUTs that do not fall into the D2 or D3 categories are labeled as Normal.
The DTs obtained using only electrical measurements and only optical measurements, from the same dataset, yielded TPRs of only 77.1% and 79.6% for Normal data, respectively (see Tables S1 and S2 in the supplemental data). Furthermore, the structures of the trees were exceedingly complex, with 79 and 74 splits (see Figures S3 and S5 in the supplemental data). Therefore, it can be concluded that training should use both electrical and optical measurements together. An earlier study has also indicated that a DT method based on random forests outperforms NNs [65].
As shown in Figures 9-14, the decision boundaries are nonlinear. However, decision trees are capable of capturing complex nonlinear relationships in data through a series of simple decision rules. In some cases, this simplicity can lead to a more efficient representation of the underlying data structure compared to NNs. As mentioned above, the DT is simpler and more interpretable. Moreover, it is robust in response to outliers as it can split the data into distinct regions.
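To illustrate why a trained tree is straightforward to implement, the sketch below walks a small hand-written tree from the root to a leaf. The tree structure, the feature names used as dictionary keys and all thresholds are hypothetical placeholders; they do not reproduce the tree in Figure 22.

```python
# Internal nodes test one measurement against a threshold; leaves hold a category.
# Every feature choice and threshold here is hypothetical.
TREE = {
    "feature": "VF1", "threshold": 2.0,
    "left": {"label": "D1"},                  # taken when VF1 < 2.0
    "right": {
        "feature": "Iv", "threshold": 900.0,
        "left": {"label": "D2"},              # taken when Iv < 900.0
        "right": {"label": "Normal"},
    },
}

def classify(node, sample):
    """Follow binary judgments from the root until a leaf is reached."""
    while "label" not in node:
        go_left = sample[node["feature"]] < node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]

def count_splits(node):
    """Number of internal (decision) nodes in the tree."""
    if "label" in node:
        return 0
    return 1 + count_splits(node["left"]) + count_splits(node["right"])

print(classify(TREE, {"VF1": 2.5, "Iv": 1000.0}))
print(count_splits(TREE))
```

Each prediction costs at most one comparison per level of the tree, which is consistent with the small model size and the interpretability noted above.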

Computational Aspect
On the specified machine, the training times for QDA, NN and DT are 8.1 s, 18.2 s and 8.2 s, respectively. The prediction speeds are 100,000, 150,000 and 120,000 samples per second, respectively. During validation, the total costs were evaluated as 3366, 108 and 18, respectively. The total cost is calculated as the weighted sum of false negatives and false positives. Additionally, the model sizes are 16 kB, 9 kB and 11 kB, respectively. Beyond its excellent accuracy, the DT is well rounded in terms of computational efficiency.
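The total cost defined above can be sketched as follows. The weights are not stated in the text, so `w_fn` and `w_fp` are hypothetical parameters, as are the helper names, the toy labels and the choice of Normal as the positive category.

```python
def count_errors(y_true, y_pred, positive="Normal"):
    """Count false negatives and false positives with `positive` as the
    positive category (an assumption for this sketch)."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    return fn, fp

def total_cost(fn, fp, w_fn=1.0, w_fp=1.0):
    """Weighted sum of false negatives and false positives."""
    return w_fn * fn + w_fp * fp

# Hypothetical labels for six devices.
y_true = ["Normal", "Normal", "D1", "D2", "Normal", "D3"]
y_pred = ["Normal", "D2", "D1", "Normal", "Normal", "Normal"]
fn, fp = count_errors(y_true, y_pred)
cost = total_cost(fn, fp, w_fn=1.0, w_fp=5.0)  # hypothetical weights
print(fn, fp, cost)
```

Choosing a larger weight for false positives would penalize shipping defective LEDs more heavily than rejecting good ones; the actual weighting used in the study is not given here.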

Conclusions
In this study, supervised learning was employed to enhance the logic for classifying mass-produced LEDs. The hardware system that conducts the tests was described, along with the test procedure. The test results were categorized into three defective categories and one normal category, and learning was carried out using 6000 training data points and 2600 validation data points. Six current-dependent voltage measurements and four optical measurement values served as features for each data point.
Upon evaluating the learning performance, it was found that linear discriminant analysis achieved a true positive rate of only 77.9% for normal products, indicating insufficient suitability for quality control. Following supervised learning using a neural network, a substantial improvement in the true positive rate to 97.8% was achieved. However, the 2.2% false negative rate was considered significant in product classification. To address this concern, learning was undertaken to construct a binary decision tree, resulting in highly

Figure 1. Automated LED inspection equipment for optical and electrical measurements.

Figure 2. DUT handler with insulated block and probes.

Figure 6. The classification neural network scheme employed in this work.

Figure 14. y vs. x in CIE1931 for the learning data.

Figure 15. Luminous intensity vs. wavelength for the learning data.

Figure 16. Iv vs. VR for the learning data.

Figure 17. Luminous intensity vs. VF1 for the learning data.

Figure 19. Canonical variable 3 alongside canonical variables 1 and 2 from the discriminant analysis.

Figure 21. Receiver operating characteristics from neural network.

Figure 22. The built decision tree with both optical and electrical measurements.

Figure 23. Receiver operating characteristics from the decision tree.

Table 1. Canonical coefficients of the discriminant analysis.

Table 2. Prediction by LDA using the validation data (*TPR).

Table 3. Prediction by QDA using the validation data (*TPR).

Table 4. Prediction by neural network using the validation data (*TPR).

Table 5. Prediction by decision tree using the validation data (*TPR).