Application of Machine Learning Methods to Neutron Transmission Spectroscopic Imaging for Solid–Liquid Phase Fraction Analysis

In neutron transmission spectroscopic imaging, the transmission spectrum of each pixel on a two-dimensional detector is analyzed and the real-space distribution of microscopic information in an object is visualized with a wide field of view by mapping the obtained parameters. In the analysis of the transmission spectrum, since the spectrum can be classified with certain characteristics, it is possible for machine learning methods to be applied. In this study, we selected the subject of solid–liquid phase fraction imaging as the simplest application of the machine learning method. Firstly, liquid and solid transmission spectra have characteristic shapes, so spectrum classification according to their fraction can be carried out. Unsupervised and supervised machine learning analysis methods were tested and evaluated with simulated datasets of solid–liquid spectrum combinations. Then, the established methods were used to perform an analysis with actual measured spectrum datasets. As a result, the solid–liquid interface zone was specified from the solid–liquid phase fraction imaging using machine learning analysis.


Introduction
In neutron transmission spectroscopic imaging (NTSI), microscopic information is mapped on an object image with a wide field of view by analyzing the transmission spectrum of each point in two-dimensional imaging data [1][2][3][4][5]. Since the shape of the transmission spectrum is determined by the neutron total cross-section, which reflects the microstructure of the object material, it is possible to visualize the spatial distributions of analyzed microstructural parameters [6]. Since the distribution of statistically averaged microstructure information deep inside the sample can be visualized with a wide field of view, its application has been expanded beyond material analysis to include analysis of the deformation behavior of components [7] and evaluation of the homogeneity of actual products [8].
In recent years, machine learning (ML) has been used for various spectral analyses, including in neutron science [9][10][11][12][13][14][15][16]. In neutron transmission spectrum analysis, which is the basis of the NTSI method, ML is considered applicable if the obtained spectra can be categorized. That is, spectra with various material parameters are learned by ML in advance, and the material parameters of the experimental spectra are then determined using the ML model. When ML is used, it is not necessary to perform a detailed, time-consuming analysis for each pixel, so the efficiency can be expected to improve greatly, especially in the case of spectroscopic/energy-dependent imaging with a large number of pixels.
However, the issue in this case is the collection of the large number of spectra with a wide variety of microscopic structures needed to build the ML model. In other words, it is necessary to collect many Bragg-edge spectrum patterns of the solid material with various microstructures, which is the main subject of NTSI analysis. Bragg-edge patterns reflecting various material microstructures can be calculated, for example, using the RITS (Rietveld Imaging of Transmission Spectra) code [6]. However, it is difficult to calculate all types of microstructures as training spectra; moreover, there are patterns that cannot be calculated with analysis codes such as RITS. Therefore, we considered building an ML model for experimental analysis using the experimentally measured Bragg-edge spectra as training data for the ML.
Our choice as the first trial for such ML analysis was the imaging of solid-liquid interfaces in motion. When the solid-liquid interface in the solidification process is measured by neutron imaging, due to the relationship between the measurement time and the solidification velocity, a zone appears where the solid and liquid phases coexist between the 100% solid-phase and 100% liquid-phase regions. In this study, we applied ML to the solid-liquid phase fraction analysis of this solid-liquid coexisting zone. In this case, the analysis was not used for the microstructure of the solid phase itself directly, so it was easy to apply ML. In addition, various Bragg-edge data can be collected using the solid-phase spectrum of the entire solidified sample. From the above, it is considered that the analysis of the solid-liquid phase fraction is optimal for evaluating the applicability of the ML analysis method to NTSI as a first step.
In this study, lead-bismuth eutectic alloy (LBE), which our group has experience in analyzing [17], was used as a sample for imaging the solidification process. LBE is expected to be used as a coolant and spallation target for next-generation nuclear reactors (accelerator-driven subcritical reactors), and it is required for evaluating the heat transfer coefficient by determining the moving velocity of the solid-liquid interface [18][19][20]. As it is difficult to analyze the solid-liquid phase fraction of neutron transmission spectra with high accuracy for LBE due to its complicated crystal phase [21], this is considered an appropriate application of ML analysis.
Therefore, the aim of this study was to investigate the possibility of applying ML analysis to neutron transmission spectroscopy, using LBE solid-liquid phase fraction imaging as an example. For the ML analysis, the solid-liquid phase fraction of each pixel during solidification was evaluated by the ML model built from the training dataset, which was created from 100% liquid-phase and 100% solid-phase transmission measurement data. We first evaluated the analytical performance of the ML using a neutron transmission spectral dataset generated by neutron total cross-section simulation. Then, ML was applied to the experimental transmission imaging data for the solidification process of LBE, which was carried out to simulate a coolant solidification event, and solid-liquid phase fraction imaging was performed.

Application of Machine Learning Methods to Solid-Liquid Phase Fraction Analysis
In this study, to learn the spectra for LBE solids with the various crystalline textures produced by solidification, we assembled the training data from the solidified LBE data acquired in the experiment itself. In addition, the neutron transmission spectrum handled here was composed of a large number of data points obtained by the neutron time-of-flight (TOF) method, and the computational cost was too large for ML to be applied to it in its original form. Therefore, we first reduced the dimensionality of the spectral data through unsupervised machine learning and then, in the second stage, used the dimensionality-reduced data to discriminate the spectral shape by supervised machine learning. The procedure for the analysis was as follows:
(1) Measure the solidification process of liquid LBE with NTSI.
(2a) Obtain the solid-phase spectrum and liquid-phase spectrum of LBE from the imaging data. For the solid-phase spectrum, after the whole sample has solidified, divide the solid zone into several parts and obtain multiple spectra with various crystalline textures from each part. For the liquid-phase spectrum, integrate the neutron data for the entire melted liquid zone to obtain a liquid-state spectrum.
(2b) Create a training dataset with different crystalline textures and solid-liquid fractions by adding the liquid-phase spectrum to each solid-phase spectrum acquired in step (2a) in fractions from 0 to 100%.
(2c) Reduce the dimensionality of the training data from step (2b) using unsupervised machine learning, then build an ML model from the training dataset through supervised machine learning.
(3) Reduce the dimensionality of the experimental spectra from step (1) using unsupervised ML.
(4) Apply the supervised machine learning model built in step (2c) to the dimensionality-reduced experimental spectra from step (3) and obtain the solid-liquid phase fraction. Visualize the obtained phase fraction at each pixel.
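The flow of steps (2b) to (4) can be sketched with scikit-learn as follows. This is only an illustrative sketch, not the code used in this study: the spectra are synthetic stand-ins, the names (`liquid`, `solids`, `cube`, `fraction_map`) are hypothetical, and the mixing rule (solid spectrum raised to the power a times liquid spectrum raised to the power 1 - a) is an assumption that corresponds to cross-sections combining linearly in the exponent.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_tof = 200                                   # hypothetical number of TOF channels
wavelength = np.linspace(1.0, 6.0, n_tof)     # arbitrary wavelength axis

# Step (2a) stand-ins: one smooth liquid spectrum and five solid spectra,
# each with a single artificial Bragg edge at a different position.
liquid = np.exp(-0.30 - 0.02 * wavelength)
edges = rng.uniform(2.0, 5.0, size=(5, 1))
solids = liquid * np.exp(-0.10 * (wavelength < edges))   # shape (5, n_tof)

# Step (2b): mix every solid spectrum with the liquid spectrum
# for solid fractions 0, 1, ..., 100% (assumed mixing rule).
fractions = np.linspace(0.0, 1.0, 101)
X_train = np.concatenate(
    [s ** fractions[:, None] * liquid ** (1.0 - fractions[:, None]) for s in solids]
)
y_train = np.tile(fractions, len(solids))

# Step (2c): dimensionality reduction (PCA) followed by KNN regression.
model = make_pipeline(PCA(n_components=3), KNeighborsRegressor(n_neighbors=2))
model.fit(X_train, y_train)

# Steps (3) and (4): apply the same model to every pixel of an image cube
# of shape (ny, nx, n_tof) and reshape the result into a phase-fraction map.
ny, nx = 4, 6
true_map = np.linspace(0.0, 1.0, ny * nx).reshape(ny, nx)
cube = solids[0] ** true_map[..., None] * liquid ** (1.0 - true_map[..., None])
fraction_map = model.predict(cube.reshape(-1, n_tof)).reshape(ny, nx)
```

Wrapping PCA and KNN in one pipeline ensures that the experimental spectra are projected with the same principal components that were fitted on the training data.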
In this study, we built the ML programs with Python and the scikit-learn library, both of which are commonly used.

Unsupervised Machine Learning
The purpose of the dimensionality reduction of high-dimensional data such as neutron transmission spectra can be classified into two main categories. The first of these is data compression. This can reduce computational costs when using supervised ML, where datasets tend to be large. The other purpose is data visualization. This can help to convert high-dimensional data whose characteristics are difficult to recognize into a two- or three-dimensional diagram. For these purposes, unsupervised ML can be used as a method to perform the dimensionality reduction of high-dimensional data.
There are various methods for the dimensionality reduction of data using unsupervised ML, including principal component analysis (PCA), latent semantic analysis (LSA), non-negative matrix factorization (NMF), and latent Dirichlet allocation (LDA). Among them, PCA is one of the most typical dimensionality reduction methods; it has a long history and is widely used in various situations [22,23]. There are two main approaches to dimensionality reduction: selecting only the important variables and discarding the rest, and building new variables from the variables of the original data. PCA belongs to the latter. That is, PCA converts high-dimensional data so that the original data can be explained by a smaller number of new variables. In this study, we adopted PCA as the dimensionality reduction method for neutron transmission spectra.
In PCA, the following procedure is used to obtain the principal components.
(1) Calculate the variance-covariance matrix of the data.
(2) Solve the eigenvalue problem for the variance-covariance matrix to find the eigenvectors and eigenvalues.
(3) Represent the data in the direction of each principal component.
Consider the eigenvalue problem for a variance-covariance matrix A of the given data:
$$A\mathbf{x} = \lambda\mathbf{x} \qquad (1)$$
where A is an nth-order square matrix and x is an n-dimensional vector. Finding x and the eigenvalue λ that satisfy (1) is mathematically equivalent to the problem of finding orthogonal axes that maximize the variance. In PCA, when the eigenvalue problem is solved for the variance-covariance matrix A, multiple pairs of eigenvalues and eigenvectors are obtained. The eigenvector with the largest eigenvalue is called the first principal component, while the eigenvector with the second largest eigenvalue is called the second principal component. The third and subsequent principal components are defined in the same way, in order of decreasing eigenvalue. Further, by dividing the eigenvalue of each principal component by the sum of all eigenvalues, the importance of each principal component can be expressed as a ratio. This is called the contribution rate, and it indicates how much of the data each principal component expresses. The sum of the contribution rates taken in order from the first principal component is called the cumulative contribution rate, and this can be used to measure the amount of information lost through dimensionality reduction. The cumulative contribution rate is discussed in Section 3.2.
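As an illustration of this procedure, the following NumPy sketch computes the principal components, the contribution rates, and the cumulative contribution rate from the eigendecomposition of the variance-covariance matrix. The function name `pca_eig` and the random test data are hypothetical and serve only to show the calculation.

```python
import numpy as np

def pca_eig(X, n_components):
    """PCA by eigendecomposition of the variance-covariance matrix."""
    Xc = X - X.mean(axis=0)                   # center each variable
    A = np.cov(Xc, rowvar=False)              # variance-covariance matrix A
    eigvals, eigvecs = np.linalg.eigh(A)      # solve A x = lambda x (A is symmetric)
    order = np.argsort(eigvals)[::-1]         # sort by decreasing eigenvalue
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    contribution = eigvals / eigvals.sum()    # contribution rate of each component
    scores = Xc @ eigvecs[:, :n_components]   # data expressed along the components
    return scores, contribution

# Hypothetical data: 100 samples, 5 variables with very different variances.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5)) * np.array([5.0, 2.0, 1.0, 0.1, 0.1])

scores, contribution = pca_eig(X, n_components=2)
cumulative = np.cumsum(contribution)          # cumulative contribution rate
```

Here `cumulative[k - 1]` is the fraction of the total variance retained when only the first k principal components are kept.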

Supervised Machine Learning
There are several supervised ML methods [24] that can be used to carry out regression analysis on dimensionally reduced data for spectral analysis, such as the K-nearest neighbor (KNN) algorithm, extra tree regression (ETR), and support vector regression (SVR). In this study, we adopted the KNN algorithm, which has a lower computational cost. KNN is a supervised ML method that can be used for both classification and regression problems. Supervised ML basically consists of two parts: the training part, which calculates the optimal parameters from the training data, and the prediction part, which makes predictions using the calculated parameters. However, KNN has no step that calculates optimal parameters from the training data. Since the training data are simply stored and predictions are made directly from them, the computational load is small.
KNN is a simple algorithm that stores the training data as they are during training. The procedure used for solving a regression problem for input data is as follows.
(1) Calculate the distance between the input data and the training data.
(2) Choose the K training data points closest to the input data.
(3) Take the average of the target values of the K selected training data points as the predicted value.
In KNN, when predicting a regression problem, the distance between the input data and the training data is generally calculated using the Euclidean distance. When the two points P and Q between which we want to find the distance are represented by $P(p_1, p_2, \ldots, p_n)$ and $Q(q_1, q_2, \ldots, q_n)$, the multidimensional Euclidean distance d(P, Q) is given by the following equation:
$$d(P, Q) = \sqrt{\sum_{i=1}^{n} \left(q_i - p_i\right)^2} \qquad (2)$$
When KNN handles a large amount of data, the prediction of unknown data takes a long time because KNN performs a neighborhood search over a large amount of training data. In addition, since d(P, Q) is used to measure the distance between data, the differences in distance between data become small when the number of data dimensions is large, and it is often impossible to obtain a high accuracy. Therefore, KNN is effective when the amount of data (spectra) is small or when the number of dimensions (the number of channels of the time-of-flight spectrum) is small. The number of neighboring training data points, K, used when making predictions with KNN is not determined by the algorithm itself, and K is therefore called a hyperparameter. A smaller K leads to a more complicated decision boundary, so overfitting is more likely to occur. Therefore, to improve the generalization performance, it is necessary to tune K appropriately using the training data.
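The regression procedure can be written in a few lines of NumPy. The sketch below is a minimal illustration with hypothetical data and a hypothetical function name (`knn_regress`); it assumes that the prediction is the unweighted mean of the K nearest target values.

```python
import numpy as np

def knn_regress(X_train, y_train, x, k):
    """Predict the target of x as the mean target of its k nearest training points."""
    d = np.sqrt(((X_train - x) ** 2).sum(axis=1))   # Euclidean distance d(P, Q)
    nearest = np.argsort(d)[:k]                     # indices of the k closest points
    return y_train[nearest].mean()                  # unweighted average (assumed)

# Hypothetical training data: four points on a line with targets 0, 10, 20, 30.
X_train = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
y_train = np.array([0.0, 10.0, 20.0, 30.0])

# The two nearest neighbors of (1.2, 0) are (1, 0) and (2, 0), so the
# prediction with k = 2 is the mean of 10 and 20.
prediction = knn_regress(X_train, y_train, np.array([1.2, 0.0]), k=2)
```

With k = 1 the same query returns the target of the single closest point, which illustrates why a small K follows the training data more closely.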

Preparation of Simulation Spectra for Evaluation of Machine Learning Analysis Methods
Prior to the solid-liquid phase fraction analysis of the measured spectra, the validity of the aforementioned analysis procedure was confirmed using simulated LBE Bragg-edge transmission spectra prepared by the computer simulation described below. The simulated total cross-section spectrum of the LBE solid phase was calculated using the RITS code [6] with systematic variation of the texture information. The parameters considered here were the direction of the crystal orientation vector (preferred orientation <hkl>) and the degree of development of the orientation texture (March-Dollase coefficient, MD). However, since the ratio of the β-phase is higher than that of the γ-phase in LBE [20], only the texture of the β-phase was considered here. Sixteen types of texture combinations were created from four types of preferred orientation vectors (β<100>, β<002>, β<101>, β<102>) parallel to the neutron transmission direction and four types of MD coefficients (MD = 0.6, 0.7, 0.8, 0.9). A completely isotropic solid-phase texture (MD = 1.0) was added, giving 17 types of LBE solid-phase spectra in total. The total cross-section spectrum of the LBE liquid phase was prepared from the neutron transmission spectroscopic imaging experiment (see Section 4.2). The simulated transmission spectra of various phase fractions of the LBE mixed solid-liquid phase were calculated from these cross-sections according to the following equation:
$$Tr(\lambda) = \exp\left[-\left\{a\,\Sigma_{\mathrm{solid}}(\lambda) + (1 - a)\,\Sigma_{\mathrm{liquid}}(\lambda)\right\} t\right] \qquad (3)$$
Here, $Tr(\lambda)$ is the neutron transmission spectrum; $\Sigma_{\mathrm{solid}}(\lambda)$ and $\Sigma_{\mathrm{liquid}}(\lambda)$ are the macroscopic total cross-sections of the LBE solid and liquid phases, respectively; and a is the fraction of the solid phase (0 ≤ a ≤ 1). t is the thickness of the entire sample, and here it was set to 10 mm according to the experiment. The solid-phase fraction a was changed from 0 to 100% in 1% steps for each texture. Because the 100% liquid-phase spectrum (a = 0) is common to all textures, it was counted only once, giving a total of 17 × 100 + 1 = 1701 types of spectra for evaluation.
Figure 1 shows an example of the neutron transmission spectrum variation in an LBE solid-liquid mixture when the texture of the solid phase is isotropic. It can be seen that the Bragg edges of the solid phase appear relatively clearly against the spectrum of the 100% liquid phase. Thus, we prepared neutron transmission spectra of the LBE for various solid-liquid phase fractions (0 to 100%) and various textures (17 types as mentioned above) of the solid phase.
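The generation of mixed-phase spectra from two cross-sections can be sketched as follows. The cross-section values, the wavelength grid, and the function name `mixed_transmission` are hypothetical placeholders (the actual cross-sections came from RITS and from the experiment); only the thickness of 10 mm, the 1% fraction steps, and the dataset size are taken from the text.

```python
import numpy as np

def mixed_transmission(sigma_solid, sigma_liquid, a, t):
    """Tr = exp(-(a * Sigma_solid + (1 - a) * Sigma_liquid) * t)."""
    return np.exp(-(a * sigma_solid + (1.0 - a) * sigma_liquid) * t)

# Hypothetical macroscopic cross-sections (1/mm) on a coarse wavelength grid (nm).
wavelength = np.linspace(0.1, 0.6, 6)
sigma_liquid = np.full_like(wavelength, 0.030)            # featureless liquid
sigma_solid = np.where(wavelength < 0.4, 0.040, 0.020)    # one artificial Bragg edge
t = 10.0                                                  # sample thickness in mm

# Solid-phase fraction from 0 to 100% in 1% steps for one texture.
fractions = np.arange(0, 101) / 100.0
spectra = np.array(
    [mixed_transmission(sigma_solid, sigma_liquid, a, t) for a in fractions]
)

# Dataset size: 17 textures x 100 non-zero fractions + 1 shared liquid spectrum.
n_textures = 17
n_total = n_textures * 100 + 1
```

The first row of `spectra` is the pure liquid spectrum and the last row is the pure solid spectrum, with the Bragg edge growing steadily in between.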
Furthermore, since the experimentally obtained transmission spectra are to be used as training data in this study, we also examined the case where the neutron transmission spectrum included statistical errors. Errors following a Gaussian distribution with a standard deviation σ of 1, 2, ..., 9, and 10% were randomly added to the simulated spectra created above. As an example, Figure 2 shows the neutron transmission spectra with σ = 5% error in the case of the isotropic solid-phase texture. From this figure, it can be seen that when the statistical error included in the neutron transmission spectrum becomes large, it becomes more difficult for the human eye to distinguish between the solid and liquid phases.
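A possible way to add such errors is sketched below. Whether σ is defined relative to each transmission value or as an absolute offset is not stated in the text; this sketch assumes a relative error, and the function name `add_gaussian_error` and the flat test spectra are hypothetical.

```python
import numpy as np

def add_gaussian_error(spectra, sigma_percent, rng):
    """Add Gaussian errors with a relative standard deviation of sigma_percent."""
    noise = rng.normal(loc=0.0, scale=sigma_percent / 100.0, size=spectra.shape)
    return spectra * (1.0 + noise)    # assumption: error relative to each value

rng = np.random.default_rng(42)
clean = np.full((1000, 50), 0.7)      # hypothetical flat spectra for illustration

# One noisy copy of the dataset for each sigma = 1, 2, ..., 10%.
noisy = {s: add_gaussian_error(clean, s, rng) for s in range(1, 11)}
```

Each entry of `noisy` can then be treated as a separate dataset when training and testing the ML model at a given error level.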

Dimensionality Reduction Using Unsupervised Machine Learning Methods
Figure 3 shows the result of the cumulative contribution evaluation of the transmission spectrum dataset with σ = 0% created in Section 3.1, carried out to evaluate the amount of information lost due to the dimensionality reduction by PCA. The horizontal axis of the figure is the number of principal components after dimensionality reduction by PCA, while the vertical axis is the cumulative contribution rate explained in Section 2.2. From the cumulative contribution rate, it is evident that more than 99% of the original data can be expressed with the first three principal components of the PCA; namely, this dataset can be expected to be sufficiently analyzed by reducing the number of dimensions to three.
Figure 4 shows a diagram plotting the distribution of the first three principal components of PCA for the simulated dataset with σ = 0%. Note that each principal component is obtained computationally, so no unit can be defined for it. In the figure, the point where the plot lines converge is the point of the 100% liquid phase; from there, the point of the 100% solid phase appears at the opposite end of each line according to the texture state. The data for each phase fraction are plotted along the straight line connecting these two points, at a position that depends on the solid-liquid phase fraction. The data groups of different preferred orientation vectors are represented by colors, and the black line in the center of the distribution is the solid-liquid phase fraction line for the isotropic texture. These groups are distributed so as to form fan-shaped planes extending from the black line in a different direction for each preferred orientation. That is, as the texture develops (from MD = 1.0 to MD = 0.6), the data are plotted farther from the isotropic texture case (black line). In this way, it was found that by reducing the dimensions with PCA, it is possible to clearly identify the LBE transmission spectra with different solid-liquid phase fractions, preferred orientation vectors, and degrees of development of the texture. With this detailed analysis, it may be possible to obtain not only the solid-liquid phase fraction but also the texture information for the solid phase depending on the position of the transmission spectrum in this diagram.
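The cumulative contribution rate can be obtained directly from scikit-learn's `PCA` through its `explained_variance_ratio_` attribute. The sketch below uses a hypothetical dataset built from three latent factors (not the LBE spectra) to show how the number of retained components might be chosen from a 99% threshold.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical dataset: 300 "spectra" with 100 channels, generated from
# three latent factors plus a small amount of noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 3)) * np.array([3.0, 2.0, 1.0])
mixing = rng.normal(size=(3, 100))
X = latent @ mixing + 0.01 * rng.normal(size=(300, 100))

# Fit PCA with all components and accumulate the contribution rates.
pca = PCA().fit(X)
cumulative = np.cumsum(pca.explained_variance_ratio_)

# Smallest number of components whose cumulative contribution rate is >= 99%.
n_keep = int(np.searchsorted(cumulative, 0.99) + 1)

# Reduce the data to that number of dimensions.
X_reduced = PCA(n_components=n_keep).fit_transform(X)
```

Plotting `cumulative` against the component index gives a curve of the same kind as the cumulative contribution evaluation described above.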

Solid-Liquid Phase Fraction Analysis by Supervised Machine Learning Method
To evaluate the performance of the supervised machine learning method, the simulated dataset was randomly divided into training and test data groups after dimensionality reduction by PCA. Of the 1701 spectra in the original dataset, 1500 were assigned to the training data group and 201 to the test data group. The test data group was analyzed by the ML model built using the training data group, and the difference between the analyzed solid fraction and the original solid fraction was evaluated by the root mean square error (RMSE) given by the following equation:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(f_i - y_i\right)^2} \qquad (4)$$
Here, n is the number of spectra analyzed, $f_i$ is the solid-phase fraction estimated by the ML analysis, and $y_i$ is the original solid-phase fraction set when the dataset was created. For the ML analysis, we tested SVR and ETR in addition to KNN. KNN and ETR showed higher performance than SVR when the training data contained statistical errors, and KNN had the lowest computational cost, so we adopted KNN in this study. The computational cost is also important when analyzing the more than 10,000 spectra from all pixels of a two-dimensional detector in NTSI. In KNN, it is necessary to optimize the hyperparameter K using the training data, and in this study it was determined by 10-fold cross validation (CV).
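This evaluation can be sketched with scikit-learn as shown below. The toy dataset, the candidate range of K (1 to 20), and the 80/20 split are assumptions made for illustration; only the use of PCA before splitting, KNN regression, 10-fold CV, and RMSE follows the text.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Hypothetical toy dataset: spectra that change linearly with solid fraction y.
rng = np.random.default_rng(0)
n_spectra, n_channels = 600, 40
y = rng.uniform(0.0, 1.0, size=n_spectra)
base_liquid = np.linspace(0.60, 0.70, n_channels)
base_solid = base_liquid - 0.10 * (np.arange(n_channels) < 20)
X = y[:, None] * base_solid + (1.0 - y[:, None]) * base_liquid
X += rng.normal(scale=0.005, size=X.shape)          # small statistical error

# Reduce the dimensionality first, then split into training and test groups.
X_pca = PCA(n_components=3).fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(
    X_pca, y, test_size=0.2, random_state=0
)

# Choose K by 10-fold cross validation on the training data only.
search = GridSearchCV(
    KNeighborsRegressor(),
    param_grid={"n_neighbors": list(range(1, 21))},
    cv=10,
    scoring="neg_root_mean_squared_error",
)
search.fit(X_train, y_train)
best_k = search.best_params_["n_neighbors"]

# RMSE between the estimated (f) and original (y) solid fractions of the test data.
f = search.predict(X_test)
rmse = float(np.sqrt(np.mean((f - y_test) ** 2)))
```

Comparing the RMSE of `search.predict(X_train)` with the test RMSE gives a simple check for overfitting of the kind discussed for the training and test data groups.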
First, the result of the solid-phase fraction analysis by KNN is shown in Figure 5. In this case, the Gaussian distribution error was not added to the simulated spectrum dataset (σ = 0%). Here, K = 2 was obtained as the optimal value. The horizontal axis of the diagram is the original solid-phase fraction, and the vertical axis is the solid-phase fraction estimated by KNN. The analyzed results of the test data, shown by the red circles, are in good agreement with the ideal values shown by the black line. In the figure, we also show the analysis results of the training data (blue triangles) in order to see how accurate the analysis can be when analyzing the data used for training. Although this depends on the type of ML algorithm used, a difference between the distributions of the training and test data results indicates that overfitting may be occurring. In this case, there appears to be no problem.
Figure 5. Results of solid-liquid phase fraction analysis by K-nearest neighbor algorithm (KNN) for the simulated spectral dataset without Gaussian errors (σ = 0%) for both the training and testing data.
Next, we analyzed simulated datasets with Gaussian distribution errors (σ = 1, 2, 3, ..., 9, 10%) using the ML models built by the training spectra, including the errors, which is the subject of this research. For example, the ML analysis results for the dataset that included an error of σ = 5% were divided and analyzed as mentioned above and shown in Figure 6. Here, the hyperparameter K = 10 was the optimal value. As can be seen from the figure, the variance of both the training and test data groups increases around the black line, which is the set value of the solid-phase fraction. However, the estimated values follow the black line as a whole and the solid-phase fraction is considered to give results with a certain accuracy. Since the variances of the training and test data groups are about the same, note that our ML model did not fail with regard to overlearning.   Figure 7 shows the difference in RMSE when the ML models were built from each dataset created by changing the simulated statistical error, σ. The horizontal axis of the diagram is the magnitude of the statistical error σ in the test data and the vertical axis is the RMSE. This figure shows the difference in RMSE depending on the statistical error (<5%) of the training data. If the statistical error contained in the test data is small, it is more accurate to perform the analysis using an ML model built using the training data of small σ. However, looking at more than 5% of the horizontal axis shown in the diagram (statistical error of the test (analyzed) data), we found that the accuracy of the analysis was almost the same even when the statistical error of the test data was larger than the error in the training data. In other words, it is more profitable to obtain the training data with the same or better statistical accuracy than the analyzing data.
diagram is the magnitude of the statistical error σ in the test data and the vertical axis is the RMSE. This figure shows the difference in RMSE depending on the statistical error (< 5%) of the training data. If the statistical error contained in the test data is small, it is more accurate to perform the analysis using an ML model built using the training data of small σ. However, looking at more than 5% of the horizontal axis shown in the diagram (statistical error of the test (analyzed) data), we found that the accuracy of the analysis was almost the same even when the statistical error of the test data was larger than the error in the training data. In other words, it is more profitable to obtain the training data with the same or better statistical accuracy than the analyzing data.
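The evaluation described above (PCA reduction, KNN estimation, and RMSE as a function of the noise level in the training and test data) can be sketched roughly as follows. This is a minimal illustration only, not the authors' code: the two end-member spectra are synthetic stand-ins, the linear mixing rule and the use of relative Gaussian noise are assumptions, and the helper names (`make_dataset`, `fit_pca`, `knn_predict`, `rmse_for`) are hypothetical. Only the values K = 10 and three principal components are taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical end-member spectra (NOT the paper's data): a smooth "liquid"
# curve and a "solid" curve with Bragg-edge-like steps.
wl = np.linspace(0.1, 0.7, 200)  # wavelength axis in nm
liquid = np.exp(-0.6 - 0.8 * wl)
solid = np.exp(-0.6 - 0.8 * wl - 0.25 * (wl < 0.53) - 0.15 * (wl < 0.40))

def make_dataset(sigma, n_rep=5):
    """Mixed spectra for solid fractions 0..1 with relative Gaussian noise sigma."""
    frac = np.tile(np.linspace(0.0, 1.0, 101), n_rep)
    # ASSUMPTION: linear mixing of the two end-member spectra.
    clean = frac[:, None] * solid + (1.0 - frac[:, None]) * liquid
    noisy = clean * (1.0 + sigma * rng.standard_normal(clean.shape))
    return noisy, frac

def fit_pca(X, n_comp=3):
    """Return the mean spectrum and the first n_comp principal axes."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_comp]

def knn_predict(Z_train, y_train, Z_test, k=10):
    """KNN regression: average the labels of the k nearest training points."""
    d = np.linalg.norm(Z_test[:, None, :] - Z_train[None, :, :], axis=2)
    idx = np.argsort(d, axis=1)[:, :k]
    return y_train[idx].mean(axis=1)

def rmse_for(sigma_train, sigma_test, k=10):
    """RMSE of the estimated solid fraction for one (train, test) noise pair."""
    X_tr, y_tr = make_dataset(sigma_train)
    X_te, y_te = make_dataset(sigma_test)
    mean, comps = fit_pca(X_tr)
    Z_tr = (X_tr - mean) @ comps.T
    Z_te = (X_te - mean) @ comps.T
    pred = knn_predict(Z_tr, y_tr, Z_te, k)
    return float(np.sqrt(np.mean((pred - y_te) ** 2)))

for s_train in (0.01, 0.05):
    row = [rmse_for(s_train, s_test) for s_test in (0.01, 0.05, 0.10)]
    print(f"train sigma={s_train:.2f}:", [f"{r:.3f}" for r in row])
```

Each printed row corresponds to one training noise level and each column to one test noise level, which is the same layout of comparison that Figure 7 is described as showing. The numbers themselves depend entirely on the synthetic spectra and should not be compared with the published values.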

Overview of Neutron Transmission Spectroscopic Imaging Experiments of LBE Solidification
The NTSI experiments during the solidification process of LBE were performed using the energy-resolved neutron imaging system RADEN in the Materials and Life Science Experimental Facility (MLF) of the Japan Proton Accelerator Research Complex (J-PARC) [25], with a neutron wavelength resolution of 0.25%. The details of these experiments have already been reported [26,27]. The solidified LBE sample measured 80 mm in height, 80 mm in width, and 10 mm in thickness, and it was contained in a 304 stainless steel container with 1-mm-thick walls. The entire container was covered with glass fiber for heat insulation. A heater block was installed at the top of the container and a heat sink at the bottom. The sample temperature was monitored using thermocouples inserted at seven locations around the container. A gas electron multiplier (GEM)-type detector (nGEM, Bee Beans Technologies [28]) was used as the neutron time-of-flight (TOF) imaging detector, covering a detection area of 100 mm × 100 mm. The pixel size was 0.8 mm and there were 128 × 128 pixels in total. In the experiment, the sample was first heated for 60 min until roughly the upper half of the LBE sample reached a temperature higher than 124.7 °C, which is the melting point of LBE. Then, while this initial state was held, the first neutron transmission TOF imaging measurement was performed for 60 min. After that, the heating was stopped and the gradual solidification of the sample, which proceeded from the bottom upward, was measured over 270 min as time-accumulated measurements every 30 min.

Figure 8 shows one of the neutron transmission images taken during the LBE solidification process as an example of the measured data. The figure shows a neutron wavelength-resolved radiogram for neutron wavelengths of 0.530 to 0.537 nm, near the β(101) Bragg edge of LBE. The relatively uniform zone at the bottom of the figure is considered to be the zone where the sample did not melt during the first 60 min of heating. Above it, the zone with black and white vertical shading is considered to be the zone that solidified during the solidification process. The slightly darker gray zone in the upper part (surrounded by a red frame) is estimated to be the zone through which the solid-liquid interface moved during the 30-min integration measurement, and the uppermost uniform zone is the LBE melt. Although the approximate location of the solidification interface can be estimated from the obtained images, it is necessary to evaluate the solid-liquid phase fraction to investigate its location in detail. The target of this study was to determine the rate at which this red zone moves from the bottom to the top during the solidification experiment.

Creation of Training Data and Building Machine Learning Model
The training spectral dataset used for the ML analysis was created from neutron transmission spectra obtained in the TOF imaging experiments. To obtain the liquid-phase spectrum shown in Figure 9, we averaged the entire zone where the liquid phase was judged to be 100% in the first 60-min measurement, as identified from the temperature information of the thermocouples inserted in the sample. For the solid-phase spectra, the data taken after the entire sample had solidified were picked out in sections of 45 horizontal pixels × 2 vertical pixels, avoiding the thermocouple insertion area, and the pixels in each section were integrated and averaged together. In this way, 50 types of solid-phase spectra were sampled, one for every two vertical pixels. Figure 10 shows some of the spectra obtained. The solid-phase spectra obtained from the different points show differences in texture. The effect of the sample container was removed as a background during these procedures.
Using the spectra for the 100% liquid phase (1 kind) and 100% solid phase (50 kinds), prepared as described above, we created a training dataset of LBE neutron transmission spectra for solid-phase fractions from 0 to 100% in 1% increments, following the procedure described in Section 3.1. Since we prepared 50 types of solid-phase spectra, the total number of neutron transmission spectra in the created training dataset was 5001 (50 × 100 mixtures with a nonzero solid-phase fraction, plus the single 0% liquid-phase spectrum). These spectra were then reduced to three dimensions by PCA, and the 5001 reduced spectra were used to train the KNN model.
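The count of 5001 training spectra and the reduction to three principal components can be reproduced with a short sketch. The mixing rule used here (a linear combination of the liquid and solid spectra) is only an assumption for illustration; the actual procedure is the one described in Section 3.1. The function name `build_training_set` and the toy spectra are hypothetical.

```python
import numpy as np

def build_training_set(liquid, solids, step=0.01):
    """Return (spectra, solid_fraction) for all liquid/solid mixtures."""
    liquid = np.asarray(liquid, dtype=float)
    solids = np.asarray(solids, dtype=float)      # shape (n_solid, n_channels)
    n_steps = int(round(1.0 / step))
    fractions = np.arange(1, n_steps + 1) * step  # 0.01, 0.02, ..., 1.00
    spectra = [liquid]                            # the single 0 % (pure liquid) entry
    labels = [0.0]
    for solid in solids:
        for f in fractions:
            # ASSUMPTION: linear mixing; the paper's rule is given in Section 3.1.
            spectra.append(f * solid + (1.0 - f) * liquid)
            labels.append(f)
    return np.vstack(spectra), np.array(labels)

# Toy end-member spectra (NOT measured data): 1 liquid and 50 solid spectra.
rng = np.random.default_rng(1)
liquid = np.full(64, 0.6)
solids = 0.6 + 0.05 * rng.standard_normal((50, 64))

X, y = build_training_set(liquid, solids)
print(X.shape)  # (5001, 64): 50 x 100 mixtures plus one pure-liquid spectrum

# Reduce every spectrum to three principal-component scores.
mean = X.mean(axis=0)
_, _, vt = np.linalg.svd(X - mean, full_matrices=False)
Z = (X - mean) @ vt[:3].T
print(Z.shape)  # (5001, 3)
```

The pure-liquid spectrum is stored only once because the 0% mixture is identical for every solid-phase spectrum, which is why the total is 5001 rather than 50 × 101.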

Principal Component Imaging by PCA of Solidified Sample
As an example of the PCA result for the LBE spectra of the solidified sample obtained in the experiment, the first through fifth principal components are shown together with the neutron transmission image in Figure 11. A different image was obtained for each principal component. The first principal component closely resembles the neutron transmission image, while the images of the second through fourth principal components emphasize the solidification texture. It can also be seen that the higher the order of the principal component, the less information the image contains; the fifth principal component shows an almost uniform noise distribution. Although these images do not directly correspond to physical quantities, PCA can visualize points of interest in the shape of the neutron transmission spectrum. Note that the sample container and thermocouples are clearly visible in the lower-order principal components.
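Principal component images of this kind can be computed by treating each pixel's spectrum as one sample and mapping its scores back onto the detector grid. The following is a minimal sketch under stated assumptions: the data cube layout `(ny, nx, n_channels)`, the function name `principal_component_images`, and the random toy cube are all hypothetical, and the PCA is done with a plain SVD rather than any specific library used by the authors.

```python
import numpy as np

def principal_component_images(cube, n_comp=5):
    """Map per-pixel spectra (ny, nx, n_channels) to n_comp score images."""
    ny, nx, nch = cube.shape
    X = cube.reshape(ny * nx, nch)           # one row per pixel spectrum
    Xc = X - X.mean(axis=0)                  # center each wavelength channel
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ vt[:n_comp].T              # (n_pixels, n_comp)
    explained = s[:n_comp] ** 2 / np.sum(s ** 2)
    images = scores.T.reshape(n_comp, ny, nx)
    return images, explained

# Toy cube (NOT measured data): 32 x 32 pixels, 40 wavelength channels.
rng = np.random.default_rng(2)
cube = 0.5 + 0.05 * rng.standard_normal((32, 32, 40))
images, explained = principal_component_images(cube)
print(images.shape)  # (5, 32, 32)
```

The `explained` array gives the fraction of the total variance carried by each component, which is one way to quantify the observation that higher-order components contain progressively less information.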
Figure 11. Examples of 100 mm × 100 mm images of the first through fifth principal components (PC) obtained by the PCA of the neutron transmission spectrum at each pixel (after solidification).
Figure 12 shows an example of a single-pixel spectrum used for analysis. From the thermocouple measurements, this pixel was estimated to be in the liquid phase until 30 min after the solidification process started. The statistical error is large for one-pixel data. Such a spectrum was created pixel by pixel from the TOF transmission image data, and the solid-phase fraction was quantitatively obtained by KNN analysis. Figure 13 shows an example of the result of quantifying the solid-phase fraction over the entire detection area. The linear shadows indicated by the blue arrows are thermocouples. To exclude the influence of the background caused by such devices, the solidification process was evaluated in the zone indicated by the red frame. Note that, once the spectral dataset for analysis is prepared, the KNN analysis can be completed in a few minutes, so the method is highly time-efficient.
Some examples of the solid-phase fraction map and its horizontal average inside the red-frame zone are shown in Figure 14. In the average diagram on the right of the figure, the horizontal axis is the solid-phase fraction and the vertical axis is the distance from the top of the sample. Initially, the lower zone of the image corresponds to 100% solid phase and the upper zone to 100% liquid phase. In the average diagram, the analyzed results for the solid- and liquid-phase zones do not reach exactly 100% and 0%, which may be due to the influence of statistical errors and residual background components. However, as shown in the diagram, each zone shows a roughly constant value, which can be used to identify the solid- and liquid-phase zones. In addition, there is a zone of gradual change in phase fraction between these two zones, which is considered to be the zone through which the solid-liquid interface moved within 30 min. This interfacial zone moves toward the top over time, until eventually the whole area becomes solid.
Subsequently, the average solid-liquid interface position was estimated as the position of the 50% solid-phase fraction for all analyzed images. The results are shown in Figure 15. The horizontal axis shows the time since the solidification process began, and the vertical axis shows the position from the top of the sample. The results are relatively consistent with the time dependence reported in previous studies [26,27], which was estimated from neutron transmission images. Based on these results, we estimated the solidification velocity during the experiment, which is shown in Figure 16. The solidification velocity tends to increase gradually with time, which may be reasonable considering the decrease in the liquid-phase volume over time, i.e., the decrease in heat storage capacity.
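The interface estimate described above (horizontal averaging, locating the 50% solid-phase fraction, then differentiating the position with respect to time) can be sketched as follows. This is an illustrative outline, not the authors' procedure: the function name `interface_position`, the linear interpolation between rows, and the toy fraction maps are assumptions. The 0.8 mm pixel size and the 30-min frame interval are taken from the experimental description.

```python
import numpy as np

def interface_position(fraction_map, pixel_mm=0.8, level=0.5):
    """Depth (mm from the top row) where the row-averaged solid fraction crosses level.

    ASSUMPTION: the solid fraction increases from the top (liquid) to the
    bottom (solid) of the map, as in the experiment described in the text.
    """
    profile = np.asarray(fraction_map, dtype=float).mean(axis=1)  # horizontal average
    above = profile >= level
    if not above.any():
        return float("nan")        # no row reaches the level (fully liquid frame)
    i = int(np.argmax(above))      # first row at or above the level
    if i == 0:
        return 0.0                 # already solid at the top row
    f0, f1 = profile[i - 1], profile[i]
    # Linear interpolation between the two rows that bracket the level.
    return float((i - 1) * pixel_mm + (level - f0) / (f1 - f0) * pixel_mm)

def toy_map(center_row, n_rows=100, n_cols=10, width=20):
    """Toy fraction map (NOT measured data): linear ramp centered on center_row."""
    rows = np.arange(n_rows)
    profile = np.clip((rows - center_row) / width + 0.5, 0.0, 1.0)
    return np.repeat(profile[:, None], n_cols, axis=1)

times_min = np.array([0.0, 30.0, 60.0])  # one frame every 30 min
positions_mm = np.array([interface_position(toy_map(c)) for c in (40, 30, 15)])

# The interface moves toward the top, so its depth decreases with time;
# the upward solidification velocity is minus the time derivative.
velocity_mm_per_min = -np.gradient(positions_mm, times_min)
print(positions_mm, velocity_mm_per_min)
```

Interpolating between rows gives a sub-pixel estimate of the interface depth, and `np.gradient` uses one-sided differences at the first and last frames, so the end-point velocities are less reliable than the interior ones.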

Conclusions
In this study, the application of ML to neutron transmission spectroscopic imaging (NTSI) was investigated using solid-liquid phase fraction analysis as an example. The target of the analysis was the evaluation of the solid-liquid interface movement during the solidification process of LBE. In supervised ML, creating training data can be an issue, but in this study we assembled the training dataset using actual experimental data with a wide variety of solid-liquid phase fractions and crystalline textures of the solid phase without model calculation.
For the ML analysis, we adopted two steps, namely dimensionality reduction of the neutron transmission spectrum by PCA and spectrum pattern analysis by KNN, with the goal of reducing the computational cost as much as possible. To examine this ML analysis, an evaluation spectrum dataset was created by first setting the phase fraction between the simulated solid spectra and the experimental liquid spectrum. Then, the ML analysis was examined with a dataset that contained Gaussian-distributed random errors to simulate statistical errors. As a result, it was found that PCA can evaluate the solid-liquid phase fraction of LBE with principal components up to several dimensions, while KNN can evaluate the fraction with the same accuracy as that found for the training data.
As a result of applying this ML analysis to the actual experimental data, it was possible to evaluate the solid, liquid, and mixed-interface zones quantitatively based on the phase fraction. The average position of the solid-liquid interface was generally consistent with the results of previous studies [26,27]. The ML analysis introduced here addressed the simple problem of shape recognition for a mixture of solid- and liquid-phase spectra, which have clearly different shapes. Extending ML to general spectrum analysis raises many issues, such as the accumulation of training data. We are considering applying ML to NTSI spectral analysis by limiting the number of parameters of the crystal texture.
Thus, the application of ML analysis to NTSI seems to have potential. In unsupervised ML, the visualization of PCA results is also expected to serve as a bridge to detailed crystallographic analysis because it can clearly show the characteristics of the transmission spectrum. Further, the result that measured spectra can be used as the training data in supervised ML is advantageous for NTSI, which collects many spectra at once from the multiple pixels of a two-dimensional detector. The effectiveness of using experimental spectra as training data, as revealed in this study, shows the potential of ML analysis in which large databases of previous experimental data are applied to new experimental data.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.