Article

Using Nearest-Neighbor Distributions to Quantify Machine Learning of Materials’ Microstructures

by Jeffrey M. Rickman 1,2,*, Katayun Barmak 3, Matthew J. Patrick 3 and Godfred Adomako Mensah 2

1 Department of Physics, Lehigh University, Bethlehem, PA 18015, USA
2 Department of Materials Science and Engineering, Lehigh University, Bethlehem, PA 18015, USA
3 Department of Applied Physics and Applied Mathematics, Columbia University, New York, NY 10027, USA
* Author to whom correspondence should be addressed.
Entropy 2025, 27(5), 536; https://doi.org/10.3390/e27050536
Submission received: 2 April 2025 / Revised: 13 May 2025 / Accepted: 15 May 2025 / Published: 17 May 2025
(This article belongs to the Section Multidisciplinary Applications)

Abstract
Machine learning strategies for the semantic segmentation of materials’ micrographs, such as U-Net, have been employed in recent years to enable the automated identification of grain-boundary networks in polycrystals. For example, most recently, this architecture has allowed researchers to address the long-standing problem of automated image segmentation of thin-film microstructures in bright-field TEM micrographs. Such approaches are typically based on the minimization of a binary cross-entropy loss function that compares constructed images to a ground truth at the pixel level over many epochs. In this work, we quantify the rate at which the underlying microstructural features embodied in the grain-boundary network, as described stereologically, are also learned in this process. In particular, we assess the rate of microstructural learning in terms of the moments of the k-th nearest-neighbor pixel distributions and associated metrics, including a microstructural cross-entropy, that embody the spatial correlations among the pixels through a hierarchy of n-point correlation functions. From the moments of these distributions, we obtain so-called learning functions that highlight the rate at which the important topological features of a grain-boundary network appear. It is found that the salient features of network structure emerge after relatively few epochs, suggesting that grain size, network topology, etc., are learned early (as measured in epochs) during the segmentation process.

1. Introduction

Many metals and alloys that are employed in engineering applications are polycrystalline, meaning they comprise a multitude of small crystallites that are misoriented with respect to one another and meet at defects called grain boundaries. The grains and grain-boundary structures of materials, known collectively as the microstructure, often dictate the properties and observed performance, including the mechanical (e.g., strength and ductility), electrical (e.g., resistivity), and transport (e.g., diffusion) behavior [1,2,3,4,5,6,7,8]. Therefore, it is extremely important to accurately characterize the microstructural topology. To achieve this aim, one early approach employed edge detection algorithms to segment micrographs into constituent grains and grain boundaries [9]. Unfortunately, due to the complex contrast present in diffraction-contrast images, the segmentations generated by such approaches miss many true boundaries where the contrast between grains is not well defined, and are rife with spurious boundaries arising from internal contrast within grains, such as bend contours produced by the distortion of crystal planes. More recently, a machine learning approach to this problem has leveraged boundaries in hand-labeled micrographs to train a model based on the U-Net architecture to effect automated grain-boundary segmentation. This strategy has been successfully applied, in particular, to bright-field transmission electron microscopy (TEM) micrographs for the identification of grain-boundary networks [10]. One example of such applications is the recent work of Patrick et al. [11], who employed a supervised learning approach based on a U-Net convolutional neural network to automate the tracing of grain-boundary networks in aluminum, platinum, and palladium thin films.
While a variety of deep learning architectures are available for image segmentation, many microstructural quantities of interest, particularly those related to triple-junction statistics, require precise segmentation of fine features. This task also requires models that can learn robust relationships from limited datasets, as the manual labeling of grain-boundary networks is particularly laborious. U-Net is specifically designed to address these challenges in the context of biomedical images. Furthermore, preliminary results suggest that instance-segmentation models such as YOLOv8 [12,13] perform well on grain-size statistics, but the boundary locations and the network’s topology are only known implicitly, and the performance of these models has generally been found to be sensitive to magnification. Together, these findings highlight U-Net as an attractive and relevant architecture for the analysis presented in this paper.
As with other deep learning algorithms, the U-Net algorithm is designed to learn objective (i.e., loss) functions. In this context, the task of image segmentation can be regarded as classification at the pixel level, in which semantic segmentation involves the minimization of a loss function [14]. For most applications, the binary cross-entropy loss function is widely used for segmentation. However, given the centrality of the loss function in the segmentation process, various researchers have proposed alternative functions in different application domains to optimize results. These alternatives can be categorized based on their functional form, notably including those comparing underlying probability distributions (e.g., cross-entropies) and those comparing the distance between the predicted structure and the ground truth (e.g., distance loss metrics) [14,15].
In this work, using U-Net as a test case, we address the question: What is being learned during deep learning-based segmentation of a material’s microstructure? In particular, we seek to highlight the rate at which the underlying microstructure, as reflected in the topology of the grain-boundary network, is learned when a pixel-level objective function, such as the binary cross-entropy, is minimized. This is accomplished by defining so-called learning functions that embody the spatial correlations among the pixels that comprise the grain-boundary network. These functions are based on the moments of the k-th nearest-neighbor pixel distributions, as these distributions can be expressed through a hierarchy of n-point correlation functions that reveal the spatial arrangement of the pixels in the boundary network. As demonstrated below, it is found that the structure of this network emerges after relatively few epochs in the segmentation process.

2. Materials and Methods

To test our methodology, we interrogate microstructures acquired from samples imaged using bright-field transmission electron microscopy (BF TEM). The grain-boundary network obtained from these images is constructed using U-Net and subsequently analyzed using the spatial distribution of the neighbors of the network pixels. The associated experimental and data-analytic methodologies are described below.

2.1. Experimental System and Imaging

The microstructures under study were generated from a 100 nm thick aluminum thin film that was sputter deposited on a Si(100) substrate with a 300 nm thick thermal oxide [16]. The substrate was rotated at 20 rpm to ensure deposition uniformity using a DC magnetron sputtering system (with a base pressure in the 10^−9 Torr range) that was operated at 500 W DC using 1.1 mTorr of purified argon (99.9995%). After this procedure, the sample was cut into square chips (<1.6 mm × 1.6 mm), which were subsequently annealed at 185 °C in a reducing atmosphere of flowing Ar/H2. It was found that the grain-size distributions were equivalent for the different annealing times considered here. These chips were first mechanically thinned to <200 μm, and then chemical back etching was employed to remove both the silicon wafer material and some of the thermal oxide to yield large (∼1 mm diameter) electron-transparent regions. The resulting samples were imaged via BF TEM in an FEI F200X Talos S/TEM (200 kV accelerating voltage, 30 μm objective aperture, 36,000× magnification). Images were recorded for three sample tilts per field of view on a 4096 × 4096 BM-CETA charge-coupled device (CCD) camera and were subsequently downscaled to 512 × 512 during U-Net inferencing. All three sample tilts acquired in this manner were used to inform the manual reconstruction of the microstructures for comparison with the corresponding U-Net reconstructions.

2.2. U-Net Implementation

The U-Net architecture was first employed for the segmentation of TEM images of stained biological samples [17,18] and, as noted above, has been used more recently to enable the automated identification of grain-boundary networks in polycrystals [10,16]. Ronneberger et al. [18] developed this “fully convolutional network” architecture and associated training strategy, in which a contracting network is supplemented by layers that enhance the resolution of the output. More specifically, the architecture comprises both compressive (encoder) and expansive (decoder) paths, with the former consisting of two repeated convolutions, a subsequent rectified linear unit (ReLU), and a pooling operation, and the latter consisting of a feature map upsampling, a concatenation with a cropped map, and several convolution/ReLU combinations. Further details, including the use of training sets, can be found in the work by Ronneberger et al. As noted above, the algorithm seeks to minimize a binary cross-entropy loss function based on a soft-max operation [19] on a feature map. The binary cross-entropy loss function, denoted here by l(t), is employed in binary classification problems, such as those performed with U-Net, to quantify the difference between a class label and a model’s predicted probability of that label. This loss function operates at the pixel level to classify a given pixel, thereby effectively quantifying the distance between the probability distributions of the predicted and actual pixel values. By minimizing l(t), one seeks to improve model accuracy relative to the ground truth. Our aim here is to compare l(t) with other learning measures, as defined below, that highlight the rate at which a grain-boundary network is constructed during U-Net processing.
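For concreteness, the pixel-level binary cross-entropy described above can be sketched in a few lines of Python. This is an illustrative NumPy implementation, not the authors’ code; the function name, array shapes, and the clipping constant are assumptions made for the example.

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean pixelwise binary cross-entropy between a ground-truth
    boundary mask y_true (0/1) and predicted probabilities y_pred."""
    p = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p)))

# A near-perfect prediction gives a small loss; a poor one a large loss.
mask = np.array([[0, 1], [1, 0]], dtype=float)
good = np.array([[0.01, 0.99], [0.99, 0.01]])
bad = np.array([[0.90, 0.10], [0.10, 0.90]])
assert binary_cross_entropy(mask, good) < binary_cross_entropy(mask, bad)
```

Averaging over all pixels in this way is what makes l(t) a pixel-to-pixel comparator, in contrast to the microstructural measures introduced below.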
The training set used for segmentation in this study comprised images from 299 fields of view. The corresponding grain-size distribution, the probability density function of the normalized effective circular-grain diameter d/⟨d⟩, is shown in Figure 1 along with a fitted log-normal distribution. As the log-normal distribution provides a robust, albeit empirical, description of the grain-size distribution in many different systems, the training set is representative of microstructures with a wide spectrum of grain sizes consistent with normal grain growth conditions. After training, U-Net was employed to segment the images in the test set considered here. Prior to analyzing these 512 × 512 test images, they were individually cropped to 480 × 480 pixels to remove unphysical border lines and then subjected to periodic boundary conditions. (The use of periodic boundary conditions here is not necessary but facilitates comparisons with idealized microstructures comprising, for example, periodically repeated square grains.) As the pixel data values lie on the interval [0, 1], it was convenient to binarize the data to distinguish the grain-boundary pixels from the background. The reconstructed images in the test set were then analyzed as a function of the epoch number using the methodology described below.
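The cropping, binarization, and periodic-boundary treatment just described can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the center-crop choice, and the 0.5 threshold are hypothetical, as the text does not specify these details.

```python
import numpy as np

def preprocess(img, crop=480, threshold=0.5):
    """Center-crop a [0, 1]-valued segmentation map to crop x crop pixels
    (removing unphysical border lines) and binarize it so that
    grain-boundary pixels are 1 and background pixels are 0."""
    n = img.shape[0]
    lo = (n - crop) // 2
    cropped = img[lo:lo + crop, lo:lo + crop]
    return (cropped >= threshold).astype(np.uint8)

def periodic_delta(p, q, size=480):
    """Minimum-image displacement between pixel coordinates p and q
    under periodic boundary conditions."""
    d = np.abs(np.asarray(p) - np.asarray(q))
    return np.minimum(d, size - d)

img = np.random.rand(512, 512)
out = preprocess(img)
assert out.shape == (480, 480)
# Pixels near opposite edges are close neighbors under periodic boundaries.
assert np.all(periodic_delta((0, 0), (479, 479)) == 1)
```

All neighbor distances used in the analysis below would then be computed with this minimum-image convention.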

2.3. Methodology for Nearest-Neighbor Analysis

The grain-boundary network is characterized stereologically in terms of the spatial statistics of the pixels comprising it. More specifically, the grain-boundary pixel locations are regarded as a spatial point process [20] that can be described through a hierarchy of n-point correlation functions [21], ξ^(n), that reflect the positional associations among the pixels. While one can sometimes determine these correlation functions for small n, for larger values of n, the calculations become increasingly tedious, and it is advantageous to work instead with the moments of the k-th nearest-neighbor distributions.
To see that these neighbor distributions embody ξ^(n) and therefore reflect important spatial correlations associated with the grain-boundary network, consider a point process in d dimensions with a density ρ_d and described by the n-point correlation functions, ξ^(n), such that ξ^(0) = 0 and ξ^(1) = 1. These correlation functions reflect, for example, the tendency of points to cluster or, alternatively, to be effectively repelled from one another. To determine the probability of finding a certain number of neighbors in a volume V_d, one starts with the generating function [21,22]

G(z) = \exp\left[ \sum_{k=1}^{\infty} \frac{\rho_d^k (z-1)^k}{\Gamma(k+1)} \int_{V_d} \cdots \int_{V_d} d^d r_1 \cdots d^d r_k \, \xi^{(k)}(\mathbf{r}_1, \ldots, \mathbf{r}_k) \right],    (1)
where Γ is the gamma function [23], from which one can find the probability of finding k neighbors (k = 1, 2, …) in this volume via

P_k = \frac{1}{\Gamma(k+1)} \left. \frac{d^k G(z)}{dz^k} \right|_{z=0}.    (2)
Thus, as expected, the probability of finding neighboring points within the same volume is determined by the correlations among these points and, consequently, the average neighbor distances, ⟨r_k⟩, and higher-order moments, ⟨r_k^m⟩ (m > 1), embody these correlations as well. Measures based on these moments that compare reconstructed boundary networks will therefore reflect differences in pixel spatial correlations between distinct networks. This observation is the basis for the use of learning functions to quantify differences in network topology, as described below.
To illustrate the use of the neighbor distributions in this context, consider first a Poisson point process in d spatial dimensions with associated intensity ρ_d. The probability of finding k − 1 points drawn from this distribution in a hyperspherical region of radius r and the k-th point between r and r + dr is given by

p_k(r)\, dr = \rho_d \, \frac{\left[\rho_d V_d(r)\right]^{k-1}}{\Gamma(k)} \, \exp\!\left[-\rho_d V_d(r)\right] S_{d-1}(r)\, dr,    (3)
where, in this case, the d-dimensional hyperspherical volume, V_d(r), and (d−1)-dimensional surface area, S_{d−1}(r), are given, respectively, by [24]

V_d(r) = \frac{\pi^{d/2} r^d}{\Gamma\!\left(\frac{d}{2}+1\right)}, \qquad S_{d-1}(r) = \frac{dV_d(r)}{dr} = \frac{2\pi^{d/2} r^{d-1}}{\Gamma\!\left(\frac{d}{2}\right)}.    (4)
The m-th moment of this distribution for k = 1, 2, … is given by

\langle r_k^m \rangle = \int_0^{\infty} dr \, r^m \, p_k(r) = \frac{1}{\rho_d^{m/d}} \, C_{d,m} \, \frac{\Gamma\!\left(k + \frac{m}{d}\right)}{\Gamma(k)},    (5)
where C_{d,m} = [Γ(d/2 + 1)/π^{d/2}]^{m/d}. Moreover, for the calculation of a neighbor-based cross-entropy, it is also useful to note that the average of the logarithm of the k-th neighbor distance can be obtained from the previous results by differentiating the m-th moment, via

\left. \frac{d\langle r_k^m \rangle}{dm} \right|_{m=0} = \langle \ln r_k \rangle = \frac{1}{d} \left[ \psi(k) + \ln\!\left( \frac{\Gamma\!\left(\frac{d}{2}+1\right)}{\rho_d \, \pi^{d/2}} \right) \right],    (6)
where ψ(k) is the digamma function [23].
For d = 2, the first two moments of the distributions for k = 1, 2, … are given by

\langle r_k \rangle = \frac{1}{(\rho_2 \pi)^{1/2}} \, \frac{\Gamma\!\left(k + \frac{1}{2}\right)}{\Gamma(k)}, \qquad \langle r_k^2 \rangle = \frac{k}{\rho_2 \pi}.    (7)
These moments constitute a benchmark for comparison with the moments of the boundary pixel distributions, the latter calculated from U-Net processed micrographs.
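The benchmark in Equation (7) can be verified numerically by drawing random points on a periodic unit square and comparing the empirical k-th neighbor distances with the analytic moments. This is an illustrative sketch, not the authors’ analysis code; the sample size and seed are arbitrary choices for the example.

```python
import numpy as np
from math import gamma, pi, sqrt

def poisson_moments_2d(k, rho):
    """Analytic moments <r_k> and <r_k^2> for a 2-d Poisson process of
    intensity rho (Eq. (7))."""
    r1 = gamma(k + 0.5) / (gamma(k) * sqrt(rho * pi))
    r2 = k / (rho * pi)
    return r1, r2

# Monte Carlo check on a unit torus, where the intensity equals N.
rng = np.random.default_rng(0)
N, k = 1500, 3
pts = rng.random((N, 2))
diff = np.abs(pts[:, None, :] - pts[None, :, :])
diff = np.minimum(diff, 1.0 - diff)        # minimum-image convention
dist = np.sqrt((diff ** 2).sum(axis=-1))
dist.sort(axis=1)
rk = dist[:, k]                            # column 0 is the point itself
r1_mc, r2_mc = rk.mean(), (rk ** 2).mean()
r1_th, r2_th = poisson_moments_2d(k, rho=float(N))
assert abs(r1_mc - r1_th) / r1_th < 0.05
assert abs(r2_mc - r2_th) / r2_th < 0.10
```

The same empirical moment calculation, applied instead to the binarized grain-boundary pixels, yields the quantities compared against this random benchmark below.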

2.4. Learning Functions

While some statistical tests exist to compare the moments of neighbor distributions [25,26], for the pixelized data used here, it is more convenient to construct other comparators to assess the rate at which the microstructural information embodied in the grain-boundary network is learned over the course of many U-Net epochs, t. For this purpose, two comparators, hereafter called learning functions, are constructed to emphasize their utility in this context. The first learning function, L_1(t), is obtained from the sum over the neighbors of the moment differences between the current epoch number t and the final epoch number t_f. More specifically, after this sum is scaled by its maximum value, obtained using the starting epoch number, t_0, and the final epoch number, t_f, we define the ratio as

L_1(t, t_0, t_f, m) := \frac{\sum_{k=1}^{k_{\mathrm{end}}} \left[ \langle r_k^m \rangle(t_f) - \langle r_k^m \rangle(t) \right]}{\sum_{k=1}^{k_{\mathrm{end}}} \left[ \langle r_k^m \rangle(t_f) - \langle r_k^m \rangle(t_0) \right]},    (8)
where k_end is the final neighbor number considered and 0 ≤ L_1 ≤ 1. It is of particular interest to compare the evolution of the U-Net cross-entropy loss function, l(t), with L_1(t) to quantify the rate at which microstructural features are learned during U-Net boundary network formation, as summarized in Section 3 below.
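The scaling in Equation (8) can be made concrete with a short sketch. The moment arrays here are hypothetical values chosen only to exhibit the normalization: L_1 equals 1 at the starting epoch and 0 at the final epoch.

```python
import numpy as np

def L1(moments_t, moments_tf, moments_t0):
    """Learning function L1 (Eq. (8)): the sum over neighbors k of the
    moment differences between epoch t and the final epoch t_f, scaled
    by its value at the starting epoch t_0. Each argument is an array
    of <r_k^m> values indexed by k."""
    num = np.sum(moments_tf - moments_t)
    den = np.sum(moments_tf - moments_t0)
    return num / den

# Hypothetical moment arrays illustrating the normalization.
m_t0 = np.array([1.0, 2.0, 3.0])
m_tf = np.array([0.5, 1.2, 2.1])
assert L1(m_t0, m_tf, m_t0) == 1.0   # no learning yet
assert L1(m_tf, m_tf, m_t0) == 0.0   # fully converged
m_mid = 0.5 * (m_t0 + m_tf)
assert abs(L1(m_mid, m_tf, m_t0) - 0.5) < 1e-12
```

A rapid decay of L_1 toward zero with epoch number then signals that the moment structure of the boundary network has been learned.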
The second learning function, L_2(t), is related to a scaled cross-entropy based on the logarithms of the moments of the k-th neighbor distributions. As this comparator incorporates the structure of a grain-boundary network, we refer to it as a scaled microstructural cross-entropy. We note that the first moment of such distributions has been used in other fields to estimate, for example, the conformational and hydration entropies of molecules [27]. In this case, the construction of L_2(t) is motivated by considering the cross-entropy, H(p, q), between two probability density functions (pdfs) p(x) and q(x) associated with the values x of the random variable X [28]. The cross-entropy can be written in terms of the entropy associated with p(x), H(p), and the Kullback–Leibler (K–L) divergence, D(p||q) [29], as

H(p, q) = H(p) + D(p\,\|\,q),    (9)
where the K–L divergence can be interpreted as the statistical distance between the two pdfs. In this context, we take p and q as the pdfs associated with the pixel locations in two (typically different) microstructures.
Several authors have shown that estimates of the K–L divergence can be calculated using ⟨ln r_k⟩ [28,30]. To obtain the K–L divergence for microstructural comparison, one draws a set of N samples of pixel locations {X_1, …, X_N} from a particular microstructure generated at epoch number t and another set of M samples {Y_1, …, Y_M} from a fully reconstructed reference microstructure at time t_f. Then, if ε_i^k is the Euclidean distance between X_i and its k-th neighbor in {X_1, …, X_{i−1}, X_{i+1}, …, X_N} and ν_i^k is the Euclidean distance between X_i and its k-th neighbor in {Y_1, …, Y_M}, an estimator of the K–L divergence is given by

\hat{D}(t, t_f, k) = \frac{d}{\alpha} \sum_{i=1}^{\alpha} \ln\!\left( \frac{\nu_i^k}{\epsilon_i^k} \right) + \ln\!\left( \frac{\beta}{\alpha - 1} \right),    (10)
where β = max(M, N) and α = min(M, N). Using this estimator, the second learning function is a scaled microstructural cross-entropy (K–L divergence) defined as

L_2(t, t_0, t_f, k) := \frac{\hat{D}(t, t_f, k)}{\hat{D}(t_0, t_f, k)},    (11)
so that 0 ≤ L_2 ≤ 1. Again, as above, it is of interest to compare the evolution of the U-Net binary cross-entropy loss function, l(t), with the microstructural cross-entropy, L_2(t), as the former is based on a pixel-to-pixel comparison while the latter embodies a comparison of microstructural features and thereby quantifies the rate at which these features are learned.
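The nearest-neighbor estimator in Equation (10) can be sketched directly. This is an illustrative brute-force implementation (not the authors' code, and a k-d tree would be preferable for large pixel sets); the Gaussian test samples are hypothetical stand-ins for pixel-location sets.

```python
import numpy as np

def kl_knn(X, Y, k=2):
    """k-th nearest-neighbor estimator of the K-L divergence (Eq. (10))
    between sample sets X (microstructure at epoch t) and Y (reference
    microstructure at epoch t_f). eps_i is the distance from X_i to its
    k-th neighbor among the other points of X; nu_i is the distance from
    X_i to its k-th neighbor in Y."""
    X, Y = np.asarray(X, float), np.asarray(Y, float)
    d = X.shape[1]
    alpha, beta = min(len(X), len(Y)), max(len(X), len(Y))
    dXX = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    dXY = np.linalg.norm(X[:, None] - Y[None, :], axis=-1)
    dXX.sort(axis=1)
    dXY.sort(axis=1)
    eps = dXX[:, k]       # column 0 is the zero self-distance
    nu = dXY[:, k - 1]
    return (d / alpha) * np.log(nu[:alpha] / eps[:alpha]).sum() \
        + np.log(beta / (alpha - 1))

# Identical distributions give a divergence near zero; separated ones do not.
rng = np.random.default_rng(1)
same = kl_knn(rng.normal(0, 1, (500, 2)), rng.normal(0, 1, (500, 2)))
far = kl_knn(rng.normal(0, 1, (500, 2)), rng.normal(3, 1, (500, 2)))
assert far > same
```

Dividing this estimate at epoch t by its value at t_0, as in Equation (11), then yields L_2.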

3. Results

As noted above, U-Net was employed to segment TEM micrographs obtained from a sputter-deposited Al thin film. Figure 2 shows the U-Net-traced microstructures for different epoch numbers. As is evident in the figure, while the grain-boundary network only begins to emerge after 20 epochs, a substantially more complete network skeleton is already present after 50 epochs, and this incomplete network subsequently evolves to fill in missing boundary segments as the number of epochs increases. To quantify the extent to which the topology of the boundary network is accurately reflected during reconstruction as a function of the epoch number, we use the two learning functions described above.
The learning function L_1(t, t_0, t_f, m) is constructed from the information contained in the moments of the k-th nearest-neighbor distributions (see Equation (8)). Figure 3a,b show the dimensionless moments ρ_2^{1/2}⟨r_k⟩(t) and ρ_2⟨r_k^2⟩(t), respectively, as functions of the neighbor number, k, for many epochs, t, for microstructures generated using U-Net. Also shown for comparison are the results for a spatially random distribution of pixels (see Equation (7)). As is evident in the figures, as the epoch number increases, both ρ_2^{1/2}⟨r_k⟩(t) and ρ_2⟨r_k^2⟩(t) eventually converge to the corresponding limiting results (although not monotonically). In addition, the dimensionless moments differ considerably from those associated with a spatially random distribution for small k, since in this regime, neighboring pixels are arrayed approximately on line segments rather than distributed in two dimensions. Note that, in general, the dimensionless distance to the k-th neighbor is less than the corresponding distance for a random two-dimensional distribution of pixels due to the more compact structure of the boundary network created by U-Net. In general, the curvature of ρ_2^{1/2}⟨r_k⟩ as a function of k can be related to a transition from one-dimensional to two-dimensional behavior, and, as outlined in Section 4, one can extract an approximate average grain diameter from this plot.
From this information, we can compare and contrast the U-Net cross-entropy loss function, l(t), and L_1(t, t_0, t_f, m) to determine the rate at which microstructural characteristics are learned during the U-Net boundary network construction. Figure 4 presents this comparison as a function of t for the first two moments, m = 1, 2. One can see in the figure that, as quantified by L_1, after only about 75 epochs, the U-Net-generated microstructural network reflects most of the important topological features of the fully reconstructed network. This behavior can be contrasted with that of l(t), which evinces a much slower decay with the epoch number; the decay rate for l(t) is approximately 3.3 × 10^−3 epochs^−1. While it is tempting to interpret negative values of L_1 as a negative correlation, particularly for m = 2, such negative values may result from an accumulation of statistical uncertainties in the second moments of the neighbor distributions. Finally, if one regards l(t) and L_1 as pseudo-time-series data, the learning rate vis-à-vis l(t) can be obtained in terms of a cross-correlation function.
We next examine the evolution of the scaled microstructural cross-entropy, L_2(t, t_0, t_f, k), as depicted in Figure 5, for neighbors k = 2 and k = 4. (Given that ν_i^1 = 0 whenever the same pixel coordinates occur in both a given microstructure and the reference structure at epoch number t_f, we omit consideration of the first neighbor, k = 1.) The spatial distribution of these nearest neighboring pixels is highly revealing. As is evident from the figure, after approximately 100 epochs, the principal characteristics of the microstructural network have been recovered by U-Net. As these nearest neighbors primarily describe pixel correlations on short boundary segments, these segments are, perhaps unsurprisingly, key to boundary network construction. These results provide additional evidence that microstructural network correlations develop early in the U-Net segmentation process.

4. Discussion and Conclusions

The aim of this work is to examine what is learned during U-Net-based segmentation of the microstructure of a material. We assessed the rate of microstructural learning associated with the segmentation of micrographs using two learning functions based on the moments of the k-th nearest-neighbor pixel distributions. These functions embody the spatial correlations among the pixels that comprise the grain-boundary networks that emerge during segmentation and are straightforward to calculate. It was found that the structure of these networks emerges after relatively few U-Net epochs, indicating that important microstructural features (e.g., grain size, network topology, etc.) are learned early during the segmentation process (as measured in epochs). In short, the microstructural learning rate is large relative to that associated with the U-Net binary cross-entropy loss function.
To provide statistical validation for the claim that salient microstructural features emerge early in the U-Net reconstruction, we conducted paired t-tests for k-moment differences across epochs to assess the degree to which moment differences are significant. More specifically, we performed hypothesis tests for two pairs of dimensionless first moments: the first at epochs t_1 = 10 and t_2 = 295 and the second at epochs t_1 = 75 and t_2 = 295. For each pair of moments, we compared the null hypothesis H_0: ρ_2^{1/2}⟨r_k⟩(t_1) − ρ_2^{1/2}⟨r_k⟩(t_2) = 0 with the alternative hypothesis H_A: ρ_2^{1/2}⟨r_k⟩(t_1) − ρ_2^{1/2}⟨r_k⟩(t_2) ≠ 0. Under the assumption that the test statistic based on a pooled variance follows a t-distribution, we found in the first case that the corresponding p-values, p(k) ≈ 0 for all k, an unsurprising result given the separation between the corresponding curves in Figure 3. In contrast, in the second case, it was found that p(k) > 0.05 for all k (except for k = 1). As one therefore cannot reject H_0 at the 5% significance level, this result suggests that the first moments are likely the same over nearly all values of k and that the microstructural networks at epochs 75 and 295 are essentially the same. The results from these tests should, however, be interpreted with some care, as samples from different epochs are not independent.
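The pooled-variance test described above can be sketched as follows. This is an illustrative implementation under stated assumptions: the moment samples are hypothetical Gaussian surrogates (not the paper's data), and a normal approximation replaces the t-distribution when converting the statistic to a p-value, which is adequate for the large degrees of freedom involved.

```python
import numpy as np
from math import erf, sqrt

def pooled_t_statistic(x, y):
    """Two-sample t statistic with pooled variance, of the kind used to
    test H0: the dimensionless first moments at two epochs are equal."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n, m = len(x), len(y)
    sp2 = ((n - 1) * x.var(ddof=1) + (m - 1) * y.var(ddof=1)) / (n + m - 2)
    return (x.mean() - y.mean()) / sqrt(sp2 * (1.0 / n + 1.0 / m))

def approx_p_value(t):
    """Two-sided p-value from a normal approximation to the
    t-distribution (adequate for large degrees of freedom)."""
    return 2.0 * (1.0 - 0.5 * (1.0 + erf(abs(t) / sqrt(2.0))))

# Hypothetical moment samples: clearly separated epochs reject H0.
rng = np.random.default_rng(2)
moments_early = rng.normal(1.0, 0.05, 60)
moments_late = rng.normal(0.8, 0.05, 60)
assert approx_p_value(pooled_t_statistic(moments_early, moments_late)) < 0.05
```

As in the text, a p-value above 0.05 for overlapping samples would mean that H_0 cannot be rejected at the 5% significance level.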
We note that some useful microstructural information can be inferred directly from the dependence of the pixel moment, ⟨r_k⟩, on the neighbor number k. As an illustration, consider a simplified 2d microstructure consisting of square grains, each with a side length ℓa, on a square grid with pixels of side length a. For a fixed, large value of k, one expects that ⟨r_k⟩ ∼ 1/√ρ_2, where, in this case, each grain contributes 2ℓ − 1 boundary pixels in an area (ℓa)^2, so that ρ_2 = (2ℓ − 1)/(ℓa)^2 and 1/√ρ_2 = ℓa/(2ℓ − 1)^{1/2} ≈ a(ℓ/2)^{1/2} for large ℓ. If one then uses Equation (7) as an approximation for ⟨r_k⟩ in general, one finds that, for large k,

\langle r_k \rangle \approx \left( \frac{k}{\pi} \right)^{1/2} \frac{\ell a}{(2\ell - 1)^{1/2}} \left( 1 - \frac{1}{8k} \right),    (12)
where an asymptotic expansion is used for the gamma function. Equation (12) can be used to estimate the average grain size, ℓa. For example, for the last U-Net-constructed image analyzed above (i.e., epoch number 295), using this equation, one finds that ℓ ≈ 44.7, and so, using the pixel size a = 1.6 nm associated with this image, ℓa ≈ 71.6 nm. This result is relatively insensitive to the value of k for 50 < k < 70. (The effective circular-grain diameter for the corresponding ground-truth microstructure was found to be 100.6 nm. One could relax the circular-grain assumption by, for example, fitting individual grains with ellipses to account for different grain aspect ratios and then defining the grain diameter as the average of the lengths of the semi-major and semi-minor axes. Similarly, one can extend our square-grain model by using different-sized squares or rectangles to represent grains.)
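Inverting Equation (12) for ℓ is a small exercise in algebra: squaring both sides gives a quadratic in ℓ. The sketch below is illustrative (the function name is hypothetical, and k = 60 is an arbitrary choice inside the quoted insensitive range 50 < k < 70); it round-trips the values reported in the text.

```python
from math import pi, sqrt

def grain_side_in_pixels(r_k, k, a):
    """Invert Eq. (12) for ell (grain side length in units of the pixel
    size a), given a measured k-th neighbor distance r_k. Squaring
    Eq. (12) yields ell**2 / (2*ell - 1) = c, i.e. the quadratic
    ell**2 - 2*c*ell + c = 0 with
    c = (r_k / (sqrt(k/pi) * a * (1 - 1/(8*k))))**2."""
    c = (r_k / (sqrt(k / pi) * a * (1.0 - 1.0 / (8 * k)))) ** 2
    return c + sqrt(c * c - c)  # larger root, since ell >> 1

# Round-trip check with the values quoted in the text: ell = 44.7 and
# a = 1.6 nm imply an average grain size ell * a of about 72 nm.
ell, a, k = 44.7, 1.6, 60
r_k = sqrt(k / pi) * ell * a / sqrt(2 * ell - 1) * (1 - 1 / (8 * k))
assert abs(grain_side_in_pixels(r_k, k, a) - ell) < 1e-6
```

Selecting the larger quadratic root is the physically sensible choice here, since ℓ, the grain side in pixels, is much greater than 1.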
As indicated above, the moments given in Equations (5)–(7) are used simply for comparison with those calculated directly from the data. In other words, the calculation of the moments from the pixelated data does not involve the continuous spatial assumption used to derive the moments in Equations (5)–(7). Nevertheless, it is useful to provide approximate correction factors that connect the pixelated calculations with their continuous counterparts [31]. For example, for d = 2, the number of neighbors, k, within a circle of radius ⟨r_k⟩ is given by k ≈ ρ_2 π ⟨r_k⟩^2. Taking into account the pixels that lie along the circumference of this circle, one can set bounds on the radius. One finds that

\left( \frac{k}{\rho_2 \pi} \right)^{1/2} \left( 1 - \sqrt{\frac{2\pi \rho_2 a^2}{k}} \right) < \langle r_k \rangle < \left( \frac{k}{\rho_2 \pi} \right)^{1/2} \left( 1 + \sqrt{\frac{2\pi \rho_2 a^2}{k}} \right).    (13)
Thus, for large k, r k k / ρ 2 π , with the corrections given by Equation (13).
The approach outlined here for boundary network reconstruction can be improved by combining the image information obtained from different sample tilts of the same field of view. The underlying issue is that the contrast between neighboring grains changes as the specimen is tilted such that, at any given tilt, a number of boundaries will likely be out of contrast. To mitigate this information loss, if multiple images corresponding to different tilts are available, one can form a blended image using a logical OR operation that yields a boundary pixel location if a boundary pixel is found at any tilt [9]. The blended image will, therefore, contain boundary information that may be absent in an image at a particular tilt. The evolution of this blended image can then be tracked as a function of the epoch number.
Finally, given the utility of the functions L_1 and L_2 in capturing microstructural learning, it is reasonable to ask whether the existing binary cross-entropy loss function can be augmented with L_1 and/or L_2. Given the rapid convergence of the learning functions with time, it would be best not to employ them solely as a loss function but rather to construct a multi-objective loss function comprising l(t) and either L_1 or L_2. The incorporation of these learning functions could profitably enhance the convergence of U-Net segmentation. The creation of such a compound loss function is the subject of ongoing work.

Author Contributions

J.M.R. conceptualized this work and performed the data analysis. K.B. and M.J.P. also conceptualized this work and performed the microscopy and U-Net analysis. G.A.M. contributed to the discussions of optimizing image information. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the U.S. National Science Foundation (NSF), under grant number DMS-1905492, and the DMREF program, under grant numbers DMS-2118206 and DMS-2118197.

Institutional Review Board Statement

Not applicable in this case.

Data Availability Statement

The data used in this work are available from the authors upon request, with the understanding that the data provided will not be used commercially.

Acknowledgments

The imaging work was carried out in the Electron Microscopy Laboratory of the Columbia Nano Initiative (CNI) Shared Lab Facilities at Columbia University.

Conflicts of Interest

The authors declare that they have no competing interests.

References

  1. Hansen, N. Hall-Petch relation and boundary strengthening. Scr. Mater. 2004, 51, 801–806.
  2. Hanaor, D.A.H.; Xu, W.; Ferry, M.; Sorrell, C.C. Abnormal grain growth of rutile TiO2 induced by ZrSiO4. J. Cryst. Growth 2012, 359, 83–91.
  3. Zhu, M.-L.; Xuan, F.-Z. Correlation between microstructure, hardness and strength in HAZ of dissimilar welds of rotor steels. Mat. Sci. Eng. A 2010, 527, 4035–4042.
  4. Yoo, E.; Moon, J.H.; Sang, Y.; Kim, Y.; Ahn, J.-P.; Kim, Y.K. Electrical resistivity and microstructural evolution of electrodeposited Co and Co-W nanowires. Mater. Charact. 2020, 166, 110451.
  5. Barmak, K.; Lu, X.; Darbal, A.; Nuhfer, N.T.; Choi, D.; Sun, T.; Warren, A.P.; Coffey, K.R.; Toney, M.F. On twin density and resistivity of nanometric Cu thin films. J. Appl. Phys. 2016, 120, 065106.
  6. Bedu-Amissah, K.; Rickman, J.M.; Chan, H.M.; Harmer, M.P. Grain-boundary diffusion of Cr in pure and Y-doped alumina. J. Am. Ceram. Soc. 2007, 90, 1551–1555.
  7. Wang, C.-M.; Cho, J.; Chan, H.M.; Harmer, M.P.; Rickman, J.M. Influence of dopant concentration on creep properties of Nd2O3-doped alumina. J. Am. Ceram. Soc. 2001, 84, 1010–1016.
  8. Cho, J.; Wang, C.M.; Chan, H.M.; Rickman, J.M.; Harmer, M.P. Improved tensile creep properties of yttrium- and lanthanum-doped alumina: A solid solution effect. J. Mater. Res. 2001, 16, 425–429.
  9. Carpenter, D.T.; Rickman, J.M.; Barmak, K. A methodology for automated quantitative microstructural analysis of transmission electron micrographs. J. Appl. Phys. 1998, 84, 5843.
  10. Patrick, M.J.; Eckstein, J.K.; Lopez, J.R.; Toderas, S.; Levine, S.; Rickman, J.M.; Barmak, K. Automated grain boundary detection for bright-field transmission electron microscopy images via U-Net. Microsc. Microanal. 2023, 29, 1968–1979.
  11. Patrick, M.J.; Eckstein, J.K.; Lopez, J.R.; Toderas, S.; Levine, S.; Barmak, K. U-Net implementation for high throughput grain boundary detection in bright field TEM micrographs: Toward in situ grain growth studies. Microsc. Microanal. 2023, 29 (Suppl. S1), 1581–1582.
  12. Williams, D.B.; Carter, C.B. Transmission Electron Microscopy: A Textbook for Materials Science, 2nd ed.; Springer: New York, NY, USA, 2010.
  13. Jocher, G.; Chaurasia, A.; Qiu, J. Ultralytics YOLOv8, version 8.0.0; Ultralytics: San Francisco, CA, USA, 2023. Available online: https://github.com/ultralytics/ultralytics (accessed on 9 May 2025).
  14. Jadon, S. A survey of loss functions for semantic segmentation. In Proceedings of the 2020 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), Via del Mar, Chile, 27–29 October 2020; pp. 1–7.
  15. Ciampiconi, L.; Elwood, A.; Leonardi, M.; Mohamed, A.; Rozza, A. A survey and taxonomy of loss functions in machine learning. arXiv 2023, arXiv:2301.05579.
  16. Barmak, K.; Rickman, J.M.; Patrick, M.J. Advances in experimental studies of grain growth in thin films. JOM 2024, 76, 3622–3636.
  17. Siddique, N.; Paheding, S.; Elkin, C.P.; Devabhaktuni, V. U-net and its variants for medical image segmentation: A review of theory and applications. IEEE Access 2021, 9, 82031–82057.
  18. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention (MICCAI); Springer: Berlin/Heidelberg, Germany, 2015; Volume 9351, pp. 234–241.
  19. Gao, B.; Pavel, L. On the properties of the softmax function with application in game theory and reinforcement learning. arXiv 2017, arXiv:1704.00805. Available online: https://arxiv.org/pdf/1704.00805 (accessed on 14 May 2025).
  20. Diggle, P.J. Statistical Analysis of Spatial and Spatio-Temporal Point Patterns; CRC Press: New York, NY, USA, 2014.
  21. van Kampen, N.G. Stochastic Processes in Physics and Chemistry, 3rd ed.; Elsevier: Amsterdam, The Netherlands, 2007.
  22. Banerjee, A.; Abel, T. Nearest neighbour distributions: New statistical measures for cosmological clustering. Mon. Not. R. Astron. Soc. 2021, 500, 5479–5499.
  23. Arfken, G.B.; Weber, H.J. Mathematical Methods for Physicists, 6th ed.; Academic Press: New York, NY, USA, 2005.
  24. Luscombe, J.H. Thermodynamics; CRC Press: Boca Raton, FL, USA, 2018.
  25. Schilling, M.F. Multivariate two-sample tests based on nearest neighbors. J. Am. Stat. Assoc. 1986, 81, 799–806.
  26. Friedman, J.H.; Steppel, S. A Nonparametric Procedure for Comparing Multivariate Point Sets; SLAC Computation Group (Internal) Technical Memo 153 [U.S. Atomic Energy Contract AT(043)515]; Stanford University: Stanford, CA, USA, 1974.
  27. Fogolari, F.; Borelli, R.; Dovier, A.; Esposito, G. The kth nearest neighbor method for estimation of entropy changes from molecular ensembles. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2024, 14, e1691.
  28. Li, S.; Mnatsakanov, R.M.; Andrew, M.E. k-nearest neighbor based consistent entropy estimation for hyperspherical distributions. Entropy 2011, 13, 650–667.
  29. Cover, T.M.; Thomas, J.A. Elements of Information Theory; John Wiley & Sons: Hoboken, NJ, USA, 2006.
  30. Zhao, P.; Lai, L. Analysis of k nearest neighbor KL divergence estimation for continuous distributions. In Proceedings of the 2020 IEEE International Symposium on Information Theory (ISIT), Los Angeles, CA, USA, 21–26 June 2020; pp. 2562–2567.
  31. Miyagawa, M. An approximation for the kth nearest distance and its application to locational analysis. J. Oper. Res. Soc. Jpn. 2012, 55, 146–157.
Figure 1. The probability density function of the normalized effective circular-grain diameter, $d/\langle d \rangle$, for the training set. Also shown is a fit to a log-normal distribution (dashed line).
Figure 2. U-Net-traced microstructures for different epoch numbers: (a) 20, (b) 50, and (c) 295. Note the emergence of a connected grain-boundary network as the number of epochs increases.
Figure 3. (a) The dimensionless first moment, $\rho_2^{1/2} \langle r_k \rangle_t$, versus the neighbor number, $k$, for many epochs, $t$. The epoch numbers after which each curve was compiled are shown in the legend. The results for neighbors up to $k_{\mathrm{end}} = 70$ are presented. In addition, the results for a spatially random distribution of pixels are shown as a solid black line (see Equation (7)). (b) The dimensionless second moment, $\rho_2 \langle r_k^2 \rangle_t$, versus the neighbor number, $k$, for many epochs, $t$ (see Equation (7)).
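As a rough illustration of how such dimensionless moments can be compiled, the sketch below computes the first two moments of the k-th nearest-neighbor distances for a 2D point set (e.g., boundary-pixel coordinates rescaled to the unit square) together with the corresponding moments for a spatially random (2D Poisson) process. The function names, the unit-square normalization, and the density estimate are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from math import lgamma, pi, sqrt

def knn_dimensionless_moments(points, k_end):
    """Return (m1, m2) where m1[k-1] = rho^(1/2) <r_k> and m2[k-1] = rho <r_k^2>,
    with the intensity rho estimated as N points per unit area (coordinates are
    assumed rescaled to the unit square). Brute-force distances; fine for a few
    thousand points."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    dist.sort(axis=1)                # column 0 is the self-distance (zero)
    d = dist[:, 1:k_end + 1]         # k-th NN distance sits in column k-1
    rho = float(n)                   # points per unit area
    return sqrt(rho) * d.mean(axis=0), rho * (d ** 2).mean(axis=0)

def poisson_baseline(k_end):
    """Dimensionless moments of a 2D Poisson process:
    rho^(1/2) <r_k> = Gamma(k+1/2) / (Gamma(k) sqrt(pi)) and rho <r_k^2> = k/pi,
    following from pi*rho*r_k^2 being Gamma(k)-distributed."""
    ks = np.arange(1, k_end + 1)
    m1 = np.exp([lgamma(k + 0.5) - lgamma(k) for k in ks]) / sqrt(pi)
    m2 = ks / pi
    return m1, m2
```

For a spatially random point set the empirical curves should track the baseline (up to edge effects), whereas clustered boundary pixels fall below it at small k, which is the qualitative signature visible in the figure.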
Figure 4. Comparison of the loss function $l(t)$ (shown as a red line) with $\mathcal{L}_1(t, t_0, t_f, m)$ (see Equation (8)) as a function of $t$, the latter for the first two moments, $m = 1, 2$, at a starting epoch number $t_0 = 15$ and a final epoch number $t_f = 295$. The results are based on $k_{\mathrm{end}} = 70$ neighbors. It should be noted that one can also use a ground-truth image instead of the image corresponding to $t_f = 295$. This ground truth can be obtained, for example, through hand-tracing of the microstructure [10].
Figure 5. The dependence of the scaled microstructural cross-entropy, $\mathcal{L}_2(t, t_0, t_f, k)$ (see Equation (11)), as a function of $t$ for neighbors $k = 2, 4$ at a starting epoch number $t_0 = 10$ and a final epoch number $t_f = 295$. The results are based on $k_{\mathrm{end}} = 70$ neighbors. The somewhat faster decrease in $\mathcal{L}_2$ for $k = 4$ may be specific to the microstructural network considered here rather than a general trend. In future work, it would be interesting to examine the behavior of $\mathcal{L}_2$ as a function of $k$ for various thin-film microstructures.
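For readers who wish to experiment with nearest-neighbor divergence measures of the kind underlying such a cross-entropy (cf. refs. [29,30]), the following is a minimal sketch of a standard k-NN Kullback-Leibler divergence estimator between two 2D point sets. It is a generic estimator, not the paper's Equation (11), and the function name is illustrative.

```python
import numpy as np

def knn_kl_divergence(x, y, k=1):
    """Generic k-NN estimator of D(P||Q) from samples x ~ P and y ~ Q
    (see, e.g., Zhao and Lai, ISIT 2020):
    D ~ (dim/n) * sum_i log(nu_k(i) / rho_k(i)) + log(m/(n-1)),
    where rho_k(i) is the k-th NN distance of x_i among the other x's and
    nu_k(i) is its k-th NN distance among the y's. Brute-force distances;
    fine for a few thousand points."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, dim = x.shape
    m = len(y)
    dxx = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(-1))
    dxx.sort(axis=1)
    rho = dxx[:, k]                  # column 0 is the self-distance
    dxy = np.sqrt(((x[:, None, :] - y[None, :, :]) ** 2).sum(-1))
    dxy.sort(axis=1)
    nu = dxy[:, k - 1]
    return (dim / n) * np.log(nu / rho).sum() + np.log(m / (n - 1))
```

For two samples drawn from the same distribution the estimate fluctuates around zero, while a shift between the two point sets drives it positive, mirroring the way the microstructural cross-entropy decays as the traced network approaches the final (or ground-truth) image.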
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

