Contrast Transfer Function-Based Exit-Wave Reconstruction and Denoising of Atomic-Resolution Transmission Electron Microscopy Images of Graphene and Cu Single Atom Substitutions by Deep Learning Framework

The exit wave is the state of a uniform plane incident electron wave exiting immediately after passing through a specimen and before the atomic-resolution transmission electron microscopy (ARTEM) image is modified by the aberration of the optical system and the incoherence effect of the electron. Although exit-wave reconstruction has been developed to prevent the misinterpretation of ARTEM images, there have been limitations in the use of conventional exit-wave reconstruction in ARTEM studies of the structure and dynamics of two-dimensional materials. In this study, we propose a framework that consists of the convolutional dual-decoder autoencoder to reconstruct the exit wave and denoise ARTEM images. We calculated the contrast transfer function (CTF) for real ARTEM and assigned the output of each decoder to the CTF as the amplitude and phase of the exit wave. We present exit-wave reconstruction experiments with ARTEM images of monolayer graphene and compare the findings with those of a simulated exit wave. Cu single atom substitution in monolayer graphene was, for the first time, directly identified through exit-wave reconstruction experiments. Our exit-wave reconstruction experiments show that the performance of the denoising task is improved when compared to the Wiener filter in terms of the signal-to-noise ratio, peak signal-to-noise ratio, and structural similarity index map metrics.


Introduction
In recent years, researchers have explored many new materials with unique properties. Among them, two-dimensional (2D) materials have been considered some of the most attractive for future applications, including semiconductors, batteries, fuel cells, sensors, and information technology [1], owing to their excellent intrinsic properties. The physical and chemical properties of 2D materials are greatly influenced by their microstructure, such as crystallinity, domain size, and atomic defects. Therefore, to understand the properties of 2D materials and to further tailor them for a specific application, information on their atomic structure and composition is essential [2]. In recent years, real-time atomic-resolution transmission electron microscopy (ARTEM) has become a major tool for observing the structure and dynamics of 2D materials. For example, the structure where r is a vector in the exit wave surface. Additionally, the direct reconstruction of the exit wave can be extracted by using the relationship between the inverse Fourier transform of the CCD data |φ(u)|² and the exit wave function ψ(r) with defocus z and wavelength λ [7]: where: According to Equation (2), the exit wave can be reconstructed from a single defocused CCD image by using the correlation of the phase and the amplitude information retrieved by the autocorrelation of the scattered wave area surrounding the penetrating wave. To obtain the autocorrelation, the single defocused image needs a sufficiently large penetrating wave area (X_free × X_free), which is four times larger than the penetrating wave area (X_obj × X_obj). The following Equation (5) should be satisfied.
According to the definition of the covalent radius, at least one of two adjacent atoms does not satisfy the aforementioned Equation (5) for ARTEM images of 2D materials.

Image Simulation Verification Method
Image simulation can be performed by applying the contrast transfer function (CTF), which describes the modification introduced by the optical system, to the exit wave (vide infra) [11][12][13][14]: where H(k), the contrast transfer function, can be expressed as follows [9]: where A(k) is the aperture function, which describes the cutoff of the TEM aperture, and E(k) is the envelope function, which contains various lens factors that affect the resolution.
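The role of H(k) in image simulation can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption rather than the form used in this work: the phase convention H(k) = A(k)E(k)exp(-iχ(k)), an aberration function χ(k) that keeps only defocus and third-order spherical aberration, a Gaussian envelope, the intensity model |IFFT(FFT(ψ)·H)|², and all parameter values. The function names `contrast_transfer_function` and `simulate_image` are hypothetical.

```python
import numpy as np

def contrast_transfer_function(shape, pixel_size, wavelength, defocus, cs, k_cut, k_env):
    """Toy CTF H(k) = A(k) * E(k) * exp(-i*chi(k)) on the FFT frequency grid.

    Lengths are in angstroms and spatial frequencies in 1/angstrom. The phase
    convention and the Gaussian envelope are assumptions of this sketch.
    """
    ky = np.fft.fftfreq(shape[0], d=pixel_size)
    kx = np.fft.fftfreq(shape[1], d=pixel_size)
    k2 = ky[:, None] ** 2 + kx[None, :] ** 2
    # Aberration phase: defocus plus third-order spherical aberration only.
    chi = np.pi * wavelength * defocus * k2 + 0.5 * np.pi * cs * wavelength**3 * k2**2
    aperture = (k2 <= k_cut**2).astype(float)   # A(k): hard aperture cutoff
    envelope = np.exp(-k2 / (2.0 * k_env**2))   # E(k): assumed Gaussian damping
    return aperture * envelope * np.exp(-1j * chi)

def simulate_image(exit_wave, H):
    """Image intensity |IFFT(FFT(psi) * H)|^2 for a complex exit wave psi."""
    return np.abs(np.fft.ifft2(np.fft.fft2(exit_wave) * H)) ** 2

# Illustrative values only; 0.0418 angstrom is the electron wavelength at 80 kV.
H = contrast_transfer_function((128, 128), pixel_size=0.1, wavelength=0.0418,
                               defocus=-50.0, cs=1.0e5, k_cut=1.5, k_env=1.0)
rng = np.random.default_rng(0)
phase = 0.1 * rng.standard_normal((128, 128))   # weak random phase object
image = simulate_image(np.exp(1j * phase), H)
```

With H set to all ones the simulated image reduces to |ψ|², which is a convenient sanity check before realistic aberration values are inserted.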
To apply the image simulation method described above in the study of 2D materials, it is necessary to conduct the experiment in the following order:

1. Prediction of the atomic structure from the raw data obtained from the CCD.
2. Conversion of the expected atomic structure into the exit wave by using the multislice method.
3. Execution of the image simulation by using the exit wave and CTF (Equation (6)).
4. Verification of the atomic structure by comparing the result of the image simulation to the raw data.
5. Iteration of the structure modulation until the simulated results of the expected atomic structure become similar to the actual image up to the desired level.
However, this verification method, which relies on reverse engineering through image simulation, has a limitation: the exit wave cannot be obtained directly from the actual image.

FFT-Based Image Deconvolution
FFT-based image deconvolution is a method involving an iterative deconvolution algorithm that can preserve image details [17,18]. A common blurred image (y) can be expressed as the following equation:

$$y = k * x + \eta$$

where k is the blur kernel, * denotes convolution, x is a latent image, and η represents additive noise. The latent image x is estimated as a reconstructed image x̂ by finding the linear filter g that minimizes the error ε, as shown below:

$$\epsilon = E\,|x - \hat{x}|^{2} \quad (10)$$

In the conventional Wiener deconvolution method, the blurred image can be restored by solving the following equation:

$$\hat{x} = \mathcal{F}^{-1}\!\left[\frac{\overline{\mathcal{F}(k)}}{|\mathcal{F}(k)|^{2} + n/s}\,\mathcal{F}(y)\right]$$

where $\mathcal{F}$ denotes the Fourier transform, and n and s indicate the expected power spectra of the noise and image, respectively. Since then, iterative error minimization research has been conducted by adding various regularization terms.
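As a concrete illustration of Wiener deconvolution in the frequency domain, the following minimal sketch uses NumPy's FFT routines. It assumes circular (periodic) convolution and replaces the ratio of the noise and image power spectra with a single constant `nsr`; the kernel, noise level, and function name `wiener_deconvolve` are hypothetical choices made for the example.

```python
import numpy as np

def wiener_deconvolve(y, kernel, nsr):
    """Wiener deconvolution with a constant noise-to-signal power ratio.

    Assumes the blur acted as a circular convolution with `kernel`.
    """
    K = np.fft.fft2(kernel, s=y.shape)          # zero-pad kernel to image size
    Y = np.fft.fft2(y)
    G = np.conj(K) / (np.abs(K) ** 2 + nsr)     # linear restoration filter
    return np.real(np.fft.ifft2(G * Y))

rng = np.random.default_rng(0)
x = rng.random((32, 32))                        # latent image
kernel = np.array([[0.0, 1.0, 0.0],
                   [1.0, 6.0, 1.0],
                   [0.0, 1.0, 0.0]]) / 10.0     # mild blur, invertible spectrum
blurred = np.real(np.fft.ifft2(np.fft.fft2(x) * np.fft.fft2(kernel, s=x.shape)))
y = blurred + 0.01 * rng.standard_normal(x.shape)   # additive noise
x_hat = wiener_deconvolve(y, kernel, nsr=1e-3)
```

A larger `nsr` suppresses noise amplification at frequencies where the kernel response is weak, at the cost of leaving more residual blur.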
In recent years, research on convolutional neural networks (CNNs) has been actively conducted.

Autoencoder
An autoencoder is a deep learning architecture built from artificial neural networks [19]: an encoder network and a decoder network connected by a latent vector. Each network is a sequential combination of linear operations, nonlinear activation functions, and various regularization methods, and its parameters are trained by a backpropagation algorithm. In the case of a basic autoencoder, the loss is the L2 norm of the difference between the input data and the output data, and it is minimized by training the network parameters with backpropagation. The loss is defined as:

$$L = \lVert x - g(f(x)) \rVert^{2}$$

where x is the input, f is the encoder, and g is the decoder. The basic autoencoder has been studied in fields such as denoising [20], nonlinear dimensionality reduction [21], super resolution [22], and anomaly detection [23]. We found similarities between ARTEM image simulation (Equation (6)) and FFT-based image deconvolution (Equation (12)). Therefore, in our study, we propose a convolutional dual-decoder autoencoder (CDDAE) framework for exit-wave reconstruction by applying the ARTEM image simulation to the FFT-based image deconvolution method and the autoencoder.
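The reconstruction loss of a basic autoencoder can be made concrete with a very small sketch. The single-layer encoder and decoder, the ReLU activation, the weight shapes, and the function names are all assumptions chosen for brevity; they are not the network used in this work.

```python
import numpy as np

def encode(x, W_enc):
    """Encoder f: linear map followed by ReLU."""
    return np.maximum(W_enc @ x, 0.0)

def decode(z, W_dec):
    """Decoder g: linear map back to the input space."""
    return W_dec @ z

def reconstruction_loss(x, W_enc, W_dec):
    """Squared L2 norm of the reconstruction error x - g(f(x))."""
    residual = x - decode(encode(x, W_enc), W_dec)
    return float(residual @ residual)

rng = np.random.default_rng(0)
x = rng.random(16)                               # toy input vector
W_enc = 0.1 * rng.standard_normal((4, 16))       # 16 -> 4 latent dimensions
W_dec = 0.1 * rng.standard_normal((16, 4))       # 4 -> 16 reconstruction
loss = reconstruction_loss(x, W_enc, W_dec)
```

Training would adjust `W_enc` and `W_dec` by gradient descent on this loss; the sketch only evaluates it.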

CDDAE Framework
Here, we propose a deep learning framework, the CDDAE framework, which is composed of two main parts: an autoencoder part (E, D_amp, D_phase) and an image simulation part (Figure 1). The autoencoder part takes an input image (X), which is an ARTEM image obtained from the CCD, written in Equation (6) and described as |φ(u)|², and has the role of finding the relationship between X and the exit wave (ψ) from Equation (2) for the direct reconstruction of the exit wave. It is a variation of the convolutional autoencoder, composed of one encoder (E) and two decoders: an amplitude decoder (D_amp) and a phase decoder (D_phase). E consists of the first four sequential layers, and each layer has a rectified linear unit (ReLU) as an activation function. E_out, the output of E, is the latent space representation of X. Suppose A is the output of D_amp, which decodes the amplitude of the exit wave from E_out, and B is the output of D_phase, which decodes the phase of the exit wave from E_out. We set the activation function of each decoder to a zero-centered hyperbolic tangent so that each decoder is sensitive to the sign of the wave during learning. Each decoder ends with a convolutional layer with a single 1 × 1 kernel and no activation function, which merges the results. Detailed information on the CDDAE network is provided in Table 1.
Nanomaterials 2020, 10, x FOR PEER REVIEW

The effective kernel size is 16, as the eight kernels with a size of 3 × 3 are sequentially stacked. In this case, the effective kernel size should be larger than the object size, because the convolution of the effective kernel and the object should be larger than the autocorrelation of the object (Equation (5)). The nonlinear property of the autoencoder reduces the effects of the factors unconsidered in the precalculated CTF, such as high order aberrations, attenuation factors, and noise. For the image simulation, we defined the exit wave ψ, which is a complex matrix, by using the outputs A and B of each decoder as follows:

$$\psi = A \exp(iB)$$

Furthermore, we could derive Ψ from the FFT of ψ as follows:

$$\Psi = \mathcal{F}[\psi]$$

Suppose I_r is the image simulation result obtained by assigning the precalculated CTF (Equation (7)) and ψ to Equation (6): CDDAE is a neural network that reconstructs the exit wave by reducing the loss function as follows: We use ADAM [24] to minimize the loss function, with 2 × 10⁻⁵ as the initial learning rate and default parameters.
Training the CDDAE framework on TEM images amounts to discovering a nonlinear solution for reconstructing the exit wave and denoising an input image.
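The image simulation part of the framework can be sketched in NumPy as a forward model that maps the two decoder outputs to a simulated image and a scalar loss. Several pieces are assumptions of this sketch and are not confirmed by the text: the polar form ψ = A·exp(iB), the intensity model |IFFT(Ψ·H)|², the mean squared error as the loss, and the all-ones placeholder CTF. The decoder outputs are replaced by random arrays, and the function names are hypothetical.

```python
import numpy as np

def exit_wave(amplitude, phase):
    """Complex exit wave assembled from decoder outputs A and B (assumed polar form)."""
    return amplitude * np.exp(1j * phase)

def simulated_image(amplitude, phase, H):
    """Apply a precalculated CTF H in Fourier space and return the intensity."""
    Psi = np.fft.fft2(exit_wave(amplitude, phase))
    return np.abs(np.fft.ifft2(Psi * H)) ** 2

def reconstruction_loss(X, amplitude, phase, H):
    """Mean squared error between the input image X and the simulated image (assumed loss)."""
    return float(np.mean((X - simulated_image(amplitude, phase, H)) ** 2))

rng = np.random.default_rng(0)
A = 1.0 + 0.05 * rng.standard_normal((128, 128))   # stand-in for the amplitude decoder output
B = 0.1 * rng.standard_normal((128, 128))          # stand-in for the phase decoder output
H = np.ones((128, 128), dtype=complex)             # placeholder CTF (no aberrations)
X = simulated_image(A, B, H)                       # synthetic "measured" image
loss = reconstruction_loss(X, A, B, H)             # zero by construction
```

In a trainable version, A and B would come from the two decoders and the loss would be differentiated with respect to the network weights, so that the fixed CTF constrains what the decoders can output.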

Training Data
The monolayer graphene sheet was synthesized by the chemical vapor deposition (CVD) method [25,26], and ARTEM images were obtained with an aberration-corrected TEM (FEI, Hillsboro, OR, USA) equipped with a monochromator at an acceleration voltage of 80 kV. The 4000 original ARTEM images, each 1024 × 1024 px, were cropped into 128 × 128 px images (64 crops per original), resulting in a dataset of 256,000 images. Each cropped image contains approximately 300 carbon atoms. Since the same atom cannot be identified across images, about 76,800,000 objects were ultimately used to train the CDDAE framework.
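The dataset sizes quoted above follow from simple tiling arithmetic, which the short sketch below reproduces. It assumes non-overlapping crops, which is consistent with the stated counts but is not spelled out in the text; `crop_tiles` is a hypothetical helper.

```python
import numpy as np

def crop_tiles(image, tile):
    """Split a 2D image into non-overlapping tile x tile crops (row-major order)."""
    h, w = image.shape
    rows, cols = h // tile, w // tile
    return (image[:rows * tile, :cols * tile]
            .reshape(rows, tile, cols, tile)
            .swapaxes(1, 2)
            .reshape(rows * cols, tile, tile))

tiles = crop_tiles(np.zeros((1024, 1024), dtype=np.float32), 128)
tiles_per_image = tiles.shape[0]          # 8 x 8 = 64 crops per original
total_tiles = 4000 * tiles_per_image      # 256,000 training images
total_atoms = total_tiles * 300           # about 76,800,000 atoms
```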

Direct Exit-Wave Reconstruction from Single Image of Monolayer Graphene
To overcome incoherence effects and aberration problems, a numerical method can be applied by reconstructing the wave function at the exit surface of the specimen [6]. Exit-wave reconstruction aims to accurately interpret the atomic information by acquiring the exit wave prior to the influence of the CTF. To evaluate the CDDAE framework, it is first necessary to input the test set images into the network trained by training set images. Subsequently, the evaluation can be performed by comparing the resultant exit wave with that of the atomic structures obtained from the image simulation verification method.
The exit-wave simulation and image simulation of the monolayer graphene were conducted using MacTempas (version 2.4.9; Total Resolution, Berkeley, CA, USA, 2015). We verified the atomic structure by comparing it with the test set images. The carbon atoms appeared in white contrast in the simulated exit-wave amplitude, whereas they appeared in dark contrast in the phase. The exit wave from the CDDAE framework showed the same contrast. This result allowed us to conclude that the framework succeeded in obtaining the exit wave. Figure 2 shows the evaluated images.

Identification of Cu Single Atom from Single Image
The CDDAE framework achieved an accurate identification of substitutional atoms in graphene. The incorporation of substitutional atoms (Cr, Ti, Pd, Ni, Al, Cu, Si, B, or N) in the graphene lattice results in the etching and doping of graphene [27][28][29][30]. Such doping and etching by impurity atoms is useful in building novel nanostructures. TEM has been used to analyze the effects of impurities in graphene because it can identify impurity atoms and the reconstructed novel nanostructures simultaneously. However, the identification of impurity atoms by TEM is difficult. In particular, substituted Cu and Si atoms cannot be distinguished in TEM images from atomic intensity differences alone, because these differences are easily misinterpreted [28]. Another obstacle is that the lifetime of substitutional atoms is just a few seconds [28], which makes them difficult to observe in TEM. To distinguish Cu and Si atoms accurately, exit-wave reconstruction is necessary. In Figure 3, the exit-wave reconstructions performed by the CDDAE framework and MacTempas are compared. Figure 3c-f shows the phase and amplitude images simulated with MacTempas for both Cu and Si substitution in the graphene lattice. The amplitude and phase reconstructed under our experimental TEM conditions made it possible to discern Cu and Si atoms, which show large differences in amplitude and phase, as shown in Figure 3g,h. Comparing the phase and amplitude images of the exit-wave reconstruction by the CDDAE framework to those of the MacTempas results, we conclude that the substitutional atom in the graphene lattice is Cu. It is worth noting that the CDDAE framework can achieve exit-wave reconstruction with just one TEM image.

Denoising Performance Metrics
I_r, the result of image simulation through the CTF and the reconstructed exit wave in the CDDAE framework, is a nonlinear denoising solution of the input image. A conventional denoising method, the Wiener filter, provides a linear denoising solution [14]. To compare the performance of the two methods, an image simulation dataset of monolayer graphene was built using MacTempas, and detector noise was added so that the SNR is 9.1164, in accordance with the actual TEM conditions, as shown in Figure 4a,b.
The results of denoising the noise-added dataset with the Wiener filter and the CDDAE framework are shown in Figure 4c,d. The quality of the images was evaluated by calculating the SNR, PSNR, and SSIM. The average values of the SNRs, PSNRs, and SSIMs of the images are summarized in Table 2. After filtering with either the Wiener filter or the CDDAE framework, the SNR, PSNR, and SSIM of each image improved. Comparing the results, the CDDAE framework performed better on all three metrics. These results indicate that performance can be improved by the nonlinear denoising solution. Furthermore, it is evident that the reconstructed exit wave from the CDDAE framework is the CTF deconvolution solution.
SNR: signal-to-noise ratio; PSNR: peak signal-to-noise ratio; SSIM: structural similarity index map.
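For reference, two of the three quality metrics can be computed in a few lines of NumPy. The definitions below (power ratios expressed in decibels, with a noise-free reference image) are common conventions and are assumptions here, since the text does not state which definitions or units were used; `snr_db` and `psnr_db` are hypothetical helper names. SSIM is omitted from the sketch; scikit-image offers an implementation as `skimage.metrics.structural_similarity`.

```python
import numpy as np

def snr_db(reference, test):
    """Signal-to-noise ratio in dB: reference power over error power."""
    noise = reference - test
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(noise ** 2))

def psnr_db(reference, test, peak):
    """Peak signal-to-noise ratio in dB for a given peak intensity."""
    mse = np.mean((reference - test) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

reference = np.full((8, 8), 2.0)      # toy noise-free image
noisy = reference + 0.2               # constant error of 10% of the signal
snr = snr_db(reference, noisy)        # about 20 dB
psnr = psnr_db(reference, noisy, peak=2.0)   # about 20 dB
```

Both functions need the noise-free reference, which is why a simulated dataset with added noise is the natural setting for this comparison.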

Conclusions
In this study, we demonstrated a deep learning framework, the CDDAE framework, for denoising ARTEM images and reconstructing their exit wave. The CDDAE framework obtains the exit wave from two separate decoders, which decode the phase and amplitude from the original images. The proposed framework aims to improve on the conventional method, which uses a precalculated CTF, by finding a nonlinear solution through backpropagation.
The output of the image simulation in the CDDAE framework is a nonlinear denoising solution for ARTEM images, which exceeds the performance of a conventional linear denoising solution.
The proposed framework was validated by reconstructing the exit wave of the ARTEM images of CVD-grown monolayer graphene and comparing it with the results of the conventional exit-wave simulation. Based on our proposed framework, we were able to differentiate between substituted Cu single atoms in the graphene lattice and substituted Si single atoms using, for the first time, only one ARTEM image. In addition, we showed that our method can perform denoising and exit-wave reconstruction on a single defocused image of 2D materials without a through-focal series of images. By adopting the proposed method, the misinterpretation of images can be reduced, and accurate atomic information can be used for structural and dynamics studies of 2D materials with ARTEM. We believe this pioneering work opens up new possibilities for the use of deep learning techniques in the field of atomic-scale 2D material research.

Conflicts of Interest:
The authors declare no conflict of interest.