Inpainting Saturation Artifact in Anterior Segment Optical Coherence Tomography

The cornea is an important refractive structure in the human eye. The corneal segmentation technique provides valuable information for clinical diagnoses, such as corneal thickness. Non-contact anterior segment optical coherence tomography (AS-OCT) is a prevalent ophthalmic imaging technique that can visualize the anterior and posterior surfaces of the cornea. Nonetheless, during the imaging process, saturation artifacts are commonly generated where the incident light is normal to the corneal surface (i.e., perpendicular to the surface tangent at that point). This stripe-shaped saturation artifact covers the corneal surface, blurring the corneal edge and reducing the accuracy of corneal segmentation. To address this problem, an inpainting method that introduces structural similarity and frequency losses is proposed to remove the saturation artifact in AS-OCT images. Specifically, the structural similarity loss reconstructs the corneal structure and restores corneal textural details. The frequency loss combines the spatial domain with the frequency domain to ensure the overall consistency of the image in both domains. Furthermore, the performance of the proposed method in corneal segmentation tasks is evaluated, and the results indicate a significant benefit for subsequent clinical analysis.


Introduction
The cornea is an important refractive medium in the human eye, participating in the transmission and processing of visual information. The corneal segmentation technique provides support for the diagnosis and treatment of corneal diseases. Accurate segmentation is crucial, as corneal segmentation errors of a few micrometers can lead to significant changes in the derived clinical parameters [1]. Through corneal segmentation, precise corneal morphology information [2] can be provided for the diagnosis and evaluation of diseases such as keratoconus [3], while also supporting preoperative preparation for procedures such as refractive surgery [1]. Anterior segment optical coherence tomography (AS-OCT) is a non-invasive imaging technique with a range of potential clinical applications [4,5], which provides high-resolution images of the anterior segment at micron-scale resolution. AS-OCT resolves the anterior and posterior surfaces of the entire cornea, as shown in Figure 1. However, during the imaging process, a strip-like saturation artifact (Figure 1a) is commonly generated, which covers the corneal surface and blurs the corneal edge. This occurs where the incident light is normal to the corneal surface, i.e., perpendicular to the surface tangent at that point. This high-intensity, high-contrast artifact negatively impacts the accuracy of corneal segmentation, making it challenging to obtain precise corneal morphology information. To improve the reliability and accuracy of corneal segmentation in various clinical applications and research studies, it is crucial to develop an inpainting method that eliminates the effects of the saturation artifact [6]. Recently, deep learning technology [7] has shown superiority in fields such as image processing [8]. AS-OCT image inpainting refers to the process of reconstructing the saturation artifact region while maintaining the overall consistency of the image. Deep generative methods [9][10][11][12][13][14] are currently dominant; owing to their powerful feature learning ability, they effectively extract meaningful semantics from the image to be repaired and recover reasonable content with high visual fidelity. Although these methods have achieved good performance on natural images, there are differences between AS-OCT images and natural data, including the structure, noise, and resolution. These differences make it difficult to apply natural-image inpainting methods directly to AS-OCT saturation artifact restoration. AS-OCT inpainting needs to recover lost details more precisely to achieve texture rationality and preserve the biological characteristics of eye tissue, such as the posterior surface of Bowman's layer (BL).
Many methods have been proposed for OCT image restoration. To avoid inaccurate quantification of the choroidal vascular system caused by retinal vascular shadows, Zhang et al. [15] proposed a three-stage image-processing framework to remove shadows in the retina, which may not be suitable for diseases with an abnormal retinal pigment epithelium (RPE) layer. Cheong et al. [16] recommended the DeshadowGAN method for removing vascular shadows in optical coherence tomography images of the optic nerve head, which performed well on healthy-eye OCT images. However, it is uncertain whether it can achieve the same performance in eyes with pathological conditions such as glaucoma. Liu et al. [17] recently proposed a dictionary-based sparse representation method for saturation artifact removal in OCT images. Although this method works very well on narrow saturation artifacts, it can be less effective on large areas of shadow. To solve this problem, Tang et al. [18] proposed a multi-scale framework for shadow repair that handles both wide and narrow retinal vascular shadows, but its performance on real shadows is not consistent with its performance on synthetic shadows. Nevertheless, these methods were designed for OCT images collected from particular OCT devices. Because the characteristics of data collected with different OCT devices differ, it is challenging to apply existing OCT repair methods directly to saturation artifact restoration in AS-OCT images. There is an urgent need for a dedicated inpainting method to remove saturation artifacts in AS-OCT images.
To address artifacts and speckle noise patterns and precisely segment the shallowest tissue interface in an AS-OCT image, Ouyang et al. [6] proposed a cascaded neural network framework. However, this framework only removes saturation artifacts and speckle noise patterns just above the shallowest tissue interface via cGAN [19], and does not completely remove the saturation artifact penetrating the center of the cornea. Recently, to address the issue of stripe artifacts damaging the visual quality of images and affecting automatic ophthalmic analysis, Bai et al. [20] proposed SC_GAN, which can successfully remove artifacts but cannot significantly improve clinical segmentation accuracy.
In this paper, we treat saturation artifact inpainting as converting the artifact image into an artifact-free image. A new generative model is proposed to inpaint the corneal structure and texture obscured by saturation artifacts while maintaining the overall consistency of the image. The main novelties and contributions are as follows:

• The dual-domain transformation capability of DualGAN [21] is leveraged to achieve AS-OCT saturation artifact inpainting by converting the artifact image into an artifact-free image. A structural similarity loss for reconstructing the structure and texture of the cornea is incorporated;
• A frequency loss that combines the spatial and frequency domains is introduced to ensure the overall consistency of the images in both domains;
• Repair experiments on both synthetic and real artifacts are devised. The results indicate that the proposed method can restore artifacts in different situations. To confirm the clinical value of saturation artifact inpainting, segmentation experiments are designed on three corneal boundaries in real artifact-inpainted images: the anterior surface of the epithelium (EP), the posterior surface of Bowman's layer (BL), and the posterior surface of the endothelium (EN). The experimental results demonstrate that the method significantly enhances the precision of corneal segmentation, proving more accurate than other repair techniques.
The remainder of the paper is organized as follows: Our approach is introduced in Section 2. Section 3 describes the experiments, specifically synthetic and real artifact restoration (Section 3.3), segmentation experiments for real restoration verification (Section 3.4), and ablation experiments (Section 3.5), followed by the conclusions in Section 4.

Proposed Method
In this work, AS-OCT image inpainting is regarded as a translation task from artifact images to artifact-free data. As shown in Figure 2, the proposed method is built on DualGAN, including two generators (G_A, G_B) and two discriminators (D_A, D_B), and introduces two loss terms: structural similarity loss and frequency loss. Given two paired images sampled from domains S and W, the forward task of the network is to learn the generator G_A: S → W that maps an image s ∈ S to w ∈ W, and the backward task is to train the generator G_B: W → S. The generated image and ground truth image are constrained using four different loss functions, L_F, L_SSIM, L_adv, and L_1, making the former more similar to the latter. L_F combines the spatial and frequency domains of the image. Moreover, the reconstruction loss, L_recon, further constrains the reconstructed image to be consistent with the ground truth. Two discriminators are used to evaluate the fit between the output of the corresponding generator and the ground truth. In this section, the generator, discriminator, and loss functions are described in detail.


Network Architecture
The network consists of two generators (G_A, G_B) and two discriminators (D_A, D_B). As shown in Figure 3, the two generators achieve mutual conversion between S-domain and W-domain images, and the two discriminators are used to distinguish generated images from real data. Generator G_A generates artifact-free images, while G_B generates artifact images with a central stripe pattern. Discriminator D_A is trained to distinguish between real samples in the W-domain and data generated from the S-domain, while D_B is trained in the opposite manner.
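The dual mapping and cycle reconstruction described above can be illustrated with toy stand-ins for the two generators (the functions below are hypothetical identity-like maps, not the actual convolutional networks):

```python
import numpy as np

# Hypothetical stand-ins for the generators: G_A translates S -> W,
# G_B translates W -> S. The cycle G_B(G_A(s)) should reconstruct s,
# which is what the reconstruction loss later enforces.
def g_a(s):
    return s + 1.0  # pretend "artifact removal"

def g_b(w):
    return w - 1.0  # pretend "artifact synthesis"

s = np.zeros((2, 2))   # toy S-domain image
w_fake = g_a(s)        # forward translation S -> W
s_recon = g_b(w_fake)  # cycle back W -> S; should match s
```

With real networks, D_A would score `w_fake` against real W-domain samples, and `s_recon` would be compared against `s`.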
Figure 2. The overall architecture of the proposed model includes two generators (G_A, G_B) and two discriminators (D_A, D_B). Image s ∈ S is input to generator G_A and image w ∈ W is input to generator G_B. The proposed structural similarity loss and frequency loss are represented as L_SSIM and L_F, respectively. L_F combines the spatial and frequency domains.


Objective Functions
Formally, image s ∈ S is input into generator G_A to derive the image G_A(s) ∈ W, and this image is processed through G_B to obtain the reconstructed image, G_B(G_A(s)) ∈ S. Similarly, image w ∈ W is passed through generator G_B to obtain the image G_B(w) ∈ S, which is input into G_A to obtain the reconstructed image, G_A(G_B(w)) ∈ W. Discriminator D_A evaluates the degree of fit between G_A(s) and the real w. Discriminator D_B estimates the measure of fit between the fake s generated with G_B and the real s.
In practice, generator G_A generates the image ŝ. In order to avoid network damage to other tissue structures of image s, the images s and ŝ and the corresponding mask (m) are integrated; thus, the inpainted result can be denoted as follows:

ŝ_inpainted = m ⊙ ŝ + (1 − m) ⊙ s,    (1)
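The mask-based compositing step above (keeping generated pixels only inside the artifact region) can be sketched in NumPy; the array shapes and values are illustrative:

```python
import numpy as np

def composite(s, s_hat, m):
    """Keep the generator output s_hat only inside the artifact mask m
    (1 = region to repair); all other pixels come from the input s."""
    return m * s_hat + (1.0 - m) * s

# toy 4x4 example: the mask covers the two middle columns
s = np.zeros((4, 4))          # original image with an artifact
s_hat = np.ones((4, 4))       # generator output
m = np.zeros((4, 4))
m[:, 1:3] = 1.0
out = composite(s, s_hat, m)  # columns 1-2 replaced, the rest unchanged
```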



where ⊙ is the pixel-wise multiplication. Equation (1) ensures that image s is only replaced by the generated image in the artifact region, while other regions remain unchanged.
SSIM Loss: Although DualGAN achieves image translation between two domains, it distorts the corneal boundary structure and texture information in AS-OCT images. Structural similarity (SSIM) [24] is a powerful tool for image quality assessment. Generally, a higher SSIM means that the image has clearer results. SSIM has been widely used in tasks such as image restoration [25], semantic segmentation [26], and dehazing [27] since it was proposed. Therefore, SSIM is adopted as a loss function to train the network to reconstruct the corneal structure, which can improve corneal segmentation accuracy. Since SSIM is pixel-based, this loss function constrains corneal structure reconstruction and helps to restore texture information. SSIM for two images x and y (x is the ground truth, y denotes the repaired image) is defined as follows:

SSIM(x, y) = ((2 µ_x µ_y + C_1)(2 σ_xy + C_2)) / ((µ_x² + µ_y² + C_1)(σ_x² + σ_y² + C_2)),    (2)

where µ and σ represent the mean and standard deviation of the images, respectively, σ_xy is their covariance, and C_1 and C_2 are constants. A higher SSIM indicates that the two images are more similar to each other, and the SSIM equals 1 for identical images. The loss function for the SSIM operates on both generators and can then be written as follows:

L_SSIM = [1 − SSIM(w, G_A(s))] + [1 − SSIM(s, G_B(w))].    (3)

Frequency Loss: In AS-OCT images, the non-tissue saturation artifact usually occupies high frequencies, which can overshadow or interfere with other frequency components. There are gaps between the real and generated images, especially in the frequency domain [28,29], as shown in Figure 5. For different images, the frequency domain distribution also varies. Figure 5b,c show the frequency domain distributions of synthesized and real saturated artifact images, respectively, which differ significantly from that of artifact-free images. Compared to Figure 5a, these images have significant changes in the horizontal direction in the spatial domain, so their frequency domain distributions have a bright line in the horizontal direction, especially in Figure 5b, and the brightness in the central low-frequency region is also different.
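The SSIM measure and its loss term can be sketched as follows; this is a simplified global version computed over the whole image (practical SSIM implementations use local sliding windows, and the constants here are illustrative):

```python
import numpy as np

def ssim_global(x, y, c1=1e-4, c2=9e-4):
    """Simplified single-window SSIM over the whole image: means,
    variances, and covariance plugged into the standard definition."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    num = (2 * mx * my + c1) * (2 * cov + c2)
    den = (mx**2 + my**2 + c1) * (vx + vy + c2)
    return num / den

def ssim_loss_term(x, y):
    """One '1 - SSIM' term; the full loss sums this term over the
    outputs of both generators."""
    return 1.0 - ssim_global(x, y)
```

For identical images the SSIM is 1 and the loss term vanishes, so minimizing the loss pushes the repaired image toward the ground truth structure.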
To minimize these disparities, a frequency loss is introduced. The main idea is to obtain the frequency domain characteristics of the real and generated images by performing a fast Fourier transform (FFT); in other words, to make Figure 5d as close as possible to Figure 5a. Previous studies [30,31] found it beneficial to replace the L_2 distance with the L_1 distance, since the former often leads to blurriness. The distance formulas for L_1 and L_2 are Equations (4) and (5), respectively:

L_1(x, y) = ‖x − y‖_1,    (4)

L_2(x, y) = ‖x − y‖_2.    (5)

Hence, the L_1 distance is adopted to measure the distance between the real image and the generated image after FFT. The spatial and frequency domains are combined to further improve the quality of restoration. The frequency loss function operates on both generators and can then be written as Equation (6):

L_F = ‖F(w) − F(G_A(s))‖_1 + ‖F(s) − F(G_B(w))‖_1,    (6)
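As a sketch, the FFT-based L_1 term can be computed with NumPy. The text does not specify whether the distance is taken on the complex spectrum or on its magnitude, so the complex form below is an assumption:

```python
import numpy as np

def frequency_l1(x, y):
    """Mean L1 distance between the 2-D FFTs of two images
    (assumption: computed on the complex spectra)."""
    return np.abs(np.fft.fft2(x) - np.fft.fft2(y)).mean()
```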
where F represents the fast Fourier transform, transforming the image from the spatial domain to the frequency domain.
L_1 Loss: we use the L_1 loss to measure the difference between the generated image and the real image, to avoid some pixels being over-smoothed, which would leave the resulting image missing detail and texture information. The L_1 loss is defined as

L_1 = ‖w − G_A(s)‖_1 + ‖s − G_B(w)‖_1.

Adversarial Loss: the adversarial loss acts between the two generator–discriminator pairs (G_A − D_A, G_B − D_B). Discriminator D_A is trained with the real w as positive samples and the generated G_A(s) as negative examples, whereas D_B takes the real s as positive and G_B(w) as negative. Generators G_A and G_B are optimized to emit 'fake' outputs that blind the corresponding discriminators, D_A and D_B.
Reconstruction Loss: the L 1 distance between the reconstructed image and the real image is adopted as the reconstruction loss, formulated as Total Losses: the whole objective function of the proposed network can be written as where λ SSI M , λ F , λ 1 , λ adv , and λ recon are the tradeoff parameters, and set 50, 1, 100, 1, and 1, respectively.
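The frequency loss and the weighted total objective described above can be sketched as follows. This is a minimal illustrative implementation in NumPy; the function names and the grayscale-image assumption are ours, not from the paper.

```python
import numpy as np

def frequency_loss(real, generated):
    """L 1 distance between the FFT spectra of the real and generated images.

    Both inputs are 2-D arrays (grayscale images); the complex spectra are
    compared element-wise, so both magnitude and phase differences count.
    """
    return np.mean(np.abs(np.fft.fft2(real) - np.fft.fft2(generated)))

def total_loss(l_ssim, l_f, l_1, l_adv, l_recon):
    """Weighted sum of the five loss terms using the paper's tradeoff
    weights (50, 1, 100, 1, and 1)."""
    return 50.0 * l_ssim + 1.0 * l_f + 100.0 * l_1 + 1.0 * l_adv + 1.0 * l_recon
```

The loss is zero only when the two spectra match exactly, which is the sense in which the generated image is pushed toward the real image in both domains.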

Experimental Setup and Results
In this section, the details of data preprocessing are presented in Section 3.1, and some parameter settings during the training process are explained in Section 3.2. Quantitative and qualitative evaluations of the synthetic artifact restoration are conducted in Section 3.3, and a visualization of real restoration is performed. Then, in Section 3.4, the real repair quality is verified through corneal segmentation experiments. Section 3.5 introduces extensive ablation studies to validate the effectiveness of various components of the model.

Data Preprocessing
The AS-OCT images used in this article are from a CASIA1 [32] ophthalmology device (Tomey Inc., Nagoya, Japan), using a swept-source OCT (SS-OCT) with a scanning speed of 30,000 A-scans per second, a wavelength of 1310 nm, and a frequency of 50 Hz, whilst preserving an original image size of 1689 × 1000. To more accurately repair saturation artifacts throughout the corneal tissue, the AS-OCT image is cropped to a size of 256 × 256 to obtain the image, I.

Since saturation artifact images do not have paired data, a binary mask (m) (with value 0 for known pixels and 1 for the area to be repaired) is manually added to the artifact-free image I, with varying degrees of inclination in the corneal tissue, to simulate saturation artifacts in the spatial domain. Figure 6 shows the process of obtaining the synthetic artifact image s = I × m, where the position and width of the mask vary. The paired data for image s are the artifact-free image w = I. A total of 3774 pairs of synthetic images were obtained; of these, 3374 were employed as the training dataset, while the remainder served as the test set. To verify the effectiveness of the network, the masks in the test set were wider than the masks added in the training set, and 80 real AS-OCT artifact images were also tested.
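A minimal sketch of this mask-based artifact synthesis is given below. The band position and width parameters are ours, and we interpret the composition (written s = I × m in the paper) as keeping the known pixels and blanking the band to be repaired; this is an illustrative reading, not the paper's implementation.

```python
import numpy as np

def synthesize_artifact(image, x0, width):
    """Simulate a stripe-shaped saturation artifact on a 2-D image.

    `mask` is 1 inside the vertical band to be repaired and 0 for known
    pixels, following the paper's stated convention; the synthetic image
    keeps the known pixels and blanks the band.
    """
    mask = np.zeros_like(image)
    mask[:, x0:x0 + width] = 1
    synthetic = image * (1 - mask)  # known pixels kept, band removed
    return synthetic, mask
```

Varying `x0` and `width` per training pair reproduces the "different position and width" property of the synthetic masks.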


Training Parameters
All training and testing are performed on a single NVIDIA GeForce RTX 3090 GPU (24 GB). Using the RMSprop optimizer, the learning rates of the generator and discriminator are 0.0005 and 0.0001, respectively, and the batch size is 4.
Synthetic Saturation Artifacts' Qualitative Comparison: the comparative experiments on synthetic artifact restoration are conducted in three different situations:

• The repair effects of different methods on corneal tissue images with different tilt degrees under the same mask conditions are shown in Figure 7;
• The inpainting results of different methods on corneal tissue images with the same inclination combined with different masks are shown in Figure 8. Briefly, Figure 8 shows the image restoration effects of three groups, adding different masks to the same AS-OCT image. Figure 8(I), (II), and (III) respectively show the repair results, with different masks added, of images with a downward tilt, no tilt, and an upward tilt of the corneal tissue;
• The results of adding different masks into corneal tissue images with different degrees of inclination using different methods are shown in Figure 9.
As shown in these renderings, the MADF method can restore non-tilted corneal tissue images, but may produce structural defects for wide shadows and tilted corneal tissue images. CTSDG, AOT-GAN, and G&L can repair the corneal structure, but they leave blurry or overly smooth textures in shaded areas, especially in wide-shadow images. The RFR approach may effectively restore background pixels, yet shows a limited capacity for the corneal structure and texture. The PICNet method has some repair effect on the mask area, but its handling of the narrow-band artifact's edge is notably weak, resulting in incomplete restoration. The DualGAN method leaves significant repair marks, which can distort the corneal structure and texture. The proposed method successfully rectifies artifacts of varying widths and positions, irrespective of the degree of inclination of the corneal tissue.


Synthetic Saturation Artifacts' Quantitative Comparison:
To quantitatively evaluate the inpainting quality, the means of the peak signal-to-noise ratio (PSNR), SSIM, and learned perceptual image patch similarity (LPIPS) [33] are calculated for the repaired regions from different techniques on the synthetic testing set. Table 1 shows the results, with the best performance values written in bold. CTSDG, AOT-GAN, and G&L add noise beyond the original image to the repair results; as the PSNR is more sensitive to noise, these methods achieve higher values in this metric. In addition, the SSIM values indicate that these methods have a positive effect on the reconstruction of corneal structures.

The RFR method is unable to repair the structure and texture of the cornea, resulting in low values across the indicators. The MADF, PICNet, and DualGAN methods have varying degrees of inpainting effect, but because the corneal structure is distorted in their repair results, they achieve low SSIM values. Although our method does not achieve optimal PSNR and SSIM values for synthetic saturated artifacts, perceptual image quality is also crucial for corneal boundary segmentation [34]; therefore, it is necessary to pay attention to the corneal tissue structure and texture information simultaneously. LPIPS better simulates human visual perception and subjective quality judgment, and for synthetic artifact restoration the proposed method achieves the best LPIPS performance.
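As a reference for how the fidelity metric above is computed, here is a minimal PSNR sketch; this is our illustrative code, not the paper's evaluation script.

```python
import numpy as np

def psnr(reference, test, max_val=255.0):
    """Peak signal-to-noise ratio (dB) between a reference and a test image."""
    mse = np.mean((np.asarray(reference, float) - np.asarray(test, float)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```

PSNR is a purely pixel-wise measure, which is why it can disagree with perceptual metrics such as LPIPS on the same restoration.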
Real Saturation Artifact Qualitative Comparison: Figure 10 shows the inpainting performance of different methods for real artifacts, and these comparative methods show repair results consistent with those for the synthetic saturation artifacts. Specifically, the MADF method can repair narrow-range artifacts, whereas repairing wide-range artifacts can lead to corneal structural protrusions. The CTSDG, AOT-GAN, and G&L methods produce blurry or overly smooth texture details, regardless of the width of the artifacts. The RFR method is only somewhat effective at repairing background pixels. The PICNet method performs well in wide-artifact restoration, but it may blur narrow-range artifacts and has weak edge-restoration capabilities. The baseline network, DualGAN, distorts the corneal structure. Our model can repair saturation artifacts with different widths and positions, both synthetic and real, reconstructing the corneal structures and repairing textural details.

Evaluation on Segmentation
To further evaluate the inpainting performance of different methods for machine analysis, experiments on a corneal segmentation task are conducted. Concretely, the widely used U-Net segmentation model was trained using labelled AS-OCT images; 31 AS-OCT images are randomly selected from the real repaired saturation-artifact results to test the segmentation effect. The inpainting and segmentation results of each method are shown in Figure 11. The visual segmentation effect shows that the proposed method maintains overall image consistency while removing the saturation artifact and reconstructing the corneal structure and textural details, owing to the constraints of the structural similarity and frequency domain losses. In addition, the performance of different inpainting methods is analyzed by calculating the Dice similarity coefficient (DSC), pixel accuracy (PA), F1-score, and Jaccard value of the segmentation results. The calculation formulas for DSC, PA, F1-score, and Jaccard are shown in Equations (11)-(14).
where X and Y represent the generated image and the ground-truth image, respectively, and X ∩ Y is the intersection of the two.
where TP is the number of true positives, TN is the number of true negatives, FN is the number of false negatives, and FP is the number of false positives.
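A minimal NumPy sketch of the four metrics, built directly from these definitions (our illustrative code; note that for binary masks the DSC and the F1-score coincide):

```python
import numpy as np

def segmentation_metrics(pred, gt):
    """DSC, pixel accuracy, F1-score, and Jaccard for binary masks."""
    pred, gt = np.asarray(pred, bool), np.asarray(gt, bool)
    tp = np.sum(pred & gt)        # true positives
    tn = np.sum(~pred & ~gt)      # true negatives
    fp = np.sum(pred & ~gt)       # false positives
    fn = np.sum(~pred & gt)       # false negatives
    dsc = 2 * tp / (2 * tp + fp + fn)
    pa = (tp + tn) / pred.size
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    jaccard = tp / (tp + fp + fn)
    return dsc, pa, f1, jaccard
```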
where X ∪ Y signals the union of the generated image and the ground truth. As shown in Table 2, our method is ahead of the other methods across all indicators, especially DSC, reaching a maximum DSC value of 0.732, which no other method achieves. The average values calculated for each indicator are provided in the table, indicating that our inpainting method significantly improves corneal segmentation.

Ablation Studies
In this section, two sets of experiments are conducted to verify the contributions of the proposed components. Here, the impacts of the structural similarity loss and the frequency loss on the repair of saturation artifacts are studied. Specifically, the structural similarity and frequency losses are removed in turn, and their impact on the experiment is observed. From the results in Figure 12, it can be seen that removing the structural loss seriously affects the texture and structural information, especially in Bowman's layer structure. Meanwhile, the segmentation indicators in Table 3 also show low segmentation accuracy due to the incomplete corneal structure. Unlike removing the structural loss, removing the frequency loss maintains almost the same visual effect as the suggested method; nevertheless, Table 3 indicates that adding the frequency loss improves the accuracy of corneal segmentation. In summary, the results reveal the roles of the structural similarity loss and the frequency loss in the repair of saturation artifacts, and also confirm the superiority of the proposed method.

Conclusions
In this study, a novel deep generative method is proposed to remove saturation artifacts generated during AS-OCT imaging. The proposed model is a bilateral network that introduces two loss terms, namely the structural similarity loss and the frequency loss. The structural similarity loss reconstructs structural and textural details while ensuring overall image consistency. In the frequency domain, the saturated artifact is located in the high-frequency component, and the frequency loss is designed to eliminate image distortion in the high-frequency region. Moreover, the frequency loss combines the spatial domain with the frequency domain to reduce the gap between the generated image and the real image. The repair comparison experiments and ablation experiments demonstrate the effectiveness of these contributions. The results show that this method can successfully remove saturation artifacts and restore the structure and texture information of the cornea. The corneal boundary segmentation experiment further verifies that the proposed method effectively improves the quality of AS-OCT images and significantly improves the accuracy of corneal segmentation.
Although this method has achieved satisfactory results, there are still some limitations to the framework. Data from a specific device were adopted to obtain these results, while the image restoration performance on other OCT devices is not determined in this study.

In summary, an inpainting method was proposed to remove the AS-OCT saturation artifact while preserving the surrounding tissue. The proposed network is of great significance for improving corneal segmentation accuracy.

Figure 1 .
Figure 1. AS-OCT can visualize the entire cornea.


Figure 2 .
Figure 2. The overall architecture of the proposed model includes two generators (G A , G B ) and two discriminators (D A , D B ). Image s ∈ S is input to generator G A and image w ∈ W is input to generator G B . The proposed structural similarity loss and frequency loss are represented as L SSI M and L F , respectively. L F combines the spatial and frequency domains.


Figure 3 .
Figure 3. The mutual conversion between S-domain and W-domain images through two generators. Two discriminators are used to discern between generated images and real data. Both generators, G A and G B , employ the U-Net [22] architecture with eight convolutional layers for the encoder and decoder, as shown in Figure 4a. The encoder consists of a convolutional layer, six Relu-Conv-LayerNorm (RCL) blocks, and one Relu-Conv (RC) block. The decoder comprises seven Relu-ConvTranspose-BatchNorm-Dropout (RCBD) blocks and one Relu-ConvTranspose-Tanh (RCT) block. The encoder and decoder are connected through skip connections.


Figure 4 .
Figure 4. The proposed structure of the generators and discriminators. The generators adopt a U-Net architecture, and the discriminators employ PatchGAN with five convolutional layers. Both discriminators, D A and D B , use the PatchGAN [23] structure, as shown in Figure 4b. The receptive field of PatchGAN is 70 × 70, and it consists of five 4 × 4 convolutional layers. Specifically, it includes a Conv-LeakyRelu (CL) block, three Conv-BatchNorm-LeakyRelu (CBL) blocks, and a Conv-Sigmoid (CS) block.
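The 70 × 70 receptive field can be checked with the standard back-to-front recurrence. The stride pattern below (stride 2 for the first three layers, stride 1 for the last two) is an assumption based on the common 70 × 70 PatchGAN configuration, since the text only states the kernel sizes:

```python
def receptive_field(layers):
    """Receptive field of a stack of conv layers, computed back to front.

    `layers` is a list of (kernel, stride) pairs ordered from input to
    output; the recurrence is r = (r - 1) * stride + kernel, starting at 1.
    """
    r = 1
    for kernel, stride in reversed(layers):
        r = (r - 1) * stride + kernel
    return r

# Assumed 70x70 PatchGAN layout: five 4x4 convs with strides 2, 2, 2, 1, 1.
patchgan_layers = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]
```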
Formally, image s ∈ S is input into generator G A to derive the image G A (s) ∈ W, and the image is processed through G B to obtain the reconstructed image, G B (G A (s)) ∈ S. Similarly, image w ∈ W is passed through generator G B to achieve the image G B (w) ∈ S, and then through G A to obtain the reconstructed image G A (G B (w)) ∈ W, giving the cycle-consistency term ||G B (G A (s)) − s|| 1 + ||G A (G B (w)) − w|| 1 .
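This cycle can be sketched as an L 1 cycle-consistency computation; the code below is illustrative NumPy with placeholder generators (the real G A and G B are the U-Net generators described above):

```python
import numpy as np

def cycle_consistency_loss(s, w, G_A, G_B):
    """L 1 cycle-consistency: G_B(G_A(s)) should recover s, and
    G_A(G_B(w)) should recover w."""
    loss_s = np.mean(np.abs(G_B(G_A(s)) - s))
    loss_w = np.mean(np.abs(G_A(G_B(w)) - w))
    return loss_s + loss_w
```

With perfect generators the round trips are exact and the loss vanishes, which is what the reconstruction term enforces during training.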

Figure 5 .
Figure 5. Feature distribution of different images in space and frequency domains: (a) artifact-free image; (b) synthetic saturation artifact image; (c) real saturation artifact image; (d) after inpainting data.


Figure 6 .
Figure 6. The process of creating the synthesized saturated artifact image, s = I × m.

Table 1 .
Quantitative comparison of synthetic artifact inpainting results using different methods. ↑ Higher is better. ↓ Lower is better. The best results are presented in bold.


Table 2 .
Quantitative comparison of corneal segmentation results in AS-OCT images using different methods. ↑ Higher is better. The best results are presented in bold.


Table 3 .
Quantitative comparison of corneal segmentation results in the ablation study. ↑ Higher is better. The best results are presented in bold.