Radon-Guided Wavelet-Domain Attention U-Net for Periodic Artifact Suppression in Brain MRI

Rios-Perez, Jesus David; Sanchez-Torres, German; Branch-Bedoya, John W.; Laiton-Bonadiez, Camilo Andres

doi:10.3390/jimaging12040153


by Jesus David Rios-Perez ¹,*, German Sanchez-Torres ², John W. Branch-Bedoya ¹ and Camilo Andres Laiton-Bonadiez ¹

¹ Departamento de Ciencias de la Computación y de la Decisión, Facultad de Minas, Universidad Nacional de Colombia, Medellín 050035, Colombia

² Facultad de Ingeniería, Grupo de Investigación y Desarrollo en Sistemas y Computación, Universidad del Magdalena, Santa Marta 470004, Colombia

* Author to whom correspondence should be addressed.

J. Imaging 2026, 12(4), 153; https://doi.org/10.3390/jimaging12040153

Submission received: 5 February 2026 / Revised: 20 March 2026 / Accepted: 23 March 2026 / Published: 2 April 2026

(This article belongs to the Section Medical Imaging)


Abstract

Periodic artifacts such as ringing (Gibbs), herringbone (spike/corduroy), and zipper patterns degrade the quality of brain MRI. We present a reproducible framework that (i) synthetically generates periodic artifacts with controllable severity directly in k-space, (ii) normalizes pattern orientation through a Radon-guided alignment step, and (iii) corrects them in the wavelet domain using a 2D DWT (AA/AD/DA/DD) with a band-weighted loss. The evaluation was conducted using DLBS T1-weighted 3T MRI volumes with synthetically generated periodic artifacts. It combined global image-quality metrics (SSIM, PSNR) with per-band metrics to quantify how correction concentrates on high-frequency components, and included ablation studies, mixed-artifact stress tests, and structural preservation analyses. Compared with several baseline architectures, the proposed approach shows improvements in structural fidelity and a reduction in periodic patterns (SSIM: 0.985 ± 0.022; PSNR: 43.337 ± 5.364; reduction in concentrated error in high-frequency bands), while preserving unaffected structures. These findings indicate that, within a controlled synthetic benchmark, aligning the pattern orientation prior to learning and optimizing correction in the wavelet domain enables suppression of synthetically generated periodic artifacts while limiting over-smoothing.

Keywords:

MRI; artifact correction; self-attention; wavelet; U-Net

1. Introduction

Magnetic resonance imaging (MRI) is a widely used non-invasive tool in clinical diagnosis and biomedical research, owing to its high soft-tissue contrast and its versatility in anatomical, functional, and metabolic applications. Millions of MRI scans are performed worldwide each year. However, the increasing demand for higher spatial and temporal resolution also increases exposure to artifacts that degrade anatomical and quantitative fidelity, which may affect early diagnosis and follow-up accuracy. These artifacts fall into three main categories: (1) high noise and limitations in spatial resolution; (2) geometric distortions caused by magnetic field heterogeneities and patient motion; and (3) susceptibility artifacts associated with fast gradient echo (FGE) sequences or the presence of metallic structures [1,2].

In multi-shot diffusion-weighted imaging (DWI), for example, lengthening the acquisition time to optimize the signal-to-noise ratio (SNR) increases vulnerability to distortions and motion shifts [3]. Similarly, in whole-body positron emission tomography (PET)/MRI studies, metallic implants and truncation of the field of view cause areas of signal loss that hinder attenuation correction and tracer uptake quantification [4]. Furthermore, echo planar imaging (EPI) sequences used in functional neuroimaging show more pronounced spatial aberrations at ≥7 T [2]. In oncology applications and cardiac balanced steady-state free precession (bSSFP) cine imaging, noise, distortions, banding, and flow artifacts hinder the detection of small lesions and morphological and functional analysis [5]. Given this complex interaction of protocols and artifacts, artificial intelligence (AI)-based post-processing strategies are required to restore image quality and ensure the reproducibility of quantitative measurements in diverse clinical scenarios.

In recent years, deep learning has become a widely adopted strategy for correcting these artifacts in MRI. Deformable convolutional networks (DCNs) have been reported to be effective in correcting geometric distortions in neuroimaging while also extracting features relevant for tumor classification [6]. In multi-shot DWI, denoising schemes trained on single-shot data have allowed acquisitions to be accelerated up to 4×, maintaining fidelity at high b-values, and improving the detectability of rectal lesions [3]. For PET/MRI, deep completion methods estimate signal-loss regions caused by metal artifacts or truncation, reducing the volumetric error in attenuation correction from 9.8% to less than 1% in the head and torso [4].

In cardiac bSSFP cine imaging, dual-encoder architectures have jointly suppressed banding and flow artifacts, outperforming conventional averaging [5], while the TS-Net network combines inverted phase encodings (PEs) with anatomical images to correct distortions in all three directions without requiring additional data during inference [2]. Despite these quantitative improvements, most proposals are evaluated in specific domains using particular protocols, lack direct comparisons with traditional techniques, and offer limited validation for explainability, generalizability, and standardized clinical metrics [1]. This fragmentation makes it difficult to identify the most robust approaches and to implement them in heterogeneous environments.

In this work, we propose a reproducible synthetic framework for generating periodic artifacts (ringing, herringbone, and zipper) in reference structural images, with k-space severity control. The protocols parameterize smoothing, axis/direction, kernel size, distance, and amplitude, among other parameters, with calibrated ranges to obtain controlled variations in artifact severity. This design enables the study of how correction methods behave under increasingly severe regimes.

At the learning stage, we introduce Radon-guided orientation normalization. We estimate the dominant pattern angle, rotate both the noisy image and the ground truth (GT) by that angle before they enter the network, and reverse the rotation during reconstruction, thereby reducing the geometric variability that hinders training. The core model, WaveletBasedAttention-Net, uses a 2D discrete wavelet transform (DWT) to decompose each slice into four bands [Approximation + Approximation (AA), Approximation + Detail (AD), Detail + Approximation (DA), and Detail + Detail (DD)]. The resulting four-channel tensor is processed by a U-Net architecture with four-channel input and output, employing attention-based skip connections that enhance feature selection and spatial representation. The corrected bands are then recombined into an image using an inverse discrete wavelet transform (IDWT). The WaveletLoss function weights the error per band to selectively suppress the pattern, and we record the structural similarity index measure (SSIM) during training and inference as a global structural metric.
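As an illustration of the orientation-estimation idea described above, the following is a minimal NumPy sketch, not the authors' implementation. It approximates a Radon projection by binning pixels according to their signed distance along a candidate direction and picks the angle whose projection profile has the largest variance. The function names, the 1° angular grid, the variance criterion, and the image-coordinate convention (x = column, y = row) are all assumptions made for this example.

```python
import numpy as np

def radon_projection(img, theta_deg):
    """Approximate Radon projection at angle theta (degrees).

    Pixels are binned by t = x*cos(theta) + y*sin(theta), measured from the
    image center, and averaged per bin (nearest-neighbor binning).
    """
    h, w = img.shape
    y, x = np.mgrid[0:h, 0:w]
    xc = x - (w - 1) / 2.0
    yc = y - (h - 1) / 2.0
    theta = np.deg2rad(theta_deg)
    t = xc * np.cos(theta) + yc * np.sin(theta)

    # Keep only the inscribed disk so every angle sees the same support.
    radius = min(h, w) / 2.0
    mask = xc ** 2 + yc ** 2 <= radius ** 2
    bins = np.round(t[mask] + radius).astype(int)
    n_bins = int(2 * radius) + 2
    sums = np.bincount(bins, weights=img[mask], minlength=n_bins)
    counts = np.bincount(bins, minlength=n_bins)

    # Discard short chords near the disk edge, which average few pixels.
    valid = counts >= 0.5 * counts.max()
    return sums[valid] / counts[valid]

def estimate_dominant_angle(img, angles_deg=None):
    """Return the angle (degrees) whose projection profile varies the most.

    For a stripe pattern this is the direction along which intensity
    oscillates (the normal to the stripes). Assumed criterion: variance.
    """
    if angles_deg is None:
        angles_deg = np.arange(0.0, 180.0, 1.0)
    centered = img - img.mean()
    scores = [np.var(radon_projection(centered, a)) for a in angles_deg]
    return float(angles_deg[int(np.argmax(scores))])
```

In a full pipeline, the estimated angle would then be used to rotate the noisy and GT slices (for instance with an interpolating rotation routine) and to undo the rotation after correction; that step is omitted here to keep the sketch dependency-free.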

In Section 4, we evaluate the model on a controlled synthetic dataset using global and wavelet-domain metrics, along with partitioning analyses, ablation studies, structural preservation assessment, and mixed-artifact stress tests. Regarding the data partitioning analysis, while histograms and PCA projections indicate substantial overlap between the training and validation subsets, formal two-sample tests detect subtle but statistically significant differences. Therefore, the experimental setting is best understood as a controlled synthetic evaluation rather than a statistically equivalent sampling regime.

2. Previous Works

2.1. Sampling/Aliasing/Truncation-Related Artifacts

This category encompasses the effects that arise when the signal is discretized in space and frequency. These include aliasing due to undersampling, Gibbs ringing caused by truncation in k-space, and wrap-around, where a signal from outside the field of view is folded onto the image. To mitigate these effects, accelerated sampling techniques (parallel imaging, compressed sensing, and simultaneous multislice sampling) are now combined with transform-domain methods and spatiotemporal regularization. Furthermore, approaches that model the acquisition physics and integrate neural networks (in k-space, in the image domain, or in both domains) are gaining ground, thereby restoring the signal and suppressing artifacts without always relying on fully sampled references.

For noise suppression and defocusing, the goal is to eliminate interference and recover fine detail in dynamic sequences. In the k-t domain, models such as the DENSE Artifact Suppression Network (DAS-Net) separate echoes in dense cine, and deep residual networks preserve edges in accelerated multi-shot diffusion [1,2]. In more “real-world” scenarios, vendor-agnostic convolutional neural network (CNN) filtering applied to DICOM images increased SSIM while saving time. In addition, a 2D U-Net trained with simulated degradations (Gaussian noise, blur, and motion) clearly outperformed classical methods in denoising and deblurring [3,4].

In arterial spin labeling (ASL) perfusion, deep autoencoders improve SNR and reduce kinetic error, supporting their use for quantitative maps such as cerebral blood flow (CBF) [5]. In parallel, zero-shot in situ schemes (e.g., zero-shot medical image artifact reduction (ZSAR)/one-shot medical image artifact reduction (OSAR)) reduce dependence on ground truth, although they degrade more at low SNR, in 3D settings, and when manual regions of interest (ROI) are required [6,7]. For mixed artifacts, dual-domain architectures such as the feature distillation block GAN (FDB-GAN), which pairs a high-frequency k-space branch with a spatial branch, achieve net advantages, albeit with greater complexity, limited 3D validation, and limited low-SNR evaluation [8].

In clinical and commercial settings, deep learning reconstruction (DLR) in 3D T2W-FS increases SNR and decreases ringing without prolonging acquisition, improving the area under the curve (AUC) of findings (e.g., epidural fluid) [9]. In hybrid super-resolution, orthogonal stacks plus a 5-layer artifact reduction convolutional neural network (ARCNN) achieve competitive PSNR and SSIM, albeit with high computational costs and limited pathological validation [10]. For out-of-site generalization, a residual network (ResNet) with perceptual loss improves PSNR, SSIM, and normalized root mean square error (NRMSE) without retraining but may attenuate subtle hyperintensities; further validation in 3D and other pathologies is suggested [11]. In Rician noise, patched residual frames recover fine texture, and residual noise learning approaches show improvements in PSNR and SSIM, although sometimes with phantom-based evidence and without 3D clinical validation [12,13]. In specific contexts, AIR™ Recon Deep Learning (DL) for prostate imaging (residual encoder trained with “near-perfect” vs. conventional pairs) obtained a higher frequency of “excellent” ratings and a reduction of artifacts, improving anatomical visualization. However, objective metrics and slice-by-slice correlation with pathology were not reported [14]. In cardiac late gadolinium enhancement (LGE), a deep learning reconstruction (DLR) prototype increased SNR (up to ~3× in phantom and ~1.8–3× in clinical) and sharpness, but high levels of noise reduction altered quantitative measurements with conventional thresholds; the full width at half maximum (FWHM) method proved more stable, underscoring the need for systematic evaluations of quantification [15].

When k-space lines are missing, aliasing occurs. To strengthen generalized auto-calibrating partially parallel acquisitions (GRAPPA) without external data, scan-specific artifact reduction in k-space (SPARK) learns to correct the error in the auto-calibration signal (ACS) region; the alias artifact suppression network (ALIASNET) reduces parameters by combining 1D regularizers with 2D convolutions [16,17]. In real-time cine imaging without ground truth, a CNN is pre-trained with synthetic profiles and subsequently refined with data consistency [18]. Several studies reveal that 2D spatiotemporal U-Nets match 3D approaches with less data (supported by persistent homology), and that implicit approaches such as neural implicit k-space (NIK), sinusoidal multi-layer perceptron (MLP), or multiresolution deformable convolutions improve end-to-end quality [19,20,21]. The k-space/image jointly unrolled cross-domain optimization-based spatiotemporal reconstruction network (JUST-Net) improves myelin maps and mitigates motion, and recurrent frameworks, or frameworks such as Flow Reconstruction and Segmentation for Low-Latency Cardiac Output (FReSCO), allow inference in <1 s, which is useful for real-time applications [22,23,24]. In simultaneous multislice (SMS) myocardial perfusion, both signal intensity informed multi-coil (SIIM) and edge-guided cascades enforce data/space consistencies to stabilize the reconstruction [25,26].

In dual-domain architectures, multi-domain convolutional neural network (MD-CNN) and DuDoRNet+ combine 3D subnets, 2D U-Nets, and interpolation; in plug-and-play magnetic resonance fingerprinting (MRF), the alternating direction method of multipliers (ADMM) integrates a pre-trained denoiser without retraining it; and Regularization by Artifact-REmoval (RARE) extends Regularization by Denoising (RED) with Artifact2Artifact pairs for 4D free-breathing [27,28,29,30]. Other frameworks explore adversarial and transformer-based strategies, including the Hierarchical Perception Adversarial Learning Framework (HP-ALF), a volumetric radial generative adversarial network (GAN), and accelerated MRI reconstruction using a recurrent transformer (ReconFormer). Additional approaches include structured low-rank methods and task-specific accelerations such as spiral phase-contrast cardiac magnetic resonance imaging (PCMR), with a reported time reduction from ~59 s to 3.9 s [31,32,33,34,35].

In musculoskeletal imaging, Controlled Aliasing in Parallel Imaging Results in Higher Acceleration (CAIPIRINHA) combined with compressed sensing (CS) accelerates 3D Turbo Spin-Echo (TSE) (4–8×), although prospective validations are needed. In prostate multiparametric magnetic resonance imaging (mpMRI)—Prostate Imaging-Reporting and Data System (PI-RADS)—synthetic data and networks without full reference are combined. Variants such as dual GAN-U-Net (DLGAN) or a Hybrid Image-Wavelet Domain Network (HIWDNet), together with modules such as the Cross-scale Dense Feature Fusion Module (CDFFM), Region Adaptive Artifact Removal Module (RAARM), and Wavelet Sub-band Reconstruction Module (WSRM), are also emerging, showing good metrics but with stability and latency compromises [36,37,38,39].

Uncertainty-guided progressive GANs have also been proposed; these refine regions with high uncertainty and improve interpretability with limited data, albeit with greater complexity [40].

2.2. Motion Artifacts

Motion artifacts in MRI (blurring, ghosting, or phase inconsistencies) arise from voluntary or involuntary patient movement. Two main families of methods can be used to mitigate them: (i) prospective corrections, which measure movement (e.g., with navigators or sensors) and adjust the acquisition sequence in real time; and (ii) retrospective corrections, which act after acquisition (in k-space or image) using registration, transform inversion, and constraints such as compressed sensing. More recently, deep learning approaches, both supervised and unsupervised and built on CNNs, GANs, diffusion models, and transformers, learn to “translate” artifact-corrupted images into clean images by combining spatial and frequency information.

Non-rigid motion and artifact representation. Non-rigid motion requires models that adapt to local, anatomy-dependent deformations [41]. In this context, deformable convolutions adaptively shift convolutional kernels (through offsets $\Delta p_{n}$) to follow the underlying geometry and are combined with deformable max pooling and a final Support Vector Machine (SVM) for tumor detection [41]. In parallel, disentangled learning approaches separate structural content from artifact components to better align with Spatial Transformer Networks (STNs) and cross-stitch in nnUNet, optimizing mean absolute error (MAE) and multi-scale structural similarity (MS-SSIM) [42].

Respiratory motion in the abdomen: from filters to U-Nets. In liver MRI, post-processing filters such as Motion Artifact Reduction with Convolutional Neural Network (MARC), shallow CNN, and U-Nets enhanced with high-pass filtering reduce respiratory artifacts and improve SSIM and phase-insensitive contrast [43,44]. In more complex scenarios, generative methods have also been used. In fetal MRI, residual GANs with Squeeze-and-Excitation combine adversarial, L1, and perceptual losses, including the Visual Geometry Group (VGG) perceptual loss [45]. Conditioned diffusion models achieve high SSIM in simulated cardiac cine imaging, while Variance Exploding Stochastic Differential Equation (VE-SDE) models formulate the process in terms of forward and inverse stochastic differential equations (SDEs) with iterative steps in k-space (powerful but computationally costly and with the risk of retaining low-frequency artifacts) [46,47]. Recurrent GANs with multi-scale Convolutional Long-Short Term Memory (ConvLSTM) also improve PSNR and SSIM in temporal interpolation in cine imaging [48], while classic supervised schemes such as U-Net, variational autoencoder (VAE), and GAN remain effective when adequate training references are available [49].

When clean–corrupt pairs are unavailable, unpaired sampling with bootstrap aggregation (averaging several reconstructions) helps discard outliers and outperforms purely supervised alternatives in mono- and multi-coil TSM acquisitions [50]. Furthermore, integrated frameworks that detect, correct, and segment artifacts within a single processing pipeline—such as CNN/Convolutional Recurrent Neural Network (CRNN)/U-Net architectures—show improvements in accuracy and resolution in population datasets such as the UK Biobank [51,52]. A recent review (2018–present) covers approaches ranging from classical coding methods to deep learning, reporting high accuracies (∼97%) and indicating the need for multicenter validation and real-time 3D/4D optimization [53].

Rigid motion. An active line strategy simulates rigid artifacts by replacing k-space lines using temporal masks and trains models to reverse them. For example, Motion Artifact Reduction using a Conditional Diffusion Probabilistic Model (MAR-CDPM) matches or outperforms U-Net, CycleGAN, and Pix2Pix in silico and in multicenter environments [54]. Motion Artifact Unsupervised Disentanglement Generative Adversarial Network (MAUDGAN) achieves competitive results without paired data or sequence modifications, combining adversarial, reconstruction, and artifact losses [55]. A hybrid Deep AutoEncoder–Convolutional Neural Network (DAE-CNN) first classifies ceT1 volumes as clean or artifacted and then removes artifacts using optimized guided bilateral filtering, achieving high PSNR and classification accuracy; however, 3D validation and detailed analysis in pathological regions are still lacking [56].

2.3. Off-Resonance and Susceptibility (B0)

This category encompasses effects that arise when the main magnetic field is not perfectly homogeneous. Typical causes include incomplete shimming, differences in susceptibility between tissues, or the presence of metal. The resulting artifacts are typically observed as geometric distortions, signal mismatches, and off-resonance frequency shifts [57]. To address these effects, the literature generally combines three lines of research: (i) acquisition strategies (e.g., Slice Encoding for Metal Artifact Correction (SEMAC), View-Angle Tilting (VAT), Multi-Acquisition with Variable Resonance Image Combination (MAVRIC), EPI with reverse polarities, Dixon, and bSSFP with phase-cycling); (ii) physical correction and inversion algorithms; and (iii) deep learning approaches, both supervised and unsupervised, that learn to suppress or compensate for these artifacts [57].

Metal Artifact Reduction Sequence MRI (MARS-MRI): areas of attenuated or absent signal can appear around metallic implants, complicating, for example, the generation of attenuation maps in PET/MRI and potentially inducing significant quantitative biases. One representative approach uses dilated convolutional networks with residual connections to “fill in” the truncated regions by simulating metallic voids in healthy data and optimizing a reconstruction loss based on root mean squared error (RMSE). In tests, artifact volume was substantially reduced, and quantitative biases in the head and thorax were markedly decreased; however, challenges remain, including multicenter validation, data expansion, and coverage of complex implants [58].

In parallel, hardware improvements (e.g., dense radiofrequency (RF) coils, parallel imaging, and gradient defect correction) and dedicated sequences (VAT, SEMAC)—often combined with iterative reconstruction methods or neural networks—help mitigate signal loss, distortion, and frequency shifts. Careful parameter balancing is essential. Increasing bandwidth reduces in-plane distortion but also decreases SNR, while shortening the spin-echo time (TE) refocuses intravoxel dephasing at the cost of a higher specific absorption rate (SAR). In fat suppression, Dixon and Short-Tau Inversion Recovery (STIR) exhibit different sensitivities to B0/B1 inhomogeneities, so the protocol must be tailored to the implant, clinical objective, and time/SAR constraints [59].

Unsupervised learning and multimodal approaches. Unsupervised methods have been proposed to correct off-resonance distortions from dual-polarity EPI acquisitions by estimating a direct field map from both polarities without requiring a distortion-free reference. These approaches show reduced overfitting and good performance under low SNR conditions, although they increase acquisition time (e.g., by requiring both polarities), and further testing is needed across a broader range of implants. In parallel, multimodal architectures for computed tomography (CT)/MRI artifact reduction have been explored that introduce similarity terms between modalities within composite loss functions, with quantitative improvements in CT noise reduction and segmentation propagation in MRI. However, the gains depend on the anatomical region evaluated and the size of the test set, and challenges related to multicenter generalization remain [57,60].

EPI distortion correction. The goal is to estimate a displacement map that reverses geometric deformations and intensity distortions. Established methods, such as TOPUP, estimate the field using regularization, while U-Net-type networks—with losses enforcing cycle consistency, smoothness, or anatomy-based constraints—have accelerated the process and increased structural fidelity [3,61,62].

DLRPG-net restricts the ΔB0 map to a smooth subspace (splines) and simultaneously corrects geometry and intensity, yielding PSNR and SSIM improvements at 3 T and 7 T with very low inference times; however, it requires UP/DOWN pairs, and its robustness to motion and varied protocols requires further study [63].

PreQual synthesizes a “distortion-free” b0 from T1 and a “real” b0 using a 3D GAN. This facilitates subsequent correction with TOPUP but can alter derived metrics (e.g., fractional anisotropy (FA) in tract-based spatial statistics (TBSS)) if applied without reverse-polarity data [62].

S-Net estimates a displacement field from two EPIs with inverted PEs and integrates a differentiable spatial transformer unit, achieving accuracy comparable to TOPUP with substantial computational speedups [64].

TS-Net extends this approach to 3D and incorporates a T1-based anatomical term during training. It outperforms TOPUP and related methods across several datasets (fMRI/DWI, 3T, and 7T) with sub-second graphics processing unit (GPU) inference. Its adoption, however, depends on substantial training, careful hyperparameter selection, and the availability of inverse PE pairs [65].

2.4. Ghosting, Phase Errors, and System Effects

This category encompasses artifacts caused by phase errors, interference, and effects inherent to the magnetic resonance imaging system. It includes ghosting associated with periodic motion or flow, gradient, and eddy current failures, as well as RF contamination—for example, zipper or herringbone patterns. To address these effects, the literature combines several strategies: (i) gradient calibration and flow-sensitive acquisition protocols; (ii) projections into interference-null subspaces; (iii) iterative reconstructions with constraints; and (iv) k-space filtering. In parallel, deep learning is increasingly used to detect and suppress periodic artifacts by learning feature representations that capture their underlying structure.

In the case of RF interference and “spikes” (zipper/herringbone), which appear as bands or specks due to unwanted RF emissions or signal nonlinearities, classical filtering methods may fall short in emerging systems, such as Radiowave Amplification by Stimulated Emission of Radiation (RASER) MRI, where nonlinear behavior dominates. For this scenario, a two-stage deep learning pipeline has been proposed: first, a convolutional network corrects 1D sinograms, and then a U-Net refines the 2D reconstruction. Training uses synthetic data generated from controlled variations in the theoretical RASER model. These simulations generate responses in different modes and apply transforms (Fourier and Radon) to create distorted images, employing domain randomization to improve robustness. The resulting images are practically independent of the degree of nonlinearity. However, the method critically depends on the validity of the physical model and the number of modes considered, whose computational cost increases sharply, thus limiting its extrapolation to higher resolutions [66].

A hybrid approach based on CNNs and Deep Belief Networks (DBNs) has been applied to reduce metallic artifacts in brain MRI. The system extracts features using the Gray Level Co-occurrence Matrix (GLCM), performs segmentation and morphological operations, and classifies images as normal or tumor-bearing. After artifact removal—guided by improvements in SNR and energy metrics—classification accuracy increases from 92.12% to 95.77%. The DBN phase relies on its characteristic energy function to adjust weights and latent representations. This framework demonstrates clinical viability for tumor diagnosis, but its generalizability may be limited by the need for annotated data, specific segmentations, and the absence of an explicit physical model of the artifacts, making transfer to other types of interference difficult [61]. Further validation with experimental RASER data is needed, along with exploration of hybrid architectures that integrate physical models and optimization of computational efficiency to scale to higher resolutions.

Gradient nonlinearities and eddy currents introduce spatial-frequency deviations that result in geometric distortions and ray-like artifacts. Solutions include gradient response calibration, effective field modeling using direct measurements, and gradient matrix-based corrections, along with image interference suppression or k-space techniques [7]. In this context, ACC improves SNR by combining coils with weights derived from the main component that maximizes SNR, while Beamforming-based STreak Artifact Reduction (B-STAR) maximizes the signal-to-interference ratio using a global interference correlation matrix [63]. More recently, Cancellation of streAk artifaCts using the inTerference nUll Space (CACTUS) identifies a low-dimensional interference subspace from the spectral decomposition of the interference correlation matrix and projects the coil data onto the orthogonal complement of this subspace. This approach, which is invariant in k-space and compatible with iterative reconstructions, enhances the cancellation of artifacts caused by gradient nonlinearity without amplifying noise. Practical questions remain, such as its robustness to different levels of subsampling and the optimal choice of the dimension of this subspace [67].

Flow- or pulsatility-induced ghosting, common in Epithelial Inflammatory Disease (EID) and especially relevant in the left hepatic lobe, has been mitigated using acquisition strategies (such as gradients that cancel the first moment) and with classical post-processing methods (outlier exclusion, weighted averaging, or percentile-based approaches). These techniques usually improve lesion–background contrast, although they sometimes attenuate vascular darkness. As an alternative, a feature-driven U-Net has been proposed that is not trained against a gold-standard image, but instead optimizes four normalized scores: pulsation artifacts, vascular darkness, contrast-to-noise ratio (CNR), and data consistency. The goal is to combine the average of these metrics with additional constraints. These terms stabilize variance in the reconstruction, penalize residual pulsation, and preserve image fidelity. Compared with the best conventional strategy, this approach improves image quality, increases CNR, and reduces artifacts without sacrificing vascular darkness [68].

Additionally, cerebrospinal fluid flow artifacts in cervical MRI were reduced using a CycleGAN model trained with T2 TSE (artifact-free reference) and T2 Fast-Field Echo (FFE) images [69].

Table 1 provides a consolidated overview of recent deep learning-based strategies for artifact correction in medical imaging. For each major artifact family, we summarize the typical acquisition- and system-related causes, the predominant methodological directions proposed in the literature, and the main limitations that currently constrain robustness, generalization, and clinical deployment.

Although previous studies have reported effective MRI artifact correction, as in our proposal, most have focused on broader reconstruction problems or non-periodic degradations. Many also rely on dual-domain designs, acquisitions with field maps or reversed polarity, or other strategies that increase methodological and computational complexity. In contrast, our work specifically addresses periodic artifacts by combining two strategies: Radon transform-guided orientation normalization before training, which reduces pattern variability, and wavelet-domain correction using DWT sub-bands with a band-weighted loss that emphasizes the high-frequency components, where these artifacts are most prominent.
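To make the band-weighted wavelet idea concrete, the sketch below shows a single-level 2D DWT into four sub-bands, its inverse, and a loss that weights the per-band error. It is an illustration under stated assumptions rather than the paper's WaveletLoss: the Haar wavelet, the axis convention behind the AD/DA labels, the mean-squared-error form, and the weights (1, 2, 2, 4) are all hypothetical choices, since the text here does not specify them.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def haar_dwt2(x):
    """Single-level 2D Haar DWT of an even-sized 2D array.

    Returns a (4, H/2, W/2) array ordered as AA, AD, DA, DD, where the
    first letter refers to the row axis and the second to the column axis
    (an assumed labeling convention).
    """
    a = (x[0::2, :] + x[1::2, :]) / SQRT2   # approximation along rows
    d = (x[0::2, :] - x[1::2, :]) / SQRT2   # detail along rows
    aa = (a[:, 0::2] + a[:, 1::2]) / SQRT2
    ad = (a[:, 0::2] - a[:, 1::2]) / SQRT2
    da = (d[:, 0::2] + d[:, 1::2]) / SQRT2
    dd = (d[:, 0::2] - d[:, 1::2]) / SQRT2
    return np.stack([aa, ad, da, dd])

def haar_idwt2(bands):
    """Inverse of haar_dwt2: rebuild the image from the four sub-bands."""
    aa, ad, da, dd = bands
    h, w = aa.shape
    a = np.empty((h, 2 * w))
    d = np.empty((h, 2 * w))
    a[:, 0::2] = (aa + ad) / SQRT2
    a[:, 1::2] = (aa - ad) / SQRT2
    d[:, 0::2] = (da + dd) / SQRT2
    d[:, 1::2] = (da - dd) / SQRT2
    x = np.empty((2 * h, 2 * w))
    x[0::2, :] = (a + d) / SQRT2
    x[1::2, :] = (a - d) / SQRT2
    return x

def band_weighted_loss(pred, target, weights=(1.0, 2.0, 2.0, 4.0)):
    """Weighted sum of per-band mean squared errors.

    The default weights are illustrative only; they emphasize the detail
    bands, where periodic artifacts are expected to concentrate.
    """
    per_band_mse = ((pred - target) ** 2).mean(axis=(1, 2))
    return float(np.dot(np.asarray(weights), per_band_mse))
```

Because the Haar transform used here is orthonormal, an error of a given size costs more when it falls in a heavily weighted detail band than in the approximation band, which is the mechanism a band-weighted loss relies on.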

3. Materials and Methods

3.1. Dataset

In this study, we used structural magnetic resonance imaging data from the Dallas Lifespan Brain Study (DLBS) [70]. The dataset comprises 967 high-resolution T1-weighted volumes with isotropic 1 mm voxels.

MRI acquisitions were performed on a Philips Achieva 3T scanner (Philips Healthcare, Best, The Netherlands) equipped with an 8-channel head coil. A 3D Magnetization-Prepared Rapid Acquisition with Gradient Echo (MPRAGE) sequence was used (TR = 8.1 ms, TE = 3.7 ms, flip angle = 12°, Field of View (FOV) = 204 × 256 mm², and 160 slices), resulting in volumes of 160 × 256 × 256 voxels. The cohort comprised approximately 500 healthy participants aged between 20 and 90 years.

3.2. Artifact Generation

This section summarizes the origin of these artifacts, their image manifestations, and, when relevant, their k-space signatures. It also describes the reproducible protocols for generating ringing, herringbone, and zipper artifacts.

MRI is based on nuclear magnetic resonance [71], and due to the complexity of MRI acquisition, several types of artifacts may arise. In this work, we group them according to their origin: (i) acquisition, sampling, and reconstruction effects [72,73]; (ii) patient or physiological motion [74,75,76,77,78]; (iii) B_{0} inhomogeneity and susceptibility effects, which are frequent in the presence of metallic implants [74,79]; and (iv) hardware or RF interference, which can produce zipper and herringbone patterns [79,80].

3.2.1. Ringing

The ringing artifact manifests as edge oscillations (light and dark bands) around high-contrast transitions. It is mainly caused by the truncation of high-frequency components in k-space (finite sampling) or by windowing during reconstruction [81].

Reproducible generation model

To generate ringing artifacts in a reproducible manner, a brain MRI volume

f (x, y, z)

is first transformed into k-space by applying a 3D discrete Fourier transform (Equation (1)):

F (u, v, w) = \sum_{z = 0}^{N_{z} - 1} \sum_{y = 0}^{N_{y} - 1} \sum_{x = 0}^{N_{x} - 1} f (x, y, z) e^{- j 2 π (\frac{u x}{N_{x}} + \frac{v y}{N_{y}} + \frac{w z}{N_{z}})}, \begin{matrix} u = 0, \dots, N_{x} - 1 \\ v = 0, \dots, N_{y} - 1 \\ w = 0, \dots, N_{z} - 1 \end{matrix}

(1)

After obtaining the k-space representation in (1), the artifact is introduced slice-wise by modifying the 2D Fourier domain of each slice. Specifically, for each 2D slice

f_{z} (x, y)

, its 2D spectrum is computed as

F_{z} (u, v) = F_{2 D} \{f_{z} (x, y)\}

. The scaling factor α is then defined as:

α = (\frac{1}{2} \sin (π {(1 - \frac{r_{i}}{r_{e}})}^{0.55} - \frac{π}{2}) + \frac{1}{2}) U, \quad U \sim \mathcal{U} (0, 1)

(2)

Then,

α

is applied only to a sector of the k-space defined by an angular interval

[θ_{1}, θ_{2}]

and a radial band

[r_{i}, r_{e}]

, as defined by:

\tilde{F}_{z} (u, v) = \begin{cases} α F_{z} (u, v), & \text{if } θ_{1} \leq θ (u, v) \leq θ_{2} \text{ and } r_{i} \leq r (u, v) \leq r_{e} \\ F_{z} (u, v), & \text{otherwise} \end{cases}

(3)

where

r (u, v)

and

θ (u, v)

are computed with respect to the k-space center

(u_{0}, v_{0})

:

r (u, v) = \sqrt{{(u - u_{0})}^{2} + {(v - v_{0})}^{2}}, \quad θ (u, v) = \operatorname{atan2} (v - v_{0}, u - u_{0})

(4)

As a result, two symmetric angular sectors of k-space are modified, producing oscillatory patterns in the spatial domain that characterize the ringing effect. Finally, the corrupted volume is obtained by applying the inverse 3D Fourier transform:

f (x, y, z) = \frac{1}{N_{x} N_{y} N_{z}} \sum_{w = 0}^{N_{z} - 1} \sum_{v = 0}^{N_{y} - 1} \sum_{u = 0}^{N_{x} - 1} F (u, v, w) e^{+ j 2 π (\frac{ux}{N_{x}} + \frac{vy}{N_{y}} + \frac{wz}{N_{z}})}

(5)

Control Parameters

To control the severity of the artifact, the following parameter ranges were selected empirically (Table 2). Intensity

α \in [0, 1]

, propagation axis in

\{0, 1, 2\},

end angle

θ_{2} \in [0, 360]

, initial angle

θ_{1} \in [θ_{2} - 100, θ_{2} - 10]

, outer radius

r_{e} \in [118, 123]

, and inner radius

r_{i} \in [r_{e} - 20, r_{e} - 10]

.
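The slice-wise procedure in Equations (2)–(4) can be sketched with NumPy as follows. This is a minimal illustration, not the authors' implementation: the function names `ringing_alpha` and `add_ringing` are hypothetical, the spectrum is centred with `fftshift`, and the point-symmetric sector is modified explicitly (an assumption made here so that the output stays real, in line with the "two symmetric angular sectors" described above).

```python
import numpy as np

def ringing_alpha(r_i, r_e, rng):
    """Equation (2): random scale factor for the selected k-space sector."""
    u = rng.uniform(0.0, 1.0)
    shape = 0.5 * np.sin(np.pi * (1.0 - r_i / r_e) ** 0.55 - np.pi / 2.0) + 0.5
    return shape * u

def add_ringing(slice_2d, theta1_deg, theta2_deg, r_i, r_e, alpha):
    """Equations (3)-(4): scale an annular sector of the centred 2D spectrum."""
    ny, nx = slice_2d.shape
    F = np.fft.fftshift(np.fft.fft2(slice_2d))
    v, u = np.mgrid[0:ny, 0:nx]
    du, dv = u - nx // 2, v - ny // 2
    r = np.hypot(du, dv)
    theta = np.degrees(np.arctan2(dv, du))
    width = theta2_deg - theta1_deg
    # Angular test that tolerates wrap-around (theta_1 may be negative).
    sector = ((theta - theta1_deg) % 360.0) <= width
    # Assumption: the sector rotated by 180 degrees is modified as well,
    # which keeps the spectrum (approximately) Hermitian.
    mirrored = ((theta - theta1_deg - 180.0) % 360.0) <= width
    mask = (sector | mirrored) & (r >= r_i) & (r <= r_e)
    F_mod = np.where(mask, alpha * F, F)
    return np.real(np.fft.ifft2(np.fft.ifftshift(F_mod)))
```

With α = 1 the slice is returned unchanged, and smaller values of α attenuate the selected high-frequency band more strongly.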

3.2.2. Herringbone

This artifact manifests as oblique light-and-dark bands that regularly cross the entire field of view, often near ±0° or ±90°. Its origin lies in the injection of a periodic or coherent signal into the acquisition chain, such as RF interference, hardware failures, or clock desynchronization. This introduces discrete peaks in k-space that are reconstructed as an oblique sinusoidal pattern in the image [82].

Generation Model

To generate this artifact, a brain MRI volume

f (x, y, z)

is first converted to k-space by applying the Fourier transform (Equation (1)). The artifact is then introduced slice-wise by modifying the 2D Fourier spectrum of each slice.

For each slice

f_{z} (x, y)

, we compute

F_{z} (u, v) = F_{2 D} \{f_{z} (x, y)\} .

A local k-space disturbance is created by copying a small kernel centered at the k-space origin

(u_{0}, v_{0})

to a displaced location

(u_{p}, v_{p})

and scaling it by a factor

β

, which is controlled by a smoothing parameter

S \in [3, 20]

, such that

β = \frac{1}{100 S}

.

Let

Ω_{p}

be the target kernel region of size

k_{s} \times k_{s}

centered at

(u_{p}, v_{p})

:

Ω_{p} = \{(u, v) : |u - u_{p}| \leq \frac{k_{s} - 1}{2}, |v - v_{p}| \leq \frac{k_{s} - 1}{2}\}

(6)

The modified spectrum

\tilde{F_{z}} (u, v)

is defined as:

\tilde{F}_{z} (u, v) = \begin{cases} β F_{z} (u - (u_{p} - u_{0}), v - (v_{p} - v_{0})), & \text{if } (u, v) \in Ω_{p} \\ F_{z} (u, v), & \text{otherwise} \end{cases}

(7)

As a result, a localized perturbation in k-space (a “peak-like” insertion) is produced at a position displaced from the center along either the horizontal or vertical direction. In the spatial domain, this reconstructs as a coherent oblique banding pattern. Finally, the inverse Fourier transform (Equation (5)) is applied to obtain the artifact-corrupted volume.

Control Parameters

To control the severity of the artifact, the following parameter ranges were selected empirically (Table 3): smoothing parameter

S \in [3, 20]

, propagation axis in

\{0, 1, 2\}

, selection point in

\{0, 1, 2, 3\}

, kernel size

k_{s} \in [3, 13]

, and displacement distance

d \in [- 30, 30]

. The selection point defines the reference location around the k-space center

(u_{0}, v_{0})

, from which the target point

(u_{p}, v_{p})

is obtained by applying the displacement

d

along the chosen axis.
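A minimal NumPy sketch of Equations (6) and (7) is given below. It is illustrative only: the function name `add_herringbone` is hypothetical, the kernel size is assumed to be odd, the selection-point logic is omitted (the displacement is applied directly from the k-space center), and the magnitude of the inverse transform is returned because the one-sided insertion makes the result complex.

```python
import numpy as np

def add_herringbone(slice_2d, k_s, d, s, axis=0):
    """Copy a k_s x k_s kernel from the k-space centre to a displaced
    location and scale it by beta = 1 / (100 s), as in Equation (7)."""
    F = np.fft.fftshift(np.fft.fft2(slice_2d))
    v0, u0 = slice_2d.shape[0] // 2, slice_2d.shape[1] // 2
    half = (k_s - 1) // 2                      # assumes odd k_s
    # Target point (u_p, v_p): displacement d along the chosen axis.
    vp, up = (v0 + d, u0) if axis == 0 else (v0, u0 + d)
    beta = 1.0 / (100.0 * s)
    kernel = F[v0 - half:v0 + half + 1, u0 - half:u0 + half + 1].copy()
    F_mod = F.copy()
    F_mod[vp - half:vp + half + 1, up - half:up + half + 1] = beta * kernel
    # Assumption: the magnitude image is kept, as in magnitude MRI.
    return np.abs(np.fft.ifft2(np.fft.ifftshift(F_mod)))
```

The sketch assumes that the displaced kernel lies entirely inside the spectrum; a full implementation would need to clip or reject target points near the border.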

3.2.3. Zipper

This artifact appears as one or more lines of noise, often alternating between light and dark pixels, extending across the image in a single direction [83]. In clinical MRI, zipper artifacts may arise from several causes, most commonly related to external RF interference or hardware or software issues outside the radiologist’s direct control.

Generation Model

To generate this artifact, for each 2D slice

f_{z} (x, y)

of the brain volume, random noise with half intensity is applied to the entire slice. Additionally, if the pixel coordinate falls within the interval defined by a starting position

p_{a}

and an upper limit

p_{a} + a U

(where

a

is the artifact amplitude and

U \sim \mathcal{U} (0, 1)

), the full noise intensity is injected (Equation (8)).

The interval is evaluated along the selected propagation axis.

Let

n (x, y) \sim U (- 1, 1)

be a random noise field and let

i

be the noise intensity. Let

t (x, y)

denote the coordinate of

(x, y)

along the propagation axis. The corrupted slice

f_{z}^{'} (x, y)

is defined as:

f_{z}^{'} (x, y) = \begin{cases} f_{z} (x, y) + i \, n (x, y), & \text{if } p_{a} \leq t (x, y) \leq p_{a} + a U \\ f_{z} (x, y) + \frac{i}{2} n (x, y), & \text{otherwise} \end{cases}, \quad U \sim \mathcal{U} (0, 1)

(8)

As a result, a stripe-like noise variation appears in one direction, producing the characteristic zipper artifact pattern.

Control Parameters

To control the severity of the artifact, the following parameter ranges were selected empirically (Table 4). The intensity ranged from 15 to 50, the propagation axis from 0 to 2, the number of artifacts from 1 to 16, the variability from 20 to 40, and the amplitude from 10 to 50.
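Equation (8) translates almost directly into array operations. The sketch below is a hedged illustration with a hypothetical function name `add_zipper`; it generates a single stripe, whereas the protocol above allows several artifacts per slice, and the "variability" parameter of Table 4 is not modeled.

```python
import numpy as np

def add_zipper(slice_2d, p_a, a, intensity, axis=1, rng=None):
    """Equation (8): full-intensity noise inside the stripe
    [p_a, p_a + a*U], half-intensity noise elsewhere."""
    rng = np.random.default_rng() if rng is None else rng
    n = rng.uniform(-1.0, 1.0, size=slice_2d.shape)   # n(x, y) ~ U(-1, 1)
    upper = p_a + a * rng.uniform(0.0, 1.0)           # p_a + a U
    t = np.indices(slice_2d.shape)[axis]              # coordinate along the axis
    in_stripe = (t >= p_a) & (t <= upper)
    scale = np.where(in_stripe, intensity, intensity / 2.0)
    return slice_2d + scale * n, in_stripe
```

The function also returns the stripe mask, which is convenient for checking where the full noise intensity was applied.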

3.3. Preprocessing

For volume preprocessing, intensity standardization was applied using z-score normalization,

Z = (X - μ) / σ

, where

X

denotes the original voxel intensity,

μ

is the mean intensity, and

σ

is the corresponding standard deviation. This transformation centers the intensities at zero and rescales them to unit variance, facilitating inter-image comparison and stabilizing training.

The train–validation split (80%/20%) was performed at the subject level, using stratified sampling by artifact type, before any 2D slicing. This ensures that all slices from a given subject or volume are assigned exclusively to one partition. The 3D volumes were then divided into 2D slices.
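The two preprocessing steps can be sketched as follows. The function names `zscore` and `subject_level_split` are hypothetical, and stratification by artifact type is omitted for brevity; the sketch only shows how splitting on subject identifiers keeps every slice of a subject in a single partition.

```python
import numpy as np

def zscore(volume, eps=1e-8):
    """Z = (X - mu) / sigma, computed over the whole volume."""
    return (volume - volume.mean()) / (volume.std() + eps)

def subject_level_split(subject_ids, val_fraction=0.2, seed=0):
    """Return boolean masks (train, val) over slices so that all slices
    of a subject fall into exactly one partition."""
    subject_ids = np.asarray(subject_ids)
    rng = np.random.default_rng(seed)
    subjects = np.unique(subject_ids)
    rng.shuffle(subjects)
    n_val = max(1, int(round(val_fraction * len(subjects))))
    is_val = np.isin(subject_ids, subjects[:n_val])
    return ~is_val, is_val
```

Because the masks are derived from subject identifiers rather than slice indices, no subject can contribute slices to both partitions.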

3.4. Validation of Dataset Partitioning and Distribution

This section evaluates whether the sets follow comparable distributions, ensuring that the model is exposed to a similar level of complexity in both partitions. First, we assessed whether the proportion of samples for each artifact type remained approximately constant across the training and validation subsets, thereby maintaining balanced representation of artifact classes. Second, we compared the statistical characteristics of the input images between the two subsets to confirm that the data complexity is similar in both partitions.

3.4.1. Feature Extraction Within a Tissue Mask

For each 2D slice, an automatic tissue mask was generated using intensity percentiles to remove background and restrict feature computation to the tissue region. From the voxels inside the mask, we extracted a feature vector composed of the following:

(i) Intensity descriptors, including the mean, standard deviation, skewness, kurtosis, selected percentiles, and a 64-bin histogram computed in z-score space.

(ii) Spatial-frequency descriptors, derived from wavelet- and Fourier-domain energy ratios. Using a first-level 2D discrete wavelet transform with a Daubechies-2 wavelet, we computed the average energies of the sub-bands

AA

,

AD

,

DA

, and

DD

. We also calculated the ratio between high- and low-frequency energy regions in the Fourier spectrum. The resulting frequency feature vector is defined as:

f^{freq} = [\frac{E_{AD} + E_{DA} + E_{DD}}{E_{AA} + ϵ}, \frac{E_{high}}{E_{low} + ϵ}]

(9)

where

E_{AA}, E_{AD}, E_{DA}, E_{DD}

denote the average energies of the level-1 wavelet sub-bands, and

E_{low}, E_{high}

denote the average energies of the low- and high-frequency regions of the Fourier magnitude spectrum. Here,

ϵ > 0

is a small constant introduced to avoid division by zero.

(iii) Second-order texture descriptors were derived from gray-level co-occurrence matrices (GLCMs). Each image was quantized into 32 gray levels, and the corresponding GLCMs were computed and averaged across four orientations (0°, 45°, 90°, and 135°). The resulting texture feature vector is defined as:

f^{glcm} (P) = [\sum_{i, j} {(i - j)}^{2} P_{i j}, \sum_{i, j} \frac{P_{i j}}{1 + |i - j|}, \sum_{i, j} P_{i j}^{2}, \frac{\sum_{i, j} (i - μ_{x}) (j - μ_{y}) P_{i j}}{σ_{x} σ_{y} + ϵ}, \sum_{i, j} |i - j| P_{i j}]

(10)

where

P_{i j}

denotes the normalized GLCM averaged across the four orientations, and

μ_{x}, μ_{y}, σ_{x}, σ_{y}

denote the marginal means and standard deviations of

P,

computed from the row and column marginals.
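The texture descriptors of Equation (10) can be computed with plain NumPy, as sketched below. Several choices are assumptions made for illustration: the helper names, a pixel distance of 1, non-symmetric co-occurrence counts, and reading the first term as the standard GLCM contrast \sum_{i,j} (i - j)^2 P_{ij}. The tissue mask is omitted.

```python
import numpy as np

def glcm(q, levels, dr, dc):
    """Normalized co-occurrence matrix of a quantized image for offset (dr, dc)."""
    h, w = q.shape
    r0, r1 = max(0, -dr), min(h, h - dr)
    c0, c1 = max(0, -dc), min(w, w - dc)
    src = q[r0:r1, c0:c1]
    dst = q[r0 + dr:r1 + dr, c0 + dc:c1 + dc]
    P = np.zeros((levels, levels))
    np.add.at(P, (src.ravel(), dst.ravel()), 1.0)
    return P / P.sum()

def glcm_features(image, levels=32, eps=1e-12):
    """Contrast, homogeneity, energy, correlation, dissimilarity (Equation (10))."""
    lo, hi = image.min(), image.max()
    q = np.floor((image - lo) / (hi - lo + eps) * levels).astype(int)
    q = np.clip(q, 0, levels - 1)
    offsets = [(0, 1), (-1, 1), (-1, 0), (-1, -1)]   # 0, 45, 90, 135 degrees
    P = np.mean([glcm(q, levels, dr, dc) for dr, dc in offsets], axis=0)
    i, j = np.indices(P.shape)
    mu_x, mu_y = (i * P).sum(), (j * P).sum()
    sd_x = np.sqrt(((i - mu_x) ** 2 * P).sum())
    sd_y = np.sqrt(((j - mu_y) ** 2 * P).sum())
    return np.array([
        ((i - j) ** 2 * P).sum(),                               # contrast
        (P / (1.0 + np.abs(i - j))).sum(),                      # homogeneity
        (P ** 2).sum(),                                         # energy
        ((i - mu_x) * (j - mu_y) * P).sum() / (sd_x * sd_y + eps),  # correlation
        (np.abs(i - j) * P).sum(),                              # dissimilarity
    ])
```

For a perfectly uniform image, all co-occurrences fall into a single cell, so contrast and dissimilarity are 0 while homogeneity and energy are 1.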

3.4.2. Joint Normalization and Decorrelation

After feature extraction, feature vectors from the training and validation partitions were jointly normalized. First, a pooled z-score normalization was applied to each feature dimension by subtracting the global mean and dividing by the pooled standard deviation. Next, a whitening transformation based on the pooled covariance matrix was applied to reduce linear correlations between dimensions.

This procedure produces a multivariate representation with approximately zero-mean, unit variance, and reduced cross-feature correlations.
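One way to realize this step is sketched below. The use of ZCA whitening (the symmetric inverse square root of the covariance) is an assumption, since the text does not state which whitening variant was used, and `pooled_zscore_whiten` is a hypothetical name.

```python
import numpy as np

def pooled_zscore_whiten(train, val, eps=1e-8):
    """Pooled z-score followed by whitening with the pooled covariance."""
    pooled = np.vstack([train, val])
    z = (pooled - pooled.mean(axis=0)) / (pooled.std(axis=0) + eps)
    cov = np.cov(z, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    # ZCA whitening matrix: V diag(1/sqrt(lambda)) V^T (assumed variant).
    W = evecs @ np.diag(1.0 / np.sqrt(evals + eps)) @ evecs.T
    zw = z @ W
    return zw[:len(train)], zw[len(train):]
```

After the transformation, the pooled sample covariance is close to the identity matrix, so Euclidean distances in the later two-sample tests are not dominated by a few correlated or large-scale features.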

3.4.3. Multivariate Comparison Tests

The training and validation distributions were compared using three multivariate two-sample tests applied to the normalized feature vectors: the energy distance, the maximum mean discrepancy (MMD), and the sliced-Wasserstein distance. The energy distance is defined as:

ε (X, Y) = \frac{2}{n m} \sum_{i = 1}^{n} \sum_{j = 1}^{m} {‖x_{i} - y_{j}‖}_{2} - \frac{1}{n^{2}} \sum_{i = 1}^{n} \sum_{i ’ = 1}^{n} {‖x_{i} - x_{i ’}‖}_{2} - \frac{1}{m^{2}} \sum_{j = 1}^{m} \sum_{j ’ = 1}^{m} {‖y_{j} - y_{j ’}‖}_{2}

(11)

where

X = \{x_{i}\}_{i = 1}^{n}

and

Y = \{y_{j}\}_{j = 1}^{m}

are the feature samples from the training and validation sets, respectively.
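Equation (11) can be evaluated directly from pairwise Euclidean distances. The sketch below uses hypothetical helper names and dense distance matrices, which is only practical for moderate sample sizes.

```python
import numpy as np

def pairwise_dist(A, B):
    """Euclidean distance between every row of A and every row of B."""
    diff = A[:, None, :] - B[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def energy_distance(X, Y):
    """Equation (11): 2 E||x - y|| - E||x - x'|| - E||y - y'||."""
    return (2.0 * pairwise_dist(X, Y).mean()
            - pairwise_dist(X, X).mean()
            - pairwise_dist(Y, Y).mean())
```

The statistic is zero when both samples are identical and grows as the two point clouds move apart.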

3.4.4. Maximum Mean Discrepancy (MMD) with a Gaussian Kernel

\mathrm{MMD}^{2} (X, Y) = \frac{1}{n (n - 1)} \sum_{i \neq i ’} k_{σ} (x_{i}, x_{i ’}) + \frac{1}{m (m - 1)} \sum_{j \neq j ’} k_{σ} (y_{j}, y_{j ’}) - \frac{2}{n m} \sum_{i = 1}^{n} \sum_{j = 1}^{m} k_{σ} (x_{i}, y_{j})

(12)

with the Gaussian kernel

k_{σ} (x, y) = \exp (- \frac{{∥ x - y ∥}_{2}^{2}}{2 σ^{2}}) .

The bandwidth

σ

was selected using the median heuristic, i.e., as the median of pairwise Euclidean distances computed over the pooled set

X \cup Y

.
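The unbiased estimator in Equation (12) and the median heuristic can be sketched as follows. Function names are hypothetical, and the median is taken over distinct pairs of the pooled set, which is one common reading of the heuristic.

```python
import numpy as np

def sq_dists(A, B):
    """Squared Euclidean distance between every row of A and every row of B."""
    return ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)

def median_bandwidth(X, Y):
    """Median of pairwise distances over the pooled set (distinct pairs)."""
    Z = np.vstack([X, Y])
    d = np.sqrt(sq_dists(Z, Z))
    return np.median(d[np.triu_indices(len(Z), k=1)])

def mmd2_unbiased(X, Y, sigma):
    """Equation (12) with a Gaussian kernel of bandwidth sigma."""
    n, m = len(X), len(Y)
    kern = lambda A, B: np.exp(-sq_dists(A, B) / (2.0 * sigma ** 2))
    Kxx, Kyy, Kxy = kern(X, X), kern(Y, Y), kern(X, Y)
    term_x = (Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))   # i != i'
    term_y = (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))   # j != j'
    return term_x + term_y - 2.0 * Kxy.mean()
```

Because the diagonal terms are excluded, the estimate can be slightly negative when the two samples come from the same distribution.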

3.4.5. Sliced-Wasserstein Distance (Order 1)

\mathrm{SWD}_{1} \approx \frac{1}{L} \sum_{l = 1}^{L} [\frac{1}{K} \sum_{k = 1}^{K} |{\tilde{x}}_{k}^{(l)} - {\tilde{y}}_{k}^{(l)}|], \quad K = \min (n, m)

(13)

where

θ_{l} \in S^{d - 1}

are

L

random unit directions, and

\{{\tilde{x}}_{k}^{(l)}\}

and

\{{\tilde{y}}_{k}^{(l)}\}

are the sorted one-dimensional projections of

X

and

Y

onto

θ_{l}

.

In all cases, significance was assessed using a stratified permutation test comparing the training and validation sets, yielding p-values associated with the null hypothesis that both partitions are drawn from the same multivariate feature distribution. High p-values indicate that no statistically significant differences were detected between partitions with respect to the intensity, texture, and spatial-frequency properties considered.
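The sliced-Wasserstein estimate of Equation (13) and a permutation test can be sketched as below. Two points are assumptions: when the sample sizes differ, each projection is summarized by K evenly spaced quantiles so that the sorted values can be paired, and the permutation test shown here is a plain (unstratified) one, whereas the study used a stratified variant. All function names are hypothetical.

```python
import numpy as np

def random_directions(dim, n_proj, rng):
    """L random unit vectors on the sphere S^{d-1}."""
    theta = rng.normal(size=(n_proj, dim))
    return theta / np.linalg.norm(theta, axis=1, keepdims=True)

def sliced_wasserstein_1(X, Y, theta):
    """Equation (13): mean absolute difference of sorted 1D projections."""
    K = min(len(X), len(Y))
    levels = (np.arange(K) + 0.5) / K          # assumed quantile matching
    total = 0.0
    for t in theta:
        xq = np.quantile(X @ t, levels)
        yq = np.quantile(Y @ t, levels)
        total += np.abs(xq - yq).mean()
    return total / len(theta)

def permutation_pvalue(stat_fn, X, Y, n_perm=99, rng=None):
    """p-value for H0: X and Y are drawn from the same distribution."""
    rng = np.random.default_rng() if rng is None else rng
    observed = stat_fn(X, Y)
    Z, n = np.vstack([X, Y]), len(X)
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(Z))
        if stat_fn(Z[idx[:n]], Z[idx[n:]]) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)
```

Keeping the projection directions fixed across permutations, for example with `stat_fn = lambda A, B: sliced_wasserstein_1(A, B, theta)`, avoids adding Monte Carlo noise from the directions to the null distribution.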

3.5. Proposed Architecture

The proposed network, WaveletBasedAttention-Net, takes the four sub-bands of a 2D Discrete Wavelet Transform (DWT) {AA, AD, DA, DD} as input. It is trained to explicitly correct each sub-band.

We employ a 2D U-Net with four encoding and decoding levels, symmetric attention-based skip connections, and a bottleneck containing 1024 filters. Each convolutional block is implemented as a DoubleConv module consisting of

3 \times 3

convolutions, LeakyReLU activation (slope = 0.2), and batch normalization. In the encoder, each block is followed by 2 × 2 max pooling, whereas the decoder uses bilinear upsampling with concatenation of the corresponding skip features. A final convolution reconstructs the four wavelet bands.

The WaveletLoss combines L1 reconstruction errors computed independently for each band using configurable weights

(w_{AA}, w_{AD}, w_{DA}, w_{DD})

.

Figure 1 summarizes the network topology and tensor dimensions for a typical

4 \times 128 \times 128

tile. Due to the DWT stage followed by four pooling operations, the implementation requires a minimum spatial size of

32 \times 32

.

Figure 2 illustrates the attention gate (AG) mechanism integrated into the skip connections of the proposed Attention U-Net architecture [84]. The attention gate selectively emphasizes relevant spatial regions in the encoder feature maps while suppressing irrelevant or noisy activations before feature fusion in the decoder.

Let

x \in R^{B \times C_{x} \times H_{x} \times W_{x}}

denote the encoder feature map (skip connection) and let

g \in R^{B \times C_{g} \times H_{g} \times W_{g}}

denote the gating signal from the decoder at a coarser resolution.

First, the encoder features are linearly projected and spatially downsampled using a strided convolution (Equation (14)):

θ_{x} = \mathrm{conv}_{2 \times 2, s = 2} (x), \quad θ_{x} \in R^{B \times C_{i} \times H_{θ} \times W_{θ}}

(14)

where

C_{i}

is an intermediate channel dimensionality.

Simultaneously, the decoder gating signal is projected into the same intermediate feature space using a

1 \times 1

convolution (Equation (15)):

ϕ_{g} = \mathrm{conv}_{1 \times 1} (g), \quad ϕ_{g} \in R^{B \times C_{i} \times H_{g} \times W_{g}}

(15)

If necessary,

ϕ_{g}

is spatially interpolated to match the spatial resolution of

θ_{x}

. The resulting feature maps are then combined through element-wise addition and passed through a Rectified Linear Unit (ReLU) (Equation (16)):

f = \mathrm{ReLU} (θ_{x} + ϕ_{g})

(16)

To generate the attention coefficients, a

1 \times 1

convolution followed by a sigmoid activation is applied (Equation (17)):

ψ = σ (\mathrm{conv}_{1 \times 1} (f)), \quad ψ \in R^{B \times 1 \times H_{θ} \times W_{θ}}

(17)

The resulting attention map

ψ

encodes the relevance of spatial locations conditioned on the decoder context. It is then upsampled to match the spatial resolution of the encoder feature maps (Equation (18)):

ψ_{↑} = \mathrm{upsample} (ψ), \quad ψ_{↑} \in R^{B \times 1 \times H_{x} \times W_{x}}

(18)

Finally, the attention gate modulates the encoder feature map via element-wise multiplication (Equation (19)):

\hat{x} = x ⊙ ψ_{↑}

(19)

where

\hat{x}

denotes the refined skip features passed to the decoder. The formulation of the attention module is summarized in Equations (20) and (21):

α (x, g) = σ (ψ^{T} \mathrm{ReLU} (θ_{x} (x) + ϕ_{g} (g)))

(20)

\hat{x} = x ⊙ α (x, g)

(21)
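The sequence of Equations (14)–(19) can be traced with a small NumPy forward pass. This is a sketch under simplifying assumptions: biases are omitted, the gating signal is assumed to be at exactly half the resolution of the skip features (so no interpolation of ϕ_{g} is needed), nearest-neighbor upsampling is used, and all function and weight names are hypothetical.

```python
import numpy as np

def conv1x1(x, w):
    """1x1 convolution without bias: x is (B, C_in, H, W), w is (C_out, C_in)."""
    return np.einsum("oc,bchw->bohw", w, x)

def conv2x2_stride2(x, w):
    """2x2 convolution with stride 2, no bias: w is (C_out, C_in, 2, 2)."""
    b, c, h, wd = x.shape
    patches = x.reshape(b, c, h // 2, 2, wd // 2, 2)
    return np.einsum("ocij,bchiwj->bohw", w, patches)

def attention_gate(x, g, w_theta, w_phi, w_psi):
    theta_x = conv2x2_stride2(x, w_theta)                # Eq. (14)
    phi_g = conv1x1(g, w_phi)                            # Eq. (15)
    f = np.maximum(theta_x + phi_g, 0.0)                 # Eq. (16)
    psi = 1.0 / (1.0 + np.exp(-conv1x1(f, w_psi)))       # Eq. (17)
    psi_up = psi.repeat(2, axis=2).repeat(2, axis=3)     # Eq. (18), nearest
    return x * psi_up, psi_up                            # Eq. (19)
```

Since the attention coefficients lie between 0 and 1, the gate can only attenuate the skip features, never amplify them.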

3.6. Training

The complete pipeline begins with Radon-guided orientation normalization. For each artifact-corrupted slice, the dominant angle is estimated within the range [0°, 180°), and the slice and its ground-truth counterpart are rotated to align the artifact pattern. After rotation, a 2D DWT is applied to obtain the four sub-bands {AA, AD, DA, DD}, which form the input tensor [B, 4, H/2, W/2] for the U-Net. The network predicts the corrected bands, the denoised image is reconstructed using the inverse DWT (IDWT), and the prediction is finally rotated back to the original reference frame.

During training, optimization is performed exclusively in the wavelet band domain with WaveletLoss (a band-weighted L1 loss), while SSIM is monitored as a structural metric without contributing to the gradient. During validation, in addition to total loss, band-wise metrics (e.g., MAE for DA) and global metrics computed on the spatial reconstruction are reported. The data loaders are configured with explicit batch size and worker settings (Figure 3).
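The DWT and IDWT stages that surround the network can be illustrated with a self-contained example. For simplicity, the sketch uses the Haar wavelet (db1) as a stand-in for the Daubechies wavelets used in the paper, assumes even image dimensions, and leaves out the Radon-guided rotation; the function names are hypothetical.

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2D Haar DWT; returns the bands stacked as (AA, AD, DA, DD)."""
    s = np.sqrt(2.0)
    lo = (img[0::2, :] + img[1::2, :]) / s     # approximation along axis 0
    hi = (img[0::2, :] - img[1::2, :]) / s     # detail along axis 0
    aa = (lo[:, 0::2] + lo[:, 1::2]) / s
    ad = (lo[:, 0::2] - lo[:, 1::2]) / s
    da = (hi[:, 0::2] + hi[:, 1::2]) / s
    dd = (hi[:, 0::2] - hi[:, 1::2]) / s
    return np.stack([aa, ad, da, dd])

def haar_idwt2(bands):
    """Inverse of haar_dwt2; bands has shape (4, H/2, W/2)."""
    aa, ad, da, dd = bands
    s = np.sqrt(2.0)
    h2, w2 = aa.shape
    lo, hi = np.empty((h2, 2 * w2)), np.empty((h2, 2 * w2))
    lo[:, 0::2], lo[:, 1::2] = (aa + ad) / s, (aa - ad) / s
    hi[:, 0::2], hi[:, 1::2] = (da + dd) / s, (da - dd) / s
    img = np.empty((2 * h2, 2 * w2))
    img[0::2, :], img[1::2, :] = (lo + hi) / s, (lo - hi) / s
    return img

def to_network_input(slices):
    """Stack per-slice bands into a tensor of shape [B, 4, H/2, W/2]."""
    return np.stack([haar_dwt2(s) for s in slices])
```

Because the transform is orthonormal, the IDWT of unmodified bands reproduces the input exactly, so any change in the reconstructed image comes only from the band corrections predicted by the network.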

3.6.1. Wavelet Component Selection for Training

This section focuses on selecting the Daubechies (db) wavelet family that yields the best average performance after 5 runs using a U-Net architecture. Each wavelet family decomposes the image using different low-pass and high-pass filters, separating the signal into low- and high-frequency components [85].

The scaling function

ϕ_{L} (t)

satisfies the two-scale relation (Equation (22)):

ϕ_{L} (t) = \sqrt{2} \sum_{k = 0}^{2 L - 1} l [k] ϕ_{L} (2 t - k)

(22)

where

l [k]

are the scaling (low-pass) coefficients and L is the Daubechies order.

The wavelet filter coefficients

h [k]

, which are related to

l [k]

, are defined as Equation (23):

h [k] = {(- 1)}^{k} l [2 L - 1 - k]

(23)

Accordingly, the wavelet function can be written as Equation (24):

ψ_{L} (t) = \sqrt{2} \sum_{k = 0}^{2 L - 1} h [k] ϕ_{L} (2 t - k)

(24)

For each wavelet in

\{db_{1}, \dots, db_{6}\}

, both the number of filter coefficients and their values vary. The number of filter coefficients is equal to twice the order of the wavelet. Thus,

db_{1}

contains two filter coefficients, whereas

db_{6}

contains twelve.
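As a worked example of Equation (23), the sketch below builds the four db2 scaling coefficients from their closed form and derives the wavelet (high-pass) coefficients by alternating signs and reversing the order. The function names are hypothetical, and coefficient ordering conventions may differ between wavelet libraries.

```python
import numpy as np

def daubechies2_lowpass():
    """Closed-form db2 scaling coefficients l[0..3] (order L = 2)."""
    s3 = np.sqrt(3.0)
    return np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4.0 * np.sqrt(2.0))

def highpass_from_lowpass(l):
    """Equation (23): h[k] = (-1)^k l[2L - 1 - k]."""
    k = np.arange(len(l))
    return ((-1.0) ** k) * l[::-1]

l = daubechies2_lowpass()
h = highpass_from_lowpass(l)
```

The resulting filters have 2L = 4 coefficients each; the low-pass filter sums to \sqrt{2}, the high-pass filter sums to zero, and the two are orthogonal.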

3.6.2. Loss Function (WaveletLoss)

Training optimizes a loss function defined in the wavelet domain that weights the error of each band to direct the correction towards the components where artifacts concentrate most of their energy (Equation (25)).

Given the prediction

{\hat{B}}_{b}^{(n)}

and the reference

B_{b}^{(n)}

for band

b \in \{AA, AD, DA, DD\}

of slice n, the loss is defined as the mean L1 error computed independently for each band. Each band contribution is weighted by the corresponding coefficient

w_{b}

:

\mathcal{L}_{wavelet} = \sum_{b \in \{AA, AD, DA, DD\}} w_{b} \frac{1}{N H_{b} W_{b}} \sum_{n = 1}^{N} {‖{\hat{B}}_{b}^{(n)} - B_{b}^{(n)}‖}_{1}

(25)

In the implementation, each term is computed over the batch and spatial dimensions, then linearly combined using the band weights. The loss operates exclusively on the four bands produced by the 2D DWT, while the image reconstructed via the IDWT is used only to compute evaluation metrics (e.g., SSIM) that do not backpropagate gradients.

To favor suppression of periodic patterns while preserving low-frequency content, we use by default uneven weights emphasizing the high-frequency bands,

(w_{AA}, w_{AD}, w_{DA}, w_{DD}) = (0.1, 0.3, 0.4, 0.3)

.
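The band-weighted L1 loss of Equation (25) reduces to a few array operations. The sketch below uses NumPy and a hypothetical function name `wavelet_loss`; a training implementation would use the tensor library of the network so that gradients can flow through the loss.

```python
import numpy as np

BANDS = ("AA", "AD", "DA", "DD")
DEFAULT_WEIGHTS = {"AA": 0.1, "AD": 0.3, "DA": 0.4, "DD": 0.3}

def wavelet_loss(pred, target, weights=DEFAULT_WEIGHTS):
    """Equation (25) for arrays of shape (N, 4, H_b, W_b),
    with channels ordered as AA, AD, DA, DD."""
    # Mean absolute error per band over batch and spatial dimensions.
    per_band = np.abs(pred - target).mean(axis=(0, 2, 3))
    w = np.array([weights[b] for b in BANDS])
    return float((w * per_band).sum()), dict(zip(BANDS, per_band))
```

With the default weights, a unit error confined to the DA band contributes 0.4 to the loss, whereas the same error in the AA band contributes only 0.1.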

Table 5 summarizes the hyperparameters selected for training.

3.7. Comparison of Results

To compare the results, several models were trained, as described below. All baseline models were implemented using a U-Net architecture with the four db2 wavelet sub-bands. The differences between the models lie in the encoder and decoder layers and the types of skip connections used.

3.7.1. U-Net

Among the models considered for comparison is the U-Net, a network originally designed for medical image segmentation [86]. Its use has since been extended to several other tasks, including image reconstruction [87,88,89] and artifact correction [90,91,92], among others [93,94].

This architecture follows an encoder–decoder structure. The encoder extracts hierarchical image features, while the decoder reconstructs the output according to the target objective, which in this case is artifact correction.

Through its skip connections, U-Net preserves fine spatial details while maintaining anatomical coherence. Moreover, the architecture typically achieves strong performance even when trained with relatively limited amounts of data.

Nevertheless, although the encoder reduces spatial resolution to capture more global contextual information, the model still relies primarily on local receptive fields. Furthermore, some 3D implementations substantially increase the computational cost.

3.7.2. GAN

Another network considered for comparison is the generative adversarial network (GAN) [95], which consists of two components: a generator and a discriminator. The generator produces candidate images, while the discriminator learns to distinguish whether a generated image is consistent with the target distribution. This architecture can model complex data distributions and has been widely applied to tasks such as image reconstruction and synthetic data generation. Moreover, GAN-based approaches often produce perceptually more realistic results, avoiding excessively smoothed images.

However, because the generator depends on the discriminator, training can become unstable and difficult to converge. In addition, GANs may introduce structures that are not anatomically faithful, which could negatively affect clinical interpretation.

3.7.3. Spatial and Channel Attention Mechanisms

In addition to the attention-based approach proposed in Section 3.5, we compare two alternative attention mechanisms: spatial attention and channel attention [96]. The spatial attention module focuses on relevant spatial regions and introduces global contextual information into the learned feature maps through a spatial attention map.

Channel attention, in contrast, emphasizes the importance of each feature channel, encouraging the reconstructed features to remain consistent with the input representations. As a result, the encoder outputs are guided to preserve meaningful feature relationships while suppressing less relevant activations. Together, these mechanisms improve the modeling of global dependencies through guided attention. Reported results indicate that such architectures can achieve statistically more robust and accurate predictions.

However, these approaches introduce additional architectural and computational complexity because extra attention operations must be computed during training and inference. Furthermore, they typically require larger training datasets and careful hyperparameter tuning to achieve good generalization.

3.7.4. Attention–Attention

This baseline relies on a self-attention mechanism in which each image patch attends to all other patches, enabling the modeling of long-range dependencies during reconstruction [97]. Through multi-head self-attention, the encoder captures contextual relationships between degraded and informative regions, allowing the decoder to reconstruct each patch using both local information and global image context.

This mechanism improves robustness and reconstruction quality compared with purely local convolutional approaches. However, the attention mechanism also increases computational complexity and may, in some cases, attend to irrelevant regions, leading to reconstruction errors when foreground and background features are not clearly distinguishable.

3.7.5. Vision Transformer

Another model considered for comparison is the Vision Transformer (ViT) [98], which processes the input image by dividing it into fixed-size patches that are linearly embedded and augmented with positional information. Through a self-attention mechanism, ViT models interactions among all patches, enabling the network to capture long-range global dependencies across the entire image.

This architecture is particularly effective for tasks requiring global contextual reasoning, such as image reconstruction and artifact correction, as it enables more coherent modeling of spatial relationships compared with purely convolutional approaches.

However, ViT-based models typically require larger training datasets to generalize effectively and incur higher computational and memory costs due to the quadratic complexity of the attention mechanism. Additionally, the lack of strong inductive biases, such as locality and translation invariance, may reduce performance in scenarios with limited data.

4. Results

This section presents the results obtained in a controlled synthetic evaluation setting, including artifact generation, validation of the synthetic dataset, training outcomes, and visual examples of artifact correction produced by the proposed method.

4.1. Artifact Generation

Following the procedure described in Section 3.2, we generated three types of artifacts: ringing, herringbone, and zipper. For each artifact type, we present the original artifact-free image, the corresponding k-space (when applicable), and the visual appearance of the generated artifact in the reconstructed image.

4.1.1. Ringing

Figure 4 illustrates the ringing artifact. Figure 4a shows the original artifact-free image. In Figure 4c, the ringing artifact appears as fine oscillatory bands parallel to image edges and aligned with the sampling direction (readout or phase-encoding direction). The artifact becomes particularly visible at high-contrast interfaces, such as the boundary between cerebrospinal fluid (CSF) and gray matter.

Figure 4b shows the corresponding k-space representation. The artifact originates from variations in the high-frequency components of k-space. In the FFT domain, the spectrum shows concentrated energy near the center, with residual oscillatory patterns corresponding to the undulations observed in the reconstructed image.

4.1.2. Herringbone

Figure 5 shows the herringbone artifact. Figure 5a presents the artifact-free image, while Figure 5c displays the image with the herringbone artifact, which appears as thin vertical bands arranged in parallel. These bands become more visible at high-contrast interfaces, such as between CSF and gray matter.

Figure 5b shows the corresponding k-space representation. The artifact component reflects variations in the high-frequency region, and its Fast Fourier transform (FFT) exhibits concentrated energy at the center, with residual dot-like patterns corresponding to the bands observed in the spatial image.

4.1.3. Zipper

Figure 6 shows the zipper artifact. Figure 6a presents the artifact-free image, while Figure 6b displays the zipper artifact, which appears as vertical noise that degrades image quality.

4.2. Validation of Dataset Partitioning and Distribution

This section reports the results of validating the dataset partitioning and distribution. The analysis follows the methodology described in Section 3.4.

4.2.1. Distribution of Artifact Types Across Dataset Partitioning

We first evaluated the proportion of samples for each artifact type (ringing, herringbone, and zipper) in the training and validation sets. As shown in Figure 7, the proportions remain nearly constant across the two partitions.

Specifically, the herringbone class represents 32.9% of the samples in the training set and 30.5% in the validation set. The ringing artifact accounts for 35.3% and 35.1% of the samples in the training and validation sets, respectively, while the zipper artifact represents 31.8% and 34.5%, respectively. These minor differences indicate that the class distribution remains balanced between the two subsets, reducing the risk of sampling bias during model training and evaluation.

4.2.2. Validation of Dataset Partition Distribution

Figure 8 shows the Z-score distributions of the wavelet-energy and GLCM texture descriptors for the training and validation partitions. In all cases, the histograms of both subsets largely overlap and exhibit very similar shapes, tails, and amplitudes.

The wavelet sub-band energies and the high-to-low-frequency ratio exhibit strongly right-skewed distributions. In contrast, the GLCM descriptors show consistent patterns across the two partitions: contrast and dissimilarity are concentrated at low values, whereas homogeneity, energy, and correlation remain close to 1.

This visual similarity indicates that both partitions capture comparable frequency and texture characteristics, with no noticeable bias in statistical complexity between the training and validation sets.

Furthermore, Figure 9 shows the projection of the feature vectors (wavelet + GLCM) onto the first two principal components (PC1 and PC2), which together explain approximately 80% of the total variance. The samples from the training and validation sets are highly intermingled in the latent space, forming point clouds with very similar shapes and extents. No systematic separation is observed between the two partitions; instead, both occupy the same high-density regions and exhibit similar clustering patterns. This strong overlap indicates that, in terms of frequency and texture features considered, the training and validation distributions are comparable and do not introduce any evident bias into the feature space.

Table 6 summarizes the results of the multivariate analyses performed on the feature subspace defined by wavelet energies and GLCM descriptors using the training and validation partitions. Three statistical tests were applied: Energy Distance, Maximum Mean Discrepancy with a radial basis function kernel (MMD-RBF), and Sliced-Wasserstein distance. The resulting statistics are small (0.01038, 0.00032, and 0.07342, respectively), with p-values below 0.05. However, the feature histograms show substantial overlap, and the PCA projection does not reveal a clear separation between the partitions. Therefore, although differences are detected, they do not suggest a severe imbalance between the sets. We believe these differences can be explained by the volume-level partitioning strategy used to prevent data leakage, which naturally introduces minor differences between subsets while preserving methodological validity.

4.3. Wavelet Family Selection

To determine the wavelet family used in the subsequent experiments, we conducted an exploratory study on a 5% subset of the full dataset. In this scenario, six Daubechies families (db1–db6) were compared while keeping the network architecture, training hyperparameters, and all other model settings constant.

For each wavelet family, the model was trained for 20 epochs, and the experiment was repeated five times with different random seeds to account for variability arising from initialization and mini-batch sampling. In each run, we recorded the minimum validation loss and the corresponding evaluation metrics (SSIM and MAE in the image domain) at the epoch with the lowest validation loss.

The results were then aggregated as mean ± standard deviation for each wavelet family, as summarized in Table 7.

Among the evaluated wavelet families, db2 achieved the highest average validation SSIM. As training cost scales with dataset size and available computational resources, repeating full-dataset training for each wavelet family would be computationally expensive. Therefore, this wavelet-selection study was conducted on a reduced subset of the complete dataset and limited to five independent runs per wavelet, providing a practical compromise between robustness and computational cost.

Given this limited number of runs, the experiment does not provide sufficient statistical power to claim differences between wavelet families. Accordingly, the observed differences should be interpreted as indicative rather than conclusive statistical evidence. Based on these results, the db2 wavelet family was selected for all subsequent experiments.

4.4. Training Results

This section reports the loss, SSIM, PSNR, and MAE results obtained for each dataset split. For the training set, the reported values correspond to the model’s best epoch.

Figure 10 illustrates the evolution of the compound loss, computed with the band weights w_{AA}, w_{AD}, w_{DA}, and w_{DD}, as a function of epoch for both the training and validation partitions. In the training set, the loss decreases sharply, dropping from approximately 0.198 in the first epoch to around 0.078 in the final epoch. This trend indicates that the model progressively adjusts its parameters to the characteristics of the training data.

In the validation set, the loss starts at approximately 0.130 and converges to around 0.097 by the ninth epoch. Although small oscillations are observed between epochs, the validation loss remains within a bounded range and does not exhibit sustained increases. This suggests that the model does not show pronounced overfitting and that its generalization performance remains stable throughout training.
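For readers unfamiliar with band-weighted objectives, a compound loss of this kind can be sketched as a weighted sum of per-band errors. The sketch below assumes a mean absolute error per sub-band and uniform placeholder weights; the exact per-band term, the weight values, and any additional regularization used in the paper are not reproduced here, and the function names are hypothetical.

```python
def mean_abs_error(pred, target):
    """Mean absolute error between two equally shaped 2D lists."""
    flat_pred = [v for row in pred for v in row]
    flat_target = [v for row in target for v in row]
    return sum(abs(p - t) for p, t in zip(flat_pred, flat_target)) / len(flat_pred)


def band_weighted_loss(pred_bands, target_bands, weights):
    """Sum over sub-bands of weight * per-band error (per-band MAE assumed)."""
    return sum(weights[name] * mean_abs_error(pred_bands[name], target_bands[name])
               for name in weights)


# Placeholder weights and toy 1x2 sub-bands (illustrative values only).
weights = {"AA": 0.25, "AD": 0.25, "DA": 0.25, "DD": 0.25}
pred_bands = {name: [[1.0, 0.0]] for name in weights}
target_bands = {name: [[0.0, 0.0]] for name in weights}

loss = band_weighted_loss(pred_bands, target_bands, weights)
print(f"band-weighted loss: {loss:.3f}")
```

Raising the weight of a detail band in such a formulation makes errors in that band contribute more to the total, which is the usual motivation for weighting high-frequency sub-bands when the artifacts of interest concentrate there.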

Figure 11 shows the evolution of the structural similarity index (SSIM) for the training and validation partitions over nine epochs. In the training set, the mean SSIM increases progressively from 91.8% in the first epoch to 98.3% in the ninth epoch, reflecting a steady improvement in reconstruction quality.

In the validation set, SSIM remains consistently high throughout training, with moderate fluctuations between approximately 96% and 97%. It increases from 96.2% in the first epoch and stabilizes around 97.4% by the last epoch.

The close alignment between the training and validation curves, without systematic degradation on the validation set, indicates that the model generalizes well and does not exhibit significant overfitting with respect to SSIM.

Figure 12 shows the evolution of the validation MAE calculated on the DA wavelet sub-band. The error decreases from approximately 0.039 in the first epoch to around 0.027 by the ninth, indicating a progressive and relatively stable improvement during training.

This metric is evaluated specifically on the high-frequency component because the artifacts of interest (ringing, herringbone, and zipper) are most pronounced in these sub-bands, along with the fine edge and texture details that are diagnostically relevant. In contrast, the low-frequency components (such as AA) are dominated by the overall structure and coarse contrast, which are typically easier to reconstruct and less sensitive to small distortions.

By focusing the MAE on the DA sub-band, the metric provides a more targeted assessment of correction quality in the regions where artifacts primarily affect high-frequency information. This allows us to assess whether the model suppresses these distortions while preserving subtle anatomical details.
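The idea of measuring error in a single sub-band can be illustrated with a single-level Haar transform. This sketch makes several simplifying assumptions: it uses the Haar (db1) wavelet rather than the db2 wavelet selected in this work, it implements the transform directly instead of relying on a wavelet library, and it assumes that the first letter of each band name refers to the filter applied along the row index (axis 0) and the second to the filter along the column index (axis 1). The sign of the detail coefficients may differ from library conventions.

```python
def haar_dwt2(image):
    """Single-level orthonormal 2D Haar DWT of a 2D list with even height and width."""
    bands = {"AA": [], "AD": [], "DA": [], "DD": []}
    for r in range(0, len(image), 2):
        row_out = {name: [] for name in bands}
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            row_out["AA"].append((a + b + d + e) / 2.0)  # low-pass on both axes
            row_out["AD"].append((a - b + d - e) / 2.0)  # detail along axis 1
            row_out["DA"].append((a + b - d - e) / 2.0)  # detail along axis 0
            row_out["DD"].append((a - b - d + e) / 2.0)  # detail on both axes
        for name in bands:
            bands[name].append(row_out[name])
    return bands


def band_mae(pred_band, target_band):
    """Mean absolute error between two sub-bands of equal shape."""
    diffs = [abs(p - t) for p_row, t_row in zip(pred_band, target_band)
             for p, t in zip(p_row, t_row)]
    return sum(diffs) / len(diffs)


# Toy example: a flat reference and a zero-mean pattern alternating row by row.
clean = [[0.0] * 4 for _ in range(4)]
striped = [[0.5 if r % 2 else -0.5] * 4 for r in range(4)]

clean_bands = haar_dwt2(clean)
striped_bands = haar_dwt2(striped)
mae_per_band = {name: band_mae(striped_bands[name], clean_bands[name])
                for name in clean_bands}
print(mae_per_band)
```

In this toy case the entire error falls in the DA band while the AA band is unaffected, which mirrors the argument that a band-specific metric is more sensitive to oriented periodic patterns than a global one.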

Figure 13 and Figure 14 show consistent improvements in SSIM and PSNR after correction for all three artifact types. For the herringbone artifact, the average SSIM increases from 0.968 ± 0.0228 to 0.99 ± 0.0096 (approximately +2.2 percentage points), while the PSNR improves from 36.166 ± 4.893 dB to 39.892 ± 3.046 dB.

For ringing, the input SSIM is already very high (0.987 ± 0.011) but still increases to 0.995 ± 0.003, accompanied by an improvement in PSNR from 38.17 ± 4.824 dB to 41.658 ± 2.062 dB.

The zipper artifact shows the largest improvement, with SSIM increasing from 0.860 ± 0.059 to 0.951 ± 0.023, and PSNR rising from 25.958 ± 3.186 dB to 32.424 ± 2.427 dB. These results indicate that the proposed model is particularly effective and consistent in correcting this type of artifact.

Figure 15 shows the MSE values per wavelet band (AA approximation and AD, DA, and DD details) for each artifact type, comparing images with artifacts (ART) and model-corrected images (PRED). The logarithmic MSE scale shows a consistent shift towards lower values after correction, particularly in the high-frequency bands.

For ringing, the AD, DA, and DD detail components show a clear reduction in MSE and reduced dispersion, indicating that the model consistently attenuates the edge oscillations associated with this artifact. A similar pattern is observed for herringbone: the error decreases substantially in the detail bands, and the distributions become narrower, while the approximation component remains relatively stable. This suggests that the correction mainly targets high-frequency periodic structures.

Finally, for the zipper artifact, the reduction in MSE is particularly pronounced across all bands, with lower medians and more compact interquartile ranges. This indicates that the model removes much of the error energy associated with interference lines while preserving low-frequency structural information.

Table 8 summarizes the quality metrics before and after artifact correction. A reduction in the MSE is observed across all weighted wavelet sub-bands. Specifically, in the w_{AA} component, the MSE decreases from 0.048 ± 0.869 to 0.007 ± 0.009, in w_{AD} from 0.008 ± 0.011 to 0.003 ± 0.004, in w_{DA} from 0.009 ± 0.011 to 0.004 ± 0.004, and in w_{DD} from 0.007 ± 0.010 to 0.002 ± 0.003. Notably, the unusually large pre-correction dispersion of the MSE in the w_{AA} component reflects a small subset of severely corrupted slices in which the periodic artifact contributes substantial energy to the approximation (AA) band, producing a strongly right-skewed distribution rather than a numerical inconsistency.

The weighted-average MSE decreases from 0.018 ± 0.029 to 0.004 ± 0.005. These improvements are also reflected in the perceptual quality metrics: the average PSNR increases from 33.42 ± 6.939 dB to 43.337 ± 5.364 dB, while the SSIM rises from 0.938 ± 0.067 to 0.985 ± 0.022. These results indicate that the corrected images exhibit lower distortion and greater structural similarity to the artifact-free references.
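For reference, the standard relation between MSE and PSNR can be sketched as below. The data range is an assumption of this example (set to 1.0 here); the convention used for images in the z-score domain is not specified in this section, so the sketch is not expected to reproduce the reported values.

```python
import math


def mse(reference, test):
    """Mean squared error between two equally shaped 2D lists."""
    diffs = [(r - t) ** 2 for r_row, t_row in zip(reference, test)
             for r, t in zip(r_row, t_row)]
    return sum(diffs) / len(diffs)


def psnr(reference, test, data_range=1.0):
    """PSNR in dB: 10 * log10(data_range^2 / MSE); infinite for identical images."""
    err = mse(reference, test)
    if err == 0.0:
        return math.inf
    return 10.0 * math.log10(data_range ** 2 / err)


reference = [[0.0, 0.0], [0.0, 0.0]]
offset = [[0.1, 0.1], [0.1, 0.1]]  # constant error of 0.1 gives MSE = 0.01

psnr_offset = psnr(reference, offset)
print(f"PSNR: {psnr_offset:.2f} dB")
```

Because PSNR is a nonlinear function of MSE, the mean of per-slice PSNR values generally differs from the PSNR computed from the mean MSE, so the averaged quantities in a summary table cannot be converted into one another directly.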

4.5. Structural Preservation and Edge Integrity

To assess whether artifact suppression preserves anatomical structures, we complemented standard image-quality metrics with gradient- and edge-based measures computed slice-wise. Images were first evaluated in the same z-score space used for model inference, ensuring consistent intensity scaling between input, prediction, and ground truth. In addition to PSNR and SSIM in the image-intensity domain, we computed metrics on Sobel-gradient magnitude maps to assess edge consistency. We also measured gradient errors (L1 and L2) between predicted and ground truth gradients to quantify structural deviations. Edge Dice (F1) was obtained from binary edge maps derived from Sobel magnitude using a robust threshold based on the 90th percentile of the ground truth gradient distribution. These complementary metrics serve as proxies for anatomical structural integrity and boundary preservation, helping distinguish true structural recovery from simple image smoothing. In addition, we evaluated a Radon-based periodicity proxy on residual error maps to determine whether oriented artifact patterns remain after correction.

The proposed method improves image fidelity while maintaining structural consistency. Improvements are observed in the gradient-based domain: Edge-PSNR increases from 38.41 to 41.04, and Edge-SSIM from 0.952 to 0.980, indicating stronger agreement between edge structures and the ground truth images. Gradient mismatch errors decrease (Grad-L1: 0.0744 to 0.0540, Grad-L2: 0.0622 to 0.0436), and edge overlap improves (Dice/F1: 0.920 to 0.938), suggesting that boundary structures are preserved rather than smoothed away. Finally, the Radon peak ratio measured on residual errors decreases from 6.72 to 2.89, indicating a reduction in oriented periodic artifact patterns (Table 9). The relatively large pre-correction variability of this metric reflects a strongly right-skewed distribution driven by a small number of severe slices, in which structured periodic residuals dominate the error map.
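The gradient-based measures can be illustrated with a minimal sketch. Several choices here are assumptions rather than the authors' implementation: the Sobel kernels are unnormalized, border pixels are set to zero, Grad-L1 is averaged over all pixels, the percentile uses linear interpolation, and a pixel counts as an edge when its gradient magnitude is greater than or equal to the threshold.

```python
import math

SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_magnitude(img):
    """Sobel gradient magnitude; border pixels are left at 0 for simplicity."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = sum(SOBEL_X[i][j] * img[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * img[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            out[r][c] = math.hypot(gx, gy)
    return out


def percentile(values, q):
    """Percentile with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def edge_metrics(pred, gt, q=90.0):
    """Return (Grad-L1, Edge Dice) between a prediction and the ground truth."""
    g_pred = [v for row in sobel_magnitude(pred) for v in row]
    g_gt = [v for row in sobel_magnitude(gt) for v in row]
    grad_l1 = sum(abs(p - t) for p, t in zip(g_pred, g_gt)) / len(g_gt)
    threshold = percentile(g_gt, q)  # threshold is derived from the ground truth only
    e_pred = [v >= threshold for v in g_pred]
    e_gt = [v >= threshold for v in g_gt]
    overlap = sum(p and t for p, t in zip(e_pred, e_gt))
    total = sum(e_pred) + sum(e_gt)
    dice = 2.0 * overlap / total if total else 1.0
    return grad_l1, dice


# Toy images: a sharp vertical step and a smoothed version of the same step.
sharp = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
blurred = [[0.0, 0.0, 0.0, 0.25, 0.75, 1.0, 1.0, 1.0] for _ in range(8)]

l1_same, dice_same = edge_metrics(sharp, sharp)
l1_blur, dice_blur = edge_metrics(blurred, sharp)
print(f"identical: Grad-L1 = {l1_same:.3f}, Dice = {dice_same:.3f}")
print(f"blurred:   Grad-L1 = {l1_blur:.3f}, Dice = {dice_blur:.3f}")
```

In this toy case, smoothing the step raises the gradient error and removes the edge overlap, which is why such measures help separate structural recovery from simple smoothing.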

4.6. Visual Results of the Correction

Figure 16, Figure 17 and Figure 18 present qualitative examples of corrections for each artifact type, comparing the artifact-corrupted images, the ground truth, and the model’s predictions. In the ringing examples, oscillations adjacent to high-contrast transitions are consistently reduced, resulting in sharper edges at gray and white matter boundaries (Figure 16).

For herringbone, the high-frequency oblique bands are noticeably attenuated or disappear, while the fine anatomy of the cortex and deep structures is preserved, with textures and gray level distributions comparable to those of the ground truth (Figure 17).

Finally, in the zipper cases, the interference lines crossing the volume largely disappear in the predictions. This restoration produces a more homogeneous background and brain contours that are similar to those in the reference image, without an evident loss of contrast, and with improvements across all evaluation metrics (Figure 18).

The predictions closely resemble the artifact-free images, supporting the model’s ability to suppress distortions while preserving relevant structural information.

Across all three qualitative examples, the model consistently improves visual fidelity and reduces structured residuals in the absolute error maps |Input − GT| and |Pred − GT| (same z-score domain and shared color scale within each figure). For ringing, the prediction restores sharper boundaries and suppresses oscillatory edge-related errors, increasing PSNR from 37.98 to 44.53 dB and SSIM from 0.9809 to 0.9956. For herringbone, the characteristic stripe-like pattern visible in the input error map is strongly attenuated after correction, with PSNR improving from 35.50 to 42.37 dB and SSIM from 0.9426 to 0.9901. For the more challenging zipper case, the input exhibits pronounced banding and large residuals, whereas the prediction substantially reduces the stripe energy and error magnitude, yielding a marked improvement in metrics (PSNR 17.26 to 29.63 dB, SSIM 0.4471 to 0.8681).

In all cases, residual errors become more spatially diffuse and lower in magnitude after correction, indicating effective suppression of periodic artifact structure (Figure 19, Figure 20 and Figure 21).

In the ROI triptychs (a: Input ROI, b: GT ROI, and c: Pred ROI), the corrected ROIs show closer visual agreement with the ground truth images and reduced structured artifacts across all cases. For ringing (Figure 22), the prediction preserves fine anatomical edges while attenuating local oscillatory patterns: PSNR increases from 31.50 dB (input) to 38.21 dB (prediction), and SSIM increases from 0.9236 to 0.9766.

For herringbone (Figure 23), the stripe-like contamination visible in the input ROI is strongly reduced, producing a texture more consistent with the GT ROI: PSNR increases from 30.01 dB to 39.02 dB, and SSIM increases from 0.8323 to 0.9806.

For the zipper (Figure 24), the input ROI exhibits pronounced banding and contrast disruption; the prediction suppresses the band structure and restores local contrast toward the GT ROI: PSNR increases from 19.20 dB to 31.28 dB, and SSIM increases from 0.4542 to 0.8928.

4.7. Stress Test on Mixed Periodic Artifacts

To assess robustness within the controlled synthetic setting under combined degradations, we perform a stress test using synthetic mixtures of periodic artifacts generated by composing the same corruption operators used in the single-artifact setting. Specifically, we considered two-artifact mixtures (ringing+zipper, herringbone+ringing, and zipper+herringbone) and a three-artifact mixture (ringing+zipper+herringbone), where the degradations are applied sequentially to the GT to obtain a single mixed-input image.

Performance was evaluated using SSIM and PSNR, reported for Noisy(mix) vs. GT and Predict vs. GT. We also compute the paired gain per slice as Δ = Metric(Predict, GT) − Metric(Noisy(mix), GT). Results are summarized as mean ± standard deviation, with gains reported as mean values, optionally with 95% bootstrap confidence intervals (CIs).
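The paired gain and its bootstrap interval can be sketched as follows. This is a minimal example, assuming a percentile bootstrap over slices; the function names, the resampling scheme, and the PSNR values are placeholders for illustration and are not taken from Table 10.

```python
import random


def paired_gains(metric_pred, metric_noisy):
    """Per-slice gain: Metric(Predict, GT) - Metric(Noisy(mix), GT)."""
    return [p - n for p, n in zip(metric_pred, metric_noisy)]


def bootstrap_ci_mean(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    means = sorted(sum(rng.choices(values, k=n)) / n for _ in range(n_boot))
    lower = means[int((alpha / 2) * n_boot)]
    upper = means[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Placeholder per-slice PSNR values in dB (NOT results from the paper).
psnr_noisy = [22.0, 24.5, 20.0, 26.0, 23.5, 21.0]
psnr_pred = [30.0, 31.0, 29.5, 32.0, 30.5, 29.0]

gains = paired_gains(psnr_pred, psnr_noisy)
mean_gain = sum(gains) / len(gains)
ci_low, ci_high = bootstrap_ci_mean(gains)
print(f"mean gain: {mean_gain:.2f} dB, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
```

Computing the gain per slice before averaging keeps the comparison paired, so slice-to-slice differences in baseline quality do not inflate the variability of the estimate.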

Mixed artifacts substantially degrade image quality (low SSIM and PSNR), whereas the proposed model restores structural similarity and improves fidelity across all mixture conditions. This yields consistently positive ΔSSIM and ΔPSNR. Mean results for each mixture and for the stress test are reported in Table 10.

4.8. Results Stratified by Artifact Severity

Using the generator-defined severity parameters (intensity for ringing/zipper, smooth for herringbone) together with the p25/p75 thresholds (mild ≤ p25, severe ≥ p75, and moderate otherwise), we stratified the paired evaluation dataset (n = 12,529 slices). We then computed mean improvements in PSNR and SSIM in the z-score domain. Uncertainty was estimated using bootstrap 95% CI for the mean improvement (n_boot = 2000), and statistical significance was assessed with a paired sign-flip permutation test (n_perm = 10,000).
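The stratification rule and the sign-flip test can be sketched as below. The sketch assumes linear-interpolation percentiles, a two-sided test on the mean gain, and a +1 correction in the p-value estimate; the severity values and gains are placeholders, and the function names are hypothetical.

```python
import math
import random


def percentile(values, q):
    """Percentile with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def stratify(severities):
    """Label each sample: mild (<= p25), severe (>= p75), moderate otherwise."""
    p25, p75 = percentile(severities, 25), percentile(severities, 75)
    labels = []
    for s in severities:
        if s <= p25:
            labels.append("mild")
        elif s >= p75:
            labels.append("severe")
        else:
            labels.append("moderate")
    return labels


def sign_flip_p_value(gains, n_perm=10000, seed=0):
    """Two-sided paired sign-flip permutation test of a zero mean gain."""
    rng = random.Random(seed)
    n = len(gains)
    observed = abs(sum(gains) / n)
    count = 0
    for _ in range(n_perm):
        flipped = sum(g if rng.random() < 0.5 else -g for g in gains)
        if abs(flipped / n) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)


# Placeholder severity parameters and per-slice PSNR gains (illustrative only).
severities = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
labels = stratify(severities)

gains = [7.1, 6.8, 8.0, 7.5, 6.2, 9.1, 7.7, 6.9, 8.4, 7.0, 6.5, 7.9]
p_value = sign_flip_p_value(gains)
print(labels)
print(f"sign-flip p-value: {p_value:.5f}")
```

Under this estimator, the smallest attainable p-value with 10,000 permutations is 1/10,001, that is, roughly 1 × 10⁻⁴, so a p-value reported at that level means that essentially no sign-flipped resample matched the observed mean gain.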

The model yields a statistically significant improvement in the z-score domain. Across 12,529 paired samples, PSNR increases from 33.77 ± 11.62 dB at the input to 41.03 ± 6.15 dB at the output, corresponding to a mean gain of +7.26 dB (bootstrap 95% CI [7.15, 7.38]). Similarly, SSIM rises from 0.8323 ± 0.2190 to 0.9740 ± 0.0414, corresponding to a mean gain of +0.1418 (95% CI [0.1385, 0.1452]; paired sign-flip permutation p ≤ 1 × 10⁻⁴).

As shown in Table 11, the magnitude of improvement depends on both artifact type and severity. Zipper artifacts show the largest gains across all severity levels, with consistently strong improvements in both PSNR and SSIM. Ringing artifacts also improve significantly, although the gains are more moderate than those observed for zipper. For herringbone, the effect is severity-dependent: the mild group shows only a small improvement, whereas the moderate and severe groups exhibit clearly larger gains. The analysis of the results indicates that the proposed model is most effective when the degradation is stronger and more structurally pronounced, while still preserving stable performance across the full severity range.

4.9. Ablation Study on Radon-Based Loss Regularization

The ablation results show the contribution of the main components of the proposed architecture. The inclusion of the Radon-based regularization yields a gain of +0.686 dB in PSNR and +0.00286 in SSIM relative to the model trained without the Radon term, indicating that the Radon constraint improves the suppression of oriented periodic residuals.

Removing the attention mechanism results in greater degradation (−2.20 dB PSNR, −0.01233 SSIM), suggesting that attention plays an important role in selectively refining artifact-affected regions. The largest drop occurs when the AA component is removed (−8.46 dB in PSNR, −0.14313 in SSIM), indicating the importance of the low-frequency wavelet representation in preserving global anatomical structure. Removing the remaining wavelet sub-bands or their associated losses results in consistent degradation of roughly −5 dB in PSNR and −0.02 in SSIM, indicating that multi-sub-band supervision helps stabilize reconstruction across artifact patterns.

These results suggest that the combined components contribute jointly to the observed performance improvements (Table 12). The ablation should still be interpreted as a component-level analysis rather than an exhaustive assessment of each design choice.

4.10. Comparison of Results

This section compares the proposed approach with all trained baseline models (Table 13).

Based on the results presented in Table 13, the proposed model outperforms all trained baseline models, achieving the highest SSIM (0.98528 ± 0.02218) and PSNR (43.33710 ± 5.36451 dB). These results indicate a superior ability to preserve structural similarity and achieve higher reconstruction fidelity compared with the alternative approaches.

Furthermore, the proposed model exhibits one of the lowest standard deviations among all evaluated methods, suggesting greater stability and consistency across the test samples. The combination of high average metric values and low variability demonstrates the robustness of the proposed approach and increases confidence in its performance. In contrast, models such as Attention–Attention and Vision Transformer show a noticeable drop in performance, particularly in PSNR, indicating the limitations of purely attention-based mechanisms when trained with limited data or under strict reconstruction constraints.

We also report computational indicators for all evaluated models in Table 14 (NVIDIA RTX A6000, NVIDIA Corporation, Santa Clara, CA, USA; 300 W TDP). These include parameter count, approximate FLOPs/MACs for a fixed input size, peak GPU memory usage during training, average wall-clock training time per epoch, total training time normalized to 100 epochs, and inference latency per slice. All methods were profiled under the same wavelet-input regime (four db2 sub-bands) and comparable training settings to ensure a fair cost–performance comparison.

5. Discussion

Regarding the learning process, the composite loss decreases rapidly during the early epochs and then stabilizes within a low range for both training and validation, without sustained increases or marked divergence between the two curves. This behavior suggests that the model adapts its parameters to the data characteristics without clear evidence of severe overfitting, while maintaining stable performance on previously unseen volumes within the same controlled synthetic regime. The simultaneous stabilization of both curves indicates that the proposed architecture can capture the main structure of the problem without obvious signs of memorization in the training data. Furthermore, the SSIM and PSNR metrics increase across all three artifact types in the reported experiments.

Wavelet-domain analysis provides insight into how the model operates. The reduction in MSE after correction is particularly pronounced in the detail bands (AD, DA, and DD) for all three artifacts, whereas the approximation band (AA) changes to a lesser extent. This pattern suggests that most of the correction occurs in high-frequency components, where periodic structures and oscillations associated with ringing, herringbone, and zipper artifacts are concentrated, whereas changes in the low-frequency band appear comparatively smaller.

In the case of zipper artifacts, the reduction in error across all wavelet bands suggests a reduction in interference-line patterns, with no visually apparent loss of the underlying anatomical structures in the evaluated examples. However, this four-band representation also has limitations. Each band (AA, AD, DA, and DD) captures different frequency ranges and orientations (low-frequency, primarily horizontal, vertical, and diagonal details), resulting in a relatively rigid representation. Moreover, in the current architecture, the sub-bands are treated as independent channels, without explicitly modeling correlations across scales and orientations. This design may reduce sensitivity to more complex artifact patterns or anatomical structures that extend across multiple scales and orientations.

Although zipper artifacts yield the lowest absolute post-correction metrics (0.951 ± 0.023 in SSIM and 32.424 ± 2.427 dB in PSNR), compared with herringbone (0.990 ± 0.009 SSIM and 39.892 ± 3.046 dB PSNR) and ringing (0.995 ± 0.003 SSIM and 41.658 ± 2.062 dB PSNR), they also show the largest relative improvement in these metrics. The average SSIM increases by approximately 9 percentage points (from 0.860 to 0.951), whereas the improvement is about 2.2 points for herringbone (from 0.968 to 0.990) and about 1 point for ringing (from 0.987 to 0.995).

Although SSIM and PSNR increase substantially and visual inspection (Figure 16, Figure 17 and Figure 18) indicates marked artifact attenuation, the synthetic artifact generation process may not fully capture the variability and irregularity of real artifacts, particularly because no public datasets with annotated real MRI artifacts are currently available for direct validation.

In our setup, artifacts are generated through discrete, symmetric peaks in k-space, which introduces a structural bias into the training data. As a result, the training data are dominated by highly structured, regular, and periodic patterns, which can be interpreted as idealized artifacts. While this regular structure likely reduces variability in the learning problem, it may lead to an overestimation of the ability of the model to generalize to more complex, asymmetric, or irregular artifacts found in real clinical data.

Another important limitation concerns the Radon transform-based normalization used to align artifact orientation. As implemented, this procedure assumes a single dominant angle in the interference pattern and is therefore tailored to highly regular global structures. In practice, however, artifacts may exhibit multiple simultaneous orientations, local curvatures, or spatial variations. In such cases, estimating a single global angle through the Radon transform may average heterogeneous structures and reduce sensitivity to local components, thereby reducing correction performance for heterogeneous artifacts and biasing the model toward patterns that more closely match the synthetic design.

The proposed pipeline also introduces additional computational overhead compared with standard U-Net baselines: (i) Radon-guided angle normalization and its inverse rotation add pre- and post-processing steps and may introduce minor interpolation artifacts; (ii) DWT/IDWT operations and multi-band processing increase the computational footprint; and (iii) attention-gated skip connections increase parameterization and memory usage. In addition, (iv) performance may degrade when the dominant artifact orientation is ambiguous.

Nevertheless, the ablation study indicates that the Radon component has a measurable impact on the reported performance. Incorporating Radon-based regularization yields gains of +0.686 dB in PSNR and +0.00286 in SSIM, suggesting that orientation normalization is associated with improved correction performance in this synthetic setting.

The stratified evaluation shows that the model remains effective across different artifact severity levels, with the largest improvements observed in severe cases while still providing consistent benefits for milder degradations. Similar trends are reflected in the absolute error maps, which visually indicate marked attenuation of artifact-related distortions, and in the ROI-based analysis, where improvements are also observed in anatomically relevant regions. These findings suggest that, in this controlled synthetic setting, the method improves global similarity metrics and local image quality in structurally relevant regions.

The mixed-artifact experiments provide additional insight into the model’s behavior under more challenging conditions. Although the network was trained on individual artifact types, it still improves inputs corrupted by unseen synthetic combinations of the same periodic corruption operators, suggesting a limited degree of generalization within this synthetic setting.

Finally, the edge-preservation analysis—based on Sobel-derived representations and complementary metrics such as edge-PSNR, edge-SSIM, Grad-L1, Grad-L2, Edge Dice/F1, and the Radon peak ratio—indicates that, within this synthetic evaluation, increases in the considered metrics are not accompanied by obvious losses of structural information. Instead, the results suggest that, in this controlled synthetic setting, the proposed approach suppresses periodic artifacts while preserving significant anatomical boundaries relevant for MRI post-processing analyses.

Another aspect to consider is extending the problem to 3D and the inter-slice coherence. The current model operates on independent 2D slices and therefore does not explicitly exploit spatial correlations along the slice axis, where many artifacts—particularly those related to hardware or motion—manifest continuously across slices. In addition, uncertainty remains regarding the severity ranges used during synthetic artifact generation (peak amplitude in k-space, number of spikes, noise levels, etc.). If these parameters are tuned toward relatively mild conditions, the model may be better adapted to correcting subtle artifacts but less robust to more extreme cases.

Finally, although the inverse Radon and wavelet transforms are theoretically well-defined and approximately reversible, their practical implementation involves interpolation, boundary effects, and numerical quantization. These steps may introduce local smoothing, slight geometric misalignments, or small redistributions of high-frequency energy.

Although such effects are typically small, they are not entirely negligible and should be considered when interpreting the corrected images as approximations to the original anatomy, especially when several transforms are applied sequentially within the same processing pipeline.

6. Conclusions

In the controlled synthetic evaluation, the model increases image similarity metrics, with higher SSIM and PSNR values. It also attenuates synthetically generated ringing, herringbone, and zipper artifacts, with the largest relative gains observed for zipper artifacts, accompanied by a marked reduction in wavelet-domain error, particularly in the detail bands.

This approach also presents several limitations. Synthetic artifacts arise from discrete, symmetric peaks in k-space and are represented using four wavelet sub-bands that are processed largely independently. This configuration favors learning regular, globally oriented patterns and does not guarantee robustness to more complex, irregular, or locally heterogeneous artifacts observed in real data. In addition, the use of the Radon transform to normalize orientation assumes a single dominant angle, the model operates on 2D patches without explicitly enforcing inter-slice coherence, and the inverse Radon and wavelet transforms may introduce interpolation and boundary effects. These aspects can lead to biases and small distortions that may limit generalization to real-world clinical settings.

Within these constraints, the reported results provide a methodological validation of periodic artifact correction in a controlled synthetic environment. They also underline the need to generate more realistic artifact models, explore architectures operating directly in 3D, and develop formulations that explicitly exploit dependencies across scales, orientations, and slices.

Author Contributions

Conceptualization, J.D.R.-P., C.A.L.-B. and G.S.-T.; methodology, C.A.L.-B. and G.S.-T.; software, J.D.R.-P. and G.S.-T.; validation, J.D.R.-P., G.S.-T., C.A.L.-B. and J.W.B.-B.; formal analysis, G.S.-T.; investigation, J.D.R.-P. and C.A.L.-B.; resources, J.W.B.-B.; data curation, J.D.R.-P. and C.A.L.-B.; writing—original draft preparation, J.D.R.-P. and G.S.-T.; writing—review and editing, J.W.B.-B.; visualization, J.D.R.-P.; supervision, G.S.-T.; project administration, J.W.B.-B.; funding acquisition, J.W.B.-B. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by Universidad Nacional de Colombia, Medellín campus, through the project called “Deep Learning-Based Architecture for the Correction of Artifacts in Magnetic Resonance Imaging”, No. 1004349834.

Institutional Review Board Statement

Ethical review and approval were waived for this study because the data used were obtained from public databases.

Informed Consent Statement

Patient consent was waived because the data used were obtained from public databases.

Data Availability Statement

The data presented in this study are openly available in Dataset Creation: Synthetic MRI Artifacts (Ringing, Herringbone, Zipper) at https://github.com/Jesusdrp09/Periodic-Artifact-Dataset-Creation-in-Brain-MRI (accessed on 20 February 2026).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AA	Approximation–approximation
AD	Approximation–detail
DA	Detail–approximation
DD	Detail–detail
DWT	Discrete wavelet transform
IDWT	Inverse discrete wavelet transform

References

Abdi, M.; Feng, X.; Sun, C.; Bilchick, K.C.; Meyer, C.H.; Epstein, F.H. Suppression of Artifact-Generating Echoes in Cine Dense Using Deep Learning. Magn. Reson. Med. 2021, 86, 2095–2104. [Google Scholar] [CrossRef] [PubMed]
Alus, O.; El Homsi, M.; Golia Pernicka, J.S.; Rodriguez, L.; Mazaheri, Y.; Kee, Y.; Petkovska, I.; Otazo, R. Convolutional Network Denoising for Acceleration of Multi-Shot Diffusion Mri. Magn. Reson. Imaging 2024, 105, 108–113. [Google Scholar] [CrossRef]
Bash, S.; Johnson, B.; Gibbs, W.; Zhang, T.; Shankaranarayanan, A.; Tanenbaum, L.N. Deep Learning Image Processing Enables 40% Faster Spinal MR Scans Which Match or Exceed Quality of Standard of Care. Clin. Neuroradiol. 2021, 32, 197–203. [Google Scholar] [CrossRef]
Boudissa, S.; Kanli, G.; Perlo, D.; Jaquet, T.; Keunen, O. Addressing Artefacts in Anatomical Mr Images: A k-Space-Based Approach. In Proceedings of the 2024 IEEE International Symposium on Biomedical Imaging (ISBI), Athens, Greece, 27–30 May 2024; IEEE: Piscataway, NJ, USA, 2024. [Google Scholar]
Hales, P.W.; Pfeuffer, J.; Clark, C.A. Combined Denoising and Suppression of Transient Artifacts in Arterial Spin Labeling Mri Using Deep Learning. J. Magn. Reson. Imaging 2020, 52, 1413–1426. [Google Scholar] [CrossRef]
Chen, Y.-J.; Chang, Y.-J.; Wen, S.-C.; Shi, Y.; Xu, X.; Ho, T.-Y.; Jia, Q.; Huang, M.; Zhuang, J. Zero-Shot Medical Image Artifact Reduction. In Proceedings of the 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI), Iowa City, IA, USA, 3–7 April 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 862–866.
Chen, Y.-J.; Chang, Y.-J.; Wen, S.-C.; Xu, X.; Huang, M.; Yuan, H.; Zhuang, J.; Shi, Y.; Ho, T.-Y. “One-Shot” Reduction of Additive Artifacts in Medical Images. In Proceedings of the 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Houston, TX, USA, 2–9 December 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 836–841.
Jiang, Y.; Cui, L.; Jiang, B.; Zhao, X.; Chai, S. Cardiac MRI Image Enhancement Based on GAN Network. In Proceedings of the 2024 43rd Chinese Control Conference (CCC), Kunming, China, 28–31 July 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 8309–8315.
Kim, M.; Yi, J.; Lee, H.-J.; Hahn, S.; Lee, Y.; Lee, J. Deep Learning-Based Reconstruction for 3-Dimensional Heavily T2-Weighted Fat-Saturated Magnetic Resonance (MR) Myelography in Epidural Fluid Detection: Image Quality and Diagnostic Performance. Quant. Imaging Med. Surg. 2024, 14, 6531–6542.
Patel, V.; Wang, A.; Monk, A.P.; Schneider, M.T.-Y. Enhancing Knee MR Image Clarity through Image Domain Super-Resolution Reconstruction. Bioengineering 2024, 11, 186.
Ryu, K.H.; Baek, H.J.; Gho, S.-M.; Ryu, K.; Kim, D.-H.; Park, S.E.; Ha, J.Y.; Cho, S.B.; Lee, J.S. Validation of Deep Learning-Based Artifact Correction on Synthetic FLAIR Images in a Different Scanning Environment. J. Clin. Med. 2020, 9, 364.
Singh, R.; Kaur, L. Magnetic Resonance Image Denoising Using Patchwise Convolutional Neural Networks. In Proceedings of the 2021 8th International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi, India, 17–19 March 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 652–657.
Singh, R.; Kaur, L. Noise-Residue Learning Convolutional Network Model for Magnetic Resonance Image Enhancement. J. Phys. Conf. Ser. 2021, 2089, 012029.
Wang, X.; Ma, J.; Bhosale, P.; Ibarra Rovira, J.J.; Qayyum, A.; Sun, J.; Bayram, E.; Szklaruk, J. Novel Deep Learning-Based Noise Reduction Technique for Prostate Magnetic Resonance Imaging. Abdom. Radiol. 2021, 46, 3378–3386.
van der Velde, N.; Hassing, H.C.; Bakker, B.J.; Wielopolski, P.A.; Lebel, R.M.; Janich, M.A.; Kardys, I.; Budde, R.P.J.; Hirsch, A. Improvement of Late Gadolinium Enhancement Image Quality Using a Deep Learning-Based Reconstruction Algorithm and Its Influence on Myocardial Scar Quantification. Eur. Radiol. 2021, 31, 3846–3855.
Arefeen, Y.; Beker, O.; Cho, J.; Yu, H.; Adalsteinsson, E.; Bilgic, B. Scan-Specific Artifact Reduction in k-Space (SPARK) Neural Networks Synergize with Physics-Based Reconstruction to Accelerate MRI. Magn. Reson. Med. 2022, 87, 764–780.
Bran Lorenzana, M.; Chandra, S.S.; Liu, F. AliasNet: Alias Artefact Suppression Network for Accelerated Phase-Encode MRI. Magn. Reson. Imaging 2024, 105, 17–28.
Chen, D.; Schaeffter, T.; Kolbitsch, C.; Kofler, A. Ground-Truth-Free Deep Learning for Artefacts Reduction in 2D Radial Cardiac Cine MRI Using a Synthetically Generated Dataset. Phys. Med. Biol. 2021, 66, 095005.
Kofler, A.; Dewey, M.; Schaeffter, T.; Wald, C.; Kolbitsch, C. Spatio-Temporal Deep Learning-Based Undersampling Artefact Reduction for 2D Radial Cine MRI with Limited Training Data. IEEE Trans. Med. Imaging 2020, 39, 703–717.
Huang, W.; Li, H.B.; Pan, J.; Cruz, G.; Rueckert, D.; Hammernik, K. Neural Implicit k-Space for Binning-Free Non-Cartesian Cardiac MR Imaging. Lect. Notes Comput. Sci. 2023, 13939 LNCS, 548–560.
Han, X.; Chen, Y.; Liu, Q.; Liu, Y.; Chen, K.; Lin, Y.; Zhang, W. Reconstruction of Cardiac Cine MRI Using Motion-Guided Deformable Alignment and Multi-Resolution Fusion. Int. J. Imaging Syst. Technol. 2024, 34, e23131.
Lee, J.-H.; Kim, J.-Y.; Ryu, K.; Al-masni, M.A.; Kim, T.H.; Han, D.; Kim, H.G.; Kim, D.-H. JUST-Net: Jointly Unrolled Cross-Domain Optimization Based Spatio-Temporal Reconstruction Network for Accelerated 3D Myelin Water Imaging. Magn. Reson. Med. 2024, 91, 2483–2497.
Jaubert, O.; Montalt-Tordera, J.; Knight, D.; Coghlan, G.J.; Arridge, S.; Steeden, J.A.; Muthurangu, V. Real-Time Deep Artifact Suppression Using Recurrent U-Nets for Low-Latency Cardiac MRI. Magn. Reson. Med. 2021, 86, 1904–1916.
Jaubert, O.; Montalt-Tordera, J.; Brown, J.; Knight, D.; Arridge, S.; Steeden, J.; Muthurangu, V. FReSCO: Flow Reconstruction and Segmentation for Low-Latency Cardiac Output Monitoring Using Deep Artifact Suppression and Segmentation. Magn. Reson. Med. 2022, 88, 2179–2189.
Demirel, O.B.; Yaman, B.; Shenoy, C.; Moeller, S.; Weingärtner, S.; Akçakaya, M. Signal Intensity Informed Multi-Coil Encoding Operator for Physics-Guided Deep Learning Reconstruction of Highly Accelerated Myocardial Perfusion CMR. Magn. Reson. Med. 2023, 89, 308–321.
Dhengre, N.; Sinha, S. An Edge Guided Cascaded U-Net Approach for Accelerated Magnetic Resonance Imaging Reconstruction. Int. J. Imaging Syst. Technol. 2021, 31, 2014–2022.
El-Rewaidy, H.; Fahmy, A.S.; Pashakhanloo, F.; Cai, X.; Kucukseymen, S.; Csecs, I.; Neisius, U.; Haji-Valizadeh, H.; Menze, B.; Nezafat, R. Multi-Domain Convolutional Neural Network (MD-CNN) for Radial Reconstruction of Dynamic Cardiac MRI. Magn. Reson. Med. 2021, 85, 1195–1208.
Gao, Z.; Zhou, S.K. Rethinking Dual-Domain Undersampled MRI Reconstruction: Domain-Specific Design from the Perspective of the Receptive Field. In Proceedings of the 2024 IEEE International Symposium on Biomedical Imaging (ISBI), Athens, Greece, 27–30 May 2024; IEEE: Piscataway, NJ, USA, 2024.
Fatania, K.; Pirkl, C.M.; Menzel, M.I.; Hall, P.; Golbabaee, M. A Plug-and-Play Approach to Multiparametric Quantitative MRI: Image Reconstruction Using Pre-Trained Deep Denoisers. In Proceedings of the International Symposium on Biomedical Imaging, Kolkata, India, 28–31 March 2022; IEEE: Piscataway, NJ, USA, 2022.
Liu, J.; Sun, Y.; Eldeniz, C.; Gan, W.; An, H.; Kamilov, U.S. RARE: Image Reconstruction Using Deep Priors Learned without Groundtruth. IEEE J. Sel. Top. Signal Process. 2020, 14, 1088–1099.
Gao, Z.; Guo, Y.; Zhang, J.; Zeng, T.; Yang, G. Hierarchical Perception Adversarial Learning Framework for Compressed Sensing MRI. IEEE Trans. Med. Imaging 2023, 42, 1859–1874.
Gao, C.; Ghodrati, V.; Shih, S.-F.; Wu, H.H.; Liu, Y.; Nickel, M.D.; Vahle, T.; Dale, B.; Sai, V.; Felker, E.; et al. Undersampling Artifact Reduction for Free-Breathing 3D Stack-of-Radial MRI Based on a Deep Adversarial Learning Network. Magn. Reson. Imaging 2023, 95, 70–79.
Guo, P.; Mei, Y.; Zhou, J.; Jiang, S.; Patel, V.M. ReconFormer: Accelerated MRI Reconstruction Using Recurrent Transformer. IEEE Trans. Med. Imaging 2024, 43, 582–593.
Jacob, M.; Mani, M.P.; Ye, J.C. Structured Low-Rank Algorithms: Theory, Magnetic Resonance Applications, and Links to Machine Learning. IEEE Signal Process. Mag. 2020, 37, 54–68.
Jaubert, O.; Steeden, J.; Montalt-Tordera, J.; Arridge, S.; Kowalik, G.T.; Muthurangu, V. Deep Artifact Suppression for Spiral Real-Time Phase Contrast Cardiac Magnetic Resonance Imaging in Congenital Heart Disease. Magn. Reson. Imaging 2021, 83, 125–132.
Kijowski, R.; Fritz, J. Emerging Technology in Musculoskeletal MRI and CT. Radiology 2023, 306, 6–19.
Mir, N.; Fransen, S.J.; Wolterink, J.M.; Fütterer, J.J.; Simonis, F.F.J. Recent Developments in Speeding up Prostate MRI. J. Magn. Reson. Imaging 2024, 60, 813–826.
Noor, R.; Wahid, A.; Bazai, S.U.; Khan, A.; Fang, M.; Syam, M.S.; Bhatti, U.A.; Ghadi, Y.Y. DLGAN: Undersampled MRI Reconstruction Using Deep Learning Based Generative Adversarial Network. Biomed. Signal Process. Control 2024, 93, 106218.
Tong, C.; Pang, Y.; Wan, Y. HIWDNet: A Hybrid Image-Wavelet Domain Network for Fast Magnetic Resonance Image Reconstruction. Comput. Biol. Med. 2022, 151, 105947.
Upadhyay, U.; Chen, Y.; Hepp, T.; Gatidis, S.; Akata, Z. Uncertainty-Guided Progressive GANs for Medical Image Translation. Lect. Notes Comput. Sci. 2021, 12903 LNCS, 614–624.
Abinesh, R.; Yogeshkumar, V.G.; Sarabesh, T.J.; Nandhini, S. Deformable Convolution Network for Reconstruction of Misaligned MRI Images and Classification of Brain Tumor in the Aligned Images. In Proceedings of the 2024 8th International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC), Kirtipur, Nepal, 3–5 October 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1319–1323.
Jung, S.; Choi, Y.; Al-Masni, M.A.; Jung, M.; Kim, D.-H. Deformation-Aware Segmentation Network Robust to Motion Artifacts for Brain Tissue Segmentation Using Disentanglement Learning. Lect. Notes Comput. Sci. 2024, 15009 LNCS, 213–222.
Kromrey, M.-L.; Tamada, D.; Johno, H.; Funayama, S.; Nagata, N.; Ichikawa, S.; Kühn, J.-P.; Onishi, H.; Motosugi, U. Reduction of Respiratory Motion Artifacts in Gadoxetate-Enhanced MR with a Deep Learning-Based Filter Using Convolutional Neural Network. Eur. Radiol. 2020, 30, 5923–5932.
Mio, M.; Tabata, N.; Toyofuku, T.; Nakamura, H. Reduction of Motion Artifacts in Liver MRI Using Deep Learning with High-Pass Filtering. Nihon Hoshasen Gijutsu Gakkai Zasshi 2024, 80, 510–518.
Lim, A.; Lo, J.; Wagner, M.W.; Ertl-Wagner, B.; Sussman, D. Motion Artifact Correction in Fetal MRI Based on a Generative Adversarial Network Method. Biomed. Signal Process. Control 2023, 81, 104484.
Liu, Y.; Diao, J.; Zhou, Z.; Qi, H.; Hu, P. Cardiac Cine MRI Motion Correction Using Diffusion Models. In Proceedings of the 2024 IEEE International Symposium on Biomedical Imaging (ISBI), Athens, Greece, 27–30 May 2024; IEEE: Piscataway, NJ, USA, 2024.
Oh, G.; Jung, S.; Lee, J.E.; Ye, J.C. Annealed Score-Based Diffusion Model for MR Motion Artifact Reduction. IEEE Trans. Comput. Imaging 2024, 10, 43–53.
Lyu, Q.; Shan, H.; Xie, Y.; Kwan, A.C.; Otaki, Y.; Kuronuma, K.; Li, D.; Wang, G. Cine Cardiac MRI Motion Artifact Reduction Using a Recurrent Neural Network. IEEE Trans. Med. Imaging 2021, 40, 2170–2181.
Nguyen, X.V.; Oztek, M.A.; Nelakurti, D.D.; Brunnquell, C.L.; Mossa-Basha, M.; Haynor, D.R.; Prevedello, L.M. Applying Artificial Intelligence to Mitigate Effects of Patient Motion or Other Complicating Factors on Image Quality. Top. Magn. Reson. Imaging 2020, 29, 175–180.
Oh, G.; Lee, J.E.; Ye, J.C. Unpaired MR Motion Artifact Deep Learning Using Outlier-Rejecting Bootstrap Aggregation. IEEE Trans. Med. Imaging 2021, 40, 3125–3139.
Oksuz, I.; Clough, J.R.; Ruijsink, B.; Anton, E.P.; Bustin, A.; Cruz, G.; Prieto, C.; King, A.P.; Schnabel, J.A. Deep Learning-Based Detection and Correction of Cardiac MR Motion Artefacts during Reconstruction for High-Quality Segmentation. IEEE Trans. Med. Imaging 2020, 39, 4001–4010.
Rawat, U.; Batra, V.; Sharma, R.K.; Kulandhaivel, M.; Mukuntharaj, C.; Dongre, D. High Quality Segmentation Using Deep Learning Centered Detection and Correction of Cardiac MR Motion Artefacts throughout Reconstruction. In Proceedings of the 2023 6th International Conference on Contemporary Computing and Informatics (IC3I), Gautam Buddha Nagar, India, 14–16 September 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 1941–1945.
Tripathi, V.R.; Tibdewal, M.N.; Mishra, R. A Survey on Motion Artifact Correction in Magnetic Resonance Imaging for Improved Diagnostics. SN Comput. Sci. 2024, 5, 281.
Safari, M.; Yang, X.; Fatemi, A.; Archambault, L. MRI Motion Artifact Reduction Using a Conditional Diffusion Probabilistic Model (MAR-CDPM). Med. Phys. 2024, 51, 2598–2610.
Safari, M.; Yang, X.; Chang, C.-W.; Qiu, R.L.J.; Fatemi, A.; Archambault, L. Unsupervised MRI Motion Artifact Disentanglement: Introducing MAUDGAN. Phys. Med. Biol. 2024, 69, 115057.
Samuel, S.; Ochawar, R.S.; Rukmini, M.S.S. Hybrid Deep Autoencoder Network Based Adaptive Cross Guided Bilateral Filter for Motion Artifacts Correction and Denoising from MRI. Imaging Sci. J. 2023, 72, 76–91.
Kwon, K.; Kim, D.; Kim, B.; Park, H. Unsupervised Learning of a Deep Neural Network for Metal Artifact Correction Using Dual-Polarity Readout Gradients. Magn. Reson. Med. 2020, 83, 124–138.
Arabi, H.; Zaidi, H. Truncation Compensation and Metallic Dental Implant Artefact Reduction in PET/MRI Attenuation Correction Using Deep Learning-Based Object Completion. Phys. Med. Biol. 2020, 65, 195002.
Feuerriegel, G.C.; Sutter, R. Managing Hardware-Related Metal Artifacts in MRI: Current and Evolving Techniques. Skelet. Radiol. 2024, 53, 1737–1750.
Ranzini, M.B.M.; Groothuis, I.; Klaser, K.; Cardoso, M.J.; Henckel, J.; Ourselin, S.; Hart, A.; Modat, M. Combining Multimodal Information for Metal Artefact Reduction: An Unsupervised Deep Learning Framework. In Proceedings of the 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI), Iowa City, IA, USA, 3–7 April 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 600–604.
Bedi, P.; Goyal, S.B.; Yadav, D.K.; Kumar, S.; Sharma, M. Hybrid Learning Model for Metal Artifact Reduction. J. Phys. Conf. Ser. 2021, 1714, 012021.
Begnoche, J.P.; Schilling, K.G.; Boyd, B.D.; Cai, L.Y.; Taylor, W.D.; Landman, B.A. EPI Susceptibility Correction Introduces Significant Differences Far from Local Areas of High Distortion. Magn. Reson. Imaging 2022, 92, 1–9.
Bao, Q.; Xie, W.; Otikovs, M.; Xia, L.; Xie, H.; Liu, X.; Liu, K.; Zhang, Z.; Chen, F.; Zhou, X.; et al. Unsupervised Cycle-Consistent Network Using Restricted Subspace Field Map for Removing Susceptibility Artifacts in EPI. Magn. Reson. Med. 2023, 90, 458–472.
Duong, S.T.M.; Phung, S.L.; Bouzerdoum, A.; Schira, M.M. An Unsupervised Deep Learning Technique for Susceptibility Artifact Correction in Reversed Phase-Encoding EPI Images. Magn. Reson. Imaging 2020, 71, 1–10.
Duong, S.T.M.; Phung, S.L.; Bouzerdoum, A.; Ang, S.P.; Schira, M.M. Correcting Susceptibility Artifacts of MRI Sensors in Brain Scanning: A 3D Anatomy-Guided Deep Learning Approach. Sensors 2021, 21, 2314.
Becker, M.; Arvidsson, F.; Bertilson, J.; Aslanikashvili, E.; Korvink, J.G.; Jouda, M.; Lehmkuhl, S. Deep Learning Corrects Artifacts in RASER MRI Profiles. Magn. Reson. Imaging 2025, 115, 110247.
Fu, Z.; Johnson, K.; Altbach, M.I.; Bilgin, A. Cancellation of Streak Artifacts in Radial Abdominal Imaging Using Interference Null Space Projection. Magn. Reson. Med. 2022, 88, 1355–1369.
Führes, T.; Saake, M.; Lorenz, J.; Seuss, H.; Bickelhaupt, S.; Uder, M.; Laun, F.B. Feature-Guided Deep Learning Reduces Signal Loss and Increases Lesion CNR in Diffusion-Weighted Imaging of the Liver. Z. Med. Phys. 2024, 34, 258–269.
Kim, U.-H.; Kim, H.J.; Seo, J.; Chai, J.W.; Oh, J.; Choi, Y.-H.; Kim, D.H. Cerebrospinal Fluid Flow Artifact Reduction with Deep Learning to Optimize the Evaluation of Spinal Canal Stenosis on Spine MRI. Skelet. Radiol. 2024, 53, 957–965.
Park, D.; Hennessee, J.; Smith, E.T.; Chan, M.; Katen, C.; Wig, G.; Rodrigue, K.; Kennedy, K. The Dallas Lifespan Brain Study. Sci. Data 2024, 12, 846.
Plewes, D.B.; Kucharczyk, W. Physics of MRI: A Primer. J. Magn. Reson. Imaging 2012, 35, 1038–1054.
Ferreira, P.F.; Gatehouse, P.D.; Mohiaddin, R.H.; Firmin, D.N. Cardiovascular Magnetic Resonance Artefacts. J. Cardiovasc. Magn. Reson. 2013, 15, 41.
Ahmadian, S.; Jabbari, I.; Bagherimofidi, S.M.; Saligheh Rad, H. Characterization of Hardware-Related Spatial Distortions for IR-PETRA Pulse Sequence Using a Brain Specific Phantom. Magn. Reson. Mater. Phys. 2021, 34, 213–228.
Khodarahmi, I.; Kirsch, J.; Chang, G.; Fritz, J. Metal Artifacts of Hip Arthroplasty Implants at 1.5-T and 3.0-T: A Closer Look into the B1 Effects. Skelet. Radiol. 2021, 50, 1007–1015.
Gallo-Bernal, S.; Bedoya, M.A.; Gee, M.S.; Jaimes, C. Pediatric Magnetic Resonance Imaging: Faster Is Better. Pediatr. Radiol. 2023, 53, 1270–1284.
Kim, S.; Park, H.; Park, S.-H. A Review of Deep Learning-Based Reconstruction Methods for Accelerated MRI Using Spatiotemporal and Multi-Contrast Redundancies. Biomed. Eng. Lett. 2024, 14, 1221–1242.
Saotome, K.; Matsumoto, K.; Kato, Y.; Ozaki, Y.; Nagai, M.; Hasegawa, T.; Tsuchiya, H.; Yamao, T. Improving Image Quality Using the Pause Function Combination to PROPELLER Sequence in Brain MRI: A Phantom Study. Radiol. Phys. Technol. 2024, 17, 518–526.
Knoll, F.; Zbontar, J.; Sriram, A.; Muckley, M.J.; Bruno, M.; Defazio, A.; Parente, M.; Geras, K.J.; Katsnelson, J.; Chandarana, H.; et al. fastMRI: A Publicly Available Raw k-Space and DICOM Dataset of Knee Images for Accelerated MR Image Reconstruction Using Machine Learning. Radiol. Artif. Intell. 2020, 2, e190007.
Stadler, A.; Schima, W.; Ba-Ssalamah, A.; Kettenbach, J.; Eisenhuber, E. Artifacts in Body MR Imaging: Their Appearance and How to Eliminate Them. Eur. Radiol. 2007, 17, 1242–1255.
Pietsch, M.; Christiaens, D.; Hajnal, J.V.; Tournier, J.-D. dStripe: Slice Artefact Correction in Diffusion MRI via Constrained Neural Network. Med. Image Anal. 2021, 74, 102255.
Zhao, Y.; Ossowski, J.; Wang, X.; Li, S.; Devinsky, O.; Martin, S.P.; Pardoe, H.R. Localized Motion Artifact Reduction on Brain MRI Using Deep Learning with Effective Data Augmentation Techniques. In Proceedings of the 2021 International Joint Conference on Neural Networks (IJCNN), Shenzhen, China, 18–22 July 2021; IEEE: Piscataway, NJ, USA, 2021.
Vishnumurthy, T.D.; Meshram, V.A.; Mohana, H.S.; Kammar, P. Frequency Domain Technique to Remove Herringbone Artifact from Magnetic Resonance Images of Brain and Morphological Segmentation for Detection of Tumor. In Proceedings of the Emerging Research in Computing, Information, Communication and Applications; Shetty, N.R., Patnaik, L.M., Prasad, N.H., Nalini, N., Eds.; Springer: Singapore, 2018; pp. 57–66.
Meshram, V.A.; Vishnumurthy, T.D.; Mohana, H.S. A Review on Brain Magnetic Resonance Imaging Artifacts: Description, Causes and their Elimination. Int. J. Adv. Inf. Sci. Technol. (IJAIST) 2015, 4, 88–93. Available online: https://www.ijaist.com/wp-content/uploads/2018/08/AReviewonBrainMagneticResonanceImagingArtifactsDescriptionCausesandtheirElimination.pdf (accessed on 20 February 2026).
Oktay, O.; Schlemper, J.; Folgoc, L.L.; Lee, M.; Heinrich, M.; Misawa, K.; Mori, K.; McDonagh, S.; Hammerla, N.Y.; Kainz, B.; et al. Attention U-Net: Learning Where to Look for the Pancreas. arXiv 2018, arXiv:1804.03999.
Miravete Zararaza, C.; Gaspar Lorenz, F.J.; Rodrigo Cardiel, C. La Transformación Wavelet y Sus Aplicaciones en el Procesamiento de Imágenes; Universidad de Zaragoza: Zaragoza, Spain, 2024.
Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention (MICCAI); Springer: Cham, Switzerland, 2015.
Aghabiglou, A.; Eksioglu, E.M. Projection-Based Cascaded U-Net Model for MR Image Reconstruction. Comput. Methods Programs Biomed. 2021, 207, 106151.
Dhengre, N.; Sinha, S. Multiscale U-Net-Based Accelerated Magnetic Resonance Imaging Reconstruction. SIViP 2022, 16, 881–888.
Zabihi, S.; Rahimian, E.; Asif, A.; Mohammadi, A. SepUnet: Depthwise Separable Convolution Integrated U-Net For MRI Reconstruction. In Proceedings of the 2021 IEEE International Conference on Image Processing (ICIP), Anchorage, AK, USA, 19–22 September 2021; Available online: https://ieeexplore.ieee.org/abstract/document/9506285 (accessed on 15 December 2025).
Oksuz, I. Brain MRI Artefact Detection and Correction Using Convolutional Neural Networks. Comput. Methods Programs Biomed. 2021, 199, 105909.
Al-masni, M.A.; Lee, S.; Yi, J.; Kim, S.; Gho, S.-M.; Choi, Y.H.; Kim, D.-H. Stacked U-Nets with Self-Assisted Priors towards Robust Correction of Rigid Motion Artifact in Brain MRI. NeuroImage 2022, 259, 119411.
Sang-Hyun, K.; Dong-Hee, H. Deep Learning with U-Net for Motion Artifact Reduction in Brain MRI. Vasc. Endovasc. Rev. 2025, 8, 195–200.
Sharma, R.; Tsiamyrtzis, P.; Webb, A.G.; Leiss, E.L.; Tsekos, N.V. Learning to Deep Learning: Statistics and a Paradigm Test in Selecting a UNet Architecture to Enhance MRI. Magn. Reson. Mater. Phys. 2024, 37, 507–528.
Fan, Z.; Li, J.; Zhang, L.; Zhu, G.; Li, P.; Lu, X.; Shen, P.; Shah, S.A.A.; Bennamoun, M.; Hua, T.; et al. U-Net Based Analysis of MRI for Alzheimer’s Disease Diagnosis. Neural Comput. Appl. 2021, 33, 13587–13599.
Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative Adversarial Networks. Commun. ACM 2020, 63, 139–144.
Sinha, A.; Dolz, J. Multi-Scale Self-Guided Attention for Medical Image Segmentation. IEEE J. Biomed. Health Inform. 2020, 25, 121–130.
Souibgui, M.A.; Biswas, S.; Jemni, S.K.; Kessentini, Y.; Fornés, A.; Lladós, J.; Pal, U. DocEnTr: An End-to-End Document Image Enhancement Transformer. In Proceedings of the 2022 26th International Conference on Pattern Recognition (ICPR), Montreal, QC, Canada, 21–25 August 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1699–1705.
Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv 2021, arXiv:2010.11929.

Figure 1. WaveletBasedAttention-Net architecture (DWT → 4-band U-Net → IDWT) with skip connections and a 1024-filter bottleneck.

Figure 2. Attention gate (AG) mechanism integrated into the skip connections of the proposed Attention U-Net architecture.

Figure 3. Training and evaluation pipeline: Radon-guided rotation, DWT decomposition, wavelet-based network processing, IDWT reconstruction, and inverse rotation. Loss is computed per band, and metrics are evaluated on the reconstructed image.

Figure 4. Ringing artifact: image appearance and k-space signature. (a) Artifact-free brain MRI. (b) k-space signature showing the magnitude spectrum of the artifact component, computed as $\log(1 + |F|)$. (c) Image with the ringing artifact; ellipses show edge-aligned oscillations adjacent to a high-contrast boundary.
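
The $\log(1 + |F|)$ display used for the k-space panels can be sketched with NumPy as follows. This is a minimal illustration, not the authors' code: the function name `log_magnitude_spectrum` and the toy 8 × 8 stripe image are assumptions, and in the figure the transform is applied to the artifact component rather than to a full image.

```python
import numpy as np

def log_magnitude_spectrum(image: np.ndarray) -> np.ndarray:
    """Return log(1 + |F|) with the zero-frequency term shifted to the center."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    return np.log1p(np.abs(spectrum))

# Hypothetical 8x8 test image: constant offset plus a cosine stripe pattern
# that oscillates along the column axis (2 cycles across the width).
x = np.arange(8)
image = 1.0 + np.cos(2 * np.pi * 2 * x / 8)[None, :] * np.ones((8, 1))
display = log_magnitude_spectrum(image)

# The periodic pattern shows up as two symmetric peaks around the center.
peak_row, peak_col = np.unravel_index(display.argmax(), display.shape)
```

The `log1p` compression keeps the dominant zero-frequency term from hiding the much weaker off-center peaks that periodic artifacts produce.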

Figure 5. Herringbone artifact: image appearance and k-space signature. (a) Artifact-free brain MRI. (b) k-space signature showing the log-magnitude spectrum of the artifact component, computed as $\log(1 + |F|)$. (c) Image with herringbone artifact; the ellipse shows the banding pattern in gray matter.

Figure 6. Zipper artifact: image appearance. (a) Artifact-free brain MRI. (b) Image with the zipper artifact; rectangles show the vertical noise patterns.

Figure 7. Percentage distribution of artifact types (ringing, herringbone, and zipper) across the training and validation partitions.

Figure 8. Distribution of wavelet and GLCM descriptors in the training and validation partitions.

Figure 9. PCA projection of wavelet + GLCM features: training vs. validation distribution comparison.

Figure 10. Evolution of the compound loss ($w_{AA}$, $w_{AD}$, $w_{DA}$, $w_{DD}$) across epochs for the training and validation partitions. (a) Training loss; (b) validation loss.

Figure 11. Evolution of the SSIM metric across epochs for the training and validation partitions. (a) Training SSIM; (b) validation SSIM.

Figure 12. Evolution of the validation MAE across epochs, computed on the high-frequency DA wavelet sub-band.

Figure 13. SSIM results by artifact type before and after correction. (a) Ringing artifact; (b) herringbone artifact; (c) zipper artifact.

Figure 14. PSNR before and after correction for each artifact type. (a) Ringing artifact; (b) herringbone artifact; (c) zipper artifact.

Figure 15. MSE by wavelet band before and after correction for each artifact type. (a) Ringing artifact; (b) herringbone artifact; (c) zipper artifact.

Figure 16. Visual comparison of ringing artifact correction. (a) Artifacted input image; (b) ground truth (GT); and (c) model prediction. Highlighted regions show areas affected by the ringing artifact, where the proposed model achieves improved restoration quality, reflected in higher PSNR and SSIM values.

Figure 17. Visual comparison of herringbone artifact correction. (a) Artifacted input image; (b) ground truth (GT); and (c) model prediction. Highlighted regions indicate areas affected by the herringbone artifact, where the proposed model achieves noticeable improvements in image quality, reflected in higher PSNR and SSIM values.

Figure 18. Visual comparison of zipper artifact correction. (a) Artifacted input image; (b) ground truth (GT); and (c) model prediction. Highlighted regions indicate areas affected by the zipper artifact, where the proposed model achieves noticeable improvements in image quality, reflected in higher PSNR and SSIM values.

Figure 19. Ringing artifact suppression (z-score domain). (a) Input image with PSNR/SSIM, (b) model prediction, (c) absolute error map |Input − GT|, and (d) absolute error map |Pred − GT|. The prediction reduces edge-related ringing residuals (PSNR 37.98 → 44.53 dB, SSIM 0.9809 → 0.9956).

Figure 20. Herringbone artifact suppression (z-score domain). (a) Input image with PSNR/SSIM, (b) model prediction, (c) |Input − GT|, and (d) |Pred − GT|. The stripe-like residual pattern is markedly attenuated after correction (PSNR 35.50 → 42.37 dB, SSIM 0.9426 → 0.9901).

Figure 21. Zipper artifact suppression (z-score domain). (a) Input image with PSNR/SSIM, (b) model prediction, (c) |Input − GT|, and (d) |Pred − GT|. Strong banding artifacts and large residuals in the input are substantially reduced (PSNR 17.26 → 29.63 dB, SSIM 0.4471 → 0.8681).

Figure 22. (a) Input ROI; (b) Ground truth (GT) ROI; (c) Predicted ROI. In the selected ROI, the prediction is closer to the ground truth and reduces ringing-related distortions. Quantitatively, PSNR increases from 31.50 dB to 38.21 dB (+6.71 dB), and SSIM increases from 0.9236 to 0.9766 (+0.0530).

Figure 23. (a) Input ROI; (b) Ground truth (GT) ROI; (c) Predicted ROI. The ROI shows strong suppression of the stripe-like (herringbone) pattern in the prediction relative to the input, with improved similarity to the GT ROI. PSNR increases from 30.01 dB to 39.02 dB (+9.01 dB), and SSIM increases from 0.8323 to 0.9806 (+0.1483).

Figure 24. (a) Input ROI; (b) Ground truth (GT) ROI; (c) Predicted ROI. The prediction substantially reduces pronounced banding within the ROI and restores local contrast toward the GT ROI. PSNR increases from 19.20 dB to 31.28 dB (+12.08 dB), and SSIM increases from 0.4542 to 0.8928 (+0.4386).

Table 1. Summary of deep learning-based approaches for artifact correction in medical imaging, including artifact category, type, principal causes, solution approaches, and key limitations.

Category	Field	Content
Sampling, aliasing, and truncation	Artifacts	Aliasing, Gibbs ringing, wrap-around, reduced spatial resolution
	Main causes	Signal discretization in space and frequency, undersampling, k-space truncation, signal outside the field of view
	Deep learning solution approaches	CNN/ResNet/autoencoder reconstruction [1,2,3,4,5,9,10,11,12,13,14,15]; zero-shot and in situ adaptation [6,7]; adversarial and dual-domain learning [8]; structured k-space correction [16,17]; model-based unrolling [22,23,24]; transformer reconstruction [31,32,33,34,35]; hybrid pipelines [8,22,23,24,31,32,33,34,35]
	Identified limitations	Architectural complexity in dual-domain models [8]; limited 3D and low-SNR validation [6,7]; high computational cost [10]; possible suppression of subtle findings [11]; limited pathological validation [12,13,14]; metric shifts under threshold-based evaluation [15]; dependence on training data [16]; latency in iterative methods [36,37,38,39]
Motion artifacts	Artifacts	Blurring, ghosting, phase stiffness, respiratory artifacts, motion shifts in DWI
	Main causes	Voluntary and involuntary patient motion, breathing, fetal motion, cardiac motion
	Deep learning solution approaches	CNN/U-Net restoration [41,42,43,44,51,52]; GAN-based approaches [45,48,55]; diffusion and score-based models [46,47,54]; unpaired and autoencoder pipelines [50,56]; hybrid motion estimation + restoration [41,42,43,44,45,46,47,50,51,52,53,54,55,56]
	Identified limitations	GAN instability and possible anatomical artifacts [45]; high computational cost in diffusion models [46,47]; limited multicenter and real-time 3D/4D validation [53]; sensitivity to hyperparameters and limited pathological evaluation [55,56]
Off-resonance and B0 susceptibility	Artifacts	Geometric distortions, signal voids from metals, signal mismatches, frequency shifts
	Main causes	B0 field inhomogeneity, susceptibility differences between tissues, metallic implants, and incomplete shimming
	Deep learning solution approaches	ΔB0-based geometric correction [63,64,65]; CNN/U-Net restoration [3,58,61,62]; unsupervised and multimodal approaches (dual polarity, CT–MRI transfer, GANs) [57,60,62]; hybrid field-map + restoration pipelines [57,58,59,60,61,62,63,64,65]
	Identified limitations	Additional acquisitions may increase scan time [57]; dependence on reversed polarity data [63]; possible alteration of derived metrics (e.g., FA in TBSS) [62]; training sensitivity to hyperparameters [65]; robustness to motion and protocol variability remains limited [63]
Ghosting, phase errors, and system effects	Artifacts	Herringbone (spike/corduroy) patterns, zipper artifacts, ghosting from flow and pulsatility, gradient nonlinearities, eddy currents
	Main causes	Phase errors, RF interference and unintended RF emissions, system nonlinearities and gradient faults, eddy currents, flow, and pulsatility
	Deep learning solution approaches	CNN/U-Net restoration [66,68]; coil-combination and subspace methods [63,67]; generative and hybrid approaches for metal/flow artifacts [61,69]; physics-informed pipelines with CNN post-processing [63,66,67,68,69]
	Identified limitations	Dependence on physical model assumptions [66]; computational cost at high resolution [66]; limited generalization with task-specific annotations [61]; robustness across scanners and protocols still under evaluation [67]

Table 2. Parameters for controlling the severity of the ringing artifact.

Parameter	Lower Limit	Upper Limit	Type
Scalar intensity ($\alpha$)	0	1	Decimal
Propagation axis	0	2	Integer
End angle ($\theta_{2}$)	0	360	Integer
Initial angle ($\theta_{1}$)	$\theta_{2} - 100$	$\theta_{2} - 10$	Integer
Outer radius ($r_{e}$)	118	123	Integer
Inner radius ($r_{i}$)	$r_{e} - 20$	$r_{e} - 10$	Integer
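
Two of the ranges in Table 2 depend on another parameter, so the draw order matters: $\theta_{2}$ and $r_{e}$ must be sampled before $\theta_{1}$ and $r_{i}$. The sketch below shows one way to draw a configuration. It assumes uniform sampling and inclusive bounds, neither of which is stated in the table, and the function name `sample_ringing_params` and the dictionary keys are hypothetical.

```python
import random

def sample_ringing_params(rng: random.Random) -> dict:
    """Draw one ringing-severity configuration within the Table 2 ranges.

    Assumes uniform sampling with inclusive bounds (not stated in the table).
    """
    theta_2 = rng.randint(0, 360)                       # end angle
    theta_1 = rng.randint(theta_2 - 100, theta_2 - 10)  # initial angle, tied to theta_2
    r_e = rng.randint(118, 123)                         # outer radius
    r_i = rng.randint(r_e - 20, r_e - 10)               # inner radius, tied to r_e
    return {
        "alpha": rng.uniform(0.0, 1.0),  # scalar intensity
        "axis": rng.randint(0, 2),       # propagation axis
        "theta_1": theta_1,
        "theta_2": theta_2,
        "r_i": r_i,
        "r_e": r_e,
    }

params = sample_ringing_params(random.Random(0))
```

With these bounds the angular span $\theta_{2} - \theta_{1}$ always lies between 10 and 100, and the ring thickness $r_{e} - r_{i}$ between 10 and 20. Note that $\theta_{1}$ can be negative when $\theta_{2} < 100$; how such values are wrapped is not specified here.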

Table 3. Parameters controlling the severity of the herringbone artifact.

Parameter	Lower Limit	Upper Limit	Type
Smoothing ($S$)	3	20	Integer
Propagation axis	0	2	Integer
Selection point	0	3	Integer
Kernel size ($k_{S}$)	3	13	Integer
Distance ($d$)	−30	30	Integer

Table 4. Parameters for controlling the severity of zipper artifacts.

Parameter	Lower Limit	Upper Limit	Type
Intensity	15	50	Integer
Propagation axis	0	2	Integer
Number of artifacts	1	16	Integer
Variability	20	40	Integer
Amplitude	10	50	Integer

Table 5. Hyperparameter values.

Parameter	Value
Input/Output Bands	4 (AA, AD, DA, DD)
Encoder/Decoder Levels	4/4
Loss	WaveletLoss (band-weighted L1)
Metrics	SSIM and PSNR (image), MAE per band
Optimizer	Adam
Batch Size/Epochs	8/50
Dataloader Workers	4
Wavelet	db2
Minimum Tile Size	≥32 × 32
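
Table 5 names a band-weighted L1 loss over the four wavelet bands and a per-band MAE metric. The sketch below shows one plausible form of such a loss in NumPy: a weighted sum of per-band mean absolute errors. It assumes the sub-bands have already been computed elsewhere (for example with a db2 DWT), and the function name `band_weighted_l1` and the weight values are hypothetical placeholders, since the table does not list the weights.

```python
import numpy as np

BANDS = ("AA", "AD", "DA", "DD")


def band_weighted_l1(pred, target, weights):
    """Weighted sum of per-band mean absolute errors.

    pred, target: dicts mapping band name -> 2D array of wavelet coefficients.
    weights: dict mapping band name -> non-negative float (placeholder values).
    Returns (total_loss, per_band_mae).
    """
    per_band = {}
    for band in BANDS:
        diff = np.asarray(pred[band], dtype=float) - np.asarray(target[band], dtype=float)
        per_band[band] = float(np.mean(np.abs(diff)))
    total = sum(weights[b] * per_band[b] for b in BANDS)
    return total, per_band


# Toy example: 4x4 bands where the prediction is off by 0.5 in the DD band only.
target = {b: np.zeros((4, 4)) for b in BANDS}
pred = {b: np.zeros((4, 4)) for b in BANDS}
pred["DD"] = np.full((4, 4), 0.5)
weights = {"AA": 1.0, "AD": 2.0, "DA": 2.0, "DD": 2.0}  # placeholders, not from the paper

loss, mae = band_weighted_l1(pred, target, weights)
print(loss, mae)
```

Returning the per-band errors alongside the total makes it straightforward to log the MAE per band that Table 5 lists as a metric.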

Table 6. Results of the multivariate comparison of wavelet + GLCM features between the training and validation sets.

Test	Statistic	p-Value
Energy Distance	0.010375	<0.005
MMD-RBF	0.000320	<0.005
Sliced-Wasserstein	0.073419	<0.005
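
Two of the statistics in Table 6 can be sketched compactly. The code below is an illustration under stated assumptions rather than the authors' procedure: it uses the V-statistic form of the squared energy distance, the biased estimator of squared MMD with a Gaussian kernel, a fixed bandwidth `sigma`, and a simple permutation test for the p-value. All function names are hypothetical, and the data are synthetic stand-ins for the feature vectors.

```python
import numpy as np


def pairwise_dist(a, b):
    """Euclidean distance matrix between rows of a (n, d) and b (m, d)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))


def energy_distance(x, y):
    """Squared energy distance: 2 E|X-Y| - E|X-X'| - E|Y-Y'|."""
    return (2.0 * pairwise_dist(x, y).mean()
            - pairwise_dist(x, x).mean()
            - pairwise_dist(y, y).mean())


def mmd_rbf(x, y, sigma):
    """Biased estimate of squared MMD with a Gaussian kernel of bandwidth sigma."""
    def k(a, b):
        return np.exp(-pairwise_dist(a, b) ** 2 / (2.0 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()


def permutation_p_value(stat_fn, x, y, n_perm=200, seed=0):
    """Share of label permutations whose statistic is >= the observed one."""
    rng = np.random.default_rng(seed)
    observed = stat_fn(x, y)
    pooled = np.vstack([x, y])
    n = len(x)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(pooled))
        if stat_fn(pooled[perm[:n]], pooled[perm[n:]]) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)


# Synthetic stand-ins for feature vectors (not data from the paper).
rng = np.random.default_rng(42)
train = rng.normal(0.0, 1.0, size=(60, 5))
val = rng.normal(0.0, 1.0, size=(40, 5))
shifted = val + 3.0  # deliberately mismatched partition

print("energy (matched):", energy_distance(train, val))
print("energy (shifted):", energy_distance(train, shifted))
print("MMD (shifted):", mmd_rbf(train, shifted, sigma=1.0))
print("p (shifted):", permutation_p_value(energy_distance, train, shifted))
```

Both statistics are zero when the two samples are identical and grow as the distributions separate, which is why a small value indicates similar partitions.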

Table 7. Results of the wavelet family selection experiment (db1–db6) for brain MRI artifact correction using a U-Net architecture.

Wavelet Family	SSIM	MAE
db2	0.97322 ± 0.00217	0.03570 ± 0.00051
db4	0.97306 ± 0.00203	0.03535 ± 0.00046
db3	0.97180 ± 0.00139	0.03552 ± 0.00085
db1	0.97149 ± 0.00252	0.03721 ± 0.00119
db6	0.97068 ± 0.00219	0.03541 ± 0.00101
db5	0.97049 ± 0.00244	0.03641 ± 0.00134

Table 8. Average MSE per wavelet component and overall SSIM and PSNR across the entire image.

Metric	Artifact Value	Corrected Value
$\mathrm{MSE}_{w_{AA}}$	0.048 ± 0.869	0.007 ± 0.009
$\mathrm{MSE}_{w_{AD}}$	0.008 ± 0.011	0.003 ± 0.004
$\mathrm{MSE}_{w_{DA}}$	0.009 ± 0.011	0.004 ± 0.004
$\mathrm{MSE}_{w_{DD}}$	0.007 ± 0.010	0.002 ± 0.003
$\mathrm{MSE}_{w_{\mathrm{mean}}}$	0.018 ± 0.029	0.004 ± 0.005
PSNR [dB]	33.42 ± 6.939	43.337 ± 5.364
SSIM	0.938 ± 0.067	0.985 ± 0.022

Table 9. Quantitative evaluation of artifact correction and edge preservation (input vs. output).

Metric	$\|\mathrm{Input} - \mathrm{GT}\|$	$\|\mathrm{Output} - \mathrm{GT}\|$
Edge-PSNR	38.41 ± 12.39	41.04 ± 8.54
Edge-SSIM	0.952 ± 0.121	0.980 ± 0.060
Grad-L1	0.0744 ± 0.0530	0.0540 ± 0.0392
Grad-L2	0.0622 ± 0.0466	0.0436 ± 0.0342
Edge Dice/F1	0.920 ± 0.0500	0.938 ± 0.0416
Radon peak ratio	6.72 ± 10.33	2.89 ± 3.46

Table 10. Stress test on mixed periodic artifacts.

Mixture Artifact	n	PSNR in	PSNR out	SSIM in	SSIM out	ΔPSNR	ΔSSIM
Ringing + zipper	150	17.83 ± 3.00	30.61 ± 3.39	0.4621 ± 0.1336	0.8911 ± 0.0767	+12.78	+0.4290
Herringbone + ringing	150	35.13 ± 2.86	38.54 ± 2.84	0.9600 ± 0.0208	0.9793 ± 0.0144	+3.41	+0.0192
Zipper + herringbone	150	17.64 ± 3.21	29.07 ± 2.88	0.3902 ± 0.1557	0.8579 ± 0.0668	+11.43	+0.4678
Ringing + zipper + herringbone	150	17.58 ± 3.35	28.95 ± 2.87	0.3950 ± 0.1582	0.8570 ± 0.0701	+11.37	+0.4620

Table 11. PSNR and SSIM results (input vs. output) stratified by artifact type and severity.

Artifact	Severity	Severity Interval	n	PSNR in	PSNR out	SSIM in	SSIM out
herringbone	mild	smooth ≤ 6	573	47.13 ± 2.88	47.34 ± 1.65	0.9948 ± 0.0031	0.9966 ± 0.0017
herringbone	moderate	6 < smooth < 16	2414	40.88 ± 3.60	44.35 ± 2.68	0.9790 ± 0.0132	0.9930 ± 0.0069
herringbone	severe	smooth ≥ 16	903	37.41 ± 3.44	42.63 ± 2.60	0.9618 ± 0.0176	0.9899 ± 0.0096
ringing	mild	intensity ≤ 0.27195	1386	42.57 ± 5.59	45.50 ± 3.91	0.9886 ± 0.0121	0.9950 ± 0.0048
ringing	moderate	0.27195 < intensity < 0.76809	2038	42.39 ± 5.11	44.93 ± 3.58	0.9896 ± 0.0093	0.9950 ± 0.0037
ringing	severe	intensity ≥ 0.76809	982	40.17 ± 4.56	44.24 ± 3.53	0.9854 ± 0.0104	0.9945 ± 0.0040
zipper	mild	intensity ≤ 22	658	20.90 ± 1.37	35.77 ± 3.59	0.6531 ± 0.0468	0.9537 ± 0.0321
zipper	moderate	22 < intensity < 41	2579	18.49 ± 1.83	34.08 ± 3.02	0.5267 ± 0.0855	0.9384 ± 0.0410
zipper	severe	intensity ≥ 41	996	17.41 ± 2.03	32.04 ± 3.72	0.4848 ± 0.0858	0.9142 ± 0.0757
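
The severity intervals in Table 11 amount to a two-threshold rule per artifact type. A minimal sketch of that rule follows. The thresholds are copied from the "Severity Interval" column, while the names `SEVERITY_THRESHOLDS` and `severity_label` are hypothetical.

```python
# Thresholds from the "Severity Interval" column of Table 11:
# value <= low -> mild, value >= high -> severe, otherwise moderate.
SEVERITY_THRESHOLDS = {
    "herringbone": ("smooth", 6, 16),
    "ringing": ("intensity", 0.27195, 0.76809),
    "zipper": ("intensity", 22, 41),
}


def severity_label(artifact: str, value: float) -> str:
    """Map an artifact's control-parameter value to its severity stratum."""
    _, low, high = SEVERITY_THRESHOLDS[artifact]
    if value <= low:
        return "mild"
    if value >= high:
        return "severe"
    return "moderate"


for artifact, value in [("herringbone", 6), ("ringing", 0.5), ("zipper", 41)]:
    parameter = SEVERITY_THRESHOLDS[artifact][0]
    print(f"{artifact} ({parameter}={value}): {severity_label(artifact, value)}")
```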

Table 12. Quantitative ablation study of the main architecture components.

Model	SSIM	PSNR
Our model	0.98528 ± 0.02218	43.33710 ± 5.36451
without Radon	0.98242 ± 0.02441	42.65113 ± 5.73252
without Attention	0.97295 ± 0.04182	41.13236 ± 6.58765
without AA component	0.84215 ± 0.08133	34.87268 ± 7.15305
without AD component	0.95828 ± 0.04865	37.92928 ± 5.17067
without DA component	0.96147 ± 0.04702	38.48882 ± 5.56199
without DD component	0.95989 ± 0.05004	38.30767 ± 5.53929
without AA loss	0.96311 ± 0.04702	38.36318 ± 5.26245
without AD loss	0.96266 ± 0.04576	38.05932 ± 4.99527
without DA loss	0.96354 ± 0.04467	38.21393 ± 5.06365
without DD loss	0.96377 ± 0.04325	38.17828 ± 4.85161

Table 13. Average SSIM and PSNR results for each trained model compared with the proposed approach.

Model	SSIM	PSNR
Our model	0.98528 ± 0.02218	43.33710 ± 5.36451
U-Net	0.97295 ± 0.04182	41.13236 ± 6.58765
GAN	0.97390 ± 0.03990	41.08036 ± 6.32473
Spatial + channel attention	0.97353 ± 0.03422	41.41551 ± 6.70846
Attention–attention	0.93782 ± 0.04089	32.84799 ± 3.70444
Vision transformer	0.97322 ± 0.04279	40.77219 ± 6.18026

Table 14. Computational cost comparison across the different approaches.

Model	Params (M)	FLOPs/MACs (G)	Peak GPU Mem (GB)	Train Time/Epoch (s)	Total Train Time (min, 100 ep)	Inference Latency/Slice (ms)
Our model	33.482	14.337	1.433 ± 0.03	359.83 ± 10.79	599.72 ± 17.99	5.461 ± 0.27
U-Net	31.391	13.933	1.582 ± 0.03	335.07 ± 10.05	558.45 ± 16.75	4.399 ± 0.22
GAN	34.155	13.933	1.583 ± 0.04	671.05 ± 26.84	1118.42 ± 44.74	4.389 ± 0.22
Spatial + channel attention	32.177	14.451	4.554 ± 0.09	601.78 ± 24.07	1002.97 ± 40.12	6.202 ± 0.37
Attention–attention	3.599	9.881	4.528 ± 0.09	4324.75 ± 216.24	7207.92 ± 360.4	23.526 ± 1.41
Vision transformer	43.508	14.442	1.968 ± 0.04	529.47 ± 21.18	882.44 ± 35.30	12.327 ± 0.74
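
The total training time column in Table 14 appears to follow from the per-epoch time scaled to 100 epochs and converted to minutes. The short check below recomputes it from the tabulated means. The helper name `total_minutes` is hypothetical, and small differences in the last digit are to be expected because the tabulated per-epoch times are already rounded.

```python
# (model, mean seconds per epoch, reported total minutes for 100 epochs), from Table 14
ROWS = [
    ("Our model", 359.83, 599.72),
    ("U-Net", 335.07, 558.45),
    ("GAN", 671.05, 1118.42),
    ("Spatial + channel attention", 601.78, 1002.97),
    ("Attention-attention", 4324.75, 7207.92),
    ("Vision transformer", 529.47, 882.44),
]


def total_minutes(sec_per_epoch: float, epochs: int = 100) -> float:
    """Convert a per-epoch time in seconds to a total time in minutes."""
    return sec_per_epoch * epochs / 60.0


for name, sec, reported in ROWS:
    print(f"{name:28s} computed={total_minutes(sec):9.2f}  reported={reported:9.2f}")
```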


Share and Cite

MDPI and ACS Style

Rios-Perez, J.D.; Sanchez-Torres, G.; Branch-Bedoya, J.W.; Laiton-Bonadiez, C.A. Radon-Guided Wavelet-Domain Attention U-Net for Periodic Artifact Suppression in Brain MRI. J. Imaging 2026, 12, 153. https://doi.org/10.3390/jimaging12040153

