Deep Learning-Based Super-Resolution Reconstruction and Segmentation of Photoacoustic Images

Abstract: Photoacoustic imaging (PAI) is an emerging imaging technique that offers real-time, non-invasive, and radiation-free measurements of optical tissue properties. However, image quality degradation due to factors such as non-ideal signal detection hampers its clinical applicability. To address this challenge, this paper proposes a deep learning-based algorithm for super-resolution reconstruction and segmentation. The proposed enhanced deep super-resolution minimalistic network (EDSR-M) reduces the computational complexity and parameter count of the original algorithm and employs residual learning and attention mechanisms to extract image features and enhance image details, thereby achieving high-quality reconstruction of PAI. DeepLabV3+ is used to segment the images before and after reconstruction to verify the reconstruction performance of the network. The experimental results demonstrate average improvements of 19.76% in peak signal-to-noise ratio (PSNR) and 4.80% in structural similarity index (SSIM) for the reconstructed images compared with their pre-reconstruction counterparts. Additionally, the mean accuracy, mean intersection over union (IoU), and mean boundary F1 score (BFScore) for segmentation improved by 8.27%, 6.20%, and 6.28%, respectively. The proposed algorithm enhances the visual quality and texture features of PAI and restores the overall image structure more completely.


Introduction
With the continuous development of medical imaging technology, photoacoustic imaging (PAI) has attracted increasing attention as a fast-developing hybrid biomedical imaging technology [1][2][3]. PAI combines the advantages of optical imaging and ultrasound imaging and has unique advantages in imaging depth, spatial resolution, and tissue imaging. It provides abundant functional and structural tissue information and is widely used in biomedical fields [4][5][6]. PAI has shown significant potential and effectiveness in a variety of clinical applications [7]. Examples include breast cancer screening and diagnosis [8,9], in which early lesions and malignant tumors are identified by detecting blood oxygen saturation and angiogenesis in breast tissue; diagnosis of skin lesions [10,11], in which skin cancer and melanoma are identified by detecting the melanin concentration and vascular structure in the skin; cardiovascular disease monitoring [12,13], in which the severity of atherosclerosis is assessed by real-time imaging of intravascular lipid cores and calcified plaques; and tumor detection [14,15], in which blood flow changes and blood oxygen levels in tissues provide important information for early detection and surgical navigation. However, PAI still has limitations in practical applications. Non-ideal signal detection can significantly reduce the image quality of PAI [16]. In addition, the generation of PAI images relies on acoustic waves generated from biological tissues, which can only be sampled in the spatial dimension. Each discrete spatial measurement requires its own detector, and it may be infeasible to build an imaging system with a sufficiently large number of detectors due to practical and physical limitations [17]. Reconstructing the sampled data using standard methods results in low-quality images with serious loss of detail. This has become a major obstacle to the clinical adoption of PAI.
Appl. Sci. 2024, 14, 5331
Therefore, despite the great progress in PAI, improving image quality and accurately segmenting tissue structures remain major challenges [18,19]. Traditional photoacoustic image reconstruction and segmentation methods often rely on hand-designed feature extractors and mathematical models, which have many limitations in dealing with complex backgrounds, noise interference, and blurred tissue boundaries [20,21]. For example, filter- or edge-detector-based methods have limited effectiveness in dealing with low-contrast structures and subtle features in PAI, while threshold-based segmentation methods are sensitive to noise and artifacts and are prone to producing erroneous segmentation results. These shortcomings restrict their use in real-world applications.
To overcome the limitations of traditional methods, deep learning-based image reconstruction and segmentation methods have gradually attracted the attention of researchers in recent years [22][23][24][25]. Deep-learning techniques are now widely used in photoacoustic imaging to acquire high-quality photoacoustic images through large-scale learning and to use the reconstructed images for disease segmentation, classification, and detection. With their strong feature learning capabilities and end-to-end data-driven design, deep-learning models can learn complicated representations and perform reconstruction and segmentation tasks on photoacoustic image data effectively and accurately. Deep learning-based techniques generalize better and are more robust than traditional techniques in handling noise, artifacts, and structural complexity in PAI [26][27][28]. However, despite the significant progress made by deep-learning methods in PAI processing, some challenges and limitations remain. First, because PAIs have complex structures and noise interference, traditional convolutional neural networks often perform poorly on them: they have difficulty accurately reconstructing and segmenting fine structures and may suffer from overfitting or underfitting, resulting in insufficient generalization ability [29]. Second, the breadth of application and the performance of deep-learning methods are limited because deep-learning models need a large amount of labelled data for training, and labelled PAI data are hard to collect.
To solve the above problems, this paper proposes a deep-learning PAI reconstruction and segmentation method based on the improved enhanced deep super-resolution minimalistic network (EDSR-M) and DeepLabV3+ [30]. As an advanced model for super-resolution reconstruction tasks, the EDSR [31] network can effectively improve the spatial resolution and detail information of an image. The improved network, EDSR-M, replaces the original upsampling layer with a convolution layer, which reduces the computational burden, simplifies the network structure, reduces the risk of overfitting, and helps the network learn image features better. The DeepLabV3+ network is an advanced model for semantic segmentation tasks that can segment the different objects and tissues in an image into semantic regions. Together, these two networks make full use of the spatial and semantic information of the image in the PAI reconstruction and segmentation task, thus achieving more accurate and robust reconstruction and segmentation results. Experimental results show that the proposed method achieves significant improvements in the reconstruction and segmentation of PAI, demonstrating the potential and effectiveness of deep learning in the field of PAI. These results provide a reference for further promoting the development of PAI technology in clinical applications.

Overview of the Framework
Figure 1 shows the overall framework proposed in this study. The framework comprises four steps: photoacoustic image (PAI) generation, deep-learning image reconstruction, deep-learning image segmentation, and model validation and evaluation. First, low-resolution photoacoustic images (LR-PAIs) were generated by k-Wave [32] simulation. Second, the generated images were reconstructed using the improved EDSR-M network to obtain high-resolution photoacoustic images (HR-PAIs). Subsequently, a deep-learning model (DeepLabV3+ [30]) was used to segment the images before and after reconstruction to help verify the reconstruction effect and assess the accuracy of the reconstructed images. Finally, the combination of deep-learning models was evaluated and validated.

Photoacoustic Signal Generation and Reconstruction
Photoacoustic signals are generated by illuminating tissue with a nanosecond laser pulse δ(t). The light-absorbing molecules in the tissue undergo thermoelastic expansion and generate a photoacoustic pressure wave [33]. Assuming that thermal diffusion and volume expansion during illumination are negligible, the initial photoacoustic pressure x can be defined as

$$x(\mathbf{r}) = \Gamma(\mathbf{r})\, A(\mathbf{r}) \qquad (1)$$

where A(r) is the spatial absorption function and Γ(r) is the Grüneisen coefficient describing the conversion efficiency from heat to pressure [34]. The photoacoustic pressure wave p(r, t) at position r and time t can be modeled as an initial-value problem for the wave equation,

$$\left(\frac{\partial^2}{\partial t^2} - c^2 \Delta\right) p(\mathbf{r}, t) = 0, \qquad p(\mathbf{r}, 0) = x(\mathbf{r}), \qquad \frac{\partial p}{\partial t}(\mathbf{r}, 0) = 0 \qquad (2)$$

where c is the speed of sound [35].
The transient signal is measured by a sensor located on the measurement surface S_0. The linear operator M acts on p(r, t) restricted to the boundary ∂Ω of the computational domain Ω for a finite time T and provides a linear mapping from the initial pressure x to the measured transient signal y:

$$y = \mathcal{M}\, p|_{\partial\Omega \times (0,T)} = \mathcal{A} x \qquad (3)$$

Here the forward operator in Equation (3) is distinct from the absorption function A(r) in Equation (1).
Time-reversal reconstruction [36] is a robust photoacoustic image reconstruction method for homogeneous and heterogeneous media and arbitrary detection geometries. The method reconstructs the image by running the numerical model of the forward problem in reverse, i.e., inverting the signal propagation process in the time domain while keeping the spatial coordinates constant. Compared with other imaging algorithms, the time-reversal reconstruction method is less susceptible to image artifacts and can achieve better reconstruction results [37].

Deep-Learning Algorithms for PAI Reconstruction
The convolutional neural network EDSR [31] for super-resolution reconstruction was modified to enhance the efficiency and scalability of PAI reconstruction. Deep learning-based super-resolution reconstruction algorithms have been used to obtain high-resolution (HR) images from their low-resolution (LR) counterparts in various fields [38,39]. The EDSR network, a deep-learning model for image super-resolution reconstruction, employs techniques such as residual learning and dense connectivity and increases the depth and number of parameters of the network, which can efficiently improve the quality and accuracy of image super-resolution reconstruction. However, since PAI reconstruction does not need to change the size of the output image, the upsampling layer increases the computational complexity and the number of parameters of the network, which may lead to overfitting or unstable training. In addition, the upsampling layer may introduce blurring or distortion into the reconstructed image, especially at edges and in fine details, degrading the reconstruction.
To address this, an improved network, EDSR-M, was proposed by replacing the original upsampling layer with a convolutional layer; its architecture is shown in Figure 2. The improved network still consists of a series of residual blocks, each of which contains multiple convolutional layers and activation functions and adds its input features to its output features through a residual connection in order to efficiently learn the residual information of the image. Appropriate convolutional operations in the final part of the network then increase the resolution of the image. Overall, the improved network structure improves computational efficiency while maintaining the performance of the original network and avoids the problems that may be introduced by the traditional upsampling layer, such as artifacts and distortion. It also reduces the overfitting risk of the model, making the model more generalizable, more robust, and better able to adapt to different datasets and scenarios. Therefore, the improved network has significant advantages in improving model efficiency, reducing overfitting risk, and improving model interpretability.
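As a rough illustration of why dropping the upsampling layer reduces the parameter count, the sketch below counts the weights of a sub-pixel upsampling tail of the kind used in EDSR and of a single same-resolution convolution. The channel width (256), kernel size (3), scale factor (2), and all function names are assumptions chosen for illustration; they are not taken from Table 1 or from the authors' implementation.

```python
def conv2d_params(kernel: int, c_in: int, c_out: int, bias: bool = True) -> int:
    """Number of learnable parameters in one 2-D convolution layer."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)


def upsampling_tail_params(channels: int, scale: int, kernel: int = 3) -> int:
    """Sub-pixel upsampler: one conv that expands the channels by scale**2.

    The pixel-shuffle rearrangement that follows has no learnable parameters.
    """
    return conv2d_params(kernel, channels, channels * scale * scale)


def plain_tail_params(channels: int, kernel: int = 3) -> int:
    """Same-resolution replacement: one conv that keeps the channel count."""
    return conv2d_params(kernel, channels, channels)


if __name__ == "__main__":
    width = 256  # assumed feature width, for illustration only
    up = upsampling_tail_params(width, scale=2)
    plain = plain_tail_params(width)
    print(f"upsampling tail: {up:,} parameters")
    print(f"plain conv tail: {plain:,} parameters")
    print(f"reduction in this layer: {100 * (1 - plain / up):.1f}%")
```

Under these assumed settings the replacement layer holds roughly a quarter of the weights of the upsampling convolution; the saving for the whole network depends on the number of residual blocks and is not estimated here.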

Deep-Learning Algorithm for PAI Segmentation
Deep learning plays an important role in medical image segmentation. By combining the powerful feature learning capability of deep neural networks with the rich information in medical images, it achieves automatic and accurate segmentation of lesions, tissues, and organs, which provides important support for medical diagnosis and treatment [40,41]. DeepLabV3+ [30], a deep-learning algorithm for semantic segmentation, is able to accurately identify blood vessels, tissues, and other structures in PAI and thus provide accurate segmentation results for medical diagnosis; its architecture is shown in Figure 3. Its core feature is the combination of depthwise separable convolution, atrous (dilated) convolution, and multi-scale feature fusion, which can effectively capture the subtle structures and complex information in medical images to achieve accurate segmentation of organs and lesions. The model employs a backbone network (e.g., ResNet, Xception, or MobileNetV2) to extract image features and captures contextual information at different scales through its atrous convolution module and atrous spatial pyramid pooling operations. DeepLabV3+ also introduces a decoder module that fuses low-level and high-level features through transposed convolutional layers and skip connections, which improves the accuracy and detail of the segmentation results and supports the development of medical imaging and clinical practice.
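To make the multi-scale idea concrete, the following minimal sketch implements a one-dimensional atrous (dilated) convolution in plain Python. It only illustrates how spacing the kernel taps enlarges the receptive field without adding weights; the function names and the example values are hypothetical and are not part of DeepLabV3+ or of any toolbox.

```python
from typing import List, Sequence


def effective_kernel_size(kernel_size: int, rate: int) -> int:
    """Receptive field of a kernel whose taps are spaced `rate` samples apart."""
    return (kernel_size - 1) * rate + 1


def atrous_conv1d(signal: Sequence[float], kernel: Sequence[float],
                  rate: int = 1) -> List[float]:
    """'Valid' 1-D atrous convolution (cross-correlation form, as used in CNNs).

    rate=1 reduces to an ordinary convolution; larger rates skip samples
    between taps, so the same weights cover a wider context.
    """
    if rate < 1:
        raise ValueError("rate must be >= 1")
    span = effective_kernel_size(len(kernel), rate)
    out = []
    for start in range(len(signal) - span + 1):
        out.append(sum(w * signal[start + j * rate] for j, w in enumerate(kernel)))
    return out


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    k = [1, 1, 1]
    print(atrous_conv1d(x, k, rate=1))  # ordinary 3-tap sum
    print(atrous_conv1d(x, k, rate=2))  # same 3 weights, 5-sample receptive field
```

Running several such convolutions with different rates in parallel and concatenating their outputs is the basic idea behind pyramid pooling over multiple scales.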

Photoacoustic Data for Training and Testing
Photoacoustic computed tomography is a novel medical imaging technique that utilizes the light absorption properties of tissues to generate images. The method works by identifying photoacoustically induced initial pressure distributions in tissues: light absorption raises the local temperature, which produces an initial pressure distribution and ultrasound waves [42]. Different tissues absorb light to different extents, and thus the initial pressure distributions vary, which determines the contrast of the PAI [43]. PAI of biological tissues mostly employs near-infrared (NIR) light, which provides deeper penetration, a higher signal-to-noise ratio, and more contrast than typical visible-light biological imaging [44]. This helps to increase spatial resolution. The absorption coefficients of different tissues such as fat, muscle, and blood at different wavelengths have been extensively studied [45]. Therefore, the tissue structure and density information obtained from CT and MR scans can be used to define the initial pressure distribution in the PAI and thus obtain the contrast of the photoacoustic image.
Synthetic training and test data were created using k-Wave, a MATLAB toolbox for simulating photoacoustic wavefields [32]. The photoacoustic simulation in the k-Wave toolbox is implemented using a pseudo-spectral approach. The pseudo-spectral method is a numerical method for solving partial differential equations that improves efficiency in the spatial domain by fitting a Fourier series to all the data globally, and it is suitable for time-domain modeling of broadband or high-frequency waves in acoustics [46]. For each image in the dataset, an initial photoacoustic source with a grid size of 256 × 256 pixels is defined. The medium is assumed to be homogeneous, with a sound speed of 1500 m/s and an attenuation coefficient of 0.75 dB/(MHz·cm), similar to that of soft tissue in vivo [47][48][49]. The transducer arrays have 32, 64, or 128 equidistant sensors on a circle with a radius of 100 pixels for the reception of the photoacoustic waves. k-Wave's built-in functions are used to simulate photoacoustic pressure sampling. The image is then reconstructed from the simulated photoacoustic time-series data using the time-reversal method.
The CHAOS dataset [50] is a comprehensive dataset for multimodal medical imaging of the liver. The dataset contains CT and MR scans from multiple healthy volunteers, with segmentation annotations of abdominal organs such as the liver, kidneys, and spleen. The abdominal MRI dataset acquired with the T2-Spectral Pre-Saturation Inversion Recovery (SPIR) sequence was selected to define the initial photoacoustic pressure source in k-Wave and create simulated PAI. It contains 632 T2-weighted abdominal images acquired with fat-suppressed pulse sequences. The reconstructed sound field images were post-processed, including filtering, denoising, and interpolation, to create a simulated image dataset for deep-learning reconstruction of PAI.
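The sensor geometry described above can be sketched independently of k-Wave. The snippet below places N equidistant sensors on a circle of radius 100 pixels inside a 256 × 256 grid. The grid centre at (128, 128), the starting angle of zero, and the function name are assumptions made for illustration; the simulation itself would still be run with k-Wave's own routines.

```python
import math
from typing import List, Tuple


def circular_sensor_positions(n_sensors: int, radius: float,
                              center: Tuple[float, float]) -> List[Tuple[float, float]]:
    """Return (x, y) coordinates of equidistant sensors on a circle."""
    cx, cy = center
    positions = []
    for k in range(n_sensors):
        theta = 2.0 * math.pi * k / n_sensors  # equal angular spacing
        positions.append((cx + radius * math.cos(theta),
                          cy + radius * math.sin(theta)))
    return positions


if __name__ == "__main__":
    grid_size = 256                             # pixels per side, from the text
    center = (grid_size / 2, grid_size / 2)     # assumed: circle centred on the grid
    for n in (32, 64, 128):                     # sensor counts from the text
        sensors = circular_sensor_positions(n, radius=100, center=center)
        spacing = 2 * math.pi * 100 / n         # arc length between neighbours
        print(f"{n:3d} sensors, arc spacing of about {spacing:.2f} pixels")
```

Halving the sensor count doubles the arc spacing between neighbouring detectors, which is the sparse-sampling condition that produces the streak artifacts the reconstruction network is meant to suppress.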

Experimental Setup
Photoacoustic images (PAIs) were generated from the CHAOS dataset by k-Wave simulation. During the experiments, the dataset was divided into a training set (75%), a validation set (5%), and a test set (20%), and data augmentation was performed on the training set by random rotations of 90 degrees and flipping along the x-axis, followed by super-resolution reconstruction using the EDSR-M network with 32 residual blocks. Finally, the pre- and post-reconstruction images were segmented using the DeepLabV3+ network with a ResNet-50 backbone to help test the reconstruction performance. The values of the main network parameters are shown in Table 1, and the models were implemented with the MATLAB Deep Learning Toolbox (R2023b).
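A minimal sketch of the split and augmentation described above is given below in plain Python. The shuffling seed, the rounding of the subset sizes, the reading of "flipping along the x-axis" as a left-right mirror, and all function names are assumptions for illustration; the authors' MATLAB pipeline may differ in each of these details.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(items: Sequence, train_frac: float = 0.75,
                  val_frac: float = 0.05, seed: int = 0) -> Tuple[List, List, List]:
    """Shuffle and split into train/validation/test; the remainder is the test set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_frac * len(shuffled))
    n_val = round(val_frac * len(shuffled))
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


def rot90(image: List[List[float]]) -> List[List[float]]:
    """Rotate a 2-D image (list of rows) by 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*image)][::-1]


def flip_left_right(image: List[List[float]]) -> List[List[float]]:
    """Mirror each row (assumed meaning of the flip in the text)."""
    return [row[::-1] for row in image]


def augment(image: List[List[float]], rng: random.Random) -> List[List[float]]:
    """Apply a random multiple of 90-degree rotation and an optional flip."""
    out = [list(row) for row in image]
    for _ in range(rng.randrange(4)):
        out = rot90(out)
    if rng.random() < 0.5:
        out = flip_left_right(out)
    return out


if __name__ == "__main__":
    train, val, test = split_dataset(range(632))  # 632 images, from the text
    print(len(train), len(val), len(test))
    print(augment([[1, 2], [3, 4]], random.Random(1)))
```

Because 5% of 632 is not an integer, the exact validation and test counts depend on the rounding rule, which the text does not specify.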

Model Evaluation Measures
In this study, a comprehensive assessment of the deep-learning image reconstruction and segmentation models was conducted. The assessment measures mainly include quantitative assessment metrics and statistical difference analysis.

Quantitative Assessment Metrics for Image Reconstruction Models
In image reconstruction experiments, peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) are used as image quality metrics to compare the reconstructed images with the ground truth. PSNR is an overall measure of image quality, while SSIM is a local measure of luminance, contrast, and structural similarity; together they objectively measure the degree of difference between the reconstructed image and the original image. These measures are calculated according to Equations (4)-(6):

$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left(I_i - \hat{I}_i\right)^2 \qquad (4)$$

$$\mathrm{PSNR} = 10 \log_{10}\!\left(\frac{\mathrm{MAX}^2}{\mathrm{MSE}}\right) \qquad (5)$$

where MAX is the maximum possible pixel value, N is the number of pixels, I_i is the ith pixel value of the original image, and Î_i is the ith pixel value of the reconstructed image.

$$\mathrm{SSIM}(x, y) = \frac{\left(2\mu_x \mu_y + c_1\right)\left(2\sigma_{xy} + c_2\right)}{\left(\mu_x^2 + \mu_y^2 + c_1\right)\left(\sigma_x^2 + \sigma_y^2 + c_2\right)} \qquad (6)$$

where µ_x and µ_y are the mean values of the original image x and the reconstructed image y, respectively, σ_x² and σ_y² are their variances, σ_xy is their covariance, and c_1 and c_2 are two constants used to stabilize the calculation.
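The two reconstruction metrics described above can be sketched in a few lines of plain Python. This is a simplified illustration rather than the authors' evaluation code: images are passed as flat pixel lists, SSIM is computed once from global image statistics instead of being averaged over local windows, and the stabilizing constants use the conventional choices c1 = (0.01·MAX)² and c2 = (0.03·MAX)², which are assumed here because the text does not state them.

```python
import math
from typing import Sequence


def mse(ref: Sequence[float], rec: Sequence[float]) -> float:
    """Mean squared error between two equally sized pixel sequences."""
    if len(ref) != len(rec) or not ref:
        raise ValueError("inputs must be non-empty and the same length")
    return sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)


def psnr(ref: Sequence[float], rec: Sequence[float], max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(ref, rec)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


def ssim_global(x: Sequence[float], y: Sequence[float], max_value: float = 255.0,
                k1: float = 0.01, k2: float = 0.03) -> float:
    """Single-window SSIM computed from whole-image means, variances, covariance."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and the same length")
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1, c2 = (k1 * max_value) ** 2, (k2 * max_value) ** 2
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))


if __name__ == "__main__":
    original = [0, 255, 255, 0]
    reconstructed = [0, 255, 255, 10]
    print(f"PSNR = {psnr(original, reconstructed):.2f} dB")
    print(f"SSIM = {ssim_global(original, reconstructed):.4f}")
```

A windowed SSIM, as implemented in common image-processing libraries, would generally give different values from this global variant.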

Quantitative Assessment Metrics for Image Segmentation Models
In image segmentation experiments, the segmentation results are compared with the ground truth using accuracy, intersection over union (IoU), boundary F1 score (BFScore), and the Dice coefficient. For each class in the segmentation results, accuracy denotes the ratio of correctly classified pixels to the total number of pixels in that class. IoU denotes the ratio of correctly classified pixels in that class to the total number of ground-truth and predicted pixels in that class. BFScore denotes the extent to which the predicted boundaries match the true boundaries of each class. The Dice coefficient measures the degree of overlap between the predicted results and the true labels; it takes values between 0 and 1, where 1 indicates perfect overlap and 0 indicates no overlap. For each image, mean accuracy is the average of the accuracies of all classes in that image. Global accuracy is the ratio of correctly classified pixels, regardless of class, to the total number of pixels. Mean IoU is the average IoU of all classes in the image. Weighted IoU is the average IoU of all classes, weighted by the number of pixels in each class. Mean BFScore is the average BFScore of all classes in the image. These measures are calculated according to Equations (7)-(17); for example, global accuracy is given by

$$\text{Global Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (9)$$

where TP denotes the number of pixels correctly classified as positive, TN denotes the number of pixels correctly classified as negative, FP denotes the number of pixels incorrectly classified as positive, FN denotes the number of pixels incorrectly classified as negative, N denotes the number of classes, and w denotes the pixel weight of each class.
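The count-based metrics described above can be written directly in terms of TP, TN, FP, and FN. The sketch below follows the verbal definitions in the text for one class; the `Counts` container, the function names, and the example numbers are hypothetical, and BFScore is omitted because it requires boundary matching rather than pixel counts.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Pixel counts for one class (illustrative container, not a library type)."""
    tp: int  # correctly classified as the class
    fp: int  # incorrectly classified as the class
    fn: int  # class pixels that were missed
    tn: int  # correctly classified as not the class


def class_accuracy(c: Counts) -> float:
    """Correctly classified pixels over all ground-truth pixels of the class."""
    return c.tp / (c.tp + c.fn)


def iou(c: Counts) -> float:
    """Intersection over union of predicted and ground-truth regions."""
    return c.tp / (c.tp + c.fp + c.fn)


def dice(c: Counts) -> float:
    """Dice coefficient: twice the overlap over the sum of both region sizes."""
    return 2 * c.tp / (2 * c.tp + c.fp + c.fn)


def global_accuracy(c: Counts) -> float:
    """All correctly classified pixels over all pixels."""
    return (c.tp + c.tn) / (c.tp + c.tn + c.fp + c.fn)


if __name__ == "__main__":
    counts = Counts(tp=8, fp=2, fn=2, tn=88)  # made-up example values
    print(f"accuracy        = {class_accuracy(counts):.3f}")
    print(f"IoU             = {iou(counts):.3f}")
    print(f"Dice            = {dice(counts):.3f}")
    print(f"global accuracy = {global_accuracy(counts):.3f}")
```

The functions assume that each denominator is non-zero; a class that is absent from both prediction and ground truth would need separate handling.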

Methods of Statistical Difference Analysis
In the statistical difference analysis, the t-test and analysis of variance were used to evaluate whether the differences between the proposed method and the comparison methods were significant. The t-test compares whether the means of two groups of samples differ significantly. Analysis of variance tests whether there is a significant difference between the means of samples from two or more groups. Depending on the number of indicators analyzed, it is classified into one-way analysis of variance (ANOVA) and multivariate analysis of variance (MANOVA). It determines whether the effects of different factors on the overall mean are statistically significant by decomposing the variance of the data into the variance due to each factor and the variance due to random error. These methods enable a systematic assessment of whether the performance differences between the proposed method and existing methods are significant, which supports the reliability of the experimental results.
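As a small worked example of the t-test idea, the sketch below computes a paired t statistic for two methods evaluated on the same images. The text does not say whether a paired or an independent-samples test was used, so the paired form is an assumption, as are the function name and the example scores; converting the statistic to a p-value requires the t distribution, which is left to a statistics package.

```python
import math
import statistics
from typing import Sequence, Tuple


def paired_t_statistic(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Return (t, degrees of freedom) for paired samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    mean_d = statistics.fmean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd_d == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    t = mean_d / (sd_d / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


if __name__ == "__main__":
    # Made-up per-image scores for two hypothetical methods.
    method_a = [2.0, 4.0, 6.0]
    method_b = [1.0, 2.0, 3.0]
    t, dof = paired_t_statistic(method_a, method_b)
    print(f"t = {t:.4f} with {dof} degrees of freedom")
```

A large absolute t value relative to the critical value for the given degrees of freedom indicates that the mean difference between the two methods is unlikely to be due to chance.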

Comparison of Experimental Results for Reconstruction
In the super-resolution reconstruction experiments, a conventional photoacoustic image reconstruction technique (the time-reversal (TR) method) and EDSR-M were compared for different numbers of sensors. In addition, the reconstruction results of the TR method, bicubic interpolation, and deep learning-based reconstruction methods (SRCNN, FSRCNN, VDSR, and EDSR-M) were evaluated. The reconstructed images were compared with the ground truth images using PSNR and SSIM as quantitative indicators of reconstruction quality. The differences in PSNR and SSIM between the proposed method and the other methods were tested for significance using the t-test, ANOVA, and MANOVA.
Table 2 and Figure 4 show the reconstruction results of photoacoustic images for different numbers of sensors. As the number of sensors increased, the artifacts in the reconstructed images were significantly reduced and the image quality was significantly improved. The images directly reconstructed by the TR method generally suffered from blurring, missing details, and artifacts, and their average PSNR and SSIM were the lowest, indicating relatively poor reconstruction. In contrast, the images reconstructed with EDSR-M showed better results at all sensor counts, and their average PSNR and SSIM were always the highest. EDSR-M reduced the artifacts, improved the overall image quality, and recovered the detail information in the images, which provided a solid foundation for the best reconstruction results.
Figure 5 shows the ground truth image and the PAI reconstructed by each method. The PAI directly reconstructed by the TR method had obvious blurring and information loss, and its overall visual quality was poor. Bicubic interpolation, as a basic interpolation method, made the result worse rather than better, which indicates that it was not suitable for the super-resolution reconstruction of PAIs. Among the deep-learning methods, EDSR-M recovered the fine structure and texture of the image best visually, making the image look clearer and more natural. Numerically, the PSNR and SSIM of the EDSR-M reconstructed images were also higher than those of the other reconstruction methods. This demonstrates the superiority of EDSR-M in improving image quality and preserving details.
Table 3 lists the average numerical results of each method on the test set. The TR method, as the most primitive direct reconstruction, had a PSNR of only 30.76 dB and an SSIM of 0.916. Bicubic interpolation performed poorly, possibly because simple interpolation could not effectively improve image quality. The images reconstructed by SRCNN had a lower SSIM, probably because SRCNN failed to recover image details effectively. In contrast, among FSRCNN, VDSR, and EDSR-M, EDSR-M achieved the best results, with a PSNR of 36.84 dB and an SSIM of 0.960, indicating that it is an effective super-resolution reconstruction method that improves reconstruction quality while preserving image details.
Figure 6 compares the super-resolution reconstruction results of the different methods on the test set. Figure 6a shows the distribution of PSNR values of the different reconstruction methods on the test set images. The box of the EDSR-M method is located at the top of the overall distribution, indicating a higher median PSNR and a higher-quality reconstruction. Figure 6b shows the distribution of SSIM values of the different reconstruction methods on the test set images. Similar to PSNR, the box of the EDSR-M method is also located at the top of the overall distribution, indicating that the median of its SSIM values was higher and the distribution was more concentrated, which suggests that its reconstruction results were better in terms of structural similarity. Taken together, the two plots show that EDSR-M performed well in PAI super-resolution reconstruction and that its results were better than those of the other methods in terms of both PSNR and SSIM, which provides a valuable reference for the field of photoacoustic image super-resolution reconstruction.
Table 4 presents the results of the statistical difference analysis between the proposed EDSR-M and the other reconstruction methods in PSNR and SSIM. The p-values calculated by all the analysis methods were much smaller than the significance level α = 0.05, indicating significant differences between EDSR-M and the other reconstruction methods in PSNR and SSIM. Combined with the mean values in Table 3, this further verifies the superior performance of EDSR-M relative to the other methods, which is of great significance for further research and practical applications in the field of photoacoustic image reconstruction.
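As an illustration of the kind of significance test summarized in Table 4, the sketch below computes a paired t statistic on per-image PSNR values of two methods. The paper does not state whether its t-test was paired or independent, so the paired form is an assumption here, and the PSNR values, the function name `paired_t_statistic`, and the constant `T_CRITICAL_DF3` are hypothetical and introduced only for illustration.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for a paired t-test on per-image scores."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    differences = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    std_diff = statistics.stdev(differences)  # sample standard deviation (n - 1)
    if std_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    standard_error = std_diff / math.sqrt(n)
    return mean_diff / standard_error, n - 1


# Hypothetical per-image PSNR values in dB (NOT taken from the paper).
psnr_edsr_m = [36.0, 37.0, 38.0, 36.5]
psnr_tr = [31.0, 31.5, 33.5, 31.0]

t_value, dof = paired_t_statistic(psnr_edsr_m, psnr_tr)

# Two-sided critical value for alpha = 0.05 and 3 degrees of freedom,
# as listed in standard t tables.
T_CRITICAL_DF3 = 3.182
print(f"t = {t_value:.2f}, dof = {dof}, significant = {abs(t_value) > T_CRITICAL_DF3}")
```

A |t| above the critical value corresponds to a p-value below α = 0.05, which is the criterion applied in the text. ANOVA and MANOVA extend the same idea to more than two methods and to several metrics at once.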

Comparison of Results of Segmentation Experiments
In the image segmentation experiments, the segmentation results of HR-PAI were compared with those of LR-PAI using DeepLabv3+ under the same training set (128 sensors). DeepLabv3+ was also compared with existing deep-learning image segmentation methods (FCN, SegNet, and U-Net). The predicted segmentation labels were compared with the ground truth using accuracy, IoU, BFScore, and the Dice coefficient as quantitative metrics of segmentation quality. The Dice coefficients of the segmentation results of each organ were also tested for significant differences using the t-test, ANOVA, and MANOVA.
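The accuracy and IoU metrics named above can be sketched as follows for integer label maps. This is a minimal illustration under the assumption that labels are small non-negative integers stored in nested lists; the helper names and the toy label maps are hypothetical and do not reproduce the paper's evaluation code.

```python
def confusion_counts(truth, prediction, num_classes):
    """Count true positives, false positives, and false negatives per class."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for truth_row, pred_row in zip(truth, prediction):
        for t, p in zip(truth_row, pred_row):
            if t == p:
                tp[t] += 1
            else:
                fp[p] += 1  # pixel wrongly assigned to class p
                fn[t] += 1  # pixel of class t that was missed
    return tp, fp, fn


def overall_pixel_accuracy(truth, prediction, num_classes):
    """Fraction of all pixels whose predicted label equals the true label."""
    tp, _, fn = confusion_counts(truth, prediction, num_classes)
    return sum(tp) / (sum(tp) + sum(fn))


def mean_iou(truth, prediction, num_classes):
    """Average of TP / (TP + FP + FN) over classes present in either map."""
    tp, fp, fn = confusion_counts(truth, prediction, num_classes)
    ious = []
    for c in range(num_classes):
        union = tp[c] + fp[c] + fn[c]
        if union > 0:  # skip classes absent from both maps
            ious.append(tp[c] / union)
    return sum(ious) / len(ious)


# Hypothetical 2x3 label maps with three classes (not data from the paper).
truth = [[0, 0, 1], [0, 2, 2]]
prediction = [[0, 1, 1], [0, 2, 0]]

accuracy = overall_pixel_accuracy(truth, prediction, num_classes=3)
miou = mean_iou(truth, prediction, num_classes=3)
print(f"overall pixel accuracy = {accuracy:.3f}, mean IoU = {miou:.3f}")
```

Overall pixel accuracy is dominated by large classes such as background, whereas mean IoU weights every class equally, which is one reason both are reported side by side.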
Figure 7 shows the segmentation results of HR-PAI and LR-PAI on a single image. The segmentation of the HR image was clearly better than that of the LR image. The segmented labels of the HR image were highly consistent with the ground truth labels and had clear morphological edges, reflecting high segmentation accuracy. In contrast, the segmented labels of the LR image showed obvious pixel confusion between labels and blurred edges, and did not correctly capture the subtle features in the image.
Table 5 shows the segmentation results of HR-PAI and LR-PAI on the test set. The overall segmentation of the HR images was significantly better than that of the LR images. For the HR images, the overall pixel accuracy reached 0.992, the average pixel accuracy was 0.903, the mean IoU was 0.856, the weighted IoU was 0.984, and the BFScore was 0.880, whereas the corresponding metrics for the LR images were 0.986, 0.834, 0.806, 0.954, and 0.828, respectively. As seen in Table 6, the HR images had better pixel accuracy, IoU, and BFScore than the LR images on all organ types, and the average Dice coefficient was correspondingly higher. Table 7 shows the results of the significant difference analysis of the Dice coefficients of each organ segmentation for HR-PAI and LR-PAI. The p-values calculated by all the analysis methods were smaller than the significance level α = 0.05, which indicates significant differences between HR and LR images in the segmentation of different organs. Figure 8 shows the Dice scores for each organ category on the test set; the Dice scores of the HR images were higher overall than those of the LR images. This indicates that the segmentation results of the HR images were more accurate and finer at the organ level and matched the ground truth labels more closely, while the segmentation results of the LR images scored lower and had obvious deficiencies. In summary, HR images showed obvious advantages in the abdominal organ segmentation task, with more accurate and clearer segmentation results, which provides important support and guidance for medical diagnosis and research based on PAI.
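The relative gains of HR-PAI over LR-PAI can be checked directly from the Table 5 values quoted above. The short sketch below assumes the improvement is defined as (HR − LR) / LR; the function name `relative_improvement` is illustrative.

```python
def relative_improvement(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Values quoted in the text from Table 5, as (LR-PAI, HR-PAI) pairs.
table5 = {
    "average pixel accuracy": (0.834, 0.903),
    "mean IoU": (0.806, 0.856),
    "BFScore": (0.828, 0.880),
}

improvements = {
    name: relative_improvement(lr, hr) for name, (lr, hr) in table5.items()
}
for name, value in improvements.items():
    print(f"{name}: {value:.2f}%")
```

Under this definition the three values come out at 8.27%, 6.20%, and 6.28%, which agrees with the segmentation improvements reported in the paper.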
Figure 9 shows the segmentation results of HR-PAI on multiple networks. The segmentation quality varied significantly between networks: FCN had obvious deficiencies in segmenting edges and morphology, while SegNet could not distinguish multiple types of organs well. U-Net segmented better, but there was pixel confusion, and some regions were not segmented accurately. DeepLabv3+, on the other hand, produced the best segmentation, highly similar to the ground truth labels and with clear morphological edges.
Table 8 lists the segmentation evaluation metrics of HR-PAI on the test set. DeepLabv3+ achieved the best results in five metrics, namely overall pixel accuracy, average pixel accuracy, mean IoU, weighted IoU, and mean BFScore, which reached 0.992, 0.903, 0.856, 0.984, and 0.880, respectively. Table 9 and Figure 10 show the Dice coefficients of the segmentation results for each network. The SegNet box in the box plot touches 0 because SegNet could not distinguish multiple types of organs well in the segmentation experiment, so the Dice coefficient of other types of organs was 0. The Dice coefficients of DeepLabv3+ were significantly higher than those of the other networks, indicating that it achieved the best results in the segmentation of the different organs. Table 10 shows the results of the significant difference analysis between DeepLabv3+ and the other segmentation methods for the Dice coefficients of the various organs in HR-PAI. The t-test results for DeepLabv3+ and U-Net on the liver and right kidney showed larger p-values. This could be because U-Net segmented these particular organs relatively well, so the difference from DeepLabv3+ was less significant. Overall, however, the p-values calculated by the other analysis methods were much smaller than the significance level α = 0.05, which indicates significant differences between DeepLabv3+ and the other methods. In summary, DeepLabv3+ not only had high accuracy and reliability in HR-PAI segmentation but also better captured and preserved image details, providing a better segmentation tool for clinical applications.
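The observation that a network which cannot separate organ types yields Dice coefficients of 0 follows directly from the definition of the Dice coefficient. The sketch below illustrates this with hypothetical label maps in which one organ class is never predicted; the function name `dice_per_class` and the toy data are assumptions for illustration and are not SegNet outputs from the paper.

```python
def dice_per_class(truth, prediction, num_classes):
    """Dice coefficient 2 * |A & B| / (|A| + |B|) for each class label."""
    intersection = [0] * num_classes
    truth_size = [0] * num_classes
    pred_size = [0] * num_classes
    for truth_row, pred_row in zip(truth, prediction):
        for t, p in zip(truth_row, pred_row):
            truth_size[t] += 1
            pred_size[p] += 1
            if t == p:
                intersection[t] += 1
    scores = []
    for c in range(num_classes):
        denominator = truth_size[c] + pred_size[c]
        # A class absent from both maps has no defined Dice; report None.
        scores.append(2 * intersection[c] / denominator if denominator else None)
    return scores


# Hypothetical 3x4 label maps: 0 = background, 1 and 2 = two organs.
truth = [
    [0, 1, 1, 0],
    [0, 1, 2, 2],
    [0, 0, 2, 2],
]
# A prediction that labels every organ pixel as class 1,
# so class 2 is never predicted.
collapsed = [
    [0, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]

scores = dice_per_class(truth, collapsed, num_classes=3)
print(scores)
```

Because the intersection for the missing class is empty, its Dice coefficient is exactly 0 regardless of how large the organ is, which pulls the lower end of a per-organ box plot down to 0.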

Discussion
In this paper, a new network architecture, EDSR-M, was proposed to address the unclear imaging of PAI that results from the imaging principle, to achieve high-quality super-resolution reconstruction of PAI, and to be combined with DeepLabv3+ for image segmentation. The effectiveness of the method was verified experimentally and compared with that of other reconstruction and segmentation methods. The reconstruction experiments show that EDSR-M outperformed the other reconstruction methods both numerically and visually, with average improvements of 11.02% in PSNR and 4.13% in SSIM, as well as superior performance in fine structure and texture reconstruction. The segmentation experiments show that the segmentation of HR-PAI reconstructed by this model was much better than that of LR-PAI: accuracy, mean IoU, and mean BFScore improved by 8.27%, 6.20%, and 6.28%, respectively. The LR-PAI segmentation results exhibited blurred edges and confused pixels, whereas the HR-PAI segmentation results had distinct edges and high accuracy. Furthermore, DeepLabv3+ performed well in PAI segmentation, with significantly higher accuracy, IoU, and BFScore than the other approaches. These results demonstrate the relevance of increasing PAI resolution for better segmentation outcomes.
In summary, deep learning-based reconstruction methods can effectively improve the reconstruction quality of PAI, which helps to enhance the accuracy and reliability of medical diagnosis. HR images yield more accurate segmentation labels, which can help doctors interpret PAI and provide more valuable information for clinical diagnosis. In future research, the method will be further validated on real PAI data and other medical images to improve the generalization ability and robustness of the algorithm and to promote its wide application in the field of medical imaging.

Figure 1. Schematic diagram of the deep-learning structure for PAI.

Figure 4. Photoacoustic image reconstruction results for different numbers of sensors: (a) 32 sensors, (b) 64 sensors, and (c) 128 sensors.

Figure 6. Comparison of super-resolution reconstruction performance of PAI by different methods (128 sensors). (a) PSNR of different methods' reconstruction results; (b) SSIM of different methods' reconstruction results.

Figure 10. The Dice coefficients of each organ in HR-PAI by different segmentation methods.

Table 1. The main hyperparameters of the network.

Table 2. Average PSNR and SSIM of reconstruction results with different numbers of sensors.

Table 4. Analysis of significant differences between EDSR-M and other reconstruction methods in PSNR and SSIM (128 sensors).

Table 5. Overall segmentation evaluation indexes for HR-PAI and LR-PAI.

Table 6. Evaluation indexes for segmentation of each organ in HR-PAI and LR-PAI.

Table 7. Analysis of significant differences in Dice coefficients of organ segmentation between HR-PAI and LR-PAI.
* MANOVA analyzed the Dice coefficients of each organ as a whole; significance level α = 0.05.

Table 8. Segmentation evaluation indexes of different segmentation methods in HR-PAI.

Table 9. The average Dice coefficients of each organ in HR-PAI by different segmentation methods.