Challenges of 3D Surface Reconstruction in Capsule Endoscopy

Essential for improving the accuracy and reliability of bowel cancer screening, three-dimensional (3D) surface reconstruction using capsule endoscopy (CE) images remains challenging due to CE hardware and software limitations. This report generally focuses on challenges associated with 3D visualization and specifically investigates the impact of the indeterminate selection of the angle of the line–of–sight on 3D surfaces. Furthermore, it demonstrates that impact through 3D surfaces viewed at the same azimuth angles and different elevation angles of the line–of–sight. The report concludes that 3D printing of reconstructed 3D surfaces can potentially overcome line–of–sight indeterminate selection and 2D screen visual restriction-related errors.


Introduction
Capsule endoscopy (CE) is the newest and most patient-friendly endoscopic solution to gastrointestinal (GI) tract screening, particularly bowel cancer screening. To improve the CE-based screening process, the accurate and reliable evaluation of bowel pathologies can be facilitated by enhanced visualization of three-dimensional (3D) bowel surfaces. In CE, 3D visualization can be made possible by 3D reconstruction, which involves creating a 3D model of an object or scene from two-dimensional (2D) images or sensor data. One prominent approach for 3D reconstruction is the utilization of shape-from-shading algorithms [1,2]. Shape-from-shading algorithms, including Tsai's, Ciuti's, Barron's, and Torreao's, have been used to generate accurate 3D models [3]. Researchers have successfully applied shape-from-shading to represent the GI tract surface using 2D CE images [2]. Near-source perspective shape-from-shading enables precise 3D reconstructions of mucosal tissues [4]. Combining image stitching and shape-from-shading techniques generates comprehensive 3D maps [5]. Epipolar geometry enhances accuracy by constraining matching feature points for a more reliable 3D view [6].
However, despite these efforts, there are still several challenges and limitations that complicate the realization of 3D surface reconstruction in CE. For example, the CE hardware limitations and associated challenges make it infeasible to produce traditional 3D imaging, thus making 3D reconstruction from 2D images the only option in CE [7,8]. Specifically, operational and packaging-related challenges of the pill-cam or capsule endoscope [8,9] affect the traditional CE imaging procedure. On top of that, the GI environment is dark, and the natural peristalsis decides which lumen and mucosal surface to be imaged before viewing by gastroenterologists in a circular and monocular view [8,10], thus making the pathological evaluation efforts inaccurate or unreliable to some extent [3]. Another example is related to software limitations and associated challenges that make it difficult to accurately and reliably evaluate pathologies, such as user interface and interaction-based 3D visualization, the imprecise 3D mapping or inaccuracy of current techniques used for reconstruction of 3D surfaces from 2D images or frames [1][2][3][4][5][7][8][9][10][11].
In this report, the focus is on challenges associated with 3D visualization in CE. Specifically, the impact of the indeterminate selection of the angle of the line-of-sight for meaningfully visualizing the content of the reconstructed 3D surfaces from 2D images is demonstrated and discussed. Figure 1a shows the line of sight in the 3D view context. This line starts at the center of the plot and points toward the camera or eye. As can be seen, two angles, the azimuth and the elevation, are the pillars of the line of sight. In this context, it can be understood that larger and noise-free images would be the key to achieving a better view of image objects' details before further processing. Therefore, preprocessing operations, such as image upscaling (via interpolation) and/or image filtering (via outlier removal), can help to leverage CE image quality in this direction. It is important to note that a particular emphasis was put on exposing and exploring the impact of the indeterminate selection of the angle of the line-of-sight for meaningfully visualizing the content of the reconstructed 3D surfaces from 2D images.
In this report, the focus is on challenges associated with 3D visualization in CE. Specifically, the impact of the indeterminate selection of the angle of the line-of-sight for meaningfully visualizing the content of the reconstructed 3D surfaces from 2D images is demonstrated and discussed. Figure 1a shows the line of sight in the 3D view context. This line starts at the center of the plot and points toward the camera or eye. As can be seen, two angles, the azimuth and the elevation, are the pillars of the line of sight. In this context, it can be understood that larger and noise-free images would be the key to achieving a better view of image objects' details before further processing. Therefore, preprocessing operations, such as image upscaling (via interpolation) and/or image filtering (via outlier removal), can help to leverage CE image quality in this direction. It is important to note that a particular emphasis was put on exposing and exploring the impact of the indeterminate selection of the angle of the line-of-sight for meaningfully visualizing the content of the reconstructed 3D surfaces from 2D images.

Pre-Processing
Traditional pre-processing methods may include techniques for automatic processing or analysis purposes [12][13][14]. In this paper, we improved the quality of CE images for 3D surface reconstruction by preprocessing them with upscaling via interpolation and filtering via outlier removal.

Interpolation
Interpolation is a widely used method in many fields to construct a new data value within the range of a set of known data [15][16][17]. This mathematical method pervades many applications in computer science and beyond. It enables us to obtain a high-resolution image from its low-resolution version [18]. In addition, image interpolation is practiced in improved definition television (IDTV) receiver design, photograph zooming and remote sensing [19]. Besides this, it is also applied in medical imaging, computer graphics, satellite imagery and in various other fields [15][16][17][18][19][20][21][22][23].
The author's prior studies generally demonstrated the performance of image interpolation algorithms in terms of effectiveness and efficiency [17,18]. Similary, other researchers demonstrated the effects of interpolation on the visual quality of digitally resized images [19][20][21].
In this work, the Lanczos interpolation method, referenced in [22], was used for image upscaling purposes. It is important to note that the Lanczos interpolation is based on the 3-lobed Lanczos window function as the interpolation function [22]. Given that Lanczos interpolation generally proved to lead to better outcomes than most interpolation

Pre-Processing
Traditional pre-processing methods may include techniques for automatic processing or analysis purposes [12][13][14]. In this paper, we improved the quality of CE images for 3D surface reconstruction by preprocessing them with upscaling via interpolation and filtering via outlier removal.

Interpolation
Interpolation is a widely used method in many fields to construct a new data value within the range of a set of known data [15][16][17]. This mathematical method pervades many applications in computer science and beyond. It enables us to obtain a high-resolution image from its low-resolution version [18]. In addition, image interpolation is practiced in improved definition television (IDTV) receiver design, photograph zooming and remote sensing [19]. Besides this, it is also applied in medical imaging, computer graphics, satellite imagery and in various other fields [15][16][17][18][19][20][21][22][23].
The author's prior studies generally demonstrated the performance of image interpolation algorithms in terms of effectiveness and efficiency [17,18]. Similary, other researchers demonstrated the effects of interpolation on the visual quality of digitally resized images [19][20][21].
In this work, the Lanczos interpolation method, referenced in [22], was used for image upscaling purposes. It is important to note that the Lanczos interpolation is based on the 3-lobed Lanczos window function as the interpolation function [22]. Given that Lanczos interpolation generally proved to lead to better outcomes than most interpolation methods, currently available in commercial software, it was therefore chosen over others to double the size of the input CE image before further processing.

Filtering
Image filtering is the process of modifying an image to block or pass a particular set of frequency components [24]. There are many image-filtering techniques in the current literature, some of which have been specifically developed to remove outliers in digital images [24][25][26][27].
In this work, the simplest filtering procedure adopted includes rescaling image pixels and filtering using the 2D convolution kernel. Normally, the rescaling function scales the range of array elements to the desired interval. The desired interval is normally characterized by lower and upper bounds. The upper and lower bounds were determined using the mean and standard deviation of a given input CE image. The 2D convolution function was used with the convolution kernel size equal to 3 × 3 to filter the rescaled image. More details on 2D convolution using the kernel size 3 × 3 are provided in [28].

Dataset
Our experimental dataset comprised five CE images (size 360 × 360 × 3) that were captured using the PillCam COLON. Note that these images were previously downloaded for our previous work, as presented in [29] from the capsule endoscopy database for medical decision support [30].

Single Image 3D Reconstruction
MATLAB's 3D-colored surface function was used to plot the colored parametric surface defined by four matrix arguments X, Y, Z, and C. The lengths of interpolated images were used to create the row and column vectors needed by the meshgrid function to return the 2D grid coordinates, X and Y. The range of the Z argument was determined by the interpolated grayscale image, while the color scaling was determined by the range of C. Here, C was without the black background of the input image. This was achieved by first splitting the RGB color channels and extracting the mask, as well as computing its complement. The complement was separately added to each channel before concatenation. The shading model was determined by MATLAB's shading function. Figure 1b briefly illustrates the simplified 3D reconstruction steps from a single 2D CE image. Figure 2 shows three main columns, mainly (a), (b-c), and (d-e). The (a) column shows original CE images. Knowing whether these CE images contained bowel diseases was out of the scope of this work. Results presented in Figure 2 focused on demonstrating the need for determinate selection of the line-of-sight to better view 3D structures that contain these images. As can be seen, the (b) and (c) columns showed images that had 3D surfaces good and relevant enough to allow gastroenterologists to see the contents of 3D versions extracted from 2D CE images. However, the (d) and (e) columns showed images in which the structural contents were difficult to understand or find their relevance to the input images contents, showed in column (a). The reason for the lack of relevance of 3D surfaces was due to the elevation angle selected for images shown in columns (d) and (e). Here the EL = 0 • while for columns (b) and (c), the EL = −80 • . In both cases, the AZ = 0 • . This demonstrated that, if not carefully selected, the angles of the line-of-sight could negatively affect the meaningfulness of the reconstructed 3D surfaces. In this context, a potential and promising solution would be to have reconstructed 3D surfaces printed in 3D objects to allow medical experts or gastroenterologists to directly observe them without 2D computer screen restrictions or related errors. Now, considering each of the five columns separately, it could be seen that (b) and (d) columns contained original or non-preprocessed images while (c) and (e) contained preprocessed images. Comparing the images in columns (b) and (c), as well as in columns (d) and (e), the images looked almost the same way (unless one zoomed in-and in such a case, it would be possible to notice differences in terms of smoothness of edges). In this way, the preprocessing did not significantly improve the quality of CE images, thus introducing the need for further research in this preprocessing direction. preprocessed images while (c) and (e) contained preprocessed images. Comparing the images in columns (b) and (c), as well as in columns (d) and (e), the images looked almost the same way (unless one zoomed in-and in such a case, it would be possible to notice differences in terms of smoothness of edges). In this way, the preprocessing did not significantly improve the quality of CE images, thus introducing the need for further research in this preprocessing direction.

Discussions
The importance of 3D reconstruction in various aspects of capsule endoscopy imaging has been well-established, as reported by several works [1,2,[4][5][6][7][8][9][10]16,31,32]. For example, these works highlight the benefits of 3D reconstruction in tasks such as characterizing subepithelial tumors [31], accurate measurements [32], enhanced lesion visualization [7], and promising results for polypoid structures and angioectasias [8]. In addition, according to authors in [2,6], the 3D reconstruction could provide clear surface recovery and improve the perception of the gastrointestinal (GI) tract. However, despite the significance of 3D reconstruction, there is a lack of focus on 3D software user interface-related challenges, such as 3D visualization-related, in existing works. Specifically, none of these works examined the effects of the irrelevance of 3D surfaces when the elevation angle

Discussions
The importance of 3D reconstruction in various aspects of capsule endoscopy imaging has been well-established, as reported by several works [1,2,[4][5][6][7][8][9][10]16,31,32]. For example, these works highlight the benefits of 3D reconstruction in tasks such as characterizing subepithelial tumors [31], accurate measurements [32], enhanced lesion visualization [7], and promising results for polypoid structures and angioectasias [8]. In addition, according to authors in [2,6], the 3D reconstruction could provide clear surface recovery and improve the perception of the gastrointestinal (GI) tract. However, despite the significance of 3D reconstruction, there is a lack of focus on 3D software user interface-related challenges, such as 3D visualization-related, in existing works. Specifically, none of these works examined the effects of the irrelevance of 3D surfaces when the elevation angle and/or angle of the line of sight was selected indeterminately-which is why this report focused on demonstrating that the indeterminate selection of such an angle could negatively affect the meaningfulness of the reconstructed 3D surfaces. Although some endoscopists reported improved or non-improved visualization when referring to original 2D images in [7], the authors did not mention whether they encountered any challenges related to the 3D visualization of the reconstructed surfaces via user interfaces. Here, the author's main objective was to explore the accuracy of 3D reconstruction using innovative software and assess whether it led to enhanced lesion visualization in small bowel CE. In [8], authors noted the presence of highlights caused by lights reflected at various angles, which could potentially provide false information about the shape of the reconstructed surface. However, there was no further mention of the angle of view or the possibility that an indeterminate selection of the angle of the line of sight could lead to more highlights. This highlights the need for careful consideration of the angle of the line of sight or view when viewing reconstructed 3D surfaces, emphasizing the importance of a determinate selection of the angle, as demonstrated in this work. Another work [4] did not discuss the challenges related to the 3D software user interface or 3D visualization of reconstructed surfaces. Instead, the authors focused on other tasks involved in achieving 3D reconstructions of surfaces of interest. Similarly, in work [31], authors provided examples of 3D reconstructions from 2D images, but the view angles of these 3D surfaces were not defined or mentioned. Despite the lack of angle definition, the authors referred to other works to conclude that, at some percentage rate (less than 100%), the 3D versions presented enhanced visualization features compared to their 2D counterparts. This suggests that the lack of achieving 100% enhancement of visualization features can be attributed to the overlooking of 3D visualization challenges, particularly the failure to address the importance of a determinate line of sight. In [32], authors did not assess the effects of 3D visualization using the MiroCam MC4000 but instead evaluated its reliability in reconstructing 3D images and accurately calculating lesion size within a phantom model. The authors highlighted that the MiroCam MC4000 utilizes stereo-matching technology to enable the reconstruction of selected images in a 3D format for size calculation. They concluded that the estimated measurements highly correlated with the known sizes, showcasing the capabilities of this novel capsule. However, similar to previous cases, the authors overlooked the challenges of 3D visualization and instead focused on developing a method to reconstruct the 3D texture surface of the GI tract using a single CE image and the Shape from Shading technique [2]. In [6], authors acknowledged the need for a realistic and user-friendly 3D view to assist physicians in better viewing or observing the GI tract. However, they did not mention the visualization challenges associated with achieving this desired 3D view, particularly the importance of determining the angle of view or angle of the line of sight. In brief, the lack of work reporting on the 3D visualization challenges related to the 3D software user interface has led to the potential for demonstrating and reporting on these challenges. The impact of an indeterminate line-of-sight on 3D reconstructed surfaces has been evaluated. As a result, highlights the need for further consideration of the 3D software user interface challenges, particularly the angle of the line of sight, for optimal 3D visualization of capsule endoscopy imaging data.

Conclusions
In brief, this report sheds light on the specific challenge associated with meaningfully viewing the content of reconstructed 3D surfaces from CE. Preliminary results were presented to mainly demonstrate the extent to which the indeterminate selection of the line-of-sight could affect the 3D reconstruction-based analysis in CE. The report exposed and explored the potential to overcome the line-of-sight indeterminate selection challenge and suggested 3D printing of reconstructed 3D surfaces solution for the determinate selection of the line-of-sight and improving 3D visualization-based bowel cancer screening outcomes. Further research could extend this report's findings.

Future Perspectives of 3D Surface Reconstruction in CE
In brief, future perspectives can encompass exploring the potential of 3D printing, which would allow for determinate line-of-sight selection or leveraged 3D visualization, improving 3D user interfaces and visualization tools, and further investigating the impact of indeterminate line-of-sight on reconstructed surfaces. These three directions can contribute to enhancing the clinical utility and effectiveness of 3D reconstruction in CE.