A Brief Review of Some Interesting Mars Rover Image Enhancement Projects

Abstract: The Curiosity rover has been operating on Mars since 2012. One of the instruments onboard the rover is a pair of multispectral cameras known as Mastcams, which act as the eyes of the rover. In this paper, we summarize our recent studies on several image processing projects for Mastcams. In particular, we address perceptually lossless compression of Mastcam images, debayering and resolution enhancement of Mastcam images, high resolution stereo and disparity map generation using fused Mastcam images, and improved performance of anomaly detection and pixel clustering using combined left and right Mastcam images. The main goal of this review paper is to raise public awareness of these Mastcam projects and to stimulate interest in the research community to further develop new algorithms for those applications.


Introduction
NASA has sent several rovers to Mars over the past few decades. The Sojourner rover landed on 4 July 1997. It operated only briefly because its communication link was lost after two months. Sojourner traveled slightly more than 100 m. Spirit, also known as the Mars Exploration Rover ( This paper will focus on the Curiosity rover. Onboard the Curiosity rover, there are a few important instruments. The laser induced breakdown spectroscopy (LIBS) instrument, ChemCam, performs rock composition analysis from distances as far as seven meters [2]. Another type of instrument is the mast cameras (Mastcams). There are two Mastcams [3]. Each camera has nine bands, six of which overlap between the two cameras. The wavelengths range from the blue (445 nm) to the short-wave near-infrared (1012 nm).
The Mastcams can be seen in Figure 1. The right imager has three times better resolution than the left. As a result, the left camera is usually used for short-range image collection and the right for far-field data collection. The various bands of the two Mastcams are shown in Table 1 and Figure 2. There are a total of nine bands in each Mastcam. One can see that six of the bands, including the RGB bands, overlap between the left and right cameras while the remaining bands do not, meaning that it is possible to generate a 12-band data cube by fusing the left and right bands. The dotted curves in Figure 2 are known as the "broadband near-IR cutoff filter", which has a filter bandwidth (3 dB) of 502 to 678 nm. Its purpose is to help the Bayer filter in the camera [3]. In a later section, the 12-band cube is used for accurate data clustering and anomaly detection.
Figure 1. The Mars rover Curiosity and its onboard instruments [4]. Mastcams are located just below the white box near the top of the mast.
Table 1. Filters and wavelengths of the left and right Mastcams [4]. There are nine bands with six overlapping bands in each camera.


The Left Mastcam           The Right Mastcam
Filter   Wavelength (nm)   Filter   Wavelength (nm)
L2       445               R2       447
L0B      495               R0B      493
L1       527               R1       527
L0G      554               R0G      551
L0R      640               R0R      638
L4       676               R3       805
L3       751               R4       908
L5       867               R5       937
L6       1012              R6       1013
Figure 2. Spectral response curves for the left eye (top panel) and the right eye (bottom panel) [5].
The objective of this paper is to briefly review some recent studies done by our team for Mastcam. First, we review our work on perceptually lossless compression of Mastcam images. The motivation of this study was to demonstrate that, with the help of recent compression technologies, it is plausible to adopt perceptually lossless compression (ten to one compression) instead of lossless compression (three to one compression) for NASA's Mastcam images. This would reduce the bandwidth needed between Mars and Earth by a factor of roughly three. Second, we review our recent study on debayering for Mastcam images. The Mastcam is still using a debayering algorithm developed in 2004. Our study shows that some recent debayering algorithms can achieve better artifact reduction and enhanced image quality. Third, we review our work on image enhancement for the left Mastcam images. Both conventional and deep learning approaches were studied. Fourth, we review our past work on stereo imaging and disparity map generation for Mastcam images. The approach was to combine left and right images for stereo imaging. Fifth, we summarize our study on fusing both Mastcam images to enhance the performance of data clustering and anomaly detection. Finally, we conclude the paper and discuss some future research opportunities, including Mastcam-Z, which is the new Mastcam imager onboard the Perseverance rover, as well as image enhancement and stereo imaging by combining left and right Mastcam images.
We would like to emphasize that one key goal of our paper is to publicize some interesting projects related to Mastcam on the Curiosity rover. We hope this will stimulate interest from the research community to look into these projects and perhaps develop new algorithms that improve the state-of-the-art. Our team worked with the NASA Jet Propulsion Laboratory (JPL) and two other universities on the Mastcam project for more than five years. Few researchers know that NASA has archived Mastcam images as well as data acquired by quite a few other instruments (LIBS, Alpha Particle X-Ray Spectrometer (APXS), etc.) onboard the Mars rover Curiosity. The database is known as the Planetary Data System (PDS) (https://pds.nasa.gov/ accessed on 6 September 2021). All these datasets are available to the public free of charge. If researchers are interested in applying new algorithms to demosaic the Mastcam images, there are millions of images available. Another objective of our review paper is to summarize some preliminary algorithm improvements in five applications so that interested researchers can use this review paper alone to learn about the state-of-the-art algorithms for processing Mastcam images.
The NASA Mastcam projects are very specific applications, and few people are aware of them. For all of the five applications, NASA JPL first implemented some baseline algorithms, and our team was the next one to continue the investigations. To the best of our knowledge, no one else has performed detailed investigations in these areas. For instance, in the demosaicing of Mastcam images, NASA used the Malvar-He-Cutler algorithm, which was developed in 2004. Since then, there have been tremendous developments in demosaicing. We worked with NASA JPL to compare a number of conventional and deep learning demosaicing algorithms and eventually convinced NASA that it is probably time to adopt newer algorithms. For the image compression project, NASA is still using the JPEG standard, which was developed in the 1990s. We performed thorough comparative studies and advocated the importance of using perceptually lossless compression. To the best of our knowledge, the fusion of left and right Mastcam images had not been done before. Similarly, for anomaly detection and image enhancement, we are not aware of any other team working in this area.

Perceptually Lossless Compression for Mastcam Images
NASA still compresses the Mastcam images without loss using JPEG, which is a technology developed around 1990 [6]. JPEG is computationally efficient. However, it can achieve a compression ratio of at most about three to one in the lossless compression mode. In the past two decades, new compression standards, including JPEG-2000 (J2K) [7], X264 [8], and X265 [9], were developed. X264 and X265 are video codecs that can also compress still images. Lossless compression options are also present in these codecs.
In addition to the above codecs, some researchers developed the lapped transform (LT) [10] and incorporated it into a new codec known as Daala [11] in recent years. Daala can compress both still images and videos. A lossless option is also present.
The objective of our recent study [12] was to perform thorough comparative studies and advocate the importance of using perceptually lossless compression for NASA's missions. In particular, we evaluated five image codecs: Daala, X265, X264, J2K, and JPEG. The goal was to investigate which of these codecs can attain a 10:1 compression ratio while remaining perceptually lossless. We emphasize that suitable metrics are required to quantify perceptual performance. In the past, researchers have found that peak signal-to-noise ratio (PSNR) and structural similarity (SSIM), two popular and widely used metrics, do not correlate well with human subjective evaluations. In recent years, metrics known as human visual system (HVS) and HVS with masking (HVSm) [13] were developed. For Mastcam images, HVS and HVSm were adopted in our compression studies. We could have used the CIELab metric too, but did not do so because we wanted to compare with other existing compression methods in the literature, which only used PSNR, SSIM, HVS, and HVSm. Moreover, we also evaluated the decompressed RGB Mastcam images using subjective assessment. We noticed that perceptually lossless compression can be attained even at 20 to 1 compression. At ten to one compression, the objective metrics of HVS and HVSm for Daala are 5 to 10 dB higher than those of JPEG.
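To make the quantities in this comparison concrete, the following minimal sketch computes a compression ratio and PSNR for a toy lossy step. It is only an illustration: the uniform quantizer and the use of zlib as the entropy coder are stand-ins chosen for this example, not any of the five codecs evaluated in [12], and HVS/HVSm are not implemented here.

```python
import zlib
import numpy as np

def psnr(reference, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    ref = np.asarray(reference, dtype=np.float64)
    dec = np.asarray(decoded, dtype=np.float64)
    mse = np.mean((ref - dec) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

def compression_ratio(raw_bytes, compressed_bytes):
    """Ratio of original size to compressed size (10.0 means 10:1)."""
    return len(raw_bytes) / len(compressed_bytes)

# Toy example: a smooth 64x64 ramp image (values 0..126).
image = np.add.outer(np.arange(64), np.arange(64)).astype(np.uint8)

# Stand-in lossy step (assumption): quantize to 16 gray levels per bin.
quantized = (image // 16) * 16 + 8

# Stand-in entropy coder (assumption): zlib, not a Mastcam codec.
ratio = compression_ratio(image.tobytes(), zlib.compress(quantized.tobytes()))
quality = psnr(image, quantized)
print(f"compression ratio {ratio:.1f}:1, PSNR {quality:.1f} dB")
```

In a real evaluation, the quantizer and zlib would be replaced by an actual codec, and the ratio would be swept (for example 10:1, 20:1, 40:1) while recording the metrics at each point.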
Our findings are as follows. Details can be found in [12].
• Comparison of different approaches: For the nine-band multispectral Mastcam images, we compared several approaches (principal component analysis (PCA), split band (SB), video, and two-step). It was observed that the SB approach performed better than others using actual Mastcam images.
• Codec comparisons: In each approach, five codecs were evaluated. In terms of the objective metrics (HVS and HVSm), Daala yielded the best performance amongst the various codecs. At ten to one compression, more than 5 dB of improvement was observed by using Daala as compared to JPEG, which is the default codec used by NASA.
• Computational complexity: Daala uses discrete cosine transform (DCT) and is more amenable to parallel processing. J2K is based on wavelets, which require the whole image as input. Although X265 and X264 are also based on DCT, they did not perform well at ten to one compression in our experiments.
• Subjective comparisons: Using visual inspections on RGB images, it was observed that at 10:1 and 20:1 compression, all codecs show almost no visible loss. However, at higher compression ratios such as 40 to 1, there are noticeable color distortions and block artifacts in JPEG, X264, and X265. In contrast, we still observe good compression performance in Daala and J2K even at 40:1 compression.

Debayering for Mastcam Images
The nine bands in each Mastcam camera include RGB bands. Unlike the other bands, the RGB bands are collected by using a Bayer pattern filter, which was introduced in 1976 [14]. In the past few decades, many debayering algorithms were developed [15][16][17][18][19]. NASA still uses the Malvar-He-Cutler (MHC) algorithm [20] to demosaic the RGB Mastcam images. Although MHC was developed in 2004, it is an efficient algorithm that can be easily implemented in the camera's control electronics. In [3], another algorithm known as the directional linear minimum mean square-error estimation (DLMMSE) [21] was also compared against the MHC algorithm.
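For readers unfamiliar with debayering, the sketch below shows the simplest baseline, bilinear interpolation, on a Bayer mosaic. It is a generic illustration, not the MHC or DLMMSE algorithm, and the RGGB layout is an assumption made for this example (the actual Mastcam pattern layout is not stated here).

```python
import numpy as np

def bayer_masks(height, width):
    """Boolean sampling masks (red, green, blue) for an assumed RGGB layout."""
    rows, cols = np.indices((height, width))
    red = (rows % 2 == 0) & (cols % 2 == 0)
    blue = (rows % 2 == 1) & (cols % 2 == 1)
    green = ~(red | blue)
    return red, green, blue

def mosaic_from_rgb(rgb):
    """Simulate a Bayer mosaic by keeping one color sample per pixel."""
    h, w, _ = rgb.shape
    mosaic = np.zeros((h, w), dtype=np.float64)
    for channel, mask in enumerate(bayer_masks(h, w)):
        mosaic[mask] = rgb[..., channel][mask]
    return mosaic

def weighted_sum_3x3(img):
    """3x3 convolution with weights [[1,2,1],[2,4,2],[1,2,1]], reflect padding."""
    h, w = img.shape
    padded = np.pad(img, 1, mode="reflect")  # reflect keeps the Bayer parity
    weights = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype=np.float64)
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += weights[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def demosaic_bilinear(mosaic):
    """Fill in missing color samples by averaging the nearest known samples."""
    mosaic = np.asarray(mosaic, dtype=np.float64)
    h, w = mosaic.shape
    rgb = np.zeros((h, w, 3), dtype=np.float64)
    for channel, mask in enumerate(bayer_masks(h, w)):
        m = mask.astype(np.float64)
        # Normalized convolution: weighted sum of known samples / sum of weights.
        estimate = weighted_sum_3x3(mosaic * m) / weighted_sum_3x3(m)
        rgb[..., channel] = np.where(mask, mosaic, estimate)
    return rgb
```

Bilinear interpolation treats each color plane independently, which is why it tends to produce the color distortions and zipper artifacts that more advanced methods try to suppress by exploiting inter-channel correlation.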
Deep learning has gained popularity since 2012. In [22], a joint demosaicing and denoising algorithm was proposed. For the sake of easy referencing, this algorithm can be called DEMOsaic-Net (DEMONET). Two other deep learning-based algorithms for demosaicing [23,24] have been identified as well. The objective of our recent work [4] is to compare a number of conventional and deep learning demosaicing algorithms and eventually convince NASA that it is probably time to adopt newer algorithms.
We have several observations on our Mastcam image demosaicing experiments. First, we observe that the MHC algorithm still generated reasonable performance in Mastcam images even though some recent ones yielded better performance. Second, we observe that some deep learning algorithms did not always perform well. Only DEMONET generated better performance than conventional methods. This shows that the performance of demosaicing algorithms depends on the application. Third, we observe that DEMONET performed better than others only for right Mastcam images. DEMONET has comparable performance to a method known as exploitation of color correlation (ECC) [31] for the left Mastcam images.
Because there are no ground truth demosaiced images, we adopted an objective blind image quality assessment metric known as natural image quality evaluator (NIQE). Lower NIQE scores mean better performance. Figure 3 shows the NIQE metrics of various methods. One can see that ECC and DEMONET have better performance than others. From Figure 4, we see obvious color distortions in the demosaiced images using bilinear, MHC, AP, LT, LDI-NAT, F3, and ATMF. One can also see strong zipper artifacts in the images from AFD, AP, DLMMSE, PCSD, LDI-NAT, F3, and ATMF. There are slight color distortions in the results of ECC and MLRI. Finally, we can observe that the images of DEMONET, ARI, DRL, and SEM are more perceptually pleasing than others.

Mastcam Image Enhancement
The left Mastcam images have three times lower resolution than the right ones. We have tried to improve the spatial resolution of the left images so that left and right images may be fused for applications such as anomaly detection. To the best of our knowledge, no one, including NASA, has done this work before. Here, we summarize two approaches that we have tried. The first one is based on image deconvolution, which is a standard technique in image restoration. The second one is to apply deep learning algorithms.

Model Based Enhancement
In [35], we presented an algorithm to improve the left Mastcam images. There are two steps in our approach. First, a pair of left and right Mastcam bands is used to estimate the point spread function (PSF) using a sparsity-based approach. Second, the estimated PSF is applied to improve the other left bands. Preliminary results using real Mastcam images indicated that the enhancement performance is mixed. In some left images, improvements can be clearly seen, whereas in others the results were less satisfactory.

From Figure 5, we can clearly observe the sharpening effects of the deblurred image (i.e., Figure 5f) compared with the aligned left image (i.e., Figure 5e). The estimated kernel in Figure 5c was obtained using a pair of left and right green bands. We can see better enhancement in Figure 5 for the LR band. However, in some cases in [35], some performance degradations were observed.
The mixed results suggest a new direction for future research, which may involve deep learning techniques for PSF estimation and robust deblurring.
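The second step of the model-based approach, applying an estimated PSF to deblur a band, can be illustrated with a generic frequency-domain Wiener filter. This is a minimal sketch under stated assumptions: the PSF is taken as known (a hypothetical Gaussian kernel here), boundaries are treated as circular, and the filter is a textbook Wiener deconvolution rather than the sparsity-based method of [35].

```python
import numpy as np

def gaussian_psf(size=5, sigma=1.0):
    """Hypothetical PSF used only for this illustration."""
    ax = np.arange(size) - size // 2
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    psf = np.outer(g, g)
    return psf / psf.sum()

def psf_to_otf(psf, shape):
    """Zero-pad the PSF to the image shape, centre it at (0, 0), and take the FFT."""
    kh, kw = psf.shape
    padded = np.zeros(shape, dtype=np.float64)
    padded[:kh, :kw] = psf
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.fft.fft2(padded)

def blur(image, psf):
    """Circular convolution of the image with the PSF."""
    otf = psf_to_otf(psf, image.shape)
    return np.real(np.fft.ifft2(np.fft.fft2(image) * otf))

def wiener_deblur(blurred, psf, noise_to_signal=1e-3):
    """Wiener deconvolution; noise_to_signal regularizes weak frequencies."""
    otf = psf_to_otf(psf, blurred.shape)
    wiener = np.conj(otf) / (np.abs(otf) ** 2 + noise_to_signal)
    return np.real(np.fft.ifft2(np.fft.fft2(blurred) * wiener))
```

A kernel estimated from one left/right band pair could in principle be passed as `psf` for the remaining left bands. If the kernel is inaccurate, this kind of filter amplifies errors rather than detail, which is one plausible reason for mixed results of the kind reported above.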

Deep Learning Approach
Over the past two decades, a large number of papers have been published on pansharpening, which is the fusion of a high resolution (HR) panchromatic (pan) image with a low resolution (LR) multispectral image (MSI) [36][37][38][39][40]. Recently, we proposed an unsupervised network for image super-resolution (SR) of hyperspectral images (HSI) [41,42]. Similar to MSI, HSI has found many applications. The key features of our work in HSI include the following. First, our proposed algorithm extracts both the spectral and spatial information from LR HSI and HR MSI with two deep learning networks, which share the same decoder weights, as shown in Figure 6. Second, sum-to-one and sparsity are two physical constraints of HSI and MSI data representation. Third, our proposed algorithm directly addresses the challenge of spectral distortion by minimizing the angular difference of these representations. The proposed method is named the unsupervised sparse Dirichlet network (uSDN). Details of uSDN can be found in our recent work [43].
Computers 2021, 10, 111
Two benchmark datasets, CAVE [44] and Harvard [45], were used to evaluate the proposed uSDN. More details can be found in [41,42]. Here, we include results of applying uSDN to Mastcam images. As mentioned before, the right Mastcam has higher resolution than the left. Consequently, the right Mastcam images are treated as HR MSI and the left images are treated as LR HSI.
To generate objective metrics, we used the root mean squared error (RMSE) and spectral angle mapper (SAM), which are widely used in the image enhancement and pansharpening literature. Smaller values imply better performance. Figure 7 shows the images of our experiments. One can see that the reconstructed image is comparable to the ground truth. Here, we only compare the proposed method with coupled nonnegative matrix factorization (CNMF) [46], which has been considered a good algorithm. The results in Table 2 show that the proposed approach outperformed CNMF in both metrics.
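The two metrics can be written down compactly. The sketch below uses their standard definitions for image cubes of shape (height, width, bands); the function names and the choice to report SAM in degrees are assumptions of this example, not details taken from [41,42].

```python
import numpy as np

def rmse(reference, estimate):
    """Root mean squared error over all pixels and bands."""
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    return float(np.sqrt(np.mean((ref - est) ** 2)))

def sam_degrees(reference, estimate, eps=1e-12):
    """Mean spectral angle (in degrees) between per-pixel spectra."""
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    ref = ref.reshape(-1, ref.shape[-1])  # one row per pixel spectrum
    est = est.reshape(-1, est.shape[-1])
    dot = np.sum(ref * est, axis=1)
    norms = np.linalg.norm(ref, axis=1) * np.linalg.norm(est, axis=1)
    cosine = np.clip(dot / (norms + eps), -1.0, 1.0)
    return float(np.degrees(np.mean(np.arccos(cosine))))
```

RMSE is sensitive to brightness errors, whereas SAM only measures the change in spectral shape: scaling every spectrum by a constant changes RMSE but leaves SAM essentially at zero. Reporting both therefore separates radiometric error from spectral distortion.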

Stereo Imaging and Disparity Map Generation for Mastcam Images
In the past few years, there has been growing research on applying virtual reality and augmented reality tools to Mars rover missions [47][48][49]. For example, a software tool called OnSight was developed by NASA and Microsoft to enable scientists to virtually work on Mars using Microsoft HoloLens [50]. Mastcam images have been used by the OnSight software to create a 3D terrain model of Mars. The disparity maps extracted from stereo Mastcam images are important because they provide depth information. Some papers [51][52][53] proposed methods to estimate disparity maps using monocular images. Since the two Mastcam images do not have the same resolution, a generic disparity map estimation using the original Mastcam images may not exploit the full potential of the right Mastcam images, which have three times higher image resolution. It will be more beneficial to NASA and other users of Mastcam images if a high-resolution disparity map can be generated.
In [54], we introduced a processing framework that can generate high resolution disparity maps for the Mastcam image pairs. The low-resolution left camera image was improved and the impact of the image enhancement on the disparity map estimation was studied quantitatively. It should be noted that, in our earlier paper [55], we generated stereo images using the Mastcam instruments. However, no quantitative assessment of the impact of the image enhancement was carried out.
Three algorithms were used to improve left camera images. The bicubic interpolation [56] was used as the baseline technique. Another method [57] is an adaptation of the technique in [5] with pansharpening [57][58][59][60][61]. Recently, deep learning-based SR techniques [62][63][64] have been developed. We used the enhanced deep super resolution (EDSR) [65] as one representative deep learning-based algorithm in our experiments. It should be emphasized that no one, including NASA, has carried out any work related to this stereo generation effort. As a result, we do not have a baseline algorithm from NASA to compare with.
Here, we include some comparative results. From Figure 8, we observe that the image quality with EDSR and the pansharpening-based method is better than that of the original and bicubic images. Figure 9 shows the objective NIQE metrics for the various algorithms. Even though the pansharpening-based method provides the lowest NIQE values (best performance) and visually very appealing enhanced images, some pixel regions in the enhanced images do not seem to be well registered at the sub-pixel level. Since the NIQE metric does not take registration into consideration, it clearly favors the pansharpening-based method over others, as shown in Figure 9. Other objective metrics (RMSE, PSNR, and SSIM) were used in [54] to demonstrate that the EDSR algorithm performed better than other methods.

Figure 9. Natural image quality evaluator (NIQE) metric results for enhanced "original left Mastcam images" (scale: ×2) by the bicubic interpolation, pansharpening-based method, and EDSR.
Figure 10 shows the estimated disparity maps with the three image enhancement methods for Image Pair 6 in [54]. Figure 10a shows the estimated disparity map using the original left camera image. Figure 10b-d show the resultant disparity maps with the three methods. Figure 10e shows the mask used when computing the average absolute error values. According to our paper [54], the disparity map shown in Figure 10d has the best performance. More details can be found in [54].
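To illustrate what a disparity map is, the sketch below estimates integer disparities by block matching with a sum of absolute differences (SAD) cost. It is a generic textbook method, not the estimator used in [54], and it assumes that the two images are already rectified, have the same resolution, and that a point at column x in the left image appears at column x - d in the right image.

```python
import numpy as np

def box_sum(img, half_win):
    """Sum of values inside a (2*half_win+1)^2 window around each pixel."""
    h, w = img.shape
    padded = np.pad(img, half_win, mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(2 * half_win + 1):
        for dx in range(2 * half_win + 1):
            out += padded[dy:dy + h, dx:dx + w]
    return out

def disparity_block_matching(left, right, max_disp=8, half_win=2):
    """Per-pixel integer disparity that minimizes the windowed SAD cost."""
    left = np.asarray(left, dtype=np.float64)
    right = np.asarray(right, dtype=np.float64)
    h, w = left.shape
    best_cost = np.full((h, w), np.inf)
    disparity = np.zeros((h, w), dtype=np.int64)
    for d in range(max_disp + 1):
        # Shift the right image d columns to the right to test disparity d.
        shifted = np.empty_like(right)
        shifted[:, d:] = right[:, :w - d]
        if d > 0:
            shifted[:, :d] = right[:, :1]  # replicate the left border
        cost = box_sum(np.abs(left - shifted), half_win)
        better = cost < best_cost
        best_cost[better] = cost[better]
        disparity[better] = d
    return disparity
```

Because the cost is computed pixel by pixel, the disparity resolution is bounded by the lower-resolution image of the pair, which is one way to see why enhancing the left Mastcam image first can yield a finer disparity map.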

Anomaly Detection Using Mastcam Images
One important role of the Mastcam imagers is to help locate anomalous or interesting rocks so that the rover can approach them and collect samples for further analysis.
A two-step image alignment approach was introduced in [5]. The performance of the proposed approach was demonstrated using more than 100 pairs of Mastcam images, selected from over 500,000 images in NASA's PDS database. As detailed in [5], the fused images improved the performance of anomaly detection and pixel clustering applications. We would like to emphasize that this anomaly detection work was not done before by NASA and hence there is no baseline approach from NASA. Figure 11 illustrates the proposed two-step approach. The first step uses the RANSAC (random sample consensus) technique [66] for an initial image alignment, in which SURF features [67] and SIFT features [68] are matched within the image pair.

The second step uses the diffeomorphic registration [69] technique to refine the alignment. We observed that the second step achieves subpixel alignment performance. After the alignment, we can then perform anomaly detection and pixel clustering with the constructed multispectral image cubes.
Figure 11. A two-step image alignment approach to registering left and right images.
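The first (RANSAC) step of the alignment can be sketched as follows. The example assumes that feature matching has already produced corresponding point coordinates (the `src` and `dst` arrays are hypothetical inputs) and that an affine transform is an adequate model; the transform model actually used in [5] is not stated here, and the diffeomorphic refinement step is not shown.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine transform: dst is approximately [src, 1] @ params."""
    design = np.hstack([src, np.ones((len(src), 1))])
    params, *_ = np.linalg.lstsq(design, dst, rcond=None)
    return params  # shape (3, 2)

def ransac_affine(src, dst, n_iter=200, tol=1.0, seed=0):
    """Robustly fit an affine transform to matched points with outliers."""
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    rng = np.random.default_rng(seed)
    design = np.hstack([src, np.ones((len(src), 1))])
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iter):
        # Three point pairs are the minimal sample for an affine model.
        idx = rng.choice(len(src), size=3, replace=False)
        candidate = fit_affine(src[idx], dst[idx])
        errors = np.linalg.norm(design @ candidate - dst, axis=1)
        inliers = errors < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on all inliers of the best candidate for a more stable estimate.
    return fit_affine(src[best_inliers], dst[best_inliers]), best_inliers
```

The tolerance `tol` is in pixels. A global model of this kind can only bring the images into rough agreement, which is consistent with the need for a second refinement step to reach subpixel accuracy.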
We used K-means for pixel clustering. The number of clusters is set to six, as suggested by the gap statistic method [70]. Figure 12 shows the results. In each figure, we enlarged one clustering region to showcase the performance. There are several important observations: (i) the clustering performance is improved after the first and second registration steps of our proposed two-step framework; (ii) the clustering performance of the two-step registration for the M34-resolution and M100-resolution is comparable; (iii) the pansharpened data show the best clustering results with fewer randomly clustered pixels.
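Pixel clustering on a multispectral cube treats every pixel as a vector of band values. The sketch below is a plain NumPy implementation of Lloyd's K-means algorithm with random initialization; it is meant only to show the data layout, and the initialization, iteration count, and seed are assumptions of this example rather than the settings used in [5].

```python
import numpy as np

def kmeans_pixels(cube, n_clusters=6, n_iter=20, seed=0):
    """Cluster the pixels of an (H, W, bands) cube; returns labels and centers."""
    cube = np.asarray(cube, dtype=np.float64)
    h, w, bands = cube.shape
    pixels = cube.reshape(-1, bands)  # one spectrum per row
    rng = np.random.default_rng(seed)
    start = rng.choice(len(pixels), size=n_clusters, replace=False)
    centers = pixels[start].copy()

    def assign(current_centers):
        # Squared Euclidean distance from every pixel to every center.
        diff = pixels[:, None, :] - current_centers[None, :, :]
        return np.argmin(np.sum(diff ** 2, axis=2), axis=1)

    for _ in range(n_iter):
        labels = assign(centers)
        for k in range(n_clusters):
            members = pixels[labels == k]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[k] = members.mean(axis=0)
    return assign(centers).reshape(h, w), centers
```

With a fused 12-band cube, `cube` would have shape (H, W, 12). Misregistration between the left and right bands mixes spectra from neighboring materials within one pixel vector, which is one way to understand why better alignment produces fewer randomly clustered pixels.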
Figure 12. ... (d) using twelve-band MS cube after the first registration step with lower (M-34) resolution; (e) using twelve-band MS cube after the second registration step with lower resolution; (f) using twelve-band MS cube after the second registration step with higher (M-100) resolution; (g) using pan-sharpened images by band dependent spatial detail (BDSD) [71]; and (h) using pan-sharpened images by partial replacement adaptive CS (PRACS) [72].
Figure 13 displays the anomaly detection results of two LR-pair cases for the three competing methods (global-RX, local-RX and NRS methods) applied to the original nine-band data captured only by the right Mastcam (second row) and the five twelve-band fused data counterparts (third to seventh rows). There is no ground-truth information about anomaly targets. Consequently, we relied on visual inspection. From Figure 13, we observe better detection results when both RANSAC and diffeomorphic registration steps are applied as compared with just RANSAC registration. Moreover, the results using BDSD and PRACS pan-sharpening produce less noise than the detection outputs of purely registration-based MS data.
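To make the simplest of the three detectors concrete, the sketch below computes a global-RX score map, in which each pixel is scored by its squared Mahalanobis distance from the scene-wide mean spectrum. It is a minimal illustration, not the code behind Figure 13: the cube layout (height, width, bands), the use of a pseudo-inverse, and the names `global_rx` and `make_scene` are assumptions, and the local-RX and NRS methods are not covered.

```python
import numpy as np

def global_rx(cube):
    """Global RX anomaly score for an (H, W, B) multispectral cube.

    Each pixel's score is the squared Mahalanobis distance of its spectrum
    from the global mean, using the global band covariance.
    """
    h, w, b = cube.shape
    X = cube.reshape(-1, b).astype(float)
    mu = X.mean(axis=0)
    Xc = X - mu
    cov = Xc.T @ Xc / (len(X) - 1)
    # Pseudo-inverse (an assumption of this sketch) guards against a
    # singular covariance when bands are strongly correlated.
    cov_inv = np.linalg.pinv(cov)
    scores = np.einsum("ij,jk,ik->i", Xc, cov_inv, Xc)
    return scores.reshape(h, w)

def make_scene(h=20, w=20, bands=12, seed=0):
    """Synthetic 12-band background with one implanted anomalous pixel."""
    rng = np.random.default_rng(seed)
    cube = 0.5 + rng.normal(0.0, 0.02, size=(h, w, bands))
    cube[5, 7, :3] += 0.3  # anomaly: brighter in the first three bands
    return cube
```

Applied to `make_scene()`, the highest score falls on the implanted pixel at row 5, column 7. On real data, a threshold on the score map (chosen by the analyst, since no ground truth is available) would turn the scores into a detection mask.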
Figure 13. Comparison of anomaly detection performance of an LR-pair on sol 1138 taken on 10-19-2015. The first row shows the RGB left and right images; and the second to seventh rows are the anomaly detection results of the six MS data versions listed in [5] in which the first, second, and third columns are results of global-RX, local-RX [73] and NRS [74] methods, respectively.

Conclusions and Future Work
With the goals of raising public awareness of, and stimulating research interest in, some interesting Mastcam projects, we briefly reviewed our recent projects related to Mastcam image processing. First, we studied various new image compression algorithms and observed that perceptually lossless compression can be achieved at a 10:1 compression ratio, which can substantially reduce the demand on the scarce bandwidth between the Mars rover and JPL. Second, we compared recent debayering algorithms with the default algorithm used by NASA and found that recent algorithms can yield fewer artifacts. Third, we investigated image enhancement algorithms for left Mastcam images. It was observed that, with the help of right Mastcam images, it is possible to improve the resolution of left Mastcam images. Fourth, we investigated stereo image and disparity map generation by combining left and right Mastcam images. It was noticed that the fusion of enhanced left and right images can create higher resolution stereo images and disparity maps. Finally, we investigated the fusion of left and right images to form a 12-band multispectral image cube and its application to pixel clustering and anomaly detection.
The new Mars rover Perseverance, which landed on Mars in 2021, carries a new generation of stereo instrument known as Mastcam-Z (https://mars.nasa.gov/mars2020/spacecraft/instruments/mastcam-z/, accessed on 7 September 2021). We are currently pursuing funding to continue adapting the algorithms described in this paper to Mastcam-Z images. In the stereo imaging work, more research is needed to deal with left and right images taken from different view angles.