Author Contributions
Conceptualization, B.L., M.Y. and X.X.; methodology, B.L., F.W. and Y.Q.; formal analysis, B.Z. and Q.L.; investigation, B.L., Y.Q. and F.W.; resources, B.Z., H.C. and X.H.; data curation, Y.Q., F.W. and Q.L.; writing—original draft preparation, Y.Q., F.W. and B.L.; writing—review and editing, X.X. and M.Y.; visualization, B.L. and Y.Q.; supervision, H.C., M.Y. and X.X.; project administration, B.Z. and X.H.; funding acquisition, M.Y. and X.H. All authors have read and agreed to the published version of the manuscript.
Figure 2.
From left to right, the examples show SynMars-Air, MarsScapes, and images captured by the Ingenuity helicopter. (a–d) display a comparison of BigRock, Sand, Ridge, and BedRock.
Figure 3.
The overall architecture of our LisseMars. The annotated symbols denote the feature outputs of the encoder modules and the decoder modules, respectively.
Figure 4.
The working principle of WMA. Compared to models like PVT, Deformable DETR, and Swin, our WMA demonstrates a significant advantage in terms of receptive field.
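The excerpt shows WMA's receptive-field claim but not its equations. For orientation, below is a minimal window self-attention sketch in PyTorch; it omits the window-movement step that distinguishes WMA, and the class name and shapes are illustrative assumptions rather than the paper's implementation.

```python
import torch
import torch.nn as nn

class WindowAttention(nn.Module):
    """Plain (non-movable) window self-attention, for reference only.

    Splits the feature map into non-overlapping s x s windows and runs
    multi-head self-attention inside each window. WMA additionally moves
    the windows to enlarge the receptive field; that step is omitted here.
    """
    def __init__(self, dim, heads, window):
        super().__init__()
        self.window = window
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):            # x: (B, C, H, W), H and W divisible by s
        B, C, H, W = x.shape
        s = self.window
        # partition into (B * num_windows, s*s, C) token sequences
        x = x.view(B, C, H // s, s, W // s, s)
        x = x.permute(0, 2, 4, 3, 5, 1).reshape(-1, s * s, C)
        x, _ = self.attn(x, x, x)    # attention within each window
        # reverse the partition back to (B, C, H, W)
        x = x.view(B, H // s, W // s, s, s, C)
        x = x.permute(0, 5, 1, 3, 2, 4).reshape(B, C, H, W)
        return x
```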
Figure 7.
The structure of CFFN. (a) FFN in ViT. (b) Mix-FFN in PVTv2. (c) Ours.
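Since Figure 7 contrasts three feed-forward designs, a minimal PyTorch sketch of panel (b), the published Mix-FFN of PVTv2, may help: a 3×3 depthwise convolution between two pointwise layers adds the local spatial mixing that ViT's plain FFN (panel a) lacks. The exact layout of CFFN (panel c) is not reproduced in this excerpt, so it is not sketched here.

```python
import torch.nn as nn

class MixFFN(nn.Module):
    """Mix-FFN as in PVTv2: a 3x3 depthwise conv between the two pointwise
    layers injects local spatial context missing from ViT's plain FFN."""
    def __init__(self, dim, hidden):
        super().__init__()
        self.fc1 = nn.Conv2d(dim, hidden, 1)                             # pointwise expand
        self.dw = nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden)  # depthwise 3x3
        self.act = nn.GELU()
        self.fc2 = nn.Conv2d(hidden, dim, 1)                             # pointwise project

    def forward(self, x):  # x: (B, C, H, W)
        return self.fc2(self.act(self.dw(self.fc1(x))))
```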
Figure 8.
(a) Basic shape of Martian rocks. (b) Structure of DPCM. (c) DPC. The input X is processed through various paths, including DPC, and then merged to produce the final output features.
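A minimal sketch of the multi-path pattern that Figure 8b describes, with the DPC branch stubbed by an ordinary 3×3 convolution because DPC's polygon sampling is specified separately (Figure 9); the module names and the summation merge are illustrative assumptions.

```python
import torch.nn as nn

class MultiPathBlock(nn.Module):
    """Schematic of Figure 8b: the input X is sent through parallel
    branches and the branch outputs are merged into the final features.
    The DPC branch is stubbed with a plain 3x3 conv here; the real DPC
    samples polygon-shaped neighborhoods (Figure 9)."""
    def __init__(self, dim):
        super().__init__()
        self.identity = nn.Identity()                   # skip path
        self.local = nn.Conv2d(dim, dim, 3, padding=1)  # stand-in for DPC
        self.point = nn.Conv2d(dim, dim, 1)             # pointwise path
        self.merge = nn.Conv2d(dim, dim, 1)             # fuse merged features

    def forward(self, x):
        y = self.identity(x) + self.local(x) + self.point(x)  # merge by summation
        return self.merge(y)
```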
Figure 9.
(Left): Initial coordinate display of DPC and a demonstration of the convolution kernel’s movement. (Right): The receptive field of DPC. When the formed polygon has concave edges, the convolution kernel returns to the initial coordinates.
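The fallback rule in Figure 9 is in effect a convexity test on the sampled polygon. Below is a minimal sketch under that reading, assuming ordered vertices and learned per-point offsets; both function names are hypothetical.

```python
import torch

def is_convex(pts):
    """pts: (K, 2) polygon vertices in order. True if the cross products
    of consecutive edges all share one sign (no concave corner)."""
    d = pts.roll(-1, dims=0) - pts  # edge vectors p[i+1] - p[i]
    cross = d[:, 0] * d.roll(-1, dims=0)[:, 1] - d[:, 1] * d.roll(-1, dims=0)[:, 0]
    return bool((cross >= 0).all() or (cross <= 0).all())

def constrain_offsets(init_pts, offsets):
    """Mimics the rule in Figure 9: move each sampling point by its learned
    offset, but return to the initial coordinates whenever the resulting
    polygon would have concave edges. Shapes are (K, 2)."""
    moved = init_pts + offsets
    return moved if is_convex(moved) else init_pts
```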
Figure 10.
The structure of GFM.
Figure 11.
Samples of different categories in the SynMars-Air dataset (Source: [45]).
Figure 12.
A sample panoramic image from MarsScapes (Source: [43]).
Figure 13.
Samples from the TianWen dataset (Source: NAOC/GRAS).
Figure 14.
Visual comparison on the SynMars-Air test set. The red boxes mark challenging objects that are difficult to segment.
Figure 15.
Visual comparison on the MarsScapes test set.
Figure 16.
Visual comparison on the TianWen test set. The red boxes mark challenging objects that are difficult to segment.
Figure 17.
Visualization of feature heatmaps at different stages.
Table 1.
Model variants of LisseMars. N denotes the number of blocks, H the number of heads, and C the channel dimension; p and K denote the patch size and the number of convolution kernels, respectively.
| | Module | LisseMars-T | LisseMars-S | LisseMars-B |
|---|---|---|---|---|
| Stage 1 | WMA | p = 4, C = 8, H = 1, N = 1 | p = 4, C = 32, H = 2, N = 1 | p = 4, C = 32, H = 2, N = 2 |
| Stage 2 | WMA | p = 2, C = 16, H = 2, N = 1 | p = 2, C = 64, H = 2, N = 1 | p = 2, C = 64, H = 2, N = 2 |
| Stage 3 | DPCM | K = 9, C = 32, N = 1 | K = 9, C = 64, N = 1 | K = 9, C = 128, N = 1 |
| Stage 4 | WMA | p = 2, C = 32, H = 4, N = 1 | p = 2, C = 128, H = 4, N = 1 | p = 2, C = 128, H = 4, N = 2 |
| Stage 5 | WMA | p = 2, C = 64, H = 8, N = 1 | p = 2, C = 256, H = 8, N = 1 | p = 2, C = 256, H = 8, N = 2 |
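For reference, the variants in Table 1 transcribe directly into a configuration dictionary. The sketch below mirrors the values above (stage 3 uses DPCM with K = 9 in place of WMA, hence the None entries); the symbol names follow the caption.

```python
# LisseMars variants from Table 1, one list entry per stage (1-5).
LISSEMARS_VARIANTS = {
    "T": dict(patch=[4, 2, None, 2, 2], C=[8, 16, 32, 32, 64],
              H=[1, 2, None, 4, 8],     N=[1, 1, 1, 1, 1], K=9),
    "S": dict(patch=[4, 2, None, 2, 2], C=[32, 64, 64, 128, 256],
              H=[2, 2, None, 4, 8],     N=[1, 1, 1, 1, 1], K=9),
    "B": dict(patch=[4, 2, None, 2, 2], C=[32, 64, 128, 128, 256],
              H=[2, 2, None, 4, 8],     N=[2, 2, 1, 2, 2], K=9),
}
# Stage 3 uses DPCM (K = 9 convolution kernels) instead of WMA, so it has
# no patch size or head count.
```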
Table 2.
Semantic segmentation results at 160 K iterations on SynMars-Air (the first and second best results are highlighted in red and underlined, respectively).
| Backbone | Params (M) | FLOPs (G) | Gravel | SmallRock | BigRock | BedRock | Sand | Soil | Ridge | Sky | mIoU (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Bisenetv2 [52] | 3.35 | 12.31 | 0 | 29.14 | 81.48 | 88.43 | 95.36 | 98.47 | 98.11 | 99.78 | 73.85 |
| MobileNetV3 [53] | 3.28 | 8.60 | 0.77 | 36.22 | 80.9 | 88.97 | 94.65 | 98.71 | 98.33 | 99.76 | 74.79 |
| Crossformer++-T [54] | 23.3 | 4.9 | 0.01 | 44.05 | 80.5 | 88.34 | 96.0 | 98.87 | 98.17 | 99.56 | 75.79 |
| Convnext-T [55] | 28 | 4.5 | 2.81 | 54.22 | 85.94 | 91.48 | 95.87 | 98.83 | 91.97 | 99.05 | 77.52 |
| Cswin-T [56] | 38.9 | 61.5 | 0.0 | 0 | 31.64 | 62.34 | 25.21 | 94.76 | 75.7 | 92.97 | 47.83 |
| EMO-1M [57] | 5.6 | 2.4 | 0.0 | 0.04 | 65.17 | 78.14 | 91.28 | 97.33 | 94.66 | 99.2 | 65.72 |
| Pvtv2-B0 [58] | 29.1 | 45.8 | 2.62 | 47.62 | 77.14 | 84.34 | 95.61 | 98.8 | 97.53 | 99.59 | 75.41 |
| Poolformer-s12 [59] | 12.47 | 54.31 | 0.03 | 27.11 | 55.96 | 79.27 | 83.26 | 97.51 | 88.55 | 98.03 | 66.21 |
| DeepLabV3+-r18 [60] | 15.65 | 30.75 | 1.15 | 56.22 | 89.88 | 93.13 | 82.82 | 98.12 | 90.55 | 99.47 | 76.44 |
| Segformer-b0 [61] | 3.7 | 6.41 | 2.9 | 38.96 | 84.87 | 88.44 | 86.98 | 98.04 | 95.98 | 99.56 | 74.46 |
| Segnext-T [62] | 4.3 | 6.6 | 0 | 24.83 | 79.63 | 88.41 | 96.04 | 98.44 | 98.52 | 99.75 | 73.2 |
| Light4Mars-T [40] | 0.1 | 0.53 | 4.73 | 41.38 | 76.8 | 85.6 | 93.42 | 98.61 | 94.23 | 98.44 | 74.15 |
| LisseMars-T | 0.12 | 0.90 | 7.25 | 48.46 | 84.53 | 89.91 | 96.26 | 98.93 | 96.33 | 98.97 | 77.58 |
| Crossformer++-S [54] | 52.0 | 9.5 | 1.36 | 51.69 | 80.18 | 87.57 | 96.1 | 98.9 | 94.27 | 98.18 | 76.03 |
| Convnext-S [55] | 50 | 8.7 | 3.91 | 60.96 | 86.62 | 91.77 | 95.79 | 99.16 | 99.38 | 99.93 | 79.69 |
| Swin-T [63] | 60.0 | 236 | 3.03 | 55.09 | 76.33 | 83.41 | 94.84 | 98.94 | 84.41 | 95.14 | 73.90 |
| Swin-S [63] | 81.0 | 329 | 2.29 | 57.21 | 82.67 | 88.17 | 95.79 | 99.1 | 93.61 | 97.99 | 77.10 |
| Cswin-S [56] | 51.3 | 73.7 | 0 | 17.33 | 49.1 | 70.39 | 77.07 | 97.14 | 97.34 | 98.28 | 62.64 |
| EMO-2M [57] | 6.9 | 3.5 | 0.0 | 0.02 | 67.15 | 79.31 | 92.21 | 97.33 | 94.23 | 98.99 | 66.16 |
| SMT-B [64] | 51.7 | 76.2 | 1.9 | 56.9 | 91.05 | 94.35 | 97.94 | 99.25 | 99.0 | 99.82 | 82.19 |
| Uniformer-B [65] | 70.0 | 90.34 | 7.49 | 69.13 | 90.1 | 92.71 | 97.49 | 99.3 | 98.4 | 99.93 | 82.51 |
| Pvtv2-B3 [58] | 49.0 | 62.4 | 5.53 | 66.38 | 88.64 | 91.63 | 97.88 | 99.35 | 99.5 | 99.93 | 81.11 |
| PIDNet-s [66] | 7.6 | 47.6 | 0 | 31.52 | 87.48 | 91.23 | 97.2 | 98.86 | 98.53 | 99.78 | 75.58 |
| Poolformer-s36 [59] | 34.6 | 42.0 | 0.01 | 27.1 | 57.19 | 78.83 | 83.46 | 97.52 | 88.65 | 97.96 | 66.34 |
| DeepLabV3+-r50 [60] | 47.0 | 62.7 | 0.9 | 56.37 | 90.18 | 93.44 | 97.41 | 99.13 | 98.34 | 99.84 | 79.45 |
| Segformer-b2 [61] | 25.4 | 15.1 | 0.99 | 53.42 | 89.92 | 94.04 | 97.69 | 99.19 | 98.94 | 99.79 | 79.25 |
| Segnext-B [62] | 27.8 | 35.7 | 0.09 | 47.5 | 86.65 | 90.01 | 96.15 | 99.01 | 99.14 | 99.89 | 77.31 |
| SMT-B [64] | 61.8 | 328 | 8.27 | 70.03 | 89.87 | 92.11 | 98.4 | 99.41 | 99.37 | 99.88 | 82.17 |
| Uniformer-B [65] | 80 | 471 | 7.49 | 69.13 | 90.1 | 92.71 | 97.49 | 99.38 | 98.4 | 99.5 | 81.78 |
| Light4Mars-S [40] | 1.50 | 5.45 | 15.09 | 63.97 | 90.98 | 93.91 | 97.6 | 99.34 | 99.77 | 99.96 | 82.51 |
| LisseMars-S | 1.355 | 6.49 | 17.63 | 67.77 | 91.53 | 94.04 | 97.63 | 99.41 | 99.3 | 99.85 | 83.39 |
| Crossformer++-B [54] | 92.0 | 16.6 | 1.94 | 49.97 | 80.31 | 87.61 | 95.36 | 98.87 | 96.79 | 99.83 | 76.33 |
| Convnext-B [55] | 89 | 45.0 | 5.33 | 64.64 | 86.91 | 92.07 | 96.48 | 99.2 | 99.42 | 99.94 | 80.50 |
| Swin-B [63] | 121.0 | 479 | 1.9 | 56.9 | 91.05 | 94.35 | 97.94 | 99.25 | 99.0 | 99.82 | 80.03 |
| Cswin-B [56] | 96.7 | 115.53 | 0 | 25.02 | 56.92 | 77.16 | 79.0 | 97.37 | 78.59 | 94.8 | 63.61 |
| EMO-5M [57] | 10.3 | 5.8 | 0.0 | 0.01 | 71.14 | 80.88 | 92.77 | 97.53 | 93.83 | 98.82 | 66.87 |
| SMT-L [64] | 97.1 | 132.2 | 9.05 | 70.17 | 90.32 | 92.49 | 98.35 | 99.42 | 99.56 | 99.95 | 82.41 |
| Light4Mars-B [40] | 2.57 | 9.43 | 16.61 | 67.38 | 91.53 | 94.45 | 97.87 | 99.4 | 99.28 | 99.9 | 83.30 |
| Uniformer-L [65] | 119.0 | 130.6 | 10.36 | 71.51 | 91.64 | 93.65 | 98.31 | 99.44 | 99.56 | 99.95 | 83.05 |
| Pvtv2-B5 [58] | 85.7 | 91.1 | 8.76 | 69.35 | 89.59 | 92.45 | 97.87 | 99.4 | 99.58 | 99.95 | 82.12 |
| PIDNet-L [66] | 36.9 | 275.8 | 0 | 33.62 | 89.35 | 92.63 | 97.74 | 98.94 | 98.75 | 99.78 | 76.35 |
| Poolformer-m48 [59] | 77.1 | 47.2 | 0.94 | 50.32 | 86.76 | 91.88 | 97.33 | 99.1 | 99.09 | 99.83 | 78.16 |
| DeepLabV3+-r101 [60] | 66.7 | 83.4 | 2.19 | 60.11 | 89.94 | 93.16 | 96.87 | 99.18 | 98.58 | 99.82 | 79.98 |
| Segformer-b5 [61] | 82.0 | 22.5 | 3.91 | 63.4 | 90.68 | 94.07 | 97.76 | 99.29 | 99.32 | 99.88 | 81.04 |
| Segnext-L [62] | 48.9 | 70.0 | 0.15 | 47.36 | 87.61 | 90.72 | 96.36 | 99.02 | 99.12 | 99.9 | 77.53 |
| SMT-L [64] | 102 | 546 | 9.05 | 70.17 | 90.32 | 92.49 | 98.35 | 99.42 | 99.56 | 99.95 | 82.41 |
| Uniformer-L [65] | 100 | 490 | 10.36 | 71.51 | 91.64 | 93.65 | 98.31 | 99.44 | 99.56 | 99.95 | 83.05 |
| LisseMars-B | 9.16 | 21.7 | 36.48 | 77.65 | 92.71 | 95.0 | 98.04 | 99.61 | 99.56 | 99.95 | 87.37 |
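The per-class scores and mIoU in Tables 2-4 follow the standard intersection-over-union definition. Below is a minimal NumPy sketch, assuming a precomputed confusion matrix.

```python
import numpy as np

def miou(conf):
    """Per-class IoU and mIoU from a confusion matrix, as reported in
    Tables 2-4. conf[i, j] counts pixels of true class i predicted as j."""
    tp = np.diag(conf).astype(float)
    fp = conf.sum(axis=0) - tp              # predicted as class c but wrong
    fn = conf.sum(axis=1) - tp              # true class c but missed
    iou = tp / np.maximum(tp + fp + fn, 1)  # guard against empty classes
    return iou, iou.mean()
```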
Table 3.
Semantic segmentation results at 160 K iterations on MarsScapes (the first and second best results are highlighted in red and underlined, respectively).
| Backbone | Params (M) | FLOPs (G) | Soil | BedRock | Gravel | Sand | BigRock | Ridge | Sky | Rover | Unknown | mIoU (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Bisenetv2 [52] | 3.35 | 6.16 | 66.78 | 64.79 | 37.80 | 59.46 | 44.02 | 45.94 | 70.63 | 81.98 | 46.01 | 57.49 |
| MobileNetV3 [53] | 3.28 | 4.37 | 69.42 | 73.48 | 49.64 | 65.43 | 47.47 | 48.84 | 77.38 | 62.30 | 45.91 | 59.99 |
| Crossformer++-T [54] | 23.3 | 4.9 | 61.9 | 60.17 | 33.11 | 48.44 | 31.55 | 42.34 | 64.87 | 64.14 | 43.72 | 50.03 |
| Convnext-T [55] | 28 | 4.5 | 63.81 | 49.58 | 38.64 | 49.9 | 38.64 | 54.88 | 79.13 | 60.64 | 43.09 | 53.15 |
| EMO-2M [57] | 6.9 | 3.5 | 57.29 | 50.55 | 23.85 | 43.01 | 33.83 | 36.39 | 71.98 | 36.67 | 25.56 | 42.12 |
| Light4Mars-T [40] | 0.1 | 0.53 | 69.35 | 70.25 | 50.82 | 60.29 | 45.61 | 53.87 | 81.88 | 55.42 | 45.89 | 59.26 |
| Pvtv2-B0 [58] | 7.25 | 11.79 | 64.91 | 59.17 | 37.63 | 51.77 | 36.66 | 49.54 | 75.18 | 86.08 | 48.54 | 56.61 |
| Segnext-T [62] | 40.0 | 52.3 | 62.51 | 57.74 | 43.72 | 49.55 | 40.41 | 28.32 | 63.92 | 12.79 | 0 | 39.89 |
| Uniformer-S [65] | 52.0 | 247 | 65.85 | 57.25 | 38.45 | 52.67 | 41.75 | 46.61 | 77.66 | 63.41 | 39.68 | 53.7 |
| PIDNet-s [66] | 7.6 | 47.6 | 60.79 | 55.84 | 32.32 | 53.98 | 39.9 | 34.39 | 49.61 | 87.45 | 46.67 | 51.22 |
| LisseMars-T | 0.12 | 0.90 | 68.95 | 67.51 | 45.81 | 58.94 | 40.74 | 62.24 | 83.84 | 89.9 | 53.43 | 63.48 |
| MarsNet [67] | 33.21 | 120.29 | 76.02 | 56.39 | 44.77 | 56.44 | 35.45 | 35.83 | 79.72 | 71.59 | 45.69 | 55.77 |
| Crossformer++-S [54] | 52.0 | 9.5 | 70.12 | 61.67 | 49.33 | 60.43 | 39.4 | 38.63 | 75.05 | 42.26 | 51.71 | 54.30 |
| Convnext-S [55] | 50 | 8.7 | 64.42 | 51.13 | 40.0 | 51.22 | 39.63 | 48.4 | 73.85 | 92.59 | 45.32 | 56.28 |
| EMO-3M [57] | 6.9 | 3.5 | 58.88 | 55.5 | 33.77 | 50.47 | 34.28 | 46.39 | 75.44 | 61.68 | 31.03 | 49.71 |
| Light4Mars-S [40] | 1.50 | 5.45 | 69.41 | 71.57 | 46.15 | 60.72 | 47.67 | 60.84 | 82.13 | 95.00 | 46.26 | 64.30 |
| Segnext-B [62] | 27.8 | 35.7 | 53.83 | 39.01 | 22.53 | 46.06 | 35.81 | 34.58 | 57.87 | 69.29 | 44.83 | 44.87 |
| Uniformer-B [65] | 80.0 | 471 | 65.24 | 55.51 | 38.75 | 52.24 | 42.37 | 51.32 | 80.24 | 69.91 | 40.66 | 55.14 |
| Poolformer [59] | 15.65 | 15.38 | 67.21 | 66.50 | 43.61 | 60.78 | 45.96 | 51.58 | 78.00 | 94.79 | 48.20 | 61.85 |
| DeepLabV3+ [60] | 12.47 | 27.16 | 69.53 | 73.99 | 43.48 | 61.86 | 48.58 | 58.40 | 82.54 | 97.80 | 46.81 | 64.78 |
| LisseMars-S | 1.355 | 6.49 | 70.21 | 71.92 | 50.08 | 58.74 | 46.98 | 62.72 | 80.76 | 92.32 | 53.31 | 65.23 |
| Crossformer++-B [54] | 92.0 | 16.6 | 69.54 | 66.53 | 44.89 | 59.55 | 43.62 | 44.0 | 64.61 | 83.18 | 49.82 | 58.42 |
| Convnext-B [55] | 89 | 45.0 | 62.79 | 51.02 | 41.61 | 50.22 | 38.2 | 56.45 | 78.04 | 95.19 | 49.2 | 58.06 |
| EMO-5M [57] | 10.28 | 3.05 | 68.01 | 71.06 | 40.03 | 58.67 | 42.76 | 48.97 | 80.32 | 95.79 | 45.33 | 61.22 |
| Light4Mars-B [40] | 2.57 | 9.43 | 70.46 | 73.44 | 51.62 | 62.14 | 48.65 | 60.40 | 84.10 | 92.08 | 52.39 | 66.14 |
| PIDNet-L [66] | 36.9 | 275.8 | 66.70 | 65.12 | 36.45 | 57.04 | 39.31 | 51.87 | 76.67 | 89.65 | 46.01 | 58.76 |
| Segnext-L [62] | 48.0 | 70.0 | 63.48 | 60.42 | 36.83 | 53.28 | 40.21 | 41.84 | 72.11 | 60.97 | 0 | 47.68 |
| Uniformer-L [65] | 100 | 490 | 67.11 | 62.53 | 41.49 | 51.01 | 42.91 | 46.49 | 77.69 | 75.05 | 40.3 | 56.06 |
| Cswin [56] | 52.01 | 115.41 | 64.18 | 63.03 | 33.42 | 52.07 | 35.34 | 49.38 | 75.30 | 86.91 | 46.42 | 56.23 |
| Swin [63] | 58.95 | 121.86 | 69.28 | 75.50 | 46.47 | 61.13 | 47.11 | 66.65 | 83.53 | 96.17 | 53.13 | 66.55 |
| SMT [64] | 52.24 | 117.0 | 68.94 | 75.96 | 44.76 | 64.19 | 51.02 | 58.04 | 80.72 | 96.66 | 47.32 | 65.29 |
| Segformer [61] | 3.72 | 3.70 | 69.48 | 75.26 | 46.62 | 62.85 | 48.96 | 61.98 | 80.52 | 98.00 | 53.13 | 66.31 |
| LisseMars-B | 9.16 | 21.7 | 70.48 | 70.81 | 50.31 | 61.28 | 48.14 | 67.73 | 81.51 | 96.41 | 54.7 | 66.81 |
Table 4.
Semantic segmentation results at 160 K iterations on TianWen (the first and second best results are highlighted in red and underlined, respectively).
| Backbone | Params (M) | FLOPs (G) | Background | Rock | mIoU (%) |
|---|---|---|---|---|---|
| Bisenetv2 [52] | 3.35 | 12.31 | 99.19 | 53.58 | 76.39 |
| MobileNetV3 [53] | 3.28 | 8.60 | 99.27 | 57.34 | 78.30 |
| Convnext-T [55] | 28.0 | 4.5 | 99.22 | 56.29 | 77.76 |
| Light4Mars-T [40] | 0.1 | 0.53 | 99.18 | 56.15 | 77.84 |
| Segformer-b0 [61] | 3.7 | 6.41 | 99.17 | 53.41 | 76.74 |
| LisseMars-T | 0.12 | 0.904 | 99.19 | 57.93 | 78.56 |
Table 5.
Effect of the different proposed modules (the improvement at each step is highlighted in red).
| Backbone | Fusion | Gravel | SmallRock | BigRock | BedRock | Sand | Soil | Ridge | Sky | mIoU (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| BaseFormer | | 16.61 | 67.38 | 91.53 | 94.45 | 97.87 | 99.4 | 99.28 | 99.9 | 83.30 |
| +WMA | | 20.64 | 69.75 | 92.03 | 94.26 | 95.69 | 99.45 | 98.37 | 99.9 | 83.76 (↑0.46) |
| +CFFN | | 19.4 | 69.25 | 92.18 | 94.7 | 98.05 | 99.44 | 99.36 | 99.92 | 84.04 (↑0.28) |
| +GFM | | 33.64 | 77.02 | 91.73 | 94.13 | 98.28 | 99.6 | 99.7 | 99.96 | 86.76 (↑2.72) |
| +DPCM | | 36.48 | 77.65 | 92.71 | 95.0 | 98.04 | 99.61 | 99.56 | 99.91 | 87.37 (↑0.61) |
Table 6.
Comparison of different attention encoders based on LisseMars-B on SynMars-Air (the first and second best results are highlighted in red and underlined, respectively).
| Attention Encoder | Decoder | Gravel | SmallRock | BigRock | BedRock | Sand | Soil | Ridge | Sky | mIoU (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| CBAM [68] | GFM | 0 | 26.25 | 73.14 | 79.56 | 89.85 | 88.32 | 90.63 | 94.75 | 67.81 |
| SE Attention [69] | GFM | 0 | 22.24 | 61.55 | 70.47 | 84.21 | 83.37 | 86.78 | 92.42 | 62.63 |
| Shift Window Attention [61] | GFM | 2.11 | 57.06 | 92.14 | 94.84 | 97.84 | 99.6 | 99.33 | 99.75 | 80.3 |
| Efficient Attention [61] | GFM | 5.9 | 65.96 | 91.8 | 94.47 | 98.0 | 99.30 | 99.5 | 99.99 | 81.8 |
| Linear Spatial Reduction Attention [58] | GFM | 10.26 | 70.17 | 90.45 | 93.13 | 97.90 | 99.49 | 98.80 | 99.99 | 82.52 |
| Mix Attention [24] | GFM | 14.0 | 65.2 | 93.81 | 94.68 | 95.13 | 98.21 | 99.52 | 99.76 | 82.53 |
| Scale-Aware Aggregation [27] | GFM | 9.3 | 70.04 | 90.38 | 92.58 | 98.04 | 99.41 | 99.54 | 99.95 | 82.5 |
| Multi-head Relation Attention [65] | GFM | 12.31 | 72.2 | 92.02 | 95.4 | 98.27 | 99.58 | 99.56 | 99.9 | 83.65 |
| Squeeze Window Attention [40] | GFM | 23.04 | 70.94 | 92.37 | 95.0 | 97.9 | 99.47 | 99.44 | 99.88 | 84.7 |
| Window Movable Attention (Ours) | GFM | 36.48 | 77.65 | 92.71 | 95.0 | 98.04 | 99.61 | 99.56 | 99.91 | 87.37 |
Table 7.
Comparison of different multi-layer perceptrons based on LisseMars-B on SynMars-Air (the first and second best results are highlighted in red and underlined, respectively).
| Model | Multi-Layer Perceptron | Gravel | SmallRock | BigRock | BedRock | Sand | Soil | Ridge | Sky | mIoU (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| LisseMars | CMLP [65] | 22.01 | 70.4 | 90.87 | 93.44 | 92.99 | 99.46 | 97.12 | 99.85 | 83.27 |
| LisseMars | MLP [70] | 21.88 | 70.46 | 90.44 | 93.02 | 91.58 | 99.46 | 96.28 | 99.82 | 82.87 |
| LisseMars | UFFN [40] | 21.75 | 70.28 | 90.55 | 93.17 | 92.52 | 99.46 | 96.84 | 99.83 | 83.05 |
| LisseMars | CFFN (Ours) | 36.48 | 77.65 | 92.71 | 95.0 | 98.04 | 99.61 | 99.56 | 99.91 | 87.37 |
Table 8.
Comparison of different decoders based on LisseMars-B on SynMars-Air (the first and second best results are highlighted in red and underlined, respectively).
| Encoder | Decoder | Gravel | SmallRock | BigRock | BedRock | Sand | Soil | Ridge | Sky | mIoU (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| WMA | EncHead [61] | 0 | 18.11 | 86.74 | 89.1 | 97.09 | 98.6 | 98.33 | 99.75 | 73.47 |
| WMA | SegformerHead [61] | 23.07 | 71.17 | 90.45 | 93.33 | 95.32 | 99.49 | 98.45 | 99.99 | 83.89 |
| WMA | FPN [71] | 26.0 | 72.77 | 91.95 | 94.68 | 98.03 | 99.51 | 99.52 | 99.91 | 85.35 |
| WMA | UPerHead [27] | 31.53 | 76.2 | 92.02 | 94.3 | 98.27 | 99.58 | 99.46 | 99.9 | 86.41 |
| WMA | FCNHead [72] | 23.04 | 70.94 | 92.37 | 95.0 | 97.9 | 99.47 | 99.44 | 99.88 | 84.77 |
| WMA | ALA [40] | 20.64 | 69.75 | 92.03 | 94.26 | 95.69 | 99.45 | 98.37 | 99.9 | 83.76 |
| WMA | GFM (Ours) | 36.48 | 77.65 | 92.71 | 95.0 | 98.04 | 99.61 | 99.56 | 99.91 | 87.37 |
Table 9.
The effects of different convolutions in DPC (the first and second best results are highlighted in red and underlined, respectively).
| Base Module | Convolution | Gravel | SmallRock | BigRock | BedRock | Sand | Soil | Ridge | Sky | mIoU (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| DPCM | Conv | 35.62 | 77.21 | 92.4 | 94.77 | 98.09 | 99.61 | 99.66 | 99.95 | 87.16 |
| DPCM | DWConv [73] | 35.39 | 77.2 | 92.69 | 95.0 | 98.27 | 99.61 | 99.4 | 99.85 | 87.18 |
| DPCM | GConv [74] | 35.86 | 77.21 | 92.33 | 94.78 | 98.19 | 99.61 | 99.63 | 99.93 | 87.19 |
| DPCM | DCN [75] | 35.92 | 77.22 | 92.6 | 95.0 | 98.1 | 99.61 | 99.44 | 99.86 | 87.22 |
| DPCM | DPC (Ours) | 36.48 | 77.65 | 92.71 | 95.0 | 98.04 | 99.61 | 99.56 | 99.91 | 87.37 |
Table 10.
The position of DPCM (the best result is highlighted in red).
| Position | Stage 1 | Stage 2 | Stage 3 | Stage 4 | Stage 5 | mIoU (%) |
|---|---|---|---|---|---|---|
| DPCM | ✓ | | | | | 86.07 |
| DPCM | | ✓ | | | | 86.83 |
| DPCM | | | | ✓ | | 86.92 |
| DPCM | | | | | ✓ | 86.5 |
| DPCM | | | ✓ | | | 87.37 |
Table 11.
Run-time performance of LisseMars on Nvidia Jetson Xavier NX (NX).
| Metrics | SynMars-Air | MarsScapes | TianWen |
|---|---|---|---|
| Time (s) | 0.048 | 0.04 | 0.044 |
| mIoU (%) | 77.58 | 61.48 | 78.56 |
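For reproducibility, here is a minimal sketch of how a per-image latency of the kind reported in Table 11 can be measured. The input resolution, warm-up count, and run count are illustrative assumptions; the paper's exact protocol on the Jetson Xavier NX may differ.

```python
import time
import torch

@torch.no_grad()
def avg_inference_time(model, size=(1, 3, 512, 512), warmup=10, runs=100):
    """Rough per-image latency matching the 'Time (s)' metric in Table 11."""
    x = torch.randn(size, device=next(model.parameters()).device)
    for _ in range(warmup):            # warm up kernels and caches
        model(x)
    if torch.cuda.is_available():
        torch.cuda.synchronize()       # finish queued GPU work before timing
    start = time.perf_counter()
    for _ in range(runs):
        model(x)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / runs
```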