MAFT: A Lightweight Network for Martian Rock Segmentation Based on an Adaptive Frequency Transformer
Highlights
- We propose the Mars Adaptive Frequency Transformer (MAFT), a lightweight network that extends AFFormer with AKConv and EMCA. It achieves 88.90% Intersection over Union (IoU) on TWMARS-V2 with only 2.97 M parameters and 15.49 G floating-point operations (FLOPs), the highest IoU among the compared lightweight and Mars-specific segmentation models.
- We construct the TWMARS-V2 dataset with fine-grained annotations, addressing the high omission rate of small rocks in existing datasets and establishing a robust evaluation benchmark.
- With an inference speed of 35.25 frames per second (FPS) on GPU and 8.46 FPS on CPU for TWMARS-V2, together with its low computational cost, MAFT is well suited to resource-constrained onboard hardware and to real-time obstacle avoidance on future Mars rovers.
- The network’s robustness under dust coverage and complex textures supports automated rock size and morphology statistics through a practical measurement workflow.
Abstract
1. Introduction
- Martian rocks exhibit extreme scale variations, as shown in Figure 1b,f, alongside highly irregular morphologies. CNNs that rely on fixed receptive fields struggle to capture fine detail and global spatial context simultaneously.
- Perennial dust storms severely degrade surface textures as shown in Figure 1c,g. This creates a fundamental difficulty: CNNs tend to confuse texture-degraded rocks with the surrounding sand, while standard Transformers capture global context but lack the local inductive bias needed to delineate blurred boundaries.
- Acquiring labeled data of Martian surface rocks is difficult, and publicly available annotated datasets remain scarce. Existing annotated rock datasets from the Zhurong rover are incomplete, with a high omission rate for small rocks.
- The embedded systems of Mars rovers are subject to stringent constraints on power consumption and computing capability, yet existing models struggle to balance computational efficiency against segmentation precision. High-accuracy Transformer-based models designed for rock segmentation, such as MarsFormer, entail parameter counts and per-inference computational loads that exceed the capacity of onboard computing units. Conversely, extremely lightweight architectures, such as Light4Mars, sacrifice essential representational capacity and fail to reliably detect critical but subtle obstacles. A principled framework that balances deployment efficiency with high-accuracy segmentation is therefore still lacking.
- We propose MAFT, a lightweight framework for Martian rock segmentation that combines adaptive convolution and enhanced attention with a frequency-domain Transformer backbone. With only 2.97 M parameters and 15.49 G FLOPs, MAFT achieves the highest segmentation accuracy among all compared methods.
- We construct an improved backbone termed IAFFormer by building upon the AFFormer architecture and replacing the fixed-grid 1 × 1 convolutions in the pixel descriptor module with AKConv. Standard pointwise convolutions produce spatially isolated descriptors that lack local context, making them unreliable for distinguishing rocks from spectrally similar sandy backgrounds under dust-degraded conditions. AKConv enables shape-aware local feature aggregation through dynamically adjusted sampling positions, yielding geometrically adaptive descriptors that better conform to irregular rock contours.
- We design the EMCA module with a triple-branch structure for simultaneous channel, height, and width attention, integrating hybrid pooling and adaptive dilated convolutions to improve boundary discrimination under dust occlusion.
- We release TWMARS-V2, an improved version of the TWMARS dataset with exhaustive re-annotation covering all visible rock instances, providing a more complete benchmark for Martian rock segmentation research.

2. Related Works
2.1. Semantic Segmentation of Martian Rocks
2.2. Adaptive Frequency Transformer
2.3. Martian Rock Data
3. Methodology
3.1. Overall Framework
3.2. Adaptive Kernel Convolution (AKConv)
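As a rough illustration of the sampling idea behind AKConv summarized in the Introduction (sampling positions that shift away from a fixed grid), the following pure-Python sketch evaluates one kernel response on a single-channel feature map. It is a sketch under stated assumptions, not the authors' implementation: the helper names `bilinear_sample` and `adaptive_kernel_response`, the five-point initial layout, and the hand-written offsets are all hypothetical, and in a trained network the offsets would typically be predicted from the input rather than fixed by hand.

```python
import math


def bilinear_sample(feat, y, x):
    """Sample a 2D feature map at a fractional (y, x); positions outside the map count as zero."""
    h, w = len(feat), len(feat[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    # Blend the four neighbouring cells with weights given by the fractional parts.
    for yi, wy in ((y0, 1.0 - (y - y0)), (y0 + 1, y - y0)):
        for xi, wx in ((x0, 1.0 - (x - x0)), (x0 + 1, x - x0)):
            if 0 <= yi < h and 0 <= xi < w:
                value += wy * wx * feat[yi][xi]
    return value


def adaptive_kernel_response(feat, cy, cx, base_points, offsets, weights):
    """Weighted sum of samples taken at base_points + offsets around the centre (cy, cx)."""
    total = 0.0
    for (by, bx), (dy, dx), wgt in zip(base_points, offsets, weights):
        total += wgt * bilinear_sample(feat, cy + by + dy, cx + bx + dx)
    return total


# Hypothetical toy input: a 4 x 4 ramp where feat[y][x] = 4 * y + x.
feat = [[float(4 * y + x) for x in range(4)] for y in range(4)]
cross = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]  # five sampling points (assumed layout)
weights = [0.2] * 5                                  # stand-in for learned kernel weights

rigid = adaptive_kernel_response(feat, 1, 1, cross, [(0.0, 0.0)] * 5, weights)
shifted = adaptive_kernel_response(feat, 1, 1, cross, [(0.5, 0.5)] * 5, weights)
print(rigid, shifted)
```

With zero offsets the kernel behaves like an ordinary fixed-shape convolution tap; non-zero offsets move every sampling point, which is the mechanism that lets the receptive field follow irregular contours.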
3.3. Enhanced Multi-Dimensional Convolutional Attention (EMCA)
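To make the triple-branch idea from the Introduction concrete (separate attention along channel, height, and width), the sketch below gates a small C x H x W tensor along each axis and averages the three gated copies. Several details are assumptions made only for illustration: hybrid pooling is taken as the mean of average pooling and max pooling, the three branches are combined by a plain average, and the adaptive dilated convolutions mentioned for EMCA are omitted entirely. The function names are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def hybrid_pool(values):
    """Assumed hybrid pooling: the mean of average pooling and max pooling."""
    return 0.5 * (sum(values) / len(values) + max(values))


def axis_gates(x, axis):
    """One sigmoid gate per index along `axis` (0 = channel, 1 = height, 2 = width)."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    size = (C, H, W)[axis]
    gates = []
    for i in range(size):
        # Pool over the two remaining axes for the slice at index i.
        vals = [x[c][h][w]
                for c in range(C) for h in range(H) for w in range(W)
                if (c, h, w)[axis] == i]
        gates.append(sigmoid(hybrid_pool(vals)))
    return gates


def triple_branch_attention(x):
    """Average of the channel-gated, height-gated, and width-gated copies of x."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    gc, gh, gw = (axis_gates(x, a) for a in range(3))
    return [[[x[c][h][w] * (gc[c] + gh[h] + gw[w]) / 3.0
              for w in range(W)] for h in range(H)] for c in range(C)]


# Hypothetical toy tensor: channel 0 is silent, channel 1 is active.
x = [[[0.0, 0.0], [0.0, 0.0]],
     [[4.0, 4.0], [4.0, 4.0]]]
channel_gates = axis_gates(x, 0)
out = triple_branch_attention(x)
print(channel_gates, out[1][0][0])
```

The active channel receives a gate close to 1 while the silent channel stays at 0.5, so the module re-weights features without changing the tensor shape.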
4. Experimental Results and Analysis
4.1. Datasets
4.2. Evaluation Metrics
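The result tables report IoU, PA, Pre, F1, and Recall. The sketch below shows how such scores can be computed for a binary rock/background mask, assuming the standard pixel-count definitions (IoU = TP / (TP + FP + FN), PA = (TP + TN) / all pixels, Pre = TP / (TP + FP), Recall = TP / (TP + FN), F1 = harmonic mean of Pre and Recall). These definitions and the function names are assumptions for illustration, not a transcription of the paper's formulas.

```python
def confusion_counts(pred, target):
    """Count TP, FP, FN, TN for the rock class (label 1) over flat label sequences."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, target):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1:
            fp += 1
        elif t == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def segmentation_metrics(pred, target):
    """Binary segmentation scores as fractions in [0, 1]."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "IoU": safe_div(tp, tp + fp + fn),
        "PA": safe_div(tp + tn, tp + fp + fn + tn),
        "Pre": precision,
        "Recall": recall,
        "F1": safe_div(2 * precision * recall, precision + recall),
    }


# Hypothetical flattened masks (1 = rock, 0 = background).
pred = [1, 1, 1, 0, 0, 0, 1, 0]
target = [1, 1, 0, 0, 0, 1, 1, 0]
scores = segmentation_metrics(pred, target)
print(scores)
```

Under these definitions the reported values are mutually consistent: the Pre of 93.43% and Recall of 94.82% listed for MAFT give an F1 of about 94.12%, which agrees with the tabulated F1.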
4.3. Implementation Details
4.4. Ablation Experiments
4.4.1. Component Ablation Analysis
4.4.2. Computational Cost of Each Module
4.4.3. Qualitative Ablation Results
4.5. Comparison with State-of-the-Art Methods
4.5.1. Quantitative Performance Comparison
4.5.2. Computational Complexity Comparison
4.5.3. Cross-Dataset Generalization Analysis
4.5.4. Visualization Comparison
4.5.5. Violin Plot Comparison
4.6. Analysis of Morphological Parameters of Martian Rocks
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Liu, H.; Yao, M.; Xiao, X.; Cui, H. A hybrid attention semantic segmentation network for unstructured terrain on Mars. Acta Astronaut. 2023, 204, 492–499.
- Feng, W.; Ding, L.; Zhou, R.; Xu, C.; Yang, H.; Gao, H.; Liu, G.; Deng, Z. Learning-Based End-to-End Navigation for Planetary Rovers Considering Non-Geometric Hazards. IEEE Robot. Autom. Lett. 2023, 8, 4084–4091.
- Rogers, A.D.; Aharonson, O.; Bandfield, J.L. Geologic context of in situ rocky exposures in Mare Serpentis, Mars: Implications for crust and regolith evolution in the cratered highlands. Icarus 2009, 200, 446–462.
- Garvin, J.; Edgett, K.; Dotson, R.; Fey, D.; Herkenhoff, K.; Hallet, B.; Kennedy, M. Quantitative Relief Models of Rock Surfaces on Mars at Sub-millimeter Scales from Mars Curiosity Rover Mars Hand Lens Imager (MAHLI) Observations: Geologic Implications. Microsc. Microanal. 2017, 23, 2146–2147.
- Huang, G.; Yang, L.; Cai, Y.; Zhang, D. Terrain classification-based rover traverse planner with kinematic constraints for Mars exploration. Planet. Space Sci. 2021, 209, 105371.
- Changela, H.G.; Chatzitheodoridis, E.; Antunes, A.; Beaty, D.; Bouw, K.; Bridges, J.C.; Capova, K.A.; Cockell, C.S.; Conley, C.A.; Dadachova, E.; et al. Mars: New insights and unresolved questions. Int. J. Astrobiol. 2021, 20, 394–426.
- Fassett, C.I. Analysis of impact crater populations and the geochronology of planetary surfaces in the inner solar system. J. Geophys. Res. Planets 2016, 121, 1900–1926.
- Golombek, M.; Rapp, D. Size-frequency distributions of rocks on Mars and Earth analog sites: Implications for future landed missions. J. Geophys. Res. Planets 1997, 102, 4117–4129.
- Gerdes, L.; Azkarate, M.; Sánchez-Ibáñez, J.R.; Joudrier, L.; Perez-del-Pulgar, C.J. Efficient autonomous navigation for planetary rovers with limited resources. J. Field Robot. 2020, 37, 1153–1170.
- Hood, D.R.; Sholes, S.F.; Karunatillake, S.; Fassett, C.I.; Ewing, R.C.; Levy, J. The Martian Boulder Automatic Recognition System, MBARS. Earth Space Sci. 2022, 9, e2022EA002410.
- Bickel, V.T.; Aaron, J.; Manconi, A.; Loew, S.; Mall, U. Impacts drive lunar rockfalls over billions of years. Nat. Commun. 2020, 11, 2862.
- Yang, C.; Zhao, H.; Bruzzone, L.; Benediktsson, J.A.; Liang, Y.; Liu, B.; Zeng, X.; Guan, R.; Li, C.; Ouyang, Z. Lunar impact crater identification and age estimation with Chang’E data by deep and transfer learning. Nat. Commun. 2020, 11, 6358.
- Chen, L.C.; Papandreou, G.; Schroff, F.; Adam, H. Rethinking Atrous Convolution for Semantic Image Segmentation. arXiv 2017.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 6000–6010.
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 10012–10022.
- Sun, H.; Wang, Y.; Wang, X.; Zhang, B.; Xin, Y.; Zhang, B.; Cao, X.; Ding, E.; Han, S. MAFormer: A transformer network with multi-scale attention fusion for visual recognition. Neurocomputing 2024, 595, 127828.
- Xie, E.; Wang, W.; Yu, Z.; Anandkumar, A.; Alvarez, J.M.; Luo, P. SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers. In Proceedings of the 35th International Conference on Neural Information Processing Systems, Red Hook, NY, USA, 6–14 December 2021; pp. 12077–12090.
- Wang, W.; Xie, E.; Li, X.; Fan, D.P.; Song, K.; Liang, D.; Lu, T.; Luo, P.; Shao, L. PVT v2: Improved baselines with Pyramid Vision Transformer. Comput. Vis. Media 2022, 8, 415–424.
- Xiong, Y.; Xiao, X.; Yao, M.; Cui, H.; Fu, Y. Light4Mars: A lightweight transformer model for semantic segmentation on unstructured environment like Mars. ISPRS J. Photogramm. Remote Sens. 2024, 214, 12.
- Zhang, X.; Song, Y.; Song, T.; Yang, D.; Ye, Y.; Zhou, J.; Zhang, L. AKConv: Convolutional Kernel with Arbitrary Sampled Shapes and Arbitrary Number of Parameters. arXiv 2023.
- Yu, Y.; Zhang, Y.; Cheng, Z.; Song, Z.; Tang, C. MCA: Multidimensional collaborative attention in deep convolutional neural networks for image recognition. Eng. Appl. Artif. Intell. 2023, 126, 107079.
- Xiong, Y.; Xiao, X.; Yao, M.; Liu, H.; Yang, H.; Fu, Y. MarsFormer: Martian Rock Semantic Segmentation with Transformer. IEEE Trans. Geosci. Remote Sens. 2023, 61, 4600612.
- Liu, H.; Yao, M.; Xiao, X.; Xiong, Y. RockFormer: A U-Shaped Transformer Network for Martian Rock Segmentation. IEEE Trans. Geosci. Remote Sens. 2023, 61, 4600116.
- Smith, D.E.; Zuber, M.T.; Frey, H.V.; Garvin, J.B.; Head, J.W.; Muhleman, D.O.; Pettengill, G.H.; Phillips, R.J.; Solomon, S.C.; Zwally, H.J. Mars Orbiter Laser Altimeter: Experiment summary after the first year of global mapping of Mars. J. Geophys. Res. Planets 2001, 106, 23689–23722.
- Qiao, W.; Zhao, Y.; Xu, Y.; Lei, Y.; Wang, Y.; Yu, S.; Li, H. Deep learning-based pixel-level rock fragment recognition during tunnel excavation using instance segmentation model. Tunn. Undergr. Space Technol. 2021, 115, 104072.
- Canny, J. A Computational Approach to Edge Detection. IEEE Trans. Pattern Anal. Mach. Intell. 1986, PAMI-8, 679–698.
- Furlán, F.; Rubio, E.; Sossa, H.; Ponce, V. Rock Detection in a Mars-Like Environment Using a CNN. In Pattern Recognition, 11th Mexican Conference, MCPR 2019, Querétaro, Mexico, 26–29 June 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 149–158.
- Kuang, B.; Wisniewski, M.; Rana, Z.A.; Zhao, Y. Rock Segmentation in the Navigation Vision of the Planetary Rovers. Mathematics 2021, 9, 3048.
- Ebadi, K.; Coble, K.; Kogan, D.; Atha, D.; Schwartz, R.; Padgett, C.; Hook, J.V. Semantic Mapping in Unstructured Environments: Toward Autonomous Localization of Planetary Robotic Explorers. In Proceedings of the IEEE Aerospace Conference, Big Sky, MT, USA, 5–12 March 2022; pp. 1–10.
- Lv, W.; Wei, L.; Zheng, D.; Liu, Y.; Wang, Y. MarsNet: Automated Rock Segmentation with Transformers for Tianwen-1 Mission. IEEE Geosci. Remote Sens. Lett. 2023, 20, 3506605.
- Li, J.; Chen, K.; Tian, G.; Li, L.; Shi, Z. MarsSeg: Mars Surface Semantic Segmentation with Multi-level Extractor and Connector. IEEE Trans. Geosci. Remote Sens. 2024, 63, 4501012.
- Wei, P.; Sun, Z.; Tian, H. LBNet: A Lightweight Bilateral Network for Semantic Segmentation of Martian Rock. IEEE Access 2024, 12, 182137–182144.
- Wei, P.; Sun, Z.; Tian, H. Rocknet: Lightweight network for real-time segmentation of Martian rocks. J. Real-Time Image Process. 2025, 22, 41.
- Jia, Y.; Wan, G.; Li, W.; Li, C.; Liu, J.; Cong, D.; Liu, L. EDR-TransUnet: Integrating Enhanced Dual Relation-Attention with Transformer U-Net for Multiscale Rock Segmentation on Mars. IEEE Trans. Geosci. Remote Sens. 2024, 62, 4601416.
- Ma, Y.; Li, Z.; Wu, B.; Duan, R. DepthFormer: Depth-enhanced transformer network for semantic segmentation of the Martian surface from rover images. Earth Space Sci. 2025, 12, e2024EA003812.
- Lin, B.; Wang, F.; Li, Q.; Zheng, B.; Yao, M.; Xiao, X.; Qi, Y.; Cui, H.; Huang, X. LisseMars: A Lightweight Semantic Segmentation Model for Mars Helicopter. Aerospace 2025, 12, 1049.
- Dong, B.; Wang, P.; Wang, F. Head-Free Lightweight Semantic Segmentation with Linear Transformer. In Proceedings of the AAAI Conference on Artificial Intelligence, Washington DC, USA, 7–14 February 2023.
- Ma, C.; Li, Y.; Lv, J.; Xiao, Z.; Zhang, W.; Mo, L. Automated Rock Detection from Mars Rover Image via Y-Shaped Dual-Task Network with Depth-Aware Spatial Attention Mechanism. IEEE Trans. Geosci. Remote Sens. 2024, 62, 4600418.
- Thompson, D.R.; Castano, R. Performance Comparison of Rock Detection Algorithms for Autonomous Planetary Geology. In Proceedings of the IEEE Aerospace Conference, Big Sky, MT, USA, 3–10 March 2007; pp. 1–9.
- Niekum, S. Reliable Rock Detection and Classification for Autonomous Science. Master’s Thesis, Carnegie Mellon University, Pittsburgh, PA, USA, 2008.
- Wang, C.; Zhang, Z.; Zhang, Y.; Tian, R.; Ding, M. GMSRI: A Texture-Based Martian Surface Rock Image Dataset. Sensors 2021, 21, 5410.
- Xiao, X.; Yao, M.; Liu, H.; Wang, J.; Zhang, L.; Fu, Y. A Kernel-Based Multi-Featured Rock Modeling and Detection Framework for a Mars Rover. IEEE Trans. Neural Netw. Learn. Syst. 2023, 34, 3335–3344.
- Woo, S.; Park, J.; Lee, J.-Y.; Kweon, I.S. CBAM: Convolutional Block Attention Module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
- Zelinsky, A. Learning OpenCV—Computer Vision with the OpenCV Library. IEEE Robot. Autom. Mag. 2009, 16, 100.
- Keskar, N.S.; Mudigere, D.; Nocedal, J.; Smelyanskiy, M.; Tang, P.T.P. On large-batch training for deep learning: Generalization gap and sharp minima. arXiv 2016, arXiv:1609.04836.
- Duchi, J.; Hazan, E.; Singer, Y. Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res. 2011, 12, 2121–2159.
- De, S.; Mukherjee, A.; Ullah, E. Convergence guarantees for RMSProp and ADAM in non-convex optimization and an empirical comparison to Nesterov acceleration. arXiv 2018, arXiv:1807.06766.
- Golombek, M.P.; Trussell, A.R.; Williams, N.R.; Charalambous, C.; Abarca, H.; Warner, N.H.; Deahn, M.; Trautman, M.R.; Crocco, B.; Grant, J.A.; et al. Rock Size-Frequency Distributions at the InSight Landing Site, Mars. Earth Space Sci. 2021, 8, e2021EA001959.
- Han, X.; Papyan, V.; Donoho, D.L. Neural collapse under mse loss: Proximity to and dynamics on the central path. arXiv 2021, arXiv:2106.02073.
- Zhang, J.; Li, X.; Li, J.; Liu, L.; Xue, Z.; Zhang, B.; Jiang, Z.; Huang, T.; Wang, Y.; Wang, C. Rethinking Mobile Block for Efficient Attention-based Models. arXiv 2023.
- Wu, H.; Zhang, J.; Huang, K.; Liang, K.; Yu, Y. FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation. arXiv 2019.
- Chen, L.-C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 801–818.
- Xu, J.; Xiong, Z.; Bhattacharyya, S.P. PIDNet: A real-time semantic segmentation network inspired by PID controllers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 19529–19539.
- Mehta, S.; Rastegari, M. Mobilevit: Light-weight, general-purpose, and mobile-friendly vision transformer. arXiv 2021, arXiv:2110.02178.
- Howard, A.; Sandler, M.; Chen, B.; Wang, W.; Chen, L.C.; Tan, M.; Chu, G.; Vasudevan, V.; Zhu, Y.; Pang, R. Searching for MobileNetV3. In Proceedings of the IEEE International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 1314–1324.
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, 18th International Conference, Munich, Germany, 5–9 October 2015; Springer: Berlin/Heidelberg, Germany, 2015; pp. 234–241.
- Zhao, H.; Shi, J.; Qi, X.; Wang, X.; Jia, J. Pyramid Scene Parsing Network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 6230–6239.
- Liu, Z.; Mao, H.; Wu, C.Y.; Feichtenhofer, C.; Darrell, T.; Xie, S. A ConvNet for the 2020s. arXiv 2022.
- Di, K.; Xu, B.; Peng, M.; Yue, Z.; Liu, Z.; Wan, W.; Li, L.; Zhou, J. Rock size-frequency distribution analysis at the Chang’E-3 landing site. Planet. Space Sci. 2016, 120, 103–112.
- Wang, B.; Gou, S.; Di, K.; Wan, W.; Peng, M.; Zhao, C.; Zhang, Y.; Xie, B. Rock size-frequency distribution analysis at the Zhurong landing site based on Navigation and Terrain Camera images along the entire traverse. Icarus 2024, 413, 116001.

| Variant | AKConv | EMCA | Params (M) | FLOPs (G) | IoU (%) | PA (%) | Pre (%) | F1 (%) |
|---|---|---|---|---|---|---|---|---|
| (a) | × | × | 3.02 | 14.68 | 84.32 | 96.91 | 90.15 | 91.50 |
| (b) | √ | × | 2.97 | 15.22 | 86.58 | 97.49 | 91.83 | 92.81 |
| (c) | × | √ | 3.00 | 14.96 | 86.13 | 97.35 | 91.52 | 92.55 |
| (d) | √ | MCA | 2.97 | 15.37 | 87.52 | 97.83 | 92.41 | 93.34 |
| (e) MAFT | √ | √ | 2.97 | 15.49 | 88.90 | 98.17 | 93.43 | 94.12 |

| Configuration | Params (M) | FLOPs (G) | Δ FLOPs (G) | IoU (%) |
|---|---|---|---|---|
| AFFormer baseline | 3.02 | 14.68 | - | 84.32 |
| AFFormer + AKConv | 2.97 | 15.22 | +0.54 | 86.58 |
| ViT | 144.06 | 385.46 | - | 80.75 |
| AFFormer + AKConv + EMCA (MAFT) | 2.97 | 15.49 | +0.27 | 88.90 |

| Method | AKConv | EMCA | IAFFormer | IoU (%) | F1 (%) | Recall (%) |
|---|---|---|---|---|---|---|
| MAFT-1 | √ | √ | × | 82.15 | 90.23 | 91.18 |
| MAFT-2 | × | √ | √ | 86.35 | 92.68 | 93.76 |
| MAFT-3 | √ | × | √ | 86.58 | 92.81 | 93.80 |
| MAFT | √ | √ | √ | 88.90 | 94.12 | 94.82 |

| Model | Methods | TWMARS-V2 Pre (%) | TWMARS-V2 IoU (%) | TWMARS-V2 PA (%) | TWMARS-V2 F1 (%) | MarsData-V2 Pre (%) | MarsData-V2 IoU (%) | MarsData-V2 PA (%) | MarsData-V2 F1 (%) | SynMars Pre (%) | SynMars IoU (%) | SynMars PA (%) | SynMars F1 (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Martian rock Methods | NI-U-Net++ | 82.63 | 70.14 | 78.49 | 82.45 | 92.67 | 89.05 | 91.20 | 94.21 | 87.59 | 83.15 | 94.52 | 90.80 |
| | MarsNet | 92.58 | 84.56 | 88.17 | 91.63 | 93.29 | 90.26 | 91.53 | 94.88 | 93.46 | 91.59 | 92.10 | 95.61 |
| CNN-based | EMO-5M | 89.67 | 80.24 | 83.49 | 89.04 | 93.75 | 90.24 | 92.34 | 94.87 | 90.64 | 84.66 | 85.59 | 91.69 |
| | FastFCN | 85.36 | 78.54 | 82.17 | 87.98 | 90.28 | 87.56 | 89.52 | 93.37 | 86.54 | 82.29 | 84.58 | 90.28 |
| | DeepLabV3+ | 91.70 | 78.71 | 83.29 | 88.09 | 97.56 | 94.30 | 95.18 | 97.07 | 93.40 | 89.33 | 91.28 | 94.36 |
| | PIDNet-S | 89.58 | 82.49 | 86.30 | 90.40 | 86.14 | 75.83 | 84.23 | 86.25 | 88.72 | 80.14 | 85.11 | 88.98 |
| | MobileViT-S | 91.24 | 84.16 | 87.20 | 91.40 | 87.80 | 77.62 | 84.89 | 87.40 | 90.34 | 81.93 | 85.93 | 90.07 |
| | MobileNetV3 | 90.12 | 81.59 | 84.25 | 89.86 | 97.09 | 92.47 | 93.46 | 96.09 | 93.88 | 92.47 | 93.59 | 96.09 |
| UNet-based | UNet | 80.54 | 72.26 | 76.58 | 83.90 | 91.59 | 81.52 | 88.56 | 89.82 | 88.17 | 76.75 | 82.58 | 86.85 |
| | PSPNet | 92.48 | 81.59 | 88.54 | 89.86 | 96.39 | 92.11 | 94.23 | 95.89 | 92.00 | 84.27 | 89.56 | 91.46 |
| Transformer-based | SegFormer | 92.16 | 83.91 | 97.53 | 91.25 | 94.83 | 91.80 | 92.68 | 96.27 | 92.33 | 83.49 | 88.17 | 91.00 |
| | ViT | 89.26 | 80.75 | 97.95 | 89.35 | 93.91 | 91.3 | 92.13 | 95.45 | 90.61 | 81.42 | 83.56 | 89.76 |
| | Swin Transformer | 91.3 | 86.20 | 97.87 | 92.59 | 96.51 | 93.38 | 97.53 | 96.58 | 91.30 | 83.40 | 85.25 | 90.95 |
| | MAFT | 93.43 | 88.90 | 98.17 | 94.12 | 98.18 | 96.62 | 98.64 | 98.28 | 94.37 | 92.80 | 92.93 | 96.27 |


| Model | Methods | Params (M) | FLOPs (G) | FPS (GPU), TWMARS-V2 | FPS (GPU), MarsData-V2 | FPS (GPU), SynMars | FPS (CPU), TWMARS-V2 |
|---|---|---|---|---|---|---|---|
| Mars Rocks-Methods | NI-U-Net++ | 13.45 | 44.60 | 19.30 | 19.00 | 19.20 | 3.65 |
| | MarsNet | 33.21 | 240.38 | 31.21 | 30.80 | 30.50 | 2.20 |
| CNN-based | EMO-5M | 10.28 | 16.05 | 12.40 | 12.10 | 12.21 | 2.32 |
| | FastFCN | 68.71 | 60.52 | 10.61 | 10.40 | 10.54 | 1.90 |
| | PIDNet-S | 3.67 | 15.65 | 31.62 | 32.23 | 31.92 | 6.50 |
| | MobileViT-S | 5.60 | 15.70 | 31.90 | 32.80 | 32.54 | 4.30 |
| | MobileNetV3 | 3.28 | 11.60 | 30.53 | 30.26 | 30.28 | 7.60 |
| | ConvNeXt [59] | 122.10 | 100.58 | 4.24 | 4.18 | 4.26 | 0.13 |
| UNet-based | UNet | 7.75 | 18.08 | 5.90 | 5.75 | 5.80 | 1.12 |
| | PSPNet | 58.95 | 234.90 | 12.46 | 12.89 | 12.35 | 2.22 |
| Transformer-based | SegFormer | 3.72 | 20.34 | 9.64 | 9.90 | 9.51 | 0.6 |
| | ViT | 144.06 | 385.46 | 1.59 | 1.62 | 1.60 | 0.06 |
| | Swin Transformer | 58.95 | 241.66 | 25.53 | 26.11 | 25.34 | 1.4 |
| | MAFT | 2.97 | 15.49 | 35.25 | 35.94 | 35.96 | 8.46 |
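
The throughput figures in the table above can be turned into per-frame latency and relative model size with simple arithmetic. The short sketch below does this for MAFT using only values copied from the table; the function name and the choice of MarsNet as the comparison point are illustrative, and the result says nothing about flight hardware, which is not specified here.

```python
def frame_latency_ms(fps):
    """Convert frames per second into milliseconds per frame."""
    return 1000.0 / fps


# TWMARS-V2 values for MAFT, copied from the table above.
maft_fps = {"GPU": 35.25, "CPU": 8.46}
latency = {device: frame_latency_ms(fps) for device, fps in maft_fps.items()}

# Parameter counts in millions, copied from the same table.
marsnet_params, maft_params = 33.21, 2.97
param_ratio = marsnet_params / maft_params

for device, ms in latency.items():
    print(f"{device}: {ms:.2f} ms per frame")
print(f"MarsNet has {param_ratio:.2f}x the parameters of MAFT")
```

On these numbers MAFT needs roughly 28 ms per frame on the GPU and roughly 118 ms per frame on the CPU, and it carries about one eleventh of MarsNet's parameters.
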
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Li, C.; Jia, Y.; Wan, G.; Ma, Q.; Liu, J.; Wang, Y.; Wang, B.; Liu, J.; Wei, Z. MAFT: A Lightweight Network for Martian Rock Segmentation Based on an Adaptive Frequency Transformer. Remote Sens. 2026, 18, 1794. https://doi.org/10.3390/rs18111794

