S5Utis: Structured State-Space Sequence SegNeXt UNet-like Tongue Image Segmentation in Traditional Chinese Medicine
Abstract
1. Introduction
- A tongue image segmentation model based on the SegNeXt network is proposed, with the S4 network as a more efficient and lightweight network backbone. It uses improved S4-2D convolutional self-attention for multi-scale feature fusion.
- Cross-layer connections and residual connections are used in the decoder to achieve layer-by-layer upsampling and improve segmentation accuracy.
- The model achieves higher segmentation accuracy than other currently popular semantic segmentation networks on tongue images taken by non-laboratory personnel with non-professional equipment.
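
The S4 backbone named in the first contribution builds on linear state-space models. The sketch below is not the authors' S4-2D block: it assumes a small diagonal state matrix (a stand-in for the HiPPO-structured matrix used by S4), a bilinear discretization, and hypothetical toy values for `a`, `b`, `c`, and `step`. It only illustrates the property that makes such layers efficient: the step-by-step recurrence and a single causal convolution give the same output.

```python
def discretize(a, b, step):
    """Bilinear transform of a diagonal continuous-time SSM x' = a*x + b*u."""
    a_bar = [(1 + step / 2 * ai) / (1 - step / 2 * ai) for ai in a]
    b_bar = [step * bi / (1 - step / 2 * ai) for ai, bi in zip(a, b)]
    return a_bar, b_bar


def run_recurrence(a_bar, b_bar, c, u):
    """Sequential view: x_k = a_bar * x_{k-1} + b_bar * u_k, y_k = c . x_k."""
    x = [0.0] * len(a_bar)
    ys = []
    for uk in u:
        x = [ab * xi + bb * uk for ab, bb, xi in zip(a_bar, b_bar, x)]
        ys.append(sum(ci * xi for ci, xi in zip(c, x)))
    return ys


def ssm_kernel(a_bar, b_bar, c, length):
    """Convolution kernel K_j = sum_n c_n * a_bar_n**j * b_bar_n."""
    return [
        sum(ci * ab ** j * bb for ci, ab, bb in zip(c, a_bar, b_bar))
        for j in range(length)
    ]


def run_convolution(kernel, u):
    """Convolutional view: y_k = sum_{j<=k} K_j * u_{k-j}."""
    return [
        sum(kernel[j] * u[k - j] for j in range(k + 1))
        for k in range(len(u))
    ]


# Hypothetical toy parameters (not taken from the paper).
a = [-1.0, -2.0, -3.0]
b = [1.0, 1.0, 1.0]
c = [0.5, -0.25, 0.125]
u = [1.0, 0.0, 2.0, -1.0, 0.5]

a_bar, b_bar = discretize(a, b, step=0.1)
y_rec = run_recurrence(a_bar, b_bar, c, u)
y_conv = run_convolution(ssm_kernel(a_bar, b_bar, c, len(u)), u)
max_gap = max(abs(r - v) for r, v in zip(y_rec, y_conv))
```

Here `max_gap` is zero up to floating-point rounding, which is why an SSM layer can be trained as a convolution and still be interpreted as a recurrence.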
2. Related Work
2.1. Traditional Image Segmentation
2.2. Deep Learning Segmentation
3. Methods
3.1. Encoder
3.2. S4-2D Block
3.2.1. State Space Models
3.2.2. HIPPO Matrix and S4 Structure
3.2.3. S4 for 2D Input Image
3.3. Decoder
3.4. Loss Function
4. Results and Discussion
4.1. Dataset and Implementation Details
4.2. Experimental Results and Analysis
4.3. Ablation Experiment Results
4.4. Tongue Segmentation Visualization
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest

| Stage | Output Size | e.r. | S5Utis |
|---|---|---|---|
| 1 | | 8 | |
| 2 | | 8 | |
| 3 | | 4 | |
| 4 | | 4 | |

| Hyperparameter | Parameter Setting |
|---|---|
| Base_lr | 0.0001 | 
| Beta_1 | 0.9 | 
| Beta_2 | 0.999 | 
| Momentum_1 | 0.9 | 
| Momentum_2 | 0.999 | 
| Batch_size | 16 | 
| Droppath_rate in Encoder | 0.01 | 
| BN_epsilon in Encoder | 0.00001 | 
| BN_momentum in Encoder | 0.1 | 
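
The `Base_lr`, `Beta_1`, and `Beta_2` settings above match the usual coefficients of an Adam-style optimizer. The source does not name the optimizer, so the following is only a sketch under that assumption; `EPS` is likewise an assumed value that the table does not list. It shows what the settings imply for the very first update of a single scalar parameter.

```python
import math

# Values copied from the hyperparameter table.
BASE_LR = 1e-4
BETA_1 = 0.9
BETA_2 = 0.999
# Assumption: numerical-stability constant, not listed in the table.
EPS = 1e-8


def adam_step(param, grad, m, v, t,
              lr=BASE_LR, beta1=BETA_1, beta2=BETA_2, eps=EPS):
    """One Adam update for a scalar parameter at step t (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# First step from the same starting point with a small and a large gradient.
p_small, _, _ = adam_step(1.0, 0.5, 0.0, 0.0, t=1)
p_large, _, _ = adam_step(1.0, 50.0, 0.0, 0.0, t=1)
```

Under these assumptions both first steps move the parameter by roughly `BASE_LR`, independent of the gradient magnitude, because the bias-corrected moments cancel on the first update.
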

| Model | Loss | Dice (%) | mIoU (%) | PA (%) |
|---|---|---|---|---|
| UNet | CE | 96.08 | 93.37 | 98.37 | 
| UNet(S4ver) | CE | 97.20 | 94.88 | 98.91 | 
| UNETR | CE | 90.98 | 85.14 | 95.71 | 
| SETR | CE | 94.76 | 91.01 | 97.61 | 
| Segformer | CE | 87.39 | 80.51 | 93.76 | 
| SegNeXt | CE | 97.20 | 94.80 | 98.81 | 
| S5Utis | CE | 98.01 | 96.18 | 99.21 | 
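
For readers who want to reproduce the three columns, the sketch below computes Dice, mIoU, and pixel accuracy (PA) for one binary mask using the common definitions. It assumes two classes (tongue and background) and that mIoU is the unweighted mean of the two per-class IoU values; the exact averaging used for the table is not stated here, and the 4×4 masks are made-up toy data.

```python
def confusion(pred, truth):
    """Count TP, FP, FN, TN over two binary masks of the same shape."""
    tp = fp = fn = tn = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def dice(tp, fp, fn, tn):
    return 2 * tp / (2 * tp + fp + fn)


def mean_iou(tp, fp, fn, tn):
    iou_tongue = tp / (tp + fp + fn)
    iou_background = tn / (tn + fp + fn)
    return (iou_tongue + iou_background) / 2


def pixel_accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)


# Toy ground truth and prediction (hypothetical data, 1 = tongue).
truth = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

counts = confusion(pred, truth)        # (3, 1, 1, 11)
dice_score = dice(*counts)             # 0.75
miou_score = mean_iou(*counts)
pa_score = pixel_accuracy(*counts)     # 0.875
```

The functions assume both classes occur in the masks; an empty tongue or background region would need an explicit zero-division guard.
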

| Encoder | Decoder | Dice (%) | mIoU (%) | PA (%) |
|---|---|---|---|---|
| UNet | NONE | 96.08 | 93.37 | 98.37 | 
| UNet(S4ver) | NONE | 97.20 | 94.88 | 98.91 | 
| SegNeXt | HAM | 97.20 | 94.80 | 98.81 | 
| SegNeXt(S4ver) | HAM | 97.22 | 94.50 | 98.82 | 
| SegNeXt | UNETR | 97.98 | 96.15 | 99.21 | 
| SegNeXt(S4ver) | UNETR | 98.01 | 96.18 | 99.21 | 

| Model | Loss | Dice (%) | mIoU (%) | PA (%) |
|---|---|---|---|---|
| S5Utis | CE | 98.01 | 96.18 | 99.21 | 
| S5Utis | CE + BL | 97.78 | 95.77 | 99.14 | 
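
The rows above compare cross-entropy (CE) alone with CE plus a boundary loss (BL), where CE alone scores slightly higher. The sketch below illustrates how such a combination can be formed on a 1D toy mask. It is a simplified reading of the boundary-loss idea (foreground probability weighted by a signed distance to the ground-truth boundary, negative inside the object), not the authors' implementation; the brute-force distance, the weight `alpha`, and all data values are hypothetical.

```python
import math


def binary_cross_entropy(probs, mask, eps=1e-7):
    """Mean pixel-wise cross-entropy for foreground probabilities."""
    total = 0.0
    for p, g in zip(probs, mask):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total -= g * math.log(p) + (1 - g) * math.log(1 - p)
    return total / len(mask)


def signed_distance(mask):
    """Distance to the nearest pixel of the other class, negative inside
    the foreground. Brute force; assumes both classes are present."""
    out = []
    for i, g in enumerate(mask):
        d = min(abs(i - j) for j, other in enumerate(mask) if other != g)
        out.append(-d if g else d)
    return out


def boundary_loss(probs, mask):
    """Mean of signed distance times predicted foreground probability."""
    phi = signed_distance(mask)
    return sum(d * p for d, p in zip(phi, probs)) / len(mask)


def combined_loss(probs, mask, alpha=0.01):
    # alpha is a hypothetical weight; the source does not state one.
    return binary_cross_entropy(probs, mask) + alpha * boundary_loss(probs, mask)


mask = [0, 0, 1, 1, 1, 0, 0]
aligned = [0.1, 0.1, 0.9, 0.9, 0.9, 0.1, 0.1]   # matches the mask
shifted = [0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.1]   # off by one pixel

phi = signed_distance(mask)
loss_aligned = combined_loss(aligned, mask)
loss_shifted = combined_loss(shifted, mask)
```

In this toy case the prediction shifted by one pixel is penalized by both terms, so `loss_shifted` is larger than `loss_aligned`.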
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Song, D.; Zhang, H.; Shi, L.; Xu, H.; Xu, Y. S5Utis: Structured State-Space Sequence SegNeXt UNet-like Tongue Image Segmentation in Traditional Chinese Medicine. Sensors 2024, 24, 4046. https://doi.org/10.3390/s24134046