Enabling Self-Practice of Digital Audio–Tactile Maps for Visually Impaired People by Large Language Models
Abstract
:1. Introduction
2. Related Work
3. Conceptual Framework
4. Prototype Implementation and Experiment Procedure
4.1. Prototype Implementation
4.2. Feedback Design
4.3. Prerequisite Information for the LLM Assistant
4.4. Experiment Procedure
- Q1.
- Did the LLM assistant hear the inquiry correctly?
- Q2.
- Did the LLM assistant respond correctly?
- Q3.
- Whether the LLM assistant responded fast?
- Q4.
- Whether the LLM assistant was helpful for you to understand the map?
5. Results and Analysis
5.1. Similarity Scores between Ground Truth DATMs and Maps Drawn by the Participants
5.2. Subjective Evaluation of the LLM Assistant
6. Discussion
6.1. Effectiveness and Usability
6.2. Concerns for Hallucination Problem
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
DATM | digital audio–tactile map |
PVI | people who are visually impaired |
LLM | large language model |
POI | point of interest |
References
- Ito, Y.; Kiyohara, H.; Awamura, K.; Yamaoka, C. People with Visual Impairment Continue to Experience Difficulties in Their Daily Lives that Affect Their Health-related Quality of Life after the COVID-19 Pandemic. JMA J. 2024, 7, 114–119. [Google Scholar] [CrossRef]
- Alves, J.P.; Eusébio, C.; Carneiro, M.J.; Teixeira, L.; Mesquita, S. Living in an untouchable world: Barriers to recreation and tourism for Portuguese blind people during the COVID-19 pandemic. J. Outdoor Recreat. Tour. 2023, 42, 100637. [Google Scholar] [CrossRef]
- Engel, C.; Müller, K.; Constantinescu, A.; Loitsch, C.; Petrausch, V.; Weber, G.; Stiefelhagen, R. Travelling more independently: A Requirements Analysis for Accessible Journeys to Unknown Buildings for People with Visual Impairments. In Proceedings of the 22nd International ACM SIGACCESS Conference on Computers and Accessibility, ASSETS ’20, Virtual Event, 26–28 October 2020. [Google Scholar] [CrossRef]
- Google Maps. Available online: https://www.google.com/maps/ (accessed on 24 May 2024).
- Chebat, D.R.; Schneider, F.C.; Ptito, M. Spatial Competence and Brain Plasticity in Congenital Blindness via Sensory Substitution Devices. Front. Neurosci. 2020, 14, 815. [Google Scholar] [CrossRef]
- Jakub Wabiński, A.M.; Touya, G. Guidelines for Standardizing the Design of Tactile Maps: A Review of Research and Best Practice. Cartogr. J. 2022, 59, 239–258. [Google Scholar] [CrossRef]
- Hofmann, M.; Mack, K.; Birchfield, J.; Cao, J.; Hughes, A.G.; Kurpad, S.; Lum, K.J.; Warnock, E.; Caspi, A.; Hudson, S.E.; et al. Maptimizer: Using Optimization to Tailor Tactile Maps to Users Needs. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems, CHI ’22, New Orleans, LA, USA, 29 April–5 May 2022. [Google Scholar] [CrossRef]
- Palivcová, D.; Macík, M.; Míkovec, Z. Interactive Tactile Map as a Tool for Building Spatial Knowledge of Visually Impaired Older Adults. In Proceedings of the Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems, CHI EA ’20, Honolulu, HI, USA, 25–30 April 2020; pp. 1–9. [Google Scholar] [CrossRef]
- Wang, X.; Kayukawa, S.; Takagi, H.; Asakawa, C. BentoMuseum: 3D and Layered Interactive Museum Map for Blind Visitors. In Proceedings of the 24th International ACM SIGACCESS Conference on Computers and Accessibility, ASSETS ’22, Athens, Greece, 23–26 October 2022. [Google Scholar] [CrossRef]
- Ottink, L.; van Raalte, B.; Doeller, C.F.; Van der Geest, T.M.; Van Wezel, R.J.A. Cognitive map formation through tactile map navigation in visually impaired and sighted persons. Sci. Rep. 2022, 12, 11567. [Google Scholar] [CrossRef]
- Ottink, L.; Hoogendonk, M.; Doeller, C.F.; Van der Geest, T.M.; Van Wezel, R.J.A. Cognitive map formation through haptic and visual exploration of tactile city-like maps. Sci. Rep. 2021, 11, 15254. [Google Scholar] [CrossRef]
- Holloway, L.; Ananthanarayan, S.; Butler, M.; De Silva, M.T.; Ellis, K.; Goncu, C.; Stephens, K.; Marriott, K. Animations at Your Fingertips: Using a Refreshable Tactile Display to Convey Motion Graphics for People who are Blind or have Low Vision. In Proceedings of the 24th International ACM SIGACCESS Conference on Computers and Accessibility, ASSETS ’22, Athens, Greece, 23–26 October 2022. [Google Scholar] [CrossRef]
- Paratore, M.T.; Leporini, B. Exploiting the haptic and audio channels to improve orientation and mobility apps for the visually impaired. Univers. Access Inf. Soc. 2023, 23, 859–869. [Google Scholar] [CrossRef]
- Palani, H.P.; Fink, P.D.S.; Giudice, N.A. Comparing Map Learning between Touchscreen-Based Visual and Haptic Displays: A Behavioral Evaluation with Blind and Sighted Users. Multimodal Technol. Interact. 2022, 6, 1. [Google Scholar] [CrossRef]
- Feitl, S.; Kreimeier, J.; Götzelmann, T. Accessible Electrostatic Surface Haptics: Towards an Interactive Audiotactile Map Interface for People with Visual Impairments. In Proceedings of the 15th International Conference on PErvasive Technologies Related to Assistive Environments, PETRA ’22, Corfu, Greece, 29 June–1 July 2022; pp. 522–531. [Google Scholar] [CrossRef]
- Poppinga, B.; Magnusson, C.; Pielot, M.; Rassmus-Gröhn, K. TouchOver map: Audio-tactile exploration of interactive maps. In Proceedings of the 13th International Conference on Human Computer Interaction with Mobile Devices and Services, MobileHCI ’11, Stockholm, Sweden, 30 August–2 September 2011; pp. 545–550. [Google Scholar] [CrossRef]
- Kaklanis, N.; Votis, K.; Tzovaras, D. A mobile interactive maps application for a visually impaired audience. In Proceedings of the 10th International Cross-Disciplinary Conference on Web Accessibility, W4A ’13, Rio de Janeiro, Brazil, 13–15 May 2023. [Google Scholar] [CrossRef]
- Darvishy, A.; Hutter, H.P.; Grossenbacher, M.; Merz, D. Touch Explorer: Exploring Digital Maps for Visually Impaired People. In Computers Helping People with Special Needs, Proceedings of the 17th International Conference, ICCHP 2020, Lecco, Italy, 9–11 September 2020; Proceedings, Part I; Springer: Berlin/Heidelberg, Germany, 2020; pp. 427–434. [Google Scholar] [CrossRef]
- Tivadar, R.I.; Franceschiello, B.; Minier, A.; Murray, M.M. Learning and navigating digitally rendered haptic spatial layouts. Npj Sci. Learn. 2023, 8, 61. [Google Scholar] [CrossRef]
- Giudice, N.A.; Guenther, B.A.; Jensen, N.A.; Haase, K.N. Cognitive Mapping Without Vision: Comparing Wayfinding Performance After Learning From Digital Touchscreen-Based Multimodal Maps vs. Embossed Tactile Overlays. Front. Hum. Neurosci. 2020, 14, 87. [Google Scholar] [CrossRef]
- Johnson, K.O.; Phillips, J.R. Tactile spatial resolution. I. Two-point discrimination, gap detection, grating resolution, and letter recognition. J. Neurophysiol. 1981, 46, 1177–1192. [Google Scholar] [CrossRef] [PubMed]
- Yau, J.M.; Kim, S.S.; Thakur, P.H.; Bensmaia, S.J. Feeling form: The neural basis of haptic shape perception. J. Neurophysiol. 2016, 115, 631–642. [Google Scholar] [CrossRef] [PubMed]
- Robinson Moore, W.J.; Kalal, M.; Tennison, J.L.; Giudice, N.A.; Gorlewicz, J. Spatial Audio-Enhanced Multimodal Graph Rendering for Efficient Data Trend Learning on Touchscreen Devices. In Proceedings of the CHI Conference on Human Factors in Computing Systems, CHI ’24, Honolulu, HI, USA, 11–16 May 2024. [Google Scholar] [CrossRef]
- Gorlewicz, J.L.; Tennison, J.L.; Uesbeck, P.M.; Richard, M.E.; Palani, H.P.; Stefik, A.; Smith, D.W.; Giudice, N.A. Design Guidelines and Recommendations for Multimodal, Touchscreen-based Graphics. ACM Trans. Access. Comput. 2020, 13, 1–30. [Google Scholar] [CrossRef]
- Jain, G.; Teng, Y.; Cho, D.H.; Xing, Y.; Aziz, M.; Smith, B.A. “I Want to Figure Things Out”: Supporting Exploration in Navigation for People with Visual Impairments. Proc. ACM Hum.-Comput. Interact. 2023, 7, 1–28. [Google Scholar] [CrossRef]
- Schles, R.A.; Chastain, M. Teachers of Students With Visual Impairments: Motivations for Entering the Field of Visual Impairment and Reflections on Pre-Service Training. J. Vis. Impair. Blind. 2023, 117, 62–73. [Google Scholar] [CrossRef]
- Alhammadi, M.M. Availability of disability specialists for students with vision or hearing impairment in the United Arab Emirates: Current status and future needs. Disabil. Rehabil. Assist. Technol. 2024, 19, 1709–1717. [Google Scholar] [CrossRef] [PubMed]
- Chundury, P.; Patnaik, B.; Reyazuddin, Y.; Tang, C.; Lazar, J.; Elmqvist, N. Towards Understanding Sensory Substitution for Accessible Visualization: An Interview Study. IEEE Trans. Vis. Comput. Graph. 2022, 28, 1084–1094. [Google Scholar] [CrossRef]
- Chat GPT. Available online: https://openai.com/chatgpt/ (accessed on 24 May 2024).
- Gemini. Available online: https://gemini.google.com/ (accessed on 24 May 2024).
- Karanikolas, N.; Manga, E.; Samaridi, N.; Tousidou, E.; Vassilakopoulos, M. Large Language Models versus Natural Language Understanding and Generation. In Proceedings of the 27th Pan-Hellenic Conference on Progress in Computing and Informatics, PCI ’23, Lamia, Greece, 24–26 November 2023; pp. 278–290. [Google Scholar] [CrossRef]
- Yang, J.; Jin, H.; Tang, R.; Han, X.; Feng, Q.; Jiang, H.; Zhong, S.; Yin, B.; Hu, X. Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond. ACM Trans. Knowl. Discov. Data 2024, 18, 1–32. [Google Scholar] [CrossRef]
- Martiniello, N.; Eisenbarth, W.; Lehane, C.; Johnson, A.; Wittich, W. Exploring the use of smartphones and tablets among people with visual impairments: Are mainstream devices replacing the use of traditional visual aids? Assist. Technol. 2022, 34, 34–45. [Google Scholar] [CrossRef] [PubMed]
- Senjam, S.S.; Manna, S.; Bascaran, C. Smartphones-Based Assistive Technology: Accessibility Features and Apps for People with Visual Impairment, and its Usage, Challenges, and Usability Testing. Clin. Optom. (Auckl.) 2021, 13, 311–322. [Google Scholar] [CrossRef] [PubMed]
- Paratore, M.T.; Leporini, B. Haptic-Based Cognitive Mapping to Support Shopping Malls Exploration. In Smart Objects and Technologies for Social Goods; Pires, I.M., Zdravevski, E., Garcia, N.C., Eds.; Springer: Cham, Switzerland, 2023; pp. 54–62. [Google Scholar]
- GPT-4 Turbo. Available online: https://platform.openai.com/docs/models/gpt-4-turbo-and-gpt-4 (accessed on 24 May 2024).
- Whisper: Robust Speech Recognition via Large-Scale Weak Supervision. Available online: https://github.com/openai/whisper (accessed on 24 May 2024).
- gTTS: Python Library and CLI Tool to Interface with Google Translate’s Text-to-Speech API. Available online: https://github.com/pndurette/gTTS (accessed on 24 May 2024).
- Chamberlain, M.N. The ABCs of Structured Discovery Cane Travel for Children; Information Age Publishing: Charlotte, NC, USA, 2021. [Google Scholar]
- Wu, X.L.; Li, J.; Zhou, F. An Experimental Study of Features Search under Visual Interference in Radar Situation-Interface. Chin. J. Mech. Eng. 2018, 31, 45. [Google Scholar] [CrossRef]
- MediaPipe. Available online: https://ai.google.dev/edge/mediapipe/solutions/guide (accessed on 24 May 2024).
- Chang, J.D.; Brantley, K.; Ramamurthy, R.; Misra, D.; Sun, W. Learning to generate better than your LLM. arXiv 2023. [Google Scholar] [CrossRef]
- Bai, Z.; Wang, P.; Xiao, T.; He, T.; Han, Z.; Zhang, Z.; Shou, M.Z. Hallucination of Multimodal Large Language Models: A Survey. arXiv 2024. [Google Scholar] [CrossRef]
- Tonmoy, S.M.T.I.; Zaman, S.M.M.; Jain, V.; Rani, A.; Rawte, V.; Chadha, A.; Das, A. A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models. arXiv 2024. [Google Scholar] [CrossRef]
POI | Sound Effect | Audio Description |
---|---|---|
Room boundary | Standard keypress sound used in Android devices | None |
Entrance | Sound of a doorbell | “Entrance” |
Restroom | Sound of a drop of water | “Toilet” |
User’s location | Sound of item collection in console game | “You are here” |
Type | Content |
---|---|
Descriptions of the role | “You are a self-practice assistant of using digital audio-tactile maps for visually impaired user.You receive an image showing the map of a room where the user is at. The user asks you questions about the map to help him/her confirm their understanding. Typical questions are about the shape of the room or the locations of the user, the entrance, or the restroom.” |
Details of the POIs | “The shape of the room is the outer black outlined shape. The black dot inside the room is the user’s location in the room.Assume that the user is facing north. The blue dot inside the room is the restroom. The green line overlapping with the room shape is the entrance.” |
Desirable responses | “If what user says is correct, answer ’Correct’ and explain. Otherwise, answer ’Not correct’ and explain. As the targeted users are visually impaired people, when explaining, no need to indicate the black dot, blue dot, or green line, but directly referring to them as user’s location, restroom, or entrance. If user directly asks ’where’ something is or ’what’ the shape of the room is, tell them to try feeling it on the screen first and then ask the following type of question:Is the restroom on the right corner? or Is the room a rectangle shape? or Is the left edge straight?” |
DATM | Group A (Without LLM Assistant) | Group B (With LLM Assistant) |
---|---|---|
Rectangular shape | 0.804 ± 0.138 | 0.802 ± 0.065 |
Irregular shape | 0.558 ± 0.169 | 0.763 ± 0.118 |
Rectangular DATM | Irregularly Shaped DATM | |
---|---|---|
Number of Prompts | 4 | 13 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Tran, C.M.; Bach, N.G.; Tan, P.X.; Kamioka, E.; Kanamaru, M. Enabling Self-Practice of Digital Audio–Tactile Maps for Visually Impaired People by Large Language Models. Electronics 2024, 13, 2395. https://doi.org/10.3390/electronics13122395
Tran CM, Bach NG, Tan PX, Kamioka E, Kanamaru M. Enabling Self-Practice of Digital Audio–Tactile Maps for Visually Impaired People by Large Language Models. Electronics. 2024; 13(12):2395. https://doi.org/10.3390/electronics13122395
Chicago/Turabian StyleTran, Chanh Minh, Nguyen Gia Bach, Phan Xuan Tan, Eiji Kamioka, and Manami Kanamaru. 2024. "Enabling Self-Practice of Digital Audio–Tactile Maps for Visually Impaired People by Large Language Models" Electronics 13, no. 12: 2395. https://doi.org/10.3390/electronics13122395
APA StyleTran, C. M., Bach, N. G., Tan, P. X., Kamioka, E., & Kanamaru, M. (2024). Enabling Self-Practice of Digital Audio–Tactile Maps for Visually Impaired People by Large Language Models. Electronics, 13(12), 2395. https://doi.org/10.3390/electronics13122395