An Alternative Audio-Tactile Method of Presenting Structural Information Contained in Mathematical Drawings Adapted to the Needs of the Blind

Abstract: Alternative methods of presenting the information contained in mathematical images, which are adapted to the needs of blind people, are significant challenges in modern education. This article presents an alternative multimodal method that substitutes the sense of sight with the sense of touch and hearing to convey graphical information. The developed method was evaluated at a center specializing in the education of the blind in Poland, on a group of 46 students aged 15–19. They solved a set of 60 high school-level problems on geometry, mathematical analysis, and various types of graphs. We assessed the mechanisms introduced for the sense of touch and hearing, as well as the overall impression of the users. The system usability scale and the NASA task load index tests were used in the evaluation. The results obtained indicate an overall increase in user satisfaction and usefulness of the proposed approach and a reduction in the workload during exercise solving. The results also show a significant impact of the proposed navigation modes on the average time to reach objects in the drawing. Therefore, the presented method could significantly contribute to the development of systems supporting multimodal education for people with blindness.


Introduction
Education is an essential stage in life, especially for children. Educational activities provide opportunities to learn about the world around us and acquire the necessary skills to live in society. One of the key elements of the educational process is to provide students with appropriate scientific materials that facilitate the acquisition of knowledge and skills. These materials often allow multiple senses to be used in the cognitive process, enabling an improved understanding and retention of knowledge. For example, visual materials such as diagrams, schematics, and charts can help students understand difficult concepts. Undoubtedly, the sense of sight is crucial in this case, allowing students to familiarize themselves with the materials mentioned. Visually impaired individuals face challenges in their cognitive processes due to permanent damage to one of their senses. This requires the preparation of adapted methods to present information that is normally available without hindrance to sighted people. This is a significant problem, as according to data from the World Health Organization (WHO), there are approximately 285 million people worldwide with visual impairments, including 39 million people who are completely blind [1,2]. In the case of textual materials, the problem was addressed quite early through the use of the worldwide Braille alphabet, which allows the representation of 63 characters in the form of points. However, problems start to arise in the case of drawings and graphs, which
• Arrangement of objects: This aspect is one of the most important because graphics with overlapping objects or widely spaced objects can introduce errors in the image and prolong the process of understanding the graphics.
• Limiting the number of objects: The number of objects in the image is an important aspect. If there are too many objects, the user may become disoriented during the analysis.
• Complexity of the structure of objects: The more complicated the object, the longer it takes to understand the image.
• Differentiation of objects: The individual features of the objects should be distinctly different from each other. Too much similarity may confuse the person analyzing a specific image, as in the case of the number of objects.
Thanks to the use of tactile prints, it is possible to compensate for loss of sight, which improves the cognitive process [8,9]. However, this approach has some drawbacks. It is important to be aware of cognitive overload because, according to the literature [10], human cognitive capacity is limited. Hence, it is important to adjust the amount of information and provide it in the least burdensome form possible for the person concerned. An approach that combines the senses of touch and hearing as a substitute for the sense of sight may be supportive here. This approach has been described in many previous publications [11,12], showing its positive impact on perception.

Related Works
For people with blindness, drawings and charts (containing structural information) often pose a problematic issue in exercises related to exact sciences. These individuals are unable to read information presented in a visual form and require the use of alternative methods of data presentation. In this case, solutions utilizing Braille printing and 3D printing for the preparation of tactile materials (tactile images) can be helpful. Such prints contain raised lines that can be analyzed by touch, allowing the user to become familiar with them using tactile perception [1,13].
Many studies related to the adaptation of graphics have been conducted so far [14,15], and documents containing guidelines for this process have been created. Among them, the set of rules called "Guidelines and Standards for Tactile Graphics", created by the Braille Authority of North America and the Canadian Braille Authority [7], stands out. It is a collection of knowledge on adapting graphics to the needs of visually impaired individuals and allows them to overcome barriers related to the lack of access to graphic educational materials. However, the tactile image itself is not a perfect solution, and as mentioned in the introduction, we advocate combining two senses, touch and hearing, to convey all the information included in the image.

Solutions Related to Adapting Audio-Tactile Materials
Appl. Sci. 2023, 13, 9989
The first possibility worth mentioning is the use of video cameras to determine the place of touch and the touched element in the drawing [16,17]. The image objects were tagged with QR codes, and the distance from the touch point to each of them was analyzed. The message was read for the nearest QR code. An extension of this method has been the application of artificial intelligence techniques. Studies based on real-time recognition of human fingers and touched elements in the image were described in previous publications [18,19]. These approaches attempted to recognize the user's fingers, but they also often required the use of additional markers. The results of the research indicated the good usability of the solutions, which were nonetheless dependent on external conditions, i.e., lighting or hand position.
Many innovative solutions for dynamic touch displays, e.g., Graphiti created by ORBIT Research, have been created in recent years. These innovations consist of the use of a dot matrix display with adaptively pushed needles to present content that appears on the input device. The disadvantages of this solution, however, include the small area of presentation and the cost of the device, ranging from 5 to 40 thousand euros. These solutions have been used mainly in map presentation and building navigation [6,20].
Tablets with large screen sizes are gaining increasing popularity for presentations. In these solutions, a crucial aspect is the imposition of a tactile printout on the tablet to analyze the user's work and, at a later stage, to provide audio feedback on the graphics. The first research in this approach was conducted as part of the MIDAS project [21], in which an application was developed to detect several types of user gestures on the screen.
The authors of all the works mentioned above emphasize that the precise detection of touched elements and the provision of contextual information about those elements in the form of audio description is crucial. However, an excess of contextual information or erroneous messages, such as information about neighboring elements in the picture, can be irritating and distracting for blind individuals, which makes it difficult for them to properly interpret the tactile picture [22].

Designing the Solution Dedicated to Blind Users
The literature and research conducted so far have distinguished three important approaches relating to the design of solutions for the blind:
• Audio interfaces. From the perspective of individuals with visual impairments, audio interaction with electronic devices is the most convenient [3,23]. The accuracy of the descriptions is especially important in this context. In addition to accuracy, the authors of the relevant book [24] have described several distinguishing features based on numerous expert interviews. Recommendations include personalizing the synthesizer voice and adjusting the length of messages.
• Touch interfaces. The great advantage of tactile interfaces is their naturalness. Individuals with visual impairments who have used Braille since childhood can quickly adapt to such interfaces. Tactile interfaces are commonly used in navigation systems for the blind [25,26]. However, when dealing with complex objects or increasing details, it is desirable to combine tactile interfaces with another sense, such as hearing [27]. Many scientists have demonstrated that combining the sense of touch with signals such as force, vibration, and additional voice messages allows blind users to acquire a larger amount of information [28,29].
• Multimodal interfaces. Combining the senses is assumed to be the most useful strategy when improving perception [30,31]. Previous research [10,32] has shown this method to be the most effective for designing interfaces for users who are blind. In the study presented in [33], in which the sense of sight was replaced by a combination of two senses, improved results were demonstrated for both the sighted and blind participants.
This article aims to present an alternative audio-tactile method of presenting structural information contained in mathematical drawings adapted to the needs of the blind. The contributions and new elements of the developed method, relative to the known state of the art, are (i) the adaptation of the idea of feedback independently in the touch and sound channels, (ii) the integration of different navigation modes into the adaptive way of drawing presentation, and (iii) the possibility of audio-tactile exploration of drawings consisting of several layers of objects.
The structure of this article is organized as follows. Section 3 provides a detailed description of each step of the proposed solution. Section 4 includes the results of the evaluation of the tools developed for blind students. Finally, Section 5 contains the conclusions, summary, and directions for further development.

Materials and Methods
One of the main elements of adapting materials in the developed method was combining two senses: touch and hearing. Thanks to this approach, it was possible to substitute the vision channel. Previous studies have confirmed that this approach improves the perception of blind people [34,35]. Figure 1 shows the concept of the proposed solution, including the stages related to the processing of the original graphics into a form adapted to the needs of the blind, along with the preparation of alternative descriptions for the tactile and sound presentation.

Preparing the Tactile Image
The first part of the developed method mainly focuses on the sense of touch. All stages related to this part aim at preparing appropriately adapted graphics based on the source image. The initial stage of the research involved selecting appropriate artificial intelligence methods to support object detection in images. To achieve this, the research included a review of currently available solutions in this field. After analyzing the most important parameters, such as accuracy, precision, speed, and applicability to the specific case of mathematical exercises, we decided on the YOLO (You Only Look Once) network version 5 [36]. The algorithm is based on convolutional neural networks (CNNs), which enables us to achieve high classification accuracy. Additionally, this solution ensures that the occurrence of a given object will be detected only once, minimizing the risk of multiple overlapping objects in subsequent stages.
The next step involved preparing the classes of objects in the images used in mathematical exercises. We distinguished several classes, such as axes of coordinate systems (x-axis, y-axis), various types of function graphs (linear, quadratic, polynomial, and trigonometric), and basic geometric figures (square, rectangle, triangle, circle, rhombus, parallelogram, and trapezoid), as well as characteristic points on function graphs (zero points and intersection points). The exact list of predefined object classes is shown in Table 1. The choice of an appropriate algorithm, and its extension with the newly defined classes, allowed for the creation of a fast and accurate model for classifying objects that can be found in mathematical exercises. The trained YOLO model is characterized by high accuracy in recognizing classes of objects in the drawing. The evaluation showed that 70% of the exercises did not require any correction at the stage of parameterization and matching of object classes, which is a satisfactory result. The described method is semi-automatic, meaning that the teacher has the opportunity to verify that the object classes in the drawing were recognized correctly. YOLO generates a set of bounding boxes around detected objects with labels and probabilities. Before further processing of the results, the teacher has the option of adding, removing, or editing bounding boxes and assigning the appropriate object class. When adapting an existing exercise that already contains a drawing, it is necessary to scan it into a digital image. Before scanning, however, the additional intersection points of objects that the author of the exercise considers important for a blind person must be marked at the appropriate places.
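The semi-automatic review step can be illustrated with a minimal sketch. It assumes that each detection is reduced to a label, a probability, and a bounding box, and it uses a hypothetical confidence threshold to decide which detections the teacher should look at first. The names Detection and split_for_review, as well as the threshold value, are illustrative and do not come from the described system.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One detected object: class label, probability, and bounding box in pixels."""
    label: str
    confidence: float
    box: tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def split_for_review(detections, threshold=0.5):
    """Separate detections the teacher should double-check from the rest.

    The threshold is a hypothetical value, not one reported in the article.
    """
    accepted = [d for d in detections if d.confidence >= threshold]
    to_review = [d for d in detections if d.confidence < threshold]
    return accepted, to_review


# Example detections with made-up values.
detections = [
    Detection("x-axis", 0.97, (10, 200, 390, 210)),
    Detection("quadratic function", 0.42, (60, 40, 340, 360)),
]
accepted, to_review = split_for_review(detections)
```

In such a sketch the teacher can still add, remove, or edit any bounding box; the threshold only orders the review work.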

Creating a Template for the Drawing's Content
Based on the objects detected in the adapted raster image by the YOLO network, we create an image template in SVG format. For each detected object class returned in the image, one instance of the corresponding object is assigned in the SVG file. An additional advantage of this process is the removal of redundant, i.e., unrecognized objects, from the original image. In the case of more complex objects, such as overlapping parabolas, mapping can be conducted to only one object of the <path/> type. The resulting SVG file can be manually corrected and adjusted using the provided basic SVG image adaptation tool, which is characterized below.
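One possible way to generate such a template is sketched below using the Python standard library. It assumes that each detection is a pair of a class label and a bounding box, and it draws every object as a placeholder rectangle; the function name build_svg_template and the data-class attribute are hypothetical, and a real template would use class-specific shapes.

```python
import xml.etree.ElementTree as ET


def build_svg_template(detections, width, height):
    """Create one SVG element per detected object.

    Each detection is (label, (x_min, y_min, x_max, y_max)). Every object is
    drawn as a placeholder <rect>; a real template would use class-specific
    shapes such as <line>, <circle>, or <path>.
    """
    svg = ET.Element("svg", {
        "xmlns": "http://www.w3.org/2000/svg",
        "width": str(width),
        "height": str(height),
    })
    for index, (label, (x0, y0, x1, y1)) in enumerate(detections):
        ET.SubElement(svg, "rect", {
            "id": f"obj-{index}",
            "data-class": label,  # hypothetical attribute name
            "x": str(x0), "y": str(y0),
            "width": str(x1 - x0), "height": str(y1 - y0),
            "fill": "none", "stroke": "black",
        })
    return ET.tostring(svg, encoding="unicode")


svg_text = build_svg_template(
    [("square", (20, 20, 120, 120)), ("circle", (150, 30, 230, 110))],
    width=300, height=200,
)
```

Because only detected objects are written to the file, anything the detector did not recognize is left out of the template, which mirrors the removal of redundant objects described above.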

Parameterization of the Classes of the Detected Objects
Using the tool mentioned in the previous point, it is possible to parameterize the detected elements through an editor prepared for this purpose. The image can be customized by adjusting parameters such as the ranges of the axes, the function parameters, the lengths of the sides, and the angles of the figures. An advantage of the prepared editor is the ability to add additional characteristic points in areas that should be highlighted and are important from the perspective of a blind person [37]. Such a prepared exercise could be used to create a basic version of the adapted image, but additional optional transformations are required to fully adapt the graphics to the needs of a person with blindness.

Ensuring a Good "Object Density"
In this stage, the image from the previous point was analyzed in terms of the arrangement of objects in accordance with the guidelines quoted at the beginning (see Related Works section). The results of the analysis helped us to manually verify and correct the distribution and size of the objects in the image. This stage has a significant impact on improving the perception of a blind person because excessive accumulation of objects can make the image illegible, resulting in a decrease in motivation for further image exploration. Additionally, it resulted in the elimination of unused space, which means that a blind person has less chance of encountering empty spaces when becoming acquainted with the image.

Printing a Tactile Image
The final stage involves printing the image on a Braille printer. The methods applied facilitated the printing of a fully adapted image, taking into account all the guidelines for adjusting materials for people with visual disabilities. The use of a printed tactile image and working with a mobile application is described in Section 3.4.

Preparing the Alternative Descriptions
The second part of the proposed method involves using an adapted image in an application that activates the sense of hearing. For this purpose, we created two cooperating applications that constitute a learning support system. The first one is a web application that allows users to add exercises and text descriptions to SVG image elements, tests, and questions, as well as monitor student progress. The second one is an Android mobile application that works with the web application and the adapted printout. It is the main tool for the students' work during classes, enabling them to familiarize themselves with the images, solve tests, and learn on their own. Both applications play a key role in supporting the alternative descriptions provided to students as they explore the images in the exercises.
This part focuses on enhancing the completed tactile image by combining two senses: touch and hearing. The goal is to improve perception and accelerate the learning process for individuals who are blind. Previous publications have presented studies on the impact of this approach [34,35].
An important aspect in the case of descriptions is their length. Our earlier research [38] has examined the effect of cognitive saturation on the usefulness of descriptions. According to the theory of cognitive saturation [39,40], the maximum number of concepts contained in a single description is between 5 and 7. If a description exceeds this limit, it may be too overwhelming for the listener to comprehend. To overcome this challenge, longer descriptions can be split into smaller pieces of information by introducing gestures and assigning different messages to them. There are three gestures distinguished in the application: 1-tap, 2-taps, and 3-taps. The rules for assigning descriptions to gestures were as follows:
• 1-tap: Basic information about the object, e.g., side length, coordinate values, and location of the vertex between the sides.
• 2-taps: Additional information about the object's properties, e.g., the side of a right-angled triangle.
• 3-taps: Broader theoretical context, e.g., the formula for the area of a triangle.
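The assignment rules above can be sketched as a small mapping from gestures to messages. The sketch assumes that each description is already split into a list of concepts and that the upper bound of 7 concepts is enforced per message; the function name assign_to_gestures and the sample texts are illustrative only.

```python
MAX_CONCEPTS = 7  # upper bound quoted in the text for a single description


def assign_to_gestures(basic, additional, theory):
    """Map three groups of concepts to the 1-, 2-, and 3-tap gestures."""
    messages = {1: basic, 2: additional, 3: theory}
    for taps, concepts in messages.items():
        if len(concepts) > MAX_CONCEPTS:
            raise ValueError(f"{taps}-tap message has too many concepts")
    return {taps: "; ".join(concepts) for taps, concepts in messages.items()}


# Hypothetical description of one side of a triangle.
triangle_side = assign_to_gestures(
    basic=["side AB", "length 5 cm"],
    additional=["hypotenuse of a right-angled triangle"],
    theory=["area of a triangle: one half of base times height"],
)
```

Rejecting an overlong message, rather than silently truncating it, leaves the decision of how to split the description with the author of the exercise.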
The second mechanism involves diversifying the level of detail in the description according to the level of knowledge of the user. In the current version, the proposed method consists of three levels of detail that can be dynamically adjusted based on the number of mistakes made by the student during test-solving. Mechanisms based on error and knowledge vectors are used to accurately determine the user's level of advancement. Alternative descriptions prepared in this way help to obtain satisfactory results when working with students, as presented in our previous work [41]. Moreover, the developed platform enables the personalization of the learning path, which is understood as the optimization of the learning curve for a specific student. Information about the student's progress and mistakes is stored in the error and knowledge vectors. Our previous paper [42] described the mechanism of error aggregation and feedback used during learning in the developed platform, as well as the algorithms for intelligent selection of subsequent exercises to be solved by the student.
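A minimal sketch of selecting one of three levels of detail from the number of mistakes is given below. The level names and the cut-off values are assumptions made for illustration; the platform itself relies on error and knowledge vectors, which this sketch does not model.

```python
LEVELS = ("brief", "standard", "detailed")  # hypothetical level names


def detail_level(mistakes, thresholds=(2, 5)):
    """Pick a level of detail from the number of mistakes made in a test.

    The thresholds are hypothetical cut-offs: fewer than 2 mistakes gives a
    brief description, fewer than 5 a standard one, otherwise a detailed one.
    """
    low, high = thresholds
    if mistakes < low:
        return LEVELS[0]
    if mistakes < high:
        return LEVELS[1]
    return LEVELS[2]
```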

Feedback Loop
Substitution of the visual channel by combining the touch and sound channels requires adaptation to the individual needs of the student. To achieve this effect, it was proposed to use the generally known idea of feedback, independently in both channels. The proposed concept is shown in Figure 2. The entire gesture detection process was based on user interaction maps. These are images with superimposed points in places in which the user interacts with the image. This visualization mechanism is crucial for adjusting the timing and intervals of the gestures presented on the taps map. If the taps map does not show correct interaction patterns, e.g., it contains a large number of taps around one graphic primitive, this may signal a problem. Using this solution, it is possible to receive feedback directly from the sense of touch.
Taking into account the described mechanism and the research on the sense of touch presented in previous publications [41], this article introduces new possibilities for the user's interaction with the application. The first option involves adapting the time intervals between the gestures described in Section 3.2, which include the 1-tap, 2-tap, and 3-tap gestures. The second possibility allows for adjusting the perception field of graphic primitives while assuming a fixed size of the tactile image that is preselected according to the guidelines. The perception field of the primitive is the area where messages are activated in response to the user's gesture.
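Both adjustable parameters can be illustrated with a short sketch. It assumes that taps arrive as timestamps in seconds, that taps no more than max_gap apart belong to one gesture, and that the perception field is a circle around the primitive; the default of 0.4 s, the circular shape, and the function names are assumptions made for this example.

```python
def count_taps(timestamps, max_gap=0.4):
    """Group tap timestamps (seconds) into gestures of 1, 2, or 3 taps.

    Taps no more than max_gap apart belong to the same gesture. max_gap is
    the adjustable interval; 0.4 s is a hypothetical default. Longer runs
    are capped at 3 taps.
    """
    gestures = []
    current = 0
    previous = None
    for t in sorted(timestamps):
        if previous is not None and t - previous > max_gap:
            gestures.append(min(current, 3))
            current = 0
        current += 1
        previous = t
    if current:
        gestures.append(min(current, 3))
    return gestures


def in_perception_field(tap, center, radius):
    """True if a tap (x, y) falls inside a circular field around a primitive."""
    dx, dy = tap[0] - center[0], tap[1] - center[1]
    return dx * dx + dy * dy <= radius * radius
```

With this sketch, a student who taps slowly could be given a larger max_gap, and a student with shaking hands a larger radius, without changing the printed image.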

Completeness Detection
At this stage, we introduce lists of necessary elements, i.e., graphic primitives (defined by the teacher), which the student is required to analyze during the image exploration to solve a given part of the exercise. Figure 3 shows an example of the completeness map. Various modes of navigating through the tactile image were also proposed (presented in Figure 4), which include:
• "Mode 0": Lack of hint mechanisms for missing elements. All objects in the image are active, and interactions with any of them will trigger the appropriate message.
• "Mode 1": A mechanism that activates only the audio description of missing elements. Only the objects needed to solve the exercise are active. The user becomes acquainted with all the active elements in turn using defined gestures (the hearing feedback). Once an object has been explored, it becomes inactive. When an attempt is made to complete an exercise, all unselected items are searched.
• "Mode 2": The mechanism of guiding the direction to the missing elements, i.e., the guiding mode in a specific direction, relative to the user's last active position. It is implemented based on previously published results [43,44]. The whole operation uses the messages: left/right and up/down. Thanks to this, the user can be guided to the desired image element. The difference from mode 1 is that all the necessary elements are active. In this mode, we assume that the user does not remove their finger from the screen but moves it toward a given direction. As in mode 1, a touch gesture is detected. If, while moving the finger in a given direction, the student encounters image elements that have been activated previously, they receive the sound message that this element has already been recognized and should continue to search in the specified direction.
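The directional guidance of mode 2 can be sketched as a comparison of the last finger position with the position of the missing element. The sketch assumes screen coordinates in pixels with the y axis pointing downward and a hypothetical tolerance within which an axis counts as aligned; it is not the implementation from [43,44].

```python
def direction_hint(finger, target, tolerance=10):
    """Return spoken hints that lead from the finger position to the target.

    Coordinates are screen pixels with y growing downward. tolerance is a
    hypothetical dead zone within which an axis counts as aligned.
    """
    dx = target[0] - finger[0]
    dy = target[1] - finger[1]
    hints = []
    if abs(dx) > tolerance:
        hints.append("right" if dx > 0 else "left")
    if abs(dy) > tolerance:
        hints.append("down" if dy > 0 else "up")
    return hints
```

An empty result means the finger is already within the tolerance of the target on both axes, so no further guidance message is needed.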

The bar chart in Figure 4 presents information on the average reduction in natural gas consumption (expressed in %) by the European Union countries in 2022-2023 compared to 2017-2021. In this exercise, the student had to find the countries with the gas consumption reduction below 15% in the current year. Figure 4a shows the complete figure from the Eurostat portal [45]. Figure 4b shows an adapted chart with two navigation mechanism modes. Depending on the navigation mode, the student is guided to the bar in the graph in a different way.

The Problem of Overlapping Objects when Exploring a Tactile Image
The classes of objects appearing in the image (Section 3.1.1) have been assigned default priorities in terms of priority of the presentation to avoid the problem of overlapping objects. Table 1 presents the proposed priorities, where the lowest number corresponds to the highest priority.
At each stage of the exercise, the author can redefine their own order of priorities (if they do not, the default values will be used). The representation of the tactile object in the image corresponds to a tag in the SVG file. This allows for a modification of the activation area of the object by adjusting specific parameters, such as line thickness, within the SVG file. In cases where objects overlap, audio presentations are determined by the established priorities outlined above. By default, the object with the highest priority is always activated. The 1-tap, 2-tap, and 3-tap gestures correspond to audio descriptions with increasing levels of detail. Depth maps are used to activate audio descriptions during user interaction.
Depth maps are structures based on tables with a size equal to the resolution of the tablet image, where each element in the structure is a list of objects located at the corresponding coordinate in the image. This structure is generated along with the SVG file, and the list of objects is sorted by priority at each stage of the exercise. When interacting with the user, an additional sound is generated for all 1, 2, and 3 tap gestures if there are additional objects beneath the current object. Additionally, a long-hold gesture is introduced to enable the user to hear the names of all objects beneath the current object in order of priority.
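A depth map of this kind can be sketched as a two-dimensional table of lists. The example assumes rectangular object footprints and made-up priority numbers (a lower number meaning a higher priority, as in Table 1); the function names build_depth_map and on_tap are illustrative.

```python
def build_depth_map(width, height, objects):
    """Build a width x height table of object names sorted by priority.

    objects is a list of (name, priority, (x_min, y_min, x_max, y_max));
    a lower priority number means a higher priority. Rectangular footprints
    are a simplification made for this sketch.
    """
    depth_map = [[[] for _ in range(width)] for _ in range(height)]
    # Inserting objects in priority order keeps every cell's list sorted.
    for name, _priority, (x0, y0, x1, y1) in sorted(objects, key=lambda o: o[1]):
        for y in range(max(y0, 0), min(y1, height)):
            for x in range(max(x0, 0), min(x1, width)):
                depth_map[y][x].append(name)
    return depth_map


def on_tap(depth_map, x, y):
    """Return (object to describe, whether the 'more objects below' sound plays)."""
    stack = depth_map[y][x]
    if not stack:
        return None, False
    return stack[0], len(stack) > 1


# Hypothetical objects and priorities on a tiny 6 x 6 "screen".
depth_map = build_depth_map(
    6, 6,
    [("triangle", 3, (0, 0, 4, 4)), ("point A", 1, (1, 1, 3, 3))],
)
```

A long-hold gesture could then simply read out the whole list stored in the touched cell, since it is already ordered by priority.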

Auditory Feedback Loop
The information conveyed to the sense of hearing is divided into two parts:

• Exercise elements that directly stem from the exercise content and are necessary for its solution.
• Elements of the theory, contextually related to the exercise-solving stage.
The audio descriptions of the image elements are created based on the attributes of the SVG item. These attributes are added to the corresponding SVG tags that are detected by the YOLO network as classes of mathematical objects in the image. When the teacher edits an SVG drawing, the tags can be completed automatically or manually using prepared wizards that correspond to different classes of objects. As an example, we can describe filling in whole-part information, e.g., for the sides of a figure, when creating an object. Moreover, we can add the attributes of the parent object class, parent identifier, and primitive properties, i.e., a length object class. Additionally, each stage of the exercise has a list of the graphic primitives required to solve it. This list is created by interactively selecting the appropriate primitives or specifying their identifiers.
The set of all the concepts and the current state of the error vector serve as the basis for creating theoretical descriptions. Individual fragments from the description of the concepts are mapped to the error classes. Each stage of the exercise contains a subset of theoretical concepts for content description. Depending on the values of the error components in a given user's error vector, the platform generates the appropriate theoretical hint text. The theoretical description is available under the 3-tap gesture and is active only for elements that occur in the classes of mathematical objects. Optionally, the teacher can expand the theoretical descriptions and define them for other types of objects in the exercise. Suggestions for assigning concepts to an exercise appear based on the object classes detected by the YOLO network.
The platform also includes a mechanism to prevent students from simply memorizing the exercises. It checks whether the student has used the contextual theoretical hints. If so, the student is asked to solve a similar exercise from the same section but without theoretical assistance. Verification of understanding of the audio descriptions is completed by checking for changes in the error vector's component values. If necessary, the theoretical description is either extended or shortened. The platform also includes the ability to modify audio presentation parameters to support the sense of hearing, such as adjusting the speed of the synthesizer, based on previous research results [41].

Working Method with the Image by a Blind Person
The application provides a touch and audio interface that enables the operation of the mobile application by blind users. When a student begins to work with the application, calibration is performed using a test image. This approach enables the adjustment of the line thickness displayed on the screen to ensure that a blind person receives the correct messages about the currently touched object during exercise exploration. It also helps in reading messages effectively in case of problems with the sense of touch, such as shaking fingers and hands. Figure 5 shows a student using the mobile app. A tactile image is placed on the tablet screen (immobilized by an applied frame). The student, upon sensing an element of the image under his finger, can tap it to prompt the mobile application to read the appropriate alternative description. A sheet with a tactile image is not a barrier for the tablet to detect a finger tap on the image. fingers and hands. Figure 5 shows a student using the mobile app. A tactile image is placed on the tablet screen (immobilized by an applied frame). The student, upon sensing an element of the image under his finger, can tap it to prompt the mobile application to read the appropriate alternative description. A sheet with a tactile image is not a barrier for the tablet to detect a finger tap on the image. The designed interface allows students to navigate through the application using dedicated buttons that correspond to selected actions, such as selecting an exercise, confirming, marking an answer, or submitting a test. Students can efficiently navigate through the application modules and solve exercises in both the learning and test modes. The work of a visually impaired person with an image consists of two stages: exploring the image and solving the exercise. 
Figure 5. Real use of the system by a blind student during tests. Left: the student places the tactile image on the tablet screen. Right: the student uses the application and listens to alternative descriptions by performing the tap gesture on the tablet screen; the student feels the tactile elements of the drawing under the fingers.
The designed interface allows students to navigate through the application using dedicated buttons that correspond to selected actions, such as selecting an exercise, confirming, marking an answer, or submitting a test. Students can efficiently navigate through the application modules and solve exercises in both the learning and test modes. The work of a visually impaired person with an image consists of two stages: exploring the image and solving the exercise. The exploration stage familiarizes the student with the contents of the image, and the following modes of exploration are available:
• Free exploration: the user decides completely independently how to explore the image.
• Working with the selected navigation mode (i.e., one of the modes described above), which guides the user and verifies that all the image elements have been activated.
The main task of a student working in the image exploration mode is to familiarize themselves with the layout of the image and answer specific questions. Solving the exercise triggers the following steps, which involve adjusting the length of the theoretical prompts based on changes in the components of the error vector.

Conducted Experiments and Research Group
The presented method enables the development of a set of exercises for the students. The experiments encompassed a set of 60 exercises covering three areas: geometry, mathematical analysis, and various types of graphs. The set (20 exercises from each area) was prepared according to the following assumptions:
• Each exercise contained from a few to a dozen graphic elements;
• The size of the image elements was similar, e.g., from 2 to 8 cm;
• In the case of mathematical analysis exercises, the content concerned linear, quadratic, and trigonometric functions;
• The number of bars in the bar charts was between 10 and 25.
The above assumptions result from the limitations of tactile image exploration with the use of hands and fingers by blind students, which was the subject of our previous research [41].
The research group consisted of 46 participants with blindness, who could solve math exercises at a high-school level. The group consisted of 19 women and 27 men, aged 15-19, with a median age of 17 years and a standard deviation of 1.21. The following inclusion criteria were adopted in the studies:

• The degree of vision loss was significant (more than 90%);
• The student was in secondary school and had completed a mathematics course at a primary school level;
• The student had no mental disabilities;
• The student was able to use the tablet computer and the audio-tactile interface and to interpret the tactile image, as assessed by a teacher with several years of experience in teaching the blind.
The authors confirm that all methods were carried out in accordance with the relevant guidelines and regulations. The research carried out was not a medical experiment. All data were fully anonymized and conform to the ethical principles.
In the context of the sense of touch, we analyzed the ratio of the number of taps that were successful to the total number attempted, considering all the stages of the exercise and whether the object is included in the completeness list. Initially, following previous research [41], the user begins with the default activation area size. If the number of successful taps is below the set threshold, the activation area increases to properly recognize the gesture for a given graphic primitive in a tactile image. The activation area serves as a global parameter (applied to all primitives) for a given user. Moreover, the mechanism also allows for adjustment of the time intervals between gestures. This modification is made during the initial configuration, in which the user learns how to perform exemplary gestures. The following gestures are verified: touch, 1-tap, 2-taps, 3-taps, and long press.
In the context of the sense of hearing, the focus was on verifying the proposed presentation mechanisms. First, the mechanisms for the audio presentation of overlapping objects were evaluated, including:
• The depth map creation mechanism;
• The prioritization of objects in the image;
• The detectability of objects by the long-hold gesture.
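One possible reading of these three mechanisms is sketched below in Python. The `Primitive` class, the integer `depth` value (0 for the topmost layer), the rectangular bounding boxes, and the rule that a long press reads out only the objects lying underneath are all assumptions made for illustration. The method's actual depth map and prioritization rules may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Primitive:
    name: str
    x0: float
    y0: float
    x1: float
    y1: float
    depth: int  # hypothetical depth-map value: 0 = topmost layer

    def contains(self, x, y):
        # Simplification: every primitive is treated as its bounding box.
        return self.x0 <= x <= self.x1 and self.y0 <= y <= self.y1


def stack_at(primitives, x, y):
    """All primitives under the point, ordered from topmost to deepest."""
    hits = (p for p in primitives if p.contains(x, y))
    return sorted(hits, key=lambda p: p.depth)


def names_to_read(primitives, x, y, gesture):
    """Names to be spoken for a gesture at (x, y)."""
    stack = stack_at(primitives, x, y)
    if not stack:
        return []
    if gesture == "long_press":
        # Assumed behavior: a long press announces the objects underneath.
        return [p.name for p in stack[1:]]
    # Any other gesture announces only the highest-priority object.
    return [stack[0].name]
```

For a circle drawn inside a square, a tap in the middle would return the circle, and a long press at the same point would return the square lying beneath it.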
Another tested mechanism included navigation modes. The students solved the exercise in three presented navigation modes: without navigation, with the activation of only selected elements, and with navigation in a specific direction. The average time for reaching a given object, defined in the list of objects necessary to solve the current stage of the exercise, was determined.
The test procedure involved dividing the developed group of exercises into two parts, to enable an assessment of the impact of the proposed feedback mechanisms in the context of touch and hearing. Both parts contained exercises from all the prepared types. In the first part, no feedback mechanisms were applied for either sense. Furthermore, in the audio reproduction tests, no mechanisms were used for detecting objects lying underneath. The second part of the exercises involved the use of all the proposed mechanisms. The assessment was based on commonly used tests: the System Usability Scale (SUS) [46,47] and the Task Load Index (NASA-TLX) [48,49]. Figure 6 shows an example of a "gesture-taps map" used to evaluate how the user explores the image. It shows several tested features of the user's interaction with the image: single tap, double tap, a touch gesture (raw data), long press gesture, and a list of necessary elements at a given stage of the exercise, decomposed into already selected and unselected elements. Figure 6 is an illustrative drawing. Under real conditions, the number of gestures made by the student is much higher. Our Web platform allows for filtering gestures based on their type and time.
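The kind of filtering and decomposition described for the gesture-taps map can be illustrated with a short Python sketch. The `Gesture` record, its field names, and the assumption that only single taps count as selecting an element are hypothetical. The Web platform's actual data model is not specified here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Gesture:
    kind: str               # e.g., "touch", "tap", "double_tap", "long_press"
    t: float                # seconds since the start of the exercise
    element: Optional[str]  # element that was hit, or None for a miss


def filter_gestures(log, kinds=None, t_min=0.0, t_max=float("inf")):
    """Keep gestures of the given kinds inside the time window."""
    return [
        g for g in log
        if (kinds is None or g.kind in kinds) and t_min <= g.t <= t_max
    ]


def split_required(required, log, activating=("tap",)):
    """Split the list of necessary elements into selected and unselected."""
    activated = {
        g.element for g in log
        if g.kind in activating and g.element is not None
    }
    selected = [e for e in required if e in activated]
    unselected = [e for e in required if e not in activated]
    return selected, unselected
```

Keeping the raw log and deriving each view from it means the same recording can be inspected by gesture type, by time window, or by completeness of the required elements.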

Evaluation Results for the Sense of Touch Support
The activation area of the graphics primitives was tested with a cut-off threshold of 0.5 and a default activation area width of 5 mm. If the hit rate was lower than the threshold value, the activation area was extended by an additional 2 mm, while continuously testing the completeness of the hits. After adapting the activation area, the average value of the hit completeness coefficient for the subjects was greater than 0.8. As a result, the activation area of the primitives was changed for 12 students.
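The adaptation rule can be sketched as follows. The threshold of 0.5, the default width of 5 mm, and the 2 mm step are the values reported above. The function names, the upper bound `max_width_mm`, and the choice to leave the width unchanged when no taps have been attempted are assumptions added for this sketch.

```python
HIT_RATE_THRESHOLD = 0.5   # cut-off threshold reported in the text
DEFAULT_WIDTH_MM = 5.0     # default activation area width
STEP_MM = 2.0              # extension applied when the hit rate is too low


def hit_rate(successful, attempted):
    """Ratio of successful taps to all attempted taps."""
    return successful / attempted


def adapt_activation_width(width_mm, successful, attempted, max_width_mm=15.0):
    """Widen the activation area when the hit rate falls below the threshold.

    `max_width_mm` is a hypothetical safety bound, not a reported value.
    """
    if attempted == 0:
        return width_mm  # nothing to judge yet
    if hit_rate(successful, attempted) < HIT_RATE_THRESHOLD:
        return min(width_mm + STEP_MM, max_width_mm)
    return width_mm
```

For example, a user with 3 successful taps out of 10 at the default width would move from 5 mm to 7 mm, whereas a user with 8 out of 10 would keep the default.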
Based on previous research [41], the default time interval for gestures involving 2 and 3 taps was initially assumed to be 335 ms. However, in the group of 46 students that were tested, it was necessary to adjust this interval for 10 students.
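To show why this interval matters, the following Python sketch groups tap timestamps into 1-, 2-, and 3-tap gestures. Only the 335 ms default comes from the text. The grouping rule (consecutive taps belong to one gesture when the gap between them does not exceed the interval, up to three taps) and the function name `count_taps` are assumptions.

```python
DEFAULT_INTERVAL_MS = 335  # default interval reported in the text
MAX_TAPS = 3               # the method recognizes 1-, 2-, and 3-tap gestures


def count_taps(tap_times_ms, interval_ms=DEFAULT_INTERVAL_MS):
    """Group sorted tap timestamps into gestures; return the tap count of each."""
    gestures = []
    count = 0
    last = None
    for t in tap_times_ms:
        same_gesture = (
            last is not None
            and t - last <= interval_ms
            and count < MAX_TAPS
        )
        if same_gesture:
            count += 1
        else:
            if count:
                gestures.append(count)
            count = 1
        last = t
    if count:
        gestures.append(count)
    return gestures
```

With the default interval, three taps spaced 400 ms apart are read as three separate single taps. Widening the interval to 500 ms for a slower user turns the same input into one 3-tap gesture, which is the kind of per-user adjustment described above.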

Evaluation Results for the Sense of Hearing Support
The accuracy of the mechanism for detecting overlapping objects was verified by a different teacher, who was not the exercise author. After minor errors were removed during the initial testing stage, the final accuracy was confirmed to be 100%. The effectiveness of this mechanism for blind students was evaluated during user satisfaction tests.
The assessment of navigation modes during the exploration of the figure showed a statistically significant reduction in the time taken to reach an object in modes 1 and 2 compared to mode 0 (without navigation). In mode 0, the average time taken to reach an object was 8.5 s, while in mode 1 (with audio active only for missing elements), it was 3.6 s. The best results were achieved in mode 2 (with audio active only for missing elements and additional directional guidance), with an average time of 2.8 s. A more detailed analysis of the navigation modes revealed that mode 1 worked better for images with a complex structure, such as figures inscribed one into another. Mode 2 (directional), on the other hand, was most effective in less dense drawings or in drawings containing a list of similar objects, such as bar charts.
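The three modes and the reported times can be made concrete with a small Python sketch. The function names, the four-way direction hint, and the screen coordinate convention (y growing downward) are assumptions. Only the mode definitions and the average times of 8.5 s, 3.6 s, and 2.8 s come from the text.

```python
def audible_elements(mode, all_elements, required, activated):
    """Elements whose audio description is active in a navigation mode."""
    if mode not in (0, 1, 2):
        raise ValueError("mode must be 0, 1, or 2")
    if mode == 0:
        return list(all_elements)  # no navigation: everything responds
    # Modes 1 and 2: only required elements that are still missing respond.
    return [e for e in required if e not in activated]


def direction_hint(finger, target):
    """Hypothetical mode 2 guidance: coarse direction from finger to target."""
    dx = target[0] - finger[0]
    dy = target[1] - finger[1]
    if dx == 0 and dy == 0:
        return "here"
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "down" if dy > 0 else "up"  # screen y is assumed to grow downward


def relative_reduction(baseline_s, mode_s):
    """Fraction by which the average time drops relative to mode 0."""
    return (baseline_s - mode_s) / baseline_s
```

Applied to the reported averages, `relative_reduction(8.5, 3.6)` is about 0.58 and `relative_reduction(8.5, 2.8)` is about 0.67, so modes 1 and 2 cut the average time to reach an object by roughly 58% and 67%, respectively.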

General Results
The users' subjective assessment was performed using two tests: NASA-TLX and SUS. The results are presented in Table 2. When the exercises were solved without the feedback mechanisms, the SUS result was 65%, which is below the acceptable level of 70%. According to Bangor et al. [50], a result below 70% indicates usability issues and is a cause for concern. The use of the developed method, with and without the pre-exploration phase, increases the level of system usability. Additionally, the use of the proposed feedback mechanisms slightly increases this level further.
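For readers unfamiliar with the scale, the standard SUS scoring rule is sketched below in Python. It assumes the usual ten-item questionnaire answered on a 1 to 5 scale, with odd-numbered items worded positively and even-numbered items worded negatively. The function name `sus_score` and the example responses are illustrative and are not data from this study.

```python
def sus_score(responses):
    """Standard SUS score (0-100) from ten answers on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly ten responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    total = 0
    for item, r in enumerate(responses, start=1):
        # Odd items contribute (answer - 1); even items contribute (5 - answer).
        total += (r - 1) if item % 2 == 1 else (5 - r)
    return total * 2.5
```

As a made-up example, the answers `[4, 2, 4, 2, 4, 2, 3, 3, 3, 3]` give a score of 65.0, which would fall just below the 70 level discussed above.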
The NASA-TLX rating scale is a multidimensional assessment tool that enables the participants to rate cognitive load across six subscales: mental demand, physical demand, temporal demand, effort, performance, and frustration. In this study, the blind participants solved the exercises without any imposed time limit; thus, time pressure was not evaluated. The results demonstrated that the proposed method significantly reduced the exercise load. The mode used, with or without feedback mechanisms, also affected the results. Among the individual categories, physical demand decreased the most. Students rated their performance as improved, a direct result of the proposed method. In both the effort and frustration categories, there was a similar reduction in task load.
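One common way to summarize such ratings is the unweighted ("raw") TLX score, which is the mean of the rated subscales. The sketch below assumes ratings on a 0 to 100 scale and omits the temporal subscale, as in this study. Whether the raw or the weighted variant was used is not stated in the text, so this is an illustration and not a reconstruction of the authors' calculation. The ratings in the example are invented.

```python
SUBSCALES = ("mental", "physical", "temporal", "performance", "effort", "frustration")


def raw_tlx(ratings, skip=("temporal",)):
    """Unweighted NASA-TLX: mean of the rated subscales (0-100 each)."""
    used = [ratings[name] for name in SUBSCALES if name not in skip]
    if not used:
        raise ValueError("at least one subscale must be rated")
    if any(not 0 <= value <= 100 for value in used):
        raise ValueError("ratings must lie between 0 and 100")
    return sum(used) / len(used)


example = {
    "mental": 60,
    "physical": 40,
    "performance": 50,
    "effort": 70,
    "frustration": 30,
}
```

For the invented ratings above, `raw_tlx(example)` returns 50.0. A lower value indicates a lower perceived workload.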

Discussion and Conclusions
The results indicate an increase in user satisfaction and a reduction in exercise load (see Table 2). The detailed findings regarding the adaptation for the sense of touch revealed the need to adjust the activation area for 12 of the 46 students (approximately 26%) and the gesture time intervals for 10 of the 46 students (approximately 22%). The results related to the navigation modes demonstrated their significant impact on the average time needed to reach all the objects specified by the author of the exercise as necessary to complete a given stage. The feedback loop for the sense of touch enables personalized selection of image presentation parameters, such as the primitive perception field and the time intervals in the recognized two- and three-tap gestures. The feedback loop for the sense of hearing enables an adaptation of the length of theoretical descriptions based on the user's current knowledge. Additionally, it helps avoid fatigue and controls the level of cognitive saturation, which in turn increases user satisfaction and reduces exercise load (as indicated by the SUS and NASA-TLX test results).
One more advantage of the mechanism for detecting overlapping objects is the ability to explore more complexly structured images. The feedback loop, which incorporates both the sense of touch and hearing, enables an effective substitution of sight by combining and synchronizing two sources of cognitive stimuli.
Blind individuals are often unable to access structural information in the same way as sighted individuals. As a result, their access to technical materials is restricted, which leads to potential educational and social exclusion. This highlights the necessity of developing alternative methods for presenting structural information, such as the one presented in this paper. The design, implementation, and evaluation of the proposed method took place at a center specializing in the education of students with blindness. The results obtained confirmed the effectiveness of the proposed approach. We believe that this achievement could contribute to the development of actual systems that support multimodal education for blind students. A significant contribution of this study is the development of an alternative method for presenting structural information contained in technical images. The addition of support for the senses of touch and hearing through feedback mechanisms enables a personalized adaptation of the image presentation. Additionally, we introduced useful mechanisms for detecting objects located beneath the currently explored object and specific navigation modes for exploring tactile drawings.
It should be noted that there are limitations to the proposed approach and evaluation. First, the method was evaluated only in one center, specifically the central educational center for blind students in Poland. Second, the scope of the developed exercises was limited to mathematical exercises and did not include other types of technical images, such as diagrams, maps, and other types of graphs. Future work on the method will involve developing a wider range of exercises, including exercises in several national languages, and evaluating the method for a larger group of centers that specialize in the education of students with blindness. We also plan to extend the evaluation to inclusive classrooms. This will allow for a more comprehensive assessment of the method's potential impact in real-world educational settings and its ability to facilitate inclusive learning environments. The results of this evaluation should also provide insights into the feasibility of implementing the presented method in mainstream education systems.
Funding: Publication partially supported from project no.: 31/010/SDU20/0006-10 (program Research University Excellence Initiative).