How Blind Individuals Recall Mathematical Expressions in Auditory, Tactile, and Auditory–Tactile Modalities

Abstract: In contrast to sighted students who acquire mathematical expressions (MEs) from their visual sources, blind students must keep MEs in their memory using the Tactile or Auditory Modality. In this work, we investigate the ability of blind individuals to temporarily retain MEs when they use different input modalities: Auditory, Tactile, and Auditory–Tactile. In experiments with 16 blind participants, we measured the users’ capacity for memory retention using ME recall. Our results indicate that the distribution of the recall errors regarding their types (Deletions, Substitutions, Insertions) and math element categories (Structural, Numerical, Identifiers, Operators) is the same across the tested modalities. Deletions are the most common recall error, while operator elements are the hardest to forget. Our findings show a threshold to the cognitive overload of the short-term memory in terms of type and number of elements in an ME, where the recall rapidly decreases. The number of errors increases with complexity; however, the increase is significantly higher in the Auditory Modality than in the other two. Therefore, segmenting a math expression into smaller parts will benefit the ability of the blind reader to retain it in memory while studying.


Introduction
Speech technologies, particularly Text-to-Speech (TtS) systems, have been contributing significantly to digital accessibility since the invention of the first TtS engine in 1986 [1,2]. Nowadays, automated reading devices and screen readers [3,4] are extensively used by blind users to convert printed or electronic textual content to audible speech. In education, students with blindness use computers and mobile devices with Assistive Technology (AT) to access the educational content and participate in the educational process [5]. These AT use the following modalities: (a) Auditory Modality, by listening to an audio rendering output (e.g., from a screen reader or other AT); (b) Tactile Modality, by reading braille output (e.g., on a refreshable braille display or an embossed sheet of paper); and (c) Auditory-Tactile Modality, by listening to an audio rendering output (e.g., from a screen reader or other AT) alongside reading the output braille (e.g., on a refreshable braille display or an embossed sheet of paper or by using audio-tactile devices [6,7]).
When math is in a digital form, not just graphically presented but in code accessible to AT, it can be commonly rendered in either Tactile Modality based on a braille math notation or in Auditory Modality using a Math-to-Speech (MtS) system that complies with specific speech transformation rules.
In recent years, the acoustic rendering of mathematics has been explored and applied mainly at the research level. One of the most essential AT systems to make math accessible via speech and sound was AsTER (Audio system for TEchnical Readings) [8]. AsTER was a tool to convert LaTeX [9] documents to a format that could be used as audio documents. MathTalk [10] was developed to speak standard algebra notation through a speech synthesizer, using prosody to make math more accessible and allow the user control of the information flow. AudioMath [11] was introduced as an application to convert mathematical expressions (MEs) from MathML [12] format to plain text and, along with a TtS system, read out the mathematical content. MathSpeak, which incorporated a set of rules for speaking MEs non-ambiguously [13,14], became a component of MathPlayer [15]. Localization (i.e., adaptation to a specific native language), support of multilingual mathematical or textual content, cultural differences, and user preferences are among the open challenging factors that influence the behavior of advanced MathML players [16]. Some examples of local implementations for audio math rendering have been proposed for the Thai language [17], for Polish [18], and for the Korean language [19]. Ongoing research for advanced MtS is aimed at navigating mathematical structures. Nowadays, some screen readers apply the acoustic rendering of mathematics. They either incorporate the ability to speak math (JAWS, VoiceOver with Safari) or achieve it with the help of plugins (MathPlayer [20] and MathJax [21]) or browser extensions (ChromeVox).
The rules for the acoustic rendering of mathematics are less extensive regarding notations and coverage of MEs than those for the braille notations of math. One reason is that braille notations, such as Nemeth [22], provide rules to extend the given symbols and create new ones at any given time. Also, while in Tactile rendering, similar to the visual representation, readers are responsible for interpreting the role of a symbol that can take different names based on the context (e.g., the symbolic operator ∇ could be read as "nabla", "del" in vector analysis, "backward difference" in the calculus of finite differences, "widening operator" in the computer science field of abstract interpretation and more), speech rules in existing systems do not provide this "smart" interpretation yet and use descriptive or more generic descriptions of some symbols. Contextual semantic analysis has been recently proposed [23] to address this shortcoming.
Braille math notations that are currently in use include the Antoine Notation (French Braille Code), Nemeth Code, Unified English Braille Code (UEB), British Mathematics Notation (BAUK), Spanish Unified Mathematics Code, Marburg Mathematics (German Code), Woluwe Code (Notaert Code), Italian Braille Code, Swedish Braille Code, Finnish Braille Code, Russian Code, Chinese Code, and Arabic Code [24]. Some of them are solely and others are partially dedicated to mathematics. As their names suggest, the codes differ from country to country, and no global braille notation is in use, unlike in math for the sighted. Given the linearity of braille and the finite number of symbols to be represented in a single braille cell, these codes contain complex rules to convey mathematical symbols and structures in a space-saving fashion [25].
Other written systems or codes used in some form by blind people include LaTeX and MathML. LaTeX is widely used to create technical and scientific documents, and blind people studying STEM subjects in higher education train themselves in using LaTeX source code as an option to read and write mathematical content. LaTeX is sometimes used as an alternative text for MEs incorporated in a document or on a webpage as images. It can be used as input to some commercial math accessibility products such as MathType [26], DBT [27], Tiger Software Suite [28], and ChattyInfty [29]. Different efforts have been made to either make LaTeX more accessible [30,31] for the visually impaired or to convert LaTeX to an accessible format (e.g., braille) [32,33]. MathML is not meant to be written or read in source code but is a code for mathematics on the Web; therefore, it is used as input to some of the AT systems mentioned above.
When sighted people read an ME, it has been observed that they (a) read from left to right, element by element, (b) back-scan the expression, (c) substitute the outcome of a parenthetical expression, and (d) scan the entire ME for creating a schematic structure [34].
These observations were supported by experiments conducted with sighted participants in Visual Modality. In contrast to sighted students who can acquire MEs from their sources whenever necessary, blind students must keep them in their memory [35].
Working memory has received much attention as a source of improved cognitive functions in middle childhood. It is considered the "active" memory system, which holds and manipulates the information needed to reason about complex tasks and problems [36]. A standard behavioral method for measuring the changing capacity of working memory is to assess children's memory span, that is, the number of randomly presented pieces of information that children can repeat as soon as they are presented [37]. Researchers divide memory into two stages: short-term memory, lasting from seconds to hours, and long-term memory, which lasts from hours to months [38]. According to [39], the auditory information remains in short-term memory for around 10-30 s.
As mentioned, there are two modalities for blind students: the Auditory and the Tactile. The first step to mathematical problem solving is the ability to hold the information in memory. In the literature, recall and working memory in blind people are typically studied in children and usually, but not always, in relation to text [40,41]. There has not been any previous reference to the recall of whole MEs that contain structural elements, operators, numerical elements, and identifiers, as opposed to number sequences. When comparing auditory versus tactile encoding, blind and braille-literate children recall more words when they are encoded in braille than when they are heard [42]. The same has not yet been confirmed for math.
In this work, we intend to check the ability of blind individuals to temporarily retain an ME when they use different input modalities. We measure the capacity of one's memory retention by mathematical expression recall. Our goal is to answer the following questions:
i. Is there a threshold to the cognitive overload regarding the type and number of elements in an ME where the recall rapidly decreases?
ii. Does a modality provide better chances of ME recall to blind users?

Materials and Methods
The basis of this study relies on experiments that took into account user experience with MEs in terms of representation and not calculation. Specifically, blind individuals were invited to read (using Tactile or Auditory or Auditory-Tactile Modalities) and then recall three sets of similar MEs in a three-unit experiment. The approach in the present study was influenced by the EAR-Math evaluation methodology for audio-rendered MEs [43] but was modified accordingly to incorporate Tactile Modality. Participants were asked to recall the representation of MEs.

Participants
Sixteen volunteers who were blind (age: 21.25 ± 5.98 years, eight males and eight females, education: 13.27 ± 3.86 years) participated in this study. All of them (100%) had a visual loss of 95-100% in both eyes. All participants had a good grasp of the braille code for both literal and math texts (braille users for 15.18 ± 5.77 years). They all reported being active users of embossed braille and reading math during the two years prior to the experiment. Regarding their education level, all users attended an elementary school for the blind for 2-6 years, depending on when they became blind, followed by inclusive education in secondary school. The level of mathematical education received was the same for all participants. However, their competence in the subject was not measured since no computations were required on their part. All the participants spoke Greek as a primary language and used screen readers daily. None of the participants had any other disability (e.g., hearing or dexterity impairment) or had been diagnosed with a learning difficulty. They all confirmed that they fully understood the experimental procedure of the current study and signed a written consent form for their participation. For the underage participants, an additional parental consent form was signed. All signatures were given on both printed and embossed documents. The research followed the tenets of the Helsinki Declaration and was approved by the Ethics Committee of the National and Kapodistrian University of Athens.

Materials
The MEs used in the stimuli were based on those introduced in Raman's AsTER [8]. Our set of mathematics included simple fractions and expressions, superscripts, and subscripts, Knuth's examples of fractions and exponents, a continued fraction, square roots, trigonometric identities, logarithms, series, integrals, summations, limits, cross-referenced equations, the distance formula, a quantified expression, and exponentiation. Well-known expressions, such as the Pythagorean theorem and trigonometric identities, were excluded to avoid implicit associate responses. All the mathematical concepts included in the stimuli are taught as part of the Greek secondary school curriculum. A total of 25 expressions were initially selected.
Using the Presentation MathML syntax, MEs can be regarded as trees where each node corresponds to a MathML element, the branches under a "parent" node correspond to its "children", and the leaves in the tree correspond to atomic notation or content units such as numbers, characters, etc. [44]. For this work, we chose to address three types of Presentation MathML elements, namely (a) structural elements, (b) identifiers and numbers, and (c) operators. As an example, the syntax tree of the math expression e (α χ +β χ +χ) is depicted in Figure 1.
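The grouping of tree nodes into element types described above can be illustrated with a short Python sketch. It is not the tooling used in the study: the assignment of MathML tags to the three groups (for example, treating `mfrac` and `msup` as structural) and the sample expression are assumptions made only for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Assumed grouping of Presentation MathML tags (not taken from the study).
STRUCTURAL = {"mfrac", "msup", "msub", "msubsup", "msqrt", "mroot",
              "munder", "mover", "munderover", "mfenced"}
IDENTIFIER_OR_NUMBER = {"mi", "mn"}
OPERATOR = {"mo"}


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://...}mi' -> 'mi'."""
    return tag.rsplit("}", 1)[-1]


def count_elements(mathml):
    """Count structural (S), identifier/number (N), and operator (O) nodes."""
    counts = Counter(S=0, N=0, O=0)
    for node in ET.fromstring(mathml).iter():
        name = local_name(node.tag)
        if name in STRUCTURAL:
            counts["S"] += 1
        elif name in IDENTIFIER_OR_NUMBER:
            counts["N"] += 1
        elif name in OPERATOR:
            counts["O"] += 1
    return counts


# Illustrative expression x^2 + 1 (not one of the study's stimuli).
example = "<math><msup><mi>x</mi><mn>2</mn></msup><mo>+</mo><mn>1</mn></math>"
counts = count_elements(example)
total_elements = sum(counts.values())
print(dict(counts), total_elements)
```

Grouping elements such as `mrow` and the root `math` element are deliberately left uncounted in this sketch; a different convention would change the totals.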
Each of the three experimental units provides the user with 25 MEs in random order (25 expressions × 3 sets = 75 total stimuli) (see Appendix A). We created two extra variation sets of the initially selected expressions to avoid learning the original expressions using mnemonic strategies. The expressions in the three sets had identical structures and the same number of identifiers and operators. They only differed in identifiers and operators when moving from one set to another. We wanted them to maintain similarity to the initially selected set and have the same level of difficulty while being different.
The expressions chosen from different math areas were also given in random order to each unit and user to ensure that the deliberate use of practices to enhance memorization [45] was minimal, if existent.
In their tactile form, the MEs were embossed in the Nemeth Code on dust-free A4-size paper sheets of 160 g/m², one per page, in the middle of the paper in landscape orientation (Figure 2), using an Index Everest V4 embosser. The ME was also written above the tactile form to aid the researcher in following the expression while a participant read it out loud.
In their auditory form, the MEs were pre-recorded using MathPlayer with the Acapela Text-to-Speech Greek voice Dimitris, a voice familiar to all participants, in the default speech rate and pitch. The users could set only the sound level to match their individual needs. The participants did not have the option to navigate in the MEs.
We replaced the embossed test stimuli sets for each group of eight participants to avoid paper deterioration caused by intensive use. Paper deterioration was similar to the attrition of braille books after extended use.

Experimental Procedure
Initially, a researcher briefly described the study's objectives, the experimental procedure, and how to complete each task for each participant.
Before the experiment, (i) users were trained in the audio rules used by MathPlayer, and (ii) the Greek braille system and Nemeth braille code were reviewed. To complete the training phase, users were then asked to read and write 15 MEs to ensure they understood the audio rules and could write in Nemeth code. The expressions used in the training phase were the ones from AsTER that were left out of the experiment phase. The whole training lasted 1 h.
The experiment was conducted in three units with a one-day gap between them. The units were (1) Tactile, (2) Auditory, and (3) Auditory-Tactile, assigned randomly to each participant. One blind individual at a time participated in the experiment, which was conducted in a quiet room so as not to interfere with users' concentration and to achieve maximum information retention. During an experimental unit, participants sat on a chair with adjustable height in front of a desk. To note their answers, the researcher placed a Perkins braille machine and A4 120 g/m² paper sheets on the desk (Figure 3). The researcher was responsible for providing each stimulus to the user (embossed sheets and/or audio recordings).
Our experiment extensively used the users' short-term memory and was not designed to require any computation on their part. Under Baddeley's [46,47] multi-component model for working memory, in both Tactile and Auditory Modalities, the users would have temporarily used the speech-based phonological loop to store the MEs. The tactile presentation of the MEs was given in a horizontal format, as in the auditory representation. When reading them in braille, we asked the users to read the MEs aloud to treat them as math and not as text. As with multi-digit arithmetic problems presented in a visual format, where individuals may translate the visually presented information into a phonological code for temporary storage [48], we ensured that our users translated the tactile information into a phonological code for temporary storage. In translating the tactile input to phonological code, participants had to use the input sensory recording and retrieve the meaning of the braille codes from their long-term memory. We hypothesized that users would benefit from this extra processing and therefore show better results in the tactile part.
The participants were asked to read/hear each stimulus only once (the users were not allowed to repeat the material they had to memorize [49]) and then to write on the braille machine as much as they remembered from the ME. In the tactile unit, the users were also asked to recite what they were reading so that we could check that they recognized mathematical rather than mere braille symbols.
In the Auditory-Tactile experimental unit, the MEs were first given in an embossed form in Nemeth code. Once a participant finished reading braille, the auditory version of the same expression was rendered, and then they were asked to write on the braille machine as much as they remembered from the expression.
The Tactile and Auditory-Tactile parts of the experiment were video recorded to determine the reading time in later analysis. To determine the reading time in the case of Tactile reading, the recording focus was on the stimuli, as well as on the hands of the participants.
All the reading times were recorded by the experimenter using a stopwatch after the end of each experiment, while rewatching the video recordings. The timer started when the user first touched the embossed expression and stopped when the user took their hands off the embossed paper.
The embossed paper sheet was fixed on the desk's surface, and participants were allowed to explore stimuli freely with both hands and all fingers, as in that case, a more detailed examination could be performed effectively.
Each recall trial ended after the participant announced that they finished writing. The procedure of an experimental unit was repeated until all 25 stimuli of the same set were tested. The sequence of the units of the experiment, the stimuli set to be used for each unit, as well as the sequence of stimuli within each test for each participant were randomly selected with normal distribution based on computer software. The users visited the MEs sequentially and only once for all modalities.
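The three levels of randomization described above (order of units, stimulus set per unit, and order of stimuli within a unit) can be sketched in Python as follows. This is only an illustration under stated assumptions: the set labels, the seed, and the use of uniform shuffling are hypothetical, and the software and sampling procedure actually used in the study are not specified here.

```python
import random

MODALITIES = ["Tactile", "Auditory", "Auditory-Tactile"]
STIMULUS_SETS = ["set_1", "set_2", "set_3"]  # hypothetical labels
N_EXPRESSIONS = 25


def make_schedule(participant_id, seed=2023):
    """Return the three experimental units for one participant.

    Each unit pairs a modality with one stimulus set and a shuffled
    presentation order of the 25 expressions (uniform shuffling assumed).
    """
    rng = random.Random(seed * 1000 + participant_id)
    unit_order = rng.sample(MODALITIES, k=len(MODALITIES))
    set_order = rng.sample(STIMULUS_SETS, k=len(STIMULUS_SETS))
    schedule = []
    for modality, stimulus_set in zip(unit_order, set_order):
        stimulus_order = rng.sample(range(N_EXPRESSIONS), k=N_EXPRESSIONS)
        schedule.append({"modality": modality,
                         "set": stimulus_set,
                         "order": stimulus_order})
    return schedule


schedule = make_schedule(participant_id=1)
for unit in schedule:
    print(unit["modality"], unit["set"], unit["order"][:5])
```

Seeding the generator per participant keeps each schedule reproducible, which is convenient when a session has to be repeated or audited.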

Data Analysis
The primary outcome of this study is the number of recall errors (RE), and the main question is whether RE varies significantly between the three modalities: Auditory (A), Tactile (T), and Auditory-Tactile (A-T). The proportional distribution of RE is described and compared across (a) error types: Deletions (D), Substitutions (U), and Insertions (I), (b) elements: Structural (S), Numerical or Identifiers (N) and Operators (O), and (c) the combination of error types and elements [50,51].
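One way to make the three error types concrete is to align the presented and the recalled expression as token sequences and count the operations along a minimum-edit alignment. The Python sketch below does this with a standard edit-distance table. It is an assumption made for illustration, not the scoring procedure of the study; the token sequences are hypothetical, and when several minimal alignments exist the function reports only one of them.

```python
def classify_errors(target, recalled):
    """Count deletions (D), substitutions (U), and insertions (I) along one
    minimum-edit alignment of the target and recalled token sequences."""
    n, m = len(target), len(recalled)
    # cost[i][j] = edit distance between target[:i] and recalled[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = 0 if target[i - 1] == recalled[j - 1] else 1
            cost[i][j] = min(cost[i - 1][j - 1] + step,  # match / substitution
                             cost[i - 1][j] + 1,         # deletion
                             cost[i][j - 1] + 1)         # insertion
    counts = {"D": 0, "U": 0, "I": 0}
    i, j = n, m
    # Walk back from the end of both sequences, preferring the diagonal.
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = 0 if target[i - 1] == recalled[j - 1] else 1
            if cost[i][j] == cost[i - 1][j - 1] + step:
                counts["U"] += step
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            counts["D"] += 1
            i -= 1
        else:
            counts["I"] += 1
            j -= 1
    return counts


# Hypothetical trial: "x ^ 2 + 1" presented, "x + 3" written down.
trial = classify_errors(["x", "^", "2", "+", "1"], ["x", "+", "3"])
print(trial)
```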
The mean values of recall errors were compared between the two genders with the independent samples t-test and between modalities, error types, elements, and their two-way and three-way interactions with the three-factor ANOVA, followed by pairwise comparisons with Bonferroni adjustment. Moreover, using regression techniques, the distribution of RE was tested against the complexity of the MEs, where complexity (C) is defined as the total number of structural elements (S), numerical elements or identifiers (N), and operators (O) contained in the expression: C = S + N + O. Finally, we used repeated measures ANOVA to evaluate under which complexity conditions the RE significantly differs between the three modalities. The level of significance was set at 0.05.

Descriptive Statistics
The sixteen participants committed a total of 5408 recall errors across the three modalities. Figure 4A shows that this number was not evenly distributed between the three modalities. There were 2403 errors in Auditory Modality, 1606 in Tactile Modality, and 1399 in Auditory-Tactile Modality. This is the first indication that performance in the specific experiment was inferior in Auditory Modality and that Auditory-Tactile Modality was somewhat better than Tactile Modality. Figure 4B shows that most recall errors were deletions, i.e., cases in which participants omitted an element. Substitutions and insertions of elements were much less frequent.
The majority of recall errors were committed with structural elements (Figure 4C). However, the ME sets had different numbers of identifiers, structures, and operators: 192 identifiers, 163 structures, and 103 operators. Therefore, the correct approach is to divide the total number of recall errors in the identifiers, structures, and operators by the total number of items in each category to obtain the mean number of recall errors per element type. This approach makes recall errors across the three element types seem evenly distributed (Figure 4D).
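The normalization described above amounts to a simple division per category, sketched here in Python. The element counts (192, 163, 103) are those reported in the text; the raw error counts are hypothetical placeholders chosen only to show the arithmetic and are not results of the study.

```python
# Number of items per category in the ME sets, as reported in the text.
ELEMENT_COUNTS = {"identifiers": 192, "structures": 163, "operators": 103}


def errors_per_element(raw_errors, element_counts=ELEMENT_COUNTS):
    """Divide raw recall-error counts by the number of items per category."""
    return {name: raw_errors[name] / element_counts[name]
            for name in element_counts}


# Hypothetical raw error counts (placeholders, not data from the study).
hypothetical_errors = {"identifiers": 384, "structures": 326, "operators": 103}
rates = errors_per_element(hypothetical_errors)
print(rates)
```

With these placeholder numbers, two categories that look different in raw counts end up with the same rate per element, which is the effect the normalization is meant to expose.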
Table 1 presents the time spent on each ME regarding mean, minimum, and maximum values. Users spent less time on the tactile part of Auditory-Tactile Modality than on Tactile Modality. Still, this time difference is insufficient to cover the time spent on the auditory part of Auditory-Tactile Modality, making it the most time-consuming.

Inferential Statistics
There were no significant differences in the mean numbers of recall errors between the two genders (t-test, t(430) = 1.947, p = 0.052, Figure 5A). Three-factor analysis of variance revealed that all three factors had a significant effect on the mean number of recall errors: Modality (F(2,405) = 10.8, p < 0.01), Error type (F(2,405) = 111.0, p < 0.01), and Element (F(2,405) = 15.5, p < 0.01). There were no significant two-way or three-way interaction effects. Post hoc pairwise comparisons with Bonferroni adjustment revealed that (a) the mean number of recall errors in Auditory Modality was significantly greater than in Tactile Modality (p = 0.022) and Auditory-Tactile Modality (p = 0.018) (Figure 5B), (b) the mean number of recall errors of the deletion type was significantly greater than those of the insertion type (p < 0.01) and the substitution type (p < 0.01) (Figure 5C), and (c) the mean number of recall errors in operators was significantly lower than in identifiers (p < 0.01) and structures (p < 0.01) (Figure 5D).
The fact that there are no interaction effects means that the relative number of recall errors in each modality is independent of the types of error and the elements. This allows for investigating the dependence of the number of recall errors on the complexity of the MEs and evaluating under which complexity conditions the averaged per participant RE significantly differs between the three modalities.
Contrary to what might be expected, the dependency of the number of recall errors on the complexity of the MEs is best described by a linear equation rather than a power or an exponential function (Figure 6A). This means that the number of recall errors is expected to increase linearly, proportional to the increase in the complexity of the expression. According to the regression equation, an increase of two items in the complexity of the ME results in an increase of roughly one recall error.
Furthermore, it seems (Figure 6B) that the linear relationship between the number of recall errors and the complexity of the expression is different in the three modalities.
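A linear dependence of the form RE = a + bC can be estimated with ordinary least squares. The Python sketch below shows the computation on synthetic points that lie exactly on a line with slope 0.5, matching the rule of thumb of roughly one extra error per two extra items. The data are made up for illustration and the function is a generic textbook estimator, not the statistical software used in the study.

```python
def fit_line(complexities, errors):
    """Ordinary least-squares estimates (a, b) for errors = a + b * complexity."""
    n = len(complexities)
    mean_c = sum(complexities) / n
    mean_e = sum(errors) / n
    sxx = sum((c - mean_c) ** 2 for c in complexities)
    sxy = sum((c - mean_c) * (e - mean_e)
              for c, e in zip(complexities, errors))
    b = sxy / sxx            # slope: extra errors per extra item
    a = mean_e - b * mean_c  # intercept
    return a, b


# Synthetic data lying exactly on RE = 1 + 0.5 * C (not data from the study).
complexities = [4, 10, 20, 30, 46]
errors = [1 + 0.5 * c for c in complexities]
a, b = fit_line(complexities, errors)
print(a, b)
```

Fitting the same function separately to the data of each modality would give one slope per modality, which is the quantity compared across modalities in the analysis.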
Table 2 presents the parameters of the linear regression equations of the dependency of the number of errors (RE) on the complexity (C) of the ME for the three modalities, RE = a + bC, where a is the constant and b is the coefficient (slope) of the equation.
Table 2. Parameters of the linear regression equations RE = a + bC of the dependency of the number of errors RE on complexity C of the expression for the three modalities, along with the 95% confidence intervals (CI) for coefficient (b).
The 95% confidence interval (CI) for coefficient (b) in the Auditory Modality lies outside the CIs for the other two modalities. Thus, the coefficient (0.570) in the Auditory Modality is significantly greater than the coefficients in the other two modalities (0.449 and 0.389). This means that the increase in the number of errors caused by the increase in complexity is significantly greater in the Auditory Modality than in the other two modalities.
Table 3 presents the results (p-values) of the three pairwise comparisons of the values of recall errors between the three modalities separately for each ME complexity. In expressions of low complexity (up to 10 items), the participants performed equally well in all three modalities. Starting from the medium complexity of 11 items and up to 35 items, participants performed significantly better (marked as bold in gray cells) in Tactile and especially in Auditory-Tactile Modalities than in Auditory Modality. Finally, the participants performed equally poorly in the high-complexity expression containing 46 items.

Conclusions
In this investigation, we worked with blind users active in learning and with math content that was not randomly generated but could be encountered in a textbook, consisting not only of numbers but also of variables, symbols, operators, and functions. We experimented in three settings with Auditory, Tactile, and Auditory-Tactile Modalities in an experiment designed to measure the users' short-term memory capacity regarding ME recall. The questions we posed were answered in the following conclusions.
The first conclusion from the statistical analysis is that the distribution of the recall errors regarding the error types and elements is the same across the tested modalities.
Second, deletions are by far the most common type of recall error, although participants were asked to write on paper every part of a given ME they could recall and to avoid omitting the parts they did not feel they retained correctly.
A third conclusion is that recall errors in operators are less frequent than in structures and identifiers, which is in accordance with the results of a similar experiment conducted on sighted students in Visual Modality [34].
Fourth, the complexity of the MEs (i.e., the total number of math elements) affects the recall capabilities of the participants, as expected, because of the increased cognitive load. The number of recall errors is linearly dependent on the complexity of the expression. However, the increase in the number of errors caused by the increase in complexity is significantly greater in Auditory Modality than in the other two modalities. In expressions of medium complexity, the participants' performance in Auditory Modality is substantially worse than in the other two modalities. Expressions of low complexity are easily recalled, while expressions of high complexity are not, irrespective of modality. Therefore, our hypothesis that participants perform worse in Auditory Modality than in Tactile and Auditory-Tactile Modalities is confirmed for expressions of medium complexity. These expressions are neither so short that they fail to occupy one's full short-term memory nor so long that one cannot benefit from using long-term memory in the tactile mode.
The current study constitutes a first step toward recommendations to be considered when designing math educational material for people who are blind. It is a given that educators must make math content accessible in different modalities, depending on the student's preferences. Our findings suggest that the extraneous cognitive load cannot be eased by choosing a specific modality in favor of another, but for medium complexity, math braille is a better choice. Thus, long MEs should be given to the student in smaller parts, as proposed previously [52]. While cognitive accessibility [53] aims to make content usable for people with cognitive and learning disabilities, based on our results, the length of the MEs embedded in text should also be considered by both content creators and (semi)automatic accessibility checkers.
If students are given control of a lengthy ME over audio, they can pause it whenever they see fit, thereby segmenting it themselves. Automatic segmentation would be preferable to self-segmentation because it can be performed at different structural levels rather than at arbitrary places, allowing users to listen to complete sub-expressions. Therefore, pre-recorded audio of math is less preferable than fully accessible content that students can access multimodally. In real-life circumstances, e.g., in a textbook, a long and complex ME is usually "built" in several steps from simpler expressions, so readers can use prior knowledge to recall the new expression. However, whether this prior knowledge improves recall is unproven and thus requires further research.
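One way such structure-aware segmentation could work is sketched below. It is an illustration only, not a method used or evaluated in this study: the `Node` class, the `segment` function, and the budget of 10 items (taken from the low-complexity range in which participants performed equally well in all modalities) are assumptions. An expression tree is split into complete sub-expressions that each stay within the item budget.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One math element: a structure, operator, identifier, or number."""
    label: str
    children: list[Node] = field(default_factory=list)

    def size(self) -> int:
        """Number of elements in this sub-expression, itself included."""
        return 1 + sum(child.size() for child in self.children)


def segment(node: Node, max_items: int) -> list[Node]:
    """Split a tree into complete sub-expressions of at most max_items.

    A node that fits the budget is kept whole; otherwise its children
    are segmented recursively. A leaf always fits, as its size is 1.
    """
    if node.size() <= max_items or not node.children:
        return [node]
    parts: list[Node] = []
    for child in node.children:
        parts.extend(segment(child, max_items))
    return parts


# Hypothetical example: the fraction (a + b) / (c - d) as a small tree.
numerator = Node("mrow", [Node("a"), Node("+"), Node("b")])
denominator = Node("mrow", [Node("c"), Node("-"), Node("d")])
fraction = Node("mfrac", [numerator, denominator])

for budget in (10, 5):
    labels = [part.label for part in segment(fraction, budget)]
    print(f"budget {budget}: {len(labels)} segment(s) -> {labels}")
```

Note that when a node exceeds the budget, this sketch returns only its parts, so a real renderer would still need to announce the enclosing structure (here, the fraction) before presenting the segments.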
As our results show, providing Tactile or Auditory-Tactile content to students increases their ability to understand and recall the ME. These modalities also prove valuable if an ME contains ambiguous symbols whose meanings depend on the context.
In 2017, in the USA, blind people represented less than 5% of all the science, technology, engineering, and mathematics (STEM) workforce [54].If interest in STEM is lost in the educational years, then we believe we should try to make STEM content more interesting by making it more accessible also at a cognitive level.Since multimodal interaction and technologies are a given for blind people and there is a constant interest in research to exploit newer technologies in pursuit of accessibility, the technologies created for math should offer users access to different modalities and assist them in decreasing the cognitive load and achieving better recall.
In the future, we plan to exploit the video recordings of our experiment further.We want to study the users' finger movements, pauses, and backtracking and check whether they are somehow in accordance with how sighted users look at MEs [55].
(a) Auditory Modality through TtS systems in connection to screen readers;
(b) Tactile Modality by
• reading texts in braille either on embossed paper or on refreshable braille displays,
• reading tactile images, or
• manipulating 3D tactile artifacts;

Figure 1. Example MathML tree of a mathematical expression. Structural elements are presented in rectangular form, operators are given in diamonds, and numericals/identifiers are circled.
Multimodal Technol. Interact. 2024, 8, x FOR PEER REVIEW

tactile form to aid the researcher in following the expression while a participant read it out loud.

Figure 2. Example of a mathematical expression in Tactile form.

Figure 3. The setting of the tactile experimental unit.

Figure 4. The absolute, mean, and relative number of recall errors across all participants by modality, type of error, and element. (A) The absolute and relative number of recall errors committed by all participants in each of the three modalities: Auditory (A), Tactile (T), and Audio-Tactile (A-T). (B) The absolute and relative number of recall errors across all participants, math expressions, and modalities by type of error (I-Insertions, D-Deletions, and U-Substitutions). (C) The absolute and relative number of recall errors across all participants, math expressions, and modalities in Structural elements (S), Operators (O), and Identifiers (N). (D) Mean number and relative number of recall errors across all participants, math expressions, and modalities per item in Structural elements (S), Operators (O), and Identifiers (N).

Figure 5. Mean number and 95% confidence intervals (CI) of the recall errors per gender, modality, error type, and element. (A) The mean number of recall errors per gender. (B) The mean number of recall errors per modality. (C) The mean number of recall errors per error type. (D) The mean number of recall errors per element.

Figure 6. Parameters of the linear regression equations RE = a + bC of the dependency of the number of errors RE on the complexity C of the expression for the three modalities, along with the 95% confidence intervals (CI) for coefficient (b). (A) Scatterplot of the number of recall errors depending on the complexity of the expression. Results of the linear regression analysis. (B) Dependence of recall errors on the complexity of the expression for each modality.

Table 1. Time spent in math expressions.

Table 3. Pairwise comparisons of the mean numbers of recall errors between the three modalities separately for each expression complexity.