Learning to Read in Hebrew and Arabic: Challenges and Pedagogical Approaches

Chan, Martin Luther

doi:10.3390/educsci14070765


College of Arts and Sciences, University of Kentucky, Lexington, KY 40506, USA

Educ. Sci. 2024, 14(7), 765; https://doi.org/10.3390/educsci14070765

Submission received: 20 December 2023 / Revised: 27 June 2024 / Accepted: 8 July 2024 / Published: 12 July 2024

(This article belongs to the Special Issue The Science of Second Language Reading: Ecological, Educational, Neurolinguistic, Psychological, and Sociocultural Perspectives)


Abstract

Hebrew and Arabic are Semitic languages that use abjads, consonant-primary writing systems in which vowels appear as optional diacritics. The relatively predictable morphology of Semitic languages renders abjad writing feasible, with literate native speakers relying on grammatical and lexical familiarity to infer vowel sounds from consonantal texts. However, in the context of foreign language acquisition, abjads present unique difficulties in the attainment of literacy. Due to the absence of written vowels, learners of Hebrew and Arabic face manifold challenges, such as phonetic ambiguity, extensive homography, and morphological unpredictability. Therefore, the inherent complexities of abjads necessitate targeted pedagogical intervention that increases metalinguistic awareness and thereby strengthens learners’ reading skills, specifically by recreating elements of literacy education for native speakers in the second language context. This article explores the linguistic challenges of abjads for foreign language students and how pedagogical methodologies can be optimized to improve long-term learning outcomes.

Keywords:

foreign language pedagogy; Hebrew; Arabic; morphology; morphological awareness; Semitic languages

1. Introduction

Constituting the oldest form of alphabetic writing, abjads have been continuously in use for the last three thousand years to transcribe Hebrew and Arabic, two closely related Semitic languages with similar morphological systems and whose scripts derive from a common ancestor.

By definition, abjad alphabets feature consonants primarily, representing vowel phonemes with optional diacritics that are absent in quotidian writing. Linguists refer to the vowelless abjad scripts of Semitic languages as an example of deep orthography [1], where the relationship between spelling and phonology is opaque, in contrast with shallow orthography languages like Spanish, where there exists a straightforward one-to-one correspondence between graphemes and phonemes [2].

Hebrew and Arabic both have systematic morphologies that allow the deep orthography of abjad writing to be comprehensible. When reading texts in these languages, native Hebrew and Arabic speakers (L1) naturally rely on metalinguistic awareness and their innate understanding of morphological patterns. However, the absence of vowel representation in abjad writing constitutes an impediment for second language (L2) learners, who generally do not possess the same knowledge of grammar or lexical inventory and so cannot navigate reading with the same facility as native speakers. Because reading abjad scripts requires a high level of linguistic competence, students of Hebrew and Arabic should expect to acquire literacy incrementally and to refine it in tandem with the development of morphological awareness.

Structure of Article and Novel Approach and Findings

Extant research on abjads and the role of diacritics in reading tends to focus on either Hebrew or Arabic. However, this article adopts a novel approach to pedagogical research by evaluating the two languages together rather than examining them in isolation. Because the languages are remarkably close and call for similar educational approaches, research conducted on Arabic L1 and L2 readers has important ramifications for Hebrew specialists and vice versa.

Constituting an original review of linguistic theory and current pedagogies, the article fills a void in scholarship by concurrently evaluating these two languages from both a theoretical and an applicational perspective. Composed of three distinct but conceptually interconnected sections, the article is structured to function as both a linguistic and a pedagogical piece.

The first part (comprising Section 2, Section 3, Section 4 and Section 5) contributes unique insights to the field of Hebrew and Arabic language pedagogy by identifying the three primary linguistic barriers facing L2 students when reading Semitic abjad scripts. Elucidating the obstacles for learners gives educators a deeper understanding of learner needs, so that they can tailor their pedagogy accordingly. The second part amalgamates theory and practice by conducting a review of extant research on Hebrew and Arabic L1 and L2 learners, concluding that metalinguistic awareness plays an outsized role when reading Semitic abjad scripts and that L2 learners require targeted training to refine this awareness. The third part is application focused, weighing in on the ongoing debate in Semitic language pedagogy regarding the optimal rate of diacritization for learning materials and conducting a comprehensive review of the methodologies used in popular textbooks. The review of pedagogical strategies concludes with a novel contribution to the field of foreign language education: a proposal for how educators can employ diacritics at a rate that is inversely correlated with learner progression, so as to strengthen morphological awareness and reading competency simultaneously.

2. Challenges of Reading Semitic Abjads

This article is the first to identify three key stumbling blocks for L2 readers stemming from the linguistic features of Hebrew and Arabic that are imperfectly represented by their respective abjad scripts. These are as follows:

(1): Semitic phonology: Abjad scripts are consonant-primary and vowel-secondary, meaning that vowel phonemes are not generally represented in written texts, which can result in significant ambiguities.
(2): Nominal patterns: The non-concatenative morphology results in the vowel pattern being an interwoven morpheme that is unmarked in written texts. For example, the triliteral consonantal noun mlk in Arabic has three vowel permutations (malik, mulk, milk), each of which corresponds to a word with a distinct meaning.
(3): Verbal system: Semitic verbs are highly reliant on vowels as markers of inflection, mood, voice, and semantic paradigm. Case in point, the Hebrew verb ktvt can be read as either katavta (you wrote—m.s.) or katavt (you wrote—f.s.).
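The way unmarked vowels collapse distinct words into one written form can be illustrated with a minimal Python sketch. It works on the romanized examples cited above rather than on the scripts themselves, and it assumes that every vowel is unwritten (in practice, long vowels and vowel letters are often written). The helper names `skeleton` and `group_homographs` are hypothetical and introduced only for illustration.

```python
from collections import defaultdict

# Assumption: in this simplified romanization, every vowel is unwritten.
VOWELS = set("aeiou")

def skeleton(word: str) -> str:
    """Return the consonantal skeleton of a romanized word."""
    return "".join(ch for ch in word if ch not in VOWELS)

def group_homographs(words):
    """Group words that share a skeleton; keep only groups with 2+ readings."""
    groups = defaultdict(list)
    for word in words:
        groups[skeleton(word)].append(word)
    return {skel: forms for skel, forms in groups.items() if len(forms) > 1}

# Romanized readings taken from the examples in items (2) and (3).
readings = ["malik", "mulk", "milk", "katavta", "katavt", "rajul"]
homographs = group_homographs(readings)
for skel, forms in homographs.items():
    print(skel, "->", ", ".join(forms))
```

Under these assumptions, the three Arabic readings fall together under mlk and the two Hebrew readings under ktvt, while a word with no competing reading in the list forms no group.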

The inability of abjad writing to fully represent the complexities of Semitic phonology and morphology results in a high density of homographs. Homography is a pervasive phenomenon in Semitic abjads, with anywhere between 25 and 40% of unvocalized words being homographic in Hebrew [3] and an estimated one out of three in Arabic [4]. The high saturation of homographs creates numerous ambiguities, straining the ability of an unacclimatized L2 learner to read fluidly. Indeed, studies have found that reading unvocalized texts in Hebrew or Arabic entails the deciphering of words and analysis of their morphological structure, a task which slows the pace of reading even for native speakers [5]. This problem is even more prominent in the context of second language acquisition.

The three primary homography-inducing phenomena will be discussed at length in Section 3, Section 4 and Section 5 to underscore common pitfalls for learners, followed by a discussion of different pedagogical strategies that mitigate the impacts of these impediments on reading competency.

3. Representing Semitic Phonology in Abjad Scripts

3.1. Hebrew Phonology

Hebrew phonology is challenging to represent in the abjad script for two reasons: (1) polyvalent graphemes, i.e., the use of a single consonant to represent two phonemes, and (2) the limitations of the abjad in representing the five-vowel system of Modern Hebrew.

3.1.1. Polyvalent Graphemes

Historically, Hebrew did not possess sufficient graphemes to represent the pronunciation of every sound in its phonemic inventory, resulting in several letters becoming polyvalent [6], a phenomenon where a single grapheme corresponds to multiple phonemes, making its pronunciation variable and context based. An example from English is the letter c, which can represent either the s or the k sound.

In Hebrew, there were historically six polyvalent graphemes that represented a plosive-fricative pair [7]. To elucidate pronunciation, late antique Hebrew introduced a diacritical dot known as the dagesh lene to visually distinguish between the plosive and fricative articulation of the letter [8]. Despite its utility, however, the dagesh lene is considered an optional diacritic and is usually unmarked in quotidian writing.

In Modern Hebrew, three graphemes continue to be polyvalent, representing the plosive-fricative pairs b/v, k/x, and p/f. With the diacritic, these letters appear visually distinct: בּ (b) vs. ב (v), כּ (k) vs. כ (x), and פּ (p) vs. פ (f). However, without the diacritic, deciphering these graphemes is challenging for L2 learners as there are complex underlying forces governing the alternation between the fricative and plosive sound, often within variants of the same root. The example below shows derived forms from the root kbš:

כָּבַשׁ (kavaš—he conquered)

וְּכָבַשׁ (u-xavaš—and he conquered)

יִכְבֹּשׁ (yixboš—he will conquer)

The phonological rules governing polyvalent consonants are referred to as “triggers” [7] (e.g., the plosive sound is used at the beginning of a word or after a closed syllable), and readers must understand these rules to read accurately.
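The trigger mentioned above can be sketched in Python as a rough approximation. The sketch assumes a romanized underlying form written with the plosive symbols, treats "after a closed syllable" as "immediately after a consonant", and ignores the many lexical exceptions of Modern Hebrew. The function name `spirantize` is hypothetical.

```python
# Plosive-to-fricative mapping for the three polyvalent graphemes
# of Modern Hebrew (b/v, k/x, p/f).
FRICATIVE = {"b": "v", "k": "x", "p": "f"}
VOWELS = set("aeiou")

def spirantize(underlying: str) -> str:
    """Apply a simplified trigger rule to a romanized underlying form.

    Assumption: a polyvalent consonant stays plosive word-initially or
    directly after a consonant, and becomes fricative after a vowel.
    """
    surface = []
    previous = None  # no preceding segment at the start of the word
    for segment in underlying:
        if segment in FRICATIVE and previous in VOWELS:
            surface.append(FRICATIVE[segment])
        else:
            surface.append(segment)
        previous = segment
    return "".join(surface)

# Underlying forms of the three kbš examples, written without the hyphen.
for form in ["kabaš", "ukabaš", "yikboš"]:
    print(form, "->", spirantize(form))
```

Under these assumptions the sketch reproduces the three surface forms cited for the root kbš. A learner's actual task is harder, since the unvocalized script provides neither the vowels nor the syllable boundaries that the rule depends on.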

3.1.2. Vowel System

A second phonological challenge of the Hebrew abjad script is the relative abundance of vowel phonemes. Historically, Hebrew had a complex vowel system that distinguished both vowel quality and vowel length: long, short, and ultra-short [9]. However, in Modern Israeli Hebrew, there are five cardinal vowels (a, e, i, o, u) with no distinction of vowel length [10].

3.1.3. The Utilization of Vowel Letters

Even though Modern Hebrew has vastly simplified its vowel system, there is still significant ambiguity stemming from representing a five-vowel system in a consonant-primary script. To reduce the number of ambiguities in unvocalized texts, Hebrew frequently employs consonant letters to represent certain vowel sounds, an orthographic convention known as ketiv male—“full spelling” [11]. Two consonantal graphemes are used to represent vowel sounds: ו (originally waw—now pronounced vav) is used to represent either u or o, and י (yod) is used to represent i (and, less commonly, e).

The prevalent use of ketiv male has helped to disambiguate a number of words that would have otherwise been heterophonic homographs, graphemically identical words with multiple phono-semantic readings (such as the word “present” in English) [12]. Their usage constitutes a useful tool for mitigating ambiguity when reading unvowelled texts.

Table 1 shows eight different readings of the triconsonantal root ספר (spr or sfr) to demonstrate how the use of full spelling can greatly reduce (although not eliminate) homography in Hebrew.

The heterophonic homograph ספר (sfr/spr) illustrates the challenge of deep orthography in Hebrew, and the presence of the medial polyvalent consonant פ (p/f) makes this word an exceptionally complex case. Fortunately, the adoption of vowel letters to represent the sounds o, u, and i reduces the number of possible readings to five, leaving the reader to rely on context and lexical frequency to rule out the remaining possibilities.

3.1.4. Ambiguities in the Vowel Letters

While vowel letters significantly ease the challenge for readers navigating unvowelled texts, these letters do not fully eliminate ambiguities. In fact, sometimes the vowel letters may create ambiguities of their own. Ambiguity is inherent in the system of vowel letters because vav and yod are semi-consonants [9] due to their dual function as both consonant and vowel. In the word-initial position, occurrences of vav and yod are pronounced as consonants (with the notable exception of vav as a conjunctional clitic). However, in word-medial positions, ambiguity often arises as the pronunciation of vav and yod is lexeme dependent, a phenomenon illustrated in Table 2.

To reduce instances of confusion, Modern Hebrew has widely popularized the orthographical tradition of writing the vav or yod twice to indicate a consonantal sound [13], giving us the forms shown in Table 3.

Nonetheless, while this orthographical practice is useful, doubling the vowel letters only provides a partial phonetic clue; it does not reveal the entire pronunciation of the word. For example, the word חייל (ḥyyl) is still a heterophonic homograph, in spite of writing the y twice, as it still has three lexical interpretations: ḥayal (“soldier”), ḥiyel (“enlist”), and ḥayel (“enlist!”).

3.2. Arabic Phonology

Before embarking on our discussion of Arabic, it is important to clarify that Arabic is a diglossic language [14], with significant lexical, grammatical, and phonological differences between the standard language and the sundry colloquial varieties spoken across the Middle East and North Africa. For the sake of simplicity, the discussion of Arabic in this article centers around the standardized register, Modern Standard Arabic, used in formal literature and communication.

Standard Arabic possesses 28 consonantal phonemes, having preserved most of the original consonants of Proto-Semitic [15], which gives it a significantly larger phonemic inventory than Hebrew. However, it does not have any problem with polyvalent graphemes because each consonantal phoneme is represented by a distinct grapheme. Moreover, when compared with other Semitic languages, the vowel system of Standard Arabic is relatively simple [16], comprising three vowel qualities (a, i, u), with short and long counterparts, for a total of six vowel phonemes. In fact, the relative poverty of vowel phonemes is reflected in its consonant-heavy abjad [17]. (However, it is worth noting that some varieties, such as Levantine, have a more extensive vocalic inventory that includes e, o, and ə [18], but this is beyond the scope of our discussion.) Short vowels are indicated with optional diacritics but are generally unmarked in regular writing. Conversely, long vowels are always written with vowel letters, with the consonants alif, ya’, and waw (أ, ي, و) representing ā, ī, and ū, respectively.

The historical tradition of Arabic to use consonants to represent long vowels dates back to the seventh century C.E. [19]. To this day, vowel letters constitute an integral part of the Arabic writing system to mark long vowels, constituting an important distinction between the Arabic script vis-à-vis Hebrew writing. In Hebrew, vowel letters are an auxiliary part of the system and are used as needed to compensate for the absence of diacritics. In Arabic, they are a built-in part of the abjad script.

However, the fixed presence of vowel letters in the Arabic abjad still results in occasional ambiguities because waw (و) and ya’ (ي) have a dual function: they not only represent long vowel sounds, but they also represent the diphthongs aw and aj, respectively. Optional short vowel diacritics may be used before the vowel letter to distinguish between a diphthong and a long vowel sound (one could theoretically choose to diacritize the word طول (ṭūl) as طُول, with the ُ in addition to the vowel letter to ensure that a reader does not interpret it as the diphthong طَوْل (ṭawl)), but this is not a common convention. Table 4 shows the use of vowel letters in Arabic script.

Apart from the diphthongs, the vowel phonology of Standard Arabic is generally less ambiguous in the abjad script as it only has three unmarked vowel sounds (as opposed to five in Hebrew). Ostensibly, this implies greater facility for an L2 reader to pronounce unfamiliar words as there would be fewer unmarked vowel combinations to consider. Nonetheless, the relative simplicity of Arabic vowel phonology is offset by the complexity of its morphology, a topic discussed in Section 4 and Section 5.

4. Morphology: Representing Nominal Patterns in Semitic Abjads

4.1. Hebrew Nominal Patterns

Hebrew nominal patterns are relatively straightforward and easy to represent with the abjad script. This is because most noun patterns are derived from a triconsonantal root, coupled with nominal affixes. In addition to affix-based morphology, there are also several nominal patterns that do not require affixes and are derived by changes to the non-concatenative vowel sequence. These nominal patterns tend to be consistent, allowing readers to make a conjecture regarding the pronunciation of a noun.

Table 5 shows some frequently recurring nominal patterns in Hebrew that derive from a triconsonantal root without any affixes. In the table, “C” represents a consonant, and lowercase letters represent vowels. For the transliteration, Latin letters in bold font represent the abjad consonants.

While there is still room for ambiguity, Hebrew phonotactics and use of vowel letters help to mitigate confusion when deciphering nominal patterns. As far as phonotactics is concerned, nearly all non-affixed nominal patterns in Hebrew are disyllabic in nature, a result of the disyllabification of primordial monosyllabic nominal patterns (Proto-Semitic CaCC and CiCC shifted in Hebrew to CeCeC [20], while CuCC shifted to CoCeC [20]). In Modern Hebrew, there are only four monosyllabic nominal patterns [21], and CCaC is the only monosyllabic pattern that does not entail the use of affixes. The paucity of monosyllabism in Hebrew nouns greatly alleviates ambiguity as a reader can assume that a given triconsonantal noun (i.e., CCC) will more likely than not be disyllabic.

Furthermore, the use of vowel letters also helps to reduce ambiguities in nominal patterns. For example, the pattern CoCeC is generally written as CWCeC, using ketiv male to represent the o vowel. Therefore, this imposes a distinction for the lexical readings of the heterophonic homograph kvd, with kaved being written as כבד (kvd) and koved written as כובד (kwvd).

4.2. Arabic Nominal Patterns

Conversely, Arabic nominal patterns are significantly more complex, and the number of permutations may cause serious confusion for an inexperienced L2 reader. As is the case with Hebrew, many Arabic nominal patterns can be formed merely by changing the internal vowel sequence of a root, without adding any affixes [22]. In these cases, the vowel pattern constitutes the sole distinctive morpheme for many nouns, resulting in substantial ambiguity when written diacritics are removed from a text.

Table 6 presents a list of all possible nominal patterns derived from a triconsonantal root and different vowel sequences.

As shown above, non-affixed Arabic nominal patterns span both monosyllabic (e.g., rijl—leg) and disyllabic nouns (e.g., rajul—man), as well as singular (e.g., jamal—camel) and plural nouns (e.g., jumal—sentences). Adding to the difficulty, non-affixed nominal patterns resist semantic classification, and their morphemic vowel sequences generally do not exhibit any definable function [22], aside from the notable exceptions of CuCaC and CuCuC, patterns mostly reserved for plural nouns.

Table 7 illustrates the different nominal forms derived from the triconsonantal roots mlk and ʕqd, both of which are heterophonic homographs that have three readings.

Because each of the above words has three lexical readings, even a proficient native speaker must rely on contextual cues to determine the context-appropriate meaning and pronunciation of the word. Moreover, for some heterophonic homographs, vowel diacritics are not only useful but also necessary. A study conducted at the Lebanese University of Beirut found that even skilled native readers benefit from the diacritization of certain words as they are not always able to recall all lexical possibilities when encountering a graphemically ambiguous word [23].

When factoring in non-lexical readings, the number of possible pronunciations may constitute a serious impediment for a second language reader, who must determine both the vowel pattern and the number of syllables. Furthermore, the number of phonotactic permutations for nominal patterns could easily result in a word being phonologically miscoded in the mind of the learner (i.e., the internal vowels are misremembered).

Moreover, the complexity of Arabic nominal patterns means that a triconsonantal noun (CCC) may initially obscure critical linguistic information for textual comprehension (such as singular vs. plural). This problem can be circumvented by heavier reliance on context, underscoring the importance of linguistic awareness when reading a text.

Finally, another potential source of confusion stems from the fact that many past tense verbs are triconsonantal roots without affixes and could potentially be mistaken as nouns or vice versa [23]. For example, mlk has three lexical nominal forms but could also be read as the perfect verb malaka (مَلَكَ—“he ruled”). This is yet another challenge a reader must navigate when encountering a triconsonantal word, although context and lexical knowledge will help disambiguate the possibilities.

5. Morphology: Representing Verb Forms in Semitic Abjads

The verbal morphology of Semitic languages is complex. The Semitic verb is composed of two interwoven morphemes: (1) a triconsonantal root and (2) a built-in verbal template composed of affixes and a distinct internal vowel pattern. The non-concatenative nature of Semitic morphology results in the root being interspersed with affixes and vowels (i.e., Prefix + C₁ + V + C₂ + V + C₃ + Suffix), meaning that the internal vowel pattern bears significant morphemic value despite not being marked in most writing. This contrasts sharply with Indo-European concatenative morphology, which consists of a stem combined with prefixes and/or suffixes [24] (i.e., Prefix + Stem + Suffix).
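The interleaving of root and template can be made concrete with a short Python sketch. It is a simplified model that assumes a three-consonant root, two internal vowel slots (either of which may be empty), and romanized forms cited elsewhere in this article. The function name `apply_template` is hypothetical.

```python
def apply_template(root, vowels, prefix="", suffix=""):
    """Interleave a triconsonantal root with a two-slot vowel pattern.

    Implements Prefix + C1 + V + C2 + V + C3 + Suffix; an empty string
    in a vowel slot means that no vowel follows the consonant.
    """
    c1, c2, c3 = root      # e.g. "ktb" -> "k", "t", "b"
    v1, v2 = vowels        # e.g. ("a", "a")
    return f"{prefix}{c1}{v1}{c2}{v2}{c3}{suffix}"

# The same root yields different words depending on the vowel pattern.
active = apply_template("ktb", ("a", "a"), suffix="a")    # kataba
passive = apply_template("ktb", ("u", "i"), suffix="a")   # kutiba
first_person = apply_template("ktb", ("a", "a"), suffix="tu")
noun_a = apply_template("mlk", ("a", "i"))                # malik
noun_b = apply_template("mlk", ("u", ""))                 # mulk

print(active, passive, first_person, noun_a, noun_b)
```

Because only the consonants (and some affixes) reach the page, the active and passive forms built here from ktb share one written skeleton, which is the source of the ambiguities discussed in the following subsections.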

5.1. Vowels as Markers of Subject in Semitic Verbs

Semitic verbs are pro-drop in the past tense, with distinct suffixes marking each inflected form. Nonetheless, the absence of written vowels may lead to imprecision as there are numerous forms distinguishable only by the word-final vowel. One of the most prominent examples is the Semitic verb ktbt, consisting of the root ktv/ktb (“wrote”) and the inflectional suffix t, where the word-final vowel marks the grammatical subject of the past tense.

The Hebrew verb כתבת (ktvt) can be read in two ways: כָּתַבְתָּ (katavta—“you (m.s.) wrote”) and כָּתַבְתְּ (katavt—“you (f.s.) wrote”). Meanwhile, the Standard Arabic cognate كتبت (ktbt) has three possibilities: كَتَبْتُ, كَتَبْتَ, and كَتَبْتِ (katabtu, katabta, and katabti), with the respective meanings of “I wrote”, “you (m.s.) wrote”, and “you (f.s.) wrote”.
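The collapse of these vocalized forms into a single written word can be checked directly on the Unicode text. The following Python sketch assumes that the vowel signs are encoded as combining marks (Unicode category Mn), which holds for the standard Arabic short-vowel signs and the Hebrew points used here. The function name `strip_diacritics` is hypothetical, and the code points are spelled out only to make the construction explicit.

```python
import unicodedata

def strip_diacritics(text: str) -> str:
    """Remove combining marks (category Mn), leaving the bare letters."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")

# Arabic letters and short-vowel signs.
KAF, TA, BA = "\u0643", "\u062a", "\u0628"
FATHA, DAMMA, KASRA, SUKUN = "\u064e", "\u064f", "\u0650", "\u0652"

stem = KAF + FATHA + TA + FATHA + BA + SUKUN + TA  # katabt-
arabic_forms = {
    "katabtu": stem + DAMMA,
    "katabta": stem + FATHA,
    "katabti": stem + KASRA,
}

# Hebrew letters and points.
H_KAF, H_TAV, H_BET = "\u05db", "\u05ea", "\u05d1"
QAMATS, PATAH, SHEVA, DAGESH = "\u05b8", "\u05b7", "\u05b0", "\u05bc"

h_stem = (H_KAF + DAGESH + QAMATS + H_TAV + PATAH
          + H_BET + SHEVA + H_TAV + DAGESH)
hebrew_forms = {"katavta": h_stem + QAMATS, "katavt": h_stem + SHEVA}

arabic_plain = {strip_diacritics(w) for w in arabic_forms.values()}
hebrew_plain = {strip_diacritics(w) for w in hebrew_forms.values()}
print(len(arabic_plain), len(hebrew_plain))
```

Each set contains a single element, confirming that the three Arabic forms and the two Hebrew forms are indistinguishable once the diacritics are removed. Text that is first normalized to NFD may need more careful filtering, since some letters (such as alif with hamza) then decompose into a base letter plus a combining mark.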

These are examples of inflectional homographs [25], where the conjugation of the verb is indicated through an unmarked vowel change. Inflectional homography is a pervasive phenomenon because of the limitations of abjad writing in representing Semitic verbs.

5.2. Vowels as Markers of Mood

Sometimes in Semitic languages, verb inflections from different moods overlap graphemically. In these situations, vowels function as the only phonological distinction between these homographic verb forms. This homography tends to happen most with the indicative and imperative forms. (In Arabic, homography occurs in all moods. In fact, a single word-final vowel marker distinguishes between the indicative (marfūʕ), subjunctive (mansūb), and jussive (majzūm), i.e., اكتبُ vs. اكتبَ vs. اكتُب (aktubu vs. aktuba vs. aktub).) In Hebrew, one encounters כתבו (ktvw), which is most commonly read as the indicative כָּתְבוּ (katvu—“they wrote”) but can also represent the imperative form כִּתְבוּ (kitvu—“write!” pl.). In Arabic, the word أكتب (ʔktb) can be interpreted as either أَكْتُبُ (ʔaktubu—“I write”) or أُكْتُب (ʔuktub—“write!” m.s.).

5.3. Vowels as a Marker of Voice

In both Hebrew and Arabic, the passive form of verbs is limited primarily to formal written literature, appearing infrequently in informal contexts. However, it is precisely in written literature that L2 readers are most likely to encounter the passive voice, which, in many cases, is homographic with its active counterpart.

In Arabic, the passive voice for many verbs is formed through the modification of the internal vowel pattern, without necessitating the change of any consonants. This means that several verbs could potentially be read as active or passive, depending on the context. For instance, the Arabic triliteral verb كتب (ktb) could be read as either كَتَبَ (kataba—“he wrote”) or كُتِبَ (kutiba—“it was written”). Without written diacritics, readers must rely on context to decipher the grammatical voice of the verb.

However, because the passive form is significantly less used than its active counterpart, vowel markings are sometimes added to obviate confusion, a phenomenon known as disambiguating diacritization [26]. Despite its utility, disambiguating diacritization is relatively uncommon, with a maximal occurrence of 1.2% of Arabic texts [25].

In Hebrew, the passive and active counterparts of many verbs were originally homographs; however, confusion stemming from this overlap has been obviated through the insertion of vowel letters, with y and w added to the active and passive forms, respectively (e.g., קישט qyšṭ—“he decorated” vs. קושט qwšṭ—“it was decorated”).

5.4. Vowels as Markers of Verbal Paradigm

Semitic verbal morphology consists primarily of triconsonantal roots that can be inserted into different semantic paradigms to derive new verbs from a common root, with each of the descendant verbs exhibiting related but distinct meanings. This system of verbal derivation is somewhat akin to the use of prefixes in European languages (e.g., write vs. co-write vs. re-write), the difference being the non-concatenative morphology. With semantic paradigms, verbs that are conceptually disparate in English, such as “learn” and “teach”, are morphologically related in Hebrew and Arabic, with both verbs deriving from a common triconsonantal root (lmd in Hebrew and drs in Arabic).

Paradigmatic derivation is usually formed by a combination of consonantal affixes and vowel patterns. However, there are numerous cases where even the affixes themselves do not adequately demarcate pattern types, meaning that some conceptually related verbs are graphemically indistinguishable without the presence of vowel diacritics.

Table 8 demonstrates the heterophonic homography of the Semitic verbs “teach” and “learn.”

In the case of Arabic, the gemination marker (shadda, ّ) used in the intensive-causative paradigm is the only diacritical mark that is regularly featured in unvocalized texts [27] and is used as a way to disambiguate homographs. The presence of the gemination diacritic provides a necessary visual distinction for the reader, giving us يدرس (ydrs) and يدرّس (ydrrs). (It is worth mentioning that the indicative form of the Arabic causative paradigm اَفْعَل (afʕala) results in heterophonic homographs with the base paradigm. Case in point: the verb يشعر (yšʕr) can be interpreted as either يَشْعُر (yašʕur—“he feels”) or يُشْعِر (yušʕir—“he causes someone to feel”). In this case, disambiguating diacritization would be required to afford clarity to the reader.) In the case of the Hebrew verb למד (lmd), there is no way to distinguish between the homographic forms without resorting to partial or full diacritization.

6. Pedagogical Implications for Literacy Acquisition

The purpose of Section 2, Section 3, Section 4 and Section 5 was to elucidate the main linguistic challenges of reading abjad scripts. For educators to optimize the attainment of reading proficiency, it is paramount to understand the morphological and phonological challenges stemming from the inherent inadequacies of transposing orally marked distinctions into abjad writing (e.g., word-final vowels in the verbal system or plosive-fricative pairs in Hebrew). While the previous sections were grounded in theory, Section 6 and Section 7 provide the applicational component. Section 6 sheds light on the observed neurolinguistic tendencies of learners when learning to read in Hebrew and Arabic, while Section 7 discusses how educators can meaningfully intervene in the learning process to improve learner outcomes.

6.1. Metalinguistic Awareness and Semitic Abjads

Israeli researcher Ram Frost argues that the deep orthography inherent to Semitic abjads requires a more sophisticated understanding of grammar and morphology than languages with more shallow orthographies, as “lexical access is based mainly on the word’s orthography, and the word’s phonology is retrieved from the mental lexicon” [28]. Experts in applied linguistics refer to the consciousness surrounding morphological patterns and their formation as “metalinguistic/morphological awareness” [29]. This phenomenon is well known in the field of second language acquisition, although its importance in language mastery remains a subject of debate [30]. Centering the discourse on Semitic languages, the following subsections cite studies to demonstrate the role of metalinguistic awareness in the study of Hebrew and Arabic.

6.2. Metalinguistic Awareness in L1 Readers of Semitic Languages

Semitic non-concatenative morphology coupled with the absence of diacritics requires readers to fill in the vowels based on their knowledge of grammar, a process that has unique neurocognitive implications. Various studies have been conducted on literate L1 speakers of Semitic languages that examined the impact of “morphological awareness” on the neurological processing of written texts. Israeli neuroscientists Bechor Barouch and Yael Weiss conducted a study on Hebrew-speaking children that found that young readers exhibit an early sensitivity to the morphologically rich structures of their native language and tend to be more reliant on morpho-semantic cues than adult readers [31]. Similarly, in 2010, neurolinguist Sami Boudelaa and team conducted a study on native Arabic speakers, suggesting that “there were identifiable neural circuits dedicated to the processing of morphology” and that morphological processing exists as a “distinct neurocognitive component” [32].

6.3. Metalinguistic Awareness in L2 Readers

Most neurolinguistic and pedagogical studies for reading Semitic languages (especially for Hebrew) studied L1 learners; however, many of the findings are applicable to second language learners as well. Similarly to native speakers, L2 learners navigating the linguistic ambiguities of unvowelled abjad writing must compensate for the lack of written diacritics by improving their understanding of Semitic grammar and morphology.

Research on morphological awareness in Arabic L2 learners corroborates the results of the previously cited literature. For Hebrew, there is little research on metalinguistic consciousness in L2 learners, prompting increased reliance on the studies on non-native Arabic learners.

Linguist and language acquisition researcher Mahmood Azaz demonstrated that advanced L2 readers exhibit higher levels of metalinguistic awareness than elementary or intermediate readers. Based on the results of his study on Arabic language students at the University of Arizona, he concluded that readers at all stages exhibit a demonstrable level of metalinguistic awareness but that the precise degree is dependent on the linguistic feature in question [33]. According to the results of the research, beginning and intermediate students showed high levels of awareness of so-called “high-salience features” (i.e., basic grammatical points) but demonstrated significantly lower levels of awareness of “low-salience features” (i.e., advanced grammatical concepts) [33]. In other words, certain grammatical structures are consciously processed by learners, while more sophisticated features elude their observation. While Azaz’s study proved a definitive neurolinguistic link between reading and metalinguistic awareness, it did not demonstrably point to the origin of grammatical cognition: is this a naturally occurring phenomenon in L2 students, or is it acquired through targeted instruction?

To answer this question, Lama Nassif et al. conducted an experiment in 2022 on intermediate-level Arabic language students at Michigan State University. The students were given texts containing both regular (sound) and irregular (geminate) verbs in Arabic. Eye-movement measurements showed that L2 learners spent a similar amount of time identifying and visually processing both verb types [34]. Furthermore, qualitative data collected from participant exit interviews showed that while students did report applying metalinguistic awareness to understanding some grammatical structures, not one noticed the presence of irregular verbs. These findings have important implications. First, Nassif’s study reinforces Azaz’s findings at the University of Arizona, which had concluded that the application of metalinguistic awareness is dependent on the sophistication of the grammatical feature in question. Second, Nassif’s research found that L2 learners focus primarily on understanding the meaning of a word or passage, rather than paying attention to the grammatical structures within it [34].

Nassif’s work at Michigan State University confirmed previous research that showed L2 readers consistently prioritize understanding meaning over grammatical form when reading unfamiliar texts [35]. Essentially, morphology is secondary to semantics, unless it provides a direct inroad to comprehension. However, the proclivity of learners to concentrate on the search for meaning at the expense of other elements may have adverse effects on long-term learning outcomes, something that educators should take into consideration when structuring their language courses.

6.4. Implications of Learner Metalinguistic Awareness for Educators

The above studies pointed to the limitations of metalinguistic awareness in L2 learners when reading text. These findings have important implications for educators looking to improve students’ reading proficiency. Because it is not intuitive for learners to focus on grammatical features when reading, instructors can make up for this deficit by designing exercises that prompt students to visualize morphological structures. This is precisely what Nassif et al. recommend in their conclusion:

“Given learners’ predisposition to read and process L2 input for meaning, it is important to design tasks that reinforce form-function mappings and that create the need to fulfill a communicative purpose in such a way that it will draw learners’ attention to the forms that are required to perform these communicative functions” [34].

Ultimately, although learners naturally exhibit a rudimentary level of morphological awareness, they are not cognizant of more sophisticated structures and patterns. This means that it should not be expected that morphological awareness will simply arise naturally with the study of the language. Quite the contrary: learners must be trained to be aware of the structures they are using. Instructors can provide this additional boost by supplementing the communicative approach with morphology-centered activities and exercises that will promote metalinguistic awareness. For example, instructors may wish to draw learners’ attention to prominent morphological features that mark verbs or nouns, as well as emphasizing the reliance on grammatical context to disambiguate heterophonic homographs.

7. Review of Current Pedagogical Strategies

It has now been established that reading abjad scripts is a formidable task due to the complexities of Semitic phonology and morphology but that these challenges can be surmounted with the development of greater metalinguistic awareness. This raises an important question: should instructors avail themselves of diacritics to address the simultaneous needs of morphological awareness and reading accuracy? The use of diacritics constitutes a heated point of contention among Hebrew and Arabic language educators. Proponents cite the utility of diacritics in providing clarity to readers, while opponents resist incorporating them on the basis that vowel markings constitute an impediment to long-term outcomes in literacy development [36]. The following subsections examine different pedagogical methodologies used in popular Hebrew and Arabic textbooks, evaluating the overall effectiveness of each approach in improving L2 reading competency.

7.1. Minimal Diacritic Method

Proponents of this method seek to simulate and immerse learners in the real-life usage of the language. Because most native texts do not feature vowels, some educators contend that L2 students should grapple with this challenge from the very onset to provide a more authentic learning experience [36]. Diacritics may be provided in vocabulary lists and conjugation tables but usually disappear after the initial exposure, with learners expected to memorize the pronunciation of new words and rapidly apply it when reading written texts. In fact, this is the approach that is employed in the most widely used Arabic language textbook in American universities: Al-Kitaab fii Taʕallum al-ʕArabiyya. The textbook minimizes the use of short vowels, with the stated objective to provide users with an authentic experience [37]. Likewise, in The Routledge Introductory Course in Modern Hebrew, the author describes his use of the orthographical convention of “contemporary plene spelling ketiv male with occasional use of partial or full vocalization annotations, especially in new vocabulary items” [38]. (Other Hebrew textbooks that use this method are עברית מן ההתחלה (Ivrit min ha-Hatchala—Hebrew from Scratch), as well as most books used in Israeli ulpanim (intensive language courses designed to rapidly assimilate new immigrants).)

7.1.1. Advantages of the Method

As mentioned previously in the discussion of the Semitic verbal system, vowels play a useful role in disambiguating different morphological variants. In spite of this, the Michigan State University study demonstrated that L2 learners tend to fixate on meaning over form [34]. On the basis of this finding, the minimal diacritic method enjoys a distinct advantage in that the absence of diacritics directs readers to focus on the overall meaning conveyed by the consonantal root. A Hebrew University of Jerusalem study on Hebrew diacritics supported this theory, postulating that the pronunciation of unvowelled words was activated by orthographic recognition, with readers deciding the appropriate interpretation based on a word’s relative frequency [39]. In other words, the consonants alone should suffice for lexical recall, if readers are familiar with the word. Vowels are useful for grammatical nuance but are not critical to conveying the overarching meaning of a word, corroborating the claims of educators who espouse the minimal diacritic approach.

Furthermore, in the context of learning Semitic verbs (the difficulties of which were enumerated in Section 5), learners might come to appreciate the efficiency of the abjad script. The vowelless abjad script transforms the verbal root into “an orthographically continuous unit” [3], even though it is phonologically disconnected when pronounced orally. With proper grammatical training, readers can infer the vowel pattern and supply the pronunciation of a given verb if they successfully utilize the available visual and phonetic cues. The assumption of grammatical proficiency underlies the structure of abjad writing, which requires the reader to engage simultaneously with the word’s spelling as well as its morphology, a phenomenon referred to as morpho-orthographic decomposition [5].

7.1.2. Disadvantages of the Method

The minimal-diacritic method presupposes the ability of L2 learners to recall the pronunciation of new vocabulary words, as well as the recognition of consonants to trigger lexical recall. However, this raises questions regarding the speed and accuracy with which learners accrue new words. While there is no conclusive evidence for how many instances of exposure are required to recall a word, it has been suggested that the retention of incidental vocabulary requires at least ten exposures [40]. In practice, this number is variable and dependent on the individual, language, and type of vocabulary; however, it is incontrovertible that repeated exposure to new vocabulary is a crucial factor in its retention [41].

The difficulty of reading unvowelled texts is further compounded at the elementary level of Hebrew and Arabic language acquisition due to learners’ inadequate knowledge of the verbal system and word formation. As demonstrated in the previous section, literacy in abjad scripts is inextricably tied to morphological awareness. Consequently, novice readers may be impeded by the limitations of their knowledge of grammar.

Finally, it has been established that novice learners process the written word through the phonological recoding of the graphemic units together with the diacritics [42]. That is to say, vowels are as neurologically important as the consonants for learners. Therefore, the absence of diacritics at an early stage may lead to errors in the phonological recoding process. For example, without vowel markings, a learner might mistakenly recode the Arabic noun درس (drs) as “dirs” or “durs” instead of dars, potentially resulting in a fossilized error.

7.2. Diacritics-Heavy Method

This approach views diacritics as an essential component of the writing system, as they mitigate ambiguities and increase fluidity in reading from the onset of the learning process. With this method, most words are written with diacritics (shallow orthography) to ensure accuracy. Diacritics are dropped only much later in the learning process, if at all, in favor of authentic abjad writing (deep orthography).

In general, it seems that textbooks that prioritize grammatical accuracy and morphological mastery use this approach as vowel markings ensure that students will avoid mistakes in nominal and verbal patterns, given how the probability for error is high in these areas (as illustrated in Section 4 and Section 5). Examples of fully diacritized textbooks include Ha-Yesod: Fundamentals of Hebrew and Cambridge’s A Student Grammar of Modern Standard Arabic.

7.2.1. Advantages of the Method

One of the advantages of this methodology is that it evinces similarities to the pedagogical approach to early language education for L1 learners, where it has been well established that vowel markings are important for developing literacy. Hebrew reading studies conducted on Israeli grade school children (all native speakers of Hebrew) found that the shallow orthography facilitated the speed and accuracy of reading and that the children were able to identify vowelled words more rapidly than unvowelled words [43]. Shallow orthography has other benefits. In 1999, David L. Share contended that phonological decoding in fully diacritized Hebrew is more accurate than in languages like English that feature significant orthographical inconsistency [44]. Additionally, Arabic language curricula in the Middle East are designed in a way that is consistent with these research findings. Primers and textbooks for Arabic-speaking first-grade students in Egypt default to a system of simplified diacritization, which features the majority of diacritical markers to ensure accurate pronunciation (excluding only the ones with limited phonemic value), with a diacritization rate of around 60–70% [25].

7.2.2. Disadvantages of the Method

Nonetheless, although this model has been proven and tested with native speakers, it gives rise to a unique set of challenges. The first disadvantage cited by research regards reading speed (based on studies of L1 speakers, the results of which can plausibly be extrapolated to L2 learners). The presence of written vowels seems to increase neurological processing times, thereby reducing reading speeds, due to the deluge of information encoded in the diacritics [45], some of which is irrelevant to a word’s meaning. Studies on Hebrew L1 learners have shown that even if a reader does not need vowel markings to read a particular word, it is not cognitively possible to disregard them [46]. Similarly, studies on Arabic native speakers seem to corroborate the findings. In a study of eye movement fixation in 1987, Roman and Pavard asserted that the reading speed of Arabic was impaired upon introducing diacritics [47], and Asadi’s 2017 study on Arabic-speaking children blamed the visual bulkiness of diacritics for students’ low performance on literacy tests when reading with vowels, vis-à-vis students of other shallow orthography languages [48].

A second drawback is that diacritics change the shape of a word, rendering lexical identification more difficult, as the vowelized form may not automatically map onto the forms stored in the mental lexicon of the reader [49]. For example, a novice reader familiar with the word مُسْتَقْبَل (mustaqbal, “future”) might not immediately recognize its vowelless form مستقبل (mstqbl).
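The relationship between the two written forms can be made concrete with a short Python sketch. It is an illustration only, not a tool described in the cited studies: the function name strip_diacritics is hypothetical, and the sketch assumes that vowel marks are stored as separate Unicode combining characters (category Mn), which is the usual encoding for Arabic harakat and Hebrew niqqud.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop every nonspacing combining mark (Unicode category Mn).

    Assumption: vowel marks are encoded as combining characters.
    The text is deliberately not decomposed first, so precomposed
    letters such as the Arabic alef with hamza keep their shape.
    """
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


vowelled = "مُسْتَقْبَل"   # mustaqbal, "future"
unvowelled = strip_diacritics(vowelled)

# The consonantal skeleton is what appears in authentic texts.
print(unvowelled)
# The two strings differ, which mirrors the reader's recognition problem.
print(vowelled == unvowelled)
```

Because the same function also removes Hebrew dagesh and shin/sin dots, an instructor who wishes to retain those marks would need a narrower filter than the one assumed here.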

The third disadvantage pertains to reliance on vowel markings as full diacritization may detract from the development of morphological awareness in second language learners, potentially having an adverse impact on long-term literacy outcomes. It is possible that learners may stagnate at this stage of reading development, where they can convert visual input into oral output (i.e., mastering shallow orthography) but never independently transition to the morphological awareness required for comprehending deep orthography.

7.3. An Integrative Pedagogical Method

Upon careful consideration of the relative merits and flaws of the aforementioned pedagogical approaches, educators might consider synthesizing an approach that integrates elements of both methodologies. Research has indeed shown that diacritics are useful for providing clarity to novice readers. Nonetheless, in order for diacritization to be continually effective, vowel markings must be incrementally pared down as learners progress in order to reduce dependence and to spur the refinement of morphological awareness.

Furthermore, educators need to determine the mode of diacritization that best befits the pedagogical context (full, simplified, lexical, etc. [25]), as well as a systematic way to reduce diacritic density in a way that is inversely correlated with learner progression. In other words, as learners develop greater proficiency in the target Semitic language, vowel markings should become less common. For instance, diacritics should be removed for high-frequency words, as well as morphologically predictable cases (e.g., basic verbal and nominal patterns). This would closely replicate the process of literacy acquisition for native speakers. At the beginning, vowel diacritics help in comprehending written texts, but in time, learners become neuro-linguistically attuned to morphological and orthographical patterns, enabling them to gradually dispense with diacritics altogether.

Israeli researchers David Share and Amalia Bar-On contend that Hebrew literacy acquisition for L1 learners comprises three distinct phases, referring to it as a “triplex model of reading development.” In the first stage, readers rely on individual graphemes and diacritics to string together sounds. They then graduate to the second stage, characterized by increased reliance on localized morphological awareness. Finally, they progress to the third stage, where they rely on context in order to navigate the challenges of homography [3]. In fact, it is not until the second grade that children accrue adequate linguistic knowledge to transition away from fully vowelled texts.

“Grade 2 is a watershed in Hebrew reading development, because the young reader relies less on low-level sublexical phonology and more on higher-order lexical and morpho-orthographic knowledge. This is also the same point in time when children begin to show evidence of the acquisition of word- and morpheme-specific orthographic representations that signal the growth of the orthographic lexicon” [3].

In theory, this triplex model could be effectively applied to L2 students, many of whom experience similar gradations in the learning curve as native speaker children (albeit with different timelines). For learners, the presence of vowels at the beginning is useful until they reach a stage where they will have amassed an adequate lexicon in conjunction with an expansive knowledge of the grammar.

7.3.1. Hebrew and Arabic Language Resources Featuring the Integrated Approach

While not nearly approaching universal uptake, a quasi-integrated approach is featured in several learner materials. The Arabic textbook Arabiyyat al-Naas explicitly informs readers of its intention to wean them off vowels: “As your Arabic skills progress and you learn to recognize familiar words, you will gradually see fewer and fewer diacritical marks printed in this book” [50]. In the Hebrew studies world, Brandeis Modern Hebrew declares that it intentionally uses ketiv male in most text but includes diacritics for headlines, select texts, and verb conjugations [51].

7.3.2. Sample Timeline and Incremental Reduction in Diacritics

Although the integrative pedagogical approach is utilized in some language materials, there exists a need for (1) greater systemization across the board and (2) instructor cognition regarding the process.

The following sample timeline maps the L2 learning process, serving as a reference to instructors on how the integrated approach can be used efficaciously. By demonstrating how diacritization can be wielded to optimize learning outcomes, this timeline is a unique contribution to the field of Hebrew and Arabic language pedagogy and a model for greater uptake of gradual de-diacritization. However, because diacritization is inherently a subjective process, instructors and textbook authors must independently assess the level of language proficiency that must be achieved before diacritics can be meaningfully pared down. Table 9 gives an example of how instructors of Hebrew and Arabic can use diacritization in their texts.
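For instructors who prepare their own texts, the percentages attached to each stage can be estimated mechanically. The Python sketch below is one possible way to do so and rests on an assumption that is not taken from the cited literature: the rate is defined here as the number of diacritical marks present in a partially vowelled text divided by the number of marks in a fully vowelled version of the same text. The function names count_marks and diacritization_rate are hypothetical.

```python
import unicodedata


def count_marks(text: str) -> int:
    """Count nonspacing combining marks (Unicode category Mn).

    Assumption: this category covers Arabic harakat and Hebrew niqqud;
    it also counts dagesh and shin/sin dots.
    """
    return sum(1 for ch in text if unicodedata.category(ch) == "Mn")


def diacritization_rate(partial: str, full: str) -> float:
    """Share of the fully vowelled text's marks that survive in `partial`.

    This ratio is an assumed working definition, not an established metric.
    """
    total = count_marks(full)
    if total == 0:
        raise ValueError("the reference text contains no diacritics")
    return count_marks(partial) / total


# A phrase from the Arabic sample sentence, fully and partially vowelled.
full = "في مَرْكَز المَدينة"
partial = "في مَرْكَز المدينة"
print(f"{diacritization_rate(partial, full):.0%}")
```

A measure of this kind only describes density; it says nothing about whether the remaining marks fall on the words where learners most need them, which remains a matter of instructor judgment.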

7.3.3. Challenges of the Integrated Approach

While there are clear benefits to adopting this approach, it too comes with specific challenges: namely, it requires greater instructor-mediated intervention to optimize literacy training and consciousness surrounding the use of diacritics in the reading corpus [52]. To properly implement an integrated approach, instructors will likely need to customize materials to suit the needs of students, as well as the progression in the course content (e.g., removing vowels for familiar verbs or retaining vowels for difficult words). However, extensive customization of course materials could inadvertently result in fewer authentic texts being used in the classroom. Alternatively, instructors who do not wish to forgo newspapers, book excerpts, etc., might choose to diacritize authentic texts as needed to render them level-appropriate.

In order to effectively reduce reliance on diacritics, educators must take three factors into account: (1) learner familiarity with vocabulary, (2) recurring morphological patterns, and (3) situations where diacritics are contextually useful to mitigate confusion and unnecessary errors. With the proper implementation and necessary adjustments, the integrated approach of diacritization may yield improved outcomes in literacy proficiency together with a concomitant increase in morphological awareness. In short, instructors must be judicious about the use of diacritics and proactive about removing them to balance the dual goal of increasing morphological awareness and improving reading accuracy.
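The first and third of these factors lend themselves to a simple illustration. The Python sketch below strips the diacritics from words a class already knows while leaving them on unfamiliar words and on words whose consonantal spelling is ambiguous. It is a minimal sketch under stated assumptions, not an established tool: the function names, the familiar and ambiguous word lists, and the punctuation set are all hypothetical, and the second factor (recurring morphological patterns) is not modeled.

```python
import unicodedata

# Assumed set of punctuation to ignore when looking a word up.
PUNCTUATION = ".,;:!?\u060c\u061b\u061f"


def strip_marks(word: str) -> str:
    """Remove nonspacing combining marks (harakat, niqqud, dagesh, dots)."""
    return "".join(ch for ch in word if unicodedata.category(ch) != "Mn")


def reduce_diacritics(sentence: str, familiar: set, ambiguous: set) -> str:
    """Unvowel familiar words unless their skeleton is ambiguous."""
    result = []
    for token in sentence.split():
        bare = strip_marks(token)
        key = bare.strip(PUNCTUATION)
        if key in familiar and key not in ambiguous:
            result.append(bare)      # factor 1: learners know this word
        else:
            result.append(token)     # unfamiliar, or factor 3: keep vowels
    return " ".join(result)


# Hypothetical word lists for a class about three months into a course.
familiar = {"עם", "שלי", "שם"}
ambiguous = {"שם"}   # can be read as šam ("there") or šem ("name")

sentence = "עִם הָחֲבֵרִים שֶׁלִי שָׁם."
print(reduce_diacritics(sentence, familiar, ambiguous))
```

Unlike a hand-prepared text, this sketch removes every mark from a familiar word, including the dot that distinguishes s from š, so its output would still need review by the instructor.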

8. Conclusions

The pervasive homography and morphological ambiguity inherent in Semitic abjads render the process of literacy acquisition a formidable challenge for both native and non-native speakers. However, because L2 students do not have the same linguistic exposure and grammatical knowledge as native speakers, they face serious challenges in the realms of phonology and morphology. Without proper pedagogical guidance, these linguistic barriers may forestall accuracy in pronunciation for an unacclimatized reader and potentially lead to fossilized errors. Because Semitic abjads constitute deep orthography, readers must achieve a high-level understanding of grammar in order to read texts accurately. In fact, applied linguistic research has found that literate native speakers of Hebrew and Arabic exhibit demonstrable levels of metalinguistic awareness due to their languages’ complex morphology, something second language learners have also been found to possess, although at lower levels. Essentially, in the context of Semitic abjads, morphology, semantics, and reading accuracy are intertwined, requiring instructors to adapt their teaching to ensure equal progression in these three areas.

In light of these linguistic challenges, one of the ongoing debates in Semitic pedagogical discourse relates to the utilization of diacritics in texts for second language students and to what extent vowels should feature in the classroom corpus. While the current debate surrounding the use of diacritics remains inconclusive, educators might consider replicating the approach used for L1 learners, which features fully vowelled texts for novice learners and gradually moves toward vowelless texts as students gain proficiency in their mother tongue. Research has shown this method to be highly effective, and it has the potential to revolutionize the pedagogy of Hebrew and Arabic as a second language if implemented properly in the L2 context.

Funding

This research received no external funding.

Conflicts of Interest

The author declares no conflict of interest.

References

1. Besner, D.; Smith, M.C. Basic Processes in Reading: Is the Orthographic Depth Hypothesis Sinking? In Orthography, Phonology, Morphology and Meaning; Frost, R., Katz, L., Eds.; Elsevier: Amsterdam, The Netherlands, 1992.
2. Cuetos, F. Writing Processes in a Shallow Orthography. Read. Writ. 1993, 5, 17–28.
3. Share, D.L.; Bar-On, A. Learning to Read a Semitic Abjad: The Triplex Model of Hebrew Reading Development. J. Learn. Disabil. 2018, 51, 444–453.
4. Abu-Rabia, S. Reading Arabic Texts: Effects of Text Type, Reader Type and Vowelization. Read. Writ. 1998, 10, 105–119.
5. Shimron, J.; Sivan, T. Reading Proficiency and Orthography Evidence from Hebrew and English. Lang. Learn. 1994, 44, 5–27.
6. De Voogt, A.; Quack, J.F. The Idea of Writing: Writing across Borders; Brill: Leiden, The Netherlands, 2011.
7. Hoffman, J. In the Beginning: A Short History of the Hebrew Language; NYU Press: New York, NY, USA, 2006; ISBN 978-0-8147-3690-6.
8. Khan, G. The Tiberian Pronunciation Tradition of Biblical Hebrew; Open Book Publishers: Cambridge, UK, 2020; Volume 1.
9. Blau, J. Phonology and Morphology of Biblical Hebrew: An Introduction; Penn State Press: Philadelphia, PA, USA, 2010.
10. Ravid, D.; Shlesinger, Y. Vowel Reduction in Modern Hebrew: Traces of the Past and Current Variation. Folia Linguist. 2001, 35, 371–398.
11. Weinberg, W.; College, H.U. The History of Hebrew Plene Spelling; Hebrew Union College Press: New York, NY, USA, 1985.
12. Faust, M. The Handbook of the Neuropsychology of Language; John Wiley & Sons: Hoboken, NJ, USA, 2015.
13. Coffin, E.A.; Bolozky, S. A Reference Grammar of Modern Hebrew; Cambridge University Press: Cambridge, UK, 2005.
14. Ferguson, C.A. Diglossia. WORD 1959, 15, 325–340.
15. Mustafawi, E. Arabic Phonology. In The Routledge Handbook of Arabic Linguistics; Benmamoun, E., Bassiouney, R., Eds.; Routledge: London, UK, 2017.
16. Salameh, M.Y.B.; Abu-Melhim, A.-R. The Phonetic Nature of Vowels in Modern Standard Arabic. Adv. Lang. Lit. Stud. 2014, 5, 60–67.
17. Watson, J.C.E. The Phonology and Morphology of Arabic; OUP Oxford: Oxford, UK, 2007.
18. Brustad, K.; Zuniga, E. Levantine Arabic. In The Semitic Languages; Routledge: New York, NY, USA, 2019; pp. 403–432.
19. Posegay, N. Points of Contact: The Shared Intellectual History of Vocalisation in Syriac, Arabic, and Hebrew; Open Book Publishers: Cambridge, UK, 2021.
20. Huehnergard, J. Biblical Hebrew Nominal Patterns. In Epigraphy, Philology, and the Hebrew Bible: Methodological Perspectives on Philological and Comparative Study of the Hebrew Bible in Honor of Jo Ann Hackett; Hutton, J.M., Rubin, A.D., Eds.; SBL Press: Atlanta, GA, USA, 2015.
21. Shatil, N. Noun Patterns and Their Vitality in Modern Hebrew. Hebr. Stud. 2014, 55, 171–203.
22. Bateson, M.C. Arabic Language Handbook; Georgetown University Press: Washington, DC, USA, 1967.
23. Al-Taani, A.; Al-Rub, S.A. A Rule-Based Approach for Tagging Non-Vocalized Arabic Words. Int. Arab J. Inf. Technol. 2009, 6, 320–328.
24. Benmamoun, E. Perspectives on Arabic Linguistics: Papers from the Annual Symposium on Arabic Linguistics. Volume XII: Urbana-Champaign, Illinois, 1998; John Benjamins Publishing Company: Amsterdam, The Netherlands, 1999.
25. Hallberg, A. Variation in the Use of Diacritics in Modern Typeset Standard Arabic: A Theoretical and Descriptive Framework. Arabica 2022, 69, 279–317.
26. Hermena, E.W.; Drieghe, D.; Hellmuth, S.; Liversedge, S.P. Processing of Arabic Diacritical Marks: Phonological–Syntactic Disambiguation of Homographic Verbs and Visual Crowding Effects. J. Exp. Psychol. Hum. Percept. Perform. 2015, 41, 494–507.
27. Zitouni, I.; Sorensen, J.S.; Sarikaya, R. Maximum Entropy Based Restoration of Arabic Diacritics. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, Australia, 17–21 July 2006; Calzolari, N., Cardie, C., Isabelle, P., Eds.; Association for Computational Linguistics: Sydney, Australia, 2006; pp. 577–584.
28. Frost, R. Prelexical and Postlexical Strategies in Reading: Evidence from a Deep and a Shallow Orthography. J. Exp. Psychol. Learn. Mem. Cogn. 1994, 20, 116–129.
29. DeKeyser, R.M. Cognitive-Psychological Processes in Second Language Learning. In The Handbook of Language Teaching; John Wiley & Sons, Ltd.: Hoboken, NJ, USA, 2009; pp. 119–138.
30. Roehr, K.; Ganem-Gutierrez, G.A. The Metalinguistic Dimension in Instructed Second Language Learning; Bloomsbury Publishing Plc: London, UK, 2013.
31. Barouch, B.; Weiss, Y.; Katzir, T.; Bitan, T. Neural Processing of Morphology during Reading in Children. Neuroscience 2022, 485, 37–52.
32. Boudelaa, S.; Pulvermüller, F.; Hauk, O.; Shtyrov, Y.; Marslen-Wilson, W. Arabic Morphology in the Neural Language System. J. Cogn. Neurosci. 2009, 22, 998–1010.
33. Azaz, M. Metalinguistic Knowledge of Salient vs. Unsalient Features: Evidence From the Arabic Construct State. Foreign Lang. Ann. 2017, 50, 214–236.
34. Nassif, L.; Huntley, E.; Mohamed, A. Attention to Verbal Morphology in L2 Arabic Reading: An Eye-Movement Study. Foreign Lang. Ann. 2022, 55, 769–792.
35. VanPatten, B. Processing Instruction: An Update. Lang. Learn. 2002, 52, 755–803.
36. Midhwah, A.A.; Alhawary, M.T. Arabic Diacritics and Their Role in Facilitating Reading Speed, Accuracy, and Comprehension by English L2 Learners of Arabic. Mod. Lang. J. 2020, 104, 418–438.
37. Attieh, A. Review of Al-Kitaab fii Ta’allum al-’Arabiyya with DVDs: A Textbook for Beginning Arabic, Part 1. Mod. Lang. J. 2006, 90, 431–433.
38. Etzion, G. The Routledge Introductory Course in Modern Hebrew: Hebrew in Israel; Routledge: London, UK, 2017.
39. Frost, R.; Bentin, S. Processing Phonological and Semantic Ambiguity: Evidence From Semantic Priming at Different SOAs. J. Exp. Psychol. Learn. Mem. Cogn. 1992, 18, 58–68.
40. Saragi, T.; Nation, I.S.P.; Meister, G.F. Vocabulary Learning and Reading. System 1978, 6, 72–78.
41. Hiebert, E.H.; Kamil, M.L. Teaching and Learning Vocabulary: Bringing Research to Practice; Routledge: London, UK, 2005.
42. Bar-On, A.; Shalhoub-Awwad, Y.; Tuma-Basila, R.I. Contribution of Phonological and Morphological Information in Reading Arabic: A Developmental Perspective. Appl. Psycholinguist. 2018, 39, 1253–1277.
43. Share, D.; Levin, I. Learning to Read and Writing in Hebrew. In Learning to Read and Write: A Cross-Linguistic Perspective; Harris, M., Hatano, G., Eds.; Cambridge University Press: Cambridge, UK, 1999.
44. Share, D.L. Phonological Recoding and Orthographic Learning: A Direct Test of the Self-Teaching Hypothesis. J. Exp. Child Psychol. 1999, 72, 95–129.
45. Ravid, D. Hebrew Orthography and Literacy. In Handbook of Orthography and Literacy; Joshi, R.M., Aaron, P.G., Eds.; Routledge: London, UK, 2013.
46. Frost, R.; Bentin, S. Reading Consonants and Guessing Vowels; Elsevier: Amsterdam, The Netherlands, 1992.
47. Roman, G.; Pavard, B. A Comparative Study: How We Read in Arabic and French. In Eye Movements from Physiology to Cognition; O’Regan, J.K., Levy-Schoen, A., Eds.; Elsevier: Amsterdam, The Netherlands, 1987; pp. 431–440.
48. Asadi, I.A. Reading Arabic with the Diacritics for Short Vowels: Vowelised but Not Necessarily Easy to Read. Writ. Syst. Res. 2017, 9, 137–147.
49. Hallberg, A. Principles of Variation in the Use of Diacritics (Taškīl) in Arabic Books. Lang. Sci. 2022, 93, 101482.
50. Younes, M.; Weatherspoon, M.G.; Foster, M.S. Arabiyyat Al-Naas (Part One): An Introductory Course in Arabic; Routledge: London, UK, 2017; ISBN 978-1-135-01084-3.
51. Ringvald, V.; Porath, B.; Peleg, Y.; Shorr, E.; Hascal, S. Brandeis Modern Hebrew; Brandeis University Press: Chicago, IL, USA, 2015.
52. Moser, J. Evaluating Arabic Textbooks: A Corpus-Based Lexical Frequency Study. Int. J. Appl. Linguist. 2021, 31, 248–263.

Table 1. Disambiguation of Hebrew heterophonic homographs with full spelling.

Vowelled Form	Transliteration	Full Spelling	Transcription	Translation
סֵפֶר	sefer	ספר	sfr or spr	“book”
סָפַר	safar	ספר	sfr or spr	“he counted”
סְפַר	sfar	ספר	sfr or spr	“edge, margin”
סַפָּר	sapar	ספר	sfr or spr	“barber”
סַפֵּר	saper	ספר	sfr or spr	“recount! (m.s.)”
סִפֵּר	siper	סיפר	syfr or sypr	“he recounted”
סְפֹר	sfor	ספור	sfwr or spwr	“count! (m.s.)”
סֻפַּר	supar	סופר	swfr or swpr	“was recounted”

Table 2. Hebrew heterophonic homographs that contain vowel letters.

Unvocalized Word	Reading #1	Reading #2
טיל ṭyl	טִיל ṭil “missile”	טִיֵּל ṭiyel “he strolled”
רוחי rwḥy	רוּחִי ruḥi “my spirit”	רִוְחִי rivḥi “profitable”

Table 3. Use of double vowel letters to disambiguate heterophonic homographs.

Orthography with Vowel Letter	Orthography with Double Vowel Letter
טיל ṭyl “missile”	טייל ṭyyl “he strolled”
רוחי rwḥy “my spirit”	רווחי rwwḥy “profitable”

Table 4. Arabic heterophonic homographs that contain vowel letters.

Homograph	Reading #1			Reading #2
طول ṭwl	طول	ṭūl	“length”	طَوْل ṭawl “power”
سيف syf	سِيف	sīf	“riverbank”	سَيْف sayf “sword”

Table 5. Non-affixed nominal patterns in Hebrew.

Nominal Pattern	Hebrew Example	Transliteration	Translation
CaCaC	פַּסָּל	pasál	“sculptor”
CeCeC	פֶּסֶל	pésel	“statue”
CoCeC	כֹּבֶד	kóved	“heaviness”
CaCeC	כָּבֵד	kavéd	“liver”
CCaC	כְּפָר	kfar	“village”

Table 6. Non-affixed nominal patterns in Arabic.

Nominal Pattern	Arabic Example	Transliteration	Translation
CaCC	دَرْس	dars	“lesson”
CuCC	قُطْب	quṭb	“pole”
CiCC	رِجْل	rijl	“leg”
CaCuC	رَجُل	rajul	“man”
CaCiC	مَلِك	malik	“king”
CaCaC	جَمَل	jamal	“camel”
CuCaC	جُمَل	jumal	“sentences”
CuCuC	كُتُب	kutub	“books”
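
The pattern labels in Tables 5 and 6 can be derived from the transliterations themselves. The sketch below is illustrative only: it assumes every consonant is written with a single character (no digraphs such as "sh"), and it drops stress accents and macrons, so vowel length is not represented. The `cv_template` name is hypothetical.

```python
import unicodedata

VOWELS = set("aeiou")


def cv_template(translit):
    """Map a transliteration to its consonant-vowel template, e.g. 'dars' -> 'CaCC'."""
    decomposed = unicodedata.normalize("NFD", translit.lower())
    # Drop combining marks so that accented vowels and dotted consonants count once.
    base = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return "".join(ch if ch in VOWELS else "C" for ch in base)


for word in ["dars", "rijl", "rajul", "kutub", "pésel", "kfar"]:
    print(word, cv_template(word))
```

A check of this kind can help an instructor confirm that the examples chosen for a lesson really instantiate the pattern being taught.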

Table 7. Arabic homographs with multiple nominal readings.

Homograph	Reading #1		Reading #2		Reading #3
ملك mlk	مَلِك malik	“king”	مُلْك mulk	“dominion”	مِلْك milk	“property”
عقد ʕqd	عُقَد ʕuqad	“complexes”	عَقْد ʕaqd	“contract”	عِقْد ʕiqd	“necklace”
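
Homograph sets like those in Table 7 can be found by grouping vocalized words under their unvocalized spelling. The following is a hedged sketch rather than a description of any existing tool: the `skeleton` and `build_homograph_index` names are hypothetical, and the small word list is copied from Table 7.

```python
from collections import defaultdict
import unicodedata

# Arabic letters and short-vowel marks, written as escapes for clarity.
MEEM, LAM, KAF = "\u0645", "\u0644", "\u0643"
AIN, QAF, DAL = "\u0639", "\u0642", "\u062f"
FATHA, DAMMA, KASRA, SUKUN = "\u064e", "\u064f", "\u0650", "\u0652"

# (vocalized form, gloss) pairs from Table 7.
READINGS = [
    (MEEM + FATHA + LAM + KASRA + KAF, "king"),
    (MEEM + DAMMA + LAM + SUKUN + KAF, "dominion"),
    (MEEM + KASRA + LAM + SUKUN + KAF, "property"),
    (AIN + DAMMA + QAF + FATHA + DAL, "complexes"),
    (AIN + FATHA + QAF + SUKUN + DAL, "contract"),
    (AIN + KASRA + QAF + SUKUN + DAL, "necklace"),
]


def skeleton(word):
    """Return the word without combining marks (its unvocalized spelling)."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def build_homograph_index(entries):
    """Group vocalized readings by their shared unvocalized spelling."""
    index = defaultdict(list)
    for vocalized, gloss in entries:
        index[skeleton(vocalized)].append((vocalized, gloss))
    return dict(index)


index = build_homograph_index(READINGS)
for bare, readings in index.items():
    print(bare, [gloss for _, gloss in readings])
```

Any key with more than one reading is a candidate for selective diacritization in learner materials.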

Table 8. Semitic homographs across verbal paradigms.

Semitic Verb	Base Paradigm		Intensive-Causative
ילמד ylmd	יִלְמַד yilmad	“He will learn”	יְלַמֵּד yelamed	“He will teach”
يدرس ydrs	يَدْرُسُ yadrusu	“He learns”	يُدَرِّسُ yudarrisu	“He teaches”

Table 9. Incremental reduction in diacritics in Hebrew and Arabic texts for L2 learners.

Level and Rate of Diacritization	Hebrew Sentence	Arabic Sentence
After 1 month: ~90–100% diacritization. Sentence: “I want to go to the restaurant in the city center and to talk to my friends there”. At this stage, the use of (nearly) full diacritization is reassuring for students who have only recently learned a new script and are in the early stages of forming neurological connections. Note: In Arabic, even at this stage, it is likely unnecessary to vowelize the definite article ال. Additionally, Arabic instructors might prefer to omit case inflections in the elementary stage to avoid inundating learners with peripheral information.	אֲנִי רוֹצֶה לָלֶכֶת לָמִסְעָדָה בְּמֶרְכָּז הָעִיר וּלְדַבֵּר עִם הָחֲבֵרִים שֶׁלִי שָׁם.	أَنا أُريد الذِّهاب إلى المَطْعَم في مَرْكَز المَدينة والتَّحَدُّث مَعَ أَصْدِقائي هُناك.
After 3 months: ~75% diacritization. Students are expected to be familiar with words like “I”, “with”, and “city.” Verbs can still be diacritized to ensure accurate pronunciation and conjugation (depending on instructor assessment of learner progress). For Hebrew, one might consider adding the diacritical dot in the words שׁלי and שׁם to help learners distinguish between s and š. In this sentence, שָׁם (šam) is fully diacritized to ensure students do not pronounce it as šem, another commonly used word.	אני רוֹצֶה לָלֶכֶת לָמִסְעָדָה בְּמֶרְכָּז העיר וּלְדַבֵּר עם החֲבֵרִים שׁלי שָׁם.	أَنا أُريد الذِّهاب إلى المَطْعَم في مَرْكَز المدينة والتَّحَدُّث مع أَصْدِقائي هناك.
After 6 months: ~30% diacritization. At this stage, learners should be able to conjugate common verbs like “want” without issue, removing the need for continual diacritization. However, trickier verbs like “go” and “speak” might require partial diacritization until learners master their inflection. Upper elementary-level vocabulary like “center” and “restaurant” likely does not require diacritics at this point. Furthermore, instructors might opt to teach students the nominal patterns for locations (מִקְטָל/ה and مَفْعَل/ة), thereby obviating the need for future diacritization of morphologically related terms.	אני רוצה לָלֶכֶת לָמסעדה בְּמרכז העיר ולדַבֵּר עם החברים שלי שם.	أنا أريد الذِّهاب إلى المطعم في مركز المدينة والتّحدُّث مع أصدقائي هناك.
After 9 months: minimal diacritization. At this point, diacritization should no longer be necessary for this sentence, except to disambiguate or ensure accurate pronunciation. For Hebrew, לָ and בְּ might still be diacritized to avoid confusion with לְ and בָּ, which have different meanings. (Note that the conjunctional clitic vav does not need to be diacritized because there is no grammatical distinction between וּ and וְ in Israeli speech.) In Arabic, some words may still have a shadda to ensure that learners geminate the consonants as needed.	אני רוצה ללכת לָמסעדה בְּמרכז העיר ולדבר עם החברים שלי שם.	أنا أريد الذّهاب إلى المطعم في مركز المدينة والتحدّث مع أصدقائي هناك.
After 12 months: case-ending diacritization (Arabic only). Unlike Hebrew, which does not have grammatical declension, Standard Arabic features a complex case system (iʕrāb) marked in the word-final position. In the context of L1 Arabic learning, written literature uses case endings to indicate noun declensions until the sixth grade [49], by which point native speakers are presumed to have mastered the declensions. In the context of L2 instruction, Arabic language teachers might omit case endings at the beginning. However, in the intermediate stage of language learning, instructors might choose to temporarily reintroduce diacritics in word-final positions to acclimatize learners to the case system, until learners internalize the grammatical rules that govern the use of cases.		أنا أريدُ الذّهابَ (أن أذهبَ ⁱ) إلى المطعمِ في مركزِ المدينةِ والتّحدثَ (أن أتحدّثَ) مع أصدقائي هناك.

ⁱ This alternative phrasing has been added because both are usually taught simultaneously as the Arabic case system also applies to verbs.
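
The staged reduction in Table 9 can be approximated in code. The sketch below rests on two assumptions that are not taken from the article: the diacritization rate is measured as the share of words carrying at least one combining mark (the table's percentages may be computed differently), and a word loses its marks once its bare form appears in a hypothetical `mastered` set. Punctuation attached to a word is not handled.

```python
import unicodedata


def strip_marks(word):
    """Remove every combining mark (Unicode category Mn) from a word."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def has_marks(word):
    """True if the word carries at least one combining mark."""
    decomposed = unicodedata.normalize("NFD", word)
    return any(unicodedata.category(ch) == "Mn" for ch in decomposed)


def diacritization_rate(sentence):
    """Share of whitespace-separated words that carry at least one mark."""
    words = sentence.split()
    if not words:
        return 0.0
    return sum(1 for w in words if has_marks(w)) / len(words)


def reduce_diacritics(sentence, mastered):
    """Strip marks only from words whose bare form is in the mastered set."""
    reduced = []
    for word in sentence.split():
        bare = strip_marks(word)
        reduced.append(bare if bare in mastered else word)
    return " ".join(reduced)


# First three words of the Hebrew sentence in Table 9 ("I want to go").
full = "אֲנִי רוֹצֶה לָלֶכֶת"
mastered = {"אני"}  # hypothetical: the learner already reads "I" without points
reduced = reduce_diacritics(full, mastered)

print(diacritization_rate(full), diacritization_rate(reduced))
```

Growing the `mastered` set over the course of a year would reproduce the general shape of the timeline, though which words to release at each stage remains an instructor's judgment.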

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

