1. Introduction
School-age children present varying levels of literacy abilities in school. Among them, poor readers—who may be deemed to be at risk of or diagnosed with dyslexia—struggle with word recognition and spelling tasks and often receive literacy intervention to address their literacy difficulties. Many literacy intervention programs involve phonological and sight word instruction for phonetically regular and irregular words [
1,
2] to facilitate the development of word recognition. Such programs aim to establish letter–sound correspondence to support the orthographic and morphological mapping of automatic word recognition and spelling regardless of the regularity of the words [
3].
However, there are poor readers who demonstrate little or no progress in literacy intervention as they struggle to develop proper letter–sound correspondence or attain automatic and accurate word recognition regardless of the years of instruction [
4,
5]. Attempts at meaning making in the classroom are hampered by the struggle to recognize words with automaticity. According to McMurray and McVeigh [
6], poor readers may face challenges with phonemic instruction because they are unable to handle the de-contextualized nature of the instruction with their limited working memory. As such, high-frequency irregular sight words are recommended to be introduced alongside other aspects of literacy instruction. This is especially so when there is an urgency for poor readers to gain greater recognition and automaticity in high-frequency sight words and stem further delays in their language development in the face of the increasing linguistic demands of each progressive grade [
7]. Literature on error patterns committed by poor readers suggests that the susceptibility towards the concreteness (the inclination for words with more physical or tangible characteristics that can be experienced directly through the senses [e.g., dog, tree]) and neighborhood (other words that bear similarities to the base words via orthographic [letter] auditory or semantic [meaning] similarities) effects are possible interferences in the accurate and efficient acquisition of sight words via repeated exposure [
8]. Such findings are supported by modern research into the interference of phonological, orthographic and semantic reading networks [
9].
Poor readers experience challenges in the efficient activation of language processing streams—predominantly left hemispheric processes—due to susceptibility to the concreteness and neighborhood effects. These streams are generally regarded as the lexical ventral stream and the non-lexical dorsal stream [
9]. The former maps the word orthography (letters) with semantics (meaning) as a whole word to attain reading automaticity. The latter involves the process of translating orthography into phonics to aid in the pronunciation of unfamiliar words or to cope with the initial stages of literacy acquisition [
10,
11]. Both streams lead to a phonological output that results in speech sounds for word articulation. Poor readers experience challenges in the efficient utilization of either one or both streams due to susceptibility to the concreteness and neighborhood effects. Such susceptibility is attributed to two reasons: (1) the difficulty in inhibiting the neighbors (words with orthographic/phonological/semantic similarities) of target words, and (2) a reduced retention or understanding of words with a lack of representation in reality (low concreteness). The word neighbors and words with low concreteness interfere with efficient reading processes—that require accurate identification of the target word—as poor readers exhibit confusion in distinguishing between the target and uninhibited words. In an attempt to alleviate such confusion, distinguishing word neighbors from the target words should be a cornerstone of literacy instruction for poor readers. In addition, words of lower concreteness should be differentiated from words of higher concreteness. Caution against the casual inclusion of the former in word instruction for poor readers should be exercised due to their reduced retention or understanding towards words with low concreteness.
It is of interest in this study to incorporate these considerations into an existing sight word teaching resource of high classroom utility (e.g., Dolch, Fry list) that, due to its datedness, does not control for the concreteness and neighborhood effects. As compared to the Fry list that contains 1000 words, the Dolch list is selected for revision due to its concise list of 220 words, which limits the number of overlapping words present as a neighbor for another word in the list. As such, the aim of this study is to revise the Dolch list by recategorizing the words according to concreteness and including their neighborhoods, to provide an effective reference for planning sight word intervention for school-going poor readers. Below, we first review important findings on the factors that influence word recognition to motivate the creation of a new multi-faceted sight word list.
1.1. Atypical Processing of Orthographic Neighbors
Typically developing individuals demonstrate greater susceptibility to words of higher frequencies as compared to intrusions of orthographic neighbors. Such susceptibility to words of higher frequencies is known as the word frequency effect. It refers to the longer reaction time incurred in response to words of high frequency—specifically with a number of high-frequency neighbors—as compared to low frequency due to the faster activation of a wider range of phonemic representations among high-frequency words [
12,
13]. This effect is based on an assumed precedence of the non-lexical dorsal stream before the lexical ventral by the typically developing individuals. Typically developing individuals require more time to determine the accurate phonemic representations of the target word among the many activated representations, which is supported by Luque et al. [
12] who found longer response times among high-frequency words during lexical decision tasks among typically developing individuals. Conversely, typically developing individuals do not seem to demonstrate susceptibility to orthographic neighbors. While confusion may arise via the similarity of orthographic features between the target word and its neighbors, effective lateral inhibition of neighbors in typically developing individuals curbs that confusion and promotes accuracy in the identification of target words [
14].
Poor readers, however, react to intrusions of orthographic neighbors faster than high-frequency words due to poor inhibition. Orthographic neighbors are defined as words that differ by changing a single letter—via substitution, omission and transposition—of the target word (e.g., ‘now’ vs. ‘no’), to which the N metric reflects the number of orthographic neighbors that a target word has [
15]. Target words with high orthographic density bear orthographic resemblance to many other words. Target words with low orthographic density do not bear orthographic resemblance to many other words and are distinct in letter combinations [
16]. The orthographic neighborhood density effect assumes susceptibility towards target words of high orthographic density due to the interference of the numerous word neighbors in ensuring accurate word production [
17]. This effect assumes faster initial access to the lexical route among the poor readers. Luque et al. [
12] found that the orthographic density effect supersedes the word frequency effect among poor readers, resulting in longer reaction times towards high frequency words as compared to their typically developing peers. Two extra years of literacy instruction did not improve the inhibition of orthographic neighbors [
12] (p. 199). Zoccolotti et al. [
18] proposed an idea of the global impairment of orthographic strings based on findings that even the most minimal variation borne by the neighborhood density effect resulted in significant worsening of the reading performance among the poorest of readers. As such, orthographic neighbors of target words must be taken into consideration for literacy instruction for poor readers.
1.2. Atypical Processing of Phonological Neighbors
Homophone interference, particularly heterographic homophones (e.g., ‘maid’ and ‘made’), manifests in typically developing individuals due to early phonological influences in visual word recognition [
19]. Word frequency and homophony modulate the degree that phonology affects visual word recognition [
20], with homophone interference more evident in higher frequency words due to faster word activation [
21]. Phonological neighbors are defined as words that differ by one phoneme from the original via addition, deletion or substitution [
22]. The number of phonological neighbors of a word is known as phonological density [
22]. Words with few phonological neighbors (e.g., PROOF) are recognized faster than words with many phonological neighbors (e.g., FRUIT) [
23].
Contrasts between error rates to homophone foils versus spelling-control foils suggest that poor readers are significantly affected by phonology in determining categorization [
24]. Due to challenges with phonemic discrimination, identification and/or repetition [
25], poor readers have shown greater retention of words with low neighborhood density and low acoustic similarity as compared to their typically developing peers [
26]. Conversely, poor readers demonstrate confusion towards words with high neighborhood density and high acoustic similarity. Such confusion interferes with literacy acquisition, resulting in low retention and inaccurate word production. As such, phonological neighbors of target words should be factored in literacy instruction for poor readers.
1.3. Atypical Processing of Semantic Neighbors
All individuals, whether typically developing or poor readers, demonstrate some degree of susceptibility to the concreteness effect. Concreteness or imageability is the degree of perceptual and sensorial features of a concept that can be experienced in reality [
27,
28]. The greater the word concreteness, the richer the mental representation of the meaning and semantic knowledge, respectively [
27,
28]. As compared to concrete verbs (e.g., jump), concrete nouns (e.g., pen) typically possess greater levels of concreteness due to higher featural weights (i.e., multiple perceptual and motor features) [
28]. Along the continuum of concreteness, abstract words (e.g., freedom) possess the least featural weights. In a phenomenon known as the ‘concreteness effect’, concrete concepts and nouns are processed faster—and with greater accuracy—than abstract and verb concepts, respectively [
29].
While all readers show inclination to the concreteness effect [
30], it is more evident in poor readers with reduced inhibition of semantic neighbors at the phonological output lexicon [
31]. Poor readers show an obvious inclination for concrete words due to (1) the absence of direct sensory referents of abstract words, (2) the greater availability of contextual information provided by concrete words, and (3) the greater number of semantic features supporting concrete words [
32]. Reduced inhibition of semantic neighbors may result in the activation of a greater number of neighbors in abstract words as compared to concrete words (e.g., “freedom” may activate “justice”, “liberty”, etc. while “bed” may activate “pillow”) [
33]. As such, poor readers demonstrate confusion in determining the accurate target abstract word among a greater number of activated semantic neighbors. Such confusion would interfere with the understanding and accurate word production. Semantic neighbors should be incorporated into the instruction for poor readers to allow the confusion to be addressed by the reading teacher (RT).
1.4. Set for Variability for Irregular Words and the Concreteness Effect
While beginning readers activate the non-lexical dorsal stream by engaging in grapheme—phoneme correspondence to develop their literacy competence, they eventually develop automaticity in accurate word identification and the activation of the lexical ventral stream by engaging in a process called ‘Set for Variability’ (SOV). SOV involves the cognitive flexibility to bridge the mismatch of orthography and phonology in irregular words via visualization of meaning and compensate with the accurate way of reading and spelling the word [
34,
35]. SOV is essential in the acquisition of irregular words that emerge as exceptions to grapheme–phoneme correspondence due to a mismatch in orthography and phonology.
There is increasing evidence that semantic knowledge aids in SOV. SOV has been found to be a predictor in the reading performance of irregular, regular and nonwords [
34,
35]. Other studies have found SOV of word-specific knowledge to be a predictor of orthographic learning and reading accuracy for irregular words [
36]. More importantly, concreteness—or imageability—was a significant factor in determining irregular word reading accuracy [
37] and constructing lexical representations of irregular words [
38,
39].
Typically developing readers exhibit developed grapheme–phoneme correspondence and demonstrate developed SOV as complete mental word representations are formed for both regular and irregular words [
3]. Conversely, poor readers with low SOV exhibit partially developed grapheme–phoneme correspondence that adversely affects word representations and reading automaticity. While poor readers attempt to compensate by relying on alternative and less efficient approaches (i.e., engaging in word guessing by using semantics—word meaning) for word acquisition [
8,
40], they may not be successful due to their susceptibility to the concreteness—or imageability—effect.
In a series of studies conducted by Steacy, Compton et al. [
38], Steacy, Wade-Woolley et al. [
34] and Steacy et al. [
8], concreteness—or imageability—was found to have a significant interaction with initial word reading skill. It underscores the importance of imageability for students who commenced intervention with poor word reading skills [
8,
39] and word learning efficiency [
38]. The growth in word reading was adversely affected in poor readers dealing with words with low imageability—or concreteness values [
8]. Such difficulty with low imageability was especially evident for students who commenced intervention with the poorest word reading skills and demonstrated significantly lower receptivity to words with lower imageability as compared to words with higher imageability values [
8]. Steacy et al. [
38] concluded that the interaction between imageability and poor readers suggests that the poorest of readers may deem imageability as a cornerstone in word acquisition. As the poorest of readers are highly influenced by imageability—or concreteness—in word acquisition, there is a strong need for literacy intervention to clearly consider imageability—or concreteness—as a factor in literacy instruction for poor readers to facilitate the development of SOV by mitigating any interference brought about by words of low imageability or concreteness.
1.5. Current Study—Towards a Multi-Faceted Sight Word List
In view of the potential interference in word acquisition presented by word neighbors, distinguishing word neighbors from the target words should form the cornerstone of literacy instruction for poor readers. In addition, words of lower concreteness should be differentiated from words of higher concreteness. Caution against the casual inclusion of the former in word instruction for poor readers should be exercised due to their reduced retention or understanding towards words with low concreteness. It is important to incorporate these considerations into the intervention of basic sight words for poor readers as many sight words are irregular words that frequently appear in the learning materials of any grade.
To aid in sight word intervention, word neighborhoods (orthographic, phonological and semantic) and concreteness should be factored into the revision of high-frequency word lists such as the Dolch list, which is commonly used with school-going children. The Dolch list comprises 220 high-frequency irregular words spanning all parts of speech other than nouns, with pronouns included [
41,
42]. The list was initially created based on shortlisting high-frequency words—with a frequency rating of one hundred or more—from reading material of that era and a speech corpus of student interactions in a kindergarten classroom [
43]. The Dolch word list is often categorized by grade—pre-primer, primer, kindergarten, grade 1, grade 2 and grade 3 [
44]. While questions over the validity of the Dolch word list usually revolve around its relevance to the evolving English language in the contemporary era [
45], the Dolch list remains popular among early language teachers and therapists as it is deemed as a simple and accessible resource to chart progress [
46,
47].
However, the grade and frequency categorization of the Dolch list is not designed for the needs of poor readers who are susceptible to the neighborhood or concreteness effect. Ordering words by frequency of exposure makes no attempt to account for neighbors or concreteness, as it assumes that students successfully inhibit these effects. For instance, the word ‘it’ has a low concreteness value, has ‘eat’ as a phonological and semantic neighbor and is listed under the primer category. With little attempt to tackle any resulting confusion, poor readers continue to struggle with what are deemed to be the highest-frequency words in lower grade texts even as they progress through the grades. There is no precedent for a high-frequency sight word list that accounts for the concreteness and neighborhood effects, and the incorporation of these considerations into a revised Dolch list would serve as an effective reference for sight word instruction for poor readers.
The recategorization of the list according to concreteness allows the reading teacher (RT) to select words for intervention according to syntactic and semantic properties that correspond to lower and higher levels of concreteness. This is achieved by an initial delineation between content and function words before recategorizing the content words according to Parts of Speech (POS) categories. Content words (e.g., verbs, adjectives) have higher concreteness values as they possess physical representations. On the other hand, function words (e.g., conjunctions) have lower concreteness values and contain more syntactic than semantic properties [
48]. In addition, the semantic properties inherent in function words are generally more “abstract” than content words due to the lack of immediate representation in reality. As function words are often the last group of words that are fully integrated in a child’s early language development, their low concreteness values would translate into the most challenging type of words for individuals with semantic impairment. This warrants the creation of a distinct category for function words from the POS categories derived from the remaining content words. The presence of the function word category in addition to the existing POS categories will be collectively known as ‘fPOS’. With the ‘fPOS’ categories, the RT can differentiate the intervention according to content or function words. For instance, the introduction of orthographic neighbors to the word ‘buy’ (vs. ‘boy’) and ‘now’ (vs. ‘know’ and ‘no’) warrants differentiated intervention. As compared to the latter, the former does not require as much explanation, visuals and monitoring of confusion due to the higher concreteness values that stem from their semantic properties.
This study aims to revise the Dolch sight list by recategorizing the words according to concreteness in the fPOS categories and including the associated neighbors of every word. It does so by addressing five research questions:
Research question 1: How should the Dolch list be recategorized according to function and content words?
Research question 2: How should the content words be recategorized according to POS?
Research question 3: How should the words in each fPOS category be listed according to concreteness values?
Research question 4: How significantly different are the grade and fPOS categories in both the original and recategorized Dolch list, respectively?
Research question 5: What are the orthographic, phonological and semantic neighbors of the words in each fPOS category?
2. Materials and Methods
The aim of this study was to revise the Dolch list to create a new “High Frequency List with Neighbors” (HFLN) to support poor readers who demonstrate susceptibility towards the concreteness or neighborhood effects (orthographic, phonological and semantic neighbors). Psycholinguistic techniques were employed to reappropriate the Dolch list by (1) recategorizing the existing words according to concreteness and (2) including orthographic, phonological and semantic neighbors of each word in the list. The Dolch list was obtained from
https://sightwords.com/sight-words/dolch/ (accessed on 4 January 2022).
The creation of the HFLN consisted of a two-step process. The first step involved the recategorization of word categories based on concreteness. The words were recategorized according to types of content and function words. The identification of function words and content words was guided by the function word list [
49] and the SUBTLEX-UK word list, respectively. The content words were recategorized into POS categories. Thereafter, words in each fPOS category were listed in descending order of concreteness. The concreteness value of each word was obtained from the MRC Psycholinguistic Database [50]. The MRC Psycholinguistic Database offers 26 linguistic and psycholinguistic attributes, including concreteness, across more than 150,000 words. Categorical concreteness values of both the Dolch list and the HFLN were compared to determine significant differences.
The second step involved the inclusion of orthographic, phonological and semantic neighbors of the HFLN using psycholinguistic databases and word list(s). The orthographic and semantic neighbors were obtained from CLEARPOND [51] and WordNet® [52], respectively. CLEARPOND is a database that allows researchers to obtain phonological and orthographic neighbors of inputted words. WordNet® is a database of English words that establishes semantic linkages between synonym sets (known as synsets) of semantic and lexical relations. Homophones and phonological neighbors were obtained from Alan Cooper’s Homonym List [53] and CLEARPOND, respectively.
More details of the databases are given in the subsequent text.
2.1. Recategorization and Ranking according to Concreteness
2.1.1. Recategorize according to Concreteness
According to Ozturk [
54], there are only five avenues to date that feature a function word list. Of the five avenues, O’Shea [
49] remains the only avenue that uses natural language processing—a by-product of the research into Short Text Semantic—to compile a function word list. The Dolch list is matched against the function word list [
49] for an initial separation of the function and content words. Upon segregating the words into function and content word categories, these two categories are further segregated into subcategories based on parts of speech (POS), namely function words (abstract), function words (pronoun/possessive), content words (present verb), content words (past verb) and content words (adjective/adverb), to accommodate the different ranges along the concreteness continuum. The subcategorization of each word was checked against the ‘DomPos’ column in the SUBTLEX-UK data to determine the most frequent part of speech for each word in the Dolch list. The SUBTLEX-UK data can be accessed at
https://psychology.nottingham.ac.uk/subtlex-uk/ (accessed on 5 February 2022).
2.1.2. Rank according to Concreteness
The concreteness values for all possible words were obtained from the MRC Psycholinguistic Database Output. The database can be accessed at
https://websites.psychology.uwa.edu.au/school/MRCDatabase/uwa_mrc.htm (accessed on 5 February 2022). Created by Coltheart (1981) [
50], it consists of 150,837 words and 26 linguistic and psycholinguistic attributes. Many attributes, including concreteness, are expressed as integer values between 100 and 700. Larger values indicate a greater degree of concreteness. The words in each category were entered into the database to obtain the concreteness value for each word. The concreteness values of 32 out of 220 words (14.55%) were unavailable in the MRC Psycholinguistic Database. The remaining 188 words were arranged in descending order of concreteness in the HFLN. Thereafter, the 32 words with no concreteness values were included at the end of their respective categories.
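The ranking rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `rank_by_concreteness` is hypothetical, and the concreteness scores in the example are made-up placeholders on the 100 to 700 scale, not values taken from the MRC Psycholinguistic Database.

```python
def rank_by_concreteness(words, concreteness):
    """Order words by descending concreteness; unrated words go last.

    `concreteness` maps a word to its rating (100-700 scale); a missing
    key or a value of None means no rating is available for that word.
    """
    rated = [w for w in words if concreteness.get(w) is not None]
    unrated = [w for w in words if concreteness.get(w) is None]
    # Python's sort is stable, so words with tied ratings keep their
    # original relative order.
    rated.sort(key=lambda w: concreteness[w], reverse=True)
    return rated + unrated


# Placeholder ratings for illustration only (NOT real MRC values).
example_scores = {"go": 300, "eat": 450, "jump": 400}
example_category = ["go", "eat", "jump", "thank"]
ranked = rank_by_concreteness(example_category, example_scores)
```

Applying the function once per fPOS category reproduces the layout described in the text: rated words first, from most to least concrete, followed by the words without a rating.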
2.1.3. Data Analyses
Data analyses involved one-way analysis of variance (ANOVA) to compare the categorical concreteness values within each list. The first ANOVA compared scores between the categories of the Dolch list. The second ANOVA compared scores between the categories of the HFLN. The Bonferroni post hoc test and Levene’s test of variance homogeneity were carried out as measures of statistical significance and variability, respectively. An alpha level of 0.05 was used for all statistical tests.
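As a rough illustration of the comparison described above, the following Python sketch computes the one-way ANOVA F statistic from per-category concreteness values and a Bonferroni-adjusted alpha for the pairwise post hoc comparisons. It is not the analysis pipeline used in this study: the function names are hypothetical, the sample numbers are placeholders, and obtaining a p-value (or running Levene's test) would additionally require a statistics package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    values = [v for g in groups for v in g]
    grand_mean = mean(values)
    group_means = [mean(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def bonferroni_alpha(alpha, n_groups):
    """Per-comparison alpha when all pairs of groups are compared."""
    n_comparisons = n_groups * (n_groups - 1) // 2
    return alpha / n_comparisons


# Placeholder values for three hypothetical categories (not study data).
sample_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(sample_groups)
```

With five categories and an overall alpha of 0.05, `bonferroni_alpha(0.05, 5)` divides the alpha across the ten possible pairwise comparisons.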
2.2. Inclusion of Neighbors
The following subsections detail the process of acquiring orthographic, phonological and semantic neighbors for the HFLN.
2.2.1. Determining Orthographic Neighbors
CLEARPOND (Cross-Linguistic Easy-Access Resource for Phonological and Orthographic Neighborhood Densities) is a database that allows researchers to obtain phonological and orthographic neighbors of inputted words. Users can customize the neighborhood list by the separate definition of metrics—substitution, deletion, and/or addition—or the summation across all three metrics [
51]. CLEARPOND yields a corpus size of 27,751 English words. CLEARPOND for English words can be accessed at
https://clearpond.northwestern.edu/englishpond.php (accessed on 20 January 2022). The word list was inserted into the database and the orthographic neighbors of each word were identified by defining the following metrics:
Features: neighbors (list of words);
Neighbor Type: orthographic;
Neighbor Metric: total [Substitution, Addition, Deletion];
Neighbor Frequency: all neighbors.
The orthographic neighbors considered are words that have the same first letter as the target word and fulfill the N metrics (e.g., substitution = (“were” vs. “wore”), insertion = (“were” vs. “where”), omission = (“your” vs. “you”)). Should a word contain more than one orthographic neighbor in the substitution, insertion or omission category, the neighbors will be listed in descending order of concreteness according to the MRC Psycholinguistic Database. For instance, the word ‘new’ has two orthographic neighbors, ‘net’ and ‘now’, under the orthographic substitution category. As the neighbor ‘net’ has a higher concreteness value than ‘now’, ‘net’ will be listed before ‘now’. The concreteness values of multiple orthographic neighbors are listed in
Table S4. One limitation of the MRC Psycholinguistic Database is the omission of concreteness values for certain words. Of the 481 orthographic neighbors, 148 neighbors (30.77%) do not have concreteness values (abstract, 17.67%; pronouns, 30.95%; concrete verbs, 36.9%; past tense verbs, 27.03%; adjectives/adverbs, 30.51%). The absence of these concreteness values is represented with an asterisk.
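The selection rule for orthographic neighbors (same first letter, and a single-letter substitution, insertion or omission) can be sketched as follows. This is a minimal illustration, not a reimplementation of CLEARPOND: the function names and the tiny word list are hypothetical, and a real lookup would run against the database's own lexicon.

```python
def edit_type(target, candidate):
    """Classify candidate as a one-letter edit of target, or return None."""
    if target == candidate:
        return None
    if len(candidate) == len(target):
        diffs = sum(a != b for a, b in zip(target, candidate))
        return "substitution" if diffs == 1 else None
    if len(candidate) == len(target) + 1:
        # Removing one letter from the candidate must give the target.
        for i in range(len(candidate)):
            if candidate[:i] + candidate[i + 1:] == target:
                return "insertion"
        return None
    if len(candidate) == len(target) - 1:
        # Removing one letter from the target must give the candidate.
        for i in range(len(target)):
            if target[:i] + target[i + 1:] == candidate:
                return "omission"
        return None
    return None


def orthographic_neighbors(target, lexicon):
    """Group same-first-letter neighbors of target by edit type."""
    result = {"substitution": [], "insertion": [], "omission": []}
    for word in lexicon:
        if word[:1] != target[:1]:
            continue  # neighbors must share the target's first letter
        kind = edit_type(target, word)
        if kind is not None:
            result[kind].append(word)
    return result


# Hypothetical mini-lexicon built from the examples in the text.
mini_lexicon = ["wore", "where", "here", "were", "you", "net", "now"]
were_neighbors = orthographic_neighbors("were", mini_lexicon)
```

Each list in the result could then be ordered by concreteness, with unrated neighbors flagged, before being added to the HFLN.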
2.2.2. Determining Phonological Neighbors (Homophones)
Using Alan Cooper’s Homonym List [
53], homophones of the words in the Dolch list were identified. 59 out of the 188 words (31.38%) with concreteness values contained homophones as phonological neighbors. With the inclusion of words without concreteness, 60 out of 220 words (27.27%) in the HFLN contained homophones as phonological neighbors.
2.2.3. Determining Phonological Neighbors (Phonological Neighborhood)
Phonological neighbors with inserted phonemes consist of letters with manners of articulation that involve oral closures or partial constrictions of the vocal tract—plosive, nasal and approximant consonants—that do not vastly distort the sound of the original word (e.g., ‘so’ as ‘soap’ and ‘soul’) (UCL Division of Psychology and Language Science, 2018).
The word list was inserted into CLEARPOND and the phonological neighbors of each word were identified by defining the following metrics:
Features: neighbors (list of words);
Neighbor Type: phonological;
Neighbor Metric: total [Substitution, Addition];
Neighbor Frequency: all neighbors.
Only phonological neighbors with substituted or inserted phonemes within the same manner of articulatory category as the last phoneme in the original word will be identified as phonological neighbors to be included into the word list (refer to the UCL Division of Psychology and Language Science [
55] for a complete list of the different categories of articulation). For instance, the letter ‘k’ that comprises the last phoneme in the word ‘think’ is a plosive consonant that shares the same manner of articulation with the letters ‘b’, ‘d’, ‘g’, ‘p’ and ‘t’ [
56]. As such, any real word with a substituted plosive consonant as the final phoneme qualifies as a phonological neighbor (e.g., ‘step’ as ‘stab’). In another example, the letter ‘n’ is a nasal consonant that shares a similar manner of articulation to the letter ‘m’ [
56]. As such, the word ‘than’ has a phonological neighbor of ‘them’.
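One possible reading of the manner-of-articulation filter is sketched below in Python. Several things here are assumptions rather than part of the study's procedure: words are represented as tuples of ARPAbet-style phoneme symbols supplied by the caller, the function name is hypothetical, the substitution rule is applied only to the final phoneme, and an inserted phoneme is accepted when it is a plosive, nasal or approximant appended to the end of the word.

```python
# Manner classes for the consonants relevant to the filter. The symbols
# are ARPAbet-style labels chosen for this sketch (an assumption).
MANNER = {
    "P": "plosive", "B": "plosive", "T": "plosive",
    "D": "plosive", "K": "plosive", "G": "plosive",
    "M": "nasal", "N": "nasal", "NG": "nasal",
    "L": "approximant", "R": "approximant",
    "W": "approximant", "Y": "approximant",
}


def is_final_phoneme_neighbor(target, candidate):
    """Check whether candidate passes the manner filter against target.

    Both arguments are tuples of phoneme symbols.
    """
    same_length = len(candidate) == len(target)
    if same_length and target[:-1] == candidate[:-1] and target[-1] != candidate[-1]:
        # Final-phoneme substitution: both phonemes must share a manner.
        manner = MANNER.get(target[-1])
        return manner is not None and manner == MANNER.get(candidate[-1])
    if len(candidate) == len(target) + 1 and candidate[:-1] == target:
        # Final-phoneme insertion: plosive, nasal or approximant only.
        return candidate[-1] in MANNER
    return False


# Hypothetical transcriptions for illustration.
SO = ("S", "OW")
SOAP = ("S", "OW", "P")
SOUL = ("S", "OW", "L")
SEAT = ("S", "IY", "T")
SEED = ("S", "IY", "D")
SEEM = ("S", "IY", "M")
```

Under this reading, ‘soap’ and ‘soul’ pass as neighbors of ‘so’, and a plosive-for-plosive swap such as ‘seat’ to ‘seed’ passes, while a plosive-for-nasal swap such as ‘seat’ to ‘seem’ is rejected.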
2.2.4. Determining Semantically Similar Counterparts
WordNet® is a database of English words that is developed by Princeton University. It can be accessed via http://wordnetweb.princeton.edu/perl/webwn (accessed on 20 January 2022). This database establishes semantic linkages between synonym sets (known as synsets) of semantic and lexical relations [57]. The database contains 117,000 synsets. WordNet® provides synonyms or close synonyms of the target word borne from every conceivable meaning found in the dictionary. According to Princeton University [52], noun synsets are characterized by part-whole relations (e.g., ‘chair’ as ‘seat’ or ‘legs’), verb synsets are arranged in hierarchies that are grouped according to a semantic category (e.g., ‘buy’ as ‘pay’, ‘move’ as ‘run’ or ‘jog’, ‘talk’ as ‘speak’ or ‘say’), and adjectives are organized according to semantic similarities (e.g., ‘dry’ as ‘parched’, ‘wet’ as ‘soggy’). Apart from the abstract words, words from the other grammatical categories are inserted into WordNet® to identify semantic neighbors. Semantic neighbors are selected based on the same concrete representational image reflected in reality. For instance, the semantic neighbors of ‘move’ can be ‘run’ or ‘jog’ due to the same concrete representational image of running that all three words reflect in reality. The limitation of identifying semantic neighbors from the database is the inability to produce a comprehensive list of semantic neighbors for all poor readers due to the extensive individual variability of their mental lexicon. In addition, the types of semantic errors are extensive. An individual with semantic impairment may commit coordinate errors (e.g., ‘chicken’ as ‘duck’), superordinate semantic errors (e.g., ‘chicken’ as ‘animal’) or associative semantic errors (e.g., ‘chicken’ as ‘egg’ or ‘Kentucky Fried Chicken’) [58]. In particular, it is impossible for WordNet® to provide all semantically associative words as semantic errors are uniquely shaped by the experiences of every poor reader.
4. Discussion
Among poor readers who receive literacy intervention, some students demonstrate effective inhibition of neighbors and little sensitivity to the concreteness effect. Assuming no other interfering factors in their literacy acquisition, such students acquire proper orthographic mapping (letter–sound correspondence) and develop automaticity of print recognition via phonics and sight word intervention through repetition and frequent word exposure [
1,
2,
3]. As such, the grade-centric approach of the Dolch list based on frequency [
43,
44] is suited to such students due to the consistent number of words and range of concreteness between each grade list. Such consistency of concreteness between the different grade lists reflects the goal of efficient word recognition instruction for students who are receptive to developing automaticity via word repetition and exposure.
However, the acquisition of accurate and automatic sight word recognition for poor readers is affected by their susceptibility to neighborhood [
12,
18,
24,
26,
31] and concreteness effects [
32,
33,
38]. Depending on the type and degree of their susceptibility, poor readers will face challenges with words of low concreteness values or large neighborhood sizes. Such challenging words should be identified so that they can be presented with caution during intervention. However, the commonly used Dolch list makes no distinction between words of different concreteness values or neighborhood sizes. The similarity in concreteness (e.g., no significant difference between the concreteness values of the second-grade category of words and the earlier grades) suggests a consistent, rather than progressive, level of difficulty between words of the different grade categories. This means that the primer and kindergarten level lists contain as many words of low concreteness values as the higher-grade lists. As a result, poor readers may continue to demonstrate a consistent lack of receptivity or retention towards primer or kindergarten level words that have low concreteness values or neighbors, regardless of age. The poor sight word recognition abilities of poor readers interfere with language therapy as their attempts at meaning making are interrupted by issues with word recognition.
For instance, a poor reader may perceive the printed word ‘want’ as the orthographic neighbor ‘went’ in a sentence during intervention and proceed to construct an incorrect meaning for the sentence, derailing language therapy goals in the process. In another instance, a poor reader may expend so much effort on word recognition because of semantic interference that little capacity remains for comprehending the text. A revision of the Dolch list that distinguishes words by concreteness value and lists their neighbors would be a useful reference for the RT in planning more effective intervention for such cases.
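The ‘want’/‘went’ confusion can be made concrete with a minimal sketch. It assumes the common one-letter-substitution definition of an orthographic neighbor (same length, exactly one differing letter), which may not match the criterion used to build the HFLN. The function names and the mini-lexicon are hypothetical and serve only as an illustration.

```python
from typing import Iterable, List


def is_orthographic_neighbor(word: str, candidate: str) -> bool:
    """True if both words have the same length and differ in exactly one letter.

    Assumption: one-letter substitution is used as the neighbor criterion.
    """
    if len(word) != len(candidate):
        return False
    mismatches = sum(1 for a, b in zip(word, candidate) if a != b)
    return mismatches == 1


def orthographic_neighbors(word: str, lexicon: Iterable[str]) -> List[str]:
    """Return the lexicon entries that are substitution neighbors of `word`, sorted."""
    target = word.lower()
    return sorted(w for w in lexicon if is_orthographic_neighbor(target, w.lower()))


# Hypothetical mini-lexicon, for illustration only.
lexicon = ["want", "went", "what", "fall", "fail", "full", "fell"]

print(orthographic_neighbors("want", lexicon))  # ['went']
print(orthographic_neighbors("fall", lexicon))  # ['fail', 'fell', 'full']
```

Under this criterion ‘what’ is not a neighbor of ‘want’ because two letter positions differ, even though the two words share the same letters.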
The HFLN was created by recategorizing the words in the Dolch list into function and content words and then into POS subcategories. This recategorization delineates value clusters more clearly, narrows the range of concreteness values within each category, and lists the associated neighbors of each word. The abstract words have low concreteness values and differ significantly from the other subcategories. Conversely, other subcategories with greater concreteness values (e.g., the present verb and adjective/adverb) are not significantly different from each other. Overall, the categories in the HFLN are more distinct than the grade categories: the revised list shows more statistically significant differences between categories than the list sorted by grade.
The HFLN provides a versatile resource to aid the RT in the customization of sight word intervention. The following recommendations are two of the many ways that RTs can use the HFLN in planning for intervention:
In the planning of sight word instruction, the RT determines a student’s susceptibility to the neighborhood or concreteness effect. For instance, the student may be susceptible to orthographic interference (e.g., ‘fall’, ‘fail’ or ‘full’), phonological neighbors (e.g., ‘sit’ or ‘seat’) or multiple neighbor interferences (e.g., ‘it’ or ‘eat’ [semantics, phonology]). Thereafter, the sight word can be presented alongside the type of neighbor to which the student is susceptible. The simultaneous presentation of the words and their associated visuals for meaning making alleviates confusion and promotes improved sight word recognition.
In the planning of language therapy (e.g., sentence comprehension, vocabulary), the HFLN serves as a way for the RT to control the concreteness or neighborhood effect to avoid the distortion of meaning making. For instance, should POS be established as the therapy focus, the list, which is categorized according to POS, allows the therapist to choose words that have higher concreteness values or a small neighborhood size. This way, there is minimal interference in achieving the therapy goal.
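The second recommendation can be sketched as a simple filter over an HFLN-style list. The sketch below is illustrative only: the `Entry` structure, the POS labels, the thresholds, and in particular the concreteness numbers are placeholders invented for the example, not values taken from the HFLN or from any concreteness database.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Entry:
    word: str
    pos: str                    # POS category label (hypothetical labels)
    concreteness: float         # placeholder rating, NOT a real database value
    neighbors: Tuple[str, ...]  # neighbors of any type listed for the word


def select_words(entries: Iterable[Entry], pos: str,
                 min_concreteness: float, max_neighbors: int) -> List[str]:
    """Pick words of one POS category with high concreteness and few neighbors."""
    chosen = [
        e for e in entries
        if e.pos == pos
        and e.concreteness >= min_concreteness
        and len(e.neighbors) <= max_neighbors
    ]
    # Most concrete words first, so the least abstract items are presented first.
    chosen.sort(key=lambda e: e.concreteness, reverse=True)
    return [e.word for e in chosen]


# Invented sample rows, for illustration only.
entries = [
    Entry("jump", "verb", 4.5, ("bump",)),
    Entry("eat", "verb", 4.0, ("it",)),
    Entry("want", "verb", 2.0, ("went",)),
    Entry("run", "verb", 4.2, ("ran", "sun", "jog")),
    Entry("ball", "noun", 5.0, ("bell", "fall")),
]

print(select_words(entries, "verb", min_concreteness=3.0, max_neighbors=2))
# ['jump', 'eat']
```

In this toy run ‘want’ is excluded for its low placeholder rating and ‘run’ for having more than two listed neighbors. In practice the thresholds would be set by the RT according to the student’s profile.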
There are three limitations in this study. Firstly, 32 words were omitted from the analysis because the database did not provide concreteness values for them. Future research should refer to other databases of concreteness values and establish a standardized formula across the databases so that words with standardized concreteness values can be incorporated into the statistical analysis. Secondly, the variability seen in the concreteness values for past verbs is due to the small sample size (
n = 7). Regardless, these past verbs have lower concreteness values than present verbs and warrant a separate category due to the conceptual demands in understanding the linearity of time between past and present. Thirdly, WordNet
®, like any semantic database, cannot produce a comprehensive list of semantic neighbors that accounts for the extensive variability of the mental lexicon among poor readers. In addition, the types of semantic errors are extensive. An individual with semantic impairment may commit coordinate errors (e.g., ‘chicken’ as ‘duck’), superordinate semantic errors (e.g., ‘chicken’ as ‘animal’) or associative semantic errors (e.g., ‘chicken’ as ‘egg’ or ‘Kentucky Fried Chicken’) [
58]. In particular, it is impossible for WordNet
® to provide all semantically associative words as semantic errors are uniquely shaped by the experiences of each individual. Future research can focus on compiling the semantic references of different individuals to provide more targeted and relevant options of semantic neighbors in the HFLN.
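As one possible reading of the "standardized formula" suggested under the first limitation, ratings from databases that use different scales could be mapped linearly onto a common range. The sketch below uses min-max rescaling. The source does not specify a formula, so this is only one candidate, and the scale bounds in the example are hypothetical. Linear rescaling also assumes that the scales are otherwise comparable, which would need to be validated.

```python
def rescale(value: float, source_min: float, source_max: float,
            target_min: float = 1.0, target_max: float = 5.0) -> float:
    """Linearly map a rating from [source_min, source_max] to [target_min, target_max].

    Assumption: min-max rescaling stands in for the unspecified
    standardization formula; all bounds are illustrative.
    """
    if source_max <= source_min:
        raise ValueError("source_max must be greater than source_min")
    if not source_min <= value <= source_max:
        raise ValueError("value lies outside the source scale")
    fraction = (value - source_min) / (source_max - source_min)
    return target_min + fraction * (target_max - target_min)


# Hypothetical example: a rating of 400 on a 100-700 scale mapped to a 1-5 scale.
print(rescale(400, 100, 700))  # 3.0
```

A word missing from the primary database could then be assigned a rescaled value from a second database before entering the statistical analysis, subject to a check that the two sources agree on words they share.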