1. Introduction
In the introduction to this symposium, Robert Sternberg cites four widely expressed lines of criticism of intelligence tests: that they are narrow, biased, undertheorized and static. The present contribution echoes the first three of these themes, arguing that the inherent bias of normative tests can only be justified politically if a compelling theoretical account is available of how the construct of intelligence relates to learning and how opportunities for learning are distributed through educational policy.
Numerous studies in Africa have found that indigenous conceptualization of intelligence includes dimensions of social responsibility and reflective deliberation in addition to the dimension of cognitive alacrity emphasized in most intelligence tests standardized in Western societies. On the other hand, the technology of intelligence testing has been widely applied to the process of educational selection in contemporary African societies, but in Zambia, current applications of that technology rely exclusively on Western-style educational tests and fail to respond to some enduring cultural preoccupations of many parents, educators and policymakers. The themes that social responsibility and reflective deliberation are hallmarks of intelligence are rooted in the various indigenous cultures that inform public values and family socialization practices in contemporary Zambian society, but do not feature in the tests currently used for educational selection. As a consequence, there is a danger that educational policy will undervalue aspects of human development that have the potential to benefit society, will intensify long-standing tensions between rural and urban ways of life, and will tend to rationalize discriminatory exclusion of certain groups from leadership positions in society.
The cultural practice of intelligence testing arose historically in response to particular social challenges in Europe and the USA at the beginning of the twentieth century: identifying children with special educational needs or adults fit for particular types of occupation. Export of the practice to the African region in the second half of the century was informed by a comparably pragmatic motive of identifying adolescents within the upper primary school population best suited to benefit from access to limited opportunities for secondary schooling. Since then the practice has been extended unsystematically to various other social functions, including early detection of children at risk for developmental disability, monitoring the impact of changes in the curriculum of public primary schooling, and selection for admission to secondary and tertiary education. The design of tests for these purposes has been driven mainly by sophisticated psychometric technology, with little attention to the various theories of intelligence that have emerged in the mainstream of Western cognitive psychology, and still less attention to sociocultural research on the opportunities for cognitive development distinctive to African societies [
1].
Our analysis in this article is situated in Zambia, an African society undergoing rapid socio-cultural and politico-economic change. We seek to articulate what formal psychological assessment in this context might aspire to add to the hunches of a parent, teacher or administrator about particular intellectual functions in three different phases of human development: early childhood; middle childhood; and early adulthood. We describe some challenges of eco-culturally responsive assessment in each of those phases, and discuss how recent or ongoing research has addressed those challenges. Our analysis draws on three bodies of recent research in Zambia: on design and validation of new instruments for assessment of early childhood development [2, 3]; on assessment of literacy skills in the early grades of public primary schools [4, 5]; and on conceptualization of intelligence among the Lozi people of Western Province and its implications for measures of intelligence in the national secondary school selection examination [6]. In each of these fields, we pay special attention to the interface between conceptual issues and empirical research strategies.
The theoretical rationale informing our analysis arises from earlier publications [7, 8, 9, 10, 11, 12]. Psychological explanations are constrained by three complementary roles in communication: the subject whose behavior or experience is to be explained, the author who proposes the explanation, and the audience to whom the explanation is addressed. These roles are sometimes played by three separate persons, while at other times two or three of them are played by the same person [7, 8]. In the situations discussed in this paper, the author is a psychologist or other professional conducting psychological assessment, the subject is an individual whose intelligence and/or other personal characteristics are being assessed, and the audience is one or more stakeholders, such as a professional or administrator seeking information to guide a decision regarding the subject (e.g., diagnosis and management of a developmental disability or admission to a competitive entry educational program), a parent or teacher responsible for assistance of the subject in her zone of proximal development [13] or the subject herself.
Because the most widely cited intelligence tests in the scientific literature have for the most part been developed by Western authors based on research with Western subjects and addressed to Western audiences, their focus tends to be narrowly defined by the particular circumstances of Western societies, and their relevance to the needs and concerns of people in other societies depends on careful attention to details of context. But doing so often encounters resistance from Western scientists and the professionals they train because of a cultural predilection for “decontextualized, explanatory constructs… <designed> not merely to illuminate a particular problem, to enhance the quality of communication among an identifiable set of communicating participants, but to lay down a general law applicable across all societies and for all time” ([
11], p. 568). In our cultural-developmental theoretical perspective, “competence is defined by a culturally constituted system of representation. Its presence or absence in a given individual is construed in emergent ways through interpersonal interactions, which in turn are informed by a system of meanings shared among the co-participants and their various audiences… The cultural practice of intelligence testing falls within this framework as an institutionalized network of recurrent activities, scripts, artifacts, roles, and social functions… Many of the assumptions underpinning the legitimacy of the practice in American society are much less widely shared in contemporary African societies. As a result… the process of institutionalizing intelligence testing in Africa threatens to distort important aspects of education in dysfunctional ways rather than enhancing its precision and efficiency” ([
1], pp. 163–164). The cultural validity of a psychological or educational theory can be conceptualized as “a positive balance of benefits over costs for a given community at a given time engaged in a given task, where the sensitizing and heuristic power of a model outweigh the perceived narrowness of its focus and the extraneous connotations… In addition to its theoretical fruitfulness and its empirically predictive power, a psychological theory will always be judged by its capacity to resonate with the broader cultural preoccupations of the society of which its audience are members” ([
7], p. 125). Paradoxically, “when psychologists have explicitly set out to study cultural differences, this task of cross-cultural interpretation has often been neglected. Many ‘Western’ psychologists have not even appeared to be interested in whether their reports would be intelligible, let alone illuminating, for the people they studied. Indeed, they did not even acknowledge them as a potential audience: subjects were construed as objects of study but not as recipients of the wisdom generated by the study. The cross-cultural researchers who have acknowledged an obligation to capture the meaning of behavior and experience from the perspective of their subjects have most often construed this as a means of ensuring functional equivalence for the purpose of comparison. This concern emanates from the ‘Olympian’ project of formulating a universalistic psychology.” ([
7], pp. 125–126).
In contrast to that universalistic agenda, Serpell has urged colleagues in Africa to pay more attention to “an alternative set of concerns motivating cross–cultural psychology in a Third World country… The focus of these concerns, in the light of the predominantly ‘Western’ cultural orientation of contemporary psychology, is on the question: what use can a socially responsible, Third World psychologist make of psychology (along with other, indigenous cultural resources) in explaining the behavior and experience of a Third World subject to a Third World audience?” ([
7], p. 126). This emphasis on practical utility highlights the importance of situating explanation within contexts of practical decision-making and negotiating a fusion of perspectives “to facilitate a process of cooperative communication between children’s families and their schools, exploring areas of consensus and disagreement, as a basis on which to promote local accountability of public educational institutions [
10]. Rather than focusing only on incongruities between the cultures of home and school, it seeks to enhance the quality of mutual understanding among parents and teachers, and to generate a socioculturally productive relationship between public education and the community it is mandated to serve” ([
12], p. 424).
As Sternberg ([
14], p. 336) observed, “When cultural context is taken into account: (a) individuals are better recognized for and are better able to make use of their talents; (b) schools teach and assess children better; and (c) society utilizes rather than wastes the talents of its members. One can pretend to measure intelligence across cultures simply by translating Western tests and giving them to individuals in a variety of cultures. But such measurement is only pretense.” Taking cultural context into account demands attention to several complementary aspects of the interface between culture and intelligence: the environmentally structured opportunities for learning experienced by the individuals whose adaptive behavior is being assessed (reflecting the metaphor of culture as a womb), the conceptualization of personal attributes in ways that are meaningful for participant owner-members of a given culture (culture as a language), and the psychological criteria for practical decision-making that command consensus in a particular sociocultural context (culture as a forum) [
9]. The recent empirical research in Zambia selected for discussion in this paper illustrates each of these aspects of a contextualized approach, and seeks to show how assessment can respond productively to the demands of a particular socio-cultural and politico-economic context by generating a disciplined set of procedures to refine practical decision-making that are open to external validation, whereas the use of pre-standardized instruments is liable to give rise to invalid assessment, reliance on which could yield inappropriate practical decisions.
2. Assessment of Intellectual Functions in Early Childhood
Developmental changes in capacity to use language and in understanding of the world are recognized as evident in all human societies. Growth of those cognitive functions from infancy in the first year of life to early childhood in the age-range 5–7 is easily construed as a concomitant of physical growth, and universal milestones of that developmental trajectory have been widely adopted by health professions around the world for monitoring behavioral aspects of healthy human development [
15]. Yet, even within that relatively uncontroversial domain, cross-cultural variations have been documented in the rate at which particular, objective indicators are attained [16, 17]. Once a child is able to use language, the norms of communication influence how his or her behavior is appraised by local agents of socialization [18], and what is construed as unusual or deficient varies considerably from one socio-cultural context to another [19, 20, 21]. Thus if assessment tests are to serve as guides for action with children aged 3–7, such as remedial intervention, curriculum design or parental education, their content needs to be attuned to the child’s context of learning opportunities and sociocultural expectations.
In Zambia, pilot work on the development of CDAZ (a Child Assessment screening tool for children aged 0–5 years, developed by Ettling et al. for UNICEF [22]) attempted to replace test items requiring manipulation of a pencil with items requiring the use of scissors to cut paper, but found that young children in rural homes consistently performed poorly at cutting with scissors. No plastic scissors were locally available, and parents reported that they did not allow young children to play with regular scissors for fear of injury [23]. Despite making some adaptations, Ettling et al. [22] retained in the fine motor scale of their screening tool several tasks that most Zambian children seldom encounter in their preschool years unless they are enrolled in a formal Early Childhood Education (ECE) programme (building a tower of small blocks, turning pages of a book, copying shapes, tracing, colouring, drawing a human figure with a pencil, cutting paper with scissors). It is hardly surprising therefore that in their nationwide sample, most of whom were not enrolled in such a programme, performance on the fine motor scale was generally quite weak, with almost half the items being failed by more than 25% of the children tested.
Ngenda [
24] was inspired in part by this finding to explore whether a locally developed test that uses materials more familiar to most young Zambian children than pencil and paper might be applicable for the assessment of early childhood development in Zambia. The Panga Munthu (Make-a-Person) Test (PMT) was developed by Serpell and colleagues in the 1970s–1990s [
2]. It requires the child to make out of modelling clay the best model of a person she or he can, and the model is scored for detail on a 25-point scale, somewhat similar to that of the Goodenough-Harris Draw-a-Person Test [
23]. A standardisation study was conducted by Kathuria and Serpell [
25] on a nationwide sample of over 3000 school-going children aged 7 to 12 enrolled in Grades 1, 3, and 5 at government primary schools in urban and rural areas. Scores were significantly correlated with age (r = 0.28, p < 0.001) and school grade (r = 0.26, p < 0.001), but differences between the two genders or between children living in rural and urban areas were not consistently significant. The report presents norms in the form of a five-point scale ranging from low to high for children in each of the three grades, and also for children in three age bands: 7–8, 9–10, and 11–12. Ngenda [24] administered the test to two samples of younger children aged 4 to 6 years old in Zambia’s Copperbelt Province, one sample drawn from an urban preschool facility, the other from a rural community without any access to formal ECE. As expected, neither of these samples performed significantly better than the other, nor was there any effect of gender, but the average scores increased steadily across ages 4, 5 and 6.
In search of ways to characterize the construct measured by the PMT, Ngenda also administered two new tests to his samples of 4–6-year-olds: a make-a-dog modeling task that shared with PMT the demand for visuo-motor coordination in the construction of clay models and a body-part-name recognition task that shared with PMT the demand for knowledge about the component parts of the human body. He also solicited ratings of the children’s intelligence by a preschool teacher in the urban sample and by parents in the rural sample. Scores on the dog modeling task did not differ between the two samples, but the rural children scored significantly higher on the body-part naming task, which was administered to them in their mother tongue, than the urban children who were tested in English, the medium of instruction at their preschool. A strong positive correlation was found between PMT scores in the urban sample and ratings of their intelligence by their preschool teacher (r = 0.75, N = 60, p < 0.001), while scores on the other two tests were both moderately correlated with PMT scores (0.33, 0.49) and moderately but less strongly correlated with the teacher’s ratings than the PMT scores (r = 0.54, 0.55). In the rural sample the pattern of correlations among the three tests was similar, but when each individual child’s parent’s rating was compared with her or his PMT score the correlation was found to be low (r = 0.21) and non-significant.
Ngenda’s study brought the process of test development of PMT full circle, by applying traditional psychometric criteria to the evaluation of its construct validity. His findings support the theoretical interpretation that the PMT taps into both children’s knowledge of the structure of the human body and their skill at modeling in the medium of clay. Regarding external validity, because the test presupposes only familiarity with the widespread African play activity of clay modeling, it may be considered more suitable for the assessment of cognitive ability among children with less formal schooling than is prescribed by official public policy (e.g., street children, children orphaned by AIDS, child soldiers, and forced migrants) than other tests available that presuppose familiarity with pencil and paper, jigsaw puzzles, and other Western-origin materials that are unevenly distributed in African societies [26]. Alcock et al. [27], for instance, in a study of unschooled children aged 6–9 in rural coastal Kenya chose the PMT (translating its title as Build a Man) as an alternative to the better known Goodenough-Harris Draw-a-Person Test [
23], “that is hypothesized to assess intellectual maturity,” on the grounds that “clay modeling measures an equivalent construct but in materials more familiar to African children” (p. 538). In that population, the test fared well on several standard psychometric indices: inter-rater reliability 0.90 (N = 20), test-retest reliability 0.57 (N = 26), Cronbach alpha 0.82 (N = 537). However, most large-scale surveys of early childhood development have used only lightly adapted versions of tests designed and standardized for WEIRD populations (Western, Educated, Industrialized, Rich and Democratic societies; Henrich et al. [28]), and such tests are increasingly used in the NoWeMics (Northern Western More industrialized countries).
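For readers less familiar with the internal-consistency index cited above, the following minimal Python sketch shows how Cronbach's alpha can be computed from a table of item scores, using the standard formula alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores). It is illustrative only: the function name and the small data set are hypothetical and are not drawn from any of the studies cited here.

```python
from statistics import variance


def cronbach_alpha(scores):
    """scores: list of rows, one per child; each row holds that child's item scores."""
    k = len(scores[0])                      # number of items
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    items = list(zip(*scores))              # transpose: one tuple of scores per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical data: 4 children scored pass (1) or fail (0) on 3 items.
example = [
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
]
alpha = cronbach_alpha(example)
```

The sketch assumes complete data and at least some variation in total scores; a real analysis would need to handle missing responses and would normally rely on an established statistics package.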
The development of one such instrument in Zambia (the ZAMCAT) is described by Fink et al. [29]. An informative example of the challenges faced by such adaptations is presented by Zuilkowski et al. [3]. One of the sub-tests of the ZAMCAT is a Pattern Reasoning task that has been widely used in Western tests of cognitive ability (e.g., [
30]). In each test item, the child is shown a pattern color-printed on a sheet of paper, made up of abstract forms such as circles, squares and triangles. The child must select from several options presented a display that completes or continues the pattern, by pointing to the correct response on the page. During pre-testing, this task was presented to a nationwide sample of over 2700 six-year-olds, and found to elicit very few correct responses. In light of this, a modified version of the task (Object-based Pattern Reasoning Assessment—OPRA) was designed in which the items deployed to make a pattern were three-dimensional, familiar objects rather than colored geometrical shapes printed on paper. The logic of the patterns presented to the child for reasoning in these two versions of the test was exactly the same. Yet the average scores by the large, closely comparable samples tested two years apart differed significantly: on the two-dimensional version, the mean score was 2.7 out of 10, with a mode of 1, whereas the mean score on the three-dimensional assessment was 4.5, with a mode of 5. In both formats, items were presented in order of increasing difficulty, such that the percentage of children answering correctly decreased as the items became more difficult. But this decline was much sharper for the two-dimensional assessment, from 49% to 13% across items 2 through 7 (all of which were ab-ab patterns), compared to 59% to 43% across the same set of items on the three-dimensional assessment.
The direction in which test adaptation moved in the case of the OPRA was consistent with the call made by Wober [
31] for cross-cultural psychology to replace the question “how well can they do our tricks?” with the question “how well can they do their tricks?” An earlier demonstration of the power of such a shift of focus was Serpell’s [
32] comparative study of pattern reproduction by Zambian and English schoolchildren in three different media: paper-and-pencil, wire-modeling, and clay modeling. In that study the same Zambian children who performed significantly less well than the English children on copying shapes with pencil on paper performed significantly better than their English peers on copying shapes by bending strips of wire, and the two samples performed equally well on copying shapes by modeling clay. The widespread practice of using an imported test that was developed abroad for a culturally very different population tends to portray the local population as deficient or delayed with respect to a socially significant attribute such as intelligence or literacy, whereas in fact the difficulty experienced with the foreign test reflects lack of familiarity with its particular format.
3. Monitoring Initial Literacy Acquisition in Middle Childhood
Ideally, the appropriation by children of the cultural practice of literacy should not only afford those individuals greater opportunities for social mobility, but also generate progressive social change in the society, and empower African societies to communicate across space and time with one another and with the wider world. In order to achieve those social goals, literacy education should draw on the strengths of African cultures and languages, rather than alienating children from their communities of origin. Hence the wise decision by the Zambian Government in 1996 to reverse an earlier policy of immersing children in English as the sole medium of instruction in Grade 1. The medium of instruction in Grades 1–4 in Government schools is currently one of seven indigenous Zambian languages, depending on the geographical zone in which the school is located.
An additional benefit of the policy of offering initial literacy instruction in the indigenous African languages is that they all have a transparent writing system (orthography), most of which is shared across all the Zambian languages: e.g., the letters A, E, I, O and U each consistently represent a single phoneme in each of the Bantu languages of Zambia. This makes initial literacy learning much easier than it is in English, where each of those letters, depending on context, can represent several different phonemes.
National statistics on measured outcomes of basic literacy learning in Zambian public schools fall far short of the instructional objectives set by the national curriculum. In 2007, the Southern and Eastern Africa Consortium for Monitoring Educational Quality (SACMEQ) administered carefully standardized tests of literacy and numeracy to a large, representative, nationwide sample of pupils enrolled in Grade 6 in each of fifteen countries in the South-Eastern region of Africa. The SACMEQ study provides rigorous and disturbing evidence (Hungi et al. [33]) that the average level of literacy achieved by Grade 6 by children in Zambia’s registered, mainstream primary schools was very low and that it was heavily influenced by parental socio-economic status and urban vs. rural location. It also suggests that access by children to instructional resources makes a difference to the level of literacy they achieve.
The Zambian SACMEQ 2007 survey sampled Grade 6 classes at 157 schools, and assessed 2895 children on an 8-level scale of reading skill. Levels 1–3 represented pre-reading, emergent reading, and basic reading, while Levels 4–8 represented reading for meaning, interpretive reading, inferential, analytical, and critical reading. Nationally, only 24 percent of pupils in Grade 6 reached one of levels 4–8. Among children of families in the highest quartile of the range of socio-economic status 52% scored on level 4 or better, whereas only 17% of children of families in the lowest quartile did so. It is worth noting that the reading skills in this Zambian survey were assessed in English, which is the medium of instruction in Grades 5, 6 and 7. In Tanzania, where the initial medium of literacy instruction is maintained throughout Grades 1–7, the SACMEQ survey tests were conducted in Kiswahili, and the national proportion of Grade 6 pupils scoring on level 4 or better was 90%. The exceptionally low level of literacy performance in the upper primary grades has been declared a national emergency by the Government Ministry of Education, and a major new initiative has been launched to identify the causes and remedy them [
34]. Educational assessment has become an issue of considerable interest in this context as a way of collecting reliable evidence on which to base changes in policy, curriculum, instructional methods, materials and teacher training.
One concerted effort to address the particular needs of Zambian children for acquiring initial literacy has centered on the deployment of an innovative digital learning environment (Graphogame) designed in Finland by Lyytinen and his colleagues [35] and adapted for use in African public basic schools. The various steps through which this adaptation and application have evolved are described by Ojanen et al. [36]. One of the key steps was to conduct a field test of the intervention in a large sample of urban Government schools, and this required close attention to systematic assessment of early literacy skills [4].
A major challenge for the formal assessment of children’s cognition arises from the wide prevalence of bilingualism and translanguaging [
37] in Zambian society. The policy rationale for using the indigenous Bantu languages as media for initial literacy instruction rests on the supposition that they afford a cognitive bridge between most Zambian children’s existing communicative competence and the linguistic code that they must master in order to become literate. In some rural communities, a single language variety is shared among almost all adults and as such is an obvious candidate for designation as the medium of instruction for children of that community. But in the cities that house almost half of the nation’s population, and in some rural areas as well, considerable diversity exists and most adults are fluent in three or more speech varieties [
38]. The borders between varieties are porous, and everyday discourse involves frequent code-switching and other forms of translanguaging. The communicative socialization that children receive in such a speech community is likely to influence their performance on measures of verbal cognition and literacy. Yet this is often overlooked in the design of formal assessment procedures:
“If the outcome of assessment (such as a test score, or a profile of strengths and needs) is to be a valid representation of a child’s communicative competence, the formal process must be sensitive to the various real world contexts in which a child has been exposed to language. If the contexts of family life, children’s play and formal learning activities at school have each rendered a different speech variety familiar to the child, then the procedures of assessment and the criteria for evaluating the child’s performance will need to be correspondingly adjusted. In other words, if the purpose of assessment is to gauge the level of a child’s communicative competence, rather than recording performance indicators of whether the child understands word X or says X in response to a picture, it would be more useful to pose the competence questions ‘can she understand X?’ and ‘can she say X in an appropriate context?’. Furthermore, if assessment is to provide a valid guide to instruction, it needs, in addition to contextual sensitivity, to gauge the ease with which a learner can adapt to the demands of a learning task. Thus, a process of dynamic assessment [
28] should address the questions ‘can she learn to understand X?’ (and, if so, ‘how fast?’) and ‘can she learn to say X?’ (and, if so, ‘how well?’).”
An ambitious programme of international cooperation was launched in 2012 between the Zambian and US Governments, under the auspices of the Read to Succeed project administered by Creative Associates on a sub-contract from USAID. As a preliminary baseline against which to measure the impact of the strategic intervention to improve reading levels in the Zambian public schools, a sample of 4000+ learners in Grades 2 and 3 at rural Government schools in six provinces was tested with an instrument known as the Early Grade Reading Assessment (EGRA). This instrument, which was designed by Gove and Wetterberg [39] at RTI (Research Triangle Institute International), a research corporation in the USA under contract with USAID, was adapted for use in seven Zambian languages.
EGRA is an individually and orally administered standardized assessment that takes about 15 minutes to administer and consists of several components, which have been found to be highly correlated with one another. Three of the components are:
Nonsense word reading fluency (Ability to decipher made-up words that follow the linguistic rules but do not actually exist in the natural language being assessed. It assesses a child’s ability to decode words fluently—measured by words read correctly per minute);
Connected text oral reading fluency (Ability to read a passage, approximately 60 words long—measured by words read correctly per minute); and
Listening comprehension (Ability to follow and understand a simple oral story, measured by percent correct out of five comprehension questions).
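The per-minute and percent-correct measures named in these three components reduce to simple arithmetic. The following is a minimal sketch in Python, not EGRA's official scoring procedure: the function names and the example numbers are hypothetical and chosen only for illustration.

```python
def words_correct_per_minute(words_correct: int, seconds_elapsed: float) -> float:
    """Fluency rate: items read correctly, scaled to a one-minute rate."""
    if seconds_elapsed <= 0:
        raise ValueError("seconds_elapsed must be positive")
    return words_correct * 60.0 / seconds_elapsed


def percent_correct(answers_correct: int, questions_asked: int = 5) -> float:
    """Comprehension score as a percentage of the questions asked."""
    if questions_asked <= 0:
        raise ValueError("questions_asked must be positive")
    return 100.0 * answers_correct / questions_asked


# Hypothetical child: 12 words read correctly in 45 seconds, 3 of 5 questions right.
fluency = words_correct_per_minute(12, 45)   # 16.0 words per minute
comprehension = percent_correct(3)           # 60.0 percent
```

Under a rate-based score of this kind, a child who produces no correct response within the time allowed receives zero, whatever she or he might be able to decode without time pressure.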
Findings of the EGRA baseline study in Zambia, cited in various public announcements in 2013–2014, have been widely interpreted as showing that most learners at the end of Grade 2 had not acquired any basic literacy. Even the more nuanced conclusions published by RTI ([40], p. 2), on the basis of a concurrent National Assessment using the EGRA, provided a bleak profile:
“Overall, the EGRA showed that grade 2 pupils, on average, were struggling to read fluently; the average oral reading fluency rate for the local languages ranged from 1.84 to 8.40 words per minute, indicating that the typical grade 2 pupil could sight-recognize a few words but struggled to string the words from a passage into a coherent sentence. This finding is not surprising, as pupils were able to produce the correct sounds of only between 3.68 and 9.63 letters per minute across languages, indicating they lacked the foundation needed to decode unfamiliar words. This finding also was reflected in the Reading Comprehension subtask, for which most pupils were challenged to answer the comprehension questions of the passage they had just read.”
However, this despondent appraisal of the national level of early grade reading attainment may be due in part to certain peculiarities of the EGRA as a test. EGRA was strictly timed and this was an unfamiliar experience for young Zambian children, especially in rural schools. Some of the administrators trained to conduct the EGRA baseline testing in rural schools in 2012 were University of Zambia (UNZA) students who had earlier been trained to conduct an Orthographic awareness test, and a Decoding test administered to Grade 1 learners as part of a collaborative study on reading support for Zambian children (RESUZ) in 2011. By agreement with Read to Succeed, those trained administrators also applied the EGRA to a sub-sample of the Lusaka children who had been tested in Grade 1 under the RESUZ Project, and were now traced at the same schools in Grade 2.
Comparison of the RESUZ cohort's Grade 1 scores on the Decoding test with their Grade 2 scores on the EGRA Nonsense word reading test, administered to the same children one year apart by the same trained test administrators, revealed the following (These data were a focus of further analysis by Francis Sampa in a research report under development at the time of writing, entitled “A study of the validity of the early grade reading assessment tests in determining levels of reading acquisition: A case of early grade reading assessment among Grade 2 Zambian primary school children”). The nonsense-word decoding sub-test was completely failed by 90% of the 4000+ Grade 2 children tested in rural Districts of Zambia and by 82% of Standard 2 children (rural and urban combined) in Malawi. In the Lusaka RESUZ cohort, 172 children were tested with EGRA, and the zero score rate for that urban Grade 2 sample on the Nonsense word reading test was 81%. Yet only 27% of this sample of 172 children scored less than 6/20 on the RESUZ Decoding test at the end of Grade 1. In other words, many of the children scoring zero on the EGRA test of reading aloud (sounding out) nonsense words at the end of Grade 2 had scored more than 25% correct on the (quite reliable, untimed) RESUZ Decoding test at the end of Grade 1.
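The word "many" in this comparison can be given a rough lower bound. The sketch below is illustrative only: it uses the rounded percentages reported above, so the child count is approximate, and the function and variable names are hypothetical rather than taken from the RESUZ or EGRA analyses. It assumes the worst case, in which every child who scored below 6/20 on the Grade 1 test also scored zero on EGRA.

```python
# Figures reported in the text for the Lusaka RESUZ cohort (rounded percentages).
SAMPLE_SIZE = 172
EGRA_ZERO_RATE = 0.81      # zero score on EGRA Nonsense word reading (Grade 2)
GRADE1_LOW_RATE = 0.27     # scored less than 6/20 on the RESUZ test (Grade 1)


def min_overlap_share(zero_rate: float, low_rate: float) -> float:
    """Smallest possible share of children who scored zero on EGRA
    yet were NOT low scorers on the earlier Grade 1 test.

    Even if every low scorer in Grade 1 also scored zero on EGRA, the
    remaining EGRA zero scorers must come from the other children.
    """
    return max(0.0, zero_rate - low_rate)


share = min_overlap_share(EGRA_ZERO_RATE, GRADE1_LOW_RATE)
approx_children = round(share * SAMPLE_SIZE)
print(f"At least {share:.0%} of the sample (about {approx_children} children)")
```

On these figures, at least about half of the urban sample must have scored zero on the timed EGRA subtest after scoring more than 25% correct on the untimed Grade 1 test.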
The RESUZ and EGRA tests differed in a number of respects. But perhaps the most salient is that EGRA was strictly timed, with a score of zero recorded if the child did not respond within the allotted time for the item, whereas the RESUZ tests were untimed, allowing for wide variations in response time. Research in several rural African communities, including the Chewa people of Zambia’s Eastern Province, has reported that informants in these communities did not value speed of performance as highly as other traits such as depth of understanding and social responsibility (see Section 4 below). For instance, Wober [41] found that, on a semantic differential scale, rural Baganda informants rated an indigenous concept of intelligence, obugezi, as relatively slow rather than fast, whereas urban Baganda school teachers rated obugezi as relatively fast rather than slow. Are speeded tests appropriate for assessment of cognitive functions in Zambian Early Grade learners? It may be that learning to respond quickly is a valuable cognitive achievement in respect of certain functions. For instance, some theories of reading have argued that fluency is an essential prerequisite for understanding extended bodies of text. Moreover, empirical studies of initial literacy acquisition in the NoWeMics have often found a positive correlation between reading fluency (indexed by number of words read aloud per minute) and reading comprehension. But that correlation has not been consistently found in studies of reading acquisition in Africa. Inyega [42], for instance, reported data from Kenya showing that the correlation was variable depending on whether the texts in question were in the learner’s first, second or third language.
Large-scale surveys of national performance can in principle motivate policymakers, teachers and other key stakeholders to focus their attention in productive ways. In Tanzania, where the SACMEQ record is less shocking than in Zambia, and in several other African countries, policymakers have expressed an urgent need to enhance the quality of basic schooling now that high levels of enrollment have been achieved [43,44]. However, a global national assessment to the effect that children are not learning anything can also be demoralizing. In our view, such assessments should aim to reveal not only weaknesses and needs but also strengths and opportunities for improvement. The goals of literacy instruction as a national agenda need not necessarily give priority attention to skills required for success on selection tests to progress beyond basic education, especially when the scale of provision at secondary and tertiary levels of public education affordable for the majority of citizens is extremely small. In Zambia, for instance, currently less than 3% of a given age-cohort are able to access any form of tertiary education [45]. Rather than documenting what most children in the lower primary grades cannot do, assessment should focus on identifying emergent competencies and developmental processes on which teachers and other agents of socialization can capitalize to nurture each learner’s progress within a resource-constrained and rapidly evolving educational environment [1]. In the next section, we consider the question of what types of assessment are best suited for selection at the end of primary and secondary schooling in the framework of Zambia’s educational goals.
4. Selection for Admission to Secondary and Tertiary Education
The scale of educational provision has grown dramatically over the past 50 years in many countries around the world, notably in Africa. In Zambia, the promises of personal advancement and national development have been closely tied to formal education ever since independence in 1964. However, in the 1990s, the World Bank launched a critical attack on the widespread expansion of secondary and tertiary provision in African countries and led an international movement to prioritize investment in primary or basic education. As a result, the narrowing pyramidal structure of educational provision has endured [46,47]. Advocates of standardized testing as a method of selection for identifying the most deserving candidates for progression up that narrowing staircase have emphasized the potential of cognitive tests to equalize opportunity for individuals coming from economically varied family backgrounds, to reduce the dangers of favoritism and corruption, and to target uniquely valuable aptitudes so as to maximize the beneficial outcomes of further education [48,49]. The notion of individual differences in “potential” to benefit from education is, to say the least, controversial. In a study of ethnotheories held by preschool and early grade teachers at inner-city neighborhood schools in Baltimore, USA in the 1990s, Akkari et al. [50] found that African-American teachers were more inclined than their European-American counterparts to endorse the theme that every child can learn, whereas European-American teachers placed greater emphasis on individualized-differential teaching and child-centered education. In the poorly resourced city schools of Lusaka, Zambia in 2011, Jere-Folotiya [51] found that most Grade 1 teachers professed to believe in individualized-differential teaching and did not endorse the theme that every child can learn.
Further up the ladder of grade progression, Serpell [46] found in his longitudinal study of children in a rural Zambian community that failure to qualify for a place in secondary school was the norm in the 1980s, and that most of those failing blamed themselves, attributing their failure to lack of intelligence (nzelu).
Over the course of multiple interviews with adult members of that rural Chewa community about the characteristics of local children they knew well, Serpell [46] and his colleagues posed the question “wanzelu ndani?” (who is a person endowed with nzelu?), where nzelu is a concept in the Chewa language that encompasses the scope of the English terms intelligence, skill and wisdom. The study concluded that “nzelu was construed as an amalgam of cognitive alacrity and social responsibility” ([52], p. 128).
More recently, Simatende [6] adapted a more abstract question formulated by Sternberg and his colleagues [53] in their study of everyday concepts of intelligence among urban citizens of the USA: “what constitutes an ideally intelligent person?” His inquiry was addressed mainly to authority figures among the Lozi people of Zambia’s Western Province. Their responses showed that intelligence was perceived not merely as the cognitive ability to count, add or subtract, complete patterns, distinguish differences in a created pattern and arrange letters, but as encompassing a broader perspective that pertained to people’s way of life.
As many researchers in African indigenous cultures have noted, intelligence entails depth and breadth more than speed of processing. Durojaiye [54], in his research in Nigeria, found that the Yoruba people’s concept of intelligence emphasized the importance of depth of listening and of being able to see all aspects of an issue in its proper overall context. In this regard, speed is seen to undermine the quality of work because less time is given to fully grasp what one wishes to achieve or learn. Ruzgis and Grigorenko [55] highlighted that in Africa, concepts of intelligence revolve largely around skills that help to facilitate and maintain harmonious and stable intergroup relations, as these are important African values since communal living and sharing is cardinal to an African way of life. In Serpell’s research on parental ethnotheories of child development among the Chewa in Eastern Zambia, social responsibility (ku-tumikila) was highlighted as an important aspect of intelligence, subdivided into mva/mvela (attentiveness, obedience) and khulupilika/mvana (trustworthiness, cooperativeness) [46,52,56]. And in Western Kenya, parents among the Kokwet emphasized reasonable participation in family and social life as important aspects of intelligence [57]. The word ngom was applied to child intelligence and seemed to denote responsibility, highly verbal cognitive quickness, the ability to comprehend complex matters quickly, and good management of interpersonal relations. Among the Lozi elders he interviewed in Western Zambia, Simatende [6] found that their interpretations of ngana (intelligence) included the ability to survive on one’s own, respect for and obedience to elders, obedience to rules, social responsibility, ability to perform tasks beyond what is expected of one’s age, understanding and ability to listen, awareness, knowledge, swiftness to grasp ideas and cleverness.
4.1. Grounding the Design of Tests in Local Culture
Simatende [6] observed that the current instruments used to assess the intelligence of a child in Zambia, and the consequent selection for higher grades, do not capture the broader attributes of intelligence that are valuably assessed during informal education. The practice of relying on alien methods of assessment and modes of instruction disregards and discards the child’s original stepping stone into life. A more relevant set of policies and practices to guide the process of formal education would recognize and take up everything the child has learned and the methods of instruction that have built up the child, and continue to nurture the learner’s development from there.
Moreover, it is widely recognized in educational policy circles that the focus of “high stakes” public examinations tends to exert a powerful “backwash” effect on teaching practices, as teachers seek to prepare their students for success in the drastic selection processes that determine who gets an opportunity for further education. In Kenya, the Government Ministry of Education took some courageous steps in response to this problem in their 1973–1980 examinations reform program. As Wasanga and Somerset ([58], pp. 387–388) observed in a long-term appraisal of the goals and achievements of the program, the reforms were broadly directed towards “four goals:
- (1) Relevance: The examination should test skills relevant to the future lives of all candidates; those who would leave school after the primary cycle as well as those who would continue to the secondary level.
- (2) Equity: To the maximum extent possible, the examination should be fair to all candidates; in particular, the content of the questions should not give further advantage to pupils from already-advantaged home and/or school backgrounds.
- (3) Predictive validity: The examination should aim to identify, as efficiently as possible, the pupils who would make best use of scarce secondary school places.
- (4) Quality: The examination should aim to enhance the quality of education offered in the primary schools; and to reduce the range of quality differences among schools and localities.”
Towards these goals, two major instruments of reform were employed: changes in examination content to widen the range of competencies tested, and widespread dissemination of two types of feedback on examination performance:
- (a) “performance-order listings at two levels of aggregation: first, the district-level list, ranking all districts within the country according to their mean scores; and then within each district, school-level lists, similarly ranking all schools within the district”, and
- (b) “Guidance feedback was provided through an annual examination newsletter, discussing concepts and skills pupils had found particularly difficult in the previous year, and suggesting pedagogical approaches teachers might take to strengthen learning.”
Wasanga and Somerset [58] draw a number of thought-provoking lessons from Kenya’s experiment in using “examinations as an instrument for strengthening pedagogy”, many of which center on possible misconstruals and manipulative distortions of the kind of response the reformed examinations set out to elicit, as the published feedback on performance was interpreted and used by teachers, administrators and educational publishers. The pedagogical goals at which the Kenyan reform program was aimed included promotion of higher-order thinking skills, creative rather than descriptive writing, and writing fluency and imagination, as well as accuracy. In some instances, there is evidence that attention to promotion of these educational outcomes increased to some degree at certain schools. On the other hand, attempts were also observed to train students to use “packaged” routines to score well on the new type of examination questions. Similar, and perhaps even greater, risks are likely to be attached to implementation of Simatende’s recommendations below with respect to promoting reflective depth and social responsibility.
Simatende recommends that the Examinations Council of Zambia, in collaboration with scholars at the University of Zambia, develop assessment tools that take into account the broader scope of intelligence. These instruments would take into consideration the sociocultural context of the learners. This implies that assessments are not restricted to the activities of formal education or schooling. Examples here would be:
- Assigning and involving parents or official guardians to make assessments and reports of socio-economic progress by a child during informal education (time before school and during school holidays);
- Involving the immediate community of neighbors for a progressive assessment of a child’s interactive activities with others in the community;
- Involving other pupils of the same class to make a social cognitive assessment of their interaction with the child being assessed.
These, among other forms of assessment, will bring to the fore the other attributes of intelligence that are upheld and assessed during the informal learning that happens before formal education and during the school holidays.
In addition, within the existing standardized examination format, Simatende recommends certain changes in the nature of the problem situations presented to the candidate for solution. Grounding his reasoning in the contextualist perspective advanced by Sternberg [59] and others, Simatende argues that it is difficult to measure adaptive and effective behavior without embedding it in a cultural setting [14]. Moreover, if the content of items on an intelligence test is familiar to members of one cultural group in a society but not familiar to another, the second group will perform much worse than the first group, reflecting a cultural bias in the testing procedure. In this regard therefore, the test cannot be construed as reflecting a legitimate sense of the nation’s intelligence. Situational problems used in the test papers should be familiar to all cultures. At the same time, familiar situational problems should call for familiar solutions. Besides familiarity of situational problems and solutions, situational problems should be relevant and practically necessary to the society. In order for the test materials to achieve this, their design must reflect an understanding of geographic and traditional cultural distinctions that impact on the ways of life and values of various major groups in the country. These variations range from rice farming and fishing culture to formal preschool culture that exposes children to schooling long before those of other cultures get exposed to it. Similarly, problem solutions being sought in the test papers should be practically relevant to all sections of society where the test is applied.
Another way of making the national secondary school selection examination more responsive to the Zambian context would be to take account of the many extrinsic variables that can impact on a candidate’s performance. These include ecological factors such as seasonal rainfall that place demands on agricultural activity, the mainstay of most rural families, and individual circumstances such as illness or bereavement. Performance on a single, once-off assessment is more vulnerable to distortion by extrinsic factors than performance on multiple assessments spread out over the course of the year. The collection of data relevant to a student’s intelligence and motivation relative to opportunities for further education could be more accurate and just if it were spread over several moments in time and synthesized with some kind of points-accumulation system.
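The vulnerability of a once-off assessment can be illustrated with a small numerical sketch. Everything in it is hypothetical: the scores are invented for one imaginary candidate, and simple equal-weight averaging stands in for "some kind of points-accumulation system", which the proposal above does not specify.

```python
from statistics import mean


def accumulated_points(scores: list[float]) -> float:
    """Combine several sittings into one figure by simple averaging.

    Equal weighting is an illustrative assumption, not a scheme
    recommended in the text.
    """
    if not scores:
        raise ValueError("at least one assessment score is required")
    return mean(scores)


# Hypothetical percentage scores for one candidate over a school year.
# In the second list the third sitting is depressed by an extrinsic
# factor such as illness or peak agricultural demands at home.
usual_sittings = [62.0, 65.0, 64.0, 63.0]
disrupted_sittings = [62.0, 65.0, 30.0, 63.0]

# Once-off examination: the whole result rests on the disrupted sitting.
single_shot_loss = usual_sittings[2] - disrupted_sittings[2]

# Accumulated result: the same disruption is diluted across four sittings.
accumulated_loss = (
    accumulated_points(usual_sittings) - accumulated_points(disrupted_sittings)
)

print(f"Loss on a once-off exam: {single_shot_loss:.1f} points")
print(f"Loss on accumulated score: {accumulated_loss:.1f} points")
```

With four equally weighted sittings, the same disruption costs the candidate a quarter as many points as it would on a single examination. A real scheme would also have to decide how to weight sittings and how to handle missed ones.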
4.2. Affirmative Action in Favour of Disadvantaged Groups
The University of Zambia (UNZA) was founded in 1965 as one of the priority projects of the movement for political independence from Britain, the colonial power that had sadly neglected post-basic education for Africans. An oft-quoted declaration by the first Chancellor, Zambia’s founding Republican President Kenneth Kaunda, on the occasion of opening the University Library in 1967 is recorded on a plaque: “funds for the construction of this library were raised from contributions by citizens from all over the nation”. Admission to the university in the first decade of its existence was free of charge and based solely on applicants’ academic achievements, and that decade saw students from families of all socio-economic strata admitted and graduating.
When the first author of this paper rejoined the University in 2003 as Vice-Chancellor, following an extended spell of work abroad, he was struck by signs of class stratification within the student body, and by a conspicuous number of students from elite families. Invoking the collective memory of the academic faculty, many of whom had graduated from UNZA in the 1970s–1990s, he invited the University Senate to consider introducing an affirmative action program to enable more secondary-schooled candidates from underserved rural communities to access tertiary education. Part of the evidence considered by the Senate was as follows (The data presented here were extracted from unpublished records maintained by the Vice-Chancellor’s Office in 2003–2006, for presentation at the 2014 conference of the Association for Educational Assessment in Africa (AEAA) [60]). A preponderance of offers of admission to the university in 2005 were made to candidates applying from the three most heavily urbanized provinces (35% in Lusaka, 19% Copperbelt and 16% Southern Province), shares considerably higher than those provinces’ proportional representation in the nation’s population (14% in Lusaka, 16% Copperbelt and 12% Southern Province), whereas only 31% of the offers were made to candidates applying from the other six provinces, which were predominantly rural and home to 58% of the national population.
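The scale of this imbalance is easier to see when each region's share of offers is divided by its share of the population. The sketch below uses only the percentages quoted above; the ratio itself and the names `representation_ratio`, `offers_pct` and `population_pct` are illustrative choices, not measures reported in the Senate evidence. The offer percentages sum to 101% because of rounding in the source figures.

```python
# Percentages reported in the text for 2005 admission offers and population.
offers_pct = {"Lusaka": 35, "Copperbelt": 19, "Southern": 16, "Other six provinces": 31}
population_pct = {"Lusaka": 14, "Copperbelt": 16, "Southern": 12, "Other six provinces": 58}


def representation_ratio(offer_share: float, population_share: float) -> float:
    """Share of offers divided by share of population.

    A value above 1 means the region received more offers than its
    population share alone would predict; below 1 means fewer.
    """
    return offer_share / population_share


ratios = {
    region: round(representation_ratio(offers_pct[region], population_pct[region]), 2)
    for region in offers_pct
}

for region, ratio in ratios.items():
    print(f"{region}: {ratio}")
```

On these figures Lusaka received two and a half times the offers its population share would predict, while the six predominantly rural provinces together received roughly half.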
In an effort to correct this imbalance, the Senate adopted an affirmative action strategy informed by several underlying assumptions. Performance on public examinations is a joint function of cognitive aptitudes, learning opportunities, and motivation. The distribution of cognitive aptitudes sufficient to tackle and benefit from higher education is similar across the populations of all of Zambia’s provinces. Provinces showing lower rates of success on public examinations offer school learners a weaker set of learning opportunities than provinces showing higher rates of success (qualifications of school teachers; school facilities; parents’ education). Students admitted with lower grades on Grade 12 examinations are not necessarily at a great disadvantage in university courses, because the challenges posed by university courses are different from those posed by secondary school courses. Students admitted with a given level of certified educational achievement against a background of relatively weak learning opportunities are likely to apply relatively greater effort in their first year of studies, leading to equal or better performance in university courses as compared with students from more privileged backgrounds.
In 2005, UNZA had the capacity to admit only one in seven qualified applicants. Admission was therefore highly competitive, with some degree program quotas admitting only applicants with less than 10 points in Grade 12 exam performance spread over five acceptable subjects (lower totals indicating better performance), as against the minimum admission criterion of 25 points. Some secondary schools in rural provinces were already achieving high admission rates to UNZA and did not need any affirmative action: for example, more than 10 applicants from each of those schools qualified for an offer of admission on a national competitive basis. Schools were therefore selected for affirmative action only if none, or only one or two, of their applicants had qualified for an admission offer on the competitive basis. Applicants from these schools were offered admission only if they had achieved the basic minimum admission criterion. Within that range, each selected school was offered admission for its three top-scoring applicants, totaling 750 admission offers over three years.
In the first cohort, 64% of those offered admission on this basis were traced as candidates for mid-year examinations in their first year of enrolment, and 80% of those 220 candidates achieved a clear pass in all courses, while 27% achieved an average grade of B or better. In Years 2 and 3, 166 students in this cohort sat for mid-year exams (including some who had qualified competitively at the end of Year 1 for entry into the selective programs in Law, Engineering or Medicine). Seventy-five percent of those 166 candidates achieved a clear pass in all courses, while 29% achieved an average grade of B or better. These academic success rates were not significantly different from those of students admitted on the normal competitive basis without affirmative action [60].
In the USA, Davidson and Lewis [61] reported on the experience of twenty years of affirmative action admissions to the Medical School of the University of California at Davis. Their study followed the career paths of a sample of about 350 men and women admitted under affirmative action “special considerations” in their early twenties between 1968 and 1981, and compared them with a sample of matched cohort controls who entered the School at a similar age over the same period without any affirmative action. At entry, the study sample had an average GPA of 3.06 on a 4-point scale and an average score of 544 on the standardized MCAT test administered by the Princeton Educational Testing Service (ETS), significantly lower than the averages of 3.50 GPA and 613 MCAT scored by the control group. During the course of their university studies, the academic grades obtained by the study sample were consistently slightly lower than those of the control group, but after graduation the assessment of their performance by Residency Directors was not significantly different. The two groups went on to a similar range of career paths, most of them in private practice, and expressed similar levels of satisfaction with their choice of career. Moreover, the group admitted under affirmative action expressed significantly greater overall satisfaction with their life.
Reflecting on Zambia’s national policy of affirmative action in favor of girls’ admission to secondary schooling, Serpell and Folotiya ([62], p. 94) noted that:
“According to leading American authorities on educational and psychological test development, validity is ‘the evidence for inferences made about a test score’ [63]. It thus includes inferences grounded in professional and administrative practice as well as any grounded in theoretical analyses of the constructs that inform the test. Validity is not an inherent property of a test, but a property of its use for a given purpose: so we must always ask the question: ‘Valid for what?’… If girls require more instruction to reach a certain level of performance in the Zambian high school curriculum, the validation criteria for a selection test may need to include the impact those selected for high school could have on the development of the next generation of the nation’s human resources.”
Likewise, any extra effort required to enable university students coming from a weaker basic and secondary educational background to complete their degree programs may be justified in terms of longer-term benefits to Zambian society. For instance, high schools in under-resourced rural Districts struggle, as do primary and basic schools [46,47], to maintain credibility in the local communities from which they recruit their intake if few, if any, of their graduates qualify for admission to the national university. A positive backwash of the affirmative action admissions program at UNZA may be to generate greater community support for education of the next generation.
5. Conclusions
The studies reported in this article have considered several different strategies for enhancing the responsiveness of assessment practices to the socio-cultural and politico-economic context in which they operate: regarding culture as a womb, adaptation of foreign tests to match the learning opportunities of the local population; invoking culture as a language, design and validation of new assessment schemes within the framework of the cultural values and practices of learners’ communities of origin, as well as criteria derived from curriculum objectives; and addressing culture as a forum, affirmative action in application of a fixed assessment scheme. The shift from taking account of culture as a set of environmental constraints and opportunities for learning (a womb) to engaging with culture as a medium of communication among participants (a language) has been construed by some researchers as implying a radical cultural relativism [63]. But, as reflexive participants in Zambian society, we have preferred to adopt a stance of engaged perspectivism [64,65], acknowledging the diversity of perspectives brought to the topic by various stakeholder groups and seeking ways to negotiate common ground in a dynamic system where all participants have agency.
Paying attention to context in the design and application of intelligence tests is not really an option: it is a methodological necessity for any claim to the appropriateness of assessment based on the results of the tests. The decision by some researchers and practitioners to eschew or play down the importance of such attention to context has been motivated by two types of consideration: either practical convenience (systematically adapting tests in response to context, and even more so generating new tests in context, typically involves a great deal of work and the associated costs of time and effort are considerable), or a philosophical preoccupation with universalizability. In our view, the search for universals in this domain is no more likely to be successful than in the domain of aesthetics. Few people versed in the visual arts or music across different cultures around the world would accept that any single evaluative yardstick can meaningfully be applied to the comparison of two works of art or music created in different cultural media. We can admire such works each within its cultural genre, but we cannot compare them and conclude that one is more beautiful than the other. Likewise, in the domain of human intelligence, we can only conduct meaningful evaluation of how intelligent any given individual (or his/her behavior) is within the cultural context to which the behavior is adaptive.
Research reviewed in this article has shown that systematically designed measures of cognitive function can be used judiciously to guide socially significant processes of assessment of intellectual functions in early childhood, monitoring initial literacy acquisition in middle childhood, and selection for admission to secondary and tertiary education, in each case enabling greater precision and reliability than the informal hunches that parents, teachers and policymakers often apply in those domains.
The claim that formal educational assessment brings added value to society rests on the premise that standardization of an instrument serves to secure its reliability and validity. From the perspective of cultural-developmental psychology, this logic needs to be complemented by attention to the frame of reference for establishing norms. If educational assessment is to respond appropriately to the challenges of progressive social change in a given society, the technical instruments deployed must be critically examined for their ecological validity and sociocultural relevance. Tests standardized abroad should only be applied after careful evaluation and adaptation to local circumstances. National policymakers and researchers should collaborate on the systematic development of assessment instruments and practices that are responsive to the ecocultural demands of local contexts.
At the outset of this paper, we posited the need for a theoretical account of how the construct of intelligence relates to learning and how opportunities for learning are distributed through educational policy. The construct of intelligence is rooted in an adverbial quality of behavior [66] with moral implications [67]: a feature of how a person acts that is valued for its social productivity. In agrarian subsistence economies, technological innovation and social cooperation have powerful value that commands respect. A person with nzelu in Chewa culture, with mano in Bemba culture, or with ngana in Lozi culture is considered worthy of respect because of the way she acts in situations that make sense in those cultures.
The range of situations in which Zambians are called upon to act has expanded over the course of the past century from the demands of an agrarian subsistence economy to include coexistence, communication and cooperation with people of other cultural groups, as well as appropriation and deployment for problem-solving of the cultural resources of literacy, mathematics and modern science. But the disposition to use one’s knowledge and skills in socially responsible ways remains an enduring condition for the respect conferred by those who describe a person as intelligent. Educational practices that fail to recognize this socially responsible dimension of intelligence fall short of the expectations of many Africans responsible for family socialization or for formulating public policy.
Thus the evaluation of behavior that informs the assessment of intellectual functions in early childhood, the monitoring of literacy acquisition in middle childhood and the selection of candidates for admission to further education will be more adaptive in contemporary Zambian society if it includes the dimensions of reflection and social responsibility. By doing so, the assessment of intelligence will resonate with and extend the hunches of Zambian parents, educators and policymakers. Moreover, it will contribute to social progress that harnesses individual human cognition in ways that benefit society, contributes to constructive bridging between rural and urban ways of life, and affords opportunities for the best minds to play a part in national leadership.