This essay, written by the guest editor, is an introduction to a special issue of the Journal of Intelligence devoted to methodological issues associated with studying the Flynn Effect.
It is not an unusual epistemological approach within science to observe an interesting, atypical, and/or paradoxical outcome, and then to spend considerable scientific effort attempting to explain it. The usual instantiation of the scientific model, in which an abstract hypothesis is stated and then empirically evaluated, is (at most) only half of the actual process. What is missing from this old but typical presentation of the scientific model is the answer to the question: Where did the hypothesis come from in the first place?
John Tukey spent considerable intellectual effort developing, promoting, and celebrating the value of the exploratory side of science (e.g., [1]), as distinct from the confirmatory effort that represents the usual process-oriented and hypothesis-driven version of science. Within Tukey’s formulation, science is not a straight line from the question (or hypothesis) to the answer, but rather is a complex feedback loop that requires interaction between exploratory and confirmatory processes ([2], p. 23). Further, there are both methodological and ethical requirements and constraints that distinguish the two processes.
The Flynn Effect is one of those “interesting, atypical, and/or paradoxical outcomes,” and researchers have spent “considerable scientific effort attempting to explain it.” What is the Flynn Effect? To summarize (and, of course, to oversimplify), measured IQ has been increasing at a relatively regular rate in most countries for at least a century. What is the fundamental cause? No one knows, though many researchers believe that they have solved part of the puzzle. What are the conditioning factors? These are, finally, coming into sharper focus, though also still incomplete. How has this focus emerged? How can we continue to make progress? Answering those two critical questions requires attention to methodology. What are the past, present, and future research methods that will support continued success in studying the Flynn Effect?
Importantly, virtually the whole research enterprise associated with the Flynn Effect appears scientifically backwards, as the empirical finding has been driving the theory, rather than the other way around. Tukey’s perspective offers a reasonable and insightful way to conceptualize research on the Flynn Effect, all still within a scientific framework, and to provide methodological insights. Further, the exploratory/confirmatory paradigm—as a relatively recent insightful (and perhaps radical) expansion of the standard dogma—explains some of the pitfalls associated with Flynn Effect research, suggests why progress has been so slow, and also suggests some routes to potential progress.
I will begin with a meta-discussion of past Flynn Effect research; that is, I will discuss the process of doing the research, rather than review the research itself (which I, and many others, have done repeatedly over the past 20 years). Then, I will cast that past work into the exploratory/confirmatory paradigm developed by Tukey. From that framing, a natural way to think about methodology associated with studying the Flynn Effect will emerge, and I will make that explicit. Next, I will discuss the four papers within this special issue, and how they fit into the framework. I conclude with comments about current and future methodology in relation to the Flynn Effect.
2. Doing Flynn Effect Research in the Past
In an earlier paper [3], I presented a critical assessment of Flynn Effect research. There I suggested that explanations were being proffered long before the actual phenomenon itself had been fully specified and understood. I presented ten research agendas that would inform the discussion, and set the stage to identify the underlying cause(s) of the Flynn Effect. Well over 15 years after that paper was published, only partial progress has been made. We are still attempting to identify the causes of a phenomenon that, in many ways, is unexplained.
However, there has been substantial progress, in two senses. First, certain elements of the Flynn Effect have now been elaborated and specified; several of the suggested ten research agendas have been engaged, others that I did not anticipate have been proposed and used, and the products have provided important insights. Second, some of the confusion has been formalized. This version of confusion is actually a good outcome (though its existence may at first glance appear to be a negative critique); formalized confusion identifies those areas underlying the phenomenon that do not cohere. In other words, if empirical results are disparate across cultures, or across time, then those disparities themselves provide valuable and useful evidence, pointing to some potential causes, and ruling out others. As one example, Flynn Effect patterns have diverged in interesting (though still not adequately explained) ways between Scandinavian countries and the U.S. (see, for example, [4]). What causes this divergence? As an example in the other direction, when educational reforms have waxed and waned in the U.S., the Flynn Effect patterns have often stayed on course, almost relentlessly. Does this take educational reform out of the explanatory equation? Not logically, because educational reform could be important, yet mediated by other processes that dampen the obvious effects, or other naturally occurring processes may compensate for increases and decreases in educational reform. As a third example, consider that the Flynn Effect should show pragmatic outcomes; patterns or anomalies in the real world presumably reflect the cause(s) of the effect. Howard [6] evaluated changing patterns among elite chess and bridge players for evidence of the Flynn Effect. Huebner [7] considered how the Flynn Effect should impact the development of innovations. As a reviewer noted, “Detailed treatment of the ‘criterion validity’ … is undoubtedly a highly significant, yet very much overlooked aspect of Flynn Effect research.”
To be sure, the rush of Flynn Effect researchers to quickly (too quickly?) propose a coherent causal explanation with consensual support (which at this point has still not been obtained for any specific explanation) has helped to stimulate interest and has contributed to the ultimate goal of a plausible, parsimonious, and empirically supported explanatory model. In Tukey’s language, what were intended to be confirmatory processes have often turned exploratory. The proposed explanatory models (and typically quick counter-arguments) have generated considerable scholarly exchange and discourse. Lynn (and others) published a number of articles arguing for a nutrition explanation (e.g., [8]). Flynn (and others) consistently countered those. Mingroni proposed an outbreeding (hybrid vigor) explanation (e.g., [9]). Flynn (and others) identified weaknesses in the argument (e.g., [10]). Brand [11] proposed that testing artifacts underlie the Flynn Effect, yet others have identified weaknesses in the connection between that reasoning and empirical outcomes. Dickens and Flynn [12] offered a sophisticated and fascinating social-multiplier model that accounted for many of the unexplained factors in past Flynn Effect research (also see [13]). Of course, others criticized it (including the current author, see [14]), and then others elaborated and improved the model (see the de Kort et al. paper [15] in this special issue).
The race to identify the cause of the Flynn Effect has similarities to the race to identify the structure of DNA, an account of which is presented in Watson’s The Double Helix [16]. New empirical results are being presented all the time, and the proper model(s) must accommodate past patterns and successes, and new empirical findings. Most Flynn Effect researchers have been seeking an explanatory system with a single basic cause (yet see Jensen’s “multiplicity hypothesis” [17]). Single explanatory systems are almost certainly doomed to fail. If nutrition, or educational reform, or outbreeding could fully explain the Flynn Effect, the solution would be on the table before us. Or even if it takes two or three obvious and basic explanatory processes to combine into a full explanation, again, the cause would likely be well-known.
But three qualifications are relevant. First, the single-source or small-number-of-sources type of explanation may itself be multi-faceted and quite complex. Dickens and Flynn’s [12] social multiplier model is a good example. In this case, the complexity of the small number of explanations may be the challenge, rather than the number of different explanatory components involved. Second, even two or three different sources may themselves interact in complex ways. Third, there are different explanatory levels that must be untangled. Flynn Effect researchers have suggested that children eating more nutritious food contributes to increasing intelligence, that Scandinavian social support systems may have contributed to a slowing down of the Flynn Effect in those countries, or that evolutionary processes associated with hybrid vigor are the proper mechanisms to explain the Flynn Effect. But these different explanations, existing as they do at different explanatory levels, do not necessarily conflict; all three simultaneously could be plausible explanatory factors. In fact, across different explanatory levels, some of the explanations may contain others within them as special cases. In other words, many of the previous explanatory mechanisms compete with one another, and many do not.
3. Exploratory versus Confirmatory Flynn Effect Research
The hypothetico-deductive model posits that science moves forward through testing falsifiable hypotheses. Philosophers of science refer to this process as a “top-down” approach, in which abstract ideas are tested against patterns in empirical data. Tukey [1] suggested that the hypothetico-deductive version of science may be incomplete, that an inductive process involving “bottom-up” reasoning was being ignored. In inductive reasoning, individual pieces of information are integrated into a coherent model, prior to and as critical input to the development of abstract hypotheses.
In the late 1980s, what the research community had in hand was a fascinating finding, popularized in Flynn’s original [18] Psychological Bulletin papers (though the Flynn Effect has longer historical status than it originally appeared; see [20]). Early explanatory models associated with educational processes, nutrition, and modern technological innovation (among other explanations) were quickly suggested. When hypotheses based on those theories were submitted to confirmatory analysis, they typically foundered in relation to empirical evidence. This purely hypothetico-deductive process is what I criticized in [3].
However, even as the specific mechanisms were typically failing in relation to empirical data, the hypothetico-inductive effort produced valuable information in an exploratory sense, which has been submitted back to the inductive development of explanatory models. The Flynn Effect was shown to be strongly associated with fluid intelligence processes, and less (though slightly) with crystallized intelligence ([18]). The effect, originally believed to be occurring at a virtually fixed pace, was shown in fact to vary at least somewhat across time and especially across cultures. Ways that educational patterns were explanatory of the Flynn Effect, and ways they were not, were specified. Research studies on where the Flynn Effect occurs within the distribution of intelligence were sometimes inconsistent, but revealing. Both strengths and weaknesses in the nutrition hypothesis were stated and argued. Each of these efforts provided information to an inductively derived explanatory system. Dickens and Flynn’s [12] social multiplier model is perhaps the best example of an inductively derived explanatory process that attempted to capture many of the evidentiary components that emerged from past research. Criticism of the Dickens and Flynn model [12] suggested ways in which the model was still deficient in matching parts of the body of evidence.
This introductory article is being written just over 30 years after the original Flynn Effect article was published in Flynn [18]. Enough exploratory research has been conducted (some, as noted, under the guise of confirmatory research) that we are now in a strong position to develop and test plausible explanatory models, confirmatory efforts that emerge naturally from past exploratory work. One additional component is required.
Methodological issues associated with the study of the Flynn Effect have been relatively ignored for the past 30 years. In the exciting rush to propose the ultimate explanation, few researchers have given much thought to the methodological challenges involved in even specifying the Flynn Effect, much less investigating it. My paper critical of Flynn Effect work at that time [3] was particularly focused on methodological weaknesses in past Flynn Effect work. Though early theoretical suggestions were insightful, few methodological innovations existed that shed light on the patterns. As Tukey both promoted and illustrated, methodological approaches to exploratory work are at the foundational level of scientific progress in relatively immature areas of scientific development. What do I mean by methodology?
With regard to the Flynn Effect, methodological advances are ones that use innovative designs or analytic efforts to shed light, from either an exploratory or a confirmatory perspective, on patterns that were otherwise obscure. Several domains of methodology are relevant. One involves the methodology of family studies. In much past research, family studies scholars have conflated within- and between-family variance. The result is a quagmire of often confusing and sometimes fully incorrect findings. As one example, I published a recent paper showing that past birth order researchers have consistently interpreted what were almost certainly Flynn Effect patterns as birth order patterns, exactly because of the conflation of within- and between-family variance [23]. As another example, Sundet et al. showed that changes in fertility and family composition patterns provide a coherent explanatory system to help explain the Flynn Effect. A second methodological domain involves the application of time series analysis and growth curve modeling. Few Flynn Effect studies have used such analytic methods, though they are well suited to many collected Flynn Effect datasets (see [26] for an example of a growth curve modeling exercise). More broadly, new statistical/methodological advances have seldom been found in past Flynn Effect research. Recent studies have become markedly more sophisticated, however (see, for example, [27]). Another exception is provided by two recent meta-analyses, published as this essay was being written (more on those meta-analytic results later in this essay).
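The distinction between within- and between-family variance discussed above can be illustrated with a minimal sketch. The family labels and scores below are hypothetical values invented for demonstration (they are not drawn from any study cited here), and the function name is an assumption of this illustration. The sketch shows only the basic identity that total variation splits into a between-family part and a within-family part; an analysis that ignores family membership mixes the two.

```python
from statistics import mean

# Hypothetical sibling IQ scores, grouped by an illustrative family ID.
# These values are invented for demonstration and are not real data.
FAMILIES = {
    "A": [98.0, 102.0, 106.0],
    "B": [110.0, 113.0],
    "C": [90.0, 95.0, 94.0, 101.0],
}


def decompose_sum_of_squares(families):
    """Return (total, between-family, within-family) sums of squares."""
    all_scores = [score for scores in families.values() for score in scores]
    grand_mean = mean(all_scores)

    # Total variation: every score against the grand mean.
    total_ss = sum((score - grand_mean) ** 2 for score in all_scores)

    # Between-family variation: each family mean against the grand mean,
    # weighted by the number of siblings in that family.
    between_ss = sum(
        len(scores) * (mean(scores) - grand_mean) ** 2
        for scores in families.values()
    )

    # Within-family variation: each sibling against that sibling's family mean.
    within_ss = sum(
        (score - mean(scores)) ** 2
        for scores in families.values()
        for score in scores
    )
    return total_ss, between_ss, within_ss


total_ss, between_ss, within_ss = decompose_sum_of_squares(FAMILIES)
print(f"total={total_ss:.2f} between={between_ss:.2f} within={within_ss:.2f}")
```

A design that compares siblings within the same family draws only on the within-family component, whereas a cross-sectional comparison of unrelated individuals draws on both components at once.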
The dearth of sophisticated and innovative methodology, combined with an invitation for me to be guest editor of a special issue of the Journal of Intelligence on the Flynn Effect, resulted in a call for papers inviting Flynn Effect researchers to contribute papers on methodological advances in studying the Flynn Effect (with the term “methodological” broadly construed). The result is the four papers published in this special issue. In each case, sophisticated methodological thinking is combined with a strong focus on explaining the Flynn Effect. I will briefly review those in a later section.
4. Doing Flynn Effect Research in the Future
The implications of the historical review of past Flynn Effect research, couched within a scientific epistemology, are perhaps obvious; I will, nevertheless, make those explicit. First, Flynn Effect researchers in the future need to be quite clear as to their intentions. Is your Flynn Effect paper designed to uncover new patterns, to submit inductively to the ultimate development of future explanatory systems (in the Tukey [1] sense, exploratory research)? Or does your research contain a model, an explanatory system, or even a presumed (but unconfirmed) belief about how the Flynn Effect works (in the Tukey sense, confirmatory research)? Being purposive about both efforts will move the research arena forward rapidly.
Second, if confirmatory, at what level does the explanatory system exist? What other explanatory systems compete, which ones are complementary, and which ones are able to simply co-exist? For example, improved nutrition and improved education do not compete, and may even be complementary (e.g., if some of the improvement in nutrition comes from improved nutritional support within the school). Alternatively, explanations focusing on the Flynn Effect as an artifact of testing, as a result of increased test sophistication, and as an actual improvement in intellectual performance over time overlap with one another in complex ways. Some of those ways are partly complementary, some are in direct conflict and cannot co-exist.
Third, a wide open arena, illustrated by the papers within this special issue, is the application of sophisticated methodology, both old and new, to study the Flynn Effect. Wicherts et al. provided an excellent example, presenting evidence that at least some of the items on tests demonstrating the Flynn Effect are not invariant; thus, changes in word meaning and item performance must be carefully accounted for. The Wicherts et al. finding set the stage for the application of item response theory (IRT) models (e.g., [33]).
5. Special Issue Papers
The authors of the four special issue papers take different approaches to introducing methodological innovations to the study of the Flynn Effect. I will briefly review the goal of each paper, and comment on its status within the exploratory-confirmatory framework.
Janneke M. de Kort, Conor Dolan, and their colleagues’ paper [15] is titled “Can GE-Covariance Originating in Phenotype to Environment Transmission Account for the Flynn Effect?” This paper presents an expansion of the Dickens and Flynn [12] model discussed earlier in this essay. Specifically, they focus on the genotype-environment covariance component developed in the earlier model, and its role in “amplifying increases in environmental means”, by embedding the Dickens and Flynn model within a biometrical design. They compare the fit of their model to a standard biometrical simplex model using twin data from the Netherlands Twin Registry, and find approximately equivalent fits. This work is clearly confirmatory, and supports the earlier goal of Dickens and Flynn. Past researchers have discounted the role of genes in accounting for the Flynn Effect, because the genome changes much too slowly through evolutionary mechanisms to account for short-term increases like those seen in Flynn Effect data. However, as shown by Dickens and Flynn [12], and reinforced and elaborated by de Kort and her colleagues [15], the genome can interact with the environment to produce short-term phenotypic increases in measured intelligence.
Jon Martin Sundet’s paper [34] is titled “The Flynn Effect in Families: Studies of Register Data on Norwegian Military Conscripts and Their Families”. He has contributed a number of excellent methodological treatments in the past showing how considering within-family patterns can contribute to our understanding of the Flynn Effect (some of that earlier work is reviewed above). In the current work, he develops the logic of using sibling pairs to study the Flynn Effect, and then illustrates the work with analysis of Norwegian family data from military conscripts. This work includes some exploratory components, including presentation of some new patterns. One in particular, a presentation of changing correlations between IQ scores and family size across time, provides highly useful new patterns for which Flynn Effect explanations must account. There are legitimate and unresolved disagreements among Flynn Effect researchers concerning whether the Flynn Effect patterns emerge from within-family as well as between-family differences. Sundet’s empirical results and methodological considerations contribute insight into this important question.
Michael Mingroni’s paper [35] is titled “Future Efforts in Flynn Effect Research: Balancing Reductionism with Holism”. Mingroni is the developer and primary advocate for the role of heterosis (or hybrid vigor, or outbreeding) in explaining the Flynn Effect. Like the de Kort et al. paper [15], Mingroni’s past work provides a mechanism by which genetic processes enter the explanatory arena. Heterosis suggests that as mating dyads expand from local matches to community-level to regional to national and international matches, fitness in relation to intelligence is improved through broadening of mating pools. The current paper has two goals. The first is to further develop the heterosis hypothesis. The second is to consider, in both practical and philosophical context, the application of different levels of scientific arguments. Similar in scope to my earlier Flynn Effect critique [3], though also informed by a decade-and-a-half of additional research and new ideas, Mingroni offers six research approaches to sharpen understanding of the Flynn Effect. Notably, two are suggestions for within-family research. Another important suggestion is to expand interest beyond intelligence (as Mingroni himself has done in some of his past research). His suggestions have implications for both the exploratory and confirmatory endeavors critical to progress in understanding the Flynn Effect.
Grant B. Morgan and A. Alexander Beaujean’s paper [36] is titled “An Investigation of Growth Mixture Models for Studying the Flynn Effect.” This paper applies statistical modeling methods that have seldom been used in relation to the Flynn Effect, similar to earlier efforts by Beaujean (reviewed above) and his colleagues. Growth Mixture Models (GMMs) include within the modeling effort the ability to account for heterogeneity in Flynn Effect patterns over time; the “mixtures” represent coherent groups of respondents. Morgan and Beaujean present both a simulation, showing the plausibility of using GMMs to study the Flynn Effect, and also empirical analysis using those models with intelligence data from Estonia. In the empirical analysis, they identified two separate and distinguishable change patterns. Their work provides the basis for both exploratory and confirmatory work. GMMs can be used to evaluate postulated change patterns in confirmatory analyses, and also to identify different change patterns in an exploratory sense.
The Flynn Effect has fascinated a large body of researchers since the mid-1980s. After 30 years, we know a great deal about the effect, from both confirmatory analyses and exploratory work (much of which began as confirmatory). However, agreement on its cause or causes has been elusive. As this introductory essay was being developed, a large-scale meta-analysis was published in Perspectives on Psychological Science [37]. This meta-analysis follows by only a few months another similar effort published in Psychological Bulletin [38]. Both meta-analyses validate the approximate average pace of the Flynn Effect (frequently referred to as 3 IQ points per decade, approximately 1/5th of a standard deviation; the usual estimate is likely slightly too high). The Pietschnig and Voracek study [37], in particular, evaluated a number of the details. It validated that the effect is much higher in the fluid than the crystallized domain, confirmed that there is disagreement about the location of the Flynn Effect within the ability distribution, and showed a differential pace of the Flynn Effect across the past century. Their Table 2 evaluated 12 categories of causes. Some support was found for each, and some lack of support was found for each, except that the social multiplier hypothesis and a category called “life history speed” had no negative results identified. The “life history speed” explanation, recently proposed and developed by Woodley [39] and Woodley et al., uses several different explanatory processes including education, nutrition, and fertility. This theory appears similar to Jensen’s [17] “multiplicity hypothesis,” except with more underlying theoretical direction. Within the context of the current essay, these meta-analyses both provide strong exploratory evidence, to support the further development of inductively generated theories. They also document an increasing amount of successful confirmatory analysis.
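The conversion between the commonly quoted pace of 3 IQ points per decade and roughly one-fifth of a standard deviation can be checked with a short calculation. The sketch below assumes the conventional IQ scaling with a standard deviation of 15 points; that constant and the function name are assumptions of this illustration, and the 3-point rate is the usual approximate figure, which, as noted above, is likely slightly too high.

```python
# Assumed conventional IQ scaling: standard deviation of 15 points.
IQ_SD = 15.0


def gain_in_sd_units(points_per_decade, decades=1.0, sd=IQ_SD):
    """Convert a constant per-decade IQ gain into standard deviation units."""
    return points_per_decade * decades / sd


# One decade at the commonly quoted pace of 3 points per decade.
per_decade_sd = gain_in_sd_units(3.0)

# Three decades at the same pace, assuming the rate stays constant
# (a simplification, since the pace has varied across time and cultures).
thirty_year_sd = gain_in_sd_units(3.0, decades=3.0)

print(f"per decade: {per_decade_sd:.2f} SD; over 30 years: {thirty_year_sd:.2f} SD")
```

The constant-rate extrapolation is only a convenience for illustration; the meta-analytic results summarized above indicate that the pace has differed across periods and domains.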
Finally, the field appears poised for real confirmatory analysis, after 30 years, based on efforts integrating the past findings into coherent theoretical approaches. The science underlying the Flynn Effect—in particular, the search for its elusive cause(s)—will be driven in exciting directions by recent and new empirical results and theory. The methodology associated with the development and testing of new theories will provide exciting findings, just as the four papers presented within this special issue make important contributions in both exploratory and confirmatory domains to our understanding of the Flynn Effect.