An essential function of scientific inquiry is recognizing patterns that connect discrete pieces of data and give new meaning to the investigation [70]. Connecting pieces of evidence in a synthesis is analogous to assembling the pieces of a jigsaw puzzle into a coherent big picture. The graphical abstract in Figure 2 uses the jigsaw puzzle analogy to illustrate how current, prior, qualitative, and quantitative knowledge from various domains are pieced together during knowledge synthesis. This section of the article provides an elementary explanation of the relationships by which research concepts are logically combined during synthesis into new knowledge. The strengths and weaknesses of the links between concepts can strengthen or weaken one’s synthesis. I provide further examples from my web-based syntheses in the field of epidemiology and discuss the strengths and limitations of three common types of linked relationships: association, causation, and mediation.
An association is a link between two items or variables. Sometimes the link is coincidental or spurious and occurs strictly by chance, and sometimes the link is meaningful and occurs with a higher probability than chance alone would produce. Flipping a coin demonstrates a coincidental association. The outcome of heads or tails is strictly a matter of chance, assuming you are using a fair coin. Even after 99 heads in a row, the chance that the next flip will turn up tails does not increase; it remains 50%. Betting that there is a meaningful association between the results of earlier coin flips and the chance that heads or tails will turn up next is known as the gambler’s fallacy [71].
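The independence of coin flips can be checked with a short calculation. The following is a minimal sketch in Python, not part of the original analysis; the function name `prob_tails_after_heads_streak` is hypothetical, and the code assumes a fair coin with independent flips, so that every sequence of flips is equally likely. It enumerates all sequences rather than simulating them, which keeps the result exact but limits it to short streaks (a streak of 99 would be far too large to enumerate).

```python
from fractions import Fraction
from itertools import product


def prob_tails_after_heads_streak(streak):
    """Exact chance of tails on the flip that follows `streak` heads in a row.

    Assumes a fair coin with independent flips, so all sequences of
    length streak + 1 are equally likely.
    """
    sequences = product("HT", repeat=streak + 1)
    # Keep only the sequences that begin with the required run of heads.
    after_streak = [s for s in sequences if all(flip == "H" for flip in s[:streak])]
    # Of those, count the sequences whose final flip is tails.
    tails_next = [s for s in after_streak if s[-1] == "T"]
    return Fraction(len(tails_next), len(after_streak))


for streak in (1, 5, 10):
    print(streak, prob_tails_after_heads_streak(streak))  # prints 1/2 each time
```

Under these assumptions the probability is one half regardless of the length of the preceding streak, which is why the gambler's expectation of a "due" outcome is a fallacy.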
When associating concepts in a synthesis, the aim is to form a relationship in which the variables are meaningfully associated: as one variable changes, the associated variable also changes to some degree, which is known as covariance. But this does not necessarily mean that one variable is causing the other to change. Let us say the average person gets two colds a year and also takes vitamin C supplements. You do not take vitamin C supplements, and you observe that you get more colds than the average person. You suspect that taking vitamin C supplements may be associated with a reduced number of colds per year. But even if you are able to confirm that these two variables are statistically associated, as claimed in advertising using results of a clinical study, you still may not know what is causing the association between the variables. Perhaps people who take vitamin supplements look after their overall health better than you do, which could be a confounding factor that independently causes the same outcome of reduced colds. This brings us to the next section.
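The vitamin C scenario can be made concrete with a small worked calculation. The sketch below is illustrative only: every probability in it is an invented assumption, not data from any study, and the names (`joint`, `p_few_colds`) are hypothetical. It builds a world in which health-conscious behaviour (H) makes people both more likely to take supplements (S) and more likely to have few colds (C), while supplements themselves have no effect on colds.

```python
from fractions import Fraction

# Assumed, purely illustrative probabilities (not taken from any study).
P_H = {1: Fraction(1, 2), 0: Fraction(1, 2)}            # P(H = h)
P_S_GIVEN_H = {1: Fraction(4, 5), 0: Fraction(1, 5)}    # P(S = 1 | H = h)
P_C_GIVEN_H = {1: Fraction(7, 10), 0: Fraction(3, 10)}  # P(C = 1 | H = h), same for any S


def joint(h, s, c):
    """Joint probability P(H=h, S=s, C=c); C depends on H only, never on S."""
    p_s = P_S_GIVEN_H[h] if s else 1 - P_S_GIVEN_H[h]
    p_c = P_C_GIVEN_H[h] if c else 1 - P_C_GIVEN_H[h]
    return P_H[h] * p_s * p_c


def p_few_colds(s, h=None):
    """P(C=1 | S=s), or P(C=1 | S=s, H=h) when a stratum h is given."""
    strata = (0, 1) if h is None else (h,)
    numerator = sum(joint(hh, s, 1) for hh in strata)
    denominator = sum(joint(hh, s, c) for hh in strata for c in (0, 1))
    return numerator / denominator


# Crude comparison: supplement takers versus non-takers, ignoring H.
crude_diff = p_few_colds(1) - p_few_colds(0)
# Stratified comparison: the same contrast within each level of H.
stratified_diffs = [p_few_colds(1, h) - p_few_colds(0, h) for h in (0, 1)]

print(crude_diff)        # 6/25
print(stratified_diffs)  # [Fraction(0, 1), Fraction(0, 1)]
```

With these assumed numbers, supplement takers appear to have few colds more often (a difference of 6/25), yet the difference vanishes within each stratum of health-conscious behaviour. The crude association is produced entirely by the confounder.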
A mediator is a variable that lies between other variables within a direct causal pathway to an outcome variable. A directed acyclic graph (DAG) may be used to visually represent direct causal pathways between variables [73]. Acyclic means that no causal pathway loops back to the variable from which it started. Figure 3 is a DAG that shows a simple mediator causation pathway, based on Baron and Kenny (1986) [74]. Note that confounders and effect modifiers lie outside the causation pathway in this model. For example, the independent variable may be replaced with an associated independent variable, a confounder, that causes the same effect on the outcome variable. An effect modifier may also change the outcome variable at the end of the causation pathway, as in the modifying effect of age and gender on the association of a risk factor with a disease. When inferring causation during a synthesis, expert knowledge of the subject matter under investigation enables the identification of potential confounding factors [75], effect modifiers, and mediators. Research designs that include participant randomization and stratification of results can also help control the effects of confounders and effect modifiers.
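The structure of a simple mediator pathway can also be expressed as data. The following Python sketch is an illustration under stated assumptions, not a reproduction of Figure 3: the node names and the placement of the confounder's arrows are my own choices for the example, and the helper names (`is_acyclic`, `descendants`, `mediators`) are hypothetical. It uses the standard-library `graphlib` module (Python 3.9 or later) to confirm that the graph is acyclic and then lists the variables lying on a directed path from exposure to outcome.

```python
from graphlib import CycleError, TopologicalSorter


def is_acyclic(edges):
    """Return True if the directed graph given as (cause, effect) pairs has no cycle."""
    predecessors = {}
    for cause, effect in edges:
        predecessors.setdefault(cause, set())
        predecessors.setdefault(effect, set()).add(cause)
    try:
        TopologicalSorter(predecessors).prepare()  # raises CycleError on a cycle
        return True
    except CycleError:
        return False


def descendants(edges, start):
    """All nodes reachable from `start` by following arrows forward."""
    children = {}
    for cause, effect in edges:
        children.setdefault(cause, set()).add(effect)
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for child in children.get(node, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def mediators(edges, exposure, outcome):
    """Nodes that are both effects of the exposure and causes of the outcome."""
    reversed_edges = [(effect, cause) for cause, effect in edges]
    downstream = descendants(edges, exposure)
    upstream = descendants(reversed_edges, outcome)
    return (downstream & upstream) - {exposure, outcome}


# Assumed example graph: exposure -> mediator -> outcome, plus a confounder
# that affects both the exposure and the outcome from outside the pathway.
edges = [
    ("exposure", "mediator"),
    ("mediator", "outcome"),
    ("confounder", "exposure"),
    ("confounder", "outcome"),
]

print(is_acyclic(edges))                                # True
print(mediators(edges, "exposure", "outcome"))          # {'mediator'}
print(is_acyclic(edges + [("outcome", "exposure")]))    # False
```

In this sketch the confounder is not reported as a mediator because it is not downstream of the exposure, which mirrors the point that confounders lie outside the causation pathway.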
As causal diagrams have developed, indirect links between variables may be represented by a dotted line, and double-headed solid arrows may represent linked variables with an unspecified common cause [76]. I have also combined double-headed arrows with a dotted line to represent variables linked indirectly with an unspecified common cause. When conducting a synthesis, the researcher may infer the mediating common cause that indirectly links two variables. To illustrate, low vitamin D levels in patients have been associated with a higher risk of cancer incidence [77]. Based on this association, some researchers have proposed that taking vitamin D supplements may prevent cancer, but recently published clinical trials of vitamin D supplements and cancer prevention do not support this causal inference [78]. Having coauthored a textbook chapter on the endocrine regulation of phosphate homeostasis [81], I have background knowledge of vitamin D’s role in regulating intestinal absorption of dietary phosphorus, i.e., vitamin D levels are lowered if serum phosphorus levels rise too high, as in clinical and subclinical hyperphosphatemia. Synthesizing the link between lowered vitamin D and hyperphosphatemia with the link between hyperphosphatemia and tumorigenesis [61], I proposed that hyperphosphatemia is a common cause that mediates an indirect association between lowered vitamin D levels and increased cancer risk [69].
When selecting information during knowledge synthesis, conflicting material helps identify areas requiring further in-depth investigation. As demonstrated in the above example of vitamin D supplementation and cancer prevention, missing factors may thwart the synthesis of a truer overall picture. To illustrate, in the allegory of the six blind men and the elephant, each blind man examined a different part of the elephant by touch: the tail, trunk, tusk, ear, leg, and side. Each man inferred a different description of the nature of an elephant as being like a rope, snake, spear, fan, tree, and wall, respectively. Although their tactile observations were accurate, the men were unable to discover the true overall nature of an elephant because they did not synthesize their findings into new knowledge.
Mediation is also used in literature-based discovery, a synthesis method in which implicit knowledge is discovered by linking together separate bodies of literature [82]. For example, if concept A is related to concept B in one body of literature, and a separate body of literature relates the same concept B to concept C, transitive inference relates A to C, as shown in Figure 4. In this example, B acts as a potential mediator that causally links two separate bodies of literature in a novel way to infer new knowledge.
I used transitive inference to propose an explanatory theory of how cholesterol oxidation products (COPs) are causally linked to atherosclerosis [83]. I synthesized concepts from one body of research relating COPs (A) to defects in arterial cell membranes (B) with another body of research relating defects in arterial cell membranes (B) to atherosclerosis (C). In this case, defects in arterial cell membranes (B) acted as a mediator that linked COPs with atherosclerosis. This synthesis helped fill some theoretical knowledge gaps in the potential cause and mechanism of atherosclerosis. It also strengthened the evidence for dietary prevention of atherosclerosis by avoiding COPs in thermally treated and processed animal-based foods that contain cholesterol.
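The A, B, C pattern of transitive inference can be sketched in a few lines of Python. This is a simplified illustration, not the procedure used in the cited syntheses: the function name `infer_indirect_links` is hypothetical, each body of literature is reduced to a set of (concept, concept) pairs, and the "unrelated" entries are placeholders added only to show that concepts without a shared B term are not linked.

```python
def infer_indirect_links(literature_ab, literature_bc):
    """Propose A-C links through every B concept shared by both literatures.

    Returns a dict mapping (A, C) to the set of mediating B concepts.
    The results are hypotheses to investigate, not established causal links.
    """
    hypotheses = {}
    for a, b in literature_ab:
        for b_other, c in literature_bc:
            if b == b_other and a != c:
                hypotheses.setdefault((a, c), set()).add(b)
    return hypotheses


# First body of literature: A related to B.
literature_ab = {
    ("cholesterol oxidation products", "arterial cell membrane defects"),
}
# Second body of literature: B related to C, plus an unrelated placeholder pair.
literature_bc = {
    ("arterial cell membrane defects", "atherosclerosis"),
    ("unrelated concept", "unrelated outcome"),
}

for (a, c), shared in infer_indirect_links(literature_ab, literature_bc).items():
    print(f"{a} -> {c} via {sorted(shared)}")
```

The sketch only proposes candidate links. Whether a shared concept B truly mediates a causal relationship between A and C still depends on subject-matter knowledge and the strength of each underlying association.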