Geospatiotemporal and Causal Inferential Study of European Epidemiological Patterns of Cannabis- and Substance-Related Congenital Orofacial Anomalies

Introduction. Since high rates of congenital anomalies (CAs), including facial CAs (FCAs), causally attributed to antenatal and community cannabis use have been reported in several recent series, it was of interest to examine this subject in detail in Europe. Methods. CA data were taken from the EUROCAT database. Drug exposure data were downloaded from the European Monitoring Centre for Drugs and Drug Addiction (EMCDDA). Income was taken from the World Bank’s online sources. Results. On the bivariate maps of both orofacial clefts and holoprosencephaly against resin, the Δ9-tetrahydrocannabinol concentration rates of both covariates increased together in France, Bulgaria, and the Netherlands. In the bivariate analysis, the anomalies could be ranked by the minimum E-value (mEV) as congenital glaucoma > congenital cataract > choanal atresia > cleft lip ± cleft palate > holoprosencephaly > orofacial clefts > ear, face, and neck anomalies. When nations with increasing daily use were compared to those without, the former had generally higher rates of FCAs (p = 0.0281). In the inverse probability weighted panel regression, the sequence of anomalies—orofacial clefts, anotia, congenital cataract, and holoprosencephaly—had positive and significant cannabis coefficients of p = 2.65 × 10−5, 1.04 × 10−8, 5.88 × 10−16, and 3.21 × 10−13, respectively. In the geospatial regression, the same series of FCAs had positive and significant regression terms for cannabis of p = 8.86 × 10−9, 0.0011, 3.36 × 10−8, and 0.0015, respectively. Some 25/28 (89.3%) E-value estimates and 14/28 (50%) mEVs were >9 (considered to be in the high range), and 100% of both were >1.25 (understood to be in the causal range). Conclusion. Rising cannabis use is associated with all the FCAs and fulfils the epidemiological criteria for causality. The data indicate particular concerns relating to brain development and exponential genotoxic dose-responses, urging caution with regard to community cannabinoid penetration.


Introduction
Previous reports from Hawaii, Colorado, and the USA in general [1][2][3][4] demonstrate the close link between community prenatal cannabis exposure and congenital anomalies (CAs) affecting the orofacial region (FCAs). The first study to identify FCAs in human populations was conducted in Hawaii [1]. In that study, both cleft palate (O.R. = 14.73, 95%C.I. 3.98-38.23) and cleft lip ± cleft palate (8.19, 2.22-21.13) were found to be linked to prenatal cannabis exposure. In Colorado, choanal atresia was found to be related to cannabis use [2]. In the USA, microtia/anotia, holoprosencephaly, and cleft palate alone were determined to be related to ∆9-tetrahydrocannabinol (THC) exposure [5]. For these reasons, we were keen to study these anomalies in the very important European datasets.
Orofacial congenital anomalies are some of the best-known anomalies and also some of the most obvious. Beyond this, however, they are of importance because the organizer of the face is in bidirectional communication with the organizer of the forebrain developmentally, and anomalies of the face are often associated with disorders of thinking and intellectual development [6,7]. Moreover, both are controlled overall by the gradients of the major embryonic morphogen sonic hedgehog coming from the ventral surface of the embryo, and the interruption of or interference with these gradients has been shown to lead to major defects in the development of both the face and brain [6]. This important feature increases the importance and overall relevance of the present study significantly.
It is worth noting that the eye actually develops as a composite outgrowth from both the face and the forebrain. The neural elements of the retina and optic nerve come from the forebrain outgrowth, while the more anterior parts of the eye are derived from the face. Eye anomalies as a group are considered in a companion paper, along with central nervous anomalies. Disorders of the anterior part of the eye are considered in this section.
Sonic hedgehog is a major morphogen controlling face and brain formation. It is also a major morphogen for the eyes, teeth, and nasal protuberances. It is, therefore, of great relevance to this section to note that both THC and cannabidiol have been shown to inhibit sonic hedgehog both directly [7] and via epigenetic mechanisms [8]. Other key embryological morphogens are similarly inhibited by cannabinoids, and these are discussed further in the Discussion section.
The indirect mitochondrial metabolic pathways are also important for cannabinoids, as the mitochondria supply both energy and substrates, maintain a delicate mitonuclear balance, and have mitohormetic input to nuclear metabolism, which when disrupted can perturb major genomic and epigenomic energy-dependent reactions. Moreover, the mitochondria supply most of the epigenetic substrates required for epigenetic reactions. Since many cannabinoids interfere with mitochondria metabolism, this necessarily implies downstream disruption of the genomic and epigenomic homeostatic mechanisms.
One of the major features of laboratory studies of cannabinoid genotoxicity is the exponential effects of higher doses [31][32][33][34][35][36][37]. This remark applies equally to direct genotoxicity assays [7,12,[31][32][33][35][36][37][38][39][40] and also to studies of the inhibition of mitochondrial metabolism [34,[41][42][43][44][45], which in turn provide the organic basis of epigenomic reactions from both energy and substrate supply. Moreover, this result has been borne out in several recent epidemiological studies, which confirm this exponential effect at higher dose ranges in field studies of human populations. Since Europe, like many other places, has recently experienced a rise in cannabis use prevalence, cannabis use daily intensity, and cannabinoid THC potency, it seems that with all three trends operative in the direction of increased cannabis exposure, Europe has recently become subject to greatly increased cannabis exposure across whole populations [46,47]. This effect will be exacerbated by the long half-life of cannabinoids in the fat stores of the adipose tissue, brain, and gonads.
Moreover, teratological considerations indicate that cannabinoids are entering the food chain in parts of France, where many acres are cropped with cannabis and calves are being born without legs, as are human babies. In fact, the odds ratio recently reported in one press release indicated a 60-fold increase in French babies born limbless [48][49][50], which actually lies within the confidence interval of the original Hawaiian report on this issue (4.45-65.63) [1]. Such reports highlight the concerns relating to the inevitable collision between rising community cannabinoid exposure and the cannabinoid genotoxic dose-response curve.
It was shown long ago that many cannabinoids (including cannabidiol, cannabinol, cannabichromene, cannabicyclol, ∆8-, ∆9and ∆11-tetrahydrocannabinol, and their 11hydroxymetabolites) are genotoxic and that the genotoxic activity of these compounds relates to their central biphenolic ring, which is known as olivetol [17]. The structure of olivetol is that of a central dihydroxylated benzene ring with a short aliphatic tail. Benzene is a well-established classical mutagen, teratogen, and carcinogen, whose activities have been widely recognized for many decades [51].
The present study set out to determine the extent to which orofacial congenital anomalies may be related to exposure to cannabis and other substances in the social environment at the national level across Europe. The analysis plan was to employ both bivariate and multivariable techniques to examine these relationships, and to do so within formal quantitative causal inferential and geospatiotemporal frameworks. The advent of the massive EURO-CAT congenital anomaly database [55], along with recent epidemiological explorations of the European experience of cannabis exposure [56], greatly facilitated this endeavor.

Data
Data on all the available congenital anomaly rates were downloaded for each individual year for the 14 nations from the European Network of Population-Based Registries for the Epidemiological Surveillance of Congenital Anomalies (EUROCAT) website [55] and then analyzed. The total congenital anomaly rate from the EUROCAT includes the anomaly rates for live births, stillbirths, and cases where early termination due to an anomaly was employed all combined together, meaning that it represents a total overall rate across all classes of births. The nations selected were chosen based on the availability of their congenital anomaly data for most of the years 2010-2019. Data on tobacco (as the percent of daily tobacco use prevalence) and alcohol (as the liters of pure alcohol consumed per capita annually) use for each country were sourced from the World Health Organization [57]. Data on drug use for cannabis, amphetamines, and cocaine was derived from the European Monitoring Centre for Drugs and Drug Addiction (EMCDDA) online database [58]. The last-month cannabis use data were also supplemented by information on the THC content of cannabis herb and resin samples published in recent reports [47], which itself was originally derived from the EMCDDA [47]. The median household income data (in $USD) were taken from online World Bank sources [59].

National Assignment
The nations were categorized into two groups as being either high and rising daily cannabis use or low and/or falling daily cannabis use based on the categorization in a recent detailed European epidemiological study (see Figure S4 in [47]). In this way, Belgium, Croatia, the Netherlands, France, Germany, Italy, Norway, Portugal, and Spain were categorized as nations experiencing increasing daily use, while Hungary, Bulgaria, Finland, Poland, and Sweden were categorized as nations experiencing low or falling levels of daily cannabis use.

Derived Data
As several metrics of cannabis exposure were available, it was possible to calculate various derived metrics. Thus, the last-month cannabis use prevalence data were multiplied by the THC content of cannabis herb and resin samples to derive a compound metric from their product. These metrics were then multiplied by the imputed daily cannabis use prevalence rates to derive comprehensive additional compound metrics for both cannabis resin and herb.

Data Imputation
Linear interpolation was used to complete any missing datasets. This was particularly relevant for the daily cannabis use data. A total of 59 data points on daily cannabis use were directly available from the EMCDDA for the 14 nations over the study period. The use of linear interpolation allowed for the expansion of this dataset to 129 data points (further details and documentation are provided in the Results section). Data concerning the THC concentration of cannabis resin were unavailable for Sweden. It was, however, noted that the resin-to-herb THC concentration was almost constant at 17.7 in each year in nearby Norway. This ratio was, therefore, applied to the Swedish cannabis herb THC concentration data to derive estimates of the Swedish cannabis resin THC concentration. Similarly, data concerning the cannabis resin THC concentration in Poland were not available. The annual ratio of the THC concentration of cannabis resin to the THC concentration of cannabis herb in Germany was, therefore, used to estimate the resin THC content in Poland from the known herb THC concentrations in Poland. As geospatial analytical techniques do not allow for missing data, the dataset was completed by the last observation carried forward or backwards for Croatia in 2018 and 2019 and the Netherlands in 2010. It was not appropriate to use multiple imputation methods for this dataset, as multiple datasets cannot be used as simultaneous chained inputs for panel or spatial multivariable regression techniques, and moreover, published model pooling techniques are not available for these models.

Statistics
The gathered data were processed in R Studio version 1.4.1717 based on R version 4.1.1, which was obtained from the Comprehensive R Archive Network and the R Foundation for Statistical Computing [60]. The analysis was conducted in December 2021. The data were manipulated and rearranged using dplyr from the tidyverse from the R Core development team [61]. The data were log transformed as appropriate to improve the compliance with the normality assumptions based on the results of the Shapiro-Wilk test. Two-dimensional graphs were drawn using ggplot2 from the tidyverse. Maps were made using ggplot2 and sf (simple features) [62]. Custom color palettes and palettes taken from the viridisLite and viridis packages were used for the color fills [63]. The R package colorplaner was used to compose the bivariate maps [64]. None of the illustrations have been previously published and all are original. The linear regression was conducted in Base R. The mixed effects regression was performed using the R package nlme [65]. The classical technique of the serial deletion of the least significant term was used for the reduction of all the multivariable models to yield a final reduced model, which is the model presented in the following tables. Multiple linear models were processed simultaneously using the combined coordinated techniques from the R packages purrr and broom, as is well described [61,66,67]. In multi-variate interactive models, the effect of any particular covariate may not be immediately apparent. The effect of an individual covariate can be calculated from such models and is known as the marginal effect. In this study, this was calculated in R using the package margins [68].

Covariate Selection
The presence of multiple different metrics for cannabis exposure created a problem in terms of the analysis, as it was unclear which was the most appropriate combination of metrics to employ for a particular multivariable model. The indiscriminate use of excessive covariates would unnecessarily consume degrees of freedom and complicate the analysis, thereby restricting the ability to assess important effects. This issue was addressed directly through the use of a random forest regression conducted using the R package ranger [69], and the variable importance was formally assessed at the same time using the R package vip (variable importance plot) [70]. The most highly predictive covariates from this process were then entered into the regression modelling equations. The tables from this analysis are presented in the Results section.

Panel and Geospatial Analysis
The panel analysis was conducted using the R package plm [71] across both space and time simultaneously utilizing the "twoways" effect. The spatial weights matrix was calculated using the edge and corner "queen" relationships (by analogy with chess) utilizing the R package spdep (spatial dependency) [72]. The geospatial modelling was conducted using the spatial panel random effects maximum likelihood (spreml) function from the R package spml, which is an advanced function that allows detailed correction of highly anomalous spatial model error structures [73,74]. Such models may produce up to four model error coefficients, which are useful in determining the optimal error structure of the model. These coefficients are phi (the random error effect), rho (the spatial coefficient), psi (the serial correlation effect), and theta (the spatial autocorrelation coefficient). The most appropriate error structure was chosen for each spatial model, taking care to preserve the model error specification between closely related models. The backwards methods obtained via reduction from the full general model to the most specific model were used to determine the most appropriate error structure, as has been previously descried [75]. Both the panel and geospatial models were temporally lagged, as indicated in the Results section, by one to two years.

Causal Inference
The tools of formal causal inference were deployed in this analysis. Inverse probability weighting (ipw) is the technique of choice and converts an observational study into a pseudo-randomized study from which it has been convincingly shown to be appropriate to draw causal inferences [76]. All the multivariate panel models presented herein were inverse probability weighted. The inverse probability weighting was conducted using the R package ipw [76]. Similarly, E-values (expected values) provide a quantitative estimate of the correlation required for some hypothetical unmeasured confounder covariates with both the exposure of concern and the outcome of interest in order to explain some apparently causal relationships [77][78][79]. It is thus a very useful tool for use in a sensitivity analysis and provides a quantitative measure of the robustness of the model to extraneous covariates that have not been included within the matrix of the chosen independent covariates. E-values also have a confidence interval, and the 95% lower bound of this confidence interval is of particular importance and is widely reported in this study. E-value estimates greater than 1.25 are usually interpreted as implying causality [80], and E-values greater than nine are described in the literature as being high [81]. The R package EValue was used to calculate the E-values [82]. Both the E-values and inverse probability weighting are foundational and pivotal techniques used in quantitative formal causal inferential methods, allow causal relationships to be assessed from real-world observational studies, and essentially and powerfully transform the analytical paradigm.

Data Availability
The raw datasets, including 3800 lines of computation code in R, have been made freely available through the Mendeley data repository. They can be located at the following URLs: https://doi.org/10.17632/tysn37t426.1 and https://doi.org/10.17632/jjhpfxz5m7.1.

Ethics
Ethical approval for this study was provided by the Human Research Ethics Committee of the University of Western Australia (number RA/4/20/4724) on 24 September 2021.

Results
Table S1 sets out the basic background data for the study population. The table lists the 14 nations contributing data to the study and the 9 CAs in this group. It also provides drug use data, including summary statistics on the compound metrics of cannabis exposure, in addition to median household income data.
During embryology, the eye forms as a confluence of tissues from both the face and the developing forebrain. The facial tissues contribute the tissues at the front of the eye, and the retina and neural components grow out as protuberances from the forebrain. Hence, disorders of the anterior eye are included in this section, while disorders of the eye as a whole and the posterior eye tissues are addressed in a companion manuscript, which addresses disorders of the central nervous system. Table S2 sets out the daily cannabis use data, which was notably incomplete. A total of 59 data points were identified in this group of nations. The data were completed by means of linear interpolation, as indicated in Table S3, with the addition of a further 70 points to make 129 points for this covariate overall. Figure 1 shows the regression lines for each of the anomalies with tobacco, alcohol, cannabis herb THC concentration, amphetamine, and cocaine. The trend lines for tobacco are either flat or have a negative slope. The trend lines for alcohol are mostly flat, although two are weakly positive. The trend lines for amphetamine exposure are either flat or have slopes in both the positive and negative directions. For the cannabis herb THC concentration, while some regression lines are flat, those for anotia, choanal atresia, cleft lip and/or palate, holoprosencephaly, and orofacial clefts appear to be rising significantly. Figure 2 illustrates the lines of best fit for the various metrics of cannabis exposure. The regression lines for the cannabis resin THC concentration and orofacial clefts and holoprosencephaly appear to be strongly positive. The daily cannabis use interpolated appears to be strongly associated with choanal atresia, congenital cataract, congenital glaucoma, and holoprosencephaly. The trend lines for the compound cannabis metrics are as indicated.                Figure S2 shows the rate of the compound cannabis exposure metric last-month cannabis use x cannabis resin THC concentration. Temporal rises in Spain, France, Italy, the Netherlands, Belgium, and Bulgaria are noted. Figure 6 is a bivariate plot of the relationship between orofacial clefts and the cannabis herb THC concentration. In this plot, the green shading represents areas where both covariates are low, while the pink or purple shading shows areas where both are high, as shown in the colorplane in the key. France, the Netherlands, Belgium, Norway, and Bulgaria are noted to be pink or purple at times. Figure 7 is a similar bivariate graphical map of the holoprosencephaly against cannabis herb THC concentration. Germany, France, and Bulgaria are noted to be pink or purple at different times.
When the holoprosencephaly rate is charted against the compound metric of lastmonth cannabis use x resin THC concentration x daily use interpolated, the appearances shown in Figure 8 are generated. It is clear from this figure that the area of France emerges as pink, signifying that both rates are elevated.
nabis use x cannabis resin THC concentration. Temporal rises in Spain, France, Italy, the Netherlands, Belgium, and Bulgaria are noted. Figure 6 is a bivariate plot of the relationship between orofacial clefts and the cannabis herb THC concentration. In this plot, the green shading represents areas where both covariates are low, while the pink or purple shading shows areas where both are high, as shown in the colorplane in the key. France, the Netherlands, Belgium, Norway, and Bulgaria are noted to be pink or purple at times.   When the holoprosencephaly rate is charted against the compound metric of lastmonth cannabis use x resin THC concentration x daily use interpolated, the appearances shown in Figure 8 are generated. It is clear from this figure that the area of France emerges as pink, signifying that both rates are elevated.  When the holoprosencephaly rate is charted against the compound metric of lastmonth cannabis use x resin THC concentration x daily use interpolated, the appearances shown in Figure 8 are generated. It is clear from this figure that the area of France emerges as pink, signifying that both rates are elevated.  When the holoprosencephaly rate is charted against the cannabis resin THC concentration, France, the Netherlands, and Bulgaria are noted to turn pink or purple across the decade ( Figure 9). When anotia is charted against the cannabis herb THC concentration, Germany and Belgium are noted to turn purple at times ( Figure 10). When choanal atresia is charted against the cannabis herb THC concentration ( Figure S3), Germany and Bulgaria are noted to be pink or purple at times. J. Xenobiot. 2022, 12, FOR PEER REVIEW 13 When the holoprosencephaly rate is charted against the cannabis resin THC concentration, France, the Netherlands, and Bulgaria are noted to turn pink or purple across the decade ( Figure 9). When anotia is charted against the cannabis herb THC concentration, Germany and Belgium are noted to turn purple at times ( Figure 10). When choanal atresia is charted against the cannabis herb THC concentration ( Figure S3), Germany and Bulgaria are noted to be pink or purple at times.     A particular concern relates to high-intensity daily use [8,18,46,[83][84][85]. For this purpose, it is of interest to divide the nations into those where daily cannabis use is high or increasing and those where it is low or declining. As described in the Methods section, this was done based on recent published reports (see eFigure 4 in [47]). When this was done, the results charted in Figure 11 were found. In the upper scatterplot, across time the nations with increasing cannabis daily use appear to be a little higher than the others. This is made explicit in the lower boxplot graph, which aggregates the data over time, where the notches on the two box plots obviously do not overlap and thus indicate a statistically significant difference in the rates (linear regression: β-est. = 0.1768, t = 2.2, p = 0.0281; model F = 4.83, dF = 1, 1066, p = 0.0281; t-test: t = 2.12, df = 4741.6, p = 0.0346). However, when the data are separated by the anomaly type, no apparent difference in the two groups of nations is apparent ( Figure 12). Table S4 lists the slopes of the various bivariate regression lines shown in Figures 1 and 2. The 39 slopes that are positive and statistically significant are extracted and shown in Table 1. They are listed in terms of the descending minimum E-values (mEV). A total of 33 of the listed terms include metrics of cannabis exposure, 5 relate to cocaine, and 1 to alcohol consumption. The E-value quantifies the strength of the association with both the exposure of interest and the outcome of concern to obviate some apparently causal effect. The E-value is a key metric in causal inference. F = 4.83, dF = 1, 1066, p = 0.0281; t-test: t = 2.12, df = 4741.6, p = 0.0346). However, when the data are separated by the anomaly type, no apparent difference in the two groups of nations is apparent ( Figure 12). Table S4 lists the slopes of the various bivariate regression lines shown in Figures 1  and 2. The 39 slopes that are positive and statistically significant are extracted and shown in Table 1. They are listed in terms of the descending minimum E-values (mEV). A total of 33 of the listed terms include metrics of cannabis exposure, 5 relate to cocaine, and 1 to alcohol consumption. The E-value quantifies the strength of the association with both the exposure of interest and the outcome of concern to obviate some apparently causal effect. The E-value is a key metric in causal inference.     By analogy with the genomic literature in which volcano plots are used to chart the negative log of the p-value against the fold-change in the gene expression, it is possible to chart the negative log p-value of these bivariate regression coefficients against the foldchange quantified as the E-value. Figure 13 does this for the negative log of the p-value against the E-value estimate. In this figure, congenital glaucoma, congenital cataract, and orofacial clefts figure highly in the plot.    Having made these important bivariate observations, the next issue is the nature of the results considered in the multivariable regression. However, given the large number of cannabis exposure metrics, the variable selection for these regression equations is far from straightforward.
For these reasons, a random forest regression conducted in the R package ranger was used together with the variable importance package to create tables listing the relative importance of the 13 possible covariates of interest. Four variable importance tables are listed in Tables S5-S8 for the four orofacial congenital anomalies of particular interest. These were orofacial clefts, anotia/microtia, congenital cataracts, and holoprosencephaly. The reasons for choosing these particular CAs are explained in the Discussion section. Table S9 lists the final models from the four multivariable panel regression equations for orofacial clefts shown for one each of an additive and interactive model and models lagged by one and two years. All of these models are inverse probability weighted, which transfers them from a purely observational context into a pseudo-randomized context from which one is able to meaningfully draw causal inferences. Inverse probability weighting is the technique of choice for such data transformations within the discipline of casual inference. It is noted that in each model, terms such as cannabis exposure metrics survive the model reduction, have positive regression coefficients, and are statistically significant. Having made these important bivariate observations, the next issue is the nature of the results considered in the multivariable regression. However, given the large number of cannabis exposure metrics, the variable selection for these regression equations is far from straightforward.
For these reasons, a random forest regression conducted in the R package ranger was used together with the variable importance package to create tables listing the relative importance of the 13 possible covariates of interest. Four variable importance tables are listed in Tables S5-S8 for the four orofacial congenital anomalies of particular interest. These were orofacial clefts, anotia/microtia, congenital cataracts, and holoprosencephaly. The reasons for choosing these particular CAs are explained in the Discussion section. Table S9 lists the final models from the four multivariable panel regression equations for orofacial clefts shown for one each of an additive and interactive model and models lagged by one and two years. All of these models are inverse probability weighted, which transfers them from a purely observational context into a pseudo-randomized context from which one is able to meaningfully draw causal inferences. Inverse probability weighting is the technique of choice for such data transformations within the discipline of casual inference. It is noted that in each model, terms such as cannabis exposure metrics survive the model reduction, have positive regression coefficients, and are statistically significant.
Tables S10-S12 list similar models for anotia/microtia, congenital cataract, and holoprosencephaly. Similar observations can be made in each of these cases. One notes that in all the models presented in this section, terms such as metrics of cannabis exposure persist after the model reduction, are positive, and are statistically (often highly) significant.
The next issue relates to the question of what would happen if these data were considered in their formal space-time relationships. It is possible to do this formally in a way that accounts for issues which are potentially serious in data analysis, such as serial correlation, random effects, spatial error correlation, and spatial autocorrelation in the patterns of data expression. Figure S1 shows the initial, edited, and final geospatial links between the nations, which became the basis from which the sparse spatial weights matrix was derived. Table 2 shows the final geospatial regression models for the additive, interactive, and lagged models for orofacial clefts. One notes from this table that, once again, terms such as cannabis survive the model reduction, are positive, and are statistically highly significant. The same pattern is repeated in Tables 3-5 for anotia/microtia, congenital cataract, and holoprosencephaly.
These various regression parameters can each be associated with the E-values. The Evalues applicable to the panel and geospatial regression models are listed in Tables 6 and 7, respectively. These tables may be combined and re-presented as a complete single list, as shown in Table 8. Table 9 presents the list of 28 E-value estimates and mEVs in descending order of mEV. In this table, the link between each E-value estimate and mEV has been broken so that both lists appear in straight descending order. From Table 9, it is apparent that 25/28 (89.3%) E-value estimates are greater than 9 and so fall into the high response zone [81], and that all 28 (100%) are greater than the 1.25 cut-off for causality [80]. In terms of the mEVs, it is noted that 14/28 (50%) are greater than 9, while all 28 (100%) are greater than the 1.25 threshold indicative of a potentially causal relationship.  Abbreviations: See Table 1. In geospatial models, the rho is the spatial coefficient and the lambda is the spatial autocorrelation coefficient. *-Interaction.  Abbreviations: See Tables 1 and 3. *-Interaction.     Table key: Note that both lists of E-value estimates and lower bounds are presented in descending order. This implies that the paired relationship between the values has been broken. Table 8 is then re-presented according to the order of congenital anomalies in Table 10. Table 10 is then summarized for comparative purposes as Table 11, which lists various summary statistics for the E-values in descending order of the median mEV, which coincides with the order of the median E-value estimate. Hence, the order of association seen in this table is anotia > congenital cataract > orofacial clefts > holoprosencephaly.
The data from Table 8 is presented graphically in Figure 15. Here, it is noted that congenital cataract, holoprosencephaly, anotia/microtia, and orofacial clefts are all high scoring, with points seen at the periphery of the data cloud. Figure 16 presents a similar plot for the minimum E-values, with qualitatively similar results.
Table S13 re-lists the data from Table 8 ordered by the regression terms. As indicated, the regression terms can be simplified and grouped by their primary covariate into daily cannabis use interpolated, herb THC concentration, or resin THC concentration. The descriptive statistics from these data can then be summarized and ordered by descending mEV, as indicated in Table 12, where it is seen that when one considers the median E-value estimate and mEV, the order of potency is daily use interpolated > cannabis herb THC concentration > cannabis resin THC concentration.
These terms may be quantitatively compared using the Wilcoxon test, as shown in Table 12. As shown in this table, only the comparison between the daily use and the THC concentration of cannabis resin for the E-value estimate is significant (W = 36, p = 0.0115).

Main Results
The main result of the present analysis was that seven of the nine anomalies in this group could be significantly related based on the bivariate analysis with the metrics of cannabis exposure and that all four of the FCAs chosen for further multivariable analysis could be related to the metrics of cannabis exposure on inverse probability weighted panel and geospatial modelling. Thus, overall, eight of nine FCAs were relatable to cannabis exposure, with the sole exception being the group ear, face, and neck anomalies.

Detailed Review of Results
According to the bivariate analysis, tobacco was not obviously related to any of the FCAs. Alcohol was related to choanal atresia; ear, face, and neck anomalies; and HPE. The cannabis herb THC concentration was related to choanal atresia and orofacial clefts, holoprosencephaly, and cleft lip ± palate. The cannabis resin THC concentration was related to cleft lip ± palate, choanal atresia, holoprosencephaly, and orofacial clefts.
The maps of the FCA incidence indicated that the anotia rates deteriorated in France and Germany; holoprosencephaly deteriorated in Spain, France, German, Bulgaria, and Norway; and choanal atresia became worse in German, France, Spain, Italy, and Hungary. The bivariate maps of orofacial clefts against the cannabis resin THC concentration showed that both covariates increased together in Norway, France, the Netherlands, and Bulgaria. When holoprosencephaly was mapped against the cannabis resin THC concentration, both covariates increased together in the Netherlands, France, and Bulgaria. When holoprosencephaly was mapped against the cannabis herb THC concentration, both covariates increased simultaneously in France, Germany, and Bulgaria. When anotia was mapped against the cannabis herb THC concentration, both covariates increased together in Belgium, the Netherlands, and Germany. When choanal atresia was mapped against the cannabis herb THC concentration, both covariates increased together in Belgium, Spain, and Germany.
When the nations with increasing daily use were compared to those without, the former group had generally higher rates of FCAs (p = 0.0281, Figure 11). The daily cannabis interpolated was the most significant covariate in the bivariate regression ( Table 1). The most powerfully related FCAs in the bivariate volcano plots were congenital glaucoma, congenital cataract, and orofacial clefts (Figures 13 and 14). From Table 1, the FCAs may be ranked in order by their median mEV as congenital glaucoma (median mEV = 95.71) > congenital cataract (6.09) > choanal atresia (5.53) > cleft lip ± cleft palate (4.58) > holoprosencephaly and arhinencephaly (3.85) > orofacial clefts (3.26) > ear, face, and neck anomalies (1.07). Anotia and cleft palate alone do not feature in this list.
Four anomalies were selected for detailed multivariable study. According to the inverse probability weighted multivariable panel regression for the sequence of anomalies, orofacial clefts, anotia, congenital cataract, and holoprosencephaly had positive and significant cannabis coefficients ranging from p = 2.65 × 10 −5 , 1.04 × 10 −8 , 5.88 × 10 −16 and 3.21 × 10 −13 . According to the geospatial multivariable regression, this same series of FCAs had terms for cannabis that were positive and significant from p = 8.86 × 10 −9 , 00011, 3.36 × 10 −8 and 0.0015. The most powerfully related on the multivariable volcano plots were congenital glaucoma, congenital cataract, and orofacial clefts (Figures 15 and 16). Some 25/28 (89.3%) E-value estimates and 14/28 (50%) minimum E-values exceeded 9 and thus fell in the high range [81]. Moreover, 100% of the E-value estimates and mEVs fell above the 1.25 cut-off for causal relationships [80]. Considering the median mEVs, the strength of the relationship with cannabis was anotia > congenital cataract > orofacial clefts > holoprosencephaly. Considering the mEV, the explanatory power of the cannabis metrics daily cannabis use > herb THC concentration > resin THC concentration, although these differences were (mostly) not significant.

Choice of Anomalies
Some explanation why the various anomalies chosen for further multivariable analysis were selected is of interest. Cleft lip ± cleft palate was chosen because it is the bestknown and amongst the commonest anomalies in this group, and it is often a highly visible anomaly. Anotia was chosen because this FCA has been found to be associated with cannabis exposure in Hawaii, Colorado, the USA, and Australia [1][2][3]5]. It was, therefore, naturally of interest to see how it would perform in the European dataset. Congenital cataract was chosen because it showed large bivariate effects (Figures 1 and 2). Holoprosencephaly was chosen because it showed large bivariate effects, and as it has recently been shown to be due to sonic hedgehog inhibition interfering with the face and forebrain morphogenesis, it was of prognostic interest not only for facial but also for neurological development [6,7].

Qualitative Causal Inference
In 1965, the great English epidemiologist A.B. Hill set out what have become the accepted qualitative criteria for demonstrating causality in an associative relationship [86]. His nine criteria were strength of association, consistency amongst studies, specificity, temporality, coherence with known data, biological plausibility, dose response curve, analogy with similar situations elsewhere, and experimental confirmation. It is clear from the above remarks that the present results fulfill all of these criteria.

Quantitative Causal Inference
One of the major issues faced by observational studies is the non-comparability of experimental groups. The analytical technique of choice with which to address this is inverse probability weighting. This technique has been applied to all the panel models shown in this report. In interpreting our results, it is important to appreciate that this technique moves this report from the merely associational into a pseudo-randomized paradigm from which is it entirely appropriate to make truly causal inferences.
Another issue faced by observational studies is that some uncontrolled confounding variable may exist that explain away the reported apparently causal relationships. The use of the E-value (or "Expected value") quantifies the degree of association demanded of such a hypothetical extraneous confounder covariate with both the exposure of concern and the outcome of interest in order to obviate the results reported. Since 50% of our minimum E-values were in the high range and 71.4% were in the moderate range (greater than 5), we are confident that this is not a major issue for the results in our study.
For these reasons, we feel convinced that the present results also fulfill the criteria for quantitative formal causal inference, further lending weight to the study findings.

Morphogen Gradients
The centrality of the major embryonic morphogen gradients to the normal patterning and organogenic growth of embryonic development was alluded to in the Introduction. In addition to their effect on sonic hedgehog, cannabinoids have also been reported to interfere with fibroblast growth factor [87,88], retinoic acid [89][90][91], and bone morphogenetic proteins [92][93][94], amongst others. Together, such disruptions place inordinate stress on the normal embryogenic pathways, sequences, and patterning.

Epigenomic Impacts on the Genes of Facial Developmental Cannabinoid-Induced Epigenomic Perturbations
There is increasing recent concern at the epigenetic impacts and, therefore, transgenerational effects of many cannabinoids [54,95,96]. Recent serial whole epigenome screening has identified many sets of functional annotations identified in the Ingenuity Pathway Analysis of human sperm, including particularly the following, which are of relevance to the morphogenesis of facial structures [8]: Head development (47 genes, page 304, p = 0.00012). Development of sensory organs (29 genes, page 306, p = 0.000164).
Only one gene was identified with palatal gene expression, namely TRPS1 (page 358, p = 0.00701). Mutations of TRPS1 cause trichorhinopharyngeal (TRPS1) syndrome, which includes a rounded bulbous nose and a long flat upper lip and dental anomalies. Some patients also have heart, kidney, and genitourinary system anomalies.
Only a single functional annotation was identified relevant to nasal development, which involved three genes: BMP4, CHD7, and GLI3 (3 genes, page 343, p = 0.00108). Bone morphogenetic protein 4 (BMP4) is the natural antagonist to sonic hedgehog (shh), and both are involved in eye and ear formation. GLI3 (Gli family zinc finger 3) is part of the hedgehog signaling pathway. CHD7 (chromodomain helicase DNA binding protein 7) is a DNA-binding protein.
The abnormal morphology of the anterior eye segment was noted (7 genes, page 318). Abnormalities of the corneal stroma were also noted (3 genes, page 323). The morphology of the lens was noted (3 genes, page 354), as was lens formation (4 genes, page 333).

Generalizability
Since this study uses a large European dataset, one of the largest in the world, and has produced results closely in line with those found elsewhere, we feel that the results produced in the present study are robust to external generalization. Moreover, in the present work, we have been careful to adopt a framework of formal quantitative causal inference, meaning that this study moves beyond a simply observational project and becomes a pseudo-randomized analytical environment from which one can appropriately draw causal inferences. It is also noted that there are robust epigenomic aetiopathological biological explanations for the present findings, which also support the epidemiological results. In that the effects described then are causal rather than merely associational, this causal dimension also reinforces the view that these results are widely generalizable.

Strengths and Limitations
This study has a number of strengths and limitations. Its strengths include the use of one of the world's largest datasets for congenital anomalies together with the drug exposure dataset from the EMCDDA, which is unusually fulsome. We have also been careful to use inverse probability weighting for all the panel regression modelling, which makes our analyses pseudo-randomized and suitable for drawing formal causal inferences from. We have also used E-values liberally throughout this report, which further robustifies the results to external confounding. We have used multiple paneled graphs and plots to present complex time series data at a single glance, which is visually useful to validate the quantitative analyses presented. Ranger regression was used for the formal variable selection. The study's limitations include the fact that we did not have access to individual participant-level data, which is a relatively common limitation broadly applicable to many epidemiological investigations. In addition, some of the data, such as the daily cannabis use data, was incomplete and had to be interpolated. It is, therefore, important to bear this in mind when considering the results that mention daily use. Data on anomaly rates by sex were not available. This would be a useful refinement for future studies.

Conclusions
In conclusion, the present study describes how it was possible to relate eight of the nine orofacial congenital anomalies investigated herein to the metrics of cannabis exposure in either a bivariate or inverse probability weighted panel regression and geospatial modelling framework, with high levels of statistical significance and moderate to high levels of minimum E-values signifying robustness in the sensitivity analyses. As well as the usual associational regression techniques, the study uses the techniques for formal quantitative causal inference, which moves the analysis from the world of an observational study into a truly pseudo-randomized casual inferential paradigm in which casual inferences may be properly made. The study results are externally verified by other recent series, with the addition of congenital glaucoma and congenital cataract, which are two new anomalies not previously linked with cannabis exposure. While the implications of the study are concerning in terms of orofacial congenital anomalies themselves, the data are particularly concerning with regard to their implications for fetal brain development and intellectual disability, which is so often paired with anomalies of the anterior head. It is noteworthy that recent epigenomic studies of cannabis dependence and withdrawal appear to provide powerful potential and novel mechanistic explanations for these findings. The intersection of the rapidly rising community cannabis exposure with the known exponential cannabinoid genotoxic dose-response relationship is of particular concern for custodianship by this generation of the genome and epigenome of the generations to come.

Supplementary Materials:
The following supporting information can be downloaded at https: //www.mdpi.com/article/10.3390/jox13010006/s1, Table S1: Overall Study Profile; Table S2: Daily Cannabis Use-Raw Data; Table S3: Daily Cannabis Use-Interpolated Data; Table S4: All regression slopes and results from bivariate analyses; Table S5: Variable Importance Tables from Ranger  random forest regression-Anotia; Table S6: Variable Importance Tables from Ranger random forest regression-Orofacial clefts; Table S7: Variable Importance Tables from Ranger random forest  regression-Congenital cataracts; Table S8: Variable Importance Tables from Ranger random forest  regression-Holoprosencephaly; Table S9: Inverse probability weighted panel regression results-Anotia; Table S10: Inverse probability weighted panel regression results -Orofacial clefts; Table  S11: Inverse probability weighted panel regression results-Congenital cataracts; Table S12: Inverse probability weighted panel regression results-Holoprosencephaly; Table S13: Inverse probability weighted panel regression results-Holoprosencephaly; Figure S1: Sequential map-graphs of log (choanal atresia rates) across surveyed European nations over time 2010-2019; Figure S2: Sequential map-graphs of last month cannabis use x cannabis resin THC concentration across surveyed European nations over time 2010-2019; Figure S3: Bivariate Map of Log Choanal Atresia by Cannabis Herb THC Concentration across European natoins over time; Figure S4: Geospatial linked between european nations (A) edited and (B) final.
Author Contributions: A.S.R. assembled the data, designed and conducted the analyses, and wrote the first manuscript draft. G.K.H. provided technical and logistic support, co-wrote the paper, assisted with gaining ethical approval, and provided advice on the manuscript preparation and general guidance on conducting the study. A.S.R. had the idea for the article, performed the literature search, wrote the first draft, and is the guarantor for the article. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding. No funding organization played any role in the design and conduct of the study; the collection, management, analysis, and interpretation of the data; the preparation, review, or approval of the manuscript; or the decision to submit the manuscript for publication.