Patterns of Cannabis- and Substance-Related Congenital General Anomalies in Europe: A Geospatiotemporal and Causal Inferential Study

Introduction: Recent series of congenital anomaly (CA) rates (CARs) have showed the close and epidemiologically causal relationship of cannabis exposure to many CARs. We investigated these trends in Europe where similar trends have occurred. Methods: CARs from EUROCAT. Drug use from European Monitoring Centre for Drugs and Drug Addiction. Income data from World Bank. Results: CARs were higher in countries with increasing daily use overall (p = 9.99 × 10−14, minimum E-value (mEV) = 2.09) and especially for maternal infections, situs inversus, teratogenic syndromes and VACTERL syndrome (p = 1.49 × 10−15, mEV = 3.04). In inverse probability weighted panel regression models the series of anomalies: all anomalies, VACTERL, foetal alcohol syndrome, situs inversus (SI), lateralization (L), and teratogenic syndromes (TS; AAVFASSILTS) had cannabis metric p-values from: p < 2.2 × 10−16, 1.52 × 10−12, 1.44 × 10−13, 1.88 × 10−7, 7.39 × 10−6 and <2.2 × 10−16. In a series of spatiotemporal models this anomaly series had cannabis metric p-values from: 8.96 × 10−6, 6.56 × 10−6, 0.0004, 0.0019, 0.0006, 5.65 × 10−5. Considering E-values, the cannabis effect size order was VACTERL > situs inversus > teratogenic syndromes > FAS > lateralization syndromes > all anomalies. 50/64 (78.1%) E-value estimates and 42/64 (65.6%) mEVs > 9. Daily cannabis use was the strongest predictor for all anomalies. Conclusion: Data confirmed laboratory, preclinical and recent epidemiological studies from Canada, Australia, Hawaii, Colorado and USA for teratological links between cannabis exposure and AAVFASSILTS anomalies, fulfilled epidemiological criteria for causality and underscored importance of cannabis teratogenicity. VACTERL data are consistent with causation via cannabis-induced Sonic Hedgehog inhibition. TS data suggest cannabinoid contribution. SI&L data are consistent with results for cardiovascular CAs. Overall, these data show that cannabis is linked across space and time and in a manner which fulfills epidemiological criteria for causality not only with many CAs, but with several multiorgan teratologic syndromes. The major clinical implication of these results is that access to cannabinoids should be tightly restricted in the interests of safeguarding the community’s genetic heritage to protect and preserve coming generations, as is done for all other major genotoxins.

It is pertinent to observe that cannabinoid teratogenicity is a subset of disorders within the broader class of cannabinoid genotoxicity which also includes cannabinoid-related carcinogenicity and cannabinoid-related cellular aging. These are other closely-related subjects which meaningfully inform the present discussion of cannabinoid teratogenicity.
Recent reports also demonstrate clearly that both Europe and North America are experiencing a triple confluence of, in the first instance, a rising prevalence of cannabis use; secondly, a rising intensity of daily cannabis use; and thirdly, a rising cannabinoid potency for most phytocannabinoids in commercially-available products [21][22][23]23]. This suggests that street-level cannabinoid exposure has increased substantially under the influence of this triple confluence. This is of major concern given the proliferation of cellular and molecular research demonstrating the exponential dose-response effects of cannabinoid genotoxicity and epigenotoxicity, including cannabis-induced disruptions of mitochondrial and cellular metabolism which directly control chromosomal, genomic and epigenomic processes generally [24][25][26][27][28][29][30][31][32][33][33][34][35]. Moreover, several recent epidemiological studies have verified the predicted abrupt jump in anomaly rates observed following the highest levels of cannabis exposure [2,7,8,14]. Thus, community epidemiological data confirms laboratory findings of exponential cannabis effects [24][25][26][27][28][29][30][31][32][33][33][34][35]. This in turn implies that due to the relatively sudden rises in cannabis use prevalence, intensity and potency, a relatively abrupt rise in major teratologic, mutagenic and genotoxic presentations may be expected as major public health outcomes. Indeed, there are already indications of similar abrupt increases in anomaly rates both in the USA and Europe [21][22][23]23].
A group of eleven anomalies including all anomalies, conjoint twins, foetal alcohol syndrome (FAS), skeletal dysplasias, teratogenic syndromes, amniotic bands, lateralization anomalies, maternal infections causing malformations, situs inversus, valproate syndrome, and VACTERL syndrome were studied. While the group of "all anomalies" is evidently of crucial interest, a number of anomalies provided significant insight into the effects of cannabis exposure. VACTERL syndrome is an acronym for vertebral defects, anorectal anomalies, cardiac defects, tracheoesophageal fistula/oesophageal atresia, renal anomalies and limb anomalies [36]. It was recently shown to be due to inhibition of one of the major embryonic morphogens, sonic hedgehog, in a manner which could be induced by the cannabinoids ∆9-tetrahydrocannabinol (THC) and cannabidiol both directly [34] and via epigenomic pathways [37]. Moreover, it seemed to contain within just one syndrome most of the multisystem disorders which have recently been attributed to cannabis exposure. Furthermore, a demonstrated causal relationship with the multisystem VACTERL syndrome would completely belie the typically harmless characterization of cannabis in popular culture. Given that the cluster of "teratogenic syndromes" were clearly in need of some aetiopathological explanation, and given the diverse nature of cannabinoid teratology, we wanted to learn if perhaps some of the variance seen in this disorder might potentially be epidemiologically explained by cannabinoid exposure. Foetal alcohol syndrome (FAS) has been described as being caused in part by epigenomic signalling through the cannabinoid type 1 receptor (CB1R). Moreover, cannabis and alcohol are often co-abused and each has been reported as performing a gateway role for the other [38][39][40][41][42][43][44][45][46]. This made the outcome of this analysis of great interest. Cannabis has recently been reported as being associated with many cardiovascular anomalies, including transposition of the great arteries [7,14,47], which can be considered as a limited cardiovascular manifestation of lateralization syndromes or situs inversus. Taken together, these disorders would argue for a general disruption of left-right lateralization mechanisms generally throughout body morphogenesis.
This study was performed to assess if any of the CA disorders in this group demonstrated a relationship to metrics of cannabis exposure, survived multivariable adjustment, and if so, whether this relationship fulfilled quantitative epidemiological criteria for causality. As a result of the wider links between many of these disorders and other syndromes, the present analysis has implications stretching beyond merely the CAs listed herein, and indeed, contributes pointedly to a generic consideration of cannabinoid genotoxicity more broadly.

Data for analysis was downloaded from the European Network of Population-Based
Registries for the Epidemiological Surveillance of Congenital Anomalies (EUROCAT) website [89]. Data on all available congenital anomaly rates was sourced by each individual year for each of 14 nations. The total congenital anomaly rate in the EUROCAT data comprehends anomaly rates amongst live births, stillbirths, and cases where early termination for anomaly was practised, all combined together. The total congenital anomaly rate therefore represents a total overall picture across all classes of births. Nations were selected based on the availability of their congenital anomaly data for the years 2010-2019. The World Health Organization [90] was the source of national tobacco (percent daily tobacco use prevalence) and alcohol (litres of pure alcohol consumed per capita annually) use data. Drug use data was sourced from the European Monitoring Centre for Drugs and Drug Addiction (EMCDDA) [91] for cannabis, amphetamines and cocaine. In addition to this, data cannabis consumption data was supplemented by data on the tetrahydrocannabinol (THC) content of cannabis herb and resin published in recent reports [22]. This data was originally sourced from EMCDDA and was therefore coincident with EMCDDA data for this data field [22]. The World Bank [92] was the source of median household income data (in USD).
Nations were categorized as being either low and/or falling daily cannabis use, or high and/or rising daily cannabis use, based on a recent European epidemiological study (see Supplementary Figure S4 [22]). Thus Belgium, Italy, The Netherlands, Norway, Portugal, Croatia, France, Germany and Spain were categorized as nations experiencing increasing daily use, while Poland, Bulgaria, Finland, Hungary and Sweden were nations which were experiencing low or falling levels of daily cannabis use.
Various derived metrics of cannabis exposure could be calculated from the several metrics of cannabis use available. In this way, last month cannabis use prevalence data was multiplied by the THC content of cannabis herb and resin to derive compound metrics. These metrics were then also multiplied by imputed daily cannabis use prevalence rates to derive further compound metrics both for cannabis herb and cannabis resin.
Missing data was completed by linear interpolation. This technique was particularly employed for daily cannabis use. A total of 59 data points on daily cannabis use were available from EMCDDA for these 14 nations over this period. Linear interpolation was used to expand this dataset to 129 datapoints (further details provided in Section 3). Whilst data on cannabis resin THC concentration were not available for Sweden, it was noted that the resin to herb THC concentration was almost constant in nearby Norway at 17.7, so this ratio was therefore applied to the Swedish cannabis herb THC concentration data to derive estimates of Swedish cannabis resin THC concentration. Similarly, data for the cannabis resin THC concentration in Poland were unavailable. The resin to herb THC concentration ratio of neighbouring Germany was used to estimate the resin THC content in Poland from the Polish herb THC concentrations which were known. Since geospatial analytical techniques do not tolerate missing data, the dataset was completed by the last observation carried forward or backwards for The Netherlands in 2010 and for Croatia in 2018 and 2019. Multiple imputation methods could not be used for this analysis as multiple datasets cannot be inputted in panel or spatial multivariable regression techniques.
RStudio version 1.4.1717, based on R version 4.1.1 from the Comprehensive R Archive Network and the R Foundation for Statistical Computing [93], was used for data processing. The analysis was performed in December 2021. dplyr from the tidyverse [94] was used for data manipulation. The results of the Shapiro-Wilk test were used to guide the decision on whether to log transform data in order to improve compliance with normality assumptions. ggplot2 from tidyverse was used to draw graphs. ggplot2 and sf (simple features) [95] were used to draw maps and both custom colour palettes and palettes taken from the viridis and viridisLite packages were used to generate colour fill panels [96]. The package colorplaner [97] was used to draw bivariate maps. All illustrations are original. They have not been published previously. Linear regression was performed in Base R. Mixed effects regression was conducted using R package nlme [98]. In all multivariable models the classical technique of model reduction was employed using serial deletion of the least significant term to yield a final reduced model which is the model presented. Multiple linear models were processed in a single pass using nested and combined techniques from R packages purrr and broom [94,99,100]. The overall effect of covariates in multivariable models may be quantified and is known as the marginal effect. In this study, the overall marginal effect was calculated using the R package margins [101].
The presence of multiple different metrics for cannabis consumption and exposure created an important problem of covariate redundancy for analysis as it was not clear which was the most appropriate metric to employ for any particular model. Use of excessive covariates in a multivariable model would unnecessarily consume degrees of freedom and lead to problems of collinearity and thereby restrict ability to assess interactions. This issue was addressed formally by the use of random forest regression using the R package ranger [102], with variable importance being quantified via the R package vip (variable importance plot) [103]. This process was used to select the most predictive covariates which were entered into the regression modelling equations. The tables from this analysis are presented in the Section 3.
Panel analysis was conducted using R package plm [104] across both space and time simultaneously for which the "twoways" effect was employed. The spatial weights matrix was computed using the edge and corner "queen" relationships using R package spdep (spatial dependency) [105]. Geospatial modelling was conducted using the spatial panel random effects maximum likelihood (spreml) function from the package spml, which allows modelling and correction of model error structures in a detailed manner [106,107]. Such models may produce four model coefficients of interest, which can then be used to determine the most appropriate error structure for the model. These coefficients are rho, the spatial coefficient; phi, the random error effect; psi, the serial correlation effect; and theta, the spatial autocorrelation coefficient. In this manner, the most appropriate error structure was chosen for each spatial model, generally, taking care to preserve the model error specification across closely-related models. The appropriate error structure was determined by the backwards methods from the full general model to the most specific model, as has been published previously [108]. Both panel and geospatial models were temporally lagged as shown by one to two years.
The tools of formal causal inference were employed in this analysis. Inverse probability weighting (ipw) is the technique of choice to convert a purely observational study into a pseudo-randomized study and from such analyses, it is appropriate to draw causal inferences [109]. All of the multivariable panel models presented herein were inverse probability weighted. Inverse probability weighting was performed using the R package ipw. Similarly, E-values (expected values) quantify the correlation required of some hypothetical unmeasured confounder covariate with both the outcome of interest and the exposure of concern in order to explain away some apparently causal relationship [110][111][112]. It is thus an important technique of sensitivity analysis and therefore provides a quantitative measure of the robustness of the model to extraneous covariates which have not been accounted for within the measured parameters. E-values have a confidence interval associated with them and the 95% lower bound of this confidence interval is reported in the present study. E-value estimates greater than 1.25 are indicative of causality [113], with E-values greater than 9 described as being in the high range [114]. E-values were calculated from the R package EValue [115]. Both inverse probability weighting and E-values are foundational and pivotal techniques used in formal causal inferential methods in order to allow causal relationships to be assessed from real-world observational studies and together create a powerful causal inferential pseudo-randomized analytical paradigm.

Results
Presentation of Results in this section would be aided by a Navigation chart to assist the reader to properly understand the analysis plan which has been undertaken. This may be set out as follows:

Data Presentation
An overall profile of the 14 countries contributing data and the 11 congenital anomalies investigated is shown in Supplementary Table S1. The table also provides national substance use exposure, including compound variables for cannabis exposure, and median household income.
Supplementary Table S2 provides the available data on daily cannabis use for countries over the applicable time period. It is notably incomplete. For these reasons, data was supplemented by linear interpolation with the addition of a further 70 data points to obtain the dataset shown in Supplementary Table S3 2 show the bivariate relationship between substance exposure and the rates of the various anomalies in this class. It is interesting that tobacco use is negatively associated with all the anomalies except fetal alcohol syndrome (FAS), which is unsurprising as tobacco and alcohol use are often co-associated. The trend lines for alcohol use are mostly flat although they are strongly positive for FAS, teratogenic syndromes and amniotic bands. For amphetamine exposure, the trend lines are mostly flat or negative but are significantly positive for all anomalies and VACTERL syndrome. Seven trend lines for cocaine exposure are strongly positive as indicated.
The cannabis use metrics employed in these graphs is the compound metric of last month daily use × cannabis resin THC concentration × daily use interpolated. For ten of the eleven anomalies listed the trend line for this cannabis exposure metric is more strongly positive than any of the other trend lines in these two graphs. From visual inspection it is likely that the exceptional anomaly is conjoined twins. Figures 3 and 4 shown the bivariate relationship between the various anomalies and different cannabis exposure metrics. In stark contrast to the preceding figures, many of the cannabis exposure metrics appear to be strongly related to this set of anomalies, especially for the compound metrics on the right hand side of the graph.
The distribution of the group "All Anomalies" across space and time over the decade in Europe is shown in Figure 5. The situation in France, Spain, Bulgaria, Poland and Sweden has deteriorated whilst that in Germany appears to have improved. The rates in Belgium, The Netherlands and Portugal has varied across this period.  Supplementary Figure S2 charts the space-time occurrence of skeletal dysplasia. The largest consistent change on this map is the rise in the rates of this anomaly in Spain. The Netherlands, Belgium and Bulgaria also reported high rates at times.
The rates across Europe of VACTERL syndrome are shown in Figure 6. Very high rates prominently stand out in Belgium and The Netherlands in some years, which is more prominent when it is noted that this is actually a plot of the logarithm of the rates, so the differences for the raw data are even greater.
The rates of teratogenic syndromes in Europe are shown in Figure 7. Belgium often reports very high rates with Germany and Bulgaria reporting high rates in some years. The rates in France, Germany and Norway seem to fluctuate significantly over this decade. The cannabis use metrics employed in these graphs is the compound metric of last month daily use × cannabis resin THC concentration × daily use interpolated. For ten of the eleven anomalies listed the trend line for this cannabis exposure metric is more strongly positive than any of the other trend lines in these two graphs. From visual inspection it is likely that the exceptional anomaly is conjoined twins. Figures 3 and 4 shown the bivariate relationship between the various anomalies and different cannabis exposure metrics. In stark contrast to the preceding figures, many of the cannabis exposure metrics appear to be strongly related to this set of anomalies, especially for the compound metrics on the right hand side of the graph. Supplementary Figure S3 illustrates the rates of the compound cannabis index last month cannabis use × cannabis resin THC concentration × daily cannabis use interpolated. It is apparent that this index has increased in all nations across the continent over this time with particularly marked increases in France and Spain, but The Netherlands, Poland, Portugal, Norway and Croatia also reporting higher rates. Figure 8 is a bivariate colorplane map showing the association between the rate of all anomalies and the cannabis resin THC concentration. As shown in the colorplane key, green shading is indicative of low rates of both covariates, whilst purple and pink shading shows that both rates are high. Other colours have meanings as shown in the colorplane key. Therefore, the increase in both covariates is clearly demonstrated for France, Bulgaria, Spain, Germany and Sweden, which move into violet and purple shades.             Figure S4 shows the space-time co-distribution of FAS and cannabis use × resin THC concentration × daily interpolated use. France is noted to have experienced high rates of both covariates in 2014, 2015 and 2018.
The co-varying patterns of VACTERL syndrome and cannabis use × resin THC concentration are shown in Figure 9. France is noted to have had high rates in 2016, and Belgium and The Netherlands are both noted to have reported high rates of these covariates in 2018.
The spatiotemporal patterns of teratogenic syndromes across Europe is shown in Figure 10. There is a prominent emergence of high rates of both covariates in France, Spain, and The Netherlands over the decade of observation.
From Supplementary Figure S5 it emerges that high rates of the covariates skeletal dysplasia and cannabis use × resin THC concentration × daily interpolated emerged in France and Spain across this period.
Interestingly, when the covariates maternal infections and cannabis use × resin THC concentration × daily interpolated are considered as in Supplementary Figure S6, one again notes the emergence of high rates of these covariates together in France and Spain.

Categorical Bivariate Analysis
Nations may be divided into those with high rates of daily use or lower or decreasing rates of daily cannabis use, based on recent epidemiological analyses of these trends [22]. When the rates of these CAs across the continent are compared, it is clear that for some CAs the rates in those nations with increasing rates of daily cannabis use are higher than those where daily use is less prominent (Figure 11). At mixed effects regression, using the CA as the random effect, these differences were highly significant (β-est. = 0.2223, t = 7.52, p = 9.99 × 10 −14 ; model AIC = 1876.46, Log.Lik = −923.873; minimum E-value = 2.09). When consideration is limited to the four variables where this effect was most obvious (from Figure 11, namely maternal infections, situs inversus, teratogenic syndromes and VACTERL syndrome) the effects was more pronounced still (β-est. These effects may be aggregated in time and the relative rates compared. This has been done in Figure 12 where box plots convey this information graphically. The areas where the notches do not overlap indicate statistically significant differences graphically. These differences in the aggregated rates are shown quantitatively in Table 1 along with the relative rates in the countries with increasing daily use compared to that in those where it is declining, along with t-tests of statistical significance. The table is ordered in order of declining values of Student's t. It is interesting to note that the table is headed by the VACTERL syndrome, followed by teratogenic syndromes. Situs inversus is also highly significant. FAS is also significantly higher in nations with increasing daily cannabis use. The slopes of 132 of the regression lines from Figures 1 and 2 are shown in Supplementary Table S4. Those 66 terms with positive and statistically significant regression coefficients are extracted and shown in Table 2. The terms are listed in descending order of minimum E-value (mEV). It is of interest that 55 (83.3%) of the remaining terms include various cannabis metrics, 8 (12.1%) include cocaine and 3 (4.5%) include alcohol. The top rows of this table shows that six of the first nine terms include interpolated daily cannabis exposure. Figure 9. Bivariate graph of VACTERL syndrome by last month cannabis use. Bivariate colorplane sequential map-graph of the log rate of VACTERL anomalies by last month cannabis use: cannabis resin THC concentration: daily cannabis use interpolated in selected European nations. Figure 9. Bivariate graph of VACTERL syndrome by last month cannabis use. Bivariate colorplane sequential map-graph of the log rate of VACTERL anomalies by last month cannabis use: cannabis resin THC concentration: daily cannabis use interpolated in selected European nations. Figure 10. Bivariate graph of teratogenic syndromes by last month cannabis use. Bivariate colorplane sequential map-graph of the log rate of teratogenic syndromes by last month cannabis use: cannabis resin THC concentration: daily cannabis use interpolated in selected European nations. Figure 10. Bivariate graph of teratogenic syndromes by last month cannabis use. Bivariate colorplane sequential map-graph of the log rate of teratogenic syndromes by last month cannabis use: cannabis resin THC concentration: daily cannabis use interpolated in selected European nations.

Panel Regression
Having demonstrated these important and powerful bivariate relationships, the next issue to arise was how they compared in a multivariable context. However, given the very large number of covariates concerned, including the high number of covariates for the various cannabis metrics, the issue of the choice of the best covariates to enter into multivariable regression equations is non-trivial. This issue was formally addressed by using random Forrest regression (from the R Package ranger) in tandem with variable importance plots (from R package vip). This gave rise to the Random Forrest Variable Importance tables shown in Supplementary Tables S5-S10.
As suggested in these tables, it was decided to focus the analysis going forward on the six CAs described in these tables, namely All anomalies, VACTERL syndrome, FAS, situs inversus, lateralization anomalies and teratogenic syndromes. The rationale for selection of these six CAs is explained in the Section 4.
Supplementary Table S11 sets out the results of inverse probability weighted multivariable panel regression in additive and interactive models and in models lagged by one and two years. Inverse probability weighting is an important technical modification to usual panel regression which allows us to generalize our results from the merely observational context into a more generalizable scenario through the adoption of pseudo-randomization via the weighting procedure. In each case, terms including cannabis exposure remain in the final model, have positive coefficients and are highly statistically significant. Indeed, in the two lagged models several terms positive and significant for cannabis are reported.
Similar findings appear in Supplementary Table S12 for VACTERL syndrome. In this case however, the signal for cannabis terms disappears at two years of temporal lag but re-appears at four years of temporal lag as indicated.
Since the cause of VACTERL syndrome was not elucidated until recently, it is of interest to formally compare it to a similar model which omits cannabis terms. This additive model is shown in the last section of Supplementary Table S12. Unfortunately, it is not possible to directly compare panel models with, for example, an ANOVA test, as would usually be done for many other model types. However, it is noted that when moving from the additive model without cannabis terms to the additive model which includes cannabis terms, the degree of the data variance which is accounted for by the model (indicated by R-squared) increases from 28.29% to 39.46% and the significance of the model p-value greatly increases from p = 4.51 × 10 −3 to p = 4.45 × 10 −14 .
Supplementary Tables S13-S16 perform similar functions for FAS, situs inversus. Lateralization syndromes and teratogenic syndromes all include the important inverse probability weighting. Similar multivariable findings are made as those noted above.
In the case of teratogenic syndromes reported in Supplementary Table S16 it is of great interest to consider the potential contribution of cannabis to this syndrome which is clearly not well understood. Once again, a model which excludes cannabis terms in included at the foot of this table with details as shown. It is noted that compared to the model without cannabis-related terms, the amount of variance accounted for by the model greatly increases from 30.41% to 40.89% and the significance of the model p-value rises by 19 orders of magnitude, from p = 1.32 × 10 −7 to 1.61 × 10 −26 .

Geospatial Analysis
It was of interest to consider these data in their native space-time context in order to formally account for analytically-important confounding factors including random error effects, serial correlation, spatial correlation and spatial autocorrelation. For this purpose, geospatial links between the countries were derived, edited and finalized as shown in Supplementary Figure S7, which formed the basis of the sparse spatial weights matrix for the implementation of formal geospatial regression. Table 3 presents the results of geospatial regression in additive and interactive models and in models lagged to two years. In all three models, terms including cannabis use remain in the final model, are positive and are collectively highly statistically significant. Indeed, it is noted that in the interactive model, only cannabis terms remain in the final model. From comparing the magnitude of the regression coefficients (as they all sum to greater than zero), it is clear that the overall effect of cannabis metrics in these models is strongly in the positive direction.  When the VACTERL syndrome is considered (Table 4), similar findings are made. Again, many terms positive for cannabis remain in final models and the overall effect of cannabis on these models is clearly in the positive direction. In this case, models lagged to four and six years are also included and these effects continue to become even stronger with lagged time.  With VACTERL, it is again of interest to compare these spatial models to those without cannabis terms included. This is presented in Table 5. Here the Log of the maximum likelihood ratio at model optimization (LogLik.) increases from −28.449 without cannabis terms in the additive model to −26.104 when they are included. Spatial models may be directly compared using the spatial Hausman test. In this case, the model with cannabis terms included is superior (Chi Squ. = 8.12, df = 3, p = 0.0436). When the model lagged by two years without cannabis terms is compared to the similar model with them included the LogLik. Ratio increases from −39.6349 to −22.2103 (Spatial Hausman test ChiSqu. = 82.41, df = 3, p = 9.33 × 10 −18 ). Hence, the models which include cannabis terms are shown to be clearly analytically greatly superior to those without such terms. Similar findings apply to the spatial analysis of FAS, situs inversus, lateralization syndromes and teratogenic syndromes presented in Tables 6-9.          For the teratogenic syndromes, which are clearly of unknown aetiology, it is of interest to formally consider the potential difference made by the inclusion of cannabis metrics. Table 10 presents the model without cannabis terms and as noted there, the model which includes cannabis is greatly and significantly statistically superior (LogLik. increases from −106.99 to −63.74, Spatial Hausman test: ChiSqu. = 184.20, df = 2, p = 1.00 × 10 −40 ).

Causal Inference E-Values
E-values or expected values can be calculated from all of the various multivariable regression models presented. These are listed for panel models and for geospatial models in Tables 11 and 12 respectively. It is of considerable interest and importance that four of the E-value estimates for VACTERL syndrome in spatial models are infinite ad are two of the mEV values. These two tables are combined and listed in descending order of the mEV in Supplementary Table S17. It is significant here that the VACTERL syndrome, spatial regression formats and daily cannabis use occupy the first four rows on this list.     Table key: Abbreviations as in Table 2.
These 64 E-value pairs are then listed consecutively in descending order in Table 13. 50/64 (78.1%) E-value estimates exceed 9 and are therefore considered to be in the high range [114], and all 64 (100%) exceed the threshold for causality at 1.25 [113]. For the minimum E-value estimates, 42/64 (65.6%) exceed 9 and are thus in the high range and 62/64 (96.9%) exceed the threshold for causality at 1.25.   Table 14 re-lists Supplementary Table S17 in order of CAs so that they can be compared directly. Summary measures of the E-value estimate and the mEV are presented in Table 15 and listed in descending order of mEV and E-value estimate. It is important to note that the list is headed by VACTERL syndrome and teratogenic syndromes, and FAS also scores highly in this table.      Table 14 is re-listed in order of the substance exposure term in Table 16 and the exposures are grouped into the primary exposure of interest being daily cannabis use interpolated, cannabis herb THC concentration of cannabis resin THC concentration. Grouped summary data for these E-values are then presented in Table 17 and again ordered by descending E-value. It is noted in this table that the order here is daily cannabis use > cannabis herb > cannabis resin.     These groups are formally compared using the Wilcoxson test in Table 18 and all the apparent differences noted in Table 17 are found to be highly statistically significant at the levels indicated in Table 18.

Main Results
A strong positive relationship was shown in bivariate analysis between CARs and many metrics of cannabis exposure. The compound metric last month cannabis use × cannabis resin THC concentration × daily cannabis use interpolated was a powerful predictor of CAR (Figures 3 and 4), just as was shown earlier [7]. Strong upward trends between cannabis resin THC concentration and all anomalies, skeletal dysplasia, lateral anomalies, maternal infections and situs inversus were noted.
Consideration of the maps showed that for VACTERL syndrome, there had been a dramatic rise in low countries and France. Teratogenic syndromes were also often high in low countries. Bivariate maps showed that the rates of the following anomalies increased along with cannabis herb THC concentration in France and Spain-All anomalies, FAS, VACTERL, Teratogenic syndromes, skeletal anomalies and maternal infections.
Comparison of geospatial VACTERL models with and without cannabis metrics showed that models including terms for cannabis were greatly superior (at 2 Lags, p < 2.2 × 10 −16 ). Comparison of geospatial models for teratogenic syndromes with and without cannabis found similarly (additive model p < 2.2 × 10 −16 ).
A total of 50/64 (78.1%) E-value estimates and 42/64 (65.6%) minimum E-values exceeded 9 and so were in the high risk zone [114]. All 64 E-value estimates and 62/64 lower E-values exceeded 1.25 and thus achieve the threshold for causal inference [113]. Considering E-values, the rate of effect size from cannabis was VACTERL > situs inversus > teratogenic syndromes > FAS > lateralization syndromes > all anomalies. Daily cannabis use interpolated the strongest predictor for all anomalies.

Choice of Anomalies
This group of anomalies was made up from CAs which did not fit in the standard organ-specific systems such as cardiovascular system, etc. Some of the anomalies which were chosen for more detailed study were chosen due to high effect size on the bivariate plots in Figures 1-4 (situs inversus, lateralization syndromes, foetal alcohol syndrome). All anomalies was chosen for its obvious overall importance to the field. Some syndromes were chosen because they are known to have pathophysiological overlap with cannabinoid effects, such as foetal alcohol syndrome, which has been shown to signal to the epigenome through the cannabinoid type 1 receptor (CB1R) [116][117][118][119][120][121][122][123][124][125][126][127][128][129][130]. Moreover, cannabis is often co-abused with alcohol and each has been described as being a gateway drug for the other [38,[40][41][42][43][44][45][46].
VACTERL syndrome was chosen as it had recently been described as being related to Sonic Hedgehog inhibition, a change which was inducible by both THC and cannabidiol [34]. Epigenetic modulation of the Sonic Hedgehog pathway in cannabis withdrawal has also been demonstrated [37]. This group identified differentially-methylated genes at BMP4 (bone morphogenetic protein 4), GLI3 (Gli family zinc finger 3), MEGF8 (multiple EGF-like domains 8), and TMEM107 (transmembrane protein 107) and CHD7 (chromodomain helicase DNA binding protein 7) ( [37] Supplementary Material Pages 352, 354, 356, p = 0.00547). The first four of these are all either part of the Sonic Hedgehog signalling pathway or are modifiers of it. Hence, we wished to test the epidemiological evidence for cannabis involvement in this syndrome.
Increased evidence of cannabinoid association or causation in VACTERL syndrome was also of great interest as it comprised within one syndrome most of the spectrum of con-genital anomalies which have recently been connected to community or prenatal cannabis exposure. Hence, a demonstrated result here would prove the whole severity and broad spectrum which is increasingly being outlined within cannabis teratology. VACTERL syndrome was also of interest as it is by definition multisystem and polysyndromic. A positive result in causal inferential and space-time analysis would directly implicate cannabis in multisystem disease and disprove the recalcitrant notion that persistently equates cannabis use with being harmless.
Similarly, teratogenic syndromes were chosen to see if cannabis might potentially explain some of the variance of this challenging group.
We were also interested in lateralization syndromes, including the full version of the disorder situs inversus, as there was considerable evidence from recent work that certain cardiovascular anomalies with which cannabis was implicated such as double outlet right ventricle and transposition of the great vessels included an element of malrotation or nonrotation in their dysmorphogenesis [7,14,47]. We therefore wished to see how far-ranging this left-right malrotation was in relation to cannabis teratogenesis. Left-right malrotation syndromes have also been described from cannabis epigenomically [37].
Some anomalies were chosen for several reasons as indicated.

Qualitative Causal Inference
In 1965, A.B. Hill outlined several criteria which would be used to assess potentially causal relationships. They include strength of association, consistency amongst studies, specificity, temporality, coherence with known data, biological plausibility, dose-response relationships, analogy with similar situations elsewhere, and experimental confirmation. It is observed that the present series of anomalies fulfills most of these criteria except those related to reproduction in other studies, which is unsurprising given this is an original report for many of these anomalies. Nor have this series of anomalies been identified in earlier studies to our knowledge. However, it is important to note that a plethora of established biological pathways exist which could well explain these findings.

Quantitative Causal Inference
Inverse probability weighting is the technique of choice in causal inferential studies to even out environmental exposures across groups and make them truly comparable so that causal inferences can appropriately be derived from observations datasets [109,131]. The process of inverse probability weighting is known to convert observations datasets into pseudorandomized studies so that causal inferences can properly be determined. It is noted that this was applied to all the multivariable panel models reported in the present study.
E-values quantify the degree of association required of some extraneous covariate which has not been included in the analysis with both the exposure of concern and the outcome of interest to explain away and obviate an apparently causal association. As such, it lends great confidence to an analysis as it is a form of sensitivity testing to the effects of added covariates. E-values greater than 9 are said to be high [114] and E-values in excess of 1.25 are usually taken as being indicative of potentially causal relationships [113]. Most of the E-values reported in the present study are much greater than 1.25, which greatly strengthens the conclusions drawn.

Mechanisms
As noted in the introduction, there are a variety of mechanisms by which cannabinoids exert genotoxic effects. It is, however, also important to consider some of the cannabinoidrelated related epigenomic mechanisms which have been elucidated by recent whole genome epigenetic screens.

Epigenomic Controls
A recent paper cataloguing genome-wide DNA methylation changes provided great insight into the heritable mechanism underlying cannabinoid genotoxicity and teratogenesis in particular [37]. Researchers looked at 20 cannabis-dependent patients and controls both at study initiation and after 17 weeks of confirmed cannabis abstinence in the experimental group. Seventeen weeks was chosen as it is the normal sperm cycle time in human males. Hence, each patient formed their own control and investigators were able to look at the comparative epigenomes both at time zero and in cannabis withdrawal and compare them both to each other and between groups. A total of 163 differentially methylated regions (DMRs) were identified in cannabis dependence between the two groups and 127 DMRs were identified in cannabis withdrawal at the 17-week mark. These DMRs in turn affected hundreds of genes. The present consideration of the general group of anomalies provides an ideal opportunity to consider the breadth of the findings of this remarkable paper. Study results were supported by detailed online supplementary material which ran to 359 pages.
During cannabis dependence pathways annotated for cerebral disorder, neurodevelopmental disorders, agenesis (lack of growth), growth of organism, cardiogenesis, haematological and immune changes and liver lesions were defined. During cannabis withdrawal, brain changes (hippocampal formation cognitive impairments, learning, encephalopathy, quantity of pyramidal neurons), activation of alveolar macrophages, organismal death and abnormal morphology of seminiferous tubules were identified.
As to the specific genes identified by functional annotations of Ingenuity Pathway Analysis, most will be covered in a companion paper in this series dealing with specific organ systems. Consideration in this paper will be confined to those which broadly affect all body systems or are not addressed elsewhere. References are to the page in the Supplementary material of Schrott [37].
A total of 256 hits for DNA metabolism were found in cannabis dependence and withdrawal, including DNA transcription (60 genes, page 314), DNA promoter activity (49 genes, page 317), DNA replication, recombination and repair (12 genes, page 317), DNA binding (24 genes, page 323), DNA synthesis and repair (20 genes, page 323), and DNA replication, recombination and repair (4 genes, page 344). This is a fascinating list and may explain the frequent observations of DNA and chromosomal breaks after cannabinoid exposure from failed recombination and repair events. It also explains why testicular and lymphoid cancers are more common in cannabisexposed patients, as recombination events happen programmatically in those tissues as part of gamete crossing-over events and hypermutation in immune follicles during antigen selection processes. It is also noted that there are many annotations in this appendix for immune and haemopoietic systems which are detailed elsewhere [12,132,133].
One hit was found for mitochondria disorders which related to mitochondrial function in the eye (1 gene, RNASEH1 [Ribonuclease H1], page 357).
For body axis development, there was a single hit with 50 genes annotated (page 302). There was one reference to diminished ovarian reserve (2 genes, page 349, p = 0.00308). Searching limb morphogenesis, a gene called MEGF8 (multiple EGF-like domains) was encountered (page 354). In fact, there were 105 annotations in the Supplementary Appendix where this gene was identified. One of its functions is left-right patterning [137]. This may be part of the reason for the positive findings with situs inversus and lateralization syndromes in the present report and for transposition of the great vessels in others [7,14,47].
A search for embryo growth revealed 101 hits including (page 296, 83 genes) and 306 (27 genes). Such a finding may explain the reports of small babies and smaller heads on cannabis-exposed neonates [138,139]. Growth of the embryo and embryonic morphogenesis were also noted (Page 310, 39 and 15 genes each), morphogenesis of embryonic tissue (12 genes, page 300), embryonic growth and organismal development (27 genes, page 305), and differentiation of embryonic cells (15 genes, page 316).
Given the definition of VACTERL syndrome, it was of interest to determine if vertebrae was identified in this material. This was indeed the case and a search identified three genes at page 353 (p = 0.00492).
A total of 22 hits were found for bone, including abnormal morphology of bone (20 genes, page 319), ( This concise survey shows that the epigenomic findings of this study account for many of the clinically relevant dysmorphogenetic features encountered in cannabis teratogenesis. Further details relating to other specific systems are provided in accompanying reports [7]. Of significance, virtually all of the gene annotations identified occurred in the cannabis withdrawal samples rather than the cannabis dependence samples (the sole exception being the reference to Table S4 above). This is also clinically highly relevant as many patients will find that their cannabis supply varies across time, and indeed, drug withdrawal is one of the primary motivations to continue drug self-administration [154].
Given that such gradients are foundational and fundamental to embryonic patterning and morphogenesis, it becomes easy to comprehend how their systematic disruption can lead to various and diverse teratological outcomes depending on the timing and dose of administration during the organogenic period.

Exponential Genotoxic Effects
One of the great concerns voiced by ourselves and others is the exponential doseresponse of cannabinoids in the development of many mutagenicity and related metabolic assays [37,80,81]. Not only is this exponential dose-response well described in the laboratory, but it has now been described in a number of epidemiological reports in human populations [2,7,14,136].
Since many parts of Europe are clearly caught up in a triple confluence of increased prevalence of use, increased intensity of daily use [21,22], and increased cannabinoid concentrations, the conclusion seems inescapable that many regional populations have increased cannabinoid exposures, doubtless exacerbated by the long half-life of cannabinoids in fatty tissues in adipose, brain and gonadal stores. Further, given that in some rural areas large cannabis crops are being grown, it seems that escape of cannabinoids into the water table and probably into the food chain is occurring.
Reports from both the Ain region in northeast France and from the Brittany region indicate that increased rates of babies being born without limbs has recently occurred [173][174][175]. In both areas, cannabis crops are being cultivated. In the Ain region, calves are being born without limbs [173][174][175]. In this context concern may be expressed that it may be that cows are eating either contaminated feed or water and then their products are entering the food chain which is being manifested in elevated rates of major congenital anomalies.
Similar observations are being reported in parts of USA in relation to atrial septal defects [15].
The now epidemiologically-observed sudden and significant spike in rates of congenital anomalies is what would be predicted from laboratory studies following exponential exposure which achieved a threshold for cannabis genotoxicity.

Generalizability
Current study results have employed one of the most comprehensive and largest datasets globally. We also find assurance in the analyses of the anomalies in this dataset because wherever they can be paired with other published comparable results from large datasets from elsewhere, in general, similar results are obtained [1,2,14]. However, as most of the anomalies in this group have not been studied previously, it is important to see the results of similar studies undertaken on other datasets. Moreover, for the six specific anomalies which were the particular focus of this study, the results at bivariate analysis were confirmed in various multivariable models and within a pseudorandomized causal inferential analytical framework at high levels of statistical significance. For this group particularly, we would be confident that subsequent analyses would find confirmatory results.

Strengths and Limitations
The strengths of this study are that it was conducted on one of the largest, most comprehensive datasets in the world and employed cutting-edge analytical tools such as inverse probability weighting, E-value, and panel and geospatial regression techniques. As such, these techniques convert observational data into a pseudorandomized dataset from which it is entirely appropriate to draw causal inferences, providing a robust analytical paradigm from which to draw important conclusions. One of the great strengths of using European data is that the dataset is complete in the sense that it also includes stillbirths and early terminations for anomaly, which are notably absent from the data from many other datasets. Another strength is that these anomalies are not listed or studied by many other registries and in this sense, the dataset is more complete than many others. Ranger regression was used for formal variable selection. Like many epidemiological studies, the present work did not have available individual participant cannabis exposure and anomaly outcome data. Nor was it possible to complete full analyses on all the anomalies listed. Like most epidemiological studies, this study is not mechanistic in the experimental sense. However, observations made here, together with the complex mechanistic underpinning, strongly indicate additional laboratory work to be performed to further investigate the observed relationships in a formal experimental setting.

Conclusions
Nine of the eleven anomalies studied in this section demonstrate strong and robust bivariate associations with metrics of cannabis exposure, the two exceptions being conjoined twins and valproate syndrome. All six of the congenital anomalies studied in multivariable detail, namely all anomalies, VACTERL syndrome, FAS, situs inversus, lateralization syndromes and teratogenic syndromes demonstrated strong and robust associations with cannabis exposure metrics which survived multivariable adjustment and persisted in final inverse probability weighted panel and spatiotemporal models. In each case, high E-values and inverse probability weighting fulfilled quantitative epidemiological criteria for causal relationships. These results confirm earlier preliminary results on this dataset and are consistent with other recent reports which have found that rates of diverse groups of congenital anomalies are increased in association with cannabinoid exposure. The very robust response of the VACTERL dataset to the effects of cannabinoids is consistent with the prediction that the syndrome is caused by cannabinoid-induced inhibition of Sonic Hedgehog. The data for teratogenic syndromes is similarly consistent with a significant contribution to the aetiopathology of this group of disorders coming from cannabinoids. The data for situs inversus and lateralization syndromes similarly strongly implicate cannabinoids as an important contributing cause. This is intriguing as transposition of the great arteries, which may be considered to be a forme fruste of these disorders, has been identified in several other studies as being an anomaly which is cannabis-related [14,50]. The present results support this conclusion. Epigenomic mechanisms may well explain many of the dysmorphogenetic features of cannabis teratogenesis. Particular concern is expressed at the rapidly-rising confluence of cannabinoid exposure in many parts of Europe with the known exponential genotoxic dose-response curve, and that the recent serious French experience with limb reduction anomalies may be portentous of more such outbreaks to come. Given the present powerful findings it would appear important to limit community exposure to powerful teratogens in the interests of protection of the genome and epigenome for coming generations.  Figure S1. Sequential map-graph of the log rate of Foetal Alcohol Syndrome over time in selected European nations; Figure S2. Sequential map-graph of the log rate of Skeletal Dysplasia over time in selected European nations; Figure S3. Sequential map-graph of the log rate of past month cannabis use × cannabis resin THC concentration × daily cannabis use interpolated over time in selected European nations; Figure S4. Bivariate colorplane sequential map-graph of the log rate of Foetal Alcohol Syndrome by last month cannabis use: cannabis resin THC concentration: daily cannabis use interpolated in selected European nations; Figure S5. Bivariate colorplane sequential map-graph of the log rate of Skeletal Dysplasia by last month cannabis use: cannabis resin THC concentration: daily cannabis use interpolated in selected European nations; Figure S6. Bivariate colorplane sequential map-graph of the log rate of Maternal Infections Anomalies by last month cannabis use: cannabis resin THC concentration: daily cannabis use interpolated in selected European nations; Figure S7. International geospatial links (A) raw (in blue) and edited (in red) and (B) final (in pink) used to derive the sparse spatial weights matrix for spatial regression.
Author Contributions: A.S.R. assembled the data, designed and conducted the analyses, and wrote the first manuscript draft. G.K.H. provided technical and logistic support, co-wrote the paper, assisted with gaining ethical approval, provided advice on manuscript preparation and general guidance to study conduct. A.S.R. had the idea for the article, performed the literature search, wrote the first draft and is the guarantor for the article. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding. No funding organization played any role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; or the decision to submit the manuscript for publication.