Gut Microbiota Functional Traits, Blood pH, and Anti-GAD Antibodies Concur in the Clinical Characterization of T1D at Onset

Alterations of gut microbiota have been identified before clinical manifestation of type 1 diabetes (T1D). To identify the associations amongst gut microbiome profile, metabolism and disease markers, the 16S rRNA-based microbiota profiling and 1H-NMR metabolomic analysis were performed on stool samples of 52 T1D patients at onset, 17 T1D siblings and 57 healthy subjects (CTRL). Univariate, multivariate analyses and classification models were applied to clinical and -omic integrated datasets. In T1D patients and their siblings, Clostridiales and Dorea were increased and Dialister and Akkermansia were decreased compared to CTRL, while in T1D, Lachnospiraceae were higher and Collinsella was lower, compared to siblings and CTRL. Higher levels of isobutyrate, malonate, Clostridium, Enterobacteriaceae, Clostridiales, Bacteroidales, were associated to T1D compared to CTRL. Patients with higher anti-GAD levels showed low abundances of Roseburia, Faecalibacterium and Alistipes and those with normal blood pH and low serum HbA1c levels showed high levels of purine and pyrimidine intermediates. We detected specific gut microbiota profiles linked to both T1D at the onset and to diabetes familiarity. The presence of specific microbial and metabolic profiles in gut linked to anti-GAD levels and to blood acidosis can be considered as predictive biomarker associated progression and severity of T1D.


Introduction
Type 1 diabetes mellitus (T1D) is an autoimmune disease caused by the progressive immunomediated destruction of pancreatic β-cells by autoreactive T lymphocytes in genetically susceptible individuals [1]. Patients with T1D develop a severe insulin deficiency that leads to hyperglycemia, requiring accurate exogenous insulin administration [2]. Markers of the β-cell destruction comprise islet cell autoantibodies (ICA), insulin autoantibodies (IAA), anti-glutamic acid decarboxylase antibodies (anti-GAD), and autoantibodies to the tyrosine phosphatases IA-2 and IA-2β [3]. However, the rate of β-cells destruction is subject-dependent, then it can be very fast in some individuals and slower in others [3]. Moreover, especially in early onset cases, some patients may show ketoacidosis as the first manifestation of the disease. Diabetic ketoacidosis (DKA) is characterized by hyperglycemia, ketosis, and metabolic acidosis [3]. During ketoacidosis, ketone bodies are produced via hepatic beta oxidation of fatty acids and released in the blood flow. This condition leads to dehydration and acidosis and it is associated with increased morbidity and mortality [4]. In general, T1D patients are characterized by a rise of inflammatory cytokines, as well as C-reactive protein (CRP) [5]. This proinflammatory condition seems to be mainly related to hyperglycemia [6]. High levels of blood glucose can activate the proinflammatory transcription factor nuclear κB (NF-κB), resulting in increased inflammatory chemokine and cytokine release [7].
In the "SEARCH for Diabetes in Youth" study, performed on 1396 young T1D patients at onset of disease, the DKA value at the diagnosis of T1D was identified as predictor of poor long-term glycemic control in children, independently of established risk factors [8]. DKA and age influence disease outcome and may thus be used to early identify those patients with rapidly deteriorating metabolic control, who might benefit from a more intensive therapeutic approach [9]. Ketoacidosis in T1D patients is indeed considered a serious complication that requires prompt intervention [10].
External factors such as life environment, diet, antibiotic exposure, and activation of the gut immune systems, have been associated with T1D pathophysiology [11,12]. Moreover, the composition of the intestinal mucus glycans can shape the gut microbiota depending on the fucosyltransferase 2 (FUT2) gene expression of the individual [13] and, indeed, FUT2 polymorphisms have been associated with several conditions, including chronic diseases and infections [14]. Recently, a possible role of the FUT2 phenotype has been suggested also in T1D pathogenesis [15].
Different studies report on an aberrant gut microbiota composition in T1D [16][17][18]. The consequence of gut dysbiosis is the impairment of gut permeability, environmentally increased exposure to non-self-antigens followed by mucosal immune response that parallel and putatively influence the development of the autoimmune attack against insulinproducing β-cell [19,20]. The modifications in gut microbiota composition and function have been identified before the β-cells destruction stage [19]. In T1D patients, gut microbiota are characterized by a reduction in: (i) bacteria producing short chain fatty acids (SCFAs) [21], (ii) fecal butyrate, (iii) intestinal immunoglobulin A (IgA) antibody levels and (iv) intestinal alkaline phosphates activity and by an increase in fecal calprotectin concentration [22]. These findings reflect the alteration of gut microbiota in T1D patients, involving butyrate and lactate producers as well as the mucin degraders [19,20].
In this study, we investigated the gut bacterial composition and its metabolic activity in an Italian cohort of children at onset of T1D, in relation to the disease clinical biomarkers at onset of disease, to identify possible disease predictors of microbial and/or metabolic nature of severity during long-term.
The average values of the clinical and anthropometric features for T1D patients are reported in Table 1.

T1D Clinical Profiling of the Overall Cohort
T1D cohort characteristics were recognized based on the cut-off values of the most representative clinical parameters for the disease (Table 1).
PCA was applied to the matrix of variables composed of anti-GAD, IAA, IA2, HbA1c, cholesterol, exogenous insulin need, blood pH, age and c-peptide, after autoscaling. The first principal component (PC1) and the second principal component (PC2), respectively, accounted for 23% and 19% of the overall variance. The loadings plot in Figure 1 showed that the statistically significant variables insulin need and HbA 1c positively correlated with PC1, while anti-GAD and blood pH negatively correlated with PC1; furthermore, c-peptide and age were significant along PC2 resulting thus independent from the previous ones.
In the PCA scores plot, the diabetic cohort appeared to be clustered on the ba their blood pH values. This allowed to identify the blood pH as a parameter for a fu stratification of the patients ( Figure S1A), linked to their severity status of the dise the onset: patients with blood pH < 7.32, (more severe) and pH ≥ 7.32 (less severe). over, higher levels of anti-GAD antibodies were mainly observed in patients with no to-moderate compared to low blood pH ( Figure S1B).

Gut Microbiota Ecology of the Overall Subject Cohorts
A total of 7,916,373 sequencing reads were obtained from 126 faecal samples, w mean value of 75,154 sequences per sample.
In the PCA scores plot, the diabetic cohort appeared to be clustered on the basis of their blood pH values. This allowed to identify the blood pH as a parameter for a further stratification of the patients ( Figure S1A), linked to their severity status of the disease at the onset: patients with blood pH < 7.32, (more severe) and pH ≥ 7.32 (less severe). Moreover, higher levels of anti-GAD antibodies were mainly observed in patients with normal-to-moderate compared to low blood pH ( Figure S1B).

Gut Microbiota Ecology of the Overall Subject Cohorts
A total of 7,916,373 sequencing reads were obtained from 126 faecal samples, with a mean value of 75,154 sequences per sample.
Analyzing microbiota ecology amongst T1D patients, siblings and CTRL, the αdiversity indexes were higher in patients compared to the other two groups (p values ≤ 0.05) (Figure 2A   Moreover, β-diversity showed that patient group resulted clearly separated, while CTRL and siblings resulted intermixed between each other (PERMANOVA p values = 0.0001) ( Figure 2D-F). Moreover, based on the calculation of intragroup distance the higher distance amongst T1D samples was evident, compared to the other two groups ( Figure 2G-I).
The gut microbiota composition in the three groups at phylum, family and genus levels was reported in Figure S2. The comparison among these groups showed that Dialister and Akkermansia were lower in T1D and siblings with respect to CTRL. Likewise, Clostridiales and Dorea were significantly more abundant in siblings and T1D than in CTRL. Higher Lachnospiraceae and lower Collinsella abundances were found in T1D patients than in both siblings and CTRL ( Figure 2J-O) (Table S1A). However, even though Dorea mean abundance for T1D and siblings was significantly higher compared to CTRL, the analysis of each single couple of siblings revealed that most of them showed lower abundances compared to the CTRL's mean (Table S1B).

Comparison among T1D Patients in Relation to Clinical Parameters Used for Diagnosis
The possible relationship between gut microbiota composition and clinical characteristics was investigated in patients grouped on the basis of clinical variable values, as reported in Table 1. Alpha-diversity analyses did not reveal any significant variation when the patients were grouped for all T1D parameters ( Figure S3). Accordingly, also β-diversity did not reveal any statistically significant group clustering ( Figure S4).
Bacteria abundance analysis highlighted that Alistipes, Roseburia and Faecalibacterium abundances were significantly higher in patients with anti-GAD levels values ≤ 1 (p values ≤ 0.05, FDR p values ≤ 0.1) ( Figure 3) (Table S2). Moreover, β-diversity showed that patient group resulted clearly separated, while CTRL and siblings resulted intermixed between each other (PERMANOVA p values = 0.0001) ( Figure 2D-F). Moreover, based on the calculation of intragroup distance the higher distance amongst T1D samples was evident, compared to the other two groups ( Figure 2G-I).
The gut microbiota composition in the three groups at phylum, family and genus levels was reported in Figure S2. The comparison among these groups showed that Dialister and Akkermansia were lower in T1D and siblings with respect to CTRL. Likewise, Clostridiales and Dorea were significantly more abundant in siblings and T1D than in CTRL. Higher Lachnospiraceae and lower Collinsella abundances were found in T1D patients than in both siblings and CTRL ( Figure 2J-O) (Table S1A). However, even though Dorea mean abundance for T1D and siblings was significantly higher compared to CTRL, the analysis of each single couple of siblings revealed that most of them showed lower abundances compared to the CTRL's mean (Table S1B).

Comparison among T1D Patients in Relation to Clinical Parameters Used for Diagnosis
The possible relationship between gut microbiota composition and clinical characteristics was investigated in patients grouped on the basis of clinical variable values, as reported in Table 1. Alpha-diversity analyses did not reveal any significant variation when the patients were grouped for all T1D parameters ( Figure S3). Accordingly, also β-diversity did not reveal any statistically significant group clustering ( Figure S4).

Gut Microbiota and Metabolome Profiling on the Downsized T1D and CTRL Cohorts
Metabolomics analyses were further performed on a downscaled group of patients and CTRL based on sample availability; metagenomics and metabolomics integration Anti-GAD antibodies negatively correlated with Alistipes (Pearson's correlation test; p < 0.05) ( Figure S5).

Gut Microbiota and Metabolome Profiling on the Downsized T1D and CTRL Cohorts
Metabolomics analyses were further performed on a downscaled group of patients and CTRL based on sample availability; metagenomics and metabolomics integration analyses were performed on a dataset of 43 samples and the PCA based on the clinical features was repeated ( Figure S6). The obtained results were consistent with the PCA applied on the entire dataset. The PCA scores plot of T1D subset, labeled based on blood pH values, showed as well as clustering depending on blood pH values: pH < 7.32 and pH ≥ 7.32 along PC1 ( Figure S7).
Sensitivity and specificity of the PLS-DA model, performed on the metabolomics block only, did not reach a sufficient degree of classification for the four comparisons T1D vs. CTRL; T1D pH ≥ 7.32 vs. CTRL; T1D pH < 7.32 vs. CTRL and T1D pH ≥ 7.32 vs. T1D pH < 7.32 (Table S4).
On the contrary, the PLS-DA model performed on the metagenomics block showed % values of accuracy, sensitivity (74.0 ± 5.1) and specificity (71.2 ± 5.5) with a sufficient degree of classification for T1D vs. CTRL (Table S4).

Omics Data Integration
Chemometric approaches were performed on the fused set, in order to try to disentangle the relationship between (i) T1D and CTRL and (ii) pH ≥ 7.32 and pH < 7.32 groups (Figure 4), as well as (iii) T1D pH ≥ 7.32 vs. CTRL and (iv) T1D pH < 7.32 vs. CTRL ( Figure S8). analyses were performed on a dataset of 43 samples and the PCA based on the clinical features was repeated ( Figure S6). The obtained results were consistent with the PCA applied on the entire dataset. The PCA scores plot of T1D subset, labeled based on blood pH values, showed as well as clustering depending on blood pH values: pH < 7.32 and pH ≥ 7.32 along PC1 ( Figure S7). By 1 H-NMR metabolomics 37 metabolites, belonging to the classes of amino acids, SCFA, organic acids, pyrimidines, purine, nitrogen compounds, carbohydrates, alcohols, were identified and quantified (Table S3).
Sensitivity and specificity of the PLS-DA model, performed on the metabolomics block only, did not reach a sufficient degree of classification for the four comparisons T1D vs. CTRL; T1D pH ≥ 7.32 vs. CTRL; T1D pH < 7.32 vs. CTRL and T1D pH ≥ 7.32 vs. T1D pH < 7.32 (Table S4).
On the contrary, the PLS-DA model performed on the metagenomics block showed % values of accuracy, sensitivity (74.0 ± 5.1) and specificity (71.2 ± 5.5) with a sufficient degree of classification for T1D vs. CTRL (Table S4).

Omics Data Integration
Chemometric approaches were performed on the fused set, in order to try to disentangle the relationship between (i) T1D and CTRL and (ii) pH ≥ 7.32 and pH < 7.32 groups (Figure 4), as well as (iii) T1D pH ≥ 7.32 vs. CTRL and (iv) T1D pH < 7.32 vs. CTRL ( Figure  S8). The PLS-DA model performed on the fused data resulted highly improved when compared to the single block analyses, as reported in Table S4. The features with values of VIP ≥1.15 were considered important and the significance was verified by performing non-parametric test with Bonferroni-corrected p values (Table S5). In the comparison between T1D and CTRL, T1D showed higher levels of isobutyrate, malonate, Clostridium (Lachnospiraceae), Enterobacteriaceae, Bacteroidales, Clostridiales and Firmicutes, while CTRL showed higher levels of butyrate, galactose, ethanol, succinate, Odoribacter, Alistipes, Akkermansia, Sutterella, Actinomyces, Adlercreutzia, Collinsella, Turicibacter and Mogibacteriaceae.
Finally, the comparison between T1D pH ≥ 7.32 and T1D pH < 7.32 revealed in T1D pH ≥ 7.32 the following features as higher: malonate, uracil, formate, 2-methylbutyrate, Figure 4. VIP values of the significant metabolites and bacterial taxa in the omics-data integration analysis. The levels of the features in blue are higher in T1D patients and in T1D pH ≥ 7.32, while the features in red are higher in CTRL and in T1D pH < 7.32. Features that resulted significative also at the univariate analysis were labeled with *. DMA, dimethylamine; Glu, glutamic acid.
The PLS-DA model performed on the fused data resulted highly improved when compared to the single block analyses, as reported in Table S4. The features with values of VIP ≥1.15 were considered important and the significance was verified by performing non-parametric test with Bonferroni-corrected p values (Table S5). In the comparison between T1D and CTRL, T1D showed higher levels of isobutyrate, malonate, Clostridium (Lachnospiraceae), Enterobacteriaceae, Bacteroidales, Clostridiales and Firmicutes, while CTRL showed higher levels of butyrate, galactose, ethanol, succinate, Odoribacter, Alistipes, Akkermansia, Sutterella, Actinomyces, Adlercreutzia, Collinsella, Turicibacter and Mogibacteriaceae.
Pearson correlations were performed on the matrix composed of both metabolites and bacterial taxa, for each considered class. Interestingly, within the group of patients with blood pH ≥ 7.32, isobutyrate was inversely correlated with Akkermansia (r = −0.68, p = 0.01). In the matrix of the CTRL, Collinsella was directly correlated with butyrate (r = 0.56, p = 0.01), succinate (r = 0.81, p < 0.01) and ethanol (r = 0.80, p < 0.01).

Discussion
The present study depicts the first integrated omics-based investigation by 16S rRNAbased metagenomics and NMR-based metabolomics, applied to a clinically uniform cohort of children affected by T1D along with their respective non-diabetic siblings and healthy reference subjects.
In our cohort of T1D patients at the onset, the α-diversity resulted increased compared to both siblings and CTRLs, as already reported [23]. Moreover, Mrozinska and colleagues proposed a positive correlation of α-diversity index and the levels of HbA 1c ≥ 53 mmol/mol in T1D patients [24]. This observation is consistent with our results; in fact, in our T1D cohort, HbA 1c values were higher than 53 mmol/mol, except one.
Comparing the gut microbiota profiles of CTRL, T1D patient and sibling groups, we observed that T1D patients and their related siblings shared higher levels of Clostridiales and lower levels of Dialister and Akkermansia, as compared to CTRL.
The similarity in microbiota composition between T1D and siblings' group in 14 different families, could shed a light on a phenotypic familiar predisposition that may favor a T1D characteristic gut microbiota profile. In particular, the lower presence of the beneficial Akkermansia muciniphila, of which increase is ever associated to a healthy intestine, could be correlated to the decrease in the intestinal integrity, and to the consequent increase in gut permeability and intestinal inflammation, that increases T1D predisposition [25].
However, T1D patients were characterized by higher levels of Lachnospiraceae and lower levels of Collinsella, as compared to siblings and CTRL, suggesting a specific microbiota profile associated to the diabetes at the onset. Collinsella is a beneficial producer of H 2 , ethanol, SCFAs, lactate and positively correlates with circulating insulin [26]. On the contrary, Lachnospiraceae have been linked to different diseases and to impaired glucose metabolism, inflammation and to the onset of T1D [20]. The unsupervised analysis (PCA) of T1D clinical data revealed that c-peptide levels were directly related to age at the time of diagnosis, and were independent from other disease severity parameters, at the onset. This analysis identified a stratification of the patients based on blood pH: pH < 7.32 (more severe) and pH ≥ 7.32 (less severe). Moreover, we observed the association between high levels of anti-GAD antibodies and normal-to-moderate pH levels and low anti-GAD antibodies levels with low blood pH values. Intriguingly, patients with low level of anti-GAD and positive to IA2 antibodies, had a more severe clinical situation compared to patients with higher anti-GAD levels that had, instead, normal blood pH values and lower insulin need ( Figure 5).
The patients with low anti-GAD levels were characterized by high abundances of Alistipes, Roseburia and Faecalibacterium, suggesting an association of these gut microbiota features with ketosis, and higher levels of HbA 1c . As also reported by Fang and colleagues, the anti-GAD autoantibody strongly associates with the structure and composition of the gut microbiome but negatively correlates with SCFA-producing bacteria [27]. Our results showed that serum anti-GAD levels can be considered as a microbiota-linked predictive biomarker associated with the development of T1D.
The patients with low anti-GAD levels were characterized by high abundances of Alistipes, Roseburia and Faecalibacterium, suggesting an association of these gut microbiota features with ketosis, and higher levels of HbA1c. As also reported by Fang and colleagues, the anti-GAD autoantibody strongly associates with the structure and composition of the gut microbiome but negatively correlates with SCFA-producing bacteria [27]. Our results showed that serum anti-GAD levels can be considered as a microbiota-linked predictive biomarker associated with the development of T1D.
The integration of omics data showed an invariant core of features for the T1D children non-dependent on the severity of the disease at the onset, composed by high levels of isobutyrate, malonate, Clostridium (Lachnospiraceae), Enterobacteriaceae, Bacteroidales and Clostridiales unk. families, as well as low levels of butyrate, galactose, ethanol, Alistipes, Odoribacter and Sutterella. Moreover, in the Biassoni et al. study, the increase in Enterobacteriaceae and the decrease in Sutterella have been associated to T1D population [28].
The integration of omics data showed an invariant core of features for the T1D children non-dependent on the severity of the disease at the onset, composed by high levels of isobutyrate, malonate, Clostridium (Lachnospiraceae), Enterobacteriaceae, Bacteroidales and Clostridiales unk. families, as well as low levels of butyrate, galactose, ethanol, Alistipes, Odoribacter and Sutterella. Moreover, in the Biassoni et al. study, the increase in Enterobacteriaceae and the decrease in Sutterella have been associated to T1D population [28].
Enterobacteriaceae overgrowth has been linked to the increase in intestinal permeability and inflammation [29]. The intestinal permeability may be considered a predisposing factor thus participating in the pathogenesis of T1D [19]. The opposite effect, i.e., the maintenance of the inter-epithelial tight junctions, is guaranteed by SCFAs, in particular butyrate, which was increased in CTRLs. Moreover, in T1D patients, a decrease in butyrogenic bacteria, i.e., Odoribacter and Collinsella, a lower butyrate production and a less butyryl-CoA transferase genes [30] was detected.
In T1D patients, malonate was increased and, in the group of patients with blood pH < 7.32, it was positively correlated with Parabacteroides. Interestingly, Parabacteroides distasonis, has been identified as the most significantly genus associated with T1D [21].
The increase in galactose metabolism pathway in T1D patients, has been already observed in a previous study [24] in which the T1D patients with HbA 1c ≥ 53 mmol/mol showed increased gut microbial galactose metabolism, also associated with higher αdiversity, as compared to CTRL. Moreover, an increase in the multiple sugar transport system (amongst them D-galactose) was also registered in T1D patients [20]. The passive transporting-in of nutrients is characteristic of auxotrophic bacteria that live in inflammatory environments where dead tissue provides easy access to many nutrients that are less available in the healthy gut [20].
Alistipes was increased in CTRL and negatively correlated with anti-GAD in T1D patients. This acetate producer seems to have a protective role for the gut health [31], despite in previous works was described as involved in inflammation processes, cancer and mental health [32].
It is noteworthy the inverse relationship between isobutyrate and butyrate in T1D patients compared to CTRL. Branched-chain fatty acids (BCFAs), such as isovalerate and isobutyrate, are produced by some gut microbes [33] responsible for the degradation pathways of leucine and valine, respectively [34]. However, isobutyrate is also produced by the activity of the B 12 -dependent Isobutyryl-CoA Mutase [35,36]. We inferred that T1D population can be characterized by bacteria with isobutyryl-CoA mutase.
Moreover, a reduced content of ethanol in stool of T1D patients, as compared to CTRL, could be related to a diet with a lower content of carbohydrates, as possibly the first tentative of nutritional intervention prior to the clinical diagnosis.
Finally, by comparing diabetic patients on the basis of their differences in blood pH, we obtained lower classification values compared to T1D vs. CTRL results. Despite the loss of this statistical power, this model could be considered noteworthy since still allowed us to discriminate patients who shared the same disease. Intriguingly, the features that discriminate T1D patients are mostly metabolites, suggesting a major role of the metabolism in T1D progression. In particular, T1D patients characterized by blood pH ≥ 7.32, showed high level of uracil, formate, guanine and hypoxanthine. The increase in guanine and hypoxanthine led us to assume the possible involvement of purine/pyrimidine metabolism pathways in the disease progression. The gut microbiota releases purine compounds available to the intestinal mucosa that can be therefore used for nucleotide genesis [37]. We inferred that, when patients have still a balanced production of ketone bodies, the described mechanism occurs maintaining the integrity of the mucosal barrier and guaranteeing a normal mucin intestinal production. In fact, it is known that an undamaged gut barrier can shield against the entry of infectious agents and dietary antigens, avoiding immune reactions with damage to pancreatic β-cells, increased cytokine levels and insulin resistance [38].
There are several limitations of the study. First, it still requires further clinical studies with a larger sample size to validate the functional and compositional fecal microbial profiles associate to the progression and severity of T1D. Second, animal experiments could help to determine the cause-effect relationship amongst gut microbiota composition and the destruction of pancreatic β-cells. Finally, for future longitudinal studies, the exploration of the alterations or restoration of the fecal microbiota after insulin treatment could be take into consideration.

Patient Recruitment
A cohort of 52 consecutive Caucasian children were enrolled within one week of T1D diagnosis at the Endocrinology and Diabetes Unit of Children's Hospital Bambino Gesù in Rome, Italy. Inclusion criteria were age between 5 and 15 years, glycemia > 126 mg/dL, glycated hemoglobin (HbA 1c ) > 6.5% (48 mmol/mol), c-peptide < 1 ng/mL. Exclusion criteria were other chronic and infective disease; use of antibiotic, pre/probiotic and propton-pump inhibitors in the last two months from recruitment.
Age at onset, gender, weight, birth weight, pubertal stage, blood pH, exogenous insulin need, body mass index (BMI), IAA, islet antigen 2 antibody (IA-2), anti-GAD, HbA 1c , High Density Lipoprotein Cholesterol (HDL), Low Density Lipoprotein Cholesterol (LDL), C-Reactive Protein (CRP) and presence of other autoimmune diseases (i.e., thyroiditis and celiac disease) were registered for each patient at the time of enrollment.
A cohort of 17 children that had a full sibling diagnosed with T1D was recruited at the Endocrinology and Diabetes Unit of Children's Hospital Bambino Gesù in Rome, Italy. Inclusion criteria were age between 5 and 17 years, general healthy condition, no gastrointestinal (GI) diseases, use of antibiotic, pre/probiotic in the previous two months from recruitment.
A cohort of 57 gender-and age-matched controls were enrolled during an epidemiological survey carried out at the Human Microbiome Unit of Bambino Gesù Children's Hospital in Rome (BBMRI Human Micro-biome Biobank, OPBG) to generate a reference digital biobank of healthy subjects (CTRL). This cohort did not have family history for autoimmune diseases. Moreover, they were normal weight and had no GI diseases, use of antibiotic, and pre/probiotic in the previous two months before the recruitment.
All patients, siblings and CTRLs followed a Mediterranean diet regimen attested by Food Frequency questionnaire (FFQ) to assess the adherence to the Mediterranean diet through targeted questions about the frequency of consumption of extra virgin olive oil, butter or margarine, fresh fruit and vegetables, legumes, fish, red meat and sausages, white meat, industrial bakery products, dried fruit sugary drinks and alcohol use.
The study was approved by the OPBG Ethical Committee (protocol 1274_OPBG_2016; healthy subjects: Protocol No. 1113_OPBG_2016) and was conducted in accordance with the Principles of Good Clinical Practice and the Declaration of Helsinki. Written informed consent was obtained from either parents or legal representative of children.
From each subject of these cohorts, a single fecal sample was collected and stored at −80 • C until further analyses. The T1D overall set was analyzed for gut microbiota ecology by 16S rRNA-based metagenomics. For the metabolomics and, as well, for the multi-block approach, analyses were performed on a sub-set of samples ("fused set"), consistently with material availability (Figure 6).

Biochemical and Immunological Analyses
Glycated hemoglobin (HbA1c) was analyzed by a spectrophotometric method (Bio-Rad Richmond, CA, USA). C-peptide were measured by chemiluminescence on an AD-VIA Centaur ® XPT Immunoassay System (Siemens Healthcare GmbH, Munich, Germany), a two-site sandwich immunoassay using direct chemiluminescent technology. To-

Biochemical and Immunological Analyses
Glycated hemoglobin (HbA1c) was analyzed by a spectrophotometric method (Bio-Rad Richmond, CA, USA). C-peptide were measured by chemiluminescence on an ADVIA Centaur ® XPT Immunoassay System (Siemens Healthcare GmbH, Munich, Germany), a two-site sandwich immunoassay using direct chemiluminescent technology. Total cholesterol, LDL and HDL cholesterol, and triglycerides were measured by an enzymatic method on a Roche automated clinical chemistry analyzer (Roche/Hitachi 904 analyzer, Roche Diagnostics, Mannheim, Germany). Anti-GAD, IA2, IAA antibodies were measured by Elisa method on Elite Impatto Twin plus (EUROSPITAL diagnostics, Trieste, Italy).

16S rRNA-Based Microbiota Profiling
DNA was extracted from 200 mg of stools using QIAmp Fast DNA Stool mini kit (Qiagen, Germany), following the manufacturer's instructions. The 16S rRNA V3-V4 variable region (~460 bp) was amplified by using the primer pairs described in the MiSeq rRNA Amplicon Sequencing protocol (Illumina, San Diego, CA, USA). The PCR reactions were set up using a 2× KAPA Hifi HotStart ready Mix (KAPA Biosystems Inc., Wilmington, MA, USA) following the manufacturer's protocol. AMPure XP beads (Beckman Coulter Inc., Beverly, MA, USA) were employed to clean DNA amplicons from primers and dimer primers. A unique combination of Illumina Nextera adaptor-primers for each sample was incorporated in amplicons by a second amplification step. The final library was cleaned-up and quantified using Quant-iT™ PicoGreen ® dsDNA Assay Kit (Thermo Fisher Scientific, Waltham, MA, USA). Samples were pooled together before the sequencing on an Illumina MiSeqTM platform according to the manufacturer's specifications to generate paired-end reads of 300 base-length [39]. The bioinformatic data analysis was performed using the QIIME 2 pipeline [40]. Forward and reverse raw fastq files were merged using PEAR v. 0.9.6. Merged reads were filtered for chimeras and length using DADA2 plugin of QIIME2 [41] with a trunc value of 400 nucleotides (nt). Taxonomy was assigned against the Greengenes 13_08 database, using a Naïve bayes classifier trained on the reference sequences of Greengenes 13_08 clustered at 99% of sequence similarity [42]. Data normalization was performed in R 4.0.3 while data filtering was performed by using custom scripts in python 3.6. The read number count was normalized using Cumulative Sum Scaling (CSS) method as implemented in the MetagenomeSeq (version 3.12) package of R. Clusters (Amplicon sequence variants [ASVs]) with a number of reads lower than 1% of the total read number were removed from the statistical analysis together with those taxa not present in at least 25% of the samples [43]. α-diversity was performed by scikit-bio (http://scikit-bio.org/, accessed on 31 March 2022) of Python 3 package and the p value for group comparisons was determined by Kruskal-Wallis test. β-diversity analysis was calculated on weighted and unweighted Unifrac distance matrices and Bray Curtis matrix, and graphed by principal coordinate analysis (PCoA) plots. The association between the covariates and β-diversity measures was assessed by permutational analysis of variance (PERMANOVA) [44]. The non-parametric Kruskal-Wallis test, corrected for FDR p value ≤ 0.05, was used to compare bacterial taxa relative abundance amongst groups. The correlation between clinical variables and bacterial taxa was calculated by Pearson's coefficient with relative p value. Ternary scatter plot on bacterial taxa relative abundances at phylum, family and genus levels was performed by Plotly Express of python.

1 H-NMR Metabolomic Analysis
An average quantity of 300 mg of frozen stools recovered from each sample was combined with 1 mL of PBS-D 2 O with 0.3% (final concentration) of sodium azide. The samples were thawed for 30 min at 25 • C and then vortexed to achieve a homogenous solution. The supernatant was separated from the solid phase through a first centrifugation at 10,000× g for 25 min at 4 • C, hence filtered on a 40 µm pores filter. After adding 200 µL of PBS-D 2 O with 0.3% of sodium azide, the samples were centrifuged again at 10,000× g for 25 min at 4 • C. Six hundred µL of supernatant were withdrawn and 60 µL of PBS-D 2 O containing 3-(trimethylsilyl) propionic-2,2,3,3-d 4 acid sodium salt (TSP, 2mM final concentration) were added. NMR spectra were acquired using a Bruker Avance III 400 spectrometer (Bruker BioSpin GmbH, Karlsruhe, Germany) equipped with a 9.4T magnet operating at 1 H frequency of 400.13MHz and at 298K. The assignment was achieved by means of bi-dimensional (2D) experiments (COSY, TOCSY, HSQC, HMBC) on selected samples and confirmed by comparison with web database [45], the literature [46][47][48][49] and in-house database. One-dimensional (1D) NMR spectra were processed and quantified by using ACD/Lab 1D NMR Manager ver. 12.0 software (Advanced Chemistry Development, Inc., Toronto, ON, Canada), whereas 2D-NMR spectra were processed by using Bruker TopSpin ver.3.1 (Bruker BioSpin GmbH) and MestreC ver.4.7.0.0 (Mestrelab Research SL, Santiago de Compostela, Spain). Phase and baseline of the NMR spectra were manually corrected. The quantification was carried out by comparing the integrals of the metabolites' resonances to the TSP one, then normalizing for the number of protons and for feces weight.

Statistical Analyses and Modeling
To evaluate the differences in stool metabolic profiles among T1D patients and CTRL, as well as among severity status subgroups, were performed multivariate and univariate analyses. As the first step, an unsupervised approach, by means of Principal Component Analysis (PCA), was applied to the matrix of the clinical variables composed of anti-GAD, IAA, IA2, HbA1c, cholesterol, exogenous insulin need, blood pH, age and c-peptide ( Figure S2) in order to highlight possible patients' clusters, to identify outliers and features of interest. All data were autoscaled before further data processing: operationally, each variable was first centered by subtracting its average from the data and then scaled through division by its standard deviation. The scores plot allowed to identify the blood pH as a parameter for the further stratification of the patients based on the severity status of the disease at the onset: T1D patients with blood pH ≥ 7.32 and pH < 7.32 ( Figure S3B). To evaluate the contribution of the metabolites and bacterial taxa for each class that were identified, a classification strategy based on the Partial Least Squares-Discriminant Analysis (PLS-DA) algorithm was applied on both metabolomics and the metagenomics data sets (Table S4). In this context, to evaluate the reliability of the prediction models and to identify a set of features significantly (and consistently) contributing to the model, an approach based on repeated double cross-validation (rDCV) was adopted.

Multi-Omics Data Integration
Since data were obtained by NMR-based metabolomics and taxonomical metagenomics, a multi-block (low-level fusion) approach was followed [50] to extract, simultaneously, the maximum of the information from the two approaches. Indeed, rather than being separately processed, the matrices were jointly elaborated to highlight the correlations between metabolites and bacterial taxa. For the classification stage, the predictive analyses were carried out by means of a multi-block partial least squares discriminant analysis (MB-PLSDA). MB-PLSDA [50] consists in building a PLS-DA model on the concatenated matrix resulting from the low-level data fusion of the two different blocks, after scaling each one through division by its Frobenius' norm in order to level out their relative contribution [50]. In particular, defining X mb and X mg the matrices resulting from the metabolomic and metagenomic analysis, respectively, their squared Frobenius' norm is given by the sum of their squared elements: where X i is either X mb or X mg and x jk i is the generic element of that matrix. In order to perform multi-block modelling, X mb or X mg are normalized and concatenated row-wise (low-level fusion) to obtain the matrix X conc : Standard PLS-DA is then used to build the classification model based on X conc . Since MB-PLSDA is a predictive model, a validation phase is needed to evaluate the reliability of its prediction. To this purpose, an approach based on repeated double crossvalidation (rDCV) strategy, to estimate the confidence intervals for the model predictions and the consistence of candidate biomarkers based on their VIP value, was applied. Finally, to rule out any possibility that good classification results could be obtained by chance, permutation tests were applied to estimate in a non-parametrical fashion the distributions of the classification figures of merit under the null hypothesis for significance testing.
Following the multivariate results, to the significant metabolites and bacterial taxa Mann-Whitney's u-test was applied to assess differences in the levels within each of the specific class. Bonferroni-corrected p values ≤ 0.05 were considered significant.

Data and Resource Availability
All raw sequences have been archived in the NCBI database: PRJNA702261 and PRJNA280490 (https://www.ncbi.nlm.nih.gov/bioproject).

Conclusions
Our results evidenced both specific gut microbiota profiles in T1D patients at the onset and other linked to familiarity. Thus, a comprehensive characterization of the gut microbiota could be associated to a genetic test to define the diabetes predisposition in a T1D familial context. Moreover, the presence of specific microbial taxa in gut linked to the serum anti-GAD levels can be considered as a microbiota predictive biomarker associated with the progression of T1D. Furthermore, we find gut microbiota specific functional traits associated to blood acidosis, indicating a role of gut microbiota in disease severity. Therefore, the routine characterization of the composition and function of the gut microbiota could be useful in patients' clinical monitoring to assess disease status and progression. This evidence could open new avenues on patient's treatments at onset, exploring the opportunity to combine insulin administration with probiotics, prebiotics or fecal microbiota transplantations (FMT) at onset and during clinical disease progression.

Institutional Review Board Statement:
The study was conducted in accordance with the Declaration of Helsinki, and approved by the Ethics Committee of Bambino Gesù Children's Hospital (protocol 1274_OPBG_2016; healthy subjects: Protocol No. 1113_OPBG_2016).
Informed Consent Statement: Written informed consent was obtained from either parents or legal representative of children.