Humanized NSG Mouse Models as a Preclinical Tool for Translational Research in Inflammatory Bowel Diseases

The development of animal models reflecting the pathologies of ulcerative colitis (UC) and Crohn’s disease (CD) remains a major challenge. The immune-compromised NOD/SCID/IL2rγnull (NSG) mouse strain tolerates the engraftment of human peripheral blood mononuclear cells (PBMC) derived from patients with UC (NSG-UC) or CD (NSG-CD). This offers the opportunity to examine the impact of the individual immunological background on the development of pathophysiological manifestations. When challenged with ethanol, NSG-UC mice exhibited a strong pro-inflammatory response, including the development of edema, influx of human T cells, B cells and monocytes into the mucosa and submucosa, and elevated expression of the inflammatory markers CRP and CCL-7. Fibrotic alterations were characterized by an influx of fibroblasts and a thickening of the muscularis mucosae. In contrast, pathological manifestations in NSG-CD mice developed without challenge and were characterized by extensive collagen deposition between the muscularis propria and muscularis mucosae, as observed in the areas of strictures in CD patients. Vimentin-expressing fibroblasts supplanting colonic crypts and elevated expression of HGF and TGFß corroborated the remodeling phenotype. In summary, the NSG-UC and NSG-CD models partially reflect these human diseases and are powerful tools to examine the mechanisms underlying the inflammatory processes in UC and CD.


Introduction
Inflammatory bowel disease (IBD), comprising Crohn's disease (CD) and ulcerative colitis (UC), is an idiopathic, chronic, inflammatory disorder of the gastrointestinal tract. Its prevalence is increasing worldwide, currently affecting 3.5 million people in Europe and North America [1,2]. The immune-pathogenesis of CD and UC is not fully understood; however, it is presently thought that genetic pre-dispositions, along with environmental factors, contribute to a breach of tolerance against the colonic microbiome, leading to excessive intestinal inflammation [3][4][5]. Currently, the differential diagnosis of UC and CD relies on endoscopic and histological analyses [6]. UC is characterized by superficial mucosal inflammation and rectal bleeding restricted to the colon. In contrast, CD involves transmural inflammation, which can cause perforation, strictures and extensive fibrosis, and can affect the entire gastrointestinal tract. UC and CD are both umbrella diagnoses covering multiple disease forms that are distinguished by clinical manifestation, severity,

Pathological Manifestations
Immune-compromised NSG mice were reconstituted with PBMCs isolated from patients with UC (Figure 1). As controls, NSG mice reconstituted with PBMCs from a healthy donor were included (NSG-non-IBD). The basic patient characteristics and study groups are listed in Table 1. To induce disease-specific symptoms, the NSG-UC and NSG-non-IBD mice were challenged twice by rectal application of 10% and 50% ethanol on days 7 and 14, respectively. The mice from both groups experienced mild weight loss upon challenge with ethanol; however, severe diarrhea was only observed in the NSG-UC mice. The clinical symptoms were classified according to a clinical score throughout the study (Figure 2). For analysis, only studies with matched groups were selected (Table 1 UC 1-4; for complete data set, see Table S1).
The clinical scores in the unchallenged NSG-non-IBD and NSG-UC mice fluctuated around a mean value of 0.6, most likely reflecting variability in weight measurements. Upon challenge with ethanol, the clinical score increased in both groups, indicating that the rectal application of ethanol had an impact on the mice (mean values of 3.2 and 2.6, respectively; unchallenged versus challenged, NSG-non-IBD: p = 0.055; NSG-UC: p = 0.01; ANOVA followed by Tukey's HSD). On day 18, the mice were euthanized, and their colons were removed, subjected to macroscopic inspection, and classified according to a macroscopic score as described in Section 4. Here, a clear difference was observed between the two models. In contrast to the challenged NSG-non-IBD mice, the colons of the ethanol-challenged NSG-UC mice exhibited unformed pellets, dilatation and colon shortening. In general, the macroscopic inspection of the colon corroborated the clinical score (Figure 2).
Figure 2. (A) Clinical and macroscopic colon scores of NSG-non-IBD mice (unchallenged N = 2, n = 8; challenged N = 2, n = 8) and NSG-UC mice (unchallenged N = 4, n = 21; challenged N = 4, n = 23) depicted in Cumming plots. NSG mice were reconstituted with PBMCs on day 1 and were left unchallenged or challenged with 10% ethanol on day 8 and 50% ethanol on day 15. The upper part of the plot presents each data point in a swarmplot. The mean and standard deviation (SD) of each group are plotted as a gapped line, where the vertical lines correspond to the mean ± SD and the mean itself is depicted as a gap in the line. The 0 point of the difference axis is based on the mean of the reference group (control). The dots show the difference between groups (effect size/mean difference). The shaded curve shows the entire distribution of expected sampling error for the difference between the means (the higher the peak, the smaller the error). The error bar in the filled circles indicates the 95% confidence interval (bootstrapped) for the difference between means. N: no. of donors, n: no. of mice. (B) Macrophotographs of representative colons. For the challenged NSG-UC mice, colons with a score of 3 and 7 are depicted. The bar indicates 10 cm.
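The bootstrapped 95% confidence interval for the difference between means, as described for the Cumming plots, can be illustrated with a minimal sketch. It uses a simple percentile bootstrap, which is an assumption; the plotting software used for the figures may apply a different bootstrap variant. The function name and the score values are hypothetical and are not data from this study.

```python
import random
import statistics


def bootstrap_mean_diff_ci(control, test, n_resamples=5000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for mean(test) - mean(control)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its original size.
        c = rng.choices(control, k=len(control))
        t = rng.choices(test, k=len(test))
        diffs.append(statistics.fmean(t) - statistics.fmean(c))
    diffs.sort()
    lower = diffs[int(alpha / 2 * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    observed = statistics.fmean(test) - statistics.fmean(control)
    return observed, lower, upper


# Hypothetical clinical scores (illustration only, not study data).
unchallenged = [0, 1, 0, 1, 1, 0, 1, 1]
challenged = [2, 3, 4, 2, 3, 5, 1, 3]
diff, lo, hi = bootstrap_mean_diff_ci(unchallenged, challenged)
print(f"mean difference = {diff:.2f}, 95% CI = [{lo:.2f}, {hi:.2f}]")
```

The observed mean difference corresponds to the dot on the difference axis, and the two percentile limits correspond to the ends of the error bar.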
To further examine the effect of ethanol, the histopathological manifestations were examined using two different staining methods to visualize either the colon architecture (hematoxylin and eosin, H&E) or the connective tissue (Sirius Red, SR) (Figure 3). Various pathological phenotypes were observed in the NSG-UC mice, including edema, influx of inflammatory cells into the mucosa and submucosa, crypt elongation, tufting, goblet cell loss and fibrosis. In areas of complete destruction, fibrosis was not as profound as in areas of mild inflammation (Figure 2, NSG-UC histological score = 16 or 7). In contrast, the NSG-non-IBD mice exhibited hardly any signs of inflammation and only mild edema. Occasionally, fibrotic alterations were observed. The sections were classified according to a histological score described in Section 4. As shown in Figure 3B, the only significant differences were observed when comparing the unchallenged and challenged NSG-UC mice (p = 0) or the challenged NSG-non-IBD and NSG-UC mice (p = 4 × 10⁻⁵; ANOVA followed by Tukey's HSD).
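The first step of the group comparison named above, a one-way ANOVA, can be sketched as follows. This is a minimal illustration that computes only the F statistic and its degrees of freedom; the p-value and the Tukey HSD post hoc comparisons would normally come from a statistics package. The function name and the score values are hypothetical and are not data from this study.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F statistic, df between groups, df within groups)."""
    all_values = [v for g in groups for v in g]
    grand_mean = fmean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the individual values around their own group mean.
    ss_within = sum((v - fmean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical histological scores for four study groups (illustration only).
scores = {
    "NSG-non-IBD unchallenged": [1, 2, 1, 2],
    "NSG-non-IBD challenged": [2, 3, 2, 4],
    "NSG-UC unchallenged": [2, 3, 3, 4],
    "NSG-UC challenged": [7, 9, 12, 16],
}
f_stat, df_b, df_w = one_way_anova_f(list(scores.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F value indicates that the group means differ by more than the scatter within the groups would explain; the pairwise post hoc test then identifies which groups differ.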
To corroborate the observation that the immunological background of the donor affected the development of symptoms in the NSG-UC mice, a correlation analysis was performed between the patients' Simple Clinical Colitis Activity Index (SCCAI) scores and the mean values of the histological scores observed in the mice that were reconstituted with PBMCs from the respective donors. As shown in Figure 4, this correlation was significant (Pearson's product-moment correlation analysis, cor = 0.49, p-value = 0.018, CI = 0.11-1).
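The correlation analysis can be sketched as below. The reported interval has an upper limit of 1, which would be consistent with a one-sided interval, although the text does not state this; the sketch therefore computes a one-sided lower limit via the Fisher z-transformation as an assumption. The function names and the paired values are hypothetical and are not data from this study.

```python
import math
from statistics import NormalDist, fmean


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two paired samples."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def one_sided_lower_bound(r, n, conf=0.95):
    """Lower limit of a one-sided CI for r (Fisher z-transformation, n > 3)."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - NormalDist().inv_cdf(conf) * se)


# Hypothetical pairs: donor SCCAI score and mean histological score of the
# mice reconstituted from that donor (illustration only).
sccai = [2, 4, 5, 7, 9, 10]
mean_hist_score = [3, 6, 4, 9, 8, 12]
r = pearson_r(sccai, mean_hist_score)
lower = one_sided_lower_bound(r, len(sccai))
print(f"r = {r:.2f}, one-sided 95% CI = [{lower:.2f}, 1]")
```

Because the standard error depends only on the number of pairs, the lower limit moves closer to r as more donors are included.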
To validate the observations obtained from the analysis of clinical scores, inflammatory markers associated with the inflammatory processes were analyzed (UC 3-5). Figure 5 provides an example of common inflammatory markers, such as msCCL-7, msCRP and msIL-6, which were analyzed in protein extracts from the distal parts of the colon using Luminex. All selected markers were significantly elevated upon challenge with ethanol (CCL-7: unchallenged mean = 679.5 ± 702.31 pg/mL, challenged mean = 3382.81 ± 3540 pg/mL, p = 0.001; CRP: unchallenged mean = 6232.37 ± 2642 pg/mL, challenged mean = 9481.12 ± 2807 pg/mL, p = 0.01 (Student's t-test); IL-6: unchallenged mean = 5.5 ± 2.9 pg/mL, challenged mean = 1415 ± 3872 pg/mL, p = 0.02 (Wilcoxon rank-sum test)).
Figure 5. The mice were treated as described in Figure 2 (unchallenged N = 2, n = 8; challenged N = 4, n = 24). Levels of msCRP, msCCL-7 and msIL-6 were detected using Luminex from colon extracts and depicted as Cumming plots. The upper part of the plot presents each data point in a swarmplot. The mean and standard deviation (SD) of each group are plotted as a gapped line, where the vertical lines correspond to the mean ± SD and the mean itself is depicted as a gap in the line. The 0 point of the difference axis is based on the mean of the reference group (control). The dots show the difference between groups (effect size/mean difference). The shaded curve shows the entire distribution of expected sampling error for the difference between the means (the higher the peak, the smaller the error). The error bar in the filled circles indicates the 95% confidence interval (bootstrapped) for the difference between means. The labeling on the right side applies to all graphs. N: no. of donors, n: no. of mice.

Characterization of Cells Involved in Inflammation
To gain a better understanding of the immunological compartment of the colon in the NSG-UC mice, an immunohistochemical analysis was performed. As we expected that cells of human origin would migrate into the mucosa and submucosa, anti-huCD4, anti-huCD8, anti-huCD19 and anti-huCD14 antibodies were used. As shown in Figure 6, human CD4 and CD8 T cells, human B cells and human monocytes were detected. Interestingly, these cells seemed to concentrate at the tip of the crypt, suggesting that spatially expressed chemokines direct the migration of human leukocytes. Further analysis to characterize the subtypes of cells needs to be performed in the future.


Figure 6. Cells of human origin migrate into the mucosa and submucosa. Sections from the distal parts of the colon were stained with green Alexa Fluor anti-human CD4, CD8, CD19 and CD14 antibodies. Scale bar represents 100 µm.

Fibrotic Alterations of the Colon
As shown in Figure 3, fibrosis is an important part of the inflammatory process in the NSG-UC mice. The fibrotic regions of the sections did not exhibit a profound influx of human leukocytes but indicated an influx of fibroblasts. Colonic fibroblasts are a heterogeneous population consisting of at least three different cell types, including subepithelial myofibroblasts (vimentin+, CD90+, and αSMA+), myocytes (vimentin+, αSMA+, and desmin+) and pericytes (vimentin+, CD90−, desmin+, and Acta2 low) [12][13][14]. Therefore, the selected sections were stained with anti-vimentin, anti-αSMA, anti-desmin and anti-TRPA1 antibodies; TRPA1 has previously been identified as a potential therapeutic target in fibroblasts and myofibroblasts [15].
As shown in Figure 7A, αSMA was predominantly expressed in the muscularis propria and in the muscularis mucosae. In the NSG-non-IBD sections, the muscularis mucosae was visible as a thin line, and weak staining was also observed between the crypts in different sections. Staining with desmin superposed αSMA staining, indicating the presence of myocytes in the muscularis mucosae. Upon challenge with ethanol, the muscularis mucosae thickened in the NSG-UC mice, and αSMA-positive fibroblasts squeezed between the crypts, replacing the epithelial cells. As these cells were desmin negative, these fibroblasts were most probably myofibroblasts. This observation was also corroborated by anti-vimentin staining. Vimentin staining was extremely weak in the NSG-non-IBD sections and intense in areas that had been identified as fibrotic by H&E staining, as shown in Figure 2. Some fibroblasts were double positive for vimentin and TRPA1, suggesting that suppressing TRPA1 may affect fibrosis. To support these findings, fibroblasts were isolated from the colons of the mice (Figure 7B) and cultivated for five days. As expected, the H&E staining displayed a heterogeneous morphology of fibroblasts, suggesting a heterogeneous population in the isolates. All cells expressed vimentin, thus confirming the isolation of fibroblasts. However, the expression pattern of other markers reflected the heterogeneity of the fibroblastic population. αSMA, desmin and TRPA1 were not expressed in every cell.

Testing of Therapeutics in the NSG-UC Mouse Model
The NSG-UC mouse model has become a well-accepted model for validating the efficacy of novel and approved therapeutics addressing human molecular targets (Table 2). This provides a significant advantage as it avoids the need to develop murine surrogate inhibitors and allows inhibition of human-specific pathways, which may differ or may not exist in conventional murine models. In this study, we tested eleven different therapeutics.

Table 2.
Therapeutic      Target   Effect          Reference
Tofacitinib      Jak      ameliorating    [21]
Infliximab       TNFα     ameliorating    [21]
Adalimumab       TNFα     ameliorating    [22,23]
Oxelumab         OX40L    ameliorating    [22]
Anti-CCR4 mAb    CCR4     deteriorating   [18]

One noticeable issue in this study is inherent variability, which includes the impact of the immunological background of the donor and the variability in scores and marker expression within each group. Therefore, a comprehensive analysis of all markers is necessary to obtain reliable results. In most of our studies, we applied OPLS-DA analysis, which not only allows a comparison of the efficacy of therapeutics but also provides quantitative data. Figure 8 presents an example of an OPLS-DA analysis of Infliximab- and Tofacitinib-treated mice. Tofacitinib showed lower R²X and higher R²Y and Q²Y values, and a higher RMSEE value.

Figure 8. Examples of OPLS-DA analysis of inflammatory parameters obtained from NSG-UC mice treated with Infliximab or Tofacitinib. The mice were treated as described in [21]. Clinical, colonic and histological scores, and levels of msIL-6, msTGFß, msCRP, msCCL-7 and tryptophan were used for modeling. R²X: fraction of variation in the X variables explained by the model; R²Y: fraction of variation in the Y variables explained by the model; Q²Y: fraction of variation in the Y variables predicted by the model; RMSEE: root mean square error of estimation.

The NSG-CD Mouse Model
The impact of the immunological background of the donor was also obvious in the NSG-CD model. While the treatment scheme was identical to that of the NSG-UC model, the pathological manifestations differed (Figure 9) [24]. In the NSG-CD mice, the development of edema and the influx of inflammatory cells were less prominent compared to the NSG-UC mice, and complete destruction of the mucosa was rarely observed. However, a thickening of the muscularis mucosae and an enlargement of the submucosa were frequently observed (Figure 9A). These areas exhibited collagen deposition and the presence of fibroblasts and fibrocytes (Figure 9B). Additionally, unlike the NSG-UC mice, there was no significant difference in the histological score between the unchallenged (mean = 2.21 ± 1.47) and challenged mice (mean = 3.53 ± 2.02), suggesting that patient-derived PBMCs can spontaneously induce remodeling of the inflammatory response (Figure 2). In this analysis, only matched groups were selected (CD1, 3). Human CD8+ cytotoxic T cells were detected in the sections of the NSG-CD mice (Figure 9B), although the influx of inflammatory cells was negligible compared to the NSG-UC mice.
The detected levels of inflammatory markers corroborated the observation of a less inflammatory phenotype in the NSG-CD model (Figure 10). Levels of CRP and CCL-7 were significantly lower in the NSG-CD mice (CRP: NSG-UC mean = 8469.01 ± 3420.11 pg/mL, NSG-CD mean = 3671.4 ± 2736.52 pg/mL, p = 0; CCL-7: NSG-UC mean = 2724.23 ± 3425.72, NSG-CD mean = 651.08 ± 1349.15, p = 0.0035; Welch two-sample t-test). Conversely, remodeling markers like HGF and TGFß were elevated in the NSG-CD mice (HGF: NSG-UC mean = 4.66 ± 3.06 ng/mL, NSG-CD mean = 46.48 ± 35.94 ng/mL, p = 2 × 10⁻⁵; TGFß: NSG-UC mean = 26.13 ± 7.01 g/mL, NSG-CD mean = 31.00 ± 10.11, p = 0.02). These observations also suggest that the NSG-CD model reflects the human disease.
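The Welch two-sample t-test used for the marker comparison can be illustrated from summary statistics alone. The sketch below computes the t statistic and the Welch-Satterthwaite degrees of freedom; the p-value would be obtained from the t distribution with a statistics package. The function name is hypothetical, and the group sizes in the usage example are assumptions because they are not given in the text for this comparison.

```python
import math


def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Reported CRP means and SDs (pg/mL); n = 20 per group is a hypothetical
# placeholder, not a value from the study.
t, df = welch_t_from_summary(8469.01, 3420.11, 20, 3671.4, 2736.52, 20)
print(f"t = {t:.2f}, df = {df:.1f}")
```

Unlike Student's t-test, this variant does not assume equal variances in the two groups, which matters here because the reported standard deviations differ considerably between the models.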

Discussion
The gap between preclinical mouse models and human diseases appears to be irreconcilable due to species-specific signalling pathways, non-matching cellular populations and physiological differences in organ function. This disparity is particularly evident in chronic inflammatory diseases, which exhibit heterogeneous manifestations influenced by factors such as age, disease duration, genetic predisposition and individual etiopathologies. Despite these challenges, mice remain the preferred species for practical reasons. Conventional mouse models typically induce pathophysiological symptoms through the application of chemical substances like DSS or TNBS, resulting in a predominantly pro-inflammatory response [9,10]. While these models have been valuable for validating anti-inflammatory therapeutics, their results have limited predictive value for clinical trials or therapeutic responsiveness, and they poorly reflect the fibrotic alterations observed in UC and CD.
The NSG-IBD models described in this study offer several advantages over conventional models. First, they provide a more accurate reflection of these human diseases. From an evolutionary point of view, it is reasonable to assume that the inflammatory processes in IBD are aberrant wound healing processes. These processes involve the recruitment of pro-inflammatory T cells, B cells, monocytes and neutrophils to protect the epithelial barrier from invading pathogens. In the NSG-UC model, this response was observed with the development of edema filled with a mixed infiltrate of leukocytes upon ethanol application. Additionally, human T cells, B cells and monocytes migrated to the tip of the crypts for further protection. Similar to UC, these uncontrolled processes led to the destruction of the mucosa. As in wound healing processes, a second arm of inflammation was observed, which involved fibrosis to seal the damaged area. While variations of fibrosis occurred in the NSG-UC mice, fibrotic alterations were more pronounced in the NSG-CD mice at the expense of a pro-inflammatory phenotype. Invading fibroblasts squeezed and replaced the crypts, and the NSG-CD mice exhibited profound collagen deposition between the muscularis mucosae and the muscularis propria, resembling the loss of flexibility of the colon seen in strictures of CD. Both models exhibited a coexistence of pro-inflammatory and remodeling processes, with one process often predominating over the other. These observations were supported by the expression of pro-inflammatory markers (CRP and CCL-7) in the NSG-UC mice and remodeling markers (HGF and TGFß) in the NSG-CD mice.
Fibrotic areas were characterized by collagen deposition and the presence of fibrocytes and fibroblasts. While it is believed that monocyte-derived fibrocytes migrate from the blood to damaged areas [25,26], the origin of inflammatory fibroblasts is less understood. Given the heterogeneity of the mucosal fibroblast population, it is not clear which subtype contributes to the pool of inflammatory fibroblasts. The study by Kinchen et al. [14] suggests that a specific subtype of fibroblasts proliferates in the inflammatory environment of UC, characterized by an elevated expression of chemokine ligand-19, -21, TNF Superfamily Member 14, major histocompatibility invariant chain CD74, clusterin, interleukin-33, CD24 and podoplanin. However, this study does not provide guidance on targeting excessive fibrosis. Another study by West et al. [27] suggests that inflammatory fibrocytes express the oncostatin M receptor, potentially allowing targeting with inhibitors. The origin of these fibroblasts was not elucidated in this study.
Interestingly, fibrosis in the NSG-CD model developed spontaneously without the need for ethanol application. This observation suggests that PBMCs from CD and UC patients carry intrinsic information that leads to the distinct phenotypes in NSG mice. One possible explanation is that an auto-toxic T cell response triggers CD, as a subtype of CD8+ T cells expressing genes indicating a canonical effector phenotype, including KLRG1, GZMB, GZMK, PRF1, IFNG and FCRL6, has been detected in terminal ileal resections [28]. On the other hand, the response to the mild irritant ethanol in the NSG-UC mice suggests that PBMCs derived from UC donors are differently harnessed and therefore evoke a different pathological phenotype. The specific differences are not yet understood, but these models provide an opportunity to gain a better understanding of the onset and mechanism of UC and CD.
Second, these models partially reflect the inflammatory background of the donor and, when combined with patient profiling, may eventually enable stratification of patients for clinical trials, reducing the risk in clinical studies [23]. Currently, results obtained in animal models indicating efficacy have limited value as 54% of drugs fail in phase II clinical trials due to the lack of efficacy [29]. While inadequate animal models are not the sole cause of failure, the NSG-IBD models offer the possibility of predicting therapeutic response based on the inflammatory profile of patients selected for reconstitution [21,23].
Third, these models allow the testing of therapeutics targeting human molecules, potentially avoiding the need for developing surrogate inhibitors specific to murine targets or testing in non-human primates.
However, the IBD models described in this study have limitations. They are still chimeric and reliant on the interaction of murine ligands with cognate human receptors and vice versa. Although certain chemokines and chemokine receptors appear to facilitate the directed migration of human leukocytes to the mucosa, the communication between resident murine cells and migrating leukocytes is not fully understood. This may limit the extent to which the NSG-IBD models accurately reflect these human diseases. In addition, these models exhibit inherent variability due to the inflammatory background of the donor, levels of engraftment, and diverse pathological manifestations encompassing both pro-inflammatory responses and various forms of fibrosis. Lastly, they are logistically more complicated than conventional models and require the commitment of patients who are willing to donate blood.
In summary, despite these limitations, we believe that the advantages offered by these models outweigh their drawbacks and that they provide powerful tools to elucidate inflammatory mechanisms in UC and CD.

Isolation of PBMCs and Engraftment
A total of 60 mL of peripheral blood in trisodium citrate solution (S-Monovette, Sarstedt, Nürnberg, Germany) was collected from the arm vein of healthy donors or IBD patients, as described previously [24]. The blood was diluted with Hank's balanced salt solution (HBSS, Sigma-Aldrich, Deisenhofen, Germany) at a 1:2 ratio. The suspension was loaded onto LeucoSep tubes (Greiner Bio One, Frickenhausen, Germany). PBMCs were separated by centrifugation at 400 × g for 30 min without acceleration. The interphase was extracted and diluted with phosphate-buffered saline (PBS) to a final volume of 40 mL. Cells were counted and centrifuged at 1400 × g for 5 min. The cell pellet was resuspended in PBS at a concentration of 4 × 10⁶ cells in 100 µL.
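The final resuspension step is simple proportional arithmetic, and a short sketch may help when planning an experiment. The function names and the example cell yield below are hypothetical and not taken from the study; the only values from the protocol are the target of 4 × 10⁶ cells per 100 µL.

```python
def resuspension_volume_ul(total_cells: float,
                           cells_per_aliquot: float = 4e6,
                           aliquot_volume_ul: float = 100.0) -> float:
    """Volume of PBS (in µL) so that each aliquot_volume_ul holds cells_per_aliquot cells."""
    if total_cells < 0 or cells_per_aliquot <= 0 or aliquot_volume_ul <= 0:
        raise ValueError("cell counts and volumes must be positive")
    return total_cells / cells_per_aliquot * aliquot_volume_ul


def aliquots_available(total_cells: float, cells_per_aliquot: float = 4e6) -> int:
    """Number of complete 100 µL aliquots that the counted cells can supply."""
    if total_cells < 0 or cells_per_aliquot <= 0:
        raise ValueError("cell counts must be positive")
    return int(total_cells // cells_per_aliquot)


# Hypothetical yield of 6 × 10^7 PBMCs (illustrative value, not from the study).
example_yield = 6e7
print(resuspension_volume_ul(example_yield))  # volume of PBS in µL
print(aliquots_available(example_yield))      # number of 100 µL aliquots
```

In practice, some dead volume would have to be allowed for, which this sketch deliberately ignores.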

Study Protocol
NSG mice were obtained from Charles River Laboratories (Sulzfeld, Germany). The mice were kept under specific pathogen-free conditions in individually ventilated cages in a facility controlled according to the Federation of European Laboratory Animal Science Associations (FELASA) guidelines. The mice were between 6 and 8 weeks of age.
Following engraftment on day 1, the mice were presensitized by rectal application of 150 µL of 10% ethanol on day 8 using a 1 mm cat catheter (Henry Schein, Hamburg, Germany), as previously described [23]. The catheter was lubricated with Xylocain® Gel 2% (AstraZeneca, Wedel, Germany). Rectal application was performed under general anesthesia using 4% isoflurane. After application, the mice were kept at an angle of 30° to avoid ethanol dripping. On day 15, the mice were challenged by rectal application of 150 µL of 50% ethanol following the protocol of day 8. On day 18, the mice were sacrificed. Therapeutic antibodies (150 µL in PBS) were applied i.p. on days 7 and 14, and small-molecule inhibitors (150 µL in PBS or in 0.5% methylcellulose gel in PBS; Merck KGaA, Darmstadt, Germany, Cat# M0512) were applied i.p. on days 7-9 and 14-17.
Body posture: intermediately hunched posture (1), and permanently hunched posture (2). The scores were added daily to a total score with a maximum of 12 points per day. Animals that suffered from weight loss > 20%, rectal bleeding, rectal prolapse, self-isolation or a severity score > 7 were euthanized immediately and excluded from the analysis. All scores were added for statistical analysis.
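The euthanasia criteria listed above amount to a simple rule that can be checked for each daily observation. The following is a minimal sketch of that rule; the class and field names are hypothetical, and only the thresholds (weight loss > 20%, severity score > 7) and the listed clinical signs come from the text.

```python
from dataclasses import dataclass


@dataclass
class DailyObservation:
    """One animal on one day; field names are illustrative assumptions."""
    weight_loss_pct: float   # weight loss relative to baseline, in percent
    rectal_bleeding: bool
    rectal_prolapse: bool
    self_isolation: bool
    severity_score: int      # daily clinical score (maximum of 12 points)


def meets_humane_endpoint(obs: DailyObservation) -> bool:
    """Return True if any of the stated euthanasia criteria is fulfilled."""
    return (
        obs.weight_loss_pct > 20
        or obs.rectal_bleeding
        or obs.rectal_prolapse
        or obs.self_isolation
        or obs.severity_score > 7
    )


# Illustrative observations (not study data).
stable = DailyObservation(8.0, False, False, False, 3)
severe = DailyObservation(22.5, False, False, False, 5)
print(meets_humane_endpoint(stable), meets_humane_endpoint(severe))
```

Both thresholds are treated as strict inequalities, as written in the criteria, so an animal at exactly 20% weight loss or a score of exactly 7 does not trigger the rule on that criterion alone.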

Histological Analysis
Necropsy samples from the distal parts of the colon were fixed in 4% formaldehyde for 24 h, stored in 70% ethanol and routinely embedded in paraffin. The samples were cut into 3 µm sections and stained with hematoxylin and eosin (H&E) and Sirius Red (SR).
Epithelial erosions were scored as follows [23]: no lesions (1), focal lesions (2), multifocal lesions (3), and major damage with involvement of basal membrane (4). Inflammation was scored as follows: infiltration of few inflammatory cells into the lamina propria (1), major infiltration of inflammatory cells into the lamina propria (2), confluent infiltration of inflammatory cells into the lamina propria (3), and infiltration of inflammatory cells including tunica muscularis (4). Fibrosis was scored as follows: focal fibrosis (1) and multifocal fibrosis and crypt atrophy (2). The presence of edema, hyperemia and crypt abscess was scored with 1 additional point in each case. The scores for each criterion were added to a total score ranging from 0 to 12. Images were taken with a Zeiss AXIO Observer microscope (Zeiss, Oberkochen, Germany) using the Zeiss ZEN2 lite software.

Immunohistochemistry (IHC) and Immune Cytochemistry (ICC)
For the IHC of the colon, the samples from different parts of the murine colon were fixed in 4% formaldehyde for 24 h, stored in 70% ethanol and embedded in paraffin. The samples were cut into 3 µm sections. After de-paraffinization and rehydration with xylene and ethanol, antigen retrieval in 1 mM EDTA was conducted.
For ICC of fibroblast cell cultures, the medium was removed and cells were washed twice with 2 mL of PBS. Following fixation with 2 mL of ROTI® Histofix 4% formaldehyde for 15 min at RT, the cells were washed three times with 2 mL of PBS and blocked in 1 mL of blocking buffer (5% FBS in 1 × PBS) at RT for 30 min.

Ex Vivo Fibroblast Cultivation
Fibroblasts were isolated from NSG mouse colon by adapting the protocol described in [30]. At necropsy, the mouse colon was removed and placed in a Petri dish containing ice-cold PBS. The colon was cleaned with ice-cold 1 × PBS using a syringe, cut open lengthwise and transferred into a Falcon tube containing 20 mL of ice-cold 1 × PBS. The ice-cold 1 × PBS was replaced three times until the fluid became clear. The colon was placed in a Falcon tube containing 25 mL of ice-cold 1 × HBSS and then incubated for 15 min at 37 °C. All subsequent steps were performed under a sterile hood. The colon was washed with 1 × HBSS and then transferred into 20 mL of digestion medium (20 mL of complete medium + 20 mg of Dispase II (Sigma Aldrich, Deisenhofen, Germany) + 1.8 mL of 10 mg/mL Collagenase IV (Sigma Aldrich, Deisenhofen, Germany) in HBSS). The digestion sample was placed in a shaking water bath at 37 °C for 1 h 15 min until it looked stringy. During digestion, the samples were vortexed every 15 min. Afterward, the samples were centrifuged at 4 °C at 200 rcf for 5.5 min. The supernatant and remaining solid colon were discarded. The samples were centrifuged again at 4 °C at 200 rcf for 5.5 min. The remaining supernatant was discarded. The pellet was resuspended in ~7.5 mL of complete medium (1 × RPMI 1640 Medium + 10% FBS + 1% Penicillin-Streptomycin). The cells were passed through a 70 µm cell strainer into a fresh tube and 350 µL of cell suspension was seeded on the coverslips (24 × 24 mm) of a 6-well plate. For sterilization, the coverslips had previously been incubated in 70% EtOH for at least 1 h, followed by three washing steps with PBS. A total of 350 µL of complete medium was added into the wells. After incubation at 37 °C and 5% CO₂ for 3 h, the wells were washed twice with 450 µL of HBSS. Then, 1200 µL of complete medium was added for incubation. After 24 h, the medium was replaced by the same amount of fresh complete medium.
The cells could be cultivated up to 7 days. For H&E staining, the cells were permeabilized with 2 mL of 0.5% Tween-20 in PBS for 20 min.
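For orientation, the approximate final enzyme concentrations in the digestion medium can be worked out from the stated recipe. The sketch below assumes that the volumes are additive (20 mL of complete medium plus 1.8 mL of collagenase stock) and that the volume of the Dispase II powder is negligible; these assumptions and the variable names are illustrative and not stated in the protocol.

```python
# Quantities taken from the recipe above.
complete_medium_ml = 20.0
dispase_mg = 20.0
collagenase_stock_ml = 1.8
collagenase_stock_mg_per_ml = 10.0

# Assumption: volumes are additive and the dispase powder adds no volume.
total_volume_ml = complete_medium_ml + collagenase_stock_ml
collagenase_mg = collagenase_stock_ml * collagenase_stock_mg_per_ml

dispase_final = dispase_mg / total_volume_ml          # mg/mL
collagenase_final = collagenase_mg / total_volume_ml  # mg/mL

print(f"Dispase II:     {dispase_final:.2f} mg/mL")
print(f"Collagenase IV: {collagenase_final:.2f} mg/mL")
```

Under these assumptions, both enzymes end up at slightly below 1 mg/mL.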

Statistical Analysis
Statistical analysis was performed using R: A language and environment for statistical computing (R Foundation for Statistical Computing, Vienna, Austria; https://www.R-project.org/, version 3.6.3, accessed on 29 February 2020). The data were presented and compared using Cumming plots [31] generated with the dabestr package. Cumming plots belong to a newer generation of data analysis with bootstrap-coupled estimation (DABEST) plots that move beyond p-values by displaying effect sizes together with bootstrap confidence intervals. These plots can be used to visualize large samples and multiple groups easily. For correlation analysis, Pearson's product-moment correlation was performed and a 95% confidence interval was applied. The variables were also represented as mean, standard deviation and median values. A two-sided Student's t-test with a confidence level of 0.95 was used to compare two groups, and for more than two groups, ANOVA followed by Tukey's HSD was conducted. In cases where data were not normally distributed, a Wilcoxon rank-sum test was performed. Orthogonal partial least squares discriminant analysis (oPLS-DA) was performed using the ropls package.
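To illustrate the idea behind bootstrap-coupled estimation, the following minimal Python sketch computes a mean difference between two groups with a percentile bootstrap confidence interval. It is not the dabestr implementation used in this study, which is an R package and reports bias-corrected and accelerated intervals by default; the function name, the number of resamples and the example scores are assumptions chosen for illustration only.

```python
import random
import statistics


def bootstrap_mean_difference(control, treated, n_resamples=5000, ci=0.95, seed=0):
    """Mean difference (treated - control) with a percentile bootstrap CI.

    Simplified stand-in for estimation statistics; a percentile interval is
    used here instead of a bias-corrected and accelerated one.
    """
    rng = random.Random(seed)
    observed = statistics.fmean(treated) - statistics.fmean(control)

    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping the group sizes.
        c = rng.choices(control, k=len(control))
        t = rng.choices(treated, k=len(treated))
        diffs.append(statistics.fmean(t) - statistics.fmean(c))
    diffs.sort()

    alpha = (1.0 - ci) / 2.0
    lower = diffs[int(alpha * n_resamples)]
    upper = diffs[min(n_resamples - 1, int((1.0 - alpha) * n_resamples))]
    return observed, lower, upper


# Hypothetical histological scores (illustrative values, not study data).
unchallenged = [1, 2, 3, 2, 1, 2]
challenged = [5, 6, 7, 6, 5, 6]
effect, ci_low, ci_high = bootstrap_mean_difference(unchallenged, challenged)
print(f"mean difference = {effect:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
```

Fixing the seed makes the interval reproducible; an interval that excludes zero indicates a difference between the groups without relying on a p-value.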

Conclusions
There is an urgent medical need for accurate IBD patient stratification to optimize patient care and response to therapy. This approach needs to be matched with animal models to validate the expected response to a given therapeutic in a selected patient group.
Currently, none of the animal models for IBD fully replicates all the manifestations of the heterogeneous patient groups observed in these human diseases. However, the NSG-UC and NSG-CD models presented in this study narrow the gap between these human diseases and preclinical studies, as both models reflect key features of the respective human diseases. In both models, the disease pattern is dependent on the individual immunological background of the donor. Therefore, combining patient profiling and testing of therapeutics in these models may lead to stratification of patients for individualized and phase-dependent treatment. In addition, it may reduce the risk of novel therapeutics failing in clinical phase II studies.