A Mechanogenetic Model of Exercise-Induced Pulmonary Haemorrhage in the Thoroughbred Horse

Exercise-induced pulmonary haemorrhage (EIPH) occurs in horses performing high-intensity athletic activity. The application of physics principles to derive a ‘physical model’, which is coherent with existing physiology and cell biology data, shows that critical parameters for capillary rupture are cell–cell adhesion and cell stiffness (cytoskeleton organisation). Specifically, length of fracture in the capillary is a ratio between the energy involved in cell–cell adhesion and the stiffness of cells suggesting that if the adhesion diminishes and/or that the stiffness of cells increases EIPH is more likely to occur. To identify genes associated with relevant cellular or physiological phenotypes, the physical model was used in a post-genome-wide association study (GWAS) to define gene sets associated with the model parameters. The primary study was a GWAS of EIPH where the phenotype was based on weekly tracheal wash samples collected over a two-year period from 72 horses in a flat race training yard. The EIPH phenotype was determined from cytological analysis of the tracheal wash samples, by scoring for the presence of red blood cells and haemosiderophages. Genotyping was performed using the Illumina Equine SNP50 BeadChip and analysed using linear regression in PLINK. Genes within significant genome regions were selected for sets based on their GeneOntology biological process, and analysed using fastBAT. The gene set analysis showed that genes associated with cell stiffness (cytoskeleton organisation) and blood flow have the most significant impact on EIPH risk.


Introduction
Exercise-induced pulmonary haemorrhage (EIPH) is frequently identified in horses performing high-intensity athletic activity. During exercise the pressure in the pulmonary capillaries reaches at least 80 mmHg [1], which is sufficient to cause pulmonary capillary stress failure [2,3]. Additionally, high transmural pressure (difference between the positive intraluminal and negative alveolar pressures) causes rupture of the blood-gas barrier [4]. The cause of bleeding is capillary failure; bleeding stops when the pressure is reduced [5] but lung lesions remain in horses with EIPH suggesting the condition is due to more than simple stress failure [6][7][8]. During exercise, raised left atrial pressure causes blood to back up into the lungs leading to an increase in pulmonary capillary pressure [9]. As a result, in EIPH lung lesions the small pulmonary veins (100-200 µm outer diameter) are remodelled through the build-up of adventitial collagen and hyperplasia of smooth muscle [5] which can lead to a significant reduction in the vascular lumen [6][7][8].

Tracheal Wash Cytology
Tracheal wash samples were formalin-fixed at the point of collection. The wash sample was centrifuged, and a portion of the cell pellet was pipetted onto a Poly-L-Lysine glass slide and smeared. In samples where a reasonably sized pellet was not visible after centrifugation, slides were prepared using cytospin centrifugation. All slides were air-dried, stained using Haematoxylin and Eosin stains and examined using light microscopy. Each sample was evaluated for the presence of epithelial cells, macrophages, lymphocytes, eosinophils, haemosiderophages (HF), erythrocytes or red blood cells (RBC), squames, mucus, fungi, bacteria, plant material and debris. Each of these cell types or characteristics was assigned a score out of three indicating a relative density, with a score of 0 equating to none, score of 1 to occasional, score of 2 to moderate and a score of 3 to high (see Appendix A for cytology images). Overall cell density was scored out of 3; a score of 0 equated to low/medium-low, a score of 1 to medium, a score of 2 to medium-high and a score of 3 to high.

EIPH Phenotype
EIPH is defined as blood in the airways after exercise and may be detected by tracheobronchoscopic examination and/or by enumerating red blood cells or haemosiderophages in cytology samples from tracheal wash (TW) or bronchoalveolar lavage (BAL) fluid [19]. BAL requires sedation and local anaesthesia of the horse, as the endoscope is passed further than the trachea and into a subsegmental bronchus [20]. The need for sedation and local anaesthesia means that the routine use of BAL on horses engaged in both training exercise and racing is not acceptable to trainers, hence this study was based on routine (weekly) tracheoendoscopic examination and TW sampling rather than BAL. The occurrence of EIPH was determined by cytological analysis of the TW samples to score the presence of red blood cells and haemosiderophages (macrophages which have phagocytized red blood cells in the alveolar space or airways). Red blood cells are detectable immediately post-exercise in BAL fluid, while the percentage of haemosiderophages significantly increases one week after exercise returning to pre-exercise levels within three weeks [21]. Although TW does not directly correlate with BAL cytology [20], the numbers of haemosiderophages may be used to detect EIPH in TW samples [22]. While the precise timing of haemorrhage is difficult to determine from TW samples, an increased frequency of haemosiderophage score during the time period of sampling indicates persistent episodes of bleeding after training and racing. In this study, the EIPH phenotype was obtained as a frequency score for each horse calculated as the sum of the HF scores (>1) over the series of tracheal wash samples taken on that individual divided by the total number of samples in the series.

Horse Race Record
The number of races starts in both 2009 and 2010, dates of first and last race start for both 2009 and 2010, gender and date of birth for each horse was obtained by searching the Racing Post Online database (http://www.racingpost.co.uk). This information was used to calculate the age of each horse at first race start for both 2009 and 2010, and the age of each horse when the first tracheal wash sample was collected. The age of the horse at first race start and age at first tracheal wash sampling was significantly correlated (r = 0.895). The number of days between the last race start and each tracheal wash sampling was also calculated.

DNA Extraction and Quantification
Blood samples from each horse were collected for routine haematology testing as part of the animal's training regime by a veterinary surgeon, and residual samples were used for DNA extraction. DNA was extracted using nucleon BACC DNA extraction kits. Samples were quantified in duplicate using Quant_iT PicoGreen dsDNA kits (Invitrogen Carlsbad, CA, USA) and 10% of the samples were run on a 1% agarose gel to check for the presence of high molecular weight DNA. DNA aliquots were adjusted to a concentration of 70 ng/uL for genotyping.

Single Nucleotide Polymorphism (SNP) Genotyping and Quality Control
Samples were genotyped using the Equine SNP50 BeadChip (Illumina, San Diego, CA, USA). This array consists of 54,602 SNP (single nucleotide polymorphisms) assays, all of which are derived from the EquCab2.0 SNP collection (http://www.broadinstitute.org/mammals/horse). SNPs are evenly distributed throughout the equine genome, with an average density of one SNP per 43.2 kb.
The genotyping data were analysed with GenomeStudio software (Illumina, San Diego, CA, USA) using a cluster file generated from a dataset of 1342 Thoroughbred samples previously genotyped with the Equine SNP50 BeadChip. SNPs with low intensity data (190 SNPs), with inadequately

Linear Regression Association Analysis
An association analysis to identify SNPs significantly associated with EIPH was carried out using the linear regression option in PLINK v1.07 [23]. Each SNP is analysed independently (single SNP analysis), with a regression coefficient significantly different from zero indicating a relationship between the genotype at that SNP and the phenotype. A t-test is used to test whether or not the coefficient is significantly different from zero. Linear regression analysis also allows for the testing of disease trait SNP associations along with interaction with defined co-variates. Covariates incorporated into the association analysis included gender, age of horse at first race start within the period of time when tracheal wash sampling occurred, period(s) of time when tracheal wash sampling occurred and the average number of days to last race start before tracheal wash sampling occurred. A Bonferroni correction was applied to adjust the significance of p-values to account for multiple testing; in this case, 43,417 SNPs were tested. A p-value was considered to be significant when p < 1.15 × 10 −6 .

Estimation of the Genetic Variance Explained by Individual Chromosome SNPs
Restricted maximum likelihood (REML) analysis with the GCTA program [24] was used to obtain estimates of the genetic variance explained by SNPs on individual chromosomes.

Linear Regression Association Analysis
An association analysis to identify SNPs significantly associated with EIPH was carried out using the linear regression option in PLINK v1.07 [23]. Each SNP is analysed independently (single SNP analysis), with a regression coefficient significantly different from zero indicating a relationship between the genotype at that SNP and the phenotype. A t-test is used to test whether or not the coefficient is significantly different from zero. Linear regression analysis also allows for the testing of disease trait SNP associations along with interaction with defined co-variates. Covariates incorporated into the association analysis included gender, age of horse at first race start within the period of time when tracheal wash sampling occurred, period(s) of time when tracheal wash sampling occurred and the average number of days to last race start before tracheal wash sampling occurred. A Bonferroni correction was applied to adjust the significance of p-values to account for multiple testing; in this case, 43,417 SNPs were tested. A p-value was considered to be significant when p < 1.15 × 10 −6 .

Estimation of the Genetic Variance Explained by Individual Chromosome SNPs
Restricted maximum likelihood (REML) analysis with the GCTA program [24] was used to obtain estimates of the genetic variance explained by SNPs on individual chromosomes.

Gene Set Analysis
The physical model of EIPH described in Appendix B shows that sustained bleeding will occur when capillary fracture reaches a critical length, the length being determined by the ratio between the energy involved in cell-cell adhesion and the stiffness (cytoskeleton organisation) of cells. Genetic variants that decrease cell-cell adhesion or increase cell stiffness are, therefore, likely to enhance susceptibility to EIPH. The model also suggests that factors modulating blood flow, such as a reduction in blood vessel diameter or an increase in cardiac output, would increase susceptibility to EIPH. The key parameters modulating risk are, therefore, cell-cell adhesion, cell stiffness and blood flow. To identify whether genetic variants identified through the GWAS significantly affected these parameters a gene set analysis was carried out, assigning SNPs from genome regions significantly associated with EIPH to gene sets based on their proximity to known or predicted genes with functions linked to these parameters.
Genes in the genome regions demarcated by the top 20 GWAS SNPs ranked on p-value were identified in the Ensembl genome browser (www.ensembl.org), and gene annotations were identified by comparison with the syntenic human regions. Gene function analysis and assignment to Gene Ontology (GO) classifications were carried out using DAVID [25] and Amigo2 [26]. Based on the prior information provided by the physics model, and using GO classifications as gene function, genes were grouped into sets with functions relating to (i) cell-cell adhesion (ii) cytoskeleton organisation (cell stiffness) and (iii) blood flow. For comparison gene sets representing (iv) disease biomarkers (stroke or hypertension) and (v) random background markers (genes within the significant GWAS genome regions but having functions unrelated to the parameters of interest) were also constructed. The GO terms used to classify the gene sets are shown in the Supplementary File: Table S1 Gene Ontology Terms.
For each gene, the chromosome position, gene start and end positions were identified using Ensembl BioMart [27]. SNPs on the Equine SNP50 BeadChip were then assigned to genes within the gene sets if they were located within the gene boundaries (end-start), or within 5 kb or within 20 kb of the gene start or end.
Gene set analysis was carried out using fast set-based association analysis (fastBAT) [28]. SNPs in genes that appeared in the intersect between sets (i) cell-cell adhesion, (ii) cytoskeleton organisation (cell stiffness) and (iii) blood flow were further analysed for haplotype associations and SNP-SNP epistatic interactions using PLINK v1.07 [23].

Population Structure
All sampled individuals came from bloodlines which predominantly represent the UK Thoroughbred population. Figure 1 shows the majority of individuals were in a central cluster, with three outlying clusters representing three separate sire families. All three of these families trace back to a single sire within three or four generations. The structure seen in the data, therefore, represents close family structure within the breed. There was no significant difference in IBS between groups (horse with HF scores > 1 were defined as cases, all others as controls), with the proportion of variance in IBS between groups = 0.0002, confirming there is no significant stratification within this population.

Genome-Wide Association Analysis (GWAS)
Genome-wide significant single nucleotide polymorphisms (SNPs), after adjusting the p-values for multiple testing (p < 1.15 × 10 −6 ), were found on chromosomes 3, 13 and 23 for haemosiderophage frequency (number of times HF score > 1). Suggestive signals for haemosiderophage frequency (HF), not quite reaching genome-wide significance, were found for genome regions on chromosomes 1, 14 and 25. Figure 2 illustrates the genome-wide association results for all SNPs. Appendix C shows the top 20 SNPs from the GWAS, ranked by smallest to largest p-value.  Table  S2 SNP based Estimates of Genetic Variance. Genetic variances significantly different from zero were identified on chromosomes 13, 19 and 20. Table 1 shows the GWAS genome regions from which gene lists were identified. The GAD_Disease, Gene Ontology terms, KEGG Pathway and UP Keywords associated with all genes identified in the significant genome regions are listed in the Supplementary file: Table S3 EIPH Top 20 SNPs Gene Functions. Gene sets were constructed based on these terms, in total six sets were tested representing (1) cell-cell adhesion (2) cytoskeleton (cell stiffness) (3) blood flow (4) disease biomarkers for stroke (5) disease biomarkers for hypertension and (6) random genes in the significant genome regions with functions unrelated to categories (1)- (5). The gene names and symbols for all genes contained within gene sets (1)- (5), and the intersections among the gene sets are given in the Supplementary file: Results of the fast set-based association analysis (fastBAT) [28] gene set analysis are summarized in Tables 2-4.  Table S2 SNP based Estimates of Genetic Variance. Genetic variances significantly different from zero were identified on chromosomes 13, 19 and 20. Table 1 shows the GWAS genome regions from which gene lists were identified. The GAD_Disease, Gene Ontology terms, KEGG Pathway and UP Keywords associated with all genes identified in the significant genome regions are listed in the Supplementary file: Table S3 EIPH Top 20 SNPs Gene Functions. Gene sets were constructed based on these terms, in total six sets were tested representing (1) cell-cell adhesion (2) cytoskeleton (cell stiffness) (3) blood flow (4) disease biomarkers for stroke (5) disease biomarkers for hypertension and (6) random genes in the significant genome regions with functions unrelated to categories (1)-(5). The gene names and symbols for all genes contained within gene sets (1)- (5), and the intersections among the gene sets are given in the Supplementary file: Table S4 Gene Set Lists. Results of the fast set-based association analysis (fastBAT) [28] gene set analysis are summarized in Tables 2-4.  When SNPs were restricted to either those positioned within gene boundaries or up to 20 kb from the gene boundaries the significant gene sets were those with functions related to cytoskeleton (cell stiffness), blood flow and biomarkers for hypertension. There was intersection of gene content among the gene sets, with 5 genes (ABAT, CYFIP2, EMP2, FN1, TEK) being common to cell-cell adhesion, cytoskeleton organisation and blood flow, 9 genes (AAMP, ABAT, CYFIP2, EMP2, FN1, KCNMA1, RAPIGDS1, SGCD, TEK) intersecting between cytoskeleton organisation and blood flow, 14 genes (ABAT, ADGRE2, B4GALT1, CDH13, CYFIP2, DLG5, EMCN, EMP2, FN1, FOXF1, IL12B, PLAU, PPP3CA, TEK) between cell-cell adhesion and blood flow and 11 genes (ABAT, ARPC2, CHMP5, CYFIP2, EMP2, FN1, SOCS1, TEK, TNS1, VCL, WHRN) between cell-cell adhesion and cytoskeleton organisation.

Gene Set Analysis
The five genes which have functions related to all three physical model parameters (cell-cell adhesion, cytoskeleton organisation or cell stiffness and blood flow) are potentially important candidate genes influencing EIPH risk. ABAT (4 aminobutyrate aminotransferase) is a mitochondrial gene involved in the regulation of blood pressure and response to hypoxia. CYFIP2 (cytoplasmic FMR1-interacting protein 2) forms part of the VEGFA-VEGFR2 pathway and is associated in humans with susceptibility to Kawasaki disease and coronary artery lesions [29], interacting with PDE2A so that high-risk allele combinations account for 67% of cases in this human population. EMP2 (epithelial membrane protein 2) regulates blood vessel endothelial cell migration and angiogenesis by regulating VEGF protein expression through PTK2 activation [30] and plays a role in the induction of VEGFA via a HIF1A-dependent pathway [31]. FN1 (Fibronectin) is involved in cell adhesion, cell motility, wound healing and maintenance of cell shape. TEK (angiopoietin-1 receptor) regulates angiogenesis and endothelial cell survival, and is associated with pulmonary hypertension [32] and cutanomucosal venous malformation in humans. ABAT, CYFIP2, FN1 and TEK are all expressed in vascular and lung tissue, EMP2, FN1 and TEK are expressed in tracheal tissue (UniGene cDNA sources). Four further genes intersect between cytoskeleton and blood flow: AAMP (angio-associated migratory cell protein) is associated with angiogenesis and has potential roles in endothelial tube formation and the migration of endothelial cells, KCNMA1 (potassium calcium-activated channel subfamily M α 1), RAP1GDS1 (Rap1 GTPase-GDP dissociation stimulator 1) and SGCD (sarcoglycan delta) are associated with total anomalous pulmonary venous return [33].
The analysis of epistatic (gene-gene) interactions among the five genes which have roles in all three biological parameters or functions showed significant (p < 0.05) interactions between six SNPs representing all five genes (Appendix D). SNPs in EMP2, ABAT and FN1 had significant epistatic interaction with SNPs in CYFIP2 and TEK, and there was also interaction between SNPs in CYFIP2 and TEK. Figure 3 illustrates the epistatic relationships among SNPs in the five genes. Haplotype analysis of SNPs within these five genes showed the haplotypes mostly significantly associated with the EIPH phenotype were in ABAT on chromosome 13 and CYFIP2 on chromosome 14 (Table 5). These results suggest that the interactions between the genotypes at ABAT, EMP2, CYFIP2, FN1 and TEK play an important role in determining EIPH risk. angiogenesis and endothelial cell survival, and is associated with pulmonary hypertension [32] and cutanomucosal venous malformation in humans. ABAT, CYFIP2, FN1 and TEK are all expressed in vascular and lung tissue, EMP2, FN1 and TEK are expressed in tracheal tissue (UniGene cDNA sources). Four further genes intersect between cytoskeleton and blood flow: AAMP (angio-associated migratory cell protein) is associated with angiogenesis and has potential roles in endothelial tube formation and the migration of endothelial cells, KCNMA1 (potassium calcium-activated channel subfamily M α 1), RAP1GDS1 (Rap1 GTPase-GDP dissociation stimulator 1) and SGCD (sarcoglycan delta) are associated with total anomalous pulmonary venous return [33]. The analysis of epistatic (gene-gene) interactions among the five genes which have roles in all three biological parameters or functions showed significant (p < 0.05) interactions between six SNPs representing all five genes (Appendix D). SNPs in EMP2, ABAT and FN1 had significant epistatic interaction with SNPs in CYFIP2 and TEK, and there was also interaction between SNPs in CYFIP2 and TEK. Figure 3 illustrates the epistatic relationships among SNPs in the five genes. Haplotype analysis of SNPs within these five genes showed the haplotypes mostly significantly associated with the EIPH phenotype were in ABAT on chromosome 13 and CYFIP2 on chromosome 14 (Table 5). These results suggest that the interactions between the genotypes at ABAT, EMP2, CYFIP2, FN1 and TEK play an important role in determining EIPH risk.   (Table 5).

Discussion
The initial GWAS analysis identified three genome-wide significant regions associated with haemosiderophage score on chromosomes 3, 13 and 23, with the size of the regions ranging between 1-16 Mb. Altogether, the top 20 SNPs from the GWAS were localised in 11 genome regions. These regions contain 412 genes with human homologues, and functions could be assigned to 375 of the genes. A novel gene set analysis (GSA) was carried out to test the significance of the physics model key parameters, cell-cell adhesion, cytoskeleton organisation (cell stiffness) and blood flow, allowing the complex nature of the trait to be incorporated into the analysis through the grouping of genes by related functions. The most significant gene sets represented the functions of cytoskeleton organisation (cell stiffness) and blood flow, highlighting these as key processes in the pathology of EIPH. Biomarkers for hypertension were also found to be significant suggesting that genetic variation impacting blood pressure could also be important, perhaps as a consequence of the relationship between blood pressure and flow.
Several genes were contained within more than one gene set, with a sub-set of five genes affecting all three biological functions known to represent the key parameters in the physical model (cell-cell adhesion, cytoskeleton organisation and blood flow). These genes (ABAT, CYFIP2, EMP2, FN1, and TEK) are linked through their roles in promoting angiogenesis, particularly in response to hypoxic stress. TEK is the receptor for angiopoietin-1; the angiopoietin-1/TEK signalling complex regulates both maintenances of vascular quiescence and promotion of angiogenesis, with distinct signalling complexes being produced in the presence or absence of endothelial cell-cell adhesions [34]. Angiopoietin-1/TEK signalling also promotes microvascular integrity through interaction with the hypoxia-inducible factor HIF-2α. CYFIP2 is a target gene for p53, a protein essential for cellular response to oxygen stress which is regulated by HIF-1 [35]. EMP2 is upregulated under hypoxic conditions, promoting vascular endothelial growth factor (VEGF) expression through a HIF-1α dependent pathway [31]. ABAT is involved in response to hypoxia, and negative regulation of blood pressure (GO Biological Processes). The FN1 gene codes for the type I fibrin-binding domain in fibronectin. Polymerized fibrin is important in the process of coagulation and wound healing; faulty assembly or premature lysis of fibrin would increase the risk of haemorrhage.
The hypoxia-inducible factors (HIFs), especially HIF-1α and HIF-2α, are key mediators of the adaptive response to hypoxic stress and play essential roles in maintaining lung homeostasis. Both human and animal genetic studies have shown that variants in HIF correlate with pulmonary vascular pathology and chronic lung diseases [36], including pulmonary arterial hypertension [37]. It has been postulated that inhibiting HIF could be a therapeutic target for pulmonary arterial hypertension (PAH) in humans [38]. Given the apparent similarities in the genetic background between PAH and EIPH, the inhibition of HIF might also provide therapeutic benefits for EIPH. Exercise and hypoxic conditions both increase angiogenesis, however, HIF signalling and gene expression are different under the two conditions. Exercise induces much larger changes in HIF expression levels compared with hypoxia alone [39]. Regular endurance training reduces skeletal muscle HIF expression under normoxic conditions [40], suggesting that HIF is involved in regulating adaptive gene response to exercise. Consequently, genetic variants increasing EIPH risk may also have effects on the HIF signalling pathway and response to training. Risk variants may have been maintained within the Thoroughbred racehorse population because they enhance muscle development and cardiovascular performance, as long as bleeding does not occur detrimentally.
The mechano-sensing roles of these genes are also important in vascular remodelling, as vascular endothelial cells are sensitive to stress and strain generated by blood flowing through the capillaries and venules. The mechano-sensitive and hypoxia responses are inter-linked; the hypoxia response has been shown to have a role in the differential expression of mechanosensitive genes within osteocytes [41], and it therefore seems plausible to expect that this relationship would also hold for other cells, such as vascular endothelial cells, which must respond to mechanical stresses. It is already known, for example, that microvascular integrity (vessel leakiness) is mediated by the role of angiopoietin 1 through a mechano-transduction pathway [42]. The interaction between the response to hypoxia and mechanical forces acting on endothelial cells appears to play a pivotal role in vascular remodelling, and the relationship between these responses merits further investigation. The nature of gene-gene interactions and pleiotropy within both the hypoxia and mechano-transduction pathways also need to be better understood before firm conclusions can be drawn on the suitability of therapeutic targets for preventing or reducing EIPH.
The physical model linking the EIPH phenotype to cellular biology and genetics has enabled candidate genes to be identified based on an understanding of their functional significance within the model. Further dissection of the biological pathways identified will elucidate the physiological benefits of genetic variants, in terms of cardiovascular enhancement and response to training/racing and the associated pathological risk of EIPH. This will enable improved therapies to be developed, as well as providing information required to reduce genetic risk within the Thoroughbred population.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Appendix A
Tracheal Wash Cytology Images of angiopoietin 1 through a mechano-transduction pathway [42]. The interaction between the response to hypoxia and mechanical forces acting on endothelial cells appears to play a pivotal role in vascular remodelling, and the relationship between these responses merits further investigation. The nature of gene-gene interactions and pleiotropy within both the hypoxia and mechanotransduction pathways also need to be better understood before firm conclusions can be drawn on the suitability of therapeutic targets for preventing or reducing EIPH.
The physical model linking the EIPH phenotype to cellular biology and genetics has enabled candidate genes to be identified based on an understanding of their functional significance within the model. Further dissection of the biological pathways identified will elucidate the physiological benefits of genetic variants, in terms of cardiovascular enhancement and response to training/racing and the associated pathological risk of EIPH. This will enable improved therapies to be developed, as well as providing information required to reduce genetic risk within the Thoroughbred population.
Supplementary Materials: The following are available online at www.mdpi.com/xxx/s1, Table S1: Gene ontology terms, Table S2: SNP based estimates of genetic variance, Table S3: EIPH Top 20 SNPs gene functions, Table S4: Gene set lists.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Appendix B.1 A Physics-Based Model of EIPH
Physiological studies have shown that during intense exercise a mean pulmonary arterial pressure (mPAP) above 80 mmHg is required to elicit bleeding in the lungs ( Figure A1A) [21]. Such an increase in mPAP is not fully compensated by any other pressure and as a result the mean transmural pulmonary arterial pressure (mTPAP) follows mPAP trends ( Figure A1B) [43] and the pressure across the capillary wall (mTPAP) is expected to promote a large increase in the capillary wall stress estimated to be ∼ 8.3 × 10 4 N/m 2 for an mTPAP ∼ 95 mmHg and capillary radius ∼ 3.3 µm during exercise [2]. Using the surface tension ∼ h × 8.3 × 10 4 N/m 2 where h ∼ 10 µm is the capillary wall thickness [44] one finds ∼ 8.3 × 10 −1 N/m. The physics model leads to two important conclusions: the first is that bleeding occurs under the form of an abrupt transition, which means that standard physics theory relating to a critical phenomenon (or phase transitions) should apply. The second is that the most important variable is the intra-capillary pressure which is not compensated and is linked to blood flow. In other words, the mTPAP written as P i − P e where the indexes 'i' and 'e' refer to intra-and extra-respectively, can be rewritten as, P i − P 0 e . Modelling the pathophysiology of blood vessel rupture requires modification of the physical model to incorporate biology. We have decomposed the problem into smaller steps which, when resolved and integrated, provide a better understanding of the pathophysiology of EIPH. The first step corresponds to determining the physical conditions for a vessel to rupture provided that the vessel has a constant lumen radius. The second step is to add to this the risk of bleeding and in particular the continuation of bleeding when the horse is exercising. The third step is to consider the contractile ability of vessels (i.e., arterioles or venuoles) to modulate vascular blood flow and pressure, and the relationship with bleeding. Once integrated these steps show the sensitivity of EIPH to key physical parameters. Physics theory can be applied to any vessel provided that their radius is constant. The first point to thus resolve is, given the fragility of capillaries, what capillary hydrostatic pressure is needed to rupture them in such a way that the rupture is large enough to let blood cells flow out. To model this phenomenon, we make use of the theory developed in physics to understand how fracture occurs in materials. We assume that capillaries can be modelled as a small shallow cylinder with a wall thickness h constant length L 0 and radius R. In this context, the entire volume of the capillary is V = πL 0 R with a surface area of S = 2πL 0 R. The mechanical tension or surface tension with unit J/m 2 that is applied to the capillary wall is noted as, hσ 0 , where σ 0 is the stress that is present within the capillary wall, which is assumed to be independent of the wall thickness. Laplace's law then gives the relationship between the radius of the capillary, the surface tension and the pressure difference between the lumen of the capillary and the alveoli: hσ 0 ∼ R(P i − P e ). This last relation states that as the pressure difference increases, the surface tension increases in a similar proportion and it is expected that the increase in surface tension will fracture the capillary.
To model the possibility of a capillary fracture of surface area 's' it is noted that any fracture releases an energy ∼ −shσ 2 0 /E where, σ 2 0 /E, is the energy density and E the elastic modulus of the capillary wall and, sh is the volume generated by the fracture. The fracture can only appear if cell-cell de-adhesion occurs so that cells can move apart. Assuming that the density of cell-cell adhesion molecules is constant along the contour of cells, then the energy is proportional to the contour length of the fracture (the fracture 'lips'). Noting by the variable 'l' the contour length and by 'hλ' the cell-cell binding energy per unit of contour (i.e., λ is an energy per surface area of adhesion), a drop in the surface tension energy is always accompanied by an increase of energy due to the presence of free lips given by ∼ 2λhl where the factor '2 is added as each time a fracture appears two lips are generated. Let us consider that the contour length is proportional to the fracture area under the form: s ∼ al 2 where a is a geometrical constant such that for a circular fracture, a = 1/4π. The variation of surface energy can be written as, E S ∼ 2λhl − al 2 hσ 2 0 /E. The surface energy represented in Figure A2(A) demonstrates that not all fractures diverge irreversibly but that a critical fracture size, l c ∼ Eλ/aσ 2 0 , needs to be attained. This allows E S ∼ E 0 2l − l 2 to be rewritten with E 0 ∼ hEλ 2 /aσ 2 0 and l = l/l c ( Figure A3(A)). This means that the capillaries can heal physically (and biologically) provided that the pressure difference or surface tension is small enough. This has been demonstrated in cell culture [45].

Appendix B.1.2 Part 2-Physics of Blood Vessel Rupture with the Presence of a Leak under an Imposed Hydrostatic Pressure
The theory applied from material physics suggests that fractures diverge once a threshold size, l c ∼ Eλ/aσ 2 0 , has been attained. However, in the vascular system, it is the blood flow that builds up the pressure inside vessels and if a leak occurs then it is possible to demonstrate that a fracture or rupture can be maintained. This point is important as racehorses undergoing intense exercise for a given time will probably bleed during the time considered. As shown above, the surface tension linked to the pressure inside vessels is also affected by the blood leaking as it is expected to lower the transmural capillary pressure. To model this we consider the internal volume of a capillary. The pressure difference changes are related to the amount of blood flowing into the capillary section under consideration. The extra blood flow rate in the capillary is noted as 'δ . V' and 'τ' is the characteristic time related to the length of time the flow rate is increased. Note that the characteristic time is supposed to be long enough to apply thermodynamics and δ Vτ is therefore the extra volume present in the capillary and it is natural to assume that δ . Vτ should increase the value of the lumen pressure, P i . To link P i and δ . Vτ a simple model is used whereby the volume of the capillary under consideration is constant. If more blood is poured into the section, P i is expected to increase as a function of the proportion of extra volume being poured in. That is to say that for an incompressible fluid, i.e., the blood, at the lowest order P i ∼ P 0 Vτ/V is obtained where P 0 i is the lumen pressure under normal/resting conditions. This relation allows the pressure difference to be written as, Vτ/V. However, a fraction of this extra volume can be leaked out from the capillary lumen. Considering a capillary fracture of surface area 's', the volume leaked per unit of time is therefore ∼ s × v where 'v' is the blood velocity leaking out. Using Darcy's law, the velocity is a function of the pressure difference between the inside and outside of the vessel, blood viscosity 'µ' and capillary wall thickness 'h', expressed under the form, v ∼ α(P i − P e )s/µh, where 'α' is the proportionality constant. Using s ∼ al 2 , the balance of volumes, i.e., what flows in minus what leaks out, can be re-expressed as: δ . Vτ − a 2 ατ(P i − P e )l 4 /µh. As a result the pressure difference in the presence of blood leaking out from the vessel can be rewritten as, P i − P e ∼ P 0 i − P e + P 0 i δ .
Vτ/V − a 2 ατ(P i − P e )l 4 /Vµh , which once rearranged gives: P i − P e ∼ P 0 i − P e + P 0 i δ . Vτ/V / 1 + a 2 ατP 0 i l 4 /Vµh . It follows by virtue of Laplace's law that new surface tension, hσ ∼ R(P i − P e ), can be defined: hσ ∼ R P 0 i − P e + P 0 i δ . Vτ/V / 1 + a 2 ατP 0 i l 4 /Vµh . Noting hσ 0 ∼ R P 0 i − P e and τRP 0 i /Vhσ 0 = 1/ . V 0 one obtains: σ ∼ σ 0 1 + δ . V/ . V 0 / 1 + a 2 ατP 0 i l 4 /Vµh . In the expression of the new surface tension the presence of 'l 4 ' in the denominator confirms that the surface tension of the capillary decreases as a function of the rupture length. Within the latter relation the characteristic time 'τ' remains present but as a steady-state flow is considered it can be related to usual hemodynamic parameters. A simple dimensional analysis allows one to find that surface tension can also be expressed as a function of the steady-state flow rate in resting conditions under the form hσ 0 ∼ . V 0 µL 0 /R 3 . As τRP 0 i /Vhσ 0 = 1/ . V 0 , by identity one obtains, τP 0 i /µ ∼ (L 0 /R) 2 . Reintegrating these relations into the system's energy one finds: Figure A3(B) captures the physical elements that are deemed important for EIPH. The bleeding transition can now be considered.

Appendix B.1.3 Part 3-Physical Conditions for the Bleeding Transition
While Equation (A1) captures most of the physics of EIPH, under this form it cannot provide any meaningful relationship between the physical variables supposedly involved in EIPH. However, it is clear that capillaries with poor cell-cell adhesion will probably have a tendency to bleed more, a result that comes automatically from Equation (A1). As a result, another relation needs to be introduced that links the physical variables together to determine the 'bleeding transition'.
The 'no-bleeding' condition up to the fracture can be captured by setting Ω 0 ∼ 0 (this is the equivalent of saying that α ∼ 0 in the absence of leaks), which transforms Equation (A1) . It is clear that the latter relation has an extreme value that is modulated by the blood flow and that is given when 'l 1 ' the solution of: ∂(E S /E 0 )/∂l ∼ 0 leading with a resulting extreme value for the energy: The blood flow, therefore, modulates the energy barrier leading to rupture. Furthermore the thermal energy k B T (where k B is Boltzmann's constant and T the absolute temperature) modulates the energy of this fundamental state to generate statistical thermal fluctuations and it is when this fluctuation is enough to promote a fracture that the bleeding transition occurs. Putting (E S ) max ∼ k B T, i.e., that the thermal energy needs to match the extrema value of the potential, one finds a simple critical relation: V 0 , to microscopic ones, hEλ 2 /2aσ 2 0 k B T, that have all been separately determined experimentally. Given that capillary rupture occurs at the level of cell-cell adhesion, as the interaction energy involving cell-cell adhesion complexes is typically ∼ 100 k B T [46] and that the surface density for adhesive units involved between adjacent cells ∼ 10 3 µm −2 [47], one deduces a typical value of λ ∼ 10 5 k B T/µm 2 . Using k B T ∼ 3 × 10 −21 J (at T ∼ 300 K i.e., a temperature of 27 o C) leads to λ ∼ 3 · 10 −4 J/m 2 . Considering capillary stress σ 0 ∼ 8 × 10 4 N/m 2 [3], an elastic modulus per cell E ∼ 10 3 Pa [48], and a typical capillary wall thickness h ∼ 10 −6 m and a = 1/4π one finds hEλ 2 /2aσ 2 0 k B T ∼ 5 − 6. To determine V 0 is linearly linked to the transmural pressure variation inside the vessel (This assumption can be justified as follow: Vτ/V, the blood flow can be related to a critical pressure difference, i.e., a critical transmural pressure, under the form: V 0 . Then the use of Figure A2(A) allows one to predict that . The order of magnitudes is similar, suggesting that the model is sound to be used further.
Based on the similarity above, one can also provide an order of magnitude for l c that can be rewritten as, l c /h ∼ λ × E/ahσ 2 0 . As described above, the term σ 2 0 /E is the energy density of the capillary wall meaning that, hσ 2 0 /E, has an order of magnitude similar to the surface tension of the wall. Using hσ 2 0 /E ∼ 0.8 N/m and a = 1/4π one finds l c ∼ 30 nm. It is worth noting that with the value of l c now defined, it is expected that Ω 0 << 1 as Ω 0 ∼ l 4 c . While Equation (A2) provides the physical conditions required to be fulfilled for bleeding it does not provide any information on the amount of blood crossing the vessel wall.

Appendix B.1.4 Part 4-Physical Pathophysiology of EIPH
The main question is to find the relation that allows sustained bleeding to occur. Considering Ω 0 << 1, Equation (A1) can be rewritten as a Taylor's expansion (chain rule method) up to the 6th power of the energy around l ∼ 0 under the form: (A3) The term in 'l 6 ' provides for maintenance of the leak. As shown in Figure A3(B), Equation (A3) demonstrates the presence of two extrema. The order of magnitude of the maximum, noted l 1 , results from the competition between the two first terms that are present in the right-hand-side of Equation (A3) , which is one solution determined previously. Likewise the minimum, noted l 2 , results from the competition between the two last terms that are present in the right-hand-side of Equation (A3), leading to l 4 2 ∼ 1/Ω 0 . It is worth noting here that using Equation (A1) to replace Ω 0 by its literal expression one finds, l 2 ∼ R πh/a 2 αL 0 1/4 , that is to say that the size of the capillary rupture is chiefly imposed by the radius of the capillaries. Finally, the maintenance of the leak appears only if l 2 ≥ l 1 that is: (R/l c ) πh/a 2 αL 0 , which is linked to the geometry of the capillaries. In view of this last result the prior information that physics gives in terms of genetic sensitivity to EIPH can be discussed. To start with the critical length, l c ∼ Eλ/aσ 2 0 , is a ratio between the energy involved in cell-cell adhesion and the stiffness of cells suggesting that if the adhesion diminishes, for example by a reduction in the number of cadherins, and/or that the stiffness of cells increases, for example by augmenting the stiffness of the cytoskeleton, EIPH is more likely to occur. However the presence of the squared term in ratio, , means that EIPH is potentially more sensitive to the blood flowing into capillaries which may result from physiological or anatomical changes in the vascular system, for example when venules have a smaller diameter [6][7][8] or following the selection of racehorses with larger cardiac outputs [49]. In any case, contrary to what has previously been suggested [50], blood viscosity does not seem to be involved. δ +   , means that EIPH is potentially more sensitive to the blood flowing into capillaries which may result from physiological or anatomical changes in the vascular system, for example when venules have a smaller diameter [6][7][8] or following the selection of racehorses with larger cardiac outputs [49]. In any case, contrary to what has previously been suggested [50], blood viscosity does not seem to be involved. Figure A1. (A) Impact of the mean pulmonary arterial pressure (mPAP) on bleeding, data from [21]. (B) mPAP is not fully compensated by any other pressure and as a result, the mean transmural pulmonary arterial pressure (mTPAP) follows mPAP trends, data from [43].  Figure A2. (A) Impact of the mean pulmonary arterial pressure (mPAP) on bleeding, data from [21]. (B) mPAP is not fully compensated by any other pressure and as a result, the mean transmural pulmonary arterial pressure (mTPAP) follows mPAP trends, data from [43]. Figure A1. (A) Impact of the mean pulmonary arterial pressure (mPAP) on bleeding, data from [21]. (B) mPAP is not fully compensated by any other pressure and as a result, the mean transmural pulmonary arterial pressure (mTPAP) follows mPAP trends, data from [43]. The initial state (thick red arrow) can be modulated by the thermal energy and when the tension is enough a fracture opens up in which case the red circle, which schematically represents the state of the fracture, 'jumps' above the energy barrier to generate fractures which are the largest possible. In this model there is no compensatory mechanism (i.e., a reduction of the tension due to fracture) hence the description 'the largest possible' remains valid.
(B) When a compensatory mechanism is involved, namely when there is reduction of the tension due to the fracture, new stable states (e.g., thick red arrows on the right) emerge from the system leading to sustained fracture. However, the ability of the fracture to be sustained is a function of the value of omega which presents an inflection when its value is above 1/60 (green curve). In this case fractures do not appear however much the fracture can 'jump'. It is the value of omega and thus the type of vessel that will determine whether the fracture is maintained (leading to a bleeding state) or not. The maximum and minimum value of the energy are located at l 1 and l 2 , respectively. (C) An interesting phenomenon is the bleeding oscillation that should occur when the tension energy is sufficiently large for the system's energy to match the thermal energy. In this context constant opening and closing configurations are possible. (D) In the case of increased bloodflow, the system's energy landscape changes leading to the appearance of new stable modes (i.e., fracture lengths) that were not present before.