Generation of Epichloë Strains Expressing Fluorescent Proteins Suitable for Studying Host-Endophyte Interactions and Characterisation of a T-DNA Integration Event

Methods for the identification and localisation of endophytic fungi are required to study the establishment, development, and progression of host-symbiont interactions, as visible reactions or disease symptoms are generally absent from host plants. Fluorescent proteins have proved valuable as reporter gene products, allowing non-invasive detection in living cells. This study reports the introduction of genes for two fluorescent proteins, green fluorescent protein (GFP) and red fluorescent protein, DsRed, into the genomes of two distinct perennial ryegrass (Lolium perenne L.)-associated Epichloë endophyte strains using Agrobacterium tumefaciens-mediated transformation. Comprehensive characterisation of reporter gene-containing endophyte strains was performed using molecular genetic, phenotypic, and bioinformatic tools. A combination of long read and short read sequencing of a selected transformant identified a single complex T-DNA insert of 35,530 bp containing multiple T-DNAs linked together. This approach allowed for comprehensive characterisation of T-DNA integration to single-base resolution, while revealing the unanticipated nature of T-DNA integration in the transformant analysed. These reporter gene endophyte strains were able to establish and maintain a stable symbiotum with the host. In addition, the same endophyte strain labelled with two different fluorescent proteins was able to cohabit the same plant. This knowledge provides a basis for strategies to study the host-endophyte interaction in greater detail through independent and simultaneous monitoring in planta throughout its life cycle.


Introduction
Asexual endophytes from the genus Epichloë (previously Neotyphodium) are strictly seed-transmitted endophytic symbionts of cool-season grasses (Poaceae, sub-family Pooideae) characterised by a life-cycle wholly confined to the host plant [1,2]. The relationships between asexual Epichloë species and their grass hosts are often mutualistic in nature [3]. Endophyte infection can impart environmental stress tolerance and protection from herbivory, thus markedly enhancing technology for asexual Epichloë fungal endophytes. This study offers new perspectives to understand the complex patterns of T-DNA integration in fungal genomes.
The present study describes the successful generation of GFP and DsRed expressing asexual Epichloë endophyte strains, henceforth referred to as endophytes, belonging to different representative taxa, LpTG-3 (strain NEA12) and LpTG-4 (strain E1), which provide bioprotective properties to the host plant against invertebrate herbivores, likely through production of epoxy-janthitrems [6,8,34]. This study also reports comprehensive characterisation of these reporter gene-containing Epichloë endophyte strains using molecular genetic, phenotypic, and bioinformatic tools. The ability of these reporter endophytes to establish a successful symbiotum with the host was studied. Furthermore, co-inoculation studies demonstrated the co-existence ability of endophytes expressing GFP and DsRed from the same taxon in the same host. The results obtained in this study strongly suggest the applicability of these reporter gene-containing endophyte strains to better understand host-symbiont association. The outcomes of the first use of third-generation long-range DNA sequencing to identify complex T-DNA integration events in fungal genomes provide a significant advance, which may be further explored to understand the intricacies of A. tumefaciens-mediated transformation of fungi.

Microbial Culture Conditions
Properties of the selected endophyte strains are summarised in Table 1. Cultures of NEA12 and E1 were grown either on potato dextrose agar (PDA; Sigma-Aldrich, St. Louis, MO, USA) or in potato dextrose broth (PDB) at 22 °C in the dark for 7-10 days, depending upon growth rate. The AGL1 and LBA 4404 strains of A. tumefaciens, which were used for transformation, and Escherichia coli strain DH5α (Thermo Fisher Scientific, Waltham, MA, USA), which was used during construction and maintenance of plasmid constructs, were grown at 28 and 37 °C, respectively, on either LB (Luria-Bertani) agar (5 g yeast extract, 10 g tryptone, 10 g NaCl in 1 L of ddH2O) plates or in LB broth. PDA and LB media were supplemented with appropriate antibiotics as necessary [35].

Construction of Vectors Containing Reporter Genes
Gene cassettes including the constitutive Aspergillus nidulans glyceraldehyde-3-phosphate dehydrogenase promoter (gpdP) and the A. nidulans tryptophan biosynthesis terminator (trpCT), for cloning of the first reading frame A (RFA-A) cassette; the hygromycin B resistance gene (hph) under the control of the A. nidulans trpC promoter (trpCP) and terminator; and reporter genes including DsRed (DsRed-Express 736 bp, gb|DQ232603.1|), sgfp (gb|EF090408.1|), egfp (gb|HQ259114.1|) containing attB1 and attB2 sites were synthesised and obtained from GeneArt, Regensburg, Germany. The Gateway™-enabled destination vector (pEND0002) was constructed through modifications of the T-DNA region of pPZP200 [36] by cloning the 2.1 kb XbaI/KpnI fragment containing the hph expression cassette and 2.9 kb gpdP-(RFA-A)-trpCT cassette (Invitrogen, Thermo Fisher Scientific, Waltham, MA, USA) as previously described in [35]. The 779 bp fragments corresponding to the reporter genes sgfp and egfp were excised from vectors pMA4 and pMA5, respectively, using the SacI and KpnI restriction enzyme sites. The DsRed gene, containing attB1 and attB2 sites, was excised from the pMA3 vector by co-digestion with the restriction endonucleases AscI and PacI. DsRed, egfp, and sgfp gene fragments containing attB1 and attB2 sites were cloned into the pDONR221 vector using BP clonase following the manufacturer's recommended method (Invitrogen, Thermo Fisher Scientific, Waltham, MA, USA), generating entry clones designated pEND0003, pEND0004, and pEND0005, respectively. The entry clones pEND0003, pEND0004, and pEND0005 were combined with the destination vector pEND0002 using an LR clonase reaction following the manufacturer's recommended method (Invitrogen, Thermo Fisher Scientific, Waltham, MA, USA), to generate the final expression vectors pEND-DsRed, pEND-egfp, and pEND-sgfp, respectively.
T-DNA regions of all expression clones were sequence-verified by Sanger sequencing using oligonucleotides. Details of all plasmids designed and constructed in this study are summarised in Table S1.

Sensitivity of Non-Transgenic Endophyte Mycelia to Hygromycin B
The sensitivity of non-transformed endophyte mycelia to treatment with hygromycin B was determined by plating 400 µL of liquid culture onto PDA plates containing increasing concentrations of antibiotic (50, 100, 150, 200, 250, and 300 µg/mL) as previously described [35].

A. tumefaciens-Mediated Transformation
A. tumefaciens-mediated transformation of endophyte strains NEA12 and E1 was performed using A. tumefaciens cells (AGL1 and LBA 4404) as previously described in [35]. Control experiments were performed by co-cultivation of endophyte mycelia with strains of A. tumefaciens that had not been transformed with the plasmid vectors.

Mitotic Stability of Transformed Endophytes
Mitotic stability of putative endophyte transformants obtained from each vector (appearing 2-4 weeks after A. tumefaciens-mediated transformation) and protoplast-derived transformants was determined by sub-culturing five successive times, every 2-3 weeks, alternating between media containing and lacking 200 µg/mL hygromycin B as previously described in [35].
Transformant colonies were examined for GFP and DsRed expression under a confocal microscope. Hyphae suspended in a drop of water were observed using an Olympus FluoView FV10i confocal microscope (Olympus, Tokyo, Japan). Filters (green-narrow with excitation maximum 473 nm, emission maximum 490-540 nm and TRITC with excitation maximum 552 nm and emission maximum 578 nm) were used to monitor GFP and DsRed expression.
Samples of endophyte-infected plants were observed using a Leica M205 FA fluorescence stereo microscope (Leica Microsystems, Wetzlar, Germany) fitted with filters GFP2 and GFP3 with the same excitation and barrier filters as above.

Screening for the Presence of Endophytes in Planta
Three tiller samples (<0.5 cm) were harvested from the base of 6-month-old soil-grown plants, followed by freeze-drying (Virtis Genesis, 25XL, Gardiner, NY, USA) for 48 h. DNA was extracted using the Qiagen MagAttract DNA Extraction kit (Qiagen, Hilden, Germany) as per manufacturer's instructions. Presence of the reporter genes was assessed by real-time amplification

MinION Nanopore Library Preparation and Sequencing
Genomic DNA was extracted, libraries were prepared, and sequencing was performed as described in [6]. DNA was extracted from snap-frozen endophyte mycelia using a cetyltrimethylammonium bromide (CTAB)-based extraction method. Genomic libraries were prepared and sequenced using the ligation sequencing kit (SQK-LSK109; Oxford Nanopore, Oxford, UK).

Processing of Sequencing Reads
All MiSeq reads were filtered using a custom Python script. All MinION reads were filtered and assembled as described in [6], as were sequence correction, trimming, assembly, and polishing of the scaffolded assembly with short reads from the Illumina MiSeq platform. In brief, sequence correction, trimming, and assembly were performed using the long-read assembler Canu v.1.8 [38]. The scaffolded assembly was then polished with the genome assembly improvement and variant detection tool Pilon v.1.23 [39] using the MiSeq reads.
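The custom filtering script is not described further in the text. As a rough illustration of what short-read filtering of this kind can involve, the following minimal sketch keeps reads above a length and mean-quality threshold. The thresholds, function names, Phred+33 encoding, and demonstration reads are all assumptions made for the example, not details taken from the study.

```python
import io
from typing import Iterable, Iterator, TextIO, Tuple

Read = Tuple[str, str, str]  # (header, sequence, quality string)


def parse_fastq(handle: TextIO) -> Iterator[Read]:
    """Yield (header, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file (or a trailing blank line)
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def mean_phred(qual: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)


def filter_reads(reads: Iterable[Read], min_len: int = 50,
                 min_mean_q: float = 20.0) -> Iterator[Read]:
    """Keep reads that are long enough and have a sufficient mean quality."""
    for header, seq, qual in reads:
        if len(seq) >= min_len and mean_phred(qual) >= min_mean_q:
            yield header, seq, qual


# Three invented reads: good, low quality, and too short.
DEMO_FASTQ = (
    "@read1\n" + "ACGT" * 20 + "\n+\n" + "I" * 80 + "\n"
    "@read2\n" + "ACGT" * 20 + "\n+\n" + "#" * 80 + "\n"
    "@read3\nACGT\n+\nIIII\n"
)
kept = list(filter_reads(parse_fastq(io.StringIO(DEMO_FASTQ))))
print([header for header, _, _ in kept])
```

In practice an established trimming tool would usually be preferred; the sketch only makes the filtering criteria concrete.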
Presence, copy number, length, start and stop coordinates, as well as orientation of gfp, hph, gpdP, trpCP, and trpCT in the endophyte genome were initially identified through nucleotide BLAST (Basic Local Alignment Search Tool) (version 2.2.25) analysis [40], which led to the identification of the pattern of T-DNA integration. Subsequently, junction sequences were identified and nucleotide BLAST against the transformation vector was performed to study their origin. The DNA sequences immediately upstream and downstream of the integration were analysed to determine associated microhomology and the genomic position of T-DNA integration in the endophyte genome.
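The bookkeeping described above (copy number, coordinates, orientation, and intact versus truncated copies) can be sketched as follows. The sketch assumes that hits were exported in the 12-column tabular layout produced by BLAST+ with `-outfmt 6`, with each cassette element as the query and the assembled contigs as the subject; the study does not state the output format used. The hit rows are invented for illustration, and only the full lengths of hph (1020 bp) and trpCT (775 bp) come from the text.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Hit:
    element: str     # query name, e.g. "hph"
    contig: str      # subject (contig) name
    start: int       # leftmost subject coordinate (1-based)
    end: int         # rightmost subject coordinate (1-based)
    strand: str      # "+" or "-" relative to the contig
    query_span: int  # number of query bases covered by the hit


def parse_outfmt6(text: str, min_identity: float = 95.0) -> List[Hit]:
    """Parse tabular BLAST hits; columns are qseqid sseqid pident length
    mismatch gapopen qstart qend sstart send evalue bitscore."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        f = line.split("\t")
        if float(f[2]) < min_identity:
            continue
        qstart, qend = int(f[6]), int(f[7])
        sstart, send = int(f[8]), int(f[9])
        # A hit on the minus strand is reported with sstart > send.
        strand = "+" if sstart <= send else "-"
        hits.append(Hit(f[0], f[1], min(sstart, send), max(sstart, send),
                        strand, qend - qstart + 1))
    return sorted(hits, key=lambda h: (h.contig, h.start))


def summarise(hits: List[Hit], full_lengths: Dict[str, int]) -> Dict[str, Dict[str, int]]:
    """Count copies per element and split them into intact and truncated."""
    summary = defaultdict(lambda: {"copies": 0, "intact": 0, "truncated": 0})
    for h in hits:
        entry = summary[h.element]
        entry["copies"] += 1
        key = "intact" if h.query_span >= full_lengths[h.element] else "truncated"
        entry[key] += 1
    return dict(summary)


FULL_LENGTHS = {"hph": 1020, "trpCT": 775}  # lengths reported in the text
ROWS = [  # hypothetical hits, for illustration only
    ["hph", "contig3", "100.0", "1020", "0", "0", "1", "1020", "120000", "121019", "0.0", "1884"],
    ["hph", "contig3", "100.0", "226", "0", "0", "1", "226", "151761", "151986", "1e-120", "418"],
    ["trpCT", "contig3", "99.7", "775", "2", "0", "1", "775", "119900", "119126", "0.0", "1420"],
]
DEMO = "\n".join("\t".join(row) for row in ROWS)

hits = parse_outfmt6(DEMO)
summary = summarise(hits, FULL_LENGTHS)
for h in hits:
    print(h.element, h.contig, h.start, h.end, h.strand, h.query_span)
print(summary)
```

Sorting hits by contig position makes the order and orientation of elements along the insert directly readable, which is the information needed to reconstruct the arrangement of T-DNA copies.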

Construction of Binary Reporter Gene Vectors
The Gateway™-enabled destination vector (pEND0002) containing the hph selectable marker cassette was used to permit insertion of reporter genes between constitutive control elements. The expression-specific construct plasmids (pEND-DsRed, pEND-egfp, and pEND-sgfp) were constructed by replacement of the Gateway RFA-A cassette in vector pEND0002 by the reporter genes DsRed, egfp, and sgfp.

A. tumefaciens-Mediated Transformation of Endophyte Mycelia
All binary vectors generated (pEND-DsRed, pEND-egfp, and pEND-sgfp) were transformed into A. tumefaciens cells of strains AGL1 and LBA 4404, and PCR analysis was used to confirm the presence of the transformed plasmids. Hygromycin B at 200 µg/mL was used for selection of putative transformants. Following co-cultivation with finely cut mycelium of NEA12 and E1, hygromycin B-resistant colonies developed 2-4 weeks after transformation. No growth was observed in the non-transgenic endophyte controls at this concentration of hygromycin.
Both A. tumefaciens strains produced hygromycin B-resistant transformants, but the number produced by AGL1 was five times higher than for LBA 4404. Fifty-six transformants were obtained for NEA12. The number of individual putative E1 transformants was too large to be counted, due to high-density growth. Stable transformation events were confirmed by the ability of these putative transformants to grow effectively at high concentrations of hygromycin B (300 µg/mL).

PCR Analysis and Mitotic Stability Assessment of Reporter Gene-Containing Endophytes
A sub-set of randomly selected hygromycin B-resistant transformants from each endophyte strain (six SGFP, six EGFP, and six DsRed transformants) were analysed by PCR using primers specific for the gfp, DsRed, and hph genes. GFP and DsRed transformants, which produced the expected product sizes for hph (414 bp), gfp (440 bp), and DsRed (385 bp) genes, were identified.
A. tumefaciens-mediated transformation of coenocytic mycelia may result in mixtures of genetically distinct nuclei, both lacking and containing the integrated reporter gene. To minimise potential problems associated with chimeric expression of fluorescent proteins, protoplasts of PCR-positive reporter gene-containing endophytes were prepared and regenerated. One variant of GFP, SGFP, for each endophyte strain was used for further analysis. A total of 48 hygromycin B-resistant reporter gene-containing endophytes (12 from each of the reporter gene-containing endophyte strains NEA12:SGFP, NEA12:DsRed, E1:SGFP, and E1:DsRed) were analysed by PCR using primers specific to the gfp, DsRed, and hph genes. The expected PCR product sizes were observed for all transformants (Figure S1).
Hygromycin B-resistant transformants expressing either the GFP or DsRed fluorescent proteins were sub-cultured for a minimum of five cycles over a period of six months. All tested transformants maintained the ability to grow on selection. After 12 months of consecutive sub-culture, reporter gene-containing endophytes for both target strains continued to express the respective fluorescent proteins.

Visualisation of Reporter Gene Expression
DsRed- and SGFP-specific fluorescence was observed for all transformants (Figure 1). Young, actively growing hyphae showed strong expression of GFP or DsRed, while longer-established mycelium exhibited reduced or negligible expression.

In Planta Detection of Endophytes
Representatives of each reporter gene-containing endophyte strain were assessed for the ability to efficiently inoculate perennial ryegrass seedlings using a real-time PCR (RT-PCR) assay that targets gfp and DsRed. Initially, melt curve analysis was performed to confirm the specificity of primer annealing. Single sharp peaks were obtained, confirming primer specificity. Positive associations between ryegrass seedlings and reporter gene-containing endophytes were identified for NEA12-GFP1, NEA12-DsRed9, E1-GFP2, and E1-DsRed4. Infection frequencies obtained six months post inoculation are summarised in Table 2.
Endophyte-infected plants that tested positive for the gfp and DsRed transgenes using RT-PCR analysis were further analysed for fluorescence in planta. GFP- and DsRed-specific fluorescence was detected specifically in the basal meristem and intercellular spaces of leaf sheaths of plants infected with reporter gene-containing endophytes (Figure 2).
Secondly, to study the colonisation ability of endophytes of the same taxon or different taxa simultaneously in the same host, inoculation of complementary endophytes expressing GFP and DsRed was performed. Co-existence of GFP- and DsRed-expressing endophyte strains of the same taxon was identified; however, GFP- and DsRed-expressing endophyte strains from different taxa were not observed to co-exist six months post inoculation (Table 3).

DNA Sequencing and Analysis of Transgene Integration Sites
Endophyte strain E1 was chosen for further studies due to the beneficial properties of this endophyte strain, such as a relatively fast growth rate and high inoculation ability compared to other asexual Epichloë strains. One representative of E1-GFP was sequenced using Illumina MiSeq and Oxford Nanopore MinION sequencing for detailed analysis of T-DNA integration in the endophyte genome. The Illumina MiSeq run generated 2,033,256 reads and the Oxford Nanopore MinION run produced 466,415 reads for a total of 5.3 Gbp of data. BLASTN analysis, as described above, was used to identify the presence of the transgenes and their corresponding promoters and terminators, and consequently the T-DNA integration site in the E1 genome. Integration of multiple T-DNAs into a single locus close to the telomere in the 6,412,111 bp contig 3 of E1-GFP was detected (Figure 3A), resulting in a 35,530 bp insertion from 116,456 bp to 151,986 bp (Figure 3B).
Co-integration of multiple T-DNA copies (10 T-DNA insertions), including four reporter gene cassettes and eight selectable marker cassettes with some degradation and rearrangement, was identified (Figure 3C). With one exception, all T-DNA copies were organised in the same orientation as the T-DNA region of the transformation vector (Figure 3D). Full-length as well as truncated promoter, gene, and terminator sequences were observed. Interestingly, all gfp and trpCP insertions were found to be intact. One copy of hph, at the terminus of a T-DNA insertion, was truncated, while the other seven hph genes were intact. The presence of both intact and truncated copies of gpdP and trpCT may be due to the fact that most T-DNA-T-DNA junctions were between gpdP and trpCT.
Three T-DNA copies (before junction 1, between junctions 1-2 and 8-9) contained hph and gfp gene cassettes including their corresponding promoters and terminators without internal T-DNA sequence deletions. However, their terminators were truncated, especially the first terminator at the start of integration. Breaks in the middle of T-DNA copies were identified for all other T-DNA copies, indicating some form of partial deletion during T-DNA integration.
T-DNA integration in the recipient E1 genome did not occur at the LB and RB sequences. Instead, one terminus of the integration lay within trpCT and the other within the hph gene. Only short stretches of these components were present at the termini of integration: 10 bp of the 775 bp intact trpCT and 226 bp of the 1020 bp intact hph. Comparison between T-DNA end sequences and genome sequences at insertion sites revealed that T-DNA integration was associated with 3 bp microhomology between genomic DNA and one T-DNA terminus. Exact union between the T-DNA border and genomic DNA was observed at the other terminus.
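The distinction between microhomology and exact union at a junction can be made concrete with a short sketch. It assumes that three sequences are available for a genome-to-T-DNA junction: the observed junction, the wild-type genomic sequence starting at the same base, and the vector T-DNA sequence ending at the same base. Bases that match both references at the crossover point count as microhomology. The function name and all sequences are invented for illustration and are not the E1-GFP junction sequences.

```python
def junction_overlap(junction: str, genome_ref: str, tdna_ref: str) -> int:
    """Classify a genome -> T-DNA junction.

    junction   : observed sequence, genomic on the left, T-DNA on the right
    genome_ref : wild-type genome sequence starting at the same first base
    tdna_ref   : vector T-DNA sequence ending at the same last base

    Returns n > 0 for n bp of microhomology, 0 for an exact union, and
    n < 0 when -n bp of filler sequence match neither reference.
    """
    # Longest junction prefix that is identical to the genome.
    g = 0
    while (g < min(len(junction), len(genome_ref))
           and junction[g] == genome_ref[g]):
        g += 1
    # Longest junction suffix that is identical to the T-DNA.
    t = 0
    while (t < min(len(junction), len(tdna_ref))
           and junction[-1 - t] == tdna_ref[-1 - t]):
        t += 1
    # Bases claimed by both references are shared at the crossover.
    return g + t - len(junction)


# Invented example 1: the three bases GGA occur in both references.
micro = junction_overlap(
    junction="ACGTTGCAGGATTCAGCTA",
    genome_ref="ACGTTGCAGGACCTTAAGC",
    tdna_ref="TTTGGATTCAGCTA",
)

# Invented example 2: genome and T-DNA meet with no shared bases.
exact = junction_overlap(
    junction="ACGTTGCCTTCAGCTA",
    genome_ref="ACGTTGCCGGACCTTA",
    tdna_ref="TTTGGATTCAGCTA",
)
print(micro, exact)
```

The same function applies to the opposite terminus after reverse-complementing the three sequences so that the genomic part again lies on the left.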
Junctions 1, 2, and 4 are between two trpCT. Analysis of these junctions identified that junction sequences consist of the three last bp of the LB to the start of trpCT of the hph cassette and first three bp of the RB, except junction 2 where the sequence between the LB and the start of trpCT of the hph cassette is missing. All three RBs identified (1, 2, and 4) displayed a breakpoint located three bases (TGA) into the border sequence. Between junctions 4 and 8, four replicates of the left part of the T-DNA region (trpCT-hph-trpCP-gpdP) were identified.
Junctions 5, 6, 7, and 8 were very similar and are between an intact trpCT and a gpdP with a 280 bp truncation, except at junction 2 where the trpCT terminator has a 239 bp truncation. These junction sequences matched the vector sequence between trpCT and two (7 and 8) or three (5 and 6) bp before the start of the RB with 100% identity. The components of the gfp gene cassette, which are located after gpdP (gfp and trpCT), have been truncated at some stage of T-DNA integration and linked to trpCT of the hph gene cassette (the sequence from the LB to the start of trpCT is also truncated). Junctions 9 (between an intact trpCT and hph (226 bp)) and 3 (trpCT (320 bp) and gpdP (336 bp)) are unique.
T-DNA integration was identified at contig 3, which started with 17 copies of a telomeric repeat (TAACCC). The T-DNA has integrated into a gene-rich region of the E1 genome; analysis of T-DNA-genome junction sequences identified that T-DNA is integrated into a gene predicted to encode a hydrolase (Pochonia chlamydosporia 170 hydrolase, α/β fold family protein (XM_018288623.1)) (Gene E) (Figure 3E). Further study of the genes immediately upstream and downstream of Gene E indicates that these genes are highly conserved in closely related Clavicipitaceae species such as Metarhizium brunneum, M. acridum, and P. chlamydosporia (Table S2). This insertion event is associated with a 10 bp deletion in Gene E of the E1 genome (Figure 3F).
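A telomeric contig end such as the one described above can be recognised by counting tandem copies of the repeat unit at the start of the sequence. The following minimal sketch does this for TAACCC; the function name and the demonstration sequence are invented, and a real contig may begin part-way through a repeat unit, which this simple version does not handle.

```python
def count_leading_repeats(seq: str, unit: str = "TAACCC") -> int:
    """Count uninterrupted tandem copies of `unit` at the start of `seq`."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    seq, unit = seq.upper(), unit.upper()
    n = 0
    # Step through the sequence one repeat unit at a time.
    while seq.startswith(unit, n * len(unit)):
        n += 1
    return n


# Invented contig start: 17 repeat units followed by unrelated sequence.
demo_contig = "TAACCC" * 17 + "GATTACAGGCT"
print(count_leading_repeats(demo_contig))
```

The opposite end of a chromosome-level contig would be expected to carry the reverse-complement repeat (GGGTTA), which can be checked by passing the reverse complement of the contig end to the same function.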

Vector Construction and A. tumefaciens-Mediated Transformation
GFP variants such as SGFP and EGFP have been shown to confer high levels of fluorescence in several fungal species, including Cochliobolus heterostrophus, Ustilago maydis, F. oxysporum, and Phytophthora palmivora [41-44]. Similarly, the codon-optimised version of DsRed has been expressed successfully in filamentous fungi such as Verticillium albo-atrum, F. oxysporum, L. maculans, L. biglobosa, Oculimacula yallundae, O. acuformis, Penicillium paxilli, and Trichoderma harzianum [13,45-48]. Constitutive and heritable expression of GFP fluorescence in transgenic organisms has provided an ideal system for assessment of organismal viability [49]. In addition, the use of reporter genes to study the host-endophyte association offers major advantages: fungal mycelia can be located visually in planta, which is ideal for monitoring different stages of the life cycle; detection provides fast, real-time temporal resolution; different inoculated strains can be rapidly screened and discriminated; and inoculated strains can be distinguished from naturally present plant-associated microorganisms [10,18,50]. Based on all these desirable qualities of fluorescent proteins, sgfp, egfp, and DsRed were selected as reporter genes for the present study.
Expression vectors containing the reporter genes were constructed using Gateway™ cloning and transformed using two A. tumefaciens strains, AGL1 and LBA 4404. Ability to induce tumours on particular plant species, which are the natural primary host, varies between A. tumefaciens strains, and it is unclear whether similar variation arises in respect to the capacity to also transform fungi [51]. In the present study, the transformation efficiency of the AGL1 and LBA 4404 strains differed substantially. AGL1 has previously been shown to be more efficient than other A. tumefaciens strains for transformation of both ascomycete and basidiomycete fungal species [52,53], consistent with the outcomes of this study. A further practical limitation of the use of LBA 4404 is a tendency to form aggregates in liquid culture, which makes accurate determination of optical density difficult [46].
The backbone vector used in this study was conveniently manipulated for multiple other purposes, in addition to the successful expression of GFP and DsRed. This included modification of the vector to express peramine in culture and in planta [35], as well as for construction of gene silencing vectors to identify the genes responsible for epoxy-janthitrem biosynthesis [6].

Molecular Analysis of Transformants
PCR-based analysis confirmed presence of hph and the corresponding reporter gene in transformants that were identified based on growth on antibiotic selection media following A. tumefaciens-mediated transformation.

Expression of Fluorescent Proteins in Culture
Visualisation of GFP and DsRed expression in reporter gene-containing endophytes showed that the level of fluorescence varied between individual transformants, potentially reflecting copy number and position effects. However, microscope-based examination revealed no gross differences in expression levels between the two GFP variants, sgfp (codon-optimised for higher plants [54]) or egfp (codon-optimised for mammals [54]). Growing hyphal tips always exhibited higher expression of GFP or DsRed, presumably due to relatively high nuclear density. This observation may also be due to variable levels of transgene expression in some nuclei according to developmental stage or transgene silencing in non-dividing mycelial cells. In common with the present study, physical variation in the intensity of fluorescent protein expression has been observed for Botrytis cinerea, Pyrenophora tritici-repentis, and Sclerotinia sclerotiorum [42,55].
Expression of GFP and DsRed was stable in transformants of both endophyte strains and did not disappear after subsequent sub-culture over a period of twelve months. High rates of mitotic stability of gfp gene containing transformants of different fungal species generated by A. tumefaciens-mediated transformation have also been reported in other studies [56,57].
A. tumefaciens-mediated transformation of mycelia may result in cells that have genetically different nuclei, some containing and some lacking the integrated reporter gene. Therefore, protoplasts from PCR-positive transformants were prepared to mitigate the risk of analysing mycelia with chimeric nuclear content arising from both transformed and non-transformed nuclei. This strategy is vulnerable to the possibility that protoplasts may be multinucleate rather than uninucleate, which could inadvertently capture both transformed and non-transformed nuclei. However, sequence analysis confirmed the presence of single transformed nuclei in the E1-GFP transformant. Purification of transformants has also been attempted by subculturing of hyphal tips and serial transfers on selective media [42]. However, homokaryotic mutants of Monilinia fructicola were not identified even after four rounds of single hyphal tip purification [58].

Inoculation Ability of Reporter Gene-Containing Endophyte Strains and in Planta Fluorescent Protein Expression
Fluorescent proteins provide a unique and visual phenotype for studying plant-microbe interactions without any external intervention, thereby providing valuable information on in planta development of the endophyte. The colonisation ability of a transgenic endophyte could be compromised by transgene insertion, especially if a critical gene is disrupted [59]. Therefore, one representative from each reporter gene-containing endophyte strain was assessed for its infection ability. Colonisation was observed for gfp- and DsRed-containing NEA12 and E1 endophyte strains six months after inoculation, indicating vegetative stability of the host-endophyte association over time.
GFP-and DsRed-specific fluorescence of plants infected with transgenic endophytes was observed in basal meristems, as well as in intercellular spaces in which endophytes are abundantly located. Each individual plant infected with a transgenic endophyte strain showed normal growth in the glasshouse and stable expression of either GFP or DsRed was retained. The stability of the associations established provides evidence for the ultimate value of reporter gene-expressing endophytes for assessment of spatial and temporal changes during host colonisation and longer-term evaluation of symbiota through a time-course of plant growth and development to the next generation through the seeds, while emphasising the importance of access to endophyte strains of different taxa with different fluorescent protein expressions.

Co-Existence Ability of Reporter Gene-Containing Endophyte Strains
This study used endophyte strains that belong to the same and different taxa and express GFP and DsRed to investigate the co-existence potential of endophytes within the same host plant. Only a few studies have so far attempted to understand this phenomenon, and the mechanisms that govern an inability to co-exist within the same tiller are yet to be elucidated [60-62]. As part of the studies performed so far, different endophyte strains in the same host have been distinguished based on characteristics such as presence or absence of conidia, production of certain specific alkaloids, and RAPD (random amplification of polymorphic DNA) profiles [60,61]. However, none of these identification methods can provide real-time temporal resolution, as may be achieved by use of fluorescent proteins. No prior study of co-existence and distribution of multiple endophytes within an individual plant has used fluorescent-labelled endophytes, so the strains described here provide dynamic tools for a better understanding of the mechanisms involved during co-inoculation, co-existence, and competition between co-inoculated strains.
Co-inoculation events were identified for both NEA12-GFP1 + NEA12-DsRed9 and E1-GFP2 + E1-DsRed4, suggesting that stable co-colonisation is possible within the same taxon. Only E1-GFP was identified in plants inoculated with both E1-GFP2 and NEA12-DsRed9. E1 has a faster growth rate and higher inoculation ability than NEA12. These attributes may have enabled E1 to outcompete NEA12 and become dominant. This characteristic of E1 has also been observed in other co-inoculation studies using different perennial ryegrass endophyte strains. This co-inoculation study was performed mainly to understand the potential of using these reporter gene-containing endophyte strains to study host-endophyte interactions. However, it would be beneficial to co-inoculate and analyse a greater number of plants to understand the co-infection ability of endophytes precisely. It has been found that hyphae of the dominant strain colonised primordial tillers arising from dually infected plants, and co-existence disappeared over time, giving rise to plants in which all tillers contained a single strain [60].

Bioinformatic Analysis of Transgene Integration Sites
T-DNA integration is a complex process and its mechanism is not yet completely understood [32,63]. Understanding the complex nature of T-DNA integrations may help to develop and optimise transformation methods and conditions to achieve transformants with only intended modifications. Integration patterns of T-DNA in a small number of plant genomes have been studied in detail [31,64]. In the case of fungi, even though A. tumefaciens-mediated transformation has been used extensively for generation of transgenic strains, integration patterns of T-DNA in fungal genomes has not been characterised to the same extent [65][66][67]. A detailed understanding of T-DNA integration is essential to develop transformation methods to obtain a more desirable outcome, which is very important for both commercial and research purposes [25,32]. Traditionally, molecular techniques such as Southern blotting, PCR, and genome walking have been used for characterisation of T-DNA integration [21,57,68]. However, these approaches are not capable of identifying complex integration events [69]. Next-generation sequencing has been used more effectively for the analysis of T-DNA integration patterns. Although there are limitations such as difficulties in assembling large repeat structures and other complex regions accurately [64]. Further, due to the short reads generated, these sequencing methods only identify flanking sequences of T-DNA and genomic DNA [64]. More recently, the combination of next-generation sequencing and third-generation techniques such as PacBio and MinIon sequencing have been used to analyse T-DNA integration events and associated genomic variations in the recipient genome with higher precision and in greater detail [31,64]. However, to date, studies using long read sequencing to analyse T-DNA integration events are limited to a few plant species such as Arabidopsis and Betula (birch) [31,64].
This study used a combination of second- and third-generation sequencing (MiSeq and MinION, respectively) to identify and characterise T-DNA integration in the endophyte strain E1-GFP. An insertion of 35 kb containing 10 T-DNA copies was identified in contig 3 of E1-GFP, which is homologous to chromosome III of E. festucae strain Fl1 (6 Mb) [70,71]. For some fungi, the number of T-DNA copies integrated may depend on the A. tumefaciens-based transformation method. In A. aculeatus, Blastomyces dermatitidis, and Suillus bovinus, a high concentration of A. tumefaciens cells resulted in multiple copy integrations, while low concentrations resulted predominantly in single copy integrations [72]. However, A. tumefaciens concentration exerted no obvious influence on T-DNA copy number in F. oxysporum [73]. The occurrence of transformants with single or multiple copy T-DNA integration patterns has also been shown to depend on the addition of acetosyringone to the A. tumefaciens pre-culture and on the length of the co-cultivation period [22,74]. The presence of multiple T-DNA copies in the E1-GFP strain analysed could potentially be due to the transformation conditions used in this study, which included a co-cultivation time of 72 h and pre-treatment of the A. tumefaciens cells with acetosyringone prior to co-cultivation with endophyte mycelia. T-DNA may have been sequentially integrated into a single recipient genome through more than one round of transformation.
T-DNA truncations at both the LB and RB, as well as deletions of borders, were observed in the present study. Right-border breakpoints retaining only the first 3 bp of the RB were detected in three of the nine junctions identified, while left-border breakpoints retaining only the last 3 bp of the LB were detected in two of the nine junctions. This agrees with previous reports that the most commonly identified right-border breakpoint lies at the first three nucleotides (TGA) of the RB sequence, which corresponds to the expected nick point between the third and fourth nucleotides of the 25 bp repeat [24]. Truncations of T-DNA may arise from recognition of non-border sequences as borders, with subsequent nicking of the T-DNA at these locations, from exonuclease digestion of the T-DNA ends prior to integration, or from breakage of the T-DNA during transfer [24]. In M. oryzae, the LB sequence is truncated at a high frequency when compared to the RB [20,75]. In contrast, conservation of both borders has been observed for transformants of L. maculans [67].
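The relationship between retained border bases and the expected nick point can be illustrated with a minimal sketch. It assumes only what is stated above: a 25 bp border repeat nicked between its third and fourth nucleotides. The function name, the category labels, and the example input list are hypothetical and are not data from this study.

```python
from collections import Counter

BORDER_LENGTH = 25  # length of the border repeat (bp), as stated in the text
NICK_OFFSET = 3     # nick expected between the 3rd and 4th nucleotide

def classify_rb_breakpoint(retained_bp: int) -> str:
    """Classify a right-border junction by the number of RB bases retained.

    Hypothetical helper for illustration only.
    """
    if not 0 <= retained_bp <= BORDER_LENGTH:
        raise ValueError("retained_bp must lie between 0 and 25")
    if retained_bp == 0:
        return "border absent"
    if retained_bp == NICK_OFFSET:
        return "canonical nick site"
    return "other breakpoint"

# Toy input: retained RB bases at five made-up junctions (not study data).
toy_junctions = [3, 0, 3, 11, 3]
summary = Counter(classify_rb_breakpoint(n) for n in toy_junctions)
print(summary)
```

A tally of this kind makes it easy to compare the proportion of canonical breakpoints between transformants or between species, once junction coordinates have been extracted from the assembled insert.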
Four gfp cassettes were present among the ten T-DNA copies identified. Three of them contained intact gfp genes and intact gpdP promoters. Only one of those three cassettes has a full-length trpCT, while the other two have truncated trpCT (<250 bp truncations). The fourth gfp cassette has a full-length gfp gene with only very short fragments of the corresponding promoter and terminator, and therefore may not be transcribed. The other six T-DNA copies contained hph cassettes with truncated gpdP. Deletions of T-DNA were observed not only at the ends, but also in the middle of the T-DNA. Interestingly, four replicates of the left part of the T-DNA (trpCT-hph-trpCP-gpdP) were observed between junctions 4 and 8, demonstrating the complex nature of T-DNA integration. This could be due to occasional random nicking of the T-DNA and nuclease digestion of the T-DNA ends [24]. Integration of DNA with part of the T-DNA fragment missing or with rearrangements has been identified frequently in plant genomes [64]. This complex pattern of T-DNA integration was further confirmed by long read sequences that span the entire 35 kb integration.
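As an illustration of how spanning long reads can be recognised, the sketch below selects reads whose alignment covers the whole insert plus a stretch of genomic flank on each side. It is not the pipeline used in this study: the `Alignment` record, the function name, the 500 bp flank threshold, and all coordinates are assumptions made for the example. Only the insert length of 35,530 bp is taken from the reported result.

```python
from typing import List, NamedTuple

class Alignment(NamedTuple):
    read_id: str
    start: int  # 0-based start of the alignment on the contig (inclusive)
    end: int    # end of the alignment on the contig (exclusive)

def spanning_reads(alignments: List[Alignment],
                   insert_start: int,
                   insert_end: int,
                   min_flank: int = 500) -> List[str]:
    """Return IDs of reads covering the insert plus min_flank bp on each side."""
    return [
        a.read_id
        for a in alignments
        if a.start <= insert_start - min_flank and a.end >= insert_end + min_flank
    ]

# Hypothetical coordinates: insert placed at position 100,000 of a contig.
INSERT_START = 100_000
INSERT_END = INSERT_START + 35_530

toy_alignments = [
    Alignment("read_a", 90_000, 140_000),   # covers insert and both flanks
    Alignment("read_b", 99_800, 136_500),   # left flank shorter than 500 bp
    Alignment("read_c", 120_000, 150_000),  # starts inside the insert
]
print(spanning_reads(toy_alignments, INSERT_START, INSERT_END))
```

Requiring a minimum flank on both sides guards against reads that end within the repeated T-DNA copies, where placement is ambiguous.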
Integration of T-DNA requires a DNA repair pathway. In fungi, the main pathway is non-homologous end-joining (NHEJ); however, targeted integration of the T-DNA by homologous recombination (HR) is also possible in species such as yeast [76,77]. NHEJ requires only a short microhomology of up to 5 bp between T-DNA and host genomic DNA, whereas homologous recombination requires extensive DNA sequence homology [76][77][78].
During NHEJ, rearrangements such as deletions of a few nucleotides and/or duplications have been observed at double strand break sites in plants [77]. In this study, microhomology of 3 bp was observed at the junction between one T-DNA terminus and the endophyte genome, suggesting that T-DNA integration in Epichloë does not depend upon long stretches of homology at cross-over points, similar to observations for M. oryzae [75]. No microhomology between the T-DNA end and the genome was observed at the other end of the integration. Microhomology between T-DNA ends and genomic DNA has also been reported for other fungi such as S. cerevisiae and C. neoformans [20,27]. Identification of a 10 bp deletion of endophyte genome sequence around the insertion site in this study is consistent with observations for other fungi, including M. oryzae and C. neoformans. Similar to the results of the current study, most of the deletions identified in these fungi were less than 20 bp in length [20,75].
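Microhomology at a junction can be measured as the number of bases around the breakpoint that could be assigned equally well to the genome or to the T-DNA. The following is a minimal sketch of that calculation, assuming both reference sequences are given in the same orientation as the junction; the function name and the toy sequences are hypothetical and do not represent sequences from E1-GFP.

```python
def microhomology_length(genome: str, genome_break: int,
                         tdna: str, tdna_break: int) -> int:
    """Count bases shared by genome and T-DNA around a junction.

    The junction is modelled as genome[:genome_break] + tdna[tdna_break:].
    Bases immediately before both breakpoints that are identical, and bases
    immediately after both breakpoints that are identical, are ambiguous in
    origin and together make up the microhomology.
    """
    # Extend backwards from the breakpoints while the bases agree.
    back = 0
    while (back < genome_break and back < tdna_break
           and genome[genome_break - 1 - back] == tdna[tdna_break - 1 - back]):
        back += 1
    # Extend forwards from the breakpoints while the bases agree.
    fwd = 0
    while (genome_break + fwd < len(genome) and tdna_break + fwd < len(tdna)
           and genome[genome_break + fwd] == tdna[tdna_break + fwd]):
        fwd += 1
    return back + fwd

# Toy sequences (made up): "GGG" is shared at the breakpoint.
toy_genome = "AAACCCGGGTTTACGT"
toy_tdna_with_mh = "TTGGGACAGATC"
toy_tdna_without_mh = "TTCCAACAG"

print(microhomology_length(toy_genome, 9, toy_tdna_with_mh, 5))     # 3
print(microhomology_length(toy_genome, 9, toy_tdna_without_mh, 5))  # 0
```

Because the breakpoint position within a microhomology is arbitrary, extending in both directions gives the same total wherever an aligner happens to place the junction inside the shared bases.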
Analysis of genomic flanking sequences in the present study showed that the T-DNA had integrated into a gene-rich region of the E1 genome, and more specifically into a gene predicted to code for a hydrolase. It remains debatable whether T-DNA integration targets certain genomic regions or occurs in a completely random manner [79]. Greater susceptibility of regions upstream of genes, as well as intergenic regions, to T-DNA integration has been reported in the pathogenic fungus H. capsulatum [21]. In contrast, a relatively even distribution of T-DNA integration events throughout the genome was observed for S. cerevisiae, L. maculans, and M. grisea [28].
This study revealed the complex nature of T-DNA integration in fungi through analysis of a single transformant. Multiple T-DNA copies with truncations of different lengths were identified in precise detail using a combination of long read and short read sequence data. These outcomes show the need for more thorough study of T-DNA integration in fungal genomes to better understand the underlying process. This study has implications for modification of fungal genomes, including genome editing, using A. tumefaciens-mediated transformation, and the approach can be explored further to understand other structural variations of fungal genomes that can be caused by T-DNA insertions.

Conclusions
A new set of fluorescent protein expression vectors was generated to study biological processes in endophytic filamentous fungi. Derivatives of endophyte strains NEA12 and E1 that express GFP and DsRed were generated and characterised in detail using a range of analytical techniques. Stable integration and expression of transgenes in endophytic fungi provide an efficient tool for exploring the ability of endophytes to colonise plant tissue, for establishing new inoculation methods, and for studying numerous aspects of host-endophyte and endophyte-endophyte interactions. A more detailed understanding of the mechanisms controlling endophyte co-existence will permit establishment of artificial symbioses between endophytes and host grasses that are capable of providing a larger range of benefits to host plants, through combinations of beneficial properties such as production of complementary bioprotective alkaloids for pest and disease control. Sequence analysis of T-DNA integration emphasises the importance of developing and optimising transformation protocols, as well as screening transformants in detail, most importantly in technologies such as foreign-DNA free genome editing. This is the first study to use a combination of second- and third-generation DNA sequencing methods to analyse copy number, sites, and exact structures of T-DNA integration events in fungal genomes in precise detail.