Effect of T-DNA Integration on Growth of Transgenic Populus × euramericana cv. Neva Underlying Field Stands

Multigene cotransformation has been widely used in the study of genetic improvement in crops and trees. However, little is known about the unintended effects and causes of multigene cotransformation in poplars. To gain insight into the unintended effects of T-DNA integration during multigene cotransformation in field stands, here, three lines (A1–A3) of Populus × euramericana cv. Neva (PEN) carrying Cry1Ac-Cry3A-BADH genes and three lines (B1–B3) of PEN carrying Cry1Ac-Cry3A-NTHK1 genes were used as research objects, with non-transgenic PEN as the control. Experimental stands were established at three common gardens in three locations and next generation sequencing (NGS) was used to identify the insertion sites of exogenous genes in six transgenic lines. We compared the growth data of the transgenic and control lines for four consecutive years. The results demonstrated that the tree height and diameter at breast height (DBH) of transgenic lines were significantly lower than those of the control, and the adaptability of transgenic lines in different locations varied significantly. The genotype and the experimental environment showed an interaction effect. A total of seven insertion sites were detected in the six transgenic lines, with B3 having a double-site insertion and the other lines having single copies. There are four insertion sites in the gene region and three insertion sites in the intergenic region. Analysis of the bases near the insertion sites showed that AT content was higher than the average chromosome content in four of the seven insertion sites within 1000 bp. Transcriptome analysis suggested that the differential expression of genes related to plant hormone transduction and lignin synthesis might be responsible for the slow development of plant height and DBH in transgenic lines. This study provides an integrated analysis of the unintended effects of transgenic poplar, which will benefit the safety assessment and reasonable application of genetically modified trees.


Introduction
The introduction of foreign genes into the genome of the recipient plant enables the host plant to acquire novel traits. At the same time, the host plant may have unexpected and uncontrollable trait changes, which are called unintended effects [1]. It is necessary to assess the unintended effects of genetically modified (GM) plants to ensure they do not pose unacceptable risk [2]. Plants that introduce a single gene for insect resistance or herbicide resistance usually do not cause unintended effects [3]. Genetically engineered plants involved complex multigene networks, such as enhancing tolerance to abiotic stress, have a greater potential to introduce unintended effects [4].
Unintended effects can have either neutral or adverse impacts. In the study of GM herbicide-tolerant rice, it was discovered that, for most morphological or agronomic traits, there were no significant differences. In addition, some altered traits were not biologically relevant [5]. In Petunia hybrida, a Clarkia breweri S-linalool synthase cDNA (lis) was

Growth Traits of Transgenic Poplar in Field
To investigate the development of transgenic and non-transgenic lines, growth data from Mancheng (MC), Yanshan (YS), and Luannan (LN) were collected in different years ( Figure 1). The tree height of MC demonstrated that the average tree height of transgenic lines in 2018 was 4.5% lower than that of the control, and in 2019, 2020, and 2021, the tree height of transgenic lines was significantly lower than that of the non-transgenic line, with average reductions of 18.5%, 13.3%, and 10%, respectively ( Figure 1A). In the YS area, the average tree height of transgenic lines in 2018 was 8.7% lower than that of the control, and in 2019 and 2021, the tree height of transgenic lines was significantly lower, with average reductions of 23% and 30%, respectively ( Figure 1C). Similarly, for LN, the average tree height of transgenic lines in 2018 and 2021 was 7.1% and 20% lower than that of the control, respectively ( Figure 1E).
average reductions of 18.5%, 13.3%, and 10%, respectively ( Figure 1A). In the YS area, the average tree height of transgenic lines in 2018 was 8.7% lower than that of the control, and in 2019 and 2021, the tree height of transgenic lines was significantly lower, with average reductions of 23% and 30%, respectively ( Figure 1C). Similarly, for LN, the average tree height of transgenic lines in 2018 and 2021 was 7.1% and 20% lower than that of the control, respectively ( Figure 1E).
Regarding DBH, there were significant differences between transgenic and nontransgenic lines in MC for four consecutive years. The average DBH of transgenic lines was 20%, 25.2%, 22%, and 22% lower than that of the control ( Figure 1B). In the YS area, the DBH of transgenic lines in 2018, 2019, and 2021 was significantly lower, with average reductions of 31%, 43.2%, and 25.2%, respectively ( Figure 1D). In the LN area, the same trend was observed in 2018 and 2021, with the average DBH of transgenic lines being 30.1% and 35% lower than that of the control, respectively ( Figure 1F).  Regarding DBH, there were significant differences between transgenic and nontransgenic lines in MC for four consecutive years. The average DBH of transgenic lines was 20%, 25.2%, 22%, and 22% lower than that of the control ( Figure 1B). In the YS area, the DBH of transgenic lines in 2018, 2019, and 2021 was significantly lower, with average reductions of 31%, 43.2%, and 25.2%, respectively ( Figure 1D). In the LN area, the same trend was observed in 2018 and 2021, with the average DBH of transgenic lines being 30.1% and 35% lower than that of the control, respectively ( Figure 1F).

Genotype by Environment Interaction Effect
To determine the environmental adaptability of transgenic and non-transgenic lines, we conducted variance analysis on the tree height and DBH across three regions in 2021. As shown in Table 1, there were no significant differences in tree height and DBH between blocks (p > 0.05), indicating that the sample replicates within the same region were consistent. However, significant variation in both tree height and DBH were observed among regions for the same line, as well as among different regions for the entire group of trees. This suggests a remarkable interaction effect between genotype and environment.

NGS Sequencing Quality Control and Data Comparison
After filtering the raw NGS sequencing data, we obtained~15.8 G,~19.1 G,~17.4 G, 15.8 G,~19.1 G, and~17.4 G of clean data for the transgenic lines A1, A2, A3, B1, B2, and B3, respectively. The sequencing depth was greater than 30X, and the Q30 value was greater than 90%, providing evidence of the sequencing quality and high data reliability (Supplementary Table S1). We detected seven insertion sites according to sequence alignment of the junction reads in all six samples ( Table 2). All insertion sites were aligned to unilateral junction read. T-DNA insertion information was searched with the assistance of Agrobacterium Insertional Mutagenesis Highthroughput Insert Identification (AIM-HII) software. In the table, "Chromosome" and "Insertion Position" represent the chromosome and position detected at the T-DNA insertion site. "Insert Direction" indicates the insertion direction of T-DNA determined after reading sequence alignment; "Reads" refers to the number of fragments containing both T-DNA sequence and genome sequence information recognized by AIM-HI software; "Verified Subsequently" means subsequent verification by PCR.

PCR Verification of T-DNA Insertion Sites
We designed PCR primers based on the insertion site flanking sequence and T-DNA boundary information to verify the NGS sequencing results (Supplementary Table S2). The PCR results showed that only four insertion sites in the A2, A3, and B3 transgenic lines were validated (Table 2), with slight differences in product sizes from predicted bands. The PCR-amplified products were then sequenced and aligned to the genome and vector sequences, and the results proved the accuracy of the four insertion sites detected by NGS sequencing. The alignment results were shown in Supplementary Table S3 and Figure S1. The boundary sequence on the vector was moderately damaged, and some small fragments of the genome were deleted.

T-DNA Insertion Location and Adjacent Genes
Among the seven insertion sites of the six transgenic lines, three were located in the gene region and four were located in the intergenic region (Supplementary Table S4). Three insertion sites were located in the following gene regions: the first exon of the LOC7498022 gene on chromosome 10 of B1, the first intron of the LOC18104831 gene on chromosome 14 of B2, and the second exon of the LOC18100061 gene on chromosome six of B3. A total of 21 genes were found within 20 kb upstream and downstream of insertion sites, including 19 protein-coding genes and two non-coding genes, such as abscisic acid 8'-hydroxylase 2, growth-regulating factor 4. In the 1000 bp ranges extending from upstream and downstream of each insertion site, the AT base contents of A1, A2, A3, B1, B2, and B3 were 67.4%, 47.4%, 75.8%, 57.8%, 67.2%, 64.2%, and 69.7%, respectively. The AT base content in the adjacent insertion sites of A1, A3, B2, and chromosome 18 in B3 was higher than the average AT content in chromosomes.

RNA-seq and Differentially Expressed Genes (DEGs) Analysis
After filtering out adapter and low-quality reads from the raw data, 55.85 G clean bases were obtained from nine samples. The mapping rate of reads among the samples was > 86%, with an average GC content of 44.28% and Q30 > 93%, indicating good sequencing quality (Supplementary Table S5). A total of 31640 genes were identified in the genome of P. trichocarpa. In addition, some reads were identified in the non-gene region of the reference genome, which were re-annotated as new genes. A total of 1081 new genes were predicted, of which 1009 genes were functionally annotated. Compared with the control, 575 DEGs (212 upregulated and 363 downregulated) were identified in A1 line and 678 DEGs (359 upregulated and 319 downregulated) were identified in B1 line ( Figure 2A). In the A1 and B1 lines there were 309 common DEGs, and 266 and 369 specific DEGs, respectively. Common DEGs accounted for 32.73% ( Figure 2B).

Gene Ontology (GO) Enrichment Analysis
The GO enrichment terms of DEGs in the A1 and B1 lines showed similar results, with the most DEGs involved in biological processes, followed by molecular function, and the fewest in the cellular component (Supplementary Figure S2). In biological processes, the DEGs were mainly enriched in metabolic processes, cellular processes, and single-organism process. Concerning the molecular function, the DEGs were mainly enriched in

Gene Ontology (GO) Enrichment Analysis
The GO enrichment terms of DEGs in the A1 and B1 lines showed similar results, with the most DEGs involved in biological processes, followed by molecular function, and the fewest in the cellular component (Supplementary Figure S2). In biological processes, the DEGs were mainly enriched in metabolic processes, cellular processes, and singleorganism process. Concerning the molecular function, the DEGs were mainly enriched in catalytic activity and binding. In terms of the cellular component, the DEGs were mainly enriched in the cell, cell part, and membrane. As shown in Figure 3, the biological processes (BP) involving DEGs in the A1 transgenic line were significantly enriched in terms of "response to stimulus", "organonitrogen compound catabolic process", "response to stress". The molecular function (MF) GO terms related to "hydrolase activity, hydrolyzing Oglycosyl compounds", "carbohydrate derivative transporter activity", "hydrolase activity, acting on glycosyl bond" were significantly enriched. Regarding the cellular component (CC), GO terms related to "vacuole", "intrinsic component of membrane", "extracellular region part" were significantly enriched. Similarly, the DEGs in the B1 transgenic line were significantly enriched in terms of "sulfur amino acid catabolic process", "glycerol-3-phosphate metabolic process", and "cellular metal ion homeostasis" in BP. "Protein phosphatase binding" GO terms were notably enriched in MF. Additionally, GO terms related to "vacuole", "extracellular region part", and "lytic vacuole" were significantly enriched in CC.

Kyoto Encyclopedia of Genes and Genomes (KEGG) Metabolic Pathway Analysis
KEGG pathway analysis was used to investigate the unintended effects of transgenic lines. The significantly enriched pathways are shown in Figure 4. Out of 131 DGEs from the A1 line, 90 pathways were found to be enriched. The top 20 pathways with the highest significance include "Cyanoamino acid metabolism", "Tyrosine metabolism", "Isoquinoline alkaloid biosynthesis", etc. A total of 141 DGEs from the B1 line were enriched in 84 metabolic pathways. The top 20 significantly enriched pathways consist of "Isoquinoline alkaloid biosynthesis", "Tyrosine metabolism", "Terpenoid backbone biosynthesis", etc. A total of 68 pathways showed co-enrichment of DEGs from both the A1 and B1 lines, including "plant hormone transduction", "Phenylpropanoid biosynthesis", "Zeatin bio-

Kyoto Encyclopedia of Genes and Genomes (KEGG) Metabolic Pathway Analysis
KEGG pathway analysis was used to investigate the unintended effects of transgenic lines. The significantly enriched pathways are shown in Figure 4. Out of 131 DGEs from the A1 line, 90 pathways were found to be enriched. The top 20 pathways with the highest significance include "Cyanoamino acid metabolism", "Tyrosine metabolism", "Isoquinoline alkaloid biosynthesis", etc. A total of 141 DGEs from the B1 line were enriched in 84 metabolic pathways. The top 20 significantly enriched pathways consist of "Isoquinoline alkaloid biosynthesis", "Tyrosine metabolism", "Terpenoid backbone biosynthesis", etc. A total of 68 pathways showed co-enrichment of DEGs from both the A1 and B1 lines, including "plant hormone transduction", "Phenylpropanoid biosynthesis", "Zeatin biosynthesis", "MAPK signaling pathway", and other important pathways related to plant growth and development.

K-Means Clustering of DEGs Expression Patterns
To further understand the gene expression patterns of transgenic lines, we performed a K-means cluster analysis of the DEGs ( Figure 5). All the DEGs were clustered into eight expression patterns. Among these clusters, Profiles 2, 7, and 8 contained the most genes and showed significant correlation. The expression patterns of the A1 and B1 lines in Profiles 2 and 7 were consistent. Profile 2 contained 319 genes, and the gene expression levels of A1 and B1 were significantly down-regulated. Profiles 7 and 8 contained 195 and 196 genes, respectively; the gene expressions of A1 and B1 were significantly up-regulated compared to control.

K-Means Clustering of DEGs Expression Patterns
To further understand the gene expression patterns of transgenic lines, we performed a K-means cluster analysis of the DEGs ( Figure 5). All the DEGs were clustered into eight expression patterns. Among these clusters, Profiles 2, 7, and 8 contained the most genes and showed significant correlation. The expression patterns of the A1 and B1 lines in Profiles 2 and 7 were consistent. Profile 2 contained 319 genes, and the gene expression levels of A1 and B1 were significantly down-regulated. Profiles 7 and 8 contained 195 and 196 genes, respectively; the gene expressions of A1 and B1 were significantly up-regulated compared to control.

Analysis of Genes Expression Related to Growth and Development
To determine the influence of exogenous gene insertion on plant growth and development, we analyzed the expression levels of genes related to growth in the A1 and BI transgenic lines ( Figure 6A,B). The gene expression revealed that two JAZ genes and MPK6 gene of the A1 transgenic line were significantly down-regulated in the plant hormone transduction pathway, while GH3 was significantly up-regulated. CKX and CISZOG genes that are involved in the zeatin biosynthesis pathway were significantly down-regulated in A1 line ( Figure 6A). Likewise, in the B1 transgenic line, the CKX gene was significantly down-regulated in the zeatin biosynthesis pathway ( Figure 6B). Moreover, the MPK6, PP2C, and SnRK genes were down-regulated while the SAUR, AUX/IAA, GH3, and CYCD3 genes were up-regulated in the plant hormone transduction pathway of the B1 transgenic line ( Figure 6B). In the phenylpropanoid biosynthesis pathway, the HCT and Peroxidase genes involved in lignin synthesis were significantly down-regulated in both transgenic lines, and 4CL gene was significantly down-regulated in the B1 transgenic line. expression patterns. Among these clusters, Profiles 2, 7, and 8 contained the most genes and showed significant correlation. The expression patterns of the A1 and B1 lines in Profiles 2 and 7 were consistent. Profile 2 contained 319 genes, and the gene expression levels of A1 and B1 were significantly down-regulated. Profiles 7 and 8 contained 195 and 196 genes, respectively; the gene expressions of A1 and B1 were significantly up-regulated compared to control.

Analysis of Genes Expression Related to Growth and Development
To determine the influence of exogenous gene insertion on plant growth and development, we analyzed the expression levels of genes related to growth in the A1 and BI transgenic lines ( Figure 6A,B). The gene expression revealed that two JAZ genes and

Analysis of Differentially Expressed Transcription Factors (TFs)
TFs play a vital role in regulating plant growth and enhancing stress resistance. In this study, a total of 2359 TFs were screened, and 76 TFs showed a significant difference in expression across different comparison groups. These TFs were classified into 29 families, with the most abundant ones being bHLH, WRKY, ERF, MYB, and NAC. In the A1 transgenic line, 43 TFs exhibited significant expression changes, with 20 up-regulated and 23 down-regulated genes ( Figure 6C). These genes belong to 21 TF families. The bHLH and WRKY families had the largest number of genes, and most of them were predominantly up-regulated. Regarding the B1 line, the expression levels of 59 TFs were significantly altered, with 43 up-regulated and 16 down-regulated ( Figure 6D). These genes belonged to 27 TF families, and the ERF, WRKY, and bHLH families had the largest number. Most DGEs in the ERF family showed up-regulation, while all DGEs in the bHLH and WRKY families were predominantly up-regulated.

Verification of RNA-seq Data by Quantitative Reverse-Transcription PCR (qRT-PCR)
We used qRT-PCR to measure the expression level of selected genes. We focused on four genes that were commonly expressed in the A1 and B1 lines (LOC18100835, LOC7475677, LOC7469610, LOC466794), as well as four genes located within 20 kb of the T-DNA insertion site (LOC7468629, LOC7468630, LOC7481526, LOC7498022). The four protein-coding genes near the insertion site were all effectively expressed (FPKM > 1), and the gene expression levels of A1 and B1 lines were not significantly different from those of the control (Figure 7). Real-time PCR revealed the same tendency in changes in expression as the RNA-seq data, which suggests that the RNA-seq data in this study are reliable. lated in A1 line ( Figure 6A). Likewise, in the B1 transgenic line, the CKX gene was significantly down-regulated in the zeatin biosynthesis pathway ( Figure 6B). Moreover, the MPK6, PP2C, and SnRK genes were down-regulated while the SAUR, AUX/IAA, GH3, and CYCD3 genes were up-regulated in the plant hormone transduction pathway of the B1 transgenic line ( Figure 6B). In the phenylpropanoid biosynthesis pathway, the HCT and Peroxidase genes involved in lignin synthesis were significantly down-regulated in both transgenic lines, and 4CL gene was significantly down-regulated in the B1 transgenic line.  four genes that were commonly expressed in the A1 and B1 lines (LOC18100835, LOC7475677, LOC7469610, LOC466794), as well as four genes located within 20 kb of the T-DNA insertion site (LOC7468629, LOC7468630, LOC7481526, LOC7498022). The four protein-coding genes near the insertion site were all effectively expressed (FPKM > 1), and the gene expression levels of A1 and B1 lines were not significantly different from those of the control (Figure 7). Real-time PCR revealed the same tendency in changes in expression as the RNA-seq data, which suggests that the RNA-seq data in this study are reliable.

Discussion
Transgenic plants are labeled with exogenous T-DNA, making it essential to identify their unintended effects for commercial promotion and risk assessment. In Arabidopsis, engineering stress-tolerance genes did not cause unintended effects [19]. However, transgenic rice expressing Bt toxin showed that the Bt protein had unintended effects on nontarget invertebrates [20]. Our previous studies demonstrated that Bt transgenic poplar had no significant effect on the arthropod community or bacterial diversity, nor did it impact tree height. However, it did affect the radial development of the receptor [21,22]. In this study, the transgenic PEN carrying the Cry1Ac-Cry3A-BADH genes and the Cry1Ac-Cry3A-NTHK1 genes had significantly inhibited tree heights and DBH at different

Discussion
Transgenic plants are labeled with exogenous T-DNA, making it essential to identify their unintended effects for commercial promotion and risk assessment. In Arabidopsis, engineering stress-tolerance genes did not cause unintended effects [19]. However, transgenic rice expressing Bt toxin showed that the Bt protein had unintended effects on non-target invertebrates [20]. Our previous studies demonstrated that Bt transgenic poplar had no significant effect on the arthropod community or bacterial diversity, nor did it impact tree height. However, it did affect the radial development of the receptor [21,22]. In this study, the transgenic PEN carrying the Cry1Ac-Cry3A-BADH genes and the Cry1Ac-Cry3A-NTHK1 genes had significantly inhibited tree heights and DBH at different growing sites (Figure 1). Therefore, analyzing unintended effects of transgenic plants case by case is necessary.

T-DNA Integration Analysis
Exogenous gene integration usually occurs randomly on the chromosome of the recipient genome. Choi et al. (2002) used fluorescence in situ hybridization (FISH) to analyze transgenic plants, and the results showed that there were no preferential integration sites of foreign DNA among chromosomes [23]. This study also found no obvious chromosomal bias in T-DNA insertion, which is consistent with previous findings. However, it is worth mentioning that the sample size in this study was small, and further data from transgenic lines are needed to explore whether T-DNA integration on chromosomes is biased or not. It has been recognized that the expression level of exogenous genes will vary depending on their randomly inserted positions [24]. Generally, the expression of exogenous genes is unstable when T-DNA is integrated near the heterochromatin region, while it is stable when T-DNA is integrated into the euchromatin region with high transcriptional activity [25,26]. In the genome, T-DNA is more likely to insert into intergenic regions rather than gene regions [27]. In this study, three insertion sites were located in the gene region and four were located in the intergenic region (Supplementary Table S4). Furthermore, transgenic lines with Agrobacterium-mediated integration usually have a relatively small copy number [28]. In this study, six transgenic lines of PEN were all generated using the Agrobacterium-mediated method, resulting in a total of seven T-DNA insertion sites, all of which were single-copy integrations ( Table 2). The integrated form of a single copy or limited number of copies may be more conducive to the expression of exogenous genes [29]. Previous studies have shown that T-DNA tends to integrate into AT-rich regions and intergenic regions [30]. According to the analysis of 1000 bp sequences near the seven loci in this study, the AT content of four loci was higher than the average of the chromosomes.
In previous studies, the analysis of exogenous gene integration sites and flanking sequences has been mostly based on PCR and other similar methods. T-DNA flanking sequences have been successfully identified in a variety of crops [31]. However, these methods are complex, time consuming, possess poor specificity, and are unable to discover all sites of insertion and flanking sequences of multilocus insertion. The NGS technology can overcome the limitations of PCR-based methods and provide rapid and comprehensive molecular characterization data. The quality of sequencing library construction and sequencing depth are two pivotal factors that affect NGS sequencing results [32]. In this study, utilizing NGS re-sequencing technology, 129.72 G of sequencing data was obtained, and six insertion sites of transgenic PEN trees were identified. Four of these were subsequently verified by PCR and Sanger sequencing ( Table 2).
The integration of T-DNA into the recipient genome often leads to duplication, deletion, and inversion of T-DNA or the recipient genome DNA [33]. In this study, the flanking sequence of the genome near the T-DNA integration site was also partially deleted compared with the reference genome (Supplementary Table S3). The reason may be the loss of fragments, or there may be differences between the genome of PEN and the reference genome of P. trichocarpa. Some insertion sites in this study could not be verified by PCR amplification, which may be caused by false positives in the comparison of NGS data [34,35]. On the other hand, the left and right boundary sequences of the vector used in this study are short. If PCR primers are located in the boundary sequence, the product may not be obtained due to partial loss of the boundary sequence during integration.

T-DNA Integration and Gene Expression
Gene stacking biotechnology is a way to combine multiple traits in plants. Nowadays, crops with gene stacking have been planted in many countries; however, the combination of multiple genes may also lead to mutual interference between genes, causing one gene to inhibit the other gene. According to previous research, a comparison and analysis of the expression of exogenous genes in four double Bt gene vectors, showed that the expression level of the Bt gene was low when it was located upstream of T-DNA [36]. With the development of transgenic technology, it is necessary to note whether gene stacking transgenic plants are safer than single gene transgenic plants. Some studies suggested that breeding by stacking two transgene events does not introduce greater variation than conventional breeding processes [37,38]. Proteomic profiling analysis found no newly expressed proteins in transgenic maize (12-5 × IE034) with insect-resistant and glyphosatetolerant genes. The differences in protein expression between transgenic plants and their parents were also within the range of variation of conventional maize varieties. This suggests that stacked-trait development via conventional breeding did not have an impact on the genetic stability of T-DNA [39]. Transgenic poplar trees with a single Bt gene have been shown to either improve yields or affect DBH development [22,40].
In this study, the results indicated that insect-resistant and salt-tolerant gene stacking further inhibited the growth of poplar trees. The growth rate of DBH and tree height were significantly lower in transgenic lines than in non-transgenic lines. This may be caused by both exogenous promoter overexpression and gene stacking. A study has shown that, when the transgenic line uses a constitutive promoter (CaMV 35S), the transgenic rice plants exhibit a dwarfing phenotype under normal growth conditions [41]. The CaMV 35S promoter can drive high gene expression even when plants do not require it, which may disrupt normal gene regulation, metabolic pathways, and affect overall growth and development. Gene stacking also intensifies the accumulation of foreign genes in the receptor, and the interaction between exogenous genes may bring "byproducts" that affect growth and development.
The growth and development of transgenic poplar are closely related to gene transcription regulation. RNA-seq technology can analyze the molecular mechanism of inhibiting the growth of DBH and tree height of transgenic PEN to some extent [42]. In this study, 575 and 678 DEGs, including bHLH, WRKY, ERF, MYB, NAC, and other TFs, were screened in the A1 and B1 transgenic lines by RNA-seq ( Figure 2). There is a positive correlation that exists between lignin content and DBH development, and transcriptional regulation is the main mechanism of gene expression related to lignin biosynthesis [43]. The NAC and bHLH TF families can regulate lignin synthesis, while the MYB family plays a key role in lignin synthesis [44,45]. The WRKY TF family plays an indispensable role in the formation and development of plant roots, stems, and leaves, and some WRKY TFs can regulate diameter elongation [46,47]. Among the pathways of DEG enrichment, the A1 and B1 lines were co-enriched in important pathways related to plant growth and development, such as plant hormone transduction, phenylpropanoid biosynthesis, and zeatin biosynthesis. Some auxin early response genes (SAUR, IAA4, GH3) and CYCD3 of the A1 or B1 line were significantly up-regulated, while cytokinin dehydrogenase (CKX), jasmonic acid response inhibitor gene (JAZ), and key genes in the lignin pathway (4CL, HCT) and enzymes were significantly down-regulated ( Figure 6A,B).
Most of the early auxin response genes belong to the family of Aux/IAAs, GH3s, and SAURs [48]. The GH3 gene is associated with plant growth and development [49]. In A. thaliana, the gene product of GH3 is involved in auxin signal transduction and inhibits shoot and hypocotyl cell elongation and lateral root cell differentiation [50]. Overexpression of CYCD3 in A. thaliana leads to a radical alteration in leaf architecture and inhibiting differentiation [51]. CKX has been shown to be involved in the regulation of cytokinin homeostasis and plays a key regulatory role in plant growth and development [52]. Both 4CL and HCT are key enzymes in the phenylpropane metabolism pathway, and inhibiting 4CL activity in tobacco leads to a decrease in lignin content [53]. In summary, changes in the transcriptional expression of some TFs and genes related to plant growth and lignin synthesis may be the main reason for the growth inhibition of transgenic PEN in this study.

Conclusions
In this study, successive years of experimental tests were conducted at different sites to investigate the unintended effects of exogenous T-DNA integration in poplar. The obtained results provide theoretical support for the application of transgenic trees in the field. Based on the data, T-DNA integration events carrying multiple genes led to the inhibition of tree height and DBH in PEN. NGS and transcriptome sequencing analysis revealed that the growth inhibition of transgenic PEN may be attributed to the influence of exogenous genes on the expression of genes related to plant hormone transduction and lignin synthesis, rather than the damage done by T-DNA to the receptor genome and its influence on adjacent genes. In conclusion, all the experimental data in this study were obtained from plants in the field, and it was found that there were unintended effects in multigene cotransformation. In the study of transgenic plants, especially to investigate the unintended effects, field testing is necessary, which can provide results that may not be obtained in laboratory conditions. Additionally, the results of this study remind us to pay special attention to the growth inhibition caused by multigene cotransformation and to address this issue in the research of woody plant transgenic for wood or biomass harvesting.

Plant Materials and Study Sites
Three transgenic lines of PEN carrying the Cry1Ac-Cry3A-BADH genes (A1-A3) and three transgenic lines of PEN carrying the Cry1Ac-Cry3A-NTHK1 genes (B1-B3) were obtained in our laboratory by Agrobacterium-mediated genetic transformation [18,54]. These Three transgenic lines of PEN carrying the Cry1Ac-Cry3A-BADH genes (A1-A3) and three transgenic lines of PEN carrying the Cry1Ac-Cry3A-NTHK1 genes (B1-B3) were obtained in our laboratory by Agrobacterium-mediated genetic transformation [18,54]. These lines have been approved by the State Forestry and Grassland Bureau for environmental release. Non-transgenic PEN was used as the control. On April 2018, a random block design with three replicates was employed to construct field trial forest in MC District, Baoding City, Hebei Province (E115°19′ and N38°57′), LN, Tangshan City, Hebei Province (E118°41′ and N39°31′), and YS, Cangzhou City, Hebei Province (E117°14′ and N38°3′) ( Figure 8). Each plot contained nine clones, with a plant spacing of 2 m and row spacing of 4 m. Natural conditions and management practices were consistent across the test sites.

Growth Characteristics
Between 2018 and 2021, the plant height and DBH of six transgenic and control lines at three locations were measured in November. Based on the high consistency of data collected from different locations in 2018 and the impact of COVID-19 in 2020, we collected growth data for four consecutive years (2018-2021) as typical representative in MC. We also collected data from YS in 2018, 2019, and 2021, as well as data from LN in 2018 and 2021. Measurements were taken from 27 trees per line in three plots. The diameter of the trunk at 1.3 m from the ground was measured to determine the DBH. A Bruce altimeter was used to measure the plant height.

NGS Resequencing
The young leaves of six transgenic lines were collected for NGS sequencing. The DNA of the samples was broken into fragments of about 300 bp by ultrasound, which fragments were purified, and the ends were repaired. Subsequently, the 3' end was added to A and the sequencing adaptor was connected. The appropriate size of fragments was chosen through agarose gel electrophoresis. Then, the fragments were amplified by PCR to construct sequencing libraries. After quality control, the libraries were sequencing by the Illumina HiSeq (San Diego, CA, USA) platform at the Beijing BIOMICS Co., Ltd. Quality control was performed on the obtained raw data to remove reads containing adaptor and ones of low quality. Clean reads were mapped to the genome of P. trichocarpa and vector sequences through the BWA algorithm of the AIM-HII software [55], in order to search for exogenous insertions. The junction reads that were aligned to both the genome and T-DNA sequences were selected, and the position and direction of insertion were determined according to the alignment information of junction reads.

T-DNA Integration Verification and Flanking Sequence Analysis
The PCR primers were designed based on the genomic sequences within 600 bp upstream and downstream of the T-DNA insertion sites (Supplementary Table S2). T-DNA sequence information, which was obtained through NGS resequencing, was also taken into account. The DNA of six transgenic lines was amplified by PCR to verify the results of NGS resequencing. The total PCR amplification volume was 40 µL including: 20 µL 2 × M5 Hiper plus Taq HiFi PCR mix (with blue dye), 2 µL DNA, 1 µL each of forward and reverse primers, and 16 µL deionized water. The amplification procedure was: predenaturation at 95 • C for 5 min, followed by denaturation at 94 • C for 50 s, annealing at 55-58 • C for 30 s, and extension at 72 • C for 1 min for 30 cycles, and after cycles were completed, hold for 5 min at 72 • C. The PCR products were sequenced by first-generation Sanger at BGI Genomics (Shenzhen, China). The sequences were aligned to the genome and vector sequence to identify the integrity of the T-DNA integration sequence and flanking sequence of the insertion sites. The adjacent genes within 20 kb upstream and downstream of each insertion site were identified and their functions analyzed according to the genome annotation. The AT content within 1000 bp upstream and downstream of the insertion site was also analyzed statistically.

RNA-seq and Data Quality Control
In May 2019, the third fully expanded leaves from top to bottom from the same part of A1, B1, and control lines were collected in the MC experimental forest for transcriptome sequencing, with three biological replicates per line. The leaves were immediately frozen in liquid nitrogen after collection. Total RNA extraction was performed in the laboratory using the plant total RNA extraction kit (Takara, Dalian, China), following the manufacturer's instructions. RNA quality testing and library construction, quality control, and sequencing were conducted at Gene Denovo Biotechnology Co., Ltd. (Guangzhou, China) using the Illumina HiSeq 2500 platform. Clean reads were obtained after quality control and lowquality data filtering. Reads containing adapters, an N ratio greater than 10%, and all A-bases were removed. The clean data were then used for sequence comparison through Hisat2 v2.0.1.

DEGs Identification and Functional Analysis
Gene expression levels were measured based on fragments per kilobase of transcript per million base pairs sequenced (FPKM), and DEGs were identified using the DESeq 2 R package (1.10.1) [56]. The thresholds for DEGs were set to fold change |log2FC| ≥ 1 and significance testing (p < 0.05). GO analysis and the KEGG database were used to examine the functional enrichment of DEGs at a significance level of p < 0.05. K-means hierarchical clustering was used to analyze gene expression trends.

qRT-PCR
Eight genes were performed using qRT-PCR and Actin as an internal reference to verify the RNA-seq results. First-strand complementary cDNA synthesis for gene quantification was carried out using the First-Strand cDNA synthesis Kit (TranGen, Beijing, China). Gene expression was quantified by SYBR green qPCR Master Mix (Tiangen, Beijing, China) according to the manufacturer's instructions. Three biological and three technical replicates were performed for each transgenic line, and the relative expression levels were calculated using the 2 -∆∆Ct method. The primer sequences are listed in the Supplementary Table S2.

Statistical Analysis Methods
Microsoft Excel 2016 was used to create tables. IBM SPSS statistics v26.0 software was used to conduct one-way analysis of variance (ANOVA) and Duncan's multiple range test (p < 0.05). The DPS v19.05 software was used to test the significance of the interaction effect between transgenic lines and the environment (p < 0.05). Graphpad Prism v9.3, as well as R v4.3.1 software, was used for data visualization and analysis.

Availability of Data and Material
The datasets generated and/or analyzed during the current study have been deposited in the NCBI Sequence Read Archive repository under accession numbers PRJNA995219 and PRJNA995045 (https://www.ncbi.nlm.nih.gov/bioproject, accessed on 15 August 2023). Any reasonable requests should be directed to the corresponding author.