Development of High-Yielding Upland Cotton Genotypes with Reduced Regrowth after Defoliation Using a Combination of Molecular and Conventional Approaches

Cotton is an economically important crop. However, the yield gain in cotton has stagnated over the years, probably due to its narrow genetic base. The introgression of beneficial variations through conventional and molecular approaches has helped broaden its genetic base to some extent. The growth habit of cotton is one of the crucial factors that determine crop maturation time, yield, and management. This study used 44 diverse upland cotton genotypes to develop high-yielding cotton germplasm with reduced regrowth after defoliation and early maturity by altering its growth habit from perennial to somewhat annual. We selected eight top-scoring genotypes based on the gene expression analysis of five floral induction and meristem identity genes (FT, SOC1, LFY, FUL, and AP1) and used them to make a total of 587 genetic crosses in 30 different combinations of these genotypes. High-performance progeny lines were selected based on the phenotypic data on plant height, flower and boll numbers per plant, boll opening date, floral clustering, and regrowth after defoliation as surrogates of annual growth habit, collected over four years (2019 to 2022). Of the selected lines, 8×5-B3, 8×5-B4, 9×5-C1, 8×9-E2, 8×9-E3, and 39×5-H1 showed early maturity, and 20×37-K1, 20×37-K2, and 20×37-D1 showed clustered flowering, reduced regrowth, high quality of fiber, and high lint yield. In 2022, 15 advanced lines (F8/F7) from seven cross combinations were selected and sent for an increase to a Costa Rica winter nursery to be used in advanced testing and for release as germplasm lines. In addition to these breeding lines, we developed molecular resources to breed for reduced regrowth after defoliation and improved yield by converting eight expression-trait-associated SNP markers we identified earlier into a user-friendly allele-specific PCR-based assay and tested them on eight parental genotypes and an F2 population.


Introduction
Cotton (Gossypium spp.) is one of the major sources of natural fiber and vegetable oil globally [1,2]. Despite advances in plant breeding and management practices, the cotton yield gain has stagnated. One reason for the stagnated yield is the narrow genetic base [1,3,4]. As a result of this narrow genetic base, breeding progress has slowed, which could represent an impediment to enhancing pest and disease resistance, improving fiber quality, and sustaining high yields in cotton cultivars [1,5–9]. One of the important factors in the success of the cotton industry is the development of high-yielding cultivars exhibiting enhanced pest and disease resistance, early maturity, high quality of fiber, and low management costs [1,10].
Genetic diversity is an important factor that may contribute to early crop maturity, fiber yield, and quality. The success of a breeding program is highly dependent on the diversity of the gene pool [11]. The intensive use of a few genotypes in a breeding program can lead to a narrow genetic base [12]. Earlier studies conducted on 260 commercial upland cotton cultivars, released between 1970 and 1990, to determine the coefficient of parentage and pedigrees showed a narrow genetic base [3,6,13,14]. Further investigations using molecular markers also confirmed the low genetic diversity in cultivated cotton germplasm [12,15–20].
The introgression of beneficial variations into the existing cotton germplasm using conventional and molecular plant-breeding approaches helps widen the cotton genetic base. For example, the interspecific hybridization between Gossypium barbadense, with fine-quality long staple fibers but low yield, and Gossypium hirsutum, with high yield and low fiber quality, led to the transmission of the desired fiber quality traits from G. barbadense to G. hirsutum [21,22]. On the other hand, G. hirsutum contributed favorable alleles for other fiber-related characteristics such as fiber length, strength, and micronaire [21,23]. These studies suggested that the allelic combinations of different genes contributing to desirable traits could be achieved via interspecific hybridization followed by selection [24]. Additionally, Soomro et al. [25] reported that intraspecific hybrids in G. hirsutum and intraspecific hybrids in G. barbadense showed 33.7% and 28.3% heterosis, respectively.
The appearance of genotypes whose traits fall beyond the phenotypic range of the parental genotypes, known as transgressive segregation, is a common phenomenon affecting different quantitative traits in interspecific and intraspecific hybrids. Interspecific cotton populations that segregate for different phenotypic traits, such as plant height (short or tall), flowering time (early or late), boll size (small or large), pre/post-harvest regrowth (less or more), maturity (early or late), and fiber fineness (coarse to fine), were developed in the past [26]. These extreme phenotypes are due to the dominant and recessive alleles inherited from the parental genotypes, whose different combinations in the filial generations give rise to segregants with extreme phenotypes. However, it is not easy to understand the genetic basis of these segregants completely. Introgression breeding is a long-term process. Nevertheless, using introgression breeding, cotton breeders and geneticists have over the past century developed different lines showing many desirable traits, such as Acala-type fiber quality and Delta-type fiber yield with resistance to Fusarium wilt [27]. One major challenge of using traditional breeding techniques is the unintentional transfer of undesirable genes to the next generation, which makes the pyramiding of desirable genes lengthy.
Gene pyramiding is defined as stacking desirable genes into a single genetic background using conventional and molecular breeding techniques. Several approaches have been used for gene stacking. One such approach is identifying trait-associated DNA markers and their use in marker-assisted selection. An example of this approach is the identification of quantitative trait loci (QTL) for growth habit in cotton [28]. Gene pyramiding is one of the most popular approaches to crop improvement by stacking genes for resistance to different biotic and abiotic stresses [28–34].
Plant architecture is an important factor that determines growth habit, maturity, crop management, and productivity. Meristems determine the plant architecture, which can be determinate (consumed during the flower production) or indeterminate (supporting reiterative vegetative growth). The meristem identity is determined by members of the PEBP (phosphatidylethanolamine-binding protein) gene family, such as CETS (CENTRORADIALIS/TERMINAL FLOWER 1/SELF PRUNING) and FLOWERING LOCUS T (FT). One of the members of this gene family plays an important role in promoting the determinate growth habit in plants by serving as a key component of the florigen activation complex (FAC), which activates the expression of downstream meristem identity genes SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1 (SOC1), LEAFY (LFY), APETALA 1 (AP1), and FRUITFUL (FUL). On the other hand, reduced expression of the cotton FT, SOC1, and FUL genes supports vegetative growth, delayed flowering, and bushy architecture in cotton [35]. The development of cotton cultivars with somewhat annual growth habit having fewer vegetative branches and reduced regrowth after defoliation, which are desirable traits in cotton, has led to compact plants with reduced vegetative and more reproductive growth with improved yield under optimal growth conditions, which facilitate mechanical picking [35–37].
Early maturity is one of the key breeding objectives in cotton breeding programs. In cotton, early maturity is a complex trait that includes different indicators such as growth habit, first fruiting branch node (FFBN), the height of the first fruiting branch (HFFBN), flowering time, bud period, boll opening date, and boll maturity. Different phenotypic traits have been considered to evaluate early maturity in cotton, but FFBN was the most reliable indicator of early maturity [38,39]. The value of the FFBN has a direct relationship with plant height and earliness of the onset of squaring, flowering, and boll opening [40]. Likewise, the timing of FFBN appearance is a key indicator of early maturity in cotton [41], as early-maturing cotton shows lower FFBN and HFFBN values [42]. The indicators of early maturity are typically quantitative traits determined by QTL and the environment [43,44]. Simultaneous selection for these agriculturally important traits using a conventional plant breeding approach is challenging. In the recent past, the rapid adoption of molecular markers, next-generation sequencing (NGS), and QTL mapping has helped researchers to investigate the genetic architecture of these quantitative traits and utilize this knowledge to improve crop plants.
In this study, we used an upland cotton mini-core collection of 44 genotypes to develop cotton genotypes with reduced regrowth after defoliation and improved fiber yield by altering the growth habit of plants from perennial to somewhat annual. With these alterations, we expect changes in the plant architecture, height, flowering time, boll number, internode length, and pre/post-harvest regrowth. To achieve this objective, we collected tissues at three developmental stages, stage 1 (S1; 10 days after sowing, DAS), stage 2 (S2; 30 DAS), and stage 3 (S3; 45 DAS), from cotyledonary leaves, immature squares, and mature squares, respectively, for three consecutive years from 2017 to 2019 to study the expression patterns of five floral induction and meristem identity genes, FT, SOC1, LFY, AP1, and FUL, and to identify high-expression alleles of these genes [45]. We hypothesized that the high-expression alleles of these signal integrators and meristem identity genes would result in plants with annual growth habit exhibiting reduced to no regrowth after defoliation and enhanced yield, as the assimilates would be channeled towards fiber yield over storage in buds for regrowth. To identify high-expression alleles of selected genes, we developed an arbitrary expression matrix, where any genotype showing an expression level higher than the population mean at a developmental stage in a study year was given a point. The genotypes were sorted from high to low expression levels [45] to make genetic crosses with the aim of stacking the high-expression alleles of floral induction and meristem identity genes in a single genetic background. The specific objectives of the study were to (i) stack the complementing high-expression alleles of five floral induction and meristem identity genes by genetic crossing of selected genotypes; (ii) evaluate selected lines for surrogate traits for annual growth habit; and (iii) develop molecular markers for reliable screening of genotypes with reduced regrowth after defoliation.

Plant Material
At the Pee Dee Research and Education Center (PDREC), we had access to 44 of the 53 upland cotton mini-core collection genotypes [46]. For the remaining nine genotypes, insufficient seed was available for propagation. Hence, in this study, 44 upland cotton genotypes were included. These genotypes were cultivated at PDREC in the same field (34°18′39″ N, 79°44′40″ W) consecutively for three years, from 2017 to 2019. This mini-core collection represents over 92% of the diversity of a larger collection of the upland cotton genotypes released in the USA in the past 100 years [45]. A list of the 44 genotypes used in this study is presented in Table S1.

Crop Husbandry
As mentioned earlier, the upland cotton genotypes of the mini-core collection were cultivated consecutively for three years from 2017-2019 in PDREC field # 7 (34°18′39″ N, 79°44′40″ W). Genetic crosses between selected lines were made in 2018 and 2019, and in subsequent years, the progeny of crosses made from selected cotton genotypes with high-expression alleles of five floral induction and meristem identity genes were evaluated in field # 23 in 2020 (34° [...] 44′35″ W). All fields are located at the Clemson University PDREC and were managed similarly over the years. In the years 2020-2022, the plants were at the F2/F3 to F6/F7 stage. Every year, the seeds of selected lines were advanced at the Cotton Winter Nursery (CWN), Liberia, Costa Rica.
Before sowing at PDREC or dispatching seed for CWN, the cotton seeds were ginned and delinted in the USDA-ARS delinting facility at PDREC, and the delinted seeds were treated with fungicides (composition: 10% Allegiance metalaxyl, Bayer Crop Sciences, Research Triangle, NC, USA, 3.3% Trilex trifloxystrobin, Bayer Crop Sciences, USA, 0.66% Vortex ipconazole, Bayer Crop Sciences, USA, and 3.3% EverGol penflufen, Bayer Crop Sciences, USA). The seeds were sown in 40-foot two-row plots with a 1-foot plant-to-plant distance. On average, 200 seeds per genotype were mechanically sown (100 seeds/row). The seeds were planted 2-3 cm deep, the fields were periodically sprayed with insecticides and herbicides, and defoliants were applied before physiological maturity to facilitate harvesting. Generally, the defoliants were applied onto the plants in mid to late September each season. The active ingredients of the defoliants included tribufos (a cell wall disrupter), diuron (a photosynthesis inhibitor), thidiazuron (an auxin inhibitor), and ethephon (a plant growth regulator), procured from Bayer Crop Sciences, USA, and applied at the rate of 0.26, 0.84, 0.42, and 2.24 kg/ha, respectively. The harvesting was performed by manually picking bolls to ensure seed purity for generation advancement and mechanically to obtain the plot yield.

Genetic Crossing and Advancement of Generation
Eight upland cotton genotypes showing high expression levels of the five floral induction and meristem identity genes (FT, SOC1, LFY, AP1, and FUL) were selected for making genetic crosses. These genotypes were selected from a screen of the 44 upland cotton mini-core collection lines for the high-expression alleles of the selected cotton genes at three developmental stages from 2017-2019 using qRT-PCR (for details, see ref. [45]). Collectively, we recorded data for 45 gene expression traits, i.e., five genes and three developmental stages, for three consecutive years (5 × 3 × 3). The gene expression data for these genes were normalized to the cotton housekeeping gene ACT4-2. An arbitrary point system was developed to facilitate the selection process. In this system, each individual genotype was given a point for an expression trait when its expression of a gene at a developmental stage exceeded the population mean of that gene's expression in a given year. In this way, an individual genotype could receive a maximum of 45 points (5 genes × 3 developmental stages × 3 years) [45]. The genetic crosses between selected cotton genotypes with complementary expression patterns of the selected genes were made reciprocally in the field in the 2018 and 2019 growing seasons following the commonly used procedure.
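The point system described above can be illustrated with a minimal sketch. The function name, the data layout (a mapping from a gene/stage/year key to per-genotype normalized expression values), and the toy values are hypothetical and are not taken from ref. [45]; only the rule itself (one point per expression trait in which a genotype exceeds the population mean) follows the text.

```python
from statistics import mean

def expression_scores(expr):
    """Score genotypes with the point rule described in the text.

    expr maps a trait key (gene, stage, year) to a dict of
    {genotype: normalized expression}. A genotype earns one point for
    every trait in which its value exceeds the population mean.
    """
    scores = {}
    for values in expr.values():
        pop_mean = mean(values.values())
        for genotype, level in values.items():
            scores.setdefault(genotype, 0)
            if level > pop_mean:
                scores[genotype] += 1
    return scores

# Hypothetical toy data: two expression traits, three genotypes.
toy = {
    ("FT", "S1", 2017): {"G1": 2.0, "G2": 1.0, "G3": 0.3},
    ("SOC1", "S1", 2017): {"G1": 1.2, "G2": 1.5, "G3": 0.3},
}
scores = expression_scores(toy)
# Rank genotypes from the highest to the lowest score.
ranked = sorted(scores, key=scores.get, reverse=True)
max_points = 5 * 3 * 3  # genes x developmental stages x years
```

With the full data set, the same function would be applied to all 45 trait keys, and the top-ranked genotypes would be candidates for crossing.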
The bolls from crossed plants were manually harvested, ginned, and delinted, and the generations were advanced utilizing the CWN and the cotton research fields at PDREC. The phenotypic data on various traits such as plant height, flower and boll numbers per plant, boll opening date, floral clustering, and regrowth after defoliation were recorded in alternate generations, and selections were made. We used the single-seed descent method for the advancement of generations. A timeline of the generation advancement is given in Figure S1.
In 2022, fifteen advanced breeding lines (F6 and F5 generations) were cultivated in a triplicated randomized complete block design, and the phenotypic data collected from the trial were analyzed using Microsoft Excel and SAS packages.
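For readers unfamiliar with the design, the following is a minimal sketch of how a randomized complete block layout can be generated: every entry appears exactly once in each block, in an independently shuffled order. The entry labels, the function name, and the seed are hypothetical; the actual field randomization used in the trial is not described in the text.

```python
import random

def rcbd_layout(entries, n_blocks, seed=None):
    """Return one shuffled plot order per block (complete blocks)."""
    rng = random.Random(seed)
    layout = []
    for block in range(1, n_blocks + 1):
        order = list(entries)
        rng.shuffle(order)  # independent randomization within each block
        layout.append(
            [(block, plot, entry) for plot, entry in enumerate(order, start=1)]
        )
    return layout

# Hypothetical labels for fifteen advanced lines in three blocks.
entries = [f"line_{i}" for i in range(1, 16)]
layout = rcbd_layout(entries, n_blocks=3, seed=1)
n_plots = sum(len(block) for block in layout)
```

Check varieties could be appended to `entries` in the same way, which would increase the number of plots per block accordingly.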

Conversion of SNP Markers to User-Friendly PCR-Based Assay
The sequences of the eight selected markers showing associations with different expression traits [45] were retrieved from the CottonGen database (https://www.cottongen.org, accessed on 11 March 2021) to design allele-specific PCR-based assays. The full-length gene sequences provided sufficient flanking sequences to develop the allele-specific primer pairs. In these primer pairs, the 3′-end of one primer (forward or reverse) was placed on the SNP, with one version for each of the two alternative alleles. Additionally, to improve primer specificity, we introduced a non-template-specific nucleotide change at the n-2 position (see Table S2). The primers were synthesized, their optimum annealing temperatures were determined using gradient PCR (Table S2), and they were validated on the eight genotypes selected for genetic crossing and a GSA 74 (17) × TAMCOT SP-23 (39) F2 population. To test the specificity of the allele-specific primers, DNA was isolated from a month-old tender true leaf using the Mag-Bind® Plant DNA DS Kit (Omega Bio-Tek Inc., Norcross, GA, USA) following the manufacturer's instructions.
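The primer design logic can be sketched as follows. This is an illustration under stated assumptions, not the procedure used to produce the primers in Table S2: the template sequence is made up, the primer length of 20 bases is arbitrary, "n-2" is interpreted as the base two positions upstream of the 3′-terminal SNP base, and the substitution rule for the deliberate mismatch is a hypothetical choice.

```python
# Assumed substitution rule for the deliberate mismatch (hypothetical).
MISMATCH = {"A": "C", "C": "A", "G": "T", "T": "G"}

def allele_specific_primers(template, snp_index, alleles, length=20):
    """Build one forward primer per allele with the SNP at the 3' end.

    template  : sequence containing the SNP (5'->3', forward strand)
    snp_index : 0-based position of the SNP in template
    alleles   : the two alternative SNP bases
    """
    start = snp_index - length + 1
    if start < 0:
        raise ValueError("not enough 5' flanking sequence for this length")
    upstream = template[start:snp_index].upper()
    primers = {}
    for allele in alleles:
        bases = list(upstream + allele.upper())
        # Destabilizing change two bases upstream of the 3' end (n-2).
        bases[-3] = MISMATCH[bases[-3]]
        primers[allele] = "".join(bases)
    return primers

template = "ATGGCTTCAGGATCCAAGCTTGACCTGAAGT"  # made-up sequence
primers = allele_specific_primers(template, snp_index=24, alleles=("C", "T"))
```

Each allele-specific primer would be paired with a common primer on the opposite strand, and amplification with only one of the two primers would indicate a homozygous genotype at that SNP.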
Genes 2023, 14, x FOR PEER REVIEW
Eight cotton genotypes scored the most data points in the gene expression study of five floral induction and meristem identity (FT, LFY, SOC1, AP1, and FUL) genes at three developmental stages (cotyledonary leaf, first square, and subsequent square) studied for three consecutive years (2017, 2018, and 2019) (for further details, see ref. [45] and Section 2).

F2 and F3 Generations
In the 2019 growing season, the genetic material was sown in two phases. In phase 1 (21 May 2019), 44 upland cotton genotypes of the mini-core collection were sown, and in phase 2 (23 May 2019), F1 and F2 populations were sown (Tables S3 and S4). Seeds were received in time for propagation in 2019 for five of twenty genetic crosses that were prioritized for the advancement of the generation in CWN. These crosses were prioritized due to the high expression scores carried by the crossed genotypes in our expression analysis. To sum up, a total of five F2 populations, 15 F1 populations (not sent to Costa Rica), and 44 upland cotton genotypes of the mini-core collection were planted in field # 7 (34° [...] S4). Similarly, 37 to 95 F1s from 15 cross combinations germinated (Table S5). Bolls were manually harvested from F2 plants of each population, and seeds from ten selected plants per cross combination (5 crosses per combination) were sent to CWN for generation advancement (Table S6).
Phenotypic data were collected on surrogate traits for annual/determinate growth habit, such as plant height, flower number per plant, and first boll opening date. The individuals of the F2 population showed transgressive segregation for these phenotypic traits. Specifically, four of the five F2 populations showed transgressive segregation for flower number, as F2 lines produced more flowers than the parental genotypes 53 to 60 days after sowing. Similar transgressive segregation was observed for the number of open bolls, recorded 92 to 101 days after sowing (Figure 2). The boll opening date faithfully reflects flowering time and hence was recorded in this study to find early-flowering genotypes. Likewise, plant height is positively correlated with boll number and negatively correlated with earliness in cotton, somewhat reflective of annual/determinate growth habit; hence, it was recorded for each population. In general, in all five populations, the plants showed more compact stature than the parental genotypes. This is in line with our hypothesis that the strong expression alleles of the five floral induction and meristem identity genes stacked together will promote an annual/determinate growth habit with a more compact plant form.
[Figure caption, partially recovered: ... S1. DAS = days after sowing.]
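As an illustration of how transgressive segregants can be flagged from such data, the sketch below compares each progeny value with the range spanned by the two parents. The function name and the toy flower counts are hypothetical and do not come from the study's data.

```python
def transgressive_segregants(progeny, parent_values):
    """Split progeny into those below and above the parental range.

    progeny       : dict of {plant id: trait value}
    parent_values : trait values of the parental genotypes
    """
    low, high = min(parent_values), max(parent_values)
    below = {pid: v for pid, v in progeny.items() if v < low}
    above = {pid: v for pid, v in progeny.items() if v > high}
    return below, above

# Hypothetical flower counts per plant at a fixed number of days after sowing.
parents = (4, 6)
f2 = {"p1": 3, "p2": 5, "p3": 9, "p4": 6}
below, above = transgressive_segregants(f2, parents)
```

In practice, parental values would themselves be means with sampling error, so a stricter threshold (for example, beyond the parental mean plus or minus a chosen margin) may be preferable to the raw range used here.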

Furthermore, we isolated the DNA from one of the F2 populations, GSA 74 (17) × TAMCOT SP-23 (39), to test whether any of the expression-trait-associated markers identified in our previous work showed an association with the surrogate phenotypic traits recorded on the F2 population [45]. For this purpose, we converted SNPs to user-friendly PCR-based assays. We genotyped 91 F2 lines of the GSA 74 (17) × TAMCOT SP-23 (39) population with four SNP markers, i09222Gh, i00443Gh, i13158Gh, and i13851Gh (Table S7). Unfortunately, a variable number of plants produced no data, resulting in missing data points.

F3 and F4 Generations
Thirty different populations were developed by crossing eight selected upland cotton genotypes in various combinations (Table S3). Out of these populations, four F3 populations and ten F2 populations were sown in the field at PDREC in the 2020 growing season. The wet and cold weather conditions and the field's location in a low-lying area (leading to waterlogging in some plots) resulted in poor plant turnout in some populations. Specifically, three F2 populations, which were crosses between ARKOT-8102 (5) × CABD3CABCH-1-89 (8), TAMCOT SP-23 (39) × CABD3CABCH-1-89 (8), and ARKOT-8102 (5) × TAMCOT SP-23 (39), either showed no germination or poor survival after germination. The F3 population derived from the reciprocal genetic cross between CABD3CABCH-1-89 (8) × TAMCOT SP-23 (39) also exhibited poor germination. However, due to the confounding effect of the environmental conditions, it is difficult to conclude whether any of the observed effects on germination were genetic.
The parental genotypes and the surviving individuals of F2/F3 populations were selfed under field conditions. Phenotypic data on the plant height, total boll number per plant, first boll opening date, and regrowth after defoliant application were recorded. Simple linear regression was performed between plant height and total boll number and between plant height and percentage of open bolls. The analysis showed a positive correlation between plant height and total boll number (in almost all populations) and a negative or no correlation between plant height and percentage of open bolls (Figure S3). Interestingly, the correlation between plant height and total boll number was much more robust in populations involving ARKOT-8102 (5) and/or TAMCOT SP-23 (39), hinting towards the genetic nature of this correlation. The plants selected for propagation in Costa Rica are marked on the regression/correlation plots to make it easy to understand the bases of plant selection. The criteria for plant selection are further elaborated in Tables S8 and S9.
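The regression and correlation described above can be sketched with an ordinary least-squares fit. The sketch assumes boll number is regressed on plant height (the text does not state which variable was treated as the response), and the toy measurements are hypothetical.

```python
from statistics import mean

def simple_linear_regression(x, y):
    """Least-squares fit of y on x; returns slope, intercept, Pearson r."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r

# Hypothetical data: plant height (inches) and total bolls per plant.
height = [30, 34, 38, 42]
bolls = [10, 12, 14, 16]
slope, intercept, r = simple_linear_regression(height, bolls)
```

A positive `r` corresponds to the positive height and boll number relationship reported for most populations, while a value near zero or below zero would match the pattern seen for the percentage of open bolls.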
Phenotypic data on plant height, flower number (including candles and white, pink, and brown flowers) per plant, total bolls per plant, floral clustering, and regrowth after defoliation were collected from all field-grown plants. Out of 50 genotypes of different advanced generations (F3 and F4 seeds from Costa Rica and F2 and F3 seeds from PDREC), a variable number of plants established in the field for 39 genotypes, whereas none of the plants germinated or survived for 11 genotypes (Table S10). The data on regrowth after defoliation were recorded for all surviving plants of 39 genotypes on 27 October 2021. Individuals of all populations were split into two categories: plants showing regrowth or no regrowth after defoliation. Subsequently, to find any trend for plant height, flower number (total number of candles and flowers of different ages), and boll number in each class, we averaged the trait values and present the range in Figure 3. Contrary to expectation, this analysis suggested no correspondence between regrowth and plant height, flower number, or number of bolls. Out of 39 genotypes analyzed, all studied plants for 5 genotypes (four F4 and one F3 generation) showed regrowth, whereas, for the remaining 34 genotypes, a variable number of plants showed no regrowth, ranging from a single plant to several plants. Interestingly, the F5 line 8 × 5-10-4 showed no regrowth in 41.5% of individuals and also exhibited floral clustering on the fruiting branches (Figure 3). Traits like plant height (PH) reflect the plant's stature and, to some extent, the growth habit (indeterminate/determinate) of its main stem and the lower indeterminate branches (nodes 2-4), whereas traits like boll number (BN) and, to some extent, flower number (FN; specifically as we recorded it, counting different flower developmental stages individually) reflect earliness and determinacy, as the flowers represent determinate growth. Regrowth after defoliation is a trait that reflects the change in growth habit from perennial to annual, as regrowth after defoliation is a perennial trait. In sum, the genotypes in these populations indicated a transition in the plant growth habit and architecture. The seeds from the selected plants were sent to CWN to advance a generation (Tables S11 and S12).
[Figure caption, partially recovered: ... S1. PH = average plant height (in inches); FN = average flower number; and BN = average boll number.]
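The two-class summary described above (regrowth versus no regrowth, with trait averages per class) can be sketched as follows. The record layout, the function name, and the toy plants are hypothetical; the trait abbreviations PH, FN, and BN follow the text.

```python
from statistics import mean

def summarize_by_regrowth(plants):
    """Average PH, FN, and BN per regrowth class and report % no regrowth.

    plants: list of dicts with keys 'regrowth' (bool), 'PH', 'FN', 'BN'.
    """
    groups = {"regrowth": [], "no_regrowth": []}
    for plant in plants:
        key = "regrowth" if plant["regrowth"] else "no_regrowth"
        groups[key].append(plant)
    summary = {}
    for key, members in groups.items():
        summary[key] = {
            trait: (mean(p[trait] for p in members) if members else None)
            for trait in ("PH", "FN", "BN")
        }
    pct_no_regrowth = 100 * len(groups["no_regrowth"]) / len(plants)
    return summary, pct_no_regrowth

# Hypothetical records for four plants of one line.
plants = [
    {"regrowth": True, "PH": 40, "FN": 6, "BN": 20},
    {"regrowth": True, "PH": 44, "FN": 8, "BN": 24},
    {"regrowth": True, "PH": 42, "FN": 7, "BN": 22},
    {"regrowth": False, "PH": 38, "FN": 9, "BN": 25},
]
summary, pct_no_regrowth = summarize_by_regrowth(plants)
```

Comparing the per-class averages across lines is one simple way to check whether regrowth tracks plant height, flower number, or boll number, which the text reports it did not.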

F5 and F6 Generations
The F5 seeds from the ten plants (selected based on the phenotypic data) of two cross combinations, GSA 74 (17 [...] S11 and S12). The F6 and F5 seeds were received on 15 April 2022. A replicated yield trial (with three biological replicates of each line), including a higher-yielding check, DP-493, and a higher-quality-fiber check, FM-958, in a randomized complete block design, was planted on 18 May 2022.
The phenotypic data were collected on plant height (inches), boll number per plant, number of open bolls, floral clustering, regrowth after defoliation, lint yield, and fiber quality traits. The phenotypic data were recorded from the replicated field trial between 23 and 31 August 2022 (Table 1). The regrowth data were collected between 7 and 10 October 2022, about two weeks after the defoliant application on 20 September 2022. The data were collected from thirty flagged plants from three two-row plots per genotype (45 plots for 15 selected genotypes). Later, bolls (excluding any green bolls) were manually picked from the same plants (on 28 and 29 September 2022) for pure seeds. The basic idea was to send the seed to Costa Rica to obtain pure (F8/F9) seeds for replicated trials in 2023 and determine the lint yield and fiber quality. The mechanical harvest (boll picking) of the plots was performed on 28 October 2022 to obtain the yield data.
Table 1. Phenotypic data recorded for selected cotton genotypes sown in triplicated randomized complete block design. DP-493 and FM-958 were used as high-yielding and high-quality-fiber controls, respectively. For genotype names, see Table S1.
The data analysis suggested that, among these selected advanced lines, we have lines appropriate for long growing seasons (such as the Carolinas) and short growing seasons (such as Texas). Genotypes such as CABD3CABCH-1-89 (8) × ARKOT-8102 (5)-B3, CABD3CABCH-1-89 (8) × ARKOT-8102 (5)-B4, CAHUGLBBCS-1-88 (9) × ARKOT-8102 (5)-C1, CABD3CABCH-1-89 (8) × CAHUGLBBCS-1-88 (9)-E2, CABD3CABCH-1-89 (8) × CAHUGLBBCS-1-88 (9)-E3, and TAMCOT SP-23 (39) × ARKOT-8102 (5)-H1 matured early (about 60% open bolls as early as 1 September 2022), dropped leaves without defoliant application, and exhibited compact stature. In contrast, genotypes HOPI MOENCOPI (20) × SPNXCHGLBH-1-94 (37)-K1, HOPI MOENCOPI (20) × SPNXCHGLBH-1-94 (37)-K2, and HOPI MOENCOPI (20) × SPNXCHGLBH-1-94 (37)-D1 exhibited clustered flowering, reduced regrowth after defoliant application, and produced a large number of bolls (Table 1; Figure 4). However, some of these later bolls did not mature in time for harvest during the manual picking on 28 and 29 September 2022, and during the recording of regrowth data between 7 and 10 October 2022, there were still green bolls on the plants (sprayed with defoliant on 20 September 2022). We hypothesized that the change in plant growth habit had reduced the internode length, resulting in floral clustering, and most of the buds turned into flowers, leading to some late-emerging bolls. However, as these plants set many bolls, the number of bolls produced offset the yield lost from bolls that could not be harvested, as most plants of the HOPI MOENCOPI (20) × SPNXCHGLBH-1-94 (37)-K2 family yielded significantly more than the high-yielding control, DP-493 (Table 2). Also, in the case of a long growing season, the producers could delay harvest or plan multiple pickings, which may further enhance the yield. On the other hand, the short-statured early-maturing genotypes are suitable for areas with a shorter growing season, as these mature early and could avoid drought and heat stress occurring later during the growing season. These early results suggest that the various combinations of the high-expression alleles of the floral induction and meristem identity genes (FT, LFY, AP1, SOC1, and FUL) lead to various outcomes, as the selected genotypes that we crossed carried high-expression alleles of different floral induction and meristem identity genes. Samples were collected from alternate generations of selected plants (grown at PDREC) for RNA (three developmental stages: cotyledonary leaves, immature square, and subsequent square) and DNA extractions. The objective of this ongoing effort is to track the expression pattern of the five floral induction and meristem identity genes and alleles of molecular markers associated with expression traits [45,47].

Table 2. Lint yield recorded on a per-plot basis (obtained from cotton picker) and fiber quality data for selected cotton genotypes in each plot (SD values calculated and presented next to each quality parameter). DP-493 and FM-958 were used as high-yielding and high-quality-fiber controls, respectively.
The fiber quality analysis of the selected genotypes showed some genotypes, such as HOPI MOENCOPI (20) × SPNXCHGLBH-1-94 (37)-D1, to carry high-quality fiber relative to the high-quality-fiber variety FM-958 (Table 2). On the other hand, HOPI MOENCOPI (20) × SPNXCHGLBH-1-94 (37)-K2 family plants that exhibited many desirable traits, such as clustered flowering, reduced regrowth after defoliation, and high yield, exhibited desirable micronaire values but less desirable values for fiber length and strength (Table 2). We assumed this was because some of the bolls were not as mature as other bolls at harvest, resulting in the blending of mature and immature fibers, leading to reduced fiber-length and -strength values.

Subsequently, plants were selected for increase in Costa Rica based on the total number of bolls per plant, plant height (close to check varieties), floral clustering, earliness, and no regrowth after defoliation. A total of 59 plants were selected to be ginned, delinted, and treated with fungicide. Additionally, 15 plants were selected and sent for increase to CWN (Table S13). These plants will be reevaluated at PDREC for two years (2023 and 2024) for the above-listed attributes in addition to fiber yields and quality, with an intent to release them as germplasm lines.
Furthermore, we used the phenotypic data (plant height, total number of bolls per plant, and the number of open bolls) recorded for ten randomly selected plants of 15 advanced cotton selections replicated thrice (a total of 450 plants) in the field to study the genetic relations among these families. These populations were derived from seven genetic cross combinations of seven selected upland cotton genotypes, HOPI MOENCOPI, SPNXCHGLBH-1-94, ARKOT-8102, TAMCOT SP-23, CABD3CABCH-1-89, CAHUGLBBCS-1-88, and GSA 74, with high-expression alleles of five floral induction and meristem identity genes [45]. As expected, using the phenotypic data, the related families (siblings) clustered together. However, cross-clustering of genotypes with members of other families was also observed. This cross-clustering indicated breakage of correlations between traits, such as the positive correlation between plant height and total boll number per plant, probably due to recombination and stacking of high-expression alleles of the five floral induction and meristem identity genes in different combinations (Figure 5).

Notice the grouping of genotypes of a family with their siblings, as evident from following the font colors, e.g., clustering of K1, K2, and D1 family members [derived from the HOPI MOENCOPI (20) × SPNXCHGLBH-1-94 (37) cross]. However, the clustering of some individuals with members of other families can also be witnessed, which reflects the breakage of correlations between traits and could be an outcome of recombination. D1 = green "*", E2 = pink "*", E3 = red "*", K1 = blue "*", and K2 = orange "*".
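The idea of a trait correlation being "broken" can be illustrated with a Pearson coefficient between plant height and total boll number, computed per family. The following is only a minimal sketch: the function name and all numbers are invented for illustration and are not taken from the study, which relied on PCA and a neighbor-joining dendrogram rather than this calculation.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant trait")
    return sxy / sqrt(sxx * syy)


# Invented example values (NOT data from the study).
height_cm = [80, 90, 100, 110, 120]
bolls_coupled = [10, 12, 14, 16, 18]     # taller plants carry more bolls
bolls_uncoupled = [15, 11, 17, 12, 15]   # boll number no longer tracks height

r_coupled = pearson_r(height_cm, bolls_coupled)
r_uncoupled = pearson_r(height_cm, bolls_uncoupled)
print(f"coupled family:   r = {r_coupled:.2f}")
print(f"uncoupled family: r = {r_uncoupled:.2f}")
```

A family in which short plants still set many bolls would show a coefficient near zero, which is the pattern that cross-clustering in the PCA plot hints at.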

Conversion of the Expression-Trait-Associated SNP Markers into User-Friendly PCR-Based Assays for Use in Cotton Breeding
The sequences of the eight selected markers showing associations with different expression traits were retrieved from the CottonGen database to design allele-specific PCR-based assays that breeders could conveniently use to transfer and stack these traits in other relevant upland cotton breeding materials. We used the short SNP marker sequences (often about 100 nucleotides in length) to pull out the full-length gene sequences from the database. The full-length gene sequences provide sufficient flanking sequences to develop the allele-specific primer pairs (Table S3). In these primer pairs, the 3′-end of one primer (forward or reverse) was placed on the SNP, with one such primer for each of the two alternative SNP alleles (Figure 6A). Additionally, to improve primer specificity, we followed a strategy tested by Liu et al. [48] and introduced a non-template-specific nucleotide change at the n-2 location from the 3′-end of the oligonucleotide, where n is the SNP. The objective of introducing a non-template-specific (random) change at the n-2 location is to destabilize the allele-specific primer and stop the production of non-specific products from the alternative SNP allele [49]. To test the specificity of the allele-specific primers, they were used with the genomic DNA of the parental genotypes. Primers designed from the 18S rRNA gene were used as a positive control in each experiment. Under optimized PCR conditions, agarose gel electrophoresis allowed the differentiation of the lines from the 17 × 39 F2 population, whether they were homozygous or heterozygous for that particular expression-trait-associated SNP marker [50]. All eight (i02927Gh, i43992Gh, i13158Gh, i09222Gh, i00443Gh, i08185Gh, i13848Gh, i13851Gh) expression-trait-associated SNP markers were tested on eight parental genotypes (Figure 6B), and four (i13851Gh, i09222Gh, i13158Gh, and i00443Gh) of these markers were used on one F2 population (17 × 39). Out of a total of 91 lines from the 17 × 39 F2 population, various numbers of plants did not produce any data, resulting in missing data. For example, SNP marker i09222Gh was tested on 41 lines (45.05%), i13851Gh on 35 lines (38.46%), and i13158Gh and i00443Gh markers on 29 lines (31.87%). This loss of data is likely due to the presence of additional SNPs stacked in the crossed progeny, making primer binding difficult and leading to a failed product; this possibility needs further investigation [51–53].
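The primer design logic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: it handles only a forward-orientation primer, the flanking sequence is invented, and the `MISMATCH` substitution table is an arbitrary choice (the text specifies only that the n-2 change is non-template-specific). The band-scoring rule in `call_genotype` (product with both allele-specific primers = heterozygous) is likewise an assumption about how the gels would be read.

```python
# Hypothetical substitution table for the deliberate n-2 mismatch.
# The source only says the change is non-template-specific, so any
# base other than the template base would do; this mapping is arbitrary.
MISMATCH = {"A": "C", "C": "A", "G": "T", "T": "G"}


def allele_specific_primers(upstream, alleles, length=20):
    """Return {allele: primer} for forward-orientation allele-specific primers.

    upstream: template sequence (5'->3') ending immediately before the SNP.
    alleles:  the two alternative SNP bases, e.g. ("G", "A").
    The SNP base is the 3'-terminal base (position n). The base at n-2 is
    replaced by a mismatch and written in lower case, as in Figure 6.
    """
    upstream = upstream.upper()
    if len(upstream) < length - 1:
        raise ValueError("not enough flanking sequence for the requested length")
    body = upstream[-(length - 1):]      # template bases up to position n-1
    # body[-1] is position n-1 and body[-2] is position n-2.
    mismatch = MISMATCH[body[-2]].lower()
    body = body[:-2] + mismatch + body[-1]
    return {allele.upper(): body + allele.upper() for allele in alleles}


def call_genotype(bands, alleles):
    """Score one line from band presence with each allele-specific primer.

    bands: (bool, bool), product seen with the first / second allele primer.
    Returns e.g. "G/G", "G/A", "A/A", or None when neither reaction worked.
    """
    first, second = bands
    if first and second:
        return f"{alleles[0]}/{alleles[1]}"
    if first:
        return f"{alleles[0]}/{alleles[0]}"
    if second:
        return f"{alleles[1]}/{alleles[1]}"
    return None  # missing data, e.g. failed primer binding


# Invented flanking sequence, for illustration only.
primers = allele_specific_primers("ACGTACGTACGTACGTACGTACGT", ("G", "A"), length=10)
for allele, seq in primers.items():
    print(allele, seq)
print(call_genotype((True, True), ("G", "A")))
```

Returning `None` for lines with no product in either reaction keeps failed assays separate from genotype calls, which matters here because a sizeable share of the 91 F2 lines gave no data for some markers.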
In 2023, we received the seeds of the most advanced lines, F9/F8, from Costa Rica and are evaluating them in a randomized complete block design for a second consecutive year. This year, we are testing the performance of these advanced lines with the parental genotypes initially used for genetic crossing and the high-yielding and high-fiber quality checks, DP-493 and FM-958, respectively. This year's and a subsequent year's phenotypic evaluation will allow us to register and release these advanced breeding lines as germplasms. We believe these breeding lines will serve as a resource for the plant-breeding community to develop cotton cultivars with optimal flowering time and stature, clustered flowering, enhanced fiber yield and quality, and reduced regrowth after defoliation.

Figure 1. Eight cotton genotypes scored the most data points in the gene expression study of five floral induction and meristem identity (FT, LFY, SOC1, AP1, and FUL) genes at three developmental stages (cotyledonary leaf, first square, and subsequent square) studied for three consecutive years (2017, 2018, and 2019) (for further details, see ref. [45] and Section 2).


Figure 2. Plots showing the distribution of the number of flowers formed between 15 July 2019 and 22 July 2019 (53 to 60 days after sowing) per plant (A) and boll opening date (B) in five F2 cotton populations. The IDs of parental genotypes are labelled on the bars reflective of the phenotype class they belong to; for the genotype names, see Table S1. DAS = days after sowing.


Genes 2023, 19

Figure 5. A principal component analysis (PCA) plot (A) and a neighbor-joining, unrooted, circular dendrogram (B) showing the relationships among 15 cotton populations (a total of 450 lines; 15 plots × 3 replicates × 10 randomly selected plants per plot) derived from seven genetic crosses of seven selected upland cotton genotypes. The analysis was based on phenotypic data (plant height, boll number, and the number of open bolls). Genotype names are based on the family name followed by plant numbers 1-30 in the PCA plot (A) and the dendrogram (B).

Figure 6. Diagram showing an expression-trait-associated SNP, the steps we followed to convert it into a PCR-based assay, and the steps forward to test it on the parental genotypes of the populations we developed to stack the expression traits. The marker information was retrieved from the CottonGen database and used to obtain the full-length gene sequence, in this case, Gohir.A08G034500. The two SNP alleles are colored fluorescent green. The allele-specific primers designed are tagged at their 3′-ends on the SNP, and as specified in the text, a non-template-specific nucleotide change (shown in a lower-case letter) is introduced at the n-2 location to enhance primer specificity. The common forward primer used with both allele-specific primers is shown (A). An example of the PCR-based assay developed for marker i13851Gh with 'G'- and 'A'-specific primers on two parental genotypes is shown (B).
