Article

Genome and Transcriptome Sequencing of Oca (Oxalis tuberosa Molina) Reveals Photoperiod-Induced FT Homologs as Candidate Tuberigens

by Maria Gancheva 1,* and Aleksandr Tkachenko 2
1 Department of Genetics and Biotechnology, Saint Petersburg State University, St. Petersburg 199034, Russia
2 D.O. Ott Research Institute of Obstetrics, Gynaecology and Reproductology, St. Petersburg 199034, Russia
* Author to whom correspondence should be addressed.
Int. J. Plant Biol. 2026, 17(2), 11; https://doi.org/10.3390/ijpb17020011
Submission received: 21 December 2025 / Revised: 2 February 2026 / Accepted: 9 February 2026 / Published: 10 February 2026
(This article belongs to the Topic Recent Advances in Plant Genetics and Breeding)

Abstract

Oxalis tuberosa (oca) is a tuber crop native to the Andes, valued for its nutrition but understudied genetically. Its strict short-day (SD) tuberization suggests a photoperiodic control mechanism similar to that of potato, where an FT-like protein acts as a mobile “tuberigen” signal. To identify this key regulator, we generated a de novo genome assembly for oca using long- and short-read sequencing. Integrated transcriptomic analysis of leaves under long-day (LD) and SD conditions, along with stems, roots, and tubers, enabled gene annotation and expression analysis. Our study focused on the Phosphatidylethanolamine-Binding Protein (PEBP) gene family, the source of florigen and tuberigen signals. We identified 23 OtPEBP genes and characterized their expression patterns. Among these, we discovered three FT-like homologs that are specifically and strongly upregulated in leaves under SD conditions. We therefore propose these genes as the prime candidates for the mobile tuberigen signal in oca. This work provides the foundational genomic resource for O. tuberosa and advances our understanding of the conserved photoperiodic network controlling storage organ formation beyond the Solanaceae family.

1. Introduction

Oxalis tuberosa Molina, commonly known as oca, is a herbaceous perennial plant belonging to the Oxalidaceae family, cultivated primarily for its underground starchy tubers. Native to the Andean region, oca is a staple crop providing a vital source of carbohydrates, vitamins, and minerals. Oca tubers form at the ends of underground shoots (stolons), similar to the process in potato (Solanum tuberosum L.). Tubers are mainly claviform or cylindrical in shape, with colors ranging from white to deep grayish purple. Oca, like the Andigena subspecies of potato (S. tuberosum ssp. andigena), strictly requires a short-day (SD) photoperiod for tuber formation.
The key regulator of potato tuberization is the SELF-PRUNING 6A (SP6A) gene, a member of the Phosphatidylethanolamine-Binding Protein (PEBP) gene family [1]. This gene family represents a highly conserved group of regulators found across eukaryotes, playing pivotal roles in diverse developmental transitions [2,3]. In plants, this family has undergone functional diversification into three major subclades: the FLOWERING LOCUS T (FT)-like clade, which promotes flowering and analogous reproductive processes; the TERMINAL FLOWER 1 (TFL1)-like clade, which acts as a repressor of floral transition; and the MOTHER OF FT AND TFL1 (MFT)-like clade, primarily involved in seed germination and stress responses [3]. The model plant Arabidopsis thaliana harbors six characterized PEBP members: the florigenic proteins FT and TWIN SISTER OF FT (TSF); the floral repressors TFL1, BROTHER OF FT AND TFL1 (BFT), and ARABIDOPSIS THALIANA CENTRORADIALIS (ATC); and MOTHER OF FT AND TFL1 (MFT) [3]. The photoperiodic control of flowering is mediated through the transcriptional activation of FT by CONSTANS (CO) under inductive long-day conditions [4]. The resulting FT protein is then transported from leaves to the shoot apical meristem, where it interacts with the bZIP transcription factor FD to initiate the switch to reproductive development [5,6]. The antagonistic action of TFL1, which competes with FT for interaction with FD, provides a critical balancing mechanism to fine-tune the timing of flowering [7].
In tuber-bearing species like potato, this conserved flowering network has been co-opted to regulate the formation of storage organs [8]. Initial studies identified four FT-like genes: StSP3D, StSP5G (StSP5G-A), StSP5G-like, and StSP6A [1]. Subsequent genomic analyses revealed additional paralogs, including StSP5G-B and StFTL1 [9,10]. A recent genome-wide analysis using the updated, long-read-based reference genome of potato has revealed an expanded repertoire of 15 PEBP family genes in Solanum tuberosum, classified into the FT, TFL, MFT, and PEBP-like subfamilies (StPEBP1–StPEBP15) [11]. Among these, StSP6A has been characterized as a key mobile “tuberigen” signal [1]. Its expression in leaves is induced under short-day conditions, and the protein is translocated via the phloem to stolons, where it promotes tuber initiation [1]. This systemic signaling is further modulated by interactions with proteins such as StFDL1 and 14-3-3 proteins, forming a tuberigen activation complex (TAC) [12]. The TAC regulates tuber initiation, functioning analogously to the florigen activation complex that controls flowering. Under long-day conditions, a potato CONSTANS-like protein (StCOL1) activates the expression of StSP5G genes, which in turn repress StSP6A, thereby inhibiting tuberization [9]. This intricate regulatory module demonstrates how core components of the ancient photoperiodic flowering pathway have been evolutionarily repurposed to control a fundamentally different developmental process—the formation of belowground storage tubers.
Despite the well-characterized tuberization pathway in potato, the molecular basis of tuber formation in the Andean crop Oxalis tuberosa remains largely uncharacterized. Oxalis tuberosa is a genetic outlier within its alliance, being the only octoploid species (2n = 8x = 64) among a group of related species sharing a base chromosome number of x = 8 [13]. This is confirmed by both classical cytology and modern flow cytometry analysis of nuclear DNA content, which shows that cultivated oca accessions possess approximately 3.6 pg/2C DNA, roughly four times that of the diploid species in the alliance (0.79–1.34 pg/2C) [13]. Oca is believed to be of allopolyploid origin, derived from the hybridization of wild species [13]. This unique ploidy level may be linked to selection for increased tuber size and vigor during domestication and has significant implications for its reproductive biology and breeding strategies [13].
The sequencing and assembly of the O. tuberosa genome represent a critical step toward understanding the genetic underpinnings of its unique traits, such as photoperiod-dependent tuber formation. To our knowledge, despite the crop's agricultural value, no published study has described a genome sequence for O. tuberosa. In this study, we present the de novo sequencing and assembly of the O. tuberosa genome using a combination of long-read and short-read technologies. We also analyzed oca transcriptomes from several plant organs (root, tuber, stem, and leaf) and from leaves under SD or LD conditions, both to annotate genes and to search for a tuberigen candidate. We identified multiple FT-like genes in oca that are specifically and highly expressed in leaves under SD conditions, leading us to hypothesize that they function as candidate tuberigen signals. These findings provide a foundation for unraveling the conserved and unique aspects of tuberization regulation in oca and may facilitate future breeding efforts aimed at improving tuber yield and adaptability in this underutilized Andean crop.

2. Materials and Methods

2.1. Plant Growth and Tissue Collection

Oxalis tuberosa Molina plants, kindly provided by the N.I. Vavilov All-Russian Institute of Plant Genetic Resources (VIR, Saint Petersburg, Russia), were used in this study. The tubers of this accession are white, elongated (Figure 1a,b and Figure S1), and have a characteristically sour taste. Tubers were planted in pots with soil and grown in a phytotron under long-day (LD) conditions (16 h light/8 h dark) at 18–22 °C and 60–70% relative humidity. For in vitro establishment, two-node stem fragments were surface-sterilized with 50% (v/v) commercial bleach (sodium hypochlorite) for 5 min, followed by three rinses with sterile distilled water. Explants were cultivated on basal Murashige and Skoog (MS) medium supplemented with 10 g/L sucrose (MS10), pH 5.6. Plantlets were maintained by subculturing onto fresh MS10 medium every three months.
For nucleic acid extraction, the following tissues were harvested: young leaves and roots from in vitro-grown plants under LD conditions; stems and leaves from plants grown in pots for three months under LD conditions; leaves from plants transferred to short-day (SD) conditions for two weeks after three months of LD growth; tubers from plants grown under SD conditions for three months.

2.2. DNA Isolation and Sequencing

High-molecular-weight genomic DNA was isolated from young leaf tissue grown in vitro under LD conditions using a modified cetyltrimethylammonium bromide (CTAB) protocol [14]. Briefly, 100 mg of fresh tissue was ground in liquid nitrogen and homogenized in 500 µL of pre-warmed CTAB buffer (2% CTAB, 1.4 M NaCl, 20 mM EDTA, 100 mM Tris-HCl pH 8.0, 1% polyvinylpyrrolidone). After incubation for 10 min, β-mercaptoethanol was added to a final concentration of 1%, followed by a 10–15 min incubation at room temperature. The homogenate was incubated at 56 °C for 20 min, then 10 µL of freshly prepared Proteinase K (20 mg/mL) was added per 500 µL CTAB buffer, and incubation continued at 56 °C for 3 h. Subsequent purification steps included chloroform extraction, isopropanol precipitation, and ethanol washes. The DNA pellet was air-dried and dissolved in 200 µL TE buffer (10 mM Tris, pH 8, 1 mM EDTA). RNA was removed using RNase A (Thermo Fisher Scientific, Waltham, MA, USA), and final purification was performed using the QIAGEN Genomic-tips 20/G kit (Qiagen, Hilden, Germany) according to the manufacturer’s instructions. DNA concentration and quality were assessed using a Qubit 4 Fluorometer and a Nanodrop ND-1000 (Thermo Fisher Scientific, Waltham, MA, USA).
For long-read sequencing, libraries were prepared using the Ligation Sequencing Kit (SQK-LSK109) and Native Barcoding Expansion Kit (EXP-NBD196) (Oxford Nanopore Technologies, Oxford, UK) and sequenced on a PromethION device using an R10.4.1 flow cell (FLO-PRO002). For short-read sequencing, libraries were prepared using the NDM627 DNA Library Prep Kit for MGI V2 and sequenced in PE100 mode on a DNBSEQ-G400 platform.

2.3. RNA Isolation and Sequencing

Total RNA was extracted from all tissues using the Magen HiPure Plant RNA Mini Kit (Magen Biotechnology, Guangzhou, China) according to the manufacturer’s protocol, with a modified wash step (incubation for 5 min at room temperature after adding Buffer RW2). RNA concentration was measured using a Qubit 2.0 Fluorometer with the RNA BR Assay Kit (Thermo Fisher Scientific, Waltham, MA, USA). RNA integrity was assessed on a TapeStation or Bioanalyzer 2100 using the High Sensitivity RNA kit (Agilent Technologies, Santa Clara, CA, USA). Residual DNA was removed using the TURBO DNA-free Kit (Invitrogen, Carlsbad, CA, USA). Libraries were prepared from 1000 ng of RNA using the KAPA mRNA Capture Kit and KAPA RNA Hyper Kit (Roche, Basel, Switzerland), followed by final purification with KAPA HyperPure Beads (Roche, Basel, Switzerland). Library size distribution and quality were checked on a High Sensitivity D1000 TapeStation chip (Agilent Technologies, Santa Clara, CA, USA), and quantification was performed with the Quant-iT DNA Assay Kit, High Sensitivity (Thermo Fisher Scientific, Waltham, MA, USA). Libraries were pooled equimolarly and sequenced on a NextSeq 1000 (Illumina, San Diego, CA, USA) using NextSeq 2000 P3 XLEAP-SBS Reagents (200 Cycles), generating 100 bp paired-end reads.

2.4. Genome Assembly and Annotation

Base calling of Nanopore reads was performed using Guppy v6.5.7 in high-accuracy (HAC) mode, filtering for reads with quality > 7 (Q > 7). Quality of short reads was assessed with FastQC v0.11.9 [15]. Prior to assembly, the genome profile was assessed by analyzing k-mer frequency spectra (k = 21) from MGI short-read data using GenomeScope2 v2.0 [16]. The genome was assembled using Flye v2.9.2 [17] and polished with Pilon v1.24 [18] using the MGI short reads. Contigs identified as chloroplast or mitochondrial DNA via BLASTn v2.12.0 were removed.
For annotation, RNA-seq reads from all tissues were mapped to the genome using HISAT2 v2.2.1 [19]. Gene prediction was performed with BRAKER3 [20] (Galaxy Version 3.0.8) using the unmasked genome and the mapped RNA-seq data to train Augustus. Unique genes containing start and stop codons were retained using gFACs [21]. Contigs containing no genes were removed from the final assembly. Assembly statistics were evaluated with Quast v5.1.0 [22]. gffread v0.12.7 [23] was used for gtf to gff3 conversion. Genome coverage was assessed by mapping both long and short reads to the final assembly using BWA v0.7.18 [24] and calculating coverage with SAMtools v1.13 [25]. The completeness of the genome and proteome was assessed using BUSCO v5.2.2 [26] in Augustus mode with the eudicots_odb10 lineage dataset.
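Contig-level assembly statistics of the kind reported by Quast are usually summarized by N50. As a minimal sketch of that calculation (not the Quast implementation), the function below returns the contig length at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly. The contig lengths in `toy_contigs` are invented for illustration and are not taken from the oca assembly.

```python
def n50(lengths):
    """Return N50: the length L such that contigs of length >= L
    together cover at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contigs supplied")


# Invented toy contig lengths (bp); total = 300, so half = 150.
toy_contigs = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_contigs)
print(toy_n50)
```

In the toy example the two longest contigs (100 + 80) are the first to pass the halfway mark, so the N50 is the length of the second contig.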

2.5. RNA-Seq Analysis and Identification of PEBP Family Genes

Transcript abundance was estimated using StringTie v2.2.1 [27] and kallisto v.52 [28]. Read counts were normalized using the median of ratios method implemented in DESeq2 v1.40.2 [29]. To identify PEBP family members, a BLASTp v2.12.0 search was conducted against the O. tuberosa and O. articulata proteomes using known Arabidopsis thaliana and Solanum tuberosum PEBP protein sequences. Proteins with >70% similarity were retained. Multiple sequence alignment was performed using the MUSCLE algorithm, and a phylogenetic tree was constructed with the neighbor-joining method in MEGA11 [30] using 1000 bootstrap replicates. Expression levels of identified PEBP genes were analyzed across all transcriptome datasets (LD leaves/stems, SD leaves, roots, tubers). The presence of the PBP domain was confirmed using the SMART database.
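The median-of-ratios normalization mentioned above can be illustrated with a short, self-contained sketch. This is a simplified re-statement of the idea behind DESeq2 size factors, not the DESeq2 code itself: for each gene, the geometric mean of its counts across samples acts as a pseudo-reference, and each sample's size factor is the median ratio of its counts to that reference. The gene names and counts in `toy_counts` are invented, and skipping genes with any zero count is an assumption of this sketch.

```python
import math
from statistics import median


def size_factors(counts):
    """counts: dict mapping gene -> list of raw counts (one per sample).
    Returns one size factor per sample (median of ratios)."""
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts.values():
        if any(c <= 0 for c in row):
            continue  # assumption: genes with a zero count are skipped
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    return [math.exp(median(r)) for r in log_ratios]


def normalize(counts):
    """Divide every count by the size factor of its sample."""
    factors = size_factors(counts)
    return {g: [c / f for c, f in zip(row, factors)] for g, row in counts.items()}


# Invented toy data: sample 2 was sequenced exactly twice as deeply as sample 1.
toy_counts = {
    "geneA": [10, 20],
    "geneB": [100, 200],
    "geneC": [5, 10],
    "geneD": [0, 7],  # contains a zero, so it does not inform the size factors
}
toy_factors = size_factors(toy_counts)
toy_normalized = normalize(toy_counts)
```

Because the second toy sample is a uniform two-fold scaling of the first, its size factor is twice as large, and the normalized counts of geneA, geneB, and geneC become equal across the two samples.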

2.6. Quantitative RT–PCR Analysis

Residual genomic DNA was removed from total RNA using the Rapid Out DNA Removal Kit (Thermo Fisher Scientific, Waltham, MA, USA). Subsequently, 600 ng of RNA was used for first-strand cDNA synthesis with RevertAid reverse transcriptase (Thermo Fisher Scientific, Waltham, MA, USA) and oligo(dT) primers. Quantitative RT–PCR was performed using SYBR Green I dye (Evrogen, Moscow, Russia) on a CFX96 real-time system (Bio-Rad, Hercules, CA, USA). The combined expression of the four closely related OtFT-like genes (Oxtub.VIR.110512, Oxtub.VIR.156460, Oxtub.VIR.164106, and Oxtub.VIR.121589) was measured with a shared primer pair (For: 5′-TTGGGTAGGCAAACAGTTTACG-3′, Rev: 5′-ACCAGAGCCTCCTTCCC-3′; Figure S2) and analyzed in CFX Manager software v2.1 (Bio-Rad) using the 2^−ΔΔCt method. The actin gene (Oxtub.VIR.40059; For: 5′-TGTGCTCAGCGGTGGGAC-3′, Rev: 5′-GAAGCCAAGATAGATCCTCCG-3′) served as the internal control for normalization. Four biological replicates were analyzed, each with three technical repeats. Statistical significance was evaluated using the Wilcoxon rank sum test.
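For readers unfamiliar with the ΔΔCt calculation, the following minimal sketch shows the arithmetic: the target Ct is first normalized to the reference gene (ΔCt), the calibrator condition's ΔCt is subtracted (ΔΔCt), and relative expression is 2 raised to the negative of that value. The sketch assumes roughly 100% amplification efficiency for both primer pairs, treats LD leaves as the calibrator, and uses invented Ct values; none of these numbers come from the study, and the function names are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Relative expression by the 2^-ddCt method.
    *_cal arguments are the Ct values of the calibrator condition."""
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_cal - ct_reference_cal
    return 2 ** -(delta_sample - delta_calibrator)


# Invented technical repeats (Ct values) for one biological replicate.
sd_target = mean([22.1, 21.9, 22.0])  # FT-like amplicon, SD leaf
sd_actin = mean([18.0, 18.1, 17.9])   # reference gene, SD leaf
ld_target = mean([27.0, 27.1, 26.9])  # FT-like amplicon, LD leaf (calibrator)
ld_actin = mean([18.0, 17.9, 18.1])   # reference gene, LD leaf (calibrator)

fold_change = relative_expression(sd_target, sd_actin, ld_target, ld_actin)
print(round(fold_change, 2))
```

With these toy values the SD sample has a ΔCt of 4 and the LD calibrator a ΔCt of 9, so ΔΔCt is −5 and the relative expression is about 2^5 = 32-fold.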

3. Results

3.1. De Novo Genome Assembly of Oxalis tuberosa

High-molecular-weight DNA was isolated from leaves of in vitro-grown plants. Sequencing on the Oxford Nanopore PromethION platform yielded 45 Gb of long-read data (926,354,960 reads; read N50 > 25 kb). These reads were assembled using Flye, and the draft assembly was subsequently polished using approximately 500 million 100 bp paired-end short reads generated on an MGI platform (PE100 mode; Section 2.2). The short reads covered the genome at a median depth of 60×.
GenomeScope2 v2.0 analysis of the short-read data estimated a haploid genome length of 215–218 Mb, a unique sequence length of 115–116 Mb, and a heterozygosity rate of 11.4% (Figure 1). The final de novo genome assembly of O. tuberosa is 833,627,616 bp in length, distributed across 10,917 contigs (Table 1), consistent with the octoploid nature of oca. BUSCO assessment indicated 95.8% assembly completeness with a high level of duplication (S: 21.3%, D: 74.5%), an additional indicator of oca polyploidy.
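The figures in this paragraph can be cross-checked with simple arithmetic. The sketch below uses only numbers stated above: it confirms that the single-copy and duplicated BUSCO fractions sum to the reported completeness, and it computes how many times larger the assembly is than the k-mer-based haploid estimate. The variable names are illustrative, and the interpretation of the fold value is left open because it depends on the ploidy model used by GenomeScope2, which is not stated here.

```python
# Values copied from the text above.
ASSEMBLY_BP = 833_627_616
HAPLOID_ESTIMATE_BP = (215e6, 218e6)  # GenomeScope2 range
BUSCO_SINGLE = 21.3                   # percent
BUSCO_DUPLICATED = 74.5               # percent

# Complete BUSCOs = single-copy + duplicated.
busco_complete = round(BUSCO_SINGLE + BUSCO_DUPLICATED, 1)

# Assembly length relative to the haploid estimate (low and high bounds).
fold_low = ASSEMBLY_BP / HAPLOID_ESTIMATE_BP[1]
fold_high = ASSEMBLY_BP / HAPLOID_ESTIMATE_BP[0]

# Share of complete BUSCOs present in more than one copy.
duplicated_share = BUSCO_DUPLICATED / busco_complete

print(busco_complete, round(fold_low, 2), round(fold_high, 2), round(duplicated_share, 2))
```

The assembly is a little under four times the estimated haploid length, and roughly three quarters of the complete BUSCOs are found in more than one copy, which fits the statement that multiple homeologous copies were assembled as separate sequences.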

3.2. Transcriptome-Guided Gene Annotation

To facilitate comprehensive gene annotation, RNA was extracted from multiple tissues representing different developmental and environmental contexts: roots from in vitro plants; stems and leaves from plants grown under long-day (LD) conditions (16 h light/8 h dark) for three months; leaves from plants transferred to short-day (SD) conditions for two weeks after LD growth; and tubers from plants grown under SD for three months. Transcriptomes were sequenced on the Illumina platform. The reads from all libraries were mapped to the genome with a high overall alignment rate of 90.21% using HISAT2.
Gene prediction with BRAKER3, trained on these RNA-seq data, initially identified 224,769 gene models. Filtering with gFACs to retain only unique genes containing start and stop codons yielded a final set of 218,215 high-confidence protein-coding genes (File S1). BUSCO analysis of this predicted gene set showed 96.9% completeness (Table 1).

3.3. Identification of PEBP Family Genes and a Candidate Tuberigen

To comprehensively identify the PEBP family in oca, a BLASTp search was conducted against the oca proteome using Arabidopsis and potato PEBP sequences. Following phylogenetic analysis and SMART domain verification, 26 putative PEBP members were initially identified. Three candidates (Oxtub.VIR.156463, Oxtub.VIR.164102, and Oxtub.VIR.143538) were excluded from further analysis due to a low-confidence PBP domain prediction, likely resulting from truncated gene models; these genes also showed no expression in any of the studied tissues. The final set comprised 23 high-confidence OtPEBP genes. Subsequent phylogenetic analysis and BLASTp searches against the UniProt database classified them into established subfamilies: eight FT-like genes (Figure 2, Figures S3 and S4), three TFL1-like genes, three BFT-like genes, two MFT-like genes, and eight ATC/CEN-like genes (Figure S4).
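The first step of this search, retaining BLASTp hits above a similarity cutoff, might be sketched as follows. The example assumes BLAST tabular output (`-outfmt 6`, whose default columns are qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore) and applies the >70% threshold from the Methods to the percent-identity column. Whether the authors filtered on identity or on percent positives is not stated, so that choice is an assumption, and the hit lines and protein names below are invented.

```python
def filter_hits(tabular_text, min_identity=70.0):
    """Return the sorted set of subject IDs with percent identity above the cutoff.
    Expects BLAST tabular output with the default 12 columns."""
    kept = set()
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        subject_id = fields[1]
        percent_identity = float(fields[2])
        if percent_identity > min_identity:
            kept.add(subject_id)
    return sorted(kept)


# Invented example hits (query, subject, pident, ...); IDs are hypothetical.
toy_blast = "\n".join([
    "queryFT\tprotA\t82.5\t170\t30\t0\t1\t170\t1\t170\t1e-90\t300",
    "queryFT\tprotB\t45.0\t160\t88\t2\t5\t165\t3\t160\t1e-20\t95",
    "queryTFL1\tprotC\t71.2\t172\t49\t1\t1\t172\t1\t171\t1e-75\t250",
    "queryTFL1\tprotA\t70.0\t168\t50\t1\t1\t168\t1\t168\t1e-70\t240",
])
toy_candidates = filter_hits(toy_blast)
print(toy_candidates)
```

Candidates that pass such a filter would still need the domain check and phylogenetic placement described above before being counted as family members.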
We also performed a search for the PEBP family in O. articulata, a related diploid species for which chromosome-level assemblies of two haplotypes (Haplotype A and Haplotype B) are available [31]. In each haplotype, we identified 7 PEBP homologs. These sequences exhibited a high degree of similarity with the corresponding oca genes and clustered together with them on the phylogenetic tree (Figure 2, Figures S3 and S4).
Two oca genes, Oxtub.VIR.210990 and Oxtub.VIR.110507, showed high sequence similarity to potato StSP3D and, together with their O. articulata homolog, formed a distinct clade (Figure 2). The remaining six OtFT-like genes in oca did not show clear one-to-one homology with characterized genes from potato or Arabidopsis. Instead, they formed a separate, expanded lineage that grouped closely with their homologs from O. articulata.
Analysis of the genomic loci revealed distinct organizational patterns among the OtFT-like genes. Three genes—Oxtub.VIR.110512, Oxtub.VIR.110513, and Oxtub.VIR.110507—are located in close proximity on the same contig. The genes Oxtub.VIR.210990 and Oxtub.VIR.110507 share high sequence similarity and are flanked by similar genes (Oxtub.VIR.110508 and Oxtub.VIR.210989, File S1). In contrast, Oxtub.VIR.83632, Oxtub.VIR.156460 and Oxtub.VIR.164106 reside on separate contigs with unique gene neighborhoods. The gene Oxtub.VIR.121589 is located on a solitary short contig.

3.4. Transcriptomic Profiling and qRT-PCR Validation of FT-like Gene Expression

To ensure reliable quantification of gene expression in the complex polyploid genome, where high sequence similarity between FT-like genes can lead to ambiguous read mapping, we employed and compared two independent quantification methods: HISAT2/StringTie (alignment-based) and Kallisto (pseudoalignment-based).
Analysis of OtFT-like gene expression profiles across all tissues (LD leaves, SD leaves, roots, stems, tubers) revealed distinct patterns (Figure 3a, Tables S1 and S2). Notably, three OtFT-like genes, Oxtub.VIR.110512, Oxtub.VIR.156460, and Oxtub.VIR.164106, exhibited strong and specific upregulation in leaves under SD conditions according to both quantification methods, with minimal expression in LD leaves. The high sequence similarity among some genes presented a challenge for unambiguous read assignment. For instance, the gene Oxtub.VIR.121589, which is highly similar to the SD-induced clade, was assigned zero expression by HISAT2/StringTie but showed an SD-induced profile in the Kallisto analysis. This discrepancy likely reflects differences in how the two algorithms resolve multimapping reads in regions of near-identical sequence, rather than a true biological difference. Critically, this technical uncertainty does not affect the core conclusion, as both methods robustly and consistently identified the three primary candidates (Oxtub.VIR.110512, Oxtub.VIR.156460, Oxtub.VIR.164106) as strongly and specifically SD-induced.
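One way to make such a cross-method comparison explicit is to compute a log2 fold change (SD versus LD) for each gene under each quantification method and flag genes on which the two disagree. The sketch below is a hypothetical illustration of that bookkeeping, not the analysis performed in this study: the pseudocount of 1, the two-fold threshold, the gene labels, and all expression values are invented for the example.

```python
import math


def log2_fold_change(sd_value, ld_value, pseudocount=1.0):
    """log2(SD/LD) with a pseudocount so that zero values stay finite."""
    return math.log2((sd_value + pseudocount) / (ld_value + pseudocount))


def concordance(alignment_based, pseudoalignment_based, threshold=1.0):
    """Each argument maps gene -> (ld_value, sd_value).
    Returns gene -> label describing agreement between the two methods."""
    labels = {}
    for gene, (ld_a, sd_a) in alignment_based.items():
        ld_b, sd_b = pseudoalignment_based[gene]
        induced_a = log2_fold_change(sd_a, ld_a) >= threshold
        induced_b = log2_fold_change(sd_b, ld_b) >= threshold
        if induced_a and induced_b:
            labels[gene] = "SD-induced in both"
        elif induced_a or induced_b:
            labels[gene] = "method-dependent"
        else:
            labels[gene] = "not SD-induced"
    return labels


# Invented normalized expression values: gene -> (LD leaf, SD leaf).
toy_alignment = {"gene1": (1, 63), "gene2": (0, 0), "gene3": (15, 3)}
toy_pseudo = {"gene1": (0, 31), "gene2": (1, 31), "gene3": (7, 1)}
toy_labels = concordance(toy_alignment, toy_pseudo)
print(toy_labels)
```

In the toy data, gene2 mimics the situation described above: it receives zero expression from one method but an SD-induced profile from the other, so it is flagged as method-dependent rather than counted as a robust candidate.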
Conversely, two other OtFT-like genes, Oxtub.VIR.110513 and Oxtub.VIR.83632, showed higher expression in leaves under LD conditions compared to SD in both datasets. The expression of BFT-like genes was predominantly detected in roots, while TFL1-like, MFT-like and ATC/CEN-like genes displayed expression primarily in roots, stems, and/or tubers (Figure S5).
The consistent, method-independent photoperiod-specific induction of Oxtub.VIR.110512, Oxtub.VIR.156460, and Oxtub.VIR.164106 in source leaves is a hallmark of mobile tuberigen signals such as StSP6A in potato, and it supports these genes as the prime candidates for the systemic tuber-inducing signal in oca. The LD-specific expression of Oxtub.VIR.110513 and Oxtub.VIR.83632 is reminiscent of StSP5G in potato, which acts as a photoperiod-dependent repressor of tuberization.
To independently validate the photoperiod-dependent expression patterns of key candidates, we performed quantitative real-time PCR (qRT-PCR) on leaf samples from plants grown under LD and SD conditions. A specific primer pair was designed to target a region conserved among the four highly similar OtFT-like genes (Oxtub.VIR.110512, Oxtub.VIR.156460, Oxtub.VIR.164106, and Oxtub.VIR.121589), while distinguishing them from other OtFT-like genes in the oca genome (Figure S2). This strategy allowed for the specific co-amplification and quantification of their combined expression.
The results robustly confirmed the RNA-seq findings (Figure 3b). The combined expression signal for this subgroup was significantly and strongly induced in SD leaves, with minimal expression detected under LD conditions. This validated the photoperiod-specific induction pattern identified by transcriptomics.
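When interpreting significance from a Wilcoxon rank sum test on small groups, it helps to know the smallest p-value the design can produce. The sketch below computes an exact two-sided p-value by enumerating all possible rank assignments, assuming no tied values and assuming the comparison is four LD versus four SD biological replicates. Whether the authors used an exact test or a normal approximation is not stated, and the expression values are invented.

```python
from itertools import combinations


def exact_rank_sum_p(group_a, group_b):
    """Exact two-sided Wilcoxon rank sum p-value (assumes no ties)."""
    pooled = sorted(group_a + group_b)
    rank = {value: i + 1 for i, value in enumerate(pooled)}
    n_a, n_total = len(group_a), len(pooled)
    observed = sum(rank[v] for v in group_a)
    expected = n_a * (n_total + 1) / 2
    observed_dev = abs(observed - expected)
    # Enumerate every way of assigning n_a of the ranks to group A.
    all_sums = [sum(c) for c in combinations(range(1, n_total + 1), n_a)]
    as_extreme = sum(1 for s in all_sums if abs(s - expected) >= observed_dev)
    return as_extreme / len(all_sums)


# Invented relative-expression values with complete separation between groups.
toy_ld = [1.0, 1.2, 0.9, 1.1]
toy_sd = [30.0, 28.0, 35.0, 31.0]
toy_p = exact_rank_sum_p(toy_ld, toy_sd)
print(round(toy_p, 4))
```

With four replicates per group there are 70 possible rank assignments, so even complete separation of LD and SD values gives an exact two-sided p-value of 2/70, or about 0.029. Under these assumptions that is the strongest evidence this design can yield.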

4. Discussion

The evolutionary origin of cultivated oca within its wild tuber-bearing relatives remains an active area of investigation. The species is part of the “O. tuberosa alliance,” a clade that includes several wild tuber-bearing taxa: O. chicligastensis, O. picchensis, and two unnamed species from Peru and Bolivia. Phylogenetic studies have sought to pinpoint which of these are the direct progenitors of the polyploid crop. Current evidence points most strongly to the unnamed Bolivian taxon and O. chicligastensis as the likely genome donors [32]. Notably, the oca accession used in this study, obtained from the VIR collection, exhibits several morphological traits characteristic of primitive or wild-type tubers: they are elongated and sometimes form in a chain along a single stolon. These agro-morphological characteristics align it more closely with putative wild progenitors than with highly derived, modern cultivars selected for compact, fleshy, and storable tubers.
The de novo assembly of the Oxalis tuberosa genome presented here establishes a foundational genomic resource for this important yet understudied Andean crop. Our hybrid sequencing strategy, combining long Oxford Nanopore reads for contig assembly with high-depth MGI short reads for polishing, yielded an assembly of 833.6 Mb with 95.8% completeness based on BUSCO analysis. However, the BUSCO duplication rate of 74.5% is notably lower than the near-100% expected for a perfectly resolved polyploid genome. This indicates that our contig-level assembly resulted in a partial, uneven collapse of homeologous haplotypes. This inherent challenge of polyploid genome assembly led to the loss of some allelic diversity, as reflected in the final count of ~218,000 protein-coding genes. While this number aligns with the lower range of expectations for an octoploid (~115,000 gene models in the tetraploid O. stricta [33], 31,221 protein-coding genes in Oxalis oulophora [34], and ~36,000–38,000 per diploid O. articulata haplotype [31]), the BUSCO metrics confirm that not all homeologous copies were retained as separate loci. Despite these limitations of the draft assembly, which preclude precise copy-number inference, this resource enables the first detailed investigation into the genetic basis of oca’s key traits, particularly its strict short-day-dependent tuberization.
Leveraging this genome and multi-tissue transcriptomes, we provided the first comprehensive characterization of the PEBP gene family in oca. We identified 23 high-confidence OtPEBP genes, representing all major subfamilies (FT, TFL1, BFT, MFT, ATC). The central finding is the identification of three strong OtFT-like candidates for the mobile tuberigen signal: Oxtub.VIR.110512, Oxtub.VIR.156460, and Oxtub.VIR.164106. Their expression profile is archetypal for a systemic inducer: strong and specific induction in source leaves under inductive SD conditions, with minimal expression in LD leaves. This pattern closely parallels that of StSP6A, the definitive tuberigen in potato, and was validated by qRT-PCR. The presence of multiple SD-induced OtFT-like paralogs in oca, compared to a single major tuberigen (SP6A) in potato, may reflect its octoploid genome complexity and could indicate subfunctionalization or redundant roles in signaling. In contrast to the SD-induced candidates, Oxtub.VIR.110513 and Oxtub.VIR.83632 showed higher expression under non-inductive LD conditions. While this pattern is analogous to that of the tuberization repressor StSP5G [9] in potato, the function of these genes in oca is not yet defined. Their LD-specific expression is consistent with a role in repressing tuberization under non-inductive conditions, but it could also reflect a primary function in regulating flowering or another photoperiodic response. Future functional studies are essential to confirm the role of the identified OtFT-like genes as tuberigen or anti-tuberigen signals in oca.
The genomic arrangement of the OtFT-like genes provides further insight into their potential evolutionary history and regulatory divergence. Intriguingly, three OtFT-like genes with starkly contrasting expression profiles—Oxtub.VIR.110512 (SD-induced), Oxtub.VIR.110513 (LD-induced), and the non-expressed Oxtub.VIR.110507—are located in close proximity on the same contig. This tight physical linkage of differentially regulated OtFT-like genes suggests a local regulatory landscape that fine-tunes their photoperiod-specific responses. Notably, two corresponding OtFT-like homologs in the related diploid species Oxalis articulata are also positioned adjacent to each other on chromosome 7. In contrast, Oxtub.VIR.210990 resides on a separate short contig. It shares high sequence similarity with Oxtub.VIR.110507, and their flanking genes (Oxtub.VIR.110508 and Oxtub.VIR.210989) are also highly similar, indicating this region likely represents a segmental duplication or a homeologous chromosomal segment that was not collapsed during assembly. Conversely, Oxtub.VIR.83632, Oxtub.VIR.156460 and Oxtub.VIR.164106 are located on distinct contigs with unique gene neighborhoods, supporting their status as unique loci rather than recent duplicates. The solitary placement of Oxtub.VIR.121589 on a short contig precludes similar synteny analysis. Together, this genomic architecture implies that the oca FT-like family may have arisen through a combination of mechanisms: local tandem duplication leading to subfunctionalized copies, larger-scale segmental duplication, and the retention of unique homeologous loci from the polyploid ancestor.
These findings bridge a critical knowledge gap in the biology of an orphan crop and provide specific genetic targets for future molecular breeding. The genomic resource and candidate genes identified here pave the way for strategies to modulate tuber yield, photoperiod sensitivity, and ultimately enhance the cultivation and resilience of this important nutrient source.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/ijpb17020011/s1: Table S1: Normalized expression of oca genes across five tissue/photoperiod conditions based on HISAT2/StringTie mapping; Table S2: Normalized expression of oca genes across five tissue/photoperiod conditions based on Kallisto pseudoalignment; Figure S1: Oxalis tuberosa VIR morphology; Figure S2: RT-PCR primer localization; Figure S3: Multiple sequence alignment of FT-like protein homologs; Figure S4: Neighbor-joining phylogenetic tree of the PEBP proteins in Oxalis tuberosa (Oxtub), O. articulata (Oxart), Solanum tuberosum (St), and Arabidopsis thaliana (At); Figure S5: Heatmap showing normalized expression of the 23 identified OtPEBP genes; File S1: Nucleotide sequences (CDS) of the annotated Oxalis tuberosa genes (Oxtub.VIR).

Author Contributions

Conceptualization, M.G. and A.T.; methodology, M.G. and A.T.; investigation, M.G. and A.T.; writing—original draft preparation, M.G.; writing—review and editing, A.T.; visualization, M.G.; funding acquisition, M.G. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Russian Science Foundation under grant No. 24-76-00018.

Data Availability Statement

All sequence reads and the genome assembly have been deposited in the National Center for Biotechnology Information database under BioProject accession number PRJNA1378046.

Acknowledgments

Nanopore and Illumina sequencing were performed using the core facilities of the Lopukhin FRCC PCM “Genomics, proteomics, metabolomics” (http://rcpcm.org/?p=2806, accessed on 8 February 2026); MGI sequencing was performed at Cerbalab (https://cerbalab.ru/, accessed on 8 February 2026).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Navarro, C.; Abelenda, J.A.; Cruz-Oró, E.; Cuéllar, C.A.; Tamaki, S.; Silva, J.; Shimamoto, K.; Prat, S. Control of flowering and storage organ formation in potato by FLOWERING LOCUS T. Nature 2011, 478, 119–122. [Google Scholar] [CrossRef]
  2. Wickland, D.P.; Hanzawa, Y. The FLOWERING LOCUS T/TERMINAL FLOWER 1 Gene Family: Functional Evolution and Molecular Mechanisms. Mol. Plant 2015, 8, 983–997. [Google Scholar] [CrossRef] [PubMed]
  3. Karlgren, A.; Gyllenstrand, N.; Källman, T.; Sundström, J.F.; Moore, D.; Lascoux, M.; Lagercrantz, U. Evolution of the PEBP gene family in plants: Functional diversification in seed plant evolution. Plant Physiol. 2011, 156, 1967–1977. [Google Scholar] [CrossRef] [PubMed]
  4. Samach, A.; Onouchi, H.; Gold, S.E.; Ditta, G.S.; Schwarz-Sommer, Z.; Yanofsky, M.F.; Coupland, G. Distinct roles of CONSTANS target genes in reproductive development of Arabidopsis. Science 2000, 288, 1613–1616. [Google Scholar] [CrossRef] [PubMed]
  5. Corbesier, L.; Vincent, C.; Jang, S.; Fornara, F.; Fan, Q.; Searle, I.; Giakountis, A.; Farrona, S.; Gissot, L.; Turnbull, C.; et al. FT protein movement contributes to long-distance signaling in floral induction of Arabidopsis. Science 2007, 316, 1030–1033. [Google Scholar] [CrossRef]
  6. Abe, M.; Kobayashi, Y.; Yamamoto, S.; Daimon, Y.; Yamaguchi, A.; Ikeda, Y.; Ichinoki, H.; Notaguchi, M.; Goto, K.; Araki, T. FD, a bZIP protein mediating signals from the floral pathway integrator FT at the shoot apex. Science 2005, 309, 1052–1056. [Google Scholar] [CrossRef]
  7. Hanano, S.; Goto, K. Arabidopsis TERMINAL FLOWER1 is involved in the regulation of flowering time and inflorescence development through transcriptional repression. Plant Cell 2011, 23, 3172–3184. [Google Scholar] [CrossRef]
  8. Rodríguez-Falcón, M.; Bou, J.; Prat, S. Seasonal control of tuberization in potato: Conserved elements with the flowering response. Annu. Rev. Plant Biol. 2006, 57, 151–180. [Google Scholar] [CrossRef]
  9. Abelenda, J.A.; Cruz-Oró, E.; Franco-Zorrilla, J.M.; Prat, S. Potato StCONSTANS-like1 suppresses storage organ formation by directly activating the FT-like StSP5G repressor. Curr. Biol. 2016, 26, 872–881. [Google Scholar] [CrossRef]
  10. Song, J.; Zhang, S.; Wang, X.; Sun, S.; Liu, Z.; Wang, K.; Wan, H.; Zhou, G.; Li, R.; Yu, H.; et al. Variations in Both FTL1 and SP5G, Two Tomato FT Paralogs, Control Day-Neutral Flowering. Mol. Plant 2020, 13, 939–942. [Google Scholar] [CrossRef]
  11. Zhang, G.; Jin, X.; Li, X.; Zhang, N.; Li, S.; Si, H.; Rajora, O.P.; Li, X.Q. Genome-wide identification of PEBP gene family members in potato, their phylogenetic relationships, and expression patterns under heat stress. Mol. Biol. Rep. 2022, 49, 4683–4697. [Google Scholar] [CrossRef]
  12. Teo, C.J.; Takahashi, K.; Shimizu, K.; Shimamoto, K.; Taoka, K.I. Potato Tuber Induction is Regulated by Interactions Between Components of a Tuberigen Complex. Plant Cell Physiol. 2017, 58, 365–374. [Google Scholar] [CrossRef] [PubMed]
  13. Emshwiller, E. Ploidy levels among species in the ‘Oxalis tuberosa Alliance’ as inferred by flow cytometry. Ann. Bot. 2002, 89, 741–753. [Google Scholar] [CrossRef] [PubMed]
  14. Murray, M.G.; Thompson, W.F. Rapid isolation of high molecular weight plant DNA. Nucleic Acids Res. 1980, 8, 4321–4325. [Google Scholar] [CrossRef] [PubMed]
  15. Andrews, S. FastQC: A Quality Control Tool for High Throughput Sequence Data. 2010. Available online: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/ (accessed on 10 October 2022).
  16. Ranallo-Benavidez, T.R.; Jaron, K.S.; Schatz, M.C. GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes. Nat. Commun. 2020, 11, 1432. [Google Scholar] [CrossRef]
  17. Kolmogorov, M.; Yuan, J.; Lin, Y.; Pevzner, P.A. Assembly of long, error-prone reads using repeat graphs. Nat. Biotechnol. 2019, 37, 540–546. [Google Scholar] [CrossRef]
  18. Walker, B.J.; Abeel, T.; Shea, T.; Priest, M.; Abouelliel, A.; Sakthikumar, S.; Cuomo, C.A.; Zeng, Q.; Wortman, J.; Young, S.K.; et al. Pilon: An integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE 2014, 9, e112963. [Google Scholar] [CrossRef]
  19. Kim, D.; Langmead, B.; Salzberg, S.L. HISAT: A fast spliced aligner with low memory requirements. Nat. Methods 2015, 12, 357–360. [Google Scholar] [CrossRef]
  20. Gabriel, L.; Brůna, T.; Hoff, K.J.; Ebel, M.; Lomsadze, A.; Borodovsky, M.; Stanke, M. BRAKER3: Fully automated genome annotation using RNA-seq and protein evidence with GeneMark-ETP, AUGUSTUS, and TSEBRA. Genome Res. 2024, 34, 769–777. [Google Scholar] [CrossRef]
  21. Caballero, M.; Wegrzyn, J. gFACs: Gene filtering, analysis, and conversion to unify genome annotations across alignment and gene prediction frameworks. Genom. Proteom. Bioinform. 2019, 17, 305–310. [Google Scholar] [CrossRef]
  22. Gurevich, A.; Saveliev, V.; Vyahhi, N.; Tesler, G. QUAST: Quality assessment tool for genome assemblies. Bioinformatics 2013, 29, 1072–1075. [Google Scholar] [CrossRef]
  23. Pertea, G.; Pertea, M. GFF Utilities: GffRead and GffCompare. F1000Research 2020, 9, 304. [Google Scholar] [CrossRef]
  24. Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv 2013, arXiv:1303.3997. [Google Scholar] [CrossRef]
  25. Danecek, P.; Bonfield, J.K.; Liddle, J.; Marshall, J.; Ohan, V.; Pollard, M.O.; Whitwham, A.; Keane, T.; McCarthy, S.A.; Davies, R.M.; et al. Twelve years of SAMtools and BCFtools. Gigascience 2021, 10, giab008. [Google Scholar] [CrossRef] [PubMed]
  26. Manni, M.; Berkeley, M.R.; Seppey, M.; Simão, F.A.; Zdobnov, E.M. BUSCO update: Novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol. Biol. Evol. 2021, 38, 4647–4654. [Google Scholar] [CrossRef] [PubMed]
  27. Pertea, M.; Pertea, G.M.; Antonescu, C.M.; Chang, T.-C.; Mendell, J.T.; Salzberg, S.L. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat. Biotechnol. 2015, 33, 290–295. [Google Scholar] [CrossRef]
  28. Bray, N.L.; Pimentel, H.; Melsted, P.; Pachter, L. Near-optimal probabilistic RNA-seq quantification. Nat. Biotechnol. 2016, 34, 525–527. [Google Scholar] [CrossRef]
  29. Love, M.I.; Huber, W.; Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014, 15, 550. [Google Scholar] [CrossRef]
  30. Tamura, K.; Stecher, G.; Kumar, S. MEGA11: Molecular Evolutionary Genetics Analysis Version 11. Mol. Biol. Evol. 2021, 38, 3022–3027. [Google Scholar] [CrossRef]
  31. Yang, W.; Jiang, C.; Bi, C.; Liu, Y.; Wu, H.; Sun, X.; Liu, B.; Han, Y.; Zhang, X. A haplotype-resolved chromosomal-level genome assembly of Oxalis articulata. Sci. Data 2025, 12, 856. [Google Scholar] [CrossRef]
  32. Emshwiller, E.; Theim, T.; Grau, A.; Nina, V.; Terrazas, F. Origins of domestication and polyploidy in oca (Oxalis tuberosa; Oxalidaceae). 3. AFLP data of oca and four wild, tuber-bearing taxa. Am. J. Bot. 2009, 96, 1839–1848. [Google Scholar] [CrossRef] [PubMed]
  33. Wood, J.C.; Hamilton, J.P.; Vaillancourt, B.; Brose, J.; Edger, P.P.; Buell, C.R. Chromosome-scale genome assembly for yellow wood sorrel, Oxalis stricta. G3 Genes|Genomes|Genet. 2026; ahead of print. [Google Scholar] [CrossRef] [PubMed]
  34. Vanrooyen, D.; Emshwiller, E. First whole genome sequence of a diploid crop wild relative of the Andean tuber “oca”: Annotation and comparative genomic analysis of Oxalis oulophora. Plant Genome 2026, 19, e70193. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Shoot (a) and tuber (b) morphology. GenomeScope2 analysis based on k-mer data indicating an estimation of genome size and heterozygosity (c).
Figure 2. Phylogenetic analysis of the FT-like clade from Oxalis tuberosa (Oxtub). The Neighbor-Joining tree shows the relationship of oca FT homologs (pink dots) with those from the related species Oxalis articulata (Oxart), Solanum tuberosum (St), and Arabidopsis thaliana (At). The full phylogenetic tree of the entire PEBP family is provided in Supplementary Figure S4.
Figure 3. Expression profiles of OtFT-like genes across different tissues and photoperiods. (a) Heatmap showing normalized expression of the 8 identified OtFT-like genes. Tissues: Leaf LD (long-day leaf), Leaf SD (short-day leaf), Stem, Tuber, Root. (b) qRT-PCR validation of OtFT-like gene expression. Expression of the four closely related OtFT-like genes (Oxtub.VIR.110512, Oxtub.VIR.156460, Oxtub.VIR.164106, Oxtub.VIR.121589) was measured in leaves under LD and SD conditions using a primer pair specific to this group of genes. Expression is normalized to the actin reference gene and shown relative to the LD sample.
Table 1. Summary of the O. tuberosa VIR genome assembly data.
Parameter                               Value
Genome size (Mb)                        833.627
GC content (%)                          33.5
Contig number                           10,917
Contig N50 (bp)                         235,705
Protein-coding gene number              218,215
Genome BUSCO (eudicots_odb10)           C: 95.8% [S: 21.3%, D: 74.5%], F: 0.4%, M: 3.8%, n: 2326
Transcriptome BUSCO (eudicots_odb10)    C: 96.9% [S: 16.2%, D: 80.7%], F: 0.7%, M: 2.4%, n: 2326
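
The BUSCO strings in Table 1 follow the usual short-summary notation, in which complete (C) is the sum of single-copy (S) and duplicated (D), and C, fragmented (F), and missing (M) together account for all n searched orthologs. The sketch below is a minimal, illustrative check of that arithmetic for the two rows of the table; the helper names `parse_busco` and `is_consistent` and the rounding tolerance are assumptions introduced here and are not part of the authors' pipeline.

```python
import re


def parse_busco(summary: str) -> dict:
    """Parse a BUSCO short-summary string into {category: value}.

    Percentages (C, S, D, F, M) become floats; n becomes an int.
    """
    fields = dict(re.findall(r"([CSDFMn]):\s*([\d.]+)", summary))
    return {k: (int(v) if k == "n" else float(v)) for k, v in fields.items()}


def is_consistent(busco: dict, tol: float = 0.11) -> bool:
    """Check S + D == C and C + F + M == 100, within a rounding tolerance.

    The tolerance is a hypothetical choice that allows for one-decimal rounding.
    """
    complete_ok = abs(busco["S"] + busco["D"] - busco["C"]) <= tol
    total_ok = abs(busco["C"] + busco["F"] + busco["M"] - 100.0) <= tol
    return complete_ok and total_ok


# Values copied from Table 1 (eudicots_odb10).
genome = parse_busco("C: 95.8% [S:21.3%, D:74.5%], F: 0.4%, M:3.8%, n: 2326")
transcriptome = parse_busco("C: 96.9% [S:16.2%, D:80.7%], F: 0.7%, M:2.4%, n: 2326")

for label, busco in (("genome", genome), ("transcriptome", transcriptome)):
    # Share of complete BUSCOs that are present in more than one copy.
    duplicated_share = busco["D"] / busco["C"]
    print(f"{label}: consistent={is_consistent(busco)}, "
          f"duplicated share of complete={duplicated_share:.1%}")
```

The duplicated share is reported here only as a derived ratio of the tabulated values; interpreting it is left to the main text.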
Share and Cite

MDPI and ACS Style

Gancheva, M.; Tkachenko, A. Genome and Transcriptome Sequencing of Oca (Oxalis tuberosa Molina) Reveals Photoperiod-Induced FT Homologs as Candidate Tuberigens. Int. J. Plant Biol. 2026, 17, 11. https://doi.org/10.3390/ijpb17020011
