Fine Mapping and Gene Analysis of restorer-of-fertility Gene CaRfHZ in Pepper (Capsicum annuum L.)

Cytoplasmic male sterility (CMS) is a common biological phenomenon used in hybrid production of peppers (Capsicum annuum L.). Although several restorer-of-fertility (Rf) genes of pepper CMS lines have been mapped, no Rf gene with a clearly defined function has been isolated. Here, pepper CMS line HZ1A and its restorer line HZ1C were used to construct (HZ1A × HZ1C) F2 populations and map the Rf gene. Inheritance analysis indicated that a single dominant gene, CaRfHZ, conferred male fertility. Using sterile plants from (HZ1A × HZ1C) F2 populations and bulked segregant analysis (BSA), the CaRfHZ gene was mapped between P06gInDel-66 and P06gInDel-89 on chromosome 6. This region spans 533.81 kb and contains four annotated genes according to the Zunla-1 V2.0 gene models. Based on the analysis of genomic DNA sequences, gene expression, and protein structures, Capana06g002968 was proposed as the strongest candidate for the CaRfHZ gene. Our results may aid hybrid pepper breeding and help elucidate the mechanism of male fertility restoration in peppers.


Introduction
Cytoplasmic male sterility (CMS) is a common biological phenomenon observed in plants, and it is widely used for seed production of hybrid crops. Fertility restoration of CMS is critical in hybrid production, and restorer-of-fertility (Rf) genes can restore and affect the fertility of hybrids. More than ten Rf genes have been isolated and studied in major crops [1] such as rice, maize, rapeseed, wheat, and soybean. Most of the Rf genes in crops contain multiple pentatricopeptide repeat (PPR) domains, such as Rf1 [2], Rf4 [3], Rf5 [4], and Rf6 [5] in rice; Rf1 [6] and Rf3 [7] in maize; Rfp [8] in rapeseed; Rfo [9] in radish; RFL79 and RFL29a [10] in wheat; and GmPPR576 [11] in soybean. Additionally, certain Rf genes do not contain PPR domains, such as Rf2 [6] and Rf4 [12] in maize. Rf2 is similar to the mammalian aldehyde dehydrogenase, and Rf4 is a transcription factor that contains basic helix-loop-helix (bHLH) domains. Most Rf genes restore the fertility of sterile lines by affecting the transcription of sterile genes in mitochondria. For example, the rice gene Rf4 restores fertility by reducing the transcripts of the sterility gene orf352 in the cytoplasm of wild-type rice [4,13]; the rice gene Rf6 reduces the accumulation of the transcript atp6-orfH79, thereby restoring fertility [5]; the maize gene Rf3 inhibits mitochondrial gene orf77 editing and degradation, and accelerates orf355 degradation, leading to CMS-S fertility restoration [14]; the rapeseed gene Rfp restores fertility of the sterile line by processing the transcript of the mitochondrial sterility gene atp6-orf224 [8].
Pepper is one of the largest vegetable crops globally. Peterson (1958) [15] was the first to report cytoplasmic-nuclear interaction type male sterility in pepper, after which breeders developed numerous sterile pepper lines using CMS lines [16].
Because different fertility-segregating populations have been used for genetic analysis, studies have reached different conclusions about the inheritance pattern of the pepper Rf gene. Wang et al. [17] crossed the male sterile line 77013A with 114 doubled haploid lines, scored the fertility of the F1 offspring, and concluded that fertility restoration of pepper CMS was controlled by one major and four minor genes. Wei et al. [18] analyzed the segregation of fertility using an F2 population and proposed that fertility restoration of pepper CMS may be conferred by two pairs of additive-dominant epistatic major genes and additive-dominant polygenes. Gulyas et al. [19] and Ye et al. [20] investigated fertility segregation in F1-F4 and F2 populations, respectively, constructed using male sterile and restorer lines. Their results showed that a pair of dominant nuclear genes controls fertility restoration in CMS.
In early studies, owing to the limitations of molecular biology techniques and the absence of a reference genome, the Rf genes of pepper were only initially mapped. Zhang et al. [21] used the F 2 population to locate the major Rf gene between two randomly amplified polymorphic DNA (RAPD) markers using bulked segregant analysis (BSA), and the genetic distance was 0.37 centimorgans (cM) and 8.12 cM. Wang et al. [17] mapped a major quantitative trait locus (QTL) of fertility restoration to chromosome 6 and four minor QTLs on chromosomes 1, 2, 3, and 5. Kim et al. [22] used BSA-amplified fragment length polymorphism (AFLP) to segregate populations by fertility, constructed a linkage map of the Rf gene, and obtained a cleaved amplified polymorphic sequence (CAPS) marker AFRF8CAPS with a genetic distance of 1.8 cM from the Rf gene. Lee et al. [23] used 205 F 2 individual plants and obtained three CAPS and one sequence characterized amplified region (SCAR) markers that were closely linked to the Rf site of pepper CMS. Yang et al. [24] used the F 2 population to obtain the simple sequence repeat (SSR) molecular marker AF208834 linked to the pepper Rf gene, with a genetic distance of 20.8 cM. Jo et al. [25] screened the pepper BAC library with the petunia Rf gene, cloned the Rf candidate gene PePPR1, and found three markers closely linked to the Rf gene.
Following in-depth research and the publication of the pepper reference genome [26,27], the pepper Rf gene has been mapped to adjacent regions of chromosome 6 in different reports (Table 1). Jo et al. [28] used BSA-AFLP and comparative genomics to locate the Rf gene within an 821 kb segment on chromosome 6 and identified the candidate gene CaPPR6. Zhang et al. [29] narrowed down the region of CaPPR6 to 128.96 kb using kompetitive allele-specific PCR (KASP) markers. Kang et al. [30] mapped an unstable Rf gene, Rfu, using the molecular markers from Jo et al. [28] and newly developed CAPS markers. Ye et al. [20] located the Rf gene between SSR markers pep43 and pep20 on chromosome 6, with a physical distance of 498.6 kb. Wu et al. [31] performed association analysis on 287 pepper lines using specific-locus amplified fragment sequencing (SLAF-seq) and genome-wide association study (GWAS) and identified two candidate Rf genes, Capana06g002967 and Capana06g002969. Cheng et al. [32] used high-density genetic mapping and collinearity analysis to map the Rf gene CaRf to a 270.10 kb region on chromosome 6, and analyzed the candidate gene Capana06g003028. Zhang et al. [33] used high-throughput sequencing combined with BSA to initially map the Rf gene in the F2 population, then mapped the Rf gene to a 148.05 kb segment on chromosome 6, and identified the candidate gene CaRf032 (CA00g82510), which has a base variation in the maintainer line leading to premature termination of transcription of the gene. Wei et al. [34] performed RNA sequencing on fertile and sterile pools constructed from the F2 population and found a candidate gene, NEDD8 (Capana06g002866), on chromosome 6. However, to date, no Rf gene with a clearly defined function has been cloned from pepper CMS lines.
In some studies, CM334 v1.55 was selected as the reference genome; therefore, the flanking markers of the mapping interval were compared on the Zunla-1 V2.0 genome to obtain the corresponding positions. For the markers without corresponding positions, according to the corresponding positions of the annotated genes in the mapping interval on the Zunla-1 V2.0 genome, the approximate locations of the mapping intervals, such as CaPPR6 [28,29], Rfu [30], and CaRf [32] were speculated. The mapping interval of NEDD8 [34] in the original text is inconsistent with the candidate gene position.
In the present study, we constructed an F 2 segregating population and fine-mapped the Rf gene in the pepper line HZ1C. The Rf gene was mapped to a 533.81 kb region between two insertion or deletion (InDel) markers on chromosome 6 of the Zunla-1 V2.0 reference genome [26]. According to DNA sequence alignment, gene expression, and protein structure analysis, Capana06g002968 was proposed as the strongest candidate gene for the Rf gene in pepper line HZ1C.

The Inheritance Analysis of Fertility Restoration in CMS Line HZ1A
HZ1A shows the typical characteristics of male sterility, including no pollen dissemination around its anthers and fewer and shriveled pollen grains [35]. The fertility phenotypes were identified through visible pollen on anthers and Alexander's staining of pollen. All (HZ1A × HZ1C) F1 plants were completely male fertile. The segregation of male-fertile to male-sterile plants in 11 F2 populations of different sizes, which had been planted in the autumn of 2020 and the spring of 2021, conformed to a 3:1 segregation ratio under the χ² criterion (p > 0.05). Overall, 1658 F2 plants were classified as 1277 male-fertile and 381 male-sterile plants, which also conformed to a 3:1 segregation ratio (χ² = 3.50, p = 0.06) (Table 2). These results demonstrated that a single dominant gene, which we designated CaRfHZ, restored male fertility to HZ1A.

BSA and Genetic Linkage Mapping of the CaRfHZ Gene
In total, 226 plants of the (HZ1A × HZ1C) F 2 population, which had been planted in the autumn of 2020, were used to map the CaRfHZ gene according to BSA. DNA from 10 male-fertile plants was mixed to composite the male-fertile DNA bulk, and DNA from 10 male-sterile plants was mixed to composite the male-sterile DNA bulk. In total, 449 pairs of SSR markers [36] evenly distributed on pepper chromosomes were selected to detect the parents: CMS line HZ1A and restorer line HZ1C. Polymorphic markers with clear bands were detected in male-fertile and male-sterile DNA bulks. Finally, the SSR marker P06g8490 was found to be polymorphic among the DNA bulks. This implied that the CaRf HZ gene is close to P06g8490 and is located on chromosome 6. In the 12 Mb region near P06g8490, 11 polymorphic SSR markers among the parents were selected. The above 11 SSR markers and P06g8490 were amplified in 45 male-sterile plants from the (HZ1A × HZ1C) F 2 population. According to the recombinant number and genetic distance between the CaRf HZ and its linked markers (Table 3), CaRf HZ was initially located between the markers P06G8229 and P06G8560 ( Figure 1a). In this region, 6 SSR markers, P06g8264, P06g8490, P06g8494, P06g8497, P06g8536, and P06g8527, were co-segregated with CaRf HZ .

Fine Mapping of the CaRfHZ Gene
A larger population composed of 336 male-sterile plants out of the (HZ1A × HZ1C) F 2 populations, which had been planted in the spring of 2021, was used to narrow down the region of the CaRfHZ gene. More SSR markers and newly developed InDel markers (Table 3) in the primary mapping region were selected for fine mapping. Twenty-eight markers were polymorphic in HZ1A and HZ1C, and were used to map the CaRf HZ gene. The recombinant numbers and genetic distances between the CaRf HZ gene and its linked markers are shown in Table 3. Some of these markers identified the same number of recombinant individuals, such as P06gInDel-46, P06gInDel-48, and P06gInDel-56, and these markers had the same genetic distance from the CaRf HZ gene. However, owing to image size limitations, the markers are not listed in the mapping linkage (Figure 1a,b). The figure shows five F 2 recombinant individuals between the P06G8508 and P06gInDel-95 markers ( Figure 1c). Finally, the CaRf HZ gene was mapped between the markers P06gInDel-66 and P06gInDel-89 with the same genetic distance of 0.15 cM (Figure 1b,c). Three markers, P06G8527, P06gInDel-79, and P06gInDel-81, were co-segregated with the CaRf HZ gene.

Analysis of the Annotation Genes
According to the pepper reference genome sequence (Zunla-1 V2.0) [26], the physical distance between the two markers, P06gInDel-66 and P06gInDel-89, is 533.81 kb (Chr06:215,097,259..215,631,069). In this region, there are four annotated genes (Table 4): Capana06g002965, Capana06g002967, Capana06g002968, and Capana06g002969 according to Zunla-1 V2.0 gene models. Full-length genomic DNA sequences of the four annotated genes were sequenced in HZ1A and HZ1C, and the coding sequences were analyzed (Table 5). There were two missense variations at 129 bp and 504 bp and a synonymous variation at 436 bp in Capana06g002965. There was a missense variation at 935 bp in Capana06g002967. There were two missense variations at 20 bp and 467 bp in Capana06g002968. There were two missense variations at 196 bp and 318 bp and a synonymous variation at 144 bp in Capana06g002969. Differences at the above sites lead to differences in the amino acid sequence (Figure S1), but only the protein structure of Capana06g002969 was different. Due to the variation at 318 bp, the stop codon TGA of Capana06g002969 in HZ1C was changed to TGG in HZ1A, which resulted in 31 more amino acids in HZ1A.
The expression of the four annotated genes in young leaves and three developmental stages (stage 1 (S1), sporogenous tissue to meiotic stage; stage 2 (S2), tetrad to mononuclear stage; stage 3 (S3), mature pollen stage) of anthers was analyzed in HZ1A, HZ1C, and (HZ1A × HZ1C) F 1 using quantitative real-time PCR (qRT-PCR) (Figure 2). The expression of Capana06g002965 in (HZ1A × HZ1C) F 1 was low in young leaves but was not detected in HZ1A and HZ1C. At the S1 stage, the expression level of Capana06g002965 in HZ1A was significantly lower than that in HZ1C and (HZ1A × HZ1C) F 1 . Conversely, the expression level of Capana06g002965 in HZ1A was significantly higher than that in HZ1C and (HZ1A × HZ1C) F 1 at the S2 stage. The expression of Capana06g002965 was not detected in HZ1A, HZ1C, or (HZ1A × HZ1C) F 1 . The expression of Capana06g002967 in HZ1A was not significantly different from that in HZ1C, but was higher than that in (HZ1A × HZ1C) F 1 in young leaves and the S3 stage. The expression trends of Capana06g002967 were similar to Capana06g002965 at the S1 and S2 stages. The expression level of Capana06g002968 in HZ1A was significantly lower than that in HZ1C and (HZ1A × HZ1C) F 1 in young leaves and the S1 stage. At the S2 and S3 stages, the expression levels of Capana06g002968 in HZ1A were significantly higher than those in HZ1C. The expression of Capana06g002969 was not detected in the young leaves of HZ1A, HZ1C, and (HZ1A × HZ1C) F 1 . At the S1, S2, and S3 stages, the expression level of Capana06g002969 in HZ1A was significantly lower than that in HZ1C. In summary, in HZ1A and HZ1C, four genes were differentially expressed at the S1, S2, and S3 developmental stages of anther, with the exception of Capana06g002967 at the S3 stage. The gene expression trends in HZ1C and (HZ1A × HZ1C) F 1 were mostly the same.

Although there were differences in the amino acid sequences and gene expression levels of the four annotated genes, it was not possible to determine which gene was the candidate gene for CaRfHZ based on these differences.

Comparison of CaRfHZ Mapping Interval and the Published Rf Gene Position in Pepper
Thus far, the seven pepper Rf genes have been fine mapped using a forward genetics strategy (Table 1). According to the gene mapping intervals, these seven genes are located in an adjacent region of approximately 7 Mb from 210-217 Mb on chromosome 6, and the mapping results overlapped in some studies [20,31]. The situation in which the mapping results are adjacent is presumed to be related to the similarity of the materials used for mapping or to the possible clustering of pepper Rf genes on chromosome 6. In the present study, the CaRf HZ gene was fine-mapped within a 533.81 kb segment in the adjacent region. The mapping interval of the CaRf HZ gene partially overlaps the mapping region in Rf [20] and is located within the mapping range of Rf (858.26 kb) [31] and NEDD8 (5.11 Mb) [34], which are related to the larger mapping range of the above research results. Similar mapping results proved the accuracy of our results in this study but were not conducive to the discovery of new pepper Rf genes. As pepper Rf genes may exist in clusters on chromosome 6, there may be multiple genes controlling fertility restoration simultaneously. This leads to inconsistent genotypes for the molecular marker Rf, making it difficult to locate the Rf gene within a shorter interval.

Prediction and Characteristics of the Candidate Gene
According to the annotation information of the Zunla-1 V2.0 genome [26], there are four annotated genes in the localization interval of the CaRfHZ gene. All four annotated genes showed sense variations and differences in expression between HZ1A and HZ1C. The candidate gene for CaRf HZ could not be determined based on the above information.
Therefore, variations between HZ1A and HZ1C were detected in nine maintainer and four restorer lines of HZ1A (Table 5), and missense variations were analyzed. In Capana06g002965, missense variations were the same as HZ1A in eight maintainer lines, but they were different from HZ1C in all four restorer lines, and the correlation between missense variations and restored traits was 61.54% in 13 lines. In Capana06g002967, missense variations were the same as HZ1C in all four restorer lines, but they were different from HZ1A in eight maintainer lines, and the correlation between missense variations and restored traits was 38.46% in 13 lines. The detection results of missense variations in Capana06g002968 and Capana06g002969 were similar; the missense variations were the same as HZ1A in eight maintainer lines and were the same as HZ1C in three restorer lines, and the correlation between missense variations and restored traits was 84.62% in 13 lines. Conclusively, the missense variations in Capana06g002965 and Capana06g002967 were not highly correlated with restored traits, while the missense variations in Capana06g002968 and Capana06g002969 were highly correlated with restored traits and may be candidate genes for CaRf HZ . Capana06g002968 was not reported as a candidate gene for pepper Rf in previous studies, but Capana06g002969 was identified as a candidate gene. There was no difference in Capana06g002969 sequence between the CMS and restorer lines in a study by Wu et al. [31], which is different from the findings in the present study.
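The percentages above can be recomputed with a short Python sketch. It assumes, which the text does not state explicitly, that "correlation" means the share of the 13 lines whose missense genotype agrees with the expected fertility class (the HZ1A allele in maintainer lines, the HZ1C allele in restorer lines). The function name is hypothetical, and the per-gene counts are read from the sentences above; for Capana06g002967, one agreeing maintainer line is inferred from "different from HZ1A in eight maintainer lines".

```python
def concordance_percent(maintainers_matching, restorers_matching,
                        n_maintainers=9, n_restorers=4):
    """Percent of lines whose genotype agrees with their fertility class."""
    matching = maintainers_matching + restorers_matching
    return round(100.0 * matching / (n_maintainers + n_restorers), 2)


# (maintainer lines carrying the HZ1A allele, restorer lines carrying the HZ1C allele)
counts = {
    "Capana06g002965": (8, 0),
    "Capana06g002967": (1, 4),  # inferred: 8 of 9 maintainers differ from HZ1A
    "Capana06g002968": (8, 3),
    "Capana06g002969": (8, 3),
}

results = {gene: concordance_percent(m, r) for gene, (m, r) in counts.items()}
for gene, pct in results.items():
    print(f"{gene}: {pct:.2f}%")
```

Under this reading, the sketch reproduces the 61.54%, 38.46%, and 84.62% values given in the text, which suggests the percentages are simple agreement rates rather than a statistical correlation coefficient.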
Using the simple modular architecture research tool (SMART) (https://smart.embl.de/, accessed on 15 March 2022) [37] to analyze protein structures, we found that Capana06g002968 in both HZ1A and HZ1C contained tetratricopeptide repeat (TPR) domains, and that Capana06g002969 in HZ1A, but not in HZ1C, contained an IBR domain. No reports related to plant fertility were found for the IBR domain, but there have been reports on the TPR domain. In rice, silencing the gene ORF3 containing the TPR domain can restore the spikelet fertility of heterozygous plants [38]. The C-terminal TPR and CaMbd domains of the wFKBP73 gene were critical for male fertility in transgenic rice [39]. Overexpression of AtTRP1, which contains a TPR domain, in wild-type Arabidopsis results in dwarf plants and reduced fertility [40]. High expression of TPR-containing genes may therefore affect normal plant fertility. In our study, anther development of HZ1A started abnormally from the S2 stage, and the expression of Capana06g002968 in HZ1C was lower than that in HZ1A in the S2 and S3 stages. This suggests that lower expression of Capana06g002968 may be involved in the restoration of fertility of HZ1A. Therefore, Capana06g002968 was proposed as the strongest candidate gene for CaRfHZ, but this needs to be validated in the future.

Application of the Related Markers of CaRfHZ Gene in Pepper Breeding
Since the discovery of the first pepper CMS by Peterson [15], CMS sterile lines have been used for pepper breeding. The 'three-line' hybrid seed production system consisting of CMS, maintainer, and restorer lines is used in pepper seed production. This method can reduce the link of manual emasculation, thereby reducing the production cost and improving the purity of the hybrid. Several excellent pepper hybrid varieties were obtained by this method in China [41,42]. Fertility stability of the hybrid is an important guarantee for the large-scale promotion of the 'three-line' hybrid, and the Rf gene in the restorer line plays a key role.
Based on the differential variations in Capana06g002968 (467 bp) and Capana06g002969 (318 bp) in HZ1A and HZ1C, we developed two CAPS markers (results not shown). Amplification and enzyme digestion were performed on 13 lines (Table 5), and the results were consistent with the sequencing results. The accuracy of detecting both maintainer and restorer lines was 84.62% (Capana06g002968) and 92.31% (Capana06g002969), but not 100%, which may be related to the existence of other fertility restorer genes. The accuracies of 84.62% and 92.31% were similar to those of previous studies, such as Co1Mod1-CAPS (88.0% and 92.2%, respectively) [28], and CRF3S1S, CRF-SCAR, and Co1Mod1-CAPS markers (77.2%, 79.2%, and 70.3%, respectively) [33]. The newly developed CAPS markers in this study can be used for sterile and restorer line screening, but they need validation in more maintainer and restorer lines.

Plant Materials
The CMS line HZ1A is a pepper line with complete sterility that was obtained by backcrossing for more than 15 generations. The F2 segregating populations (Table 2) were derived from 11 F1 hybrid plants, with HZ1A and HZ1C as the female and male parents, respectively. In total, 226 F2 plants and 1442 F2 plants were planted in the autumn of 2020 and spring of 2021, respectively. Plants of F2 populations were used for inheritance analysis and mapping of the Rf gene CaRfHZ. All plant materials, consisting of HZ1A, HZ1C, maintainer lines, restorer lines, F1 plants, and F2 plants, were grown in a plastic-covered greenhouse at the Zhuanghang Experimental Station of Shanghai Academy of Agricultural Sciences (Shanghai, China).

Pollen Fertility Evaluation
Pollen fertility was evaluated visually with at least three blooming flowers and further discriminated using Alexander's solution staining [43] under a microscope, as described previously [35]. Plants that produced abundant pollen grains were considered male-fertile, whereas plants with no pollen dissemination were considered male-sterile. After Alexander's solution staining, the pollen grains that stained magenta red were classified as male-fertile, and those stained blue-green were classified as male-sterile.

Inheritance Analysis of Fertility Restoration
Segregation of male fertility/sterility was identified based on the fertility of the plants from the (HZ1A × HZ1C) F 2 population. Chi-square goodness-of-fit test was performed using the following equation: where O is the observed frequency, and E is the expected frequency, with p = 0.05 as the threshold.
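The equation itself is not reproduced in this text, so the following Python sketch is only an illustration of a goodness-of-fit test against a 3:1 ratio, not a statement of the exact formula used. It computes the standard Pearson statistic, the sum of (O − E)²/E, and a variant with the Yates continuity correction, using the pooled counts reported in the inheritance analysis (1277 male-fertile and 381 male-sterile plants). The function names are hypothetical, and the p-value uses the closed form for one degree of freedom.

```python
import math


def chi_square_gof(observed, ratio, yates=False):
    """Goodness-of-fit statistic for observed counts against an expected ratio."""
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    stat = 0.0
    for o, e in zip(observed, expected):
        diff = abs(o - e)
        if yates:
            # Yates continuity correction: shrink each |O - E| by 0.5
            diff = max(diff - 0.5, 0.0)
        stat += diff ** 2 / e
    return stat


def p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


observed = [1277, 381]  # male-fertile, male-sterile F2 plants (pooled)
plain = chi_square_gof(observed, [3, 1])
corrected = chi_square_gof(observed, [3, 1], yates=True)
p_corrected = p_value_df1(corrected)

print(f"uncorrected chi2 = {plain:.2f}")
print(f"Yates-corrected chi2 = {corrected:.2f}, p = {p_corrected:.2f}")
```

With these counts the uncorrected statistic is about 3.61 and the continuity-corrected one about 3.50 (p ≈ 0.06), so the reported χ² = 3.50 is consistent with a continuity correction having been applied. Both values are below the 3.84 critical value at p = 0.05 for one degree of freedom.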

Nucleic Acid Extraction
Genomic DNA was extracted from young leaves using a DNA extraction kit (Tiangen, Beijing, China). Anthers of HZ1A, HZ1C, and (HZ1A × HZ1C) F1 were isolated from flower buds at three developmental stages defined in an earlier study [35]: stage 1 (S1), sporogenous tissue to meiotic stage; stage 2 (S2), tetrad to mononuclear stage; and stage 3 (S3), mature pollen stage. The anthers were immediately placed into an RNA stabilization solution (Qiagen, Hilden, Germany). Total RNA was isolated from anthers using a Biospin Plant Total RNA Extraction Kit (Bioer Technology, Hangzhou, China). cDNA was synthesized by reverse transcription using a HiScript II One Step RT-PCR Kit (Vazyme, Nanjing, China). DNA extraction, RNA extraction, and reverse transcription were performed according to the manufacturers' instructions.

SSR and InDel Markers
SSR markers used in the present study were derived from Cheng et al. [36]. SSR markers are numbered according to their position on the chromosome, and chromosome information is added, such as P01g0001. The markers used in the CaRf HZ linkage map are shown in Table S1.
Chromosome 6 sequences of two published pepper genomes, Zunla-1 V2.0 and Chiltepin V2.0, were obtained from the China National GeneBank Sequence Archive (CNSA) (https://db.cngb.org/search/project/CNPhis0000547/, accessed on 25 October 2020). Based on the initial and fine mapping of the CaRfHZ gene, 4 Mb of sequence (Chr06:195,000,000..196,000,000 and Chr06:213,000,000..216,000,000) was selected to search for InDels. The InDel variations between 'Zunla-1' and 'Chiltepin' were detected using MUMmer 3.0 software [44] (Hamburg, Germany), and only variations with a length ≥ 5 bp were retained. The 500 bp sequences before and after each InDel variation were selected, and Primer-BLAST software (Bethesda, MD, USA) was used for primer sequence design. In total, 102 InDel markers were developed, and the information is listed in Table S2. Sangon Biotech (Shanghai, China) synthesized the primers. All other primers used in the present study were designed and synthesized in the same way.
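The screening step described above (keep InDels of at least 5 bp, then take 500 bp on each side as primer-design template) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `InDel` record, both function names, and the example coordinates are hypothetical, and the sketch does not parse real MUMmer output or call Primer-BLAST.

```python
from dataclasses import dataclass


@dataclass
class InDel:
    chrom: str
    start: int   # 1-based first reference base affected by the variant
    end: int     # 1-based last reference base affected (inclusive)
    length: int  # size of the inserted or deleted segment in bp


def select_candidates(indels, min_length=5):
    """Keep only InDels whose length meets the screening threshold."""
    return [v for v in indels if v.length >= min_length]


def flanking_windows(indel, flank=500):
    """Return (left, right) 1-based inclusive windows around the InDel."""
    left = (max(1, indel.start - flank), indel.start - 1)
    right = (indel.end + 1, indel.end + flank)
    return left, right


# Hypothetical variant records, for illustration only
variants = [
    InDel("Chr06", 215_100_000, 215_100_011, 12),
    InDel("Chr06", 215_200_000, 215_200_002, 3),
    InDel("Chr06", 215_300_000, 215_300_004, 5),
]

kept = select_candidates(variants)
windows = [flanking_windows(v) for v in kept]
for v, (left, right) in zip(kept, windows):
    print(v.chrom, v.length, left, right)
```

The two windows returned for each retained InDel give the coordinates from which template sequence would be extracted before primer design.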

BSA and Linkage Analysis of CaRfHZ Gene
The CaRf HZ gene was mapped using BSA. DNA from 10 plants each from male-fertile and male-sterile plants of the (HZ1A × HZ1C) F 2 population were randomly selected and mixed to composite the male-fertile and male-sterile DNA bulks. Then, 449 SSR markers [36], which covered the entire pepper genome, were tested for polymorphism in the parent HZ1A and HZ1C. Thereafter, the polymorphic markers between parents were detected in male-fertile and male-sterile DNA bulks. The polymorphic marker between DNA bulks was recognized as linked to the CaRf HZ gene.
In the present study, the male-sterile plants from the (HZ1A × HZ1C) F 2 populations were used to map the CaRf HZ gene ( Table 2) according to Zhang et al. [45]. The polymorphic markers between DNA bulks and the SSR and InDel markers near this marker were amplified and analyzed in male-sterile plants from the (HZ1A × HZ1C) F 2 population. The recombination rate (c) between the marker and the CaRf HZ gene was calculated using the formula c = (N1 + N2/2)/N, where N is the number of all sterile plants, N1 is the number of plants with homozygous bands from the fertile parent HZ1C, and N2 is the number of plants with heterozygous bands from the two parents [45,46]. Kosambi's formula: d = 1/4 log ((1 + 2c)/(1 − 2c)) was used to calculate the genetic distance (d) [47]. The unit of d is centimorgans (cM). MapDraw v2.1 was used to construct the genetic map [48].
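The two formulas above can be combined in a short Python sketch. It assumes that "log" in Kosambi's formula denotes the natural logarithm and that the result in Morgans is multiplied by 100 to give centimorgans, as in the standard form of the mapping function. The function names and the example counts (one heterozygous plant among 336 sterile plants) are hypothetical and serve only as an illustration.

```python
import math


def recombination_fraction(n_total, n_homozygous_fertile, n_heterozygous):
    """c = (N1 + N2/2) / N, scored in male-sterile F2 plants only."""
    return (n_homozygous_fertile + n_heterozygous / 2.0) / n_total


def kosambi_cm(c):
    """Kosambi map distance in cM for a recombination fraction 0 <= c < 0.5."""
    if not 0.0 <= c < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    # d (Morgans) = 1/4 * ln((1 + 2c) / (1 - 2c)); multiply by 100 for cM
    return 25.0 * math.log((1.0 + 2.0 * c) / (1.0 - 2.0 * c))


# Hypothetical marker: N = 336 sterile plants, N1 = 0, N2 = 1
c = recombination_fraction(336, 0, 1)
d = kosambi_cm(c)
print(f"c = {c:.5f}, d = {d:.2f} cM")
```

Each sterile plant contributes two gametes, so a plant homozygous for the fertile-parent band counts as two recombinant gametes and a heterozygous plant as one, which is why N2 is halved in the numerator. For the illustrative counts the distance rounds to 0.15 cM.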
PCR amplification was performed using a reaction mixture containing the following: 5 µL of 2× Hieff® PCR Master Mix (YEASEN, Shanghai, China), 1 µL of 10 µM PCR upstream primer, 1 µL of 10 µM PCR downstream primer, 20 ng template DNA, and ddH2O up to 10 µL. The PCR cycling program was as follows: pre-denaturation at 94 °C for 2 min; followed by 32 cycles of 94 °C for 30 s, primer annealing temperature for 30 s, and 72 °C for 30 s; final extension at 72 °C for 5 min; cooling to 12 °C. Amplification products were electrophoresed using 8% non-denaturing polyacrylamide gels at 150 V for 1 h. The gels were visualized using 0.1% AgNO3 solution [46].

Candidate Gene Amplification and qRT-PCR
Amplification of the candidate genes of CaRfHZ in the HZ1A and HZ1C parental lines was performed following the manufacturer's instructions using GoldenStar® T6 Super PCR Mix Ver.2 (Tsingke Biotechnology, Beijing, China) in a total reaction volume of 50 µL containing 45 µL of 1.1× GoldenStar® Mix Ver.2, 2 µL of 10 µM upstream primer, 2 µL of 10 µM downstream primer, and 200 ng template DNA. The PCR cycling conditions were as follows: pre-denaturation at 98 °C for 2 min; followed by 32 cycles of 98 °C for 10 s, primer annealing temperature for 15 s, and 72 °C for 40 s; final extension at 72 °C for 5 min, and cooling to 12 °C. The primers used for cloning the candidate genes of CaRfHZ are listed in Table S3. PCR products were sequenced by Sangon Biotech (Shanghai, China). The amino acid sequences of the candidate genes of CaRfHZ were aligned using BioXM 2.7.120 (Nanjing, China).
The qRT-PCR was performed using 2× Hieff UNICON® Universal Blue qPCR SYBR Green Master Mix (YEASEN, Shanghai, China) following the manufacturer's instructions. The PCR reactions were analyzed using the QuantStudio 5 real-time PCR system (Waltham, MA, USA). The β-actin gene from Lv et al. [49] was used as an internal control to normalize expression data. The 2^−ΔΔCt method was used to calculate the relative expression levels. Three biological and three technical replicates were used for each experiment. The primers used for qRT-PCR are listed in Table S4.
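As a brief illustration of the 2^−ΔΔCt calculation, the Python sketch below normalizes a target gene to the internal control (β-actin here) and then to a calibrator sample. The text does not state which sample served as calibrator, so that choice, the function name, and all Ct values are hypothetical. Technical replicates are averaged before the calculation.

```python
from statistics import mean


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Relative expression by the 2^-ddCt method from technical-replicate Cts."""
    # dCt: target minus internal control, within each sample
    d_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    d_ct_calibrator = mean(ct_target_calibrator) - mean(ct_ref_calibrator)
    # ddCt: sample dCt minus calibrator dCt
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values for one biological replicate (three technical replicates each)
fold_change = relative_expression(
    ct_target_sample=[24.1, 23.9, 24.0],      # candidate gene, test sample
    ct_ref_sample=[18.0, 18.1, 17.9],         # internal control, test sample
    ct_target_calibrator=[26.0, 26.1, 25.9],  # candidate gene, calibrator
    ct_ref_calibrator=[18.0, 17.9, 18.1],     # internal control, calibrator
)
print(f"relative expression = {fold_change:.2f}")
```

In this made-up example the test sample has a ΔCt two cycles lower than the calibrator, which corresponds to roughly fourfold higher relative expression. The method assumes near-100% amplification efficiency for both target and control primers.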

Analysis of Candidate Genes
The physical positions of the molecular markers were determined with reference to the pepper reference Zunla-1 V2.0 genome sequence as described above. The annotated genes in the identified target region were obtained according to the Zunla-1 V2.0 gene models [26]. The protein structures of annotated genes were predicted by SMART (https://smart.embl.de/, accessed on 15 March 2022) [37].

Conclusions
To summarize, the male fertility of the pepper CMS line HZ1A could be restored by a single dominant gene, CaRfHZ. The Rf gene, CaRfHZ, was fine-mapped to a 533.81 kb region on chromosome 6 using BSA. Four annotated genes were analyzed according to Zunla-1 V2.0 gene models. Based on the comparison of genomic DNA sequences, qRT-PCR, and protein structures in HZ1A and its maintainer and restorer lines, Capana06g002968 was proposed as the strongest candidate gene for the CaRfHZ gene. Functional verification of the CaRfHZ gene needs to be performed in the future.