Genetic and Evolutionary Analysis of Porcine Deltacoronavirus in Guangxi Province, Southern China, from 2020 to 2023

Porcine deltacoronavirus (PDCoV) has shown large-scale global spread since its discovery in Hong Kong in 2012. In this study, a total of 4897 diarrheal fecal samples were collected from the Guangxi province of China from 2020 to 2023 and tested using RT-qPCR. In total, 362 (362/4897, 7.39%) of samples were positive for PDCoV. The S, M, and N gene sequences were obtained from 34 positive samples after amplification and sequencing. These PDCoV gene sequences, together with other PDCoV S gene reference sequences from China and other countries, were analyzed. Phylogenetic analysis revealed that the Chinese PDCoV strains have diverged in recent years. Bayesian analysis revealed that the new China 1.3 lineage began to diverge in 2012. Comparing the amino acids of the China 1.3 lineage with those of other lineages, the China 1.3 lineage showed variations of mutations, deletions, and insertions, and some variations demonstrated the same as or similar to those of the China 1.2 lineage. In addition, recombination analysis revealed interlineage recombination in CHGX-MT505459-2019 and CHGX-MT505449-2017 strains from Guangxi province. In summary, the results provide new information on the prevalence and evolution of PDCoV in Guangxi province in southern China, which will facilitate better comprehension and prevention of PDCoV.


Introduction
Porcine deltacoronavirus (PDCoV) causes a gastrointestinal infectious disease characterized by fever, vomiting, watery diarrhea, dehydration, and loss of appetite in pigs of different ages, especially piglets [1].The autopsy of piglets that die of the disease shows pathological changes such as blunting and atrophic changes of the intestinal villi and necrosis of the villous cells [2].These symptoms are similar to those of other viruses that cause gastrointestinal manifestations in pigs, and it is difficult to distinguish them only based on the clinical signs and gross pathological changes [3,4].
PDCoV, which belongs to the genus Deltacoronavirus in the family Coronaviridae of order Nidovirales, is a single-stranded, positive-sense RNA virus [5].The virus was first identified and named HKU15 coronavirus in Hong Kong, China, in 2012 [6], but it was first confirmed to be the pathogenic virus of porcine diarrheal disease in the United States in 2014.Since then, PDCoV has been confirmed in many countries such as Thailand [7], Vietnam [8], Japan [9], Korea [10], Canada [11], and Mexico [12], which seriously endangered the health of pigs worldwide [13].Notably, the blood from three children in Haiti's schools during Finally, 362 specimens were tested positive for PDCoV using the multiplex RT-qPCR [23], which were distributed in different regions of Guangxi province (Figure 1).They were further used to finish the following experiments.Finally, 362 specimens were tested positive for PDCoV using the multiplex RT-qP [23], which were distributed in different regions of Guangxi province (Figure 1).Th were further used to finish the following experiments.

Amplification and Sequencing
Of the 362 PDCoV-positive specimens using multiplex RT-qPCR [23], 34 specim were selected based on different regions and different pig farms in Guangxi province, d ferent seasons of the year, and having Ct values less than 25 cycles.They were used amplify the S (3480 bp), M (654 bp), and N (1029 bp) genes of PDCoV by RT-PCR us the amplification primers (Table 1).The total nucleic acids were extracted using MiniBE Viral RNA/DNA Extraction Kit Ver.

Amplification and Sequencing
Of the 362 PDCoV-positive specimens using multiplex RT-qPCR [23], 34 specimens were selected based on different regions and different pig farms in Guangxi province, different seasons of the year, and having Ct values less than 25 cycles.They were used to amplify the S (3480 bp), M (654 bp), and N (1029 bp) genes of PDCoV by RT-PCR using the amplification primers (Table 1).The total nucleic acids were extracted using MiniBEST Viral RNA/DNA Extraction Kit Ver.C for 20-24 h, and the plasmids were extracted using MiniBEST Plasmid Purification Kit Ver.4.0 (TaKaRa, Dalian, China; Cat No. 9760) and sent to IGE biotechnology LTD (Guangzhou, China) for sequencing.After splicing different gene fragment sequences, the complete S and N gene sequences were obtained.The obtained gene sequences were validated by BLAST analysis at the National Center for Biotechnology Information (NCBI) "https://blast.ncbi.nlm.nih.gov/Blast.cgi(accessed on 18 December 2023)".Finally, a total of 34 PDCoV S, M, and N gene sequences were obtained, respectively (Supplementary Tables S1-S3).
Table 1.Primers for amplification of the PDCoV S, M, and N genes.

Sequence Comparison and Phylogenetic Analysis
The S, M, and N gene sequences of PDCoV from Guangxi province published in NCBI "https://www.ncbi.nlm.nih.gov/nucleotide/(accessed on 18 December 2023)" were downloaded.Finally, 37 S gene sequences, 32 M gene sequences, and 34 N gene sequences were collected from NCBI (Supplementary Tables S1-S3).The other S, M, and N gene sequences from other provinces in China and from other countries such as the United States, Republic of Korea, Vietnam, Laos, Japan, and Thailand, were also collected from NCBI.The obtained gene sequences in this study, together with all the collected sequences from NCBI (Supplementary Tables S1-S3), i.e., 217 S, 200 M, and 202 N gene sequences, were analyzed using the Clustal W algorithm of DNAstar 7.0 software "https://www.dnastar.com/software/ (accessed on 18 December 2023)".The phylogenetic trees were constructed using the maximum likelihood method of MEGA X software "https://www.megasoftware.net/archived_version_active_download (accessed on 18 December 2023)", which was optimized through the online website Interactive Tree of Life (iTOL) "https://itol.embl.de/(accessed on 18 December 2023)" [24].

Bayesian Temporal Dynamics Analysis
The 71 PDCoV S gene sequences from Guangxi province were compared using MEGA X software, the maximum likelihood tree was reconstructed using IQ-TREE software "http://iqtree.cibiv.univie.ac.at/ (accessed on 18 December 2023)", the TN+F+R3 model was implemented by 1000 bootstrap.In order to determine the temporal structure, it was verified by regression of root-tip genetic distance of the sequences using TempEst software [25,26].Bayesian Markov chain Monte Carlo (MCMC) was chosen to infer the dispersion time of the PDCoV S gene in BEAST v1.10.4 "http://beast.community/(accessed on 18 December 2023)", and a suitable replacement model was calculated through the ModelFinder option of the PhyloSuite software [27].For the S gene, the final choice was used with the relaxed molecular clock, TN93+F+G4 replacement model, and Bayesian skyline model replacement and after which the MCMC was run in parallel on the 3 chains for 200 million steps with 10% aging.The post-run data was visualized through Tracer software, and only data with all parameters ESS > 200 was considered to be valid.The data was calculated through TreeAnnotator v1.10.4.The MCMC tree was annotated after aging by 10% and visualized using FigTree version 1.4.4 "http://beast.community/(accessed on 18 December 2023)".

Genome Recombination Events Analysis
Recombination analysis was performed on all S gene sequences using the Recombination Detection Program (RDP4) software, which was downloaded from the public website "http://www.bioinf.manchester.ac.uk/recombination/programs (accessed on 18 December 2023)" with a window size of 500 bp and a p-value of 0.05.Seven algorithms were selected for recombination analysis according to the recommendations of the RDP manual, including RDP, GENECONY, BootScan, MaxChi, Chimaera, SiScan, and 3Seq in RDP4 software.Sequences were considered to be potentially recombinant only if at least six algorithms supported them; for the potentially recombinant sequences, it was verified using the SimPlot software "https://github.com/Stephane-S/Simplot_PlusPlus(accessed on 18 December 2023)".

Phylogenetic Analysis Basing on S Gene Sequences
To understand the phylogenetic information of the S gene, a phylogenetic tree was generated based on the information of a total of 217 gene sequences (Supplementary Table S1), including 34 S gene sequences obtained in this study, 37 S gene sequences from Guangxi province and 146 S gene sequences from other Chinese provinces and other countries downloaded from GenBank in NCBI (Figure 2).Since there is no standard method for systematically subgrouping PDCoV at present, the subgroups of the genetic evolutionary tree were categorized in a previous report [28].All the sequences could be classified into different subgroups relating to the source of different countries/regions, i.e., the prototype (early China) lineage, the USA lineage, the Southeast Asia lineage, the China 1.1 lineage, the China 1.2 lineage, and the China 1.3 lineage.The homology of the S gene in each lineage in China is shown in Table 2. Most sequences from Guangxi province were distributed in the subgroups from China, but some other sequences were distributed in the subgroup from Southeast Asia.In particular, the 34 S gene sequences obtained in this study, together with the sequences from other Chinese provinces such as Hunan, Shandong, Hong Kong, and other provinces in recent years, were located in a newly discovered lineage in this study, i.e., the China 1.3 lineage.

Phylogenetic Analysis Basing on M Gene Sequences
The phylogenetic tree was generated based on 200 M gene sequences (Supplementary Table S2), including 34 sequences obtained in this study, 32 sequences from Guangxi province, and 134 reference sequences from other Chinese provinces and other countries downloaded from GenBank in NCBI (Figure 3).All the M gene sequences were distributed into the Chinese branch, the American branch, and the Southeast Asian branch, of which the Chinese sequences had a wide distribution.All the M gene sequences obtained in this study were located in the Chinese subgroups.

Phylogenetic Analysis Basing on M Gene Sequences
The phylogenetic tree was generated based on 200 M gene sequences (Supplementary Table S2), including 34 sequences obtained in this study, 32 sequences from Guangxi province, and 134 reference sequences from other Chinese provinces and other countries downloaded from GenBank in NCBI (Figure 3).All the M gene sequences were distributed into the Chinese branch, the American branch, and the Southeast Asian branch, of which the Chinese sequences had a wide distribution.All the M gene sequences obtained in this study were located in the Chinese subgroups.

Phylogenetic Analysis Basing on N Gene Sequences.
The phylogenetic tree was generated based on 202 N gene sequences (Supplementary Table S3), including 34 sequences obtained in this study, 34 sequences from Guangxi province, and 134 reference sequences from other Chinese provinces and other countries downloaded from GenBank in NCBI (Figure 4).Three Vietnamese strains were located in the Chinese subgroups and were closer to the sequences from Guangxi province, includ-

Phylogenetic Analysis Basing on N Gene Sequences
The phylogenetic tree was generated based on 202 N gene sequences (Supplementary Table S3), including 34 sequences obtained in this study, 34 sequences from Guangxi province, and 134 reference sequences from other Chinese provinces and other countries downloaded from GenBank in NCBI (Figure 4).Three Vietnamese strains were located in the Chinese subgroups and were closer to the sequences from Guangxi province, including those obtained in this study.

Phylogenetic Analysis Basing on N Gene Sequences.
The phylogenetic tree was generated based on 202 N gene sequences (Supplementary Table S3), including 34 sequences obtained in this study, 34 sequences from Guangxi province, and 134 reference sequences from other Chinese provinces and other countries downloaded from GenBank in NCBI (Figure 4).Three Vietnamese strains were located in the Chinese subgroups and were closer to the sequences from Guangxi province, including those obtained in this study.

Bayesian Temporal Dynamics Analysis
The phylogenetic tree was consistent with the constructed MCC tree, which is based on the PDCoV S gene with the PDCoV temporal scale (Figure 5).The MCC tree indicated that all PDCoV strains were divided into five genotypes, i.e., China 1.1,China 1.2,China 1.3, USA, and Southeast Asia branches (Figure 5a), and the branch of China 1.3 lineage had a later differentiation time than the others, and may be a newly appearing branch in recent years.The Bayesian skyline (Figure 5b) showed the practical population size map of PDCoV transmission in Guangxi province in recent years.After PDCoV was discovered in Hong Kong in 2012, the large-scale transmission of this virus also occurred in Guangxi province, peaked in 2015, then weakened until 2020, and kept a stable trend onwards.The fluctuation of the adequate population size of the Bayesian skyline was similar to a previous report [29].

Recombination Analysis of S Gene
The RDP4 software was used for recombination event analysis of the S gene (Figure 6), and two strains from Guangxi province showed recombination signals.The CHGX-MT505459-2019 strain originated from the recombination that occurred between the CHGX-MT505446-2017 and CHGX-MT505452-2018 strains.Among them, the CHGX-MT505446-2017 strain was the primary parent with 98.9% similarity, and the CHGX-MT505452-2018 strain was the minor parent with 99.4% similarity, and the potential breakpoints were found in the region of 586 nt-1390 nt.The other strain, CHGX-MT505449-2017, was derived from the CHGX-MN173782-2016 strain [30] and a recombination event between the CHGX-MT505447-2017 strain, with the CHGX-MN173782-2016 strain as the primary parent with 99.5% similarity and the CHGX-MT505447-2017 strain as the secondary parent with 100% similarity, and the potential breakpoints were found in the region of 1375 nt-2055 nt.

Recombination Analysis of S Gene
The RDP4 software was used for recombination event analysis of the S gene (Figure 6), and two strains from Guangxi province showed recombination signals.The CHGX-MT505459-2019 strain originated from the recombination that occurred between the CHGX-MT505446-2017 and CHGX-MT505452-2018 strains.Among them, the CHGX-MT505446-2017 strain was the primary parent with 98.9% similarity, and the CHGX-MT505452-2018 strain was the minor parent with 99.4% similarity, and the potential breakpoints were found in the region of 586 nt-1390 nt.The other strain, CHGX-MT505449-2017, was derived from the CHGX-MN173782-2016 strain [30] and a recombination event between the CHGX-MT505447-2017 strain, with the CHGX-MN173782-2016 strain as the primary parent with 99.5% similarity and the CHGX-MT505447-2017 strain as the secondary parent with 100% similarity, and the potential breakpoints were found in the region of 1375 nt-2055 nt.

Genetic Evolution Rates of S, M, and N Genes
The genetic evolution rate of PDCoV S, M, and N genes in Guangxi province was analyzed using BEAST software "http://beast.community/(accessed on 18 December 2023)" (Table 3).The S, M, and N genes had evolutionary rates of 1.907 × 10 −3 , 8.321 × 10 −4 , and 1.135 × 10 −3 substitutions/site/year, respectively, indicating that the S gene had the highest evolution rate.

Genetic Evolution Rates of S, M, and N Genes
The genetic evolution rate of PDCoV S, M, and N genes in Guangxi province w analyzed using BEAST software "http://beast.community/(accessed on 18 Decemb 2023)" (Table 3).The S, M, and N genes had evolutionary rates of 1.907 × 10 −3 , 8.321 × 1 and 1.135 × 10 −3 substitutions/site/year, respectively, indicating that the S gene had highest evolution rate.
Table 3.The genetic evolutionary rates of PDCoV S, M, and N genes in Guangxi province.

Deduence Analysis of the S Protein
To know more about the genetic characteristics of the China 1.3 lineage, all the amino acid sequences of this branch and some amino acid sequences of other branches were put into BioEdit software "https://www.bioedit.com/(accessed on 18 December 2023)" for comparison and analysis.The sequence of the earliest PDCoV strain CHAN-KP757890-2004 (GenBank accession no.KP757890), was used as the primary reference strain (Figure 7).The sequences of China 1.3 lineage were compared with other branches, and the mutations, deletions, and insertions were found in this branch.Interestingly, many mutations in this branch are identical or similar to those in the China 1.2 lineage, including mutations at positions S28T and Y31H/N, and deletion at position 44, and these variations were consistent in all the strains in the China 1.3 lineage.In addition, the strains showed the mutations at positions D114N, S166A, T726N, and R786K, the insertions at position T178, and the deletions at positions 165, 180, 410, 465, 488, 489, 618, and 643, and these variations were same as or similar to those in the China 1.2 lineage (Figure 7).

Discussion
PDCoV was first discovered in Hong Kong in 2012 and has been reported in many countries around the world [7][8][9][10][11][12][13]. PDCoV can infect pigs of all ages, especially piglets [1,5,6].In recent years, the prevalent situations and evolution characterizations of PDCoV in different provinces in China have been reported [31][32][33][34], but those of Guangxi province have not been studied in detail until now.Guangxi raises approximately 60 million pigs annually, ranking as the sixth among the 34 provinces in China.It is necessary to understand the genetic and evolutionary characteristics of PDCoV in Guangxi province.
A total of 4897 fecal specimens from Guangxi province from October 2020 to October 2023 were tested using multiplex RT-qPCR [23].A total of 362 specimens (7.39%, 362/4897) were positive for PDCoV, and the positivity rates decreased from 14.85% in 2020 to 1.26% in 2023.This indicated that the positivity rates of PDCoV in the clinical specimens decreased gradually after 2020, which may be attributed to strict biosecurity and effective measures performed in different pig farms since the outbreak of African swine fever (ASF) in China in 2018 [35,36], even if there no legal vaccine on the Chinese market for PDCoV until now.
Since S [15,16,37], M [21,22], and N [19,20] genes play important roles in viral replication, pathogenicity, pathogenesis, and immune response, S, M, and N genes are usually used as the targeted genes to perform a study on molecular epidemiology.PDCoV strains from different countries are usually categorized into four lineages based on the S gene sequence, including China, Early China, the United States, and Southeast Asia lineages [28,31,38,39], indicating the high genetic diversity of circulating strains of PDCoV.It has been reported that the geographic transmission patterns of the lineages in different subgroups are different, and more events of intralineage and interlineage recombination leading to higher viral genetic diversity have been found in the Chinese lineages [31].In this study, the S, M, and N gene sequences obtained in Guangxi province during 2020-2023 showed 97.1-99.6%,97.7-100%, and 96.1-100% homology to the reference sequences, respectively, indicating variations in the S gene were higher than those in the M and N genes.In addition, the genetic evolution rate of the S gene (1.907 × 10 −3 substitutions/site/year) is faster than those of M (8.321 × 10 −4 substitutions/site/year), and N (1.135 × 10 −3 substitutions/site/year) genes.These genetic evolution rates were similar to the results of the previous reports [28,31].
The genetic evolution tree of the S gene revealed a new branch in the Chinese genealogy, namely, the China 1.3 lineage.Some mutations, deletions, and insertions in the China 1.3 lineage were similar to those in the China 1.2 lineage.It is noteworthy that some of the sequences from Guangxi province were not distributed in the same branch as those from other countries.The M and N genes are relatively conserved.The genetic evolution tree based on M and N genes revealed that some sequences in Guangxi province were distributed in the same branch with some Vietnamese strains in the Southeast Asian lineage.This may be related to the geographic location of Guangxi province, which is located in southern China and adjacent to Vietnam, and acted as an essential hub for exchanges between China and Southeast Asian countries.The legal and illegal personnel exchanges and goods trade between two countries are extremely frequent, which might help the virus to spread between these countries.
The Bayesian analysis indicated that the China 1.3 lineage diverged later and might be a newly emerged branch.Through the Bayesian skyline analysis, the PDCoV in Guangxi province has also appeared to be massively spread since it was discovered in Hong Kong in 2012, reached a peak in 2015, then showed a steady downward trend, and was followed by a slight increase in 2020, showing a steady trend in recent years.These results had similarities with the results of the Bayesian skyline performance mapped in a previous report for Asian PDCoV strains [29].
Recombination and variation usually contribute to the viral high genetic diversity [31,40].A typical pattern of genetic evolution in coronaviruses is the occurrence of interspecies recombination [41].Recombinants between different PDCoV strains have been reported in previous reports [31,42,43].Therefore, the recombination events of all the S gene sequences from Guangxi province were analyzed in this study, and two strains with recombination signals were found.The CHGX-MT505459-2019 strain and the CHGX-MT505449-2017 strain, which were also previously obtained in our laboratory.Furthermore, no recombination signal was found from the sequences obtained from Guangxi province in this study.In addition, mutation, deletion, or insertion in the coronavirus S gene increases the genetic diversity and even changes the viral virulence [44,45].The PDCoV S gene sequences from Guangxi province, which were distributed in the China 1.3 lineage, indicated the same or similar variation to those in the China 1.2 lineage.Most strains in the China 1.2 lineage came from Sichuan [45] and Shandong [33] provinces, while most strains from other provinces in China were distributed in the China 1.1 lineage according to the genetic evolution tree.The high genetic diversity of the circulating PDCoV strains is one of the key factors that need to be considered for effective prevention and control of this disease.

Conclusions
PDCoV showed a positivity rate of 7.39% in the clinical specimens during 2020-2023 in Guangxi province of southern China.A new branch, i.e., the China 1.3 lineage, was found according to the phylogenetic analysis based on S gene sequences and showed unique and consistent mutations at positions S28T, Y31H/N, and deletion at position 44 in all the strains.The recombination events were found in the circulating PDCoV strains in Guangxi province.These results provide more information on the prevalence and evolution of PDCoV in Guangxi province in southern China, which will facilitate better comprehension and prevention of PDCoV.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/microorganisms12020416/s1,Table S1: The information on the S gene of PDCoV strains used in this study.Table S2: The information on the M gene of PDCoV strains used in this study.Table S3: The information on the N gene of PDCoV strains used in this study.Informed Consent Statement: Written informed consent was obtained to use the clinical samples from the animal's owners.

Figure 1 .
Figure 1.PDCoV distribution in Guangxi province.The positivity rates of PDCoV are marked different regions.

Figure 1 .
Figure 1.PDCoV distribution in Guangxi province.The positivity rates of PDCoV are marked for different regions.

Figure 2 .
Figure 2. The phylogenetic tree based on PDCoV S gene nucleotide sequences.The TN93+G+I model is selected using the MEGA Ⅹ.The sequences from Guangxi province are bolded, and the sequences obtained in this study are marked with red circles.

Figure 2 .
Figure 2. The phylogenetic tree based on PDCoV S gene nucleotide sequences.The TN93+G+I model is selected using the MEGA X.The sequences from Guangxi province are bolded, and the sequences obtained in this study are marked with red circles.

Microorganisms 2024 , 20 Figure 3 .
Figure 3.The phylogenetic tree based on PDCoV M gene nucleotide sequences.The K2+G model is selected using MEGA Ⅹ.The sequences from Guangxi province are bolded, and the sequences obtained in this study are marked with red circles.

Figure 3 .
Figure 3.The phylogenetic tree based on PDCoV M gene nucleotide sequences.The K2+G model is selected using MEGA X.The sequences from Guangxi province are bolded, and the sequences obtained in this study are marked with red circles.

Figure 3 .
Figure 3.The phylogenetic tree based on PDCoV M gene nucleotide sequences.The K2+G model is selected using MEGA Ⅹ.The sequences from Guangxi province are bolded, and the sequences obtained in this study are marked with red circles.

Figure 4 .
Figure 4.The phylogenetic tree based on PDCoV N gene nucleotide sequences.The K2+G+I model is selected using MEGA X.The sequences from Guangxi province are bolded, and the sequences obtained in this study are marked with red circles.

Figure 5 .
Figure 5. (a) The maximum clade credibility (MCC) tree generated by analyzing PDCoV S gene nucleotide sequences.Two subgroups are labeled using blue and orange, and the sequences obtained in Guangxi province are labeled using red.(b) Bayesian skyline of PDCoV S gene sequences in Guangxi province.The dark blue line indicates the average of the genetic diversity, and the light blue shading indicates a 95% confidence interval.

Figure 5 .
Figure 5. (a) The maximum clade credibility (MCC) tree generated by analyzing PDCoV S gene nucleotide sequences.Two subgroups are labeled using blue and orange, and the sequences obtained in Guangxi province are labeled using red.(b) Bayesian skyline of PDCoV S gene sequences in Guangxi province.The dark blue line indicates the average of the genetic diversity, and the light blue shading indicates a 95% confidence interval.

Figure 7 .
Figure 7.The amino acid comparison of PDCoV S gene.The images have been labeled according to the branches of the genetic evolutionary tree, and the red boxes show the region where the China 1.3 lineage is different from the other branches.(a-d) represent the order of the sequence diagram.

Figure 7 .
Figure 7.The amino acid comparison of PDCoV S gene.The images have been labeled according to the branches of the genetic evolutionary tree, and the red boxes show the region where the China 1.3 lineage is different from the other branches.(a-d) represent the order of the sequence diagram.

Author Contributions:
Methodology and writing-original draft preparation: B.L. and Y.G.; investigation, and validation: Y.S. and Y.M.; software, and data curation: F.L.; visualization: Y.Y.; project administration: S.F.; funding acquisition, and writing-review and editing: K.S. and W.S. All authors have read and agreed to the published version of the manuscript.Funding: This study was supported by the Key Research and Development Program (No. AB21238003) of the Guangxi Science and Technology Bureau, China.Institutional Review Board Statement: This study was approved by the Guangxi Center for Animal Disease Control and Prevention (CADC) (No. 2020-A-02).

Table 2 .
Homology of different PDCoV branches in China.

Table 3 .
The genetic evolutionary rates of PDCoV S, M, and N genes in Guangxi province.