Next-Generation Sequencing Reveals Four Novel Viruses Associated with Calf Diarrhea

Calf diarrhea is one of the common diseases involved in the process of calf feeding. In this study, a sample of calf diarrhea that tested positive for bovine coronavirus and bovine astrovirus was subjected to high-throughput sequencing. The reassembly revealed the complete genomes of bovine norovirus, bovine astrovirus, bovine kobuvirus, and the S gene of bovine coronavirus. Phylogenetic analysis showed that the ORF2 region of bovine astrovirus had the lowest similarity with other strains and gathered in the Mamastrovirus unclassified genogroup, suggesting a new serotype/genotype could appear. Compared with the most closely related strain, there are six amino acid mutation sites in the S gene of bovine coronavirus, most of which are located in the S1 subunit region. The bovine norovirus identified in our study was BNoV-GIII 2, based on the VP1 sequences. The bovine kobuvirus is distributed in the Aichi virus B genus; the P1 gene shows as highly variable, while the 3D gene is highly conserved. These findings enriched our knowledge of the viruses in the role of calf diarrhea, and help to develop an effective strategy for disease prevention and control.


Introduction
Calf diarrhea is a common gastrointestinal disease in calves, and most cases occur at less than 1 month of age. Calf diarrhea often occurs due to environmental contamination with pathogens, and it is transmitted via fecaoral transmission routes. The main clinical features include diarrhea, dehydration, and foul-smelling feces. Morbidity and mortality are high, and the incidence of calf diarrhea was higher when cattle were raised in groups [1]. Prior studies have implicated several viruses in calf diarrhea, and rotavirus is the main viral causative agent of diarrhea in calves worldwide [2]. Moreover, several other viruses, including bovine viral diarrhea virus (BVDV), bovine adenovirus (BAdV), bovine torovirus (BToV), bovine parvovirus (BPV), bovine coronavirus (BCoV), bovine astrovirus (BoAstV), bovine kobuvirus (BKoV), and bovine norovirus (BNoV), have been recognized as the causative agents of diarrhea [3][4][5][6][7]. The economic impact caused by this condition is significant, although many new intervention strategies, such as vaccines, medications, and herd management, have been developed and implemented to minimize economic losses [8].
A metagenomic approach was used to detect all potential pathogens in a sample, this method has great potential utility in the diagnosis of infectious disease [9,10]. The traditional diagnostic techniques in laboratories include viral culture and the molecular identification of viral nucleic acids, most commonly via PCR/RT-PCR. Most molecular assays target only a limited number of pathogens using specific primers or probes, while metagenomic approaches characterize all DNA or RNA present in a sample, enabling the analysis of an entire genome. The metagenomic approach for clinical applications derives its roots from the use of microarrays in the early 2000s [11,12]; these methods have been widely used to identify infections in ancient remains [13], characterize the animal virome in both healthy and diseased states [14,15], and discover novel viral pathogens [16].
In February 2021, calf diarrhea of unknown pathogen broke out on a cattle farm in Shijiazhuang, Hebei province, China. We collected 63 cases of diarrheal calf stool and sent it to the laboratory for pathogen detection. Several diagnostic methods were performed, including the RT-PCR assay, to detect BVDV, BCoV, BoAstV, and BRoV. The results show that there are a large number of mixed infections, and, as we want to further explore the pathogens of mixed infections, the metagenomics were used to detect the unknown pathogens.

Next Generation Sequencing (NGS)
One RNA sample positive for both BCoV and BoAstV was selected and sent to Shanghai Tanpu Biotechnology Co., Ltd. (Shanghai, China) for next-generation sequencing to obtain the sequences of these two viruses and to explore whether there were other viruses that may cause calf diarrhea. We used Illumina sequencing technology to complete the genome amplification and in-depth sequencing of the virus strain. Subsequent DNase treatment and cleanup was followed by second-strand synthesis before library preparation using Nextera XT reagents and sequencing on the NovaSeq 6000 (Illumina). Read quality trimming was performed using the Skewer, with an additional trimming filter for unreliable sequences after a user-specified quality score. Host read subtraction by read-mapping was performed with the BWASW program against ribosomal RNAs (16,18,23,28, 5S, and internal transcribed spacers rRNA were retrieved from https://www.ncbi.nlm.nih.gov/, accessed on 15 March 2021), bacterial genome sequences, and the latest host organism genome sequences. We then used SPAdes and MEGAHIT software to de novo assemble the reads obtained after removal of the above-mentioned contamination sequence. The de novo assembly followed the A5-miseq pipeline. The final scaffolds were subjected to bwasw read mapping and a mega blast homology search against the NCBI NT database.

Phylogenetic and Genome Analysis
We used the ORF Finder (https://www.ncbi.nlm.nih.gov/, accessed on 12 May 2021) to predict open reading frames (ORFs) and their derived amino acid sequences. This method uses Clustal W software MEGA7.0 until genomic nucleotide sequences are aligned. Using the neighbor-joining method (neighbor-joining method, NJ), a phylogenetic tree was constructed, and an exhibit values (on Bootstrap) test was set to repeat 1000 times. At the same time, MegAlign software in the DNAstar software package was used to analyze the homology between these nucleotide sequences.

Identification of BCoV and BoAstV Infection in Feces Samples by RT-PCR
The results showed that, in 48 samples out of 63 samples, at least one pathogen was detected, and the BCoV positive detection rate was the highest at 48.39%, followed by BRoV (26.98%), BoAstV (25.81%), and BVDV (19.05%). There are six types of mixed infections, BVDV + BCoV, BVDV + BoAstV, BCoV + BRoV, BCoV + BoAstV, BVDV + BCoV + BoAstV, BCoV + BRoV + BoAstV, and the infection rate is as high as 46.03% (Table 1). In addition, we detected two different BCoV sequences. BCoV and BoAstV were detected in the mixed infection sample numbered 210,041, therefore the high-quality total RNA was subsequently used for the following deep sequencing.

Viral Metagenomics
We extracted RNA from stool samples that tested positive for BoAstV and BCoV pathogens and then processed them for viral metagenomics using the Illumina HiSeq platform. The largest portion of the reads were assigned to the families Picornaviridae (n = 2992), Astroviridae (n = 819), and Caliciviridae (n = 187). We then generated complete or partial genome sequences for a subset of these viruses (Table 2).

Discovery and Analysis of BNoV
BNoV/CN/HB-SJZ/2021 genomic RNA consists of 7316 nucleotides. The BNoV genome contains three sequential ORFs starting from the 22 nt: ORF1, 5055 (22-5076) nt; ORF2, 1569 (5063-6631) nt; ORF3, 849 (6423-7271) nt, all of which encode a nonstructural polyprotein and structural VP1 and VP2 proteins, respectively. The G + C content of our strain genome was 57.50%, similar to that of the B309 (56.96%) and Newbury2 (56.71%) strains. Homology analysis shows that BNoV/CN/HB-SJZ/2021 and the NoravusGIII virus (MN480761) have 93.4% nucleotide characteristics (Table 3). Phylogenetic tree analysis shows the closest relationship with MN480761 (Figure 1a), as well as genetic evolution analysis, revealing that the VP1 protein is located in the same branch as the GIII.2 subtype (Figure 1b). Further analysis of the genetic evolutionary tree shows that the RdRp region of the virus also has the closest genetic relationship with MN480761 ( Figure 1c).

Discovery and Analysis of BKoV
The BKoV we obtained by sequencing contained 8376 base genomes. The BKoV/CN/HB-SJZ/2021 genome contained a large ORF, which encoded potential polyprotein precursors of 2568 amino acids. Reads were assembled into a complete genome contig that shares 88.4% nucleotide identity with the BKoV reference genome (MN336260), and genetic evolution analysis shows that the BKoV/CN/HB-SJZ/2021 belongs to the Aichi virus B genus ( Figure 3). The 2C region of BKoV is relatively conserved, with nucleotide similarity reaching 62.7-89.2%. This region contains a highly conserved amino acid motif, GXXGXGKT, which is also recognized as the nucleic acid binding region of microRNA virus helicase. Therefore, 2C protein may be related to virus replication and nucleic acid uncoiling. There are three highly conserved amino acid motifs in the 3D region which encode the virus RNA polymerase. The nucleotide sequence similarity of the 3D region of the crest virus is the highest at 66.7-93.0% (Table 5).

Analysis of the Coronavirus Genome
After BLAST alignment, we assembled the complete S gene sequence of BCoV through sequencing, and a sequence of 4092 nt length was obtained, encoding 1363 amino acid. The BCoV S genes from different parts of the world were selected as reference sequences. The complete phylogenetic tree of the S gene showed that, compared with the isolates from Vietnam and Cuba, the isolates formed a clear clade. In addition, BCoV-S/CN/HB-SJZ/2021 sequences closely related to BCoV/CH/HB-BD/2019 (MK903506) (Figure 4), which encodes the S protein, were identified (99.6%). It shared a anucleotide sequence higher than 99.1% with the Chinese strain, 98.4% similarity with the Vietnamese strain, and only 93.7% homology with the human coronavirus (Table 6). Sequence alignment shows that there are 17 base mutations with MK903506, and 6 mutations in encoded amino acids.

Discussion
Viral gastroenteritis remains as an important cause of morbidity and mortality in neonatal calves [8]. A considerable number of viruses in the gastrointestinal tract of calves are yet to be identified. Currently, the use of next-generation sequencing technologies facilitates virus detection and description. In this study, the metagenomics analysis revealed that four different viruses, BCoV, BoAstV, BKoV, and BNoV, were present in feces from calf diarrhea, and the distribution of these viruses was each associated with calf diarrhea.
Additionally, a level of co-infection was detected, and co-infection or a specific combination of viruses is more likely to result in calf diarrhea, and requires the analysis of a larger number of cases and controls. A recent study has shown the presence of five viruses, including BToV, BCoV, BRoV, BVDV, and small round structured viruses (SRSV) in the feces of dairy calves, with a rate of co-infection of 14% in diarrheic calves, suggesting that mixed infections are common in calf populations [17].
The astroviruses reported here are commonly associated with gastrointestinal disease, particularly in immunodeficient infants [18,19], and they are generally detected in fecal samples [20,21]. Moreover, astrovirus can also been detected in the brains of humans [22] and animals [23], including cows, showing with neurological symptoms [24,25]; this indicates that the host's range can extend beyond enteric tissues. However, the role of BoAstV as a causative agent of diarrhea in cattle is controversial and has not been extensively studied. Based on genomic sequences, ORF2 has greater variability among the three ORFs, and the characteristics of the ORF2 gene are important bases for AstV typing [26,27]. Analysis of the genetic evolution of the virus showed that, compared with other regions of the genome, the ORF2 region had the lowest similarity to other strains, which may suggest that a new serotype/genotype may appear. It is noteworthy that neonatal calf diarrhea is generally of a multifactorial origin; our study not only established an association between the presence of the virus and disease, but also determined the genetic diversity of BoAstV in herds from China.
BCoV has been associated with enteric and respiratory disease in cattle, including neonatal calf diarrhea (NCD) [28], winter dysentery (WD) [29], and respiratory tract illness [30]. All BCoV isolates identified so far that are shed in feces and nasal secretions belong to one serotype/genotype based on virus cross-neutralization and genotyping analyses, regardless of clinical origin [31]. However, there is no specific or effective method with which to prevent or control BCoV infection in China. In other countries, inactivated vaccines or attenuated vaccines are primarily used to immunize pregnant cows so that calves can ingest colostrum to improve passive immunity, or so that newborn calves can be used to stimulate active immunity [32]. However, a commercial BCoV vaccine was not available in China. S protein is a larger membrane glycoprotein that contains two hydrophobic regions and two subunits, S1 and S2, that are formed by cleavage at the 768 and 769 amino acids [33], which are mainly responsible for cell adhesion, blood coagulation, membrane fusion, and the induction of neutralizing antibodies. S1 is responsible for the recognition and binding of viruses and host cells, while S2 is related to the fusion between virus cells. The sequence comparison between BCoV-S/CN/HB-SJZ/2021 and its nearest strain, BCoV/CH/HB-BD/2019 (MK903506), showed that there are six amino acid mutation sites, of which four amino acid mutation sites are in the S1 subunit and two are in the S2 subunit. This indicates that most of the mutations are in the S1 region, and we can conduct in-depth research on this identification and combination area in order to provide more data for vaccine development and drug design.
To date, the pathogenesis of BNoV has been poorly understood, and the prevalence of BNoV in cattle has not been well established. In our study, BNoV was identified as the co-infection pathogen from diarrheic samples, along with other pathogens. Therefore, the true role of BNoV as a primary pathogen or as a form of co-infection remains unclear. Based on the phylogenetic relationships inferred from the VP1 sequences, noroviruses have been divided into six genogroups (GI to GVI), and BNoV is classified as GIII. The strain identified in our study was BNoV-GIII 2. Considering the previously reported GIII.1 strains identified in China [34,35]. It can be inferred that both genotypes GIII.1 and GIII.2 related sequences that have been found to circulate in Chinese dairy calves. Due to the limited epidemiological data on BNoV infections in China, the dominant strains cannot be accurately confirmed. Moreover, while different genotypes co-exist, ways of preventing diarrhea caused by BNoV are unclear. The results of our study will facilitate further research on the evolution and molecular pathogenesis of BNoV.
In 2008, BKoV was isolated from fecal samples of cattle with diarrhea, and it was concluded that BKoV played a role in the pathogenesis of enteritis in calves. However, the role of BKoV infection in calf diarrhea still needs to be clarified because of the presence of this virus in clinically normal animals [5]. In this study, through homology comparison, we found that the P1 gene showed as highly variable, while the 3D gene was highly conserved. Many early studies have shown that the BKoV-3D gene is mainly used to distinguish BKoV from other kobuviruses in phylogeny, rather than to differentiate BKoV into different lineages [36,37]. Therefore, the 3D gene may be used for designing specific primers for BKoV detection.
In summary, this study provided important information on at least four viral pathogens detected in diarrheic samples. These results will certainly contribute to the understanding of the evolution and pathology of BCoV, BoAstV, BKoV, and BNoV in cattle. Further epidemiology studies will reveal whether these viruses can also be found in other diarrheic samples, as well as their pathogenic potentials. Next-generation sequencing, as a new method, can discover more unknown viruses in samples, better understand the causes of calf diarrhea, and provide powerful technical support for the prevention and control of diarrheal diseases.
Author Contributions: Q.W., J.L. and W.W. took part in all the experiments and wrote the manuscript. X.Z., D.S. and B.F. helped to design the whole project and draft the manuscript. J.Z. and D.W. conducted RNA isolation, RT-PCR detection, and sample processing for sequencing. S.S. and G.G. conducted data analysis. B.L. contributed essential ideas and discussion. All authors have read and agreed to the published version of the manuscript. Data Availability Statement: All data used and presented in this study is either available in public repositories as described in the Methods section or is made available in NCBI database.