Emergence of a Distinct Picobirnavirus Genotype Circulating in Patients Hospitalized with Acute Respiratory Illness

Picobirnaviruses (PBV) are found in a wide range of hosts and typically associated with gastrointestinal infections in immunocompromised individuals. Here, a divergent PBV genome was assembled from a patient hospitalized for acute respiratory illness (ARI) in Colombia. The RdRp protein branched with sequences previously reported in patients with ARI from Cambodia and China. Sputa from hospitalized individuals (n = 130) were screened by RT-qPCR which enabled detection and subsequent metagenomic characterization of 25 additional PBV infections circulating in Colombia and the US. Phylogenetic analysis of RdRp highlighted the emergence of two dominant lineages linked to the index case and Asian strains, which together clustered as a distinct genotype. Bayesian inference further established capsid and RdRp sequences as both significantly associated with ARI. Various respiratory-tropic pathogens were detected in PBV+ patients, yet no specific bacteria was common among them and four individuals lacked co-infections, suggesting PBV may not be a prokaryotic virus nor exclusively opportunistic, respectively. Competing models for the origin and transmission of this PBV genotype are presented that attempt to reconcile vectoring by a bacterial host with human pathogenicity. A high prevalence in patients with ARI, an ability to reassort, and demonstrated global spread indicate PBV warrant greater public health concern.


Introduction
A majority of emerging infectious diseases are zoonoses, resulting from viruses in animal reservoirs that cross species barriers into humans. Globalization, international travel, changes in the environment and climate, and encroachment of humans into natural habitats have all facilitated the global spread of outbreaks such as Ebola, Zika, HIV/AIDS, MERS, SARS, and Nipah. Respiratory viruses in particular, such as influenza and coronaviruses, in addition to morbidity and mortality, stoke worldwide fear and untold economic damage as their ease of transmission makes them difficult to control. As of this writing, SARS-CoV-2 has resulted in nearly 191 million cases and 4.1 million deaths (Johns Hopkins University, Baltimore, MD, USA, https://coronavirus.jhu.edu/ accessed on 22 July 2021), with new variants of concern emerging with each passing week [1][2][3]. SARS-CoV-2-specific molecular, serologic and rapid diagnostics, NGS surveillance, and vaccines have all been critical components of our public health response. To anticipate the next outbreak, metagenomic next-generation sequencing (mNGS) offers an unbiased approach to detect any known virus, bacteria, fungus, and parasite and identify novel, divergent pathogens that are otherwise undetectable with conventional molecular methods and require specific targeting or growth in culture [4,5]. Harnessing this technology has informed clinical diagnosis and aided in the discovery of a multitude of new viruses [6,7]. Acute respiratory illnesses encompass a wide spectrum of symptoms, ranging from rhinitis and pharyngitis to bronchitis and pneumonia, and as such, are caused by a litany of viruses (corona-, rhino-, adeno-, paramyxo-, herpes-, influenza, etc.), bacteria (Streptococcus, Mycoplasma, Haemophilus, Klebsiella, etc.), and fungi (Candida, Histoplasma, Blastomyces, etc.) [8].
It is imperative that surveillance of respiratory infections remains vigilant and that we proactively search for new viruses to be better prepared for the next pandemic.
Picobirnaviruses (PBV) are small (pico), two-segmented (bi), double-stranded RNA viruses primarily associated with gastroenteritis and diarrhea. Initially discovered in fecal samples from both humans and pygmy rats in Brazil, the virus is non-enveloped and contains two RNA segments that can differ in size depending on whether they are Genogroup I (2.3-2.6 kb and 1.5-1.9 kb) or Genogroup II (1.75 and 1.55 kb) strains [9,10]. Their highly heterogeneous sequences have been detected in a wide range of animal species worldwide, including everything from farm animals, reptiles, domestic pets, and wild birds, to non-human primates and untreated sewage [11]. The close relatedness of porcine and human strains and additional documented examples of interspecies transmission in chickens point to frequent crossover events and circulation between hosts [12][13][14]. Indeed, unlike other viruses that have co-evolved with their host, PBV strains do not segregate into distinct clades by species. Rather, the simple capsid appears to have acquired a generalized means of invading cells, easily transmitting from one host to another without restriction, although direct evidence of intracellular infection is still lacking [15,16]. Their high genetic diversity is further pronounced by the detection of multiple PBV strains within individuals [17]. Taken together, these observations have prompted the current debate around whether PBVs are even animal viruses. The presence of prokaryotic ribosomal binding sites (Shine-Dalgarno sequence) in 5′ UTRs and alternative mitochondrial genetic code usage in some genera suggest these viruses may actually infect bacteria or fungi, resembling the segmented dsRNA Cystoviridae phages and Mitoviridae families, respectively. Thus, their detection at vertebrate mucosal sites may simply be a result of infection by these prokaryotic or unicellular eukaryotic hosts [18,19].
Nevertheless, studies indicate PBV can persist chronically, with periods of large shedding interspersed by periods of silence, and that some hosts can serve as asymptomatic reservoirs [20,21]. This implies the virus is adapted to the host and may underscore why pathogenicity (e.g., diarrhea) is often seen in those co-infected with other enteric viruses like rotavirus, calicivirus, and astrovirus [22,23]. Similarly, while PBVs have been found in humans as the 'sole' pathogen in cases of watery diarrhea and gastroenteritis, this is often in immunocompromised patients [24,25]. For these reasons, PBVs are currently viewed as opportunistic pathogens, with those who have underlying conditions or co-morbidities most at risk for infection. PBVs have also been detected in the respiratory secretions of pigs and humans [14,26]. Recently, novel PBV strains were detected in Uganda in two patients hospitalized with severe, acute respiratory illness [27]. In our search for novel pathogens causing respiratory illness, we surveyed patients hospitalized in South America and discovered a novel strain of PBV. We developed and applied an automated, high-throughput RT-qPCR assay which identified additional variants with high similarity to the index still in circulation, as well as strains previously reported in individuals from Cambodia and China with respiratory infections [28]. Our insights into their phylogenetic relatedness, prevalence, and association with disease warrant further examination of these picobirnavirus lineages.

Ethics Statement and Specimens
The initial 16 human specimens (sputum, ETA, and BAL) from MRN Diagnostics (Franklin, MA, USA) were collected in late 2016-early 2017 from four hospitals in Colombia, South America: Barranquilla, Cucuta, Medellin, and Valledupar. Patients were deidentified, provided oral informed consent, and the study (MRN GP 072014) received IRB approval. An additional 50 sputum specimens from MRN Diagnostics were collected in Colombia during June-July 2019. Left-over sputum from 50 individuals at a tertiary Specimens were collected independently of one another, thus the inclusion criteria and symptoms displayed differed between cohorts.

Extraction
Pre-treated sputum was extracted on an m2000sp using the TNA + Proteinase K protocol (Abbott Molecular, Des Plaines, IL, USA). Nucleic acid was eluted in 60 µL and frozen at −80 °C until use.

In Vitro Transcription
Positive controls for qPCR were subcloned into the pBlueScript plasmid, linearized with HindIII, and RNA was generated by in vitro transcription using the MEGAscript kit (Ambion, Austin, TX, USA).

PBV RT-qPCR Assay
A complete description of RT-qPCR assay volumes and cycling parameters, along with primer and probe sequences, is found in the online Appendix. Briefly, two separate reactions were performed in 96-well plates to detect PBV capsid or RdRp. In each case, 40 µL of mastermix containing AgPath-ID One-Step RT-PCR (Life Technologies) reagents, 1 mM MgCl2, primers (0.4 µM), and probe (0.3 µM) was combined with 10 µL of sample nucleic acid, with ROX added as the reference dye. Results were analyzed in MultiAnalyze software.

mNGS Library Prep and Sequencing
mNGS libraries were prepared as described [30]. Briefly, total nucleic acid from PBV+ sputum was converted to cDNA with random primers and Superscript III (SSRTIII) 1st Strand reagents (Life Technologies, Carlsbad, CA, USA), followed by 2nd strand synthesis with Sequenase V2.0 T7 DNA pol (Affymetrix, Santa Clara, CA, USA). Nextera XT was used to create barcoded, metagenomic NGS libraries which were multiplexed (n = 8) and sequenced on 3 MiSeq runs, with those requiring additional throughput re-sequenced on a HiSeq instrument.

mNGS Analysis
Metagenomic data were analyzed by SURPI and an Abbott-internal (DiVir) pipeline [31]. Briefly, SURPI uses the SNAP nucleotide aligner to remove human reads, compares remaining reads to all sequences in NCBI nt (2019 release), and then taxonomically classifies those with a match. Viral and unmatched reads are de novo assembled, translated in six reading frames, and aligned to a viral protein database using RAPSearch. Unmatched reads were probed further for divergent viral sequences using the psiBLAST algorithm in DiVir. Viral and bacterial reads identified by SURPI were normalized across experiments to determine fold change in abundance.

Pairwise Amino Acid Identity
An R/Shiny application was developed to align two amino acid sequences pairwise, measuring and plotting the rolling proportion of identical amino acids (AA) over a user-defined window, outputting mean and median identity values.
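The rolling-identity calculation can be illustrated with a minimal Python sketch. This is not the R/Shiny application itself: it assumes the two sequences have already been aligned to equal length (with "-" for gaps), and the function name `rolling_identity`, the toy sequences, and the window size are hypothetical choices made only for illustration.

```python
from statistics import mean, median


def rolling_identity(seq_a, seq_b, window=10):
    """Rolling proportion of identical residues for two pre-aligned sequences.

    Assumption: seq_a and seq_b are already aligned (equal length, '-' = gap).
    A column counts as identical only if both residues match and are not gaps.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = [int(a == b and a != "-") for a, b in zip(seq_a, seq_b)]
    if window < 1 or window > len(matches):
        raise ValueError("window must be between 1 and the alignment length")

    # Slide the window one column at a time, updating a running match count.
    running = sum(matches[:window])
    proportions = [running / window]
    for i in range(window, len(matches)):
        running += matches[i] - matches[i - window]
        proportions.append(running / window)
    return proportions


# Toy aligned peptides (hypothetical, not PBV sequences).
profile = rolling_identity("MKTAYIAKQR", "MKTGYIAKHR", window=5)
mean_identity = mean(profile)
median_identity = median(profile)
```

Plotting `profile` against alignment position would give the kind of identity trace described above, with `mean_identity` and `median_identity` as the summary values.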

Phylogenetics
Phylogenetic analyses were performed on the entire coding regions of capsid and RdRp proteins. All available PBV sequences for both segments were downloaded from GenBank on 3 February 2021. After removing incomplete, poor quality and redundant sequences, a total of 477 capsid and 426 RdRp were included in the analyses. Alignment of amino acids with MAFFT E-INS-i and the refined taxonomic assignment of picobirnavirus sequences are described in Perez, L et al. [32]. All datasets were analyzed by the maximum likelihood (ML) phylogenetic approach computed with the IQ-TREE program [33]. IQ-TREE was also used to select the best-fit model and determine the confidence levels for the branches by the Shimodaira test and 100,000 bootstrap replicates [34]. Sequences from PBV species 2 were used as the outgroup for RdRp. See Supplementary Tables S1 and S2 for a listing of references used in trees.

Phylogeny-Trait Association Analysis
The association between clinical respiratory disease and the phylogeny of both PBV genomic segments was assessed using the Mr. Bayes BaTS software for the 477 capsid and 426 RdRp sequences by considering two states (e.g., ±clinical respiratory condition) [35]. Values of the association index (AI), parsimony score (PS) statistics, and the level of clustering in individual locations using the monophyletic clade (MC) size statistic were all calculated based on the posterior samples of trees produced by Mr. Bayes 3.2.7 using the BaTS program [36]. The null distribution for each statistic was estimated with 100,000 replicates of state randomization.
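To make the logic of a parsimony score with a randomized null concrete, the following Python sketch scores a two-state trait on a single fixed tree using Fitch parsimony and compares it with the mean score after shuffling tip states. It is only an illustration under stated assumptions: BaTS operates on posterior samples of trees, whereas this sketch uses one hand-written toy tree, and the tree, tip names, replicate count, and function names are all hypothetical.

```python
import random


def fitch(tree, states):
    """Return (possible_states, n_changes) for a nested-tuple binary tree."""
    if isinstance(tree, str):  # a tip: its state set is the observed state
        return {states[tree]}, 0
    left_set, left_changes = fitch(tree[0], states)
    right_set, right_changes = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:  # children agree on a state, no extra change needed
        return shared, left_changes + right_changes
    return left_set | right_set, left_changes + right_changes + 1


def tip_names(tree):
    """List tip labels in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    return tip_names(tree[0]) + tip_names(tree[1])


def parsimony_index_ratio(tree, states, replicates=1000, seed=0):
    """Observed parsimony score, mean randomized score, and their ratio."""
    rng = random.Random(seed)
    observed = fitch(tree, states)[1]
    names = tip_names(tree)
    values = [states[name] for name in names]
    total = 0
    for _ in range(replicates):
        rng.shuffle(values)  # randomize trait states across tips
        total += fitch(tree, dict(zip(names, values)))[1]
    expected = total / replicates
    return observed, expected, observed / expected


# Toy tree (hypothetical): four 'ARI' tips form one clade, four 'other' tips another.
toy_tree = ((("a", "b"), ("c", "d")), (("e", "f"), ("g", "h")))
toy_states = {t: "ARI" for t in "abcd"}
toy_states.update({t: "other" for t in "efgh"})
observed, expected, ratio = parsimony_index_ratio(toy_tree, toy_states)
```

In this toy case the trait is perfectly clustered, so the observed score is 1 and the observed/expected ratio falls below 1; ratios approaching 1 would indicate that the trait is distributed across the tree no differently than chance.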

Statistics
Pearson's chi-squared test was used to determine whether there is a dependency between infection with Mycobacterium tuberculosis and picobirnavirus.
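As a worked illustration, the test can be reproduced in a few lines of Python using counts reported in the Results (7/18 PBV+ patients from Colombia with a TB diagnosis; 31/50 overall). The 2 × 2 table below is an assumption: it treats the 18 PBV+ patients as a subset of the 50, which leaves 24 TB+ and 8 TB- among the 32 PBV- patients, and it applies no continuity correction. The function name is hypothetical.

```python
import math


def chi_squared_2x2(table):
    """Pearson's chi-squared statistic and p-value (1 df) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: PBV+, PBV-; columns: TB+, TB- (table assumed as described above).
tb_pbv_table = ((7, 11), (24, 8))
chi2, p_value = chi_squared_2x2(tb_pbv_table)
```

Under these assumptions the statistic is about 6.38, corresponding to a p-value of roughly 0.011 to 0.012.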

NCBI Accessions
Partial and full-length sequences were deposited in GenBank under the following accessions: OL875303-OL875351. Raw data were deposited under BioProject number PRJNA789273 and SRA study SRP350910.

Discovery of a Novel Picobirnavirus
To discover novel human viruses involved in acute respiratory illness, specimens from patients hospitalized with severe symptoms were screened by mNGS. Sputum specimens (n = 16 patients) were collected from August 2016-February 2017 from hospitals in 4 different cities of Colombia, South America. Bacterial pathogens known to cause respiratory or nosocomial infections, such as Klebsiella, Acinetobacter, and Stenotrophomonas, were detected (Supplemental Table S3). Influenza A, Herpes simplex virus 1, known to cause respiratory infections in the immunocompromised, and Aichivirus A, a picornavirus typically causing gastroenteritis, were each found in different individuals. Sample 08853406, from a 29-year-old male hospitalized in Barranquilla, was enriched for Haemophilus parainfluenzae/influenzae reads, but also had two strong matches to porcine picobirnavirus and several other sequences with remote amino acid homology to picobirnaviruses. Additional sequences with homology to each PBV genome segment were identified by RAPsearch and psiBLAST. These reads served as seeds to de novo assemble the entire genome of 4119 nt from just 676 out of 2,327,238 total reads, at a coverage depth of 19X (Figure 1A).
For segment 1 (2251 nt), a 5′UTR of 144 nt with 66% AT-richness precedes ORF1 (nt 147-651), a 168 aa hypothetical protein with a predicted molecular weight of 18.7 kDa and acidic pI of 5.93. ORF1 possesses the ExxRxNxxxE repeat motifs of unknown function observed in other picobirnaviruses [37]. The top BLASTp hit for ORF1 is porcine PBV (YP_009241384.1) with only 33% identity and 47% positivity. ORF2 (nt 657-2238) encodes a 527 aa capsid protein with a predicted molecular weight of 57.8 kDa and a basic pI of 8.42. The top BLASTp hit is Grey Teal PBV (QD92430.1) at 44% identity and 63% positivity. Pairwise amino acid alignments with representative capsids from various species all derived from stool indicate an overall 35% identity to the novel sequence (Figure 1B; left) [11,16,38]. Segment 2 (1892 nt) encodes only the RdRp protein (nt 5-1594) of 530 aa, with a predicted molecular weight of 61.1 kDa and a neutral pI of 7.69. It is preceded by a 4 nt (incomplete) 5′UTR and followed by a 300 nt 3′UTR. Its top BLASTp hit is otarine (California sea lion) PBV (AMP18954.1) at 64% identity and 75% positivity. Pairwise amino acid alignments here indicate an average identity of 60%, in agreement with the greater degree of conservation among RdRp sequences (Figure 1B; right). The novel strain is referred to hereafter as ABT3406, and in keeping with currently established nomenclature, deposited in GenBank as follows: GI/PBV/human/Colombia/ABT3406/2016.
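The assembly figures reported for the index case (676 PBV reads out of 2,327,238 total, a 4119 nt genome, 19X depth) can be related through simple arithmetic. The Python sketch below is illustrative only: it assumes that depth is approximately (number of reads × mean read length) / genome length, and the function names are hypothetical. The mean read length it returns is implied by that approximation rather than reported in the text.

```python
def reads_per_million(target_reads, total_reads):
    """Normalize a read count to reads per million total reads."""
    return target_reads / total_reads * 1_000_000


def implied_mean_read_length(depth, genome_length, n_reads):
    """Mean read length implied by depth ~ n_reads * read_length / genome_length."""
    return depth * genome_length / n_reads


# Values taken from the index-case description.
pbv_rpm = reads_per_million(676, 2_327_238)
mean_read_length = implied_mean_read_length(19, 4119, 676)
```

With these inputs, PBV accounts for roughly 290 reads per million, and the reported depth would correspond to a mean mapped read length of roughly 116 nt.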

ABT3406 RdRp Branches with Respiratory PBV Strains
All available picobirnavirus sequences in GenBank were retrieved for phylogenetic analysis at the time of discovery (December 2018) and prior to submission (February 2021). Full-length capsid sequences (n = 423) were aligned by amino acid and classified into 3 species [32]. Long branch lengths and the absence of virus-host co-evolution patterns reflect the heterogeneity of strains and a high number of accumulated mutations indicative of adaptation to new hosts and the apparent interchangeability of PBV capsids [16]. Consistent with the BLAST results, ABT3406 branched with MK204396 from Grey Teal ducks (Anas gracilis) [38] (Supplemental Figure S1).
Viruses 2021, 13, x FOR PEER REVIEW 6 of 21

Quantitative PCR Screen Reveals High Prevalence of PBV among Hospitalized Patients
To understand the prevalence, identify additional strains, and assess genetic diversity, a quantitative PCR assay was designed to enable the detection of all picobirnaviruses and discriminate ABT3406 from other strains. An alignment of ABT3406 with reference sequences from human, chicken, pig, otarine, turkey, bovine, fox, camel, and monkey depicts the conserved region targeted in RdRp (Figure 3A). Redundant, degenerate primers generate an amplicon whose 5′ end is recognized by a universal, FAM-labeled probe (rFAM) situated in polymerase motif F. CY5- and CY3-labeled probes are unique to ABT3406 and KM285233, respectively, and recognize overlapping, mutually exclusive sequences at the amplicon 3′ end (Figure 3A). In a separate reaction, different primers and a FAM-labeled probe (cFAM) specifically target the ABT3406 capsid (Supplemental Figure S2). Therefore, the RT-qPCR assay provides four measurements detecting both PBV genomic segments, functioning as a discovery and diagnostic tool.
Serial dilution of in vitro transcripts (IVT) demonstrates dose dependency, sensitivities with LODs between 10-100 copies/mL, and the expected results for three classes of strains represented by ABT3406, KM285233, and AB517739 ( Figure 3B). All were detected by the universal rFAM probe (column 1). For AB517739, illustrative of any PBV, this is the only reactive probe. The ABT3406-derived IVT is also detected by the RdRp CY5 (column 2) probe, while KM285233 is dually reactive with the RdRp CY3 (column 3) probe. For the cFAM capsid probe, only the ABT3406 strain is detected (Supplemental Figure S2).
Sputum samples (n = 130) were obtained from patients hospitalized with ARI. Specimens were collected in October 2018-July 2019 from two new locations in the United States (New York, NY, USA, n = 50; Florida, FL, USA, n = 30) and from the original medical facilities in Colombia (n = 50). Nucleic acid was tested for PBV by both capsid and RdRp RT-qPCR assays as well as with an Aichivirus A qPCR (Supplemental Figure S3). A total of 25 patients (19.2%) were positive for PBV with at least one probe ( Figure 3C and Table 1); there were no Aichivirus A positives (0/130; data not shown). We identified five hits from New York, NY, USA: 2 rFAM+ (blue) and 3 rFAM+/CY5+ (green). Reactivity with only the universal RdRp probe indicates these may be altogether new PBV strains ('any PBV': non-ABT/non-KHM). For the latter three, this profile predicts an RdRp similar to ABT3406 combined with a different capsid. A single rFAM+ only (blue) sample was found in Florida, FL, USA. The remaining 19 hits all originated from Colombia. Four were rFAM+/CY5+ (green) and one was rFAM+ (blue). Eight resembled the ABT3406 index (cFAM+/rFAM+/CY5+; red) in having highly similar capsid and RdRp sequences, and two resembled the KM285233 strain from Cambodia (cFAM-/rFAM+/CY3+; orange). Finally, there appeared to be four dual ABT3406/KM285233 infections (cFAM±/rFAM+/CY5+/CY3+; purple). Estimated viral loads ranged considerably (log2-log6 copies/mL), but importantly probe Cts were of similar magnitude within a multi-reactive sample (e.g., Ct of rFAM ≈ Ct of CY5; Table 1).
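The mapping from probe reactivity to strain class described above can be summarized as a small decision rule. The Python sketch below is a hypothetical restatement of the color-coded profiles, not the authors' analysis software; the function name, the label strings, and the handling of combinations not described in the text (returned as "unclassified") are assumptions. The tally uses the per-site counts given in the preceding paragraph.

```python
def classify_profile(cfam, rfam, cy5, cy3):
    """Map probe reactivity (booleans) to the color-coded strain classes."""
    if not (cfam or rfam or cy5 or cy3):
        return "negative"
    if not rfam:
        return "unclassified"  # assumption: profile not described in the text
    if cy5 and cy3:
        return "purple"  # dual ABT3406/KM285233 infection (cFAM+ or cFAM-)
    if cy3:
        return "orange"  # KM285233-like
    if cy5 and cfam:
        return "red"     # ABT3406-like capsid and RdRp
    if cy5:
        return "green"   # ABT3406-like RdRp with a different capsid
    return "blue"        # 'any PBV': universal RdRp probe only


# Hits per site and class, as reported in the text.
hits = {
    "New York": {"blue": 2, "green": 3},
    "Florida": {"blue": 1},
    "Colombia": {"green": 4, "blue": 1, "red": 8, "orange": 2, "purple": 4},
}
total_hits = sum(sum(site.values()) for site in hits.values())
prevalence_percent = round(total_hits / 130 * 100, 1)
```

Summing the reported classes recovers the 25 positives among 130 patients, i.e., the 19.2% prevalence stated above.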

Emergence of a Distinct PBV Genotype
Total nucleic acid (n = 25 hits) was converted into mNGS libraries for sequencing. The number of PBV reads identified by our analysis pipelines correlated inversely with qPCR Ct values, and therefore those (n = 8) with Cts > 35 resulted in little to no coverage. Full and partial segments 1 and 2 sequences were recovered from 17 patients. Several had dual PBV infections, yielding a total of 25 capsid and 26 RdRp sequences. To avoid bias, consensus sequences were independently determined by mapping to a reference (e.g., ABT3406 or KM285233) and by de novo assembly, then merged into alignments by codon and trees analyzed by the maximum likelihood method (Table 1, Figure 4). For capsid, we observed a clustering of nine sequences which include the ABT3406 index case: all are from Colombia (038, 044, 046, 006, 035, 015, 021, 001), have the cFAM+/rFAM+/CY5+ (red, purple) qPCR profile, and share >94% amino acid identity. Basal to this node is a mixture of five sequences from Colombia and New York. At 78-92% amino acid identity to ABT3406, these were all negative by capsid qPCR, but had similar overall profiles: cFAM-/rFAM+/CY5+ (green: 016, 4468, 4138) and cFAM-/rFAM+/CY5- (blue: 020, 4466).
A separate branch indicating an earlier speciation event (not supported by bootstrapping) is populated with five sequences (-033, -034, -039, -031, -032), with only ~45% amino acid identity to the index capsid and all except -039 (weakly positive) were negative for capsid qPCR. All these sequences residing in clade 13 share a common ancestor found in Grey Teal ducks, MK204396 [38] (Figure 4A). By contrast, the capsids from cFAM±/rFAM+/CY3+/CY5± (orange) strains (-039, -034, -033, -012, -015, -023), linked to the 'Cambodian' RdRp, cluster together on a distant branch of Clade 3 and possess only 19% amino acid identity to ABT3406. Capsid sequences were never deposited for Cambodian (KM285233, KM285234) and Chinese (MN145873) strains. It is remarkable given the extreme heterogeneity of PBV, that capsids assembled for all individuals, including those co-infected (e.g., PBV-19-015, 033, 034, 039), branched only in these clusters and nowhere else on the tree.
The RdRp tree yielded a highly correspondent phylogeny, indicating genome segments within a patient are linked and constitute related, but distinct strains (Figure 4B). Despite the variability in qPCR profiles (cFAM±/rFAM+/CY5±), US (4468, 4138, 4466) and Colombian (038, 044, 046, 006, 035, 015, 021, 001, 016, 020) RdRp sequences (n = 16) branched with the index case. These variants share 86-95% amino acid identity and demonstrate the strain's expansion throughout the Western Hemisphere. RdRp proteins with the cFAM-/rFAM+/CY3+ profile (n = 7; -039, -034, -033, -044, -012, -015, -023) were also on the same branch. It was conceivable given the CY3 probe reactivity that these would be related to the Cambodian/Chinese respiratory strains in Figure 2; however, we would not have predicted >96% identity to these references. This confirms the strain remains in circulation and has traveled from Asia to South America. These RdRps are only ~57% identical to ABT3406 and were CY5- (except dually infected 015), yet the lineages consistently branch together as genotype-3 of PBV species 1 with a bootstrap value of 69. The outliers in this tree are PBV-19-039, -034, -033 sequences within genotype 4, which BLAST with 52% identity to stool-derived sequences from different species, and ancestral to an Australian rabbit sequence. Their RdRps share only ~45% identity to the index, yet intriguingly all three were weakly positive for CY5 and linked to the 'maroon' cluster of capsids that speciated from the index (Figure 4A). Notably, all three of these individuals were co-infected with the 'Cambodian/CY3+' virus. PBV-19-031 and -032 provide a final evolutionary twist. Possessing an ABT3406-like RdRp and a maroon capsid, these individuals may represent a 'missing link'. It is well documented and demonstrated here that multiple PBV strains can be found simultaneously in a host [40].
Due to the abundance of mono-infections identified, in which the pairing of genome segments was unequivocal, and taking into account equivalent NGS reads, we could conclude with reasonable certainty in dual-infection settings which capsid paired with which RdRp (Table 1, Figure 4).

RdRp and Capsid Are Both Associated with Respiratory Disease
To determine whether these distinct lineages (e.g., genotype, clade) were linked to the clinical outcome (phenotype: respiratory disease trait), we applied Bayesian inference (BaTS) analysis [35]. RdRp sequences from sputum form a monophyletic genotype (III) for which no stool-derived strains were included. Patients with PBV-19-039GT, -034GT, -033GT sequences in genotype 4 all have co-infections within genotype 3 and were omitted. Index ratios (IR) of observed versus expected values are an indication of the strength of the association, wherein 0 predicts a complete subdivision of the population and values approaching 1 suggest random mixing (panmixia). The null hypothesis (e.g., no greater trait association with adjacent taxa than due to chance) for phenotypic structure was rejected. Rather, an association index (AI) of 0.14 suggests the emergence of RdRp genotype 3 in species PBV-1 is linked to this clinical condition, and the monophyletic clade (MC) statistics (4.20 observed vs. 1.05 expected) indicate that the population is phylogenetically divided by these two states (Supplemental Table S4). A parsimony score (PS) of 0.22 argues against random distribution of this trait as an explanation. All the parameters (AI, PS, MC) were in agreement, indicating this particular clade contains the genetic signature linking it to respiratory disease.
BaTS analysis was similarly applied to capsid sequences. Intuitively, with the 'Abbott' and 'Cambodian' clades separated on the tree, one might have expected the association with respiratory disease to be weaker (Figure 4A). However, the results were even more convincing than for RdRp, with the IR at 1.9 × 10⁻⁴, PS at 0.08, and the MC (CRC) at 19.0 observed versus 1.38 expected (Supplemental Table S4). To eliminate ambiguity in terms of which strains were associated with respiratory illness, removing dually infected individuals from the Cambodian clade only served to improve the BaTS statistics. While we cannot account for representation biases in GenBank and the possibility that GI-derived sequences may have also caused ARI symptoms in these same strains, based on available annotations the RdRp and capsid sequences identified here were both strongly associated with respiratory illness.

PBV Is Typically Present as a Co-Infection
Metagenomics permitted addressing whether picobirnaviruses represent an opportunistic infection that may exacerbate disease and are always secondary to a primary viral, bacterial, or fungal respiratory infection, or if they are the sole pathogen present and the presumed cause of illness. HHV-1, EBV, rhinovirus-A, respirovirus-3, and influenza-A reads were individually detected in samples ranging from 41-1712 reads per million, while in others with betacoronavirus, enterovirus-D, and HHV-2 there were fewer total reads (<100) (Table 2, Figure 5A). Likewise, several bacterial genera, including Streptococcus, Haemophilus, Stenotrophomonas, and Klebsiella were enriched (Table 2, Figure 5B). Oftentimes, phages specific to these bacteria (e.g., Klebsiella and Pseudomonas in PBV-4138 and PBV-19-016, respectively) were also detected in abundance to confirm this infection was present. Heat maps illustrate the fold increase in viral and bacterial reads relative to the other PBV+ sputum samples sequenced (Figure 5). No fungal infections of significance were observed. While Mycobacterium tuberculosis reads were only sparingly detected by mNGS, 7/18 (39%) PBV+ patients from Colombia also had a diagnosis of pulmonary TB compared to 31/50 (62%) overall. Chi-square analysis indicated a patient was twice as likely to have either TB or PBV alone than to be co-infected (p = 0.011). For all 130 patients screened, there was no discernable difference in median age for PBV+ (60.5 years) versus PBV- (56.5 years) (Table 2). Thus, while co-infections were common in patients, the data illustrate that these picobirnavirus strains were not linked to one or more specific respiratory bacteria or fungi.
As with gastroenteritis, with 21/25 PBV+ samples co-infected, PBV appears to also be an opportunistic infection of the respiratory tract, although the order of onset for each is unknown (Figure 5). However, four patients did not show enrichment for other microbes, two of whom had PBV viral loads ≥10⁴ cp/mL, arguing it may be the sole pathogen causing symptoms or providing the initial insult (Table 2). Interestingly, these two individuals have clear sputum coloration and chills, consistent with a viral, acute upper respiratory infection, while the two with lower viral loads have yellowish tinges perhaps indicative of a resolving infection (Table 2). Metagenomics has the added advantage of being able to discriminate samples and rule out cross-contamination. The RdRp and capsid consensus sequences from 19-038, 19-044, and 19-046 are virtually identical, but clearly originate from different individuals: Cts, % PBV reads, bacterial profiles, and the fact that only 19-044 was co-infected with the KM285233-like strain, all indicate they are unique.

Discussion
A novel picobirnavirus strain was recovered from the sputum of a patient hospitalized in Colombia for acute respiratory illness. Picobirnaviruses infect a myriad of hosts, their sequences are highly variable, and they can be found in people with or without disease, thus it was initially difficult to ascertain the significance of this discovery. However, recent reports have taken note of their presence in respiratory illnesses, some of them severe, and others now appreciate PBV infects airways of animal reservoirs [26,27,41]. From the hundreds of PBV sequences deposited in GenBank, the ABT3406 RdRp branched with the infinitesimally small number of strains recovered from humans with respiratory ailments. It was conceivable that few if any additional 'hits' might be found with our screening efforts, and that these would likely represent sporadic cases with completely unrelated sequences. On the contrary, we found a high prevalence of PBV (19.2%), the majority from Colombia, and all were either related to the index case or the isolates from Cambodia and China previously implicated in respiratory illness. Extraordinary genetic diversity and multi-PBV infections are typically the norm. Sequences are either interspersed among a variety of human, vertebrate, and wastewater strains, despite having similar symptoms like diarrhea and GVHD, or it is the other extreme, where they are nearly identical and indicative of an outbreak [42,43]. Here, we observed PBV with phylogenetically related RdRps evolved over time and circulating over vast distances to emerge as a distinct genotype associated with ARI.
The robust RdRp RT-qPCR assay detected highly divergent strains over a broad range of titers and allowed screening in an automated and quantitative fashion; it should now replace manual RT-PCR and the running of gels, which are tedious and of limited efficacy [39]. Indeed, qPCR led us to capsid and RdRp sequences with as little as 39% and 59% nucleotide identity to the index case, respectively. The amplicon region chosen appeared broad enough for the 'universal' probe to detect highly disparate sequences but specific enough to discriminate strains and identify dual infections. From a data integrity standpoint, viral loads varied considerably and most samples were positive in ≥2 channels, allaying concerns of contamination and false positives, respectively. Importantly, mNGS confirmed that full-length sequences agreed with qPCR profiles. While false negatives were certainly a possibility, mNGS of PBV qPCR negatives (n = 105) on a HiSeq (>15 million reads/library) confirmed the absence of PBV in these sputum samples (data not shown).
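To make the identity figures quoted above concrete, the following is a minimal sketch of how a pairwise nucleotide identity can be computed from two pre-aligned sequences. It is illustrative only: the function name `percent_identity` is hypothetical, columns containing a gap in either sequence are assumed to be excluded from the denominator, and this is not presented as the software or convention used to obtain the values in this study.

```python
def percent_identity(aln_a: str, aln_b: str) -> float:
    """Percent identity between two aligned sequences of equal length.

    Assumption (not from the paper): columns where either sequence
    has a gap ('-') are skipped and do not count toward the total.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    compared = 0
    for a, b in zip(aln_a.upper(), aln_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        return 0.0  # nothing to compare
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment: 5 ungapped columns, 4 of them identical.
    print(percent_identity("ACGT-A", "ACCTGA"))
```

Other conventions (for example, counting gaps as mismatches) give lower values for the same alignment, so identity percentages are only comparable when the same convention is used throughout.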
Most studies on PBV rely upon a narrow region of RdRp to classify strains and fail to report the sequence of the highly divergent capsid [44]. PBV's ability to reassort and the seemingly interchangeable nature of capsid to permit rapid adaptation to new hosts belie the demonstrated dependence of RdRp molecules bound to genomic RNA to interact specifically with the capsid during packaging [15,45]. It has been proposed that polymerase activity requires the presence of a capsid to ensure that dsRNA genomes are enclosed within a capsid and sequestered to prevent elicitation of an antiviral response [46]. Here we demonstrated which capsid and RdRp segments were consistently linked. For example, despite negative reactivity for cFAM and/or CY5, any sample with an RdRp sequence bearing resemblance to the index was found to be paired with a corresponding ABT3406 capsid of roughly similar identity (≥91%) and vice versa. This same RdRp-capsid linkage was true of CY3 positive strains having a high identity to Cambodian/Chinese strains. Five patterns were observed that consist of mono- and co-infections of these strains, as well as those with highly related capsids (-031, -032 and -033GT, -034GT, -039GT) (Figure 6A). RdRps clustered within a distinct genotype whereas 'Abbott' and 'Cambodian' capsids, sharing only 20% amino acid identity, were in separate clades (Figure 4A). Regardless, Bayesian inference analysis established both segments were linked to the respiratory disease trait. In other segmented dsRNA viruses such as reoviruses, tropism can be determined by cell-selective replication efficiency, a process regulated by the viral RNA-dependent RNA polymerase protein (λ3) at a late, post-entry point in the viral life cycle following primary transcription and translation [47]. Unfortunately, porcine and human respiratory sequences collected in the Netherlands from Smits et al. and nosocomial infections associated with severe ARI in Wakiso, Uganda from Cummings et al. were not deposited in GenBank and could not be compared [26,27]. The latter were reported to resemble swine (KX374477.1) and camel (KM573801.1) strains, neither of which branched closely to those we identified [27]. Nevertheless, we anticipate the emergence of other phylogenetically unrelated PBV lineages implicated in respiratory illness.
Numerous examples exist of viruses transmitted by the fecal-oral route that infect the mucosal surfaces of both the alimentary and respiratory tracts, including adenoviruses, picornaviruses, and orthomyxoviruses. Our work confirms and extends recent observations that PBV infections are not restricted to the GI tract nor only involved in gastroenteritis and diarrhea [26,27,48]. Indeed, Smits et al. showed that the RdRp of respiratory PBV from pigs in Hong Kong likely descended from a related strain found in US wastewater [14]. Woo et al. further demonstrated in cows, poultry, and monkeys that viruses from stool and throat swabs of an individual were one and the same sequence [41]. In contrast to our study, their heterogeneous sequences were widely distributed across the phylogenetic tree and not associated with animal or human disease. As with the respiratory PBV from Uganda and the Netherlands, these too were unrelated to our sequences. Regarding an association with ARI, a comparison to healthy individuals is lacking and there is no guarantee sequences annotated in GenBank were restricted to a particular anatomical site or sample type. However, in the latter case, it is reasonable to assume that an individual suffering from diarrhea would not be asked to provide sputum, just as someone on a ventilator would not provide a stool specimen. Persistent shedding over months in immunocompromised individuals (e.g., HIV+) has advanced the prevailing notion that PBV is an opportunistic infection that otherwise healthy, asymptomatic carriers are invulnerable to [11,25]. Indeed, following hematopoietic stem cell transplants and immunosuppression, the sudden appearance of PBV replication was predictive of graft-versus-host disease onset in 40% of cases [42,49]. In Legoff et al., it was theorized that infection with another pathogen creates an inflammatory milieu or an imbalance in the mucosal microbiome that leads to a burst in PBV replication [42]. Here, we did not have longitudinal samples to assess causality, but a majority of samples showed evidence of viral or bacterial co-infections or came from patients previously diagnosed with tuberculosis, suggesting the respiratory PBV strains we detected also represent opportunistic infections.
Still, while the inherent pathogenicity of PBV remains inconclusive, questions remain as to whether PBV is a mammalian virus at all, or simply a prokaryotic virus infecting resident flora of the gut microbiome [18,19,50]. Their ability to auto-proteolyze their capsid and invade liposomes suggests they are vertebrate viruses, unlike the related partitiviruses that infect unicellular organisms and fungi [15,51]. Unfortunately, the inability to culture the virus in mammalian cells has hampered fulfilling Koch's postulates and laying this controversy to rest, although propagating PBV in bacteria has failed as well [50]. PBV-like viruses found in animals (e.g., bats, crustaceans) using alternative translation codons typically cluster distantly from other PBV (in Genogroup III/species R3), yet there are instances (e.g., mongoose, bat) where branching among Genogroup I/species R1 is observed [52,53]. Our sequences were phylogenetically unrelated to all of these and produced intact ORFs for RdRp using the standard genetic code. However, as with all PBV, segments 1 and 2 here possessed the Shine-Dalgarno ribosome binding site (AGGAGG) upstream of the ATG start codon [18,19]. Thus, one model to explain our data is that respiratory-tropic bacteria harboring PBV actually caused disease, with PBV as neutral bystanders that were guilty by association.
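As an illustration of the ribosome binding site observation above, the following minimal sketch scans a nucleotide sequence for the AGGAGG motif located a short distance upstream of an ATG. It is a hypothetical example, not the procedure used in this study: the function name `find_sd_upstream_of_start` and the spacer window (`min_spacer`, `max_spacer`) are assumptions chosen only for demonstration.

```python
import re

SD_MOTIF = "AGGAGG"  # Shine-Dalgarno sequence as given in the text


def find_sd_upstream_of_start(seq, min_spacer=4, max_spacer=15):
    """Return (sd_start, atg_start) index pairs (0-based).

    A pair is reported when AGGAGG is followed, after a spacer of
    min_spacer..max_spacer nucleotides, by an ATG. The spacer window
    is an assumed, illustrative default.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    hits = []
    # Lookahead so that overlapping motif occurrences are all found.
    for match in re.finditer("(?=" + SD_MOTIF + ")", seq):
        sd_end = match.start() + len(SD_MOTIF)
        for spacer in range(min_spacer, max_spacer + 1):
            atg = sd_end + spacer
            if seq[atg:atg + 3] == "ATG":
                hits.append((match.start(), atg))
    return hits


if __name__ == "__main__":
    # Toy sequence: AGGAGG at index 2, six-nucleotide spacer, ATG at index 14.
    print(find_sd_upstream_of_start("CCAGGAGGTTTACAATGAAA"))
```

A motif match of this kind shows only that the sequence feature is present; it does not by itself indicate which host translates the transcript.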
Our metagenomic results challenge this notion in several respects, though (Figure 5). First, we did not observe fungal co-infections in any PBV+ individuals. Second, there was either a variety of different respiratory bacteria present or none at all, rather than one or more specific prokaryotic or unicellular eukaryotic hosts linked to PBV positives. Third, there were cases of virus-only co-infections or those with no other pathogen detected, indicating PBV does not require a bacterium or fungus to sustain or initiate an infection. Certainly, more conclusive evidence is demanded, but these data leave open the interpretation that PBV are pathogenic animal viruses, either as the initial insult or as an opportunistic infection. Evidently, the CY3+ (KM285233) strain remains in circulation and has traveled from Asia to South America (Figure 6B). Given the phylogenetic relationships of CY5+ RdRp and cFAM+ capsid sequences to those recovered from the Tasmanian devil and Grey Teal ducks, respectively, we speculate these hosts interacted in Australia giving rise to a zoonosis from which ABT3406 descended, making its way eastward to Colombia and throughout the Americas (Figure 6B). Just how strains may have jumped from one host to another will require co-evolutionary analysis; however, the non-enveloped, environmentally resistant nature of PBV will have undoubtedly facilitated its spread in the absence of a direct transmission event. As with animal reservoirs for influenza, the virus can be excreted in their stool and through reassortment can change tropism to trigger emergence in an incidental (human) host. Thus, these competing models are in fact not mutually exclusive: whether as the primary or as an opportunistic infection, PBV can induce ARI symptoms while at the same time being transmitted via a prokaryotic host vector. Regardless, the key takeaway is that genotype III strains described here are uniquely associated with respiratory illness.
New classifications of respiratory viruses are being discovered such as bocaviruses (Parvo-) and redondoviruses (CRESS DNA), while emergent species (e.g., MERS, SARS-CoV-2) from established families (e.g., coronaviruses) have literally changed our way of life [54,55]. The COVID-19 pandemic has laid bare the need to be proactively searching for new viruses and prepared with diagnostics, therapeutics, and vaccines. Metagenomics has once again shown that picobirnaviruses are also involved in ARI [27,28]. To our knowledge, ours is the first study to leverage mNGS to link and fully sequence both capsid and RdRp, as well as define the presence of other bacterial and viral agents in picobirnavirus infections. Many questions remain unanswered: Is it a human or a prokaryotic virus? Is the virus seasonal? Are only the immunocompromised affected? Does PBV cause respiratory symptoms or is it simply a bystander? What domains of RdRp determine respiratory tropism? The high prevalence observed, coupled with its ability to rapidly evolve, reassort its segmented genome, and cross over to other species, indicates a need for greater public health awareness and future studies of picobirnaviruses [13,56].

Conclusions
We applied next-generation sequencing for virus discovery and determined that two phylogenetically related lineages of PBV circulating in different hemispheres were present in patients with ARI, and that both capsid and RdRp segments are linked to respiratory disease. PBV are typically thought of as opportunistic gastrointestinal infections in the immunocompromised and, more recently, as prokaryotic viruses. This report challenges that prevailing understanding and should serve as a springboard for others to explore PBV's role in respiratory infections and their legitimacy as a human pathogen.

Institutional Review Board Statement:
The study was conducted according to the guidelines of the Declaration of Helsinki. Colombia specimen collection was approved by the Diagnostics Institutional Review Board on behalf of MRN Diagnostics, protocol code MRN GP 072014, which received IRB approval on August 17, 2018. New York specimen collection was approved by Ethical & Independent Review Services on behalf of New York Biologics, study number 01015-18, which received IRB approval IRB00007807 on January 9, 2018. Florida specimen collection was approved or waived by an IRB and these records are retained by Boca Biolistics, LLC and available upon request.