SARSMutOnto: An Ontology for SARS-CoV-2 Lineages and Mutations
Abstract
1. Introduction
2. Background
2.1. SARS-CoV-2 Virus Structure
2.2. Mutation, Lineage, and Recombinant Virus
2.3. Lineage Nomenclatures
2.4. SARS-CoV-2 Mutations
3. Materials and Methods
Algorithm 1: Ontology generation algorithm
Inputs: Pango files and outbreak.info API
Output: SARSMutOnto ontology
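Algorithm 1 takes Pango lineage files as input, and Pango names encode ancestry: a dotted name such as B.1.617.2 descends from B.1.617, while an alias prefix such as AY stands for a longer name (AY abbreviates B.1.617.2). The Python sketch below shows one way such names could be turned into parent links for a class hierarchy. It is not the authors' implementation; the function name `parent_lineage` and the `aliases` dictionary are hypothetical, and recombinant lineages (which have several parents) are not handled.

```python
from typing import Dict, Optional


def parent_lineage(name: str, aliases: Dict[str, str]) -> Optional[str]:
    """Return the parent of a Pango-style lineage name, or None for a root.

    `aliases` is a hypothetical mapping from alias prefix to the full
    lineage name it abbreviates, e.g. {"AY": "B.1.617.2"}.
    """
    # Drop the last dot-separated component: "B.1.617.2" -> "B.1.617".
    head, sep, _ = name.rpartition(".")
    if not sep:
        # No dot at all: a root such as "A" or "B".
        return None
    # If what remains is an alias prefix, expand it to the full name.
    return aliases.get(head, head)


aliases = {"AY": "B.1.617.2", "BA": "B.1.1.529"}
for lineage in ("B.1.617.2", "AY.1", "BA.2", "B"):
    print(lineage, "->", parent_lineage(lineage, aliases))
```

Each (lineage, parent) pair produced this way could then be written out as a subclass relation by whatever ontology library is used.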
4. Results
4.1. The Proposed Ontology
4.2. Lineage Description
4.3. Querying SARSMutOnto
4.3.1. List of Lineage Mutations
Listing 1. Query to extract the mutation list of the B.1.617.2 variant.
PREFIX ns: <https://github.com/jbakkas/SARSMutOnto/blob/main/SARSMutOnto.owl#>
SELECT ?mutationName ?gene
FROM <https://raw.githubusercontent.com/jbakkas/SARSMutOnto/main/SARSMutOnto.owl>
WHERE {
  ?mutation a ns:SNP .
  ?lineage a ns:B.1.617.2 .
  ?mutation ns:has_for_lineage ?lineage .
  ?mutation ns:has_for_gene ?gene .
  ?mutation ns:mutation_name ?mutationName
}
ORDER BY DESC(?gene)
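Listing 1 hard-codes the lineage B.1.617.2. As a minimal sketch, the same query text can be generated for any lineage with a small Python helper. The helper name `lineage_mutations_query` and the name-validation pattern are assumptions made for illustration; the prefix, graph URL, and property names are copied from Listing 1. The helper only builds the query string and does not execute it.

```python
import re

NS = "https://github.com/jbakkas/SARSMutOnto/blob/main/SARSMutOnto.owl#"
GRAPH = "https://raw.githubusercontent.com/jbakkas/SARSMutOnto/main/SARSMutOnto.owl"

# Assumed shape of a lineage name: capital letters followed by optional
# dot-separated numbers, e.g. "B.1.617.2". Anything else is rejected so that
# arbitrary text cannot be spliced into the query.
_LINEAGE_RE = re.compile(r"[A-Z]+(\.\d+)*")


def lineage_mutations_query(lineage: str) -> str:
    """Return the text of the Listing 1 query for the given lineage name."""
    if not _LINEAGE_RE.fullmatch(lineage):
        raise ValueError(f"not a Pango-style lineage name: {lineage!r}")
    return (
        f"PREFIX ns: <{NS}>\n"
        "SELECT ?mutationName ?gene\n"
        f"FROM <{GRAPH}>\n"
        "WHERE {\n"
        "  ?mutation a ns:SNP .\n"
        f"  ?lineage a ns:{lineage} .\n"
        "  ?mutation ns:has_for_lineage ?lineage .\n"
        "  ?mutation ns:has_for_gene ?gene .\n"
        "  ?mutation ns:mutation_name ?mutationName\n"
        "}\n"
        "ORDER BY DESC(?gene)"
    )


print(lineage_mutations_query("B.1.1.7"))
```

The resulting string can be passed to any SPARQL 1.1 engine that is able to load the ontology file.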
4.3.2. List of Lineages with a Given Mutation
Listing 2. Query to extract lineages with a given mutation.
PREFIX ns: <https://github.com/jbakkas/SARSMutOnto/blob/main/SARSMutOnto.owl#>
SELECT ?lineageName
FROM <https://raw.githubusercontent.com/jbakkas/SARSMutOnto/main/SARSMutOnto.owl>
WHERE {
  ?snp a ns:SNP .
  ?snp ns:has_for_lineage ?lineage .
  ?lineage ns:label ?lineageName .
  FILTER (?snp = ns:N501Y)
}
ORDER BY ?lineageName
4.3.3. List of Gene Mutations
Listing 3. Query to extract the list of all mutations occurring in the spike (S) protein.
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX ns: <https://github.com/jbakkas/SARSMutOnto/blob/main/SARSMutOnto.owl#>
SELECT ?mutationName
FROM <https://raw.githubusercontent.com/jbakkas/SARSMutOnto/main/SARSMutOnto.owl>
WHERE {
  ?mutation a ns:SNP .
  ?mutation a owl:NamedIndividual .
  ?mutation ns:has_for_gene ns:S .
  ?mutation ns:mutation_name ?mutationName .
}
4.3.4. List of Recombinant Lineages
Listing 4. Query to extract the list of all Pango recombinant lineages with their parents.
PREFIX ns: <https://github.com/jbakkas/SARSMutOnto/blob/main/SARSMutOnto.owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?l ?c
FROM <https://raw.githubusercontent.com/jbakkas/SARSMutOnto/main/SARSMutOnto.owl>
WHERE {
  ?l rdfs:subClassOf ns:recombinant .
  ?l rdfs:subClassOf ?c .
  FILTER (?c != ns:recombinant)
}
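Listing 4 returns one row per (recombinant lineage, parent class) pair, so a recombinant with two parents appears on two rows. The Python sketch below groups such rows into one entry per lineage. The function name `group_parents` is hypothetical, the rows are assumed to have already been shortened from full IRIs to local names, and the sample rows are illustrative input rather than output obtained from the ontology (XA is the Pango recombinant of B.1.1.7 and B.1.177).

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_parents(rows: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Collect (lineage, parent) rows into a lineage -> parents mapping."""
    parents: Dict[str, List[str]] = defaultdict(list)
    for lineage, parent in rows:
        # Skip repeated rows so each parent is listed once, in first-seen order.
        if parent not in parents[lineage]:
            parents[lineage].append(parent)
    return dict(parents)


# Illustrative rows in the shape (?l, ?c) produced by Listing 4.
rows = [("XA", "B.1.1.7"), ("XA", "B.1.177"), ("XA", "B.1.1.7")]
print(group_parents(rows))
```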
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A. Additional Queries
Appendix A.1. Query 1
Listing A1. List of all lineages.
PREFIX ns: <https://github.com/jbakkas/SARSMutOnto/blob/main/SARSMutOnto.owl#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?lineageName ?date ?description
FROM <https://raw.githubusercontent.com/jbakkas/SARSMutOnto/main/SARSMutOnto.owl>
WHERE {
  ?lineage a owl:NamedIndividual .
  ?lineage ns:label ?lineageName .
  ?lineage ns:appeared_on ?date .
  ?lineage ns:has_for_description ?description .
}
ORDER BY ?lineageName
Appendix A.2. Query 2
Listing A2. The appearance date of the Delta variant (B.1.617.2).
PREFIX ns: <https://github.com/jbakkas/SARSMutOnto/blob/main/SARSMutOnto.owl#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?lineageName ?date
FROM <https://raw.githubusercontent.com/jbakkas/SARSMutOnto/main/SARSMutOnto.owl>
WHERE {
  ?lineageI a owl:NamedIndividual .
  ?lineageI ns:label ?lineageName .
  ?lineageI ns:appeared_on ?date .
  FILTER (?lineageName = "B.1.617.2")
}
Appendix A.3. Query 3
Listing A3. List of lineages with WHO-assigned names.
PREFIX ns: <https://github.com/jbakkas/SARSMutOnto/blob/main/SARSMutOnto.owl#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?lineageName ?WHO_name
FROM <https://raw.githubusercontent.com/jbakkas/SARSMutOnto/main/SARSMutOnto.owl>
WHERE {
  ?lineage a owl:NamedIndividual .
  ?lineage ns:label ?lineageName .
  ?lineage ns:has_for_WHO_name ?WHO_name .
  FILTER (?WHO_name != "")
}
Appendix A.4. Query 4
Listing A4. Sub-lineages of the lineage B.1.1.529.
PREFIX ns: <https://github.com/jbakkas/SARSMutOnto/blob/main/SARSMutOnto.owl#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?name
FROM <https://raw.githubusercontent.com/jbakkas/SARSMutOnto/main/SARSMutOnto.owl>
WHERE {
  ?subLineage rdfs:subClassOf ns:B.1.1.529 .
  ?ind a owl:NamedIndividual .
  ?ind a ?subLineage .
  ?ind ns:label ?name
}
Gene | Amino acid mutation |
---|---|
Spike protein gene | T19R |
Spike protein gene | E156G |
Spike protein gene | del157/158 |
Spike protein gene | L452R |
Spike protein gene | T478K |
Spike protein gene | D614G |
Spike protein gene | P681R |
Spike protein gene | D950N |
Nucleocapsid gene | D63G |
Nucleocapsid gene | R203M |
Nucleocapsid gene | D377Y |
ORF7a | V82A |
ORF8a | S84L |
ORF8a | del119/120 |
ORF1b | P314L |
ORF1b | P1000L |
ORF1b | G662S |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Bakkas, J.; Hanine, M.; Chekry, A.; Gounane, S.; de la Torre Díez, I.; Lipari, V.; López, N.M.M.; Ashraf, I. SARSMutOnto: An Ontology for SARS-CoV-2 Lineages and Mutations. Viruses 2023, 15, 505. https://doi.org/10.3390/v15020505