The Innovative Informatics Approaches of High-Throughput Technologies in Livestock: Spearheading the Sustainability and Resiliency of Agrigenomics Research

For more than a decade, next-generation sequencing (NGS) has been emerging as the mainstay of agrigenomics research. High-throughput technologies have made it feasible to conduct research at the scale and cost required for using these data in livestock research. Scaled-up sequencing frameworks for agricultural and livestock improvement, management, and conservation are partly attributable to innovative informatics methodologies and advancements in sequencing practices. Genome-wide sequence-based investigations are now conducted worldwide, and several databases have been created to connect scientific findings from around the world. Such studies are beginning to provide revolutionary insights into a new era of genomic prediction and selection capabilities for various domesticated livestock species. In this concise review, we provide selected examples of the current state of sequencing methods, many of which are already being used in animal genomic studies, and summarize the positive attributes of genome-based research for cattle (Bos taurus), sheep (Ovis aries), pigs (Sus scrofa domesticus), horses (Equus caballus), chickens (Gallus gallus domesticus), and ducks (Anas platyrhynchos). This review also emphasizes the advantageous features of sequencing technologies in monitoring and detecting infectious zoonotic diseases. In the coming years, the continued advancement of sequencing technologies in livestock agrigenomics will significantly influence the sustained momentum toward regulatory approaches that encourage innovation to ensure continued access to a safe, abundant, and affordable food supply for future generations.


Introduction
Due to the incentives for developing quantitative theories and methodologies, high-throughput next-generation sequencing (HT-NGS) technologies have become more accessible and are now employed in numerous biological science sectors [1]. Large-scale genome databases and sophisticated bioinformatics tools can open new avenues of research with a wide range of applications including, but not limited to, chromatin immunoprecipitation coupled with DNA microarray (ChIP-chip) or sequencing (ChIP-seq), RNA sequencing (RNA-seq), whole-genome genotyping, de novo genome assembling and

The First Generation of Sequencing Technologies
The throughput sequencing methods can be divided into three generations of sequencing technologies: post-monopoly era sequencing platforms, second-generation, and third- and fourth-generation sequencing platforms [15]. In 1986, near the end of the capillary electrophoresis technique sequencing era, further development resulted in an automated fluorescent technique to sequence a genome region. The primary technology in the "first generation" of automated DNA sequencing was reported using Applied Biosystems (ABI) fluorescent sequencing [16]. Subsequent efforts to improve sequencing techniques led to increasingly automated DNA-sequencing equipment with fluorometric detection and enhanced sensing based on capillary electrophoresis. The Wellcome Trust and the Medical Research Council were part of a global public effort to sequence the human genome, the Human Genome Project, which began in 1990 and was completed in 2003. Although the Sanger technique was used to sequence the first 3.0 billion bp of the human genome (released in 2000), the human reference genome has only covered the euchromatic part of the genome, leaving crucial heterochromatic regions incomplete. The Genome Reference Consortium (GRC) released the current human reference genome in 2013 and most recently updated it in 2019 (GRCh38.p13). This reference has evolved over the past 20 years and can be traced back to the Human Genome Project [17]. Several new sequencing technologies were developed two decades after the advent of electrophoretic techniques for DNA sequencing. The terms "next-generation" and "massively parallel" DNA sequencing are used to refer to the high-throughput sequencing (HTS) technologies that can sequence a large number of distinct DNA sequences in a single reaction. Sanger-based "top-down" techniques need to characterize large clones by low-resolution mapping in microtiter plate wells, whereas massively parallel approaches do not.
The main premise of the NGS approaches is that DNA ligase covalently attaches synthetic DNA adapters to each end of the targeted fragments, which are then amplified in situ on a solid surface. The Solexa technology was developed in 1998, and the 454 Life Sciences technology in 2000. The GS20 454 sequencing platform debuted in 2005 as the first non-Sanger-based commercialized technology [18]. The Roche 454, the first commercial NGS platform, employed large-scale parallel pyrosequencing chemistry to identify base pair sequences with higher throughput and lower sequencing costs per base than Sanger sequencing [19].

The Second Generation of Sequencing Technologies
The NGS methodology has been used in many fields, including transcriptome analysis, de novo assembly, genotyping, targeted, exome, and whole-genome sequencing, and the detection of SNPs, copy number variation, protein-protein interactions, and genome methylation. The Roche 454 genome technology, which is based on Melamede's sequencing-by-synthesis (SBS) theory (1985), was the first next-generation system to be commercially viable and uses pyrosequencing to detect the pyrophosphate released during DNA synthesis. The Roche 454 GS system was initially released in 2005, and in 2008 it was updated to the Roche GS-FLX 454 Titanium system. The GS-run processor and additional work in 2009 streamlined library preparation and data processing. Roche later offered the GS FLX+ sequencer, capable of reading 400-600 million base pairs per run with maximum read lengths of 1000 bp; however, Roche 454 was phased out in 2016 [20].
The SOLiD platform, developed by Harvard Medical School and the Howard Hughes Medical Institute, was commercially released by ABI in October 2007 and generated 4 Gb of sequencing data within a six-day run [12]. This system employs sequencing by oligonucleotide ligation and detection. SOLiD reads are recorded as a color code, which is decoded to produce the base sequence; consequently, an incorrect color call can result in decoding errors. In the mid-1990s, Balasubramanian and Klenerman used fluorescently tagged nucleotides to observe the movement of a single polymerase molecule [12]. The first Solexa sequencer was launched in June 2006; Illumina entered the industry in 2006, bought Solexa in 2007, and gradually advanced the NGS industry [16]. As a result of further advancements in the Illumina method, the Genome Analyzer II of 2008 included a paired-end module with new optics and camera components [12].
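The color decoding step, and the way a single miscalled color affects the decoded read, can be illustrated with a minimal Python sketch. It assumes the commonly described two-base encoding in which each color (0-3) equals the XOR of the 2-bit codes of two adjacent bases (A=0, C=1, G=2, T=3) and that the first base is known from the primer. The function names are hypothetical and are not taken from any SOLiD software.

```python
from __future__ import annotations

# Minimal sketch of SOLiD-style two-base (color-space) encoding and decoding.
# Assumption: bases are coded A=0, C=1, G=2, T=3 and each color is the XOR
# of the codes of two adjacent bases. Function names are illustrative only.

BASES = "ACGT"
CODE = {base: i for i, base in enumerate(BASES)}


def encode_colors(sequence: str) -> list[int]:
    """Return the color calls (0-3) for each adjacent pair of bases."""
    return [CODE[a] ^ CODE[b] for a, b in zip(sequence, sequence[1:])]


def decode_colors(first_base: str, colors: list[int]) -> str:
    """Rebuild the base sequence from a known first base and the colors."""
    decoded = [first_base]
    for color in colors:
        # XOR is its own inverse: previous base XOR color gives the next base.
        decoded.append(BASES[CODE[decoded[-1]] ^ color])
    return "".join(decoded)


if __name__ == "__main__":
    read = "ATGGCA"
    colors = encode_colors(read)
    print(colors, decode_colors(read[0], colors))

    # Simulate one miscalled color and decode again.
    miscalled = colors.copy()
    miscalled[1] = (miscalled[1] + 1) % 4
    print(decode_colors(read[0], miscalled))
```

Because each decoded base depends on the previous one, a single wrong color changes every base downstream of it when decoding is done naively; this is why color-space reads are usually compared with a reference in color space rather than decoded first.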
In 2010, Ion Torrent Systems Inc. (Guilford, CT, USA) introduced the first commercial sequencing method that did not rely on dye-labeled oligonucleotides and expensive optics. It monitors the H+ ions released during base incorporation and is particularly suited to amplicon sequencing. Despite its benefits, Ion Torrent's read accuracy remains a major challenge: the noisy sequencer signal introduces a high error rate when it is translated into a nucleotide sequence. Furthermore, the signal decays over time, resulting in a drop in the signal-to-noise ratio [21]. Polonator, a polony sequencing machine, was invented by Dr. George Church's group at Harvard Medical School in 2009. Polony sequencing, a nonelectrophoretic sequencing technology, can read millions of immobilized DNA sequences simultaneously at a lower cost per nucleotide than conventional Sanger sequencing. The fundamental limitation of this method is non-uniform amplification, which results in decreased sequencing accuracy and a read length of just 26 bp [22]. The second-generation 454 GS-FLX, Illumina, and SOLiD sequencing systems are not sensitive enough to detect individual single-molecule template extensions, whereas "third- or fourth-generation sequencers", such as the Heliscope, PacBio, and Oxford Nanopore sequencers, are single-molecule sequencers that do not require pre-amplification steps and are more sensitive and precise.

Third-Generation Sequencing Platforms
Third-generation sequencing systems do not include an amplification step during library creation. They allow single-molecule sequencing with average read lengths reaching 6-8 kbp and maximum read lengths exceeding 30-150 kbp. Single-molecule real-time (SMRT) sequencing uses real-time imaging to record, in parallel, DNA polymerases carrying out uninterrupted template-directed synthesis. By emphasizing read length, it departs from the existing short-read HTS instruments. In 2011, Pacific Biosciences made SMRT sequencing commercially available [23]. PacBio technology yields read lengths ranging from 1000 to 3000 bp on average [12].
Illumina technology employs DNA colony sequencing, which is based on reversible dye terminator sequencing via synthesis chemistry. The Illumina sequencing-by-synthesis method is the most extensively used NGS technology because it provides precise read alignment and improved indel identification [19]. Early in 2010, Illumina introduced HiSeq 2000, and the continued research on cutting-edge flow cells for Illumina HiSeq technology led to the numerous novel sequencing platforms introduced from 2011 to 2018. Illumina has produced popular sequencing systems, including MiSeq, HiSeq, and NovaSeq [24]. Current NGS methods are at least 100 times quicker than traditional Sanger sequencing. Using NGS, complete genome sequences may be retrieved, providing fast and comprehensive information [25]. As a result, NGS technology is frequently employed to monitor gene expression across an organism's genome.
Further development of HT next-generation platforms, such as the GeneReader NGS technique, the 10x Genomics platform, the SeqStudio™ Genetic Analyzer, the Bionano Saphyr™ genomics platform, the fluorescence resonance energy transfer-based GnuBio platform (Bio-Rad, Hercules, CA, USA), GenapSys, NanoString Technologies, an electron microscopy-based Electron Optica system, Firefly (Illumina, San Diego, CA, USA), and nanopore sequencing by Genia (Roche, Basel, Switzerland), can revolutionize biological science through the ability to sequence more samples at higher depths, producing more insightful data in less time and at a lower cost per sample [26].

Fourth-Generation Sequencing Platforms
Following the three generations, a new type of sequencer was recently developed, represented by the PacBio sequencer and Nanopore sequencer, known as fourth-generation sequencing [27]. Oxford Nanopore Technologies (ONT) introduced new TGS systems (MinION, PromethION, and GridION) in 2012, enabling the direct electronic analysis of DNA, RNA, proteins, and single molecules. This method uses nanopores and an exonuclease-based "deconstruction sequencing" approach. In 2014-2015, MinION devices were distributed to selected laboratories for beta testing. Nanopore technology can provide real-time sequencing of single molecules for as little as USD 25-40 per Gb of sequence data. Data processing is simpler than for short-read sequencers because alignment and assembly are more straightforward with nanopore reads. GridION is a compact benchtop system that can run up to five MinION Flow Cells simultaneously. It is ideal for labs with various applications that require the benefits of nanopore sequencing, such as facile library preparation, real-time analysis, and long reads. PromethION is meant for high-throughput work and employs the same chemistry as MinION and GridION, which are intended for real-time usage. However, based on the number of samples, it has a high fidelity for DNA and RNA sequencing. It is a rapid sequencing method, and nanopore technology may represent the future of sequencing.
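To put the quoted per-Gb price in context, the following Python sketch converts a genome size and a target coverage into a rough cost range. It is only back-of-the-envelope arithmetic: the default prices are the USD 25-40 per Gb figures given above, the 30-fold coverage is an illustrative assumption, and costs such as library preparation, labor, and computing are ignored. The function name is hypothetical.

```python
from __future__ import annotations


def sequencing_cost_range_usd(
    genome_size_gb: float,
    coverage: float,
    usd_per_gb_low: float = 25.0,
    usd_per_gb_high: float = 40.0,
) -> tuple[float, float]:
    """Return (low, high) cost estimates in USD for sequencing one sample."""
    if genome_size_gb <= 0 or coverage <= 0:
        raise ValueError("genome size and coverage must be positive")
    total_gb = genome_size_gb * coverage  # raw sequence data required, in Gb
    return total_gb * usd_per_gb_low, total_gb * usd_per_gb_high


if __name__ == "__main__":
    # 2.86 Gb is the B. taurus assembly size quoted in this review;
    # 30-fold coverage is an illustrative assumption.
    low, high = sequencing_cost_range_usd(genome_size_gb=2.86, coverage=30)
    print(f"Estimated cost: USD {low:,.0f} to {high:,.0f}")
```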
High-throughput approaches have tremendously aided research in obtaining genomic information for various species. Current NGS platforms support directed reads and pathogen detection [28]. Several zoonotic pathogens are detected using the Illumina NGS platform. For example, the Illumina HiScan and MiSeq technologies have been used to detect viral quasispecies in the capsid gene region as evidence of positive selection allowing cell tropism [29]. The ngs.plot program visualizes the enrichment patterns of DNA-interacting proteins at functionally essential locations using NGS data; it is therefore a helpful tool for bridging the gap between massive datasets and functionally important genomic information [30].

The Perspective of Domestic Animal Reference Sequences
The genomes of numerous domestic livestock species, including chickens, pigs, cattle, sheep, and horses, have recently been partially or entirely sequenced (Table 1; Figure 2). The Red Junglefowl (RJF) chicken genome was the first of these to be sequenced; its initial draft was generated from an assembly with 6.6-fold whole-genome shotgun coverage. The Bovine Genome Sequencing and Analysis Consortium published the taurine cattle genome sequence in April 2009. This preliminary assembly identified around 22,000 genes and 14,345 orthologs shared by seven mammalian species. The first draft (98% complete) of the pig genome (Sus scrofa), constructed through global collaborative efforts, has been made public. The pig genome is about 2.7 × 10^9 bp long, and the diploid karyotype comprises 38 chromosomes (including meta- and acrocentric ones). In 2010, the interim assembly version OARv2.0 for sheep was released to discover genes linked with sheep productivity, quality, and disease traits. OARv3.0 was finalized in 2012, with details on chromosomal gaps. In the following sections, we briefly discuss the development of genome research in cattle, pigs, chickens, sheep, and horses.
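The notion of "fold coverage" used above can be made concrete with a short Python sketch. It assumes the usual definition, coverage = (number of reads × mean read length) / genome size, and uses the idealized Lander-Waterman approximation that a fraction exp(-coverage) of bases is left unsequenced when reads land uniformly at random. The read count, read length, and genome size in the example are hypothetical values chosen only to give 6.6-fold coverage; they are not the parameters of the chicken genome project.

```python
import math


def fold_coverage(num_reads: int, mean_read_length_bp: float, genome_size_bp: float) -> float:
    """Average number of times each base is sequenced."""
    return num_reads * mean_read_length_bp / genome_size_bp


def reads_needed(target_coverage: float, mean_read_length_bp: float, genome_size_bp: float) -> int:
    """Smallest number of reads that reaches the target coverage."""
    return math.ceil(target_coverage * genome_size_bp / mean_read_length_bp)


def expected_uncovered_fraction(coverage: float) -> float:
    """Idealized fraction of bases hit by no read (uniform random placement)."""
    return math.exp(-coverage)


if __name__ == "__main__":
    # Hypothetical values: 11 million reads of 600 bp on a 1 Gb genome.
    c = fold_coverage(num_reads=11_000_000, mean_read_length_bp=600, genome_size_bp=1_000_000_000)
    print(f"coverage: {c:.1f}-fold")
    print(f"expected uncovered fraction: {expected_uncovered_fraction(c):.4%}")
```

Under these idealized assumptions, 6.6-fold coverage leaves only a small fraction of a percent of bases unsampled; real assemblies have more gaps because repeats and sequence-dependent biases violate the uniform-placement assumption.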

Insights into Cattle (Bos taurus) Genome Research
Cattle have a long-standing relationship with human civilization and are essential in agriculture and research as model animals. Approximately 1.5 billion cattle are raised annually worldwide. The global demand for beef in 2019 was 70 million tons, along with bovine dairy products [31]. Thus, cattle represent significant scientific opportunities and a vital economic resource. In 2009, the first complete sequence of the bovine genome was published. The Centre for Bioinformatics and Computational Biology at the University of Maryland published a whole-genome assembly of B. taurus (2.86 billion bp) as well as the UMD 3.1 B. taurus assembly [32].
The Bovine Genome Sequencing Project was undertaken owing to the unique nature of ruminants and their role as a critical protein source for humans. The bovine genome sequence and haplotype map have transformed the beef and dairy sectors [33,34]. Many linkage maps have since been built to identify economically significant traits in the bovine family because the linkage map is predicted to include 90% of the bovine genome [35].
In recent years, the map of the bovine genome has also advanced rapidly. Chromosomal maps and synteny also facilitate the detection of chromosomal conservation in other species, particularly those relevant for extrapolating data from mouse and human maps to cattle [36]. Radiation hybrid mapping is a useful approach for creating in-depth comparison maps of single chromosomes and whole genomes [37]. Whole-genome shotgun sequencing has been used to discover possible segmental duplications and compare them with publicly accessible bovine genome sequence assemblies [38].

The Decade of Swine (Sus scrofa) Genomic Research
According to molecular genetic data, the domestic pig (S. scrofa) is a eutherian mammal that emerged some 20-30 million years ago and originated in Southeast Asia [39]. Pork is a high-quality protein source that can offer a highly desirable eating experience and supplies ~35% of all meat production, with increasing global demand [40]. The pig is essential in biomedical research because transgenic and knockout pigs can be created using somatic nuclear cloning methods, resulting in various models for specific human diseases. It has been reported that 112 positions in porcine protein sequences have amino acids implicated in human disease [41]. Traditional selective breeding can take years to produce a pig with all the desired characteristics, whereas modification of the pig genome can provide the same results in much less time [42].
The Swine Genome Sequencing Consortium (SGSC) initiated a whole-genome sequencing study for pigs in early 2006. The Wellcome Sanger Institute sequenced the whole pig genome using clone-by-clone sequencing. More than 287 Mb of sequence has been completed from the 1660 accessioned clones used in the project [43]. High-throughput sequencing technologies have also greatly improved the study of bacterial populations colonizing the porcine gut. These studies reveal that humans share more nonredundant microbial genes with pigs than with mice. Thus, pigs are a better animal model than mice owing to their considerable similarities with humans [44].

Genetic Assembly Research in Chickens (Gallus gallus) and Ducks (Anatidae)
Chickens are the most popular fowls worldwide across different cultures and geographical areas and play a significant role in the rural economy in most underdeveloped and developing countries. Native chickens and ducks are reared in over 90% of rural homes. They are an essential element of a balanced farming system and serve as a source of high-quality animal protein in rural dwellings [45].
The chicken (G. gallus) is a key model organism for understanding the evolutionary relationship between mammals and other vertebrates. Genetic studies in chickens date back to the start of the twentieth century. The chicken genome comprises 38 autosomes and one pair of sex chromosomes, with the female as the heterogametic sex [46]. A consensus linkage map of the chicken genome has been created using all available genotyping data and has dramatically improved comparative gene mapping. This map shows that substantial syntenic areas between the human and chicken genomes seem to be consistently conserved [47].
Ducks (Anatidae) diverged from the related turkey, chicken, and zebra finch approximately 90-100 million years ago and are now one of the most commercially significant waterfowl for meat, eggs, and feathers [48]. The duck is also one of the most common domesticated waterfowl. Advances in NGS technologies have enabled population-level comparative genomic research to uncover the unique genetic features of domestic animals, including ducks. For example, 15.56 million single nucleotide polymorphisms have been discovered in Korean native ducks [49]. Fluorescence in situ hybridization (FISH) and microarray analysis are reported to be vital tools for detecting (i) large genomic rearrangements; (ii) copy number variants (CNVs); (iii) gene gains/losses; and (iv) gene order in the macrochromosomes of birds. Comparative genomics analysis has been conducted in chicken and Peking duck macrochromosomes using FISH mapping and microarray analysis. The results revealed one interchromosomal and six intrachromosomal rearrangements between these two species [50].

Genome Architecture in Sheep (Ovis aries)
The typical role of sheep is to provide meat, milk, and fiber as globally valuable commodities [51]. Sheep meat typically accounts for 3% of global meat production, and its quality depends on muscle quality and nutritional characteristics [52]. The introduction of NGS technology has allowed vast amounts of sequence information to be obtained at a substantially reduced cost [53]. Domestic sheep have 54 diploid chromosomes, of which 26 pairs are autosomes and two are sex chromosomes. The identification and functional annotation of genes governing the various traits of interest in sheep is critical.
The second-generation genetic map of sheep was created using 519 markers, and the genotypic data were merged using the international and USDA mapping flocks [54]. The completed genome spanned 2.62 Gb and comprised 7157 scaffolds with an N50 of approximately 2 Mb [55]. The International Sheep Genomics Consortium is working towards sequencing the reference sheep genome. The availability of the sheep genome sequence has allowed the anticipation of the functions of noncoding RNAs. Large-scale cDNA sequencing, also known as RNA-seq, provides complete transcriptome identification, annotation, and quantification [56]. Improvements in livestock breeding and awareness of desirable genetic traits across diverse breeds have also ushered in a new age in sheep genomics. Thus, the animal breeding sector is directly benefiting from the constant technological breakthroughs in NGS [57].
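For readers unfamiliar with the N50 statistic quoted for the sheep assembly, the following Python sketch shows how it is conventionally computed: scaffolds are sorted from longest to shortest, and N50 is the length of the scaffold at which the running total first reaches half of the total assembly length. The scaffold lengths in the example are made-up values for illustration and are unrelated to the actual sheep assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one scaffold length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first scaffold where half of the assembly is covered.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    # Hypothetical scaffold lengths (arbitrary units).
    toy_scaffolds = [2, 2, 2, 3, 3, 4, 8, 8]
    print(n50(toy_scaffolds))
```

An N50 of approximately 2 Mb therefore means that half of the assembled sequence lies in scaffolds of about 2 Mb or longer.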

Insight into Horse (Equus caballus) Genomics
Horses have played a vital role in agriculture, transport, industry, and sport since their domestication 6000 years ago. The first whole genome of the horse was released in 2009 [58]. Since 1995, the Antczak laboratory has been a significant participant in the international collaboration for the Horse Genome Project, a consortium of over 20 laboratories from more than 12 countries that have collaborated to produce various genetic and physical maps of the horse genome, culminating in the whole genome sequence [59].
The Eli and Edythe Broad Institute of the Massachusetts Institute of Technology and Harvard University in Cambridge performed the horse genome sequencing and assembly. Low-coverage paired-end whole-genome shotgun (WGS) data sets of 100,000 reads each were generated from seven horse breeds (Arabian, Andalusian, Akhal-Teke, Quarter Horse, Icelandic horse, Standardbred, and Thoroughbred). The WGS reads were placed uniquely on the Equus1.0 Thoroughbred assembly, and the SSAHA-SNP tool was used for SNP detection. The horse genome comprises 64 chromosomes [60], and the validation rate for these SNPs is estimated to be approximately 95%. Whole-genome sequencing of the horse has provided knowledge of equine genetic diversity; it has revealed 5.7 million single-nucleotide variations, 0.8 million minor indel variants, and some detrimental recessive alleles. This knowledge may facilitate the control of harmful recessive alleles in horse breeding programs and increase horse fertility [61]. According to the comparative genome sequencing of a late Pleistocene horse and the present genomes of five domestic horse breeds, all the current horses, zebras, and donkeys descend from the Equus lineage.
A study identified 29 genomic sites in horse breeds that depart from neutrality and display low variation compared to those in Przewalski's horse [62]. FISH has been used to create a second-generation whole-genome radiation hybrid, cytogenetic, and comparative map of the horse genome. This map includes 4103 markers covering all 31 autosome pairs and the X chromosome. The resulting integrated map provides the most detailed information on the physical and comparative structure of the equine genome. It is a tool for identifying genes that regulate the horse's health, illness, and performance [63].
The genomic maps of a male wild horse and a male Mongolian horse were improved by sequencing their genomes using NGS technology [64]. An assessment of the genomes of 38 normal horses from 16 different breeds revealed 258 CNV sites. Identifying the variations that contribute to equine genetic disorders requires a thorough understanding of CNVs in normal horse populations, both within and between breeds [65]. Equus species exhibit higher karyotypic diversity than other animals and have a wide range of diploid chromosome counts, ranging from 32 in the mountain zebra to 66 in Przewalski's horse.

Databases and Online Resources
Global assessment of population genetic diversity and identification of genome regions under natural and artificial selection have been facilitated by NGS [66]. However, challenges concerning the storage, accessibility, and efficient visualization of massive datasets remain. The need for bioinformatics resources to enable genomic research in farm animals is widely acknowledged [67,68]. Genomic databases have been created to offer current summaries on the state of genetic analysis in various farm and domestic animals, as well as experimental details and links (Table 2). Large-scale genomic databases and helpful bioinformatics programs can open new areas of study with a broad range of applications [69]. Resource databases and accompanying technologies have been created to manage vast amounts of experimental data. Several of these systems are designed to meet the requirements of global partnerships. Indeed, continuous development is necessary to maintain the integrity and usability of existing services, especially genome databases.

Table 2.

SNPchiMp: SNPchiMp is a public MySQL database with a web-based interface officially attributed as an Ensembl web-based server. SNPchiMp v.3 analyzed six livestock species, ranging from one array for goats to more than ten for cattle, with a total of 23 SNP arrays. The interface includes SNP mapping information from the most recent genome assembly, information extraction from dbSNP for SNPs detected in all commercially available bovine chips, and identification of SNPs shared by two or more bovine chips. [73]

Metabolome database, The Bovine Metabolome (BMDB), https://bovinedb.ca/ (accessed on 2 November 2022): It is a free online resource that contains thorough information about small molecule metabolites identified in bovines. It is meant to be used to learn more about bovine biology and the micronutrients contained in bovine tissues and biofluids, as well as to improve veterinary treatment of beef and dairy cattle. Serum, ruminal fluid, liver, longissimus thoracis (LT) muscle, semimembranosus (SM) muscle, and testis tissues are all characterized quantitatively in BMDB. Many data fields are connected to various databases (HMDB, PubChem, MetaCyc, ChEBI, UniProt, and GenBank) and applets for visualizing structure and pathways. [74]

ISGC: The ISGC helps researchers identify genetic regions and genes that influence sheep characteristics. This database serves as a backbone for ruminant species when coupled with data from other ruminant genome sequences. The database contains sheep genome assemblies and variants of 935 sheep representing 69 breeds from 21 countries. In addition to providing a genetic resource for animal biomedical research models, this assembly is a genomic resource for humans. [77]

Sheep Genomes Database, Sheep Genomes DB, https://sheepgenomesdb.org/ (accessed on 2 November 2022): The USDA AFRI-funded Sheep Genomes Database is a project of the International Sheep Genomics Consortium that builds on the consortium's recent achievement of creating and sharing the Oar rambouillet v1.0 genome. It gathers and facilitates access to sheep genomic data, detects variants, and allows SNP and CNV data from sheep genomes to be downloaded.

GEISHA: GEISHA is a chicken embryo in situ hybridization gene expression database and genomics resource. More than 36,000 pictures of whole-embryo in situ hybridizations and embryo sections from embryonic days 0-5, as well as some older embryo data focusing on late-developing tissues, are currently available in the GEISHA database. [84]

AnimalMetagenomeDB: AnimalMetagenomeDB combines metagenomic sequencing data with host information to help users discover relevant data. Animal metagenomic data may be viewed, searched for, and downloaded by users. Metadata for 82,097 metagenomes from four domestic animals (bovines, sheep, horses, and pigs) and 540 wild species are included in AnimalMetagenomeDB version 1.0. These metagenomes span 15 years of research, 73 nations, 1044 investigations, 63,214 amplicon sequencing data points, and 10,672 whole genome sequencing data points.

Outline of Zoonosis Infections
Identifying and analyzing host-pathogen interactions (HPIs) and protein-protein interactions (PPIs) are critical in studying infectious diseases. However, the accessible databases of molecular interactions do not contain much HPI and PPI data, particularly for host-pathogen systems in agriculture [89,90]. Based on surveillance data, the CDC reports that the largest share of zoonotic illnesses (41.4%) is bacterial, followed by viral (37.7%), parasitic (18.3%), fungal (2%), and prionic (0.8%). The online databases and tools available for discovery, annotation analysis, and archiving of microbiome data are shown in Table 3.
Zoonotic diseases can be transmitted from an infected animal or human to an exposed host [91]. Viruses, bacteria, fungi, and parasites are among the pathogens that cause these illnesses [92]. They may spread to humans via food, blood transfusion, vectors in the air, or direct contact [93]. Moreover, 60% of emerging infectious diseases are reported to originate from zoonotic pathogens [94]. The Centers for Disease Control and Prevention estimates that, in the United States of America alone, 48 million people get sick from foodborne diseases, 128,000 are hospitalized, and 3000 die each year [95].
Infectious diseases in cattle, swine, horses, sheep, chickens, and chicken products cause significant economic losses for the livestock industry. Outbreaks of zoonotic contagious illnesses in humans, and of reverse zoonotic disease transmission (zooanthroponosis), are produced by pathogen spillover (cross-species spillover), and areas where humans and animals interact regularly are possible spillover areas [96]. The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been debated as either a zoonotic disease or an emerging infectious disease [96]. The COVID-19 pandemic has brought to public attention that even the most highly developed and best-qualified healthcare networks worldwide can collapse when confronting a novel viral infectious disease of zoonotic origin. Before the COVID-19 pandemic, African swine fever significantly impacted the global livestock industry [97]. The COVID-19 pandemic then substantially affected human health and the economy. It has also jeopardized the sustainability of livestock and agri-based products, significantly affecting quality of life and causing economic losses. At the same time, more than 150 enteric viruses now recognized as crucial to human and animal health are included in genomic surveillance efforts to monitor and forecast the next pandemic spillover. To minimize economic losses in cattle production, advanced procedures must be prepared, and public health care considerations must also be accommodated [98].

Cenote-Taker 2
Cenote-Taker 2 is a virus discovery and annotation tool written in Bash, Perl, and Python. It is available via the command line and a graphical user interface with free computation access, and it employs highly sensitive models of hallmark virus genes to discover familiar or divergent viral sequences from user-input contigs. Furthermore, Cenote-Taker 2 employs a versatile set of modules to automatically annotate the sequence features of contigs, providing more gene information than comparable tools. All scripts can be found on GitHub, and the BLAST and HMMER databases created for this tool can be found on Zenodo.

MetaGeneMark
MetaGeneMark's developers, GENE PROBE Inc., have created and refined algorithms for gene prediction in metagenomic sequences for over fifteen years. The website provides access to gene prediction in metagenomes using metagenome-specific parameters. MetaGeneMark-2 has been further optimized for gene discovery in anonymous metagenomic sequences. Compared with MetaGeneMark, for which the rate is estimated to be 2.7%, MetaGeneMark-2 nearly halves the rate of false negative predictions and missed genes. MetaGeneMark-2 is a C++ program, and all experiments and results are run and analyzed in Python. All scripts can be found on GitHub.

VEuPathDB
The database includes more than 500 organisms, including invertebrate vectors, eukaryotic pathogens (protists and fungi), and relevant free-living or non-pathogenic species or hosts. VEuPathDB projects integrate >1700 pre-analyzed datasets (and related metadata) with extensive search capabilities, visualizations, and analysis tools in a graphical interface to provide researchers with access to omics data and bioinformatic studies.

WormBase ParaSite
WormBase was established in 2000 and offers each of its species a dependable and recognizable user interface. Furthermore, WormBase ParaSite version WBPS17 assembles reliable, current information about the genetics, genomes, and biology of the nematode Haemonchus contortus, an animal endoparasite infecting wild and domesticated ruminants (including sheep and goats) worldwide.
http://www.wormbase.org (accessed on 3 November 2022); https://parasite.wormbase.org/Haemonchus_contortus_prjeb506/Info/Index/ (accessed on 3 November 2022) [116][117][118]

Global Mammal Parasite Database version 2.0 (GMPD)
GMPD is a database of parasites of wild ungulates (artiodactyls and perissodactyls), carnivores, and primates, and is provided for download as complete flat files. The updated database contains over 24,000 entries from over 2700 literature sources. It includes data on sampling method and sample size when obtainable, as well as "reported" and "corrected" binomials for each host and parasite species. Current higher taxonomies and data on the transmission modes used by the majority of the parasite species in the database are also included.
parasites.nunn-lab.org (accessed on 3 November 2022) [119]

Fungi: Saccharomyces Genome Database (SGD)
The SGD project delivers the highest-quality manually curated information from peer-reviewed literature and from algorithms such as sequence similarity searches, which leads to extensive details on genome characteristics and gene relationships. Researchers have public access to these data through online sites that are built for ease of use.

https://gold.jgi.doe.gov/ (accessed on 3 November 2022) [123]

Metagenomics RAST server (MG-RAST)
The MG-RAST server is an open-source comparative genomics system based on the SEED platform. Users can upload raw FASTA sequence data; the sequences are normalized and analyzed, and summaries are generated automatically. The service offers multiple methods for accessing the various data types, such as phylogenetic and metabolic reconstructions, as well as the ability to compare the metabolism and annotations of one or more metagenomes and genomes.

EggNOG Database
EggNOG is a publicly available database that analyzes thousands of genomes at once to determine orthology links between all of their genes. An update significantly enlarged the underlying genome sets, which now include 4445 representative bacteria and 168 archaea derived from 25,038 genomes, 477 eukaryotic species, and 2502 viral proteomes.

Potential zoonotic exposure through contact with cattle or their products causes concern, as approximately 15.4 million pounds of beef products are rejected/canceled annually [127]. Bovine zoonoses, including anthrax, brucellosis, cryptosporidiosis, dermatophilosis, Escherichia coli infection, giardiasis, leptospirosis, listeriosis, pseudo cow pox, Q fever, rabies, ringworm, salmonellosis, tuberculosis, and vesicular stomatitis, are of serious public health significance and cause severe economic losses in animal industries [128]. Rotavirus group A is one of the most common causes of newborn calf diarrhea. In 2013, a group of rotaviruses was discovered in an epizootic outbreak of diarrhea in adult cows in Japan, which coincided with a drop in milk output [129]. Bovine enterovirus is another virus that causes diarrhea in cattle. Associated signs, including abortion, stillbirths, infertility, neonatal mortality, diarrhea, pyrexia, dehydration, and weight loss, have been reported worldwide. NGS technology and quantitative reverse transcription (qRT)-PCR have been used to identify bovine enteroviruses [130].
Pigs are also excellent human disease models and can spread various infections to humans. In addition, pork can transmit several life-threatening conditions. The major zoonotic diseases associated with swine include influenza, ringworm, erysipelas, campylobacteriosis, salmonellosis, cryptosporidiosis, giardiasis, balantidiasis, E. coli infection, brucellosis, and streptococcosis [131]. Pig parasites and their potential to infect humans have lately become a severe public health concern because of recent parasitic disease outbreaks in which pigs acted as vectors [132].
Poultry are raised across cultures, customs, and religions for food security and nutrition, supplying meat and eggs. Approximately 106 million tons of chicken meat are supplied to the market globally, a volume that continues to increase relative to beef and pork [133,134]. Zoonotic infections associated with poultry commonly include avian influenza, tuberculosis, erysipelas, ornithosis, cryptococcosis, histoplasmosis, salmonellosis, cryptosporidiosis, campylobacteriosis, and escherichiosis [131]. In March 2004, the chicken genome became the first genome sequenced in any agriculture-related animal species [135].
The zoonotic infections spread by sheep include severe viral diseases that can affect all mammals, such as rabies, as well as other diseases such as salmonellosis, listeriosis, Q fever, ringworm, and chlamydiosis [131]. Equine disease models can also be used to study various human diseases. Equine recurrent uveitis is an autoimmune illness that affects horses, and it is the only valid spontaneous model of human autoimmune uveitis [136]. One of the more prevalent zoonotic parasitic diseases is toxoplasmosis. The late 1930s saw the first recognition of T. gondii-related disease in humans. The primary mechanism of the vertical transmission of T. gondii involves tachyzoites [137]. Although tachyzoites of T. gondii have been discovered in the milk of a number of intermediate hosts, including sheep, goats, and cows, a report suggested that acute toxoplasmosis in humans has mostly been associated with the intake of unpasteurized goat's milk [138,139]. Furthermore, T. gondii found in livestock meat is considered a significant source of infection for people [140].
Many previously unknown disease-related and zoonosis-causing mutations have been discovered through advances in genome sequencing [141]. NGS sheds fresh light on the zoonotic spread of microorganisms. High-resolution or ultra-deep sequencing has shown the genetic diversity of influenza A and hepatitis E [96,142]. HT-NGS techniques have been utilized for the genomic sequencing of influenza (H1N1) from animals. HTS-based metagenomic methods can be used to investigate outbreaks of new etiology, to understand host responses to diverse viral infections, to gain information on well-known illnesses suspected of having a multi-factorial etiology, and to support epidemic control through quick diagnosis, high sensitivity, and flexible analysis. Thus, these techniques have the potential to lead to several new advancements in food safety and public health [143].

The Mechanism of Zoonoses
Bacteria, viruses, parasites, and fungi are the primary pathogens that cause zoonotic diseases [91]. Anti-microbial resistance is a severe global issue affecting both humans and agricultural animals [144]. Adaptive resistance is the product of bacterial survival mechanisms in response to altered environmental conditions; it is attained by horizontal and vertical gene transfer. Viruses rarely encounter optimal environments, and natural selection acting on mutations enables their survival in extreme conditions [145]. Many mutations in the host may be eradicated through purifying selection [146]. Purifying selection is the predominant form of selection, as it persistently removes the newly appearing deleterious mutations that virus replication produces in coding regions [147]. Depending on genome composition and host cellular organization, viruses are expected to encounter widely varying selection pressures, particularly in an advanced organism such as a vertebrate, which has a unique mechanism of immunity: the highly specific detection of foreign proteins by certain recognition receptors [148]. A high mutation rate can cause the production and accumulation of deleterious mutations; however, those deleterious mutations can be eradicated by purifying selection [147]. Hughes et al. indicate that purifying selection is ongoing at nonsynonymous sites and not at synonymous sites, and that purifying selection acts more effectively in RNA viruses than in DNA viruses [148]. Moreover, the authors suggested that purifying selection is relaxed on exposed proteins of RNA and DNA viruses that infect vertebrates, except in the case of those with arthropod vectors (the influence of purifying selection varies with infection of different hosts) [148]. Arcangeli et al. confirmed the presence of purifying selection in their studies, revealing that single-stranded RNA small-ruminant lentiviruses (SRLVs) exhibit a high mutation rate and frequent recombination events, but that the obtained ratio of non-synonymous (dN) to synonymous (dS) substitutions (dN/dS) indicated the presence of purifying selection [149].
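To make the dN/dS reasoning concrete, the following Python sketch shows one simplified way such a ratio might be computed and interpreted. It assumes the counts of nonsynonymous and synonymous differences and sites are already available (for example, from a counting method applied to aligned coding sequences) and applies a Jukes-Cantor correction to each proportion. The function names and the example counts are hypothetical; they are not taken from the cited studies.

```python
import math


def jukes_cantor(p):
    """Correct a proportion of differing sites for multiple hits (Jukes-Cantor)."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion must be in [0, 0.75) for the correction")
    return -0.75 * math.log(1 - 4 * p / 3)


def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return dN/dS from difference and site counts (all counts are inputs)."""
    p_n = nonsyn_diffs / nonsyn_sites  # proportion of nonsynonymous differences
    p_s = syn_diffs / syn_sites        # proportion of synonymous differences
    d_n = jukes_cantor(p_n)
    d_s = jukes_cantor(p_s)
    if d_s == 0:
        raise ValueError("dS is zero; the ratio is undefined")
    return d_n / d_s


def interpret(omega, tolerance=0.05):
    """Map a dN/dS ratio to the usual qualitative reading."""
    if omega < 1 - tolerance:
        return "purifying"
    if omega > 1 + tolerance:
        return "positive"
    return "neutral"


# Hypothetical counts, chosen only for illustration.
omega = dn_ds(nonsyn_diffs=10, nonsyn_sites=700, syn_diffs=30, syn_sites=300)
print(f"dN/dS = {omega:.3f} -> {interpret(omega)} selection")
```

A ratio well below 1, as in this toy input, is the pattern usually read as purifying selection; the tolerance band around 1 is an arbitrary choice for the sketch.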
Virus mutation rates vary with polymerase fidelity, which is highest in DNA polymerases that possess proofreading activity. Mutation in RNA viruses also depends on genome size, with lethal mutations more frequent in larger RNA genomes. The mutation rate of DNA viruses varies depending on polymerase error, host reaction, and viral error-correction enzymes. Some small DNA viruses do not encode a DNA polymerase and rely on host polymerases for proofreading [150]. Importantly, RNA-dependent RNA and DNA polymerases make more errors than DNA polymerases, leading to more mutations, because they lack proofreading activity (Figure 3). Viral mutation rates can also depend on the infected host species [151]; however, the mechanisms associated with virus spillover are still under investigation. Considering the possible impact of spillover events caused by fast mutation and resistance to conventional medications, currently available technological and NGS approaches should be employed to mitigate the effects of such infections on animals.
Figure 3. The highest mutation rates shown are 2.0 × 10^-3 nucleotide substitutions/site/year in the Ebola virus, 0.8-2.38 × 10^-3 in coronaviruses, and 1.21 × 10^-2 in norovirus. The RNA-dependent RNA and DNA polymerases are more likely to cause mutations than DNA-dependent DNA polymerases because they lack proofreading activity. The mutation rate of reverse transcriptase is higher than that of DNA polymerase; however, RNA viruses show more mutations than retroviruses.

HT-NGS and Bioinformatics Simulations for Pathogens Detection
With a predicted global population of approximately 10 billion people by 2050, there will be an unparalleled growth in demand for animal protein, including meat, eggs, milk, and other animal products. The worldwide task will be to provide a food supply that is inexpensive, safe, and sustainable [152,153]. HT-NGS, paired with computer modeling and algorithms, allows us to effectively diagnose infection in domestic animals and identify known or unknown pathogens [154]. Sequencing technologies enable the screening of vast populations of domesticated animals for genetic variations that mirror human genetic illnesses and allow the development of models that represent uncommon human disorders more precisely. These technologies can facilitate the real-time identification and quantification of aerobic and anaerobic bacteria and fungi.
The genomics revolution provides enormous promise for generating novel insights and disease control techniques, as the pathogens of tickborne livestock diseases have been sequenced. Additionally, with the increasing accessibility of genetic resources, the interconnections between species participating in the tick-host-pathogen system can be investigated [155]. In Australasia and Asia, tickborne illnesses have significant adverse economic impacts on cattle operations. Oriental theileriosis is a tickborne illness that affects cattle and is caused by members of the Theileria orientalis complex. Five genotypes of the T. orientalis complex have been identified in 13 cattle samples using NGS [156]. Viral metagenomic analyses can be used to detect groups of rotaviruses in fecal samples, allowing impartial and thorough diagnoses of diseases in the animal.
Over the last decade, the control of parasitic sheep illnesses has remained difficult despite several changes and developments in management, which threatens the safe rearing of sheep in many parts of the world and increases human zoonotic hazards [157]. As large-animal models for biomedical research, sheep are more promising than mice because they have more physiological similarities to humans [157]. Small ruminant lentiviruses (SRLVs) have at least four highly diverse viral genotypes, which persist in the sheep spleen. Whole-genome characterization of SRLV is now possible through NGS [158]. The establishment of genome sequence databases can facilitate the prompt and accurate recognition of emerging unknown infections or disease strains, supporting efforts to curb widespread contemporary diseases such as the coronavirus pandemic.
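The idea of recognizing pathogens by comparing sequencing reads against a reference database can be illustrated with a small Python sketch. It is a toy k-mer screen, not the method of any tool or study cited here: the reference names and sequences are made up, the function names are hypothetical, and real pipelines additionally handle reverse complements, sequencing errors, and far larger databases.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of all length-k substrings of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references, k):
    """Map each k-mer to the names of the reference sequences containing it."""
    index = {}
    for name, seq in references.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(name)
    return index


def classify_read(read, index, k, min_fraction=0.5):
    """Assign a read to the reference sharing the most k-mers, or None."""
    read_kmers = kmers(read, k)
    if not read_kmers:
        return None  # read shorter than k
    hits = Counter()
    for kmer in read_kmers:
        for name in index.get(kmer, ()):
            hits[name] += 1
    if not hits:
        return None  # nothing in the panel matches: a candidate unknown
    name, count = hits.most_common(1)[0]
    return name if count / len(read_kmers) >= min_fraction else None


# Made-up reference panel and reads, for illustration only.
references = {
    "refA": "ACGTACGTTAGCCGATAGGCTA",
    "refB": "TTGACCGGTTAACCGGAATTCC",
}
index = build_index(references, k=5)
known = classify_read("CGTTAGCCGATA", index, k=5)
unknown = classify_read("GGGGGGGGGGGG", index, k=5)
print(known, unknown)
```

Reads that match no reference are returned as None, which mirrors the situation in which a sample contains a pathogen absent from the database and would need follow-up analysis.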

Conclusions
Although HT-NGS technology changed sequencing by providing unprecedented depth and accuracy, it still has significant limitations. The generation of short reads is a severe challenge. The so-called "short-read sequencing" that defines NGS technologies necessitates the use of specialized bioinformatics tools and complex post-processing pipelines, making high-throughput data handling more challenging and increasing the average duration of the analysis. Short-read approaches are often characterized by large equipment and time-consuming experimental processes, as well as substantial bioinformatics analysis, all of which complicate post-processing. Studies on variation analysis report that long reads have enabled researchers to more easily characterize large insertions, deletions, translocations, and other structural alterations that could be present across genomes. Longer reads span larger chromosomal segments, resulting in more contiguous genome reconstructions.
Furthermore, the metagenomic approach in environmental samples tends to compound errors, thus confounding conclusions concerning pathogen diversity. It is challenging to determine pathogen virulence in humans or their domestic animals because infections and parasites are so varied. The databases described here can assist in making predictions and in directing research to validate the predictions made using bioinformatic databases developed with NGS technology. Regardless of rates or timeframes, the most critical purpose of animal genome research is to enhance our understanding of the genomic information of different breeds and to control and prevent the spread of animal disease, thereby averting agroeconomic losses and preventing the outbreak of new pandemics.
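The notion of a "more contiguous" reconstruction can be quantified. One widely used measure is N50, the contig length at which contigs of that length or longer cover at least half of the assembly. The text above does not name this metric, so its use here is an assumption made for illustration, and the contig lengths below are invented.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Invented assemblies of the same 1500 bp region.
fragmented = [100, 200, 300, 400, 500]  # many short contigs
contiguous = [1500]                     # one contig spanning the region
print(n50(fragmented), n50(contiguous))
```

For two assemblies of the same total size, the one with the larger N50 is the more contiguous, which is the property that longer reads are expected to improve.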