PIDS: A User-Friendly Plant DNA Fingerprint Database Management System
Abstract
1. Introduction
2. Materials and Methods
2.1. System and Database Schema
2.2. Fingerprint Merging and Comparison Algorithms
2.2.1. Fingerprint Merging Algorithm
2.2.2. Fingerprint Comparison Algorithm
Algorithm 1: GCA
Input: [a1, b1] as locus 1 and [a2, b2] as locus 2, where a1/a2 is the first value of the diploid pair and b1/b2 is the last; n is the allowed offset
result = 1  # initialized as "not the same"
conditionR1 = abs(a1 - a2) ≤ n
conditionR2 = abs(b1 - b2) ≤ n
conditionR3 = abs(a1 - b2) ≤ n
conditionR4 = abs(a2 - b1) ≤ n
if (conditionR1 and conditionR2) or (conditionR3 and conditionR4)
    result = 0
end if
return result
Output: 0 -> the two loci are the same; 1 -> the two loci are different
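As a minimal sketch, Algorithm 1 can be written in Python as follows. The function name `gca`, the tuple representation of a locus, and the example allele sizes are illustrative assumptions, not part of the PIDS code base.

```python
def gca(locus1, locus2, offset):
    """Return 0 if two diploid loci match within `offset`, otherwise 1.

    Each locus is assumed to be a pair (a, b) of numeric allele values.
    """
    a1, b1 = locus1
    a2, b2 = locus2
    # conditionR1 and conditionR2: alleles compared in the same order
    same_order = abs(a1 - a2) <= offset and abs(b1 - b2) <= offset
    # conditionR3 and conditionR4: alleles compared in swapped order
    swapped_order = abs(a1 - b2) <= offset and abs(a2 - b1) <= offset
    return 0 if (same_order or swapped_order) else 1


# Hypothetical allele sizes, for illustration only
print(gca((120, 124), (124, 121), 2))  # alleles agree once the order is swapped
print(gca((120, 124), (120, 130), 2))  # second allele is outside the offset
```

Checking both orderings means the result does not depend on which allele of the pair was recorded first.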
Algorithm 2: FCPP
Input: queue f1 of length n and queue f2 of length m; every fingerprint in f1 and f2 should have the same number of loci, named p
Initialize compareResult array of size m*n
for i = 1 to m
    for j = 1 to n
        diffCount = 0
        for k = 1 to p
            diffCount = diffCount + GCA(f1[j][k], f2[i][k])
        end for
        compareResult.append([f1[j], f2[i], diffCount])
    end for
end for
return compareResult
Output: compareResult
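The pairwise comparison in Algorithm 2 can be sketched in Python as below. This is an illustration under stated assumptions: fingerprints are lists of `(a, b)` allele pairs, the allowed offset is passed explicitly as `offset`, and results are reported as list indices rather than whole fingerprints. The names `fcpp` and `gca` and the sample data are hypothetical.

```python
def gca(locus1, locus2, offset):
    """Algorithm 1: 0 if the two loci match within `offset`, otherwise 1."""
    (a1, b1), (a2, b2) = locus1, locus2
    same_order = abs(a1 - a2) <= offset and abs(b1 - b2) <= offset
    swapped_order = abs(a1 - b2) <= offset and abs(a2 - b1) <= offset
    return 0 if (same_order or swapped_order) else 1


def fcpp(f1, f2, offset):
    """Compare every fingerprint in f1 with every fingerprint in f2.

    Returns a list of (j, i, diff_count) tuples, where j indexes f1,
    i indexes f2, and diff_count is the number of differing loci.
    """
    compare_result = []
    for i, fp2 in enumerate(f2):          # outer loop over f2, as in the pseudocode
        for j, fp1 in enumerate(f1):      # inner loop over f1
            if len(fp1) != len(fp2):
                raise ValueError("fingerprints must have the same number of loci")
            # Sum the per-locus GCA results over all p loci
            diff_count = sum(gca(l1, l2, offset) for l1, l2 in zip(fp1, fp2))
            compare_result.append((j, i, diff_count))
    return compare_result


# Hypothetical two-locus fingerprints, for illustration only
f1 = [[(100, 102), (200, 204)], [(100, 110), (200, 204)]]
f2 = [[(102, 100), (201, 204)]]
print(fcpp(f1, f2, offset=2))
```

The running time grows with m × n × p, which is why limiting the comparison range and filtering results matters for large databases.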
2.3. System and Database Implementation
3. Results
3.1. System Modules and Functionality
3.2. System Model
3.3. System Access
3.4. Data Quality Control
4. Discussion
5. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Comparison Method | Default Fingerprint Data Range |
---|---|
Database Comparison | Entire Local Fingerprint Database |
Homonymy Comparison | Fingerprints of the same name or synonyms within the entire Local Fingerprint Database |
Non-homonymy Comparison | Fingerprints with different names and synonyms within the entire Local Fingerprint Database |
Sub-Database Comparison | Assigned through Excel |
Paired Comparison | Assigned through Excel |
Condition | Description |
---|---|
Number of comparison loci | Controls the matching of loci between different fingerprints; range: X ≥ 0. This parameter is used in the Fingerprint Comparison Algorithm. A suitable value reduces invalid comparisons and so improves comparison speed. The default value of min(X) is 20. |
Number of differential loci | Controls the locus difference between samples; used in the Fingerprint Comparison Algorithm to filter the comparison results. A suitable value reduces the display of useless results. Range: X ≥ 0; the default value of max(X) is 20. |
Percentage of differential loci | Controls the degree of difference between samples; range: 0 ≤ X ≤ 1. A detailed description can be found in the mixed strain comparison algorithm. The default value of max(X) is 0.05. |
Base offset | Controls the allowed difference between two loci; range: 0 bp ≤ X ≤ 2 bp. The default value of max(X) is 2 bp. This parameter is used in the comparison algorithm. |
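One way the four conditions above could be applied to a comparison result is sketched below. The class name `ComparisonFilter`, its field names, and the rule that all thresholds must hold at once are assumptions made for illustration. The percentage is taken here as differing loci divided by compared loci, which is also an assumption; only the ranges and default values come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComparisonFilter:
    min_compared_loci: int = 20      # default min(X) for "Number of comparison loci"
    max_diff_loci: int = 20          # default max(X) for "Number of differential loci"
    max_diff_fraction: float = 0.05  # default max(X) for "Percentage of differential loci"
    base_offset_bp: int = 2          # default max(X) for "Base offset"

    def __post_init__(self):
        # Enforce the ranges listed in the table
        if self.min_compared_loci < 0 or self.max_diff_loci < 0:
            raise ValueError("locus counts must be >= 0")
        if not 0 <= self.max_diff_fraction <= 1:
            raise ValueError("percentage must be between 0 and 1")
        if not 0 <= self.base_offset_bp <= 2:
            raise ValueError("base offset must be between 0 and 2 bp")

    def keeps(self, compared_loci, diff_loci):
        """Return True if a comparison result passes every threshold."""
        if compared_loci == 0 or compared_loci < self.min_compared_loci:
            return False  # too few loci compared to be meaningful
        if diff_loci > self.max_diff_loci:
            return False
        return diff_loci / compared_loci <= self.max_diff_fraction


defaults = ComparisonFilter()
print(defaults.keeps(compared_loci=40, diff_loci=2))
print(defaults.keeps(compared_loci=40, diff_loci=3))
```

Validating the ranges at construction time keeps an out-of-range setting from silently changing which results are displayed.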
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Jiang, B.; Zhao, Y.; Yi, H.; Huo, Y.; Wu, H.; Ren, J.; Ge, J.; Zhao, J.; Wang, F. PIDS: A User-Friendly Plant DNA Fingerprint Database Management System. Genes 2020, 11, 373. https://doi.org/10.3390/genes11040373