Bacterial infections are a major global concern, since they can lead to public health problems. To address this issue, bioinformatics contributes extensively with the analysis and interpretation of in silico data by enabling to genetically characterize different individuals/strains, such as in bacteria. However, the growing volume of metagenomic data requires new infrastructure, technologies, and methodologies that support the analysis and prediction of this information from a clinical point of view, as intended in this work. On the other hand, distributed computational environments allow the management of these large volumes of data, due to significant advances in processing architectures, such as multicore CPU (Central Process Unit) and GPGPU (General Propose Graphics Process Unit). For this purpose, we developed a bioinformatics workflow based on filtered metagenomic data with Duk tool. Data formatting was done through Emboss software and a prototype of a workflow. A pipeline was also designed and implemented in bash script based on machine learning. Further, Python 3 programming language was used to normalize the training data of the artificial neural network, which was implemented in the TensorFlow framework, and its behavior was visualized in TensorBoard. Finally, the values from the initial bioinformatics process and the data generated during the parameterization and optimization of the Artificial Neural Network are presented and validated based on the most optimal result for the identification of the CTX-M gene group.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited