MDPI - Publisher of Open Access Journals

23 pages, 4623 KB

Open AccessArticle

ViroBioTree: A Tree-Structured Biological Evidence Retrieval Framework for Viral Protein Function Annotation

by Tinglian Lai, Fuguo Liu, Guodong Li and Liyan Hua

Viruses 2026, 18(6), 656; https://doi.org/10.3390/v18060656 - 9 Jun 2026

Viewed by 409

Accurate viral protein function annotation is essential for genomic surveillance, yet conventional retrieval-augmented generation (RAG) pipelines often fragment biological evidence into fixed-length text chunks, disrupting relationships among ORFs, annotations, structural domains, sequence motifs, residue mappings, and model-derived attention evidence. We propose ViroBioTree, a tree-structured biological evidence retrieval framework for downstream viral protein evidence review rather than a new primary annotation classifier. Built as an evidence organization layer on ViralMultiNet-derived ORF-level predictions and annotations, ViroBioTree converts sequence, annotation, structure, and attention evidence into typed biological nodes and traceable edges, then performs deterministic multi-channel recall, evidence-aware reranking, balanced TopK selection, rule-based verification, and node-cited report generation. In a demo benchmark, ViroBioTree achieved its strongest deterministic proxy performance on structure-explanation tasks, with Precision@K = 1.0, Recall@K = 1.0, and diversity = 0.52; these values reflect expected node-type and tag agreement rather than independent biological correctness. A bounded full-scale SARS-CoV-2 index contained 39,800 ORF rows, 80,000 attention records, 199,418 nodes, and 495,886 edges. In a stratified full20k diagnostic evaluation, ViroBioTree showed task-dependent advantages over LlamaIndex vector retrieval for conflict detection, evidence retrieval, and structure explanation, while LlamaIndex remained competitive or stronger for annotation-rich function annotation. A cross-family Influenza A Virus (IAV) diagnostic audit showed that the schema can represent IAV evidence namespaces while explicitly exposing missing formal ORF inputs, missing attention evidence, and unavailable residue/PDB assertions. 
Supplementary robustness, external sanity-check, diversity-risk, expert-evaluation, domain-tool positioning, and cross-family audit analyses supported traceability, report quality, and conservative evidence handling, but also showed that stable Precision@K under query perturbation does not necessarily imply stable retrieved evidence sets. ViroBioTree operates offline and deterministically, but does not address raw-read assembly, base calling, primary ORF prediction, or wet-lab validation. Its results should be interpreted as proxy and expert-reviewed evidence for traceable viral protein evidence retrieval and report generation rather than as direct validation of biological function annotation. Full article
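
The Precision@K and Recall@K proxy metrics quoted above can be illustrated with a minimal sketch. It assumes the common definitions (hits in the top K divided by K, and hits divided by the number of expected items); the node identifiers are hypothetical and do not come from the article, and the article's exact scoring rules may differ.

```python
from typing import Sequence, Set, Tuple


def precision_recall_at_k(
    ranked: Sequence[str], relevant: Set[str], k: int
) -> Tuple[float, float]:
    """Return (Precision@K, Recall@K) for a single query."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked[:k]
    # Count how many of the top-K retrieved items are in the expected set.
    hits = sum(1 for node_id in top_k if node_id in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical node identifiers, for illustration only.
ranked_nodes = ["domain:1", "motif:7", "orf:3", "attn:9"]
expected_nodes = {"domain:1", "motif:7"}
print(precision_recall_at_k(ranked_nodes, expected_nodes, k=2))  # (1.0, 1.0)
```

As the abstract itself cautions, a score of 1.0 here only says that the retrieved items match the expected labels; it says nothing about whether those labels are biologically correct.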

(This article belongs to the Section General Virology)

23 pages, 2463 KB

Open AccessArticle

Global Comparative Genomics of Stenotrophomonas maltophilia Reveals Cryptic Species Diversity, Resistome Variation, and Population Structure

by Ei Phway Thant, Chollachai Klaysubun, Sirikan Suwannasin, Thitaporn Dechathai, Kamonnut Singkhamanan, Thunchanok Yaikhan, Nattarika Chaichana, Rattanaruji Pomwised, Monwadee Wonglapsuwan, Sarunyou Chusri and Komwit Surachat

Life 2026, 16(1), 158; https://doi.org/10.3390/life16010158 - 17 Jan 2026

Viewed by 970

Abstract

Background: Stenotrophomonas maltophilia is an increasingly important multidrug-resistant opportunistic pathogen frequently isolated from clinical, environmental, and plant-associated niches. Despite its medical relevance, the global population structure, species-complex boundaries, and genomic determinants of antimicrobial resistance (AMR) and ecological adaptation remain poorly resolved, partly due to inconsistent annotations and fragmented genomic datasets. Methods: Approximately 2400 genome assemblies annotated as Stenotrophomonas maltophilia were available in the NCBI Assembly database at the time of query. After pre-download filtering to exclude metagenome-assembled genomes and atypical lineages, 1750 isolate genomes were retrieved and subjected to stringent quality control (completeness ≥ 90%, contamination ≤ 5%, ≤500 contigs, N50 ≥ 10 kb, and ≤1% ambiguous bases), yielding a final curated dataset of 1518 high-quality genomes used for downstream analyses. Genomes were assessed using CheckM, annotated with Prokka, and compared using average nucleotide identity (ANI), pan-genome analysis, core-genome phylogenomics, and functional annotation. AMR genes, mobile genetic elements (MGEs), and metadata (source, host, and geographic origin) were integrated to assess lineage-specific genomic features and ecological distributions. Results: ANI-based clustering resolved the S. maltophilia complex into multiple distinct genomospecies and revealed extensive misidentification of publicly deposited genomes. The pan-genome was highly open, reflecting strong genomic plasticity driven by accessory gene acquisition. Core-genome phylogeny resolved well-supported clades associated with clinical, environmental, and plant-related niches. Resistome profiling showed widespread intrinsic MDR determinants, with certain lineages enriched for efflux pumps, β-lactamases, and trimethoprim–sulfamethoxazole resistance markers. 
MGE analysis identified lineage-specific integrative conjugative elements, prophages, and transposases that correlated with source and geographic distribution. Conclusions: This large-scale analysis provides the most comprehensive genomic overview of the S. maltophilia complex to date. Our findings clarify species boundaries, highlight substantial taxonomic misannotation in public databases, and reveal lineage-specific AMR and mobilome patterns linked to ecological and clinical origins. The curated dataset and evolutionary insights generated here establish a foundation for global genomic surveillance, epidemiological tracking, and future studies on the evolution of antimicrobial resistance in S. maltophilia. Full article
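
The quality-control thresholds listed in the Methods can be sketched as a simple filter. The thresholds are taken from the abstract; the `AssemblyStats` record, its field names, and the example accessions are hypothetical and are not the authors' pipeline.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AssemblyStats:
    """Per-genome summary; field names are assumptions for this sketch."""
    accession: str
    completeness: float        # percent, e.g. as reported by CheckM
    contamination: float       # percent, e.g. as reported by CheckM
    contigs: int
    n50: int                   # base pairs
    ambiguous_fraction: float  # fraction of ambiguous bases, 0..1


def passes_qc(s: AssemblyStats) -> bool:
    """Apply the thresholds stated in the abstract."""
    return (
        s.completeness >= 90.0
        and s.contamination <= 5.0
        and s.contigs <= 500
        and s.n50 >= 10_000          # N50 >= 10 kb
        and s.ambiguous_fraction <= 0.01
    )


def filter_assemblies(stats: List[AssemblyStats]) -> List[AssemblyStats]:
    """Keep only assemblies that satisfy every threshold."""
    return [s for s in stats if passes_qc(s)]


# Hypothetical records, for illustration only.
candidates = [
    AssemblyStats("asm_A", 99.1, 0.4, 85, 250_000, 0.0002),
    AssemblyStats("asm_B", 88.0, 1.2, 120, 90_000, 0.0010),  # fails completeness
]
print([s.accession for s in filter_assemblies(candidates)])  # ['asm_A']
```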

(This article belongs to the Section Genomics and Proteomics)

21 pages, 8038 KB

Open AccessArticle

Semantic Data Federated Query Optimization Based on Decomposition of Block-Level Subqueries

by Yuan Yao and Yang Zhang

Future Internet 2025, 17(11), 531; https://doi.org/10.3390/fi17110531 - 20 Nov 2025

Viewed by 730

Abstract

The digital age and the rise of Internet of Things technology have led to an explosion of data, including vast amounts of semantic data. In the context of large-scale semantic data graphs, centralized storage struggles to meet the efficiency requirements of the queries. This has led to a shift towards distributed semantic data systems. In federated semantic data systems, ensuring both query efficiency and comprehensive results is challenging because of data independence and privacy constraints. To address this, we propose a query processing framework featuring a block-level star decomposition method for generating efficient query plans, augmented by auxiliary indexes to guarantee the completeness of the results. A specialized FEDERATEDAND BY keyword is introduced for federated environments, and a partition-based parallel assembly method accelerates the result integration. Our approach demonstrably improves query efficiency and is analyzed for its potential application in energy systems. Full article

(This article belongs to the Special Issue Internet of Things Technology and Service Computing)

15 pages, 2792 KB

Open AccessArticle

Evidence for the Transcription of a Satellite DNA Widely Found in Frogs

by Jennifer Nunes Pompeo, Kaleb Pretto Gatto, Diego Baldo and Luciana Bolsoni Lourenço

Genes 2024, 15(12), 1572; https://doi.org/10.3390/genes15121572 - 5 Dec 2024

Viewed by 1776

Abstract

Background: The satellite DNA (satDNA) PcP190 has been identified in multiple frog species from seven phylogenetically distant families within Hyloidea, indicating its broad distribution. This satDNA consists of repeats of approximately 190 bp and exhibits a highly conserved region (CR) of 120 bp, which is similar to the transcribed region of 5S ribosomal DNA (rDNA), and a hypervariable region (HR) that varies in size and nucleotide composition among and within species. Here, to improve our understanding of PcP190 satDNA, we searched for evidence of its transcription in the available transcriptomes of Rhinella marina (Bufonidae) and Engystomops pustulosus (Leptodactylidae), two phylogenetically distantly related species. Methods: We first characterized the 5S rDNA and PcP190 sequences in these species by searching for them in available genome assemblies. Next, we used the PcP190 (CR and HR) and 5S rDNA sequences of each species as queries to search for these sequences in RNA-seq libraries. Results: We identified two types of 5S rDNA in each analyzed species, with a new type found in E. pustulosus. Our results also revealed a novel type of PcP190 sequence in R. marina and a new subtype of PcP-1 in E. pustulosus. Transcriptome analyses confirmed the expected transcription of the 5S rRNA gene and showed transcription of both the CR and HR of the PcP190 satDNA in both species and in different tissues. Conclusions: As the entire repeat of this satDNA is susceptible to transcription, the high variability observed in the HR cannot be attributed to transcriptional activity confined to the CR. Full article

(This article belongs to the Section Animal Genetics and Genomics)

20 pages, 12482 KB

Open AccessArticle

Development and Design of an Online Quality Inspection System for Electric Car Seats

by Fangjie Wei, Dongqiang Wang and Xi Zhang

Sensors 2024, 24(21), 7085; https://doi.org/10.3390/s24217085 - 3 Nov 2024

Cited by 3 | Viewed by 2861

Abstract

As the market share of electric vehicles continues to rise, consumer demands for comfort within the vehicle interior have also increased. The noise generated by electric seats during operation has become one of the primary sources of in-cabin noise. However, the offline detection methods for electric seat noise severely limit production capacity. To address this issue, this paper presents an online quality inspection system for automotive electric seats, developed using LabVIEW. This system is capable of simultaneously detecting both the noise and electrical functions of electric seats, thereby resolving problems associated with multiple detection processes and low integration levels that affect production efficiency on the assembly line. The system employs NI boards (9250 + 9182) to collect noise data, while communication between LabVIEW and the Programmable Logic Controller (PLC) allows for programmed control of the seat motor to gather motor current. Additionally, a supervisory computer was developed to process the collected data, which includes generating frequency and time-domain graphs, conducting data analysis and evaluation, and performing database queries. By being co-located with the production line, the system features a highly integrated hardware and software design that facilitates the online synchronous detection of noise performance and electrical functions in automotive electric seats, effectively streamlining the detection process and enhancing overall integration. Practical verification results indicate that the system improves the production line cycle time by 34.84%, enabling rapid and accurate identification of non-conforming items in the seat motor, with a detection time of less than 86 s, thereby meeting the quality inspection needs for automotive electric seats. Full article

(This article belongs to the Special Issue Signal Processing and Sensing Technologies for Fault Diagnosis)

18 pages, 7388 KB

Open AccessFeature PaperArticle

A Comprehensive Analysis of the Genomic and Expressed Repertoire of the T-Cell Receptor Beta Chain in Equus caballus

by Rachele Antonacci, Francesco Giannico, Roberta Moschetti, Angela Pala, Anna Caputi Jambrenghi and Serafina Massari

Animals 2024, 14(19), 2817; https://doi.org/10.3390/ani14192817 - 29 Sep 2024

Cited by 1 | Viewed by 1887

Abstract

In this paper, we report a comprehensive and consistent annotation of the locus encoding the β-chain of the equine T-cell receptor (TRB), as inferred from a recent genome assembly using bioinformatics tools. The horse TRB locus spans approximately 1 Mb, making it the largest locus among the mammalian species studied to date, with a significantly higher number of genes related to extensive duplicative events. In the region, 136 TRBV (belonging to 29 subgroups), 2 TRBD, 13 TRBJ, and 2 TRBC genes were identified. The general genomic organization resembles that of other mammals, with a V cluster of 135 TRBV genes located upstream of two in-tandem aligned TRBD-J-C clusters and an inverted TRBV gene at the 3′ end of the last TRBC gene. However, the horse β-chain repertoire would be affected by a high number of non-functional TRBV genes. Thus, we queried a transcriptomic dataset derived from splenic tissue of a healthy adult horse, using each TRBJ gene as a probe to analyze clonotypes encompassing the V(D)J junction. This analysis provided insights into the usage of the TRBV, TRBD, and TRBJ genes and the variability of the non-germline-encoded CDR3. Our results clearly demonstrated that the horse β-chain exhibits a complex level of variability, broadly similar to that described in other mammalian species. Full article

(This article belongs to the Special Issue Advances in Equine Genetics and Breeding)

11 pages, 3493 KB

Open AccessArticle

Biophysical Studies of Amyloid-Binding Fluorophores to Tau AD Core Fibrils Formed without Cofactors

by Daniela P. Freitas, Joana Saavedra, Isabel Cardoso and Cláudio M. Gomes

Int. J. Mol. Sci. 2024, 25(18), 9946; https://doi.org/10.3390/ijms25189946 - 15 Sep 2024

Cited by 1 | Viewed by 2613

Abstract

Tau is an intrinsically disordered protein involved in several neurodegenerative diseases where a common hallmark is the appearance of tau aggregates in the brain. One common approach to elucidate the mechanisms behind the aggregation of tau has been to recapitulate in vitro the self-assembly process in a fast and reproducible manner. While the seeding of tau aggregation is prompted by negatively charged cofactors, the obtained fibrils are morphologically distinct from those found in vivo. The Tau AD core fragment (TADC, tau 306–378) has emerged as a new model and potential solution for the cofactor-free in vitro aggregation of tau. Here, we use TADC to further study this process, combining multiple amyloid-detecting fluorophores and fibril bioimaging. We confirmed by transmission electron microscopy that this fragment forms fibrils after quiescent incubation at 37 °C. We then employed a panel of eight amyloid-binding fluorophores to query the formed species by acquiring their emission spectra. The results showed that nearly all dyes detect TADC self-assembled species. However, the successful monitoring of TADC aggregation kinetics was limited to three fluorophores (X-34, Bis-ANS, and pFTAA), which yielded sigmoidal curves but different aggregation half-times, hinting at different species being detected. Altogether, this study highlights the potential of using multiple extrinsic fluorescent probes, alone or in combination, as tools to further clarify mechanisms behind the aggregation of amyloidogenic proteins. Full article

(This article belongs to the Special Issue Neurodegenerative Diseases: From Molecular Mechanisms to Pathophysiology)

24 pages, 14284 KB

Open AccessArticle

Mask2Former with Improved Query for Semantic Segmentation in Remote-Sensing Images

by Shichen Guo, Qi Yang, Shiming Xiang, Shuwen Wang and Xuezhi Wang

Mathematics 2024, 12(5), 765; https://doi.org/10.3390/math12050765 - 4 Mar 2024

Cited by 28 | Viewed by 13721

Abstract

Semantic segmentation of remote sensing (RS) images is vital in various practical applications, including urban construction planning, natural disaster monitoring, and land resources investigation. However, RS images are captured by airplanes or satellites at high altitudes and long distances, resulting in ground objects of the same category being scattered in various corners of the image. Moreover, objects of different sizes appear simultaneously in RS images. For example, some objects occupy a large area in urban scenes, while others only have small regions. Technically, the above two universal situations pose significant challenges to the segmentation with a high quality for RS images. Based on these observations, this paper proposes a Mask2Former with an improved query (IQ2Former) for this task. The fundamental motivation behind the IQ2Former is to enhance the capability of the query of Mask2Former by exploiting the characteristics of RS images well. First, we propose the Query Scenario Module (QSM), which aims to learn and group the queries from feature maps, allowing the selection of distinct scenarios such as the urban and rural areas, building clusters, and parking lots. Second, we design the query position module (QPM), which is developed to assign the image position information to each query without increasing the number of parameters, thereby enhancing the model’s sensitivity to small targets in complex scenarios. Finally, we propose the query attention module (QAM), which is constructed to leverage the characteristics of query attention to extract valuable features from the preceding queries. Being positioned between the duplicated transformer decoder layers, QAM ensures the comprehensive utilization of the supervisory information and the exploitation of those fine-grained details. Architecturally, the QSM, QPM, and QAM as well as an end-to-end model are assembled to achieve high-quality semantic segmentation. 
In comparison to the classical or state-of-the-art models (FCN, PSPNet, DeepLabV3+, OCRNet, UPerNet, MaskFormer, Mask2Former), IQ2Former has demonstrated exceptional performance across three challenging public remote-sensing image datasets: 83.59 mIoU on the Vaihingen dataset, 87.89 mIoU on the Potsdam dataset, and 56.31 mIoU on the LoveDA dataset. Additionally, overall accuracy, ablation experiments, and segmentation visualizations all support the validity of IQ2Former. Full article
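
For readers unfamiliar with the mIoU figures quoted above, the following is a minimal sketch of the standard definition: per-class intersection over union, averaged over the classes that occur. It works on flat label lists with made-up values; the paper's evaluation protocol (ignored classes, averaging over images or over the dataset) is not stated in the abstract and may differ. The abstract's scores are on a 0 to 100 scale, while this sketch returns a fraction.

```python
from typing import Sequence


def mean_iou(y_true: Sequence[int], y_pred: Sequence[int], num_classes: int) -> float:
    """Mean intersection-over-union across classes present in labels or predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    intersection = [0] * num_classes  # true positives per class
    union = [0] * num_classes         # TP + FP + FN per class
    for t, p in zip(y_true, y_pred):
        if t == p:
            intersection[t] += 1
            union[t] += 1
        else:
            union[t] += 1  # false negative for the true class
            union[p] += 1  # false positive for the predicted class
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    if not ious:
        raise ValueError("no pixels to evaluate")
    return sum(ious) / len(ious)


# Toy example: four pixels, two classes.
print(round(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2), 4))  # 0.5833
```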

(This article belongs to the Special Issue Advanced Research in Data-Centric AI)

32 pages, 6924 KB

Open AccessArticle

Structural Outlier Detection and Zernike–Canterakis Moments for Molecular Surface Meshes—Fast Implementation in Python

by Mateusz Banach

Molecules 2024, 29(1), 52; https://doi.org/10.3390/molecules29010052 - 21 Dec 2023

Cited by 2 | Viewed by 3161

Abstract

Object retrieval systems measure the degree of similarity of the shape of 3D models. They search for the elements of the 3D model databases that resemble the query model. In structural bioinformatics, the query model is a protein tertiary/quaternary structure and the objective is to find similarly shaped molecules in the Protein Data Bank. With the ever-growing size of the PDB, a direct atomic coordinate comparison with all its members is impractical. To overcome this problem, the shape of the molecules can be encoded by fixed-length feature vectors. The distance of a protein to the entire PDB can be measured in this low-dimensional domain in linear time. The state-of-the-art approaches utilize Zernike–Canterakis moments for the shape encoding and supply the retrieval process with geometric data of the input structures. The BioZernike descriptors have been a standard utility of the PDB since 2020. However, calculating the ZC moments locally is hampered by a lack of libraries readily usable in custom programs (i.e., without relying on external binaries), in particular programs written in Python. Here, a fast and well-documented Python implementation of the Pozo–Koehl algorithm is presented. In contrast to the more popular algorithm by Novotni and Klein, which is based on the voxelized volume, the PK algorithm produces ZC moments directly from the triangular surface meshes of 3D models. In particular, it can accept the molecular surfaces of proteins as its input. In the presented PK-Zernike library, owing to Numba’s just-in-time compilation, a mesh with 50,000 facets is processed by a single thread in a second at moment order 20. Since this is the first time the PK algorithm is used in structural bioinformatics, it is employed in a novel, simple, but efficient protein structure retrieval pipeline. 
The elimination of the outlying chain fragments via a fast PCA-based subroutine improves the discrimination ability, allowing this pipeline to achieve a 0.961 area under the ROC curve in the BioZernike validation suite (0.997 for the assemblies). The correlation between the results of the proposed approach and of the 3D Surfer program attains values up to 0.99. Full article
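
The idea of comparing a query against a whole database in linear time via fixed-length feature vectors can be sketched as a plain linear scan. This is not the PK-Zernike library and does not compute Zernike–Canterakis moments; the descriptor values, entry names, and the choice of Euclidean distance are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def rank_by_descriptor(
    query: Sequence[float],
    database: Dict[str, Sequence[float]],
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_n database entries closest to the query descriptor.

    One distance computation per entry, so the scan is linear in database size.
    """
    scored = [(math.dist(query, vec), name) for name, vec in database.items()]
    scored.sort()  # smallest distance first; ties broken by name
    return [(name, dist) for dist, name in scored[:top_n]]


# Hypothetical 4-dimensional descriptors; real shape descriptors are much longer.
descriptors = {
    "entry_A": [0.10, 0.80, 0.30, 0.05],
    "entry_B": [0.90, 0.10, 0.40, 0.70],
    "entry_C": [0.12, 0.79, 0.28, 0.06],
}
for name, dist in rank_by_descriptor([0.10, 0.80, 0.30, 0.05], descriptors, top_n=2):
    print(name, round(dist, 4))
```

Because every structure is reduced to a vector of the same length, adding a new database entry costs one more distance computation per query rather than a full structural alignment.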

(This article belongs to the Section Computational and Theoretical Chemistry)

23 pages, 13765 KB

Open AccessArticle

Graph Database and Matrix-Based Intelligent Generation of the Assembly Sequence of Prefabricated Building Components

by Bin Yang, Shanshan Jiang, Miaosi Dong, Dayu Zhu and Yilong Han

Appl. Sci. 2023, 13(17), 9834; https://doi.org/10.3390/app13179834 - 30 Aug 2023

Cited by 14 | Viewed by 3283

Abstract

The assembly of prefabricated components is a critical process in prefabricated building construction, influencing both progress and accuracy. However, the assembly sequence planning and optimization (ASPO) of prefabricated components have yet to receive sufficient attention from researchers, and current research has displayed limited automation and poor generalization capabilities. Therefore, this paper proposes a framework for intelligently generating assembly sequences for prefabricated components based on graph databases and matrices. The framework utilizes an adjacency matrix and interference matrix-based modeling method to comprehensively describe the connections and constraint relationships between components, enabling better evaluation of assembly difficulty during optimization. The graph database serves as the central hub for data exchange, facilitating component information storage, automatic querying, and summarization. The obtained assembly sequence and progress plan are fed back into the graph database. To accomplish assembly sequence optimization, a genetic algorithm based on the double-elite strategy is employed. Furthermore, the effectiveness of the proposed framework is validated through an actual engineering case. The results demonstrate that the framework can effectively find an optimal assembly sequence to mitigate the assembly challenge of a prefabricated building. Full article

20 pages, 6136 KB

Open AccessArticle

ClueReader: Heterogeneous Graph Attention Network for Multi-Hop Machine Reading Comprehension

by Peng Gao, Feng Gao, Peng Wang, Jian-Cheng Ni, Fei Wang and Hamido Fujita

Electronics 2023, 12(14), 3183; https://doi.org/10.3390/electronics12143183 - 22 Jul 2023

Cited by 5 | Viewed by 2569

Abstract

Multi-hop machine reading comprehension is a challenging task in natural language processing as it requires more reasoning ability across multiple documents. Spectral models based on graph convolutional networks have shown good inferring abilities and lead to competitive results. However, the analysis and reasoning of some models are inconsistent with those of humans. Inspired by the concept of grandmother cells in cognitive neuroscience, we propose a heterogeneous graph attention network model named ClueReader to imitate the grandmother cell concept. The model is designed to assemble the semantic features in multi-level representations and automatically concentrate or alleviate information for reasoning through the attention mechanism. The name ClueReader is a metaphor for the pattern of the model: it regards the subjects of queries as the starting points of clues, takes the reasoning entities as bridge points, considers the latent candidate entities as grandmother cells, and the clues end up in candidate entities. The proposed model enables the visualization of the reasoning graph, making it possible to analyze the importance of edges connecting entities and the selectivity in the mention and candidate nodes, which is easier to comprehend empirically. Evaluations on the open-domain multi-hop reading dataset WikiHop and the drug–drug interaction dataset MedHop proved the validity of ClueReader and showed the feasibility of applying the model in the molecular biology domain. Full article

19 pages, 2802 KB

Open AccessArticle

Genome ARTIST_v2—An Autonomous Bioinformatics Tool for Annotation of Natural Transposons in Sequenced Genomes

by Alexandru Al. Ecovoiu, Alexandru Marian Bologa, David Ioan Mihail Chifiriuc, Andrei Mihai Ciuca, Nicoleta Denisa Constantin, Iulian Constantin Ghionoiu, Iulian Cristian Ghita and Attila Cristian Ratiu

Int. J. Mol. Sci. 2022, 23(20), 12686; https://doi.org/10.3390/ijms232012686 - 21 Oct 2022

Cited by 2 | Viewed by 4152

Abstract

The annotation of transposable elements (transposons) is a very dynamic field of genomics and various tools assigned to support this bioinformatics endeavor have been developed and described. Genome ARTIST v1.19 (GA_v1.19) software was conceived for mapping artificial transposons mobilized during insertional mutagenesis projects, but the new functions of GA_v2 qualify it as a tool for the mapping and annotation of natural transposons (NTs) in long reads, contigs and assembled genomes. The tabular export of mapping and annotation data for high-throughput data analysis, the generation of a list of flanking sequences around the coordinates of insertion or around the target site duplications and the computing of a consensus sequence for the flanking sequences are all key assets of GA_v2. Additionally, we developed a set of scripts that enable the user to annotate NTs, to harness annotations offered by FlyBase for Drosophila melanogaster genome, to convert sequence files from .fasta to .raw, and to extract junction query sequences essential for NTs mapping. Herein, we present the applicability of GA_v2 for a preliminary annotation of P-element and hobo class II NTs and copia retrotransposon in the genome of D. melanogaster strain Horezu_LaPeri (Horezu), Romania, which was sequenced with Nanopore technology in our laboratory. We used contigs assembled with Flye tool and a Q10 quality filter of the reads. Our results suggest that GA_v2 is a reliable autonomous tool able to perform mapping and annotation of NTs in genomes sequenced by long sequencing technology. GA_v2 is open-source software compatible with Linux, Mac OS and Windows and is available at GitHub repository and dedicated website. Full article

(This article belongs to the Section Molecular Informatics)

17 pages, 5274 KB

Open AccessArticle

On the Design and Implementation of a Blockchain-Based Data Management System for ETO Manufacturing

by Zhengjun Jing, Niuping Hu, Yurong Song, Bo Song, Chunsheng Gu and Lei Pan

Appl. Sci. 2022, 12(18), 9184; https://doi.org/10.3390/app12189184 - 13 Sep 2022

Cited by 8 | Viewed by 3426

Abstract

Engineer-to-order (ETO) is a currently popular production model that meets customers’ individual needs, with orders consisting primarily of non-standard parts or small batches. This production model poses many management challenges, including the difficulty of tracing the production process data of products and the inability to monitor order status in real time. In this paper, by analyzing the steps of ETO manufacturing and the business processes between departments in the manufacturing industry, we propose a blockchain-based process data management system (BPDMS). The immutable nature of blockchain data ensures the data’s validity and consistency in each production step. Furthermore, by embedding a sequential aggregate signature in the system, the order of discrete process steps can be verified in a timely manner. Finally, an electrical equipment assembly production platform is used to discuss the specific implementation on top of Hyperledger Fabric, a permissioned blockchain. The experimental results show that the proposed system effectively manages the process data of ETO-type production and supports real-time querying of the production status of orders.

(This article belongs to the Special Issue Advances in Blockchain-enabled Internet of Things (IoT))

24 pages, 3165 KB

Open AccessArticle

A Cache Efficient One Hashing Blocked Bloom Filter (OHBB) for Random Strings and the K-mer Strings in DNA Sequence

by Elakkiya Prakasam and Arun Manoharan

Symmetry 2022, 14(9), 1911; https://doi.org/10.3390/sym14091911 - 13 Sep 2022

Cited by 6 | Viewed by 4230

Abstract

Bloom filters are widely used in genome assembly, IoT applications, and several network applications such as symmetric encryption algorithms and blockchain applications, owing to their fast querying, despite some false positives when querying the input elements. Much research has sought to improve the insertion and querying speed, reduce the false-positive rate, or reduce the storage requirements separately. However, optimizing all of these parameters together is quite challenging with the existing reported systems. This work proposes to simultaneously improve the insertion and querying speeds by introducing a Cache-efficient One-Hashing Blocked Bloom filter. The proposed method aims to reduce the number of memory accesses required for querying an element to one by splitting the memory into blocks whose size equals the cache line size of the memory. In the proposed filter, each block is further split into partitions, where the size of each partition is a prime number. Insertion and query require a single hash value, which yields different values when reduced modulo the different primes. Speed is improved by using simple hash functions, with the hash function called only once. The proposed method has been implemented and validated using random strings and symmetric K-mer datasets used in gene assembly. The simulation results show that the proposed filter outperforms the Standard Bloom Filter in terms of insertion and querying speed.
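
The block-and-partition idea described in the abstract can be sketched in a few lines. This is an illustrative reading of the abstract, not the authors' implementation: the class name `OneHashBlockedBloom`, the 512-bit block (a 64-byte cache line), the particular primes, and the use of BLAKE2b as the single hash are all assumptions chosen for the example.

```python
import hashlib

# Assumed layout: one block = one 64-byte cache line = 512 bits.
BLOCK_BITS = 512
# Assumed partition sizes: distinct primes whose sum (434) fits in a block.
PARTITION_SIZES = (61, 67, 71, 73, 79, 83)


class OneHashBlockedBloom:
    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        # One Python int per block stands in for a 512-bit bitmap.
        self.blocks = [0] * num_blocks
        # Starting bit offset of each partition inside a block.
        self.offsets = []
        start = 0
        for size in PARTITION_SIZES:
            self.offsets.append(start)
            start += size

    def _locate(self, item):
        # The hash function is called exactly once per operation.
        digest = hashlib.blake2b(item.encode("utf-8"), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        block = h % self.num_blocks
        # The same hash value, reduced modulo each prime, selects
        # one bit in each partition of the chosen block.
        mask = 0
        for offset, size in zip(self.offsets, PARTITION_SIZES):
            mask |= 1 << (offset + h % size)
        return block, mask

    def add(self, item):
        block, mask = self._locate(item)
        self.blocks[block] |= mask

    def __contains__(self, item):
        block, mask = self._locate(item)
        # Only one block is inspected, so a query touches one cache line.
        return (self.blocks[block] & mask) == mask


# Toy usage with invented k-mers.
kmers = OneHashBlockedBloom(num_blocks=1024)
for kmer in ("ACGT", "TTGA", "GGCC"):
    kmers.add(kmer)
```

Inserted items are always reported as present; items that were never inserted may occasionally be reported as present too, which is the false-positive behaviour the abstract refers to.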

20 pages, 25693 KB

Open AccessArticle

Madeleine: Poetry and Art of an Artificial Intelligence

by Graeme Revell

Arts 2022, 11(5), 83; https://doi.org/10.3390/arts11050083 - 5 Sep 2022

Cited by 4 | Viewed by 6117

Abstract

This article presents a project that is an experiment in the emerging field of human-machine artistic collaboration. The author/artist investigates responses by the generative pre-trained transformer (GPT-2) to poetic and esoteric prompts and curates them with elements of digital art created by the text-to-image transformer DALL-E 2 using those same prompts; these elements are presented in the context of photographs featuring an anthropomorphic female avatar as the messenger of the content. The tripartite ‘cyborg’ thus assembled is an artificial intelligence endowed with the human attributes of language, art and visage; it is referred to throughout as Madeleine. The results of the experiments allowed the investigation of the following hypotheses. Firstly, evidence for a convergence of machine and human creativity and intelligence is provided by moderate degrees of lossy compression, error, ignorance and the lateral formulation of analogies more typical of GPT-2 than GPT-3. Secondly, the work provides new illustrations supporting research in the field of artificial intelligence that queries the definitions and boundaries of accepted categories such as cognition, intelligence, understanding and, at the limit, consciousness, suggesting that there is a paradigm shift away from questions such as “Can machines think?” to those of immediate social and political relevance such as “How can you tell a machine from a human being?” and “Can we trust machines?” Finally, appearance and the epistemic emotions (surprise, curiosity and confusion) are influential in the human acceptance of machines as intelligent and trustworthy entities. The project problematises the contemporary proliferation of feminised avatars in the context of feminist critical literature and suggests that the anthropomorphic avatar might echo the social and historical position of the Delphic oracle, the Pythia, rather than a disembodied search engine such as Alexa.

(This article belongs to the Collection Review of Machine Art)
