MDPI - Publisher of Open Access Journals

20 pages, 342 KiB

Open Access Article

Generalized Orthogonal de Bruijn and Kautz Sequences

by Yuan-Pon Chen, Jin Sima and Olgica Milenkovic

Entropy 2025, 27(4), 366; https://doi.org/10.3390/e27040366 - 30 Mar 2025

Viewed by 399

A de Bruijn sequence of order k over a finite alphabet is a cyclic sequence with the property that it contains every possible k-sequence as a substring exactly once. Orthogonal de Bruijn sequences are the collections of de Bruijn sequences of the same order, k, that satisfy the joint constraint that every (k + 1)-sequence appears as a substring in, at most, one of the sequences in the collection. Both de Bruijn and orthogonal de Bruijn sequences have found numerous applications in synthetic biology, although the latter remain largely unexplored in the coding theory literature. Here, we study three relevant practical generalizations of orthogonal de Bruijn sequences, where we relax either the constraint that every (k + 1)-sequence appears exactly once or the sequences themselves are de Bruijn rather than balanced de Bruijn sequences. We also provide lower and upper bounds on the number of fixed-weight orthogonal de Bruijn sequences. The paper concludes with parallel results for orthogonal nonbinary Kautz sequences, which satisfy similar constraints as de Bruijn sequences, except for being only required to cover all subsequences of length k whose maximum run length equals one. Full article

(This article belongs to the Special Issue Coding and Algorithms for DNA-Based Data Storage Systems)
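
The two definitions in the abstract above (a de Bruijn sequence of order k, and orthogonality via (k + 1)-sequences) can be checked mechanically. The following is a minimal sketch, not the authors' construction: `de_bruijn` uses the standard Lyndon-word recursion, and the helper names `cyclic_windows`, `is_de_bruijn`, and `are_orthogonal` are hypothetical names chosen for illustration.

```python
def de_bruijn(alphabet_size: int, k: int) -> list[int]:
    """One cyclic de Bruijn sequence of order k (standard Lyndon-word recursion)."""
    a = [0] * (alphabet_size * k)
    seq: list[int] = []

    def db(t: int, p: int) -> None:
        if t > k:
            if k % p == 0:
                seq.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, alphabet_size):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return seq


def cyclic_windows(seq: list[int], n: int) -> list[tuple[int, ...]]:
    """All length-n windows of seq, read cyclically."""
    m = len(seq)
    return [tuple(seq[(i + j) % m] for j in range(n)) for i in range(m)]


def is_de_bruijn(seq: list[int], alphabet_size: int, k: int) -> bool:
    """True if every k-sequence occurs exactly once as a cyclic substring."""
    total = alphabet_size ** k
    return len(seq) == total and len(set(cyclic_windows(seq, k))) == total


def are_orthogonal(seqs: list[list[int]], k: int) -> bool:
    """True if no (k + 1)-sequence occurs in more than one of the sequences."""
    seen: set[tuple[int, ...]] = set()
    for s in seqs:
        windows = set(cyclic_windows(s, k + 1))
        if seen & windows:
            return False
        seen |= windows
    return True
```

As a small usage example, over a three-letter alphabet with k = 1 the cyclic sequences 0 1 2 and 0 2 1 are both de Bruijn and share no 2-sequence, so `are_orthogonal([[0, 1, 2], [0, 2, 1]], 1)` holds, while any sequence paired with itself fails the check.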

15 pages, 1605 KiB

Open Access Article

CSpredR: A Multi-Site mRNA Subcellular Localization Prediction Method Based on Fusion Encoding and Hybrid Neural Networks

by Xiao Wang, Wenshuai Suo and Rong Wang

Algorithms 2025, 18(2), 67; https://doi.org/10.3390/a18020067 - 26 Jan 2025

Viewed by 936

Abstract

Current research widely acknowledges that the subcellular localization of mRNA is crucial for understanding its biological functions. However, current methods for mRNA subcellular localization based on k-mer frequency features may overlook the sequential information of the sequence, and a single encoding method may not adequately extract the sequence’s features. This paper proposes a novel deep learning prediction method, CSpredR, specifically designed for predicting the subcellular localization of multi-site mRNAs. Unlike previous methods, CSpredR first employs k-mer to tokenize the mRNA sequences, then converts the tokenized sequences into de Bruijn graphs, thereby enabling a more precise capture of the structural information within the sequences. To mitigate the impact of lost sequential information and better capture sequence features, we combine word2vec and fasttext models to extract the features of each node in the graph and retain the sequence order. They can encode the k-mer units in the sequence into word vectors, thus serving as the node feature vectors of the graph. In this way, each node in the graph is assigned a feature vector containing rich semantic information. Subsequently, we utilize multi-scale convolutional neural networks and bidirectional long short-term memory networks to capture sequence features, respectively, and fuse the results as input for a multi-head attention mechanism model. The information from these heads is integrated into the node representations, and finally, the attention-processed data are fed into an MLP (Multi-Layer Perceptron) for prediction tasks. Extensive experiments reveal that CSpredR achieves a 2% improvement over the best existing predictors, offering a more effective tool for mRNA subcellular localization prediction. Full article

(This article belongs to the Special Issue Advanced Research on Machine Learning Algorithms in Bioinformatics)
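
The first two steps described in the CSpredR abstract (k-mer tokenization, then conversion to a de Bruijn graph) can be illustrated with a short sketch. It assumes that graph nodes are the k-mers themselves and that an edge links each k-mer to the one that follows it in the sequence; the abstract does not spell out these details, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def kmer_tokens(seq: str, k: int) -> list[str]:
    """Overlapping k-mers of seq, in sequence order (stride 1)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def de_bruijn_graph(tokens: list[str]) -> dict[str, dict[str, int]]:
    """Map each k-mer to its successor k-mers, with edge multiplicities."""
    graph: defaultdict[str, Counter] = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        graph[current][following] += 1
    return {node: dict(successors) for node, successors in graph.items()}
```

For the toy sequence `"AUGGAUG"` with k = 3, the tokens are AUG, UGG, GGA, GAU, AUG, and the resulting graph closes a cycle because the last k-mer repeats the first. In a pipeline like the one described, each node would then receive an embedding vector; that step is omitted here.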

10 pages, 2225 KiB

Open Access Article

Evaluating Sequence Alignment Tools for Antimicrobial Resistance Gene Detection in Assembly Graphs

by Yusreen Shah and Somayeh Kafaie

Microorganisms 2024, 12(11), 2168; https://doi.org/10.3390/microorganisms12112168 - 28 Oct 2024

Viewed by 1238

Abstract

Antimicrobial resistance (AMR) is an escalating global health threat, often driven by the horizontal gene transfer (HGT) of resistance genes. Detecting AMR genes and understanding their genomic context within bacterial populations is crucial for mitigating the spread of resistance. In this study, we evaluate the performance of three sequence alignment tools—Bandage, SPAligner, and GraphAligner—in identifying AMR gene sequences from assembly and de Bruijn graphs, which are commonly used in microbial genome assembly. Efficiently identifying these genes allows for the detection of neighboring genetic elements and possible HGT events, contributing to a deeper understanding of AMR dissemination. We compare the performance of the tools both qualitatively and quantitatively, analyzing the precision, computational efficiency, and accuracy in detecting AMR-related sequences. Our analysis reveals that Bandage offers the most precise and efficient identification of AMR gene sequences, followed by GraphAligner and SPAligner. The comparison includes evaluating the similarity of paths returned by each tool and measuring output accuracy using a modified edit distance metric. These results highlight Bandage’s potential for contributing to the accurate identification and study of AMR genes in bacterial populations, offering important insights into resistance mechanisms and potential targets for mitigating AMR spread. Full article

(This article belongs to the Section Antimicrobial Agents and Resistance)
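
The abstract above measures output accuracy with a modified edit distance but does not describe the modification. As a point of reference only, the sketch below implements the standard, unmodified Levenshtein distance between a reference gene sequence and the sequence spelled by a returned path; it should not be read as the metric used in the study.

```python
def edit_distance(a: str, b: str) -> int:
    """Standard Levenshtein distance (unit-cost insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]
```

A distance of zero means the path reproduces the reference exactly; larger values count the single-base edits separating the two sequences.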

15 pages, 13683 KiB

Open Access Article

A 3D Reconstruction Method Based on Homogeneous De Bruijn-Encoded Structured Light

by Weimin Li and Songlin Li

Photonics 2024, 11(5), 458; https://doi.org/10.3390/photonics11050458 - 14 May 2024

Cited by 1 | Viewed by 2036

Abstract

Structured light three-dimensional reconstruction is one of the important methods for non-contact acquisition of sparse texture object surfaces. Variations in ambient illumination and disparities in object surface reflectance can significantly impact the fidelity of three-dimensional reconstruction, introducing considerable inaccuracies. We introduce a robust method for color speckle structured light encoding, which is based on a variant of the De Bruijn sequence, termed the Homogeneous De Bruijn Sequence. This innovative approach enhances the reliability and accuracy of structured light techniques for three-dimensional reconstruction by utilizing the distinctive characteristics of Homogeneous De Bruijn Sequences. Through a pruning process applied to the De Bruijn sequence, a structured light pattern with seven distinct color patches is generated. This approach ensures a more equitable distribution of speckle information. Full article

(This article belongs to the Section Optical Interaction Science)

23 pages, 5373 KiB

Open Access Article

PANDA: Processing in Magnetic Random-Access Memory-Accelerated de Bruijn Graph-Based DNA Assembly

by Shaahin Angizi, Naima Ahmed Fahmi, Deniz Najafi, Wei Zhang and Deliang Fan

J. Low Power Electron. Appl. 2024, 14(1), 9; https://doi.org/10.3390/jlpea14010009 - 2 Feb 2024

Cited by 1 | Viewed by 3081

Abstract

In this work, we present an efficient Processing in MRAM-Accelerated De Bruijn Graph-based DNA Assembly platform, named PANDA, based on an optimized and hardware-friendly genome assembly algorithm. PANDA is able to assemble large-scale DNA sequence datasets from all-pair overlaps. We first design a PANDA platform that exploits MRAM as computational memory and converts it to a potent processing unit for genome assembly. PANDA can not only execute efficient bulk bit-wise X(N)OR-based comparison/addition operations heavily required for the genome assembly task but also a full set of 2-/3-input logic operations inside the MRAM chip. We then develop a highly parallel and step-by-step hardware-friendly DNA assembly algorithm for PANDA that only requires the developed in-memory logic operations. The platform is then configured with a novel data partitioning and mapping technique that provides local storage and processing to utilize the algorithm level’s parallelism fully. The cross-layer simulation results demonstrate that PANDA reduces the run time and power by a factor of 18 and 11, respectively, compared with CPU. Moreover, speed-ups of up to 2.5 to 10× can be obtained over other recent processing in-memory platforms to perform the same task, like STT-MRAM, ReRAM, and DRAM. Full article

9 pages, 326 KiB

Open Access Article

Multi de Bruijn Sequences and the Cross-Join Method

by Abbas Alhakim and Janusz Szmidt

Mathematics 2023, 11(5), 1262; https://doi.org/10.3390/math11051262 - 6 Mar 2023

Viewed by 1737

Abstract

We show a method to construct binary multi de Bruijn sequences using the cross-join method. We extend the proof given by Alhakim for ordinary de Bruijn sequences to the case of multi de Bruijn sequences. In particular, we establish that all multi de Bruijn sequences can be obtained by cross-joining an ordinary de Bruijn sequence concatenated with itself an appropriate number of times. We implemented the generation of all multi de Bruijn sequences of type C(2, 2, 2) and C(3, 2, 2). We experimentally confirm that some multi de Bruijn sequences can be generated by Galois Nonlinear Feedback Shift Registers (NLFSRs). It is supposed that all multi de Bruijn sequences can be generated using Galois NLFSRs. Full article

(This article belongs to the Special Issue Advanced Graph Theory and Combinatorics)
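
A quick way to see why concatenating an ordinary de Bruijn sequence with itself is a natural starting point is to count cyclic windows. The sketch below assumes that the type C(m, k, n) denotes cyclic sequences over a k-letter alphabet in which every word of length n occurs exactly m times; the abstract does not define the notation, so this reading and the function names are assumptions.

```python
from collections import Counter
from itertools import product


def window_counts(seq: list[int], n: int) -> Counter:
    """Count every length-n window of seq, read cyclically."""
    size = len(seq)
    return Counter(tuple(seq[(i + j) % size] for j in range(n))
                   for i in range(size))


def is_multi_de_bruijn(seq: list[int], m: int, k: int, n: int) -> bool:
    """True if each of the k**n words of length n occurs exactly m times."""
    if len(seq) != m * k ** n:
        return False
    counts = window_counts(seq, n)
    return all(counts[word] == m for word in product(range(k), repeat=n))
```

Under this reading, the ordinary binary de Bruijn sequence 0 0 1 1 repeated twice passes the check for (m, k, n) = (2, 2, 2), and so does 0 0 0 1 1 1 0 1, which is not a plain repetition. The cross-join step itself is not sketched here.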

18 pages, 5838 KiB

Open Access Article

Color Structured Light Stripe Edge Detection Method Based on Generative Adversarial Networks

by Dieuthuy Pham, Minhtuan Ha and Changyan Xiao

Appl. Sci. 2023, 13(1), 198; https://doi.org/10.3390/app13010198 - 23 Dec 2022

Cited by 2 | Viewed by 2036

Abstract

The one-shot structured light method using a color stripe pattern can provide a dense point cloud in a short time. However, the influence of noise and the complex characteristics of scenes still make the task of detecting the color stripe edges in deformed pattern images difficult. To overcome these challenges, a color structured light stripe edge detection method based on generative adversarial networks, which is named horizontal elastomeric attention residual Unet-based GAN (HEAR-GAN), is proposed in this paper. Additionally, a De Bruijn sequence-based color stripe pattern and a multi-slit binary pattern are designed. In our dataset, selecting the multi-slit pattern images as ground-truth images not only reduces the labor of manual annotation but also enhances the quality of the training set. With the proposed network, our method converts the task of detecting edges in color stripe pattern images into detecting centerlines in curved line images. The experimental results show that the proposed method can overcome the above challenges, and thus, most of the edges in the color stripe pattern images are detected. In addition, the comparison results demonstrate that our method can achieve a higher performance of color stripe segmentation with higher pixel location accuracy than other edge detection methods. Full article

30 pages, 1342 KiB

Open Access Article

Fast Format-Aware Fuzzing for Structured Input Applications

by Zehan Chen, Yuliang Lu, Kailong Zhu, Lu Yu and Jiazhen Zhao

Appl. Sci. 2022, 12(18), 9350; https://doi.org/10.3390/app12189350 - 18 Sep 2022

Cited by 1 | Viewed by 2390

Abstract

Fuzzing is one of the most successful software testing techniques used to discover vulnerabilities in programs. Without seeds that fit the input format, existing runtime dependency recognition strategies are limited by incompleteness and high overhead. In this paper, for structured input applications, we propose a fast format-aware fuzzing approach to recognize dependencies from the specified input to the corresponding comparison instruction. We divided the dependencies into Input-to-State (I2S) and indirect dependencies. Our approach has the following advantages compared to existing works: (1) recognizing I2S dependencies more completely and swiftly using the input based on the de Bruijn sequence and its mapping structure; (2) obtaining indirect dependencies with a light dependency existence analysis on the input fragments. We implemented a fast format-aware fuzzing prototype, FFAFuzz, based on our method and evaluated FFAFuzz in real-world structured input applications. The evaluation results showed that FFAFuzz reduced the average time overhead by 76.49% while identifying more completely compared with Redqueen and by 89.10% compared with WEIZZ. FFAFuzz also achieved higher code coverage by 14.53% on average compared to WEIZZ. Full article

(This article belongs to the Special Issue Resilience and Vulnerability in Cybersecurity)
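
The reason a de Bruijn sequence helps with input-to-state recognition is that any n consecutive bytes identify their own offset. The sketch below illustrates only that general property: it builds a linearised de Bruijn pattern and looks up where an observed comparison operand came from. It is not FFAFuzz's mapping structure, and `de_bruijn_bytes` and `find_offset` are hypothetical names.

```python
def de_bruijn_bytes(alphabet: bytes, n: int) -> bytes:
    """Linear pattern in which every n-byte window over `alphabet` occurs once.

    Assumes the alphabet has at least two distinct byte values.
    """
    k = len(alphabet)
    a = [0] * (k * n)
    out: list[int] = []

    def db(t: int, p: int) -> None:
        if t > n:
            if n % p == 0:
                out.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    cyclic = bytes(alphabet[i] for i in out)
    # Append the first n - 1 bytes so wrap-around windows also appear linearly.
    return cyclic + cyclic[:n - 1]


def find_offset(pattern: bytes, observed: bytes, n: int) -> int:
    """Offset in `pattern` of the first n bytes of `observed`, or -1."""
    return pattern.find(observed[:n])
```

If a fuzzer feeds such a pattern as input and later sees four of its bytes as an operand of a comparison instruction, `find_offset` recovers the input position that flowed into that comparison without any taint tracking.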

18 pages, 400 KiB

Open Access Article

A New Approach to Determine the Minimal Polynomials of Binary Modified de Bruijn Sequences

by Musthofa, Indah Emilia Wijayanti, Diah Junia Eksi Palupi and Martianus Frederic Ezerman

Mathematics 2022, 10(15), 2577; https://doi.org/10.3390/math10152577 - 25 Jul 2022

Cited by 1 | Viewed by 2020

Abstract

A binary modified de Bruijn sequence is an infinite and periodic binary sequence derived by removing a zero from the longest run of zeros in a binary de Bruijn sequence. The minimal polynomial of the modified sequence is its unique least-degree characteristic polynomial. Leveraging a recent characterization, we devise a novel general approach to determine the minimal polynomial. We translate the characterization into a problem of identifying a Hamiltonian cycle in a specially constructed graph. The graph is isomorphic to the modified de Bruijn–Good graph. Along the way, we demonstrate the usefulness of some computational tools from the cycle joining method in the modified setup. Full article

(This article belongs to the Special Issue Mathematical Coding Theory)
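
The definition in the abstract above (remove one zero from the longest run of zeros) translates directly into code. The following sketch takes one period of a binary de Bruijn sequence of order n and returns one period of the modified sequence; it handles the case where the zero run wraps around the end of the list. The function name is hypothetical, and computing the minimal polynomial is outside the scope of this sketch.

```python
def modified_de_bruijn(seq: list[int], n: int) -> list[int]:
    """Drop one zero from the run of n zeros in a binary de Bruijn sequence.

    `seq` is one period (length 2**n) of an order-n binary de Bruijn sequence.
    The result is one period (length 2**n - 1), rotated to start at the run.
    """
    size = len(seq)
    for start in range(size):
        if all(seq[(start + j) % size] == 0 for j in range(n)):
            rotated = seq[start:] + seq[:start]  # now begins with n zeros
            return rotated[1:]                   # remove one of them
    raise ValueError("no run of n zeros found; input is not de Bruijn")
```

For the order-3 sequence 0 0 0 1 0 1 1 1 the result is 0 0 1 0 1 1 1, in which every nonzero 3-bit word appears exactly once cyclically and the all-zero word no longer appears.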

14 pages, 4865 KiB

Open Access Article

Retrospective Definition of Clostridioides difficile PCR Ribotypes on the Basis of Whole Genome Polymorphisms: A Proof of Principle Study

by Manisha Goyal, Lysiane Hauben, Hannes Pouseele, Magali Jaillard, Katrien De Bruyne, Alex van Belkum and Richard Goering

Diagnostics 2020, 10(12), 1078; https://doi.org/10.3390/diagnostics10121078 - 12 Dec 2020

Cited by 3 | Viewed by 3536

Abstract

Clostridioides difficile is a cause of health care-associated infections. The epidemiological study of C. difficile infection (CDI) traditionally involves PCR ribotyping. However, ribotyping will be increasingly replaced by whole genome sequencing (WGS). This implies that WGS types need correlation with classical ribotypes (RTs) in order to perform retrospective clinical studies. Here, we selected genomes of hyper-virulent C. difficile strains of RT001, RT017, RT027, RT078, and RT106 to try and identify new discriminatory markers using in silico ribotyping PCR and De Bruijn graph-based Genome Wide Association Studies (DBGWAS). First, in silico ribotyping PCR was performed using reference primer sequences and 30 C. difficile genomes of the five different RTs identified above. Second, discriminatory genomic markers were sought with DBGWAS using a set of 160 independent C. difficile genomes (14 ribotypes). RT-specific genetic polymorphisms were annotated and validated for their specificity and sensitivity against a larger dataset of 2425 C. difficile genomes covering 132 different RTs. In silico PCR ribotyping was unsuccessful due to non-specific or missing theoretical RT PCR fragments. More successfully, DBGWAS discovered a total of 47 new markers (13 in RT017, 12 in RT078, 9 in RT106, 7 in RT027, and 6 in RT001) with minimum q-values of 0 to 7.40 × 10⁻⁵, indicating excellent marker selectivity. The specificity and sensitivity of individual markers ranged between 0.92 and 1.0 but increased to 1 by combining two markers, hence providing undisputed RT identification based on a single genome sequence. Markers were scattered throughout the C. difficile genome in intra- and intergenic regions. We propose here a set of new genomic polymorphisms that efficiently identify five hyper-virulent RTs utilizing WGS data only. Further studies need to show whether this initial proof-of-principle observation can be extended to all 600 existing RTs. Full article

(This article belongs to the Section Diagnostic Microbiology and Infectious Disease)

13 pages, 3730 KiB

Open Access Article

3D-Printed All-Dielectric Electromagnetic Encoders with Synchronous Reading for Measuring Displacements and Velocities

by Cristian Herrojo, Ferran Paredes and Ferran Martín

Sensors 2020, 20(17), 4837; https://doi.org/10.3390/s20174837 - 27 Aug 2020

Cited by 16 | Viewed by 2791

Abstract

In this paper, 3D-printed electromagnetic (or microwave) encoders with synchronous reading based on permittivity contrast, and devoted to the measurement of displacements and velocities, are reported for the first time. The considered encoders are based on two chains of linearly shaped apertures made on a 3D-printed high-permittivity dielectric material. One such aperture chain contains the identification (ID) code, whereas the other chain provides the clock signal. Synchronous reading is necessary in order to determine the absolute position if the velocity between the encoder and the sensitive part of the reader is not constant. Such absolute position can be determined as long as the whole encoder is encoded with the so-called de Bruijn sequence. For encoder reading, a splitter/combiner structure with each branch loaded with a series gap and a slot resonator (each one tuned to a different frequency) is considered. Such a structure is able to detect the presence of the apertures when the encoder is displaced, at short distance, over the slots. Thus, by injecting two harmonic signals, conveniently tuned, at the input port of the splitter/combiner structure, two amplitude modulated (AM) signals are generated by tag motion at the output port of the sensitive part of the reader. One of the AM envelope functions provides the absolute position, whereas the other one provides the clock signal and the velocity of the encoder. These synchronous 3D-printed all-dielectric encoders based on permittivity contrast are a good alternative to microwave encoders based on metallic inclusions in those applications where low cost as well as major robustness against mechanical wearing and aging effects are the main concerns. Full article

(This article belongs to the Collection Position Sensor)
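
The statement that the absolute position can be determined when the encoder carries a de Bruijn sequence amounts to a table lookup: each window of n consecutive ID bits occurs at one position only. The sketch below uses a fixed binary de Bruijn sequence of order 4 as an example code; the actual code length and order used on the encoders are not given in the abstract. For simplicity the code is treated as cyclic, so a linear encoder would need the windows that wrap around handled separately.

```python
# Example ID code: a binary de Bruijn sequence of order 4 (an assumption,
# chosen only for illustration).
ID_CODE = "0000100110101111"
WINDOW = 4


def build_position_table(code: str, n: int) -> dict[str, int]:
    """Map every cyclic n-bit window of `code` to its starting position."""
    size = len(code)
    table: dict[str, int] = {}
    for pos in range(size):
        window = "".join(code[(pos + j) % size] for j in range(n))
        if window in table:
            raise ValueError("code is not de Bruijn: repeated window " + window)
        table[window] = pos
    return table


def decode_position(table: dict[str, int], last_bits: str) -> int:
    """Absolute position of the window formed by the most recent n bits read."""
    return table[last_bits]
```

In the setting described, the clock chain would tell the reader when to sample each ID bit, and after any four consecutive samples `decode_position` would return the absolute position regardless of how the velocity varied in between.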

11 pages, 296 KiB

Open Access Article

Lower Bounds, and Exact Enumeration in Particular Cases, for the Probability of Existence of a Universal Cycle or a Universal Word for a Set of Words

by Herman Z. Q. Chen, Sergey Kitaev and Brian Y. Sun

Mathematics 2020, 8(5), 778; https://doi.org/10.3390/math8050778 - 12 May 2020

Viewed by 2007

Abstract

A universal cycle, or u-cycle, for a given set of words is a circular word that contains each word from the set exactly once as a contiguous subword. The celebrated de Bruijn sequences are a particular case of such a u-cycle, where a set in question is the set A^n of all words of length n over a k-letter alphabet A. A universal word, or u-word, is a linear, i.e., non-circular, version of the notion of a u-cycle, and it is defined similarly. Removing some words in A^n may, or may not, result in a set of words for which u-cycle, or u-word, exists. The goal of this paper is to study the probability of existence of the universal objects in such a situation. We give lower bounds for the probability in general cases, and also derive explicit answers for the case of removing up to two words in A^n, or the case when k = 2 and n ≤ 4. Full article

(This article belongs to the Section E1: Mathematics and Computer Science)
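
For small cases, the existence question studied in the abstract above can be explored by brute force. The sketch below checks whether a circular word is a u-cycle for a given set of equal-length words and searches all candidate cycles exhaustively. It assumes, as is usual, that a u-cycle for a set S has length |S| and contains no other words; the function names are hypothetical, and the exhaustive search is only practical for very small sets.

```python
from itertools import product


def is_universal_cycle(cycle: str, words: set[str]) -> bool:
    """True if the cyclic windows of `cycle` are exactly `words`, once each."""
    if not words or len(cycle) != len(words):
        return False
    n = len(next(iter(words)))
    if any(len(w) != n for w in words):
        return False
    size = len(cycle)
    windows = ["".join(cycle[(i + j) % size] for j in range(n))
               for i in range(size)]
    return sorted(windows) == sorted(words)


def has_universal_cycle(words: set[str], alphabet: str) -> bool:
    """Exhaustively test every circular word of length |words|."""
    return any(is_universal_cycle("".join(c), words)
               for c in product(alphabet, repeat=len(words)))
```

For example, with the binary alphabet and n = 2, removing the word 00 leaves {01, 11, 10}, for which the cycle 011 works, whereas removing 01 leaves {00, 11, 10}, for which no u-cycle exists.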

15 pages, 3935 KiB

Open Access Article

Single-Shot Dense Depth Sensing with Color Sequence Coded Fringe Pattern

by Fu Li, Baoyu Zhang, Guangming Shi, Yi Niu, Ruodai Li, Lili Yang and Xuemei Xie

Sensors 2017, 17(11), 2558; https://doi.org/10.3390/s17112558 - 6 Nov 2017

Cited by 8 | Viewed by 4734

Abstract

A single-shot structured light method is widely used to acquire dense and accurate depth maps for dynamic scenes. In this paper, we propose a color sequence coded fringe depth sensing method. To overcome the phase unwrapping problem encountered in phase-based methods, the color-coded sequence information is embedded into the phase information. We adopt the color-encoded De Bruijn sequence to denote the period of the phase information and assign the sequence into two channels of the pattern, while the third channel is used to code the phase information. Benefiting from this coding strategy, the phase information distributed in multiple channels can improve the quality of the phase intensity by channel overlay, which results in precise phase estimation. Meanwhile, the wrapped phase period assists the sequence decoding to obtain a precise period order. To evaluate the performance of the proposed method, an experimental platform is established. Quantitative and qualitative experiments demonstrate that the proposed method generates a higher precision depth, as compared to a Kinect and larger resolution ToF (Time of Flight) camera. Full article

(This article belongs to the Special Issue Imaging Depth Sensors—Sensors, Algorithms and Applications)

20 pages, 513 KiB

Open Access Article

A Computer Simulator for Assessing Different Challenges and Strategies of de Novo Sequence Assembly

by Bjarne Knudsen, Roald Forsberg and Michael M. Miyamoto

Genes 2010, 1(2), 263-282; https://doi.org/10.3390/genes1020263 - 13 Sep 2010

Cited by 12 | Viewed by 12096

Abstract

This study presents a new computer program for assessing the effects of different factors and sequencing strategies on de novo sequence assembly. The program uses reads from actual sequencing studies or from simulations with a reference genome that may also be real or simulated. The simulated reads can be created with our read simulator. They can be of differing length and coverage, consist of paired reads with varying distance, and include sequencing errors such as color space miscalls to imitate SOLiD data. The simulated or real reads are mapped to their reference genome and our assembly simulator is then used to obtain optimal assemblies that are limited only by the distribution of repeats. By way of this mapping, the assembly simulator determines which contigs are theoretically possible, or conversely (and perhaps more importantly), which are not. We illustrate the application and utility of our new simulation tools with several experiments that test the effects of genome complexity (repeats), read length and coverage, word size in De Bruijn graph assembly, and alternative sequencing strategies (e.g., BAC pooling) on sequence assemblies. These experiments highlight just some of the uses of our simulators in the experimental design of sequencing projects and in the further development of assembly algorithms. Full article

(This article belongs to the Special Issue Next Generation DNA Sequencing)

Search Results (14)