Multiple string matching is a fundamental operation in real-time analytics, cybersecurity, bioinformatics, and large-scale information retrieval. Nevertheless, existing approaches continue to face inherent trade-offs among preprocessing efficiency, verification overhead, and support for dynamic pattern updates, particularly in large and continuously evolving environments. This paper presents MMIVL, a high-performance algorithm founded on the multi-character inverted list (m-CIVL), a unified and inherently dynamic indexing framework for pattern management. By integrating positional information, termination semantics, and pattern associations within a single structure, m-CIVL enables direct matching without requiring a separate verification stage. MMIVL achieves a preprocessing complexity of O(|P|/s), a search complexity of O(|T| + nocc), and an update complexity of O(|p|/s), where s denotes the segment length. Extensive experiments on synthetic and real-world datasets demonstrate that MMIVL consistently outperforms representative baselines, with especially strong gains in large-scale scenarios, while maintaining stable performance and favorable memory efficiency. Overall, these results establish m-CIVL as an effective, scalable, and practically viable solution that unifies efficient preprocessing, high-throughput searching, and dynamic update capability for modern multiple string-matching applications.
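
To make the idea of a segment-keyed inverted list more concrete, the following is a minimal Python sketch. It is not the authors' m-CIVL or MMIVL; it only illustrates, under simplifying assumptions, how a single structure can hold positional information (segment offset), termination flags, and pattern associations so that matches are reported without a separate verification pass. The class name `SegmentInvertedIndex`, its methods, and the choice to cover a pattern's tail with an overlapping final segment are all hypothetical, and the sketch makes no attempt to reach the complexity bounds stated in the abstract.

```python
from collections import defaultdict


class SegmentInvertedIndex:
    """Illustrative segment-keyed inverted list (hypothetical, not m-CIVL itself)."""

    def __init__(self, s):
        self.s = s                      # segment length
        self.index = defaultdict(set)   # segment -> {(pid, offset, prev_offset, is_last)}
        self.patterns = {}              # pid -> pattern, kept so removal can rebuild entries

    def _entries(self, pid, pattern):
        s, m = self.s, len(pattern)
        offsets = list(range(0, m - s + 1, s))
        if offsets[-1] != m - s:
            # Assumption of this sketch: cover the tail with an overlapping segment.
            offsets.append(m - s)
        prev = -1
        for k, off in enumerate(offsets):
            is_last = k == len(offsets) - 1
            yield pattern[off:off + s], (pid, off, prev, is_last)
            prev = off

    def add(self, pid, pattern):
        """Insert a pattern; touches about len(pattern)/s index entries."""
        if len(pattern) < self.s:
            raise ValueError("pattern must be at least s characters long")
        if pid in self.patterns:
            raise ValueError("pattern id already in use")
        self.patterns[pid] = pattern
        for seg, entry in self._entries(pid, pattern):
            self.index[seg].add(entry)

    def remove(self, pid):
        """Delete a pattern by removing exactly the entries that add() created."""
        pattern = self.patterns.pop(pid)
        for seg, entry in self._entries(pid, pattern):
            self.index[seg].discard(entry)
            if not self.index[seg]:
                del self.index[seg]

    def search(self, text):
        """Return sorted (start, pid) pairs for every occurrence in text."""
        s = self.s
        active = set()   # (start, pid, offset): segments matched so far for a candidate
        hits = []
        for j in range(len(text) - s + 1):
            for pid, off, prev, is_last in self.index.get(text[j:j + s], ()):
                start = j - off
                if start < 0:
                    continue
                # A non-initial segment counts only if its predecessor already matched.
                if prev != -1 and (start, pid, prev) not in active:
                    continue
                if is_last:
                    hits.append((start, pid))   # termination flag: report directly
                else:
                    active.add((start, pid, off))
        return sorted(hits)


idx = SegmentInvertedIndex(2)
idx.add(0, "abcab")
idx.add(1, "bca")
idx.add(2, "ab")
hits = idx.search("xabcabcab")
```

In this sketch a match is reported as soon as a segment flagged as final is reached with all of its predecessors already matched, which is one simple way to avoid a verification stage. The `active` set is never pruned here; a practical implementation would bound it.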