Open Access Review
Data Organisation for Efficient Pattern Retrieval: Indexing, Storage, and Access Structures
by Paraskevas Koukaras and Christos Tjortjis *
School of Science and Technology, International Hellenic University, 14th km Thessaloniki-Moudania, 57001 Thessaloniki, Greece
* Author to whom correspondence should be addressed.
Big Data Cogn. Comput. 2025, 9(10), 258; https://doi.org/10.3390/bdcc9100258 (registering DOI)
Submission received: 24 August 2025 / Revised: 23 September 2025 / Accepted: 10 October 2025 / Published: 13 October 2025
Abstract
The increasing scale and complexity of data mining outputs, such as frequent itemsets, association rules, sequences, and subgraphs, have made efficient pattern retrieval a critical yet underexplored challenge. This review addresses the organisation, indexing, and access strategies that enable scalable and responsive retrieval of structured patterns. We examine the underlying types of data and pattern outputs, common retrieval operations, and the variety of query types encountered in practice. Key indexing structures are surveyed, including prefix trees, inverted indices, hash-based approaches, and bitmap-based methods, each suited to different pattern representations and workloads. Storage designs are discussed with attention to metadata annotation, format choices, and redundancy mitigation. Query optimisation strategies are reviewed, emphasising index-aware traversal, caching, and ranking mechanisms. This paper also explores scalability through parallel, distributed, and streaming architectures, and surveys current systems and tools that integrate mining and retrieval capabilities. Finally, we outline pressing challenges and emerging directions, such as supporting real-time and uncertainty-aware retrieval and enabling semantic, cross-domain pattern access. Additional frontiers include privacy-preserving indexing and secure query execution, along with the integration of repositories into machine learning pipelines for hybrid symbolic–statistical workflows. We further highlight the need for dynamic repositories, probabilistic semantics, and community benchmarks to ensure that progress is measurable and reproducible across domains. This review provides a comprehensive foundation for designing next-generation pattern retrieval systems that are scalable, flexible, and tightly integrated into analytic workflows.
The analysis and roadmap offered are relevant across application areas including finance, healthcare, cybersecurity, and retail, where robust and interpretable retrieval is essential.