PSO-FSPMiner: A Metaheuristic Approach for Mining a Representative Subset of Frequent Similar Patterns
Abstract
1. Introduction
2. Related Work
3. Basic Concepts and Notation
4. Mining a Representative Subset of Frequent Similar Patterns
4.1. Relevance Measures
4.2. PSO-FSPMiner
| Algorithm 1: PSO-FSPMiner algorithm |
| Input: Dataset Ω, population size N, number of iterations Iter, minimum frequency threshold minFreq, minimum coverage threshold minCov, inertia factor w, two acceleration constants, weight variable boostWeight |
| Output: The set of solutions in the External List |
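
The inputs of Algorithm 1 include an inertia factor w and two acceleration constants, which are the parameters of the textbook particle swarm update. As a minimal sketch of that canonical update only, and not of the PSO-FSPMiner procedure itself, the step could look as follows. The names `c1`, `c2`, `pbest`, `gbest`, `update_velocity`, and `update_position` are assumptions chosen for illustration, and the real-valued encoding of a particle is likewise assumed.

```python
import random


def update_velocity(v, x, pbest, gbest, w, c1, c2, rng=random):
    """Canonical PSO velocity update, applied independently per dimension.

    v, x   : current velocity and position of one particle
    pbest  : best position found so far by this particle
    gbest  : best position found so far by the swarm
    w      : inertia factor
    c1, c2 : acceleration constants (hypothetical names)
    """
    new_v = []
    for vi, xi, pi, gi in zip(v, x, pbest, gbest):
        r1, r2 = rng.random(), rng.random()  # fresh uniform draws in [0, 1)
        cognitive = c1 * r1 * (pi - xi)      # pull toward the particle's own best
        social = c2 * r2 * (gi - xi)         # pull toward the swarm's best
        new_v.append(w * vi + cognitive + social)
    return new_v


def update_position(x, v):
    """Move the particle by its (already updated) velocity."""
    return [xi + vi for xi, vi in zip(x, v)]
```

When a particle already sits on both its personal best and the swarm best, the two attraction terms vanish and the velocity is simply scaled by w, which is a convenient sanity check for any implementation.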
| Algorithm 2: AddExternalList |
| Input: Solutions swarm, External List ExternalList, minimum frequency threshold minFreq, minimum coverage threshold minCov |
| Output: The list of non-redundant solutions |
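
Algorithm 2 takes the swarm, the External List, and the two thresholds, and returns a list of non-redundant solutions. The sketch below illustrates one plausible reading of that interface and should not be taken as the authors' procedure: it assumes each solution carries a precomputed frequency and coverage, and it treats a solution as redundant only when an identical pattern is already stored. The field names `pattern`, `freq`, and `cov` and the function name `add_external_list` are hypothetical.

```python
def add_external_list(swarm, external_list, min_freq, min_cov):
    """Return the external list extended with admissible, non-duplicate solutions.

    Each solution is a dict with hypothetical keys:
      "pattern": hashable description of the pattern (e.g. a frozenset)
      "freq"   : frequency of the pattern in the dataset
      "cov"    : coverage of the pattern
    Redundancy is assumed to mean "same pattern already stored".
    """
    kept = list(external_list)                # do not mutate the caller's list
    seen = {s["pattern"] for s in kept}
    for s in swarm:
        meets_thresholds = s["freq"] >= min_freq and s["cov"] >= min_cov
        if meets_thresholds and s["pattern"] not in seen:
            kept.append(s)
            seen.add(s["pattern"])
    return kept


swarm = [
    {"pattern": frozenset({"a=1"}), "freq": 0.05, "cov": 0.30},
    {"pattern": frozenset({"a=1"}), "freq": 0.05, "cov": 0.30},  # duplicate
    {"pattern": frozenset({"b=2"}), "freq": 0.01, "cov": 0.40},  # too infrequent
]
result = add_external_list(swarm, [], min_freq=0.02, min_cov=0.10)
```

A stricter notion of redundancy, for example one based on overlapping coverage, would replace the membership test on `seen` and leave the rest of the sketch unchanged.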
5. Experiments and Results
Computational Performance and Resource Considerations
6. Concluding Remarks
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| FSP | Frequent Similar Pattern |
| FSPs | Frequent Similar Patterns |
| FSPMiner | Frequent Similar Pattern Miner |
| PSO | Particle Swarm Optimization |
| PSO-FSPMiner | Particle Swarm Optimization-Frequent Similar Pattern Miner |
| STree | Similarity Tree |
| FV-Tree | Feature-Value Tree |
| FP-Tree | Frequent Pattern Tree |
| PPC-Tree | PrePost Coding Tree |
| UCI | University of California, Irvine |
| KNN | k-Nearest Neighbor |


| Dataset | Instances | Non-numeric attributes | Numeric attributes | Imbalance |
|---|---|---|---|---|
| Liver Disorders | 345 | 1 | 6 | 27.5 |
| Balance Scale | 576 | 1 | 4 | 0.0 |
| Pima Indians Diabetes | 768 | 1 | 8 | 116.0 |
| Glass Identification | 146 | 1 | 9 | 3.0 |
| Indian Liver Patient | 579 | 2 | 9 | 124.5 |
| Credit Approval | 690 | 9 | 7 | 38.0 |
| Breast Cancer Wisconsin | 683 | 1 | 9 | 102.5 |
| Diagnosis | 120 | 7 | 1 | 24.2 |
| Cryotherapy | 90 | 5 | 2 | 3.0 |
| Algerian Forest Fires | 244 | 2 | 10 | 11.3 |

| Dataset | Freq | FSPs: X-FSPMiner | FSPs: PSO-FSPMiner | Behaviour: % | Behaviour: O Acc |
|---|---|---|---|---|---|
| Liver Disorders | 0.02 | 16,434 | 140 (22.38) | 0.85% | 81% (0.02) |
| Liver Disorders | 0.04 | 9716 | 165 (46.39) | 1.69% | 80% (0.02) |
| Liver Disorders | 0.06 | 6618 | 145 (17.98) | 2.19% | 79% (0.02) |
| Liver Disorders | 0.08 | 4800 | 124 (13.14) | 2.58% | 78% (0.01) |
| Liver Disorders | 0.10 | 3518 | 108 (12.74) | 3.06% | 75% (0.02) |
| Balance Scale | 0.02 | 530 | 63 (12.41) | 11.88% | 80% (0.00) |
| Balance Scale | 0.04 | 342 | 55 (2.57) | 16.08% | 87% (0.00) |
| Balance Scale | 0.06 | 230 | 40 (1.42) | 17.39% | 90% (0.00) |
| Balance Scale | 0.08 | 98 | 30 (0.75) | 30.61% | 83% (0.00) |
| Balance Scale | 0.10 | 66 | 32 (0.00) | 48.48% | 81% (0.00) |
| Diabetes | 0.02 | 528 | 100 (34.02) | 18.93% | 87% (0.01) |
| Diabetes | 0.04 | 182 | 72 (16.89) | 39.56% | 88% (0.02) |
| Diabetes | 0.06 | 84 | 39 (5.745) | 46.42% | 92% (0.02) |
| Diabetes | 0.08 | 42 | 25 (0.00) | 59.52% | 95% (0.00) |
| Diabetes | 0.10 | 30 | 17 (0.00) | 56.66% | 96% (0.00) |
| Glass | 0.02 | 3062 | 429 (23.65) | 14.01% | 96% (0.01) |
| Glass | 0.04 | 242 | 34 (0.77) | 14.04% | 100% (0.00) |
| Glass | 0.06 | 68 | 24 (0.45) | 35.29% | 100% (0.00) |
| Glass | 0.08 | 32 | 17 (0.47) | 53.12% | 100% (0.00) |
| Glass | 0.10 | 22 | 9 (0.15) | 40.90% | 100% (0.00) |
| Indian Liver Patient | 0.02 | 357,648 | 3739 (285.92) | 1.04% | 85% (0.04) |
| Indian Liver Patient | 0.04 | 231,166 | 2588 (216.83) | 1.11% | 82% (0.04) |
| Indian Liver Patient | 0.06 | 165,608 | 1298 (46.03) | 0.78% | 83% (0.03) |
| Indian Liver Patient | 0.08 | 120,228 | 1270 (25.80) | 1.05% | 84% (0.03) |
| Indian Liver Patient | 0.10 | 95,784 | 1288 (15.59) | 8.16% | 81% (0.04) |
| Credit Approval | 0.02 | 70,700 | 6749 (387.01) | 9.54% | 83% (0.02) |
| Credit Approval | 0.04 | 23,324 | 2644 (179.98) | 11.33% | 85% (0.02) |
| Credit Approval | 0.06 | 12,980 | 1414 (43.89) | 10.86% | 84% (0.01) |
| Credit Approval | 0.08 | 8378 | 876 (28.69) | 10.45% | 84% (0.03) |
| Credit Approval | 0.10 | 6032 | 703 (48.90) | 11.65% | 86% (0.03) |
| Breast | 0.02 | 10,756 | 257 (5.40) | 2.38% | 93% (0.01) |
| Breast | 0.04 | 4906 | 142 (4.41) | 2.89% | 94% (0.01) |
| Breast | 0.06 | 3604 | 132 (3.39) | 3.66% | 93% (0.01) |
| Breast | 0.08 | 2524 | 127 (2.01) | 5.03% | 93% (0.01) |
| Breast | 0.10 | 1908 | 128 (1.82) | 6.70% | 94% (0.02) |
| Diagnosis | 0.02 | 11,947 | 1142 (10.60) | 9.55% | 84% (0.03) |
| Diagnosis | 0.04 | 10,342 | 1107 (7.50) | 10.70% | 80% (0.04) |
| Diagnosis | 0.06 | 6383 | 726 (7.45) | 11.37% | 77% (0.01) |
| Diagnosis | 0.08 | 4594 | 421 (29.30) | 9.16% | 76% (0.04) |
| Diagnosis | 0.10 | 3454 | 343 (4.86) | 9.93% | 77% (0.05) |
| Cryotherapy | 0.02 | 8600 | 288 (6.42) | 3.34% | 82% (0.01) |
| Cryotherapy | 0.04 | 2588 | 154 (5.92) | 5.95% | 84% (0.01) |
| Cryotherapy | 0.06 | 1178 | 117 (2.22) | 9.93% | 84% (0.01) |
| Cryotherapy | 0.08 | 548 | 66 (2.55) | 12.04% | 84% (0.01) |
| Cryotherapy | 0.10 | 356 | 46 (2.31) | 12.92% | 85% (0.00) |
| Algerian | 0.02 | 216,898 | 2448 (5.38) | 1.12% | 85% (0.05) |
| Algerian | 0.04 | 81,738 | 1338 (9.55) | 1.63% | 88% (0.04) |
| Algerian | 0.06 | 44,986 | 918 (3.56) | 2.04% | 86% (0.03) |
| Algerian | 0.08 | 29,246 | 594 (2.56) | 2.03% | 89% (0.06) |
| Algerian | 0.10 | 21,646 | 404 (3.99) | 1.86% | 87% (0.05) |
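
The "%" column of the results table appears to express the size of the PSO-FSPMiner subset relative to the full set of FSPs found by X-FSPMiner. Under that assumption, the short sketch below recomputes the ratio for the Liver Disorders rows of the table. The function name `retained_percent` and the `liver_rows` layout are hypothetical; the counts are copied from the table.

```python
def retained_percent(n_subset, n_full):
    """Size of the representative subset as a percentage of the full pattern set."""
    return 100.0 * n_subset / n_full


# (frequency threshold, X-FSPMiner count, PSO-FSPMiner count), Liver Disorders rows
liver_rows = [
    (0.02, 16434, 140),
    (0.04, 9716, 165),
    (0.06, 6618, 145),
    (0.08, 4800, 124),
    (0.10, 3518, 108),
]

for freq, n_full, n_subset in liver_rows:
    print(f"freq={freq:.2f}  retained={retained_percent(n_subset, n_full):.2f}%")
```

The recomputed values agree with the reported percentages for these rows to within about 0.01 percentage points, which is consistent with the table truncating rather than rounding.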

| | 0.04 | 0.06 | 0.08 | 0.10 |
|---|---|---|---|---|
| 0.02 | 0.85 | 0.10 | 0.75 | 0.20 |
| 0.04 | | 0.75 | 0.10 | 0.65 |
| 0.06 | | | 0.65 | 0.10 |
| 0.08 | | | | 0.55 |

Critical difference = 1.92
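
The reported critical difference can be cross-checked under the assumption that it comes from a Nemenyi post-hoc test comparing the k = 5 frequency thresholds over the N = 10 datasets at α = 0.05, and that the table entries are pairwise differences in average rank. The sketch below uses the standard formula CD = q_α · sqrt(k(k+1)/(6N)) with the tabulated critical value q_0.05 = 2.728 for k = 5. The variable and function names are hypothetical.

```python
import math

Q_ALPHA_005_K5 = 2.728  # tabulated Nemenyi critical value for k = 5, alpha = 0.05


def critical_difference(q_alpha, k, n):
    """Nemenyi critical difference for k treatments compared over n datasets."""
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n))


# Pairwise values copied from the table (upper triangle).
rank_diffs = {
    (0.02, 0.04): 0.85, (0.02, 0.06): 0.10, (0.02, 0.08): 0.75, (0.02, 0.10): 0.20,
    (0.04, 0.06): 0.75, (0.04, 0.08): 0.10, (0.04, 0.10): 0.65,
    (0.06, 0.08): 0.65, (0.06, 0.10): 0.10,
    (0.08, 0.10): 0.55,
}

cd = critical_difference(Q_ALPHA_005_K5, k=5, n=10)
# A pair differs significantly only if its rank difference exceeds the CD.
significant = [pair for pair, diff in rank_diffs.items() if diff > cd]
print(f"CD = {cd:.3f}, significant pairs: {significant}")
```

Under these assumptions the formula gives roughly 1.929, in line with the reported 1.92, and none of the tabulated differences exceeds it, so no pair of thresholds would be declared significantly different.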
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Rodríguez-González, A.Y.; Valdovinos-Rosas, R.M.; Bernal Baró, G.; Aranda, R.; Díaz-Pacheco, A.; Álvarez-Carmona, M.Á. PSO-FSPMiner: A Metaheuristic Approach for Mining a Representative Subset of Frequent Similar Patterns. Algorithms 2026, 19, 229. https://doi.org/10.3390/a19030229



