Article

Detecting Duplicates in Bug Tracking Systems with Artificial Intelligence: A Combined Retrieval and Classification Approach

by Iryna Pikh 1,2, Vsevolod Senkivskyy 3, Alona Kudriashova 1,*, Oleksii Bilyk 3, Liubomyr Sikora 2 and Nataliia Lysa 2

1 Department of Virtual Reality Systems, Institute of Computer Science and Information Technologies, Lviv Polytechnic National University, 79013 Lviv, Ukraine
2 Department of Automated Control Systems, Institute of Computer Science and Information Technologies, Lviv Polytechnic National University, 79013 Lviv, Ukraine
3 Department of Computer Technologies in Publishing and Printing Processes, Institute of Printing Art and Media Technologies, Lviv Polytechnic National University, 79013 Lviv, Ukraine
* Author to whom correspondence should be addressed.
Appl. Sci. 2026, 16(1), 416; https://doi.org/10.3390/app16010416 (registering DOI)
Submission received: 13 November 2025 / Revised: 15 December 2025 / Accepted: 29 December 2025 / Published: 30 December 2025

Abstract

Duplicate bug reports increase the workload of software engineering teams and delay the resolution of critical issues, making automated detection essential. This paper presents a two-stage approach that combines transformer-based semantic retrieval with classical machine-learning classification. First, text features of the defect are vectorised using transformer models such as BERT (Bidirectional Encoder Representations from Transformers, google-bert/bert-base-uncased), MiniLM (Miniature Language Model, sentence-transformers/all-MiniLM-L6-v2) or MPNet (Masked and Permuted Pre-training for Language Understanding, sentence-transformers/all-mpnet-base-v2) to identify semantically similar reports and narrow the candidate search space. Second, the filtered pairs are classified using algorithms such as XGBoost (eXtreme Gradient Boosting), SVM (Support Vector Machines) or logistic regression to determine true duplicates. This hybrid method improves accuracy while substantially lowering computational cost. Experimental results validate the proposed approach, demonstrating robust accuracy and consistent performance in identifying duplicate defect reports.
Keywords: duplicate detection; deep learning; transfer learning; natural language processing; computational efficiency
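
The two-stage idea described in the abstract (retrieve a small set of similar reports, then classify candidate pairs) can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. Several parts are assumptions made only for illustration: `embed` is a bag-of-words stand-in for a transformer encoder such as MiniLM or MPNet, the hand-written logistic regression stands in for the classifiers named in the abstract, the two pair features (cosine similarity and Jaccard overlap) are hypothetical, and the bug reports are toy data.

```python
import math
from collections import Counter


def embed(text):
    # ASSUMPTION: token counts stand in for a transformer sentence embedding.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, reports, k=2):
    # Stage 1: keep only the k most similar reports as duplicate candidates.
    q = embed(query)
    scored = sorted(((cosine(q, embed(r)), i) for i, r in enumerate(reports)),
                    reverse=True)
    return [i for _, i in scored[:k]]


def pair_features(a, b):
    # Hypothetical pair features: cosine similarity and Jaccard token overlap.
    ea, eb = embed(a), embed(b)
    union = set(ea) | set(eb)
    jaccard = len(set(ea) & set(eb)) / len(union) if union else 0.0
    return [cosine(ea, eb), jaccard]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, lr=0.5, epochs=500):
    # Stage 2 model: logistic regression fitted by stochastic gradient descent.
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            error = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) - target
            w = [wj - lr * error * xj for wj, xj in zip(w, x)]
            b -= lr * error
    return w, b


def duplicate_probability(model, a, b):
    # Probability that reports a and b describe the same defect.
    w, bias = model
    x = pair_features(a, b)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + bias)


# Toy data (invented for illustration only).
reports = [
    "app crashes when saving file",
    "login button does not respond",
    "crash on saving a file in editor",
    "dark mode colors are wrong",
]
labelled_pairs = [
    ("app crashes when saving file", "application crashes when saving file", 1),
    ("login button does not respond", "login button not responding to clicks", 1),
    ("app crashes when saving file", "dark mode colors are wrong", 0),
    ("login button does not respond", "crash on saving a file in editor", 0),
]

model = train_logreg([pair_features(a, b) for a, b, _ in labelled_pairs],
                     [label for _, _, label in labelled_pairs])

query = "application crashes when saving file"
candidates = retrieve(query, reports, k=2)
for i in candidates:
    print(i, round(duplicate_probability(model, query, reports[i]), 3))
```

In this sketch the classifier scores only the `k` retrieved candidates rather than every stored report, which is where the reduction in computational cost mentioned in the abstract would come from. Swapping `embed` for a real sentence encoder and `train_logreg` for XGBoost, an SVM, or a library logistic regression would leave the overall structure unchanged.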

