Abstract
Background: Parkinson’s disease (PD) is a degenerative neurological disorder that impairs motor and speech function, so early diagnosis is vital for improving patients’ quality of life. This work introduces a unified, explainable AI (XAI) framework for PD detection that integrates ensemble and deep learning models with transparent interpretability techniques. Methods: Acoustic features were extracted from the Parkinson’s Voice Disorder Dataset, and a broad suite of machine learning and deep learning models was evaluated, including traditional classifiers (Logistic Regression, Decision Tree, KNN, Linear Regression, SVM), ensemble methods (Random Forest, Gradient Boosting, XGBoost, LightGBM), and neural architectures (CNN, LSTM, GAN). Results: The ensemble methods, specifically LightGBM (LGBM) and Random Forest (RF), achieved the best performance, reaching state-of-the-art accuracy (98.01%) and ROC-AUC (0.9914). Deep learning models such as CNN and GAN produced competitive results, supporting their ability to capture nonlinear and generative voice patterns. XAI analysis revealed that nonlinear acoustic biomarkers such as spread2, PPE, and RPDE were the most influential predictors, consistent with clinical evidence of dysphonia in PD. Conclusions: The proposed framework achieves a strong balance between predictive accuracy and interpretability, offering a clinically relevant, scalable, and non-invasive approach to early PD detection.