Abstract
The use of machine learning (ML) and deep learning (DL) for lung cancer detection and classification holds great promise for improving early diagnosis and reducing mortality. Despite major research advances, a significant gap remains between successful model development and clinical deployment. This review identifies the main obstacles preventing ML/DL tools from being adopted in real healthcare settings and offers practical recommendations to address them. Following PRISMA guidelines, we examined over 100 studies published between 2022 and 2024, focusing on technical accuracy, clinical relevance, and ethical considerations. Most of the reviewed studies rely on computed tomography (CT) imaging, reflecting its dominant role in current lung cancer screening workflows. While many models achieve high performance on public datasets (e.g., >95% sensitivity on LUNA16), they often perform poorly on real clinical data because of domain shift and bias against underrepresented patient groups. Promising solutions include federated learning for data privacy, synthetic data generation to support rare subtypes, and explainable AI to build clinician trust. We also present a checklist to guide the development of clinically applicable tools, emphasizing generalizability, transparency, and workflow integration. The review recommends early collaboration among developers, clinicians, and policymakers to ensure practical adoption. Ultimately, for ML/DL solutions to gain clinical acceptance, they must be designed in partnership with healthcare professionals from the outset.