This is an early access version, the complete PDF, HTML, and XML versions will be available soon.
Article

Semantic Topic Modeling of Aviation Safety Reports: A Comparative Analysis Using BERTopic and PLSA

by Aziida Nanyonga 1, Keith Joiner 2, Ugur Turhan 3 and Graham Wild 3,*

1 School of Engineering and Technology, University of New South Wales, Canberra, ACT 2600, Australia
2 Capability Systems Centre, University of New South Wales, Canberra, ACT 2610, Australia
3 School of Science, University of New South Wales, Canberra, ACT 2612, Australia
* Author to whom correspondence should be addressed.
Aerospace 2025, 12(6), 551; https://doi.org/10.3390/aerospace12060551
Submission received: 17 May 2025 / Revised: 14 June 2025 / Accepted: 16 June 2025 / Published: 16 June 2025
(This article belongs to the Section Air Traffic and Transportation)

Abstract

Aviation safety analysis increasingly relies on extracting actionable insights from narrative incident reports to support risk identification and improve operational safety. Topic modeling techniques such as Probabilistic Latent Semantic Analysis (pLSA) and BERTopic offer automated methods to uncover latent themes in unstructured safety narratives. This study evaluates the effectiveness of each model in generating coherent, interpretable, and semantically meaningful topics for aviation safety practitioners and researchers. We assess model performance using both quantitative metrics (topic coherence scores) and qualitative evaluations of topic relevance. The findings show that while pLSA provides a solid probabilistic framework, BERTopic, leveraging transformer-based embeddings and HDBSCAN clustering, produces more nuanced, context-aware topic groupings, albeit with increased computational demands and tuning complexity. These results highlight the respective strengths and trade-offs of traditional versus modern topic modeling approaches in aviation safety analysis. This work advances the application of natural language processing (NLP) in aviation by demonstrating how topic modeling can support risk assessment, inform policy, and enhance safety outcomes.
Keywords: aviation safety; topic modeling; BERTopic; pLSA; ASN reports; text mining
