Multi-Label Classification of Contributing Causal Factors in Self-Reported Safety Narratives
Abstract
1. Introduction
2. Literature Review
- For problems of retrieval within a multi-label taxonomy, can the breakdown of narratives into individual sentences be leveraged to provide new strategies for unsupervised classification?
2.1. Multi-Label Measures
2.2. Strategies for Multi-Label Retrieval
3. Method
3.1. Document Primary Problem Evaluation by Sentence Primary Problem
3.2. Document Contributing Factors Evaluation by Sentence Primary Problem
3.3. Document Contributing Factors Evaluation by Sentence Contributing Factors
4. Results
5. Discussion
5.1. NLP as a Combined Approach to Existing Practice
5.2. Limitations
6. Conclusions
Funding
Acknowledgments
Conflicts of Interest
Appendix A. Demonstration of Processes
On SLC WEVIC 1 RNAV Departure; the modified route on the PDC clearance was incorrectly understood and programmed as WEVIC direct KSINO versus WEVIC 1 KSINO transition thereby bypassing intermediate waypoints on the departure. Shortly after we passed WEVIC waypoint; Center asked if we were on WEVIC 1 Departure at which time we immediately detected our error and ATC assigned 20 degrees left to rejoin departure. We queried if there were any traffic conflicts and Center replied there were not. If any questions on PDC modified clearance confirm with Clearance Delivery. Do not allow any rush or distraction to take away from the careful step-by-step verification of ROUTE and LEGS page waypoints compared to clearance.
- “On SLC WEVIC 1 RNAV Departure; the modified route on the PDC clearance was incorrectly understood and programmed as WEVIC direct KSINO versus WEVIC 1 KSINO transition thereby bypassing intermediate waypoints on the departure.”
- “Shortly after we passed WEVIC waypoint; Center asked if we were on WEVIC 1 Departure at which time we immediately detected our error and ATC assigned 20 degrees left to rejoin departure.”
- “We queried if there were any traffic conflicts and Center replied there were not.”
- “If any questions on PDC modified clearance confirm with Clearance Delivery.”
- “Do not allow any rush or distraction to take away from the careful step-by-step verification of ROUTE and LEGS page waypoints compared to clearance.”
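The breakdown of the example narrative into the five sentences listed above can be reproduced with a short script. This is a minimal sketch, not the tokenizer used in the study: it assumes a plain regular-expression rule (split after `.`, `!` or `?` when followed by whitespace and an uppercase letter), and the helper name `split_sentences` is hypothetical. The rule is sufficient for this narrative, but it would mis-split text containing abbreviations such as "Dept. Chair".

```python
import re

# Example narrative from Appendix A, reproduced verbatim.
NARRATIVE = (
    "On SLC WEVIC 1 RNAV Departure; the modified route on the PDC clearance "
    "was incorrectly understood and programmed as WEVIC direct KSINO versus "
    "WEVIC 1 KSINO transition thereby bypassing intermediate waypoints on the "
    "departure. Shortly after we passed WEVIC waypoint; Center asked if we "
    "were on WEVIC 1 Departure at which time we immediately detected our "
    "error and ATC assigned 20 degrees left to rejoin departure. We queried "
    "if there were any traffic conflicts and Center replied there were not. "
    "If any questions on PDC modified clearance confirm with Clearance "
    "Delivery. Do not allow any rush or distraction to take away from the "
    "careful step-by-step verification of ROUTE and LEGS page waypoints "
    "compared to clearance."
)


def split_sentences(text):
    """Hypothetical helper: split on sentence-final punctuation that is
    followed by whitespace and an uppercase letter (an assumed rule)."""
    parts = re.split(r"(?<=[.!?])\s+(?=[A-Z])", text.strip())
    return [part for part in parts if part]


sentences = split_sentences(NARRATIVE)
for number, sentence in enumerate(sentences, start=1):
    print(number, sentence)
```

Semicolons are deliberately not treated as boundaries, which keeps each clause pair together as in the list above.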
Appendix A.1. Primary Problem by Sentence
Appendix A.2. Contributing Factors by Sentence
Appendix A.3. Contributing Factors by Contributing Cause
Labels | True | Retrieved | Difference | ||||
---|---|---|---|---|---|---|---|
“Ambiguous” | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
“Incorrect/Not Installed/Unavailable Part” | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
“ATC Equipment/Nav Facility/Buildings” | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
“Human Factors” | 1 | 1 | 1 | 1 | 0 | 0 | 0 |
“Logbook Entry” | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
“Chart Or Publication” | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
“Equipment/Tooling” | 0 | 0 | 0 | 1 | 0 | 0 | 1 |
“MEL” | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
“Airport” | 0 | 0 | 0 | 1 | 0 | 0 | 1 |
“Aircraft” | 0 | 0 | 1 | 1 | 1 | 1 | 1 |
“Weather” | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
“Staffing” | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
“Environment-Non Weather Related” | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
“Company Policy” | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
“Manuals” | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
“Procedure” | 0 | 0 | 1 | 1 | 1 | 1 | 1 |
“Airspace Structure” | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
“(No label assigned)” | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
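The comparison of true and retrieved labels in the table above can be sketched with set operations. The sub-column headers of the table are not preserved in this copy, so the pairing used here is an assumption: the true label set is taken as {"Human Factors"} and the retrieved set as {"Human Factors", "Aircraft", "Procedure"}, which is one reading of the first "True" and first "Retrieved" columns. The function names are hypothetical, and the measures are the standard set-based definitions of precision, recall and Hamming loss rather than a confirmed reproduction of the paper's computation.

```python
# Number of label rows in the table above, including "(No label assigned)".
N_LABELS = 18

# Assumed reading of one True/Retrieved column pair (see the note above).
true_labels = {"Human Factors"}
retrieved_labels = {"Human Factors", "Aircraft", "Procedure"}


def precision_recall(true_set, retrieved_set):
    """Set-based precision and recall for a single document."""
    hits = true_set & retrieved_set
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(true_set) if true_set else 0.0
    return precision, recall


def hamming_loss(true_set, retrieved_set, n_labels):
    """Fraction of the label vocabulary on which the two sets disagree."""
    return len(true_set ^ retrieved_set) / n_labels


# Labels retrieved but not in the true set (the "Difference" idea).
difference = retrieved_labels - true_labels
precision, recall = precision_recall(true_labels, retrieved_labels)
loss = hamming_loss(true_labels, retrieved_labels, N_LABELS)

print(sorted(difference))
print(round(precision, 3), round(recall, 3), round(loss, 3))
```

Under this reading, every true label is retrieved, so recall is perfect, while the two extra labels lower precision and account for the whole Hamming loss.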
Sentence | Nearest Neighbor: Primary Problem | Nearest Neighbor: Contributing Factors |
---|---|---|
“On SLC WEVIC 1 RNAV Departure; the modified route on the PDC clearance was incorrectly understood and programmed as WEVIC direct KSINO versus WEVIC 1 KSINO transition thereby bypassing intermediate waypoints on the departure.” | HF | HF |
“Shortly after we passed WEVIC waypoint; Center asked if we were on WEVIC 1 Departure at which time we immediately detected our error and ATC assigned 20 degrees left to rejoin departure.” | PR | HF, PR |
“We queried if there were any traffic conflicts and Center replied there were not.” | HF | AC, HF |
“If any questions on PDC modified clearance confirm with Clearance Delivery.” | PR | PR, HF, CP, AP |
“Do not allow any rush or distraction to take away from the careful step-by-step verification of ROUTE and LEGS page waypoints compared to clearance.” | AC | AC |
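The per-sentence nearest-neighbor labels in the table above can be combined into document-level label sets. The sketch below assumes that a document retrieves the union of the labels retrieved by its sentences; this aggregation rule and the helper name `document_labels` are assumptions made for illustration. The label abbreviations (HF, PR, AC, CP, AP) are kept exactly as they appear in the table.

```python
# One entry per sentence of the example narrative, in order:
# (nearest-neighbor primary problem, nearest-neighbor contributing factors).
SENTENCE_NEIGHBORS = [
    ("HF", ["HF"]),
    ("PR", ["HF", "PR"]),
    ("HF", ["AC", "HF"]),
    ("PR", ["PR", "HF", "CP", "AP"]),
    ("AC", ["AC"]),
]


def document_labels(sentence_neighbors):
    """Hypothetical helper: union of sentence-level labels per document."""
    by_primary = set()
    by_contributing = set()
    for primary, contributing in sentence_neighbors:
        by_primary.add(primary)
        by_contributing.update(contributing)
    return by_primary, by_contributing


by_primary, by_contributing = document_labels(SENTENCE_NEIGHBORS)
print(sorted(by_primary))
print(sorted(by_contributing))
```

A union favors recall over precision, since a single off-topic sentence is enough to add a label to the document.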
Appendix A.4. A Discussion of the Process as Applied to the Example Narrative
Labels | Training Fraction (Primary) | Training Fraction (Contributing) | Query Fraction (Primary) | Query Fraction (Contributing) |
---|---|---|---|---|
“Ambiguous” | 0.123 | 0 | 0.106 | 0 |
“Incorrect/Not Installed/Unavailable Part” | 0.003 | 0.011 | 0.005 | 0.011 |
“ATC Equipment/Nav Facility/Buildings” | 0.011 | 0.011 | 0.007 | 0.01 |
“Human Factors” | 0.229 | 0.223 | 0.265 | 0.264 |
“Logbook Entry” | 0.000 | 0.01 | 0.002 | 0.01 |
“Chart Or Publication” | 0.013 | 0.047 | 0.02 | 0.04 |
“Equipment/Tooling” | 0.004 | 0.007 | 0.004 | 0.009 |
“MEL” | 0.007 | 0.013 | 0.006 | 0.014 |
“Airport” | 0.012 | 0.027 | 0.011 | 0.034 |
“Aircraft” | 0.412 | 0.28 | 0.364 | 0.26 |
“Weather” | 0.026 | 0.044 | 0.025 | 0.042 |
“Staffing” | 0.001 | 0.007 | 0.001 | 0.009 |
“Environment-Non Weather Related” | 0.021 | 0.044 | 0.021 | 0.038 |
“Company Policy” | 0.049 | 0.07 | 0.083 | 0.099 |
“Manuals” | 0.011 | 0.032 | 0.006 | 0.025 |
“Procedure” | 0.073 | 0.164 | 0.061 | 0.12 |
“Airspace Structure” | 0.002 | 0.011 | 0.004 | 0.015 |
“(No label assigned)” | 0.003 | 0.001 | 0.009 | 0.002 |
Matching Strategy | Precision | | | Recall | | |
---|---|---|---|---|---|---|
Primary | 0.229 | 0.056 | 0.389 | 0.343 | 0.215 | 0.843 |
Contributing by primary | 0.216 | 0.056 | 0.389 | 0.484 | 0.351 | 0.781 |
Contributing by contributing | 0.364 | 0.111 | 0.611 | 0.400 | 0.255 | 0.935 |
© 2018 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Robinson, S.D. Multi-Label Classification of Contributing Causal Factors in Self-Reported Safety Narratives. Safety 2018, 4, 30. https://doi.org/10.3390/safety4030030