Article

Automatic Information Extraction from Scientific Publications Based on the Use Case of Additive Manufacturing

1 Faculty of Mechanical Science and Engineering, Institute of Mechatronic Engineering (IMD), TUD Dresden University of Technology, 01069 Dresden, Germany
2 Fraunhofer Institute for Material and Beam Technology (IWS), 01277 Dresden, Germany
3 Institute of Material Sciences (IfWW), TUD Dresden University of Technology, 01069 Dresden, Germany
4 Fraunhofer Institute for Machine Tools and Forming Technology (IWU), 01187 Dresden, Germany
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(17), 9331; https://doi.org/10.3390/app15179331 (registering DOI)
Submission received: 7 July 2025 / Revised: 10 August 2025 / Accepted: 21 August 2025 / Published: 25 August 2025
(This article belongs to the Section Additive Manufacturing Technologies)

Abstract

A systematic literature review is fundamental to building a robust research foundation, informing experimental methodology, and ensuring the quality of future scientific output. However, manually extracting targeted information from scientific publications is laborious and error-prone, especially when researchers need rapid access to relevant findings without specialized hardware. This paper introduces an automated workflow for information extraction from scientific publications in the engineering domain. The workflow consists of two stages: data preparation and information extraction. During data preparation, PDF files are converted to plain text and segmented into logical sections by a rule-based block detection and classification algorithm that preserves the semantics of the text. Information extraction then applies regular expressions to both keys and values within the same sentence to identify and extract relevant process and material data from the segmented text. The approach was evaluated on a dataset of 18 open-access scientific publications from various journals and conference proceedings in the additive manufacturing (AM) domain. The results of the automated extraction were compared with manual extraction and with a modern large language model (LLM)-based approach. The findings indicate that the proposed workflow extracts relevant process and material data accurately and efficiently, with performance competitive with the LLM-based method. Automated processing averaged 15 seconds per document, compared with one hour for manual extraction, and achieved a 76% match rate, which substantially reduces the time and potential errors associated with manual extraction. The methodology is readily transferable to other scientific fields that require systematic literature reviews and structured data extraction.
Keywords: automatic extraction; literature research; scientific publications; information extraction; text mining; PDF format; additive manufacturing
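
The extraction idea summarized in the abstract, matching a key and a value within the same sentence, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the parameter names (`laser_power`, `scan_speed`), the regular expressions, the simple sentence splitter, and the function name `extract` are all assumptions chosen for illustration, and the sketch assumes the PDF has already been converted to plain text.

```python
import re

# Hypothetical key patterns: phrases that name a process parameter.
# These are illustrative assumptions, not the expressions used in the paper.
KEY_PATTERNS = {
    "laser_power": re.compile(r"\blaser\s+power\b", re.IGNORECASE),
    "scan_speed": re.compile(r"\bscan(?:ning)?\s+speed\b", re.IGNORECASE),
}

# Hypothetical value patterns: a number followed by a plausible unit.
VALUE_PATTERNS = {
    "laser_power": re.compile(r"(\d+(?:\.\d+)?)\s*(k?W)\b"),
    "scan_speed": re.compile(r"(\d+(?:\.\d+)?)\s*(mm/s|m/s)"),
}

# Naive sentence splitter (an assumption): break after ., ! or ?
# when followed by whitespace and an uppercase letter.
SENTENCE_SPLIT = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def extract(text):
    """Return (parameter, value, unit) tuples for sentences in which
    both the key pattern and the value pattern of a parameter match."""
    results = []
    for sentence in SENTENCE_SPLIT.split(text):
        for name, key_re in KEY_PATTERNS.items():
            if not key_re.search(sentence):
                continue  # key absent: do not look for a value here
            match = VALUE_PATTERNS[name].search(sentence)
            if match:
                results.append((name, float(match.group(1)), match.group(2)))
    return results


sample = (
    "The samples were built with a laser power of 200 W. "
    "The scanning speed was set to 800 mm/s. Porosity was low."
)
print(extract(sample))
# [('laser_power', 200.0, 'W'), ('scan_speed', 800.0, 'mm/s')]
```

Requiring the key and the value to occur in the same sentence keeps a number from being attributed to a parameter mentioned elsewhere in the text, at the cost of missing values that are reported in a neighboring sentence or in a table.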

Share and Cite

MDPI and ACS Style

Feldhoff, K.; Wiemer, H.; Träger, P.; Kühne, R.; Zimmermann, M.; Ihlenfeldt, S. Automatic Information Extraction from Scientific Publications Based on the Use Case of Additive Manufacturing. Appl. Sci. 2025, 15, 9331. https://doi.org/10.3390/app15179331

AMA Style

Feldhoff K, Wiemer H, Träger P, Kühne R, Zimmermann M, Ihlenfeldt S. Automatic Information Extraction from Scientific Publications Based on the Use Case of Additive Manufacturing. Applied Sciences. 2025; 15(17):9331. https://doi.org/10.3390/app15179331

Chicago/Turabian Style

Feldhoff, Kim, Hajo Wiemer, Philip Träger, Robert Kühne, Martina Zimmermann, and Steffen Ihlenfeldt. 2025. "Automatic Information Extraction from Scientific Publications Based on the Use Case of Additive Manufacturing" Applied Sciences 15, no. 17: 9331. https://doi.org/10.3390/app15179331

APA Style

Feldhoff, K., Wiemer, H., Träger, P., Kühne, R., Zimmermann, M., & Ihlenfeldt, S. (2025). Automatic Information Extraction from Scientific Publications Based on the Use Case of Additive Manufacturing. Applied Sciences, 15(17), 9331. https://doi.org/10.3390/app15179331
