Article

Impact on Inference Model Performance for ML Tasks Using Real-Life Training Data and Synthetic Training Data from GANs

SHS—Stahl-Holding-Saar GmbH & Co. KGaA, 66763 Dillingen, Germany
* Author to whom correspondence should be addressed.
Academic Editor: Stefano Berretti
Information 2022, 13(1), 9; https://doi.org/10.3390/info13010009
Received: 12 November 2021 / Revised: 16 December 2021 / Accepted: 22 December 2021 / Published: 28 December 2021
Collecting and labeling well-balanced training data is usually difficult under real conditions. In addition to classic modeling methods, Generative Adversarial Networks (GANs) offer a powerful way to generate synthetic training data. In this paper, we evaluate the hybrid use of real-life and generated synthetic training data in different fractions and its effect on model performance. We found that using up to 75% synthetic training data can compensate for time-consuming and costly manual annotation, while model performance in our Deep Learning (DL) use case stays in the same range as with a 100% share of hand-annotated real images. With synthetic training data specifically tailored to produce a balanced dataset, special care can be taken of events that occur only rarely, and ML models can be brought into industrial application without much delay, making them feasible and economically attractive for a wide range of applications in the process and manufacturing industries. Hence, the main outcome of this paper is that our methodology can help to advance the implementation of many different industrial Machine Learning and Computer Vision applications by making them economically maintainable. We conclude that, following the findings presented in this study, a multitude of industrial ML use cases that require large and balanced training data containing all information relevant to the target model can be solved in the future.
Keywords: Generative Adversarial Networks; Computer Vision; image synthesis; industrial application; Cognitive Twin; Digital Twin
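
The hybrid use of real and synthetic training data "in different fractions" described in the abstract could be set up roughly as in the following minimal sketch. It is not the authors' implementation: the function name `mix_training_data`, the sample representation, and the fixed total set size are assumptions made purely for illustration.

```python
import random


def mix_training_data(real, synthetic, synthetic_fraction, total, seed=0):
    """Draw `total` samples, a given fraction of which come from `synthetic`.

    Hypothetical helper for illustration only; `real` and `synthetic` are
    lists of already-labeled samples (for example, (image_path, label) pairs).
    """
    if not 0.0 <= synthetic_fraction <= 1.0:
        raise ValueError("synthetic_fraction must lie in [0, 1]")

    n_synthetic = round(total * synthetic_fraction)
    n_real = total - n_synthetic
    if n_synthetic > len(synthetic) or n_real > len(real):
        raise ValueError("not enough samples for the requested mix")

    rng = random.Random(seed)  # fixed seed keeps the mix reproducible
    mixed = rng.sample(real, n_real) + rng.sample(synthetic, n_synthetic)
    rng.shuffle(mixed)  # avoid ordering the set by data source
    return mixed


if __name__ == "__main__":
    # Toy stand-ins for annotated real images and GAN-generated images.
    real_pool = [("real", i) for i in range(200)]
    synthetic_pool = [("synthetic", i) for i in range(200)]

    # Build training sets with increasing synthetic shares, up to the
    # 75% share mentioned in the abstract.
    for fraction in (0.0, 0.25, 0.5, 0.75):
        train = mix_training_data(real_pool, synthetic_pool, fraction, total=100)
        share = sum(1 for source, _ in train if source == "synthetic") / len(train)
        print(f"requested {fraction:.2f}, actual synthetic share {share:.2f}")
```

Keeping the total size constant across fractions is one possible design choice; it means any change in model performance is attributable to the composition of the training set rather than to its size.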

MDPI and ACS Style

Faltings, U.; Bettinger, T.; Barth, S.; Schäfer, M. Impact on Inference Model Performance for ML Tasks Using Real-Life Training Data and Synthetic Training Data from GANs. Information 2022, 13, 9. https://doi.org/10.3390/info13010009

