Open Access Article

Data Wrangling in Database Systems: Purging of Dirty Data

German Center for Higher Education Research and Science Studies (DZHW), Schützenstraße 6a, Berlin 10117, Germany
Received: 30 March 2020 / Revised: 20 May 2020 / Accepted: 2 June 2020 / Published: 5 June 2020
(This article belongs to the Special Issue Challenges in Business Intelligence)
Researchers need to be able to integrate ever-increasing amounts of data into their institutional databases, regardless of the source, format, or size of the data. They must then use this growing diversity of data to derive greater value for their organization. The processing of electronic data plays a central role in modern society: data are a fundamental part of operational processes in companies and scientific organizations, and they form the basis for decisions. Poor data quality can therefore lead to poor decisions and degrade results, which makes data quality crucial. This is where data wrangling, sometimes referred to as data munging or data crunching, comes in: it finds dirty data and transforms and cleans them. The aim of data wrangling is to prepare large amounts of raw data, starting from their original state, so that they can be used in further analysis steps. Only then can knowledge be obtained that may bring added value. This paper shows how the data wrangling process works and how it can be used in database systems to clean up data from heterogeneous data sources during their acquisition and integration.
Keywords: information systems; data management systems; heterogeneous data; data integration; dirty data identification; data quality; data curation; data management; data wrangling; data munging; data crunching
MDPI and ACS Style

Azeroual, O. Data Wrangling in Database Systems: Purging of Dirty Data. Data 2020, 5, 50.
