Open Access Article

Synthesizing High-Utility Patterns from Different Data Sources

Department of Computer Engineering, St. Vincent Pallotti College of Engineering & Technology, Nagpur 441108, India
* Author to whom correspondence should be addressed.
Received: 3 August 2018 / Revised: 25 August 2018 / Accepted: 30 August 2018 / Published: 3 September 2018

Abstract

In large organizations, data often must be collected from branches spread over different geographic locations. Extensive amounts of data may be gathered at a centralized location in order to generate interesting patterns via mono-mining the amassed database. However, it is feasible to mine the useful patterns at each data source and forward only these patterns to the central location, rather than the entire original database. These patterns also exist in huge numbers, and different sources calculate different utility values for each pattern. This paper proposes a weighted model for aggregating the high-utility patterns from different data sources. A pattern selection procedure is also proposed to efficiently extract high-utility patterns in the weighted model by discarding low-utility patterns. The synthesizing model yields high-utility patterns, unlike association rule mining, in which frequent itemsets are generated by treating every item as having equal utility, an assumption that does not hold in real-life applications such as sales transactions. Extensive experiments on datasets with varied characteristics show that the proposed algorithm is effective for mining sparse and very sparse databases with a huge number of transactions. The proposed model also outperforms various state-of-the-art distributed mining models in terms of running time.
Keywords: data integration; data mining; high-utility patterns; knowledge discovery; weighted model; multi-database mining; distributed data mining
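The abstract does not give the weighting formula, so the following is only a minimal sketch of the general idea of synthesizing locally mined patterns with source weights. Everything in it is an assumption made for illustration: the function name `synthesize_patterns`, the use of normalized utilities in [0, 1], the choice to weight each branch by a caller-supplied number, and the single threshold `min_utility`. It is not the authors' model.

```python
from collections import defaultdict


def synthesize_patterns(source_patterns, source_weights, min_utility):
    """Combine per-source pattern utilities into one weighted global utility.

    source_patterns: {source_name: {pattern_tuple: utility}} (hypothetical layout)
    source_weights:  {source_name: non-negative weight} (assumed given by the caller)
    min_utility:     patterns whose synthesized utility falls below this are dropped
    """
    total_weight = sum(source_weights.values())
    combined = defaultdict(float)
    for source, patterns in source_patterns.items():
        # Normalize so that the weights of all sources sum to 1.
        weight = source_weights[source] / total_weight
        for pattern, utility in patterns.items():
            # A source that did not forward a pattern simply contributes nothing.
            combined[frozenset(pattern)] += weight * utility
    # Pattern selection: discard patterns with low synthesized utility.
    return {p: u for p, u in combined.items() if u >= min_utility}


# Two illustrative branches forwarding their locally mined patterns.
branches = {
    "branch_a": {("bread", "milk"): 0.40, ("tea",): 0.10},
    "branch_b": {("bread", "milk"): 0.20, ("tea",): 0.30},
}
weights = {"branch_a": 3, "branch_b": 1}  # assumed, e.g. relative branch size

result = synthesize_patterns(branches, weights, min_utility=0.30)
print(result)
```

With these illustrative numbers, the pattern {bread, milk} receives a synthesized utility of about 0.35 and is kept, while {tea} receives about 0.15 and is discarded. Only the forwarded patterns and their utilities travel to the central location, which is the point of mining at the source.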
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).
MDPI and ACS Style

Muley, A.; Gudadhe, M. Synthesizing High-Utility Patterns from Different Data Sources. Data 2018, 3, 32.

Data, EISSN 2306-5729, published by MDPI AG, Basel, Switzerland