Open Access Article

LCSS-Based Algorithm for Computing Multivariate Data Set Similarity: A Case Study of Real-Time WSN Data

Department of Computer Science, Abdul Wali Khan University, Mardan 23200, Pakistan
Department of Computer System and Technology, Faculty of Computer Science and IT, University of Malaya, Kuala Lumpur 50603, Malaysia
Faculty of Computing and Information Technology, Northern Border University, Rafha 91911, Saudi Arabia
Department of Electronics, University of Peshawar, Peshawar 25000, Pakistan
School of Computing and Information Technology, Taylor’s University, Subang Jaya 47500, Malaysia
Authors to whom correspondence should be addressed.
Sensors 2019, 19(1), 166
Received: 9 October 2018 / Revised: 6 November 2018 / Accepted: 7 November 2018 / Published: 4 January 2019
(This article belongs to the Special Issue Multivariate Data Analysis for Sensors and Sensor Arrays)
Multivariate data sets are common in various application areas, such as wireless sensor networks (WSNs) and DNA analysis. A robust mechanism is required to compute their similarity indexes regardless of the environment and problem domain. This study describes the usefulness of a non-metric-based approach (i.e., longest common subsequence) in computing similarity indexes. Several non-metric-based algorithms are available in the literature; the most robust and reliable of these is the dynamic programming-based technique. However, dynamic programming-based techniques are considered inefficient, particularly in the context of multivariate data sets. Furthermore, the classical approaches are not powerful enough in scenarios involving multivariate data sets or sensor data, or when the similarity indexes are extremely high or low. To address these issues, we propose an efficient algorithm that measures the similarity indexes of multivariate data sets using a non-metric-based methodology. The proposed algorithm performs exceptionally well on numerous multivariate data sets compared with the classical dynamic programming-based algorithms. The performance of the algorithms is evaluated on several benchmark data sets and on a dynamic multivariate data set obtained from a WSN deployed at the Ghulam Ishaq Khan (GIK) Institute of Engineering Sciences and Technology. Our evaluation suggests that the proposed algorithm can be approximately 39.9% more efficient than its counterparts in terms of computational time for various data sets.
Keywords: multivariate data set; longest common subsequence; dynamic programming; WSN data
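
For orientation, the classical dynamic programming baseline that the abstract refers to can be sketched as follows. This is a minimal illustration of the textbook longest common subsequence recurrence and is not the algorithm proposed in the article. The function names, the tolerance parameter `eps` for matching real-valued sensor readings, and the normalization by the shorter sequence length are all assumptions made for this example.

```python
from typing import Sequence


def lcss_length(a: Sequence[float], b: Sequence[float], eps: float = 0.0) -> int:
    """Length of the longest common subsequence via the classical DP table.

    Two elements are treated as equal when they differ by at most `eps`
    (a hypothetical tolerance; eps=0.0 means exact equality).
    """
    m, n = len(a), len(b)
    # table[i][j] holds the LCSS length of the prefixes a[:i] and b[:j].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if abs(a[i - 1] - b[j - 1]) <= eps:
                # Matching elements extend the best subsequence of the shorter prefixes.
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # Otherwise carry over the better of dropping one element from either side.
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[m][n]


def similarity_index(a: Sequence[float], b: Sequence[float], eps: float = 0.0) -> float:
    """LCSS length divided by the shorter sequence length (assumed normalization)."""
    if not a or not b:
        return 0.0
    return lcss_length(a, b, eps) / min(len(a), len(b))


if __name__ == "__main__":
    print(similarity_index([1, 2, 3, 4], [2, 4, 1, 3]))  # prints 0.5
```

The table has (m + 1) × (n + 1) entries, so time and memory both grow with the product of the two sequence lengths. That quadratic cost is the inefficiency the abstract attributes to dynamic programming-based techniques.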
MDPI and ACS Style

Khan, R.; Ali, I.; Altowaijri, S.M.; Zakarya, M.; Ur Rahman, A.; Ahmedy, I.; Khan, A.; Gani, A. LCSS-Based Algorithm for Computing Multivariate Data Set Similarity: A Case Study of Real-Time WSN Data. Sensors 2019, 19, 166.
