
Table of Contents

Big Data Cogn. Comput., Volume 2, Issue 4 (December 2018)

  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive the tables of contents of newly released issues.
  • PDF is the official format for papers, which are published in both HTML and PDF forms. To view the papers in PDF format, click on the "PDF Full-text" link and use the free Adobe Reader to open them.
Displaying articles 1-9
Open Access Article: Prototype of Mobile Device to Contribute to Urban Mobility of Visually Impaired People
Big Data Cogn. Comput. 2018, 2(4), 38; https://doi.org/10.3390/bdcc2040038
Received: 23 September 2018 / Revised: 27 November 2018 / Accepted: 29 November 2018 / Published: 4 December 2018
Abstract
Visually impaired people (VIP) lack aids that ease their urban mobility, mainly because of obstacles encountered on their routes. This paper describes the design of AudioMaps, a prototype of cartographic technology for mobile devices. AudioMaps was designed to register the descriptions and locations of points of interest. When a point is registered, the prototype inserts a georeferenced landmark on the screen (based on Google Maps). Then, when the device's position comes within 15 m, and again within 5 m, of a previously registered point, it announces the remaining distance and a description by audio. For evaluation, a test area located in Monte Carmelo, Brazil, was selected, and the light poles, street corners (names of the streets forming the intersections), and crosswalks were registered in AudioMaps. A manually produced tactile model was used to give four sighted people and four VIP a first mental image of the area before they completed a navigation task in it. The results indicate that both the tactile model and the audiovisual prototype can be used by both groups of participants. Above all, the prototype proved to be a viable and promising option for decision-making and spatial orientation in urban environments. New ways of presenting data to VIP or sighted people are described. Full article
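The proximity behaviour described in the abstract (an audio announcement at 15 m and again at 5 m from a registered point) can be sketched in a few lines; the haversine formula and the function names here are illustrative assumptions, not AudioMaps' actual implementation:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS-84 coordinates."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def proximity_message(user, landmark, description, thresholds=(15.0, 5.0)):
    """Return an announcement string when the user is within one of the
    thresholds of a registered landmark, or None while out of range."""
    d = haversine_m(user[0], user[1], landmark[0], landmark[1])
    for t in sorted(thresholds):  # report the tightest matching threshold
        if d <= t:
            return f"{description} in {d:.0f} m"
    return None
```

A caller would poll `proximity_message` with each GPS fix and route any non-`None` result to the device's text-to-speech engine.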

Open Access Article: Leveraging Image Representation of Network Traffic Data and Transfer Learning in Botnet Detection
Big Data Cogn. Comput. 2018, 2(4), 37; https://doi.org/10.3390/bdcc2040037
Received: 27 October 2018 / Revised: 21 November 2018 / Accepted: 23 November 2018 / Published: 27 November 2018
Abstract
Advances in the Internet have enabled more devices to be connected to it every day, and the emergence of the Internet of Things (IoT) has accelerated this growth. The lack of security in the IoT world makes these devices hot targets for cybercriminals. One of their malicious actions is the botnet attack, one of the main destructive threats, which has been evolving since 2003 into different forms. This attack is a serious threat to the security and privacy of information; its scalability, structure, strength, and strategy are under continual development, which is why it has survived for decades. A bot is defined as a software application that executes a number of automated tasks (simple but structurally repetitive) over the Internet. Several bots make up a botnet, which infects a number of devices that communicate with their controller, called the botmaster, to get their instructions. A botnet executes tasks at a rate that would be impossible for a human being. Nowadays, the activities of bots are concealed among normal web flows and occupy more than half of all web traffic. The largest use of bots is in web spidering (web crawling), in which an automated script fetches, analyzes, and files information from web servers. They also contribute to other attacks, such as distributed denial of service (DDoS), spam, identity theft, phishing, and espionage. A number of botnet detection techniques have been proposed, such as honeynet-based and Intrusion Detection System (IDS)-based approaches. These techniques are no longer effective due to the constant updating of the bots and their evasion mechanisms. Recently, botnet detection techniques based on machine/deep learning have been proposed that are more capable than their previously mentioned counterparts. In this work, we propose a deep learning-based engine for botnet detection to be utilized in IoT and wearable devices.
In this system, the normal and botnet network traffic data are transformed into images before being fed into a deep convolutional neural network, named DenseNet, with and without transfer learning. The system is implemented in the Python programming language, and the CTU-13 dataset is used for evaluation in one study. According to our simulation results, using transfer learning can improve the accuracy from 33.41% up to 99.98%. In addition, two other classifiers, Support Vector Machine (SVM) and logistic regression, were used; they showed accuracies of 83.15% and 78.56%, respectively. In another study, we evaluate our system on an in-house live normal dataset and a solely botnet dataset. Similarly, the system performed very well at data classification in these studies. To examine the capability of our system for real-time applications, we measured the system's training and testing times. According to our examination, it takes 0.004868 milliseconds to process each packet from the network traffic data during testing. Full article
(This article belongs to the Special Issue Applied Deep Learning: Business and Industrial Applications)
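The image-representation step the abstract describes (traffic bytes rendered as an image for a CNN) could look roughly like the following sketch; the 28 x 28 grayscale layout, zero-padding policy, and function name are assumptions for illustration, since the paper's exact preprocessing is not given here:

```python
def traffic_to_image(payload: bytes, side: int = 28) -> list:
    """Map raw network-traffic bytes onto a fixed side x side grayscale
    matrix (values 0-255), truncating or zero-padding as needed -- the
    kind of representation a CNN such as DenseNet can then classify."""
    n = side * side
    buf = payload[:n] + bytes(n - len(payload[:n]))  # truncate / zero-pad
    return [list(buf[r * side:(r + 1) * side]) for r in range(side)]
```

Each resulting matrix can then be fed to the network, replicated across channels if a pretrained model expects RGB input.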

Open Access Article: A Model Free Control Based on Machine Learning for Energy Converters in an Array
Big Data Cogn. Comput. 2018, 2(4), 36; https://doi.org/10.3390/bdcc2040036
Received: 22 October 2018 / Revised: 13 November 2018 / Accepted: 18 November 2018 / Published: 22 November 2018
Abstract
This paper introduces a machine learning-based control strategy for energy converter arrays designed to work under realistic conditions where the optimal control parameter cannot be obtained analytically. The control strategy neither relies on a mathematical model nor needs a priori information about the energy medium. To this end, several identical energy converters are arranged so that they are affected simultaneously by the energy medium. Each device uses a different control strategy, of which at least one has to be the machine learning approach presented in this paper. During operation, all energy converters record the absorbed power and control output; the machine learning device receives the data from the converter with the highest power absorption and thus learns the best-performing control strategy for each situation. Consequently, the overall network performs better than each individual strategy. This concept is evaluated for wave energy converters (WECs) with numerical simulations and with experiments on physical scale models in a wave tank. In the first of two numerical simulations, the learnable WEC works in an array with four WECs applying constant damping factors. In the second simulation, two learnable WECs learn from each other. In the first test, the learnable WEC was able to absorb as much as the best constant-damping WEC, while in the second run it absorbed even slightly more. During the physical model test, the ANN showed its ability to select the better of two possible damping coefficients based on real-world input data. Full article
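The learning rule in the abstract (adopt the control setting of the array's best absorber) can be illustrated with a toy simulation; the power model below is a made-up stand-in for the real hydrodynamics, with a hidden optimal damping the controller never sees:

```python
def run_step(wave_amplitude, damping):
    """Toy power model: absorbed power peaks at some unknown optimal
    damping that depends on the current sea state (illustrative only)."""
    optimal = 2.0 * wave_amplitude          # hidden from the controller
    return max(0.0, 1.0 - abs(damping - optimal) / optimal)

def learner_update(array_dampings, wave_amplitude):
    """Model-free rule from the paper's concept: every converter reports
    its absorbed power, and the learning device adopts the damping of
    the best performer for this sea state."""
    powers = [run_step(wave_amplitude, d) for d in array_dampings]
    best = max(range(len(powers)), key=powers.__getitem__)
    return array_dampings[best]
```

Run over many sea states, the learner accumulates a state-to-damping mapping at least as good as the best fixed strategy in the array.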

Open Access Article: The Next Generation Cognitive Security Operations Center: Network Flow Forensics Using Cybersecurity Intelligence
Big Data Cogn. Comput. 2018, 2(4), 35; https://doi.org/10.3390/bdcc2040035
Received: 25 October 2018 / Revised: 12 November 2018 / Accepted: 20 November 2018 / Published: 22 November 2018
Abstract
A Security Operations Center (SOC) can be defined as an organized and highly skilled team that uses advanced computer forensics tools to prevent, detect, and respond to the cybersecurity incidents of an organization. The fundamental capabilities of an effective SOC are the ability to examine and analyze a vast number of data flows and to correlate several other types of events from a cybersecurity perspective. The supervision and categorization of network flows is an essential process not only for the scheduling, management, and regulation of the network's services, but also for identifying attacks and for the consequent forensic investigations. A serious potential disadvantage of the traditional software solutions used today for computer network monitoring, specifically for the effective categorization of encrypted or obfuscated network flows, which requires rebuilding message packets of sophisticated underlying protocols, is their demand for computational resources. A further significant shortcoming of these software packages is that they produce high false positive rates because they lack accurate prediction mechanisms. For all the reasons above, in most cases the traditional software fails completely to recognize unidentified vulnerabilities and zero-day exploitations. This paper proposes a novel intelligence-driven Network Flow Forensics Framework (NF3), with low utilization of computing power and resources, for the Next Generation Cognitive Computing SOC (NGC2SOC), which relies solely on advanced, fully automated intelligence methods. It is an effective and accurate ensemble machine learning forensics tool for network traffic analysis, demystification of malware traffic, and encrypted traffic identification. Full article
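The ensemble idea behind NF3 can be reduced to its simplest form, a majority vote over base learners; the classifier interface below is a generic sketch, not the paper's actual model stack:

```python
from collections import Counter

def ensemble_predict(classifiers, flow_features):
    """Majority vote over a set of base classifiers -- the simplest form
    of ensemble machine learning. Base learners are arbitrary callables
    mapping a flow-feature vector to a label."""
    votes = [clf(flow_features) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]
```

In a real forensics pipeline the base learners would be trained models of different families, so that their errors are less correlated than any single model's.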

Open Access Article: Big-Crypto: Big Data, Blockchain and Cryptocurrency
Big Data Cogn. Comput. 2018, 2(4), 34; https://doi.org/10.3390/bdcc2040034
Received: 31 August 2018 / Revised: 5 October 2018 / Accepted: 16 October 2018 / Published: 19 October 2018
Abstract
Cryptocurrency has been a trending topic over the past decade, pooling tremendous technological power and attracting investments valued at trillions of dollars on a global scale. Cryptocurrency technology and its network have been endowed with many superior features due to their unique architecture, which also determines their worldwide efficiency, applicability, and data-intensive characteristics. This paper introduces and summarises the interactions between two significant concepts in the digitalized world: cryptocurrency and Big Data. Both subjects are at the forefront of technological research, and this paper focuses on their convergence, comprehensively reviewing the very recent applications and developments after 2016. Accordingly, we aim to present a systematic review of the interactions between Big Data and cryptocurrency and to serve as a one-stop reference directory for researchers, helping to identify research gaps and direct future explorations. Full article

Open Access Article: Topological Signature of 19th Century Novelists: Persistent Homology in Text Mining
Big Data Cogn. Comput. 2018, 2(4), 33; https://doi.org/10.3390/bdcc2040033
Received: 23 September 2018 / Revised: 8 October 2018 / Accepted: 15 October 2018 / Published: 18 October 2018
Abstract
Topological Data Analysis (TDA) refers to a collection of methods that find the structure of shapes in data. Although TDA methods have recently been used in many areas of data mining, they have not been widely applied to text mining tasks. In most text processing algorithms, the order in which different entities appear or co-appear is lost. Assuming these lost orders are informative features of the data, TDA may play a significant role in filling the resulting gap in the text processing state of the art. Once provided, the topology of different entities throughout a textual document may reveal additional information about the document that is not reflected in any features from conventional text processing methods. In this paper, we introduce a novel approach that employs TDA in text processing in order to capture and use the topology of different same-type entities in textual documents. First, we show how to extract topological signatures from the text using persistent homology, i.e., a TDA tool that captures the topological signature of a data cloud. Then we show how to utilize these signatures for text classification. Full article
(This article belongs to the Special Issue Big Data and Cognitive Computing: Feature Papers 2018)
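The persistent-homology step can be made concrete for dimension 0, where persistence under the Vietoris-Rips filtration reduces to single-linkage merge scales, i.e. minimum-spanning-tree edge lengths. This is a generic TDA sketch, not the authors' pipeline:

```python
import itertools, math

def h0_persistence(points):
    """0-dimensional persistence of a Euclidean point cloud under the
    Vietoris-Rips filtration: every component is born at scale 0 and
    dies when it merges with another, at the corresponding minimum-
    spanning-tree edge length (one component never dies)."""
    parent = list(range(len(points)))

    def find(i):  # union-find with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(len(points)), 2)
    )
    deaths = []
    for w, i, j in edges:  # Kruskal: each accepted edge kills a component
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(w)
    return deaths  # n_points - 1 merge scales, in increasing order
```

Applied to word-embedding clouds of same-type entities, the multiset of death scales becomes a fixed-size topological signature usable as a classification feature.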

Open Access Review: Data Stream Clustering Techniques, Applications, and Models: Comparative Analysis and Discussion
Big Data Cogn. Comput. 2018, 2(4), 32; https://doi.org/10.3390/bdcc2040032
Received: 16 July 2018 / Revised: 23 August 2018 / Accepted: 10 October 2018 / Published: 17 October 2018
Abstract
Data growth in today's world is exponential; many applications, such as smart grids, sensor networks, video surveillance, financial systems, medical science, web click streams, and network monitoring, generate huge amounts of data streams at very high speed. In traditional data mining, the data set is generally static in nature and available many times over for processing and analysis. Data stream mining, however, has to satisfy constraints related to real-time response, bounded and limited memory, single-pass processing, and concept-drift detection. The main problem is identifying the hidden patterns and knowledge needed to understand the context and identify trends in continuous data streams. In this paper, various data stream methods and algorithms are reviewed and evaluated on standard synthetic data streams and real-life data streams. Density micro-clustering and density grid-based clustering algorithms are discussed, and a comparative analysis in terms of various internal and external clustering evaluation methods is performed. It was observed that no single algorithm can satisfy all the performance measures. The performance of these data stream clustering algorithms is domain-specific and requires many parameters for density and noise thresholds. Full article
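The single-pass, bounded-memory constraint mentioned above is usually met with cluster-feature summaries; a minimal sketch of such a micro-cluster (used, in richer time-decayed form, by algorithms like CluStream and DenStream) is:

```python
class MicroCluster:
    """CF-vector summary (N, LS, SS): constant memory per cluster,
    updated in a single pass as points arrive, from which centroid
    and radius are recoverable at any time."""
    def __init__(self, dim):
        self.n = 0
        self.ls = [0.0] * dim   # linear sum per dimension
        self.ss = [0.0] * dim   # squared sum per dimension

    def absorb(self, point):
        self.n += 1
        for k, x in enumerate(point):
            self.ls[k] += x
            self.ss[k] += x * x

    def centroid(self):
        return [s / self.n for s in self.ls]

    def radius(self):
        # RMS deviation from the centroid, derivable from (N, LS, SS)
        var = sum(ss / self.n - (ls / self.n) ** 2
                  for ls, ss in zip(self.ls, self.ss))
        return max(var, 0.0) ** 0.5
```

Because two CF vectors add component-wise, micro-clusters can also be merged in constant time, which is what makes the online phase of these algorithms feasible.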

Open Access Article: Constrained Optimization-Based Extreme Learning Machines with Bagging for Freezing of Gait Detection
Big Data Cogn. Comput. 2018, 2(4), 31; https://doi.org/10.3390/bdcc2040031
Received: 4 September 2018 / Revised: 9 October 2018 / Accepted: 12 October 2018 / Published: 15 October 2018
Abstract
The Internet of Things (IoT) represents a paradigm shift from slow, manual approaches to fast, automated systems, and it has been deployed for various use cases and applications in recent times. Many aspects of IoT can be used for the assistance of elderly individuals. In this paper, we detect the presence or absence of freezing of gait in patients suffering from Parkinson's disease (PD) by using the data from body-mounted acceleration sensors placed on the legs and hips of the patients. For accurate detection and estimation, constrained optimization-based extreme learning machines (C-ELM) have been utilized. Moreover, in order to enhance the accuracy even further, C-ELM with bagging (C-ELMBG) is proposed, which uses the characteristics of least squares support vector machines. The experiments have been carried out on the publicly available Daphnet freezing of gait dataset to verify the feasibility of C-ELM and C-ELMBG. The simulation results show an accuracy above 90% for both methods. A detailed comparison with other state-of-the-art statistical learning algorithms such as linear discriminant analysis, classification and regression trees, random forests, and support vector machines is also presented, where C-ELM and C-ELMBG show better performance in all aspects, including accuracy, sensitivity, and specificity. Full article
(This article belongs to the Special Issue Health Assessment in the Big Data Era)
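A plain (unconstrained) extreme learning machine conveys the core of the approach: random hidden weights are fixed, and only a ridge-regularised linear readout is solved in closed form. The sketch below omits the constrained-optimization and bagging extensions that define C-ELM and C-ELMBG:

```python
import math, random

def elm_train(X, y, hidden=20, c=1000.0, seed=0):
    """Minimal extreme learning machine: random input weights, sigmoid
    hidden layer, ridge-regularised least-squares readout. Returns a
    prediction function; sign of the output gives the class."""
    rng = random.Random(seed)
    d = len(X[0])
    W = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(hidden)]
    b = [rng.uniform(-1, 1) for _ in range(hidden)]

    def feats(x):  # sigmoid hidden-layer activations
        return [1.0 / (1.0 + math.exp(-(sum(wk * xk for wk, xk in zip(w, x)) + bi)))
                for w, bi in zip(W, b)]

    H = [feats(x) for x in X]
    # Normal equations with ridge term: (H^T H + I/c) beta = H^T y
    A = [[sum(h[i] * h[j] for h in H) + (1.0 / c if i == j else 0.0)
          for j in range(hidden)] for i in range(hidden)]
    rhs = [sum(h[i] * t for h, t in zip(H, y)) for i in range(hidden)]
    for i in range(hidden):            # forward elimination (SPD, no pivoting)
        for j in range(i + 1, hidden):
            f = A[j][i] / A[i][i]
            A[j] = [ajk - f * aik for ajk, aik in zip(A[j], A[i])]
            rhs[j] -= f * rhs[i]
    beta = [0.0] * hidden
    for i in reversed(range(hidden)):  # back substitution
        beta[i] = (rhs[i] - sum(A[i][j] * beta[j]
                                for j in range(i + 1, hidden))) / A[i][i]
    return lambda x: sum(bk * hk for bk, hk in zip(beta, feats(x)))
```

Because only the readout is trained, training is a single linear solve, which is what makes ELM variants attractive for wearable-scale workloads.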

Open Access Article: An Experimental Evaluation of Fault Diagnosis from Imbalanced and Incomplete Data for Smart Semiconductor Manufacturing
Big Data Cogn. Comput. 2018, 2(4), 30; https://doi.org/10.3390/bdcc2040030
Received: 18 July 2018 / Revised: 10 September 2018 / Accepted: 14 September 2018 / Published: 21 September 2018
Abstract
The SECOM dataset contains information about a semiconductor production line, including the products that failed the in-house test line and their attributes. This dataset, like most semiconductor manufacturing data, contains missing values, imbalanced classes, and noisy features. In this work, the challenges of this dataset are met, and many different approaches to classification are evaluated to perform fault diagnosis. We present an experimental evaluation that examines 288 combinations of different approaches involving data pruning, data imputation, feature selection, and classification methods in order to find the most suitable approaches for this task. Furthermore, a novel data imputation approach, namely "In-painting KNN-Imputation", is introduced and shown to outperform the common data imputation technique. The results show the capability of each classifier, feature selection method, data generation method, and data imputation technique, with a full analysis of their respective parameter optimizations. Full article
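For reference, the common KNN imputation baseline that "In-painting KNN-Imputation" is compared against can be sketched as follows; the distance over shared observed features and the `None` missing-value convention are ordinary choices for illustration, not details from the paper:

```python
def knn_impute(rows, k=2):
    """Plain KNN imputation: each missing value is replaced by the mean
    of that feature over the k rows closest on the features both rows
    have observed. `None` marks a missing value."""
    def dist(a, b):
        shared = [(x - y) ** 2 for x, y in zip(a, b)
                  if x is not None and y is not None]
        return (sum(shared) / len(shared)) ** 0.5 if shared else float("inf")

    out = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, v in enumerate(row):
            if v is None:
                donors = sorted(
                    (dist(row, other), other[j])
                    for other in rows
                    if other is not row and other[j] is not None
                )[:k]
                if donors:
                    out[i][j] = sum(val for _, val in donors) / len(donors)
    return out
```

Normalising the squared distance by the number of shared features keeps rows with many missing entries comparable to nearly complete ones.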
