Big Data and Cognitive Computing — Open Access Journal
Big Data and Cognitive Computing (ISSN 2504-2289) is an international, scientific, peer-reviewed, open access journal of big data and cognitive computing published quarterly online by MDPI.
- Open Access: free for readers, with free publication for well-prepared manuscripts submitted before 1 July 2019.
- Rapid publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 14.4 days after submission; acceptance to publication takes 5.1 days (median values for papers published in this journal in the second half of 2018).
- Recognition of reviewers: reviewers who provide timely, thorough peer-review reports receive vouchers entitling them to a discount on the APC of their next publication in any MDPI journal, in appreciation of the work done.
Latest Articles
Open Access Review
A Review of Facial Landmark Extraction in 2D Images and Videos Using Deep Learning
Big Data Cogn. Comput. 2019, 3(1), 14; https://doi.org/10.3390/bdcc3010014 - 13 February 2019
Abstract
The task of facial landmark extraction is fundamental in several applications which involve facial analysis, such as facial expression analysis, identity and face recognition, facial animation, and 3D face reconstruction. Taking into account the most recent advances resulting from deep-learning techniques, the performance of methods for facial landmark extraction has been substantially improved, even on in-the-wild datasets. Thus, this article presents an updated survey of facial landmark extraction in 2D images and video, focusing on methods that make use of deep-learning techniques. A comparative analysis of the performance of many approaches is provided, together with an analysis of common datasets, challenges, and future research directions.
Open Access Article
Usage of the Term Big Data in Biomedical Publications: A Text Mining Approach
Big Data Cogn. Comput. 2019, 3(1), 13; https://doi.org/10.3390/bdcc3010013 - 6 February 2019
Abstract
In this study, we attempt to assess the value of the term Big Data when used by researchers in their publications. For this purpose, we systematically collected a corpus of biomedical publications that use and do not use the term Big Data. These documents were used as input to a machine learning classifier to determine how well they can be separated into two groups and to determine the most distinguishing classification features. We generated 100 classifiers that could correctly distinguish between Big Data and non-Big Data documents with an area under the Receiver Operating Characteristic (ROC) curve of 0.96. The differences between the two groups were characterized by terms specific to Big Data themes—such as ‘computational’, ‘mining’, and ‘challenges’—and also by terms that indicate the research field, such as ‘genomics’. The ROC curves when plotted for various time intervals showed no difference over time. We conclude that there is a detectable and stable difference between publications that use the term Big Data and those that do not. Furthermore, the use of the term Big Data within a publication seems to indicate a distinct type of research in the biomedical field. Therefore, we conclude that value can be attributed to the term Big Data when used in a publication and this value has not changed over time.
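The classification pipeline described above can be sketched in a few lines. The corpus, labels, and single classifier below are invented stand-ins: the study trains 100 classifiers on a systematically collected biomedical corpus and reports the AUC on held-out documents.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# toy corpus (made-up abstracts); 1 = uses Big Data vocabulary, 0 = does not
docs = [
    "big data mining poses computational challenges in genomics",
    "scalable big data analytics and distributed mining pipelines",
    "computational challenges of mining large scale genomics data",
    "clinical trial of a new drug for hypertension patients",
    "randomized study of surgical outcomes in elderly patients",
    "gene expression measured by quantitative pcr in mice",
]
labels = [1, 1, 1, 0, 0, 0]

X = TfidfVectorizer().fit_transform(docs)      # bag-of-words features
clf = LogisticRegression().fit(X, labels)
auc = roc_auc_score(labels, clf.predict_proba(X)[:, 1])
```

The inspection of the largest-magnitude coefficients in `clf.coef_` is what surfaces the distinguishing terms such as ‘computational’ or ‘genomics’.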
Open Access Review
Big Data and Climate Change
Big Data Cogn. Comput. 2019, 3(1), 12; https://doi.org/10.3390/bdcc3010012 - 2 February 2019
Abstract
Climate science, as a data-intensive subject, has been overwhelmingly affected by the era of big data and the associated technological revolutions. The big successes of big data analytics in diverse areas over the past decade have also raised expectations for big data and its efficacy on the big problem: climate change. As an emerging topic, climate change has been at the forefront of big climate data analytics implementations, and exhaustive research has been carried out covering a variety of topics. This paper aims to present an outlook on big data in climate change studies over recent years by investigating and summarising the current status of big data applications in climate change related studies. It is also expected to serve as a one-stop reference directory for researchers and stakeholders, offering an overview of this trending subject at a glance, which can be useful in guiding future research and improvements in the exploitation of big climate data.
Open Access Article
A Domain-Oriented Analysis of the Impact of Machine Learning—The Case of Retailing
by Felix Weber and Reinhard Schütte
Big Data Cogn. Comput. 2019, 3(1), 11; https://doi.org/10.3390/bdcc3010011 - 24 January 2019
Abstract
Information technologies in general, and artificial intelligence (AI) in particular, try to shift operational tasks away from human actors. Machine learning (ML) is a discipline within AI that deals with learning improvement based on data. Consequently, retailing and wholesaling, which are known for their high proportion of human work and at the same time low profit margins, can be regarded as a natural fit for the application of AI and ML tools. This article examines the current prevalence of the use of machine learning in the industry. The paper uses two disparate approaches to identify the scientific and practical state of the art within the domain: a literature review of the major scientific databases and an empirical study of the 10 largest international retail companies and their adoption of ML technologies. This text does not present a prototype using machine learning techniques. Instead of a consideration and comparison of particular algorithms and approaches, the underlying problems and operational tasks that are elementary for the specific domain are identified. Based on a comprehensive literature review, the main problem types that ML can serve, and the associated ML techniques, are evaluated. An empirical study of the 10 largest retail companies and their ML adoption shows that practical market adoption is highly variable. The pioneers have extensively integrated applications into everyday business, while others show only a small set of early prototypes, and some show neither active use nor efforts to apply such technology. Following this, a structured approach is taken to analyze the value-adding core processes of retail companies. The current scientific and practical application scenarios and possibilities are illustrated in detail. In summary, there are numerous possible applications in all areas. In particular, in areas where future forecasts and predictions are needed (like marketing or replenishment), the use of ML today is both scientifically and practically highly developed.
Open Access Article
Modelling Early Word Acquisition through Multiplex Lexical Networks and Machine Learning
Big Data Cogn. Comput. 2019, 3(1), 10; https://doi.org/10.3390/bdcc3010010 - 24 January 2019
Abstract
Early language acquisition is a complex cognitive task. Recent data-informed approaches showed that children do not learn words uniformly at random but rather follow specific strategies based on the associative representation of words in the mental lexicon, a conceptual system enabling human cognitive computing. Building on this evidence, the current investigation introduces a combination of machine learning techniques, psycholinguistic features (i.e., frequency, length, polysemy and class) and multiplex lexical networks, representing the semantics and phonology of the mental lexicon, with the aim of predicting normative acquisition of 529 English words by toddlers between 22 and 26 months. Classifications using logistic regression and based on four psycholinguistic features achieve the best baseline cross-validated accuracy of 61.7% when half of the words have been acquired. Adding network information through multiplex closeness centrality enhances accuracy (up to 67.7%) more than adding multiplex neighbourhood density/degree (62.4%), multiplex PageRank versatility (63.0%), or the best single-layer network metric, i.e., free association degree (65.2%). Multiplex closeness operationalises the structural relevance of words for semantic and phonological information flow. These results indicate that the whole, global, multi-level flow of information and structure of the mental lexicon influence word acquisition more than single-layer or local network features of words when considered in conjunction with language norms. The highlighted synergy of multiplex lexical structure and psycholinguistic norms opens new ways for understanding human cognition and language processing through powerful and data-parsimonious cognitive computing approaches.
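The network side of this approach can be illustrated with a minimal closeness-centrality computation on one toy lexical layer. The words and edges below are made up; the paper aggregates such measures across the semantic and phonological layers of a multiplex network.

```python
from collections import deque

def closeness(graph, node):
    # BFS shortest-path distances from `node` within one layer
    dist = {node: 0}
    q = deque([node])
    while q:
        u = q.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    reachable = len(dist) - 1
    return reachable / sum(d for d in dist.values() if d) if reachable else 0.0

# hypothetical free-association layer: edges between related words
layer = {
    "dog": {"cat", "bone"},
    "cat": {"dog", "milk"},
    "bone": {"dog"},
    "milk": {"cat"},
}
scores = {w: closeness(layer, w) for w in layer}
```

A word that is structurally central ("dog" here) scores higher than a peripheral one ("milk"), which is the kind of signal the study feeds, together with psycholinguistic features, into the logistic regression classifier.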
Open Access Editorial
Acknowledgement to Reviewers of Big Data and Cognitive Computing in 2018
Big Data Cogn. Comput. 2019, 3(1), 9; https://doi.org/10.3390/bdcc3010009 - 21 January 2019
Abstract
Rigorous peer-review is the corner-stone of high-quality academic publishing [...]
Open Access Article
Fog Computing for Internet of Things (IoT)-Aided Smart Grid Architectures
Big Data Cogn. Comput. 2019, 3(1), 8; https://doi.org/10.3390/bdcc3010008 - 19 January 2019
Abstract
The fast-paced development of power systems necessitates the smart grid (SG) to facilitate real-time control and monitoring with bidirectional communication and electricity flows. In order to meet the computational requirements of SG applications, cloud computing (CC) provides flexible resources and services shared in the network, parallel processing, and omnipresent access. Even though the CC model is considered efficient for SG, it fails to guarantee the Quality-of-Experience (QoE) requirements of SG services, viz. latency, bandwidth, energy consumption, and network cost. Fog Computing (FC) extends CC by deploying localized computing and processing facilities at the edge of the network, offering location awareness, low latency, and latency-sensitive analytics for the mission-critical requirements of SG applications. By deploying localized computing facilities at the users' premises, it pre-stores cloud data and distributes them to SG users over fast local connections. In this paper, we first examine the current state of cloud-based SG architectures and highlight the motivations for adopting FC as a technology enabler for real-time SG analytics. We also present a three-layer FC-based SG architecture, characterizing its features towards integrating a massive number of Internet of Things (IoT) devices into the future SG. We then propose a cost optimization model for FC that jointly investigates data consumer association, workload distribution, virtual machine placement, and Quality-of-Service (QoS) constraints. The formulated model is a Mixed-Integer Nonlinear Programming (MINLP) problem, which is solved using a Modified Differential Evolution (MDE) algorithm. We evaluate the proposed framework on real-world parameters and show that, for a network with approximately 50% time-critical applications, the overall service latency for FC is nearly half that of the cloud paradigm. We also observe that FC lowers the aggregated power consumption of the generic CC model by more than 44%.
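As a rough illustration of the optimization step, the toy model below splits a workload between fog and cloud and minimizes a weighted latency-plus-power cost with SciPy's stock differential evolution. The coefficients are invented, and the authors' Modified DE solves a much richer MINLP with consumer association, VM placement, and QoS constraints.

```python
import numpy as np
from scipy.optimize import differential_evolution

def cost(x):
    fog_share = x[0]                                      # fraction served at the fog tier
    latency = 5.0 * fog_share + 40.0 * (1 - fog_share)    # ms, fog is closer to users
    power = 12.0 * fog_share + 6.0 * (1 - fog_share)      # W, fog nodes assumed less efficient
    return latency + 2.0 * power                          # weighted joint objective

res = differential_evolution(cost, bounds=[(0.0, 1.0)], seed=1, tol=1e-8)
```

With these made-up coefficients the latency saving dominates, so the optimizer pushes the full load to the fog tier; a different weighting would yield an interior split.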
Open Access Article
An Enhanced Inference Algorithm for Data Sampling Efficiency and Accuracy Using Periodic Beacons and Optimization
Big Data Cogn. Comput. 2019, 3(1), 7; https://doi.org/10.3390/bdcc3010007 - 16 January 2019
Abstract
Transferring data from a sensor or monitoring device in electronic health, vehicular informatics, or Internet of Things (IoT) networks has posed the enduring challenge of improving data accuracy with relative efficiency. Previous works have proposed the use of an inference system at the sensor device to minimize the data transfer frequency as well as the size of data, to save network usage and battery resources. This has been implemented using various sampling and inference algorithms, with a tradeoff between accuracy and efficiency. This paper proposes to enhance the accuracy without compromising efficiency by introducing new sampling algorithms through a hybrid inference method. The experimental results show that accuracy can be significantly improved while efficiency is not diminished. These algorithms will contribute to saving operation and maintenance costs in data sampling where computational and battery resources are constrained, such as in the wireless personal area networks that have emerged with IoT networks.
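One simple instance of sensor-side inference is a dead-band sampler, sketched below with made-up readings and tolerance. It is a stand-in for the general idea, not the paper's hybrid inference algorithms: the sensor transmits only when a reading drifts beyond a tolerance, and the sink infers the skipped samples by holding the last received value.

```python
def deadband_transmit(samples, tol):
    sent = []            # (index, value) pairs actually transmitted
    last = None
    for i, s in enumerate(samples):
        if last is None or abs(s - last) > tol:
            sent.append((i, s))
            last = s
    return sent

def reconstruct(sent, n):
    # sink-side inference: hold the last transmitted value
    out, last = [], None
    idx = dict(sent)
    for i in range(n):
        last = idx.get(i, last)
        out.append(last)
    return out

readings = [20.0, 20.1, 20.2, 21.5, 21.6, 23.0, 23.1]   # hypothetical sensor trace
sent = deadband_transmit(readings, tol=0.5)
approx = reconstruct(sent, len(readings))
```

Here only 3 of 7 samples are transmitted, yet the reconstruction error is bounded by the tolerance, which captures the accuracy/efficiency tradeoff the abstract describes.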
Open Access Article
The Next Generation Cognitive Security Operations Center: Adaptive Analytic Lambda Architecture for Efficient Defense against Adversarial Attacks
by Konstantinos Demertzis, Nikos Tziritas, Panayiotis Kikiras, Salvador Llopis Sanchez and Lazaros Iliadis
Big Data Cogn. Comput. 2019, 3(1), 6; https://doi.org/10.3390/bdcc3010006 - 10 January 2019
Abstract
A Security Operations Center (SOC) is a central technical-level unit responsible for monitoring, analyzing, assessing, and defending an organization’s security posture on an ongoing basis. The SOC staff works closely with incident response teams, security analysts, network engineers, and organization managers, using sophisticated data processing technologies such as security analytics, threat intelligence, and asset criticality to ensure security issues are detected, analyzed, and addressed quickly. Those techniques are part of a reactive security strategy because they rely on the human factor, experience, and the judgment of security experts, using supplementary technology to evaluate the risk impact and minimize the attack surface. This study suggests an active security strategy that adopts a vigorous method, including ingenuity, data analysis, processing, and decision-making support, to face various cyber hazards. Specifically, the paper introduces a novel intelligence-driven cognitive computing SOC that is based exclusively on progressive, fully automatic procedures. The proposed λ-Architecture Network Flow Forensics Framework (λ-ΝF3) is an efficient cybersecurity defense framework against adversarial attacks. It implements the Lambda machine learning architecture, which can analyze a mixture of batch and streaming data, using two accurate, novel computational intelligence algorithms. Specifically, it uses an Extreme Learning Machine neural network with a Gaussian Radial Basis Function kernel (ELM/GRBFk) for batch data analysis and a Self-Adjusting Memory k-Nearest Neighbors classifier (SAM/k-NN) to examine patterns from real-time streams. It is a forensics tool for big data that can enhance the automated defense strategies of SOCs to effectively respond to the threats their environments face.
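A reduced, kernel-ridge-style rendering of an ELM with a Gaussian RBF kernel is sketched below on invented flow features. It illustrates only the batch-analysis idea (regularized least squares on a kernel matrix) and is not the authors' ELM/GRBFk implementation.

```python
import numpy as np

def rbf(A, B, gamma=1.0):
    # Gaussian RBF kernel matrix between row sets A and B
    d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d)

def fit(X, y, C=100.0, gamma=1.0):
    # kernel-ELM-style output weights: (K + I/C)^-1 y
    K = rbf(X, X, gamma)
    return np.linalg.solve(K + np.eye(len(X)) / C, y)

def predict(X_train, beta, X_new, gamma=1.0):
    return rbf(X_new, X_train, gamma) @ beta

# hypothetical 2-D flow features; -1 = benign, +1 = malicious
X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.1]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
beta = fit(X, y)
preds = np.sign(predict(X, beta, X))
```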
Open Access Article
Beneficial Artificial Intelligence Coordination by Means of a Value Sensitive Design Approach
Big Data Cogn. Comput. 2019, 3(1), 5; https://doi.org/10.3390/bdcc3010005 - 6 January 2019
Abstract
This paper argues that the Value Sensitive Design (VSD) methodology provides a principled approach to embedding common values into AI systems both early and throughout the design process. To do so, it draws on an important case study: the evidence and final report of the UK Select Committee on Artificial Intelligence. This empirical investigation shows that the different and often disparate stakeholder groups implicated in AI design and use share some common values that can be used to further strengthen design coordination efforts. VSD is shown both to distill these common values and to provide a framework for stakeholder coordination.
Open Access Article
Two-Level Fault Diagnosis of SF6 Electrical Equipment Based on Big Data Analysis
Big Data Cogn. Comput. 2019, 3(1), 4; https://doi.org/10.3390/bdcc3010004 - 3 January 2019
Abstract
As the operating time of sulphur hexafluoride (SF6) electrical equipment increases, different degrees of discharge may occur inside the equipment. This degrades the insulation performance of the equipment and can cause serious damage to it. Therefore, it is of practical significance to diagnose faults and assess the state of SF6 electrical equipment. In recent years, the frequency of monitoring data acquisition for SF6 electrical equipment has continuously improved and the scope of collection has continuously expanded, so that massive data have accumulated in the substation database. In order to quickly process massive SF6 electrical equipment condition monitoring data, we built a two-level fault diagnosis model for SF6 electrical equipment on the Hadoop platform, and we use the MapReduce framework to parallelize the fault diagnosis algorithm, which further improves the speed of fault diagnosis for SF6 electrical equipment.
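The map/reduce structure of such a first-level diagnosis can be sketched in plain Python as a stand-in for Hadoop MapReduce; the gas thresholds and record format below are made up for illustration.

```python
from functools import reduce

def map_record(rec):
    # rec: (device_id, so2_ppm, h2s_ppm) -- hypothetical decomposition-gas readings
    _, so2, h2s = rec
    if so2 > 5.0 or h2s > 2.0:
        return ("discharge_fault", 1)
    return ("normal", 1)

def reduce_counts(acc, kv):
    # shuffle/reduce stage: aggregate counts per coarse fault label
    key, n = kv
    acc[key] = acc.get(key, 0) + n
    return acc

records = [("t1", 0.4, 0.1), ("t2", 7.2, 0.3), ("t3", 0.2, 2.5), ("t4", 1.0, 0.0)]
counts = reduce(reduce_counts, map(map_record, records), {})
```

In the paper's two-level design, records flagged at this coarse stage would be handed to a second, finer-grained diagnosis step.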
Open Access Review
Doppler Radar-Based Non-Contact Health Monitoring for Obstructive Sleep Apnea Diagnosis: A Comprehensive Review
Big Data Cogn. Comput. 2019, 3(1), 3; https://doi.org/10.3390/bdcc3010003 - 1 January 2019
Abstract
Today’s rapid growth of elderly populations and aging problems, coupled with the prevalence of obstructive sleep apnea (OSA) and other health-related issues, have affected many aspects of society. This has led to high demand for more robust healthcare monitoring, diagnosis, and treatment facilities. In sleep medicine in particular, sleep plays a key role in both physical and mental health. The quality and duration of sleep have a direct and significant impact on people’s learning, memory, metabolism, weight, safety, mood, cardiovascular health, diseases, and immune system function. The gold standard for OSA diagnosis is the overnight sleep monitoring system using polysomnography (PSG). However, despite the quality and reliability of the PSG system, it is not well suited for long-term continuous usage due to limited mobility, as well as the possible irritation, distress, and discomfort caused to patients during the monitoring process. These limitations have led to stronger demand for non-contact sleep monitoring systems. The aim of this paper is to provide a comprehensive review of the current state of non-contact Doppler radar sleep monitoring technology, outline current challenges, and make recommendations on future research directions to practically realize and commercialize the technology for everyday usage.
Open Access Article
Towards AI Welfare Science and Policies
by Soenke Ziesche and Roman Yampolskiy
Big Data Cogn. Comput. 2019, 3(1), 2; https://doi.org/10.3390/bdcc3010002 - 27 December 2018
Abstract
In light of fast progress in the field of AI there is an urgent demand for AI policies. Bostrom et al. provide “a set of policy desiderata”, out of which this article attempts to contribute to the “interests of digital minds”. The focus is on two interests of potentially sentient digital minds: to avoid suffering and to have the freedom of choice about their deletion. Various challenges are considered, including the vast range of potential features of digital minds, the difficulties in assessing the interests and wellbeing of sentient digital minds, and the skepticism that such research may encounter. Prolegomena to abolish suffering of sentient digital minds as well as to measure and specify wellbeing of sentient digital minds are outlined by means of the new field of AI welfare science, which is derived from animal welfare science. The establishment of AI welfare science serves as a prerequisite for the formulation of AI welfare policies, which regulate the wellbeing of sentient digital minds. This article aims to contribute to sentiocentrism through inclusion, thus to policies for antispeciesism, as well as to AI safety, for which wellbeing of AIs would be a cornerstone.
Open Access Article
Comparative Study between Big Data Analysis Techniques in Intrusion Detection
by Mounir Hafsa and Farah Jemili
Big Data Cogn. Comput. 2019, 3(1), 1; https://doi.org/10.3390/bdcc3010001 - 20 December 2018
Abstract
Cybersecurity Ventures expects that cyber-attack damage costs will rise to $11.5 billion in 2019 and that a business will fall victim to a cyber-attack every 14 seconds. Notice that the time frame for such an event is seconds. With petabytes of data generated each day, this is a challenging task for traditional intrusion detection systems (IDSs). Protecting sensitive information is a major concern for both businesses and governments, so a real-time, large-scale, and effective IDS is a must. In this work, we present a cloud-based, fault-tolerant, scalable, and distributed IDS that uses Apache Spark Structured Streaming and its Machine Learning library (MLlib) to detect intrusions in real time. To demonstrate the efficacy and effectiveness of this system, we implement it within the Microsoft Azure Cloud, as it provides both processing power and storage capabilities. A decision tree algorithm is used to predict the nature of incoming data. For this task, the MAWILab dataset is used as a data source to give better insights into the system's capabilities against cyber-attacks. The experimental results showed 99.95% accuracy, and more than 55,175 events per second were processed by the proposed system on a small cluster.
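A local stand-in for the classification step might look as follows, using scikit-learn's decision tree on invented flow features (duration, packet count, bytes). The paper itself trains Spark MLlib's decision tree on MAWILab traffic inside Structured Streaming, which this sketch omits.

```python
from sklearn.tree import DecisionTreeClassifier

# hypothetical flows: [duration_s, packets, bytes]; 0 = normal, 1 = attack
X = [[0.1, 3, 180], [0.2, 4, 220], [9.0, 900, 64000], [8.5, 850, 61000]]
y = [0, 0, 1, 1]

clf = DecisionTreeClassifier(random_state=0).fit(X, y)
pred = clf.predict([[0.15, 5, 200], [9.2, 870, 63000]])
```

In the streaming deployment, each micro-batch of parsed flow records would be scored by the fitted tree in the same way.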
Open Access Article
Unscented Kalman Filter Based on Spectrum Sensing in a Cognitive Radio Network Using an Adaptive Fuzzy System
by Md Ruhul Amin, Md Mahbubur Rahman, Mohammad Amazad Hossain, Md Khairul Islam, Kazi Mowdud Ahmed, Bikash Chandra Singh and Md Sipon Miah
Big Data Cogn. Comput. 2018, 2(4), 39; https://doi.org/10.3390/bdcc2040039 - 17 December 2018
Abstract
In this paper, we propose an unscented Kalman filter (UKF)-based cooperative spectrum sensing (CSS) scheme for a cognitive radio network (CRN) using an adaptive fuzzy system. In this scheme, the UKF is first applied to the nonlinear system to minimize the mean square estimation error; secondly, an adaptive fuzzy logic rule based on an inference engine estimates the local decisions to detect a licensed primary user (PU) at the fusion center (FC). The FC then makes a global decision using a defuzzification procedure based on the proposed algorithm. Simulation results show that the proposed scheme achieves better detection gain than conventional schemes such as the equal gain combining (EGC)-based soft fusion rule and the Kalman filter (KF)-based soft fusion rule under all tested conditions. Moreover, the proposed scheme achieves the lowest global probability of error compared to both the conventional EGC and KF schemes.
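The core of the UKF is the unscented transform, which propagates a mean and covariance through a nonlinear function via sigma points. A minimal sketch in standard textbook form (not the authors' code) is:

```python
import numpy as np

def unscented_transform(mean, cov, f, alpha=1.0, beta=2.0, kappa=0.0):
    """Propagate (mean, cov) through a nonlinear function f via sigma points."""
    n = mean.size
    lam = alpha**2 * (n + kappa) - n
    S = np.linalg.cholesky((n + lam) * cov)          # sigma-point spread
    pts = [mean] + [mean + S[:, i] for i in range(n)] + [mean - S[:, i] for i in range(n)]
    Wm = np.full(2 * n + 1, 1.0 / (2 * (n + lam)))   # mean weights
    Wc = Wm.copy()                                   # covariance weights
    Wm[0] = lam / (n + lam)
    Wc[0] = Wm[0] + (1.0 - alpha**2 + beta)
    Y = np.array([f(p) for p in pts])
    y_mean = Wm @ Y
    d = Y - y_mean
    y_cov = (Wc[:, None] * d).T @ d
    return y_mean, y_cov

# sanity check: a linear map is propagated exactly
m, P = unscented_transform(np.array([1.0, 2.0]), np.eye(2), lambda x: 2 * x)
```

In a full UKF, this transform is applied in both the prediction and update steps; the scheme above then fuses the per-user estimates with fuzzy rules at the FC.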
Open Access Article
Prototype of Mobile Device to Contribute to Urban Mobility of Visually Impaired People
Big Data Cogn. Comput. 2018, 2(4), 38; https://doi.org/10.3390/bdcc2040038 - 4 December 2018
Abstract
Visually impaired people (VIP) lack aids that facilitate their urban mobility, mainly due to obstacles encountered on their routes. This paper describes the design of AudioMaps, a prototype of cartographic technology for mobile devices. AudioMaps was designed to register the descriptions and locations of points of interest. When a point is registered, the prototype inserts a georeferenced landmark on the screen (based on Google Maps). Then, if the AudioMaps position is close to (15 or 5 m from) a previously registered point, it announces by audio the remaining distance and a description. For evaluation, a test area located in Monte Carmelo, Brazil, was selected, and the light poles, street corners (with the names of the intersecting streets), and crosswalks were registered in AudioMaps. A manually produced tactile model was used to give a first mental image of the area to four sighted people and four VIP, who completed a navigation task in the test area. The results indicate that both the tactile model and the audiovisual prototype can be used by both groups of participants. Above all, the prototype proved to be a viable and promising option for decision-making and spatial orientation in urban environments. New ways of presenting data to VIP and sighted people are described.
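The proximity rule described above can be sketched with a haversine great-circle distance check; the coordinates and message strings below are hypothetical, not taken from the prototype.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    R = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * R * math.asin(math.sqrt(a))

def announce(user, landmark, desc):
    # two-tier audio cue: within 5 m, then within 15 m, else silent
    d = haversine_m(*user, *landmark)
    if d <= 5:
        return f"{desc}: {d:.0f} m ahead"
    if d <= 15:
        return f"approaching {desc}, {d:.0f} m"
    return None

pole = (-18.7257, -47.4985)   # made-up registered light pole near Monte Carmelo
msg = announce((-18.72575, -47.4985), pole, "light pole")
```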
Open Access Article
Leveraging Image Representation of Network Traffic Data and Transfer Learning in Botnet Detection
Big Data Cogn. Comput. 2018, 2(4), 37; https://doi.org/10.3390/bdcc2040037 - 27 November 2018
Abstract
Advancements in the Internet have enabled connecting more devices to it every day, and the emergence of the Internet of Things has accelerated this growth. The lack of security in the IoT world makes these devices hot targets for cybercriminals' malicious actions. One such action is the botnet attack, one of the main destructive threats, which has been evolving since 2003 into different forms and is a serious threat to the security and privacy of information. Its scalability, structure, strength, and strategy are also under successive development, and it has survived for decades. A bot is defined as a software application that executes a number of automated tasks (simple but structurally repetitive) over the Internet. Several bots make a botnet, which infects a number of devices and communicates with its controller, called the botmaster, to get instructions. A botnet executes tasks at a rate that would be impossible for a human being. Nowadays, the activities of bots are concealed among normal web flows and occupy more than half of all web traffic. The largest use of bots is in web spidering (web crawling), in which an automated script fetches, analyzes, and files information from web servers. They also contribute to other attacks, such as distributed denial of service (DDoS), spam, identity theft, phishing, and espionage. A number of botnet detection techniques have been proposed, such as honeynet-based and Intrusion Detection System (IDS)-based techniques. These techniques are no longer effective due to the constant updating of bots and their evasion mechanisms. Recently, botnet detection techniques based upon machine/deep learning have been proposed that are more capable than their previously mentioned counterparts. In this work, we propose a deep learning-based engine for botnet detection to be utilized in IoT and wearable devices. In this system, the normal and botnet network traffic data are transformed into images before being fed into a deep convolutional neural network, named DenseNet, with and without transfer learning. The system is implemented in the Python programming language, and the CTU-13 dataset is used for evaluation in one study. According to our simulation results, using transfer learning can improve the accuracy from 33.41% up to 99.98%. In addition, two other classifiers, Support Vector Machine (SVM) and logistic regression, have been used; they showed accuracies of 83.15% and 78.56%, respectively. In another study, we evaluate our system on an in-house live normal dataset and a botnet-only dataset; the system likewise performed very well in data classification. To examine the capability of our system for real-time applications, we measured the system's training and testing times. According to our examination, it takes 0.004868 milliseconds to process each packet from the network traffic data during testing.
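The image-representation step can be sketched as follows; this is our own minimal version with an assumed 28×28 size, not the authors' preprocessing: pad or truncate a flow's raw bytes to a fixed length and reshape them into a normalized grayscale image for a CNN such as DenseNet.

```python
import numpy as np

def flow_to_image(payload: bytes, side: int = 28) -> np.ndarray:
    # fix the length: truncate long flows, zero-pad short ones
    buf = payload[: side * side].ljust(side * side, b"\x00")
    img = np.frombuffer(buf, dtype=np.uint8).reshape(side, side)
    return img.astype(np.float32) / 255.0   # normalize pixel values for the network

img = flow_to_image(b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n")
```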
Open Access Article
A Model Free Control Based on Machine Learning for Energy Converters in an Array
by Simon Thomas, Marianna Giassi, Mikael Eriksson, Malin Göteman, Jan Isberg, Edward Ransley, Martyn Hann and Jens Engström
Big Data Cogn. Comput. 2018, 2(4), 36; https://doi.org/10.3390/bdcc2040036 - 22 November 2018
Abstract
This paper introduces a machine learning-based control strategy for energy converter arrays designed to work under realistic conditions where the optimal control parameter cannot be obtained analytically. The control strategy neither relies on a mathematical model nor needs a priori information about the energy medium. Several identical energy converters are therefore arranged so that they are affected simultaneously by the energy medium. Each device uses a different control strategy, of which at least one has to be the machine learning approach presented in this paper. During operation, all energy converters record the absorbed power and control output; the machine learning device receives the data from the converter with the highest power absorption and thus learns the best-performing control strategy for each situation. Consequently, the array as a whole performs better than any individual strategy would. This concept is evaluated for wave energy converters (WECs) with numerical simulations and with experiments on physical scale models in a wave tank. In the first of two numerical simulations, the learnable WEC works in an array with four WECs applying constant damping factors. In the second simulation, two learnable WECs learn from each other. In the first test, the learnable WEC was able to absorb as much as the best constant-damping WEC, while in the second it absorbed even slightly more. During the physical model test, the ANN showed its ability to select the better of two possible damping coefficients based on real-world input data.
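The learn-from-the-best-neighbour idea can be sketched as a toy simulation. The power model, the damping values, and the rule that the learner simply copies the current best performer's damping are all illustrative assumptions for this example, not the paper's actual ANN controller:

```python
import random

def absorbed_power(damping: float, wave_amplitude: float) -> float:
    # Toy power model (assumption): absorption peaks at an
    # amplitude-dependent optimal damping, so no single constant
    # damping is best in every sea state.
    optimal = 1.0 + wave_amplitude
    return wave_amplitude / (1.0 + (damping - optimal) ** 2)

fixed_dampings = [0.5, 1.0, 1.5, 2.0]   # four constant-damping WECs
learner_damping = 0.5                   # the "learnable" WEC

random.seed(0)
for _ in range(200):
    amp = random.uniform(0.2, 1.2)      # shared sea state for the array
    candidates = fixed_dampings + [learner_damping]
    powers = [absorbed_power(d, amp) for d in candidates]
    # The learner adopts the damping of the best-performing converter
    # in the array, mimicking the highest power absorber.
    learner_damping = candidates[powers.index(max(powers))]
```

A real implementation would learn a mapping from sea-state measurements to control parameters rather than copying a single value, but the imitation loop above captures why the array cannot perform worse than its best fixed strategy for long.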
Open Access Article
The Next Generation Cognitive Security Operations Center: Network Flow Forensics Using Cybersecurity Intelligence
by Konstantinos Demertzis, Panayiotis Kikiras, Nikos Tziritas, Salvador Llopis Sanchez and Lazaros Iliadis
Big Data Cogn. Comput. 2018, 2(4), 35; https://doi.org/10.3390/bdcc2040035 - 22 November 2018
Abstract
A Security Operations Center (SOC) can be defined as an organized and highly skilled team that uses advanced computer forensics tools to prevent, detect, and respond to the cybersecurity incidents of an organization. The fundamental aspects of an effective SOC are the ability to examine and analyze vast volumes of data flows and to correlate several other types of events from a cybersecurity perspective. The supervision and categorization of network flows is an essential process, not only for the scheduling, management, and regulation of the network's services, but also for attack identification and the consequent forensic investigations. A serious disadvantage of the traditional software solutions used today for network monitoring, and specifically for effective categorization of encrypted or obfuscated network flows, which requires rebuilding message packets of sophisticated underlying protocols, is their heavy demand for computational resources. An additional significant shortcoming of these software packages is that they produce high false-positive rates because they lack accurate prediction mechanisms. For these reasons, in most cases, traditional software fails completely to recognize unidentified vulnerabilities and zero-day exploitations. This paper proposes a novel intelligence-driven Network Flow Forensics Framework (NF3), which requires little computing power and few resources, for the Next Generation Cognitive Computing SOC (NGC2SOC), which relies solely on advanced, fully automated intelligence methods. It is an effective and accurate Ensemble Machine Learning forensics tool for network traffic analysis, demystification of malware traffic, and encrypted traffic identification.
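As a minimal illustration of ensemble classification of network flows, the hard-voting sketch below combines three hypothetical rule-based detectors. The feature names, thresholds, and base rules are invented for the example and are not the NF3 framework's actual classifiers:

```python
from collections import Counter

def vote(classifiers, flow):
    """Hard-voting ensemble: each base classifier labels the flow,
    and the majority label wins."""
    labels = [clf(flow) for clf in classifiers]
    return Counter(labels).most_common(1)[0][0]

# Hypothetical rule-based base classifiers over toy flow features.
def by_port(flow):
    # Ports historically associated with IRC C&C / backdoors (example).
    return "malicious" if flow["dst_port"] in {4444, 6667} else "benign"

def by_volume(flow):
    return "malicious" if flow["bytes"] > 1_000_000 else "benign"

def by_entropy(flow):
    # Very high payload entropy can indicate encrypted/obfuscated traffic.
    return "malicious" if flow["payload_entropy"] > 7.5 else "benign"

flow = {"dst_port": 4444, "bytes": 2_000_000, "payload_entropy": 6.0}
label = vote([by_port, by_volume, by_entropy], flow)
```

In an ensemble-learning framework the base classifiers would be trained models rather than fixed rules, but the voting step that trades individual errors for collective accuracy works the same way.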
Open Access Article
Big-Crypto: Big Data, Blockchain and Cryptocurrency
Big Data Cogn. Comput. 2018, 2(4), 34; https://doi.org/10.3390/bdcc2040034 - 19 October 2018
Abstract
Cryptocurrency has been a trending topic over the past decade, pooling tremendous technological power and attracting investments valued at trillions of dollars on a global scale. The cryptocurrency technology and its network have been endowed with many superior features due to their unique architecture, which also determines their worldwide efficiency, applicability, and data-intensive characteristics. This paper introduces and summarises the interactions between two significant concepts in the digitalized world: cryptocurrency and Big Data. Both subjects are at the forefront of technological research, and this paper focuses on their convergence, comprehensively reviewing the most recent applications and developments after 2016. Accordingly, we aim to present a systematic review of the interactions between Big Data and cryptocurrency and to serve as a one-stop reference directory for researchers with regard to identifying research gaps and directing future explorations.

Special Issues
Special Issue in BDCC: Big-Data Driven Multi-Criteria Decision-Making
Guest Editors: AMM Sharif Ullah, Md. Noor-E-Alam
Deadline: 28 February 2019
Special Issue in BDCC: Machine Learning and Data Analytics for Communication Networks in the 5G Era
Guest Editors: Francesco Musumeci, José Alberto Hernández, Ignacio De Miguel
Deadline: 31 March 2019
Special Issue in BDCC: Computational Models of Cognition and Learning
Guest Editors: George D. Magoulas, Maitrei Kohli, Michael Thomas
Deadline: 30 April 2019
Special Issue in BDCC: Artificial Superintelligence: Coordination & Strategy
Guest Editors: Roman Yampolskiy, Allison Duettmann
Deadline: 31 May 2019
Big Data Cogn. Comput.
EISSN 2504-2289
Published by MDPI AG, Basel, Switzerland