Table of Contents

Big Data Cogn. Comput., Volume 3, Issue 3 (September 2019)

  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive table of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view the papers in PDF format, click on the "PDF Full-text" link, and use the free Adobe Reader to open them.
Cover Story: Evaluating the performance of big data applications is required to efficiently size capacities. [...]
Open Access Article
Emotional Decision-Making Biases Prediction in Cyber-Physical Systems
Big Data Cogn. Comput. 2019, 3(3), 49; https://doi.org/10.3390/bdcc3030049 - 30 Aug 2019
Viewed by 243
Abstract
This article addresses the challenge of discovering trends in decision-making by capturing emotional data and the influence of possible external stimuli. We conducted an experiment with a significant sample of the workforce and used machine-learning techniques to model the decision-making process. We studied the trends introduced by the emotional status and the external stimuli that make these personnel act or report to the supervisor. The main result of this study is a model capable of predicting the bias to act in a specific context. We studied the relationship between emotions and the probability of acting or correcting the system. The main interest of these findings is the ability to influence personnel in advance to make their work more efficient and productive. This would be a whole new line of research for the future.

Open Access Article
Optimal Number of Choices in Rating Contexts
Big Data Cogn. Comput. 2019, 3(3), 48; https://doi.org/10.3390/bdcc3030048 - 27 Aug 2019
Viewed by 246
Abstract
In many settings, people must give numerical scores to entities from a small discrete set, for instance, rating physical attractiveness from 1–5 on dating sites, or papers from 1–10 for conference reviewing. We study the problem of understanding when using a different number of options is optimal. We consider the cases in which scores are uniformly random and Gaussian. We study computationally when using 2, 3, 4, 5, and 10 options out of a total of 100 is optimal in these models (though our theoretical analysis is for a more general setting with k choices from n total options as well as a continuous underlying space). One may expect that using more options would always improve performance in this model, but we show that this is not necessarily the case, and that using fewer choices (even just two) can surprisingly be optimal in certain situations. While in theory for this setting it would be optimal to use all 100 options, in practice, this is prohibitive, and it is preferable to utilize a smaller number of options due to humans’ limited computational resources. Our results could have many potential applications, as settings requiring entities to be ranked by humans are ubiquitous. There could also be applications to other fields such as signal or image processing where input values from a large set must be mapped to output values in a smaller set.
(This article belongs to the Special Issue Computational Models of Cognition and Learning)
Open Access Article
PerTract: Model Extraction and Specification of Big Data Systems for Performance Prediction by the Example of Apache Spark and Hadoop
Big Data Cogn. Comput. 2019, 3(3), 47; https://doi.org/10.3390/bdcc3030047 - 09 Aug 2019
Viewed by 429
Abstract
Evaluating and predicting the performance of big data applications are required to efficiently size capacities and manage operations. Gaining profound insights into the system architecture, dependencies of components, resource demands, and configurations causes difficulties for engineers. To address these challenges, this paper presents an approach to automatically extract and transform system specifications to predict the performance of applications. It consists of three components. First, a system- and tool-agnostic domain-specific language (DSL) allows the modeling of performance-relevant factors of big data applications, computing resources, and data workload. Second, DSL instances are automatically extracted from monitored measurements of Apache Spark and Apache Hadoop (i.e., YARN and HDFS) systems. Third, these instances are transformed to model- and simulation-based performance evaluation tools to allow predictions. By adapting DSL instances, our approach enables engineers to predict the performance of applications for different scenarios such as changing data input and resources. We evaluate our approach by predicting the performance of linear regression and random forest applications of the HiBench benchmark suite. Simulation results of adjusted DSL instances compared to measurement results show accurate predictions, with errors below 15% based upon averages for response times and resource utilization.

Open Access Article
Future-Ready Strategic Oversight of Multiple Artificial Superintelligence-Enabled Adaptive Learning Systems via Human-Centric Explainable AI-Empowered Predictive Optimizations of Educational Outcomes
Big Data Cogn. Comput. 2019, 3(3), 46; https://doi.org/10.3390/bdcc3030046 - 31 Jul 2019
Viewed by 476
Abstract
Artificial intelligence-enabled adaptive learning systems (AI-ALS) have been increasingly utilized in education. Schools are usually afforded the freedom to deploy the AI-ALS that they prefer. However, even before artificial intelligence autonomously develops into artificial superintelligence in the future, it would be remiss to entirely leave the students to the AI-ALS without any independent oversight of the potential issues. For example, if the students score well in formative assessments within the AI-ALS but subsequently perform badly in paper-based post-tests, or if the relentless algorithm of a particular AI-ALS is suspected of causing undue stress for the students, these issues should be addressed by educational stakeholders. Policy makers and educational stakeholders should collaborate to analyze the data from multiple AI-ALS deployed in different schools to achieve strategic oversight. The current paper provides exemplars to illustrate how this future-ready strategic oversight could be implemented using artificial intelligence-based Bayesian network software to analyze the data from five dissimilar AI-ALS, each deployed in a different school. Besides using descriptive analytics to reveal potential issues experienced by students within each AI-ALS, this human-centric AI-empowered approach also enables explainable predictive analytics of the students’ learning outcomes in paper-based summative assessments after training is completed in each AI-ALS.
(This article belongs to the Special Issue Artificial Superintelligence: Coordination & Strategy)

Open Access Article
Viability in Multiplex Lexical Networks and Machine Learning Characterizes Human Creativity
Big Data Cogn. Comput. 2019, 3(3), 45; https://doi.org/10.3390/bdcc3030045 - 31 Jul 2019
Viewed by 576
Abstract
Previous studies have shown how individual differences in creativity relate to differences in the structure of semantic memory. However, the latter is only one aspect of the whole mental lexicon, a repository of conceptual knowledge that is considered to simultaneously include multiple types of conceptual similarities. In the current study, we apply a multiplex network approach to compute a representation of the mental lexicon combining semantics and phonology and examine how it relates to individual differences in creativity. This multiplex combination of 150,000 phonological and semantic associations identifies a core of words in the mental lexicon known as the viable cluster, a kernel containing simpler to parse, more general, concrete words acquired early during language learning. We focus on low (N = 47) and high (N = 47) creative individuals’ performance in generating animal names during a semantic fluency task. We model this performance as the outcome of a mental navigation on the multiplex lexical network, going within, outside, and in-between the viable cluster. We find that low and high creative individuals differ substantially in their access to the viable cluster during the semantic fluency task. Higher creative individuals tend to access the viable cluster less frequently, with a lower uncertainty/entropy, reaching out to more peripheral words and covering longer multiplex network distances between concepts in comparison to lower creative individuals. We use these differences for constructing a machine learning classifier of creativity levels, which leads to an accuracy of 65.0 ± 0.9% and an area under the curve of 68.0 ± 0.8%, which are both higher than the random expectation of 50%. These results highlight the potential relevance of combining psycholinguistic measures with multiplex network models of the mental lexicon for modelling mental navigation and, consequently, classifying people automatically according to their creativity levels.

Open Access Article
Archetype-Based Modeling and Search of Social Media
Big Data Cogn. Comput. 2019, 3(3), 44; https://doi.org/10.3390/bdcc3030044 - 24 Jul 2019
Viewed by 494
Abstract
Existing keyword-based search techniques suffer from limitations owing to unknown, mismatched, and obscure vocabulary. These challenges are particularly prevalent in social media, where slang, jargon, and memetics are abundant. We develop a new technique, Archetype-Based Modeling and Search, that can mitigate these challenges as they are encountered in social media. This technique learns to identify new relevant documents based on a specified set of archetypes from which both vocabulary and relevance information are extracted. We present a case study on social media data from Reddit, using authors from /r/Opiates to characterize discourse around opioid use and to find additional relevant authors on this topic.

Open Access Article
RazorNet: Adversarial Training and Noise Training on a Deep Neural Network Fooled by a Shallow Neural Network
Big Data Cogn. Comput. 2019, 3(3), 43; https://doi.org/10.3390/bdcc3030043 - 23 Jul 2019
Viewed by 418
Abstract
In this work, we propose ShallowDeepNet, a novel system architecture that includes a shallow and a deep neural network. The shallow neural network has the duty of data preprocessing and generating adversarial samples. The deep neural network has the duty of understanding data and information as well as detecting adversarial samples. The deep neural network gets its weights from transfer learning, adversarial training, and noise training. The system is examined on biometric (fingerprint and iris) and pharmaceutical (pill image) data. According to the simulation results, the system is capable of improving the detection accuracy on the biometric data from 1.31% to 80.65% when the adversarial data is used and to 93.4% when the adversarial data as well as the noisy data are given to the network. The system performance on the pill image data is increased from 34.55% to 96.03% and then to 98.2%, respectively. Training on different types of noise can help in detecting samples from unknown and unseen adversarial attacks. Meanwhile, the system training on the adversarial data as well as noisy data occurs only once. In fact, retraining the system may improve the performance further. Furthermore, training the system on new types of attacks and noise can help in enhancing the system performance.

Open Access Article
InfoFlow: A Distributed Algorithm to Detect Communities According to the Map Equation
Big Data Cogn. Comput. 2019, 3(3), 42; https://doi.org/10.3390/bdcc3030042 - 22 Jul 2019
Viewed by 373
Abstract
Formidably sized networks are becoming more and more common, including in social sciences, biology, neuroscience, and the technology space. Many network sizes are expected to challenge the storage capability of a single physical computer. Here, we take two approaches to handle big networks: first, we look at how big data technology and distributed computing offer an exciting approach to big data storage and processing. Second, most networks can be partitioned or labeled into communities, clusters, or modules, thus capturing the crux of the network while reducing detailed information, through the class of algorithms known as community detection. In this paper, we combine these two approaches, developing a distributed community detection algorithm to handle big networks. In particular, the map equation provides a way to identify network communities according to the information flow between nodes, where InfoMap is a greedy algorithm that uses the map equation. We develop discrete mathematics to adapt InfoMap into a distributed computing framework and then further develop the mathematics for a greedy algorithm, InfoFlow, which has logarithmic time complexity, compared to the linear complexity in InfoMap. Benchmark results on graphs of up to millions of nodes and hundreds of millions of edges confirm the time complexity improvement, while maintaining community accuracy. Thus, we develop a map equation-based community detection algorithm suitable for big network data processing.
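For orientation, the map equation mentioned in this abstract is commonly written in the two-level form below. The notation is the conventional one from the map equation literature and is an assumption here; the abstract itself does not give the formula, and the paper's distributed reformulation may use different symbols.

```latex
L(\mathsf{M}) = q_{\curvearrowright}\, H(\mathcal{Q})
  + \sum_{i=1}^{m} p_{\circlearrowright}^{i}\, H(\mathcal{P}^{i}),
\qquad
q_{\curvearrowright} = \sum_{i=1}^{m} q_{i\curvearrowright},
\qquad
p_{\circlearrowright}^{i} = q_{i\curvearrowright} + \sum_{\alpha \in i} p_{\alpha}
```

Here $q_{i\curvearrowright}$ is the per-step probability that a random walker exits module $i$, $p_{\alpha}$ is the stationary visit probability of node $\alpha$, $H(\mathcal{Q})$ is the entropy of the module-level codebook, and $H(\mathcal{P}^{i})$ is the entropy of the codebook for movements within module $i$. A partition $\mathsf{M}$ with a smaller $L(\mathsf{M})$ describes the random walk more compactly, which is the sense in which communities are identified from information flow.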

Open Access Article
Breast Cancer Diagnosis System Based on Semantic Analysis and Choquet Integral Feature Selection for High Risk Subjects
Big Data Cogn. Comput. 2019, 3(3), 41; https://doi.org/10.3390/bdcc3030041 - 12 Jul 2019
Viewed by 392
Abstract
In this work, we build a computer aided diagnosis (CAD) system of breast cancer for high risk patients considering the breast imaging reporting and data system (BIRADS), mapping main expert concepts and rules. Therefore, a bag of words is built based on the ontology of breast cancer analysis. For a more reliable characterization of the lesion, a feature selection based on the Choquet integral is applied, aiming at discarding the irrelevant descriptors. Then, a set of well-known machine learning tools is used for semantic annotation to fill the gap between low level knowledge and expert concepts involved in the BIRADS classification. Indeed, expert rules are implicitly modeled using a set of classifiers for severity diagnosis. As a result, the feature selection gives a better assessment of the lesion, and the semantic analysis context offers an attractive frame to include external factors and meta-knowledge, as well as exploiting more than one modality. Accordingly, our CAD system is intended for diagnosis of breast cancer for high risk patients. It has been validated on two complementary modalities, MRI and dual energy contrast enhancement mammography (DECEDM); the proposed system achieves a correct classification rate of 99%.
(This article belongs to the Special Issue Computational Models of Cognition and Learning)
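The abstract above names the Choquet integral but does not say how it is used for feature selection. As a minimal sketch of the integral itself only (not of the paper's selection procedure), the following Python computes the discrete Choquet integral of a set of scores with respect to a fuzzy measure. The function name, the argument names, and the example measures are hypothetical.

```python
from typing import Callable, Dict, FrozenSet


def choquet_integral(
    values: Dict[str, float],
    measure: Callable[[FrozenSet[str]], float],
) -> float:
    """Discrete Choquet integral of `values` with respect to `measure`.

    `values` maps each criterion name to a non-negative score.
    `measure` is assumed to be a fuzzy measure: it maps a set of criteria
    to a weight, with measure(all criteria) == 1 and monotone in the set.
    """
    # Sort criteria by ascending score: x_(1) <= x_(2) <= ... <= x_(n).
    ordered = sorted(values, key=values.get)
    total = 0.0
    previous = 0.0
    for i, name in enumerate(ordered):
        # A_(i) holds the criteria whose score is at least x_(i).
        coalition = frozenset(ordered[i:])
        total += (values[name] - previous) * measure(coalition)
        previous = values[name]
    return total


# Illustrative data (hypothetical, not from the paper).
scores = {"a": 0.2, "b": 0.5, "c": 0.9}
weights = {"a": 0.5, "b": 0.3, "c": 0.2}


def additive_measure(subset: FrozenSet[str]) -> float:
    # With an additive measure the integral reduces to a weighted mean.
    return sum(weights[name] for name in subset)


def min_measure(subset: FrozenSet[str]) -> float:
    # Weight 1 only for the full set: the integral reduces to the minimum.
    return 1.0 if len(subset) == len(scores) else 0.0
```

With an additive measure the integral equals the ordinary weighted sum, so the interest of the Choquet integral lies in non-additive measures, which can reward or penalize particular combinations of descriptors.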

Open Access Article
Safe Artificial General Intelligence via Distributed Ledger Technology
Big Data Cogn. Comput. 2019, 3(3), 40; https://doi.org/10.3390/bdcc3030040 - 08 Jul 2019
Viewed by 442
Abstract
Artificial general intelligence (AGI) progression metrics indicate AGI will occur within decades. No proof exists that AGI will benefit humans and not harm or eliminate humans. A set of logically distinct conceptual components is proposed that are necessary and sufficient to (1) ensure various AGI scenarios will not harm humanity, and (2) robustly align AGI and human values and goals. By systematically addressing pathways to malevolent AI, we can induce the methods/axioms required to redress them. Distributed ledger technology (DLT, “blockchain”) is integral to this proposal, e.g., “smart contracts” are necessary to address the evolution of AI that will be too fast for human monitoring and intervention. The proposed axioms: (1) Access to technology by market license. (2) Transparent ethics embodied in DLT. (3) Morality encrypted via DLT. (4) Behavior control structure with values at roots. (5) Individual bar-code identification of critical components. (6) Configuration Item (from business continuity/disaster recovery planning). (7) Identity verification secured via DLT. (8) “Smart” automated contracts based on DLT. (9) Decentralized applications: AI software modules encrypted via DLT. (10) Audit trail of component usage stored via DLT. (11) Social ostracism (denial of resources) augmented by DLT petitions. (12) Game theory and mechanism design.
(This article belongs to the Special Issue Artificial Superintelligence: Coordination & Strategy)

Open Access Article
An Item–Item Collaborative Filtering Recommender System Using Trust and Genre to Address the Cold-Start Problem
Big Data Cogn. Comput. 2019, 3(3), 39; https://doi.org/10.3390/bdcc3030039 - 08 Jul 2019
Viewed by 405
Abstract
Item-based collaborative filtering is one of the most popular techniques in recommender systems to retrieve useful items for the users by finding the correlation among the items. Traditional item-based collaborative filtering works well when there exists sufficient rating data but cannot calculate similarity for new items, known as the cold-start problem. Usually, for lack of rating data, the identification of the similarity among the cold-start items is difficult. As a result, existing techniques fail to predict accurate recommendations for cold-start items, which also affects the recommender system’s performance. In this paper, two item-based similarity measures have been designed to overcome this problem by incorporating items’ genre data. An item might be similar to other items as they might belong to more than one common genre. Thus, one of the similarity measures is defined by determining the degree of direct asymmetric correlation between items by considering their association of common genres. However, the similarity is determined between a pair of items where one of the items could be cold-start and the other could be any highly rated item. Thus, the proposed similarity measure is treated as asymmetric by taking the item’s rating data into consideration. Another similarity measure is defined as the relative interconnection between items based on transitive inference. In addition, an enhanced prediction algorithm has been proposed so that it can calculate a better prediction for the recommendation. The proposed approach has been evaluated on two popular datasets, Movielens and MovieTweets. It is found that the proposed technique performs better in comparison with the traditional techniques in a collaborative filtering recommender system. The proposed approach improved prediction accuracy for Movielens and MovieTweets by approximately 3.42% and 8.58% in mean absolute error, 7.25% and 3.29% in precision, 7.20% and 7.55% in recall, 8.76% and 5.15% in f-measure, and 49.3% and 16.49% in mean reciprocal rank, respectively.
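The abstract reports improvements in mean absolute error, precision, recall, f-measure, and mean reciprocal rank. As a reading aid, the following is a minimal Python sketch of the textbook definitions of these metrics. The function names and the toy inputs in the comments are hypothetical, and the paper's exact evaluation protocol (rating thresholds, list lengths, data splits) is not stated in the abstract.

```python
from typing import Iterable, List, Sequence, Set, Tuple


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between true and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def precision_recall_f1(
    recommended: Iterable[str], relevant: Iterable[str]
) -> Tuple[float, float, float]:
    """Precision, recall, and their harmonic mean for one recommendation list."""
    recommended_set, relevant_set = set(recommended), set(relevant)
    hits = len(recommended_set & relevant_set)
    precision = hits / len(recommended_set) if recommended_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


def mean_reciprocal_rank(
    ranked_lists: List[List[str]], relevant_sets: List[Set[str]]
) -> float:
    """Mean over users of 1 / rank of the first relevant item (0 if none)."""
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for rank, item in enumerate(ranked, start=1):
            if item in relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)
```

Note that a lower mean absolute error is better while higher values are better for the other four metrics, so an "improvement" in mean absolute error corresponds to a reduction.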

Open Access Article
Twitter Analyzer—How to Use Semantic Analysis to Retrieve an Atmospheric Image around Political Topics in Twitter
Big Data Cogn. Comput. 2019, 3(3), 38; https://doi.org/10.3390/bdcc3030038 - 06 Jul 2019
Viewed by 433
Abstract
Social media are heavily used to shape political discussions. Thus, it is valuable for corporations and political parties to be able to analyze the content of those discussions. This is exemplified by the work of Cambridge Analytica, in support of the 2016 presidential campaign of Donald Trump. One of the most straightforward metrics is the sentiment of a message, whether it is considered as positive or negative. There are many commercial and/or closed-source tools available which make it possible to analyze social media data, including sentiment analysis (SA). However, to our knowledge, not many publicly available tools have been developed that allow for analyzing social media data and help researchers around the world to enter this quickly expanding field of study. In this paper, we provide a thorough description of implementing a tool that can be used for performing sentiment analysis on tweets. In an effort to underline the necessity for open tools and additional monitoring on the Twittersphere, we propose an implementation model based exclusively on publicly available open-source software. The resulting tool is capable of downloading tweets in real time based on hashtags or account names and stores the sentiment for replies to specific tweets. It is therefore capable of measuring the average reaction to one tweet by a person or a hashtag, which can be represented with graphs. Finally, we tested our open-source tool within a case study based on a data set of Twitter accounts and hashtags referring to the Syrian war, covering a short time window of one week in the spring of 2018. The results show that while the high accuracy of commercial or other complicated tools may not be achieved, our proposed open-source tool makes it possible to get a good overview of the overall replies to specific tweets, as well as a practical perception of tweets related to specific hashtags, identifying them as positive or negative.

Open Access Article
Cooking Is Creating Emotion: A Study on Hinglish Sentiments of Youtube Cookery Channels Using Semi-Supervised Approach
Big Data Cogn. Comput. 2019, 3(3), 37; https://doi.org/10.3390/bdcc3030037 - 03 Jul 2019
Viewed by 427
Abstract
The success of Youtube has attracted a lot of users, which results in an increase in the number of comments present on Youtube channels. By analyzing those comments, we could provide insight to the Youtubers that would help them to deliver better quality. Youtube is very popular in India. A majority of the population in India speak and write a mixture of two languages known as Hinglish for casual communication on social media. Our study focuses on the sentiment analysis of Hinglish comments on cookery channels. The unsupervised learning technique DBSCAN was employed in our work to find the different patterns in the comments data. We have modelled and evaluated both parametric and non-parametric learning algorithms. Logistic regression with the term frequency vectorizer gave 74.01% accuracy on Nisha Madulika’s dataset and 75.37% accuracy on the Kabita’s Kitchen dataset. Each classifier is statistically tested in our study.

Open Access Article
Data-Driven Load Forecasting of Air Conditioners for Demand Response Using Levenberg–Marquardt Algorithm-Based ANN
Big Data Cogn. Comput. 2019, 3(3), 36; https://doi.org/10.3390/bdcc3030036 - 02 Jul 2019
Viewed by 476
Abstract
The impact of air conditioners (ACs) on overall electricity consumption in buildings is very high. Therefore, controlling AC power consumption is a significant factor for demand response. With advances in the implementation of demand side management techniques and the smart grid, precise AC load forecasting for electrical utilities and end-users is required. In this paper, big data analysis and its applications in power systems are introduced. After this, various load forecasting categories and various techniques applied for load forecasting in the context of big data analysis in power systems are explored. Then, a Levenberg–Marquardt Algorithm (LMA)-based Artificial Neural Network (ANN) for residential AC short-term load forecasting is presented. This forecasting approach utilizes past hourly temperature observations and AC load as input variables for assessment. Different performance assessment indices have also been investigated. Error formulations have shown that the LMA-based ANN presents better results in comparison to Scaled Conjugate Gradient (SCG) and statistical regression approaches. Furthermore, information on AC load is obtainable for different time horizons, such as hourly, weekly, and monthly bases, due to the better prediction accuracy of the LMA-based ANN, which is helpful for efficient demand response (DR) implementation.
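For readers unfamiliar with the Levenberg–Marquardt algorithm named in this abstract, its weight update for a network with error vector $\mathbf{e}$ is usually stated as below. This is the general textbook form, given here as an assumption about the variant used; the abstract does not specify the paper's damping schedule or network configuration.

```latex
\Delta \mathbf{w} = -\left(\mathbf{J}^{\top}\mathbf{J} + \mu \mathbf{I}\right)^{-1} \mathbf{J}^{\top}\mathbf{e},
\qquad
\mathbf{w}_{k+1} = \mathbf{w}_{k} + \Delta \mathbf{w},
\qquad
\mathbf{J} = \frac{\partial \mathbf{e}}{\partial \mathbf{w}}
```

The damping factor $\mu > 0$ interpolates between two regimes: as $\mu \to 0$ the step approaches the Gauss–Newton step, and for large $\mu$ it approaches a short gradient-descent step of size $1/\mu$. In common implementations $\mu$ is decreased after a step that reduces the error and increased otherwise.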

Open Access Article
A Holistic Framework for Forecasting Transformative AI
Big Data Cogn. Comput. 2019, 3(3), 35; https://doi.org/10.3390/bdcc3030035 - 26 Jun 2019
Viewed by 464
Abstract
In this paper, we describe a holistic AI forecasting framework which draws on a broad body of literature from disciplines such as forecasting, technological forecasting, futures studies and scenario planning. A review of this literature leads us to propose a new class of scenario planning techniques that we call scenario mapping techniques. These techniques include scenario network mapping, cognitive maps and fuzzy cognitive maps, as well as a new method that we refer to as judgmental distillation mapping. This proposed technique is based on scenario mapping and judgmental forecasting techniques, and is intended to integrate a wide variety of forecasts into a technological map with probabilistic timelines. Judgmental distillation mapping is the centerpiece of the holistic forecasting framework, in which it is used to inform a strategic planning process as well as future iterations of the forecasting process. Together, the framework and new technique form a holistic rethinking of how we forecast AI. We also include a discussion of the strengths and weaknesses of the framework, its implications for practice and its implications for research priorities for AI forecasting researchers.
(This article belongs to the Special Issue Artificial Superintelligence: Coordination & Strategy)