Big Data and Cognitive Computing

17 pages, 1862 KB

Open Access | Article

Clustering Algorithm to Measure Student Assessment Accuracy: A Double Study

by Sónia Rolland Sobral and Catarina Félix de Oliveira

Big Data Cogn. Comput. 2021, 5(4), 81; https://doi.org/10.3390/bdcc5040081 - 18 Dec 2021

Cited by 3 | Viewed by 4734

Abstract

Self-assessment is one of the strategies used in active teaching to engage students in the entire learning process, in the form of self-regulated academic learning. This study aims to assess the possibility of including self-evaluation in the student’s final grade, not just as a self-assessment that allows students to predict the grade obtained but also as something that counts toward the final grade. Two different curricular units are used, both from the first year of undergraduate study, one from the international relations course (N = 29) and the other from the computer science and computer engineering courses (N = 50). Students were asked to self-assess at each of the two evaluation moments of each unit, after submitting their work/test and after knowing the correct answers. This study uses statistical analysis as well as a clustering algorithm (K-means) to gain deeper knowledge of, and visual insight into, the data and the patterns within them. No differences between the obtained grade and the thought grade (the grade students predicted for themselves) were found across gender or age, but a direct correlation was found between the average thought grade and the grade level. The difference is less pronounced at the second evaluation moment, which suggests that self-assessment skill improves from the first to the second evaluation moment.

(This article belongs to the Special Issue Educational Data Mining and Technology)
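
The K-means step mentioned in the abstract can be illustrated with a minimal sketch. The grade pairs, the 0-20 scale, the choice of two clusters, and the starting centroids below are all assumptions made for illustration; the abstract does not state how the authors configured the algorithm.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd's algorithm for 2D points with caller-supplied starting centroids."""
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids

# Hypothetical (obtained grade, thought grade) pairs on a 0-20 scale.
grades = [(8.0, 12.0), (9.0, 13.0), (10.0, 13.0), (16.0, 16.0), (17.0, 16.0), (18.0, 17.0)]
labels, centroids = kmeans(grades, centroids=[grades[0], grades[-1]])
```

In this toy data the lower-scoring group overestimates its grade and the higher-scoring group does not, so the two clusters separate along both axes. Seeding the centroids explicitly keeps the run deterministic; a library implementation would normally choose them at random.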

25 pages, 2399 KB

Open Access | Feature Paper | Review

Semantic Trajectory Analytics and Recommender Systems in Cultural Spaces

by Sotiris Angelis, Konstantinos Kotis and Dimitris Spiliotopoulos

Big Data Cogn. Comput. 2021, 5(4), 80; https://doi.org/10.3390/bdcc5040080 - 16 Dec 2021

Cited by 17 | Viewed by 7010

Abstract

Semantic trajectory analytics and personalised recommender systems that enhance user experience are research topics attracting increasing attention. Semantic trajectories can efficiently model human movement for further analysis and pattern recognition, while personalised recommender systems can adapt to constantly changing user needs and provide meaningful and optimised suggestions. This paper investigates open issues and challenges at the intersection of these two topics, emphasising semantic technologies and machine learning techniques. The goal of this paper is twofold: (a) to critically review related work on semantic trajectories and knowledge-based interactive recommender systems, and (b) to propose a high-level framework by describing its requirements. The paper presents a system architecture design for recognising semantic trajectory patterns and for inferring possible syntheses of visitor trajectories in cultural spaces, such as museums, making suggestions for new trajectories that optimise cultural experiences.

(This article belongs to the Special Issue Semantic Web Technology and Recommender Systems)

16 pages, 6769 KB

Open Access | Feature Paper | Article

Spatial Sound in a 3D Virtual Environment: All Bark and No Bite?

by Radha Nila Meghanathan, Patrick Ruediger-Flore, Felix Hekele, Jan Spilski, Achim Ebert and Thomas Lachmann

Big Data Cogn. Comput. 2021, 5(4), 79; https://doi.org/10.3390/bdcc5040079 - 13 Dec 2021

Cited by 13 | Viewed by 6816

Abstract

Although the focus of Virtual Reality (VR) lies predominantly on the visual world, acoustic components enhance the functionality of a 3D environment. To study the interaction between visual and auditory modalities in a 3D environment, we investigated the effect of auditory cues on visual searches in 3D virtual environments with both visual and auditory noise. In an experiment, we asked participants to detect visual targets in a 360° video in conditions with and without environmental noise. Auditory cues indicating the target location were either absent or presented as simple stereo or binaural audio, both of which assisted sound localization. To investigate the efficacy of these cues in distracting environments, we measured participant performance using a VR headset with an eye tracker. We found that the binaural cue outperformed both stereo and no auditory cues in terms of target detection irrespective of the environmental noise. We used two eye movement measures and two physiological measures to evaluate task dynamics and mental effort. We found that the absence of a cue increased target search duration and target search path, measured as time to fixation and gaze trajectory lengths, respectively. Our physiological measures of blink rate and pupil size showed no difference between the different stadium and cue conditions. Overall, our study provides evidence for the utility of binaural audio in a realistic, noisy virtual environment for performing a target detection task, which is a crucial part of everyday behaviour, such as finding someone in a crowd.

(This article belongs to the Special Issue Virtual Reality, Augmented Reality, and Human-Computer Interaction)

31 pages, 5136 KB

Open Access | Article

Automatic Diagnosis of Epileptic Seizures in EEG Signals Using Fractal Dimension Features and Convolutional Autoencoder Method

by Anis Malekzadeh, Assef Zare, Mahdi Yaghoobi and Roohallah Alizadehsani

Big Data Cogn. Comput. 2021, 5(4), 78; https://doi.org/10.3390/bdcc5040078 - 13 Dec 2021

Cited by 56 | Viewed by 9708

Abstract

This paper proposes a new method for epileptic seizure detection in electroencephalography (EEG) signals using nonlinear features based on fractal dimension (FD) and a deep learning (DL) model. The Bonn and Freiburg datasets were used to perform experiments. The Bonn dataset consists of binary and multi-class classification problems, and the Freiburg dataset consists of two-class EEG classification problems. In the preprocessing step, all datasets were preprocessed using a Butterworth band-pass filter with 0.5–60 Hz cut-off frequencies. Then, the EEG signals of the datasets were segmented into different time windows. Next, the dual-tree complex wavelet transform (DT-CWT) was used to decompose the EEG signals into different sub-bands. For feature extraction, various FD techniques were used, including Higuchi (HFD), Katz (KFD), Petrosian (PFD), Hurst exponent (HE), detrended fluctuation analysis (DFA), Sevcik, box counting (BC), multiresolution box-counting (MBC), Maragos-Sun (MSFD), multifractal DFA (MF-DFA), and recurrence quantification analysis (RQA). In the next step, the minimum redundancy maximum relevance (mRMR) technique was used for feature selection. Finally, k-nearest neighbors (KNN), support vector machine (SVM), and convolutional autoencoder (CNN-AE) models were used for the classification step, in which K-fold cross-validation with k = 10 was employed to demonstrate the effectiveness of the classifiers. The experimental results show that the proposed CNN-AE method achieved an accuracy of 99.736% and 99.176% for the Bonn and Freiburg datasets, respectively.
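
As an illustration of one of the fractal dimension features listed above, the following is a minimal sketch of the Katz estimator. It assumes the common planar formulation with unit time steps between samples; the abstract does not say which variant, window length, or sub-band the authors applied it to, so the function name and the toy signals are hypothetical.

```python
import math

def katz_fd(signal):
    """Katz fractal dimension of a 1D signal sampled at unit time steps."""
    if len(signal) < 3:
        raise ValueError("need at least three samples")
    points = list(enumerate(signal))   # (time index, amplitude) pairs
    n = len(points) - 1                # number of steps along the curve
    # Total curve length: sum of distances between successive points.
    length = sum(math.dist(points[i], points[i + 1]) for i in range(n))
    # Planar extent: farthest distance from the first point.
    extent = max(math.dist(points[0], p) for p in points[1:])
    denom = math.log10(n) + math.log10(extent / length)
    if denom <= 0:
        raise ValueError("degenerate waveform for this estimator")
    return math.log10(n) / denom

ramp = list(range(11))          # straight line, expected dimension close to 1
zigzag = [0, 1] * 5 + [0]       # rougher waveform, expected dimension above 1
fd_ramp = katz_fd(ramp)
fd_zigzag = katz_fd(zigzag)
```

A straight line yields a dimension of about 1, and rougher waveforms yield larger values, which is the property that makes such features useful for separating EEG segments.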

18 pages, 546 KB

Open Access | Editor’s Choice | Article

DASentimental: Detecting Depression, Anxiety, and Stress in Texts via Emotional Recall, Cognitive Networks, and Machine Learning

by Asra Fatima, Ying Li, Thomas Trenholm Hills and Massimo Stella

Big Data Cogn. Comput. 2021, 5(4), 77; https://doi.org/10.3390/bdcc5040077 - 13 Dec 2021

Cited by 36 | Viewed by 10547

Abstract

Most current affect scales and sentiment analysis on written text focus on quantifying valence/sentiment, the primary dimension of emotion. Distinguishing broader, more complex negative emotions of similar valence is key to evaluating mental health. We propose a semi-supervised machine learning model, DASentimental, to extract depression, anxiety, and stress from written text. We trained DASentimental to identify how N = 200 sequences of recalled emotional words correlate with recallers’ depression, anxiety, and stress from the Depression Anxiety Stress Scale (DASS-21). Using cognitive network science, we modeled every recall list as a bag-of-words (BOW) vector and as a walk over a network representation of semantic memory (in this case, free associations). This weights BOW entries according to their centrality (degree) in semantic memory and informs recalls using semantic network distances, thus embedding recalls in a cognitive representation. This embedding translated into state-of-the-art, cross-validated predictions for depression (R = 0.7), anxiety (R = 0.44), and stress (R = 0.52), equivalent to previous results employing additional human data. Powered by a multilayer perceptron neural network, DASentimental opens the door to probing the semantic organizations of emotional distress. We found that semantic distances between recalls (i.e., walk coverage) were key for estimating depression levels but redundant for anxiety and stress levels. Semantic distances from “fear” boosted anxiety predictions but were redundant when the “sad–happy” dyad was considered. We applied DASentimental to a clinical dataset of 142 suicide notes and found that the predicted depression and anxiety levels (high/low) corresponded to differences in valence and arousal as expected from a circumplex model of affect. We discuss key directions for future research enabled by artificial intelligence detecting stress, anxiety, and depression in texts.

(This article belongs to the Special Issue Knowledge Modelling and Learning through Cognitive Networks)
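
The two representations described in the abstract, a degree-weighted bag-of-words vector and a walk over a semantic network, can be sketched as follows. The tiny association network, the recall list, and both function names are hypothetical, and summing shortest-path distances between consecutive recalls is only one plausible reading of a walk-based distance feature, not the authors' stated definition.

```python
from collections import deque

# Hypothetical free-association network as an undirected adjacency map.
associations = {
    "sad": {"cry", "alone", "happy"},
    "happy": {"sad", "smile"},
    "cry": {"sad"},
    "alone": {"sad", "fear"},
    "fear": {"alone"},
    "smile": {"happy"},
}
vocabulary = sorted(associations)

def degree_weighted_bow(recall, network, vocab):
    """Bag-of-words counts scaled by each word's degree in the network."""
    return [recall.count(word) * len(network[word]) for word in vocab]

def network_distance(network, source, target):
    """Shortest-path length between two words (breadth-first search)."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in network[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # the two words are not connected

recall_list = ["sad", "alone", "sad", "fear"]  # hypothetical recalled emotion words
vector = degree_weighted_bow(recall_list, associations, vocabulary)
walk_length = sum(
    network_distance(associations, a, b) for a, b in zip(recall_list, recall_list[1:])
)
```

Here "sad" is recalled twice and has three neighbours, so it dominates the vector, while the walk length summarises how far apart consecutive recalls sit in the network.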

17 pages, 13456 KB

Open Access | Article

GO-E-MON: A New Online Platform for Decentralized Cognitive Science

by Satoshi Yazawa, Kikue Sakaguchi and Kazuo Hiraki

Big Data Cogn. Comput. 2021, 5(4), 76; https://doi.org/10.3390/bdcc5040076 - 13 Dec 2021

Cited by 4 | Viewed by 5424

Abstract

Advances in web technology and the widespread use of smartphones and PCs have proven that it is possible to optimize various services using personal data, such as location information and search history. While considerations of personal privacy and legal aspects lead to situations where data are monopolized by individual services and companies, a replication crisis has been pointed out for the data of laboratory experiments, which is challenging to solve given the difficulty of data distribution. To ensure distribution of experimental data while guaranteeing security, an online experiment platform can be a game changer. Current online experiment platforms have not yet considered improving data distribution, and it is currently difficult to use the data obtained from one experiment for other purposes. In addition, various devices such as activity meters and consumer-grade electroencephalography meters are emerging, and if a platform that collects data from such devices and tasks online is to be realized, the platform will hold a large amount of sensitive data, making it even more important to ensure security. We propose GO-E-MON, a service that combines an online experimental environment with a distributed personal data store (PDS), and explain how GO-E-MON can realize the reuse of experimental data with the subject’s consent by connecting to a distributed PDS. We report the results of an experiment conducted in a group-work lecture for university students to verify whether this method works. By building an online experiment environment integrated with a distributed PDS, we present the possibility of integrating multiple experiments performed by different experimenters, with the consent of individual subjects, while addressing the security issues.

19 pages, 2940 KB

Open Access | Article

Screening of Potential Indonesia Herbal Compounds Based on Multi-Label Classification for 2019 Coronavirus Disease

by Aulia Fadli, Wisnu Ananta Kusuma, Annisa, Irmanida Batubara and Rudi Heryanto

Big Data Cogn. Comput. 2021, 5(4), 75; https://doi.org/10.3390/bdcc5040075 - 9 Dec 2021

Cited by 6 | Viewed by 5966

Abstract

The coronavirus disease 2019 (COVID-19) pandemic is spreading rapidly and requires an acceleration of the drug discovery process. Drug repurposing can help accelerate drug discovery by identifying new efficacy for approved drugs, and it is considered an efficient and economical approach. Research in drug repurposing can be done by observing the interactions of drug compounds with proteins related to a disease (drug-target interactions, DTI), then predicting new drug-target interactions. This study conducted multilabel DTI prediction using the stack autoencoder-deep neural network (SAE-DNN) algorithm. Compound features were extracted using the PubChem fingerprint, Daylight fingerprint, MACCS fingerprint, and circular fingerprint. The results showed that the SAE-DNN model was able to predict DTI in COVID-19 cases with good performance. The SAE-DNN model with the circular fingerprint dataset produced the best average metrics, with an accuracy of 0.831, recall of 0.918, precision of 0.888, and F-measure of 0.89. Prediction with the SAE-DNN model on the circular, Daylight, and PubChem fingerprint datasets identified 92, 65, and 79 herbal compounds, respectively, contained in herbal plants in Indonesia.

(This article belongs to the Topic Machine and Deep Learning)
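
For readers unfamiliar with how precision, recall, and F-measure are obtained in a multilabel setting, here is a minimal sketch using micro-averaging over all compound-target slots. The abstract reports "average" metrics without naming the averaging scheme, so micro-averaging is an assumption here, and the label matrices are invented.

```python
def micro_scores(y_true, y_pred):
    """Micro-averaged precision, recall, and F-measure over all label slots."""
    pairs = [(t, p) for row_t, row_p in zip(y_true, y_pred) for t, p in zip(row_t, row_p)]
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # predicted and real interactions
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # predicted but not real
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # real but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f_measure

# Hypothetical data: rows are compounds, columns are target proteins.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1], [1, 1, 0]]
precision, recall, f_measure = micro_scores(y_true, y_pred)
```

Macro-averaging (computing the scores per target and then averaging) can give different values from the same predictions, which is one reason an averaged F-measure need not equal the harmonic mean of the averaged precision and recall.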

20 pages, 43237 KB

Open Access | Article

Fusion of Moment Invariant Method and Deep Learning Algorithm for COVID-19 Classification

by Ervin Gubin Moung, Chong Joon Hou, Maisarah Mohd Sufian, Mohd Hanafi Ahmad Hijazi, Jamal Ahmad Dargham and Sigeru Omatu

Big Data Cogn. Comput. 2021, 5(4), 74; https://doi.org/10.3390/bdcc5040074 - 8 Dec 2021

Cited by 20 | Viewed by 5949

Abstract

The COVID-19 pandemic has resulted in a global health crisis. The rapid spread of the virus has led to the infection of a significant population and millions of deaths worldwide. Therefore, the world is in urgent need of fast and accurate COVID-19 screening. Numerous researchers have designed pioneering deep learning (DL) models for the automatic screening of COVID-19 based on computerised tomography (CT) scans; however, there is still a concern regarding the performance stability affected by tiny perturbations and structural changes in CT images. This paper proposes a fusion of a moment invariant (MI) method and a DL algorithm for feature extraction to address the instabilities in the existing COVID-19 classification models. The proposed method incorporates the MI-based features into the DL models using the cascade fusion method. It was found that the fusion of MI features with DL features has the potential to improve the sensitivity and accuracy of the COVID-19 classification. Based on the evaluation using the SARS-CoV-2 dataset, the fusion of VGG16 and Hu moments shows the best result, with 90% sensitivity and 93% accuracy.
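
To show why moment invariants are robust to small structural shifts, the following sketch computes the first two Hu invariants of a tiny image from normalised central moments and checks that they do not change under translation. The images and the deep feature vector are hypothetical, and the final concatenation is only a simple stand-in that may differ from the paper's cascade fusion.

```python
def hu_first_two(image):
    """First two Hu moment invariants of a 2D intensity grid (list of rows)."""
    pixels = [(x, y, v) for y, row in enumerate(image) for x, v in enumerate(row)]
    m00 = sum(v for _, _, v in pixels)
    if m00 == 0:
        raise ValueError("image has no mass")
    # Intensity centroid, used to make the moments translation invariant.
    cx = sum(x * v for x, _, v in pixels) / m00
    cy = sum(y * v for _, y, v in pixels) / m00

    def eta(p, q):
        # Central moment of order (p, q), normalised for scale invariance.
        mu = sum((x - cx) ** p * (y - cy) ** q * v for x, y, v in pixels)
        return mu / m00 ** (1 + (p + q) / 2)

    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return phi1, phi2

# An L-shaped blob and the same blob shifted one pixel right and down.
shape = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
shifted = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 1, 0],
]
phi1, phi2 = hu_first_two(shape)
deep_features = [0.12, 0.80, 0.33]        # hypothetical CNN embedding
fused = deep_features + [phi1, phi2]      # plain concatenation, an assumed fusion
```

The full Hu set has seven invariants; only the first two are shown to keep the sketch short.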

25 pages, 11236 KB

Open Access | Article

Explainable COVID-19 Detection on Chest X-rays Using an End-to-End Deep Convolutional Neural Network Architecture

by Mohamed Chetoui, Moulay A. Akhloufi, Bardia Yousefi and El Mostafa Bouattane

Big Data Cogn. Comput. 2021, 5(4), 73; https://doi.org/10.3390/bdcc5040073 - 7 Dec 2021

Cited by 31 | Viewed by 11449

Abstract

The coronavirus pandemic is spreading around the world. Medical imaging modalities such as radiography play an important role in the fight against COVID-19. Deep learning (DL) techniques have been able to improve medical imaging tools and help radiologists to make clinical decisions for the diagnosis, monitoring and prognosis of different diseases. Computer-Aided Diagnostic (CAD) systems can improve work efficiency by precisely delineating infections in chest X-ray (CXR) images, thus facilitating subsequent quantification. CAD can also help automate the scanning process and reshape the workflow with minimal patient contact, providing the best protection for imaging technicians. The objective of this study is to develop a deep learning algorithm to detect COVID-19, pneumonia and normal cases on CXR images. We propose two classification problems: (i) a binary classification to classify COVID-19 and normal cases and (ii) a multiclass classification for COVID-19, pneumonia and normal. Nine datasets and more than 3200 COVID-19 CXR images are used to assess the efficiency of the proposed technique. The model is trained on a subset of the National Institutes of Health (NIH) dataset using swish activation, thus improving the training accuracy to detect COVID-19 and other pneumonia. The models are tested on eight merged datasets and on individual test sets in order to confirm the degree of generalization of the proposed algorithms. An explainability algorithm is also developed to visually show the location of the lung-infected areas detected by the model. Moreover, we provide a detailed analysis of the misclassified images. The results show high performance, with an Area Under Curve (AUC) of 0.97 for multi-class classification (COVID-19 vs. other pneumonia vs. normal) and 0.98 for the binary model (COVID-19 vs. normal). The average sensitivity and specificity are 0.97 and 0.98, respectively. The sensitivity of the COVID-19 class reaches 0.99. The results outperformed the comparable state-of-the-art models for the detection of COVID-19 on CXR images. The explainability model shows that our model is able to efficiently identify the signs of COVID-19.

(This article belongs to the Special Issue COVID-19: Medical Internet of Things and Big Data Analytics)

11 pages, 440 KB

Open Access | Editor’s Choice | Article

Exploring Ensemble-Based Class Imbalance Learners for Intrusion Detection in Industrial Control Networks

by Maya Hilda Lestari Louk and Bayu Adhi Tama

Big Data Cogn. Comput. 2021, 5(4), 72; https://doi.org/10.3390/bdcc5040072 - 6 Dec 2021

Cited by 29 | Viewed by 4831

Abstract

Classifier ensembles have been utilized in the industrial cybersecurity sector for many years. However, their efficacy and reliability for intrusion detection systems remain questionable in current research, owing to the particularly imbalanced data issue. The purpose of this article is to address a gap in the literature by illustrating the benefits of ensemble-based models for identifying threats and attacks in a cyber-physical power grid. We provide a framework that compares nine cost-sensitive individual and ensemble models designed specifically for handling imbalanced data, including cost-sensitive C4.5, roughly balanced bagging, random oversampling bagging, random undersampling bagging, synthetic minority oversampling bagging, random undersampling boosting, synthetic minority oversampling boosting, AdaC2, and EasyEnsemble. Each ensemble’s performance is tested against a range of benchmarked power system datasets using balanced accuracy, Kappa statistics, and AUC metrics. Our findings demonstrate that EasyEnsemble significantly outperformed its rivals across the board. Furthermore, undersampling and oversampling strategies were effective in a boosting-based ensemble but not in a bagging-based ensemble.

(This article belongs to the Special Issue Artificial Intelligence for Trustworthy Industrial Internet of Things)
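
Two of the evaluation metrics named in the abstract, balanced accuracy and the Kappa statistic, are designed to stay informative when one class dominates. The sketch below implements both from their standard definitions; the label vectors are invented, and the functions are illustrative rather than the authors' evaluation code.

```python
from collections import Counter

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(1 for i in idx if y_pred[i] == cls) / len(idx))
    return sum(recalls) / len(recalls)

def cohen_kappa(y_true, y_pred):
    """Agreement between labels and predictions, corrected for chance."""
    n = len(y_true)
    observed = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Chance agreement from the marginal class frequencies.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical imbalanced sample: 0 = normal event, 1 = attack.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
bal_acc = balanced_accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
```

Plain accuracy on this sample is about 0.67, yet only half of the attacks are caught; balanced accuracy and Kappa both expose that weakness. Note that `cohen_kappa` divides by zero when chance agreement equals 1, for example when every label and prediction belongs to a single class.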

17 pages, 1558 KB

Open Access | Article

Customized Rule-Based Model to Identify At-Risk Students and Propose Rational Remedial Actions

by Balqis Albreiki, Tetiana Habuza, Zaid Shuqfa, Mohamed Adel Serhani, Nazar Zaki and Saad Harous

Big Data Cogn. Comput. 2021, 5(4), 71; https://doi.org/10.3390/bdcc5040071 - 29 Nov 2021

Cited by 22 | Viewed by 7082

Abstract

Detecting at-risk students offers substantial benefits for student retention, enrollment management, alumni engagement, targeted marketing, and institutional effectiveness. One of the success factors of educational institutes is the accurate and timely identification and prioritization of the students requiring assistance. The main objective of this paper is to detect at-risk students as early as possible so that appropriate corrective measures can be taken, considering the most important and influential attributes in students’ data. This paper emphasizes the use of a customized rule-based system (RBS) to identify and visualize at-risk students in early stages throughout the course delivery using the Risk Flag (RF). Moreover, it can serve as a warning tool for instructors to identify those students that may struggle to grasp learning outcomes. The module allows the instructor to have a dashboard that graphically depicts the students’ performance in different coursework components. The at-risk student will be distinguished (flagged), and remedial actions will be communicated to the student, instructor, and stakeholders. The system suggests remedial actions based on the severity of the case and the time the student is flagged. It is expected to improve students’ achievement and success, and it could also have positive impacts on under-performing students, educators, and academic institutions in general.

(This article belongs to the Special Issue Educational Data Mining and Technology)
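
The idea of a rule that raises a flag and then picks a remedial action from the severity and the timing can be sketched in a few lines. Everything specific below is an assumption made for illustration: the `Checkpoint` fields, the 60% threshold, the 15-point severity band, the week-8 cutoff, and the action wording are not taken from the paper.

```python
from dataclasses import dataclass

@dataclass
class Checkpoint:
    week: int      # teaching week at which the coursework was graded
    score: float   # cumulative coursework score so far, as a percentage

def risk_flag(cp, threshold=60.0):
    """Return (flagged, severity, action) for one student checkpoint.

    The threshold, the 15-point severity band, and the week-8 cutoff are
    illustrative assumptions, not values from the paper.
    """
    if cp.score >= threshold:
        return (False, "none", "no action")
    severity = "high" if threshold - cp.score > 15 else "moderate"
    # Escalate when the gap is large or little time remains in the course.
    if severity == "high" or cp.week > 8:
        action = "meet instructor and advisor"
    else:
        action = "extra practice material"
    return (True, severity, action)

early_moderate = risk_flag(Checkpoint(week=4, score=55.0))
late_moderate = risk_flag(Checkpoint(week=10, score=55.0))
```

The same score leads to a stronger intervention later in the term, which mirrors the abstract's point that the suggested action depends on both severity and the time of flagging.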

12 pages, 563 KB

Open Access | Article

Gambling Strategies and Prize-Pricing Recommendation in Sports Multi-Bets

by Oz Pirvandy, Moti Fridman and Gur Yaari

Big Data Cogn. Comput. 2021, 5(4), 70; https://doi.org/10.3390/bdcc5040070 - 29 Nov 2021

Cited by 2 | Viewed by 8914

Abstract

A sports multi-bet is a bet on the results of a set of N games. One type of multi-bet offered by the Israeli government is WINNER 16, where participants guess the results of a set of 16 soccer games. The prizes in WINNER 16 are determined by the accumulated profit in previous rounds, and are split among all winning forms. When the reward increases beyond a certain threshold, a profitable strategy can be devised. Here, we present a machine-learning scheme to play WINNER 16. Our proposed algorithm is marginally profitable on average over a range of hyper-parameters, indicating inefficiencies in this game. To provide a better prize-pricing mechanism, we suggest a generalization of the single-bet approach. We studied the expected profit and risk of WINNER 16 after applying our suggestion. Our proposal can make the game fairer and more appealing without reducing its profitability.
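
The claim that a profitable strategy appears once the prize passes a threshold follows from simple expected-value arithmetic, sketched below. The sketch assumes independent games, a single all-correct prize tier, and a known number of other winning forms; these simplifications, the function name, and the numbers are all hypothetical and are not the authors' model.

```python
import math

def expected_profit(win_probs, prize, other_winners, cost):
    """Expected profit of one form that must get every game right."""
    p_all = math.prod(win_probs)           # games assumed independent
    share = prize / (other_winners + 1)    # prize split among winning forms
    return p_all * share - cost

coin_flip_games = [0.5] * 16               # hypothetical per-game hit probability
alone = expected_profit(coin_flip_games, prize=131072, other_winners=0, cost=1.0)
shared = expected_profit(coin_flip_games, prize=131072, other_winners=1, cost=1.0)
```

With these toy numbers a form breaks even only when the prize is shared with one other winner and turns a profit when it is not, so both the size of the accumulated prize and the expected number of co-winners decide whether playing is worthwhile.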

17 pages, 1452 KB

Open Access | Article

Networks and Stories. Analyzing the Transmission of the Feminist Intangible Cultural Heritage on Twitter

by Jordi Morales-i-Gras, Julen Orbegozo-Terradillos, Ainara Larrondo-Ureta and Simón Peña-Fernández

Big Data Cogn. Comput. 2021, 5(4), 69; https://doi.org/10.3390/bdcc5040069 - 24 Nov 2021

Cited by 11 | Viewed by 6692

Abstract

Internet social media is a key space in which the memorial resources of social movements, including the stories and knowledge of previous generations, are organised, disseminated, and reinterpreted. This is especially important for movements such as feminism, which places great emphasis on the transmission of an intangible cultural legacy between its different generations or waves, which are themselves shaped by these cultural transmissions. In this sense, several authors have highlighted the importance of social media and hashtivism in shaping the fourth wave of feminism that has been taking place in recent years (e.g., #metoo). The aim of this article is to present to the scientific community a hybrid methodological proposal for the network and content analysis of audiences and their interactions on Twitter; we do so by describing and evaluating the results of several studies we have carried out in the field of feminist hashtivism. Structural analysis methods such as social network analysis have demonstrated their capacity to be applied to the analysis of social media interactions as a mixed methodology, that is, both quantitative and qualitative. This article shows the potential of a specific methodological process that combines inductive and inferential reasoning with hypothetico-deductive approaches. By applying the methodology developed in the case studies included in the article, it is shown that these two modes of reasoning work best when they are used together.

(This article belongs to the Special Issue Big Data Analytics for Cultural Heritage)

15 pages, 564 KB

Open Access | Article

The Impact of Big Data Adoption on SMEs’ Performance

by Mahdi Nasrollahi, Javaneh Ramezani and Mahmoud Sadraei

Big Data Cogn. Comput. 2021, 5(4), 68; https://doi.org/10.3390/bdcc5040068 - 24 Nov 2021

Cited by 33 | Viewed by 12543

Abstract

The notion of Industry 4.0 encompasses the adoption of new information technologies that enable an enormous amount of information to be digitally collected, analyzed, and exploited in organizations to make better decisions. Therefore, finding how organizations can adopt big data (BD) components to improve their performance becomes a relevant research area. This issue is becoming more pertinent for small and medium enterprises (SMEs), especially in developing countries with limited resources and infrastructure. Due to the lack of empirical studies related to big data adoption (BDA) and BD’s business value, especially in SMEs, this study investigates the impact of BDA on SMEs’ performance by obtaining the required data from experts. The quantitative investigation followed a mixed approach, including survey data from 224 managers from Iranian SMEs, and a structural equation modeling (SEM) methodology for the data analysis. Results showed that 12 factors affected BDA in SMEs. BDA can affect both operational performance and economic performance. No support was found for the influence of BDA and economic performance on social performance. Finally, the study implications and findings are discussed alongside future research suggestions, as well as some limitations and unanswered questions.

12 pages, 857 KB

Open Access | Editor’s Choice | Review

Spiking Neural Networks for Computational Intelligence: An Overview

by Shirin Dora and Nikola Kasabov

Big Data Cogn. Comput. 2021, 5(4), 67; https://doi.org/10.3390/bdcc5040067 - 15 Nov 2021

Cited by 47 | Viewed by 11833

Abstract

Deep neural networks with rate-based neurons have exhibited tremendous progress in the last decade. However, the same level of progress has not been observed in research on spiking neural networks (SNNs), despite their capability to handle temporal data, their energy efficiency, and their low latency. This could be because the benchmarking techniques for SNNs are based on the methods used for evaluating deep neural networks, which do not provide a clear evaluation of the capabilities of SNNs. In particular, benchmarking SNN approaches with regard to energy efficiency and latency requires realization in suitable hardware, which imposes additional temporal and resource constraints upon ongoing projects. This review aims to provide an overview of the current real-world applications of SNNs and identifies steps to accelerate research involving SNNs in the future.

(This article belongs to the Special Issue Computational Intelligence: Spiking Neural Networks)

18 pages, 604 KB

Open Access | Concept Paper

Kano Model Integration with Data Mining to Predict Customer Satisfaction

by Khaled Al Rabaiei, Fady Alnajjar and Amir Ahmad

Big Data Cogn. Comput. 2021, 5(4), 66; https://doi.org/10.3390/bdcc5040066 - 11 Nov 2021

Cited by 20 | Viewed by 9309

Abstract

The Kano model is one of the models that help determine which features must be included in a product or service to improve customer satisfaction. The model is focused on highlighting the most relevant attributes of a product or service along with customers’ estimation of how the presence of these attributes can be used to predict satisfaction about specific services or products. This research aims to develop a method to integrate the Kano model and data mining approaches to select relevant attributes that drive customer satisfaction, with a specific focus on higher education. The significant contribution of this research is to solve the problem of selecting features that are not methodically correlated to customer satisfaction, which could reduce the risk of investing in features that could ultimately be irrelevant to enhancing customer satisfaction. Questionnaire data were collected from 646 students from UAE University. The experiment suggests that XGBoost Regression and Decision Tree Regression produce the best results for this kind of problem. Based on the integration between the Kano model and the feature selection method, the number of features used to predict customer satisfaction is reduced to four. It was found that integrating the ANOVA feature selection model with the Kano model gives higher Pearson correlation coefficients and higher R² values.

(This article belongs to the Topic Machine and Deep Learning)

25 pages, 873 KB

Open Access | Editor’s Choice | Article

An Enhanced Parallelisation Model for Performance Prediction of Apache Spark on a Multinode Hadoop Cluster

by Nasim Ahmed, Andre L. C. Barczak, Mohammad A. Rashid and Teo Susnjak

Big Data Cogn. Comput. 2021, 5(4), 65; https://doi.org/10.3390/bdcc5040065 - 5 Nov 2021

Cited by 7 | Viewed by 6659

Abstract

Big data frameworks play a vital role in storing, processing, and analysing large datasets. Apache Spark has been established as one of the most popular big data engines for its efficiency and reliability. However, one of the significant problems of the Spark system is performance prediction. Spark has more than 150 configurable parameters, and configuring so many parameters is a challenging task when determining the suitable parameters for the system. In this paper, we propose two distinct parallelisation models for performance prediction. Our insight is that each node in a Hadoop cluster can communicate with identical nodes, and a certain function of the non-parallelisable runtime can be estimated accordingly. Both models use simple equations that allow us to predict the runtime when the size of the job and the number of executables are known. The proposed models were evaluated on five HiBench workloads: Kmeans, PageRank, Graph (NWeight), SVM, and WordCount. Each workload’s empirical data were fitted with one of the two models meeting the accuracy requirements. Finally, the experimental findings show that the model can be a handy and helpful tool for scheduling and planning system deployment.

33 pages, 643 KB

Open Access | Article

How Does Learning Analytics Contribute to Prevent Students’ Dropout in Higher Education: A Systematic Literature Review

by Catarina Félix de Oliveira, Sónia Rolland Sobral, Maria João Ferreira and Fernando Moreira

Big Data Cogn. Comput. 2021, 5(4), 64; https://doi.org/10.3390/bdcc5040064 - 4 Nov 2021

Cited by 92 | Viewed by 18402

Abstract

Retention and dropout of higher education students is a subject that must be analysed carefully. Learning analytics can be used to help prevent failure cases. The purpose of this paper is to analyse the scientific production in this area in higher education in journals indexed in Clarivate Analytics’ Web of Science and Elsevier’s Scopus. We use a bibliometric and systematic study to obtain deep knowledge of this scientific production. The information gathered allows us to perceive where, how, and in what ways learning analytics has been used in recent years. By analysing studies performed all over the world, we identify what kinds of data and techniques are used to approach the subject. We propose a feature classification into several categories and subcategories, regarding student and external features. Student features can be seen as personal or academic data, while external factors include information about the university, environment, and support offered to the students. To approach the problems, authors successfully use data mining applied to the identified educational data. We also identify some other concerns, such as privacy issues, that need to be considered in the studies.

(This article belongs to the Special Issue Educational Data Mining and Technology)

15 pages, 3933 KB

Open Access | Feature Paper | Article

GANs and Artificial Facial Expressions in Synthetic Portraits

by Pilar Rosado, Rubén Fernández and Ferran Reverter

Big Data Cogn. Comput. 2021, 5(4), 63; https://doi.org/10.3390/bdcc5040063 - 4 Nov 2021

Cited by 9 | Viewed by 7247

Abstract

Generative adversarial networks (GANs) provide powerful architectures for deep generative learning. GANs have enabled us to achieve an unprecedented degree of realism in the creation of synthetic images of human faces, landscapes, and buildings, among others. GANs make possible not only image generation but also image manipulation. Generative deep learning models are inherently limited in their creative abilities because of a focus on learning for perfection. We investigated the potential of GANs’ latent spaces to encode human expressions, highlighting creative interest in suboptimal solutions rather than perfect reproductions, in pursuit of the artistic concept. We have trained Deep Convolutional GAN (DCGAN) and StyleGAN using a collection of portraits of detained persons, portraits of people who died of violent causes, and portraits of people taken during an orgasm. We present results which diverge from standard usage of GANs with the specific intention of producing portraits that may assist us in the representation and recognition of otherness in contemporary identity construction.

(This article belongs to the Special Issue Machine and Deep Learning in Computer Vision Applications)

13 pages, 2180 KB

Open Access | Editor’s Choice | Article

Prediction of Cloud Fractional Cover Using Machine Learning

by Hanna Svennevik, Michael A. Riegler, Steven Hicks, Trude Storelvmo and Hugo L. Hammer

Big Data Cogn. Comput. 2021, 5(4), 62; https://doi.org/10.3390/bdcc5040062 - 3 Nov 2021

Cited by 9 | Viewed by 6268

Abstract

Climate change is regarded as one of the largest issues of our time, resulting in many unwanted effects on life on earth. Cloud fractional cover (CFC), the portion of the sky covered by clouds, might affect global warming and various other aspects of human society such as agriculture and solar energy production. It is therefore important to improve the projection of future CFC, which is usually projected using numerical climate methods. In this paper, we explore the potential of using machine learning as part of a statistical downscaling framework to project future CFC. We are not aware of any other research that has explored this. We evaluated the potential of two different methods, a convolutional long short-term memory model (ConvLSTM) and a multiple regression equation, to predict CFC from other environmental variables. The predictions were associated with much uncertainty, indicating that there might not be much information in the environmental variables used in the study to predict CFC. Overall, the regression equation performed best, but the ConvLSTM was the better-performing model along some coastal and mountain areas. All aspects of the research analyses are explained, including data preparation, model development, ML training, performance evaluation, and visualization.

(This article belongs to the Special Issue Multimedia Systems for Multimedia Big Data)

19 pages, 2639 KB

Open Access | Review

Using Machine Learning in Business Process Re-Engineering

by Younis Al-Anqoudi, Abdullah Al-Hamdani, Mohamed Al-Badawi and Rachid Hedjam

Big Data Cogn. Comput. 2021, 5(4), 61; https://doi.org/10.3390/bdcc5040061 - 2 Nov 2021

Cited by 40 | Viewed by 16808

Abstract

The value of business process re-engineering in improving business processes is undoubted. Nevertheless, it is incredibly complex, time-consuming, and costly. This study aims to review the available literature on the use of machine learning for business process re-engineering. The review investigates the available literature on business process re-engineering frameworks, methodologies, tools, techniques, and machine-learning applications in automating business process re-engineering. The study covers 200+ research papers published between 2015 and 2020 in reputable scientific publication platforms: Scopus, Emerald, Science Direct, IEEE, and British Library. The results indicate that business process re-engineering is a well-established field with scientifically solid frameworks, methodologies, tools, and techniques, which support decision making by generating and analysing relevant data. The study indicates a wealth of data generated, analysed, and utilised throughout business process re-engineering projects, thus making it a potential greenfield for innovative machine-learning applications aiming to reduce implementation costs and manage complexity by exploiting the data’s hidden patterns. The review finds that there have been attempts to apply machine learning in business process management and improvement in general. These attempts address process discovery, process behaviour prediction, process improvement, and process optimisation. The review suggests that expanding the applications to business process re-engineering is promising. The study proposes a machine-learning model for automating business process re-engineering, inspired by the Lean Six Sigma principles of eliminating waste and variance in the business process.

9 pages, 1884 KB

Open Access | Article

Fine-Grained Algorithm for Improving KNN Computational Performance on Clinical Trials Text Classification

by Jasmir Jasmir, Siti Nurmaini and Bambang Tutuko

Big Data Cogn. Comput. 2021, 5(4), 60; https://doi.org/10.3390/bdcc5040060 - 28 Oct 2021

Cited by 13 | Viewed by 4651

Abstract

Text classification is an important component in many applications. Text classification has attracted the attention of researchers, who continue to develop innovations and build new classification models based on clinical trial texts. In building classification models, many methods are used, including supervised learning. The purpose of this study is to improve the computational performance of one of the supervised learning methods, namely KNN, in building a clinical trial document text classification model by combining KNN and the fine-grained algorithm. This research contributed to improving the computational performance of KNN by reducing its running time from 388,274 s to 260,641 s on a clinical trial text dataset with a total of 1,000,000 data items.
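
As a rough reading aid, the two KNN runtimes quoted in the abstract above can be turned into a relative reduction and a speedup factor. The sketch below uses only those two figures; the variable names are illustrative and not taken from the paper. Because both quantities are ratios, they do not depend on the unit in which the runtimes are expressed.

```python
# Runtimes quoted in the abstract (seconds); names are illustrative only.
baseline_s = 388_274   # KNN alone
improved_s = 260_641   # KNN combined with the fine-grained algorithm

# Fraction of the baseline runtime that was saved.
reduction = (baseline_s - improved_s) / baseline_s

# How many times faster the combined approach ran.
speedup = baseline_s / improved_s

print(f"runtime reduction: {reduction:.1%}")
print(f"speedup factor:    {speedup:.2f}x")
```

On these figures the combined approach saves roughly a third of the baseline runtime.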

17 pages, 2161 KB

Open Access | Article

NERWS: Towards Improving Information Retrieval of Digital Library Management System Using Named Entity Recognition and Word Sense

by Ahmed Aliwy, Ayad Abbas and Ahmed Alkhayyat

Big Data Cogn. Comput. 2021, 5(4), 59; https://doi.org/10.3390/bdcc5040059 - 28 Oct 2021

Cited by 12 | Viewed by 6852

Abstract

An information retrieval (IR) system is the core of many applications, including digital library management systems (DLMS). The IR-based DLMS depends on either the title with keywords or the content as symbolic strings, and it ignores the meaning of the content or what it indicates. Many researchers have tried to improve IR systems using either the named entity recognition (NER) technique or the words’ meaning (word sense), and have implemented the improvements for a specific language. However, they did not test the IR system using NER and word sense disambiguation together to study the behavior of the system in the presence of both techniques. This paper aims to improve the information retrieval system used by the DLMS by adding NER and word sense disambiguation (WSD) together for the English and Arabic languages. For NER, a voting technique was used among three completely different classifiers: rules-based, conditional random field (CRF), and bidirectional LSTM-CNN. For WSD, an examples-based method was used to implement it for the first time with the English language. For the IR system, a vector space model (VSM) was used, and the system was tested on samples from the library of the University of Kufa for the Arabic and English languages. The overall system results show that the precision, recall, and F-measure increased from 70.9%, 74.2%, and 72.5% to 89.7%, 91.5%, and 90.6% for the English language and from 66.3%, 69.7%, and 68.0% to 89.3%, 87.1%, and 88.2% for the Arabic language.
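
The reported F-measures can be checked against the reported precision and recall. The sketch below assumes the F-measure is the balanced F1 score, that is, the harmonic mean of precision and recall; the abstract does not state this explicitly. The function name and dictionary labels are illustrative.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %) pairs quoted in the abstract; labels are illustrative.
reported = {
    "English, baseline": (70.9, 74.2),
    "English, NER + WSD": (89.7, 91.5),
    "Arabic, baseline": (66.3, 69.7),
    "Arabic, NER + WSD": (89.3, 87.1),
}

for label, (p, r) in reported.items():
    print(f"{label}: F = {f_measure(p, r):.1f}%")
```

Under that assumption, each computed value rounds to the F-measure quoted in the abstract, so the three reported metrics are mutually consistent.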

17 pages, 4438 KB

Open Access | Article

Preparing Datasets of Surface Roughness for Constructing Big Data from the Context of Smart Manufacturing and Cognitive Computing

by Saman Fattahi, Takuya Okamoto and Sharifu Ura

Big Data Cogn. Comput. 2021, 5(4), 58; https://doi.org/10.3390/bdcc5040058 - 25 Oct 2021

Cited by 14 | Viewed by 8081

Abstract

In smart manufacturing, human-cyber-physical systems host digital twins and IoT-based networks. The networks weave manufacturing enablers such as CNC machine tools, robots, CAD/CAM systems, process planning systems, enterprise resource planning systems, and human resources. The twins work as the brains of the enablers; that is, the twins supply the required knowledge and help enablers solve problems autonomously in real time. Since surface roughness is a major concern of all manufacturing processes, twins to solve surface roughness-relevant problems are needed. The twins must machine-learn the required knowledge from the relevant datasets available in big data. Therefore, preparing surface roughness-relevant datasets to be included in the human-cyber-physical system-friendly big data is a critical issue. However, preparing such datasets is a challenge due to the lack of a steadfast procedure. This study sheds some light on this issue. A state-of-the-art method is proposed to prepare the said datasets for surface roughness, wherein each dataset consists of four segments: semantic annotation, roughness model, simulation algorithm, and simulation system. These segments provide input information for digital twins’ input, modeling, simulation, and validation modules. The semantic annotation segment boils down to a concept map. A human- and machine-readable concept map is thus developed where the information of the other segments (roughness model, simulation algorithm, and simulation system) is integrated. The delay map of surface roughness profile heights plays a pivotal role in the proposed dataset preparation method. The successful preparation of datasets of surface roughness underlying milling, turning, grinding, electric discharge machining, and polishing shows the efficacy of the proposed method. The method will be extended to the manufacturing processes in the next phase of this study.

(This article belongs to the Special Issue Artificial Intelligence for Trustworthy Industrial Internet of Things)

19 pages, 693 KB

Open Access | Article

A Semantic Web Framework for Automated Smart Assistants: A Case Study for Public Health

by Yusuf Sermet and Ibrahim Demir

Big Data Cogn. Comput. 2021, 5(4), 57; https://doi.org/10.3390/bdcc5040057 - 18 Oct 2021

Cited by 45 | Viewed by 8060

Abstract

The COVID-19 pandemic made clear that knowledge systems will be instrumental in cases where accurate information needs to be communicated to a substantial group of people with different backgrounds and technological resources. However, several challenges and obstacles hold back the wide adoption of virtual assistants by public health departments and organizations. This paper presents the Instant Expert, an open-source semantic web framework to build and integrate voice-enabled smart assistants (i.e., chatbots) for any web platform regardless of the underlying domain and technology. The component allows non-technical domain experts to effortlessly incorporate an operational assistant with voice recognition capability into their websites. Instant Expert is capable of automatically parsing, processing, and modeling Frequently Asked Questions pages as an information resource as well as communicating with an external knowledge engine for ontology-powered inference and dynamic data use. The presented framework uses advanced web technologies to ensure reusability and reliability, and an inference engine for natural-language understanding powered by deep learning and heuristic algorithms. A use case for creating an informatory assistant for COVID-19 based on the Centers for Disease Control and Prevention (CDC) data is presented to demonstrate the framework’s usage and benefits.

(This article belongs to the Special Issue Knowledge Modelling and Learning through Cognitive Networks)

54 pages, 6458 KB

Open Access | Editor’s Choice | Article

6G Cognitive Information Theory: A Mailbox Perspective

by Yixue Hao, Yiming Miao, Min Chen, Hamid Gharavi and Victor C. M. Leung

Big Data Cogn. Comput. 2021, 5(4), 56; https://doi.org/10.3390/bdcc5040056 - 16 Oct 2021

Cited by 34 | Viewed by 31646

Abstract

With the rapid development of 5G communications, enhanced mobile broadband, massive machine type communications, and ultra-reliable low latency communications are widely supported. However, a 5G communication system is still based on Shannon’s information theory, while the meaning and value of information itself are not taken into account in the process of transmission. Therefore, it is difficult to meet the requirements of intelligence, customization, and value transmission of 6G networks. In order to solve the above challenges, we propose a 6G mailbox theory, namely a cognitive information carrier to enable distributed algorithm embedding for intelligence networking. Based on Mailbox, a 6G network will form an intelligent agent with self-organization, self-learning, self-adaptation, and continuous evolution capabilities. With the intelligent agent, redundant transmission of data can be reduced while the value transmission of information can be improved. Then, the features of the mailbox principle are introduced, including polarity, traceability, dynamics, convergence, figurability, and dependence. Furthermore, key technologies with which value transmission of information can be realized are introduced, including knowledge graph, distributed learning, and blockchain. Finally, we establish a cognitive communication system assisted by deep learning. The experimental results show that, compared with a traditional communication system, our communication system transmits less data and produces fewer errors.

(This article belongs to the Special Issue Big Data and Cognitive Computing: 5th Anniversary Feature Papers)

17 pages, 763 KB

Open Access | Article

Unraveling the Impact of Land Cover Changes on Climate Using Machine Learning and Explainable Artificial Intelligence

by Anastasiia Kolevatova, Michael A. Riegler, Francesco Cherubini, Xiangping Hu and Hugo L. Hammer

Big Data Cogn. Comput. 2021, 5(4), 55; https://doi.org/10.3390/bdcc5040055 - 15 Oct 2021

Cited by 12 | Viewed by 6846

Abstract

A general issue in climate science is the handling of big data and running complex and computationally heavy simulations. In this paper, we explore the potential of using machine learning (ML) to save computational time and optimize data usage. The paper analyzes the effects of changes in land cover (LC), such as deforestation or urbanization, on local climate. Along with greenhouse gas emissions, LC changes are known to be important causes of climate change. ML methods were trained to learn the relation between LC changes and temperature changes. The results showed that random forest (RF) outperformed other ML methods, especially the linear regression models that represent current practice in the literature. Explainable artificial intelligence (XAI) was further used to interpret the RF method and analyze the impact of different LC changes on temperature. The results mainly agree with the climate science literature, but also reveal new and interesting findings, demonstrating that ML methods in combination with XAI can be useful in analyzing the climate effects of LC changes. All parts of the analysis pipeline are explained, including data pre-processing, feature extraction, ML training, performance evaluation, and XAI.

(This article belongs to the Special Issue Multimedia Systems for Multimedia Big Data)

16 pages, 2196 KB

Open Access | Editor’s Choice | Article

Effects of Neuro-Cognitive Load on Learning Transfer Using a Virtual Reality-Based Driving System

by Usman Alhaji Abdurrahman, Shih-Ching Yeh, Yunying Wong and Liang Wei

Big Data Cogn. Comput. 2021, 5(4), 54; https://doi.org/10.3390/bdcc5040054 - 13 Oct 2021

Cited by 18 | Viewed by 6531

Abstract

Understanding the ways different people perceive and apply acquired knowledge, especially when driving, is an important area of study. This study introduced a novel virtual reality (VR)-based driving system to determine the effects of neuro-cognitive load on learning transfer. In the experiment, easy and difficult routes were introduced to the participants, and the VR system recorded eye gaze, pupil dilation, heart rate, and driving performance data. The main purpose was to apply multimodal data fusion, several machine learning algorithms, and strategic analytic methods to measure neuro-cognitive load for user classification. A total of 98 university students participated in the experiment, of whom 49 were male and 49 were female. The results showed that data fusion methods achieved higher accuracy compared to other classification methods. These findings highlight the importance of physiological monitoring to measure mental workload during the process of learning transfer.

(This article belongs to the Special Issue Virtual Reality, Augmented Reality, and Human-Computer Interaction)

15 pages, 42126 KB

Open Access | Editor’s Choice | Article

Bag of Features (BoF) Based Deep Learning Framework for Bleached Corals Detection

by Sonain Jamil, MuhibUr Rahman and Amir Haider

Big Data Cogn. Comput. 2021, 5(4), 53; https://doi.org/10.3390/bdcc5040053 - 8 Oct 2021

Cited by 41 | Viewed by 9810

Abstract

Coral reefs are the sub-aqueous calcium carbonate structures collected by the invertebrates known as corals. The charm and beauty of coral reefs attract tourists, and they play a vital role in preserving biodiversity, preventing coastal erosion, and promoting business trade. However, they are declining because of over-exploitation, destructive fishing, marine pollution, and global climate change. Also, coral reefs help treat human immunodeficiency virus (HIV) and heart disease. The corals of Australia’s Great Barrier Reef have started bleaching due to ocean acidification and global warming, which is an alarming threat to the earth’s ecosystem. Many techniques have been developed to address such issues. However, each method has limitations due to the low resolution of images, diverse weather conditions, etc. In this paper, we propose a bag of features (BoF) based approach that can detect and localize bleached corals before safety measures are applied. The dataset contains images of bleached and unbleached corals, and support vector machines with various kernels are used to classify the extracted features. The accuracy of handcrafted descriptors and deep convolutional neural networks is analyzed and provided in detail, with a comparison to the current method. Various handcrafted descriptors, such as local binary pattern, histogram of oriented gradients, locally encoded transform feature histogram, gray level co-occurrence matrix, and completed joint scale local binary pattern, are used for feature extraction. Specific deep convolutional neural networks, such as AlexNet, GoogLeNet, VGG-19, ResNet-50, Inception v3, and CoralNet, are also used for feature extraction. In the experimental analysis, the proposed technique outperforms the current state-of-the-art methods, achieving 99.08% accuracy with a classification error of 0.92%. A novel bleached coral positioning algorithm is also proposed to locate bleached corals in the coral reef images.

(This article belongs to the Topic Applied Computer Vision and Pattern Recognition)

21 pages, 1020 KB

Open Access | Feature Paper | Editor’s Choice | Article

Hardening the Security of Multi-Access Edge Computing through Bio-Inspired VM Introspection

by Huseyn Huseynov, Tarek Saadawi and Kenichi Kourai

Big Data Cogn. Comput. 2021, 5(4), 52; https://doi.org/10.3390/bdcc5040052 - 8 Oct 2021

Cited by 6 | Viewed by 5025

Abstract

The extreme bandwidth and performance of 5G mobile networks change the way we develop and utilize digital services. Within a few years, 5G will not only touch technology and applications, but also dramatically change the economy, our society, and individual life. One of the emerging technologies enabling the evolution to 5G by bringing cloud capabilities closer to end users is edge computing, also known as Multi-Access Edge Computing (MEC). This evolution also entails growth in the threat landscape and increased privacy concerns in different application areas; hence, security and privacy play a central role in the evolution towards 5G. Since MEC applications are instantiated in virtualized infrastructure, in this paper we present a distributed application that constantly introspects multiple virtual machines (VMs) in order to detect malicious activities based on their anomalous behavior. Once suspicious processes are detected, our intrusion detection system (IDS) notifies the system administrator about the potential threat in real time. The developed software is able to detect keyloggers, rootkits, trojans, process hiding, and other intrusion artifacts via agentless operation, by operating remotely or directly from the host machine. Remote memory introspection means there is no software to install and no notice to malware to evacuate or destroy data. Experimental results of remote VMI on more than 50 different malicious code samples demonstrate an average anomaly detection rate close to 97%. We have established a wide testbed environment connecting the networks of two universities, Kyushu Institute of Technology and The City College of New York, through a secure GRE tunnel. Experiments conducted on this testbed deliver high response time of the proposed system.

(This article belongs to the Special Issue Information Security and Cyber Intelligence)

Big Data Cogn. Comput., Volume 5, Issue 4 (December 2021) – 36 articles
