Topic Editors

Prof. Dr. Miltiadis D. Lytras
School of Business, Deree—The American College of Greece, 6 Gravias Street, GR-153 42 Aghia Paraskevi Athens, Greece
Prof. Dr. Andreea Claudia Serban
Faculty of Theoretical and Applied Economics, The Bucharest University of Economic Studies, Romana Square, No. 6, 010374 Bucharest, Romania

Big Data and Artificial Intelligence

Abstract submission deadline
closed (30 September 2022)
Manuscript submission deadline
31 December 2022
Viewed by
63113

Topic Information

Dear Colleagues,

The evolution of research in Big Data and artificial intelligence in recent years challenges almost all domains of human activity. The potential of artificial intelligence to act as a catalyst for all given business models, and the capacity of Big Data research to provide sophisticated data and services ecosystems at a global scale, provide a challenging context for scientific contributions and applied research. This Topic section promotes scientific dialogue on the added value of novel methodological approaches and research in the specified areas. Our interest spans the entire end-to-end spectrum of Big Data and artificial intelligence research, from the social sciences to computer science, including strategic frameworks, models, and best practices as well as sophisticated research related to radical innovation. The topics include, but are not limited to, the following indicative list:

  • Enabling Technologies for Big Data and AI research:
    • Data warehouses;
    • Business intelligence;
    • Machine learning;
    • Neural networks;
    • Natural language processing;
    • Image processing;
    • Bot technology;
    • AI agents;
    • Analytics and dashboards;
    • Distributed computing;
    • Edge computing;
  • Methodologies, frameworks, and models for artificial intelligence and Big Data research:
    • Towards sustainable development goals;
    • As responses to social problems and challenges;
    • For innovations in business, research, academia, industry, and technology;
    • For theoretical foundations and contributions to the body of knowledge of AI and Big Data research;
  • Best practices and use cases;
  • Outcomes of R&D projects;
  • Advanced data science analytics;
  • Industry-government collaboration;
  • Systems of information systems;
  • Interoperability issues;
  • Security and privacy issues;
  • Ethics on Big Data and AI;
  • Social impact of AI;
  • Open data.

Prof. Dr. Miltiadis D. Lytras
Prof. Dr. Andreea Claudia Serban
Topic Editors

Keywords

  • artificial intelligence
  • big data
  • machine learning
  • open data
  • decision making

Participating Journals

Journal (submission code)                 Impact Factor  CiteScore  Launched  First Decision (median)  APC
Big Data and Cognitive Computing (BDCC)   n/a            6.1        2017      17 days                  1400 CHF
Future Internet (futureinternet)          n/a            5.4        2009      12.6 days                1400 CHF
Information (information)                 n/a            4.2        2010      18.9 days                1400 CHF
Remote Sensing (remotesensing)            5.349          7.4        2009      19.9 days                2500 CHF
Sustainability (sustainability)           3.889          5.0        2009      16.7 days                2000 CHF

Preprints is a platform dedicated to making early versions of research outputs permanently available and citable. MDPI journals allow posting on preprint servers such as Preprints.org prior to publication. For more details about preprints, please visit https://www.preprints.org.

Published Papers (74 papers)

Article
Analysis and Prediction of the IPv6 Traffic over Campus Networks in Shanghai
Future Internet 2022, 14(12), 353; https://doi.org/10.3390/fi14120353 - 27 Nov 2022
Abstract
With the exhaustion of IPv4 addresses, research on the adoption, deployment, and prediction of IPv6 networks is becoming increasingly significant. This paper analyzes the IPv6 traffic of two campus networks in Shanghai, China. We first conduct a series of analyses of the traffic patterns and uncover weekday/weekend patterns, the self-similarity phenomenon, and the correlation between IPv6 and IPv4 traffic. On weekends, traffic usage is smaller than on weekdays, but the distribution does not change much. We find that the self-similarity of IPv4 traffic is close to that of IPv6 traffic, and that there is a strong positive correlation between IPv6 and IPv4 traffic. Based on our findings on traffic patterns, we propose a new IPv6 traffic prediction model that combines the advantages of statistical and deep learning models. In addition, our model extracts useful information from the corresponding IPv4 traffic to enhance the prediction. On two real-world datasets, the proposed model outperforms eight baselines with a lower prediction error. In conclusion, our approach is helpful for network resource allocation and network management.
(This article belongs to the Topic Big Data and Artificial Intelligence)
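The weekday/weekend and cross-protocol analyses above rest on standard statistics. As an illustrative sketch (not the authors' code), the reported IPv4/IPv6 correlation can be measured with a plain Pearson coefficient over aligned traffic samples:

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length traffic series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

A value near +1, as the paper reports for IPv6 vs. IPv4 volumes, is what justifies feeding IPv4 traffic into the IPv6 predictor.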
Article
An Effective Online Sequential Stochastic Configuration Algorithm for Neural Networks
Sustainability 2022, 14(23), 15601; https://doi.org/10.3390/su142315601 - 23 Nov 2022
Abstract
Random Vector Functional-link (RVFL) networks, a class of random learner models, have received considerable attention from the neural network research community because they yield fast learning algorithms and models: the hidden layer parameters are randomly generated and remain fixed during the training phase. However, their universal approximation ability may not be guaranteed if the random parameters are not properly selected in an appropriate range. Moreover, the resulting random learner's generalization performance may seriously deteriorate if the RVFL network's structure is not well designed. The stochastic configuration (SC) algorithm, which incrementally constructs a universal approximator by obtaining random hidden parameters under a specified supervisory mechanism, rather than fixing the selection scope in advance without any reference to training information, can effectively circumvent these issues caused by randomness. This paper extends the SC algorithm to an online sequential version, termed the OSSC algorithm, by means of the recursive least squares (RLS) technique, aiming to cope with modeling tasks where training observations arrive sequentially. Compared to the online sequential learning of RVFL networks (OS-RVFL for short), our proposed OSSC algorithm avoids setting an unreasonable range for the random parameters and can successfully build a random learner with preferable learning and generalization capabilities. The experimental study shows the effectiveness and advantages of our OSSC algorithm.

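The online-sequential extension above hinges on the recursive least squares update, which folds each new observation into the output weights without refitting from scratch. A minimal sketch of a standard RLS step follows; the supervisory mechanism for configuring random hidden nodes is the paper's contribution and is not shown:

```python
def rls_init(dim, delta=1.0):
    """Initialize weights w = 0 and inverse covariance P = (1/delta) * I."""
    w = [0.0] * dim
    P = [[(1.0 / delta) if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    return w, P

def rls_update(w, P, x, y):
    """One RLS step: fold a single observation (x, y) into w and P."""
    n = len(x)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]  # P @ x
    denom = 1.0 + sum(x[i] * Px[i] for i in range(n))
    k = [v / denom for v in Px]                      # gain vector
    err = y - sum(w[i] * x[i] for i in range(n))     # a priori error
    w = [w[i] + k[i] * err for i in range(n)]
    # P <- P - k (x^T P); x^T P equals Px^T because P stays symmetric
    P = [[P[i][j] - k[i] * Px[j] for j in range(n)] for i in range(n)]
    return w, P
```

Feeding samples of a noiseless linear target one at a time drives the weights to the true coefficients, which is the property the OSSC algorithm builds on.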
Article
Improving Natural Language Person Description Search from Videos with Language Model Fine-Tuning and Approximate Nearest Neighbor
Big Data Cogn. Comput. 2022, 6(4), 136; https://doi.org/10.3390/bdcc6040136 - 11 Nov 2022
Abstract
Due to the ubiquitous nature of CCTV cameras that record continuously, there is a large amount of unstructured video data. Often, when these recordings have to be reviewed, it is to look for a specific person who fits a certain description. Currently, this is achieved by manual inspection of the videos, which is both time-consuming and labor-intensive. While person description search is not a new topic, this work makes two contributions. First, we improve upon the existing state of the art by proposing unsupervised fine-tuning of the language model that forms the main part of the text branch of person description search models. This leads to higher recall values on the standard dataset. Second, we engineered a complete pipeline from video files to fast searchable objects. Thanks to an approximate nearest neighbor search and some model optimizations, a person description search returns results immediately when deployed on a standard PC with no GPU, allowing interactive search. We demonstrated the effectiveness of the system on new data and showed that most people in the videos can be successfully discovered by the search.

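The searchable-object pipeline described above boils down to embedding every video detection once and comparing a query embedding against the index. As a simplified stand-in for the approximate nearest neighbor index (which trades exactness for speed), here is the exact cosine search it approximates, with embeddings normalized once at build time so each query is a plain dot product:

```python
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def build_index(embeddings):
    """Pre-normalize once so cosine similarity reduces to a dot product."""
    return [normalize(v) for v in embeddings]

def search(index, query, top_k=3):
    """Return indices of the top_k most cosine-similar stored embeddings."""
    q = normalize(query)
    scores = [(sum(a * b for a, b in zip(v, q)), i) for i, v in enumerate(index)]
    scores.sort(reverse=True)
    return [i for _, i in scores[:top_k]]
```

An ANN library replaces the linear scan in `search` with a sublinear index lookup, which is what makes interactive CPU-only search feasible.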
Article
Unsupervised Cluster-Wise Hyperspectral Band Selection for Classification
Remote Sens. 2022, 14(21), 5374; https://doi.org/10.3390/rs14215374 - 27 Oct 2022
Abstract
A hyperspectral image provides fine details about the scene under analysis, due to its multiple bands. However, the resulting high dimensionality in the feature space may render a classification task unreliable, mainly due to overfitting and the Hughes phenomenon. To attenuate such problems, one can resort to dimensionality reduction (DR). Thus, this paper proposes a new DR algorithm, which performs an unsupervised band selection technique following a clustering approach. More specifically, the data set is split into a predefined number of clusters, after which the bands are iteratively selected based on the parameters of a separating hyperplane that provides the best separation in the feature space, in a one-versus-all scenario. Then, the initially selected bands are fine-tuned based on the separability of clusters. A comparison with five other state-of-the-art frameworks shows that the proposed method achieved the best classification results in 60% of the experiments.

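The paper selects bands from the parameters of the best separating hyperplane per cluster. As a rough, hypothetical analogue (not the authors' criterion), one can rank bands by a Fisher-style between/within spread ratio computed from the cluster assignments:

```python
def band_scores(X, labels):
    """Score each band by between-cluster vs. within-cluster spread."""
    n_bands = len(X[0])
    clusters = set(labels)
    scores = []
    for b in range(n_bands):
        means = {}
        for c in clusters:
            vals = [x[b] for x, l in zip(X, labels) if l == c]
            means[c] = sum(vals) / len(vals)
        overall = sum(x[b] for x in X) / len(X)
        between = sum((m - overall) ** 2 for m in means.values())
        within = sum((x[b] - means[l]) ** 2 for x, l in zip(X, labels)) / len(X)
        scores.append(between / (within + 1e-12))  # guard against zero spread
    return scores

def select_bands(X, labels, n):
    """Keep the n bands whose clusters separate best under this score."""
    s = band_scores(X, labels)
    return sorted(range(len(s)), key=lambda b: -s[b])[:n]
```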
Article
Auto-Learning Correlation-Filter-Based Target State Estimation for Real-Time UAV Tracking
Remote Sens. 2022, 14(21), 5299; https://doi.org/10.3390/rs14215299 - 23 Oct 2022
Abstract
Most existing tracking methods based on discriminative correlation filters (DCFs) update the tracker every frame with a fixed learning rate. However, constantly adjusting the tracker can hardly handle the fickle target appearance in UAV tracking (e.g., partial occlusion, illumination variation, or deformation). To mitigate this, we propose a novel auto-learning correlation filter for UAV tracking, which fully exploits the valuable information behind response maps for adaptive feedback updating. Concretely, we first introduce a principled target state estimation (TSE) criterion to reveal the confidence level of the tracking results. We then suggest an auto-learning strategy with the TSE metric to update the tracker with adaptive learning rates. Based on the target state estimation, we further develop an innovative lost-and-found strategy to recognize and handle temporary target loss. Finally, we incorporate the TSE regularization term into the DCF objective function, which can be solved efficiently by alternating optimization iterations without much computational cost. Extensive experiments on four widely used UAV benchmarks demonstrate the superiority of the proposed method over both DCF and deep-learning-based trackers. Notably, ALCF achieved state-of-the-art performance on several benchmarks while running at over 50 FPS on a single CPU. Code will be released soon.

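The TSE criterion itself is the paper's contribution and is not reproduced here. A widely used response-map confidence measure in DCF tracking with the same flavor is the average peak-to-correlation energy (APCE); the sketch below pairs it with a confidence-gated learning rate (the threshold is illustrative, not from the paper):

```python
def apce(response):
    """Average peak-to-correlation energy of a 2-D response map.

    Sharp, unimodal maps (confident tracking) score high; flat or noisy
    maps (occlusion, drift) score low.
    """
    flat = [v for row in response for v in row]
    peak, trough = max(flat), min(flat)
    energy = sum((v - trough) ** 2 for v in flat) / len(flat)
    return (peak - trough) ** 2 / energy if energy else 0.0

def adaptive_lr(response, base_lr=0.02, threshold=10.0):
    """Shrink the filter update rate when the response map looks unreliable."""
    c = apce(response)
    return base_lr if c >= threshold else base_lr * c / threshold
```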
Article
Supporting Meteorologists in Data Analysis through Knowledge-Based Recommendations
Big Data Cogn. Comput. 2022, 6(4), 103; https://doi.org/10.3390/bdcc6040103 - 28 Sep 2022
Abstract
Climate change means that everybody must cope, directly or indirectly, with extreme weather conditions. Therefore, analyzing meteorological data to create precise models is gaining importance and might become inevitable. Meteorologists have extensive domain knowledge about meteorological data yet often lack practical data analysis skills. This paper presents a method to bridge this gap by empowering the data knowledge carriers to analyze the data. The proposed system utilizes symbolic AI, a knowledge base created by experts, and a recommendation expert system to offer suitable data analysis methods or data pre-processing to meteorologists. The paper systematically analyzes the target user group of meteorologists and practical use cases to arrive at a conceptual and technical system design, implemented in the CAMeRI prototype. The concepts are aligned with the AI2VIS4BigData Reference Model and comprise a novel first-order logic knowledge base that represents analysis methods and related pre-processing steps. The prototype implementation was evaluated qualitatively and quantitatively, including recommendation validation on real-world data, a cognitive walkthrough, and measurements of the computation times of the different system components.

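The first-order-logic knowledge base encodes which analysis methods suit which data properties. A toy sketch of the recommendation idea follows; the rule contents are invented for illustration and are not taken from CAMeRI:

```python
# Each rule maps a set of required data properties to a recommended method.
RULES = [
    ({"temporal", "numeric"}, "time-series forecasting"),
    ({"spatial", "numeric"}, "spatial interpolation"),
    ({"categorical"}, "frequency analysis"),
]

def recommend(data_properties):
    """Return every method whose preconditions the data set satisfies."""
    props = set(data_properties)
    return [method for cond, method in RULES if cond <= props]
```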
Article
Image Retrieval Algorithm Based on Locality-Sensitive Hash Using Convolutional Neural Network and Attention Mechanism
Information 2022, 13(10), 446; https://doi.org/10.3390/info13100446 - 24 Sep 2022
Abstract
With the continuous progress of image retrieval technology, the speed of searching for a desired image within a large body of image data has become a hot issue. Convolutional Neural Networks (CNNs) have been used in image retrieval, but many CNN-based retrieval systems express image features poorly, resulting in low retrieval accuracy and robustness. When the target image is retrieved from a large amount of image data, the vector dimension after image encoding is high and retrieval efficiency is low. Locality-sensitive hashing is a method for finding similar data among massive high-dimensional data: it reduces the dimensionality of the original spatial data through hash coding and conversion while maintaining the similarity between data points, with low retrieval time and space complexity. Therefore, this paper proposes a locality-sensitive-hash image retrieval method based on a CNN and an attention mechanism. The method proceeds as follows: the ResNet50 network is used as the feature extractor, with an attention module added after the convolution layers; the output of the network's fully connected layer provides the features of the image database; the locality-sensitive hash algorithm then hash-codes the database features to reduce the dimensionality and build an index; and finally the features of the query image are compared against the image database to retrieve the most similar images, completing the content-based image retrieval task. The method is compared with other image retrieval methods on the corel1k and corel5k datasets. The experimental results show that it effectively improves retrieval accuracy, significantly improves retrieval efficiency, and is more robust across different scenarios.

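The locality-sensitive hashing step described above can be sketched with random-projection (sign) hashing: nearby feature vectors tend to fall into the same bucket, so a query only compares against its bucket instead of the whole database. A minimal illustration (the paper applies this to CNN features; the parameters here are arbitrary):

```python
import random

def make_hasher(dim, n_bits, seed=0):
    """Hash a vector to n_bits sign bits against random Gaussian hyperplanes."""
    rng = random.Random(seed)
    planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]
    def h(v):
        return tuple(int(sum(p * x for p, x in zip(plane, v)) >= 0)
                     for plane in planes)
    return h

def build_table(vectors, h):
    """Bucket every vector index by its hash code."""
    table = {}
    for i, v in enumerate(vectors):
        table.setdefault(h(v), []).append(i)
    return table

def query(table, h, v):
    """Candidate indices sharing the query's bucket (may miss some neighbors)."""
    return table.get(h(v), [])
```

Vectors pointing in opposite directions flip every sign bit, so they land in different buckets, while near-duplicates usually share one; that is the dimensionality-reducing index the abstract describes.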
Article
A Data-Driven Based Method for Pipeline Additional Stress Prediction Subject to Landslide Geohazards
Sustainability 2022, 14(19), 11999; https://doi.org/10.3390/su141911999 - 22 Sep 2022
Abstract
Pipelines that cross complex geological terrains are inevitably threatened by natural hazards, among which landslides attract extensive attention when pipelines cross mountainous areas. Landslides are typically associated with ground movements that induce additional stress on the pipeline. Such a stress state under landslide interference seriously damages the structural integrity of the pipeline. To date, limited research has combined landslide hazard analysis with pipeline stress state analysis. In this paper, a multi-parameter integrated monitoring system was developed for pipeline stress-strain state and landslide deformation monitoring. In addition, data-driven models for predicting the additional pipeline stress were established. The developed predictive models include individual and ensemble-based machine learning approaches. The implementation procedure integrates the field data measured by the monitoring system, with k-fold cross validation used to evaluate generalization performance. The obtained results indicate that the XGBoost model has the highest performance in predicting the additional stress. Furthermore, the significance of the input variables is determined through sensitivity analyses using feature importance criteria. Thus, the integrated monitoring system together with the XGBoost prediction method is beneficial for modeling the additional stress in oil and gas pipelines, which will further contribute to pipeline geohazard monitoring and management.

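The k-fold cross validation protocol mentioned above can be sketched generically: every sample serves as test data exactly once, and the reported error averages over all folds. The stand-in model below is a trivial mean predictor, not the XGBoost model the authors use:

```python
def k_fold_splits(n, k):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

def cross_val_mse(X, y, fit, predict, k=5):
    """Mean squared error aggregated over all k held-out folds."""
    errors = []
    for train, test in k_fold_splits(len(y), k):
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in test:
            errors.append((predict(model, X[i]) - y[i]) ** 2)
    return sum(errors) / len(errors)
```

In the paper's setting, `fit`/`predict` would wrap the XGBoost regressor trained on the monitoring-system features.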
Data Descriptor
A Worldwide Bibliometric Analysis of Publications on Artificial Intelligence and Ethics in the Past Seven Decades
Sustainability 2022, 14(18), 11125; https://doi.org/10.3390/su141811125 - 06 Sep 2022
Abstract
Issues related to artificial intelligence (AI) and ethics have gained much traction worldwide, and the impact of AI on society has been extensively discussed. This study presents a bibliometric analysis of research results, citation relationships among researchers, and highly referenced journals on AI and ethics on a global scale. Papers on AI and ethics were recovered from the Microsoft Academic Graph Collection data set, with the subject terms "artificial intelligence" and "ethics." Up to 5 July 2021, 1585 papers on AI and ethics were recovered, with researchers from 66 nations contributing. North America, Western Europe, and East Asia were the most productive regions. The top ten nations produced about 94.37% of the papers, and the United States accounted for 47.59% (286 articles) of all papers. When adjusted for population size, Switzerland had the highest research production per million persons (1.39), followed by the Netherlands (1.26) and the United Kingdom (1.19). The most productive authors were Khatib, O. (n = 10), Verner, I. (n = 9), Bekey, G. A. (n = 7), Gennert, M. A. (n = 7), and Chatila, R. (n = 7). The analysis shows that research on artificial intelligence and ethics has evolved dramatically over the past 70 years, and that the United States is more involved in AI and ethics research than developing or emerging countries.

Article
Hierarchical Co-Attention Selection Network for Interpretable Fake News Detection
Big Data Cogn. Comput. 2022, 6(3), 93; https://doi.org/10.3390/bdcc6030093 - 05 Sep 2022
Abstract
Fake news on social media has become a pervasive and problematic issue with the development of the internet. Recent studies have utilized different artificial intelligence technologies to verify the truth of news and provide explanations for the results, showing remarkable success in interpretable fake news detection. However, individuals' judgments of news are usually hierarchical, prioritizing valuable words above essential sentences, which existing fake news detection models neglect. In this paper, we propose a novel interpretable neural-network-based model, the hierarchical co-attention selection network (HCSN), to predict whether a source post is fake and to produce an explanation that emphasizes important comments and particular words. The key insight of HCSN is to incorporate the Gumbel-Max trick into a hierarchical co-attention selection mechanism that captures sentence-level and word-level information from the source post and comments, following the sequence words-sentences-words-event. In addition, HCSN enjoys the benefit of interpretability: it provides an explicit account of how it reaches its results by selecting comments and highlighting words. In experiments on real-world datasets, our model outperformed state-of-the-art methods and generated reasonable explanations.

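The Gumbel-Max trick used in the selection mechanism turns sampling from a categorical (softmax) distribution into an argmax over logits perturbed with Gumbel noise, which is what lets a network "select" discrete comments and words. A minimal sketch of the trick itself:

```python
import math
import random

def gumbel_noise(rng):
    # Gumbel(0, 1) sample; clamp u away from 0 to keep log() finite.
    u = max(rng.random(), 1e-12)
    return -math.log(-math.log(u))

def gumbel_max_sample(logits, rng=random):
    """Draw index i with probability softmax(logits)[i], via argmax."""
    noisy = [l + gumbel_noise(rng) for l in logits]
    return max(range(len(logits)), key=lambda i: noisy[i])
```

In HCSN the same construction (in its differentiable Gumbel-Softmax relaxation) drives the hierarchical co-attention selection; the sketch above shows only the sampling identity.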
Article
Topical and Non-Topical Approaches to Measure Similarity between Arabic Questions
Big Data Cogn. Comput. 2022, 6(3), 87; https://doi.org/10.3390/bdcc6030087 - 22 Aug 2022
Abstract
Questions are crucial expressions in any language. Many Natural Language Processing (NLP) and Natural Language Understanding (NLU) applications, such as question-answering systems, chatbots, digital virtual assistants, and opinion mining, can benefit from accurately identifying similar questions in an effective manner. We detail methods for identifying similarities between Arabic questions posted online by Internet users and organizations. Our novel approach uses a non-topical rule-based methodology together with topical information (textual, lexical, and semantic similarity) to determine whether a pair of Arabic questions are paraphrases of each other. Our method computes the lexical and linguistic distances between each pair of questions. Additionally, it classifies questions according to their format and scope using expert hypotheses (rules) that have been experimentally shown to be useful and practical. Even if there is a high degree of lexical similarity between a When question (Timex Factoid, inquiring about time) and a Who question (Enamex Factoid, asking about a named entity), they will not be similar. In an experiment using 2200 question pairs, our method attained an accuracy of 0.85, which is remarkable given the simplicity of the solution and the fact that we did not employ any language models or word embeddings. To cover common Arabic queries posed by Arabic Internet users, we gathered the questions from various online forums and resources. In this study, we describe a unique method for detecting question similarity that does not require intensive processing, a sizable linguistic corpus, or a costly semantic repository. Because rich Arabic textual resources are scarce, this is especially important for informal Arabic text processing on the Internet.
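The When-vs-Who rule described above can be sketched as a type gate placed in front of a lexical overlap score: mismatched question types short-circuit to "not similar" regardless of word overlap. The keywords and threshold below are illustrative English stand-ins for the paper's Arabic rules:

```python
# Hypothetical question-type keywords (the paper's rules target Arabic).
QUESTION_TYPES = {"when": "time", "who": "entity", "where": "place", "how many": "number"}

def question_type(q):
    q = q.lower()
    for kw, qtype in QUESTION_TYPES.items():
        if q.startswith(kw):
            return qtype
    return "other"

def jaccard(a, b):
    """Word-level Jaccard overlap as a simple lexical similarity."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def similar(q1, q2, threshold=0.5):
    """Rule first: mismatched question types can never be similar."""
    if question_type(q1) != question_type(q2):
        return False
    return jaccard(q1, q2) >= threshold
```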
Article
Machine-Learning-Based Gender Distribution Prediction from Anonymous News Comments: The Case of Korean News Portal
Sustainability 2022, 14(16), 9939; https://doi.org/10.3390/su14169939 - 11 Aug 2022
Cited by 1
Abstract
Anonymous news comment data from naver.com, a news portal in South Korea, can help conduct gender research and resolve related issues for sustainable societies. Nevertheless, only a small portion of gender information (i.e., gender distribution) is open to the public, so these data have rarely been considered for gender research. This paper aims to resolve the problem of incomplete gender information and make anonymous news comment data usable for gender research as new social media big data. It proposes a machine-learning-based approach for predicting the gender distribution (i.e., male and female rates) of the anonymous commenters on a news article. First, a big data set of news articles and their anonymous comments was collected and divided into labeled and unlabeled subsets (i.e., with and without gender information). The word2vec approach was employed to represent a news article by the characteristics of its comments. Then, using the labeled dataset, various prediction techniques were evaluated for predicting the gender distribution of the anonymous commenters on a labeled news article. The neural network emerged as the best prediction technique, accurately predicting the gender distribution of the anonymous commenters. This shows that a machine-learning-based approach can overcome the incomplete gender information problem of anonymous social media users. Moreover, when the gender distributions of the unlabeled news articles were predicted using the best neural network model trained on the labeled dataset, their distribution turned out to differ from that of the labeled news articles. This indicates that using only the labeled dataset for gender research can lead to misleading findings and distorted conclusions. The predicted gender distributions for the unlabeled news articles can help to better understand anonymous news commenters for sustainable societies. Ultimately, this study provides a new way to conduct data-driven computational social science with incomplete and anonymous social media big data.

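Representing an article by the characteristics of its comments, as described above, reduces in the simplest word2vec setup to averaging the vectors of the comment words. A minimal sketch with a toy vocabulary (the paper's actual feature construction may differ):

```python
def doc_vector(tokens, word_vectors, dim):
    """Average the vectors of known words; zero vector if none are known."""
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
```

The resulting fixed-length vector per article is what the downstream regressor (here, a neural network predicting the male/female rates) consumes.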
Article
Using Explainable Artificial Intelligence to Identify Key Characteristics of Deep Poverty for Each Household
Sustainability 2022, 14(16), 9872; https://doi.org/10.3390/su14169872 - 10 Aug 2022
Abstract
The first task in eradicating poverty is accurate poverty identification. Deep poverty identification helps direct resources to deeply poor populations, one of the most challenging tasks in poverty eradication. This study constructs a deep poverty identification model utilizing explainable artificial intelligence (XAI) to identify deeply poor households based on data from 23,307 poor households in rural areas of China. For comparison, a logistic-regression-based model and an income-based model are developed as well. We found that our XAI-based model achieves higher identification performance, in terms of the area under the ROC curve, than both the logistic-regression-based model and the income-based model. For each rural household, the odds of being identified as deeply poor are obtained. Additionally, the multidimensional household characteristics associated with deep poverty are specified and ranked for each poor household, whereas ordinary feature ranking methods can only provide ranking results for poor households as a whole. Considering all poor households, we found that common important characteristics for identifying deeply poor households include household income, disability, village attributes, lack of funds, labor force, disease, and number of household members, findings validated by mutual information analysis. In conclusion, our XAI-based model can identify deep poverty and specify the key household characteristics associated with deep poverty for individual households, facilitating the development of new targeted poverty reduction strategies.

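Per-household rankings like those described above can be illustrated with the simplest attribution scheme, coefficient-times-value contributions of a linear model. This is a stand-in for the paper's XAI method, and the feature names and coefficients below are invented:

```python
def per_sample_ranking(coefs, names, x):
    """Rank features for one sample by |coef * value| contribution to the score.

    Returns (name, signed_contribution) pairs, largest magnitude first,
    so each household gets its own ordering rather than a global one.
    """
    contribs = [(abs(c * v), name, c * v) for c, v, name in zip(coefs, x, names)]
    contribs.sort(reverse=True)
    return [(name, signed) for _, name, signed in contribs]
```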
Article
Efficient Supervised Image Clustering Based on Density Division and Graph Neural Networks
Remote Sens. 2022, 14(15), 3768; https://doi.org/10.3390/rs14153768 - 05 Aug 2022
Abstract
In recent research, supervised image clustering based on Graph Neural Network (GNN) connectivity prediction has demonstrated considerable improvements over traditional clustering algorithms. However, existing supervised image clustering algorithms are usually time-consuming, which limits their applications. To infer the connectivity between image instances, they usually create a subgraph for each image instance; the creation and processing of a large number of subgraphs as GNN input incur enormous computation overheads. To address this problem, we present a time-efficient and effective GNN-based supervised clustering framework based on density division, namely DDC-GNN. DDC-GNN divides all image instances into high-density and low-density parts and performs GNN subgraph connectivity prediction only on the low-density parts, resulting in a significant reduction in redundant calculations. We test two typical models in the GNN connectivity prediction module of the DDC-GNN framework: a graph convolutional network (GCN)-based model and a graph auto-encoder (GAE)-based model. Meanwhile, adaptive subgraphs are generated instead of fixed-size subgraphs to ensure sufficient contextual information extraction for the low-density parts. In experiments on different datasets, DDC-GNN achieves higher accuracy and is almost five times faster than variants without the density division strategy.
(This article belongs to the Topic Big Data and Artificial Intelligence)

Article
A Study on the Optimal Flexible Job-Shop Scheduling with Sequence-Dependent Setup Time Based on a Hybrid Algorithm of Improved Quantum Cat Swarm Optimization
Sustainability 2022, 14(15), 9547; https://doi.org/10.3390/su14159547 - 03 Aug 2022
Abstract
Multi-item, small-lot-size production modes lead to frequent setup, which involves significant setup times and has a substantial impact on productivity. In this study, we investigated the optimal flexible job-shop scheduling problem with sequence-dependent setup times. We built a mathematical model with the objective of minimizing the maximum completion time (makespan). Considering that the process sequence is influenced by setup time, processing time, and machine load limitations, processing machinery is first chosen based on machine load and processing time, and processing tasks are then scheduled based on setup time and processing time. An improved quantum cat swarm optimization (QCSO) algorithm is proposed to solve the problem: a quantum coding method is introduced, the quantum bit (Q-bit) representation is combined with the cat swarm optimization (CSO) algorithm, the cats are iteratively updated by quantum rotation angle position, and a dynamic mixture ratio (MR) value is selected according to the number of algorithm iterations. This method improves exploration of the search space and increases operational efficiency and speed. Finally, the improved QCSO algorithm and a parallel genetic algorithm (PGA) are compared through simulation experiments. The results show that the improved QCSO algorithm obtains better results and improved robustness. Full article
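The Q-bit encoding and rotation update at the heart of quantum-inspired algorithms like QCSO can be sketched generically. This is a minimal illustration of the standard quantum-rotation mechanics, not the paper's scheduler: the rotation angle, iteration count, and "best solution" here are arbitrary toy values.

```python
import math, random

def rotate(alpha, beta, theta):
    """Apply a quantum rotation gate to one Q-bit's (alpha, beta)
    amplitudes."""
    return (alpha * math.cos(theta) - beta * math.sin(theta),
            alpha * math.sin(theta) + beta * math.cos(theta))

def measure(qbits, rng):
    """Collapse each Q-bit to a classical bit; P(bit = 1) = beta^2."""
    return [1 if rng.random() < beta ** 2 else 0 for _, beta in qbits]

def update_towards(qbits, best_bits, theta=0.02 * math.pi):
    """Rotate each Q-bit so its probability of measuring 1 drifts
    toward the corresponding bit of the best solution found so far."""
    out = []
    for (a, b), target in zip(qbits, best_bits):
        sign = 1.0 if target == 1 else -1.0
        out.append(rotate(a, b, sign * theta))
    return out

rng = random.Random(7)
n = 8
qbits = [(1 / math.sqrt(2), 1 / math.sqrt(2))] * n  # equal superposition
best = [1] * n  # pretend the best-known schedule encodes to all ones
for _ in range(10):
    qbits = update_towards(qbits, best)
probs = [beta ** 2 for _, beta in qbits]
bits = measure(qbits, rng)
```

Each rotation nudges the measurement probabilities toward the incumbent best, so the population concentrates around good solutions while measurement noise preserves exploration.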
(This article belongs to the Topic Big Data and Artificial Intelligence)

Article
LGB-PHY: An Evaporation Duct Height Prediction Model Based on Physically Constrained LightGBM Algorithm
Remote Sens. 2022, 14(14), 3448; https://doi.org/10.3390/rs14143448 - 18 Jul 2022
Cited by 1
Abstract
The evaporation duct is a special atmospheric stratification that significantly influences the propagation path of electromagnetic waves at sea and is hence crucial for the stability of radio communication systems. Affected by physical parameters that are not universal, traditional evaporation duct theoretical models often have limited accuracy and poor generalization ability; the remote sensing method, for example, is limited by its inversion algorithm. The accuracy, generalization ability, and scientific interpretability of existing purely data-driven evaporation duct height prediction models also still need to be improved. To address these issues, in this paper we use voyage observation data and propose a physically constrained LightGBM evaporation duct height prediction model (LGB-PHY). The proposed model integrates the Babin–Young–Carton (BYC) physical model into a custom loss function. Compared with the eXtreme Gradient Boosting (XGB) model, the LGB-PHY model trained on a 5-day voyage data set of the South China Sea provides a significant improvement: the RMSE index is reduced by 68%, while the SCC index is improved by 6.5%. We further carried out a cross-comparison experiment on regional generalization and showed that in high-latitude sea areas, where the BYC model adapts well, the LGB-PHY model has stronger regional generalization performance than the XGB model. Full article
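A physically constrained loss of this kind can be written as a gradient-boosting custom objective. The sketch below is our own generic formulation, not the paper's: it blends squared error against the labels with a penalty pulling predictions toward a physical model's output (the role the BYC model plays in LGB-PHY), and returns the gradient/Hessian pair that LightGBM-style custom objectives expect.

```python
def physics_constrained_objective(lam, y_phys):
    """Build a custom objective: (1 - lam) * squared error against the
    labels plus lam * squared deviation from a physical model's
    predictions y_phys (here just a supplied list)."""
    def objective(y_true, y_pred):
        grad = [2 * (1 - lam) * (p - t) + 2 * lam * (p - f)
                for p, t, f in zip(y_pred, y_true, y_phys)]
        hess = [2.0] * len(y_pred)  # constant for a quadratic loss
        return grad, hess
    return objective

# Toy check: with lam = 0.5 the gradient vanishes exactly at the
# midpoint of the label and the physical prediction.
y_true = [10.0, 20.0]
y_phys = [14.0, 16.0]
obj = physics_constrained_objective(0.5, y_phys)
grad, hess = obj(y_true, [12.0, 18.0])
```

Tuning `lam` trades off fidelity to the observations against agreement with the physics, which is what gives the hybrid model its better regional generalization.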
(This article belongs to the Topic Big Data and Artificial Intelligence)

Article
Object Localization in Weakly Labeled Remote Sensing Images Based on Deep Convolutional Features
Remote Sens. 2022, 14(13), 3230; https://doi.org/10.3390/rs14133230 - 05 Jul 2022
Cited by 1
Abstract
Object recognition, as one of the most fundamental and challenging problems in high-resolution remote sensing image interpretation, has received increasing attention in recent years. However, most conventional object recognition pipelines aim to recognize instances with bounding boxes in a supervised learning strategy, which requires intensive manual labor for instance annotation. In this paper, we propose a weakly supervised learning method to alleviate this problem. The core idea of our method is to recognize multiple objects in an image using only image-level semantic labels and to indicate the recognized objects with location points instead of box extents. Specifically, a deep convolutional neural network is first trained to perform semantic scene classification, the result of which is employed for the categorical determination of objects in an image. Then, by back-propagating the categorical feature from the fully connected layer to the deep convolutional layer, the categorical and spatial information of an image are combined to obtain an object discriminative localization map, which can effectively indicate the salient regions of objects. Next, a dynamic updating method of local response extrema is proposed to further determine the locations of objects in an image. Finally, extensive experiments are conducted to localize aircraft and oil tanks in remote sensing images based on different convolutional neural networks. Experimental results show that the proposed method outperforms state-of-the-art methods, achieving precision, recall, and F1-scores of 94.50%, 88.79%, and 91.56% for aircraft localization and 89.12%, 83.04%, and 85.97% for oil-tank localization, respectively. We hope that our work can serve as a basic reference for remote sensing object localization via a weakly supervised strategy and provide new opportunities for further research. Full article
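The "back-propagated categorical feature" step is, in its simplest form, a class-activation-style weighted sum of the last convolutional feature maps, followed by peak finding. The sketch below is a simplified stand-in for the paper's pipeline: the feature maps, weights, and 4-neighbour extremum rule are illustrative, not the authors' dynamic updating method.

```python
def class_activation_map(feature_maps, fc_weights):
    """Weighted sum of the last conv layer's feature maps, using the
    fully connected weights of the predicted class."""
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * w for _ in range(h)]
    for fmap, weight in zip(feature_maps, fc_weights):
        for i in range(h):
            for j in range(w):
                cam[i][j] += weight * fmap[i][j]
    return cam

def local_maxima(cam, threshold):
    """Return (row, col) points whose response exceeds the threshold
    and all 4-neighbours — a crude stand-in for dynamic local
    response extremum updating."""
    h, w = len(cam), len(cam[0])
    points = []
    for i in range(h):
        for j in range(w):
            v = cam[i][j]
            if v < threshold:
                continue
            neighbours = [cam[x][y]
                          for x, y in ((i - 1, j), (i + 1, j),
                                       (i, j - 1), (i, j + 1))
                          if 0 <= x < h and 0 <= y < w]
            if all(v > n for n in neighbours):
                points.append((i, j))
    return points

# Two feature maps, each firing on a different object location.
f1 = [[0, 0, 0, 0],
      [0, 9, 0, 0],
      [0, 0, 0, 0],
      [0, 0, 0, 0]]
f2 = [[0, 0, 0, 0],
      [0, 0, 0, 0],
      [0, 0, 0, 5],
      [0, 0, 0, 0]]
cam = class_activation_map([f1, f2], [1.0, 2.0])
objects = local_maxima(cam, threshold=5.0)
```

Each surviving peak is reported as a location point, which is exactly the point-not-box output format the abstract describes.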
(This article belongs to the Topic Big Data and Artificial Intelligence)

Article
Fog Computing Capabilities for Big Data Provisioning: Visualization Scenario
Sustainability 2022, 14(13), 8070; https://doi.org/10.3390/su14138070 - 01 Jul 2022
Abstract
With the development of Internet technologies, huge amounts of data are collected from various sources and used ‘anytime, anywhere’ to enrich and change the life of the whole of society, create new ways to do business, and better understand people’s lives. These datasets, called ‘big data’, need to be processed, stored, or retrieved, and special tools have been developed to analyze them. At the same time, the ever-increasing development of the Internet of Things (IoT) requires IoT devices to be mobile, with adequate data processing performance. The new fog computing paradigm makes computing resources more accessible and provides a flexible environment that will be widely used in next-generation networks, vehicles, etc., demonstrating enhanced capabilities and optimizing resources. This paper is devoted to analyzing fog computing capabilities for big data provisioning, considering this technology’s different architectural and functional aspects. The analysis includes exploring the protocols suitable for fog computing by implementing an experimental fog computing network and assessing its capabilities for provisioning big data, originating from both a real-time stream and batch data, with appropriate visualization of big data processing. Full article
(This article belongs to the Topic Big Data and Artificial Intelligence)

Article
GenericConv: A Generic Model for Image Scene Classification Using Few-Shot Learning
Information 2022, 13(7), 315; https://doi.org/10.3390/info13070315 - 28 Jun 2022
Abstract
Scene classification is one of the most complex tasks in computer vision. The accuracy of scene classification is dependent on other subtasks such as object detection and object classification. Accurate results may be accomplished by employing object detection in scene classification, since prior information about objects in the image leads to an easier interpretation of the image content. Machine learning and transfer learning are widely employed in scene classification, achieving optimal performance. Despite the promising performance of existing models in scene classification, major issues remain. First, the training phase for these models necessitates a large amount of data, whose collection is a difficult and time-consuming task. Furthermore, most models are reliant on data previously seen in the training set, resulting in ineffective models that can only identify samples similar to the training set. As a result, few-shot learning has been introduced. Although a few attempts have been reported applying few-shot learning to scene classification, none of them resulted in perfect accuracy. Motivated by these findings, in this paper we implement a novel few-shot learning model—GenericConv—for scene classification, evaluated using benchmark datasets: the MiniSun, MiniPlaces, and MIT-Indoor 67 datasets. The experimental results show that the proposed GenericConv model outperforms the other benchmark models on the three datasets, achieving accuracies of 52.16 ± 0.015, 35.86 ± 0.014, and 37.26 ± 0.014 for five-shots on the MiniSun, MiniPlaces, and MIT-Indoor 67 datasets, respectively. Full article
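The evaluation protocol behind "five-shot" results can be illustrated with the standard nearest-prototype baseline. This sketch shows the generic few-shot episode setup only, not the GenericConv architecture: the 2-D "embeddings", class names, and episode sizes are toy assumptions.

```python
import math

def prototype(vectors):
    """Mean embedding of a class's support (shot) examples."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]

def classify(query, prototypes):
    """Assign the query embedding to the nearest class prototype."""
    best, best_dist = None, math.inf
    for label, proto in prototypes.items():
        dist = math.dist(query, proto)
        if dist < best_dist:
            best, best_dist = label, dist
    return best

# A 2-way 3-shot toy episode with 2-D 'embeddings'.
support = {
    "indoor": [[0.9, 0.1], [1.0, 0.2], [0.8, 0.0]],
    "outdoor": [[0.1, 0.9], [0.0, 1.0], [0.2, 0.8]],
}
protos = {label: prototype(vs) for label, vs in support.items()}
pred = classify([0.85, 0.15], protos)
```

In a real evaluation the embeddings would come from the trained network, and accuracy is averaged over many randomly sampled episodes, which is why the reported numbers carry ± intervals.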
(This article belongs to the Topic Big Data and Artificial Intelligence)

Article
An Effective Ensemble Automatic Feature Selection Method for Network Intrusion Detection
Information 2022, 13(7), 314; https://doi.org/10.3390/info13070314 - 27 Jun 2022
Abstract
The mass of redundant and irrelevant data in network traffic poses serious challenges to intrusion detection, and feature selection can effectively remove meaningless information from the data. Most current filter and embedded feature selection methods use a fixed threshold or ratio to determine the number of features in a subset, which requires a priori knowledge. In contrast, wrapper feature selection methods are computationally complex and time-consuming; meanwhile, individual feature selection methods are biased in evaluating features. This work designs an ensemble-based automatic feature selection method called EAFS. First, we calculate feature importance or ranks based on individual methods, then add features to subsets sequentially by importance and evaluate subset performance comprehensively with a designed NSOM score, in order to obtain the subset with the largest NSOM value. When searching for a subset, the computational complexity is lowered by calculating the accuracy obtained with the full feature set and retaining only subsets with higher accuracy. Finally, the obtained subsets are ensembled. Experimental results on three large-scale public datasets show that the method described in this study aids classification and outperforms other recent methods. Full article
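The core "add by importance, keep the best-scoring prefix" loop can be sketched generically. This is our own simplification, not the EAFS implementation: the paper's NSOM metric is abstracted here as a caller-supplied `evaluate` function, and the toy evaluator is invented for illustration.

```python
def select_by_importance(importances, evaluate):
    """Add features one by one in descending importance and keep the
    prefix subset that maximises the evaluation score."""
    order = sorted(range(len(importances)),
                   key=lambda i: importances[i], reverse=True)
    best_subset, best_score = [], float("-inf")
    subset = []
    for feature in order:
        subset.append(feature)
        score = evaluate(subset)
        if score > best_score:
            best_subset, best_score = list(subset), score
    return best_subset, best_score

# Toy evaluator: features 0 and 2 are genuinely useful; the score
# rewards useful features and penalises subset size.
useful = {0, 2}
def evaluate(subset):
    return sum(1.0 for f in subset if f in useful) - 0.1 * len(subset)

importances = [0.8, 0.1, 0.9, 0.05]
best_subset, best_score = select_by_importance(importances, evaluate)
```

Running this with several individual ranking methods and ensembling the resulting subsets gives the overall shape of the EAFS procedure.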
(This article belongs to the Topic Big Data and Artificial Intelligence)

Article
VisualRPI: Visualizing Research Productivity and Impact
Sustainability 2022, 14(13), 7679; https://doi.org/10.3390/su14137679 - 23 Jun 2022
Abstract
Research productivity and impact (RPI) is commonly measured through citation analysis, such as the h-index. Despite the popularity and objectivity of this type of method, it is still difficult to effectively compare a number of related researchers in terms of various citation-related statistics at the same time, such as average cites per year/paper, the number of papers/citations, h-index, etc. In this work, we develop a method that employs information visualization technology, and examine its applicability for the assessment of researchers’ RPI. Specifically, our prototype, a visualizing research productivity and impact (VisualRPI) system, is introduced, which is composed of clustering and visualization components. The clustering component hierarchically clusters similar research statistics into the same groups, and the visualization component is used to display the RPI in a clear manner. A case example using information for 85 information systems researchers is used to demonstrate the usefulness of VisualRPI. The results show that this method easily measures the RPI for various performance indicators, such as cites/paper and h-index. Full article
(This article belongs to the Topic Big Data and Artificial Intelligence)

Article
A Mask-Guided Transformer Network with Topic Token for Remote Sensing Image Captioning
Remote Sens. 2022, 14(12), 2939; https://doi.org/10.3390/rs14122939 - 20 Jun 2022
Abstract
Remote sensing image captioning aims to describe the content of images using natural language. In contrast with natural images, the scale, distribution, and number of objects generally vary in remote sensing images, making it hard to capture global semantic information and the relationships between objects at different scales. In this paper, in order to improve the accuracy and diversity of captioning, a mask-guided Transformer network with a topic token is proposed. Multi-head attention is introduced to extract features and capture the relationships between objects. On this basis, a topic token is added into the encoder, which represents the scene topic and serves as a prior in the decoder to help us focus better on global semantic information. Moreover, a new Mask-Cross-Entropy strategy is designed in order to improve the diversity of the generated captions, which randomly replaces some input words with a special word (named [Mask]) in the training stage, with the aim of enhancing the model’s learning ability and forcing exploration of uncommon word relations. Experiments on three data sets show that the proposed method can generate captions with high accuracy and diversity, and the experimental results illustrate that the proposed method can outperform state-of-the-art models. Furthermore, the CIDEr score on the RSICD data set increased from 275.49 to 298.39. Full article
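The masking side of the Mask-Cross-Entropy strategy reduces to a simple token-replacement step at training time. The sketch below illustrates only that step, with an invented caption and masking probability; the loss computation and Transformer are omitted.

```python
import random

def mask_tokens(tokens, mask_prob, rng, mask_token="[Mask]"):
    """Randomly replace input words with a special mask token during
    training, forcing the decoder to rely on context rather than
    memorised word pairs."""
    return [mask_token if rng.random() < mask_prob else t
            for t in tokens]

rng = random.Random(0)
caption = "a large airport with many planes parked near the terminal".split()
masked = mask_tokens(caption, mask_prob=0.3, rng=rng)
```

The cross-entropy loss is then computed against the original, unmasked caption, so the model must predict the hidden words from their surroundings, which is what encourages more diverse generated captions.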
(This article belongs to the Topic Big Data and Artificial Intelligence)

Article
Application of Combined Models Based on Empirical Mode Decomposition, Deep Learning, and Autoregressive Integrated Moving Average Model for Short-Term Heating Load Predictions
Sustainability 2022, 14(12), 7349; https://doi.org/10.3390/su14127349 - 15 Jun 2022
Cited by 1
Abstract
Short-term building energy consumption prediction is of great significance for the optimized operation of building energy management systems and for energy conservation. Due to the high-dimensional nonlinear characteristics of building heat loads, traditional single machine-learning models cannot extract the features well. Therefore, in this paper, a combined model based on complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), four deep learning (DL) models, and the autoregressive integrated moving average (ARIMA) model is proposed. The DL models include a convolutional neural network, long short-term memory (LSTM), bi-directional LSTM (bi-LSTM), and the gated recurrent unit. CEEMDAN decomposes the heating load into different components to extract the different features, while the DL and ARIMA models are used for the prediction of heating load components with high and low complexity, respectively. The single-DL models and the CEEMDAN-DL combinations were also implemented for comparison purposes. The results show that the combined models achieved much higher accuracy than the single-DL models and the CEEMDAN-DL combinations. Compared with the single-DL models, the average coefficient of determination (R2), root mean square error (RMSE), and coefficient of variation of the RMSE (CV-RMSE) were improved by 2.91%, 47.93%, and 47.92%, respectively. Furthermore, CEEMDAN-bi-LSTM-ARIMA performed the best of all the combined models, achieving R2 = 0.983, RMSE = 70.25 kWh, and CV-RMSE = 1.47%. This study provides a new guide for developing combined models for building energy consumption prediction. Full article
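The decompose-predict-sum pattern behind such combined models can be sketched with deliberately crude stand-ins: a moving average replaces CEEMDAN, linear extrapolation plays the role of ARIMA on the smooth component, and persistence stands in for the DL model on the residual. All numbers and function names here are toy assumptions.

```python
def moving_average(series, window):
    """Crude stand-in for CEEMDAN: split the load series into a smooth
    trend component and a high-frequency residual."""
    half = window // 2
    trend = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        trend.append(sum(series[lo:hi]) / (hi - lo))
    residual = [x - t for x, t in zip(series, trend)]
    return trend, residual

def forecast_trend(trend):
    """Low-complexity component: linear extrapolation (the role ARIMA
    plays in the paper)."""
    return trend[-1] + (trend[-1] - trend[-2])

def forecast_residual(residual):
    """High-complexity component: placeholder for a DL model — here
    simply persistence of the last residual."""
    return residual[-1]

# Combined one-step-ahead heating-load forecast = sum of the
# per-component forecasts.
load = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0, 13.0, 15.0]
trend, residual = moving_average(load, window=3)
prediction = forecast_trend(trend) + forecast_residual(residual)
```

The point of the design is that each component is handed to the predictor suited to its complexity, and the final forecast is simply the sum of the component forecasts.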
(This article belongs to the Topic Big Data and Artificial Intelligence)

Article
EBBA: An Enhanced Binary Bat Algorithm Integrated with Chaos Theory and Lévy Flight for Feature Selection
Future Internet 2022, 14(6), 178; https://doi.org/10.3390/fi14060178 - 09 Jun 2022
Cited by 2
Abstract
Feature selection can efficiently improve classification accuracy and reduce the dimensionality of datasets. However, feature selection is a challenging and complex task that requires a high-performance optimization algorithm. In this paper, we propose an enhanced binary bat algorithm (EBBA), originating from the conventional binary bat algorithm (BBA), as the learning algorithm in a wrapper-based feature selection model. First, we model the feature selection problem and transform it into a fitness function. Then, we propose the EBBA for solving the feature selection problem. In the EBBA, we introduce a Lévy flight-based global search method, a population diversity boosting method, and a chaos-based loudness method to improve the BA and make it more applicable to feature selection problems. Finally, simulations are conducted to evaluate the proposed EBBA, and the simulation results demonstrate that it outperforms the comparison benchmarks. Moreover, we also illustrate the effectiveness of the proposed improvements through additional tests. Full article
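A minimal binary-bat loop with Lévy perturbation can be sketched as follows. This is a generic illustration under our own simplifications, not the EBBA: it uses Mantegna's algorithm for Lévy steps, one common sigmoid transfer variant for binarisation, an invented wrapper-style fitness, and omits the paper's diversity-boosting and chaos-based loudness methods.

```python
import math, random

def levy_step(rng, beta=1.5):
    """Draw a heavy-tailed Lévy-flight step via Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma = (num / den) ** (1 / beta)
    u = rng.gauss(0, sigma)
    v = rng.gauss(0, 1)
    return u / abs(v) ** (1 / beta)

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def binary_bat_search(fitness, n_features, n_bats=10, iters=50, seed=1):
    """Minimal binary bat loop: real-valued velocities are squashed by
    a sigmoid into per-bit flip probabilities, and Lévy steps perturb
    velocities for global exploration."""
    rng = random.Random(seed)
    bats = [[rng.randint(0, 1) for _ in range(n_features)]
            for _ in range(n_bats)]
    vels = [[0.0] * n_features for _ in range(n_bats)]
    best = max(bats, key=fitness)
    for _ in range(iters):
        for b in range(n_bats):
            for d in range(n_features):
                vels[b][d] += (bats[b][d] - best[d]) * rng.random()
                vels[b][d] += 0.1 * levy_step(rng)
                if rng.random() < sigmoid(vels[b][d]):
                    bats[b][d] = 1 - bats[b][d]
            if fitness(bats[b]) > fitness(best):
                best = list(bats[b])
    return best

# Wrapper-style toy fitness: reward two informative features,
# penalise subset size (a real wrapper would train a classifier here).
useful = {1, 4}
def fitness(bits):
    return (sum(2.0 for i, b in enumerate(bits) if b and i in useful)
            - 0.5 * sum(bits))

best = binary_bat_search(fitness, n_features=6)
```

Each bit of `best` marks whether the corresponding feature is kept, which is how a bat's position encodes a candidate feature subset.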
(This article belongs to the Topic Big Data and Artificial Intelligence)

Article
Semi-Supervised Cloud Detection in Satellite Images by Considering the Domain Shift Problem
Remote Sens. 2022, 14(11), 2641; https://doi.org/10.3390/rs14112641 - 31 May 2022
Cited by 1
Abstract
In semi-supervised cloud detection work, efforts are being made to learn a promising cloud detection model from a limited number of pixel-wise labeled images and a large number of unlabeled ones. However, remote sensing images obtained from the same satellite sensor often show a data distribution drift problem due to the different cloud shapes and land-cover types on the Earth’s surface. Therefore, there are domain distribution gaps between labeled and unlabeled satellite images. To solve this problem, we take the domain shift problem into account in the semi-supervised learning (SSL) network. Feature-level and output-level domain adaptations are applied to reduce the domain distribution gaps between labeled and unlabeled images, thus improving the prediction accuracy of the SSL network. Experimental results on Landsat-8 OLI and GF-1 WFV multispectral images demonstrate that the proposed semi-supervised cloud detection network (SSCDnet) is able to achieve promising cloud detection performance when using a limited number of labeled samples and outperforms several state-of-the-art SSL methods. Full article
(This article belongs to the Topic Big Data and Artificial Intelligence)

Article
UAVSwarm Dataset: An Unmanned Aerial Vehicle Swarm Dataset for Multiple Object Tracking
Remote Sens. 2022, 14(11), 2601; https://doi.org/10.3390/rs14112601 - 28 May 2022
Cited by 2
Abstract
In recent years, with the rapid development of unmanned aerial vehicle (UAV) technology and swarm intelligence technology, hundreds of small-scale, low-cost UAVs constituting swarms can carry out complex combat tasks in the form of ad hoc networks, which brings great threats and challenges to low-altitude airspace defense. Given the security requirements of low-altitude airspace defense, using visual detection technology to detect and track incoming UAV swarms is the premise of any anti-UAV strategy. Therefore, this study first collected many UAV swarm videos and manually annotated a dataset, named the UAVSwarm dataset, for UAV swarm detection and tracking; thirteen different scenes and more than nineteen types of UAV were recorded, including 12,598 annotated images, with 3 to 23 UAVs in each sequence. Then, two advanced deep detection models, Faster R-CNN and YOLOX, are used as strong benchmarks. Finally, two state-of-the-art multi-object tracking (MOT) models, GNMOT and ByteTrack, are used to conduct comprehensive tests and performance verification on the dataset and evaluation metrics. The experimental results show that the dataset has good availability, consistency, and universality. The UAVSwarm dataset can be widely used in the training and testing of various UAV detection tasks and UAV swarm MOT tasks. Full article
(This article belongs to the Topic Big Data and Artificial Intelligence)

Technical Note
Rescaling-Assisted Super-Resolution for Medium-Low Resolution Remote Sensing Ship Detection
Remote Sens. 2022, 14(11), 2566; https://doi.org/10.3390/rs14112566 - 27 May 2022
Abstract
Medium-low resolution (M-LR) remote sensing ship detection is a challenging problem due to the small target sizes and insufficient appearance information. Although image super-resolution (SR) has become a popular solution in recent years, the ability of image SR is limited, since much information is lost in the input images. Inspired by the powerful information-embedding ability of the encoder in image rescaling, in this paper we introduce image rescaling to guide the training of image SR. Specifically, we add an adaptation module before the SR network and use the pre-trained rescaling network to guide the optimization of this module. In this way, more information is embedded in the adapted M-LR images, and the subsequent SR module can utilize this information to achieve better performance. Extensive experimental results demonstrate the effectiveness of our method on image SR. More importantly, our method can be used as a pre-processing approach to improve detection performance. Full article
(This article belongs to the Topic Big Data and Artificial Intelligence)

Article
Efficient Shallow Network for River Ice Segmentation
Remote Sens. 2022, 14(10), 2378; https://doi.org/10.3390/rs14102378 - 15 May 2022
Abstract
River ice segmentation, used for surface ice concentration estimation, is important for validating river processes and ice-formation models, predicting ice jam and flooding risks, and managing water supply and hydroelectric power generation. Furthermore, discriminating between anchor ice and frazil ice is an important factor in understanding sediment transport and release events. Modern deep learning techniques have proved to deliver promising results; however, they can show poor generalization ability and can be inefficient when hardware and computing power are limited. As river ice images are often collected in remote locations by unmanned aerial vehicles with limited computational power, we explore the performance-latency trade-offs of river ice segmentation. We propose a novel convolution block, inspired by both depthwise separable convolutions and local binary convolutions, that gives additional efficiency and parameter savings. Our novel convolution block is used in a shallow architecture that has 99.9% fewer trainable parameters, 99% fewer multiply-add operations, and 69.8% less memory usage than a UNet, while achieving virtually the same segmentation performance. We find that this network trains fast and is able to achieve high segmentation performance early in training due to an emphasis on both pixel intensity and texture. When compared with very efficient segmentation networks such as LR-ASPP with a MobileNetV3 backbone, we achieve good performance (mIoU of 64) 91% faster during training on a CPU and an overall mIoU that is 7.7% higher. We also find that our network is able to generalize better to new domains such as snowy environments. Full article
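The parameter savings that depthwise separable convolutions provide — one of the two ingredients the proposed block builds on — follow from simple counting. The layer sizes below are generic examples, not the paper's architecture.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution: one k x k x c_in
    filter per output channel."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Depthwise k x k filter per input channel, followed by a 1 x 1
    pointwise convolution mixing channels."""
    return k * k * c_in + c_in * c_out

# A typical mid-network layer: 3x3 kernel, 64 -> 128 channels.
std = standard_conv_params(3, 64, 128)
sep = depthwise_separable_params(3, 64, 128)
saving = 1 - sep / std
```

For this layer the separable factorisation needs roughly 12% of the standard convolution's weights; combining it with local binary convolutions (fixed sparse filters) pushes the savings further, which is how the shallow network reaches its reported parameter reductions.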
(This article belongs to the Topic Big Data and Artificial Intelligence)

Article
MIMO: A Unified Spatio-Temporal Model for Multi-Scale Sea Surface Temperature Prediction
Remote Sens. 2022, 14(10), 2371; https://doi.org/10.3390/rs14102371 - 14 May 2022
Cited by 1
Abstract
Sea surface temperature (SST) is a crucial factor that affects the global climate and marine activities. Predicting SST at different temporal scales benefits various applications, from short-term SST prediction for weather forecasting to long-term SST prediction for analyzing the El Niño–Southern Oscillation (ENSO). However, existing approaches for SST prediction train separate models for different temporal scales, which is inefficient and cannot exploit the correlations among the temperatures at different scales to improve prediction performance. In this work, we propose a unified spatio-temporal model termed the Multi-In and Multi-Out (MIMO) model to predict SST at different scales. MIMO is an encoder–decoder model, in which the encoder learns spatio-temporal features from the SST data of multiple scales and fuses the learned features with a Cross Scale Fusion (CSF) operation. The decoder utilizes the learned features from the encoder to adaptively predict the SST at different scales. To the best of our knowledge, this is the first work to predict SST at different temporal scales simultaneously with a single model. According to the experimental evaluation on the Optimum Interpolation SST (OISST) dataset, MIMO achieves state-of-the-art prediction performance. Full article
(This article belongs to the Topic Big Data and Artificial Intelligence)

Article
Deep Learning Models for COVID-19 Detection
Sustainability 2022, 14(10), 5820; https://doi.org/10.3390/su14105820 - 11 May 2022
Cited by 3
Abstract
Healthcare is one of the crucial aspects of the Internet of Things. Connected machine learning-based systems provide faster healthcare services, and doctors and radiologists can also use these systems to collaborate and provide better help to patients. The recently emerged coronavirus (COVID-19) is known to be highly infectious. Reverse transcription-polymerase chain reaction (RT-PCR) is recognised as one of the primary diagnostic tools; however, RT-PCR tests might not be accurate. In contrast, doctors can employ artificial intelligence techniques on X-ray and CT scans for analysis. Artificial intelligence methods need a large number of images, which might not be available during a pandemic. In this paper, a novel data-efficient deep network is proposed for the identification of COVID-19 in CT images. The method enlarges the small number of available CT scans by generating synthetic versions with a generative adversarial network (GAN), and then estimates the parameters of the convolutional and fully connected layers of the deep networks using the synthetic and augmented data. The results show that the GAN-based deep learning model provides higher performance than classic deep learning models for COVID-19 detection. The performance evaluation is performed on the COVID19-CT and Mosmed datasets. The best-performing models are ResNet-18 and MobileNetV2 on COVID19-CT and Mosmed, respectively, with area-under-the-curve values of 0.89 and 0.84. Full article
(This article belongs to the Topic Big Data and Artificial Intelligence)
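The abstract above summarizes model quality as area under the ROC curve (AUC). As a reminder of what that metric measures (this is not code from the paper), a minimal pure-Python computation via the Mann-Whitney rank formulation, with made-up labels and scores:

```python
def roc_auc(labels, scores):
    """Area under the ROC curve as the Mann-Whitney statistic: the
    probability that a randomly chosen positive sample scores above a
    randomly chosen negative one (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predictions: a perfect ranking yields AUC = 1.0.
print(roc_auc([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]))
```

On this scale, the reported 0.89 for ResNet-18 means a random COVID-positive scan outranks a random negative one 89% of the time.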

Review
A Survey on Memory Subsystems for Deep Neural Network Accelerators
Future Internet 2022, 14(5), 146; https://doi.org/10.3390/fi14050146 - 10 May 2022
Abstract
From self-driving cars to detecting cancer, the applications of modern artificial intelligence (AI) rely primarily on deep neural networks (DNNs). Given raw sensory data, DNNs are able to extract high-level features after the network has been trained using statistical learning. However, due to the massive amount of parallel computation involved, performance is largely limited by the memory wall. Thus, a review of the different memory architectures applied in DNN accelerators would prove beneficial. While existing surveys only address DNN accelerators in general, this paper investigates novel advancements in efficient memory organizations and design methodologies in DNN accelerators. First, an overview of the various memory architectures used in DNN accelerators is provided, followed by a discussion of memory organizations on non-ASIC DNN accelerators. Furthermore, flexible memory systems incorporating adaptable DNN computation are explored. Lastly, an analysis of emerging memory technologies is conducted. Through this article, the reader will: (1) gain the ability to analyze various proposed memory architectures; (2) discern various DNN accelerators with different memory designs; (3) become familiar with the trade-offs associated with memory organizations; and (4) become familiar with proposed new memory systems for modern DNN accelerators that address the memory wall and the other issues mentioned. Full article
(This article belongs to the Topic Big Data and Artificial Intelligence)
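The "memory wall" the survey addresses is commonly quantified with a roofline-style estimate: attainable throughput is capped by either peak compute or memory bandwidth times arithmetic intensity. A minimal sketch (the numbers below are illustrative assumptions, not figures from the survey):

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, flops, bytes_moved):
    """Roofline model: performance is limited either by peak compute
    or by memory bandwidth x arithmetic intensity (FLOPs per byte)."""
    intensity = flops / bytes_moved  # FLOPs performed per byte of traffic
    return min(peak_gflops, bandwidth_gbs * intensity)

# A hypothetical layer doing 2 FLOPs per byte on a 10 GB/s memory system
# is memory-bound at 20 GFLOP/s even if the chip can do 100 GFLOP/s.
print(attainable_gflops(100.0, 10.0, flops=2.0, bytes_moved=1.0))
```

Raising arithmetic intensity (e.g., through better on-chip buffering, one theme of the surveyed memory organizations) moves a workload out of the memory-bound regime.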

Article
Two New Datasets for Italian-Language Abstractive Text Summarization
Information 2022, 13(5), 228; https://doi.org/10.3390/info13050228 - 29 Apr 2022
Abstract
Text summarization aims to produce a short summary containing the relevant parts of a given text. Due to the lack of data for abstractive summarization in low-resource languages such as Italian, we propose two new original datasets: one collected from two Italian news websites, with multi-sentence summaries and corresponding articles, and one obtained by machine translation of a Spanish summarization dataset. These two datasets are currently the only ones available in Italian for this task. To evaluate the quality of these two datasets, we used them to train a T5-base model and an mBART model, obtaining good results with both. To better assess the results, we also compared the same models trained on automatically translated datasets, comparing the resulting summaries, in the same training language, with the automatically translated summaries; this comparison demonstrated the superiority of the models obtained from the proposed datasets. Full article
(This article belongs to the Topic Big Data and Artificial Intelligence)
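Summarization datasets like these are typically evaluated with ROUGE-style n-gram overlap. The paper's exact scorer is not specified in the abstract; as an illustration, a minimal ROUGE-1 F1 in pure Python:

```python
from collections import Counter

def rouge1_f1(candidate, reference):
    """ROUGE-1 F1: harmonic mean of unigram precision and recall
    between a candidate summary and a reference summary."""
    c = Counter(candidate.lower().split())
    r = Counter(reference.lower().split())
    overlap = sum((c & r).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(c.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)

# Toy example: a partial summary scores between 0 and 1.
print(rouge1_f1("the cat", "the cat sat"))
```

Real evaluations also report ROUGE-2 and ROUGE-L, and for morphologically rich languages like Italian usually apply stemming first.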

Article
Accurate Air-Quality Prediction Using Genetic-Optimized Gated-Recurrent-Unit Architecture
Information 2022, 13(5), 223; https://doi.org/10.3390/info13050223 - 26 Apr 2022
Abstract
Air pollution is becoming a serious concern with the development of society and urban expansion, and predicting air quality is a pressing problem for human beings. Recently, more and more machine-learning-based methods are being used to solve the air-quality-prediction problem, and gated recurrent units (GRUs) are a representative method because of their advantages in processing time-series data. However, for the same air-quality-prediction task, different researchers have designed different GRU structures based on their individual experience. Designing a GRU structure adaptively from data has thus become a problem. In this paper, we propose an adaptive GRU to address this problem, in which the GRU structure is determined by the dataset; the method makes three main contributions. Firstly, an encoding method for the GRU structure is proposed to represent the network structure as a fixed-length binary string; secondly, we define the reciprocal of the sum of the losses of each individual as the fitness function for the iterative computation; thirdly, the genetic algorithm is used to compute the data-adaptive GRU network structure, which enhances the air-quality-prediction result. The experimental results from three real datasets in Xi’an show that the proposed method achieves better RMSE and SMAPE than the existing LSTM-, SVM-, and RNN-based methods. Full article
(This article belongs to the Topic Big Data and Artificial Intelligence)
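The three steps above (binary encoding of the GRU structure, fitness as the reciprocal of summed loss, genetic search) can be sketched as follows. The encoding scheme and loss function here are toy assumptions for illustration; only the fitness definition follows the abstract:

```python
import random

def decode(bits):
    """Toy encoding (an assumption, not the paper's exact scheme):
    first 2 bits -> number of GRU layers (1-4),
    last 4 bits -> hidden units in multiples of 8."""
    layers = int(bits[:2], 2) + 1
    hidden = (int(bits[2:], 2) + 1) * 8
    return layers, hidden

def fitness(bits, loss_fn):
    """Fitness is the reciprocal of the summed loss, as in the paper."""
    return 1.0 / loss_fn(*decode(bits))

def evolve(population, loss_fn, rng):
    """One GA generation: keep the fitter half as parents, then
    one-point crossover plus occasional bit-flip mutation."""
    ranked = sorted(population, key=lambda b: fitness(b, loss_fn), reverse=True)
    parents = ranked[: len(ranked) // 2]
    children = []
    while len(children) < len(population):
        a, b = rng.sample(parents, 2)
        cut = rng.randrange(1, len(a))
        child = a[:cut] + b[cut:]
        if rng.random() < 0.1:  # mutation
            i = rng.randrange(len(child))
            child = child[:i] + ("1" if child[i] == "0" else "0") + child[i + 1:]
        children.append(child)
    return children
```

In the paper, evaluating `loss_fn` means actually training the decoded GRU on the air-quality data, which is what makes the search data-adaptive.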

Article
An Emergency Event Detection Ensemble Model Based on Big Data
Big Data Cogn. Comput. 2022, 6(2), 42; https://doi.org/10.3390/bdcc6020042 - 16 Apr 2022
Abstract
Emergency events arise when a serious, unexpected, and often dangerous threat affects normal life. Hence, knowing what is occurring during and after emergency events is critical to mitigate the effect of the incident on human life, on the environment and our infrastructures, as well as the inherent financial consequences. Social network utilization in emergency event detection models can play an important role, as information is shared and users’ statuses are updated once an emergency event occurs. Besides, big data has proved its significance as a tool to assist and alleviate emergency events by processing an enormous amount of data over a short time interval. This paper shows that it is necessary to have an appropriate emergency event detection ensemble model (EEDEM) to respond quickly once such unfortunate events occur. Furthermore, it integrates Snapchat maps to propose a novel method to pinpoint the exact location of an emergency event. Moreover, merging social networks and big data can accelerate the emergency event detection system: social network data, such as those from Twitter and Snapchat, allow us to manage, monitor, analyze and detect emergency events. The main objective of this paper is to propose a novel and efficient big data-based EEDEM to pinpoint the exact location of emergency events by employing data collected from social networks, such as “Twitter” and “Snapchat”, while integrating big data (BD) and machine learning (ML). Furthermore, this paper evaluates the performance of five ML base models and the proposed ensemble approach to detect emergency events. The results show that the proposed ensemble approach achieved a very high accuracy of 99.87%, outperforming the other base models. Moreover, the base models also yield high accuracy: 99.72% and 99.70% for the LSTM and decision tree models, respectively, with acceptable training times. Full article
(This article belongs to the Topic Big Data and Artificial Intelligence)
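The abstract does not state how the five base models are combined; the simplest ensembling rule consistent with the description is majority voting, sketched below with stub classifiers standing in for the trained base models:

```python
from collections import Counter

def ensemble_predict(models, sample):
    """Majority vote over base models. A generic ensembling sketch;
    the paper's exact combination rule is not given in the abstract."""
    votes = Counter(model(sample) for model in models)
    return votes.most_common(1)[0][0]

# Hypothetical base models voting on one social-media post.
base_models = [
    lambda post: "emergency",   # stand-in for the LSTM
    lambda post: "emergency",   # stand-in for the decision tree
    lambda post: "normal",      # stand-in for a third base model
]
print(ensemble_predict(base_models, "flooding on main street"))
```

Weighted voting or stacking (training a meta-model on base-model outputs) are common refinements when base models differ widely in accuracy.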

Article
A Structural Approach to Some Contradictions in Worldwide Swine Production and Health Research
Sustainability 2022, 14(8), 4748; https://doi.org/10.3390/su14084748 - 15 Apr 2022
Cited by 1
Abstract
Several biosafety gaps in agri-food sectors have become evident in recent years. Many of them are related to global livestock systems and the organizational models involved in their management and organization. For example, producing pigs requires a global system of massive confinement and specific technological innovations related to animal production and health, involving broad technical and scientific structures that are required to generate the specific knowledge needed for successful management. This suggests the need for an underlying, socially agglomerated technological ecosystem relevant to these issues. We therefore propose the analysis of a specialized scientific social structure in terms of the knowledge and technologies required for pig production and health. The objective of this work is to characterize structural patterns in research on the swine health sector worldwide. We use a mixed methodological approach based on social network analysis, drawing on scientific information from 4868 specialized research works on health and pig production generated between 2010 and 2018 in 47 countries. It was possible to analyze swine research dynamics, such as convergence and influence, at the country and regional levels, and to identify differentiated behaviors and high centralization in scientific communities that have a worldwide impact in terms of achievements but also result in significant omissions. Full article
(This article belongs to the Topic Big Data and Artificial Intelligence)
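The "high centralization" finding comes from social network analysis. One of the simplest such measures is normalized degree centrality over a co-authorship network of countries; a minimal sketch with hypothetical edges (the real study uses 47 countries and richer metrics):

```python
def degree_centrality(edges):
    """Normalized degree centrality for an undirected network:
    each node's degree divided by the maximum possible degree (n - 1)."""
    nodes = {n for edge in edges for n in edge}
    degree = {n: 0 for n in nodes}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    n = len(nodes)
    return {node: d / (n - 1) for node, d in degree.items()}

# Hypothetical collaboration edges: one hub country co-authors with all others.
edges = [("US", "MX"), ("US", "BR"), ("US", "ES")]
print(degree_centrality(edges))
```

A network where one node's centrality is near 1.0 while the rest sit far lower is exactly the centralized pattern the authors describe.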

Article
Landslide Displacement Prediction via Attentive Graph Neural Network
Remote Sens. 2022, 14(8), 1919; https://doi.org/10.3390/rs14081919 - 15 Apr 2022
Cited by 1
Abstract
Landslides are among the most common geological hazards that result in considerable human and economic losses globally. Researchers have put great efforts into addressing the landslide prediction problem for decades. Previous methods either focus on analyzing the landslide inventory maps obtained from aerial photography and satellite images or propose machine learning models—trained on historical land deformation data—to predict future displacement and sedimentation. However, existing approaches generally fail to capture complex spatial deformations and their inter-dependencies in different areas. This work presents a novel landslide prediction model based on graph neural networks, which utilizes graph convolutions to aggregate spatial correlations among different monitored locations. Besides, we introduce a novel locally historical transformer network to capture dynamic spatio-temporal relations and predict the surface deformation. We conduct extensive experiments on real-world data and demonstrate that our model significantly outperforms state-of-the-art approaches in terms of prediction accuracy and model interpretations. Full article
(This article belongs to the Topic Big Data and Artificial Intelligence)
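The core operation the abstract describes, graph convolutions aggregating spatial correlations among monitored locations, can be illustrated with a single mean-aggregation step. This is a minimal stand-in for the paper's learned graph convolutions, with hypothetical station names and features:

```python
def graph_conv(features, adjacency):
    """One mean-aggregation graph convolution step: each monitored
    location's feature vector is averaged with its neighbors'
    (self-loop included). Learned weight matrices are omitted."""
    out = {}
    for node, feat in features.items():
        neigh = adjacency.get(node, []) + [node]  # include self
        dim = len(feat)
        out[node] = [sum(features[m][i] for m in neigh) / len(neigh)
                     for i in range(dim)]
    return out

# Two hypothetical GNSS stations on the same slope sharing an edge:
features = {"station_a": [1.0], "station_b": [3.0]}
adjacency = {"station_a": ["station_b"], "station_b": ["station_a"]}
print(graph_conv(features, adjacency))
```

Stacking such layers lets displacement signals propagate between correlated locations, which is what lets the model capture the spatial inter-dependencies that per-station models miss.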

Article
Local Transformer Network on 3D Point Cloud Semantic Segmentation
Information 2022, 13(4), 198; https://doi.org/10.3390/info13040198 - 14 Apr 2022
Abstract
Semantic segmentation is an important component in understanding the 3D point cloud scene. Whether we can effectively obtain local and global contextual information from points is of great significance in improving the performance of 3D point cloud semantic segmentation. In this paper, we propose a self-attention feature extraction module: the local transformer structure. By stacking the encoder layer composed of this structure, we can extract local features while preserving global connectivity. The structure can automatically learn each point feature from its neighborhoods and is invariant to different point orders. We designed two unique key matrices, each of which focuses on the feature similarities and geometric structure relationships between the points to generate attention weight matrices. Additionally, the cross-skip selection of neighbors is used to obtain larger receptive fields for each point without increasing the number of calculations required, and can therefore better deal with the junction between multiple objects. When the new network was verified on the S3DIS, the mean intersection over union was 69.1%, and the segmentation accuracies on the complex outdoor scene datasets Semantic3D and SemanticKITTI were 94.3% and 87.8%, respectively, which demonstrate the effectiveness of the proposed methods. Full article
(This article belongs to the Topic Big Data and Artificial Intelligence)
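The self-attention at the heart of the local transformer structure reduces to computing softmax-normalized similarity weights between a point's query and its neighbors' keys. A minimal sketch of the weight computation (the paper's learned key matrices and geometric terms are omitted):

```python
import math

def attention_weights(query, keys):
    """Scaled dot-product attention weights over a point's neighborhood:
    softmax of query-key dot products divided by sqrt(dimension)."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    m = max(scores)                      # subtract max for numeric stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# A point feature attending over two hypothetical neighbor features:
print(attention_weights([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))
```

Because the weights depend only on feature similarity, not on the order in which neighbors are listed, the operation is permutation-invariant, the property the abstract highlights for point clouds.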

Systematic Review
Deep Learning for Vulnerability and Attack Detection on Web Applications: A Systematic Literature Review
Future Internet 2022, 14(4), 118; https://doi.org/10.3390/fi14040118 - 13 Apr 2022
Cited by 1
Abstract
Web applications are the best Internet-based solution to provide online web services, but they also bring serious security challenges. Thus, enhancing web applications security against hacking attempts is of paramount importance. Traditional Web Application Firewalls based on manual rules and traditional Machine Learning need a lot of domain expertise and human intervention and have limited detection results faced with the increasing number of unknown web attacks. To this end, more research work has recently been devoted to employing Deep Learning (DL) approaches for web attacks detection. We performed a Systematic Literature Review (SLR) and quality analysis of 63 Primary Studies (PS) on DL-based web applications security published between 2010 and September 2021. We investigated the PS from different perspectives and synthesized the results of the analyses. To the best of our knowledge, this study is the first of its kind on SLR in this field. The key findings of our study include the following. (i) It is fundamental to generate standard real-world web attacks datasets to encourage effective contribution in this field and to reduce the gap between research and industry. (ii) It is interesting to explore some advanced DL models, such as Generative Adversarial Networks and variants of Encoders–Decoders, in the context of web attacks detection as they have been successful in similar domains such as networks intrusion detection. (iii) It is fundamental to bridge expertise in web applications security and expertise in Machine Learning to build theoretical Machine Learning models tailored for web attacks detection. (iv) It is important to create a corpus for web attacks detection in order to take full advantage of text mining in DL-based web attacks detection models construction. (v) It is essential to define a common framework for developing and comparing DL-based web attacks detection models. 
This SLR is intended to improve research work in the domain of DL-based web attacks detection, as it covers a significant number of research papers and identifies the key points that need to be addressed in this research field. Such a contribution is helpful as it allows researchers to compare existing approaches and to exploit the proposed future work opportunities. Full article
(This article belongs to the Topic Big Data and Artificial Intelligence)

Article
A Two-Stage Low-Altitude Remote Sensing Papaver Somniferum Image Detection System Based on YOLOv5s+DenseNet121
Remote Sens. 2022, 14(8), 1834; https://doi.org/10.3390/rs14081834 - 11 Apr 2022
Cited by 2
Abstract
Papaver somniferum (opium poppy) is not only a source of raw material for the production of medical narcotic analgesics but also the major raw material for certain psychotropic drugs. Therefore, it is stipulated by law that the cultivation of Papaver somniferum must be authorized by the government under stringent supervision. In certain areas, unauthorized and illicit Papaver somniferum cultivation on privately owned land occurs from time to time. These illegal Papaver somniferum cultivation sites are dispersedly distributed and highly concealed, and have therefore become a tough problem for government supervision. Low-altitude inspection of Papaver somniferum cultivation by unmanned aerial vehicles has the advantages of high efficiency and time saving, but the large amount of image data collected needs to be manually screened, which not only consumes a lot of manpower and material resources but also easily causes omissions. In response to the above problems, this paper proposes a two-stage (target detection and image classification) method for the detection of Papaver somniferum cultivation sites. In the first stage, the YOLOv5s algorithm was used to detect Papaver somniferum images for the purpose of identifying all the suspicious Papaver somniferum images in the original data. In the second stage, the DenseNet121 network was used to classify the detection results from the first stage, so as to exclude targets other than Papaver somniferum and retain only the images containing Papaver somniferum.
The experimental comparison results between the one-stage method and the two-stage method suggest that the Recall of the two methods remained the same, but the two-stage method reduced the number of falsely detected images by 73.88%, which greatly reduces the workload for subsequent manual screening of remote sensing Papaver somniferum images. The achievement of this paper provides an effective technical means to solve the problem in the supervision of illicit Papaver somniferum cultivation. Full article
(This article belongs to the Topic Big Data and Artificial Intelligence)
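The two-stage pipeline described above, a high-recall detector followed by a stricter classifier that filters its output, can be sketched with stub models standing in for YOLOv5s and DenseNet121 (the stubs and image names are illustrative assumptions):

```python
def two_stage_filter(images, detector, classifier):
    """Stage 1: the detector flags all suspicious images (tuned for recall).
    Stage 2: the classifier keeps only images confirmed to contain the
    target crop, discarding the detector's false positives."""
    suspicious = [img for img in images if detector(img)]
    return [img for img in suspicious if classifier(img)]

# Stub models for illustration: stage 1 over-triggers on any flower-like
# content; stage 2 confirms only actual poppy imagery.
detector = lambda img: "poppy" in img or "flower" in img
classifier = lambda img: "poppy" in img

print(two_stage_filter(["poppy_field", "flower_bed", "road"],
                       detector, classifier))
```

This matches the reported result pattern: recall is set by stage 1 and unchanged by stage 2, while the falsely flagged images (here `flower_bed`) are removed before manual screening.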

Article
Learning Spatio-Temporal Attention Based Siamese Network for Tracking UAVs in the Wild
Remote Sens. 2022, 14(8), 1797; https://doi.org/10.3390/rs14081797 - 08 Apr 2022
Cited by 1
Abstract
The popularity of unmanned aerial vehicles (UAVs) has made anti-UAV technology increasingly urgent. Object tracking, especially in thermal infrared videos, offers a promising solution to counter UAV intrusion. However, troublesome issues such as fast motion and tiny size make tracking infrared drone targets difficult and challenging. This work proposes a simple and effective spatio-temporal attention based Siamese method called SiamSTA, which alternately performs reliable local searching and wide-range re-detection for robustly tracking drones in the wild. Concretely, SiamSTA builds a two-stage re-detection network to predict the target state using the template of the first frame and the prediction results of previous frames. To tackle the challenge of small-scale UAV targets in long-range acquisition, SiamSTA imposes spatial and temporal constraints on generating candidate proposals within local neighborhoods to eliminate interference from background distractors. Complementarily, in case the target is lost from local regions due to fast movement, a third-stage re-detection module is introduced, which exploits valuable motion cues, through a correlation filter based on change detection, to re-capture targets from a global view. Finally, a state-aware switching mechanism is adopted to adaptively integrate local searching and global re-detection, taking their complementary strengths for robust tracking. Extensive experiments on three anti-UAV datasets clearly demonstrate SiamSTA’s advantage over other competitors. Notably, SiamSTA is the foundation of the 1st-place winning entry in the 2nd Anti-UAV Challenge. Full article
(This article belongs to the Topic Big Data and Artificial Intelligence)
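The state-aware switching mechanism can be illustrated schematically: run cheap local search while tracking confidence is high, and fall back to global re-detection when the target is likely lost. This is a sketch of the control flow only, not SiamSTA's actual code; the threshold and callables are assumptions:

```python
def track_step(confidence, local_search, global_redetect, threshold=0.5):
    """State-aware switching between tracking modes.
    confidence: score of the previous frame's prediction.
    local_search / global_redetect: callables returning a bounding box."""
    if confidence >= threshold:
        return "local", local_search()     # cheap, small search region
    return "global", global_redetect()     # expensive, full-frame re-capture

# Confident frame -> local search; low-confidence frame -> global re-detection.
print(track_step(0.9, lambda: (10, 12, 4, 4), lambda: (0, 0, 8, 8)))
print(track_step(0.1, lambda: (10, 12, 4, 4), lambda: (0, 0, 8, 8)))
```

The benefit of the switch is efficiency: the expensive global stage runs only on the frames where the local tracker has actually failed.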

Article
Deep Learning with Word Embedding Improves Kazakh Named-Entity Recognition
Information 2022, 13(4), 180; https://doi.org/10.3390/info13040180 - 02 Apr 2022
Cited by 1
Abstract
Named-entity recognition (NER) is a preliminary step for several text extraction tasks. In this work, we try to recognize Kazakh named entities by introducing a hybrid neural network model that leverages word semantics with multidimensional features and attention mechanisms. There are two major challenges. First, Kazakh is an agglutinative and morphologically rich language, which presents a data-sparsity challenge for NER. Second, Kazakh named entities have unclear boundaries, polysemy, and nesting. A common strategy for handling data sparsity is to apply subword segmentation. Thus, we combined the semantics of words and stems, using stems obtained from the Kazakh morphological analysis system. Additionally, we constructed a graph structure of entities, with words, entities, and entity categories as nodes and inclusion relations as edges, and updated the nodes using a gated graph neural network (GGNN) with an attention mechanism. Finally, we extracted the final results through a conditional random field (CRF). Experimental results show that our method consistently outperforms all previous methods, achieving an F1 score of 88.04%. Full article
(This article belongs to the Topic Big Data and Artificial Intelligence)

Article
HealthFetch: An Influence-Based, Context-Aware Prefetch Scheme in Citizen-Centered Health Storage Clouds
Future Internet 2022, 14(4), 112; https://doi.org/10.3390/fi14040112 - 01 Apr 2022
Cited by 1
Abstract
Over the past few years, increasing attention has been given to the health sector and the integration of new technologies into it. Cloud computing and storage clouds have become essentially state-of-the-art solutions in other major areas and have rapidly started to establish a powerful presence in the health sector as well. More and more companies are working toward a future that will allow healthcare professionals to engage more with such infrastructures, opening up a vast number of possibilities. While this is a very important step, less attention has been given to the citizens. For this reason, in this paper, a citizen-centered storage cloud solution is proposed that allows citizens to hold their health data in their own hands while also enabling the exchange of these data with healthcare professionals during emergencies. In addition, to reduce the health-data transmission delay, a novel context-aware prefetch engine enriched with deep learning capabilities is proposed. The proposed prefetch scheme, along with the proposed storage cloud, undergoes a two-fold evaluation in several deployment and usage scenarios in order to examine its performance with respect to data transmission times, while its outcomes are also compared with other state-of-the-art solutions. The results show that the proposed solution significantly improves download speed compared with the storage cloud alone, especially when large data are exchanged. In addition, the evaluation of the proposed scheme shows that it improves the overall predictions, considering the coefficient of determination (R2 > 0.94) and the mean of errors (RMSE < 1), while also reducing the training data by 12%. Full article
(This article belongs to the Topic Big Data and Artificial Intelligence)
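The two reported prediction metrics, R2 > 0.94 and RMSE < 1, have standard definitions worth keeping in view when reading such results. A minimal pure-Python implementation of both (illustrative, not the paper's evaluation code):

```python
def rmse(y_true, y_pred):
    """Root-mean-square error: sqrt of the mean squared residual."""
    n = len(y_true)
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n) ** 0.5

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares /
    total sum of squares around the mean of y_true)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Perfect predictions: RMSE = 0, R^2 = 1.
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]),
      r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
```

Note that RMSE carries the units of the predicted quantity, so "RMSE < 1" is only meaningful relative to the scale of the prefetch-prediction target.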

Article
Num-Symbolic Homophonic Social Net-Words
Information 2022, 13(4), 174; https://doi.org/10.3390/info13040174 - 29 Mar 2022
Cited by 1
Abstract
Many excellent studies of social networks and text analysis can be found in the literature, facilitating the rapid development of automated text analysis technology. Chinese lacks natural word separators, and numbers and symbols in text also carry their own literal meanings; thus, combinations of Chinese characters with numbers and symbols in user-generated content pose a challenge for current analytic approaches and procedures. Therefore, we propose a new hybrid method for detecting blended numeric and symbolic homophonic Chinese neologisms (BNShCNs). Interpretation of the words’ actual semantics was performed according to their independence and relative position in context. This study obtained a shortlist using a probability approach from internet-collected user-generated content; subsequently, we evaluated the shortlist by contextualizing word-embedded vectors for BNShCN detection. The experiments show that the proposed method efficiently extracted BNShCNs from user-generated content. Full article
(This article belongs to the Topic Big Data and Artificial Intelligence)

Article
Is Artificial Intelligence Better than Manpower? The Effects of Different Types of Online Customer Services on Customer Purchase Intentions
Sustainability 2022, 14(7), 3974; https://doi.org/10.3390/su14073974 - 28 Mar 2022
Cited by 2
Abstract
Artificial intelligence has been widely applied to e-commerce and the online business service field. However, few studies have focused on the differences in the effects of different types of customer service on customer purchase intentions. Based on service encounter theory and superposition theory, we designed two shopping experiments to capture customers’ thoughts and feelings, in order to explore the differences in the effects of three types of online customer service (AI customer service, manual customer service, and human–machine collaboration customer service) on customer purchase intention, and to analyze the superposition effect of human–machine collaboration customer service. The results show that the consumer’s perceived service quality positively influences the customer’s purchase intention and plays a mediating role in the effect of different types of online customer service on customer purchase intention; the product type plays a moderating role in the relationship between online customer service and customer purchase intention; and human–machine collaboration customer service has a superposition effect. This study helps to deepen the understanding of AI developers and e-commerce platforms regarding the application of AI in online business services, and provides reference suggestions for the formulation of more effective business service strategies. Full article
(This article belongs to the Topic Big Data and Artificial Intelligence)

Article
A LiDAR–Camera Fusion 3D Object Detection Algorithm
Information 2022, 13(4), 169; https://doi.org/10.3390/info13040169 - 26 Mar 2022
Abstract
3D object detection with LiDAR and camera fusion has always been a challenge for autonomous driving. This work proposes a deep neural network (namely FuDNN) for LiDAR–camera fusion 3D object detection. Firstly, a 2D backbone is designed to extract features from camera images. Secondly, an attention-based fusion sub-network is designed to fuse the features extracted by the 2D backbone and the features extracted from 3D LiDAR point clouds by PointNet++. Besides, the FuDNN, which uses the RPN and the refinement work of PointRCNN to obtain 3D box predictions, was tested on the public KITTI dataset. Experiments on the KITTI validation set show that the proposed FuDNN achieves AP values of 92.48, 82.90, and 80.51 at easy, moderate, and hard difficulty levels for car detection. The proposed FuDNN improves the performance of LiDAR–camera fusion 3D object detection in the car category of the public KITTI dataset. Full article
(This article belongs to the Topic Big Data and Artificial Intelligence)

Article
Time Series Surface Temperature Prediction Based on Cyclic Evolutionary Network Model for Complex Sea Area
Future Internet 2022, 14(3), 96; https://doi.org/10.3390/fi14030096 - 21 Mar 2022
Cited by 2
Abstract
The prediction of marine elements has become increasingly important in the field of marine research. However, time series data in a complex environment vary significantly because they are composed of dynamic changes with multiple mechanisms, causes, and laws. For example, sea surface temperature (SST) can be influenced by ocean currents. Conventional models often focus on capturing the impact of historical data but ignore the spatio-temporal relationships in sea areas, and they cannot predict such widely varying data effectively. In this work, we propose a cyclic evolutionary network model (CENS), an error-driven network group, which is composed of multiple network node units. Different regions of data can be automatically matched to a suitable network node unit for prediction so that the model can cluster the data based on their characteristics and, therefore, be more practical. Experiments were performed on the Bohai Sea and the South China Sea. Firstly, we performed an ablation experiment to verify the effectiveness of the framework of the model. Secondly, we tested the model to predict sea surface temperature, and the results verified the accuracy of CENS. Lastly, there was a meaningful finding that the clustering results of the model in the South China Sea matched the actual characteristics of the continental shelf of the South China Sea, and the cluster had spatial continuity. Full article

Article
Machine Learning for Pan Evaporation Modeling in Different Agroclimatic Zones of the Slovak Republic (Macro-Regions)
Sustainability 2022, 14(6), 3475; https://doi.org/10.3390/su14063475 - 16 Mar 2022
Cited by 1
Abstract
Global climate change is likely to influence evapotranspiration (ET); as a result, many ET calculation methods may not give accurate results under different climatic conditions. The main objective of this study is to verify the suitability of machine learning (ML) models as calculation methods for pan evaporation (PE) modeling on the macro-regional scale. The most significant PE changes in the different agroclimatic zones of the Slovak Republic were compared, and their considerable impacts were analyzed. On the basis of the agroclimatic zones, 35 meteorological stations distributed across Slovakia were classified into six macro-regions. For each of the meteorological stations, 11 variables were applied during the vegetation period in the years from 2010 to 2020 with a daily time step. Eight different ML models—the neural network (NN) model, the autoneural network (AN) model, the decision tree (DT) model, the Dmine regression (DR) model, the DM neural network (DM NN) model, the gradient boosting (GB) model, the least angle regression (LARS) model, and the ensemble model (EM)—were employed to predict PE. It was found that the different models had diverse prediction accuracies in various geographical locations. In this study, the results of the values predicted by the individual models are compared.

Article
Unsupervised Anomaly Detection and Segmentation on Dirty Datasets
Future Internet 2022, 14(3), 86; https://doi.org/10.3390/fi14030086 - 13 Mar 2022
Abstract
Industrial quality control is an important task. Most existing vision-based unsupervised industrial anomaly detection and segmentation methods require that the training set consist only of normal samples, which is difficult to ensure in practice. This paper proposes an unsupervised framework to solve the industrial anomaly detection and segmentation problem when the training set contains anomaly samples. Our framework uses a model pretrained on ImageNet as a feature extractor to extract patch-level features. After that, we propose a trimming method to estimate a robust Gaussian distribution based on the patch features at each position. Then, with an iterative filtering process, we can iteratively filter out the anomaly samples in the training set and re-estimate the Gaussian distribution at each position. In the prediction phase, the Mahalanobis distance between a patch feature vector and the center of the Gaussian distribution at the corresponding position is used as the anomaly score of this patch. The subsequent anomaly region segmentation is performed based on the patch anomaly scores. We tested the proposed method on three datasets containing anomaly samples and obtained state-of-the-art performance.
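The per-position scoring described above, a trimmed Gaussian estimate followed by Mahalanobis scoring, can be sketched for a single patch position. The trimming fraction, iteration count, and synthetic features are assumptions for illustration, not the paper's exact procedure:

```python
import numpy as np

def robust_gaussian(feats, trim=0.05, iters=3):
    """Iteratively drop the most anomalous samples, then re-estimate mean and covariance."""
    dim = feats.shape[1]
    keep = feats
    for _ in range(iters):
        mu = keep.mean(axis=0)
        inv = np.linalg.inv(np.cov(keep, rowvar=False) + 1e-6 * np.eye(dim))
        d = ((keep - mu) @ inv * (keep - mu)).sum(axis=1)   # squared Mahalanobis
        keep = keep[d <= np.quantile(d, 1 - trim)]          # filter out the top `trim` fraction
    mu = keep.mean(axis=0)                                  # final estimate on the cleaned set
    inv = np.linalg.inv(np.cov(keep, rowvar=False) + 1e-6 * np.eye(dim))
    return mu, inv

def anomaly_score(f, mu, inv):
    """Mahalanobis distance of a patch feature to the robust Gaussian center."""
    return float(np.sqrt((f - mu) @ inv @ (f - mu)))

rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(200, 4))
dirty = np.vstack([normal, rng.normal(8.0, 1.0, size=(10, 4))])  # ~5% anomalies in training
mu, inv = robust_gaussian(dirty)
print(anomaly_score(np.zeros(4), mu, inv) < anomaly_score(np.full(4, 8.0), mu, inv))  # True
```

The full method estimates one such Gaussian per spatial position of the patch grid and segments regions by thresholding the per-patch scores.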

Article
Early Detection of Dendroctonus valens Infestation with Machine Learning Algorithms Based on Hyperspectral Reflectance
Remote Sens. 2022, 14(6), 1373; https://doi.org/10.3390/rs14061373 - 11 Mar 2022
Cited by 1
Abstract
The red turpentine beetle (Dendroctonus valens LeConte) has caused severe ecological and economic losses since its invasion into China. It is gradually spreading northeast, resulting in many Chinese pine (Pinus tabuliformis Carr.) deaths. Early detection of D. valens infestation (i.e., at the green attack stage) is the basis of control measures to prevent its outbreak and spread. This study examined the changes in spectral reflectance after the initial attack of D. valens. We also explored the possibility of detecting early D. valens infestation based on spectral vegetation indices and machine learning algorithms. The spectral reflectance of infested trees was significantly different from that of healthy trees (p < 0.05), with an obvious decrease in the near-infrared region (760–1386 nm; p < 0.01). Spectral vegetation indices were input into three machine learning classifiers; the classification accuracy was 72.5–80%, while the sensitivity was 65–85%. Several spectral vegetation indices (DID, CUR, TBSI, DDn2, D735, SR1, NSMI, RNIR•CRI550 and RVSI) were sensitive indicators for the early detection of D. valens damage. Our results demonstrate that remote sensing technology can be successfully applied to detect D. valens infestation early and clarify the sensitive spectral regions and vegetation indices, which has important implications for early detection based on unmanned airborne vehicle and satellite data.
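As a simplified illustration of the spectral signal involved: the reported near-infrared decrease can be quantified as a relative reflectance drop against a healthy reference. The wavelength grid, spectra, and the 15% depression below are synthetic assumptions; the paper's indices (DID, TBSI, etc.) are more elaborate:

```python
import numpy as np

# Hypothetical reflectance spectra on a 400-1398 nm grid (values are illustrative).
wavelengths = np.arange(400, 1400, 2)              # nm
nir = (wavelengths >= 760) & (wavelengths <= 1386)  # the region reported as sensitive

def nir_drop(spectrum, healthy_mean):
    """Relative decrease in near-infrared reflectance vs. a healthy reference spectrum."""
    return 1.0 - spectrum[nir].mean() / healthy_mean[nir].mean()

healthy = 0.4 + 0.3 * (wavelengths > 700)           # crude green-vegetation shape
infested = healthy * np.where(nir, 0.85, 1.0)       # assumed 15% NIR depression after attack

print(round(nir_drop(infested, healthy), 2))        # 0.15
```

A real classifier would combine many such band-derived indices as features, as the study does with three ML models.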

Article
SR-Net: Saliency Region Representation Network for Vehicle Detection in Remote Sensing Images
Remote Sens. 2022, 14(6), 1313; https://doi.org/10.3390/rs14061313 - 09 Mar 2022
Abstract
Vehicle detection in remote sensing imagery is a challenging task because of the inherent attributes of vehicles in such imagery, e.g., dense parking, small sizes, various angles, etc. Prevalent vehicle detectors adopt an oriented/rotated bounding box as the basic representation, which requires a distance regression of the height, width, and angle of each object. These distance-regression-based detectors suffer from two challenges: (1) the periodicity of the angle causes a discontinuity in regression values, and (2) small regression deviations may also cause objects to be missed. To this end, in this paper, we propose a new vehicle modeling strategy, i.e., regarding each vehicle-rotated bounding box as a saliency area. Based on the new representation, we propose SR-Net (saliency region representation network), which transforms the vehicle detection task into a saliency object detection task. The proposed SR-Net, running in a distance (e.g., height, width, and angle)-regression-free way, can generate more accurate detection results. Experiments show that SR-Net outperforms prevalent detectors on multiple benchmark datasets. Specifically, our model yields 52.30%, 62.44%, 68.25%, and 55.81% in terms of AP on DOTA, UCAS-AOD, DLR 3K Munich, and VEDAI, respectively.

Article
Depth-Wise Separable Convolution Attention Module for Garbage Image Classification
Sustainability 2022, 14(5), 3099; https://doi.org/10.3390/su14053099 - 07 Mar 2022
Cited by 3
Abstract
Currently, how to deal with the massive amount of garbage produced by various human activities is a hot topic all around the world. A preliminary and essential step is to classify the garbage into different categories. However, the mainstream waste classification mode relies heavily on manual work, which consumes a lot of labor and is very inefficient. With the rapid development of deep learning, convolutional neural networks (CNNs) have been successfully applied to various application fields. Therefore, some researchers have directly adopted CNNs to classify garbage through its images. However, compared with other images, garbage images have their own characteristics (such as inter-class similarity, intra-class variance, and complex backgrounds), and neglecting these characteristics impairs the classification accuracy of CNNs. To overcome the limitations of existing garbage image classification methods, a Depth-wise Separable Convolution Attention Module (DSCAM) is proposed in this paper. In DSCAM, the inherent relationships of channels and spatial positions in garbage image features are captured by two attention modules with depth-wise separable convolutions, so that our method can focus on important information and ignore interference. Moreover, we adopt a residual network as the backbone of DSCAM to enhance its discriminative ability. We conducted experiments on five garbage datasets. The experimental results demonstrate that the proposed method can effectively classify garbage images and that it outperforms some classical methods.
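A depth-wise separable convolution, the building block named in DSCAM, factors a standard convolution into a per-channel spatial filter followed by a 1×1 point-wise channel mix. A minimal numpy sketch (shapes and random weights are illustrative; DSCAM's attention modules are not reproduced here):

```python
import numpy as np

def depthwise_separable_conv(x, dw, pw):
    """x: (C, H, W); dw: (C, k, k), one spatial filter per channel; pw: (C_out, C), 1x1 mix."""
    C, H, W = x.shape
    k = dw.shape[1]
    out_h, out_w = H - k + 1, W - k + 1
    depth = np.empty((C, out_h, out_w))
    for c in range(C):                      # depth-wise: each channel filtered independently
        for i in range(out_h):
            for j in range(out_w):
                depth[c, i, j] = (x[c, i:i+k, j:j+k] * dw[c]).sum()
    # point-wise: a 1x1 convolution mixes the channels
    return np.tensordot(pw, depth, axes=([1], [0]))

rng = np.random.default_rng(4)
x = rng.normal(size=(3, 8, 8))              # e.g., an RGB garbage-image patch
dw = rng.normal(size=(3, 3, 3))
pw = rng.normal(size=(16, 3))
y = depthwise_separable_conv(x, dw, pw)
print(y.shape)  # (16, 6, 6)
```

The factorization uses C·k² + C_out·C weights instead of C_out·C·k², which is why it suits lightweight attention modules.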

Article
A Deep Learning Framework for Multimodal Course Recommendation Based on LSTM+Attention
Sustainability 2022, 14(5), 2907; https://doi.org/10.3390/su14052907 - 02 Mar 2022
Cited by 2
Abstract
With the impact of COVID-19 on education, online education is booming, enabling learners to access various courses. However, due to course overload and redundant information, it is challenging for users to quickly locate courses they are interested in when faced with a massive number of courses. To solve this problem, we propose a deep course recommendation model with multimodal feature extraction based on a Long Short-Term Memory network (LSTM) and an attention mechanism. The model uses course video, audio, title, and introduction for multimodal fusion. To build a complete learner portrait, user demographic information and explicit and implicit feedback data were added. We conducted extensive and exhaustive experiments based on real datasets, and the results show that the model obtained an AUC of 79.89%, which is significantly higher than similar algorithms and can provide users with more accurate recommendation results in course recommendation scenarios.
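The attention step such models rely on, weighting the LSTM's hidden states by learned scores before pooling them into one vector, can be sketched in numpy. Dimensions and weights below are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(hidden, w):
    """Score each time step's hidden state, normalize the scores, and take a weighted sum."""
    scores = hidden @ w          # (T,): one relevance score per time step
    alpha = softmax(scores)      # attention weights, summing to 1
    return alpha @ hidden        # (D,): context vector summarizing the sequence

T, D = 5, 8
rng = np.random.default_rng(1)
hidden = rng.normal(size=(T, D))  # stand-in for LSTM outputs over one course sequence
w = rng.normal(size=D)            # attention parameter (assumed learned)
context = attention_pool(hidden, w)
print(context.shape)  # (8,)
```

In a multimodal setup, one such context vector per modality (video, audio, text) would be concatenated before the final prediction layer.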

Article
Parallel Particle Swarm Optimization Using Apache Beam
Information 2022, 13(3), 119; https://doi.org/10.3390/info13030119 - 28 Feb 2022
Abstract
The majority of complex research problems can be formulated as optimization problems. The Particle Swarm Optimization (PSO) algorithm is very effective in solving optimization problems because of its robustness, simplicity, and global search capabilities. Since the computational cost of these problems is usually high, it has been necessary to parallelize optimization algorithms. With the advent of big-data technology, such problems can be solved by distributed parallel computing. In previous related work, MapReduce (a programming model that implements a distributed parallel approach to processing and producing large datasets on a cluster) has been used to parallelize the PSO algorithm, but frequent file reads and writes make the execution time of MRPSO very long. We propose Apache Beam particle swarm optimization (BPSO), which uses the Apache Beam parallel programming model. In the experiments, we compared BPSO and PSO based on MapReduce (MRPSO) on four benchmark functions by changing the number of particles and the dimensionality of the problem. The experimental results show that the execution time of MRPSO remains largely constant when the number of particles is small (<1000) but increases rapidly once the number of particles exceeds a certain amount (>1000), whereas the execution time of BPSO grows slowly, and BPSO tends to yield better results than MRPSO. As the dimensionality of the optimization problem increases, BPSO takes about half the time of MRPSO and obtains better results. MRPSO requires more execution time than BPSO as the problem complexity varies, but neither MRPSO nor BPSO is very sensitive to problem complexity. All program code and input data have been uploaded to GitHub.
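For reference, the underlying serial PSO that MRPSO and BPSO parallelize can be sketched in a few lines. The inertia and acceleration coefficients are common defaults, not values from the paper, and the distributed versions would farm out the fitness evaluations rather than change the swarm logic:

```python
import numpy as np

def pso(f, dim, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimization minimizing f over [-5, 5]^dim."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, (n_particles, dim))          # positions
    v = np.zeros_like(x)                                # velocities
    pbest, pbest_val = x.copy(), np.apply_along_axis(f, 1, x)
    g = pbest[pbest_val.argmin()].copy()                # global best
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = x + v
        vals = np.apply_along_axis(f, 1, x)             # the step a Beam pipeline would distribute
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        g = pbest[pbest_val.argmin()].copy()
    return g, float(pbest_val.min())

sphere = lambda p: float((p ** 2).sum())                # classic benchmark function
best, best_val = pso(sphere, dim=5)
print(best_val)
```

On the sphere benchmark this settles to values near zero; the per-iteration fitness evaluations are the embarrassingly parallel part that MapReduce or Beam can spread across workers.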

Article
Electromagnetic Signal Classification Based on Class Exemplar Selection and Multi-Objective Linear Programming
Remote Sens. 2022, 14(5), 1177; https://doi.org/10.3390/rs14051177 - 27 Feb 2022
Abstract
In the increasingly complex electromagnetic environment, a variety of new signal types are appearing; however, existing electromagnetic signal classification (ESC) models cannot handle new signal types. In this context, class-incremental learning has emerged, which aims to incrementally update the classification model as new categories appear. In this paper, an electromagnetic signal classification framework based on class exemplar selection and a multi-objective linear programming classifier (CES-MOLPC) is proposed in order to continuously learn new classes in an incremental manner. Specifically, our approach involves the adaptive selection of class exemplars considering normalized mutual information and a multi-objective linear programming classifier. The former is used to maintain the classification capability of the model for previous categories by selecting key samples, while the latter is used to allow the model to adapt quickly to new categories. Meanwhile, a weighted loss function based on cross-entropy and distillation loss is presented in order to fine-tune the model. We demonstrate the effectiveness of the proposed CES-MOLPC method through extensive experiments on the public RML2016.04c data set and the large-scale real-world ACARS signal data set. The results of the comparative experiments demonstrate that our method achieves significant improvements over state-of-the-art methods.

Article
Graph-Based Embedding Smoothing Network for Few-Shot Scene Classification of Remote Sensing Images
Remote Sens. 2022, 14(5), 1161; https://doi.org/10.3390/rs14051161 - 26 Feb 2022
Cited by 4
Abstract
As a fundamental task in the field of remote sensing, scene classification is attracting increasing attention. The most popular way to solve scene classification is to train a deep neural network on a large-scale remote sensing dataset. However, given a small amount of data, how to train a deep neural network with outstanding performance remains a challenge. Existing methods seek to take advantage of transfer knowledge or meta-knowledge to resolve the scene classification of remote sensing images with a handful of labeled samples, while ignoring the various class-irrelevant noises existing in scene features and the specificity of different tasks. For this reason, in this paper, an end-to-end graph neural network is presented to enhance the performance of scene classification in few-shot scenarios, referred to as the graph-based embedding smoothing network (GES-Net). Specifically, GES-Net adopts an unsupervised non-parametric regularizer, called embedding smoothing, to regularize embedding features. Embedding smoothing can capture high-order feature interactions in an unsupervised manner, which is adopted to remove undesired noises from embedding features and yields smoother embedding features. Moreover, instead of the traditional sample-level relation representation, GES-Net introduces a new task-level relation representation to construct the graph. The task-level relation representation captures the relations between nodes from the perspective of the whole task rather than only between samples, which can highlight subtle differences between nodes and enhance the discrimination of the relations between nodes. Experimental results on three public remote sensing datasets, UC Merced, WHU-RS19, and NWPU-RESISC45, showed that the proposed GES-Net approach obtained state-of-the-art results in settings with limited labeled samples.
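An embedding smoothing regularizer of the kind described can be sketched as one propagation step over a similarity graph: each embedding is mixed with its graph neighbours through a symmetrically normalized affinity matrix. The Gaussian affinity, its bandwidth, and the mixing weight are assumptions; GES-Net's exact construction may differ:

```python
import numpy as np

def embedding_smoothing(feats, sigma=1.0, alpha=0.5):
    """Smooth embeddings by mixing each with its graph neighbours (D^-1/2 A D^-1/2 propagation)."""
    d2 = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    A = np.exp(-d2 / (2 * sigma ** 2))                           # Gaussian affinity graph
    deg = A.sum(1)
    S = A / np.sqrt(np.outer(deg, deg))                          # symmetric normalization
    return (1 - alpha) * feats + alpha * (S @ feats)             # mix original and propagated

rng = np.random.default_rng(2)
feats = rng.normal(size=(10, 16))     # embeddings of the samples in one few-shot task
smooth = embedding_smoothing(feats)
print(smooth.shape)  # (10, 16)
```

The propagation pulls each sample's embedding toward those of similar samples, which is the noise-suppression effect the abstract attributes to embedding smoothing.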

Article
Tuberculosis Bacteria Detection and Counting in Fluorescence Microscopy Images Using a Multi-Stage Deep Learning Pipeline
Information 2022, 13(2), 96; https://doi.org/10.3390/info13020096 - 18 Feb 2022
Cited by 1
Abstract
The manual observation of sputum smears by fluorescence microscopy for the diagnosis and treatment monitoring of patients with tuberculosis (TB) is a laborious and subjective task. In this work, we introduce an automatic pipeline which employs a novel deep learning-based approach to rapidly detect Mycobacterium tuberculosis (Mtb) organisms in sputum samples and thus quantify the burden of the disease. Fluorescence microscopy images are used as input in a series of networks, which ultimately produces a final count of present bacteria more quickly and consistently than manual analysis by healthcare workers. The pipeline consists of four stages: annotation by cycle-consistent generative adversarial networks (GANs), extraction of salient image patches, classification of the extracted patches, and finally, regression to yield the final bacteria count. We empirically evaluate the individual stages of the pipeline as well as perform a unified evaluation on previously unseen data that were given ground-truth labels by an experienced microscopist. We show that with no human intervention, the pipeline can provide the bacterial count for a sample of images with an error of less than 5%.

Article
The Ethical Governance for the Vulnerability of Care Robots: Interactive-Distance-Oriented Flexible Design
Sustainability 2022, 14(4), 2303; https://doi.org/10.3390/su14042303 - 17 Feb 2022
Abstract
The application of care robots is currently a widely accepted solution to the problem of aging. However, for elderly groups who live in communal residences and share intelligent devices, care robots will cause intimacy and assistance dilemmas in the relationship between humans and non-human agents. This is an information-assisted machine setting, with resulting design ethics issues brought about by the binary values of human and machine, body and mind. The notion of "vulnerability" in risk ethics demonstrates that the ethical problems of human agency stem from increased dependence and obstructed intimacy, which are essentially caused by a greater degree of ethical risk exposure and the restriction of agency. Based on value-sensitive design, care ethics, and machine ethics, this paper proposes a flexible design with an interaction-distance-oriented concept and reprograms the ethical design of care robots with intentional distance, representational distance, and interpretive distance as indicators. The main purpose is to advocate a new type of human-machine interaction relationship emphasizing diversity and physical interaction.

Article
Digital Paradox: Platform Economy and High-Quality Economic Development—New Evidence from Provincial Panel Data in China
Sustainability 2022, 14(4), 2225; https://doi.org/10.3390/su14042225 - 16 Feb 2022
Cited by 4
Abstract
Based on provincial panel data of China from 2011 to 2019, this paper discusses the influence and mechanism of the platform economy on the high-quality development of regional economies. It is found that the platform economy has an inverted U-shaped impact on the high-quality development of regional economies. On the left side of the inverted U-shaped inflection point, the platform economy plays a significant role in promoting high-quality economic development; on the right side of the inflection point, the platform economy has an obvious inhibitory effect on high-quality economic development. Statistical analysis showed that 85% of the observations fell on the left side of the inflection point, indicating that China’s platform economy as a whole is in the early stages of development. From the strong and weak grouping test of the degree of government intervention, it was found that the platform economy only has an inverted U-shaped effect on the high-quality development of the areas with weak intervention. From the point of view of the coefficient, the platform economy has a greater promoting effect on the high-quality development of the areas with strong intervention. From the grouping test of the quality of the market system, it was found that the inverted U-shaped curve is steeper in the areas with higher institutional quality, indicating that, in the early stage of development, the platform economy has a greater promoting effect on the high-quality development of areas with perfect institutions. In addition, the analysis of regional heterogeneity showed that, in the early stage of development, the promoting effect of the platform economy on the high-quality development of the northeastern and western regions is more significant. After exceeding the threshold, the platform economy has an inhibitory effect on the high-quality development of all regions. The mechanism test shows that technology, talent, and capital in the initial stage of development can all play a positive regulatory role; after exceeding the threshold, platform economic monopoly may restrain high-quality economic development by hindering technological progress and causing a mismatch of labor–capital elements and resources.
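An inverted U-shaped effect of this kind corresponds to a quadratic specification y = β0 + β1x + β2x² with β2 < 0 and a turning point at −β1/(2β2). A sketch with synthetic data (the coefficients and the share of observations left of the turning point are illustrative, not estimates from the paper):

```python
import numpy as np

# Synthetic data following an inverted U (coefficients are illustrative assumptions).
rng = np.random.default_rng(3)
x = rng.uniform(0, 10, 300)                       # stand-in for a platform-economy index
y = 2.0 + 1.6 * x - 0.2 * x ** 2 + rng.normal(0, 0.1, x.size)

b2, b1, b0 = np.polyfit(x, y, 2)                  # quadratic fit; highest degree first
turning_point = -b1 / (2 * b2)                    # where promotion turns into inhibition
share_left = (x < turning_point).mean()           # observations before the turning point

print(b2 < 0, round(turning_point, 1), round(share_left, 2))
```

Here the fitted turning point recovers the true value of 4.0 and about 40% of observations fall to its left; in the paper, the analogous share was 85%, supporting the claim that the platform economy is still in an early stage.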

Article
HyperLiteNet: Extremely Lightweight Non-Deep Parallel Network for Hyperspectral Image Classification
Remote Sens. 2022, 14(4), 866; https://doi.org/10.3390/rs14040866 - 11 Feb 2022
Cited by 1
Abstract
Deep learning (DL) is widely applied in the field of hyperspectral image (HSI) classification and has proved to be an extremely promising research technique. However, the deployment of DL-based HSI classification algorithms in mobile and embedded vision applications tends to be limited by their massive parameters, high memory costs, and complex networks. In this article, we propose a novel, extremely lightweight, non-deep parallel network (HyperLiteNet) to address these issues. Based on the development trends of hardware devices, the proposed HyperLiteNet replaces a deep network with a parallel structure, yielding fewer sequential computations and lower latency. The parallel structure can extract and optimize diverse and divergent spatial and spectral features independently. Meanwhile, an elaborately designed feature-interaction module is constructed to acquire and fuse generalized abstract spectral and spatial features from the different parallel layers. Lightweight dynamic convolution further compresses the memory footprint of the network to realize flexible spatial feature extraction. Experiments on several real HSI datasets confirm that the proposed HyperLiteNet can efficiently decrease the number of parameters and the execution time, as well as achieve better classification performance compared to several recent state-of-the-art algorithms.

Article
Semantic Segmentation of Metoceanic Processes Using SAR Observations and Deep Learning
Remote Sens. 2022, 14(4), 851; https://doi.org/10.3390/rs14040851 - 11 Feb 2022
Cited by 4
Abstract
Through the Synthetic Aperture Radar (SAR) instruments embarked on the satellites Sentinel-1A and Sentinel-1B of the Copernicus program, a large quantity of observations is routinely acquired over the oceans. A wide range of features of both oceanic (e.g., biological slicks, icebergs, etc.) and meteorologic origin (e.g., rain cells, wind streaks, etc.) are distinguishable in these acquisitions. This paper studies the semantic segmentation of ten metoceanic processes, either in the context of a large quantity of image-level groundtruths (i.e., a weakly-supervised framework) or of scarce pixel-level groundtruths (i.e., a fully-supervised framework). Our main result is that a fully-supervised model outperforms any tested weakly-supervised algorithm. Adding more segmentation examples to the training set would further increase the precision of the predictions. Trained on 20 × 20 km imagettes acquired from the WV acquisition mode of the Sentinel-1 mission, the model is shown to generalize, under some assumptions, to wide-swath SAR data, which further extends its application domain to coastal areas.

Review
A Survey on Text Classification Algorithms: From Text to Predictions
Information 2022, 13(2), 83; https://doi.org/10.3390/info13020083 - 11 Feb 2022
Cited by 6
Abstract
In recent years, the exponential growth of digital documents has been met by rapid progress in text classification techniques. Newly proposed machine learning algorithms leverage the latest advancements in deep learning methods, allowing for the automatic extraction of expressive features. The swift development of these methods has led to a plethora of strategies to encode natural language into machine-interpretable data. The latest language modelling algorithms are used in conjunction with ad hoc preprocessing procedures, of which the description is often omitted in favour of a more detailed explanation of the classification step. This paper offers a concise review of recent text classification models, with emphasis on the flow of data, from raw text to output labels. We highlight the differences between earlier methods and more recent, deep learning-based methods in both their functioning and in how they transform input data. To give a better perspective on the text classification landscape, we provide an overview of datasets for the English language, as well as supplying instructions for the synthesis of two new multilabel datasets, which we found to be particularly scarce in this setting. Finally, we provide an outline of new experimental results and discuss the open research challenges posed by deep learning-based language models.
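The "from raw text to output labels" flow the survey emphasizes (tokenize, encode, classify) can be condensed into a deliberately naive bag-of-words nearest-centroid sketch. The tiny corpus, the lack of weighting, and the whitespace tokenizer are simplifying assumptions; the surveyed deep methods replace the encoding step with learned embeddings:

```python
import math
from collections import Counter

def vectorize(text):
    """Raw text -> sparse bag-of-words vector (the 'encoding' step of the pipeline)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def train_centroids(samples):
    """One summed bag-of-words 'centroid' per label (the simplest possible classifier)."""
    centroids = {}
    for text, label in samples:
        centroids.setdefault(label, Counter()).update(vectorize(text))
    return centroids

def classify(text, centroids):
    vec = vectorize(text)
    return max(centroids, key=lambda lbl: cosine(vec, centroids[lbl]))

train = [("the match ended in a draw", "sport"),
         ("the striker scored a goal", "sport"),
         ("shares fell on the stock market", "finance"),
         ("the bank raised interest rates", "finance")]
centroids = train_centroids(train)
print(classify("the goalkeeper saved the match", centroids))  # sport
```

Every model in the survey can be read as a substitution at one of these three stages: a better tokenizer, a better encoder, or a better classifier.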

Article
Tropical Cyclone Intensity Estimation Using Himawari-8 Satellite Cloud Products and Deep Learning
Remote Sens. 2022, 14(4), 812; https://doi.org/10.3390/rs14040812 - 09 Feb 2022
Cited by 4
Abstract
This study develops an objective deep-learning-based model for tropical cyclone (TC) intensity estimation. The model’s basic structure is a convolutional neural network (CNN), which is a widely used technology in computer vision tasks. In order to optimize the model’s structure and to improve the feature extraction ability, both residual learning and attention mechanisms are embedded into the model. Five cloud products, including cloud optical thickness, cloud top temperature, cloud top height, cloud effective radius, and cloud type, which are level-2 products from the geostationary satellite Himawari-8, are used as the model training inputs. We sampled the cloud products at 13 rotational angles of each TC to augment the training dataset. For the independent test data, the model shows improvement, with a relatively low RMSE of 4.06 m/s and a mean absolute error (MAE) of 3.23 m/s, which are comparable to the results seen in previous studies. Various cloud organization patterns, storm whirling patterns, and TC structures from the feature maps are presented to interpret the model training process. An analysis of the overestimated and underestimated biases shows that the model’s performance is highly affected by the initial cloud products. Moreover, several controlled experiments using other deep learning architectures demonstrate that our designed model is conducive to estimating TC intensity, thus providing insight into the forecasting of other TC metrics.

Article
A Mixed Ensemble Learning and Time-Series Methodology for Category-Specific Vehicular Energy and Emissions Modeling
Sustainability 2022, 14(3), 1900; https://doi.org/10.3390/su14031900 - 07 Feb 2022
Abstract
The serially-correlated nature of engine operation is overlooked in the vehicular fuel and emission modeling literature. Furthermore, enabling the calibration and use of time-series models for instrument-independent eco-driving applications requires reliable forecast aggregation procedures. To this end, an ensemble time-series machine-learning methodology is developed using data collected through extensive field experiments on a fleet of 35 vehicles. Among other results, it is found that the Long Short-Term Memory (LSTM) architecture is the best fit for capturing the dynamic and lagged effects of speed, acceleration, and grade on fuel and emission rates. The developed vehicle-specific ensembles outperformed state-of-the-practice benchmark models by a significant margin, and the category-specific models outscored the vehicle-specific sub-models by an average margin of 6%. These results qualify the developed ensembles to work as representatives for vehicle categories and allow them to be utilized in both eco-driving services and environmental assessment modules.