Topic Editors

Prof. Dr. Miltiadis D. Lytras, School of Business, Deree—The American College of Greece, 6 Gravias Street, Aghia Paraskevi, GR-153 42 Athens, Greece
Prof. Dr. Andreea Claudia Serban, Faculty of Theoretical and Applied Economics, The Bucharest University of Economic Studies, Romana Square, No. 6, 010374 Bucharest, Romania

Big Data and Artificial Intelligence

Abstract submission deadline: closed (30 September 2022)
Manuscript submission deadline: closed (31 December 2022)

Topic Information

Dear Colleagues,

Recent advances in Big Data and artificial intelligence research challenge almost every domain of human activity. The potential of artificial intelligence to act as a catalyst for existing business models, and the capacity of Big Data research to provide sophisticated data and services ecosystems at a global scale, create a challenging context for scientific contributions and applied research. This Topic promotes scientific dialogue on the added value of novel methodological approaches and research in these areas. Our interest spans the entire end-to-end spectrum of Big Data and artificial intelligence research, from the social sciences to computer science, covering strategic frameworks, models, and best practices as well as sophisticated research related to radical innovation. The topics include, but are not limited to, the following indicative list:

  • Enabling Technologies for Big Data and AI research:
    • Data warehouses;
    • Business intelligence;
    • Machine learning;
    • Neural networks;
    • Natural language processing;
    • Image processing;
    • Bot technology;
    • AI agents;
    • Analytics and dashboards;
    • Distributed computing;
    • Edge computing,
  • Methodologies, frameworks, and models for artificial intelligence and Big Data research:
    • Towards sustainable development goals;
    • As responses to social problems and challenges;
    • For innovations in business, research, academia, industry, and technology;
    • For theoretical foundations and contributions to the body of knowledge of AI and Big Data research,
  • Best practices and use cases;
  • Outcomes of R&D projects;
  • Advanced data science analytics;
  • Industry-government collaboration;
  • Systems of information systems;
  • Interoperability issues;
  • Security and privacy issues;
  • Ethics of Big Data and AI;
  • Social impact of AI;
  • Open data.

Prof. Dr. Miltiadis D. Lytras
Prof. Dr. Andreea Claudia Serban
Topic Editors

Keywords

  • artificial intelligence
  • big data
  • machine learning
  • open data
  • decision making

Participating Journals

Journal Name                       Impact Factor   CiteScore   Launched Year   First Decision (median)   APC
Big Data and Cognitive Computing   -               4.9         2017            17.2 days                 1600 CHF
Future Internet                    -               6.7         2009            15.2 days                 1600 CHF
Information                        -               5.8         2010            21.8 days                 1600 CHF
Remote Sensing                     5.349           7.9         2009            19.7 days                 2500 CHF
Sustainability                     3.889           5.8         2009            17.7 days                 2200 CHF

Preprints is a platform dedicated to making early versions of research outputs permanently available and citable. MDPI journals allow posting on preprint servers such as Preprints.org prior to publication. For more details about preprints, please visit https://www.preprints.org.

Published Papers (87 papers)

Communication
Skillful Seasonal Prediction of Typhoon Track Density Using Deep Learning
Remote Sens. 2023, 15(7), 1797; https://doi.org/10.3390/rs15071797 - 28 Mar 2023
Abstract
Tropical cyclones (TCs) seriously threaten the safety of human life and property, especially when approaching a coast or making landfall. Robust, long-lead predictions are valuable for managing policy responses. However, despite decades of efforts, seasonal prediction of TCs remains a challenge. Here, we introduce a deep-learning prediction model to make skillful seasonal predictions of TC track density in the Western North Pacific (WNP) during the typhoon season, with a lead time of up to four months. To overcome the limited availability of observational data, we use TC tracks from CMIP5 and CMIP6 climate models as the training data, followed by a transfer-learning method to train a fully convolutional neural network named SeaUnet. Through the deep-learning process (i.e., heat map analysis), SeaUnet identifies physically based precursors. We show that SeaUnet has a good performance for typhoon distribution, outperforming state-of-the-art dynamic systems. The success of SeaUnet indicates its potential for operational use.
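
For readers who want a concrete picture of the transfer-learning recipe this abstract describes, the following minimal PyTorch sketch pretrains a small fully convolutional network on plentiful simulated samples, then freezes its encoder and fine-tunes only the head on scarce observations. The architecture, shapes, and data loaders are hypothetical stand-ins, not the paper's SeaUnet.

```python
# Minimal transfer-learning sketch (hypothetical shapes; not the paper's SeaUnet).
import torch
import torch.nn as nn

class TinyFCN(nn.Module):
    """A small fully convolutional net mapping predictor fields to a track-density map."""
    def __init__(self, in_ch=4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Conv2d(32, 1, 1)  # per-grid-cell track density

    def forward(self, x):
        return self.head(self.encoder(x))

def fit(model, loader, epochs, lr):
    opt = torch.optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for fields, density in loader:
            opt.zero_grad()
            loss = loss_fn(model(fields), density)
            loss.backward()
            opt.step()

model = TinyFCN()
# 1) Pretrain on abundant climate-model samples (cmip_loader is hypothetical).
# fit(model, cmip_loader, epochs=50, lr=1e-3)
# 2) Transfer: freeze the encoder, fine-tune only the head on scarce observations.
for p in model.encoder.parameters():
    p.requires_grad = False
# fit(model, obs_loader, epochs=10, lr=1e-4)
```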

Article
Analysis of the Numerical Solutions of the Elder Problem Using Big Data and Machine Learning
Big Data Cogn. Comput. 2023, 7(1), 52; https://doi.org/10.3390/bdcc7010052 - 20 Mar 2023
Abstract
In this study, the numerical solutions to the Elder problem are analyzed using Big Data technologies and data-driven approaches. The steady-state solutions to the Elder problem are investigated with regard to Rayleigh numbers (Ra), grid sizes, perturbations, and other parameters of the system studied. The complexity analysis is carried out for the datasets containing different solutions to the Elder problem, and the time of the highest complexity of numerical solutions is estimated. An approach to the identification of transient fingers and the visualization of large ensembles of solutions is proposed. Predictive models are developed to forecast steady states based on early-time observations. These models are classified into three possible types depending on the features (predictors) used in a model. The numerical results of the prediction accuracy are given, including the estimated confidence intervals for the accuracy, and the estimated time of 95% predictability. Different solutions, their averages, principal components, and other parameters are visualized.
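
As a loose illustration of the "predict the steady state from early-time observations, with confidence intervals on the accuracy" idea, the sketch below trains a classifier on synthetic early-time features and attaches a normal-approximation 95% interval to its cross-validated accuracy; all data and feature meanings are invented.

```python
# Hypothetical sketch: classify the eventual steady state from early-time features
# and attach a binomial confidence interval to the measured accuracy.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))                   # early-time predictors (e.g., finger counts, Ra)
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # stand-in for the steady-state label

scores = cross_val_score(RandomForestClassifier(n_estimators=200, random_state=0), X, y, cv=5)
acc, n = scores.mean(), len(y)

# Normal-approximation 95% interval for the accuracy estimate.
half = 1.96 * np.sqrt(acc * (1 - acc) / n)
print(f"accuracy = {acc:.3f} +/- {half:.3f}")
```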

Article
Multi-Layered Projected Entangled Pair States for Image Classification
Sustainability 2023, 15(6), 5120; https://doi.org/10.3390/su15065120 - 14 Mar 2023
Abstract
Tensor networks have been recognized as a powerful numerical tool; they are applied in various fields, including physics, computer science, and more. The idea of a tensor network originates from quantum physics as an efficient representation of quantum many-body states and their operations. Matrix product states (MPS) form one of the simplest tensor networks and have been applied to machine learning for image classification. However, MPS has certain limitations when processing two-dimensional images, which makes it preferable to introduce a projected entangled pair states (PEPS) tensor network, whose structure mirrors that of the image, into machine learning. PEPS tensor networks are significantly superior to other tensor networks on the image classification task. Based on a PEPS tensor network, this paper constructs a multi-layered PEPS (MLPEPS) tensor network model for image classification. PEPS is used to extract features layer by layer from the image mapped to the Hilbert space, which fully utilizes the correlation between pixels while retaining the global structural information of the image. When performing classification tasks on the Fashion-MNIST dataset, MLPEPS achieves a classification accuracy of 90.44%, exceeding tensor network models such as the original PEPS. On the COVID-19 radiography dataset, MLPEPS has a test set accuracy of 91.63%, which is very close to the results of GoogLeNet. Under the same experimental conditions, the learning ability of MLPEPS is already close to that of existing neural networks while having fewer parameters. MLPEPS can be used to build different network models by modifying the structure, and as such it has great potential in machine learning.
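
PEPS contraction is too involved for a short example, but its one-dimensional cousin, the matrix product state, conveys the core mechanics the abstract relies on: map each pixel to a small local feature vector and contract it with trainable tensors to obtain class scores. The sketch below is a numpy toy with illustrative shapes, not the paper's MLPEPS.

```python
# Simplified 1-D matrix-product-state (MPS) forward pass, the simpler cousin of PEPS.
import numpy as np

def feature_map(pixels):
    """Map pixel values in [0, 1] to 2-dim local feature vectors."""
    return np.stack([np.cos(np.pi / 2 * pixels), np.sin(np.pi / 2 * pixels)], axis=-1)

n_sites, bond, n_classes = 16, 8, 10
rng = np.random.default_rng(0)
# One rank-3 tensor per pixel site: (bond_left, physical=2, bond_right).
tensors = [rng.normal(scale=0.5, size=(bond, 2, bond)) for _ in range(n_sites)]
classifier = rng.normal(scale=0.5, size=(bond, n_classes, bond))  # central output tensor

def mps_logits(pixels):
    phi = feature_map(pixels)                     # (n_sites, 2)
    left = np.ones(bond)
    for A, p in zip(tensors[: n_sites // 2], phi[: n_sites // 2]):
        left = np.einsum("i,ipj,p->j", left, A, p)
        left /= np.linalg.norm(left)              # keep the contraction numerically tame
    right = np.ones(bond)
    for A, p in zip(reversed(tensors[n_sites // 2 :]), reversed(phi[n_sites // 2 :])):
        right = np.einsum("ipj,p,j->i", A, p, right)
        right /= np.linalg.norm(right)
    return np.einsum("i,icj,j->c", left, classifier, right)

print(mps_logits(rng.uniform(size=n_sites)))      # ten unnormalized class scores
```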

Article
Oriented Object Detection in Aerial Images Based on the Scaled Smooth L1 Loss Function
Remote Sens. 2023, 15(5), 1350; https://doi.org/10.3390/rs15051350 - 28 Feb 2023
Abstract
Although many state-of-the-art object detectors have been developed, detecting small and densely packed objects with complicated orientations in remote sensing aerial images remains challenging. For object detection in remote sensing aerial images, different scales, sizes, appearances, and orientations of objects from different categories could most likely enlarge the variance in the detection error. Undoubtedly, the variance in the detection error should have a non-negligible impact on the detection performance. Motivated by the above consideration, in this paper, we tackled this issue, so that we could improve the detection performance and reduce the impact of this variance on the detection performance as much as possible. By proposing a scaled smooth L1 loss function, we developed a new two-stage object detector for remote sensing aerial images, named Faster R-CNN-NeXt with RoI-Transformer. The proposed scaled smooth L1 loss function is used for bounding box regression and makes regression invariant to scale. This property ensures that the bounding box regression is more reliable in detecting small and densely packed objects with complicated orientations and backgrounds, leading to improved detection performance. To learn rotated bounding boxes and produce more accurate object locations, a RoI-Transformer module is employed. This is necessary because horizontal bounding boxes are inadequate for aerial image detection. The ResNeXt backbone is also adopted for the proposed object detector. Experimental results on two popular datasets, DOTA and HRSC2016, show that the variance in the detection error significantly affects detection performance. The proposed object detector is effective and robust, with the optimal scale factor for the scaled smooth L1 loss function being around 2.0. Compared to other promising two-stage oriented methods, our method achieves an mAP of 70.82 on DOTA, with an improvement of at least 1.26 and up to 16.49. On HRSC2016, our method achieves an mAP of 87.1, with an improvement of at least 0.9 and up to 1.4.
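
The loss at the heart of this paper admits a compact statement. The sketch below implements the standard scale-parameterized smooth L1 (Huber-style) form, quadratic near zero and linear in the tails, with the transition controlled by a scale factor; the paper's exact scaling may differ, and the value 2.0 merely echoes the optimum reported above.

```python
# Scale-parameterized smooth L1 regression loss; the paper's exact scaling may differ,
# this is the common beta-parameterized form for reference.
import numpy as np

def smooth_l1(x, scale=2.0):
    """Quadratic for |x| < 1/scale, linear beyond; larger scale narrows the quadratic zone."""
    x = np.asarray(x, dtype=float)
    quad = 0.5 * scale * x**2
    lin = np.abs(x) - 0.5 / scale
    return np.where(np.abs(x) < 1.0 / scale, quad, lin)  # continuous at |x| = 1/scale

deltas = np.array([-2.0, -0.4, 0.0, 0.3, 1.5])    # box-regression residuals
print(smooth_l1(deltas))
```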

Article
How and When Does Big Data Analytics Capability Boost Innovation Performance?
Sustainability 2023, 15(5), 4036; https://doi.org/10.3390/su15054036 - 22 Feb 2023
Abstract
The diffusion of big data in recent years has stimulated many companies to develop big data analytics capability (BDAC) to boost innovation performance. However, research regarding how and when BDAC can increase innovation performance is still scant. This study aims to test how (i.e., the mediating role of strategic flexibility and strategic innovation) and when (i.e., the moderating role of environmental uncertainty) BDAC can boost a firm’s innovation performance, drawing on resource-based theory. Through a survey of 421 Chinese managers and employees who are engaged in the field of big data analytics, this study reveals that (1) BDAC has a positive effect on innovation performance, (2) strategic flexibility and strategic innovation play a significant serial mediating role in this relationship, and (3) the positive effect of BDAC on innovation performance is more significant under high (vs. low) environmental uncertainty conditions. This study contributes to the extant literature by verifying how BDAC can increase a firm’s innovation performance through the serial mediating role of strategic flexibility and strategic innovation. It also confirms a contingent factor (i.e., environmental uncertainty) regarding the positive effect of BDAC on innovation performance.

Article
A Multi-Level Distributed Computing Approach to XDraw Viewshed Analysis Using Apache Spark
Remote Sens. 2023, 15(3), 761; https://doi.org/10.3390/rs15030761 - 28 Jan 2023
Abstract
Viewshed analysis is a terrain visibility computation method based on the digital elevation model (DEM). With the rapid growth of remote sensing and data collection technologies, the volume of large-scale raster DEM data has reached a great size (ZB). However, data storage and GIS analysis based on such a large-scale digital data volume become extremely difficult. The usual distributed approaches based on Apache Hadoop and Spark can efficiently handle the viewshed analysis of large-scale DEM data, but bottleneck and precision problems remain. In this article, we present a multi-level distributed XDraw (ML-XDraw) algorithm with Apache Spark to handle the viewshed analysis of large DEM data. The ML-XDraw algorithm mainly consists of 3 parts: (1) designing the XDraw algorithm into a multi-level distributed computing process, (2) introducing a multi-level data decomposition strategy to solve the computing bottleneck problem of the cluster’s executors, and (3) proposing a boundary approximate calculation strategy to solve the precision loss problem in calculation near the boundary. Experiments show that the ML-XDraw algorithm adequately addresses the above problems and achieves better speed-up and accuracy as the volume of raster DEM data increases drastically.
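
To make the decomposition idea tangible, the following PySpark sketch partitions DEM rows across executors and runs a deliberately simplistic per-cell line-of-sight test in parallel; it is not the XDraw recursion or the paper's boundary approximation, and all sizes are arbitrary.

```python
# Toy Spark decomposition sketch for viewshed analysis (not the ML-XDraw algorithm).
import numpy as np
from pyspark import SparkContext

sc = SparkContext(appName="viewshed-sketch")
dem = np.abs(np.random.default_rng(0).normal(size=(128, 128))) * 100.0
ox, oy, oh = 64, 64, 10.0                         # observer cell and height

def visible_row(args):
    r, row = args
    out = np.zeros(row.size, dtype=bool)
    for c in range(row.size):
        n = max(abs(r - ox), abs(c - oy), 1) + 1
        rs = np.linspace(ox, r, n).round().astype(int)    # cells along the sight line
        cs = np.linspace(oy, c, n).round().astype(int)
        sight = np.linspace(dem[ox, oy] + oh, row[c], n)  # elevation of the sight line
        out[c] = np.all(dem[rs, cs] <= sight + 1e-9)      # no terrain above the line
    return r, out

rows = sc.parallelize(list(enumerate(dem)), numSlices=16)
viewshed = dict(rows.map(visible_row).collect())
sc.stop()
print(sum(v.sum() for v in viewshed.values()), "visible cells")
```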

Article
A Real-Time Computer Vision Based Approach to Detection and Classification of Traffic Incidents
Big Data Cogn. Comput. 2023, 7(1), 22; https://doi.org/10.3390/bdcc7010022 - 28 Jan 2023
Abstract
To constructively improve and enhance traffic safety measures in Saudi Arabia, numerous AI (artificial intelligence) traffic surveillance technologies, such as Saher, have emerged over the past years. However, rapidly detecting a vehicle incident can play a cardinal role in improving the response speed of incident management, which in turn minimizes road injuries induced by the accident’s occurrence. To address the increasing demand for road traffic security and safety, this paper presents a real-time traffic incident detection and alert system that is based on a computer vision approach. The proposed framework consists of three models, each of which is integrated within a prototype interface to fully visualize the system’s overall architecture. To begin, the vehicle detection and tracking model utilized the YOLOv5 object detector with the DeepSORT tracker to detect and track the vehicles’ movements by allocating a unique identification number (ID) to each vehicle. This model attained a mean average precision (mAP) of 99.2%. Second, a traffic accident and severity classification model attained an mAP of 83.3% while utilizing the YOLOv5 algorithm to accurately detect and classify an accident’s severity level, sending an immediate alert message to the nearest hospital if a severe accident has taken place. Finally, the ResNet152 algorithm was utilized to detect the ignition of a fire following the accident’s occurrence; this model achieved an accuracy rate of 98.9%, with an automated alert being sent to the fire station if this perilous event occurred. This study employed an innovative parallel computing technique to reduce the overall complexity and inference time of the AI-based system so that the proposed system runs in a concurrent and parallel manner.
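
A hedged sketch of the detection stage only, using the public YOLOv5 weights via torch.hub; the paper's DeepSORT tracking, severity classifier, fire model, and alerting logic are reduced to comments, and the video path and class filtering are illustrative.

```python
# Detection-stage sketch only (public YOLOv5 weights; not the paper's full pipeline).
import cv2
import torch

model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)
cap = cv2.VideoCapture("traffic_feed.mp4")        # hypothetical video source

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    dets = model(rgb).xyxy[0]                     # rows: (x1, y1, x2, y2, conf, class)
    vehicles = dets[dets[:, 5] == 2]              # COCO class 2 == "car"
    # A full system would hand `vehicles` to a tracker (e.g., DeepSORT) and feed the
    # tracked crops to the accident-severity and fire classifiers, raising alerts.
cap.release()
```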

Article
X-Wines: A Wine Dataset for Recommender Systems and Machine Learning
Big Data Cogn. Comput. 2023, 7(1), 20; https://doi.org/10.3390/bdcc7010020 - 22 Jan 2023
Abstract
In the current technological scenario of artificial intelligence growth, especially using machine learning, large datasets are necessary. Recommender systems appear with increasing frequency with different techniques for information filtering. Few large wine datasets are available for use with wine recommender systems. This work presents X-Wines, a new and consistent wine dataset containing 100,000 instances and 21 million real evaluations carried out by users. Data were collected on the open Web in 2022 and pre-processed for wider free use. The ratings, on a 1–5 scale, were carried out over a period of 10 years (2012–2021) for wines produced in 62 different countries. A demonstration of some applications using X-Wines in the scope of recommender systems with deep learning algorithms is also presented.
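
As a taste of what such a ratings dataset enables, here is a minimal matrix-factorization recommender trained by plain SGD on toy (user, wine, rating) triples; column names and hyperparameters are invented, and the paper's own demonstrations use deep learning instead.

```python
# Minimal matrix-factorization sketch on X-Wines-style triples (toy data).
import numpy as np
import pandas as pd

df = pd.DataFrame({"user": [0, 0, 1, 2, 2], "wine": [0, 1, 1, 0, 2],
                   "rating": [5, 3, 4, 2, 5]})   # stand-in for the real CSV
n_u, n_i, k = df.user.max() + 1, df.wine.max() + 1, 8
rng = np.random.default_rng(0)
P, Q = rng.normal(scale=0.1, size=(n_u, k)), rng.normal(scale=0.1, size=(n_i, k))

for _ in range(200):                              # plain SGD on squared error
    for u, i, r in df.itertuples(index=False):
        err = r - P[u] @ Q[i]
        pu = P[u].copy()
        P[u] += 0.05 * (err * Q[i] - 0.01 * P[u])
        Q[i] += 0.05 * (err * pu - 0.01 * Q[i])

print("predicted rating (user 1, wine 0):", P[1] @ Q[0])
```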

Article
The Evolution of Artificial Intelligence in the Digital Economy: An Application of the Potential Dirichlet Allocation Model
Sustainability 2023, 15(2), 1360; https://doi.org/10.3390/su15021360 - 11 Jan 2023
Abstract
The most critical driver of the digital economy comes from breakthroughs in cutting-edge technologies such as artificial intelligence. In order to promote technological innovation and layout in the field of artificial intelligence, this paper analyzes the patent texts of artificial intelligence technology using the LDA topic model, from the perspective of patent technology subjects, based on Derwent patent data. The results reveal that AI technology is upgrading from chips, sensing, and algorithms to innovative platforms and intelligent applications. Countermeasures are proposed to advance the digitalization of the global economy and to achieve economic globalization in terms of industrial integration, building ecological systems, and strengthening independent innovation.
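
For orientation, LDA topic extraction of the kind described here reduces to a few lines with scikit-learn; the toy corpus below stands in for the Derwent patent texts.

```python
# LDA topic-model sketch over patent-like text (toy corpus, illustrative only).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "neural network chip accelerates inference",
    "image sensor module for autonomous driving",
    "speech recognition algorithm for edge devices",
    "training platform for deep learning models",
]
vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

terms = vec.get_feature_names_out()
for t, comp in enumerate(lda.components_):
    top = comp.argsort()[-4:][::-1]               # four most probable words per topic
    print(f"topic {t}:", [terms[i] for i in top])
```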

Article
Prediction of Pork Supply Based on Improved Mayfly Optimization Algorithm and BP Neural Network
Sustainability 2022, 14(24), 16559; https://doi.org/10.3390/su142416559 - 09 Dec 2022
Abstract
Focusing on the issues of slow convergence speed and the ease of falling into a local optimum when optimizing the weights and thresholds of a back-propagation artificial neural network (BPANN) by the gradient method, a prediction method for pork supply based on an improved mayfly optimization algorithm (MOA) and BPANN is proposed. Firstly, in order to improve the performance of MOA, an improved mayfly optimization algorithm with an adaptive visibility coefficient (AVC-IMOA) is introduced. Secondly, AVC-IMOA is used to optimize the weights and thresholds of a BPANN (AVC-IMOA_BP). Thirdly, the trained BPANN and the statistical data are adopted to predict the pork supply in Heilongjiang Province from 2000 to 2020. Finally, to demonstrate the effectiveness of the proposed method for predicting pork supply, the pork supply in Heilongjiang Province was predicted by using AVC-IMOA_BP, a BPANN based on the gradient descent method and a BPANN based on a mixed-strategy whale optimization algorithm (MSWOA_BP), a BPANN based on an artificial bee colony algorithm (ABC_BP) and a BPANN based on a firefly algorithm and sparrow search algorithm (FASSA_BP) in the literature. The results show that the prediction accuracy of the proposed method based on AVC-IMOA and a BPANN is obviously better than those of MSWOA_BP, ABC_BP and FASSA_BP, thus verifying the superior performance of AVC-IMOA_BP.
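
The mayfly algorithm itself is not reproduced here, but the general pattern, a population-based metaheuristic searching the weight space of a small network instead of gradient descent, can be sketched briefly; everything below (data, network size, the naive best-centered random search) is illustrative.

```python
# Stand-in for metaheuristic weight search: population-based random search optimizing
# a tiny one-hidden-layer network (not the paper's AVC-IMOA operators).
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(size=(40, 3))                     # toy supply predictors
y = X @ np.array([0.5, -0.2, 0.8]) + 0.1          # toy supply target

def unpack(w, h=5, d=3):
    W1 = w[: d * h].reshape(d, h); b1 = w[d * h : d * h + h]
    W2 = w[d * h + h : d * h + 2 * h]; b2 = w[-1]
    return W1, b1, W2, b2

def mse(w):
    W1, b1, W2, b2 = unpack(w)
    hidden = np.tanh(X @ W1 + b1)                 # BP-style network, evaluated only
    return np.mean((hidden @ W2 + b2 - y) ** 2)

dim = 3 * 5 + 5 + 5 + 1
pop = rng.normal(size=(30, dim))
for _ in range(300):                              # resample the swarm around the best
    best = pop[np.argmin([mse(w) for w in pop])]
    pop = best + rng.normal(scale=0.1, size=pop.shape)
print("best MSE:", min(mse(w) for w in pop))
```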

Article
Competency-Based E-Learning Systems: Automated Integration of User Competency Portfolio
Sustainability 2022, 14(24), 16544; https://doi.org/10.3390/su142416544 - 09 Dec 2022
Abstract
In today’s learning environment, e-learning systems are becoming a necessity. A competency-based student portfolio system is also gaining popularity. Due to the variety of e-learning systems and the increasing mobility of students between different learning institutions or e-learning systems, a higher level of automated competency portfolio integration is required. Increasing mobility and complexity make manual mapping of student competencies unsustainable. The purpose of this paper is to automate the mapping of e-learning system competencies with student-gained competencies from other systems. Natural language processing, text similarity estimation, and fuzzy logic applications were used to implement the automated mapping process. Multiple cases have been tested to determine the effectiveness of the proposed solution. The solution has been shown to be able to accurately predict the coverage of system course competency by students’ course competency with an accuracy of approximately 77%. As it is not possible to achieve 100% mapping accuracy, the competency mapping should be executed semi-automatically by applying the proposed solution to obtain the initial mapping, and then manually revising the results as necessary. When compared to a fully manual mapping of competencies, it reduces workload and increases resource sustainability.
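
A minimal version of the automated mapping step can be built from off-the-shelf text similarity: TF-IDF cosine scores with a fuzzy-style decision band, as below. The competency strings and thresholds are invented, and the paper's NLP pipeline is richer.

```python
# Competency-mapping sketch: cosine similarity of TF-IDF vectors plus a threshold
# band deciding map / review / reject (thresholds are illustrative).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

system_comps = ["design relational database schemas", "apply statistical hypothesis tests"]
student_comps = ["designed and normalised relational database schemas", "wrote unit tests"]

vec = TfidfVectorizer().fit(system_comps + student_comps)
sim = cosine_similarity(vec.transform(student_comps), vec.transform(system_comps))

for i, row in enumerate(sim):
    j = row.argmax()                              # best-matching system competency
    verdict = "map" if row[j] > 0.6 else "manual review" if row[j] > 0.3 else "no match"
    print(f"'{student_comps[i]}' -> '{system_comps[j]}' ({row[j]:.2f}): {verdict}")
```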

Article
Nexus between Renewable Energy, Credit Gap Risk, Financial Development and R&D Expenditure: Panel ARDL Approach
Sustainability 2022, 14(23), 16232; https://doi.org/10.3390/su142316232 - 05 Dec 2022
Abstract
In this study, we investigate the relationships between the renewable energy consumption sub-indicators of the G-8 countries and financial development, credit gap risk, and R&D expenditure from 1996 to 2018. The relationships among the variables are analyzed by employing the panel ARDL method and the Dumitrescu–Hurlin panel causality test. The cointegration relationships between the variables are analyzed using the bounds test approach, and an unrestricted error correction model is established. Contrary to previous studies in the renewable energy literature, this study employs the variable of credit gap risk. Therefore, we believe that this study will fill a gap in the literature and attract the attention of researchers and policymakers. The results indicate that increases in total demand for renewable energy positively affect the financial development of countries. Moreover, R&D expenditures increase as the demand for hydro energy and solar energy increases. The results also indicate that wind power consumption has a short-term impact on R&D expenditure, and such an impact ceases to exist in the long run. According to the empirical research findings, the rise in demand for renewable energy may be a factor mitigating the credit gap risk of countries. In other words, the credit gap risk, which is considered a leading indicator of systemic banking crises, can be mitigated by the rise in the demand for renewable energy.

Case Report
PM2.5 Prediction Based on the CEEMDAN Algorithm and a Machine Learning Hybrid Model
Sustainability 2022, 14(23), 16128; https://doi.org/10.3390/su142316128 - 02 Dec 2022
Abstract
The current serious air pollution problem has become a closely investigated topic in people’s daily lives. If we want to provide a reasonable basis for haze prevention, then the prediction of PM2.5 concentrations becomes a crucial task. However, it is difficult to complete the task of PM2.5 concentration prediction using a single model; therefore, to address this problem, this paper proposes the complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) algorithm combined with a hybrid of machine learning models. Firstly, the CEEMDAN algorithm was used to decompose the PM2.5 time-series data into different modal components. Then long short-term memory (LSTM), a backpropagation (BP) neural network, an autoregressive integrated moving average model (ARIMA), and a support vector machine (SVM) were applied to each modal component. Lastly, the best prediction results of each component were superimposed and summed to obtain the final prediction results. The PM2.5 data of Hangzhou in recent years were substituted into the model for testing, which was compared with eight models, namely, LSTM, ARIMA, BP, SVM, CEEMDAN–ARIMA, CEEMDAN–LSTM, CEEMDAN–SVM, and CEEMDAN–BP. The results show that the prediction ability of the coupled CEEMDAN–LSTM–BP–ARIMA model was better than that of all the other models, and the time-series decomposition data of PM2.5 had their own characteristics. The data with different characteristics were predicted separately using appropriate models, and the final combined model results obtained were the most satisfactory.
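
The decompose-then-forecast pattern is easy to demonstrate, assuming the PyEMD (EMD-signal) package for CEEMDAN: split the series into modal components, forecast each separately, and sum. The per-component model below is a plain least-squares autoregression rather than the paper's best-of LSTM/BP/ARIMA/SVM selection, and the data are synthetic.

```python
# Decompose-then-forecast sketch (assumes the PyEMD / EMD-signal package).
import numpy as np
from PyEMD import CEEMDAN

rng = np.random.default_rng(0)
t = np.arange(400)
pm25 = 50 + 20 * np.sin(2 * np.pi * t / 50) + rng.normal(scale=5, size=t.size)

imfs = CEEMDAN()(pm25)                            # modal components of the series

def ar_forecast(x, order=10):
    """Least-squares AR(order) one-step forecast."""
    A = np.column_stack([x[i : len(x) - order + i] for i in range(order)])
    coef, *_ = np.linalg.lstsq(A, x[order:], rcond=None)
    return x[-order:] @ coef

prediction = sum(ar_forecast(imf) for imf in imfs)  # sum of per-component forecasts
print("one-step PM2.5 forecast:", round(float(prediction), 2))
```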

Article
Analysis and Prediction of the IPv6 Traffic over Campus Networks in Shanghai
Future Internet 2022, 14(12), 353; https://doi.org/10.3390/fi14120353 - 27 Nov 2022
Abstract
With the exhaustion of IPv4 addresses, research on the adoption, deployment, and prediction of IPv6 networks becomes more and more significant. This paper analyzes the IPv6 traffic of two campus networks in Shanghai, China. We first conduct a series of analyses for the traffic patterns and uncover weekday/weekend patterns, the self-similarity phenomenon, and the correlation between IPv6 and IPv4 traffic. On weekends, traffic usage is smaller than on weekdays, but the distribution does not change much. We find that the self-similarity of IPv4 traffic is close to that of IPv6 traffic, and there is a strong positive correlation between IPv6 traffic and IPv4 traffic. Based on our findings on traffic patterns, we propose a new IPv6 traffic prediction model by combining the advantages of the statistical and deep learning models. In addition, our model extracts useful information from the corresponding IPv4 traffic to enhance the prediction. Based on two real-world datasets, it is shown that the proposed model outperforms eight baselines with a lower prediction error. In conclusion, our approach is helpful for network resource allocation and network management.
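
One simple way to extract useful information from the corresponding IPv4 traffic, in the spirit of the abstract, is to include lagged IPv4 values as extra regressors; the ridge-regression sketch below does exactly that on synthetic series and is not the paper's hybrid statistical/deep model.

```python
# Sketch: predict IPv6 traffic from lagged IPv6 *and* IPv4 values (synthetic data).
import numpy as np

rng = np.random.default_rng(0)
n, lags = 600, 24
ipv4 = 100 + 30 * np.sin(2 * np.pi * np.arange(n) / 168) + rng.normal(scale=3, size=n)
ipv6 = 0.4 * ipv4 + rng.normal(scale=2, size=n)   # strongly correlated, as observed

def lagged(x):
    """Rows of `lags` consecutive values; row j targets x[j + lags]."""
    return np.column_stack([x[i : n - lags + i] for i in range(lags)])

X = np.hstack([lagged(ipv6), lagged(ipv4)])       # IPv6 history + IPv4 history
y = ipv6[lags:]
w = np.linalg.solve(X.T @ X + 1.0 * np.eye(X.shape[1]), X.T @ y)  # ridge fit

x_next = np.concatenate([ipv6[-lags:], ipv4[-lags:]])
print("next-hour IPv6 estimate:", round(float(x_next @ w), 2))
```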

Article
An Effective Online Sequential Stochastic Configuration Algorithm for Neural Networks
Sustainability 2022, 14(23), 15601; https://doi.org/10.3390/su142315601 - 23 Nov 2022
Abstract
Random vector functional-link (RVFL) networks, as a class of random learner models, have received careful attention from the neural network research community due to their advantages in obtaining fast learning algorithms and models, in which the hidden layer parameters are randomly generated and remain fixed during the training phase. However, their universal approximation ability may not be guaranteed if the random parameters are not properly selected in an appropriate range. Moreover, the resulting random learner’s generalization performance may seriously deteriorate once the RVFL network’s structure is not well-designed. The stochastic configuration (SC) algorithm, which incrementally constructs a universal approximator by obtaining random hidden parameters under a specified supervisory mechanism, instead of fixing the selection scope in advance and without any reference to training information, can effectively circumvent these awkward issues caused by randomness. This paper extends the SC algorithm to an online sequential version, termed the OSSC algorithm, by means of the recursive least squares (RLS) technique, aiming to cope with modeling tasks where training observations are provided sequentially. Compared to the online sequential learning of RVFL networks (OS-RVFL for short), our proposed OSSC algorithm can avoid the awkward setting of an unreasonable range for the random parameters, and can also successfully build a random learner with preferable learning and generalization capabilities. The experimental study has shown the effectiveness and advantages of our OSSC algorithm.
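
The incremental spirit of stochastic configuration can be conveyed in a few lines: propose a random hidden node, keep it only if it reduces the residual, and refit the output weights by least squares. The sketch omits the SC supervisory inequality and the RLS updates that define OSSC proper.

```python
# Simplified stochastic-configuration flavour (not the paper's OSSC algorithm).
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(300, 2))
y = np.sin(np.pi * X[:, 0]) * X[:, 1]             # toy target function

def fit_mse(M):
    beta, *_ = np.linalg.lstsq(M, y, rcond=None)  # least-squares output weights
    return beta, float(np.mean((M @ beta - y) ** 2))

H = np.ones((X.shape[0], 1))                      # start with a bias column
_, err = fit_mse(H)
for _ in range(100):
    w, b = rng.uniform(-2, 2, size=2), rng.uniform(-2, 2)
    cand = np.hstack([H, np.tanh(X @ w + b)[:, None]])
    _, cand_err = fit_mse(cand)
    if cand_err < err:                            # keep the random node only if it helps
        H, err = cand, cand_err

print("hidden nodes:", H.shape[1] - 1, "train MSE:", err)
```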

Article
Improving Natural Language Person Description Search from Videos with Language Model Fine-Tuning and Approximate Nearest Neighbor
Big Data Cogn. Comput. 2022, 6(4), 136; https://doi.org/10.3390/bdcc6040136 - 11 Nov 2022
Abstract
Due to the ubiquitous nature of CCTV cameras that record continuously, there is a large amount of video data that are unstructured. Often, when these recordings have to be reviewed, it is to look for a specific person that fits a certain description. Currently, this is achieved by manual inspection of the videos, which is both time-consuming and labor-intensive. While person description search is not a new topic, in this work, we made two contributions. First, we improve upon the existing state-of-the-art by proposing unsupervised finetuning on the language model that forms a main part of the text branch of person description search models. This led to higher recall values on the standard dataset. The second contribution is that we engineered a complete pipeline from video files to fast searchable objects. Due to the use of an approximate nearest neighbor search and some model optimizations, a person description search can be performed such that the result is available immediately when deployed on a standard PC with no GPU, allowing an interactive search. We demonstrated the effectiveness of the system on new data and showed that most people in the videos can be successfully discovered by the search.
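
The "fast searchable objects" part of such a pipeline typically reduces to an approximate nearest neighbor index over embedding vectors. A cosine-similarity sketch with faiss (assuming the faiss-cpu package) follows; the embeddings are random stand-ins for the model's person and text features.

```python
# ANN-retrieval sketch: cosine search over embeddings with faiss (faiss-cpu assumed).
import numpy as np
import faiss

d = 256
rng = np.random.default_rng(0)
person_embs = rng.normal(size=(10_000, d)).astype("float32")  # one per detected person
faiss.normalize_L2(person_embs)                   # cosine similarity via inner product

index = faiss.IndexFlatIP(d)
index.add(person_embs)

query = rng.normal(size=(1, d)).astype("float32") # embedding of the text description
faiss.normalize_L2(query)
scores, ids = index.search(query, 5)              # top-5 matching detections
print(ids[0], scores[0])
```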

Article
Unsupervised Cluster-Wise Hyperspectral Band Selection for Classification
Remote Sens. 2022, 14(21), 5374; https://doi.org/10.3390/rs14215374 - 27 Oct 2022
Abstract
A hyperspectral image provides fine details about the scene under analysis, due to its multiple bands. However, the resulting high dimensionality in the feature space may render a classification task unreliable, mainly due to overfitting and the Hughes phenomenon. In order to attenuate such problems, one can resort to dimensionality reduction (DR). Thus, this paper proposes a new DR algorithm, which performs an unsupervised band selection technique following a clustering approach. More specifically, the data set was split into a predefined number of clusters, after which the bands were iteratively selected based on the parameters of a separating hyperplane, which provided the best separation in the feature space, in a one-versus-all scenario. Then, a fine-tuning of the initially selected bands took place based on the separability of clusters. A comparison with five other state-of-the-art frameworks shows that the proposed method achieved the best classification results in 60% of the experiments.
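
The hyperplane-based ranking idea can be miniaturized as follows: cluster the pixels, fit a linear one-vs-rest separator, and keep the bands with the largest aggregate coefficient magnitudes. The data are synthetic and the paper's iterative refinement is skipped.

```python
# Band-selection sketch: rank spectral bands by |hyperplane coefficient| per cluster.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
pixels = rng.normal(size=(2000, 120))             # 120 spectral bands (synthetic)
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(pixels)

svm = LinearSVC(dual=False, max_iter=5000).fit(pixels, labels)  # one-vs-rest internally
importance = np.abs(svm.coef_).sum(axis=0)        # aggregate |w| per band
selected = np.argsort(importance)[-15:]           # keep the 15 most separating bands
print(sorted(selected.tolist()))
```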

Article
Auto-Learning Correlation-Filter-Based Target State Estimation for Real-Time UAV Tracking
Remote Sens. 2022, 14(21), 5299; https://doi.org/10.3390/rs14215299 - 23 Oct 2022
Abstract
Most existing tracking methods based on discriminative correlation filters (DCFs) update the tracker every frame with a fixed learning rate. However, constantly adjusting the tracker can hardly handle the fickle target appearance in UAV tracking (e.g., undergoing partial occlusion, illumination variation, or deformation). To mitigate this, we propose a novel auto-learning correlation filter (ALCF) for UAV tracking, which fully exploits valuable information behind response maps for adaptive feedback updating. Concretely, we first introduce a principled target state estimation (TSE) criterion to reveal the confidence level of the tracking results. We then suggest an auto-learning strategy with the TSE metric to update the tracker with adaptive learning rates. Based on the target state estimation, we further develop an innovative lost-and-found strategy to recognize and handle temporary target loss. Finally, we incorporate the TSE regularization term into the DCF objective function, which can be efficiently solved by alternating optimization iterations without much computational cost. Extensive experiments on four widely used UAV benchmarks demonstrate the superiority of the proposed method compared to both DCF and deep-based trackers. Notably, ALCF achieves state-of-the-art performance on several benchmarks while running at over 50 FPS on a single CPU. Code will be released soon.
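
A classic proxy for the confidence signal such trackers read off the response map is the peak-to-sidelobe ratio (PSR); the sketch below gates the learning rate with it. Thresholds are illustrative and this is not the paper's exact TSE criterion.

```python
# Confidence-adaptive update sketch: PSR of the response map gates the learning rate.
import numpy as np

def psr(response, exclude=5):
    """Peak-to-sidelobe ratio: peak height relative to the off-peak statistics."""
    r, c = np.unravel_index(response.argmax(), response.shape)
    mask = np.ones_like(response, dtype=bool)
    mask[max(r - exclude, 0) : r + exclude + 1, max(c - exclude, 0) : c + exclude + 1] = False
    side = response[mask]
    return (response.max() - side.mean()) / (side.std() + 1e-12)

def adaptive_rate(response, base=0.02):
    score = psr(response)
    if score < 3.0:                               # likely occlusion/loss: freeze the model
        return 0.0
    return base * min(score / 10.0, 1.0)          # scale the update with confidence

resp = np.random.default_rng(0).random((64, 64))
resp[30, 40] = 3.0                                # a confident correlation peak
print(f"PSR={psr(resp):.1f}, learning rate={adaptive_rate(resp):.4f}")
```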

Article
Supporting Meteorologists in Data Analysis through Knowledge-Based Recommendations
Big Data Cogn. Comput. 2022, 6(4), 103; https://doi.org/10.3390/bdcc6040103 - 28 Sep 2022
Abstract
Climate change means coping directly or indirectly with extreme weather conditions for everybody. Therefore, analyzing meteorological data to create precise models is gaining more importance and might become inevitable. Meteorologists have extensive domain knowledge about meteorological data yet lack practical data analysis skills. This paper presents a method to bridge this gap by empowering the data knowledge carriers to analyze the data. The proposed system utilizes symbolic AI, a knowledge base created by experts, and a recommendation expert system to offer suitable data analysis methods or data pre-processing to meteorologists. This paper systematically analyzes the target user group of meteorologists and practical use cases to arrive at a conceptual and technical system design implemented in the CAMeRI prototype. The concepts in this paper are aligned with the AI2VIS4BigData Reference Model and comprise a novel first-order logic knowledge base that represents analysis methods and related pre-processing. The prototype implementation was qualitatively and quantitatively evaluated. This evaluation included recommendation validation for real-world data, a cognitive walkthrough, and measuring computation timings of the different system components.

Article
Image Retrieval Algorithm Based on Locality-Sensitive Hash Using Convolutional Neural Network and Attention Mechanism
Information 2022, 13(10), 446; https://doi.org/10.3390/info13100446 - 24 Sep 2022
Abstract
With the continuous progress of image retrieval technology, how quickly a desired image can be found in a large volume of image data has become a hot issue in the field of image retrieval. Convolutional neural networks (CNNs) have been used in this field. However, many image retrieval systems based on CNNs have a poor ability to express image features, resulting in a series of problems such as low retrieval accuracy and robustness. When the target image is retrieved from a large amount of image data, the vector dimension after image coding is high and the retrieval efficiency is low. Locality-sensitive hashing is a method for finding similar data among massive high-dimensional data. It reduces the dimension of the original spatial data through hash coding and conversion while maintaining the similarity between the data, so its retrieval time and space complexity are low. Therefore, this paper proposes a locality-sensitive hash image retrieval method based on a CNN and the attention mechanism. The method proceeds as follows: the ResNet50 network is used as the image feature extractor, with an attention module added after its convolution layers; the output of the network’s fully connected layer provides the features of the image database; the locality-sensitive hash algorithm then hash-codes these features to reduce their dimension and build the index; finally, the features of the query image are matched against the image database to obtain the most similar images, completing the content-based image retrieval task. The method in this paper is compared with other image retrieval methods on the corel1k and corel5k datasets. The experimental results show that this method can effectively improve the accuracy of image retrieval, significantly improves retrieval efficiency, and has higher robustness in different scenarios.
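
Random-hyperplane LSH, the standard cosine-distance variant, captures the dimensionality-reduction-plus-index idea in a few lines; the feature matrix below stands in for ResNet50 descriptors.

```python
# Random-hyperplane LSH sketch: hash CNN features to short binary codes, retrieve by
# Hamming distance in code space.
import numpy as np

rng = np.random.default_rng(0)
feats = rng.normal(size=(10_000, 512))            # stand-in for ResNet50 features
planes = rng.normal(size=(512, 32))               # 32 random hyperplanes -> 32-bit codes
codes = feats @ planes > 0

def search(query_feat, k=5):
    q = query_feat @ planes > 0
    hamming = (codes != q).sum(axis=1)            # distance in code space
    return np.argsort(hamming)[:k]                # indices of most similar images

print(search(feats[123]))                         # should rank image 123 first
```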

Article
A Data-Driven Based Method for Pipeline Additional Stress Prediction Subject to Landslide Geohazards
Sustainability 2022, 14(19), 11999; https://doi.org/10.3390/su141911999 - 22 Sep 2022
Abstract
Pipelines that cross complex geological terrains are inevitably threatened by natural hazards, among which landslides attract extensive attention when pipelines cross mountainous areas. Landslides are typically associated with ground movements that induce additional stress on the pipeline, and such a stress state under landslide interference seriously damages the structural integrity of the pipeline. To date, limited research has been done on the combined analysis of landslide hazards and the pipeline stress state. In this paper, a multi-parameter integrated monitoring system was developed for pipeline stress-strain state and landslide deformation monitoring. In addition, data-driven models for pipeline additional stress prediction were established. The developed predictive models include individual and ensemble-based machine learning approaches. The implementation procedure of the predictive models integrates the field data measured by the monitoring system, with k-fold cross validation used for the generalization performance evaluation. The obtained results indicate that the XGBoost model has the highest performance in the prediction of the additional stress. Besides, the significance of the input variables is determined through sensitivity analyses by using feature importance criteria. Thus, the integrated monitoring system together with the XGBoost prediction method is beneficial to modeling the additional stress in oil and gas pipelines, which will further contribute to pipeline geohazard monitoring and management.
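
The modeling loop described here, gradient-boosted trees evaluated with k-fold cross-validation and inspected via feature importances, looks roughly like the following; feature names and data are invented stand-ins for the monitored quantities.

```python
# XGBoost-with-k-fold sketch for additional-stress regression (synthetic data).
import numpy as np
import xgboost as xgb
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(0)
features = ["displacement_mm", "soil_moisture", "rainfall_mm", "slope_deg", "depth_m"]
X = rng.normal(size=(800, len(features)))
stress = 3.0 * X[:, 0] + 1.2 * X[:, 2] + rng.normal(scale=0.5, size=800)

model = xgb.XGBRegressor(n_estimators=300, max_depth=4, learning_rate=0.05)
cv = KFold(n_splits=5, shuffle=True, random_state=0)
print("5-fold R^2:", cross_val_score(model, X, stress, cv=cv, scoring="r2").round(3))

model.fit(X, stress)                              # rank inputs by learned importance
for name, imp in sorted(zip(features, model.feature_importances_), key=lambda p: -p[1]):
    print(f"{name}: {imp:.3f}")
```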

Data Descriptor
A Worldwide Bibliometric Analysis of Publications on Artificial Intelligence and Ethics in the Past Seven Decades
Sustainability 2022, 14(18), 11125; https://doi.org/10.3390/su141811125 - 06 Sep 2022
Abstract
Issues related to artificial intelligence (AI) and ethics have gained much traction worldwide. The impact of AI on society has been extensively discussed. This study presents a bibliometric analysis of research results, citation relationships among researchers, and highly referenced journals on AI and ethics on a global scale. Papers published on AI and ethics were recovered from the Microsoft Academic Graph Collection data set, and the subject terms included “artificial intelligence” and “ethics.” With researchers from 66 nations contributing to AI and ethics research, 1585 papers on AI and ethics were recovered, up to 5 July 2021. North America, Western Europe, and East Asia were the regions with the highest productivity. The top ten nations produced about 94.37% of the papers. The United States accounted for 47.59% (286 articles) of all papers. Switzerland had the highest research production per million people (1.39) when adjusted for population size, followed by the Netherlands (1.26) and the United Kingdom (1.19). The most productive authors were found to be Khatib, O. (n = 10), Verner, I. (n = 9), Bekey, G. A. (n = 7), Gennert, M. A. (n = 7), and Chatila, R. (n = 7). Current research shows that research on artificial intelligence and ethics has evolved dramatically over the past 70 years. Moreover, the United States is more involved with AI and ethics research than developing or emerging countries.

Article
Hierarchical Co-Attention Selection Network for Interpretable Fake News Detection
Big Data Cogn. Comput. 2022, 6(3), 93; https://doi.org/10.3390/bdcc6030093 - 05 Sep 2022
Abstract
Social media fake news has become a pervasive and problematic issue today with the development of the internet. Recent studies have utilized different artificial intelligence technologies to verify the truth of the news and provide explanations for the results, which have shown remarkable success in interpretable fake news detection. However, individuals’ judgments of news are usually hierarchical, prioritizing valuable words above essential sentences, which is neglected by existing fake news detection models. In this paper, we propose a novel interpretable neural-network-based model, the hierarchical co-attention selection network (HCSN), to predict whether the source post is fake, as well as an explanation that emphasizes important comments and particular words. The key insight of the HCSN model is to incorporate the Gumbel–Max trick in the hierarchical co-attention selection mechanism that captures sentence-level and word-level information from the source post and comments following the sequence of words–sentences–words–event. In addition, HCSN enjoys the additional benefit of interpretability—it provides a conscious explanation of how it reaches certain results by selecting comments and highlighting words. According to the experiments conducted on real-world datasets, our model outperformed state-of-the-art methods and generated reasonable explanations.
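
The Gumbel–Max trick the model borrows is worth seeing in isolation: adding Gumbel noise to log-probabilities and taking the argmax draws an exact sample from the categorical distribution, which is what makes hard selection amenable to stochastic training.

```python
# The Gumbel-Max trick: argmax over noisy log-probabilities samples the categorical.
import numpy as np

rng = np.random.default_rng(0)
logits = np.log(np.array([0.1, 0.6, 0.3]))        # selection probabilities (toy)

def gumbel_max_sample(logits):
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    return int(np.argmax(logits + gumbel))

draws = [gumbel_max_sample(logits) for _ in range(10_000)]
print(np.bincount(draws) / len(draws))            # approx [0.1, 0.6, 0.3]
```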

Article
Topical and Non-Topical Approaches to Measure Similarity between Arabic Questions
Big Data Cogn. Comput. 2022, 6(3), 87; https://doi.org/10.3390/bdcc6030087 - 22 Aug 2022
Abstract
Questions are crucial expressions in any language. Many Natural Language Processing (NLP) or Natural Language Understanding (NLU) applications, such as question-answering computer systems, automatic chatting apps (chatbots), digital virtual assistants, and opinion mining, can benefit from accurately identifying similar questions in an effective manner. We detail methods for identifying similarities between Arabic questions that have been posted online by Internet users and organizations. Our novel approach uses a non-topical rule-based methodology and topical information (textual similarity, lexical similarity, and semantic similarity) to determine if a pair of Arabic questions are similarly paraphrased. Our method computes the lexical and linguistic distances between the questions. Additionally, it identifies questions in accordance with their format and scope using expert hypotheses (rules) that have been experimentally shown to be useful and practical. Even if there is a high degree of lexical similarity between a When question (Timex Factoid—inquiring about time) and a Who inquiry (Enamex Factoid—asking about a named entity), they will not be similar. In an experiment using 2200 question pairs, our method attained an accuracy of 0.85, which is remarkable given the simplicity of the solution and the fact that we did not employ any language models or word embedding. In order to cover common Arabic queries presented by Arabic Internet users, we gathered the questions from various online forums and resources. In this study, we describe a unique method for detecting question similarity that does not require intensive processing, a sizable linguistic corpus, or a costly semantic repository. Because there are not many rich Arabic textual resources, this is especially important for informal Arabic text processing on the Internet.
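
The rule-plus-lexical flavor of the approach can be miniaturized as below: an interrogative-type rule vetoes pairs whose question types differ, and a Jaccard score handles the rest. Keywords are shown in transliteration and the thresholds are invented; the paper's rules and distance measures are far richer.

```python
# Rule-plus-lexical sketch: type mismatch overrides lexical similarity.
def question_type(q, type_words={"mata": "TIME", "man": "PERSON", "ayna": "PLACE"}):
    first = q.lower().split()[0]                  # interrogative word decides the type
    return type_words.get(first, "OTHER")

def jaccard(a, b):
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def similar(q1, q2, threshold=0.5):
    if question_type(q1) != question_type(q2):    # expert rule vetoes the pair
        return False
    return jaccard(q1, q2) >= threshold

print(similar("mata bada alharb", "man bada alharb"))     # False despite shared words
print(similar("mata bada alharb", "mata bada alharb alalamiya"))
```
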
Article
Machine-Learning-Based Gender Distribution Prediction from Anonymous News Comments: The Case of Korean News Portal
Sustainability 2022, 14(16), 9939; https://doi.org/10.3390/su14169939 - 11 Aug 2022
Abstract
Anonymous news comment data from a news portal in South Korea, naver.com, can help conduct gender research and resolve related issues for sustainable societies. Nevertheless, only a small portion of gender information (i.e., gender distribution) is open to the public, and therefore, it has rarely been considered for gender research. Hence, this paper aims to resolve the matter of incomplete gender information and make the anonymous news comment data usable for gender research as new social media big data. This paper proposes a machine-learning-based approach for predicting the gender distribution (i.e., male and female rates) of anonymous news commenters for a news article. Initially, the big data of news articles and their anonymous news comments were collected and divided into labeled and unlabeled datasets (i.e., with and without gender information). The word2vec approach was employed to represent a news article by the characteristics of the news comments. Then, using the labeled dataset, various prediction techniques were evaluated for predicting the gender distribution of anonymous news commenters for a labeled news article. As a result, the neural network was selected as the best prediction technique, and it could accurately predict the gender distribution of anonymous news commenters of the labeled news article. Thus, this study showed that a machine-learning-based approach can overcome the incomplete gender information problem of anonymous social media users. Moreover, when the gender distributions of the unlabeled news articles were predicted using the best neural network model, trained with the labeled dataset, their distribution turned out to be different from that of the labeled news articles. The result indicates that using only the labeled dataset for gender research can result in misleading findings and distorted conclusions. The predicted gender distributions for the unlabeled news articles can help to better understand anonymous news commenters as humans for sustainable societies. Eventually, this study provides a new way for data-driven computational social science with incomplete and anonymous social media big data.
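
A compact version of the representation step, averaging word2vec vectors of an article's comments and regressing the known gender distribution on them, might look as follows (gensim and scikit-learn assumed; the two-article corpus is a toy).

```python
# Comment-to-vector sketch: average word2vec embeddings, regress the female rate.
import numpy as np
from gensim.models import Word2Vec
from sklearn.neural_network import MLPRegressor

articles = [["great", "game", "last", "night"], ["new", "fashion", "trends", "beauty"]]
female_rate = np.array([0.21, 0.67])              # known only for labeled articles

w2v = Word2Vec(sentences=articles, vector_size=32, min_count=1, seed=0)

def article_vector(tokens):
    return np.mean([w2v.wv[t] for t in tokens if t in w2v.wv], axis=0)

X = np.stack([article_vector(a) for a in articles])
reg = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0).fit(X, female_rate)
print(reg.predict(X).round(2))                    # in practice: predict unlabeled articles
```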

Article
Using Explainable Artificial Intelligence to Identify Key Characteristics of Deep Poverty for Each Household
Sustainability 2022, 14(16), 9872; https://doi.org/10.3390/su14169872 - 10 Aug 2022
Abstract
The first task for eradicating poverty is accurate poverty identification. Deep poverty identification is conducive to investing resources to help deeply poor populations achieve prosperity, one of the most challenging tasks in poverty eradication. This study constructs a deep poverty identification model utilizing explainable artificial intelligence (XAI) to identify deeply poor households based on the data of 23,307 poor households in rural areas in China. For comparison, a logistic regression-based model and an income-based model are developed as well. We found that our XAI-based model achieves a higher identification performance in terms of the area under the ROC curve than both the logistic regression-based model and the income-based model. For each rural household, the odds of being identified as deeply poor are obtained. Additionally, multidimensional household characteristics associated with deep poverty are specified and ranked for each poor household, while ordinary feature ranking methods can only provide ranking results for poor households as a whole. Taking all poor households into consideration, we found that common important characteristics that can be used to identify deeply poor households include household income, disability, village attributes, lack of funds, labor force, disease, and number of household members, which are validated by mutual information analysis. In conclusion, our XAI-based model can be used to identify deep poverty and specify key household characteristics associated with deep poverty for individual households, facilitating the development of new targeted poverty reduction strategies.

Article
Efficient Supervised Image Clustering Based on Density Division and Graph Neural Networks
Remote Sens. 2022, 14(15), 3768; https://doi.org/10.3390/rs14153768 - 05 Aug 2022
Viewed by 1105
Abstract
In recent research, supervised image clustering based on Graph Neural Network (GNN) connectivity prediction has demonstrated considerable improvements over traditional clustering algorithms. However, existing supervised image clustering algorithms are usually time-consuming, which limits their applications. To infer the connectivity between image instances, they usually create a subgraph for each image instance, and because a large number of subgraphs must be created and processed as GNN inputs, the computational overhead is enormous. To address this high computational overhead in GNN connectivity prediction, we present a time-efficient and effective GNN-based supervised clustering framework based on density division, named DDC-GNN. DDC-GNN divides all image instances into high-density and low-density parts, and only performs GNN subgraph connectivity prediction on the low-density parts, resulting in a significant reduction in redundant calculations. We test two typical models in the GNN connectivity prediction module of the DDC-GNN framework: a graph convolutional network (GCN)-based model and a graph auto-encoder (GAE)-based model. Meanwhile, adaptive subgraphs are generated instead of fixed-size subgraphs to ensure sufficient contextual information extraction for the low-density parts. In experiments on different datasets, DDC-GNN achieves higher accuracy and is almost five times faster than equivalent models without the density-division strategy. Full article
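A rough sketch of the density-division step, under stated assumptions (inverse mean k-NN distance as the density measure and a median split as the threshold, both illustrative): only the low-density half would be routed to the expensive GNN connectivity predictor.

```python
# Hypothetical sketch of the density-division idea: instances with dense k-NN
# neighborhoods are linked cheaply, and only sparse (low-density) instances
# would be passed to the expensive GNN connectivity predictor.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
feats = rng.normal(size=(1000, 128))
feats /= np.linalg.norm(feats, axis=1, keepdims=True)  # embeddings on unit sphere

k = 10
nn = NearestNeighbors(n_neighbors=k + 1).fit(feats)
dist, _ = nn.kneighbors(feats)
density = 1.0 / (dist[:, 1:].mean(axis=1) + 1e-8)  # inverse mean k-NN distance

threshold = np.quantile(density, 0.5)
high = np.where(density >= threshold)[0]  # cluster cheaply (e.g., link to NN)
low = np.where(density < threshold)[0]    # only these need GNN subgraph prediction
print(f"GNN runs on {len(low)} of {len(feats)} instances")
```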

Article
A Study on the Optimal Flexible Job-Shop Scheduling with Sequence-Dependent Setup Time Based on a Hybrid Algorithm of Improved Quantum Cat Swarm Optimization
Sustainability 2022, 14(15), 9547; https://doi.org/10.3390/su14159547 - 03 Aug 2022
Cited by 2 | Viewed by 1108
Abstract
Multi-item, small-lot production modes lead to frequent setup, which involves significant setup times and has a substantial impact on productivity. In this study, we investigated the flexible job-shop scheduling problem with sequence-dependent setup times. We built a mathematical model with the objective of minimizing the maximum completion time (makespan). Considering that the process sequence is influenced by setup time, processing time, and machine load limitations, processing machines are first chosen based on machine load and processing time, and processing tasks are then scheduled based on setup time and processing time. An improved quantum cat swarm optimization (QCSO) algorithm is proposed to solve the problem: a quantum coding method is introduced, the quantum bit (Q-bit) representation is combined with cat swarm optimization (CSO), the cats are iteratively updated via the quantum rotation angle, and a dynamic mixture ratio (MR) value is then selected according to the number of algorithm iterations. This approach broadens the explored search space and increases operational efficiency and speed. Finally, the improved QCSO algorithm and a parallel genetic algorithm (PGA) are compared through simulation experiments. The results show that the improved QCSO algorithm obtains better results and improved robustness. Full article
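A toy illustration of the quantum encoding step: each binary scheduling decision is held as a Q-bit whose rotation angle is nudged toward the global-best cat's bit before being measured. The rotation step size, loop length, and stand-in "best" solution below are illustrative, not the paper's tuned values.

```python
# Hypothetical sketch of the quantum encoding: each decision is a Q-bit
# (alpha, beta); a rotation by dtheta nudges it toward the best cat's
# solution, and measurement collapses it to a binary choice.
import numpy as np

rng = np.random.default_rng(0)
n_bits = 16
theta = np.full(n_bits, np.pi / 4)           # start with |alpha|^2 = |beta|^2 = 0.5

def measure(theta):
    # P(bit = 1) = |beta|^2 = sin(theta)^2
    return (rng.random(n_bits) < np.sin(theta) ** 2).astype(int)

best = measure(theta)                         # stand-in for the global-best cat
for _ in range(50):
    current = measure(theta)
    dtheta = 0.05 * np.pi * np.where(best == 1, 1.0, -1.0)  # rotate toward best
    theta = np.clip(theta + dtheta, 0.01, np.pi / 2 - 0.01)
print(measure(theta))  # bits now collapse close to the best solution
```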

Article
LGB-PHY: An Evaporation Duct Height Prediction Model Based on Physically Constrained LightGBM Algorithm
Remote Sens. 2022, 14(14), 3448; https://doi.org/10.3390/rs14143448 - 18 Jul 2022
Cited by 4 | Viewed by 1169
Abstract
The evaporation duct is a special atmospheric stratification that significantly influences the propagation path of electromagnetic waves at sea and is therefore crucial for the stability of radio communication systems. Because they depend on physical parameters that are not universal, traditional theoretical evaporation duct models often have limited accuracy and poor generalization ability; the remote sensing method, for example, is limited by its inversion algorithm. The accuracy, generalization ability, and scientific interpretability of existing purely data-driven evaporation duct height prediction models also need improvement. To address these issues, we use voyage observation data and propose a physically constrained LightGBM evaporation duct height prediction model (LGB-PHY), which integrates the Babin–Young–Carton (BYC) physical model into a custom loss function. Compared with an eXtreme Gradient Boosting (XGB) model, the LGB-PHY model trained on a 5-day voyage dataset from the South China Sea provides a significant improvement: the RMSE index is reduced by 68%, while the SCC index is improved by 6.5%. We further carried out a cross-comparison experiment on regional generalization, showing that in high-latitude sea areas where the BYC model adapts well, the LGB-PHY model has stronger regional generalization performance than the XGB model. Full article
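The physics-constrained loss can be sketched through LightGBM's support for user-defined objectives (a callable returning per-sample gradients and Hessians): squared error plus a penalty pulling predictions toward a physical prior. The data, penalty weight, and the simple stand-in for the BYC estimate below are all hypothetical.

```python
# Hypothetical sketch of a physically constrained loss: squared error plus a
# penalty toward a physical prior (standing in for the BYC duct-height
# estimate). Gradients and Hessians are handed to LightGBM.
import numpy as np
from lightgbm import LGBMRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))                        # air-sea observations (toy)
edh_physical = 10 + X[:, 0]                          # stand-in for the BYC estimate
y = edh_physical + rng.normal(scale=0.5, size=500)   # "observed" duct height

lam = 0.3  # weight of the physics penalty

def physics_constrained_mse(y_true, y_pred):
    # d/dp [ (p - y)^2 + lam * (p - p_phys)^2 ]
    grad = 2 * (y_pred - y_true) + 2 * lam * (y_pred - edh_physical)
    hess = np.full_like(y_pred, 2 + 2 * lam)
    return grad, hess

model = LGBMRegressor(objective=physics_constrained_mse, n_estimators=200)
model.fit(X, y)
print(model.predict(X[:3]))
```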

Article
Object Localization in Weakly Labeled Remote Sensing Images Based on Deep Convolutional Features
Remote Sens. 2022, 14(13), 3230; https://doi.org/10.3390/rs14133230 - 05 Jul 2022
Cited by 1 | Viewed by 1400
Abstract
Object recognition, one of the most fundamental and challenging problems in high-resolution remote sensing image interpretation, has received increasing attention in recent years. However, most conventional object recognition pipelines recognize instances with bounding boxes in a supervised learning strategy, which requires intensive manual labor to create instance annotations. In this paper, we propose a weakly supervised learning method to alleviate this problem. The core idea of our method is to recognize multiple objects in an image using only image-level semantic labels and to indicate the recognized objects with location points instead of box extents. Specifically, a deep convolutional neural network is first trained to perform semantic scene classification, whose result is employed to determine the categories of objects in an image. Then, by back-propagating the categorical feature from the fully connected layer to the deep convolutional layer, the categorical and spatial information of an image are combined to obtain an object discriminative localization map, which effectively indicates the salient regions of objects. Next, a dynamic updating method for local response extrema is proposed to further determine the locations of objects in an image. Finally, extensive experiments are conducted to localize aircraft and oil tanks in remote sensing images using different convolutional neural networks. Experimental results show that the proposed method outperforms state-of-the-art methods, achieving precision, recall, and F1-scores of 94.50%, 88.79%, and 91.56% for aircraft localization and 89.12%, 83.04%, and 85.97% for oil tank localization, respectively. We hope that our work can serve as a basic reference for remote sensing object localization via a weakly supervised strategy and provide new opportunities for further research. Full article
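The discriminative localization map described above is essentially class activation mapping; a minimal numpy sketch follows, with random stand-ins for the trained network's conv features and classifier weights, and a crude local-maximum pick standing in for the paper's dynamic extremum update.

```python
# Hypothetical sketch of the localization map: project the fully connected
# classifier weights back onto the last conv feature map (class activation
# mapping), then take local maxima as object location points.
import numpy as np

rng = np.random.default_rng(0)
C, H, W = 512, 14, 14
conv_features = rng.random((C, H, W))     # last conv layer output for one image
fc_weights = rng.random((10, C))          # scene classifier weights, 10 classes

cls = 3                                   # class predicted for the scene
cam = np.tensordot(fc_weights[cls], conv_features, axes=1)   # (H, W) map
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)

# Crude local-extremum pick: any cell above 0.9 that dominates its 3x3 patch.
points = [(i, j) for i in range(1, H - 1) for j in range(1, W - 1)
          if cam[i, j] > 0.9 and cam[i, j] == cam[i-1:i+2, j-1:j+2].max()]
print(points)  # object location points in feature-map coordinates
```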

Article
Fog Computing Capabilities for Big Data Provisioning: Visualization Scenario
Sustainability 2022, 14(13), 8070; https://doi.org/10.3390/su14138070 - 01 Jul 2022
Viewed by 1231
Abstract
With the development of Internet technologies, huge amounts of data are collected from various sources and used 'anytime, anywhere' to enrich and change the life of society as a whole, enable new ways of doing business, and better understand people's lives. These datasets, called 'big data', need to be processed, stored, and retrieved, and special tools have been developed to analyze them. At the same time, the ever-increasing development of the Internet of Things (IoT) requires IoT devices to be mobile, with adequate data processing performance. The new fog computing paradigm makes computing resources more accessible and provides a flexible environment that will be widely used in next-generation networks, vehicles, and other settings, demonstrating enhanced capabilities and optimizing resources. This paper is devoted to analyzing fog computing capabilities for big data provisioning, considering the technology's different architectural and functional aspects. The analysis includes exploring the protocols suitable for fog computing by implementing an experimental fog computing network and assessing its capability to provision big data originating from both real-time streams and batch data, with appropriate visualization of the big data processing. Full article

Article
GenericConv: A Generic Model for Image Scene Classification Using Few-Shot Learning
Information 2022, 13(7), 315; https://doi.org/10.3390/info13070315 - 28 Jun 2022
Viewed by 1425
Abstract
Scene classification is one of the most complex tasks in computer vision. Its accuracy depends on other subtasks, such as object detection and object classification: accurate results may be accomplished by employing object detection in scene classification, since prior information about the objects in an image leads to an easier interpretation of the image content. Machine learning and transfer learning are widely employed in scene classification and achieve strong performance, yet major issues remain. First, the training phase requires a large amount of data, which is difficult and time-consuming to obtain. Furthermore, most models rely on data previously seen in the training set, resulting in models that can only identify samples similar to that set. As a result, few-shot learning has been introduced. Although a few attempts to apply few-shot learning to scene classification have been reported, their accuracy remains far from perfect. Motivated by these findings, in this paper we implement a novel few-shot learning model for scene classification, GenericConv, which we evaluate on benchmark datasets: MiniSun, MiniPlaces, and MIT-Indoor 67. The experimental results show that the proposed GenericConv model outperforms the other benchmark models on the three datasets, achieving accuracies of 52.16 ± 0.015, 35.86 ± 0.014, and 37.26 ± 0.014 for five-shot classification on MiniSun, MiniPlaces, and MIT-Indoor 67, respectively. Full article

Article
An Effective Ensemble Automatic Feature Selection Method for Network Intrusion Detection
Information 2022, 13(7), 314; https://doi.org/10.3390/info13070314 - 27 Jun 2022
Cited by 2 | Viewed by 1588
Abstract
The mass of redundant and irrelevant data in network traffic poses serious challenges for intrusion detection, and feature selection can effectively remove meaningless information from the data. Most current filter and embedded feature selection methods use a fixed threshold or ratio to determine the number of features in a subset, which requires a priori knowledge. In contrast, wrapper feature selection methods are computationally complex and time-consuming, and individual feature selection methods are biased in how they evaluate features. This work designs an ensemble-based automatic feature selection method called EAFS. First, we calculate feature importances or ranks using the individual methods, then add features to subsets sequentially by importance and evaluate subset performance comprehensively through a designed NSOM metric, retaining the subset with the largest NSOM value. While searching for a subset, computational complexity is lowered by calculating the accuracy obtained with the full feature set and retaining only subsets with higher accuracy. Finally, the obtained subsets are ensembled. Experimental results on three large-scale public datasets show that the method aids classification and outperforms other recent methods in terms of performance. Full article
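The sequential subset search can be sketched as follows, with cross-validated accuracy standing in for the paper's NSOM score (an assumption, since NSOM's definition is not given in the abstract) and a single random forest standing in for the ensemble of individual rankers.

```python
# Hypothetical sketch: rank features by importance, add them one at a time,
# and keep the subset with the best score (a stand-in for the paper's NSOM).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=600, n_features=30, n_informative=8,
                           random_state=0)
rank = np.argsort(-RandomForestClassifier(random_state=0).fit(X, y)
                  .feature_importances_)

best_score, best_subset = -np.inf, None
for k in range(1, len(rank) + 1):
    subset = rank[:k]                      # top-k features by importance
    score = cross_val_score(RandomForestClassifier(random_state=0),
                            X[:, subset], y, cv=3).mean()
    if score > best_score:
        best_score, best_subset = score, subset
print(len(best_subset), round(best_score, 3))
```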

Article
VisualRPI: Visualizing Research Productivity and Impact
Sustainability 2022, 14(13), 7679; https://doi.org/10.3390/su14137679 - 23 Jun 2022
Viewed by 880
Abstract
Research productivity and impact (RPI) is commonly measured through citation analysis, such as the h-index. Despite the popularity and objectivity of this type of method, it is still difficult to effectively compare a number of related researchers in terms of various citation-related statistics at the same time, such as average cites per year/paper, the number of papers/citations, h-index, etc. In this work, we develop a method that employs information visualization technology, and examine its applicability for the assessment of researchers’ RPI. Specifically, our prototype, a visualizing research productivity and impact (VisualRPI) system, is introduced, which is composed of clustering and visualization components. The clustering component hierarchically clusters similar research statistics into the same groups, and the visualization component is used to display the RPI in a clear manner. A case example using information for 85 information systems researchers is used to demonstrate the usefulness of VisualRPI. The results show that this method easily measures the RPI for various performance indicators, such as cites/paper and h-index. Full article

Article
A Mask-Guided Transformer Network with Topic Token for Remote Sensing Image Captioning
Remote Sens. 2022, 14(12), 2939; https://doi.org/10.3390/rs14122939 - 20 Jun 2022
Cited by 2 | Viewed by 3416
Abstract
Remote sensing image captioning aims to describe the content of images in natural language. In contrast with natural images, the scale, distribution, and number of objects generally vary in remote sensing images, making it hard to capture global semantic information and the relationships between objects at different scales. In this paper, in order to improve the accuracy and diversity of captioning, a mask-guided Transformer network with a topic token is proposed. Multi-head attention is introduced to extract features and capture the relationships between objects. On this basis, a topic token is added to the encoder; it represents the scene topic and serves as a prior in the decoder to help the model focus better on global semantic information. Moreover, a new Mask-Cross-Entropy strategy is designed to improve the diversity of the generated captions: it randomly replaces some input words with a special word (named [Mask]) in the training stage, with the aim of enhancing the model's learning ability and forcing exploration of uncommon word relations. Experiments on three datasets show that the proposed method generates captions with high accuracy and diversity and outperforms state-of-the-art models. Notably, the CIDEr score on the RSICD dataset increased from 275.49 to 298.39. Full article
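The input-corruption half of the Mask-Cross-Entropy strategy reduces to a few lines; the token ids, pad handling, and mask probability below are illustrative assumptions.

```python
# Hypothetical sketch of the Mask-Cross-Entropy idea: during training, each
# input caption token is replaced by a special [Mask] id with probability p,
# forcing the decoder to rely on context rather than memorized word pairs.
import torch

MASK_ID, PAD_ID, p = 1, 0, 0.15
tokens = torch.tensor([[5, 23, 7, 42, 9, 0, 0]])   # one padded caption

mask = (torch.rand_like(tokens, dtype=torch.float) < p) & (tokens != PAD_ID)
masked_tokens = torch.where(mask, torch.full_like(tokens, MASK_ID), tokens)
print(masked_tokens)  # decoder input; cross-entropy is still computed on `tokens`
```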

Article
Application of Combined Models Based on Empirical Mode Decomposition, Deep Learning, and Autoregressive Integrated Moving Average Model for Short-Term Heating Load Predictions
Sustainability 2022, 14(12), 7349; https://doi.org/10.3390/su14127349 - 15 Jun 2022
Cited by 7 | Viewed by 1425
Abstract
Short-term building energy consumption prediction is of great significance for the optimized operation of building energy management systems and for energy conservation. Due to the high-dimensional nonlinear characteristics of building heat loads, traditional single machine-learning models cannot extract the relevant features well. Therefore, in this paper, a combined model based on complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), four deep learning (DL) models, and the autoregressive integrated moving average (ARIMA) model is proposed. The DL models include a convolutional neural network, long short-term memory (LSTM), bi-directional LSTM (bi-LSTM), and the gated recurrent unit. CEEMDAN decomposes the heating load into different components to extract the different features, while the DL and ARIMA models are used to predict the heating load features of high and low complexity, respectively. Single-DL models and CEEMDAN-DL combinations were also implemented for comparison. The results show that the combined models achieved much higher accuracy than the single-DL models and the CEEMDAN-DL combinations. Compared with the single-DL models, the average coefficient of determination (R2), root mean square error (RMSE), and coefficient of variation of the RMSE (CV-RMSE) were improved by 2.91%, 47.93%, and 47.92%, respectively. Furthermore, CEEMDAN-bi-LSTM-ARIMA performed best of all the combined models, achieving R2 = 0.983, RMSE = 70.25 kWh, and CV-RMSE = 1.47%. This study provides a new guide for developing combined models for building energy consumption prediction. Full article
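A compact sketch of the decompose-and-route idea, assuming the PyEMD (EMD-signal) and statsmodels packages are available; a zero-crossing rate stands in for the paper's complexity criterion, and a naive persistence stub stands in for the DL models to keep the sketch short.

```python
# Hypothetical sketch of the hybrid: decompose the load series with CEEMDAN,
# send high-complexity components to a neural model and smooth ones to ARIMA,
# then sum the component forecasts.
import numpy as np
from PyEMD import CEEMDAN                      # pip install EMD-signal
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
t = np.arange(600)
load = 100 + 20 * np.sin(2 * np.pi * t / 24) + rng.normal(scale=3, size=600)

imfs = CEEMDAN()(load)                          # components, fast to slow
horizon, forecast = 24, np.zeros(24)
for imf in imfs:
    complexity = np.mean(np.abs(np.diff(np.sign(imf))))  # zero-crossing rate
    if complexity > 0.5:
        forecast += np.repeat(imf[-1], horizon)  # a DL model would go here
    else:
        forecast += ARIMA(imf, order=(2, 0, 1)).fit().forecast(horizon)
print(forecast[:5])
```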

Article
EBBA: An Enhanced Binary Bat Algorithm Integrated with Chaos Theory and Lévy Flight for Feature Selection
Future Internet 2022, 14(6), 178; https://doi.org/10.3390/fi14060178 - 09 Jun 2022
Cited by 2 | Viewed by 1500
Abstract
Feature selection can efficiently improve classification accuracy and reduce the dimensionality of datasets. However, feature selection is a challenging and complex task that requires a high-performance optimization algorithm. In this paper, we propose an enhanced binary bat algorithm (EBBA), which originates from the conventional binary bat algorithm (BBA), as the learning algorithm in a wrapper-based feature selection model. First, we model the feature selection problem and transform it into a fitness function. Then, we propose the EBBA for solving the feature selection problem. In the EBBA, we introduce a Lévy flight-based global search method, a population diversity boosting method, and a chaos-based loudness method to improve the BBA and make it more applicable to feature selection problems. Finally, simulations are conducted to evaluate the proposed EBBA, and the simulation results demonstrate that it outperforms the comparison benchmarks. Moreover, we also illustrate the effectiveness of each proposed improvement through additional tests. Full article
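The Lévy-flight perturbation (via Mantegna's algorithm) and the sigmoid transfer from a continuous bat position to a binary feature mask can be sketched as follows; the step scale and dimensions are illustrative assumptions.

```python
# Hypothetical sketch of the Lévy-flight global search used to perturb bat
# positions (Mantegna's algorithm), plus the sigmoid transfer that maps a
# continuous position to a binary feature mask.
import math
import numpy as np

rng = np.random.default_rng(0)

def levy_step(dim, beta=1.5):
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0, sigma, dim)
    v = rng.normal(0, 1, dim)
    return u / np.abs(v) ** (1 / beta)   # heavy-tailed step

position = rng.normal(size=20)           # one bat, 20 candidate features
position += 0.1 * levy_step(20)          # occasional long jumps escape local optima
feature_mask = rng.random(20) < 1 / (1 + np.exp(-position))  # sigmoid transfer
print(feature_mask.astype(int))          # 1 = feature selected
```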

Article
Semi-Supervised Cloud Detection in Satellite Images by Considering the Domain Shift Problem
Remote Sens. 2022, 14(11), 2641; https://doi.org/10.3390/rs14112641 - 31 May 2022
Cited by 5 | Viewed by 1590
Abstract
In semi-supervised cloud detection, efforts are being made to learn a promising cloud detection model from a limited number of pixel-wise labeled images and a large number of unlabeled ones. However, remote sensing images obtained from the same satellite sensor often show a data distribution drift problem due to the different cloud shapes and land-cover types on the Earth's surface, so there are domain distribution gaps between labeled and unlabeled satellite images. To solve this problem, we take the domain shift problem into account in the semi-supervised learning (SSL) network. Feature-level and output-level domain adaptations are applied to reduce the domain distribution gaps between labeled and unlabeled images, thus improving the prediction accuracy of the SSL network. Experimental results on Landsat-8 OLI and GF-1 WFV multispectral images demonstrate that the proposed semi-supervised cloud detection network (SSCDnet) achieves promising cloud detection performance using a limited number of labeled samples and outperforms several state-of-the-art SSL methods. Full article

Article
UAVSwarm Dataset: An Unmanned Aerial Vehicle Swarm Dataset for Multiple Object Tracking
Remote Sens. 2022, 14(11), 2601; https://doi.org/10.3390/rs14112601 - 28 May 2022
Cited by 4 | Viewed by 2594
Abstract
In recent years, with the rapid development of unmanned aerial vehicle (UAV) technology and swarm intelligence, hundreds of small, low-cost UAVs can form swarms that carry out complex combat tasks as ad hoc networks, which poses great threats and challenges to low-altitude airspace defense. To meet the security requirements of low-altitude airspace defense, using visual detection technology to detect and track incoming UAV swarms is the premise of any anti-UAV strategy. Therefore, this study first collected many UAV swarm videos and manually annotated a dataset, named the UAVSwarm dataset, for UAV swarm detection and tracking; it records thirteen different scenes and more than nineteen types of UAV, comprising 12,598 annotated images with 3 to 23 UAVs per sequence. Two advanced deep detection models, Faster R-CNN and YOLOX, are then used as strong benchmarks. Finally, two state-of-the-art multi-object tracking (MOT) models, GNMOT and ByteTrack, are used to conduct comprehensive tests and performance verification on the dataset and its evaluation metrics. The experimental results show that the dataset has good availability, consistency, and universality. The UAVSwarm dataset can be widely used in the training and testing of various UAV detection tasks and UAV swarm MOT tasks. Full article

Technical Note
Rescaling-Assisted Super-Resolution for Medium-Low Resolution Remote Sensing Ship Detection
Remote Sens. 2022, 14(11), 2566; https://doi.org/10.3390/rs14112566 - 27 May 2022
Cited by 1 | Viewed by 1094
Abstract
Medium-low resolution (M-LR) remote sensing ship detection is a challenging problem due to small target sizes and insufficient appearance information. Although image super-resolution (SR) has become a popular solution in recent years, its ability is limited because much information is lost in the input images. Inspired by the powerful information-embedding ability of the encoder in image rescaling, in this paper we introduce image rescaling to guide the training of image SR. Specifically, we add an adaptation module before the SR network and use a pre-trained rescaling network to guide the optimization of this module. In this way, more information is embedded in the adapted M-LR images, and the subsequent SR module can utilize that information to achieve better performance. Extensive experimental results demonstrate the effectiveness of our method for image SR. More importantly, our method can be used as a pre-processing step to improve detection performance. Full article

Article
Efficient Shallow Network for River Ice Segmentation
Remote Sens. 2022, 14(10), 2378; https://doi.org/10.3390/rs14102378 - 15 May 2022
Cited by 3 | Viewed by 1212
Abstract
River ice segmentation, used for surface ice concentration estimation, is important for validating river process and ice-formation models, predicting ice jam and flooding risks, and managing water supply and hydroelectric power generation. Furthermore, discriminating between anchor ice and frazil ice is an important factor in understanding sediment transport and release events. Modern deep learning techniques have proved to deliver promising results; however, they can show poor generalization ability and can be inefficient when hardware and computing power are limited. As river ice images are often collected in remote locations by unmanned aerial vehicles with limited computational power, we explore the performance-latency trade-offs for river ice segmentation. We propose a novel convolution block inspired by both depthwise separable convolutions and local binary convolutions, giving additional efficiency and parameter savings. Our novel convolution block is used in a shallow architecture which has 99.9% fewer trainable parameters, 99% fewer multiply-add operations, and 69.8% less memory usage than a UNet, while achieving virtually the same segmentation performance. We find that this network trains quickly and achieves high segmentation performance early in training due to its emphasis on both pixel intensity and texture. When compared with very efficient segmentation networks such as LR-ASPP with a MobileNetV3 backbone, we achieve good performance (mIoU of 64) 91% faster during training on a CPU, and an overall mIoU that is 7.7% higher. We also find that our network generalizes better to new domains such as snowy environments. Full article
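The parameter savings of the depthwise separable part of such a block are easy to verify in PyTorch; the sketch below contrasts a standard 3x3 convolution with a depthwise-plus-pointwise pair (the local-binary-convolution component of the paper's block is omitted for brevity).

```python
# Hypothetical sketch contrasting parameter counts: a standard 3x3 conv
# versus a depthwise separable block of the kind the paper's custom block
# builds on.
import torch.nn as nn

def n_params(m):
    return sum(p.numel() for p in m.parameters())

cin, cout = 64, 128
standard = nn.Conv2d(cin, cout, kernel_size=3, padding=1)
separable = nn.Sequential(
    nn.Conv2d(cin, cin, kernel_size=3, padding=1, groups=cin),  # depthwise
    nn.Conv2d(cin, cout, kernel_size=1),                        # pointwise
)
print(n_params(standard), n_params(separable))  # ~74k vs ~9k parameters
```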

Article
MIMO: A Unified Spatio-Temporal Model for Multi-Scale Sea Surface Temperature Prediction
Remote Sens. 2022, 14(10), 2371; https://doi.org/10.3390/rs14102371 - 14 May 2022
Cited by 3 | Viewed by 1600
Abstract
Sea surface temperature (SST) is a crucial factor that affects global climate and marine activities. Predicting SST at different temporal scales benefits various applications, from short-term SST prediction for weather forecasting to long-term SST prediction for analyzing the El Niño–Southern Oscillation (ENSO). However, existing approaches for SST prediction train separate models for different temporal scales, which is inefficient and cannot exploit the correlations among the temperatures of different scales to improve prediction performance. In this work, we propose a unified spatio-temporal model, termed the Multi-In and Multi-Out (MIMO) model, to predict SST at different scales. MIMO is an encoder–decoder model, where the encoder learns spatio-temporal features from the SST data of multiple scales and fuses the learned features with a Cross Scale Fusion (CSF) operation. The decoder utilizes the learned features from the encoder to adaptively predict the SST of different scales. To the best of our knowledge, this is the first work to predict SST at different temporal scales simultaneously with a single model. In the experimental evaluation on the Optimum Interpolation SST (OISST) dataset, MIMO achieves state-of-the-art prediction performance. Full article

Article
Deep Learning Models for COVID-19 Detection
Sustainability 2022, 14(10), 5820; https://doi.org/10.3390/su14105820 - 11 May 2022
Cited by 6 | Viewed by 1447
Abstract
Healthcare is one of the crucial application areas of the Internet of Things. Connected machine-learning-based systems provide faster healthcare services, and doctors and radiologists can also use these systems to collaborate and provide better help to patients. The recently emerged coronavirus (COVID-19) is known to be highly infectious. Reverse transcription-polymerase chain reaction (RT-PCR) is recognized as one of the primary diagnostic tools; however, RT-PCR tests might not be accurate. In contrast, doctors can employ artificial intelligence techniques to analyze X-ray and CT scans. Artificial intelligence methods need a large number of images, which might not be available during a pandemic. In this paper, a novel data-efficient deep network is proposed for the identification of COVID-19 in CT images. The method enlarges the small number of available CT scans by generating synthetic versions using a generative adversarial network (GAN), and the parameters of the convolutional and fully connected layers of the deep networks are then estimated using the synthetic and augmented data. The results show that the GAN-based deep learning model provides higher performance than classic deep learning models for COVID-19 detection. The performance evaluation is performed on the COVID19-CT and Mosmed datasets; the best performing models are ResNet-18 on COVID19-CT and MobileNetV2 on Mosmed, with area under the curve (AUC) values of 0.89 and 0.84, respectively. Full article

Review
A Survey on Memory Subsystems for Deep Neural Network Accelerators
Future Internet 2022, 14(5), 146; https://doi.org/10.3390/fi14050146 - 10 May 2022
Cited by 4 | Viewed by 2297
Abstract
From self-driving cars to detecting cancer, the applications of modern artificial intelligence (AI) rely primarily on deep neural networks (DNNs). Given raw sensory data, DNNs are able to extract high-level features after the network has been trained using statistical learning. However, due to the massive amount of parallel processing in these computations, the memory wall largely limits performance. A review of the different memory architectures applied in DNN accelerators is therefore beneficial. While existing surveys only address DNN accelerators in general, this paper investigates novel advancements in efficient memory organizations and design methodologies for DNN accelerators. First, an overview of the various memory architectures used in DNN accelerators is provided, followed by a discussion of memory organizations on non-ASIC DNN accelerators. Furthermore, flexible memory systems incorporating adaptable DNN computation are explored, and an analysis of emerging memory technologies is conducted. Through this article, the reader will: (1) gain the ability to analyze the various proposed memory architectures; (2) discern various DNN accelerators with different memory designs; (3) become familiar with the trade-offs associated with memory organizations; and (4) become familiar with the newly proposed memory systems for modern DNN accelerators that address the memory wall and the other issues mentioned. Full article

Article
Two New Datasets for Italian-Language Abstractive Text Summarization
Information 2022, 13(5), 228; https://doi.org/10.3390/info13050228 - 29 Apr 2022
Cited by 2 | Viewed by 2215
Abstract
Text summarization aims to produce a short summary containing the relevant parts of a given text. Due to the lack of data for abstractive summarization in low-resource languages such as Italian, we propose two new original datasets: one collected from two Italian news websites, with multi-sentence summaries and their corresponding articles, and one obtained by machine-translating a Spanish summarization dataset. These are currently the only two datasets available in Italian for this task. To evaluate their quality, we used them to train a T5-base model and an mBART model, obtaining good results with both. To better evaluate the outcomes, we also compared the same models trained on automatically translated datasets, comparing summaries generated in the training language against automatically translated summaries; this demonstrated the superiority of the models trained on the proposed datasets. Full article

Article
Accurate Air-Quality Prediction Using Genetic-Optimized Gated-Recurrent-Unit Architecture
Information 2022, 13(5), 223; https://doi.org/10.3390/info13050223 - 26 Apr 2022
Viewed by 1576
Abstract
Air pollution is becoming a serious concern with the development of society and urban expansion, and predicting air quality is a pressing problem for human beings. Recently, more and more machine-learning-based methods have been used to solve the air-quality-prediction problem, and gated recurrent units (GRUs) are a representative method because of their advantages in processing time-series data. However, for the same air-quality-prediction task, different researchers have designed different GRU structures based on their differing experience, so designing a GRU structure adaptively from the data has become a problem. In this paper, we propose an adaptive GRU to address this problem, in which the GRU structure is determined by the dataset. The approach comprises three main steps. First, an encoding method for the GRU structure is proposed, representing the network structure as a fixed-length binary string; second, we define the reciprocal of the sum of each individual's loss as the fitness function for the iterative computation; third, the genetic algorithm is used to compute the data-adaptive GRU network structure, which enhances the air-quality-prediction results. Experimental results on three real datasets from Xi'an show that the proposed method achieves better RMSE and SMAPE than existing LSTM-, SVM-, and RNN-based methods. Full article
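The encoding and fitness steps can be sketched as follows; the bit layout (2 bits for the number of layers, 6 for the units) and the toy loss function are assumptions for illustration, not the paper's actual scheme.

```python
# Hypothetical sketch of the encoding step: a fixed-length binary string maps
# to a GRU structure (number of layers and units per layer), and fitness is
# the reciprocal of the summed validation loss.
import numpy as np

rng = np.random.default_rng(0)

def decode(bits):
    # First 2 bits -> number of layers (1-4); next 6 bits -> units (8-512).
    n_layers = int("".join(map(str, bits[:2])), 2) + 1
    units = 8 * (int("".join(map(str, bits[2:8])), 2) + 1)
    return n_layers, units

def fitness(bits, val_loss_fn):
    return 1.0 / (val_loss_fn(*decode(bits)) + 1e-8)

population = [rng.integers(0, 2, size=8).tolist() for _ in range(20)]
toy_loss = lambda layers, units: abs(layers - 2) + abs(units - 128) / 128
scores = [fitness(ind, toy_loss) for ind in population]
print(decode(population[int(np.argmax(scores))]))  # best structure this generation
```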

Article
An Emergency Event Detection Ensemble Model Based on Big Data
Big Data Cogn. Comput. 2022, 6(2), 42; https://doi.org/10.3390/bdcc6020042 - 16 Apr 2022
Viewed by 2756
Abstract
Emergency events arise when a serious, unexpected, and often dangerous threat affects normal life. Hence, knowing what is occurring during and after emergency events is critical to mitigating the effect of the incident on human life, on the environment, and on our infrastructure, as well as the inherent financial consequences. Social networks can play an important role in emergency event detection models, as information is shared and users' statuses are updated once an emergency event occurs. Moreover, big data has proved its significance as a tool for assisting in and alleviating emergency events by processing an enormous amount of data over a short time interval. This paper shows that it is necessary to have an appropriate emergency event detection ensemble model (EEDEM) to respond quickly once such unfortunate events occur. Furthermore, it integrates Snapchat maps in a novel method to pinpoint the exact location of an emergency event. Merging social networks and big data can accelerate emergency event detection: social network data, such as those from Twitter and Snapchat, allow us to manage, monitor, analyze, and detect emergency events. The main objective of this paper is to propose a novel and efficient big-data-based EEDEM that pinpoints the exact location of emergency events using data collected from social networks such as Twitter and Snapchat, while integrating big data (BD) and machine learning (ML). Furthermore, this paper evaluates the performance of five ML base models and the proposed ensemble approach for detecting emergency events. The results show that the proposed ensemble approach achieved a very high accuracy of 99.87%, outperforming the base models, which themselves yield high accuracy (99.72% and 99.70% for LSTM and the decision tree, respectively) with acceptable training time. Full article
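The ensembling step can be illustrated with a soft-voting classifier over a few scikit-learn base models (stand-ins for the paper's LSTM and decision-tree bases, which are not reproduced here); the posts and labels are toy data.

```python
# Hypothetical sketch of the ensembling step: combine several base detectors
# into one soft-voting classifier for "emergency vs. normal" posts.
from sklearn.ensemble import VotingClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

posts = ["earthquake near the bridge", "lovely sunny day",
         "fire spreading downtown", "new coffee shop opened"]
labels = [1, 0, 1, 0]  # 1 = emergency-related

model = make_pipeline(
    TfidfVectorizer(),
    VotingClassifier(
        estimators=[("lr", LogisticRegression()),
                    ("rf", RandomForestClassifier(random_state=0)),
                    ("nb", MultinomialNB())],
        voting="soft",  # average the predicted probabilities
    ),
)
model.fit(posts, labels)
print(model.predict(["smoke and flames on main street"]))
```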

Article
A Structural Approach to Some Contradictions in Worldwide Swine Production and Health Research
Sustainability 2022, 14(8), 4748; https://doi.org/10.3390/su14084748 - 15 Apr 2022
Cited by 1 | Viewed by 1942
Abstract
Several biosafety gaps in agri-food sectors have become evident in recent years. Many of them are related to global livestock systems and the organizational models involved in their management. For example, producing pigs requires a global system of massive confinement and specific technological innovations in animal production and health, involving broad technical and scientific structures that must generate specific knowledge for successful management. This suggests an underlying, socially agglomerated technological ecosystem relevant to these issues. We therefore propose analyzing the specialized scientific social structure behind the knowledge and technologies required for pig production and health. The objective of this work is to characterize structural patterns in swine health research worldwide. We use a mixed methodological approach based on social network analysis, drawing on scientific information from 4868 specialized research works on health and pig production published between 2010 and 2018 across 47 countries. This made it possible to analyze swine research dynamics, such as convergence and influence, at country and regional levels, and to identify differentiated behaviors and high centralization in scientific communities that achieve worldwide impact but also produce significant omissions. Full article

Article
Landslide Displacement Prediction via Attentive Graph Neural Network
Remote Sens. 2022, 14(8), 1919; https://doi.org/10.3390/rs14081919 - 15 Apr 2022
Cited by 3 | Viewed by 1688
Abstract
Landslides are among the most common geological hazards and result in considerable human and economic losses globally. Researchers have put great effort into addressing the landslide prediction problem for decades. Previous methods either focus on analyzing landslide inventory maps obtained from aerial photography and satellite images, or propose machine learning models, trained on historical land deformation data, to predict future displacement and sedimentation. However, existing approaches generally fail to capture complex spatial deformations and their inter-dependencies across different areas. This work presents a novel landslide prediction model based on graph neural networks, which utilizes graph convolutions to aggregate spatial correlations among different monitored locations. In addition, we introduce a novel locally historical transformer network to capture dynamic spatio-temporal relations and predict the surface deformation. We conduct extensive experiments on real-world data and demonstrate that our model significantly outperforms state-of-the-art approaches in terms of prediction accuracy and model interpretability. Full article

Article
Local Transformer Network on 3D Point Cloud Semantic Segmentation
Information 2022, 13(4), 198; https://doi.org/10.3390/info13040198 - 14 Apr 2022
Viewed by 1997
Abstract
Semantic segmentation is an important component in understanding 3D point cloud scenes. Whether we can effectively obtain local and global contextual information from points is of great significance for improving the performance of 3D point cloud semantic segmentation. In this paper, we propose a self-attention feature extraction module: the local transformer structure. By stacking encoder layers composed of this structure, we can extract local features while preserving global connectivity. The structure automatically learns each point's features from its neighborhood and is invariant to different point orders. We designed two unique key matrices, each of which focuses on either the feature similarities or the geometric structure relationships between points, to generate the attention weight matrices. Additionally, cross-skip selection of neighbors is used to obtain larger receptive fields for each point without increasing the number of calculations required, which helps the network better handle junctions between multiple objects. When the new network was verified on S3DIS, the mean intersection over union was 69.1%, and the segmentation accuracies on the complex outdoor scene datasets Semantic3D and SemanticKITTI were 94.3% and 87.8%, respectively, demonstrating the effectiveness of the proposed methods. Full article

Systematic Review
Deep Learning for Vulnerability and Attack Detection on Web Applications: A Systematic Literature Review
Future Internet 2022, 14(4), 118; https://doi.org/10.3390/fi14040118 - 13 Apr 2022
Cited by 5 | Viewed by 3778
Abstract
Web applications are the leading Internet-based solution for providing online web services, but they also bring serious security challenges; thus, enhancing the security of web applications against hacking attempts is of paramount importance. Traditional Web Application Firewalls based on manual rules and traditional Machine Learning require substantial domain expertise and human intervention, and deliver limited detection results in the face of the increasing number of unknown web attacks. To this end, more research work has recently been devoted to employing Deep Learning (DL) approaches for web attack detection. We performed a Systematic Literature Review (SLR) and quality analysis of 63 Primary Studies (PS) on DL-based web application security published between 2010 and September 2021. We investigated the PS from different perspectives and synthesized the results of the analyses. To the best of our knowledge, this study is the first SLR of its kind in this field. The key findings of our study include the following. (i) It is fundamental to generate standard, real-world web attack datasets to encourage effective contributions in this field and to reduce the gap between research and industry. (ii) It is worth exploring advanced DL models, such as Generative Adversarial Networks and variants of Encoders–Decoders, in the context of web attack detection, as they have been successful in similar domains such as network intrusion detection. (iii) It is fundamental to bridge expertise in web application security and expertise in Machine Learning in order to build theoretical Machine Learning models tailored for web attack detection. (iv) It is important to create a corpus for web attack detection in order to take full advantage of text mining when constructing DL-based web attack detection models. (v) It is essential to define a common framework for developing and comparing DL-based web attack detection models. This SLR is intended to improve research work in the domain of DL-based web attack detection, as it covers a significant number of research papers and identifies the key points that need to be addressed in this research field. Such a contribution is helpful because it allows researchers to compare existing approaches and to exploit the proposed future work opportunities. Full article

Article
A Two-Stage Low-Altitude Remote Sensing Papaver Somniferum Image Detection System Based on YOLOv5s+DenseNet121
Remote Sens. 2022, 14(8), 1834; https://doi.org/10.3390/rs14081834 - 11 Apr 2022
Cited by 2 | Viewed by 1984
Abstract
Papaver somniferum (opium poppy) is not only a source of raw material for the production of medical narcotic analgesics but also the major raw material for certain psychotropic drugs; therefore, the law stipulates that its cultivation must be authorized by the government under stringent supervision. In certain areas, unauthorized and illicit Papaver somniferum cultivation on privately owned land occurs from time to time. These illegal cultivation sites are dispersed and highly concealed, making government supervision difficult. Low-altitude inspection of Papaver somniferum cultivation by unmanned aerial vehicles is efficient and time-saving, but the large amount of collected image data must be manually screened, which not only consumes considerable manpower and material resources but also easily causes omissions. In response to these problems, this paper proposes a two-stage (target detection and image classification) method for detecting Papaver somniferum cultivation sites. In the first stage, the YOLOv5s algorithm detects Papaver somniferum in images so as to identify all suspicious images in the original data. In the second stage, the DenseNet121 network classifies the detection results from the first stage, excluding targets other than Papaver somniferum and retaining only the images that contain it. In the first stage, YOLOv5s achieved the best overall performance among mainstream target detection models, with a precision of 97.7%, recall of 94.9%, and mAP of 97.4%. In the second stage, DenseNet121 with pre-training achieved the best overall performance, with a classification accuracy of 97.33% and a precision of 95.81%. An experimental comparison between the one-stage and two-stage methods shows that recall remained the same, but the two-stage method reduced the number of falsely detected images by 73.88%, greatly reducing the workload of subsequent manual screening of remote sensing Papaver somniferum images. This paper thus provides an effective technical means for supervising illicit Papaver somniferum cultivation. Full article

Article
Learning Spatio-Temporal Attention Based Siamese Network for Tracking UAVs in the Wild
Remote Sens. 2022, 14(8), 1797; https://doi.org/10.3390/rs14081797 - 08 Apr 2022
Cited by 2 | Viewed by 1594
Abstract
The popularity of unmanned aerial vehicles (UAVs) has made anti-UAV technology increasingly urgent. Object tracking, especially in thermal infrared videos, offers a promising solution for countering UAV intrusion. However, issues such as fast motion and tiny size make tracking infrared drone targets difficult and challenging. This work proposes a simple and effective spatio-temporal attention based Siamese method called SiamSTA, which alternates between reliable local searching and wide-range re-detection to robustly track drones in the wild. Concretely, SiamSTA builds a two-stage re-detection network to predict the target state using the template of the first frame and the prediction results of previous frames. To tackle the challenge of acquiring small-scale UAV targets at long range, SiamSTA imposes spatial and temporal constraints on generating candidate proposals within local neighborhoods to eliminate interference from background distractors. Complementarily, in case the target is lost from local regions due to fast movement, a third-stage re-detection module is introduced, which exploits valuable motion cues through a change-detection-based correlation filter to re-capture targets from a global view. Finally, a state-aware switching mechanism adaptively integrates local searching and global re-detection, drawing on their complementary strengths for robust tracking. Extensive experiments on three anti-UAV datasets demonstrate SiamSTA's advantage over other competitors. Notably, SiamSTA is the foundation of the first-place winning entry in the 2nd Anti-UAV Challenge. Full article
Article
Deep Learning with Word Embedding Improves Kazakh Named-Entity Recognition
Information 2022, 13(4), 180; https://doi.org/10.3390/info13040180 - 02 Apr 2022
Cited by 3 | Viewed by 1925
Abstract
Named-entity recognition (NER) is a preliminary step for several text extraction tasks. In this work, we recognize Kazakh named entities with a hybrid neural network model that leverages word semantics with multidimensional features and attention mechanisms. There are two major challenges: first, Kazakh is an agglutinative and morphologically rich language, so NER suffers from data sparsity; second, Kazakh named entities have unclear boundaries, polysemy, and nesting. A common strategy for handling data sparsity is subword segmentation, so we combined the semantics of words and stems, obtaining stems from a Kazakh morphological analysis system. Additionally, we constructed a graph structure of entities, with words, entities, and entity categories as nodes and inclusion relations as edges, and updated the nodes using a gated graph neural network (GGNN) with an attention mechanism. Finally, a conditional random field (CRF) extracts the final results. Experimental results show that our method consistently outperforms all previous methods, achieving an F1 score of 88.04%.
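A minimal sketch of the word-plus-stem embedding strategy follows. The paper's GGNN, attention, and CRF layers are reduced here to a BiLSTM with a linear tag scorer, and all vocabulary sizes and dimensions are illustrative assumptions.

```python
# Sketch: concatenate word and stem embeddings so sparse agglutinative word
# forms share parameters through their stems, then score per-token tags.
import torch
import torch.nn as nn

class WordStemTagger(nn.Module):
    def __init__(self, n_words, n_stems, n_tags, d_word=100, d_stem=50):
        super().__init__()
        self.word_emb = nn.Embedding(n_words, d_word)
        self.stem_emb = nn.Embedding(n_stems, d_stem)
        self.encoder = nn.LSTM(d_word + d_stem, 128, bidirectional=True,
                               batch_first=True)
        self.scorer = nn.Linear(256, n_tags)   # emission scores (pre-CRF)

    def forward(self, word_ids, stem_ids):
        x = torch.cat([self.word_emb(word_ids), self.stem_emb(stem_ids)], dim=-1)
        h, _ = self.encoder(x)
        return self.scorer(h)

# Toy usage: one 5-token sentence.
model = WordStemTagger(n_words=10000, n_stems=4000, n_tags=9)
scores = model(torch.randint(0, 10000, (1, 5)), torch.randint(0, 4000, (1, 5)))
print(scores.shape)  # torch.Size([1, 5, 9])
```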
Article
HealthFetch: An Influence-Based, Context-Aware Prefetch Scheme in Citizen-Centered Health Storage Clouds
Future Internet 2022, 14(4), 112; https://doi.org/10.3390/fi14040112 - 01 Apr 2022
Cited by 3 | Viewed by 2217
Abstract
Over the past few years, increasing attention has been given to the health sector and the integration of new technologies into it. Cloud computing and storage clouds have become state-of-the-art solutions in other major areas and are rapidly establishing themselves in the health sector as well. More and more companies are working toward a future in which healthcare professionals engage with such infrastructures, opening up a vast number of possibilities. While this is an important step, less attention has been given to citizens. For this reason, this paper proposes a citizen-centered storage cloud solution that allows citizens to hold their health data in their own hands while enabling the exchange of these data with healthcare professionals during emergencies. Moreover, to reduce health data transmission delay, a novel context-aware prefetch engine enriched with deep learning capabilities is proposed. The prefetch scheme, together with the proposed storage cloud, is evaluated in several deployment and usage scenarios, examining data transmission times and comparing outcomes with other state-of-the-art solutions. The results show that the proposed solution significantly improves download speed compared with the plain storage cloud, especially when large data are exchanged. In addition, the evaluation shows that the proposed scheme improves the overall predictions, with a coefficient of determination R2 > 0.94 and a root mean square error RMSE < 1, while reducing the training data by 12%.
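The prefetching idea can be sketched as a scored, cache-warming loop. The scoring function and fetcher below are hypothetical stand-ins for the paper's deep-learning prefetch engine; only the prefetch-into-LRU-cache mechanism is shown.

```python
# Sketch: rank candidate health records by predicted access probability for
# the current context and warm an LRU cache with the top-k before demand.
from collections import OrderedDict
from typing import Callable, Iterable

class PrefetchCache:
    def __init__(self, capacity: int, fetch: Callable[[str], bytes]):
        self.capacity, self.fetch = capacity, fetch
        self.cache: OrderedDict[str, bytes] = OrderedDict()

    def prefetch(self, candidates: Iterable[str],
                 score: Callable[[str], float], k: int = 3) -> None:
        for rec_id in sorted(candidates, key=score, reverse=True)[:k]:
            self.get(rec_id)                    # warm the cache ahead of demand

    def get(self, rec_id: str) -> bytes:
        if rec_id in self.cache:
            self.cache.move_to_end(rec_id)      # LRU refresh on hit
        else:
            self.cache[rec_id] = self.fetch(rec_id)
            if len(self.cache) > self.capacity:
                self.cache.popitem(last=False)  # evict least recently used
        return self.cache[rec_id]
```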
Article
Num-Symbolic Homophonic Social Net-Words
Information 2022, 13(4), 174; https://doi.org/10.3390/info13040174 - 29 Mar 2022
Cited by 1 | Viewed by 1632
Abstract
Many excellent studies of social networks and text analysis can be found in the literature, and automated text analysis technology has developed rapidly. However, Chinese lacks natural word separators, and numbers and symbols embedded in Chinese text carry their own literal meanings; user-generated content that blends Chinese characters with numbers and symbols is therefore a challenge for current analytic approaches and procedures. We propose a new hybrid method for detecting blended numeric and symbolic homophonic Chinese neologisms (BNShCNs), interpreting the words' actual semantics according to their independence and relative position in context. This study obtained a shortlist using a probability-based approach on user-generated content collected from the internet; we then evaluated the shortlist with contextualized word-embedding vectors for BNShCN detection. The experiments show that the proposed method efficiently extracts BNShCNs from user-generated content.
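A minimal sketch of the probability-based shortlist step follows, using pointwise mutual information (PMI) as a common proxy for word independence in unsegmented Chinese text; the paper's exact scoring and the embedding-based verification step are not reproduced here.

```python
# Sketch: shortlist candidate two-character strings whose parts co-occur far
# more often than chance (high PMI), a standard neologism-candidate signal.
import math
from collections import Counter

def pmi_shortlist(corpus: list[str], min_count: int = 5, top: int = 20):
    unigrams, bigrams = Counter(), Counter()
    for line in corpus:
        unigrams.update(line)
        bigrams.update(line[i:i + 2] for i in range(len(line) - 1))
    n_uni, n_bi = sum(unigrams.values()), sum(bigrams.values())
    scored = []
    for bg, c in bigrams.items():
        if c < min_count:
            continue
        p_bg = c / n_bi
        p_a, p_b = unigrams[bg[0]] / n_uni, unigrams[bg[1]] / n_uni
        scored.append((math.log(p_bg / (p_a * p_b)), bg))
    return [bg for _, bg in sorted(scored, reverse=True)[:top]]
```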
Article
Is Artificial Intelligence Better than Manpower? The Effects of Different Types of Online Customer Services on Customer Purchase Intentions
Sustainability 2022, 14(7), 3974; https://doi.org/10.3390/su14073974 - 28 Mar 2022
Cited by 4 | Viewed by 3346
Abstract
Artificial intelligence has been widely applied to e-commerce and online business services. However, few studies have examined how different types of customer service affect customer purchase intentions. Based on service encounter theory and superposition theory, we designed two shopping experiments to capture customers' thoughts and feelings, exploring how three types of online customer service (AI customer service, manual customer service, and human-machine collaboration customer service) differ in their effects on customer purchase intention, and analyzing the superposition effect of human-machine collaboration customer service. The results show that perceived service quality positively influences purchase intention and mediates the effect of the type of online customer service on purchase intention; product type moderates the relationship between online customer service and purchase intention; and human-machine collaboration customer service exhibits a superposition effect. This study deepens the understanding of AI developers and e-commerce platforms regarding the application of AI in online business services and offers reference suggestions for formulating better business service strategies.
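The reported mediation logic can be illustrated with the classic regression steps; the column names and the ordinary-least-squares approach below are hypothetical assumptions and may differ from the paper's actual analysis.

```python
# Sketch of a mediation check: service type -> perceived quality -> intention.
import pandas as pd
import statsmodels.formula.api as smf

def mediation_check(df: pd.DataFrame) -> dict:
    # Hypothetical columns: service_type (coded), quality (perceived service
    # quality), intention (purchase intention).
    total = smf.ols("intention ~ service_type", df).fit()        # path c
    a = smf.ols("quality ~ service_type", df).fit()              # path a
    b = smf.ols("intention ~ service_type + quality", df).fit()  # paths c', b
    return {
        "total_effect": total.params["service_type"],
        "a_path": a.params["service_type"],
        "b_path": b.params["quality"],
        # direct effect shrinking relative to the total effect suggests mediation
        "direct_effect": b.params["service_type"],
    }
```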
Article
A LiDAR–Camera Fusion 3D Object Detection Algorithm
Information 2022, 13(4), 169; https://doi.org/10.3390/info13040169 - 26 Mar 2022
Cited by 6 | Viewed by 2739
Abstract
3D object detection with LiDAR-camera fusion has always been a challenge for autonomous driving. This work proposes a deep neural network, FuDNN, for LiDAR-camera fusion 3D object detection. First, a 2D backbone is designed to extract features from camera images. Second, an attention-based fusion sub-network is designed to fuse the features extracted by the 2D backbone with the features extracted from 3D LiDAR point clouds by PointNet++. FuDNN further uses the region proposal network (RPN) and the refinement stage of PointRCNN to obtain 3D box predictions, and it was tested on the public KITTI dataset. Experiments on the KITTI validation set show that FuDNN achieves AP values of 92.48, 82.90, and 80.51 at the easy, moderate, and hard difficulty levels for car detection, improving the performance of LiDAR-camera fusion 3D object detection in the car category.
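A minimal sketch of attention-based feature fusion in this spirit follows, with a learned gate mixing projected image and point features. All dimensions and the gating form are illustrative assumptions; the 2D backbone and PointNet++ encoder are assumed upstream.

```python
# Sketch: per-point image features and LiDAR features are projected to a
# common width and mixed with a learned sigmoid gate (a simple attention).
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    def __init__(self, d_img=64, d_pts=128, d_out=128):
        super().__init__()
        self.img_proj = nn.Linear(d_img, d_out)
        self.pts_proj = nn.Linear(d_pts, d_out)
        self.gate = nn.Sequential(nn.Linear(2 * d_out, d_out), nn.Sigmoid())

    def forward(self, img_feats, pts_feats):
        # img_feats: (B, N, d_img) image features sampled at each point's
        # projection; pts_feats: (B, N, d_pts) PointNet++ features.
        i, p = self.img_proj(img_feats), self.pts_proj(pts_feats)
        g = self.gate(torch.cat([i, p], dim=-1))  # per-channel attention weight
        return g * i + (1 - g) * p                # gated mix of the two streams

fused = AttentionFusion()(torch.randn(2, 1024, 64), torch.randn(2, 1024, 128))
print(fused.shape)  # torch.Size([2, 1024, 128])
```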
Article
Time Series Surface Temperature Prediction Based on Cyclic Evolutionary Network Model for Complex Sea Area
Future Internet 2022, 14(3), 96; https://doi.org/10.3390/fi14030096 - 21 Mar 2022
Cited by 2 | Viewed by 1693
Abstract
The prediction of marine elements has become increasingly important in marine research. However, time series data in a complex environment vary significantly, because they are driven by dynamic changes with multiple mechanisms, causes, and laws; for example, sea surface temperature (SST) can be influenced by ocean currents. Conventional models often focus on capturing the influence of historical data but ignore the spatio-temporal relationships within sea areas, so they cannot predict such widely varying data effectively. In this work, we propose a cyclic evolutionary network model (CENS), an error-driven network group composed of multiple network node units. Different regions of data are automatically matched to a suitable network node unit for prediction, so the model clusters the data by their characteristics and is therefore more practical. Experiments were performed on the Bohai Sea and the South China Sea. First, an ablation experiment verified the effectiveness of the model's framework. Second, the model was tested on sea surface temperature prediction, and the results verified the accuracy of CENS. Finally, a meaningful finding was that the model's clustering results in the South China Sea matched the actual characteristics of its continental shelf, and the clusters showed spatial continuity.
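The error-driven routing of data regions to network node units can be sketched as an iterative assign-and-refit loop. Linear regressors stand in for the paper's network units, so this shows only the routing mechanism; unit count and round count are illustrative.

```python
# Sketch: keep a pool of predictor units, assign each region to whichever
# unit currently predicts it best, and refit as assignments change.
import numpy as np
from sklearn.linear_model import LinearRegression

def route_and_fit(regions, n_units=3, n_rounds=5):
    """regions: list of (X, y) training pairs, one per spatial region."""
    units = [LinearRegression() for _ in range(n_units)]
    assign = np.random.randint(n_units, size=len(regions))
    for _ in range(n_rounds):
        for u in range(n_units):                   # fit each unit on its regions
            idx = [i for i, a in enumerate(assign) if a == u]
            if idx:
                X = np.vstack([regions[i][0] for i in idx])
                y = np.hstack([regions[i][1] for i in idx])
                units[u].fit(X, y)
        for i, (X, y) in enumerate(regions):       # re-route by prediction error
            errs = [np.mean((u.predict(X) - y) ** 2)
                    if hasattr(u, "coef_") else np.inf for u in units]
            assign[i] = int(np.argmin(errs))
    return units, assign
```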
Article
Machine Learning for Pan Evaporation Modeling in Different Agroclimatic Zones of the Slovak Republic (Macro-Regions)
Sustainability 2022, 14(6), 3475; https://doi.org/10.3390/su14063475 - 16 Mar 2022
Cited by 1 | Viewed by 1549
Abstract
Global climate change is likely to influence evapotranspiration (ET); as a result, many ET calculation methods may not give accurate results under different climatic conditions. The main objective of this study is to verify the suitability of machine learning (ML) models for pan evaporation (PE) modeling on the macro-regional scale. The most significant PE changes in the different agroclimatic zones of the Slovak Republic were compared and their impacts analyzed. On the basis of the agroclimatic zones, 35 meteorological stations distributed across Slovakia were grouped into six macro-regions. For each station, 11 variables were recorded at a daily time step during the vegetation periods of 2010 to 2020. Eight ML models were employed to predict PE: the neural network (NN) model, the autoneural network (AN) model, the decision tree (DT) model, the Dmine regression (DR) model, the DM neural network (DM NN) model, the gradient boosting (GB) model, the least angle regression (LARS) model, and the ensemble model (EM). The models showed diverse prediction accuracies across geographical locations, and the study compares the values predicted by the individual models.
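A sketch of the model-comparison setup follows, using scikit-learn counterparts where they exist (e.g. `Lars` for least angle regression); several of the paper's models are tool-specific and have no direct equivalent here, and all hyperparameters are illustrative.

```python
# Sketch: fit several regressors on the same daily meteorological features
# and compare hold-out error, as one would per macro-region.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Lars
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.tree import DecisionTreeRegressor

models = {
    "NN": MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000),
    "DT": DecisionTreeRegressor(max_depth=6),
    "GB": GradientBoostingRegressor(),
    "LARS": Lars(),
}

def compare(X: np.ndarray, y: np.ndarray) -> dict[str, float]:
    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.2, random_state=0)
    return {name: mean_squared_error(yte, m.fit(Xtr, ytr).predict(Xte))
            for name, m in models.items()}
```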
Article
Unsupervised Anomaly Detection and Segmentation on Dirty Datasets
Future Internet 2022, 14(3), 86; https://doi.org/10.3390/fi14030086 - 13 Mar 2022
Viewed by 2370
Abstract
Industrial quality control is an important task. Most existing vision-based unsupervised industrial anomaly detection and segmentation methods require a training set consisting only of normal samples, which is difficult to ensure in practice. This paper proposes an unsupervised framework for industrial anomaly detection and segmentation when the training set contains anomaly samples. Our framework uses a model pretrained on ImageNet as a feature extractor to obtain patch-level features. We then propose a trimming method to estimate a robust Gaussian distribution from the patch features at each position. Through an iterative filtering process, we filter out the anomaly samples in the training set and re-estimate the Gaussian distribution at each position. In the prediction phase, the Mahalanobis distance between a patch feature vector and the center of the Gaussian distribution at the corresponding position serves as the patch's anomaly score, and anomaly region segmentation is performed from the patch anomaly scores. We tested the proposed method on three datasets containing anomaly samples and obtained state-of-the-art performance.
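The trimming-and-scoring core lends itself to a short sketch: iteratively estimate a Gaussian per position, drop the most deviant patch features, and score new patches by Mahalanobis distance. Feature extraction from a pretrained backbone is assumed upstream; the trim ratio and iteration count are illustrative.

```python
# Sketch of robust Gaussian estimation with trimming, then Mahalanobis scoring.
import numpy as np

def robust_gaussian(feats: np.ndarray, trim_ratio=0.1, n_iter=3):
    """feats: (N, D) patch features at one spatial position."""
    keep = feats
    for _ in range(n_iter):
        mu = keep.mean(axis=0)
        cov = np.cov(keep, rowvar=False) + 1e-5 * np.eye(feats.shape[1])
        inv = np.linalg.inv(cov)
        d = np.sqrt(np.einsum("nd,dk,nk->n", keep - mu, inv, keep - mu))
        keep = keep[d <= np.quantile(d, 1.0 - trim_ratio)]  # trim worst tail
    mu = keep.mean(axis=0)                                  # final estimate
    cov = np.cov(keep, rowvar=False) + 1e-5 * np.eye(feats.shape[1])
    return mu, np.linalg.inv(cov)

def mahalanobis_score(x: np.ndarray, mu: np.ndarray, inv: np.ndarray) -> float:
    diff = x - mu
    return float(np.sqrt(diff @ inv @ diff))   # patch anomaly score
```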
Article
Early Detection of Dendroctonus valens Infestation with Machine Learning Algorithms Based on Hyperspectral Reflectance
Remote Sens. 2022, 14(6), 1373; https://doi.org/10.3390/rs14061373 - 11 Mar 2022
Cited by 3 | Viewed by 2257
Abstract
The red turpentine beetle (Dendroctonus valens LeConte) has caused severe ecological and economic losses since its invasion of China. It is gradually spreading northeast, killing many Chinese pines (Pinus tabuliformis Carr.). Early detection of D. valens infestation (i.e., at the green attack stage) is the basis of control measures to prevent outbreak and spread. This study examined the changes in spectral reflectance after the initial attack of D. valens and explored the possibility of detecting early infestation with spectral vegetation indices and machine learning algorithms. The spectral reflectance of infested trees differed significantly from that of healthy trees (p < 0.05), with a pronounced decrease in the near-infrared region (760–1386 nm; p < 0.01). Spectral vegetation indices were fed into three machine learning classifiers; classification accuracy was 72.5–80%, and sensitivity was 65–85%. Several spectral vegetation indices (DID, CUR, TBSI, DDn2, D735, SR1, NSMI, RNIR·CRI550 and RVSI) were sensitive indicators for the early detection of D. valens damage. Our results demonstrate that remote sensing technology can be applied to detect D. valens infestation early and to identify the sensitive spectral regions and vegetation indices, with important implications for early detection based on unmanned aerial vehicle and satellite data.
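A sketch of the index-based classification setup follows. The index set below (e.g. NDVI from the red and near-infrared bands) is a common formulation chosen for illustration and may differ from the paper's sensitive indices; the classifier choice is likewise an assumption.

```python
# Sketch: derive a few spectral vegetation indices from per-tree reflectance
# and train a standard classifier to separate healthy from green-attack trees.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def band(spectrum: np.ndarray, wavelengths: np.ndarray, nm: float) -> float:
    """Reflectance at the band closest to the requested wavelength (nm)."""
    return float(spectrum[np.argmin(np.abs(wavelengths - nm))])

def features(spectrum, wavelengths):
    r670 = band(spectrum, wavelengths, 670)   # red
    r735 = band(spectrum, wavelengths, 735)   # red edge
    r800 = band(spectrum, wavelengths, 800)   # near infrared
    ndvi = (r800 - r670) / (r800 + r670 + 1e-8)
    return [ndvi, r735, r800]                 # illustrative index set

def fit_detector(spectra, wavelengths, labels):
    X = np.array([features(s, wavelengths) for s in spectra])
    return RandomForestClassifier(n_estimators=200).fit(X, labels)
```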