Special Issue "Data Science and Big Data in Biology, Physical Science and Engineering"

A special issue of Technologies (ISSN 2227-7080). This special issue belongs to the section "Information and Communication Technologies".

Deadline for manuscript submissions: 30 September 2023 | Viewed by 17,859

Special Issue Editor

Prof. Dr. Mohammed Mahmoud
Department of Computer Science, Bemidji State University, Bemidji, MN 56601-2699, USA
Interests: data science; big data; machine learning; deep learning; artificial intelligence (AI); cybersecurity

Special Issue Information

Dear Colleagues,

Big Data analysis is one of the most important contemporary areas of research and development. Tremendous amounts of data are generated every day by digital technologies and modern information systems, such as cloud computing and Internet of Things (IoT) devices. Analysis of these enormous amounts of data has become crucially important and requires a great deal of effort to extract valuable knowledge for decision-making, which in turn will yield important contributions in both academia and industry.

Big Data and data science have emerged from the need to generate, store, organise and process immense amounts of data. Data scientists apply artificial intelligence (AI) and machine learning (ML) approaches and models so that computers can identify what the data represent and detect patterns more quickly, efficiently and reliably than humans can.

The goal of this Special Issue is to explore and discuss principles, tools and models in the context of data science, together with the diverse concepts and techniques relating to Big Data in biology, chemistry, biomedical engineering, physics, mathematics and other fields that work with Big Data.

Prof. Dr. Mohammed Mahmoud
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Technologies is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Data science
  • Big Data
  • Machine learning
  • Artificial intelligence

Published Papers (8 papers)

Research

Article
An Advanced Decision Tree-Based Deep Neural Network in Nonlinear Data Classification
Technologies 2023, 11(1), 24; https://doi.org/10.3390/technologies11010024 - 01 Feb 2023
Viewed by 1345
Abstract
Deep neural networks (DNNs), the integration of neural networks (NNs) and deep learning (DL), have proven highly efficient in executing numerous complex tasks, such as data and image classification. Because the multilayer in a nonlinearly separable data structure is not transparent, it is critical to develop a specific data classification model from a new and unexpected dataset. In this paper, we propose a novel approach using the concepts of DNN and decision tree (DT) for classifying nonlinear data. We first developed a decision tree-based neural network (DTBNN) model. Next, we extend our model to a decision tree-based deep neural network (DTBDNN), in which the multiple hidden layers in DNN are utilized. Using DNN, the DTBDNN model achieved higher accuracy compared to the related and relevant approaches. Our proposal achieves the optimal trainable weights and bias to build an efficient model for nonlinear data classification by combining the benefits of DT and NN. By conducting in-depth performance evaluations, we demonstrate the effectiveness and feasibility of the proposal by achieving good accuracy over different datasets.
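
A minimal sketch of the general idea of pairing a decision tree with a neural network (not the authors' DTBNN/DTBDNN architecture): the tree's leaf assignments are one-hot encoded and fed, alongside the raw inputs, to a small multilayer perceptron on a nonlinear toy dataset. The dataset, tree depth and layer sizes are illustrative assumptions.

```python
# Sketch only: tree-derived leaf features + MLP for nonlinear classification.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier

X, y = make_moons(n_samples=1000, noise=0.25, random_state=0)   # nonlinearly separable toy data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_tr, y_tr)
enc = OneHotEncoder(handle_unknown="ignore")
Z_tr = enc.fit_transform(tree.apply(X_tr).reshape(-1, 1)).toarray()   # leaf-index features
Z_te = enc.transform(tree.apply(X_te).reshape(-1, 1)).toarray()

mlp = MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=2000, random_state=0)
mlp.fit(np.hstack([X_tr, Z_tr]), y_tr)   # raw inputs plus tree-derived features
print("test accuracy:", mlp.score(np.hstack([X_te, Z_te]), y_te))
```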

Article
Data Model Design to Support Data-Driven IT Governance Implementation
Technologies 2022, 10(5), 106; https://doi.org/10.3390/technologies10050106 - 08 Oct 2022
Viewed by 1707
Abstract
Organizations must quickly adapt their processes to understand the dynamic nature of modern business environments. As highlighted in the literature, centralized governance supports decision-making and performance measurement processes in technology companies. For this reason, a reliable decision-making system with an integrated data model that enables the rapid collection and transformation of data stored in heterogeneous and different sources is needed. Therefore, this paper proposes the design of a data model to implement data-driven governance through a literature review of adopted approaches. The lack of a standardized procedure and a disconnection between theoretical frameworks and practical application has emerged. This paper documented the suggested approach following these steps: (i) mapping of monitoring requirements to the data structure, (ii) documentation of ER diagram design, and (iii) reporting dashboards used for monitoring and reporting. The paper helped fill the gaps highlighted in the literature by supporting the design and development of a DWH data model coupled with a BI system. The application prototype shows benefits for top management, particularly those responsible for governance and operations, especially for risk monitoring, audit compliance, communication, knowledge sharing on strategic areas of the company, and identification and implementation of performance improvements and optimizations.
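
As a rough illustration of the kind of data model the paper targets (not its actual ER design), the sketch below builds a tiny star-schema fragment in SQLite and runs one dashboard-style monitoring query; every table, column and value is a hypothetical placeholder.

```python
# Illustrative star-schema fragment for KPI monitoring; names and values are made up.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_process (process_id INTEGER PRIMARY KEY, name TEXT, owner TEXT);
CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, iso_date TEXT, quarter TEXT);
CREATE TABLE fact_kpi    (process_id INTEGER REFERENCES dim_process,
                          date_id    INTEGER REFERENCES dim_date,
                          kpi_name TEXT, value REAL, target REAL);
INSERT INTO dim_process VALUES (1, 'Incident management', 'IT Ops');
INSERT INTO dim_date    VALUES (1, '2022-09-30', 'Q3');
INSERT INTO fact_kpi    VALUES (1, 1, 'mean_time_to_restore_h', 5.2, 4.0);
""")

# Dashboard-style query: KPIs that miss their target in a given quarter.
rows = con.execute("""
    SELECT p.name, f.kpi_name, f.value, f.target
    FROM fact_kpi f
    JOIN dim_process p USING (process_id)
    JOIN dim_date d    USING (date_id)
    WHERE d.quarter = 'Q3' AND f.value > f.target
""").fetchall()
print(rows)
```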

Article
Rough-Set-Theory-Based Classification with Optimized k-Means Discretization
Technologies 2022, 10(2), 51; https://doi.org/10.3390/technologies10020051 - 08 Apr 2022
Cited by 2 | Viewed by 1729
Abstract
The discretization of continuous attributes in a dataset is an essential step before the Rough-Set-Theory (RST)-based classification process is applied. There are many methods for discretization, but not many of them have linked the RST instruments from the beginning of the discretization process. The objective of this research is to propose a method to improve the accuracy and reliability of the RST-based classifier model by involving RST instruments at the beginning of the discretization process. In the proposed method, a k-means-based discretization method optimized with a genetic algorithm (GA) was introduced. Four datasets taken from UCI were selected to test the performance of the proposed method. The evaluation of the proposed discretization technique for RST-based classification is performed by comparing it to other discretization methods, i.e., equal-frequency and entropy-based. The performance comparison among these methods is measured by the number of bins and rules generated and by its accuracy, precision, and recall. A Friedman test continued with post hoc analysis is also applied to measure the significance of the difference in performance. The experimental results indicate that, in general, the performance of the proposed discretization method is significantly better than the other compared methods.
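
The core preprocessing step, k-means-based discretization of a continuous attribute, can be sketched with scikit-learn's KBinsDiscretizer; the genetic-algorithm search over bin counts and the rough-set instruments that guide it in the paper are not reproduced here, and the synthetic data and bin count are arbitrary.

```python
# Sketch only: 1-D k-means discretization of a continuous attribute.
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 200), rng.normal(6, 1, 200)]).reshape(-1, 1)

disc = KBinsDiscretizer(n_bins=4, encode="ordinal", strategy="kmeans")
codes = disc.fit_transform(x)             # each value mapped to a discrete bin label
print("cut points:", disc.bin_edges_[0])  # boundaries chosen by 1-D k-means
```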

Article
A Novel Ensemble Machine Learning Approach for Bioarchaeological Sex Prediction
Technologies 2021, 9(2), 23; https://doi.org/10.3390/technologies9020023 - 01 Apr 2021
Cited by 2 | Viewed by 2197
Abstract
I present a novel machine learning approach to predict sex in the bioarchaeological record. Eighteen cranial interlandmark distances and five maxillary dental metric distances were recorded from n = 420 human skeletons from the necropolises at Alfedena (600–400 BCE) and Campovalano (750–200 BCE and 9–11th Centuries CE) in central Italy. A generalized low rank model (GLRM) was used to impute missing data and Area under the Curve—Receiver Operating Characteristic (AUC-ROC) with 20-fold stratified cross-validation was used to evaluate predictive performance of eight machine learning algorithms on different subsets of the data. Additional perspectives such as this one show strong potential for sex prediction in bioarchaeological and forensic anthropological contexts. Furthermore, GLRMs have the potential to handle missing data in ways previously unexplored in the discipline. Although results of this study look promising (highest AUC-ROC = 0.9722 for predicting binary male/female sex), the main limitation is that the sexes of the individuals included were not known but were estimated using standard macroscopic bioarchaeological methods. However, future research should apply this machine learning approach to known-sex reference samples in order to better understand its value, along with the more general contributions that machine learning can make to the reconstruction of past human lifeways.
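
The evaluation protocol (stratified 20-fold cross-validation scored by AUC-ROC) might look like the sketch below; a simple median imputer stands in for the paper's GLRM imputation, and the classifier and synthetic data are placeholders.

```python
# Sketch of the evaluation protocol only; GLRM imputation is replaced by a median imputer.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = make_classification(n_samples=420, n_features=23, random_state=0)
X[np.random.default_rng(0).random(X.shape) < 0.1] = np.nan   # inject missing values

model = make_pipeline(SimpleImputer(strategy="median"),
                      RandomForestClassifier(random_state=0))
cv = StratifiedKFold(n_splits=20, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
print("mean AUC-ROC:", scores.mean())
```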

Review

Review
A Review of Deep Transfer Learning and Recent Advancements
Technologies 2023, 11(2), 40; https://doi.org/10.3390/technologies11020040 - 14 Mar 2023
Cited by 5 | Viewed by 1483
Abstract
Deep learning has been the answer to many machine learning problems during the past two decades. However, it comes with two significant constraints: dependency on extensive labeled data and training costs. Transfer learning in deep learning, known as Deep Transfer Learning (DTL), attempts to reduce such reliance and costs by reusing obtained knowledge from a source data/task in training on a target data/task. Most applied DTL techniques are network/model-based approaches. These methods reduce the dependency of deep learning models on extensive training data and drastically decrease training costs. Moreover, the training cost reduction makes DTL viable on edge devices with limited resources. Like any new advancement, DTL methods have their own limitations, and a successful transfer depends on specific adjustments and strategies for different scenarios. This paper reviews the concept, definition, and taxonomy of deep transfer learning and well-known methods. It investigates the DTL approaches by reviewing applied DTL techniques in the past five years and a couple of experimental analyses of DTLs to discover the best practice for using DTL in different scenarios. Moreover, the limitations of DTLs (catastrophic forgetting dilemma and overly biased pre-trained models) are discussed, along with possible solutions and research trends.
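
One widely used network-based DTL pattern discussed in such reviews, freezing a pretrained backbone and training only a new task-specific head, might look like this sketch (it assumes torchvision 0.13+ for the weights argument; the 10-class head and optimizer settings are illustrative).

```python
# Sketch only: reuse ImageNet knowledge, fine-tune just a new classification head.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights="IMAGENET1K_V1")   # source knowledge (ImageNet)
for p in model.parameters():
    p.requires_grad = False                        # freeze the transferred layers
model.fc = nn.Linear(model.fc.in_features, 10)     # new head for a 10-class target task

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
# The training loop over the (small) target dataset would go here; only the head
# is updated, which is what reduces labeled-data needs and training cost.
```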

Review
Big Data in Biodiversity Science: A Framework for Engagement
Technologies 2021, 9(3), 60; https://doi.org/10.3390/technologies9030060 - 17 Aug 2021
Cited by 2 | Viewed by 3756
Abstract
Despite best efforts, the loss of biodiversity has continued at a pace that constitutes a major threat to the efficient functioning of ecosystems. Curbing the loss of biodiversity and assessing its local and global trends requires a vast amount of datasets from a variety of sources. Although the means for generating, aggregating and analyzing big datasets to inform policies are now within the reach of the scientific community, the data-driven nature of a complex multidisciplinary field such as biodiversity science necessitates an overarching framework for engagement. In this review, we propose such a schematic based on the life cycle of data to interrogate the science. The framework considers data generation and collection, storage and curation, access and analysis and, finally, communication as distinct yet interdependent themes for engaging biodiversity science for the purpose of making evidenced-based decisions. We summarize historical developments in each theme, including the challenges and prospects, and offer some recommendations based on best practices.

Other

Case Report
Dynamic Storage Location Assignment in Warehouses Using Deep Reinforcement Learning
Technologies 2022, 10(6), 129; https://doi.org/10.3390/technologies10060129 - 11 Dec 2022
Viewed by 1772
Abstract
The warehousing industry is faced with increasing customer demands and growing global competition. A major factor in the efficient operation of warehouses is the strategic storage location assignment of arriving goods, termed the dynamic storage location assignment problem (DSLAP). This paper presents a real-world use case of the DSLAP, in which deep reinforcement learning (DRL) is used to derive a suitable storage location assignment strategy to decrease transportation costs within the warehouse. The DRL agent is trained on historic data of storage and retrieval operations gathered over one year of operation. The evaluation of the agent on new data of two months shows a 6.3% decrease in incurring costs compared to the currently utilized storage location assignment strategy which is based on manual ABC-classifications. Hence, DRL proves to be a competitive solution alternative for the DSLAP and related problems in the warehousing industry.
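
As a deliberately simplified stand-in for the paper's deep RL agent, the sketch below uses tabular Q-learning on a two-zone, two-class toy version of the assignment problem, with assignment cost as the learning signal; all costs and retrieval rates are invented.

```python
# Toy sketch: learn which zone to assign each item class to by tracking expected cost.
import numpy as np

rng = np.random.default_rng(0)
n_classes, n_zones = 2, 2             # item class: 0 = fast mover, 1 = slow mover
Q = np.zeros((n_classes, n_zones))    # zones: 0 = near the dock, 1 = far from the dock
travel = np.array([1.0, 3.0])         # travel cost per retrieval, by zone
putaway = np.array([2.0, 0.5])        # cost of occupying the zone, by zone
retrievals = np.array([5.0, 0.5])     # expected retrievals, by item class

alpha, eps = 0.1, 0.2
for _ in range(5000):
    s = rng.integers(n_classes)       # an arriving item of some class
    a = rng.integers(n_zones) if rng.random() < eps else int(Q[s].argmin())
    cost = putaway[a] + retrievals[s] * travel[a] + rng.normal(0, 0.1)
    Q[s, a] += alpha * (cost - Q[s, a])    # running estimate of expected cost

print(Q.argmin(axis=1))   # learned policy: fast movers near the dock, slow movers far
```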

Case Report
Business Intelligence’s Self-Service Tools Evaluation
Technologies 2022, 10(4), 92; https://doi.org/10.3390/technologies10040092 - 10 Aug 2022
Viewed by 1215
Abstract
The software selection process in the context of a big company is not an easy task. In the Business Intelligence area, this decision is critical, since the resources needed to implement the tool are huge and imply the participation of all organization actors. We propose to adopt the systemic quality model to perform a neutral comparison between four business intelligence self-service tools. To assess the quality, we consider eight characteristics and eighty-two metrics. We built a methodology to evaluate self-service BI tools, adapting the systemic quality model. As an example, we evaluated four tools that were selected from all business intelligence platforms, following a rigorous methodology. Through the assessment, we obtained two tools with the maximum quality level. To obtain the differences between them, we were more restrictive increasing the level of satisfaction. Finally, we got a unique tool with the maximum quality level, while the other one was rejected according to the rules established in the methodology. The methodology works well for this type of software, helping in the detailed analysis and neutral selection of the final software to be used for the implementation.
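
A minimal sketch of threshold-based quality scoring in the spirit of the systemic quality model; the characteristics, tools, scores and the 0.75 threshold are made-up placeholders rather than values from the paper.

```python
# Sketch only: aggregate per-characteristic satisfaction and apply a minimum threshold.
characteristics = {
    "functionality": {"ToolA": 0.90, "ToolB": 0.80},
    "usability":     {"ToolA": 0.70, "ToolB": 0.85},
    "efficiency":    {"ToolA": 0.80, "ToolB": 0.90},
}

threshold = 0.75   # raising this value discriminates between top-ranked tools
for tool in ("ToolA", "ToolB"):
    scores = [c[tool] for c in characteristics.values()]
    satisfied = all(s >= threshold for s in scores)
    print(tool, "mean =", round(sum(scores) / len(scores), 3), "| passes:", satisfied)
```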

Planned Papers

The list below represents planned manuscripts only. Some of these manuscripts have not yet been received by the Editorial Office. Papers submitted to MDPI journals are subject to peer review.

Title: Current Advances in Radiographic Methods for Personal Identification of Unknown Decedents
Authors: Sharon M. Derrick, Ruby Mehrubeoglu, and Longzhuang Li
Affiliation: Texas A&M University-Corpus Christi; [email protected]
Correspondence: Sharon M. Derrick, [email protected]; Tel.: 361-825-3637
Abstract: Forensic practitioners and researchers have been cognizant for decades that quantitative and accessible methods of decedent identification, processed promptly within the medical examiner/coroner office, are needed but sorely lacking. The most available on-site use of biometric technology has been fingerprint comparison through AFIS, a massive repository of fingerprint data, but AFIS is not useful if the person has no fingerprints in the system or if the decedent’s hands are decomposed/skeletal. Rapid advances in radiographic technology and software over the last decade, in conjunction with an increase in digital X-ray machines and CT instrumentation purchased as standard equipment in medical examiner/coroner offices, have elicited a plethora of research into new quantitative methods of radiographic identification. Large databases of digital antemortem and postmortem standard radiograph and CT images, which can be de-identified for research purposes, are available for access by these researchers. A review of major papers published from 2010 to 2021 describing novel radiographic comparison methods provides an encouraging view of the present and future use of radiography in forensic identification.
Keywords: forensic; identification; radiography
