Special Issue "Data Science and Big Data in Biology, Physical Science and Engineering"

A special issue of Technologies (ISSN 2227-7080). This special issue belongs to the section "Information and Communication Technologies".

Deadline for manuscript submissions: 30 September 2023 | Viewed by 17,859

Special Issue Editor

Prof. Dr. Mohammed Mahmoud
Department of Computer Science, Bemidji State University, Bemidji, MN 56601-2699, USA
Interests: data science; big data; machine learning; deep learning; artificial intelligence (AI); cybersecurity

Special Issue Information

Dear Colleagues,

Big Data analysis is one of the most important contemporary areas of research and development. Tremendous amounts of data are generated every day by digital technologies and modern information systems, such as cloud computing and Internet of Things (IoT) devices. Analysis of these enormous amounts of data has become crucially important and requires a great deal of effort to extract valuable knowledge for decision-making, which in turn will yield important contributions in both academia and industry.

Big Data and data science have emerged from the need to generate, store, organise and process immense amounts of data. Data scientists apply artificial intelligence (AI) and machine learning (ML) approaches and models so that computers can identify what the data represent and detect patterns more quickly, efficiently and reliably than humans can.

The goal of this Special Issue is to explore and discuss principles, tools and models in the context of data science, together with the diverse concepts and techniques relating to Big Data in biology, chemistry, biomedical engineering, physics, mathematics and other fields that work with Big Data.

Prof. Dr. Mohammed Mahmoud
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Technologies is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Data science
  • Big Data
  • Machine learning
  • Artificial intelligence

Published Papers (8 papers)

Research

Article
An Advanced Decision Tree-Based Deep Neural Network in Nonlinear Data Classification
Technologies 2023, 11(1), 24; https://doi.org/10.3390/technologies11010024 - 01 Feb 2023
Viewed by 1345
Abstract
Deep neural networks (DNNs), the integration of neural networks (NNs) and deep learning (DL), have proven highly efficient in executing numerous complex tasks, such as data and image classification. Because the multilayer in a nonlinearly separable data structure is not transparent, it is critical to develop a specific data classification model from a new and unexpected dataset. In this paper, we propose a novel approach using the concepts of DNN and decision tree (DT) for classifying nonlinear data. We first developed a decision tree-based neural network (DTBNN) model. Next, we extend our model to a decision tree-based deep neural network (DTBDNN), in which the multiple hidden layers in DNN are utilized. Using DNN, the DTBDNN model achieved higher accuracy compared to the related and relevant approaches. Our proposal achieves the optimal trainable weights and bias to build an efficient model for nonlinear data classification by combining the benefits of DT and NN. By conducting in-depth performance evaluations, we demonstrate the effectiveness and feasibility of the proposal by achieving good accuracy over different datasets.
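
A minimal sketch of the general idea of pairing a decision tree with a neural network (not the authors' DTBNN/DTBDNN architecture): the tree's leaf assignments are one-hot encoded and fed, alongside the raw inputs, to a small multilayer perceptron on a nonlinear toy dataset. The dataset, tree depth and layer sizes are illustrative assumptions.

```python
# Sketch only: tree-derived leaf features + MLP for nonlinear classification.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier

X, y = make_moons(n_samples=1000, noise=0.25, random_state=0)   # nonlinearly separable toy data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_tr, y_tr)
enc = OneHotEncoder(handle_unknown="ignore")
Z_tr = enc.fit_transform(tree.apply(X_tr).reshape(-1, 1)).toarray()   # leaf-index features
Z_te = enc.transform(tree.apply(X_te).reshape(-1, 1)).toarray()

mlp = MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=2000, random_state=0)
mlp.fit(np.hstack([X_tr, Z_tr]), y_tr)   # raw inputs plus tree-derived features
print("test accuracy:", mlp.score(np.hstack([X_te, Z_te]), y_te))
```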

Article
Data Model Design to Support Data-Driven IT Governance Implementation
Technologies 2022, 10(5), 106; https://doi.org/10.3390/technologies10050106 - 08 Oct 2022
Viewed by 1707
Abstract
Organizations must quickly adapt their processes to understand the dynamic nature of modern business environments. As highlighted in the literature, centralized governance supports decision-making and performance measurement processes in technology companies. For this reason, a reliable decision-making system with an integrated data model that enables the rapid collection and transformation of data stored in heterogeneous and different sources is needed. Therefore, this paper proposes the design of a data model to implement data-driven governance through a literature review of adopted approaches. The lack of a standardized procedure and a disconnection between theoretical frameworks and practical application has emerged. This paper documented the suggested approach following these steps: (i) mapping of monitoring requirements to the data structure, (ii) documentation of ER diagram design, and (iii) reporting dashboards used for monitoring and reporting. The paper helped fill the gaps highlighted in the literature by supporting the design and development of a DWH data model coupled with a BI system. The application prototype shows benefits for top management, particularly those responsible for governance and operations, especially for risk monitoring, audit compliance, communication, knowledge sharing on strategic areas of the company, and identification and implementation of performance improvements and optimizations.
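
As a rough illustration of the kind of data model the paper targets (not its actual ER design), the sketch below builds a tiny star-schema fragment in SQLite and runs one dashboard-style monitoring query; every table, column and value is a hypothetical placeholder.

```python
# Illustrative star-schema fragment for KPI monitoring; names and values are made up.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_process (process_id INTEGER PRIMARY KEY, name TEXT, owner TEXT);
CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, iso_date TEXT, quarter TEXT);
CREATE TABLE fact_kpi    (process_id INTEGER REFERENCES dim_process,
                          date_id    INTEGER REFERENCES dim_date,
                          kpi_name TEXT, value REAL, target REAL);
INSERT INTO dim_process VALUES (1, 'Incident management', 'IT Ops');
INSERT INTO dim_date    VALUES (1, '2022-09-30', 'Q3');
INSERT INTO fact_kpi    VALUES (1, 1, 'mean_time_to_restore_h', 5.2, 4.0);
""")

# Dashboard-style query: KPIs that miss their target in a given quarter.
rows = con.execute("""
    SELECT p.name, f.kpi_name, f.value, f.target
    FROM fact_kpi f
    JOIN dim_process p USING (process_id)
    JOIN dim_date d    USING (date_id)
    WHERE d.quarter = 'Q3' AND f.value > f.target
""").fetchall()
print(rows)
```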

Article
Rough-Set-Theory-Based Classification with Optimized k-Means Discretization
Technologies 2022, 10(2), 51; https://doi.org/10.3390/technologies10020051 - 08 Apr 2022
Cited by 2 | Viewed by 1729
Abstract
The discretization of continuous attributes in a dataset is an essential step before the Rough-Set-Theory (RST)-based classification process is applied. There are many methods for discretization, but not many of them have linked the RST instruments from the beginning of the discretization process. The objective of this research is to propose a method to improve the accuracy and reliability of the RST-based classifier model by involving RST instruments at the beginning of the discretization process. In the proposed method, a k-means-based discretization method optimized with a genetic algorithm (GA) was introduced. Four datasets taken from UCI were selected to test the performance of the proposed method. The evaluation of the proposed discretization technique for RST-based classification is performed by comparing it to other discretization methods, i.e., equal-frequency and entropy-based. The performance comparison among these methods is measured by the number of bins and rules generated and by its accuracy, precision, and recall. A Friedman test continued with post hoc analysis is also applied to measure the significance of the difference in performance. The experimental results indicate that, in general, the performance of the proposed discretization method is significantly better than the other compared methods.
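
The core preprocessing step, k-means-based discretization of a continuous attribute, can be sketched with scikit-learn's KBinsDiscretizer; the genetic-algorithm search over bin counts and the rough-set instruments that guide it in the paper are not reproduced here, and the synthetic data and bin count are arbitrary.

```python
# Sketch only: 1-D k-means discretization of a continuous attribute.
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 200), rng.normal(6, 1, 200)]).reshape(-1, 1)

disc = KBinsDiscretizer(n_bins=4, encode="ordinal", strategy="kmeans")
codes = disc.fit_transform(x)             # each value mapped to a discrete bin label
print("cut points:", disc.bin_edges_[0])  # boundaries chosen by 1-D k-means
```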

Article
A Novel Ensemble Machine Learning Approach for Bioarchaeological Sex Prediction
Technologies 2021, 9(2), 23; https://doi.org/10.3390/technologies9020023 - 01 Apr 2021
Cited by 2 | Viewed by 2197
Abstract
I present a novel machine learning approach to predict sex in the bioarchaeological record. Eighteen cranial interlandmark distances and five maxillary dental metric distances were recorded from n = 420 human skeletons from the necropolises at Alfedena (600–400 BCE) and Campovalano (750–200 BCE and 9–11th Centuries CE) in central Italy. A generalized low rank model (GLRM) was used to impute missing data and Area under the Curve—Receiver Operating Characteristic (AUC-ROC) with 20-fold stratified cross-validation was used to evaluate predictive performance of eight machine learning algorithms on different subsets of the data. Additional perspectives such as this one show strong potential for sex prediction in bioarchaeological and forensic anthropological contexts. Furthermore, GLRMs have the potential to handle missing data in ways previously unexplored in the discipline. Although results of this study look promising (highest AUC-ROC = 0.9722 for predicting binary male/female sex), the main limitation is that the sexes of the individuals included were not known but were estimated using standard macroscopic bioarchaeological methods. However, future research should apply this machine learning approach to known-sex reference samples in order to better understand its value, along with the more general contributions that machine learning can make to the reconstruction of past human lifeways.
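
The evaluation protocol (stratified 20-fold cross-validation scored by AUC-ROC) might look like the sketch below; a simple median imputer stands in for the paper's GLRM imputation, and the classifier and synthetic data are placeholders.

```python
# Sketch of the evaluation protocol only; GLRM imputation is replaced by a median imputer.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = make_classification(n_samples=420, n_features=23, random_state=0)
X[np.random.default_rng(0).random(X.shape) < 0.1] = np.nan   # inject missing values

model = make_pipeline(SimpleImputer(strategy="median"),
                      RandomForestClassifier(random_state=0))
cv = StratifiedKFold(n_splits=20, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
print("mean AUC-ROC:", scores.mean())
```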

Review

Review
A Review of Deep Transfer Learning and Recent Advancements
Technologies 2023, 11(2), 40; https://doi.org/10.3390/technologies11020040 - 14 Mar 2023
Cited by 5 | Viewed by 1483
Abstract
Deep learning has been the answer to many machine learning problems during the past two decades. However, it comes with two significant constraints: dependency on extensive labeled data and training costs. Transfer learning in deep learning, known as Deep Transfer Learning (DTL), attempts to reduce such reliance and costs by reusing obtained knowledge from a source data/task in training on a target data/task. Most applied DTL techniques are network/model-based approaches. These methods reduce the dependency of deep learning models on extensive training data and drastically decrease training costs. Moreover, the training cost reduction makes DTL viable on edge devices with limited resources. Like any new advancement, DTL methods have their own limitations, and a successful transfer depends on specific adjustments and strategies for different scenarios. This paper reviews the concept, definition, and taxonomy of deep transfer learning and well-known methods. It investigates the DTL approaches by reviewing applied DTL techniques in the past five years and a couple of experimental analyses of DTLs to discover the best practice for using DTL in different scenarios. Moreover, the limitations of DTLs (catastrophic forgetting dilemma and overly biased pre-trained models) are discussed, along with possible solutions and research trends.
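
One widely used network-based DTL pattern discussed in such reviews, freezing a pretrained backbone and training only a new task-specific head, might look like this sketch (it assumes torchvision 0.13+ for the weights argument; the 10-class head and optimizer settings are illustrative).

```python
# Sketch only: reuse ImageNet knowledge, fine-tune just a new classification head.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights="IMAGENET1K_V1")   # source knowledge (ImageNet)
for p in model.parameters():
    p.requires_grad = False                        # freeze the transferred layers
model.fc = nn.Linear(model.fc.in_features, 10)     # new head for a 10-class target task

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
# The training loop over the (small) target dataset would go here; only the head
# is updated, which is what reduces labeled-data needs and training cost.
```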

Review
Big Data in Biodiversity Science: A Framework for Engagement
Technologies 2021, 9(3), 60; https://doi.org/10.3390/technologies9030060 - 17 Aug 2021
Cited by 2 | Viewed by 3756
Abstract
Despite best efforts, the loss of biodiversity has continued at a pace that constitutes a major threat to the efficient functioning of ecosystems. Curbing the loss of biodiversity and assessing its local and global trends requires a vast amount of datasets from a variety of sources. Although the means for generating, aggregating and analyzing big datasets to inform policies are now within the reach of the scientific community, the data-driven nature of a complex multidisciplinary field such as biodiversity science necessitates an overarching framework for engagement. In this review, we propose such a schematic based on the life cycle of data to interrogate the science. The framework considers data generation and collection, storage and curation, access and analysis and, finally, communication as distinct yet interdependent themes for engaging biodiversity science for the purpose of making evidenced-based decisions. We summarize historical developments in each theme, including the challenges and prospects, and offer some recommendations based on best practices.

Other

Case Report
Dynamic Storage Location Assignment in Warehouses Using Deep Reinforcement Learning
Technologies 2022, 10(6), 129; https://doi.org/10.3390/technologies10060129 - 11 Dec 2022
Viewed by 1772
Abstract
The warehousing industry is faced with increasing customer demands and growing global competition. A major factor in the efficient operation of warehouses is the strategic storage location assignment of arriving goods, termed the dynamic storage location assignment problem (DSLAP). This paper presents a real-world use case of the DSLAP, in which deep reinforcement learning (DRL) is used to derive a suitable storage location assignment strategy to decrease transportation costs within the warehouse. The DRL agent is trained on historic data of storage and retrieval operations gathered over one year of operation. The evaluation of the agent on new data of two months shows a 6.3% decrease in incurring costs compared to the currently utilized storage location assignment strategy which is based on manual ABC-classifications. Hence, DRL proves to be a competitive solution alternative for the DSLAP and related problems in the warehousing industry.
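
As a deliberately simplified stand-in for the paper's deep RL agent, the sketch below uses tabular Q-learning on a two-zone, two-class toy version of the assignment problem, with assignment cost as the learning signal; all costs and retrieval rates are invented.

```python
# Toy sketch: learn which zone to assign each item class to by tracking expected cost.
import numpy as np

rng = np.random.default_rng(0)
n_classes, n_zones = 2, 2             # item class: 0 = fast mover, 1 = slow mover
Q = np.zeros((n_classes, n_zones))    # zones: 0 = near the dock, 1 = far from the dock
travel = np.array([1.0, 3.0])         # travel cost per retrieval, by zone
putaway = np.array([2.0, 0.5])        # cost of occupying the zone, by zone
retrievals = np.array([5.0, 0.5])     # expected retrievals, by item class

alpha, eps = 0.1, 0.2
for _ in range(5000):
    s = rng.integers(n_classes)       # an arriving item of some class
    a = rng.integers(n_zones) if rng.random() < eps else int(Q[s].argmin())
    cost = putaway[a] + retrievals[s] * travel[a] + rng.normal(0, 0.1)
    Q[s, a] += alpha * (cost - Q[s, a])    # running estimate of expected cost

print(Q.argmin(axis=1))   # learned policy: fast movers near the dock, slow movers far
```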

Case Report
Business Intelligence’s Self-Service Tools Evaluation
Technologies 2022, 10(4), 92; https://doi.org/10.3390/technologies10040092 - 10 Aug 2022
Viewed by 1215
Abstract
The software selection process in the context of a big company is not an easy task. In the Business Intelligence area, this decision is critical, since the resources needed to implement the tool are huge and imply the participation of all organization actors. We propose to adopt the systemic quality model to perform a neutral comparison between four business intelligence self-service tools. To assess the quality, we consider eight characteristics and eighty-two metrics. We built a methodology to evaluate self-service BI tools, adapting the systemic quality model. As an example, we evaluated four tools that were selected from all business intelligence platforms, following a rigorous methodology. Through the assessment, we obtained two tools with the maximum quality level. To obtain the differences between them, we were more restrictive increasing the level of satisfaction. Finally, we got a unique tool with the maximum quality level, while the other one was rejected according to the rules established in the methodology. The methodology works well for this type of software, helping in the detailed analysis and neutral selection of the final software to be used for the implementation.
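
A minimal sketch of threshold-based quality scoring in the spirit of the systemic quality model; the characteristics, tools, scores and the 0.75 threshold are made-up placeholders rather than values from the paper.

```python
# Sketch only: aggregate per-characteristic satisfaction and apply a minimum threshold.
characteristics = {
    "functionality": {"ToolA": 0.90, "ToolB": 0.80},
    "usability":     {"ToolA": 0.70, "ToolB": 0.85},
    "efficiency":    {"ToolA": 0.80, "ToolB": 0.90},
}

threshold = 0.75   # raising this value discriminates between top-ranked tools
for tool in ("ToolA", "ToolB"):
    scores = [c[tool] for c in characteristics.values()]
    satisfied = all(s >= threshold for s in scores)
    print(tool, "mean =", round(sum(scores) / len(scores), 3), "| passes:", satisfied)
```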

Planned Papers

The list below represents planned manuscripts only. Some of these manuscripts have not yet been received by the Editorial Office. Papers submitted to MDPI journals are subject to peer review.

Title: Current Advances in Radiographic Methods for Personal Identification of Unknown Decedents
Authors: Sharon M. Derrick, Ruby Mehrubeoglu, and Longzhuang Li
Affiliation: Texas A&M University-Corpus Christi; [email protected]
Correspondence: Sharon M. Derrick, [email protected]; Tel.: 361-825-3637
Abstract: Forensic practitioners and researchers have been cognizant for decades that quantitative and accessible methods of decedent identification, processed promptly within the medical examiner/coroner office, are needed but sorely lacking. The most available on-site use of biometric technology has been fingerprint comparison through AFIS, a massive repository of fingerprint data, but AFIS is not useful if the person has no fingerprints in the system or if the decedent’s hands are decomposed/skeletal. Rapid advances in radiographic technology and software over the last decade, in conjunction with an increase in digital X-ray machines and CT instrumentation purchased as standard equipment in medical examiner/coroner offices, have elicited a plethora of research into new quantitative methods of radiographic identification. Large databases of digital antemortem and postmortem standard radiograph and CT images, which can be de-identified for research purposes, are available for access by these researchers. A review of major papers published from 2010 to 2021 describing novel radiographic comparison methods provides an encouraging view of the present and future use of radiography in forensic identification.
Keywords: forensic; identification; radiography
