Big Data and Machine Learning in Earth Sciences

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Earth Sciences".

Deadline for manuscript submissions: closed (31 October 2022) | Viewed by 18870

Special Issue Editors

Dr. Kwo-Sen Kuo

Guest Editor

1. BAYESICS, LLC, Bowie, MD, USA
2. Department of Atmospheric & Oceanic Science, University of Maryland, College Park, MD, USA
3. The Information Technology and Systems Center, University of Alabama, Huntsville, AL, USA
4. Goddard Space Flight Center, NASA, Greenbelt, MD, USA
Interests: (professional) geoscience information technology; geo-spatiotemporal data processing/analysis; neural networks; machine learning; artificial intelligence; electromagnetic scattering; radiative transfer; precipitation retrieval; (nonprofessional) modern history; Chinese etymology

Dr. Rahul Ramachandran

Guest Editor

Goddard Space Flight Center, NASA, Greenbelt, MD, USA
Interests: earth science informatics; data science with a focus on the application of novel computational methods and information technology to the acquisition, storage, processing, discovery, interchange, analysis and visualization of Earth science data and information

Prof. Dr. Morris Riedel

Guest Editor

Full Professor, University of Iceland, Reykjavik, Iceland & Research Group Leader, Juelich Supercomputing Centre, Forschungszentrum Juelich, Juelich, Germany
Interests: parallel and scalable machine and deep learning; high-performance computing; cloud computing; statistical data mining; remote sensing; Earth observation applications

Special Issue Information

If big data challenges can be summed up as “to cost-effectively scale computation and storage in the face of ever-increasing data volumes and varieties with an ever-escalating demand for velocity”, we posit that these challenges have been present since the dawn of digital computing for Earth science. Our desire for better fidelity from predictions urges us to incorporate into numerical models ever more comprehensive physical interactions with ever more refined intricacy, which compels ever more extensive and expansive observations with ever more detailed focus, further intensifying the challenge. As a result, another type of challenge arises, i.e., the wish to realize the full value from the deluge of data generated by models and observations. Unfortunately, until recently, we have had to rely mostly on human cognitive faculties. Machine learning promises to address this challenge. Since Earth is a complex, nonlinear system rife with processes spanning a broad spectrum of spatiotemporal scales, we can better constrain our hypotheses and direct our investigations when leveraging dissimilar data featuring complementary strengths. Similarly, the performance and, especially, the generalizability of machine learning models improve with increasing volume and variety of training data. Thus, these two types of challenges are linked: machine learning and, more recently, deep learning techniques, which are meant to automate the analysis and interpretation of big data, will not be very effective if the data cannot be wrangled and processed using parallel and scalable methods. We aim for this Special Issue to review the progress and explore the prospect of addressing these two interconnected challenges in the context of their history.

Dear Colleagues,

We Earth scientists are certainly no strangers to the challenges of big data. However, improvements in model and observation resolutions, the introduction of exotic simulation grids, and the employment of novel observation technologies and strategies have conspired to aggravate the acuity of the challenges, especially with the end of Moore’s law approaching. Thus, scalable parallel processing has attained paramount importance in addressing these challenges. The most challenging step in optimizing scalable parallel processing is perhaps data preparation, as attested by the 80/20 rule: data analysts or scientists devote only 20% of their time to actual analysis, with the rest, i.e., 80%, spent in preparing the data for analysis.

Moreover, even if we were able to process the volume and variety of our big data in a timely fashion in the traditional manner, it would not be sufficient because, relying purely on human effort, the analysis and interpretation of the processed results may still overwhelm us. We need help from artificial intelligence and machine learning (AIML), which have demonstrated remarkable advancement recently, often achieving comparable or even better performance than humans in specific or less advanced cognitive functions. AIML techniques are usually computationally intensive, especially in their training/learning stage, again accentuating the importance of scalable parallel processing.

This Special Issue welcomes contributions from all Earth science domains on the description, evolution, and solutions (or proposed solutions) of the aforementioned challenges.

Dr. Kwo-Sen Kuo
Dr. Rahul Ramachandran
Prof. Dr. Morris Riedel
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

machine learning
big data
earth science datasets
scalable parallel processing
data management challenges
data mining
data fusion
cloud computing pipelines
data augmentation techniques
high-performance computing
data processing challenges
in-memory processing
deep learning
apache hadoop/spark stacks

Potential contributors you would like to invite to submit:

processing approaches for large quantities of earth science datasets
scaling up high-performance computing machine learning algorithms
processing pipelines using cloud computing for earth observation data
innovative deep learning networks for earth science data analysis
harnessing open source frameworks of hadoop/spark ecosystem

Benefits of Publishing in a Special Issue

Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (6 papers)

Research

18 pages, 8470 KB

Open Access Article

Advanced Elastic and Reservoir Properties Prediction through Generative Adversarial Network

by Muhammad Anwar Ishak, Abdul Halim Abdul Latiff, Eric Tatt Wei Ho, Muhammad Izzuljad Ahmad Fuad, Nian Wei Tan, Muhammad Sajid and Emad Elsebakhi

Appl. Sci. 2023, 13(10), 6311; https://doi.org/10.3390/app13106311 - 22 May 2023

Cited by 4 | Viewed by 2182

Abstract

The prediction of subsurface properties such as velocity, density, porosity, and water saturation has been the main focus of petroleum geosciences. Advanced methods such as Full Waveform Inversion (FWI), Joint Migration Inversion (JMI) and ML-Rock Physics are able to produce better predictions than their predecessors, but they still require tedious manual interpretation that is prone to human error. The research on these methods remains open as they suffer from technical limitations. As computing resources are becoming cheaper, the use of a single deep-generative adversarial network is feasible in predicting all these properties in a completely data-driven manner. In our proposed method of multiscale pix2pix applied to SEG SEAM salt data, we have managed to map from one input, which is seismic post-stack data, to several outputs of reservoir and elastic properties such as porosity, velocity, and density by using only one trained model and without having to manually interpret or pre-process the input data. With 90% accuracy of the results in the synthetic data testing, the method is worthy of being explored by the petroleum geoscience fraternity.

(This article belongs to the Special Issue Big Data and Machine Learning in Earth Sciences)

17 pages, 9509 KB

Open Access Article

A Prediction Method for Height of Water Flowing Fractured Zone Based on Sparrow Search Algorithm–Elman Neural Network in Northwest Mining Area

by Xicai Gao, Shuai Liu, Tengfei Ma, Cheng Zhao, Xichen Zhang, Huan Xia and Jianhui Yin

Appl. Sci. 2023, 13(2), 1162; https://doi.org/10.3390/app13021162 - 15 Jan 2023

Cited by 12 | Viewed by 1970

Abstract

The main Jurassic coal seams of the Ordos Basin in the northwest mining area have special hosting conditions and complex hydrogeological conditions, and high-intensity mining of these coal seams is likely to cause groundwater loss and negative effects on the surface ecological environment. The research was aimed at predicting the height of the water-flowing fractured zone (WFFZ) in high-intensity coal mining in that area and at providing guidance for avoiding water inrush accidents and realizing damage reduction mining during the actual mining procedure of the coal mine. In this study, 18 samples of the measured height of WFFZ in Jurassic coal seams were systematically collected. In the mining method, the ratio of the thickness of the hard rock to the thickness of the soft rock in the bedrock, buried depth, mining height, and working face length were selected as the input vectors, the sparrow search algorithm (SSA) was applied to iteratively optimize the weights and thresholds of the Elman neural network (ENN), and an SSA-Elman neural network model was constructed. The results demonstrate that the improved SSA-Elman neural network model has higher accuracy in predicting the height of the WFFZ compared with traditional prediction algorithms. The results of this study help guide damage-reducing, water-preserving mining of the middle-deep buried Jurassic coal seams in the northwest mining areas.

(This article belongs to the Special Issue Big Data and Machine Learning in Earth Sciences)

23 pages, 3144 KB

Open Access Article

Using Artificial Intelligence Techniques to Predict Intrinsic Compressibility Characteristic of Clay

by Samuel J. Abbey, Eyo U. Eyo and Colin A. Booth

Appl. Sci. 2022, 12(19), 9940; https://doi.org/10.3390/app12199940 - 2 Oct 2022

Viewed by 2491

Abstract

Reconstituted clays have often provided the basis for the interpretation and modelling of the properties of natural clays. The term “intrinsic” was introduced to describe a clay remoulded or reconstituted at moisture content up to 1.5 times its liquid limit and consolidated one-dimensionally. In order to circumvent the difficulties of measuring an intrinsic constant called “intrinsic compressibility index” (C*_c), a machine learning (ML) approach using traditional non-parametric tree-based and meta-heuristic ensembles was adopted in this study. Results indicated that tree ensembles, namely random decision forest (RDF) and boosted decision tree (BDT), performed better in C*_c prediction (average R² of 0.84 and root mean square error, RMSE, of 0.51) compared to stand-alone models. However, the models’ hyperparameters combined meta-heuristically produced the highest accuracy (average R² of 0.90 and RMSE of 0.34). The greatest capacity to distinguish between positive and negative soil classes (average accuracy of 0.95, precision and recall of 0.86) was demonstrated by meta-ensembles in multinomial classification.

(This article belongs to the Special Issue Big Data and Machine Learning in Earth Sciences)

15 pages, 6300 KB

Open Access Article

Enhancing Channelized Feature Interpretability Using Deep Learning Predictive Modeling

by Salbiah Mad Sahad, Nian Wei Tan, Muhammad Sajid, Ernest Austin Jones, Jr. and Abdul Halim Abdul Latiff

Appl. Sci. 2022, 12(18), 9032; https://doi.org/10.3390/app12189032 - 8 Sep 2022

Cited by 4 | Viewed by 2024

Abstract

Automating geobodies using insufficient labeled training data as input for structural prediction may result in missing important features and a possibility of overfitting, leading to low accuracy. We adopt a deep learning (DL) predictive modeling scheme to facilitate detection of channelized features based on classified seismic attributes (X) and different ground truth scenarios (y), to imitate actual human interpreters’ tasks. In this approach, a diverse augmentation method was applied to increase the accuracy of the model after we were satisfied with the refined annotated ground truth dataset. We evaluated the effect of dropout as a training regularizer and facies’ spatial representation towards optimized prediction results, apart from conventional hyperparameter tuning. From our findings, increasing batch size helps speed up training and improve performance stability. Finally, we demonstrate that the designed Convolutional Neural Network (CNN) is capable of learning channelized variation from complex deepwater settings in a fluvial-dominated depositional environment while producing outstanding mean Intersection over Union (IoU) (95%) despite utilizing 6.4% of the overall dataset and avoiding overfitting possibilities.

(This article belongs to the Special Issue Big Data and Machine Learning in Earth Sciences)

17 pages, 6518 KB

Open Access Article

Improving Forecast Accuracy with an Auto Machine Learning Post-Correction Technique in Northern Xinjiang

by Junjian Liu, Hailiang Zhang, Huoqing Li and Ali Mamtimin

Appl. Sci. 2021, 11(17), 7931; https://doi.org/10.3390/app11177931 - 27 Aug 2021

Cited by 1 | Viewed by 3389

Abstract

Reliable meteorological forecasts of temperature and relative humidity are critically important to take necessary measures to avoid potential damage and losses. An operational meteorological forecast model based on the Weather Research and Forecast (WRF) model has been built in Xinjiang. Numerical forecasts usually have significant uncertainties and errors due to imperfections in the models themselves. In this study, a straightforward automated machine learning (AutoML) approach has been developed to post-process the raw forecasts of the WRF model. The method was implemented and evaluated to post-process forecasts from 13 stations in northern Xinjiang. The post-processed temperature forecasts were significantly improved from the raw forecasts, with average RMSE values in the 13 stations decreasing from 3.24 °C to 2.34 °C, a large margin of 28%. As for relative humidity, the mean RMSE at 13 stations decreased from 19.54% to 11.54%, a decrease of 41%. Meanwhile, biases were also significantly decreased, with average ME values being reduced from around 2 °C to ~0.33 °C for temperature and improved from −15.6% to ~0% for relative humidity. Moreover, forecast performance values after post-correction became much closer to each other than raw forecast performance values, improving forecast applicability at regional scales.

(This article belongs to the Special Issue Big Data and Machine Learning in Earth Sciences)

11 pages, 2038 KB

Open Access Article

Deep Learning Applications in Geosciences: Insights into Ichnological Analysis

by Korhan Ayranci, Isa E. Yildirim, Umair bin Waheed and James A. MacEachern

Appl. Sci. 2021, 11(16), 7736; https://doi.org/10.3390/app11167736 - 22 Aug 2021

Cited by 12 | Viewed by 4418

Abstract

Ichnological analysis, particularly assessing bioturbation index, provides critical parameters for characterizing many oil and gas reservoirs. It provides information on reservoir quality, paleodepositional conditions, redox conditions, and more. However, accurately characterizing ichnological characteristics requires long hours of training and practice, and many marine or marginal marine reservoirs require this specialized expertise. This adds more load to geoscientists and may cause distraction, errors, and bias, particularly when continuously logging long sedimentary successions. In order to alleviate this issue, we propose an automated technique to determine the bioturbation index in cores and outcrops by harnessing the capabilities of deep convolutional neural networks (DCNNs) as image classifiers. In order to find a fast and robust solution, we utilize ideas from deep learning. We compiled and labeled a large data set (1303 images) composed of images spanning the full range (BI 0–6) of bioturbation indices. We divided these images into groups based on their bioturbation indices in order to prepare training data for the DCNN. Finally, we analyzed the trained DCNN model on images and obtained high classification accuracies. This is a pioneering work in the field of ichnological analysis, as the current practice is to perform classification tasks manually by experts in the field.

(This article belongs to the Special Issue Big Data and Machine Learning in Earth Sciences)
