Big Data and Machine Learning in Earth Sciences

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Earth Sciences".

Deadline for manuscript submissions: closed (31 October 2022) | Viewed by 16808

Special Issue Editors


Guest Editor
1. BAYESICS, LLC, Bowie, MD, USA
2. Department of Atmospheric & Oceanic Science, University of Maryland, College Park, MD, USA
3. The Information Technology and Systems Center, University of Alabama, Huntsville, AL, USA
4. Goddard Space Flight Center, NASA, Greenbelt, MD, USA
Interests: (professional) geoscience information technology; geo-spatiotemporal data processing/analysis; neural networks; machine learning; artificial intelligence; electromagnetic scattering; radiative transfer; precipitation retrieval; (nonprofessional) modern history; Chinese etymology

Guest Editor
Goddard Space Flight Center, NASA, Greenbelt, MD, USA
Interests: earth science informatics; data science with a focus on the application of novel computational methods and information technology to the acquisition, storage, processing, discovery, interchange, analysis and visualization of Earth science data and information

Guest Editor
Full Professor, University of Iceland, Reykjavik, Iceland & Research Group Leader, Juelich Supercomputing Centre, Forschungszentrum Juelich, Juelich, Germany
Interests: parallel and scalable machine and deep learning; high-performance computing; cloud computing; statistical data mining; remote sensing; Earth observation applications

Special Issue Information

If big data challenges can be summed up as “to cost-effectively scale computation and storage in the face of ever-increasing data volumes and varieties with an ever-escalating demand for velocity”, we posit that these challenges have been present since the dawn of digital computing for Earth science. Our desire for better fidelity from predictions urges us to incorporate into numerical models ever more comprehensive physical interactions with ever more refined intricacy, which compels ever more extensive and expansive observations with ever more detailed focus, further intensifying the challenge. As a result, a second type of challenge arises: the wish to realize the full value of the deluge of data generated by models and observations. Unfortunately, until recently, we have had to rely mostly on human cognitive faculties. Machine learning promises to address this challenge. Since Earth is a complex, nonlinear system rife with processes spanning a broad spectrum of spatiotemporal scales, we can better constrain our hypotheses and direct our investigations when leveraging dissimilar data featuring complementary strengths. Similarly, the performance and, especially, the generalizability of machine learning models improve with increasing volume and variety of training data. Thus, these two types of challenges are linked: machine learning and, more recently, deep learning techniques, which are meant to automate the analysis and interpretation of big data, will not be very effective if the data cannot be wrangled and processed using parallel and scalable methods. We aim for this Special Issue to review the progress and explore the prospects of addressing these two interconnected challenges in the context of their history.

Dear Colleagues,

We Earth scientists are certainly no strangers to the challenges of big data. However, improvements in model and observation resolutions, the introduction of exotic simulation grids, and the employment of novel observation technologies and strategies have combined to make these challenges more acute, especially with the end of Moore’s law approaching. Thus, scalable parallel processing has attained paramount importance in addressing these challenges. The most challenging step in optimizing scalable parallel processing is perhaps data preparation, as attested by the 80/20 rule: data analysts or scientists devote only 20% of their time to actual analysis, with the remaining 80% spent preparing the data for analysis.

Moreover, even if we were able to process the volume and variety of our big data in a timely fashion in the traditional manner, it would not be sufficient: if we rely purely on human effort, the analysis and interpretation of the processed results may still overwhelm us. We need help from artificial intelligence and machine learning (AIML), which have advanced remarkably in recent years, often achieving performance comparable to or even better than that of humans in specific or less advanced cognitive functions. AIML techniques are usually computationally intensive, especially in their training/learning stage, again accentuating the importance of scalable parallel processing.

This Special Issue welcomes contributions from all Earth science domains on the description, evolution, and solutions (or proposed solutions) of the aforementioned challenges.

Dr. Kwo-Sen Kuo
Dr. Rahul Ramachandran
Prof. Dr. Morris Riedel
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • machine learning
  • big data
  • Earth science datasets
  • scalable parallel processing
  • data management challenges
  • data mining
  • data fusion
  • cloud computing pipelines
  • data augmentation techniques
  • high-performance computing
  • data processing challenges
  • in-memory processing
  • deep learning
  • Apache Hadoop/Spark stacks

Potential contributors you would like to invite to submit:

  • processing approaches for large quantities of Earth science datasets
  • scaling up high-performance computing machine learning algorithms
  • processing pipelines using cloud computing for Earth observation data
  • innovative deep learning networks for Earth science data analysis
  • harnessing open source frameworks of the Hadoop/Spark ecosystem

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (6 papers)

Research

18 pages, 8470 KiB  
Article
Advanced Elastic and Reservoir Properties Prediction through Generative Adversarial Network
by Muhammad Anwar Ishak, Abdul Halim Abdul Latiff, Eric Tatt Wei Ho, Muhammad Izzuljad Ahmad Fuad, Nian Wei Tan, Muhammad Sajid and Emad Elsebakhi
Appl. Sci. 2023, 13(10), 6311; https://doi.org/10.3390/app13106311 - 22 May 2023
Cited by 4 | Viewed by 1841
Abstract
The prediction of subsurface properties such as velocity, density, porosity, and water saturation has been the main focus of petroleum geosciences. Advanced methods such as Full Waveform Inversion (FWI), Joint Migration Inversion (JMI) and ML-Rock Physics are able to produce better predictions than their predecessors, but they still require tedious manual interpretation that is prone to human error. The research on these methods remains open as they suffer from technical limitations. As computing resources are becoming cheaper, the use of a single deep generative adversarial network is feasible in predicting all these properties in a completely data-driven manner. In our proposed method of multiscale pix2pix applied to SEG SEAM salt data, we have managed to map from one input, which is seismic post-stack data, to several outputs of reservoir and elastic properties such as porosity, velocity, and density by using only one trained model and without having to manually interpret or pre-process the input data. With 90% accuracy of the results in the synthetic data testing, the method is worthy of being explored by the petroleum geoscience fraternity.
(This article belongs to the Special Issue Big Data and Machine Learning in Earth Sciences)

17 pages, 9509 KiB  
Article
A Prediction Method for Height of Water Flowing Fractured Zone Based on Sparrow Search Algorithm–Elman Neural Network in Northwest Mining Area
by Xicai Gao, Shuai Liu, Tengfei Ma, Cheng Zhao, Xichen Zhang, Huan Xia and Jianhui Yin
Appl. Sci. 2023, 13(2), 1162; https://doi.org/10.3390/app13021162 - 15 Jan 2023
Cited by 9 | Viewed by 1723
Abstract
The main Jurassic coal seams of the Ordos Basin in the northwest mining area have special hosting conditions and complex hydrogeological conditions, and high-intensity mining of these coal seams is likely to cause groundwater loss and negative effects on the surface ecological environment. The research aimed to predict the height of the water-flowing fractured zone (WFFZ) in high-intensity coal mining in that area and to provide guidance for avoiding water inrush accidents and realizing damage-reduction mining during the actual mining procedure of the coal mine. In this study, 18 samples of the measured height of the WFFZ in Jurassic coal seams were systematically collected. In the mining method, the ratio of the thickness of the hard rock to the thickness of the soft rock in the bedrock, buried depth, mining height, and working face length were selected as the input vectors; the sparrow search algorithm (SSA) was then applied to iteratively optimize the weights and thresholds of the Elman neural network (ENN), yielding an SSA-Elman neural network model. The results demonstrate that the improved SSA-Elman neural network model has higher accuracy in predicting the height of the WFFZ compared with traditional prediction algorithms. The results of this study help guide damage-reducing, water-preserving mining of the middle-deep buried Jurassic coal seams in the northwest mining areas.

23 pages, 3144 KiB  
Article
Using Artificial Intelligence Techniques to Predict Intrinsic Compressibility Characteristic of Clay
by Samuel J. Abbey, Eyo U. Eyo and Colin A. Booth
Appl. Sci. 2022, 12(19), 9940; https://doi.org/10.3390/app12199940 - 2 Oct 2022
Viewed by 2201
Abstract
Reconstituted clays have often provided the basis for the interpretation and modelling of the properties of natural clays. The term “intrinsic” was introduced to describe a clay remoulded or reconstituted at moisture content up to 1.5 times its liquid limit and consolidated one-dimensionally. In order to circumvent the difficulties of measuring an intrinsic constant called “intrinsic compressibility index” (C*c), a machine learning (ML) approach using traditional non-parametric tree-based and meta-heuristic ensembles was adopted in this study. Results indicated that tree ensembles, namely random decision forest (RDF) and boosted decision tree (BDT), performed better in C*c prediction (average R² of 0.84 and root mean square error, RMSE, of 0.51) compared to stand-alone models. However, models’ hyperparameters combined meta-heuristically produced the highest accuracy (average R² of 0.90 and RMSE of 0.34). The greatest capacity to distinguish between positive and negative soil classes (average accuracy of 0.95, precision and recall of 0.86) was demonstrated by meta-ensembles in multinomial classification.

15 pages, 6300 KiB  
Article
Enhancing Channelized Feature Interpretability Using Deep Learning Predictive Modeling
by Salbiah Mad Sahad, Nian Wei Tan, Muhammad Sajid, Ernest Austin Jones, Jr. and Abdul Halim Abdul Latiff
Appl. Sci. 2022, 12(18), 9032; https://doi.org/10.3390/app12189032 - 8 Sep 2022
Cited by 4 | Viewed by 1766
Abstract
Automating geobodies using insufficient labeled training data as input for structural prediction may result in missing important features and a possibility of overfitting, leading to low accuracy. We adopt a deep learning (DL) predictive modeling scheme to ease the detection of channelized features based on classified seismic attributes (X) and different ground truth scenarios (y), to imitate actual human interpreters’ tasks. In this approach, a diverse augmentation method was applied to increase the accuracy of the model after we were satisfied with the refined annotated ground truth dataset. We evaluated the effect of dropout as a training regularizer and facies’ spatial representation towards optimized prediction results, apart from conventional hyperparameter tuning. From our findings, increasing batch size helps speed up training and improve performance stability. Finally, we demonstrate that the designed Convolutional Neural Network (CNN) is capable of learning channelized variation from complex deepwater settings in a fluvial-dominated depositional environment while producing an outstanding mean Intersection over Union (IoU) (95%) despite utilizing only 6.4% of the overall dataset and avoiding overfitting possibilities.
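For readers unfamiliar with the mean IoU metric quoted in this abstract, the following is a minimal sketch of how it is commonly computed for label maps. It is not the authors' code: the function names `iou` and `mean_iou`, the flattened-list input format, and the toy labels are all assumptions made for illustration.

```python
from typing import List, Sequence


def iou(pred: Sequence[int], truth: Sequence[int], cls: int) -> float:
    """Intersection over Union for one class over flattened label maps."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    # Pixels where both maps carry the class (intersection).
    inter = sum(1 for p, t in zip(pred, truth) if p == cls and t == cls)
    # Pixels where either map carries the class (union).
    union = sum(1 for p, t in zip(pred, truth) if p == cls or t == cls)
    return inter / union if union else float("nan")


def mean_iou(pred: Sequence[int], truth: Sequence[int], classes: Sequence[int]) -> float:
    """Average per-class IoU, skipping classes absent from both maps."""
    scores: List[float] = [iou(pred, truth, c) for c in classes]
    valid = [s for s in scores if s == s]  # NaN is the only value not equal to itself
    if not valid:
        raise ValueError("no class is present in either map")
    return sum(valid) / len(valid)


# Hypothetical toy labels: 1 = channel, 0 = background.
predicted = [0, 0, 1, 1, 1, 0]
reference = [0, 1, 1, 1, 0, 0]
score = mean_iou(predicted, reference, classes=[0, 1])
print(f"mean IoU = {score:.2f}")
```

In this toy case each class has two overlapping pixels out of four in the union, so both per-class scores are 0.5. Averaging per class rather than per pixel keeps a rare class such as a narrow channel from being swamped by the background.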

17 pages, 6518 KiB  
Article
Improving Forecast Accuracy with an Auto Machine Learning Post-Correction Technique in Northern Xinjiang
by Junjian Liu, Hailiang Zhang, Huoqing Li and Ali Mamtimin
Appl. Sci. 2021, 11(17), 7931; https://doi.org/10.3390/app11177931 - 27 Aug 2021
Cited by 1 | Viewed by 3063
Abstract
Reliable meteorological forecasts of temperature and relative humidity are critically important for taking the measures needed to avoid potential damage and losses. An operational meteorological forecast model based on the Weather Research and Forecast (WRF) model has been built in Xinjiang. Numerical forecasts usually have significant uncertainties and errors due to imperfections in the models themselves. In this study, a straightforward automated machine learning (AutoML) approach has been developed to post-process the raw forecasts of the WRF model. The method was implemented and evaluated to post-process forecasts from 13 stations in northern Xinjiang. The post-processed temperature forecasts were significantly improved from the raw forecasts, with the average RMSE across the 13 stations decreasing from 3.24 °C to 2.34 °C, a reduction of 28%. As for relative humidity, the mean RMSE at the 13 stations decreased from 19.54% to 11.54%, a reduction of 41%. Meanwhile, biases were also significantly decreased, with average ME values being reduced from around 2 °C to ~0.33 °C for temperature and improved from −15.6% to ~0% for relative humidity. Moreover, forecast performance values after post-correction became much closer to each other than raw forecast performance values, improving forecast applicability at regional scales.
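The percentage reductions quoted in this abstract can be checked from the stated RMSE values with a few lines of arithmetic. The sketch below is illustrative only: the helper name `percent_decrease` is an assumption, and the only inputs are the four RMSE figures given in the abstract.

```python
def percent_decrease(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, expressed in percent."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (before - after) / before * 100.0


# RMSE values as stated in the abstract (raw WRF forecast -> post-processed).
temperature_drop = percent_decrease(3.24, 2.34)   # RMSE in degrees Celsius
humidity_drop = percent_decrease(19.54, 11.54)    # RMSE in percent relative humidity

print(f"temperature RMSE reduction: {temperature_drop:.0f}%")
print(f"relative humidity RMSE reduction: {humidity_drop:.0f}%")
```

Both results round to the figures quoted in the abstract (28% and 41%). Note that the humidity RMSE is itself measured in percentage points, so the 41% is a relative reduction, not a difference of 8 percentage points.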

11 pages, 2038 KiB  
Article
Deep Learning Applications in Geosciences: Insights into Ichnological Analysis
by Korhan Ayranci, Isa E. Yildirim, Umair bin Waheed and James A. MacEachern
Appl. Sci. 2021, 11(16), 7736; https://doi.org/10.3390/app11167736 - 22 Aug 2021
Cited by 8 | Viewed by 3966
Abstract
Ichnological analysis, particularly assessing bioturbation index, provides critical parameters for characterizing many oil and gas reservoirs. It provides information on reservoir quality, paleodepositional conditions, redox conditions, and more. However, accurately characterizing ichnological characteristics requires long hours of training and practice, and many marine or marginal marine reservoirs require this specialized expertise. This adds more load to geoscientists and may cause distraction, errors, and bias, particularly when continuously logging long sedimentary successions. In order to alleviate this issue, we propose an automated technique to determine the bioturbation index in cores and outcrops by harnessing the capabilities of deep convolutional neural networks (DCNNs) as image classifiers. In order to find a fast and robust solution, we utilize ideas from deep learning. We compiled and labeled a large data set (1303 images) composed of images spanning the full range (BI 0–6) of bioturbation indices. We divided these images into groups based on their bioturbation indices in order to prepare training data for the DCNN. Finally, we analyzed the trained DCNN model on images and obtained high classification accuracies. This is a pioneering work in the field of ichnological analysis, as the current practice is to perform classification tasks manually by experts in the field.
