1. Introduction
Geoscience, as an interdisciplinary field, is dedicated to revealing the operational mechanisms and evolutionary patterns of the Earth system. It is essential to our understanding of the Earth, including its life, resources, and environment. In recent years, the rapid development of high-performance computing and artificial intelligence technologies has presented unparalleled opportunities and challenges to geoscience research. This book explores how research in high-performance computing and artificial intelligence advances our understanding of the geosciences.
High-performance computing provides powerful computational capabilities for geoscience research. The Earth system encompasses vast amounts of data and intricate inter-relationships, including atmospheric dynamics, seismic activity, and ocean circulation, among others. Traditional data processing and modeling methods often struggle to meet the demands of these complex systems. However, the development of high-performance computing technologies has enabled scientists to leverage tools such as parallel computing and supercomputers to process large-scale datasets and conduct more precise simulations and predictions. Through high-performance computing, scientists can enhance their understanding of the dynamical processes within the Earth system, improving the accuracy and timeliness of weather forecasting, climate simulations, earthquake early warning systems, and more.
Artificial intelligence technology has introduced novel ideas and methodologies into geoscience research. Geoscience investigations often require extracting valuable information from extensive datasets to reveal the inherent patterns within the Earth system. Machine learning and deep learning algorithms can process intricate data and discern patterns. They autonomously learn and extract features from massive datasets, helping scientists uncover hidden patterns and trends in the data. For instance, in climate change research, artificial intelligence can analyze climate models and observational data to offer more precise predictions and evaluations. Moreover, artificial intelligence has practical applications in geological exploration and resource surveys, where it enhances the efficiency and accuracy of resource exploration.
2. Contributions
We thank all contributors and are delighted to introduce the eleven high-quality research papers selected for this book.
Qi and colleagues [1] applied high-performance computing to flood behavior modeling, using the two-dimensional Saint-Venant equations as an example. The equations were discretized using finite difference methods with the explicit leapfrog scheme, considering initial and boundary conditions. They employed MPI, OpenMP, Pthreads, and OpenCL to achieve large-scale heterogeneous parallelism and optimized algorithm performance through computation/communication overlap, workgroup optimization, and local memory optimization. Ultimately, their work yielded a well-performing, large-scale, non-uniform parallel solution for the two-dimensional Saint-Venant equations.
Wang and colleagues [2] studied the ZTEM (Z-axis Tipper Electromagnetic Method) electromagnetic field detection method. They parallelized the inversion algorithm for two-dimensional ZTEM using MPI. Compared to the serial algorithm, the parallel algorithm achieved speedup ratios ranging from 1.74 to 3.19 when the number of processes ranged from three to six.
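To put the reported speedups in context, parallel efficiency (speedup divided by the number of processes) can be computed as in the minimal sketch below. The pairing of 1.74 with three processes and 3.19 with six is an assumption made here for illustration; the summary above only gives the two ranges.

```python
def parallel_efficiency(speedup: float, n_procs: int) -> float:
    """Return parallel efficiency, defined as speedup / number of processes."""
    if n_procs <= 0:
        raise ValueError("n_procs must be positive")
    return speedup / n_procs


# Assumed pairing (not stated explicitly in the text):
# lowest speedup at three processes, highest at six.
reported = {3: 1.74, 6: 3.19}

efficiencies = {p: parallel_efficiency(s, p) for p, s in reported.items()}

for n_procs, eff in efficiencies.items():
    print(f"{n_procs} processes: efficiency {eff:.2f}")
```

Under that assumption, efficiency stays a little above one half across the range, which is a common pattern when communication costs grow with the process count.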
Hao and colleagues [3] conducted a series of parallel optimizations on the LASG/IAP Climate System Ocean Model (LICOM, version 2.1), a high-resolution ocean model independently developed by the Institute of Atmospheric Physics, Chinese Academy of Sciences. The optimizations targeted load imbalance, communication, and loops. Additionally, they employed hybrid parallelization using MPI and OpenMP, as well as asynchronous parallel I/O. The optimized version of LICOM 2.1 ran more than twice as fast as the original version. In large-scale parallel simulations, the optimized version of LICOM scaled up to 245,760 processor cores and resolved the wall time issue during the time integration process.
Yang and coworkers [4] modeled site-level gross primary productivity (GPP) using the GeoMAN model, which incorporates spatio-temporal features and external environmental factors, to predict GPP on the Tibetan Plateau. They evaluated four models (random forest (RF), support vector machine (SVM), Deep Belief Network (DBN), and GeoMAN) on GPP prediction for nine flux observation sites on the Tibetan Plateau. The GeoMAN model outperformed the other models. These findings improve our understanding of how well deep learning models can predict GPP while remaining consistent with the fundamental knowledge of related fields.
Wang and coworkers [5] proposed a time series prediction model for landslide displacements using mean-based low-rank autoregressive tensor completion (MLATC). They first analyzed the reasons for missing landslide displacement data and designed corresponding datasets with missing values. Then, based on the characteristics and internal correlation of landslide displacement monitoring data, they described how the mean-based low-rank tensor completion prediction model is constructed. Finally, they used the proposed method to complete and predict both randomly and non-randomly missing landslide displacement data. The results of the model were consistent with the original monitoring data and showed good performance in completing and predicting landslide displacement, providing valuable insights for processing missing data and predicting landslide displacement.
Cao and coworkers [6] proposed a new method called Dual Encoder Transform (DualET) for the short-term prediction of photovoltaic (PV) power. The DualET model contained wavelet transform and sequence decomposition blocks for the extraction of information features from image and sequence data, respectively, to improve the correlation of spatial and temporal features. In addition, they proposed a cross-domain attention module to learn the correlation between temporal features and cloud information, and then modified the attention module using alternate forms and Fourier transforms to improve its performance. The model was evaluated on real-world datasets consisting of PV plant data and satellite images, and it outperformed other models in the prediction of short-term PV power generation.
Xu and coworkers [7] used the sliding window method and gray relational analysis to extract features from multi-source real-time monitoring data of landslides in Lishan County, Hunan Province, China. They applied the K-means algorithm with particle swarm optimization for clustering and the Apriori algorithm to mine strong association rules between the high-speed deformation process of the landslide and rainfall features. This approach enabled them to identify short-term deformation patterns and precursors of disasters. They found that the probability of high-speed deformation of this landslide exceeded 80% when rainfall occurred within 24 h and the accumulated rainfall over 7 days was greater than 130.60 mm. Using data mining to extract short-term deformation patterns of landslides in this way can improve the accuracy and reliability of early warning systems.
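The "probability" attached to such a rule corresponds to the confidence of an association rule, one of the two quantities (with support) that Apriori-style mining thresholds on. The sketch below computes both for a single rule on a small synthetic set of records; the item names and the ten records are invented purely for illustration and are not data from the study.

```python
def support(records, itemset):
    """Fraction of records that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= record for record in records) / len(records)


def confidence(records, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    support_antecedent = support(records, antecedent)
    if support_antecedent == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(records, both) / support_antecedent


# Hypothetical item labels (not taken from the paper).
R24 = "rain_within_24h"
R7D = "rain_7d_above_130.60mm"
FAST = "high_speed_deformation"

# Synthetic monitoring windows, one set of observed items per window.
records = [
    {R24, R7D, FAST}, {R24, R7D, FAST}, {R24, R7D, FAST}, {R24, R7D, FAST},
    {R24, R7D}, {R24}, {R7D}, {FAST}, set(), {R24, FAST},
]

rule_support = support(records, {R24, R7D, FAST})
rule_confidence = confidence(records, {R24, R7D}, {FAST})
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

A full Apriori implementation would additionally enumerate candidate itemsets level by level and prune those below a minimum support; only the scoring step is shown here.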
Du and coworkers [8] analyzed the potential of unsupervised machine learning methods for submarine landslide prediction and compared the performance of three unsupervised models (K-means, spectral clustering, and hierarchical clustering) in modeling landslide susceptibility. They selected nine sets of geological factors as input parameters, which were extracted through field investigations. To estimate the susceptibility of submarine landslides, all input factors were grouped into three to four clusters based on data characteristics and environmental variables. The performance of the models was evaluated using internal indicators (the Calinski–Harabasz index, silhouette index, and Davies–Bouldin index) and external indicators (existing landslide distribution, hydrodynamic distribution, and liquefaction distribution) to verify model fit and accuracy. The results showed that all three models predicted submarine landslides well. Spectral clustering was found to be particularly effective in capturing the environmental geological parameters.
Yang and coworkers [9] proposed an automatic landslide identification method. Their approach combined deep learning with landslide extraction from remote sensing images, using a semantic segmentation model to automate the landslide recognition process. They evaluated the model’s performance using metrics from the semantic segmentation task and tested three popular semantic segmentation models (U-Net, DeepLabv3+, and PSPNet) with different backbone networks. The best recognition accuracy, 91.18% mIoU, was achieved by PSPNet with the classification network ResNet50 as the backbone, indicating that the deep learning method is feasible and effective for landslide recognition.
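For readers unfamiliar with the metric, mean intersection over union (mIoU) averages the per-class ratio of correctly labeled pixels to the union of predicted and true pixels for that class. The following is a minimal sketch on flattened label masks; the two-class toy masks (0 for background, 1 for landslide) are assumptions for illustration only.

```python
def mean_iou(y_true, y_pred, n_classes):
    """Mean IoU over classes that appear in either mask (flattened labels)."""
    intersection = [0] * n_classes
    union = [0] * n_classes
    for t, p in zip(y_true, y_pred):
        if t == p:
            intersection[t] += 1
            union[t] += 1
        else:
            # A mismatch enlarges the union of both classes involved.
            union[t] += 1
            union[p] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    if not ious:
        raise ValueError("no labeled pixels to evaluate")
    return sum(ious) / len(ious)


# Toy flattened masks: 0 = background, 1 = landslide (hypothetical values).
truth = [0, 0, 0, 0, 1, 1, 1, 1]
prediction = [0, 0, 0, 1, 1, 1, 1, 0]

score = mean_iou(truth, prediction, n_classes=2)
print(f"mIoU = {score:.2f}")
```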
Huang and coworkers [10] proposed a named entity recognition method for geological news (GNNER) based on the Bidirectional Encoder Representations from Transformers (BERT) pre-trained language model. The approach addressed drawbacks of traditional word vectors, including their inability to represent contextual semantics effectively and their single extraction effect. It can also aid the construction of knowledge graphs of geological news. The method embeds the words of geological news text using a pre-trained BERT model, so that the word vectors are obtained dynamically and used as input for the model. Next, the word vectors are fed into a bidirectional long short-term memory (BiLSTM) model for further training to obtain contextual features. Finally, the model uses conditional random field (CRF) sequence decoding to extract six entity types. In experiments on the constructed Chinese geological news dataset, the model achieved an average F1 score of 0.839 and recognized entities in geological news more effectively.
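The F1 score used to evaluate such models is the harmonic mean of precision and recall. A minimal entity-level sketch is shown below, where an entity counts as correct only if its span and type both match exactly; the span offsets and the type labels `LOC`, `ORG`, and `TIME` are hypothetical, since the six entity types are not named above.

```python
def entity_f1(gold, predicted):
    """Entity-level F1 with exact matching on (start, end, type) tuples."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations for one sentence: (start, end, entity_type).
gold_entities = {(0, 2, "LOC"), (5, 7, "ORG"), (9, 10, "TIME"), (12, 15, "LOC")}
predicted_entities = {(0, 2, "LOC"), (5, 7, "ORG"), (9, 11, "TIME")}

f1 = entity_f1(gold_entities, predicted_entities)
print(f"F1 = {f1:.3f}")
```

Here two of three predictions are correct (precision 2/3) and two of four gold entities are found (recall 1/2), so F1 is 4/7. An average F1 such as the one reported would typically be taken over entity types or test sentences, a detail not specified in this summary.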
Zhang and coworkers [11] proposed an efficient deep-learning-based mineral identification method, which addressed the limitations of traditional identification methods that rely heavily on the expertise of the identifier and on external instruments. The accuracy of existing identification methods is often affected by various factors, including Mohs hardness, color, picture scale, and especially light intensity. Deep-learning-based mineral recognition provides a new solution to this problem, not only saving labor costs but also reducing recognition errors. The authors used a luminance equalization algorithm to reduce the impact of light intensity on recognition accuracy. They first proposed a new algorithm combining histogram equalization (HE) and Laplace’s algorithm, then used it to process the luminance of the samples, and finally used the YOLOv5 model to recognize the samples, yielding a deep learning mineral recognition method based on luminance equalization.
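As a rough illustration of the histogram equalization component only, the sketch below remaps 8-bit luminance values through their cumulative distribution so that the output spans the full intensity range. It is a textbook form of HE on a flat list of pixel values, not the authors' combined HE and Laplace algorithm, and the sample values are invented.

```python
def equalize_histogram(pixels, levels=256):
    """Classic histogram equalization for integer luminance values."""
    if not pixels:
        return []
    # Histogram of luminance values.
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    # Cumulative distribution of the histogram.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        return list(pixels)  # constant image: nothing to stretch
    scale = (levels - 1) / (total - cdf_min)
    return [round((cdf[value] - cdf_min) * scale) for value in pixels]


# Hypothetical low-contrast luminance samples clustered in a narrow band.
dim_pixels = [50, 50, 60, 70, 80]
equalized = equalize_histogram(dim_pixels)
print(equalized)
```

In an image pipeline this mapping would normally be applied to the luminance channel only, leaving chrominance untouched, before the samples are passed to the detector.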