Big Data in Earth Observation: A New Computing Paradigm for Remote Data Analysis

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Environmental Remote Sensing".

Deadline for manuscript submissions: closed (10 December 2021) | Viewed by 30798

Special Issue Editors

Department of Computer Technology and Communications, Polytechnic School of Cáceres, University of Extremadura, 10003 Cáceres, Spain
Interests: hyperspectral image analysis; machine (deep) learning; neural networks; multisensor data fusion; high performance computing; cloud computing

Guest Editor
Department of Computer Technology and Communications, Polytechnic School of Cáceres, University of Extremadura, avenida de la Universidad s/n, 10003 Cáceres, Spain
Interests: hyperspectral remote sensing; deep learning; Graphics Processing Units (GPUs); High Performance Computing (HPC) techniques
School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
Interests: hyperspectral image processing; remote sensing big data processing; parallel computing; machine learning; cloud computing

Special Issue Information

Dear Colleagues,

Recent advances in Earth Observation (EO) have given remote sensing data, captured by sensors on aerial and satellite platforms, a very important role in a wide range of human activities: the management of the environment and natural resources (including forests, water, and geological and mineralogical resources), the prevention of risks and catastrophes, the planning of urban and rural spaces, and the detection of military objectives and intelligence tasks, among others. This has been fostered by the fact that current EO instruments collect data with higher spatial and spectral resolutions, which makes a detailed characterization of the Earth's surface possible. These instruments provide a large variety of remotely sensed images: from panchromatic and RGB data to multispectral and hyperspectral scenes, from LiDAR and radar to thermal and optical images, and from low to medium, high, and very high spatial resolutions.

For instance, sensors capable of acquiring images with hundreds of spectral bands (called imaging spectrometers) gather large amounts of information for the same area by recording hundreds of measurements in the spectral domain at different wavelengths. This makes it possible "to see what the human eye cannot" and to generate "data cubes," also known as hyperspectral images (HSI), with very large dimensionality. These images permit a very precise characterization of the terrestrial surface. For example, NASA's Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) sensor captures HSI scenes with 224 spectral bands between 0.4 and 2.5 micrometers, at a spatial resolution of about 20 meters per pixel. Such a wealth of spatial and spectral information, despite imposing important computational requirements, has opened new possibilities in many applications, including the detailed characterization of agricultural and urban areas and the monitoring and prevention of natural disasters such as forest fires, oil spills, and other types of chemical pollution.
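To give a feel for the computational requirements mentioned above, the following minimal sketch estimates the uncompressed size of a single data cube. Only the 224-band count comes from the text; the scene dimensions (512 lines by 614 samples) and the 16-bit sample width are illustrative assumptions, and `cube_size_bytes` is a hypothetical helper name.

```python
# Rough storage estimate for one hyperspectral "data cube".
# Only the 224-band count comes from the text; the scene size and the
# sample width below are illustrative assumptions.

BANDS = 224            # AVIRIS spectral bands (from the text)
LINES = 512            # assumed number of image rows
SAMPLES = 614          # assumed number of pixels per row
BYTES_PER_SAMPLE = 2   # assumed 16-bit integer samples


def cube_size_bytes(lines: int, samples: int, bands: int, bytes_per_sample: int) -> int:
    """Return the uncompressed size of a lines x samples x bands cube."""
    return lines * samples * bands * bytes_per_sample


size = cube_size_bytes(LINES, SAMPLES, BANDS, BYTES_PER_SAMPLE)
print(f"{size / 2**20:.1f} MiB per scene")
```

Under these assumptions a single scene already occupies well over a hundred mebibytes, so archives of many scenes quickly reach the volumes that motivate parallel and distributed processing.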

This Special Issue on "Big Data in Earth Observation: A New Computing Paradigm for Remote Data Analysis" aims to bring the latest techniques in high performance computing (HPC) to the development and application of new image processing techniques for the computationally efficient exploitation of remotely sensed scenes from a Big Data point of view. It explores new computationally efficient models for extracting information from huge remote sensing datasets, with particular interest in parallel and distributed techniques based on graphics processing units (GPUs) and grid/cloud computing platforms.

The goal of this Special Issue is to collect the latest and most advanced ideas on new and efficient techniques for extracting information, based on new trends in advanced learning algorithms (including the newest machine learning and deep learning approaches).

Dr. Juan M. Haut
Ms. Mercedes E. Paoletti
Dr. Zebin Wu
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Big data
  • Neural networks
  • Deep learning
  • Cloud computing
  • GPUs
  • Heterogeneous computing
  • Remote sensing
  • Supercomputing
  • Image processing
  • Machine learning
  • High performance computing

Published Papers (7 papers)

Research

19 pages, 4862 KiB  
Article
A Scalable Computing Resources System for Remote Sensing Big Data Processing Using GeoPySpark Based on Spark on K8s
by Jifu Guo, Chunlin Huang and Jinliang Hou
Remote Sens. 2022, 14(3), 521; https://doi.org/10.3390/rs14030521 - 22 Jan 2022
Cited by 9 | Viewed by 2890
Abstract
As a result of Earth observation (EO) entering the era of big data, a significant challenge relating to the storage, analysis, and visualization of a massive amount of remote sensing (RS) data must be addressed. In this paper, we propose a novel scalable computing resources system to achieve high-speed processing of RS big data in a parallel distributed architecture. To reduce data movement among computing nodes, the Hadoop Distributed File System (HDFS) is established on nodes of Kubernetes (K8s), which are also used for computing. In the process of RS data analysis, we innovatively use the tile-oriented programming model instead of the traditional strip-oriented or pixel-oriented approach to better implement parallel computing in a Spark on K8s cluster. A large RS raster layer can be abstracted as a user-defined tile format of any size, so that a whole computing task can be divided into multiple distributed parallel tasks. The computing resources requested by users are assigned immediately in the Spark on K8s cluster by simply configuring and initializing SparkContext through a web-based Jupyter notebook console. Users can easily query, write, or visualize data in any box size from the catalog module in GeoPySpark. In summary, the system proposed in this study provides a distributed scalable resources system that brings together big data storage, parallel computing, and real-time visualization.
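The tile-oriented idea described in this abstract can be illustrated with a minimal sketch. It does not use Spark, Kubernetes, or GeoPySpark; it uses only the Python standard library, and the names `tile_windows`, `tile_sum`, and `parallel_sum`, the tile size, and the per-tile operation (a plain sum) are all hypothetical choices made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator, List, Tuple

Raster = List[List[float]]
Window = Tuple[int, int, int, int]  # (row_start, row_stop, col_start, col_stop)


def tile_windows(rows: int, cols: int, tile: int) -> Iterator[Window]:
    """Yield windows that cover a rows x cols raster with square tiles."""
    for r in range(0, rows, tile):
        for c in range(0, cols, tile):
            # Edge tiles are clipped to the raster extent.
            yield (r, min(r + tile, rows), c, min(c + tile, cols))


def tile_sum(raster: Raster, window: Window) -> float:
    """Per-tile task: here simply the sum of the pixel values in the tile."""
    r0, r1, c0, c1 = window
    return sum(sum(row[c0:c1]) for row in raster[r0:r1])


def parallel_sum(raster: Raster, tile: int, workers: int = 4) -> float:
    """Split the raster into tiles, process them concurrently, merge the results."""
    windows = list(tile_windows(len(raster), len(raster[0]), tile))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda w: tile_sum(raster, w), windows)
        return sum(partials)


# A small 7 x 10 raster whose pixel values are 0, 1, ..., 69.
raster = [[float(r * 10 + c) for c in range(10)] for r in range(7)]
print(parallel_sum(raster, tile=4))
```

Because each tile is processed independently, the same decomposition could in principle be handed to a cluster scheduler instead of a local thread pool; the thread pool here only stands in for that distribution step.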

27 pages, 12880 KiB  
Article
Optimizing Urban LiDAR Flight Path Planning Using a Genetic Algorithm and a Dual Parallel Computing Framework
by Anh Vu Vo, Debra F. Laefer and Jonathan Byrne
Remote Sens. 2021, 13(21), 4437; https://doi.org/10.3390/rs13214437 - 04 Nov 2021
Cited by 6 | Viewed by 2936
Abstract
This paper introduces a genetic algorithm (GA) and a beam tracing algorithm incorporated within a dual parallel computing framework to optimize urban aerial laser scanning (ALS) missions to maximize vertical façade data capture, as needed for many three-dimensional reconstruction and modeling workflows. The optimization employs a low-density point cloud from the site of interest as a spatial representation of the urban scene. The GA is suitable for LiDAR flight path optimization due to its capability of handling open-ended problems that have many solutions. However, GAs require evaluating a very large number of candidates. The use of an initial point cloud allows realistic modeling of the urban environment in the optimization at the cost of high data input volumes. To cope with the computational and data demands, a dual parallel computing framework was devised. The parallel computing framework consists of two layers of parallelization. In the upper layer, multiple evaluators work in parallel and in conjunction with a main multi-threading GA optimizer to perform GA operations and evaluate the flight paths. In the lower layer, to evaluate assigned flight paths, each evaluator distributes its data and computation to multiple executors, which can reside on multiple physical nodes of a distributed-memory computing cluster. In addition to parallelism, the data partitioning on the lower layer allows out-of-core computation. Namely, data partitions are efficiently transferred between disks and memory so that only relevant subsets of data are kept in the main memory. The objective of the proposed method is threefold: (1) search for flight paths that yield the highest numbers of vertical points, (2) create a means to explicitly consider the detailed spatial configuration of urban environments, and (3) assure that the proposed optimization strategy is fast and can scale to large problem sizes. Multiple experiments were conducted and demonstrated the success of the proposed method. Converged results were achieved after dozens of generations within two hours. Two flight paths identified by the GA as the most and the least optimal candidates were deployed in real flight missions. The optimal flight path captured 16% more vertical points than the least optimal one, slightly higher than the 13% predicted. Both layers of parallelization were efficient: 13.1/16 for the lower layer and 3.2/4 for the upper layer. The two complementary layers of parallelization allowed flexible and efficient use of distributed computing resources to reduce the runtime. The scalability of the proposed approach was successfully demonstrated up to a data size of 460 million points. The optimization results were realistic and aligned well with the test flight results.
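The efficiency figures quoted in this abstract can be read as speedup divided by the number of parallel workers. The sketch below assumes that reading, that is, that "13.1/16" means a speedup of 13.1 on 16 executors and "3.2/4" a speedup of 3.2 on 4 evaluators; the abstract does not spell this out, and `parallel_efficiency` is a hypothetical helper.

```python
def parallel_efficiency(speedup: float, workers: int) -> float:
    """Efficiency = speedup / number of parallel workers (1.0 is ideal)."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    return speedup / workers


# Assumed interpretation of the abstract's "13.1/16" and "3.2/4".
lower = parallel_efficiency(13.1, 16)
upper = parallel_efficiency(3.2, 4)
print(f"lower layer: {lower:.0%}, upper layer: {upper:.0%}")
```

Under that interpretation both layers run at roughly 80% of ideal linear scaling.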

18 pages, 45464 KiB  
Article
Efficient and Flexible Aggregation and Distribution of MODIS Atmospheric Products Based on Climate Analytics as a Service Framework
by Jianyu Zheng, Xin Huang, Supriya Sangondimath, Jianwu Wang and Zhibo Zhang
Remote Sens. 2021, 13(17), 3541; https://doi.org/10.3390/rs13173541 - 06 Sep 2021
Cited by 3 | Viewed by 2296
Abstract
MODIS (Moderate Resolution Imaging Spectroradiometer) is a key instrument onboard NASA’s Terra (launched in 1999) and Aqua (launched in 2002) satellite missions as part of the more extensive Earth Observing System (EOS). By measuring the reflection and emission by the Earth-Atmosphere system in 36 spectral bands from the visible to thermal infrared with near-daily global coverage and high spatial resolution (250 m to 1 km at nadir), MODIS is playing a vital role in developing validated, global, interactive Earth system models. MODIS products are processed into three levels, i.e., Level-1 (L1), Level-2 (L2) and Level-3 (L3). To move beyond the current static and “one-size-fits-all” data provision method of MODIS products, in this paper, we propose a service-oriented, flexible, and efficient MODIS aggregation framework. Using this framework, users only need to retrieve aggregated MODIS L3 data tailored to their own requirements, and the aggregation can run in parallel to achieve a speedup. The experiments show that our aggregation results are almost identical to the current MODIS L3 products and that our parallel execution with 8 computing nodes can work 88.63 times faster than a serial code execution on a single node.
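As a rough illustration of what aggregating pixel-level data onto a coarser grid involves, the sketch below averages point samples into regular latitude/longitude cells. It is not the framework's algorithm, which the abstract does not describe; the mean statistic, the 1-degree cell size, and the name `aggregate_to_grid` are assumptions made for illustration.

```python
from collections import defaultdict
from math import floor
from typing import Dict, Iterable, Tuple

Sample = Tuple[float, float, float]  # (latitude, longitude, value)
Cell = Tuple[int, int]               # (lat index, lon index) of a grid cell


def aggregate_to_grid(samples: Iterable[Sample], cell_deg: float = 1.0) -> Dict[Cell, float]:
    """Average pixel-level samples into regular lat/lon grid cells."""
    sums: Dict[Cell, float] = defaultdict(float)
    counts: Dict[Cell, int] = defaultdict(int)
    for lat, lon, value in samples:
        # floor() keeps negative coordinates in the correct cell.
        cell = (floor(lat / cell_deg), floor(lon / cell_deg))
        sums[cell] += value
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


samples = [(10.2, 20.1, 0.4), (10.8, 20.9, 0.6), (-0.5, 20.3, 1.0)]
grid = aggregate_to_grid(samples)
print(grid)
```

Because the running sums and counts of different input files can be computed separately and added together afterwards, this kind of aggregation lends itself to the parallel execution the abstract reports.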

25 pages, 8795 KiB  
Article
Sentinel-1 Big Data Processing with P-SBAS InSAR in the Geohazards Exploitation Platform: An Experiment on Coastal Land Subsidence and Landslides in Italy
by Francesca Cigna and Deodato Tapete
Remote Sens. 2021, 13(5), 885; https://doi.org/10.3390/rs13050885 - 26 Feb 2021
Cited by 51 | Viewed by 7289
Abstract
The growing volume of synthetic aperture radar (SAR) imagery acquired by satellite constellations creates novel opportunities and opens new challenges for interferometric SAR (InSAR) applications to observe Earth’s surface processes and geohazards. In this paper, the Parallel Small BAseline Subset (P-SBAS) advanced InSAR processing chain running on the Geohazards Exploitation Platform (GEP) is trialed to process two unprecedentedly big stacks of Copernicus Sentinel-1 C-band SAR images acquired in 2014–2020 over a coastal study area in southern Italy, including 296 and 283 scenes in ascending and descending mode, respectively. Each stack was processed in the GEP in less than 3 days, from input SAR data retrieval via repositories, up to generation of the output P-SBAS datasets of coherent targets and their displacement histories. Use-cases of long-term monitoring of land subsidence at the Capo Colonna promontory (up to −2.3 cm/year vertical and −1.0 cm/year east–west rates), slow-moving landslides and erosion landforms, and deformation at modern coastal protection infrastructure in the city of Crotone are used to: (i) showcase the type and precision of deformation products output by P-SBAS processing of big data, and the derivable key information to support value-adding and geological interpretation; and (ii) discuss the potential and challenges of big data processing using cloud/grid infrastructure.

21 pages, 6756 KiB  
Article
A Parallel Unmixing-Based Content Retrieval System for Distributed Hyperspectral Imagery Repository on Cloud Computing Platforms
by Peng Zheng, Zebin Wu, Jin Sun, Yi Zhang, Yaoqin Zhu, Yuan Shen, Jiandong Yang, Zhihui Wei and Antonio Plaza
Remote Sens. 2021, 13(2), 176; https://doi.org/10.3390/rs13020176 - 06 Jan 2021
Cited by 16 | Viewed by 2321
Abstract
As the volume of remotely sensed data grows significantly, content-based image retrieval (CBIR) becomes increasingly important, especially for cloud computing platforms that facilitate processing and storing big data in a parallel and distributed way. This paper proposes a novel parallel CBIR system for a hyperspectral image (HSI) repository on cloud computing platforms, guided by unmixed spectral information, i.e., endmembers and their associated fractional abundances, to retrieve hyperspectral scenes. However, existing unmixing methods suffer an extremely high computational burden when extracting meta-data from large-scale HSI data. To address this limitation, we implement a distributed and parallel unmixing method that operates on cloud computing platforms to accelerate the unmixing processing flow. In addition, we implement a global standard distributed HSI repository equipped with a large spectral library in a software-as-a-service mode, providing users with HSI storage, management, and retrieval services through web interfaces. Furthermore, the parallel implementation of unmixing processing is incorporated into the CBIR system to establish the parallel unmixing-based content retrieval system. The performance of our proposed parallel CBIR system was verified in terms of both unmixing efficiency and accuracy.

25 pages, 15744 KiB  
Article
Fine-Tuning Self-Organizing Maps for Sentinel-2 Imagery: Separating Clouds from Bright Surfaces
by Viktoria Kristollari and Vassilia Karathanassi
Remote Sens. 2020, 12(12), 1923; https://doi.org/10.3390/rs12121923 - 14 Jun 2020
Cited by 7 | Viewed by 4082
Abstract
Removal of cloud interference is a crucial step for the exploitation of the spectral information stored in optical satellite images. Several cloud masking approaches have been developed over time, based on direct interpretation of the spectral and temporal properties of clouds through thresholds. The problem has also been tackled by machine learning methods, with artificial neural networks being among the most recent ones. Detection of bright non-cloud objects is one of the most difficult tasks in cloud masking applications since spectral information alone often proves inadequate for their separation from clouds. Scientific attention has recently returned to self-organizing maps (SOMs) because of their unique ability to preserve topological relations, added to the advantages of faster training and more interpretable behavior compared to other types of artificial neural networks. This study evaluated a SOM for cloud masking of Sentinel-2 images and proposed a fine-tuning methodology to separate clouds from bright land areas. The fine-tuning process, which is based on the output of the non-fine-tuned network, first directly locates the neurons that correspond to the misclassified pixels. Then, the incorrect labels of the neurons are altered without applying further training. The fine-tuning method follows a general procedure, so its applicability is broad and not confined to the field of cloud masking. The network was trained on the largest publicly available spectral database for Sentinel-2 cloud masking applications and was tested on a truly independent database of Sentinel-2 cloud masks. It was evaluated both qualitatively and quantitatively, with the interpretation of its behavior through multiple visualization techniques being a main part of the evaluation. It was shown that the fine-tuned SOM successfully recognized the bright non-cloud areas and outperformed the state-of-the-art algorithms Sen2Cor and Fmask, as well as the version that was not fine-tuned.
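The two fine-tuning steps described in this abstract (locate the neurons that correspond to misclassified pixels, then change their labels without retraining) can be sketched as follows. This is a simplified illustration rather than the authors' implementation: the data structures, the name `fine_tune_labels`, and the rule that the last correction wins when several pixels map to the same neuron are all assumptions.

```python
from typing import Dict, List, Sequence


def fine_tune_labels(
    neuron_labels: Sequence[str],
    bmu_of_pixel: Sequence[int],
    misclassified: Dict[int, str],
) -> List[str]:
    """Relabel neurons that are best-matching units of misclassified pixels.

    neuron_labels: current class label of each SOM neuron
    bmu_of_pixel:  index of the best-matching neuron for each pixel
    misclassified: pixel index -> correct class label
    """
    labels = list(neuron_labels)  # copy, so the input labels stay untouched
    for pixel, correct_label in misclassified.items():
        # Only the label changes; no neuron weights are updated.
        labels[bmu_of_pixel[pixel]] = correct_label
    return labels


neuron_labels = ["cloud", "cloud", "land"]
bmu_of_pixel = [0, 1, 1, 2]      # pixels 1 and 2 both map to neuron 1
misclassified = {2: "land"}      # pixel 2 is bright land, not cloud
print(fine_tune_labels(neuron_labels, bmu_of_pixel, misclassified))
```

Note that relabeling a neuron changes the class of every pixel mapped to it (pixel 1 in this toy example), which is why such a step needs checking against reference data.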

21 pages, 5555 KiB  
Article
Estimating Water pH Using Cloud-Based Landsat Images for a New Classification of the Nhecolândia Lakes (Brazilian Pantanal)
by Osvaldo J. R. Pereira, Eder R. Merino, Célia R. Montes, Laurent Barbiero, Ary T. Rezende-Filho, Yves Lucas and Adolpho J. Melfi
Remote Sens. 2020, 12(7), 1090; https://doi.org/10.3390/rs12071090 - 28 Mar 2020
Cited by 20 | Viewed by 6698
Abstract
The Nhecolândia region, located in the southern portion of the Pantanal wetland area, is a unique lacustrine system where tens of thousands of saline-alkaline and freshwater lakes and ponds coexist in close proximity. These lakes are suspected to be a strong source of greenhouse gases (GHGs) to the atmosphere, the water pH being one of the key factors in controlling the biogeochemical functioning and, consequently, production and emission of GHGs in these lakes. Here, we present a new field-validated classification of the Nhecolândia lakes using water pH values estimated from a cloud-based Landsat (5 TM, 7 ETM+, and 8 OLI) 2002–2017 time-series in the Google Earth Engine platform. Calibrated top-of-atmosphere (TOA) reflectance collections with the Fmask method were used to ensure the usage of only cloud-free pixels, resulting in a dataset of 2081 scenes. The pH values were predicted by applying linear multiple regression and symbolic regression based on genetic programming (GP). The regression model presented an R² value of 0.81 and pH values ranging from 4.69 to 11.64. A lake mask was used to extract the predicted pH band that was then classified into three lake classes according to their pH values: freshwater (pH < 8), oligosaline (pH 8–8.9), and saline (pH ≥ 9). Nearly 12,150 lakes were mapped, of which those with saline waters accounted for 7.25%. Finally, a trend surface map was created using the ALOS PRISM Digital Surface Model (DSM) to analyze the correlation between landscape features (topography, connection with the regional drainage system, size, and shape of lakes) and types of lakes. The analysis was in consonance with previous studies that pointed out that saline lakes tend to occur in lower positions compared to freshwater lakes. The results open a relevant perspective for the transfer of locally acquired experimental data to the regional balances of the Nhecolândia lakes.
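The three pH classes listed in this abstract translate directly into a threshold rule. The sketch below is an illustration only: `classify_lake` is a hypothetical name, and because the abstract leaves values between 8.9 and 9 unassigned, it assumes that everything from 8 up to (but not including) 9 counts as oligosaline.

```python
def classify_lake(ph: float) -> str:
    """Map a predicted pH value to one of the three lake classes."""
    if ph < 8.0:
        return "freshwater"
    if ph < 9.0:
        # Assumption: the abstract's "pH 8-8.9" is treated as the interval [8, 9).
        return "oligosaline"
    return "saline"


for ph in (4.69, 8.0, 8.95, 9.0, 11.64):
    print(ph, classify_lake(ph))
```

The sample values include the minimum (4.69) and maximum (11.64) pH reported in the abstract, plus the class boundaries.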