
A Serverless-Based, On-the-Fly Computing Framework for Remote Sensing Image Collection

State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
University of Chinese Academy of Sciences, Beijing 100049, China
State Key Laboratory of Desert and Oasis Ecology, Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences, Urumqi 830011, China
Author to whom correspondence should be addressed.
Remote Sens. 2022, 14(7), 1728;
Submission received: 20 February 2022 / Revised: 26 March 2022 / Accepted: 31 March 2022 / Published: 3 April 2022
(This article belongs to the Special Issue Spatial Data Infrastructures for Big Geospatial Sensing Data)


The rapid growth of remote sensing data calls for new computational models for algorithmic exploration that provide on-demand execution, instant response, and multitenancy. We call this model on-the-fly computing; it reduces the complexity of cloud programming for remote sensing data analysis and benefits from efficient multiplexing. As an advancement of cloud computing, serverless computing makes it possible to realize this on-the-fly computational model. In this study, a concise definition of an on-the-fly computing model for remote sensing data analysis and a corresponding software architecture built on serverless computing commodities are presented. Proof-of-concept experiments suggest that the on-the-fly computing model for remote sensing data analysis can be efficiently implemented as serverless software. The response time is dominated by the tile reading operation and data structure conversion. Under high concurrency, the system can scale to hundreds of instances in seconds.

1. Introduction

With advances in earth observation and surveying technology, remote sensing images are accumulating faster than they can be processed, pushing the geographical information science community into an era of big data. Massive remote sensing archives, multisource collections at the petabyte scale, pose great challenges for traditional geospatial information analysis infrastructure. Cloud computing is one of the most promising technologies for tackling these challenges, and the community has developed various cloud computing platforms for massive remote sensing image analysis [1,2].
Among existing frameworks for massive remote sensing image analysis, the most influential is Google Earth Engine (GEE) [3], which provides two types of computing services: on-the-fly computing and batch computing. Based on traditional big data analysis techniques, batch computing processes all the input data as a whole [4]. As remote sensing data grows, the execution time of batch computing keeps increasing. However, exploratory analysis in scientific research usually requires an environment enabling an instant read–eval–print loop (REPL), where a program is executed piecewise for rapid result evaluation. To solve this problem, on-the-fly computing technologies, featuring on-demand execution, instant response, and multitenancy, have been developed to improve data science productivity. Although GEE provides an excellent instance of the on-the-fly cloud computing paradigm and has been widely adopted by the remote sensing community, its theoretical basis, design principles, and implementation details are not publicly available.
General big data analysis frameworks, such as Hadoop [5], Spark [4], and Dask [6], are based on individual servers with tightly integrated resources, a pattern called server-centric computing. As advances in the computer architecture community enable datacenter disaggregation [7], serverless computing has come to light [8]. Serverless computing brings cloud functions with greater elasticity and more lightweight virtualization, while changing the pricing of cloud computing from paying for resources allocated to paying in proportion to resources used. Unfortunately, to the best of our knowledge, none of the existing on-the-fly remote sensing image analysis frameworks, such as Geonotebook [9], has adopted serverless computing technologies, nor can they switch seamlessly to batch processing.
In summary, this paper makes the following contributions:
Proposing a definition for the on-the-fly cloud computing paradigm for remote sensing image collections, including some empirical or descriptive characteristics and a formal definition.
Designing an entirely serverless architecture based on the serverless commodities of a public cloud, consisting of a data model, a programming model, and a series of key implementation technologies for remote sensing image collection analysis.
Providing some concrete, proof-of-concept experiments suggesting that on-the-fly cloud computing for remote sensing images can effectively run on the serverless cloud platform.
The remainder of this paper is organized as follows: Section 2 defines the on-the-fly computing paradigm and introduces its serverless software architecture. More details about the implementation are presented in Section 3, Section 4 and Section 5, and Section 6 presents concrete proof-of-concept experiments of on-the-fly computing for remote sensing images. Section 7 discusses the results and concludes the paper.

2. On-the-Fly Cloud Computing

2.1. Cloud Computing vs. HPC

Currently, cloud computing has become the main paradigm of server programming, as it ships code to where the big data resides. Its key technologies include virtualization, distributed storage, and distributed computation. A large number of frameworks have been developed by industry and academia, which can be classified into the server-centric pattern or the serverless pattern. Serverless computing commodities are available in the public cloud, such as AWS Lambda.
Any computing system faces four requirements: ease of use, high performance, portability, and flexibility. A cloud computing system's primary objective is ease of use, while that of high-performance computing (HPC) is performance. Accordingly, HPC exposes programming abstractions with low-level details of the computer architecture, such as MPI, whereas cloud computing systems rely more on automatic optimization mechanisms.
From the perspective of workload, big data processing frameworks can be classified into batch processing and streaming processing. On-the-fly computing differs significantly from both: its data source is the same as batch processing, but it requires a quick response; streaming processing responds instantly, but its data source is real-time. Therefore, existing general cloud computing frameworks cannot be directly applied to exploratory algorithm analysis, and a new paradigm of geocomputation is needed.

2.2. Characteristics

The target computing model is called on-the-fly cloud geocomputation, which is implemented based on general purpose cloud computing technologies, oriented to exploratory analysis, and dedicated to remote sensing processing. We summarize the characteristics of on-the-fly cloud geocomputation from the perspective of human–computer interaction as shown in Figure 1.
Shipping code to the remote sensing images persisted in the cloud storage instead of downloading the data locally for analysis.
Seamlessly switching to batch processing without code modification, which requires the data abstraction and operators to be the same.
Implicitly triggering the execution through specific operators, such as visualization and data export.
Dynamically determining the spatial scope of remote sensing images to be processed based on the tiles visualized on the map.
Responding as rapidly as possible when the user needs to evaluate without queueing of workloads.
Executing user-defined codes based on the overviews of remote sensing images without explicitly provisioning and managing data allocation.
Paying in proportion to remote sensing data used instead of paying for the computing resources allocated.

2.3. Formal Definition

The empirical and descriptive characteristics of on-the-fly cloud geocomputation are not sufficient to determine the structure of the target model, so a directed acyclic graph (DAG)-based model of on-the-fly cloud geocomputation is presented in Definition 1. A DAG is a kind of intermediate representation for user-defined code; it is common in relational databases, where it represents query plans. The nodes of a DAG are function invocations, with edges representing the inputs and outputs. Operators and user-defined code are logically equivalent and can both be transformed into DAGs in the target framework.
Definition 1.
In order to simplify cloud programming, the target framework of the on-the-fly cloud computing model for remote sensing images should provide rich datatypes, analysis-ready data, and dedicated operators for remote sensing image analysis, which significantly reduce the amount of user-defined code. Its programming model acquires and generates only the datatypes and operators accessible according to user authentication in the frontend, and processes only the data visible to users on maps, usually in the form of tiles, finally achieving instant response, on-demand execution, and multitenancy.
The on-the-fly computing model for remote sensing data analysis, $M$, can be defined as a five-tuple:
$$M \triangleq (E, S, U, V, A)$$
Here the notation $\triangleq$ means "defined as", and $E$ refers to the main elements to be processed, including datatypes, operators, and DAGs, which can be defined as a set:
$$E \triangleq \{D, OP, G\}$$
$D$ refers to the set of datatypes, also known as data models, commonly used in remote sensing image processing, such as $Image$ and $ImageCollection$. Each datatype $D_i$ is a set of elements $d_{ik}$, which means that $d_{ik}$ is a specific dataset of type $D_i$.
$$D \triangleq \{D_i \mid i \in \mathbb{N}\}, \quad D_i \triangleq \{d_{ik} \mid k \in \mathbb{N}\},\ i \in \mathbb{N}$$
There are some specific relations $R$ between datatypes in $D$, each of which can be defined as a three-tuple. A predicate $p$ is virtually a functional mapping from one datatype $D_i$ to another datatype or operator. The superscript $*$ denotes the power set.
$$r \triangleq (\alpha, p, \beta) \equiv p(\alpha) \to \beta;\quad \alpha \in D \cup OP,\ p \in P,\ \beta \in D$$
Here the notation $;$ marks the end of an expression. There are only two predicates. The predicate $cp$ means that $\alpha$ is a component of $\beta$, through which more complex datatypes can be established. The predicate $ih$ means that $\beta$ is more specific, customized on the basis of $\alpha$; a classification can be built from the predicate $ih$. It should be pointed out that all the datatypes and operators can be organized as a network through these two predicates.
$$cp(\alpha) \to \beta \equiv \alpha \in \beta;\quad \beta \in D,\ \alpha \in D \cup OP$$
$$ih(\alpha) \to \beta \equiv \beta \in \alpha^{*};\quad \alpha, \beta \in D$$
$OP$ refers to a set of operators dedicated to remote sensing image analysis, also known as the programming model. Each operator $op$ is bound to a specific datatype $D_i$ through the predicate $cp$. Every operator is actually a function mapping from certain datatypes, with or without a base operator, to the output datatypes.
$$OP \triangleq \{op_i \mid i \in \mathbb{N}\}, \quad op \triangleq D^{*} \times OP \to D^{*}$$
$G$ refers to a DAG, which represents the computational process over remote sensing images. The DAG is constructed from a series of computational nodes $n_j$, each representing a function call with its output result $\alpha$, operator name, and input arguments. Similar to an operator, $G$ is virtually a function mapping from the input remote sensing data to the results. In the implementation of the framework, $G$ can be modeled as a series of nested objects, each recording the input arguments, operator name, and returned datatype.
$$G \triangleq \{n_j \mid j \in \mathbb{N}\}, \quad n \triangleq (\alpha, op, \beta),\ \alpha \in D^{*},\ \beta \in G^{*} \cup D^{*}$$
$S$ refers to the set of states of the main elements, including datatypes, operators, and the DAG. Any datatype, operator, or DAG can be located either in the client or on the server, and it can be a static string, a callable proxy, a piece of code, or an executable cloud function. The notations $st$, $cb$, $cd$, and $ex$ indicate that the element is static, callable, in code form, or executable, respectively; the notations $ct$ and $sv$ indicate that the element is located in the client or on the server, respectively.
$$S \triangleq \{st, cb, cd, ex\} \times \{ct, sv\}$$
The notation $U$ refers to end users; each user is permitted access to certain operators. $V$ refers to the viewpoint on a map, which can be defined as a set of tile numbers. The viewpoint determines the spatial scope of the input remote sensing data to be processed.
$$U \triangleq \{u_i \mid i \in \mathbb{N}\},\ u_i \in OP^{*}, \quad V \triangleq \{(x, y, z) \mid x, y, z \in \mathbb{N}\}$$
The notation $A$ refers to a set of actions that change the state of target elements to complete the whole computational process.
$$A \triangleq \{get, init, gnrt, sbmt, schdl\}, \quad a \triangleq U \times S \to U \times S,\ a \in A$$
The action $get$ changes the location of elements that can be accessed by a certain user $u_i$. The notation $:$ denotes the value of the state $S$.
$$get \triangleq u_i \times S{:}(st, sv) \to u_i \times S{:}(st, ct),\quad u_i \in U$$
The action $init$ initializes datatypes and operators, translating their state from static to callable. A callable datatype or operator can be programmed against but is not actually executed.
$$init \triangleq u_i \times S{:}(st, ct) \to u_i \times S{:}(cb, ct),\quad u_i \in U$$
The action $gnrt$ performs DAG generation, which translates the user-defined script into a DAG object; the state of the DAG changes from code form to callable.
$$gnrt \triangleq G \times S{:}(cd, ct) \to G \times S{:}(cb, ct)$$
In contrast to the action $get$, $sbmt$ represents DAG submission, which can be modeled as translating the DAG into a static string and changing its location from client to server.
$$sbmt \triangleq G \times S{:}(cb, ct) \to G \times S{:}(st, sv)$$
The action $schdl$ represents DAG scheduling, which changes the state of the DAG from static to executable and obtains the result tiles determined by the viewpoint. The execution or scheduling of a DAG depends on a run-time environment, which could be modeled by a process calculus [10]. Serverless computing has no formal foundation yet, so to simplify the definition, we do not model the execution details of the DAG in the backend.
$$schdl \triangleq G \times S{:}(st, sv) \times V \to \{tile_i \mid i \in V\}$$
It should be noted that the essence of the element state change is a process of translation rather than a process of encapsulation and invocation.
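The element-state machine behind Definition 1 can be sketched in a few lines of Python. This is an illustrative model only: the state and action names follow the definition above, but the function names and dictionary layout are our own invention, and we assume that scheduling leaves the DAG executable on the server.

```python
# Minimal sketch of the element-state machine from Definition 1.
# States are (form, location) pairs; actions translate one state to another.
# All names here are illustrative, not part of any published API.

FORMS = {"st", "cb", "cd", "ex"}       # static, callable, code, executable
LOCATIONS = {"ct", "sv"}               # client, server

TRANSITIONS = {
    "get":   (("st", "sv"), ("st", "ct")),   # fetch element metadata to client
    "init":  (("st", "ct"), ("cb", "ct")),   # make datatypes/operators callable
    "gnrt":  (("cd", "ct"), ("cb", "ct")),   # translate user code to a DAG
    "sbmt":  (("cb", "ct"), ("st", "sv")),   # serialize DAG and send to server
    "schdl": (("st", "sv"), ("ex", "sv")),   # schedule DAG for execution
}

def apply_action(state, action):
    """Apply an action to an element state, enforcing the precondition."""
    pre, post = TRANSITIONS[action]
    if state != pre:
        raise ValueError(f"action {action!r} requires state {pre}, got {state}")
    return post

# A DAG travels: code on client -> callable -> static on server -> executable.
state = ("cd", "ct")
for act in ("gnrt", "sbmt", "schdl"):
    state = apply_action(state, act)
```

The chained transitions make explicit that the state change is a translation process, as noted below, rather than an encapsulation-and-invocation process.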

2.4. Serverless Architecture

In this study, a pure serverless software architecture means that all components are built on serverless commodities from public cloud providers, mainly including Function Compute (FC) [11], Serverless Workflow [12], Tablestore [13], Message Service (MNS) [14], Relational Database Service (RDS) [15], and Object Storage Service (OSS) [16] of Alibaba Cloud. The architecture is shown in Figure 2, which introduces the high-level components of the target system and traces the execution flow of UDF creation and pipeline execution. Because the software design adopts serverless technologies, the cost of the system is incurred only after it is deployed and used, which enables flexible pricing of system services.
① User-defined function (UDF) creation. Some basic operators are defined and submitted to a cloud function through the UDF client. This cloud function registers the UDF in the FC engine and a symbol database. As operators are dynamically generated, users can access the created UDFs through the pipeline client.
② Control flow of execution. User-defined code written against the pipeline client consists of expressions that can be translated into DAGs. These DAGs are submitted to a receiver built on a cloud function and then persisted in the DAG run-time storage. Another cloud function is triggered by run-time storage update events to translate the execution-ready nodes into a parallel workflow.
③ Data flow of execution. Remote sensing images in the serverless storage OSS can be asynchronously accessed and processed by executors. Tiles of the results are returned to the receiver and visualized on the map in the frontend.
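As a rough illustration of step ②, the receiver cloud function can be sketched as below. The `handler(event, context)` shape follows the common serverless function signature; the event schema, the storage dictionaries (standing in for Tablestore and the workspace), and the identifier format are assumptions made for illustration.

```python
import json

# In-memory stand-ins for the DAG run-time storage and result workspace.
RUNTIME_STORE = {}   # dag_id -> submitted DAG (Tablestore in the real system)
WORKSPACE = {}       # (dag_id, tile) -> result tile (OSS/MNS in the real system)

def receiver(event, context=None):
    """Sketch of the receiver cloud function: accept a submitted DAG,
    persist it, and hand back an identifier the client polls for tiles.
    The event shape here is an assumption, not the system's actual schema."""
    body = json.loads(event)
    dag_id = f"dag-{len(RUNTIME_STORE)}"
    RUNTIME_STORE[dag_id] = {
        "dag": body["dag"],
        "viewpoint": body.get("viewpoint", []),   # tiles visible on the map
        "status": "submitted",
    }
    # Persisting the DAG would fire the storage-update trigger that
    # translates execution-ready nodes into a parallel workflow (step 2).
    return json.dumps({"dag_id": dag_id})

resp = receiver(json.dumps({"dag": {"op": "ndvi"}, "viewpoint": [[0, 7, 8]]}))
```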

3. Data Model

3.1. Tiling

Because on-the-fly geocomputation emphasizes analyzing during visualization, the target system needs to load only part of the target remote sensing images into memory at any time, which means the images must be divided into tiles at different levels and organized as pyramids.
For remote sensing images, Cloud Optimized GeoTIFF (COG) [17] is the most popular file format for building a pyramid for on-the-fly cloud geocomputation. Because the I/O time is expected to be much less than the time of connecting to OSS, all the tiles of different levels that belong to the same band are organized together in a single GeoTIFF file. The tiles of a COG are usually stored in row-major order.
There are two kinds of tiles in the target system: one for visualization and the other introduced by the COG format for cloud storage. The notations $t_v$ and $t_s$ denote a tile for visualization and storage, respectively. The tile number $(x_v, y_v, z_v)$ is usually determined according to the Web Mercator projection.
$$t_v \triangleq (x_v, y_v, z_v, value_v), \quad t_s \triangleq (x_s, y_s, z_s, value_s)$$
In this paper, we strongly recommend that these two tiling strategies be consistent and aligned to some extent. Images from different satellites may have different spatial projections and need conversions from $(x_s, y_s, z_s)$ to $(x_v, y_v, z_v)$; $(x_s, y_s, z_s)$ usually tiles locally along the rows and columns of the image and is not aligned to $(x_v, y_v, z_v)$.
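For reference, the standard Web Mercator ("slippy map") tile arithmetic that determines $(x_v, y_v, z_v)$ can be written as follows. This is the well-known public formula, not code from the target system.

```python
import math

def deg2tile(lon, lat, z):
    """Geographic coordinates -> Web Mercator tile number (x_v, y_v) at
    level z. Standard slippy-map formula used by most web map schemes."""
    n = 2 ** z
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y

def tile_origin(x, y, z):
    """Inverse: the (lon, lat) of the tile's north-west corner."""
    n = 2 ** z
    lon = x / n * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * y / n))))
    return lon, lat
```

A storage tile grid $(x_s, y_s, z_s)$ laid out along the image's own rows and columns would be re-projected onto this grid to achieve the alignment recommended above.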

3.2. Logical Region

The tile is the basic unit for storage and visualization. The region is a logical strategy of tile grouping, aimed at the dynamic requirements of data-parallel execution. It can be modeled as a set of tile numbers representing a spatially continuous coverage and is the minimal unit for algorithm design, task allocation, and geodata access.
The calculation from a region number $(x_r, y_r, z_r)$ to visualization tile numbers $\{(x_v, y_v, z_v)\}$ can be modeled as an affine function $f_t$. $a_x$ and $a_y$ represent the width and height of the region, respectively; $b_{rx}$ is an offset in region $r$ along $x$, while $b_{ry}$ is an offset in region $r$ along $y$. $z_v$ and $z_r$ are levels in the pyramid and usually have the same value. Obviously, the conversion $f_t^{-1}$ from $(x_v, y_v, z_v)$ to $(x_r, y_r, z_r)$ is a kind of integer modular operation.
$$\begin{pmatrix} x_v \\ y_v \\ z_v \end{pmatrix} \triangleq f_t\begin{pmatrix} x_r \\ y_r \\ z_r \end{pmatrix} = \begin{pmatrix} x_r \\ y_r \\ z_r \end{pmatrix} \times \begin{pmatrix} a_x \\ a_y \\ 1 \end{pmatrix} + \begin{pmatrix} b_{rx} \\ b_{ry} \\ 0 \end{pmatrix}, \quad a_x, a_y, b_{rx}, b_{ry} \in \mathbb{N}$$
The region is composed of two independent types of tiles, which can be transformed into each other by $f_c$. Due to projection transformation, a tile in the COG may belong to two different regions. To prevent tiles belonging to different regions from being processed multiple times, a tile-masking strategy was designed: the mask is a set of tiles indicating whether the target tiles have been processed. In computation, the region is reformed as a unified larger tile whose overlapping zones ensure the correctness of focal operators, such as sliding windows. The logical region $LR$ can be defined formally as follows:
$$LR \triangleq (T_v, T_s, f_t, f_c, M)$$
$T_v$ and $T_s$ are the sets of $t_v$ and $t_s$, respectively. $M$ is a cached set of $t_s$ indicating the tiles that have been processed. The logical region is shown in Figure 3. In OSS storage, the bytes to be read are determined by several pieces of information, including bucket, image collection, image, band, region number, tile order, and overlap. The $region$ and $overlap$ define the minimal basic unit for distributed geocomputation; the $overlap$ is a dynamic value determined by the window operators. The spatial $order$, similar to a Hilbert curve or Z curve, determines the offset and length of the bytes from which the system begins to read.
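A minimal sketch of $f_t$ and its inverse, assuming a region of $a_x \times a_y$ visualization tiles; the function names and the enumeration of in-region offsets are illustrative.

```python
def region_to_tiles(xr, yr, zr, ax, ay):
    """f_t: expand region number (x_r, y_r, z_r) into the set of
    visualization tile numbers it covers. ax/ay are the region width
    and height in tiles; the loop enumerates the offsets b_rx/b_ry."""
    return {(xr * ax + bx, yr * ay + by, zr)
            for bx in range(ax) for by in range(ay)}

def tile_to_region(xv, yv, zv, ax, ay):
    """f_t^{-1}: integer division maps a visualization tile back to the
    region that owns it (the modular-arithmetic inverse noted above)."""
    return (xv // ax, yv // ay, zv)
```

For example, with 2 × 2 regions, region (1, 2) at level 8 owns the four tiles (2, 4), (3, 4), (2, 5), and (3, 5), and each of them maps back to (1, 2, 8).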

3.3. Datatypes

As a kind of raster data, the remote sensing image is the main type of geodata and can be analyzed by a map algebra or an array algebra system. In addition, vector data is another type of geodata, often used to manage image processing results. The logical region is a general purpose geodata abstraction for implicit parallel computing, which is not suitable for end-user programming and should be invisible to users. We propose a composite datatype, the image collection, based on the SpatioTemporal Asset Catalog (STAC) [18] and GeoJSON. The definition is as follows:
The image collection is the top-level data abstraction dedicated to remote sensing image processing, constructed based on the predicates $P$ and some basic datatypes. All datatypes and their relations are shown in Figure 4.

4. Programming Model

4.1. Workflow

The DAG is virtually a workflow modeling the computational process, where the nodes and edges represent operators and image collections. DAG generation, presented in Section 4.4, is separated from the end-user programming interface and invisible to users, which frees programmers from the burden of constructing a global DAG data structure for the workflow. This paper proposes a definition for the interface of the remote sensing processing workflow. The user programming interface consists of composite datatypes and workflow skeletons, which can be expressed as a two-tuple, $(composite\ types, skeletons)$.
$composite\ type$. The composite datatypes are the user-visible components of the data abstraction defined in Section 3.3, except $region$ and $tile$. In the construction of a workflow, $ImageCollection$ is the most common datatype.
$skeleton$. Workflow skeletons are high-level operators representing the basic workflow semantics. Six operators are related to workflow construction: $create$, $filt$, $integrate$, $transform$, $aggregate$, and $show$. These skeletons are functions mapping from one image collection to another.
$$Skeleton \triangleq D_{11} \times OP \to D_{11}$$
$Model$, $Condition$, $Relation$, $Base\ operator$, $Aggregator$, $Reducer$, and $Scheme$ are kinds of user-defined functions, defined in Section 4.2. The definition is shown in Figure 5.
Every $skeleton$ has two parts, the left-hand expression 〈lhs〉 and the right-hand expression 〈rhs〉. All the input datatypes of a $skeleton$ are $ImageCollection$. The $create$ operator loads an image collection from a cloud storage file path or from a data model $Model$ describing the content of a certain image collection. The $filt$ operator constructs spatiotemporal or regular conditions for filtering the input image collection. Because each image collection has a corresponding data model $Model$, the $integrate$ operator provides a way to integrate different image collections through a $Relation$ between their data models. The $transform$ skeleton maps a base intraimage operator over every item of the input image collection, while the $aggregate$ operator maps a base interimage operator according to the $Aggregator$, which also groups the items of the image collection by the selected dimensions in $Model$, such as time or space. The $show$ operator triggers the execution of the workflow and obtains the tiles of the input image collection visible to users.
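To make the skeleton semantics concrete, the following toy Python sketch shows how a chain of skeleton calls could build the nested DAG object. The class, method bodies, and argument formats are hypothetical; only the skeleton names mirror the definitions above.

```python
class ImageCollection:
    """Toy stand-in for the ImageCollection datatype: each skeleton call
    returns a new collection wrapping a DAG node (operator name, arguments,
    and the input node). Everything except the skeleton names is invented."""

    def __init__(self, node):
        self.node = node

    @staticmethod
    def create(path):
        return ImageCollection({"op": "create", "args": [path]})

    def filt(self, condition):
        return ImageCollection(
            {"op": "filt", "args": [condition], "input": self.node})

    def transform(self, base_operator):
        return ImageCollection(
            {"op": "transform", "args": [base_operator], "input": self.node})

    def aggregate(self, reducer, dim):
        return ImageCollection(
            {"op": "aggregate", "args": [reducer, dim], "input": self.node})

# A pipeline reads top-down but nests bottom-up: aggregate(transform(filt(create))).
col = (ImageCollection.create("oss://bucket/landsat8")
       .filt("date >= 2021-01-01")
       .transform("ndvi")
       .aggregate("mean", "time"))
```

Nothing executes here; `col.node` is merely the nested object that a later $show$ would submit, which is what allows the same code to switch to batch processing unchanged.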

4.2. User-Defined Function

The $skeletons$ are collection-level operators and lack the flexibility of image-level or pixel-level operations. According to the definition of the $workflow$, there are six operators at the image level and pixel level that can be defined and published by the user: $Model$, $Condition$, $Relation$, $Base\ operator$, $Aggregator$, and $Scheme$.
Similar to a constructor in a programming language, $Model$ is a configuration for certain satellite images, including band number, coverage, resolution, spatial projection, and other attributes in key-value form. The $Condition$ is a logical expression, especially a spatiotemporal topological relationship, for selecting the target images. The $Relation$ is a rename operator, which specifies a unified model for the input image collections.
$Base\ operator$ refers to an image-level algorithm defined and implemented on the band-level and pixel-level interface. The band-level operators, band math, or map algebra can be regarded as window operators over linear algebra. The $Reducer$ defines algorithms that integrate all the images in a collection into a single image along a certain axis, such as $space$, $time$, $bands$, or other metadata.
Besides all the above operators, there are also some operators related to publishing UDFs, such as $register$, which provides an interface for registering the basic information of UDFs, including result datatype, function name, input arguments, and description. More details about operator publication are given in Section 4.3.
The syntax description is based on the augmented Backus–Naur form (ABNF), shown in Figure 6.
To allow embedding into other programming languages as an internal domain-specific language (DSL) [19], the production rules adopt some symbols different from regular ABNF [20]. All the symbols of the production rules in Figure 4 and Figure 5 are defined in Table 1.

4.3. Operator Publication

Similar to the symbol table of a programming language, operators can be regarded as computational symbols with input and output information in remote sensing processing workflows. An operator carries two kinds of information, one for user programming and the other for explicit cloud function calls, which are called the high-level attribute and the low-level attribute of the operator.
The high-level attribute is the information exposed to users, which is sent to the pipeline client through dynamic operator generation. It contains the datatype, operator name, input arguments, return types, and description. The low-level attribute is a pointer to a certain cloud function, which contains two kinds of information: the operator name and the function call location.
When users create operators, they must publish them in the backend before these operators can be used to create workflows. The process of operator publication is shown in Figure 7, presented in Python style. Once UDFs have been published, users can access them via agent datatypes in the pipeline client, where they can be used in the construction of workflows.
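The publication flow can be illustrated with the sketch below, which records a high-level attribute (for the pipeline client) and a low-level attribute (the cloud function pointer) in an in-memory symbol table. All field names and the endpoint string are hypothetical stand-ins for the system's actual schema.

```python
# Sketch of operator publication: the high-level attribute feeds dynamic
# operator generation in the pipeline client; the low-level attribute
# points at the cloud function implementing the operator.
SYMBOL_DB = {}

def register(datatype, name, args, returns, description, function_location):
    """Record both attributes of a UDF in the symbol database."""
    SYMBOL_DB[(datatype, name)] = {
        "high": {                  # exposed to users via the pipeline client
            "datatype": datatype,
            "name": name,
            "args": args,
            "returns": returns,
            "description": description,
        },
        "low": {                   # used for explicit cloud function calls
            "name": name,
            "location": function_location,
        },
    }

register("ImageCollection", "ndvi", ["red_band", "nir_band"],
         "ImageCollection",
         "Per-pixel normalized difference vegetation index",
         "fc:cn-beijing:udf-ndvi")   # hypothetical FC endpoint string
```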

4.4. DAG Generation

The pipeline and UDF clients can dynamically generate operators in the frontend based on metaprogramming technology and the high-level attributes of cloud functions. All the operators in the frontend belong to specific proxy objects and actually refer to one and the same function, which merely records the invocations between cloud functions, creating the appearance of code execution. This relation can be modeled as a DAG, a kind of intermediate representation between the remote sensing image processing pipeline and the underlying cloud functions, generated through a series of callable proxy objects in the frontend, similar to the GEE API client.
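A compact way to realize this "one and the same function" behavior in Python is a proxy class whose attribute access returns a recording closure. This sketch is our own illustration of the metaprogramming idea, not the framework's code; the operator names in the example are arbitrary.

```python
class OperatorProxy:
    """Every attribute access yields a recorder that, when called, returns
    a new proxy wrapping a DAG node instead of executing anything. One
    generic function thus stands behind every dynamically generated
    operator, merely recording the invocation chain."""

    def __init__(self, node=None):
        self.node = node

    def __getattr__(self, op_name):
        # Called only for names not found on the instance, i.e. operators.
        def record(*args):
            return OperatorProxy({"op": op_name, "args": list(args),
                                  "input": self.node})
        return record

# "Running" this chain just builds a nested DAG object in the frontend.
dag = (OperatorProxy()
       .load("oss://bucket/scene.tif")
       .band("nir")
       .subtract_band("red"))
```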

4.5. Trigger of Execution

Although the DAGs represent the pipeline execution, they are generated in the frontend. When users invoke specific functions, the framework is triggered to submit DAGs from the client to the cloud. In order to return results as soon as possible, each DAG should carry an initial viewpoint, or it receives a default one, and the framework computes results on the tiles around the viewpoint. The trigger of execution can be defined as a tuple $(function, viewpoint)$: the $function$ submits the DAG through a request and obtains a map identifier for fetching tiles, and the $viewpoint$ defines the initial scope of input tiles. Therefore, users receive the first results very quickly, complying with the design principle of on-the-fly geocomputation.
The most important aspect of DAG execution is to guarantee the correctness of translating the focal operators from the band level to the region level. Because the logical region is invisible to end users, the translator of the run-time environment must be able to determine the shape of the logical region automatically, i.e., perform automatic data partition.

5. DAG Execution

5.1. Data Partition

Unlike the DAGs generated automatically by machine learning engines, the DAGs constructed manually by users are usually too small to benefit from partitioning for task-parallel execution. This paper therefore focuses only on algorithms for data-parallel execution, which require an algorithm for automatic data partition.
The focal operator has the characteristic of structural locality, which means that the result at location $(i, j)$ is determined not only by the value at $(i, j)$ but also by its neighbors. The structural locality of remote sensing processing can be defined as a window, so the computational model of data partition for a remote sensing image can be expressed as a function with a window as input and a kind of logical region as output.
$$region \triangleq f_{image}(window)$$
Since all the remote sensing images are organized as COGs and persisted in OSS, the data partition algorithm is in principle determined by the characteristics of OSS, the shape of the window, and the physical layout of the COG. As the COG is a format highly customized for streaming, progressive rendering, and on-the-fly random reading in OSS, the impacts of OSS and the physical layout of the COG can be neglected in this paper.
A single remote sensing image can be regarded as a three-dimensional array, so the window defines the scope of neighboring pixels at a specified location $(i, j, k)$. Unlike the common window definition, such as 7 × 7 × 3, this paper adopts a more flexible window form, i.e., a set of neighboring pixels, similar to ArrayUDF [21]. The expression $\{r, c, b\}$ represents the three dimensions of an image. The shape of the overlapping zone $\{O_k\}$ guarantees the correctness of operators across logical regions and can be derived from the window parameters $[L_i, R_i]$.
$$W \triangleq ([L_i, R_i]) = f(\{P_{\delta_r, \delta_c, \delta_b} \mid \delta_i \in [L_i, R_i]\}),\quad i \in \{r, c, b\}$$
$$\{O_k\} \triangleq \mathrm{Sum}(L_k, R_k),\quad k \in \{r, c, b\}$$
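Under these definitions, deriving the overlap and partitioning an image into padded logical regions can be sketched as follows, assuming square regions and the $[L_k, R_k]$ window notation above; the function names and region tuple layout are illustrative.

```python
def derive_overlap(window):
    """Total overlap {O_k} along each dimension, summed from the window
    parameters [L_k, R_k]. window maps dimension name -> (L, R)."""
    return {dim: abs(lo) + hi for dim, (lo, hi) in window.items()}

def partition(width, height, region, window):
    """Split a width x height image into logical regions of `region`
    pixels per side, expanding each by the window's reach so focal
    operators at region borders still see all their neighbors.
    Returns (x, y, w, h) pixel rectangles clipped to the image."""
    lr, rr = abs(window["r"][0]), window["r"][1]   # halo along rows
    lc, rc = abs(window["c"][0]), window["c"][1]   # halo along columns
    regions = []
    for y0 in range(0, height, region):
        for x0 in range(0, width, region):
            x1, y1 = max(0, x0 - lc), max(0, y0 - lr)
            x2 = min(width, x0 + region + rc)
            y2 = min(height, y0 + region + rr)
            regions.append((x1, y1, x2 - x1, y2 - y1))
    return regions

# A 3 x 3 sliding window: one neighbor on each side along rows and columns.
window = {"r": (-1, 1), "c": (-1, 1), "b": (0, 0)}
regs = partition(512, 512, 256, window)
```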

5.2. Execution

To implement computing while visualizing, we propose an architecture based on the producer–consumer pattern. The producer receives tile numbers indicating the data to be processed, applies the DAG to the tiles determined by the regions, and puts the results into a workspace to be consumed. The data-parallel DAG execution algorithm is shown in Algorithm 1.
Algorithm 1 Data-Parallel DAG Execution
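The producer side of this pattern can be sketched with a minimal thread-based loop; the actual system would run producers as cloud function instances rather than threads, and `execute_dag` here is only a stand-in for applying the DAG to one tile's data.

```python
import queue
import threading

def execute_dag(dag, tile):
    """Stand-in for applying a DAG to one tile (here: a tagged string)."""
    return f"{dag['op']}@{tile}"

def producer(dag, tiles, workspace, done):
    """Producer loop: pull tile numbers, apply the DAG, publish results
    in the shared workspace, and signal completion for consumers."""
    while True:
        try:
            tile = tiles.get_nowait()
        except queue.Empty:
            break
        workspace[tile] = execute_dag(dag, tile)
        done.put(tile)

tiles = queue.Queue()
for t in [(0, 7), (0, 8), (1, 7), (1, 8)]:
    tiles.put(t)

workspace, done = {}, queue.Queue()
workers = [threading.Thread(target=producer,
                            args=({"op": "ndvi"}, tiles, workspace, done))
           for _ in range(2)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

The consumer would poll `done` and fetch results from the workspace, which corresponds to the tile requests described in Section 5.3.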

5.3. Cache

Considering that some tiles may be requested more than once during DAG execution, the framework must maintain the execution states, including DAGs and tiles, in caches. The consumer first looks up tiles of the target region in the frontend cache, generating tiles dynamically from cached upper- or lower-level tiles if those have already been requested; otherwise, it sends a request to the backend to receive the target tiles generated by the producers from the workspace.
Besides the cache in the frontend, there is also a cache in the backend, which plays an important role in DAG execution. If the whole region requested by the consumers has been processed, the backend cache puts the tiles directly into the workspace. If some tiles are shared by different regions and the target region has only been partially processed, the cache generates a mask declaring which tiles have already been processed, which reduces the computational cost.
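The mask logic can be sketched minimally as follows; the `TileCache` class and its method names are our illustration, not the framework's API. For a requested region, the mask marks which tiles are already cached so that only the remainder needs to be recomputed.

```python
class TileCache:
    def __init__(self):
        self.store = {}                     # tile number -> processed tile

    def put(self, tile_no, data):
        self.store[tile_no] = data

    def mask(self, region):
        """Return {tile_no: already processed?} for every tile in the region."""
        return {t: t in self.store for t in region}

    def missing(self, region):
        """Tiles the producers still have to compute for this region."""
        return [t for t in region if t not in self.store]

cache = TileCache()
cache.put((0, 7), "processed-tile")         # shared with a previous region
region = [(0, 7), (0, 8), (1, 7), (1, 8)]
print(cache.missing(region))                # only three tiles need computing
```

A region that overlaps a previously processed one thus reuses the shared tiles instead of re-executing the DAG on them.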

6. Case Study

6.1. Data and Result

This study conceptually validates the feasibility of the serverless-based, on-the-fly computing framework with a simple NDVI use case on a remote sensing image. All the remote sensing images are from Landsat8 and organized in the COG file format. Every overview level in the image pyramid is partitioned into tiles of 256 × 256 pixels and encoded as an individual TIFF file. This study provides a Python client integrated as an algorithm library in Jupyter. The NDVI code, the input image, and the result tiles are shown in Figure 8. The NDVI function is performed on four tiles, numbered ( 0 , 7 ), ( 0 , 8 ), ( 1 , 7 ), and ( 1 , 8 ).
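The NDVI code itself appears only as Figure 8; a hedged NumPy equivalent of the computation is given below. The band roles follow Landsat 8 (band 5 is near-infrared, band 4 is red); the `ndvi` function and the synthetic tiles are our illustration, not the framework's operator.

```python
import numpy as np

def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red), computed per 256x256 tile."""
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    denom = nir + red
    out = np.zeros_like(denom)
    # Avoid division by zero on nodata pixels; those stay 0.
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out

tile_nir = np.full((256, 256), 0.6)   # synthetic NIR reflectance tile
tile_red = np.full((256, 256), 0.2)   # synthetic red reflectance tile
print(ndvi(tile_nir, tile_red)[0, 0])  # 0.5
```

In the framework, the same per-pixel expression is applied independently to each of the four target tiles, which is what makes the execution data-parallel.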
Before the system begins to process the target tiles, a DAG is generated and submitted to the data center. The part of the DAG shown in Figure 9 corresponds to lines 1 to 3 of the NDVI code. It should be pointed out that a time point is modeled as a range from a start time to an end time. A DAG is actually a nested object returned by the final operator, and DAG generation is the process of creating this nested object.
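The idea that a DAG is a nested object returned by the final operator can be sketched as follows. This is a deliberately simplified illustration: the node schema, the operator names, and the file path are assumptions, not the framework's actual representation.

```python
def op(name, *inputs, **params):
    """Each operator call returns a node referencing its inputs, so the
    value returned by the final operator is the whole nested DAG."""
    return {"op": name, "inputs": list(inputs), "params": params}

# Hypothetical NDVI pipeline: select NIR and red bands, then combine them.
src = op("load", path="landsat8.tif")       # illustrative path
b5 = op("select_band", src, band=5)         # near-infrared
b4 = op("select_band", src, band=4)         # red
dag = op("divide", op("subtract", b5, b4), op("add", b5, b4))
print(dag["op"])                            # the final operator is the root
```

Submitting the DAG then amounts to serializing this nested object and sending it to the backend, where the scheduler walks it tile by tile.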

6.2. Response Time

Once the algorithm is determined, it will be applied to the whole remote sensing collection; the response time is therefore a critical characteristic of on-the-fly cloud computing for remote sensing data analysis during the phase of DAG construction, i.e., algorithm exploration. When users request to run the user-defined code, the frontend first submits a DAG and receives a task identifier, and then the backend executes the DAG on the specified tiles determined by V_P.
The response time is related to the efficiency of the scheduler, the complexity of the related algorithms, the latency of communication, the characteristics of the serverless platform, and the concurrency. In this study, the response time is measured through a series of continuous requests, as shown in Figure 10.
The initialization phase of a serverless platform is called a cold start. The cold start of the NDVI case takes around 1200 ms, about three times longer than that of a base cloud function, which merely responds to the request without doing anything. Once the serverless platform has started, the response time drops to around 700 ms in the elastic mode, whereas in the performance mode the computation time drops to less than 400 ms. This shows that the NDVI case benefits considerably from better hardware, although allocating more than 4 GB of memory could not further reduce the response time.
The response time of the NDVI case is mainly determined by reading tiles from the OSS, mosaicking them, and transforming them into an image. Reading a COG header and four tiles from the OSS takes about 200 ms, and the NDVI case needs to repeat this process; putting the four tiles together and transforming them into an image takes about another 200 ms. The scheduler takes less than 100 ms, so the total response time is less than 700 ms. To reduce the response time, the system adopts a strategy similar to lazy evaluation, in which the operation of reading data from the OSS is only triggered by specific operators that must manipulate the data. In the NDVI case, the reading of tiles from the OSS is deferred to the eleventh round of task scheduling.
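The deferred-read strategy can be sketched as a thunk. In this illustrative sketch, `read_tile` stands in for the roughly 200 ms OSS request; the class and names are ours, and the real framework defers the read inside its operator scheduling rather than in a wrapper object.

```python
class LazyTile:
    def __init__(self, loader):
        self._loader = loader      # e.g. a closure that reads from the OSS
        self._data = None

    def force(self):
        # Only a data-manipulating operator calls force(); earlier
        # scheduling rounds never touch the OSS.
        if self._data is None:
            self._data = self._loader()
        return self._data

reads = []
def read_tile():
    reads.append(1)                # stands in for one ~200 ms OSS request
    return [[0.5] * 256] * 256     # a fake 256x256 tile

tile = LazyTile(read_tile)         # scheduling rounds pass the thunk around
data = tile.force()                # the first pixel-level operator triggers I/O
tile.force()                       # subsequent forces hit the cached data
```

Only operators that actually manipulate pixels pay the OSS latency, and they pay it exactly once per tile.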

6.3. Concurrency

An important characteristic of serverless computing, and a requirement of on-the-fly cloud geocomputation, is concurrency. Because computation requests are sent automatically by a map client, the serverless framework needs to respond to a large number of requests in a short period. The measured concurrency performance of the serverless-based remote sensing image analysis framework is shown in Figure 11.
The framework in the elastic mode is tested through series of asynchronous requests at the scale of tens to hundreds. The response time stays below 1.5 s until the concurrency approaches about 700 and then starts to increase linearly. When there are fewer than 200 asynchronous requests, the maximum execution time of the DAG is less than about 1.5 s. Even when the scale of asynchronous requests reaches 1000 and the maximum execution time approaches 3.5 s, the average response time remains under about 1.5 s. In contrast with traditional cloud computing technologies, which may scale in minutes or hours, the serverless-based framework can increase the number of functions and instances in seconds to handle new requests.
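A test of this kind can be driven from Python with asyncio, as in the hedged sketch below. The request itself is simulated with a randomized sleep; the numbers in Figure 11 were of course measured against the deployed framework, and the function names here are ours.

```python
import asyncio
import random
import time

async def submit_dag(i):
    """Stand-in for one asynchronous DAG request to the backend."""
    start = time.perf_counter()
    await asyncio.sleep(random.uniform(0.01, 0.03))  # simulated round trip
    return time.perf_counter() - start

async def benchmark(n_requests):
    # Fire all requests concurrently, as a map client would.
    latencies = await asyncio.gather(*(submit_dag(i) for i in range(n_requests)))
    return sum(latencies) / n_requests, max(latencies)

avg, worst = asyncio.run(benchmark(200))
print(f"avg {avg * 1000:.0f} ms, max {worst * 1000:.0f} ms")
```

Sweeping `n_requests` from tens to about a thousand reproduces the shape of the measurement: average latency stays flat while the maximum grows once the platform must spin up new instances.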

7. Discussion and Conclusions

Several remote sensing frameworks have been developed from traditional parallel databases and cloud computing technologies. The survey [1] analyzed related work of three types, including spatial databases, programming and software tools, and big spatial data infrastructures, and further categorized them into ten types by their underlying general technologies. Though extensive in scope, that survey focuses only on the underlying technologies without evaluating them as computing services. Different from that review, here we analyze representative remote sensing image analysis frameworks in terms of both computing service types and underlying cloud computing technologies.
As one of the most popular remote sensing analysis cloud platforms, GEE provides two types of computing services, namely on-the-fly computing and batch computing. On-the-fly computing is used for rapid prototyping on tiled remote sensing images, while batch computing provides the capability of planetary-scale processing. Despite their different focuses, they share the same programming interface and can be switched seamlessly, based on GEE's ability to translate user-defined code or functions into DAGs. However, GEE cannot express focal operations with structural locality at the pixel level. In this study, focal operations are implemented through window operators based on relative coordinates and overlapping logical regions.
Different from GEE’s local tiling strategy, the Open Data Cube (ODC) [22] reformats raw remote sensing images into analysis-ready data with a global tiling strategy. ODC’s rapid prototyping and parallel computing capabilities are provided by xarray [23] and the Celery framework, respectively. Although Celery has high performance, from the perspective of ease of use ODC offers only a low-level programming model compared with highly customized remote sensing processing APIs or operators, and it lacks control over operator access. As cloud computing systems are generally expected to prioritize ease of use over raw performance, we provide end users with a customized, DAG-centric programming model that includes various datatypes and operators dedicated to remote sensing data analysis, along with operator-level access control.
GeoTrellis [24,25] and GeoSpark [26], built on top of Spark, are generally used for batch processing of raster data and are similar to the batch computing component of GEE, which is based on FlumeJava [27]. These frameworks usually provide a distributed data abstraction customized to raster data on top of the RDD data structure [28]. However, in the algorithm-exploration phase, they cannot be used as REPL tools to provide a public computing service and rapid prototyping. Besides, these frameworks require the computation to be built on the distributed datatypes, which limits the expressiveness and flexibility of end-user programming for remote sensing collection processing. Although Spark can be refactored on serverless technologies, GeoTrellis and GeoSpark cannot be directly offered as public cloud commodities that provide end-user programming services.
Iris [29] is a Python library for the analysis and visualization of meteorological data, which provides batch geocomputation based on the distributed numerical computing framework Dask. In contrast to the Spark-based frameworks, Iris has higher communication efficiency and is capable of performing certain high-performance computing tasks, such as dense linear algebra calculations. Nevertheless, Iris is highly customized to the analysis of meteorological data, which is usually organized as NetCDF files or variants thereof, and therefore cannot be applied directly to remote sensing image analysis. Besides, as Iris is deeply bound to Dask, it can neither control operator access nor offer the pay-in-proportion-to-resources-used pricing of a serverless public commodity.
With the rise of the disaggregated datacenter, serverless computing is believed to be becoming the default computing paradigm of cloud computing, bringing closure to the client–server era [8]. Although GEE has the features of scaling automatically and billing on usage, it does not claim to be built on serverless technologies. All the other remote sensing image processing frameworks need explicit resource provisioning and can thus be regarded as based on a server-centric computing paradigm. Although serverless cloud functions are becoming increasingly lightweight and have been successfully employed for several types of general workloads [30,31], there are still many limitations: cloud functions are stateless, lack fine-grained coordination, and do not provide high-level parallel operators, posing difficulties for remote sensing data processing workloads.
This paper presents the empirical characteristics and a formal definition of on-the-fly cloud geocomputation for the first time. We then give a serverless-based software architecture and some proof-of-concept experiments, which suggest that on-the-fly cloud geocomputation can be efficiently implemented with serverless technologies such as the object storage system and the function computing engine. At the frontend, we provide a DAG-based, end-user programming environment for remote sensing data analysis, which contains a series of customized datatypes and operators. The DAG is one of the core designs: it bridges the user-defined code and the cloud functions at the backend. The logical region is another core design: it guarantees the correctness of focal operators through overlapping zones and controls the number of input image tiles to achieve a rapid response.
Nevertheless, several aspects of the proposed serverless-based system could be further improved. First, the technology stack of current serverless commodities lacks in-memory storage similar to Redis, which limits further performance improvements. Future work could draw on Anna [32], which provides high-performance in-memory storage services, to improve the efficiency of COG reading and caching. In addition, the scheduler in the current system adopts a staged scheduling method, which ignores the differences in execution time between nodes. It is therefore necessary to develop a scheduling algorithm specifically oriented to serverless computing that jointly optimizes job completion time and execution cost [33].

Author Contributions

Conceptualization, J.W. and M.W.; methodology, J.W.; software, J.W.; validation, H.L., L.L. (Lijuan Li), and L.L. (Leilei Li); investigation, J.W.; resources, J.W.; data curation, J.W.; writing—original draft preparation, J.W.; writing—review and editing, L.L. (Lijuan Li); visualization, J.W.; project administration, J.W.; funding acquisition, J.W. All authors have read and agreed to the published version of the manuscript.


Funding

This research was funded by the National Key Technology Research and Development Program of the Ministry of Science and Technology of China under grant number 2021YFB3900900. This work was jointly supported by the Fundamental Research Funds for the Central Universities, National Natural Science Foundation of China under grant number 41776197, the Strategic Priority Research Program of Chinese Academy of Sciences under grant number XDB42010403, and the National Key Technology Research and Development Program of the Ministry of Science and Technology of China under grant number 2018YFE0204203.

Data Availability Statement

All satellite data used in the study are available for free download from their respective data portals ( accessed on 1 January 2022).


Acknowledgments

The authors would like to thank the editors and the anonymous reviewers for their crucial comments, which improved the quality of this paper.

Conflicts of Interest

The authors declare no conflict of interest.


References

1. Alam, M.M.; Torgo, L.; Bifet, A. A Survey on Spatio-temporal Data Analytics Systems. arXiv 2021, arXiv:2103.09883. [Google Scholar] [CrossRef]
  2. Gomes, V.; Queiroz, G.; Ferreira, K. An Overview of Platforms for Big Earth Observation Data Management and Analysis. Remote Sens. 2020, 12, 1253. [Google Scholar] [CrossRef] [Green Version]
  3. Gorelick, N.; Hancher, M.; Dixon, M.; Ilyushchenko, S.; Thau, D.; Moore, R. Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sens. Environ. 2017, 202, 18–27. [Google Scholar] [CrossRef]
  4. Zaharia, M.; Xin, R.S.; Wendell, P.; Das, T.; Armbrust, M.; Dave, A.; Meng, X.; Rosen, J.; Venkataraman, S.; Franklin, M.J. Apache spark: A unified engine for big data processing. Commun. ACM 2016, 59, 56–65. [Google Scholar] [CrossRef]
  5. White, T. Hadoop: The Definitive Guide; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2012. [Google Scholar]
  6. Rocklin, M. Dask: Parallel computation with blocked algorithms and task scheduling. In Proceedings of the 14th Python in Science Conference, Austin, TX, USA, 6–12 July 2015. [Google Scholar]
  7. Han, S.; Egi, N.; Panda, A.; Ratnasamy, S.; Shi, G.; Shenker, S. Network support for resource disaggregation in next-generation datacenters. In Proceedings of the Twelfth ACM Workshop on Hot Topics in Networks, College Park, MD, USA, 21–22 November 2013; pp. 1–7. [Google Scholar]
  8. Jonas, E.; Schleier-Smith, J.; Sreekanti, V.; Tsai, C.-C.; Khandelwal, A.; Pu, Q.; Shankar, V.; Carreira, J.; Krauth, K.; Yadwadkar, N. Cloud programming simplified: A berkeley view on serverless computing. arXiv 2019, arXiv:1902.03383. [Google Scholar]
  9. Ozturk, D.; Chaudhary, A.; Votava, P.; Kotfila, C. GeoNotebook: Browser based Interactive analysis and visualization workflow for very large climate and geospatial datasets. AGU Fall Meet. Abstr. 2016, 2016, IN53A-1876. [Google Scholar]
  10. Jangda, A.; Pinckney, D.; Brun, Y.; Guha, A. Formal foundations of serverless computing. Proc. ACM Program. Lang. 2019, 3, 1–26. [Google Scholar] [CrossRef] [Green Version]
  11. AlibabaCloud. Function Computing. Available online: (accessed on 3 April 2022).
  12. AlibabaCloud. Serverless Workflow. Available online: (accessed on 3 April 2022).
  13. AlibabaCloud. TableStore. Available online: (accessed on 3 April 2022).
  14. AlibabaCloud. Message Service. Available online: (accessed on 3 April 2022).
  15. AlibabaCloud. Relation Database System. Available online: (accessed on 3 April 2022).
  16. AlibabaCloud. Object Storage Service. Available online: (accessed on 3 April 2022).
  17. COG. Cloud Optimized GeoTIFF. Available online: (accessed on 3 April 2022).
  18. STAC. SpatioTemporal Asset Catalogs. Available online: (accessed on 3 April 2022).
  19. Hennessy, J.; Patterson, D. A New Golden Age for Computer Architecture: Domain-Specific Hardware/Software Co-Design, Enhanced Security, Open Instruction Sets, and Agile Chip Development. In Proceedings of the Turing Lecture Given at ISCA’18, Los Angeles, CA, USA, 2–6 June 2018; Volume 10. [Google Scholar]
  20. Crocker, D.; Overell, P. Augmented BNF for Syntax Specifications: ABNF; RFC 2234; HKU Sandy Bay RFC Ltd.: Pok Fu Lam, China, 1997. [Google Scholar]
  21. Dong, B.; Wu, K.; Byna, S.; Liu, J.; Zhao, W.; Rusu, F. ArrayUDF: User-defined scientific data analysis on arrays. In Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing, Washington, DC, USA, 26–30 June 2017; pp. 53–64. [Google Scholar]
  22. Lewis, A.; Oliver, S.; Lymburner, L.; Evans, B.; Wyborn, L.; Mueller, N.; Raevksi, G.; Hooke, J.; Woodcock, R.; Sixsmith, J.; et al. The Australian Geoscience Data Cube—Foundations and lessons learned. Remote Sens. Environ. 2017, 202, 276–292. [Google Scholar] [CrossRef]
  23. Hoyer, S.; Hamman, J. xarray: ND labeled arrays and datasets in Python. J. Open Res. Softw. 2017, 5, 10. [Google Scholar] [CrossRef] [Green Version]
  24. Eldawy, A.; Mokbel, M.F. The era of big spatial data: A survey. Foundations and Trends in Databases 2016, 6, 163–273. [Google Scholar] [CrossRef]
  25. Geotrellis. GeoTrellis is a Geographic Data Processing Engine for High Performance Applications. Available online: (accessed on 3 April 2022).
  26. Yu, J.; Wu, J.; Sarwat, M. Geospark: A cluster computing framework for processing large-scale spatial data. In Proceedings of the 23rd SIGSPATIAL International Conference on Advances in Geographic Information Systems, Seattle, WA, USA, 3–6 November 2015; pp. 1–4. [Google Scholar]
  27. Chambers, C.; Raniwala, A.; Perry, F.; Adams, S.; Henry, R.R.; Bradshaw, R.; Weizenbaum, N. FlumeJava: Easy, efficient data-parallel pipelines. ACM Sigplan Not. 2010, 45, 363–375. [Google Scholar] [CrossRef]
  28. Zaharia, M.; Chowdhury, M.; Das, T.; Dave, A.; Ma, J.; McCauly, M.; Franklin, M.J.; Shenker, S.; Stoica, I. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In Proceedings of the 9th {USENIX} Symposium on Networked Systems Design and Implementation ({NSDI} 12), San Jose, CA, USA, 25–27 April 2012; pp. 15–28. [Google Scholar]
  29. Hamman, J.; Rocklin, M.; Abernathy, R. Pangeo: A big-data ecosystem for scalable earth system science. In Proceedings of the EGU General Assembly Conference Abstracts, Vienna, Austria, 8–13 April 2018; p. 12146. [Google Scholar]
  30. Taibi, D.; El Ioini, N.; Pahl, C.; Niederkofler, J.R.S. Serverless cloud computing (function-as-a-service) patterns: A multivocal literature review. In Proceedings of the 10th International Conference on Cloud Computing and Services Science (CLOSER 2020), Prague, Czech Republic, 7–9 May 2020. [Google Scholar]
  31. Shankar, V.; Krauth, K.; Vodrahalli, K.; Pu, Q.; Recht, B.; Stoica, I.; Ragan-Kelley, J.; Jonas, E.; Venkataraman, S. Serverless linear algebra. In Proceedings of the 11th ACM Symposium on Cloud Computing, Seattle, WA, USA, 19–21 October 2020; pp. 281–295. [Google Scholar]
  32. Wu, C.; Faleiro, J.; Lin, Y.; Hellerstein, J. Anna: A kvs for any scale. IEEE Trans. Knowl. Data Eng. 2019, 33, 344–358. [Google Scholar] [CrossRef]
  33. Zhang, H.; Tang, Y.; Khandelwal, A.; Chen, J.; Stoica, I. Caerus:{NIMBLE} Task Scheduling for Serverless Analytics. In Proceedings of the 18th USENIX Symposium on Networked Systems Design and Implementation (NSDI 21), Boston, MA, USA, 12–14 April 2021; pp. 653–669. [Google Scholar]
Figure 1. Characteristics of on-the-fly cloud geocomputation.
Figure 2. Serverless-based architecture.
Figure 3. Logical region.
Figure 4. Definition of image collection.
Figure 5. Workflow skeletons.
Figure 6. Syntax description.
Figure 7. Operator publication and generation.
Figure 8. The Landsat8 data and the result tiles.
Figure 9. Part of the DAG.
Figure 10. Response time for a series of continuous requests.
Figure 11. Elasticity and concurrency.
Table 1. Description of the symbols.
<>     Denotes an operator or variable in programming; example: <img-clct>
[]     Indicates creating a data structure of numeric array; example: [[…]…[…]]
{}     Body of the UDF or a data structure of dictionary; example: {<metadata>*}
()     Indicates the inputs of the UDF; example: (<img-lhs>)
|      Choice operator for two candidate expressions; example: EQ | NE
*      Zero or more occurrences of the preceding element; example: <img-op>*
+      One or more occurrences of the preceding element; example: <img-op>+
.      Denotes the attribute or method of an object; example: <id>.apply(…)
;      Indicates the end of a BNF statement; example: Image <id>;
//     Annotation of the production rules; example: //annotation
::=    Means being defined as the right-hand expression; example: <l-op> ::= or;

Wu, J.; Wu, M.; Li, H.; Li, L.; Li, L. A Serverless-Based, On-the-Fly Computing Framework for Remote Sensing Image Collection. Remote Sens. 2022, 14, 1728.
