JupyTEP IDE as an Online Tool for Earth Observation Data Processing

Abstract: The paper describes a new tool called JupyTEP IDE, an online integrated development environment (IDE) for Earth observation data processing available in the cloud. This work is a result of the project entitled "JupyTEP IDE—Jupyter-based IDE as an interactive and collaborative environment for the development of notebook style EO algorithms on network of exploitation platforms infrastructure" carried out in cooperation with the European Space Agency. The main goal of this project was to provide a universal Earth observation data processing tool to the community. JupyTEP IDE is an extension of the Jupyter software ecosystem, with customization of existing components for the needs of Earth observation scientists and other professional and non-professional users. The approach is based on the configuration, customization, adaptation, and extension of Jupyter, JupyterHub, and Docker components on Earth observation data cloud infrastructure in the most flexible way; integration with accessible libraries and Earth observation data tools (Sentinel Application Platform (SNAP), Geospatial Data Abstraction Library (GDAL), etc.); and adaptation of existing web processing service (WPS)-oriented Earth observation services. The user-oriented product is based on a web-related user interface in the form of an extended and modified Jupyter user interface (frontend) with a customized layout, an Earth observation data processing extension, and a set of predefined notebooks, widgets, and tools. The final IDE is addressed to remote sensing experts and other users who intend to develop Jupyter notebooks with the reuse of embedded tools, common WPS interfaces, and existing notebooks. The paper describes the background of the system, its architecture, and possible use cases.


Introduction
Earth observation (EO) data processing is a complex process that requires specialized tools. In each field, there is software that performs various functions that are important from the user's point of view. The functionality of most well-known programs is fixed, and the user is unable to change the algorithms they contain. In most cases, this is sufficient and even desirable. However, there is a group of users who want to expand the capabilities of their software by developing new processing algorithms according to their own concepts for some of the unusual tasks they encounter. These are advanced users, such as engineers or scientists, often with programming skills. The choice of software for this group is much narrower: there are various general-purpose development environments and a few cloud platforms such as Google Earth Engine. However, advanced users, including academics, often look for free open-source software, with the aim of building their own solution from the available components. This provides the basis for the question that constitutes the starting point of this work.

Current Trends among Tools for EO Data Processing
EO data processing is rapidly moving from desktop to cloud computing. This trend is visible across many applications. New EO data processing platforms are being developed all over the world [12,13].
Modern scientific computations require high-level, efficient, easy-to-use programming languages with a low entry barrier. Moreover, the possibility to visualize data and create tables, charts, comments, and equations is as important as the calculations themselves. For this reason, the Jupyter notebook is gaining popularity. Among scientists and ordinary users, high-level programming languages such as Python or R have recently gained great popularity. The wealth of libraries and functions they contain is so large that they have become an important component of many analytical tools and geographic information systems (GIS). The philosophy of spatial data processing with the possibility of programming one's own analytical algorithms requires the user to achieve some degree of proficiency; however, it results in much higher flexibility and gives incomparably greater possibilities in relation to systems with closed and permanently programmed functionalities. All of these features are available within Jupyter notebooks.
Spatial data are a specific type of data. The combination of online processing philosophy with GIS and remote sensing (RS) technologies can yield interesting results and new, more efficient tools for data processing.
There are more and more earth observation datasets available online, provided by NASA, ESA, and third-party companies. On 14 December 2017, the European Commission and the European Space Agency signed contracts under the Copernicus program with four consortia to create DIAS platforms (data and information access services) [14]. Currently, the following DIAS platforms are available: ONDA [15], Mundi [16], CreoDIAS [17], Sobloo [18], and WEkEO [19]. It is much easier to process these data remotely in the cloud without downloading large datasets to a local computer. The number of online EO data processing platforms is growing, and there is a constant need for new solutions in this field. Among the existing platforms for the processing of satellite data, the following can be mentioned as examples: the ESA thematic exploitation platform (TEP), Google Earth Engine [12], Earth on AWS [20], and openEO.org [21]. Every online EO data processing tool has its own purpose, advantages, and disadvantages. For example, TEP [22] websites are oriented towards specific topics and applications. Google Earth Engine is a very powerful tool, but its configuration options are limited, it is not open source, and it cannot be scaled or reconfigured by users. The goal of openEO.org is to integrate cloud, software, and service infrastructure (backend) through a very large API, while JupyTEP IDE is a platform designed for users. JupyTEP IDE is open source, can be self-hosted, is scalable, and is fully configurable. These features distinguish it from other online EO data processing tools. An additional benefit is that in JupyTEP IDE, one can use the programming language of one's choice. Another major difference is access to the virtual machine's system shell, which grants full configuration options not just for JupyTEP IDE, but for the entire virtual machine.
The huge size of EO data makes researchers look for new ways to process them. One of the new paradigms in this field is the use of EO data cubes (EODC), revolutionizing the way users can interact with EO data and a promising solution to store, organize, manage, and analyze EO data [23,24].
Assumptions for this project were developed by observing and analyzing existing trends in the development of EO data tools and are based on suggestions from ESA associates, who are unquestionable experts in this field. The main assumptions underlying the JupyTEP IDE design process were formulated so that the system would have the following capabilities:
• Integration with existing data sources and interfaces;
• Reusability;
• Parallelization in processing;
• Multi-tenancy/multi-user operation;
• Scalability.
The authors refer to the capabilities mentioned above in detail in the context of JupyTEP IDE in the following section.

Platform Overview, Design, Methods, and Tools
This work presents an integrated development environment (IDE) for geospatial data processing, created especially for earth observation (EO) data (Figure 2). The solution, called JupyTEP IDE, is based on the Jupyter notebook, extended in such a way as to be able to easily process EO data. It allows users to use a scripting language (Python) as well as prepared scripts and services to compute and process the data. Users also have the ability to develop and share online applications (notebooks) of their own. JupyTEP IDE is targeted to meet the needs of EO scientists and other professional and non-professional users, especially in the field of geospatial processing. It integrates the most common EO-based geospatial tools, software libraries, and toolboxes for EO data (SNAP, OTB), vector and raster data processing (GRASS GIS, GDAL, PostGIS, etc.), visualization, and presentation in the most suitable form. Combining the basis of the Jupyter approach and an extended Python environment integrated with a Docker prerogative allows for interconnection with most existing services (WPS, web map services (WMS), TEP interfaces) and tools for geo-data storage and distribution (PostGIS, GeoServer, Mapnik, etc.).
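As an illustration of the kind of per-pixel processing such notebooks host, the sketch below computes NDVI from two band arrays. This is a minimal, hedged example using plain Python lists; a real JupyTEP IDE notebook would read the bands through SNAP or GDAL and operate on full raster arrays.

```python
# Minimal NDVI sketch: (NIR - Red) / (NIR + Red) per pixel.
# Plain lists keep this self-contained; real notebooks would use
# SNAP, GDAL, or numpy arrays read from Sentinel-2 scenes.
def ndvi(red, nir):
    return [(n - r) / (n + r) if (n + r) else 0.0
            for r, n in zip(red, nir)]

# Example reflectance values (hypothetical):
print(ndvi([0.1, 0.2], [0.5, 0.2]))
```

Vegetated pixels (high NIR, low red) yield values near 1, while equal band values yield 0.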

Tools and Libraries Used for the Project
The JupyTEP IDE environment is built with the reuse paradigm. The following core components are used for building the JupyTEP IDE solution:
• The entire system is composed of tools and libraries integrated on top of Docker and Jupyter solutions to extend the functionality of the Jupyter notebook idea.
• The Docker platform makes creating, deploying, and running applications much easier by using containers. Containers allow one to package an entire application with all of its requirements (such as libraries) and deploy it as one package. This way, the application will run on any other Linux machine regardless of any host machine settings.
• The IDE idea is based on separate cells containing fragments of code, which can be run separately. This allows for easier modification of code parts and strongly supports code reuse. In Jupyter, there are input cells (for writing code) and output cells (for showing results). Input cells use the programming language (kernel) defined for a given notebook. As kernels are based on IPython, its magic commands are also available (called cell magics). These magic commands are a set of predefined functions that can be called with a command-line-style syntax [8].
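The cell-magic mechanism can be pictured, outside Jupyter, as a small dispatcher that maps a command-line-style first line to a registered function. The sketch below is purely illustrative: the `echo` magic and the `register` helper are hypothetical stand-ins, not part of IPython, which provides this functionality via `@register_cell_magic`.

```python
import shlex

# Hypothetical registry mimicking how a cell magic routes a
# command-line-style first line to a Python function.
MAGICS = {}

def register(name):
    def deco(fn):
        MAGICS[name] = fn
        return fn
    return deco

@register("echo")
def echo_magic(args, body):
    # Echo the cell body, uppercased when the -u flag is present.
    return body.upper() if "-u" in args else body

def run_cell(first_line, body):
    # A line like "%%echo -u" selects the magic and its arguments.
    name, *args = shlex.split(first_line.lstrip("%"))
    return MAGICS[name](args, body)

print(run_cell("%%echo -u", "hello jupytep"))  # HELLO JUPYTEP
```

In a real notebook, IPython performs this dispatch for every cell whose first line starts with `%%`.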

Platform Design
JupyTEP IDE is designed for a cloud-based environment and operates as a typical web-based application. The design process of the UI focused on convenient and user-friendly search and discovery of EO data with the use of GIS tools and IDE functionalities. The application is designed to be a responsive web application for all screen sizes and resolutions. The target operating system for the server side of JupyTEP IDE is Ubuntu, based on Docker stacks, while the client can run on any platform equipped with a web browser.
The main goal was to create a solution and a platform for online EO data processing and visualization. To reach the JupyTEP IDE goals, all system components were implemented with a close connection to the EO data repository. The assumptions for the project mentioned in the previous section have been implemented as follows:
• JupyTEP IDE is an adaptation of a Jupyter web-based environment with predefined tools for EO data access, providing a development environment that is easy to use and easy to integrate with any platform infrastructure; this allows the interoperability objectives to be attained;
• EO tools integration is achieved as simplified access to EO data for the developer community through customization of JupyTEP IDE. The proposed integration tasks provide access to multiple sources, large volumes, and time series of EO data with predefined interfaces by integration with a cloud-based EO repository, i.e., the innovative platform testbed (IPT);
• The reuse paradigm of JupyTEP IDE is achieved by the integration of open-source components and the sharing of notebooks with the results of developed algorithms and services. Additionally, the provision of WPS, which can run notebooks and EO processing, fulfills the reuse aspects. All of these actions enhance the production of EO-based information and the reuse of processing algorithms in Jupyter notebook form;
• To achieve the parallelization objective, components responsible for parallel processing in the Jupyter environment (i.e., ipython-cluster-helper and the built-in mechanisms of EO data processing libraries) were integrated;
• Implementation of JupyTEP IDE on top of JupyterHub and distribution as a Docker container allow the multi-tenancy objectives to be resolved. All developments in WP3 were focused on obtaining operational software covering the requirements of the JupyterHub and Docker environment;
• The scalability objectives were met thanks to the integration of cloud elements, integration with the platforms' infrastructure, all the components integrated within JupyTEP IDE, and functionalities implemented as a part of that environment.
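The parallelization objective can be illustrated with standard-library workers: independent per-scene steps are fanned out to a pool and collected in order, much as a cluster helper fans work out over nodes. This is a hedged sketch; `process_scene` is a hypothetical stand-in for a real EO processing step.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a per-scene EO processing step
# (e.g. calibration or index computation on one Sentinel scene).
def process_scene(scene_id):
    return scene_id.lower()

scenes = ["S2A_SCENE_1", "S2A_SCENE_2", "S2A_SCENE_3"]

# Independent scenes are processed concurrently; map() preserves order.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(process_scene, scenes))
print(results)
```

In JupyTEP IDE, the same fan-out pattern is provided across cluster nodes rather than local threads.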
The JupyTEP IDE solution is built as a multiservice platform with components in the form of Docker containers running in swarm mode. Every component plays a different role. PostgreSQL with the PostGIS extension is used as the user database and storage; it is a modified and adapted version of an existing Docker image, based on PostgreSQL 9.6 and the PostGIS 2.4 extension. The PostGIS container is linked to JupyTEP IDE via an internal network and is complemented by a set of tools and object-relational mappers (ORMs) such as SQLAlchemy, GeoAlchemy, etc.
For the sharing of results and the implementation of the WPS, web feature service (WFS), web coverage service (WCS), and web map service (WMS) interfaces, GeoServer and Mapnik are used: an existing Docker image was adapted (GeoServer) and a new one was built (Mapnik). Both can be accessed via the HTTP protocol and an application programming interface (API). On top of each API, software development kit (SDK) methods are added to provide full service functionality from a notebook-cell perspective.
For WPS functionality, a PyWPS server extension is integrated to share results in the form of a WPS interface. It runs as a Docker container acting as a WPS server. The main role of PyWPS is to provide advanced functionalities for developing and providing user services. PyWPS is integrated via Docker interfaces and an API.
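A WPS server accepts standard OGC requests. As a hedged sketch of the client side, the snippet below assembles a WPS 1.0.0 Execute request in key-value-pair (KVP) encoding; the endpoint URL, the `run_notebook` process identifier, and the inputs are hypothetical, not actual JupyTEP IDE identifiers.

```python
from urllib.parse import urlencode

# Build an OGC WPS 1.0.0 Execute request URL in KVP encoding.
# DataInputs are semicolon-separated key=value pairs per the spec.
def wps_execute_url(endpoint, process, inputs):
    data_inputs = ";".join(f"{k}={v}" for k, v in inputs.items())
    params = {
        "service": "WPS",
        "version": "1.0.0",
        "request": "Execute",
        "identifier": process,
        "datainputs": data_inputs,
    }
    return f"{endpoint}?{urlencode(params)}"

# Hypothetical endpoint and process name:
url = wps_execute_url("https://example.org/wps", "run_notebook",
                      {"notebook": "ndvi.ipynb", "scene": "S2A_MSIL2A_example"})
print(url)
```

A service like PyWPS would parse such a request, run the named process, and return an XML response with the outputs or a status location.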
To provide functionalities such as those in a typical IDE, the PixieDust debugger is used. This Jupyter-based debugger provides real-time debugging functionality and is integrated as a tool invoked as a cell magic.

Platform Infrastructure
In terms of building a new platform infrastructure, JupyTEP IDE software provides a highly configured and optimized environment for any EO cloud infrastructure. At the development stage, JupyTEP IDE ran on its first candidate, the "EO innovation platform testbed Poland" (IPT), a powerful Earth observation data repository established in Poland and launched worldwide. The repository gathers optical data provided by satellites within the framework of missions such as Landsat, Envisat, and Sentinel, among others. The project is implemented by a consortium headed by Creotech Instruments SA, also embracing the Polish company CloudFerro Ltd. and the German company Brockmann Consult Ltd., based on a contract concluded with the European Space Agency [26]. Finally, JupyTEP IDE is located on CreoDIAS with wide support for, and integration with, the EO data repository. This type of integrated cloud infrastructure allows any development and EO data processing task to be performed without data transfer, which results in high performance. From the infrastructure (cloud) side, JupyTEP is built on top of the Docker environment. It integrates Docker, Docker Compose, Docker Engine, Docker Machine, and Docker Swarm, composed into separate JupyTEP Docker Stacks images with a predefined set of tools. This kind of architecture allows the user to choose the optimal environment for the planned task with all necessary tools and user environment configurations (Figure 3). JupyTEP IDE extends the paradigm of a collaborative platform in the form of data and information exchange. Data sharing is resolved with the use of two types of volumes, user- and host-oriented [27]:
• User volumes (work directory) store private data with access limited to the user;
• Host volumes (shared directory) store data available to any user (public folder).
In terms of authorization and authentication, the JupyTEP IDE platform is integrated with GitHub.
To start using JupyTEP IDE, a user should log in with their GitHub credentials via the GitHub OAuth mechanism.
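The first step of such a login is a redirect to GitHub's authorization endpoint. The sketch below builds that URL with the standard library; the client ID, redirect URI, and state value are hypothetical placeholders, and in practice JupyterHub's GitHub authenticator handles this exchange.

```python
from urllib.parse import urlencode

# Build the GitHub OAuth authorization URL (first leg of the flow).
# After the user approves, GitHub redirects back with a code that
# the Hub exchanges for an access token.
def github_authorize_url(client_id, redirect_uri, state):
    params = {
        "client_id": client_id,        # hypothetical OAuth app ID
        "redirect_uri": redirect_uri,  # hypothetical callback URL
        "state": state,                # CSRF-protection token
        "scope": "read:user",
    }
    return "https://github.com/login/oauth/authorize?" + urlencode(params)

url = github_authorize_url("Iv1.example", "https://ide.example.org/hub/oauth_callback", "xyz")
print(url)
```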
JupyTEP IDE is based on the Docker client-server architecture [28]. Docker is an open-source platform used to package, distribute, and run applications. From the point of view of JupyTEP IDE, it provides an easy and efficient way to decouple applications from the infrastructure and run them as a single Docker image shared through a central Docker registry. The final Docker JupyTEP IDE image is used to launch a JupyTEP IDE user container, which makes the contained JupyTEP IDE available. In simple terms, Docker is a containerization platform, an OS-level virtualization method used to deploy and run the distributed JupyTEP IDE application together with all EO-related dependencies in the form of a Docker container. By using the Docker platform, JupyTEP IDE can multiply isolated instances of JupyTEP IDE services running on a single host, access the same Jupyter kernel, and ensure that the application works seamlessly for any JupyTEP IDE user (Figure 4).
From the JupyTEP IDE perspective, JupyterHub interfaces with the Docker Swarm service running on the same host machine, and Docker Swarm takes care of launching containers across the other nodes. Each container launches a Jupyter notebook server for a single user, and JupyterHub then proxies the container port to the user. Users do not connect directly to the nodes in the Docker Swarm pool. The JupyterHub architecture addresses the issues of locality and network topology of JupyTEP IDE. The Hub and Proxy reside on a known host under the control of the site administrator, and user notebook servers are safely configured to accept connections from that host. Communication with the per-user servers is secured and authenticated via a GitHub OAuth exchange between the Hub and the notebook server. By adopting the appropriate network security measures, this host allows one to safely bridge between the restricted internal network domain and a broader network domain from which users can connect, potentially including the public Internet (Figure 4).
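The proxying idea can be sketched as a routing table that maps a user's URL prefix to the host and port of that user's notebook container, resolved by longest-prefix match much as JupyterHub's configurable proxy does. All names and addresses below are hypothetical.

```python
# Hypothetical routing table: URL prefix -> user container address.
routes = {}

def add_user_route(user, host, port):
    routes[f"/user/{user}/"] = f"http://{host}:{port}"

def resolve(path):
    # Longest-prefix match, as a configurable HTTP proxy would do.
    for prefix in sorted(routes, key=len, reverse=True):
        if path.startswith(prefix):
            return routes[prefix]
    return None

add_user_route("alice", "10.0.0.5", 8888)
print(resolve("/user/alice/notebooks/ndvi.ipynb"))  # http://10.0.0.5:8888
```

In the real system, the Hub registers and removes such routes as Docker Swarm starts and stops single-user containers.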
In addition to the Jupyter and Docker configuration, we developed our own set of functions and modules for easy notebook script writing. All elements were developed in Python 3.6 and implement the most important functions for working with JupyTEP IDE, enabling communication between components, map manipulation, and easier coding and presentation of results on the map in JupyTEP IDE.
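As a hedged illustration of such a helper (the function name and attributes are hypothetical, not the actual JupyTEP IDE API), a processing result can be wrapped as a GeoJSON feature for display on a map widget:

```python
import json

# Wrap a result polygon and its attributes as a GeoJSON Feature,
# the interchange format most notebook map widgets accept.
def to_geojson_feature(ring, props):
    return {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": [ring]},
        "properties": props,
    }

# Hypothetical result: a small polygon with a mean-NDVI attribute.
feature = to_geojson_feature(
    [[21.0, 52.2], [21.1, 52.2], [21.1, 52.3], [21.0, 52.2]],
    {"ndvi_mean": 0.42},
)
print(json.dumps(feature)[:60])
```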
At the time of development, the EO data repository available on the JupyTEP IDE platform consisted mostly of data from the Copernicus Sentinel satellites (including Sentinel-5P), as well as ESA Landsat and Envisat data. However, the JupyTEP IDE philosophy is aimed at complete freedom of configuration. This offers great opportunities to adapt the software environment, which means that any other data source can be connected according to the user's needs.

Reuse of Existing EO Data Processing Tools
One of the main concepts of JupyTEP IDE is the reuse of existing EO data processing tools. APIs to ESA SNAP, OTB, and GRASS GIS are available to users. These tools are integrated with JupyTEP IDE through Python bindings. SNAP, developed by ESA, is a common architecture for the Sentinel Toolboxes. It provides many tools and processing algorithms in a desktop application available for many operating systems. It also has a command-line interface and an application programming interface (API) in Java with Python bindings [9].
SNAPHU is an implementation of the statistical-cost, network-flow algorithm for phase unwrapping [29]. A Python binding is available through the JupyTEP IDE library; it is implemented using the subprocess Python module.

OTB is an open-source project for remote sensing data processing. It can be used to work with optical, multispectral, and radar images. The performance of this software is relatively high, and it can process high-resolution images at a terabyte scale. Applications such as ortho-rectification, pan-sharpening, classification, SAR processing, and many more are available. OTB algorithms are accessible from Python, but also from QGIS, C++, or the command line. OTB is available for many operating systems, including Linux, Windows, and macOS.
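The subprocess-based wrapping mentioned above can be sketched as follows; `echo` stands in for the real `snaphu` or OTB binary so the example runs without any EO tools installed, and the wrapper shape is illustrative rather than the actual JupyTEP IDE library code:

```python
import shlex
import subprocess

# Wrap a command-line EO tool so it can be called from a notebook
# cell: split the command, run it, fail loudly on a nonzero exit
# code, and return captured stdout.
def run_tool(cmd):
    result = subprocess.run(shlex.split(cmd),
                            capture_output=True, text=True)
    result.check_returncode()
    return result.stdout.strip()

# "echo" is a stand-in for e.g. "snaphu unwrapped.img 1000 ...".
print(run_tool("echo snaphu-style wrapper"))
```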

Application of Developed Tools in EO Data Processing
As a result of this project, an online tool for Earth observation data processing, namely JupyTEP IDE, has been created. Using this system, users gain access to datasets and functions that are useful for EO data processing, visualization, and results sharing. The main advantage that distinguishes this product from other tools, apart from the use of ready-made algorithms, is the ability to quickly and easily create one's own algorithms or scripts. As a result, the data processing can be controlled strictly at every stage and in a form adapted to any user's needs. Although the basic programming language in JupyTEP IDE is Python, it is possible to create code in other languages by launching additional interpreter engines (called kernels) offered by the integrated Jupyter notebook functionality. By default, JupyTEP IDE offers the ability to create scripts in Python 2.7, Python 3.6, R, Scilab, Bash, and Octave, and indirectly, via the "cell magic" technique, it gives access to JavaScript programming capabilities. Figure 5 depicts the functional view of the system. From the user's perspective, the most important part is the client part: the entire interaction of a user with JupyTEP IDE is through a web browser. To better depict the possibilities and functionalities of JupyTEP IDE, a few use case diagrams are presented below.
As mentioned, JupyTEP IDE is based on Jupyter and is an extension of the Jupyter notebook web application. As shown in Figure 6, the user is able to manage the JupyTEP IDE environment by adding, removing, or switching Jupyter extensions. This gives great flexibility in configuring the software according to users' needs. While working with one's own source code, one can run several notebooks at the same time and manage them from the user interface. It is also possible to manage clusters for parallel notebook execution on remote machines.
Each JupyTEP IDE user gains access to their own remote data storage, so all of the most important filesystem operations are possible, such as upload, download, create, rename, and delete. Because the user has access to the virtual operating system, they can also run the terminal and perform any operation on the operating system. All of these use cases are illustrated in Figure 7. Applications created by users and provided with JupyTEP IDE are stored in the form of notebook files (*.ipynb). To make notebook browsing more convenient, one can use the notebook browser, which displays only *.ipynb files; it also allows recursive file deletion (Figure 8). This functionality is not available in standard Jupyter.
Because JupyTEP IDE is designed for spatial data processing, working with a map is an important functionality. The user is able to display raster and vector maps, organized into layers in the map browser, as shown in Figure 9. Layers can be loaded as output data from processing or from external data sources such as files or web map service/web map tile service (WMS/WMTS) endpoints. Furthermore, the map browser serves as the output for the Earth observation data search tool, as depicted in the use case diagram in Figure 10. When writing many different programs in the same scope, we often deal with repeating fragments of code or whole algorithms, or we want to save some code fragments for later use. In such cases, the code snippets functionality is very handy; it is also implemented in JupyTEP IDE, as shown in Figure 11. JupyTEP IDE comes with working, ready-to-use examples of notebooks and code snippets. The first steps in the JupyTEP IDE environment involve logging into a JupyTEP IDE account and automatically launching the web service.
A user gains access to their own separate virtual environment based on Docker virtualization technology. A fully functional application, along with disk space for data storage, is at the user's disposal (Figure 12). Work can be started by opening one of the many provided notebooks or by creating one's own script from scratch. Regardless of the chosen programming language, an algorithm can be divided into parts called cells, each of which can be run separately. The source code contained in a single cell can be run online by issuing the "run cell" command or pressing the Ctrl+Enter combination. It is also possible to run all cells at once as the entire program, depending on the user's preferences. The processing result of each cell is visible directly below it and is presented in text or graphic form, depending on what value is given as an output. A program created in this way can be saved as a script in a *.ipynb file, whose internal format is JSON. It is saved to the user's account and can be launched at any time or downloaded to a desktop computer. The map component is a feature that distinguishes JupyTEP IDE and adapts it to the needs of the community interested in EO data processing. The side panel on the right side of the screen is divided into tabs. One of them provides a map, which allows the user to search for a particular EO product and to automatically present the results of processing. Its functionality has been taken from popular GIS systems.
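Because a saved notebook is plain JSON, its structure is easy to inspect or generate programmatically. A minimal, valid nbformat-4 document contains a list of cells plus metadata and version fields:

```python
import json

# A minimal nbformat-4 notebook with a single code cell.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "source": "print('hello from JupyTEP IDE')",
            "outputs": [],
        }
    ],
}

# Serializing this dictionary yields the kind of text stored in a *.ipynb file.
ipynb_text = json.dumps(notebook, indent=1)
```

This is why notebooks travel well between accounts and machines: downloading, uploading, or sharing a notebook is just moving a JSON text file.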
Thus, any result of an algorithm in the form of spatial data can be added to the map on an ongoing basis. These can be raster formats such as PNG, JPG, or BMP, vector formats such as GeoJSON, or services such as TMS or WMS (Figure 13B). A user has the option to create, launch, and share their own WPS service, which processes data and sends results to a specific location, or to use the data processing services of other providers.
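A WPS service of this kind is typically invoked over HTTP using the OGC WPS 1.0.0 key-value-pair (GET) encoding. The following sketch builds such an Execute request URL; the endpoint and process identifier used in the test are placeholders, not real JupyTEP IDE services:

```python
from urllib.parse import urlencode

def build_wps_execute_url(endpoint, process_id, inputs):
    """Build a WPS 1.0.0 Execute request in KVP (GET) encoding.

    `inputs` maps input identifiers to values; they are joined into the
    DataInputs parameter as `key=value` pairs separated by semicolons.
    """
    params = {
        "service": "WPS",
        "version": "1.0.0",
        "request": "Execute",
        "identifier": process_id,
        "DataInputs": ";".join(f"{k}={v}" for k, v in inputs.items()),
    }
    return endpoint + "?" + urlencode(params)
```

In practice, a notebook would issue this request with an HTTP client and then load the service response as a new map layer.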
Another useful feature of JupyTEP IDE is the ability to create code snippets that can be reused at any time. A user can find a set of pre-prepared snippets in the "Snippets" tab on the side panel (Figure 13A).
JupyTEP IDE also provides the possibility to search spatial data services using functions located in the side panel toolbar. After selecting the area of interest and entering search parameters (such as mission, sensor, time span, etc.), a user obtains a result in the form of a list of products. The coverage of these products is visualized as a graphical layer on the map. Each product has attributes that allow the user to read the metadata of the material and to download it or connect it to their own algorithm (Figure 14). As already mentioned, a user of JupyTEP IDE gets their own virtual environment, which allows quite extensive interference with and control over the installed software components. It is possible to switch to the terminal view at any time, which gives direct access to the virtual system shell. One can add or remove the libraries and Python packages on which JupyTEP IDE is based and customize the system's functionality to suit one's needs. A user can also make other changes within the installed applications or additions to the system platform itself and configure it for strictly defined needs. JupyTEP IDE has a built-in text editor through which a user can make changes to any file saved in their account.
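A catalogue search of this kind boils down to an HTTP query carrying the mission, time span, and area-of-interest parameters. The sketch below assumes a generic OpenSearch-style endpoint; the URL and parameter names are illustrative, not the actual interface used by JupyTEP IDE:

```python
from urllib.parse import urlencode

def build_search_query(endpoint, mission, start_date, end_date, bbox):
    """Compose an EO catalogue search URL.

    bbox is (min_lon, min_lat, max_lon, max_lat); dates are ISO 8601 strings.
    """
    params = {
        "platform": mission,
        "startDate": start_date,
        "completionDate": end_date,
        "box": ",".join(str(v) for v in bbox),
    }
    return endpoint + "?" + urlencode(params)
```

The JSON response of such a query (a list of products with footprints and metadata) is what the map browser renders as a coverage layer.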
All files can be freely downloaded, placed on the server, or shared with other users.

Example of Processing Chain
For the purpose of demonstrating JupyTEP IDE functionality, an example of a simple band math operation with its visualization is presented below. The results are shown in Figure 15. The In [1] and Out [1] statements in the code stand for input and output IPython/Jupyter cells, respectively. In the algorithm, we have omitted obvious operations like importing standard libraries, preparing plots using matplotlib, etc.

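The original listing is shown in Figure 15 rather than reproduced in this text; as an orientation, a comparable band math operation, here a normalized difference of two bands computed with numpy, could read as follows. This is an illustrative sketch under synthetic data, not the exact code behind the figure:

```python
import numpy as np

def normalized_difference(band_a, band_b):
    """Per-pixel (a - b) / (a + b), e.g. NDVI from NIR and red bands.
    Pixels where the denominator is zero are set to 0 to keep output finite."""
    a = band_a.astype("float64")
    b = band_b.astype("float64")
    denom = a + b
    out = np.zeros_like(denom)
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out

# Synthetic 2x2 "bands" standing in for real raster data.
nir = np.array([[0.8, 0.6], [0.4, 0.0]])
red = np.array([[0.2, 0.2], [0.4, 0.0]])
ndvi = normalized_difference(nir, red)
```

In a notebook, the resulting array would be rendered directly below the cell with matplotlib and could then be added to the map browser as a raster layer.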


Discussion
Taking into account the initial assumptions, our team has managed to develop software that meets all the project requirements. The above-mentioned assumptions are the result of an analysis of existing solutions and possibilities found in open-source software and of ESA/ESRIN recommendations.
We have achieved interoperability, integration with existing data sources and interfaces, reusability, parallel processing, multi-tenancy, and scalability in the created solution.
As the benefits of our work, we can mention:
• An easy and user-friendly entry point for everyone who wants to process EO data,
• A development environment that allows fast prototyping with the use of EO data,
• A predefined environment with most of the popular EO data processing tools pre-installed,
• Fast and easy access to a wide range of EO data,
• Flexible "on-the-fly" customization of the environment and modification of any library and programming language for specific user purposes,
• Real-time visualization of processing results.

The proposed solution, due to the complex nature of both EO data and infrastructure, has its limitations, among others:
• The quite large amount of resources necessary to process EO data,
• The costs of the infrastructure of the DIASes or any other commercial clouds.
After talks with representatives of all the European DIASes, JupyTEP IDE will be integrated with all of them, starting from Onda and Mundi. It is currently present on CreoDIAS, to which we are planning to migrate from the IPT infrastructure. Following the latest trends, in the near future we plan to develop several new functionalities enhancing services and performance (migration to JupyterLab, integration with data cubes, and new modules and tools).

Conclusions
JupyTEP IDE aims to be an all-in-one online cloud software tool that integrates the most common open-source EO-oriented geospatial tools, software, libraries, and toolboxes for EO data (SNAP, OTB), vector and raster data processing (GRASS GIS, GDAL, PostGIS, etc.), visualization, and presentation in the most suitable form. Based on the Jupyter approach and an extended Python environment integrated with the Docker ecosystem, it allows for interconnection with most of the existing services (WPS, WMS, TEP interfaces) and tools for geospatial data storage and distribution (PostGIS, GeoServer, Mapnik, etc.). JupyTEP IDE, based on the configuration, customization, adaptation, and extension of Jupyter and Docker Swarm components, provides highly configured and optimized software for any EO cloud infrastructure. It allows EO data processing tasks to be carried out with high performance, without the need for data transfer.
The use of open source gives a great deal of freedom to expand and configure the created solution, thanks to the variety of available open-source software. Moreover, a JupyTEP IDE user is able to fully configure their environment and easily add or remove components. JupyTEP IDE combines the capabilities of a cloud-based application with full access to the configuration of the platform on which it operates. This gives users great freedom in configuring their own research environment in compliance with their requirements. It can be concluded that the most important advantages of JupyTEP IDE are: