1. Introduction
Small satellites are artificial satellites with masses below 500 kg, placed in orbit around planets and natural satellites to collect information or provide communication [1]. They are used for Earth observation (EO) across academia and industry, with relevance to a variety of fields such as basic research and technology demonstration [2], environmental monitoring [3], and disaster management [4]. In recent years, the development of low-Earth-orbit satellite constellations and the democratization of space have boosted the launch of small satellites, a trend expected to continue in the future [5,6].
Due to their remote location and inaccessibility, small satellites face significant constraints in on-board storage, processing power, and downlink bandwidth. This has driven advances in handling large EO data, such as new technologies for storing and downlinking data to the ground, as well as advanced on-board satellite processing operations [7,8]. Artificial intelligence (AI) systems have proven excellent at extracting information from high volumes of data, such as images and videos, using computer vision and other machine learning (ML) techniques [9]. The edge deployment of neural algorithms (i.e., at the source of the data) reduces the need to transmit large volumes of raw data, allowing for more efficient bandwidth usage and faster downstream processing. It also enables the prompt extraction of actionable insights from data streams, supporting faster and more autonomous system responses.
Accordingly, there is growing interest in exploring on-board AI methods to improve small satellite (SmallSat) technologies [10,11].
Φ-Sat-1 demonstrated the significant role of AI in data analysis and decision-making processes using object detection, which could contribute to environmental monitoring [12]. Additionally, lossy image compression for SmallSats is a critical element in designing an off-satellite data transmission system and has thus been broadly explored in the literature [13,14,15,16].
A less explored avenue of image processing for SmallSats is anomaly detection, which allows the recognition of anomalous images useful for data analysis and satellite operations. Anomaly detection is defined as the recognition of unusual elements in a dataset based on a learned description of a data-driven, background-only model, commonly provided by machine learning tools. For example, anomaly detection can aid the early detection and containment of wildfires, flooding, and other natural disasters [17]. It can also be used for the rapid detection of instrumental issues, namely pixel failures in the camera or other disturbances [18]. Such methods have been studied for satellite telemetry [19,20,21]. However, anomaly detection in SmallSat image analysis remains relatively unexplored.
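This study later quantifies reconstruction quality with SSIM scores, so a minimal sketch of reconstruction-based anomaly scoring may help. The single-window "global" SSIM below is a simplification of the standard sliding-window metric, and the 0.7 threshold is illustrative only (chosen near the test set mean SSIM of 0.75 reported later); both are assumptions for this sketch, not the paper's exact procedure.

```python
import numpy as np

def global_ssim(x, y, c1=0.01**2, c2=0.03**2):
    """Simplified single-window SSIM for images scaled to [0, 1].

    The published SSIM uses a sliding Gaussian window; this global
    variant is enough to illustrate reconstruction-based scoring.
    """
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx**2 + my**2 + c1) * (vx + vy + c2)
    )

def is_anomalous(original, reconstruction, threshold=0.7):
    """Flag a patch whose reconstruction quality falls below threshold."""
    return global_ssim(original, reconstruction) < threshold

rng = np.random.default_rng(0)
img = rng.random((256, 256))
# A perfect reconstruction scores SSIM = 1; an inverted image scores very low.
assert not is_anomalous(img, img)
assert is_anomalous(img, 1.0 - img)
```

The key idea is that a model trained only on background data reconstructs background well but anomalies poorly, so a low reconstruction score itself becomes the anomaly signal.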
This work studies the dual use of autoencoders for data compression and anomaly detection via on-satellite deployment for SmallSat technologies. An autoencoder (AE) is an ML architecture that can offer both functionalities: it is trained to reconstruct its input after dimensional reduction through a bottleneck latent space, providing a lower-dimensional representation from which faithful reproductions of the original input can be generated. The use of AEs for data compression has been explored in recent years in the context of SmallSat missions [22]. The specific model used here is a convolutional AE (CAE), whose convolutional layers aid learning from two-dimensional images. These results expand on the previous literature by providing a proof of concept that the ML models commonly used for intelligent image compression can also provide anomaly detection capabilities without further optimization or additional processing resources.
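Figure 3 specifies the CAE layers by their kernel size k and stride s, but the exact values are not reproduced in this text, so the sketch below uses hypothetical k = 3, s = 2, p = 1 layers purely to show the shape arithmetic: strided convolutions halve the spatial dimensions toward the bottleneck, and transposed convolutions restore them in the decoder.

```python
def conv2d_out(size, k, s, p=1):
    # Standard convolution output size: floor((H + 2p - k) / s) + 1
    return (size + 2 * p - k) // s + 1

def conv2d_t_out(size, k, s, p=1, out_p=1):
    # Transposed convolution inverts the mapping:
    # H_out = (H - 1) * s - 2p + k + out_p
    return (size - 1) * s - 2 * p + k + out_p

h = 256                            # hypothetical square input dimension
for _ in range(3):                 # three strided conv layers in the encoder
    h = conv2d_out(h, k=3, s=2)    # 256 -> 128 -> 64 -> 32
assert h == 32
for _ in range(3):                 # mirrored transposed convs in the decoder
    h = conv2d_t_out(h, k=3, s=2)  # 32 -> 64 -> 128 -> 256
assert h == 256
```

Mirroring the encoder strides in the decoder is what lets the CAE emit a reconstruction with the same height and width as its input.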
Small satellites are cost-effective and can be developed on relatively short timescales compared to other scientific instruments. In this way, they offer a testbed for advanced technology by providing new opportunities for payload designs [23], enabling emerging space nations, including those in Africa, to develop and launch space programs. These technologies can benefit Africa in terms of socioeconomic and environmental development, for example, through applications in agricultural surveillance, real-time disaster monitoring, and sustainable development. Therefore, the use case of anomaly detection considered here is the identification of Saharan dust storms and camera defects in aerial satellite images of Dakar, Senegal, taken during the Copernicus Sentinel-2 mission (Figure 1).
The ability to compress images to lower-dimensional representations on a satellite could reduce the volume of data transmitted to Earth, followed by decompression on the ground, enabling more power-efficient data downlinks within the highly constrained satellite computing system [25]. Further, real-time dust storm recognition at the satellite level could allow for dynamic changes in satellite image-taking (i.e., disrupting standard image-taking patterns to focus on anomalous regions), with the potential to significantly improve weather and climate monitoring in the region [26].
2. Datasets and Image Processing
The Sentinel-2 mission is part of the European Union's Copernicus program, consisting of a wide-swath, high-resolution, multi-spectral imaging payload [27]. The mission consists of twin satellites phased at 180° to each other in the same orbit, with a high revisit frequency of 5 days at the equator. The payload is a multi-spectral instrument (MSI) that samples 13 spectral bands at spatial resolutions of 10 m, 20 m, and 60 m. The swath width is 290 km.
The Sentinel-2 datasets are open-source and easily accessible through the Copernicus Data Space Ecosystem platform, which provides a large amount of georeferenced imagery to researchers, developers, and policymakers worldwide [24]. The dataset includes Level-1C products (orthorectified top-of-atmosphere reflectances) and Level-2A products (bottom-of-atmosphere reflectances). The Sentinel-2 MSI acquires measurements with a radiometric resolution of 12 bits; these measurements are converted to reflectance and stored as 16-bit integers. The choice of Sentinel-2 datasets for this study is based not only on their free, open-access policy and the large availability of satellite images but also on the fact that the data acquired by the on-board MSI are similar to those acquired by the multi-spectral cameras used in SmallSats.
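The 12-bit-to-reflectance handling described above can be sketched as follows. The scale factor of 10,000 is the nominal Sentinel-2 quantification value, used here as an assumption; the exact factor (and any radiometric offset introduced in newer processing baselines) should be read from the product metadata.

```python
import numpy as np

QUANT = 10000.0  # assumed Sentinel-2 quantification value: reflectance = DN / 10000

def dn_to_reflectance(dn):
    """Convert 16-bit digital numbers to reflectance clipped to [0, 1]."""
    refl = dn.astype(np.float32) / QUANT
    return np.clip(refl, 0.0, 1.0)

# 12-bit radiometry stored as 16-bit integers; values above QUANT are clipped.
dn = np.array([0, 5000, 10000, 12000], dtype=np.uint16)
r = dn_to_reflectance(dn)  # 0.0, 0.5, 1.0, 1.0
```

Clipping to [0, 1] matches the normalization range the CAE expects for its input patches.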
The dataset used in this study consists of 70 Sentinel-2 aerial images of the region of Dakar, Senegal, each covering an area of approximately 12,115 km². Dakar is chosen as the region of interest in Africa because it offers a balance between ocean and land for realistic image feature presence, non-negligible rates of image anomalies such as Saharan dust storm events, and favorable weather conditions for satellite observations (e.g., relatively low cloud coverage). The images, taken from 2020 to 2024 with a 60 m resolution and <20% cloud cover at the L2A processing level, are used to develop the CAE model. Of these, 63 are used for the training set, and the remaining 7 compose the validation and test sets. The anomalous dataset consists of two known dust storm events on 1–5 June 2022 [28]. These two anomalous Sentinel-2 acquisitions are not included in the 70 images in the non-anomalous training/testing dataset and are reserved for a dedicated test set used to examine the anomaly detection capabilities of the ML approach.
Before the training dataset was passed to the network, it was preprocessed as follows: (1) bands were selected using free, open-source QGIS (Quantum Geographic Information System) v3.40 software; (2) the dataset was split into training, validation, and test sets; (3) patches (tiles) were extracted with overlap; (4) the patches were saved with their respective metadata; and (5) normalization and datatype conversion were performed. The details of each step are provided below.
Band selection was performed using QGIS. Bands 2 (blue), 3 (green), and 4 (red) were isolated from the rest, and an RGB composite image was created for visualization, as shown in Figure 2. Ultimately, only Band 4 (red) was used to train the CAE, allowing it to properly learn both water and land features, which are predominantly present in the blue and green bands, respectively.
Images were split into training (90%) and validation/test (10%) sets based on their date of acquisition. Each image with a 60 m resolution had a size of 1830 × 1830 pixels. Single-band images were then divided into patches of 256 × 256 pixels, with an overlap of 135 pixels between adjacent patches, achieving 99.89% image coverage. This was achieved by sliding a window across the image with a stride equal to the patch size minus the overlap, ensuring that neighboring patches shared common regions. The patch size was chosen because it provides a balance among power consumption, processing time, and accuracy [29]. The overlap was selected to balance comprehensive feature capture against GPU memory constraints, which are critical for small satellite applications, based on a systematic evaluation of image coverage versus computational efficiency. At each position, a patch was extracted and passed to a normalization routine to ensure that pixel values fell within the range [0, 1]. Patch extraction was performed after dataset splitting, with identical processing applied to all images, to prevent data leakage and ensure consistent patch views across sets.
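The sliding-window extraction can be sketched as follows. The 256-pixel patch size is inferred from the stated numbers (with a 135-pixel overlap it yields 14 × 14 = 196 patches per 1830-pixel image and the quoted 99.89% area coverage), and the per-patch min-max normalization is one plausible way to map values into [0, 1]; both are assumptions of this sketch.

```python
import numpy as np

PATCH, OVERLAP, IMG = 256, 135, 1830  # patch size assumed from the stated counts
STRIDE = PATCH - OVERLAP              # 121-pixel step between window positions

def extract_patches(band):
    """Slide a PATCH x PATCH window with step STRIDE over a single-band
    image, min-max normalizing each patch into [0, 1]."""
    patches = []
    for top in range(0, band.shape[0] - PATCH + 1, STRIDE):
        for left in range(0, band.shape[1] - PATCH + 1, STRIDE):
            patch = band[top:top + PATCH, left:left + PATCH].astype(np.float32)
            lo, hi = patch.min(), patch.max()
            patches.append((patch - lo) / (hi - lo + 1e-8))
    return np.stack(patches)

band = np.random.default_rng(1).integers(0, 4096, (IMG, IMG), dtype=np.uint16)
patches = extract_patches(band)
n_per_dim = (IMG - PATCH) // STRIDE + 1       # 14 window positions per axis
covered = STRIDE * (n_per_dim - 1) + PATCH    # 1829 of 1830 pixels per axis
coverage = (covered / IMG) ** 2               # ~99.89% of the image area
```

Applying the same deterministic grid to every image is what guarantees the training, validation, and test sets see identical patch views.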
In summary, before preprocessing, the training set has 63 images, the validation set has 4 images, the test set has 3 images, and the anomaly test set has 2 images. After preprocessing, the training dataset consists of 12,348 patches, the validation dataset of 784 patches, the test dataset of 588 patches, and the anomaly test dataset of 98 patches, all in Band 4 representation.
Author Contributions
Conceptualization, J.G.; Methodology, D.J. and J.G.; Software, D.J.; Formal analysis, D.J.; Data curation, D.J.; Writing—original draft, D.J. and J.G.; Writing—review & editing, J.G.; Supervision, J.G. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the U.S. Department of Energy under contract number DE-AC02-76SF00515.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Conflicts of Interest
The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.
References
- Joseph, R.; Kopacz, R.H.; Roney, J. Small satellites an overview and assessment. Acta Astronaut. 2020, 170, 93–105. [Google Scholar] [CrossRef]
- Fevgas, G.; Lagkas, T.; Sarigiannidis, P.; Argyriou, V. Advances in Remote Sensing and Propulsion Systems for Earth Observation Nanosatellites. Future Internet 2025, 17, 16. [Google Scholar] [CrossRef]
- Alzubairi, A.; Tameem, A.; Kada, B. Spacecraft formation flying orbital control for earth observation mission. Sci. Afr. 2024, 26, e02391. [Google Scholar] [CrossRef]
- Battistini, S. Chapter 12-Small satellites for disaster monitoring. In Nanotechnology-Based Smart Remote Sensing Networks for Disaster Prevention; Denizli, A., Alencar, M.S., Nguyen, T.A., Motaung, D.E., Eds.; Micro and Nano Technologies; Elsevier: Amsterdam, The Netherlands, 2022; pp. 231–251. [Google Scholar] [CrossRef]
- Barato, F.; Toson, E.; Milza, F.; Pavarin, D. Investigation of different strategies for access to space of small satellites on a defined LEO orbit. Acta Astronaut. 2024, 222, 11–28. [Google Scholar] [CrossRef]
- Kulu, E. CubeSats & Nanosatellites–2024 Statistics, Forecast and Reliability. In Proceedings of the 75th International Astronautical Congress (IAC 2024), International Astronautical Federation (IAF), Milan, Italy, 14–18 October 2024; p. IAC-24.B4.6A.13. [Google Scholar]
- Li, Y.; Ma, J.; Zhang, Y. Image retrieval from remote sensing big data: A survey. Inf. Fusion 2021, 67, 94–115. [Google Scholar] [CrossRef]
- Chintalapati, B.; Precht, A.; Hanra, S.; Laufer, R.; Liwicki, M.; Eickhoff, J. Opportunities and challenges of on-board AI-based image recognition for small satellite Earth observation missions. Adv. Space Res. 2024, 75, 6734–6751. [Google Scholar] [CrossRef]
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. In Proceedings of the Advances in Neural Information Processing Systems 25 (NIPS 2012), Lake Tahoe, NV, USA, 3–6 December 2012; Volume 25, pp. 1097–1105. [Google Scholar]
- Lofqvist, M.; Cano, J. Accelerating Deep Learning Applications in Space. arXiv 2020, arXiv:2007.11089. [Google Scholar]
- Chen, Z.; Gao, H.; Lu, Z.; Zhang, Y.; Ding, Y.; Li, X.; Zhang, B. MDA-HTD: Mask-driven dual autoencoders meet hyperspectral target detection. Inf. Process. Manag. 2025, 62, 104106. [Google Scholar] [CrossRef]
- Giuffrida, G.; Fanucci, L.; Meoni, G.; Batič, M.; Buckley, L.; Dunne, A. The Φ-Sat-1 Mission: The First On-Board Deep Neural Network Demonstrator for Satellite Earth Observation. IEEE Trans. Geosci. Remote Sens. 2021, 60, 5517414. [Google Scholar] [CrossRef]
- Goodwill, J.; Wilson, D.; Sabogal, S.; George, A.D.; Wilson, C. Adaptively Lossy Image Compression for Onboard Processing. In Proceedings of the 2020 IEEE Aerospace Conference, Big Sky, MT, USA, 7–14 March 2020; pp. 1–15. [Google Scholar] [CrossRef]
- Li, J.; Liu, Z. Multispectral Transforms Using Convolution Neural Networks for Remote Sensing Multispectral Image Compression. Remote Sens. 2019, 11, 759. [Google Scholar] [CrossRef]
- Alves de Oliveira, V.; Chabert, M.; Oberlin, T.; Poulliat, C.; Bruno, M.; Latry, C.; Carlavan, M.; Henrot, S.; Falzon, F.; Camarero, R. Reduced-Complexity End-to-End Variational Autoencoder for on Board Satellite Image Compression. Remote Sens. 2021, 13, 447. [Google Scholar] [CrossRef]
- Alves de Oliveira, V.; Chabert, M.; Oberlin, T.; Poulliat, C.; Bruno, M.; Latry, C.; Carlavan, M.; Henrot, S.; Falzon, F.; Camarero, R. Satellite Image Compression and Denoising With Neural Networks. IEEE Geosci. Remote Sens. Lett. 2022, 19, 4504105. [Google Scholar] [CrossRef]
- Paszkowsky, N.A.; Brännvall, R.; Carlstedt, J.; Milz, M.; Kovács, G.; Liwicki, M. Vegetation and Drought Trends in Sweden’s Mälardalen Region—Year-on-Year Comparison by Gaussian Process Regression. In Proceedings of the 2020 Swedish Workshop on Data Science (SweDS), Luleå, Sweden, 29–30 October 2020. [Google Scholar] [CrossRef]
- Lin, S.; Zhang, M.; Cheng, X.; Shi, L.; Gamba, P.; Wang, H. Dynamic Low-Rank and Sparse Priors Constrained Deep Autoencoders for Hyperspectral Anomaly Detection. IEEE Trans. Instrum. Meas. 2024, 73, 2500518. [Google Scholar] [CrossRef]
- Ruszczak, B.; Kotowski, K.; Evans, D.; Nalepa, J. The OPS-SAT benchmark for detecting anomalies in satellite telemetry. Sci. Data 2025, 12, 710. [Google Scholar] [CrossRef]
- Liu, L.; Tian, L.; Kang, Z.; Wan, T. Spacecraft anomaly detection with attention temporal convolution networks. Neural Comput. Appl. 2023, 35, 9753–9761. [Google Scholar] [CrossRef]
- Hussein, M. A Real-Time Anomaly Detection in Satellite Telemetry Data Using Artificial Intelligence Techniques Depending on Time-Series Analysis. J. ACS Adv. Comput. Sci. 2023, 14, 21–45. [Google Scholar] [CrossRef]
- Guerrisi, G.; Del Frate, F.; Schiavon, G. Convolutional Autoencoder Algorithm for On-Board Image Compression. In Proceedings of the IGARSS 2022—2022 IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 17–22 July 2022; pp. 151–154. [Google Scholar] [CrossRef]
- Selva, D.; Krejci, D. A survey and assessment of the capabilities of Cubesats for Earth observation. Acta Astronaut. 2012, 74, 50–68. [Google Scholar] [CrossRef]
- Copernicus Programme. Copernicus Data Space Ecosystem: Sentinel-2 Data Collection. 2024. Available online: https://dataspace.copernicus.eu/explore-data/data-collections/sentinel-data/sentinel-2 (accessed on 21 November 2024).
- Giuliano, A.; Gadsden, S.A.; Hilal, W.; Yawney, J. Convolutional variational autoencoders for secure lossy image compression in remote sensing. arXiv 2024, arXiv:2404.03696. [Google Scholar]
- O’Sullivan, D.; Marenco, F.; Ryder, C.L.; Pradhan, Y.; Kipling, Z.; Johnson, B.; Benedetti, A.; Brooks, M.; McGill, M.; Yorks, J.; et al. Models transport Saharan dust too low in the atmosphere: A comparison of the MetUM and CAMS forecasts with observations. Atmos. Chem. Phys. 2020, 20, 12955–12982. [Google Scholar] [CrossRef]
- Phiri, D.; Simwanda, M.; Salekin, S.; Nyirenda, V.R.; Murayama, Y.; Ranagalage, M. Sentinel-2 Data for Land Cover/Use Mapping: A Review. Remote Sens. 2020, 12, 2291. [Google Scholar] [CrossRef]
- NASA Earth Observatory. A Burst of Saharan Dust. 2022. Available online: https://earthobservatory.nasa.gov/images/149918/a-burst-of-saharan-dust (accessed on 5 August 2024).
- Guerrisi, G.; Frate, F.D.; Schiavon, G. Artificial Intelligence Based On-Board Image Compression for the ϕ-Sat-2 Mission. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 8063–8075. [Google Scholar] [CrossRef]
- Dosselmann, R.; Yang, X.D. A comprehensive assessment of the structural similarity index. Signal Image Video Process. 2011, 5, 81–91. [Google Scholar] [CrossRef]
- Horé, A.; Ziou, D. Image quality metrics: PSNR vs. SSIM. In Proceedings of the 2010 20th International Conference on Pattern Recognition, Istanbul, Turkey, 23–26 August 2010; pp. 2366–2369. [Google Scholar] [CrossRef]
- Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2017, arXiv:1412.6980. [Google Scholar]
Figure 1.
Sentinel-2 images of an approximately 110 km × 110 km aerial region over Dakar, Senegal, used as input to the machine learning models for inference [24]. Three different images are shown with different cloud coverage percentages, with their acquisition dates provided in the titles.
Figure 2.
Example Sentinel-2 image split into the individual color bands of input images generated during preprocessing and used for training.
Figure 3.
CAE architecture used in this study. The encoder is shown in blue, the bottleneck is shown in red, and the decoder is shown in orange. The shape of each block is in the format of height × width × depth. The arrows represent the operations in each layer with the following details: Conv2D indicates convolutional layers; k is the kernel size; s is the stride; and Conv2D_T indicates the transposed convolutional layers.
Figure 4.
Original test set (top) and CAE-reconstructed (bottom) images with metadata to study the CAE performance with values around a mean test set SSIM score of 0.75. (a) Original test set image patch from 28 November 2022, Band B04, coordinates (726, 1452); (b) Original test set image patch from 28 November 2022, Band B04, coordinates (1089, 847); (c) Original test set image patch from 28 November 2022, Band B04, coordinates (121, 1573); (d) Original test set image patch from 26 June 2023, Band B04, coordinates (726, 1573); (e) CAE reconstruction of (a) with SSIM score of 0.7604; (f) CAE reconstruction of (b) with SSIM score of 0.7605; (g) CAE reconstruction of (c) with SSIM score of 0.7607; (h) CAE reconstruction of (d) with SSIM score of 0.7588.
Figure 5.
True (top) and CAE-reconstructed (bottom) images for the four test sets used to study the anomaly detection performance; from left to right, background, Sahara dust storm, dead pixel, and hot pixel images. Images are displayed in Viridis color scale for ease of visualization of anomalies. (a) True image patch from the normal test set, acquired on 26 June 2023, Band B04, coordinates (726, 1089); (b) True image patch from the dust storm anomaly test set, acquired on 06 June 2022, Band B04, patch 33; (c) True image patch with synthetic dead pixel anomaly (5% of pixels set to 0), based on the 26 June 2023 image, Band B04, coordinates (726, 1089); (d) True image patch with synthetic hot pixel anomaly (5% of pixels set to 1), based on the 26 June 2023 image, Band B04, coordinates (726, 1089); (e) CAE reconstruction of (a) with SSIM score of 0.8044; (f) CAE reconstruction of (b) with SSIM score of 0.5830; (g) CAE reconstruction of (c) with SSIM score of 0.5883; (h) CAE reconstruction of (d) with SSIM score of 0.3816.
Figure 6.
Histogram of the CAE output SSIM score on the training, test, and three anomaly test sets (left) and the corresponding ROC curve (right). Higher SSIM values indicate better reconstruction quality.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).