Review

Cloud Detection Methods for Optical Satellite Imagery: A Comprehensive Review

by Rohit Singh 1,*, Mahesh Pal 2 and Mantosh Biswas 3

1 Computer Engineering Department, National Institute of Technology, Kurukshetra 136119, India
2 Civil Engineering Department, National Institute of Technology, Kurukshetra 136119, India
3 Department of Computer Science, North Campus, University of Delhi, Delhi 110007, India
* Author to whom correspondence should be addressed.
Geomatics 2025, 5(3), 27; https://doi.org/10.3390/geomatics5030027
Submission received: 20 April 2025 / Revised: 23 June 2025 / Accepted: 24 June 2025 / Published: 26 June 2025

Abstract

With the continuous advancement of remote sensing technology and its growing importance, the need for ready-to-use data has increased exponentially. Satellite platforms such as Sentinel-2, which carries the Multispectral Instrument (MSI) sensor, are known for their cost-effectiveness and capture valuable information about the Earth in the form of images. However, they face a significant challenge in the form of clouds and their shadows, which hinder data acquisition and processing for regions of interest. This article undertakes a comprehensive literature review to systematically analyze the critical cloud-related challenges. It explores the need for accurate cloud detection, reviews existing datasets, and evaluates contemporary cloud detection methodologies, including their strengths and limitations. Additionally, it highlights the inaccuracies introduced by varying atmospheric and environmental conditions, emphasizing the importance of integrating advanced techniques that can utilize local and global semantics. The review also introduces a structured intercomparison framework to enable standardized evaluation across binary and multiclass cloud detection methods using both qualitative and quantitative metrics. To facilitate fair comparison, a conversion mechanism is highlighted to harmonize outputs across methods with different class granularities. By identifying gaps in current practices and datasets, the study highlights the importance of innovative, efficient, and scalable solutions for automated cloud detection, paving the way for unbiased evaluation and improved utilization of satellite imagery across diverse applications.

1. Introduction

Remote Sensing (RS) is the process of gathering valuable observations about the Earth’s surface, atmosphere, and other environmental phenomena without direct physical contact, allowing the scientific community to explore even the most remote parts of the world [1,2]. It typically senses and captures data about the Earth’s surface and atmosphere using a variety of sensors, including imaging sensors (which capture data in the form of images using electromagnetic radiation, such as visible light, infrared, microwaves, and other wavelengths) as well as other sensors like radiometers (which quantify radiation intensity), spectrometers (which record spectral signatures), radar (which collects data on surface structure and movement), and LiDAR (which collects precise 3D surface elevation), mounted on different platforms such as satellites, aircraft, drones, and scientific balloons (e.g., carrying Differential Optical Absorption Spectroscopy (DOAS) instruments). Satellites are ideal platforms for capturing large-scale structures of the Earth, including oceans, deserts, and polar regions. Aircraft and drones, by contrast, are limited to small-scale data collection because their imaging sensors produce localized, detailed observations [3]; they also require human involvement for operation and can be difficult to operate in extreme weather conditions compared to satellite imaging sensors. Modern satellites carry multispectral (~10 bands) and hyperspectral (~100 bands or more) imaging sensors, which acquire images across multiple wavelengths, allowing detailed analysis of the Earth’s surface [4,5,6]. Images acquired by these sensors have been used in a wide range of Earth observation activities, including:
  • Land Cover Monitoring: RS images help in monitoring changes in land cover (like forest, grassland, urban areas, and agriculture fields), urban growth and expansion (like development of buildings, roads, and other urban infrastructure), deforestation, mountains and canyons, and in change detection [7,8,9].
  • Disaster Response and Recovery: Data acquired by different sensors help in identifying the extent of natural calamities like earthquakes, forest fires, hurricanes, landslides, floods, and volcanic eruptions [10,11], as well as in estimating the magnitude of damage and guiding relief operations.
  • Agriculture: RS plays a pivotal role in the identification of crops, mapping production, assessing crop health, monitoring the growth stage of crops, and identifying disease and pest outbreaks [12,13].
  • Geology and Natural Resource Management: RS is extensively used in locating mineral resources, monitoring mining processes, managing freshwater resources (such as lakes, rivers, and glaciers), and geological surveys for identifying rock formations, faults, folds, and other geological structures [6,14].
  • Military Surveillance: RS helps in gathering visual intelligence like enemy activities, troop movement, equipment deployment, and infrastructure changes [15]. It also helps in keeping track of borders and suspicious activities and identifying potential threats. It monitors maritime activities like tracking naval vessels, illegal fishing, and smuggling [16].
  • Oceanography: RS aids in estimating ocean currents, marine life distribution, understanding sea dynamics, and climate change effects [17].
The sensors used on RS satellites can be categorized into two main types, i.e., active and passive. These two types of sensors play crucial roles in the various remote sensing applications discussed earlier; each brings its own unique capabilities and advantages to the field. Active satellite sensors have their own source of energy (such as laser, radar, or microwave), which is emitted toward the Earth and then reflected or backscattered by the surface to be detected by the sensor [18,19]. While these sensors do not generate images in the conventional optical sense, many examples, such as Synthetic Aperture Radar (SAR) and LiDAR, produce image-like representations that convey surface characteristics and structural shape. Additionally, active sensors can derive quantitative measurements, such as height, which is typically calculated from the return time of the emitted signal. These sensors operate effectively under most atmospheric conditions, though some measurements at certain wavelengths and with specific sensor types can still be affected. In contrast, passive sensors rely primarily on the natural source of electromagnetic radiation: the Sun, which dominates wavelengths up to approximately 4 microns, and thermal emission radiated by the Earth at longer wavelengths, allowing observations even in the absence of sunlight. These sensors collect information from energy reflected by objects on the Earth’s surface (commonly referred to as Earth’s albedo) as well as from various atmospheric phenomena [20]. Passive satellite sensors can be broadly classified into two main categories, i.e., optical and thermal sensors [4]. Optical sensors capture reflected solar radiation, including ultraviolet (UV), visible (VIS), near-infrared (NIR), and short-wave infrared (SWIR) wavelengths, which are widely used for land-cover classification, vegetation assessment, and urban growth monitoring. Meanwhile, thermal infrared sensors detect the thermal emission from the Earth’s surface, which dominates the radiation spectrum beyond 4 microns, making them suitable for surface temperature estimation, volcanic activity monitoring, and heat mapping for global warming assessment. Infrared sensors operate across a wide range of wavelengths, including near-infrared (NIR: 0.7–1.4 µm), short-wave infrared (SWIR: 1.4–3 µm), mid-wave infrared (MWIR: 3–8 µm), long-wave infrared (LWIR: 8–15 µm), and far-infrared (FIR: 15–1000 µm). These sensors are widely used in weather monitoring, atmospheric studies, and environmental observations due to their ability to capture thermal emissions and reflected energy across this broad spectrum.
Passive and active sensors provide complementary information, with passive sensors offering rich, high-resolution spatial and spectral details (especially optical data under clear-sky conditions, valuable for various surface analyses), while active sensors are effective in all-weather, day-and-night conditions and are suited for surface structure and elevation analysis [2]. The majority of passive sensors used for Earth observation are multispectral in nature, commonly found on satellite platforms such as the Landsat series, the Sentinel series, and the Chinese Gaofen series. With the launch of Landsat 1 in 1972, passive sensors became an essential source for different RS applications [4,21]. In comparison to active sensors, they often capture rich information for detailed analysis of the Earth’s surface. These sensors generally record the reflected radiation in digital form and convert this information into an image. However, this recorded observation suffers from geometric and radiometric errors, atmospheric interference (including clouds, aerosols, gases, pollutants, and dust particles), and other imperfections such as variation in sensor sensitivity, calibration issues, uneven illumination, and noise [22].
To address these challenges, Level-1 preprocessing is carried out at the ground receiving station before the distribution of satellite data. Level-1 preprocessing aims to produce usable data by estimating top-of-atmosphere (TOA) reflectance [23]. TOA reflectance is a standardized way of quantifying radiation, but it does not correct for atmospheric effects (absorption and scattering), which must be corrected in subsequent processing steps (Figure 1a). As a result, it is necessary to determine bottom-of-atmosphere (BOA) or surface reflectance for the effective use of remote sensing images [24,25]. Level-2 preprocessing converts TOA reflectance into BOA reflectance (Figure 1b). This process involves a series of radiometric and atmospheric corrections [22]. By accounting for atmospheric interactions, one can obtain actual reflectance values that depict the Earth’s surface characteristics without atmospheric influence. The estimation of BOA reflectance is significantly impacted by the presence of clouds (Figure 1c,d) [26]. In Figure 1c, the Level-1 processed TOA image displays a mountainous region covered in greenery, casting terrain shadows, with part of the image covered by clouds. During Level-2 preprocessing, the presence of these clouds causes extra whiteness on top of the mountains (red circle, Figure 1d). Clouds introduce a range of complexities and challenges during preprocessing due to their diverse optical properties and interactions [27,28].
Keeping these issues in view, this article presents a comprehensive review of cloud-related challenges and various cloud detection methods explored by researchers to handle the impact of clouds and their shadows. The major focus is on Sentinel-2 satellite imagery, extensively used in different Earth observation studies due to its high spatial, temporal, and spectral resolutions. Sentinel-2, part of the European Space Agency’s Copernicus program, provides global multispectral data across 13 bands, including visible, near-infrared, and shortwave infrared wavelengths. However, the presence of clouds and their shadows significantly reduces the usability of Sentinel-2 data. This review examines the challenges posed by clouds in Sentinel-2 imagery, explores state-of-the-art detection techniques encompassing both traditional image processing and advanced machine learning approaches, and highlights the existing datasets. The presented review complements existing ones by emphasizing the critical need for multiclass cloud classification, which remains underexplored despite its importance in the majority of applications. While the use of machine learning and deep learning for cloud detection on Sentinel-2 imagery is growing rapidly, with new advancements emerging continuously, the predominant focus remains on binary classification (distinguishing cloud and non-cloud regions). This focus needs to shift towards multiclass classification to better capture important distinctions such as thick clouds, thin clouds, and cloud shadows. This review also provides insights into the applicability and limitations of various cloud detection techniques, supporting researchers and practitioners in selecting appropriate methods to enhance the quality of Sentinel-2 data. Additionally, it highlights an intercomparison framework to enable fair and consistent evaluation of methods predicting different cloud classes, thereby fostering objective benchmarking and further progress in multiclass cloud detection.

2. Cloud and Its Effect

The presence of clouds obscures the view of Earth’s surface objects in satellite images, thus compromising their application and usability [26]. Clouds scatter sunlight and have their own albedo, which can block or permit only a fraction of sunlight to pass through them. This blocks or attenuates the light reaching objects beneath and casts shadows on the Earth’s surface. Figure 2 schematically illustrates the interaction of sunlight, atmosphere, clouds, Earth’s surface components, and the satellite sensor, highlighting how clouds and their shadows are captured and affect satellite imagery. Satellite images are collected at fixed revisit intervals and are frequently affected by the presence of clouds, as about 67% of Earth’s surface is covered by clouds [29]. There is a high probability of significant information loss if clouds persist over subsequent satellite passes, blocking the land region of interest (ROI) beneath. This can result in a loss of consistent data, making it difficult to monitor changes or trends in that specific area. To mitigate the impact of clouds, understanding different types of clouds and their effects on information loss is necessary.
Clouds can be differentiated based on physical characteristics such as reflectivity, transparency, height, and shape, as well as perceptual properties like visibility and measured attributes such as brightness. From a satellite’s perspective, reflectivity refers to the fraction of incoming solar radiation reflected by a cloud, while brightness represents the apparent intensity captured by the sensor, influenced not only by the cloud’s reflectivity but also by physical factors like cloud thickness and altitude. Clouds are also categorized based on their height above Earth’s surface as low, medium, and high [30].
  • Low clouds: They lie within 2 km above ground level (AGL), sometimes nearly touching the ground. They are typically formed by the condensation of water droplets, although in high-latitude regions low clouds can also be formed by ice particles. These clouds usually comprise stratus, stratocumulus, and nimbostratus clouds. They are generally optically thick, mostly obscuring the objects beneath them and casting dark shadows. Vertical clouds, which typically extend through a wide range of altitudes from low to high levels, are also optically thick, meaning they significantly block or scatter incoming radiation, and are similar to low clouds in appearance and effect.
  • Medium clouds: These lie between low and high clouds (2 km to 6 km) and include altostratus and altocumulus clouds. They form at low temperatures from ice crystals and water droplets [31]. They are slightly less dense than low clouds, but the visibility of objects below them is less than 50%, and their shadows appear slightly offset from the cloud in satellite imagery. If only two cloud categories are considered, they fall into the low cloud category.
  • High clouds: They are usually above 6 km and include cirrus, cirrostratus, and cirrocumulus clouds formed by ice crystals in stable air [32]. They are optically thin, light, streaky, and cover large areas. Ground objects are visible but look hazy and blurry due to their presence [33]. Their shadow is usually very light and often gets removed during atmospheric correction of satellite imagery.
In terms of visibility, low clouds are typically the brightest, medium clouds are comparatively less luminous, and high clouds are usually very thin and translucent [32]. Most RS applications rely on a combination of visual interpretation and automated classification techniques for analyzing satellite imagery. Therefore, understanding the influence of clouds on the quality of remote sensing images becomes a crucial aspect of accurate interpretation and classification [34]. Considering this, clouds are also classified as either thick or thin, based on their optical density and appearance, which corresponds to their optical thickness.
  • Thick cloud: A cloud is classified as ‘thick’ when it obscures the underlying Earth’s surface with an opacity exceeding 50%, which corresponds to a high optical thickness commonly used in remote sensing studies. Thick clouds make it impossible to comprehend any details below, ultimately hindering the interpretation of ground-level events and creating dark cloud shadows that hide the objects (Figure 3a). The formation of cloud shadows depends on the altitude and thickness of the cloud, the sun angle, and the characteristics of the Earth’s surface over which the shadow falls [35].
  • Thin cloud: In contrast, a cloud is termed ‘thin’ if it covers the surface with less than 50% opacity, corresponding to low optical thickness [36]. It offers limited ground visibility, but the view is often blurred and distorted, making it challenging to accurately perceive the terrestrial activities (Figure 3b).
Identification of clouds is vital for minimizing the impact caused by different types of clouds. Various bright objects on the Earth’s surface, including snow/ice-covered landscapes, sandy deserts, shorelines, cliffs, beaches, dry riverbeds, and urban structures, appear spectrally similar to clouds [37]. Similarly, cloud shadows bear a resemblance to dark Earth objects like water bodies (lakes, rivers, oceans, etc.), terrain shadows, tree canopies, and dark rocks [38]. Satellite imagery contaminated by clouds loses quality and utility for different RS applications. Cloud contamination introduces a range of issues, including diminished accuracy in tasks like land-cover classification, change detection, and object recognition. It also hampers image interpretation, potentially leading to erroneous conclusions. Additionally, cloud-contaminated imagery compromises the effectiveness of analysis, results in the loss of valuable temporal data, and presents challenges to generating precise insights [34].
Therefore, preprocessing to detect cloud presence and assess its impact becomes an essential prerequisite. This involves creating a mask (cloud mask) that identifies and isolates cloud-covered regions, allowing users to quantify the extent of cloud cover. Such masks facilitate strategies like cloud extraction and removal [33], allowing effective management of cloud-related challenges and improving the overall quality of the satellite image data.

3. Cloud Detection

The process of determining and identifying the regions in satellite imagery affected by cloud and distinguishing them from cloud-free areas is called cloud detection or cloud masking [39]. This process relies on pixel-based recognition, enabling users like researchers, analysts, and scientists to accurately delineate cloud-affected pixels and determine the suitability of satellite data for interpretation and analysis. Pixel-based recognition helps in creating a cloud mask, usually a two-dimensional map that captures the degree of cloud cover and cloud types (including cloud shadow), along with a confidence level or probability of cloud [40]. In this process, a fixed value is assigned to each pixel of the image. This assigned value helps in categorizing the image into different classes. For instance, clouds might be assigned a value of 0; cloud shadows, 127; clear areas, 255; and other relevant classes, different values [41]. The file that stores the assigned values is referred to as a cloud mask. The cloud detection process generally creates two types of cloud masks, binary and multiclass, though some approaches also produce probabilistic cloud likelihood maps to offer more versatile classification options (Figure 4).
  • Binary cloud mask: A binary cloud mask usually has only two classes or values, and it has two variants: cloud-only [42,43,44] and cloud-contaminated [45]. In the cloud-only variant, each pixel is categorized as either being under cloud cover or being free from clouds (Figure 4b). On the other hand, the cloud-contaminated variant shares the same classification structure, but the cloudy pixels include every type of cloud as well as cloud shadow (Figure 4c). Cloud-only and cloud-contaminated variants of binary masks can easily be derived from multiclass cloud masks [33,40], as illustrated in the sketch after this list.
  • Multiclass cloud mask: A multiclass cloud mask is an improved version of cloud masking, where pixels are classified into several distinct categories [38,44]. Each class signifies different types of cloud cover or atmospheric conditions (Figure 4d). Additionally, this cloud mask incorporates classes that denote different Earth’s surfaces, such as snow/ice, water, forests, and others, which can exhibit characteristics resembling clouds or cloud shadows [33,46]. A more intricate and comprehensive understanding of satellite imagery can be attained by adopting this approach.
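As a minimal illustration of how the binary variants can be derived from a multiclass mask, consider the following sketch; the integer class codes are hypothetical placeholders, since the actual values differ across datasets (compare, e.g., Table 3 and Table 5).

```python
import numpy as np

# Hypothetical class codes for a multiclass cloud mask; real datasets
# assign different values (compare Table 3 and Table 5).
CLEAR, THICK_CLOUD, THIN_CLOUD, CLOUD_SHADOW, WATER, SNOW = 0, 1, 2, 3, 4, 5

def to_binary(multiclass_mask, contaminated=False):
    """Collapse a multiclass cloud mask into a binary one.

    cloud-only variant: all cloud types vs. everything else;
    cloud-contaminated variant: clouds plus shadows vs. everything else.
    """
    cloudy = np.isin(multiclass_mask, [THICK_CLOUD, THIN_CLOUD])
    if contaminated:
        cloudy |= (multiclass_mask == CLOUD_SHADOW)
    return cloudy.astype(np.uint8)  # 1 = contaminated, 0 = clear
```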

3.1. Generic Cloud Detection Classification

Cloud detection can be broadly grouped into three categories based on working principle: manual, automated, and active learning. A brief discussion of the advantages and limitations of each category is provided below:
  • Manual Cloud Detection: It is the most reliable and highly accurate method of cloud detection used so far. This method requires a human expert to visually interpret true-color and false-color composites of satellite imagery and mark the boundaries of different classes of cloud-cover areas by drawing polygons [42]. While this method is highly accurate, it is time-consuming, labor-intensive, and requires expertise [47]. This technique was viable during periods when access to satellite data was constrained; with the availability of free large-scale satellite data, it may prove infeasible. However, it is a good option for generating validation datasets that can be used for performance analysis of automated and active learning methods.
  • Automated cloud detection: These methods use algorithms, specific rules, and programs that can detect clouds without direct human intervention [26]. They are designed to be faster and more efficient than manual cloud detection. These methods exploit spectral and spatial signatures and characteristics of clouds to differentiate them from Earth’s objects. They range from simple threshold-based techniques to advanced machine learning/deep learning methods. They are designed to be scalable and can quickly handle large data volumes [40]. However, these methods have some limitations that researchers are working to overcome:
    • These methods produce commission and omission errors [48], especially in complex scenes.
    • These approaches might fail to work with certain atmospheric conditions and clouds.
    • These algorithms might not be universally applicable solutions and require some additional parameter tuning.
  • Active Learning: It is a specialized algorithm, often rooted in machine learning, that collaborates with a human expert to arrive at a definitive conclusion regarding uncertain and ambiguous cloud-cover regions [33]. Human feedback is essential, from training to refining the algorithm’s accuracy over time [41]. It is also termed a human-in-the-loop approach. This approach represents an effort to combine human expertise with machine learning capabilities. However, it necessitates careful supervision in most instances.

3.2. Sensor Band-Based Cloud Detection Classification

Many satellites, such as Landsat and Sentinel, offer data in multiple spectral bands captured by different sensors across a range of the electromagnetic spectrum at different pixel resolutions (see Table 1). Cloud detection methods can also be classified based on the number of spectral bands or channels they utilize to generate cloud masks.
  • Single-band Cloud detection: This approach relies solely on the spatial information from a single dominant spectral band sensitive to cloud properties, such as the blue or cirrus band [49,50]. However, such detection approaches are very rare, and their ability to distinguish clouds from other bright surfaces is limited, especially under complex atmospheric and surface conditions.
  • Multi-band Cloud detection: The multi-band cloud detection approach capitalizes on a combination of multiple captured spectral bands to improve accuracy. This approach can be classified into the following sub-categories:
    • 3-band approach: This method employs the standard color bands (RGB) for cloud masking [51]. Since most satellites possess these three bands, it is a generic, versatile, and scalable approach applicable to a wide range of satellites.
    • 4-band approach: This method uses the visible bands (RGB) and the NIR band to generate cloud masks. These four bands carry rich information required for cloud detection [52]. Most advanced remote sensing satellites, such as Landsat, Sentinel, PlanetScope, Gaofen, IRS, etc., share these four bands, making this approach widely adaptable.
    • All-band approach: These cloud detection methods are usually satellite-centric; they use all bands available from the satellite to generate cloud masks. Most threshold-based cloud detection methods use this concept to generate cloud masks and perform atmospheric and geometric corrections.

4. Dataset Available

The Sentinel-2 mission, launched by the European Space Agency (ESA), is dedicated to capturing high-resolution optical imagery of the Earth’s surface and making it publicly accessible through open-access distribution, facilitated by the Copernicus program. The mission provides global coverage with a revisit time of five days and initially comprised twin operational satellites named Sentinel-2A and Sentinel-2B. Recently, a third satellite, Sentinel-2C, has been added to the constellation. These satellites orbit Earth together to capture data in 13 spectral channels at different spatial resolutions (10 m, 20 m, and 60 m) and across different segments of the electromagnetic spectrum, i.e., visible RGB, NIR, and SWIR [7]. Among these spectral bands, the 10 m resolution bands (marked bold for Sentinel-2 in Table 1) are particularly useful for detailed observation of the Earth [53]. The image products, including both Level-1 and Level-2 data, are accessible through the Copernicus Data Space Ecosystem [54]. Users can define their region of interest through the provided interactive map interface and obtain the corresponding satellite image product, typically in the SAFE (Sentinel Application Format for Environment) format. The SAFE format serves as a standardized data structure that contains essential components for satellite data processing, including metadata, image data, auxiliary information, quality indicators, and a data manifest [55]. However, the optical imagery captured by Sentinel-2 is often affected by the presence of clouds, which necessitates cloud masking to estimate the extent of contamination. Each image product therefore needs a cloud mask, i.e., a labeled image file that identifies different portions of the image for reference. Despite the vast number of available images, only a few datasets offer reference cloud masks for validation purposes. When working with these validation datasets, it is essential to understand their sources: most existing validation datasets primarily provide cloud masks, and researchers typically need to access the corresponding image products from their respective sources.
Although Sentinel-2 provides a cloud mask through the Sen2Cor algorithm [25], it does not always accurately estimate cloud cover. To support solutions to the cloud detection problem, the availability of validation datasets becomes crucial for standardization. The subsequent sub-sections present the existing validation datasets, offering insights into their image products and cloud mask details.

4.1. Baetens-Hagolle (CESBIO/CNES) Dataset

This dataset comprises a total of 31 reference images acquired from both the Sentinel-2A and 2B satellites, each accompanied by its respective multiclass cloud mask, collected from 10 different sites worldwide (Table 2). Notably, seven of these reference images are sourced from the Hollstein dataset [38]. The cloud masks in this dataset are manually labeled at 60 m spatial resolution for each image, using false-color image composites. These masks consist of six main classes (i.e., low cloud, high cloud, cloud shadow, ground, water, and snow), assigned values from 0 to 7 (see Table 3). It is noteworthy that the original Hollstein dataset, created in 2016, has the same class labels as the Baetens-Hagolle dataset and included 60 reference images from the Sentinel-2A satellite captured across the globe [38]. Unfortunately, a majority of images in the Hollstein dataset were acquired during the early stage of the Sentinel-2 mission, and their image products contain incorrect auxiliary and metadata information, which makes them unsuitable for use in remote sensing applications [33].

4.2. WHUS2-CD Dataset

The Sentinel-2 cloud detection dataset known as WHUS2-CD contains 32 single-date Sentinel-2A image products from 32 different sites over mainland China and the Tibet region (Table 4), where most of the selected images have cloud cover of less than 20% [56]. It provides a manually labeled cloud-only binary mask for each image product, i.e., the mask has only two classes, named Clear and Cloud. The false-color composite is used for labeling, where pixels containing any cloud type are designated with a value of 255, representing cloud. Conversely, pixels containing ground, snow, water, and cloud shadow are marked with a value of 128, indicating clear. Pixels with no data are marked with a value of 0. Notably, the cloud mask in this dataset is provided at 10 m pixel resolution, which can easily be down-sampled to 20 m, 60 m, or coarser pixel resolutions using the nearest neighbor algorithm. This down-sampling enables comparison with other state-of-the-art cloud detection methods that generate cloud masks at different spatial resolutions. However, the limitation of this dataset is the absence of cloud shadow labels. The cloud shadow class is a significant component of overall cloud detection because its presence is a form of contamination that can hinder the usability of image products.
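Since the masks are distributed at 10 m, a nearest-neighbor decimation such as the one sketched below is sufficient to obtain coarser versions; the decimation factors shown are the only assumptions.

```python
import numpy as np

def downsample_mask(mask, factor):
    """Nearest-neighbor decimation of a labelled mask: plain strided
    slicing keeps one representative pixel per block and preserves the
    original class codes (no interpolation of labels)."""
    return mask[::factor, ::factor]

# e.g., 10 m -> 20 m uses factor=2; 10 m -> 60 m uses factor=6
# mask_60m = downsample_mask(mask_10m, 6)
```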

4.3. KappaSet Dataset

The KappaSet dataset consists of a substantial collection of 9251 sub-tiles, each of size 512 × 512 pixels. These sub-tiles are extracted from a pool of 1079 Level-1 Sentinel-2 images acquired globally over different time periods and seasons [41]. This dataset provides a reference multiclass cloud mask at 10 m spatial resolution for each sub-tile rather than for the entire image. The cloud masks of this dataset were labeled semi-automatically using an active learning process, involving a computer vision annotation tool [57] and segments.ai [58], into four main classes (i.e., thick cloud, semi-transparent cloud, cloud shadow, and clear; Table 5). However, the major drawback of this dataset is the absence of geo-referencing for each sub-tile, which makes it extremely difficult to apply Level-2 preprocessing to the provided Level-1 (L1C) data, since the SAFE file, which contains the metadata and auxiliary information, is the primary requirement for this purpose. Furthermore, the L1C data provided in KappaSet is not in the same format as that distributed by the Copernicus Sentinel-2 mission: a sinc Infinite Impulse Response (IIR) filter windowed with a Blackman filter was applied to the data prior to distribution, limiting the scope of the KappaSet dataset for cloud detection.
KappaSet is also divided into two sets: a training set and a testing set (Table 6). The training set comprises 8448 sub-tiles from 955 images acquired over the course of one year, while the testing set includes 803 sub-tiles from 124 images to evaluate the performance of trained cloud detection methods.

4.4. IndiaS2 Dataset

Most existing datasets are decent for cloud detection tasks but fail to capture the rich diversity present in the Indian context. India, being the second most populous country in the world, has a high proportion of urban areas in the form of cities, towns, and suburbs. Consequently, almost every satellite image acquired over India contains some amount of urban land cover, with small to large structures possessing reflectance similar to clouds. Additionally, a single image of India might encompass a diverse range of features, including mountains, dried riverbeds, water regions, wetlands, greenery, and drylands. Moreover, farm sizes in India tend to be relatively small, and when these agricultural areas are covered by even a small cloud, the satellite imagery becomes unusable for researchers working within specific regions of interest. Furthermore, different parts of India exhibit different conditions and landscapes. For instance, the northern part of India is characterized by mountains, snow-covered areas, and rivers, while the western part features the long expanse of the Thar Desert. The southern part has forest cover, coastal regions, seas, and small islands, while the eastern and northeastern parts are renowned for their rich forest cover, large ponds, rivers, and deltas. This unique and complex landscape calls for specialized datasets that capture this diversity to improve cloud detection methods in the Indian region.
Keeping this in view, the IndiaS2 dataset is available, consisting of 11 images from different parts of India having less than 5% scattered cloud cover (Table 7). Such small, scattered clouds often corrupt the entire image and pose challenges during detection. The provided reference masks were generated manually, yielding highly accurate cloud masks. A cloud was marked as thick if it affected the Earth’s surface with an opacity of more than 50%, and as thin if the opacity was less than 50%. Additionally, the cloud mask contains cloud shadow and ground classes (Table 8). To ensure the accuracy of the created cloud masks, they were thoroughly validated using QGIS software by overlapping each mask with various false-color composites to verify its precision. The generated cloud masks have 10 m pixel resolution, which can easily be down-sampled to 20 m and 60 m pixel resolution using the nearest neighbor algorithm.
The reviewed Sentinel-2 datasets offer a range of cloud mask complexities (from binary to multiclass), spatial resolution ranging from 10 m to 60 m (downsampling possible), geographic coverage spanning regional to global scales, as well as varying cloud conditions. Collectively, these datasets provide researchers with flexible options for cloud detection validation, enabling the selection of a suitable method or comparison among different cloud detection methods.

5. Cloud Detection Methods

Cloud detection methods can be classified according to the time-series nature of the satellite data they utilize to generate cloud masks, as single-date and multi-temporal cloud detection. Single-date cloud detection involves identifying and distinguishing cloud-covered areas from clear areas by analyzing a single snapshot of satellite imagery for a particular date, without knowledge of previous time-series data. It usually focuses on the spectral characteristics of the imagery for cloud detection. The multi-temporal cloud detection approach uses a sequence of satellite images captured over a period to identify the existence of cloud cover. It involves comparison between previous clear-date images and cloud-affected images, making it easier to differentiate cloudy regions. However, it occasionally misinterprets genuine changes on the ground as clouds. These methods are computationally intensive, and successful cloud detection relies heavily on the availability of recent clear-date imagery, making them less effective for cloud-prone areas. Automated cloud detection can also be classified into three categories based on the employed technique and underlying principles: threshold-based, machine learning, and deep learning. This field has undergone a dynamic evolution from conventional threshold-based methods towards more sophisticated ML/DL cloud detection methods, aiming to achieve enhanced accuracy, robustness, and the capability to tackle diverse and complex conditions. Within this landscape, several state-of-the-art cloud detection methods have emerged, some of which have been adopted by different agencies for ready-to-use product distribution with embedded cloud masks. The following sub-sections discuss the advantages and limitations of these automated cloud detection categories.

5.1. Threshold-Based Cloud Detection

Threshold-based methods involve applying a predetermined set of rules, thresholds, or tests on different spectral bands of satellite imagery to classify each pixel as cloud-covered or cloud-free. Ref. [59] introduced the pioneering pixel-based threshold mechanism, where multiple tests were applied, and if any of these tests indicated a pixel as not cloud-free, it was then classified as cloudy. This approach, however, led to misclassifying many cloud-free pixels. To handle this, Ref. [60] proposed a recovery function to retain wrongly rejected cloud-free pixels, providing some improvement in cloud detection. Despite these efforts, misclassification persisted in numerous cases due to the existence of similar illumination between bright objects and clouds [39].
As clouds generally exhibit high reflectance, the blue band (0.44–0.52 µm) was initially used to highlight cloud-covered pixels with appropriate spectral thresholds [49]. Unfortunately, this mechanism failed for images containing complex sites [61]. Keeping the issues with this approach in mind, Ref. [62] proposed the function of mask (Fmask), which employed multiple thresholds to estimate potential cloud, cloud-shadow, and snow/ice pixels. Another threshold-based algorithm, named Sen2Cor, was developed for the Sentinel-2 satellite to estimate 11 categories, including cloud and cloud-shadow, but this also faced universality challenges [25]. The Maccs-Atcor Joint Method (MAJA) is another threshold-based method utilizing multi-temporal data to detect the existence of cloud and other classes, with a major requirement being the availability of the most recent cloud-free image [63].
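To give a concrete flavor of such rule-based tests, the sketch below combines illustrative brightness, whiteness, and SWIR checks on per-pixel TOA reflectances; the band choices and threshold values are arbitrary examples for illustration, not the operational rules of Fmask, Sen2Cor, or MAJA.

```python
import numpy as np

def naive_cloud_test(blue, green, red, swir, eps=1e-6):
    """Toy per-pixel threshold tests in the spirit of rule-based cloud
    screening; all thresholds are assumed values, not published ones."""
    bright = blue > 0.2                      # clouds are bright in the blue band
    mean_vis = (blue + green + red) / 3.0
    whiteness = (np.abs(blue - mean_vis) + np.abs(green - mean_vis)
                 + np.abs(red - mean_vis)) / (mean_vis + eps)
    white = whiteness < 0.7                  # clouds are spectrally "white"
    not_snow = swir > 0.11                   # snow absorbs strongly in SWIR
    return bright & white & not_snow         # potential cloud pixels
```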

5.1.1. Function of Mask (Fmask)

The Fmask method was initially proposed for Landsat satellite data and was later expanded to screen contaminated areas in Landsat 8 and Sentinel-2 data [62,64,65]. The United States Geological Survey (USGS) adopted Fmask version 3.3 as an operational processing method for Landsat 4–8 data, and an improved Fmask 4.0 was released in 2019 [66]. It is a single-date, rule-based cloud detection method that generates a mask for cloud, cloud shadow, and snow-covered areas by taking Top of Atmosphere (TOA) band data as input. In the Fmask method, a generic threshold is applied to the TOA band data to identify pixels with high reflectance values typically associated with clouds, marking them as potential cloud pixels. Similarly, potential cloud shadow and potential snow pixels are computed by applying specific generic thresholds. If sufficient clear pixels remain after the initial test, separate cloud probability maps are generated for water and land to refine the potential cloud areas. A pixel is marked clear only if at least 4 out of its 8 neighbors are cloud-free. Additionally, erosion and dilation are performed on the final mask to remove small, bright, non-cloudy areas like snow, buildings, and roads. Fmask performs well for Landsat 8, which provides thermal band data, but the non-availability of a thermal band poses challenges in effectively applying this method to Sentinel-2 imagery.
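The neighborhood consensus and morphological cleanup steps just described can be sketched as follows; this is a minimal illustration with assumed structuring elements and iteration counts, not the actual Fmask implementation.

```python
import numpy as np
from scipy import ndimage

def refine_potential_cloud(potential_cloud):
    """Sketch of two post-processing steps: an 8-neighbor consensus rule
    followed by erosion and dilation (a morphological opening)."""
    kernel = np.ones((3, 3), dtype=int)
    kernel[1, 1] = 0                          # count the 8 neighbors only
    cloudy_nb = ndimage.convolve(potential_cloud.astype(int), kernel,
                                 mode="constant", cval=0)
    # keep a pixel clear only if at least 4 of its 8 neighbors are clear
    clear = ~potential_cloud & (8 - cloudy_nb >= 4)
    mask = ~clear
    # opening removes small, bright, non-cloud patches (snow, roofs, roads)
    mask = ndimage.binary_erosion(mask)
    return ndimage.binary_dilation(mask)
```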

5.1.2. Sen2Cor

The Sen2Cor method was developed by Telespazio VEGA Deutschland GmbH for the European Space Agency (ESA) to generate and distribute the Sentinel-2 Level-2A product, which includes a cloud mask and corrected Bottom of Atmosphere (BOA) image data [67]. Sen2Cor derives BOA (surface) reflectance from multispectral Level-1C TOA reflectance images and produces scene cloud masks, including cloud and snow probability cover, for Sentinel-2 data [25]. It computes the probability of cloud by applying different threshold tests individually; afterward, the global mask is computed by multiplying all the probabilities. Sen2Cor provides good results for relatively clear observations but shows poor universality for complex imagery with multiple classes.

5.1.3. MAJA

The MAJA method for cloud detection was proposed by the National Centre for Space Studies (CNES) in collaboration with the Centre d’Etudes Spatiales de la BIOsphère (CESBIO) [68]. MAJA is a threshold-based method that builds on the Multi-sensor Atmospheric Correction & Cloud Screening (MACCS) approach by including some methods from Atmosphere & Topographic Correction (ATCOR) [63]. MAJA is a Level-2 atmospheric correction (L2A) algorithm used to perform atmospheric correction, cloud detection, aerosol optical thickness (AOT) calculation, and environmental and slope effect correction [69]. It requires a “clear date” (i.e., a recent cloud-free acquisition, typically from within about the preceding month) in order to calculate the spectral difference used to estimate cloud presence. This method sometimes overlooks actual changes on the Earth’s surface, and the availability of recent clear data is itself a challenging requirement.

5.2. Machine Learning (ML) Cloud Detection

ML methods involve training traditional machine learning algorithms, such as support vector machines (SVM), random forests, decision trees (CART), naïve Bayes, k-nearest neighbors (KNN), and artificial neural networks (ANN), that learn patterns and signatures from spectral data to distinguish cloud-covered from cloud-free pixels [70,71,72,73,74,75,76,77,78,79]. ML treats cloud detection as a pixel-wise classification problem, utilizing either supervised learning, where predefined labeled pixels are used for training a classifier, or semi-supervised learning, in which the classifier learns to group unlabeled pixels that are then assigned to their respective classes using the small amount of available pixel-wise information [80,81,82,83,84,85,86,87,88,89]. These methods automatically handle the threshold selection requirement when generating cloud masks [90,91]. Despite their use with many datasets, ML-based cloud detection methods find it difficult to automatically learn diverse spatial characteristics from training samples alone, necessitating additional feature extraction algorithms that combine spectral and spatial features [92]. Feature extraction in satellite imagery is itself a complex task, as object interclass variance is relatively low, and cloud interference makes it more complex to identify distinctive features [93].
Machine learning classifiers can be trained using single or multiple features [94]. Features used for ML cloud detection include spectral, texture, frequency, and other mathematically derived features [26]. Spectral features involve the intensity value or digital number (DN) of pixels in different channels or bands (visible, infrared, microwave, and ultraviolet). Texture features are extracted using techniques such as the grey-level co-occurrence matrix (GLCM), local binary patterns (LBP), bilateral filtering, and morphological profiling [35,95]. Frequency features include the Fourier transform, Gabor filters, wavelets, and curvelets [72]. Spectral indices, including the Normalized Difference Vegetation Index (NDVI), NDVI using the SWIR band (NDVIswir), Normalized Difference Water Index (NDWI), Normalized Difference Building Index (NDBI), Normalized Difference Snow Index (NDSI), Enhanced Vegetation Index (EVI), tasseled cap features, brightness temperature, and the whiteness index [81,84], can also be used as input for cloud detection.
Over the past decade, SVM and Random Forest have been used extensively for cloud detection. SVM’s advantage lies in its ability to achieve high performance with smaller training sample sizes [70,71,77,78,81,85,86], although cloud detection studies suggest that SVM incurs large computational costs when training and testing on large-scale satellite datasets. Random Forest cloud detection methods, on the other hand, are much faster, capable of handling large datasets, and achieve good accuracy [79,80,82,87,96]. Random Forest requires a small computational cost for cloud detection using Landsat data but has not been applied to Sentinel-2 images, which require large-scale training data [84,87]. While KNN, decision trees, naïve Bayes, and basic variants of neural networks have also been applied, they often underperform compared to SVM and Random Forest [26,79,97]. Extreme Gradient Boosting (XGBoost) is another tree-based method that has proven highly efficient in dealing with large-scale data for remote sensing applications [98,99,100]. XGBoost has demonstrated competitive performance in cloud detection, and further exploration of other boosting algorithms may yield improved results [100,101].
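As a concrete illustration of this pixel-wise formulation, the sketch below stacks raw reflectances and two spectral indices as per-pixel features and fits a Random Forest to labeled pixels; the band names, array shapes, and hyperparameters are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def pixel_features(bands, eps=1e-6):
    """Build a (n_pixels, n_features) matrix from 2D reflectance arrays;
    `bands` is a hypothetical dict keyed by band name."""
    ndvi = (bands["nir"] - bands["red"]) / (bands["nir"] + bands["red"] + eps)
    ndsi = (bands["green"] - bands["swir"]) / (bands["green"] + bands["swir"] + eps)
    stack = np.stack([bands["blue"], bands["green"], bands["red"],
                      bands["nir"], ndvi, ndsi], axis=-1)
    return stack.reshape(-1, stack.shape[-1])

# X_train: per-pixel features; y_train: labels taken from a reference mask
# clf = RandomForestClassifier(n_estimators=100, n_jobs=-1)
# clf.fit(X_train, y_train)
# predicted_mask = clf.predict(X_scene).reshape(height, width)
```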

5.3. Deep Learning (DL) Cloud Detection

Deep learning, a subset of machine learning, utilizes neural networks, especially Convolutional Neural Networks (CNNs), that automatically extract low- to high-level features by incorporating the spatial characteristics of each pixel [101,102,103,104,105,106,107]. CNNs exploit the spatial correlations that highlight cloud features distributed throughout the imagery. The main building block of a CNN is the convolution layer, which captures distinctive features within local receptive fields, such as lines, edges, and intricate elements of the imagery [103,108,109,110,111,112,113]. The U-Net architecture stands out as the most used CNN variant for cloud detection; it follows a U-shaped semantic segmentation framework that extracts features via an encoding path (down-sampling) from the input image and a decoding path (up-sampling) to generate pixel-wise predictions [37,101,104,114,115,116,117,118,119]. Other CNN variants, such as ResNet [102,103,105,106,108,113], Inception [116,120], and Generative Adversarial Networks (GANs) [121,122,123,124,125], have also been used for cloud detection tasks. Despite improved performance, these architectures are found to degrade over complex regions having similar inter- and intraclass surface reflectance.
To overcome this, different mechanisms were explored, such as atrous (dilated) convolution to increase the receptive field [50,106,126,127,128,129,130], deformable convolution to apply offset learning for capturing the irregular geometric shapes of clouds [131,132], skip connections to deal with the vanishing gradient problem in deeper networks [102,128,133], and attention to emphasize useful key features [51,134,135,136]. These mechanisms increase the receptive field but usually fail to preserve object boundaries and localization, leading to the loss of local spatial information [129].
CNN-based classifiers require substantial computational resources, especially when dealing with large-scale satellite datasets like Sentinel-2. Deeper architectures, which can produce high-quality results, necessitate significant processing power, time, and memory. Training these deep CNN networks to achieve efficient models demands an ample supply of labeled data covering the diverse and complex scenarios present in satellite images. In response to these challenges, lightweight networks such as Cloud Detection-fusing multiscale spectral and spatial features (CD-FM3SF) [56], the Efficient Cloud Detection Network (ECDNeT) [137], lightweight U-Net [138], and lightweight CNN [139] were proposed. They exhibited improved performance for binary (cloud-only) detection but often struggled with multiple cloud classes, such as thin cloud and cloud shadow, which are of high importance [114,140,141,142,143]. Feed-forward neural networks using a sigmoid or tanh activation function with a single output unit have also been used for binary cloud classification; this unit separates the two classes by applying a specified threshold point. Multiclass classification results, on the other hand, are derived with the help of the softmax activation function, which estimates the probability of each class during prediction. It is easier to manipulate the threshold to achieve desired results in binary classification, whereas multiclass classification relies heavily on the estimated logits and their associated probabilities, making it a more difficult classification task.
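A minimal PyTorch sketch of the two prediction heads contrasted above is shown below; the tensor shapes, class count, and the 0.5 cut-off are illustrative assumptions.

```python
import torch

# Hypothetical logits from a segmentation network for one 256 x 256 tile
binary_logits = torch.randn(1, 1, 256, 256)   # single output unit
multi_logits = torch.randn(1, 4, 256, 256)    # e.g., clear/thick/thin/shadow

# Binary head: sigmoid probability thresholded at an adjustable cut-off
cloud_prob = torch.sigmoid(binary_logits)
binary_mask = (cloud_prob > 0.5).squeeze(1)   # the threshold can be tuned

# Multiclass head: softmax probabilities, argmax selects the likeliest class
class_prob = torch.softmax(multi_logits, dim=1)
multiclass_mask = class_prob.argmax(dim=1)    # no single threshold to tune
```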
Recently, Vision Transformers (ViT) have been used as an alternative to CNN-based deep learning classifiers [144] and have been found to work well for cloud detection problems [145,146,147,148]. ViT-based cloud detection techniques achieve better accuracy thanks to multiple self-attention layers that capture global context and long-range dependencies [149,150,151]. Despite working well, model size, parameter count, and computational requirements remain the major drawbacks of ViT [149,152,153]. Recently, two other ViT-inspired deep learning architectures, MLP-Mixer [154] and Global Filter Networks (GFNet; [155]), have been proposed. MLP-Mixer suggests that convolution and attention are both sufficient but not necessary, allowing the network’s complexity to grow linearly rather than quadratically, whereas GFNet modifies ViT by introducing a 2D Fourier transform in place of the attention layer to reduce the complexity of the architecture. These architectures are effective for cloud detection, achieving remarkable results with limited computation, and MLP-Mixer has demonstrated a promising lightweight solution for cloud detection [156].
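To make the GFNet idea concrete, the following simplified sketch mixes spatial information by multiplying the 2D Fourier transform of the feature map with a learnable filter, in place of self-attention; the dimensions and initialization are illustrative assumptions rather than the reference implementation.

```python
import torch
import torch.nn as nn

class GlobalFilterLayer(nn.Module):
    """Token mixing in the 2D Fourier domain, GFNet-style (simplified)."""

    def __init__(self, h, w, dim):
        super().__init__()
        # one complex-valued weight per retained frequency and channel
        self.weight = nn.Parameter(torch.randn(h, w // 2 + 1, dim, 2) * 0.02)

    def forward(self, x):  # x: (batch, h, w, dim) feature map
        freq = torch.fft.rfft2(x, dim=(1, 2), norm="ortho")
        freq = freq * torch.view_as_complex(self.weight)
        return torch.fft.irfft2(freq, s=x.shape[1:3], dim=(1, 2), norm="ortho")
```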
A summary of state-of-the-art ML/DL-based cloud detection methods is presented in Table 9, covering various approaches applied to optical satellite imagery. While the primary focus is on Sentinel-2, methods originally developed for other sensors and instruments are included, as they are technically transferable to Sentinel-2 applications. This inclusion also addresses the limited availability of Sentinel-2-specific studies and leverages the spectral and spatial similarities shared across platforms. Most performance metrics reported in the reviewed studies, such as Accuracy, Precision, Recall, F1-score, Producer’s and User’s Accuracy, and Intersection over Union (IoU), are calculated from confusion matrices that compare predicted cloud masks to reference data. While metrics like training and prediction times provide useful supplementary information, confusion-matrix-derived metrics remain the primary tools for evaluation. However, variations in datasets, cloud types, and experimental setups limit direct quantitative comparisons between methods. To address this challenge, Section 6 presents an intercomparison framework designed to standardize evaluations and facilitate more meaningful comparisons. This overview focuses on ML/DL-based approaches, as they represent the current state-of-the-art in cloud detection and complement existing review articles [26,157].

5.4. Importance of Multiclass Cloud Detection in Cloud Removal

The primary objective of cloud detection is to create a mask that delineates cloud-affected areas within satellite images. This mask serves as the basis for cloud removal techniques, which are crucial for reducing the impact of clouds [160]. The effectiveness of the cloud detection method directly influences the success of cloud removal. Direct extraction of cloud and cloud-shadow regions often results in replacing these areas with zero values, leaving gaps or artifacts in the imagery. To obtain a more accurate representation of the Earth’s surface and recover the hidden information, it is essential to reconstruct or replace these areas properly [161]. Restoration can take two approaches: substituting corrupted regions using insights from available time-series or temporal data [34,162], or regenerating missing pixels based on analysis of the spatial distribution and surrounding context [145,163,164].
To perform reconstruction and evaluate cloud removal techniques, access to recent cloud-free imagery is crucial. Locating a truly cloud-free image corresponding to cloud-affected satellite imagery is often challenging. As a solution, many researchers have proposed simulating clouds on available cloud-free images [34,165,166,167]. These cloud-free images then serve as ground truth for evaluating the efficacy of removal methods. The complexity of reconstructing contaminated regions usually depends on various factors, including the type of ground information present, such as multiple textures, colors, objects at different elevations, static and moving objects, and diverse regions [163,168].
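A common and simple way to perform such simulation is alpha blending of a synthetic cloud layer over a cloud-free scene, as in the sketch below; the array shapes and the constant cloud reflectance are assumptions for illustration.

```python
import numpy as np

def simulate_clouds(clear_image, alpha, cloud_reflectance=1.0):
    """Blend a bright cloud layer over a cloud-free image.

    clear_image: (H, W, bands) reflectance array (hypothetical shape);
    alpha: (H, W) per-pixel cloud opacity in [0, 1], e.g., from smoothed
    random noise; alpha near 1 mimics thick cloud, small alpha thin cloud.
    """
    a = alpha[..., np.newaxis]
    return a * cloud_reflectance + (1.0 - a) * clear_image
```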
Cloud removal can be categorized according to the contamination caused by different cloud types, i.e., thick and thin cloud removal [165,169]. Thick clouds usually obscure the Earth’s surface entirely, requiring complete removal and replacement [34,163,170,171]. Image mosaicking may seem like a good option for replacing contaminated portions, but it can introduce radiometric and geometric variations in the satellite imagery [172,173]. To mitigate these variations, different pixel-wise image regeneration techniques based on machine learning/deep learning have been applied [145,166,174,175]. In contrast, thin clouds usually cause blurriness or haziness over the Earth’s surface (i.e., cloud and ground information coexist in contaminated areas), and complete removal can lead to the loss of important ground information [36,176,177]. Thin cloud removal can be achieved through various techniques, including noise reduction methods [140], low- and high-level feature separation [178], spectral unmixing [179], and fusion with other data sources [51,180]. This underscores the need for efficient thick versus thin cloud separation techniques to deal with each contamination type individually [129,177,178], i.e., an efficient multiclass cloud detection technique.

6. Intercomparison Framework and Performance Evaluation

This review builds upon the CMIX framework [40], which primarily established a standardized evaluation mechanism for threshold-based cloud detection methods through binary classification (cloud vs. non-cloud). While CMIX focused on binary evaluation, this review extends the intercomparison framework to encompass multiclass cloud detection methods, including those developed using machine learning (ML) and deep learning (DL) approaches. Unlike binary evaluations, multiclass assessments require more nuanced metrics, such as those computed via micro- or macro-averaging, to capture the performance across multiple classes more effectively.

6.1. Data Harmonization and Input Consistency

A critical component of this extended intercomparison is data harmonization, particularly regarding spatial resolution. Variations in spatial resolution and preprocessing across diverse benchmark datasets can introduce biases, especially for ML/DL methods that are sensitive to input scales. To address this, some studies, including our prior work [150,156], resample data to a common spatial resolution (e.g., 60 m) to ensure models are evaluated on consistent inputs. Such preprocessing steps are essential to reduce biases and support fair comparability of results across datasets with differing spatial characteristics.
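To make the harmonization step concrete, the following minimal numpy sketch block-averages bands of different native resolutions onto a common 60 m grid. Operational pipelines would normally use dedicated resampling in rasterio/GDAL; the function name and the reduced synthetic band sizes here are illustrative assumptions.

```python
import numpy as np

def downsample_to_common_grid(band, factor):
    """Block-average a band to a coarser grid (e.g., 10 m -> 60 m, factor=6).

    band   : 2-D array of reflectance values.
    factor : integer ratio between target and source pixel sizes.
    """
    h, w = band.shape
    h, w = h - h % factor, w - w % factor            # crop to a multiple of factor
    blocks = band[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))                  # mean over each block

# Tiny stand-ins for Sentinel-2 bands (full tiles are 10980 x 10980 at 10 m
# and 5490 x 5490 at 20 m); both land on the same 60 m grid after averaging.
b02_10m = np.random.rand(1098, 1098)                 # 10 m band, factor 6
b11_20m = np.random.rand(549, 549)                   # 20 m band, factor 3
b02_60m = downsample_to_common_grid(b02_10m, 6)      # -> 183 x 183
b11_60m = downsample_to_common_grid(b11_20m, 3)      # -> 183 x 183
```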

6.2. Performance Evaluation Metrics

To evaluate the performance of cloud detection methods, both quantitative and qualitative analyses are commonly used. Quantitative (objective) analysis compares techniques using various accuracy metrics, including Accuracy, Precision, Recall, F1-score, Kappa coefficient, and mean Intersection over Union (mIoU).
  • For binary cloud detection, the standard confusion matrix given by CMIX (Figure 5a) is used to estimate True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN) counts, from which the accuracy metrics are derived (Table 10).
  • For multiclass cloud detection, extended confusion matrices (Figure 5b–d) are used to evaluate the performance metrics (Table 10) for each class c, where n is the number of pixels, C is the number of classes, and the per-class counts True Positive (TPc), False Positive (FPc), False Negative (FNc), and True Negative (TNc) are estimated as per [150,181]. The multiclass accuracy metrics are computed either as macro-averages, which give equal importance to each class and thereby reduce the impact of the dominant class, or as micro-averages, which aggregate the contributions of all classes (Table 10).
While quantitative analysis is critical, qualitative analysis through visual assessment provides additional insight into method performance, especially under diverse conditions. For a fair comparison, classes should be assigned consistent color codes aligned with the reference mask.
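The per-class counts and the macro/micro averaging described above can be expressed compactly. The following sketch (illustrative function names, numpy only) derives TPc, FPc, FNc, and TNc from a multiclass confusion matrix and computes macro- and micro-averaged F1-scores; note that for single-label multiclass masks, the micro-averaged F1 coincides with overall accuracy.

```python
import numpy as np

def per_class_counts(y_true, y_pred, num_classes):
    """Confusion-matrix counts (TPc, FPc, FNc, TNc) for each class c."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(cm, (y_true, y_pred), 1)           # rows: reference, cols: prediction
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp                     # predicted as c but not c
    fn = cm.sum(axis=1) - tp                     # reference c but predicted otherwise
    tn = cm.sum() - tp - fp - fn
    return tp, fp, fn, tn

def f1_scores(y_true, y_pred, num_classes):
    tp, fp, fn, _ = per_class_counts(y_true, y_pred, num_classes)
    precision = tp / np.maximum(tp + fp, 1)
    recall = tp / np.maximum(tp + fn, 1)
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-9)
    macro_f1 = f1.mean()                          # equal weight per class
    micro_p = tp.sum() / max((tp + fp).sum(), 1)  # pooled over all classes
    micro_r = tp.sum() / max((tp + fn).sum(), 1)
    micro_f1 = 2 * micro_p * micro_r / max(micro_p + micro_r, 1e-9)
    return f1, macro_f1, micro_f1

# Flattened pixel labels for a 4-class mask (0: clear, 1: thick, 2: thin, 3: shadow).
y_true = np.array([0, 0, 1, 2, 3, 1, 2, 0])
y_pred = np.array([0, 1, 1, 2, 3, 1, 0, 0])
per_class, macro, micro = f1_scores(y_true, y_pred, num_classes=4)
```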

6.3. Conversion Framework for Multiclass Comparison

To fairly compare cloud detection methods that predict different numbers of classes, a conversion mechanism is needed. This mechanism standardizes the output cloud masks of each method (Table 11), whether binary, 4-class, 6-class, or other, to align with the class structure of each cloud detection dataset. For instance, a 6-class method that classifies pixels as Clear, Thick Clouds, Thin Clouds, Cloud Shadow, Water, and Snow/Ice can be mapped to a 4-class mask by grouping Water, Snow/Ice, and Clear as Clear, while the remaining classes keep their original labels. However, mapping from richer to fewer classes may discard important information. Visual assessment therefore becomes important for judging how well the detailed mask translates into a simpler one and whether it preserves the distinctiveness of classes. For example, merging Thick and Thin Clouds into a single “Cloud” class, or combining Cloud Shadows with non-cloud areas, may obscure a method’s ability to distinguish class-specific features; visual analysis helps ensure that such simplifications do not mask significant differences in detection quality. While this review does not reproduce full experimental comparisons, it presents the framework, standardization steps, and evaluation metrics necessary to enable unbiased intercomparison across diverse cloud detection methods, as illustrated in Table 10 and Table 11.
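A simple way to implement such a class mapping is a lookup table applied to the label raster. The sketch below uses hypothetical label encodings (actual values depend on each method and dataset) to fold a 6-class mask into the 4-class scheme described above.

```python
import numpy as np

# Hypothetical label encodings; actual values depend on each method/dataset.
SIX = {"clear": 0, "thick": 1, "thin": 2, "shadow": 3, "water": 4, "snow": 5}
FOUR = {"clear": 0, "thick": 1, "thin": 2, "shadow": 3}

# Lookup table: 6-class label -> 4-class label (Water and Snow/Ice fold into Clear).
LUT = np.array([
    FOUR["clear"],   # clear  -> clear
    FOUR["thick"],   # thick  -> thick
    FOUR["thin"],    # thin   -> thin
    FOUR["shadow"],  # shadow -> shadow
    FOUR["clear"],   # water  -> clear
    FOUR["clear"],   # snow   -> clear
])

def harmonize(mask_6class):
    """Map a 6-class mask to the 4-class scheme with a vectorized lookup."""
    return LUT[mask_6class]

mask = np.array([[0, 4, 5], [1, 2, 3]])
print(harmonize(mask))  # [[0 0 0]
                        #  [1 2 3]]
```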

7. Research Gaps

The detailed literature review shows that several strategies have been developed to address cloud-related problems in Sentinel-2 imagery; however, their inaccuracies can lead either to missing cloud-affected portions or to erroneously labeling cloud-free portions as cloud. Several gaps and issues underlie the failures of current cloud detection approaches:
  • Threshold-based methods exhibit limited universality and scalability when applied to imagery from different locations with rich diversity. Thresholds are often static discriminators that may only suit specific regions and cloud types (a toy static-threshold test illustrating this limitation follows this list). Given the richness and diversity of remote sensing data, dynamic thresholds that adapt to different conditions through expert knowledge systems are needed. Although thick cloud detection is generally efficient for most methods, improving the detection of thin clouds and cloud shadows remains a challenge. Realizing a comparative analysis across approaches is also non-trivial, as most algorithms generate cloud masks at different pixel resolutions: Fmask produces cloud masks at 20 m, Sen2Cor at 30 m, and MAJA at 240 m.
  • ML methods remove the need for manual threshold selection but depend on effective feature selection and extraction, as using only spectral features generally leads to lower accuracy. Their performance can be enhanced by combining spectral and spatial features through an appropriate mix of conventional methods that generate handcrafted texture features. However, conventional feature extraction is time-consuming, and a mechanism is required to incorporate spatial features alongside spectral features automatically.
  • DL methods perform well by automatically generating low- to high-level spectral and spatial features. However, they have high computational costs and struggle with large patch sizes containing rich information about large structures such as clouds, shadows, and snow regions. They also face challenges in learning features that discriminate clouds from bright areas such as snow/ice, buildings, and riverbeds, and cloud shadows from dark areas such as water bodies and terrain shadows.
  • Extracting clouds leaves zero-valued gaps in the imagery, which are filled either by mosaicking from multitemporal data or by pixel-wise reconstruction. However, mosaicking fails to account for radiometric variation and tends to overlook the spatial distribution of pixel intensity. Pixel-wise reconstruction shows improvement, but most existing cloud removal methods are unsuitable for thin clouds as well as for large areas covered by thick clouds. An efficient automated multiclass cloud detection technique is also required to separate thick from thin clouds. Moreover, most cloud removal methods operate on true-color or false-color composites even though every spectral band is affected by cloud presence; a mechanism to handle clouds in each spectral band is therefore needed.
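To illustrate the static-threshold limitation noted in the first gap above, the toy test below flags clouds with fixed reflectance thresholds. The band choices and threshold values are illustrative assumptions, not the rules of Fmask, Sen2Cor, or MAJA; over a bright desert scene, thresholds that work elsewhere misclassify clear soil as cloud.

```python
import numpy as np

def static_threshold_cloud_mask(blue, cirrus, t_blue=0.25, t_cirrus=0.012):
    """Toy static-threshold cloud test on TOA reflectance bands.

    blue, cirrus : 2-D reflectance arrays (e.g., Sentinel-2 B02 and B10).
    The fixed thresholds illustrate the core limitation: values tuned for
    one region/season routinely fail elsewhere (bright sand, snow, haze).
    """
    thick = blue > t_blue            # bright pixels flagged as (thick) cloud
    cirrus_like = cirrus > t_cirrus  # high cirrus reflectance flags thin cloud
    return thick | cirrus_like

# A bright desert scene: many clear pixels exceed the static blue threshold.
blue = np.full((4, 4), 0.30)     # bright soil, no cloud present
cirrus = np.zeros((4, 4))
print(static_threshold_cloud_mask(blue, cirrus).all())  # True -> false alarms
```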

8. Conclusions

This review highlights key contributions by systematically analyzing cloud detection techniques applied to optical satellite imagery, emphasizing the transition from traditional thresholding to advanced ML and DL approaches, and introducing a structured intercomparison framework for standardized evaluation.
Effective cloud and cloud shadow detection is crucial to mitigate cloud contamination and to gain the utmost benefit from freely available satellite imagery such as Sentinel-2. Automated methods, ranging from threshold-based to advanced machine learning (ML) and deep learning (DL) approaches, have significantly advanced how cloud cover is identified and analyzed in satellite imagery. Traditional threshold methods (Fmask and Sen2Cor), being straightforward and computationally efficient, apply various spectral thresholds to identify different types of clouds, cloud shadows, and other atmospheric phenomena, but often struggle with limited generalizability. ML and DL methods have emerged as more robust alternatives, overcoming the limitations of traditional techniques by learning intricate patterns from satellite imagery. ML methods have been widely used owing to their ability to achieve high performance with well-engineered spectral-spatial features; however, they lack the inherent capability to extract spatial features and rely on additional feature extraction algorithms, which usually incur substantial computational cost. DL methods have further enhanced cloud detection through automatic feature extraction, where CNNs effectively integrate spatial and spectral information, and ViTs, along with other variants, learn global context and long-range dependencies, improving the detection accuracy of clouds and their shadows.
Despite these advantages, cloud detection methods face challenges when detecting diverse cloud types and shadows, especially where distinguishing between clouds and other surface features (e.g., snow, ice, and water bodies) becomes difficult. Cloud detection methods trained on specific backgrounds usually underperform when transferred across diverse environmental conditions, highlighting the need for large labeled datasets as well as continued improvement in existing detection algorithms.
Future research needs to focus on improving ML and DL models through semi-supervised or self-supervised learning, domain adaptation to enhance cross-region performance, lightweight architectures for onboard processing, and hybrid fusion frameworks to combine the strengths of threshold and learning-based methods. These strategies can help in improving method efficiency, reducing dependency on extensive labeled datasets, and addressing environmental diversity, making cloud detection more scalable and applicable in real-world remote sensing scenarios.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Zhu, X.X.; Tuia, D.; Mou, L.; Xia, G.-S.; Zhang, L.; Xu, F.; Fraundorfer, F. Deep Learning in Remote Sensing: A Comprehensive Review and List of Resources. IEEE Geosci. Remote Sens. Mag. 2017, 5, 8–36. [Google Scholar] [CrossRef]
  2. Richards, J.A. Sources and Characteristics of Remote Sensing Image Data. In Remote Sensing Digital Image Analysis; Springer: Berlin, Heidelberg, 1993; pp. 1–37. [Google Scholar]
  3. Messina, G.; Peña, J.M.; Vizzari, M.; Modica, G. A Comparison of UAV and Satellites Multispectral Imagery in Monitoring Onion Crop. An Application in the ‘Cipolla Rossa Di Tropea’(Italy). Remote Sens. 2020, 12, 3424. [Google Scholar] [CrossRef]
  4. Wulder, M.A.; Roy, D.P.; Radeloff, V.C.; Loveland, T.R.; Anderson, M.C.; Johnson, D.M.; Healey, S.; Zhu, Z.; Scambos, T.A.; Pahlevan, N.; et al. Fifty Years of Landsat Science and Impacts. Remote Sens. Environ. 2022, 280, 113195–113216. [Google Scholar] [CrossRef]
  5. Qian, S.-E. Hyperspectral Satellites, Evolution, and Development History. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 7032–7056. [Google Scholar] [CrossRef]
  6. Adjovu, G.E.; Stephen, H.; James, D.; Ahmad, S. Overview of the Application of Remote Sensing in Effective Monitoring of Water Quality Parameters. Remote Sens. 2023, 15, 1938–1973. [Google Scholar] [CrossRef]
  7. Phiri, D.; Simwanda, M.; Salekin, S.; Nyirenda, V.R.; Murayama, Y.; Ranagalage, M. Sentinel-2 Data for Land Cover/Use Mapping: A Review. Remote Sens. 2020, 12, 2291. [Google Scholar] [CrossRef]
  8. Ahmed, R.; Ahmad, S.T.; Wani, G.F.; Ahmed, P.; Mir, A.A.; Singh, A. Analysis of Landuse and Landcover Changes in Kashmir Valley, India—A Review. GeoJournal 2022, 87, 4391–4403. [Google Scholar] [CrossRef]
  9. Pandey, P.C.; Koutsias, N.; Petropoulos, G.P.; Srivastava, P.K.; Ben Dor, E. Land Use/Land Cover in View of Earth Observation: Data Sources, Input Dimensions, and Classifiers—A Review of the State of the Art. Geocarto Int. 2021, 36, 957–988. [Google Scholar] [CrossRef]
  10. Casagli, N.; Intrieri, E.; Tofani, V.; Gigli, G.; Raspini, F. Landslide Detection, Monitoring and Prediction with Remote-Sensing Techniques. Nat. Rev. Earth Environ. 2023, 4, 51–64. [Google Scholar] [CrossRef]
  11. Liu, L.; Li, C.; Lei, Y.; Yin, J.; Zhao, J. Volcanic Ash Cloud Detection from MODIS Image Based on CPIWS Method. Acta Geophys. 2017, 65, 151. [Google Scholar] [CrossRef]
  12. Wu, B.; Zhang, M.; Zeng, H.; Tian, F.; Potgieter, A.B.; Qin, X.; Yan, N.; Chang, S.; Zhao, Y.; Dong, Q.; et al. Challenges and Opportunities in Remote Sensing-Based Crop Monitoring: A Review. Natl. Sci. Rev. 2023, 10, nwac290. [Google Scholar] [CrossRef] [PubMed]
  13. Giardina, G.; Macchiarulo, V.; Foroughnia, F.; Jones, J.N.; Whitworth, M.R.Z.; Voelker, B.; Milillo, P.; Penney, C.; Adams, K.; Kijewski-Correa, T. Combining Remote Sensing Techniques and Field Surveys for Post-Earthquake Reconnaissance Missions. Bull. Earthq. Eng. 2024, 22, 3415–3439. [Google Scholar] [CrossRef]
  14. Sun, W.; Chen, C.; Liu, W.; Yang, G.; Meng, X.; Wang, L.; Ren, K. Coastline Extraction Using Remote Sensing: A Review. GIScience Remote Sens. 2023, 60, 2243671. [Google Scholar] [CrossRef]
  15. Kussul, N.; Yailymova, H.; Drozd, S. Detection of War-Damaged Agricultural Fields of Ukraine Based on Vegetation Indices Using Sentinel-2 Data. In Proceedings of the 2022 12th International Conference on Dependable Systems, Services and Technologies (DESSERT), Athens, Greece, 9–11 December 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1–5. [Google Scholar]
  16. Ciocarlan, A.; Stoian, A. Ship Detection in Sentinel 2 Multi-Spectral Images with Self-Supervised Learning. Remote Sens. 2021, 13, 4255. [Google Scholar] [CrossRef]
  17. Aulicino, G.; Cotroneo, Y.; de Ruggiero, P.; Buono, A.; Corcione, V.; Nunziata, F.; Fusco, G. Remote Sensing Applications in Satellite Oceanography; Springer: Berlin/Heidelberg, Germany, 2022. [Google Scholar]
  18. O’Reilly, D.; Herdrich, G.; Kavanagh, D.F. Electric Propulsion Methods for Small Satellites: A Review. Aerospace 2021, 8, 22. [Google Scholar] [CrossRef]
  19. Moreira, A.; Prats-Iraola, P.; Younis, M.; Krieger, G.; Hajnsek, I.; Papathanassiou, K.P. A Tutorial on Synthetic Aperture Radar. IEEE Geosci. Remote Sens. Mag. 2013, 1, 6–43. [Google Scholar] [CrossRef]
  20. Fritz, S. The Albedo of the Planet Earth and of Clouds. J. Atmos. Sci. 1949, 6, 277–282. [Google Scholar] [CrossRef]
  21. Du, X.; Wu, H. Cloud-Graph: A Feature Interaction Graph Convolutional Network for Remote Sensing Image Cloud Detection. J. Intell. Fuzzy Syst. 2023, 45, 9123–9139. [Google Scholar] [CrossRef]
  22. Young, N.E.; Anderson, R.S.; Chignell, S.M.; Vorster, A.G.; Lawrence, R.; Evangelista, P.H. A Survival Guide to Landsat Preprocessing. Ecology 2017, 98, 920–932. [Google Scholar] [CrossRef]
  23. Frantz, D. FORCE—Landsat + Sentinel-2 Analysis Ready Data and Beyond. Remote Sens. 2019, 11, 1124–1145. [Google Scholar] [CrossRef]
  24. Pflug, B.; Makarau, A.; Richter, R. Processing Sentinel-2 Data with ATCOR. In Proceedings of the EGU General Assembly, Vienna, Austria, 17–22 April 2016; p. 15488. [Google Scholar]
  25. Main-Knorn, M.; Pflug, B.; Louis, J.; Debaecker, V.; Müller-Wilm, U.; Gascon, F. Sen2Cor for Sentinel-2. In Proceedings of the Image and Signal Processing for Remote Sensing XXIII, Warsaw, Poland, 11–13 September 2017; Bruzzone, L., Bovolo, F., Benediktsson, J.A., Eds.; SPIE: Pune, India, 2017; Volume 10427, p. 3. [Google Scholar]
  26. Mahajan, S.; Fataniya, B. Cloud Detection Methodologies: Variants and Development—A Review. Complex Intell. Syst. 2020, 6, 251–261. [Google Scholar] [CrossRef]
  27. Tapakis, R.; Charalambides, A.G. Equipment and Methodologies for Cloud Detection and Classification: A Review. Sol. Energy 2013, 95, 392–430. [Google Scholar] [CrossRef]
  28. Li, L.; Li, X.; Jiang, L.; Su, X.; Chen, F. A Review on Deep Learning Techniques for Cloud Detection Methodologies and Challenges. Signal Image Video Process. 2021, 15, 1527–1535. [Google Scholar] [CrossRef]
  29. King, M.D.; Platnick, S.; Menzel, W.P.; Ackerman, S.A.; Hubanks, P.A. Spatial and Temporal Distribution of Clouds Observed by MODIS Onboard the Terra and Aqua Satellites. IEEE Trans. Geosci. Remote Sens. 2013, 51, 3826–3852. [Google Scholar] [CrossRef]
  30. Wang, J.; Rossow, W.B. Determination of Cloud Vertical Structure from Upper-Air Observations. J. Appl. Meteorol. Climatol. 1995, 34, 2243–2258. [Google Scholar] [CrossRef]
  31. Parikh, J.A.; Rosenfeld, A. Automatic Segmentation and Classification of Infrared Meteorological Satellite Data. IEEE Trans. Syst. Man. Cybern. 1978, 8, 736–743. [Google Scholar] [CrossRef]
  32. Mishchenko, M.I.; Rossow, W.B.; Macke, A.; Lacis, A.A. Sensitivity of Cirrus Cloud Albedo, Bidirectional Reflectance and Optical Thickness Retrieval Accuracy to Ice Particle Shape. J. Geophys. Res. Atmos. 1996, 101, 16973–16985. [Google Scholar] [CrossRef]
  33. Baetens, L.; Desjardins, C.; Hagolle, O. Validation of Copernicus Sentinel-2 Cloud Masks Obtained from MAJA, Sen2Cor, and FMask Processors Using Reference Cloud Masks Generated with a Supervised Active Learning Procedure. Remote Sens. 2019, 11, 433–458. [Google Scholar] [CrossRef]
  34. Zhang, Q.; Yuan, Q.; Li, J.; Li, Z.; Shen, H.; Zhang, L. Thick Cloud and Cloud Shadow Removal in Multitemporal Imagery Using Progressively Spatio-Temporal Patch Group Deep Learning. ISPRS J. Photogramm. Remote Sens. 2020, 162, 148–160. [Google Scholar] [CrossRef]
  35. Li, Z.; Shen, H.; Weng, Q.; Zhang, Y.; Dou, P.; Zhang, L. Cloud and Cloud Shadow Detection for Optical Satellite Imagery: Features, Algorithms, Validation, and Prospects. ISPRS J. Photogramm. Remote Sens. 2022, 188, 89–108. [Google Scholar] [CrossRef]
  36. Shen, Y.; Wang, Y.; Lv, H.; Qian, J. Removal of Thin Clouds in Landsat-8 OLI Data with Independent Component Analysis. Remote Sens. 2015, 7, 11481–11500. [Google Scholar] [CrossRef]
  37. Wieland, M.; Li, Y.; Martinis, S. Multi-Sensor Cloud and Cloud Shadow Segmentation with a Convolutional Neural Network. Remote Sens. Environ. 2019, 230, 111203. [Google Scholar] [CrossRef]
  38. Hollstein, A.; Segl, K.; Guanter, L.; Brell, M.; Enesco, M. Ready-to-Use Methods for the Detection of Clouds, Cirrus, Snow, Shadow, Water and Clear Sky Pixels in Sentinel-2 MSI Images. Remote Sens. 2016, 8, 666–684. [Google Scholar] [CrossRef]
  39. Irish, R.R. Landsat 7 Automatic Cloud Cover Assessment. In Algorithms for Multispectral, Hyperspectral, and Ultraspectral Imagery VI; Shen, S.S., Descour, M.R., Eds.; SPIE: Orlando, FL, USA, 2000; Volume 4049, p. 348. [Google Scholar]
  40. Skakun, S.; Wevers, J.; Brockmann, C.; Doxani, G.; Aleksandrov, M.; Batič, M.; Frantz, D.; Gascon, F.; Gómez-Chova, L.; Hagolle, O.; et al. Cloud Mask Intercomparison EXercise (CMIX): An Evaluation of Cloud Masking Algorithms for Landsat 8 and Sentinel-2. Remote Sens. Environ. 2022, 274, 112990–113012. [Google Scholar] [CrossRef]
  41. Shtym, T.; Wold, O.; Domnich, M.; Voormansik, K.; Rohtsalu, M.; Truupõld, J.; Murin, N.; Toqeer, A.; Odera, C.A.; Harun, F.; et al. KappaSet: Sentinel-2 KappaZeta Cloud and Cloud Shadow Masks 2022. Available online: https://data.niaid.nih.gov/resources?id=zenodo_7100326 (accessed on 14 December 2021).
  42. Mohajerani, S.; Krammer, T.A.; Saeedi, P. A Cloud Detection Algorithm for Remote Sensing Images Using Fully Convolutional Neural Networks. In Proceedings of the 2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP), Vancouver, BC, Canada, 29–31 August 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–5. [Google Scholar]
  43. Mohajerani, S.; Saeedi, P. Cloud and Cloud Shadow Segmentation for Remote Sensing Imagery Via Filtered Jaccard Loss Function and Parametric Augmentation. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 4254–4266. [Google Scholar] [CrossRef]
  44. U.S. Geological Survey L8 Biome Cloud Validation Masks. U.S. Geological Survey Data Release. Available online: https://landsat.usgs.gov/landsat-8-cloud-cover-assessment-validation-data (accessed on 14 December 2021).
  45. Candra, D.S.; Phinn, S.; Scarth, P. Cloud and Cloud Shadow Masking Using Multi-Temporal Cloud Masking Algorithm in Tropical Environmental. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2016, 41, 95–100. [Google Scholar] [CrossRef]
  46. Hughes, M.J.; Kennedy, R. High-Quality Cloud Masking of Landsat 8 Imagery Using Convolutional Neural Networks. Remote Sens. 2019, 11, 2591. [Google Scholar] [CrossRef]
  47. Mohajerani, S.; Saeedi, P. Cloud-Net: An End-To-End Cloud Detection Algorithm for Landsat 8 Imagery. In Proceedings of the IGARSS 2019–2019 IEEE International Geoscience and Remote Sensing Symposium, Yokohama, Japan, 28 July–2 August 2019; pp. 1029–1032. [Google Scholar]
  48. Foga, S.; Scaramuzza, P.L.; Guo, S.; Zhu, Z.; Dilley Jr, R.D.; Beckmann, T.; Schmidt, G.L.; Dwyer, J.L.; Hughes, M.J.; Laue, B. Cloud Detection Algorithm Comparison and Validation for Operational Landsat Data Products. Remote Sens. Environ. 2017, 194, 379–390. [Google Scholar] [CrossRef]
  49. Breon, F.-M.; Colzy, S. Cloud Detection from the Spaceborne POLDER Instrument and Validation against Surface Synoptic Observations. J. Appl. Meteorol. Climatol. 1999, 38, 777–785. [Google Scholar] [CrossRef]
  50. Zhan, Y.; Wang, J.; Shi, J.; Cheng, G.; Yao, L.; Sun, W. Distinguishing Cloud and Snow in Satellite Images via Deep Convolutional Network. IEEE Geosci. Remote Sens. Lett. 2017, 14, 1785–1789. [Google Scholar] [CrossRef]
  51. Ji, H.; Xia, M.; Zhang, D.; Lin, H. Multi-Supervised Feature Fusion Attention Network for Clouds and Shadows Detection. ISPRS Int. J. Geo-Information 2023, 12, 247–269. [Google Scholar] [CrossRef]
  52. Grabowski, B.; Ziaja, M.; Kawulok, M.; Bosowski, P.; Longépé, N.; Le Saux, B.; Nalepa, J. Squeezing Adaptive Deep Learning Methods with Knowledge Distillation for On-Board Cloud Detection. Eng. Appl. Artif. Intell. 2024, 132, 107835. [Google Scholar] [CrossRef]
  53. Tarrio, K.; Tang, X.; Masek, J.G.; Claverie, M.; Ju, J.; Qiu, S.; Zhu, Z.; Woodcock, C.E. Comparison of Cloud Detection Algorithms for Sentinel-2 Imagery. Sci. Remote Sens. 2020, 2, 100010. [Google Scholar] [CrossRef]
  54. European Space Agency (ESA) Open Access Hub, Scihub.Copernicus.Eu. Available online: https://dataspace.copernicus.eu/ (accessed on 25 June 2025).
  55. SAFE Format. Available online: https://sentiwiki.copernicus.eu/web/safe-format (accessed on 25 June 2025).
  56. Li, J.; Wu, Z.; Hu, Z.; Jian, C.; Luo, S.; Mou, L.; Zhu, X.X.; Molinier, M. A Lightweight Deep Learning-Based Cloud Detection Method for Sentinel-2A Imagery Fusing Multiscale Spectral and Spatial Features. IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–19. [Google Scholar] [CrossRef]
  57. CVAT Computer Vision Annotation Tool. CVAT. Available online: https://www.cvat.ai/ (accessed on 28 March 2023).
  58. Segment.ai Segments.Ai Dataset Tool. Available online: https://segments.ai/ (accessed on 28 March 2023).
  59. Saunders, R.W.; Kriebel, K.T. An Improved Method for Detecting Clear Sky and Cloudy Radiances from AVHRR Data. Int. J. Remote Sens. 1988, 9, 123–150. [Google Scholar] [CrossRef]
  60. Kubota, M. A New Cloud Detection Algorithm for Nighttime AVHRR/HRPT Data. J. Oceanogr. 1994, 50, 31–41. [Google Scholar] [CrossRef]
  61. Jedlovec, G. Automated Detection of Clouds in Satellite Imagery. In Advances in Geoscience and Remote Sensing; InTech: London, UK, 2009; pp. 303–316. [Google Scholar]
  62. Zhu, Z.; Woodcock, C.E. Object-Based Cloud and Cloud Shadow Detection in Landsat Imagery. Remote Sens. Environ. 2012, 118, 83–94. [Google Scholar] [CrossRef]
  63. Lonjou, V.; Desjardins, C.; Hagolle, O.; Petrucci, B.; Tremas, T.; Dejus, M.; Makarau, A.; Auer, S. MACCS-ATCOR Joint Algorithm (MAJA). In Proceedings of the Remote Sensing of Clouds and the Atmosphere XXI, Edinburgh, UK, 19 October 2016; Comerón, A., Kassianov, E.I., Schäfer, K., Eds.; SPIE: Bellingham, WA, USA, 2016; Volume 10001, p. 1000107. [Google Scholar]
  64. Zhu, Z.; Woodcock, C.E. Automated Cloud, Cloud Shadow, and Snow Detection in Multitemporal Landsat Data: An Algorithm Designed Specifically for Monitoring Land Cover Change. Remote Sens. Environ. 2014, 152, 217–234. [Google Scholar] [CrossRef]
  65. Zhu, Z.; Wang, S.; Woodcock, C.E. Improvement and Expansion of the Fmask Algorithm: Cloud, Cloud Shadow, and Snow Detection for Landsats 4–7, 8, and Sentinel 2 Images. Remote Sens. Environ. 2015, 159, 269–277. [Google Scholar] [CrossRef]
  66. Qiu, S.; Zhu, Z.; He, B. Fmask 4.0: Improved Cloud and Cloud Shadow Detection in Landsats 4--8 and Sentinel-2 Imagery. Remote Sens. Environ. 2019, 231, 111205. [Google Scholar] [CrossRef]
  67. Richter, R.; Louis, J.; Berthelot, B. Sentinel-2 MSI – Level 2A Products Algorithm Theoretical Basis Document. Eur. Sp. Agency 2012, 49, 1–72. [Google Scholar]
  68. Hagolle, O.; Huc, M.; Pascual, D.V.; Dedieu, G. A Multi-Temporal Method for Cloud Detection, Applied to FORMOSAT-2, VENµS, LANDSAT and SENTINEL-2 Images. Remote Sens. Environ. 2010, 114, 1747–1755. [Google Scholar] [CrossRef]
  69. Hagolle, O.; Huc, M.; Villa Pascual, D.; Dedieu, G. A Multi-Temporal and Multi-Spectral Method to Estimate Aerosol Optical Thickness over Land, for the Atmospheric Correction of FormoSat-2, LandSat, VENμS and Sentinel-2 Images. Remote Sens. 2015, 7, 2668–2691. [Google Scholar] [CrossRef]
  70. Li, P.; Dong, L.; Xiao, H.; Xu, M. A Cloud Image Detection Method Based on SVM Vector Machine. Neurocomputing 2015, 169, 34–42. [Google Scholar] [CrossRef]
  71. Bai, T.; Li, D.; Sun, K.; Chen, Y.; Li, W. Cloud Detection for High-Resolution Satellite Imagery Using Machine Learning and Multi-Feature Fusion. Remote Sens. 2016, 8, 715–736. [Google Scholar] [CrossRef]
  72. Tan, K.; Zhang, Y.; Tong, X. Cloud Extraction from Chinese High Resolution Satellite Imagery by Probabilistic Latent Semantic Analysis and Object-Based Machine Learning. Remote Sens. 2016, 8, 963–988. [Google Scholar] [CrossRef]
  73. Shao, Z.; Deng, J.; Wang, L.; Fan, Y.; Sumari, N.; Cheng, Q. Fuzzy AutoEncode Based Cloud Detection for Remote Sensing Imagery. Remote Sens. 2017, 9, 311. [Google Scholar] [CrossRef]
  74. Pérez-Suay, A.; Amorós-López, J.; Gómez-Chova, L.; Laparra, V.; Muñoz-Marí, J.; Camps-Valls, G. Randomized Kernels for Large Scale Earth Observation Applications. Remote Sens. Environ. 2017, 202, 54–63. [Google Scholar] [CrossRef]
  75. Sun, X.; Yu, Q.; Li, Z. SVM-Based Cloud Detection Using Combined Texture Features. In Proceedings of the International Symposium of Space Optical Instrument and Application, Beijing, China, 5–7 September 2018; pp. 363–372. [Google Scholar]
  76. Ishida, H.; Oishi, Y.; Morita, K.; Moriwaki, K.; Nakajima, T.Y. Development of a Support Vector Machine Based Cloud Detection Method for MODIS with the Adjustability to Various Conditions. Remote Sens. Environ. 2018, 205, 390–407. [Google Scholar] [CrossRef]
  77. Pérez-Suay, A.; Amorós-López, J.; Gómez-Chova, L.; Muñoz-Mari, J.; Just, D.; Camps-Valls, G. Pattern Recognition Scheme for Large-Scale Cloud Detection over Landmarks. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 3977–3987. [Google Scholar] [CrossRef]
  78. Deng, C.; Li, Z.; Wang, W.; Wang, S.; Tang, L.; Bovik, A.C. Cloud Detection in Satellite Images Based on Natural Scene Statistics and Gabor Features. IEEE Geosci. Remote Sens. Lett. 2018, 16, 608–612. [Google Scholar] [CrossRef]
  79. Ghasemian, N.; Akhoondzadeh, M. Introducing Two Random Forest Based Methods for Cloud Detection in Remote Sensing Images. Adv. Sp. Res. 2018, 62, 288–303. [Google Scholar] [CrossRef]
  80. Fu, H.; Shen, Y.; Liu, J.; He, G.; Chen, J.; Liu, P.; Qian, J.; Li, J. Cloud Detection for FY Meteorology Satellite Based on Ensemble Thresholds and Random Forests Approach. Remote Sens. 2018, 11, 44. [Google Scholar] [CrossRef]
  81. Joshi, P.P.; Wynne, R.H.; Thomas, V.A. Cloud Detection Algorithm Using SVM with SWIR2 and Tasseled Cap Applied to Landsat 8. Int. J. Appl. Earth Obs. Geoinf. 2019, 82, 101898–101908. [Google Scholar] [CrossRef]
  82. Chen, X.; Liu, L.; Gao, Y.; Zhang, X.; Xie, S. A Novel Classification Extension-Based Cloud Detection Method for Medium-Resolution Optical Images. Remote Sens. 2020, 12, 2365. [Google Scholar] [CrossRef]
  83. Cilli, R.; Monaco, A.; Amoroso, N.; Tateo, A.; Tangaro, S.; Bellotti, R. Machine Learning for Cloud Detection of Globally Distributed Sentinel-2 Images. Remote Sens. 2020, 12, 2355. [Google Scholar] [CrossRef]
  84. Wei, J.; Huang, W.; Li, Z.; Sun, L.; Zhu, X.; Yuan, Q.; Liu, L.; Cribb, M. Cloud Detection for Landsat Imagery by Combining the Random Forest and Superpixels Extracted via Energy-Driven Sampling Segmentation Approaches. Remote Sens. Environ. 2020, 248, 112005–112019. [Google Scholar] [CrossRef]
  85. Ibrahim, E.; Jiang, J.; Lema, L.; Barnabé, P.; Giuliani, G.; Lacroix, P.; Pirard, E. Cloud and Cloud-Shadow Detection for Applications in Mapping Small-Scale Mining in Colombia Using Sentinel-2 Imagery. Remote Sens. 2021, 13, 736–756. [Google Scholar] [CrossRef]
  86. Li, J.; Wang, L.; Liu, S.; Peng, B.; Ye, H. An Automatic Cloud Detection Model for Sentinel-2 Imagery Based on Google Earth Engine. Remote Sens. Lett. 2022, 13, 196–206. [Google Scholar] [CrossRef]
  87. Yao, X.; Guo, Q.; Li, A.; Shi, L. Optical Remote Sensing Cloud Detection Based on Random Forest Only Using the Visible Light and Near-Infrared Image Bands. Eur. J. Remote Sens. 2022, 55, 150–167. [Google Scholar] [CrossRef]
  88. Singh, R.; Biswas, M.; Pal, M. Cloud Detection Using Sentinel 2 Imageries: A Comparison of XGBoost, RF, SVM, and CNN Algorithms. Geocarto Int. 2023, 38, 1–32. [Google Scholar] [CrossRef]
  89. Singh, R.; Biswas, M.; Pal, M. An Automated Cloud Detection Method for Sentinel-2 Imageries. In Proceedings of the 2023 IEEE India Geoscience and Remote Sensing Symposium (InGARSS), Bengaluru, India, 10–13 December 2023; IEEE: Piscataway, NJ, USA; pp. 1–4. [Google Scholar]
  90. Shang, H.; Letu, H.; Xu, R.; Wei, L.; Wu, L.; Shao, J.; Nagao, T.M.; Nakajima, T.Y.; Riedi, J.; He, J.; et al. A Hybrid Cloud Detection and Cloud Phase Classification Algorithm Using Classic Threshold-Based Tests and Extra Randomized Tree Model. Remote Sens. Environ. 2024, 302, 113957. [Google Scholar] [CrossRef]
  91. Liu, Z.; Yang, J.; Wang, W.; Shi, Z. Cloud Detection Methods for Remote Sensing Images: A Survey. Chin. Sp. Sci. Technol. 2023, 43, 1–17. [Google Scholar] [CrossRef]
  92. Caraballo-Vega, J.A.; Carroll, M.L.; Neigh, C.S.R.; Wooten, M.; Lee, B.; Weis, A.; Aronne, M.; Alemu, W.G.; Williams, Z. Optimizing WorldView-2,-3 Cloud Masking Using Machine Learning Approaches. Remote Sens. Environ. 2022, 284, 113332–113347. [Google Scholar] [CrossRef]
  93. Gawlikowski, J.; Ebel, P.; Schmitt, M.; Zhu, X.X. Explaining the Effects of Clouds on Remote Sensing Scene Classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 9976–9986. [Google Scholar] [CrossRef]
  94. AJ, G.C.; Jojy, C. A Survey of Cloud Detection Techniques For Satellite Images. Int. Res. J. Eng. Technol. 2015, 2, 2485–2490. [Google Scholar]
  95. Wang, Y.; Gu, L.; Li, X.; Gao, F.; Jiang, T. Coexisting Cloud and Snow Detection Based on a Hybrid Features Network Applied to Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5405515. [Google Scholar] [CrossRef]
  96. Miroszewski, A.; Mielczarek, J.; Szczepanek, F.; Czelusta, G.; Grabowski, B.; Saux, B.L.; Nalepa, J. Cloud Detection in Multispectral Satellite Images Using Support Vector Machines with Quantum Kernels. arXiv 2023, arXiv:2307.07281. [Google Scholar]
  97. Miroszewski, A.; Mielczarek, J.; Czelusta, G.; Szczepanek, F.; Grabowski, B.; Saux, B.L.; Nalepa, J. Detecting Clouds in Multispectral Satellite Images Using Quantum-Kernel Support Vector Machines. arXiv 2023, arXiv:2302.08270. [Google Scholar] [CrossRef]
  98. Bhagwat, R.U.; Shankar, B.U. A Novel Multilabel Classification of Remote Sensing Images Using XGBoost. In Proceedings of the IEEE 5th International Conference for Convergence in Technology, Pune, India, 29–31 March 2019; pp. 1–5. [Google Scholar]
  99. Zamani Joharestani, M.; Cao, C.; Ni, X.; Bashir, B.; Talebiesfandarani, S. PM2.5 Prediction Based on Random Forest, XGBoost, and Deep Learning Using Multisource Remote Sensing Data. Atmosphere 2019, 10, 373–392. [Google Scholar] [CrossRef]
  100. Rumora, L.; Miler, M.; Medak, D. Impact of Various Atmospheric Corrections on Sentinel-2 Land Cover Classification Accuracy Using Machine Learning Classifiers. ISPRS Int. J. Geo-Inf. 2020, 9, 277–300. [Google Scholar] [CrossRef]
  101. Jeppesen, J.H.; Jacobsen, R.H.; Inceoglu, F.; Toftegaard, T.S. A Cloud Detection Algorithm for Satellite Imagery Based on Deep Learning. Remote Sens. Environ. 2019, 229, 247–259. [Google Scholar] [CrossRef]
  102. Xu, K.; Guan, K.; Peng, J.; Luo, Y.; Wang, S. DeepMask: An Algorithm for Cloud and Cloud Shadow Detection in Optical Satellite Remote Sensing Images Using Deep Residual Network. arXiv 2019, arXiv:1911.03607. [Google Scholar]
  103. Yang, J.; Guo, J.; Yue, H.; Liu, Z.; Hu, H.; Li, K. CDnet: CNN-Based Cloud Detection for Remote Sensing Imagery. IEEE Trans. Geosci. Remote Sens. 2019, 57, 6195–6211. [Google Scholar] [CrossRef]
  104. Shendryk, Y.; Rist, Y.; Ticehurst, C.; Thorburn, P. Deep Learning for Multi-Modal Classification of Cloud, Shadow and Land Cover Scenes in PlanetScope and Sentinel-2 Imagery. ISPRS J. Photogramm. Remote Sens. 2019, 157, 124–136. [Google Scholar] [CrossRef]
  105. Liu, C.-C.; Zhang, Y.-C.; Chen, P.-Y.; Lai, C.-C.; Chen, Y.-H.; Cheng, J.-H.; Ko, M.-H. Clouds Classification from Sentinel-2 Imagery with Deep Residual Learning and Semantic Image Segmentation. Remote Sens. 2019, 11, 119. [Google Scholar] [CrossRef]
  106. Kanu, S.; Khoja, R.; Lal, S.; Raghavendra, B.S.; CS, A. CloudX-Net: A Robust Encoder-Decoder Architecture for Cloud Detection from Satellite Remote Sensing Images. Remote Sens. Appl. Soc. Environ. 2020, 20, 100417. [Google Scholar] [CrossRef]
  107. Segal-Rozenhaimer, M.; Li, A.; Das, K.; Chirayath, V. Cloud Detection Algorithm for Multi-Modal Satellite Imagery Using Convolutional Neural-Networks (CNN). Remote Sens. Environ. 2020, 237, 111446–111463. [Google Scholar] [CrossRef]
  108. Guo, J.; Yang, J.; Yue, H.; Tan, H.; Hou, C.; Li, K. CDnetV2: CNN-Based Cloud Detection for Remote Sensing Imagery with Cloud-Snow Coexistence. IEEE Trans. Geosci. Remote Sens. 2021, 59, 700–713. [Google Scholar] [CrossRef]
  109. Zhang, J.; Wang, Y.; Wang, H.; Wu, J.; Li, Y. CNN Cloud Detection Algorithm Based on Channel and Spatial Attention and Probabilistic Upsampling for Remote Sensing Image. IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–13. [Google Scholar] [CrossRef]
  110. Chen, Y.; Tang, L.; Kan, Z.; Latif, A.; Yang, X.; Bilal, M.; Li, Q. Cloud and Cloud Shadow Detection Based on Multiscale 3D-CNN for High Resolution Multispectral Imagery. IEEE Access 2020, 8, 16505–16516. [Google Scholar] [CrossRef]
  111. Kristollari, V.; Karathanassi, V. Convolutional Neural Networks for Detecting Challenging Cases in Cloud Masking Using Sentinel-2 Imagery. In Proceedings of the Eighth International Conference on Remote Sensing and Geoinformation of the Environment (RSCy2020), Paphos, Cyprus, 16–18 March 2020; Themistocleous, K., Michaelides, S., Ambrosia, V., Hadjimitsis, D.G., Papadavid, G., Eds.; SPIE: Pune, India, 2020; p. 53. [Google Scholar]
  112. Luotamo, M.; Metsamaki, S.; Klami, A. Multiscale Cloud Detection in Remote Sensing Images Using a Dual Convolutional Neural Network. IEEE Trans. Geosci. Remote Sens. 2021, 59, 4972–4983. [Google Scholar] [CrossRef]
  113. Ma, N.; Sun, L.; Zhou, C.; He, Y. Cloud Detection Algorithm for Multi-Satellite Remote Sensing Imagery Based on a Spectral Library and 1D Convolutional Neural Network. Remote Sens. 2021, 13, 3319–3339. [Google Scholar] [CrossRef]
  114. Jiao, L.; Gao, W. Refined UNet Lite: End-to-End Lightweight Network for Edge-Precise Cloud Detection. Procedia Comput. Sci. 2022, 202, 9–14. [Google Scholar] [CrossRef]
  115. Li, X.; Yang, X.; Li, X.; Lu, S.; Ye, Y.; Ban, Y. GCDB-UNet: A Novel Robust Cloud Detection Approach for Remote Sensing Images. Knowl. Based Syst. 2022, 238, 107890–107902. [Google Scholar] [CrossRef]
  116. Kumthekar, A.; Reddy, G.R. An Integrated Deep Learning Framework of U-Net and Inception Module for Cloud Detection of Remote Sensing Images. Arab. J. Geosci. 2021, 14, 1–13. [Google Scholar] [CrossRef]
  117. Yin, M.; Wang, P.; Hao, W.; Ni, C. Cloud Detection of High-Resolution Remote Sensing Image Based on Improved U-Net. Multimed. Tools Appl. 2023, 82, 25271–25288. [Google Scholar] [CrossRef]
  118. López-Puigdollers, D.; Mateo-Garcia, G.; Gómez-Chova, L. Benchmarking Deep Learning Models for Cloud Detection in Landsat-8 and Sentinel-2 Images. Remote Sens. 2021, 13, 992. [Google Scholar] [CrossRef]
  119. Grabowski, B.; Ziaja, M.; Kawulok, M.; Cwiek, M.; Lakota, T.; Longepe, N.; Nalepa, J. Are Cloud Detection U-Nets Robust Against in-Orbit Image Acquisition Conditions? In Proceedings of the IGARSS 2022–2022 IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 17–22 July 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 239–242. [Google Scholar]
  120. Francis, A.; Sidiropoulos, P.; Muller, J.-P. CloudFCN: Accurate and Robust Cloud Detection for Satellite Imagery with Deep Learning. Remote Sens. 2019, 11, 2312–2334. [Google Scholar] [CrossRef]
  121. Yang, X.; Gou, T.; Lv, Z.; Li, L.; Jin, H. Weakly-Supervised Cloud Detection and Effective Cloud Removal for Remote Sensing Images. J. Vis. Commun. Image Represent. 2023, 98, 104006. [Google Scholar] [CrossRef]
  122. Ma, X.; Huang, Y.; Zhang, X.; Pun, M.-O.; Huang, B. Cloud-EGAN: Rethinking CycleGAN From a Feature Enhancement Perspective for Cloud Removal by Combining CNN and Transformer. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 4999–5012. [Google Scholar] [CrossRef]
  123. Pang, S.; Sun, L.; Tian, Y.; Ma, Y.; Wei, J. Convolutional Neural Network-Driven Improvements in Global Cloud Detection for Landsat 8 and Transfer Learning on Sentinel-2 Imagery. Remote Sens. 2023, 15, 1706. [Google Scholar] [CrossRef]
  124. Li, J.; Wu, Z.; Sheng, Q.; Wang, B.; Hu, Z.; Zheng, S.; Camps-Valls, G.; Molinier, M. A Hybrid Generative Adversarial Network for Weakly-Supervised Cloud Detection in Multispectral Images. Remote Sens. Environ. 2022, 280, 113197. [Google Scholar] [CrossRef] [PubMed]
  125. Yang, J.; Li, W.; Chen, K.; Liu, Z.; Shi, Z.; Zou, Z. Weakly Supervised Adversarial Training for Remote Sensing Image Cloud and Snow Detection. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 15206–15221. [Google Scholar] [CrossRef]
  126. Shi, C.; Zhou, Y.; Qiu, B.; Guo, D.; Li, M. CloudU-Net: A Deep Convolutional Neural Network Architecture for Daytime and Nighttime Cloud Images’ Segmentation. IEEE Geosci. Remote Sens. Lett. 2021, 18, 1688–1692. [Google Scholar] [CrossRef]
  127. Shi, C.; Zhou, Y.; Qiu, B. CloudU-Netv2: A Cloud Segmentation Method for Ground-Based Cloud Images Based on Deep Learning. Neural Process. Lett. 2021, 53, 2715–2728. [Google Scholar] [CrossRef]
  128. Peng, L.; Chen, X.; Chen, J.; Zhao, W.; Cao, X. Understanding the Role of Receptive Field of Convolutional Neural Network for Cloud Detection in Landsat 8 OLI Imagery. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5407317. [Google Scholar] [CrossRef]
  129. Zhao, C.; Zhang, X.; Kuang, N.; Luo, H.; Zhong, S.; Fan, J. Boundary-Aware Bilateral Fusion Network for Cloud Detection. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–14. [Google Scholar] [CrossRef]
  130. Wu, K.; Xu, Z.; Lyu, X.; Ren, P. Cloud Detection with Boundary Nets. ISPRS J. Photogramm. Remote Sens. 2022, 186, 218–231. [Google Scholar] [CrossRef]
  131. Liu, Y.; Wang, W.; Li, Q.; Min, M.; Yao, Z. DCNet: A Deformable Convolutional Cloud Detection Network for Remote Sensing Imagery. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5. [Google Scholar] [CrossRef]
  132. He, Q.; Sun, X.; Yan, Z.; Fu, K. DABNet: Deformable Contextual and Boundary-Weighted Network for Cloud Detection in Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–16. [Google Scholar] [CrossRef]
  133. De Souza, A.; Shokri, P. Residual U-Net with Attention for Detecting Clouds in Satellite Imagery. 2023; pp. 1–17. Available online: https://eartharxiv.org/repository/view/4910/ (accessed on 14 December 2021).
  134. Zhao, C.; Zhang, X.; Luo, H.; Zhong, S.; Tang, L.; Peng, J.; Fan, J. Detail-Aware Multiscale Context Fusion Network for Cloud Detection. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5. [Google Scholar] [CrossRef]
  135. Yao, X.; Guo, Q.; Li, A. Light-Weight Cloud Detection Network for Optical Remote Sensing Images with Attention-Based Deeplabv3+ Architecture. Remote Sens. 2021, 13, 3617–3640. [Google Scholar] [CrossRef]
  136. Chen, Y.; Tang, L.; Huang, W.; Guo, J.; Yang, G. A Novel Spectral Indices-Driven Spectral-Spatial-Context Attention Network for Automatic Cloud Detection. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 3092–3103. [Google Scholar] [CrossRef]
  137. Luo, C.; Feng, S.; Yang, X.; Ye, Y.; Li, X.; Zhang, B.; Chen, Z.; Quan, Y. LWCDnet: A Lightweight Network for Efficient Cloud Detection in Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–16. [Google Scholar] [CrossRef]
  138. Zheng, Y.; Ling, W.; Shifei, T. A Lightweight Network for Remote Sensing Image Cloud Detection. In Proceedings of the 4th International Conference on Power, Intelligent Computing and Systems, Shenyang, China, 29–31 July 2022; pp. 644–649. [Google Scholar]
  139. Li, X.; Chen, S.; Wu, J.; Li, J.; Wang, T.; Tang, J.; Hu, T.; Wu, W. Satellite Cloud Image Segmentation Based on Lightweight Convolutional Neural Network. PLoS ONE 2023, 18, e0280408. [Google Scholar] [CrossRef]
  140. Zhang, G.; Gao, X.; Yang, J.; Yang, Y.; Tan, M.; Xu, J.; Wang, Y. A Multi-Task Driven and Reconfigurable Network for Cloud Detection in Cloud-Snow Coexistence Regions from Very-High-Resolution Remote Sensing Images. Int. J. Appl. Earth Obs. Geoinf. 2022, 114, 103070–103086. [Google Scholar] [CrossRef]
  141. Li, X.; Ye, H.; Qiu, S. Cloud Contaminated Multispectral Remote Sensing Image Enhancement Algorithm Based on MobileNet. Remote Sens. 2022, 14, 4815–4842. [Google Scholar] [CrossRef]
  142. Bowen, Z.; Jianlin, Z.; Xiaoxing, F.; Yaxing, S. Cloud Detection in Cloud-Snow Co-Occurrence Remote Sensing Images Based on Convolutional Neural Network. In Proceedings of the 6th International Conference on Big Data Technologies, Qingdao, China, 22–24 September 2023; ACM: New York, NY, USA, 2023; Volume 52, pp. 396–402. [Google Scholar]
  143. Grabowski, B.; Ziaja, M.; Kawulok, M.; Bosowski, P.; Longépé, N.; Saux, B.L.; Nalepa, J. Squeezing NnU-Nets with Knowledge Distillation for On-Board Cloud Detection. arXiv 2023, arXiv:2306.09886. [Google Scholar]
  144. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S. An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale. Comput. Vis. Pattern Recognit. 2020. [Google Scholar] [CrossRef]
  145. Han, S.; Wang, J.; Zhang, S. Former-CR: A Transformer-Based Thick Cloud Removal Method with Optical and SAR Imagery. Remote Sens. 2023, 15, 1196–1218. [Google Scholar] [CrossRef]
  146. Du, X.; Wu, H. Feature-Aware Aggregation Network for Remote Sensing Image Cloud Detection. Int. J. Remote Sens. 2023, 44, 1872–1899. [Google Scholar] [CrossRef]
  147. Cao, Y.; Sui, B.; Zhang, S.; Qin, H. Cloud Detection From High-Resolution Remote Sensing Images Based on Convolutional Neural Networks with Geographic Features and Contextual Information. IEEE Geosci. Remote Sens. Lett. 2023, 20, 1–5. [Google Scholar] [CrossRef]
  148. Qian, J.; Ci, J.; Tan, H.; Xu, W.; Jiao, Y.; Chen, P. Cloud Detection Method Based on Improved DeeplabV3+ Remote Sensing Image. IEEE Access 2024, 12, 9229–9242. [Google Scholar] [CrossRef]
  149. Zhang, B.; Zhang, Y.; Li, Y.; Wan, Y.; Yao, Y. CloudViT: A Lightweight Vision Transformer Network for Remote Sensing Cloud Detection. IEEE Geosci. Remote Sens. Lett. 2023, 20, 1–5. [Google Scholar] [CrossRef]
  150. Singh, R.; Biswas, M.; Pal, M. A Transformer-Based Cloud Detection Approach Using Sentinel 2 Imageries. Int. J. Remote Sens. 2023, 44, 3194–3208. [Google Scholar] [CrossRef]
  151. Francis, A. Sensor Independent Cloud and Shadow Masking with Partial Labels and Multimodal Inputs. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–18. [Google Scholar] [CrossRef]
  152. Jiao, W.; Zhang, Y.; Zhang, B.; Wan, Y. SCTrans: A Transformer Network Based on the Spatial and Channel Attention for Cloud Detection. In Proceedings of the IGARSS International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 17–22 July 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 615–618. [Google Scholar]
  153. Ge, W.; Yang, X.; Jiang, R.; Shao, W.; Zhang, L. CD-CTFM: A Lightweight CNN-Transformer Network for Remote Sensing Cloud Detection Fusing Multiscale Features. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 4538–4551. [Google Scholar] [CrossRef]
  154. Tolstikhin, I.O.; Houlsby, N.; Kolesnikov, A.; Beyer, L.; Zhai, X.; Unterthiner, T.; Yung, J.; Steiner, A.; Keysers, D.; Uszkoreit, J.; et al. Mlp-Mixer: An All-Mlp Architecture for Vision. Adv. Neural Inf. Process. Syst. 2021, 34, 24261–24272. [Google Scholar]
  155. Rao, Y.; Zhao, W.; Zhu, Z.; Lu, J.; Zhou, J. Global Filter Networks for Image Classification. Adv. Neural Inf. Process. Syst. 2021, 34, 980–993. [Google Scholar]
  156. Singh, R.; Biswas, M.; Pal, M. Enhanced Cloud Detection in Sentinel-2 Imagery Using K-Means Clustering Embedded Transformer-Inspired Models. J. Appl. Remote Sens. 2024, 18, 034516. [Google Scholar] [CrossRef]
  157. Gupta, R.; Nanda, S.J. Cloud Detection in Satellite Images with Classical and Deep Neural Network Approach: A Review. Multimed. Tools Appl. 2022, 81, 31847–31880. [Google Scholar] [CrossRef]
  158. Wright, N.; Duncan, J.M.A.; Callow, J.N.; Thompson, S.E.; George, R.J. CloudS2Mask: A Novel Deep Learning Approach for Improved Cloud and Cloud Shadow Masking in Sentinel-2 Imagery. Remote Sens. Environ. 2024, 306, 114122. [Google Scholar] [CrossRef]
  159. Gbodjo, Y.J.E.; Hughes, L.H.; Molinier, M.; Tuia, D.; Li, J. Self-Supervised Representation Learning for Cloud Detection Using Sentinel-2 Images. 2025.
  160. Ji, T.-Y.; Chu, D.; Zhao, X.-L.; Hong, D. A Unified Framework of Cloud Detection and Removal Based on Low-Rank and Group Sparse Regularizations for Multitemporal Multispectral Images. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–15. [Google Scholar] [CrossRef]
  161. Zhou, J.; Luo, X.; Rong, W.; Xu, H. Cloud Removal for Optical Remote Sensing Imagery Using Distortion Coding Network Combined with Compound Loss Functions. Remote Sens. 2022, 14, 3452. [Google Scholar] [CrossRef]
  162. Zheng, W.-J.; Zhao, X.-L.; Zheng, Y.-B.; Lin, J.; Zhuang, L.; Huang, T.-Z. Spatial-Spectral-Temporal Connective Tensor Network Decomposition for Thick Cloud Removal. ISPRS J. Photogramm. Remote Sens. 2023, 199, 182–194. [Google Scholar] [CrossRef]
  163. Li, Z.; Shen, H.; Cheng, Q.; Li, W.; Zhang, L. Thick Cloud Removal in High-Resolution Satellite Images Using Stepwise Radiometric Adjustment and Residual Correction. Remote Sens. 2019, 11, 1925–1944. [Google Scholar] [CrossRef]
  164. Dai, J.; Shi, N.; Zhang, T.; Xu, W. TCME: Thin Cloud Removal Network for Optical Remote Sensing Images Based on Multidimensional Features Enhancement. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–16. [Google Scholar] [CrossRef]
  165. Zheng, J.; Liu, X.-Y.; Wang, X. Single Image Cloud Removal Using U-Net and Generative Adversarial Networks. IEEE Trans. Geosci. Remote Sens. 2020, 59, 6371–6385. [Google Scholar] [CrossRef]
  166. Liu, H.; Huang, B.; Cai, J. Thick Cloud Removal Under Land Cover Changes Using Multisource Satellite Imagery and a Spatiotemporal Attention Network. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–18. [Google Scholar] [CrossRef]
  167. Ma, D.; Wu, R.; Xiao, D.; Sui, B. Cloud Removal from Satellite Images Using a Deep Learning Model with the Cloud-Matting Method. Remote Sens. 2023, 15, 904–921. [Google Scholar] [CrossRef]
  168. Xiong, Q.; Li, G.; Yao, X.; Zhang, X. SAR-to-Optical Image Translation and Cloud Removal Based on Conditional Generative Adversarial Networks: Literature Survey, Taxonomy, Evaluation Indicators, Limits and Future Directions. Remote Sens. 2023, 15, 1137–1157. [Google Scholar] [CrossRef]
  169. Li, J.; Wu, Z.; Hu, Z.; Zhang, J.; Li, M.; Mo, L.; Molinier, M. Thin Cloud Removal in Optical Remote Sensing Images Based on Generative Adversarial Networks and Physical Model of Cloud Distortion. ISPRS J. Photogramm. Remote Sens. 2020, 166, 373–389. [Google Scholar] [CrossRef]
  170. Chen, Y.; Tang, L.; Yang, X.; Fan, R.; Bilal, M.; Li, Q. Thick Clouds Removal from Multitemporal ZY-3 Satellite Images Using Deep Learning. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2019, 13, 143–153. [Google Scholar] [CrossRef]
  171. Zhu, S.; Li, Z.; Shen, H.; Lin, D. A Fast Two-Step Algorithm for Large-Area Thick Cloud Removal in High-Resolution Images. Remote Sens. Lett. 2022, 14, 1–9. [Google Scholar] [CrossRef]
  172. Hu, G.; Sun, X.; Liang, D.; Sun, Y. Cloud Removal of Remote Sensing Image Based on Multi-Output Support Vector Regression. J. Syst. Eng. Electron. 2014, 25, 1082–1088. [Google Scholar] [CrossRef]
  173. Li, X.; Feng, R.; Guan, X.; Shen, H.; Zhang, L. Remote Sensing Image Mosaicking: Achievements and Challenges. IEEE Geosci. Remote Sens. Mag. 2019, 7, 8–22. [Google Scholar] [CrossRef]
  174. Ebel, P.; Xu, Y.; Schmitt, M.; Zhu, X.X. SEN12MS-CR-TS: A Remote-Sensing Data Set for Multimodal Multitemporal Cloud Removal. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–14. [Google Scholar] [CrossRef]
  175. Zhang, Q.; Yuan, Q.; Li, Z.; Sun, F.; Zhang, L. Combined Deep Prior with Low-Rank Tensor SVD for Thick Cloud Removal in Multitemporal Images. ISPRS J. Photogramm. Remote Sens. 2021, 177, 161–173. [Google Scholar] [CrossRef]
  176. Xu, Z.; Wu, K.; Wang, W.; Lyu, X.; Ren, P. Semi-Supervised Thin Cloud Removal with Mutually Beneficial Guides. ISPRS J. Photogramm. Remote Sens. 2022, 192, 327–343. [Google Scholar] [CrossRef]
  177. Wu, R.; Liu, G.; Lv, J.; Fu, Y.; Bao, X.; Shama, A.; Cai, J.; Sui, B.; Wang, X.; Zhang, R. An Innovative Approach for Effective Removal of Thin Clouds in Optical Images Using Convolutional Matting Model. Remote Sens. 2023, 15, 2119–2143. [Google Scholar] [CrossRef]
  178. Shao, Z.; Pan, Y.; Diao, C.; Cai, J. Cloud Detection in Remote Sensing Images Based on Multiscale Features-Convolutional Neural Network. IEEE Trans. Geosci. Remote Sens. 2019, 57, 4062–4076. [Google Scholar] [CrossRef]
  179. Tu, Z.; Guo, L.; Pan, H.; Lu, J.; Xu, C.; Zou, Y. Multitemporal Image Cloud Removal Using Group Sparsity and Nonconvex Low-Rank Approximation. J. Nonlinear Var. Anal. 2023, 7, 527–548. [Google Scholar] [CrossRef]
  180. Li, Y.; Wei, F.; Zhang, Y.; Chen, W.; Ma, J. HS2P: Hierarchical Spectral and Structure-Preserving Fusion Network for Multimodal Remote Sensing Image Cloud and Shadow Removal. Inf. Fusion 2023, 94, 215–228. [Google Scholar] [CrossRef]
  181. Theissler, A.; Thomas, M.; Burch, M.; Gerschner, F. ConfusionVis: Comparative Evaluation and Selection of Multi-Class Classifiers Based on Confusion Matrices. Knowl. Based Syst. 2022, 247, 108651. [Google Scholar] [CrossRef]
Figure 1. True-color satellite image of Sentinel-2 without the presence of clouds: (a) Level-1 corrected TOA reflectance, (b) Level-2 corrected BOA reflectance, and with clouds: (c) Level-1 corrected TOA reflectance, (d) Level-2 corrected BOA reflectance. Here, Level-2 correction is done by Sen2Cor.
Figure 2. Impact of cloud on satellite image acquisition process.
Figure 3. True-color satellite imagery: (a) with thick clouds and cloud shadows; (b) with thin clouds.
Figure 4. Sample of satellite imagery with different cloud masks: (a) True-color image composite, binary cloud mask: (b) cloud-only and (c) cloud-contaminated area, (d) color-coded mask where Red: thick cloud, Yellow: thin cloud, Black: cloud-shadow, and Green: ground.
Figure 5. (a) Confusion Matrix for binary cloud detection, (b) Confusion Matrix for multiclass cloud detection, (c,d) Confusion matrices for UA and PA of cloud and cloud shadow, respectively.
Table 1. Comparison of Spectral Band Information of Landsat (7–8) and Sentinel-2.

| Band Name | Landsat-7 Band (Resolution) | Landsat-7 Wavelength (µm) | Landsat-8 Band (Resolution) | Landsat-8 Wavelength (µm) | Sentinel-2 Band (Resolution) | Sentinel-2 Wavelength (µm) |
|---|---|---|---|---|---|---|
| Coastal | – | – | Band 1 (30 m) | 0.435–0.451 | Band 1 (60 m) | 0.433–0.453 |
| Blue | Band 1 (30 m) | 0.441–0.514 | Band 2 (30 m) | 0.452–0.512 | Band 2 (10 m) | 0.458–0.523 |
| Green | Band 2 (30 m) | 0.519–0.601 | Band 3 (30 m) | 0.533–0.590 | Band 3 (10 m) | 0.543–0.578 |
| Red | Band 3 (30 m) | 0.631–0.692 | Band 4 (30 m) | 0.636–0.673 | Band 4 (10 m) | 0.650–0.680 |
| Red Edge 1 | – | – | – | – | Band 5 (20 m) | 0.698–0.713 |
| Red Edge 2 | – | – | – | – | Band 6 (20 m) | 0.733–0.748 |
| Red Edge 3 | – | – | – | – | Band 7 (20 m) | 0.765–0.785 |
| Wide NIR | – | – | – | – | Band 8 (10 m) | 0.785–0.900 |
| Narrow NIR | Band 4 (30 m) | 0.772–0.898 | Band 5 (30 m) | 0.851–0.879 | Band 8A (20 m) | 0.855–0.875 |
| Water Vapor | – | – | – | – | Band 9 (60 m) | 0.930–0.950 |
| Cirrus | – | – | Band 9 (30 m) | 1.363–1.384 | Band 10 (60 m) | 1.365–1.385 |
| SWIR1 | Band 5 (30 m) | 1.547–1.749 | Band 6 (30 m) | 1.566–1.651 | Band 11 (20 m) | 1.565–1.655 |
| SWIR2 | Band 7 (30 m) | 2.064–2.345 | Band 7 (30 m) | 2.107–2.294 | Band 12 (20 m) | 2.100–2.280 |
| Panchromatic | Band 8 (15 m) | 0.515–0.896 | Band 8 (15 m) | 0.503–0.676 | – | – |
| TIR-1 | Band 6 (60 m) | 10.31–12.36 | Band 10 (100 m) | 10.60–11.19 | – | – |
| TIR-2 | – | – | Band 11 (100 m) | 11.50–12.51 | – | – |
Table 2. Description of the Baetens-Hagolle Sentinel-2 dataset.

| Location | Tile ID | Sentinel | Acquisition Date | Scene Information |
|---|---|---|---|---|
| Railroad Valley, USA | T11SPC | S2A | 5 January 2017 | Small cumulus over bright soil |
| Railroad Valley, USA | T11SPC | S2B | 27 August 2017 | Large cumulus over bright soil |
| Alta Floresta, Brazil | T21LWK | S2A | 5 May 2018 | Scattered small cumulus |
| Alta Floresta, Brazil | T21LWK | S2B | 9 June 2018 | Thin cirrus |
| Marrakech, Morocco | T29RPQ | S2A | 17 April 2016 | Scattered cumulus and thin cirrus |
| Marrakech, Morocco | T29RPQ | S2A | 21 June 2017 | Clear image with snow and thin cirrus |
| Arles, France | T31TFJ | S2A | 17 September 2017 | Large cloud cover |
| Arles, France | T31TFJ | S2B | 2 October 2017 | Thick and thin clouds |
| Orleans, France | T31UDP | S2A | 16 May 2017 | Thick and thin cirrus clouds |
| Orleans, France | T31UDP | S2B | 19 August 2017 | Large mid-altitude cloud cover |
| Ispra, Italy | T32TMR | S2A | 15 August 2017 | Clouds over mountains with snow |
| Ispra, Italy | T32TMR | S2B | 9 October 2017 | Clouds over mountains with snow and bright soil |
| Gobabeb, Namibia | T33KWP | S2A | 21 December 2016 | Thick clouds above the desert |
| Gobabeb, Namibia | T33KWP | S2B | 9 September 2017 | Small and low clouds |
| Mongu, Zambia | T34LGJ | S2A | 12 November 2016 | Large thick cloud cover and some cirrus |
| Mongu, Zambia | T34LGJ | S2B | 4 August 2017 | Clear image and a few mid-altitude clouds |
| Pretoria, South Africa | T35JPM | S2A | 13 March 2017 | Diverse cloud types |
| Pretoria, South Africa | T35JPM | S2A | 20 August 2017 | Scattered small clouds |
| Railroad Valley, USA | T11SPC | S2B | 13 February 2018 | Large stratus and some cumulus |
| Alta Floresta, Brazil | T21LWK | S2A | 14 July 2018 | Mid-altitude small clouds |
| Alta Floresta, Brazil | T21LWK | S2A | 13 August 2018 | Thin cirrus |
| Marrakech, Morocco | T29RPQ | S2A | 18 December 2017 | Scattered cumulus and snow |
| Arles, France | T31TFJ | S2B | 21 December 2017 | Mid-altitude thick clouds and snow |
| Orleans, France | T31UDP | S2B | 18 February 2018 | Stratus cloud |
| Ispra, Italy | T32TMR | S2B | 11 November 2017 | Clouds over mountains and mist |
| Gobabeb, Namibia | T33KWP | S2B | 9 February 2018 | High and thin clouds |
| Mongu, Zambia | T34LGJ | S2B | 13 October 2017 | Large thin cirrus cover |
| Pretoria, South Africa | T35JPM | S2B | 13 December 2017 | Altostratus and small scattered clouds |
| Munich, Germany | T32UPU | S2A | 22 April 2018 | Mostly cloud-free with a few small clouds |
| Munich, Germany | T32UPU | S2B | 24 April 2018 | Large cloud cover with cumulus and cirrus |
Table 3. Manual Cloud Mask Description of the Baetens-Hagolle Sentinel-2 dataset.

| Label | Class Name |
|---|---|
| 0 | No Fill |
| 1 | No Data |
| 2 | Low Cloud |
| 3 | High Cloud |
| 4 | Cloud Shadow |
| 5 | Ground |
| 6 | Water |
| 7 | Snow |
Table 4. Description of the WHUS2-CD dataset.

| Location | Tile ID | Acquisition Date | Scene Information |
|---|---|---|---|
| Yiwu, China | T46TFN | 14 July 2019 | Cloud over snow cover and barren |
| Henan, China | T47SQU | 19 December 2019 | Cloud over barren having snow cover |
| Washixia, China | T45SWC | 30 June 2019 | Scattered cloud over the mountain with snow |
| Tibet | T46RGV | 15 December 2019 | Cloud over snow |
| Tibet | T45SXR | 30 September 2018 | Cloud over barren and water |
| Taila, China | T51TWM | 17 March 2020 | Cloud over ice, snow, and barren |
| Bachu, China | T44TKK | 16 August 2018 | Cloud over barren and clear farmland |
| Wentugaole, China | T47TQF | 23 October 2019 | Large cloud cover |
| Shangyi, China | T50TKL | 24 August 2018 | Cloud over forest and farmland |
| Yongding, China | T50RMN | 18 November 2019 | Cloud cover with forest and green area |
| Songyang, China | T50RQS | 16 September 2019 | Large cloud cover over greenery |
| Mengzhou, China | T49SFU | 19 August 2019 | Thin and thick clouds over urban |
| Koldeneng, China | T44TPN | 15 August 2019 | Large cloud cover |
| Qingyuan, China | T51TXG | 10 April 2020 | Scattered cloud over dryland |
| Junhe, China | T50TQQ | 2 October 2019 | Scattered thick and thin cloud |
| Luochuan, China | T49SCV | 29 April 2018 | A few small clouds over the forest |
| Guyuan, China | T51UWS | 6 May 2020 | A few scattered clouds over barren |
| Yanyuan, China | T47RQL | 25 March 2020 | Scattered clouds with barren and forest |
| Rongjiang, China | T49RBJ | 28 September 2019 | Scattered clouds over the forest |
| Dazhou, China | T48RYV | 27 August 2018 | A few small clouds over greenery and urban |
| Yangchun, China | T49QEE | 22 February 2020 | A few scattered clouds over the forest |
| Qianjiang, China | T49RFP | 22 July 2018 | A few scattered clouds over the forest with urban |
| Pingxiang, China | T49RGL | 29 July 2018 | Clouds over the forest with urban |
| Wulian, China | T50SPE | 6 May 2020 | Scattered thick and thin clouds over diverse region |
| Zhanjiang, China | T49QDD | 30 September 2018 | Scattered small clouds over the coastal region |
| Changzhou & Wuxi, China | T51STR | 5 November 2019 | Scattered small clouds over shrubland |
| Linshui, China | T48RXU | 12 August 2019 | Scattered small thick clouds |
| Yilan, China | T52TES | 2 June 2019 | Thick cloud over barren and forest |
| Baotou, China | T49TCF | 28 March 2019 | Thick clouds over barren |
| Altay, China | T45TXN | 2 October 2019 | Scattered cloud with snow cover |
| Tibet | T46SFC | 16 April 2020 | Cloud over snow and bright mountain |
| Tibet | T44SPC | 28 May 2020 | Cloud over the mountain with small snow cover |
Table 5. Reference Cloud Mask Description of KappaSet.

| Label | Class Name |
|---|---|
| 0 | Undefined (labeler not sure) |
| 1 | Clear |
| 2 | Cloud Shadow |
| 3 | Semi-transparent Cloud |
| 4 | Cloud |
| 5 | Missing (No Data or Fill) |
Table 6. Brief Description of KappaSet.

| Split | S. No | Month | Total Tiles | Total Imagery | Total Sub-Tiles |
|---|---|---|---|---|---|
| Train | 1 | January | 60 | 60 | 290 |
| Train | 2 | February | 88 | 88 | 476 |
| Train | 3 | March | 84 | 86 | 698 |
| Train | 4 | April | 72 | 73 | 527 |
| Train | 5 | May | 121 | 126 | 2271 |
| Train | 6 | June | 115 | 117 | 745 |
| Train | 7 | July | 95 | 99 | 1172 |
| Train | 8 | August | 89 | 90 | 1066 |
| Train | 9 | September | 64 | 64 | 273 |
| Train | 10 | October | 87 | 88 | 556 |
| Train | 11 | November | 61 | 61 | 358 |
| Train | 12 | December | 3 | 3 | 16 |
| Test | 13 | All Months | 119 | 124 | 803 |
Table 7. Description of the IndiaS2 dataset.

| Split | Location | Tile ID | Acquisition Date | Scene Description |
|---|---|---|---|---|
| Train | Bhavnagar, Gujarat | T42QZJ | 28 March 2022 | Clear coastal area with urban bright objects |
| Train | Jodhpur, Rajasthan | T43RCK | 16 June 2022 | Thick clouds with bright urban and desert objects |
| Train | Hanumangarh, Rajasthan | T43RDN | 26 June 2022 | Mostly clear desert and urban area |
| Train | Srinagar, Jammu & Kashmir | T43SDT | 4 October 2022 | Thick and thin clouds over the mountain with snow patches |
| Train | Srikakulam, Andhra Pradesh | T44QRF | 28 November 2022 | Sparsely cloudy sea with clear coastal area |
| Train | Tinsukia, Assam | T46RGR | 29 November 2022 | Thick and thin cloud over forest and clear dried riverbed |
| Test | Thiruvananthapuram, Kerala | T43PFK | 16 August 2022 | Thick and thin clouds scattered over land and sea |
| Test | Bathinda, Punjab | T43RDP | 23 November 2022 | Thin cloud over cultivated farmland |
| Test | Shimla, Himachal Pradesh | T43RFQ | 20 November 2022 | Thick clouds over mountain with little snow |
| Test | Pithora, Chhattisgarh | T44QPJ | 21 December 2022 | Sparsely distributed near-to-invisible thin clouds |
| Test | Haldwani, Uttarakhand | T44RLT | 28 October 2022 | Thick clouds over forest and mountain ranges |
Table 8. Reference Cloud Mask Description of the IndiaS2 dataset.

| Label | Class Name |
|---|---|
| 0 | No Fill |
| 1 | No Data |
| 2 | Thick Cloud |
| 3 | Thin Cloud |
| 4 | Cloud Shadow |
| 5 | Ground |
Table 9. Overview of ML/DL Cloud Detection Methods and Evaluation Parameters.

| S. No | Author (Year) | Paper Title | Technique(s) | Satellite Sensor/Instrument (Dataset(s)) | Cloud Mask Type | Evaluation Parameter(s) | Main Highlights |
|---|---|---|---|---|---|---|---|
| 1 | Bai et al. (2016) [71] | “Cloud detection for high-resolution satellite imagery using machine learning and multi-feature fusion” | SVM-RBF (GLCM + NDVI) | GF-1 (102 Gao Fen-1); GF-2 | Binary | Cloud Producer Accuracy; Cloud User Accuracy; Overall Accuracy; Kappa | Overall accuracy of 91.45% and 80% kappa; Producer and User Accuracy of 93.67% and 95.67%, respectively |
| 2 | Tan et al. (2016) [72] | “Cloud extraction from Chinese high-resolution satellite imagery by probabilistic latent semantic analysis and object-based machine learning” | SVM (PLSA + SLIC) | GF-1; ZY-3 | Binary | Precision; Recall; Error Rate (ER) | Average precision of 87.6%, with 94.5% recall and 2.5% ER |
| 3 | Shao et al. (2017) [73] | “Fuzzy AutoEncode Based Cloud Detection for Remote Sensing Imagery” | Fuzzy Autoencode Model (FAEM) | 172 Landsat ETM+ images; 25 GF-1 images | Binary | Right Rate (RR); Error Rate (ER); False Alarm Rate (FAR) | RR of 86.6% with 3.6% ER and 1.2% FAR for Landsat; RR of 92.7% with 4.0% ER and 2.0% FAR for GF-1 |
| 4 | Pérez-Suay et al. (2017) [74] | “Randomized kernels for large scale Earth observation applications” | Randomized kernels | Meteosat Second Generation (MSG)/SEVIRI images | Binary | Accuracy | Accuracy of 82.5% to 90.5% over different training sizes and sunlight conditions |
| 5 | Sun et al. (2018) [75] | “SVM-Based Cloud Detection Using Combined Texture Features” | SVM (GLCM + RIULBP) | Random satellite images (64 × 64) | Binary | Accuracy | Overall accuracy of 98.92% |
| 6 | Ishida et al. (2018) [76] | “Development of a support vector machine-based cloud detection method for MODIS with the adjustability to various conditions” | SVM | MODIS (MOD35) | Binary | Statistical comparison | About 89–94% of cloud pixels and more than 97% of clear pixels detected accurately |
| 7 | Pérez-Suay et al. (2018) [77] | “Pattern recognition scheme for large-scale cloud detection over landmarks” | SVM | MSG dataset | Binary | Kappa; Overall Accuracy | Highest kappa score of 0.78; highest global accuracy of 91.52% |
| 8 | Deng et al. (2018) [78] | “Cloud detection in satellite images based on natural scene statistics and Gabor features” | SVM (Gabor) | GF-1 | Binary | Precision; Recall; Training Time; Running Time | Overall precision of 91.61% and 86.39% recall; 13 min training time and 3.32 s running time |
| 9 | Ghasemian and Akhoondzadeh (2018) [79] | “Introducing two Random Forest based methods for cloud detection in remote sensing images” | RF | Landsat 8; MODIS | Multiclass (thick cloud, thin cloud, snow/ice, and background) | True Positive Rate (TPR); True Negative Rate (TNR); Kappa | Cloud kappa of 1, snow/ice kappa of 0.99, and thin cloud kappa of 0.98 for Landsat-8; cloud kappa of 0.99 and snow/ice kappa of 0.85 for MODIS |
| 10 | Fu et al. (2018) [80] | “Cloud detection for FY meteorology satellite based on ensemble thresholds and random forests approach” | RF | FY-2G | Binary | Probability of Detection (POD); False Alarm Rate (FAR); Critical Success Index (CSI) | Average POD of about 97%, with 0.1 average FAR and 95% average CSI |
| 11 | Joshi et al. (2019) [81] | “Cloud detection algorithm using SVM with SWIR2 and tasseled cap applied to Landsat 8” | SVM | Landsat 8 (Biome) | Multiclass (cloud, shadow, and ground) | Precision; Sensitivity; Specificity; Overall Accuracy; F-measure; Kappa | Category-wise comparison using STmask and CFmask as references; highest overall accuracy of 84.2% for the Forest category (STmask as well as CFmask) |
| 12 | Chen et al. (2020) [82] | “A Novel Classification Extension-Based Cloud Detection Method for Medium-Resolution Optical Images” | Classification Extension-based Cloud Detection (CECD) using RF | Landsat-8 (Biome); Sentinel-2 (12 randomly selected images) | Binary | Cloud Producer Accuracy (CPA); Cloud User Accuracy (UA); Kappa; F-measure | Average CPA of 96.46%, with 97.11% F-measure and 0.9418 kappa for Sentinel-2 imagery; average CPA of 96.88%, with 97.65% F-measure and 0.9433 kappa for Landsat-8 imagery |
| 13 | Cilli et al. (2020) [83] | “Machine learning for cloud detection of globally distributed Sentinel-2 images” | RF; SVM; MLP | Sentinel-2 (10,000 random pixels from Hollstein and Baetens-Hagolle) | Binary | Accuracy; F1-score; Sensitivity; Precision; Specificity | SVM achieved the highest average accuracy of 97.9%, with a 95.8% F1-score; RF achieved the highest sensitivity of 97.9%; MLP achieved the highest precision of 97.9% with 99.4% specificity |
| 14 | Wei et al. (2020) [84] | “Cloud detection for Landsat imagery by combining the random forest and superpixels extracted via energy-driven sampling segmentation approaches” | RFmask (using RF) | Landsat 7 (Irish); Landsat 8 (Biome) | Binary | Accuracy; Kappa; Omission Error; Commission Error | Overall accuracy of 93.8% with 0.77 kappa, 12.0% omission error, and 7.4% commission error |
| 15 | Ibrahim et al. (2021) [85] | “Cloud and Cloud-Shadow Detection for Applications in Mapping Small-Scale Mining in Colombia Using Sentinel-2 Imagery” | SVM | Sentinel-2 (single image from El Bagre) | Multiclass (cloud, cloud shadow, cirrus, and clear) | True Positive; False Positive; Specificity | 94% to 100% specificity for randomly selected pixels over mining and water areas |
| 16 | Li et al. (2022) [86] | “An automatic cloud detection model for Sentinel-2 imagery based on Google Earth Engine” | SVM | Sentinel-2 (over Sri Lanka) | Binary | Omission Error; Commission Error; Overall Accuracy | Achieved 98.21% overall accuracy |
| 17 | Yao et al. (2022) [87] | “Optical remote sensing cloud detection based on random forest only using the visible light and near-infrared image bands” | RFCD (Random Forest Cloud Detection) | Landsat 8 (Biome); Sentinel-2 images (2000 × 2000); GF-1 | Binary | Accuracy; Precision; Recall; Kappa | Average precision of 95.04%, with 90.26% recall, 96.22% accuracy, and 0.8962 kappa for Landsat 8 |
| 18 | Singh et al. (2023) [88] | “Cloud detection using sentinel 2 images: a comparison of XGBoost, RF, SVM, and CNN algorithms” | XGBoost; RF; SVM (each using combinations of spectral (S) + GLCM (G) + morphological (M) + bilateral (B) features); ResNet14 | Sentinel-2 (Baetens-Hagolle and WHU-S2) | 6-class; Binary | Accuracy; F1-score; Precision; Recall; Kappa; User Accuracy; Producer Accuracy; mIoU; Prediction Time; Classifier Size | RF achieved the highest validation accuracy of 94.2% (kappa = 0.870) with S + B + M features on the Baetens-Hagolle dataset; XGBoost achieved the highest average accuracy of 91.1% on Baetens-Hagolle and >98% on WHU-S2 (binary); the XGBoost classifier is about 5000 KB and took 23.28 s prediction time |
| 19 | Singh et al. (2023) [89] | “An Automated Cloud Detection Method for Sentinel-2 Images” | XGBoost; RF; SVM (each using a pixel-wise patch-based mechanism) | Sentinel-2 (Baetens-Hagolle) | 6-class | Accuracy; F1-score; Prediction Time; Classifier Size | XGBoost (patch size 9) achieved the highest OA of 90.26% with a 78.77% F1-score, 3 s prediction time per image, and about 5000 KB classifier size |
| 20 | Shang et al. (2024) [90] | “A hybrid cloud detection and cloud phase classification algorithm using classic threshold-based tests and extra randomized tree model” | Threshold tests and Extra Randomized Tree (CARE algorithm) | Himawari-8/AHI Standard Data | Multiclass (cloud, probably cloud, clear, and probably clear) | Hit Rate (HR); False Alarm Rate (FAR) | HR of 97.12% and 96.28% for cloudy and clear pixels, and FAR of 1.44% and 1.84% for cloudy and clear pixels with the Extra Randomized Tree |

Deep Learning-based Cloud Detection:

| S. No | Author (Year) | Paper Title | Technique(s) | Satellite Sensor/Instrument (Dataset(s)) | Cloud Mask Type | Evaluation Parameter(s) | Main Highlights |
|---|---|---|---|---|---|---|---|
| 21 | Jeppesen et al. (2019) [101] | “A cloud detection algorithm for satellite imagery based on deep learning” | Remote Sensing Network (RS-Net, based on U-Net) | Landsat 8 (Biome and SPARCS) | Binary | Accuracy; F1-score | 93.81% total accuracy with a 93.42% F1-score on Biome; 95.60% total accuracy with an 88.52% F1-score on SPARCS |
| 22 | Xu et al. (2019) [102] | “DeepMask: an algorithm for cloud and cloud shadow detection in optical satellite remote sensing images using deep residual network” | DeepMask (based on ResNet) | Landsat 8 | Binary | Accuracy; Precision; Recall; F1-score | 93.56% accuracy, with 94.76% precision, 92.80% recall, and a 93.42% F1-score |
| 23 | Yang et al. (2019) [103] | “CDnet: CNN-based cloud detection for remote sensing imagery” | Cloud Detection Network (CDnet, CNN-based) | ZY-3; GF-1; Landsat-8 | Binary | Accuracy; Kappa; mIoU; User Accuracy (UA); Producer Accuracy (PA) | Overall accuracy of 96.47%, with 91.70% mIoU, 85.06% kappa, 89.75% PA, and 90.41% UA |
| 24 | Shendryk et al. (2019) [104] | “Deep learning for multi-modal classification of cloud, shadow, and land cover scenes in PlanetScope and Sentinel-2 imagery” | Ensembled DenseNet201, ResNet50, and VGG10 | PlanetScope; Sentinel-2 | Multiclass (clear, partly cloudy, cloudy, haze) | Accuracy; F2 score | F2 score of 0.76 with an accuracy of 76% for Sentinel-2 |
| 25 | Liu et al. (2019) [105] | “Clouds Classification from Sentinel-2 Imagery with Deep Residual Learning and Semantic Image Segmentation” | CloudNet (deep residual network) | Sentinel-2 (5,017,600 pixels) | Binary | Accuracy; Kappa; Mean Intersection over Union (mIoU); Precision; Prediction Time | Accuracy of 96.24%, with 90.29% mIoU, 0.8965 kappa, and 98.13% precision; prediction time of about 40 s for 20 scenes |
| 26 | Kanu et al. (2020) [106] | “CloudX-net: A robust encoder-decoder architecture for cloud detection from satellite remote sensing images” | CloudX-Net (CNN) | Landsat 8 (random image and 38 Dataset) | Binary | Jaccard Index; Precision; Recall; Overall Accuracy; F1-score | Jaccard index of 80.10%, with 97.92% overall accuracy and an 88.82% F1-score for Landsat 8; 77.09% Jaccard index with 93.70% accuracy for the 38 Dataset |
| 27 | Segal-Rozenhaimer et al. (2020) [107] | “Cloud detection algorithm for multi-modal satellite imagery using convolutional neural-networks (CNN)” | DeepLab (CNN-based cloud and cloud shadow detection) | WorldView-2 (WV-2); Sentinel-2 imagery over Fiji | Binary | Accuracy; Probability of Detection (POD); False Alarm Rate (FAR); Critical Success Index (CSI); Omission Error (OME) | Accuracy of about 94% with 89% POD, 16% FAR, and 75.6% CSI for WV-2; 81% accuracy with 19% OME for Sentinel-2 |
| 28 | Kristollari et al. (2020) [111] | “Convolutional neural networks for detecting challenging cases in cloud masking using Sentinel-2 imagery” | Patch-to-pixel CNN architecture | Sentinel-2 (Baetens-Hagolle) | 6-class | Overall Accuracy; Precision; Recall; F1-score | Accuracy of 97.51%, with 98.15% precision, 98.00% recall, and a 0.9805 F1-score |
| 29 | Luotamo et al. (2021) [112] | “Multiscale Cloud Detection in Remote Sensing Images Using a Dual Convolutional Neural Network” | Two-cascaded CNN architecture | Sentinel-2 (478 images from 2016–2017) | Binary | Overall Accuracy; Precision; Recall; F1-score; IoU | Accuracy of 85.6%, with a 0.711 F1-score and 0.597 IoU using VGG16 as baseline |
| 30 | Ma et al. (2021) [113] | “Cloud detection algorithm for multi-satellite remote sensing imagery based on a spectral library and 1D convolutional neural network” | CD-SLCNN (1D residual network using a spectral library) | Landsat 8; MODIS; Sentinel-2 | Binary | Overall Accuracy; User Accuracy; Producer Accuracy; Kappa; mIoU | Higher overall accuracy (95.6%, 95.36%, 94.27%) and mIoU (77.82%, 77.94%, 77.23%) on Landsat-8 OLI, MODIS, and Sentinel-2 data |
| 31 | Lopez-Puigdollers et al. (2021) [118] | “Benchmarking Deep Learning Models for Cloud Detection in Landsat-8 and Sentinel-2 Images” | Fully convolutional neural networks (FCNN) based on U-Net | Landsat 8 (SPARCS, Biome, and 38 dataset); Sentinel-2 (Hollstein and Baetens-Hagolle) | Binary | Accuracy; Commission Error (CE); Omission Error (OE) | Accuracy of 94.35%, with 5.36% CE and 5.91% OE on Biome; accuracy of 93.57%, with 0.85% CE and 29.68% OE on SPARCS; accuracy of 95.16%, with 0.70% CE and 29.59% OE on Baetens-Hagolle |
| 32 | Li et al. (2021) [56] | “A Lightweight Deep Learning-Based Cloud Detection Method for Sentinel-2A Imagery Fusing Multiscale Spectral and Spatial Features” | Cloud Detection fusing multiscale spectral and spatial features (CD-FM3SF) | Sentinel-2 (WHU-S2) | Binary | Accuracy; Precision; Recall; F1-score; mIoU | Accuracy of 98.68%, with 0.8100 mIoU, 97.96% precision, and a 0.8940 F1-score using 4 common bands |
| 33 | Li et al. (2022) [124] | “A hybrid generative adversarial network for weakly-supervised cloud detection in multispectral images” | GAN-CDM | Landsat-8 (Biome); Sentinel-2 (S2 CMC) | Binary | Overall Accuracy; Producer Accuracy; User Accuracy; F1-score; IoU | Accuracy of 90.20%, with 0.8272 IoU and a 90.54% F1-score (Landsat-8); accuracy of 92.54%, with 0.8674 IoU and a 92.90% F1-score (Sentinel-2) |
| 34 | Grabowski et al. (2023) [143] | “Squeezing nnU-Nets with Knowledge Distillation for On-Board Cloud Detection” | nnU-Nets | Landsat-8 (38 dataset); Sentinel-2 (KappaZeta) | Binary; 4-class | Jaccard Index; Precision; Recall; Overall Accuracy | Jaccard index of 75.6%, with 95.3% overall accuracy, 84.5% precision, and 86.6% recall for Landsat 8; 0.510 mean Jaccard index for Sentinel-2 |
| 35 | Zhang et al. (2023) [149] | “CloudViT: A Lightweight Vision Transformer Network for Remote Sensing Cloud Detection” | CloudViT | Landsat-7; Landsat-8; Sentinel-2 (WHU-S2); GF-1 (AIR-CD) | Binary | Mean Intersection over Union (mIoU) | mIoU of 71.29%, 80.62%, and 81.11% on Landsat-8 OLI, Sentinel-2, and GF-1 data |
| 36 | Singh et al. (2023) [150] | “A transformer-based cloud detection approach using Sentinel 2 images” | SSATR-CD (Spatial-Spectral Attention Transformer for Cloud Detection) | Sentinel-2 (IndiaS2 and WHU-S2) | 4-class; Binary | Overall Accuracy; Precision; Recall; F1-score; mIoU; Prediction Time | 99.63% accuracy with a 0.6149 F1-score and 0.44391 mIoU on IndiaS2; 98.17% accuracy with a 0.8608 F1-score and 0.7617 mIoU on WHU-S2 (model-based transfer) |
| 37 | Francis (2024) [151] | “Sensor Independent Cloud and Shadow Masking with Partial Labels and Multimodal Inputs” | SegFormer | Sentinel-2 (CloudSEN12, KappaSet, and Baetens-Hagolle); Landsat 8 and 9 | Binary | Precision; Recall; IoU; F1-score; Balanced Accuracy (BA) | 0.9049 IoU with a 0.9501 F1-score and 93.29% BA |
| 38 | Singh et al. (2024) [156] | “Enhanced cloud detection in Sentinel-2 imagery using K-means clustering embedded transformer-inspired models” | KET-CD (K-means embedded transformer-inspired methods) | Sentinel-2 (IndiaS2 and KappaSet) | 4-class | Overall Accuracy; Precision; Recall; F1-score; mIoU; Prediction Time; Parameters | Accuracy of 82.7931% with an F1-score of 0.7383 and 0.6055 mIoU; prediction time of about 2 min per image product and 0.024 million parameters |
| 39 | Wright et al. (2024) [158] | “CloudS2Mask: A novel deep learning approach for improved cloud and cloud shadow masking in Sentinel-2 imagery” | CloudS2Mask (based on U-Net) | Sentinel-2 (CloudSEN12, KappaSet, Sentinel-2 Cloud Mask Catalogue) | 4-class | Overall Accuracy; Balanced Overall Accuracy; Producer Accuracy; User Accuracy | Accuracy of 92.6% with a BOA of 92.4%, 88.5% PA, and 95.7% UA |
| 40 | Gbodjp et al. (2025) [159] | “Self-supervised representation learning for cloud detection using Sentinel-2 images” | DeepCluster | Sentinel-2 (SEN12MS, WHU-S2+, and CloudSEN12) | Binary | Accuracy; Precision; Recall; F1-score; IoU | Accuracy of 98.5% with a 0.91 F1-score, 0.83 IoU, 94.2% precision, and 87.4% recall |
Table 10. Performance Metrics Formulas.

| Metric | Binary | Multiclass (Macro-Averaged) | Multiclass (Micro-Averaged) |
|---|---|---|---|
| Accuracy | $\frac{TP+TN}{TP+FP+FN+TN}$ | $\frac{1}{C}\sum_{c=1}^{C}\frac{TP_c+TN_c}{TP_c+FP_c+FN_c+TN_c}$ | $\frac{\sum_{c=1}^{C}(TP_c+TN_c)}{\sum_{c=1}^{C}(TP_c+FP_c+FN_c+TN_c)}$ |
| F1-score | $\frac{TP}{TP+\frac{1}{2}(FP+FN)}$ | $\frac{1}{C}\sum_{c=1}^{C}\frac{TP_c}{TP_c+\frac{1}{2}(FP_c+FN_c)}$ | $\frac{\sum_{c=1}^{C}TP_c}{\sum_{c=1}^{C}\left(TP_c+\frac{1}{2}(FP_c+FN_c)\right)}$ |
| Precision | $\frac{TP}{TP+FP}$ | $\frac{1}{C}\sum_{c=1}^{C}\frac{TP_c}{TP_c+FP_c}$ | $\frac{\sum_{c=1}^{C}TP_c}{\sum_{c=1}^{C}(TP_c+FP_c)}$ |
| Recall | $\frac{TP}{TP+FN}$ | $\frac{1}{C}\sum_{c=1}^{C}\frac{TP_c}{TP_c+FN_c}$ | $\frac{\sum_{c=1}^{C}TP_c}{\sum_{c=1}^{C}(TP_c+FN_c)}$ |
| Kappa coefficient | $\frac{P_o-P_e}{1-P_e}$ | $\frac{1}{C}\sum_{c=1}^{C}\frac{P_{o,c}-P_{e,c}}{1-P_{e,c}}$ | $\frac{\sum_{c=1}^{C}(P_{o,c}-P_{e,c})}{\sum_{c=1}^{C}(1-P_{e,c})}$ |
| mIoU | $\frac{TP}{TP+FP+FN}$ | $\frac{1}{C}\sum_{c=1}^{C}\frac{TP_c}{TP_c+FP_c+FN_c}$ | $\frac{\sum_{c=1}^{C}TP_c}{\sum_{c=1}^{C}(TP_c+FP_c+FN_c)}$ |
| User Accuracy (UA) | $\frac{TP}{TP+FP}$ | $\frac{TP_c}{TP_c+FP_c}$ (per class) | – |
| Producer Accuracy (PA) | $\frac{TP}{TP+FN}$ | $\frac{TP_c}{TP_c+FN_c}$ (per class) | – |

where $P_o=\frac{TP+TN}{TP+FP+FN+TN}$ and $P_e=\frac{(TP+FP)\cdot(TP+FN)+(FN+TN)\cdot(FP+TN)}{(TP+FP+FN+TN)^2}$.
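To make the macro/micro distinction in Table 10 concrete, the sketch below (our own illustration, using hypothetical toy masks) computes F1 and mIoU both ways from integer-labeled reference and predicted masks; macro-averaging averages per-class scores, while micro-averaging pools the counts first:

```python
import numpy as np

def per_class_counts(ref, pred, classes):
    """Per-class TP, FP, FN between reference and predicted label masks."""
    tp = np.array([np.sum((pred == c) & (ref == c)) for c in classes])
    fp = np.array([np.sum((pred == c) & (ref != c)) for c in classes])
    fn = np.array([np.sum((pred != c) & (ref == c)) for c in classes])
    return tp, fp, fn

def f1_scores(ref, pred, classes):
    tp, fp, fn = per_class_counts(ref, pred, classes)
    macro = np.mean(tp / (tp + 0.5 * (fp + fn)))                  # mean of per-class F1
    micro = tp.sum() / (tp.sum() + 0.5 * (fp.sum() + fn.sum()))   # pooled counts
    return macro, micro

def miou(ref, pred, classes):
    tp, fp, fn = per_class_counts(ref, pred, classes)
    return np.mean(tp / (tp + fp + fn))                           # macro-averaged IoU

# Toy 6-class masks using labels 2-7 (as in Tables 3 and 8)
rng = np.random.default_rng(0)
ref = rng.integers(2, 8, size=(64, 64))
pred = np.where(rng.random((64, 64)) < 0.9, ref,
                rng.integers(2, 8, size=(64, 64)))  # ~10% corrupted labels
print("macro/micro F1:", f1_scores(ref, pred, classes=range(2, 8)))
print("mIoU:", miou(ref, pred, classes=range(2, 8)))
```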
Table 11. Conversion mechanism for predicted classes of the cloud detection methods (repeated values indicate cells merged in the original table).

| Binary | 4-Class | 6-Class | Fmask | Sen2Cor |
|---|---|---|---|---|
| 0: Cloud | 2: Thick cloud | 2: Low cloud | 4: Cloud | 9: Cloud high probability |
| 0: Cloud | 3: Thin cloud | 3: High cloud | – | 8: Cloud medium probability |
| 0: Cloud | 3: Thin cloud | 3: High cloud | – | 10: Thin cirrus |
| 1: Clear (non-cloud) | 4: Cloud shadow | 4: Cloud shadow | 2: Cloud shadow | 3: Cloud shadow |
| 1: Clear (non-cloud) | 5: Ground | 5: Ground | 0: Clear land | 2: Dark area pixels |
| 1: Clear (non-cloud) | 5: Ground | 5: Ground | 0: Clear land | 4: Vegetation |
| 1: Clear (non-cloud) | 5: Ground | 5: Ground | 0: Clear land | 5: Bare soil |
| 1: Clear (non-cloud) | 5: Ground | 5: Ground | 0: Clear land | 7: Unclassified |
| 1: Clear (non-cloud) | 5: Ground | 6: Water | 1: Water | 6: Water |
| 1: Clear (non-cloud) | 5: Ground | 7: Snow/ice | 3: Snow | 11: Snow |
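In code, the harmonization of Table 11 reduces to remapping each method's label set onto the shared binary convention before any metrics are computed. Below is a minimal sketch assuming the label values of Table 11; the constants and function name are ours, not part of any method's API:

```python
import numpy as np

# Label sets taken from Table 11 (the cloud rows of each column)
SIXCLASS_CLOUD = {2, 3}        # low cloud, high cloud
SEN2COR_CLOUD = {8, 9, 10}     # cloud medium/high probability, thin cirrus

def to_binary(mask: np.ndarray, cloud_labels: set) -> np.ndarray:
    """Collapse a multiclass cloud mask to binary: 0 = cloud, 1 = non-cloud."""
    return np.where(np.isin(mask, list(cloud_labels)), 0, 1)

# Example: harmonize a 6-class prediction and a Sen2Cor SCL prediction
sixclass_pred = np.array([[2, 5, 4], [3, 6, 7]])
scl_pred = np.array([[9, 4, 3], [10, 6, 11]])
print(to_binary(sixclass_pred, SIXCLASS_CLOUD))  # [[0 1 1] [0 1 1]]
print(to_binary(scl_pred, SEN2COR_CLOUD))        # [[0 1 1] [0 1 1]]
```

Once both outputs share the binary convention, the formulas of Table 10 can be applied uniformly, which is the basis of the fair intercomparison advocated in this review.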