Article

Impact of Synthetic Data on Deep Learning Models for Earth Observation: Photovoltaic Panel Detection Case Study

1 GMV, Harwell Science and Innovation Campus, Oxford OX11 0RL, Oxfordshire, UK
2 Geomatics Engineering Program, Graduate School, Istanbul Technical University, 34469 Istanbul, Turkey
3 IRTIC, University of Valencia, 46980 Valencia, Spain
4 Φ-lab, ESRIN, European Space Agency (ESA), 00044 Frascati, Italy
5 Geomatics Engineering Department, Faculty of Civil Engineering, Istanbul Technical University, 34469 Istanbul, Turkey
* Author to whom correspondence should be addressed.
ISPRS Int. J. Geo-Inf. 2025, 14(12), 481; https://doi.org/10.3390/ijgi14120481
Submission received: 11 September 2025 / Revised: 16 November 2025 / Accepted: 27 November 2025 / Published: 4 December 2025
(This article belongs to the Topic Artificial Intelligence Models, Tools and Applications)

Abstract

This study explores the impact of synthetic data, both physically based and generatively created, on deep learning analytics for earth observation (EO), focusing on the detection of photovoltaic panels. A YOLOv8 object detection model was trained using a publicly available, multi-resolution very high resolution (VHR) EO dataset (0.8 m, 0.3 m, and 0.1 m), comprising 3716 images from various locations in Jiangsu Province, China. Three benchmarks were established using only real EO data. Subsequent experiments evaluated how the inclusion of synthetic data, in varying types and quantities, influenced the model’s ability to detect photovoltaic panels in VHR imagery. Physically based synthetic images were generated using the Unity engine, which allowed the generation of a wide range of realistic scenes by varying scene parameters automatically. This approach produced not only realistic RGB images but also semantic segmentation maps and pixel-accurate masks identifying photovoltaic panel locations. Generative synthetic data were created using diffusion-based models (DALL·E 3 and Stable Diffusion XL), guided by prompts to simulate satellite-like imagery containing solar panels. All synthetic images were manually reviewed, and corresponding annotations were ensured to be consistent with the real dataset. Integrating synthetic with real data generally improved model performance, with the best results achieved when both data types were combined. Performance gains were dependent on data distribution and volume, with the most significant improvements observed when synthetic data were used to meet the YOLOv8-recommended minimum of 1500 images per class. In this setting, combining real data with both physically based and generative synthetic data yielded improvements of 1.7% in precision, 3.9% in recall, 2.3% in mAP@50, and 3.3% in mAP@95 compared to training with real data alone. The study also emphasizes the importance of carefully managing the inclusion of synthetic data in training and validation phases to avoid overfitting to synthetic features, with the goal of enhancing generalization to real-world data. Additionally, a pre-training experiment using only synthetic data, followed by fine-tuning with real images, demonstrated improved early-stage training performance, particularly during the first five epochs, highlighting potential benefits in computationally constrained environments.

1. Introduction

In the earth observation (EO) domain, artificial intelligence (AI) techniques have the potential to automate complex analyses, thereby enhancing target or pattern detection. However, AI-driven analytics performance heavily relies on the training phase, which involves processing a comprehensive, varied, and curated set of images, verified against corresponding reference labels. Procuring and annotating such a vast number of images for all necessary observation conditions is typically demanding and time-consuming. In many cases, this requires specialized in-field campaigns with appropriate equipment and expertise. In this context, simulation methodologies, capable of realistically replicating physical conditions and sensing performance of various sensors, could provide a valuable and complementary data source.
Since the 20th century, there has been a significant increase in global solar photovoltaic (PV) capacity, reaching notable levels in recent years [1]. A considerable portion of this capacity comes from small-scale PV systems typically used in residential areas [2], where installation costs have dropped significantly over the past decade [3]. The drop in PV technology prices into the range of fossil fuel prices [3], combined with climate change concerns, has led more households to install PV solar panels.
Detailed information on distributed PV arrays, including their location, size, shape, and capacity, is essential for energy suppliers and government agencies for system integration, operation, planning, and policymaking [4]. However, collecting these data is challenging due to reliance on third-party sources and the labor-intensive process of manually labeling high-resolution aerial images [5,6].
The creation of synthetic data offers a cost-effective and controllable alternative to real EO data for machine learning (ML) applications. This method would allow researchers to test hypotheses and models in a controlled environment and to generate “what if” scenarios, helping data scientists better understand and accurately predict model outcomes. It would be particularly valuable in situations where data and labels are limited or costly to acquire, especially as deep learning (DL) models continue to scale and demand larger training datasets [7]. Studies have shown that supplementing datasets with synthetic data can improve AI system performance, and this strategy is increasingly explored as a practical response to data scarcity and annotation costs [8,9,10,11].
This paper investigates whether using synthetic data, such as AI-generated and physically based simulated data, can help improve the efficiency and accuracy of solar panel detection from satellite imagery.
The use of AI-generated data has been progressively adopted across a wide range of applications, as evidenced by numerous studies [12,13,14,15,16,17]. Synthetic data have also started to play a crucial role in improving ML models while protecting the confidentiality of real-world data by acting as a proxy. In sensitive fields like medical imaging, synthetic data provide an effective way to safeguard the privacy of original data, as they are not an exact replica of the genuine sensitive information [18,19].
Generative models have also been utilized to condense large datasets into smaller, more manageable synthetic datasets, a process known as dataset distillation.
Simulated virtual environments, particularly those employing physically based rendering (PBR), have emerged as a promising technique to alleviate the data-intensive nature of AI model training, especially for deep convolutional neural networks (CNNs). This approach is particularly valuable in domains where acquiring large-scale real-world datasets or ground truth annotations is challenging or costly.
Recent studies have explored the potential of using existing video games and animated movies to generate high-precision training data. However, a critical challenge remains ensuring that models trained on such synthetic data can generalize effectively to real-world scenarios. In [20], a probabilistic PBR system was proposed to evaluate the suitability of synthetic data for urban AI applications. Similarly, [21] demonstrated that more realistic PBR does not always translate to improved performance on real-world datasets.
In the field of robotics, PBR pipelines have been developed to generate training data for robots and embodied agents [22,23], as well as for indoor scene understanding [24].
In contrast, the application of PBR in earth observation remains relatively unexplored. While PBR is commonly used for realistic atmospheric visualization [25], its potential for training AI-driven EO algorithms has not been widely investigated.
In recent years, some studies have investigated how using synthetic data can enhance the performance of deep learning models within EO applications. Reference [26] introduced two methods for generating synthetic datasets of aircraft images: one using rendered 3D CAD models and another employing a multiscale attention module to enhance the cycle-consistent generative adversarial network (CycleGAN) across spatial and channel dimensions. Their findings showed that incorporating synthetic data could improve aircraft detection accuracy in remote sensing images, especially when real data are scarce. Similarly, [27] applied generative adversarial networks (GANs) to produce synthetic segmentation masks and corresponding synthetic remote sensing images for vehicle detection from high-resolution satellite imagery. Their study revealed that, in data-limited scenarios, augmenting datasets with synthetically generated images could enhance detection performance. Reference [28] found that synthetic data generated with their method could improve detection performance, especially when supplemented with a portion of real training images. Reference [29] employed a conventional GAN architecture to generate artificial synthetic aperture radar images. They evaluated these images by using a CNN classifier on 10 selected target categories from the moving and stationary target acquisition and recognition (MSTAR) dataset, finding that the prediction performance was comparable to that achieved with real images. Similarly, [30] utilized GANs to generate multi-band, 16-bit satellite images resembling Sentinel-2 Level-1C products. They also applied GANs for image style transfer to perform land-cover transformation, achieving promising results.
While previous studies have demonstrated the potential of synthetic data to enhance model performance in EO tasks, they often rely on a single data generation method (e.g., GAN-based or simulation-based) and rarely explore how varying the type or amount of synthetic data affects results. Moreover, the combined use of AI-generated and physically simulated data (particularly for object detection tasks like PV solar panel identification in urban VHR imagery) remains underexplored. To address this gap, our study systematically evaluates the impact of synthetic data under different configurations, focusing on how its type, combination, and quantity affect real-world generalization.
Specifically, we investigate whether incorporating labeled synthetic data, both AI-generated and physically based simulated, can improve the performance of a deep learning model for detecting distributed PV arrays in VHR satellite imagery. Unlike previous work that uses a single synthetic source or fixed augmentation strategy, this study explores diverse scenarios and proportions to understand how the choice and integration of synthetic data influence model performance.

2. Materials and Methods

The real EO dataset used in this study originates from [31] and is publicly available on Zenodo. It comprises EO images collected from various locations in Jiangsu Province, China. It includes three groups of PV samples at spatial resolutions of 0.8 m, 0.3 m, and 0.1 m, labeled as “PV08,” “PV03,” and “PV01,” respectively. “PV08” consists of imagery from Gaofen-2 and Beijing-2 satellites, “PV03” is derived from aerial imagery, and “PV01” is based on unmanned aerial vehicle (UAV) orthophotos. Altogether, the dataset contains 3716 samples of PV panels categorized by background type (ground-mounted solar panels and rooftop solar panels) (see Table 1). The dataset also covers a variety of land cover types (e.g., shrubland, grassland, cropland, saline-alkali, and water surfaces) and roof types (e.g., flat concrete, steel tile, and brick roofs). The [31] dataset provides ground truth (GT) data in the form of binary masks indicating the presence of solar panels and background. These masks were converted into bounding boxes for object detection tasks. An example of the images and corresponding masks is shown in Figure 1.
The [31] dataset was selected for this study due to its diversity, high data quality, multi-resolution characteristics, and well-structured label format. While it is not particularly large in terms of the number of samples, this limitation is shared by most publicly available datasets. This shortcoming highlights the potential value of incorporating synthetic data to address such gaps.

3. Methodology

3.1. Generation Process of the Physically Based Simulated Data

To create physically based synthetic images, a tool was developed and integrated into the Unity 3D application development environment. This tool allows users to define the properties of the desired type of human settlement and generate random variations in these properties, simplifying the creation of large datasets. For each 3D scene, the corresponding 3D elements—such as buildings, vehicles, vegetation, streets, and more—are procedurally generated. These 3D elements are then rendered to produce the final image, enabling the generation of any type of projection, from satellite-like views to aerial perspectives where the viewing angle is not perpendicular. To include photovoltaic panels, the tool automatically generates 3D panel elements on building rooftops, calculating their position and orientation based on the roof topology. A key aspect of this tool is the accurate distribution of 3D elements to simulate various human settlement topologies.
To achieve this, the tool employs a three-level space division mechanism: Voronoi partitioning, streamline-based subdivision, and grammar-based object placement, extending the two-level partitioning system presented in [32]. The first division process creates regions according to the user-defined zone types, such as industrial, residential, urban centers, or vegetation areas. A Voronoi partitioning algorithm is applied to pseudo-random points generated according to user-defined parcel sizes and distribution types. By adjusting the alignment and spacing of the initial points, different settlement topologies can be created automatically. The second process subdivides the parcels into smaller, appropriately sized parcels while maintaining the desired topology. This step ensures the parcels adhere to user-defined size and shape requirements, which are challenging to control using only Voronoi partitioning. Finally, each parcel is populated with 3D objects based on zone-specific grammars. The grammars define the arrangement of 3D elements and adapt dynamically to the unique shape of each parcel. Procedural generation of 3D elements ensures that every detail, from building structures to the random distribution of cars in parking areas, conforms to the parcel’s geometry. This approach also supports precise placement of solar panels, calculating their optimal position and orientation relative to the roof topology. An example of different random variations computed automatically is shown in Figure 2, which also shows the label color assigned to each 3D element, following the same color scheme as OpenStreetMap.
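To make the first-level division concrete, the following minimal sketch (in Python with SciPy rather than the Unity tool itself) illustrates how jittered pseudo-random seed points can drive a Voronoi partition of the scene extent. The extent, parcel size, and jitter values are illustrative assumptions, not parameters taken from the tool.

```python
# Minimal sketch of the first-level Voronoi partitioning step (not the authors' Unity tool).
# Seed points are placed on a regular grid and jittered to vary the settlement topology.
import numpy as np
from scipy.spatial import Voronoi

def generate_parcel_seeds(extent_m=1000.0, parcel_size_m=80.0, jitter=0.4, rng=None):
    """Pseudo-random seed points on a jittered grid (all parameters are hypothetical)."""
    rng = rng or np.random.default_rng(42)
    coords = np.arange(parcel_size_m / 2, extent_m, parcel_size_m)
    xx, yy = np.meshgrid(coords, coords)
    seeds = np.stack([xx.ravel(), yy.ravel()], axis=1)
    seeds += rng.uniform(-jitter, jitter, seeds.shape) * parcel_size_m
    return seeds

seeds = generate_parcel_seeds()
vor = Voronoi(seeds)                      # first-level space division into parcels
print(f"{len(vor.point_region)} parcels generated")
# In the actual tool, each region would then be subdivided (streamline-based) and populated
# with zone-specific grammars (buildings, vegetation, PV panels on suitable roofs).
```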
The synthetic images are generated using physically based rendering techniques. Custom shaders handle the calculations required for realistic rendering and also enable the creation of segmentation masks for different element types. This allows users to generate multiple image outputs from the same 3D distribution, such as color images, near-infrared images, or pixel-accurate masks for residential buildings, non-residential buildings, photovoltaic panels, and more.
This tool empowers users to define zone types and generate random distributions, including specifying parameters like the percentage of solar panels in each area. By creating detailed 3D scenarios, it enables the generation of images from various viewpoints and produces pixel-level segmentation masks for all elements. This dramatically simplifies the automatic creation of large datasets, complete with associated labels, to support machine learning and other applications. An example of a generated dataset can be found in [33]; Figure 3 shows an example of the generated images, including the color image, the pixel-level labels for each element, and the associated photovoltaic mask.

3.2. Generation Process of the AI-Generated Data

Diffusion models are a class of deep generative models designed to synthesize high-quality, diverse images by applying a probabilistic diffusion framework. During training, these models operate through a process inspired by diffusion phenomena, where data are incrementally transformed over time. The training involves two phases: the forward diffusion process and the reverse diffusion process.
In the forward diffusion process, Gaussian noise is iteratively added to the training data over a series of steps, gradually diffusing the data into a state of pure noise. This forward transformation, akin to a diffusion process in physics, models a progressive destruction of the input structure while maintaining a well-defined statistical path from the original data, and the high-level concepts it depicts, to pure noise.
In the reverse diffusion process, the model is trained to invert this noising sequence step by step, effectively learning the dynamics required to reverse the diffusion process. Starting with pure noise, the model predicts and removes noise incrementally at each step, ultimately reconstructing the original data distribution for a combination of concepts. This bidirectional process lies at the heart of the AI model’s ability to generate realistic images. During inference, only the reverse diffusion process is used, enabling the generation of synthetic data by starting with random noise and reversing the learned diffusion path [34].
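For reference, the standard denoising diffusion formulation underlying this description can be summarized as follows. This is a hedged restatement of the usual DDPM equations, not equations reproduced from [34] or [35]: a forward step adds Gaussian noise with variance β_t, the noisy sample at any step t has a closed form, and a network ε_θ is trained to predict the added noise.

```latex
% Forward noising step (variance schedule \beta_t):
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t \mathbf{I}\right)
% Closed form after t steps, with \bar{\alpha}_t = \prod_{s=1}^{t}(1-\beta_s):
q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t)\mathbf{I}\right)
% Noise-prediction training objective (learned reverse process):
\mathcal{L} = \mathbb{E}_{x_0,\,\epsilon \sim \mathcal{N}(0,\mathbf{I}),\,t}
  \left[\bigl\|\,\epsilon - \epsilon_\theta\bigl(\sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon,\ t\bigr)\bigr\|^2\right]
```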
For the case study of synthesizing solar panel images in urban scenarios, the Stable Diffusion model from [35] was adapted. This model is a popular open-source implementation of the diffusion model subfamily known as latent diffusion models (LDMs).
The objective was not only to synthesize images that replicated the features of imagery acquired by EO sensors but also to achieve precise pixel-level annotation. Then, the generative model had to master fine control over the spatial positioning of synthetic elements. To this end, the image generation was conditioned on an auxiliary input signal that directed how the desired elements were placed during the diffusion process. Recent advancements in hypernetwork design were applied, specifically utilizing the integration between Stable Diffusion 1.5 and ControlNet [36], which enables a reduction in the number of required training image pairs by nearly three orders of magnitude compared to training a diffusion model from scratch. This methodology facilitates the precise incorporation of concepts into the model without requiring extensive datasets or significant computational resources.
GAN-based augmentation methods, including CycleGAN variants, were evaluated as potential alternatives. However, the task addressed in this work requires precise geometric control to ensure the consistent placement of photovoltaic panels within the urban structure. Latent diffusion models with ControlNet provide explicit spatial conditioning while maintaining state-of-the-art sample quality and diversity, making them more suitable for this purpose. Recent studies have also demonstrated that diffusion models achieve performance comparable to or superior to GANs in terms of visual fidelity and stability, and that diffusion-based synthetic data can effectively improve recognition and detection tasks [8,37,38,39,40].
Instead of relying on images with depth maps, edges, or segmented maps—which would have been costly to obtain for the required volume of image pairs—the approach used rasterized OpenStreetMap patches as conditioning signals and their corresponding pixel-to-pixel satellite image patches as paired data. These pairs were used to fine-tune the diffusion model with simple yet consistent prompts that aligned with the scenes while being distinct enough from those in the base model.
The image generation process during inference involved three main steps:
  • An urban area was selected where solar panels could potentially be installed on rooftops, spanning as large a region as an entire city.
  • The rasterized map of this area was obtained and divided into patches matching the size and scale used during the hypernetwork training. These patches were fed into the conditioned diffusion model, which generated satellite-like images similar to those used during training but never identical. For each possible initial noise seed, the resulting image will feature elements aligned with the map but with significant variations in finer details.
  • Since solar panels are not cataloged in OpenStreetMap, the synthetic images were analyzed using simple computer vision algorithms. In the final step, solar panels were added using traditional computer graphics techniques (explained below).
Figure 4 shows the generation of synthetic data based on stable diffusion model with ControlNet using OpenStreetMap as a guide. In this study, the zoom level in OpenStreetMap was set to correspond to an approximate spatial resolution of 1 m. The RGB images used for training were sourced from the World Imagery service provided by ArcGIS Online, specifically covering the UK region.
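A minimal inference sketch of this OSM-conditioned setup is shown below, using the Hugging Face diffusers library. The ControlNet checkpoint path, the prompt, and the tile file name are placeholders: the hypernetwork fine-tuned on OpenStreetMap/World Imagery pairs described above is not publicly released, so this only illustrates the mechanics of conditioned generation.

```python
# Hedged sketch of OSM-conditioned image generation with diffusers; paths and prompt are illustrative.
import torch
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "path/to/osm-conditioned-controlnet",          # hypothetical fine-tuned ControlNet checkpoint
    torch_dtype=torch.float16,
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",              # Stable Diffusion 1.5 base, as referenced in the text
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

osm_patch = load_image("osm_tile_512.png")          # rasterized OpenStreetMap patch (conditioning signal)
image = pipe(
    prompt="aerial satellite view of a residential urban area",  # simple, consistent prompt
    image=osm_patch,
    num_inference_steps=30,
    generator=torch.Generator("cuda").manual_seed(0),  # different seeds yield varied fine details
).images[0]
image.save("synthetic_patch.png")
```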
Building on the three-step pipeline above, the third step, where solar panels are virtually added, is detailed as follows. Panels are positioned consistently on rooftops and in well-illuminated areas, using a sequence of three phases:
  • Detection and segmentation of candidate areas within illuminated roofs to place solar panels. A high proportion of roofs, especially on industrial buildings, reflect a significant amount of light, saturating the satellite sensors. This is likely due to the roof inclination and the fact that many of these surfaces are actually flat. These saturated areas are typically found on the most illuminated parts of the roofs, which are ideal locations for placing solar panels. Detecting saturated areas in the synthetic images was straightforward using basic thresholding algorithms. The system therefore identified very “white” regions, using a configurable deviation of 8% by default and a minimum region size of 100 pixels. These areas were marked with oriented bounding boxes, and the vertex coordinates of the boxes were provided to train solar panel detectors (a minimal sketch of this thresholding step is shown after this list).
  • AI-generated solar panels with non-idealities. A combination of two existing diffusion models was used to synthesize solar panel textures, which were then mapped onto areas of interest detected in the previous phase. Specifically, DALL·E 3 and Stable Diffusion XL were exploited to generate high-resolution solar panel images, from which dynamic texture portions were taken to edit aerial images on the fly.
    • Over 90 base textures at 1024 × 1024 pixels were created, featuring:
      • Various cell patterns, cell shapes, and connectors.
      • Different levels of dirt due to dust, ash, and grime.
      • Surface defects, including incomplete and damaged panels.
  • Mapping to combine diffusion model samples into final images. The bounding boxes oriented towards the candidate areas identified earlier were used to map planar portions of the textures generated, following a simple procedure of rescaling and mipmapping with pyramid filtering. Since 3D object estimates were not available and the projection was nearly orthographic, this basic mapping method worked effectively. Additionally, a border was added around the roof area where the panels were placed to simulate the frames and reinforcement structures typically used in urban settings. Figure 5 shows an example of AI-based synthetic images with their associated photovoltaic binary masks.
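A minimal sketch of the saturated-area detection described in the first phase above is given below, using OpenCV. The 8% deviation from pure white and the 100-pixel minimum area follow the text; the function name, grayscale conversion, and I/O details are illustrative assumptions.

```python
# Minimal sketch of the saturated-roof detection step (illustrative, not the authors' exact code).
import cv2
import numpy as np

def find_candidate_roof_areas(image_bgr, white_deviation=0.08, min_area_px=100):
    """Return oriented bounding boxes around near-saturated (very 'white') regions."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    threshold = int(255 * (1.0 - white_deviation))            # pixels within 8% of pure white
    _, mask = cv2.threshold(gray, threshold, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    boxes = []
    for contour in contours:
        if cv2.contourArea(contour) < min_area_px:            # discard regions below the minimum size
            continue
        rect = cv2.minAreaRect(contour)                       # oriented bounding box
        boxes.append(cv2.boxPoints(rect))                     # four vertex coordinates per box
    return boxes

boxes = find_candidate_roof_areas(cv2.imread("synthetic_patch.png"))  # file name is illustrative
```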
During the second phase of the third step, high-resolution textures were generated for a wide variety of solar panels, including those significantly damaged or obscured by dust. The synthetic images produced through this method have the potential to improve the accuracy of solar panel detection systems in the context of natural disasters such as earthquakes, wildfires, or sandstorms. These scenarios are characterized by a severe lack of real, labeled data, particularly within publicly available datasets, underscoring the value of the proposed approach. All synthetic images were manually inspected to verify their consistency with real EO data in terms of illumination, perspective, and photovoltaic panel distribution. This review also ensured that the annotations accurately matched the photovoltaic panel regions. Although no quantitative image similarity metrics (e.g., SSIM or FID) were applied in this study, visual inspection was deemed sufficient to ensure dataset coherence for the training experiments.

3.3. AI-Analytics Methodology

3.3.1. YOLOv8

YOLOv8, released by Ultralytics in January 2023, is a state-of-the-art object detection model offering improvements in detection accuracy and speed compared to earlier versions like YOLOv5 and YOLOv7. Its architecture consists of three main components: the backbone, neck, and head, each designed for efficient and accurate object detection. The backbone, based on a modified CSPDarknet53, enhances feature extraction through improved modules and lightweight design, enabling robust performance with reduced computational demands. The neck employs a PAN-FPN structure to effectively combine shallow and deep features, ensuring better localization and semantic understanding. The head uses a decoupled structure to separately handle object classification and bounding box regression, allowing for greater detection precision and faster convergence. For this study, YOLOv8 is employed to evaluate its effectiveness in detecting PV panels in VHR satellite imagery [41]. YOLO has become a central real-time object detection system for robotics, driverless cars, and video monitoring applications. The aim is to assess how the inclusion of labeled synthetic data for training and validating YOLOv8 affects its performance in detecting PV panels in real-world scenarios.

3.3.2. Synthetic Data Preprocessing

Before starting the training process, both the real and synthetic datasets were preprocessed to ensure compatibility with the YOLOv8 model. For the real and physically based simulated data, the GT was initially provided in the form of binary masks, where pixel values of 1 indicated solar panels, and 0 represented all other objects. These binary masks were converted into YOLOv8-compatible .txt files, which specify the location and dimensions of bounding boxes that encapsulate the solar panels. For the physically based simulated data, we experimented with various labeling approaches, including treating each solar panel as a separate label or grouping multiple panels into a single bounding box. Our results showed that creating individual bounding boxes around each solar panel yielded better performance, so we adopted that method. In contrast, the AI-generated data did not require this preprocessing step, as its GT was already provided in YOLOv8’s bounding box format, eliminating the need for mask conversion. Figure 6 illustrates examples of bounding boxes for real, AI-generated, and physically based simulated data.
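The mask-to-label conversion can be sketched as follows. This is an illustrative reconstruction (file names and the connected-component strategy are assumptions), showing how each panel in a binary mask becomes one normalized YOLO bounding-box line.

```python
# Hedged sketch: convert a binary PV mask into YOLO-format labels
# ("class x_center y_center width height", normalized to [0, 1]).
import cv2
import numpy as np

def mask_to_yolo_labels(mask_path, label_path, class_id=0):
    mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
    h, w = mask.shape
    # One bounding box per connected component, i.e., per individual panel/array.
    num, labels, stats, _ = cv2.connectedComponentsWithStats((mask > 0).astype(np.uint8))
    lines = []
    for i in range(1, num):                                    # component 0 is the background
        x, y, bw, bh, _area = stats[i]
        lines.append(f"{class_id} {(x + bw / 2) / w:.6f} {(y + bh / 2) / h:.6f} "
                     f"{bw / w:.6f} {bh / h:.6f}")
    with open(label_path, "w") as f:
        f.write("\n".join(lines))

mask_to_yolo_labels("PV08_000001_label.bmp", "PV08_000001.txt")  # file names are illustrative
```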

3.3.3. Benchmarks and Experiments

All experiments were conducted on a Supermicro Superserver 4029GP-TRT rack system equipped with 2× Intel Xeon 4114 (10 cores, 20 threads), 256 GB DDR4 RAM, and 8× NVIDIA RTX 2080 (8 GB) GPUs, sourced from Super Micro Computer, Inc., Berkshire, England. Training was performed on a single GPU using the Ultralytics YOLOv8 implementation with CUDA 12.2 and cuDNN 8.9.7.
A YOLOv8 object detection model with YOLOv8-m weights was selected to balance accuracy and computational efficiency. Preliminary trials with YOLOv8-l showed only marginal performance gains at a higher computational cost; therefore, YOLOv8-m was adopted for all reported experiments. The input image size was fixed at 512 × 512 pixels.
To ensure a fair comparison across all scenarios, identical training hyperparameters were applied in all experiments. The model was trained for 100 epochs with a batch size of 16, using early stopping, YOLOv8-m weight initialization, and the default Ultralytics YOLOv8 optimization and augmentation settings for any parameters not explicitly listed.
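For reference, the training configuration described above corresponds roughly to the following Ultralytics call; the dataset YAML path and the early-stopping patience value are assumptions, as they are not stated explicitly in the text.

```python
# Minimal sketch of the training setup (100 epochs, batch 16, 512x512, YOLOv8-m weights).
from ultralytics import YOLO

model = YOLO("yolov8m.pt")                  # YOLOv8-m weight initialization
results = model.train(
    data="pv_panels.yaml",                  # hypothetical dataset config (train/val/test paths)
    epochs=100,
    batch=16,
    imgsz=512,
    patience=50,                            # early stopping; exact patience not stated in the paper
    device=0,                               # single GPU
)
```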
To evaluate the impact of incorporating synthetic data, three benchmarks were established to test the model’s performance. The first benchmark involved training and validating the model using only real data from the entire [31] dataset. Since PV08 has a resolution of 0.8 m, which closely matches the AI-generated data resolution (1 m), subsequent experiments focused exclusively on PV08.
The second benchmark used only PV08 data for training, validation, and testing. In the third benchmark, the dataset configuration was adjusted to include more test data while reducing the amount of training and validation data. Notably, in this setup, the available training data fell below the minimum number of images per label recommended by the YOLOv8 documentation (1500 images). To address this, synthetic data were strategically added to meet the minimum requirements, ensuring compliance with YOLOv8’s guidelines. This thoughtful integration of synthetic data demonstrates its potential utility in augmenting datasets and enhancing model training under constrained conditions.
To ensure transparency and reproducibility, the dataset partitioning strategy is explicitly defined for each benchmark. Benchmark 1 uses the full real dataset (3716 images) with an 80/10/10 split, resulting in 2973 training, 372 validation, and 371 test images. Benchmark 2 focuses exclusively on the PV08 subset (763 real images). Due to the spatial distribution and tiling constraints of this subset, 548 images are used for training, 138 for validation, and 77 for testing, corresponding to approximately 72/18/10, while preserving an independent real test set. Benchmark 3 employs the same 763 PV08 images but is explicitly designed as a data-constrained scenario with 200 training, 50 validation, and 513 test images, thereby increasing the proportion of real test samples and reducing the amount of real training data.
Experiments were conducted to evaluate the impact of adding synthetic data to the training and validation sets. Synthetic data were included in various configurations: added exclusively to the training set, exclusively to the validation set, or to both simultaneously. Importantly, synthetic data were never incorporated into the testing dataset, as the primary objective was to assess the model’s ability to detect solar panels in real-world scenarios. Table 2 outlines the seven distinct combinations of synthetic data integrated with the real EO data for these experiments.
When synthetic data are used, they are included only in the training and/or validation sets, following an 80/20 train–validation split, and are combined with the real data according to the scenarios defined in Table 2.
In this study, in addition to using the complete synthetic datasets (both AI-generated and physically based simulated data), the effects of incorporating only subsets of synthetic data were also examined. This approach aimed at understanding how varying the amount of synthetic data influenced the model’s performance while avoiding potential overfitting to synthetic characteristics. Overfitting could impair the model’s ability to generalize and accurately detect real solar panels.
Further experiments were conducted to evaluate alternative training strategies involving synthetic data. For example, the YOLOv8-m model was pre-trained using only synthetic data before being fine-tuned with real data to enhance its adaptability to real-world scenarios (Figure 7).

3.3.4. Evaluation Metrics

The performance of the models was assessed using precision, recall, and mean average precision (mAP). Precision is an indicator of prediction accuracy for positive instances. It is defined as the ratio of true positives (TP) to the sum of TP and false positives (FP), with values ranging from 0 to 1. A higher precision value means the model more accurately identifies positive cases, minimizing false positives. This metric, shown in Equation (1), reflects how often the model’s positive predictions are correct.
$$\mathrm{Precision} = \frac{TP}{TP + FP} \quad (1)$$
Recall measures the model’s effectiveness in identifying all relevant positive instances. Defined as the ratio of TP to the sum of TP and false negatives (FN), recall values range from 0 to 1. A higher recall indicates the model’s strength in capturing most actual positives, with fewer missed cases (FN). This metric, shown in Equation (2), reflects the model’s capacity to find all relevant positive instances.
$$\mathrm{Recall} = \frac{TP}{TP + FN} \quad (2)$$
mAP is a metric that combines precision and recall to assess the model’s overall detection accuracy. Specifically, mAP@0.50 evaluates precision at an intersection over union (IoU) threshold of 0.50, focusing on the model’s ability to accurately detect objects. Meanwhile, mAP@0.50:0.95 averages precision across a range of IoU thresholds (from 0.50 to 0.95), offering a comprehensive measure of detection performance. High mAP scores indicate that the model effectively balances precision and recall, achieving both accurate and thorough object detection.
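As a small worked illustration of these metrics (with hypothetical counts and per-threshold AP values, not results from this study):

```python
# Illustrative metric helpers; the inputs below are made-up example values.
def precision(tp, fp):
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    return tp / (tp + fn) if (tp + fn) else 0.0

def map_50_95(ap_per_iou):
    """mAP@0.50:0.95: mean of AP evaluated at IoU thresholds 0.50, 0.55, ..., 0.95."""
    return sum(ap_per_iou) / len(ap_per_iou)

print(precision(80, 10), recall(80, 20))   # 0.889 and 0.800 for these example counts
print(map_50_95([0.90, 0.88, 0.85, 0.82, 0.78, 0.72, 0.65, 0.55, 0.42, 0.25]))
```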

4. Results and Discussion

4.1. Benchmarks and Experiments Results

Table 3 presents the results from various experiments, highlighting benchmark outcomes as well as the impact of different combinations of AI-generated and/or physically based simulated data. Performance improvements with the addition of synthetic data are marked with a + sign, while declines in performance are marked with a − sign.
The benchmark results show that the model trained on the entire real EO dataset achieved higher performance compared to those trained only on PV08 data. Furthermore, the model trained with a reduced set of PV08 samples performed worse than the model trained with the full PV08 dataset. The difference in performance between the two models trained on PV08 highlights the importance of sufficient training data: the model with fewer PV08 samples had limited exposure to variations within the dataset, reducing its generalization ability. In this context, synthetic data can play a role by supplementing the dataset, enhancing the model’s exposure to diverse patterns.
All evaluations of adding synthetic data were performed as within-benchmark comparisons: for each scenario, models trained with synthetic data are assessed relative to their corresponding real-data baseline using the same architecture and protocol. Our conclusions therefore rely on changes in performance (Δ metric) within each benchmark, rather than on absolute metric differences between benchmarks, which may differ in difficulty and composition. For example, when we state that the largest improvement occurs in EX.12, this refers to its relative gain over its own benchmark after adding synthetic samples (e.g., to match YOLO’s recommended sample size), even though EX.12 still underperforms benchmarks 1 and 2 in absolute terms. This distinction ensures that the impact of synthetic data is evaluated fairly and independently of benchmark-specific characteristics.
When incorporating AI-generated and/or physically based simulated data alongside the real data, the findings suggest that an appropriate amount of synthetic data can enhance model performance. The ideal amount of synthetic data, however, depends on the distribution and volume of both the real and synthetic datasets. Finding the right balance is crucial, as it can boost certain metrics while potentially impacting others negatively. Reference [8] obtained similar findings when they evaluated the performance of a ResNet-50 classifier by adding different amounts of synthetic data generated from diffusion models to real data in the context of ImageNet classification. They found that the classifier’s performance is influenced by the quantity of synthetic data used to supplement the real data. They observed that in some experiments, accuracy improved slightly with a small amount of synthetic data but declined below the performance of models trained only on real data as the synthetic data approached the size of the real training set. In other cases with small image sizes as input, performance continued to improve as synthetic data increased up to nine times the amount of real data. However, the benefits of adding large amounts of synthetic data diminished. Across varying image sizes, they noted significant performance gains with fine-tuned diffusion models when synthetic data were up to 4–5 times the size of the real ImageNet training set [8]. This highlights the need for careful calibration to maximize the benefits of synthetic data without compromising overall model accuracy. Reference [26] explored the impact of incorporating synthetic data into varying proportions of real data for an object detection task focused on identifying aircraft in satellite imagery. They conducted three experiments where the real data constituted 10%, 20%, and 50% of the full training set. Consistent with our findings, they observed the greatest performance improvement when real data were most limited (10%). Similarly, among our three benchmarks, the most significant performance boost was achieved in the third benchmark, particularly when using the PV08 dataset combined with a smaller amount of real training data. Reference [42] conducted multiple experiments using different amounts of both real and synthetic data for outdoor swimmer localization with YOLO. They found that results varied depending on the quantity of each, highlighting the importance of generating synthetic images when real data are limited.
This highlights the effectiveness of synthetic data in enhancing performance when real data are scarce. The results from this study, alongside other studies [8,26,42], emphasize that a model’s performance is influenced not only by the quantity of synthetic data but also by the availability of real data. The interaction between the two plays a critical role, as the impact of synthetic data can vary depending on how much real data are used for training.
Furthermore, the results showed that adding synthetic data can positively affect some metrics while affecting others negatively. For example, in experiments EX.1 through EX.8 and EX.11, an increase in precision was sometimes accompanied by a decrease in recall, and vice versa. Precision increased while recall decreased in EX.1, EX.2, EX.4, and EX.7, whereas recall increased with a decrease in precision in EX.3, EX.5, EX.6, and EX.8. In some cases, both precision and recall increased, as observed in EX.9, EX.10, and EX.12. Reference [43] used synthetic data generated through parametric models to improve wafer-map defect classification. Their results showed that both precision and recall improved, although the increase in recall was notably greater than the improvement in precision. This suggests that incorporating synthetic data can have varying impacts on different performance metrics, highlighting the need to assess its effects across multiple dimensions of model performance.
The results also show that the distribution of synthetic data between training and validation influences the trade-off between precision and recall. In EX.6 and EX.11, where physically based synthetic data are included in the training set and validation remains real-only, recall increases while precision decreases, suggesting that additional synthetic samples make the detector more sensitive on real data. In contrast, EX.1 and EX.4 (physically based data in both training and validation) yield higher precision and lower recall, whereas EX.3 and EX.5 (AI-generated data at three times the real data in both training and validation) lead to lower precision and higher recall. These patterns indicate that different training/validation compositions of synthetic data modulate model behavior along the precision–recall spectrum rather than uniformly improving all metrics. Consequently, an effective balance between real and synthetic data is application-dependent and must often be determined empirically, as it varies with the study area, data characteristics, and the type and amount of synthetic data used.
In this study, the changes in precision and recall cannot be directly linked to the type of synthetic data (AI-generated or physically based simulated), as these metrics varied across experiments and benchmarks. This variability underscores the importance of incorporating synthetic data in diverse scenarios to gain a deeper understanding of how different types and amounts of synthetic data influence the model’s performance and to determine the most effective approach for improving accuracy.
The results also reveal that there was only one scenario where the addition of synthetic data had no impact on a metric: in EX.5, the mAP@50 score remained unchanged when AI-generated data, three times the amount of real data, were added with an 80–20 split for training and validation. In this case, the decrease of 2.2 in precision was offset by an increase of 2.1 in recall, suggesting that precision and recall balanced each other, which is reflected in the unchanged mAP@50, a metric that combines both precision and recall.
Notably, EX.2 was the only experiment where three out of four metrics declined, whereas EX.5 saw declines in two metrics, and all other experiments (excluding those where all metrics increased) had only one metric decline. This experiment also uniquely involved incorporating 100% AI-generated data into the training and/or validation datasets. This emphasizes the importance of carefully monitoring experiments to prevent the model from overfitting synthetic data, ensuring it performs well in real-world scenarios.
Among all the experiments, the largest increase in precision compared to the benchmark was observed in EX.7 (+2.9), while the greatest increases in recall, mAP@50, and mAP@95 occurred in EX.12. In EX.7, AI-generated data (twice the amount of real data) were incorporated only into the training set, while the validation set contained only real data. In contrast, EX.12 included both AI-generated and physically based simulated data in both the training and validation sets to meet the YOLOv8 documentation’s recommendation of a minimum of 1500 images per class (https://docs.ultralytics.com/modes/train/, accessed on 12 February 2024).
The experiments revealed that the best results among all experiments were achieved when synthetic data were used to fulfill the minimum image count per class as recommended for YOLOv8. This emphasizes that synthetic data can be customized to meet specific needs [44,45], such as ensuring the required quantity of data for optimal performance of deep learning models. Additionally, synthetic data can address other requirements, like balancing dataset classes, although this is not applicable for this study as we are dealing with a single class.
The most significant decrease in precision was observed in EX.6 (−4.8), where only physically based simulated data were added to the training set, with the validation set containing only real data. However, despite the drop in precision, all other metrics in EX.6 improved. EX.2, on the other hand, showed the largest decrease in recall compared to the benchmark.
In experiments EX.9, EX.10, and EX.12, all metrics improved with the addition of synthetic data. These experiments aimed to meet the minimum recommended number of images per class (≥1500) suggested by the Ultralytics YOLO documentation (https://docs.ultralytics.com/modes/train/, accessed on 12 February 2024). EX.12 achieves the best performance among the three, using both physically based and AI-generated data. Notably, EX.9, which includes AI-generated data in both the training and validation sets, outperforms EX.10, where AI-generated data are added only to the training set. On the other hand, when only physically based simulated data are added (EX.11) to meet the YOLOv8 recommendations, all metrics improve except for a decrease in precision.
In the four experiments where synthetic data were added to meet YOLOv8’s minimum image requirements per class, the improvement in recall was greater than the improvement in precision. Based on the formulas for precision and recall, this can be explained by the model’s tendency to predict a higher number of positive instances when synthetic data are added. This tendency leads to a larger reduction in FN than in FP.
Recall, defined as the ratio of TP to the sum of TP and FN, improves when FN decreases, meaning the model is correctly identifying more actual positive cases. Since synthetic data expanded the model’s exposure to varied positive instances, it became more likely to capture a broader range of true positives, thereby lowering FN more significantly. Precision, on the other hand, which is calculated as the ratio of TP to the sum of TP and FP, is more sensitive to the count of false positives. Although synthetic data did improve TP, it may have also led to an increase in FP. This increase in FP limited the precision gains, as the model may have classified more instances as positive, some of which were incorrect. Thus, the greater reduction in FN than FP resulted in a more notable enhancement in recall than in precision.
Overall, the experiments indicate that the effectiveness of synthetic data depends on how it is combined with real data across training and validation. For example, synthetic samples were beneficial when they compensated for limited real data or helped meet the recommended minimum number of images per class (EX.9–EX.12).
To visually assess the results, Figure 8 provides examples of visualizations of GT data compared with the model predictions using either real EO data or a combination of real and AI-generated synthetic data. Red circles highlight cases in which AI-generated synthetic data enabled the model to detect solar panels that would have been missed using only real EO data. Figure 9 presents comparison maps for EX.12, which achieved the largest performance gain. Compared to the benchmark, EX.12 produces denser and more continuous detections over large solar farms and complex industrial areas, reducing missed installations and improving the spatial coverage of detected arrays. In addition, some FPs from the benchmark are correctly suppressed in EX.12 (see the second row), indicating improved discrimination between PV and non-PV structures. Although a few minor over-detections remain, the overall visual patterns are consistent with the quantitative improvements reported in Table 3.

4.2. Test Using Synthetic Data for Transfer Learning

In another experiment, the YOLOv8-m model was initially pre-trained on synthetic data and later fine-tuned with real data. Specifically, a YOLOv8 model initialized with YOLOv8-m weights was first trained on synthetic data, and the resulting weights were then used to initialize training based solely on real data. Reference [46] employed a similar transfer learning approach for driver pose estimation, reporting a significant performance improvement with this method compared to the poor results observed when models were trained exclusively on synthetic data. Table 4 presents the results of the third benchmark, comparing transfer learning with and without synthetic data after 5 and 100 epochs of training. The synthetic data used in this test comprised both AI-generated and physically based simulated data.
The findings indicate that pre-training on synthetic data significantly enhanced performance during the early training stages, particularly within the first 5 epochs. However, as training extended to 100 epochs, the performance of both models began to converge, with the pre-trained model maintaining a slight advantage. This strategy accelerates the initial learning process, enabling the model to achieve high performance more quickly, an advantage in scenarios with constrained computational resources. Similarly, [47] found that leveraging synthetic data for transfer learning in time series classification reduced training time by 85% while maintaining equivalent performance.
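A hedged sketch of this two-stage strategy with the Ultralytics API is given below; the dataset YAML paths are placeholders, and the stage-1 weights path assumes the library’s default output layout.

```python
# Sketch of the two-stage strategy: pre-train on synthetic data, then fine-tune on real EO data.
from ultralytics import YOLO

# Stage 1: pre-training on synthetic (AI-generated + physically based simulated) data.
model = YOLO("yolov8m.pt")
model.train(data="pv_synthetic.yaml", epochs=100, batch=16, imgsz=512)

# Stage 2: fine-tuning on real EO data, initialized from the stage-1 weights.
pretrained = YOLO("runs/detect/train/weights/best.pt")   # default Ultralytics output path (assumed)
pretrained.train(data="pv_real.yaml", epochs=100, batch=16, imgsz=512)
```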

4.3. Tests with Consistent Training Steps

In all experiments within this study, identical hyperparameters were applied consistently (Table 5). This approach ensured that any performance differences across models could be attributed to the inclusion of synthetic data rather than to variations in hyperparameters.
Another factor considered for a fair comparison is the number of learning steps, which varied between models trained solely on real EO data and those incorporating synthetic data. The addition of synthetic data increased the number of learning steps, meaning that some of the performance gains may result from the extended training rather than from the synthetic data alone. This section aims to address this aspect in detail. Additional tests were performed to ensure an equal number of training steps for models using only real EO data and those incorporating both real and synthetic data. Two approaches were utilized: increasing batch size and decreasing the number of epochs.
Increasing batch size: To keep the training steps consistent, the batch size was raised while maintaining a fixed number of epochs. For instance, if a model using only real EO data had a batch size of 16 over 100 epochs, totaling 3600 steps, the model combining real and synthetic data would increase its batch size while keeping the 100 epochs, thus also achieving 3600 steps. However, this method has certain drawbacks:
  • Batch size is an important hyperparameter in YOLOv8, and changing it could influence model performance, making it unclear whether performance changes are due to the batch size or synthetic data.
  • Larger batch sizes demand more memory, and in several cases, the required batch size (e.g., 150 or 250) exceeded GPU capacity, limiting the practicality of this method.
Decreasing the number of epochs: In this approach, the batch size remained fixed, while the number of epochs was reduced to keep the total training steps consistent. For instance, with a batch size of 16, the number of epochs was lowered to stay within 3600 steps when synthetic data were added. The primary drawback of this method was that reducing epochs may impair model performance, particularly when there is increased data variation (as with the combined use of real and synthetic data), which often demands more training to fully capture and adapt to the diversity in the data.
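The step-matching logic behind both approaches can be illustrated with a short calculation; the image counts used here are hypothetical and only chosen to reproduce the 3600-step example above.

```python
# Illustrative calculation of the two step-matching strategies (numbers are hypothetical).
import math

def steps(n_images, batch, epochs):
    return math.ceil(n_images / batch) * epochs

target = steps(n_images=576, batch=16, epochs=100)             # 3600 steps, real data only

# Option 1: keep 100 epochs, enlarge the batch so the bigger real+synthetic set
# still yields ~3600 steps (may exceed GPU memory).
n_combined = 2300                                               # hypothetical real + synthetic count
batch_needed = math.ceil(n_combined * 100 / target)             # ~64 in this example
print(batch_needed, steps(n_combined, batch_needed, 100))

# Option 2: keep batch 16, reduce the epochs instead (risking under-training).
epochs_needed = target // math.ceil(n_combined / 16)            # ~25 in this example
print(epochs_needed, steps(n_combined, 16, epochs_needed))
```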
The two approaches to achieving the same number of training steps were applied to experiments that used Jiang PV08 for training, validation, and testing. The selected experiments for this purpose included EX.4, EX.5, EX.6, EX.7, and EX.8, chosen primarily because of their relatively lower memory requirements when batch sizes are increased compared to other experiments. The hyperparameters of each experiment for both approaches (increasing batch size and decreasing the number of epochs) are shown in Table 6 and Table 7, respectively.
The results for both approaches are provided in Table 8 and Table 9. When applying the first approach, the increase in batch size led to higher memory usage. As a result, the decision was made to initialize the training with YOLOv8-n instead of YOLOv8-m.
In the tests where the batch size was increased while keeping the number of epochs fixed, precision values consistently worsened across all data combinations, while recall values consistently improved. The mAP@50 and mAP@95 scores showed slight increases or decreases depending on the data combination. In contrast, when the number of epochs was reduced while maintaining the same batch size, most metrics showed a general decline, regardless of the data combination. However, exceptions were observed, such as an increase in recall (+1.7) when only physically based simulated data were used for both training and validation or just for training (EX.4.3 and EX.6.3) and an increase in precision (+2.7) when AI-generated synthetic data were used for training (EX.7.3).

4.4. Comparative Analysis of Synthetic Data Generation Approaches

Both physically based simulated and AI-generated synthetic datasets contributed to improving the performance of the YOLOv8 model when combined with real EO data. However, the two approaches differ considerably in terms of generation process, computational requirements, and the degree of control over image content and labeling.
Physically based simulation, implemented in Unity using physically based rendering (PBR) techniques, provides full control over the geometry, illumination, and materials of the scene. It allows the creation of pixel-perfect annotations and consistent, reproducible datasets. This approach ensures high spatial accuracy and precise ground truth information. In practice, the generation process proved to be efficient, as multiple scenes can be automatically produced and rendered with limited manual intervention. Its main dependency lies in the availability of 3D assets and sufficient computational resources for rendering.
By contrast, AI-generated data enable the synthesis of large numbers of high-quality, photorealistic images once the model is fine-tuned. The visual realism and diversity of the generated scenes are significantly higher than in the physically based approach. However, this method offers less geometric and semantic control, as object placement and illumination cannot be precisely predefined. Manual verification is often required to ensure annotation consistency, particularly when combining generated content with real EO data.
Table 10 summarizes the main advantages and limitations of both synthetic data generation approaches. In general, physically based data provided higher precision due to the accurate labeling and deterministic rendering pipeline, whereas AI-generated data improved recall by introducing a greater diversity of visual conditions. The combination of both approaches leveraged their complementary strengths, control and realism, resulting in the best overall performance observed in the experiments (see EX.12).
The inclusion of CycleGAN or conditional GAN baselines was not pursued, as reproducing competitive implementations and tuning them to achieve the level of control required in this study would involve an effort comparable to that invested in the diffusion-based pipeline, extending beyond the defined scope. Moreover, recent evidence indicates that diffusion models outperform or match GANs in sample fidelity and diversity, while offering more stable training and more effective conditioning mechanisms. In EO applications, diffusion-based augmentation is emerging as a promising alternative, whereas classic CycleGAN (designed primarily for unpaired domain translation) does not provide direct mechanisms for controlled object insertion, which is essential in the present study. Consequently, the methodology adopted prioritizes a synthesis framework aligned with the spatial control and quality requirements of the proposed approach.
Although both synthetic data generation approaches produced realistic and diverse scenes, we acknowledge that the architectural styles represented in some synthetic samples may differ from those found in the actual testing area of Jiangsu Province, China. The objective of this study was not to reproduce local architectural patterns faithfully but to assess the general impact of synthetic data on model performance. Nevertheless, future extensions of the physically based simulator and diffusion-based generation prompts will aim to incorporate regional architectural features and materials to improve domain alignment with specific EO targets.
While the results demonstrate that both types of synthetic data improved model performance when combined with real data, a dedicated feature-level analysis (e.g., t-SNE or embedding similarity) was not conducted in this study. Future work will include such visualization to further quantify the representational similarity between real and synthetic samples and to support a deeper understanding of their complementarity.

5. Conclusions

This study explored the impact of incorporating synthetic data, both AI-generated and physically based simulated, alongside real VHR satellite imagery on the performance of YOLOv8, a deep learning-based object detection model. The objective was to evaluate how this combination influenced the model's ability to detect photovoltaic panels in multi-resolution EO imagery. Three benchmarks were established using only real EO data, and experiments were conducted to evaluate the impact of adding synthetic data to the training and/or validation sets while excluding it from the testing set. These experiments also investigated how different quantities and proportions of real and synthetic data influenced the model's performance.
The results highlight the positive impact of synthetic data on YOLOv8 model performance, particularly when physically based and AI-generated synthetic data were combined in appropriate quantities. The experiments showed that the best results were obtained when synthetic data were used to meet the minimum image count per class recommended for YOLOv8, indicating that synthetic data can be used to fill concrete gaps in dataset size. Although the optimal mix of real and synthetic data may vary with data distribution and volume, the experiments indicated that using an appropriate amount typically improves performance. Moreover, incorporating synthetic data into the training or validation sets, or both, should be approached with care to avoid overfitting to synthetic features.
Additionally, pre-training the model with synthetic data before fine-tuning it with real EO data was found to accelerate the learning process, helping the model achieve high performance in a shorter time frame. This pre-training approach is especially useful when computational resources are limited, as it may allow models to converge more quickly, enabling faster deployment.
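A minimal sketch of this two-stage workflow is shown below, assuming the Ultralytics YOLO Python API; the dataset configuration files (synthetic_pv.yaml, real_pv.yaml) and the checkpoint path are placeholders, while the hyperparameters mirror those reported in Table 5.

```python
# Minimal sketch of pre-training on synthetic data and fine-tuning on real EO
# data with Ultralytics YOLOv8 (dataset YAMLs and checkpoint path are placeholders).
from ultralytics import YOLO

# Stage 1: pre-train on synthetic imagery only.
model = YOLO("yolov8m.pt")
model.train(data="synthetic_pv.yaml", epochs=100, imgsz=512, batch=16)

# Stage 2: fine-tune the pre-trained weights on real EO imagery.
finetuned = YOLO("runs/detect/train/weights/best.pt")
finetuned.train(data="real_pv.yaml", epochs=100, imgsz=512, batch=16)
```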
Future studies should aim for greater control over the generation of AI-generated data by incorporating additional guidance from OpenStreetMap and by considering a wider range of environmental and weather conditions. Furthermore, physically based simulated data, which offer more control over their generation, can help guide the creation of AI-generated data, allowing data generation to be tailored to the specific use case and study area. Additionally, exploring a broader range of AI models and techniques could provide deeper insights into the added value of synthetic data across different methods.
Future research could also include a controlled comparison between diffusion-based and GAN/CycleGAN-based augmentation approaches under identical data, computational, and control settings. Such an analysis would quantify the performance trade-offs and provide further insight into the relative advantages of each generative paradigm for synthetic data generation in EO applications.

Author Contributions

Conceptualization, Jesus Gimeno, David Miraut, Manolo Pérez-Aixendri, Marcos Fernández and Rossana Gini; Methodology, Jesus Gimeno, David Miraut, Manolo Pérez-Aixendri and Marcos Fernández; Software, Jesus Gimeno, David Miraut, Manolo Pérez-Aixendri, Marcos Fernández and Raúl Rodríguez; Validation, Enes Hisam, Jesus Gimeno, David Miraut, Manolo Pérez-Aixendri, Marcos Fernández, Rossana Gini, Raúl Rodríguez and Gabriele Meoni; Formal analysis, Enes Hisam; Investigation, Enes Hisam, Jesus Gimeno, David Miraut, Manolo Pérez-Aixendri and Marcos Fernández; Data curation, Enes Hisam; Writing—original draft, Enes Hisam, Jesus Gimeno, David Miraut, Manolo Pérez-Aixendri and Marcos Fernández; Writing—review & editing, Jesus Gimeno, David Miraut, Manolo Pérez-Aixendri, Marcos Fernández, Rossana Gini and Dursun Zafer Seker; Supervision, Rossana Gini, Gabriele Meoni and Dursun Zafer Seker; Project administration, Rossana Gini and Raúl Rodríguez; Funding acquisition, Jesus Gimeno, Marcos Fernández and Rossana Gini. All authors have read and agreed to the published version of the manuscript.

Funding

The work presented in this paper was carried out within the framework of the SD4EO project (Synthetic data for earth observation), which was funded by the European Space Agency (ESA)’s FutureEO program under contract No 4000142334/23/I-DT, and supervised by ESA’s Φ-lab.

Data Availability Statement

The data presented in this study are openly available in SD4EO: AI-based synthetic satellite Sentinel-2 images of cities and building coverture (RGB+NIR bands) at https://zenodo.org/records/14025435 (URL accessed on 16 January 2024).

Acknowledgments

The authors would like to thank Javier Portilla for his support during the research, and for the insightful discussions and valuable advice on generative-AI algorithm development.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
AI: Artificial intelligence
CNN: Convolutional neural network
CRS: Coordinate reference system
DL: Deep learning
EO: Earth observation
ESA: European Space Agency
FN: False negative
FP: False positive
GAN: Generative adversarial network
GT: Ground truth
ML: Machine learning
PBR: Physically based rendering
PV: Photovoltaic
RGB: Red-green-blue (color model)
SD4EO: Synthetic data for earth observation
TP: True positive
UAV: Unmanned aerial vehicle
VHR: Very high resolution

Appendix A

Figure A1. Appendix: training and validation curves for the first benchmark and corresponding experiments incorporating synthetic data. (a) Benchmark 1, (b) EX1, (c) EX2, and (d) EX3.
Figure A2. Appendix: training and validation curves for the second benchmark and corresponding experiments incorporating synthetic data. (a) Benchmark 2, (b) EX4, (c) EX5, (d) EX6, (e) EX7, and (f) EX8.
Figure A3. Appendix: training and validation curves for the third benchmark and corresponding experiments incorporating synthetic data. (a) Benchmark 3, (b) EX9, (c) EX10, (d) EX11, and (e) EX12.

References

  1. Jäger-Waldau, A. Snapshot of Photovoltaics−May 2023. EPJ Photovolt. 2023, 14, 23. [Google Scholar] [CrossRef]
  2. Wen, D.; Gao, W. Impact of Renewable Energy Policies on Solar Photovoltaic Energy: Comparison of China, Germany, Japan, and the United States of America. In Distributed Energy Resources: Solutions for a Low Carbon Society; Gao, W., Ed.; Springer International Publishing: Cham, Switzerland, 2023; pp. 43–68. ISBN 978-3-031-21097-6. [Google Scholar]
  3. Pourasl, H.H.; Barenji, R.V.; Khojastehnezhad, V.M. Solar Energy Status in the World: A Comprehensive Review. Energy Rep. 2023, 10, 3474–3493. [Google Scholar] [CrossRef]
  4. He, K.; Zhang, L. Automatic Detection and Mapping of Solar Photovoltaic Arrays with Deep Convolutional Neural Networks in High Resolution Satellite Images. In Proceedings of the 2020 IEEE 4th Conference on Energy Internet and Energy System Integration (EI2), Wuhan, China, 30 October–1 November 2020; pp. 3068–3073. [Google Scholar]
  5. Clark, C.N.; Pacifici, F. A Solar Panel Dataset of Very High Resolution Satellite Imagery to Support the Sustainable Development Goals. Sci. Data 2023, 10, 636. [Google Scholar] [CrossRef] [PubMed]
  6. Kasmi, G.; Saint-Drenan, Y.-M.; Trebosc, D.; Jolivet, R.; Leloux, J.; Sarr, B.; Dubus, L. A Crowdsourced Dataset of Aerial Images with Annotated Solar Photovoltaic Arrays and Installation Metadata. Sci. Data 2023, 10, 59. [Google Scholar] [CrossRef]
  7. Villalobos, P.; Ho, A.; Sevilla, J.; Besiroglu, T.; Heim, L.; Hobbhahn, M. Will We Run out of Data? Limits of LLM Scaling Based on Human-Generated Data. arXiv 2024, arXiv:2211.04325. [Google Scholar]
  8. Azizi, S.; Kornblith, S.; Saharia, C.; Norouzi, M.; Fleet, D.J. Synthetic Data from Diffusion Models Improves ImageNet Classification. arXiv 2023, arXiv:2304.08466. [Google Scholar] [CrossRef]
  9. Burg, M.F.; Wenzel, F.; Zietlow, D.; Horn, M.; Makansi, O.; Locatello, F.; Russell, C. Image Retrieval Outperforms Diffusion Models on Data Augmentation. arXiv 2023, arXiv:2304.10253. [Google Scholar] [CrossRef]
  10. Luzi, L.; Mayer, P.M.; Casco-Rodriguez, J.; Siahkoohi, A.; Baraniuk, R.G. Boomerang: Local Sampling on Image Manifolds Using Diffusion Models. arXiv 2024, arXiv:2210.12100. [Google Scholar] [CrossRef]
  11. Veselovsky, V.; Ribeiro, M.H.; West, R. Artificial Artificial Artificial Intelligence: Crowd Workers Widely Use Large Language Models for Text Production Tasks. arXiv 2023, arXiv:2306.07899. [Google Scholar] [CrossRef]
  12. Bansal, H.; Grover, A. Leaving Reality to Imagination: Robust Classification via Generated Datasets. arXiv 2023, arXiv:2302.02503. [Google Scholar] [CrossRef]
  13. Dai, H.; Liu, Z.; Liao, W.; Huang, X.; Cao, Y.; Wu, Z.; Zhao, L.; Xu, S.; Liu, W.; Liu, N.; et al. AugGPT: Leveraging ChatGPT for Text Data Augmentation. arXiv 2023, arXiv:2302.13007. [Google Scholar] [CrossRef]
  14. He, R.; Sun, S.; Yu, X.; Xue, C.; Zhang, W.; Torr, P.; Bai, S.; Qi, X. Is Synthetic Data from Generative Models Ready for Image Recognition? arXiv 2023, arXiv:2210.07574. [Google Scholar] [CrossRef]
  15. Lin, S.; Wang, K.; Zeng, X.; Zhao, R. Explore the Power of Synthetic Data on Few-Shot Object Detection. arXiv 2023, arXiv:2303.13221. [Google Scholar] [CrossRef]
  16. Shipard, J.; Wiliem, A.; Thanh, K.N.; Xiang, W.; Fookes, C. Diversity is Definitely Needed: Improving Model-Agnostic Zero-Shot Classification via Stable Diffusion. arXiv 2023, arXiv:2302.03298. [Google Scholar]
  17. Xu, C.; Guo, D.; Duan, N.; McAuley, J. Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data. arXiv 2023, arXiv:2304.01196. [Google Scholar]
  18. Klemp, M.; Rösch, K.; Wagner, R.; Quehl, J.; Lauer, M. LDFA: Latent Diffusion Face Anonymization for Self-Driving Applications. arXiv 2023, arXiv:2302.08931v1. [Google Scholar] [CrossRef]
  19. Packhäuser, K.; Folle, L.; Thamm, F.; Maier, A. Generation of Anonymous Chest Radiographs Using Latent Diffusion Models for Training Thoracic Abnormality Classification Systems. In Proceedings of the 2023 IEEE 20th International Symposium on Biomedical Imaging (ISBI), Cartagena de Indias, Colombia, 17–21 April 2023; pp. 1–5. [Google Scholar]
  20. Veeravasarapu, V.S.R.; Rothkopf, C.; Ramesh, V. Model-Driven Simulations for Deep Convolutional Neural Networks. arXiv 2016, arXiv:1605.09582. [Google Scholar] [CrossRef]
  21. Veeravasarapu, V.; Rothkopf, C.; Visvanathan, R. Model-Driven Simulations for Computer Vision. In Proceedings of the 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), Santa Rosa, CA, USA, 27–29 March 2017; pp. 1063–1071. [Google Scholar]
  22. Savva, M.; Kadian, A.; Maksymets, O.; Zhao, Y.; Wijmans, E.; Jain, B.; Straub, J.; Liu, J.; Koltun, V.; Malik, J.; et al. Habitat: A Platform for Embodied AI Research. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 9339–9347. [Google Scholar]
  23. Schwarz, M.; Behnke, S. Stillleben: Realistic Scene Synthesis for Deep Learning in Robotics. In Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France, 31 May–31 August 2020; pp. 10502–10508. [Google Scholar]
  24. Zhang, Y.; Song, S.; Yumer, E.; Savva, M.; Lee, J.-Y.; Jin, H.; Funkhouser, T. Physically-Based Rendering for Indoor Scene Understanding Using Convolutional Neural Networks. arXiv 2017, arXiv:1612.07429. [Google Scholar] [CrossRef]
  25. Schneegans, S.; Meyran, T.; Ginkel, I.; Zachmann, G.; Gerndt, A. Physically Based Real-Time Rendering of Atmospheres Using Mie Theory. Comput. Graph. Forum 2024, 43, e15010. [Google Scholar] [CrossRef]
  26. Liu, W.; Luo, B.; Liu, J. Synthetic Data Augmentation Using Multiscale Attention CycleGAN for Aircraft Detection in Remote Sensing Images. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5. [Google Scholar] [CrossRef]
  27. Howe, J.; Pula, K.; Reite, A.A. Conditional Generative Adversarial Networks for Data Augmentation and Adaptation in Remotely Sensed Imagery. In Proceedings of the Applications of Machine Learning, San Diego, CA, USA, 6 September 2019; p. 13. [Google Scholar]
  28. Clement, N.; Schoen, A.; Boedihardjo, A.; Jenkins, A. Synthetic Data and Hierarchical Object Detection in Overhead Imagery. ACM Trans Multimed. Comput. Commun. Appl. 2024, 20, 117. [Google Scholar] [CrossRef]
  29. Guo, J.; Lei, B.; Ding, C.; Zhang, Y. Synthetic Aperture Radar Image Synthesis by Using Generative Adversarial Nets. IEEE Geosci. Remote Sens. Lett. 2017, 14, 1111–1115. [Google Scholar] [CrossRef]
  30. Abady, L.; Barni, M.; Garzelli, A.; Tondi, B. GAN Generation of Synthetic Multispectral Satellite Images. In Proceedings of the Image and Signal Processing for Remote Sensing XXVI, SPIE, Online, 20 September 2020; Volume 11533, pp. 122–133. [Google Scholar]
  31. Jiang, H.; Yao, L.; Lu, N.; Qin, J.; Liu, T.; Liu, Y.; Zhou, C. Multi-Resolution Dataset for Photovoltaic Panel Segmentation from Satellite and Aerial Imagery. Earth Syst. Sci. Data 2021, 13, 5389–5401. [Google Scholar] [CrossRef]
  32. Yang, Y.-L.; Wang, J.; Vouga, E.; Wonka, P. Urban Pattern: Layout Design by Hierarchical Domain Splitting. ACM Trans. Graph. 2013, 32, 181. [Google Scholar] [CrossRef]
  33. Gimeno Sancho, J.; Fernandez Marín, M.; García Blaya, A.; Pérez Folgado, I.; Casanova-Salas, P.; Gini, R.; Fernández Guirao, A.; Pérez Aixendri, M. SD4EO—Physically Based Rendering Images of Human Settlements for Solar Panel Detection. Zenodo 2024. [Google Scholar] [CrossRef]
  34. Ho, J.; Jain, A.; Abbeel, P. Denoising Diffusion Probabilistic Models. arXiv 2020, arXiv:2006.11239v2. [Google Scholar] [CrossRef]
  35. Rombach, R.; Blattmann, A.; Lorenz, D.; Esser, P.; Ommer, B. High-Resolution Image Synthesis with Latent Diffusion Models. arXiv 2021, arXiv:2112.10752. [Google Scholar]
  36. Zhang, L.; Rao, A.; Agrawala, M. Adding Conditional Control to Text-to-Image Diffusion Models. arXiv 2023, arXiv:2302.05543. [Google Scholar]
  37. Data Augmentation in Earth Observation: A Diffusion Model Approach. Available online: https://www.mdpi.com/2078-2489/16/2/81 (accessed on 12 November 2025).
  38. Dhariwal, P.; Nichol, A. Diffusion Models Beat GANs on Image Synthesis. arXiv 2021, arXiv:2105.05233. [Google Scholar] [CrossRef]
  39. Fang, H.; Han, B.; Zhang, S.; Zhou, S.; Hu, C.; Ye, W.-M. Data Augmentation for Object Detection via Controllable Diffusion Models. In Proceedings of the 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 4–8 January 2024; pp. 1257–1266. [Google Scholar]
  40. Zhu, J.-Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired Image-To-Image Translation Using Cycle-Consistent Adversarial Networks. arXiv 2017, arXiv:1703.10593. [Google Scholar]
  41. Terven, J.; Cordova-Esparza, D. A Comprehensive Review of YOLO Architectures in Computer Vision: From YOLOv1 to YOLOv8 and YOLO-NAS. Mach. Learn. Knowl. Extr. 2023, 5, 1680–1716. [Google Scholar] [CrossRef]
  42. Khan Mohammadi, M.; Schneidereit, T.; Mansouri Yarahmadi, A.; Breuß, M. Investigating Training Datasets of Real and Synthetic Images for Outdoor Swimmer Localisation with YOLO. AI 2024, 5, 576–593. [Google Scholar] [CrossRef]
  43. Alqudah, R.; Al-Mousa, A.A.; Abu Hashyeh, Y.; Alzaibaq, O.Z. A Systemic Comparison between Using Augmented Data and Synthetic Data as Means of Enhancing Wafermap Defect Classification. Comput. Ind. 2023, 145, 103809. [Google Scholar] [CrossRef]
  44. Liu, R.; Wei, J.; Liu, F.; Si, C.; Zhang, Y.; Rao, J.; Zheng, S.; Peng, D.; Yang, D.; Zhou, D.; et al. Best Practices and Lessons Learned on Synthetic Data. arXiv 2024, arXiv:2404.07503. [Google Scholar] [CrossRef]
  45. Przystupa, M.; Abdul-Mageed, M. Neural Machine Translation of Low-Resource and Similar Languages with Backtranslation. In Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2), Florence, Italy, 1–2 August 2019; Bojar, O., Chatterjee, R., Federmann, C., Fishel, M., Graham, Y., Haddow, B., Huck, M., Yepes, A.J., Eds.; Association for Computational Linguistics: Florence, Italy, 2019; pp. 224–235. [Google Scholar]
  46. Sagmeister, D.; Schörkhuber, D.; Nezveda, M.; Stiedl, F.; Schimkowitsch, M.; Gelautz, M. Transfer Learning for Driver Pose Estimation from Synthetic Data. In Proceedings of the 2023 IEEE Intelligent Vehicles Symposium (IV), Anchorage, AK, USA, 4–7 June 2023; pp. 1–7. [Google Scholar]
  47. Rotem, Y.; Shimoni, N.; Rokach, L.; Shapira, B. Transfer Learning for Time Series Classification Using Synthetic Data Generation. In Proceedings of the Cyber Security, Cryptology, and Machine Learning, Be’er Sheva, Israel, 19–20 December 2024; Dolev, S., Katz, J., Meisels, A., Eds.; Springer International Publishing: Cham, Switzerland, 2022; pp. 232–246. [Google Scholar]
Figure 1. Example of dataset images (a) PV01, (b) PV03, (c) PV08 [31].
Figure 2. Sample of label images from different human settlement distributions, generated automatically with random variations in the same zones defined by the user. Black lines are roads. Red lines are railroads.
Figure 3. Example of physically based images generated: color image (a), label of each 3D element (b), and photovoltaic mask (c).
Figure 4. The generation of synthetic data based on a Stable Diffusion model with ControlNet, using OpenStreetMap as a guide.
Figure 5. Examples of AI-based synthetic images generated: color images (a,c) and photovoltaic binary masks (b,d).
Figure 6. Examples of images with the bounding boxes (red lines) around solar panels for (a) real EO data [31], (b) AI-generated synthetic data, and (c) physically based simulated data.
Figure 7. Use of synthetic data for transfer learning.
Figure 8. Examples of GT data compared with the model predictions using either only real EO data or a combination of real and AI-generated data. Blue boxes display non-oriented bounding boxes of detected solar panels. Red circles show the improvements when using AI-generated training data.
Figure 9. Examples of GT data compared with Benchmark 3 and EX.12. Blue boxes display non-oriented bounding boxes of detected solar panels. Red circles show improvements when using AI-generated training data.
Table 1. Reference [31]’s dataset grouped by the background.
Ground Sampling Distance | Ground | Rooftop | Total
PV01 (0.1 m) | 0 | 645 | 645
PV03 (0.3 m) | 2122 | 186 | 2308
PV08 (0.8 m) | 673 | 90 | 763
Total | – | – | 3716
Table 2. Synthetic data inclusion configurations in train/validation/test sets.
Iteration | Type of Synthetic Data | Included in Training Data | Included in Validation Data | Included in Testing Data
1 | AI-generated | Yes | Yes | No
2 | AI-generated | Yes | No | No
3 | AI-generated | No | Yes | No
4 | Phy-based | Yes | Yes | No
5 | Phy-based | Yes | No | No
6 | Phy-based | No | Yes | No
7 | Phy-based + AI-generated | Yes | Yes | No
Table 3. Evaluation metrics for the experiments conducted using different benchmarks and combinations of various amounts of synthetic data.
ID | Real EO data (Train. / Valid. / Test) | Synthetic data (Train. / Valid.) | Pr. [%] | Re. [%] | mAP50 [%] | mAP95 [%] | ΔPr. | ΔRe. | ΔmAP50 | ΔmAP95
Benchmark 1 | Jiang / Jiang / Jiang | None | 88.5 | 80.6 | 85.9 | 74.4 | – | – | – | –
EX.1 | As Benchmark 1 | Phy / Phy | 90.8 | 80.2 | 86.6 | 74.8 | +2.3 | −0.4 | +0.7 | +0.4
EX.2 | As Benchmark 1 | AI (100%) / AI (100%) | 90.9 | 79.3 | 85.5 | 74.3 | +2.4 | −1.3 | −0.4 | −0.1
EX.3 | As Benchmark 1 | AI (3 x real) / AI (3 x real) | 88.0 | 82.5 | 86.1 | 74.9 | −0.5 | +1.9 | +0.2 | +0.5
Benchmark 2 | Jiang PV08 / Jiang PV08 / Jiang PV08 | None | 84.1 | 67.1 | 74.0 | 60.3 | – | – | – | –
EX.4 | As Benchmark 2 | Phy / Phy | 84.6 | 66.3 | 74.7 | 60.9 | +0.5 | −0.8 | +0.7 | +0.6
EX.5 | As Benchmark 2 | AI (3 x real) / AI (3 x real) | 81.9 | 69.2 | 74.0 | 60.0 | −2.2 | +2.1 | 0.0 | −0.3
EX.6 | As Benchmark 2 | Phy / Only real | 79.3 | 70.6 | 75.4 | 61.5 | −4.8 | +3.5 | +1.4 | +1.2
EX.7 | As Benchmark 2 | AI (2 x real) / Only real | 87.0 | 66.2 | 74.9 | 60.6 | +2.9 | −0.9 | +0.9 | +0.3
EX.8 | As Benchmark 2 | Phy + AI (3 x real) / Phy + AI (3 x real) | 80.9 | 69.2 | 75.6 | 61.5 | −3.2 | +2.1 | +1.6 | +1.2
Benchmark 3 | Jiang PV08 / Jiang PV08 / Jiang PV08 (more test samples, less training and validation data) | None | 75.1 | 59.0 | 65.0 | 49.2 | – | – | – | –
EX.9 | As Benchmark 3 | AI (Yrec) / AI (Yrec) | 76.4 | 62.3 | 67.1 | 52.3 | +1.3 | +3.3 | +2.1 | +3.1
EX.10 | As Benchmark 3 | AI (Yrec) / Only real | 75.2 | 59.2 | 65.2 | 50.3 | +0.1 | +0.2 | +0.2 | +1.1
EX.11 | As Benchmark 3 | Phy / Only real | 73.2 | 60.8 | 65.1 | 50.4 | −1.9 | +1.8 | +0.1 | +1.2
EX.12 | As Benchmark 3 | Phy + AI (Yrec) / Phy + AI (Yrec) | 76.8 | 62.9 | 67.3 | 52.5 | +1.7 | +3.9 | +2.3 | +3.3
“Phy” stands for physically based simulated data, whilst “AI” refers to AI-generated synthetic data. “Pr.” stands for precision and “Re.” for recall. “Yrec” refers to the minimum number of images per class recommended by the Ultralytics YOLO documentation (≥1500 images per class).

The training and validation curves for the three benchmarks, along with the corresponding experiments incorporating synthetic data, are presented in Appendix A (Figure A1, Figure A2, and Figure A3). The results demonstrate that adding synthetic data improved the mAP@50 and mAP@95 scores in all experiments except EX.2 and EX.5, both of which used only AI-generated data. In general, an increase in mAP scores indicates that the model achieves a better balance between precision and recall; in this case, synthetic data helped the model perform better in most scenarios.
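As a concrete illustration of the “Yrec” criterion, the number of synthetic images required to top up a class can be computed directly from the per-class counts of real training images. The sketch below is a minimal Python example; the per-class count is a placeholder rather than the exact figure from the dataset used here.

```python
# Minimal sketch of the "Yrec" top-up logic; the real-image count below is a
# placeholder, not the actual number of PV training images in this study.
YREC = 1500  # minimum images per class recommended by the Ultralytics YOLO docs

real_counts = {"photovoltaic_panel": 900}  # hypothetical per-class count
synthetic_to_add = {cls: max(0, YREC - n) for cls, n in real_counts.items()}
print(synthetic_to_add)  # -> {'photovoltaic_panel': 600}
```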
Table 4. Results of experiments using transfer learning based on pre-training with synthetic data and later fine-tuning with real data.
No. of Epochs | Weights Initialization | Pr. [%] | Re. [%] | mAP50 [%] | mAP95 [%]
5 | With synthetic data included | 57.7 | 44.0 | 44.4 | 31.0
5 | Without synthetic data included | 14.7 | 54.5 | 10.8 | 7.1
100 | With synthetic data included | 76.1 | 61.1 | 66.6 | 50.7
100 | Without synthetic data included | 75.1 | 59.0 | 65.0 | 49.2
Table 5. Hyperparameters applied consistently in the study.
Hyperparameter | Value
Weights initialization | YOLOv8-m
Image size | 512
Batch size | 16
Number of epochs | 100
Early stopping | Yes
Table 6. Hyperparameters used in tests with increased batch sizes and fixed number of epochs.
Hyperparameter | Value
Weights initialization | YOLOv8-n
Image size | 512
Number of epochs | 100
Number of learning steps per epoch | 35–36
Batch size | Benchmark: 16; EX.4.2 and EX.6.2: 44; EX.5.2 and EX.8.2: 60; EX.7.2: 30
Table 7. Hyperparameters used in tests with decreased number of epochs and fixed batch size.
Hyperparameter | Value
Weights initialization | YOLOv8-m
Image size | 512
Number of epochs | Benchmark: 100; EX.4.3 and EX.6.3: 37; EX.5.3 and EX.8.3: 27; EX.7.3: 52
Number of learning steps per epoch | 35–36
Batch size | 16
Table 8. Evaluation metrics for the experiments with increased batch sizes and a fixed number of epochs.
ID | Real EO data (Train. / Valid. / Test) | Synthetic data (Train. / Valid.) | Pr. [%] | Re. [%] | mAP50 [%] | mAP95 [%] | ΔPr. | ΔRe. | ΔmAP50 | ΔmAP95
Benchmark | Jiang PV08 / Jiang PV08 / Jiang PV08 | None | 84.0 | 63.0 | 72.0 | 57.2 | – | – | – | –
EX.4.2 | As benchmark | Phy / Phy | 79.8 | 67.8 | 72.5 | 57.3 | −4.2 | +4.8 | +0.5 | +0.1
EX.5.2 | As benchmark | AI (3 x real) / AI (3 x real) | 81.7 | 66.9 | 72.5 | 58.4 | −2.3 | +3.9 | +0.5 | +1.2
EX.6.2 | As benchmark | Phy / Only real | 78.8 | 66.7 | 71.8 | 56.2 | −5.2 | +3.7 | −0.2 | −1.0
EX.7.2 | As benchmark | AI (2 x real) / Only real | 77.9 | 67.1 | 71.6 | 57.0 | −6.1 | +4.1 | −0.4 | −0.2
EX.8.2 | As benchmark | Phy + AI (3 x real) / Phy + AI (3 x real) | 80.8 | 65.7 | 71.5 | 56.6 | −3.2 | +2.7 | −0.5 | −0.6
Table 9. Evaluation metrics for the experiments with a decreased number of epochs and a fixed batch size.
ID | Real EO data (Train. / Valid. / Test) | Synthetic data (Train. / Valid.) | Pr. [%] | Re. [%] | mAP50 [%] | mAP95 [%] | ΔPr. | ΔRe. | ΔmAP50 | ΔmAP95
Benchmark | Jiang PV08 / Jiang PV08 / Jiang PV08 | None | 84.1 | 67.1 | 74.0 | 60.3 | – | – | – | –
EX.4.3 | As benchmark | Phy / Phy | 83.1 | 68.8 | 73.2 | 58.5 | −1.0 | +1.7 | −0.8 | −1.8
EX.5.3 | As benchmark | AI (3 x real) / AI (3 x real) | 78.8 | 66.3 | 71.8 | 57.4 | −5.3 | −0.8 | −2.2 | −2.9
EX.6.3 | As benchmark | Phy / Only real | 83.1 | 68.8 | 73.2 | 58.5 | −1.0 | +1.7 | −0.8 | −1.8
EX.7.3 | As benchmark | AI (2 x real) / Only real | 86.8 | 64.9 | 73.3 | 59.2 | +2.7 | −2.2 | −0.7 | −1.1
EX.8.3 | As benchmark | Phy + AI (3 x real) / Phy + AI (3 x real) | 79.8 | 66.7 | 71.0 | 56.4 | −4.3 | −0.4 | −3.0 | −3.9
“Phy” stands for physically based simulated data, whilst “AI” refers to the AI-generated synthetic data. “Pr.” stands for precision and “Re.” for recall.
Table 10. Comparison between physically based simulated and AI-generated synthetic data generation approaches.
Aspect | Physically based (Unity, PBR) | AI-generated (Stable Diffusion XL / DALL·E 3)
Control over geometry and annotations | Full and deterministic | Limited; requires manual verification
Visual realism | High; consistent illumination and materials | Very high; natural textures and diversity
Diversity of scenes | Limited by 3D assets and procedural rules | Very high; depends on prompt variety and seeds
Computational cost | Moderate; efficient rendering in practice | Moderate; GPU inference once fine-tuned
Scalability | Scalable via procedural scene generation | Highly scalable once the model is trained
Reproducibility | Fully reproducible | Partially stochastic
Best observed effect on model performance | Higher precision | Higher recall
