Article

Ensemble Deep Learning for Real–Bogus Classification with Sky Survey Images

by Pakpoom Prommool 1, Sirikan Chucherd 1,*, Natthakan Iam-On 2,* and Tossapon Boongoen 2,*

1 School of Applied Digital Technology, Mae Fah Luang University, Chiang Rai 57100, Thailand
2 Department of Computer Science, Aberystwyth University, Aberystwyth SY23 3FL, UK
* Authors to whom correspondence should be addressed.
Biomimetics 2025, 10(11), 781; https://doi.org/10.3390/biomimetics10110781
Submission received: 8 October 2025 / Revised: 6 November 2025 / Accepted: 14 November 2025 / Published: 17 November 2025

Abstract

The detection of the fifth gravitational wave, GW170817, and its electromagnetic counterpart—produced by the merger of two neutron stars and announced by the LIGO and Virgo collaborations—marked a major milestone in astronomy: it was the first time that gravitational waves and light from the same cosmic event were observed together. The LIGO detectors in the United States recorded the signal for about 100 s, longer than in previous detections. A neutron star merger emits both gravitational waves and electromagnetic radiation across the full spectrum, from radio to gamma rays. However, pinpointing the exact source remains difficult and requires rapid scanning of a wide region of sky. The Gravitational-Wave Optical Transient Observer (GOTO) project was established to address this challenge: it is specifically designed to detect the optical light from transient events associated with gravitational waves, enabling faster follow-up observations and deeper study of these short-lived astronomical phenomena. In modern astrophysics, detecting astronomical transients such as supernovae, gamma-ray bursts, and stellar flares has become increasingly important because they are linked to extreme cosmic processes. Identifying these short-lived events in the huge sky survey datasets produced by projects such as GOTO is, however, very difficult for traditional analysis methods. This study proposes a deep learning methodology employing Convolutional Neural Networks (CNNs) to enhance transient classification. CNNs are inspired by the structure and function of biological vision systems: they mimic the hierarchical way animal brains process visual information, making it possible to automatically detect complex spatial patterns in astronomical images. Transfer learning and fine-tuning of models pretrained on ImageNet emulate the adaptive learning observed in biological organisms, enabling rapid adaptation to new tasks with limited data. Data augmentation methods such as rotation, flipping, and noise injection mimic environmental variation to improve model generalization, while dropout and varied batch sizes help prevent overfitting, analogous to the redundancy and noise tolerance of biological systems. Ensemble learning strategies, such as Soft Voting and Weighted Voting, draw inspiration from collective intelligence in biological systems, integrating multiple CNN models to enhance decision-making robustness. Our findings indicate that this bio-inspired framework substantially improves the precision and dependability of transient detection, providing a scalable solution for real-time applications in extensive sky surveys such as GOTO.

1. Introduction

The detection of the fifth gravitational wave, known as GW170817 [1], and its electromagnetic counterpart—produced by the merger of two neutron stars—was announced by scientists from the LIGO and Virgo collaborations. With this, astronomy reached a major turning point. The gravitational waves were detected for over 100 s by the two LIGO detectors in Hanford and Livingston, USA, longer than the signals seen in the four preceding events. Gravitational waves are usually produced when massive objects such as neutron stars or black holes merge. When the merger involves two neutron stars, which are small and incredibly dense, it creates a longer-lasting gravitational wave signal. Electromagnetic waves across the entire spectrum, from radio waves to gamma rays, are released during the merger. Unfortunately, even the sophisticated LIGO and Virgo detectors cannot pinpoint the precise origin of gravitational waves; the source must be found by rapidly scanning and searching a wide region of the sky. The faster the source of the gravitational wave is located, the greater the chance of analyzing the transient event. The Gravitational-Wave Optical Transient Observer (GOTO) [2,3] project was conceived and designed in part to address this difficulty. If possible electromagnetic (EM) counterparts can be found more quickly after gravitational waves are detected, there are more opportunities to use follow-up observatories and satellites to examine the short-lived source and its surroundings. While this problem was less difficult for the GW170817 event because of its very accurate localization, the projected search region has been significantly larger for many subsequent events. The GOTO telescope project was specifically designed to find and detect optical phenomena related to gravitational wave observations. The observation of transient events—brief occurrences that arise and vanish rapidly—has become possible because of major advances in optical telescope technology in recent years. The term “transient” describes astronomical occurrences that last only a short time before disappearing.
HOTPANTS [4] is a long-standing and effective image subtraction tool used in astronomy to detect transient objects and brightness variations. However, it has key limitations affecting accuracy and flexibility. Over 50% of small stars often remain after subtraction because a single kernel cannot handle varying star sizes. Parameter tuning is also manual and non-generalized, requiring repeated adjustments of Gaussian functions and σ values for different datasets. Additionally, HOTPANTS performs poorly on high-noise images or when PSF differences are large, leading to incomplete subtraction and residual artifacts, and it is computationally slow for large surveys like GOTO and ZTF. To overcome these issues, researchers now employ deep learning approaches—particularly Convolutional Neural Networks (CNNs)—which can automatically learn object and background features without manual tuning, significantly improving subtraction accuracy and transient detection in real observations.
In modern astrophysics, detecting astronomical transients [5,6]—such as supernovae, gamma-ray bursts, and stellar flares—is crucial for understanding high-energy cosmic events like neutron star mergers and black hole collisions, which reveal fundamental physical laws beyond laboratory replication. However, identifying these short-lived phenomena is challenging, especially in large-scale surveys like GOTO, which captures over 400 sky images nightly, each with more than 20,000 celestial objects [7]. Manual inspection is impractical, necessitating AI systems capable of biologically inspired learning and decision-making.
The concept of biomimetics underpins this study, combining mechanisms inspired by natural intelligence. CNNs mimic hierarchical visual processing in the brain [8], while transfer learning emulates adaptive knowledge transfer by fine-tuning pretrained models (e.g., VGGNet, ResNet, Inception, and Xception) for small or imbalanced astronomical datasets [9]. Data augmentation (rotation, flipping, and noise) models environmental adaptation [10], and dropout regularization imitates biological redundancy to prevent overfitting [11,12]. Inspired by swarm intelligence, ensemble learning (Soft and Weighted Voting) integrates multiple CNNs for more stable and noise-tolerant predictions [13,14,15].
This bio-inspired framework has achieved high accuracy and robustness in classifying transients, making it suitable for real-time alert systems like GOTO. Future work includes using Generative Adversarial Networks (GANs) to create synthetic data for underrepresented classes, mimicking the brain’s imagination process. As shown in [16,17], GAN-based augmentation enhances the classification of rare variable stars. Overall, this integrated biomimetic approach—combining visual learning, adaptive generalization, ensemble decision-making, and synthetic data generation—offers a scalable and precise solution for next-generation astronomical surveys.
The main contributions of this study can be summarized as follows:
  • We propose an ensemble deep learning framework that integrates multiple pretrained CNN architectures to enhance accuracy and robustness in astronomical transient detection.
  • We evaluate both transfer learning and fine-tuning strategies under diverse data augmentation settings (Original, Rotation, Noise, HFlip, and VFlip) and batch sizes ranging from 32 to 256.
  • We demonstrate that the proposed ensemble approach significantly reduces false positives and improves detection reliability compared with individual CNN models.
  • We analyze the architectural differences among the best-performing models (e.g., depthwise separable convolutions in MobileNet and Inception-style modules in Xception) to explain why specific networks perform better under certain augmentation conditions.
  • We provide insights that contribute to the development of scalable and generalizable deep learning solutions for real–bogus classification in wide-field sky surveys such as GOTO.
The remainder of this study is organized as follows: Section 2 presents the materials and methods, including the system overview, dataset, data acquisition process, modeling procedures, and evaluation methods. Section 3 reports the experimental results, consisting of preliminary experiments using various models under different data augmentation settings, as well as the main experiment based on the proposed ensemble deep learning approach. Section 4 provides an in-depth discussion of all experimental results presented in the previous sections. Section 5 presents the conclusions of this study, summarizing the key findings and implications. Section 6, Future Work, outlines potential directions for extending and improving this research.

2. Materials and Methods

2.1. System Overview

In Figure 1, nine deep learning models—DenseNet121, InceptionV3, MobileNet, MobileNetV2, ResNet50, ResNet101, VGG16, VGG19, and Xception—were selected to evaluate their performance. The original data in FITS format were normalized to a range between 0 and 1 and expanded using four data augmentation techniques—noise, rotation, vertical flip (VFlip), and horizontal flip (HFlip)—to increase data diversity and improve model generalization.
After data augmentation, the expanded dataset and original images were used to train all nine models using the transfer learning approach with pretrained ImageNet weights. The data were divided into training, validation, and testing sets. Each model was then fine-tuned to enhance the performance of deeper layers unaffected by the transfer process. After training and fine-tuning, each model’s performance under each augmentation type was evaluated using the validation set, and the best-performing model from each group was selected for the final stage—ensemble deep learning.

2.2. Dataset

Figure 2 shows examples from the transient discovery dataset, which astronomy experts divided into two classes, real and bogus; both real and bogus images are 21 × 21 pixels. There are 523 real images and 3578 bogus images [18].

2.2.1. Data Acquisition

In this preliminary study, we focused on analyzing simulated images generated for the GOTO telescope instead of using real observational data. The advantage of simulated data lies in knowing true transient sources beforehand, making it ideal for testing supervised machine learning methods, though it may not fully capture the complexity of real observations. The simulated images were created using SkyMaker version 3.3.3 [19], which reproduces typical observational effects such as background noise and the Point Spread Function (PSF). The source list was compiled from two catalogs—SDSS for faint stars and galaxies and UCAC for bright stars (magnitude < 17)—combining both to extend the dynamic range. Based on the telescope’s field of view, each source’s RA/Dec was converted into pixel coordinates using a scale of 1.24 arcseconds per pixel, with G-band magnitudes from SDSS and V-band magnitudes from UCAC. Galaxies were modeled as disks only, omitting bulges, since the study focused on transient sources. SkyMaker required a configuration file, defining instrument properties such as the photometric zeropoint (23.5), PSF size, CCD dimensions (8176 × 6132 pixels), and pixel saturation level (65,535). The PSF FWHM was randomly varied between 0.8 and 3 arcseconds to increase realism. Two images were generated for each sky field, with new sources of magnitude 14–19 added randomly in the second image to simulate transient events. All simulated images were processed using a modified LSST software stack version 21.0.0 [20], and the image differencing outputs were then used as inputs for the machine learning algorithms.
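To make the coordinate-conversion step concrete, the following is a minimal sketch—not the exact procedure used to build the simulations—of mapping catalog RA/Dec positions onto CCD pixels at the 1.24 arcsec/pixel scale and 8176 × 6132 pixel geometry quoted above. The tangent-plane projection, reference pixel, field center, and the use of astropy are illustrative assumptions.

```python
import numpy as np
from astropy.wcs import WCS

# Illustrative sketch: convert catalog RA/Dec to pixel coordinates at a
# GOTO-like plate scale of 1.24 arcsec/pixel on an 8176 x 6132 CCD.
# The projection type, reference pixel, and field center are assumptions.
w = WCS(naxis=2)
w.wcs.ctype = ["RA---TAN", "DEC--TAN"]          # gnomonic (tangent-plane) projection
w.wcs.crpix = [8176 / 2, 6132 / 2]              # reference pixel at the CCD center
w.wcs.crval = [150.0, 2.0]                      # assumed field center (RA, Dec in degrees)
w.wcs.cdelt = [-1.24 / 3600.0, 1.24 / 3600.0]   # 1.24 arcsec/pixel, RA axis mirrored

# Example catalog positions (degrees); in practice these come from SDSS/UCAC.
ra = np.array([150.01, 149.98])
dec = np.array([2.02, 1.97])
x, y = w.wcs_world2pix(ra, dec, 0)              # 0 = zero-based pixel convention
print(np.column_stack([x, y]))
```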

2.2.2. Data Preparation

The dataset used in this study comprises 4101 instances, consisting of 523 labeled as “real” and 3578 labeled as “bogus.” The data were randomly split into three subsets; for each split, the training set contained 3280 samples (2862 “bogus” and 418 “real”), while the testing set contained 821 samples (716 “bogus” and 105 “real”). Notably, the “real” data also included simulated instances. Figure 2 presents several representative examples in image format. The original data were in FITS format, and they were subsequently normalized to a range of 0 to 1 using the equations shown below. Here, arr refers to the array of one image, arr1 is obtained by subtracting the minimum value of arr from each element in arr, and the final result (arr2) is obtained by dividing arr1 by the maximum value of arr1.
$$arr_{1} = arr - \min(arr)$$
$$arr_{2} = \frac{arr_{1}}{\max(arr_{1})}$$
In addition, the dataset exhibits a clear class imbalance, where the number of “real” images is significantly smaller than that of the “bogus” images, at an approximate ratio of 1:7, as shown in Table 1. To address this issue, oversampling techniques were applied to increase the number of training samples in both classes using data augmentation. Each augmentation method (e.g., noise injection, image rotation, horizontal flipping, vertical flipping) was applied independently to avoid the confounding effects of combined transformations. Finally, all images were resized to 224 × 224 pixels to ensure compatibility with the input requirements of ImageNet-based architectures during the transfer learning process.
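The normalization and resizing steps can be summarized in a short sketch, assuming FITS stamps are read with astropy and resized with TensorFlow; the file path, channel replication, and resize method are illustrative assumptions rather than the exact preprocessing code.

```python
import numpy as np
from astropy.io import fits
import tensorflow as tf

def load_and_normalize(path):
    """Minimal sketch of the min-max normalization described above:
    arr1 = arr - min(arr); arr2 = arr1 / max(arr1), giving values in [0, 1].
    The resize/replication step prepares the stamp for ImageNet-pretrained CNNs."""
    arr = fits.getdata(path).astype(np.float32)   # 21 x 21 difference-image stamp
    arr1 = arr - arr.min()
    arr2 = arr1 / arr1.max()                      # assumes the stamp is not constant
    # Upsample to 224 x 224 and replicate to 3 channels (assumed strategy).
    img = tf.image.resize(arr2[..., np.newaxis], [224, 224])
    return tf.repeat(img, repeats=3, axis=-1)
```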

2.3. Data Augmentation

Data augmentation plays a crucial role in deep learning [21,22], as it increases data diversity without needing additional real-world observations—an advantage in astronomy, where data collection is limited by time and resources [23]. By transforming images in different ways, the dataset grows, reducing overfitting and improving generalization. In this study, several augmentation methods were used to match the characteristics of astronomical images. Noise injection adds random “salt-and-pepper” noise to simulate sensor errors, weather effects, or background noise, helping the model detect real sources under poor image quality. Rotation (e.g., 90° or 180°) mimics changes in telescope orientation, while vertical (VFlip) and horizontal flipping (HFlip) simulate variations caused by different imaging angles. These transformations make the model invariant to position, orientation, and noise, thereby enhancing accuracy, robustness, and adaptability in astronomical image classification. Figure 3 shows an example of data augmentation.
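As an illustration of how these four augmentations might be applied independently, the sketch below uses plain NumPy; the salt-and-pepper fraction and the restriction to 90°/180°/270° rotations are assumptions, not the paper’s exact settings.

```python
import numpy as np

rng = np.random.default_rng(42)
stamp = rng.random((21, 21))   # stands in for a normalized difference-image stamp

def add_salt_and_pepper(img, fraction=0.02):
    """Randomly set a small fraction of pixels to 0 or 1 (fraction is assumed)."""
    out = img.copy()
    mask = rng.random(img.shape) < fraction
    out[mask] = rng.choice([0.0, 1.0], size=mask.sum())
    return out

def rotate90(img, k=1):
    """Rotate by multiples of 90 degrees (k=1 -> 90, k=2 -> 180)."""
    return np.rot90(img, k=k)

def hflip(img):
    return np.fliplr(img)

def vflip(img):
    return np.flipud(img)

# Each transform is applied independently to the normalized stamp, producing
# separate augmented datasets rather than chained transformations.
augmented = {
    "noise": add_salt_and_pepper(stamp),
    "rotation": rotate90(stamp, k=rng.integers(1, 4)),
    "hflip": hflip(stamp),
    "vflip": vflip(stamp),
}
```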

2.4. Modeling

Convolutional Neural Networks (CNNs) [24,25] are powerful deep learning tools for image classification because they learn hierarchical spatial features, making them highly effective at identifying celestial objects through structural and light distribution analysis. Over the past decade, various CNN architectures have been developed, each differing in depth, parameters, and computational efficiency. In this study, we selected nine architectures to evaluate their performance in classifying astronomical transient images: VGG16, VGG19, ResNet50, ResNet101, InceptionV3, Xception, DenseNet121, MobileNet, and MobileNetV2. Since deep learning typically requires large datasets, the limited sample size posed a challenge. To address this, transfer learning was used to leverage pretrained ImageNet weights for efficient feature extraction and better generalization. This experiment thus serves as a preliminary adaptation of ImageNet-based transfer learning for sky survey images in transient detection, conducted on a system with an Intel Core i9-10900 K CPU, RTX 2080 Ti GPU, and 64 GB DDR4 RAM.
These architectures consist of the following:
  • DenseNet121 [26]: Utilizes a dense connectivity mechanism, where each layer receives input from all preceding layers. This promotes feature reuse and alleviates the vanishing gradient problem.
  • InceptionV3 [27]: Employs factorized convolutions and efficient dimensionality reduction, enabling deeper networks with lower computational cost.
  • MobileNet [28]: Designed for mobile and embedded systems, this architecture uses depthwise separable convolutions to significantly reduce computational complexity.
  • MobileNetV2 [29]: An extension of MobileNet, this version introduces inverted residual blocks, enhancing learning capacity while maintaining model compactness.
  • ResNet50 and ResNet101 [30]: Implement shortcut connections or identity mappings to combat the vanishing gradient issue and enable effective training of very deep networks.
  • VGG16 and VGG19 [31]: Feature a simple and sequential architecture composed of stacked convolutional layers with fixed kernel sizes, known for their consistency and reliability.
  • Xception [32]: Evolved from the Inception architecture by replacing all modules with depthwise separable convolutions, offering improved efficiency in extracting fine-grained features.
All nine CNN architectures were trained and fine-tuned using the same hyperparameters (Table 2). Four batch sizes—32, 64, 128, and 256—were tested to evaluate their impact on convergence and generalization. Smaller batches (e.g., 32) generally improve generalization, while larger ones may overfit [33,34]. Training was limited to 100 epochs with Early Stopping (patience = 3) to prevent overtraining [35]. The Adam optimizer was used for its adaptive learning rate and fast convergence in noisy conditions [36]. The binary cross-entropy loss function was applied since the task involves distinguishing between two classes: real and bogus. The initial learning rate for the transfer learning phase was set to 0.001, while a lower rate of 0.00001 was used during fine-tuning to ensure stable updates in deeper layers and preserve pretrained features [37,38]. During fine-tuning, only the top 30% of convolutional layers were unfrozen and retrained to adapt domain-specific features from astronomical transient images, following best practices for moderate domain shifts and small datasets [39,40]. To ensure fairness, these hyperparameter settings were applied consistently across all models.
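A minimal Keras sketch of this shared training configuration is given below, assuming `train_ds` and `val_ds` are prepared tf.data datasets of normalized 224 × 224 stamps; restoring the best weights in the early-stopping callback is an assumption beyond the stated hyperparameters.

```python
import tensorflow as tf

# Hedged sketch of the shared setup (Table 2): Adam optimizer, binary
# cross-entropy, up to 100 epochs with early stopping (patience = 3),
# and one of the tested batch sizes.
BATCH_SIZES = [32, 64, 128, 256]

def compile_and_train(model, train_ds, val_ds, learning_rate=1e-3, batch_size=64):
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    early_stop = tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=3, restore_best_weights=True
    )
    return model.fit(
        train_ds.batch(batch_size),
        validation_data=val_ds.batch(batch_size),
        epochs=100,
        callbacks=[early_stop],
    )
```

For the fine-tuning phase, the same routine would be called again with `learning_rate=1e-5` after unfreezing the top layers.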

2.5. Transfer Learning

Transfer learning enhances model performance on a new but related task by reusing knowledge learned from a previous one. Models pretrained on large datasets like ImageNet [41,42,43] (over 14 million images in 1000 classes) capture general visual features such as edges, contours, and textures, which can be effectively reused for smaller or domain-specific datasets [44]. In this study, CNN architectures pretrained on ImageNet were adapted to classify astronomical transient images (Figure 4). The original classification head was replaced with a new fully connected layer for binary classification of real and bogus objects. Two training strategies were applied: transfer learning, where only the new classifier was trained while convolutional layers remained frozen, and fine-tuning, where the top 30% of convolutional layers were unfrozen and retrained with the classifier to capture domain-specific patterns [45,46,47]. This approach efficiently reuses existing knowledge, improves generalization, and enables effective domain adaptation for transient detection.
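The two-phase strategy can be sketched as follows, assuming a Keras applications backbone; the global-average-pooling head and the layer-count heuristic for the “top 30%” are illustrative assumptions rather than the exact architecture used.

```python
import tensorflow as tf

def build_transfer_model(base_name="MobileNet"):
    """Hedged sketch: replace the ImageNet classification head with a binary head.
    The pooling/Dense layout is an assumption, not the paper's exact head."""
    base_cls = getattr(tf.keras.applications, base_name)
    base = base_cls(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
    base.trainable = False                      # transfer learning: freeze all conv layers
    x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
    out = tf.keras.layers.Dense(1, activation="sigmoid")(x)  # real vs. bogus
    return tf.keras.Model(base.input, out), base

def unfreeze_top_30_percent(base):
    """Fine-tuning: unfreeze roughly the top 30% of the base's layers,
    to be retrained at the lower learning rate (1e-5)."""
    base.trainable = True
    cutoff = int(len(base.layers) * 0.7)
    for layer in base.layers[:cutoff]:
        layer.trainable = False
```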

2.6. Ensemble Deep Learning

In the final stage of this study, ensemble deep learning [48,49,50] was employed to enhance classification accuracy and model stability when dealing with diverse astronomical images. As shown in Figure 5, five CNN models were selected and trained using different data augmentation strategies [51]—original, rotation, horizontal flip (HFlip), vertical flip (VFlip), and noise injection—with varying batch sizes (32, 64, 128, 256). Architectures such as MobileNet and Xception were chosen based on their strong validation performance in both transfer learning (TL) and fine-tuning (FT) phases. The ensemble combined predictions from all models through a voting mechanism, allowing each specialized model to contribute its strengths—for instance, rotation-trained models excel in detecting orientation changes, while noise-trained models handle low-signal-to-noise data. Prior research shows that ensemble voting improves robustness and generalization compared with single-model systems [52]. This approach is particularly effective in astronomy, where image variability in brightness, morphology, and observation conditions makes ensemble learning essential for achieving accurate and reliable transient classification.
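The two voting schemes reduce to simple operations on the per-model probabilities of the “real” class, as in the sketch below; the example probabilities and weights are invented purely for illustration.

```python
import numpy as np

def soft_voting(probabilities):
    """Soft Voting: average each model's predicted probability of the 'real'
    class and threshold at 0.5. `probabilities` has shape (n_models, n_samples)."""
    return (np.mean(probabilities, axis=0) >= 0.5).astype(int)

def weighted_voting(probabilities, weights):
    """Weighted Voting: the same idea, but each model's probability is scaled
    by a weight reflecting its validation performance (weights are assumptions)."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    combined = np.tensordot(weights, probabilities, axes=1)
    return (combined >= 0.5).astype(int)

# Example: five augmentation-specialized models voting on three test stamps.
probs = np.array([
    [0.91, 0.12, 0.55],   # original-trained model
    [0.88, 0.20, 0.60],   # rotation-trained model
    [0.75, 0.35, 0.40],   # noise-trained model
    [0.93, 0.10, 0.58],   # HFlip-trained model
    [0.90, 0.15, 0.52],   # VFlip-trained model
])
print(soft_voting(probs))                                    # -> [1 0 1]
print(weighted_voting(probs, [0.2, 0.2, 0.5, 0.05, 0.05]))
```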

2.7. Evaluation Methods

The experimental results from each route were collected and compared to evaluate performance and efficiency. In deep learning-based scientific classification, selecting proper evaluation metrics is crucial. Key indicators—accuracy, precision, recall, and F1-score—were used as outlined in Table 3. Here, TP, TN, FP, and FN represent correctly or incorrectly predicted positive and negative cases. Since accuracy alone can be misleading for imbalanced datasets [53,54], precision, recall, and, especially, the F1-score—the harmonic mean of precision and recall—were emphasized, as this provides a balanced assessment when both false positives and false negatives carry significant impact [55].
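For reference, the sketch below computes these metrics directly from the TP/TN/FP/FN counts; it uses the standard definitions, and treating “real” as the positive class is an assumption about the convention in Table 3.

```python
import numpy as np

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 from the confusion-matrix counts,
    matching the standard definitions summarized in Table 3 (positive class = 'real')."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```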

3. Results and Discussion

In this section, we compare how well each model and each data augmentation technique perform on the different datasets. To identify the best models, we examine the key performance metrics: accuracy, precision, recall, and F1-score. The results of this comparison guide the selection of models for the subsequent ensemble deep learning experiments, which combine the strengths of different architectures to achieve better accuracy, robustness, and reliability in classifying astronomical transients under different noise and distortion conditions.

3.1. Comparison of Classification Results of Different Deep Learning Techniques with Original (Nonaugmented) Dataset

A comparative experiment using the original (non-augmented) astronomical dataset evaluated transfer learning (TL) and fine-tuning (FT) across four batch sizes (32, 64, 128, and 256) and nine CNN architectures: DenseNet121, InceptionV3, MobileNet, MobileNetV2, ResNet50, ResNet101, VGG16, VGG19, and Xception. The results showed that most models performed well, with MobileNet and VGG16 being the most stable and accurate. The fine-tuned MobileNet (batch 64) achieved the highest performance—accuracy = 0.98938 and F1-score (real) = 0.95758—maintaining strong results even with larger batches. DenseNet121 and Xception also achieved high accuracy (≈0.98) and solid F1-scores, demonstrating effective deep feature extraction. VGG16 and VGG19 performed consistently well despite their depth, while ResNet50/101 showed unstable precision and F1-scores, likely due to class imbalance. Overall, fine-tuned MobileNet (batch 64) was the best model, combining high accuracy, efficiency, and low computational cost. These findings highlight the importance of selecting suitable architectures, tuning batch sizes, and optimizing training strategies to achieve robust classification even with non-augmented astronomical data. Figure 6 illustrates the accuracy and loss curves of the different deep learning techniques on the original (non-augmented) dataset, and Table 4 presents the comparison of their classification results.

3.2. Comparison of Classification Results of Different Deep Learning Techniques with Rotation Dataset

In this experiment, only rotation-augmented astronomical images were used to improve the models’ ability to recognize rotated objects. Two learning methods—transfer learning (TL) and fine-tuning (FT)—were tested across nine CNN architectures (DenseNet121, InceptionV3, MobileNet, MobileNetV2, ResNet50, ResNet101, VGG16, VGG19, Xception) and four batch sizes (32, 64, 128, and 256). The results showed that Xception achieved the best overall performance, with TL (batch 128) yielding accuracy = 0.97750 and F1-score = 0.97761, though FT (batch 256) also maintained high stability. VGG16 (FT, batch 128) and VGG19 (FT, batch 256) followed closely with strong and consistent results. The MobileNet and ResNet models, however, suffered from overfitting or class imbalance—especially ResNet50 (FT, batch 256), which failed completely. Smaller batch sizes improved ResNet101 stability, while MobileNetV2 (TL, batch 256) achieved competitive performance with low computational cost.
In summary, rotation-based augmentation proved highly effective when paired with suitable architectures and batch sizes. The best-performing models—Xception (TL, batch 128), VGG16 (FT, batch 128), and VGG19 (FT, batch 256)—demonstrated superior accuracy and F1-scores across both “real” and “bogus” classes, emphasizing the importance of proper model selection, hyperparameter tuning, and regularization for optimal results. Figure 7 illustrates the accuracy and loss curves of the different deep learning techniques on the rotation dataset, and Table 5 presents the comparison of their classification results.

3.3. Comparison of Classification Results of Different Deep Learning Techniques with Noise Dataset

In this experiment, all models were trained on astronomical images with added noise to simulate real observational conditions, such as sensor artifacts, blurring, and low light. Nine CNN architectures—DenseNet121, InceptionV3, MobileNet, MobileNetV2, ResNet50, ResNet101, VGG16, VGG19, and Xception—were evaluated using transfer learning (TL) and fine-tuning (FT) across four batch sizes (32, 64, 128, and 256). Overall, most models struggled to learn from noisy data, with MobileNet, MobileNetV2, VGG16, VGG19, ResNet50, and ResNet101 showing an accuracy of 0.50000 and F1-scores of 0.00000, indicating complete failure in distinguishing real and bogus classes. Only a few models performed above random chance, notably Xception and InceptionV3. The fine-tuned Xception (batch 256) achieved the best results with accuracy = 0.72625 and F1-score (bogus) = 0.76776, while ResNet50 (FT, batch 256) followed with accuracy = 0.85000 and F1-score (real) = 0.83827. InceptionV3 (TL, batch 32) showed moderate learning ability (accuracy = 0.63187; F1-score (real) = 0.43092) but tended to overfit under fine-tuning. Other models showed unstable or misleading results, such as MobileNetV2 (TL, batch 256), which had high precision but extremely low recall. In summary, adding noise significantly degraded model performance across most architectures. Xception (FT, batch 256) was the most noise-tolerant, followed by ResNet50 (FT, batch 256) and InceptionV3 (TL, batch 32). These findings highlight the need for denoising preprocessing, noise-aware training, and hybrid augmentation pipelines to improve model robustness and generalization under noisy astronomical conditions. Figure 8 illustrates the accuracy and loss curves of the different deep learning techniques on the noise dataset, and Table 6 presents the comparison of their classification results.

3.4. Comparison of Classification Results of Different Deep Learning Techniques with HFlip Dataset

In this experiment, horizontal flip (HFlip) was applied to expand all datasets by mirroring astronomical images, allowing CNNs to recognize objects viewed from reversed telescope angles. Nine architectures—DenseNet121, InceptionV3, MobileNet, MobileNetV2, ResNet50, ResNet101, VGG16, VGG19, and Xception—were tested using transfer learning (TL) and fine-tuning (FT) across four batch sizes (32, 64, 128, and 256). The results showed that most models, particularly Xception, MobileNet, InceptionV3, VGG16, and VGG19, achieved nearly perfect performance. Xception reached accuracy = 0.99750–0.99813 and F1-score = 0.99688–0.99875, while MobileNet (TL, batch 128) achieved accuracy = 0.99875 and F1-score = 0.99875, matching VGG19 (FT). InceptionV3 consistently maintained accuracy > 0.996 and F1 > 0.99, and DenseNet121 (TL) also performed strongly (F1 > 0.993). However, MobileNetV2 (FT) showed moderate results (accuracy: 0.91–0.92; F1: 0.91–0.93), while ResNet50 and ResNet101 failed under some FT setups (accuracy = 0.5; F1 = 0.0). Overall, HFlip significantly improved model learning for symmetric and mirrored features, with deep, well-structured models like Xception, MobileNet, and InceptionV3 performing best. Both FT and TL delivered high accuracy, proving HFlip to be an effective augmentation method for astronomical image classification. Figure 9 illustrates the accuracy and loss curves of the different deep learning techniques on the HFlip dataset, and Table 7 presents the comparison of their classification results.

3.5. Comparison of Classification Results of Different Deep Learning Techniques with VFlip Dataset

In this experiment, the Vertical Flip (VFlip) technique was applied to expand the dataset by flipping astronomical images vertically, enabling models to recognize objects captured in inverted orientations caused by varying telescope angles. Nine CNN architectures—DenseNet121, InceptionV3, MobileNet, MobileNetV2, ResNet50, ResNet101, VGG16, VGG19, and Xception—were tested with transfer learning (TL) and fine-tuning (FT) using batch sizes of 32, 64, 128, and 256. The results showed that well-structured, deeper models like MobileNet, Xception, InceptionV3, VGG16, and VGG19 performed excellently, achieving near-perfect accuracy, precision, recall, and F1-scores for both classes. MobileNet (TL) and VGG19 (FT) reached 0.99875, while Xception (TL) achieved 0.99813, maintaining high stability even with large batch sizes. DenseNet121 and InceptionV3 also demonstrated strong, consistent results, whereas ResNet50 and ResNet101 struggled under some fine-tuning conditions but improved with TL. Overall, VFlip augmentation proved highly effective for models such as Xception, MobileNet, InceptionV3, VGG19, and VGG16, particularly with medium batch sizes (64–128), enhancing model robustness against variations in image orientation during real astronomical observations. Figure 10 illustrates the accuracy and loss curves of the different deep learning techniques on the VFlip dataset, and Table 8 presents the comparison of their classification results.

3.6. The Classification Performance of Ensemble Deep Learning

In this study, various CNN architectures were trained and tested under different conditions using data augmentation, transfer learning (TL), fine-tuning (FT), and varying batch sizes to assess how training methods affect astronomical image classification in complex scenarios. The results showed that Xception, MobileNet, InceptionV3, VGG16, and VGG19 consistently achieved high accuracy, precision, recall, and F1-scores across augmented datasets (original, vertical flip, horizontal flip, and rotation), demonstrating strong adaptability to spatial and symmetrical variations. Table 9 represents selected models for ensemble learning based on best performance by augmentation type.
However, the main challenge was noise sensitivity. When noise augmentation was applied to simulate real observational conditions, most models showed significant performance drops, with some reaching F1 = 0.00000 for the “real” class—indicating complete failure in identifying true astronomical objects. These findings highlight model bias and limited generalization under noisy conditions. The best-performing models for each augmentation type were then selected based on both accuracy and F1-score, considering the optimal batch size for each case.
In every experiment, we used an entirely new test set that the models had never seen before. The test set consisted of 4000 real images and 4000 bogus images. However, for the experiment using the original data, we used a smaller set comprising 105 real and 715 bogus images, which were separated from the original dataset. The models had also never encountered these images during training. Overall, each experiment was evaluated using a distinct test dataset corresponding to its respective data augmentation type.

3.6.1. Experimental Results of Model Combination Using Soft Voting Ensemble Technique

Table 10, Table 11, Table 12, Table 13 and Table 14 show that the Soft Voting ensemble method performs effectively across multiple datasets—original, rotation, noise, HFlip, and VFlip. The model achieved its best performance on the original dataset (accuracy = 0.9937, F1 (real) = 0.9720), correctly classifying nearly all samples with only six errors out of 820 images. Performance declined with rotation (accuracy = 0.7386, F1 = 0.6499) and dropped sharply with noise (accuracy = 0.5006, F1 = 0.0024), indicating difficulty recognizing rotated or noisy patterns. By contrast, the HFlip and VFlip datasets maintained excellent results (HFlip accuracy = 0.9867, F1 = 0.9860; VFlip accuracy = 0.9890, F1 = 0.9888), showing strong learning for symmetrical spatial features. Overall, the Soft Voting ensemble excels on datasets preserving spatial consistency (original, HFlip, and VFlip) but struggles with rotated and noisy data. The next phase explores noise-handling techniques and Weighted Voting ensembles to improve robustness and adaptability in realistic astronomical imaging conditions.

3.6.2. Experimental Results of Model Combination Using Weighted Voting Ensemble Technique

We began the experiment by assigning initial weights to the models. We then examined the overall accuracy to assess how well they performed as a group. If the accuracy dropped for a particular type of data augmentation, the weight of the corresponding model was increased so that it carried more influence in the ensemble vote. Performance was then re-evaluated to see how the weight change affected the overall classification results. This step-by-step procedure gradually balanced the ensemble, ensuring that models trained on harder augmentations, such as noise or rotation, contributed more strongly and handled diverse astronomical images more reliably; a minimal sketch of the procedure is given below.
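The sketch below illustrates this iterative weight adjustment under the assumption that each augmentation type has its own held-out test set and a matching augmentation-specialized model, and that a fixed boost is added per round; both the boost size and the stopping rule are assumptions, since the actual tuning was performed manually.

```python
import numpy as np

def weighted_probs(probabilities, weights):
    """Combine per-model 'real' probabilities, shape (n_models, n_samples)."""
    w = np.asarray(weights, dtype=float)
    return np.tensordot(w / w.sum(), probabilities, axes=1)

def tune_weights(prob_by_aug, label_by_aug, weights, boost=0.2, n_rounds=3):
    """Hedged sketch of the manual tuning loop: evaluate the ensemble on each
    augmentation's test set and raise the weight of the model trained on the
    augmentation where ensemble accuracy is lowest. Model order matches `weights`."""
    aug_order = list(prob_by_aug.keys())   # e.g. ["original", "rotation", "noise", "hflip", "vflip"]
    weights = np.asarray(weights, dtype=float)
    for _ in range(n_rounds):
        accs = {}
        for aug in aug_order:
            pred = (weighted_probs(prob_by_aug[aug], weights) >= 0.5).astype(int)
            accs[aug] = float(np.mean(pred == label_by_aug[aug]))
        weakest = min(accs, key=accs.get)             # lowest ensemble accuracy
        weights[aug_order.index(weakest)] += boost    # give that model more influence
        print(f"raised weight of {weakest}-trained model -> {np.round(weights, 2)}")
    return weights
```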
Table 15, Table 16, Table 17, Table 18, Table 19 and Table 20 show that the Weighted Voting ensemble, whose initial weights are listed in Table 15, significantly enhances image classification across various augmented datasets—original, rotation, noise, HFlip, and VFlip. The model achieved its best performance on the original dataset (accuracy = 0.9926, F1 (real) = 0.9722), showing near-perfect classification. Performance decreased with rotation (accuracy = 0.7628, F1 = 0.6928) and dropped sharply with noise (accuracy = 0.5013, F1 = 0.0054), indicating difficulty handling noisy or rotated data. By contrast, HFlip and VFlip performed strongly (accuracy = 0.9703–0.9760, F1 = 0.9695–0.9763), proving the model’s ability to learn symmetrical spatial features. Overall, the Weighted Voting ensemble improved accuracy and stability for structured, symmetrical data but remained sensitive to noise and rotation, highlighting the need for better noise-handling and adaptive weighting to enhance real-world astronomical image classification.
Table 21, Table 22, Table 23, Table 24, Table 25 and Table 26 show that the second Weighted Voting ensemble, whose initial weights are listed in Table 21, greatly improved performance, especially for the rotation and noise datasets, which were weaknesses in the previous experiment. Increasing the weight of Xception (fine-tuned, noise) from 0.3 to 0.5 boosted the noise dataset performance from accuracy = 0.50 to 0.6217 and F1 (real) from 0.0054 to 0.407, showing better noise handling. The rotation dataset also improved notably (accuracy = 0.9203, F1 = 0.915). Meanwhile, the original, HFlip, and VFlip datasets maintained strong results (accuracy > 0.97, F1 ≈ 0.97), confirming effective recognition of symmetrical spatial patterns. Overall, adjusting ensemble weights enhanced system stability and accuracy, particularly in classifying real astronomical objects under noisy or distorted conditions.
Table 27, Table 28, Table 29, Table 30, Table 31 and Table 32 show that the third Weighted Voting ensemble, whose initial weights are listed in Table 27, achieved the best overall performance after increasing the weight of the Xception (fine-tuned, noise) model to 0.8, greatly enhancing noise robustness while maintaining strong performance on the other augmentations. The original dataset reached accuracy = 0.9842 and F1 (real) = 0.9411, while the rotation dataset remained stable (accuracy = 0.9, F1 = 0.8965), reflecting an improved balance between precision and recall. The most notable gain was in the noise dataset, which reached accuracy = 0.885 and F1 (real) = 0.897, showing strong resistance to signal distortion. HFlip and VFlip also maintained excellent results (F1 > 0.96). Overall, increasing the weight of the noise-trained model significantly enhanced ensemble reliability and robustness, making it more effective for real-world astronomical image analysis under noisy and variable conditions.

3.7. Comparison with Previous Experimental Results

The present study builds upon the foundational research conducted by Tabacolde et al. [56] and Liu et al. [57], who pioneered the use of transient images from the GOTO sky survey for real–bogus classification. Tabacolde et al. [56] first converted the images into feature vectors and then applied standard machine learning models to classify them. However, their method suffered from severe class imbalance, with far more false detections than real transients. Attempts were made to resolve this issue through oversampling and undersampling techniques; however, these methods had inherent limitations, including the risk of overfitting or the potential to exclude valuable data. Later, Liu et al. [57] proposed an alternative method that augmented the minority class by adding rotated copies of the real images. This improved performance somewhat, but it still relied on traditional learning methods and could not directly learn deep hierarchical features from raw image data. To overcome these limitations, the present study employs deep learning techniques, specifically Convolutional Neural Networks (CNNs), which can extract complex feature representations from raw images without manual intervention and can capture the complex spatial patterns common in astronomical images. This experiment uses several data augmentation methods—rotation, flipping, and noise injection—to improve the model’s ability to generalize to different types of data distortion, and applies transfer learning and fine-tuning to further improve performance. We also employ a Weighted Voting ensemble strategy, which combines several high-performing CNN models and adjusts their contributions according to their validation performance. This compensates for the weaknesses of individual models and strengthens the classification system, especially for ambiguous or noisy images. The resulting ensemble deep learning approach is substantially more accurate and generalizes better than the earlier methods for real–bogus classification. Table 33 presents the comparison with previous experimental results.

4. Discussion

The experimental results clearly demonstrate the effectiveness of integrating advanced Convolutional Neural Network (CNN) architectures with strategic data augmentation, transfer learning, fine-tuning, and ensemble learning in the context of astronomical transient detection. Several critical insights can be drawn from these findings.

4.1. Performance of Individual CNN Models

Across all augmentation strategies, models such as Xception, MobileNet, and VGG16/19 consistently outperformed others in both accuracy and F1-score. In particular, Xception with transfer learning and batch size 128 achieved the highest performance on rotation-augmented data, while MobileNet (Fine-Tuned) performed exceptionally on the original dataset. These results emphasize the flexibility and structural robustness of certain architectures in learning domain-specific patterns, especially those involving spatial symmetry, orientation variations, and brightness inconsistencies commonly observed in transient astronomical imagery.
However, not all architectures responded equally well to each augmentation technique. For instance, ResNet50 and ResNet101 displayed significant performance degradation under some configurations, most notably with noise-augmented data and larger batch sizes. This suggests that deeper architectures may require more sophisticated regularization or denoising techniques when handling corrupted or low-SNR images.

4.2. Impact of Data Augmentation

Among the augmentation strategies tested, horizontal flip (HFlip) and vertical flip (VFlip) yielded the most consistently high classification performance across all models, often resulting in F1-scores above 0.99. This indicates that these transformations effectively simulate the positional variability of transient objects and assist CNNs in learning rotational-invariant features.
By contrast, noise augmentation proved to be the most challenging for nearly all models. Most architectures failed completely under noisy conditions, yielding F1-scores as low as 0.000, indicating severe overfitting or class bias. Only a few models, notably Xception and ResNet50, managed to retain marginal classification ability. This highlights a key vulnerability in current deep learning models applied to astronomical imagery: their sensitivity to image noise, which is a prevalent issue in real observational data.
These findings emphasize the need for future research to focus on improving robustness against noise—either through preprocessing (e.g., denoising filters), adversarial training, or noise-aware model architectures.

4.3. Effectiveness of Ensemble Learning

To address the weaknesses of individual models—particularly under noise-augmented conditions—this study explored Soft Voting and Weighted Voting ensemble strategies. The Soft Voting ensemble offered moderate improvements in accuracy but remained limited by its uniform weighting scheme, which failed to sufficiently correct performance imbalance under noisy conditions. Specifically, recall in the “real” class was consistently low, indicating a failure to detect true transient events reliably.
By contrast, Weighted Voting facilitated more flexibility by assigning higher influence to noise-trained models. Progressive weight tuning in three ensemble configurations demonstrated that increasing the weight of the noise-robust model (up to 0.8) significantly improved overall performance, culminating in an F1-score (real) of 0.931 and an overall accuracy of 0.9348. This supports the hypothesis that ensemble strategies that explicitly compensate for weak conditions—such as noise corruption—can substantially improve model generalization and classification balance.
Notably, the final ensemble configuration not only corrected recall degradation but also maintained strong performance across other augmentation types, demonstrating its adaptability and robustness in complex, heterogeneous astronomical datasets.

4.4. Generalization and Scalability

The success of this ensemble approach illustrates the potential of combining multiple specialized models to form a unified system capable of handling the high variance and complexity of real-world astronomical data. The pipeline’s design—featuring modular training, augmentation-specific modeling, and intelligent voting—offers a scalable solution that can be extended to other sky surveys and transient detection projects. Furthermore, the system’s reliance on transfer learning and moderate fine-tuning suggests that this approach is computationally efficient and suitable for deployment in time-critical applications, such as real-time transient detection in large-scale surveys like GOTO.

4.5. Broader Implications for Astronomical Investigation and Data Analysis

Our findings hold broader significance for both astronomical investigation and data analysis in time-domain astronomy. The proposed ensemble deep learning framework enhances transient detection reliability by improving accuracy, reducing false positives, and increasing robustness against noise and image artifacts. These improvements enable faster and more confident identification of real astrophysical events such as supernovae, kilonovae, and variable stars, supporting timely follow-up observations and more efficient telescope resource allocation. From a data analysis standpoint, combining multiple pretrained CNN architectures provides a diverse representation of astronomical image features, effectively addressing issues such as class imbalance and PSF variation. This multi-model integration demonstrates strong data efficiency even under limited training samples. Furthermore, the framework offers scalability for applications across future large-scale surveys, such as LSST and Pan-STARRS, where automated, interpretable, and robust AI models are essential.

5. Conclusions

This study introduced an ensemble-based deep learning framework designed to classify real and bogus astronomical transients from sky survey images. By integrating transfer learning, fine-tuning, multiple data augmentation strategies (such as rotation, horizontal flip, vertical flip, and noise), and ensemble learning techniques, the proposed system achieved substantial improvements in classification accuracy and robustness. Models like Xception, MobileNet, and VGG19 consistently outperformed others, particularly under augmentation strategies that introduced geometric variations. While most models struggled with noise-injected images, the Weighted Voting strategy—especially when assigning higher weights to noise-trained models—greatly enhanced the system’s resilience to distortion and improved the F1-score of the “real” class to 0.931 with an overall accuracy of 93.48%. These results highlight the importance of model diversity and strategic ensemble configuration in addressing the challenges of real-world astronomical datasets. The proposed approach offers a scalable and practical solution for transient detection tasks in large-scale sky surveys and lays the groundwork for future research in noise-resilient deep learning for astronomy.

6. Future Work

In future work, we aim to extend the proposed ensemble deep learning framework toward a more multimodal, interpretable, and physically consistent system for astronomical transient detection. The current study demonstrates strong performance on simulated and GOTO datasets; however, several directions can improve both the scientific robustness and operational generalization. First, we plan to develop a lightweight visual–language integration framework that combines image-based features with textual metadata such as FITS headers, World Coordinate System (WCS) parameters, and observing conditions (e.g., sky brightness, exposure time, and telescope pointing). This multimodal pathway will allow the network to condition its predictions on contextual information and generate natural-language explanations for its decisions. Inspired by advances in vision–language modeling, such as “From Gaze to Insight”, this direction will enhance interpretability and pave the way for interactive AI-assisted discovery pipelines where astronomers can query model rationales in real time. Second, we intend to address the limitations caused by image upsampling from 21 × 21 to 224 × 224 pixels. While necessary for compatibility with ImageNet-pretrained architectures, this resizing may introduce interpolation artifacts and diminish the fidelity of point-spread function (PSF) structures. Future studies will explore resolution-aware and compact CNN architectures capable of operating directly on native resolutions, including anti-aliased convolutional stems, patchified feature extractors, and shallow networks optimized for small-scale astronomical images. Third, we plan to expand model validation to real GOTO difference images obtained from multiple observing nights and sky fields. These experiments will include genuine artifacts such as PSF distortions, ghost reflections, and saturation trails to evaluate the framework’s real-world robustness. Techniques such as domain adaptation and artifact-aware augmentation will also be explored to bridge the gap between simulated and observational data distributions. Finally, future investigations will assess the cross-survey applicability of the proposed approach under varying instrumental and environmental conditions. This includes benchmarking the ensemble framework using real data from other surveys (e.g., ZTF, LSST, and Pan-STARRS) and exploring its adaptability to related astrophysical tasks, such as variable star classification and optical counterpart identification in multi-messenger astronomy. Collectively, these future developments will advance the framework toward a scalable, interpretable, and physically reliable AI system capable of supporting next-generation time-domain surveys, enhancing both the automation and scientific value of transient detection in modern astronomy.

Author Contributions

Conceptualization, T.B. and P.P.; methodology, P.P.; software, P.P.; validation, S.C., T.B. and N.I.-O.; formal analysis, S.C.; investigation, P.P.; resources, T.B.; data curation, T.B.; writing—original draft preparation, P.P.; writing—review and editing, P.P. and S.C.; visualization, P.P.; supervision, S.C., T.B. and N.I.-O.; project administration, S.C.; funding acquisition, T.B. and N.I.-O. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding authors.

Acknowledgments

Mae Fah Luang University, School of Applied Digital Technology.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Abbott, B.P.; Abbott, R.; Abbott, T.D.; Acernese, F.; Ackley, K.; Adams, C.; Adams, T.; Addesso, P.; Adhikari, R.X.; Adya, V.B.; et al. GW170817: Observation of gravitational waves from a binary neutron star inspiral. Phys. Rev. Lett. 2017, 119, 161101. [Google Scholar] [CrossRef]
  2. Dyer, M.J.; Steeghs, D.; Galloway, D.K.; Dhillon, V.S.; O’Brien, P.T.; Ramsay, G.; Noysena, K.; Pallé, E.; Kotak, R.; Breton, R.; et al. The Gravitational-wave Optical Transient Observer (GOTO). In Proceedings of the Ground-based and Airborne Telescopes VIII, Online, 14–22 December 2020; p. 157. [Google Scholar] [CrossRef]
  3. Steeghs, D.; Galloway, D.K.; Ackley, K.; Dyer, M.J.; Lyman, J.; Ulaczyk, K.; Cutter, R.; Mong, Y.-L.; Dhillon, V.; O’Brien, P.; et al. The Gravitational-wave Optical Transient Observer (GOTO): Prototype performance and prospects for transient science. Mon. Not. R. Astron. Soc. 2022, 511, 2405–2422. [Google Scholar] [CrossRef]
  4. Long, Y.; Inoue, N.; Shinoda, K.; Yatsu, Y.; Itoh, R.; Kawai, N. Astronomical Image Subtraction for Transient Detection Using CNN. In Proceedings of the 21st Meeting on Image Recognition and Understanding (MIRU), Hokkaido, Japan, 7 August 2018; IEICE Research Committee on Pattern Recognition and Media Understanding (PRMU): Tokyo, Japan, 2018. (In Japanese). [Google Scholar]
  5. Murase, K.; Bartos, I. High-energy multimessenger transient astrophysics. Annu. Rev. Nucl. Part. Sci. 2019, 69, 477–506. [Google Scholar] [CrossRef]
  6. Fryer, C. Fundamental physics studies in time domain and multi-messenger astronomy. Front. Astron. Space Sci. 2024, 11, 1384587. [Google Scholar] [CrossRef]
  7. Boongoen, T.; Iam-On, N. Consensus clustering-based undersampling for improved classification of transient events in time-domain astronomy surveys. Sci. Rep. 2025, 15, 37382. [Google Scholar] [CrossRef]
  8. Hubel, D.H.; Wiesel, T.N. Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. J. Physiol. 1962, 160, 106–154. [Google Scholar] [CrossRef]
  9. Sánchez, H.D.; Huertas-Company, M.; Bernardi, M.; Tuccillo, D.; Fischer, J.L. Improving galaxy morphologies for SDSS with Deep Learning. Mon. Not. R. Astron. Soc. 2018, 476, 3661–3676. [Google Scholar] [CrossRef]
  10. Cabrera-Vives, G.; Reyes, I.; Förster, F.; Estévez, P.A.; Maureira, J.-C. Deep-HiTS: Rotation Invariant Convolutional Neural Network for Transient Detection. Astrophys. J. 2017, 836, 97. [Google Scholar] [CrossRef]
  11. Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R. Dropout: A Simple Way to Prevent Neural Networks from Overfitting. J. Mach. Learn. Res. 2014, 15, 1929–1958. [Google Scholar]
  12. Salehin, I.; Kang, D.-K. A review on dropout regularization approaches for deep neural networks within the scholarly domain. Electronics 2023, 12, 3106. [Google Scholar] [CrossRef]
  13. Priyadarshini, I.; Puri, V. A convolutional neural network (CNN) based ensemble model for exoplanet detection. Earth Sci. Inform. 2021, 14, 735–747. [Google Scholar] [CrossRef]
  14. AlZobi, F.I.; Mansour, K.; Nasayreh, A.; Samara, G.; Alsalman, N.; Bashkami, A.; Smerat, A.; Nahar, K.M. Optimized Soft-Voting CNN Ensemble Using Particle Swarm Optimization for Endometrial Cancer Histopathology Classification. Comput. Methods Programs Biomed. Update 2025, 8, 100217. [Google Scholar] [CrossRef]
  15. Uyar, K.; Yurdakul, M.; Taşdemir, Ş. Abc-based weighted voting deep ensemble learning model for multiple eye disease detection. Biomed. Signal Process. Control 2024, 96, 106617. [Google Scholar]
  16. García-Jara, G.; Protopapas, P.; Estévez, P.A. Improving astronomical time-series classification via data augmentation with generative adversarial networks. Astrophys. J. 2022, 935, 23. [Google Scholar] [CrossRef]
  17. Smith, M.J.; Geach, J.E. Generative deep fields: Arbitrarily sized, random synthetic astronomical images through deep learning. Mon. Not. R. Astron. Soc. 2019, 490, 4985–4990. [Google Scholar] [CrossRef]
  18. Prommool, P.; Chucherd, S. Comparative Study of Deep Learning Model for Transient Image Classification. In Proceedings of the 2023 7th International Conference on Information Technology (InCIT), Chiang Rai, Thailand, 16–17 November 2023; IEEE: New York, NY, USA, 2023; pp. 265–270. [Google Scholar]
  19. Bertin, E. SkyMaker: Astronomical image simulations made easy. Mem. Della Soc. Astron. Ital. 2009, 80, 422. [Google Scholar]
  20. Jurić, M.; Kantor, J.; Lim, K.-T.; Lupton, R.H.; Dubois-Felsmann, G.; Jenness, T.; Axelrod, T.S.; Aleksić, J.; Allsman, R.A.; AlSayyad, Y.; et al. The LSST Data Management System. arXiv 2015, arXiv:1512.07914. [Google Scholar] [CrossRef]
  21. Shorten, C.; Khoshgoftaar, T.M. A survey on image data augmentation for deep learning. J. Big Data 2019, 6, 60. [Google Scholar] [CrossRef]
  22. Mumuni, A.; Mumuni, F. Data augmentation: A comprehensive survey of modern approaches. Array 2022, 16, 100258. [Google Scholar] [CrossRef]
  23. Keskar, N.S.; Mudigere, D.; Nocedal, J.; Smelyanskiy, M.; Tang, P.T.P. On large-batch training for deep learning: Generalization gap and sharp minima. arXiv 2016, arXiv:1609.04836. [Google Scholar]
  24. Zhao, X.; Wang, L.; Zhang, Y.; Han, X.; Deveci, M.; Parmar, M. A review of convolutional neural networks in computer vision. Artif. Intell. Rev. 2024, 57, 99. [Google Scholar] [CrossRef]
  25. LeCun, Y.; Bengio, Y.; Hinton, G. Deep Learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef]
  26. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708. [Google Scholar]
  27. Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826. [Google Scholar]
  28. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv 2017, arXiv:1704.04861. [Google Scholar] [CrossRef]
  29. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.-C. Mobilenetv2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4510–4520. [Google Scholar]
  30. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  31. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
  32. Chollet, F. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1251–1258. [Google Scholar]
  33. Masters, D.; Luschi, C. Revisiting small batch training for deep neural networks. arXiv 2018, arXiv:1804.07612. [Google Scholar] [CrossRef]
  34. Thanapol, P.; Lavangnananda, K.; Bouvry, P.; Pinel, F.; Leprévost, F. Reducing Overfitting and Improving Generalization in Training Convolutional Neural Network (CNN) under Limited Sample Sizes in Image Recognition. In Proceedings of the 2020—5th International Conference on Information Technology (InCIT), Chonburi, Thailand, 21–22 October 2020; IEEE: New York, NY, USA, 2020; pp. 300–305. [Google Scholar]
  35. Prechelt, L. Early stopping-but when? In Neural Networks: Tricks of the Trade; Springer: Berlin/Heidelberg, Germany, 2002; pp. 55–69. [Google Scholar]
  36. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2017, arXiv:1412.6980. [Google Scholar] [CrossRef]
  37. Iman, M.; Arabnia, H.R.; Rasheed, K. A review of deep transfer learning and recent advancements. Technologies 2023, 11, 40. [Google Scholar] [CrossRef]
  38. Becherer, N.; Pecarina, J.; Nykl, S.; Hopkinson, K. Improving optimization of convolutional neural networks through parameter fine-tuning. Neural Comput. Appl. 2019, 31, 3469–3479. [Google Scholar] [CrossRef]
  39. Kandel, I.; Castelli, M. How Deeply to Fine-Tune a Convolutional Neural Network: A Case Study Using a Histopathology Dataset. Appl. Sci. 2020, 10, 3359. [Google Scholar] [CrossRef]
  40. Xiao, X.; Mudiyanselage, T.B.; Ji, C.; Hu, J.; Pan, Y. Fast Deep Learning Training through Intelligently Freezing Layers. In Proceedings of the 2019 International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData), Atlanta, GA, USA, 14–17 July 2019; IEEE: New York, NY, USA, 2019; pp. 1225–1232. [Google Scholar] [CrossRef]
  41. Deng, J.; Dong, W.; Socher, R.; Li, L.-J.; Li, K.; Fei-Fei, L. ImageNet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; IEEE: New York, NY, USA, 2009; pp. 248–255. [Google Scholar] [CrossRef]
  42. Kornblith, S.; Shlens, J.; Le, Q.V. Do better imagenet models transfer better? In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 2661–2671. [Google Scholar]
  43. Huh, M.; Agrawal, P.; Efros, A.A. What makes ImageNet good for transfer learning? arXiv 2016, arXiv:1608.08614. [Google Scholar] [CrossRef]
  44. Zhuang, F.; Qi, Z.; Duan, K.; Xi, D.; Zhu, Y.; Zhu, H.; Xiong, H.; He, Q. A Comprehensive Survey on Transfer Learning. Proc. IEEE 2021, 109, 43–76. [Google Scholar] [CrossRef]
  45. Vrbancic, G.; Podgorelec, V. Transfer Learning with Adaptive Fine-Tuning. IEEE Access 2020, 8, 196197–196211. [Google Scholar] [CrossRef]
  46. Shermin, T.; Teng, S.W.; Murshed, M.; Lu, G.; Sohel, F.; Paul, M. Enhanced transfer learning with imagenet trained classification layer. In Pacific-Rim Symposium on Image and Video Technology; Springer: Berlin/Heidelberg, Germany, 2019; pp. 142–155. [Google Scholar]
  47. Cui, Y.; Song, Y.; Sun, C.; Howard, A.; Belongie, S. Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; IEEE: New York, NY, USA, 2018; pp. 4109–4118. [Google Scholar] [CrossRef]
  48. Yang, Y.; Lv, H.; Chen, N. A survey on ensemble learning under the era of deep learning. Artif. Intell. Rev. 2023, 56, 5545–5589. [Google Scholar] [CrossRef]
  49. Mohammed, A.; Kora, R. A comprehensive review on ensemble deep learning: Opportunities and challenges. J. King Saud Univ.-Comput. Inf. Sci. 2023, 35, 757–774. [Google Scholar] [CrossRef]
  50. Ganaie, M.A.; Hu, M.; Malik, A.K.; Tanveer, M.; Suganthan, P.N. Ensemble deep learning: A review. Eng. Appl. Artif. Intell. 2022, 115, 105151. [Google Scholar] [CrossRef]
  51. Muñoz-Aseguinolaza, U.; Sierra, B.; Aginako, N. Rotational augmentation techniques: A new perspective on ensemble learning for image classification. arXiv 2023, arXiv:2306.07027. [Google Scholar] [CrossRef]
  52. Shibata, T.; Tanaka, M.; Okutomi, M. Geometric Data Augmentation Based on Feature Map Ensemble. In Proceedings of the 2021 IEEE International Conference on Image Processing (ICIP), Anchorage, AK, USA, 19–22 September 2021; IEEE: New York, NY, USA, 2021; pp. 904–908. [Google Scholar] [CrossRef]
  53. Rainio, O.; Teuho, J.; Klén, R. Evaluation metrics and statistical tests for machine learning. Sci. Rep. 2024, 14, 6086. [Google Scholar] [CrossRef]
  54. Powers, D.M. Evaluation: From precision, recall and F-measure to ROC, informedness, markedness and correlation. arXiv 2020, arXiv:2010.16061. [Google Scholar] [CrossRef]
  55. Saito, T.; Rehmsmeier, M. The Precision-Recall Plot Is More Informative than the ROC Plot When Evaluating Binary Classifiers on Imbalanced Datasets. PLoS ONE 2015, 10, e0118432. [Google Scholar] [CrossRef] [PubMed]
  56. Tabacolde, A.B.; Boongoen, T.; Iam-On, N.; Mullaney, J.; Sawangwit, U.; Ulaczyk, K. Transient detection modeling as imbalance data classification. In Proceedings of the 2018 1st IEEE International Conference on Knowledge Innovation and Invention (ICKII), Jeju Island, Republic of Korea, 23–27 July 2018; IEEE: New York, NY, USA, 2018; pp. 180–183. [Google Scholar]
  57. Liu, J.J.; Boongoen, T.; Iam-On, N. Improved detection of transient events in wide area sky survey using convolutional neural networks. Data Inf. Manag. 2024, 8, 100035. [Google Scholar] [CrossRef]
Figure 1. Overview of the overall methodology used in this study.
Figure 2. Example images from the dataset.
Figure 3. An example of data augmentation.
Figure 4. The process of transferring learned features from a source domain (ImageNet) to a target domain (astronomical transient data) using the transfer learning approach.
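To make the transfer learning step in Figure 4 concrete, the following is a minimal Keras sketch of loading an ImageNet-pretrained backbone and attaching a new real–bogus head; the input size, dropout rate, and layer choices are illustrative assumptions rather than the exact configuration used in this study.

```python
# Minimal transfer-learning sketch (illustrative; settings may differ from the study).
import tensorflow as tf
from tensorflow.keras import layers, models

# ImageNet-pretrained MobileNet backbone without its original classification head.
base = tf.keras.applications.MobileNet(weights="imagenet", include_top=False,
                                       input_shape=(224, 224, 3))
base.trainable = False  # freeze the pretrained features during the transfer stage

# New binary (real vs. bogus) classification head on top of the frozen backbone.
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.5),                    # dropout regularization
    layers.Dense(1, activation="sigmoid"),  # probability of the "real" class
])
```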
Figure 5. Ensemble architecture combining multiple CNN models trained under different augmentation strategies and batch sizes.
Figure 6. Accuracy and loss curves of the different deep learning techniques on the original (non-augmented) dataset.
Figure 7. Accuracy and loss curves of the different deep learning techniques on the rotation-augmented dataset.
Figure 8. Accuracy and loss curves of the different deep learning techniques on the noise-augmented dataset.
Figure 9. Accuracy and loss curves of the different deep learning techniques on the horizontally flipped (HFlip) dataset.
Figure 10. Accuracy and loss curves of the different deep learning techniques on the vertically flipped (VFlip) dataset.
Table 1. Training data before and after oversampling, applied to address the class imbalance problem.
Training Data | Bogus | Real
Before Oversampling | 2862 | 418
After Oversampling | 4000 | 4000
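As a rough illustration of how the counts in Table 1 could be reached, the snippet below randomly oversamples each class (with replacement) to 4000 training samples; the use of scikit-learn's resample and the function name are assumptions for illustration, not necessarily the authors' exact routine.

```python
# Illustrative random oversampling of each class to 4000 samples (cf. Table 1).
import numpy as np
from sklearn.utils import resample

def oversample_to(X, y, target_per_class=4000, seed=42):
    """Resample every class (with replacement) up to target_per_class examples."""
    X_parts, y_parts = [], []
    for label in np.unique(y):
        X_class = X[y == label]
        X_res = resample(X_class, replace=True,
                         n_samples=target_per_class, random_state=seed)
        X_parts.append(X_res)
        y_parts.append(np.full(target_per_class, label))
    return np.concatenate(X_parts), np.concatenate(y_parts)
```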
Table 2. Hyperparameter settings used for training and fine-tuning pretrained convolutional and deep learning architectures (e.g., VGG, ResNet, DenseNet, MobileNet, Xception, and InceptionV3) for astronomical transient image classification.
Parameter | Value
Batch Size | 32, 64, 128, 256
Epochs | 100, with Early Stopping (Patience = 3)
Learning Rate | 0.001 (transfer learning), 0.00001 (fine-tuning)
Optimizer | Adam
Loss Function | Binary Cross-Entropy
Fine-Tuning Unlocks | Top 30% of layers
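A minimal sketch of how the Table 2 settings might be applied with Keras is shown below; it assumes a model and frozen backbone like those sketched after Figure 4 and pre-batched training/validation datasets (train_ds, val_ds), so the variable names and the exact layer-unfreezing rule are illustrative.

```python
# Illustrative training and fine-tuning setup following Table 2.
import tensorflow as tf

early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=3,
                                              restore_best_weights=True)

# Transfer-learning stage: frozen backbone, Adam with learning rate 0.001,
# binary cross-entropy loss; batch sizes of 32-256 are set when building train_ds.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
              loss="binary_crossentropy", metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=100, callbacks=[early_stop])

# Fine-tuning stage: unlock roughly the top 30% of backbone layers and retrain
# with a much smaller learning rate (1e-5).
n_unfreeze = int(0.3 * len(base.layers))
for layer in base.layers[-n_unfreeze:]:
    layer.trainable = True
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss="binary_crossentropy", metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=100, callbacks=[early_stop])
```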
Table 3. Performance indicators and formulas.
Performance Indicator | Formula
Precision (P) | True Positive / (True Positive + False Positive)
Recall (R) | True Positive / (True Positive + False Negative)
F1-score | 2 × (Precision × Recall) / (Precision + Recall)
Table 4. Comparison of classification results of different deep learning techniques with original dataset.
Rank | Model | Method | Batch Size | Accuracy | F1-Score (Bogus) | F1-Score (Real)
1 | MobileNet | fine-tuned | 64 | 0.98938 | 0.99393 | 0.95758
2 | ResNet50 | fine-tuned | 32 | 0.98634 | 0.99222 | 0.94410
3 | VGG16 | fine-tuned | 32 | 0.98634 | 0.99221 | 0.94479
4 | VGG19 | fine-tuned | 256 | 0.98331 | 0.99049 | 0.93168
5 | MobileNet | transfer | 32 | 0.98483 | 0.99130 | 0.94048
Table 5. Comparison of classification results of different deep learning techniques with rotation dataset.
Rank | Model | Method | Batch Size | Accuracy | F1-Score (Bogus) | F1-Score (Real)
1 | Xception | transfer | 128 | 0.97750 | 0.97739 | 0.97761
2 | Xception | transfer | 256 | 0.97375 | 0.97352 | 0.97398
3 | VGG16 | fine-tuned | 128 | 0.97500 | 0.97497 | 0.97503
4 | VGG19 | fine-tuned | 256 | 0.97188 | 0.97168 | 0.97207
5 | Xception | fine-tuned | 32 | 0.97188 | 0.97146 | 0.97227
Table 6. Comparison of classification results of different deep learning techniques with noise dataset.
Rank | Model | Method | Batch Size | Accuracy | F1-Score (Bogus) | F1-Score (Real)
1 | ResNet50 | fine-tuned | 256 | 0.85000 | 0.86014 | 0.83827
2 | Xception | fine-tuned | 256 | 0.72625 | 0.76776 | 0.6667
3 | Xception | transfer | 256 | 0.64000 | 0.72932 | 0.46269
4 | Xception | fine-tuned | 32 | 0.65500 | 0.74085 | 0.48411
5 | InceptionV3 | transfer | 256 | 0.62125 | 0.72329 | 0.4000
Table 7. Comparison of classification results of different deep learning techniques with HFlip dataset.
Rank | Model | Method | Batch Size | Accuracy | F1-Score (Bogus) | F1-Score (Real)
1 | MobileNet | transfer | 64 | 0.99875 | 0.99875 | 0.99875
2 | Xception | transfer | 64 | 0.99875 | 0.99875 | 0.99875
3 | VGG19 | fine-tuned | 64 | 0.99875 | 0.99875 | 0.99875
4 | VGG16 | fine-tuned | 64 | 0.99687 | 0.99688 | 0.99688
5 | MobileNetV2 | transfer | 256 | 0.99375 | 0.99379 | 0.99379
Table 8. Comparison of classification results of different deep learning techniques with VFlip dataset.
Rank | Model | Method | Batch Size | Accuracy | F1-Score (Bogus) | F1-Score (Real)
1 | MobileNet | transfer | 32 | 0.99875 | 0.99875 | 0.99875
2 | VGG19 | fine-tuned | 32 | 0.99875 | 0.99875 | 0.99875
3 | InceptionV3 | transfer | 32 | 0.99750 | 0.99749 | 0.99751
4 | MobileNet | fine-tuned | 32 | 0.99750 | 0.99749 | 0.99751
5 | Xception | transfer | 32 | 0.99687 | 0.99687 | 0.99688
Table 9. Selected models for ensemble learning based on best performance by augmentation type.
Model | Method | Augmentation | Batch Size
MobileNet | fine-tuned | Original | 64
Xception | transfer | Rotation | 128
Xception | fine-tuned | Noise | 256
MobileNet | transfer | HFlip | 64
MobileNet | transfer | VFlip | 32
Table 10. Test with original data.
Model | Accuracy | Precision (Bogus) | Recall (Bogus) | F1-Score (Bogus) | Precision (Real) | Recall (Real) | F1-Score (Real)
Ensemble | 0.9937 | 0.9986 | 0.9930 | 0.9958 | 0.9541 | 0.9905 | 0.9720
Confusion Matrix
 | Pred: bogus | Pred: real
True: bogus | 711 | 5
True: real | 1 | 104
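As a quick sanity check, the Table 3 formulas can be applied directly to the Table 10 confusion matrix; the short snippet below simply recomputes the bogus-class metrics and the overall accuracy from the four cell counts.

```python
# Recompute the bogus-class metrics from the Table 10 confusion matrix.
tp, fn = 711, 5    # true bogus classified as bogus / as real
fp, tn = 1, 104    # true real classified as bogus / as real

precision_bogus = tp / (tp + fp)                  # ~0.9986
recall_bogus = tp / (tp + fn)                     # ~0.9930
f1_bogus = (2 * precision_bogus * recall_bogus
            / (precision_bogus + recall_bogus))   # ~0.9958
accuracy = (tp + tn) / (tp + tn + fp + fn)        # ~0.9927
print(precision_bogus, recall_bogus, f1_bogus, accuracy)
```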
Table 11. Test with rotation data.
Model | Accuracy | Precision (Bogus) | Recall (Bogus) | F1-Score (Bogus) | Precision (Real) | Recall (Real) | F1-Score (Real)
Ensemble | 0.7386 | 0.6584 | 0.992 | 0.7914 | 0.984 | 0.4852 | 0.6499
Confusion Matrix
 | Pred: bogus | Pred: real
True: bogus | 3968 | 32
True: real | 2059 | 1941
Table 12. Test with noise data.
Model | Accuracy | Precision (Bogus) | Recall (Bogus) | F1-Score (Bogus) | Precision (Real) | Recall (Real) | F1-Score (Real)
Ensemble | 0.5006 | 0.5 | 1 | 0.6667 | 1 | 0.0012 | 0.0024
Confusion Matrix
 | Pred: bogus | Pred: real
True: bogus | 4000 | 0
True: real | 3995 | 5
Table 13. Test with HFlip data.
Model | Accuracy | Precision (Bogus) | Recall (Bogus) | F1-Score (Bogus) | Precision (Real) | Recall (Real) | F1-Score (Real)
Ensemble | 0.9667 | 0.946 | 0.99 | 0.967 | 0.9895 | 0.9435 | 0.966
Confusion Matrix
 | Pred: bogus | Pred: real
True: bogus | 3960 | 40
True: real | 226 | 3774
Table 14. Test with VFlip data.
Model | Accuracy | Precision (Bogus) | Recall (Bogus) | F1-Score (Bogus) | Precision (Real) | Recall (Real) | F1-Score (Real)
Ensemble | 0.9889 | 0.9848 | 0.993 | 0.989 | 0.9929 | 0.9847 | 0.9888
Confusion Matrix
 | Pred: bogus | Pred: real
True: bogus | 3972 | 28
True: real | 61 | 3939
Table 15. Parameters in first ensemble.
Model | Method | Batch Size | Augmentation | Weight
MobileNet | fine-tuned | 64 | Original | 0.2
Xception | transfer | 128 | Rotation | 0.2
Xception | fine-tuned | 256 | Noise | 0.3
MobileNet | transfer | 64 | HFlip | 0.15
MobileNet | transfer | 32 | VFlip | 0.15
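The Weighted Voting rule with the Table 15 weights can be sketched as a weighted average of the per-model probabilities, as below; the function name, decision threshold, and array layout are illustrative assumptions (Soft Voting corresponds to the special case of equal weights).

```python
# Illustrative weighted soft voting over the five base CNNs (weights from Table 15).
import numpy as np

def weighted_vote(prob_list, weights, threshold=0.5):
    """prob_list: one array of P(real) per model; weights: one weight per model."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                              # normalize the weights to sum to 1
    probs = np.stack(prob_list, axis=0)          # shape: (n_models, n_images)
    fused = np.tensordot(w, probs, axes=1)       # weighted average probability per image
    return (fused >= threshold).astype(int)      # 1 = real, 0 = bogus

# Example with the first-ensemble weights (Original, Rotation, Noise, HFlip, VFlip models):
# labels = weighted_vote([p1, p2, p3, p4, p5], weights=[0.2, 0.2, 0.3, 0.15, 0.15])
```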
Table 16. Test with original data.
Model | Accuracy | Precision (Bogus) | Recall (Bogus) | F1-Score (Bogus) | Precision (Real) | Recall (Real) | F1-Score (Real)
Ensemble | 0.9926 | 1 | 0.9916 | 0.9958 | 0.9459 | 1 | 0.9722
Confusion Matrix
 | Pred: bogus | Pred: real
True: bogus | 710 | 6
True: real | 0 | 105
Table 17. Test with rotation data.
Model | Accuracy | Precision (Bogus) | Recall (Bogus) | F1-Score (Bogus) | Precision (Real) | Recall (Real) | F1-Score (Real)
Ensemble | 0.7628 | 0.6805 | 0.9907 | 0.8068 | 0.983 | 0.535 | 0.6928
Confusion Matrix
 | Pred: bogus | Pred: real
True: bogus | 3963 | 37
True: real | 1560 | 2140
Table 18. Test with noise data.
Model | Accuracy | Precision (Bogus) | Recall (Bogus) | F1-Score (Bogus) | Precision (Real) | Recall (Real) | F1-Score (Real)
Ensemble | 0.5013 | 0.5 | 1 | 0.6672 | 1 | 0.0027 | 0.0054
Confusion Matrix
 | Pred: bogus | Pred: real
True: bogus | 4000 | 0
True: real | 3959 | 11
Table 19. Test with HFlip data.
Model | Accuracy | Precision (Bogus) | Recall (Bogus) | F1-Score (Bogus) | Precision (Real) | Recall (Real) | F1-Score (Real)
Ensemble | 0.9703 | 0.9556 | 0.9865 | 0.9708 | 0.986 | 0.9542 | 0.9698
Confusion Matrix
 | Pred: bogus | Pred: real
True: bogus | 3946 | 54
True: real | 183 | 3817
Table 20. Test with VFlip data.
Model | Accuracy | Precision (Bogus) | Recall (Bogus) | F1-Score (Bogus) | Precision (Real) | Recall (Real) | F1-Score (Real)
Ensemble | 0.976 | 0.9625 | 0.9905 | 0.9763 | 0.9902 | 0.9615 | 0.9756
Confusion Matrix
 | Pred: bogus | Pred: real
True: bogus | 3962 | 38
True: real | 154 | 3846
Table 21. Parameters in second ensemble.
Model | Method | Batch Size | Augmentation | Weight
MobileNet | fine-tuned | 64 | Original | 0.2
Xception | transfer | 128 | Rotation | 0.2
Xception | fine-tuned | 256 | Noise | 0.50
MobileNet | transfer | 64 | HFlip | 0.15
MobileNet | transfer | 32 | VFlip | 0.15
Table 22. Test with original data.
Model | Accuracy | Precision (Bogus) | Recall (Bogus) | F1-Score (Bogus) | Precision (Real) | Recall (Real) | F1-Score (Real)
Ensemble | 0.9902 | 1 | 0.9888 | 0.9943 | 0.9292 | 1 | 0.9633
Confusion Matrix
 | Pred: bogus | Pred: real
True: bogus | 708 | 8
True: real | 0 | 105
Table 23. Test with rotation data.
Model | Accuracy | Precision (Bogus) | Recall (Bogus) | F1-Score (Bogus) | Precision (Real) | Recall (Real) | F1-Score (Real)
Ensemble | 0.9203 | 0.8734 | 0.98325 | 0.9250 | 0.9808 | 0.8575 | 0.915
Confusion Matrix
 | Pred: bogus | Pred: real
True: bogus | 3933 | 67
True: real | 570 | 3430
Table 24. Test with noise data.
Model | Accuracy | Precision (Bogus) | Recall (Bogus) | F1-Score (Bogus) | Precision (Real) | Recall (Real) | F1-Score (Real)
Ensemble | 0.6217 | 0.5706 | 0.9832 | 0.722 | 0.9395 | 0.2602 | 0.407
Confusion Matrix
 | Pred: bogus | Pred: real
True: bogus | 3933 | 67
True: real | 2959 | 1041
Table 25. Test with HFlip data.
Model | Accuracy | Precision (Bogus) | Recall (Bogus) | F1-Score (Bogus) | Precision (Real) | Recall (Real) | F1-Score (Real)
Ensemble | 0.97125 | 0.9631 | 0.98 | 0.9714 | 0.9796 | 0.9625 | 0.970
Confusion Matrix
 | Pred: bogus | Pred: real
True: bogus | 3920 | 80
True: real | 150 | 3850
Table 26. Test with VFlip data.
Model | Accuracy | Precision (Bogus) | Recall (Bogus) | F1-Score (Bogus) | Precision (Real) | Recall (Real) | F1-Score (Real)
Ensemble | 0.971 | 0.9584 | 0.9847 | 0.9714 | 0.9843 | 0.9572 | 0.9706
Confusion Matrix
 | Pred: bogus | Pred: real
True: bogus | 3939 | 61
True: real | 171 | 3829
Table 27. Parameters in third ensemble.
Model | Method | Batch Size | Augmentation | Weight
MobileNet | fine-tuned | 64 | Original | 0.20
Xception | transfer | 128 | Rotation | 0.20
Xception | fine-tuned | 256 | Noise | 0.80
MobileNet | transfer | 64 | HFlip | 0.15
MobileNet | transfer | 32 | VFlip | 0.15
Table 28. Test with original data.
Model | Accuracy | Precision (Bogus) | Recall (Bogus) | F1-Score (Bogus) | Precision (Real) | Recall (Real) | F1-Score (Real)
Ensemble | 0.9842 | 0.9986 | 0.9832 | 0.991 | 0.8966 | 0.9904 | 0.9411
Confusion Matrix
 | Pred: bogus | Pred: real
True: bogus | 704 | 12
True: real | 1 | 104
Table 29. Test with rotation data.
Model | Accuracy | Precision (Bogus) | Recall (Bogus) | F1-Score (Bogus) | Precision (Real) | Recall (Real) | F1-Score (Real)
Ensemble | 0.9 | 0.872 | 0.9377 | 0.9 | 0.9327 | 0.863 | 0.8965
Confusion Matrix
 | Pred: bogus | Pred: real
True: bogus | 3751 | 249
True: real | 548 | 3452
Table 30. Test with noise data.
Model | Accuracy | Precision (Bogus) | Recall (Bogus) | F1-Score (Bogus) | Precision (Real) | Recall (Real) | F1-Score (Real)
Ensemble | 0.885 | 0.9993 | 0.771 | 0.870 | 0.813 | 0.9995 | 0.897
Confusion Matrix
 | Pred: bogus | Pred: real
True: bogus | 3084 | 916
True: real | 2 | 3998
Table 31. Test with HFlip data.
Model | Accuracy | Precision (Bogus) | Recall (Bogus) | F1-Score (Bogus) | Precision (Real) | Recall (Real) | F1-Score (Real)
Ensemble | 0.9686 | 0.9694 | 0.9686 | 0.9678 | 0.9678 | 0.9695 | 0.9686
Confusion Matrix
 | Pred: bogus | Pred: real
True: bogus | 3871 | 129
True: real | 122 | 3878
Table 32. Test with VFlip data.
Model | Accuracy | Precision (Bogus) | Recall (Bogus) | F1-Score (Bogus) | Precision (Real) | Recall (Real) | F1-Score (Real)
Ensemble | 0.9617 | 0.9545 | 0.9692 | 0.962 | 0.968 | 0.9542 | 0.9614
Confusion Matrix
 | Pred: bogus | Pred: real
True: bogus | 3877 | 123
True: real | 183 | 3817
Table 33. Comparison with previous experimental results.
Ref./Year | Technique | Dataset | Accuracy/F1 (Class 1)
Tabacolde et al. [56] | ML with handcrafted features (e.g., SVM, Decision Tree) | GOTO | RF achieved the best precision, but recall < 0.1
Liu et al. [57] | CNN baseline (1 conv layer) with multiple optimizers and augmentation | GOTO | F1 (class 1) = 0.917 (best at batch size 128)
Proposed | Ensemble of fine-tuned CNNs with Weighted Voting and data augmentation | GOTO | F1 (class 1) = 0.9717
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
