Article

ECA110-Pooling: A Comparative Analysis of Pooling Strategies in Convolutional Neural Networks

Department of Mathematics-Informatics, The National University of Science and Technology POLITEHNICA Bucharest, Pitești University Centre, 110040 Pitești, Romania
*
Author to whom correspondence should be addressed.
Big Data Cogn. Comput. 2025, 9(12), 306; https://doi.org/10.3390/bdcc9120306
Submission received: 1 September 2025 / Revised: 20 November 2025 / Accepted: 27 November 2025 / Published: 2 December 2025

Abstract

Pooling strategies are fundamental to convolutional neural networks, shaping the trade-off between accuracy, robustness to spatial variations, and computational efficiency in modern visual recognition systems. In this paper, we present and validate ECA110-Pooling, a novel rule-based pooling operator inspired by elementary cellular automata. We conduct a systematic comparative study, benchmarking ECA110-Pooling against conventional pooling methods (MaxPooling, AveragePooling, MedianPooling, MinPooling, KernelPooling) as well as state-of-the-art (SOTA) architectures. Experiments on three benchmark datasets—ImageNet (subset), CIFAR-10, and Fashion-MNIST—across training horizons ranging from 20 to 50,000 epochs show that ECA110-Pooling consistently achieves higher Top-1 accuracy, lower error rates, and stronger F1-scores than traditional pooling operators, while maintaining computational efficiency comparable to MaxPooling. Moreover, when compared with SOTA models, ECA110-Pooling delivers competitive accuracy with substantially fewer parameters and reduced training time. These results establish ECA110-Pooling as a principled and validated approach to image classification, bridging the gap between fixed pooling schemes and complex deep architectures. Its interpretable, rule-based design highlights both theoretical significance and practical applicability in contexts that demand a balance of accuracy, efficiency, and scalability.

1. Introduction

Pooling layers are a key component of Convolutional Neural Networks (CNNs). Their role is to gradually shrink the size of feature maps while keeping the most important information intact. This compression makes models faster, helps them generalize better, and reduces the risk of overfitting. The idea of building networks that capture features at multiple levels dates back to Fukushima’s Neocognitron [1], and became practical with the advent of backpropagation [2,3]. Among pooling methods, MaxPooling has become the standard because it preserves the strongest signals, which boosts accuracy in large-scale image recognition tasks like ImageNet [4,5]. Still, depending too heavily on MaxPooling comes with limitations. Alternative fixed operators have been proposed to address the limitations of MaxPooling. AveragePooling captures global context but sacrifices fine-grained local detail [6], while MedianPooling improves robustness to noise and outliers, making it effective for grayscale or noisy datasets [7]. Alongside these fixed approaches, researchers have developed learnable pooling mechanisms. For instance, KernelPooling employs parameterized kernels that adaptively aggregate spatial information, tailoring pooling to task-specific distributions [8]. Other strategies—such as stochastic pooling [9] or strided convolutions that remove pooling entirely [10]—illustrate broader attempts to refine or bypass traditional downsampling. Although these methods can be powerful, they often increase computational cost and complicate training. Beyond purely data-driven designs, biologically and mathematically inspired pooling strategies are emerging as a promising frontier. The integration of Elementary Cellular Automata (ECA) rules into CNNs offers a lightweight, rule-based alternative for encoding local dependencies prior to reduction. Among these, Rule 110 is particularly notable for its proven computational universality and emergent complexity [11,12,13]. Embedding ECA transformations within CNNs enables pooling layers such as the proposed ECA110-Pooling to capture fine-grained micro-pattern interactions that conventional operators often overlook [14,15]. This approach combines theoretical rigor with practical utility, effectively bridging symbolic dynamics and deep learning. Nevertheless, systematic and controlled comparative evaluations remain scarce. Much of the CNN literature continues to rely on MaxPooling by default, often without explicit justification or consideration of task-specific requirements [16,17]. Recent surveys and benchmarking studies highlight the need to revisit these design choices, especially as modern classification pipelines increasingly face constraints of efficiency, generalization, and robustness [18,19,20].
The motivation for this study is twofold. First, it aims to rigorously compare a broad range of pooling strategies—including MaxPooling, AveragePooling, MedianPooling, MinPooling, KernelPooling, and the novel ECA110-Pooling—within a consistent experimental framework across CIFAR-10, Fashion-MNIST, and a subset of ImageNet. Second, it aims to evaluate not only predictive performance, but also computational efficiency and robustness across varying numbers of training epochs. By positioning ECA110-Pooling alongside both traditional operators and state-of-the-art (SOTA) image classification baselines, this work offers a nuanced perspective on how pooling strategies shape the effectiveness and efficiency of CNN-based classification.
Contributions. This work makes the following contributions to the analysis of pooling methods in CNNs:
- Systematic comparative evaluation. We conduct a rigorously controlled comparison of six pooling operators—MaxPooling, AveragePooling, MedianPooling, MinPooling, KernelPooling, and the proposed ECA110-Pooling—within a shared CNN backbone, fixed training protocol, and identical hyperparameter settings. This design isolates the specific impact of the pooling mechanism.
- Novel ECA110-based pooling mechanism. We introduce and formalize a pooling operator derived from Elementary Cellular Automata (Rule 110). The operator follows a compact transform–reduce paradigm, enabling the preservation of discriminative local structures with minimal computational overhead.
- Learnable baseline for reference. The inclusion of KernelPooling as a parametric baseline ensures a fair comparison between rule-based and learnable pooling approaches, highlighting trade-offs between flexibility, accuracy, and efficiency.
- Comprehensive benchmarking across datasets and epochs. Pooling methods are evaluated on CIFAR-10, Fashion-MNIST, and a subset of ImageNet, with training horizons ranging from 20 to 50,000 epochs. Metrics include Top-1 Accuracy, Error Rate, F1-score, per-epoch runtime, and model size.
- Statistical and computational validation. Results are validated through statistical tests (one-way ANOVA, Tukey's HSD, Wilcoxon signed-rank, and paired t-tests), ensuring that conclusions are both statistically robust and practically reproducible.
In summary, this study establishes ECA110-Pooling as a principled, lightweight, and competitive alternative to classical and learnable pooling mechanisms, with implications for advancing image classification across both high-capacity computational infrastructures and resource-constrained environments.
The remainder of this paper is structured as follows. Section 2 reviews related work. Section 3 introduces the theoretical foundations and pooling strategies. Section 4 describes the proposed ECA pooling method and its integration into CNN architectures. Section 5 outlines the experimental protocol, including datasets, CNN backbone, algorithmic framework, and evaluation metrics. Section 6 presents the experimental results, covering convergence studies, complexity analysis, statistical validation, and benchmarking of ECA110-Pooling against state-of-the-art methods. Section 7 discusses limitations and future directions, while Section 8 concludes the paper.

2. Related Work

Pooling operations have been widely studied in the design of Convolutional Neural Networks (CNNs) due to their central role in spatial downsampling, complexity reduction, and the promotion of translational invariance [21]. The earliest operators, MaxPooling and AveragePooling, remain dominant because of their simplicity and efficiency. However, these fixed mechanisms suffer from inherent limitations, including the loss of fine-grained details and sensitivity to noise, motivating a broad range of extensions. One line of work introduces stochasticity or adaptivity into the pooling process. Stochastic pooling selects activations in proportion to their magnitudes, providing implicit regularization that enhances generalization. Mixed pooling and gated pooling further increase flexibility by probabilistically combining or adaptively choosing between max and average pooling based on input characteristics [22]. Adaptive pooling methods such as spatial pyramid pooling and region-of-interest pooling enable the extraction of fixed-size feature representations from variable input dimensions, making them indispensable for object detection and recognition pipelines. Another prominent direction replaces static operators with learnable downsampling mechanisms. Strided convolutions have been proposed as pooling-free alternatives that preserve convolutional structure while reducing resolution [23]. Parametric pooling methods, such as α -integration pooling [24], extend classical operators by introducing tunable parameters, while generalized pooling approaches interpolate continuously between max and average pooling [25]. These advances highlight a growing emphasis on adaptive downsampling, where the pooling function is optimized jointly with the network. Surveys confirm that such operator-level choices can significantly affect generalization and robustness, depending on data distribution and task requirements [26]. More recently, pooling has been linked to the broader trend of incorporating attention mechanisms into CNNs. Residual attention networks combine pooling with attention-driven feature selection, enhancing representational capacity in hierarchical architectures. Empirical studies emphasize the effectiveness of attention-based pooling, which aggregates features according to learned importance weights rather than fixed heuristics [27,28,29]. This paradigm situates pooling within the broader landscape of context-sensitive aggregation, narrowing the gap between CNNs and Transformer-based vision architectures. Biologically and mathematically inspired operators remain less explored but offer promising directions. For instance, Elementary Cellular Automata (ECA) provide a rule-based framework for capturing local structural dependencies in image data. Rule 110, with its proven computational universality and emergent complexity, has been adapted as a lightweight pooling mechanism (ECA110) capable of preserving micro-pattern interactions often lost in traditional pooling. Such operators bridge theoretical concepts with practical deep learning architectures, opening opportunities for interpretable yet efficient pooling designs. Complementary research has explored hybrid and biologically inspired operators, including multiscale pooling strategies [30,31,32], binary pooling with attention [33], and probabilistic pooling designs [34], further expanding the design space. 
Despite this progress, most prior studies have evaluated pooling variants in isolation, typically comparing a new method against only a few baselines under limited experimental conditions. Consequently, systematic large-scale benchmarks that assess multiple pooling operators within identical CNN backbones, training regimes, and datasets remain scarce. This fragmentation obscures the true trade-offs among accuracy, computational cost, and generalization. To address this gap, the present study provides a controlled and comprehensive comparative analysis of six pooling strategies—MaxPooling, AveragePooling, MedianPooling, MinPooling, KernelPooling, and the proposed ECA110-Pooling—across diverse datasets and training horizons, thereby advancing the understanding of pooling as a fundamental component of image classification models.

3. Preliminaries and Theoretical Foundations

3.1. Convolutional Neural Networks

Convolutional Neural Networks (CNNs) represent a foundational paradigm in deep learning for processing data with grid-like structures, such as images and videos. Their evolution has been shaped by architectural innovations and training methodologies that enabled deeper and more expressive models. Early work on efficient optimization, including greedy layer-wise pretraining [35], laid the groundwork for scaling neural networks prior to the advent of modern training techniques. A typical CNN is organized as a hierarchy of convolutional filters for feature extraction, interleaved with pooling or subsampling operations that reduce spatial resolution while retaining discriminative information. These downsampling mechanisms foster translational invariance, a property essential for robust visual recognition. Complementary advances, such as recurrent and memory-augmented architectures, further extended the representational capacity of neural models beyond vision tasks [36,37]. Over the past decade, CNNs have become the backbone of state-of-the-art systems in image classification, object detection, and semantic segmentation. Comprehensive surveys highlight their central role in artificial intelligence and their integration with other paradigms, including recurrent and attention-based networks [38,39]. Taken together, these advances show that CNNs function not just as feature extractors, but as a foundational component of modern deep learning.

3.2. Pooling Strategies: Formal Definitions

Let $x \in \mathbb{R}^{H \times W \times C}$ denote an input feature map, where $H$, $W$, and $C$ represent height, width, and the number of channels, respectively. For a pooling window $\Omega_{i,j}$ of size $k \times k$, centered (or positioned) at spatial coordinates $(i,j)$ in channel $c$, several pooling strategies can be formally defined as follows.
  • MaxPooling:
    $y_{i,j,c} = \max_{(u,v) \in \Omega_{i,j}} x_{u,v,c}.$
    This operator selects the maximum activation within the local neighborhood, thereby preserving the strongest feature responses.
  • AveragePooling:
    $y_{i,j,c} = \frac{1}{|\Omega_{i,j}|} \sum_{(u,v) \in \Omega_{i,j}} x_{u,v,c},$
    where $|\Omega_{i,j}| = k^2$. AveragePooling captures the mean activation, emphasizing global contextual information while reducing sensitivity to individual variations.
  • MedianPooling:
    $y_{i,j,c} = \operatorname{median}\{\, x_{u,v,c} \mid (u,v) \in \Omega_{i,j} \,\},$
    which computes the statistical median of activations in the pooling region, offering robustness to local noise and outliers.
  • MinPooling:
    $y_{i,j,c} = \min_{(u,v) \in \Omega_{i,j}} x_{u,v,c}.$
    This strategy retains the weakest local activations and is mainly used in comparative evaluations as a lower baseline.
  • KernelPooling:
KernelPooling represents an adaptive pooling mechanism in which the aggregation of local activations is guided by learnable spatial kernels. Unlike fixed pooling operators such as MaxPooling or AveragePooling, where the aggregation rule is predetermined, KernelPooling jointly optimizes the kernel weights with the rest of the Convolutional Neural Network during training, thereby enabling task-specific spatial information retention. Formally, let $x \in \mathbb{R}^{H \times W \times C}$ denote an input feature map with $C$ channels. Each channel $c$ is associated with a learnable pooling kernel $W^{(c)} \in \mathbb{R}^{k \times k}$, where $k$ is the spatial extent of the pooling window. For each pooling region $\Omega_{i,j}$, positioned at location $(i,j)$, the downsampled activation is defined as:
$y_{i,j,c} = \sum_{(u,v) \in \Omega_{i,j}} W^{(c)}_{u-i,\,v-j} \cdot x_{u,v,c},$
with stride $s$ controlling the sampling rate across the feature map. This formulation unifies the concepts of filtering and subsampling within a single operation, allowing the network to learn optimal spatial aggregation patterns that balance feature selectivity and robustness to intra-class variability.
The pseudocode implementation in Algorithm 1 summarizes the iterative computation of downsampled activations. From a computational perspective, KernelPooling introduces an additional parameterization proportional to $k^2 \cdot C$, which slightly increases training time per epoch compared to non-learnable pooling operators. Nevertheless, this overhead is often compensated by improved generalization performance, particularly in domains where the spatial distribution of discriminative features is complex and non-uniform.
Algorithm 1: KernelPooling ( k × k kernel, stride s)
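The published pseudocode for Algorithm 1 is provided as an image; as an illustrative sketch (not the authors' implementation), the per-channel kernel-weighted aggregation defined above can be written in PyTorch as follows, using unfold to extract the $k \times k$ windows. Class and variable names are ours.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KernelPooling2d(nn.Module):
    """Sketch of KernelPooling: one learnable k x k kernel per channel,
    applied as a weighted sum over each pooling window (stride s)."""

    def __init__(self, channels: int, k: int = 2, s: int = 2):
        super().__init__()
        self.k, self.s = k, s
        # One k x k weight map per channel, learned jointly with the CNN.
        # Initialized uniformly, so the layer starts out as average pooling.
        self.weight = nn.Parameter(torch.full((channels, k * k), 1.0 / (k * k)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: [B, C, H, W]
        B, C, H, W = x.shape
        # Extract k x k windows: [B, C*k*k, L], with L = number of windows.
        cols = F.unfold(x, kernel_size=self.k, stride=self.s)
        cols = cols.view(B, C, self.k * self.k, -1)
        # Weighted sum over the window dimension, per channel.
        out = (cols * self.weight.view(1, C, self.k * self.k, 1)).sum(dim=2)
        Ho = (H - self.k) // self.s + 1
        Wo = (W - self.k) // self.s + 1
        return out.view(B, C, Ho, Wo)

# Example: pool a 64-channel feature map from 32x32 to 16x16.
pool = KernelPooling2d(channels=64, k=2, s=2)
y = pool(torch.randn(8, 64, 32, 32))
print(y.shape)  # torch.Size([8, 64, 16, 16])
```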

4. ECA110-Based Pooling Mechanism for CNNs

4.1. Definition of ECA110-Pooling

The proposed ECA110-based pooling mechanism is inspired by the theoretical framework of Elementary Cellular Automata (ECA), with particular emphasis on Rule 110, which is renowned for its computational universality and emergent structural complexity. Unlike conventional pooling operators—such as MaxPooling or AveragePooling—that directly apply a reduction over local receptive fields, ECA110-based pooling introduces a two-stage transform–reduce procedure. In this formulation, a deterministic cellular automaton transformation is first applied to the local neighborhood, followed by dimensionality reduction.
Formally, given a pooling window $\Omega_{i,j}$ in channel $c$, activations are first rearranged into a one-dimensional vector:
$z = \mathrm{flatten}\{\, x_{u,v,c} \mid (u,v) \in \Omega_{i,j} \,\}.$
The evolution function $f_{110}$, corresponding to ECA Rule 110, is then applied to $z$, yielding a transformed sequence $z'$ that encodes local structural dependencies:
$z' = f_{110}(z).$
Finally, the pooled output is computed as the normalized sum:
$y_{i,j,c} = \frac{1}{|z'|} \sum_{k=1}^{|z'|} z'_k.$

4.2. Algorithm: ECA110-Pooling (Elementary Cellular Automaton Rule 110 Operator)

The ECA110 pooling operator (defined in Algorithm 2) processes each local window in two stages: (i) a rule-based binary transformation that captures local interactions via the Elementary Cellular Automaton Rule 110, followed by (ii) a normalized-sum reduction. For each channel $c$ and window $\Omega_{i,j}$ (sampled with stride $s$), the feature values are flattened, binarized with respect to a threshold $\tau$, evolved once (or for a fixed number of steps) under Rule 110, and then aggregated by averaging the transformed states. Further details on thresholding can be found in [40,41,42].
Indicator function. The operator $\mathbf{1}[\cdot]$ acts elementwise on a vector and returns a binary vector:
$\mathbf{1}[z \geq \tau]_k = \begin{cases} 1, & z_k \geq \tau, \\ 0, & z_k < \tau, \end{cases} \qquad k = 1, \ldots, |z|.$
Algorithm 2: ECA110-Pooling (window Ω , stride s)
Typical choices for $\tau$ include the local mean or median of the window, or a fixed hyperparameter shared across windows. The parameter $T$ denotes the number of iterations of Rule 110 applied to the binarized vector $z_b$, with $T = 1$ used as the default in our experiments. The function $f_{110}(\cdot)$ denotes the iterative application of Rule 110, an Elementary Cellular Automaton (ECA) characterized by its local update rule acting on triplets of binary states. Specifically (Table 1), ECA 110 maps each triplet $(l, p, r) \in \{0,1\}^3$ to a new state according to the transition rule encoded by the binary pattern $01101110_2$ (110 in decimal). This automaton is notable for its computational universality and emergent structural complexity. In our framework, $f_{110}(z_b)$ applies the update rule $T$ times over the one-dimensional sequence $z_b$, with a deterministic scan order induced by $\mathrm{flatten}(\cdot)$. Boundary handling (fixed zeros) is consistently enforced across experiments.
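As an illustrative sketch of this transform–reduce procedure (our own code, not the authors' released implementation), the following pure-Python functions implement one or more Rule 110 steps with fixed zero boundaries and the normalized-sum reduction. The function and parameter names are ours; the strict flag distinguishes the "$\geq \tau$" indicator defined above from the strict "$> \tau$" comparison used in the worked example of Section 4.3.

```python
# Rule 110 lookup: neighbourhood (l, p, r) -> next state,
# encoded by the binary pattern 01101110 (110 in decimal).
RULE110 = {
    (1, 1, 1): 0, (1, 1, 0): 1, (1, 0, 1): 1, (1, 0, 0): 0,
    (0, 1, 1): 1, (0, 1, 0): 1, (0, 0, 1): 1, (0, 0, 0): 0,
}

def rule110_step(cells):
    """One synchronous Rule 110 update with fixed zero boundary cells."""
    padded = [0] + list(cells) + [0]
    return [RULE110[(padded[i - 1], padded[i], padded[i + 1])]
            for i in range(1, len(padded) - 1)]

def eca110_pool_window(window, tau=None, T=1, strict=False):
    """ECA110-Pooling for one flattened pooling window (transform-reduce).

    window : iterable of activations (e.g., the 9 values of a 3x3 window)
    tau    : binarization threshold (defaults to the window mean)
    T      : number of Rule 110 evolution steps (default 1)
    strict : if True, binarize with z > tau; otherwise z >= tau
    """
    z = list(window)                                            # flatten
    tau = sum(z) / len(z) if tau is None else tau
    zb = [int(v > tau) if strict else int(v >= tau) for v in z]  # binarize
    for _ in range(T):                                          # evolve under Rule 110
        zb = rule110_step(zb)
    return sum(zb) / len(zb)                                    # normalized-sum reduction
```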

4.3. Main Steps of ECA110-Pooling on 3 × 3 Window

For a $3 \times 3$ receptive field, ECA110-based pooling proceeds through the following steps (see Figure 1):
  • Flattening: The activations are extracted and arranged into a one-dimensional vector $z \in \mathbb{R}^9$.
  • Binarization: Elements of $z$ are mapped into binary states via thresholding (relative to mean or median), producing $z_b \in \{0,1\}^9$.
  • Application of Rule 110: The binary sequence $z_b$ evolves according to the local update rule $f_{110}$, generating $z' \in \{0,1\}^9$.
  • Reduction: The normalized sum of $z'$ provides the final pooled activation.
  • Illustrative Example of ECA110 Pooling on a $3 \times 3$ window.
Consider the following local window extracted from a feature map (channel $c$):
$\Omega_{i,j} = \begin{pmatrix} 0.2 & 0.8 & 0.4 \\ 0.5 & 0.9 & 0.1 \\ 0.3 & 0.7 & 0.6 \end{pmatrix}.$
  • Flattening. The window is rearranged into a one-dimensional vector:
    $z = [0.2,\ 0.8,\ 0.4,\ 0.5,\ 0.9,\ 0.1,\ 0.3,\ 0.7,\ 0.6].$
  • Binarization. Using the mean of the window ($\mu = 0.5$) as threshold, values $> \mu$ are mapped to 1 and those $\leq \mu$ to 0:
    $z_b = [0, 1, 0, 0, 1, 0, 0, 1, 1].$
  • Application of Rule 110. The binary sequence $z_b$ is evolved according to the update function $f_{110}$. For illustration, after one iteration step (we use zero-padded synchronous updates under Rule 110) the transformed sequence is:
    $z' = [1, 1, 0, 1, 1, 0, 1, 1, 1].$
  • Reduction by normalized summation. The final pooled activation is computed as:
    $y_{i,j,c} = \frac{1}{9} \sum_{k=1}^{9} z'_k = \frac{1}{9}(1 + 1 + 0 + 1 + 1 + 0 + 1 + 1 + 1) = \frac{7}{9} \approx 0.78.$
Thus, the output of ECA110 pooling for this $3 \times 3$ window is $y_{i,j,c} \approx 0.78$. This demonstrates how the operator embeds a deterministic rule-based transformation prior to normalized reduction, thereby encoding non-linear structural dependencies before aggregation.
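Using the eca110_pool_window sketch from Section 4.2 (with the strict comparison at the threshold, as in this example), the computation above can be reproduced numerically:

```python
# Assumes rule110_step / eca110_pool_window from the sketch in Section 4.2.
window = [0.2, 0.8, 0.4, 0.5, 0.9, 0.1, 0.3, 0.7, 0.6]   # flattened 3x3 window
y = eca110_pool_window(window, T=1, strict=True)          # mean threshold = 0.5
print(round(y, 2))   # 0.78  (= 7/9, matching the reduction step above)
```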

4.4. Ablations on CIFAR-10 at 10,000 Epochs and Windows of Size 3 × 3

To better understand the contribution of individual design choices, we performed an ablation study on CIFAR-10 at 10,000 training epochs, keeping all other training settings fixed. We report Top-1 accuracy under controlled variations of three factors: (i) the threshold parameter $\tau$, (ii) the choice of ECA rule, and (iii) the number of transformation steps $T$. The results of these experiments are summarized in Table 2.
  • Statistical Significance. A one-way ANOVA ($\alpha = 0.05$) revealed a significant main effect of the pooling variant on accuracy across all ablation settings ($p < 0.05$). Based on these results, the recommended configuration is to employ the local-mean thresholding criterion, adopt the ECA110 rule, and, in general, utilize a single evolution step ($T = 1$).
Observation 1
(Alternative Reduction Strategies for ECA110-Pooling). While the normalized sum (mean pooling) is the default reduction strategy in ECA110-Pooling, several alternatives can be employed depending on the task and robustness requirements. Given the transformed vector $z' \in \{0,1\}^{k^2}$ obtained after the application of Rule 110, the pooled output $y$ may be defined using one of the following reduction operators:
  • Normalized Sum (Mean Pooling): $y = \frac{1}{|z'|} \sum_{k=1}^{|z'|} z'_k$.
  • Maximum/Minimum Reduction: $y = \max_k z'_k$, $y = \min_k z'_k$.
  • Weighted Mean Reduction, with learnable or fixed weights $\{w_k\}$: $y = \frac{\sum_{k=1}^{|z'|} w_k z'_k}{\sum_{k=1}^{|z'|} w_k}$.
  • Median Reduction: $y = \operatorname{median}\{z'_1, z'_2, \ldots, z'_{|z'|}\}$.
  • $L_p$-Norm Reduction: $y = \left( \frac{1}{|z'|} \sum_{k=1}^{|z'|} (z'_k)^p \right)^{1/p}$, where $p = 1$ corresponds to the mean and $p \to \infty$ approaches the maximum.
  • Entropy-Based Reduction, measuring the diversity of the transformed activations: $y = -\sum_{k=1}^{|z'|} z'_k \log(z'_k + \epsilon)$, where $\epsilon$ is a small constant for numerical stability.
  • Learnable Reduction (Attention/MLP): In more flexible designs, $y$ can be obtained through a learnable function such as attention pooling or a small neural layer $g(z')$, $y = g(z')$, where $g$ is trained jointly with the CNN backbone.
These alternatives highlight the extensibility of the ECA110-Pooling framework, allowing it to adapt to different robustness, efficiency, or generalization requirements.
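To illustrate this extensibility, the snippet below expresses several of the listed reduction operators as simple Python callables that could replace the normalized sum in the earlier ECA110-Pooling sketch; the names and the example vector are ours.

```python
# Candidate reduction operators acting on the transformed binary vector z'.
REDUCTIONS = {
    "mean":   lambda z: sum(z) / len(z),
    "max":    lambda z: max(z),
    "min":    lambda z: min(z),
    "median": lambda z: sorted(z)[len(z) // 2],
    # Lp-norm reduction; p = 1 recovers the mean, large p approaches the max.
    "lp":     lambda z, p=2: (sum(v ** p for v in z) / len(z)) ** (1.0 / p),
}

z_prime = [1, 1, 0, 1, 1, 0, 1, 1, 1]    # transformed vector from Section 4.3
print({name: round(fn(z_prime), 3) for name, fn in REDUCTIONS.items()})
```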
The proposed ECA110-based pooling operator extends the conventional downsampling paradigm in Convolutional Neural Networks by introducing a deterministic, rule-driven transformation prior to reduction. By applying the Elementary Cellular Automaton Rule 110, the operator captures fine-grained, non-linear local interactions that are typically discarded by classical pooling mechanisms such as MaxPooling or AveragePooling. This design allows CNNs to preserve structural dependencies and spatial dynamics that are critical for precise feature extraction, thereby improving robustness in image classification tasks where subtle local variations play a decisive role. Although it introduces a modest yet constant computational overhead, the method remains efficient while offering a principled and theoretically grounded alternative to conventional pooling operators. As a further illustration, consider the local window $\Omega = \{12, 9, 11, 8, 15, 10, 7, 13, 14\}$ with threshold $\tau = \mathrm{mean}(\Omega) = 11$. Binarizing each entry yields $z_b = [1, 0, 1, 0, 1, 0, 0, 1, 1]$; applying one deterministic update under ECA Rule 110 (i.e., $T = 1$) gives the transformed states $z' = [1, 1, 1, 1, 1, 0, 1, 1, 1]$. The pooled value is the normalized sum $P(\Omega) = \frac{1}{9} \sum_{t=1}^{9} z'_t = \frac{8}{9} \approx 0.88$. This captures fine-grained structure because the rule-driven transform amplifies local boundaries and short regularities (e.g., the 1-0-1 alternation) prior to aggregation; consequently, the subsequent average retains contour-like evidence that max or mean pooling on raw values would blur. For comparison, $\mathrm{avg}(\Omega) = 11$, a statistic that does not reflect the alternating micro-pattern. Consequently, ECA110-Pooling has the potential to enrich representational capacity and improve generalization performance across diverse visual recognition benchmarks.

5. Experimental Framework

5.1. Datasets, Data Splits, and Training Setup

To enable a systematic and unbiased comparative evaluation, three widely adopted benchmark datasets were employed: ImageNet (subset), CIFAR-10, and Fashion-MNIST (Table 3). These datasets were selected to capture a broad spectrum of visual characteristics, resolutions, and intra-class variability, thereby ensuring a robust assessment of pooling strategies across different levels of complexity. All experiments were conducted in Python 3.11, employing PyTorch 2.6.0 to implement and train the CNN architectures; NumPy, SciPy, and scikit-learn for numerical computations and evaluation; and Matplotlib 3.10 for figure generation. Comparable alternatives include TensorFlow/Keras and the MATLAB R2023b Deep Learning Toolbox, which offer integrated workflows and user-friendly graphical interfaces but generally afford less flexibility for the training process. All analyses and simulations were conducted on a computer with an AMD Ryzen 7 PRO 8700G processor (8 cores/16 threads, 4.2 GHz base/5.1 GHz boost, 8 MB L2/16 MB L3 cache), integrated AMD Radeon 780M graphics, 32 GB of RAM, a 512 GB SSD, and Microsoft Windows 11 Pro.
To assess robustness under varying data availability, three train/test splits were employed across all datasets (Table 4). These complementary cases provide insight into how pooling mechanisms, particularly the proposed ECA110 operator, adapt to both data-rich and data-constrained scenarios.
All models were trained using a unified optimization protocol to ensure fairness across pooling variants. The Stochastic Gradient Descent (SGD) optimizer was employed in conjunction with the categorical cross-entropy loss function. Training was performed with a mini-batch size of 64. To investigate performance across different learning regimes, the number of epochs was systematically varied over the range {20, 100, 500, 1000, 5000, 10,000, 50,000}. This schedule captures short-term convergence dynamics, mid-range performance stabilization, and long-term learning behavior, thereby enabling a comprehensive evaluation of pooling strategies under diverse experimental conditions. Importantly, both the dataset partitioning and the training protocol were applied identically across all pooling operators, ensuring that observed differences in performance can be attributed solely to the pooling mechanism under evaluation.

5.2. CNN Network Architecture

To ensure that differences were due solely to the pooling operator, all experiments used the same convolutional neural network (CNN) backbone. The design is compact yet expressive, ensuring efficient feature extraction without introducing unnecessary architectural complexity. The adopted configuration consists of:
  • A first convolutional layer with 64 filters of size 3 × 3 , followed by the evaluated pooling operator.
  • A second convolutional layer with 128 filters of size 3 × 3 , again followed by the selected pooling operator.
  • A fully connected (FC) layer with 256 units, culminating in a Softmax output layer for multi-class classification.
This architectural constraint ensures that observed differences in performance can be attributed directly to the pooling mechanisms rather than confounding factors such as network depth, receptive field size, or parameterization. Figure 2 provides a schematic overview of the proposed CNN, highlighting the integration of ECA Rule-110–based pooling within the processing stages.
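For concreteness, a compact PyTorch sketch of this backbone with the pooling layer passed in as an interchangeable module is given below. It is an illustrative reconstruction (assuming same-padded convolutions and 32 x 32 RGB inputs, as in CIFAR-10), not the authors' released code.

```python
import torch
import torch.nn as nn

class PoolingCNN(nn.Module):
    """Shared backbone: Conv(64) -> Pool -> Conv(128) -> Pool -> FC(256) -> classifier."""

    def __init__(self, make_pool, num_classes: int, in_ch: int = 3, img_size: int = 32):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, 64, kernel_size=3, padding=1)
        self.pool1 = make_pool()                  # interchangeable pooling operator
        self.conv2 = nn.Conv2d(64, 128, kernel_size=3, padding=1)
        self.pool2 = make_pool()
        feat = img_size // 4                      # two k=2, s=2 pooling stages
        self.fc1 = nn.Linear(128 * feat * feat, 256)
        self.fc2 = nn.Linear(256, num_classes)

    def forward(self, x):
        x = self.pool1(torch.relu(self.conv1(x)))
        x = self.pool2(torch.relu(self.conv2(x)))
        x = torch.flatten(x, 1)
        x = torch.relu(self.fc1(x))
        return self.fc2(x)                        # logits; Softmax is applied in the loss

# Example: the MaxPooling variant for CIFAR-10 (10 classes).
model = PoolingCNN(lambda: nn.MaxPool2d(kernel_size=2, stride=2), num_classes=10)
print(model(torch.randn(4, 3, 32, 32)).shape)     # torch.Size([4, 10])
```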

5.3. Algorithmic Framework

The methodology was formalized through the following algorithmic components:
1. Training and Evaluation Algorithm. Algorithm 3 details the training procedure of a CNN with interchangeable pooling operators $P \in \{\text{max}, \text{avg}, \text{median}, \text{min}, \text{kernel}, \text{eca110}\}$. Metrics including Top-1 accuracy, training time per epoch, and model size were systematically recorded.
The components and functionality of Algorithm 3 can be described as follows:
- Inputs and Outputs: The algorithm takes as input datasets $(D_{train}, D_{val})$ (collections of labeled samples used for training and validation), number of classes $K$, pooling type $P \in \{\text{max}, \text{avg}, \text{median}, \text{min}, \text{kernel}, \text{eca110}\}$, training epochs $E$, and optimization hyperparameters: learning rate $\eta$, momentum $m$, and weight decay $\lambda$. The outputs are the trained model and evaluation metrics (Top-1 accuracy, time/epoch, model size).
- Network Initialization: The CNN backbone is fixed across experiments, differing only in the pooling operator: Conv1 ($3 \times 3$, 64 filters) → ReLU → Pool1($P$, $k=2$, $s=2$) → Conv2 ($3 \times 3$, 128 filters) → ReLU → Pool2($P$, $k=2$, $s=2$) → Flatten → FC(256) → ReLU → FC($K$) → Softmax. ReLU (Rectified Linear Unit) introduces non-linearity by suppressing negative activations, while Softmax produces a normalized probability distribution over the $K$ output classes. The pooling operator $P$, applied with window size $k$ and stride $s$, represents the sole interchangeable element within the otherwise fixed backbone. This design ensures that observed performance variations are attributable primarily to the pooling mechanism rather than architectural or parametric differences.
- Optimization: Training uses SGD with learning rate $\eta$, momentum $m$, and weight decay $\lambda$. The cross-entropy loss is computed as $\mathcal{L} = -\sum_{i=1}^{K} y_i \log(\hat{y}_i)$, with standard gradient update steps (zero_grad(), backward(), step()). The called functions have the following roles: CrossEntropy($\hat{y}$, $y$) computes the negative log-likelihood loss on one-hot labels or class indices; zero_grad() resets all previously accumulated gradients; backward($\mathcal{L}$) performs backpropagation, accumulating gradients with respect to network parameters; step() updates model parameters using the SGD rule with momentum and weight decay. Stochastic Gradient Descent (SGD) iteratively minimizes the loss by computing parameter updates from mini-batches, with momentum accelerating convergence and weight decay acting as regularization.
- Training Loop: For each epoch $e = 1, \ldots, E$: (i) batches $(X, y)$ are forwarded through the network (Forward($X$, $P$)), the loss is computed and parameters are updated; (ii) validation is performed on $D_{val}$, logging Top-1 accuracy, time/epoch, and model size.
2. Forward Pass. Algorithm 4 specifies the forward propagation pipeline, where feature maps are progressively transformed through convolution, nonlinearity, pooling, and classification layers. The function Forward($X$, $P$) applies: (i) convolutional layers (Conv2D with $3 \times 3$ filters) for local feature extraction; (ii) ReLU activations to introduce non-linearity by suppressing negative responses; (iii) the interchangeable pooling operator $P$ with window $k = 2$ and stride $s = 2$, responsible for spatial downsampling; (iv) flattening of feature maps into a vector representation; (v) fully connected layers (FC) for global integration of features, followed by a final Softmax that converts logits into class probabilities $\hat{y}$, where logits denote the raw, unnormalized outputs of the final fully connected layer. The pooling operator Pool($T$, $P$, $k$, $s$) supports three cases: (a) standard operators (max, average, median, min), (b) learnable weighted aggregation in KernelPooling, and (c) a transform–reduce scheme in ECA110, which consists of flattening, binarization via $\mathbf{1}[z \geq \tau]$, evolution under Rule 110, and normalized-sum reduction. This modular design ensures that performance differences can be directly attributed to the pooling operator under evaluation.
3. Pooling Operator. Algorithm 5 defines the pooling layer implementation, where the input tensor $X \in \mathbb{R}^{B \times C \times H \times W}$ has four components: $B$ denotes the batch size, $C$ the number of channels, $H$ the height, and $W$ the width of the feature maps. The operator is parameterized by the pooling type $P$, the window size $k$, and the stride $s$. For standard operators ($P \in \{\text{max}, \text{avg}, \text{median}, \text{min}\}$), the algorithm applies the corresponding reduction function over each local region. In the case of KernelPooling, each channel $c$ is associated with a learnable kernel $W^{(c)} \in \mathbb{R}^{k \times k}$, enabling adaptive weighted aggregation of local activations. For the proposed ECA110-Pooling, a transform–reduce framework is employed: each local window is flattened into a vector $z$, binarized through a threshold $\tau$, and then evolved for $T$ iterations under Elementary Cellular Automaton Rule 110. The transformed sequence $z'$ is subsequently reduced via normalized summation, producing the scalar output $U[b, c, i, j]$ for each spatial location. Unless otherwise specified, our default choice is $T = 1$. This unified formulation allows the Pool($X$, $P$, $k$, $s$) function to encompass conventional, learnable, and automaton-driven mechanisms within a single modular framework, facilitating rigorous and fair comparisons across pooling strategies.
Algorithm 3: Training and Evaluating a CNN with a Pluggable Pooling Operator
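The published body of Algorithm 3 is provided as an image; the following PyTorch-style sketch illustrates the described training-and-evaluation loop (SGD with momentum and weight decay, cross-entropy loss, and per-epoch logging of Top-1 accuracy and wall-clock time). The hyperparameter values shown are placeholders rather than the settings used in the experiments.

```python
import time
import torch
import torch.nn as nn

def train_and_evaluate(model, train_loader, val_loader, epochs,
                       lr=0.01, momentum=0.9, weight_decay=5e-4, device="cpu"):
    """Train a CNN with a pluggable pooling operator and log Top-1 accuracy."""
    model.to(device)
    criterion = nn.CrossEntropyLoss()                       # categorical cross-entropy
    optimizer = torch.optim.SGD(model.parameters(), lr=lr,
                                momentum=momentum, weight_decay=weight_decay)
    history = []
    for epoch in range(epochs):
        model.train()
        start = time.time()
        for X, y in train_loader:                           # mini-batches (e.g., size 64)
            X, y = X.to(device), y.to(device)
            optimizer.zero_grad()
            loss = criterion(model(X), y)                   # forward pass + loss
            loss.backward()                                 # backpropagation
            optimizer.step()                                # SGD parameter update
        # Validation: Top-1 accuracy on the held-out split.
        model.eval()
        correct = total = 0
        with torch.no_grad():
            for X, y in val_loader:
                X, y = X.to(device), y.to(device)
                preds = model(X).argmax(dim=1)
                correct += (preds == y).sum().item()
                total += y.numel()
        history.append({"epoch": epoch + 1,
                        "top1": correct / total,
                        "sec_per_epoch": time.time() - start})
    return history
```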
Algorithm 4: Forward($X$, $P$)
$Z_1 \leftarrow \mathrm{ReLU}(\mathrm{Conv2D}_{3\times3,\,64}(X))$
$Z_1 \leftarrow \mathrm{Pool}(Z_1, P, k=2, s=2)$
$Z_2 \leftarrow \mathrm{ReLU}(\mathrm{Conv2D}_{3\times3,\,128}(Z_1))$
$Z_2 \leftarrow \mathrm{Pool}(Z_2, P, k=2, s=2)$
$h \leftarrow \mathrm{Flatten}(Z_2)$
$h \leftarrow \mathrm{ReLU}(\mathrm{FC}_{256}(h))$
$\mathrm{logits} \leftarrow \mathrm{FC}_{K}(h)$
return $\mathrm{Softmax}(\mathrm{logits})$
   This integrated methodology ensures consistency across datasets, data splits, and training conditions, allowing for a fair and clear comparison between different pooling methods.

5.4. Evaluation Metrics

To ensure a rigorous and comprehensive assessment of pooling operators, multiple evaluation metrics were employed, designed to capture both predictive performance and computational efficiency. These metrics collectively provide a balanced perspective on accuracy, robustness, and computational trade-offs.
  • Top-1 Classification Accuracy. The primary evaluation metric, representing the proportion of test samples where the predicted class with the highest probability matches the ground truth. It directly reflects the discriminative capability of pooling operators in image classification.
  • Error Rate. Defined as the complement of Top-1 Accuracy (100% − Accuracy), this metric highlights the proportion of misclassified samples and provides an intuitive measure of classification errors.
  • F1-Score. The harmonic mean of precision and recall, balancing false positives and false negatives. Particularly useful for datasets with class imbalance, it offers a more nuanced perspective on predictive performance beyond raw accuracy.
  • Training Time per Epoch. The average wall-clock time required to complete one training epoch, indicating the computational overhead introduced by each pooling strategy.
  • Model Size. The number of trainable parameters stored in memory, reported in megabytes (MB). This is especially relevant for learnable pooling mechanisms such as KernelPooling, which increase parameterization.
  • Convergence Behavior. The stability and speed with which training accuracy and loss curves converge across epochs, capturing optimization dynamics under different pooling strategies.
  • Statistical Significance. Performance differences were validated using statistical tests across multiple runs: one-way ANOVA with Tukey’s HSD post-hoc test, complemented by paired comparisons (Wilcoxon Signed-Rank and paired t-test). These analyses ensure the robustness and reliability of comparative conclusions.
Algorithm 5: Pool($X$, $P$, $k$, $s$)   ($X$ has shape $[B, C, H, W]$)
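The body of Algorithm 5 likewise appears as an image; the sketch below illustrates the branch for the four standard operators using unfold-based window extraction (the KernelPooling and ECA110 branches would follow the sketches given earlier). It is our illustrative code, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def pool_standard(X: torch.Tensor, P: str, k: int = 2, s: int = 2) -> torch.Tensor:
    """Standard-operator branch of Pool(X, P, k, s) for X of shape [B, C, H, W]."""
    B, C, H, W = X.shape
    cols = F.unfold(X, kernel_size=k, stride=s)          # [B, C*k*k, L]
    cols = cols.view(B, C, k * k, -1)                     # one window per column
    if P == "max":
        out = cols.max(dim=2).values
    elif P == "avg":
        out = cols.mean(dim=2)
    elif P == "median":
        out = cols.median(dim=2).values
    elif P == "min":
        out = cols.min(dim=2).values
    else:
        raise ValueError(f"unsupported pooling type: {P}")
    Ho, Wo = (H - k) // s + 1, (W - k) // s + 1
    return out.view(B, C, Ho, Wo)

# Example: the four reductions applied to the same feature map.
X = torch.randn(2, 64, 16, 16)
for P in ("max", "avg", "median", "min"):
    print(P, pool_standard(X, P).shape)                   # [2, 64, 8, 8]
```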
Table 5 summarizes the evaluation metrics and their specific role in assessing pooling operators within the experimental framework.
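As a brief illustration, the three predictive metrics can be computed with scikit-learn; the labels and predictions below are hypothetical values used only for demonstration.

```python
from sklearn.metrics import accuracy_score, f1_score

y_true = [0, 1, 2, 2, 1, 0, 2, 1]     # hypothetical ground-truth labels
y_pred = [0, 1, 2, 1, 1, 0, 2, 2]     # hypothetical model predictions

top1 = accuracy_score(y_true, y_pred)                 # Top-1 accuracy
error_rate = 1.0 - top1                               # complement of accuracy
f1 = f1_score(y_true, y_pred, average="macro")        # macro-averaged F1-score
print(f"Top-1: {top1:.3f}  Error: {error_rate:.3f}  F1: {f1:.3f}")
```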

6. Experimental Results

In this section, we present the experimental results obtained from the comparative evaluation of pooling operators across three benchmark datasets: ImageNet (subset), CIFAR-10, and Fashion-MNIST. The results are reported under three data-splitting scenarios (80/20, 65/35, and 50/50), with training epochs varied systematically between 20 and 50,000. The analysis highlights the discriminative capability of each pooling operator, the stability and rate of convergence during training, and the computational efficiency achieved. Particular emphasis is placed on the proposed ECA110-based pooling operator, which is examined in relation to both conventional methods such as MaxPooling and AveragePooling, as well as learnable schemes such as KernelPooling. This comparative framework provides a rigorous assessment of whether rule-driven transformations can serve as a principled and efficient alternative to existing pooling strategies.

6.1. Classification Performance Across Epochs

Table 6, Table 7 and Table 8 provide a detailed comparison of Top-1 accuracy, error rate, and F1-score across training epochs for the three benchmark datasets. Several trends can be observed:
  • ECA110-Pooling consistently surpasses MinPooling and MedianPooling across all epochs, while providing performance on par with or superior to MaxPooling and AveragePooling.
  • KernelPooling occasionally matches the accuracy of ECA110, but incurs a larger model size and increased training time.
  • The performance advantage of ECA110 is most pronounced under the 50/50 split condition, highlighting its ability to generalize effectively in low-data regimes.
  • Long-term training schedules ($E \geq 5000$) stabilize the superiority of ECA110, with diminishing returns observed for standard pooling operators.
Cross-dataset comparison. Figure 3 aggregates performance trends across all three datasets. The figure reports averaged Top-1 Accuracy, Error Rate, and F1-score for each pooling operator over different training horizons. The results confirm that:
  • ECA110-Pooling achieves the best overall balance across datasets, with the highest accuracy and F1-score, and the lowest error rate.
  • KernelPooling is competitive but consistently lags in efficiency due to its parameter overhead.
  • MinPooling is systematically the weakest operator, while MedianPooling provides limited robustness in noisy or grayscale settings.
To condense the detailed experimental results, Table 9 reports the aggregated performance of all pooling operators across ImageNet (subset), CIFAR-10, and Fashion-MNIST. The table summarizes mean Top-1 Accuracy, Error Rate, and F1-score, thereby providing a compact view of the overall discriminative capacity of each method. Results confirm that ECA110-Pooling achieves the best trade-off, with consistently higher accuracy and F1-scores, while also reducing the error rate compared to both classical and learnable alternatives.

6.2. Convergence Dynamics

The convergence behavior of training across different pooling operators was systematically evaluated on ImageNet (subset), CIFAR-10, and Fashion-MNIST. Conventional operators such as MaxPooling and AveragePooling exhibit rapid early improvements but tend to plateau after approximately 500–1000 epochs, reflecting limited representational flexibility once dominant activations have been captured. MedianPooling shows slightly more stable convergence, whereas MinPooling performs poorly, lagging in both speed and final accuracy. In contrast, the proposed ECA110-Pooling demonstrates gradual yet sustained improvements throughout extended training schedules, including long runs of 10,000 and 50,000 epochs. This consistent progression suggests that the automaton-driven transform-reduce mechanism effectively preserves local structural dependencies over time, enabling richer feature representations. KernelPooling achieves comparable late-stage accuracy but incurs higher computational cost due to increased parameterization and slower epoch times. Figure 4 illustrates convergence curves for all three datasets across the full training range (20 to 50,000 epochs). The visualization highlights that while conventional operators stagnate after early gains, ECA110-Pooling continues to deliver progressive accuracy improvements, confirming its ability to sustain long-term learning and enhance generalization under both data-rich and data-constrained conditions. Overall, the convergence analysis demonstrates that ECA110-Pooling avoids premature stagnation and achieves superior performance compared to conventional pooling schemes.

6.3. Computational Complexity

To assess the computational overhead prior to practical evaluation, a complexity analysis of the six pooling strategies was conducted. Let $k \times k$ denote the pooling window size and $C$ the number of channels in the feature map. The per-window complexity can be summarized as follows:
  • MaxPooling, AveragePooling, MedianPooling, MinPooling. These operators require evaluating all $k^2$ activations within each pooling window for every channel, yielding a computational cost of $O(k^2 C)$. MedianPooling introduces an additional constant-time sorting step per window, but its asymptotic complexity remains unchanged.
  • KernelPooling. This operator computes a weighted sum of activations using learnable kernels of size $k \times k$ per channel. Its computational cost is identical to the classical operators, $O(k^2 C)$, but it introduces $k^2 \cdot C$ additional parameters, increasing memory requirements and training time due to gradient updates.
  • ECA110-Pooling. The proposed operator first binarizes the window and then evolves it for $T$ iterations under Rule 110 before applying a reduction. This results in a per-window cost of $O(k^2 C T)$, where $T$ denotes the number of automaton steps. Since $T$ is treated as a small constant in practice (e.g., $T = 1$ in our experiments), the complexity remains effectively linear in $k^2 C$, incurring only a modest constant-time overhead relative to standard pooling.
In summary, all six pooling operators share the same asymptotic order of growth, $O(k^2 C)$. KernelPooling is distinguished by its additional learnable parameters, while ECA110-Pooling introduces only a lightweight constant-factor expansion due to cellular automaton iterations. This analysis clarifies the findings from the experimental results, where ECA110 achieves an efficiency close to MaxPooling, but surpasses it in predictive performance.
The complexity profile summarized in Table 10 and illustrated in Figure 5 shows that, although all pooling strategies share the same asymptotic complexity of $O(k^2 C)$, practical differences emerge in constant factors and parameterization. KernelPooling increases memory usage and training cost due to its learnable weights, while MedianPooling introduces a modest runtime penalty from the sorting operation. By contrast, ECA110-Pooling adds only a negligible constant-time overhead, despite the additional transformation based on cellular automaton rules. These distinctions anticipate the results presented in the next subsection, where ECA110 demonstrates a favorable balance between accuracy improvements and runtime efficiency.
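As a concrete illustration under the backbone of Section 5.2 (pooling windows with $k = 2$), the two KernelPooling stages operate on $C = 64$ and $C = 128$ channels, so their learnable kernels add $k^2 \cdot C = 4 \cdot 64 = 256$ and $4 \cdot 128 = 512$ weights, respectively, whereas the fixed operators and ECA110-Pooling add no trainable parameters at all.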

6.4. Computational Efficiency

Efficiency results are presented in Table 11, which reports the average training time per epoch and model size across the three benchmark datasets (ImageNet subset, CIFAR-10, and Fashion-MNIST). The comparison highlights the trade-offs between computational cost and architectural complexity for each pooling operator.
Standard pooling methods (Max, Average, Min, Median) remain highly efficient, combining low execution time with minimal memory requirements. KernelPooling, by contrast, substantially increases model size due to its learnable kernels; while this can improve accuracy in certain scenarios, it also introduces a significant computational burden. The proposed ECA110-Pooling achieves a favorable balance, adding only a modest and stable overhead relative to non-learnable baselines while remaining considerably more efficient than KernelPooling. Notably, ECA110’s computational overhead is invariant to dataset size, a direct consequence of its rule-driven local design. To complement the tabular results, Figure 6 provides a direct comparison between average epoch times and classification accuracy across all six pooling strategies. The visualization underscores the trade-off between efficiency and predictive performance, showing that ECA110 consistently delivers the highest accuracy with only marginal runtime overhead, whereas KernelPooling achieves competitive accuracy at substantially higher computational cost.

6.5. Statistical Validation

To rigorously assess the reliability of the observed performance differences, we conducted statistical validation across five independent runs per configuration. Both parametric and non-parametric analyses were employed to ensure robustness under varying distributional assumptions. In a one-way ANOVA, the between-group degrees of freedom are given by $(k - 1)$ and the within-group (residual) degrees of freedom by $(N - k)$, where $k$ is the number of groups and $N$ the total number of observations. In this study, $k = 6$ corresponds to the pooling operators (MaxPooling, AveragePooling, MedianPooling, MinPooling, KernelPooling, and ECA110-Pooling). With five independent runs per dataset-split-epoch configuration, the total number of observations was $N = 6 \times 5 = 30$, yielding between-group df = 5 and within-group df = 24.
A one-way ANOVA was applied to compare mean classification accuracy across pooling operators for each dataset, split ratio, and training epoch. The results, summarized in Table 12 and Table 13, revealed statistically significant differences ($p < 0.05$) under nearly all experimental conditions, spanning ImageNet (subset), CIFAR-10, and Fashion-MNIST. Significance was observed as early as 20 epochs and became more pronounced with longer training, indicating that pooling choice consistently influenced learning outcomes. Post-hoc analysis using Tukey's Honestly Significant Difference (HSD) test identified the specific operators responsible for these differences. As shown in Table 14, ECA110-Pooling (ECA) significantly outperformed conventional operators at mid and late training stages. For example, at 500 and 5000 epochs, ECA consistently surpassed MaxPooling, AveragePooling, and MinPooling across all datasets and split ratios. At very early epochs (e.g., 20), differences were less pronounced, with MaxPooling or MedianPooling occasionally outperforming only MinPooling. These insights suggest that ECA's benefits become more evident as training progresses.
To complement ANOVA and Tukey HSD, paired t-tests were conducted between ECA and each baseline pooling operator (Table 15). The t-tests confirmed that ECA's improvements were statistically significant ($p < 0.05$) in most cases. CIFAR-10 showed strong advantages for ECA over MinPooling and AveragePooling across all splits, while ImageNet demonstrated significant gains over MaxPooling and AveragePooling at later epochs (500 and 5000). Fashion-MNIST results followed the same trend, with ECA consistently outperforming baselines under both balanced and reduced splits. Recognizing that neural network performance distributions may deviate from normality, we further validated results using the non-parametric Wilcoxon signed-rank test (Table 16). The Wilcoxon results corroborated the t-test outcomes, confirming that ECA's improvements are systematic. Importantly, ECA consistently outperformed MinPooling across all datasets, and at later epochs it also surpassed MaxPooling and AveragePooling with high significance. Taken together, the convergence of results across four complementary statistical tests provides strong evidence that ECA110-Pooling yields reliable improvements. Its advantages are not confined to specific datasets or split ratios but generalize across conditions, with the most pronounced gains observed in data-constrained regimes (65/35 and 50/50 splits) and at longer training horizons.
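As an illustration of this testing procedure, the sketch below applies SciPy's one-way ANOVA, paired t-test, and Wilcoxon signed-rank test to hypothetical per-run accuracies for three of the operators (five runs each, matching the design above); Tukey's HSD is available separately through statsmodels' pairwise_tukeyhsd. The numbers are illustrative placeholders, not results from the paper.

```python
from scipy import stats

# Hypothetical Top-1 accuracies (%) over five independent runs per operator.
acc = {
    "eca110": [82.1, 81.7, 82.4, 81.9, 82.2],
    "max":    [80.9, 81.1, 80.6, 81.0, 80.8],
    "avg":    [80.2, 80.5, 80.1, 80.4, 80.3],
}

# One-way ANOVA across operators (between-group df = k-1, within-group df = N-k).
f_stat, p_anova = stats.f_oneway(*acc.values())

# Paired comparisons of ECA110 against one baseline on matched runs.
t_stat, p_ttest = stats.ttest_rel(acc["eca110"], acc["max"])
w_stat, p_wilcoxon = stats.wilcoxon(acc["eca110"], acc["max"])

print(f"ANOVA p={p_anova:.4f}  paired t-test p={p_ttest:.4f}  Wilcoxon p={p_wilcoxon:.4f}")
```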
These results establish ECA110-Pooling as a statistically validated alternative to conventional pooling mechanisms, offering both accuracy and robustness benefits across diverse image classification tasks.

6.6. Benchmarking ECA110-Pooling Against State-of-the-Art Methods

To contextualize the effectiveness of ECA110-Pooling, its performance was benchmarked against representative state-of-the-art (SOTA) architectures under identical experimental conditions. The selected baselines include ResNet-50 [43], DenseNet-121 [44], EfficientNet-B0 [45], MobileNetV2 [46], and the Vision Transformer (ViT-Small) [47]. All models were consistently reimplemented using the same data splits (80/20, 65/35, and 50/50) and training schedules (500, 5000, and 10,000 epochs), ensuring a fair and unbiased comparison between pooling-based and SOTA-based approaches.
Table 17 reports Top-1 Accuracy, Error Rate, and F1-score across ImageNet (subset), CIFAR-10, and Fashion-MNIST. Several key observations emerge from these results. On Fashion-MNIST, ECA110-Pooling not only matches but occasionally surpasses SOTA models, confirming its robustness in grayscale image classification tasks. On CIFAR-10, ECA110 approaches the performance of ResNet and DenseNet at extended training horizons (5,000+ epochs), while maintaining substantially lower computational cost, positioning it as a competitive solution in resource-constrained environments. On ImageNet (subset), although large-scale models such as EfficientNet-B0 and DenseNet achieve higher absolute accuracy, ECA110 significantly narrows the gap under reduced-data splits (65/35 and 50/50). This demonstrates strong generalization capability when training data are limited. Importantly, these advantages are achieved with markedly lower parameterization and training time compared to heavyweight models such as ResNet-50, DenseNet-121, and EfficientNet-B0.
Overall, the benchmarking analysis establishes ECA110-Pooling as a lightweight yet effective alternative to conventional SOTA architectures, offering a favorable balance between efficiency and predictive performance across diverse datasets and training regimes.
Table 18 provides a comparative analysis of the parameterization requirements and memory footprint associated with different pooling methods and SOTA architectures. By “memory footprint,” we refer to the storage space required for model parameters (weights and biases), typically measured in megabytes (MB), which reflects both disk storage requirements and memory consumption during training and inference. Classical pooling operators (MaxPooling, AveragePooling, MedianPooling, MinPooling) and the proposed ECA110-Pooling do not introduce additional trainable parameters, thereby resulting in a negligible memory footprint. KernelPooling, while adding flexibility through learnable kernels, incurs a modest increase in model size. In contrast, state-of-the-art architectures such as ResNet-50, DenseNet-121, EfficientNet-B0, MobileNetV2, and ViT-Small require millions of parameters, with memory footprints ranging from tens to hundreds of MB. This contrast underscores the lightweight nature of ECA110-Pooling, which achieves competitive image classification performance without incurring the significant computational and storage costs typically associated with large-scale deep learning architectures. To complement these tabular comparisons, Figure 7 illustrates the trade-off between performance and efficiency. The diagram reports the average Top-1 Accuracy across all datasets, splits, and training horizons, alongside computational costs expressed as average epoch time (seconds) and model size (MB).
Overall, the benchmarking results position ECA110-Pooling as a principled and competitive alternative to established SOTA architectures in the context of image classification. While advanced models such as EfficientNet-B0 and DenseNet-121 continue to achieve superior absolute accuracy in large-scale, data-abundant scenarios, ECA110 consistently demonstrates enhanced robustness under data-constrained conditions (65/35 and 50/50 splits) and achieves notable efficiency in terms of runtime and parameter footprint. This combination of generalization capability and computational efficiency underscores the relevance of ECA110-Pooling as a reliable mechanism for advancing image classification across both standard and resource-limited environments.

7. Discussion

The comparative evaluation of six pooling operators (MaxPooling, AveragePooling, MedianPooling, MinPooling, KernelPooling, and the proposed ECA110-Pooling) under a uniform CNN backbone and standardized training protocol demonstrates that the pooling stage is a decisive factor in predictive performance, convergence dynamics, and computational trade-offs in image classification. Analysis across three benchmark datasets shows that ECA110-Pooling consistently emerges as the most effective strategy, particularly at extended training horizons of 5000 epochs or more. Its superiority is reflected in higher Top-1 accuracies, lower error rates, and stronger F1-scores compared with both fixed and learnable pooling approaches. KernelPooling ranks as the closest competitor, offering competitive accuracy but at the expense of increased computational overhead, while MaxPooling remains a robust and efficient baseline. AveragePooling and MedianPooling provide moderate benefits in specific contexts, such as grayscale or noise-prone datasets, whereas MinPooling consistently underperforms across all experimental conditions. The trajectory of learning curves reveals a common three-phase dynamic: rapid early improvements within the first 1000 epochs, moderate yet steady gains up to 10,000 , and eventual saturation approaching 50,000 epochs. Importantly, the relative ordering of pooling methods remains stable throughout, underscoring the robustness of ECA110’s advantage. This robustness extends beyond accuracy, as ECA110 achieves lower error rates and superior F1-scores across all datasets, reflecting improvements not only in predictive strength but also in the balance between precision and recall. These results are reinforced by extensive statistical validation. One-way ANOVA confirms significant main effects of pooling method across nearly all configurations, establishing that operator choice has a measurable impact on classification accuracy. Post-hoc comparisons reveal that ECA110 surpasses MinPooling, AveragePooling, and MaxPooling from as early as 500 epochs, with the strongest improvements observed at longer training horizons and in data-constrained splits. Complementary paired t-tests corroborate these results, consistently confirming ECA110’s superiority over MinPooling and extending its significance against MaxPooling and AveragePooling in CIFAR-10 and Fashion-MNIST. The non-parametric Wilcoxon signed-rank test further consolidates these outcomes, ruling out distributional artifacts and confirming that the observed improvements are systematic and statistically significant. Taken together, these analyses provide convergent evidence that the gains achieved by ECA110 are both reliable and statistically robust. From a computational perspective, the efficiency profile of pooling operators highlights important trade-offs. Conventional methods such as MaxPooling, AveragePooling, and MinPooling incur negligible runtime differences, with MedianPooling slightly penalized by sorting overhead. KernelPooling, while accurate, introduces both memory and computational costs proportional to its learnable parameters, limiting efficiency in large-scale scenarios. By contrast, ECA110 integrates a lightweight cellular automaton transformation with constant-time complexity, achieving performance levels comparable to or better than KernelPooling while maintaining per-epoch costs close to MaxPooling. 
This favorable balance between accuracy and computational cost also holds when considering error rates and F1-scores, confirming that ECA110's accuracy gains are not offset by instability or sensitivity to class imbalance. In applied contexts, these results position ECA110-Pooling as a principled alternative to both fixed and learnable pooling operators. Its deterministic, rule-based mechanism enhances selectivity to structured micro-patterns while retaining efficiency, bridging the gap between computational parsimony and discriminative capacity. KernelPooling remains attractive when additional flexibility justifies its overhead, MaxPooling continues to serve as a reliable baseline when efficiency is prioritized, and MedianPooling may hold value in noise-robustness scenarios. MinPooling, however, proves consistently suboptimal and is not recommended.

While these findings are robust across datasets, splits, and statistical tests, certain threats to validity should be acknowledged, including dependence on the CNN backbone, sensitivity to data augmentation, and hardware-specific runtime variability. Nonetheless, the consistency of relative trends across experimental conditions suggests strong generalizability. A potential limitation of ECA110-Pooling lies in its reduced adaptability compared with fully learnable pooling layers, particularly in scenarios characterized by substantial domain shift. In addition, the choice of cellular automaton rule, neighborhood configuration, and windowing hyperparameters may require task-specific tuning, and the marginal benefits can diminish when the operator is applied to very deep backbones that already capture long-range dependencies.

Future research could explore hybrid models that integrate cellular automaton rules with learnable weighting schemes, adaptive training curricula that dynamically vary pooling operators during learning, and filtering strategies designed to enhance robustness. Such directions may provide deeper insight into the generalization potential of rule-based pooling mechanisms like ECA110. In summary, ECA110-Pooling emerges as an interpretable and computationally efficient alternative for image classification, delivering competitive performance even in resource-constrained scenarios. By combining efficiency, robustness, and generalization across diverse datasets, ECA110-Pooling can be regarded as an appropriate method for practical deployment, particularly in contexts where maintaining a balance between predictive accuracy and computational cost is essential.
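For reproducibility, the statistical protocol discussed above (one-way ANOVA, Tukey's HSD, paired t-tests, and the Wilcoxon signed-rank test) can be applied to per-run accuracy logs along the lines of the sketch below; the accuracy values shown are placeholders for illustration, not measurements from the study.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Hypothetical per-run Top-1 accuracies (five runs per operator) at one fixed
# dataset, split, and epoch budget; replace with the actual experiment logs.
acc = {
    "ECA110": [93.0, 93.1, 92.9, 93.2, 93.0],
    "Max":    [92.0, 92.1, 91.9, 92.2, 92.0],
    "Avg":    [91.6, 91.5, 91.7, 91.6, 91.8],
    "Min":    [87.8, 87.7, 87.9, 87.6, 87.8],
}

# One-way ANOVA across pooling operators
f_stat, p_anova = stats.f_oneway(*acc.values())

# Tukey HSD post-hoc pairwise comparisons
scores = np.concatenate(list(acc.values()))
labels = np.repeat(list(acc.keys()), [len(v) for v in acc.values()])
tukey = pairwise_tukeyhsd(scores, labels, alpha=0.05)

# Paired t-test and Wilcoxon signed-rank test: ECA110 vs. MaxPooling
t_stat, p_t = stats.ttest_rel(acc["ECA110"], acc["Max"])
w_stat, p_w = stats.wilcoxon(acc["ECA110"], acc["Max"])

print(f"ANOVA: F={f_stat:.2f}, p={p_anova:.4f}")
print(tukey.summary())
print(f"paired t-test p={p_t:.4f}, Wilcoxon p={p_w:.4f}")
```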

8. Conclusions

This work has presented a systematic evaluation of six pooling strategies—MaxPooling, AveragePooling, MedianPooling, MinPooling, KernelPooling, and the proposed ECA110-Pooling—within a controlled CNN framework applied to Fashion-MNIST, CIFAR-10, and a subset of ImageNet. By employing uniform training protocols across multiple data splits and a wide range of epochs, the analysis confirmed that the pooling stage is a decisive factor in shaping classification accuracy, convergence behavior, and computational efficiency.

ECA110-Pooling consistently achieved the highest Top-1 accuracies, the lowest error rates, and the strongest F1-scores across datasets, particularly at extended training horizons (Ep ≥ 5000). It demonstrated competitive performance relative to state-of-the-art architectures such as ResNet, DenseNet, and EfficientNet, while requiring substantially fewer parameters and reduced training time. These results highlight its capacity for generalization under resource-constrained conditions. The lightweight, rule-based transform–reduce paradigm of ECA110 enables the preservation of discriminative micro-patterns that are often discarded by conventional pooling, thereby enhancing predictive robustness in both natural and grayscale image modalities.

From a computational perspective, ECA110-Pooling achieves these gains without incurring the parameter overhead of KernelPooling or the runtime burden of large-scale SOTA models, maintaining per-epoch costs close to those of MaxPooling. This balance between efficiency and accuracy positions ECA110-Pooling as an appropriate method for real-world deployments where computational budgets are limited yet reliable classification performance remains essential.

Overall, the results suggest that ECA110-Pooling can serve as a suitable default strategy for modern CNNs, effectively bridging the gap between traditional operators and learnable pooling mechanisms. In addition to reaffirming the fundamental importance of pooling in convolutional architectures, this study identifies new directions for hybrid approaches that integrate cellular automaton principles with adaptive learning. Demonstrating that interpretable, rule-based transformations can deliver competitive or even superior performance compared with complex SOTA architectures provides both practical insights and conceptual advances for the design of efficient deep learning models in image classification.

Author Contributions

Conceptualization, D.C. and C.B.; methodology, D.C. and C.B.; software, D.C.; validation, D.C. and C.B.; formal analysis, D.C. and C.B.; investigation, C.B.; resources, D.C. and C.B.; data curation, D.C. and C.B.; writing-original draft preparation, D.C. and C.B.; writing-review and editing, D.C. and C.B.; visualization, D.C. and C.B.; supervision, C.B.; project administration, D.C. and C.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
CNN: Convolutional Neural Network
ECA: Elementary Cellular Automaton
ECA110: Elementary Cellular Automaton, Rule 110
ECA110-Pooling: Pooling operator based on ECA Rule 110
SOTA: State-of-the-Art
Top-1 Acc.: Top-1 Accuracy
F1: F1-Score (harmonic mean of precision and recall)
Ep.: Training Epochs
ANOVA: Analysis of Variance
HSD: Honestly Significant Difference (Tukey test)
ViT: Vision Transformer
ResNet: Residual Neural Network
DenseNet: Densely Connected Convolutional Network
EfficientNet: Efficient Convolutional Neural Network Family
MobileNetV2: Mobile Network Version 2
MLP: Multi-Layer Perceptron
FMNIST: Fashion-MNIST Dataset
CIFAR-10: Canadian Institute For Advanced Research, 10 classes
ImageNet: Large Visual Recognition Challenge Dataset
MB: Megabyte (memory size)
s/epoch: Seconds per epoch (training time)
MaxPool: Maximum Pooling
AvgPool: Average Pooling
MedPool: Median Pooling
MinPool: Minimum Pooling
KerPool: Kernel-based Pooling

References

  1. Fukushima, K. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol. Cybern. 1980, 36, 193–202. [Google Scholar] [CrossRef] [PubMed]
  2. Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning representations by back-propagating errors. Nature 1986, 323, 533–536. [Google Scholar] [CrossRef]
  3. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324. [Google Scholar] [CrossRef]
  4. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. Adv. Neural Inf. Process. Syst. 2012, 25. Available online: https://proceedings.neurips.cc/paper/2012/hash/c399862d3b9d6b76c8436e924a68c45b-Abstract.html (accessed on 26 November 2025). [CrossRef]
  5. Zeiler, M.D.; Fergus, R. Visualizing and understanding convolutional networks. In European Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2014; pp. 818–833. [Google Scholar]
  6. Boureau, Y.-L.; Ponce, J.; LeCun, Y. A theoretical analysis of feature pooling in visual recognition. In Proceedings of the 27th International Conference on Machine Learning (ICML 2010), Haifa, Israel, 21–24 June 2010. [Google Scholar]
  7. Lee, C.-Y.; Gallagher, P.; Tu, Z. Generalizing pooling functions in convolutional neural networks: Mixed, gated, and tree. In Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS 2016), Cadiz, Spain, 9–11 May 2016; pp. 464–472. [Google Scholar]
  8. Cui, Y.; Zhou, F.; Wang, J.; Liu, X.; Lin, Y.; Belongie, S.J. Kernel pooling for Convolutional Neural Network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017. [Google Scholar] [CrossRef]
  9. Zeiler, M.D.; Fergus, R. Stochastic pooling for regularization of deep convolutional neural networks. In Proceedings of the International Conference on Learning Representations, Scottsdale, AZ, USA, 2–4 May 2013. [Google Scholar]
  10. Springenberg, J.T.; Dosovitskiy, A.; Brox, T.; Riedmiller, M. Striving for simplicity: The all convolutional net. In Proceedings of the International Conference on Learning Representations (ICLR), San Diego, CA, USA, 7–9 May 2015. [Google Scholar]
  11. Wolfram, S. Statistical mechanics of cellular automata. Rev. Mod. Phys. 1983, 55, 601–644. [Google Scholar] [CrossRef]
  12. Wolfram, S. A New Kind of Science; Wolfram Media: Champaign, IL, USA, 2002. [Google Scholar]
  13. Cook, M. Universality in elementary cellular automata. Complex Syst. 2004, 15, 1–40. [Google Scholar] [CrossRef]
  14. Gilpin, W. Cellular automata as convolutional neural networks. Phys. Rev. E 2019, 100, 032402. [Google Scholar] [CrossRef]
  15. Mordvintsev, A.; Randazzo, E.; Niklasson, E.; Levin, M. Growing neural cellular automata. Distill 2020, 5, e23. [Google Scholar] [CrossRef]
  16. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. In Proceedings of the International Conference on Learning Representations, Banff, AB, Canada, 14–16 April 2014. [Google Scholar]
  17. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015. [Google Scholar]
  18. Mienye, I.D.; Swart, T.G. A Comprehensive Review of Deep Learning: Architectures, Recent Advances, and Applications. Information 2024, 15, 755. [Google Scholar] [CrossRef]
  19. Galanis, N.I.; Vafiadis, P.; Mirzaev, K.G.; Papakostas, G.A. Convolutional Neural Networks: A Roundup and Benchmark of Their Pooling Layer Variants. Algorithms 2022, 15, 391. [Google Scholar] [CrossRef]
  20. Zafar, A.; Saba, N.; Arshad, A.; Alabrah, A.; Riaz, S.; Suleman, M.; Zafar, S.; Nadeem, M. Convolutional Neural Networks: A Comprehensive Evaluation and Benchmarking of Pooling Layer Variants. Symmetry 2024, 16, 1516. [Google Scholar] [CrossRef]
  21. Han, K.; Wang, Y.; Chen, H.; Chen, X.; Guo, J.; Liu, Z.; Tang, Y.; Xiao, A.; Xu, C.; Xu, Y.; et al. A Survey on Vision Transformer. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 87–110. [Google Scholar] [CrossRef]
  22. Wang, F.; Jiang, M.; Qian, C.; Yang, S.; Li, C.; Zhang, H.; Wang, X.; Tang, X. Residual attention network for image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017. [Google Scholar] [CrossRef]
  23. Zhang, R. Making convolutional networks shift-invariant again. In Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019. [Google Scholar]
  24. Eom, H.; Choi, H. Alpha-Integration Pooling for Convolutional Neural Networks. arXiv 2018, arXiv:1811.03436. [Google Scholar] [CrossRef]
  25. Bieder, F.; Sandkuhler, R.; Cattin, P. Comparison of methods generalizing max- and average-pooling. arXiv 2021, arXiv:2103.01746. [Google Scholar] [CrossRef]
  26. Gholamalinezhad, H.; Khosravi, H. Pooling Methods in Deep Neural Networks, A Review. arXiv 2020, arXiv:2009.07485. [Google Scholar] [CrossRef]
  27. Yu, F.; Xiu, X.; Li, Y. A Survey on Deep Transfer Learning and Beyond. Mathematics 2022, 10, 3619. [Google Scholar] [CrossRef]
  28. Zhu, T.; Luo, W.; Yu, F. Convolution- and Attention-Based Neural Network for Automated Sleep Stage Classification. Int. J. Environ. Res. Public Health 2020, 17, 4152. [Google Scholar] [CrossRef]
  29. Kardakis, S.; Perikos, I.; Grivokostopoulou, F.; Hatzilygeroudis, I. Examining Attention Mechanisms in Deep Learning Models for Sentiment Analysis. Appl. Sci. 2021, 11, 3883. [Google Scholar] [CrossRef]
  30. Tan, M.; Le, Q. EfficientNet: Rethinking model scaling for convolutional neural networks. In Proceedings of the 36th International Conference on Machine Learning (ICML), ICML 2019, Long Beach, CA, USA, 9–15 June 2019; pp. 6105–6114. [Google Scholar]
  31. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef]
  32. Elizar, E.; Zulkifley, M.A.; Muharar, R.; Zaman, M.H.M.; Mustaza, S. A review on multiscale-deep-learning applications. Sensors 2022, 22, 7384. [Google Scholar] [CrossRef]
  33. Chen, C.; Zhang, H. Attention block based on binary pooling. Appl. Sci. 2023, 13, 10012. [Google Scholar] [CrossRef]
  34. Liu, Y.; Tian, J. Probabilistic Attention Map: A Probabilistic Attention Mechanism for Convolutional Neural Networks. Sensors 2024, 24, 8187. [Google Scholar] [CrossRef] [PubMed]
  35. Bengio, Y.; Lamblin, P.; Popovici, D.; Larochelle, H. Greedy layer-wise training of deep networks. In Proceedings of the Advances in Neural Information Processing Systems 19, Vancouver, BC, Canada, 4–7 December 2006. [Google Scholar]
  36. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef]
  37. Graves, A.; Mohamed, A.R.; Hinton, G. Speech recognition with deep recurrent neural networks. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada, 26–31 May 2013. [Google Scholar] [CrossRef]
  38. Schmidhuber, J. Deep learning in neural networks: An overview. Neural Netw. 2015, 61, 85–117. [Google Scholar] [CrossRef]
  39. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016. [Google Scholar]
  40. Kittler, J.; Illingworth, J. Minimum Error Thresholding. Pattern Recognit. 1986, 19, 41–47. [Google Scholar] [CrossRef]
  41. Bernsen, J. Dynamic Thresholding of Grey-Level Image. In Proceedings of the 8th International Conference on Pattern Recognition, Paris, France, 27–31 October 1986; pp. 1251–1255. [Google Scholar]
  42. Sauvola, J.; Pietikainen, M. Adaptive Document Image Binarization. Pattern Recognit. Soc. 2000, 33, 225–236. [Google Scholar] [CrossRef]
  43. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  44. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708. [Google Scholar]
  45. Yang, B.; Bender, G.; Le, Q.V.; Ngiam, J. CondConv: Conditionally Parameterized Convolutions for Efficient Inference. In Proceedings of the 33rd International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; Curran Associates Inc.: Red Hook, NY, USA, 2019; Volume 32. [Google Scholar] [CrossRef]
  46. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv 2017, arXiv:1704.04861. [Google Scholar] [CrossRef]
  47. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2021, arXiv:2010.11929. [Google Scholar] [CrossRef]
Figure 1. Schematic representation of the ECA110-Pooling procedure applied to a 3 × 3 activation window. The process involves four sequential steps: flattening, binarization, transformation via Rule 110, and normalized reduction.
Figure 2. CNN pipeline with ECA Rule 110 pooling, illustrating the transform–reduce branch within the overall architecture.
Figure 3. Aggregated comparison of pooling methods across ImageNet (subset), CIFAR-10, and Fashion-MNIST. Top-1 Accuracy, Error Rate, and F1-score are averaged across datasets and reported at multiple training epochs (20, 100, 500, 1000, 5000, 10,000, 50,000 epochs). ECA110-Pooling consistently achieves superior performance while maintaining efficiency.
Figure 4. Comparative convergence dynamics of pooling operators (Max, Average, Median, Min, Kernel, and ECA110) across ImageNet (subset), CIFAR-10, and Fashion-MNIST datasets, evaluated over training epochs ranging from 20 to 50,000.
Figure 5. Comparative runtime and model size complexity of pooling operators (normalized relative to MaxPooling). KernelPooling introduces parameter overhead, while ECA110 adds only a minor constant-time runtime factor.
Figure 6. Comparison of average epoch time and accuracy across pooling methods. ECA110 achieves the best balance, with superior accuracy and only minor runtime overhead relative to MaxPooling.
Figure 7. Efficiency–performance trade-off between ECA110-Pooling and state-of-the-art architectures (ResNet-50, DenseNet-121, EfficientNet-B0, MobileNetV2, ViT-Small). Bars indicate averaged Top-1 Accuracy (%), the secondary line denotes average training time per epoch (s), and annotations report model sizes (MB). ECA110 achieves competitive classification accuracy while remaining significantly more lightweight and computationally efficient.
Table 1. Transition table for ECA Rule 110. Each triplet ( l , p , r ) of neighboring binary states is mapped to a new state.
Triplet (l, p, r): 111  110  101  100  011  010  001  000
New state:           0    1    1    0    1    1    1    0
Table 2. The ablation study on CIFAR-10 at 10,000 epochs (Top-1 accuracy %).
Ablation | Study Variant | Top-1 Accuracy | Δ (diff.)
Threshold τ | Mean (default) | 93.0 | (ref)
Threshold τ | Median | 92.9 | −0.1
Threshold τ | Fixed (0.5) | 92.8 | −0.2
ECA rule | ECA 110 | 93.0 | (ref)
ECA rule | ECA 90 | 92.6 | −0.4
ECA rule | ECA 30 | 92.5 | −0.5
ECA rule | ECA 184 | 92.6 | −0.4
Steps T | 0 (no transform) | 92.0 | −1.0
Steps T | 1 | 93.0 | (ref)
Steps T | 2 | 93.0 | ≈0
Steps T | 3 | 92.9 | −0.1
Table 3. Benchmark datasets utilized in the present study.
Dataset | #Classes | #Images | Modality
ImageNet (subset) | 100 | 100,000 | RGB
CIFAR-10 | 10 | 60,000 | RGB
Fashion-MNIST | 10 | 70,000 | Grayscale
Table 4. Train/test splits applied to all datasets.
Case | Training Set | Testing Set
Case 1 | 80% | 20%
Case 2 | 65% | 35%
Case 3 | 50% | 50%
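A minimal sketch of how the three cases in Table 4 can be generated with scikit-learn is given below; the stratification and the fixed random seed are assumptions made for illustration, since the split procedure is not spelled out here.

```python
from sklearn.model_selection import train_test_split

# Test-set fractions corresponding to the three cases in Table 4.
SPLITS = {"Case 1": 0.20, "Case 2": 0.35, "Case 3": 0.50}

def make_split(X, y, case="Case 1", seed=42):
    # Stratified split keeps class proportions comparable across cases.
    return train_test_split(X, y, test_size=SPLITS[case],
                            stratify=y, random_state=seed)

# Usage (X: image/feature array, y: labels):
# X_train, X_test, y_train, y_test = make_split(X, y, "Case 2")
```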
Table 5. Evaluation metrics for assessing pooling operators.
Metric | Description/Role
Top-1 Classification Accuracy | Proportion of test samples correctly classified; primary measure of discriminative capacity.
Error Rate | Complement of accuracy (100% − Accuracy); emphasizes misclassification frequency.
F1-Score | Harmonic mean of precision and recall; balances false positives and false negatives, useful under class imbalance.
Training Time per Epoch | Average wall-clock time per epoch; quantifies computational overhead of pooling strategies.
Model Size | Number of trainable parameters (in MB); highlights complexity and memory footprint, especially for learnable pooling.
Convergence Behavior | Stability and rate of accuracy/loss convergence; captures optimization dynamics across epochs.
Statistical Significance | Validates observed differences using ANOVA, Tukey's HSD, Wilcoxon signed-rank, and paired t-tests; ensures robustness of conclusions.
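The first three metrics in Table 5 can be computed from model predictions as in the short sketch below; macro averaging of the F1-score is an assumption here, since the table only specifies the harmonic mean of precision and recall.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def classification_metrics(y_true, y_pred):
    """Top-1 accuracy, error rate, and (assumed macro-averaged) F1, in percent."""
    acc = accuracy_score(y_true, y_pred) * 100.0
    err = 100.0 - acc
    f1 = f1_score(y_true, y_pred, average="macro") * 100.0
    return {"Top-1 Acc (%)": acc, "Error Rate (%)": err, "F1 (%)": f1}

# Toy example with ten predictions over three classes
y_true = np.array([0, 1, 2, 2, 1, 0, 0, 2, 1, 1])
y_pred = np.array([0, 1, 2, 1, 1, 0, 2, 2, 1, 1])
print(classification_metrics(y_true, y_pred))
```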
Table 6. Comparative evaluation of pooling methods on the ImageNet subset across train/test splits and training epochs. Results are reported as Top-1 Accuracy (%), Error Rate (%), and F1-score (%).
Method | Split | 20 ep. (Acc/Err/F1) | 100 ep. (Acc/Err/F1) | 500 ep. (Acc/Err/F1) | 1000 ep. (F1)
MaxPooling | 80/20 | 58.2/41.8/58.0 | 65.5/34.5/65.2 | 70.3/29.7/70.1 | 71.8
MaxPooling | 65/35 | 56.7/43.3/56.5 | 64.2/35.8/64.0 | 69.1/30.9/68.9 | 70.6
MaxPooling | 50/50 | 55.0/45.0/54.7 | 62.9/37.1/62.7 | 67.6/32.4/67.4 | 69.3
AveragePooling | 80/20 | 57.5/42.5/57.3 | 64.3/35.7/64.1 | 69.2/30.8/69.0 | 70.6
AveragePooling | 65/35 | 55.9/44.1/55.7 | 63.1/36.9/62.9 | 68.0/32.0/67.8 | 69.5
AveragePooling | 50/50 | 54.2/45.8/54.0 | 61.8/38.2/61.6 | 66.6/33.4/66.4 | 68.1
MedianPooling | 80/20 | 58.0/42.0/57.8 | 64.8/35.2/64.6 | 69.8/30.2/69.6 | 71.0
MedianPooling | 65/35 | 56.4/43.6/56.2 | 63.6/36.4/63.4 | 68.5/31.5/68.3 | 69.9
MedianPooling | 50/50 | 54.8/45.2/54.6 | 62.3/37.7/62.1 | 67.1/32.9/66.9 | 68.6
MinPooling | 80/20 | 48.6/51.4/48.0 | 54.2/45.8/53.7 | 58.7/41.3/58.3 | 60.0
MinPooling | 65/35 | 47.0/53.0/46.5 | 52.8/47.2/52.4 | 57.5/42.5/57.0 | 59.0
MinPooling | 50/50 | 45.3/54.7/44.8 | 51.5/48.5/51.0 | 56.0/44.0/55.5 | 57.6
KernelPooling | 80/20 | 59.3/40.7/59.1 | 66.0/34.0/65.8 | 71.0/29.0/70.8 | 72.4
KernelPooling | 65/35 | 57.7/42.3/57.5 | 64.7/35.3/64.5 | 69.7/30.3/69.5 | 71.2
KernelPooling | 50/50 | 56.1/43.9/55.9 | 63.3/36.7/63.1 | 68.3/31.7/68.1 | 69.9
ECA110-Pooling | 80/20 | 60.0/40.0/59.8 | 66.9/33.1/66.7 | 71.7/28.3/71.5 | 72.8
ECA110-Pooling | 65/35 | 58.4/41.6/58.2 | 65.6/34.4/65.4 | 70.5/29.5/70.3 | 71.7
ECA110-Pooling | 50/50 | 56.8/43.2/56.6 | 64.1/35.9/63.9 | 69.0/31.0/68.8 | 70.5

Method | Split | 1000 ep. (Acc/Err) | 5000 ep. (Acc/Err/F1) | 10,000 ep. (Acc/Err/F1) | 50,000 ep. (Acc/Err/F1)
MaxPooling | 80/20 | 72.0/28.0 | 72.8/27.2/72.6 | 73.0/27.0/72.8 | 73.1/26.9/72.9
MaxPooling | 65/35 | 70.8/29.2 | 71.4/28.6/71.2 | 71.6/28.4/71.4 | 71.7/28.3/71.5
MaxPooling | 50/50 | 69.5/30.5 | 70.1/29.9/69.9 | 70.2/29.8/70.0 | 70.3/29.7/70.1
AveragePooling | 80/20 | 70.8/29.2 | 71.6/28.4/71.4 | 71.8/28.2/71.6 | 71.9/28.1/71.7
AveragePooling | 65/35 | 69.7/30.3 | 70.4/29.6/70.2 | 70.6/29.4/70.4 | 70.7/29.3/70.5
AveragePooling | 50/50 | 68.3/31.7 | 69.0/31.0/68.8 | 69.1/30.9/68.9 | 69.2/30.8/69.0
MedianPooling | 80/20 | 71.2/28.8 | 71.9/28.1/71.7 | 72.0/28.0/71.8 | 72.0/28.0/71.8
MedianPooling | 65/35 | 70.1/29.9 | 70.8/29.2/70.6 | 70.9/29.1/70.7 | 71.0/29.0/70.8
MedianPooling | 50/50 | 68.8/31.2 | 69.4/30.6/69.2 | 69.5/30.5/69.3 | 69.5/30.5/69.3
MinPooling | 80/20 | 60.5/39.5 | 61.0/39.0/60.5 | 61.1/38.9/60.6 | 61.2/38.8/60.7
MinPooling | 65/35 | 59.3/40.7 | 59.8/40.2/59.4 | 60.0/40.0/59.6 | 60.0/40.0/59.6
MinPooling | 50/50 | 58.0/42.0 | 58.4/41.6/58.0 | 58.5/41.5/58.1 | 58.6/41.4/58.2
KernelPooling | 80/20 | 72.6/27.4 | 73.5/26.5/73.3 | 73.7/26.3/73.5 | 73.8/26.2/73.6
KernelPooling | 65/35 | 71.4/28.6 | 72.2/27.8/72.0 | 72.4/27.6/72.2 | 72.5/27.5/72.3
KernelPooling | 50/50 | 70.1/29.9 | 70.9/29.1/70.7 | 71.0/29.0/70.8 | 71.1/28.9/70.9
ECA110-Pooling | 80/20 | 73.0/27.0 | 73.8/26.2/73.6 | 74.0/26.0/73.8 | 74.1/25.9/73.9
ECA110-Pooling | 65/35 | 71.9/28.1 | 72.7/27.3/72.5 | 72.9/27.1/72.7 | 73.0/27.0/72.8
ECA110-Pooling | 50/50 | 70.7/29.3 | 71.4/28.6/71.2 | 71.6/28.4/71.4 | 71.7/28.3/71.5
Table 7. Comparative evaluation of pooling methods on CIFAR-10 across train/test splits and training epochs. Results are reported as Top-1 Accuracy (%), Error Rate (%), and F1-score (%).
Method | Split | 20 ep. (Acc/Err/F1) | 100 ep. (Acc/Err/F1) | 500 ep. (Acc/Err/F1) | 1000 ep. (F1)
MaxPooling | 80/20 | 75.4/24.6/75.0 | 85.1/14.9/84.9 | 90.2/9.8/90.0 | 91.3
MaxPooling | 65/35 | 74.0/26.0/73.6 | 84.0/16.0/83.8 | 89.3/10.7/89.1 | 90.5
MaxPooling | 50/50 | 72.6/27.4/72.2 | 82.8/17.2/82.6 | 88.4/11.6/88.2 | 89.7
AveragePooling | 80/20 | 74.8/25.2/74.5 | 84.6/15.4/84.4 | 89.7/10.3/89.5 | 91.0
AveragePooling | 65/35 | 73.3/26.7/73.0 | 83.4/16.6/83.2 | 88.8/11.2/88.6 | 90.1
AveragePooling | 50/50 | 71.9/28.1/71.6 | 82.1/17.9/81.9 | 87.9/12.1/87.7 | 89.3
MedianPooling | 80/20 | 75.1/24.9/74.8 | 84.9/15.1/84.7 | 90.0/10.0/89.8 | 91.1
MedianPooling | 65/35 | 73.6/26.4/73.3 | 83.7/16.3/83.5 | 89.0/11.0/88.8 | 90.3
MedianPooling | 50/50 | 72.2/27.8/71.9 | 82.5/17.5/82.3 | 88.2/11.8/88.0 | 89.4
MinPooling | 80/20 | 68.9/31.1/68.6 | 77.8/22.2/77.5 | 85.5/14.5/85.2 | 86.9
MinPooling | 65/35 | 67.5/32.5/67.2 | 76.5/23.5/76.2 | 84.6/15.4/84.2 | 86.0
MinPooling | 50/50 | 66.0/34.0/65.6 | 75.2/24.8/74.9 | 83.8/16.2/83.4 | 85.2
KernelPooling | 80/20 | 76.0/24.0/75.7 | 85.5/14.5/85.3 | 90.6/9.4/90.4 | 91.7
KernelPooling | 65/35 | 74.5/25.5/74.2 | 84.3/15.7/84.1 | 89.7/10.3/89.5 | 90.9
KernelPooling | 50/50 | 73.1/26.9/72.8 | 83.0/17.0/82.8 | 88.8/11.2/88.6 | 90.2
ECA110-Pooling | 80/20 | 76.8/23.2/76.5 | 86.3/13.7/86.1 | 91.2/8.8/91.0 | 92.3
ECA110-Pooling | 65/35 | 75.3/24.7/75.0 | 85.0/15.0/84.8 | 90.3/9.7/90.1 | 91.5
ECA110-Pooling | 50/50 | 73.9/26.1/73.6 | 83.8/16.2/83.6 | 89.4/10.6/89.2 | 90.8

Method | Split | 1000 ep. (Acc/Err) | 5000 ep. (Acc/Err/F1) | 10,000 ep. (Acc/Err/F1) | 50,000 ep. (Acc/Err/F1)
MaxPooling | 80/20 | 91.5/8.5 | 91.9/8.1/91.7 | 92.0/8.0/91.8 | 92.1/7.9/91.9
MaxPooling | 65/35 | 90.7/9.3 | 91.1/8.9/90.9 | 91.2/8.8/91.0 | 91.3/8.7/91.1
MaxPooling | 50/50 | 89.9/10.1 | 90.3/9.7/90.1 | 90.4/9.6/90.2 | 90.5/9.5/90.3
AveragePooling | 80/20 | 91.1/8.9 | 91.5/8.5/91.3 | 91.6/8.4/91.4 | 91.7/8.3/91.5
AveragePooling | 65/35 | 90.3/9.7 | 90.7/9.3/90.5 | 90.8/9.2/90.6 | 90.9/9.1/90.7
AveragePooling | 50/50 | 89.5/10.5 | 89.9/10.1/89.7 | 90.0/10.0/89.8 | 90.1/9.9/89.9
MedianPooling | 80/20 | 91.3/8.7 | 91.7/8.3/91.5 | 91.8/8.2/91.6 | 91.9/8.1/91.7
MedianPooling | 65/35 | 90.5/9.5 | 90.9/9.1/90.7 | 91.0/9.0/90.8 | 91.1/8.9/90.9
MedianPooling | 50/50 | 89.7/10.3 | 90.1/9.9/89.8 | 90.2/9.8/89.9 | 90.3/9.7/90.0
MinPooling | 80/20 | 87.3/12.7 | 87.7/12.3/87.3 | 87.8/12.2/87.4 | 87.9/12.1/87.5
MinPooling | 65/35 | 86.4/13.6 | 86.8/13.2/86.4 | 86.9/13.1/86.5 | 87.0/13.0/86.6
MinPooling | 50/50 | 85.6/14.4 | 86.0/14.0/85.6 | 86.1/13.9/85.7 | 86.2/13.8/85.8
KernelPooling | 80/20 | 91.9/8.1 | 92.3/7.7/92.1 | 92.4/7.6/92.2 | 92.5/7.5/92.3
KernelPooling | 65/35 | 91.1/8.9 | 91.5/8.5/91.3 | 91.6/8.4/91.4 | 91.7/8.3/91.5
KernelPooling | 50/50 | 90.4/9.6 | 90.8/9.2/90.6 | 90.9/9.1/90.7 | 91.0/9.0/90.8
ECA110-Pooling | 80/20 | 92.5/7.5 | 92.9/7.1/92.7 | 93.0/7.0/92.8 | 93.1/6.9/92.9
ECA110-Pooling | 65/35 | 91.7/8.3 | 92.1/7.9/91.9 | 92.2/7.8/92.0 | 92.3/7.7/92.1
ECA110-Pooling | 50/50 | 91.0/9.0 | 91.4/8.6/91.2 | 91.5/8.5/91.3 | 91.6/8.4/91.4
Table 8. Comparative evaluation of pooling methods on Fashion-MNIST across train/test splits and training epochs. Results are reported as Top-1 Accuracy (%), Error Rate (%), and F1-score (%).
Method | Split | 20 ep. (Acc/Err/F1) | 100 ep. (Acc/Err/F1) | 500 ep. (Acc/Err/F1) | 1000 ep. (F1)
MaxPooling | 80/20 | 89.8/10.2/89.6 | 93.6/6.4/93.5 | 95.3/4.7/95.2 | 95.6
MaxPooling | 65/35 | 89.0/11.0/88.8 | 92.9/7.1/92.8 | 94.8/5.2/94.7 | 94.8
MaxPooling | 50/50 | 88.3/11.7/88.1 | 92.2/7.8/92.1 | 94.3/5.7/94.2 | 94.3
AveragePooling | 80/20 | 89.5/10.5/89.3 | 93.3/6.7/93.2 | 95.0/5.0/94.9 | 95.3
AveragePooling | 65/35 | 88.7/11.3/88.5 | 92.6/7.4/92.5 | 94.5/5.5/94.4 | 94.6
AveragePooling | 50/50 | 87.9/12.1/87.7 | 91.9/8.1/91.8 | 94.0/6.0/93.9 | 94.0
MedianPooling | 80/20 | 90.0/10.0/89.8 | 93.9/6.1/93.8 | 95.6/4.4/95.5 | 95.8
MedianPooling | 65/35 | 89.2/10.8/89.0 | 93.2/6.8/93.1 | 95.0/5.0/94.9 | 95.1
MedianPooling | 50/50 | 88.5/11.5/88.3 | 92.5/7.5/92.4 | 94.5/5.5/94.4 | 94.5
MinPooling | 80/20 | 85.6/14.4/85.3 | 89.7/10.3/89.5 | 92.1/7.9/91.9 | 92.3
MinPooling | 65/35 | 84.8/15.2/84.5 | 89.0/11.0/88.7 | 91.4/8.6/91.1 | 91.6
MinPooling | 50/50 | 84.1/15.9/83.8 | 88.3/11.7/88.0 | 90.8/9.2/90.5 | 91.0
KernelPooling | 80/20 | 90.4/9.6/90.2 | 94.3/5.7/94.2 | 96.0/4.0/96.0 | 96.0
KernelPooling | 65/35 | 89.6/10.4/89.4 | 93.6/6.4/93.5 | 95.4/4.6/95.3 | 95.4
KernelPooling | 50/50 | 88.9/11.1/88.7 | 92.9/7.1/92.8 | 94.9/5.1/94.8 | 94.9
ECA110-Pooling | 80/20 | 90.9/9.1/90.7 | 94.7/5.3/94.6 | 96.2/3.8/96.1 | 96.3
ECA110-Pooling | 65/35 | 90.1/9.9/89.9 | 94.1/5.9/94.0 | 95.8/4.2/95.7 | 95.8
ECA110-Pooling | 50/50 | 89.4/10.6/89.2 | 93.4/6.6/93.3 | 95.3/4.7/95.2 | 95.4

Method | Split | 1000 ep. (Acc/Err) | 5000 ep. (Acc/Err/F1) | 10,000 ep. (Acc/Err/F1) | 50,000 ep. (Acc/Err/F1)
MaxPooling | 80/20 | 95.7/4.3 | 95.9/4.1/95.8 | 96.0/4.0/95.9 | 96.0/4.0/95.9
MaxPooling | 65/35 | 95.0/5.0 | 95.2/4.8/95.1 | 95.3/4.7/95.2 | 95.3/4.7/95.2
MaxPooling | 50/50 | 94.4/5.6 | 94.6/5.4/94.5 | 94.6/5.4/94.5 | 94.6/5.4/94.5
AveragePooling | 80/20 | 95.4/4.6 | 95.6/4.4/95.5 | 95.7/4.3/95.6 | 95.7/4.3/95.6
AveragePooling | 65/35 | 94.7/5.3 | 94.9/5.1/94.8 | 95.0/5.0/94.9 | 95.0/5.0/94.9
AveragePooling | 50/50 | 94.1/5.9 | 94.3/5.7/94.2 | 94.3/5.7/94.2 | 94.3/5.7/94.2
MedianPooling | 80/20 | 95.9/4.1 | 96.1/3.9/96.0 | 96.2/3.8/96.1 | 96.2/3.8/96.1
MedianPooling | 65/35 | 95.2/4.8 | 95.4/4.6/95.3 | 95.5/4.5/95.4 | 95.5/4.5/95.4
MedianPooling | 50/50 | 94.6/5.4 | 94.8/5.2/94.7 | 94.9/5.1/94.8 | 94.9/5.1/94.8
MinPooling | 80/20 | 92.6/7.4 | 92.8/7.2/92.5 | 92.9/7.1/92.6 | 92.9/7.1/92.6
MinPooling | 65/35 | 91.9/8.1 | 92.1/7.9/91.8 | 92.2/7.8/91.9 | 92.2/7.8/91.9
MinPooling | 50/50 | 91.3/8.7 | 91.5/8.5/91.2 | 91.5/8.5/91.2 | 91.5/8.5/91.2
KernelPooling | 80/20 | 96.1/3.9 | 96.2/3.8/96.1 | 96.2/3.8/96.1 | 96.2/3.8/96.1
KernelPooling | 65/35 | 95.5/4.5 | 95.6/4.4/95.5 | 95.6/4.4/95.5 | 95.6/4.4/95.5
KernelPooling | 50/50 | 95.0/5.0 | 95.1/4.9/95.0 | 95.1/4.9/95.0 | 95.1/4.9/95.0
ECA110-Pooling | 80/20 | 96.4/3.6 | 96.5/3.5/96.4 | 96.6/3.4/96.5 | 96.6/3.4/96.5
ECA110-Pooling | 65/35 | 95.9/4.1 | 96.0/4.0/95.9 | 96.1/3.9/96.0 | 96.1/3.9/96.0
ECA110-Pooling | 50/50 | 95.5/4.5 | 95.6/4.4/95.5 | 95.7/4.3/95.6 | 95.7/4.3/95.6
Table 9. Aggregated comparative performance of pooling methods across all datasets (ImageNet subset, CIFAR-10, Fashion-MNIST) and training epochs. Results are averaged for Top-1 Accuracy, Error Rate, and F1-score.
Pooling Method | Top-1 Accuracy (%) | Error Rate (%) | F1-Score (%)
MaxPooling | 85.0 | 15.0 | 84.7
AveragePooling | 84.5 | 15.5 | 84.2
MedianPooling | 84.8 | 15.2 | 84.5
MinPooling | 80.0 | 20.0 | 79.6
KernelPooling | 86.0 | 14.0 | 85.8
ECA110-Pooling | 87.2 | 12.8 | 87.0
Table 10. Computational complexity and parameterization of the six pooling operators. Here k denotes the window size, C the number of channels, and T the number of automaton steps in ECA110.
Pooling Method | Time Complexity | Extra Parameters | Remarks
MaxPooling | O(k^2 C) | None | Selects strongest activations
AveragePooling | O(k^2 C) | None | Computes local averages
MedianPooling | O(k^2 C) | None | Requires sorting per window
MinPooling | O(k^2 C) | None | Selects weakest activations
KernelPooling | O(k^2 C) | k^2 · C | Learnable weighted aggregation
ECA110-Pooling | O(k^2 C T) | None | Rule-based transform + reduction
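A back-of-the-envelope reading of Table 10, using illustrative values assumed for this example (3 × 3 windows, 64 channels, T = 1 automaton step), shows why ECA110-Pooling stays close to MaxPooling in per-window cost while KernelPooling adds parameters:

```python
# Illustrative per-window cost estimates following the orders of growth in Table 10.
k, C, T = 3, 64, 1            # assumed window size, channels, automaton steps

ops_max    = k * k * C        # O(k^2 C): one comparison sweep per window
ops_kernel = k * k * C        # O(k^2 C) multiply-adds per window
params_kernel = k * k * C     # k^2 * C learnable weights (576 here)
ops_eca110 = k * k * C * T    # O(k^2 C T): thresholding plus T rule applications, no parameters

print(ops_max, ops_kernel, params_kernel, ops_eca110)   # 576 576 576 576
```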
Table 11. Aggregated efficiency results: average training time per epoch and model size across ImageNet (subset), CIFAR-10, and Fashion-MNIST.
Pooling Method | ImageNet (subset): Time (s/epoch) / Size (MB) | CIFAR-10: Time (s/epoch) / Size (MB) | Fashion-MNIST: Time (s/epoch) / Size (MB)
MaxPooling | 128.0 / 4.75 | 35.2 / 4.75 | 12.4 / 4.75
AveragePooling | 130.6 / 4.75 | 35.9 / 4.75 | 12.7 / 4.75
MedianPooling | 162.6 / 4.75 | 44.7 / 4.75 | 15.8 / 4.75
MinPooling | 127.5 / 4.75 | 35.1 / 4.75 | 12.4 / 4.75
KernelPooling | 148.5 / 4.80 | 40.8 / 4.80 | 14.4 / 4.80
ECA110-Pooling | 133.1 / 4.76 | 36.6 / 4.76 | 12.9 / 4.76
Table 12. One-way ANOVA results across pooling operators for each dataset and split, reported across training epochs (20–1000). Between-group df = 5, Within-group df = 24.
Dataset | Split | 20 ep. | 100 ep. | 500 ep. | 1000 ep.   (each cell: F, p, significant?)
ImageNet (subset) | 80/20 | 3.41, 0.017, Yes | 4.26, 0.007, Yes | 6.18, 0.001, Yes | 7.34, <0.001, Yes
ImageNet (subset) | 65/35 | 3.12, 0.021, Yes | 4.05, 0.009, Yes | 5.91, 0.002, Yes | 7.12, <0.001, Yes
ImageNet (subset) | 50/50 | 2.89, 0.032, Yes | 3.87, 0.011, Yes | 5.66, 0.002, Yes | 6.94, <0.001, Yes
CIFAR-10 | 80/20 | 4.89, 0.003, Yes | 6.72, 0.001, Yes | 8.05, <0.001, Yes | 9.47, <0.001, Yes
CIFAR-10 | 65/35 | 4.51, 0.005, Yes | 6.38, 0.001, Yes | 7.81, <0.001, Yes | 9.12, <0.001, Yes
CIFAR-10 | 50/50 | 4.18, 0.008, Yes | 6.05, 0.002, Yes | 7.54, <0.001, Yes | 8.91, <0.001, Yes
Fashion-MNIST | 80/20 | 3.02, 0.028, Yes | 4.75, 0.004, Yes | 6.41, 0.001, Yes | 7.96, <0.001, Yes
Fashion-MNIST | 65/35 | 2.81, 0.034, Yes | 4.51, 0.006, Yes | 6.12, 0.001, Yes | 7.64, <0.001, Yes
Fashion-MNIST | 50/50 | 2.64, 0.041, Yes | 4.26, 0.008, Yes | 5.91, 0.002, Yes | 7.31, <0.001, Yes
Table 13. One-way ANOVA results across pooling operators for each dataset and split, reported across training epochs (5000–50000). Between-group df = 5, Within-group df = 24.
Dataset | Split | 5000 ep. | 10,000 ep. | 50,000 ep.   (each cell: F, p, significant?)
ImageNet (subset) | 80/20 | 9.11, <0.001, Yes | 11.02, <0.001, Yes | 12.85, <0.001, Yes
ImageNet (subset) | 65/35 | 8.87, <0.001, Yes | 10.74, <0.001, Yes | 12.45, <0.001, Yes
ImageNet (subset) | 50/50 | 8.65, <0.001, Yes | 10.43, <0.001, Yes | 12.12, <0.001, Yes
CIFAR-10 | 80/20 | 11.38, <0.001, Yes | 12.24, <0.001, Yes | 13.10, <0.001, Yes
CIFAR-10 | 65/35 | 11.03, <0.001, Yes | 11.87, <0.001, Yes | 12.73, <0.001, Yes
CIFAR-10 | 50/50 | 10.77, <0.001, Yes | 11.59, <0.001, Yes | 12.41, <0.001, Yes
Fashion-MNIST | 80/20 | 9.88, <0.001, Yes | 11.07, <0.001, Yes | 12.31, <0.001, Yes
Fashion-MNIST | 65/35 | 9.55, <0.001, Yes | 10.77, <0.001, Yes | 11.98, <0.001, Yes
Fashion-MNIST | 50/50 | 9.11, <0.001, Yes | 10.41, <0.001, Yes | 11.64, <0.001, Yes
Table 14. Tukey’s HSD pairwise comparisons across pooling operators for each dataset, split ratio, and selected training epochs (20, 500, 5000). Cells list significant differences (direction and adjusted p). Abbreviations: Max = MaxPooling, Avg = AveragePooling, Med = MedianPooling, Min = MinPooling, Ker = KernelPooling, ECA = ECA110-Pooling; n.s. = not significant.
Dataset | Epochs | 80/20 | 65/35 | 50/50
ImageNet (subset) | 20 | Max > Min (p < 0.05); n.s. (ECA vs Max/Avg/Ker) | Max > Min (p < 0.05); n.s. | Max > Min (p < 0.05); n.s.
ImageNet (subset) | 500 | ECA > Avg (p < 0.05); Max > Min (p < 0.001); Ker > Min (p < 0.01) | ECA > Avg (p < 0.05); Max > Min (p < 0.001) | ECA > Avg (p < 0.05); Max > Min (p < 0.001)
ImageNet (subset) | 5000 | ECA > Max (p < 0.05); ECA > Avg (p < 0.01); ECA > Min (p < 0.001) | same pattern | same pattern
CIFAR-10 | 20 | Max > Min (p < 0.01); Med > Min (p < 0.05); n.s. (ECA vs Max) | same pattern | same pattern
CIFAR-10 | 500 | ECA > Avg (p < 0.01); ECA > Min (p < 0.001); Max > Min (p < 0.001) | same pattern | same pattern
CIFAR-10 | 5000 | ECA > Max (p < 0.01); ECA > Avg (p < 0.01); ECA > Min (p < 0.001) | same pattern | same pattern
Fashion-MNIST | 20 | Med > Min (p < 0.05); n.s. (ECA vs Med/Max) | same pattern | same pattern
Fashion-MNIST | 500 | ECA > Min (p < 0.001); Med > Min (p < 0.001); Max > Min (p < 0.001) | same pattern | same pattern
Fashion-MNIST | 5000 | ECA > Max (p < 0.05); ECA > Avg (p < 0.01); ECA > Min (p < 0.001) | same pattern | same pattern
Table 15. Paired t-test (two-sided) between ECA110-Pooling (ECA) and baseline pooling operators, reported at selected epochs (20, 500, 5000) and split ratios.
Dataset | Epochs | 80/20 | 65/35 | 50/50
ImageNet (subset) | 20 | ECA > Min (p = 0.021) | ECA > Min (p = 0.028) | ECA > Min (p = 0.034)
ImageNet (subset) | 500 | ECA > Max (p = 0.032); ECA > Avg (p = 0.012); ECA > Med (p = 0.045); ECA > Min (p < 0.001) | ECA > Max (p = 0.036); ECA > Avg (p = 0.015); ECA > Min (p < 0.001) | ECA > Avg (p = 0.019); ECA > Min (p < 0.001)
ImageNet (subset) | 5000 | ECA > Max (p = 0.019); ECA > Avg (p = 0.004); ECA > Med (p = 0.022); ECA > Min (p < 0.001) | ECA > Max (p = 0.021); ECA > Avg (p = 0.007); ECA > Min (p < 0.001) | ECA > Avg (p = 0.011); ECA > Min (p < 0.001)
CIFAR-10 | 20 | ECA > Min (p = 0.001) | ECA > Min (p = 0.002) | ECA > Min (p = 0.004)
CIFAR-10 | 500 | ECA > Max (p = 0.010); ECA > Avg (p = 0.006); ECA > Med (p = 0.030); ECA > Min (p < 0.001) | ECA > Max (p = 0.012); ECA > Avg (p = 0.009); ECA > Min (p < 0.001) | ECA > Avg (p = 0.014); ECA > Min (p < 0.001)
CIFAR-10 | 5000 | ECA > Max (p = 0.004); ECA > Avg (p = 0.003); ECA > Med (p = 0.012); ECA > Min (p < 0.001); ECA > Ker (p = 0.041) | ECA > Max (p = 0.006); ECA > Avg (p = 0.004); ECA > Min (p < 0.001) | ECA > Avg (p = 0.008); ECA > Min (p < 0.001)
Fashion-MNIST | 20 | ECA > Min (p = 0.003) | ECA > Min (p = 0.005) | ECA > Min (p = 0.006)
Fashion-MNIST | 500 | ECA > Max (p = 0.048); ECA > Avg (p = 0.030); ECA > Min (p < 0.001) | ECA > Avg (p = 0.037); ECA > Min (p < 0.001) | ECA > Min (p < 0.001)
Fashion-MNIST | 5000 | ECA > Max (p = 0.019); ECA > Avg (p = 0.010); ECA > Med (p = 0.042); ECA > Min (p < 0.001) | ECA > Avg (p = 0.013); ECA > Min (p < 0.001) | ECA > Min (p < 0.001)
Table 16. Wilcoxon signed-rank test between ECA110-Pooling (ECA) and baseline pooling operators, reported at selected epochs and split ratios.
Dataset | Epochs | 80/20 | 65/35 | 50/50
ImageNet (subset) | 20 | ECA > Min (p = 0.024) | ECA > Min (p = 0.031) | ECA > Min (p = 0.038)
ImageNet (subset) | 500 | ECA > Max (p = 0.041); ECA > Avg (p = 0.019); ECA > Min (p < 0.001) | ECA > Avg (p = 0.027); ECA > Min (p < 0.001) | ECA > Min (p < 0.001)
ImageNet (subset) | 5000 | ECA > Max (p = 0.006); ECA > Avg (p = 0.011); ECA > Med (p = 0.029); ECA > Min (p < 0.001) | ECA > Max (p = 0.012); ECA > Avg (p = 0.018); ECA > Min (p < 0.001) | ECA > Avg (p = 0.025); ECA > Min (p < 0.001)
CIFAR-10 | 20 | ECA > Min (p = 0.003) | ECA > Min (p = 0.004) | ECA > Min (p = 0.006)
CIFAR-10 | 500 | ECA > Max (p = 0.021); ECA > Avg (p = 0.015); ECA > Min (p < 0.001) | ECA > Avg (p = 0.022); ECA > Min (p < 0.001) | ECA > Min (p < 0.001)
CIFAR-10 | 5000 | ECA > Max (p = 0.009); ECA > Avg (p = 0.007); ECA > Med (p = 0.033); ECA > Min (p < 0.001); ECA > Ker (p = 0.012) | ECA > Max (p = 0.014); ECA > Avg (p = 0.010); ECA > Min (p < 0.001) | ECA > Avg (p = 0.017); ECA > Min (p < 0.001)
Fashion-MNIST | 20 | ECA > Min (p = 0.007) | ECA > Min (p = 0.009) | ECA > Min (p = 0.011)
Fashion-MNIST | 500 | ECA > Max (p = 0.037); ECA > Avg (p = 0.029); ECA > Min (p < 0.001) | ECA > Avg (p = 0.034); ECA > Min (p < 0.001) | ECA > Min (p < 0.001)
Fashion-MNIST | 5000 | ECA > Max (p = 0.008); ECA > Avg (p = 0.013); ECA > Med (p = 0.041); ECA > Min (p < 0.001) | ECA > Avg (p = 0.015); ECA > Min (p < 0.001) | ECA > Min (p < 0.001)
Table 17. Comparison of ECA110-Pooling with SOTA architectures across datasets, split ratios, and training epochs. The results include Top-1 Accuracy (%), Error Rate (%), and F1-score (%).
Dataset | Method | 80/20, 500 ep. | 80/20, 5000 ep. | 80/20, 10,000 ep. | 65/35, 500 ep.   (each cell: Acc/Err/F1)
ImageNet (subset) | ResNet-50 | 70.3/29.7/70.0 | 75.9/24.1/75.7 | 76.1/23.9/75.9 | 68.4/31.6/68.2
ImageNet (subset) | DenseNet-121 | 71.0/29.0/70.8 | 76.8/23.2/76.7 | 76.9/23.1/76.8 | 69.0/31.0/68.7
ImageNet (subset) | EfficientNet-B0 | 72.1/27.9/71.9 | 77.5/22.5/77.3 | 77.7/22.3/77.5 | 70.5/29.5/70.3
ImageNet (subset) | MobileNetV2 | 69.2/30.8/68.9 | 74.1/25.9/73.9 | 74.3/25.7/74.1 | 67.3/32.7/67.0
ImageNet (subset) | ViT-Small | 70.7/29.3/70.5 | 76.2/23.8/76.0 | 76.4/23.6/76.2 | 69.1/30.9/68.8
ImageNet (subset) | ECA110-Pooling | 71.7/28.3/71.5 | 73.8/26.2/73.6 | 74.0/26.0/73.9 | 70.5/29.5/70.3
CIFAR-10 | ResNet-50 | 88.6/11.4/88.5 | 94.4/5.6/94.3 | 94.6/5.4/94.5 | 87.3/12.7/87.2
CIFAR-10 | DenseNet-121 | 89.2/10.8/89.1 | 94.9/5.1/94.7 | 95.1/4.9/94.9 | 87.9/12.1/87.7
CIFAR-10 | EfficientNet-B0 | 89.8/10.2/89.6 | 95.4/4.6/95.2 | 95.6/4.4/95.4 | 88.5/11.5/88.3
CIFAR-10 | MobileNetV2 | 87.3/12.7/87.1 | 93.2/6.8/93.0 | 93.5/6.5/93.3 | 85.9/14.1/85.7
CIFAR-10 | ViT-Small | 88.2/11.8/88.0 | 94.0/6.0/93.8 | 94.2/5.8/94.0 | 86.8/13.2/86.6
CIFAR-10 | ECA110-Pooling | 91.2/8.8/91.0 | 92.9/7.1/92.7 | 93.0/7.0/92.8 | 90.3/9.7/90.1
Fashion-MNIST | ResNet-50 | 95.0/5.0/94.9 | 96.2/3.8/96.1 | 96.3/3.7/96.2 | 94.2/5.8/94.1
Fashion-MNIST | DenseNet-121 | 95.3/4.7/95.2 | 96.4/3.6/96.3 | 96.5/3.5/96.4 | 94.5/5.5/94.4
Fashion-MNIST | EfficientNet-B0 | 95.6/4.4/95.5 | 96.6/3.4/96.5 | 96.7/3.3/96.6 | 94.8/5.2/94.7
Fashion-MNIST | MobileNetV2 | 94.7/5.3/94.6 | 95.9/4.1/95.8 | 96.0/4.0/95.9 | 94.0/6.0/93.9
Fashion-MNIST | ViT-Small | 95.1/4.9/95.0 | 96.2/3.8/96.1 | 96.3/3.7/96.2 | 94.4/5.6/94.3
Fashion-MNIST | ECA110-Pooling | 96.2/3.8/96.1 | 96.5/3.5/96.4 | 96.6/3.4/96.5 | 95.8/4.2/95.7

Dataset | Method | 65/35, 5000 ep. | 65/35, 10,000 ep. | 50/50, 500 ep. | 50/50, 5000 ep. | 50/50, 10,000 ep.   (each cell: Acc/Err/F1)
ImageNet (subset) | ResNet-50 | 74.3/25.7/74.0 | 75.0/25.0/74.8 | 66.5/33.5/66.2 | 72.5/27.5/72.3 | 73.0/27.0/72.8
ImageNet (subset) | DenseNet-121 | 75.2/24.8/75.0 | 75.8/24.2/75.6 | 67.3/32.7/67.0 | 73.6/26.4/73.4 | 74.1/25.9/73.9
ImageNet (subset) | EfficientNet-B0 | 76.0/24.0/75.8 | 76.4/23.6/76.2 | 68.4/31.6/68.2 | 74.2/25.8/74.0 | 74.7/25.3/74.5
ImageNet (subset) | MobileNetV2 | 72.6/27.4/72.3 | 73.2/26.8/73.0 | 65.5/34.5/65.2 | 70.8/29.2/70.5 | 71.4/28.6/71.1
ImageNet (subset) | ViT-Small | 74.9/25.1/74.7 | 75.4/24.6/75.2 | 67.2/32.8/66.9 | 73.3/26.7/73.1 | 73.9/26.1/73.7
ImageNet (subset) | ECA110-Pooling | 72.7/27.3/72.5 | 72.9/27.1/72.7 | 69.0/31.0/68.8 | 71.4/28.6/71.2 | 71.7/28.3/71.5
CIFAR-10 | ResNet-50 | 93.8/6.2/93.6 | 94.1/5.9/93.9 | 85.9/14.1/85.7 | 92.4/7.6/92.2 | 92.7/7.3/92.5
CIFAR-10 | DenseNet-121 | 94.3/5.7/94.1 | 94.6/5.4/94.4 | 86.5/13.5/86.3 | 92.9/7.1/92.7 | 93.2/6.8/93.0
CIFAR-10 | EfficientNet-B0 | 94.8/5.2/94.6 | 95.0/5.0/94.8 | 87.0/13.0/86.8 | 93.5/6.5/93.3 | 93.9/6.1/93.7
CIFAR-10 | MobileNetV2 | 92.7/7.3/92.5 | 93.1/6.9/92.9 | 84.5/15.5/84.3 | 91.4/8.6/91.2 | 91.9/8.1/91.7
CIFAR-10 | ViT-Small | 93.4/6.6/93.2 | 93.7/6.3/93.5 | 85.4/14.6/85.2 | 92.0/8.0/91.8 | 92.4/7.6/92.2
CIFAR-10 | ECA110-Pooling | 92.1/7.9/91.9 | 92.3/7.7/92.1 | 89.4/10.6/89.2 | 91.4/8.6/91.2 | 91.6/8.4/91.4
Fashion-MNIST | ResNet-50 | 95.6/4.4/95.5 | 95.8/4.2/95.7 | 93.5/6.5/93.4 | 94.9/5.1/94.8 | 95.1/4.9/95.0
Fashion-MNIST | DenseNet-121 | 95.8/4.2/95.7 | 96.0/4.0/95.9 | 93.8/6.2/93.7 | 95.2/4.8/95.1 | 95.4/4.6/95.3
Fashion-MNIST | EfficientNet-B0 | 96.0/4.0/95.9 | 96.2/3.8/96.1 | 94.1/5.9/94.0 | 95.5/4.5/95.4 | 95.7/4.3/95.6
Fashion-MNIST | MobileNetV2 | 95.3/4.7/95.2 | 95.5/4.5/95.4 | 93.3/6.7/93.2 | 94.7/5.3/94.6 | 94.9/5.1/94.8
Fashion-MNIST | ViT-Small | 95.6/4.4/95.5 | 95.8/4.2/95.7 | 93.6/6.4/93.5 | 95.0/5.0/94.9 | 95.2/4.8/95.1
Fashion-MNIST | ECA110-Pooling | 96.0/4.0/95.9 | 96.1/3.9/96.0 | 95.3/4.7/95.2 | 95.6/4.4/95.5 | 95.7/4.3/95.6
Table 18. Number of parameters, memory footprint, and observations for pooling operators and SOTA architectures.
Method | No. Parameters | Mem. Footprint (MB) | Observations
MaxPooling | 0 | ≈0 | Fixed operator, no trainable parameters.
AveragePooling | 0 | ≈0 | Captures global information but loses local details.
MedianPooling | 0 | ≈0 | Robust to noise and outliers, slightly higher computational cost.
MinPooling | 0 | ≈0 | Rarely used, generally yields weak performance.
KernelPooling | ∼50 k | ∼0.2–0.5 | Learnable kernels; modest increase in model size.
ECA110-Pooling | 0 | ≈0 | Lightweight rule-based operator with competitive performance.
ResNet-50 | ∼25 M | ∼98 | High accuracy but computationally expensive.
DenseNet-121 | ∼8 M | ∼33 | Dense connections; strong accuracy with higher inference cost.
EfficientNet-B0 | ∼5 M | ∼20 | Excellent balance of accuracy and efficiency.
MobileNetV2 | ∼3.5 M | ∼14 | Optimized for mobile and embedded deployments.
ViT-Small | ∼22 M | ∼85 | Transformer-based; strong performance but high memory needs.
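Parameter counts and float32 memory footprints of the kind listed in Table 18 can be checked with a few lines of PyTorch; the snippet below is a sketch that assumes a recent torchvision release and 4 bytes per weight, and is not the exact procedure used for the table.

```python
import torch
from torchvision import models

def footprint(model):
    """Trainable parameter count and rough float32 memory footprint (MB)."""
    n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return n_params, n_params * 4 / 1024 ** 2   # 4 bytes per float32 weight

for name, ctor in [("ResNet-50", models.resnet50),
                   ("DenseNet-121", models.densenet121),
                   ("MobileNetV2", models.mobilenet_v2)]:
    n, mb = footprint(ctor(weights=None))       # untrained architecture only
    print(f"{name}: {n / 1e6:.1f} M parameters, ~{mb:.0f} MB")
```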
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
