Next Article in Journal
Cyclic Codes over a Split Local Ring of Type (2,2): Structure, Gray Images, and Distance Analysis
Previous Article in Journal
Extremal Dependence and Community-Structured Risk Propagation in Complex Social Information Networks
Previous Article in Special Issue
Binary Secretary Bird Optimization Algorithm for the Set Covering Problem
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

A Novel Weather-Aware Irrigation Scheduling Benchmark for Continuous Global Optimization

by
Vasileios Charilogis
1,
Ioannis G. Tsoulos
1,*,
Anna Maria Gianni
1 and
Dimitrios Tsalikakis
2
1
Department of Informatics and Telecommunications, University of Ioannina, 47150 Kostaki Artas, Greece
2
Department of Engineering Informatics and Telecommunications, University of Western Macedonia, 50100 Kozani, Greece
*
Author to whom correspondence should be addressed.
Mathematics 2026, 14(11), 2018; https://doi.org/10.3390/math14112018
Submission received: 16 April 2026 / Revised: 1 June 2026 / Accepted: 3 June 2026 / Published: 5 June 2026

Abstract

This article presents a new literature-informed benchmark problem for continuous global optimization, inspired by the semantics of weather-aware irrigation scheduling. The problem is formulated as a continuous minimization task in which irrigation decisions are determined under synthetically generated but agronomically motivated weather-dependent water demand, rainfall effects, and nonlinear operational penalties. The proposed benchmark is intentionally synthetic and self-contained requiring no external datasets or field-calibrated parameters while its components are grounded in established concepts from irrigation science such as evapotranspiration-based demand estimation, yield response–water relationships, and weather-dependent scheduling principles. It is designed to be sufficiently challenging for numerical optimization while retaining a clear agronomic interpretation. The article provides the full mathematical formulation of the problem, explains its main components and parameters, and reports an experimental study focused on its optimization behavior using ten established continuous optimizers across five problem dimensions. From this perspective, the proposed problem is positioned within a clear framework for the study of real-world-inspired continuous optimization benchmarks, offering an additional case in which mathematical formulation, practical interpretation, and experimental investigation coexist in a coherent and systematic manner.

1. Introduction

Optimization plays a central role in science and engineering because many practical decision problems can be formulated as the search for variable configurations that minimize cost, maximize performance, or satisfy multiple competing operational criteria. In continuous settings, the resulting problems are often nonlinear, multimodal, nonconvex, and derivative-free, which makes the empirical study of optimization algorithms strongly dependent on the availability of representative benchmark problems [1]. Population-based and nature-inspired metaheuristics including differential evolution, particle swarm optimization, evolution strategies, and ant colony optimization have demonstrated broad competence on such problems, yet their comparative evaluation remains a persistent methodological challenge [2,3]. In the absence of systematic benchmarking, performance claims are frequently overstated, algorithm rankings are unstable across problem classes, and the generalizable strengths and weaknesses of competing methods remain obscured [4,5].
Benchmark design is therefore not a peripheral issue but a methodological one. The two dominant paradigms for continuous benchmarking the Congress on Evolutionary Computation (CEC) suites and the Black-Box Optimization Benchmarking (BBOB) platform have each contributed important infrastructure for controlled algorithm comparison [6,7]. CEC suites, including CEC 2014, CEC 2017, CEC 2021, and CEC 2022, provide fixed-budget comparisons on shifted, rotated, and hybridized analytical functions across multiple dimensionalities [8]. BBOB, developed within the COCO platform, adopts a target-based approach and covers a set of 24 single-objective functions with documented landscape properties, recently extended to large-scale dimensions up to 640 [9]. Both platforms have made substantial contributions to reproducible benchmarking, and recent comprehensive reviews have cataloged several hundred test functions in use across the optimization literature [2,10]. At the same time, recent analyses have shown that the choice of benchmark problems can substantially alter empirical conclusions: algorithms that rank highly on CEC 2020 often perform poorly on CEC 2014 and CEC 2017, and vice versa [4,5,11]. The number of allowed function evaluations further modulates rankings, suggesting that no single benchmark configuration provides a definitive ordering of optimizers [11]. These findings reinforce the need for diverse, independently designed benchmark problems whose difficulty structure is transparent and whose components can be subjected to controlled ablation. Recent critical analyses have further emphasized that algorithm rankings are highly sensitive to the specific benchmark suite employed [5,12], and comprehensive surveys have cataloged the structural properties including separability, modality, and conditioning that distinguish existing test functions from one another [13]. These findings underscore the value of introducing new benchmarks whose difficulty mechanisms are independently documented and whose landscape properties differ qualitatively from those of existing suites.
A complementary body of work has emphasized the value of real-world inspired benchmark problems, whose objective structures reflect realistic coupling, delayed effects, threshold mechanisms, and non-separable stage interactions that are rarely present in purely analytical test functions [2,4,5]. The IEEE CEC 2011 competition introduced a set of practical optimization problems drawn from engineering domains, and subsequent efforts have proposed test suites for constrained, large-scale, and multi-objective problems motivated by physical or operational constraints [14]. Recent large-scale benchmarks have additionally highlighted the importance of heterogeneous module coupling and non-uniform interaction patterns, features that are genuinely present in physical systems but largely absent from classic analytical suites [15]. The methodological literature has accordingly called for benchmarks that combine interpretability so that the source of difficulty can be understood and communicated with sufficient computational challenge to stress modern population-based solvers across a range of dimensionalities [2,4,10].
Within this landscape of domain-specific benchmark construction, recent decomposition-based multi-objective approaches for coordinating trucks and unmanned aerial vehicles (UAVs) within cyber-physical internet frameworks [16] share with the present work the design philosophy of embedding domain-specific coupling and state-dependent dynamics into synthetic yet semantically meaningful problem formulations. However, such logistics-oriented benchmarks primarily emphasize spatial decomposition and multi-objective coordination among heterogeneous agents across multimodal transport networks, whereas the proposed irrigation benchmark instead emphasizes temporal state recursion through recursive soil-water storage and multiplicative seasonal yield aggregation as its principal difficulty mechanisms. These constitute orthogonal difficulty axes spatial decomposition and multi-objective Pareto-front navigation on one side, versus temporal path dependence and single-objective rugged landscape structure on the other suggesting that future benchmark suites could productively integrate mechanisms from both families.
Within agricultural water management, irrigation scheduling provides a natural domain for such benchmark construction. Agriculture accounts for approximately 70 percent of global freshwater withdrawals, and the efficient allocation of irrigation water across a growing season has direct implications for food production, resource conservation, and economic viability [17]. Existing studies have investigated evapotranspiration-driven scheduling based on the FAO Penman–Monteith framework, weather-aware irrigation control using probabilistic forecast integration, and simulation-optimization frameworks that couple crop growth models with operational decision support [18,19,20]. Real-time and data-assimilation-based scheduling approaches have demonstrated that incorporating short-term weather forecasts and sensor observations into the decision loop can substantially improve water use efficiency relative to rule-based alternatives [21,22]. Systematic reviews of simulation-optimization methods for irrigation scheduling have further shown that the choice of crop model, objective function structure, and optimization algorithm interacts in complex ways, highlighting the need for reproducible and well-characterized problem formulations [23]. However, many available formulations are developed as calibrated management tools for specific field conditions and crop systems rather than as intentionally difficult black-box benchmarks for continuous global optimization. Their calibration to site-specific agronomic data makes them neither self-contained nor algorithmically transparent, limiting their utility as general-purpose benchmarks. At the same time, the standard analytical benchmark suites used in continuous optimization, including CEC and BBOB, lack the semantic interpretability, state-dependent coupling, and operational realism that physically motivated domains such as irrigation scheduling naturally provide. There is therefore a recognized gap for benchmark problems that are simultaneously self-contained and analytically transparent, yet exhibit the non-separable, stateful, and rugged landscape structure that arises in real operational systems.
The present article therefore introduces a new literature-informed weather-coupled irrigation scheduling benchmark whose explicit role is not to reproduce a field-calibrated agronomic model, but to provide a mathematically explicit, semantically meaningful, and computationally demanding optimization landscape that bridges the gap identified above. The benchmark is entirely synthetic and deterministic: all exogenous weather and crop profiles are generated analytically from a normalized stage coordinate, and no external datasets, site-specific calibration, or field measurements are required. At the same time, its components are grounded in established agronomic concepts: evapotranspiration-based demand estimation following the FAO framework [18], yield response–water relationships consistent with the classical model of Doorenbos and Kassam [19], and weather-dependent scheduling principles as discussed in [20]. As a result, every term in the objective function retains a clear physical interpretation, even though the benchmark itself is not a digital twin of any specific field or crop system. The proposed benchmark is fully deterministic and analytically self-contained: all exogenous weather and crop profiles are generated from a normalized stage coordinate, requiring no external datasets, calibrated parameters, or field-specific inputs. Its objective function deliberately integrates six distinct difficulty mechanisms recursive soil-water storage propagation, oscillatory irrigation efficiency modulation, threshold-triggered drainage losses, multiplicative seasonal yield aggregation, neighbor-coupled hydraulic load, and a deceptive preferred-window penalty each of which has a clear agronomic interpretation and a quantifiable contribution to search landscape ruggedness. The benchmark is therefore positioned at the intersection of two research communities: it retains the domain semantics and operational realism of agricultural water management while providing the analytical transparency, controlled difficulty structure, and scalable dimensionality that the continuous optimization community requires for rigorous algorithm comparison.
The remainder of the article is organized as follows. Section 2 presents the proposed benchmark, including its background Section 2.1, full mathematical formulation Section 2.2, and experimental behavior Section 2.4 under a common evaluation setting Section 2.3. Section 3 discusses the optimization signature of the problem from a problem-centered perspective, and Section 4 summarizes the main conclusions.

2. Weather-Coupled Irrigation Scheduling Benchmark

2.1. Background of the Problem

The proposed benchmark is motivated by the need to represent irrigation scheduling as a genuinely coupled seasonal decision process rather than as a separable stage-wise water allocation rule. In realistic irrigation environments, the amount of water applied at one stage influences not only the immediate water balance of the crop, but also subsequent storage, drainage, evaporative losses, and the cumulative preservation of seasonal productivity [15,18,19,20,21,23]. This makes the domain naturally suitable for benchmark construction aimed at non-separable and state-dependent continuous optimization.
The benchmark is intentionally synthetic in the sense that all exogenous profiles are generated analytically from the normalized stage coordinate, yet it remains literature-informed in its semantic structure. To make this grounding explicit, the following correspondences between benchmark components and agronomic concepts are noted. The stage-wise crop evapotranspiration demand E T c i = E T o i · K c i · Δ t follows the standard reference-evapotranspiration construction of the FAO Penman–Monteith framework [18], in which a reference atmospheric demand E T o is scaled by a crop-specific coefficient Kc to obtain actual crop water requirements. The multiplicative yield-preservation factor Y r e l = y i is consistent with the classical yield response–water model of Doorenbos and Kassam [19], which relates relative yield reduction to relative evapotranspiration deficit through a crop-specific sensitivity coefficient K y . The recursive soil-water storage transition S i + 1 = f ( S i , r a i n i , I i e f f , ) reflects the widely adopted soil water-balance approach used in simulation-optimization frameworks for irrigation scheduling [21,22,23], in which the storage state at each stage depends on incoming water, drainage, and evaporative losses from the preceding stage. The weather-dependent profiles for heat, wind, rainfall, and tariff represent stylized seasonal patterns that capture the qualitative behavior documented in weather-aware scheduling studies [17,20], although their specific functional forms are chosen for analytical convenience rather than site-specific fidelity.
Beyond these literature-informed components, additional oscillatory, thresholded, and deceptive mechanisms are deliberately embedded to produce a challenging optimization landscape. These include the oscillatory modulation of irrigation efficiency η i ( x ) , which generates repeated local attraction basins, the threshold-triggered deep drainage loss, which introduces nonsmooth regime changes, and the deceptive preferred-window penalty P w i n , which attracts solvers toward irrigation levels that are suboptimal under the full objective. These mechanisms are synthetic ruggedness generators with no claim to precise field physics, they are introduced to make the benchmark genuinely difficult for modern continuous optimizers while remaining structurally interpretable. As a result, the problem should be interpreted as a literature-informed optimization benchmark with agronomic meaning, not as a calibrated digital twin of a particular field or crop system.

2.2. Mathematical Formulation

Let D 4 denote the problem dimension, and let x = ( x 1 , x 2 , , x D ) T be the decision vector, where x i is the irrigation depth (mm) applied at stage i of a synthetic growing season. The feasible set is the hyper-rectangle
Ω = [ 0 , 80 ] D , 0 x i 80 , i = 1 , , D .
The only hard constraints imposed on the problem are the box bounds 0 x i 80 on the decision variables. The internal storage state S i is maintained within admissible bounds [ 0 , S i m a x ] through the clipping operator in the storage transition equation, which acts as an implicit state constraint enforced by construction rather than by external penalization. All remaining operational constraints are treated as soft penalties incorporated directly into the objective function: the seasonal water budget is softly constrained through P b u d g e t , which penalizes deviation from the target B D = 28 D , peak hydraulic demand is softly constrained through P p e a k and terminal storage is regularized through P t e r m . This soft-constraint formulation is deliberate, as it preserves the box-constrained structure of the search domain while embedding operational feasibility requirements into the landscape topology of the objective.
The season length is fixed at 120 days, so the stage duration is
Δ t = 120 D ,
and the normalized stage coordinate is
s i = i D + 1 , i = 1 , , D .
Recommended difficult dimensions are D { 24 , 30 , 50 , 70 , 100 } , with D = 24 and D = 30 suitable for routine experiments and D = 50 or D = 70 or D = 100 producing substantially harder landscapes for population-based optimizers.
To make the benchmark fully deterministic and self-contained, all exogenous weather and crop coefficients are generated analytically from the normalized stage coordinate s i . No stochastic, scenario-based, or externally sampled weather realizations are involved: every weather profile is a fixed, closed-form trigonometric function of s i that produces identical values for any given dimension D. This design choice eliminates all sources of exogenous randomness from the benchmark, ensuring that any variability in optimization outcomes is attributable solely to the stochastic search process of the optimizer and not to weather input uncertainty. The specific functional forms are chosen to reproduce qualitative seasonal patterns (e.g., mid-season peaks in temperature and evapotranspiration, early-season dominance of rainfall) that are consistent with the general behavior documented in weather-aware scheduling studies [17,20], although they do not represent any particular geographic location or climate record. The stage-wise profiles are defined as:
E T o i = 2.8 + 3.2 sin 2 ( π s i ) ,
K c i = 0.55 + 0.65 sin ( π s i ) ,
r a i n i = 8 + 18 cos 2 ( π s i ) ,
K y i = 0.85 + 0.55 sin ( π s i ) ,
h e a t i = 0.40 + 0.60 sin 2 ( π s i ) ,
w i n d i = 0.35 + 0.65 cos 2 ( π s i ) ,
t a r i f   f i = 0.90 + 0.18 sin ( 2 π s i + 0.20 ) ,
η b a s e , i = 0.78 + 0.10 cos ( π s i ) ,
ρ i = 0.52 + 0.18 sin 2 ( π s i ) ,
S i m a x = 50 + 20 sin 2 ( π s i ) ,
f r e q i = 0.22 + 0.09 sin ( π s i ) ,
and
ϕ i = 0.70 i + 0.30 sin ( 2 π s i ) .
The crop evapotranspiration demand follows the standard evapotranspiration-based construction
E T c i = E T o i K c i Δ t ,
which grounds the benchmark in classical irrigation science [4]. The coefficient K y i is used as a stage sensitivity factor in the relative yield-preservation component, consistent with the general idea of yield response to water stress [19].
The first major source of difficulty is local decision coupling. A neighbor-coupled hydraulic load is introduced as
N i = 0.60 x i + 0.25 x i 1 + 0.15 x i + 1 ,
with boundary conventions x 0 = x D + 1 = 0 . Effective irrigation efficiency is then defined by the oscillatory nonlinear law
η i ( x ) = η b a s e , i [ 1 0.22 sin 2 ( f r e q i x i + ϕ i ) 0.10 sin 2 ( 0.07 N i + 0.50 ϕ i ) ] ,
clipped to the interval [ 0.45 , 1.00 ] , and the effective irrigation becomes
I i e f f = η i ( x ) x i .
These oscillatory factors are synthetic ruggedness generators. They are not intended to claim precise field physics, they are introduced to generate repeated attraction basins and to make coordinate-wise search substantially less effective.
The second major source of difficulty is path dependence through a synthetic soil-water storage state S i . The initial storage is fixed by
S 1 = 0.35 S 1 m a x .
The total water available at stage i is
A i = r a i n i + I i e f f + 0.55 S i ,
and the stage deficit and surplus are
d e f i c i t i = max ( 0 , E T c i A i ) ,
s u r p l u s i = max ( 0 , A i E T c i ) .
Deep drainage is modeled by the thresholded term
d e e p i = max ( 0 , s u r p l u s i 0.35 S i m a x ) 0.55 + 0.20 w i n d i + 0.15 sin 2 ( 0.11 x i + ϕ i ) ,
recharge is defined as
r e c h a r g e i = ρ i max ( 0 , s u r p l u s i d e e p i ) ,
evaporative loss is
e v a p l o s s i = ( 0.08 + 0.10 h e a t i + 0.04 w i n d i ) S i ,
and the storage transition is
S i + 1 = clip 0.82 S i + r e c h a r g e i e v a p l o s s i , 0 , S i m a x .
This recursion makes the objective stateful: changing one variable can alter the effective optimization landscape seen by many subsequent variables.
The storage transition is nonlinear due to the m a x ( · ) and c l i p ( · ) operators appearing in the drainage, surplus, deficit, and recharge terms. Rainfall and effective irrigation enter additively in the total available water A i = rain i + I i e f f + 0.55 S i while evapotranspiration demand E T c i is subtracted through the deficit-surplus split. The clipping to [ 0 , S i m a x ] ensures that the storage state remains non-negative and bounded at every stage, guaranteeing feasibility of the dynamic system. The contraction factor 0.82 in the transition ensures that the autonomous dynamics (without recharge) are inherently stable, preventing unbounded state accumulation.
A multiplicative relative yield-preservation term is included to prevent the problem from collapsing into a pure cost minimization model. The relative water-satisfaction ratio is
r i = min 1 , A i E T c i + ε , ε = 10 12 ,
the stage stress is
s t r e s s i = 1 r i ,
and the stage contribution to yield preservation is
y i = clip 1 K y i s t r e s s i 1.35 0.08 h e a t i exp A i 0.35 E T c i + 1 , 0.02 , 1 .
The season-level relative yield factor is then
Y r e l = i = 1 D y i .
The multiplicative construction is deliberate: a few severely stressed stages can dominate many small local improvements, which makes the global trade off structure significantly more deceptive than in additive reward formulations [15,19,21,23].
The benchmark is posed as the minimization problem
min x [ 0 , 80 ] D f ( x ) ,
where
f ( x ) = C w a t e r + C p u m p + P d e f + P e x c + P s m o o t h + P r e s + P i n t + P w i n + P t e r m + P b u d g e t + P p e a k 350 Y r e l .
To clarify the role and origin of each term, the eleven components of f ( x ) are organized into three categories according to their design rationale.
The first category contains terms that are directly motivated by agronomic and operational practice. The water-cost term C w a t e r represents the direct economic cost of water application, which is a primary objective in any irrigation scheduling formulation [17,20]. The pumping-cost term C p u m p captures the energy cost associated with pressurized water delivery, a well-recognized component of irrigation operating expenses [23]. The deficit penalty P d e f penalizes unmet crop water demand, reflecting the agronomic principle that water stress reduces yield [19]. The excess penalty P e x c penalizes over-irrigation that leads to deep drainage and waterlogging, consistent with the water-balance perspective in which surplus water beyond the root zone is a resource loss [18,21]. The terminal storage penalty P t e r m encourages the preservation of soil moisture at the end of the season, a common objective in multi-season water management [22]. The seasonal budget penalty P b u d g e t softly constrains total water use over the growing season, reflecting practical water-rights or allocation limits [16]. The multiplicative yield factor Y r e l aggregates stage-wise crop productivity into a seasonal yield metric, following the classical yield response–water framework [19].
The second category contains terms that model realistic operational concerns but whose specific functional forms are stylized. The smoothness penalty P s m o o t h discourages abrupt changes in irrigation between consecutive stages, reflecting practical constraints on irrigation system ramping and scheduling regularity. The peak-load penalty P p e a k penalizes excessive instantaneous hydraulic demand, which in practice can exceed the capacity of pumping and conveyance infrastructure. The neighbor-interaction penalty P i n t captures the operational coupling between adjacent irrigation stages, in which the cumulative load from consecutive high-application periods can create delivery bottlenecks.
The third category contains terms that are deliberately synthetic and are introduced to generate specific landscape difficulty features for benchmarking purposes. The resonance penalty P r e s and the deceptive preferred-window penalty P w i n are oscillatory and phase-shifted ruggedness generators designed to create dense local attractors and misleading basins that make the optimization landscape genuinely challenging for modern solvers. These terms have no direct agronomic counterpart, their role is purely to increase the benchmark’s difficulty in a controlled and ablatable manner, as verified experimentally in Section 2.4.
The water-cost term is
C w a t e r = i = 1 D t a r i f   f i x i ,
and the pumping-cost term is
C p u m p = i = 1 D 0.0045 N i 2 1 + 0.35 sin 2 ( 0.08 x i + 0.60 ϕ i ) .
A weather-modulated shock factor is introduced by
s h o c k i = 1 + 0.18 h e a t i sin 2 ( 0.09 x i + 1.3 w i n d i + ϕ i ) ,
and the deficit and excess penalties are given by
P d e f = i = 1 D ( 0.12 + 0.06 h e a t i ) d e f i c i t i 2 s h o c k i ,
P e x c = i = 1 D ( 0.05 + 0.05 w i n d i ) d e e p i 2 + 0.30 s u r p l u s i 2 .
Temporal roughness and local resonance are imposed by
P s m o o t h = i = 2 D ( 0.06 + 0.02 w i n d i ) ( x i x i 1 ) 2 ,
P r e s = i = 1 D 8.0 ( 1 + 0.70 h e a t i ) sin 2 ( f r e q i x i + ϕ i ) .
Neighbor-stage overloading and bilinear coupling are represented by
P i n t = i = 2 D 0.018 max ( 0 , x i + 0.55 x i 1 58 ) 2 + 0.025 x i x i 1 .
A deceptive preferred-window term is defined through
μ i = 18 + 26 sin 2 ( π s i ) ,
P w i n = i = 1 D 0.014 + 0.010 sin 2 ( π s i ) ( x i μ i ) 2 1 + 0.40 sin 2 ( 0.06 x i + ϕ i ) .
The terminal storage penalty is
P t e r m = 1.2 S D + 1 0.45 S D m a x 2 ,
the soft seasonal budget penalty uses
B D = 28 D ,
and is defined by
P b u d g e t = 0.010 i = 1 D x i B D 2 ,
while the peak-load penalty is
P p e a k = 0.035 i = 1 D max ( 0 , N i 62 ) 2 .
The complete problem is therefore a box-constrained, deterministic, strongly non-separable, rugged, and partially nonsmooth minimization benchmark.
No closed-form global optimum is assumed known. This is intentional and is part of the benchmark design. The objective contains repeated oscillatory structures, threshold-triggered regime changes, clipped internal states, recursive storage dynamics, multiplicative stage aggregation, and coupled penalties. Consequently, the landscape contains many local minima and deceptive trade offs, and performance cannot be inferred from low-dimensional intuition alone. From the perspective of benchmark design, the difficulty comes from the simultaneous presence of six mechanisms: recursive state propagation, local oscillatory efficiency modulation, thresholded drainage and peak penalties, multiplicative yield aggregation, neighbor coupling through N i , and soft preferred-window structure that competes with the remaining terms instead of revealing the optimum. For these reasons, the benchmark is substantially harder than a smooth irrigation cost-yield trade off model while remaining interpretable as a weather-coupled irrigation scheduling problem.
Although no closed-form global optimum is available for the full benchmark formulation, the fully simplified variant A L L O F F provides a tractable reference point for absolute performance calibration. In the A L L O F F formulation, all embedded difficulty mechanisms are disabled and the resulting landscape is smooth and convex-like: as demonstrated in Section 2.4, the methods arq, cmaes, lmcmaes, lracmaes, jso, and mlshaderl converge to identical objective values with 100% success rates and zero standard deviation across all dimensions, establishing a consensus empirical optimum that can serve as a lower bound reference for the full formulation. The optimality gap for any method on the full benchmark can therefore be estimated as the difference between its achieved objective value and the corresponding A L L O F F consensus value, providing an absolute rather than purely relative measure of solution quality. The optimality gap for a given method on the full formulation is defined explicitly as Δ = f ( A L L ) f ( A L L O F F ) , where f ( A L L ) is the best objective value achieved by the method across 30 independent runs on the full formulation, and f ( A L L O F F ) is the consensus reference value established by the subset of methods that solve the A L L O F F variant to optimality with 100% success rates and zero standard deviation. This definition is solver-independent: the consensus f ( A L L O F F ) is not the average of all methods’ A L L O F F results but the value agreed upon by those solvers that reliably converge on the simplified landscape. Methods that fail to solve A L L O F F such as gwo, ga, and clpso at large dimensions are therefore not used to establish the reference, and their optimality gaps are computed against the consensus established by the converging subset. This yields a conservative lower bound on their true optimality gap rather than an overestimate, since their A L L O F F best values exceed f ( A L L O F F ) , making their apparent A L L gap smaller than it would be if measured against the true optimum. This approach is consistent with established practice in black-box benchmarking: the BBOB platform and the CEC suites both rely on inter-algorithm relative comparison as the primary performance metric when closed-form optima are not analytically available, and the availability of a tractable simplified variant with a verifiable consensus optimum provides an additional calibration layer that is rarely present in classical synthetic benchmarks.
For convenience, Table 1 summarizes all parameters, auxiliary computed variables, and numerical coefficients appearing in the mathematical formulation of this benchmark.

2.3. Experimental Protocol

All optimization algorithms considered in this study, together with the proposed weather-aware irrigation benchmark, were implemented in C++ and integrated into the open-source OptimSolution framework, which was developed by Vasileios Charilogis. The corresponding source code is publicly available through the project’s GitHub repository: https://github.com/charilog/optimsolution (Accessed on 4 June 2026). All experiments were compiled and executed under Debian 12.12 using GCC 13.4 on a workstation equipped with an AMD Ryzen 9 9950X3D processor (16 cores, 32 threads) and 64 GB of DDR4 memory.
To ensure a fair and reproducible comparison, the same objective-function evaluation budget was imposed on all competing solvers. Since the benchmark itself is deterministic whereas the optimization methods are stochastic, each method was executed independently 30 times using different random seeds. The experimental campaign covered the recommended benchmark dimensions D { 24 , 30 , 50 , 70 , 100 } , thus including both moderate and substantially more demanding instances.
The ten optimization methods included in the comparison were selected to represent five distinct algorithmic families that are widely used in continuous global optimization: adaptive differential evolution (jso [24], mlshaderl [25]), comprehensive learning particle swarm optimization (clpso [26]), covariance matrix adaptation evolution strategy (cmaes [27], lmcmaes [28,29], lracmaes [30]), genetic algorithm (ga [31]), grey wolf optimizer (gwo [32]), ant colony optimization for continuous domains (acor [33]), and adaptive restart-based search (arq [34]). This selection was guided by three criteria: coverage of fundamentally different search mechanisms (mutation-based, swarm-based, rank-based, estimation-based, and restart-based), inclusion of both classical methods (ga, cmaes) and recent state-of-the-art solvers (jso, mlshaderl, arq), and availability of mature, well-documented implementations within the OptimSolution framework. The study focuses exclusively on population-based and derivative-free methods because the benchmark is positioned as a black-box optimization problem for which gradient information is not assumed available. The evaluation of gradient-based solvers, surrogate-assisted methods, and reinforcement learning approaches is noted as a direction for future work in Section 4.
In addition to the full problem formulation ( A L L ), the experimental protocol also included simplified ablation-based variants obtained by selectively disabling individual difficulty mechanisms embedded in the benchmark. These mechanisms include recursive storage propagation, oscillatory efficiency modulation, threshold-based loss effects, multiplicative yield aggregation, neighbor coupling, and preferred-window regularization. A fully simplified variant, denoted by A L L O F F , was also examined, with all difficulty-inducing mechanisms disabled. This broader protocol allows the experimental analysis to assess not only the optimization performance achieved on the complete benchmark, but also the relative contribution of each embedded mechanism to the overall search difficulty of the problem. The abbreviations used for all variants are summarized in Table 2.
For each optimizer on each benchmark variant, the reported statistics were computed over 30 independent runs. The principal performance indicators were the best and mean final objective values obtained at termination. These measures were complemented by the standard deviation, convergence analysis, and rank-based comparisons in order to provide a more complete view of robustness, stability, and search efficiency.

2.4. Experimental Results

Table 3 summarizes the complete experimental results obtained by applying the ten competing optimization methods to the full benchmark formulation ( A L L ) and to each of its six ablation-based variants ( A L L O F F , N O R S , N O O E , N O T L , N O M Y , N O N C , N O P W ), across the five benchmark dimensions D { 24 , 30 , 50 , 70 , 100 } . For every method-variant-dimension combination, four summary statistics are reported over 30 independent runs: the best objective value observed across all runs (Value), the mean final objective value at termination (Mean), the empirical success rate (Rate %), defined as the percentage of runs in which the method reached the best value found by any solver on that instance, and the standard deviation of the final objective values (SD). The performance indicator Rate therefore provides a run-level reliability measure that complements the point estimates given by Value and Mean. The table is organized by problem variant: each variant occupies a labeled block, and within each block the five dimensionality levels appear as consecutive rows. This structure allows a direct reading of both the cross-method comparison for a fixed variant and dimension, and the scaling behavior of each method as the problem dimension increases. Methods are listed in a fixed order throughout the table: arq [34], clpso [26], cmaes [27], ga [31], gwo [32], jso [24], lmcmaes [28,29], lracmaes [30], mlshaderl [25], and acor [33]. The column structure within each dimension block is (Value, Mean, Rate, SD), repeated once per method. Results that correspond to a Rate of 100 indicate that the method reached the reference best solution in all 30 runs, a Rate of 3 represents the minimum observable rate under the 30-run protocol, indicating that the method succeeded in only one run.
Full problem formulation ( A L L ). The results for the complete benchmark reveal a strongly differentiated and dimension-dependent performance landscape. At D = 24 , the method jso emerges as the most reliable solver with a success rate of 40%, converging to the reference best value of approximately 1487.73. The method arq attains the same best value but only in 7% of runs. All remaining solvers clpso, cmaes, ga, gwo, lmcmaes, lracmaes, mlshaderl, and acor achieve the minimum success rate of 3%, indicating that even at the lowest recommended dimension, the benchmark is genuinely challenging for the majority of established optimizers. The standard deviations reveal a clear separation between methods: jso and arq maintain SD values below 2.5, while ga and gwo exhibit SD values exceeding 40 and 51 units, respectively, reflecting high run-to-run variability characteristic of methods that lack sufficient local exploitation capacity in rugged landscapes.
As the dimension increases to D = 30 , the overall difficulty increases substantially. Only mlshaderl achieves a success rate above the minimum (10%), converging to the reference best value of approximately 2736.36. All other methods report success rates of 3%, with mean values deviating from the reference best by margins ranging from 8 units (arq) to 188 units (gwo). The rapid deterioration from D = 24 to D = 30 is attributable to the recursive storage propagation mechanism, which extends the effective dependency chain by six additional stages and thereby increases the probability that at least one stage accumulates sufficient error to deflect the solver from the global basin. This explanation is consistent with the substantially milder dimension scaling observed in the NORS variant, where the recursive storage is disabled.
At D = 50 , the benchmark becomes markedly harder. The method acor achieves the lowest best value (approximately 6784.16), but with a success rate of only 3%. All ten methods report success rates of 3%, and the gap between best and mean values widens substantially: ga reports a mean of 7265.92 against a best of 6940.32, and gwo reports 7538.50 against 7193.55. These gaps indicate that most runs for these methods terminate in qualitatively different and substantially inferior regions of the landscape. The methods jso, mlshaderl, and acor maintain relatively tight standard deviations (below 19 units), suggesting that while they cannot reliably find the global best, they consistently locate competitive local basins a behavior consistent with their archive-based and distribution-estimation search mechanisms, which provide implicit memory of previously explored regions.
At D = 70 , all methods achieve success rates of 3%. The best observed value (approximately 11,195.22, obtained by lracmaes) is followed by jso (11,214.25) and mlshaderl (11,246.26), while ga and gwo report best values exceeding 11,700 and 12,400, respectively. The standard deviations for ga (297.79) and gwo (380.82) are an order of magnitude larger than those of jso (40.04) and cmaes (30.74). This confirms that the simpler population recombination (ga) and swarm attraction (gwo) mechanisms fail to maintain sufficient search distribution precision in a 70-dimensional landscape characterized by coupled oscillatory ruggedness and multiplicative yield aggregation.
At the most challenging setting, D = 100 , the performance hierarchy among methods remains visible despite all methods achieving 3% success rates. The method cmaes achieves the lowest best value (approximately 18,161.57), followed by jso (18,203.71) and mlshaderl (18,213.28). The emergence of cmaes as a competitive solver at the highest dimension, despite its poor reliability, is explained by its covariance matrix adaptation mechanism, which can capture local second-order structure in the landscape and exploit it for rapid descent when initialized near a promising basin. This capability becomes relatively more valuable as the number of competing local attractors grows with dimension. By contrast, ga and gwo report best values exceeding 19,260 and 20,124, respectively, with standard deviations above 398 and 712, indicating near-complete loss of search coherence at this scale.
Simplified formulation with all difficulty mechanisms disabled ( A L L O F F ). In sharp contrast with the A L L variant, the A L L O F F formulation is solved to optimality by the large majority of methods across all dimensions. At D = 24, six out of ten methods arq, cmaes, jso, lracmaes, mlshaderl, and acor achieve success rates of 100% with standard deviations that are numerically indistinguishable from zero. The reference best values for A L L O F F scale moderately with dimension, from approximately 458.71 at D = 24 to 5087.80 at D = 100 . Even at D = 100 , arq, cmaes, and jso maintain 100% success rates, while mlshaderl achieves 70%. The clpso, ga, and gwo methods are the only consistent exceptions, achieving rates of 3% across all dimensions with means that deviate from the reference best by increasingly large margins. These results demonstrate unambiguously that, in the absence of all embedded difficulty mechanisms, the benchmark collapses to a smooth, tractable problem that most modern continuous optimizers can solve reliably, and that the difficulty reported in the ALL variant is genuinely structural rather than numerical.
Formulation without recursive storage ( N O R S ). Removing the recursive soil-water storage propagation restores some problem tractability compared to the ALL variant, but the problem remains substantially harder than A L L O F F . At D = 24 , jso and mlshaderl both achieve success rates of 63%, while arq and acor attain 20% each. At D = 30 , success rates decline to 7% for jso, lracmaes, mlshaderl, and acor. At higher dimensions (D = 50–100), all methods except acor and lracmaes (7% at D = 50 ) report success rates of 3%. The persistence of low success rates even without recursive storage confirms that the remaining mechanisms particularly the oscillatory efficiency modulation and the multiplicative yield aggregation are independently sufficient to create a challenging landscape. From an agronomic perspective, the recursive storage mechanism represents the temporal coupling that is fundamental to real irrigation scheduling: the soil moisture state at any point in the growing season depends on all prior irrigation decisions, creating a sequential dependency that cannot be decomposed into independent stage-wise choices. Its removal simplifies the problem by making each stage’s contribution to the objective approximately independent of preceding decisions.
Formulation without oscillatory efficiency ( N O O E ). This ablation produces the most dramatic improvement in solver performance across the entire experimental campaign, confirming that the oscillatory efficiency mechanism is the single most consequential source of landscape difficulty. At D = 30 , four methods arq, jso, mlshaderl, and acor achieve success rates of 100% with zero standard deviation, while cmaes achieves 97%. At D = 50 , jso and mlshaderl maintain 100% success rates, cmaes achieves 97%, arq 83%, and lracmaes 73%. Even at D = 100 , cmaes achieves a success rate of 53%, jso 30%, and mlshaderl 17% results that are dramatically better than the 3% rates observed for all methods in the A L L variant at the same dimension. The structural interpretation of this finding is clear: the oscillatory modulation of η i ( x ) , which depends nonlinearly on both the local decision x i and the neighbor-coupled load N i , introduces a dense lattice of local attractors that are energetically competitive with the global basin. When this mechanism is removed, the residual landscape formed by the water cost, pumping cost, deficit and excess penalties, and storage terms is sufficiently smooth to permit gradient-like local improvement strategies to succeed over repeated restarts. The practical implication for benchmark users is that the oscillatory efficiency amplitude (controlled by the parameter 0.22 in the default formulation) is the primary lever for adjusting the overall difficulty of the benchmark.
Formulation without threshold losses ( N O T L ). The removal of threshold-triggered drainage losses produces moderate improvement relative to A L L . At D = 24 , jso achieves a success rate of 50% and mlshaderl 17%. At D = 30 , jso reaches 97%, mlshaderl 73%, arq and cmaes both 30%, and lracmaes 23%. At D = 50 , jso (47%), mlshaderl (37%), arq (27%), and cmaes (27%) all exceed the minimum rate. However, at D = 70 and D = 100 , all methods collapse to 3%. These results indicate that the threshold drainage mechanism, which introduces nonsmooth regime changes when the water surplus exceeds a fraction of the storage capacity, contributes meaningfully to the difficulty of the intermediate-dimensional regime but is not the primary driver of optimizer failure at large scales. In agronomic terms, threshold drainage corresponds to the physical phenomenon of deep percolation beyond the root zone, which occurs abruptly once the soil water content exceeds field capacity a nonlinearity that is genuinely present in real irrigation systems and that makes the optimization landscape locally nonsmooth.
Formulation without multiplicative yield aggregation ( N O M Y ). This variant replaces the multiplicative seasonal yield factor Y r e l = i y i with an additive average. At D = 24 , jso achieves a success rate of 37%, while arq, cmaes, and mlshaderl each achieve 7%. At D = 30 and above, all methods report success rates of 7% or lower, with the exception of mlshaderl and acor at D = 30 (both 7%). The improvement relative to A L L is modest at low dimensions and negligible at high dimensions, indicating that multiplicative yield aggregation amplifies the deceptive structure of the objective but is not the sole driver of difficulty. The agronomic rationale for the multiplicative construction is that crop yield is a cumulative outcome of the entire growing season: a single period of severe water stress can irreversibly damage the crop, and this damage cannot be compensated by excess water at other stages. The multiplicative form captures this non-compensatory character, which is consistent with the classical Doorenbo–Kassam yield-response model [19].
Formulation without neighbor coupling ( N O N C ). The removal of the neighbor-coupling term N i produces a formulation in which each stage’s hydraulic load depends only on its own decision variable. At D = 24 , arq and jso both achieve success rates of 10%, while all other methods report 3%. At D = 30 , arq and acor achieve 7%, and all others 3%. At higher dimensions, all methods report 3%. The overall improvement relative to ALL is modest, confirming that neighbor coupling contributes to non-separability but is not the dominant difficulty mechanism. The relatively small effect of removing neighbor coupling is consistent with the observation that the oscillatory efficiency and multiplicative yield mechanisms create difficulty through fundamentally different channels: local ruggedness and global aggregation deceptiveness, respectively. By contrast, neighbor coupling primarily introduces mild inter-variable dependencies that modern population-based methods can partially accommodate through their implicit modeling of variable correlations.
Formulation without preferred-window regularization ( N O P W ). This final ablation removes the deceptive preferred-window penalty P w i n . At D = 24 , arq achieves a success rate of 17% and jso 7%, while all other methods report 3%. At D = 30 , jso achieves 17%, arq, lracmaes, and mlshaderl 7%, and all others 3%. At higher dimensions, all methods collapse to 3%. The best observed values for this variant are systematically lower than those for ALL (e.g., approximately 6681.29 vs. 6784.16 at D = 50 ), confirming that the preferred-window term acts as a genuine penalty that shifts the global optimum relative to the attractors created by the remaining terms. The modest improvement in success rates indicates that, while the preferred-window mechanism does introduce additional deceptive structure, its contribution to the overall difficulty is smaller than that of the oscillatory efficiency or the recursive storage mechanisms.
Cross-variant and cross-method synthesis. Taken together, the ablation study reveals a clear hierarchy of mechanism importance. The oscillatory efficiency modulation is unambiguously the dominant source of landscape difficulty: its removal enables multiple solvers to achieve 100% success rates at dimensions where the full formulation yields universal 3% rates. The recursive storage propagation and threshold drainage mechanisms occupy the second tier, each contributing meaningful but less dramatic difficulty that primarily affects the intermediate-dimensional regime (D = 30–50). The multiplicative yield aggregation, neighbor coupling, and preferred-window regularization contribute more modestly, with their effects becoming largely indistinguishable from the baseline difficulty at high dimensions. This ordering has a structural interpretation: the benchmark’s hardness is primarily driven by local landscape ruggedness (oscillatory efficiency) and temporal state coupling (recursive storage), rather than by global aggregation structure or inter-variable connectivity. Crucially, no single ablation restores the problem to the triviality of the fully simplified variant, confirming that the benchmark’s difficulty is genuinely multi-mechanism in character.
Among the ten competing methods, jso exhibits the strongest overall performance in the full formulation, achieving the highest success rate at D = 24 (40%) and competitive best values across all dimensions. The methods lracmaes, mlshaderl, and acor follow closely, with lracmaes achieving the best observed value at D = 70 , mlshaderl’s historical archive mechanism and acor’s continuous probability density estimation providing complementary advantages for navigating the benchmark’s coupled structure. At the highest dimension ( D = 100 ), cmaes emerges as a competitive solver in terms of best observed values, despite its low success rate, reflecting the value of second-order covariance information in very high-dimensional rugged landscapes. The methods ga and gwo consistently occupy the lowest performance tier across all variants and dimensions, with standard deviations that are typically 5–20 times larger than those of the best-performing methods. Their simpler search mechanisms tournament-based recombination in ga and hierarchical swarm leadership in gwo are fundamentally insufficient for the combined non-separability, state recursion, and oscillatory structure of the proposed benchmark.
Table 4 reports the mean wall-clock time per run for each method across all five dimensions. The benchmark’s objective function is inexpensive to evaluate: for eight of the ten methods, the time per run ranges from 0.22 s at D = 24 to approximately 1.6 s at D = 100 on the reference hardware. The exceptions are cmaes and lracmaes, both of which exhibit substantially higher wall-clock times that grow with dimension: cmaes from 2.53 s at D = 24 to 380.21 s at D = 100 , and lracmaes from 4.87 s at D = 24 to 800.07 s at D = 100 , reflecting the O ( D 2 ) cost of their covariance matrix adaptation mechanisms rather than the benchmark’s evaluation cost. A complete 30-run experimental campaign at D = 100 therefore requires less than one minute for most methods, confirming that the benchmark is computationally lightweight and suitable for routine algorithm development, with the caveat that covariance-based methods incur substantially higher per-run costs at large dimensions.
A cost-benefit analysis of the covariance-based methods at large dimensions reveals an important practical consideration for benchmark users. At D = 100 , lracmaes requires approximately 800s per run, yielding a total experimental cost of 800 × 30 24,000 s (approximately 6.7 h) for a single 30-run campaign, while cmaes requires approximately 380 × 30 11,400 s (approximately 3.2 h). By contrast, all remaining eight methods complete their 30-run campaigns at D = 100 in under one minute in total. Despite this substantial computational investment, lracmaes and cmaes do not achieve qualitatively superior performance at D = 100 relative to jso and mlshaderl, which require only seconds per full campaign: as shown in Table 3, all four methods achieve success rates of 3% at D = 100 , and cmaes achieves the lowest best value only marginally ahead of jso. This cost-benefit asymmetry motivates a practical recommendation: for benchmark users employing methods with super-linear scaling in D, such as any covariance matrix adaptation variant, D = 50 is recommended as the default experimental setting. At D = 50 , lracmaes and cmaes complete 30 runs in approximately 30 and 14 min, respectively—a computationally tractable investment while still exposing the full multi-mechanism difficulty of the benchmark—as evidenced by the universal 3% success rates reported in Table 3. The dimensions D = 70 and D = 100 remain available for studies specifically targeting the scaling behavior of covariance-based methods, with the understanding that the associated computational cost scales as O ( D 2 ) and may require dedicated hardware resources. For gradient-based solvers, surrogate-assisted methods, and reinforcement-learning approaches identified as directions for future work in Section 4 the benchmark’s evaluation cost is negligible (well below 1 s per evaluation for all dimensions), making the full range D 24 , 30 , 50 , 70 , 100 computationally accessible without restriction.
The recommended upper dimension of D = 100 is intentional and sufficient for the purposes of this benchmark. As demonstrated in Section 2.4, all ten competing methods achieve success rates of 3 % at D = 100 , representing the minimum observable rate under the 30-run protocol. This universal solver failure indicates that the benchmark has already reached a difficulty ceiling at D = 100 : extending to D = 200 or D = 500 would not produce qualitatively new algorithmic findings but would instead amplify the computational cost of the covariance-based methods (notably cmaes, whose wall-clock time scales as O ( D 2 ) , reaching 380 s per run at D = 100 ) to levels that preclude routine experimental use. Furthermore, the combinatorial growth argument presented in Section 3 establishes that the effective number of local attractors scales as 5 D , yielding approximately 5 100 10 69 at D = 100 , a landscape complexity that is already far beyond the exhaustive coverage capacity of any population-based solver. The benchmark is therefore not limited by its upper dimension but by the fundamental intractability of its landscape structure, which manifests fully within the recommended range D { 24 , 30 , 50 , 70 , 100 } .

2.5. Parameter Sensitivity Analysis

To assess how sensitive the benchmark’s optimization landscape is to its internal coefficients, a one-at-a-time perturbation study was conducted on six parameters that directly control the primary difficulty mechanisms of the benchmark. These parameters were selected because the ablation study in Section 2.4 identified the oscillatory efficiency and multiplicative yield mechanisms as the two most consequential sources of landscape difficulty, and the six parameters studied here are the numerical coefficients that directly modulate these mechanisms. These parameters are: the oscillatory efficiency amplitude (default 0.22), which governs the depth of the sinusoidal modulation in η i ( x ) , the resonance penalty weight (default 8.0), which scales the local ruggedness term P r e s , the oscillation frequency base and amplitude (defaults 0.22 and 0.09), which jointly determine the density of local attractors through f r e q i , and the yield-sensitivity base and amplitude (defaults 0.85 and 0.55), which control the stage-wise crop stress coefficient K y i . Each parameter was varied independently across its tested range while all others were held at their default values. For each setting, 30 independent runs of mlshaderl were executed at D = 30 , and the mean best objective value, standard deviation, minimum, and maximum were recorded. The main effect of each parameter is defined as the difference between the highest and lowest mean best values observed across its tested range. Table 5 reports the complete results, and Figure 1 visualizes the sensitivity profiles.
The one-at-a-time perturbation protocol adopted in this study is a deliberate and well-established choice for benchmarks whose primary goal is mechanism transparency rather than global parameter optimization. Single-factor variation isolates the individual contribution of each coefficient to the observed landscape difficulty, a property that would be obscured by factorial or Latin hypercube designs in which multiple parameters vary simultaneously. This interpretability is essential for benchmark users who wish to adjust a specific difficulty mechanism without inadvertently altering others. The analysis was conducted at D = 30 because this dimension lies at the transition between partial and universal solver failure in the full formulation, making it the most sensitive and informative setting for detecting landscape changes induced by parameter perturbations. The use of mlshaderl as the reference solver is justified by its consistent position among the top-performing methods across all dimensions and variants in Section 2.4, its low run-to-run variability (SD below 15 units at D = 30 ), and its archive-based mechanism, which provides stable convergence behavior that makes mean best objective values a reliable proxy for landscape difficulty rather than a reflection of solver-specific search artifacts.
The main effect column in Table 5 provides a clear ranking of parameter importance. The oscillatory efficiency amplitude exhibits by far the largest main effect (397.27), confirming quantitatively the conclusion drawn from the ablation study: this single coefficient is the dominant lever controlling the benchmark’s difficulty. As the amplitude increases from 0 to 0.33, the mean best objective value decreases monotonically from 2899.34 to 2502.08, reflecting the fact that stronger oscillatory modulation reduces the effective irrigation efficiency and thereby shifts the entire cost-yield trade off toward lower absolute objective values. The standard deviation remains stable across all settings (between 5.88 and 18.21), indicating that the parameter affects the location of the optimal basin rather than the reliability of convergence. The resonance penalty weight ranks second with a main effect of 203.05. Its influence is monotonically increasing: higher weights raise the mean best value from 2636.94 (at weight 5) to 2839.99 (at weight 11), consistent with the expected behavior that a stronger resonance penalty adds cost to all feasible solutions proportionally. The remaining four parameters oscillation frequency base (main effect 18.19), oscillation frequency amplitude (14.81), yield-sensitivity amplitude (5.05), and yield-sensitivity base (3.83) exhibit substantially smaller main effects, indicating that the benchmark’s landscape structure is robust to moderate perturbations of these coefficients. In particular, the near-negligible sensitivity to K y base and amplitude (main effects below 5.1) confirms that the specific numerical values of the crop stress coefficients have only a marginal influence on the achievable objective values at this dimension. The multiplicative yield mechanism contributes to the deceptive structure of the objective primarily through its aggregation form, not through the precise values of its coefficients.
Figure 1 visualizes the sensitivity profiles reported in Table 5. The top-left panel confirms the dominant and monotonically decreasing relationship between the oscillatory efficiency amplitude and the mean best objective value. The top-right panel shows the monotonically increasing effect of the resonance weight. The middle panels reveal that the oscillation frequency parameters produce only mild, non-monotonic variations within a narrow band, suggesting that the density of local attractors is relatively stable across the tested frequency range. The bottom panels confirm the near-flat sensitivity to the yield-sensitivity coefficients, with variations that fall within the inter-run standard deviation and are therefore not statistically distinguishable from noise.
To assess whether the parameter importance ranking is consistent across dimensions, the analysis was extended to D = 50 , Table 6 and Figure 2 report results for both dimensions. The ordinal ranking is preserved: the oscillatory efficiency amplitude retains its dominant position (main effect 962.89 at D = 50 , versus 397.27 at D = 30 ), the resonance penalty weight ranks second at both dimensions (356.97 and 203.05, respectively), and the oscillation frequency parameters remain negligible. The yield-sensitivity base and amplitude, which exhibit small non-zero effects at D = 30 (3.83 and 5.05), collapse to exactly zero at D = 50 , reflecting the structural property that at this dimension the shorter stage window ( Δ t = 2.4 days) allows high-quality solvers to consistently find near-zero-stress solutions, at which point the K y i coefficient vanishes from the stage yield computation. This dimensional transition is consistent with the ablation findings and does not reduce benchmark difficulty: all ten methods achieve 3% success rates at D = 50 .

2.6. Comparative Difficulty Assessment on Classical Benchmarks

To position the difficulty of the proposed benchmark relative to established test functions, the nine optimizers that are common to both experiments arq, clpso, cmaes, ga, jso, lmcmaes, lracmaes, mlshaderl, and acor were evaluated on four classical continuous optimization benchmarks at D = 30 and D = 50 under the same experimental protocol (30 independent runs, identical evaluation budget). The four benchmark functions were selected to represent distinct difficulty profiles that are individually present in the proposed irrigation benchmark but rarely combined in a single classical test function.
The Ackley function [35] is a widely used multimodal benchmark with a nearly separable structure and an exponentially decaying envelope that creates a broad basin surrounding the global optimum. It was selected because its multimodal character provides a baseline for assessing how the solvers handle oscillatory local structure in the absence of inter-variable coupling. The Rastrigin function [36] is a strongly multimodal, separable function whose cosine-based oscillatory landscape produces a dense lattice of local minima a structure that is directly analogous to the oscillatory efficiency modulation η i ( x ) embedded in the proposed benchmark. Its inclusion allows a controlled comparison between separable and non-separable oscillatory difficulty. The Rosenbrock function [37] is a unimodal but non-separable function characterized by a narrow curved valley, making it a standard test for the ability of optimizers to follow non-separable gradient structure. It was selected to isolate the contribution of non-separability to search difficulty, independent of multimodality. The Schwefel function [38] features a deceptive global structure in which the second-best local minimum is geometrically distant from the global optimum, creating a landscape that misleads solvers toward suboptimal regions. This deceptive character is analogous to the preferred-window penalty P w i n and the multiplicative yield aggregation Y r e l in the proposed benchmark. Table 7 reports the complete results.
The results in Table 7 reveal a striking contrast between the difficulty of the classical benchmarks and that of the proposed irrigation benchmark. On Ackley at D = 30 , seven out of nine solvers (arq, cmaes, jso, lmcmaes, lracmaes, mlshaderl, and acor) achieve success rates of 100%, and cmaes, jso, lmcmaes, lracmaes, and mlshaderl maintain 100% at D = 50 . This confirms that multimodality alone, when combined with near-separability and a smooth global envelope, is readily handled by modern continuous optimizers. The Rastrigin function, which presents a denser lattice of local minima, is more challenging: at D = 50 , cmaes, lmcmaes, and lracmaes achieve 100%, while arq drops to 10% and most other methods fall to 3%. Nevertheless, the best-performing solvers maintain high reliability, indicating that separable oscillatory structure, even when dense, does not constitute a fundamental barrier for methods with adequate covariance adaptation or restart mechanisms.
The Rosenbrock function introduces non-separability without multimodality. At D = 30 , arq, jso, and lmcmaes achieve 100%, while mlshaderl achieves 90% and cmaes 87%. At D = 50 , cmaes maintains a success rate of 93%, demonstrating that non-separable valley structure alone remains tractable for covariance-based methods. The Schwefel function presents the closest difficulty profile to the proposed benchmark among the four classical functions: at D = 30 , only mlshaderl achieves a meaningful success rate (40%), and at D = 50 , mlshaderl drops to 10% and all other methods to 7% or below. The deceptive global structure of Schwefel thus represents a genuine challenge, but one that is still partially navigable by archive-based methods.
By comparison, the proposed irrigation benchmark at D = 30 yields a maximum success rate of only 10% (mlshaderl), and at D = 50 , all nine methods achieve the minimum rate of 3% without exception. This represents a qualitative step beyond even the Schwefel function in terms of search difficulty. The explanation lies in the simultaneous presence of mechanisms that the classical functions present only in isolation: the proposed benchmark combines the dense oscillatory local structure of Rastrigin (through η i ( x ) ), the non-separable coupling of Rosenbrock (through the recursive storage S i and the neighbor load N i ), the deceptive global structure of Schwefel (through P w i n and the multiplicative Y r e l ) , and a stateful path dependence that is entirely absent from all four classical functions. This combination creates a landscape in which no single algorithmic capability whether covariance adaptation, archive-based memory, or probability density estimation is sufficient to ensure reliable convergence, confirming that the proposed benchmark occupies a difficulty regime that is not adequately represented by existing classical test functions.

2.7. Comparison with CEC 2011 Real-World Optimization Problems

To directly address the relationship between the proposed benchmark and established real-world optimization problem suites, the ten optimizers were additionally evaluated on four problems drawn from the CEC 2011 real-world optimization problem suite [14]. The selected problems were chosen to represent cases in which modern continuous optimizers retain partial or full tractability, thereby providing a meaningful upper bound on solver capability against which the difficulty of the proposed benchmark can be calibrated. The four problems span dimensions D { 12 , 15 , 40 , 96 } and cover distinct engineering domains: circular antenna array design (AntennaArray, D = 12 ), static economic load dispatch (ELD3, D = 15 , ELD4, D = 40 ), and hydrothermal generation scheduling (Hydrothermal, D = 96 ). None of the four problems has an analytically known global optimum, making the comparison methodology consistent with the black-box evaluation framework used throughout this study. The same experimental protocol was applied: 30 independent runs per method with an identical function evaluation budget. Section 2.7 reports the complete results.
The results reveal a clear and interpretable difficulty gradient across the four CEC 2011 problems. AntennaArray ( D = 12 ) is solved to global optimality by jso and mlshaderl in all 30 runs, and arq achieves a 60% success rate, confirming that this problem, despite its real-world origin, presents a landscape that is fully tractable for archive-based and adaptive differential evolution methods. ELD3 ( D = 15 ) remains partially tractable: mlshaderl achieves a success rate of 43%, acor 30%, and arq 20%, while the remaining methods report rates of 3%. At D = 40 , ELD4 represents the most tractable large-scale instance in the CEC 2011 suite tested here: acor achieves a success rate of 63%, the highest observed across all CEC 2011 problems, and mlshaderl attains 3% with competitive best values. At the largest dimension tested, Hydrothermal ( D = 96 ), arq achieves 20% and mlshaderl 10%, while all remaining methods report 3%, indicating that the problem remains partially navigable for methods with adaptive memory or restart mechanisms even at this scale.
Comparing these results with the proposed irrigation benchmark, the key observation is that the proposed benchmark achieves comparable or greater difficulty at substantially lower dimensions. The proposed benchmark yields a maximum success rate of 10% (mlshaderl) at D = 30 and universal 3% rates across all ten methods at D = 50 , whereas ELD4 at D = 40 still permits a 63% success rate for acor and Hydrothermal at D = 96 remains partially tractable for two methods. This dimensional efficiency of difficulty is attributable to the proposed benchmark’s simultaneous combination of oscillatory efficiency modulation, recursive state dependence, and multiplicative yield aggregation mechanisms that the CEC 2011 problems present only in partial and domain-specific forms. The method ranking observed across the CEC 2011 suite with jso and mlshaderl consistently leading and ga and gwo consistently trailing is broadly consistent with the ranking observed on the full irrigation benchmark. Notably, lmcmaes and lracmaes, which are competitive on the irrigation benchmark, achieve only 3% success rates across all CEC 2011 problems, with lracmaes exhibiting very large standard deviations on the Hydrothermal instance, suggesting that their advantage is specific to the stateful and oscillatory structure of the proposed benchmark rather than general across real-world problem families. The performance of ten optimizers on four CEC 2011 real-world problems is presented in Table 8.

2.8. Statistical Validation

To confirm that the performance differences observed in Section 2.4 are statistically significant rather than artifacts of run-to-run variability, a nonparametric statistical analysis was conducted on the 30 final objective values obtained by each method on the full benchmark formulation ( A L L ) across all five dimensions. To complement the p-values and quantify the magnitude of performance differences among methods, Kendall’s W coefficient of concordance was computed for each dimension as W = x F 2 n ( k 1 ) , where n = 30 independent runs and k = 10 methods. The resulting values, reported in Table 9, range from 0.636 ( D = 30 ) to 0.939 ( D = 100 ) across the five dimensions, indicating [weak/moderate/strong] concordance among the ten optimizers’ rank orderings. Values of W above 0.7 are conventionally interpreted as strong agreement in the ranking structure, confirming that the performance hierarchy identified by the mean Friedman ranks is not only statistically significant but also practically meaningful in terms of the magnitude of inter-method differences. The Friedman test was applied independently at each dimension to assess whether the ten methods produce significantly different rank distributions. Pairwise comparisons were then performed using Wilcoxon signed-rank tests with Holm correction to control the family-wise error rate. Table 9 reports the mean Friedman ranks for each method at each dimension, together with the overall average rank and the Friedman test p-values.
The Friedman test rejects the null hypothesis of equal method performance at all five dimensions with p-values below 10 28 , confirming that the observed differences are highly significant. The mean ranks identify jso as the best-performing method overall (average rank 2.73), followed by lracmaes (2.91), mlshaderl (3.33), and cmaes (4.39). The methods ga (9.05) and gwo (9.79) consistently occupy the lowest ranks, confirming their poor suitability for this benchmark. Notably, the ranking of acor shifts substantially across dimensions: it achieves its best rank at D = 50 (1.23) but deteriorates to 8.07 at D = 100 , suggesting that its probability density estimation mechanism is highly effective at moderate dimensions but loses precision as the search space grows. Similarly, lracmaes achieves its best rank at D = 70 (1.00) but deteriorates at D = 100 (4.60), reflecting a comparable dimension-dependent behavior. Pairwise Wilcoxon tests with Holm correction confirm that jso and mlshaderl differ significantly from ga and gwo at all dimensions (p < 0.001), while the differences among the top methods (jso, lracmaes, mlshaderl, cmaes, and acor) are significant at most but not all dimensions, reflecting the competitive and complementary nature of their search strategies. Figure 3 visualizes the rank structure across dimensions, and Figure 4 shows the corresponding distributions of final objective values.
Figure 3 confirms visually that jso and lracmaes maintain consistently low ranks across all dimensions, while ga and gwo are stably positioned at ranks 9–10. The intermediate methods (mlshaderl, cmaes, acor, arq, lmcmaes, clpso) exhibit dimension-dependent rank shifts that reflect the varying effectiveness of their search mechanisms as the landscape complexity scales with D.
Figure 4 complements the rank analysis by showing the spread of final objective values. The tight interquartile ranges of jso, lracmaes, and mlshaderl at D = 24 and D = 30 contrast sharply with the wide distributions of ga and gwo, which extend over hundreds of objective-value units. At D = 70 and D = 100 , all distributions widen substantially, but the median values of jso and lracmaes remain consistently below those of the other methods, confirming the rank-based findings.

3. Discussion

Before discussing the experimental findings, it is useful to provide a quantitative estimate of the landscape complexity induced by the benchmark’s primary difficulty mechanisms. The oscillatory efficiency law η i ( x ) contains the term s i n 2 ( f r e q i · x i + φ i ) , which over the domain x i [ 0 , 80 ] completes approximately f r e q i · 80 / π 0.22 · 80 / π 5.6 half-cycles per variable at the default frequency. Each half-cycle generates one local attractor in the efficiency landscape, so the single-variable oscillatory structure alone produces approximately 5–6 local basins per stage. Since the efficiency also depends on the neighbor-coupled load N i through the second sinusoidal term, and since the storage state S i propagates effects from all preceding stages, the number of combinatorially distinct local configurations grows multiplicatively with D. A conservative lower bound on the number of local attractors in the full problem is therefore on the order of 5 D , yielding approximately 5 24 6 × 10 16 at D = 24 and 5 50 9 × 10 34 at D = 50 . While not all of these correspond to true local minima of the composite objective many are eliminated or merged by the smoothness, budget, and deficit penalties this estimate explains why even sophisticated population-based methods with typical population sizes of 100–500 individuals cannot exhaustively cover the basin structure at moderate dimensions. The recursive storage propagation further compounds this difficulty by introducing a sequential dependency chain of length D: a perturbation at stage i can alter the storage states S i + 1 , , S D and thereby shift the effective landscape seen by all subsequent variables. This path dependence means that the effective dimensionality of the problem, in terms of the number of coupled decisions that must be simultaneously correct, is closer to D 2 than to D, consistent with the empirical observation that solver success rates collapse between D = 30 and D = 50 rather than scaling smoothly with dimension.
The most salient observation across all five dimensions of the full benchmark is the consistent superiority of jso, lracmaes, and mlshaderl over the remaining seven methods. This outcome is not incidental. The method jso employs a self-adaptive differential evolution strategy with weighted mutation and crossover operators that balances exploration and exploitation effectively in landscapes with moderate ruggedness. The method mlshaderl relies on a historical archive mechanism that provides implicit memory of previously explored regions and allows the search distribution to contract toward promising basins without premature commitment. The method lracmaes employs a learning rate adaptation mechanism within the CMA-ES framework that adjusts the step-size and covariance updates dynamically, allowing it to maintain effective search precision across the benchmark’s coupled and oscillatory landscape structure. In a landscape characterized by recursive state dependence, repeated local oscillatory structure, and multiplicative coupling between stages, these capabilities are precisely what distinguish reliable convergence from stagnation. By contrast, methods such as ga and gwo, which rely on simpler population recombination or swarm attraction heuristics, consistently fail to maintain sufficient diversity in the presence of the benchmark’s multiple deceptive attractors, as evidenced by their large standard deviations and their stable positioning at Friedman ranks 9–10 across all dimensions. The dimension-scaling behavior observed in the full formulation is consistent with the combinatorial growth of non-separable interactions: the recursive storage propagation links the effective landscape seen by variable x i to decisions made at all preceding stages x 1 , , x i 1 , creating an effective dependency chain whose length grows linearly with D. As D increases, the probability that a randomly initialized trajectory can locate and remain in the correct basin across all stages simultaneously diminishes rapidly, making global search substantially harder than dimension alone would suggest for a separable or smoothly non-separable problem.
The ablation protocol reveals a clear hierarchy of mechanism importance that is both informative and practically useful. The oscillatory efficiency mechanism emerges as the single most consequential source of landscape ruggedness: its removal enables multiple solvers to achieve success rates near or at 100% even at D = 50 , a result that is unattainable in the full formulation at that dimension. This finding has a clear structural interpretation: the oscillatory modulation of η i ( x ) , which depends on both the local decision x i and the neighbor-coupled load N i , introduces a dense lattice of local attractors that are energetically competitive with the global basin. In the absence of this mechanism, the residual landscape is sufficiently smooth to permit gradient-like local improvement strategies to succeed over repeated restarts. The recursive storage propagation and threshold drainage mechanisms occupy the second tier, each contributing meaningful but less dramatic difficulty that primarily affects the intermediate-dimensional regime (D = 30–50). The multiplicative yield aggregation, neighbor coupling, and preferred-window regularization contribute more modestly, with their effects becoming largely indistinguishable from the baseline difficulty at high dimensions. Crucially, no single ablation restores the problem to the triviality of the fully simplified variant, confirming that the benchmark’s difficulty is genuinely multi-mechanism in character.
The proposed benchmark addresses a recognized gap at the intersection of two research communities. Within continuous global optimization, the dominant benchmark suites including the CEC and BBOB families are composed primarily of synthetic analytical functions whose difficulty is controlled through transformation, rotation, and shifting, but which lack the semantic interpretability and state-dependent coupling of physically motivated problems. Within agricultural water management and irrigation scheduling, the available simulation-optimization frameworks are calibrated to specific field conditions and crop models and are therefore neither self-contained nor designed to be algorithmically difficult per se. The present benchmark occupies a complementary position: it is analytically self-contained and fully deterministic, requires no external data, exhibits a rich and verifiable difficulty structure through the ablation protocol, and retains an agronomic interpretation that makes its components meaningful beyond pure algorithmic testing. The comparative evaluation against four classical benchmarks in Section 2.6 confirms quantitatively that the proposed problem is substantially harder than Ackley, Rastrigin, Rosenbrock, and Schwefel at the same dimensions, due to its simultaneous combination of oscillatory ruggedness, non-separable coupling, deceptive global structure, and stateful path dependence features that the classical functions present only in isolation. The deliberate absence of a known closed-form global optimum further distinguishes it from classical synthetic benchmarks, making it a genuine black-box problem for which algorithmic progress can only be measured empirically.
Although the proposed benchmark is not a calibrated digital twin of any specific field or crop system, the experimental findings carry implications that are transferable to the design of optimization-based irrigation decision-support tools. The ablation study demonstrates that temporal state coupling represented here by the recursive storage mechanism and nonlinear efficiency losses are the two features that most severely degrade optimizer performance. In practical irrigation scheduling, these correspond precisely to the soil moisture carryover between irrigation events and to the nonlinear relationship between applied water and effectively delivered water, both of which are well-documented challenges in field-level water management [18,21,23]. The finding that archive-based, distribution-estimation, and learning-rate-adaptive methods (jso, mlshaderl, lracmaes) substantially outperform simpler population-based heuristics (ga, gwo) suggests that optimization tools deployed for real irrigation scheduling should prioritize algorithms with adaptive memory mechanisms capable of handling sequential dependencies, rather than relying on general-purpose metaheuristics without such capabilities. Furthermore, the parameter sensitivity analysis in Section 2.5 shows that the benchmark’s difficulty is primarily controlled by a small number of coefficients, most notably the oscillatory efficiency amplitude. This ranking is confirmed to be robust across dimensions: the cross-dimensional validation at D = 50 preserves the same parameter importance ordering, with the oscillatory efficiency amplitude retaining its dominant position and the yield-sensitivity coefficients collapsing to zero main effect. This means that practitioners wishing to use the benchmark as a testbed for irrigation-oriented optimization tools can tune its difficulty to match the complexity level of their target application without altering the overall problem structure.

4. Conclusions

This article has introduced a new continuous global optimization benchmark derived from the semantics of weather-coupled irrigation scheduling. The benchmark is formulated as a deterministic, box-constrained minimization problem in which the objective function integrates a recursive soil-water storage state, an oscillatory irrigation efficiency law, a multiplicative seasonal yield factor, neighbor-coupled hydraulic penalties, threshold-triggered drainage losses, and a deceptive preferred-window regularization term. The problem is analytically self-contained: all exogenous weather and crop profiles are generated from the normalized stage coordinate, requiring no external datasets. Recommended problem dimensions span from D = 24 to D = 100 , with the transition from moderate to high dimensionality producing a substantial and predictable increase in landscape difficulty.
The experimental study, conducted over 30 independent runs of ten established continuous optimizers across all recommended dimensions and eight problem variants, yields three principal conclusions. First, the full benchmark presents a genuinely multi-mechanism difficulty that no single state-of-the-art solver reliably resolves at D 50 : even the best-performing method, jso (average Friedman rank 2.73), achieves success rates of only 3% at the highest dimensions tested. Second, the ablation study identifies oscillatory efficiency modulation as the dominant source of landscape ruggedness, with recursive storage propagation and threshold drainage as the second tier of difficulty mechanisms, together, these components account for the majority of the difficulty gap between the full and the simplified formulations. Third, the fully simplified variant, in which all embedded difficulty mechanisms are disabled, is solved to optimality by most methods across all dimensions, confirming that the observed difficulty in the full formulation is structural rather than numerical in origin. The statistical validation through Friedman and pairwise Wilcoxon tests confirms these findings with p-values below 10 28 at all dimensions. These results collectively validate the proposed benchmark as a meaningful and scalable tool for studying continuous optimization algorithms on non-separable, stateful, and rugged landscapes with a clear agronomic interpretation.
From a practical standpoint, the proposed benchmark offers the optimization community a ready-to-use, computationally lightweight test problem whose difficulty can be continuously adjusted through the oscillatory efficiency amplitude and the resonance penalty weight, as demonstrated by the sensitivity analysis in Section 2.5 across both D = 30 and D = 50, confirming that this practical tunability is dimensionally robust. Its agronomic semantics also make it suitable as an educational or demonstrative tool for illustrating how competing operational objectives interact in seasonal resource-allocation problems. Several directions for future work are identified. First, the evaluation of gradient-based solvers, surrogate-assisted methods, and reinforcement-learning-based optimizers on the benchmark would broaden the algorithmic coverage beyond the population-based methods considered here. For such gradient-based and surrogate-assisted methods, the benchmark’s negligible per-evaluation cost makes the full dimensional range D 24 , 30 , 50 , 70 , 100 computationally accessible, by contrast, for covariance-based population methods, D = 50 is recommended as the practical default given the O ( D 2 ) scaling of their covariance update mechanisms, as discussed in Section 2.4. Second, a multi-objective extension that separately optimizes water cost and seasonal yield, rather than combining them in a single scalar objective, would enable the study of Pareto-front structure under coupled weather-dependent dynamics. Third, a stochastic variant in which the weather profiles are drawn from parameterized distributions rather than fixed trigonometric functions would allow the investigation of robust and risk-aware optimization strategies. Fourth, a dynamic extension in which decisions are made sequentially with partial observations of the storage state between stages would transform the benchmark into a sequential decision problem amenable to reinforcement learning and model predictive control. Finally, future work could apply established exploratory landscape analysis features such as fitness distance correlation, information content, and gradient homogeneity measures to the full and ablated formulations across all recommended dimensions, providing a data-driven characterization of how each difficulty mechanism shapes the search landscape. Additionally, the orthogonal difficulty axes identified through comparison with logistics-oriented benchmark frameworks [16] specifically, the contrast between the temporal state recursion and multiplicative yield aggregation of the proposed benchmark and the spatial decomposition and multi-objective coordination of truck–UAV path optimization frameworks suggest that a hybrid benchmark family combining mechanisms from both domains represents a productive direction for future benchmark design.

Author Contributions

Conceptualization, A.M.G.; methodology, V.C.; software, V.C.; validation, I.G.T.; formal analysis, I.G.T.; resources, I.G.T.; visualization, D.T.; supervision, D.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Rios, L.M.; Sahinidis, N.V. Derivative-free optimization: A review of algorithms and comparison of software implementations. J. Glob. Optim. 2013, 56, 1247–1293. [Google Scholar] [CrossRef]
  2. Sharma, P.; Raju, S. Metaheuristic optimization algorithms: A comprehensive overview and classification of benchmark test functions. Soft Comput. 2024, 28, 3123–3186. [Google Scholar] [CrossRef]
  3. Mohamed, A.W.; Sallam, K.M.; Agrawal, P.; Hadi, A.A.; Mohamed, A.K. Evaluating the performance of meta-heuristic algorithms on CEC 2021 benchmark problems. Neural Comput. Appl. 2023, 35, 1493–1517. [Google Scholar] [CrossRef]
  4. Naser, M.Z.; Al-Bashiti, M.K.; Tapeh, A.T.G.; Naser, A.; Kodur, V.; Hawileh, R.; Abdalla, J.; Khodadadi, N.; Gandomi, A.H.; Eslamlou, A.D. A Review of Benchmark and Test Functions for Global Optimization Algorithms and Metaheuristics. Wiley Interdiscip. Rev. Comput. Stat. 2025, 17, e70028. [Google Scholar] [CrossRef]
  5. Kudela, J. A critical problem in benchmarking and analysis of evolutionary computation methods. Nat. Mach. Intell. 2022, 4, 1238–1245. [Google Scholar] [CrossRef]
  6. Liang, J.J.; Qu, B.Y.; Suganthan, P.N. Problem Definitions and Evaluation Criteria for the CEC 2014 Special Session and Competition on Single Objective Real-Parameter Numerical Optimization; Technical Report 201311; Computational Intelligence Laboratory, Zhengzhou University: Zhengzhou, China, 2013. [Google Scholar]
  7. Hansen, N.; Finck, S.; Ros, R.; Auger, A. Real-Parameter Black-Box Optimization Benchmarking 2009: Noiseless Functions Definitions; Research Report RR-6829; INRIA: Paris, France, 2009. [Google Scholar]
  8. Awad, N.H.; Ali, M.Z.; Liang, J.J.; Qu, B.Y.; Suganthan, P.N. Problem Definitions and Evaluation Criteria for the CEC 2017 Special Session and Competition on Single Objective Real-Parameter Numerical Optimization; Technical Report; Nanyang Technological University: Singapore, 2016. [Google Scholar]
  9. Varelas, K.; El Hara, O.A.; Brockhoff, D.; Hansen, N.; Nguyen, D.M.; Tušar, T.; Auger, A. Benchmarking large-scale continuous optimizers: The bbob-largescale testbed, a COCO software guide and beyond. Appl. Soft Comput. 2020, 97, 106737. [Google Scholar] [CrossRef]
  10. Jamil, M.; Yang, X.-S. A literature survey of benchmark functions for global optimisation problems. J. Math. Model. Numer. Optim. 2013, 4, 150–194. [Google Scholar] [CrossRef]
  11. Piotrowski, A.P.; Napiorkowski, J.J.; Piotrowska, A.E. Metaheuristics should be tested on large benchmark sets with various numbers of function evaluations. Swarm Evol. Comput. 2024, 90, 101680. [Google Scholar] [CrossRef]
  12. Piotrowski, A.P.; Napiorkowski, J.J.; Piotrowska, A.E. Choice of benchmark optimization problems does matter. Swarm Evol. Comput. 2023, 83, 101378. [Google Scholar] [CrossRef]
  13. Kudela, J.; Matousek, R. New benchmark functions for single-objective optimization based on a zigzag pattern. IEEE Access 2022, 10, 8262–8278. [Google Scholar] [CrossRef]
  14. Das, S.; Suganthan, P.N. Problem Definitions and Evaluation Criteria for CEC 2011 Competition on Testing Evolutionary Algorithms on Real World Optimization Problems; Technical Report; Jadavpur University: Kolkata, India, 2010. [Google Scholar]
  15. Li, Z.; Zhou, Y.; Zhang, S.; Song, J.; Li, X. A large-scale continuous optimization benchmark suite with versatile coupled heterogeneous modules. Swarm Evol. Comput. 2023, 79, 101280. [Google Scholar] [CrossRef]
  16. Guo, J.; Qin, Y.; Wu, C.H.; Chen, Z.-S. Multi-objective optimization of multimodal logistics paths: A decomposition-based approach for coordinating trucks and UAVs within the cyber-physical internet framework. Int. J. Logist. Res. Appl. 2026, 1–29. [Google Scholar] [CrossRef]
  17. Jamal, A.; Linker, R.; Housh, M. Real-time irrigation scheduling based on weather forecasts, field observations, and human-machine interactions. Water Resour. Res. 2023, 59, e2023WR035810. [Google Scholar] [CrossRef]
  18. Allen, R.G.; Pereira, L.S.; Raes, D.; Smith, M. Crop Evapotranspiration: Guidelines for Computing Crop Water Requirements, 2nd rev. ed.; FAO Irrigation and Drainage Paper 56; Food and Agriculture Organization of the United Nations: Rome, Italy, 2025. [Google Scholar] [CrossRef]
  19. Doorenbos, J.; Kassam, A.H. Yield Response to Water; Irrigation and Drainage Paper 33; Food and Agriculture Organization of the United Nations: Rome, Italy, 1979. [Google Scholar] [CrossRef]
  20. Wang, D.; Cai, X. Irrigation scheduling—Role of weather forecasting and farmers’ behavior. J. Water Resour. Plan. Manag. 2009, 135, 364–372. [Google Scholar] [CrossRef]
  21. Linker, R.; Kisekka, I. Model-based simulation-optimization of irrigation scheduling—A field evaluation with processing tomatoes. Smart Agric. Technol. 2023, 4, 100234. [Google Scholar] [CrossRef]
  22. Li, X.; Zhang, J.; Cai, X.; Huo, Z.; Zhang, C. Simulation-optimization based real-time irrigation scheduling: A human-machine interactive method enhanced by data assimilation. Agric. Water Manag. 2023, 276, 108059. [Google Scholar] [CrossRef]
  23. Zhao, Y.; Li, G.; Li, S.; Luo, Y.; Bai, Y. A review on the optimization of irrigation schedules for farmlands based on a simulation-optimization model. Water 2024, 16, 2545. [Google Scholar] [CrossRef]
  24. Brest, J.; Maučec, M.S.; Bošković, B. Single objective real-parameter optimization: Algorithm jSO. In 2017 IEEE Congress on Evolutionary Computation (CEC); IEEE: Piscataway, NJ, USA, 2017; pp. 1311–1318. [Google Scholar] [CrossRef]
  25. Chauhan, D.; Trivedi, A.; Shivani. A multi-operator ensemble LSHADE with restart and local search mechanisms for single-objective optimization. arXiv 2024, arXiv:2409.15994. [Google Scholar]
  26. Liang, J.J.; Qin, A.K.; Suganthan, P.N.; Baskar, S. Comprehensive learning particle swarm optimizer for global optimization of multimodal functions. IEEE Trans. Evol. Comput. 2006, 10, 281–295. [Google Scholar] [CrossRef]
  27. Hansen, N.; Ostermeier, A. Completely derandomized self-adaptation in evolution strategies. Evol. Comput. 2001, 9, 159–195. [Google Scholar] [CrossRef]
  28. Loshchilov, I. A computationally efficient limited memory CMA-ES for large scale optimization. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2014); ACM: New York, NY, USA, 2014; pp. 397–404. [Google Scholar] [CrossRef]
  29. Loshchilov, I. LM-CMA: An alternative to L-BFGS for large-scale black-box optimization. Evol. Comput. 2017, 25, 143–171. [Google Scholar] [CrossRef] [PubMed]
  30. Nomura, M.; Akimoto, Y.; Ono, I. CMA-ES with learning rate adaptation. arXiv 2023, arXiv:2401.15876. [Google Scholar]
  31. Holland, J.H. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence, 2nd ed.; MIT Press: Cambridge, MA, USA, 1992. [Google Scholar] [CrossRef]
  32. Mirjalili, S.; Mirjalili, S.M.; Lewis, A. Grey Wolf Optimizer. Adv. Eng. Softw. 2014, 69, 46–61. [Google Scholar] [CrossRef]
  33. Socha, K.; Dorigo, M. Ant colony optimization for continuous domains. Eur. J. Oper. Res. 2008, 185, 1155–1173. [Google Scholar] [CrossRef]
  34. Charilogis, V.; Tsoulos, I.G.; Gianni, A.M.; Tsalikakis, D. ARQ: A cohesive optimization design for stable performance on noisy landscapes. Appl. Sci. 2025, 15, 12180. [Google Scholar] [CrossRef]
  35. Ackley, D.H. A Connectionist Machine for Genetic Hillclimbing; Kluwer Academic Publishers: Dordrecht, The Netherlands, 1987. [Google Scholar]
  36. Rastrigin, L.A. Systems of Extremal Control; Nauka: Moscow, Russia, 1974. (In Russian) [Google Scholar]
  37. Rosenbrock, H.H. An automatic method for finding the greatest or least value of a function. Comput. J. 1960, 3, 175–184. [Google Scholar] [CrossRef]
  38. Schwefel, H.-P. Numerical Optimization of Computer Models; John Wiley & Sons: Hoboken, NJ, USA, 1981. [Google Scholar]
Figure 1. Mean best objective value as a function of each primary difficulty-controlling parameter at D = 30 (error bars denote ± 1 standard deviation over 30 runs).
Figure 1. Mean best objective value as a function of each primary difficulty-controlling parameter at D = 30 (error bars denote ± 1 standard deviation over 30 runs).
Mathematics 14 02018 g001
Figure 2. Mean best objective value as a function of each primary difficulty-controlling parameter at D = 50 (error bars denote ± 1 standard deviation over 30 runs).
Figure 2. Mean best objective value as a function of each primary difficulty-controlling parameter at D = 50 (error bars denote ± 1 standard deviation over 30 runs).
Mathematics 14 02018 g002
Figure 3. Mean Friedman ranks per dimension for the full benchmark formulation (lower is better).
Figure 3. Mean Friedman ranks per dimension for the full benchmark formulation (lower is better).
Mathematics 14 02018 g003
Figure 4. Distribution of final objective values per dimension for the full benchmark formulation (30 independent runs).
Figure 4. Distribution of final objective values per dimension for the full benchmark formulation (30 independent runs).
Mathematics 14 02018 g004
Table 1. Symbols, parameters, and auxiliary used in the mathematical formulation of the proposed benchmark.
Table 1. Symbols, parameters, and auxiliary used in the mathematical formulation of the proposed benchmark.
SymbolInterpretation/Coefficients/Comments
DRecommended difficult cases: D {24, 30, 50, 70, 100}
x = ( x 1 , , x D ) T Each component is expressed in mm
x i Box constraint: 0 <= x i <= 80
Ω Hyper-rectangular search domain
Δ t Stage duration (days) for a 120-day horizon
s i Normalized stage coordinate
E T o i Synthetic reference evapotranspiration profile
K c i Crop coefficient profile
r a i n i Rainfall at stage i
K y i Yield sensitivity factor to water stress
h e a t i Synthetic heat-intensity index
w i n d i Synthetic wind index
t a r i f f i Unit water cost at stage i
η b a s e , i Base irrigation efficiency before nonlinear shaping
ρ i Storage recharge coefficient
S i m a x Storage capacity at stage i
f r e q i Oscillation frequency for ruggedness terms
φ i Phase used in oscillatory penalties and efficiency laws
E T c i Crop evapotranspiration demand at stage i
x 0 , x D + 1 Used in the definition of N i
N i Local hydraulic load with coupling of neighboring decisions
η i ( x ) Clipped to [0.45, 1.00]. Main oscillatory ruggedness mechanism
I i e f f Delivered irrigation after efficiency losses
S 1 Initial synthetic soil-water storage
S i Introduces path dependence
A i Total available water at stage i
d e f i c i t i Unmet water demand
s u r p l u s i Excess water beyond demand
d e e p i Deep drainage loss with threshold behavior
r e c h a r g e i Storage recharge after drainage
e v a p l o s s i Storage loss due to heat and wind
S i + 1 Next storage state with clipping to admissible bounds
c l i p ( z , l , u ) Used in η i ( x ) , S i + 1 , and y i
r i Relative degree of water-demand satisfaction
ε Prevents division by zero in r i
s t r e s s i Water stress at stage i
y i Multiplicative stage contribution to relative seasonal yield
Y r e l Relative seasonal yield factor
f ( x ) Full deterministic minimization formulation
C w a t e r Direct water-use cost
C p u m p Pumping cost/penalty with oscillatory modulation
s h o c k i Weather-dependent amplifier of deficit penalties
P d e f Quadratic penalty for water deficit
P e x c Penalty for excess water and deep drainage
P s m o o t h Temporal roughness penalty between consecutive stages
P r e s Local resonance/ruggedness penalty
P i n t Overloading penalty for neighboring stages and bilinear interaction
μ i Center of the deceptive preferred irrigation window
P w i n Mild penalty for deviation from the preferred window with oscillatory distortion
P t e r m Terminal storage regularization
B D Soft seasonal water-use target
P b u d g e t Penalty for deviation from the seasonal budget target
P p e a k Penalty for excessively high peak hydraulic loads
Table 2. Abbreviations of the full benchmark and its ablation-based variants used in the experiments.
Table 2. Abbreviations of the full benchmark and its ablation-based variants used in the experiments.
AbbreviationVariant Name
A L L Full problem formulation
A L L O F F Simplified formulation with all difficulty mechanisms disabled
N O R S Formulation without recursive storage
N O O E Formulation without oscillatory efficiency
N O T L Formulation without threshold losses
N O M Y Formulation without multiplicative yield aggregation
N O N C Formulation without neighbor coupling
N O P W Formulation without preferred-window regularization
Table 3. Experimental performance comparison of ten continuous optimizers on the proposed benchmark under full and ablated formulations for D { 24 , 30 , 50 , 70 , 100 } .
Table 3. Experimental performance comparison of ten continuous optimizers on the proposed benchmark under full and ablated formulations for D { 24 , 30 , 50 , 70 , 100 } .
Full problem formulation
DIM = 24DIM = 30DIM = 50DIM = 70DIM = 100
Method Value Mean Rate% SD Value Mean Rate% SD Value Mean Rate% SD Value Mean Rate% SD Value Mean Rate% SD
arq1487.7311491.58572.4922736.3552769.033319.8806802.8976836.089321.40011,274.48011,388.858355.89318,376.38918,570.7213125.571
clpso1493.7401496.36731.3172744.6642750.09333.8036861.1436874.24938.22411,479.41311,527.464321.96118,845.59318,938.996339.085
cmaes1487.7311493.28033.1632739.4602778.598326.8586801.3286830.234315.76511,256.40111,298.304330.74418,161.56818,269.714357.804
ga1501.9441553.715340.3622764.4242837.572347.0756940.3187265.9213212.78111,708.87112,329.8053297.79119,259.61720,143.5513398.262
gwo1490.1021594.579351.8582774.2662924.252371.8987193.5467538.4963213.07112,414.92613,117.6253380.81720,123.78121,736.5173712.255
jso1487.7311488.914402.0822737.7982744.37935.7916788.8506806.87239.40911,214.24611,299.755340.04518,203.71418,303.518366.851
lmcmaes1489.7991500.00636.182978422743.8552783.549325.4056808.9916855.765337.65711,296.20411,414.791380.179918,260.43618,496.653843107.633
lracmaes1487.7311488.99831.3774996172740.7012757.368318.8736784.6096788.55334.28711,195.22211,212.304314.76218,375.34518,477.49095351.7904
mlshaderl1487.7311489.78832.1792736.3552749.4781013.0456787.1796815.552318.72711,246.26111,295.307338.53318,213.27718,326.946358.189
acor1487.8051488.88332.0072737.7982772.389315.0156784.1626785.25931.44611,248.03611,325.857352.41519,471.03119,575.079360.994
Simplified formulation with all difficulty mechanisms disabled
DIM = 24DIM = 30DIM = 50DIM = 70DIM = 100
Method ValueMeanRate%SDValueMeanRate%SDValueMeanRate%SDValueMeanRate%SDValueMeanRate%SD
arq458.713458.7131000.000763.986763.9861000.0001930.0281930.0281000.0003175.2463175.2461000.0005087.7975087.7971000.000
clpso461.250463.29430.962774.556780.13632.6232000.1682018.80938.6903332.6013369.541316.6475435.4555515.635332.799
cmaes458.713458.7131000.000763.986763.9861000.0001930.0281930.0281000.0003175.2463175.2461000.0005087.7975087.7971000.000
ga461.583474.737315.641769.891810.683329.1951982.4242179.4203116.1873513.4453827.9653163.9395894.6316270.9913188.948
gwo463.826527.990372.443780.070878.948390.2082182.4782525.5583224.2373708.0274349.7383367.2476425.0977260.8173470.806
jso458.713458.7131000.000763.986763.9861000.0001930.0281930.0281000.0003175.2463175.2461000.0005087.7975087.7971000.000
lmcmaes458.713458.71330.149763.986763.986930.1681930.0281930.0281000.0003175.2463175.2461000.0005087.7975087.7971000.000
lracmaes458.713458.7131000.000763.986763.9861000.0001930.0281930.0281000.0003175.2463175.24830.0015089.1735090.93631.266
mlshaderl458.713458.7131000.000763.986763.9861000.0001930.0281930.0281000.0003175.2463175.2461000.0005087.7975087.797700.000
acor458.713458.7131000.000763.986763.9861000.0001930.0311930.03830.0053185.7073193.24335.7445426.2905611.325398.324
Formulation without recursive storage
DIM = 24DIM = 30DIM = 50DIM = 70DIM = 100
Method ValueMeanRate%SDValueMeanRate%SDValueMeanRate%SDValueMeanRate%SDValueMeanRate%SD
arq975.513977.834201.4291414.8781423.30535.2313108.6163131.508320.1104964.4625003.841328.5807848.4447909.964350.480
clpso976.112976.91130.4471417.8891423.05932.3783151.9163164.39537.6605058.4295103.005315.1518111.0618161.097322.449
cmaes975.513989.078311.6651418.3541431.02337.9303104.8883116.367311.2134957.0374967.38336.7137817.9077827.99435.893
ga977.2231016.610342.6351436.0221487.685339.9393221.2583477.8333136.1395563.3915943.0783217.9908787.9519577.7563417.167
gwo1045.3981163.4943125.1431436.3401682.5073141.2973420.2863863.5583280.7545760.3956575.5223346.2719619.27010,945.8893592.577
jso975.513976.456631.3331414.2341417.52373.4083100.7903111.58736.9734961.9284977.360311.9987822.8927857.514325.081
lmcmaes977.281993.229315.7901416.5061433.44138.4043106.6833131.718313.5794965.2334995.483322.7377821.4387868.105339.434
lracmaes975.513978.45531.5651414.3591416.97673.2523100.7903101.62371.0344955.6744956.64430.7267828.5687838.67135.042
mlshaderl975.513976.549631.5481414.3591418.43874.1193104.1223116.82339.6154967.6154990.841313.9787843.3507885.378334.655
acor975.513977.792201.2671414.3591417.91173.2273100.7903101.06070.3774983.0085000.284310.8738282.6748454.449366.031
Formulation without oscillatory efficiency
DIM = 24DIM = 30DIM = 50DIM = 70DIM = 100
Method ValueMeanRate%SDValueMeanRate%SDValueMeanRate%SDValueMeanRate%SDValueMeanRate%SD
arq1632.9971633.232100.1073073.6153073.6151000.0007930.7177932.701834.51613,068.96213,072.060102.13520,894.09920,896.651132.003
clpso1636.9521638.45530.7993076.8823080.08132.3397949.2037954.34932.98813,144.76213,186.460317.62521,188.86521,229.333322.125
cmaes1632.9971633.952232.2843073.6153076.3819715.1547930.7177931.114972.17313,068.96213,070.019371.23420,894.09920,894.935531.218
ga1636.2301672.428334.2243083.7833174.291356.5418044.7128297.3573166.29413,470.48713,990.0803238.31022,016.76522,889.1443552.595
gwo1650.3911722.706349.9233083.1163258.022391.6558137.0978474.0603219.82613,723.01914,474.1743425.30322,492.51123,852.6453666.196
jso1632.9971633.214170.1213073.6153073.6151000.0007930.7177930.7171000.00013,068.96213,070.861231.85820,894.09920,895.528301.565
lmcmaes1633.4861635.08532.9313073.6423073.67830.0227930.7177931.53833.17613,068.96213,071.97432.24620,894.89420,897.042132.539
lracmaes1633.2791633.28030.0003073.6153073.61630.0017930.7177930.717730.00013,068.96513,069.13830.44920,894.14320,894.53170.452
mlshaderl1632.9971633.22330.1153073.6153073.6151000.0007930.7177930.7171000.00013,068.96213,071.170131.95620,894.09920,896.439172.357
acor1633.0111633.29430.0913073.6153073.6151000.0007930.7177935.49975.56313,071.13813,072.98231.08121,084.69621,168.556334.404
Formulation without threshold losses
DIM = 24DIM = 30DIM = 50DIM = 70DIM = 100
Method ValueMeanRate%SDValueMeanRate%SDValueMeanRate%SDValueMeanRate%SDValueMeanRate%SD
arq1864.8931894.3553106.4643653.1113656.333303.9987022.8027023.665271.42310,393.91810,397.02934.85915,486.70215,498.236315.799
clpso1880.9141895.470310.4593656.1933659.01831.5847052.3017062.31334.97010,477.94810,501.327310.64715,727.35115,771.184317.154
cmaes1868.5352253.7923283.5913653.1113657.221305.0157022.8027023.291270.76610,392.89110,394.80931.23015,486.12715,489.60333.311
ga1890.3751993.5303121.5733661.2093696.323328.2277109.4547323.8173141.77910,679.24411,163.0523212.32816,111.07716,761.0693310.876
gwo1876.2751982.494358.0093681.6323798.303389.4427293.7757676.9483192.84711,156.87311,808.1423332.48217,402.24718,240.7313392.525
jso1864.8931925.83750180.6553653.1113653.173970.3437022.8027023.058470.71210,392.74210,395.08732.53015,486.13215,490.87136.198
lmcmaes1874.3832023.7677226.4163653.1113671.362725.2627022.8027023.812331.43110,393.27610,396.80234.94615,486.58915,489.94134.815
lracmaes1866.0822000.3597229.8473653.1113653.347230.9157022.8027022.870130.14210,392.90510,393.29030.10915,492.19315,494.64231.548
mlshaderl1864.8931906.28217149.2763653.1113654.137732.5107022.8027023.880371.57510,392.74210,398.36337.20315,486.67815,491.21235.269
acor1864.9821869.31135.8593652.1583653.61531.6797022.8027022.81530.05510,416.62710,425.20636.96315,912.93016,214.7653102.215
Formulation without multiplicative yield aggregation
ValueMeanRate%SDValue
Method ValueMeanRate%SDValueMeanRate%SDValueMeanRate%SDValueMeanRate%SDValueMeanRate%SD
arq1472.3161476.02772.9422727.2452752.057321.0776799.3396841.315320.54911,246.03611,376.543370.67818,392.62218,578.7723123.581
clpso1476.1531478.67231.4612734.4572740.80933.9376852.6396874.53539.37211,467.06911,525.900326.10818,845.59318,938.996339.085
cmaes1472.4651477.85173.4932731.5022756.669318.6726793.8576826.933317.53911,256.39511,298.744330.58018,161.56818,272.164357.015
ga1481.6561529.059330.6302765.4792823.859349.4636968.5097322.5913207.68211,853.47312,437.9723322.68519,259.61720,156.8903430.805
gwo1508.0751582.222351.7142820.7822927.396360.7227192.2547546.9913222.92612,414.90013,120.9553368.88620,123.77821,736.5133712.254
jso1472.3161473.443371.9942727.6702735.05935.9296785.9756806.030310.30811,217.48011,301.822335.03318,203.71418,304.315366.759
lmcmaes1477.3561482.18334.6752742.4272769.860321.9046811.7316865.074342.87511,296.19811,398.755358.90518,260.43618,490.377397.096
lracmaes1473.0341474.08431.3602735.0212742.543315.3566784.5216789.67674.31211,198.12611,218.216318.05718,385.78018,472.932351.460
mlshaderl1472.3161474.50272.6052727.3932735.40079.2676785.9756811.874313.39511,246.25311,301.409340.51018,213.27718,329.547364.529
acor1472.3671472.81031.0242727.3932743.954720.0276784.0756785.63231.74511,245.26111,307.510349.87719,408.60819,568.897367.593
Formulation without neighbor coupling
DIM = 24DIM = 30DIM = 50DIM = 70DIM = 100
Method ValueMeanRate%SDValueMeanRate%SDValueMeanRate%SDValueMeanRate%SDValueMeanRate%SD
arq1173.9041177.955102.6392323.9372335.59576.9655917.8805968.044339.5329995.11810,154.059384.60816,535.84316,766.2973137.910
clpso1177.7191180.69731.7002350.1212359.52234.1946009.0046035.210317.85610,243.28210,308.106337.40117,051.05317,186.627353.299
cmaes1173.9041182.86737.6822328.8702344.959312.8205891.7725951.023348.4159882.74210,000.354368.79416,269.82116,355.030370.085
ga1197.7481258.826332.6012348.7442449.358361.6136166.7816473.3533197.42110,689.05611,255.2093323.48217,921.57118,539.3213425.845
gwo1199.1061284.663387.9232339.4662507.011395.6756111.3306645.4543310.17510,652.07511,775.3763449.28818,503.22119,297.7633467.746
jso1173.7261176.723101.4002323.9372338.01135.6635894.3845930.383319.7959908.71110,034.797359.60416,288.46016,462.314392.379
lmcmaes1179.0391203.696332.8642337.2112359.571320.3075929.1786010.913359.61510,110.68910,238.065388.52616,506.78916,674.8593118.819
lracmaes1176.1571178.10831.3832327.0422336.72034.0355865.5175882.21637.7519855.6289886.397328.71016,516.61116,656.497356.740
mlshaderl1173.7261176.54732.1262323.9372338.69336.2505911.1725945.892325.48810,000.43110,063.895346.76516,336.72116,543.8233100.137
acor1173.7261175.56431.4712323.9372331.09475.6415862.4015870.09136.5259838.4119911.777364.05217,239.38617,515.6063115.561
Formulation without preferred-window regularization
DIM = 24DIM = 30DIM = 50DIM = 70DIM = 100
MethodValueMeanRate%SDValueMeanRate%SDValueMeanRate%SDValueMeanRate%SDValueMeanRate%SD
arq1395.7601400.657173.7902576.7062591.275711.7976700.9006742.586325.34611,153.78311,257.313351.34518,238.93818,436.9683149.751
clpso1402.7061408.34331.8592588.8972594.23333.5126766.6266781.923310.34611,357.31811,413.566329.91918,710.66418,790.975338.622
cmaes1394.8981404.82735.5922576.7062613.468334.9716704.3286730.438319.31911,112.11311,174.732341.82518,000.78818,106.632354.223
ga1404.8061469.491344.8182598.5722686.815350.8646879.9867120.5413162.35211,634.05512,147.4523216.13919,295.97719,781.4803307.760
gwo1414.2241504.888347.8592623.0852764.1943101.9486888.5317368.7633213.57211,682.88812,561.6063361.48019,952.19721,071.6083530.519
jso1394.8981399.67173.5632575.8072582.159177.6466690.2076708.794311.40111,092.58011,170.234345.78618,065.43118,168.921362.790
lmcmaes1403.4511413.73437.5382581.4842630.903337.4026719.2896756.900330.68711,214.75911,303.540381.48618,077.50718,400.4343118.964
lracmaes1394.8991400.82333.8552575.8072578.67872.9236682.1076686.52835.21311,068.46211,091.096314.27118,111.88918,305.046373.817
mlshaderl1394.8981400.33933.3912575.8072585.873710.0196694.1446715.681313.40511,122.67011,178.054338.88618,065.78118,169.767375.379
acor1395.0361400.98033.8452575.8072582.541322.3496681.2946683.41132.14311,118.88411,214.966348.95719,183.51119,416.269398.539
Table 4. Mean wall-clock time per run (seconds) for each optimizer across all benchmark dimensions on the reference hardware.
Table 4. Mean wall-clock time per run (seconds) for each optimizer across all benchmark dimensions on the reference hardware.
DIMarqclpsocmaesgagwojsolmcmaeslracmaesmlshaderlacor
240.2710.2352.530.2550.2470.2240.2664.8660.2930.318
300.3230.2915.1280.3110.3010.270.3227.1650.3380.389
500.5040.527.4050.5110.5010.4270.50759.6770.4990.662
700.6870.723101.5660.8070.810.6680.802191.6450.7561.164
1001.1051.198380.211.0591.0280.8471.153800.0740.9151.554
Table 5. Sensitivity of the mean best objective value to perturbations of the six primary difficulty-controlling parameters at D = 30 using mlshaderl over 30 independent runs.
Table 5. Sensitivity of the mean best objective value to perturbations of the six primary difficulty-controlling parameters at D = 30 using mlshaderl over 30 independent runs.
Oscillatory efficiency amplitude
ValueMean best_fSDMin best_fMax best_fMain effectRankN (runs)
02899.342510.1504842891.71522947.2273397.26529630
0.112896.787718.2104362881.5352937.5378397.26529530
0.162859.90136.78491712851.52742875.322397.26529430
0.222749.47813.0454412736.35532793.9528397.26529330
0.282613.20216.57281922608.76532636.5564397.26529230
0.332502.07735.87549312494.452516.2706397.26529130
↓ Summarymin = 2502.0773|max = 2899.3425
Resonance penalty weight
ValueMean best_fSDMin best_fMax best_fMain effectRankN (runs)
52636.936815.9488332624.08352683.8309203.05285130
62673.920710.6399152663.40552715.9101203.05285230
72714.984613.0838262701.12652746.4207203.05285330
82749.47813.0454412736.35532793.9528203.05285430
92781.657313.2012152770.35072815.2609203.05285530
102808.25238.3138452801.08912841.2095203.05285630
112839.989711.9736032828.80222867.234203.05285730
↓ Summarymin = 2636.9368|max = 2839.9897
Oscillation frequency base
ValueMean best_fSDMin best_fMax best_fMain effectRankN (runs)
0.192760.462915.9162622747.36012808.518918.191715630
0.22761.097916.1945022745.68862802.841818.191715730
0.212757.662815.432442745.04042794.833518.191715530
0.222749.47813.0454412736.35532793.952818.191715430
0.232742.906213.8335842733.05752788.58818.191715130
0.242743.862318.2171822731.97662790.79918.191715230
0.252748.20918.8332782734.65322801.482218.191715330
↓ Summarymin = 2742.9062|max = 2761.0979
Oscillation frequency amplitude
ValueMean best_fSDMin best_fMax best_fMain effectRankN (runs)
0.062756.6558.11797222748.78632783.133214.806983630
0.072756.356910.8911262741.86552799.684614.806983530
0.082757.732217.1467782737.18722787.721214.806983730
0.092749.47813.0454412736.35532793.952814.806983430
0.12742.925211.0151272734.54712779.044914.806983130
0.112745.979117.0607852734.16952802.874814.806983330
0.122745.580414.3562822735.58722790.036214.806983230
↓ Summarymin = 2742.9252|max=2757.7322
Yield-sensitivity base
ValueMean best_fSDMin best_fMax best_fMain effectRankN (runs)
0.752750.636515.6948812736.35532783.32993.833925430
0.82752.004316.5937672736.35532793.95283.833925530
0.852749.47813.0454412736.35532793.95283.833925230
0.92748.170411.997982736.35532801.59533.833925130
0.952750.289514.2549682736.35532790.08213.833925330
↓ Summarymin = 2748.1704|max = 2752.0043
Yield-sensitivity amplitude
ValueMean best_fSDMin best_fMax best_fMain effectRankN (runs)
0.452753.345516.5836182736.35532801.77165.04879530
0.52749.48517.6120282736.35532795.34575.04879330
0.552749.47813.0454412736.35532793.95285.04879230
0.62748.296713.8254792736.35532801.59535.04879130
0.652752.3816.7280422736.35532795.34575.04879430
↓ Summarymin = 2748.2967|max = 2753.3455
Table 6. Sensitivity of the mean best objective value to perturbations of the six primary difficulty-controlling parameters at D = 50 using mlshaderl over 30 independent runs.
Table 6. Sensitivity of the mean best objective value to perturbations of the six primary difficulty-controlling parameters at D = 50 using mlshaderl over 30 independent runs.
Oscillatory efficiency amplitude
ValueMean best_fSDMin best_fMax best_fMain effectRankN (runs)
07331.124515.4124237306.43247363.4434962.89456630
0.117211.09093.17175127204.85427218.6255962.89456530
0.167044.774517.0577017030.32827130.8727962.89456430
0.226815.552418.7268966787.17936882.7953962.89456330
0.286567.392813.604316544.0416596.8341962.89456230
0.336368.229921.1458346334.49586422.4896962.89456130
↓ Summarymin = 6368.2299|max = 7331.1245
Resonance penalty weight
ValueMean best_fSDMin best_fMax best_fMain effectRankN (runs)
56623.883912.9067966606.12816656.1921356.97433130
66688.644310.325786673.5056710.7771356.97433230
76748.618512.9651956729.4976777.9409356.97433330
86815.552418.7268966787.17936882.7953356.97433430
96872.830118.808086848.04486948.7902356.97433530
106927.607213.0151846912.22616976.0468356.97433630
116980.85838.14260396962.37357000.6067356.97433730
↓ Summarymin = 6623.8839|max = 6980.8583
Oscillation frequency base
ValueMean best_fSDMin best_fMax best_fMain effectRankN (runs)
0.196827.233514.8679456802.17726858.91511.681103730
0.26822.728313.5827896801.77556853.295111.681103530
0.216816.007619.5151946782.30746894.46311.681103230
0.226815.552418.7268966787.17936882.795311.681103130
0.236818.562511.4950796800.95346843.962911.681103330
0.246825.463817.3857116795.10236862.37611.681103630
0.256820.718816.77576789.43316862.834811.681103430
↓ Summarymin = 6815.5524|max = 6827.2335
Oscillation frequency amplitude
ValueMean best_fSDMin best_fMax best_fMain effectRankN (runs)
0.066824.086617.809866805.28466871.175425.08246630
0.076824.675614.2071666801.026858.473125.08246730
0.086812.516817.5580476777.86776838.743725.08246530
0.096806.324616.9645696780.37536833.522825.08246430
0.16804.04311.5707016774.01246836.846225.08246330
0.116803.211819.9269396777.80176864.015525.08246230
0.126799.593114.2652526777.89086842.145525.08246130
↓ Summarymin = 6799.5931|max = 6824.6756
Yield-sensitivity base
ValueMean best_fSDMin best_fMax best_fMain effectRankN (runs)
0.756815.552418.7268966787.17936882.79530130
0.86815.552418.7268966787.17936882.79530230
0.856815.552418.7268966787.17936882.79530330
0.96815.552418.7268966787.17936882.79530430
0.956815.552418.7268966787.17936882.79530530
↓ Summarymin = 6815.5524|max = 6815.5524
Yield-sensitivity amplitude
ValueMean best_fSDMin best_fMax best_fMain effectRankN (runs)
0.456815.552418.7268966787.17936882.79530130
0.56815.552418.7268966787.17936882.79530230
0.556815.552418.7268966787.17936882.79530330
0.66815.552418.7268966787.17936882.79530430
0.656815.552418.7268966787.17936882.79530530
↓ Summarymin = 6815.5524|max = 6815.5524
Table 7. Comparative performance of nine optimizers on four classical benchmarks and the proposed irrigation benchmark at D = 30 and D = 50 (30 independent runs).
Table 7. Comparative performance of nine optimizers on four classical benchmarks and the proposed irrigation benchmark at D = 30 and D = 50 (30 independent runs).
Ackley (multimodal, nearly separable)
MethodDIM = 30DIM = 50
ValueMeanRate%SDValueMeanRate%SD
arq0.0000.0001000.0000.0000.866370.698
clpso1.8492.23830.1604.4664.88730.230
cmaes0.0000.0001000.0000.0000.0001000.000
ga1.8095.31232.8876.10511.29033.083
jso0.0000.0001000.0000.0000.0001000.000
lmcmaes0.0000.0001000.0000.0000.0001000.000
lracmaes0.0000.0001000.0000.0000.0001000.000
mlshaderl0.0000.0001000.0000.0000.0001000.000
acor0.0000.0001000.0000.0010.00230.000
Rastrigin (multimodal, separable)
MethodDIM = 30DIM = 50
ValueMeanRate%SDValueMeanRate%SD
arq0.0000.431700.7700.0002.753101.766
clpso45.56255.44535.967163.202179.14038.682
cmaes0.0000.0001000.0000.0000.0001000.000
ga34.60375.510320.113125.429211.418332.690
jso0.0000.199971.0900.0000.03330.182
lmcmaes0.0000.0001000.0000.0000.0001000.000
lracmaes0.0000.0001000.0000.0000.0001000.000
mlshaderl0.0003.05131.7515.97011.47532.965
acor173.137199.811312.340355.488436.769321.310
Rosenbrock (unimodal, non-separable)
MethodDIM = 30DIM = 50
ValueMeanRate%SDValueMeanRate%SD
arq0.0000.0001000.0000.0682.23732.012
clpso43.54166.203312.374137.479182.123319.021
cmaes0.0000.532871.3780.0000.266931.011
ga1.4039.352313.54840.653121.036346.509
jso0.0000.0001000.0000.0552.46032.045
lmcmaes0.0000.0001000.0008.81211.74431.976
lracmaes18.97620.07630.65344.48945.37430.413
mlshaderl0.0000.000900.0003.6778.27532.603
acor22.94923.48630.16944.70346.25730.366
Schwefel (deceptive global structure)
MethodDIM = 30DIM = 50
ValueMeanRate%SDValueMeanRate%SD
arq236.877911.1633422.3932260.9483257.1927554.197
clpso135.272472.7693140.7282993.3714204.9923457.860
cmaes3217.7354922.8213649.3046300.4468278.0183937.601
ga0.08225.188348.552126.888347.7953154.861
jso0.000467.8337331.0481618.6633342.07831600.885
lmcmaes4560.0026102.9503856.2727927.55110,654.03731190.889
lracmaes8443.6098752.9573196.74014,879.07315,977.3973387.892
mlshaderl0.00086.90640102.922118.439609.52110398.185
acor7541.2367958.5093183.46614,020.95814,871.2903329.079
Proposed benchmark (full formulation, A L L )
MethodDIM = 30DIM = 50
ValueMeanRate%SDValueMeanRate%SD
arq2736.3552769.033319.8806802.8976836.089321.400
clpso2744.6642750.09333.8036861.1436874.24938.224
cmaes2739.4602778.598326.8586801.3286830.234315.765
ga2764.4242837.572347.0756940.3187265.9213212.781
jso2737.7982744.37935.7916788.8506806.87239.409
lmcmaes2743.8552783.549325.40446808.9916855.765337.657
lracmaes2740.7012757.367318.8736784.6096788.55334.2875
mlshaderl2736.3562749.4781013.0456787.1796815.552318.727
acor2737.7982772.389315.0156784.1626785.25931.446
Table 8. Performance of ten optimizers on four CEC 2011 real-world problems (30 independent runs).
Table 8. Performance of ten optimizers on four CEC 2011 real-world problems (30 independent runs).
Circular Antenna Array Design ( D = 12 )Economic Load Dispatch 3 ( D = 15 )
MethodValueMeanRate%SDValueMeanRate%SD
arq0.0068100.00681560 1.14 × 10 5 32,367.57532,539.30320135.295
clpso0.0473780.09467530.02330732,526.15132,618.692349.757
cmaes0.0068100.124647100.09474432,675.52532,871.6153111.990
ga0.0076220.19256430.07316432,638.76233,004.3043184.909
gwo0.0069600.04889030.06417132,393.64232,649.6553151.395
jso0.0068100.0068101000.00032,458.94632,553.998340.849
lmcmaes0.2340.34430.05032,836.20733,103.9763201.060
lracmaes0.0070.07830.09332,513.97532,637.164367.684
mlshaderl0.0068100.0068101000.00032,367.57532,408.0654357.814
acor0.0068110.0068243 1.27 × 10 5 32,367.57532,479.01230108.299
Economic Load Dispatch 4 ( D = 40 )Hydrothermal Generation Scheduling ( D = 96 )
MethodValueMeanRate%SDValueMeanRate%SD
arq121,093.444121,262.9197177.291141,655.841141,658.464204.879
clpso121,918.009122,097.776388.465150,084.620154,166.42432425.074
cmaes121,967.550123,220.4533625.525141,693.746141,742.229321.558
ga122,541.381124,917.34931161.716242,383.761295,667.986329,379.013
gwo121,514.169122,193.8973383.671141,946.963142,025.814340.302
jso121,151.991121,390.2733194.400141,661.644141,675.409310.117
lmcmaes124,344.754125,839.1743877.080266,778.354351,222.664356,403.637
lracmaes122,893.260124,873.5193938.523265,437.5571,479,740.4933954,019.104
mlshaderl121,078.445121,168.270363.379141,655.841141,658.216102.887
acor121,211.785121,221.3086330.211142,241.225142,335.115356.664
Table 9. Mean Friedman ranks of the ten optimizers on the full benchmark formulation across all dimensions (30 independent runs, lower rank indicates better performance).
Table 9. Mean Friedman ranks of the ten optimizers on the full benchmark formulation across all dimensions (30 independent runs, lower rank indicates better performance).
MethodD = 24D = 30D = 50D = 70D = 100Avg
jso2.232.153.53.632.132.73
lracmaes3.133.981.8314.62.91
mlshaderl3.333.174.33.372.473.33
cmaes5.475.95.43.571.634.39
acor2.975.61.234.238.074.42
arq4.435.635.765.45.43
lmcmaes7.576.576.376.34.86.32
clpso7.13.537.677.96.976.63
ga9.28.89.29.078.979.05
gwo9.579.679.89.939.979.79
p-value 2.81 × 10 41 2.73 × 10 32 1.32 × 10 47 3.79 × 10 46 1.69 × 10 49
Kendall’s W0.7950.6360.9060.8810.939
W interpretationStrongModerateStrongStrongStrong
SignificantYesYesYesYesYes
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Charilogis, V.; Tsoulos, I.G.; Gianni, A.M.; Tsalikakis, D. A Novel Weather-Aware Irrigation Scheduling Benchmark for Continuous Global Optimization. Mathematics 2026, 14, 2018. https://doi.org/10.3390/math14112018

AMA Style

Charilogis V, Tsoulos IG, Gianni AM, Tsalikakis D. A Novel Weather-Aware Irrigation Scheduling Benchmark for Continuous Global Optimization. Mathematics. 2026; 14(11):2018. https://doi.org/10.3390/math14112018

Chicago/Turabian Style

Charilogis, Vasileios, Ioannis G. Tsoulos, Anna Maria Gianni, and Dimitrios Tsalikakis. 2026. "A Novel Weather-Aware Irrigation Scheduling Benchmark for Continuous Global Optimization" Mathematics 14, no. 11: 2018. https://doi.org/10.3390/math14112018

APA Style

Charilogis, V., Tsoulos, I. G., Gianni, A. M., & Tsalikakis, D. (2026). A Novel Weather-Aware Irrigation Scheduling Benchmark for Continuous Global Optimization. Mathematics, 14(11), 2018. https://doi.org/10.3390/math14112018

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop