A Novel Weather-Aware Irrigation Scheduling Benchmark for Continuous Global Optimization

Charilogis, Vasileios; Tsoulos, Ioannis G.; Gianni, Anna Maria; Tsalikakis, Dimitrios

doi:10.3390/math14112018

¹ Department of Informatics and Telecommunications, University of Ioannina, 47150 Kostaki Artas, Greece

² Department of Engineering Informatics and Telecommunications, University of Western Macedonia, 50100 Kozani, Greece

* Author to whom correspondence should be addressed.

Mathematics 2026, 14(11), 2018; https://doi.org/10.3390/math14112018

Submission received: 16 April 2026 / Revised: 1 June 2026 / Accepted: 3 June 2026 / Published: 5 June 2026

(This article belongs to the Special Issue Recent Advances in Swarm Intelligence Algorithms: Optimization and Application)

Abstract

This article presents a new literature-informed benchmark problem for continuous global optimization, inspired by the semantics of weather-aware irrigation scheduling. The problem is formulated as a continuous minimization task in which irrigation decisions are determined under synthetically generated but agronomically motivated weather-dependent water demand, rainfall effects, and nonlinear operational penalties. The proposed benchmark is intentionally synthetic and self-contained, requiring no external datasets or field-calibrated parameters, while its components are grounded in established concepts from irrigation science, such as evapotranspiration-based demand estimation, yield response–water relationships, and weather-dependent scheduling principles. It is designed to be sufficiently challenging for numerical optimization while retaining a clear agronomic interpretation. The article provides the full mathematical formulation of the problem, explains its main components and parameters, and reports an experimental study focused on its optimization behavior using ten established continuous optimizers across five problem dimensions. The proposed problem thus adds a further case to the study of real-world-inspired continuous optimization benchmarks, one in which mathematical formulation, practical interpretation, and experimental investigation are developed together in a coherent and systematic manner.

Keywords: continuous global optimization; benchmark problems; real-world optimization; irrigation scheduling; weather-aware optimization; black-box optimization; non-separable objective functions

MSC: 65K05; 65K10; 68W01

1. Introduction

Optimization plays a central role in science and engineering because many practical decision problems can be formulated as the search for variable configurations that minimize cost, maximize performance, or satisfy multiple competing operational criteria. In continuous settings, the resulting problems are often nonlinear, multimodal, and nonconvex, and derivative information is frequently unavailable, which makes the empirical study of optimization algorithms strongly dependent on the availability of representative benchmark problems [1]. Population-based and nature-inspired metaheuristics, including differential evolution, particle swarm optimization, evolution strategies, and ant colony optimization, have demonstrated broad competence on such problems, yet their comparative evaluation remains a persistent methodological challenge [2,3]. In the absence of systematic benchmarking, performance claims are frequently overstated, algorithm rankings are unstable across problem classes, and the generalizable strengths and weaknesses of competing methods remain obscured [4,5].

Benchmark design is therefore not a peripheral issue but a methodological one. The two dominant paradigms for continuous benchmarking, the Congress on Evolutionary Computation (CEC) suites and the Black-Box Optimization Benchmarking (BBOB) platform, have each contributed important infrastructure for controlled algorithm comparison [6,7]. CEC suites, including CEC 2014, CEC 2017, CEC 2021, and CEC 2022, provide fixed-budget comparisons on shifted, rotated, and hybridized analytical functions across multiple dimensionalities [8]. BBOB, developed within the COCO platform, adopts a target-based approach and covers a set of 24 single-objective functions with documented landscape properties, recently extended to large-scale dimensions up to 640 [9]. Both platforms have made substantial contributions to reproducible benchmarking, and recent comprehensive reviews have cataloged several hundred test functions in use across the optimization literature [2,10]. At the same time, recent analyses have shown that the choice of benchmark problems can substantially alter empirical conclusions: algorithms that rank highly on CEC 2020 often perform poorly on CEC 2014 and CEC 2017, and vice versa [4,5,11]. The number of allowed function evaluations further modulates rankings, suggesting that no single benchmark configuration provides a definitive ordering of optimizers [11]. These findings reinforce the need for diverse, independently designed benchmark problems whose difficulty structure is transparent and whose components can be subjected to controlled ablation. Recent critical analyses have further emphasized that algorithm rankings are highly sensitive to the specific benchmark suite employed [5,12], and comprehensive surveys have cataloged the structural properties, including separability, modality, and conditioning, that distinguish existing test functions from one another [13].
These findings underscore the value of introducing new benchmarks whose difficulty mechanisms are independently documented and whose landscape properties differ qualitatively from those of existing suites.

A complementary body of work has emphasized the value of real-world-inspired benchmark problems, whose objective structures reflect realistic coupling, delayed effects, threshold mechanisms, and non-separable stage interactions that are rarely present in purely analytical test functions [2,4,5]. The IEEE CEC 2011 competition introduced a set of practical optimization problems drawn from engineering domains, and subsequent efforts have proposed test suites for constrained, large-scale, and multi-objective problems motivated by physical or operational constraints [14]. Recent large-scale benchmarks have additionally highlighted the importance of heterogeneous module coupling and non-uniform interaction patterns, features that are genuinely present in physical systems but largely absent from classic analytical suites [15]. The methodological literature has accordingly called for benchmarks that combine interpretability (so that the source of difficulty can be understood and communicated) with sufficient computational challenge to stress modern population-based solvers across a range of dimensionalities [2,4,10].

Within this landscape of domain-specific benchmark construction, recent decomposition-based multi-objective approaches for coordinating trucks and unmanned aerial vehicles (UAVs) within cyber-physical internet frameworks [16] share with the present work the design philosophy of embedding domain-specific coupling and state-dependent dynamics into synthetic yet semantically meaningful problem formulations. However, such logistics-oriented benchmarks primarily emphasize spatial decomposition and multi-objective coordination among heterogeneous agents across multimodal transport networks, whereas the proposed irrigation benchmark instead emphasizes temporal state recursion through recursive soil-water storage and multiplicative seasonal yield aggregation as its principal difficulty mechanisms. These constitute orthogonal difficulty axes (spatial decomposition and multi-objective Pareto-front navigation on one side, temporal path dependence and single-objective rugged landscape structure on the other), suggesting that future benchmark suites could productively integrate mechanisms from both families.

Within agricultural water management, irrigation scheduling provides a natural domain for such benchmark construction. Agriculture accounts for approximately 70 percent of global freshwater withdrawals, and the efficient allocation of irrigation water across a growing season has direct implications for food production, resource conservation, and economic viability [17]. Existing studies have investigated evapotranspiration-driven scheduling based on the FAO Penman–Monteith framework, weather-aware irrigation control using probabilistic forecast integration, and simulation-optimization frameworks that couple crop growth models with operational decision support [18,19,20]. Real-time and data-assimilation-based scheduling approaches have demonstrated that incorporating short-term weather forecasts and sensor observations into the decision loop can substantially improve water use efficiency relative to rule-based alternatives [21,22]. Systematic reviews of simulation-optimization methods for irrigation scheduling have further shown that the choice of crop model, objective function structure, and optimization algorithm interacts in complex ways, highlighting the need for reproducible and well-characterized problem formulations [23]. However, many available formulations are developed as calibrated management tools for specific field conditions and crop systems rather than as intentionally difficult black-box benchmarks for continuous global optimization. Their calibration to site-specific agronomic data makes them neither self-contained nor algorithmically transparent, limiting their utility as general-purpose benchmarks. At the same time, the standard analytical benchmark suites used in continuous optimization, including CEC and BBOB, lack the semantic interpretability, state-dependent coupling, and operational realism that physically motivated domains such as irrigation scheduling naturally provide. 
There is therefore a recognized gap for benchmark problems that are simultaneously self-contained and analytically transparent, yet exhibit the non-separable, stateful, and rugged landscape structure that arises in real operational systems.

The present article therefore introduces a new literature-informed weather-coupled irrigation scheduling benchmark whose explicit role is not to reproduce a field-calibrated agronomic model, but to provide a mathematically explicit, semantically meaningful, and computationally demanding optimization landscape that bridges the gap identified above. The benchmark is entirely synthetic, deterministic, and analytically self-contained: all exogenous weather and crop profiles are generated analytically from a normalized stage coordinate, and no external datasets, site-specific calibration, or field measurements are required. At the same time, its components are grounded in established agronomic concepts: evapotranspiration-based demand estimation following the FAO framework [18], yield response–water relationships consistent with the classical model of Doorenbos and Kassam [19], and weather-dependent scheduling principles as discussed in [20]. As a result, every term in the objective function retains a clear physical interpretation, even though the benchmark itself is not a digital twin of any specific field or crop system. Its objective function deliberately integrates six distinct difficulty mechanisms (recursive soil-water storage propagation, oscillatory irrigation efficiency modulation, threshold-triggered drainage losses, multiplicative seasonal yield aggregation, neighbor-coupled hydraulic load, and a deceptive preferred-window penalty), each of which has a clear agronomic interpretation and a quantifiable contribution to search landscape ruggedness.
The benchmark is therefore positioned at the intersection of two research communities: it retains the domain semantics and operational realism of agricultural water management while providing the analytical transparency, controlled difficulty structure, and scalable dimensionality that the continuous optimization community requires for rigorous algorithm comparison.

The remainder of the article is organized as follows. Section 2 presents the proposed benchmark, including its background (Section 2.1), full mathematical formulation (Section 2.2), and experimental behavior (Section 2.4) under a common evaluation setting (Section 2.3). Section 3 discusses the optimization signature of the problem from a problem-centered perspective, and Section 4 summarizes the main conclusions.

2. Weather-Coupled Irrigation Scheduling Benchmark

2.1. Background of the Problem

The proposed benchmark is motivated by the need to represent irrigation scheduling as a genuinely coupled seasonal decision process rather than as a separable stage-wise water allocation rule. In realistic irrigation environments, the amount of water applied at one stage influences not only the immediate water balance of the crop, but also subsequent storage, drainage, evaporative losses, and the cumulative preservation of seasonal productivity [15,18,19,20,21,23]. This makes the domain naturally suitable for benchmark construction aimed at non-separable and state-dependent continuous optimization.

The benchmark is intentionally synthetic in the sense that all exogenous profiles are generated analytically from the normalized stage coordinate, yet it remains literature-informed in its semantic structure. To make this grounding explicit, the following correspondences between benchmark components and agronomic concepts are noted. The stage-wise crop evapotranspiration demand $\mathrm{ETc}_i = \mathrm{ETo}_i \cdot \mathrm{Kc}_i \cdot \Delta t$ follows the standard reference-evapotranspiration construction of the FAO Penman–Monteith framework [18], in which a reference atmospheric demand $\mathrm{ETo}$ is scaled by a crop-specific coefficient $\mathrm{Kc}$ to obtain actual crop water requirements. The multiplicative yield-preservation factor $Y_{\mathrm{rel}} = \prod_i y_i$ is consistent with the classical yield response–water model of Doorenbos and Kassam [19], which relates relative yield reduction to relative evapotranspiration deficit through a crop-specific sensitivity coefficient $\mathrm{Ky}$. The recursive soil-water storage transition $S_{i+1} = f(S_i, \mathrm{rain}_i, I_i^{\mathrm{eff}}, \dots)$ reflects the widely adopted soil water-balance approach used in simulation-optimization frameworks for irrigation scheduling [21,22,23], in which the storage state at each stage depends on incoming water, drainage, and evaporative losses from the preceding stage. The weather-dependent profiles for heat, wind, rainfall, and tariff represent stylized seasonal patterns that capture the qualitative behavior documented in weather-aware scheduling studies [17,20], although their specific functional forms are chosen for analytical convenience rather than site-specific fidelity.

Beyond these literature-informed components, additional oscillatory, thresholded, and deceptive mechanisms are deliberately embedded to produce a challenging optimization landscape. These include the oscillatory modulation of irrigation efficiency $\eta_i(x)$, which generates repeated local attraction basins; the threshold-triggered deep drainage loss, which introduces nonsmooth regime changes; and the deceptive preferred-window penalty $P_{\mathrm{win}}$, which attracts solvers toward irrigation levels that are suboptimal under the full objective. These mechanisms are synthetic ruggedness generators with no claim to precise field physics; they are introduced to make the benchmark genuinely difficult for modern continuous optimizers while remaining structurally interpretable. As a result, the problem should be interpreted as a literature-informed optimization benchmark with agronomic meaning, not as a calibrated digital twin of a particular field or crop system.

2.2. Mathematical Formulation

Let $D \geq 4$ denote the problem dimension, and let $x = (x_1, x_2, \dots, x_D)^{T}$ be the decision vector, where $x_i$ is the irrigation depth (mm) applied at stage $i$ of a synthetic growing season. The feasible set is the hyper-rectangle

$$\Omega = [0, 80]^{D}, \qquad 0 \leq x_i \leq 80, \quad i = 1, \dots, D.$$

The only hard constraints imposed on the problem are the box bounds $0 \leq x_i \leq 80$ on the decision variables. The internal storage state $S_i$ is maintained within admissible bounds $[0, S_i^{\max}]$ through the clipping operator in the storage transition equation, which acts as an implicit state constraint enforced by construction rather than by external penalization. All remaining operational constraints are treated as soft penalties incorporated directly into the objective function: the seasonal water budget is softly constrained through $P_{\mathrm{budget}}$, which penalizes deviation from the target $B_D = 28D$, peak hydraulic demand is softly constrained through $P_{\mathrm{peak}}$, and terminal storage is regularized through $P_{\mathrm{term}}$. This soft-constraint formulation is deliberate, as it preserves the box-constrained structure of the search domain while embedding operational feasibility requirements into the landscape topology of the objective.

The season length is fixed at 120 days, so the stage duration is

$$\Delta t = \frac{120}{D},$$

and the normalized stage coordinate is

$$s_i = \frac{i}{D + 1}, \qquad i = 1, \dots, D.$$

Recommended difficult dimensions are $D \in \{24, 30, 50, 70, 100\}$, with $D = 24$ and $D = 30$ suitable for routine experiments and $D = 50$, $D = 70$, or $D = 100$ producing substantially harder landscapes for population-based optimizers.

To make the benchmark fully deterministic and self-contained, all exogenous weather and crop coefficients are generated analytically from the normalized stage coordinate $s_i$. No stochastic, scenario-based, or externally sampled weather realizations are involved: every weather profile is a fixed, closed-form trigonometric function of $s_i$ that produces identical values for any given dimension $D$. This design choice eliminates all sources of exogenous randomness from the benchmark, ensuring that any variability in optimization outcomes is attributable solely to the stochastic search process of the optimizer and not to weather input uncertainty. The specific functional forms are chosen to reproduce qualitative seasonal patterns (e.g., mid-season peaks in temperature and evapotranspiration, early-season dominance of rainfall) that are consistent with the general behavior documented in weather-aware scheduling studies [17,20], although they do not represent any particular geographic location or climate record. The stage-wise profiles are defined as:

$$\mathrm{ETo}_i = 2.8 + 3.2 \sin^{2}(\pi s_i),$$

$$\mathrm{Kc}_i = 0.55 + 0.65 \sin(\pi s_i),$$

$$\mathrm{rain}_i = 8 + 18 \cos^{2}(\pi s_i),$$

$$\mathrm{Ky}_i = 0.85 + 0.55 \sin(\pi s_i),$$

$$\mathrm{heat}_i = 0.40 + 0.60 \sin^{2}(\pi s_i),$$

$$\mathrm{wind}_i = 0.35 + 0.65 \cos^{2}(\pi s_i),$$

$$\mathrm{tariff}_i = 0.90 + 0.18 \sin(2 \pi s_i + 0.20),$$

$$\eta_{\mathrm{base},i} = 0.78 + 0.10 \cos(\pi s_i),$$

$$\rho_i = 0.52 + 0.18 \sin^{2}(\pi s_i),$$

$$S_i^{\max} = 50 + 20 \sin^{2}(\pi s_i),$$

$$\mathrm{freq}_i = 0.22 + 0.09 \sin(\pi s_i),$$

and

$$\phi_i = 0.70\, i + 0.30 \sin(2 \pi s_i).$$
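As an illustration of how these closed-form profiles can be tabulated, the following minimal Python sketch evaluates them for a given dimension. It is not taken from the OptimSolution source code; the function name `stage_profiles` and the dictionary keys are hypothetical choices made for this example, and the crop demand entry uses the product $\mathrm{ETo}_i \, \mathrm{Kc}_i \, \Delta t$ stated in the formulation.

```python
import math

SEASON_DAYS = 120.0  # fixed season length from the formulation


def stage_profiles(D):
    """Return (dt, profiles): stage duration and one dict of profile values per stage.

    Hypothetical helper for illustration; names are not from the original code.
    """
    if D < 4:
        raise ValueError("the formulation assumes D >= 4")
    dt = SEASON_DAYS / D
    profiles = []
    for i in range(1, D + 1):            # stages are 1-based in the text
        s = i / (D + 1)                  # normalized stage coordinate s_i
        sin1 = math.sin(math.pi * s)
        sin2 = sin1 ** 2
        cos2 = math.cos(math.pi * s) ** 2
        eto = 2.8 + 3.2 * sin2
        kc = 0.55 + 0.65 * sin1
        profiles.append({
            "ETo": eto,
            "Kc": kc,
            "ETc": eto * kc * dt,        # crop evapotranspiration demand
            "rain": 8.0 + 18.0 * cos2,
            "Ky": 0.85 + 0.55 * sin1,
            "heat": 0.40 + 0.60 * sin2,
            "wind": 0.35 + 0.65 * cos2,
            "tariff": 0.90 + 0.18 * math.sin(2 * math.pi * s + 0.20),
            "eta_base": 0.78 + 0.10 * math.cos(math.pi * s),
            "rho": 0.52 + 0.18 * sin2,
            "S_max": 50.0 + 20.0 * sin2,
            "freq": 0.22 + 0.09 * sin1,
            "phi": 0.70 * i + 0.30 * math.sin(2 * math.pi * s),
        })
    return dt, profiles
```

Because every entry depends only on $i$ and $D$, calling the function twice with the same $D$ returns identical values, which is the determinism property emphasized above.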

The crop evapotranspiration demand follows the standard evapotranspiration-based construction

$$\mathrm{ETc}_i = \mathrm{ETo}_i \, \mathrm{Kc}_i \, \Delta t,$$

which grounds the benchmark in classical irrigation science [18]. The coefficient $\mathrm{Ky}_i$ is used as a stage sensitivity factor in the relative yield-preservation component, consistent with the general idea of yield response to water stress [19].

The first major source of difficulty is local decision coupling. A neighbor-coupled hydraulic load is introduced as

$$N_i = 0.60\, x_i + 0.25\, x_{i-1} + 0.15\, x_{i+1},$$

with boundary conventions $x_0 = x_{D+1} = 0$. Effective irrigation efficiency is then defined by the oscillatory nonlinear law

$$\eta_i(x) = \eta_{\mathrm{base},i} \left[ 1 - 0.22 \sin^{2}(\mathrm{freq}_i\, x_i + \phi_i) - 0.10 \sin^{2}(0.07\, N_i + 0.50\, \phi_i) \right],$$

clipped to the interval $[0.45, 1.00]$, and the effective irrigation becomes

$$I_i^{\mathrm{eff}} = \eta_i(x)\, x_i.$$

These oscillatory factors are synthetic ruggedness generators. They are not intended to claim precise field physics; they are introduced to generate repeated attraction basins and to make coordinate-wise search substantially less effective.
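The neighbor coupling and the efficiency law can be sketched in a few lines of Python. This is a minimal illustration under the stated formulas, not the reference implementation; the function name `effective_irrigation` is a hypothetical choice, and the profile values are recomputed inline so that the example stays self-contained.

```python
import math


def effective_irrigation(x):
    """Return (N, eta, I_eff) lists for a decision vector x of depths in mm."""
    D = len(x)
    padded = [0.0] + [float(v) for v in x] + [0.0]   # x_0 = x_{D+1} = 0
    N, eta, i_eff = [], [], []
    for i in range(1, D + 1):
        s = i / (D + 1)
        eta_base = 0.78 + 0.10 * math.cos(math.pi * s)
        freq = 0.22 + 0.09 * math.sin(math.pi * s)
        phi = 0.70 * i + 0.30 * math.sin(2 * math.pi * s)
        # Neighbor-coupled hydraulic load.
        n_i = 0.60 * padded[i] + 0.25 * padded[i - 1] + 0.15 * padded[i + 1]
        # Oscillatory efficiency law, then clipping to [0.45, 1.00].
        raw = eta_base * (1.0
                          - 0.22 * math.sin(freq * padded[i] + phi) ** 2
                          - 0.10 * math.sin(0.07 * n_i + 0.50 * phi) ** 2)
        e_i = min(1.00, max(0.45, raw))
        N.append(n_i)
        eta.append(e_i)
        i_eff.append(e_i * padded[i])
    return N, eta, i_eff
```

Since $N_i$ mixes $x_{i-1}$, $x_i$, and $x_{i+1}$, perturbing a single coordinate changes the efficiency of up to three stages, which is one reason coordinate-wise search loses effectiveness on this problem.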

The second major source of difficulty is path dependence through a synthetic soil-water storage state $S_i$. The initial storage is fixed by

$$S_1 = 0.35\, S_1^{\max}.$$

The total water available at stage $i$ is

$$A_i = \mathrm{rain}_i + I_i^{\mathrm{eff}} + 0.55\, S_i,$$

and the stage deficit and surplus are

$$\mathrm{deficit}_i = \max(0, \mathrm{ETc}_i - A_i),$$

$$\mathrm{surplus}_i = \max(0, A_i - \mathrm{ETc}_i).$$

Deep drainage is modeled by the thresholded term

$$\mathrm{deep}_i = \max(0, \mathrm{surplus}_i - 0.35\, S_i^{\max}) \left[ 0.55 + 0.20\, \mathrm{wind}_i + 0.15 \sin^{2}(0.11\, x_i + \phi_i) \right],$$

recharge is defined as

$$\mathrm{recharge}_i = \rho_i \max(0, \mathrm{surplus}_i - \mathrm{deep}_i),$$

evaporative loss is

$$\mathrm{evaploss}_i = (0.08 + 0.10\, \mathrm{heat}_i + 0.04\, \mathrm{wind}_i)\, S_i,$$

and the storage transition is

$$S_{i+1} = \mathrm{clip}\left( 0.82\, S_i + \mathrm{recharge}_i - \mathrm{evaploss}_i,\; 0,\; S_i^{\max} \right),$$

where $\mathrm{clip}(v, a, b) = \min(\max(v, a), b)$. This recursion makes the objective stateful: changing one variable can alter the effective optimization landscape seen by many subsequent variables.

The storage transition is nonlinear due to the $\max(\cdot)$ and $\mathrm{clip}(\cdot)$ operators appearing in the drainage, surplus, deficit, and recharge terms. Rainfall and effective irrigation enter additively in the total available water $A_i = \mathrm{rain}_i + I_i^{\mathrm{eff}} + 0.55\, S_i$, while evapotranspiration demand $\mathrm{ETc}_i$ is subtracted through the deficit-surplus split. The clipping to $[0, S_i^{\max}]$ ensures that the storage state remains non-negative and bounded at every stage, guaranteeing feasibility of the dynamic system. The contraction factor 0.82 in the transition ensures that the autonomous dynamics (without recharge) are inherently stable, preventing unbounded state accumulation.
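A single stage of the storage recursion can be written as a small function, shown below as a hedged Python sketch. The name `storage_step` and its argument list are hypothetical; the stage quantities (rainfall, demand, recharge fraction, and so on) are passed in as plain numbers so that the transition logic can be inspected in isolation.

```python
import math


def storage_step(S, S_max, x_i, i_eff, rain, etc, rho, heat, wind, phi):
    """Advance the soil-water storage by one stage.

    Returns (S_next, deficit, surplus, deep) following the stage equations.
    """
    available = rain + i_eff + 0.55 * S                 # A_i
    deficit = max(0.0, etc - available)
    surplus = max(0.0, available - etc)
    # Deep drainage is triggered only above the 0.35 * S_max threshold.
    deep = max(0.0, surplus - 0.35 * S_max) * (
        0.55 + 0.20 * wind + 0.15 * math.sin(0.11 * x_i + phi) ** 2)
    recharge = rho * max(0.0, surplus - deep)
    evap_loss = (0.08 + 0.10 * heat + 0.04 * wind) * S
    # clip(v, 0, S_max) keeps the state non-negative and bounded.
    S_next = min(S_max, max(0.0, 0.82 * S + recharge - evap_loss))
    return S_next, deficit, surplus, deep
```

With no rainfall and no irrigation the recharge is zero, so the state decays by the factor 0.82 minus the evaporative fraction, which illustrates the contraction property noted above; with a very large surplus the result saturates at the upper clipping bound.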

A multiplicative relative yield-preservation term is included to prevent the problem from collapsing into a pure cost minimization model. The relative water-satisfaction ratio is

$$r_i = \min\left( 1, \frac{A_i}{\mathrm{ETc}_i + \varepsilon} \right), \qquad \varepsilon = 10^{-12},$$

the stage stress is

$$\mathrm{stress}_i = 1 - r_i,$$

and the stage contribution to yield preservation is

$$y_i = \mathrm{clip}\left( 1 - \mathrm{Ky}_i\, \mathrm{stress}_i^{1.35} - 0.08\, \mathrm{heat}_i \exp\left( - \frac{A_i}{0.35\, \mathrm{ETc}_i + 1} \right),\; 0.02,\; 1 \right).$$

The season-level relative yield factor is then

$$Y_{\mathrm{rel}} = \prod_{i=1}^{D} y_i.$$

The multiplicative construction is deliberate: a few severely stressed stages can dominate many small local improvements, which makes the global trade-off structure significantly more deceptive than in additive reward formulations [15,19,21,23].
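The effect of the multiplicative aggregation can be made concrete with a short Python sketch. The function names `stage_yield` and `seasonal_yield` are hypothetical, and the numerical inputs in the usage note are illustrative values rather than results from the article.

```python
import math

EPS = 1e-12  # epsilon from the water-satisfaction ratio


def stage_yield(available, etc, ky, heat):
    """Stage contribution y_i, clipped to [0.02, 1]."""
    r = min(1.0, available / (etc + EPS))      # relative water satisfaction
    stress = 1.0 - r
    y = (1.0 - ky * stress ** 1.35
         - 0.08 * heat * math.exp(-available / (0.35 * etc + 1.0)))
    return min(1.0, max(0.02, y))


def seasonal_yield(stage_values):
    """Season-level factor Y_rel as the product of the stage contributions."""
    return math.prod(stage_values)
```

For instance, 23 stages at 0.99 combined with one fully stressed stage at the floor value 0.02 give a seasonal factor below 0.02, whereas an additive average of the same values would remain near 0.95. A single failed stage therefore removes almost all of the $350\, Y_{\mathrm{rel}}$ reward.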

The benchmark is posed as the minimization problem

$$\min_{x \in [0, 80]^{D}} f(x),$$

where

$$f(x) = C_{\mathrm{water}} + C_{\mathrm{pump}} + P_{\mathrm{def}} + P_{\mathrm{exc}} + P_{\mathrm{smooth}} + P_{\mathrm{res}} + P_{\mathrm{int}} + P_{\mathrm{win}} + P_{\mathrm{term}} + P_{\mathrm{budget}} + P_{\mathrm{peak}} - 350\, Y_{\mathrm{rel}}.$$

To clarify the role and origin of each term, the twelve components of $f(x)$ (eleven cost and penalty terms plus the yield reward) are organized into three categories according to their design rationale.

The first category contains terms that are directly motivated by agronomic and operational practice. The water-cost term $C_{\mathrm{water}}$ represents the direct economic cost of water application, which is a primary objective in any irrigation scheduling formulation [17,20]. The pumping-cost term $C_{\mathrm{pump}}$ captures the energy cost associated with pressurized water delivery, a well-recognized component of irrigation operating expenses [23]. The deficit penalty $P_{\mathrm{def}}$ penalizes unmet crop water demand, reflecting the agronomic principle that water stress reduces yield [19]. The excess penalty $P_{\mathrm{exc}}$ penalizes over-irrigation that leads to deep drainage and waterlogging, consistent with the water-balance perspective in which surplus water beyond the root zone is a resource loss [18,21]. The terminal storage penalty $P_{\mathrm{term}}$ encourages the preservation of soil moisture at the end of the season, a common objective in multi-season water management [22]. The seasonal budget penalty $P_{\mathrm{budget}}$ softly constrains total water use over the growing season, reflecting practical water-rights or allocation limits [16]. The multiplicative yield factor $Y_{\mathrm{rel}}$ aggregates stage-wise crop productivity into a seasonal yield metric, following the classical yield response–water framework [19].

The second category contains terms that model realistic operational concerns but whose specific functional forms are stylized. The smoothness penalty $P_{\mathrm{smooth}}$ discourages abrupt changes in irrigation between consecutive stages, reflecting practical constraints on irrigation system ramping and scheduling regularity. The peak-load penalty $P_{\mathrm{peak}}$ penalizes excessive instantaneous hydraulic demand, which in practice can exceed the capacity of pumping and conveyance infrastructure. The neighbor-interaction penalty $P_{\mathrm{int}}$ captures the operational coupling between adjacent irrigation stages, in which the cumulative load from consecutive high-application periods can create delivery bottlenecks.

The third category contains terms that are deliberately synthetic and are introduced to generate specific landscape difficulty features for benchmarking purposes. The resonance penalty $P_{\mathrm{res}}$ and the deceptive preferred-window penalty $P_{\mathrm{win}}$ are oscillatory and phase-shifted ruggedness generators designed to create dense local attractors and misleading basins that make the optimization landscape genuinely challenging for modern solvers. These terms have no direct agronomic counterpart; their role is purely to increase the benchmark’s difficulty in a controlled and ablatable manner, as verified experimentally in Section 2.4.

The water-cost term is

$$C_{\mathrm{water}} = \sum_{i=1}^{D} \mathrm{tariff}_i\, x_i,$$

and the pumping-cost term is

$$C_{\mathrm{pump}} = \sum_{i=1}^{D} 0.0045\, N_i^{2} \left[ 1 + 0.35 \sin^{2}(0.08\, x_i + 0.60\, \phi_i) \right].$$

A weather-modulated shock factor is introduced by

$$\mathrm{shock}_i = 1 + 0.18\, \mathrm{heat}_i \sin^{2}(0.09\, x_i + 1.3\, \mathrm{wind}_i + \phi_i),$$

and the deficit and excess penalties are given by

$$P_{\mathrm{def}} = \sum_{i=1}^{D} (0.12 + 0.06\, \mathrm{heat}_i)\, \mathrm{deficit}_i^{2}\, \mathrm{shock}_i,$$

$$P_{\mathrm{exc}} = \sum_{i=1}^{D} (0.05 + 0.05\, \mathrm{wind}_i) \left( \mathrm{deep}_i^{2} + 0.30\, \mathrm{surplus}_i^{2} \right).$$

Temporal roughness and local resonance are imposed by

$$P_{\mathrm{smooth}} = \sum_{i=2}^{D} (0.06 + 0.02\, \mathrm{wind}_i) (x_i - x_{i-1})^{2},$$

$$P_{\mathrm{res}} = \sum_{i=1}^{D} 8.0\, (1 + 0.70\, \mathrm{heat}_i) \sin^{2}(\mathrm{freq}_i\, x_i + \phi_i).$$

Neighbor-stage overloading and bilinear coupling are represented by

$$P_{\mathrm{int}} = \sum_{i=2}^{D} \left[ 0.018 \left[ \max(0, x_i + 0.55\, x_{i-1} - 58) \right]^{2} + 0.025\, x_i\, x_{i-1} \right].$$

A deceptive preferred-window term is defined through

$$\mu_i = 18 + 26 \sin^{2}(\pi s_i),$$

$$P_{\mathrm{win}} = \sum_{i=1}^{D} \left( 0.014 + 0.010 \sin^{2}(\pi s_i) \right) (x_i - \mu_i)^{2} \left[ 1 + 0.40 \sin^{2}(0.06\, x_i + \phi_i) \right].$$

The terminal storage penalty is

$$P_{\mathrm{term}} = 1.2 \left( S_{D+1} - 0.45\, S_D^{\max} \right)^{2},$$

the soft seasonal budget penalty uses

$$B_D = 28 D,$$

and is defined by

$$P_{\mathrm{budget}} = 0.010 \left( \sum_{i=1}^{D} x_i - B_D \right)^{2},$$

while the peak-load penalty is

$$P_{\mathrm{peak}} = 0.035 \sum_{i=1}^{D} \left[ \max(0, N_i - 62) \right]^{2}.$$

The complete problem is therefore a box-constrained, deterministic, strongly non-separable, rugged, and partially nonsmooth minimization benchmark.
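To show how the pieces of the formulation fit together, the following Python sketch assembles a complete evaluation of $f(x)$ in a single pass over the stages. It is a minimal illustration written from the equations in this section and is not the C++ implementation distributed with OptimSolution. The function name `irrigation_objective` and the internal helper names are hypothetical, and the squared hinge terms in $P_{\mathrm{int}}$ and $P_{\mathrm{peak}}$ are read as $[\max(0, \cdot)]^{2}$, which is an assumption about the intended notation.

```python
import math


def _sq(v):
    return v * v


def _sin2(v):
    return math.sin(v) ** 2


def irrigation_objective(x):
    """Evaluate f(x) for a sequence of D >= 4 irrigation depths in [0, 80] mm."""
    D = len(x)
    dt = 120.0 / D
    xp = [0.0] + [float(v) for v in x] + [0.0]      # x_0 = x_{D+1} = 0

    S = 0.35 * (50 + 20 * _sin2(math.pi / (D + 1)))  # S_1 = 0.35 * S_1^max
    total, y_rel, smax = 0.0, 1.0, 0.0
    for i in range(1, D + 1):
        s = i / (D + 1)
        s1, s2 = math.sin(math.pi * s), _sin2(math.pi * s)
        c2 = math.cos(math.pi * s) ** 2
        etc = (2.8 + 3.2 * s2) * (0.55 + 0.65 * s1) * dt
        rain, ky = 8 + 18 * c2, 0.85 + 0.55 * s1
        heat, wind = 0.40 + 0.60 * s2, 0.35 + 0.65 * c2
        tariff = 0.90 + 0.18 * math.sin(2 * math.pi * s + 0.20)
        eta_base = 0.78 + 0.10 * math.cos(math.pi * s)
        rho, smax = 0.52 + 0.18 * s2, 50 + 20 * s2
        freq = 0.22 + 0.09 * s1
        phi = 0.70 * i + 0.30 * math.sin(2 * math.pi * s)
        mu = 18 + 26 * s2
        xi, xprev = xp[i], xp[i - 1]

        # Neighbor load and clipped oscillatory efficiency.
        n = 0.60 * xi + 0.25 * xprev + 0.15 * xp[i + 1]
        res = _sin2(freq * xi + phi)
        eta = eta_base * (1 - 0.22 * res - 0.10 * _sin2(0.07 * n + 0.50 * phi))
        eta = min(1.00, max(0.45, eta))

        # Water balance for the stage.
        avail = rain + eta * xi + 0.55 * S
        deficit = max(0.0, etc - avail)
        surplus = max(0.0, avail - etc)
        deep = max(0.0, surplus - 0.35 * smax) * (
            0.55 + 0.20 * wind + 0.15 * _sin2(0.11 * xi + phi))

        # Multiplicative yield contribution.
        stress = 1.0 - min(1.0, avail / (etc + 1e-12))
        y = (1 - ky * stress ** 1.35
             - 0.08 * heat * math.exp(-avail / (0.35 * etc + 1)))
        y_rel *= min(1.0, max(0.02, y))

        # Stage-wise costs and penalties.
        shock = 1 + 0.18 * heat * _sin2(0.09 * xi + 1.3 * wind + phi)
        total += tariff * xi                                           # C_water
        total += 0.0045 * _sq(n) * (1 + 0.35 * _sin2(0.08 * xi + 0.60 * phi))
        total += (0.12 + 0.06 * heat) * _sq(deficit) * shock           # P_def
        total += (0.05 + 0.05 * wind) * (_sq(deep) + 0.30 * _sq(surplus))
        total += 8.0 * (1 + 0.70 * heat) * res                         # P_res
        total += ((0.014 + 0.010 * s2) * _sq(xi - mu)
                  * (1 + 0.40 * _sin2(0.06 * xi + phi)))               # P_win
        total += 0.035 * _sq(max(0.0, n - 62))                         # P_peak
        if i >= 2:
            total += (0.06 + 0.02 * wind) * _sq(xi - xprev)            # P_smooth
            total += (0.018 * _sq(max(0.0, xi + 0.55 * xprev - 58))
                      + 0.025 * xi * xprev)                            # P_int

        # Storage transition to S_{i+1}.
        recharge = rho * max(0.0, surplus - deep)
        evap = (0.08 + 0.10 * heat + 0.04 * wind) * S
        S = min(smax, max(0.0, 0.82 * S + recharge - evap))

    total += 1.2 * _sq(S - 0.45 * smax)          # P_term with S_{D+1}, S_D^max
    total += 0.010 * _sq(sum(xp) - 28 * D)       # P_budget
    return total - 350.0 * y_rel
```

Every cost and penalty term is non-negative on the feasible box and $Y_{\mathrm{rel}} \leq 1$, so any value returned by this sketch is bounded below by $-350$. That bound is loose and is useful only as a sanity check on an implementation.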

No closed-form global optimum is assumed known. This is intentional and is part of the benchmark design. The objective contains repeated oscillatory structures, threshold-triggered regime changes, clipped internal states, recursive storage dynamics, multiplicative stage aggregation, and coupled penalties. Consequently, the landscape contains many local minima and deceptive trade-offs, and performance cannot be inferred from low-dimensional intuition alone. From the perspective of benchmark design, the difficulty comes from the simultaneous presence of six mechanisms: recursive state propagation, local oscillatory efficiency modulation, thresholded drainage and peak penalties, multiplicative yield aggregation, neighbor coupling through $N_i$, and a soft preferred-window structure that competes with the remaining terms instead of revealing the optimum. For these reasons, the benchmark is substantially harder than a smooth irrigation cost-yield trade-off model while remaining interpretable as a weather-coupled irrigation scheduling problem.

Although no closed-form global optimum is available for the full benchmark formulation, the fully simplified variant $\mathrm{ALL}_{\mathrm{OFF}}$ provides a tractable reference point for absolute performance calibration. In the $\mathrm{ALL}_{\mathrm{OFF}}$ formulation, all embedded difficulty mechanisms are disabled and the resulting landscape is smooth and convex-like: as demonstrated in Section 2.4, the methods arq, cmaes, lmcmaes, lracmaes, jso, and mlshaderl converge to identical objective values with 100% success rates and zero standard deviation across all dimensions, establishing a consensus empirical optimum that can serve as a lower-bound reference for the full formulation. The optimality gap for any method on the full benchmark can therefore be estimated as the difference between its achieved objective value and the corresponding $\mathrm{ALL}_{\mathrm{OFF}}$ consensus value, providing an absolute rather than purely relative measure of solution quality. Explicitly, the optimality gap for a given method on the full formulation is defined as $\Delta = f(\mathrm{ALL}) - f^{*}(\mathrm{ALL}_{\mathrm{OFF}})$, where $f(\mathrm{ALL})$ is the best objective value achieved by the method across 30 independent runs on the full formulation, and $f^{*}(\mathrm{ALL}_{\mathrm{OFF}})$ is the consensus reference value established by the subset of methods that solve the $\mathrm{ALL}_{\mathrm{OFF}}$ variant to optimality with 100% success rates and zero standard deviation. This definition is solver-independent: the consensus $f^{*}(\mathrm{ALL}_{\mathrm{OFF}})$ is not the average of all methods’ $\mathrm{ALL}_{\mathrm{OFF}}$ results but the value agreed upon by those solvers that reliably converge on the simplified landscape. Methods that fail to solve $\mathrm{ALL}_{\mathrm{OFF}}$, such as gwo, ga, and clpso at large dimensions, are therefore not used to establish the reference, and their optimality gaps are computed against the consensus established by the converging subset. This yields a conservative lower bound on their true optimality gap rather than an overestimate, since their $\mathrm{ALL}_{\mathrm{OFF}}$ best values exceed $f^{*}(\mathrm{ALL}_{\mathrm{OFF}})$, making their apparent $\mathrm{ALL}$ gap smaller than it would be if measured against the true optimum. This approach is consistent with established practice in black-box benchmarking: the BBOB platform and the CEC suites both rely on inter-algorithm relative comparison as the primary performance metric when closed-form optima are not analytically available, and the availability of a tractable simplified variant with a verifiable consensus optimum provides an additional calibration layer that is rarely present in classical synthetic benchmarks.
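The gap computation described above can be expressed as a short Python sketch. The function names, the tuple layout of the per-solver summaries, and the agreement tolerance `tol` are assumptions introduced for this illustration; the article states the selection criterion (100% success rate and zero standard deviation) but does not prescribe a data format.

```python
def consensus_reference(all_off_results, tol=1e-9):
    """Return the ALL_OFF value shared by the reliably converging solvers.

    all_off_results maps a solver name to a tuple (best, std, success_rate),
    with success_rate expressed as a fraction in [0, 1].
    """
    eligible = [best for best, std, rate in all_off_results.values()
                if std == 0.0 and rate == 1.0]
    if not eligible:
        raise ValueError("no solver meets the 100% success / zero-std criterion")
    ref = min(eligible)
    if max(eligible) - ref > tol:
        raise ValueError("eligible solvers disagree on the ALL_OFF value")
    return ref


def optimality_gap(full_run_values, reference):
    """Delta = best value over the runs on the full problem minus f*(ALL_OFF)."""
    return min(full_run_values) - reference
```

Solvers with a nonzero standard deviation or an incomplete success rate are excluded when the reference is built, so a method that fails on the simplified variant cannot shift the value against which all gaps are measured.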

For convenience, Table 1 summarizes all parameters, auxiliary computed variables, and numerical coefficients appearing in the mathematical formulation of this benchmark.

2.3. Experimental Protocol

All optimization algorithms considered in this study, together with the proposed weather-aware irrigation benchmark, were implemented in C++ and integrated into the open-source OptimSolution framework, which was developed by Vasileios Charilogis. The corresponding source code is publicly available through the project’s GitHub repository: https://github.com/charilog/optimsolution (Accessed on 4 June 2026). All experiments were compiled and executed under Debian 12.12 using GCC 13.4 on a workstation equipped with an AMD Ryzen 9 9950X3D processor (16 cores, 32 threads) and 64 GB of DDR4 memory.

To ensure a fair and reproducible comparison, the same objective-function evaluation budget was imposed on all competing solvers. Since the benchmark itself is deterministic whereas the optimization methods are stochastic, each method was executed independently 30 times using different random seeds. The experimental campaign covered the recommended benchmark dimensions $D \in \{24, 30, 50, 70, 100\}$, thus including both moderate and substantially more demanding instances.

The ten optimization methods included in the comparison were selected to represent five distinct algorithmic families that are widely used in continuous global optimization: adaptive differential evolution (jso [24], mlshaderl [25]), comprehensive learning particle swarm optimization (clpso [26]), covariance matrix adaptation evolution strategy (cmaes [27], lmcmaes [28,29], lracmaes [30]), genetic algorithm (ga [31]), grey wolf optimizer (gwo [32]), ant colony optimization for continuous domains (acor [33]), and adaptive restart-based search (arq [34]). This selection was guided by three criteria: coverage of fundamentally different search mechanisms (mutation-based, swarm-based, rank-based, estimation-based, and restart-based), inclusion of both classical methods (ga, cmaes) and recent state-of-the-art solvers (jso, mlshaderl, arq), and availability of mature, well-documented implementations within the OptimSolution framework. The study focuses exclusively on population-based and derivative-free methods because the benchmark is positioned as a black-box optimization problem for which gradient information is not assumed available. The evaluation of gradient-based solvers, surrogate-assisted methods, and reinforcement learning approaches is noted as a direction for future work in Section 4.

In addition to the full problem formulation ($ALL$), the experimental protocol also included simplified ablation-based variants obtained by selectively disabling individual difficulty mechanisms embedded in the benchmark. These mechanisms include recursive storage propagation, oscillatory efficiency modulation, threshold-based loss effects, multiplicative yield aggregation, neighbor coupling, and preferred-window regularization. A fully simplified variant, denoted by $ALL_{OFF}$, was also examined, with all difficulty-inducing mechanisms disabled. This broader protocol allows the experimental analysis to assess not only the optimization performance achieved on the complete benchmark, but also the relative contribution of each embedded mechanism to the overall search difficulty of the problem. The abbreviations used for all variants are summarized in Table 2.
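One way to picture the variant family is as a set of on/off switches, one per difficulty mechanism. The sketch below is a hypothetical configuration layer, not the OptimSolution interface: the class name, field names, and dictionary are assumptions, while the six mechanisms and the variant labels follow the text.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class Mechanisms:
    """One boolean switch per difficulty mechanism (True = enabled)."""
    recursive_storage: bool = True       # RS
    oscillatory_efficiency: bool = True  # OE
    threshold_losses: bool = True        # TL
    multiplicative_yield: bool = True    # MY
    neighbor_coupling: bool = True       # NC
    preferred_window: bool = True        # PW


# Abbreviation -> field name, following the variant labels used in the text.
ABBREVIATIONS = {
    "RS": "recursive_storage",
    "OE": "oscillatory_efficiency",
    "TL": "threshold_losses",
    "MY": "multiplicative_yield",
    "NC": "neighbor_coupling",
    "PW": "preferred_window",
}

ALL = Mechanisms()
ALL_OFF = Mechanisms(**{f.name: False for f in fields(Mechanisms)})

VARIANTS = {"ALL": ALL, "ALL_OFF": ALL_OFF}
for abbr, field_name in ABBREVIATIONS.items():
    # Each single ablation disables exactly one mechanism of the full problem.
    VARIANTS[f"NO_{abbr}"] = replace(ALL, **{field_name: False})
```

This gives eight configurations in total: the full problem, the fully simplified problem, and six single-mechanism ablations.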

For each optimizer on each benchmark variant, the reported statistics were computed over 30 independent runs. The principal performance indicators were the best and mean final objective values obtained at termination. These measures were complemented by the standard deviation, convergence analysis, and rank-based comparisons in order to provide a more complete view of robustness, stability, and search efficiency.
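The reporting protocol can be sketched as follows. The objective and the optimizer here are deliberately trivial stand-ins (a sphere function and pure random search), not the irrigation benchmark or any of the ten solvers. The seeding scheme and the use of the sample standard deviation are assumptions, since the text does not specify them.

```python
import random
import statistics


def toy_objective(x):
    """Stand-in objective (sphere); NOT the irrigation benchmark."""
    return sum(v * v for v in x)


def random_search(objective, dim, budget, seed):
    """Stand-in stochastic solver with a fixed evaluation budget."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(budget):
        x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        best = min(best, objective(x))
    return best


def summarize(final_values):
    """Best, mean, and sample standard deviation of the final objective values."""
    return {
        "best": min(final_values),
        "mean": statistics.mean(final_values),
        "sd": statistics.stdev(final_values),
    }


# 30 independent runs, one seed per run, same budget for every run.
finals = [random_search(toy_objective, dim=5, budget=200, seed=s) for s in range(30)]
summary = summarize(finals)
```

Because each run is tied to an explicit seed, repeating the campaign reproduces the same 30 final values and therefore the same summary row.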

2.4. Experimental Results

Table 3 summarizes the complete experimental results obtained by applying the ten competing optimization methods to the full benchmark formulation ($ALL$) and to each of its seven simplified variants, namely the fully simplified $ALL_{OFF}$ and the six single-mechanism ablations $NO_{RS}$, $NO_{OE}$, $NO_{TL}$, $NO_{MY}$, $NO_{NC}$, and $NO_{PW}$, across the five benchmark dimensions $D \in \{24, 30, 50, 70, 100\}$. For every method-variant-dimension combination, four summary statistics are reported over 30 independent runs: the best objective value observed across all runs (Value), the mean final objective value at termination (Mean), the empirical success rate (Rate %), defined as the percentage of runs in which the method reached the best value found by any solver on that instance, and the standard deviation of the final objective values (SD). The performance indicator Rate therefore provides a run-level reliability measure that complements the point estimates given by Value and Mean. The table is organized by problem variant: each variant occupies a labeled block, and within each block the five dimensionality levels appear as consecutive rows. This structure allows a direct reading of both the cross-method comparison for a fixed variant and dimension, and the scaling behavior of each method as the problem dimension increases. Methods are listed in a fixed order throughout the table: arq [34], clpso [26], cmaes [27], ga [31], gwo [32], jso [24], lmcmaes [28,29], lracmaes [30], mlshaderl [25], and acor [33]. The column structure within each dimension block is (Value, Mean, Rate, SD), repeated once per method. A Rate of 100 indicates that the method reached the reference best solution in all 30 runs; a Rate of 3 is the minimum observable rate under the 30-run protocol (1/30 ≈ 3.3%, reported as 3) and indicates that the method succeeded in only one run.
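The Rate column can be reproduced from raw run data along the following lines. This is a hedged sketch: the function name and the tolerance `tol` are assumptions (the text does not state how closely a run must match the reference value), and the run values below are illustrative apart from the reference 1487.73 quoted for $ALL$ at $D = 24$.

```python
def success_rate(final_values, reference_best, tol=1e-6):
    """Percentage of runs that reach the reference best value, as an integer.

    A run counts as a success when its final value is within `tol` of the
    reference (minimization). `tol` is an assumed numerical tolerance.
    """
    hits = sum(1 for f in final_values if f <= reference_best + tol)
    return round(100.0 * hits / len(final_values))


reference = 1487.73
one_hit = [reference] + [1490.0] * 29          # 1 success out of 30
two_hits = [reference] * 2 + [1490.0] * 28     # 2 successes out of 30
twelve_hits = [reference] * 12 + [1490.0] * 18  # 12 successes out of 30

rates = (
    success_rate(one_hit, reference),
    success_rate(two_hits, reference),
    success_rate(twelve_hits, reference),
)
```

With 30 runs the attainable rates are multiples of 1/30, which is why values such as 3, 7, 10, and 40 recur throughout the results.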

Full problem formulation ($ALL$). The results for the complete benchmark reveal a strongly differentiated and dimension-dependent performance landscape. At $D = 24$, the method jso emerges as the most reliable solver with a success rate of 40%, converging to the reference best value of approximately 1487.73. The method arq attains the same best value but only in 7% of runs. All remaining solvers (clpso, cmaes, ga, gwo, lmcmaes, lracmaes, mlshaderl, and acor) achieve the minimum success rate of 3%, indicating that even at the lowest recommended dimension, the benchmark is genuinely challenging for the majority of established optimizers. The standard deviations reveal a clear separation between methods: jso and arq maintain SD values below 2.5, while ga and gwo exhibit SD values exceeding 40 and 51 units, respectively, reflecting high run-to-run variability characteristic of methods that lack sufficient local exploitation capacity in rugged landscapes.

As the dimension increases to $D = 30$, the overall difficulty increases substantially. Only mlshaderl achieves a success rate above the minimum (10%), converging to the reference best value of approximately 2736.36. All other methods report success rates of 3%, with mean values deviating from the reference best by margins ranging from 8 units (arq) to 188 units (gwo). The rapid deterioration from $D = 24$ to $D = 30$ is attributable to the recursive storage propagation mechanism, which extends the effective dependency chain by six additional stages and thereby increases the probability that at least one stage accumulates sufficient error to deflect the solver from the global basin. This explanation is consistent with the substantially milder dimension scaling observed in the $NO_{RS}$ variant, where the recursive storage is disabled.

At $D = 50$, the benchmark becomes markedly harder. The method acor achieves the lowest best value (approximately 6784.16) but, like all ten methods, with a success rate of only 3%. The gap between best and mean values also widens substantially: ga reports a mean of 7265.92 against a best of 6940.32, and gwo reports 7538.50 against 7193.55. These gaps indicate that most runs for these methods terminate in qualitatively different and substantially inferior regions of the landscape. The methods jso, mlshaderl, and acor maintain relatively tight standard deviations (below 19 units), suggesting that while they cannot reliably find the global best, they consistently locate competitive local basins, a behavior consistent with their archive-based and distribution-estimation search mechanisms, which provide implicit memory of previously explored regions.

At $D = 70$, all methods achieve success rates of 3%. The best observed value (approximately 11,195.22, obtained by lracmaes) is followed by jso (11,214.25) and mlshaderl (11,246.26), while ga and gwo report best values exceeding 11,700 and 12,400, respectively. The standard deviations for ga (297.79) and gwo (380.82) are roughly 7 to 12 times larger than those of jso (40.04) and cmaes (30.74). This confirms that the simpler population recombination (ga) and swarm attraction (gwo) mechanisms fail to maintain sufficient search distribution precision in a 70-dimensional landscape characterized by coupled oscillatory ruggedness and multiplicative yield aggregation.

At the most challenging setting, $D = 100$, the performance hierarchy among methods remains visible despite all methods achieving 3% success rates. The method cmaes achieves the lowest best value (approximately 18,161.57), followed by jso (18,203.71) and mlshaderl (18,213.28). The emergence of cmaes as a competitive solver at the highest dimension, despite its poor reliability, is explained by its covariance matrix adaptation mechanism, which can capture local second-order structure in the landscape and exploit it for rapid descent when initialized near a promising basin. This capability becomes relatively more valuable as the number of competing local attractors grows with dimension. By contrast, ga and gwo report best values exceeding 19,260 and 20,124, respectively, with standard deviations above 398 and 712, indicating near-complete loss of search coherence at this scale.

Simplified formulation with all difficulty mechanisms disabled ($ALL_{OFF}$). In sharp contrast with the $ALL$ variant, the $ALL_{OFF}$ formulation is solved to optimality by the majority of methods across all dimensions. At $D = 24$, six out of ten methods (arq, cmaes, jso, lracmaes, mlshaderl, and acor) achieve success rates of 100% with standard deviations that are numerically indistinguishable from zero. The reference best values for $ALL_{OFF}$ scale moderately with dimension, from approximately 458.71 at $D = 24$ to 5087.80 at $D = 100$. Even at $D = 100$, arq, cmaes, and jso maintain 100% success rates, while mlshaderl achieves 70%. The clpso, ga, and gwo methods are the only consistent exceptions, achieving rates of 3% across all dimensions with means that deviate from the reference best by increasingly large margins. These results demonstrate that, in the absence of all embedded difficulty mechanisms, the benchmark collapses to a smooth, tractable problem that most modern continuous optimizers can solve reliably, and that the difficulty reported in the $ALL$ variant is genuinely structural rather than numerical.

Formulation without recursive storage ($NO_{RS}$). Removing the recursive soil-water storage propagation restores some problem tractability compared to the $ALL$ variant, but the problem remains substantially harder than $ALL_{OFF}$. At $D = 24$, jso and mlshaderl both achieve success rates of 63%, while arq and acor attain 20% each. At $D = 30$, success rates decline to 7% for jso, lracmaes, mlshaderl, and acor. At higher dimensions ($D = 50$–$100$), all methods except acor and lracmaes (7% at $D = 50$) report success rates of 3%. The persistence of low success rates even without recursive storage confirms that the remaining mechanisms, particularly the oscillatory efficiency modulation and the multiplicative yield aggregation, are independently sufficient to create a challenging landscape. From an agronomic perspective, the recursive storage mechanism represents the temporal coupling that is fundamental to real irrigation scheduling: the soil moisture state at any point in the growing season depends on all prior irrigation decisions, creating a sequential dependency that cannot be decomposed into independent stage-wise choices. Its removal simplifies the problem by making each stage’s contribution to the objective approximately independent of preceding decisions.

Formulation without oscillatory efficiency ($NO_{OE}$). This ablation produces the most dramatic improvement in solver performance across the entire experimental campaign, confirming that the oscillatory efficiency mechanism is the single most consequential source of landscape difficulty. At $D = 30$, four methods (arq, jso, mlshaderl, and acor) achieve success rates of 100% with zero standard deviation, while cmaes achieves 97%. At $D = 50$, jso and mlshaderl maintain 100% success rates, cmaes achieves 97%, arq 83%, and lracmaes 73%. Even at $D = 100$, cmaes achieves a success rate of 53%, jso 30%, and mlshaderl 17%, results that are dramatically better than the 3% rates observed for all methods in the $ALL$ variant at the same dimension. The structural interpretation of this finding is clear: the oscillatory modulation of $\eta_{i}(x)$, which depends nonlinearly on both the local decision $x_{i}$ and the neighbor-coupled load $N_{i}$, introduces a dense lattice of local attractors that are energetically competitive with the global basin. When this mechanism is removed, the residual landscape formed by the water cost, pumping cost, deficit and excess penalties, and storage terms is sufficiently smooth to permit gradient-like local improvement strategies to succeed over repeated restarts. The practical implication for benchmark users is that the oscillatory efficiency amplitude (the coefficient 0.22 in the default formulation) is the primary lever for adjusting the overall difficulty of the benchmark.

Formulation without threshold losses ($NO_{TL}$). The removal of threshold-triggered drainage losses produces moderate improvement relative to $ALL$. At $D = 24$, jso achieves a success rate of 50% and mlshaderl 17%. At $D = 30$, jso reaches 97%, mlshaderl 73%, arq and cmaes both 30%, and lracmaes 23%. At $D = 50$, jso (47%), mlshaderl (37%), arq (27%), and cmaes (27%) all exceed the minimum rate. However, at $D = 70$ and $D = 100$, all methods collapse to 3%. These results indicate that the threshold drainage mechanism, which introduces nonsmooth regime changes when the water surplus exceeds a fraction of the storage capacity, contributes meaningfully to the difficulty of the intermediate-dimensional regime but is not the primary driver of optimizer failure at large scales. In agronomic terms, threshold drainage corresponds to the physical phenomenon of deep percolation beyond the root zone, which occurs abruptly once the soil water content exceeds field capacity. This nonlinearity is genuinely present in real irrigation systems and makes the optimization landscape locally nonsmooth.

Formulation without multiplicative yield aggregation ($NO_{MY}$). This variant replaces the multiplicative seasonal yield factor $Y_{rel} = \prod_{i} y_{i}$ with an additive average. At $D = 24$, jso achieves a success rate of 37%, while arq, cmaes, and mlshaderl each achieve 7%. At $D = 30$ and above, no method exceeds a success rate of 7%, a level reached by mlshaderl and acor at $D = 30$. The improvement relative to $ALL$ is modest at low dimensions and negligible at high dimensions, indicating that multiplicative yield aggregation amplifies the deceptive structure of the objective but is not the sole driver of difficulty. The agronomic rationale for the multiplicative construction is that crop yield is a cumulative outcome of the entire growing season: a single period of severe water stress can irreversibly damage the crop, and this damage cannot be compensated by excess water at other stages. The multiplicative form captures this non-compensatory character, which is consistent with the classical Doorenbos–Kassam yield-response model [19].
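The non-compensatory effect of the product form can be seen in a small numerical sketch. The stage-yield values below are hypothetical, and the additive variant is assumed here to be a plain arithmetic mean of the stage factors; the benchmark's own $y_{i}$ are computed from its stress model, which is not reproduced.

```python
import math


def multiplicative_yield(stage_yields):
    """Seasonal yield factor as the product of the stage factors y_i."""
    return math.prod(stage_yields)


def additive_yield(stage_yields):
    """Additive alternative: arithmetic mean of the stage factors (assumed form)."""
    return sum(stage_yields) / len(stage_yields)


# Hypothetical 24-stage season: no stress anywhere vs. one severely stressed stage.
no_stress = [1.0] * 24
one_bad_stage = [1.0] * 23 + [0.2]

product_loss = 1.0 - multiplicative_yield(one_bad_stage)  # 0.8 of the yield is lost
average_loss = 1.0 - additive_yield(one_bad_stage)        # about 0.033 of the yield is lost
```

Under the product, one stressed stage caps the seasonal yield at that stage's factor regardless of the other 23 stages, whereas the average dilutes the same event to a few percent.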

Formulation without neighbor coupling ($NO_{NC}$). The removal of the neighbor-coupling term $N_{i}$ produces a formulation in which each stage’s hydraulic load depends only on its own decision variable. At $D = 24$, arq and jso both achieve success rates of 10%, while all other methods report 3%. At $D = 30$, arq and acor achieve 7%, and all others 3%. At higher dimensions, all methods report 3%. The overall improvement relative to $ALL$ is modest, confirming that neighbor coupling contributes to non-separability but is not the dominant difficulty mechanism. The relatively small effect of removing neighbor coupling is consistent with the observation that the oscillatory efficiency and multiplicative yield mechanisms create difficulty through fundamentally different channels: local ruggedness and global aggregation deceptiveness, respectively. By contrast, neighbor coupling primarily introduces mild inter-variable dependencies that modern population-based methods can partially accommodate through their implicit modeling of variable correlations.

Formulation without preferred-window regularization ($NO_{PW}$). This final ablation removes the deceptive preferred-window penalty $P_{win}$. At $D = 24$, arq achieves a success rate of 17% and jso 7%, while all other methods report 3%. At $D = 30$, jso achieves 17%, arq, lracmaes, and mlshaderl 7%, and all others 3%. At higher dimensions, all methods collapse to 3%. The best observed values for this variant are systematically lower than those for $ALL$ (e.g., approximately 6681.29 vs. 6784.16 at $D = 50$), confirming that the preferred-window term acts as a genuine penalty that shifts the global optimum relative to the attractors created by the remaining terms. The modest improvement in success rates indicates that, while the preferred-window mechanism does introduce additional deceptive structure, its contribution to the overall difficulty is smaller than that of the oscillatory efficiency or the recursive storage mechanisms.

Cross-variant and cross-method synthesis. Taken together, the ablation study reveals a clear hierarchy of mechanism importance. The oscillatory efficiency modulation is unambiguously the dominant source of landscape difficulty: its removal enables multiple solvers to achieve 100% success rates at dimensions where the full formulation yields universal 3% rates. The recursive storage propagation and threshold drainage mechanisms occupy the second tier, each contributing meaningful but less dramatic difficulty that primarily affects the intermediate-dimensional regime (D = 30–50). The multiplicative yield aggregation, neighbor coupling, and preferred-window regularization contribute more modestly, with their effects becoming largely indistinguishable from the baseline difficulty at high dimensions. This ordering has a structural interpretation: the benchmark’s hardness is primarily driven by local landscape ruggedness (oscillatory efficiency) and temporal state coupling (recursive storage), rather than by global aggregation structure or inter-variable connectivity. Crucially, no single ablation restores the problem to the triviality of the fully simplified variant, confirming that the benchmark’s difficulty is genuinely multi-mechanism in character.

Among the ten competing methods, jso exhibits the strongest overall performance in the full formulation, achieving the highest success rate at $D = 24$ (40%) and competitive best values across all dimensions. The methods lracmaes, mlshaderl, and acor follow closely: lracmaes achieves the best observed value at $D = 70$, while mlshaderl’s historical archive mechanism and acor’s continuous probability density estimation provide complementary advantages for navigating the benchmark’s coupled structure. At the highest dimension ($D = 100$), cmaes emerges as a competitive solver in terms of best observed values, despite its low success rate, reflecting the value of second-order covariance information in very high-dimensional rugged landscapes. The methods ga and gwo consistently occupy the lowest performance tier across all variants and dimensions, with standard deviations that are typically 5–20 times larger than those of the best-performing methods. Their simpler search mechanisms (tournament-based recombination in ga and hierarchical swarm leadership in gwo) are fundamentally insufficient for the combined non-separability, state recursion, and oscillatory structure of the proposed benchmark.

Table 4 reports the mean wall-clock time per run for each method across all five dimensions. The benchmark’s objective function is inexpensive to evaluate: for eight of the ten methods, the time per run ranges from 0.22 s at $D = 24$ to approximately 1.6 s at $D = 100$ on the reference hardware. The exceptions are cmaes and lracmaes, both of which exhibit substantially higher wall-clock times that grow with dimension: cmaes from 2.53 s at $D = 24$ to 380.21 s at $D = 100$, and lracmaes from 4.87 s at $D = 24$ to 800.07 s at $D = 100$, reflecting the $O(D^{2})$ cost of their covariance matrix adaptation mechanisms rather than the benchmark’s evaluation cost. A complete 30-run experimental campaign at $D = 100$ therefore requires less than one minute for most methods, confirming that the benchmark is computationally lightweight and suitable for routine algorithm development, with the caveat that covariance-based methods incur substantially higher per-run costs at large dimensions.

A cost-benefit analysis of the covariance-based methods at large dimensions reveals an important practical consideration for benchmark users. At $D = 100$, lracmaes requires approximately 800 s per run, yielding a total experimental cost of $800 \times 30 = 24{,}000$ s (approximately 6.7 h) for a single 30-run campaign, while cmaes requires approximately $380 \times 30 = 11{,}400$ s (approximately 3.2 h). By contrast, each of the remaining eight methods completes its 30-run campaign at $D = 100$ in under one minute. Despite this substantial computational investment, lracmaes and cmaes do not achieve qualitatively superior performance at $D = 100$ relative to jso and mlshaderl, which require less than a minute per full campaign: as shown in Table 3, all four methods achieve success rates of 3% at $D = 100$, and cmaes achieves the lowest best value only marginally ahead of jso. This cost-benefit asymmetry motivates a practical recommendation: for benchmark users employing methods with super-linear scaling in $D$, such as any covariance matrix adaptation variant, $D = 50$ is recommended as the default experimental setting. At $D = 50$, lracmaes and cmaes complete 30 runs in approximately 30 and 14 min, respectively. This is a computationally tractable investment that still exposes the full multi-mechanism difficulty of the benchmark, as evidenced by the universal 3% success rates reported in Table 3. The dimensions $D = 70$ and $D = 100$ remain available for studies specifically targeting the scaling behavior of covariance-based methods, with the understanding that the associated computational cost scales as $O(D^{2})$ and may require dedicated hardware resources. For gradient-based solvers, surrogate-assisted methods, and reinforcement-learning approaches, which are identified as directions for future work in Section 4, the benchmark’s evaluation cost is negligible (well below 1 s per evaluation for all dimensions), making the full range $D \in \{24, 30, 50, 70, 100\}$ computationally accessible without restriction.
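The campaign-level figures quoted above follow from simple arithmetic on the per-run times. The helper names below are hypothetical; the per-run times are the rounded values quoted in the text for $D = 100$.

```python
RUNS_PER_CAMPAIGN = 30


def campaign_seconds(seconds_per_run, runs=RUNS_PER_CAMPAIGN):
    """Total wall-clock time of one campaign, in seconds."""
    return seconds_per_run * runs


def campaign_hours(seconds_per_run, runs=RUNS_PER_CAMPAIGN):
    """Total wall-clock time of one campaign, in hours."""
    return campaign_seconds(seconds_per_run, runs) / 3600.0


# Rounded per-run times at D = 100, as quoted in the text.
lracmaes_hours = campaign_hours(800)    # about 6.7 h
cmaes_hours = campaign_hours(380)       # about 3.2 h
fast_seconds = campaign_seconds(1.6)    # about 48 s for the eight inexpensive methods
```

The same helper can be used to budget a campaign at other dimensions once per-run times from Table 4 are available.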

The recommended upper dimension of $D = 100$ is intentional and sufficient for the purposes of this benchmark. As reported in Table 3, all ten competing methods achieve success rates of 3% at $D = 100$, representing the minimum observable rate under the 30-run protocol. This universal solver failure indicates that the benchmark has already reached a difficulty ceiling at $D = 100$: extending to $D = 200$ or $D = 500$ would not produce qualitatively new algorithmic findings but would instead amplify the computational cost of the covariance-based methods (notably cmaes, whose wall-clock time scales as $O(D^{2})$, reaching 380 s per run at $D = 100$) to levels that preclude routine experimental use. Furthermore, the combinatorial growth argument presented in Section 3 establishes that the effective number of local attractors scales as $5^{D}$, yielding $5^{100} \approx 7.9 \times 10^{69}$ at $D = 100$, a landscape complexity that is already far beyond the exhaustive coverage capacity of any population-based solver. The benchmark is therefore not limited by its upper dimension but by the fundamental intractability of its landscape structure, which manifests fully within the recommended range $D \in \{24, 30, 50, 70, 100\}$.
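The size of $5^{D}$ at the recommended dimensions can be checked directly. The sketch below only evaluates the count itself; the base of five attractors per stage is taken from the growth argument referred to above, and the function name is hypothetical.

```python
import math

DIMENSIONS = (24, 30, 50, 70, 100)


def log10_attractor_count(dim, per_stage=5):
    """Base-10 logarithm of per_stage**dim, i.e. the order of magnitude of the count."""
    return dim * math.log10(per_stage)


orders = {d: log10_attractor_count(d) for d in DIMENSIONS}

# Python integers are exact, so the digit count of 5**100 can be read off directly.
digits_at_100 = len(str(5 ** 100))
```

Since $\log_{10} 5 \approx 0.699$, each additional stage multiplies the count by five and adds about 0.7 to the exponent, so the count passes $10^{16}$ at $D = 24$ and is just under $10^{70}$ at $D = 100$.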

2.5. Parameter Sensitivity Analysis

To assess how sensitive the benchmark’s optimization landscape is to its internal coefficients, a one-at-a-time perturbation study was conducted on six parameters that directly control the primary difficulty mechanisms of the benchmark. These parameters were selected because the ablation study in Section 2.4 identified the oscillatory efficiency and multiplicative yield mechanisms as the two most consequential sources of landscape difficulty, and the six parameters studied here are the numerical coefficients that directly modulate these mechanisms. The parameters are: the oscillatory efficiency amplitude (default 0.22), which governs the depth of the sinusoidal modulation in $\eta_{i}(x)$; the resonance penalty weight (default 8.0), which scales the local ruggedness term $P_{res}$; the oscillation frequency base and amplitude (defaults 0.22 and 0.09), which jointly determine the density of local attractors through $freq_{i}$; and the yield-sensitivity base and amplitude (defaults 0.85 and 0.55), which control the stage-wise crop stress coefficient $Ky_{i}$. Each parameter was varied independently across its tested range while all others were held at their default values. For each setting, 30 independent runs of mlshaderl were executed at $D = 30$, and the mean best objective value, standard deviation, minimum, and maximum were recorded. The main effect of each parameter is defined as the difference between the highest and lowest mean best values observed across its tested range. Table 5 reports the complete results, and Figure 1 visualizes the sensitivity profiles.
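The one-at-a-time procedure and the main-effect definition can be sketched as follows. Everything numerical in the response function is a hypothetical stand-in for "mean best objective over 30 mlshaderl runs", chosen only so the example runs instantly. The parameter names and the intermediate grid points are also assumptions; the defaults (0.22, 8.0) and the range endpoints (0 to 0.33, 5 to 11) are those mentioned in the text.

```python
def toy_mean_best(params):
    """Hypothetical response surface; NOT fitted to the reported results."""
    return 100.0 - 50.0 * params["osc_amplitude"] + 2.0 * params["resonance_weight"]


def main_effects(defaults, ranges, mean_best):
    """Vary one parameter at a time; main effect = max mean - min mean over its range."""
    effects = {}
    for name, values in ranges.items():
        means = []
        for value in values:
            setting = dict(defaults)  # all other parameters stay at their defaults
            setting[name] = value
            means.append(mean_best(setting))
        effects[name] = max(means) - min(means)
    return effects


defaults = {"osc_amplitude": 0.22, "resonance_weight": 8.0}
ranges = {
    "osc_amplitude": [0.0, 0.11, 0.22, 0.33],
    "resonance_weight": [5.0, 8.0, 11.0],
}
effects = main_effects(defaults, ranges, toy_mean_best)
ranking = sorted(effects, key=effects.get, reverse=True)
```

In a real study `toy_mean_best` would be replaced by a routine that runs the reference solver 30 times on the benchmark instantiated with the given coefficients and returns the mean of the final values.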

The one-at-a-time perturbation protocol adopted in this study is a deliberate and well-established choice for benchmarks whose primary goal is mechanism transparency rather than global parameter optimization. Single-factor variation isolates the individual contribution of each coefficient to the observed landscape difficulty, a property that would be obscured by factorial or Latin hypercube designs in which multiple parameters vary simultaneously. This interpretability is essential for benchmark users who wish to adjust a specific difficulty mechanism without inadvertently altering others. The analysis was conducted at $D = 30$ because this dimension lies at the transition between partial and universal solver failure in the full formulation, making it the most sensitive and informative setting for detecting landscape changes induced by parameter perturbations. The use of mlshaderl as the reference solver is justified by its consistent position among the top-performing methods across all dimensions and variants in Section 2.4, its low run-to-run variability (SD below 15 units at $D = 30$), and its archive-based mechanism, which provides stable convergence behavior that makes mean best objective values a reliable proxy for landscape difficulty rather than a reflection of solver-specific search artifacts.

The main effect column in Table 5 provides a clear ranking of parameter importance. The oscillatory efficiency amplitude exhibits by far the largest main effect (397.27), confirming quantitatively the conclusion drawn from the ablation study: this single coefficient is the dominant lever controlling the benchmark’s difficulty. As the amplitude increases from 0 to 0.33, the mean best objective value decreases monotonically from 2899.34 to 2502.08, reflecting the fact that stronger oscillatory modulation reduces the effective irrigation efficiency and thereby shifts the entire cost-yield trade-off toward lower absolute objective values. The standard deviation remains stable across all settings (between 5.88 and 18.21), indicating that the parameter affects the location of the optimal basin rather than the reliability of convergence. The resonance penalty weight ranks second with a main effect of 203.05. Its influence is monotonically increasing: higher weights raise the mean best value from 2636.94 (at weight 5) to 2839.99 (at weight 11), consistent with the expected behavior that a stronger resonance penalty adds cost to all feasible solutions proportionally. The remaining four parameters, namely the oscillation frequency base (main effect 18.19), oscillation frequency amplitude (14.81), yield-sensitivity amplitude (5.05), and yield-sensitivity base (3.83), exhibit substantially smaller main effects, indicating that the benchmark’s landscape structure is robust to moderate perturbations of these coefficients. In particular, the near-negligible sensitivity to the $Ky$ base and amplitude (main effects below 5.1) confirms that the specific numerical values of the crop stress coefficients have only a marginal influence on the achievable objective values at this dimension. The multiplicative yield mechanism contributes to the deceptive structure of the objective primarily through its aggregation form, not through the precise values of its coefficients.

Figure 1 visualizes the sensitivity profiles reported in Table 5. The top-left panel confirms the dominant and monotonically decreasing relationship between the oscillatory efficiency amplitude and the mean best objective value. The top-right panel shows the monotonically increasing effect of the resonance weight. The middle panels reveal that the oscillation frequency parameters produce only mild, non-monotonic variations within a narrow band, suggesting that the density of local attractors is relatively stable across the tested frequency range. The bottom panels confirm the near-flat sensitivity to the yield-sensitivity coefficients, with variations that fall within the inter-run standard deviation and are therefore not statistically distinguishable from noise.

To assess whether the parameter importance ranking is consistent across dimensions, the analysis was extended to $D = 50$; Table 6 and Figure 2 report results for both dimensions. The ordinal ranking is preserved: the oscillatory efficiency amplitude retains its dominant position (main effect 962.89 at $D = 50$, versus 397.27 at $D = 30$), the resonance penalty weight ranks second at both dimensions (356.97 at $D = 50$ and 203.05 at $D = 30$), and the oscillation frequency parameters remain negligible. The yield-sensitivity base and amplitude, which exhibit small non-zero effects at $D = 30$ (3.83 and 5.05), collapse to exactly zero at $D = 50$, reflecting the structural property that at this dimension the shorter stage window ($\Delta t = 2.4$ days) allows high-quality solvers to consistently find near-zero-stress solutions, at which point the $Ky_{i}$ coefficient vanishes from the stage yield computation. This dimensional transition is consistent with the ablation findings and does not reduce benchmark difficulty: all ten methods achieve 3% success rates at $D = 50$.

2.6. Comparative Difficulty Assessment on Classical Benchmarks

To position the difficulty of the proposed benchmark relative to established test functions, the nine optimizers that are common to both experiments (arq, clpso, cmaes, ga, jso, lmcmaes, lracmaes, mlshaderl, and acor) were evaluated on four classical continuous optimization benchmarks at $D = 30$ and $D = 50$ under the same experimental protocol (30 independent runs, identical evaluation budget). The four benchmark functions were selected to represent distinct difficulty profiles that are individually present in the proposed irrigation benchmark but rarely combined in a single classical test function.

The Ackley function [35] is a widely used multimodal benchmark with a nearly separable structure and an exponentially decaying envelope that creates a broad basin surrounding the global optimum. It was selected because its multimodal character provides a baseline for assessing how the solvers handle oscillatory local structure in the absence of inter-variable coupling. The Rastrigin function [36] is a strongly multimodal, separable function whose cosine-based oscillatory landscape produces a dense lattice of local minima a structure that is directly analogous to the oscillatory efficiency modulation

η_{i} (x)

embedded in the proposed benchmark. Its inclusion allows a controlled comparison between separable and non-separable oscillatory difficulty. The Rosenbrock function [37] is a unimodal but non-separable function characterized by a narrow curved valley, making it a standard test for the ability of optimizers to follow non-separable gradient structure. It was selected to isolate the contribution of non-separability to search difficulty, independent of multimodality. The Schwefel function [38] features a deceptive global structure in which the second-best local minimum is geometrically distant from the global optimum, creating a landscape that misleads solvers toward suboptimal regions. This deceptive character is analogous to the preferred-window penalty

P_{win}

and the multiplicative yield aggregation

Y_{rel}

in the proposed benchmark. Table 7 reports the complete results.
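For readers who want to reproduce the comparison, the four classical functions can be sketched as follows. This is a minimal sketch that assumes the common textbook forms (unshifted, unrotated); the exact variants, search domains, and success thresholds used for Table 7 are not restated here, so those details should be treated as assumptions.

```python
import math

# Textbook (unshifted, unrotated) forms; the variants used in the study may differ.

def ackley(x):
    """Multimodal, nearly separable; global minimum 0 at x = 0."""
    n = len(x)
    mean_sq = sum(v * v for v in x) / n
    mean_cos = sum(math.cos(2.0 * math.pi * v) for v in x) / n
    return -20.0 * math.exp(-0.2 * math.sqrt(mean_sq)) - math.exp(mean_cos) + 20.0 + math.e

def rastrigin(x):
    """Separable with a dense lattice of local minima; global minimum 0 at x = 0."""
    return 10.0 * len(x) + sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) for v in x)

def rosenbrock(x):
    """Unimodal-like, non-separable curved valley; global minimum 0 at x = (1, ..., 1)."""
    return sum(100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (1.0 - x[i]) ** 2
               for i in range(len(x) - 1))

def schwefel(x):
    """Deceptive; global minimum near 0 at x_i = 420.9687 (usual domain [-500, 500])."""
    return 418.9828872724339 * len(x) - sum(v * math.sin(math.sqrt(abs(v))) for v in x)
```

Evaluating each function at its known optimum (for example `rastrigin([0.0] * 30)`) is a quick sanity check before running any optimizer on it.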

The results in Table 7 reveal a striking contrast between the difficulty of the classical benchmarks and that of the proposed irrigation benchmark. On Ackley at

D = 30

, seven out of nine solvers (arq, cmaes, jso, lmcmaes, lracmaes, mlshaderl, and acor) achieve success rates of 100%, and cmaes, jso, lmcmaes, lracmaes, and mlshaderl maintain 100% at

D = 50

. This confirms that multimodality alone, when combined with near-separability and a smooth global envelope, is readily handled by modern continuous optimizers. The Rastrigin function, which presents a denser lattice of local minima, is more challenging: at

D = 50

, cmaes, lmcmaes, and lracmaes achieve 100%, while arq drops to 10% and most other methods fall to 3%. Nevertheless, the best-performing solvers maintain high reliability, indicating that separable oscillatory structure, even when dense, does not constitute a fundamental barrier for methods with adequate covariance adaptation or restart mechanisms.

The Rosenbrock function introduces non-separability without multimodality. At

D = 30

, arq, jso, and lmcmaes achieve 100%, while mlshaderl achieves 90% and cmaes 87%. At

D = 50

, cmaes maintains a success rate of 93%, demonstrating that non-separable valley structure alone remains tractable for covariance-based methods. The Schwefel function presents the closest difficulty profile to the proposed benchmark among the four classical functions: at

D = 30

, only mlshaderl achieves a meaningful success rate (40%), and at

D = 50

, mlshaderl drops to 10% and all other methods to 7% or below. The deceptive global structure of Schwefel thus represents a genuine challenge, but one that is still partially navigable by archive-based methods.

By comparison, the proposed irrigation benchmark at

D = 30

yields a maximum success rate of only 10% (mlshaderl), and at

D = 50

, all nine methods achieve the minimum rate of 3% without exception. This represents a qualitative step beyond even the Schwefel function in terms of search difficulty. The explanation lies in the simultaneous presence of mechanisms that the classical functions present only in isolation: the proposed benchmark combines the dense oscillatory local structure of Rastrigin (through

η_{i} (x)

), the non-separable coupling of Rosenbrock (through the recursive storage

S_{i}

and the neighbor load

N_{i}

), the deceptive global structure of Schwefel (through

P_{win}

and the multiplicative

Y_{rel}

), and a stateful path dependence that is entirely absent from all four classical functions. This combination creates a landscape in which no single algorithmic capability, whether covariance adaptation, archive-based memory, or probability density estimation, is sufficient to ensure reliable convergence, confirming that the proposed benchmark occupies a difficulty regime that is not adequately represented by existing classical test functions.

2.7. Comparison with CEC 2011 Real-World Optimization Problems

To directly address the relationship between the proposed benchmark and established real-world optimization problem suites, the ten optimizers were additionally evaluated on four problems drawn from the CEC 2011 real-world optimization problem suite [14]. The selected problems were chosen to represent cases in which modern continuous optimizers retain partial or full tractability, thereby providing a meaningful upper bound on solver capability against which the difficulty of the proposed benchmark can be calibrated. The four problems span dimensions

D \in {12, 15, 40, 96}

and cover distinct engineering domains: circular antenna array design (AntennaArray,

D = 12

), static economic load dispatch (ELD3,

D = 15

; ELD4,

D = 40

), and hydrothermal generation scheduling (Hydrothermal,

D = 96

). None of the four problems has an analytically known global optimum, making the comparison methodology consistent with the black-box evaluation framework used throughout this study. The same experimental protocol was applied: 30 independent runs per method with an identical function evaluation budget. Table 8 reports the complete results.

The results reveal a clear and interpretable difficulty gradient across the four CEC 2011 problems. AntennaArray (

D = 12

) is solved to global optimality by jso and mlshaderl in all 30 runs, and arq achieves a 60% success rate, confirming that this problem, despite its real-world origin, presents a landscape that is fully tractable for archive-based and adaptive differential evolution methods. ELD3 (

D = 15

) remains partially tractable: mlshaderl achieves a success rate of 43%, acor 30%, and arq 20%, while the remaining methods report rates of 3%. At

D = 40

, ELD4 represents the most tractable large-scale instance in the CEC 2011 suite tested here: acor achieves a success rate of 63%, the highest observed across all CEC 2011 problems, and mlshaderl attains 3% with competitive best values. At the largest dimension tested, Hydrothermal (

D = 96

), arq achieves 20% and mlshaderl 10%, while all remaining methods report 3%, indicating that the problem remains partially navigable for methods with adaptive memory or restart mechanisms even at this scale.

Comparing these results with the proposed irrigation benchmark, the key observation is that the proposed benchmark achieves comparable or greater difficulty at substantially lower dimensions. The proposed benchmark yields a maximum success rate of 10% (mlshaderl) at

D = 30

and universal 3% rates across all ten methods at

D = 50

, whereas ELD4 at

D = 40

still permits a 63% success rate for acor and Hydrothermal at

D = 96

remains partially tractable for two methods. This dimensional efficiency of difficulty is attributable to the proposed benchmark’s simultaneous combination of oscillatory efficiency modulation, recursive state dependence, and multiplicative yield aggregation, mechanisms that the CEC 2011 problems present only in partial and domain-specific forms. The method ranking observed across the CEC 2011 suite, with jso and mlshaderl consistently leading and ga and gwo consistently trailing, is broadly consistent with the ranking observed on the full irrigation benchmark. Notably, lmcmaes and lracmaes, which are competitive on the irrigation benchmark, achieve only 3% success rates across all CEC 2011 problems, with lracmaes exhibiting very large standard deviations on the Hydrothermal instance, suggesting that their advantage is specific to the stateful and oscillatory structure of the proposed benchmark rather than general across real-world problem families. The performance of ten optimizers on four CEC 2011 real-world problems is presented in Table 8.

2.8. Statistical Validation

To confirm that the performance differences observed in Section 2.4 are statistically significant rather than artifacts of run-to-run variability, a nonparametric statistical analysis was conducted on the 30 final objective values obtained by each method on the full benchmark formulation (

ALL

) across all five dimensions. To complement the p-values and quantify the magnitude of performance differences among methods, Kendall’s W coefficient of concordance was computed for each dimension as

W = \frac{χ_{F}^{2}}{n (k - 1)}

, where χ_{F}^{2} is the Friedman chi-square statistic, n = 30 is the number of independent runs, and k = 10 is the number of methods. The resulting values, reported in Table 9, range from 0.636 (

D = 30

) to 0.939 (

D = 100

) across the five dimensions, indicating moderate to strong concordance among the ten optimizers’ rank orderings. Values of W above 0.7 are conventionally interpreted as strong agreement in the ranking structure, confirming that the performance hierarchy identified by the mean Friedman ranks is not only statistically significant but also practically meaningful in terms of the magnitude of inter-method differences. The Friedman test was applied independently at each dimension to assess whether the ten methods produce significantly different rank distributions. Pairwise comparisons were then performed using Wilcoxon signed-rank tests with Holm correction to control the family-wise error rate. Table 9 reports the mean Friedman ranks for each method at each dimension, together with the overall average rank and the Friedman test p-values.
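The relationship between the Friedman statistic, the mean ranks, and Kendall's W can be made concrete with a short sketch. It assumes that each independent run is treated as one block, that lower objective values rank better, and that no tie correction is applied; the function names are illustrative and are not taken from the study's software.

```python
def average_ranks(values):
    """Rank values ascending (1 = smallest), giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda j: values[j])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2.0 + 1.0  # average of the 1-based positions pos+1 .. end+1
        for t in range(pos, end + 1):
            ranks[order[t]] = avg
        pos = end + 1
    return ranks

def friedman_and_kendall(results):
    """results[r][m] = final objective of method m in run r (lower is better).

    Returns (chi2_F, W, mean_ranks) without tie correction.
    """
    n = len(results)      # number of runs (blocks)
    k = len(results[0])   # number of methods
    rank_sums = [0.0] * k
    for row in results:
        for m, r in enumerate(average_ranks(row)):
            rank_sums[m] += r
    chi2 = 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)
    w = chi2 / (n * (k - 1))  # Kendall's W as defined in the text
    return chi2, w, [s / n for s in rank_sums]
```

With n = 30 and k = 10, W = 1 corresponds to every run producing the same ordering of the ten methods, and W = 0 to rank sums that are identical across methods.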

The Friedman test rejects the null hypothesis of equal method performance at all five dimensions with p-values below

10^{- 28}

, confirming that the observed differences are highly significant. The mean ranks identify jso as the best-performing method overall (average rank 2.73), followed by lracmaes (2.91), mlshaderl (3.33), and cmaes (4.39). The methods ga (9.05) and gwo (9.79) consistently occupy the lowest ranks, confirming their poor suitability for this benchmark. Notably, the ranking of acor shifts substantially across dimensions: it achieves its best rank at

D = 50

(1.23) but deteriorates to 8.07 at

D = 100

, suggesting that its probability density estimation mechanism is highly effective at moderate dimensions but loses precision as the search space grows. Similarly, lracmaes achieves its best rank at

D = 70

(1.00) but deteriorates at

D = 100

(4.60), reflecting a comparable dimension-dependent behavior. Pairwise Wilcoxon tests with Holm correction confirm that jso and mlshaderl differ significantly from ga and gwo at all dimensions (p < 0.001), while the differences among the top methods (jso, lracmaes, mlshaderl, cmaes, and acor) are significant at most but not all dimensions, reflecting the competitive and complementary nature of their search strategies. Figure 3 visualizes the rank structure across dimensions, and Figure 4 shows the corresponding distributions of final objective values.
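The Holm correction applied to the pairwise Wilcoxon p-values is a simple step-down procedure, sketched below. The sketch assumes the raw p-values have already been obtained from a signed-rank test (for instance with `scipy.stats.wilcoxon`); the function name `holm_adjust` is illustrative only.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values, in the original order.

    The i-th smallest raw p-value (i = 0, 1, ...) is multiplied by (m - i),
    capped at 1, and forced to be non-decreasing along the sorted order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda j: p_values[j])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, j in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[j])
        running_max = max(running_max, candidate)
        adjusted[j] = running_max
    return adjusted
```

A pairwise difference is then declared significant at level α when its adjusted p-value falls below α, which controls the family-wise error rate over all comparisons at that dimension.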

Figure 3 confirms visually that jso and lracmaes maintain consistently low ranks across all dimensions, while ga and gwo are stably positioned at ranks 9–10. The intermediate methods (mlshaderl, cmaes, acor, arq, lmcmaes, clpso) exhibit dimension-dependent rank shifts that reflect the varying effectiveness of their search mechanisms as the landscape complexity scales with D.

Figure 4 complements the rank analysis by showing the spread of final objective values. The tight interquartile ranges of jso, lracmaes, and mlshaderl at

D = 24

and

D = 30

contrast sharply with the wide distributions of ga and gwo, which extend over hundreds of objective-value units. At

D = 70

and

D = 100

, all distributions widen substantially, but the median values of jso and lracmaes remain consistently below those of the other methods, confirming the rank-based findings.

3. Discussion

Before discussing the experimental findings, it is useful to provide a quantitative estimate of the landscape complexity induced by the benchmark’s primary difficulty mechanisms. The oscillatory efficiency law

η_{i} (x)

contains the term

\sin^{2} (freq_{i} \cdot x_{i} + φ_{i})

, which over the domain

x_{i} \in [0, 80]

completes approximately

freq_{i} \cdot 80 / π \approx 0.22 \cdot 80 / π \approx 5.6

half-cycles per variable at the default frequency. Each half-cycle generates one local attractor in the efficiency landscape, so the single-variable oscillatory structure alone produces approximately 5–6 local basins per stage. Since the efficiency also depends on the neighbor-coupled load

N_{i}

through the second sinusoidal term, and since the storage state

S_{i}

propagates effects from all preceding stages, the number of combinatorially distinct local configurations grows multiplicatively with D. A conservative lower bound on the number of local attractors in the full problem is therefore on the order of

5^{D}

, yielding approximately

5^{24} \approx 6 \times 10^{16}

at

D = 24

and

5^{50} \approx 9 \times 10^{34}

at

D = 50

. While not all of these correspond to true local minima of the composite objective (many are eliminated or merged by the smoothness, budget, and deficit penalties), this estimate explains why even sophisticated population-based methods with typical population sizes of 100–500 individuals cannot exhaustively cover the basin structure at moderate dimensions. The recursive storage propagation further compounds this difficulty by introducing a sequential dependency chain of length D: a perturbation at stage i can alter the storage states

S_{i + 1}, \dots, S_{D}

and thereby shift the effective landscape seen by all subsequent variables. This path dependence means that the effective dimensionality of the problem, in terms of the number of coupled decisions that must be simultaneously correct, is closer to

D^{2}

than to D, consistent with the empirical observation that solver success rates collapse between

D = 30

and

D = 50

rather than scaling smoothly with dimension.
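The arithmetic behind this estimate can be checked directly. The short sketch below uses only the numbers quoted above (default frequency 0.22, domain upper bound 80, and roughly five basins per stage); the constant and function names are illustrative.

```python
import math

FREQ_DEFAULT = 0.22  # default oscillation frequency quoted in the text
X_MAX = 80.0         # upper bound of the domain x_i in [0, 80]

def half_cycles(freq=FREQ_DEFAULT, x_max=X_MAX):
    """Number of pi-length periods of sin^2(freq * x + phi) over [0, x_max]."""
    return freq * x_max / math.pi

def basin_lower_bound(dim, basins_per_stage=5):
    """Conservative count of local configurations, basins_per_stage ** dim."""
    return basins_per_stage ** dim

print(f"half-cycles per variable: {half_cycles():.1f}")
for dim in (24, 30, 50, 70, 100):
    print(f"D = {dim:3d}: 5^D is about {basin_lower_bound(dim):.1e}")
```

Even at D = 24 the bound exceeds any population size by many orders of magnitude, which is the point of the estimate rather than its precise value.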

The most salient observation across all five dimensions of the full benchmark is the consistent superiority of jso, lracmaes, and mlshaderl over the remaining seven methods. This outcome is not incidental. The method jso employs a self-adaptive differential evolution strategy with weighted mutation and crossover operators that balances exploration and exploitation effectively in landscapes with moderate ruggedness. The method mlshaderl relies on a historical archive mechanism that provides implicit memory of previously explored regions and allows the search distribution to contract toward promising basins without premature commitment. The method lracmaes employs a learning rate adaptation mechanism within the CMA-ES framework that adjusts the step-size and covariance updates dynamically, allowing it to maintain effective search precision across the benchmark’s coupled and oscillatory landscape structure. In a landscape characterized by recursive state dependence, repeated local oscillatory structure, and multiplicative coupling between stages, these capabilities are precisely what distinguish reliable convergence from stagnation. By contrast, methods such as ga and gwo, which rely on simpler population recombination or swarm attraction heuristics, consistently fail to maintain sufficient diversity in the presence of the benchmark’s multiple deceptive attractors, as evidenced by their large standard deviations and their stable positioning at Friedman ranks 9–10 across all dimensions. The dimension-scaling behavior observed in the full formulation is consistent with the combinatorial growth of non-separable interactions: the recursive storage propagation links the effective landscape seen by variable

x_{i}

to decisions made at all preceding stages

x_{1}, \dots, x_{i - 1}

, creating an effective dependency chain whose length grows linearly with D. As D increases, the probability that a randomly initialized trajectory can locate and remain in the correct basin across all stages simultaneously diminishes rapidly, making global search substantially harder than dimension alone would suggest for a separable or smoothly non-separable problem.
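The forward propagation of a single decision through the storage chain can be illustrated with a deliberately simplified recursion. The update rule below (a clipped water balance with hypothetical `capacity` and `s0` values) is an assumption made purely for illustration; it is not the benchmark's storage equation, and only the dependency pattern is meant to carry over.

```python
def storage_trajectory(x, demand, capacity=100.0, s0=20.0):
    """Toy recursion: states[i] is the storage entering stage i (hypothetical model)."""
    states = [s0]
    for xi, di in zip(x[:-1], demand[:-1]):
        # Clipped water balance: add irrigation, subtract demand, keep within [0, capacity].
        states.append(min(capacity, max(0.0, states[-1] + xi - di)))
    return states

def affected_stages(x, demand, stage, delta=1.0):
    """Indices of storage states that change when x[stage] is perturbed by delta."""
    base = storage_trajectory(x, demand)
    perturbed = list(x)
    perturbed[stage] += delta
    new = storage_trajectory(perturbed, demand)
    return [i for i, (a, b) in enumerate(zip(base, new)) if a != b]

if __name__ == "__main__":
    D = 10
    x = [5.0] * D
    demand = [5.0] * D
    # Perturbing stage 3 changes the storage seen by every later stage, and none earlier.
    print(affected_stages(x, demand, stage=3))
```

In this toy model the clipping at zero or at capacity can cut the propagation short, so the number of affected stages depends on the current schedule, which is one simple way a stateful objective becomes path dependent.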

The ablation protocol reveals a clear hierarchy of mechanism importance that is both informative and practically useful. The oscillatory efficiency mechanism emerges as the single most consequential source of landscape ruggedness: its removal enables multiple solvers to achieve success rates near or at 100% even at

D = 50

, a result that is unattainable in the full formulation at that dimension. This finding has a clear structural interpretation: the oscillatory modulation of

η_{i} (x)

, which depends on both the local decision

x_{i}

and the neighbor-coupled load

N_{i}

, introduces a dense lattice of local attractors that are energetically competitive with the global basin. In the absence of this mechanism, the residual landscape is sufficiently smooth to permit gradient-like local improvement strategies to succeed over repeated restarts. The recursive storage propagation and threshold drainage mechanisms occupy the second tier, each contributing meaningful but less dramatic difficulty that primarily affects the intermediate-dimensional regime (D = 30–50). The multiplicative yield aggregation, neighbor coupling, and preferred-window regularization contribute more modestly, with their effects becoming largely indistinguishable from the baseline difficulty at high dimensions. Crucially, no single ablation restores the problem to the triviality of the fully simplified variant, confirming that the benchmark’s difficulty is genuinely multi-mechanism in character.

The proposed benchmark addresses a recognized gap at the intersection of two research communities. Within continuous global optimization, the dominant benchmark suites, including the CEC and BBOB families, are composed primarily of synthetic analytical functions whose difficulty is controlled through transformation, rotation, and shifting, but which lack the semantic interpretability and state-dependent coupling of physically motivated problems. Within agricultural water management and irrigation scheduling, the available simulation-optimization frameworks are calibrated to specific field conditions and crop models and are therefore neither self-contained nor designed to be algorithmically difficult per se. The present benchmark occupies a complementary position: it is analytically self-contained and fully deterministic, requires no external data, exhibits a rich and verifiable difficulty structure through the ablation protocol, and retains an agronomic interpretation that makes its components meaningful beyond pure algorithmic testing. The comparative evaluation against four classical benchmarks in Section 2.6 confirms quantitatively that the proposed problem is substantially harder than Ackley, Rastrigin, Rosenbrock, and Schwefel at the same dimensions, due to its simultaneous combination of oscillatory ruggedness, non-separable coupling, deceptive global structure, and stateful path dependence, features that the classical functions present only in isolation. The deliberate absence of a known closed-form global optimum further distinguishes it from classical synthetic benchmarks, making it a genuine black-box problem for which algorithmic progress can only be measured empirically.

Although the proposed benchmark is not a calibrated digital twin of any specific field or crop system, the experimental findings carry implications that are transferable to the design of optimization-based irrigation decision-support tools. The ablation study demonstrates that temporal state coupling (represented here by the recursive storage mechanism) and nonlinear efficiency losses are the two features that most severely degrade optimizer performance. In practical irrigation scheduling, these correspond to the soil moisture carryover between irrigation events and to the nonlinear relationship between applied water and effectively delivered water, both of which are well-documented challenges in field-level water management [18,21,23]. The finding that archive-based, distribution-estimation, and learning-rate-adaptive methods (jso, mlshaderl, lracmaes) substantially outperform simpler population-based heuristics (ga, gwo) suggests that optimization tools deployed for real irrigation scheduling should prioritize algorithms with adaptive memory mechanisms capable of handling sequential dependencies, rather than relying on general-purpose metaheuristics without such capabilities. Furthermore, the parameter sensitivity analysis in Section 2.5 shows that the benchmark’s difficulty is primarily controlled by a small number of coefficients, most notably the oscillatory efficiency amplitude. This ranking is confirmed to be robust across dimensions: the cross-dimensional validation at D = 50 preserves the same parameter importance ordering, with the oscillatory efficiency amplitude retaining its dominant position and the yield-sensitivity coefficients collapsing to zero main effect. This means that practitioners wishing to use the benchmark as a testbed for irrigation-oriented optimization tools can tune its difficulty to match the complexity level of their target application without altering the overall problem structure.

4. Conclusions

This article has introduced a new continuous global optimization benchmark derived from the semantics of weather-coupled irrigation scheduling. The benchmark is formulated as a deterministic, box-constrained minimization problem in which the objective function integrates a recursive soil-water storage state, an oscillatory irrigation efficiency law, a multiplicative seasonal yield factor, neighbor-coupled hydraulic penalties, threshold-triggered drainage losses, and a deceptive preferred-window regularization term. The problem is analytically self-contained: all exogenous weather and crop profiles are generated from the normalized stage coordinate, requiring no external datasets. Recommended problem dimensions span from

D = 24

to

D = 100

, with the transition from moderate to high dimensionality producing a substantial and predictable increase in landscape difficulty.

The experimental study, conducted over 30 independent runs of ten established continuous optimizers across all recommended dimensions and eight problem variants, yields three principal conclusions. First, the full benchmark presents a genuinely multi-mechanism difficulty that no single state-of-the-art solver reliably resolves at

D \geq 50

: even the best-performing method, jso (average Friedman rank 2.73), achieves success rates of only 3% at the highest dimensions tested. Second, the ablation study identifies oscillatory efficiency modulation as the dominant source of landscape ruggedness, with recursive storage propagation and threshold drainage as the second tier of difficulty mechanisms; together, these components account for the majority of the difficulty gap between the full and the simplified formulations. Third, the fully simplified variant, in which all embedded difficulty mechanisms are disabled, is solved to optimality by most methods across all dimensions, confirming that the observed difficulty in the full formulation is structural rather than numerical in origin. The statistical validation through Friedman and pairwise Wilcoxon tests confirms these findings with p-values below

10^{- 28}

at all dimensions. These results collectively validate the proposed benchmark as a meaningful and scalable tool for studying continuous optimization algorithms on non-separable, stateful, and rugged landscapes with a clear agronomic interpretation.

From a practical standpoint, the proposed benchmark offers the optimization community a ready-to-use, computationally lightweight test problem whose difficulty can be continuously adjusted through the oscillatory efficiency amplitude and the resonance penalty weight, as demonstrated by the sensitivity analysis in Section 2.5 across both D = 30 and D = 50, confirming that this practical tunability is dimensionally robust. Its agronomic semantics also make it suitable as an educational or demonstrative tool for illustrating how competing operational objectives interact in seasonal resource-allocation problems. Several directions for future work are identified. First, the evaluation of gradient-based solvers, surrogate-assisted methods, and reinforcement-learning-based optimizers on the benchmark would broaden the algorithmic coverage beyond the population-based methods considered here. For such gradient-based and surrogate-assisted methods, the benchmark’s negligible per-evaluation cost makes the full dimensional range

D \in {24, 30, 50, 70, 100}

computationally accessible; by contrast, for covariance-based population methods,

D = 50

is recommended as the practical default given the

O (D^{2})

scaling of their covariance update mechanisms, as discussed in Section 2.4. Second, a multi-objective extension that separately optimizes water cost and seasonal yield, rather than combining them in a single scalar objective, would enable the study of Pareto-front structure under coupled weather-dependent dynamics. Third, a stochastic variant in which the weather profiles are drawn from parameterized distributions rather than fixed trigonometric functions would allow the investigation of robust and risk-aware optimization strategies. Fourth, a dynamic extension in which decisions are made sequentially with partial observations of the storage state between stages would transform the benchmark into a sequential decision problem amenable to reinforcement learning and model predictive control. Finally, future work could apply established exploratory landscape analysis features such as fitness distance correlation, information content, and gradient homogeneity measures to the full and ablated formulations across all recommended dimensions, providing a data-driven characterization of how each difficulty mechanism shapes the search landscape. Additionally, the orthogonal difficulty axes identified through comparison with logistics-oriented benchmark frameworks [16] (specifically, the contrast between the temporal state recursion and multiplicative yield aggregation of the proposed benchmark and the spatial decomposition and multi-objective coordination of truck–UAV path optimization frameworks) suggest that a hybrid benchmark family combining mechanisms from both domains represents a productive direction for future benchmark design.

Author Contributions

Conceptualization, A.M.G.; methodology, V.C.; software, V.C.; validation, I.G.T.; formal analysis, I.G.T.; resources, I.G.T.; visualization, D.T.; supervision, D.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Rios, L.M.; Sahinidis, N.V. Derivative-free optimization: A review of algorithms and comparison of software implementations. J. Glob. Optim. 2013, 56, 1247–1293. [Google Scholar] [CrossRef]
Sharma, P.; Raju, S. Metaheuristic optimization algorithms: A comprehensive overview and classification of benchmark test functions. Soft Comput. 2024, 28, 3123–3186. [Google Scholar] [CrossRef]
Mohamed, A.W.; Sallam, K.M.; Agrawal, P.; Hadi, A.A.; Mohamed, A.K. Evaluating the performance of meta-heuristic algorithms on CEC 2021 benchmark problems. Neural Comput. Appl. 2023, 35, 1493–1517. [Google Scholar] [CrossRef]
Naser, M.Z.; Al-Bashiti, M.K.; Tapeh, A.T.G.; Naser, A.; Kodur, V.; Hawileh, R.; Abdalla, J.; Khodadadi, N.; Gandomi, A.H.; Eslamlou, A.D. A Review of Benchmark and Test Functions for Global Optimization Algorithms and Metaheuristics. Wiley Interdiscip. Rev. Comput. Stat. 2025, 17, e70028. [Google Scholar] [CrossRef]
Kudela, J. A critical problem in benchmarking and analysis of evolutionary computation methods. Nat. Mach. Intell. 2022, 4, 1238–1245. [Google Scholar] [CrossRef]
Liang, J.J.; Qu, B.Y.; Suganthan, P.N. Problem Definitions and Evaluation Criteria for the CEC 2014 Special Session and Competition on Single Objective Real-Parameter Numerical Optimization; Technical Report 201311; Computational Intelligence Laboratory, Zhengzhou University: Zhengzhou, China, 2013. [Google Scholar]
Hansen, N.; Finck, S.; Ros, R.; Auger, A. Real-Parameter Black-Box Optimization Benchmarking 2009: Noiseless Functions Definitions; Research Report RR-6829; INRIA: Paris, France, 2009. [Google Scholar]
Awad, N.H.; Ali, M.Z.; Liang, J.J.; Qu, B.Y.; Suganthan, P.N. Problem Definitions and Evaluation Criteria for the CEC 2017 Special Session and Competition on Single Objective Real-Parameter Numerical Optimization; Technical Report; Nanyang Technological University: Singapore, 2016. [Google Scholar]
Varelas, K.; El Hara, O.A.; Brockhoff, D.; Hansen, N.; Nguyen, D.M.; Tušar, T.; Auger, A. Benchmarking large-scale continuous optimizers: The bbob-largescale testbed, a COCO software guide and beyond. Appl. Soft Comput. 2020, 97, 106737. [Google Scholar] [CrossRef]
Jamil, M.; Yang, X.-S. A literature survey of benchmark functions for global optimisation problems. J. Math. Model. Numer. Optim. 2013, 4, 150–194. [Google Scholar] [CrossRef]
Piotrowski, A.P.; Napiorkowski, J.J.; Piotrowska, A.E. Metaheuristics should be tested on large benchmark sets with various numbers of function evaluations. Swarm Evol. Comput. 2024, 90, 101680. [Google Scholar] [CrossRef]
Piotrowski, A.P.; Napiorkowski, J.J.; Piotrowska, A.E. Choice of benchmark optimization problems does matter. Swarm Evol. Comput. 2023, 83, 101378. [Google Scholar] [CrossRef]
Kudela, J.; Matousek, R. New benchmark functions for single-objective optimization based on a zigzag pattern. IEEE Access 2022, 10, 8262–8278. [Google Scholar] [CrossRef]
Das, S.; Suganthan, P.N. Problem Definitions and Evaluation Criteria for CEC 2011 Competition on Testing Evolutionary Algorithms on Real World Optimization Problems; Technical Report; Jadavpur University: Kolkata, India, 2010. [Google Scholar]
Li, Z.; Zhou, Y.; Zhang, S.; Song, J.; Li, X. A large-scale continuous optimization benchmark suite with versatile coupled heterogeneous modules. Swarm Evol. Comput. 2023, 79, 101280. [Google Scholar] [CrossRef]
Guo, J.; Qin, Y.; Wu, C.H.; Chen, Z.-S. Multi-objective optimization of multimodal logistics paths: A decomposition-based approach for coordinating trucks and UAVs within the cyber-physical internet framework. Int. J. Logist. Res. Appl. 2026, 1–29.
Jamal, A.; Linker, R.; Housh, M. Real-time irrigation scheduling based on weather forecasts, field observations, and human-machine interactions. Water Resour. Res. 2023, 59, e2023WR035810.
Allen, R.G.; Pereira, L.S.; Raes, D.; Smith, M. Crop Evapotranspiration: Guidelines for Computing Crop Water Requirements, 2nd rev. ed.; FAO Irrigation and Drainage Paper 56; Food and Agriculture Organization of the United Nations: Rome, Italy, 2025.
Doorenbos, J.; Kassam, A.H. Yield Response to Water; Irrigation and Drainage Paper 33; Food and Agriculture Organization of the United Nations: Rome, Italy, 1979.
Wang, D.; Cai, X. Irrigation scheduling—Role of weather forecasting and farmers’ behavior. J. Water Resour. Plan. Manag. 2009, 135, 364–372.
Linker, R.; Kisekka, I. Model-based simulation-optimization of irrigation scheduling—A field evaluation with processing tomatoes. Smart Agric. Technol. 2023, 4, 100234.
Li, X.; Zhang, J.; Cai, X.; Huo, Z.; Zhang, C. Simulation-optimization based real-time irrigation scheduling: A human-machine interactive method enhanced by data assimilation. Agric. Water Manag. 2023, 276, 108059.
Zhao, Y.; Li, G.; Li, S.; Luo, Y.; Bai, Y. A review on the optimization of irrigation schedules for farmlands based on a simulation-optimization model. Water 2024, 16, 2545.
Brest, J.; Maučec, M.S.; Bošković, B. Single objective real-parameter optimization: Algorithm jSO. In 2017 IEEE Congress on Evolutionary Computation (CEC); IEEE: Piscataway, NJ, USA, 2017; pp. 1311–1318.
Chauhan, D.; Trivedi, A.; Shivani. A multi-operator ensemble LSHADE with restart and local search mechanisms for single-objective optimization. arXiv 2024, arXiv:2409.15994.
Liang, J.J.; Qin, A.K.; Suganthan, P.N.; Baskar, S. Comprehensive learning particle swarm optimizer for global optimization of multimodal functions. IEEE Trans. Evol. Comput. 2006, 10, 281–295.
Hansen, N.; Ostermeier, A. Completely derandomized self-adaptation in evolution strategies. Evol. Comput. 2001, 9, 159–195.
Loshchilov, I. A computationally efficient limited memory CMA-ES for large scale optimization. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2014); ACM: New York, NY, USA, 2014; pp. 397–404.
Loshchilov, I. LM-CMA: An alternative to L-BFGS for large-scale black-box optimization. Evol. Comput. 2017, 25, 143–171.
Nomura, M.; Akimoto, Y.; Ono, I. CMA-ES with learning rate adaptation. arXiv 2024, arXiv:2401.15876.
Holland, J.H. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence, 2nd ed.; MIT Press: Cambridge, MA, USA, 1992.
Mirjalili, S.; Mirjalili, S.M.; Lewis, A. Grey Wolf Optimizer. Adv. Eng. Softw. 2014, 69, 46–61.
Socha, K.; Dorigo, M. Ant colony optimization for continuous domains. Eur. J. Oper. Res. 2008, 185, 1155–1173.
Charilogis, V.; Tsoulos, I.G.; Gianni, A.M.; Tsalikakis, D. ARQ: A cohesive optimization design for stable performance on noisy landscapes. Appl. Sci. 2025, 15, 12180.
Ackley, D.H. A Connectionist Machine for Genetic Hillclimbing; Kluwer Academic Publishers: Dordrecht, The Netherlands, 1987.
Rastrigin, L.A. Systems of Extremal Control; Nauka: Moscow, Russia, 1974. (In Russian)
Rosenbrock, H.H. An automatic method for finding the greatest or least value of a function. Comput. J. 1960, 3, 175–184.
Schwefel, H.-P. Numerical Optimization of Computer Models; John Wiley & Sons: Hoboken, NJ, USA, 1981.

Figure 1. Mean best objective value as a function of each primary difficulty-controlling parameter at $D = 30$ (error bars denote $\pm 1$ standard deviation over 30 runs).

Figure 2. Mean best objective value as a function of each primary difficulty-controlling parameter at $D = 50$ (error bars denote $\pm 1$ standard deviation over 30 runs).

Figure 3. Mean Friedman ranks per dimension for the full benchmark formulation (lower is better).

Figure 4. Distribution of final objective values per dimension for the full benchmark formulation (30 independent runs).

Table 1. Symbols, parameters, and auxiliary quantities used in the mathematical formulation of the proposed benchmark.

Symbol	Interpretation/Coefficients/Comments
$D$	Recommended difficult cases: $D \in \{24, 30, 50, 70, 100\}$
$x = (x_{1}, \dots, x_{D})^{T}$	Each component is expressed in mm
$x_{i}$	Box constraint: $0 \le x_{i} \le 80$
$\Omega$	Hyper-rectangular search domain
$\Delta t$	Stage duration (days) for a 120-day horizon
$s_{i}$	Normalized stage coordinate
$\mathrm{ETo}_{i}$	Synthetic reference evapotranspiration profile
$\mathrm{Kc}_{i}$	Crop coefficient profile
$\mathrm{rain}_{i}$	Rainfall at stage $i$
$\mathrm{Ky}_{i}$	Yield sensitivity factor to water stress
$\mathrm{heat}_{i}$	Synthetic heat-intensity index
$\mathrm{wind}_{i}$	Synthetic wind index
$\mathrm{tariff}_{i}$	Unit water cost at stage $i$
$\eta_{\mathrm{base}, i}$	Base irrigation efficiency before nonlinear shaping
$\rho_{i}$	Storage recharge coefficient
$S_{i}^{\max}$	Storage capacity at stage $i$
$\mathrm{freq}_{i}$	Oscillation frequency for ruggedness terms
$\varphi_{i}$	Phase used in oscillatory penalties and efficiency laws
$\mathrm{ETc}_{i}$	Crop evapotranspiration demand at stage $i$
$x_{0}, x_{D + 1}$	Used in the definition of $N_{i}$
$N_{i}$	Local hydraulic load with coupling of neighboring decisions
$\eta_{i}(x)$	Clipped to $[0.45, 1.00]$. Main oscillatory ruggedness mechanism
$I_{i}^{\mathrm{eff}}$	Delivered irrigation after efficiency losses
$S_{1}$	Initial synthetic soil-water storage
$S_{i}$	Introduces path dependence
$A_{i}$	Total available water at stage $i$
$\mathrm{deficit}_{i}$	Unmet water demand
$\mathrm{surplus}_{i}$	Excess water beyond demand
$\mathrm{deep}_{i}$	Deep drainage loss with threshold behavior
$\mathrm{recharge}_{i}$	Storage recharge after drainage
$\mathrm{evaploss}_{i}$	Storage loss due to heat and wind
$S_{i + 1}$	Next storage state with clipping to admissible bounds
$\mathrm{clip}(z, l, u)$	Used in $\eta_{i}(x)$, $S_{i + 1}$, and $y_{i}$
$r_{i}$	Relative degree of water-demand satisfaction
$\varepsilon$	Prevents division by zero in $r_{i}$
$\mathrm{stress}_{i}$	Water stress at stage $i$
$y_{i}$	Multiplicative stage contribution to relative seasonal yield
$Y_{\mathrm{rel}}$	Relative seasonal yield factor
$f(x)$	Full deterministic minimization formulation
$C_{\mathrm{water}}$	Direct water-use cost
$C_{\mathrm{pump}}$	Pumping cost/penalty with oscillatory modulation
$\mathrm{shock}_{i}$	Weather-dependent amplifier of deficit penalties
$P_{\mathrm{def}}$	Quadratic penalty for water deficit
$P_{\mathrm{exc}}$	Penalty for excess water and deep drainage
$P_{\mathrm{smooth}}$	Temporal roughness penalty between consecutive stages
$P_{\mathrm{res}}$	Local resonance/ruggedness penalty
$P_{\mathrm{int}}$	Overloading penalty for neighboring stages and bilinear interaction
$\mu_{i}$	Center of the deceptive preferred irrigation window
$P_{\mathrm{win}}$	Mild penalty for deviation from the preferred window with oscillatory distortion
$P_{\mathrm{term}}$	Terminal storage regularization
$B_{D}$	Soft seasonal water-use target
$P_{\mathrm{budget}}$	Penalty for deviation from the seasonal budget target
$P_{\mathrm{peak}}$	Penalty for excessively high peak hydraulic loads
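
Table 1 lists the clip operator together with the box constraint of 0 to 80 mm on each decision variable and the efficiency range [0.45, 1.00]. The following Python sketch illustrates these bounds, assuming that clip is the usual clamp min(max(z, l), u). The helper names `project_to_box` and `clip_efficiency` are illustrative and do not come from the article, and projecting an out-of-range candidate onto the box is only one possible repair rule, not necessarily the one used in the reported experiments.

```python
X_MIN, X_MAX = 0.0, 80.0       # box constraint on each x_i (mm), from Table 1
ETA_MIN, ETA_MAX = 0.45, 1.00  # clipping range of eta_i(x), from Table 1


def clip(z, lower, upper):
    """Clamp z to [lower, upper]; assumed to be min(max(z, lower), upper)."""
    return min(max(z, lower), upper)


def project_to_box(x):
    """Hypothetical repair step: clamp every decision variable to [0, 80]."""
    return [clip(xi, X_MIN, X_MAX) for xi in x]


def clip_efficiency(eta_raw):
    """Clamp a raw efficiency value to the admissible range [0.45, 1.00]."""
    return clip(eta_raw, ETA_MIN, ETA_MAX)


candidate = [-5.0, 12.5, 95.0]
print(project_to_box(candidate))
print(clip_efficiency(0.30), clip_efficiency(1.20))
```

Values already inside the admissible range pass through unchanged, so the clamp only affects candidates or efficiencies that leave the bounds.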

Table 2. Abbreviations of the full benchmark and its ablation-based variants used in the experiments.

Abbreviation	Variant Name
$\mathrm{ALL}$	Full problem formulation
$\mathrm{ALL}_{\mathrm{OFF}}$	Simplified formulation with all difficulty mechanisms disabled
$\mathrm{NO}_{\mathrm{RS}}$	Formulation without recursive storage
$\mathrm{NO}_{\mathrm{OE}}$	Formulation without oscillatory efficiency
$\mathrm{NO}_{\mathrm{TL}}$	Formulation without threshold losses
$\mathrm{NO}_{\mathrm{MY}}$	Formulation without multiplicative yield aggregation
$\mathrm{NO}_{\mathrm{NC}}$	Formulation without neighbor coupling
$\mathrm{NO}_{\mathrm{PW}}$	Formulation without preferred-window regularization

Table 3. Experimental performance comparison of ten continuous optimizers on the proposed benchmark under full and ablated formulations for $D \in \{24, 30, 50, 70, 100\}$.

Full problem formulation
	DIM = 24				DIM = 30				DIM = 50				DIM = 70				DIM = 100
Method	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD
arq	1487.731	1491.585	7	2.492	2736.355	2769.033	3	19.880	6802.897	6836.089	3	21.400	11,274.480	11,388.858	3	55.893	18,376.389	18,570.721	3	125.571
clpso	1493.740	1496.367	3	1.317	2744.664	2750.093	3	3.803	6861.143	6874.249	3	8.224	11,479.413	11,527.464	3	21.961	18,845.593	18,938.996	3	39.085
cmaes	1487.731	1493.280	3	3.163	2739.460	2778.598	3	26.858	6801.328	6830.234	3	15.765	11,256.401	11,298.304	3	30.744	18,161.568	18,269.714	3	57.804
ga	1501.944	1553.715	3	40.362	2764.424	2837.572	3	47.075	6940.318	7265.921	3	212.781	11,708.871	12,329.805	3	297.791	19,259.617	20,143.551	3	398.262
gwo	1490.102	1594.579	3	51.858	2774.266	2924.252	3	71.898	7193.546	7538.496	3	213.071	12,414.926	13,117.625	3	380.817	20,123.781	21,736.517	3	712.255
jso	1487.731	1488.914	40	2.082	2737.798	2744.379	3	5.791	6788.850	6806.872	3	9.409	11,214.246	11,299.755	3	40.045	18,203.714	18,303.518	3	66.851
lmcmaes	1489.799	1500.006	3	6.183	2743.855	2783.549	3	25.405	6808.991	6855.765	3	37.657	11,296.204	11,414.791	3	80.180	18,260.436	18,496.654	3	107.633
lracmaes	1487.731	1488.998	3	1.377	2740.701	2757.368	3	18.873	6784.609	6788.553	3	4.287	11,195.222	11,212.304	3	14.762	18,375.345	18,477.491	3	51.790
mlshaderl	1487.731	1489.788	3	2.179	2736.355	2749.478	10	13.045	6787.179	6815.552	3	18.727	11,246.261	11,295.307	3	38.533	18,213.277	18,326.946	3	58.189
acor	1487.805	1488.883	3	2.007	2737.798	2772.389	3	15.015	6784.162	6785.259	3	1.446	11,248.036	11,325.857	3	52.415	19,471.031	19,575.079	3	60.994
Simplified formulation with all difficulty mechanisms disabled
	DIM = 24				DIM = 30				DIM = 50				DIM = 70				DIM = 100
Method	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD
arq	458.713	458.713	100	0.000	763.986	763.986	100	0.000	1930.028	1930.028	100	0.000	3175.246	3175.246	100	0.000	5087.797	5087.797	100	0.000
clpso	461.250	463.294	3	0.962	774.556	780.136	3	2.623	2000.168	2018.809	3	8.690	3332.601	3369.541	3	16.647	5435.455	5515.635	3	32.799
cmaes	458.713	458.713	100	0.000	763.986	763.986	100	0.000	1930.028	1930.028	100	0.000	3175.246	3175.246	100	0.000	5087.797	5087.797	100	0.000
ga	461.583	474.737	3	15.641	769.891	810.683	3	29.195	1982.424	2179.420	3	116.187	3513.445	3827.965	3	163.939	5894.631	6270.991	3	188.948
gwo	463.826	527.990	3	72.443	780.070	878.948	3	90.208	2182.478	2525.558	3	224.237	3708.027	4349.738	3	367.247	6425.097	7260.817	3	470.806
jso	458.713	458.713	100	0.000	763.986	763.986	100	0.000	1930.028	1930.028	100	0.000	3175.246	3175.246	100	0.000	5087.797	5087.797	100	0.000
lmcmaes	458.713	458.713	3	0.149	763.986	763.986	93	0.168	1930.028	1930.028	100	0.000	3175.246	3175.246	100	0.000	5087.797	5087.797	100	0.000
lracmaes	458.713	458.713	100	0.000	763.986	763.986	100	0.000	1930.028	1930.028	100	0.000	3175.246	3175.248	3	0.001	5089.173	5090.936	3	1.266
mlshaderl	458.713	458.713	100	0.000	763.986	763.986	100	0.000	1930.028	1930.028	100	0.000	3175.246	3175.246	100	0.000	5087.797	5087.797	70	0.000
acor	458.713	458.713	100	0.000	763.986	763.986	100	0.000	1930.031	1930.038	3	0.005	3185.707	3193.243	3	5.744	5426.290	5611.325	3	98.324
Formulation without recursive storage
	DIM = 24				DIM = 30				DIM = 50				DIM = 70				DIM = 100
Method	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD
arq	975.513	977.834	20	1.429	1414.878	1423.305	3	5.231	3108.616	3131.508	3	20.110	4964.462	5003.841	3	28.580	7848.444	7909.964	3	50.480
clpso	976.112	976.911	3	0.447	1417.889	1423.059	3	2.378	3151.916	3164.395	3	7.660	5058.429	5103.005	3	15.151	8111.061	8161.097	3	22.449
cmaes	975.513	989.078	3	11.665	1418.354	1431.023	3	7.930	3104.888	3116.367	3	11.213	4957.037	4967.383	3	6.713	7817.907	7827.994	3	5.893
ga	977.223	1016.610	3	42.635	1436.022	1487.685	3	39.939	3221.258	3477.833	3	136.139	5563.391	5943.078	3	217.990	8787.951	9577.756	3	417.167
gwo	1045.398	1163.494	3	125.143	1436.340	1682.507	3	141.297	3420.286	3863.558	3	280.754	5760.395	6575.522	3	346.271	9619.270	10,945.889	3	592.577
jso	975.513	976.456	63	1.333	1414.234	1417.523	7	3.408	3100.790	3111.587	3	6.973	4961.928	4977.360	3	11.998	7822.892	7857.514	3	25.081
lmcmaes	977.281	993.229	3	15.790	1416.506	1433.441	3	8.404	3106.683	3131.718	3	13.579	4965.233	4995.483	3	22.737	7821.438	7868.105	3	39.434
lracmaes	975.513	978.455	3	1.565	1414.359	1416.976	7	3.252	3100.790	3101.623	7	1.034	4955.674	4956.644	3	0.726	7828.568	7838.671	3	5.042
mlshaderl	975.513	976.549	63	1.548	1414.359	1418.438	7	4.119	3104.122	3116.823	3	9.615	4967.615	4990.841	3	13.978	7843.350	7885.378	3	34.655
acor	975.513	977.792	20	1.267	1414.359	1417.911	7	3.227	3100.790	3101.060	7	0.377	4983.008	5000.284	3	10.873	8282.674	8454.449	3	66.031
Formulation without oscillatory efficiency
	DIM = 24				DIM = 30				DIM = 50				DIM = 70				DIM = 100
Method	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD
arq	1632.997	1633.232	10	0.107	3073.615	3073.615	100	0.000	7930.717	7932.701	83	4.516	13,068.962	13,072.060	10	2.135	20,894.099	20,896.651	13	2.003
clpso	1636.952	1638.455	3	0.799	3076.882	3080.081	3	2.339	7949.203	7954.349	3	2.988	13,144.762	13,186.460	3	17.625	21,188.865	21,229.333	3	22.125
cmaes	1632.997	1633.952	23	2.284	3073.615	3076.381	97	15.154	7930.717	7931.114	97	2.173	13,068.962	13,070.019	37	1.234	20,894.099	20,894.935	53	1.218
ga	1636.230	1672.428	3	34.224	3083.783	3174.291	3	56.541	8044.712	8297.357	3	166.294	13,470.487	13,990.080	3	238.310	22,016.765	22,889.144	3	552.595
gwo	1650.391	1722.706	3	49.923	3083.116	3258.022	3	91.655	8137.097	8474.060	3	219.826	13,723.019	14,474.174	3	425.303	22,492.511	23,852.645	3	666.196
jso	1632.997	1633.214	17	0.121	3073.615	3073.615	100	0.000	7930.717	7930.717	100	0.000	13,068.962	13,070.861	23	1.858	20,894.099	20,895.528	30	1.565
lmcmaes	1633.486	1635.085	3	2.931	3073.642	3073.678	3	0.022	7930.717	7931.538	3	3.176	13,068.962	13,071.974	3	2.246	20,894.894	20,897.042	13	2.539
lracmaes	1633.279	1633.280	3	0.000	3073.615	3073.616	3	0.001	7930.717	7930.717	73	0.000	13,068.965	13,069.138	3	0.449	20,894.143	20,894.531	7	0.452
mlshaderl	1632.997	1633.223	3	0.115	3073.615	3073.615	100	0.000	7930.717	7930.717	100	0.000	13,068.962	13,071.170	13	1.956	20,894.099	20,896.439	17	2.357
acor	1633.011	1633.294	3	0.091	3073.615	3073.615	100	0.000	7930.717	7935.499	7	5.563	13,071.138	13,072.982	3	1.081	21,084.696	21,168.556	3	34.404
Formulation without threshold losses
	DIM = 24				DIM = 30				DIM = 50				DIM = 70				DIM = 100
Method	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD
arq	1864.893	1894.355	3	106.464	3653.111	3656.333	30	3.998	7022.802	7023.665	27	1.423	10,393.918	10,397.029	3	4.859	15,486.702	15,498.236	3	15.799
clpso	1880.914	1895.470	3	10.459	3656.193	3659.018	3	1.584	7052.301	7062.313	3	4.970	10,477.948	10,501.327	3	10.647	15,727.351	15,771.184	3	17.154
cmaes	1868.535	2253.792	3	283.591	3653.111	3657.221	30	5.015	7022.802	7023.291	27	0.766	10,392.891	10,394.809	3	1.230	15,486.127	15,489.603	3	3.311
ga	1890.375	1993.530	3	121.573	3661.209	3696.323	3	28.227	7109.454	7323.817	3	141.779	10,679.244	11,163.052	3	212.328	16,111.077	16,761.069	3	310.876
gwo	1876.275	1982.494	3	58.009	3681.632	3798.303	3	89.442	7293.775	7676.948	3	192.847	11,156.873	11,808.142	3	332.482	17,402.247	18,240.731	3	392.525
jso	1864.893	1925.837	50	180.655	3653.111	3653.173	97	0.343	7022.802	7023.058	47	0.712	10,392.742	10,395.087	3	2.530	15,486.132	15,490.871	3	6.198
lmcmaes	1874.383	2023.767	7	226.416	3653.111	3671.362	7	25.262	7022.802	7023.812	33	1.431	10,393.276	10,396.802	3	4.946	15,486.589	15,489.941	3	4.815
lracmaes	1866.082	2000.359	7	229.847	3653.111	3653.347	23	0.915	7022.802	7022.870	13	0.142	10,392.905	10,393.290	3	0.109	15,492.193	15,494.642	3	1.548
mlshaderl	1864.893	1906.282	17	149.276	3653.111	3654.137	73	2.510	7022.802	7023.880	37	1.575	10,392.742	10,398.363	3	7.203	15,486.678	15,491.212	3	5.269
acor	1864.982	1869.311	3	5.859	3652.158	3653.615	3	1.679	7022.802	7022.815	3	0.055	10,416.627	10,425.206	3	6.963	15,912.930	16,214.765	3	102.215
Formulation without multiplicative yield aggregation
	DIM = 24				DIM = 30				DIM = 50				DIM = 70				DIM = 100
Method	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD
arq	1472.316	1476.027	7	2.942	2727.245	2752.057	3	21.077	6799.339	6841.315	3	20.549	11,246.036	11,376.543	3	70.678	18,392.622	18,578.772	3	123.581
clpso	1476.153	1478.672	3	1.461	2734.457	2740.809	3	3.937	6852.639	6874.535	3	9.372	11,467.069	11,525.900	3	26.108	18,845.593	18,938.996	3	39.085
cmaes	1472.465	1477.851	7	3.493	2731.502	2756.669	3	18.672	6793.857	6826.933	3	17.539	11,256.395	11,298.744	3	30.580	18,161.568	18,272.164	3	57.015
ga	1481.656	1529.059	3	30.630	2765.479	2823.859	3	49.463	6968.509	7322.591	3	207.682	11,853.473	12,437.972	3	322.685	19,259.617	20,156.890	3	430.805
gwo	1508.075	1582.222	3	51.714	2820.782	2927.396	3	60.722	7192.254	7546.991	3	222.926	12,414.900	13,120.955	3	368.886	20,123.778	21,736.513	3	712.254
jso	1472.316	1473.443	37	1.994	2727.670	2735.059	3	5.929	6785.975	6806.030	3	10.308	11,217.480	11,301.822	3	35.033	18,203.714	18,304.315	3	66.759
lmcmaes	1477.356	1482.183	3	4.675	2742.427	2769.860	3	21.904	6811.731	6865.074	3	42.875	11,296.198	11,398.755	3	58.905	18,260.436	18,490.377	3	97.096
lracmaes	1473.034	1474.084	3	1.360	2735.021	2742.543	3	15.356	6784.521	6789.676	7	4.312	11,198.126	11,218.216	3	18.057	18,385.780	18,472.932	3	51.460
mlshaderl	1472.316	1474.502	7	2.605	2727.393	2735.400	7	9.267	6785.975	6811.874	3	13.395	11,246.253	11,301.409	3	40.510	18,213.277	18,329.547	3	64.529
acor	1472.367	1472.810	3	1.024	2727.393	2743.954	7	20.027	6784.075	6785.632	3	1.745	11,245.261	11,307.510	3	49.877	19,408.608	19,568.897	3	67.593
Formulation without neighbor coupling
	DIM = 24				DIM = 30				DIM = 50				DIM = 70				DIM = 100
Method	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD
arq	1173.904	1177.955	10	2.639	2323.937	2335.595	7	6.965	5917.880	5968.044	3	39.532	9995.118	10,154.059	3	84.608	16,535.843	16,766.297	3	137.910
clpso	1177.719	1180.697	3	1.700	2350.121	2359.522	3	4.194	6009.004	6035.210	3	17.856	10,243.282	10,308.106	3	37.401	17,051.053	17,186.627	3	53.299
cmaes	1173.904	1182.867	3	7.682	2328.870	2344.959	3	12.820	5891.772	5951.023	3	48.415	9882.742	10,000.354	3	68.794	16,269.821	16,355.030	3	70.085
ga	1197.748	1258.826	3	32.601	2348.744	2449.358	3	61.613	6166.781	6473.353	3	197.421	10,689.056	11,255.209	3	323.482	17,921.571	18,539.321	3	425.845
gwo	1199.106	1284.663	3	87.923	2339.466	2507.011	3	95.675	6111.330	6645.454	3	310.175	10,652.075	11,775.376	3	449.288	18,503.221	19,297.763	3	467.746
jso	1173.726	1176.723	10	1.400	2323.937	2338.011	3	5.663	5894.384	5930.383	3	19.795	9908.711	10,034.797	3	59.604	16,288.460	16,462.314	3	92.379
lmcmaes	1179.039	1203.696	3	32.864	2337.211	2359.571	3	20.307	5929.178	6010.913	3	59.615	10,110.689	10,238.065	3	88.526	16,506.789	16,674.859	3	118.819
lracmaes	1176.157	1178.108	3	1.383	2327.042	2336.720	3	4.035	5865.517	5882.216	3	7.751	9855.628	9886.397	3	28.710	16,516.611	16,656.497	3	56.740
mlshaderl	1173.726	1176.547	3	2.126	2323.937	2338.693	3	6.250	5911.172	5945.892	3	25.488	10,000.431	10,063.895	3	46.765	16,336.721	16,543.823	3	100.137
acor	1173.726	1175.564	3	1.471	2323.937	2331.094	7	5.641	5862.401	5870.091	3	6.525	9838.411	9911.777	3	64.052	17,239.386	17,515.606	3	115.561
Formulation without preferred-window regularization
	DIM = 24				DIM = 30				DIM = 50				DIM = 70				DIM = 100
Method	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD
arq	1395.760	1400.657	17	3.790	2576.706	2591.275	7	11.797	6700.900	6742.586	3	25.346	11,153.783	11,257.313	3	51.345	18,238.938	18,436.968	3	149.751
clpso	1402.706	1408.343	3	1.859	2588.897	2594.233	3	3.512	6766.626	6781.923	3	10.346	11,357.318	11,413.566	3	29.919	18,710.664	18,790.975	3	38.622
cmaes	1394.898	1404.827	3	5.592	2576.706	2613.468	3	34.971	6704.328	6730.438	3	19.319	11,112.113	11,174.732	3	41.825	18,000.788	18,106.632	3	54.223
ga	1404.806	1469.491	3	44.818	2598.572	2686.815	3	50.864	6879.986	7120.541	3	162.352	11,634.055	12,147.452	3	216.139	19,295.977	19,781.480	3	307.760
gwo	1414.224	1504.888	3	47.859	2623.085	2764.194	3	101.948	6888.531	7368.763	3	213.572	11,682.888	12,561.606	3	361.480	19,952.197	21,071.608	3	530.519
jso	1394.898	1399.671	7	3.563	2575.807	2582.159	17	7.646	6690.207	6708.794	3	11.401	11,092.580	11,170.234	3	45.786	18,065.431	18,168.921	3	62.790
lmcmaes	1403.451	1413.734	3	7.538	2581.484	2630.903	3	37.402	6719.289	6756.900	3	30.687	11,214.759	11,303.540	3	81.486	18,077.507	18,400.434	3	118.964
lracmaes	1394.899	1400.823	3	3.855	2575.807	2578.678	7	2.923	6682.107	6686.528	3	5.213	11,068.462	11,091.096	3	14.271	18,111.889	18,305.046	3	73.817
mlshaderl	1394.898	1400.339	3	3.391	2575.807	2585.873	7	10.019	6694.144	6715.681	3	13.405	11,122.670	11,178.054	3	38.886	18,065.781	18,169.767	3	75.379
acor	1395.036	1400.980	3	3.845	2575.807	2582.541	3	22.349	6681.294	6683.411	3	2.143	11,118.884	11,214.966	3	48.957	19,183.511	19,416.269	3	98.539

Table 4. Mean wall-clock time per run (seconds) for each optimizer across all benchmark dimensions on the reference hardware.

DIM	arq	clpso	cmaes	ga	gwo	jso	lmcmaes	lracmaes	mlshaderl	acor
24	0.271	0.235	2.530	0.255	0.247	0.224	0.266	4.866	0.293	0.318
30	0.323	0.291	5.128	0.311	0.301	0.270	0.322	7.165	0.338	0.389
50	0.504	0.500	27.405	0.511	0.501	0.427	0.507	59.677	0.499	0.662
70	0.687	0.723	101.566	0.807	0.810	0.668	0.802	191.645	0.756	1.164
100	1.105	1.198	380.210	1.059	1.028	0.847	1.153	800.074	0.915	1.554

Table 5. Sensitivity of the mean best objective value to perturbations of the six primary difficulty-controlling parameters at $D = 30$ using mlshaderl over 30 independent runs.

Oscillatory efficiency amplitude
Value	Mean best_f	SD	Min best_f	Max best_f	Main effect	Rank	N (runs)
0	2899.3425	10.150484	2891.7152	2947.2273	397.26529	6	30
0.11	2896.7877	18.210436	2881.535	2937.5378	397.26529	5	30
0.16	2859.9013	6.7849171	2851.5274	2875.322	397.26529	4	30
0.22	2749.478	13.045441	2736.3553	2793.9528	397.26529	3	30
0.28	2613.2021	6.5728192	2608.7653	2636.5564	397.26529	2	30
0.33	2502.0773	5.8754931	2494.45	2516.2706	397.26529	1	30
↓ Summary	min = 2502.0773, max = 2899.3425
Resonance penalty weight
Value	Mean best_f	SD	Min best_f	Max best_f	Main effect	Rank	N (runs)
5	2636.9368	15.948833	2624.0835	2683.8309	203.05285	1	30
6	2673.9207	10.639915	2663.4055	2715.9101	203.05285	2	30
7	2714.9846	13.083826	2701.1265	2746.4207	203.05285	3	30
8	2749.478	13.045441	2736.3553	2793.9528	203.05285	4	30
9	2781.6573	13.201215	2770.3507	2815.2609	203.05285	5	30
10	2808.2523	8.313845	2801.0891	2841.2095	203.05285	6	30
11	2839.9897	11.973603	2828.8022	2867.234	203.05285	7	30
↓ Summary	min = 2636.9368, max = 2839.9897
Oscillation frequency base
Value	Mean best_f	SD	Min best_f	Max best_f	Main effect	Rank	N (runs)
0.19	2760.4629	15.916262	2747.3601	2808.5189	18.191715	6	30
0.2	2761.0979	16.194502	2745.6886	2802.8418	18.191715	7	30
0.21	2757.6628	15.43244	2745.0404	2794.8335	18.191715	5	30
0.22	2749.478	13.045441	2736.3553	2793.9528	18.191715	4	30
0.23	2742.9062	13.833584	2733.0575	2788.588	18.191715	1	30
0.24	2743.8623	18.217182	2731.9766	2790.799	18.191715	2	30
0.25	2748.209	18.833278	2734.6532	2801.4822	18.191715	3	30
↓ Summary	min = 2742.9062, max = 2761.0979
Oscillation frequency amplitude
Value	Mean best_f	SD	Min best_f	Max best_f	Main effect	Rank	N (runs)
0.06	2756.655	8.1179722	2748.7863	2783.1332	14.806983	6	30
0.07	2756.3569	10.891126	2741.8655	2799.6846	14.806983	5	30
0.08	2757.7322	17.146778	2737.1872	2787.7212	14.806983	7	30
0.09	2749.478	13.045441	2736.3553	2793.9528	14.806983	4	30
0.1	2742.9252	11.015127	2734.5471	2779.0449	14.806983	1	30
0.11	2745.9791	17.060785	2734.1695	2802.8748	14.806983	3	30
0.12	2745.5804	14.356282	2735.5872	2790.0362	14.806983	2	30
↓ Summary	min = 2742.9252, max = 2757.7322
Yield-sensitivity base
Value	Mean best_f	SD	Min best_f	Max best_f	Main effect	Rank	N (runs)
0.75	2750.6365	15.694881	2736.3553	2783.3299	3.833925	4	30
0.8	2752.0043	16.593767	2736.3553	2793.9528	3.833925	5	30
0.85	2749.478	13.045441	2736.3553	2793.9528	3.833925	2	30
0.9	2748.1704	11.99798	2736.3553	2801.5953	3.833925	1	30
0.95	2750.2895	14.254968	2736.3553	2790.0821	3.833925	3	30
↓ Summary	min = 2748.1704, max = 2752.0043
Yield-sensitivity amplitude
Value	Mean best_f	SD	Min best_f	Max best_f	Main effect	Rank	N (runs)
0.45	2753.3455	16.583618	2736.3553	2801.7716	5.04879	5	30
0.5	2749.485	17.612028	2736.3553	2795.3457	5.04879	3	30
0.55	2749.478	13.045441	2736.3553	2793.9528	5.04879	2	30
0.6	2748.2967	13.825479	2736.3553	2801.5953	5.04879	1	30
0.65	2752.38	16.728042	2736.3553	2795.3457	5.04879	4	30
↓ Summary	min = 2748.2967, max = 2753.3455
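
The "Main effect" and "Rank" columns of Table 5 are consistent with a simple reading: the main effect equals the largest mean best_f minus the smallest one within a parameter block, and rank 1 goes to the setting with the lowest mean. The Python sketch below checks this reading on the "Oscillation frequency base" block. The function name `main_effect_and_ranks` is illustrative, and the interpretation is inferred from the tabulated numbers rather than stated in the table itself.

```python
# (parameter value, mean best_f) pairs for "Oscillation frequency base", from Table 5.
freq_base_means = [
    (0.19, 2760.4629), (0.20, 2761.0979), (0.21, 2757.6628), (0.22, 2749.4780),
    (0.23, 2742.9062), (0.24, 2743.8623), (0.25, 2748.2090),
]


def main_effect_and_ranks(rows):
    """Return (max mean - min mean, ranks) with rank 1 for the lowest mean."""
    means = [mean for _, mean in rows]
    main_effect = max(means) - min(means)
    # Indices of the settings sorted by ascending mean (minimization problem).
    order = sorted(range(len(means)), key=lambda i: means[i])
    ranks = [0] * len(means)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return main_effect, ranks


effect, ranks = main_effect_and_ranks(freq_base_means)
print(f"main effect = {effect:.4f}")
print("ranks =", ranks)
```

Under this reading, a large main effect (as for the oscillatory efficiency amplitude) marks a parameter that shifts the attainable objective level strongly, while a small one (as for the yield-sensitivity parameters) marks a parameter with little influence at this dimension.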

Table 6. Sensitivity of the mean best objective value to perturbations of the six primary difficulty-controlling parameters at $D = 50$ using mlshaderl over 30 independent runs.

Oscillatory efficiency amplitude
Value	Mean best_f	SD	Min best_f	Max best_f	Main effect	Rank	N (runs)
0	7331.1245	15.412423	7306.4324	7363.4434	962.89456	6	30
0.11	7211.0909	3.1717512	7204.8542	7218.6255	962.89456	5	30
0.16	7044.7745	17.057701	7030.3282	7130.8727	962.89456	4	30
0.22	6815.5524	18.726896	6787.1793	6882.7953	962.89456	3	30
0.28	6567.3928	13.60431	6544.041	6596.8341	962.89456	2	30
0.33	6368.2299	21.145834	6334.4958	6422.4896	962.89456	1	30
↓ Summary	min = 6368.2299, max = 7331.1245
Resonance penalty weight
Value	Mean best_f	SD	Min best_f	Max best_f	Main effect	Rank	N (runs)
5	6623.8839	12.906796	6606.1281	6656.1921	356.97433	1	30
6	6688.6443	10.32578	6673.505	6710.7771	356.97433	2	30
7	6748.6185	12.965195	6729.497	6777.9409	356.97433	3	30
8	6815.5524	18.726896	6787.1793	6882.7953	356.97433	4	30
9	6872.8301	18.80808	6848.0448	6948.7902	356.97433	5	30
10	6927.6072	13.015184	6912.2261	6976.0468	356.97433	6	30
11	6980.8583	8.1426039	6962.3735	7000.6067	356.97433	7	30
↓ Summary	min = 6623.8839, max = 6980.8583
Oscillation frequency base
Value	Mean best_f	SD	Min best_f	Max best_f	Main effect	Rank	N (runs)
0.19	6827.2335	14.867945	6802.1772	6858.915	11.681103	7	30
0.2	6822.7283	13.582789	6801.7755	6853.2951	11.681103	5	30
0.21	6816.0076	19.515194	6782.3074	6894.463	11.681103	2	30
0.22	6815.5524	18.726896	6787.1793	6882.7953	11.681103	1	30
0.23	6818.5625	11.495079	6800.9534	6843.9629	11.681103	3	30
0.24	6825.4638	17.385711	6795.1023	6862.376	11.681103	6	30
0.25	6820.7188	16.7757	6789.4331	6862.8348	11.681103	4	30
↓ Summary	min = 6815.5524, max = 6827.2335
Oscillation frequency amplitude
Value	Mean best_f	SD	Min best_f	Max best_f	Main effect	Rank	N (runs)
0.06	6824.0866	17.80986	6805.2846	6871.1754	25.08246	6	30
0.07	6824.6756	14.207166	6801.02	6858.4731	25.08246	7	30
0.08	6812.5168	17.558047	6777.8677	6838.7437	25.08246	5	30
0.09	6806.3246	16.964569	6780.3753	6833.5228	25.08246	4	30
0.1	6804.043	11.570701	6774.0124	6836.8462	25.08246	3	30
0.11	6803.2118	19.926939	6777.8017	6864.0155	25.08246	2	30
0.12	6799.5931	14.265252	6777.8908	6842.1455	25.08246	1	30
↓ Summary	min = 6799.5931, max = 6824.6756
Yield-sensitivity base
Value	Mean best_f	SD	Min best_f	Max best_f	Main effect	Rank	N (runs)
0.75	6815.5524	18.726896	6787.1793	6882.7953	0	1	30
0.8	6815.5524	18.726896	6787.1793	6882.7953	0	2	30
0.85	6815.5524	18.726896	6787.1793	6882.7953	0	3	30
0.9	6815.5524	18.726896	6787.1793	6882.7953	0	4	30
0.95	6815.5524	18.726896	6787.1793	6882.7953	0	5	30
↓ Summary	min = 6815.5524, max = 6815.5524
Yield-sensitivity amplitude
Value	Mean best_f	SD	Min best_f	Max best_f	Main effect	Rank	N (runs)
0.45	6815.5524	18.726896	6787.1793	6882.7953	0	1	30
0.5	6815.5524	18.726896	6787.1793	6882.7953	0	2	30
0.55	6815.5524	18.726896	6787.1793	6882.7953	0	3	30
0.6	6815.5524	18.726896	6787.1793	6882.7953	0	4	30
0.65	6815.5524	18.726896	6787.1793	6882.7953	0	5	30
↓ Summary	min = 6815.5524, max = 6815.5524

Table 7. Comparative performance of nine optimizers on four classical benchmarks and the proposed irrigation benchmark at $D = 30$ and $D = 50$ (30 independent runs).

Ackley (multimodal, nearly separable)
Method	DIM = 30				DIM = 50
	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD
arq	0.000	0.000	100	0.000	0.000	0.866	37	0.698
clpso	1.849	2.238	3	0.160	4.466	4.887	3	0.230
cmaes	0.000	0.000	100	0.000	0.000	0.000	100	0.000
ga	1.809	5.312	3	2.887	6.105	11.290	3	3.083
jso	0.000	0.000	100	0.000	0.000	0.000	100	0.000
lmcmaes	0.000	0.000	100	0.000	0.000	0.000	100	0.000
lracmaes	0.000	0.000	100	0.000	0.000	0.000	100	0.000
mlshaderl	0.000	0.000	100	0.000	0.000	0.000	100	0.000
acor	0.000	0.000	100	0.000	0.001	0.002	3	0.000
Rastrigin (multimodal, separable)
Method	DIM = 30				DIM = 50
	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD
arq	0.000	0.431	70	0.770	0.000	2.753	10	1.766
clpso	45.562	55.445	3	5.967	163.202	179.140	3	8.682
cmaes	0.000	0.000	100	0.000	0.000	0.000	100	0.000
ga	34.603	75.510	3	20.113	125.429	211.418	3	32.690
jso	0.000	0.199	97	1.090	0.000	0.033	3	0.182
lmcmaes	0.000	0.000	100	0.000	0.000	0.000	100	0.000
lracmaes	0.000	0.000	100	0.000	0.000	0.000	100	0.000
mlshaderl	0.000	3.051	3	1.751	5.970	11.475	3	2.965
acor	173.137	199.811	3	12.340	355.488	436.769	3	21.310
Rosenbrock (unimodal, non-separable)
Method	DIM = 30				DIM = 50
	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD
arq	0.000	0.000	100	0.000	0.068	2.237	3	2.012
clpso	43.541	66.203	3	12.374	137.479	182.123	3	19.021
cmaes	0.000	0.532	87	1.378	0.000	0.266	93	1.011
ga	1.403	9.352	3	13.548	40.653	121.036	3	46.509
jso	0.000	0.000	100	0.000	0.055	2.460	3	2.045
lmcmaes	0.000	0.000	100	0.000	8.812	11.744	3	1.976
lracmaes	18.976	20.076	3	0.653	44.489	45.374	3	0.413
mlshaderl	0.000	0.000	90	0.000	3.677	8.275	3	2.603
acor	22.949	23.486	3	0.169	44.703	46.257	3	0.366
Schwefel (deceptive global structure)
Method	DIM = 30				DIM = 50
	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD
arq	236.877	911.163	3	422.393	2260.948	3257.192	7	554.197
clpso	135.272	472.769	3	140.728	2993.371	4204.992	3	457.860
cmaes	3217.735	4922.821	3	649.304	6300.446	8278.018	3	937.601
ga	0.082	25.188	3	48.552	126.888	347.795	3	154.861
jso	0.000	467.833	7	331.048	1618.663	3342.078	3	1600.885
lmcmaes	4560.002	6102.950	3	856.272	7927.551	10,654.037	3	1190.889
lracmaes	8443.609	8752.957	3	196.740	14,879.073	15,977.397	3	387.892
mlshaderl	0.000	86.906	40	102.922	118.439	609.521	10	398.185
acor	7541.236	7958.509	3	183.466	14,020.958	14,871.290	3	329.079
Proposed benchmark (full formulation, $\mathrm{ALL}$)
Method	DIM = 30				DIM = 50
	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD
arq	2736.355	2769.033	3	19.880	6802.897	6836.089	3	21.400
clpso	2744.664	2750.093	3	3.803	6861.143	6874.249	3	8.224
cmaes	2739.460	2778.598	3	26.858	6801.328	6830.234	3	15.765
ga	2764.424	2837.572	3	47.075	6940.318	7265.921	3	212.781
jso	2737.798	2744.379	3	5.791	6788.850	6806.872	3	9.409
lmcmaes	2743.855	2783.549	3	25.404	6808.991	6855.765	3	37.657
lracmaes	2740.701	2757.367	3	18.873	6784.609	6788.553	3	4.287
mlshaderl	2736.356	2749.478	10	13.045	6787.179	6815.552	3	18.727
acor	2737.798	2772.389	3	15.015	6784.162	6785.259	3	1.446

Table 8. Performance of ten optimizers on four CEC 2011 real-world problems (30 independent runs).

	Circular Antenna Array Design ($D = 12$)				Economic Load Dispatch 3 ($D = 15$)
Method	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD
arq	0.006810	0.006815	60	$1.14 \times 10^{-5}$	32,367.575	32,539.303	20	135.295
clpso	0.047378	0.094675	3	0.023307	32,526.151	32,618.692	3	49.757
cmaes	0.006810	0.124647	10	0.094744	32,675.525	32,871.615	3	111.990
ga	0.007622	0.192564	3	0.073164	32,638.762	33,004.304	3	184.909
gwo	0.006960	0.048890	3	0.064171	32,393.642	32,649.655	3	151.395
jso	0.006810	0.006810	100	0.000	32,458.946	32,553.998	3	40.849
lmcmaes	0.234	0.344	3	0.050	32,836.207	33,103.976	3	201.060
lracmaes	0.007	0.078	3	0.093	32,513.975	32,637.164	3	67.684
mlshaderl	0.006810	0.006810	100	0.000	32,367.575	32,408.065	43	57.814
acor	0.006811	0.006824	3	$1.27 \times 10^{-5}$	32,367.575	32,479.012	30	108.299
	Economic Load Dispatch 4 ($D = 40$)				Hydrothermal Generation Scheduling ($D = 96$)
Method	Value	Mean	Rate%	SD	Value	Mean	Rate%	SD
arq	121,093.444	121,262.919	7	177.291	141,655.841	141,658.464	20	4.879
clpso	121,918.009	122,097.776	3	88.465	150,084.620	154,166.424	3	2425.074
cmaes	121,967.550	123,220.453	3	625.525	141,693.746	141,742.229	3	21.558
ga	122,541.381	124,917.349	3	1161.716	242,383.761	295,667.986	3	29,379.013
gwo	121,514.169	122,193.897	3	383.671	141,946.963	142,025.814	3	40.302
jso	121,151.991	121,390.273	3	194.400	141,661.644	141,675.409	3	10.117
lmcmaes	124,344.754	125,839.174	3	877.080	266,778.354	351,222.664	3	56,403.637
lracmaes	122,893.260	124,873.519	3	938.523	265,437.557	1,479,740.493	3	954,019.104
mlshaderl	121,078.445	121,168.270	3	63.379	141,655.841	141,658.216	10	2.887
acor	121,211.785	121,221.308	63	30.211	142,241.225	142,335.115	3	56.664

Table 9. Mean Friedman ranks of the ten optimizers on the full benchmark formulation across all dimensions (30 independent runs, lower rank indicates better performance).

Method	D = 24	D = 30	D = 50	D = 70	D = 100	Avg
jso	2.23	2.15	3.50	3.63	2.13	2.73
lracmaes	3.13	3.98	1.83	1.00	4.60	2.91
mlshaderl	3.33	3.17	4.30	3.37	2.47	3.33
cmaes	5.47	5.90	5.40	3.57	1.63	4.39
acor	2.97	5.60	1.23	4.23	8.07	4.42
arq	4.43	5.63	5.70	6.00	5.40	5.43
lmcmaes	7.57	6.57	6.37	6.30	4.80	6.32
clpso	7.10	3.53	7.67	7.90	6.97	6.63
ga	9.20	8.80	9.20	9.07	8.97	9.05
gwo	9.57	9.67	9.80	9.93	9.97	9.79
p-value	$2.81 \times 10^{-41}$	$2.73 \times 10^{-32}$	$1.32 \times 10^{-47}$	$3.79 \times 10^{-46}$	$1.69 \times 10^{-49}$
Kendall’s W	0.795	0.636	0.906	0.881	0.939
W interpretation	Strong	Moderate	Strong	Strong	Strong
Significant	Yes	Yes	Yes	Yes	Yes
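
As an illustration of how the quantities in Table 9 can be obtained, the following minimal Python sketch ranks the methods within each block, averages the ranks, and derives the Friedman statistic and Kendall's W from the rank sums. It assumes that each of the 30 independent runs is treated as one block and that no tie correction is applied. The function names `rank_row` and `friedman_summary` and the toy data are hypothetical and are not taken from the authors' code. The p-value is omitted; it would normally be read from a chi-square distribution with k − 1 degrees of freedom.

```python
from typing import List, Sequence, Tuple


def rank_row(values: Sequence[float]) -> List[float]:
    """Rank one block's results (1 = lowest/best), averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2.0 + 1.0  # mean of 1-based positions pos+1..end+1
        for j in range(pos, end + 1):
            ranks[order[j]] = avg_rank
        pos = end + 1
    return ranks


def friedman_summary(
    results: Sequence[Sequence[float]],
) -> Tuple[List[float], float, float]:
    """results[r][m] is the objective value of method m in block (run) r.

    Returns (mean_ranks, chi2, kendall_w). No tie correction is applied,
    which is an assumption of this sketch.
    """
    n = len(results)       # number of blocks (runs)
    k = len(results[0])    # number of methods
    rank_sums = [0.0] * k
    for row in results:
        for m, r in enumerate(rank_row(row)):
            rank_sums[m] += r
    mean_ranks = [s / n for s in rank_sums]
    # Friedman chi-square statistic from the rank sums.
    chi2 = 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)
    # Kendall's W is the Friedman statistic normalised to [0, 1].
    kendall_w = chi2 / (n * (k - 1))
    return mean_ranks, chi2, kendall_w


# Toy data (hypothetical): 4 runs, 3 methods, lower is better.
toy = [
    [1.0, 2.0, 3.0],
    [1.0, 3.0, 2.0],
    [1.0, 2.0, 3.0],
    [2.0, 1.0, 3.0],
]
mean_ranks, chi2, w = friedman_summary(toy)
print(mean_ranks, chi2, w)  # [1.25, 2.0, 2.75] 4.5 0.5625
```

Under these assumptions the two statistics are linked by χ² = N(k − 1)W, so with N = 30 and k = 10 the value W = 0.795 reported for D = 24 would correspond to χ² ≈ 214.65. The mean ranks in any column must also sum to k(k + 1)/2 = 55, which the columns of Table 9 satisfy up to rounding.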
