Intelligent Optimization in Satellite Communication Protocols: Methods, Applications, and Practical Limits

Tsochev, Georgi

doi:10.3390/electronics15071473

Open Access Review

by

Georgi Tsochev

Department of Intelligent Technologies in Industry, Faculty of Computer Systems and Technology, Technical University of Sofia, 1000 Sofia, Bulgaria

Electronics 2026, 15(7), 1473; https://doi.org/10.3390/electronics15071473

Submission received: 3 March 2026 / Revised: 25 March 2026 / Accepted: 27 March 2026 / Published: 1 April 2026

(This article belongs to the Special Issue Advances in Satellite/UAV Communications)

Abstract

Satellite communication protocols are increasingly optimized in software-defined, multiorbital networks that combine broadband satellite systems, non-terrestrial 5G components, and inter-satellite transport. This review examines intelligent optimization across the physical, medium-access, network, and transport layers, with emphasis on what can be measured, what can be controlled, and what can be safely deployed under standards and operational constraints. This paper first positions the literature across DVB/ETSI, 3GPP NTN, CCSDS/DTN, LEO routing, and recent AI and digital-twin research. It then links standards-defined control surfaces to layer-specific measurements, feedback delays, and safety constraints and compares optimization families using deployment-relevant criteria such as observability, runtime predictability, verification burden, and robustness. The review argues that the central challenge is not only a simulation-to-reality gap but an evidence gap between experimental gains and operational trust. To address this gap, this paper analyzes delayed observability, rare events, bounded onboard compute, action surface mismatch, certification, and security; formalizes a generic constrained optimization problem with delayed observations and standards-compliant actions; and proposes a digital-twin-assisted research methodology supported by a worked beam-hopping example. The main conclusion is that future progress is most likely to come from hybrid, standards-compliant, and twin-assisted optimization methods whose performance claims are tied to calibration, traceability, and explicit rollback logic.

Keywords:

satellite communications; non-terrestrial networks; protocol optimization; link adaptation; scheduling; congestion control; reinforcement learning; digital twins

1. Introduction

1.1. Motivation

Satellite communications remain indispensable for wide-area coverage, resilience, and connectivity in remote, maritime, aeronautical, and disaster-affected regions, and they are increasingly positioned as a structural component of 5G and beyond non-terrestrial networks (NTNs). Modern systems span geostationary (GEO), medium-Earth-orbit (MEO), and low-Earth-orbit (LEO) deployments, combine transparent and regenerative payload options, and expose far richer software control surfaces than earlier generations of fixed satellite systems [1]. On the standards side, DVB-S2/S2X and DVB-RCS2 define practical broadband adaptation hooks [2,3,4,5,6], while 3GPP studies and specifications have progressively integrated satellite access into the 5G system, from the initial NR feasibility study in TR 38.811 [7] and the solution study in TR 38.821 [8] to the current Release 19 work on Phase-3 integration, security, and management [9,10,11]. This transition makes protocol-level adaptation—not only static link budgeting—a first-class engineering problem.

1.2. Why Satellite Protocol Optimization Is Harder than in Terrestrial Networks

These architectural shifts multiply the protocol control knobs available to designers and operators: adaptive coding and modulation (ACM) thresholds, return-link access probabilities, scheduler priorities, beam-hopping patterns, routing weights, handover triggers, gateway association rules, and transport-layer parameters. DVB/ETSI standards expose many of these knobs directly [2,3,4,5,6], while IP/transport guidance and 3GPP NTN procedures constrain how they can be adjusted [7,8,12,13,14,15,16,17,18]. Unlike terrestrial systems, satellite control loops must act under long or variable round-trip times, geometry-driven visibility changes, feeder-link bottlenecks, quantized or delayed feedback, strict power budgets, and limited onboard compute [1,7,8,13,14]. The optimization problem is therefore not simply to maximize throughput but to trade throughput, delay, fairness, robustness, and compliance under unusually hostile control conditions.

1.3. Literature Gap and the Need for a Deployment-Centered Perspective

Recent surveys have covered artificial intelligence and machine learning for satellite operations [19,20], moving-topology routing and LEO networking [21,22,23,24], 6G NTN integration and roadmaps [25,26,27,28,29,30], and network digital twins in both standards and research communities [31,32,33,34,35,36]. What is still missing is a deployment-centered synthesis that connects those strands through measurement assumptions, admissible control knobs, and evidence requirements. For example, beam-hopping studies can assume per-beam demand and interference snapshots richer than typical operations, administration, and maintenance (OAM) exports [37,38,39], while learning-based routing papers can report delay gains without specifying the telemetry cadence, loop-free fallback, or policy-governance logic required by an operational stack [23,24]. This article therefore treats deployment itself as a research problem, rather than as a final implementation afterthought.

1.4. Scope and Key Contributions of This Article

This article focuses on protocol-level optimization rather than on antenna, waveform-hardware, or payload-design problems in isolation. Its scope is the set of decisions that change how satellite systems adapt, schedule, share, route, and validate communication resources across the physical, medium-access control (MAC), network, and transport layers. Within that scope, this article makes six contributions. First, it positions the literature across DVB/ETSI SATCOM, 3GPP NTN, CCSDS/DTN, moving-topology LEO networking, and digital-twin research. Second, it maps standards families to concrete optimization hooks, measurable inputs, feedback delays, and safety constraints. Third, it compares major method families through a deployment lens that includes observability, runtime predictability, verification burden, and operator governance. Fourth, it develops a deeper analysis of practical limits and barriers to deployment. Fifth, it extracts from that analysis a standalone research methodology for twin-assisted satellite protocol studies and illustrates it with a worked beam-hopping example. Sixth, it prioritizes future research directions by near-term, medium-term, and longer-term deployment horizons.

The remainder of this article is organized as follows. Section 2 positions the review against adjacent strands of related work, and Section 3 connects optimization opportunities to major standards families. Sections 4, 5, and 6 survey optimization entry points, method families, and application patterns. Section 7 analyzes the principal deployment barriers, while Section 8 converts those lessons into a research methodology for digital-twin-assisted protocol studies. Sections 9, 10, and 11 then distill deployable engineering patterns, formulate future research questions, and conclude this review.

2. Related Work

2.1. DVB and Broadband Satellite Foundations

Protocol-level optimization in satellite communications first matured in the DVB and broadband SATCOM community. DVB-S2 [2] and DVB-S2X [3] define concrete modulation, coding, and framing options; GSE [4] and DVB-RCS2 [5,6] define practical encapsulation and return-link procedures. This strand produced operational work on ACM and variable coding and modulation (VCM) tuning, queue-aware capacity assignment, and conservative fade-margin engineering. Its main strength is immediate deployability: even when the optimization logic is sophisticated, the resulting actions remain inside standards-defined mode tables and signaling procedures. Its main limitation is that cross-layer feedback is usually sparse and strongly gateway-centric.

2.2. HTS Multi-Beam Scheduling and Beam Hopping

With high-throughput satellites and aggressive frequency reuse, the literature shifted from single-link adaptation to joint scheduling, beam hopping, and interference-aware resource allocation. Zheng et al. [40] analyze joint beam scheduling and power optimization in multi-beam systems; Chen et al. [37] propose a two-stage probabilistic framework for beam hopping; Guo et al. [38] study interference-aware multidimensional scheduling in non-geostationary orbit (NGSO) constellations; and Wang et al. [39] use reinforcement learning (RL) for hotspot-oriented beam-hopping control. Together, these studies show that combinatorial scheduling and interference coupling have become central protocol concerns rather than peripheral payload-planning problems. A recurrent deployment gap, however, is that many evaluations assume richer per-beam state visibility and more direct actuation than real operations typically expose.

2.3. 3GPP NTN Evolution from Baseline Feasibility to Phase-3 Integration

3GPP introduced a second, partly independent, strand of related work by reframing satellite access as a standardized component of the 5G system. TR 38.811 [7] identified the baseline timing, Doppler, hybrid automatic repeat request (HARQ), and mobility issues for NR over NTN; TR 38.821 [8] organized solution directions; TS 38.300 [12] anchored the overall NR/NG-RAN system description; and TR 28.808 [41] brought management and orchestration into scope. More recent Release 19 studies extend the discussion to Phase-3 architecture integration [9], security [10], and management of Phase-2 [11]. For protocol optimization research, this strand matters because it defines which control knobs are standardized, which remain implementation-specific, and which procedures must remain deterministic for interoperability.

2.4. AI/ML for SATCOM Operations and Protocol Control

The operational AI/ML literature has grown from anomaly detection and traffic prediction to configuration recommendation and closed-loop assistance. Vázquez et al. [19] highlighted early operational uses of machine learning in SATCOM operations; Fontanesi et al. [20] provided a broader 2025 survey of AI for satellite communication; Bui et al. [25] connected digital twins with integrated satellite-terrestrial control; and Aygul et al. [26] discussed machine learning-driven TN/NTN integration for 6G. This literature motivates hybrid optimization pipelines, but it also exposes a recurring weakness: public telemetry abstractions, comparable benchmarks, and governance-ready evaluation protocols remain scarce.

2.5. LEO Routing, Transport, and Moving-Topology Control

Research on LEO constellations has created a distinct protocol literature focused on time-varying graphs, frequent topology reconfigurations, and the tension between routing optimality and state-dissemination overhead. Shan et al. [21] quantify the amount of link-state information needed in LEO routing; Karapantazis and Papapetrou [22] revisit on-demand routing in LEO systems; RFC 9717 [23] frames a routing architecture for satellite networks with scheduled connectivity changes; and Shi et al. [24] explore graph-neural-network and deep-Q-network routing on moving graphs. Recent direct-satellite-to-device work such as Constellation as a Service [42] broadens this strand toward multi-constellation connectivity management and preconfigured handover paths. The remaining gap is that routing, handover, and transport behavior are still too often studied in isolation.

2.6. DTN, CCSDS, and Mission-Style Networking

At the high-delay and intermittent-connectivity extreme, CCSDS and the delay-/disruption-tolerant networking (DTN) community provide a mature alternative lineage. AOS [43], Proximity-1 [44], Space Packet [45], LTP [46,47], Bundle Protocol [48,49], and CCSDS security profiles [50,51] were not designed for consumer broadband, but they offer concrete mechanisms for store-and-forward operation, custody, contact-aware forwarding, and mission-grade interoperability. This body of work becomes directly relevant whenever low-density constellations, sparse feeder connectivity, or intermittent inter-satellite links make always-on IP assumptions unrealistic.

2.7. 6G-Native TN/NTN Integration and Experimental Roadmaps

An emerging body of work moves beyond incremental NTN adaptation and asks what a native 6G TN/NTN system should look like. Wang et al. [27], the 6G-NTN Consortium white paper [28], and 6G-IA roadmaps [29,30] argue for a tighter integration between terrestrial and non-terrestrial segments, software-defined payloads, onboard processing, and shared experimentation infrastructure. Recent studies also push toward communication–computing integration: Zhou et al. [52] study mission-driven resource scheduling in satellite-terrestrial networks, and Jiang et al. [53] analyze collaborative perception and computing offloading in 6G air–ground integrated networks. This literature is forward-looking rather than fully deployment-ready, but it is valuable for identifying where the next wave of protocol optimization problems will arise.

2.8. Positioning of This Review

Compared with these strands, the present review is centered on protocol-level optimization under deployment constraints. It does not treat AI, NTN standards, LEO routing, or DTN as separate silos; instead, it asks where they intersect in actual control loops, which method families fit the observability and compute constraints of satellite networks, and how digital twins can be used to turn simulation results into auditable deployment evidence.

Table 1 maps the main literature clusters and the type of optimization problem each cluster addresses. To move beyond breadth alone, Table 2 re-reads representative recent studies through the evidence-gap lens used in this review.

Viewed this way, the main issue is rarely that prior studies are uninteresting; it is that the observation contract, action admissibility, and governance logic are often left implicit. This motivates the deployment-centered analysis developed in later sections.

3. Standards Mapping

Protocol optimization in satellite systems is constrained by standards. Standards define framing, signaling, timing, security, and interoperability requirements, which in turn determine where optimization can be inserted safely. For broadband and broadcast SATCOM, DVB and ETSI standards specify physical-layer framing and modulation in DVB-S2/S2X [2,3], network-layer packet encapsulation in GSE [4], and return-channel procedures in DVB-RCS2 [5,6].

In cellular NTN, 3GPP specifications and study reports define how NR and the 5G system operate over satellite links, including timing relations, Doppler compensation assumptions, random-access behavior, mobility management, and management/orchestration functions [7,8,9,10,11,12,41]. These documents do not dictate a single optimizer, but they bound which actions may remain interoperable and which control loops must preserve deterministic behavior.

In space-mission links, CCSDS provides an end-to-end family of standards spanning space data-link protocols such as AOS and Proximity-1 [43,44], packet formats such as Space Packet [45], DTN mechanisms such as BP and LTP [46,47,48,49], and security profiles such as SDLS and the network-layer security adaptation profile [50,51]. These standards are conservative about non-deterministic behavior because interoperability across missions and agencies is a primary goal.

Finally, when IP is used over satellite, the Internet protocol stack and its guidance documents become relevant. RFC 2488 [13], RFC 3135 [14], and RFC 3449 [15] formalize common mitigation patterns for TCP over satellite, including performance-enhancing proxies (PEPs), acknowledgment management, and asymmetry handling; QUIC introduces path migration and transport flexibility [16]; IPv6 and IPsec shape the network-layer baseline [17,18].

In hybrid deployments, the main difficulty is not only coexistence but semantic translation between control planes. A provider may use 3GPP NTN procedures for access and mobility [7,8,12], DVB/ETSI mechanisms for broadband framing and return-link scheduling [2,3,4,5,6], and IP or DTN overlays for transport or mission traffic [13,14,15,16,17,18,46,47,48,49]. In such settings, QoS or service intent from the 5G side must be mapped onto DVB scheduler classes and beam/gateway resources, while timing, mobility, and security events do not arrive on identical clocks or with identical observability. Likewise, DTN contact-aware decisions can conflict with end-to-end transport assumptions if store-and-forward behavior is not made visible to upper layers. These cross-standard coordination constraints bound what an “optimal” controller can safely do even before algorithm choice is considered.

From an optimization viewpoint, standards can therefore be read as a list of controllable variables (knobs) and hard constraints. Table 3 summarizes the main standards families and their optimization hooks, while Table 4 makes the engineering bridge to Section 4 by listing, layer by layer, what is normally measurable, what can actually be adjusted, how stale the feedback is likely to be, and which safety constraints dominate decision-making.

Three recurring architecture patterns are especially important for protocol-level review. In a bent-pipe GEO/HTS system, most protocol intelligence remains in terminal and gateway modems; the satellite chiefly forwards waveforms, so the practical control surfaces are ACM/VCM tables, return-link grants, feeder-gateway selection, and beam-level capacity assignment. In a regenerative NTN access architecture, part of the scheduler, mobility logic, or base-station function moves closer to the space segment, which can shorten some control paths but also tightens timing, determinism, and update-governance requirements. In an ISL-rich LEO architecture, the network behaves more like a moving routed backbone: routing snapshots, handover guards, and transport behavior interact continuously. Figure 1 summarizes these recurring patterns and shows why standards mapping is inseparable from the question of where protocol state is observed and where actions are applied [1,2,3,4,5,6,7,8,9,10,11,12,23,42,43,44,45,46,47,48,49].

Figure 1 can be read from left to right as a progression from raw traffic demand to operational control. The data plane flows through terminals, satellites, gateway, and the core/overlay, while telemetry and policy information flow upward and back through the NOC. This makes clear why the same optimization problem looks different in GEO broadband, NTN access, and LEO routing settings: the observation contract, control latency, and admissible command set change with the architecture.

Table 3 enumerates the standards families and their exposed optimization hooks. Table 4 complements it with a layer-wise view of what is usually measurable, what is actually controllable, how stale the feedback is likely to be, and which safety constraints dominate deployment.

4. Optimization Entry Points in the Satellite Protocol Stack

Optimization opportunities can be organized by protocol layer and by the time scale of the control loop. At fast time scales (milliseconds to seconds), physical-layer adaptation and per-frame scheduling dominate. At slower time scales (seconds to hours), routing, gateway selection, and traffic engineering respond to demand shifts and predicted conditions such as weather attenuation. At the slowest time scales (days to months), offline design selects waveform families, access schemes, and validated controller profiles. Table 4 makes explicit why the layers differ not only by objective but also by observability, actuation delay, and safety envelope.

From a protocol-engineering perspective, the optimization landscape is a hierarchy of nested control loops rather than a single monolithic decision problem. The fastest loop selects waveform or frame-level actions using stale CSI or ACK/NACK evidence. A second loop allocates grants, slots, or beam-hopping patterns over several frames or superframes. Above these, network and transport loops update routing weights, gateway association, handover bias, or pacing over seconds to minutes. Finally, offline planning and validation loops recalibrate models, update safe envelopes, and approve rollback profiles. Figure 2 visualizes these loops and shows why time scale, evidence latency, and admissible actuation are as important as the objective function itself.

Figure 2 should be interpreted as a timing map rather than a taxonomy. Tasks migrate rightward as more context becomes available and stronger guarantees are required. The main implication is that method choice cannot be separated from control horizon: strategies suitable for minutes-scale tuning are often inappropriate for millisecond-to-second loops.

Across layers, the generic decision problem can be written in a delayed-observation, standards-constrained form:

$$y_t = h(x_{t - d_t}) + \varepsilon_t \tag{1}$$

$$(\hat{x}_t, \Sigma_t) = E(y_{0:t}, u_{0:t-1}, e_{0:t}) \tag{2}$$

$$u_t = \pi(\hat{x}_t, \Sigma_t), \quad u_t \in A_{\mathrm{std}} \tag{3}$$

$$\max_{\pi} J(\pi) = \mathbb{E}\left[\sum_{t} \gamma^{t} U(x_t, u_t)\right] \tag{4}$$

$$\Pr\{g_j(x_t, u_t) > 0\} \leq \alpha_j \quad \text{and} \quad u_t \leftarrow u_{\mathrm{safe}} \ \text{if} \ c_t < \tau \tag{5}$$

Here, $y_t$ denotes the measured telemetry, $x_t$ the latent network state, $d_t$ the measurement delay, $A_{\mathrm{std}}$ the standards-compliant action set, $\Sigma_t$ the controller’s uncertainty estimate, and $u_{\mathrm{safe}}$ a validated fallback action. Equations (1)–(5) are intentionally generic, but they capture the key deployment reality emphasized throughout this review: optimization is only useful if it respects delayed evidence, feasible control surfaces, and explicit safety constraints.
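One way to read Equations (1)–(5) operationally is as a single control step. The Python sketch below is illustrative only: the backlog-style state, the admissible grant sizes in `A_STD`, the fallback `U_SAFE`, the threshold `TAU`, the Poisson-like growth of uncertainty with delay, and the confidence map are all assumptions introduced here, not values taken from any standard or from the reviewed studies. The chance constraint of Equation (5) is only approximated, by granting for the estimate plus one standard deviation.

```python
import math

# Hypothetical standards-compliant action set A_std: grant sizes in slots per frame.
A_STD = (0, 2, 4, 8)
U_SAFE = 2   # assumed validated fallback action u_safe
TAU = 0.5    # assumed confidence threshold tau


def estimate(y_backlog, delay_frames, arrival_rate):
    """Toy estimator E: roll the delayed backlog report forward by the
    expected arrivals and let uncertainty grow with the report's age."""
    x_hat = y_backlog + arrival_rate * delay_frames
    sigma = math.sqrt(arrival_rate * delay_frames)  # Poisson-like assumption
    confidence = 1.0 / (1.0 + sigma)                # simple monotone map, not calibrated
    return x_hat, sigma, confidence


def project(target):
    """Projection onto A_std: smallest admissible grant covering the target,
    or the largest admissible grant if none does."""
    for action in A_STD:
        if action >= target:
            return action
    return A_STD[-1]


def control_step(y_backlog, delay_frames, arrival_rate=1.0):
    """One pass through estimate -> policy -> projection, with fallback."""
    x_hat, sigma, confidence = estimate(y_backlog, delay_frames, arrival_rate)
    if confidence < TAU:
        return U_SAFE                 # evidence too stale: use the fallback
    return project(x_hat + sigma)     # conservative policy pi, then projection
```

With these placeholder numbers, a fresh report of three queued packets yields a grant of 4 slots, a one-frame-old report yields 8, and a four-frame-old report triggers the fallback of 2. The point of the sketch is the ordering of the steps, not the specific estimator or policy.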

4.1. Physical Layer: Adaptive Waveforms, Power, and Beam Control

Physical-layer optimization includes selecting modulation and coding modes, adjusting pilot and framing options, allocating power across carriers and beams, and controlling precoding or beamforming where applicable. DVB-S2 and DVB-S2X explicitly support constant coding and modulation (CCM), VCM, and ACM, but delayed and quantized feedback can make aggressive adaptation unstable. Robust margins and conservative fallback modes are therefore common in operational systems, with packet error rate (PER) or outage targets acting as safety constraints [2,3,54].

4.2. Medium Access and Resource Management

MAC-layer decisions include return-link random access configuration, grant allocation, frame structure selection, scheduling across users and beams, and interference coordination in frequency reuse systems. Many of these are mixed-integer problems because users are scheduled in discrete slots and beams are activated in discrete patterns (e.g., beam hopping). In practice, scalable heuristics and decompositions dominate, often augmented by learned predictors or offline-tuned parameters [5,6,37,38,39].

4.3. Network and Transport: Routing, Traffic Engineering, and Congestion Control

Above the MAC layer, optimization targets end-to-end performance. In LEO constellations with ISLs, routing and traffic engineering must adapt to a moving topology with intermittent links. In IP-based deployments, transport-layer behavior becomes a major determinant of user experience; classic guidance emphasizes TCP enhancements, PEP placement, and careful ACK and pacing behavior over asymmetric or high-latency paths. DTN protocols provide an alternative design point when disruption dominates, at the cost of operational complexity [13,14,15,16,21,22,23,46,47,48,49].
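The moving-topology aspect can be made concrete with a time-sliced view of the network. The following is a minimal sketch, assuming that the constellation is represented as a list of per-interval snapshot graphs with link delays in milliseconds; the node names, delays, and function names are hypothetical. It runs a standard Dijkstra search on each snapshot and reports disconnection explicitly, which is the case in which a store-and-forward (DTN-style) design point becomes relevant.

```python
import heapq


def shortest_path(graph, src, dst):
    """Dijkstra on one snapshot. graph maps node -> {neighbor: delay_ms}."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if dst not in dist:
        return None, float("inf")  # no end-to-end path in this slice
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1], dist[dst]


def route_over_snapshots(snapshots, src, dst):
    """One (path, delay) result per time slice."""
    return [shortest_path(g, src, dst) for g in snapshots]


# Hypothetical three-slice example: link B->D disappears, then the path breaks.
snapshots = [
    {"A": {"B": 5, "C": 4}, "B": {"D": 5}, "C": {"D": 10}},
    {"A": {"B": 5, "C": 4}, "C": {"D": 10}},
    {"A": {"B": 5}},
]
routes = route_over_snapshots(snapshots, "A", "D")
```

In this toy example the route changes from A-B-D (10 ms) to A-C-D (14 ms) when one link leaves visibility, and the third slice has no path at all. A deployable controller would additionally penalize path changes between slices to limit handover churn, which this sketch does not attempt.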

4.4. Time Scales and Control Loops

A recurring cause of instability is mixing control loops with incompatible time scales. For example, a fast scheduler reacting to short-term queue variations can conflict with a slower congestion controller reacting to RTT-scale signals. Deployable intelligent optimization therefore tends to be hierarchical: slow controllers compute policy parameters and safety envelopes, while fast controllers execute simple, bounded-time decisions [55,56].

Tables 5 and 6 show that the dominant design axis is not simply the protocol layer, but the combination of control-loop time scale, evidence latency, and admissible action set. As control moves upward in the stack, instantaneous feedback becomes less reliable and governance constraints become more visible.

5. Method Families for Intelligent Protocol Optimization

This section surveys major optimization and learning families used to improve satellite protocol behavior. We emphasize what each family assumes about models and data, how it scales, and what it can guarantee about feasibility and stability.

In a deployable system, these method families rarely appear in isolation. A typical optimizer first normalizes telemetry and reconstructs hidden state, then passes candidate actions through one or more engines: deterministic or convex allocation for hard feasibility, robust or stochastic margin setting for uncertainty, surrogate or Bayesian optimization for expensive parameter tuning, and learning-based ranking only where contextual adaptation is safe. Before a recommendation reaches the live network, it must be projected onto standards-compliant knobs and screened by confidence thresholds, rate limits, and rollback logic. Figure 3 presents this pipeline view, which is closer to operational practice than method-by-method benchmarking in isolation [55,57,58,59,60,61,62].

Figure 3 is not a single mandatory software architecture. Rather, it is a review-oriented abstraction of how different method families are often combined in practice. The central point is that learning and search components become more trustworthy when they operate inside a model-based feasible envelope and under explicit deployment gates.
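The final stage of such a pipeline, the deployment gate, can be sketched in a few lines. The class below is a hypothetical illustration rather than an operational design: the knob values, the step limit, and the KPI floor are placeholders, and confidence screening is left out for brevity. It shows three of the gate functions named above: projection onto an admissible knob table, rate limiting of changes, and rollback to a validated baseline.

```python
class DeploymentGate:
    """Screens optimizer recommendations before they reach the live network."""

    def __init__(self, allowed, baseline, max_step=1, kpi_floor=0.95):
        self.allowed = sorted(allowed)   # admissible knob values (assumed finite)
        self.baseline = baseline         # validated rollback value; must be in `allowed`
        self.current = baseline
        self.max_step = max_step         # max table positions moved per decision
        self.kpi_floor = kpi_floor       # rollback trigger on a normalized KPI

    def propose(self, value):
        """Project a raw recommendation onto the table, then rate-limit it."""
        target = min(range(len(self.allowed)),
                     key=lambda i: abs(self.allowed[i] - value))
        here = self.allowed.index(self.current)
        step = max(-self.max_step, min(self.max_step, target - here))
        self.current = self.allowed[here + step]
        return self.current

    def report_kpi(self, kpi):
        """Roll back to the baseline if the observed KPI falls below the floor."""
        if kpi < self.kpi_floor:
            self.current = self.baseline
        return self.current


gate = DeploymentGate(allowed=[0.1, 0.2, 0.3, 0.4], baseline=0.2)
first = gate.propose(0.38)        # wants 0.4, but may move only one position
second = gate.propose(0.38)       # reaches 0.4 on the next decision
after_kpi = gate.report_kpi(0.90) # KPI below floor: back to the baseline
```

Keeping the gate separate from the optimizer means that the same screening logic applies whether the recommendation came from a convex solver, a surrogate search, or a learned ranker.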

5.1. Deterministic Model-Based Optimization

When performance models are sufficiently accurate and tractable, deterministic optimization remains the most reliable tool. Convex optimization and geometric programming yield repeatable solutions with interpretable trade-offs, and they are amenable to verification. However, satellite protocol problems often involve hard nonlinearities (coding thresholds, amplifier saturation) and discrete decisions (slot/beam assignment), necessitating approximations, decomposition, or mixed-integer formulations [58,59].

5.2. Stochastic and Robust Optimization Under Uncertainty

Uncertainty arises from fading and weather attenuation, demand variability, interference, and intermittent connectivity. Robust optimization protects against worst-case uncertainty sets, while stochastic and chance-constrained formulations trade expected performance against outage risk. These methods are especially useful for link adaptation with delayed feedback and for planning decisions such as gateway diversity or contact scheduling [54,60].
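A simple worked case shows how a chance constraint turns into a link margin. Assume, purely for illustration, that the error of a delayed SNR estimate is zero-mean Gaussian with standard deviation σ (in dB). Then the requirement Pr{SNR < θ} ≤ α for a mode with threshold θ is equivalent to requiring that the estimate minus z_{1−α}·σ is at least θ, where z_{1−α} is the standard normal quantile. The sketch below computes that margin with Python's standard library; the Gaussian error model and all numbers are assumptions, and real fading statistics are typically heavier-tailed.

```python
from statistics import NormalDist


def fade_margin_db(sigma_db, alpha):
    """Margin z_{1-alpha} * sigma implied by a Gaussian chance constraint."""
    return NormalDist().inv_cdf(1.0 - alpha) * sigma_db


def mode_is_admissible(snr_est_db, threshold_db, sigma_db, alpha):
    """True if the outage probability bound alpha is met under the Gaussian assumption."""
    return snr_est_db - fade_margin_db(sigma_db, alpha) >= threshold_db


margin = fade_margin_db(sigma_db=1.5, alpha=0.01)  # roughly 3.5 dB
```

For σ = 1.5 dB and α = 1%, the margin is about 3.5 dB, so a mode with a 6 dB threshold is admissible at an estimated 10 dB, whereas a mode with a 7 dB threshold is not. Tightening α or aging the estimate (larger σ) increases the margin, which is the mechanism by which delayed feedback pushes operational systems toward conservative modes.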

5.3. Combinatorial and Mixed-Integer Optimization

Scheduling, beam-hopping patterns, gateway selection, and routing are typically combinatorial. Mixed-integer programs provide a clean modeling framework and can generate offline benchmarks, but real-time operations require scalable approximations (greedy construction, local search, and decomposition). A common deployment pattern is to solve a relaxed problem, round to a feasible schedule, and then repair violations using domain-specific rules.
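The relax-round-repair pattern can be illustrated on a deliberately small problem: dividing an integer number of slots among beams in proportion to demand. This is a sketch under simplifying assumptions (a single capacity constraint, no interference coupling, hypothetical function and variable names); the repair rule used here, removing slots from the beams that rounding over-served most, stands in for the domain-specific rules an operator would actually apply.

```python
import math


def relax_round_repair(demand, total_slots):
    """demand: requested slots per beam (non-negative integers)."""
    total_demand = sum(demand)
    if total_demand <= total_slots:
        return list(demand)  # no contention, nothing to optimize

    # Relax: fractional proportional share of the slot budget.
    frac = [d * total_slots / total_demand for d in demand]

    # Round: round up, which may violate the slot budget.
    alloc = [math.ceil(f) for f in frac]

    # Repair: remove one slot from the beams that rounding over-served most
    # until the budget is met again.
    excess = sum(alloc) - total_slots
    by_overshoot = sorted(range(len(demand)),
                          key=lambda i: alloc[i] - frac[i], reverse=True)
    for i in by_overshoot[:excess]:
        alloc[i] -= 1
    return alloc


schedule = relax_round_repair([5, 3, 2], total_slots=7)  # -> [4, 2, 1]
```

Here the fractional shares are 3.5, 2.1, and 1.4; rounding up gives 4, 3, and 2 (nine slots for a budget of seven), and the repair step removes one slot each from the second and third beams. In a real scheduler, the repair step is where adjacency, interference, or payload rules would be enforced.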

5.4. Metaheuristics and Simulation-Based Search

Metaheuristics (genetic algorithms, particle swarm optimization, tabu search) remain popular when objectives are non-convex, non-differentiable, or only available through simulation. Their flexibility is valuable for protocol parameter tuning and complex payload constraints, but naive implementations are sample-inefficient. In satellite contexts, metaheuristics are most practical when paired with surrogates or when used offline for configuration design.

5.5. Surrogate Models, Bayesian Optimization, and Digital Twins

Surrogate modeling replaces expensive simulations or field experiments with learned predictors. Bayesian optimization (BO) uses these surrogates to select informative experiments, making it a natural choice for tuning protocol parameters under limited evaluation budgets. Digital twins extend this idea by coupling calibrated models with streaming telemetry, enabling safe “what-if” evaluation before rollout. The main risk is model drift: continuous validation and uncertainty quantification are required to prevent overconfident decisions [57,61].
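Continuous validation against drift can be as simple as tracking how far twin predictions deviate from subsequent telemetry. The monitor below is a minimal sketch, assuming a scalar KPI, a rolling mean absolute error as the drift statistic, and a fixed tolerance; the class name, window length, and threshold are hypothetical and would need calibration per KPI.

```python
from collections import deque


class DriftMonitor:
    """Rolling check of twin predictions against measured telemetry."""

    def __init__(self, window=50, max_mae=1.0):
        self.errors = deque(maxlen=window)  # most recent absolute errors
        self.max_mae = max_mae              # tolerated mean absolute error

    def trusted(self):
        """The twin is trusted only with evidence and a small recent error."""
        if not self.errors:
            return False
        return sum(self.errors) / len(self.errors) <= self.max_mae

    def update(self, predicted, measured):
        """Record one prediction/measurement pair and return the trust flag."""
        self.errors.append(abs(predicted - measured))
        return self.trusted()
```

A what-if recommendation from the twin would then be forwarded only while `trusted()` holds; once the flag drops, the sensible reaction is recalibration rather than continued optimization against a stale model.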

5.6. Reinforcement Learning and Bandit Methods for Online Decisions

Reinforcement learning (RL) addresses sequential decision problems where actions affect future state, such as scheduling under queues, routing under congestion, or adaptive access control. Contextual bandits are a lighter-weight alternative for decisions with weak long-term coupling, such as selecting an ACM mode based on coarse channel context. In satellite networks, long feedback delays and limited safe exploration make unconstrained online RL risky; deployable designs therefore use offline training, conservative constraints, and policy selection among pre-validated candidates [55,62].
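Selection among pre-validated candidates can be sketched as a bandit with an outage screen. The class below combines the textbook UCB1 index with a simple empirical rule that removes a mode once its observed failure rate exceeds a target; the class name, the rates, and the thresholds are illustrative assumptions, and mode 0 is assumed to be the validated fallback. Note that a mode can still fail up to `min_trials` times before it is screened out, so this is constrained exploration, not risk-free exploration.

```python
import math


class ConstrainedUCB:
    """UCB1 over a finite set of modes, with an empirical outage screen."""

    def __init__(self, rates, max_fail=0.1, min_trials=10, c=1.0):
        self.rates = list(rates)      # payoff per successful use of each mode
        self.max_fail = max_fail      # tolerated empirical failure rate
        self.min_trials = min_trials  # evidence needed before screening
        self.c = c                    # exploration weight
        self.trials = [0] * len(self.rates)
        self.fails = [0] * len(self.rates)

    def _screened_out(self, i):
        # Mode 0 is the assumed fallback and is never removed.
        if i == 0 or self.trials[i] < self.min_trials:
            return False
        return self.fails[i] / self.trials[i] > self.max_fail

    def select(self):
        allowed = [i for i in range(len(self.rates)) if not self._screened_out(i)]
        for i in allowed:
            if self.trials[i] == 0:
                return i              # try each admissible mode once
        total = sum(self.trials)

        def ucb(i):
            success = 1.0 - self.fails[i] / self.trials[i]
            bonus = self.c * math.sqrt(math.log(total) / self.trials[i])
            return self.rates[i] * success + bonus

        return max(allowed, key=ucb)

    def update(self, i, ok):
        self.trials[i] += 1
        if not ok:
            self.fails[i] += 1
```

In an offline-first workflow, the trial and failure counters would be initialized from twin or historical data, so that online operation starts with most unsafe modes already screened out.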

5.7. Distributed and Multi-Agent Optimization

Satellite networks are naturally distributed across gateways, satellites, and terminals. Distributed optimization and multi-agent learning attempt to coordinate decisions under partial information and communication constraints. Hierarchical schemes—local control with slower global coordination—are often the most deployable approach because they limit signaling overhead and reduce instability from simultaneous adaptation.

Table 7 uses operationally defined criteria rather than vague scenario labels. “System model required” denotes how much explicit analytical structure the method needs before it can be trusted; “online telemetry need” denotes the amount and freshness of runtime data required during deployment; “tight-loop suitability” asks whether worst-case execution time can realistically be bounded for millisecond-to-second control; and “verification effort” reflects how difficult it is to justify the resulting controller to operators and reviewers.

The main practical lesson is that successful systems rarely rely on a single method family. A common hybrid pattern is to let deterministic or robust optimization define the feasible envelope and use a lighter learning component only to rank or select among pre-validated options (Table 8). Chen et al.’s two-stage beam-hopping framework [37] and Zhou et al.’s hierarchical mission-driven scheduling [52] both fit this logic, even though they use different algorithmic ingredients. A second pattern is surrogate-assisted search, in which Bayesian optimization or learned surrogates screen configurations offline, after which a deterministic repair or projection step enforces feasibility online [57,61]. These hybrid combinations are often easier to validate than unconstrained end-to-end learning.

6. Applications in Contemporary Satellite Systems

This section highlights recurring application patterns and relates them to method choices and deployment constraints.

Although the application space is broad, many contemporary SATCOM optimizers reduce to a small set of recurring protocol algorithms. Delayed-feedback link adaptation uses a predict–rank–select loop. Beam hopping and multi-beam scheduling use a demand-aggregate, candidate-generate, feasibility-screen, and guarded-deploy loop. LEO routing and handover control use a time-sliced graph and continuity-constraint loop, while transport and DTN control select among pacing, proxy, or custody policies on the basis of path classification. Figure 4 summarizes these skeletons to give the reader a concrete algorithmic picture before the subsection-specific discussion [2,3,13,14,15,16,21,22,23,24,37,38,39,40,41,42,46,47,48,49,54].

The four panels of Figure 4 are intentionally generic. They are not vendor-specific implementations, but compact protocol workflows that expose where evidence enters, where ranking or search happens, and where feasibility or fallback must be enforced.

6.1. Link Adaptation with Delayed and Quantized Feedback

ACM mechanisms in DVB-S2/S2X expose a standardized set of modulation and coding modes and signaling options. Under delayed feedback, conservative robust adaptation (e.g., optimizing margins rather than selecting modes directly) often outperforms aggressive mode switching. Bandit methods can be used to learn per-terminal reliability of modes under specific weather and interference conditions while respecting outage constraints [2,3,54].

A representative delayed-feedback ACM controller therefore has four explicit blocks: evidence collection (CQI, ACK/NACK, fade indicators), state or margin estimation, mode ranking against outage or PER targets, and projection onto the finite DVB-S2/S2X or NTN mode table. The algorithmic difficulty lies not in choosing an unconstrained optimum, but in choosing the safest admissible mode under stale evidence and bounded confidence.
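The four blocks described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a DVB-S2/S2X implementation: the mode table values, the `select_mode` function, and the linear rule that widens the margin with feedback age are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Mode:
    name: str
    required_snr_db: float  # hypothetical decoding threshold
    efficiency: float       # illustrative spectral efficiency


# Placeholder table; real thresholds come from the applicable standard.
MODE_TABLE: List[Mode] = [
    Mode("robust", 0.0, 1.0),
    Mode("medium", 5.0, 2.0),
    Mode("fast", 10.0, 3.0),
]


def select_mode(snr_reports_db: Sequence[float],
                feedback_age_s: float,
                fade_slope_db_per_s: float,
                base_margin_db: float = 1.0,
                table: Sequence[Mode] = MODE_TABLE) -> Mode:
    safest = min(table, key=lambda m: m.required_snr_db)
    # Block 1: evidence collection (here reduced to recent SNR reports).
    if not snr_reports_db:
        return safest
    # Block 2: conservative estimate, with a margin that grows with staleness.
    estimate_db = min(snr_reports_db)
    margin_db = base_margin_db + fade_slope_db_per_s * feedback_age_s
    usable_db = estimate_db - margin_db
    # Block 3: keep only modes whose threshold is met after the margin.
    admissible = [m for m in table if m.required_snr_db <= usable_db]
    # Block 4: project onto the finite table, or fall back to the safest mode.
    if not admissible:
        return safest
    return max(admissible, key=lambda m: m.efficiency)
```

With the same reports, older feedback yields a more conservative choice: the stale-evidence margin, not the optimizer, determines the selected mode.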

6.2. Multi-Beam Resource Allocation and Interference Management

Joint power and scheduling decisions in multi-beam systems are strongly coupled through interference. Recent works formulate joint beam scheduling and power optimization under realistic payload models and propose algorithms that balance tractability with interference-aware constraints [37,38,40]. In practice, a common design is to compute a coarse allocation in a slower loop and refine it with fast heuristics per frame. Mission-driven scheduling results in satellite-terrestrial settings [52] reinforce this hierarchy: coarse mission-to-resource matching and fine-grained collaboration/reconfiguration are often easier to validate than a single monolithic controller.

In practical multi-beam architectures, this optimization loop is usually split between a slower beam-level allocator and a faster local scheduler. The slow loop computes a coarse power or capacity split under feeder and interference constraints; the fast loop resolves per-frame user ordering, grant size, or short-horizon corrections. This decomposition is one reason why hybrid convex-plus-heuristic designs are more common than end-to-end monolithic optimizers.

6.3. Scheduling and Beam Hopping

Beam hopping introduces a discrete beam-activation pattern that can adapt capacity to spatially varying demand. Optimization approaches include mixed-integer formulations with rounding and repair [37], interference-aware scheduling [38], and RL variants that incorporate constraint handling to avoid infeasible schedules [39]. The most deployable solutions tend to limit the action space by selecting among validated hop patterns and to use learning primarily for hotspot prediction or pattern ranking, not for unconstrained direct actuation.

A typical beam-hopping scheduler maintains a finite library of hop patterns and switching rules exposed by the payload. Demand forecasts and feeder constraints are used to rank patterns, after which a feasibility screen rejects choices that would violate power, switching, or interference envelopes. In deployment, learning is most useful for hotspot prediction or pattern ranking, whereas the final action is usually selected from a validated discrete set.
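The rank-then-screen logic described above might look as follows. The sketch assumes a hypothetical `HopPattern` record, a single uniform per-beam capacity, and simple scalar power and switching limits; real payload envelopes are richer, and interference screening is omitted.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class HopPattern:
    name: str
    dwell_share: Dict[str, float]  # beam id -> fraction of dwell time
    power_w: float                 # illustrative total power draw
    switches_per_frame: int


def served_demand(pattern: HopPattern, demand: Dict[str, float],
                  capacity_per_beam: float) -> float:
    # Demand a pattern can serve, capped per beam by its dwell-time share.
    return sum(min(demand.get(beam, 0.0), share * capacity_per_beam)
               for beam, share in pattern.dwell_share.items())


def choose_pattern(library: Sequence[HopPattern], demand: Dict[str, float],
                   capacity_per_beam: float, max_power_w: float,
                   max_switches: int, fallback: HopPattern) -> HopPattern:
    # Feasibility screen: reject patterns outside the power/switching envelope.
    feasible = [p for p in library
                if p.power_w <= max_power_w
                and p.switches_per_frame <= max_switches]
    if not feasible:
        return fallback  # validated conservative pattern
    # Ranking: pick the feasible pattern that serves the most forecast demand.
    return max(feasible,
               key=lambda p: served_demand(p, demand, capacity_per_beam))
```

In this arrangement a learned demand forecast only changes the `demand` input; the final action still comes from the validated library.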

6.4. Routing and Traffic Engineering in LEO Constellations

LEO routing must exploit deterministic motion while coping with intermittent inter-satellite links (ISLs) and traffic variability. Topology snapshots and ephemeris-based predictions enable time-dependent routing, but state dissemination must be carefully managed [21,22,23]. Recent direct-satellite-to-device works extend the problem from routing to tailored connectivity management: Constellation as a Service [42] treats multi-constellation infrastructure as a shared resource pool and preconfigures handover paths for service regions. This is not classical shortest-path routing, but it reinforces the same lesson: topology knowledge, mobility cadence, and executable control surfaces must be co-designed.

A practical LEO routing stack rarely solves a fresh global optimization problem from scratch at every instant. Instead, it maintains a time-sliced graph derived from ephemerides, computes route or gateway biases for the current and near-future snapshots, and applies handover guard timers to preserve continuity. This architecture reduces oscillation and allows routing decisions to remain compatible with continuity, signaling, and transport constraints.
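A toy version of the time-sliced graph with a handover guard timer is sketched below. The snapshot format (nested dictionaries of link costs), the `guard_slices` count, and the `min_gain` switching threshold are illustrative assumptions; the shortest-path step is plain Dijkstra rather than any constellation-specific algorithm.

```python
import heapq
from math import inf
from typing import Dict, List, Optional, Sequence, Tuple

Graph = Dict[str, Dict[str, float]]  # node -> neighbor -> link cost


def shortest_path(graph: Graph, src: str,
                  dst: str) -> Tuple[Optional[List[str]], float]:
    dist, prev, heap = {src: 0.0}, {}, [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, inf):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, inf):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None, inf
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1], dist[dst]


def path_cost(graph: Graph, path: Sequence[str]) -> float:
    # Cost of an existing route in this snapshot; inf if any link vanished.
    total = 0.0
    for u, v in zip(path, path[1:]):
        w = graph.get(u, {}).get(v)
        if w is None:
            return inf
        total += w
    return total


def route_over_slices(slices: Sequence[Graph], src: str, dst: str,
                      guard_slices: int = 2,
                      min_gain: float = 0.2) -> List[Optional[List[str]]]:
    current, held, chosen = None, 0, []
    for graph in slices:
        best, best_cost = shortest_path(graph, src, dst)
        cur_cost = path_cost(graph, current) if current else inf
        broken = cur_cost == inf
        worthwhile = best_cost < (1.0 - min_gain) * cur_cost
        # Switch at once if the route broke; otherwise only after the guard
        # timer has expired and the gain exceeds the threshold.
        if broken or (held >= guard_slices and worthwhile):
            current, held = best, 0
        else:
            held += 1
        chosen.append(current)
    return chosen
```

The guard timer trades some path optimality for continuity: a cheaper route that appears right after a handover is ignored until the timer expires.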

6.5. Transport-Layer Optimization and Cross-Layer Control

Transport performance over satellite is sensitive to round-trip time, path asymmetry, and loss processes. Classic guidance highlights standard TCP mechanisms and the use of PEPs to split control loops [13,14,15]. With LEO mobility, path changes and handovers interact with transport behavior; QUIC’s connection migration is relevant but must be coupled with careful congestion-control tuning and buffer management [16,17]. Looking forward, communication–computing integration studies such as Jiang et al. [53] suggest that regenerative payloads and onboard edge processing may create new cross-layer control loops, but these opportunities are only operationally useful if their protocol action surfaces are made explicit.

Transport-layer control is typically implemented as a path-classification and profile-selection problem. The controller first determines whether the path resembles a long-RTT but connected IP path, an asymmetric proxy-assisted path, or a disruption-prone DTN path. It then selects pacing, PEP, QUIC migration, or custody and priority behavior accordingly, with cross-layer guards to prevent transport adaptation from fighting lower-layer scheduling decisions.
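The classify-then-select structure can be illustrated with a small sketch. The thresholds, the `PathStats` fields, and the contents of `PROFILES` are hypothetical placeholders chosen for readability; they are not recommended operating values.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class PathStats:
    contact_availability: float   # fraction of time an end-to-end path exists
    fwd_to_ret_rate_ratio: float  # forward-link rate divided by return-link rate


# Hypothetical profiles, one per path class named in the text.
PROFILES: Dict[str, Dict[str, bool]] = {
    "long_rtt_ip": {"pacing": True, "pep": False, "custody": False},
    "asymmetric": {"pacing": True, "pep": True, "custody": False},
    "dtn": {"pacing": False, "pep": False, "custody": True},
}


def classify_path(stats: PathStats, dtn_availability: float = 0.9,
                  asymmetry_ratio: float = 10.0) -> str:
    if stats.contact_availability < dtn_availability:
        return "dtn"          # disruption-prone: store-and-forward behavior
    if stats.fwd_to_ret_rate_ratio >= asymmetry_ratio:
        return "asymmetric"   # proxy-assisted path
    return "long_rtt_ip"      # connected, long round-trip time


def select_profile(stats: PathStats,
                   previous: Optional[Dict[str, bool]] = None,
                   lower_layer_busy: bool = False) -> Dict[str, bool]:
    # Cross-layer guard: hold the current profile while a lower layer
    # (e.g., scheduler or handover) is still reconfiguring.
    if lower_layer_busy and previous is not None:
        return previous
    return dict(PROFILES[classify_path(stats)])
```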

7. Practical Limits and Barriers to Deployment

Across systems and method families, several practical limits repeatedly dominate deployment outcomes. For satellite protocols, these limits are best understood not as isolated implementation annoyances but as interacting constraints on evidence, model validity, action feasibility, and governance.

A deeper deployment analysis is useful if barriers are grouped by what they invalidate: state evidence, model validity, action feasibility, and governance. In satellite systems, long delay, moving topology, feeder-link and weather coupling, and heterogeneous standards amplify all four at once. As a result, many prototype gains disappear not because the optimizer is mathematically weak, but because the surrounding measurement-to-decision chain is incomplete, weakly instrumented, or impossible to verify under operational constraints [31,32,65,66].

7.1. Delayed Observability and Partial Feedback

Long RTTs and intermittent visibility delay or suppress feedback, which can destabilize closed-loop optimization and learning. Deployable designs must therefore model delay explicitly, reconstruct hidden state, or slow down adaptation enough to preserve stability under stale evidence.

7.1.1. Observability Regimes and Evidence Availability

Satellite protocol control seldom operates under full direct observability. Instead, deployments mix three evidence regimes: variables that are directly measured, variables that are measured late or irregularly, and variables that are only inferred through estimators or side information. Research papers often compress these regimes into a single state vector, but deployment difficulty depends precisely on which elements are sensed, delayed, estimated, or altogether unavailable [7,8,12,31,36,41].

7.1.2. Delay Mismatch Between Measurement and Actuation

In many satellite loops, the actuation horizon is shorter than the observation horizon. A scheduler, access controller, or handover policy can change the system immediately, while its effect may become visible only after one or several round trips or after orbital geometry has already changed. This creates a structural credit-assignment problem and makes reactive controllers prone to over-correction, oscillation, or misplaced blame on exogenous events [20,31,32,55,62].

7.1.3. Implications for Benchmarking and Safe Control

A deployment-oriented study should therefore state its observation contract explicitly: sampling rate, latency distribution, missing-data behavior, estimator assumptions, and the minimum telemetry needed to stay inside the safe action set. Without that contract, numerical gains are hard to interpret because the benchmark may assume a level of state visibility that the operational stack does not export [32,33,36].

7.2. Non-Stationarity, Rare Events, and Robustness

Traffic demand, interference, and weather vary across multiple time scales, and rare events such as gateway outages, interference spikes, and abrupt demand surges can dominate risk. Robust and distributionally robust methods reduce sensitivity to model mismatch, while safe RL emphasizes constraint satisfaction during learning and deployment [54,55].

7.2.1. Structured Drift Versus True Novelty

Not all non-stationarity is equally harmful. Orbital motion, known beam visibility windows, and scheduled handovers create structured variation that can often be modeled or scheduled in advance, whereas feeder-link weather, external interference, and tenant-driven demand shocks create genuinely novel conditions that must be absorbed online. A strong controller must distinguish predictable change from true regime shift rather than treating both as generic randomness [23,25,31,34,65].

7.2.2. Rare Events, Tail Risk, and Dataset Scarcity

Tail events matter disproportionately because service obligations are often defined by continuity, outage recovery, and graceful degradation rather than by average throughput alone. Yet severe outages and congestion crises are precisely the events that appear least often in available traces. A policy can dominate on mean metrics and still be unusable if its confidence collapses under anomalous conditions or if tail-performance evidence is missing [33,34,54,55,60].

7.2.3. Adaptation Without Destabilization

Continual adaptation must be rate-limited and coordinated across layers. When routing, scheduling, and congestion control adapt simultaneously, locally rational updates may produce network-wide oscillation. This is a major reason why hierarchical control and conservative update cadence remain essential even in learning-enabled architectures [20,27,55,62].
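Rate limiting for a single control loop can be sketched as an update rule that bounds both how often and how far a parameter may move. The function name and its arguments are illustrative assumptions; coordination across layers (for example, giving each loop a different minimum interval) is left outside the sketch.

```python
from typing import Tuple


def rate_limited_update(current: float, target: float, max_step: float,
                        now_s: float, last_update_s: float,
                        min_interval_s: float) -> Tuple[float, float]:
    """Return (new_value, time_of_last_update)."""
    # Cadence limit: ignore requests that arrive too soon after the last change.
    if now_s - last_update_s < min_interval_s:
        return current, last_update_s
    # Slew limit: move toward the target by at most max_step per update.
    step = max(-max_step, min(max_step, target - current))
    return current + step, now_s
```

Assigning a longer `min_interval_s` to slower layers is one simple way to approximate the hierarchical, conservative update cadence discussed above.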

7.3. Onboard Compute, Determinism, and Update Constraints

Onboard processing often prioritizes determinism, radiation tolerance, and limited update cadence. This favors compact policies, bounded-time algorithms, validated lookup tables, and carefully governed update procedures; complex learning models are typically trained offline and distilled into small runtime artifacts [26].

7.3.1. Runtime Determinism and Bounded Complexity

Beyond determinism, radiation tolerance, and limited update cadence, onboard processing must also respect bounded memory. Available onboard compute is improving: NASA’s High Performance Spaceflight Computing (HPSC) program targets roughly two orders-of-magnitude improvement over legacy spaceflight processors while emphasizing performance per watt and fault tolerance [63]. Even so, onboard control still operates under far tighter timing and software-assurance constraints than gateway or network-operations-center (NOC) environments. Consequently, the deployable algorithm class onboard is usually lookup-based adaptation, small convex updates, or distilled policy evaluation rather than large iterative search or continuously retrained models [20,63,64].

7.3.2. Communication and Control-Plane Budget

Operational acceptance depends not only on average compute cost but also on worst-case execution time, memory footprint, and timing determinism. In practical terms, per-frame or per-superframe onboard loops can usually tolerate table lookup, lightweight filtering, small linear or convex updates, or inference with compact distilled models, whereas gateway and NOC environments can accommodate decomposition-based mixed-integer optimization, larger Monte Carlo replay, and twin recalibration jobs. The real deployment question is therefore not simply “AI onboard versus AI on ground”, but whether the control loop can tolerate variable-time search, large feature vectors, and frequent model refresh without violating its timing envelope [58,59,63,64].

7.3.3. Compression, Distillation, and Update Governance

Model compression, policy distillation, rule extraction, and profile-based controllers are not secondary implementation details; they are the bridge between research and deployment. Equally important is update governance: who approves a new policy, how frequently parameters may change, which rollback image is retained, and how post-change anomalies are attributed to data, model, or actuation issues [31,36,65].

7.4. Simulation-to-Reality Gaps and Digital-Twin Drift

Many optimization gains are demonstrated in simulation, but mismatches in interference models, hardware constraints, traffic assumptions, and actuation timing can negate gains in the field. Digital twins mitigate this only if they are continuously calibrated, uncertainty-aware, and embedded in disciplined change management [57,61].

7.4.1. Sources of Model-Form Mismatch

Mismatch arises at several layers: channel models omit implementation-specific non-idealities, traffic models ignore multi-tenant behavior and synchronized reporting bursts, routing studies simplify queueing and retransmission coupling, and protocol evaluations often idealize actuation delay. These modeling errors interact, meaning that a small bias in one layer may become large once it is embedded in a closed control loop [20,23,33,34,57,61]. Concrete examples illustrate the point. Chen et al. [37] present a strong beam-hopping optimization framework, but operational deployment would still require an explicit mapping from continuous recommendations to the finite hop-pattern library and switching rules exposed by the payload. Shi et al. [24] show that graph-neural-network plus deep-Q-network routing can improve moving-topology decisions, yet the study does not specify how frequently the required link-state features can be exported or how loop-free fallback is guaranteed under stale state. CaaS [42] addresses multi-constellation DS2D coordination and preconfigured handover paths, but it assumes predictive visibility and mobility information whose operational export and governance remain open in standardized stacks.

7.4.2. Calibration and Synchronization Requirements

A useful twin must be calibrated not once but continuously. Calibration includes parameter fitting, state alignment, timestamp normalization, uncertainty estimation, and checking whether the twin reproduces both nominal and stressed historical episodes. The scientific value of the twin depends on reporting these procedures explicitly rather than treating the twin as a black-box oracle [31,32,33,36].

7.4.3. Twin Drift and Scope Boundaries

Twin drift occurs when the mapping between the physical and virtual system deteriorates because the network changes faster than the twin is updated, or because the twin was never designed to represent certain layers faithfully. Researchers should state scope boundaries upfront: what is modeled mechanistically, what is represented statistically, what is estimated from delayed telemetry, and what is outside the twin entirely [31,32,33,36].
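One simple way to make drift measurable is to track the relative error between twin predictions and later measurements over a sliding window. The `DriftMonitor` class, its window length, and its threshold are hypothetical; a production monitor would likely track several quantities and separate nominal from stressed episodes.

```python
from collections import deque


class DriftMonitor:
    """Flags drift when windowed mean relative error exceeds a threshold."""

    def __init__(self, window: int = 50, threshold: float = 0.15) -> None:
        self.errors = deque(maxlen=window)  # keeps only the latest `window` errors
        self.threshold = threshold

    def update(self, predicted: float, measured: float) -> bool:
        denom = max(abs(measured), 1e-9)  # guard against division by zero
        self.errors.append(abs(predicted - measured) / denom)
        return self.drifting()

    def drifting(self) -> bool:
        # Require a full window so that a single early outlier does not
        # trigger recalibration.
        if len(self.errors) < self.errors.maxlen:
            return False
        return sum(self.errors) / len(self.errors) > self.threshold
```

A positive flag would typically schedule recalibration or narrow the set of decisions the twin is allowed to inform, rather than change the live network directly.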

7.5. Certification, Explainability, and Security

Space and critical-communications contexts require traceability and predictable behavior. Black-box policies can be difficult to certify, and optimization outputs must remain standards-compliant. Security adds another layer: protocols and optimization loops must be resilient to spoofed measurements, data poisoning, and control-plane attacks. Standards such as SDLS and IPsec provide cryptographic building blocks, but anomaly detection and policy hardening remain necessary [18,50,51].

7.5.1. Traceability and Standards-Compliant Actuation

An intelligent controller may compute an excellent action that simply cannot be expressed through the standard or vendor interface exposed by the real system. Traceability therefore requires a documented chain from input telemetry to internal state, candidate action, standards-compliant command, and operator-facing explanation. As Release 19 studies expand NTN architectural options and management hooks, this action-mapping problem becomes more important rather than less [9,10,11,31,32,65,67].

7.5.2. Explainability as an Operational Property

In this context, explainability should be treated as an operational property rather than only an ML property. Operators need to know why a recommendation is safe now, which assumptions are active, which uncertainty bounds are close to violation, and which fallback will trigger if the environment shifts. A method that cannot answer these questions may be analytically interesting but operationally fragile [20,31,55,66].

7.5.3. Security of Optimization and Twin Pipelines

Security must cover the whole optimization pipeline: telemetry authenticity, time synchronization, model repository integrity, feature preprocessing, decision logging, and rollback protection. A digital twin introduces additional assets to protect (calibration data, scenario libraries, and counterfactual results) because corruption of these artifacts can quietly bias future operational decisions [10,18,31,36,50,51].

7.6. Barrier Interactions Across the Deployment Lifecycle

A useful way to read deployment risk is as a lifecycle problem. The dominant barrier changes from design to integration to runtime, but weaknesses accumulate across all stages and often reinforce each other [31,32,33,36].

7.6.1. Design-Time Barrier Stack

At design time, the dominant risk is model debt: simplifying assumptions that are reasonable for link-level or single-layer studies become unsafe once cross-layer dependencies, gateway diversity, weather coupling, policy constraints, and tenant differentiation are introduced. If those assumptions are hidden, later calibration and deployment work starts from a false baseline [33,34,36,64].

7.6.2. Integration-Time Barrier Stack

At integration time, the dominant risk is interface debt. The optimization component may require telemetry granularity, control hooks, synchronization accuracy, or retraining cadence that the operational stack does not expose. In many cases, this is where otherwise strong research prototypes fail, because the system lacks the actuation or instrumentation needed to instantiate the theoretical control loop [31,32,36,65].

7.6.3. Runtime Barrier Stack

At runtime, the dominant risk becomes evidence debt: decisions must be justified from stale, partial, or noisy observations while operators still require bounded behavior, anomaly visibility, and immediate rollback. These barriers interact; a weak observation pipeline makes calibration harder, poor calibration erodes trust, and low trust prevents the system from collecting the evidence needed for improvement [31,32,33,36].

The implication for research is that performance metrics alone are insufficient. A study should also report its observability assumptions, required control frequency, dependency on hidden state, calibration strategy, confidence estimation, and expected fallback behavior. For satellite studies, the minimum deployment questions remain: What measurements are actually available? Which actions are standards-compliant? How quickly can a safe fallback be executed? What rare events were explicitly tested? Without these answers, the gap between an experimental optimizer and an operational protocol remains structural rather than incremental [31,32,33,36,65].

The outcome of Table 9 is that each major deployment barrier can be translated into a concrete twin function and a minimum evidence artifact. This is useful because it reframes barriers as research design requirements rather than as vague implementation concerns.

8. Research Methodology for Twin-Assisted Satellite Protocol Studies

The barrier analysis in Section 7 suggests that satellite protocol research should be designed as an evidence-producing process rather than as a one-shot optimization benchmark. In other words, a method is not persuasive merely because it improves throughput or delay in a simulator; it becomes persuasive when the study makes explicit what was observed, which actions were actually admissible in a standards-compliant system, how model fidelity was established, and under what conditions the proposed controller would be promoted, constrained, or rolled back.

This section therefore extracts a standalone research methodology from the deployment analysis above. It treats the digital twin as a governed experimental instrument for protocol research: a synchronized environment that supports historical reconstruction, counterfactual experimentation, rare-event synthesis, shadow-mode validation, and staged deployment, while preserving clear separation between exploratory freedom in the twin and conservative behavior in the live network [31,32,33,36].

8.1. From Simulation Gap to Evidence Gap

8.1.1. The Four Recurring Mismatches

The usual phrase “simulation-to-reality gap” is too narrow for satellite protocol research. The deeper problem is an evidence gap: the chain between measured system state, modeled state, admissible actions, and claimed performance is often under-specified. A rigorous methodology should therefore examine four mismatches at once (telemetry mismatch, model-form mismatch, control-surface mismatch, and governance mismatch), because any of them can invalidate a reported optimization gain even when the numerical benchmark looks favorable.

8.1.2. What a Convincing Evaluation Must Report

This reframing matters because it changes what a convincing evaluation looks like. Beyond throughput, delay, loss, and fairness, a convincing study should report an observability contract, calibration error, uncertainty coverage, action feasibility under the target standard or vendor interface, runtime budget, and explicit promotion and rollback rules. Methodological adequacy is therefore part of the result, not auxiliary documentation. For ACM, an observability contract should state whether the controller sees terminal-level channel indicators, acknowledgment outcomes, rain-fade indicators, and the latency distribution of those signals; a meaningful calibration report can then give packet-error-rate prediction error, mode-selection regret, and safe-mode invocation rate over replay traces. For beam hopping, the contract should list per-beam queue summaries, feeder-link occupancy, interference summaries, and forecast cadence; calibration can be reported as error in predicted served bits per beam, queue-length evolution, and violation rate of power or interference envelopes during trace replay.
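For the ACM case, the three calibration quantities named above can be computed directly from replay records. The `ReplayRecord` fields and the definition of regret as the efficiency gap to the best admissible mode in hindsight are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class ReplayRecord:
    predicted_per: float      # packet error rate predicted by the controller
    observed_per: float       # packet error rate seen in the replay trace
    chosen_efficiency: float  # efficiency of the mode actually selected
    best_efficiency: float    # best admissible mode in hindsight (assumed known)
    safe_mode: bool           # True if the fallback mode was invoked


def acm_calibration_report(records: Sequence[ReplayRecord]) -> Dict[str, float]:
    if not records:
        raise ValueError("calibration report needs at least one replay record")
    n = len(records)
    return {
        # Mean absolute error of the packet-error-rate prediction.
        "per_mae": sum(abs(r.predicted_per - r.observed_per) for r in records) / n,
        # Average efficiency lost relative to the hindsight-best mode.
        "mean_regret": sum(r.best_efficiency - r.chosen_efficiency for r in records) / n,
        # Fraction of decisions that fell back to the safe mode.
        "safe_mode_rate": sum(1 for r in records if r.safe_mode) / n,
    }
```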

8.2. A Conceptual Digital-Twin Framework for Satellite Protocol Research

A digital twin should not be understood as a mere network simulator. ETSI and ongoing IRTF work frame a network digital twin (NDT) as a virtual counterpart linked to the operational network through data, models, interfaces, and logic across the lifecycle of planning, validation, operation, and continuous verification [31,32,33,34,35,36]. For satellite protocol research, this implies that the twin must be judged not only by fidelity in a narrow channel-model sense but by its ability to maintain usable correspondence with telemetry, protocol state, and admissible control actions.

A practical conceptual framework for satellite protocol research can be organized into five coupled layers (Table 10). The observation layer collects telemetry, ephemerides, topology snapshots, control-plane events, weather indicators, queue and flow summaries, and security-relevant logs. The synchronization and state-estimation layer aligns these heterogeneous data with different latencies and reconstructs hidden or stale state. The twin-core layer combines multi-scale models for channels, queues, interference, routing, mobility, and protocol state machines, together with uncertainty models and scenario generators. The experimentation layer performs trace replay, counterfactual testing, stress-case generation, policy search, and sensitivity analysis. Finally, the deployment-governance layer enforces standards-compliant action abstractions, safety envelopes, shadow-mode validation, canary release, and rollback rules.

The inter-layer data flow is central. Time-stamped telemetry and events from the observation layer are ingested by the synchronization and state-estimation layer, which aligns clocks, compensates delay, and outputs a state estimate with confidence bounds. The twin core consumes that state together with mechanistic and learned models to generate predicted trajectories and uncertainties. The experimentation layer repeatedly queries the twin core under replay and counterfactual scenarios and returns candidate policies, sensitivity maps, and stress-test outcomes. Finally, the deployment gate projects only the admissible subset of those policies onto standards-compliant commands and feeds realized outcomes back to the observation and calibration loops.

The key methodological principle is asymmetric freedom: exploration belongs in the twin, and conservatism belongs in the live network. The twin may train or search over rich policy spaces, but only policies that can be projected onto the real system’s discrete and auditable control knobs should pass the deployment gate. This is especially important as NTN standards add new architectural options such as regenerative payload, Store and Forward operation, and enhanced end-to-end management, all of which expand protocol interactions and therefore the value of pre-deployment evidence [32,34,65,66].

8.2.1. Observation and Telemetry Plane

The first layer defines what the twin can know. For satellite protocol studies, this includes not only packet-level or queue-level counters but also ephemerides, visibility windows, weather-linked feeder impairment indicators, beam utilization summaries, handover events, retransmission statistics, and security-relevant alarms. The design objective is not maximal telemetry but target-driven telemetry: the twin should request only the evidence needed for a given control or research task [31,36].

8.2.2. Synchronization and State Estimation

Because satellite observations arrive with different delays and coordinate systems, synchronization is a scientific task in its own right. The twin must align timestamps, orbital reference frames, gateway-local measurements, and asynchronous control-plane events. It must also estimate hidden variables such as stale queue occupancy, latent interference, or demand that will become visible only in future contacts. A twin without explicit synchronization logic is merely a loosely coupled data lake [31,32,33,36].
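A minimal sketch of timestamp alignment followed by a staleness-aware estimate is given below. It assumes that per-source clock offsets are already known, that uncertainty grows linearly with sample age, and that a two-sigma band is an acceptable confidence bound; each of these is a simplification chosen for illustration.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Sample:
    value: float
    local_time_s: float    # timestamp in the source's own clock
    clock_offset_s: float  # source clock minus reference clock (assumed known)


def estimate_state(samples: Sequence[Sample], now_s: float,
                   growth_per_s: float, base_sigma: float
                   ) -> Tuple[float, Tuple[float, float], float]:
    """Return (estimate, (lower, upper), age_s) on the reference clock."""
    if not samples:
        raise ValueError("no telemetry available")
    # Align every sample to the reference clock before comparing freshness.
    aligned = [(s.local_time_s - s.clock_offset_s, s.value) for s in samples]
    t_ref, value = max(aligned, key=lambda tv: tv[0])
    age_s = max(0.0, now_s - t_ref)
    # Widen the confidence band as the freshest evidence gets older.
    sigma = base_sigma + growth_per_s * age_s
    return value, (value - 2.0 * sigma, value + 2.0 * sigma), age_s
```

Without the offset correction, a sample from a fast-running clock would look fresher than it is, which is exactly the kind of silent error that explicit synchronization logic is meant to prevent.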

8.2.3. Twin Core and Model Hierarchy

The twin core should be explicitly multi-resolution. Mechanistic models are appropriate where physics and protocol rules are stable, such as orbital dynamics or standards-defined state machines; statistical or learned surrogates are more useful where behavior is difficult to model directly, such as burst demand, hidden interference, or user heterogeneity. The key research choice is not mechanistic versus learned modeling in the abstract, but which parts of the system need causal structure and which can be approximated safely [32,33,34,35].

8.2.4. Experimentation and Optimization Sandbox

The experimentation layer is where the twin becomes useful for research. It should support historical replay, counterfactual what-if analysis, rare-event injection, sensitivity analysis, uncertainty propagation, and policy search under standards-compliant action abstractions. In this layer, the twin serves as a controlled environment for asking not only whether a method improves performance but also under which assumptions and failure modes the improvement persists [31,32,33,36].

8.2.5. Deployment Gate, Safety Envelope, and Rollback

The final layer determines whether a recommendation may influence the live network. It should encode admissible actions, rate limits, policy priority rules, canary scope, rollback triggers, and post-change forensic logging. This gate is what separates a research twin from an operationally relevant twin: without explicit admissibility and recovery logic, the twin cannot convert analytical insight into trustworthy control [9,10,11,31,32].
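The gate can be pictured as an ordered series of checks that each return a verdict and a loggable reason. The `Recommendation` record, the verdict names, and the ordering of checks are hypothetical choices for this sketch rather than a prescribed design.

```python
from dataclasses import dataclass
from typing import AbstractSet, FrozenSet, Tuple


@dataclass(frozen=True)
class Recommendation:
    action: str             # identifier of a discrete control action
    confidence: float       # twin-side confidence in [0, 1]
    scope: FrozenSet[str]   # network elements the action would touch


def deployment_gate(rec: Recommendation,
                    admissible_actions: AbstractSet[str],
                    canary_scope: AbstractSet[str],
                    min_confidence: float,
                    changes_in_window: int,
                    max_changes: int) -> Tuple[str, str]:
    """Return (verdict, reason); the reason is meant for forensic logging."""
    if rec.action not in admissible_actions:
        return "reject", "action is not in the admissible set"
    if not rec.scope <= canary_scope:
        return "reject", "action touches elements outside the canary scope"
    if rec.confidence < min_confidence:
        return "fallback", "confidence below threshold; keep baseline"
    if changes_in_window >= max_changes:
        return "defer", "rate limit reached for this window"
    return "deploy", "all gate checks passed"
```

Returning the reason alongside the verdict keeps every blocked or deferred recommendation attributable after the fact.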

Figure 5 should be read from left to right. The main horizontal path shows how raw telemetry becomes synchronized state, then a calibrated twin state, then counterfactual experimentation, and finally a guarded deployment decision. The upper loop represents slower recalibration, drift detection, and rare-event library maintenance; the lower loop represents measured outcomes, post-mortems, and policy audits feeding back into the twin. The key point is that policy search occurs inside the twin, whereas the live network only sees actions that pass the deployment gate.

8.3. Twin-Driven Research Workflow and Validation

The framework above suggests a concrete research workflow that begins with historical reconstruction and then expands to counterfactual analysis, stress-case generation, shadow mode, guarded rollout, and post-deployment monitoring. Methodologically, each loop should answer a different scientific question: can the twin reproduce known behavior, can the candidate method outperform strong baselines under controlled perturbations, can it remain well behaved when fed live telemetry, and can its recommendations be promoted without violating the network’s safety envelope?

8.3.1. Historical Reconstruction and Calibration

The first loop replays archived telemetry and known events to test whether the twin can reproduce past behavior under both nominal and stressed conditions. This stage should quantify not only average fit but also calibration under outages, handover bursts, and congestion transitions. A twin that cannot reconstruct the past reliably should not be trusted to rank future protocol actions.

8.3.2. Counterfactual and Rare-Event Experimentation

The second loop evaluates alternative protocol settings under trace-driven and synthetic scenarios. Rare-event libraries are essential here because many operationally important conditions are under-represented in field logs. Researchers should therefore combine historical replay with designed stress cases that expose timing, coordination, and safety weaknesses before deployment.

8.3.3. Shadow Mode, Guarded Rollout, and Rollback

The third loop connects the twin to live telemetry so that it produces recommendations without controlling the network. This reveals the disagreement between the twin and the physical system before any operational risk is introduced. Only after shadow agreement is acceptable should a guarded rollout begin, with limited scope, explicit confidence thresholds, and automatic rollback if anomaly, confidence, or safety triggers are crossed.
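The promotion and rollback logic of this loop can be written as a small state machine. The stage names, the thresholds, and the single `trigger` flag that lumps anomaly and safety triggers together are assumptions made to keep the sketch short.

```python
ORDER = ("shadow", "canary", "full")  # hypothetical rollout stages


def next_stage(stage: str, disagreement: float, confidence: float,
               trigger: bool, max_disagreement: float = 0.1,
               min_confidence: float = 0.9) -> str:
    """Advance, hold, or roll back a staged rollout.

    `disagreement` is a shadow-mode metric in [0, 1]; `trigger` is True when
    an anomaly or safety trigger has fired.
    """
    if stage == "rollback":
        return "rollback"  # terminal until an operator intervenes
    if stage not in ORDER:
        raise ValueError(f"unknown stage: {stage}")
    healthy = (not trigger) and confidence >= min_confidence
    # Once the policy influences the live network, any crossed trigger or
    # confidence drop forces an automatic rollback.
    if stage != "shadow" and not healthy:
        return "rollback"
    # Promote one step only when healthy and in agreement with the live system.
    if healthy and disagreement <= max_disagreement:
        return ORDER[min(ORDER.index(stage) + 1, len(ORDER) - 1)]
    return stage
```

In shadow mode an unhealthy reading simply blocks promotion, since the policy is not yet controlling anything and there is nothing to roll back.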

8.3.4. Evidence Artifacts, Reproducibility, and Negative Results

Each loop should produce explicit research artifacts: data provenance reports, model-calibration error, uncertainty estimates, rare-event coverage, baseline definitions, action-feasibility checks, shadow-mode disagreement metrics, promotion criteria, and rollback triggers. Reporting these artifacts makes the study reproducible and also makes negative results scientifically useful, because failures can be attributed to missing evidence, inadequate synchronization, insufficient control authority, or genuine algorithmic weakness rather than being hidden inside aggregate performance figures.

The twin is therefore not a cure-all. If telemetry is sparse, if rare events are missing from the scenario library, or if calibration drift is not monitored, the twin can create a false sense of certainty. For this reason, a credible satellite twin should always publish its scope boundaries: which layers are modeled faithfully, which variables are estimated rather than observed, what delay assumptions are built in, how often the twin is re-synchronized, and what classes of decisions are allowed to cross the deployment gate. Treating the twin itself as an object of measurement is essential for rigorous research [32,33,36].

Table 11 should be interpreted as a minimum package rather than an exhaustive checklist. Individual studies may add problem-specific evidence, but omitting these baseline artifacts makes it difficult to judge whether a reported gain is actually deployable.
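The artifact list in Section 8.3.4 can also be treated as a machine-checkable manifest. The sketch below is one possible encoding: the key names are hypothetical shorthand for the nine artifacts named in the text, and the values are assumed to be references to reports or files.

```python
# Hypothetical keys, one per artifact named in Section 8.3.4.
REQUIRED_ARTIFACTS = (
    "data_provenance",
    "calibration_error",
    "uncertainty_estimates",
    "rare_event_coverage",
    "baseline_definitions",
    "action_feasibility",
    "shadow_disagreement",
    "promotion_criteria",
    "rollback_triggers",
)


def missing_artifacts(package):
    """List required artifacts that a study's evidence package does not provide."""
    return [name for name in REQUIRED_ARTIFACTS if package.get(name) is None]


# Example package in which two artifacts were never produced.
package = {name: f"reports/{name}.json" for name in REQUIRED_ARTIFACTS}
package["rare_event_coverage"] = None
del package["rollback_triggers"]
print(missing_artifacts(package))
```

A check of this kind does not judge the quality of an artifact; it only makes omissions visible before a gain is reported.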

9. Engineering Patterns for Deployable Intelligent Optimization

Several engineering patterns recur in successful deployments and provide a bridge from research prototypes to operational systems.

9.1. Hierarchical Control and Time-Scale Separation

Slow planning (policy parameters, safety envelopes) should be separated from fast execution (simple actions). This reduces instability and supports validation because each loop can be tested against its own time scale and assumptions.
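A minimal sketch of this separation, using adaptive coding and modulation as the example, is shown below. The slow loop turns a forecast into a safety envelope, and the fast loop picks the best mode for each sample without leaving that envelope. The SNR thresholds are placeholder numbers chosen for illustration, not DVB-S2/S2X values, and the function names are hypothetical.

```python
# Placeholder SNR thresholds (dB) for four modes, lowest to highest rate.
# These numbers are illustrative only and are NOT taken from DVB-S2/S2X.
THRESHOLDS_DB = [1.0, 4.0, 7.0, 10.0]


def slow_plan(forecast_snr_db, margin_db):
    """Slow loop: return the highest mode index the fast loop may use."""
    usable = forecast_snr_db - margin_db
    allowed = [i for i, t in enumerate(THRESHOLDS_DB) if t <= usable]
    return allowed[-1] if allowed else 0  # mode 0 is the most robust


def fast_step(measured_snr_db, max_mode):
    """Fast loop: best mode for the current sample, capped by the envelope."""
    feasible = [i for i, t in enumerate(THRESHOLDS_DB) if t <= measured_snr_db]
    best = feasible[-1] if feasible else 0
    return min(best, max_mode)


# The envelope is computed once per planning interval ...
max_mode = slow_plan(forecast_snr_db=9.0, margin_db=1.5)
# ... and then reused for every fast decision inside that interval.
modes = [fast_step(snr, max_mode) for snr in (12.0, 8.0, 5.0, 0.5)]
print(max_mode, modes)
```

Because the fast loop is a table lookup with a cap, its runtime is bounded, and each loop can be tested on its own: the envelope against forecast error, and the lookup against per-sample measurement noise.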

9.2. Safe Constraint Handling and Conservative Fallbacks

Hard constraints (spectrum, power, timing, fairness) should be enforced explicitly using projections, conservative bounds, or chance constraints. Whenever confidence is low or telemetry is missing, the system should revert to a conservative baseline configuration known to be stable [54,55].
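The sketch below illustrates the two mechanisms for a per-beam power allocation. It uses a simple feasibility repair (clip each beam, then scale the total) instead of an exact Euclidean projection, and it reverts to a stored baseline when confidence is low or telemetry is missing. All names, limits, and thresholds are hypothetical.

```python
def enforce_power_limits(requested_w, per_beam_max_w, total_budget_w):
    """Repair a requested allocation so that both power limits hold.

    Clip-then-scale is a simple repair, not an exact projection.
    """
    clipped = [min(max(p, 0.0), per_beam_max_w) for p in requested_w]
    total = sum(clipped)
    if total <= total_budget_w:
        return clipped
    scale = total_budget_w / total  # scale < 1, so per-beam limits still hold
    return [p * scale for p in clipped]


def select_action(candidate_w, baseline_w, confidence, telemetry_ok,
                  min_confidence, per_beam_max_w, total_budget_w):
    """Apply the candidate only when evidence is adequate; else fall back."""
    if not telemetry_ok or confidence < min_confidence:
        return list(baseline_w)  # conservative configuration known to be stable
    return enforce_power_limits(candidate_w, per_beam_max_w, total_budget_w)


baseline = [15.0, 15.0, 15.0]
candidate = [50.0, 30.0, -5.0]  # violates both the per-beam and the total limit
applied = select_action(candidate, baseline, confidence=0.95, telemetry_ok=True,
                        min_confidence=0.8, per_beam_max_w=40.0,
                        total_budget_w=60.0)
print(applied)
```

The baseline is returned without modification on the fallback path, on the assumption that it was validated against the same limits when it was defined.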

9.3. Validation Ladder: Simulation, Emulation, Hardware-in-the-Loop, In-Orbit

Deployable methods follow a validation ladder: unit simulations, end-to-end emulation with realistic delay and traffic, hardware-in-the-loop modem testing, and limited in-orbit trials with careful monitoring. Tail metrics (outage rate, tail latency, recovery time) matter as much as average throughput.
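To show why tail metrics deserve equal weight, the following sketch reports mean latency next to a nearest-rank 99th percentile and an outage rate. The sample values are invented for illustration, and the function names are hypothetical.

```python
from math import ceil


def percentile_nearest_rank(samples, q):
    """Nearest-rank percentile for 0 < q <= 100 on a non-empty sample."""
    ordered = sorted(samples)
    rank = ceil(q * len(ordered) / 100)
    return ordered[rank - 1]


def tail_report(latencies_ms, link_up):
    """Summarize average and tail behavior for one test run."""
    return {
        "mean_latency_ms": sum(latencies_ms) / len(latencies_ms),
        "p99_latency_ms": percentile_nearest_rank(latencies_ms, 99),
        "outage_rate": link_up.count(False) / len(link_up),
    }


# Invented run: 98 fast samples and two slow ones during a handover.
latencies = [40.0] * 98 + [900.0, 1200.0]
link_state = [True] * 95 + [False] * 5
summary = tail_report(latencies, link_state)
print(summary)
```

In this invented run the mean latency stays near 60 ms while the 99th percentile is 900 ms, so a study that reported only the mean would hide the handover behavior.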

9.4. When Classical Methods Win

When models are trustworthy and the decision space is small, classical optimization and control remain superior due to their interpretability and verifiability. Learning is most appropriate when models are incomplete and context variability is high, but only when paired with safety mechanisms and operational monitoring.

10. Future Research Priorities

The most useful future directions are not generic calls for “more AI” but prioritized research problems that are aligned with standards evolution, deployment feasibility, and the evidence requirements identified in Section 7 and Section 8. To reduce fragmentation, the agenda is grouped below into near-term, medium-term, and longer-term priorities.

10.1. Near-Term Priorities

Near-term priorities are those that can be studied with today’s standards, telemetry, and operations interfaces. The first is standards-compliant optimization for Release 18/19 NTN: future work should map each standardized function to an explicit admissible action set and compare whether model-predictive control, contextual bandits, or policy selection are preferable at each interface [9,10,11,65,67]. The second is safe learning under long delay and partial observability: satellite links make exploration unusually risky, so future work should emphasize conservative policy improvement, constrained partially observable control, and provable outage or fairness limits rather than raw reward maximization [54,55]. The third is continuously calibrated digital twins: research should identify the minimum telemetry needed to keep a twin useful, how uncertainty should be propagated to decision-makers, and how drift should be detected before unsafe recommendations reach the deployment gate [25,31,32,33,36].

10.2. Medium-Term Priorities

Medium-term priorities involve broader architectural change. A first topic is routing and transport co-design for inter-satellite-link-rich constellations, where scheduled connectivity, path asymmetry, handover, and end-to-end congestion control must be optimized jointly rather than in separate silos [13,14,15,16,21,22,23]. A second topic is optimization for regenerative payloads, Store and Forward operation, and onboard edge processing, where the central question is not only algorithm quality but also which control loops should run onboard, which at gateways, and how model updates are governed [9,10,11,27,53,63,64]. A third topic is multi-standard and multi-orbit control, because practical systems increasingly combine DVB-based broadband links [2,3,4,5,6], 3GPP NTN access [7,8,12,41], and IP/DTN mechanisms [13,14,15,16,17,18,46,47,48,49] across different orbital layers.

10.3. Longer-Term Priorities

Longer-term enabling priorities concern trust and research infrastructure. Security and adversarial robustness of closed-loop control remain open because manipulated telemetry, poisoned training data, spoofed timing or location information, or compromised rollback logic can silently bias optimization pipelines [10,18,50,51]. Equally important is evaluation science itself: the field needs open benchmark suites, realistic scenario libraries, shared TN/NTN experimentation platforms, and common reporting protocols for tail metrics, fairness drift, calibration error, and rollback behavior [28,29,30,31,32,33,36,68]. Without these foundations, many apparent gains will remain difficult to reproduce or compare.

11. Conclusions

Intelligent optimization offers a practical path to improved satellite protocol performance as systems scale in complexity, heterogeneity, and software-defined control. The central lesson of this review is that deployment success depends less on algorithmic novelty in isolation than on how convincingly an approach connects measurements, models, actions, and operator trust. Across the protocol stack, the decisive questions are whether the relevant state can be observed with acceptable delay, whether the action can be expressed through standards-compliant control knobs, whether the controller behaves predictably under uncertainty, and whether safe fallback is immediate when evidence weakens.

This perspective motivates the article’s stronger emphasis on practical limits and on a standalone research methodology for twin-assisted protocol studies. The proposed framework treats the digital twin not as a passive simulator but as a governed evidence engine that supports state reconstruction, multi-scale modeling, counterfactual evaluation, rare-event stress testing, shadow-mode validation, and guarded deployment. Its scientific value lies in making research claims auditable: a credible study should report telemetry assumptions, calibration error, uncertainty coverage, action feasibility, shadow-mode disagreement, and rollback criteria, not only mean throughput or delay gains.

Looking ahead, the most impactful work is likely to come from hybrid and standards-compliant designs that combine strong domain models with selective learning, all embedded within disciplined validation ladders and digital-twin-assisted research infrastructure. As NTN standards continue to evolve toward richer architectural options, management hooks, and tighter terrestrial integration, the research community will need shared traces, reproducible evaluation practices, and explicit evidence packages for deployment readiness. The long-term opportunity is not merely smarter protocol tuning, but a more mature science of trustworthy satellite protocol innovation in which optimization, experimentation, and operational governance are designed together from the start.

Funding

This research was funded by the scientific-research project № 253CH0001-04 “Development of infrastructure and environment for aerospace education and research at TU-Sofia/INSATUS/” by the contract with “Research and development sector at TU-Sofia”.

Data Availability Statement

No new data were created or analyzed in this study.

Conflicts of Interest

The author declares no conflicts of interest.

References

Maral, G.; Bousquet, M.; Sun, Z. Satellite Communications Systems: Systems, Techniques and Technology, 6th ed.; Wiley: Hoboken, NJ, USA, 2020.
ETSI EN 302 307-1 V1.4.1; Digital Video Broadcasting (DVB); Second Generation Framing Structure, Channel Coding and Modulation Systems for Broadcasting, Interactive Services, News Gathering and Other Broadband Satellite Applications. Part 1: DVB-S2. ETSI: Valbonne, France, 2014.
ETSI EN 302 307-2 V1.4.1; Digital Video Broadcasting (DVB); Second Generation Framing Structure, Channel Coding and Modulation Systems for Broadcasting, Interactive Services, News Gathering and Other Broadband Satellite Applications. Part 2: DVB-S2 Extensions (DVB-S2X). ETSI: Valbonne, France, 2024.
ETSI TS 102 606-1 V1.2.1; Digital Video Broadcasting (DVB); Generic Stream Encapsulation (GSE). Part 1: Protocol. ETSI: Valbonne, France, 2014.
ETSI TS 101 545-1 V1.3.1; Digital Video Broadcasting (DVB); Second Generation DVB Interactive Satellite System (DVB-RCS2). Part 1: Overview and System Level Specification. ETSI: Valbonne, France, 2020.
ETSI EN 301 545-2 V1.4.1; Digital Video Broadcasting (DVB); Second Generation DVB Interactive Satellite System (DVB-RCS2). Part 2: Lower Layers Satellite Specification. ETSI: Valbonne, France, 2024.
3GPP. Study on New Radio (NR) to Support Non-Terrestrial Networks; 3GPP TR 38.811 (Release 15), v15.x; 2019–2020. Available online: https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=3234 (accessed on 25 March 2026).
3GPP. Solutions for NR to Support Non-Terrestrial Networks (NTN); 3GPP TR 38.821 (Release 16), v16.x; 2019–2023. Available online: https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=3525 (accessed on 25 March 2026).
3GPP. Study on Integration of Satellite Components in the 5G Architecture, Phase 3; 3GPP TR 23.700-29, Release 19; 2024. Available online: https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=4212 (accessed on 25 March 2026).
3GPP. Study on Security Aspects of 5G Satellite Access in the 5G Architecture Phase 3; 3GPP TR 33.700-29, Release 19; Under Change Control; 2025. Available online: https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=4241 (accessed on 25 March 2026).
3GPP. Study on Management Aspects of NTN Phase 2; 3GPP TR 28.874, Release 19; Under Change Control; 2024. Available online: https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=4268 (accessed on 25 March 2026).
3GPP. NR and NG-RAN Overall Description; Stage 2; 3GPP TS 38.300 (Release 17), v17.0.0; 2022. Available online: https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=3191 (accessed on 25 March 2026).
Allman, M.; Glover, D.; Sanchez, L. Enhancing TCP Over Satellite Channels Using Standard Mechanisms; RFC 2488; 1999. Available online: https://www.rfc-editor.org/rfc/rfc2488.html (accessed on 25 March 2026).
Border, J.; Kojo, M.; Griner, J.; Montenegro, G.; Shelby, Z. Performance Enhancing Proxies Intended to Mitigate Link-Related Degradations; RFC 3135; 2001. Available online: https://www.rfc-editor.org/rfc/rfc3135.html (accessed on 25 March 2026).
Balakrishnan, H.; Padmanabhan, V.; Fairhurst, G.; Sooriyabandara, M. TCP Performance Implications of Network Path Asymmetry; RFC 3449; 2002. Available online: https://www.rfc-editor.org/rfc/rfc3449.html (accessed on 25 March 2026).
Iyengar, J.; Thomson, M. QUIC: A UDP-Based Multiplexed and Secure Transport; RFC 9000; 2021. Available online: https://datatracker.ietf.org/doc/rfc9000/ (accessed on 25 March 2026).
Deering, S.; Hinden, R. Internet Protocol, Version 6 (IPv6) Specification; RFC 8200; 2017. Available online: https://www.rfc-editor.org/rfc/rfc8200.html (accessed on 25 March 2026).
Kent, S.; Seo, K. Security Architecture for the Internet Protocol; RFC 4301; 2005. Available online: https://www.rfc-editor.org/rfc/rfc4301.html (accessed on 25 March 2026).
Vázquez, M.Á.; Henarejos, P.; Pappalardo, I.; Grechi, E.; Fort, J.; Gil, J.C.; Lancellotti, R.M. Machine Learning for Satellite Communications Operations. IEEE Commun. Mag. 2021, 59, 22–27.
Fontanesi, G.; Ortiz, F.; Lagunas, E.; Garces-Socarras, L.M.; Baeza, V.M.; Vázquez, M.Á.; Vásquez-Peralvo, J.A.; Minardi, M.; Vu, H.N.; Honnaiah, P.J.; et al. Artificial Intelligence for Satellite Communication: A Survey. IEEE Commun. Surv. Tutor. 2025, 28, 1381–1435.
Shan, Q.; Wang, Z.; Zhang, S.; Meng, Q.; Luo, H. Routing in LEO Satellite Networks: How Many Link-State Updates Do We Need? In Proceedings of the 2023 IEEE International Conference on Satellite Computing (Satellite); IEEE: New York, NY, USA, 2023; pp. 7–12.
Karapantazis, S.; Papapetrou, E. On-Demand Routing in LEO Satellite Systems. In Proceedings of the 2007 IEEE International Conference on Communications; IEEE: New York, NY, USA, 2007.
Li, T. A Routing Architecture for Satellite Networks; RFC 9717 (Informational); 2025. Available online: https://datatracker.ietf.org/doc/rfc9717/ (accessed on 25 March 2026).
Shi, Y.; Wang, W.; Zhu, X.; Zhu, H. Low Earth Orbit Satellite Network Routing Algorithm Based on Graph Neural Networks and Deep Q-Network. Appl. Sci. 2024, 14, 3840.
Bui, T.T.; Nguyen, L.D.; Canberk, B.; Sharma, V.; Dobre, O.A.; Shin, H.; Duong, T.Q. Digital twin-empowered integrated satellite-terrestrial networks toward 6G Internet of Things. IEEE Commun. Mag. 2024, 62, 74–81.
Aygul, M.A.; Turkmen, H.; Cirpan, H.A.; Arslan, H. Machine learning-driven integration of terrestrial and non-terrestrial networks for enhanced 6G connectivity. Comput. Netw. 2024, 255, 110875.
Wang, F.; Zhang, S.; Yang, H.; Quek, T.Q.S. Non-Terrestrial Networking for 6G: Evolution, Opportunities, and Future Directions. Engineering 2025, 54, 56–68.
6G-NTN Consortium. Vision on Non-Terrestrial Networks in 6G System (or IMT-2030): Use Cases, Requirements, and Possible Standardization Approach—A Perspective from the 6G-NTN Project; White Paper; 2024. Available online: https://zenodo.org/records/14007174 (accessed on 25 March 2026).
6G-IA. Research Priorities on Non Terrestrial Networks (NTN); Workshop Report; 6G-IA: Brussels, Belgium, 2024.
6G-IA. Research Priorities; follow-up NTN workshop report; 6G-IA: Brussels, Belgium, 2025.
ETSI GR ZSM 015 V1.1.1; Zero-Touch Network and Service Management (ZSM); Network Digital Twin. ETSI: Valbonne, France, 2024.
Zhou, C.; Yang, H.; Duan, X.; Lopez, D.; Pastor, A.; Wu, Q.; Boucadair, M.; Jacquenet, C. Network Digital Twin: Concepts and Reference Architecture; Internet-Draft draft-irtf-nmrg-network-digital-twin-arch-12, Work in Progress; 2026. Available online: https://datatracker.ietf.org/doc/draft-irtf-nmrg-network-digital-twin-arch/09/ (accessed on 25 March 2026).
Hakiri, A.; Gokhale, A.; Yahia, S.B.; Mellouli, N. A comprehensive survey on digital twin for future networks and emerging Internet of Things industry. Comput. Netw. 2024, 244, 110350.
Mao, B.; Zhou, X.; Liu, J.; Kato, N. Digital Twin Satellite Networks Toward 6G: Motivations, Challenges, and Future Perspectives. IEEE Netw. 2024, 38, 54–60.
Vila, I.; Sallent, O.; Perez-Romero, J. On the Design of a Network Digital Twin for the Radio Access Network in 5G and Beyond. Sensors 2023, 23, 1197.
Zhou, C.; Chen, D.; Martinez-Julia, P.; Ma, Q. Data Collection Requirements and Technologies for Network Digital Twin; Internet-Draft draft-zcz-nmrg-digitaltwin-data-collection-04, Work in Progress; 2026. Available online: https://datatracker.ietf.org/doc/draft-zcz-nmrg-digitaltwin-data-collection/ (accessed on 25 March 2026).
Chen, L.; Wu, L.; Lagunas, E.; Wang, A.; Lei, L.; Chatzinotas, S.; Ottersten, B. Joint Power Allocation and Beam Scheduling in Beam-Hopping Satellites: A Two-Stage Framework with a Probabilistic Perspective. IEEE Trans. Wirel. Commun. 2024, 23, 14685–14701.
Guo, S.; Han, K.; Li, L.; Gong, W. Interference-aware multi-dimensional resource scheduling for beam hopping in NGSO satellite constellations. Adv. Space Res. 2026, 77, 5305–5322.
Wang, J.; Qin, W.; Ran, D. Reinforcement learning-based scheduling strategy for low-orbit satellite beam-hopping resources in hot-spot regions. IFAC-Pap. 2025, 59, 1457–1462.
Zheng, S.; Xing, Z.; Peng, W.; Wenbo, W. Joint Beam Scheduling and Power Optimization for Multi-Beam Satellite Systems. China Commun. 2024, 21, 1–14.
3GPP. Study on Management and Orchestration Aspects of Integrated Satellite Components in a 5G Network; 3GPP TR 28.808 (Release 16). Available online: https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=3617 (accessed on 25 March 2026).
Wang, F.; Zhang, S.; Hong, E.-K.; Quek, T.Q.S. Constellation as a Service: Tailored Connectivity Management in Direct-Satellite-to-Device Networks. IEEE Commun. Mag. 2025, 63, 30–36.
CCSDS 732.0-B-4; AOS Space Data Link Protocol. CCSDS: Reston, VA, USA, 2021.
CCSDS 211.0-B-6; Proximity-1 Space Link Protocol-Data Link Layer. CCSDS: Reston, VA, USA, 2020; (incl. EC updates).
CCSDS 133.0-B-2; Space Packet Protocol. CCSDS: Reston, VA, USA, 2020.
CCSDS 734.1-B-1; Licklider Transmission Protocol (LTP) for CCSDS. CCSDS: Reston, VA, USA, 2015.
RFC 5326; Licklider Transmission Protocol—Specification. Internet Engineering Task Force (IETF): Fremont, CA, USA, 2008. Available online: https://datatracker.ietf.org/doc/rfc5326/ (accessed on 25 March 2026).
CCSDS 734.2-B-1; CCSDS Bundle Protocol Specification. CCSDS: Reston, VA, USA, 2015.
RFC 9171; Bundle Protocol Version 7. Internet Engineering Task Force (IETF): Fremont, CA, USA, 2022. Available online: https://datatracker.ietf.org/doc/html/rfc9171 (accessed on 25 March 2026).
CCSDS 355.0-B-2; Space Data Link Security Protocol. CCSDS: Reston, VA, USA, 2022.
CCSDS 356.1-B-1; Network Layer Security Adaptation Profile. CCSDS: Reston, VA, USA, 2018.
Zhou, D.; Sheng, M.; Bao, C.; Wang, Y.; Li, J.; Han, Z. Mission-Driven Resource Scheduling in Satellite-Terrestrial Networks: From Perspective of Collaboration and Reconfiguration. IEEE Trans. Commun. 2025, 73, 6705–6719.
Jiang, H.; Guo, J.; Xiao, Z.; Yang, K.; Li, T.; Li, B. Collaborative Perception and Computing Offloading in 6G Air-Ground Integrated Networks. IEEE J. Sel. Areas Commun. 2026, 44, 2821–2838.
Ben-Tal, A.; El Ghaoui, L.; Nemirovski, A. Robust Optimization; Princeton University Press: Princeton, NJ, USA, 2009.
García, J.; Fernández, F. A Comprehensive Survey on Safe Reinforcement Learning. J. Mach. Learn. Res. 2015, 16, 1437–1480.
Powell, W.B. Approximate Dynamic Programming: Solving the Curses of Dimensionality; Wiley: Hoboken, NJ, USA, 2011.
Shahriari, B.; Swersky, K.; Wang, Z.; Adams, R.P.; De Freitas, N. Taking the Human Out of the Loop: A Review of Bayesian Optimization. Proc. IEEE 2016, 104, 148–175.
Boyd, S.; Vandenberghe, L. Convex Optimization; Cambridge University Press: Cambridge, UK, 2004.
Bertsekas, D.P. Nonlinear Programming, 3rd ed.; Athena Scientific: Nashua, NH, USA, 2016.
Shapiro, A.; Dentcheva, D.; Ruszczynski, A. Lectures on Stochastic Programming, 2nd ed.; SIAM: Philadelphia, PA, USA, 2014.
Snoek, J.; Larochelle, H.; Adams, R.P. Practical Bayesian Optimization of Machine Learning Algorithms. In Advances in Neural Information Processing Systems (NeurIPS); 2012. Available online: https://papers.nips.cc/paper_files/paper/2012/hash/05311655a15b75fab86956663e1819cd-Abstract.html (accessed on 25 March 2026).
Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction, 2nd ed.; MIT Press: Cambridge, MA, USA, 2018.
National Aeronautics and Space Administration (NASA). NASA’s High Performance Spaceflight Computer; White Paper; National Aeronautics and Space Administration (NASA): Washington, DC, USA, 2024.
Xue, E.; Zhang, Z.; Xue, J.; Wang, H.; Carvajal-Roca, I.E.; He, Z.; Zhang, H.; Wang, H.; Wan, Z.; Li, C. Space Computing: Architectures, Challenges, and Future Directions. Intell. Comput. 2025, 4, 0134.
3GPP. Non-Terrestrial Networks (NTN); 3GPP Technology Overview; 2025. Available online: https://www.3gpp.org/technologies/ntn-overview (accessed on 25 March 2026).
6G-IA Vision Working Group. European Vision for the 6G Network Ecosystem; White Paper v2.0; 2024. Available online: https://zenodo.org/records/14230482 (accessed on 25 March 2026).
3GPP. 3GPP Release Overview; Release 19; 2025. Available online: https://www.3gpp.org/specifications-technologies/releases/release-19 (accessed on 25 March 2026).
ETSI. ETSI Work Programme 2024–2025; ETSI: Valbonne, France, 2024; Available online: https://www.etsi.org/e-brochure/Work-Programme/2024-2025/mobile/index.html#p=1 (accessed on 25 March 2026).

Figure 1. Typical hybrid satellite protocol architecture: bent-pipe GEO/HTS, regenerative LEO/NGSO, gateway edge, core/overlay, and NOC policy functions.

Figure 2. Hierarchical control loops and characteristic time scales in satellite protocol optimization.

Figure 3. Typical standards-compliant optimization pipeline combining estimation, candidate generation, and guarded deployment.

Figure 4. Representative protocol-control skeletons for four recurring application patterns.

Figure 5. Conceptual digital-twin framework for satellite protocol research and staged deployment. The lower loop denotes measured outcomes and policy-audit feedback; the upper loop denotes re-calibration, drift detection, and rare-event library updates.

Table 1. Representative related-work clusters relevant to protocol-level optimization.

Area	Representative Work(s)	Primary Method	What It Contributes	Typical Limitation for Deployment
DVB/broadband foundations	DVB-S2/S2X [2,3], GSE [4], DVB-RCS2 [5,6]	Model-based ACM and capacity assignment	Standards-compliant adaptation hooks for practical systems	Conservative assumptions and limited cross-layer feedback
HTS multi-beam/beam hopping	Zheng et al. [40]; Chen et al. [37]; Guo et al. [38]; Wang et al. [39]	Joint scheduling, power allocation, RL	Interference-aware spatial capacity adaptation	Scalability and limited observability under real traffic
3GPP NTN evolution	TR 38.811 [7]; TR 38.821 [8]; Release 19 studies [9,10,11]	Protocol adaptation analysis	Identifies standardized knobs, timing limits, and future NTN features	Optimization freedom is bounded by interoperability rules
SATCOM ops and AI/ML	Vázquez et al. [19]; Fontanesi et al. [20]; Bui et al. [25]; Aygul et al. [26]	Forecasting, anomaly detection, config recommendation	Brings data-driven control into network operations	Needs trustworthy telemetry, drift monitoring, and operator trust
LEO routing and transport	Shan et al. [21]; Karapantazis and Papapetrou [22]; RFC 9717 [23]; Shi et al. [24]; CaaS [42]	Topology-aware routing and mobility control	Scheduled-connectivity and moving-graph networking	Cross-layer validation remains limited
DTN/CCSDS lineage	AOS [43], Proximity-1 [44], Space Packet [45], LTP/BP [46,47,48,49], security [50,51]	Store-and-forward and contact planning	Mature solutions for disruption tolerance	Higher operational planning and key management burden
6G-native TN/NTN roadmaps	Wang et al. [27], 6G-NTN [28], 6G-IA [29,30], Zhou et al. [52], Jiang et al. [53]	Digital twins, ML integration, architecture roadmaps	Identifies next-generation control problems and test priorities	Still limited protocol-level field evidence

Table 2. Evidence-gap reading of representative recent studies.

Representative Study	Core Contribution	Telemetry Assumption	Action-Surface Assumption	Main Evidence Gap
Chen et al. [37]	Two-stage probabilistic beam hopping	Per-beam demand and interference are observable in each cycle	Continuous recommendations can be converted into feasible hop/power plans	Finite pattern library, rollback rules, and telemetry export are not explicit
Shan et al. [21]	Quantifies needed link-state updates in LEO	Ephemeris and routing state are timely enough for update budgeting	Routing stack can consume scheduled-connectivity information	Transport coupling and failure governance remain outside the loop
Shi et al. [24]	GNN + DQN routing on moving graphs	Rich graph state and training traces are available	Learned policy can directly bias routing decisions	Stale-state behavior and loop-free fallback are not specified
Wang et al. [42]	Tailored multi-constellation direct-satellite-to-device connectivity (CaaS)	Predictive visibility, demand, and mobility information are available	Sub-constellation selection and handover paths are actionable	Standardized DS2D interfaces and operator governance remain open
Zhou et al. [52]	Hierarchical mission-driven scheduling in STNs	Mission urgency and resource status can be aggregated reliably	Cross-cluster reconfiguration is admitted by the system	Concrete standards mapping and telemetry export are not detailed
Bui et al. [25]	Digital-twin-assisted integrated satellite-terrestrial control	Sufficient synchronized telemetry exists to keep the twin calibrated	Twin recommendations can reach real network controllers	Calibration, admissibility, and promotion criteria need stronger specification

Table 3. Mapping of major standards to protocol layers and typical optimization hooks.

Ecosystem	Standard (Examples)	Layer Focus	What Is Standardized	Typical Optimization Levers (Within Compliance)
DVB/ETSI SATCOM	DVB-S2 [2]/S2X [3]; GSE [4]; DVB-RCS2 [5,6]	PHY/MAC + encapsulation	Waveforms, framing, signaling, return-channel procedures	ACM/VCM policy; scheduling; random access tuning; encapsulation overhead trade-offs [2,3,4,5,6]
3GPP NTN	TR 38.811 [7]; TR 38.821 [8]; TS 38.300 [12]	NR/NG-RAN + mobility	Timing relations, Doppler assumptions, NTN scenarios	Profile selection; timing and Doppler compensation parameters; handover thresholds [7,8,9]
CCSDS (space links)	AOS [43]; Proximity-1 [44]; Space Packet [45]; BP [48,49]/LTP [46,47]	Data link + DTN	Frame formats, ARQ options, packet formats, DTN convergence layers	Contact plan selection; retransmission timers; custody/priority scheduling [12,13,14,15,16,17,18]
Security standards	SDLS [50]; network-layer security profile [51]; IPsec [18]	Link/network security	Authentication, encryption, key/credential profiles	Policy selection and key rollover scheduling; anomaly triggers [24,44,45]
IETF transport guidance	TCP satellite guidance [13,15]; PEPs [14]; QUIC [16]	Transport	Mitigation patterns and transport semantics	Congestion control profile selection; pacing; proxy placement; multipath/connection migration policies [19,20,21,22]

Table 4. What can typically be measured and adjusted across layers: inputs, control variables, representative standards hooks, feedback delays, and safety constraints.

Layer	Typical Input Measurements	Primary Control Variables	Representative Standards Hooks	Feedback-Delay Regime	Key Safety/Compliance Constraints
PHY	CQI/SNR or PER summaries, ACK/NACK outcomes, rain-fade indicators	ACM/VCM mode, power, pilot/framing options	DVB-S2/S2X [2,3]; NTN timing/PHY assumptions [7,8]	Sub-second to multi-RTT	Outage/PER bounds, spectral masks, power budget
MAC	Queue backlog, access attempts, beam demand, feeder occupancy	Grant size, slot assignment, persistence, hop pattern	DVB-RCS2 [5,6]; 3GPP random access and mobility [7,8,12]	Frame to seconds	Fairness/QoS, collision risk, payload switching limits
Network	Ephemerides, link-state snapshots, gateway load, weather state	Route weights, gateway choice, ISL admission, handover bias	RFC 9717 [23]; 3GPP mobility hooks [7,8,12]	Seconds to tens of seconds	Loop freedom, continuity, control-plane overhead
Transport/DTN	RTT, ACK spacing, queue delay, contact plan, storage state	CC profile, pacing, proxy choice, custody/priority policy	TCP/PEP RFCs [13,14,15]; QUIC [16]; BP/LTP [46,47,48,49]	RTT to minutes	End-to-end semantics, recovery bounds, storage budget

Table 5. Protocol optimization tasks (Part I): PHY/MAC and network control loops.

Layer/Task	Objectives	Typical Time Scale	Method Families	Deployment Notes
PHY: ACM/VCM	throughput vs. PER/outage	10–1000 ms	lookup + margins; robust optimization; bandits	delayed CSI; conservative envelopes [2,54]
PHY: multi-beam allocation	capacity; fairness; interference	0.1–10 s	convex relaxations; iterative heuristics; surrogate models	non-convex interference; solver runtime limits [37,38,40]
MAC: scheduling/beam hopping	utilization; latency; QoS	10 ms–1 s	decomposed MILP; greedy + local search; RL with constraints	scale dominates; validate fallback schedule [37,38,39]
MAC: random access tuning	access success; delay; energy	seconds–minutes	stochastic models; Bayesian optimization; contextual bandits	burstiness; overload protection [5,57]
Network: ISL routing/TE	delay; load balance; resilience	1–10 s	snapshot routing; flow approximations; topology-aware policies	control overhead vs. optimality [21,22,23]

Table 6. Protocol optimization tasks (Part II): transport, cross-layer planning, and DTN.

Task	Objectives	Typical Time Scale	Method Families	Deployment Notes
Transport: CC tuning	stable throughput; bounded delay	RTT–minutes	control theory; online optimization; safe RL	avoid oscillations; require safety constraints [19,42]
Cross-layer: gateway selection	QoE + weather risk + load	minutes–hours	robust planning; simulation-based search	weather dominates uncertainty; change management [1,38]
DTN: contact planning	delivery prob.; storage use	minutes–days	stochastic programming; heuristics; rollout	needs contact prediction and ops tooling [15,16,17]
Security policy scheduling	confidentiality + availability	hours–months	policy optimization; anomaly-driven triggers	key rollover; adversarial robustness [24,44,45]

Table 7. Comparison of method families for protocol optimization (structured deployment viewpoint).

Family	System Model Required	Online Telemetry Need	Tight-Loop Suitability	Constraint Handling	Verification Effort	Typical Hybrid Role
Convex/deterministic	Explicit analytical model or convex approximation	Low once calibrated	Yes, if problem size is bounded	Native hard constraints	Low–medium	Safety envelope or resource-allocation core
Robust/stochastic	Explicit model plus uncertainty set/scenarios	Low–medium	Conditional	Native worst-case or chance constraints	Medium	Margin selection and risk-limited planning
MILP/combinatorial	Discrete feasible-set model	Low	Conditional after decomposition	Native feasibility + repair	Medium–high	Offline benchmark plus online repair
Metaheuristics	Black-box objective	Low	Generally no	Penalty or repair heuristics	High	Offline design-space tuning
BO/surrogates	Black-box objective plus surrogate	Medium	Not for ms-s; yes for tuning	Safe BO or projection	Medium	Parameter tuning and twin calibration
RL/bandits	State-action-reward formulation	High	Yes after offline training/distillation	External shields or constrained policy	High	Policy selection inside a safe envelope

Table 8. Deployment patterns: where to run optimization and what usually works best.

Where/Constraints	Typical Time Scale	Preferred Method Pattern	Why This Pattern Is Deployable
Onboard real-time (tight compute, determinism)	ms–s	lookup tables; small convex steps; distilled policies	predictable runtime; easier verification; robust fallbacks [2,43,63]
Gateway/network controller (near real-time)	s–min	decomposed optimization + heuristics; constrained RL; BO-tuned parameters	more compute; can use telemetry; still needs stability [37,38,39,61]
NOC/mission planning (offline)	h–days	MILP benchmarks; metaheuristics; scenario-based robust planning	can run heavy simulation and calibration workloads [1,54,64]
Hybrid DTN/disruption-prone links	min–days	contact-aware planning + DTN protocols; stochastic optimization	explicitly models intermittency; supports store-and-forward [46,48,49]

Table 9. Barrier classes, the role of a digital twin, and the minimum evidence needed before operational use.

Barrier Class	Operational Consequence	Twin Function	Minimum Evidence
Delayed observability and sparse feedback	Control decisions are made on stale or partial state; instability and conservative overreaction become likely.	Time alignment, hidden-state estimation, confidence tracking, delayed-label replay.	State-estimation error; confidence calibration; performance under delayed telemetry.
Non-stationarity and rare events	Policies break during weather, handover surges, outages, or tenant traffic bursts.	Scenario library, domain randomization, stress-case generation, rare-event replay.	Tail performance; worst-case regret; coverage of stress scenarios.
Action-space mismatch with standards and vendor interfaces	Candidate actions cannot be executed directly or require unsafe ad hoc translation.	Standards-compliant action abstraction and policy projection onto real control knobs.	Fraction of candidate actions deployable without manual redesign.
Cross-layer coupling and hidden dependencies	A local gain at one layer degrades routing, transport, fairness, or service continuity elsewhere.	Multi-layer co-simulation and counterfactual testing across protocol loops.	Prediction error for cross-layer side effects; QoS and fairness drift.
Governance, security, and rollback	Operators cannot justify, approve, or safely reverse automated changes.	Audit trail, shadow mode, canary rollout, rollback logic, anomaly screening.	Rollback latency; audit completeness; anomaly false-positive and false-negative rates.
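
As one possible way to compute the "fraction of candidate actions deployable without manual redesign" listed in Table 9, the sketch below treats the real control surface as a mapping from knob names to their allowed discrete values. The knob names (`modcod`, `hop_pattern`, `tx_gain_db`) and the function `deployable_fraction` are hypothetical and chosen only for illustration; real interfaces would be defined by the applicable standard and vendor documentation.

```python
from typing import Any, Iterable, Mapping, Set


def deployable_fraction(candidates: Iterable[Mapping[str, Any]],
                        control_surface: Mapping[str, Set[Any]]) -> float:
    """Return the share of candidate actions that use only real knobs
    with allowed values.

    An action counts as deployable when every knob it sets exists in
    control_surface and the requested value is in that knob's allowed set.
    """
    actions = list(candidates)
    if not actions:
        raise ValueError("at least one candidate action is required")

    def is_deployable(action: Mapping[str, Any]) -> bool:
        return all(
            knob in control_surface and value in control_surface[knob]
            for knob, value in action.items()
        )

    return sum(1 for a in actions if is_deployable(a)) / len(actions)


# Hypothetical control surface and candidate actions (illustrative values).
surface = {"modcod": {1, 2, 3}, "hop_pattern": {"A", "B"}}
candidates = [
    {"modcod": 2, "hop_pattern": "A"},  # deployable
    {"modcod": 7},                      # value outside the allowed set
    {"tx_gain_db": 3.5},                # knob not exposed by the interface
    {"hop_pattern": "B"},               # deployable
]
fraction = deployable_fraction(candidates, surface)
```

A low value of this metric indicates that a research policy acts on an abstraction that the operational interface does not expose, which is the action-space mismatch described in the table.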

Table 10. Worked example: applying the five-layer digital-twin framework to beam hopping.

Framework Layer	Beam-Hopping Instantiation	Evidence Artifact	Deployment Gate/Failure Check
Observation	Per-beam queue backlog, served bits, feeder occupancy, rain state, interference summaries	Telemetry schema + delay histogram	Reject optimization if missing telemetry exceeds threshold
Sync and state estimation	Align gateway timestamps to superframe; infer stale queue state	Estimation error and confidence intervals	Do not optimize when confidence falls below τ
Twin core	Traffic model, hop-pattern library, feeder/interference model, demand forecast	Back-test error on served bits and queue trajectories	Recalibrate when drift exceeds ε
Experimentation/optimization	Trace replay, hotspot stress cases, outage injection, policy ranking	Regret vs. baseline; violation count; sensitivity report	Promote only if nominal and stressed traces both improve
Deployment gate	Choose only validated hop patterns; canary on limited beams; rollback to conservative schedule	Shadow disagreement, rollback latency, audit-log completeness	Rollback automatically on low confidence or constraint breach
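
The failure checks in the last column of Table 10 can be read as an ordered sequence of gates. The following is a minimal sketch under stated assumptions: the field names, the `Decision` labels, and the convention that a positive gain means improvement over the baseline are all hypothetical, and the thresholds (maximum missing-telemetry fraction, τ, ε) must be supplied by the operator because the source does not fix their values.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    REJECT_TELEMETRY = "missing telemetry exceeds threshold"
    HOLD_LOW_CONFIDENCE = "state-estimate confidence below tau"
    RECALIBRATE = "twin drift exceeds epsilon"
    KEEP_BASELINE = "no improvement on nominal and stressed traces"
    PROMOTE_CANARY = "promote validated pattern to canary beams"


@dataclass(frozen=True)
class GateInputs:
    missing_fraction: float  # share of expected telemetry that is missing
    confidence: float        # confidence of the estimated queue state
    drift: float             # back-test drift of the twin core
    nominal_gain: float      # improvement vs. baseline on nominal traces
    stressed_gain: float     # improvement vs. baseline on stressed traces


@dataclass(frozen=True)
class GateThresholds:
    max_missing: float
    tau: float
    epsilon: float


def evaluate_gate(x: GateInputs, th: GateThresholds) -> Decision:
    """Apply the Table 10 checks in framework-layer order."""
    if x.missing_fraction > th.max_missing:
        return Decision.REJECT_TELEMETRY
    if x.confidence < th.tau:
        return Decision.HOLD_LOW_CONFIDENCE
    if x.drift > th.epsilon:
        return Decision.RECALIBRATE
    if x.nominal_gain > 0.0 and x.stressed_gain > 0.0:
        return Decision.PROMOTE_CANARY
    return Decision.KEEP_BASELINE


def should_rollback(confidence: float, constraint_breached: bool,
                    tau: float) -> bool:
    """Runtime check after promotion: roll back on low confidence or breach."""
    return constraint_breached or confidence < tau
```

Ordering the checks from observation to deployment means that a policy is never ranked on data the twin cannot trust, and the separate `should_rollback` check keeps the reversal path independent of the optimizer.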

Table 11. A minimal evidence package for digital-twin-based satellite protocol studies.

Stage	Main Question	Minimum Evidence	Typical Risk If Skipped
Problem abstraction	Can the method control a real protocol knob?	Action-space definition, standards mapping, actuation latency bounds	Research action is not deployable
Telemetry design	Is required state observable with acceptable delay?	Telemetry schema, sampling and latency statistics, missing-data policy	Hidden-state leakage or unrealistic benchmark
Twin calibration	Does the twin reproduce nominal and stressed history?	Back-testing, calibration error, uncertainty intervals, drift checks	False confidence in counterfactuals
Stress experimentation	Is the policy safe under rare but plausible events?	Scenario library, tail metrics, safe-set violation counts	Average-case overfitting
Shadow mode	Does the twin agree with the live system before control?	Recommendation logs, disagreement analysis, operator review	Uncaught model drift
Guarded rollout	Can the system recover quickly if evidence weakens?	Canary rules, rollback triggers, post-change forensics	Prolonged unsafe deployment
