Article

Staying Young at the Edge: A Software Aging Perspective for Foundation Models as a Service

Department of Information Engineering, University of Florence, 50139 Firenze, Italy
*
Authors to whom correspondence should be addressed.
Computers 2026, 15(3), 158; https://doi.org/10.3390/computers15030158
Submission received: 8 January 2026 / Revised: 23 February 2026 / Accepted: 28 February 2026 / Published: 3 March 2026
(This article belongs to the Special Issue Best Practices, Challenges and Opportunities in Software Engineering)

Abstract

Nowadays, the emergence of Foundation Models as a Service enables mobile users to access powerful capabilities such as inference and fine-tuning on demand and without incurring local computational overhead. This paper introduces a software-aware offloading framework for FMaaS that allows edge nodes to forecast software aging and prevent service degradation. Each node employs a lightweight Echo State Network to predict its software age, with tasks dynamically assigned based on communication cost, inference delay, and forecast reliability. Simulation results including ablation studies confirm the effectiveness of software age forecasting in reducing task failures and improving session continuity.

1. Introduction

The recent surge in Foundation Models as a Service (FMaaS) is reshaping the paradigm of intelligent computation [1], especially in scenarios involving mobile, embedded, and resource-constrained devices. Instead of running large-scale models locally, which would require substantial memory, computational capacity, and energy, users can now leverage on-demand access to powerful Foundation Models (FMs) deployed across distributed infrastructures. These FMs include Large Language Models (LLMs), vision transformers, and multimodal architectures capable of performing a wide range of complex tasks such as text generation [2], image captioning [3], translation, summarization [4], and question answering [5].
In this context, inference and model adaptation (e.g., fine-tuning or prompt engineering) are offloaded to edge servers or cloud-based platforms, enabling end-user devices to interact with state-of-the-art intelligence through lightweight queries. This decoupling between the model and the device fosters new capabilities in fields such as autonomous systems, smart sensing, and ubiquitous computing while also introducing new challenges related to latency, reliability, and the efficient use of network and computational resources.
The flexibility and effectiveness of FMs make them attractive for real-time decision-making and adaptive service provisioning; yet, their integration into edge networks requires careful orchestration. This includes task placement, resource allocation, and lifetime-aware scheduling, particularly when considering the fragility and aging of edge infrastructure components. As such, FMaaS represents not only a shift in the software delivery model but also a new optimization challenge across the communication-computation continuum. However, the computation nodes hosting FMs typically run containerized or virtualized instances of these learning models within software-defined environments deployed on general-purpose hardware. While such flexibility allows for scalability and dynamic allocation, it also exposes the system to challenges stemming from the software lifecycle [6]. Furthermore, the phenomenon of software aging, i.e., the gradual degradation of software quality due to long-running execution, memory leaks, and resource exhaustion, can significantly impact the reliability and responsiveness of deployed services [7]. In particular, controlled experiments designed to reveal anomalous software behavior or failures are described in [8]; a one-dimensional metric for evaluating SA and its temporal evolution is proposed in [9]; and Reference [10] outlines a model for analyzing software rejuvenation in continuously running applications, quantifying both downtime and the associated costs incurred during rejuvenation. Although conventional offloading strategies consider metrics such as queue length, processing time, and radio link quality, they often disregard the internal health of the software stack. Yet, the operational state of the software plays a central role in determining whether an edge node can effectively support a prolonged session of inference or on-the-fly fine-tuning. 
In this context, forecasting software aging evolution is key to anticipating service degradation. To this end, we propose a lightweight time-series prediction module at each edge node based on Echo State Networks (ESNs). ESNs are a class of Recurrent Neural Networks (RNNs) particularly suitable for modeling complex nonlinear dynamics with minimal training overhead. In our architecture, each node runs an ESN that predicts the short-term trajectory of its software age. Based on these forecasts, incoming user requests can be directed to nodes that are not only close in terms of latency but also expected to remain stable in the near future.
This paper introduces a reliability-aware FMaaS offloading framework that combines ESN-based software aging prediction with task assignment logic across a network of edge nodes. The proposed system aims to extend the duration and quality of intelligent sessions by proactively avoiding nodes with deteriorating software conditions.
The main contributions of this paper are:
  • We propose a distributed FMaaS architecture that enables users to offload inference and fine-tuning requests to a pool of heterogeneous edge nodes.
  • We integrate an ESN at each service node that is trained to forecast the evolution of its software age, thereby enabling predictive reliability estimation. A lightweight offloading policy is designed to account for communication cost, inference cost, and predicted software age, allowing us to optimize user-to-node mapping.
  • We evaluate the proposed framework through simulations, demonstrating significant improvements in session continuity and failure prevention compared to traditional strategies.
To the best of the authors’ knowledge, this is the first study to incorporate software age forecasting into FMaaS offloading strategies with the aim of proactively mitigating degradation risks and enhancing the resilience of foundation model services in dynamic software-driven environments.
The rest of the paper is organized as follows: Section 2 presents an in-depth review of the related literature; Section 3 details the problem statement; Section 4 describes the ESN-based software aging forecasting; Section 5 proposes the integrated communication-inference-software age offloading approach; the system architecture according to the 1+5 Architectural Views Model is discussed in Section 6; the performance evaluation is addressed in Section 7; finally, the limitations of the proposed approach and our conclusions are presented in Section 8 and Section 9, respectively.

2. Related Works

Software Aging (SA) has been widely studied as a phenomenon that affects long-running software systems, leading to performance degradation [11], increased failure rates, and eventual system crashes. Recent studies have extensively investigated SA in containerized edge architectures [12] and heterogeneous virtual networks [6]. The main goal has been to identify aging-related symptoms such as memory leaks, resource exhaustion, and numerical error accumulation, often by relying on analytical models, threshold-based monitoring, and rejuvenation strategies scheduled at fixed or adaptive intervals. In particular, the problem of online failure prediction has been investigated in numerous papers. In [13], the authors proposed a lightweight failure prediction approach to reduce the overhead due to data collection on which the runtime prediction of failure manifestation is performed. A feature extraction-based disk device failures approach was developed in [14], where long-short term memory was applied. In [15], an entropy-based aging metric was formulated to develop a three-failure prediction approach. Experimental evaluations considered a system supporting on-demand video streaming and highlighted the validity of the approach. In [16], the authors integrated data analytics within a fifth-generation network in order to perform runtime time series analysis, allowing them to forecast any threats in the system which could lead to system failure. A proactive fault tolerance framework for modern large-scale storage systems was developed in [17], where disk failures were predicted by designing a unified framework able to extract features in real time, label samples, and train a prediction model. The framework incorporated an online transfer learning procedure. In another approach, a structured Hierarchical Temporal Memory (HTM)-based failure forecasting framework was proposed by [18].
In recent years, data-driven approaches have gained prominence in addressing SA. Classical machine learning techniques such as regression models, support vector machines, decision trees, and clustering algorithms have been employed to predict aging indicators and estimate time to failure. These methods typically rely on handcrafted features extracted from system-level metrics such as CPU usage, memory consumption, Input/Output (I/O) statistics, and response times. While effective in controlled environments, their applicability is often limited by feature engineering costs, poor generalization across heterogeneous systems, and sensitivity to workload variations.
Deep learning models such as RNNs, LSTMs, and autoencoders have been proposed to capture temporal dependencies and nonlinear aging patterns in time series monitoring data. For example, hybrid LSTM models have demonstrated improved forecasting accuracy for SA prediction tasks compared to classical approaches [19]. In broader time-series contexts, deep architectures such as LSTM and RNN variants achieve higher anomaly detection accuracy and richer temporal pattern extraction than traditional Machine Learning (ML) methods [20]. Similarly, LSTM-based autoencoder frameworks have been effectively used for unsupervised anomaly detection by reconstructing normal behavior and highlighting deviations [21]. However, these models usually require significant amounts of labeled data, system-specific training, and dedicated computational resources, which can hinder their adoption in large-scale or multi-tenant environments.
More recent studies have explored Artificial Intelligence (AI)-driven and self-adaptive frameworks for SA management, integrating monitoring, prediction, and rejuvenation decision-making into closed control loops. In particular, reinforcement learning has been investigated to dynamically schedule rejuvenation actions by balancing availability, performance, and operational costs [22]. Systematic reviews have shown that Reinforcement Learning (RL) methods can enable adaptation policies in software systems by learning from interactions with the runtime environment [23], and online reinforcement learning has been proposed to continuously update adaptation strategies in the presence of environmental uncertainty [24]. Although promising, these approaches often assume a static system model and struggle to adapt to evolving software architectures such as microservices, cloud-native platforms, and serverless environments.
The emergence of FMs has recently introduced a paradigm shift in artificial intelligence, moving from task-specific models toward large-scale [25] pretrained architectures capable of general-purpose reasoning [26] and transfer learning across domains [27]. The effectiveness of this paradigm has been further demonstrated by LLMs, which exhibit strong few-shot and zero-shot learning capabilities [28] along with emergent reasoning behaviors across complex analytical tasks [29]. In the context of software systems, FMs have attracted increasing attention within the software engineering community. Recent surveys have reported their successful application to activities such as code understanding, configuration analysis, log interpretation, anomaly detection, and root cause analysis [30]. In particular, LLM-based approaches have shown promising results in processing unstructured and semi-structured operational data [31], including system logs and execution traces, often outperforming traditional rule-based or ML techniques [32]. These capabilities are especially relevant for SA, where degradation symptoms are often subtle, temporally distributed, and spread across heterogeneous data sources. Alongside the evolution of FMs, their deployment paradigm has also shifted toward cloud-based delivery models, commonly referred to as FMaaS. In [33], the authors describe FMaaS as a key enabler for scalable and cost-effective AI adoption that allows organizations to access continuously updated models without the need for local training or infrastructure management. From an operational perspective, FMaaS aligns naturally with modern cloud-native and microservice-based systems, where observability data are already centralized and AI-driven analysis can be seamlessly integrated into monitoring pipelines. Despite these advances, the application of FMs and FMaaS to SA remains largely unexplored. 
While traditional SA studies focus on analytical modeling and system-specific predictors [30], FM-based approaches offer the potential to reason holistically over performance metrics, logs, configuration changes, and workload evolution. Moreover, FMs can act as cognitive components within self-healing and autonomic computing frameworks, allowing them to support aging-aware diagnosis and decision-making. However, open challenges persist regarding explainability, reliability of predictions, data privacy, and the validation of FM-driven actions in production environments. Overall, existing literature suggests that FMs and FMaaS represent a promising yet under-investigated direction for SA analysis and mitigation, motivating further research into unified, AI-driven, and service-oriented aging management frameworks. In particular, existing solutions typically impose significant computational overhead, rendering them unsuitable for continuous FM inference at the edge. To address this gap, unlike existing approaches, which mainly tackle SA through system-specific predictors, centralized analytics, or reactive rejuvenation mechanisms, this work introduces a distributed FMaaS-oriented perspective for SA-aware service management. We propose a framework in which FM inference and fine-tuning are dynamically offloaded across heterogeneous ENs, explicitly accounting for the SA state of each service instance. Unlike prior data-driven SA solutions that focus on failure prediction in isolation, our approach directly integrates predictive reliability estimation (obtained via lightweight ESNs trained locally at each EN) into the offloading decision process. This enables proactive SA-aware user-to-node mapping that jointly considers communication overhead, inference cost, and predicted software degradation.

3. Problem Statement

3.1. Communication and Inference Model

The reference model depicted in Figure 1 considers a set of Edge Nodes (ENs), each located near a specific Small Base Station (SBS), along with a set of clients requesting the execution of learning tasks via foundation models. In line with Agile Software Development principles [34], which emphasize adaptability and continuous feedback, the proposed framework dynamically adjusts task offloading decisions based on predicted software aging conditions. We assume that each client requires the computation of a task $i$, characterized by a model size $s_i$ in bits (e.g., weights or prompt template) and an input payload size $d_i$ (e.g., user query or sensor data). The communication delay $T_i^{\mathrm{comm}}$ associated with offloading task $i$ to EN $j$ includes both uplink and downlink transmission times, and is given by
$$T_i^{\mathrm{comm}} = \frac{s_i + d_i}{R_{i,j}}, \tag{1}$$
where $R_{i,j}$ is the achievable data rate (in Mb/s) between user $i$ and node $j$, which depends on the channel quality and the wireless link conditions.
Once the task is received, node $j$ performs the inference task using its local resources. The inference delay $T_i^{\mathrm{inf}}$ depends on the computational workload $c_i$ (e.g., number of floating-point operations) and on the available computational capacity $\mu_j$ of node $j$, expressed in floating-point operations per second:
$$T_i^{\mathrm{inf}} = \frac{c_i}{\mu_j}. \tag{2}$$
In the case of autoregressive LLMs, $c_i$ can be further specified as a function of the number of generated tokens. In particular, the inference process is composed of a prefill phase and a decode phase [1]: the resulting inference delay is well approximated by the time-to-first-token (TTFT) plus the number of output tokens $n_i^{\mathrm{tok}}$ times the time-per-output-token (TPOT), i.e., $T_i^{\mathrm{inf}} \approx t_{\mathrm{TTFT}} + n_i^{\mathrm{tok}} \cdot t_{\mathrm{TPOT}}$.
Furthermore, each task consumes an amount of memory resources $\rho_i$, expressed in abstract resource blocks rather than physical units, on the assigned EN $j$, which is characterized by a maximum storage capacity $Q_j$. Therefore, the following constraint must hold:
$$\sum_{i} \rho_i \cdot x_{ij} \le Q_j, \tag{3}$$
where $x_{ij}$ is a binary variable equal to 1 if task $i$ is offloaded to EN $j$ and zero otherwise. The total latency experienced by task $i$ when assigned to EN $j$ is then
$$T_i^{\mathrm{total}} = T_i^{\mathrm{comm}} + T_i^{\mathrm{inf}}. \tag{4}$$
This model captures both the transmission cost over the wireless channel and the inference time at the serving node, enabling the evaluation of tradeoffs in edge-cloud task offloading decisions. Although the previous derivation focuses on LLM-based inference, the proposed communication and computation model directly extends to generic service requests and multimodal tasks, as illustrated in Figure 1. In this case, the computational workload $c_i$ captures the overall processing complexity associated with heterogeneous data (e.g., image, audio, or multimodal inputs), while the model formulation remains unchanged. Specifically, for multimodal service requests, the inference workload is expressed directly through the abstract parameter $c_i$, without requiring any token-based decomposition.
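The delay model of this subsection can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the `Task` class, the function names, and the numeric values in the usage note are hypothetical, and the data rate is assumed to be already converted to bit/s so that it matches the sizes in bits.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical container for the per-task quantities of Section 3.1."""
    model_bits: float     # s_i: model size in bits
    payload_bits: float   # d_i: input payload size in bits
    workload_flop: float  # c_i: workload in floating-point operations


def comm_delay(task: Task, rate_bps: float) -> float:
    """T_comm = (s_i + d_i) / R_ij, with R_ij assumed to be given in bit/s."""
    return (task.model_bits + task.payload_bits) / rate_bps


def inference_delay(task: Task, capacity_flops: float) -> float:
    """T_inf = c_i / mu_j, with mu_j in floating-point operations per second."""
    return task.workload_flop / capacity_flops


def llm_inference_delay(ttft_s: float, n_tokens: int, tpot_s: float) -> float:
    """Token-based approximation: T_inf ~ t_TTFT + n_tok * t_TPOT."""
    return ttft_s + n_tokens * tpot_s


def total_latency(task: Task, rate_bps: float, capacity_flops: float) -> float:
    """T_total = T_comm + T_inf."""
    return comm_delay(task, rate_bps) + inference_delay(task, capacity_flops)
```

For instance, with the illustrative values `Task(model_bits=8e6, payload_bits=2e6, workload_flop=5e9)`, a 10 Mb/s link (`10e6` bit/s), and a node capacity of `1e10` FLOP/s, the communication delay is 1.0 s, the inference delay is 0.5 s, and the total latency is 1.5 s.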

3.2. Software Aging Model

In modern virtualized edge infrastructures, service reliability is threatened not only by hardware constraints but also by software degradation over time. This phenomenon, known as SA, refers to the progressive performance degradation or increased failure probability of long-running software systems [7]. SA is typically driven by issues such as memory leakage [10], data corruption, and resource exhaustion [9].
In this paper, we focus on modeling the SA behavior of ENs, which are more susceptible to runtime faults compared to more stable central clouds, for example. We adopt the log-normal distribution as a model for the EN time to failure, following the evidence provided in [35].
The Probability Density Function (PDF) of the log-normal model for the EN time to failure, i.e., the EN software life duration, is given by
$$f(t) = \frac{1}{t\,\sigma_t \sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(\frac{\ln t - \mu}{\sigma_t}\right)^{2}\right), \tag{5}$$
where $t > 0$ denotes the uptime (or software life duration) and $\mu$, $\sigma_t$ are the distribution parameters related to the mean and variability of the node’s lifetime.
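The expected value and variance of $t$ are
$$E[t] = e^{\mu + \sigma_t^{2}/2}, \qquad \mathrm{Var}[t] = e^{2\mu + \sigma_t^{2}}\left(e^{\sigma_t^{2}} - 1\right). \tag{6}$$
From (6), the coefficient of variation (CV) results in
$$\mathrm{CV} = \sqrt{\frac{\mathrm{Var}[t]}{E[t]^{2}}} = \sqrt{e^{\sigma_t^{2}} - 1}, \tag{7}$$
which allows us to estimate $\sigma_t$ from empirical CV data.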
The corresponding Cumulative Distribution Function (CDF) is
$$F(t) = \int_{0}^{t} f(\tau)\, d\tau = \Phi\left(\frac{\ln t - \mu}{\sigma_t}\right), \tag{8}$$
where $\Phi(z)$ is the standard normal CDF. Finally, the instantaneous failure rate (hazard function) is
$$h(t) = \frac{f(t)}{1 - F(t)}. \tag{9}$$
The log-normal SA model enables generation of synthetic SA traces for each $EN_j$, $j \in \mathcal{E}$, by sampling values of $t$ over time.
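As a rough illustration of the log-normal model, the following Python sketch evaluates the PDF, CDF, and hazard function, recovers $\sigma_t$ from an empirical CV, and samples synthetic lifetimes. It relies only on the standard library; the function names, the helper `mu_from_mean` (which inverts the expression for the expected value), and the fixed random seed are assumptions made for this example.

```python
import math
import random


def lognormal_pdf(t: float, mu: float, sigma: float) -> float:
    """Log-normal PDF f(t); zero for non-positive t."""
    if t <= 0:
        return 0.0
    z = (math.log(t) - mu) / sigma
    return math.exp(-0.5 * z * z) / (t * sigma * math.sqrt(2.0 * math.pi))


def lognormal_cdf(t: float, mu: float, sigma: float) -> float:
    """Log-normal CDF F(t) = Phi((ln t - mu) / sigma), with Phi written via erf."""
    if t <= 0:
        return 0.0
    z = (math.log(t) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def hazard(t: float, mu: float, sigma: float) -> float:
    """Hazard h(t) = f(t) / (1 - F(t)); undefined once F(t) rounds to 1."""
    return lognormal_pdf(t, mu, sigma) / (1.0 - lognormal_cdf(t, mu, sigma))


def sigma_from_cv(cv: float) -> float:
    """Invert CV = sqrt(exp(sigma^2) - 1) to obtain sigma."""
    return math.sqrt(math.log(1.0 + cv * cv))


def mu_from_mean(mean: float, sigma: float) -> float:
    """Invert E[t] = exp(mu + sigma^2 / 2) to obtain mu."""
    return math.log(mean) - 0.5 * sigma * sigma


def sample_lifetimes(n: int, mu: float, sigma: float, seed: int = 0) -> list:
    """Draw n synthetic times to failure from the log-normal model."""
    rng = random.Random(seed)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]
```

A typical use would be to fix a target mean lifetime and CV, derive `sigma = sigma_from_cv(cv)` and `mu = mu_from_mean(mean, sigma)`, and then call `sample_lifetimes` once per node to build the synthetic traces.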

3.3. Problem Formulation

The primary goal of this paper is to enhance the service continuity of FMaaS platforms by leveraging edge intelligence to anticipate and mitigate potential failures caused by SA in ENs. Specifically, we aim to reduce the number of offloaded tasks that are assigned to ENs predicted to be at risk of failure, as detailed below, thereby reducing service interruptions and preserving the user experience. To this end, we formulate a reliability-aware task assignment problem that integrates forecasting of SA within the scheduling decisions. Specifically, each client request must be assigned to an EN in a way that avoids nodes expected to exceed a critical aging threshold at the predicted completion time of the task.
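In formal terms, we have
$$\min_{x_{ij} \in \{0,1\}} \sum_{i} \sum_{j} f_{ij} \cdot x_{ij} \tag{10}$$
subject to
$$\sum_{j} x_{ij} = 1, \tag{11}$$
$$\sum_{i} \rho_i \cdot x_{ij} \le Q_j, \tag{12}$$
where
$$f_{ij} = \begin{cases} 1, & \text{if } \exists\, \tau \in [t, t + T_i^{\mathrm{total}}] \text{ such that } \hat{\alpha}_j(\tau) > v, \\ 0, & \text{otherwise}, \end{cases} \tag{13}$$
and $\hat{\alpha}_j(\tau)$ is the forecast software age of node $j$ during the expected task execution, with $v$ as the critical software age threshold.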
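To make the formulation concrete, the sketch below evaluates the risk indicator, the objective, and the capacity constraint for a given assignment. It assumes, for illustration only, that time is discretized into slots, that each node's forecast is a list of predicted ages indexed by slot, and that task durations are expressed in slots; all function and argument names are hypothetical.

```python
def risk_indicator(forecast, start, duration, threshold):
    """f_ij: 1 if the forecast age exceeds the threshold at any slot of
    the window [start, start + duration], 0 otherwise."""
    window = forecast[start:start + duration + 1]
    return 1 if any(age > threshold for age in window) else 0


def objective(assignment, forecasts, start, durations, threshold):
    """Number of tasks placed on nodes predicted to cross the threshold.

    assignment[i] is the node chosen for task i, so each task is mapped
    to exactly one node by construction; durations[i][j] is the total
    latency of task i on node j, in slots.
    """
    return sum(
        risk_indicator(forecasts[j], start, durations[i][j], threshold)
        for i, j in enumerate(assignment)
    )


def is_feasible(assignment, demand, capacity):
    """Check the capacity constraint: sum of rho_i on each node <= Q_j."""
    load = [0.0] * len(capacity)
    for i, j in enumerate(assignment):
        load[j] += demand[i]
    return all(used <= q for used, q in zip(load, capacity))
```

With two nodes whose forecasts are `[0.5, 0.6, 0.7, 0.85, 0.9]` and `[0.3, 0.35, 0.4, 0.45, 0.5]`, a threshold of 0.8, and two tasks lasting 2 and 3 slots, placing both tasks on the first node yields an objective of 1 (the second task overlaps the slot where the age reaches 0.85), whereas moving the second task to the other node yields 0.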

4. Software Aging Forecasting

ESNs are a specific class of RNNs developed within the Reservoir Computing paradigm, which enables temporal pattern learning with minimal training complexity. Unlike conventional RNNs, ESNs avoid issues such as vanishing gradients during backpropagation thanks to their distinctive architecture in which only the output layer undergoes supervised training [36]. The ESN architecture consists of three main components [36]:
  • The input weight matrix $W_{\mathrm{in}}$ that maps the external input signal into the reservoir. This matrix is not trained; it is randomly initialized and kept fixed during learning;
  • The recurrent reservoir matrix $W_r$ that defines the internal recurrent connections among the reservoir neurons;
  • The output weight matrix $W_{\mathrm{out}}$ that maps the reservoir state to the network output and represents the only trainable component of the model.
The reservoir itself comprises a large number of sparsely connected nonlinear units with internal connections that remain fixed after initialization. Combined with random initialization and a properly scaled spectral radius, this sparsity enables the reservoir to encode temporal dependencies over time with high efficiency. Formally, let $x(q) \in \mathbb{R}^{q}$ be the input vector at time $q$. The reservoir state vector $u(q) \in \mathbb{R}^{s}$ evolves according to
$$u(q) = \tanh\left(W_{\mathrm{in}}\, x(q) + W_r\, u(q-1)\right), \tag{14}$$
where the $\tanh$ activation captures the nonlinear dynamics of the internal state update. The output is then computed as
$$v(q) = W_{\mathrm{out}}\, u(q), \tag{15}$$
where $W_{\mathrm{out}}$ is the only trainable parameter set, typically learned via ridge regression over a set of collected internal states and desired outputs.
The reservoir component plays a central role in retaining memory of past inputs, making ESNs suitable for time series prediction problems such as estimating the residual life or failure likelihood of ENs based on SA indicators. In our case, each $EN_j$, $j \in \mathcal{E}$, is associated with a time series $S_e$ representing its software age evolution, i.e., the degradation trend due to factors such as memory leaks or resource saturation. The ESN is trained to forecast the software age for time horizons $\delta$ ranging from 1 up to a value typically matching the total service time experienced by the client [37], i.e., $\delta = 1, \ldots, T_i^{\mathrm{total}}$.
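The state update and the ridge-regression readout can be illustrated with a small, dependency-free Python sketch. Several simplifications are assumptions of this example and not part of the paper: the input and output are scalar, one readout is trained for a single horizon, and the reservoir matrix is rescaled by its maximum absolute row sum (an upper bound on the spectral radius) instead of by the spectral radius itself. The class name, default values, and the `washout` parameter are hypothetical.

```python
import math
import random


def _solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


class EchoStateNetwork:
    """Scalar-input, scalar-output ESN with a ridge-regression readout."""

    def __init__(self, n_reservoir=20, scale=0.9, density=0.2, ridge=1e-4, seed=0):
        rng = random.Random(seed)
        self.n, self.ridge = n_reservoir, ridge
        # Fixed random input weights W_in (never trained).
        self.w_in = [rng.uniform(-0.5, 0.5) for _ in range(self.n)]
        # Fixed sparse random reservoir W_r.
        w = [[rng.uniform(-1.0, 1.0) if rng.random() < density else 0.0
              for _ in range(self.n)] for _ in range(self.n)]
        # Assumption: rescale by the max absolute row sum, which bounds
        # the spectral radius from above, rather than computing it exactly.
        norm = max(sum(abs(v) for v in row) for row in w) or 1.0
        self.w_r = [[v * scale / norm for v in row] for row in w]
        self.w_out = [0.0] * self.n

    def _states(self, series):
        """Collect u(q) = tanh(W_in x(q) + W_r u(q-1)) for every sample."""
        state, states = [0.0] * self.n, []
        for x in series:
            state = [math.tanh(self.w_in[k] * x
                               + sum(wr * s for wr, s in zip(self.w_r[k], state)))
                     for k in range(self.n)]
            states.append(state)
        return states

    def fit(self, series, horizon=1, washout=20):
        """Train W_out so that the state at time q predicts series[q + horizon]."""
        states = self._states(series[:-horizon])[washout:]
        targets = series[washout + horizon:]
        n = self.n
        # Normal equations of ridge regression: (U^T U + beta I) w = U^T y.
        a = [[sum(s[r] * s[c] for s in states) + (self.ridge if r == c else 0.0)
              for c in range(n)] for r in range(n)]
        b = [sum(s[r] * y for s, y in zip(states, targets)) for r in range(n)]
        self.w_out = _solve(a, b)
        return self

    def predict(self, series):
        """Forecast the value `horizon` steps after the last sample of series."""
        state = self._states(series)[-1]
        return sum(w * s for w, s in zip(self.w_out, state))
```

A possible usage is `EchoStateNetwork(seed=1).fit(history, horizon=5).predict(recent)`, where `history` and `recent` are lists of past software-age samples. Covering all horizons $\delta$ would require either one readout per horizon or iterated one-step forecasts; this sketch leaves that choice open.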
To improve prediction accuracy, we adopt a Genetic Algorithm (GA) to optimize the ESN hyperparameters, specifically the number of reservoir neurons $R$ and the spectral radius $\alpha$. Unlike exhaustive grid searches or manual tuning, the GA provides an adaptive search over the hyperparameter space, favoring configurations that minimize prediction error.
The GA operates iteratively on a population of candidate solutions $\gamma = (R, \alpha)$, evaluating each individual via a fitness function defined as the ESN’s prediction error on the training set. At each generation, the top-performing individuals (the “elite”) are retained, while new individuals are generated through crossover and mutation operations. This process continues until a maximum number of generations $q$ is reached [38] or convergence criteria are met [39].
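The GA loop described above might look as follows in Python. This is a sketch under stated assumptions: the elite size, the mutation rate and step sizes, the blend crossover, and the parent-selection rule (top half of the population) are illustrative choices not specified in the text, and `fitness` is a user-supplied callable returning the prediction error for a pair $(R, \alpha)$.

```python
import random


def genetic_search(fitness, bounds, pop_size=30, generations=50,
                   elite=2, mutation_rate=0.2, seed=0):
    """Minimise fitness(R, alpha) over an integer R and a real alpha.

    bounds = ((r_lo, r_hi), (a_lo, a_hi)) delimits the search space.
    """
    rng = random.Random(seed)
    (r_lo, r_hi), (a_lo, a_hi) = bounds

    def random_individual():
        return (rng.randint(r_lo, r_hi), rng.uniform(a_lo, a_hi))

    def crossover(p1, p2):
        # Inherit R from either parent and blend the two alpha values.
        w = rng.random()
        return (rng.choice((p1[0], p2[0])), w * p1[1] + (1.0 - w) * p2[1])

    def mutate(individual):
        r, a = individual
        if rng.random() < mutation_rate:
            r = min(r_hi, max(r_lo, r + rng.randint(-5, 5)))
        if rng.random() < mutation_rate:
            a = min(a_hi, max(a_lo, a + rng.gauss(0.0, 0.05)))
        return (r, a)

    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        # Rank individuals by prediction error (lower is better).
        population.sort(key=lambda ind: fitness(*ind))
        parents = population[:max(2, pop_size // 2)]
        children = [mutate(crossover(*rng.sample(parents, 2)))
                    for _ in range(pop_size - elite)]
        # The elite survive unchanged; the rest are replaced by offspring.
        population = population[:elite] + children
    return min(population, key=lambda ind: fitness(*ind))
```

In practice `fitness` would train an ESN with the candidate hyperparameters and return its error, which makes every call costly; caching the fitness of already-evaluated individuals is a natural refinement that is omitted here for brevity.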

5. Proposed Algorithm

In this section, we detail the offloading heuristic designed to minimize the SA risk during task assignment in FMaaS systems (see Algorithm 1). The goal is to assign each user request to an EN that offers the best trade-off between latency and reliability, without violating capacity constraints and while accounting for potential software degradation over the task execution window.
The algorithm proceeds iteratively: each user selects the node minimizing the latency-risk metric, given by
$$\Psi_{ij} = T_i^{\mathrm{comm}}(j) + T_i^{\mathrm{inf}}(j) + \lambda \cdot \max_{\tau \in [t,\, t + T_i^{\mathrm{total}}]} \hat{\alpha}_j(\tau), \tag{16}$$
where the last term represents the maximum forecast software age of $EN_j$, with $j \in \mathcal{E}$, during the expected task execution interval. This pessimistic evaluation captures the worst-case aging condition that may impact session continuity.
A proposal for the computation of task $i$ is submitted to the $EN_j$ minimizing $\Psi_{ij}$. Upon receiving a set of proposals, each $EN_j$ evaluates the candidates based on their associated SA risk, i.e., the maximum $\hat{\alpha}_j(\tau)$ during execution. The node accepts the proposal with the lowest such risk as long as its available capacity $Q_j$ is sufficient to accommodate the task. All other proposals are rejected.
Rejected clients are allowed to reattempt assignment in subsequent rounds. Importantly, ENs are not excluded from future proposals by clients that were previously rejected; clients may retry the same EN if they still consider it the most reliable according to (16). This feedback-driven mechanism continues until all tasks are successfully assigned or all nodes are saturated.
Algorithm 1 Risk-Aware Offloading Heuristic
  1: Input: Clients $\mathcal{U}$, edge nodes $\mathcal{N}$, forecasts $\hat{\alpha}_j(\tau)$, task memory $\rho_i$, capacities $Q_j$, weight $\lambda$
  2: Initialize all clients as unassigned
  3: while there exist unassigned clients do
  4:    for all unassigned clients $i \in \mathcal{U}$ do
  5:        for all nodes $j \in \mathcal{N}$ do
  6:            Compute: $\Psi_{ij} = T_i^{\mathrm{comm}}(j) + T_i^{\mathrm{inf}}(j) + \lambda \cdot \max_{\tau \in [t,\, t + T_i^{\mathrm{total}}]} \hat{\alpha}_j(\tau)$
  7:        end for
  8:        Propose to $j^{*} = \arg\min_{j} \Psi_{ij}$
  9:    end for
 10:    for all nodes $j$ with proposals do
 11:        Let $P_j$ be the set of proposals received
 12:        Select $i^{*} = \arg\min_{i \in P_j} \max_{\tau} \hat{\alpha}_j(\tau)$
 13:        if $\rho_{i^{*}} \le$ available capacity of $j$ then
 14:            Accept $i^{*}$, update $Q_j \leftarrow Q_j - \rho_{i^{*}}$
 15:        end if
 16:        Reject all $i \in P_j \setminus \{i^{*}\}$
 17:    end for
 18: end while
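A compact Python rendering of the propose-and-accept loop is given below as a sketch, not as the authors' code. It takes precomputed matrices as input: `latency[i][j]` stands for $T_i^{\mathrm{comm}}(j) + T_i^{\mathrm{inf}}(j)$ and `risk[i][j]` for the maximum forecast age of node $j$ over the execution window of task $i$. One assumption departs from the listing: clients only propose to nodes whose remaining capacity can host their task, so every round assigns at least one client and the loop is guaranteed to terminate.

```python
def risk_aware_offloading(latency, risk, demand, capacity, lam=1.0):
    """Assign clients to nodes by iterated proposals.

    latency[i][j]: communication plus inference delay of client i on node j
    risk[i][j]:    max forecast software age of node j during task i
    demand[i]:     memory blocks rho_i required by task i
    capacity[j]:   capacity Q_j of node j
    Returns {client: node}; clients that fit nowhere are left out.
    """
    n_nodes = len(capacity)
    remaining = list(capacity)
    assignment = {}
    unassigned = set(range(len(demand)))

    while unassigned:
        proposals = {}  # node -> list of clients proposing to it
        for i in sorted(unassigned):
            # Assumption: skip nodes that can no longer host task i.
            feasible = [j for j in range(n_nodes) if demand[i] <= remaining[j]]
            if not feasible:
                continue
            # Latency-risk metric Psi_ij for each feasible node.
            best = min(feasible, key=lambda j: latency[i][j] + lam * risk[i][j])
            proposals.setdefault(best, []).append(i)

        if not proposals:
            break  # all remaining clients are blocked: nodes are saturated

        for j, clients in proposals.items():
            # Each node accepts only the proposal with the lowest risk.
            winner = min(clients, key=lambda i: risk[i][j])
            assignment[winner] = j
            remaining[j] -= demand[winner]
            unassigned.discard(winner)
        # Clients not accepted stay unassigned and retry in the next round.

    return assignment
```

As a small example, with `latency = [[1.0, 2.0], [1.0, 2.0], [1.5, 1.0]]`, `risk = [[0.2, 0.1], [0.5, 0.1], [0.9, 0.3]]`, unit demands, and capacities `[1, 2]`, clients 0 and 1 both propose to node 0 in the first round; node 0 accepts client 0 (lower risk), and client 1 is redirected to node 1 in the second round.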

6. System Architecture

The proposed system architecture is illustrated in Figure 2 and is described according to the 1+5 Architectural Views Model, with the use-case view acting as the integrating perspective. The 1+5 Architectural Views Model [40,41] is an architecture description approach that represents a system through six complementary views, all driven by a single set of architectural requirements and scenarios. The model separates concerns to address different stakeholder needs while maintaining architectural consistency. The views include Integrated Processes, Use Cases, Logical, Contracts, Integrated Services, and Deployment Perspectives, enabling comprehensive understanding, communication, and validation of the system architecture. From a logical view, the system consists of clients connected through a wireless access network to a set of ENs, which jointly provide FMaaS capabilities. Each EN embeds a modular execution stack comprising a foundation model inference engine, a software aging monitoring component, and an ESN-based aging predictor, following architectural principles similar to those adopted in reliability-aware edge systems such as those discussed in [41]. The process view captures the dynamic interaction among components, including task offloading, parallel inference execution, and a closed-loop monitoring–forecasting–decision process that proactively incorporates predicted software aging into resource selection. From a development view, the architecture is organized into loosely coupled and reusable components, enabling extensibility and facilitating the integration of predictive reliability mechanisms, as advocated by recent edge intelligence frameworks. The Deployment view, shown in Figure 1, depicts the deployment of ENs in proximity to Small Base Stations within a distributed wireless edge infrastructure, where nodes operate under constrained computational and memory resources. 
Finally, the use-case view consolidates all architectural perspectives by modeling the main operational scenario in which a client request is dynamically assigned to the most suitable EN by jointly considering latency, inference performance, and predicted software aging, thereby enhancing service continuity and system resilience.

7. Performance Analysis

A performance analysis is provided to validate the effectiveness of the proposed prediction scheme and demonstrate the overall framework’s ability to ensure service continuity in the presence of EN failures. The objective of our analysis is to answer the following two research questions:
  • Q1. How does the prediction accuracy vary as the temporal forecasting horizon increases, and how does it compare with a different prediction method?
  • Q2. How does the proposed FMaaS framework perform in comparison with the same scheme without SA forecasting?
In particular, three error metrics have been taken into account in order to thoroughly assess the performance of the proposed ESN. We have considered the Mean Squared Error (MSE), defined as
$$\mathrm{MSE} = \frac{1}{N} \sum_{t=1}^{N} \left(\hat{y}_{t+\Delta t_l} - y_{t+\Delta t_l}\right)^{2}, \tag{17}$$
where $N$ represents the number of samples in the test dataset, while $\hat{y}_{t+\Delta t_l}$ and $y_{t+\Delta t_l}$ represent the forecasted and actual values at time $t + \Delta t_l$, respectively. In addition, we evaluate the Mean Absolute Percentage Error (MAPE), defined as
$$\mathrm{MAPE} = \frac{1}{N} \sum_{t=1}^{N} \left|\frac{\hat{y}_{t+\Delta t_l} - y_{t+\Delta t_l}}{y_{t+\Delta t_l}}\right| \cdot 100, \tag{18}$$
and the Mean Absolute Deviation (MAD), defined as
$$\mathrm{MAD} = \frac{1}{N} \sum_{t=1}^{N} \left|\hat{y}_{t+\Delta t_l} - y_{t+\Delta t_l}\right|. \tag{19}$$
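The three metrics translate directly into code. The sketch below assumes that forecasts and actual values are given as equal-length Python lists and that no actual value is zero (otherwise MAPE is undefined); the function names are illustrative.

```python
def mse(forecast, actual):
    """Mean Squared Error between forecasted and actual values."""
    return sum((f - a) ** 2 for f, a in zip(forecast, actual)) / len(actual)


def mape(forecast, actual):
    """Mean Absolute Percentage Error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((f - a) / a) for f, a in zip(forecast, actual)) / len(actual)


def mad(forecast, actual):
    """Mean Absolute Deviation between forecasted and actual values."""
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / len(actual)
```

For example, with actual values `[0.5, 0.4, 0.8]` and forecasts `[0.55, 0.36, 0.8]`, the errors are 0.05, -0.04, and 0, giving an MSE of about 0.00137, a MAPE of about 6.67%, and a MAD of 0.03.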
To generate synthetic traces of SA, we simulate each EN’s lifetime using log-normal distributions to capture the variability in software degradation patterns. For each configuration, 400 samples are generated, representing different time evolutions of software age. The system consists of three ENs serving a spatially distributed population of clients with positions that follow a Poisson point process. To assess scalability, the number of clients is varied from 10 to 40. Software age predictions are learned using ESNs, trained to minimize the MSE between predicted and actual aging trajectories. The ESN hyperparameters are tuned through a genetic algorithm with a population of 30 individuals and a budget of 50 generations. The critical software age threshold is set to $v = 0.8$, reflecting the point beyond which the risk of failure significantly increases in the simulated aging traces.
We evaluate the predictive accuracy of the proposed ESN-based forecasting model against a baseline Moving Average (MA) approach, varying the prediction horizon from short-term to long-term windows. As shown in Figure 3, the ESN consistently achieves lower values of MSE across all horizons, demonstrating superior ability to capture the nonlinear aging dynamics. The performance gap becomes more pronounced as the prediction horizon increases, confirming the advantage of temporal memory and internal state evolution in ESNs when modeling complex software degradation patterns. This behavior is further confirmed in Figure 4 and Figure 5, where the ESN also outperforms the baseline in terms of MAPE and MAD, respectively. These results underline the robustness of ESN-based forecasting for aging-aware edge management.
Figure 6 presents the failure rate as a function of the number of clients, comparing the full proposed strategy with an ablated version that omits the forecasting module.
In the latter, task assignment decisions are based solely on communication and inference latency, without considering SA trends.
Figure 7 shows the failure rate as a function of the critical software age threshold. Raising the threshold progressively reduces the number of potential failure events, since higher threshold values implicitly lower the probability of software-induced failures. This behavior moves the system closer to an ideal (but not practically attainable) operating regime in which software components do not fail and the system achieves full reliability.
Figure 8 reports the system failure rate as a function of the number of edge nodes. As expected, increasing the number of nodes leads to a higher overall failure rate due to the larger set of software instances and execution points involved in task processing. However, the proposed forecast-aware strategy consistently achieves lower failure rates than the baseline approach across all configurations. This indicates that predictive information remains effective in mitigating software-induced failures even as the system scales. The narrow confidence intervals further suggest that the observed performance gap is statistically stable and not attributable to random fluctuations.
Table 1, Table 2 and Table 3 jointly provide an ablation and sensitivity analysis of the proposed GA-based hyperparameter optimization. Table 1 isolates the impact of reservoir size on prediction accuracy, highlighting the systematic performance improvement achieved by the GA-optimized ESN with respect to the standard configuration. Table 2 and Table 3 further analyze the sensitivity of the optimization process to the population size, reporting the corresponding gains in accuracy and the associated computational cost. Based on the reported values, we selected a population size of 30, which represents a balanced tradeoff between prediction accuracy and computational effort.
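The tradeoff between accuracy and computational effort can be made concrete with the values reported in Table 2 and Table 3. The short sketch below (variable names are ours) computes the accuracy gained and the extra computation time paid at each increase in GA population size.

```python
# Values copied from Table 2 (prediction accuracy) and Table 3 (computation time).
population = [10, 30, 50]
accuracy = [0.931, 0.988, 0.993]
time_s = [785, 1238, 1896]

# For each consecutive pair of population sizes, record the marginal
# accuracy gain and the additional computation time in seconds.
steps = []
for i in range(1, len(population)):
    gain = round(accuracy[i] - accuracy[i - 1], 3)
    extra = time_s[i] - time_s[i - 1]
    steps.append((population[i - 1], population[i], gain, extra))

for small, large, gain, extra in steps:
    print(f"{small} -> {large}: accuracy +{gain:.3f}, time +{extra} s")
```

Moving from 10 to 30 individuals adds 0.057 in accuracy for 453 s of extra computation, whereas moving from 30 to 50 adds only 0.005 for a further 658 s. This is consistent with the population of 30 individuals used in the experiments.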
As Figure 6 shows, removing software age forecasting leads to a significant increase in failure events, especially under higher load conditions. This confirms the central insight of our work: integrating predictive information on software degradation into the decision process is essential to improving reliability and service continuity in FMaaS systems.
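The difference between the two policies compared in the ablation can be illustrated with a simplified sketch. It is not the decision metric defined earlier in the paper: here the forecast-aware policy is approximated by a filter-then-minimize rule, and the `EdgeNode` fields, node values, and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

CRITICAL_AGE = 0.8  # critical software age threshold v used in the simulations


@dataclass
class EdgeNode:
    name: str
    comm_delay: float       # communication latency toward the client (s), hypothetical
    inference_delay: float  # expected inference latency (s), hypothetical
    forecast_age: float     # forecast software age over the session horizon


def latency_only(nodes: List[EdgeNode]) -> EdgeNode:
    """Ablated policy: choose the node with the lowest total latency."""
    return min(nodes, key=lambda n: n.comm_delay + n.inference_delay)


def forecast_aware(nodes: List[EdgeNode],
                   threshold: float = CRITICAL_AGE) -> Optional[EdgeNode]:
    """Simplified stand-in for the full policy: discard nodes whose forecast
    software age reaches the threshold, then minimize latency over the rest."""
    healthy = [n for n in nodes if n.forecast_age < threshold]
    if not healthy:
        return None  # no node is forecast to remain below the threshold
    return latency_only(healthy)


nodes = [
    EdgeNode("EN1", comm_delay=0.01, inference_delay=0.05, forecast_age=0.85),
    EdgeNode("EN2", comm_delay=0.02, inference_delay=0.06, forecast_age=0.40),
    EdgeNode("EN3", comm_delay=0.05, inference_delay=0.07, forecast_age=0.60),
]

print("latency-only   ->", latency_only(nodes).name)
print("forecast-aware ->", forecast_aware(nodes).name)
```

In this toy configuration the latency-only policy selects EN1, the closest node, even though its forecast age exceeds the threshold, while the forecast-aware rule accepts a slightly higher latency and selects EN2.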

8. Discussion and Limitations

The proposed framework presents several strengths. First, it introduces a predictive reliability dimension into FMaaS offloading decisions, enabling proactive mitigation of service degradation rather than reactive fault handling. Second, the use of lightweight ESNs ensures low training overhead and scalability across distributed ENs, making the approach suitable for resource-constrained environments. Third, the integration of communication, computation, and software health indicators within a unified decision metric provides a holistic orchestration strategy that better reflects real operational conditions compared to latency-only policies.
Nevertheless, some limitations must be acknowledged. The proposed model relies on synthetic SA traces and assumes that software degradation follows a known statistical distribution, which may not fully capture the complexity and nonstationarity of real-world systems. In addition, the effectiveness of the prediction module depends on the quality and representativeness of monitoring data, which in practice can be noisy, incomplete, or delayed. Another potential limitation concerns scalability in highly dynamic scenarios where frequent workload fluctuations or software updates may alter aging patterns faster than the predictor can adapt. Furthermore, while ESNs offer efficiency advantages, they may exhibit reduced interpretability compared to more transparent analytical models, potentially complicating root-cause analysis and system diagnostics. Finally, the proposed heuristic focuses on minimizing failure risk rather than jointly optimizing multiple long-term system objectives such as fairness, energy efficiency, or economic cost. Addressing these aspects requires extending the framework toward multi-objective and adaptive formulations, which constitutes a relevant direction for future work.

9. Conclusions

In this paper, we have proposed a novel distributed offloading framework for FMaaS specifically designed to address and mitigate SA in edge computing infrastructures. To prevent service degradation, we integrate a predictive module based on ESNs at the EN level. The proposed method demonstrates several crucial advantages and significant improvements over traditional allocation strategies:
  • Low overhead and high efficiency: ESNs model nonlinear SA dynamics with minimal computational and training overhead, which keeps the architecture lightweight and suitable for resource-constrained edge environments.
  • Resilience and proactive failure prevention: Unlike purely reactive approaches, the proposed system anticipates degradation. By forecasting short-term software health trajectories, the framework preemptively redirects user requests away from nodes approaching operational exhaustion, reducing the task failure rate in our simulations.
  • Unified decision metric: The offloading policy dynamically allocates tasks by jointly balancing communication cost, inference delay, and the forecast reliability of the EN, so that users are served by ENs offering a favorable tradeoff between proximity (low latency) and operational stability.
  • Improved quality of service: As validated by our simulation results, SA-aware orchestration enhances user session continuity, a fundamental requirement for demanding interactions such as on-demand fine-tuning or prolonged inference sessions.
In summary, the integration of SA forecasting into FMaaS orchestration represents a fundamental step towards increasing the resilience and reliability of edge intelligence. Future work will focus on extending the framework to incorporate dynamically triggered software rejuvenation mechanisms as well as on optimizing the offloading policies for large-scale multi-tenant scenarios.

Author Contributions

Conceptualization, B.P. and R.F.; Methodology, B.P. and R.F.; Software, B.P. and R.F.; Validation, B.P. and R.F.; Formal analysis, B.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zhou, H.; Hu, C.; Yuan, D.; Yuan, Y.; Wu, D.; Liu, X.; Han, Z.; Zhang, J. Generative AI as a service in 6G edge-cloud: Generation task offloading by in-context learning. IEEE Wirel. Commun. Lett. 2025, 14, 711–715.
  2. Qiao, L.; Mashhadi, M.B.; Gao, Z.; Foh, C.H.; Xiao, P.; Bennis, M. Latency-aware generative semantic communications with pre-trained diffusion models. IEEE Wirel. Commun. Lett. 2024, 13, 2652–2656.
  3. Fu, M.; Wang, P.; Liu, M.; Zhang, Z.; Zhou, X. IoV-BERT-IDS: Hybrid network intrusion detection system in IoV using large language models. IEEE Trans. Veh. Technol. 2025, 74, 1909–1921.
  4. Liu, Q.; Mu, J.; Chen, D.; Zhang, R.; Liu, Y.; Hong, T. LLM enhanced reconfigurable intelligent surface for energy-efficient and reliable 6G IoV. IEEE Trans. Veh. Technol. 2025, 74, 1830–1838.
  5. Zhou, H.; Hu, C.; Yuan, Y.; Cui, Y.; Jin, Y.; Chen, C.; Wu, H.; Yuan, D.; Jiang, L.; Wu, D.; et al. Large language model (LLM) for telecommunications: A comprehensive survey on principles, key techniques, and opportunities. IEEE Commun. Surv. Tutor. 2025, 27, 1955–2005.
  6. Avritzer, A.; Janes, A.; Marin, A.; Trubiani, C.; van Hoorn, A.; Camilli, M.; Menasché, D.S.; Bondi, A.B. Software aging detection and rejuvenation assessment in heterogeneous virtual networks. IEEE Trans. Emerg. Top. Comput. 2025, 13, 299–313.
  7. Vizarreta, P.; Sieber, C.; Blenk, A.; Bemten, A.V.; Ramachandra, V.; Kellerer, W.; Mas-Machuca, C.; Trivedi, K. ARES: A framework for management of aging and rejuvenation in softwarized networks. IEEE Trans. Netw. Serv. Manag. 2021, 18, 1389–1400.
  8. Huang, Y.; Kintala, C.; Kolettis, N.; Fulton, N. Software rejuvenation: Analysis, module and applications. In Twenty-Fifth International Symposium on Fault-Tolerant Computing, Digest of Papers; IEEE: New York, NY, USA, 1995; pp. 381–390.
  9. Sjoeberg, D.; Hannay, J.; Hansen, O.; Kampenes, V.; Karahasanovic, A.; Liborg, N.-K.; Rekdal, A. A survey of controlled experiments in software engineering. IEEE Trans. Softw. Eng. 2005, 31, 733–753.
  10. Lei, X.; Xue, K.; Jia, Y. A comprehensive metric to evaluating software aging. In 2010 International Conference on Computer Application and System Modeling (ICCASM 2010); IEEE: New York, NY, USA, 2010; Volume 11, pp. V11–518–V11–520.
  11. Martínez, C.M.; Carracedo, J.G.; Gallego, J.S. Characterizing Agile Software Development: Insights from a Data-Driven Approach Using Large-Scale Public Repositories. Software 2025, 4, 13.
  12. Bai, J.; Chang, X.; Machida, F.; Trivedi, K.S. Understanding Container-Based Services Under Software Aging: Dependability and Performance Views. IEEE Trans. Sustain. Comput. 2025, 10, 562–575.
  13. Ozcelik, B.; Yilmaz, C. Seer: A lightweight online failure prediction approach. IEEE Trans. Softw. Eng. 2016, 42, 26–46.
  14. Xu, Z.; Ma, J.; Liu, Y. Failure prediction mechanism of disk devices based on LSTM. In 2022 2nd Asia-Pacific Conference on Communications Technology and Computer Science (ACCTCS); IEEE: New York, NY, USA, 2022; pp. 388–391.
  15. Chen, P.; Qi, Y.; Li, X.; Hou, D.; Lyu, M.R.-T. ARF-Predictor: Effective prediction of aging-related failure using entropy. IEEE Trans. Dependable Secur. Comput. 2018, 15, 675–693.
  16. Chakraborty, P.; Corici, M.; Magedanz, T. System failure prediction within software 5G core networks using time series forecasting. In 2021 IEEE International Conference on Communications Workshops (ICC Workshops); IEEE: New York, NY, USA, 2021; pp. 1–7.
  17. Han, S.; Lee, P.P.C.; Shen, Z.; He, C.; Liu, Y.; Huang, T. StreamDFP: A general stream mining framework for adaptive disk failure prediction. IEEE Trans. Comput. 2022, 72, 520–534.
  18. Riganelli, O.; Saltarel, P.; Tundo, A.; Mobilio, M.; Mariani, L. Cloud failure prediction with hierarchical temporal memory: An empirical assessment. In Proceedings of the 2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA); IEEE: New York, NY, USA, 2021; pp. 785–790.
  19. Battisti, F.; Silva, A.; Pereira, L.; Carvalho, T.; Araujo, J.; Choi, E.; Nguyen, T.A.; Min, D. hLSTM-Aging: A Hybrid LSTM Model for Software Aging Forecast. Appl. Sci. 2022, 12, 6412.
  20. Iqbal, A. Time series forecasting and anomaly detection using deep learning. Comput. Chem. Eng. 2024, 182, 108560.
  21. Maleki, S.; Maleki, S.R.; Jennings, N.R. Unsupervised anomaly detection with LSTM autoencoders using statistical data-filtering. Expert Syst. Appl. 2021, 173, 114619.
  22. Dohi, T.; Okamura, H. Dynamic software availability model with rejuvenation. J. Oper. Res. Soc. Jpn. 2016, 59, 270–290.
  23. Vincent, J.J. Recent advances in RL for self-adaptive software systems: A systematic review. Asian J. Res. Comput. Sci. 2025, 18, 143–154.
  24. Metzger, A.; Quinton, C.; Mann, Z.Á.; Baresi, L.; Pohl, K. Realizing self-adaptive systems via online reinforcement learning. Computing 2022, 106, 1251–1272.
  25. Kabashkin, I. Integration of Foundation Models and Federated Learning in AIoT-Based Aircraft Health Monitoring Systems. Mathematics 2024, 12, 3428.
  26. Huo, C.; Chen, C.K.; Zhang, S.; Wang, Z.; Yan, H.; Shen, J.; Hong, Y.; Qi, G.; Fang, H.; Wang, Z. When Remote Sensing Meets Foundation Model: A Survey and Beyond. Remote Sens. 2025, 17, 179.
  27. García, P.; de Curtò, J.; de Zarzà, I.; Cano, J.C.; Calafate, C.T. Foundation Models for Cybersecurity: A Comprehensive Multi-Modal Evaluation of TabPFN and TabICL for Tabular Intrusion Detection. Electronics 2025, 14, 3792.
  28. Hou, X.; Zhao, Y.; Liu, Y.; Yang, Z.; Wang, K.; Li, L.; Luo, X.; Lo, D.; Grundy, J.; Wang, H. Large language models for software engineering: A systematic literature review. ACM Trans. Softw. Eng. Methodol. 2024, 33, 220.
  29. Fan, A.; Gokkaya, B.; Harman, M.; Lyubarskiy, M.; Sengupta, S.; Yoo, S.; Zhang, J. Large language models for software engineering: Survey and open problems. In Proceedings of the 2023 IEEE/ACM International Conference on Software Engineering: Future of Software Engineering (ICSE-FoSE); IEEE: New York, NY, USA, 2023; Volume 5, pp. 31–53.
  30. La Malfa, E.; Petrov, A.; Frieder, S.; Weinhuber, C.; Burnell, R.; Cohn, A.G.; Shadbolt, N.; Wooldridge, M. Language-models-as-a-service: Overview of a new paradigm and its challenges. J. Artif. Intell. Res. 2024, 80, 1497–1523.
  31. Cotroneo, D.; Natella, R.; Pietrantuono, R.; Russo, S. A survey of software aging and rejuvenation studies. ACM J. Emerg. Technol. Comput. Syst. 2014, 10, 3357–3395.
  32. Marchetti, F.; Picano, B.; Seidenari, L.; Fantacci, R. Foundation Forecasting in IoE Networks: When Generative AI Meets Programmable Edge Nodes. IEEE Internet Things J. 2025, 12, 42917–42931.
  33. Ghahremani, S.; Giese, H. Evaluation of self-healing systems: An analysis of the state-of-the-art and required improvements. Computers 2020, 9, 16.
  34. Rivera Ibarra, J.G.; Borrego, G.; Palacio, R.R. Early Estimation in Agile Software Development Projects: A Systematic Mapping Study. Informatics 2024, 11, 81.
  35. Matias, R.; Barbetta, P.A.; Trivedi, K.S.; Filho, P.J.F. Accelerated degradation tests applied to software aging experiments. IEEE Trans. Reliab. 2010, 59, 102–114.
  36. Peng, H.; Chen, C.; Lai, C.-C.; Wang, L.-C.; Han, Z. A predictive on-demand placement of UAV base stations using echo state network. In 2019 IEEE/CIC International Conference on Communications in China (ICCC); IEEE: New York, NY, USA, 2019.
  37. Salfner, F.; Lenk, M.; Malek, M. A survey of online failure prediction methods. ACM Comput. Surv. 2010, 42, 10.
  38. Bäck, T. Evolutionary Algorithms in Theory and Practice: Evolution Strategies, Evolutionary Programming, Genetic Algorithms; Oxford University Press, Inc.: New York, NY, USA, 1996.
  39. Ferreira, A.A.; Ludermir, T.B. Genetic algorithm for reservoir computing optimization. In 2009 International Joint Conference on Neural Networks; IEEE: New York, NY, USA, 2009; pp. 811–815.
  40. Górski, T. Architectural View Model for an Integration Platform. J. Theor. Appl. Comput. Sci. 2012, 6, 25–34.
  41. Górski, T. The 1+5 Architectural Views Model in Designing Blockchain and IT System Integration Solutions. Symmetry 2021, 13, 2000.
Figure 1. FMaaS paradigm in wireless networks.
Figure 2. System architecture.
Figure 3. ESN accuracy in terms of MSE.
Figure 4. ESN accuracy in terms of MAPE.
Figure 5. ESN accuracy in terms of MAD.
Figure 6. Failure rate as a function of the number of clients.
Figure 7. Failure rate as a function of the critical value for software age.
Figure 8. Failure rate as a function of the number of edge nodes.
Table 1. Impact of GA-based hyperparameter optimization on ESN performance in terms of prediction accuracy when increasing the reservoir size.
Method    Reservoir size
          200      300      500
ESN       0.952    0.924    0.903
GA-ESN    0.988    0.963    0.934
Table 2. Impact of GA-based hyperparameter optimization on ESN performance in terms of prediction accuracy when increasing the population size.
Method    Population size
          10       30       50
GA-ESN    0.931    0.988    0.993
Table 3. Impact of GA-based hyperparameter optimization on ESN performance in terms of computation time when increasing the population size.
Method    Population size
          10       30       50
GA-ESN    785 s    1238 s   1896 s
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Picano, B.; Fantacci, R. Staying Young at the Edge: A Software Aging Perspective for Foundation Models as a Service. Computers 2026, 15, 158. https://doi.org/10.3390/computers15030158