Physically Interpretable and AI-Powered Applied-Field Thrust Modelling for Magnetoplasmadynamic Space Thrusters Using Symbolic Regression: Towards More Explainable Predictions

Rosa-Morales, Miguel; Ravichandran, Matthew; Song, Wenjuan; Yazdani-Asrami, Mohammad

doi:10.3390/aerospace13030245

CryoElectric Research Lab, Propulsion, Electrification & Superconductivity Group, Autonomous Systems and Connectivity Division, James Watt School of Engineering, University of Glasgow, Glasgow G12 8QQ, UK

^*

Author to whom correspondence should be addressed.

Aerospace 2026, 13(3), 245; https://doi.org/10.3390/aerospace13030245

Submission received: 29 December 2025 / Revised: 9 February 2026 / Accepted: 3 March 2026 / Published: 5 March 2026

(This article belongs to the Special Issue Artificial Intelligence in Aerospace Propulsion)

Abstract

Magnetoplasmadynamic thrusters (MPDTs) are becoming increasingly viable as an electric propulsion (EP) technology for space missions, yet their complex plasma behaviour, intricate thrust-generation process, and nonlinear multi-physics thrust–field interactions prove difficult for conventional modelling approaches, including empirical techniques. A key shortcoming of traditional empirical models is their failure to predict accurately across wide operational regimes. This paper introduces a physically interpretable, artificial intelligence (AI)-powered thrust model for Applied-Field Magnetoplasmadynamic Thrusters (AF-MPDTs), developed using symbolic regression (SR) to address the gap between data-driven prediction and physics-based understanding. The proposed method, an alternative to traditional black-box AI methods, incorporates physics-aware composite-term operators, ensuring that the resulting analytical expressions are bounded by known physical behaviours while retaining the flexibility to discover previously overlooked nonlinear couplings. A comprehensive dataset of AF-MPDTs undergoes rigorous preprocessing to ensure dimensional consistency and noise robustness. The SR model then evolves candidate equations, balancing predictive accuracy with interpretability through Tree-Structured Parzen Estimator (TPE) optimisation. The results are closed-form surrogate correlations with a goodness of fit of 95.98%, a root mean square error of 0.0199, a mean absolute error of 0.0143, and a mean absolute percentage error reduction of 28.91% against the benchmark model in the literature. A post-discovery protocol for numerical robustness and physical consistency is implemented, with Shapley Additive Explanations (SHAP) providing insight into the influence of each composite-term in the developed correlation, followed by a numerical robustness and physical consistency validation using a Monte Carlo (MC) envelope.
A StabilityScore is calculated for all developed correlations, enabling explicit accuracy–complexity–stability comparisons. In doing so, we demonstrated that SR can systematically recover known physical relationships—such as the scaling of thrust with discharge current and applied magnetic field—while proposing interpretable higher-order corrections that improve fit quality. The resulting SR-based thrust models not only achieve competitive accuracy relative to state-of-the-art numerical and empirical methods but also offer more explainable and interpretable results capable of revealing compact formulations that capture essential acceleration mechanisms with transparency. Overall, this paper, using SR, advances explainable AI (XAI) methodologies capable of generating trustworthy, analytically transparent models for next-generation electric propulsion systems.

Keywords:

AF-MPDTs; machine learning; Monte Carlo; thrust prediction; tree-structured Parzen Estimator; XAI

1. Introduction

Space operations have received renewed interest in recent years, with missions such as the expansion of the Starlink satellite constellation [1], Mars exploration [2], the Artemis and Dawn missions of the National Aeronautics and Space Administration (NASA) [3,4], and the SMART-1 mission of the European Space Agency (ESA) [5]. While chemical propulsion generates high levels of thrust at low specific impulse, making it suitable for launch operations, once in orbit, longer operational times, constant thrust, and fuel efficiency are critical mission constraints, making alternative propulsion a requirement [6,7]. Electrification as a means of propulsion, a viable alternative, has been studied since as early as the 1900s [8]. Electrical propulsion generates thrust at high specific impulse and velocities. Some thrusters with proven space heritage are the Hall effect thruster (HET) and the gridded ion thruster (GIT) [9], which operate reliably at low power ranges. A counterpart for these in the high-power regime is the AF-MPDT [10]. It generates thrust as the voltage difference between axisymmetric, concentric electrodes ionises the gas fed into a chamber; this creates a plasma and a current arc capable of generating self-induced magnetic and electric fields. This is aided by an external magnet that enhances the Lorentz forces, producing thrust via plasma interactions [11]. The interactions are summarised in Figure 1 [12].
That said, progression beyond the current technology readiness level of 4–5 [12,13] has stalled due to multiple challenges: power generation for the system [13], cryogenic cooling requirements in the case of superconducting AF-MPDTs [14], and the complexity of modelling the multi-physics interactions in the thrust-generating chamber of the AF-MPDT. Namely, the propellant acceleration forces are well known; however, the individual mechanisms and how they change across different operating points are not yet fully understood [10]. Total thrust is the sum of the applied-field thrust T_AF, self-field thrust T_SF, and gas-dynamic thrust T_GD, explained in detail in [15].

Over the last few decades, multiple approaches to understanding thrust have been pursued, i.e., experimental [13,16,17,18] and empirical or semi-empirical, such as the following models: Tikhonov et al.’s [19], Myers’s [20], Albertoni et al.’s [21], Fradkin et al.’s [22], Herdrich et al.’s [23], a scaling model for SUPREME [24], Coogan’s [25], and most recently Balkenhohl et al.’s [26] and Glowacki et al.’s [27]. Coogan’s model, excluding scaling methods, has the highest accuracy of all when operating in the low-thrust regime (0–1 N), which corresponds to the low-thrust subset of data used throughout this study. A key assumption made by the aforementioned study is that the data used for the model are in the T_AF-dominant region, where T_GD and T_SF are negligible. A drawback of the model is that it underperforms for thrust predictions at lower applied magnetic fields (B) and over-predicts for higher B fields [26]. Given these regime-dependent errors, in particular in the 0–1 N range, where the best model suffers from degradation, an alternative modelling approach is required, one that can capture the nonlinear couplings while remaining transparent.

As digitalisation is adopted across industries, AI and its subset machine learning (ML) become frontrunners as intelligent tools capable of performing accurate, low-latency predictions in aviation and space applications [28,29,30,31,32,33]. The use of AI enables new and innovative models to be developed for regression and estimation applications within microseconds, while ML can perform complex operations capable of capturing the underlying patterns in multi-input complex datasets [34,35,36,37]. AI-based regression offers several advantages over conventional fitting methods and can outperform them in complex scenarios. One key benefit is the ability to handle complex relationships between inputs and outputs by identifying hidden patterns in data. Traditional methods such as mathematical fitting become inadequate, particularly when more than two or three input variables are involved. Another advantage is nonlinearity. While conventional methods like linear or polynomial regression assume linear relationships, AI-based models, such as neural networks, can learn complex nonlinear behaviours, making them better suited for irregular data distributions. AI techniques also provide superior scalability, as they can efficiently process large datasets and often improve prediction robustness as data volume increases. In contrast, conventional fitting accuracy may degrade with larger datasets due to data-specific characteristics. Furthermore, AI models typically demonstrate better generalisation, performing well on unseen data and strong adaptability, as they can be retrained or fine-tuned for different problems, making them more flexible than traditional fitting approaches.

Recent works continually expand data-driven capabilities for AF-MPDTs. A recent study [38], with a reduced-feature ML approach for thrust and voltage predictions using the known database of [25] and an additional evaluation on an unseen dataset, aimed to improve predictive accuracy via ML techniques. The reduced-feature study frames its contribution as application-oriented, indicating that further details and a comparative validation would be reported separately [38]. This emphasises the need for complementary contributions that retain strong generalisation evidence using transparent, reproducible validation protocols and go beyond by delivering interpretable, physics-consistent relationships suitable for AF-MPDT analysis and design, particularly in constrained operating regimes where model credibility and error structure matter as much as performance. In parallel, a broader benchmark study evaluated multiple AI models for thrust prediction on the aforementioned database, including explicit preprocessing, systematic sensitivity analysis, and performance metrics for empirical baselines. ML models such as k-Nearest Neighbours, Gradient Boosting Regression, and XGBoost achieve higher accuracies than Coogan’s model for thrust estimation [15,39]. In addition, both studies highlight the same disadvantage, which is present in all “black box” ML models: opacity and lack of physics interpretation. Opacity is the inability of a system to offer any reason or suitable explanation behind the decision process, and it is not removed solely by incorporating physical variables; it is reduced when model form and structural simplicity are imposed to raise interpretability [40,41,42]. Opacity contributes to distrust among researchers in the space community, who often view AI as something not fully understood. Explainable AI (XAI) [43,44] addresses opacity by revealing the reasoning behind the model-generated outputs.
One interesting XAI tool is SR, inspired by genetic programming, generating random populations of mathematical expressions, which are mutated over generations to produce an accurate fit to the expected prediction [45,46]. The motivation behind the use of SR is not a superior accuracy claim, but to provide closed-form surrogate correlations that are deployable, comparable to semi-empirical models, and inspectable under parameter variation for design screening. An example of its advantageous use is presented in the case of finding a data-driven model for the electron anomalous transport in Hall Effect Thrusters (HET) [47]. SR can be a useful tool for innovative XAI in thrust prediction of AF-MPDT to add interpretability to develop a new analytical model.

This paper proposes the first physically interpretable XAI-powered data-driven closed-form surrogate correlations for AF-MPDTs, expanding the literature on the relation between the use of ML techniques for electrical propulsion. The SR model is capable of overcoming the black-box opacity problem by developing validated closed-form surrogate correlations within the tested envelope, intended for design-relevant predictions and scaling insights, while mathematically and physically analysing for robustness and physical consistency. These allow comparison, both in structure and performance, against the empirical correlations currently in the literature, namely, Coogan’s model.

This paper offers novel contributions in the following areas:

Physics-aware SR with composite-term constrained search space: The search space for the SR model is constrained via physically meaningful custom composite-terms, producing closed-form surrogate correlations that remain directly comparable and auditable against established AF-MPDTs scaling relations.
Low-thrust applied-field scope with interpretable driver/modulator structure: Under analysis is a bounded low-thrust envelope (0–1 N) dominated by applied-field effects; the discovered correlations are interpreted in an engineering sense by analysing the explicit structures in terms of the dominant driver and the modulator composites, inclusive of a SHAP analysis.
Post-discovery robustness and physics-consistency screening: We introduce a protocol for stress-testing each SR correlation within the feasible composite-term envelope using MC sampling and finite-difference local sensitivities. This produces envelope-validity and tail-sensitivity metrics, summarised into a StabilityScore. The protocol includes a driver-direction consistency check via ∂T/∂v_α, allowing physical consistency checks for the SR correlations. The protocol results allow an accuracy–complexity–stability trade-off analysis.

The remainder of this paper is organised as follows: Section 2 presents the methodology where the SR principles of the model’s architecture are outlined, the data collection and pre-processing are delineated, and the implementation of the model with all physics-aware composite-terms is broken down. Section 3, results and discussion, presents the developed SR-based equations and the performance, complexity, SHAP, and post-discovery protocol analysis. The applications, findings, conclusions, and future work are presented in Section 4.

2. Methodology

In this work, SR is used to derive closed-form surrogate thrust correlations for the total thrust T_{tot} in the low-thrust regime (0–1 N) of AF-MPDTs, providing explicit equations that are directly comparable to semi-empirical thrust relations and more interpretable than black-box regression within the studied operating envelope. Interpretability is not presumed from variable choice alone but treated as a controlled property of the model form, evaluated through the structural complexity of the resulting closed-form expressions. For this reason, multiple equations are presented in later sections to enable comparisons of predictive performance against transparency [41,42]. The motivation for this work has epistemological roots as much as practical ones [48]. Black-box models are capable of high-accuracy interpolation [15] but provide no insight into how geometry, operating conditions, or field topology combine for thrust prediction, reinforcing opacity in engineering design. Many of these models follow the workflow seen in Figure 2, where the available data is fed to the black-box model, which finds underlying patterns in the data by a wide array of methods and outputs a prediction with no explanation of the decision process. To counter this opacity, explainability modules such as local interpretable model-agnostic explanations (LIME) [49] or Shapley Additive Explanations (SHAP) [50] have been developed; these give insight into the model post hoc, showing the behaviour and weights of the inputs in the final prediction. Adding these modules to form an AI routine is intended to deliver an interpretable model with the capability of black-box models.

A drawback from this routine structure observed in Figure 2 lies in the order of operations, where the interpretability factor is introduced post hoc, whereas the proposed SR searches directly over a space of analytical equations for an accurate predictor, giving insight into how the equation evolved and presenting a final optimal equation for prediction. This distinction in method is important because post hoc analysis can provide useful diagnostics, but it does not yield single explicit relationships that can be inspected, simplified, and reused as design correlations like the ones SR is capable of producing. The methodology is divided into three parts: the SR framework, the AF-MPDT dataset and filtering, and lastly the SR implementation and model selection procedure.

2.1. Overview of Symbolic Regression

SR is an ML approach grounded in evolutionary computation, originating from foundational work in the early 1990s [51]. Unlike conventional regression, in which the parameters of a pre-determined model are fitted, SR searches the analytical space for both the structure and the coefficients of a model. In this space, candidate expressions are represented as tree structures, iteratively tuned through evolutionary processes, allowing nonlinear, interpretable relations between all variables [52,53]. In the context of AF-MPDT technology, this permits exploration of broad possibilities for thrust scaling while keeping the search constrained to expressions that are physically interpretable and consistent with established theory, such as the Lorentz force law.

In SR, candidate equations are encoded as expression trees; the nodes of these trees hold mathematical operators or functions, and the leaves hold variables or constants [51]. The function set (+, -, \div, \times, \sqrt{x}, X^{x}, \tanh), for example, contains operators that avoid numerical singularities. A recurrent problem for SR is bloating and nesting of operands; constraining these is key to having an interpretable model [53]. In this work, constraints that keep the final equation aligned with scientific model discovery, similar to the works in [52,53,54,55,56], are employed.
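The tree encoding described above can be illustrated with a few lines of Python. This is a minimal sketch for intuition only: the tuple-based representation, the `OPS` table, and the helper names `evaluate` and `complexity` are assumptions made for this example, not the data structures used by the SR library in this work, and the sample expression is not a result from the paper.

```python
import math

# Hypothetical encoding: a leaf is a variable name (str) or a constant (number);
# an internal node is a tuple (operator_name, child_1, child_2, ...).
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
    "sqrt": lambda a: math.sqrt(a),
    "tanh": lambda a: math.tanh(a),
}


def evaluate(tree, env):
    """Recursively evaluate an expression tree given variable values in env."""
    if isinstance(tree, str):
        return env[tree]
    if isinstance(tree, (int, float)):
        return float(tree)
    op, *children = tree
    return OPS[op](*(evaluate(child, env) for child in children))


def complexity(tree):
    """Count operators, variables, and constants as a simple size proxy."""
    if isinstance(tree, (str, int, float)):
        return 1
    return 1 + sum(complexity(child) for child in tree[1:])


# Illustrative tree for 2.0 * sqrt(alpha) / beta (an arbitrary example)
tree = ("/", ("*", 2.0, ("sqrt", "alpha")), "beta")
value = evaluate(tree, {"alpha": 9.0, "beta": 3.0})
size = complexity(tree)
```

Counting nodes in this way is one simple proxy for the bloat that the constraints are meant to limit; real SR libraries may weight operators differently.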

The evolutionary algorithm for SR iterates over a four-step loop: initialisation, fitness evaluation, selection and variation, and stopping criteria. During the initialisation step, random expression trees are populated by random combinations of variables, constants, and operators under the constraints established in the programming language [42,51]. Fitness evaluation is performed by evaluating each candidate expression on a training set of data. The goal of this step is loss minimisation, which translates to an increase in prediction accuracy. A penalty, named parsimony, is introduced to nudge the algorithm towards compact, interpretable equations. This penalty keeps the equations from becoming bloated or overly complex [52,56,57]. The Python library PySR has a multi-objective formulation that allows simultaneous minimisation of training error and expression complexity to address these points [56], while the performance of the equations is evaluated on the test set, as explored further in the next subsection of the methodology.

At the selection and variation step of the cycle, the regression selects the expressions that present better fits; variation is introduced to them by two methods, crossover and mutation. Mutations are random changes to trees that help the search avoid stagnating at local minima of the error. Crossover exchanges subtrees or leaves between parent trees to induce generalisation and allow novel combinations of high-performing structures [51,52]. The final step in the cycle carries over a fraction of the best-performing candidates, maintaining a minimum standard for accuracy. The algorithm concludes when certain conditions are met, such as a set number of generations being reached or convergence of the Pareto front of error vs. complexity. The output of the code is not a single model but a set of candidate equations that reside on an accuracy–complexity Pareto front [56,57]. Figure 3 presents this operation cycle, starting with an initialised population, moving towards evaluating and selecting the best-fitted equations, and minimising error and expanding the search space by crossover and mutation. In the final step, the resulting equations are written to a file of candidate equations, which can be analysed and assessed for their complexity and their performance metrics. The importance of this list, called the “hall of fame” in PySR, lies in the fact that, for this research, where physically interpretable and physics-aware equations are required, it permits the selection of one or more equations at a time for evaluation. Multiple candidate equations can be consistent with the data, but not all will align with MPDT laws and physics. First, the regressor is allowed to explore the space and generate equations; after this, a filter that retains only specific candidate equations is applied, marking the difference between model creation and model discovery. Composite-terms that have physical sense and can be found in the literature, such as the term “ϕ” from Coogan’s model [12], are required in the equations. A second restriction is enforcing metrics that outperform models in the literature, tested on the same dataset.
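To make the accuracy–complexity trade-off concrete, the sketch below filters a list of candidates down to its Pareto front and then ranks the survivors with a parsimony-penalised score. It is an illustration under stated assumptions: the candidate expressions, loss values, the `parsimony` weight, and the function names are all hypothetical, and the selection criteria used internally by PySR may differ.

```python
def pareto_front(candidates):
    """Keep candidates not dominated in (complexity, loss); lower is better for both."""
    front = []
    for c in candidates:
        dominated = any(
            o["complexity"] <= c["complexity"]
            and o["loss"] <= c["loss"]
            and (o["complexity"] < c["complexity"] or o["loss"] < c["loss"])
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return sorted(front, key=lambda c: c["complexity"])


def penalised_score(candidate, parsimony=0.01):
    """Loss plus a complexity penalty; the weight is an illustrative assumption."""
    return candidate["loss"] + parsimony * candidate["complexity"]


# Made-up candidates standing in for entries of a hall-of-fame file
candidates = [
    {"expr": "a", "complexity": 1, "loss": 0.50},
    {"expr": "a*b", "complexity": 3, "loss": 0.20},
    {"expr": "a*b+c", "complexity": 5, "loss": 0.25},  # dominated by "a*b"
    {"expr": "tanh(a*b)/c", "complexity": 6, "loss": 0.05},
]

front = pareto_front(candidates)
best = min(front, key=penalised_score)
```

A physics filter such as the requirement that ϕ appears in the expression would be applied to `front` before the final choice; it is omitted here because its exact form is not specified in this section.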

2.2. Dataset and Composite-Term Structure Preprocessing

Following the algorithmic overview, the next step is to define the dataset. The Electric Propulsion and Plasma Dynamics Laboratory at Princeton University compiled an extensive database of over five dozen AF-MPDTs, with data dating as far back as the 1960s. The database considers single-propellant, axisymmetric coaxial AF-MPDTs with a conventional anode–cathode topology and a solenoidal AF component, and it includes qualitative data points, which were removed for the purpose of this study.

Table 1 presents all the inputs used. In addition, the dataset offers different ranges of thrust. AF-MPDTs have always been considered for high-power, high-thrust operations, but as high-temperature superconducting technology has become prevalent in recent years [24,58], MPD thrusters have risen as an alternative to the lower-thrust counterparts, such as HET and GIT. For the purpose of our research, the regime where MPDT thrust predictions have not been explored in depth will be our case study, meaning the regime where thrust ranges from [0, 1] N is referred to as a low-thrust regime in this manuscript. With a data-driven approach, equations that can predict thrust based on multiple parameter inputs from varied operating conditions can be developed, targeting this niche regime.

The application method for our model includes a workflow that spans from data collection to preprocessing, which includes multiple filters, data augmentation, and logarithmic transformation, followed by sensitivity analysis and Bayesian optimisation of SR loops for optimal equation discovery, and performance evaluation, used to compare against current literature. This can be observed in Figure 4. It is important to mention that for SR to produce equations that are comparable to those in the literature, in addition to equations that are physically interpretable, data engineering is performed on the dataset. Following Coogan’s modelling of data grouped inputs, we decided to use four physically motivated composite-terms, replacing the larger set of raw electromagnetic and geometric inputs. These are named Alpha, Beta, Gamma, and Phi, keeping with Coogan’s naming convention. This composite-term formulation maps raw operational and geometric inputs into physically motivated drivers of acceleration, supporting a ranked assessment of which mechanisms and geometry scaling dominate thrust variability in the selected operating envelope. However, transparency is not guaranteed by the use of physically motivated composite-terms alone; opacity is reduced in the majority through the explicit closed-form surrogate correlations produced by the regressor and the controlled complexity of those [41,42], while causal attributions to these equations require dedicated mechanical validation beyond the present framework.

Alpha (α): The discharge current density J and applied field B_A are multiplied, resulting in a magnetic force density. Alpha represents the Lorentz force in terms of density; in AF-MPDTs, the velocity of electromagnetic acceleration scales proportionally with Lorentz forces up until a saturation point, which makes this the composite-term that provides the physical bounding of our regression. This term is not intended to recreate a magnetohydrodynamic solution model; instead, we introduce a compact, physics-anchored element into the SR. This relation is observed and explained in multiple models that can be referenced in detail in [22].

α = J \cdot B_{A}

(1)

Beta (β): This term is a geometric scaling parameter for the ratio of the radii of the anode (R_{a}) and cathode (R_{c}). The Beta composite-term represents the governing behaviour of the discharge channel’s topology and current distribution symmetry. Higher β weakens current density while lower levels concentrate discharge current around the cathode, raising the discharge current and erosion risks. It measures the transverse geometry of the discharge chamber. As plasma collimation affects plasma interactions in the chamber and instabilities, we define it as such. Higher values represent a broader anode–cathode profile. Works such as [20,59] include anode–cathode radii as part of the applied-field thrust models developed, while other works such as [12] treat these radii as a necessary nondimensional geometric parameter. In addition, models presented in [60,61] support β as a physically meaningful composite-term to study.

β = \frac{R_{a}}{R_{c}}

(2)

Gamma (γ): This term is an axial design ratio parameter for the lengths of the anode (L_{a}) and cathode (L_{ca}). It is a physical representation of how the cathode extends relative to the anode’s length. The Gamma composite-term influences current path length, the electrode erosion distribution, and the effective discharge channel aspect ratio. Classical arc theory demonstrates that near-electrode voltage drops result from current density and electrode area. As electrode area increases, current density decreases, requiring lower voltages to sustain current profiles. This behaviour is established in high-current arcs, linking geometric ratios to V-I operational characteristics.

γ = \frac{L_{ca}}{L_{a}}

(3)

Phi (ϕ): As defined by Coogan, we incorporate phi, “a nondimensional parameter for the degree to which the anode inner surface follows the magnetic field contour”. The composite-term, a non-dimensional flux ratio, is developed by modelling the applied field component with a base extracted from the Biot–Savart law for a solenoid. It captures magnetic field divergence geometry and describes magnetic flux expansion and coupling into the plasma column of the thruster. In doing so, Coogan developed a metric that indicates whether the anode geometry diverges faster than the magnetic field topology. This specific ratio is significant as multiple works [60,61,62] claim contouring as a performance-enhancing metric.

ϕ = \frac{R_{a}^{2} \cdot r_{B}^{3}}{R_{a 0}^{2} \cdot {(r_{B}^{2} + L_{a}^{2})}^{\frac{3}{2}}}

(4)
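Equations (1) to (4) can be gathered into a single helper that maps raw inputs to the four composite-terms. The following is a minimal sketch rather than the preprocessing code of this work: the function name `composite_terms` and the numeric inputs are hypothetical, and the symbols R_{a0} and r_{B} are used exactly as they appear in Equation (4) without further interpretation. Any scaling applied to the terms before regression is not reproduced here.

```python
import math


def composite_terms(J, B_A, R_a, R_c, L_a, L_ca, R_a0, r_B):
    """Evaluate the composite-terms of Equations (1)-(4) from raw inputs."""
    alpha = J * B_A                      # Eq. (1): current density times applied field
    beta = R_a / R_c                     # Eq. (2): anode-to-cathode radius ratio
    gamma = L_ca / L_a                   # Eq. (3): cathode-to-anode length ratio
    phi = (R_a**2 * r_B**3) / (R_a0**2 * (r_B**2 + L_a**2) ** 1.5)  # Eq. (4)
    return {"alpha": alpha, "beta": beta, "gamma": gamma, "phi": phi}


# Arbitrary illustrative inputs in consistent units (not values from the dataset)
terms = composite_terms(
    J=100.0, B_A=0.1, R_a=4.0, R_c=1.0, L_a=4.0, L_ca=2.0, R_a0=4.0, r_B=3.0
)
```

With these inputs the geometric part of ϕ reduces to 3³ / (3² + 4²)^{3/2} = 27/125, which gives a quick hand check of the implementation.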

After developing the terms, the final layer of preprocessing is performed. The thrust data spans a wide range of magnitudes, and it is observed that the lower magnitudes have a much lower spread, whereas the higher end behaves in the opposite way. To ensure that the loss function is not dominated by a small section of high-end values, a base-10 logarithmic transform is applied to the thrust target, denoted as t_{log10}(T_{tot}) = \log_{10}(T_{tot}), where T_{tot} is the total thrust. Logarithmic transforms are widely recommended to reduce skewness and stabilise the error structure [63]. This choice is consistent with established literature, where thrust and efficiency are expressed in power laws or scaling relations [12,64,65]. Moreover, the logarithmic transform aligns with our goal of obtaining interpretable expressions from experimental data [66]. The model is trained and selected using the transformed targets, but final predictions and error metrics are computed after mapping back to the physical units, {\hat{T}}_{tot} = 10^{{\hat{t}}_{log10}(T_{tot})}, where {\hat{t}}_{log10}(T_{tot}) is the model prediction in the logarithmic space, while {\hat{T}}_{tot} is the model prediction in the required units (i.e., N).
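The forward and inverse mappings can be written out as a short sketch. The function names are hypothetical, and the example assumes strictly positive thrust values, since the logarithm is undefined at zero; how the lower edge of the 0–1 N range is handled in the actual pipeline is not stated in this section.

```python
import math


def to_log_space(T_tot):
    """Map a thrust value in N to the base-10 log target used for training."""
    if T_tot <= 0.0:
        raise ValueError("log10 transform requires strictly positive thrust")
    return math.log10(T_tot)


def to_physical(t_hat):
    """Map a log-space prediction back to thrust in N."""
    return 10.0 ** t_hat


T_true = 0.25                      # illustrative thrust in N
t_target = to_log_space(T_true)    # training target in log space
T_back = to_physical(t_target)     # round trip back to N

# An additive error of 0.1 in log space is a multiplicative error in N
ratio = to_physical(t_target + 0.1) / T_true
```

The last line shows why errors should be reported after back-transformation: a fixed offset in log space corresponds to a fixed relative error, here a factor of 10^{0.1}, regardless of the thrust magnitude.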

An additional step taken to understand how the input variables and the newly created composite-terms correlate with thrust in our dataset is the correlation heatmap shown in Figure 5. It can be observed that thrust is most sensitive to J and R_{c}, due to the low-thrust filtering of AF-MPDTs and the different geometric values in the data. A noteworthy observation is that α shows a stronger relationship with thrust than any other input. This is to be expected as Lorentz forces govern AF-MPDT behaviour.

There is an important note regarding the low correlation between voltage and total thrust. A near-zero pairwise correlation between T_tot and voltage (V) does not imply that voltage has no physical relevance to the system; in AF-MPDTs, V is a circuit-level discharge quantity and is not representative of a single electrostatic acceleration potential. It is an aggregate of multiple contributions, including plasma-column resistive drops and electrode-region voltage/power losses [26,67]. As a result, within the applied-field subset of our data, V may contribute little independent linear explanatory power compared with the electromagnetic force, as it more strongly reflects dissipation and efficiency-limiting channels.

2.3. Performance Evaluation Metrics

Once the data is cleaned and filtered, it is optimised and passed through the regressor to obtain candidate equations that can be used as alternatives to those in the literature. Testing is then performed, and metrics are computed. To quantify the performance of the equations created by the SR, metrics that are able to explain the AI model’s behaviour are required [68]. The first metric is the coefficient of determination R^{2} (see Equation (5)).

This nondimensional quantity gives insight into how well the model accounts for the variability in the results, thus allowing it to be used as an estimator of the accuracy of the model. For a useful model, the coefficient takes values from 0 to 1; as defined in Equation (5), it becomes negative when the predictions fit worse than the mean of the data. Values near 0 indicate that the model explains little of the variability and has poor accuracy, while values close to 1 show that the model is creating a good fit to the data [68].

R^{2} = \frac{\sum_{j = 1}^{N} {(y_{j} - \bar{y})}^{2} - \sum_{j = 1}^{N} {(y_{j} - {\hat{y}}_{j})}^{2}}{\sum_{j = 1}^{N} {(y_{j} - \bar{y})}^{2}}

(5)

where y_{j} represents the true value, {\hat{y}}_{j} is the predicted value, and \bar{y} is the mean value of y over the N measurements. The next metric is the root mean square error (RMSE) (see Equation (6)). It is a quadratic scoring measure that gives the average magnitude of the error (the distance from the prediction to the true value). It is sensitive to outliers as it penalises these points harshly [68].

\mathrm{RMSE} = \sqrt{\sum_{j = 1}^{N} \frac{{(y_{j} - {\hat{y}}_{j})}^{2}}{N}}

(6)

The mean absolute error (MAE) (see Equation (7)) is used to measure the average magnitude of the error without considering direction. It is a linear score that presents data in the same unit as the prediction to allow for a contextualised rate. In visualising the data, it is a good measure to use a percentage to understand in a more concrete manner how big or small the error rate is. For this, we use the mean absolute percentage error (see Equation (8)).

\mathrm{MAE} = \frac{1}{N} \sum_{j = 1}^{N} |y_{j} - {\hat{y}}_{j}|

(7)

\mathrm{MAPE} = \frac{100\%}{N} \sum_{j = 1}^{N} \frac{|y_{j} - {\hat{y}}_{j}|}{y_{j}}

(8)

The mean absolute percentage error (MAPE) presents the percentage equivalent of the MAE. MAPE is best suited to cases where outliers are few and concentrated [68].
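As a minimal sketch of Equations (5)–(8), the four metrics can be computed as follows. The function name `regression_metrics` and the sample thrust values are illustrative assumptions and are not taken from the paper's dataset.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, MAE and MAPE (in %) following Equations (5)-(8)."""
    n = len(y_true)
    y_mean = sum(y_true) / n
    ss_tot = sum((y - y_mean) ** 2 for y in y_true)               # total sum of squares
    ss_res = sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred))  # residual sum of squares
    r2 = (ss_tot - ss_res) / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(y - yh) for y, yh in zip(y_true, y_pred)) / n
    # MAPE divides by the true value, so it assumes strictly positive thrust.
    mape = 100.0 / n * sum(abs(y - yh) / y for y, yh in zip(y_true, y_pred))
    return {"R2": r2, "RMSE": rmse, "MAE": mae, "MAPE": mape}

# Hypothetical thrust values in newtons (illustrative only).
thrust_measured = [0.10, 0.20, 0.40, 0.80]
thrust_predicted = [0.12, 0.18, 0.42, 0.76]
metrics = regression_metrics(thrust_measured, thrust_predicted)
```

Because the MAPE denominator is the measured value, small thrust values inflate the percentage error, which is worth keeping in mind when comparing models in a low-thrust range.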

2.4. Robustness and Physics-Consistency Screening of SR Correlations (MC Envelope + Local Sensitivity)

To ensure that the discovered SR expressions are not only accurate on the filtered dataset but also numerically well-behaved and physically plausible within the stated operating envelope, we introduce an additional post-discovery screening protocol, as shown in Figure 6. This protocol is intended to reduce the likelihood that an explicit closed-form expression reinforces misleading physical interpretations due to hidden mathematical fragilities (e.g., near-singular denominators, heavy-tail excursions, or sign violations) that may not be detected when evaluating solely on the finite dataset [42,69,70,71].

The screening is performed within the same scaled composite-term space used by SR, and it produces quantitative indicators of envelope validity and local sensitivity, which are combined into a stability ranking for model comparison. Importantly, this protocol does not assert physical causality from regression; rather, it provides a reproducible way to verify that the learned equations are evaluable across the feasible envelope, physically plausible in sign/magnitude within the envelope, and directionally consistent with the expected role of the dominant driver composite

α

as used in this work [41,42,69].

For each SR candidate, we generate an MC “envelope cloud” by uniformly sampling in the scaled composite-term space across the observed bounds of

v_{α}, v_{β}, v_{γ}, v_{ϕ}

from the training subset. MC sampling provides a practical mechanism for probing model behaviour over a multidimensional feasible envelope defined by the data ranges, rather than relying on a single finite dataset realisation [72].

Let

T (\cdot)

denote the thrust predicted by a given SR equation, and let

\{x_{n}\}_{n = 1}^{N_{MC}}

be the MC samples. Each equation is evaluated on the MC cloud and summarised using the following validity metrics (all formulas for the new proposed metrics can be found in Appendix A):

FracFinite_MC: fraction of MC samples for which T is finite (not NaN/Inf). This measures the proportion of the envelope where the expression remains numerically evaluable.
FracNonFinite_MC = 1 − FracFinite_MC: fraction of MC samples producing NaN/Inf. This detects hidden singularities and invalid operations (e.g., division by near-zero denominators) that may not be triggered at the finite dataset points but exist within the feasible envelope [70,71,73]. This directly targets mathematical fragility that can make an explicit equation appear interpretable while being unreliable under envelope-level probing [42,69].
FracNegative_MC: fraction of finite MC samples with T < 0. Negative thrust is physically implausible for the thrust definition and envelope studied; this metric quantifies sign-violation frequency within the stated domain.
MedianT_MC: median of T over finite MC predictions. Behaves as a robust central-tendency indicator, less sensitive to outliers/heavy tails than the mean.
P01T_MC and P99T_MC: 1st and 99th percentiles of $T$ over finite MC predictions. They provide tail diagnostics to characterise whether the expression develops heavy tails or edge-of-envelope bias, even while remaining finite [74].
FracExtreme_MC: fraction of finite MC predictions producing “extreme” outputs relative to the equation’s robust span on the dataset.
Definition used: robust bounds are defined per equation from the dataset as $T_{1\%}^{\mathrm{data}}$ and $T_{99\%}^{\mathrm{data}}$, with span $Δ T^{\mathrm{data}} = T_{99\%}^{\mathrm{data}} - T_{1\%}^{\mathrm{data}}$. Predictions are flagged “extreme” if they fall outside

[T_{1\%}^{\mathrm{data}} - k Δ T^{\mathrm{data}}, T_{99\%}^{\mathrm{data}} + k Δ T^{\mathrm{data}}],

(9)

where

k

is a fixed multiplier (e.g.,

k = 10

). This identifies candidates that remain finite yet exhibit runaway growth or heavy-tail excursions within the same admissible bounds [74].

These envelope metrics are used as numerical-validity and physical-plausibility screens within the domain of validity stated in this study. They do not claim that the expression is a mechanistic law; rather, they ensure that an explicit formula does not only “fit” the finite dataset while behaving implausibly or unstably across the feasible composite-term envelope [41,42,69].
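The envelope metrics listed above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: `toy_model`, the unit bounds, the dataset predictions `t_data`, and the helper names `percentile`, `safe_eval` and `envelope_metrics` are all hypothetical, and only the metric definitions follow the text.

```python
import math
import random

def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of a sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)

def safe_eval(model, x):
    """Evaluate the model, mapping arithmetic failures and complex results to NaN."""
    try:
        t = model(*x)
    except (ZeroDivisionError, OverflowError, ValueError):
        return float("nan")
    return float(t) if isinstance(t, (int, float)) else float("nan")

def envelope_metrics(model, bounds, t_data, n_mc=5000, k=10.0, seed=0):
    rng = random.Random(seed)
    # Uniform MC cloud over the per-variable bounds (scaled composite-term space).
    cloud = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_mc)]
    finite = sorted(t for t in (safe_eval(model, x) for x in cloud) if math.isfinite(t))
    if not finite:
        raise ValueError("no finite predictions inside the envelope")
    # Robust span of the model's predictions on the dataset, Equation (9).
    t_sorted = sorted(t_data)
    t01, t99 = percentile(t_sorted, 1), percentile(t_sorted, 99)
    lo_lim, hi_lim = t01 - k * (t99 - t01), t99 + k * (t99 - t01)
    n_fin = len(finite)
    return {
        "FracFinite_MC": n_fin / n_mc,
        "FracNonFinite_MC": 1 - n_fin / n_mc,
        "FracNegative_MC": sum(t < 0 for t in finite) / n_fin,
        "MedianT_MC": percentile(finite, 50),
        "P01T_MC": percentile(finite, 1),
        "P99T_MC": percentile(finite, 99),
        "FracExtreme_MC": sum(t < lo_lim or t > hi_lim for t in finite) / n_fin,
    }

# Hypothetical candidate with a pole at v_gamma = 0.5 inside the envelope.
def toy_model(v_alpha, v_beta, v_gamma, v_phi):
    return v_alpha / (v_gamma - 0.5) + 0.1 * v_beta * 0.5 ** v_phi

bounds = [(0.0, 1.0)] * 4                # assumed scaled bounds
t_data = [0.05, 0.1, 0.2, 0.4, 0.8]      # assumed predictions on the dataset
m = envelope_metrics(toy_model, bounds, t_data)
```

The toy candidate stays finite at almost every sample, yet the pole inside the envelope shows up as a large negative fraction and a non-zero extreme fraction, which is the kind of fragility the screening is meant to expose.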

In parallel, each SR expression is screened for excessive local sensitivity within the envelope. Even for explicit equations, small perturbations in composite-terms can cause disproportionately large changes in T if the learned form contains near-singular structures or strongly coupled rational terms. We approximate local sensitivities using central finite differences in scaled space, which provides a standard second-order accurate approximation for partial derivatives [71].

For each MC sample x and each input

v_{i} \in \{v_{α}, v_{β}, v_{γ}, v_{ϕ}\}

, we compute

\frac{\partial T}{\partial v_{i}} (x) \approx \frac{T (x + h e_{i}) - T (x - h e_{i})}{2 h},

(10)

where e_i is the unit vector for the

i

-th coordinate and h is a small step size in scaled space. The derivative field is summarised using the following sensitivity metrics:

FracAllDerivativesFinite_MC: fraction of MC samples for which all four partial derivatives are finite. This detects regions where the equation is numerically ill-conditioned even when T itself is finite [70,71,73].

MedianAbsDerivative_(All,MC): median of

∣ \partial T / \partial v_{i} ∣

pooled across all variables and MC samples (finite only). This acts as a robust “typical slope” indicator over the feasible envelope [74].

P99AbsDerivative_(All,MC): the 99th percentile of

∣ \partial T / \partial v_{i} ∣

pooled across all variables and MC samples (finite only). This is a tail-sensitivity indicator; it flags rare but severe slope spikes associated with near-singular denominators or tightly coupled terms, consistent with global sensitivity-analysis practice that uses tail summaries to capture envelope-level robustness risks [74].

$\mathrm{MedianAbs}(\partial T / \partial v_{α})_{MC}$, $\mathrm{MedianAbs}(\partial T / \partial v_{β})_{MC}$, $\mathrm{MedianAbs}(\partial T / \partial v_{γ})_{MC}$, $\mathrm{MedianAbs}(\partial T / \partial v_{ϕ})_{MC}$: per-variable medians of absolute sensitivities. These terms identify which composite-terms dominate typical local sensitivity and support driver/modulator interpretation as a response property within the envelope.

$\mathrm{P99Abs}(\partial T / \partial v_{α})_{MC}$ (per-variable P99): per-variable extreme slope diagnostics that indicate whether extreme sensitivity concentrates in a specific composite-term channel [74].

$\mathrm{Frac}(\partial T / \partial v_{α} > 0)_{MC}$: fraction of MC samples (finite derivatives only) where the partial derivative with respect to the primary electromagnetic driver composite is positive. This is a driver-direction consistency check: within the studied envelope, the surrogate correlations should predominantly preserve an increasing tendency with respect to

α

, when

α

is interpreted as the dominant driver term in this work. This is treated as a within-envelope physics-consistency screen (monotonic tendency), not as causal proof; derivative-sign constraints and monotonic/shape-constrained modelling are established approaches to improving trustworthiness and avoiding misleading behaviour in fitted models [75].
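The sensitivity screen built on Equation (10) can be sketched as follows. The model, the bounds, the step size and the function names are illustrative assumptions. For brevity, the sketch pools only samples for which all four derivatives are finite and uses nearest-rank quantiles, which is a simplification of the definitions above.

```python
import math
import random

def central_gradient(model, x, h=1e-4):
    """Central finite-difference gradient of the model at x, Equation (10)."""
    grads = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        try:
            g = (model(*x_plus) - model(*x_minus)) / (2.0 * h)
        except (ZeroDivisionError, OverflowError, ValueError):
            g = float("nan")
        grads.append(g)
    return grads

def sensitivity_metrics(model, bounds, n_mc=2000, h=1e-4, seed=0):
    rng = random.Random(seed)
    grads = [central_gradient(model, [rng.uniform(lo, hi) for lo, hi in bounds], h)
             for _ in range(n_mc)]
    # Keep samples whose four partial derivatives are all finite.
    ok = [g for g in grads if all(math.isfinite(v) for v in g)]
    pooled = sorted(abs(v) for g in ok for v in g)
    d_alpha = [g[0] for g in ok]  # index 0 is v_alpha by convention in this sketch
    return {
        "FracAllDerivativesFinite_MC": len(ok) / n_mc,
        "MedianAbsDerivative_All_MC": pooled[len(pooled) // 2],
        "P99AbsDerivative_All_MC": pooled[int(0.99 * (len(pooled) - 1))],
        "Frac_dT_dvalpha_positive_MC": sum(v > 0 for v in d_alpha) / len(d_alpha),
    }

# Hypothetical saturating candidate: increasing in v_alpha everywhere in the envelope.
def toy_model(v_alpha, v_beta, v_gamma, v_phi):
    return v_alpha / (v_alpha + 2.0) * 0.5 ** v_phi + 0.01 * v_beta / (v_gamma + 1.0)

s = sensitivity_metrics(toy_model, [(0.0, 1.0)] * 4)
```

For this toy candidate the derivative with respect to the first input is positive at every sample, so the driver-direction fraction equals 1. A candidate with a sign change inside the envelope would report a lower fraction.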

To enable compact comparison across candidates, we introduce a scalar StabilityScore (S) constructed from the MC validity and sensitivity metrics:

\begin{array}{l} S = w_{1} (1 - \mathrm{FracNonFinite}_{MC}) + w_{2} (1 - \mathrm{FracExtreme}_{MC}) + w_{3} (1 - \mathrm{FracNegative}_{MC}) \\ + w_{4} \, \mathrm{FracAllDerivativesFinite}_{MC} \\ + w_{5} \left(\frac{1}{1 + \mathrm{P99AbsDerivative}_{(All, MC)}}\right), \end{array}

(11)

with positive weights w_i selected to prioritise avoidance of non-finite evaluations and envelope-level implausibilities while penalising extreme tail sensitivity [70,71]. The resulting score is used solely to rank candidates and to support explicit accuracy–complexity–stability comparisons rather than selecting models by fit quality alone [41,69]. The StabilityScore is a screening utility and is not interpreted as a physical quantity.
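Equation (11) translates directly into a few lines of code. In the sketch below the weight values are placeholders chosen only so that they sum to one; the weights actually used are not given in this section, and the two candidate dictionaries are hypothetical.

```python
def stability_score(m, weights=(0.3, 0.2, 0.2, 0.2, 0.1)):
    """StabilityScore S from Equation (11); the default weights are illustrative only."""
    w1, w2, w3, w4, w5 = weights
    return (
        w1 * (1.0 - m["FracNonFinite_MC"])
        + w2 * (1.0 - m["FracExtreme_MC"])
        + w3 * (1.0 - m["FracNegative_MC"])
        + w4 * m["FracAllDerivativesFinite_MC"]
        + w5 * (1.0 / (1.0 + m["P99AbsDerivative_All_MC"]))
    )

# Two hypothetical candidates: one well-behaved, one with a near-singular term.
robust = {"FracNonFinite_MC": 0.0, "FracExtreme_MC": 0.0, "FracNegative_MC": 0.0,
          "FracAllDerivativesFinite_MC": 1.0, "P99AbsDerivative_All_MC": 0.0}
fragile = {"FracNonFinite_MC": 0.02, "FracExtreme_MC": 0.10, "FracNegative_MC": 0.30,
           "FracAllDerivativesFinite_MC": 0.97, "P99AbsDerivative_All_MC": 50.0}

ranking = sorted([("robust", stability_score(robust)),
                  ("fragile", stability_score(fragile))],
                 key=lambda item: item[1], reverse=True)
```

With weights that sum to one, an ideal candidate scores 1 and every violation lowers the score, so sorting in descending order gives the stability ranking.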

Finally, because equation complexity is used as an interpretability proxy in this work, we report how the above stability indicators evolve with complexity to distinguish compact, auditably stable expressions from higher-complexity expressions that may behave more like flexible data-fit correlations (still accurate within the envelope, but less transparent and potentially more locally sensitive in parts of the domain) [41,42,69]. The proposed screening ensures that candidate equations used for engineering insight are also numerically stable and physically plausible within the stated operating envelope [70,71].

3. Results and Discussion

A composite-term bounded search is performed that limits the symbolic expressions to the same physically interpretable inputs used in Coogan’s model, the composite-terms being Alpha (α), Beta (β), Gamma (γ), and Phi (ϕ). By creating a set of physically bounded equations, we obtain an XAI model that supports understanding and removes the barrier created by the black-box problem, which transparency alone cannot address. This approach tests whether higher accuracy can be reached while keeping interpretability and comparability with established AF-MPDT thrust scaling in the literature. Data selection, part of the filtering process in our workflow, followed previous AF-MPDT practice for selecting an applied-field-dominant thrust regime. This was realised by filtering for background pressure and requiring a thrust ratio of at least 90% relative to Tikhonov’s model [12]. These filters ensure an applied-field-dominant component in the total thrust; furthermore, we focus on the low-thrust range (0–1 N), where previous models have performed poorly. The goal of these equations is to study how well composite-term-aware SR can model thrust within a low-thrust applied-field range while remaining accurate and interpretable. In addition, we explore whether it consistently outperforms a baseline (Coogan’s model) with matching inputs. To fit log10(T_tot), evaluate on linear T_tot, and keep structures low in complexity and interpretable in low-ML applications, hyperparameter optimisation is required to minimise the error and maximise the accuracy of the model. In this paper, hyperparameters are optimised for the composite-bounded run using a Bayesian optimiser, improving the search space configurations. Throughout the runs, warm starts were used, and expressions that did not include all composite-terms (α, β, γ, ϕ) were removed, ensuring consistency with the baseline input design.

3.1. Hyperparameter Tuning

The sensitivity analysis for hyperparameter tuning was performed using Bayesian optimisation (BO) in Optuna, with the Tree-Structured Parzen Estimator (TPE) [76,77,78,79]. In this workflow, TPE does not modify SR candidate equations directly, but scores which SR hyperparameter configurations are statistically associated with lower error. Each Optuna trial runs SR under one proposed hyperparameter configuration, scores the hall of fame candidates, and returns an objective value (see Equation (12)).

\max \{- \mathrm{RMSE}\} \equiv \min \{\mathrm{RMSE}\}

(12)

At the end of the trial, Optuna updates the probabilistic model of the hyperparameter space with “good” or “bad” scores and suggests the next configuration to test. Because the dataset is a compilation of multiple thrusters and operating conditions, a stronger penalisation of large deviations is preferred, and RMSE provides this better than MAE. An additional consideration is that the SR training loop internally uses an elementwise squared-error loss, so an external RMSE objective is aligned with the SR. At its core, TPE is a BO method that builds density models in the hyperparameter space instead of fitting a Gaussian-process surrogate over the objective. After recording the objective values, two conditional densities are estimated (see Equation (13)).

\begin{array}{ll} l (x) = p (x \mid y \leq y^{*}) & \text{(hyperparameters that have good scores)}, \\ g (x) = p (x \mid y > y^{*}) & \text{(hyperparameters that have bad scores)}, \end{array}

(13)

where y is the objective value,

y^{*}

is a quantile threshold (the top “y fraction”), and x denotes a candidate hyperparameter configuration. In Optuna, these densities can be represented by Gaussian mixture models per parameter, and new trials are sampled preferentially from regions that statistically fall more in

l (x)

than

g (x)

[41,42]. New parameters are proposed by maximising the following ratio:

\underset{x}{\arg\max} \left(\frac{l (x)}{g (x)}\right)

(14)

This operation biases the optimiser towards regions of the hyperparameter space associated with lower error, which translates into better-performing hyperparameter configurations. The algorithm for the TPE is given in Algorithm 1.

Algorithm 1. Tree-Structured Parzen Estimator (TPE)

Inputs:

$N_{init}$: number of initial random trials
$N_{total}$: total evaluation budget
$N_{s}$: number of candidate samples per iteration
G (·): quantile rule returning g (fraction treated as “good”)
k (·): kernel/mixture family for density estimation
B (·): bandwidth (scale) rule for the density estimator
W (·): optional trial-weighting rule (often uniform)
𝒳: tree-structured search space (conditional parameters allowed)
ε: small constant (optional)

Output: $x_{best}$, best observed hyperparameter configuration

(1) Initialise the dataset:

(a) Set $D = \emptyset$, where D stores all evaluated pairs $(x_{n}, y_{n})$. Here, $x_{n}$ is a hyperparameter configuration and $y_{n} = f (x_{n})$ is its objective value.

(2) Initialisation phase (random exploration): For $n = 1, \dots, N_{init}$,

(a) Sample a valid configuration $x_{n}$ from the prior over 𝒳.

(b) Train/evaluate the model and compute the objective: $y_{n} = f (x_{n})$.

(c) Update the dataset: $D \leftarrow D \cup \{(x_{n}, y_{n})\}$.

(3) Optimisation loop (sequential improvement): For $N = N_{init}, \dots, N_{total} - 1$,

(a) Quantile threshold (defines “good” trials): Compute $g = G (|D|)$ and set $y^{*}$ as the g-quantile of $\{y : (x, y) \in D\}$. Split D into the following:

$D_{l} = \{(x, y) \in D : y \leq y^{*}\}$ (good set)
$D_{g} = \{(x, y) \in D : y > y^{*}\}$ (remaining set)

(b) Density estimation: Fit two conditional densities using k (·), B (·) (and optional weights W (·)):

$l (x) \approx p (x \mid y \leq y^{*})$ from $\{x : (x, y) \in D_{l}\}$
$g (x) \approx p (x \mid y > y^{*})$ from $\{x : (x, y) \in D_{g}\}$

(c) Sampling and selection (acquisition):

Sample $N_{s}$ candidate configurations from the good density: $S = \{x_{s}\}_{s = 1}^{N_{s}}$, with $x_{s} \sim l (x)$.
Score each candidate by the likelihood-ratio acquisition: $\mathrm{score} (x_{s}) = l (x_{s}) / (g (x_{s}) + ε)$.
Select the next configuration: $x_{N + 1} = \arg\max_{x_{s} \in S} \mathrm{score} (x_{s})$.

(d) Model training and evaluation:

Train/evaluate the model with $x_{N + 1}$ and compute its objective: $y_{N + 1} = f (x_{N + 1})$.

(e) Dataset update: Append the new observation to the dataset, $D \leftarrow D \cup \{(x_{N + 1}, y_{N + 1})\}$.

(4) Return best configuration: $x_{best} = \arg\min_{(x, y) \in D} y$.
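The steps of Algorithm 1 can be illustrated with a deliberately simplified one-dimensional sketch. It is not Optuna's implementation: it assumes a single continuous parameter, a fixed-bandwidth Gaussian kernel in place of k (·) and B (·), uniform trial weights, and a fixed "good" fraction. The function names and the quadratic test objective are hypothetical.

```python
import math
import random

def kde(points, bandwidth):
    """Gaussian kernel density estimate over 1-D points (fixed bandwidth)."""
    norm = len(points) * bandwidth * math.sqrt(2.0 * math.pi)
    def density(x):
        return sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points) / norm
    return density

def tpe_minimise(f, low, high, n_init=10, n_total=80, n_s=24,
                 good_frac=0.25, eps=1e-12, seed=0):
    rng = random.Random(seed)
    bandwidth = 0.1 * (high - low)        # assumed bandwidth rule
    # Steps (1)-(2): random initialisation of the trial set D.
    trials = []
    for _ in range(n_init):
        x = rng.uniform(low, high)
        trials.append((x, f(x)))
    # Step (3): sequential improvement.
    for _ in range(n_init, n_total):
        trials.sort(key=lambda pair: pair[1])
        n_good = max(1, math.ceil(good_frac * len(trials)))
        good = [x for x, _ in trials[:n_good]]   # D_l
        rest = [x for x, _ in trials[n_good:]]   # D_g
        l, g = kde(good, bandwidth), kde(rest, bandwidth)
        # Sample from l(x): pick a kernel centre, add Gaussian noise, clip to bounds.
        cands = [min(high, max(low, rng.gauss(rng.choice(good), bandwidth)))
                 for _ in range(n_s)]
        # Likelihood-ratio acquisition l(x) / (g(x) + eps).
        x_next = max(cands, key=lambda x: l(x) / (g(x) + eps))
        trials.append((x_next, f(x_next)))
    # Step (4): best observed configuration.
    return min(trials, key=lambda pair: pair[1])

def objective(x):
    """Hypothetical objective standing in for the RMSE returned by an SR run."""
    return (x - 2.0) ** 2

x_best, y_best = tpe_minimise(objective, -5.0, 5.0)
```

In the workflow described here, `objective` would be replaced by a full SR run returning the RMSE of the masked hall of fame, which is far more expensive than this toy function.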

Lastly, the physical bounding constraint, which ensures that our equations contain at least one instance of each composite-term derived in Section 2.2, is enforced by adding a mask that removes candidate SR equations that do not comply. TPE evaluates only the masked hall of fame; by incorporating this constraint, the Optuna objective becomes a constrained model search that rewards hyperparameter configurations yielding low RMSE on physically bounded equations, as seen in Table 2, rather than permitting purely numerically driven results.
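One way to picture the mask is as a filter on the hall of fame before the objective is computed. The sketch below assumes candidates are available as Python-style expression strings with variables named `alpha`, `beta`, `gamma` and `phi`; the expressions and RMSE values shown are illustrative, not results from the paper.

```python
import ast

REQUIRED_TERMS = {"alpha", "beta", "gamma", "phi"}

def uses_all_terms(expression, required=REQUIRED_TERMS):
    """True if every required composite-term appears at least once in the expression."""
    tree = ast.parse(expression, mode="eval")
    names = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return required <= names

# Hypothetical hall of fame: (expression, RMSE) pairs.
hall_of_fame = [
    ("alpha / (alpha + 19.6)", 0.031),
    ("(alpha / (alpha + 19.6 + beta / (gamma - 1.7)) + 0.065) * 0.45 ** phi", 0.022),
]

# Keep only candidates containing all four composite-terms.
masked = [(expr, rmse) for expr, rmse in hall_of_fame if uses_all_terms(expr)]

# Objective returned to the optimiser: best RMSE among compliant candidates,
# or infinity when no candidate satisfies the constraint.
objective_value = min(rmse for _, rmse in masked) if masked else float("inf")
```

Returning infinity for an empty masked set is one possible convention for telling the optimiser that a configuration produced no admissible equation.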

3.2. Discovered SR Equations

This study uses “complexity” as a structural descriptor for the closed-form surrogate correlations created by SR, defined as a cumulative count of weights assigned to the variables, constants, and mathematical operators that form each equation. Lower-complexity expressions lean towards clear functional dependencies, allowing for higher physical interpretability, whereas higher-complexity expressions involve nested linear and nonlinear transforms that become convoluted, leaving little room for physical explainability and resulting in a higher computational cost of evaluation. In this manuscript, opacity is treated as increasing with structural nesting and nonlinear transformations, with the complexity score providing a consistent quantitative measure of this effect. A complete formulation and breakdown of the complexity calculation is given in Section 3.3; it is introduced here because it influences the discovered equations and is a key metric for interpreting their quality and usefulness. It is important to note that the complexity metric is calculated not only for the SR-derived equations but also for the empirical models from the literature used as baselines, enabling interpretability to be discussed on a common quantitative basis.

The SR post-TPE optimisation resulted in several equations that satisfy the imposed physical constraints while ranging in complexity (see Equations (15)–(21)). These equations are structurally closer to power-law equations, as the function set was reduced to

(+, -, \div, \times, \sqrt{x}, X^{x})

to allow the equations to remain interpretable. In parallel, additional penalisation on the overuse of (

\sqrt{x}, X^{x}

) is placed to discourage bloating, nesting, and physically inconsistent terms, such as a component raised to the power of itself. Within the regressor’s allowed search space, the final reported equations were chosen from the hall of fame as solutions across the complexity–accuracy spectrum. The structural composition evolves from one equation to the next: the progression SR_1–7 can be described as incremental additions of interactions in the interpretation of the data’s underlying structure, towards our goal of a set of equations that showcases the effect of complexity on performance.

SR_{1} = (\frac{α}{(α + 19.628) + (\frac{β}{γ - 1.689})} + 0.065) * {0.448}^{ϕ}

(15)

SR_1, being the base equation with the highest interpretability, establishes the recurring structure for the low-to-medium-complexity equations (see Equations (15)–(19)). It presents a bounded

α

as the primary driving composite-term, augmented by a

γ

-conditioned

β

-channel and an external

ϕ

-dependent multiplicative scaling factor. By including

α

in both numerator and denominator, the equation satisfies the bound for the thrust range explored in this study (0–1 N) without diminishing the role of the composite-term in the equation. In this low-complexity equation, and in agreement with the literature and empirical models,

α

drives the thrust in AF-MPDTs, owing to the Lorentz forces that generate thrust in the plasma.
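The saturating role of α in Equation (15) can be checked numerically. The sketch below evaluates SR_1 as printed; the input values are arbitrary illustrative numbers in the scaled composite-term space, not points from the dataset.

```python
def sr1(alpha, beta, gamma, phi):
    """Equation (15), evaluated on scaled composite-terms."""
    return (alpha / ((alpha + 19.628) + beta / (gamma - 1.689)) + 0.065) * 0.448 ** phi

# Sweep alpha over several orders of magnitude with the other terms held fixed
# (beta, gamma and phi values are hypothetical).
beta, gamma, phi = 1.0, 3.0, 0.5
sweep = [sr1(alpha, beta, gamma, phi) for alpha in (1.0, 10.0, 100.0, 1000.0)]

# With a positive denominator offset, the rational term stays below 1,
# so the output is capped by (1 + 0.065) * 0.448**phi.
upper_bound = 1.065 * 0.448 ** phi
```

The sweep increases monotonically and approaches the cap without exceeding it, which is the bounded response described above. The same expression also has a pole at γ = 1.689, which is the kind of feature the envelope screening in Section 2.4 is designed to probe.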

SR_{2} = {0.412}^{ϕ} * 0.064 + \frac{α}{[(\frac{ϕ}{γ - 1.735} + α) * (α + \frac{β}{α})] + 19.563}

(16)

With SR₁ having established the

α

-driven core, the subsequent equations increase robustness by evolving

ϕ

from a purely multiplicative scaling factor to an internal interaction term. The first

ϕ

term,

{0.412}^{ϕ} \cdot 0.064

of SR₂, introduces a residual calibration term controlling the final output, and the second term retains

α

in the numerator and embeds composite interactions in the denominator through a double interaction channel structure (ϕ

, α

). The denominator accomplishes two things. First, it conditions

ϕ

on

γ

, which means that geometric field shaping does not translate directly into an increase in thrust, as higher values of

γ

dampen the thrust response. In a certain range, this interaction can increase thrust; beyond that range, it correlates with lower thrust, which is consistent with high levels of plasma confinement not being beneficial. Second, in response to this, the

β

-channel is introduced, normalised by

α

, as a penalty in the denominator.

SR_{3} = {0.412}^{ϕ} * 0.063 + \frac{α}{[(\frac{ϕ}{γ - 1.732} + α) * (α + \frac{β}{α * γ})] + 19.563}

(17)

SR₃ retains the low-to-medium complexity and structure but strengthens composite-term bounding by extending

γ

-conditioning beyond the

ϕ

-channel to include

β

-channel moderation. The primary driver remains

α

, and its inclusion in both denominators enforces a saturation response that avoids incremental sensitivity as

α

grows within the explored regime. In essence,

β

is now explicitly normalised by

α \cdot γ

, so

γ

regulates the strength of both the

β

- and

ϕ

-channels, improving interpretability and causing a reduction in the likelihood of unphysical growth of the

β

-term within the explored regime.

SR_{4} = {0.416}^{ϕ} * 0.065 + \frac{α}{[(\frac{ϕ}{γ - 1.735} + α) * (α + \frac{β + ϕ}{α * γ})] + 19.563}

(18)

Building on the expanding influence of the composite-term bounds, SR₄ continues the trend by introducing

ϕ

to the same normalised interaction where the

β

interaction governs. Structurally, this addition implies that, under the restricted mathematical operator set, SR found that part of the variance tied to

ϕ

is best expressed through the same denominator channel that is governed by association with

β

as normalised by

α \cdot γ

. SR₄ is the most strongly coupled member of the low-to-medium complexity family that has

α

as the driving force.

SR_{5} = (((([\frac{ϕ + α}{- 1.772 + γ} - (α * (α - γ))] - [β * ((γ - β) * γ)]) * - 0.099) - α) - 1.304) * (- 0.05)

(19)

SR₅ constitutes the first discovered equation with high flexibility: it trades the simple bounded rational structure of SR_1–4 for cross-coupled interaction terms, capable of identifying higher-order trends, fitting curvature, and residual trends within the explored low-thrust regime. It is a departure from

α

-driven equations, allowing all composite-terms to shape the thrust while retaining a low–medium complexity. The first block of the equation represents a channel conditioned by the composite interactions, while the second block allows the system to capture curvature owing to its higher-order nature. A third block of higher-order composite-terms is capable of fitting residuals, allowing the model to better capture deviations and, in theory, giving a lower error than SR₄.

As SR evolves and expands the discovery space towards solutions with greater capability to fit the underlying patterns in the data, medium-to-high-complexity equations with nested operations in convoluted relations arise. These equations, namely SR_6–7, prove capable of capturing complex interactions between the physically constrained composite-terms, minimising variance while retaining error-reduction performance. In contrast to the bounded rational responses of SR_1–4 and the higher-order cross-coupled interactions of SR₅, SR_6–7 introduce a third structure: a nested architecture in which the thrust is defined by a scaling factor and nested correction stacks.

\begin{array}{l} SR_{6} = (\frac{- 0.058}{{1.104}^{α}}) \times [(((((α + γ) * γ) + \frac{0.183}{(((α + 0.077) - γ) * 1.207) - ϕ}) * (0.193 * β)) \\ - (α - (\frac{ϕ + α}{- 1.801 + γ} - \frac{((- 0.138 * - 0.435) + (- \frac{0.011}{α}))}{(\frac{0.313}{α}) - (α \pm 0.842)} * - 0.097))) - 1.208] \end{array}

(20)

\begin{array}{l} SR_{7} = (\frac{- 0.058}{{1.104}^{α}}) \times [((((α + γ) * γ) - (- \frac{0.183}{(((α + 0.077) - γ) * 1.207) - ϕ})) * (0.193 * β)) \\ - (α - (((((\frac{ϕ + α}{- 1.801 + γ}) + 0.868) + 0.090) - \frac{((0.060) + (- \frac{0.011}{α + 0.110}))}{\frac{0.313}{α} - (α + (- 0.842 + γ))}) * - 0.097)) - {1.208}^{0.623}] \end{array}

(21)

Within both Equations (20) and (21),

α

is central as a scaling factor and as a correction term inside nested operations,

β

is the scaling factor for the dominant coupled term,

γ

conditions the response in two ways, with the coupling interaction in the first term and in shifted denominators, while

ϕ

contributes through multiple channels, not as the main driver but as a conditioning factor. The main differences between SR_6–7 are that SR₇ introduces additional internal offsets inside the correction nesting, such as a shifted

α

-dependence

(α + 0.110)^{- 1}

and the

γ

-dependence in the denominator

α + (- 0.842 + γ)

. Overall, SR₇ is a refinement of SR₆ that aims to increase internal flexibility with nested operations, developing higher complexity, sacrificing interpretability, while maximising the structural capability for capturing higher variance through additional curvature and second-order behaviours, all of this while staying within the bounds of the established constraints that rule equations SR_1–7. This suggests that these equations should report higher performance at the cost of interpretability.

3.3. Performance Metrics for the Proposed SR Equations

Benchmarking the performance of established thrust models provides context for interpreting the effectiveness of the SR. For this reason, Table 3 summarises the performance metrics of four empirical models from the literature and seven SR-derived equations reported in descending order of R² for both categories of models. In addition to metrics defined in Section 2.2, a consistent complexity metric is reported for all expressions, with all data regarding the derivation included in Table A1 in Appendix A. Although PySR [56] computes an internal complexity value for the discovered expressions, the literature models (Tikhonov, Albertoni, Glowacki, and Coogan) are not reported on a composite-term basis used by SR. To evaluate all models on the same basis, a compatible complexity score is defined and calculated for each model. In line with the PySR structure, we use the following weights: 0.2 units for each variable occurrence, 2.0 units for numerical constants, 1.0 units for binary operators (+, −,

\div, \times

), and 2.5 per power or logarithmic operator. The choice to penalise both exponential and logarithmic units by having a higher unit value is a reflection of the greater sensitivity they introduce at domain limits, reducing physics interpretability and numerical robustness. The penalisation applied is consistent with our constraints for the SR-equation design, reflecting this explicitly in the metrics rather than implicitly in the equations. To ensure a fair comparison, the empirical models are reformulated in terms of the composite predictors to avoid double-counting, ensuring that complexity is evaluated at the composite level and not at the component level. By design, the final complexity value will reflect structural interaction richness rather than raw equation length.
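The weighting scheme described above can be sketched as a small expression walker. This is an illustration of the stated weights only: how unary minus signs, repeated constants, and composite reformulations are counted is an assumption here, so the sketch is not claimed to reproduce the values in Table 3 or Table A1. Expressions are assumed to be Python-style strings.

```python
import ast

# Weights as stated in the text; the counting conventions below are assumptions.
WEIGHTS = {"variable": 0.2, "constant": 2.0, "binary": 1.0, "power_or_log": 2.5}

def complexity(expression):
    """Weighted structural complexity of a Python-style expression string."""
    tree = ast.parse(expression, mode="eval")
    # Function names such as sqrt/log are operators, not variables.
    func_nodes = {id(n.func) for n in ast.walk(tree) if isinstance(n, ast.Call)}
    score = 0.0
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and id(node) not in func_nodes:
            score += WEIGHTS["variable"]            # each variable occurrence
        elif isinstance(node, ast.Constant):
            score += WEIGHTS["constant"]            # each numerical constant
        elif isinstance(node, ast.BinOp):
            is_power = isinstance(node.op, ast.Pow)
            score += WEIGHTS["power_or_log"] if is_power else WEIGHTS["binary"]
        elif isinstance(node, ast.Call):
            score += WEIGHTS["power_or_log"]        # sqrt(...) or log(...) treated alike
    return score

# Unary minus carries no cost in this sketch (an assumption).
example_score = complexity("(alpha / (alpha + 19.628)) * 0.448 ** phi")
```

Scoring the literature models with the same walker after rewriting them in composite-term form is what places all expressions on a common basis.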

In the low-thrust regime under study, Coogan’s model exhibits the strongest performance among the empirical set, achieving an R² value of 84.21% and the lowest RMSE and MAE. Tikhonov’s and Glowacki’s models achieve R² values below 70% and exhibit larger absolute deviations, while Albertoni’s model is the weakest performer overall. Glowacki’s model reports the lowest MAPE among the empirical models (27.13%), lower than Coogan’s (34.59%); however, this does not imply superior predictive fidelity at low thrust. The divergence between the error metrics is caused by MAPE’s sensitivity to small denominators, which rewards models that reduce relative error regardless of the absolute deviations in the predictions, whereas MAE and RMSE more directly reflect the practical thrust-prediction error in newtons. Furthermore, the two models differ in complexity: Glowacki’s full equation falls in the medium-to-high complexity range, while Coogan’s remains in the low-to-medium range. Considering this, Coogan’s model is selected as the principal reference model for comparison with the SR-developed equations; its equation is presented below, already transformed into composite-term form. This baseline provides a useful interpretability reference point, since its correlation is widely used and remains compact in composite-term form.

T_{\mathrm{Coogan}} = 1.14 \times α \times R_{a} \times ϕ^{- 0.13} \times β^{- 0.3} \times {(10 + γ)}^{- 0.67}

(22)

All SR-developed equations outperform all empirical models in goodness-of-fit and absolute-error metrics. Relative to Coogan’s R² of 84.21%, SR_1–7 range from 95.12% (SR₁) to 96.76% (SR₂), a substantial improvement in the explained variance in the dataset. The absolute error falls correspondingly: the MAE decreases from Coogan’s 0.0247 N to 0.0163 N for SR₁ and 0.0143 N for SR₅, reductions of 34% and 42%, respectively, while the RMSE decreases from 0.0395 to approximately 0.019–0.022 for SR_1–5. This improvement should not be attributed to a specific physical mechanism but to the SR expressions capturing nonlinear trends and providing a more flexible mapping from composite-terms to thrust than the empirical models, while staying bounded and composite-term-constrained.

According to Table 3 and within the SR set, SR₅ is selected as the optimal equation owing to its stronger accuracy compared to the reference model while remaining in the same complexity class. Both SR₅ and Coogan’s model have a complexity of 25; despite this, SR₅ improves R² from 84.21% to 95.98%, reduces RMSE from 0.0395 to 0.0199 (approximately a 50% reduction), and reduces MAE from 0.0247 to 0.0143 (approximately a 42% reduction). SR₄ achieves marginally better R² (96.22%) and RMSE (0.0193) at the same complexity level, but the lowest MAPE within the SR_1–5 set belongs to SR₅ (24.59%). Moreover, SR₅’s MAE is equal to that of the lower-complexity equation SR₃. Assessing all of these factors together, SR₅ becomes the frontrunner: it delivers a noticeable improvement over the best literature model while preserving the complexity level and retaining a bounded composite-driven form.
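The quoted percentage reductions follow from the reported RMSE and MAE values. A short check, using only numbers stated in the text (the helper name `relative_reduction` is illustrative):

```python
def relative_reduction(baseline, value):
    """Percentage reduction of `value` relative to `baseline`."""
    return 100.0 * (baseline - value) / baseline

# Coogan's model versus SR_5 and SR_1, using the values reported in the text.
rmse_reduction_sr5 = relative_reduction(0.0395, 0.0199)  # about 49.6 %
mae_reduction_sr5 = relative_reduction(0.0247, 0.0143)   # about 42.1 %
mae_reduction_sr1 = relative_reduction(0.0247, 0.0163)   # about 34.0 %
```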

Both SR₆ and SR₇ achieve the strongest performance overall (RMSE = 0.0180–0.0179, MAE = 0.0123, R² = 96.7%), but at a substantially higher complexity level (63–75); therefore, they are interpreted as high-flexibility upper-bound fits to the data within a constrained operator set rather than optimal candidates for interpretable models. Having established Coogan’s model as the empirical baseline for comparison in Table 3, the next step is to examine how these numerical differences present across the full thrust range. Statistics such as R², RMSE, and MAE give an overview of the overall performance, but on their own they cannot reveal localised errors in the models. Visualisation is therefore needed, alongside these metrics, to assess over- and under-fitting in the predictions. Figure 7 presents regression plots (predicted versus actual parity) for all empirical models and for each equation in the SR set compared to Coogan’s (as the best empirical model), enabling a per-data-point evaluation of agreement with the y = x line and of systematic deviation.

In the empirical comparison in Figure 7a, the literature models exhibit a systematic deviation (under-fit) from the ideal case of the y = x line; this is consistent with their lower R² and larger RMSE values. The closest empirical trend to the parity line is Coogan’s model, confirming the validity of its use as the reference baseline for the remaining plots. In Figure 7b–f, the fitted trendline approaches the parity line more closely, and the point dispersion tightens, consistent with the reductions in RMSE and MAE reported in Table 3.

The reduction in dispersion is caused by a reduction in outliers, suggesting that the SR-derived equations have reduced bias and dispersion relative to the empirical baseline. The remaining dispersion at higher thrust values is further reduced in Figure 7g,h, consistent with the lower RMSE of SR_6–7, albeit at a noticeably higher complexity. SR-derived equations improve both the global fit and per-data-point agreement across the thrust range presented in the dataset, achieving the improvement progressively as they become structurally richer and more complex.

While the parity plots in Figure 7 confirmed an improved agreement and a reduced bias for the SR set against the empirical baseline, they do not give insight into which composite-terms are driving the improvements. Since an explicit constraint of the SR set is having at least one instance of the composite-terms (α, β, γ, ϕ) in each equation, it is key to verify that the inferred mapping remains interpretable in terms of composite influence and that the prediction is not dominated by a single fitted item.

To achieve this goal, a SHAP analysis is performed (see Figure 8) on SR_1–7. In doing so, our SR-developed equations are not treated as opaque fits (black box models); instead, SHAP decomposes the predictions into additive contributions from the composite-terms, providing insight into how they shift the resultant thrust relative to actual values. The SHAP analysis enables the composite-terms to be interpreted as dominant contributors, secondary or adjustment terms, and state-dependent influences across the dataset, strengthening the interpretability of the resultant SR-developed equations. Decisive operational and geometric variables are defined as those that contribute most strongly to the predicted thrust within the studied regime, quantified using SHAP attribution.
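The additive decomposition that SHAP provides can be illustrated with a minimal sketch that computes exact Shapley values by enumerating all coalitions of the four composite-terms. This is not the SHAP library and not the analysis pipeline of this study: the surrogate `toy_thrust`, the background sample, and the operating point are hypothetical, and the value function (replacing absent terms with background rows) is one common choice among several.

```python
import itertools
import math
import random


def shapley_values(f, x, background):
    """Exact Shapley attribution of f(x) relative to the background mean."""
    n = len(x)

    def value(coalition):
        # Terms in the coalition take their value from x; the rest are
        # drawn from each background row, then the outputs are averaged.
        total = 0.0
        for row in background:
            mixed = [x[j] if j in coalition else row[j] for j in range(n)]
            total += f(mixed)
        return total / len(background)

    phi = [0.0] * n
    for j in range(n):
        others = [k for k in range(n) if k != j]
        for size in range(n):
            weight = (math.factorial(size) * math.factorial(n - size - 1)
                      / math.factorial(n))
            for subset in itertools.combinations(others, size):
                s = set(subset)
                phi[j] += weight * (value(s | {j}) - value(s))
    return phi


def toy_thrust(v):
    """Hypothetical closed-form surrogate; NOT one of the SR equations."""
    alpha, beta, gamma, phi = v
    return 0.6 * alpha * phi + 0.1 * beta - 0.05 * gamma * gamma


rng = random.Random(0)
background = [[rng.random() for _ in range(4)] for _ in range(50)]
x = [0.8, 0.5, 0.3, 0.9]  # hypothetical scaled operating point

phi_vals = shapley_values(toy_thrust, x, background)
baseline = sum(toy_thrust(row) for row in background) / len(background)

for name, contribution in zip(["alpha", "beta", "gamma", "phi"], phi_vals):
    print(f"{name}: {contribution:+.4f}")
print("sum of contributions:", sum(phi_vals))
print("prediction minus baseline:", toy_thrust(x) - baseline)
```

The contributions sum to the difference between the prediction at the chosen point and the average prediction over the background sample, which is the additivity property that allows each composite-term to be read as a driver or an adjustment term.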

Across SR_1–7, the SHAP patterns show consistency between the composite-term roles and the intended physics bounding for the regressions. The largest SHAP magnitudes, together with the widest spread, are associated with the composite-term α. This indicates that variation in α accounts for the majority of the predicted thrust variation in the dataset. This result is physically consistent with the definition given for Equation (1), a compact representation of the electromagnetic forces. In AF-MPDTs, thrust production is driven by Lorentz accelerating forces induced by current-field interactions, supported by the majority of the empirical models under study, which reduce to a JB_A dependence as the primary driver. The SHAP dominance of α confirms that SR_1–7 are anchored around the expected electrodynamic drivers in low-thrust regimes, rather than arbitrary terms.

A contrasting, narrower SHAP range can be seen for β in the majority of the models, indicating a secondary influence relative to α. This is physically coherent with the term acting as the geometric scaling composite that characterises the discharge channel topology and the current distribution symmetry (whether the current channels have broad or concentrated profiles), rather than as the main thrust-generating element. The smaller SHAP magnitude is consistent with this, indicating that β primarily adjusts the thrust mapping via its effects on confinement and the electrode interaction area.

Composite-term γ shows the most state-dependent influence in the SHAP analysis. In Figure 8a,b, it has a weak contribution, remaining clustered near zero, indicating that in the dataset, the axial geometry ratio does not strongly affect the thrust beyond the generation levels recorded with α. Physically, this suggests that in the low-thrust set, γ is a secondary conditioning term. However, in Figure 8c–e, it becomes a bidirectional term, remaining the same in the higher-complexity equations in Figure 8f,g, denoting that changes in this composite-term can increase or reduce the thrust predictions to refine the accuracy depending on the operating point. From an interpretability view, γ is a modification of the axial electro-length configuration influencing current path lengths, which can introduce competing effects such as an enhancement of acceleration at times while raising losses and inefficiency in other cases, and SHAP captures this conditional occurrence.

Finally, ϕ acts predominantly as a multiplicative or conditioning factor in all SHAP plots, with changes in this composite-term shifting the predicted thrust magnitude across multiple data points. This means that rather than affecting a small subset of cases, it is treated as a global adjustment term; this behaviour is consistent with the definition of ϕ in Section 2.2 as a magnetic-field contouring/topology descriptor. Variation in ϕ represents changes in the applied-field configuration effectiveness for a given thruster geometry. From an interpretability perspective, ϕ has a supporting role in SR_1–7, complementing α as the driver and γ as the state-dependent conditioning term.

Overall, the SHAP analysis provides a qualitative consistency check between the physical intent of the composite-terms and the SR-derived closed-form surrogate correlations in SR_1–7. All equations remain anchored to the electromagnetic driver α; they incorporate geometrical effects through secondary adjustments via β, use γ as a state-dependent axial conditioning term, and take into account the field contour influence through ϕ. The alignment between these roles supports the interpretability of the end result of the study. From a SHAP perspective, the composite-terms in the SR-derived equations not only improve the fit but do so in a way that is physics-aware, while remaining readable.

While the SHAP results in Figure 8 provide insight into how the composite-terms contribute to each equation mapping, the distribution of the prediction error across the data points remains unexplored. Even when two models achieve similar MAE, their reliability can vary greatly depending on the error distribution: one can exhibit a small number of large outliers while another can produce consistently small errors. To assess robustness beyond averaged error metrics, Figure 9 compares the absolute-error histograms for both the empirical models and all SR-derived equations, permitting an evaluation of the frequency of low-error predictions. The empirical models in Figure 9a exhibit broader error distributions and heavier tails, indicating a larger fraction of operating points with moderate-to-high absolute error. In contrast, the SR-derived equations progressively concentrate the bulk of the error toward lower absolute values, as seen in Figure 9b–h. This is consistent with the reductions in MAE and RMSE tabulated in Table 3. It is worth noting that SR₅ in Figure 9f shows a pronounced shift toward low absolute error while remaining comparable in complexity to Coogan’s, supporting its selection as the preferred interpretable, composite-term-bounded SR expression. SR_6–7, as observed in Figure 9g,h, further tighten the error distribution; however, this improvement comes only at the cost of a substantially increased complexity.
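The point that two models with similar MAE can have very different error distributions can be sketched with a small absolute-error histogram. All values below are hypothetical and constructed so that both sets of predictions share the same MAE; the bin width and the number of bins are likewise illustrative choices.

```python
def abs_errors(actual, predicted):
    return [abs(p - a) for a, p in zip(actual, predicted)]


def mean_abs_error(actual, predicted):
    errs = abs_errors(actual, predicted)
    return sum(errs) / len(errs)


def abs_error_histogram(actual, predicted, bin_width, n_bins):
    """Count absolute errors per bin; the last bin collects the tail."""
    counts = [0] * n_bins
    for err in abs_errors(actual, predicted):
        idx = min(int(err / bin_width), n_bins - 1)
        counts[idx] += 1
    return counts


# Hypothetical thrust values in newtons (illustrative only).
actual = [0.05 * (i + 1) for i in range(20)]
steady = [a + 0.010 for a in actual]                             # uniform small error
spiky = [a + 0.005 for a in actual[:-1]] + [actual[-1] + 0.105]  # one large outlier

for name, pred in [("steady", steady), ("spiky", spiky)]:
    mae = mean_abs_error(actual, pred)
    worst = max(abs_errors(actual, pred))
    hist = abs_error_histogram(actual, pred, bin_width=0.02, n_bins=6)
    print(f"{name}: MAE={mae:.4f}, max error={worst:.3f}, histogram={hist}")
```

Both models report the same MAE, yet only the histogram reveals that one of them carries a heavy tail, which is the kind of behaviour Figure 9 is intended to expose.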

3.4. Trade-Off Between Complexity and Accuracy

Model selection in this study is framed within a mXAI objective: predictive accuracy is required, but the final expression needs to remain transparent and physically interpretable as a closed-form thrust correlation. This trade-off is addressed by selecting equations on the accuracy–complexity frontier to reduce opacity while retaining performance. Equation complexity is adopted as the primary interpretability indicator within the constrained SR framework because it reflects the degree of nesting and nonlinear transformations in the symbolic structure. Expressions with lower complexity are easier to interpret as composite-driven thrust relationships. Higher-complexity expressions introduce richer, flexible, nonlinear structures that can reduce error but at the cost of transparency and increased sensitivity to domain limits.

Across the SR equation set, Table 3 shows a consistent accuracy–complexity progression, supported visually by tightening of data points at higher complexities in the parity behaviour in Figure 7 and the systematic shift in absolute-error distributions in Figure 9. As complexity increases, so does R², and the absolute error metrics decrease, indicating an improved agreement with the actual thrust for prediction on the dataset. In XAI terms, this behaviour illustrates the main trade-off in composite-term-bounded SR equations: increased structural complexity generally yields improved representational capacity of the closed-form surrogate correlations and predictive performance while progressively reducing interpretability and transparency (toward opaqueness).

For the search space of low-thrust, composite-term-bounded equations with limited mathematical operands, SR₅ maintains a compact structure. It is selected as the preferred balance between interpretability and accuracy. At complexity 24, it remains comparable in structural simplicity to the empirical reference model, yet it achieves a strong performance (R² = 95.98%, RMSE = 0.0199, MAE = 0.0143) and the lowest MAPE among SR_1–5 (24.59%). Meanwhile, SR_6–7 remain tied as the upper-bound expressions that indicate the achievable accuracy ceiling if interpretability is not the guiding principle.
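Selection on the accuracy–complexity frontier can be sketched as a Pareto filter over (complexity, RMSE) pairs, where lower is better for both. The values below are those quoted in the text for Coogan’s model, SR₅, and SR_6–7; the pairing of the two RMSE values with SR₆ and SR₇ follows the order in which they are quoted and should be read as an assumption. SR_1–4 are omitted because their RMSE values are not restated here, and the function name `pareto_front` is illustrative.

```python
def pareto_front(candidates):
    """Return the names of candidates not dominated in (complexity, rmse)."""
    front = []
    for name, c, e in candidates:
        dominated = any(
            (c2 <= c and e2 <= e) and (c2 < c or e2 < e)
            for _, c2, e2 in candidates
        )
        if not dominated:
            front.append(name)
    return front


# (name, complexity, RMSE) as quoted in the text; SR6/SR7 pairing is assumed.
candidates = [
    ("Coogan", 25, 0.0395),
    ("SR5", 24, 0.0199),
    ("SR6", 63, 0.0180),
    ("SR7", 75, 0.0179),
]

print(pareto_front(candidates))
```

Under these values, Coogan’s model is dominated by SR₅ (lower complexity and lower RMSE), while SR₅, SR₆, and SR₇ all sit on the frontier; choosing among them is then a matter of how much complexity the design task can accept.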

3.5. Interpretability Boundaries and Physical Validity

This work uses SR to discover closed-form surrogate correlations that are explicit and inspectable, with physically interpretable terms being used in an engineering sense: the model can be audited for behaviours, its dependencies can be compared to known AF-MPDT trends, and its terms can be reasoned in relation to the established thrust-driver composite-terms [41,42,69]. This does not imply that these learned surrogates establish causality [69], nor that high predictive accuracy alone can guarantee correct representation of physical mechanisms; instead, it provides a tool for early design engineering modelling.

To reduce the risk that explicit equations could appear physically meaningful while behaving unreliably inside our envelope of operations, a post-discovery protocol for numerical validity and stability is introduced [70,71].

The metrics defined in Section 2.4 quantify whether each SR expression remains physically plausible and numerically stable when probed across the feasible composite-term domain (via MC sampling) and under local perturbations (finite-difference sensitivity) [70,71,72,73]. These indicators do not elevate the SR expressions to physical laws; instead, they provide a boundary against misleading interpretations [41,42,69].

3.5.1. Accuracy–Stability Trade-Off and Interpretability Boundary

Figure 10 shows the accuracy–stability relationship explicitly by plotting R² against representative protocol indicators. Two patterns emerge. First, the SR family retains uniformly high predictive accuracy in the low-thrust envelope, with incremental gains from SR_1–5 to SR_6–7. Second, the same trend is associated with a movement toward less conservative numerical behaviours: FracNegative_MC, FracExtreme_MC, and P99 |dT/dv|_MC increase for SR_6–7 relative to SR_1–5, with the StabilityScore correspondingly decreasing. This behaviour does not indicate that SR_6–7 are unreliable models, as the screening outcomes remain strong in absolute terms (zero non-finite fraction across the MC envelope) and their fit accuracy is highest. The significance of this lies in the fact that higher-complexity expressions can behave more like data-fitted correlations, achieving small accuracy gains while becoming increasingly sensitive in parts of the envelope [41,42,69].

The recorded behaviour falls within the manuscript’s use of complexity as a proxy for interpretability, where interpretability is not only readability but also auditability and robust behaviour within the design envelope [41,42,69]. SR_1–5 fall within a region that is both high-accuracy and more robust, while SR_6–7 move further toward peak accuracy at the cost of increased sensitivity and a higher frequency of physically implausible outputs (negative thrust) under envelope probing [70,71].

A practical implication for design use is that model selection should not be based on performance metrics, such as R², alone. Instead, a model should be selected from an accuracy–complexity–stability perspective: if the design task prioritises rapid parametric reasoning, SR_1–5 provide strong accuracy with higher stability; if the priority is maximum interpolation accuracy and the model will be used with tighter safeguards on input ranges, models such as SR_6–7 may be the preferred tools. The intent of the validity protocol is therefore to formalise a boundary that is often left implicit: explicit equations are interpretable in form, but their physical plausibility and numerical robustness still require screening [70,71].

3.5.2. Interpreting Validity Protocol Metrics in the Context of AF-MPDT Composite-Terms

Figure 11 summarises the envelope and sensitivity metrics across SR_1–7. The absence of non-finite outputs (FracNonFinite_MC = 0) across all closed-form surrogate correlations indicates that the discovered expressions are numerically evaluable across the MC sampled envelope of the scaled composite space [72]. The rows that follow describe progressively stricter indicators of plausibility and robustness: (i) FracNegative_MC reports the frequency of negative thrust predictions; (ii) P01(T)_MC and P99(T)_MC diagnose tail behaviour of the thrust distribution under envelope sampling; and (iii) P99 |dT/dv|_All,MC captures worst-case local sensitivity, flagging whether small perturbations in composite inputs can induce large changes in predicted thrust [72,73,74]. The direction-consistency metric Frac(∂T/∂v_α > 0)_MC specifically tests whether the developed correlations predominantly preserve the expected monotonic tendencies within an applied-field dominant envelope. This operation is performed with respect to the primary electromagnetic driver via Lorentz forces represented by our composite-term α within the tested envelope. This is treated as a consistency screen aligned with the driver/modulator interpretation, not as a causal proof: the derivative sign is used to confirm that the equation behaves like a driver-based scaling relation, which supports the interpretability claims made in this manuscript.

3.5.3. Summary Comparison: Linking Accuracy, Stability, and Complexity

Table 4 consolidates the key message by presenting R², StabilityScore, Frac(∂T/∂v_α > 0)_MC, and complexity. Two consistent trends emerge. First, increasing complexity yields diminishing accuracy improvements at the upper end of the SR family. Second, stability indicators degrade gradually with complexity. As expressions become more structurally elaborate, they remain explicit but become less conservative in the envelope-stress tests, consistent with a transition toward more data-fit-like behaviour. For this reason, we do not equate interpretability with causality: interpretability enables inspection of the form and its composite-term drivers, while the protocol established provides an explicit numerical-validity layer to prevent over-interpretation of expressions that are accurate yet locally sensitive.

While the discussed metrics quantify envelope robustness, they do not by themselves demonstrate that the developed surrogates encode the electromagnetic forces as the primary drivers in the applied-field regime, as would be expected from physical law. For this work, that dominant driver is encoded into α, and we require that the surrogates preserve basic driver-direction consistency within the operating envelope. In essence, thrust should predominantly increase with α when all other composite-terms are perturbed locally within the same composite-term-bounded space.

To verify this, we evaluate the sign of the local derivative with respect to the scaled driver using a finite-difference approximation across the MC envelope. The result is reported in column 4 of Table 4. The values show that all the SR surrogates preserve this predominantly positive driver tendency. When interpreted alongside the stability summary, this metric provides an additional physical lens that is complementary to pure performance metrics. Incorporating the ∂T/∂v_α consistency screening allows for auditing of whether the final surrogates preserve physically sensible and consistent behaviour.
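The driver-direction screen can be sketched as follows: central finite differences are evaluated at uniformly sampled points of the scaled composite space, and the fraction of points with a positive derivative with respect to α is reported over the samples where all four derivatives are finite. The unit-hypercube bounds, the step `h`, the sample count, and both toy surrogates are assumptions made for illustration and are not the SR equations of this study.

```python
import math
import random


def safe_eval(f, x):
    """Evaluate f(x); map domain errors to NaN so they can be screened out."""
    try:
        return float(f(x))
    except (ValueError, ZeroDivisionError, OverflowError, TypeError):
        return float("nan")


def central_diff(f, x, j, h):
    """Central finite-difference estimate of dT/dv_j at x."""
    up, down = list(x), list(x)
    up[j] += h
    down[j] -= h
    return (safe_eval(f, up) - safe_eval(f, down)) / (2.0 * h)


def frac_positive_driver(f, n_mc=2000, h=1e-4, seed=0):
    """Fraction of MC samples with dT/dv_alpha > 0 (index 0 is alpha)."""
    rng = random.Random(seed)
    n_all = 0
    n_positive = 0
    for _ in range(n_mc):
        x = [rng.random() for _ in range(4)]  # assumed scaled range [0, 1]
        grads = [central_diff(f, x, j, h) for j in range(4)]
        if all(math.isfinite(g) for g in grads):
            n_all += 1
            n_positive += grads[0] > 0.0
    return n_positive / n_all if n_all else float("nan")


def driver_like(v):
    """Hypothetical surrogate whose thrust always rises with alpha."""
    alpha, beta, gamma, phi = v
    return 0.6 * alpha * (0.5 + phi) + 0.1 * beta - 0.05 * gamma


def sign_flipping(v):
    """Hypothetical surrogate whose alpha-dependence changes sign with phi."""
    alpha, beta, gamma, phi = v
    return alpha * (phi - 0.5) + 0.1 * beta


print("driver-like:", frac_positive_driver(driver_like))
print("sign-flipping:", frac_positive_driver(sign_flipping))
```

A value close to 1 matches the predominantly positive driver tendency described above, whereas a value near 0.5 would flag an expression whose dependence on α is not consistent with a driver-based scaling relation.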

3.6. Interpretability Boundaries, Physical Validity, and Generalisation

While all symbolic expressions are presented in explicit closed-form, the manuscript does not treat explainability as evidence of physical causality. All derived expressions are data-supported surrogate correlations over the low-thrust envelope and have been designed to be directly comparable to established semi-empirical scaling laws in AF-MPDT literature. This is valuable as it provides a tool to contrast against known scaling trends without reliance on opaque predictors. Transparent mapping between the chosen current-field-geometry drivers and thrust variations within the measured operating range is provided by the auditable nature of these correlations, namely SR_1–7. The intentional constraint of the search space of the regressor using literature-motivated composite-terms anchors the resultant equations to physically motivated drivers, such as ϕ, that already appear in the literature. This allows comparison against benchmarked semi-empirical baselines in the same operational envelope for validation. In doing so, the risk of visually plausible but physically ungrounded equations is reduced, while we recognise that any regression model, particularly black-box models, is capable of returning correlations that are fully data-driven and not structurally logical.

4. Conclusions

AF-MPDTs are poised to become a space-ready technology in the near future, with steady progress made in recent years. However, the complexity of the interactions between the multiple inputs that play a role in thrust generation (magnetic and electrical fields, geometry, and topology) hinders the development of models that are accurate enough for design purposes. Multiple empirical attempts have resulted in models that cover a range of thrusts but are not accurate in particular regimes. AI and machine learning models present a solution to this problem, and while capable of accurate predictions, a byproduct is non-explainability. To address this gap, this study proposes a physics-constrained SR framework for developing AI-powered, interpretable applied-field thrust equations from experimental data. Unlike black-box models, SR searches the space of algebraic operands and returns explicit interpretable equations, allowing direct inspection of the dependencies and consistency checks for expected physical behaviour; said expressions are data-supported correlations within the tested envelope and are not presented as causal mechanisms. The proposed framework is as follows: first, we filter a dataset containing empirical values of multiple thrusters to data points that are field-thrust-dominant, while constraining the thrust range to under 1 N. Second, composite-terms that are physically explainable in terms of the thruster behaviour and governing laws are created to be used as inputs for the framework. Third, the SR hyperparameters are tuned using a Bayesian optimiser via Optuna. Specifically, the Tree-Structured Parzen Estimator builds density estimates for better- and worse-performing hyperparameter configurations and steers the search using the ratio of these densities, favouring configurations that produce lower error. Finally, a selection filter is applied to the hall-of-fame output from the SR. This removes equations where not all composite-terms are present, thus ensuring that TPE’s density ratio guides the regressor towards physically bounded spaces. This framework successfully discovers equation sets that are physically bounded at multiple levels of complexity, enabling an explicit accuracy–interpretability comparison, with design-targeted thrust, validated for the specific topologies provided in the dataset. An additional step after equation discovery is a mathematical and physical stress test using an MC envelope over the operational domain. The protocol gives insight into the robustness of the closed-form surrogate correlations, reporting a consistent trend with complexity: as it rises, the equations become less robust, moving into the territory of data-fitting equations. Note that the validity is constrained within a nominally stable operating envelope represented in the training dataset and does not predict instability/onset (current-crisis) boundaries; for design processes, onset criteria should be applied as an external operating constraint [80,81].
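The density-ratio idea behind the Tree-Structured Parzen Estimator can be sketched on a one-dimensional toy problem. This is a simplified illustration written from scratch, not Optuna’s sampler and not the tuning loop used in this study: the quadratic objective stands in for a validation error, and the good-fraction `gamma`, the kernel bandwidth, and the candidate count are hypothetical settings.

```python
import math
import random


def kde(points, bandwidth):
    """Return a Gaussian kernel density estimate over 1-D points."""
    norm = len(points) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2)
                   for p in points) / norm

    return density


def tpe_minimise(objective, n_trials=60, n_startup=10, gamma=0.25,
                 n_candidates=24, bandwidth=0.1, seed=0):
    """Toy TPE on [0, 1]: propose the candidate that maximises l(x) / g(x)."""
    rng = random.Random(seed)
    # Random start-up trials seed the two density models.
    trials = [(x, objective(x)) for x in (rng.random() for _ in range(n_startup))]
    while len(trials) < n_trials:
        ranked = sorted(trials, key=lambda t: t[1])
        n_good = max(1, int(gamma * len(ranked)))
        good = [x for x, _ in ranked[:n_good]]   # lower-error configurations
        bad = [x for x, _ in ranked[n_good:]]    # remaining configurations
        l, g = kde(good, bandwidth), kde(bad, bandwidth)
        # Candidates are drawn around good points, then ranked by the ratio.
        candidates = [min(1.0, max(0.0, rng.gauss(rng.choice(good), bandwidth)))
                      for _ in range(n_candidates)]
        best = max(candidates, key=lambda x: l(x) / (g(x) + 1e-12))
        trials.append((best, objective(best)))
    return min(trials, key=lambda t: t[1])


def toy_error(x):
    """Hypothetical validation error with its minimum at x = 0.3."""
    return (x - 0.3) ** 2


best_x, best_loss = tpe_minimise(toy_error)
print(f"best x = {best_x:.3f}, error = {best_loss:.5f}")
```

Because each new trial is placed where the density of good configurations is high relative to the density of poor ones, the search concentrates around low-error regions without evaluating the objective on every candidate.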

Key findings of this paper are as follows:

SR yields high predictive performance relative to literature correlations under the operational envelope provided by the pre-processing and filtering methods. The best performance from the literature models for this subset comes from Coogan’s model with R² = 84.21%, RMSE = 0.0395, and MAE = 0.0247. In contrast, all developed SR equations perform better, achieving R² ≥ 0.8421 while reducing the errors, with the lowest reported RMSE = 0.0179 and MAE = 0.0123. This indicates that the SR is capable of better capturing the variance in the data while increasing the prediction accuracy compared to the baseline models for a low-thrust, applied-field-dominant regime of coaxial AF-MPDTs.
A complexity–accuracy trade-off is observed in the developed equations, where complexity serves as a proxy for the interpretability of the closed-form surrogate correlations. The set SR_1–5 delivers improvements against benchmark models at the same or lower levels of complexity than Coogan’s model (C = 25). Additional performance improvement is observed for SR_6–7 at higher complexities. The results support the selection of SR₅ as the best fit of the study, given its strong predictive performance (R² = 95.98%, RMSE = 0.0199, MAE = 0.0143) at a lower complexity (C = 24), providing a more interpretable, physics-bounded, and higher-performing equation than those in the literature. The trade-off implies that complexity can be used as a filter for the selection of the desired equation, dependent on the engineering design objective, whether it be accuracy or interpretability. Complexity and StabilityScore are found to be inversely related: as complexity increases, StabilityScore decreases.
Variance capture does not guarantee proportional accuracy in the low-thrust regime for empirical formulas in the literature. Among the semi-empirical models, Coogan’s has the highest R², while the lowest MAPE (27.13%) belongs to Glowacki’s model, indicating that Coogan’s model produces larger proportional deviations in its predictions. Our SR models are capable of reaching a lower error (20.34%), confirming that our approach improves not only variance capture but also proportional accuracy, which is important in low-thrust regimes where MAPE can be more stringent.
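The last finding, that a higher R² does not guarantee a lower MAPE, can be reproduced with a small constructed example. The thrust values and both sets of predictions are hypothetical: one model carries a constant absolute offset, the other a constant 10% proportional error.

```python
def r2_score(actual, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical low-thrust values in newtons (illustrative only).
actual = [0.05, 0.10, 0.20, 0.50, 0.90]
offset_model = [a + 0.03 for a in actual]  # constant absolute error
scaled_model = [1.10 * a for a in actual]  # constant 10% proportional error

for name, pred in [("offset", offset_model), ("scaled", scaled_model)]:
    print(f"{name}: R2={r2_score(actual, pred):.4f}, MAPE={mape(actual, pred):.2f}%")
```

The offset model attains the higher R² because its squared errors stay small at high thrust, yet its MAPE is more than twice that of the scaled model because the same offset is large relative to the lowest thrust values. This is why MAPE is the more stringent metric in the low-thrust regime.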

The developed SR is a deployable thrust-estimation tool that can be used for AF-MPDT design studies under the operational envelope of this study, particularly in those involving large datasets with a wide range of geometric design variables. Because the explicit closed-form surrogate correlations are computationally negligible compared with high-fidelity simulations, the tool is practical for large parametric sweeps that systematically vary key design and operating variables. This capability makes it possible to map the design regions that yield desired outcomes before investing in costly experimental runs. Beyond this work, an intuitive expansion currently under development explores the capabilities of SR with universal ranges of thrust. This effort aims for high accuracy and efficiency for all thrust ranges, coupled with the exploration of different non-post hoc XAI technologies.

Author Contributions

Conceptualisation, M.Y.-A.; methodology, M.R.-M. and M.Y.-A.; software, M.R.-M. and M.R.; validation, W.S. and M.Y.-A.; formal analysis, M.R.-M. and M.Y.-A.; investigation, M.R.-M., M.R. and M.Y.-A.; resources, W.S. and M.Y.-A.; data curation, M.R.-M. and M.R.; writing—original draft preparation, M.R.-M. and M.Y.-A.; writing—review and editing, W.S. and M.Y.-A.; visualisation, M.R.-M. and M.Y.-A.; supervision, W.S. and M.Y.-A.; project administration, W.S. and M.Y.-A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

All data that support the findings of this study are included within the article.

Acknowledgments

For the purpose of open access, the authors have applied a Creative Commons Attribution (CC BY) licence to any author-accepted manuscript version arising from this submission.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Appendix A.1

Table A1, with the breakdown of complexity for all empirical models and SR-derived equations, is provided below. Table A1 serves as the count that goes into the complexity calculation Equation (A1) and the final complexity value reported in Table 4.

C_w = W_var·N_var + W_const·N_const + W_bin·N_bin + W_pow·N_pow + W_log·N_log,

(A1)

where W_var = 0.2, W_const = 2.0, W_bin = 1, W_pow = 2.5, and W_log = 2.5.

Table A1. Node count and final complexity value for empirical and SR-derived equations, calculated by multiplying the node count by the respective weight (unit value).

Model	N_var	N_const	N_bin	N_pow	N_log	Complexity
Albertoni et al.	4	4	6	1	0	19
Tikhonov et al.	3	1	3	0	0	6
Glowacki et al.	11	17	23	4	1	71
Coogan et al. (Equation (19))	5	5	6	3	0	25
SR₁ (Equation (12))	5	4	7	1	0	19
SR₂ (Equation (13))	8	4	10	1	0	22
SR₃ (Equation (14))	9	4	11	1	0	23
SR₄ (Equation (15))	10	4	12	1	0	25
SR₅ (Equation (16))	11	4	14	0	0	24
SR₆ (Equation (17))	15	14	27	2	0	63
SR₇ (Equation (18))	16	17	30	3	0	75
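The weighted count in Equation (A1) can be sketched directly from the node counts in Table A1. Rounding the weighted sum to the nearest integer (half up) is an assumption inferred from the tabulated values rather than a stated rule, and the check below covers only the Coogan and SR_1–7 rows; the function names are illustrative.

```python
import math

# Weights as given with Equation (A1).
WEIGHTS = {"var": 0.2, "const": 2.0, "bin": 1.0, "pow": 2.5, "log": 2.5}


def weighted_complexity(n_var, n_const, n_bin, n_pow, n_log):
    """Weighted node count C_w from Equation (A1), before rounding."""
    return (WEIGHTS["var"] * n_var + WEIGHTS["const"] * n_const
            + WEIGHTS["bin"] * n_bin + WEIGHTS["pow"] * n_pow
            + WEIGHTS["log"] * n_log)


def reported_complexity(*counts):
    """Assumed rounding: nearest integer, with halves rounded up."""
    return math.floor(weighted_complexity(*counts) + 0.5)


# (N_var, N_const, N_bin, N_pow, N_log) and tabulated complexity from Table A1.
rows = {
    "Coogan": ((5, 5, 6, 3, 0), 25),
    "SR1": ((5, 4, 7, 1, 0), 19),
    "SR2": ((8, 4, 10, 1, 0), 22),
    "SR3": ((9, 4, 11, 1, 0), 23),
    "SR4": ((10, 4, 12, 1, 0), 25),
    "SR5": ((11, 4, 14, 0, 0), 24),
    "SR6": ((15, 14, 27, 2, 0), 63),
    "SR7": ((16, 17, 30, 3, 0), 75),
}

for name, (counts, tabulated) in rows.items():
    c_w = weighted_complexity(*counts)
    print(f"{name}: C_w={c_w:.1f}, rounded={reported_complexity(*counts)}, "
          f"tabulated={tabulated}")
```

For example, SR₅ gives 0.2·11 + 2.0·4 + 1·14 = 24.2, which rounds to the tabulated value of 24, while Coogan’s model gives 24.5, which rounds up to 25.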

Appendix A.2

This appendix lists the mathematical formulations for the protocol for numerical validity, physical plausibility, and local-sensitivity metrics used to screen SR candidate expressions. All metrics are computed per model/equation.

Notation

Let T(x) be the thrust predicted by a given SR equation at input x in scaled composite space.
Dataset (Data) predictions: {T_data,i}_{i = 1…N_data}.
MC envelope predictions: {T_MC,i}_{i = 1…N_MC}, where x_i ~ Uniform in (α, β, γ, ϕ) space.
Finite indicator: 1_finite(T) = 1 if T is finite (not NaN/Inf), else 0.
Percentile operator: P_q({·}) denotes the q-th percentile (e.g., q = 1 or 99) over finite values.
Central finite-difference partial derivative in scaled space (step h), written D_j,i when evaluated at the MC sample x_i:

∂T/∂v_j(x) ≈ [T(x + h e_j) − T(x − h e_j)]/(2 h), j ∈ {α, β, γ, ϕ}.

The finite-all set contains the MC samples at which all four derivatives are finite, and N_all denotes its size.

A. MC envelope validity metrics

FracFinite_MC = (1/N_MC) · Σ_{i = 1…N_MC} 1_finite(T_MC,i)

(A2)

FracNonFinite_MC = 1 − FracFinite_MC

FracNegative_MC = (1/N_fin) · Σ_{i:finite} 1(T_MC,i < 0), where N_fin = Σ_{i = 1…N_MC} 1_finite(T_MC,i)

(A3)

MedianT_MC = P₅₀({T_MC,i | finite})

(A4)

P01T_MC = P₁({T_MC,i | finite})

(A5)

P99T_MC = P₉₉({T_MC,i | finite})

(A6)

Let T1_data = P₁({T_data,i | finite}), T99_data = P₉₉({T_data,i | finite}), ΔT_data = max(T99_data − T1_data, ε).

(A7)

Define bounds: T_lo = T1_data − k·ΔT_data, T_hi = T99_data + k·ΔT_data.

(A8)

FracExtreme_MC = (1/N_fin) · Σ_{i:finite} 1(T_MC,i < T_lo OR T_MC,i > T_hi)

(A9)

FracAllDerivativesFinite_MC = (1/N_MC) · Σ_{i = 1…N_MC} 1_finite(D_α,i)·1_finite(D_β,i)·1_finite(D_γ,i)·1_finite(D_ϕ,i)

(A10)

MedianAbsDerivative_All,MC = P₅₀({ |D_j,i|: i in finite-all set, j ∈ {α, β, γ, ϕ} })

(A11)

P99AbsDerivative_All,MC = P₉₉({ |D_j,i|: i in finite-all set, j ∈ {α, β, γ, ϕ} })

(A12)

Frac(∂T/∂v_α > 0)_MC = (1/N_all) · Σ_{i in finite-all set} 1(D_α,i > 0)

(A13)

MedianAbs_dTdvAlpha_MC = P₅₀({ |D_α,i|: i in finite-all set })

(A14)

P99Abs_dTdvAlpha_MC = P₉₉({ |D_α,i|: i in finite-all set })

(A15)

StabilityScore = w1·(1 − FracNonFinite_MC) + w2·(1 − FracExtreme_MC) + w3·(1 − FracNegative_MC) + w4·FracAllDerivativesFinite_MC + w5·(1/(1 + P99AbsDerivative_All,MC))

(A16)
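The envelope metrics above can be sketched end to end as follows. Several quantities are not specified in this appendix and are therefore assumptions in the sketch: equal weights w1 to w5, the margin factor k, the floor ε, the step h, the unit-hypercube sampling bounds, and the linear-interpolation percentile rule. The surrogate `toy_thrust` and the dataset predictions are hypothetical and are not results of this study.

```python
import math
import random


def safe_eval(f, x):
    """Evaluate f(x); map domain errors to NaN so they count as non-finite."""
    try:
        return float(f(x))
    except (ValueError, ZeroDivisionError, OverflowError, TypeError):
        return float("nan")


def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def envelope_metrics(f, data_preds, n_mc=2000, k=0.5, eps=1e-9, h=1e-4,
                     weights=(0.2, 0.2, 0.2, 0.2, 0.2), seed=0):
    rng = random.Random(seed)
    xs = [[rng.random() for _ in range(4)] for _ in range(n_mc)]
    finite = [t for t in (safe_eval(f, x) for x in xs) if math.isfinite(t)]
    n_fin = len(finite)
    frac_nonfinite = 1.0 - n_fin / n_mc

    # Plausibility bounds built from dataset predictions (A7)-(A8).
    t1, t99 = percentile(data_preds, 1), percentile(data_preds, 99)
    delta = max(t99 - t1, eps)
    t_lo, t_hi = t1 - k * delta, t99 + k * delta

    # Assumption: with no finite outputs, report the worst case (1.0).
    frac_negative = sum(t < 0.0 for t in finite) / n_fin if n_fin else 1.0
    frac_extreme = (sum(t < t_lo or t > t_hi for t in finite) / n_fin
                    if n_fin else 1.0)

    # Central-difference derivatives; keep samples where all four are finite.
    abs_derivs, n_all = [], 0
    for x in xs:
        grads = []
        for j in range(4):
            up, down = list(x), list(x)
            up[j] += h
            down[j] -= h
            grads.append((safe_eval(f, up) - safe_eval(f, down)) / (2.0 * h))
        if all(math.isfinite(g) for g in grads):
            n_all += 1
            abs_derivs.extend(abs(g) for g in grads)
    frac_deriv_finite = n_all / n_mc
    p99_abs = percentile(abs_derivs, 99) if abs_derivs else float("inf")

    w1, w2, w3, w4, w5 = weights
    score = (w1 * (1 - frac_nonfinite) + w2 * (1 - frac_extreme)
             + w3 * (1 - frac_negative) + w4 * frac_deriv_finite
             + w5 / (1 + p99_abs))
    return {"FracNonFinite_MC": frac_nonfinite, "FracNegative_MC": frac_negative,
            "FracExtreme_MC": frac_extreme,
            "FracAllDerivativesFinite_MC": frac_deriv_finite,
            "P99AbsDerivative_All_MC": p99_abs, "StabilityScore": score}


def toy_thrust(v):
    """Hypothetical closed-form surrogate; NOT one of the SR equations."""
    alpha, beta, gamma, phi = v
    return 0.6 * alpha * (0.5 + phi) + 0.1 * beta - 0.05 * gamma


# Hypothetical dataset predictions in newtons (illustrative only).
data_preds = [0.05, 0.12, 0.20, 0.33, 0.41, 0.58, 0.70, 0.85]

for key, val in envelope_metrics(toy_thrust, data_preds).items():
    print(f"{key}: {val:.4f}")
```

With weights that sum to one, every term lies between 0 and 1, so the score is bounded in the same range; an expression with a singularity inside the envelope is penalised through both the non-finite fraction and the derivative terms.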

References

Mohan, N.; Ferguson, A.E.; Cech, H.; Bose, R.; Renatin, P.R.; Marina, M.K.; Ott, J. A Multifaceted Look at Starlink Performance. In Proceedings of the WWW ′24: The ACM Web Conference 2024, Singapore, 13–17 May 2024; Volume 12.
Kingdon, J. 3 months transit time to Mars for human missions using SpaceX Starship. Sci. Rep. 2025, 15, 17764.
NASA. Artemis II News and Updates—NASA. Available online: https://www.nasa.gov/artemis-ii-news-and-updates/ (accessed on 7 November 2025).
Russell, C.; Raymond, C. The dawn mission to minor planets 4 vesta and 1 ceres. In The Dawn Mission to Minor Planets 4 Vesta and 1 Ceres; Springer Nature: Berlin/Heidelberg, Germany, 2012; Volume 9781461449034, pp. 1–574.
Foing, B.; Racca, G.; Marini, A.; Evrard, E.; Stagnaro, L.; Almeida, M.; Koschny, D.; Frew, D.; Zender, J.; Heather, J.; et al. SMART-1 mission to the Moon: Status, first results and goals. Adv. Space Res. 2006, 37, 6–13.
Oz, İ.; Yilmaz, Ü. Design Tradeoffs in Full Electric, Hybrid and Full Chemical Propulsion Communication Satellite. Sak. Univ. J. Comput. Inf. Sci. 2019, 2, 124–133.
Palaszewski, B.; Meyer, M.; Johnson, L.; Goebel, D.; White, H.; Coote, D. In-Space Chemical Propulsion Systems Roadmap; Glenn Research Center: Cleveland, OH, USA, 2017.
Goddard, R.H. The Green Notebooks; The Dr. Robert H. Goddard Collection at the Clark University Archives; Clark University Archives: Worcester, MA, USA, 1906.
Goebel, D.M.; Katz, I. Fundamentals of Electric Propulsion: Ion and Hall Thrusters; JPL Space Science and Technology Series; John Wiley & Sons: Hoboken, NJ, USA, 2008. [Google Scholar]
Krülle, G.; Auweter-Kurtz, M.; Sasoh, A. Technology and application aspects of applied field magnetoplasmadynamic propulsion. J. Propuls. Power 1998, 14, 754–763. [Google Scholar] [CrossRef]
Collier-Wright, M.; Bögel, E.; La, M.; Betancourt, R. High-Power Electric Propulsion as an Enabler for Moon Missions. In Proceedings of the International Astronautical Congress 2021, Dubai, United Arab Emirates, 25–29 October 2021. [Google Scholar]
La, M.; Betancourt, R.; Herdrich, G.; Bauer, M. High Temperature Superconductors as game changers for Plasma based Space Propulsion Systems for GEO Satellites, Drag Compensation of Large Space Structures and Beyond Earth Orbit Missions. In Proceedings of the Space Propulsion Conference 2018, Sevilla, Spain, 14–18 May 2018. [Google Scholar]
Polk, J.; Choueiri, E.; Sheppard, A.; Smith, E.F.; Goebel, D.M.; Martin, A.; Polzin, K.A. Development of High Power Lithium Magnetoplasmadynamic Thrusters to Support Human Mars Exploration. In Proceedings of the 38th International Electric Propulsion Conference, Toulouse, France, 23–28 June 2024. [Google Scholar]
Han, M.; Rana, H. Applied-Field Magnetoplasmadynamic Thrusters for Deep Space Exploration. arXiv 2024, arXiv:2410.17478. [Google Scholar] [CrossRef]
Almeida, T.P.; Bonab, S.A.; Yazdani-Asrami, M. Data-driven thrust prediction in applied-field magnetoplasmadynamic thrusters for space missions using artificial intelligence-based models. Mach. Learn. Sci. Technol. 2025, 6, 025050. [Google Scholar] [CrossRef]
Voronov, A.S.; Troitskiy, A.A.; Egorov, I.D.; Samoilenkov, S.V.; Vavilov, A.P. Magnetoplasmadynamic thruster with an applied field based on the second generation high-temperature superconductors. In Journal of Physics: Conference Series; IOP Publishing Ltd.: Bristol, UK, 2020. [Google Scholar] [CrossRef]
Boxberger, A.; Herdrich, G. Integral Measurements of 100 kW Class Steady State Applied-Field Magnetoplasmadynamic Thruster SX3. In Proceedings of the 35th International Electric Propulsion Conference, Atlanta, GA, USA, 8–12 October 2017. [Google Scholar]
Glowacki, J.; Webster, E.; Hellmann, S. Thrust and Efficiency Characterization of a Low-power Applied-field Magnetoplasma dynamic Thruster with a Superconducting Magnet. In AIAA Science and Technology Forum and Exposition, AIAA SciTech Forum 2025; American Institute of Aeronautics and Astronautics Inc, AIAA: Reston, VA, USA, 2025. [Google Scholar] [CrossRef]
Tikhonov, V.B.; Semenikhin, S.A.; Brophy, J.R.; Polk, J.E. Performance of 130 kw MPD Thruster with an External Magnetic Field and Li as a Propellant. In Proceedings of the 25th International Electric Propulsion Conference, Cleveland, OH, USA, 24–28 October 1997. [Google Scholar]
Myers, R.M. Scaling of 100 kW class applied-field MPD thrusters. In AIAA/ASME/SAE/ASEE 28th Joint Propulsion Conference and Exhibit; American Institute of Aeronautics and Astronautics: Reston, VA, USA, 1992; Volume 1992. [Google Scholar] [CrossRef]
Albertoni, R.; Paganucci, F.; Andrenucci, M. A phenomenological performance model for applied-field MPD thrusters. Acta Astronaut. 2015, 107, 177–186. [Google Scholar] [CrossRef]
Fradkin, D.B.; Blackstock, A.W.; Roehling, D.J.; Stratton, T.F.; Williams, M.; Liewer, K.W. Experiments using a 25-kw hollow cathode lithium vapor MPD arcjet. AIAA J. 1970, 8, 886–894. [Google Scholar] [CrossRef]
Herdrich, G.; Boxberger, A.; Petkow, D.; Gabrielli, R.; Andrenucci, M.; Albertoni, R.; Paganucci, F.; Rossetti, P.; Fasoulas, S. Advanced scaling model for simplified thrust and power scaling of an applied-field magnetoplasmadynamic thruster. In Proceedings of the 46th AIAA/ASME/SAE/ASEE Joint Propulsion Conference & Exhibit, Nashville, TN, USA, 25–28 July 2010. [Google Scholar] [CrossRef]
La, M.; Betancourt, R.; Bögel, E.; Herdrich, G. Downscaling the 100kW SX3-AF-MPD to the 5kW SUPREME Thruster. J. Electr. Propuls. 2024, 4, 28. [Google Scholar]
Coogan, W.J.; Choueiri, E.Y. A critical review of thrust models for applied-field magnetoplasmadynamic thrusters. In Proceedings of the 53rd AIAA/SAE/ASEE Joint Propulsion Conference, Atlanta, GA, USA, 10–12 July 2017. [Google Scholar] [CrossRef]
Balkenhohl, J.; Glowacki, J.; Rattenbury, N.; Cater, J. A review of low-power applied-field magnetoplasmadynamic thruster research and the development of an improved performance model. J. Electr. Propuls. 2023, 2, 1. [Google Scholar] [CrossRef]
Glowacki, J.; Webster, E.; Hellmann, S.; Balkenhohl, J.; Rattenbury, N.; Cater, J. Performance and Model Comparisons for Kilowatt-Class Applied-Field Magnetoplasmadynamic Thrusters Operating at Magnetic Flux Densities up to Tesla-Level Magnitudes. In Proceedings of the 39th International Electric Propulsion Conference, London, UK, 14–19 September 2025. [Google Scholar] [CrossRef]
Goodwill, J.; Wilson, C.; MacKinnon, J. Current AI technology in space. Precis. Med. Long Safe Permanence Hum. Space 2025, 239–250. [Google Scholar] [CrossRef]
Yazdani-Asrami, M.; Song, W.; Morandi, A.; De Carne, G.; Murta-Pina, J.; Pronto, A.; Oliveira, R.; Grilli, F.; Pardo, E.; Parizh, M.; et al. Roadmap on artificial intelligence and big data techniques for superconductivity. Supercond. Sci. Technol. 2023, 36, 043501. [Google Scholar] [CrossRef]
Ford, O.; Bonab, S.A.; Yazdani-Asrami, M. Artificial intelligence-aided liquid hydrogen tank modelling for sustainable aviation. Energy Rep. 2025, 14, 4609–4623. [Google Scholar] [CrossRef]
Bonab, S.A.; Yazdani-Asrami, M. Artificial intelligence-based model to predict the heat transfer coefficient in flow boiling of liquid hydrogen as fuel and cryogenic coolant in future hydrogen-powered cryo-electric aviation. Fuel 2024, 381, 133323. [Google Scholar] [CrossRef]
Yan, D.; Sadeghi, A.; Yazdani-Asrami, M.; Song, W. Artificial-Intelligence-Driven Model for Resistive Superconducting Fault Current Limiter in Future Electric Aircraft. IEEE Trans. Appl. Supercond. 2024, 34, 5601616. [Google Scholar] [CrossRef]
Alipour Bonab, S.; Berg, F.; Song, W.; Colle, A.; Yazdani-Asrami, M. Advanced deep-learning model for temporal-dependent prediction of dynamic behaviour of AC losses in superconducting propulsion motors for hydrogen-powered cryo-electric aircraft. Commun. Eng. 2025, 4, 221. [Google Scholar] [CrossRef] [PubMed]
Ali, S.; Abuhmed, T.; El-Sappagh, S.; Muhammad, K.; Alonso-Moral, J.M.; Confalonieri, R.; Guidotti, R.; Del Ser, J.; Díaz-Rodríguez, N.; Herrera, F. Explainable Artificial Intelligence (XAI): What we know and what is left to attain Trustworthy Artificial Intelligence. Inf. Fusion 2023, 99, 101805. [Google Scholar] [CrossRef]
Yazdani-Asrami, M.; Fang, L.; Pei, X.; Song, W. Smart fault detection of HTS coils using artificial intelligence techniques for large-scale superconducting electric transport applications. Supercond. Sci. Technol. 2023, 36, 085021. [Google Scholar] [CrossRef]
Bonab, S.A.; Yazdani-Asrami, M. Investigation on the heat transfer estimation of subcooled liquid hydrogen for transportation applications using intelligent technique. Int. J. Hydrogen Energy 2024, 84, 468–479. [Google Scholar] [CrossRef]
Ulchi Suresh, N.V.; Sadeghi, A.; Yazdani-Asrami, M. Critical current parameterization of high temperature Superconducting Tapes: A novel approach based on fuzzy logic. Superconductivity 2023, 5, 100036. [Google Scholar] [CrossRef]
Chai, Z.H.; Parashar, T.N. Utilization of Accurate Thrust and Voltage Models for Applied-Field Magnetoplasmadynamic Thrusters. 2025 Reg. Stud. Conf. 2025, 17, AIAA 2025-106786. [Google Scholar] [CrossRef]
Almeida, T.P.; Bonab, S.A.; Asrami, M.Y. Enhancing Thrust Prediction for AF-MPDTs using XGBoost Machine Learning Technique. In Proceedings of the 21st PEGASUS Aero Student Conference, Prague, Czech Republic, 23–25 April 2025. [Google Scholar]
Hassija, V.; Chamola, V.; Mahapatra, A.; Singal, A.; Goel, D.; Huang, K.; Scardapane, S.; Spinelli, I.; Mahmud, M.; Hussain, A. Interpretability of Black-Box Models: A Review on Explainable Artificial Intelligence (XAI). Cogn. Comput. 2024, 16, 45–74. [Google Scholar] [CrossRef]
Doshi-Velez, F.; Kim, B. Towards A Rigorous Science of Interpretable Machine Learning. March 2017. Available online: http://arxiv.org/abs/1702.08608 (accessed on 10 October 2025).
Lipton, Z.C. The Mythos of Model Interpretability. arXiv 2017, arXiv:1606.03490. [Google Scholar] [CrossRef]
Ahmed, I.; Jeon, G.; Piccialli, F. From Artificial Intelligence to Explainable Artificial Intelligence in Industry 4.0: A Survey on What, How, and Where. IEEE Trans. Ind. Informatics 2022, 18, 5031–5042. [Google Scholar] [CrossRef]
Adadi, A.; Berrada, M. Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI). IEEE Access 2018, 6, 52138–52160. [Google Scholar] [CrossRef]
Faraji, F.; Reza, M. Machine Learning Applications to Computational Plasma Physics and Reduced-Order Plasma Modeling: A Perspective. J. Phys. D Appl. Phys. 2025, 58, 102002. [Google Scholar] [CrossRef]
Bartlett, D.J.; Desmond, H.; Ferreira, P.G.; Kronberger, G. Introduction to the Special Issue on Symbolic Regression in the Physical Sciences. arXiv 2025, arXiv:2512.15920. [Google Scholar]
Jorns, B. Predictive, data-driven model for the anomalous electron collision frequency in a Hall effect thruster. Plasma Sources Sci. Technol. 2018, 27, 104007. [Google Scholar] [CrossRef]
Carabantes, M. Black-box artificial intelligence: An epistemological and critical analysis. AI Soc. 2019, 35, 309–317. [Google Scholar] [CrossRef]
An, J.; Zhang, Y.; Joe, I. Specific-Input LIME Explanations for Tabular Data Based on Deep Learning Models. Appl. Sci. 2023, 13, 8782. [Google Scholar] [CrossRef]
Lundberg, S.M.; Lee, S.I. A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems 30; NeurIPS Proceedings: San Diego, CA, USA, 2017; Volume 2017, pp. 4766–4775. [Google Scholar]
Koza, J.R. Genetic programming as a means for programming computers by natural selection. Stat. Comput. 1994, 4, 87–112. [Google Scholar] [CrossRef]
La Cava, W.; Orzechowski, P.; Burlacu, B.; de França, F.O.; Virgolin, M.; Jin, Y.; Kommenda, M.; Moore, J.H. Contemporary Symbolic Regression Methods and their Relative Performance. arXiv 2021, arXiv:2107.14351. [Google Scholar] [CrossRef]
Manti, S.; Lucantonio, A. Discovering interpretable physical models using symbolic regression and discrete exterior calculus. Mach. Learn. Sci. Technol. 2024, 5, 015005. [Google Scholar] [CrossRef]
Kronberger, G. The Inefficiency of Genetic Programming for Symbolic Regression-Extended Version. arXiv 2024, arXiv:2404.17292. [Google Scholar]
Udrescu, S.M.; Tegmark, M. AI Feynman: A physics-inspired method for symbolic regression. Sci. Adv. 2020, 6, eaay2631. [Google Scholar] [CrossRef]
Cranmer, M. Interpretable Machine Learning for Science with PySR and SymbolicRegression.jl. arXiv 2023, arXiv:2305.01582. [Google Scholar]
Vladislavleva, E.J.; Smits, G.F.; den Hertog, D. Order of nonlinearity as a complexity measure for models generated by symbolic regression via pareto genetic programming. IEEE Trans. Evol. Comput. 2009, 13, 333–349. [Google Scholar] [CrossRef]
Acheson, C.R. A Central-Cathode Electrostatic Thruster Featuring a High-Temperature Superconducting Applied Field Module. Ph.D. Thesis, Victoria University of Wellington, Wellington, New Zealand, 2024. [Google Scholar]
Tikhonov, V.; Semenikhin, S.; Brophy, J.R.; Polk, J.E. The experimental performances of the 100 kW Li MPD thruster with external magnetic field. In Proceedings of the 24th International Electric Propulsion Conference, IEPC-95-105, Moscow, Russia, 19–23 September 1995. [Google Scholar]
Tikhonov, V.B.; Semenikhin, S.A.; Alexandrov, V.A.; Popov, G.A. Research of plasma acceleration processes in self-field and applied magnetic field thrusters. In Proceedings of the 23rd International Electric Propulsion Conference, IEPC-93-076, Seattle, WA, USA, 13–16 September 1993. [Google Scholar]
Tahara, H.; Yasui, H.; Kagaya, Y.; Yoshikawa, T. Development of a Quasi-Steady MPD arcjet thruster for Near-Earth missions. In Proceedings of the 19th International Electric Propulsion Conference, Colorado Springs, CO, USA, 11–13 May 1987. [Google Scholar] [CrossRef]
West, R.M. Best practice in statistics: The use of log transformation. Ann. Clin. Biochem. Int. J. Biochem. Lab. Med. 2021, 59, 162–165. [Google Scholar] [CrossRef]
Myers, R.M. Geometric Scaling of Applied-Field Magnetoplasmadynamic Thrusters. J. Propuls. Power 1995, 11, 343–350. [Google Scholar] [CrossRef]
Makke, N.; Chawla, S. Interpretable scientific discovery with symbolic regression: A review. Artif. Intell. Rev. 2024, 57, 2. [Google Scholar] [CrossRef]
Ochella, S.; Shafiee, M. Performance Metrics for Artificial Intelligence (AI) Algorithms Adopted in Prognostics and Health Management (PHM) of Mechanical Systems. J. Phys. Conf. Ser. (JPCS) 2021, 1828, 012005. [Google Scholar] [CrossRef]
Akiba, T.; Sano, S.; Yanase, T.; Ohta, T.; Koyama, M. Optuna: A Next-generation Hyperparameter Optimisation Framework. arXiv 2019, arXiv:1907.10902. [Google Scholar]
Myers, R.M.; Soulas, G.C. Anode Power Deposition in Applied-Field MPD Thrusters. In Proceedings of the 28th Joint Propulsion Conference and Exhibit, Nashville, TN, USA, 6–8 July 1992. [Google Scholar]
Bergstra, J.; Bardenet, R.; Bengio, Y.; Kégl, B. Algorithms for Hyper-Parameter Optimisation. In Proceedings of the Advances in Neural Information Processing Systems 24, Granada, Spain, 12–15 December 2011. [Google Scholar]
Rudin, C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 2019, 1, 206–215. [Google Scholar] [CrossRef]
Oberkampf, W.L.; Roy, C.J. Verification and Validation in Scientific Computing; Cambridge University Press: Cambridge, UK, 2010. [Google Scholar]
Roache, P.J. Verification and Validation in Computational Science and Engineering; Hermosa Publishers: Albuquerque, NM, USA, 1998. [Google Scholar]
Robert, C.P.; Casella, G. Monte Carlo Statistical Methods, 2nd ed.; Springer: New York, NY, USA, 2004. [Google Scholar]
Press, W.H.; Teukolsky, S.A.; Vetterling, W.T.; Flannery, B.P. Numerical Recipes: The Art of Scientific Computing, 3rd ed.; Cambridge University Press: Cambridge, UK, 2007. [Google Scholar]
Saltelli, A.; Ratto, M.; Andres, T.; Campolongo, F.; Cariboni, J.; Gatelli, D.; Saisana, M.; Tarantola, S. Global Sensitivity Analysis: The Primer; Wiley: London, UK, 2008. [Google Scholar] [CrossRef]
Gupta, M.; Cotter, A.; Pfeifer, M.; Voevodski, K.; Canini, T.; Mangylov, A.; Moczydlowski, W.; Van Esbroeck, A. Monotonic calibrated interpolated look-up tables. J. Mach. Learn. Res. 2016, 17, 1–47. [Google Scholar]
Saha, S.; Bonab, S.A.; Yazdani-Asrami, M. Estimation of performance parameters for direct methanol fuel cell with Fe-N-C cathodes using gradient boost models. J. Phys. Energy 2025, 8, 015012. [Google Scholar] [CrossRef]
Shekhar; Sambyo, K.; Gupta, S.K. Extra tree regressor and Tree-structured parzen estimator based machine learning model for predicting nanofluid’s Nusselt number. Eng. Res. Express 2025, 7, 015284. [Google Scholar] [CrossRef]
Ozaki, Y.; Tanigaki, Y.; Watanabe, S.; Onishi, M. Multiobjective tree-structured parzen estimator for computationally expensive optimisation problems. In Proceedings of the GECCO 2020—Proceedings of the 2020 Genetic and Evolutionary Computation Conference, Cancun, Mexico, 8–12 July 2020; pp. 533–541. [Google Scholar] [CrossRef]
Ozaki, Y.; Tanigaki, Y.; Watanabe, S.; Nomura, M.; Onishi, M. Multiobjective Tree-Structured Parzen Estimator. J. Artif. Intell. Res. 2022, 73, 1209–1250. [Google Scholar] [CrossRef]
Gilland, J.; Johnston, G. MPD Thruster Performance Analytic Models. AIP Conf. Proc. 2003, 654, 516–524. [Google Scholar] [CrossRef]
Misuri, T.; Andrenucci, M. Victor B. Tikhonov’s MHD Channel Theory: A Review. Available online: https://electricrocket.org/IEPC/IEPC-2007-319.pdf (accessed on 23 September 2025).

Figure 1. Concentric, axisymmetric representation of an AF-MPDT, including the thrust-generating forces acting through the plasma.

Figure 2. Conceptual view of more explainable artificial intelligence (mXAI), in which explainable modules are added to a black-box model so that its architecture becomes interpretable and its opacity is reduced via a Graphical User Interface (GUI).

Figure 3. Principle of operation of the SR genetic algorithm, covering all phases that lead to the hall of fame of candidate equations and to the summary plots of errors and predictions.

Figure 4. Conceptual workflow for the proposed physically interpretable and explainable thrust-equation discovery model, from AF-MPDT data acquisition for modelling (experimental reporting of data, filtering, and transformation), through model evaluation (regressor with Bayesian optimiser for hyperparameter tuning), to equation discovery and evaluation (thrust equation finalised with thrust predictions).

Figure 5. Correlation heatmap between thrust and the dataset inputs, including the developed composite-terms.

Figure 6. Workflow for the post-discovery mathematical robustness test, combining the MC envelope with a finite-difference stress test for physical validity.

Figure 7. Predicted versus actual thrust values in the dataset for all empirical models (a) and for each individual SR equation of varying complexity (b–h).

Figure 8. Shapley Additive Explanations (SHAP) analysis of composite-term influence and impact on the SR equations, providing insight into the strength of each composite-term in the data as a function of its value. The SHAP analyses are arranged in ascending order from SR_1 to SR_7, corresponding to panels (a–g).

Figure 9. Stacked absolute error distribution histogram for empirical models (a) and SR equations (b–h), representing the percentage of data points within specific error ranges.

Figure 10. Accuracy–stability trade-off and interpretability boundary dashboard plot. R² compared to (a) FracNegative_MC, (b) P99AbsDerivativeAll_MC, (c) FracExtreme_MC, and (d) StabilityScore.

Figure 11. MC envelope validity and finite-difference sensitivity summary for SR_1–7 in scaled composite-term space. Rows report envelope plausibility and stability indicators; the StabilityScore provides a compact multi-criteria ranking used for comparison, not as a physical quantity.

Table 1. Quantitative inputs filtered from the original dataset and used to construct the SR composite-terms.

Parameter Symbol	Description	Unit	Range
T_Total	Measured total thrust	N	0.001–0.981
J	Discharge current	A	10–1400
B_A	Applied magnetic field strength at the tip of the cathode	T	0.025–0.68
ṁ	Mass flow rate	mg/s	0.83–68
Ra_max	Maximum radius of the anode	mm	1.6–40
R_c	Radius of the cathode	mm	0.8–14
Ra_min	Minimum radius of the anode	mm	1–26
La	Length of the anode	mm	3–145
Rb_i	Inner solenoid radius	mm	25–111
Rbo	Outer solenoid radius	mm	62.9–156
L_ca	Length of the cathode	mm	−16 to 145
V	Discharge voltage	V	14–449

Table 2. Hyperparameter values selected by the TPE optimiser for the composite-term-bounded SR equations, with RMSE as the quantity minimised.

SR Hyperparameters	Search Space	TPE Optimal Value
n_iterations	[100–10,000]	1000
max_size	[30–800]	400
population_size	[80–220]	150
optimiser_iterations	[40–200]	100

Table 3. Performance metrics and complexity values for the empirical models found in the literature and for the developed SR equations, all evaluated on the low-thrust filtered regime.

Model	R² (%)	RMSE	MAE	MAPE (%)	Complexity
Albertoni et al.	58.39	0.0641	0.0411	60.13	19
Tikhonov et al.	67.02	0.0571	0.0368	55.15	6
Glowacki et al.	68.17	0.0560	0.0288	27.13	71
Coogan et al. (Equation (22))	84.21	0.0395	0.0247	34.59	25
SR₁ (Equation (15))	95.13	0.0219	0.0163	28.55	19
SR₂ (Equation (16))	96.05	0.0197	0.0150	29.24	22
SR₃ (Equation (17))	96.17	0.0194	0.0143	29.12	23
SR₄ (Equation (18))	96.22	0.0193	0.0146	26.55	25
SR₅ (Equation (19))	95.98	0.0199	0.0143	24.59	24
SR₆ (Equation (20))	96.72	0.0180	0.0123	20.34	63
SR₇ (Equation (21))	96.76	0.0179	0.0123	20.72	75
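
The relative error reductions against a literature benchmark can be recomputed from the tabulated values. The following is a minimal sketch, assuming that the benchmark is the Coogan et al. model (Equation (22)) and that the compared correlation is SR₅; this pairing is inferred from the numbers rather than stated in the table, and the helper name `relative_reduction` is purely illustrative.

```python
# Values copied from Table 3 (Coogan et al. row and SR_5 row).
benchmark = {"RMSE": 0.0395, "MAE": 0.0247, "MAPE": 34.59}
sr5 = {"RMSE": 0.0199, "MAE": 0.0143, "MAPE": 24.59}


def relative_reduction(reference: float, candidate: float) -> float:
    """Percentage by which `candidate` lowers an error metric relative to `reference`."""
    return 100.0 * (reference - candidate) / reference


# Relative reduction for each error metric, rounded to two decimals.
reductions = {
    name: round(relative_reduction(benchmark[name], sr5[name]), 2)
    for name in benchmark
}

for name, value in reductions.items():
    print(f"{name} reduction vs. benchmark: {value:.2f}%")
```

Under this assumed pairing, the MAPE reduction evaluates to 28.91%, which agrees with the figure quoted in the abstract.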

Table 4. Comparison of predictive accuracy and final protocol stability indicators versus symbolic-equation complexity (interpretability proxy) for SR_1–7.

Model	R² (%)	StabilityScore	$\mathrm{Frac}(\partial T/\partial v_{\alpha} > 0)_{\mathrm{MC}}$	Complexity
SR₁	95.13	9.76	0.991	19
SR₂	96.05	9.64	0.853	22
SR₃	96.17	9.44	0.838	23
SR₄	96.22	9.30	0.764	25
SR₅	95.98	9.20	0.864	24
SR₆	96.72	9.04	0.919	63
SR₇	96.76	9.01	0.921	75
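
One way to read the accuracy versus complexity trade-off in this table is through a Pareto (non-dominated) filter. The sketch below is an illustration only and not the selection procedure used in the paper: it considers just two criteria, R² (higher is better) and complexity (lower is better), and the names `Candidate`, `dominates`, and `pareto_front` are hypothetical.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str
    r2: float        # goodness of fit in percent (higher is better)
    complexity: int  # symbolic-equation complexity (lower is better)


# Values copied from Table 4.
CANDIDATES = [
    Candidate("SR_1", 95.13, 19),
    Candidate("SR_2", 96.05, 22),
    Candidate("SR_3", 96.17, 23),
    Candidate("SR_4", 96.22, 25),
    Candidate("SR_5", 95.98, 24),
    Candidate("SR_6", 96.72, 63),
    Candidate("SR_7", 96.76, 75),
]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is no worse than `b` on both criteria and strictly better on one."""
    no_worse = a.r2 >= b.r2 and a.complexity <= b.complexity
    strictly_better = a.r2 > b.r2 or a.complexity < b.complexity
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


front = [c.name for c in pareto_front(CANDIDATES)]
print(front)
```

On these two criteria alone, SR_5 is dominated by SR_2 and SR_3, while the remaining equations are non-dominated. This view deliberately ignores MAPE and the stability indicators, on which SR_5 compares differently, so it should be read as a complement to the table rather than as a ranking.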

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Rosa-Morales, M.; Ravichandran, M.; Song, W.; Yazdani-Asrami, M. Physically Interpretable and AI-Powered Applied-Field Thrust Modelling for Magnetoplasmadynamic Space Thrusters Using Symbolic Regression: Towards More Explainable Predictions. Aerospace 2026, 13, 245. https://doi.org/10.3390/aerospace13030245

