1. Introduction
Pulmonary diseases alter respiratory resistance ($R_{RS}$) and static compliance ($C_S$), impacting gas exchange and overall respiratory function [1,2,3]. Monitoring these changes is crucial for assessing a patient’s dynamic health status. Traditionally, clinicians evaluate pulmonary health through arterial blood gas (ABG) analysis [4,5,6] or by interpreting mechanical ventilation (MV) waveform data [1,3,7]. The latter is preferred due to its lower invasiveness, reduced complexity, and real-time tracking capabilities, making it well-suited for long-term condition monitoring (CM) and recovery trend analysis.
Recent studies span CT-/CXR-based ARDS prediction and models using continuous ventilator waveforms and imaging [8,9,10,11,12,13]; our simulator complements these efforts by supplying large, label-certain waveform datasets.
MV waveforms consist of proximal airway pressure, flow rate, and volume, with waveform shape determined by the ventilation mode. In fully sedated patients, common modes include pressure control (PC) and volume control (VC), with VC typically displaying decelerating (VCD) or constant flow (VCC) patterns [4,14,15,16,17,18].
Clinicians attempt to derive $R_{RS}$ and $C_S$ from these waveforms to assess disease severity, track trends, and guide treatment decisions [1,3,7].
The waveforms in Figure 1 illustrate how the shape of the dependent proximal airway pressure waveform changes in VCC mode for different health conditions. For example, the waveforms for a healthy patient (second row, first column) and an ill patient (third row, third column) appear deceptively similar. This is because both parameters act on the pressure waveform simultaneously, and their combined effect obscures the pattern changes enough to complicate diagnostics. Not only is
actually much greater for the latter, but also, their
values are at the opposite ends of the spectrum. This poses a risk of misdiagnosis and inappropriate treatment, potentially exacerbating the patient’s condition.
Therefore, it is crucial for clinicians to move away from unreliable recording methods and inaccurate approximation techniques in diagnosing and treating ventilated patients. Instead, embracing autonomous condition monitoring techniques and detailed disease progression analysis is essential.
These goals may be achievable through machine learning enhanced fault detection and isolation (FDI) strategies. However, such approaches depend on access to large, high-quality datasets, which are often not available due to various challenges such as clutter in measured data, missing relevant anthropometric data, imprecise disease parameter approximation methods, and limited data export capabilities from ventilators.
This work addresses these challenges by developing an MV-P data generator that synthesises label-certain waveforms for condition-monitoring research. It does not yet propose a bedside estimator; rather, it highlights the limitations of common estimation practices and supplies datasets for training and evaluating future estimators.
1.1. Common Methods for Estimating $R_{RS}$ and $C_S$
Two methods are common practice for deriving $R_{RS}$ and $C_S$ as health status indicators from MV waveform data [1,3,7]:
End-Inspiratory Hold Manoeuvre (EIHM): In fully sedated patients, clinicians perform an inspiratory hold at peak inspiration to pause airflow, making the measurement independent of ventilation mode. A pressure drop in the ventilation circuit is then used to estimate static compliance ($C_S$) and inspiratory resistance (the latter is used as an estimate for $R_{RS}$). However, this method requires a plateau phase, temporarily depriving the patient of ventilation, making it unsuitable for critical cases.
Ventilator-Based Estimation: The mechanical ventilator autonomously estimates $R_{RS}$ as total inspiratory resistance and $C_S$ as dynamic compliance.
1.2. Limitations of These Methods
Traditionally, ventilators estimated compliance and resistance using peak airway pressure, assuming it sufficiently captured the pressure-flow relationship. However, this approach is fundamentally flawed because peak airway pressure includes both resistive and elastic components, leading to overestimated resistance and underestimated compliance. This results in inherently inaccurate estimations of inspiratory resistance and dynamic compliance, which are already approximations of $R_{RS}$ and $C_S$, compounding errors in compliance and resistance assessments.
To overcome the limitations of EIHM, modern ventilators improve resistance and compliance estimation by sampling multiple points throughout the inspiration period and applying multivariate regression techniques such as least squares fitting [19]. Assuming a single-compartment model, this approach accounts for dynamic variations in pressure and flow, reducing estimation errors.
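As a rough illustration of the regression idea, the following sketch fits the single-compartment relation $P = P_0 + R\,Q + V/C$ to sampled inspiratory points by ordinary least squares (normal equations). It is not the algorithm of any particular ventilator; the function names, the noise-free decelerating-flow breath, and the parameter values are assumptions chosen for illustration. A decelerating flow is used because a perfectly constant flow would make the resistive term indistinguishable from the baseline offset.

```python
def solve_linear(a, b):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_single_compartment(flow, volume, pressure):
    """Least-squares fit of P = R*Q + E*V + P0; returns (R, C, P0) with C = 1/E.

    Units (assumed): flow in L/s, volume in L, pressure in cmH2O,
    so R is in cmH2O*s/L and C in L/cmH2O.
    """
    rows = [(q, v, 1.0) for q, v in zip(flow, volume)]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * p for r, p in zip(rows, pressure)) for i in range(3)]
    r_hat, e_hat, p0_hat = solve_linear(ata, atb)
    return r_hat, 1.0 / e_hat, p0_hat


def synthetic_breath(r, c, p0, q_peak=1.0, t_insp=1.0, n=101):
    """Hypothetical noise-free inspiration with linearly decelerating flow."""
    flow, volume, pressure = [], [], []
    for i in range(n):
        t = t_insp * i / (n - 1)
        q = q_peak * (1.0 - t / t_insp)
        v = q_peak * (t - t * t / (2.0 * t_insp))  # integral of the flow ramp
        flow.append(q)
        volume.append(v)
        pressure.append(p0 + r * q + v / c)
    return flow, volume, pressure


if __name__ == "__main__":
    q, v, p = synthetic_breath(r=10.0, c=0.05, p0=5.0)
    r_hat, c_hat, p0_hat = fit_single_compartment(q, v, p)
    print(f"R = {r_hat:.2f} cmH2O*s/L, C = {c_hat * 1000:.1f} mL/cmH2O, P0 = {p0_hat:.2f} cmH2O")
```

Note that such a fit returns whatever resistance and compliance the measured pressure reflects, so circuit and tube contributions remain lumped into the estimates.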
However, despite these improvements, ventilator-derived resistance and compliance estimates remain incomplete as they do not distinguish ventilator circuit resistance from true patient respiratory resistance. Specifically:
Inspiratory resistance includes ventilator circuit resistance (e.g., tubing, humidifiers, and endotracheal tubes) in addition to airway resistance.
Dynamic compliance is influenced by circuit elasticity, valve response dynamics, and compressible gas volume, making it difficult to isolate true lung compliance.
While inspiratory resistance and dynamic compliance serve as practical estimates for $R_{RS}$ and $C_S$, they do not fully represent patient-specific respiratory mechanics. Separating ventilator-induced distortions from patient-dependent properties is essential to improve diagnostic accuracy. Some commercial ventilators (e.g., Dräger and Medtronic) incorporate ETT/compliance corrections or flow-dependent resistance terms, but these remain device-specific parameter estimates rather than direct measurements and can vary in practice. Machine learning offers a potential alternative by learning to differentiate these components, but such models require large, high-quality training datasets that are not yet widely available. Accordingly, we prioritise a transparent RC baseline and provide a label-certain generator to support future data-driven approaches that separate device/circuit from patient mechanics.
1.3. Challenges in Clinical Implementation
In intubated patients, the endotracheal tube significantly affects inspiratory resistance, sometimes contributing nearly half of its value while bypassing a substantial portion of natural tracheal resistance [20]. This complicates the accurate assessment of patient respiratory resistance in clinical settings, as ventilator-derived estimates fail to fully distinguish between patient-specific airway resistance and artificial resistance from the breathing circuit.
Consequently, clinicians rely on estimated values, even though most literature bases diagnosis on accurately determined $R_{RS}$ and $C_S$. Regardless, both traditional estimation methods (EIHM and multivariate regression) require manual trend tracking, which can be impractical in high-demand environments such as large-scale respiratory outbreaks (e.g., COVID-19). Under these conditions, limited resources, overwhelmed clinicians, and excessive data complexity may delay early detection of patient deterioration, leading to suboptimal patient prioritisation and increased mortality risk.
1.4. Automated Monitoring Through Fault Detection and Isolation (FDI)
In industrial settings, similar challenges are addressed using automated diagnostic techniques, commonly referred to as condition monitoring or fault detection and isolation (FDI). These techniques fall into three main categories [21,22,23,24,25,26]:
Model-based FDI, which compares real-time system behaviour to a known analytical model to detect deviations from expected performance.
Data-driven FDI, which identifies abnormalities by comparing real-time data to historical patterns using machine learning techniques.
Hybrid FDI, which combines both approaches to enhance robustness and adaptability.
While the latter two are widely used in industry, their application in respiratory care remains limited due to the lack of real-world datasets, resulting from three key challenges:
Patient-Specific Variability: Unlike industrial systems, each patient has unique anthropometric, biological, and disease-related parameters, many of which require invasive procedures or approximations for accurate characterisation, making large-scale data collection impractical. Moreover, this variability renders model-based approaches unsuitable, as it makes validation impractical.
Ethical and Safety Constraints: Unlike machinery, where failure cases can be experimentally induced, it is ethically unacceptable to deliberately worsen a patient’s condition or wait for deterioration solely for the purpose of data collection.
Limited Dataset Coverage and Confounding: Existing datasets [27,28,29,30,31,32,33,34,35] are too small and homogeneous to capture the diversity of patient populations. This forces researchers to apply complex statistical methods to disentangle confounding factors, yet the absence of large, independent cohorts makes those corrections impossible to validate, ultimately undermining data-driven models.
1.5. The Role of Simulations in Overcoming Data Limitations
Since real-world clinical data is limited, developing FDI techniques requires an alternative data source. One potential solution is to generate synthetic high-fidelity datasets using simulation models that accurately represent real-world respiratory mechanics. These datasets could be analysed using data-driven FDI techniques to extract patterns associated with disease progression, enabling early-warning automation in clinical settings.
However, existing lung simulation models [29,36,37,38,39,40,41,42,43,44,45,46,47] offer limited parameter adjustment capabilities, making them insufficient for FDI applications requiring adaptable patient, ventilator, and disease conditions.
Thus, there is a need for a fully parametric simulation model capable of dynamically adjusting patient-, ventilation-, and disease-specific parameters. Such a model would enable the generation of high-fidelity synthetic datasets, providing the necessary data for machine learning-based solutions to distinguish ventilator-induced effects from patient-specific mechanics—an essential step toward improving diagnostic accuracy and guiding clinical decisions. This simulation model is the primary contribution of this work.
2. Materials
To develop a mechanical ventilator-patient (MV-P) model for synthetic data generation, initial simplifications are necessary due to the system’s complexity. For a fully sedated, supine patient, an accurate MV-P model requires representing a passive respiratory system interfaced through an endotracheal tube (ETT) with a ventilator capable of simulating relevant modes, as depicted in Figure 2 [4,48].
In this section, the equations of motion are combined with selected anthropometric data (e.g., patient height, weight, and sex) to initialise tracheal dimensions and expected lung volumes such as FRC. This approach follows common clinical heuristics for estimating baseline ventilatory parameters, while recognising that ICU patients often deviate from these norms due to illness, injury, or compounded genetic variation. To accommodate such variability, disease-state parameters (e.g., $R_{RS}$, $C_S$) are treated as independently adjustable, and parameter sweeps are performed across ranges extending below typical clinical minima and above expected maxima. For clarity and identifiability, these parameters are held constant within a breath, even though they are known to vary with pressure and flow during inspiration and expiration; controlled nonlinear extensions will be introduced in future iterations. This provides a balance between specificity and generalisability, consistent with the aim of generating a widely applicable, high-fidelity dataset for algorithm development rather than reproducing disease-specific physiology in detail.
2.1. The Passive Respiratory System
Modelling the respiratory system requires incorporating resistive ($R$ in cmH2O · s/L), elastic ($E$ in cmH2O/mL) or compliance ($C$ in mL/cmH2O), and inertial ($I$ in cmH2O · s²/L) components [14,20,49] to represent various subcomponents of the respiratory tract [1,2,3].
While numerous respiratory models exist [15,20,49,50,51], this study does not propose a novel formulation but justifies the selection of a well-studied, clinically relevant model.
Single-compartment RC and RIC models dominate clinical practice, as they effectively represent the global mechanical properties clinicians rely on for bedside diagnostics. In contrast, multi-compartment and viscoelastic models offer greater theoretical detail, particularly for regional lung heterogeneity. While parameter sets for such models have been reported in the literature, routine bedside identification remains challenging, as most effects are not directly observable without invasive methods and estimates can vary across devices and patient groups. With well-established transient corrections, EIHM can also recover tissue viscoelastic parameters in suitable cohorts (see Section 1.1); however, such protocols are not universally applicable at the bedside and remain device- and setting-dependent. We therefore adopt the single-compartment RC baseline as a pragmatic compromise, balancing representative complexity with identifiability, while leaving richer models as extensions for future work.
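The RC and RIC models adhere to the equations of motion (1) and (2), respectively [15,20,50]. The airway pressure $P_{aw}(t)$ is the sum of the baseline pressure $P_0$ and the component-specific pressures, where $V(t)$ represents the delivered volume per breath:
$$P_{aw}(t) = P_0 + R\,Q(t) + \frac{V(t)}{C} \qquad (1)$$
$$P_{aw}(t) = P_0 + R\,Q(t) + \frac{V(t)}{C} + I\,\frac{dQ(t)}{dt} \qquad (2)$$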
Given typical ICU respiratory rates (<30 bpm), inertance is deliberately neglected in the baseline RC formulation; a switchable RIC variant (with modest inertance $I$) will be offered as a configurable option in future iterations. We also treat compliance as lumped and quasi-static (no regional heterogeneity) to keep labels unambiguous. Non-linear or hysteretic extensions are deferred to future work (noting that machine-driven inspiration with passive expiration already dominates observed hysteresis in mandatory modes).
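To make the single-compartment relation concrete, the sketch below evaluates the airway pressure for a constant-flow inspiration as the sum of baseline, resistive, and elastic terms, with an optional inertial term for the RIC variant. The function and parameter names, and the example values, are illustrative assumptions rather than part of the Simulink model.

```python
def airway_pressure(flow, volume, r, c, p0, inertance=0.0, dflow_dt=0.0):
    """Single-compartment equation of motion.

    Assumed units: flow in L/s, volume in L (above end-expiratory volume),
    r in cmH2O*s/L, c in L/cmH2O, p0 in cmH2O, inertance in cmH2O*s^2/L.
    With inertance = 0 this is the RC model; otherwise the RIC model.
    """
    return p0 + r * flow + volume / c + inertance * dflow_dt


def constant_flow_inspiration(r, c, p0, flow, t_insp, dt=0.01):
    """Return (time, airway pressure) samples for a square-flow inspiration."""
    n = int(round(t_insp / dt))
    samples = []
    for i in range(n + 1):
        t = i * dt
        volume = flow * t  # constant flow integrates to a linear volume ramp
        samples.append((t, airway_pressure(flow, volume, r, c, p0)))
    return samples


if __name__ == "__main__":
    # Hypothetical settings: R = 10 cmH2O*s/L, C = 50 mL/cmH2O, baseline 5 cmH2O
    for t, p in constant_flow_inspiration(10.0, 0.05, 5.0, flow=0.5, t_insp=1.0)[::25]:
        print(f"t = {t:.2f} s, P_aw = {p:.2f} cmH2O")
```

In this form the resistive term produces the initial pressure step and the elastic term the subsequent ramp, which is why both parameters shape the same waveform.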
Therefore, this iteration of the work adopts the RC model, striking a balance between representative complexity and practical measurability. The RC model incorporates only respiratory-mechanics parameters and thus, indirectly, disease parameters, which are a crucial part of the scoped objectives of the MV-P model. However, a comprehensive MV-P model still requires further integration of patient-specific anthropometric data, ventilator circuit dynamics, and operational features, which are developed in the sections that follow.
2.2. Patient Respiratory System Dimensions
A patient’s respiratory tract varies with age and is primarily determined by anthropometric parameters, notably biological sex and height (in cm) [5,52]. The respiratory components whose dimensions significantly depend on these parameters include trachea length and diameter, ideal body weight (IBW, in kg) [6,51], suggested tidal volume ($V_T$, in mL) [4,14,16,51,53], dead space (in mL) [6,51] and functional residual capacity (FRC, in mL) [6]. The available heuristics for these dimensions are given in (3)–(7). Tracheal dimensions are not explicitly defined in the literature and are thus modelled in Section 3.1.
A fixed dead-space fraction (30% of the tidal volume) is used for simplicity; a three-component dynamic model (anatomical + alveolar + apparatus) is a planned extension. This simplification is acceptable because the aim here is to evaluate the suitability of generating label-certain waveform data for further data analytics research.
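The kind of anthropometric initialisation described above can be sketched as follows. Equations (3)–(7) are not reproduced here, so this example substitutes assumptions: the widely used height-based ideal body weight formula (50 kg or 45.5 kg plus 0.91 kg per cm above 152.4 cm), a hypothetical default of 6 mL/kg for the suggested tidal volume, and the fixed 30% dead-space fraction. The function names are illustrative only.

```python
def ideal_body_weight_kg(height_cm, sex):
    """Height-based ideal body weight (assumed formula, not necessarily Equation (3))."""
    base = {"male": 50.0, "female": 45.5}[sex]
    return base + 0.91 * (height_cm - 152.4)


def ventilation_targets(height_cm, sex, vt_ml_per_kg=6.0, dead_space_fraction=0.30):
    """Derive a suggested tidal volume and a fixed-fraction dead space.

    vt_ml_per_kg is a hypothetical default; dead_space_fraction mirrors
    the fixed 30% simplification.
    """
    ibw = ideal_body_weight_kg(height_cm, sex)
    vt_ml = vt_ml_per_kg * ibw
    return {
        "ibw_kg": ibw,
        "vt_ml": vt_ml,
        "dead_space_ml": dead_space_fraction * vt_ml,
    }


if __name__ == "__main__":
    for sex in ("male", "female"):
        targets = ventilation_targets(175.0, sex)
        print(sex, {k: round(v, 1) for k, v in targets.items()})
```

Keeping these heuristics in one place makes it straightforward to sweep height and sex when generating a synthetic cohort.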
2.3. Effects of Pulmonary Diseases on the Respiratory System
Pulmonary diseases such as acute respiratory distress syndrome (ARDS), chronic obstructive pulmonary disease (COPD), asthma, and pulmonary fibrosis alter the mechanical properties of the respiratory system, reducing its efficiency by affecting resistance ($R_{RS}$) and compliance ($C_S$). Tracking these values over time enables clinicians to assess a patient’s health status and recovery progression using anthropometric data [1,3,7].
Diseases that induce inflammation, mucus production, and airway narrowing increase $R_{RS}$ from a normal 1 cmH2O · s/L to over 18 cmH2O · s/L [14,15,29,51,54,55,56].
Pulmonary diseases affecting lung tissue may decrease $C_S$ from a normal 50 mL/cmH2O to 10 mL/cmH2O or increase it to 100 mL/cmH2O [14,15,29,51,54,55,56].
2.4. Endotracheal Tube
Endotracheal tube (ETT) selection follows a standardised roster of 17 distinct sizes, as detailed in Table 1 [57,58]. For patients younger than 16 years, the appropriate inner diameter is estimated using Equation (8) [6,59,60,61]. For older patients, clinicians determine ETT size based on anthropometric data and clinical judgment, typically selecting diameters of at least 7.0 mm [59,62].
2.5. Mechanical Ventilation Modes
In a passive respiratory system, where voluntary muscle contractions are absent, ventilation modes intended for active patients are excluded [4,54], such as:
Modes requiring patient-initiated breaths (e.g., continuous/bilevel positive airway pressure (CPAP/BiPAP)) [63]
Modes providing assistance on demand (assist control (AC) or continuous mandatory ventilation (CMV)) [14,50]
Weaning modes (synchronous intermittent mandatory ventilation (SIMV) and pressure support ventilation (PSV)) [14,64]
Secondary and unconventional modes requiring patient wakefulness [4,65]
This exclusion criterion focuses this study on control modes, which deliver mandatory breaths at a preset respiratory rate as prescribed by pulmonologists [63,66].
In control modes, phase variables define the shape of the respective independent waveform, which dictates how air is delivered to the patient [15,63,67,68,69]. The resulting dependent waveform, influenced by both the ventilator circuit and the patient’s respiratory mechanics, is recorded and encapsulates information about the MV-P system [14,67].
The two primary categories of control modes are [4,14,15,16,17,18]:
Since tidal volume is fully determined as the integral of the flow waveform, it is considered redundant. All diagnostically relevant information describing the MV-P system is embedded in the pressure and flow waveforms.
Volume-controlled modes are further sub-classified by flow rate pattern [4,15]:
Figure 3 illustrates the pressure, flow, and volume waveforms of the VCC, VCD and PC modes. The phase variables shaping these independent waveforms include baseline pressure (or positive end-expiratory pressure, PEEP), peak inspiratory flow rate, tidal volume, peak inspiratory pressure, and inspiratory time.
2.6. Determining Patient’s Health Status from Waveforms
Mechanical ventilation waveforms allow clinicians to assess a ventilated patient’s pulmonary health status by estimating static compliance ($C_S$) and respiratory system resistance ($R_{RS}$). By analysing deviations of $C_S$ and $R_{RS}$ from expected healthy values, pulmonologists quantify disease severity and classify pulmonary conditions [1,3,7].
Clinicians can determine $C_S$ manually while approximating $R_{RS}$ using inspiratory resistance. Using the equations of motion (9) and (10), clinicians can measure static elastance during an end-inspiratory hold manoeuvre (EIHM) [68]. Since airflow ceases at this point, compliance is purely static, making $C_S$ the reciprocal of the static elastance [6,15,51]. The static compliance is inversely proportional to the driving pressure (plateau pressure minus baseline pressure), as described by (11) and illustrated in Figure 4 [50]. Explicit assumptions are that
with initial conditions
and
, and
representing the baseline end-expiratory pressure.
For respiratory rates below 30 breaths per minute, inertance ($I$) is typically considered negligible [20,54,70]. Under this assumption, the resistive pressure component can be approximated by the pressure drop component, resulting in inspiratory resistance [1,3,7]. As described by (12) and Figure 5, inspiratory resistance is directly proportional to the pressure difference between peak inspiratory pressure and plateau pressure and inversely proportional to the change in flow ($Q$ in L/s):
Alternatively, ventilators can approximate these parameters automatically:
Much like $C_S$, dynamic compliance is calculated as the ratio of volume change to pressure difference (13). However, since airflow introduces an additional resistive pressure component between peak inspiratory pressure and plateau pressure, dynamic compliance differs from static compliance. As shown in (13) and Figure 4, dynamic compliance is computed over the entire pressure range (baseline pressure to peak inspiratory pressure), while $C_S$ is based on the driving pressure (baseline pressure to plateau pressure):
Since dynamic compliance depends on the flow rate pattern, different ventilation modes yield varying dynamic compliance values [1,3,7]. In contrast, $C_S$ remains stable and is therefore preferred for assessing patient condition.
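The bedside calculations described in this subsection can be summarised in a short sketch. The formulas follow the verbal definitions above (volume change over the relevant pressure difference for compliance, peak-to-plateau pressure drop over flow for resistance); the function names, argument names, and example readings are assumptions, not the paper's Equations (11)–(13) verbatim.

```python
def static_compliance(vt_ml, p_plateau, p_baseline):
    """C_S in mL/cmH2O: tidal volume over driving pressure (plateau minus baseline)."""
    return vt_ml / (p_plateau - p_baseline)


def inspiratory_resistance(p_peak, p_plateau, flow_l_s):
    """Inspiratory resistance in cmH2O*s/L: peak-to-plateau drop over the flow change."""
    return (p_peak - p_plateau) / flow_l_s


def dynamic_compliance(vt_ml, p_peak, p_baseline):
    """Dynamic compliance in mL/cmH2O: tidal volume over the full pressure range."""
    return vt_ml / (p_peak - p_baseline)


if __name__ == "__main__":
    # Hypothetical EIHM readings (pressures in cmH2O, flow in L/s, volume in mL)
    p_peak, p_plateau, p_baseline = 20.0, 15.0, 5.0
    vt_ml, flow = 500.0, 0.5
    print("C_S   =", static_compliance(vt_ml, p_plateau, p_baseline))
    print("R     =", inspiratory_resistance(p_peak, p_plateau, flow))
    print("C_dyn =", round(dynamic_compliance(vt_ml, p_peak, p_baseline), 2))
```

Because the peak pressure always exceeds the plateau pressure while gas is flowing, the dynamic value is lower than the static one for the same breath.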
2.7. Evaluating Existing Simulation Models
A literature survey identified several MV-P simulation models, including a MATLAB® Simulink® implementation (R2023b). While many of these models contain valuable features (e.g., advanced circuit dynamics, viscoelastic or nonlinear mechanics), none fully aligns with the specific set of requirements we defined for our first iteration of a scalable, label-certain dataset generator. These requirements emphasise patient parameterisation, unified ventilation modes, and verification hooks, reflecting our focus on condition monitoring research rather than detailed physiological modelling.
- (1) Anthropometric Parameters
  - (1.1) Adjust tracheal dimensions, dead space, FRC, and ETT size based on the patient’s biological sex, height, and age, in alignment with clinical practice.
- (2) Patient-Ventilator Interface
  - (2.1) Implement an ETT with an interfacing Y-piece, which assumes a fully sedated patient.
- (3) Ventilation Parameters
  - (3.1) Allow baseline pressure (or PEEP) adjustment.
  - (3.2) Support both VC and PC ventilation modes.
  - (3.3) Enable VC mode to simulate constant (VCC) and decelerating (VCD) flow.
  - (3.4) Set peak inspiratory flow and tidal volume in VC mode.
  - (3.5) Set peak inspiratory pressure and inspiratory time in PC mode.
- (4) Complication Parameters
  - (4.1) Allow variable respiratory resistance ($R_{RS}$) and static compliance ($C_S$) to simulate disease severity fluctuations.
  - (4.2) Implement a means to simulate patient interactions with the ventilation circuit (e.g., condensation, temperature variations, or ETT biting).
- (5) Verification Methods
  - (5.1) Enable EIHM simulation for model verification.
Table 2 qualitatively assesses how existing MV-P models adhere to the listed requirements, indicating which aspects are fully, partially, or not at all implemented. Notably, no existing model dynamically integrates anthropometric characteristics, adapts resistance and compliance in real time, or unifies all major ventilation modes within a single framework. These limitations restrict their applicability for generating high fidelity datasets to enable machine learning-based diagnostics, emphasising the need for a more comprehensive, adaptable solution. Partially met requirements, denoted by black bars, further highlight areas requiring enhancement to achieve full compliance.
None of these models fully satisfies the criteria outlined in Section 2.7. Most notably, they do not dynamically integrate patient-specific anthropometric data, simulate progressive disease states, model patient-tube interactions, or facilitate EIHM execution. This hinders the development of automated diagnostic techniques that require high-fidelity datasets.
Among prior works, the study by [43] incorporates an ETT but only as a static resistance value, lacking the ability to adjust for flow variations or ETT size. Similarly, most models are limited to a single ventilation mode and require complex recalculations to modify phase variables, making them impractical for scalable dataset generation. However, nearly all allow for
adjustment. The work in [39] includes multiple ventilation modes but relies on separate models rather than a unified framework.
Given these constraints, the MathWorks® model [36] was selected as the point of departure for this work. While it lacks several key features, it offers a well-structured framework with strong representational complexity and modular adaptability. The model already supports advanced thermal and aerodynamic modelling, including laminar and turbulent flow, temperature, humidity, and condensation analysis within the circuit. Although these capabilities extend beyond the scope of this study, they provide a robust foundation for future research, such as condition monitoring algorithms capable of detecting condensation artefacts that contribute to patient-ventilator asynchrony (PVA).
To address the limitations identified in Table 2, this study extends the MathWorks® framework by integrating patient-specific parameterisation, enabling real-time resistance and compliance adaptation (and therefore disease progression control), and unifying all major ventilation modes into a single, adaptable simulation environment. These modifications establish the first MV-P simulation model capable of dynamically adjusting to patient-specific characteristics while preserving clinical relevance. The next section details the implementation of these enhancements.
3. Methods
This section outlines the development of a fully parametric mechanical ventilator–patient (MV-P) simulation model for generating synthetic high-fidelity datasets. The model addresses a critical need for a simulation tool that can dynamically adjust patient-specific, ventilation-specific, and disease-specific parameters. Such adaptability enables the creation of datasets suitable for training machine learning models capable of distinguishing ventilator-induced effects from true patient-specific respiratory mechanics—an essential step toward improving diagnostic accuracy.
A MATLAB® Simulink® (R2023b) model [36] (diagram in Figure 6) was selected as the foundational framework due to its robust structure, advanced fluid dynamics capabilities, and compatibility with modular enhancements, despite lacking several key features required for condition monitoring applications. It should be noted that this is the original model from [36], not the extended model being presented as the main contribution of this work.
The control signal block initiates waveform modulation to drive the volumetric airflow. The air passes through a humidifier before entering the ventilator circuit via the inspiratory tube. Within the circuit, pressure differentials govern the direction of flow through check valves, determining the path of air. It flows to and from the patient, then exits through the expiratory tube into the ambient air sink.
The following subsections describe how this model was extended to meet the full set of requirements defined in Section 2.7.
3.1. Trachea Dimensions
As noted in Section 2.2, many respiratory tract parameters, such as dead space, ideal body weight, and functional residual capacity, can be derived from anthropometric characteristics. However, no established heuristic exists in the literature for estimating trachea length or trachea diameter as a function of patient height.
To address this hurdle, new linear models were developed (14)–(17) by synthesising anatomical measurements from multiple published sources [71,72,73,74,75], assuming direct proportionality between patient height and tracheal dimensions.
Given the linear anthropometric scaling, physiological extremes could be bounded by a sigmoid clamp in future implementations, with no impact on the results reported here. Proximal sensor flow resistance is neglected in the baseline model; an optional fixed series term (default 0 cmH2O · s/L) can be included in future iterations for sensitivity analyses.
3.2. Simulation Model Enhancement
Figure 7 illustrates the enhanced MV-P model. Due to space limitations, not all internal subsystem modifications are shown but are available upon request.
The MV-P simulation model was implemented in the MATLAB® Simulink® environment to represent a fully sedated, supine patient with respiratory rates below 30 breaths per minute, ventilated through an endotracheal tube (ETT). Building on [36], this implementation introduces the first simulation framework to satisfy the full set of design requirements outlined in Section 2.7. The enhancements described below are organised according to the specified requirement categories and are numerically linked to the corresponding component blocks shown in Figure 7 through which they were achieved:
- (1) Anthropometric Parameters
  - (1.1) Equations (3)–(8) and (14)–(17) were implemented to automatically adjust tracheal dimensions, dead space, functional residual capacity (FRC), and ETT size based on patient sex, height, and age.
- (2) Patient-Ventilator Interface
  - (2.1) The original constant volume chamber (representing a face mask) was replaced with a Y-piece junction and an ETT pipe component, introducing more realistic fluid dynamics.
- (3) Ventilation Parameters
  - (3.1) PEEP control was achieved by offsetting the outlet reservoir pressure in the “Room Air Sink” block by the specified PEEP value.
  - (3.2) The mechanical ventilator control signal (MVCS) block was redesigned to flexibly modulate both flow and pressure, enabling simulation of VC and PC modes within a unified framework.
  - (3.3) The MVCS was further enhanced to support both constant and decelerating flow patterns for VC mode.
  - (3.4) A graphical user interface (GUI) was added for VC mode, allowing user-defined peak inspiratory flow and tidal volume inputs to automatically shape the flow waveform for all the defined combinations (see Section 4.2).
  - (3.5) A similar GUI was implemented for PC mode, generating pressure waveforms based on specified peak inspiratory pressure and inspiratory time values.
- (4) Complication Parameters
  - (4.1) Translational spring and damper components (representing $C_S$ and $R_{RS}$, respectively) were replaced with variable equivalents, allowing real-time adaptation to simulate disease progression. Although $R_{RS}$ and $C_S$ are implemented as variable components, their values remain piecewise constant over each breath in this initial model, introducing a discontinuous switching structure in the underlying RC model. However, Simulink’s continuous-time solvers still enforce continuity of the dynamic waveforms (pressure, flow, and volume) at each switching instant.
  - (4.2) A time-varying local restriction was inserted between the ETT and Y-piece to simulate ETT biting, modulated by a user-defined signal.
- (5) Verification Methods
  - (5.1) The MVCS block was configured to fully halt airflow at end-inspiration, enabling accurate simulation of the end-inspiratory hold manoeuvre (EIHM). This feature supports model verification by enabling comparison of MV-P compliance and resistance settings with traditional clinical estimates.
All simulations were carried out in MATLAB® Simulink® (R2023b) using a variable-step solver with automatic solver selection (typically a stiff solver such as ode15s for Simscape models). The relative tolerance was set to , with the absolute tolerance automatically scaled. The Simscape Solver Configuration employed the time-based formulation with derivative replacement for index reduction and a consistency tolerance of , with 1 ms filtering enabled for 1-D/3-D connections.
The volumetric supply block (see Figure 7) closely matches its generated flow output to the incoming flow signal from the MVCS block.
Figure 8 clarifies the MVCS’s internal ventilation mode switching logic. Figure 8a shows a switching circuit for selecting the control mode to be implemented: VC or PC. For VC mode, another switching circuit further specifies the flow pattern: constant or decelerating (see Figure 8b). Once VC mode with constant flow, VC mode with decelerating flow, or PC mode has been chosen, the operating mode of the ventilator is fully specified; Figure 9 and Figure 10 then explain the mode-specific details for shaping the MVCS’s output flow signal.
Figure 9 illustrates the schematic of a ventilator control subsystem responsible for simulating VC mode. The flow pattern input selects between two cases: Case 1 generates the constant (square) flow waveform, and Case 2 produces the decelerating flow waveform. Each case includes separate logic to compute the corresponding flow signal and actuator resistance (expiratory valve), based on user-defined settings such as
,
, and tube diameter. Each subsystem comprises an inspiratory flow branch and an expiratory valve control branch. The former comprises a normalised inspiratory rectangular wave, a scaling stage, and an enabling stage. The latter subtracts the inspiratory, the EIHM, and an additional short expiratory rectangular wave (mimicking mechanical delay; the default is 0 s) from a unity signal. The resulting signal controls the expiratory valve so that air can passively exit the lungs. In
Figure 9b,
represents a compensation factor of 2, which arises from the area formulas of a triangle and a rectangle when integrating flow to obtain volume.
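The origin of the factor of 2 and the construction of the expiratory valve signal can be sketched in a few lines of Python. This is an illustrative reconstruction, not the Simulink subsystem: it assumes that the decelerating pattern is a linear ramp reaching zero flow at end-inspiration, and that the inspiratory, hold, and delay windows are consecutive and non-overlapping. All function names are hypothetical.

```python
def vc_flow(t, vt, ti, pattern):
    """Inspiratory flow (L/s) at time t (s) after the start of inspiration."""
    if not 0.0 <= t < ti:
        return 0.0
    mean_flow = vt / ti
    if pattern == "constant":
        # Rectangle: area = flow * ti, so flow = vt / ti.
        return mean_flow
    if pattern == "decelerating":
        # Triangle: area = 0.5 * peak * ti, so peak = 2 * vt / ti.
        return 2.0 * mean_flow * (1.0 - t / ti)
    raise ValueError(f"unknown flow pattern: {pattern}")


def expiratory_valve_signal(t, ti, t_hold=0.0, t_delay=0.0):
    """Unity signal minus the inspiratory, hold, and delay windows.

    Returns 0.0 while the valve must stay closed and 1.0 once it may open.
    """
    closed_until = ti + t_hold + t_delay
    return 0.0 if 0.0 <= t < closed_until else 1.0


def delivered_volume(vt, ti, pattern, dt=1e-4):
    """Integrate the flow over inspiration with the midpoint rule."""
    n = round(ti / dt)
    return sum(vc_flow((k + 0.5) * dt, vt, ti, pattern) for k in range(n)) * dt


if __name__ == "__main__":
    for pattern in ("constant", "decelerating"):
        volume = delivered_volume(0.5, 1.0, pattern)
        print(f"{pattern}: delivered volume = {volume:.4f} L")
```

Both patterns deliver the same tidal volume only because the decelerating peak flow is twice the constant flow, which is the role of the compensation factor.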
Figure 10 illustrates the internal logic of the MVCS subsystem for simulating PC mode. A PID controller adjusts the flow rate to match the desired pressure profile by comparing the instantaneous lung pressure (
or
) to the target
. Unlike traditional fixed-gain controllers, this implementation uses gain functions derived from ventilator settings to dynamically adapt control behaviour.
Section 3.3 discusses this in extensive detail.
3.3. Developing Pressure Controlled Ventilation Implementation
Since the volumetric supply controls airflow, extending the model to support PC mode requires a closed-loop solution to regulate airway pressure. The controller design presented here is not intended to reproduce device-specific clinical servo settings; rather, it provides a transparent, adjustable simulation model that ensures stable and reproducible waveforms for simulation and dataset generation.
The initial approach was to fine-tune a PI controller for the healthy (NORM) state ( cmH2O · s/L, mL/cmH2O · s, cmH2O and cmH2O).
This tuned PI controller was applied to the edge case combinations listed in
Table 3.
However, as seen in
Figure 11, fine-tuning a PI controller for the healthy state does not yield the same dynamic characteristics in other states (percentage overshoot (
) increased from 8.6 % to 26.0 %, and settling time (
) from 134 ms to 400 ms).
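The two headline metrics quoted above can be extracted from any sampled pressure response with a short routine. The sketch below is a generic illustration, not the evaluation code used for Figure 11: it assumes a rising step, measures overshoot relative to the step size, and defines settling time as the instant after which the response stays within a 2% band around the set-point. The synthetic second-order response is only a stand-in for a simulated waveform.

```python
import math


def step_metrics(t, y, y_start, y_set, band=0.02):
    """Return (percent overshoot, settling time) of a sampled rising step.

    Settling time is None if the response is still outside the band at the
    final sample.
    """
    step = y_set - y_start
    overshoot = max(0.0, (max(y) - y_set) / step * 100.0)
    tol = band * abs(step)
    last_out = -1                      # index of the last sample outside the band
    for i, yi in enumerate(y):
        if abs(yi - y_set) > tol:
            last_out = i
    if last_out == len(y) - 1:
        return overshoot, None
    return overshoot, t[last_out + 1] - t[0]


def second_order_step(t, y_start, y_set, zeta, wn):
    """Textbook underdamped second-order step response (stand-in waveform)."""
    wd = wn * math.sqrt(1.0 - zeta ** 2)
    decay = math.exp(-zeta * wn * t)
    shape = math.cos(wd * t) + zeta / math.sqrt(1.0 - zeta ** 2) * math.sin(wd * t)
    return y_start + (y_set - y_start) * (1.0 - decay * shape)


if __name__ == "__main__":
    ts = [k * 0.0005 for k in range(2001)]          # 1 s sampled at 2 kHz
    ys = [second_order_step(x, 5.0, 25.0, 0.5, 30.0) for x in ts]
    os_pct, t_settle = step_metrics(ts, ys, 5.0, 25.0)
    print(f"overshoot = {os_pct:.1f} %, settling time = {t_settle * 1000:.0f} ms")
```

Applying the same routine to every health state gives a like-for-like comparison of how far a controller tuned for one state degrades in the others.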
Minimising discrepancies across the health status range is challenging, requiring a balance of dynamic characteristics like overshoot (), undershoot (), settling time (), and transient shape. To address this, a method was developed to adjust PI coefficient pairs based on and , the key ventilator settings in PC mode.
3.3.1. Method for Determining PI Coefficient Pair Functions
For the plant’s normal state, simulate a scenario with maximum initial error (process variable and set-point are most divergent), and progressively increase by orders of magnitude until oscillation occurs with about 0%.
Assess the plant’s sensitivity and behaviour by incrementally adjusting one condition parameter at a time, documenting the parameter configurations that lead to extreme dynamic responses; highest , lowest , and fastest and slowest .
Using the plant conditions that result in the highest and lowest , determine a corresponding value to balance these extremes.
From here on, investigate the effects of incrementally increasing and decreasing , and for each adjustment, determine the corresponding balancing . Record how these changes affect the dynamic characteristics, specifically the range from the lowest to the highest , the variations in the slowest , and the overall changes in response shape. Based on these results, determine the feasible performance for the controlled plant and establish criteria for balancing trade-offs between shoot range, , and shape. Record the most desired pair.
Repeat the process with smaller initial errors by either increasing or decreasing , identifying the optimal pairs that meet the established criteria.
Plot the balanced and coefficients separately over a plane (both and ) or a range ( or , based on the dominant sensitivity). Then model their best-fit functions.
3.3.2. Demonstration of Determining PI Coefficient Pair Functions
The controller is intentionally a simple PI scheme calibrated to an 8.0 mm ETT to prioritise repeatable labels. Future extension of ETT calibration for 7.0 mm could cover around 95% of the adult patient population based on feedback from clinicians. Formal stability margins, anti-windup safeguards, and diameter-adaptive gains are out of scope here and earmarked for an adaptive PID upgrade.
- (1)
First, vary
in a Newtonian fashion from
to
to determine, for the NORM state, at which
value is
equal to 0 % (see
Figure 12a).
- (2)
Since
and
are inaccessible parameters prior to ventilation, the sensitivity of sweeping them across the status range will only provide a sense for what performance is possible whilst balancing the dynamic characteristics. However, the health status of the patient does not affect the controlled pressure waveform significantly. An increase in
causes an increase in
and an upward shift in both
and
, whilst decreasing the range between the two. An increase in
causes an increase in
and a downward shift in both
and
, whilst increasing the range between the two. Taken together, these effects imply that some of the health status scenarios listed in
Table 3 are synonymous with the extremes of the dynamic characteristics (see
Figure 12b):
- (a)
HRLC always indicates the most extreme
- (b)
LRHC always indicates the most extreme
- (c)
HRHC always indicates the most extreme
- (d)
LRLC always indicates the fastest
- (3)
After conducting a sweep of
(refer to
Figure 13), a notable observation is that the
pair that achieves identical
and
in the worst-case scenarios (HRLC and LRHC, respectively) leads to the quickest stabilisation across all health status scenarios.
- (4)
Decreasing
and balancing
and
with an appropriate
results in a decrease in
and
. However, this lengthens
and worsens the shape of the transient phase (especially for LRLC, causing such a long
that it becomes the new health status synonymous with the longest
). Given this trade-off, it is pragmatic to set the requirements for the desired
to be 15% and the desired
to be less than 0.4 s at 2% steady-state error (see
Figure 14 for best balanced case with
and
).
- (5)
After finding the first pair for balanced dynamic characteristics for any patient health status, the process is repeated for smaller initial errors between and . It is important to note that for this plant, and are initially synonymous with the and control parameters, respectively. Therefore, decreasing the initial error entails increasing or decreasing .
- (5.1)
A sensitivity check was performed on the
setting. By varying the
from 0–15
with increments of 5
, for the NORM state,
Figure 15a shows that increasing
only decreases
and
, while
remains the same. Thus, it is unnecessary for the adjustable PI controller to have coefficient functions with
as one of the input arguments.
- (5.2)
A sensitivity check was performed for the
. By varying the
from 10–40
with increments of 10
, for all health states,
Figure 15b shows that decreasing
increases
, decreases
, lengthens
and worsens the shape of the transient phase. Thus, it is necessary for the adjustable PI controller to have coefficient functions with
as one of the input arguments. Steps (4) and (5) of the process were therefore repeated to find a
pair for balanced dynamic characteristics for any patient health status for different
levels.
- (6)
The relationship between
and
coefficients and
can be modelled using power functions, as illustrated in
Figure 16a,b.
The method for varying
shown in
Figure 17a was replicated using dynamic
pairs, based on power functions from
Figure 16a,b. This led to an adjustable PI controller with markedly better dynamic performance across different
levels, as depicted in the comparison between
Figure 17a,b. As
decreases, there is a notable reduction in
,
,
, and an overall improvement in the response shape.
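The power-function fits of step (6) and their use as scheduled gains can be sketched as follows. The fit is an ordinary least-squares line in log-log space, which is one common way to obtain a power function and is not necessarily the fitting routine used for Figure 16. The data points, the coefficient values, and the names `fit_power_law` and `scheduled_gain` are hypothetical, and the scheduling variable is simply called `setting`.

```python
import math


def fit_power_law(x, y):
    """Least-squares fit of y = a * x**b in log-log space; returns (a, b).

    All x and y values must be strictly positive.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x, mean_y = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (w - mean_y) for u, w in zip(lx, ly))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def scheduled_gain(a, b, setting):
    """Evaluate the fitted power function at the current ventilator setting."""
    return a * setting ** b


if __name__ == "__main__":
    # Hypothetical balanced gains found at four levels of the setting.
    settings = [10.0, 20.0, 30.0, 40.0]
    balanced_gains = [0.190, 0.083, 0.051, 0.036]
    a, b = fit_power_law(settings, balanced_gains)
    print(f"gain = {a:.3f} * setting^{b:.3f}")
    print(f"gain at setting 25: {scheduled_gain(a, b, 25.0):.4f}")
```

One such function per coefficient replaces a lookup table, so the controller gains can be recomputed whenever the user changes the setting.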
4. Results
4.1. Model Verification and Validation
The validation presented here is focused on verifying internal consistency and label correctness of the generator rather than emulating the behaviour of specific commercial ventilators. Recovery of compliance and resistance parameters, EIHM verification, and cross-location impedance comparisons confirm that the model produces physiologically plausible, label-certain waveforms. Device-specific in-vitro validation (bench ventilator + breathing set + ETT, with multi-tap measurements) is an important next step, particularly for studies that aim to match data from specific manufacturers. Such work is part of our planned roadmap but lies beyond the scope of this initial release, which prioritises generalisability and dataset fidelity.
4.1.1. Mode Specific Ventilator Settings
The developed simulation model successfully reproduces the three primary control modes used in fully sedated intensive care ventilation—Volume Control with Constant Flow (VCC), Volume Control with Decelerating Flow (VCD), and Pressure Control (PC)—within a single, unified simulation framework. Each mode can be configured using clinically relevant phase variables (, , , and ), which are directly adjustable by the user, mirroring the interface of real-world ventilators.
Waveform outputs generated for each mode closely match both the expected physiological shapes described in the literature and the reference waveforms shown in
Figure 3, as demonstrated in the simulation results presented in
Figure 18.
Quantitative accuracy of the implemented phase variable settings is summarised in
Table 4, showing all parameter deviations remain below 2%.
This confirms that the model reliably enforces ventilation behaviour according to clinical specifications. Additionally, the VCC mode supports the execution of an end-inspiratory hold manoeuvre (EIHM), enabling traditional estimation of static compliance and inspiratory resistance—an essential feature for verifying that simulated patient responses reflect realistic disease dynamics. This capability is examined in detail in
Section 4.1.3.
These results collectively verify that the model satisfies the requirements (3.1)–(3.5) and (5.1) for ventilator configuration and behaviour outlined in
Section 2.7, offering both waveform accuracy and clinical relevance across ventilation strategies.
4.1.2. Qualitative Comparison with Clinical Data
As an illustrative step beyond internal verification, simulated mechanical ventilator waveform data were overlaid with measured clinical data (
Figure 19). Inspiratory dynamics and tidal volumes show close alignment, while discrepancies are visible during the plateau phase and expiratory flow return. These differences reflect both the simplified RC baseline (constant
,
) and device-specific factors such as valve dynamics and servo control. This comparison is intended as a qualitative illustration of plausibility rather than a formal validation; device-specific in-vitro and in-vivo benchmarking remain important next steps in our roadmap.
4.1.3. Patient Health Parameters
The same mechanical waveforms from
Figure 18 (data generated by our proposed model) can be analysed using the estimation methods described in
Section 1.1 to verify that our model accurately simulates the patient’s health status. For consistency’s sake, the health condition parameters for each of those waveform sets (for each of the three modes) were kept constant (
mL/cmH2O and cmH2O · s/L).
To calculate
for each mode, the analysis focuses on the first timestamp following the inspiratory phase, where the flow ceases and pressure settles. Applying (
11) results in
values of 54.72, 54.91, and 54.82 mL/cmH2O respectively, averaging to 54.82 mL/cmH2O, which is 99.67% of the 55 mL/cmH2O set point.
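The recovery figure can be checked with a few lines of Python. The helper `static_compliance` uses the standard end-inspiratory hold relation (tidal volume divided by plateau pressure minus PEEP), which is assumed here to be the form of the estimator referenced above. Its example inputs are hypothetical, whereas the three compliance estimates are the values reported in the text.

```python
def static_compliance(vt_ml, p_plat, peep):
    """Static compliance (mL/cmH2O) from an end-inspiratory hold.

    vt_ml  : delivered tidal volume in mL
    p_plat : plateau pressure during the hold in cmH2O
    peep   : positive end-expiratory pressure in cmH2O
    """
    return vt_ml / (p_plat - peep)


def recovery_percent(estimates, set_point):
    """Mean estimate and its ratio to the configured set point, in percent."""
    mean_estimate = sum(estimates) / len(estimates)
    return mean_estimate, mean_estimate / set_point * 100.0


if __name__ == "__main__":
    # Hypothetical hold: 550 mL delivered, plateau 15 cmH2O, PEEP 5 cmH2O.
    print(static_compliance(550.0, 15.0, 5.0))

    # Estimates reported for the three modes against the 55 mL/cmH2O set point.
    mean_c, pct = recovery_percent([54.72, 54.91, 54.82], 55.0)
    print(f"mean = {mean_c:.2f} mL/cmH2O, recovery = {pct:.2f} %")
```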
Since pulmonologists and ventilators use data from sensors implemented between the ETT and ventilator circuit, the best approximation method executable (even with EIHM) determines the equivalent
resistance (
Figure 20 depicts it as the sum of the respiratory system impedance
and endotracheal tube resistance
), which is often imprecise due to variable flow rates and potential turbulent flows during inspiration, affecting the pressure measurements across the ETT.
Figure 21 illustrates the impact of measuring prior to the ETT (as in practice), which introduces a significant pressure component due to added resistance in all scenarios. Also, the influence of
will only be intensified for smaller ETT diameters. However, the additional pressure component of the trachea is negligible, as it aligns closely with the waveform considering only the lung, as half the trachea is bypassed using the intubating method.
To calculate
, one can reference
Figure 20 and use (
18) and (
19). However, not only is placing the sensors between the ETT and trachea uncommon, but measuring the
resistance from here using the EIHM results in obtaining the magnitude of the impedance (
), without the necessary phase information to calculate
and
. Moreover, the inspiratory phase is aperiodic, complicating the determination of the angular frequency
. All these factors contribute to the complexity and necessitate the approximation of
.
To demonstrate the imprecision of the measurement location used in practice, simulations with the EIHM in VCC mode at three
values were performed. Pressure and flow waveforms were recorded at three model locations: before the ETT (
), before the trachea (
), and after the trachea (
). See
Table 5 for the resulting impedances.
Thus, physicians can accurately determine
with the EIHM (99.67%, verifying its implementation), but not
when using the methods discussed in
Section 1.1. Even knowing
, and placing sensors at the end of the ETT, still,
cannot confidently be determined. Thus, traditional
approximation techniques are inaccurate and, when referencing
Table 5, could overestimate
by as much as 287%. This motivates later development of prediction models trained on this model’s generated data to enable accurate, automated condition monitoring. While advanced ventilators perform such monitoring (e.g., via least-squares estimators), they are not universally affordable; our aim is a cost-effective retrofit path that upgrades existing devices without wholesale replacement.
Further model verification theoretically simulates
to approach
by decreasing the contribution of
by leveraging (
19), effectively short-circuiting the capacitor by letting
approach infinity. The replicated experiment’s results, shown in
Table 6, confirm the generator model’s accuracy in simulating
with an error range of 2.2 to 2.9%.
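The limiting argument can be illustrated numerically. The sketch assumes the usual series RC impedance magnitude, |Z| = sqrt(R^2 + (1/(omega*C))^2), which is taken here to be consistent with the impedance relations referenced above. The resistance, angular frequency, and compliance values are arbitrary examples, and the function names are hypothetical.

```python
import math


def rc_impedance_magnitude(r, c, omega):
    """Magnitude of a series RC impedance.

    r     : resistance in cmH2O.s/L
    c     : compliance in L/cmH2O
    omega : angular frequency in rad/s
    """
    return math.hypot(r, 1.0 / (omega * c))


def overestimate_percent(r, c, omega):
    """How far |Z| exceeds the true resistance, in percent of R."""
    return (rc_impedance_magnitude(r, c, omega) - r) / r * 100.0


if __name__ == "__main__":
    r, omega = 10.0, 2.0 * math.pi * 0.25     # example values only
    for c in (0.05, 0.5, 5.0, 500.0):
        z = rc_impedance_magnitude(r, c, omega)
        print(f"C = {c:7.2f} L/cmH2O -> |Z| = {z:7.3f}, "
              f"overestimate = {overestimate_percent(r, c, omega):6.2f} %")
```

Because the capacitive term shrinks as compliance grows, the magnitude converges to the resistance from above, which is why reading |Z| as resistance can only overestimate it.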
4.2. User Interface
After improving the MATLAB® Simulink® model to meet all the requirements for high-fidelity mechanical ventilator data generation, an application was developed to ensure a smooth user experience (see
Figure 22). Some features of the application include:
The user can easily change the anthropometric data of the patient and the model is automatically adjusted to represent such a patient.
The user can toggle between generating data for a single instance and sweeping parameters across multiple instances.
The user may also choose to store the generated data in a database to process later.
This application addresses a key limitation of existing models by eliminating the need for user-driven model-level adjustments. Unlike other simulation tools, it offers seamless manipulation of model context based on patient anthropometric data, selection of ICU ventilation modes with accurate endotracheal tube (ETT) implementation, configuration of ventilation parameters, simulation of disease progression, and verification through EIHM methods. As such, this simulation model is the first of its kind to combine ease of use with high-fidelity data generation and comprehensive physiological modelling capabilities.
An extensive dataset of 1.92 M unique mechanically ventilated breaths was simulated. The simulation model (v4.0.0) and its generated dataset are available at a GitHub page [
76].
5. Discussion
The developed mechanical ventilator-patient (MV-P) data generation system offers significant advantages enabling advancements in automatic condition monitoring of mechanically ventilated patients:
Parameter sweeping can generate a practically unlimited number of distinct breaths.
Changes in patient health status are labelled per breath.
The data generated is labelled with precision and accuracy.
Users can simulate specifically desired data.
New data can be generated on-demand.
Furthermore, synthetic generation does not depend on waiting for real-world health crises to occur, and it avoids ethical issues such as privacy and patient consent because no sensitive patient health information is required.
Another advantage of the simulator is its ability to generate data representing rare or extreme physiological conditions. In clinical settings, most patients fall within typical ranges of lung resistance and compliance, making it difficult to collect real-world data from individuals with unusually stiff or highly compliant lungs. This scarcity can compromise the performance or safety of machine learning models when deployed on outliers. By intentionally sweeping extreme combinations of resistance and compliance, it enables the creation of more generalisable datasets for training and testing algorithms.
In addition, the system supports simulations that model not only static lung conditions but also changes over time. By adjusting resistance and compliance across successive breaths, it becomes possible to simulate clinical trajectories—such as a patient recovering from lung injury or gradually deteriorating. These sequences can serve as synthetic case studies or training scenarios for monitoring models, providing a platform to explore transitions like lung stiffening or loosening due to disease progression or intervention.
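A per-breath parameter schedule of this kind can be expressed very compactly. The sketch below builds a linearly interpolated sequence of resistance and compliance labels. The linear profile, the function name, and the start and end values are assumptions for illustration, since real trajectories need not be linear.

```python
def linear_trajectory(start, end, n_breaths):
    """Per-breath (R, C) labels moving linearly from start to end.

    start, end : (R, C) tuples, e.g. R in cmH2O.s/L and C in mL/cmH2O
    n_breaths  : number of breaths in the sequence (at least 2)
    """
    if n_breaths < 2:
        raise ValueError("a trajectory needs at least two breaths")
    r0, c0 = start
    r1, c1 = end
    schedule = []
    for k in range(n_breaths):
        w = k / (n_breaths - 1)        # 0 at the first breath, 1 at the last
        schedule.append((r0 + w * (r1 - r0), c0 + w * (c1 - c0)))
    return schedule


if __name__ == "__main__":
    # Hypothetical deterioration: rising resistance, falling compliance.
    for breath, (r, c) in enumerate(linear_trajectory((10.0, 55.0), (25.0, 25.0), 4)):
        print(f"breath {breath}: R = {r:.1f}, C = {c:.1f}")
```

Each tuple doubles as the ground-truth label of the breath simulated with it, so a recovery trajectory is simply the same schedule reversed.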
The application’s performance was optimised to reduce the simulation time per breath. With parallel computing enabled, it achieved an average of 3.35 s per breath on a machine with an Intel® Core™ i7-6700HQ CPU (quad-core @ 2.60 GHz). A full sweep generating 1.92 million breaths would take roughly 74 days on a single machine. To accelerate this process, a lab of 60 identical machines was used, completing the full sweep in about 48 h. This produced 168 GB of raw CSV data, reduced to 66.7 GB after trimming, with the final database footprint totalling 49 GB for time-series data, 16 MB for simulation settings, and 152 MB for extracted features. Further adaptive stepping and parallel scheduling are engineering optimisations outside the present scope.
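The single-machine estimate follows directly from the per-breath time, as the short calculation below shows. The second function gives the ideal wall-clock time for a perfectly even split across identical machines. This is a lower bound that assumes no scheduling, start-up, or I/O overhead, so the reported figure of about 48 h is expected to sit above it. The function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400
SECONDS_PER_HOUR = 3_600


def single_machine_days(n_breaths, seconds_per_breath):
    """Total simulation time on one machine, in days."""
    return n_breaths * seconds_per_breath / SECONDS_PER_DAY


def ideal_parallel_hours(n_breaths, seconds_per_breath, n_machines):
    """Wall-clock hours for a perfectly even, overhead-free split."""
    return n_breaths * seconds_per_breath / n_machines / SECONDS_PER_HOUR


if __name__ == "__main__":
    print(f"{single_machine_days(1_920_000, 3.35):.1f} days on one machine")
    print(f"{ideal_parallel_hours(1_920_000, 3.35, 60):.1f} h ideal on 60 machines")
```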
It should be evident that the developed model is just that—a model. Any clinical use will require staged, IRB-approved evaluation to ensure safety, then a silent prospective study, and only thereafter any controlled intervention. Until such validation is complete, outputs are for research use only. Its generated data should be interpreted with this understanding.
The choice to adopt a single-compartment RC baseline was deliberate: it reflects a trade-off between fidelity, transparency, and generalisability. While more complex models (multi-compartment, viscoelastic) can reproduce additional subtleties of ICU lung mechanics, they demand invasive measurements, device-specific parameter sets, and often suffer from identifiability issues. For the purpose of generating large, label-certain datasets for algorithm development, the RC framework offers a reproducible and clinically interpretable starting point. More detailed models remain a valuable extension path, but their integration must preserve the identifiability and dataset consistency that motivate this work.
Some limitations, presenting opportunities for future enhancement, include:
- 1.
Supports only controlled ventilation modes (VCC, VCD, and PC).
- 2.
Adjustable PI controller is optimised for a specific ETT size (8.0 mm).
- 3.
Neglects the respiratory inertance component.
- 4.
Mechanical parameters are independent of temperature and position.
- 5.
Does not account for patient muscle interference.
- 6.
Mechanical parameters remain static during the breathing cycle.
- 7.
Cannot simulate respiratory damage from harmful ventilation settings.
- 8.
Assumes ventilator sensors are ideal, without any faults.
- 9.
Ignores inevitable leaks in the ventilator circuit.
- 10.
Ventilator circuit components’ dynamic characteristics may deviate from actual ventilators’ (e.g., the check-valve’s pressure to area curve).
Suggested Future Work
While numerous extensions are possible, the present release already represents a substantive improvement over available open-source models (
Section 2.7). Our focus is on incremental additions that enhance physiological fidelity while maintaining identifiability and the label-certain properties that motivate this work.
It should also be noted that the present Simulink®-based implementation was chosen for transparency, modularity, and reproducibility in proof-of-concept development. Although Simulink is not optimised for generating very large-scale datasets, the modular structure of the model allows straightforward translation into compiled environments (e.g., MATLAB Coder, C/C++, or GPU-accelerated solvers) to support future high-throughput applications.
The following items outline the highest-value extensions we intend to prioritise:
Sensitivity and controller robustness
- –
Investigate the influence of ETT size during PC mode. If the current PI controller proves overly sensitive to ETT size, develop an updated controller that accounts not only for PEEP and PIP but also for a range of ETT diameters. Conversely, if tolerance to ETT variation is sufficient, expand the published dataset by simulating identical scenarios with additional ETT sizes. Based on feedback from clinicians, adding at least a size 7.0 mm ETT will already increase coverage to about 95% of the adult patient population.
Condition monitoring and FDI applications
- –
Develop a streaming interface for real-time ingestion of ventilator waveforms, including buffering and timestamp synchronisation, enabling prospective silent-mode condition monitoring.
- –
Apply machine-learning–based regression models to automatically estimate patient-specific lung mechanics with higher precision. Combining these models with longitudinal analysis may reveal how diurnal temperature variations modulate mechanical parameters and contribute to pulmonary disease progression under mechanical ventilation.
- –
Implement fault detection and isolation (FDI) methods for common ventilator issues such as leakages, excessive condensation, worn valves, and degraded flow-delivery components. Extend these capabilities to alarm-related problems including incorrect ETT placement (e.g., mainstem bronchus intubation) and faulty ETT cuffs, both of which can critically impact patient outcomes.
Analytical and mathematically challenging extensions
- –
Incorporate the muscular pressure component generated by patient effort, enhancing the model’s suitability for studying active breathing, patient–ventilator asynchrony (PVA), patient orientation effects, and optimisation of weaning strategies.
- –
Enhance the lung model by allowing mechanical parameters to vary dynamically throughout the breathing cycle. This improvement would better reflect physiological behaviour, enrich P/V and F/V loop analysis, and support more precise quantification of ventilator-induced lung injury (VILI) across ventilation modes.
- –
Expand the data-generation framework by introducing viscoelastic tissue effects and pendelluft phenomena.
- –
Introduce optional modules for nonlinear lung and chest-wall compliance, inspiratory–expiratory resistance asymmetry, flow-regime transitions (e.g., upper-airway turbulence), small-airway collapse, and preset breathing-set parasitics (filters, humidifier, tubing). Additional device-profile templates (e.g., sensor tap locations and servo characteristics) would further enhance flexibility.
- –
Investigate methods to estimate the angular frequency in the respiratory impedance model during inspiration across ventilation modes. Accurate estimation may improve calculation of and support identification of inertial components, thereby expanding the set of variables available for resource-efficient patient assessment and disease monitoring.
- –
Once the above nonlinear and higher-fidelity model extensions are in place, perform a small-signal stability analysis of the closed-loop PI-controlled system. By linearising the error dynamics,
and deriving the eigenvalues $\lambda_i$ of the resulting system matrix, one can verify that $\operatorname{Re}(\lambda_i) < 0$ for all $i$ to ensure asymptotic stability. Providing this derivation would substantially strengthen the mathematical validity of the controller design.
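As a toy illustration of the proposed stability check (not an analysis of the published model), consider a PI controller that drives flow into a single compliance and regulates lung pressure, with the resistive pressure drop ignored. With the volume deviation and the integrated error as states, the linearised system matrix is [[-Kp/C, Ki], [-1/C, 0]], and its eigenvalues follow from the quadratic formula. The plant simplification, the gain values, and the function names are all assumptions.

```python
import cmath


def eig2(a11, a12, a21, a22):
    """Eigenvalues of a 2x2 matrix via the characteristic polynomial."""
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    disc = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + disc) / 2.0, (trace - disc) / 2.0


def pi_loop_eigenvalues(kp, ki, c):
    """Eigenvalues of the toy loop: states are volume deviation and error integral.

    dV/dt = -(kp / c) * V + ki * I
    dI/dt = -(1 / c) * V
    """
    return eig2(-kp / c, ki, -1.0 / c, 0.0)


def is_asymptotically_stable(eigenvalues):
    """True when every eigenvalue has a strictly negative real part."""
    return all(lam.real < 0.0 for lam in eigenvalues)


if __name__ == "__main__":
    eigs = pi_loop_eigenvalues(kp=0.05, ki=1.0, c=0.05)   # example gains only
    print(eigs, is_asymptotically_stable(eigs))
```

For this simplified loop, any positive gains and compliance give a negative trace and a positive determinant, so both eigenvalues lie in the left half-plane. The full model with ETT, circuit, and valve dynamics would need a larger state vector and a numerical eigenvalue solver.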
The present work should be regarded as a baseline framework: preliminarily validated against representative clinical data, deliberately scoped to ensure reproducibility, and extensible toward more advanced physiologic modelling and validation protocols. Its immediate value lies in supplying a transparent, label-certain dataset generator—a foundation upon which more device-specific or multi-compartment models can be built.