# Evaluation of Different Control Algorithms for Carbon Dioxide Removal with Membrane Oxygenators

## Abstract

## 1. Introduction

The CO_{2} content in blood (approximately 0.5 mL/mL at 37 °C and 750 mmHg) significantly exceeds that of O_{2} (approximately 0.2 mL/mL at 37 °C and 750 mmHg). While O_{2} is mainly bound to hemoglobin, CO_{2} is mainly bound in blood in the form of bicarbonate. The different binding of the two gases influences the control strategy and the operating conditions (blood flow rate, sweep gas flow rate, and sweep gas partial pressure), depending on the operation mode. Generally, the oxygenation mode needs higher blood flow rates than the CO_{2} removal mode [4].

[…] CO_{2} and O_{2} through the permeable membrane, due to the partial pressure differences of the gases in both phases. The sweep gas usually consists of O_{2} and N_{2} in different ratios, while the CO_{2} concentration should be close to zero, in order to maximize partial pressure differences. Thus, CO_{2} diffuses from the blood into the sweep gas and is removed to the ambient, while O_{2} diffuses through the membrane into the blood. For a detailed review of membrane oxygenators, the interested reader is referred to [5].

[…] CO_{2} partial pressure and blood oxygenation at desired levels [7]. Due to the human factor, this process is prone to errors that can lead to serious injuries to tissues and organs [8]. An automatic controller with accurate reference tracking can reduce the costs and treatment risks. For future wearable intracorporeal membrane oxygenators, automatic control is obligatory, since a perfusionist is not available outside a clinical setting. Therefore, we present here a feasible method to automatically control this class of devices.

[…] CO_{2} removal. Misgeld et al. [9] developed a PI controller with gain scheduling for the control of both blood gases. Manap et al. [10] recently proposed a self-tuning fuzzy-PI controller that follows setpoints for the CO_{2} partial pressure in the blood. Sadati et al. [11] proposed a fractional PI controller that controls the oxygen partial pressure. Allen et al. [12] presented a linear quadratic Gaussian controller to control the oxygenation of the blood. All of the above-mentioned studies used the sweep flow rate as the manipulated variable.

[…] CO_{2} partial pressure in a membrane oxygenator using both the sweep flow rate and blood flow rate as manipulated variables. To that end, first, a compartmental model of the blood membrane oxygenator is presented, detailing the CO_{2} kinetics and transport. Next, the three control algorithms are introduced: a PI feedback controller, a non-linear model predictive controller (NMPC), and a deep reinforcement learning (RL) controller. Their effectiveness in tracking setpoints and rejecting disturbances is presented. Finally, based on the results of the experiments, the advantages and disadvantages of each control algorithm are discussed.

## 2. Modeling and Control

#### 2.1. Mathematical Model

#### 2.1.1. Gas Compartment

CO_{2} from the blood diffuses through the membrane into the gas phase and is removed from the system with the sweep gas flow.
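As an illustrative sketch (not the authors' implementation), the gas-compartment balance in Equation (A1) can be written as a small Python function, together with its steady state for a fixed plasma partial pressure. The parameter values are those listed in Table A2; the function names and the sample sweep flow rates are assumptions made for this example.

```python
# Gas compartment of the oxygenator, Equation (A1); values from Table A2.
V_G = 0.0055        # gas volume [L]
D_CO2_M = 5.4e-6    # membrane diffusion capacity [L/(mmHg s)]
P_BAR = 760.0       # gas pressure [mmHg]


def dp_gas_dt(p_g, p_pl, q_g, p_g_in=0.0):
    """Right-hand side of Equation (A1) divided by V_g [mmHg/s]."""
    sweep = q_g * (p_g_in - p_g)                 # wash-out by the sweep gas
    membrane = D_CO2_M * P_BAR * (p_pl - p_g)    # CO2 arriving from the plasma
    return (sweep + membrane) / V_G


def steady_state_p_gas(p_pl, q_g, p_g_in=0.0):
    """Gas-side pCO2 at which Equation (A1) has zero time derivative."""
    k = D_CO2_M * P_BAR
    return (q_g * p_g_in + k * p_pl) / (q_g + k)


def membrane_flux(p_pl, p_g):
    """CO2 volume flux across the membrane [L/s], before molar conversion."""
    return D_CO2_M * (p_pl - p_g)


if __name__ == "__main__":
    p_pl = 43.0  # plasma pCO2 [mmHg], initial value from Table A3
    for q_g in (0.02, 0.001, 0.0):  # hypothetical sweep flow rates [L/s]
        p_g = steady_state_p_gas(p_pl, q_g)
        print(q_g, round(p_g, 3), membrane_flux(p_pl, p_g))
```

With the sweep flow switched off, the gas side equilibrates with the plasma and the membrane flux falls to zero, which is the degenerate operating point discussed in Section 3.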

#### 2.1.2. Plasma Compartment

CO_{2} exists in its dissolved form and as bicarbonate. Here, the CO_{2} is hydrated to carbonic acid, which then dissociates to bicarbonate and hydrogen ions. The plasma compartment exchanges dissolved CO_{2} with the gas compartment of the oxygenator and CO_{2} and HCO_{3} with the erythrocytes.
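The hydration and dehydration balance in the plasma corresponds to the rate term in Equation (A5). The following sketch evaluates it at the initial plasma state from Table A3, using the constants of Table A2. The function name is an assumption, and the snippet is an illustration rather than the authors' code.

```python
# Net bicarbonate reaction rate in plasma, Equation (A5).
K_U = 0.12          # CO2 hydration forward constant [1/s]
K_V = 89.0          # CO2 hydration reverse constant [1/s]
K_EQ = 5.5e-4       # carbonic acid dissociation constant k [M]
ALPHA_CO2 = 3.5e-5  # CO2 solubility coefficient [M/mmHg]


def r_hco3(p_co2, h, hco3, cat=1.0):
    """Rate term as written in (A5); cat > 1 gives the catalyzed form (A11)."""
    forward = K_U * ALPHA_CO2 * p_co2      # hydration of dissolved CO2
    reverse = (K_V / K_EQ) * h * hco3      # recombination of H and HCO3
    return cat * (-forward + reverse)


if __name__ == "__main__":
    # Initial plasma state from Table A3: 43 mmHg, 42.3 nM H, 26.3 mM HCO3.
    rate = r_hco3(43.0, 42.3e-9, 263e-4)
    print(f"R_HCO3,pl = {rate:.3e} M/s")
```

At these values the two contributions nearly cancel, so the tabulated initial state lies close to chemical equilibrium.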

#### 2.1.3. Erythrocytes Compartment

CO_{2} exists in all three forms: dissolved, bound to proteins, and as bicarbonate. Here, the hydration of CO_{2} is catalyzed by the enzyme carbonic anhydrase, which increases the reaction rate by a factor of ~10^{4}. Thus, when the partial pressure of CO_{2} in the plasma (${p}_{{\mathrm{CO}}_{2},pl}$) is high, CO_{2} diffuses from the plasma into the erythrocytes, where it is hydrated to HCO_{3}, which then diffuses back into the plasma compartment. However, this generates a charge imbalance, because H^{+} ions accumulate in the erythrocytes due to the dissociation of carbonic acid, leading to a decrease in pH. To compensate for that, chloride diffuses from the plasma into the red blood cells (Hamburger shift [15]). CO_{2} can also bind to hemoglobin and form carbamate. One hemoglobin molecule can bind up to four CO_{2} molecules. CO_{2} competes with O_{2} to bind to hemoglobin through the Bohr and Haldane effects. The Bohr effect describes the decreasing O_{2} affinity of hemoglobin at lower pH, thus improving O_{2} release under venous conditions. The Haldane effect refers to the reduced CO_{2} affinity of oxygen-saturated hemoglobin, which allows for improved CO_{2} release in the lungs.

#### 2.1.4. Model Adjustment

[…]_{2} partial pressure is not considered when computing the rate of change of O_{2} concentration. They define the O_{2} concentration as:

[…] O_{2} partial pressure is:

[…] O_{2} concentration, Equation (1) must be differentiated with respect to time. However, in addition to ${p}_{{\mathrm{O}}_{2},b}$, ${\mathrm{pH}}_{virtual}$ and ${p}_{{\mathrm{CO}}_{2},pl}$ are also functions of time that are included in the system of differential equations (Appendix A Table A1). Therefore, the chain rule of differentiation requires Equation (2) to be differentiated with respect to ${\mathrm{pH}}_{virtual}$ and ${p}_{{\mathrm{CO}}_{2},pl}$, in addition to ${p}_{{\mathrm{O}}_{2},b}$, leading to:
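In generic form, and without assuming the specific expressions of Equations (1) and (2), the chain rule for an O_{2} concentration that depends on time through ${p}_{{\mathrm{O}}_{2},b}$, ${\mathrm{pH}}_{virtual}$, and ${p}_{{\mathrm{CO}}_{2},pl}$ can be sketched as:

$$\frac{d{[{\mathrm{O}}_{2}]}_{b}}{dt}=\frac{\partial {[{\mathrm{O}}_{2}]}_{b}}{\partial {p}_{{\mathrm{O}}_{2},b}}\frac{d{p}_{{\mathrm{O}}_{2},b}}{dt}+\frac{\partial {[{\mathrm{O}}_{2}]}_{b}}{\partial {\mathrm{pH}}_{virtual}}\frac{d{\mathrm{pH}}_{virtual}}{dt}+\frac{\partial {[{\mathrm{O}}_{2}]}_{b}}{\partial {p}_{{\mathrm{CO}}_{2},pl}}\frac{d{p}_{{\mathrm{CO}}_{2},pl}}{dt}$$

The second and third terms vanish only if ${\mathrm{pH}}_{virtual}$ and ${p}_{{\mathrm{CO}}_{2},pl}$ are treated as constants, which is not the case in the system of Table A1.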

CO_{2} kinetics were coupled with the O_{2} kinetics through the Bohr and Haldane effects. We modelled that through the oxygen saturation curve and implemented it as a measured disturbance in our system. This greatly simplified the dynamical system and made it much less non-linear, compared to solving correctly for $\frac{d{[{\mathrm{O}}_{2}]}_{b}}{dt}$ (Equation (4)). If one is also interested in controlling the oxygen concentration in the blood, another feedback control loop can be designed specifically for that purpose. Additionally, in Hexamer et al. (2003), Misgeld (2007), and Manap et al. (2017), a conversion factor was missing in the membrane diffusion term in the rate of change equations of ${p}_{{\mathrm{CO}}_{2},pl}$ and ${[{\mathrm{O}}_{2}]}_{b}$. The amount of CO_{2} and O_{2}, respectively, diffusing through the membrane was measured in L s^{−1}, while all other terms were written in mol s^{−1}. Here, the molar volume (${V}_{m,{\mathrm{CO}}_{2}}$ in Equation (A2), Appendix A Table A1) of O_{2} (~25.7301 L/mol) and CO_{2} (25.6312 L/mol) at 37 °C was required to balance the equations.
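To make the conversion concrete, the sketch below converts a membrane CO_{2} flux from L s^{−1} to mol s^{−1} by dividing by the molar volume, as in the membrane term of Equation (A2). The diffusion capacity and molar volume are taken from Table A2 and the partial pressures from Table A3; the function name is hypothetical.

```python
# Convert the membrane CO2 flux from a volume rate to a molar rate.
D_CO2_M = 5.4e-6   # membrane diffusion capacity [L/(mmHg s)], Table A2
V_M_CO2 = 25.64    # CO2 molar volume [L/mol], Table A2


def membrane_flux_mol(p_g, p_pl):
    """Membrane term of Equation (A2) in mol/s (negative: CO2 leaves plasma)."""
    flux_l_per_s = D_CO2_M * (p_g - p_pl)   # volume flux [L/s]
    return flux_l_per_s / V_M_CO2           # molar flux [mol/s]


if __name__ == "__main__":
    # Initial gas and plasma partial pressures from Table A3 [mmHg].
    print(membrane_flux_mol(0.695, 43.0))
```

Without the division, the membrane term would be roughly 25 times too large relative to the other terms, which are expressed in mol s^{−1}.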

#### 2.1.5. Model Implementation

[…] CO_{2} partial pressure in the plasma. The blood saturation $S$ and its time derivative were implemented as measured disturbances. The diffusion capacity of the membrane oxygenator was calculated from steady-state conditions of experimental data from in vitro and in vivo measurements [20]. The gas and blood volumes were calculated from the geometry of the oxygenator. The values for the remaining model parameters were taken from the literature (Appendix A Table A2). The initial values for the model variables are listed in Appendix A, Table A3. The non-model symbols used in the text are listed in Appendix A, Table A4.

#### 2.2. Control

[…] CO_{2} flux through the membrane. Another objective could be to set the CO_{2} flux at the desired value. In this work, we investigated a strategy that sets the concentration of CO_{2} in the blood at the desired value. This strategy was applied with three different algorithms: a classic PI controller, a modern non-linear model predictive controller, and a deep neural network-based reinforcement learning controller. Each controller acts on both the blood flow rate and the gas flow rate and has access to all system states.

#### 2.2.1. PI Controller

#### 2.2.2. Non-Linear Model Predictive Controller

#### 2.2.3. Reinforcement Learning Controller

#### 2.2.4. Imitation Learning

#### 2.2.5. Controller Implementation

#### 2.3. Control Experiments

#### 2.3.1. Rise Time

#### 2.3.2. Settling Time

#### 2.3.3. Root-Mean-Squared Error

#### 2.3.4. Action Power

#### 2.3.5. Action Standard Deviation

## 3. Results and Discussion

[…]^{−1} s^{−1} for blood and 363.2 mmHg L^{−1} s^{−1} for gas flow rate in the range of 0 to 1.2 L/min. Since the systemic circulation was usually intact in the carbon dioxide removal context, the blood flow rate through the catheter could be freely regulated to achieve the desired conditions. Under this assumption, the RL controller optimized to a state in which it omitted the gas flow rate for the control of the CO_{2} partial pressure, setting it to an almost constant value.

[…] CO_{2} concentration and then switched off all action. This was not the desired behavior because the CO_{2} flux, in this case, became 0. The literature refers to this behavior as reward hacking, which commonly arises in reinforcement learning tasks [33]. Interestingly, not only the RL but also the PI and the NMPC controllers found this to be an optimal solution to the task. The goal of the PI and NMPC controllers is to set ${p}_{{\mathrm{CO}}_{2},pl}$ with as little deviation as possible from the reference. Blood coming into the oxygenator from the body circulation almost always has a different ${p}_{{\mathrm{CO}}_{2}}$ than the blood in the oxygenator, because the gas gradients in the oxygenator are opposite to the ones in the body. Thus, the incoming blood counteracts the efforts of the controller to set ${p}_{{\mathrm{CO}}_{2},pl}$ at the reference value. It is, therefore, beneficial for the controller to turn off this ‘disturbance’, which leads to this undesired behavior. We solved this issue by implementing a lower limit of 60 mL/min for the flow rates, which counteracts this behavior. The upper limit of 1.2 L/min was chosen with regard to physiology, equipment limits, and the sensitivity curve of ${p}_{{\mathrm{CO}}_{2},pl}$. Applying these saturation limits after the controller output was the only way to solve this problem for the PI. NMPC allows the constraints to be implemented in the optimization, which makes its solution optimal. However, this issue could also be solved by including a CO_{2} flux term in the cost function of the NMPC. On the other hand, the RL controller does not always converge to this sub-optimal solution. Our data indicate that the stochasticity in the exploration strategy of the RL agent could be the cause for this behavior.
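The saturation described above can be sketched as a discrete PI law whose output is clamped to the flow limits of 60 mL/min and 1.2 L/min. This is a minimal single-loop illustration: the gains, the sample time, the sign convention, and the conditional-integration anti-windup are assumptions, not the tuning or structure used in the study.

```python
from dataclasses import dataclass

Q_MIN = 0.060 / 60.0   # lower flow limit: 60 mL/min expressed in L/s
Q_MAX = 1.2 / 60.0     # upper flow limit: 1.2 L/min expressed in L/s


@dataclass
class SaturatedPI:
    kp: float              # proportional gain (hypothetical value)
    ki: float              # integral gain (hypothetical value)
    dt: float              # sample time [s]
    integral: float = 0.0

    def step(self, reference: float, measurement: float) -> float:
        """Return a flow-rate command [L/s] clamped to [Q_MIN, Q_MAX]."""
        # Assumed sign convention for a sweep gas loop: more flow lowers
        # pCO2, so the error is measurement minus reference.
        error = measurement - reference
        candidate = self.integral + error * self.dt
        raw = self.kp * error + self.ki * candidate
        clamped = min(max(raw, Q_MIN), Q_MAX)
        # Anti-windup (an assumption): only integrate while unsaturated.
        if raw == clamped:
            self.integral = candidate
        return clamped


if __name__ == "__main__":
    pi = SaturatedPI(kp=1e-3, ki=5e-4, dt=0.1)
    print(pi.step(reference=40.0, measurement=43.0))  # inside the limits
    print(pi.step(reference=40.0, measurement=20.0))  # clamped at Q_MIN
```

Because the clamp sits after the control law, the command can never reach zero flow, so the controller cannot remove the incoming blood as a disturbance.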

[…]^{−4} s to compute an action. The time complexity of the PI algorithm was $\mathsf{O}\left(p\right)$, where $p$ was the number of outputs. The average time needed to compute an action of the PI controller was 6.69 × 10^{−7} s. There was a performance advantage of ~3 orders of magnitude between PI and RL and between RL and NMPC in terms of computation time. Regarding memory requirements, PI needs 6 coefficients and takes about 1 kB of memory, NMPC needs around 30 coefficients and takes about 10 kB of memory, while the RL algorithm uses around 1.2 × 10^{4} coefficients and takes around 300 kB of memory. Additionally, the PI controller is a very well-understood off-the-shelf algorithm that is very easy to implement, while the NMPC is a much more complicated control scheme that also needs an accurate model of the system, which might also imply the need for domain knowledge. On the other hand, RL is a model-free algorithm and, thus, does not need any domain knowledge. Training of the neural networks takes a long time, but after the agent is deployed, the selection of actions is fast and computationally inexpensive. The biggest disadvantage of the PI is the noisy nature of its actions. During operation, the high variability in ${Q}_{b}$ and ${Q}_{g}$ could lead to patient discomfort, or even potential harm [8,35]. This is not the case with NMPC, because it penalizes varying inputs in its cost function.
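The penalty on varying inputs mentioned above can be illustrated with a finite-horizon cost of the form $J=\sum_{k}{\left({y}_{k}-{r}_{k}\right)}^{2}+\lambda {\Vert {u}_{k}-{u}_{k-1}\Vert}^{2}$. The exact cost function of the study is not reproduced in this text, so the quadratic form, the weight `lam`, the function name, and the sample values below are assumptions chosen for illustration.

```python
from typing import Sequence


def tracking_cost(
    outputs: Sequence[float],
    references: Sequence[float],
    inputs: Sequence[Sequence[float]],
    previous_input: Sequence[float],
    lam: float,
) -> float:
    """Sum of squared tracking errors plus lam times squared input changes."""
    if not (len(outputs) == len(references) == len(inputs)):
        raise ValueError("horizon lengths must match")
    cost = 0.0
    last = previous_input
    for y, r, u in zip(outputs, references, inputs):
        cost += (y - r) ** 2                                      # tracking term
        cost += lam * sum((a - b) ** 2 for a, b in zip(u, last))  # move penalty
        last = u
    return cost


if __name__ == "__main__":
    ref = [40.0, 40.0, 40.0]                 # pCO2 setpoints [mmHg]
    y = [42.0, 41.0, 40.0]                   # predicted pCO2 [mmHg]
    u0 = [0.010, 0.010]                      # previous (Q_g, Q_b) [L/s]
    smooth = [[0.010, 0.010]] * 3            # constant flow rates
    jumpy = [[0.020, 0.001], [0.001, 0.020], [0.020, 0.001]]
    print(tracking_cost(y, ref, smooth, u0, lam=1e3))
    print(tracking_cost(y, ref, jumpy, u0, lam=1e3))
```

For the same predicted outputs, the rapidly switching input sequence receives the higher cost, which is the mechanism by which such a term discourages noisy actions.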

## 4. Conclusions

[…] CO_{2} partial pressure reference tracking for membrane oxygenators in simulation. While PI and RL require less computational power when deployed, the NMPC was found to be the most suitable overall, in terms of accuracy, speed, and patient comfort. However, hardware limitations could restrict the use of an NMPC controller.

## Appendix A

**Table A1.** System of differential equations of the model.

Equation | No. |
---|---|
**Gas compartment** | |
${V}_{g}\frac{d{p}_{{\mathrm{CO}}_{2},g}}{dt}={Q}_{g}\left({p}_{{\mathrm{CO}}_{2},g,in}-{p}_{{\mathrm{CO}}_{2},g}\right)+{D}_{{\mathrm{CO}}_{2},m}\times {p}_{bar}\left({p}_{{\mathrm{CO}}_{2},pl}-{p}_{{\mathrm{CO}}_{2},g}\right)$ | (A1) |
**Plasma compartment** | |
${V}_{pl}{\alpha}_{{\mathrm{CO}}_{2}}\frac{d{p}_{{\mathrm{CO}}_{2},pl}}{dt}={Q}_{pl}{\alpha}_{{\mathrm{CO}}_{2}}\left({p}_{{\mathrm{CO}}_{2},pl,in}-{p}_{{\mathrm{CO}}_{2},pl}\right)+\frac{{D}_{{\mathrm{CO}}_{2},m}}{{V}_{m,{\mathrm{CO}}_{2}}}\left({p}_{{\mathrm{CO}}_{2},g}-{p}_{{\mathrm{CO}}_{2},pl}\right)+{D}_{{\mathrm{CO}}_{2},rbc}\left({p}_{{\mathrm{CO}}_{2},rbc}-{p}_{{\mathrm{CO}}_{2},pl}\right)+{V}_{pl}{R}_{{\mathrm{HCO}}_{3},pl}$ | (A2) |
${V}_{pl}\frac{d{\left[{\mathrm{HCO}}_{3}\right]}_{pl}}{dt}={Q}_{pl}\left({\left[{\mathrm{HCO}}_{3}\right]}_{pl,in}-{\left[{\mathrm{HCO}}_{3}\right]}_{pl}\right)-{D}_{{\mathrm{HCO}}_{3},rbc}\left({\left[{\mathrm{HCO}}_{3}\right]}_{pl}-\frac{{\left[{\mathrm{HCO}}_{3}\right]}_{rbc}}{r}\right)-{V}_{pl}{R}_{{\mathrm{HCO}}_{3},pl}$ | (A3) |
${V}_{pl}\frac{d{\left[\mathrm{H}\right]}_{pl}}{dt}={Q}_{pl}\left({\left[\mathrm{H}\right]}_{pl,in}-{\left[\mathrm{H}\right]}_{pl}\right)-{V}_{pl}\frac{2.303}{{\beta}_{pl}}{\left[\mathrm{H}\right]}_{pl}{R}_{{\mathrm{HCO}}_{3},pl}$ | (A4) |
${R}_{{\mathrm{HCO}}_{3},pl}=-{k}_{u}{\alpha}_{{\mathrm{CO}}_{2}}{p}_{{\mathrm{CO}}_{2},pl}+\frac{{k}_{v}}{k}{\left[\mathrm{H}\right]}_{pl}{\left[{\mathrm{HCO}}_{3}\right]}_{pl}$ | (A5) |
**Erythrocytes compartment** | |
${V}_{rbc}{\alpha}_{{\mathrm{CO}}_{2}}\frac{d{p}_{{\mathrm{CO}}_{2},rbc}}{dt}={Q}_{rbc}{\alpha}_{{\mathrm{CO}}_{2}}\left({p}_{{\mathrm{CO}}_{2},rbc,in}-{p}_{{\mathrm{CO}}_{2},rbc}\right)-{D}_{{\mathrm{CO}}_{2},rbc}\left({p}_{{\mathrm{CO}}_{2},rbc}-{p}_{{\mathrm{CO}}_{2},pl}\right)+{V}_{rbc}{R}_{{\mathrm{HCO}}_{3},rbc}-{V}_{rbc}\frac{d\mathrm{carb}}{dt}$ | (A6) |
${V}_{rbc}\frac{d{\left[{\mathrm{HCO}}_{3}\right]}_{rbc}}{dt}={Q}_{rbc}\left({\left[{\mathrm{HCO}}_{3}\right]}_{rbc,in}-{\left[{\mathrm{HCO}}_{3}\right]}_{rbc}\right)+{D}_{{\mathrm{HCO}}_{3},rbc}\left({\left[{\mathrm{HCO}}_{3}\right]}_{pl}-\frac{{\left[{\mathrm{HCO}}_{3}\right]}_{rbc}}{r}\right)-{V}_{rbc}{R}_{{\mathrm{HCO}}_{3},rbc}$ | (A7) |
${V}_{rbc}\frac{d{\left[\mathrm{H}\right]}_{rbc}}{dt}={Q}_{rbc}\left({\left[\mathrm{H}\right]}_{rbc,in}-{\left[\mathrm{H}\right]}_{rbc}\right)+{V}_{rbc}\frac{2.303}{{\beta}_{rbc}}{\left[\mathrm{H}\right]}_{rbc}\left(-{R}_{{\mathrm{HCO}}_{3},rbc}+1.5\frac{d\mathrm{carb}}{dt}-0.6\frac{dS}{dt}\right)$ | (A8) |
${V}_{rbc}\frac{d\left[\mathrm{carb}\right]}{dt}={Q}_{rbc}\left({\left[\mathrm{carb}\right]}_{in}-\left[\mathrm{carb}\right]\right)+{k}_{a}{p}_{{\mathrm{CO}}_{2},rbc}{\alpha}_{{\mathrm{CO}}_{2}}{V}_{rbc}\left(\left[\mathrm{Hb}\right]-\left[\mathrm{carb}\right]\right)\left(\frac{{k}_{zo}S}{{k}_{zo}+{\left[\mathrm{H}\right]}_{rbc}}+\frac{{k}_{zr}\left(1-S\right)}{{k}_{zr}+{\left[\mathrm{H}\right]}_{rbc}}\right)-{V}_{rbc}\frac{{k}_{a}\left[\mathrm{carb}\right]{\left[\mathrm{H}\right]}_{rbc}}{{k}_{c}}$ | (A9) |
$\frac{d{\mathrm{pH}}_{virtual}}{dt}=10\left(-{\mathrm{pH}}_{virtual}-\mathrm{lg}\left(r{\left[\mathrm{H}\right]}_{rbc}\right)\right)$ | (A10) |
${R}_{{\mathrm{HCO}}_{3},rbc}=cat\left(-{k}_{u}{\alpha}_{{\mathrm{CO}}_{2}}{p}_{{\mathrm{CO}}_{2},rbc}+\frac{{k}_{v}}{k}{\left[\mathrm{H}\right]}_{rbc}{\left[{\mathrm{HCO}}_{3}\right]}_{rbc}\right)$ | (A11) |
$r=\left(0.058\times {\mathrm{pH}}_{virtual}-0.437\right)S-0.529\times {\mathrm{pH}}_{virtual}+4.6$ | (A12) |
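Equations (A10) and (A12) can be checked numerically with a short sketch. It integrates the first-order lag of (A10) with a forward Euler step, using the erythrocyte hydrogen concentration from Table A3. The saturation value S = 0.7, the starting value, the step size, and the function names are assumptions for illustration only.

```python
import math


def donnan_ratio(ph_virtual: float, s: float) -> float:
    """Ratio r from Equation (A12)."""
    return (0.058 * ph_virtual - 0.437) * s - 0.529 * ph_virtual + 4.6


def dph_virtual_dt(ph_virtual: float, h_rbc: float, s: float) -> float:
    """Right-hand side of Equation (A10): a lag towards -lg(r [H]_rbc)."""
    r = donnan_ratio(ph_virtual, s)
    return 10.0 * (-ph_virtual - math.log10(r * h_rbc))


def settle_ph_virtual(ph0: float, h_rbc: float, s: float,
                      dt: float = 0.01, t_end: float = 2.0) -> float:
    """Forward Euler integration of (A10) with [H]_rbc and S held constant."""
    ph = ph0
    for _ in range(int(round(t_end / dt))):
        ph += dt * dph_virtual_dt(ph, h_rbc, s)
    return ph


if __name__ == "__main__":
    # [H]_rbc = 61e-9 M (Table A3); S = 0.7 is a hypothetical venous value.
    print(round(settle_ph_virtual(7.0, 61e-9, 0.7), 3))
```

Under these assumptions the virtual pH settles close to the initial value of 7.37 listed in Table A3.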

**Table A2.** Model parameters.

Symbol | Parameter | Value | Unit |
---|---|---|---|
${V}_{g}$ | Gas volume | 0.0055 | L |
${V}_{b}$ | Blood volume | 0.0121 | L |
${V}_{pl}$ | Plasma volume | ${V}_{b}\times \left(1-hct\right)$ | L |
${V}_{rbc}$ | Erythrocyte volume | ${V}_{b}\times hct$ | L |
${D}_{{\mathrm{CO}}_{2},m}$ | Membrane diffusion capacity | 5.4 × 10^{−6} | L (mmHg s)^{−1} |
${p}_{bar}$ | Gas pressure | 760 | mmHg |
${\alpha}_{{\mathrm{CO}}_{2}}$ | CO_{2} solubility coefficient | 3.5 × 10^{−5} | M mmHg^{−1} |
${V}_{m,{\mathrm{CO}}_{2}}$ | CO_{2} molar volume | 25.64 | M^{−1} |
${D}_{{\mathrm{CO}}_{2},rbc}$ | Erythrocyte diffusion capacity for CO_{2} | $0.693\frac{{\alpha}_{{\mathrm{CO}}_{2}}}{{\tau}_{rbc}}\frac{{V}_{pl}{V}_{rbc}}{{V}_{pl}+{V}_{rbc}}$ | M L (mmHg s)^{−1} |
${D}_{{\mathrm{HCO}}_{3},rbc}$ | Erythrocyte diffusion capacity for HCO_{3} | $0.693\frac{{\alpha}_{{\mathrm{CO}}_{2}}}{{\tau}_{{\mathrm{HCO}}_{3}}}\frac{{V}_{pl}{V}_{rbc}}{{V}_{pl}+{V}_{rbc}}$ | M L (mmHg s)^{−1} |
${\tau}_{rbc}$ | CO_{2} half-time erythrocyte membrane diffusion | 0.001 | s |
${\tau}_{{\mathrm{HCO}}_{3}}$ | HCO_{3} half-time erythrocyte membrane diffusion | 0.2 | s |
${\beta}_{pl}$ | Buffer capacity plasma | 6 × 10^{−3} | M pH^{−1} |
${\beta}_{rbc}$ | Buffer capacity erythrocytes | 57.7 × 10^{−3} | M pH^{−1} |
${k}_{u}$ | CO_{2} hydration reaction forward constant | 0.12 | s^{−1} |
${k}_{v}$ | CO_{2} hydration reaction reverse constant | 89 | s^{−1} |
$k$ | Carbonic acid dissociation equilibrium constant | 5.5 × 10^{−4} | M |
${k}_{a}$ | Carbamate generation forward constant | 5 × 10^{3} | (M s)^{−1} |
${k}_{zo}$ | Oxygenated hemoglobin ionization constant | 8.4 × 10^{−9} | M |
${k}_{zr}$ | De-oxygenated hemoglobin ionization constant | 7.2 × 10^{−8} | M |
${k}_{c}$ | Carbamate ionization constant | 2.4 × 10^{−5} | - |
$cat$ | Carbonic anhydrase catalytic factor | 13,000 | - |
$\left[\mathrm{Hb}\right]$ | Hemoglobin concentration | 20.7 × 10^{−3} | M |

**Table A3.** Summary of model variables. The initial values for states, actions, and values for model disturbances are shown.

Symbol | Variable | Description | Unit | Value |
---|---|---|---|---|
${p}_{{\mathrm{CO}}_{2},g}$ | Partial pressure CO_{2} gas oxygenator | state | mmHg | 0.695 |
${p}_{{\mathrm{CO}}_{2},g,in}$ | Partial pressure CO_{2} gas venous | disturbance | mmHg | 0 |
${p}_{{\mathrm{CO}}_{2},pl}$ | Partial pressure CO_{2} plasma oxygenator | state, output | mmHg | 43 |
${p}_{{\mathrm{CO}}_{2},pl,in}$ | Partial pressure CO_{2} plasma venous | disturbance | mmHg | 46 |
${p}_{{\mathrm{CO}}_{2},rbc}$ | Partial pressure CO_{2} erythrocyte oxygenator | state | mmHg | 43.1 |
${p}_{{\mathrm{CO}}_{2},rbc,in}$ | Partial pressure CO_{2} erythrocyte venous | disturbance | mmHg | 46 |
${\left[{\mathrm{HCO}}_{3}\right]}_{pl}$ | Bicarbonate concentration plasma oxygenator | state | M | 263 × 10^{−4} |
${\left[{\mathrm{HCO}}_{3}\right]}_{pl,in}$ | Bicarbonate concentration plasma venous | disturbance | M | 263 × 10^{−4} |
${\left[{\mathrm{HCO}}_{3}\right]}_{rbc}$ | Bicarbonate concentration erythrocyte oxygenator | state | M | 183 × 10^{−4} |
${\left[{\mathrm{HCO}}_{3}\right]}_{rbc,in}$ | Bicarbonate concentration erythrocyte venous | disturbance | M | 182 × 10^{−4} |
${\left[\mathrm{H}\right]}_{pl}$ | Hydrogen concentration plasma oxygenator | state | M | 42.3 × 10^{−9} |
${\left[\mathrm{H}\right]}_{pl,in}$ | Hydrogen concentration plasma venous | disturbance | M | 42.3 × 10^{−9} |
${\left[\mathrm{H}\right]}_{rbc}$ | Hydrogen concentration erythrocyte oxygenator | state | M | 61 × 10^{−9} |
${\left[\mathrm{H}\right]}_{rbc,in}$ | Hydrogen concentration erythrocyte venous | disturbance | M | 61 × 10^{−9} |
${\mathrm{pH}}_{virtual}$ | Virtual pH | state | - | 7.37 |
$\left[\mathrm{carb}\right]$ | Carbamate concentration | state | M | 184 × 10^{−5} |
${\left[\mathrm{carb}\right]}_{in}$ | Carbamate concentration venous | disturbance | M | 235 × 10^{−5} |
$S$ | Blood oxygen saturation | measured disturbance | - | - |
$hct$ | Hematocrit | disturbance | - | 0.45 |
${Q}_{g}$ | Sweep gas flow rate | manipulated variable | L/s | - |
${Q}_{b}$ | Blood flow rate | manipulated variable | L/s | - |

**Table A4.** Non-model symbols used in the text.

Symbol | Meaning |
---|---|
CO_{2} | Carbon dioxide |
O_{2} | Dioxygen |
N_{2} | Dinitrogen |
HCO_{3} | Bicarbonate ion |
H | Hydrogen |
pH | Potential of hydrogen |
${[{\mathrm{O}}_{2}]}_{b}$ | Oxygen molar concentration |
${p}_{{\mathrm{O}}_{2},b}$ | Oxygen partial pressure |
$ca{p}_{b}$ | Blood oxygen capacity |
${p}_{{\mathrm{O}}_{2},virtual}$ | Virtual partial pressure of oxygen |
$T$ | Temperature |
${\alpha}_{{\mathrm{O}}_{2}}$ | Oxygen solubility in blood |
${\left[{\mathrm{O}}_{2}\right]}_{b,in}$ | Venous oxygen molar concentration |
${D}_{{\mathrm{O}}_{2},m}$ | Membrane diffusion capacity for oxygen |
${V}_{m,{\mathrm{O}}_{2}}$ | Oxygen molar volume |
${p}_{{\mathrm{O}}_{2},g}$ | Oxygen partial pressure in sweep gas |
${x}^{+}$ | Future state vector |
$f\left(x,u\right)$ | Function of x and u |
$x$ | State vector |
$u$ | Input vector |
$\mathrm{min}$ | Minimizing function |
$N$ | Control horizon number of samples, number of transitions |
$J$ | Cost |
$k$ | Sample index |
$l$ | Cost function |
$\lambda $ | Regularization parameter |
$\pi $ | Policy |
$Q$ | Q-value function |
$t$ | Time |
${a}_{t}$ | Agent action at time t |
${s}_{t}$ | Environment state at time t |
$r$ | Reward |
$\gamma $ | Future reward discount factor |
$\mu $ | Policy function |
$\theta $ | Function parameters (weights of the neural net) |
${Q}^{\prime}$ | Target Q-value function |
$\tau $ | Target function update rate |
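The symbols $\gamma$, ${Q}^{\prime}$, and $\tau$ in the table above correspond to the target-network machinery of actor-critic methods such as those in [25,26]. Assuming a DDPG-style update (the training details are not reproduced in this text), the bootstrapped target and the soft target update can be sketched in plain Python. The function names and numeric values are hypothetical.

```python
from typing import List


def td_target(reward: float, next_q: float, gamma: float, done: bool) -> float:
    """One-step target y = r + gamma * Q'(s', mu'(s')), cut off at episode end."""
    return reward if done else reward + gamma * next_q


def soft_update(target: List[float], online: List[float], tau: float) -> List[float]:
    """Polyak averaging: theta' <- tau * theta + (1 - tau) * theta'."""
    return [tau * w + (1.0 - tau) * w_t for w, w_t in zip(online, target)]


if __name__ == "__main__":
    print(td_target(reward=-1.0, next_q=-10.0, gamma=0.99, done=False))
    print(soft_update(target=[0.0, 1.0], online=[1.0, 1.0], tau=0.005))
```

A small update rate keeps the target function changing slowly, which is the usual motivation for maintaining a separate target Q-value function.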

## References

1. Makdisi, G.; Wang, I.W. Extra Corporeal Membrane Oxygenation (ECMO) review of a lifesaving technology. J. Thorac. Dis. **2015**, 7, E166–E176.
2. Jeffries, R.G.; Lund, L.; Frankowski, B.; Federspiel, W.J. An extracorporeal carbon dioxide removal (ECCO_{2}R) device operating at hemodialysis blood flow rates. Intensive Care Med. Exp. **2017**, 5, 41.
3. Kneyber, M.C. Mechanical ventilation during extra-corporeal membrane oxygenation: More questions than answers. Minerva Anestesiol. **2019**, 85, 91–92.
4. Ficial, B.; Vasques, F.; Zhang, J.; Whebell, S.; Slattery, M.; Lamas, T.; Daly, K.; Camporota, N.A. Physiological basis of extracorporeal membrane oxygenation and extracorporeal carbon dioxide removal in respiratory failure. Membranes **2021**, 11, 225.
5. Teber, O.O.; Altinay, A.D.; Mehrabani, S.A.N.; Tasdemir, R.S.; Zeytuncu, B.; Genceli, E.A.; Dulekgurgen, E.; Pekkan, K.; Koyuncu, İ. Polymeric hollow fiber membrane oxygenators as artificial lungs: A review. Biochem. Eng. J. **2022**, 180, 108340.
6. Bouchez, S.; de Somer, F. The evolving role of the modern perfusionist: Insights from transesophageal echocardiography. Perfusion **2021**, 36, 222–232.
7. Boeken, U.; Ensminger, S.; Assmann, A.; Schmid, C.; Werdan, G.; Michels, O.; Miera, F.; Schmidt, S.; Klotz, C.; Starck, K. Einsatz der extrakorporalen Zirkulation (ECLS/ECMO) bei Herz- und Kreislaufversagen. Kardiologe **2021**, 15, 526–535.
8. Utley, J.R. Techniques for avoiding neurologic injury during adult cardiac surgery. J. Cardiothorac. Vasc. Anesth. **1996**, 10, 38–44.
9. Misgeld, B.J.E.; Werner, J.; Hexamer, M. Simultaneous automatic control of oxygen and carbon dioxide blood gases during cardiopulmonary bypass. Artif. Organs **2010**, 34, 503–512.
10. Manap, H.H.; Wahab, A.K.A.; Zuki, F.M. Control for Carbon Dioxide Exchange Process in a Membrane Oxygenator Using Online Self-Tuning Fuzzy-PID Controller. Biomed. Signal Process. Control **2021**, 64, 102300.
11. Sadati, S.J.; Noei, A.R.; Ghaderi, R. Fractional-order control of a nonlinear time-delay system: Case study in oxygen regulation in the heart-lung machine. J. Control Sci. Eng. **2012**, 2012, 478346.
12. Allen, J.; Fisher, A.C.; Gaylor, J.D.S.; Razieh, A.R. Development of a digital adaptive control system for PO2 regulation in a membrane oxygenator. J. Biomed. Eng. **1992**, 14, 404–411.
13. Hill, E.P.; Power, G.G.; Longo, L.D. A mathematical model of carbon dioxide transfer in the placenta and its interaction with oxygen. Am. J. Physiol. **1973**, 224, 283–299.
14. Lukitsch, B.; Ecker, P.; Elenkov, M.; Janeczek, C.; Haddadi, B.; Jordan, C.; Krenn, C.; Ulrich, R.; Gfoehler, M.; Harasek, M. Computation of global and local mass transfer in hollow fiber membrane modules. Sustainability **2020**, 12, 2207.
15. Klocke, R.A. Velocity of CO_{2} exchange in blood. Annu. Rev. Physiol. **1988**, 50, 625–637.
16. Hexamer, M.; Werner, J. A mathematical model for the gas transfer in an oxygenator. IFAC Proc. Vol. **2003**, 36, 409–414.
17. Misgeld, B.J.E. Automatic Control of the Heart-Lung Machine. Ph.D. Thesis, Ruhr-Universität Bochum, Universitätsbibliothek, Bochum, Germany, 2007.
18. Manap, H.H.; Wahab, A.K.A.; Zuki, F.M. Mathematical Modelling of Carbon Dioxide Exchange in Hollow Fiber Membrane Oxygenator. IOP Conf. Ser. Mater. Sci. Eng. **2017**, 210, 012003.
19. Shampine, L.F.; Reichelt, M.W. The MATLAB ODE Suite. SIAM J. Sci. Comput. **1997**, 18, 1–22.
20. Lukitsch, B.; Ecker, P.; Elenkov, M.; Janeczek, C.; Jordan, C.; Krenn, C.G.; Ulrich, R.; Gfoehler, M.; Harasek, M. Suitable CO_{2} solubility models for determination of the CO_{2} removal performance of oxygenators. Bioengineering **2021**, 8, 1–25.
21. Harasek, M.; Elenkov, M.; Lukitsch, B.; Ecker, P.; Janeczek, C.; Gfoehler, M. Design of Control Strategies for the CO_{2} Removal from Blood with an Intracorporeal Membrane Device. In Proceedings of the 11th Biomedical Engineering International Conference (BMEiCON), Chiang Mai, Thailand, 21–24 November 2018; pp. 1–4.
22. Ziegler, J.G.; Nichols, N.B. Optimum settings for automatic controllers. J. Dyn. Syst. Meas. Control. Trans. ASME **1993**, 115, 220–222.
23. Grüne, L.; Pannek, J. Nonlinear Model Predictive Control Theory and Algorithms; Springer: Berlin/Heidelberg, Germany, 2017.
24. Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction, 2nd ed.; The MIT Press: Cambridge, MA, USA, 2018.
25. Lillicrap, T.P.; Hunt, J.J.; Pritzel, A.; Heess, N.; Erez, T.; Tassa, Y.; Silver, D.; Wierstra, D. Continuous control with deep reinforcement learning. arXiv **2015**, arXiv:1509.02971.
26. Fujimoto, S.; van Hoof, H.; Meger, D. Addressing Function Approximation Error in Actor-Critic Methods. In Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; Volume 4, pp. 2587–2601.
27. François-Lavet, V.; Henderson, P.; Islam, R.; Bellemare, M.G.; Pineau, J. An Introduction to Deep Reinforcement Learning. In Foundations and Trends® in Machine Learning; Mike Casey: University of California, Berkeley, CA, USA, 2018; Volume II, pp. 1–140.
28. Watkins, C.J.C.H.; Dayan, P. Q-learning. Mach. Learn. **1992**, 8, 279–292.
29. Dulac-Arnold, G.; Mankowitz, D.; Hester, T. Challenges of Real-World Reinforcement Learning. arXiv **2019**, arXiv:1904.12901.
30. Byrd, R.H.; Gilbert, J.C.; Nocedal, J. A trust region method based on interior point techniques for nonlinear programming. Math. Program. Ser. B **2000**, 89, 149–185.
31. Waltz, R.A.; Morales, J.L.; Nocedal, J.; Orban, D. An interior algorithm for nonlinear optimization that combines line search and trust region steps. Math. Program. **2006**, 107, 391–408.
32. Rizvi, A.S.A.; Pertzborn, A.J.; Lin, Z. Reinforcement Learning Based Optimal Tracking Control Under Unmeasurable Disturbances with Application to HVAC Systems. IEEE Trans. Neural Netw. Learn. Syst. **2021**, 1–11.
33. Amodei, D.; Olah, C.; Steinhardt, J.; Christiano, P.; Schulman, J.; Mané, D. Concrete Problems in AI Safety. arXiv **2016**, arXiv:1606.06565.
34. Potra, F.A.; Wright, S.J. Interior-point methods. J. Comput. Appl. Math. **2000**, 124, 281–302.
35. Mills, N.L.; Ochsner, J.L. Massive air embolism during cardiopulmonary bypass. Causes, prevention, and management. J. Thorac. Cardiovasc. Surg. **1980**, 80, 708–717.
36. Van Hasselt, H.; Doron, Y.; Strub, F.; Hessel, M.; Sonnerat, N.; Modayil, J. Deep Reinforcement Learning and the Deadly Triad. arXiv **2018**, arXiv:1812.02648.
37. Elenkov, M.; Lukitsch, B.; Ecker, P.; Janeczek, C.; Harasek, M.; Gföhler, M. Non-parametric dynamical estimation of blood flow rate, pressure difference and viscosity for a miniaturized blood pump. Int. J. Artif. Organs **2022**, 45, 207–215.

**Figure 1.** An overview of the system and controllers: (**a**) A schematic representation of the compartmental model of the membrane oxygenator; (**b**) A drawing of the physical system. The catheter (membrane oxygenator and blood pump) is inserted in the vena cava superior. The extracorporeal part of the device, comprising the controller and a compressor, is worn by the patient; (**c**) A schematic representation of the PI controller; (**d**) A schematic representation of the NMPC controller; (**e**) A schematic representation of the deep RL controller.

**Figure 2.** Statistics of the results of the experiments: (**a**) the distributions for the control action and reference tracking error for each controller; (**b**) distributions for the reference signal and the disturbance signal.

**Figure 3.** Results of the control experiments. The top row shows the setpoints and the controlled variable, ${p}_{{\mathrm{CO}}_{2},pl}$, for the three controllers. The bottom row shows the actions taken by each controller, sweep flow rate (SFR) and blood flow rate (BFR), for a segment of the reference tracking task. The disturbance $S$ is also shown.

**Figure 4.** Results of the RL controller training for n = 1, 5, and 10. The agents with n = 1 and 10 showed diverging behavior.

**Table 1.** Results of the control experiments. The PI, NMPC, and RL controllers were assessed based on rise time (RT), settling time (ST), root-mean-squared error (RMSE), action power P, and action standard deviation σ.

Metric | PI | NMPC | RL |
---|---|---|---|
RT [s] | 1.18 | 1.33 | 4.9 |
ST [s] | 2.24 | 3.57 | 7.6 |
RMSE [mmHg] | 1.06 | 1.09 | 1.37 |
${P}_{{Q}_{g}}$ [L^{2} s^{−2}] | 6.7 × 10^{−3} | 4.1 × 10^{−3} | 8 × 10^{−3} |
${P}_{{Q}_{b}}$ [L^{2} s^{−2}] | 2.2 × 10^{−3} | 1.4 × 10^{−3} | 2 × 10^{−3} |
${\sigma}_{{Q}_{g}}$ [mL s^{−1}] | 6.53 | 6.53 | 0.03 |
${\sigma}_{{Q}_{b}}$ [mL s^{−1}] | 6.3 | 5.05 | 5.95 |
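Metrics of the kind reported in Table 1 can be computed from logged signals along the following lines. Because the formal definitions are not reproduced in this text, the sketch assumes common conventions: RMSE as the root of the mean squared tracking error, action power as the mean squared action (consistent with the unit L^{2} s^{−2}), action standard deviation as the population standard deviation, and rise time as the 10% to 90% transition of a step response. All function names and the sample step response are hypothetical.

```python
import math
from statistics import fmean, pstdev
from typing import Optional, Sequence


def rmse(y: Sequence[float], ref: Sequence[float]) -> float:
    """Root-mean-squared tracking error (same unit as y, e.g. mmHg)."""
    return math.sqrt(fmean([(a - b) ** 2 for a, b in zip(y, ref)]))


def action_power(u: Sequence[float]) -> float:
    """Mean squared action; for u in L/s the unit is L^2 s^-2."""
    return fmean([a * a for a in u])


def action_std(u: Sequence[float]) -> float:
    """Population standard deviation of the action signal."""
    return pstdev(u)


def rise_time(t: Sequence[float], y: Sequence[float],
              y_start: float, y_final: float) -> Optional[float]:
    """Time between the first 10% and the first 90% crossing of a step."""
    direction = 1.0 if y_final >= y_start else -1.0
    span = y_final - y_start

    def first_crossing(fraction: float) -> Optional[float]:
        level = y_start + fraction * span
        for ti, yi in zip(t, y):
            if direction * (yi - level) >= 0.0:
                return ti
        return None

    t10, t90 = first_crossing(0.1), first_crossing(0.9)
    return None if t10 is None or t90 is None else t90 - t10


if __name__ == "__main__":
    # Hypothetical first-order response from 43 to 40 mmHg, time constant 1 s.
    t = [i * 0.01 for i in range(1001)]
    y = [43.0 - 3.0 * (1.0 - math.exp(-ti)) for ti in t]
    print(rise_time(t, y, 43.0, 40.0))
    print(rmse(y, [40.0] * len(y)))
```

Settling time is omitted here because it additionally depends on the tolerance band chosen around the setpoint.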

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Elenkov, M.; Lukitsch, B.; Ecker, P.; Janeczek, C.; Harasek, M.; Gföhler, M.
Evaluation of Different Control Algorithms for Carbon Dioxide Removal with Membrane Oxygenators. *Appl. Sci.* **2022**, *12*, 11890.
https://doi.org/10.3390/app122311890
