3. Materials and Methods
This retrospective, single-center study used archived vital sign data from pediatric patients (0–19 years of age) supported on ECMO and admitted between November 2019 and June 2023. Patients were selected using the inclusion and exclusion criteria (Supplementary Materials, File S1) of a larger NIH-funded study exploring the use of high-frequency data to predict neurologic injury in this population. The University of Texas Southwestern institutional review board approved the study (STU 2021-0095). Data were collected from a maximum of 24 h prior to ECMO initiation through the entire ECMO course. The final dataset included 78 patients, with data spanning up to 15 days after cannulation onto ECMO.
Blood pressure was the sole vital sign analyzed in this study, as it was the most predictive of adverse outcomes. All included patients had continuous invasive blood pressure monitors in place. Continuous 5 s measurements were exported from the Etiometry platform (Boston, MA, USA). Both continuous MAP and EHR MAP values were derived from the invasive arterial blood pressure monitor. The EHR MAP field is auto-populated from the invasive arterial-line monitor with nurse verification; nurses may also manually enter MAP values per clinical practice. Median sampling was used to avoid smoothing of values, which could prevent the detection of hypotension or hypertension events.
Data cleaning included noise reduction of the mean arterial pressure (MAP) data using a three-stage filtering process. First, we excluded values outside the plausible range of 20–180 mmHg. Second, we excluded measurements exhibiting implausible change patterns, defined as transient step changes of >10 mmHg over 5 s that were sustained for less than 300 s. Such changes often occur when the arterial line is accessed for lab draws. Finally, we excluded sequences of <200 s surrounded by missingness and then forward-filled the remaining gaps; we did not impute deleted values.
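The three filtering stages can be sketched as follows. This is a minimal illustration and not the authors' released code: the function name `clean_map`, the use of `None` for missing samples, and the exact reading of a "transient step change" (an excursion that returns to within 10 mmHg of the pre-jump value in under 300 s) are assumptions made for the example. Samples deleted by an earlier stage are also assumed to count as missingness in the third stage.

```python
from typing import List, Optional

STEP_S = 5                 # sampling cadence of the exported data (s)
LOW, HIGH = 20.0, 180.0    # plausible MAP range (mmHg)
JUMP = 10.0                # step-change size treated as implausible (mmHg)
MAX_TRANSIENT_S = 300      # step changes shorter than this are dropped (s)
MIN_RUN_S = 200            # isolated runs shorter than this are dropped (s)


def clean_map(raw: List[Optional[float]]) -> List[Optional[float]]:
    """Return a cleaned copy of a 5 s MAP series; None marks a missing sample."""
    n = len(raw)
    vals = list(raw)
    deleted = [False] * n  # samples removed by a filter are never re-imputed

    # Stage 1: drop values outside the plausible range.
    for i, v in enumerate(vals):
        if v is not None and not (LOW <= v <= HIGH):
            vals[i], deleted[i] = None, True

    # Stage 2 (assumed reading): after a >10 mmHg jump between consecutive
    # samples, drop the excursion if the signal returns to within 10 mmHg of
    # the pre-jump value in under 300 s.
    max_len = MAX_TRANSIENT_S // STEP_S
    i = 1
    while i < n:
        prev, cur = vals[i - 1], vals[i]
        if prev is None or cur is None or abs(cur - prev) <= JUMP:
            i += 1
            continue
        j = i
        while j < n and vals[j] is not None and abs(vals[j] - prev) > JUMP:
            j += 1
        returned = j < n and vals[j] is not None
        if returned and (j - i) < max_len:
            for k in range(i, j):
                vals[k], deleted[k] = None, True
        i = j

    # Stage 3: drop runs shorter than 200 s with missing data on both sides.
    min_len = MIN_RUN_S // STEP_S
    i = 0
    while i < n:
        if vals[i] is None:
            i += 1
            continue
        j = i
        while j < n and vals[j] is not None:
            j += 1
        if i > 0 and j < n and (j - i) < min_len:
            for k in range(i, j):
                vals[k], deleted[k] = None, True
        i = j

    # Forward-fill gaps that were missing in the raw data; deleted samples
    # are left as None.
    last = None
    for i in range(n):
        if vals[i] is not None:
            last = vals[i]
        elif not deleted[i]:
            vals[i] = last
    return vals
```

In this sketch, a 15 s spike to 95 mmHg within an otherwise stable 70 mmHg trace is removed and left missing, whereas a 10 s gap in the raw export is filled with the last valid value.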
Recognizing the wide age-based variability in blood pressure, MAP values were age-standardized using inpatient MAP percentiles [1,12]. Hypotension and hypertension were quantified using the following metrics:
Event Frequency: The count of non-overlapping 3 min windows per patient in which every resampled MAP value crossed the threshold; this definition was pre-specified to examine how coarser sampling (fewer points per window) inflates/deflates counts.
Burden: The normalized area under or over the MAP curve for a given threshold, calculated using the trapezoid method and expressed in mmHg-seconds. This quantity was then divided by the total duration of ECMO in seconds. The resulting burden metric is an average excursion in mmHg, normalized by ECMO duration.
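The event frequency definition above can be illustrated with a short sketch. The helper names `resample_median` and `count_events` are hypothetical, and the sketch only covers sampling intervals that divide the 3 min window evenly, because the text does not specify how windows are handled at coarser intervals. Windows containing a missing value, and any incomplete trailing window, are assumed not to count.

```python
from statistics import median
from typing import List, Optional

WINDOW_S = 180  # 3 min event window from the text


def resample_median(values: List[Optional[float]], step_s: int,
                    interval_s: int) -> List[Optional[float]]:
    """Collapse consecutive blocks of interval_s seconds into their median."""
    if interval_s % step_s:
        raise ValueError("interval must be a multiple of the base cadence")
    k = interval_s // step_s
    out = []
    for start in range(0, len(values), k):
        block = [v for v in values[start:start + k] if v is not None]
        out.append(median(block) if block else None)
    return out


def count_events(resampled: List[Optional[float]], interval_s: int,
                 threshold: float, below: bool = True) -> int:
    """Count non-overlapping 3 min windows where every point crosses threshold."""
    if interval_s > WINDOW_S or WINDOW_S % interval_s:
        raise ValueError("sketch only covers intervals that divide 180 s")
    per_window = WINDOW_S // interval_s
    events = 0
    for start in range(0, len(resampled) - per_window + 1, per_window):
        window = resampled[start:start + per_window]
        if any(v is None for v in window):
            continue  # assumed: windows with missing data are not counted
        if all((v < threshold) if below else (v > threshold) for v in window):
            events += 1
    return events


# Illustrative 9 min trace at 5 s cadence (values in mmHg, threshold 55):
# window 1 is entirely low, window 2 is mostly low, window 3 is normal.
trace = [50.0] * 36 + ([50.0] * 7 + [60.0] * 5) * 3 + [70.0] * 36
for interval in (5, 60):
    series = resample_median(trace, 5, interval)
    print(interval, count_events(series, interval, 55.0))
```

In this constructed trace the second window contains a few values above the threshold, so it is not an event at 5 s resolution, but each of its 1 min medians falls below the threshold and it is counted at 60 s resolution.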
The event frequency and burden were chosen to align with previously published work [13].
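As a rough illustration of the burden metric, the sketch below integrates the excursion depth beyond a threshold with the trapezoid rule and divides by the time spanned by the samples. The function name `burden` is hypothetical, the series is assumed to be regularly sampled with no missing values, and clipping the depth at zero before integrating is a simplification that does not locate the exact threshold-crossing time between samples.

```python
from typing import List


def burden(values: List[float], step_s: float, threshold: float,
           below: bool = True) -> float:
    """Average excursion beyond the threshold (mmHg), normalized by duration."""
    if below:
        depths = [max(threshold - v, 0.0) for v in values]
    else:
        depths = [max(v - threshold, 0.0) for v in values]
    # Trapezoid rule on the excursion depth gives an area in mmHg-seconds.
    area = sum((a + b) / 2.0 * step_s for a, b in zip(depths, depths[1:]))
    duration = step_s * (len(values) - 1)  # seconds spanned by the samples
    return area / duration if duration > 0 else 0.0


# Four samples at 5 s cadence with a hypotension threshold of 55 mmHg:
# depths are 0, 5, 5, 0 mmHg, so the area is 50 mmHg-seconds over 15 s.
print(burden([60.0, 50.0, 50.0, 60.0], 5.0, 55.0))
```

Dividing the area by the duration is what turns a quantity in mmHg-seconds into an average excursion in mmHg, which keeps patients with ECMO runs of different lengths comparable.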
We considered three threshold pairs to define hypotension and hypertension based on age-specific percentiles: below the 5th/above the 95th, below the 10th/above the 90th, and below the 25th/above the 75th percentiles. Primary endpoint: The difference in per-patient event frequency across sampling intervals vs. continuous 5 s data at the 10th/90th percentiles. Secondary endpoints: (i) the difference in burden (mmHg, normalized by ECMO duration) across intervals; (ii) the agreement with EHR-recorded values; (iii) sensitivity analyses at the 5th/95th and 25th/75th thresholds [14,15].
To evaluate the effect of sampling frequency and method on the accuracy of event detection, we resampled the continuous data using median values across seven sampling frequencies (5 s to 1 h). These resampled data were then compared to irregularly charted EHR-derived values as a measure of routine documentation consistent with current standard clinical practice. The Wilcoxon signed-rank test was used to compare events and burden at each interval to both the EHR and 5 s data, and Bland–Altman plots were generated to display the data distribution.
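For readers unfamiliar with Bland–Altman analysis, the sketch below computes the quantities such a plot displays for paired per-patient values (for example, event counts from 5 s data and from the EHR). The function name `bland_altman` and the sample numbers are illustrative only, and the 1.96 standard deviation limits of agreement follow the conventional definition rather than anything stated in the text.

```python
from statistics import mean, stdev
from typing import List, Tuple


def bland_altman(a: List[float], b: List[float]) -> Tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) for paired measurements a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)          # mean difference between the two methods
    spread = 1.96 * stdev(diffs)  # conventional 95% limits of agreement
    return bias, bias - spread, bias + spread


# Made-up paired counts for four patients (method A vs. method B).
bias, lower, upper = bland_altman([10.0, 12.0, 8.0, 15.0], [8.0, 11.0, 9.0, 12.0])
print(bias, lower, upper)
```

The paired significance test itself would normally come from a statistics library; `scipy.stats.wilcoxon`, for instance, accepts the same two paired sequences.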
4. Results
Table 1 summarizes the cohort demographics. The median age of patients was 239.5 days; 48% were children (1–19 years) and 41% were neonates (<28 days). In total, 51% were male, and the median weight was 8 kg. V-A ECMO was the most common ECMO type (64% of patients), and the majority of patients were placed on ECMO for a respiratory indication. The median ECMO duration was 124.1 h, and 85% of patients survived to hospital discharge.
The number of data points at each sampling frequency and in the EHR is shown in Table 2. The median percentage of data that was forward-filled per patient was 0.71%. The median gap length prior to forward-fill was 30 min (IQR 25). The data volume fell from 11,432,379 points at 5 s sampling to 63,717 at 15 min (–99.4%); the EHR contained 29,167 points in total (mean 373.9 per patient). The median interval between EHR-sampled data points was 1800 s (30 min), with an IQR of 3180 s.
The median per-patient frequency of hypotension/hypertension windows (10th/90th percentiles) is summarized in Table 3. Hypotension events declined from 12,044 at 5 min to 3803 at 15 min (–68.4%), with the EHR-derived total at 3471. Similarly, mean events per patient declined from 154.4 (5 min) to 48.8 (15 min). EHR-derived data had a mean of 44.50 hypotensive events per patient.
Hypertensive events, defined as events over the 90th percentile for age, similarly changed notably from 5 to 15 min, with 1573 total events at 5 min but only 410 events at a sampling frequency of 15 min (Table 3). Likewise, a mean of 20.17 events per patient was found when sampling at 5 min, but only 5.26 when sampling was reduced to every 15 min. By comparison, EHR-derived data had 1587 total hypertensive events, or 20.35 on average per patient. Results when defining hypo- and hypertension at the 5th and 95th percentiles, as well as the 25th and 75th percentiles, can be found within the supplement (Supplementary Materials, Tables S1 and S2).
Table 3 summarizes statistics about hypotension and hypertension events in 78 patients on ECMO. Events were calculated for each patient using the median blood pressure (i.e., median MAP) measurement across different interval sizes, from continuous data resampled every 5 s to every hour; the total events are the sum across all patients. For each such frequency, the columns represent the total number of events aggregated over all patients and the mean, the median, and the interquartile range of the number of events per patient in the cohort. EHR-derived data were automatically entered into the medical record per standard practice and verified by nursing staff. Each hypotension (or hypertension) event is defined as a 3 min period where the blood pressure falls below the 10th percentile (or exceeds the 90th percentile) value for the patient’s age group. Inferential statistics from the Wilcoxon signed-rank test are shown comparing 5 s and EHR data to other sampling intervals.
The overall burden of hypo- and hypertension, as defined by the 10th and 90th percentiles, is displayed in Table 4. The total hypotension burden was more similar to EHR-derived data, with expected decreases as sampling became more infrequent. Conversely, the hypertensive burden was higher within the EHR-derived data than with high-frequency data sampling across all sample rates. Despite the high event counts, the normalized hypotension burden remained modest, indicating that most excursions were short or shallow. Data regarding the burden when defined by the 5th and 95th as well as the 25th and 75th percentiles can be found within the supplement (Supplementary Materials, Tables S3 and S4). In addition, we compared the proportion of hypo- and hypertension at the 10th/90th percentiles between the 5 s data and the EHR. The median proportion of hypotension was 10.4% for the 5 s data, compared to 13.5% for the EHR data. The median proportion of hypertension was 2.2% for the 5 s data and 4.4% for the EHR data.
Table 4 summarizes statistics about the hypotensive and hypertensive burden in 78 patients on ECMO. Burdens are normalized (average excursion magnitude, mmHg), with higher values indicating more severe and/or more prolonged excursions. The total burden is the sum across all patients, while the mean and median were calculated for each patient using the median blood pressure (i.e., median MAP) measurement across different interval sizes, from continuous data resampled every 5 s to every hour. For each such frequency, the columns represent the total hypotensive and hypertensive burden over all patients and the mean, the median, and the interquartile range of the patient-wise area in the cohort. EHR-derived data were automatically entered into the medical record per standard practice and verified by nursing staff. Hypotensive (or hypertensive) burden is defined as the area of the blood pressure curve (in mmHg-seconds) where its values fall below the 10th percentile (or exceed the 90th percentile) value for the patient’s age group, normalized by the total time in seconds spanned by the observations for that patient. Inferential statistics from the Wilcoxon signed-rank test are shown comparing 5 s and EHR data to other sampling intervals.
Figure 1 and Figure 2 show two illustrative patients; green markers denote EHR-recorded values across sampling strategies. The shaded area under the curve represents the time spent with hypo- or hypertension as defined by the 10th and 90th percentiles. The figures are artificially limited to 240 min pre- and post-ECMO cannulation for ease of comparison. In addition, we calculated the variance for each individual patient across sampling intervals, shown in box plot format in Figure 3. To evaluate individual-level differences across sampling frequencies and the EHR-derived data, we have included patient-level heatmaps of hypo- and hypertension at the 10th/90th percentiles within the supplement (Supplementary Materials, Figures S1 and S2). Bland–Altman plots comparing 5 s and EHR sampling intervals are also found within the supplement.
5. Discussion
This single-center arterial-line study of 78 pediatric ECMO patients demonstrates that observation process bias in EHR charting suppresses brief excursions and inflates burden estimates during prolonged instability. Irregular EHR-derived charting intervals constitute a source of label error: they suppress the detection of brief instability and distort the time-integrated burden. In CDS and ML pipelines, this alters thresholds, inflates false-negative rates, and contributes to external validation failures [16]. Paradoxically, detection briefly rises at 30 s and 1 min intervals because fewer points per 3 min window must all breach thresholds. With fewer points per window, the criterion is statistically easier to satisfy, briefly inflating counts before they decline again at longer intervals. As our comparative analyses reveal, the event detection and total burden of abnormal blood pressure, defined using the AUC, drop sharply when continuous data are regularly resampled at intervals above 15 min [17]. Additionally, standard EHR sampling underestimates the total number of blood pressure excursions. Conversely, the burden of hypo- and hypertension was overestimated in EHR sampling.
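The brief rise in detection at 30 s and 1 min intervals can be made concrete with a toy calculation. It assumes, purely for illustration, that each resampled point breaches the threshold independently with the same probability `p`; real MAP samples are autocorrelated and median resampling changes the per-point probability, so the numbers indicate only the direction of the effect. The function name `prob_all_breach` is hypothetical.

```python
WINDOW_S = 180  # 3 min event window from the text


def prob_all_breach(p: float, interval_s: int) -> float:
    """Chance that every point in one 3 min window breaches the threshold."""
    points = WINDOW_S // interval_s  # 36 points at 5 s, 6 at 30 s, 3 at 60 s
    return p ** points


# With a 90% per-point breach probability, the all-points criterion becomes
# easier to satisfy as the number of points per window shrinks.
for interval in (5, 30, 60):
    print(interval, round(prob_all_breach(0.9, interval), 3))
```

Under this assumption, a window that is 90% likely to breach at any single point satisfies the all-points rule roughly 2% of the time at 5 s sampling, 53% at 30 s, and 73% at 1 min.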
The specific reason that EHR sampling overestimates burden compared to 5 s sampling is unknown. We hypothesize that these results reflect a bias towards oversampling and EHR reporting of abnormal vital signs during prolonged clinical instability, combined with a poor ability to detect shorter episodes of clinically significant changes in blood pressure. As seen in Figure 3, higher-frequency sampling allows for a greater detection of variance from the mean blood pressure in our cohort of pediatric ECMO patients. Furthermore, inferential analysis showed that EHR data differed significantly from resampled data at intervals of 15 min and longer for hypotension burden, and at all sampling intervals for hypertension burden. These discrepancies highlight that a reliance on sparse data risks masking short-lived but clinically meaningful changes, potentially undermining the deployment of predictive algorithms and AI tools in time-sensitive scenarios, while also risking confounding from the variable sampling of vital signs caused by human input into the EHR [18]. For CDS, excursion rules and burden thresholds should specify a maximum physiologic sampling interval (≤5 min; continuous preferred); when the cadence is sparser, logic should add confirmation windows and display confidence qualifiers, and counts/burdens should be interpreted as lower bounds.
These findings are consistent with a broader body of research demonstrating that conventional EHR-based documentation often fails to reflect the magnitude or duration of physiologic shifts in critically ill populations [1]. Such shortcomings have been documented in both adult and pediatric intensive care settings, with transient episodes of hypotension, arrhythmia, or desaturation frequently overlooked. Even with an arterial line streaming beat-to-beat data, retrospective EHR snapshots flatten the signal into sporadic points. Without this continuity, clinicians and CDSSs alike cannot quantify cumulative hypotensive burden, leaving the kidneys, brain, and myocardium exposed to an unnoticed ischemic debt.
Sparse sampling constrains downstream research and AI/ML development because pivotal vital sign patterns may be missing from training data. The available literature increasingly supports a more granular, continuous-monitoring paradigm to mitigate this critical information gap and improve AI/ML algorithm deployment and reliability at the bedside.
A primary methodological gap is the lack of consensus on optimal sampling intervals that balance clinical feasibility with data fidelity. Our systematic comparison of intervals from sub-minute to hourly offers a practical guide: each incremental jump in sampling period leads to a disproportionately higher number of missed events, underscoring the fragility of low-resolution data. Our interval comparison demonstrates a steep loss of event detection beyond 5–15 min in this cohort of pediatric ECMO patients. This builds upon prior work demonstrating the importance of <5 min sampling in other settings [1,9,14]. Closely related is the knowledge gap regarding the role of such brief but consequential episodes in disease trajectories. Historically, hourly or bi-hourly charting intervals were considered sufficient for most clinical documentation, yet our findings suggest that clinically relevant hemodynamic events can emerge and resolve within shorter intervals, necessitating a rethink of standard clinical practice. Additionally, studies seeking to evaluate the relationships between blood pressure and clinical outcomes should ideally use the highest-frequency input possible, with a sampling interval of no greater than 5 min.
For predictive analytics and machine learning models in critical care, high-resolution data can substantially enhance both sensitivity and specificity. Many existing algorithms, particularly those predicting events such as sepsis onset or acute hemodynamic decompensation, are trained on EHR datasets that significantly under-sample key vital signs. Our results indicate that such models may fail external validation or yield inflated false-negative rates simply because essential features, such as short-term abnormalities in mean arterial pressure, are absent from the data. This under-sampling may explain why several models derived from EHR data have performed poorly when externally validated or when deployed prospectively [10]. By contrast, integrating sub-minute sampling from continuous data aggregation systems unlocks a richer feature space, enabling a more precise detection of early physiologic deterioration. Ultimately, this could translate into timelier interventions and better outcomes. Sampling less frequently than every 5 min or simply relying on EHR-derived data, especially in a critically ill population, is highly likely to miss critical events and limit the reliability of any training or validation data.
Despite demonstrating the value of high-frequency sampling, our study has several limitations. First, it was conducted retrospectively at a single center, limiting its direct generalizability across different hospital systems, patient populations, and monitoring technologies. Second, our high-frequency data required rigorous filtering and imputation; errors in raw data from sensor noise or transient disconnects could skew the results of a study using only high-frequency data without filtering, although such filtering could be performed by a machine learning algorithm in future work. Third, we focused on blood pressure as a sentinel vital sign, but other parameters like heart rate variability or oxygen saturation may show different patterns and require tailored sampling strategies. Notwithstanding these caveats, our work offers a valuable roadmap for researchers aiming to refine data collection protocols and analytics pipelines for the next generation of clinical predictive algorithms.
Moving forward, multi-center validation studies are warranted to ascertain whether the threshold frequencies identified here hold true under varied clinical conditions, including varying patient acuity, monitoring modalities, and patient demographics. Additionally, prospective trials that integrate real-time streaming data into automated decision support tools could establish whether more granular sampling demonstrably improves the timely recognition of deterioration. Future work should also explore advanced modeling techniques, such as recurrent neural networks or graph-based learning, that inherently account for the temporal structure and complex interactions of continuous physiologic signals. The ultimate goal is to develop adaptive systems that can automatically adjust sampling frequency in high-risk states, achieving a balance between data richness and clinician workflow constraints.
Author Contributions
Conceptualization, E.S., N.S., D.R.B. and L.R.; formal analysis, S.M., J.S., P.R., D.R.B. and R.S.; analytical tool development, R.S. and S.M.; manuscript drafting, N.S. and J.S.; statistical oversight, S.N. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by NIH grant numbers R01NS1331142 and R01NS122119. The funder had no role in the study design, data collection, analysis, interpretation, or writing of the manuscript. This research was supported in part by the computational resources provided by the BioHPC supercomputing facility located in the Lyda Hill Department of Bioinformatics, UT Southwestern Medical Center, TX. URL: https://portal.biohpc.swmed.edu.
Institutional Review Board Statement
Institutional Review Board approval was obtained (IRB STU 2021-0095) from the University of Texas Southwestern on 26 April 2023.
Informed Consent Statement
Informed consent for participation was not required, as per the University of Texas Southwestern institutional review board’s approval.
Data Availability Statement
Code for resampling, filtering, event-window detection, and burden calculations is openly available at https://github.com/s-ranveer/ecmo_hypo_hypertension (accessed on 14 January 2026). The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.
Conflicts of Interest
The authors declare no conflicts of interest.
Abbreviations
EHRs, Electronic Health Records; ECMO, Extracorporeal Membrane Oxygenation; MAP, Mean Arterial Pressure.
References
- Maslove, D.M.; Dubin, J.A.; Shrivats, A.; Lee, J. Errors, omissions, and outliers in hourly vital signs measurements in intensive care. Crit. Care Med. 2016, 44, e1021–e1030.
- Maslove, D.M.; Lamontagne, F.; Marshall, J.C.; Heyland, D.K. A path to precision in the ICU. Crit. Care 2017, 21, 79.
- Sakata, K.K.; Stephenson, L.S.; Mulanax, A.; Bierman, J.; Mcgrath, K.; Scholl, G.; McDougal, A.; Bearden, D.T.; Mohan, V.; Gold, J.A. Professional and interprofessional differences in electronic health records use and recognition of safety issues in critically ill patients. J. Interprof. Care 2016, 30, 636–642.
- Keene, C.M.; Kong, V.Y.; Clarke, D.L.; Brysiewicz, P. The effect of the quality of vital sign recording on clinical decision making in a regional acute care trauma ward. Chin. J. Traumatol. 2017, 20, 283–287.
- Asfari, A.; Wolovits, J.; Gazit, A.Z.; Abbas, Q.M.; Macfadyen, A.J.; Cooper, D.S.; Futterman, C.; Penk, J.S.; Kelly, R.B.; Salvin, J.W.; et al. A near real-time risk analytics algorithm predicts elevated lactate levels in pediatric cardiac critical care patients. Crit. Care Explor. 2023, 5, e1013.
- Teele, S.A.; Gazit, A.Z.; Futterman, C.; La Cava, W.G.; Cooper, D.S.M.; Schwartz, S.M.; Salvin, J.W. Investigation of a novel noninvasive risk analytics algorithm with laboratory central venous oxygen saturation measurements in critically ill pediatric patients. Crit. Care Explor. 2025, 7, e1204.
- Futterman, C.; Salvin, J.W.; McManus, M.; Lowry, A.W.; Baronov, D.; Almodovar, M.C.; Pineda, J.A.; Nadkarni, V.M.; Laussen, P.C.; Gazit, A.Z. Inadequate oxygen delivery index dose is associated with cardiac arrest risk in neonates following cardiopulmonary bypass surgery. Resuscitation 2019, 142, 74–80.
- Goldsmith, M.P.; Nadkarni, V.M.; Futterman, C.; Gazit, A.Z.; Baronov, D.; Tomczak, A.; Laussen, P.C.M.; Salvin, J.W. Use of a risk analytic algorithm to inform weaning from vasoactive medication in patients following pediatric cardiac surgery. Crit. Care Explor. 2021, 3, e0563.
- Fleuren, L.M.; Klausch, T.L.T.; Zwager, C.L.; Schoonmade, L.J.; Guo, T.; Roggeveen, L.F.; Swart, E.L.; Girbes, A.R.J.; Thoral, P.; Ercole, A.; et al. Machine learning for the prediction of sepsis: A systematic review and meta-analysis of diagnostic test accuracy. Intensive Care Med. 2020, 46, 383–400.
- Gichoya, J.W.; Thomas, K.; Celi, L.A.; Safdar, N.; Banerjee, I.; Banja, J.D.; Seyyed-Kalantari, L.; Trivedi, H.; Purkayastha, S. AI pitfalls and what not to do: Mitigating bias in AI. Br. J. Radiol. 2023, 96, 20230023.
- Maslove, D.M.; Tang, B.; Shankar-Hari, M.; Lawler, P.R.; Angus, D.C.; Baillie, J.K.; Baron, R.M.; Bauer, M.; Buchman, T.G.; Calfee, C.S.; et al. Redefining critical illness. Nat. Med. 2022, 28, 1141–1148.
- Flynn, J.T.; Kaelber, D.C.; Baker-Smith, C.M.; Blowey, D.; Carroll, A.E.; Daniels, S.R.; de Ferranti, S.D.; Dionne, J.M.; Falkner, B.; Flinn, S.K.; et al. Clinical practice guideline for screening and management of high blood pressure in children and adolescents. Pediatrics 2017, 140, e20171904.
- Lowry, A.W.; Futterman, C.A.; Gazit, A.Z. Acute vital signs changes are underrepresented by a conventional electronic health record when compared with automatically acquired data in a single-center tertiary pediatric cardiac intensive care unit. J. Am. Med. Inform. Assoc. 2022, 29, 1183–1190.
- Roberts, J.S.; Yanay, O.; Barry, D. Age-Based Percentiles of Measured Mean Arterial Pressure in Pediatric Patients in a Hospital Setting. Pediatr. Crit. Care Med. 2020, 21, e759–e768.
- Haque, I.U.; Zaritsky, A.L. Analysis of the evidence for the lower limit of systolic and mean arterial pressure in children. Pediatr. Crit. Care Med. 2007, 8, 138–144.
- Wijnberge, M.; Van Der Ster, B.; Vlaar, A.P.J.; Hollmann, M.W.; Geerts, B.F.; Veelo, D.P. The effect of intermittent versus continuous non-invasive blood pressure monitoring on the detection of intraoperative hypotension, a sub-study. J. Clin. Med. 2022, 11, 4083.
- Ziegler, J.; Rush, B.N.M.; Gottlieb, E.R.; Celi, L.A.; de la Hoz, M.Á.A. High resolution data modifies intensive care unit dialysis outcome predictions as compared with low resolution administrative data set. PLoS Digit. Health 2022, 1, e0000124.
- Sauer, C.M.; Dam, T.A.; Celi, L.A.; Faltys, M.; de la Hoz, M.A.A.; Adhikari, L.; Ziesemer, K.A.M.; Girbes, A.M.; Thoral, P.J.M.; Elbers, P.M. Systematic review and comparison of publicly available ICU data sets—A decision guide for clinicians and data scientists. Crit. Care Med. 2022, 50, e581–e588.