Pressure Injury Link to Entropy of Abdominal Temperature

This study examined the association between pressure injuries and complexity of abdominal temperature measured in residents of a nursing facility. The temperature served as a proxy measure for skin thermoregulation. Refined multiscale sample entropy and bubble entropy were used to measure the irregularity of the temperature time series measured over two days at 1-min intervals. Robust summary measures were derived for the multiscale entropies and used in predictive models for pressure injuries that were built with adaptive lasso regression and neural networks. Both types of entropies were lower in the group of participants with pressure injuries (n=11) relative to the group of non-injured participants (n=15). This was generally true at the longer temporal scales, with the effect peaking at scale τ=22 min for sample entropy and τ=23 min for bubble entropy. Predictive models for pressure injury on the basis of refined multiscale sample entropy and bubble entropy yielded 96% accuracy, outperforming predictions based on any single measure of entropy. Combining entropy measures with a widely used risk assessment score led to the best prediction accuracy. Complexity of the abdominal temperature series could therefore serve as an indicator of risk of pressure injury.


Introduction
Pressure injuries, or pressure ulcers, are caused primarily by extended exposure to pressure [1] and modified by the tissue's tolerance to pressure [2]. Sustained high pressure can lead to decreased blood flow, occlusion of blood vessels and lymphatic vessels, and tissue ischemia [3]. In the conceptual model developed by Braden and Bergstrom [2], a tissue's tolerance to pressure is affected by both extrinsic and intrinsic factors. Extrinsic factors include moisture, friction, and shear. Intrinsic factors include undernutrition; decreased arteriolar pressure; and other hypothetical factors such as interstitial fluid flow, emotional stress, smoking, and skin temperature. The etiology of pressure injuries continues to be an active area of research (e.g., [4]) complemented by advances in understanding the biomechanics of aging and wound healing [5] and in the role of skin microclimate [6].
Variations in skin temperature have been used in laboratory and clinical studies as a measure of the perfusion in the papillary dermis [7][8][9]. Methods for measuring microvascular blood flow include laser Doppler flowmetry [10], optical coherence tomography (e.g., [11,12]), and thermal infrared imaging (e.g., [13]). In the context of thermal infrared imaging research, it has been reported that it is possible to measure the effect of local blood circulation on skin temperature [14]. Peripheral vasoconstriction conserves heat by preventing heat loss from convection and radiation at the skin surface, whereas vasodilation increases blood flow and heat flow from the core to the epidermis [15]. Assuming that peripheral vasodilation and vasoconstriction behave like other organ systems under autonomic control, skin temperature under usual conditions should be highly irregular; i.e., variations in the skin temperature should be complex. Skin temperature can thus play a role as a proxy measure of the thermoregulatory function.
In the scientific framework of complex adaptive systems, the complexity of physiological variables arises from continuous adjustments to stimuli in order to maintain stability, and high complexity is a sign of youth and good health [16][17][18]. Measuring the complexity via entropy measures of skin temperature is thus a reflection of thermoregulatory properties of the skin, offering one way to quantify skin function. Skin failure has been implicated in propensity for pressure injuries [19], and the speed of skin temperature recovery following the relief of pressure from an externally applied indenter has been found to predict risk of pressure injuries [20,21]. Liao et al. [22] studied the complexity of blood flow oscillations with multiscale entropy [23] under a set of experimental conditions. Local heating and cooling of the skin were found to have distinct multiscale entropy signatures in the phase of reactive hyperemia that followed the release of pressure on the skin tissue. A word about the terminology surrounding complexity is in order here, because there are two main definitions of complexity [24]. Algorithmic complexity equates complexity to randomness or irregularity, which is the sense in which we use the term in this article. In contrast, self-generated complexity links the concept of complexity to the generation of meaningful structures.
In a previous study, Rapp et al. [25] broke new ground in reporting the association found between decreased multiscale entropy of abdominal skin temperature and the risk of pressure injuries. The main limitation of the study was that only three participants developed pressure injuries during the follow-up period. In another study, secondary analysis of data collected during a multinational randomized controlled trial [26] revealed that the fractal dimension of physical activity, a measure of its complexity, was a distinguishing factor between facility residents with pressure injuries and controls who were matched on the basis of several risk factors [27].
Despite advances in mattress technologies in recent years that have improved the pressure distribution in bed, incidence of pressure injuries has remained a concern in the US and elsewhere [26,28]. North American and European estimates of the associated economic burden are high: the median treatment cost is in excess of ten thousand dollars per incident [29][30][31]. The Braden scale [32] is an efficient, well-studied, and widely used survey instrument to assess the risk of pressure injuries in hospitals and nursing facilities. However, it is not perfect, and the development of new approaches to improve assessment of the risk of pressure injury is an important goal that aligns well with the broader goals of personalized healthcare. Improved risk assessment offers the potential to provide targeted assignment of limited staffing resources in nursing facilities and to reduce the incidence of pressure injuries, along with the high cost of treating them.
The primary purpose of this study was to examine the association between pressure injuries and multiscale entropy of abdominal temperature. A guiding heuristic principle in the science of complex biological systems is that a state of reduced disorder is associated with disease or frailty from aging. Aligned with this postulate, the study hypothesis was that the incidence of pressure injuries would be elevated in participants with lower levels of multiscale entropy. Apart from contributing new, stronger evidence for a previously tentative finding [25], this study adds several innovations. Refined multiscale entropy [33] was used to improve some deficiencies of multiscale entropy. Bubble entropy [34] was added as a complement to sample entropy [35] because the two entropies differ sharply in their approach to evaluating the disorderliness of a time series. Robust summary measures of refined multiscale entropy were subsequently developed to reduce the dimensionality, and machine learning methods were used to predict pressure injuries on the basis of entropy values.

Study Design and Data Collection
The design was a prospective cohort study with time series measurements made over 48 h after participant enrollment. Skin was examined at baseline and weekly thereafter to detect occurrence and stage of pressure injury, over a period of three weeks. Residents were recruited from an urban nursing facility with 50-bed capacity. Informed consent was required before enrollment in the study, in accordance with procedures approved by the Committee for Protection of Human Subjects at The University of Texas Health Science Center and policies of the nursing facility. The recruitment target was set at n = 40 based on consideration of power and the prevalence of pressure injuries. Using Poisson modeling based on the preliminary study [25], it was estimated that the target would yield a >0.5 chance of observing 6 or more pressure injuries in the study. Assuming the least prevalence in that range, along with the standardized effect size, d = 1.13, estimated from the preceding study, power was found to exceed 0.80. The power was calculated in G*Power 3.1 [36] for a one-sided t-test design with α = 0.05.
The study was impacted by the COVID-19 pandemic, and it was terminated after a futile wait of >18 months for the nursing facility to reopen for research activities. At the time of termination, n = 28 participants had been recruited. However, the goal of finding 6 or more participants with pressure injuries was easily exceeded: there were 12 such participants in the study sample.
Residents of the nursing facility who were age 70 or older were eligible to participate in the study. Eligible participants either had a pressure injury or were at risk of developing a pressure injury, indicated by Braden scale score ≤ 16. Other eligibility criteria included the ability to understand and provide informed consent. Exclusion criteria included an active infection indicated by body temperature elevated above 99.5 °F.
Those who were at risk but did not develop a pressure injury during the study period will be referenced as control cases from here on, and the others will be referred to as pressure injury cases. Note that unlike common usage, control does not refer to healthy controls in this study. The control group here could be considered an at-risk control group.
Initial assessment of a new study participant included a skin examination and assessment of the pressure injury risk with the Braden scale. The temperature monitoring device was taped to the abdomen using water-resistant, hypoallergenic medical tape, approximately three inches to the left or to the right of the navel. The monitoring device was removed approximately 48 h later, and temperature data were downloaded to a secure server. Age, sex, and race of the participant; medications; number of comorbidities; dementia; and vascular conditions were noted by the Research Nurse, and vital signs were measured. Occurrence of pressure injury was monitored over the study's duration in control cases, and injury stage was monitored in pressure injury cases. Data were maintained in a secure REDCap database.

Primary Measures
Skin temperature was measured with iButton high-density temperature loggers (Maxim Integrated, San Jose, CA, USA). The model DS1922L iButton has accuracy of ±0.5 °C in the range −40 to +85 °C. More importantly for the entropy estimation, it has 11-bit resolution of 0.0625 °C, and sufficient memory for 4096 logged values. At 1-min intervals, the iButton thermochron could log temperatures for up to 2.84 days. The iButton is approximately dime-sized, weighs only 3.3 grams, and was well-tolerated by participants for the 48-h measurement duration.
The Braden scale [32] for measuring pressure injury risk is composed of 6 subscales: sensory perception, activity, mobility, nutrition, moisture, friction/shear. The subscales are scored 1 to 4, except for the friction/shear subscale, which is scored 1 to 3. The range of the total score is 6 to 23. The Braden scale differentiates risk categories based on the total score, ranging from very high risk below 10 to no risk above 18. The scale has widespread use, and interrater reliability has been reported to be high. There is also satisfactory evidence of validity and reliability [37].
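The logger capacity and the Braden score range quoted above follow from simple arithmetic, which can be checked with a short sketch. The function and dictionary names below are illustrative assumptions and are not part of the study's software.

```python
# Illustrative check of two figures quoted in the text (names are hypothetical).

MINUTES_PER_DAY = 60 * 24


def logging_capacity_days(n_samples, interval_min):
    """Duration (in days) a logger can record with n_samples slots."""
    return n_samples * interval_min / MINUTES_PER_DAY


# Maximum score of each Braden subscale; every subscale has a minimum of 1.
BRADEN_MAX = {
    "sensory_perception": 4,
    "activity": 4,
    "mobility": 4,
    "nutrition": 4,
    "moisture": 4,
    "friction_shear": 3,
}


def braden_total(scores):
    """Sum the six subscale scores after validating their ranges."""
    for name, max_score in BRADEN_MAX.items():
        if not 1 <= scores[name] <= max_score:
            raise ValueError(f"{name} must be between 1 and {max_score}")
    return sum(scores[name] for name in BRADEN_MAX)


if __name__ == "__main__":
    print(round(logging_capacity_days(4096, 1), 2))        # capacity in days
    print(braden_total({k: 1 for k in BRADEN_MAX}))         # lowest total
    print(braden_total(dict(BRADEN_MAX)))                   # highest total
```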
Head-to-toe skin assessments were made by the study research nurse using established criteria [38]. A stage 1 pressure injury involves a persistent, nonblanchable erythema over a bony prominence in a light-skinned individual or red, blue, or purple hues in dark skin present at the same site on two consecutive days. A stage 2 injury has breaks in the skin, such as blisters or abrasions; a stage 3 injury has exposed subcutaneous tissue; and a stage 4 injury has exposure that extends into muscle or bone. Interrater reliability between the study Clinical Consultant and Research Nurse was assessed prior to study commencement; the mean was 0.87 (range: 0.85, 0.90).

Entropy Measures
Sample entropy [35] is a widely used measure of entropy for time series data. Stable estimates can be produced on relatively short series lengths >10^m, where m is the embedding dimension. Sample entropy (SampEn) is the negative natural logarithm of the conditional probability that epochs of length m that match point-wise within a tolerance r also match at the next point. Higher values of SampEn indicate smaller likelihood of continued matching of a pattern of size m at the (m + 1)th point.
Multiscale entropy [23] was proposed to extend and improve the application of SampEn at temporal scales longer than the scale set by the sampling interval of the time series. The simple averaging used for coarse-graining in multiscale entropy is known to have poor properties as a low-pass filter. Refined multiscale entropy [33] uses a Butterworth filter to improve the elimination of fast temporal scales, providing a flat response in the passband, with fast roll-off and elimination of side lobes in the stopband. Refined multiscale entropy additionally counteracts artificial shrinking of entropy at longer scales by updating the tolerance r as a percentage of the standard deviation of the filtered series rather than that of the original series.
In this study, refined multiscale SampEn was estimated for each abdominal temperature series using the EntropyHub library [39] that has been developed for use with multiple programming languages, including Python, which was used here. A sixth-order Butterworth filter with cutoff frequency 1/(2τ) for timescale τ was used for low-pass filtering of the time series. Although the original introduction of multiscale entropy involved the calculation of SampEn at each scale, EntropyHub has made it easy to expand and apply the notion of multiscale entropy to other types of entropy measures that were introduced later [40].
Bubble entropy [34], which evolved from permutation entropy [41], is based on the number of steps needed to sort the embedded sequence in ascending or descending order. This approach sets it apart from the tolerance-based pattern-matching approach of SampEn. Therefore, we supplemented the entropy measures with the refined multiscale bubble entropy. A similar approach was adopted in a study that used sample entropy and permutation entropy to improve classification of fever from body temperature signals [42].
Bubble entropy (BubbEn) is calculated from the conditional Rényi entropy of the probability distribution of the number of swaps needed to sort the embedded sequences using the well-known bubble sort algorithm. BubbEn at embedding dimension m is a normalized difference of the conditional Rényi entropies at m and m + 1 dimensions. Like SampEn, BubbEn has been shown to converge at short series lengths. Moreover, BubbEn has only one parameter, m, and estimated values have less sensitivity to the choice of m than other entropies [34].
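The definition above can be made concrete with a minimal pure-Python sketch. It is not the EntropyHub implementation used in the study; the function name, the use of the population standard deviation, and the choice of "less than or equal to" for the tolerance test are assumptions made for illustration.

```python
import math


def sample_entropy(x, m=2, r=0.15):
    """Minimal SampEn sketch: -ln(A / B).

    B counts pairs of templates of length m that match within the tolerance;
    A counts pairs that still match when extended to length m + 1.
    The tolerance is r times the standard deviation of the series
    (population SD here, which is an assumption of this sketch).
    """
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    tol = r * sd
    n_templates = n - m  # same number of templates for both lengths

    def count_matches(length):
        count = 0
        for i in range(n_templates):
            for j in range(i + 1, n_templates):  # self-matches excluded
                if all(abs(x[i + k] - x[j + k]) <= tol for k in range(length)):
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined: no matches were found
    return -math.log(a / b)


if __name__ == "__main__":
    periodic = [0, 1] * 20  # perfectly regular series
    print(sample_entropy(periodic, m=2, r=0.15))
```

A perfectly periodic series yields zero entropy because every match of length m continues to match at the next point, whereas an irregular series yields a positive value.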
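For orientation, the simple averaging used for coarse-graining in the original multiscale entropy can be sketched as follows. This is the baseline procedure that refined multiscale entropy replaces with Butterworth filtering; it is not the procedure used in this study, and the function name is an assumption.

```python
def coarse_grain(x, scale):
    """Average consecutive, non-overlapping windows of `scale` points.

    This is the coarse-graining step of the original multiscale entropy.
    Refined multiscale entropy replaces this averaging with a Butterworth
    low-pass filter before downsampling.
    """
    if scale < 1:
        raise ValueError("scale must be a positive integer")
    n_windows = len(x) // scale  # incomplete trailing window is dropped
    return [
        sum(x[i * scale:(i + 1) * scale]) / scale
        for i in range(n_windows)
    ]


if __name__ == "__main__":
    print(coarse_grain([1, 2, 3, 4, 5, 6], 2))
```

An entropy estimator would then be applied to the coarse-grained series at each scale to produce the entropy-scale curve.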
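The swap-counting idea behind BubbEn can be illustrated with a short sketch. It assumes the Rényi entropy of order 2 and a normalization by log((m + 1)/(m − 1)), which is our reading of the bubble entropy formulation in [34]; the function names are hypothetical, and the EntropyHub implementation used in the study may differ in detail.

```python
import math
from collections import Counter


def bubble_sort_swaps(vec):
    """Number of swaps bubble sort needs to order vec ascending."""
    v = list(vec)
    swaps = 0
    for end in range(len(v) - 1, 0, -1):
        for i in range(end):
            if v[i] > v[i + 1]:
                v[i], v[i + 1] = v[i + 1], v[i]
                swaps += 1
    return swaps


def renyi2_of_swaps(x, m):
    """Order-2 Renyi entropy of the swap-count distribution at dimension m."""
    counts = Counter(
        bubble_sort_swaps(x[i:i + m]) for i in range(len(x) - m + 1)
    )
    total = sum(counts.values())
    return -math.log(sum((c / total) ** 2 for c in counts.values()))


def bubble_entropy(x, m=3):
    """Normalized difference of the entropies at dimensions m + 1 and m."""
    if m < 2:
        raise ValueError("m must be at least 2")
    diff = renyi2_of_swaps(x, m + 1) - renyi2_of_swaps(x, m)
    return diff / math.log((m + 1) / (m - 1))


if __name__ == "__main__":
    print(bubble_sort_swaps([3, 2, 1]))
    print(bubble_entropy(list(range(50)), m=3))
```

A monotonic series needs no swaps at any dimension, so its swap distribution is degenerate and the sketch returns zero.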
Time series lengths were sufficient to investigate SampEn for embedding dimensions m = 2, 3. The tolerance parameter for SampEn was investigated for values r = 0.10, 0.15, 0.20, which represent fractions of the standard deviation of the time series for pattern matching. BubbEn was calculated for embeddings m = 2 to 10. The longest temporal scale for investigation of refined multiscale entropies was set to τ = 25 min so that the time series lengths were >100 times the maximum scale, on average.

Temperature Time Series
The iButton temperature sensor was in continuous logging mode after a software reset that was typically carried out a few hours before data collection began. Times of mounting and removing the temperature sensor on or off the participant's abdomen were noted as part of the protocol. Sensor on and off times were further fine-tuned by tracking changes in the pattern of autoregressive behavior of the time series, detected with the ADTK package [43] for Python, which was executed in the Google Colaboratory environment. See Figure 1 for an example that shows one cluster of anomalies near the on time and another cluster of anomalies near the off time. Sections of the time series starting at the rightmost anomaly in the on cluster and ending 30 min prior to the first anomaly of the off cluster were selected for further analysis with the entropy methods described in Section 2.3. Stationarity, which is required for assessment of entropy [44], was assessed with the augmented Dickey-Fuller test for the unit root [45]. Linear detrending was sufficient to achieve stationarity in all but three cases, for which removal of the circadian rhythm with a simple cosine fit was used to minimize changes to the structure of the series. The median length of the selected series was 2800 (46.7 h), and the interquartile range was 2630 (43.8 h) to 2826 (47.1 h). One series was of substantially shorter duration (24.0 h) than all others due to unexpected hospitalization of the participant for a reason unrelated to the study. Two temperature series were discarded due to errors in executing the software reset of loggers. Useful temperature data were available for 15 of 16 control cases and 11 of 12 pressure injury cases. Summary measures of time series were calculated with the aim of studying any differences in distributions between pressure injury and control groups.
The median was used as a measure of the central tendency of abdominal temperature, along with the interquartile range (IQR) as a measure of dispersion. The trimmed 95% range between the 2.5th and 97.5th percentiles was also examined as a simple proxy measure of the amplitude of circadian variation.
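The linear detrending step and the summary measures described above can be sketched as follows. The percentile rule (linear interpolation between order statistics) and all function names are assumptions; the study does not state which quantile definition was used.

```python
def linear_detrend(y):
    """Remove the ordinary least-squares line fitted against the sample index."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    slope = sxy / sxx
    return [v - (y_mean + slope * (t - t_mean)) for t, v in enumerate(y)]


def percentile(values, q):
    """q-th percentile with linear interpolation (an assumed convention)."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def temperature_summary(temps):
    """Median, interquartile range, and trimmed 95% range of a series."""
    return {
        "median": percentile(temps, 50),
        "iqr": percentile(temps, 75) - percentile(temps, 25),
        "trimmed_range": percentile(temps, 97.5) - percentile(temps, 2.5),
    }


if __name__ == "__main__":
    print(temperature_summary(list(range(101))))
    print(max(abs(v) for v in linear_detrend([2 * t + 1 for t in range(10)])))
```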

Summary Measures of Multiscale Entropy
The refined multiscale entropy measures described in Section 2.3 resulted in a series of SampEn and BubbEn values at each scale per participant. One approach for shrinking the parameter space for statistical modeling would be to select the particular scales for each type of entropy that provide the highest power to discriminate between the injury groups. However, such a measure arising from a single scale may have a degree of stochasticity that could make it a feature of the sample. Aiming to find summary measures of multiscale entropy that are more generalizable, even if they might result in loss of statistical power, we focused on two properties that span a range of scales: shape of the entropy-scale curve, and a measure of the magnitude associated with the curve.
The simplest measure of the shape is the slope of the curve centered between any two specified scales. By analogy, in the Taylor series approximation of a function, the first derivative, or slope, provides the first-order approximation, followed by a contribution from the second derivative, which is related to curvature, and then by smaller terms arising from higher-order derivatives. We define the scaling exponent as the slope of the entropy curve on the logarithmic scale:
$$\text{scaling exponent} = \left. \frac{d\,\mathrm{En}(\tau)}{d \ln \tau} \right|_{\tau = s}$$
where En(τ) denotes either SampEn or BubbEn at scale τ. The slope is evaluated at scale s, which was selected to maximize the effect size of pressure injury, conditional upon spanning at least five consecutive scales. In practice, the scaling exponent is estimated by the slope parameter from a regression of entropy on the logarithm of scale, which necessitates the selection of optimal limits s_1 and s_2 for the scale range. While these ranges could also vary under sampling, the inclusion of entropic structure across many scales makes this measure more robust than selecting a single scale.
Area under the curve (AUC) between any two specified scales is a natural measure of the magnitude of entropy that spans more than one scale. We define the requisite AUC measure by
$$\mathrm{AUC} = \int_{s_1}^{s_2} \mathrm{En}(\tau)\, d\tau$$
where En(τ) denotes either SampEn or BubbEn at scale τ. The limits of the integration range, s_1 and s_2, were selected by maximizing the effect size of pressure injury conditional upon spanning at least three consecutive scales to enhance robustness. It may be worth noting that these summary measures can be calculated independently for each individual without needing to know weights or parameters that could depend on other individuals. In contrast, principal components can only be calculated on the entire collection of individuals, rendering them dependent on sampling.
The scaling exponent and requisite AUC measures are independent of sampling, and they need only a single time series for their estimation. The analysis was scripted in the R programming language [46], and it was executed in an RStudio environment [47].
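A minimal sketch of the two summary measures for a single participant is given below. It assumes that entropy is left untransformed and regressed on the natural logarithm of scale, that the regressor is centered at the midpoint of the log-transformed range limits, and that the AUC is approximated with the trapezoidal rule at unit scale spacing. The function names and the dictionary input format are hypothetical.

```python
import math


def scaling_exponent(entropy_by_scale, s1, s2):
    """OLS slope of entropy on centered log-scale over scales s1..s2.

    Centering shifts the intercept to the middle of the range on the
    logarithmic scale but leaves the slope unchanged.
    """
    center = 0.5 * (math.log(s1) + math.log(s2))
    xs = [math.log(s) - center for s in range(s1, s2 + 1)]
    ys = [entropy_by_scale[s] for s in range(s1, s2 + 1)]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


def entropy_auc(entropy_by_scale, s1, s2):
    """Trapezoidal area under the entropy-scale curve between s1 and s2."""
    return sum(
        0.5 * (entropy_by_scale[s] + entropy_by_scale[s + 1])
        for s in range(s1, s2)
    )


if __name__ == "__main__":
    # Synthetic curve with a known slope of 0.3 on the log scale.
    curve = {s: 1.0 + 0.3 * math.log(s) for s in range(1, 26)}
    print(scaling_exponent(curve, 5, 10))
    print(entropy_auc(curve, 22, 24))
```

Both functions take only one participant's entropy-scale curve as input, which reflects the property that these measures do not depend on other individuals in the sample.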

Simple Bivariate Models
The differences in the means of the refined multiscale SampEn and BubbEn between pressure injury and control groups were assessed at each scale from the standardized effect size, d, and with two-sample Welch t-tests. Since this was an exploratory study rather than a confirmatory one, we did not control the family-wise error rate for multiple tests. The emphasis was on the effect sizes rather than p-values. Differences between groups of the scaling exponent and requisite AUC of multiscale SampEn and BubbEn, discussed in Section 2.5.1, were assessed in a similar manner.
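The group comparison at a single scale can be sketched as follows. The pooled-standard-deviation form of Cohen's d is an assumption, since the text does not state which standardizer was used, and the function names are hypothetical. The p-value, which requires the t distribution, is omitted from this sketch.

```python
import math
from statistics import mean, variance


def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


def welch_t(a, b):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


if __name__ == "__main__":
    control = [2, 4, 6]   # toy values, not study data
    injury = [1, 3, 5]    # toy values, not study data
    print(cohens_d(control, injury))
    print(welch_t(control, injury))
```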
Finding a difference in the mean of an entropy measure between pressure injury and control groups does not automatically guarantee that reversal of the dependent and independent variables will lead to a satisfactory model for predicting pressure injury from measurements of the entropy. Since any practical application of finding an appreciable difference in entropies would be geared toward prediction of the risk of pressure injury, we modeled it explicitly with machine learning methods that are described next.

Predictive Models
Two types of predictive models for pressure injuries are presented. One type of model is more traditional in the sense that it provides inferences about the links between pressure injury and the predictors, and the other type is focused more on prediction accuracy than inference. Generalized regression uses shrinkage methods and incorporates validation sets in the model construction and evaluation process. Adaptive lasso regression with leave-one-out cross-validation was used for term selection [48]. Adaptive lasso regression is a type of generalized regression that penalizes regression coefficients based on their size, shrinking some of the coefficients to zero, trading bias for reduced variance in the estimates [49,50]. Lasso regression is relatively robust to multicollinearity, and it is known to work well even when there are a large number of predictors relative to the size of the dataset, making it suitable for this study with its limited sample size. A second lasso step was used to explore the second-order interaction terms of all terms that were left in the model after the first run. The second type of predictive model was a neural network, which was restricted to have only a single hidden layer and no more than two nodes to minimize the risk of overfitting [51]. Models were built with 5-fold cross-validation, and the selection of activation functions for the nodes is described in Section 3.3.2. Adaptive lasso regression and neural models were built and executed using JMP Pro (version 15.2).

Sample Description
Distributions of demographic variables, history of comorbidities, body weight, vital signs, and Braden Scale score assessed at baseline are shown in Table 1. Descriptive statistics are displayed for the sample, in addition to being split by the pressure injury status. The control and pressure injury groups are defined in Section 2.1; briefly, control cases did not develop a pressure injury during the study period. The table includes the descriptions of summary measures of the time series temperature data. Among participants with pressure injuries, the maximum stage of injury observed during the study period was close to uniformly distributed across stages: n = 3 each for stages 1 and 2, n = 4 for stage 3, and n = 2 for stage 4. The location of injury was uniformly distributed with n = 3 each for sacrum or coccyx, ischial tuberosity, heel, and other location.

Scale Structure of Entropies
Refined multiscale SampEn tended to increase over temporal scales from 1 to 25 min, as shown in Figure 2. The embedding dimension was m = 3, and tolerance parameter r = 0.15. The rate of increase was faster at the lower scales, particularly in the control group, and decreased at higher scales. The mean SampEn level tended to be lower in the pressure injury group relative to the control group at scales exceeding 7 min. The pattern was similar for embedding dimension m = 2 and variations of ±0.05 in r, but m = 3 and r = 0.15 yielded an excellent distinction between the control and pressure injury groups.
Refined multiscale BubbEn tended to increase over temporal scales from 4 to 25 min, as shown in Figure 3 for embedding dimension m = 3. The rate of increase of BubbEn had a transition point near the 11-min scale, beyond which the mean entropy flattened out in the pressure injury group, whereas it continued to increase in the control group. The pattern was similar for embedding dimension m = 2, but m = 3 yielded excellent distinction between the control and pressure injury groups. BubbEn was also explored for m > 3, up to m = 10. At these higher values of m, the effect size of pressure injury appeared to be more stochastic across scales, whereas there was a more stable scale structure for m ≤ 3.

Comparison of Entropies by Pressure Injury Group
Comparisons of the mean entropy levels at each scale between control and pressure injury groups showed a generally increasing effect size over scales, peaking at 22 and 23 min for SampEn and BubbEn, respectively. Cohen's d effect size, defined by the difference in means measured in standard deviation units, is shown against scale in Figure 4. Due to the limitation of sample size, only large effects, d > 0.9, were statistically significant in inference with the independent samples t-test. Nevertheless, the extent of consecutive scales with d > 0.5 suggests robustness of the pressure injury effect on entropy levels. The scaling exponent and requisite AUC measures that summarize the refined multiscale SampEn and BubbEn curves for a participant were calculated as outlined in Section 2.5.1. The scaling exponent for the SampEn curve was derived by regressing entropy for a given participant on centered scales spanning from 5 to 10 min, and the scaling exponent for BubbEn spanned from 1 to 25 min. The specified scale ranges were obtained by a systematic search based on optimizing the effect size due to pressure injury. The use of centering on the logarithmic scale implies that the scaling exponents represent the parameter values at the centers of the ranges, which were located at 7.1 min for SampEn and 5.0 min for BubbEn. The requisite AUC spanned the range from 22 to 24 min for SampEn and from 21 to 23 min for BubbEn, which includes the scale with peak effect for either type of entropy. The comparison of these summary measures between injury groups is shown in Table 2. The table includes the peak effect at the single scale. An unexpected bonus was that the scaling exponent for SampEn displayed a larger pressure injury effect than the best single scale, thereby providing a desirable combination that captured the scaling structure of the multiscale entropy along with increased statistical power.

Generalized Regression Models
Pressure injury outcome was predicted with a series of generalized regression models ranging from simple bivariate models to fully adjusted models. Adaptive lasso regression was used for all models with incorporation of leave-one-out cross-validation. This resulted in generally improved performance over logistic regression, particularly for the multivariate models. Table 3 summarizes bivariate models that predicted pressure injury from each of four summary measures of refined multiscale entropies and the Braden scale score, considered one at a time. The model performance, assessed by area under the receiver operating characteristic (ROC) curve, generally tracked the order of effect sizes displayed earlier in Table 2. Model B2 serves as a reference with the Braden scale score as the sole predictor. The area under the ROC curve was 0.740, and accuracy for predicting controls was 94.1%, whereas only 58.3% of pressure injuries were correctly predicted with the classification threshold set at probability >0.5. Changing the classification threshold to 0.4 switched the imbalance to the opposite end: accuracy was 75.0% for pressure injuries, but only 47.1% for controls. In contrast, model B5, based on predictions from the SampEn scaling exponent, had better balance and smoother changes in accuracy upon altering the classification threshold. When pressure injuries were predicted by probability >0.5, accuracy was 80.0% for controls and 90.9% for correct prediction of pressure injury cases. However, we note that predictions on the basis of individual entropy measures are capped by area under the ROC curve of 0.86, which leaves some room for improvement with multivariate models.
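The effect of the classification threshold on the accuracy within each group, and the threshold-free area under the ROC curve, can be illustrated with a short sketch. The probabilities and labels below are toy values rather than study data, and the function names are hypothetical.

```python
def classwise_accuracy(probs, labels, threshold=0.5):
    """Accuracy for injury cases (label 1) and controls (label 0).

    A case is classified as a pressure injury when its predicted
    probability exceeds the threshold.
    """
    tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p > threshold)
    tn = sum(1 for p, y in zip(probs, labels) if y == 0 and p <= threshold)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, tn / n_neg


def roc_auc(probs, labels):
    """Area under the ROC curve as the probability that a randomly chosen
    injury case scores higher than a randomly chosen control (ties = 0.5)."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos for q in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    probs = [0.9, 0.6, 0.45, 0.4, 0.2, 0.1]   # toy predicted probabilities
    labels = [1, 1, 0, 1, 0, 0]               # toy outcomes
    print(classwise_accuracy(probs, labels, threshold=0.5))
    print(classwise_accuracy(probs, labels, threshold=0.3))
    print(roc_auc(probs, labels))
```

Lowering the threshold raises the accuracy for injury cases at the expense of the accuracy for controls, while the area under the ROC curve is unaffected by the threshold.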
Next, we present and evaluate multivariate models that incorporated the summary measures of entropies and controlled for covariates. The models are summarized in Table 4. Model M1 was the culmination of a process that started with inclusion of all summary measures of entropies. The adaptive lasso regression with leave-one-out cross-validation did not eliminate any of the four entropy measures from the set of predictors. Interaction terms were explored in the second step and found to be unnecessary. Model performance improved relative to the bivariate models, as indicated by the area under the ROC curve, 0.940. When pressure injury cases were classified with a threshold, probability >0.5, the accuracy was 80.0% for prediction of injuries and 86.7% for non-injuries. The overall misclassification rate was 16.0%. Decreasing the classification threshold to 0.30 yielded 100% accuracy for pressure injury cases, and the overall misclassification rate stayed at 16.0%. Model M2 resulted from a procedure that was identical to that of model M1, except that it included any covariates measured at the baseline that were potentially different in the groups. Covariates included the Braden scale score, sex, BMI, dementia, vascular disease, heart rate, and the median and trimmed range of the temperature time series. The Braden scale score was the only covariate that was retained along with the entropy measures that are listed in Table 4. Second-order interaction terms were found to be unnecessary. The model's performance was very good, as indicated by the area under the ROC curve of 0.94. When pressure injury cases were classified with a classification threshold of 0.5, the accuracy was 100% for prediction of non-injuries, and the overall misclassification rate was 8.0%.

Neural Models
Neural models provided improved classification accuracy over the generalized regression models. Neural networks were restricted to a single hidden layer, and 5-fold cross-validation was used in model development to mitigate the risk of overfitting. Since the sample size was small, further precaution was taken to restrict the number of predictors to only the most effective ones that were indicated by the generalized regression models discussed in Section 3.3.1.
In the first neural model N1, pressure injuries were predicted from SampEn and BubbEn scaling exponents. The model had two nodes with a hyperbolic tangent activation function for one node and a Gaussian function for the other node. The area under the ROC curve was 0.946, and the classification accuracy was 100% for controls and 90.9% for pressure injuries; the overall misclassification rate was 4.0%. The corresponding threshold for classification was 0.5. Comparison of the accuracy of the adaptive lasso regression model M1 and neural model N1 is shown in Figure 5.
Addition of the Braden scale score as the third predictor in model N2 resulted in a perfect ROC area measure of 1.0 with 100% accuracy for predicting controls and pressure injuries. This model had a single hidden layer with two nodes, with a hyperbolic tangent activation function for one node and a Gaussian function for the other node. For network diagrams, parameter estimates, and other information, see Appendix A and Supplementary Material.
Figure 5. Comparison of the accuracy of the adaptive lasso regression model M1 and neural model N1. Accuracy is shown separately for predicting pressure injury cases (red) and control cases (blue).

Discussion
A primary motivating factor behind the introduction of multiscale entropy was to reconcile apparent violations of a basic premise of the science of complex adaptive systems: higher complexity is generally indicative of a healthier system [23]. Although it may be violated at some scales, the basic truth behind this premise is made evident only when entropy is measured at temporal scales other than the one set by the choice of sampling frequency. For example, atrial fibrillation can lead to high entropy in the beat-to-beat series at short timescales but not at longer timescales. In this study, we found similar behavior for the refined multiscale SampEn and BubbEn of abdominal skin temperature. Participants without pressure injuries tended to have higher levels of entropy than participants with pressure injuries, but this was generally true only at temporal scales that were several times larger than the 1-min sampling period.
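The idea of measuring entropy at scales coarser than the sampling period can be sketched as follows. This is a simplified illustration, not the refined procedure used in this study: it uses plain coarse-graining by non-overlapping averages and ordinary sample entropy. The embedding dimension m = 2, the tolerance r = 0.2 times the standard deviation of the original series, the synthetic white-noise input, and the function names are all assumptions made for the example.

```python
import math
import random
import statistics


def coarse_grain(x, tau):
    """Average consecutive, non-overlapping windows of length tau."""
    return [sum(x[i:i + tau]) / tau for i in range(0, len(x) - tau + 1, tau)]


def sample_entropy(x, m, r):
    """Sample entropy: -ln(A/B), where B and A count pairs of templates of
    length m and m+1 whose Chebyshev distance is at most r (self-matches
    excluded)."""
    n = len(x)

    def matches(length):
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if all(abs(x[i + k] - x[j + k]) <= r for k in range(length)):
                    count += 1
        return count

    b = matches(m)
    a = matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined when there are too few matches
    return -math.log(a / b)


# Synthetic white noise standing in for a 1-min temperature series.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(600)]
r = 0.2 * statistics.pstdev(noise)  # tolerance fixed from the original series

for tau in (1, 2, 5):
    value = sample_entropy(coarse_grain(noise, tau), m=2, r=r)
    print(f"scale {tau} min: SampEn = {value:.3f}")
```

Plotting entropy against scale gives the multiscale curve whose shape, rather than its value at any single scale, is what distinguishes signals with structure across timescales from signals that are irregular only at the sampling scale.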
The differences in the mean entropy levels at the longer scales had large effect sizes due to pressure injury, consistent with the only other study on this topic that reported an effect size of roughly the same magnitude, d > 1, for multiscale SampEn of abdominal skin temperature [25]. That study included only three participants with pressure injuries, focused on a single measure of entropy, and included participants at low risk of pressure injuries. The new measurements of complexity in eleven participants with pressure injuries contribute stronger evidence that lower levels of SampEn and BubbEn are associated with the injury state. Moreover, this association was observed despite the exclusion of low-risk participants in the present study.
The need to maximize the observations of relatively few participants with pressure injuries was an important factor in the study design. Had the study been limited to observations of new pressure injuries in the three-week follow-up time, it would have been far too restrictive. Allowing participants to enter the study with a pre-existing pressure injury provided valuable data on participants with pressure injuries, at the expense of losing the ability to examine the causal structure. Therefore, the present study cannot conclude that loss of entropy precedes the development of pressure injuries. However, the previous study by Rapp et al. [25] had an exclusively longitudinal design, and its findings suggest that low entropy preceded the development of pressure injuries by a few days to a few weeks. A more resource-intensive study will be necessary to follow large numbers of participants over longer periods of time to make firmer judgments about causality.
The addition of refined multiscale BubbEn was found to be a useful complement to the refined multiscale SampEn. Large effects were observed for the differences in means between groups of the SampEn scaling exponent (d = 1.53), SampEn requisite AUC (d = 1.38), and BubbEn scaling exponent (d = 1.04). Despite having a smaller effect size than SampEn on a bivariate basis (see Table 2), in multivariate regression models it was the scaling exponent of BubbEn that edged out the scaling exponent of SampEn as the best predictor of pressure injuries (see Table 4). These two predictors can be thought of roughly as measures of the shape of the multiscale SampEn curve and the multiscale BubbEn curve, with the caveat of being restricted to certain scales that are discussed in Section 2.5.1. The neural models further confirmed that the scaling exponents of SampEn and BubbEn performed well together as predictors of pressure injury, yielding an overall 96% accuracy for the prediction of pressure injury and non-injury cases. This level of accuracy was unmatched by any single measure, regardless of whether it arose from SampEn or BubbEn.
Use of machine learning methods to predict pressure injuries on the basis of entropy measures resulted in models that performed well, with areas under the ROC curve not less than 0.94 for generalized regression with the adaptive lasso method and for the neural model. These models were superior to predicting pressure injuries on the basis of Braden scale scores alone. The predictors that were rejected by generalized regression are also noteworthy. Summary measures of abdominal temperature, including measures of variation such as the interquartile range and the trimmed range, did not differ between pressure injury and control groups. Entropy measures derived from the time series data were therefore crucial to detecting the pressure injury effect that could not be detected with simple measures of dispersion.
However, the best predictive models resulted from combining the entropy measures and the Braden scale score. Areas under the ROC curve for these models exceeded 0.96 for the generalized regression with the adaptive lasso method and for the neural network. A major advantage of the Braden scale is that it is in widespread use in nursing facilities. Therefore, combining this widely used score with experimental entropy measures derived from abdominal skin temperature might be the most straightforward and sensible next step to judge the risk of pressure injuries for residents of nursing facilities.
Nevertheless, it must be noted that the study design was such that it could introduce bias in the Braden scale score. The scoring was done by the research nurse on many of the participants with the knowledge that they had pressure injuries, creating the possibility of implicit bias in the scoring. The entropy measures, on the other hand, were calculated from sensor measurements, which made them immune to any subjective bias.
Entropy measures are not without their disadvantages. Roughly 42 h of temperature series data collection may be needed to obtain good estimates of entropies up to the scale at 25 min, which was the approximate location of the largest effect sizes. Alternatively, to estimate the most effective summary measure of multiscale entropy found in this study, the scaling exponent of the SampEn curve between scales 5 and 10 min, roughly 17 h of data collection may be needed. Measurement of entropy, therefore, takes much longer than scoring the Braden scale, making it a process that is unlikely to be tolerated by all patients or residents of a facility. Staff and healthcare providers would also need to wait for the entropy measurement to finish before the risk assessment could be completed.
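The two durations quoted above are consistent with a simple rule of thumb, shown here as a short calculation. The rule itself is an assumption and is not stated in the text: it supposes that roughly 100 points are needed in the series at the largest scale of interest, and that a series sampled at 1-min intervals shortens by a factor equal to the scale. The function name `hours_needed` is hypothetical.

```python
def hours_needed(max_scale_min, min_points=100):
    """Recording time (hours) so that the series at the largest scale still
    has about `min_points` values.

    Assumption: at scale tau (in minutes) the series has about
    duration / tau points, so the required duration is min_points * tau
    minutes. The default of 100 points is an assumed rule of thumb.
    """
    return min_points * max_scale_min / 60.0


for scale in (10, 25):
    print(f"largest scale {scale} min: about {hours_needed(scale):.1f} h of recording")
```

Under this assumption, the required recording time grows linearly with the largest scale of interest, which is why restricting the analysis to scales up to 10 min shortens data collection considerably.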
The wide variety of entropy metrics that have been developed [40] can be both an advantage and a challenge. In this study, we used sample entropy and bubble entropy advantageously; however, we cannot rule out the possibility that there could be a different combination of entropies that provides more power to predict pressure injuries. The susceptibility to noise of various entropy metrics is another challenge. For instance, it has been shown in the context of heart period data that permutation entropy, closely related to bubble entropy, is more susceptible to the introduction of broad band noise than coarse-grained entropy [52]. Further studies of the impact of noisy temperature measurement on entropy levels could be helpful for assessing conditions under which they can be reliable measures.
Despite some limitations that we noted, entropy measures hold promise as objectively measured predictors of the risk of pressure injuries. The proposed underlying mechanism is that the entropies provide an assessment of the orderliness of temperature fluctuations that are linked to changes in the blood flow in the skin tissue. The fluctuations that occur on the timescales of a few minutes to about thirty minutes appear to be of prime importance. In healthier skin, the blood flow is likely to be more adaptive and variable, responding more dynamically to changes in surface pressure and temperature. The corresponding entropy level is therefore likely to be higher than in a state of unhealthy blood flow and/or thermoregulation. The lower state of health of the skin thermoregulation, in turn, is likely to raise the probability of the person experiencing a pressure injury when their skin is subjected to external stress.
It is known from laser Doppler flowmetry that there are a few characteristic oscillations in human peripheral blood flow [53]. While these oscillations were found at timescales (<2 min) that are fast relative to those pertinent to the present study, their existence suggests that there could be more structure present at slower timescales. This structure could arise from an interplay between the autonomic nervous system and the vascular system and be mirrored in corresponding structures in the temperature signal. Evidence for the interplay comes from studies of the baroreflex that sometimes include assessment of peripheral resistance, which tends to be under sympathetic control [54], and from studies of oscillations arising from such control (e.g., [55]). It is also known that vasomotion diminishes with advancing age, and persistent obstruction of blood flow through the microvasculature can lead to the formation of microthrombi, which further obstruct blood flow [56,57].
Inspection of the power spectral density calculated with the Welch method [58] indicated that there were some differences in the spectral distribution between the two groups. The pressure injury group had >40% more power at several timescales (inverse frequency) between 2.4 and 9.1 min. It may be noteworthy that this interval includes the central points of the scale range used to calculate the scaling exponents of SampEn and BubbEn. In contrast, the control group had >75% more power at several timescales between 17.1 and 102.9 min. Overall, the power spectral density suggests that there was a shift of power from slower to faster timescales in the injury group. Such a shift could result in higher periodicity at timescales faster (shorter) than approximately 10 min. It is plausible that this could reduce the irregularity of abdominal temperature, which would be detected as lowered entropy. However, this interpretation of the shift in the frequency structure is necessarily speculative and serves only as a hypothesis for a future study, which would ideally include concurrent measurements of blood flow.
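A comparison of spectral power between timescale bands, as described above, can be sketched with a small Welch-type estimator. This is an illustration only: the segment length (120 samples), the Hann window with 50% overlap, the synthetic sinusoidal series, and the function names are assumptions, and the parameters actually used in the study are not given here.

```python
import cmath
import math


def welch_psd(x, seg_len, dt=1.0):
    """Welch estimate: average Hann-windowed periodograms of 50%-overlapping
    segments. Returns (freqs, psd) for positive frequencies below Nyquist;
    with dt in minutes, freqs are in cycles per minute."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / seg_len) for n in range(seg_len)]
    scale = 2 * dt / sum(w * w for w in window)  # one-sided density scaling
    starts = list(range(0, len(x) - seg_len + 1, seg_len // 2))
    ks = range(1, seg_len // 2)
    psd = [0.0] * len(ks)
    for s in starts:
        seg = x[s:s + seg_len]
        mean = sum(seg) / seg_len
        tapered = [(v - mean) * w for v, w in zip(seg, window)]
        for idx, k in enumerate(ks):
            coef = sum(t * cmath.exp(-2j * math.pi * k * n / seg_len)
                       for n, t in enumerate(tapered))
            psd[idx] += scale * abs(coef) ** 2 / len(starts)
    freqs = [k / (seg_len * dt) for k in ks]
    return freqs, psd


def band_power(freqs, psd, t_min, t_max):
    """Integrate the PSD over timescales (1/frequency) from t_min to t_max."""
    df = freqs[1] - freqs[0]
    return sum(p * df for f, p in zip(freqs, psd) if t_min <= 1.0 / f <= t_max)


# Synthetic 1-min series (NOT study data): a 5-min and a 40-min oscillation.
n = 480
fast_series = [math.sin(2 * math.pi * i / 5) for i in range(n)]
slow_series = [math.sin(2 * math.pi * i / 40) for i in range(n)]

for name, series in (("fast", fast_series), ("slow", slow_series)):
    freqs, psd = welch_psd(series, seg_len=120)
    print(name,
          "power at 2.4-9.1 min:", band_power(freqs, psd, 2.4, 9.1),
          "power at 17.1-102.9 min:", band_power(freqs, psd, 17.1, 102.9))
```

Applied to two groups, the same band powers could be averaged within each group and compared as ratios, which is one way to express statements such as "more power at faster timescales" quantitatively.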
It may be worth noting that our approach based on evaluation of the irregularity of abdominal skin temperature as a proxy measure of skin thermoregulation is not to be confused with approaches that monitor skin temperature localized to the most commonly anticipated wound locations, such as the sacrum. Relative differences in skin temperature between zones can give an early indication of a developing pressure injury [59–62]. Localized approaches can result in early detection of pressure injuries, but they require clinicians or nursing facility staff to frequently scan several suspect areas for thermal imaging, which can be labor intensive. The approach presented in this study of using abdominal temperature monitoring to assess system-wide status of skin thermoregulation holds the promise of a low impact on staff workload, which is important in a climate of global shortages of health care providers.
One avenue for future studies could focus on establishing the causal pathway through laboratory studies or numerical simulations of blood flow, tissue thermodynamics, and tissue mechanics. Another avenue is risk assessment, for which longitudinal studies with longer follow-up times are required. The sample size of such a study would need to be large to offset the relatively low occurrence of pressure injuries. If low entropy levels of abdominal temperature were found to precede the development of pressure injuries, it would make a strong case for regular monitoring of residents in nursing facilities or elsewhere. In such a study, it would be desirable to include blood flow monitoring and measurements of peak pressure in common bed postures for the participants. This would allow the link between temperature and blood flow to be studied and would yield a better understanding of the frequency structures of these signals. The combination of risk assessment studies and engineering studies could also point the way to a treatment that normalizes blood flow during a phase of high risk of pressure injury identified from the skin temperature measurements. For example, electrical stimulation has been proposed to increase periwound skin blood flow for nonhealing pressure injuries [63], which suggests that it could be used more effectively during an identified high-risk period before the injury is manifested.