Article

Multimorbidity from Diabetes, Heart Failure, and Related Conditions: Assessing a Panel of Depressive Symptoms as Both Formative and Reflective Indicators of a Latent Trait

by
Richard B. Francoeur
School of Social Work, Adelphi University, 1 South Avenue, Garden City, NY 11530, USA
Mathematics 2021, 9(21), 2715; https://doi.org/10.3390/math9212715
Submission received: 19 August 2021 / Revised: 7 October 2021 / Accepted: 13 October 2021 / Published: 26 October 2021

Abstract

Through exploring specific conditions (diabetes, heart failure, related vascular/metabolic diagnoses) and their multimorbidities, I develop a more thorough means to adjust confounders of clinical targets within main or interactive contexts in epidemiological panel studies. Regression-based multiple indicators-multiple causes (MIMIC) models combine multiple or moderated regression and confirmatory factor analysis. In a novel specification, each of twenty depressive symptoms is both a “formative” (causal) indicator and a “reflective” (effect) indicator of a latent trait (Depression). Although both indicators provide identical information (under different variable names), formative indicators provide “exogenous” information (outside the model) to estimate, within groups or subgroups, “endogenous” effects (recovered by the model) from the latent trait and its reflective indicators. Formative indicators within the multiple regressions constitute comprehensive proxies for unspecified confounders by completely mediating all unspecified confounder effects on the endogenous latent trait and its reflective indicators, the latter estimated through confirmatory factor analysis. Findings of symptom clusters of Depression in these specific conditions, and in subgroups that capture their synergies, corroborate parallel MIMIC models with instrumental variables that specify several known confounders, but suggest some confounding biases remain. All multimorbidities involve synergy from co-occurring diabetes and heart failure. There may be opportunities to target screening and optimize metformin treatment for these co-occurring conditions. This strategy avoids the need to specify all confounders, which may not be possible or verifiable.

1. Introduction

1.1. Background

Analysts often use principal components analysis (PCA) or confirmatory factor analysis (CFA) to investigate panels of metabolic, biomarker, or symptom data in metabolomic and epidemiologic studies [1]. For instance, PCA of a metabolic panel determined specific metabolites that distinguished lean from obese subgroups with insulin resistance [2]. CFA assesses the effect of the disease group or subgroup on each of the measurement items while simultaneously controlling for the influence of the latent factor on each measurement item. This allows CFA to adjust for measurement error and unreliability. In contrast, PCA measurement loadings may be inflated due to the lack of similar adjustments. Indeed, evidence from Monte Carlo analyses [3,4] suggests measurement loadings for the same symptoms tend to be higher (inflated) in PCA compared to CFA (see Appendix A, note 1).
CFA also estimates the “multiple indicators” that constitute the measurement model portion of a structural equations latent trait model known as the multiple indicators-multiple causes (MIMIC) model. Each participant has a true score reflected by their innate position on a latent trait (e.g., Depression), which generates or precipitates manifest or observed measurement indicators (e.g., different depressive symptoms). The other portion of a MIMIC model, the “structural model”, predicts components of the measurement model. When the latent trait is controlled, only variation unique to each observed indicator remains. Predictor effects to the observed indicators have “local independence” in that the observed items are conditionally independent of each other because the latent trait accounts for the shared variation across the observed indicators. The addition of the structural model to CFA permits more valid modeling than CFA alone when symptoms, biomarkers, or metabolites may not all stem from a single biological pathway and confounding influences are more likely.
MIMIC model estimation may be based on data in the form of a matrix of covariances (or correlations with means that can be converted to covariances), in which all variables are endogenous (estimated by the model). If data are in the form of individual observations, regression-based estimation is a more powerful option, in which only some of the variables are endogenous and the remainder are exogenous (they provide information from outside the model to assist in estimation) [5,6,7,8,9]. In contrast to the covariance-based approach, in which all variables are considered jointly as dependent (y) variables, in the regression-based approach the exogenous predictors are independent (x) variables that estimate conditional effects on the endogenous dependent (y) variables and latent trait (i.e., the estimated effects on y are conditional on the set of x).
Let us assume a regression-based MIMIC model in which some or all of the observed indicators of the endogenous measurement model (yi) are categorical (binary or ordinal). The relationships of the measurement and structural model portions are, respectively:
$$y^{*}_{i} = \nu + \Lambda \eta_i + \mathrm{K} x_i + \varepsilon_i,$$
$$\eta_i = \alpha + \mathrm{B} \eta_i + \Gamma x_i + \zeta_i,$$
where
ν = vector of measurement intercepts;
Λ = matrix of factor loadings;
ηi = vector of latent traits (or latent constructs or latent factors);
Κ = matrix of regression slopes of the latent response variables on the independent variables (exogenous);
xi = vector of independent variables (exogenous);
εi = vector of measurement errors in the measurement model uncorrelated with other variables;
α = vector of intercepts in the structural model;
Β = matrix of regression slopes of the latent trait on other latent traits;
Γ = matrix of regression slopes of the latent trait on the independent variables (exogenous);
ζi = vector of residuals in the structural model uncorrelated with other variables.
The vector of latent traits (ηi) appears on the left side of the structural model equation in Equation (1) because each latent trait may be predicted by other endogenous latent traits and by the vector of exogenous independent variables (xi). It also appears on the right side in order to reveal endogenous relationships specified between latent traits, constructs, or factors. The current study includes only one latent trait and thus excludes the right-side term (Β ηi).
The estimation approach developed and demonstrated by Muthén [5,6,7,8,9] and available in the Mplus software program [9] applies ordinal probit regression when yi is a categorical variable to fit the correlation structure of the model to sample correlations. A derived scaling factor (Δ) yields the continuous latent response variable (y*si) behind the observed indicator (see Appendix B in [9] for further details). There may be as few as two observed indicators. When an observed indicator is continuous, it is set equal to the continuous latent response variable (y*i) and the diagonal elements of Δ are set to one. Thus,
$$y^{*}_{si} = \Delta\, y^{*}_{i}.$$
By assuming conditional normality for y*si given xi, the regression-based MIMIC model yields estimates of conditional expectation and conditional variation:
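$$E(y^{*}_{si} \mid x_i) = \Delta\left[\nu + \Lambda (I - \mathrm{B})^{-1} \alpha + \Lambda (I - \mathrm{B})^{-1} \Gamma x_i + \mathrm{K} x_i\right],$$
$$V(y^{*}_{si} \mid x_i) = \Delta\left[\Lambda (I - \mathrm{B})^{-1} \Psi \left[(I - \mathrm{B})'\right]^{-1} \Lambda' + \Theta\right]\Delta,$$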
where
Ψ = covariance matrix of ζi;
Θ = covariance matrix of εi;
(Ι − Β) is non-singular.
In contrast to covariance-based approaches, the assumption of conditional normality allows the continuous latent response variables (y*si) behind categorical yi variables to be non-normal as a function of non-normality in the exogenous (xi) variables (see Appendix A, note 2).
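The conditional expectation and conditional variation given above can be illustrated numerically. The following is a minimal sketch, not the Mplus estimation routine: it assumes a single latent trait, so that Β = 0 and (Ι − Β)−1 reduces to one, and it treats Δ as a diagonal scaling matrix stored as a list. The function name, argument names, and all numeric values are hypothetical and chosen only for illustration.

```python
def conditional_moments(x, nu, lam, kappa, alpha, gamma, psi, theta, delta=None):
    """Conditional mean and covariance of the scaled latent responses given x.

    Assumes ONE latent trait (B = 0), so the general formulas reduce to
      E = Delta [nu + Lambda*alpha + (Lambda*Gamma + K) x]
      V = Delta [Lambda*psi*Lambda' + Theta] Delta
    x, nu, lam, gamma are lists; kappa and theta are lists of lists;
    alpha and psi are scalars; delta is the diagonal of Delta.
    """
    p = len(nu)
    if delta is None:
        delta = [1.0] * p  # continuous indicators: diagonal elements set to one
    # Structural model: expected latent trait given the exogenous variables
    eta_mean = alpha + sum(g * xi for g, xi in zip(gamma, x))
    mean = []
    for j in range(p):
        direct = sum(k * xi for k, xi in zip(kappa[j], x))  # K x: direct effects
        mean.append(delta[j] * (nu[j] + lam[j] * eta_mean + direct))
    cov = [
        [delta[j] * (lam[j] * psi * lam[k] + theta[j][k]) * delta[k] for k in range(p)]
        for j in range(p)
    ]
    return mean, cov


# Illustrative values only: two indicators, two exogenous variables
mean, cov = conditional_moments(
    x=[1.0, 0.0],
    nu=[0.0, 0.5],
    lam=[1.0, 0.8],
    kappa=[[0.0, 0.0], [0.2, 0.0]],  # one direct effect on the second indicator
    alpha=0.0,
    gamma=[0.5, 0.3],
    psi=1.0,
    theta=[[0.4, 0.0], [0.0, 0.6]],
)
```

In this sketch the first indicator is influenced by x only through the latent trait, whereas the second also receives a direct effect through Κ, which mirrors the distinction between shared and direct effects in the model.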

1.2. Purpose of the Study

There is a need for approaches to address confounding biases and to account for synergistic effects from multimorbidities that influence metabolites, biomarkers, and symptoms. A major contribution of this article is that it easily adapts the latent trait model into a nonrecursive, bidirectional specification of the MIMIC model, and in so doing provides a means to partial out unspecified confounding factors.
I call attention to the unrecognized possibility of using the regression-based approach to model any given panel item as both an exogenous variable and an endogenous variable within the same MIMIC model. This special type of MIMIC model is a contender to conventional multidimensional data reduction strategies with PCA or CFA for research on panels of metabolites, biomarkers, or symptoms because it enables CFA either to be conducted across the sample or to be targeted within an overall group or interactive subgroup, while providing extensive and comprehensive control of confounding factors.
The analyst may specify interaction terms to detect synergistic effects of metabolites, biomarkers, or symptoms within homogeneous disease subgroups or subphenotypes that distinguish them from the main effects of the overall disease groups or phenotypes. Interactions among predictor variables define the subgroups rather than subsets of observations derived in cluster analysis, where there is often uncertainty about optimal feature selections for deriving unbiased clusters [10]. Furthermore, prior research has not incorporated synergies within disease subgroups based on interactions of the individual disease components that arise from multimorbidity, an omission that may lead analysts to misattribute metabolic or symptom cluster variation to specific disease groups.
This overlooked strategy allows some or all items from a panel to be modeled simultaneously as formative (causal) indicators and reflective (effect) indicators of a latent trait, overcoming a commonly assumed restriction that it is necessary to choose only one of these options to specify and model any given indicator, e.g., [11,12,13,14]. The formative (causal) indicators are included to improve the detection of metabolic, biomarker, or symptom clusters (represented by the reflective indicators) either (1) within disease subgroups distinguished by different synergistic effects (interactions) of multimorbidity or (2) across disease groups (see Appendix A, note 3).
Using epidemiological data on metabolic and vascular disease conditions, I develop a protocol to specify MIMIC models that unveil unbiased clusters of psychometric items (the specific metabolites, biomarkers, or symptoms) of a latent trait (the overall level of the panel of items) within main or interactive disease contexts or across the full sample. The regression-based approach available in the Mplus statistical software affords this modeling advantage over the alternative covariance-based approach. I derive the MIMIC models and multivariate regressions to develop this protocol in Mplus (Version 5.21) using the MLR estimator (maximum likelihood parameter estimates with standard errors that are robust to non-normality and non-independence in complex random samples) [9].
The literature supports testing a panel of metabolites, or higher-order features such as depressive symptoms, within diagnostic subgroups of co-occurring metabolic and vascular conditions, in which their interaction or synergistic effects identify them. I will highlight here certain factors that support a context of synergy among metabolic and vascular conditions and related depressive symptoms. First, diabetes, and especially the metabolic syndrome (based on a constellation of conditions such as obesity, diabetes, hypertension and/or atherosclerosis), has been known for years to double the risk of a heart attack or of developing chronic heart failure [15]. Second, the action of the diabetes medication, metformin, provides a clue to these metabolic-vascular interrelationships. Metformin consistently shows promise for the prevention or slowing of atherosclerosis and heart conditions such as myocardial infarction and chronic heart failure. Among these positive effects, metformin appears to improve how well the protein titin folds and recoils within the heart muscle, which determines how well blood is pumped through the arteries [16,17,18,19,20,21,22,23,24,25,26,27]. The unexpected effects of metformin suggest diabetes may be interrelated with episodes of heart failure or heart attack, and related conditions such as atherosclerosis, in previously unknown ways, and they may be co-occurring conditions with synergistic effects that worsen depressive symptoms of sickness malaise. Third, much accumulated evidence over the years attests that depression occurs in the contexts of each of these individual medical conditions, as sickness malaise that occurs as part of the biological disease process itself and not only as a psychosocial reaction to it, as discussed in [28]. Finally, individuals with multimorbidity from more than one of these medical conditions experience higher levels of depression because of synergies among the intersecting and interrelated disease processes.
I recently detected clusters of depressive symptoms that were associated with multiple and interacting co-occurrences of metabolic conditions (excess weight, diabetes) with heart attack and with progressive vascular disease (hypertension, silent cerebrovascular disease, stroke, and vascular cognitive impairment), although congestive heart failure was not considered [28].
As we shall see, in the current study several reflective (effect) indicators are statistically significant in congestive heart failure. Several formative (causal) indicators of depressive symptoms that manifest from the latent trait (Depression) are also statistically significant in the overall disease group of congestive heart failure but not in any of the other disease groups or subgroups (where only two or fewer formative indicators are significant). This distinctive pattern provides a clue that at least part of the influence of congestive heart failure on the reflective indicators of depressive symptoms may occur indirectly through its interactions with other disease conditions. The current investigation focuses on the individual direct effects and synergistic comorbidity of diabetes and heart failure, with this targeted diagnostic subgroup also expanded to incorporate synergies from related conditions, such as hypertension, silent cerebrovascular disease, and previous heart attack. It will yield a protocol for testing a panel of metabolites, biomarkers, or symptoms as formative and reflective indicators in a MIMIC model. The innovation deciphers and adjusts background confounding biases by estimating bidirectional causal pathways based on both types of indicators of each panel item (and the derivative latent trait) in order to unveil effects from disease conditions (diagnoses, genes, risk factors) or synergies from multimorbidity in co-occurring conditions.

2. Materials and Methods

In order to specify MIMIC models that test a panel of metabolites, biomarkers, or symptoms, this study will compare two competing alternatives I derived. One applies instrumental variables to represent panel items considered “non-traditional” and the other, introduced for the first time in the current study, specifies bidirectional causal relationships that incorporate formative indicators of all panel items. I introduced and tested the instrumental variable MIMIC approach within vascular-metabolic subgroups of participants [28]; I direct the reader to that study for details regarding the sample, measures, and methodology. In that study and the current one, I use survey data from the New Haven, Connecticut subsample of community-residing older adults from the Established Populations for the Epidemiological Study of the Elderly (EPESE; unweighted n = 2812). The original data were collected after participants (or proxies for two percent of the sample) provided written consent. The Adelphi University institutional review board exempted from review the de-identified, publicly available data [29].

2.1. Alternative MIMIC Models: Initial Comparisons

The two panels of Figure 1a,b target the subgroup in which Diabetes and Heart Failure interact (Diabetes × Heart Failure). In both (a) and (b), the right column labeled “REFLECTIVE INDICATORS” draws on all items from the Center for Epidemiological Studies-Depression (CES-D) inventory. {In (a), the items are “Depressed” to “Each of 4 Non-Traditional CES-D Items (Fearful, Lonely, People Unfriendly, People Disliked Me)”; and in (b), the items are “Depressed 2” to “Each of 4 Non-Traditional CES-D Items 2 (Fearful 2, Lonely 2, People Unfriendly 2, People Disliked Me 2)”}. These are effects of the latent trait (Depression). Furthermore, the two panels distinguish separate participant subgroups based on validated ranges of CES-D total scores. All participants are included when CES-D ≥ 0. Participants with scores of 11 or greater may be either experiencing subthreshold depression, just below the threshold of clinically significant depression (CES-D total score of 11 to 15), or clinically significant depression (CES-D total score of 16 or greater). The notes to Figure 1 are reported in Figure S1.
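The CES-D score ranges described above can be expressed as a simple classification rule. The following is a minimal sketch using the cut-points stated in the text (11 to 15 for subthreshold depression, 16 or greater for clinically significant depression); the function names and group labels are hypothetical and are not drawn from the EPESE data files.

```python
def cesd_group(total_score):
    """Classify a CES-D total score using the ranges stated in the text."""
    if total_score < 0:
        raise ValueError("CES-D total scores cannot be negative")
    if total_score >= 16:
        return "clinically significant"
    if total_score >= 11:
        return "subthreshold"
    return "below subthreshold"


def in_model_sample(total_score, minimum=0):
    """True when a participant enters a model restricted to CES-D >= minimum.

    minimum=0 keeps all participants; minimum=11 keeps participants with
    subthreshold or clinically significant depression.
    """
    return total_score >= minimum


scores = [3, 11, 15, 16, 29]
labels = [cesd_group(s) for s in scores]
restricted = [s for s in scores if in_model_sample(s, minimum=11)]
```

Here `restricted` contains the participants who would enter a model defined on CES-D ≥ 11, while `minimum=0` retains the full sample.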
In (a), only four non-traditional items are “INSTRUMENTAL VARIABLES (CES-D ≥ 0)” of the same, subsumed original variables when CES-D ≥ 11. These original variables in turn are causes of the latent trait (Depression, Clinically Significant & Sub-threshold, CES-D ≥ 11).
On the other hand, (b) does not involve instrumental variables. Instead, all twenty CES-D items are “FORMATIVE INDICATORS (CES-D ≥ 0),” which are causes of the latent trait (Depression CES-D ≥ 0). The first set of depression items {from “Depressed 1” to “Each of 4 Non-Traditional CES-D Items 1 (Fearful 1, Lonely 1, People Unfriendly 1, People Disliked Me 1)”} that are causes of the latent trait (Depression) are formative indicators. The other set of depression items {“Depressed 2” to “Each of 4 Non-Traditional CES-D Items 2 (Fearful 2, Lonely 2, People Unfriendly 2, People Disliked Me 2)”} that are effects of the latent trait (Depression) are reflective indicators. An especially desirable feature of this dual specification in (b) is that it can provide more extensive and comprehensive control of confounders than trying to specify each of them directly. Each unspecified confounder operates through its effects on each of the specified formative indicators that predict the latent trait (Depression).
The error-conditioned reflective (effect) indicators can be modeled across the sample at large, in an overall group (a diagnosis, gene, epigenetic factor, environmental or other risk factor), or within a more targeted subgroup of interacting diagnoses, genes, epigenetic factors, or environmental or other risk factors.
The full panel of metabolites, biomarkers, or symptoms serve as observed items that contribute to a latent trait representing the overall level of the total metabolite, biomarker, or symptom panel {e.g., Depression in Figure 1a,b}. The simultaneous control for the level of the total panel allows estimation of more valid specific effects for individual metabolites, biomarkers, or symptoms (see Appendix B, note 1).
In Figure 1, all instrumental variables in (a) and all formative indicators in (b) are not predicted by other factors and so are considered exogenous. The diagnoses (Diabetes, Heart Failure) and their interaction (synergy within the subgroup when the diagnoses co-occur in the same individuals) are also exogenous in both panels. Depending on the MIMIC model specification, the formative indicators may provide either exogenous or endogenous information. If the variables reflecting the exogenous diagnosis subgroup in Figure 1 had predicted some of the formative indicators (e.g., the four non-traditional CES-D items), these formative indicators would be endogenous because they would mediate information contributed by the variables that constitute the exogenous diagnosis subgroup. This tighter specification detects suppressor effects, as discussed below.
In the two MIMIC models of Figure 1a,b the specification of indicators (enclosed in boxes in Figure 1a,b) to the left of the latent trait (enclosed by the circle) is estimated by multiple regression, or when interaction terms are also specified as indicators, by moderated multiple regression, which predicts the latent trait. These indicators may include: (1) predictors representing main or interactive epidemiological contexts of diagnoses, genes, epigenetic factors, environmental or other risk factors {in (a) and (b)}; (2) instrumental variables to estimate certain non-traditional items (symptoms) within the measurement model {in (a)}; and/or (3) formative (causal) indicators {in (b)}. All three types of indicators are exogenous because they provide outside information to estimate and recover the latent trait and its reflective (effect) indicators (i.e., the model does not estimate and recover the predictors and formative indicators) (see Appendix B, note 2).
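To make the role of the interaction term concrete, the following minimal sketch fits a moderated regression with two binary diagnoses and their product. It is an illustration on an observed outcome, not the latent-trait estimation performed in Mplus. Because two binary predictors plus their product form a saturated model, the ordinary least squares coefficients equal simple contrasts of the four cell means, which the code exploits. The function name, variable names, and data values are hypothetical.

```python
from statistics import mean


def moderated_regression_2x2(rows):
    """OLS coefficients for y = b0 + b1*D + b2*HF + b3*(D*HF).

    rows: iterable of (diabetes, heart_failure, outcome) with 0/1 indicators.
    The model is saturated, so each coefficient is a contrast of cell means.
    Raises statistics.StatisticsError if any of the four cells is empty.
    """
    cells = {(d, h): [] for d in (0, 1) for h in (0, 1)}
    for d, h, y in rows:
        cells[(d, h)].append(y)
    m = {key: mean(values) for key, values in cells.items()}
    b0 = m[(0, 0)]                      # neither diagnosis
    b1 = m[(1, 0)] - m[(0, 0)]          # diabetes alone
    b2 = m[(0, 1)] - m[(0, 0)]          # heart failure alone
    # Synergy: how far the co-occurring subgroup departs from additivity
    b3 = m[(1, 1)] - m[(1, 0)] - m[(0, 1)] + m[(0, 0)]
    return {"intercept": b0, "diabetes": b1, "heart_failure": b2, "interaction": b3}


# Illustrative data only: (diabetes, heart_failure, outcome)
data = [(0, 0, 1.0), (0, 0, 3.0), (1, 0, 4.0), (0, 1, 5.0), (1, 1, 10.0)]
coefficients = moderated_regression_2x2(data)
```

A nonzero interaction coefficient indicates that the outcome in the subgroup with both diagnoses differs from what the two main effects alone would predict, which is the sense of synergy used in the text.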
CFA estimates the specification of indicators (enclosed in boxes) to the right of the latent trait, where the direction of causality is from the latent trait to the reflective (effect) indicators, which represent multiple “observed” expressions of the latent trait. It is this joint application of multiple regression and confirmatory factor analysis in estimating the latent trait that allows both reflective (effect) indicators and either instrumental variables or formative (causal) indicators to be specified, with the formative indicators serving to absorb biases from confounding factors across the sample that would otherwise result in biased reflective indicators within the targeted subgroup.

2.2. The Instrumental Variables Approach

Figure 1a reveals a descriptive MIMIC model within a subgroup with both diabetes and heart failure (based on predictors for Diabetes, Heart Failure, and their interaction). The MIMIC models represented by Figure 1a incorporate the sixteen traditional items and four non-traditional items of the CES-D inventory to identify cases of clinically significant depression. The non-traditional items are modeled so that they do not interfere with the contribution of traditional items to the distinct presentations of depression within the metabolic/vascular illness subgroups, while still contributing to the latent trait (Depression) that reflects the overall level of depression necessary to identify cases of subthreshold or clinically significant depression (CES-D ≥ 11). In participants with subthreshold or clinically significant depression (CES-D ≥ 11), each non-traditional item (CES-D ≥ 11) is predicted by its targeted, subsumed instrumental variable for the same item (CES-D ≥ 0).
These four non-traditional items predict the non-traditional “formative (causal) indicator” portion of the variation in the latent trait for Depression. This leaves the exogenous predictors of interest (i.e., Diabetes, Heart Failure, Diabetes × Heart Failure) to predict the “reflective (effect) indicator” portion of the variation in the latent trait for Depression. This portion includes the effects of these three predictors on “Each of 4 Non-Traditional CES-D Items (Fearful, Lonely, People Unfriendly, and People Disliked Me)” in the measurement model portion (the right side) of the MIMIC model. The modeling of the four non-traditional items in this way adjusts for their inter-correlated variation with the sixteen traditional items so that the inclusion of the four non-traditional items does not bias their separate individual influences.
This original approach affords insight into distinct presentations, even phenomenology, by unveiling elusive clusters of psychometric items (metabolites, biomarkers, or symptoms) of a latent trait, either broadly (across a diagnosis, gene, epigenetic, environmental, or other risk group) or uniquely (within a subgroup targeted by interactions of two or more such groups). It overcomes the potential for common, insidious confounding biases in regression-based MIMIC models, which can estimate direct (unique) effects of predictors to the latent trait and to all but one of its reflective indicators. Even if it seems justified not to specify the direct effect to a certain reflective indicator (i.e., usually fixed at zero), hidden bias may infect the latent trait and proliferate across reflective indicators, undermining the validity of specified (shared and direct) effects. The advance avoids this difficulty by offering a new way to specify a MIMIC model that enables the direct effect on every single reflective indicator of a scale or subscale to be unveiled (while still adjusting shared effects across the reflective indicators to account for the level of the latent trait). Thus, it reveals the subset of reflective indicators that have statistically significant direct effects, which comprise the item cluster within the group or subgroup [28].

2.3. The Formative Indicators Approach

A MIMIC model that specifies causal pathways from every single predictor (the disease groups, their interactions, and all formative indicators) is not “identified” (i.e., unique estimates do not exist). Therefore, at least one causal pathway must not be estimated (i.e., the regression slope is fixed at zero). However, the analyst should determine specific causal pathway(s) to exclude on valid grounds, which is often not apparent or possible. The formative indicators approach is another solution to this dilemma. The exogenous/endogenous modeling distinction in regression-based MIMIC models affords a unique and valid opportunity to estimate such models. It permits us to specify (1) the endogenous portion of the model to estimate effects within the specific disease group/subgroup (i.e., all effects from the disease groups and their interactions) and (2) the exogenous portion to estimate effects across the sample at large instead of within the disease group/subgroup (i.e., no effects from the disease groups and their interactions). Normal and non-normal variation from predictors in the exogenous portion of the structural model (the formative indicators) across the sample at large conditions the estimates of effects in the endogenous portion of the model (the reflective indicators) within specific disease subgroups.
Figure 1b retains the distinction between traditional versus non-traditional CES-D items; however, this distinction is unnecessary because, as formative indicators, all items contribute in the same way to the distinct presentations of depression within the metabolic/vascular illness subgroups captured by the reflective (effect) indicators. Complete mediation occurs because the formative and reflective indicators involve the same measurement items. The specification of all CES-D depression items as exogenous formative indicators derives from a modeling conceptualization in which all unspecified and unknown confounders are direct predictors of all of these formative indicators, which mediate all confounder effects on the latent trait (e.g., Depression) and its reflective indicators. Thus, it is necessary only to specify the formative indicators as exogenous indicators in order to achieve comprehensive control for confounders (since they operate completely through the formative indicators, which are controlled). In contrast to the instrumental variables approach, the formative indicators approach avoids the need to identify and specify all of the important confounding factors, a task that is not possible in many contexts and, even when possible, often cannot be verified as complete.
Instead, it relies on specifying non-recursive, bidirectional causal relationships between each panel item and the latent trait. This type of modeling adjusts for the impact of the formative indicator of each panel item as a cause of the latent trait that would otherwise bias the relationship of the latent trait on the reflective indicator of the same panel item. The lack of specification and adjustment for these relationships of reciprocal causation contributes confounding biases [30] (biased estimation occurs within the formative indicators portion of the MIMIC model because regression predictors become correlated with residual terms on account of bias from simultaneity [31] and reverse causation [32]). Regression bias in estimating the latent trait, in turn, triggers non-optimal, biased estimation across its reflective indicators. (In contrast, bias restricted only to a particular reflective indicator will not distort the inter-correlations among the remaining reflective indicators in the measurement model). In contrast to the fully endogenous, covariance-based MIMIC model, the exogenous/endogenous distinction in the regression-based MIMIC approach with both formative and reflective indicators allows a sample-wide exogenous focus while constraining the endogenous focus to be either across a particular disease group or within a more targeted disease subgroup. This distinction means variable(s) that tap the disease group or subgroup predict the reflective items and the latent trait, but not also the formative indicators, avoiding the need for an instrumental variable approach to estimation.

2.4. Further Comparisons of Both Approaches

Although the four non-traditional CES-D items operate individually, each panel of Figure 1 does not show them individually but groups them within “Each of the 4 Non-Traditional CES-D Items (Fearful, Lonely, People Unfriendly, and People Disliked Me)”. Figure 1b distinguishes the formative indicator (with a ‘1’ following each indicator name) from the reflective indicator (with a ‘2’ following each indicator name).
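The naming convention in Figure 1b amounts to carrying each panel item into the analysis data twice under two variable names. The following minimal sketch shows one way to prepare such a record; the function name, the dictionary layout, and the exact variable names (including the space before the suffix) are assumptions for illustration, and actual variable names would need to satisfy the naming rules of the statistical software in use.

```python
def add_dual_indicators(record, items):
    """Return a copy of record with two identically valued copies of each item.

    "<item> 1" is intended as the formative (causal) indicator and
    "<item> 2" as the reflective (effect) indicator of the same item.
    """
    out = dict(record)  # leave the caller's record unchanged
    for item in items:
        out[f"{item} 1"] = record[item]  # formative copy (exogenous)
        out[f"{item} 2"] = record[item]  # reflective copy (endogenous)
    return out


# Illustrative participant record with two of the twenty CES-D items
participant = {"Diabetes": 1, "Heart Failure": 0, "Lonely": 2, "Fearful": 0}
expanded = add_dual_indicators(participant, items=["Lonely", "Fearful"])
```

Both copies hold the same values; they differ only in the role each name is given in the model specification.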
As an illustration, “Lonely 1” and “Lonely 2” are equivalent representations of the item Lonely, but because they have different variable names, the Mplus software program treats them as different variables. In Figure 1a,b, Mplus models the observed distribution for Lonely as a continuous variable {as an instrumental variable in (a) and a formative or causal indicator in (b)} within the multiple regression framework of the MIMIC model. However, within the confirmatory factor analysis framework, it models the postulated continuous latent response variable (through ordinal probit regression) that gives rise to the observed ordinal variable of Lonely (reflective or effect indicator). Strictly speaking, the two variables are therefore virtually, but not absolutely, identical in this context; they would be absolutely identical only if the reflective (effect) indicator were also modeled as a continuous variable (see Appendix B, note 3).
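The relationship between a continuous latent response variable and the observed ordinal item can be sketched as follows. This is a generic ordinal probit illustration under stated assumptions, not the Mplus estimator: the residual of the latent response is taken to be standard normal, thresholds are given in ascending order, and the threshold values and function names are hypothetical.

```python
from bisect import bisect_right
from statistics import NormalDist


def observed_category(y_star, thresholds):
    """Map a latent response value to its ordinal category (0, 1, 2, ...).

    thresholds must be sorted in ascending order; a value above k thresholds
    falls in category k.
    """
    return bisect_right(thresholds, y_star)


def ordinal_probit_probs(mu, thresholds):
    """Category probabilities when y* = mu + e, with e standard normal.

    P(category k) = Phi(tau_{k+1} - mu) - Phi(tau_k - mu),
    with the outer cut-points at minus and plus infinity.
    """
    cdf = NormalDist().cdf
    cuts = [0.0] + [cdf(t - mu) for t in thresholds] + [1.0]
    return [upper - lower for lower, upper in zip(cuts, cuts[1:])]


# Illustrative thresholds for a four-category item
taus = [-0.5, 0.5, 1.5]
category = observed_category(0.7, taus)
probabilities = ordinal_probit_probs(mu=0.2, thresholds=taus)
```

A higher conditional mean of the latent response shifts probability toward the higher categories, which is how predictors of the latent response translate into the observed ordinal item.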
Thus, the instrumental variable in Figure 1a or the formative (causal) indicator in Figure 1b, which is exogenous, and the reflective (effect) indicator (in both panels) of the same item, which is endogenous, comprise bidirectional non-recursive pathways involving the latent trait (Depression). They derive essentially from the same item, which provides exogenous information in one pathway and is recovered as an endogenous factor in the other, and therefore labeled differently as separate variables in the two pathways.
Figure 1b reveals an expansion of this MIMIC model by specifying each of the twenty CES-D items as both a formative (causal) and reflective (effect) indicator. In this expansion, there is no distinction between traditional and non-traditional CES-D items; all of the CES-D items now have both a formative (causal) indicator and a reflective (effect) indicator. Thus, in the measurement model portion, analyzed using CFA, there is shared variation across all CES-D items {and not only across the sixteen traditional CES-D items, as in Figure 1a}. This highly flexible specification models both the formative (causal) and reflective (effect) indicators for every CES-D item, avoiding the need to assume that only one of these options is operative within the measurement model for traditional items (and both options for non-traditional items). Rather, it allows the data to shape latent trait and measurement model distributions while providing comprehensive adjustment for confounding factors, which the twenty formative (causal) indicators serve to mediate.
The use of instrumental exogenous variables of some of the psychometric items {Figure 1a}, or the use of original exogenous variables to incorporate formative (causal) indicators of all of the psychometric items {Figure 1b}, each allows a more expansive and flexible specification involving reflective (effect) indicators and bidirectional non-recursive pathways. However, the incorporation of formative (causal) indicators of all CES-D items in Figure 1b is likely to partial out confounding factors more thoroughly. Both approaches overcome confounding from model misspecification in which actual instrumental variable effects or formative (causal) indicator effects are incorrectly attributed to reflective (effect) indicator effects (see Appendix B, note 4).
In the current study, the descriptive MIMIC model involving the instrumental variables approach excludes confounders, while the explanatory MIMIC model adjusts for specified confounders (Black, male, age 75 or older, not a high school graduate, recent widow, income equivalence adjusted for family size, isolated, smoker, alcohol consumption, hypertension, silent cerebrovascular disease, heart failure, excess weight, lost ten pounds during the past three months, diabetes, heart attack, and number of cerebrovascular risk factors) (see Appendix B, note 5). However, specifying a range of confounders does not usually adjust all confounders. In contrast, the new formative indicators approach relies on the formative indicators of the measurement items to tap all confounders since the formative indicators necessarily mediate all confounders in their effects on the latent trait. The extent to which formative indicators partial out unspecified confounders (including other unspecified symptoms) related to specified symptoms will allow the reflective indicators to tap more valid symptom clusters within a disease group or subgroup than the use of CFA or PCA alone. To provide the best comparison between the instrumental variables approach and the formative indicators approach, the current study does not specify any confounders in the latter approach, although it can incorporate individual specified confounders.

3. Results

3.1. Introducing the Tables Reporting Parallel MIMIC Model Estimates

Table 1 reports descriptive and explanatory MIMIC analyses based on the instrumental variables approach developed previously by the author. Table 2 reports MIMIC analyses based on the formative indicators approach developed in the current article. When these latter MIMIC analyses include formative indicators for all twenty CES-D items of depression, this comprehensive adjustment for unspecified confounders parallels the comprehensive adjustment for specified confounders in the explanatory MIMIC analyses of Table 1. To be clear, although analyses in Table 2 (the formative indicators approach) adjust for unspecified confounders, they do not specify them in contrast to the explanatory analyses in Table 1 (the instrumental variables approach). Finally, I report footnotes for Table 1 and Table 2 in the corresponding Supplementary Tables S1 and S2.
Table 1 reports findings from MIMIC models with instrumental variables of four endogenous non-traditional CES-D items of depression that serve as formative indicators. As non-traditional items, the use of instrumental variables allows them to contribute to the level of the latent trait or additive composite of CES-D depression without influencing its presentation among the sixteen traditional depressive symptoms. Table 1 (A) reports findings for descriptive MIMIC models that also specify only the predictor(s) that target the disease group {a single main effects term} or subgroup {interaction term(s) and their one-way component terms}. Table 1 (B) reports findings for explanatory MIMIC models that specify these terms as well as potential confounders tapped by variables reflecting demographic groups and related vascular and metabolic conditions.
Table 2 reports findings from MIMIC models with formative indicators as exogenous predictors or illness context mediators. Table 2 (A) reports findings when specifying the original variables of the four exogenous, non-traditional CES-D depression items as four formative indicators. Table 2 (B) reports findings when specifying the original variables of all twenty exogenous CES-D depression items as twenty formative indicators. In the findings from both (A) and (B), the lack of instrumental variables means that the variation from the four non-traditional CES-D items may now compete with the sixteen traditional CES-D items. They may compete in accounting for variation in the latent trait or additive composite, and therefore, in the presentation of statistically significant depressive symptoms as reflective indicators of symptoms and symptom clusters.
Table 1 (B) adjusts only for known, specified confounders comprising demographic groups and related vascular and metabolic conditions. Table 2 does not adjust for these known, specified confounders; it relies instead on the formative indicators to adjust for unspecified confounders.
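To make the subgroup specification concrete, the sketch below builds the product terms that target a disease subgroup from 0/1 dummy variables. It is an illustration only: the function name, the variable names, and the five-participant data are hypothetical, and the choice to generate every lower-order product (not only the one-way components) is an assumption rather than a description of the models estimated in the article.

```python
from itertools import combinations

import numpy as np


def interaction_design(dummies):
    """Return all product terms for a set of 0/1 condition indicators.

    `dummies` maps a condition name to a 0/1 array. The result holds the
    one-way terms, every lower-order product, and the full k-way product.
    """
    names = list(dummies)
    terms = {}
    for order in range(1, len(names) + 1):
        for combo in combinations(names, order):
            label = " x ".join(combo)
            # Element-wise product: 1 only when all conditions co-occur.
            terms[label] = np.prod([dummies[n] for n in combo], axis=0)
    return terms


# Hypothetical indicators for five participants (not data from the study).
data = {
    "Diabetes": np.array([1, 1, 0, 1, 0]),
    "HighBP": np.array([1, 0, 1, 1, 0]),
    "HeartFailure": np.array([1, 1, 1, 0, 0]),
}
terms = interaction_design(data)
for label, column in terms.items():
    print(label, column.tolist())
```

A participant scores 1 on the three-way term only when all three conditions co-occur, which is what allows the interaction to isolate the variable-defined subgroup.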

3.2. Comparisons of Formative Indicators as Instrumental Versus Original Variables

I previously focused on the instrumental variables approach in latent trait models with predictors of diabetes, excess weight, and progressive cerebrovascular disease and their interactions, but these models excluded heart failure and its interactions with these included disease predictors [28]. The current article updates the instrumental variables analyses by including heart failure as an additional predictor and an additional component of disease predictor interactions reported in Table 1. The instrumental variables analyses in Table 1 (B) also control for all other progressive cerebrovascular disease conditions, heart attack, and demographic variables in order to adjust impartially for these overlapping sources of variation that would serve as confounders if not also specified. In contrast, the formative indicators analyses exclude these control variables.
I compare these updated analyses in the instrumental variables approach (Table 1) to the parallel analyses in the new formative indicators approach (Table 2) to test the same disease group (a one-way variable) or subgroup (an interaction of variables). All of the disease subgroup interactions involve diabetes and heart failure (or heart failure without heart attack) with different combinations of related conditions (hypertension or silent cerebrovascular disease, heart attack, excess weight, up to the five-way interaction combination). Almost all have reasonably large regression slope estimates (greater than one) for symptoms, and almost all symptoms occur in one or more symptom clusters across disease subgroups. The highly similar, overlapping findings from both approaches provide evidence that the formative indicators approach (which does not include the control variables specified in the instrumental variables approach) adjusts for unspecified confounders. Furthermore, some additional predictors that were not significant in the instrumental variables approach (Table 1) become significant in the formative indicators approach (Table 2), suggesting that there are additional salient confounders beyond those that were specified as control variables in the instrumental variables approach.
I create the instrumental variables in Table 1 only from the four non-traditional CES-D items, which also constitute the four formative (causal) indicators of the MIMIC reported in Table 2 (A). The descriptive MIMIC findings in Table 1 (A) only specify diagnostic predictors for the group or subgroup of interest (i.e., no co-occurring conditions or confounders are specified). It is striking that in certain disease subgroups, some descriptive MIMIC regression slopes and standard errors in Table 1 (A) are almost identical to those in Table 2 (A). These disease subgroups are Diabetes × Heart Attack; Diabetes × High BP × Heart Failure; Diabetes × High BP × Heart Failure, without Heart Attack; and Diabetes × Silent CVD × Heart Attack × Heart Failure. Thus, both specifications {instrumental variables and formative (causal) indicators} converge to estimate the same MIMIC model. In the remaining diagnostic subgroups, findings from the descriptive MIMIC in Table 1 (A) are strongly consistent with those in Table 2 (A), with almost all of the same CES-D reflective (effect) indicators found to be statistically significant. In one case (Diabetes × Heart Failure), the descriptive MIMIC in Table 1 (A) revealed all twenty CES-D items each to be statistically significant while Table 2 (A) detected the latent trait (Depression) to be significant.
In addition to diagnostic predictors for the group or subgroup of interest, the explanatory MIMIC models reported in Table 1 (B) include a comprehensive (not exhaustive) set of related personal characteristics and diagnostic conditions (dummy variables) in order to control for known co-occurring conditions or confounders. (These are: Black, male, age 75 or older, not a high school graduate, recent widow, income equivalence adjusted for family size, isolated, smoker, alcohol consumption, hypertension, silent cerebrovascular disease, heart failure, excess weight, lost ten pounds during the past three months, diabetes, heart attack, and number of cerebrovascular risk factors). I describe these variables in [28]. To a considerable extent, the Explanatory MIMIC findings {Table 1 (B)} are similar to the findings in Table 2 (B), in which all twenty CES-D items are each specified as a formative (causal) indicator of an “exploded” MIMIC model, along with the diagnostic predictors for the group or subgroup of interest.
A lack of statistically significant effects in the explanatory MIMIC for a diagnostic subgroup in Table 1 (B) was always matched by a similar lack of statistically significant effects in Table 2 (B). Likewise, every effect that was statistically significant in Table 2 (B) was also statistically significant in the Explanatory MIMIC findings in Table 1 (B). However, Table 1 (B) also tended to find additional items significant in the Explanatory MIMIC runs. This could result partly from the fact that the instrumental variables approach, unlike the formative indicators approach, excludes the variation that reflective indicators of non-traditional CES-D items share with reflective indicators of traditional CES-D items.
More broadly, this pattern also suggests that, despite the attempt to specify a comprehensive (but not necessarily exhaustive) set of related personal characteristics and diagnostic conditions associated with confounders, some confounding factors remain unadjusted. Thus, the specification of all twenty CES-D items as formative (causal) indicators in the “exploded” MIMIC models {Table 2 (B)} may achieve more complete conditioning than the counterpart Explanatory MIMIC models {Table 1 (B)}. This improved conditioning reduces bias and improves reliability. Although measurement loadings are often lower in Table 2 when all twenty CES-D items rather than only the four non-traditional items have formative indicators, this more restricted specification improves the assignment of variation among competing formative and reflective indicators, and leads to perfect model fit (R2 = 1).
Curiously, in both Table 1 and Table 2, either all, or almost all, CES-D items are initially statistically significant as reflective indicators in the overall conditions of Diabetes and Heart Failure {(A)}; however, very few, if any, remain significant in the more restrictive model {(B)}. The modeling of interactions by targeting groups of interacting illness conditions results in much more consistency in findings from both (A) and (B). Thus, findings initially attributed to an overall condition such as diabetes or heart failure may well mask confounding biases from other co-occurring and interacting conditions detected either in an explanatory MIMIC model specifying them or in a MIMIC with a more carefully specified measurement model that includes formative indicators for all items. The R2 fit statistic, which reveals the percent of the variation within the latent trait (Depression) predicted, also suggests confounding biases. In Table 1, when comparing each Descriptive MIMIC {(A)} to the respective Explanatory MIMIC {(B)}, the R2 values do not increase (and even decrease due to multicollinearity among specified predictors). However, in Table 2, there is always an appreciable increase in the R2 fit statistic when comparing (1) each MIMIC with four non-traditional items as formative indicators in (A) to (2) the corresponding MIMIC with all twenty items as formative indicators in (B), in which R2 always equals 1.000 due to perfect fit between the formative and reflective indicators.
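The jump in R2 between (A) and (B) can be illustrated with a deliberately simplified numerical analogy. The sketch below is not the Mplus estimation and uses no study data: it simulates twenty hypothetical items, defines a trait as an exact weighted sum of them (the weights are made up), and compares an ordinary least-squares fit on four items with a fit on all twenty. The function name `r_squared` and all values are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical item responses: 200 participants, 20 items scored 0-3.
n_obs, n_items = 200, 20
items = rng.integers(0, 4, size=(n_obs, n_items)).astype(float)

# Assumed (made-up) positive weights; the trait is an exact weighted sum.
weights = rng.uniform(0.5, 1.5, size=n_items)
trait = items @ weights


def r_squared(predictors, outcome):
    """R2 from an ordinary least-squares fit with an intercept."""
    design = np.column_stack([np.ones(len(outcome)), predictors])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    residual = outcome - design @ coef
    total = outcome - outcome.mean()
    return 1.0 - (residual @ residual) / (total @ total)


r2_four = r_squared(items[:, :4], trait)  # only four items as predictors
r2_all = r_squared(items, trait)          # all twenty items as predictors

print(f"R2 with 4 items:  {r2_four:.3f}")
print(f"R2 with 20 items: {r2_all:.3f}")
```

With only four predictors, the variation carried by the other sixteen items is left in the residual, so R2 falls well short of one; once every item is a predictor, the residual vanishes up to rounding error. In the article's models the trait is latent rather than constructed, so this shows only the arithmetic behind a determinate composite.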
Table 2 also indicates when one or more of the formative indicators are statistically significant predictors of the weighted, additive composite within each disease group or subgroup. These may reveal a real causal effect by a symptom or symptom cluster, or they may be artifacts of outliers, heteroscedasticity, and multicollinearity from earlier confounding factors for which the symptom(s) serve as a proxy.
In the final section (III) of Table 2, the MIMIC specification is extended to incorporate an additional variable (Excess Weight) to refine or target further the subgroups of chronic conditions (i.e., Table 1 does not include parallel findings).
Supplementary Table S2 footnotes 5–10 discuss findings and suppressor effects from all of the mediated and sequential MIMIC models reported in Table 2 and their implications for detecting synergistic effects from unspecified, co-occurring illness conditions.

4. Discussion

4.1. The Formative Indicators Approach

In contrast to the covariance-based MIMIC model, which depends on the analysis of a covariance matrix to generate a unidimensional latent trait (or additive composite), the regression-based MIMIC model relies on the availability of the actual responses on each predictor (x variable) across observations. The use of the original data from the exogenous predictors incorporates skewness and non-normality into the generated distributions of the separate, endogenous latent variables behind the observed ordinal reflective indicators. The shared variation across these separate latent variables, which constitute the latent trait or additive composite, may also include skewness and non-normality. Thus, the conditional effects of the regression-based MIMIC model incorporate multidimensionality within the latent trait or additive composite. The specification of the same items as both formative and reflective indicators incorporates all the relevant non-normality in the perfectly estimated (R2 = 1) MIMIC model with an additive composite {reported in Table 2 (B)}, in contrast to the imperfectly specified and estimated instrumental variables approach (where R2 is much lower) that may retain confounding biases by missing predictors. Thus, the multidimensionality incorporated by the formative indicators approach is comprehensive, nonbiased, and meaningful.
The reflective indicators may be more likely to tap symptom clusters that stem from the same or shared biological processes, whereas the formative indicators appear more likely to tap those that stem from non-shared biological processes common only to subsets of participants within the same disease group or subgroup. Certain symptoms can act in some individuals as formative indicators (e.g., poor appetite) in causing the latent trait of underlying depression at the same time that they can act in other individuals as reflective indicators as an effect of the latent trait (e.g., poor appetite as a manifest item of depression). This simultaneous specification allows the symptom in some participants to trigger, or precipitate, the underlying latent trait, whereas the same symptom in other participants is a result of, and may manifest from, or perpetuate, the same underlying latent trait. This modeling flexibility allows for two different symptom manifestations involving the same symptom.
The disease group interactions capture the synergies unique to particular variable-defined (and not participant-defined) subgroups. This allows for detecting the shared synergies across participants within that disease subgroup while factoring out those that occur only in some of the participants of that subgroup as formative indicator effects (otherwise expressed as influential outliers, heteroscedasticity, and/or multicollinearity within the recursive or unidirectional regression portion of the MIMIC model).
The regression model portion of the MIMIC model is similar to PCA (a regression-based procedure), while CFA estimates the measurement model portion. However, the simultaneous estimation using both procedures results in some differences. The latent trait generated reflects both regression (akin to PCA) and CFA because both types of procedures condition it. Since the variation from both formative (causal) and reflective (effect) indicators shapes the derivation of the latent trait, it is likely to differ from the latent trait derived using only one of these procedures. As a modeling approach, it is more valid than assuming that the latent trait derives from only one of these procedures in the absence of evidence or other strong justification. This feature implies that metabolomic and symptom cluster studies that rely on only one of these two procedures may include biases that, in certain cases, could undermine statistical conclusion validity.
Formative (causal) and reflective (effect) indicators may both occur in specific disease processes, but research on symptom clusters does not properly incorporate each of them. The formative (causal) indicators that are statistically significant reflect different, and additive (non-overlapping), sources of variation across the significant items throughout the sample (i.e., the items tap different sources of variation; individual items cannot be dropped without affecting the influence of the remaining items). Thus, these symptoms based on formative indicators tap different sources of variation from participants in the overall sample. The reflective (effect) indicators that are statistically significant reflect shared (overlapping) variation across the significant items within the disease group or subgroup (i.e., the items tap similar variation; individual items can be dropped without affecting the influence of the remaining items). Thus, these symptoms within the symptom clusters based on reflective indicators tap the same or similar sources of variation that tend to co-occur within the same disease group or subgroup.
Symptom clusters based on reflective indicators occur because the symptoms that constitute the cluster are similar in that they stem from the same latent trait reflecting the underlying symptom level. Thus, dropping any of the symptoms does not affect the other symptoms within the cluster. However, symptoms based on formative indicators are different from each other as causes of the latent trait. Thus, dropping any symptom will change the nature of the other symptoms because it contributes a unique variation. Allowing both possibilities within disease groups or subgroups makes sense. Symptoms that are highly similar in their effects are likely to form clusters. On the other hand, symptoms that predict different portions of non-shared variation within the latent trait may or may not also co-occur (leading, if they do, to multicollinearity and heteroscedasticity from the data for the particular observations concerned), but are dissimilar in their effects (i.e., they are not also based on a common variation detected through factor analysis). The formative indicators approach is unique in modeling these multiple modeling influences.
When all panel items are formative indicators, the same data serve as formative (causal) and reflective (effect) indicators, which participate equally in shaping the distributions of the latent trait and measurement model and result in perfect fit (R2 = 1) of the latent trait (Depression). This perfect fit (R2 = 1) of the latent trait (Depression) means it is equivalent to a weighted, additive composite of all formative indicators. This composite can be assessed for individual observations, in contrast to a latent trait based on factor scores, which is valid for the sample at large but indeterminate for individual observations (see Appendix C, note 1). Otherwise, the lack of inclusion of the formative indicators in the structural portion of the MIMIC model leads to a CFA model that retains confounding bias, results in a non-zero error term in the structural model, and cannot be estimated as determinate (nonstochastic; i.e., R2 = 1), which does not allow a weighted additive composite to be derived. In that case, each symptom or item is treated as contributing equally to the overall level when in reality some items contribute disproportionately to the level of the latent trait (as an artifact of uncontrolled confounding bias that is allowed to operate through them in order to derive this equal weighting) (see Appendix C, note 2).
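The equivalence between perfect fit and a weighted, additive composite can be stated in a few lines. This is a sketch in assumed notation: η denotes the latent trait (Depression), x_k the twenty formative indicators, γ_k their weights, and ε the structural residual; the disease group or subgroup predictors are omitted for brevity.

```latex
\[
\begin{aligned}
\eta &= \sum_{k=1}^{20} \gamma_k x_k + \varepsilon, \\
R^2 &= 1 - \frac{\operatorname{Var}(\varepsilon)}{\operatorname{Var}(\eta)}, \\
\varepsilon = 0 \ &\Longrightarrow\ R^2 = 1 \quad \text{and} \quad \eta = \sum_{k=1}^{20} \gamma_k x_k .
\end{aligned}
\]
```

When the residual is zero, η is determinate: given the estimated weights, its value follows directly from each participant's observed item scores, which is what permits assessment for individual observations.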
Arguably, these properties result in more valid derivations of these distributions than does the instrumental variables approach, which depends on the extent of capturing the important co-occurring conditions and confounders (i.e., unspecified confounders contribute to confounding biases) and results in much lower R2 fit statistics. The equivalence of the latent trait to a weighted composite collapses the MIMIC model such that the individual unique variation of the reflective (effect) indicators, considered “measurement error” within the measurement model prediction of the latent trait or weighted composite, is the only remaining type of error in prediction within the MIMIC model. Furthermore, this unique context of perfect fit in which the same measurement items are used as both formative and reflective indicators means that both types of indicators can be assumed to have internal consistency (see Appendix C, note 3). These properties make it attractive to use the same estimated weights to specify, a priori, a fixed-weight additive composite for use in subsequent MIMIC or structural equations models, either in the same or different samples of data.
It is not necessarily the formative indicators per se that are of interest in generating the latent trait but their use as proxies for unspecified confounders that influences its generation. Thus, we secure this control over what would otherwise be considered biases (heteroscedasticity, influential outliers, and multicollinearity) if the formative indicators were interpreted to operate strictly as explanatory variables that do not serve also as proxies for unspecified confounders. Rather, these so-called biases reflect influences from unspecified confounders for which the formative indicators serve as proxies. Just as we do not adjust specified control variables for issues such as heteroscedasticity and influential outliers because they serve only to partial out non-random noise (their slopes are not interpreted), we similarly use the formative indicators to partial out non-random noise from unspecified confounders that must operate through the mediating formative indicators in order to influence the latent trait. The retained, non-adjusted biases from heteroscedasticity, influential outliers, and multicollinearity from unspecified confounders, mediated through the formative (causal) indicators, all contribute variation that results in perfect model fit (see Appendix C, note 4).
The regression-based MIMIC approach with formative indicators has practical utility to address what would otherwise be unresolved residual confounding (due to limitations in collected data) and reverse causation (unanticipated, data-driven causal pathways in epidemiological studies that become unmasked as formative indicators). For instance, Heart Failure is a dummy variable, but in the absence of formative indicators, residual confounding may occur if there is a threshold effect based on the number of days and/or severity of heart failure symptoms. The fact that the structural model residual term (ε) is equal to zero means there is adjustment for all residual confounding when the latent trait is equivalent to the additive weighted composite, and it means no remaining variation contributes residual confounding to the residual term. The only other specified predictor(s) target and estimate the bidirectional relationships within a disease group or subgroup. Thus, this MIMIC model addresses epidemiological biases that would otherwise result from residual confounding and reverse causation (simultaneity). The absence of residual confounding and of confounding due to unspecified reverse causation overcomes biases that would otherwise occur from heterogeneity of effects in different subgroups (see Appendix C, note 5).

4.2. The Derived Protocol

The utility and promise of the formative indicators approach {Figure 1b} to specify MIMIC models for testing panels of symptoms, biomarkers, or metabolites in epidemiological samples require the derivation and articulation of a protocol for such an approach. The experience of conducting the MIMIC analyses with formative indicators in Table 2 forms the basis for deriving the protocol. The protocol is useful to guide analysts in conducting separate runs of regression-based MIMIC models that include formative indicators.
First, the analyst specifies all pathways from the predictor terms representing the disease group or subgroup to all of the reflective indicators. This run reveals the set of statistically significant endogenous reflective indicators that cluster within the exogenous disease group (a predictor specified as a main effect) or exogenous disease subgroup (two or more predictors specified separately and together as components of interaction terms).
Second, the analyst specifies all pathways from the predictor terms representing the disease group or subgroup to all of the now-endogenous formative indicators (dropping the previous pathways to all of the reflective indicators to obtain an identified model). This run reveals the set of statistically significant formative indicators that cluster within the same exogenous disease group as suppressor effects.
Third, when the first of the two previous runs does not converge to yield unique estimates, the analyst reruns the MIMIC sequentially in two parts:
  • In the first part, the analyst no longer specifies all pathways from the disease group or subgroup to the latent trait or additive composite, retaining only the pathways from the disease group or subgroup to the reflective indicators. This broader modeling does not also adjust within the disease group or subgroup for the mediating pathway that accounts for the level of the latent trait or additive composite. Thus, the modeling completely attributes all effects to the reflective indicators without simultaneous inclusion of the influence of the overall level of the latent trait or additive composite within the disease group or subgroup (see Appendix C, note 6).
  • In the second part, the analyst includes all pathways from the disease group or subgroup to the latent trait or additive composite, dropping the pathways from the disease group or subgroup to the reflective indicators. This second part tests whether the latent trait or additive composite is statistically significant within the disease group or subgroup without considering whether any of the reflective indicators are statistically significant within the disease subgroup or group.
Fourth, the analyst may specify a replication of these separate runs in the overall sample at large (i.e., only the formative indicators are exogenous predictors, dropping the predictors representing the disease group or subgroup).
Appendix C, note 7 reflects on these steps with greater specificity, especially in relation to Figure 1b and Table 2.
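The sequence of runs in the protocol can be summarized as a small lookup structure. The sketch below is an illustrative reading of the steps above, not the author's software or Mplus syntax: the class `RunSpec`, its field names, and the run labels are hypothetical, and `None` marks a choice that the protocol text does not state. The `to_latent=True` entry for the first run is inferred from the phrase "no longer specifies" in the third step.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RunSpec:
    """Pathways from the disease group/subgroup terms that a run includes.

    None means the protocol text does not state the choice.
    """
    label: str
    group_predictors: bool         # disease group/subgroup terms in the model
    to_reflective: Optional[bool]  # group terms -> reflective indicators
    to_formative: Optional[bool]   # group terms -> formative indicators
    to_latent: Optional[bool]      # group terms -> latent trait / composite


PROTOCOL = [
    # Step 1; to_latent=True is an inference from step 3, not stated in step 1.
    RunSpec("1: reflective clusters", True, True, False, True),
    # Step 2; reflective pathways dropped so the model stays identified.
    RunSpec("2: formative suppressor effects", True, False, True, None),
    # Step 3, used only when run 1 fails to converge.
    RunSpec("3a: reflective indicators only", True, True, False, False),
    RunSpec("3b: latent trait only", True, False, False, True),
    # Step 4; formative indicators are the only exogenous predictors.
    RunSpec("4: replication in overall sample", False, False, False, False),
]


def included_paths(run):
    """List the group-term pathways that a run explicitly specifies."""
    targets = {
        "reflective indicators": run.to_reflective,
        "formative indicators": run.to_formative,
        "latent trait / composite": run.to_latent,
    }
    return [name for name, flag in targets.items() if flag]


for run in PROTOCOL:
    print(run.label, "->", included_paths(run) or "no group-term pathways")
```

Laying the runs side by side makes it easier to see that the two parts of the third step split the pathways of the first run between them, with neither part estimating both sets at once.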

4.3. The Pattern of Findings

In the MIMIC models to derive the protocol, I expand the causal indicator pathways predicted by four non-traditional CES-D items {Figure 1a} into one predicted by all twenty CES-D items {Figure 1b}. I run this expansion {Table 2 (B)} of the Descriptive MIMIC model {Table 1 (A)} for the targeted subgroup of Diabetes × Heart Failure and for related or derivative interactions to provide more thorough conditioning for confounders. There is consistent evidence over the years of a stable factor analytic structure of the CES-D Depression Inventory, along with clinical evidence that only four of the items are non-traditional. (The non-traditional items are included to optimize sensitivity and specificity in detecting actual cases of depression, but they should be modeled in such a way that they do not interfere with prediction by traditional items when it is necessary to identify traditional symptoms of depression that comprise symptom clusters with different presentations). These characteristics led to the development of thresholds for depression scores that reflect either subthreshold (CES-D ≥ 11 and CES-D < 16) or clinically significant (CES-D ≥ 16) levels [33,34,35].
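The score thresholds quoted above translate directly into a small classification rule. The sketch below is illustrative only: the function name and the label for scores under 11 are assumptions, and the rule is a screening band, not a clinical diagnosis.

```python
def classify_cesd(score):
    """Map a CES-D total score to the screening bands quoted in the text."""
    if score < 0:
        raise ValueError("CES-D total scores cannot be negative")
    if score >= 16:
        return "clinically significant"
    if score >= 11:
        return "subthreshold"
    return "below subthreshold"  # label assumed for illustration


for total in (8, 11, 15, 16, 30):
    print(total, classify_cesd(total))
```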
Even so, the CES-D was developed as a screening instrument to identify cases of potential, clinically significant depression, which require follow-up to ascertain a clinical diagnosis [24]. It is possible the expanded model over-corrects the findings in some respects, even as it appropriately adjusts for additional confounding biases in others. In Table 2, note the very high slope value for Depression in the Diabetes × Heart Failure subgroup when specifying formative indicators only for the four nontraditional CES-D items {i.e., in (A)}. This slope is not significant when specifying formative indicators for all twenty CES-D items; only the CES-D item Happy, as a formative indicator, remains significant {i.e., in (B)}. From a clinical perspective, the sixteen traditional CES-D items constitute symptoms considered in diagnosing depression (e.g., Diagnostic and Statistical Manual-Version 5), and there is much factor-analytic evidence of their validation as reflective indicators (without their simultaneous modeling as formative indicators). However, this literature does not necessarily generalize to broad contexts of “depression in the context of medical illness” since many of these traditional depressive symptoms may also be symptoms of medical illness, and therefore modeling within each disease group or subgroup should also specify them as formative indicators to test bidirectional pathways. Indeed, the several statistically significant formative indicators in Table 2 revealed in the overall disease group, Heart Failure, suggest that these symptoms may be part of the medical illness, and do not necessarily stem from co-occurring underlying depression.
Regardless of etiology, the findings in Table 2 for the disease subgroup with both diabetes and heart failure (Diabetes × Heart Failure) suggest an important nexus for screening and intervention. The very high slope value for Depression in (A) reveals pronounced Depression when individuals experience both diabetes and heart failure, which suggests there may be much utility in targeting screening for symptoms within participants with both conditions. The findings in (B) suggest that much of this symptomatology may be direct symptoms of multimorbidity from these two medical conditions rather than stemming from a separate, underlying, co-occurring condition of depression. The patterns of symptomatology may be diverse and complicated across participants, such that only the CES-D item Happy remains statistically significant as a formative indicator. This finding, albeit solely from statistical modeling, provides indirect, tentative, and cautious support for the role of the protein titin (as discussed earlier in the review of the literature) in individuals with both diabetes and heart failure and for the potential of the medication metformin in treating both conditions. Only the CES-D item Happy remains significant, which suggests that as a group these individuals are prone to experience low positive affect (i.e., they are less likely to endorse feeling happy). Furthermore, the greater number of symptoms {especially in (B)} that form reflective symptom clusters when Hypertension, Heart Attack, and/or Excess Weight are also part of the disease subgroup suggests that the role of the protein titin and/or other metabolomic pathways is expressed through these additional sources of multimorbidity.
It is possible to make too much of the potential for overcorrection, especially compared to a more restrictive MIMIC model specification without formative indicators (which provides less flexibility to model any given panel item across participants) and when the investigation is exploratory. In the exploration of symptom, biomarker, or metabolite panels without any predetermined non-traditional items, we would not expect the unexplained variation in an item to bias the remaining items. The confounding effects related to each item are captured as part of the explained variation in the formative (causal) indicator (e.g., the twenty CES-D items used as formative indicators) on the latent trait (e.g., Depression) for each participant; the other remaining part of the explained variation is the unbiased effect of the measurement item. Thus, the formative (causal) indicator effect consists of both the confounding effects associated with the formative (causal) indicator, along with the unbiased effects of the formative (causal) indicator itself. This adaptive conditioning and modeling with formative (causal) indicators leads to expected unbiased reflective (effect) indicators.
There are parallel, mirroring processes captured by (1) the exogenous versus endogenous pathways, (2) the formative versus reflective indicators of the same measurement items that capture perfect model fit (R² = 1), and (3) the bidirectional, nonrecursive estimation of effects of the measurement items and the weighted additive composite. Considered together, these parallel, mirroring processes all tap more deeply, and precisely, in a modeling sense, how “multiple indicators” truly mimic their “multiple causes.” Formative indicators tap the extent to which specific individual symptoms differ in their relationships to other symptoms. The formative indicators factor away biases from uncontrolled confounding factors, which could include differences in symptom expression in only some of the symptom items within the disease group or subgroup of interest. This strategy leaves the reflective indicators and the latent trait to tap the common or shared symptom expression across the full range of symptoms in the disease group or subgroup. It sidesteps the controversial issue as to whether symptoms should contribute variation to more than one symptom cluster because the formative indicators automatically factor out the influence of uncontrolled confounding factors, which may otherwise lead to heterogeneity in the effects of individual symptoms or across smaller subsets of symptoms.

4.4. Future Issues

4.4.1. Extending the Utility of the Regression-Based MIMIC Model

By capturing multidimensionality within the composite equivalent of the latent trait, the MIMIC model with formative indicators could overcome the restriction of unidimensionality in CFA within the measurement model of reflective indicators, in contrast to when CFA is used alone (outside the regression-based MIMIC framework). Just because a latent trait can be postulated and estimated when only reflective indicators are used in CFA does not necessarily mean the derived latent trait is the most valid estimate of the true latent trait. A true latent trait should have the property that allows it to be modeled by dissimilar formative symptoms that do not in themselves constitute a symptom cluster of reflective indicators. These formative symptoms provide additional, exogenous modeling information to reveal statistically significant reflective symptoms and symptom clusters by identifying this more plausible latent trait equivalent to the additive composite of the formative indicators. By capturing all of the variation across these formative indicators (i.e., R² = 1), this modeling provides determinacy of latent factor scores at the level of the individual observations because they are equivalent to the additive composite scores, in contrast to the indeterminacy of factor scores for individual observations from CFA outside of this MIMIC framework.
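The determinacy property described above can be illustrated numerically. The following Python sketch is an arithmetic illustration only and not the estimation procedure used in this study: the simulated item scores, the weights, and the function name `fit_composite` are all hypothetical. It assumes a trait that is an exact weighted sum of twenty items (no structural residual) and shows that multiple regression then recovers the weights with R² = 1, so a composite score exists for every individual observation.

```python
import numpy as np

def fit_composite(X, composite):
    """Regress a composite on its component items (OLS); return weights and R^2."""
    # Add an intercept column so the fit matches ordinary multiple regression.
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, composite, rcond=None)
    fitted = design @ coef
    ss_res = np.sum((composite - fitted) ** 2)
    ss_tot = np.sum((composite - composite.mean()) ** 2)
    return coef[1:], 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
# Hypothetical data: 200 participants, 20 items scored 0-3 (CES-D-like).
items = rng.integers(0, 4, size=(200, 20)).astype(float)
true_weights = rng.uniform(0.1, 1.0, size=20)  # illustrative weights only
composite = items @ true_weights               # no residual term (zeta = 0)

weights, r_squared = fit_composite(items, composite)
individual_scores = items @ weights            # one determinate score per participant
```

Adding any residual noise to `composite` would push R² below 1, and the individual scores would then no longer be exact functions of the items.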
CFA is based on the restrictive assumption that the reflective indicators tap a unidimensional latent construct. Even if the exogenous predictors in the regression estimation of the MIMIC structural model are each unidimensional, the additive composite may not be since it consists of the weighted sums of the predictors. The latent trait may thus be multidimensional since the additive composite is equivalent to the latent trait. The determinacy of the latent trait/additive composite allows the structural portion of the MIMIC model to be separate as a multiple regression model. The multidimensionality among the predictors and within the latent trait/additive composite may be modeled by the regression-based structural portion of the MIMIC model, which overcomes the CFA restriction of unidimensionality, allowing the CFA portion to model also the same non-normal variation across the reflective (effect) indicators [5,6,7,8,9]. Even if the symptoms as reflective indicators together tap a unidimensional dimension (latent construct) within a broader multidimensional latent trait, this does not preclude that localities of multidimensionality within the latent trait may be modeled legitimately in both the structural and measurement-model portions of an encompassing MIMIC model.
It is apparent that a lack of thorough and careful attention to model specification decisions, both initially and in subsequent bias conditioning, may lead to undetected outliers, heteroscedasticity, and nonessential multicollinearity in regression-based models that lead to biased probability values and incorrect inferences within homogeneous and heterogeneous samples. However, the extent to which researchers in applied fields will consistently meet these standards continues to be rather limited. Lack of data for important variables may lead to misattributed effects even when meeting these standards. However, in regression-based MIMIC models with formative indicators, the combined regression (structural) and CFA (measurement) models together partial out unknown and unspecified confounding biases and create a more valid and precise additive composite of the shared trait “behind” what would be a more indeterminate latent trait shaped only by CFA. This modeling improvement leads to more sound probability values and confidence intervals. It helps safeguard against drawing incorrect inferences and therefore should be a promising option in the arsenal of approaches to address the scientific replication crisis (see Appendix C, note 8).
Analysts can also use the regression-based MIMIC with formative indicators when the exogenous predictor is for an intervention group rather than a disease group (e.g., a dummy variable in which the zero category refers to a comparison condition such as treatment as usual). The predictor can also be an interaction term that reflects the effect of the intervention within a targeted participant subgroup (e.g., in males) in which randomization in a randomized controlled trial (RCT) no longer holds within the subgroup. This addresses situations of residual confounding, such as when RCTs involve insufficiently randomized small samples or when the specified confounder variable in a stratified or regression analysis is not precise enough. Regression-based MIMIC analysis has promising utility for intervention subgroups that are no longer randomized and in observational studies lacking randomization altogether. Confounding bias categories based on dummy variables (e.g., whether a study is an RCT) may be insufficiently precise, resulting in residual confounding when randomization is not adequate across the overall sample. Finally, the approach may have special application not only within a disease group or subgroup, or for revealing the effect of an intervention, but also for targeting the intervention within a disease group or subgroup (e.g., diabetes × heart failure × intervention), either in an RCT or in a quasi-experimental or observational study. In these situations, the regression-based MIMIC model with formative indicators partials out confounding from imperfect or absent randomization, other residual confounding, and reverse causation.
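As a minimal sketch of how such subgroup predictors might be coded, the Python function below builds the product terms for three 0/1 dummy variables. The function name `subgroup_terms`, the column labels, and the example values are hypothetical, and including all lower-order terms alongside the three-way product follows the usual convention in moderated regression rather than a specification confirmed by this article.

```python
import numpy as np

def subgroup_terms(diabetes, heart_failure, intervention):
    """Return main-effect and product-term columns for three 0/1 indicators."""
    d = np.asarray(diabetes, dtype=float)
    h = np.asarray(heart_failure, dtype=float)
    t = np.asarray(intervention, dtype=float)  # 0 = comparison (e.g., treatment as usual)
    return {
        "diabetes": d,
        "heart_failure": h,
        "intervention": t,
        "diabetes_x_hf": d * h,
        "diabetes_x_intervention": d * t,
        "hf_x_intervention": h * t,
        # Equals 1 only for participants with both conditions who received the intervention.
        "diabetes_x_hf_x_intervention": d * h * t,
    }

# Four hypothetical participants.
terms = subgroup_terms([1, 1, 0, 1], [1, 0, 1, 1], [1, 1, 1, 0])
```

Only the first hypothetical participant has diabetes, heart failure, and the intervention, so only that row carries a 1 in the three-way term.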

4.4.2. Application to Metabolomic Profiling and Symptom Clusters in Epidemiology

Beyond these broader modeling concerns, there are issues specific to metabolomic profiling and symptom clusters in epidemiological studies. Measures of metabolite levels in high-throughput profiling studies and of symptoms in related epidemiological studies tend to be semi-quantitative, which can make it difficult to contrast and integrate findings across such studies [36]. In MIMIC models, even when minimum threshold concentrations for each metabolite are unknown, the inclusion of curvilinear and interaction predictors may still detect effects that may be masked when specifying only the main-effects (one-way) predictors. The higher-order effects manifest after exceeding this unknown threshold or as synergies based on thresholds within particular subgroups. The analyst may also adjust the different types and levels of confounding factors across these studies by specifying panel items as formative (causal) indicators that mediate their biasing influence, regardless of whether they are continuous or ordinal.
Ordinal MIMIC models allow greater modeling flexibility and may be more valid than continuous MIMIC models since reflective (effect) indicators often reflect semi-quantitative, ordinal, and not strictly continuous items. Ordinal variables with multiple rather than binary categories provide a better means for comparison across panel studies when fully quantitative metabolite concentrations are not feasible; they can be accommodated correctly through ordinal probit estimation of MIMIC models rather than over-estimated by considering reflective indicators of the panel items as if they were strictly continuous. Furthermore, ordinal specification of the reflective (effect) indicators avoids the need to specify numerous covariances (and possible over-fitting) across the reflective (effect) indicators in order to obtain acceptable, continuous-model fit statistics (which occurred in parallel continuous MIMIC model runs in [28]). There is much untapped scope to apply the MIMIC model, especially the ordinal probit MIMIC model, in metabolomic profiling and symptom cluster studies (see Appendix C, note 9).
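To make the ordinal probit idea concrete, the Python sketch below computes category probabilities for one ordinal reflective indicator from a continuous latent response with a standard normal error and ordered cut points. The function names, the threshold values, and the linear predictor are hypothetical illustrations; they are not estimates from this study, and the sketch does not reproduce the full MIMIC estimation.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ordinal_probit_probs(linear_predictor, thresholds):
    """P(Y = k), k = 0..K, for ordered cut points tau_1 < ... < tau_K."""
    cuts = [-math.inf] + list(thresholds) + [math.inf]
    # Cumulative probability that the latent response falls below each cut point.
    cdf = [normal_cdf(c - linear_predictor) for c in cuts]
    return [cdf[k + 1] - cdf[k] for k in range(len(cuts) - 1)]

# Hypothetical four-category item (e.g., scored 0-3) with three cut points.
probs = ordinal_probit_probs(0.5, [-0.5, 0.5, 1.5])
```

A larger linear predictor (for instance, a higher latent trait level multiplied by a positive loading) shifts probability mass toward the higher response categories, which is the behavior a strictly continuous specification cannot represent for bounded ordinal items.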
Regression-based MIMIC interactions tap synergy among disease group predictors in order to reveal disease subgroups or subphenotypes implicated in prognostic or predictive enrichment. These disease subgroups or subphenotypes may differ from those based on the detection of clusters of observations derived in cluster analysis, an exploratory, data-driven approach with various options for feature selection and commonly used to detect metabolomic subgroups/subphenotypes and symptom clusters [10]. The moderated regression framework in regression-based MIMIC models avoids the feature selection issues inherent in selecting an optimal algorithm of cluster analysis for a given data context. This means that regression-based MIMIC models with measurement model items specified simultaneously as formative and reflective indicators can be tested as to whether they replicate and confirm the same, or similar, subgroups or subphenotypes detected in cluster analysis (and vice versa), thus providing analyses of statistical conclusion validity for the same data.
The MIMIC model is a promising approach to identify integrated, and not isolated, individual metabolomic processes (e.g., within the proteome), and to conduct analyses at the higher level of symptoms or the whole system. It allows us to determine which specific ‘omics perspectives involving cells, tissues, organs, and symptoms—and more definitively, which of the observed items from a panel within any given ‘omics focus—are most active as potential clinical targets [37]. A depressive symptom has a different meaning when it occurs as a residual depressive symptom with no clinical significance than when it occurs as part of clinically significant depression. Similarly, the meaning of a metabolite-specific effect is likely to differ when the functioning and metabolic reaction rate of the system is within a healthy range versus when it is functioning poorly. The first-order latent trait (or additive composite) reflects the total level of the known reaction networks within the particular ‘omics focus in modeling the observed items from the panel, which could provide a proxy indication of the metabolic reaction rate. A poorly functioning system may reveal low or inconsistent effects across the metabolites. Although certain metabolites may have prominent effects, others may have impaired effects, resulting in an inadequate metabolic reaction rate suggested by the latent trait or weighted additive composite. These factors could precipitate and sustain prominent symptoms and symptom clusters.

5. Conclusions

5.1. Summary

Through an exploration of specific conditions (diabetes, heart failure, related vascular/metabolic diagnoses) and their multimorbidities, I developed a more thorough means to adjust confounders of clinical targets within main or interactive contexts (diseases, genes, epigenetic factors, risk factors) in epidemiological panel studies of symptoms, biomarkers, or metabolites. Regression-based multiple indicators-multiple causes (MIMIC) models combine multiple or moderated regression and confirmatory factor analysis. In a novel specification, each of twenty depressive symptoms is both a “formative” (causal) indicator and a “reflective” (effect) indicator of a latent trait (Depression). Formative indicators within the multiple regressions constitute comprehensive proxies for unspecified confounders (which may be unknown) by completely mediating all unspecified confounder effects on the endogenous latent trait and its reflective indicators, the latter estimated through confirmatory factor analysis. This strategy avoids the need to specify all confounders, which may not be possible or verifiable.
Using epidemiological data on metabolic and vascular conditions, I developed a protocol to specify MIMIC models that unveil unbiased clusters of psychometric items (the specific metabolites, biomarkers, or symptoms) of a latent trait (the overall level of the panel of items) within main or interactive disease contexts or even across the sample. The current investigation focused on the individual direct effects and synergistic comorbidity of diabetes and heart failure with this targeted diagnostic subgroup also expanded to incorporate synergies from related conditions, such as hypertension, silent cerebrovascular disease, and previous heart attack. It yields a protocol for testing a panel of metabolites, biomarkers, or symptoms as formative and reflective indicators in a MIMIC model, but beyond this specific epidemiological focus, the protocol is useful to guide analysts across disciplines in conducting separate runs of regression-based MIMIC models that include formative indicators.
Findings of symptom clusters of depression in specific conditions, and in subgroups that capture their multimorbidity synergies, corroborate parallel MIMIC models with instrumental variables that specify several known confounders, but suggest they retain some confounding biases. In particular, there is evidence of pronounced levels of depression when individuals experience both diabetes and heart failure. Other analyses suggest that much of this symptomatology may consist of direct symptoms of multimorbidity from these two medical conditions rather than stemming from a separate, underlying, co-occurring condition of depression. These findings indirectly support the role of the protein titin and the potential of the medication metformin in co-occurring diabetes and heart failure.

5.2. Model Specification and Methodological Contributions

A major contribution of this article is that it easily adapts the latent trait model into a nonrecursive, bidirectional specification of the MIMIC model, and in so doing provides a means to partial out unspecified confounding factors. I call attention to the unrecognized possibility of using the regression-based approach to model any given panel item as both an exogenous variable and an endogenous variable within the same MIMIC model. This overlooked strategy allows some or all items from a panel to be modeled simultaneously as formative (causal) and reflective (effect) indicators of a latent trait. It enables CFA to be conducted across the sample or to be targeted within an overall group or interactive subgroup while providing extensive and comprehensive control of confounding factors.
This innovation deciphers and adjusts background confounding biases by estimating bidirectional causal pathways based on both types of indicators of each panel item (and the derivative latent trait) in order to unveil effects from groups of participants (e.g., disease conditions) or subgroups (e.g., synergies from multimorbidity in co-occurring conditions). It relies on specifying non-recursive, bidirectional causal relationships between each panel item and the latent trait. This type of modeling adjusts for the impact of the formative indicator of each panel item as a cause of the latent trait that would otherwise bias the relationship of the latent trait on the reflective indicator of the same panel item. The lack of specification and adjustment for these relationships of reciprocal causation contributes confounding biases within the formative indicators portion of the MIMIC model. Regression bias in estimating the latent trait, in turn, triggers non-optimal, biased estimation across its reflective indicators.
Each unspecified confounder operates through its effects on each of the specified formative indicators that predict the latent trait. The error-conditioned reflective (effect) indicators can be modeled across the sample at large, in an overall group or within a more targeted subgroup of interactions among the group factors.
The full panel of observed items contributes to a latent trait representing the overall level of the total panel {e.g., Depression in Figure 1a,b}. The simultaneous control for the level of the total panel allows the estimation of more valid specific effects for individual observed items. This approach adjusts for the overall dynamic state of a system within cross-sectional data.
The instrumental variable in Figure 1a or the formative (causal) indicator in Figure 1b, which is exogenous, and the reflective (effect) indicator (in both panels) of the same item, which is endogenous, comprise bidirectional non-recursive pathways involving the latent trait (Depression). They derive essentially from the same item, which provides exogenous information in one pathway and is recovered as an endogenous factor in the other, and are therefore labeled differently as separate variables in the two pathways.
The use of instrumental exogenous variables of some of the psychometric items {Figure 1a}, or the use of original exogenous variables to incorporate formative (causal) indicators of all of the psychometric items {Figure 1b}, each allows a more expansive and flexible specification involving reflective (effect) indicators and bidirectional non-recursive pathways. However, the incorporation of formative (causal) indicators of all of the psychometric items in Figure 1b is likely to partial out confounding factors more thoroughly.
The highly similar, overlapping findings from both approaches provide evidence that the formative indicators approach (which does not include the control variables specified in the instrumental variables approach) adjusts for unspecified confounders. Furthermore, some additional reflective (effect) indicators within specific disease subgroups that were not significant in the instrumental variables approach (Table 1) become significant in the formative indicators approach (Table 2). This suggests that there are additional salient confounders beyond those specified as control variables in the instrumental variables approach and that the formative indicators approach has practical utility.
In certain disease subgroups, some Descriptive MIMIC regression slopes and standard errors in (A) of Table 1 are almost identical to those in (A) of Table 2. Thus, both specifications {instrumental variables and formative (causal) indicators} converge to estimate the same MIMIC model. In the remaining diagnostic subgroups, findings from the Descriptive MIMIC in (A) of Table 1 are strongly consistent with those in (A) of Table 2, with almost all of the same reflective (effect) indicators found statistically significant.
In addition to diagnostic predictors for the group or subgroup of interest, the explanatory MIMIC models reported in Table 1 (B) include a comprehensive (not exhaustive) set of related personal characteristics and diagnostic conditions (dummy variables) specified in order to control for known co-occurring conditions or confounders. (These are: Black, male, age 75 or older, not a high school graduate, recent widow, income equivalence adjusted for family size, isolated, smoker, alcohol consumption, hypertension, silent cerebrovascular disease, heart failure, excess weight, lost ten pounds during the past three months, diabetes, heart attack, and number of cerebrovascular risk factors). To a strong degree, the Explanatory MIMIC findings {Table 1 (B)} are similar to the findings in Table 2 (B), in which all twenty psychometric items are each specified as a formative (causal) indicator of a MIMIC model, along with the diagnostic predictors for the group or subgroup of interest but with no comprehensive set of specified confounders (see Appendix D, note 1).
The use of the original data from the exogenous predictors incorporates skewness and non-normality into the generated distributions of the separate, endogenous latent variables behind the observed ordinal reflective indicators. The shared variation across these separate latent variables, which constitute the latent trait or additive composite, may also include skewness and non-normality. Thus, the conditional effects of the regression-based MIMIC model incorporate multidimensionality within the latent trait or additive composite. The specification of the same items as both formative and reflective indicators incorporates all the relevant non-normality in the perfectly estimated (R² = 1) MIMIC model with an additive composite, in contrast to the imperfectly specified and estimated instrumental variables approach (where R² is much lower) that may retain confounding biases by missing predictors. Thus, the multidimensionality incorporated by the formative indicators approach is comprehensive, nonbiased, and meaningful (see Appendix D, note 2).
In the formative indicators approach, the same data serve as formative (causal) and reflective (effect) indicators, which participate equally in shaping the distributions of the latent trait and measurement model and result in perfect fit (R² = 1) of the latent trait. This perfect fit (R² = 1) of the latent trait means it is equivalent to a weighted, additive composite of all formative indicators. This additive composite can be assessed for individual observations, in contrast to a latent trait based on factor scores, which is valid for the sample at large but indeterminate for individual observations. Arguably, these properties result in more valid derivations of these distributions than does the instrumental variables approach, which depends on the extent of capturing the important co-occurring conditions and confounders and results in much lower R² fit statistics. The equivalence of the latent trait to a weighted additive composite collapses the MIMIC model such that the individual unique variation of the reflective (effect) indicators, considered “measurement error” within the measurement model prediction of the latent trait or weighted additive composite, is the only remaining type of error in the model (see Appendix D, note 3).
The formative (causal) indicator effect consists of both the confounding effects associated with the formative (causal) indicator, along with the unbiased effects of the formative (causal) indicator itself. This adaptive conditioning and modeling with formative (causal) indicators leads to expected unbiased reflective (effect) indicators.
The determinacy of the additive composite allows the structural portion of the MIMIC model to be separate as a multiple regression model. Multidimensionality among the predictors and within the additive composite may be modeled by the regression (structural) portion of the MIMIC model, which overcomes the CFA restriction of unidimensionality, allowing the CFA portion to also model the same non-normal variation across the reflective (effect) indicators [5,6,7,8,9].
In regression-based MIMIC models with formative indicators, the combined regression (structural) and CFA (measurement) models together partial out unknown and unspecified confounding biases and create a more valid and precise additive composite of the shared trait “behind” what would be a more indeterminate latent trait shaped only by CFA. This modeling improvement leads to more sound probability values and confidence intervals. It helps safeguard against drawing incorrect inferences and therefore should be a promising option in the arsenal of approaches to address the scientific replication crisis.

5.3. Further Implications for Research on Symptom Clusters

Formative and reflective indicators may both occur in specific disease processes, but research on symptom clusters does not properly incorporate each of them. The formative (causal) indicators that are statistically significant reflect different, and additive (non-overlapping), sources of variation across the significant items throughout the sample (i.e., the items tap different sources of variation; individual items cannot be dropped without affecting the influence of the remaining items). Thus, these symptoms based on formative indicators tap different sources of variation from participants in the overall sample. The reflective (effect) indicators that are statistically significant reflect shared (overlapping) variation across the significant items within the disease group or subgroup (i.e., the items tap similar variation; individual items can be dropped without affecting the influence of the remaining items). Thus, these symptoms within the symptom clusters based on reflective indicators tap the same or similar sources of variation that tend to co-occur within the same disease group or subgroup.
Symptom clusters based on reflective indicators occur because the symptoms that constitute the cluster are similar in that they stem from the same latent trait reflecting the underlying symptom level. Thus, dropping any of the symptoms does not affect the other symptoms within the cluster. Allowing both possibilities within disease groups or subgroups makes sense. Symptoms that are highly similar in their effects are likely to form clusters. On the other hand, symptoms that predict different portions of non-shared variation within the latent trait may or may not also co-occur (leading, if they do, to multicollinearity and heteroscedasticity from the data for the particular observations concerned), but are dissimilar in their effects (i.e., they are not also based on common variation detected through factor analysis). The formative indicators approach is unique in modeling these multiple influences.
Formative indicators tap the extent to which specific individual symptoms differ in their relationships to other symptoms. The formative indicators factor away biases from uncontrolled confounding factors, which could include differences in symptom expression in only some of the symptom items, or in smaller clusters with fewer symptoms, within the disease group or subgroup of interest. This strategy leaves the reflective indicators and the latent trait to tap the common or shared symptom expression across the full range of symptoms in the disease group or subgroup. It sidesteps the controversial issue as to whether symptoms should contribute variation to more than one symptom cluster because the formative indicators automatically factor out the influence of uncontrolled confounding factors, which may otherwise lead to heterogeneity in the effects of individual symptoms or across smaller subsets of symptoms.
Ordinal variables with multiple categories for psychometric items provide a better means for comparison across panel studies when fully quantitative symptom measures or metabolite concentrations are not feasible. As in the current study, they can be accommodated correctly through ordinal probit estimation of MIMIC models rather than over-estimated by modeling reflective indicators of the panel items as if they were strictly continuous. Furthermore, ordinal specification of the reflective (effect) indicators avoids the need to specify numerous covariances (and possible over-fitting) across the reflective (effect) indicators in order to obtain acceptable, continuous-model fit statistics (which occurred in parallel continuous MIMIC model runs in [28]). There is much untapped scope and potential to apply the MIMIC model, especially the ordinal probit MIMIC model, in studies of symptom clusters or metabolomic profiling.

Supplementary Materials

The following are available online at https://www.mdpi.com/article/10.3390/math9212715/s1, Figure S1: Footnotes to Figure 1; Table S1: Footnotes to Table 1; Table S2: Footnotes to Table 2 [38].

Funding

This research received no external funding.

Informed Consent Statement

Not applicable.

Data Availability Statement

I use de-identified, publicly available survey data from the New Haven, Connecticut subsample of community-residing older adults from the Established Populations for the Epidemiological Study of the Elderly (EPESE; unweighted n = 2812; www.icpsr.umich.edu/NACDA/studies/9915, accessed on 1 July 2021). See [29].

Conflicts of Interest

The author declares no conflict of interest.

List of Abbreviations and Equivalent or Related Terms

Abbreviation or Term: Full Expression or Definition
MIMIC: Multiple Indicators-Multiple Causes
PCA: Principal Components Analysis
CFA: Confirmatory Factor Analysis
CES-D: Center for Epidemiological Studies-Depression, an inventory of depressive symptoms
EPESE: Established Populations for the Epidemiological Study of the Elderly, the publicly available data analyzed in this study
Formative Indicator: Causal Indicator in the MIMIC Measurement Model
Instrumental Variable: Identical to the continuous variable for each of four non-traditional CES-D items, except for setting the values of residual depressive symptoms to zero when the CES-D score is less than eleven. Thus, we retain positive responses only in cases of subthreshold or clinically significant depression. The instrumental variable correlates very highly with the original variable.
Latent Trait: Equivalent to Latent Construct. Note: it is determinate (non-stochastic) when there are no residuals in the MIMIC Structural Model (ζi = 0 and R² = 1) and becomes an Additive Weighted Composite.
RCT: Randomized Controlled Trial
Reflective Indicator: Effect Indicator in the MIMIC Measurement Model
R²: Coefficient of Determination (R-squared)
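
The definition of the instrumental variable given in the list above can be sketched in a few lines of Python. The function name `make_instrument`, its argument names, and the example scores are hypothetical; only the rule itself (zeroing an item response when the CES-D score is less than eleven) comes from the definition.

```python
def make_instrument(item_scores, cesd_totals, threshold=11):
    """Zero out an item's responses where the CES-D total is below the threshold."""
    if len(item_scores) != len(cesd_totals):
        raise ValueError("item_scores and cesd_totals must have the same length")
    # Keep the original response only for subthreshold or clinically
    # significant depression (CES-D >= threshold); otherwise set it to zero.
    return [
        score if total >= threshold else 0
        for score, total in zip(item_scores, cesd_totals)
    ]

# Four hypothetical participants: item responses and CES-D total scores.
instrument = make_instrument([2, 1, 3, 0], [15, 10, 11, 4])
```

In this example the second participant's response is set to zero because the CES-D total of 10 falls below eleven, while the third participant's response is retained at the boundary value of 11.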

Appendix A. (Introduction)

1. These and other multidimensional data reduction approaches identify disease groups or subgroups, phenotypes or subphenotypes, genes, epigenetic influences, or risk factors in which particular metabolites occur, especially since metabolites usually stem from the same biological pathways. At the same time, biomarkers or symptoms either channel the influence of individual metabolites and/or reflect different biological processes or multi-system effects (e.g., sepsis).
2. The non-normality (skewness, kurtosis) captured in the structural (regression) portion of the MIMIC model engenders non-normality in the estimated ordinal probit latent variables “behind” each of the observed or manifest reflective indicators in the endogenous CFA portion. This non-normality would be absent in a pure CFA model with ordinal probit latent variables generated as normal. The non-normality improves detection of interaction terms (e.g., reflecting disease subgroups of coexisting conditions), which tap non-normal variation [5,6,7,8,9].
3. In the structural portion of the MIMIC model in Equation (1), the expected value (mean) of the residual term (ζi) is zero. In the measurement portion of the MIMIC model, the expected value (mean) of the latent trait (ηi) is zero, which allows the measurement errors (εi) that reflect unique variation in each observed item to be non-zero. These differences in assumptions enable the multiple regression procedure of the structural model to tap the unique effects of each item as a formative indicator and the CFA procedure of the measurement model to tap its shared effects with the remaining items as a reflective indicator.
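
The two sets of assumptions in note 3 can be written side by side. This is a sketch only: Equation (1) is not reproduced in this appendix, so the symbols for the formative indicators (x_i), their regression weights (γ), the reflective indicators (y_i), and the loadings (Λ) are assumed notation, while ζ_i, η_i, and ε_i follow the text.

```latex
\begin{aligned}
\eta_i &= \boldsymbol{\gamma}^{\top}\mathbf{x}_i + \zeta_i, & E(\zeta_i) &= 0 && \text{(structural portion, multiple regression)}\\
\mathbf{y}_i &= \boldsymbol{\Lambda}\,\eta_i + \boldsymbol{\varepsilon}_i, & E(\eta_i) &= 0 && \text{(measurement portion, CFA)}
\end{aligned}
```

Under this notation, the weights γ carry the unique effect of each item as a formative indicator, and the loadings Λ carry the variation each item shares with the remaining items as a reflective indicator.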

Appendix B. (Materials and Methods)

1. This approach adjusts for the overall dynamic state of a system within cross-sectional data. It better distinguishes metabolite-specific effects from their systemic or “shared” influences on metabolic reaction rates, which occur across, and affect the overall level of, the metabolite, biomarker, or symptom panel.
2. Analyses to detect suppressor effects specify the predictors representing the main or interactive epidemiological contexts of diagnoses, genes, epigenetics, environmental risks, or other risk factors, not as competing factors, but as causes of the formative indicators, which become mediator variables considered endogenous. Figure 1 does not specify these causal paths from each diagnosis that comprises the diagnosis subgroup either to the instrumental variables in (a) or to a subset of the formative indicators in (b).
3. Only the exogenous instrumental variable predictor of Lonely, regardless of depression level (i.e., CES-D ≥ 0), in Figure 1a can be taken to differ in any substantive sense. Although it also reflects residual depressive symptoms in participants without clinically significant or subthreshold depression, it remains very highly associated with its corresponding endogenous predictor of Lonely when CES-D ≥ 11.
4. These incorrect reflective (effect) indicator effects are associated with unspecified epigenetic factors (i.e., non-genetic influences on gene expression that may serve as environmental confounding factors in metabolomic studies) or environmental or other risks. These unspecified epigenetic and risk factors are expressed as part of the instrumental variable effects or formative (causal) indicator effects on the latent trait and do not derive as reflective (effect) indicators of the latent trait. An advantage of this reflective (effect) indicators portion of the measurement model (estimated using CFA) is that it captures the “pathway and whole-systems level” effects [33] (p. 2) shared across the multiple, observed indicators that manifest as loadings of the latent trait.
5. Of course, when the focus of the MIMIC model is a specific disease condition in this listing (from hypertension to heart attack), or a targeted disease subgroup involving the interaction of specific disease conditions, the disease condition is a primary variable, not a confounder.

Appendix C. (Discussion)

1. Because the residual from the structural model, ζi, in equation (1) is zero, the additive composite is determinate (non-stochastic) whereas a latent trait predicted by a non-zero residual from the structural model would be partially indeterminate (stochastic). In the measurement model, the shared variation across all reflective indicators (Λ ηi) and the unique variation within each reflective indicator (εi) are determinate because they reflect these systematic effects, whereas a non-zero residual from the structural model (ζi) is random, non-systematic, and stochastic. Unlike the residual from the regression-based structural model (ζi), the unique variation within each reflective indicator (εi) from the CFA-based measurement model is not a residual that taps random variation.
2. In contrast, the weighted additive composite provides more valid total scores that can be used in individual situations to obtain an overall symptomatology or metabolite score for a given panel of items, and as a more valid summative measure for use in subsequent statistical analyses (e.g., in multiple regressions). Formative indicators with highly overlapping variation (multicollinearity) only contribute their non-overlapping variation in the regression-based structural model that predicts the weighted additive composite (the residual term equals the constant value of zero), and the CFA-based measurement model taps their overlapping variation. (In the absence of the CFA portion of MIMIC, the resulting multiple regression would sequester this overlapping variation in the residual term). This allows for different types of presentations of symptoms in different participants; it is not misleading as would be the use of multiple regression alone when the data comprise heterogeneous subgroups with different effects.
Thus, the use of both partitions of the unique and shared variation from data captured by the regression-based structural model, and by the CFA model, to create the additive weighted composite is more valid than a purely CFA-determined latent trait of only the shared portion of the data. The weighted additive composite also comprises the unique portion of the variation, unlike the latent trait that would be determined only by CFA, which means it has a determinate value for each observation, in contrast to factor scores from CFA. The weighted additive composite provides the best total score within the particular disease group or subgroup tested in the MIMIC model. The different weights make sense since some items are more pronounced in their effects in the particular disease group or subgroup. In contrast, CFA alone provides a latent trait total score, based equally on all items, which is valid across the sample at large.
Some items, based on their formative indicators, are more important influences in shaping the additive weighted composite than the unique variation in other items. Thus, the greater the unique variation contributed by an item, the more it contributes to the additive weighted composite. Since only the unique variation is contributed by a formative indicator, the statistical significance of two formative indicators does not mean that they constitute a symptom cluster within the same disease group or subgroup, only that these two formative indicators are significant in the sample at large.
3. Of course, the reflective indicators inter-correlate since they all stem from a common cause (the latent trait, which in this case, is equivalent to the composite). Because the formative indicators are the same measurement items specified as reflective indicators, they must also be inter-correlated, which suggests they share similar antecedents and consequences, although not necessarily completely, as it is the unique or non-shared variation within each formative indicator that predicts the composite.
4. However, since the structural model taps only the variation unique to each formative indicator, changing the order of specifying the formative indicators as predictors should not shift the regression slope values of the composite weights. Similarly, measurement model loadings do not shift from changing the order of specifying the reflective indicators.
5. The property of local independence and the conditional independence of the reflective (effect) indicators mean that their prediction within disease or other groups and subgroups, in the measurement-model portion of the MIMIC model, avoids regression-based complications, often undetected or unadjusted, that may compromise effects by the formative (causal) indicators in the structural portion of the MIMIC model. These include biases from multicollinearity, heteroscedasticity, and influential outliers that arise from differential effects in subsets of participants and from unspecified confounders, and may undermine the local independence of effects by formative (causal) indicators. Thus, the formative indicators approach appears sound to interpret the reflective (effect) indicators within disease or other groups and subgroups.
6. These are not unique effects but rather total effects of the reflective indicators comprising the symptom cluster. When the initial model of the unique effects is inestimable, we cannot calculate the total effect of each reflective indicator (i.e., its unique effect plus the effect of the latent trait or additive composite within the disease group or subgroup). Rather, this broader modeling detects these hidden effects by estimating the total effects of each reflective indicator directly.
7. The following reviews these steps with greater specificity, especially in relation to Figure 1b and Table 2.
  • The first part of a protocol for conducting MIMIC models in symptom, biomarker, or metabolite panels is to run the model with all metabolites, biomarkers, or symptoms as both formative and reflective indicators, as in Figure 1b. There is a possibility of pronounced confounding biases especially when predictors represent overall groups (i.e., without interactions among predictors that would detect effects within targeted subgroups). I recommend including formative indicators of all measurement items for all MIMIC models, especially when MIMIC models are limited to main (non-interactive) effects. In Table 2, these are either unlabeled, or listed as “Full” runs when also reporting other types of MIMIC runs for the same diagnosis subgroup (in order to distinguish them).
  • Each MIMIC model should then be rerun to account for potential suppressor effects by predicting each formative (causal) indicator by the predictors comprising the exogenous group or subgroup {i.e., the formative indicator becomes a mediator and switches from exogenous to endogenous}. These are “Mediated” runs in Table 2. {While the formative (causal) indicators serve to mediate unspecified confounders in all of the MIMIC analyses, the term “mediated” in this second step of the protocol refers to MIMIC models in which the formative (causal) indicators mediate the specified group or subgroup. Also, note that Table 2 reports some mediated runs when only the four non-traditional CES-D items have formative indicators {i.e., in (A)}. In these cases, mediated runs do not also yield estimates when all twenty CES-D items have formative indicators {i.e., in (B)} because the model is unidentified. These runs are “Non-Mediated” in order to distinguish them from the mediated run in (A)}.
  • Some full or mediated MIMIC models do not converge to yield unique estimates of the latent trait or additive composite along with all of its reflective indicators, which leads to the third approach in the protocol. In sequential estimation, the MIMIC model is first specified and run to interpret the reflective indicators without also controlling for the latent trait or additive composite {e.g., Depression; all of the predictor arrows are dropped that lead to Depression (represented by pathways 1 and 4 in Figure 1b)}. The analyst then re-specifies and reruns it to interpret the latent trait or additive composite without also controlling for the individual reflective psychometric items {dropping all ‘2’ pathways in Figure 1b}. These are “Sequential” runs in Table 2. {This sequential approach detects statistically significant panel items without also adjusting for the overall level across all items, or detects a statistically significant latent trait without also adjusting for individual items. It is especially appropriate when specifying a latent trait or additive composite only as a device to model a panel of metabolites, biomarkers, or symptoms, rather than as something inherently and substantively meaningful. For instance, the analyst may specify a panel of metabolites even though they are not all necessarily reflective indicators of a single latent trait or dimension; the latent trait merely controls for the overall level of metabolites, even if they are not unidimensional.}
  • A final approach in the protocol is the exclusion of exogenous predictors representing specific groups or subgroups {e.g., all of the ‘1’ pathways representing the diagnosis subgroup (Diabetes, Heart Failure, and their interaction) are dropped in Figure 1b} to provide estimates across the entire panel sample rather than within a predetermined disease group or subgroup from the sample. The analyst may run this final approach whether or not any of the previous approaches converge to an optimal solution. It is appropriate if there is no real basis to identify specific groups or to target subgroups. The unreported MIMIC model for the sample at large in the current study reveals all of the reflective indicators to be statistically significant, with items of dysphoria or low positive affect (Depressed 2, Sad 2, Blues 2, Happy 2) having the highest measurement loadings (0.616 to 1). All of the formative indicators are statistically significant, with two items of dysphoria (Depressed, Sad) having the highest regression slopes (0.724 and 0.671, respectively).
8. How might analysts integrate formative and reflective indicators of the same measurement items into covariance-based MIMIC models? Even as both indicators provide identical information under different variable names, it is unclear whether covariance-based estimation can accommodate both within the same MIMIC model, especially since all variables are modeled as endogenous factors (in contrast to the exogenous nature of the formative indicators in regression-based MIMIC models). For instance, the formative indicators would duplicate the same pattern of covariances as the reflective indicators, the variance of each formative indicator would be equivalent to the variance of its counterpart reflective indicator, and the covariance of each formative indicator with its counterpart reflective indicator would be equivalent to this variance. However, these three factors might result in a non-invertible covariance matrix. As another strategy to create a covariance matrix for analysis, the analyst would first specify a multiple regression (without latent variables) to predict each of the observed psychometric items using all remaining items. This approach would yield a highly correlated, but not perfect, relationship in which the predicted values serve as an instrumental variable for each item. If the original data across all observations is not available, but only the covariance matrix, the analyst can estimate a covariance-based structural equation predicting each item by the remaining items in order to create its instrumental variable. Assuming derivable estimates for each of the instrumental variables, they can serve as formative indicators, and the original predictors can serve as reflective indicators, in the subsequent covariance-based MIMIC model. 
The covariance-based MIMIC model should result in a small residual distribution (ζ) in the structural portion of the model {unlike a regression-based MIMIC with perfect fit, in which the residual (ζ) is a constant equal to zero}. The analyst can either fix this residual term to zero prior to estimation, or ignore it and interpret the latent trait as an additive composite. The two strategies suggested in this paragraph need testing and vetting in different data to determine their validity and feasibility.
9. This novel approach with formative (causal) indicators that partial out biases from confounders may provide an alternative analysis, or replication, to relying on metabolite set enrichment, in which prior knowledge from previous research about sets of genes involved in cellular processes and generated scores from sets of metabolites under different conditions are compared [36,39]. Analysts use metabolite set enrichment especially when knowledge from the current specific context is lacking or confounded, or is difficult to assemble or derive from external databases. The approach with formative (causal) indicators may also be an alternative, or serve to replicate, analyses based on Mendelian randomization, a technique that uses genetic variants as instrumental variables to estimate causal relationships between metabolites and traits or diseases. The genetic variants (instrumental variables) are not associated with confounders due to their random assignment from parental genotypes during the formation of gametes [40]. On the other hand, I advise caution when findings derive entirely or primarily from statistical modeling using formative (causal) indicators without reliance on sets of genes or genetic variants for substantive justification and corroboration. In these circumstances, the statistical approach with formative indicators may validate the application of metabolite set enrichment or Mendelian randomization, but its missing or insufficient genetic information means it may be questionable as a complete replacement.
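
The first concern raised in note 8, that duplicating the reflective indicators as formative indicators might yield a non-invertible covariance matrix, can be checked numerically. The sketch below uses simulated data under assumed settings (four items, one common factor, 500 observations); none of the names or values come from the study, and it addresses only the duplication issue, not the instrumental variable strategy that follows it in the note.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 4  # hypothetical sample size and number of items

# Simulated items: one shared factor plus unique variation per item.
factor = rng.normal(size=(n, 1))
items = factor @ np.ones((1, p)) + rng.normal(size=(n, p))

# Formative indicators that are exact copies of the reflective indicators.
duplicated = np.hstack([items, items])
cov_dup = np.cov(duplicated, rowvar=False)  # (2p x 2p) covariance matrix

# The covariance of each copy with its counterpart equals the item variance,
# and the copies repeat the same pattern of covariances.
same_blocks = np.allclose(cov_dup[:p, :p], cov_dup[:p, p:])

# Rank p out of 2p columns means the matrix cannot be inverted.
rank_dup = np.linalg.matrix_rank(cov_dup)
print(same_blocks, rank_dup, cov_dup.shape)
```

In this simulation the rank is p rather than 2p, so the duplicated covariance matrix is singular, which is the situation the note anticipates.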

Appendix D. (Conclusions)

1. A lack of statistically significant effects in the Explanatory MIMIC for a diagnostic subgroup in Table 1(B) resulted in a similar lack of statistically significant effects in Table 2(B). The same statistically significant effects in Table 2(B) were always statistically significant in the Explanatory MIMIC findings in Table 1(B). However, Table 1(B) also tended to find other items significant as well in the Explanatory MIMIC runs, which could result partly from the exclusion, in the instrumental variable approach, but not the formative indicators approach, of shared variation within reflective indicators of non-traditional items with reflective indicators of traditional items. On the other hand, it also suggests that despite the attempt to specify a comprehensive (but not necessarily exhaustive) set of related personal characteristics and diagnostic conditions associated with confounders, some confounding factors remain unadjusted. Thus, the specification of all twenty psychometric items as formative (causal) indicators in the “exploded” MIMIC models {Table 2(B)} may achieve more complete conditioning than the counterpart Explanatory MIMIC models {Table 1(B)}.
2. By capturing multidimensionality within the additive composite equivalent of the latent trait, the MIMIC model with formative indicators could overcome the restriction of unidimensionality in CFA within the measurement model of reflective indicators, in contrast to when CFA is used alone (outside the regression-based MIMIC framework). Just because a latent trait can be postulated and estimated when only reflective indicators are used in CFA does not necessarily mean the derived latent trait is the most valid estimate of the true latent trait. A true latent trait should have the property that allows it to be modeled by dissimilar variation across formative indicators that do not in themselves constitute a cluster of reflective indicators. The formative indicators provide additional, exogenous modeling information to reveal statistically significant reflective indicators and clusters by identifying this more plausible latent trait equivalent to the additive composite of the formative indicators. By capturing all of the variation across these formative indicators (i.e., R2 = 1), this modeling provides determinacy of latent factor scores at the level of the individual observations because they are equivalent to the additive composite scores, in contrast to the indeterminacy of factor scores for individual observations from CFA outside of this MIMIC framework.
3. Furthermore, this unique context of perfect fit, in which the same measurement items are used as both formative and reflective indicators, means that both types of indicators can be assumed to have internal consistency. Of course, the reflective indicators inter-correlate since they all stem from a common cause (the latent trait, which in this case, is equivalent to the weighted additive composite). Because the formative indicators are the same measurement items specified as reflective indicators, they must also be inter-correlated, which suggests that they share similar antecedents and consequences, although not necessarily completely, as it is the unique or non-shared variation within each formative indicator that predicts the weighted additive composite. These properties make it attractive to use the obtained estimates of the weights to specify, a priori, an additive composite based on these fixed weights for use in subsequent MIMIC or other structural equations models (e.g., multiple regression), either in the same or different samples.
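
Notes 2 and 3 describe a composite that is fully determined by its formative indicators (R2 = 1) and whose weights can be fixed a priori for later use. The following is a minimal sketch of that idea under assumed inputs: the item scores are simulated on a hypothetical 0 to 3 scale, the weights are made-up values, and `ols_slopes_and_r2` is a helper written for this illustration, not part of any MIMIC software.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 300, 5  # hypothetical sample size and number of items

# Simulated item scores (0-3) and hypothetical fixed weights.
items = rng.integers(0, 4, size=(n, p)).astype(float)
weights = np.array([0.5, 0.4, 0.3, 0.2, 0.1])

# Weighted additive composite: no residual term, so every observation
# has a determinate score.
composite = items @ weights

def ols_slopes_and_r2(x, y):
    """Ordinary least squares with an intercept; returns slopes and R-squared."""
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    r2 = 1.0 - resid.var() / y.var()
    return coef[1:], r2

slopes, r2 = ols_slopes_and_r2(items, composite)
print(np.round(slopes, 3), round(r2, 6))
```

Regressing the composite on the items recovers the fixed weights with R2 equal to one (up to floating-point error), which is the sense in which the composite scores are determinate for individual observations.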

References

  1. McGarrah, R.W.; Crown, S.B.; Zhang, G.-F.; Shah, S.H.; Newgard, C.B. Cardiovascular metabolomics. Circ. Res. 2018, 122, 1238–1258.
  2. Newgard, C.B.; An, J.; Bain, J.R.; Muehlbauer, M.J.; Stevens, R.D.; Lien, L.F.; Hagg, A.M.; Shah, S.H.; Arlotto, M.; Slentz, C.A.; et al. A branched-chain amino acid-related metabolic signature that differentiates obese and lean humans and contributes to insulin resistance. Cell Metab. 2009, 9, 311–326.
  3. Kim, H.-J. Common factor analysis versus principal component analysis: Choice for symptom cluster research. Asian Nurs. Res. 2008, 2, 17–24.
  4. Snook, S.C.; Gorsuch, R.L. Component analysis versus common factor analysis: A Monte Carlo study. Psychol. Bull. 1989, 106, 148–154.
  5. Muthén, B. A structural probit model with latent variables. J. Am. Stat. Assoc. 1979, 74, 807–811.
  6. Muthén, B. Latent variable structural equation modeling with categorical data. J. Econom. 1983, 22, 48–65.
  7. Muthén, B. A general structural equation model with dichotomous, ordered categorical, and continuous latent variable indicators. Psychometrika 1984, 49, 115–132.
  8. Muthén, B. Latent variable modeling in heterogeneous populations. Presidential address to the Psychometric Society, July, 1989. Psychometrika 1989, 54, 557–585.
  9. Muthén, L.K.; Muthén, B.O. Mplus User’s Guide, 5th ed.; Muthén: Los Angeles, CA, USA, 1998.
  10. Brent, W.-W. Special Issue “Using Metabolomics to Help Subphenotype Disease”. 2021. Available online: www.mdpi.com/journal/metabolites/special_issues/subphenotype_metabolomics (accessed on 11 August 2021).
  11. Diamantopoulos, A.; Siguaw, J.A. Formative versus reflective indicators in organizational measure development: A comparison and empirical illustration. Br. J. Manag. 2006, 17, 263–282.
  12. Diamantopoulos, A.; Winklhofer, H.M. Index construction with formative indicators: An alternative to scale development. J. Mark. Res. 2018, 38, 269–277.
  13. Hoyle, R.H. (Ed.) Handbook of Structural Equation Modeling; Guilford: New York, NY, USA, 2012; ISBN 978-1606230770.
  14. Kline, R.B. Reverse arrow dynamics: Feedback loops and formative measurement (pp. 41–80). In Structural Equation Modeling: A Second Course, 2nd ed.; Hancock, G.R., Mueller, R.O., Eds.; Information Age Publishing: Greenwich, CT, USA, 2013; pp. 39–76. ISBN 13: 978-1593110147.
  15. Morrish, N.J.; Wang, S.L.; Stevens, L.K.; Fuller, J.H.; Keen, H. Mortality and causes of death in the WHO Multinational Study of Vascular Disease in Diabetes. Diabetologia 2001, 44 (Suppl. 2), S14–S21.
  16. Deng, M.; Su, D.; Xu, S.; Little, P.J.; Feng, X.; Tang, L.; Shen, A. Metformin and vascular diseases: A focused review on smooth muscle cell function. Front. Pharmacol. 2020, 11, 635.
  17. Michiels, C.F.; Apers, S.; De Meyer, G.R.Y.; Martinet, W. Metformin attenuates expression of endothelial cell adhesion molecules and formation of atherosclerotic plaques via Autophagy Induction. Ann. Clin. Exp. Metabol. 2016, 1, 1001.
  18. Hopf, A.-E.; Andresen, C.; Kötter, S.; Isić, M.; Ulrich, K.; Sahin, S.; Bongardt, S.; Röll, W.; Drove, F.; Scheerer, N.; et al. Diabetes-induced cardiomyocyte passive stiffening is caused by impaired insulin-dependent titin modification and can be modulated by neuregulin-1. Circ. Res. 2018, 123, 342–355.
  19. Papanas, N.; Maltezos, E.; Mikhailidis, D.P. Metformin and heart failure: Never say never again. Expert Opin. Pharmacother. 2012, 13, 1–8.
  20. Dunlay, S.M.; Givertz, M.M.; Aguilar, D.; Allen, L.A.; Chan, M.; Desai, A.S.; Deswal, A.; Dickson, V.V.; Kosiborod, M.N.; Lekavich, C.L.; et al. Type 2 diabetes and heart failure: A scientific statement from the American Heart Association and the Heart Failure Society of America. Circulation 2019, 140, e294–e324.
  21. Ekeruo, I.A.; Solhpour, A.; Taegtmeyer, H. Metformin in diabetic patients with heart failure: Safe and effective? Curr. Cardiovasc. Risk Rep. 2013, 7, 417–422.
  22. Koser, F.; Loescher, C.; Linke, W.A. Posttranslational modifications of titin from cardiac muscle: How, where, and what for? FEBS J. 2019, 286, 2240–2260.
  23. Rahim, M.A.A.; Rahim, Z.H.A.; Wan Ahmad, W.A.; Bakri, M.M.; Ismail, M.D.; Hashim, O.H. Inverse changes in plasma tetranectin and titin levels in patients with type 2 diabetes mellitus: A potential predictor of acute myocardial infarction? Acta Pharmacol. Sin. 2018, 39, 1197–1207.
  24. Wilmanns, J.C.; Pandey, R.; Hon, O.; Chandran, A.; Schilling, J.M.; Forte, E.; Wu, Q.; Cagnone, G.; Bais, P.; Philip, V.; et al. Metformin intervention prevents cardiac dysfunction in a murine model in adult congenital heart disease. Mol. Metab. 2019, 20, 102–114.
  25. Slater, R.E.; Strom, J.G.; Methawasin, M.; Liss, M.; Gotthardt, M.; Sweitzer, N.; Granzier, H.L. Metformin improves diastolic function in an HFpEF-like mouse model by increasing titin compliance. J. Gen. Physiol. 2019, 151, 42–52.
  26. Khan, S.Z.; Rivero, M.; Nader, N.D.; Cherr, G.S.; Harris, L.M.; Dryiski, M.L.; Dosluoglu, H.H. Metformin is associated with improved survival and decreased cardiac events with no impact on patency and limb salvage after revascularization for peripheral arterial disease. Ann. Vasc. Surg. 2019, 5, 63–77.
  27. Tharp, C.; Mestroni, L.; Taylor, M. Modifications of titin contribute to the progression of cardiomyopathy and represent a therapeutic target for treatment of heart failure. J. Clin. Med. 2020, 9, 2770.
  28. Francoeur, R.B. Symptom profiles of subsyndromal depression in disease clusters of diabetes, excess weight, and progressive cerebrovascular conditions: A promising new type of finding from a reliable innovation to estimate exhaustively specified multiple indicators-multiple causes (MIMIC) models. Diabetes Metab. Syndr. Obes. Targets Ther. 2016, 9, 391–416.
  29. National Archive of Computerized Data on Aging. Established Populations for Epidemiologic Studies of the Elderly, 1981–1993: {East Boston, Massachusetts, Iowa and Washington Counties, Iowa, New Haven, Connecticut, and North Central North Carolina} (ICPSR 9915). 2021. Available online: http://www.icpsr.umich.edu/NACDA/studies/9915 (accessed on 1 June 2021).
  30. van Wie, M.P.; Li, X.; Wiedermann, W. Identification of confounded subgroups using linear model-based recursive partitioning. Psychol. Test. Assess. Model. 2019, 61, 365–387.
  31. Wooldridge, J.M. Econometric Analysis of Cross Section and Panel Data, 2nd ed.; MIT Press: Cambridge, MA, USA, 2010; ISBN 978-0262232586.
  32. Wiedermann, W.; von Eye, A. Direction-dependence analysis: A confirmatory approach for testing directional theories. Int. J. Behav. Dev. 2015, 39, 570–580.
  33. Boyd, J.H.; Weissman, M.M. Screening for depression in a community sample. Arch. Gen. Psychiatry 1982, 39, 1195–1200.
  34. Hybels, C.F.; Blazer, D.G.; Pieper, C.F. Toward a threshold for subthreshold depression: An analysis of correlates of depression by severity of symptoms using data from an elderly community sample. Gerontologist 2001, 41, 357–365.
  35. Schein, R.L.; Koenig, H.G. The Center for Epidemiological Studies-Depression (CES-D) scale: Assessment of depression in the medically ill elderly. Int. J. Geriatr. Psychiatry 1997, 12, 436–446.
  36. Khatri, P.; Sirota, M.; Butte, A.J. Ten years of pathway analysis: Current approaches and outstanding challenges. PLoS Comput. Biol. 2012, 8, e1002375.
  37. Fearnley, L.G.; Inouye, M. Metabolomics in epidemiology: From metabolite concentrations to integrative reaction networks. Int. J. Epidemiol. 2016, 45, 1319–1328.
  38. Kenny, D.A. Identification. 2012. Available online: http://davidakenny.net/cm/identify_formal.htm#B3b (accessed on 1 June 2021).
  39. Xia, J.; Wishart, D.S. MSEA: A web-based tool to identify biologically meaningful patterns in quantitative metabolomic data. Nucleic Acids Res. 2010, 38, W71–W77.
  40. Davey Smith, G.; Ebrahim, S. Mendelian randomization prospects, potentials, and limitations. Int. J. Epidemiol. 2004, 33, 30–42.
Figure 1. MIMIC Models of Depression with 20 Reflective (Effect) Indicators in Diabetes-Heart Failure Subgroup: (a) with Instrumental Variables of 4 Non-Traditional Depression Items; (b) with 20 Formative (Causal) Indicators.
Table 1. CES-D Depression Items in Diabetes, Heart Failure, and Targeted, Synergistic Subgroups: MIMIC Models with Instrumental Variables of Four Endogenous Formative Indicators {1, 2, 3}.
CHRONIC CONDITIONS/SUBGROUPS | (A) Descriptive MIMIC | | | (B) Explanatory MIMIC | |
CES-D Depression Items | b | S.E. | z {4} | b | S.E. | z {4}
DIABETES
Bothered by Things | 1.142 | 0.326 | 3.507 | | |
Life a Failure | 1.272 | 0.35 | 3.634 | | |
Crying Spells | 1.45 | 0.404 | 3.59 | | |
Depressed | 2.239 | 0.314 | 7.135 | | |
Blues | 2.145 | 0.347 | 6.174 | | |
Sad | 2.083 | 0.474 | 4.397 | | |
Happy | 1.299 | 0.403 | 3.223 | | |
Hopeful | 0.586 | 0.189 | 3.095 | | |
Enjoyed Life | 1.287 | 0.409 | 3.147 | | |
Everything an Effort | 1.549 | 0.255 | 6.068 | | |
Poor Appetite | 1.525 | 0.267 | 5.714 | | |
Difficulty Concentrating | 1.465 | 0.372 | 3.939 | | |
Talked Less than Others | 0.766 | 0.285 | 2.686 | | |
Restless Sleep | 1.39 | 0.258 | 5.387 | | |
Not Get Going | 1.567 | 0.333 | 4.702 | | |
Fearful | 1.652 | 0.463 | 3.563 | | |
Lonely | 2.032 | 0.507 | 4.006 | | |
People Unfriendly | 0.722 | 0.359 | 2.01 | | |
HEART FAILURE
Bothered by Things | 2.075 | 0.255 | 8.151 | | |
Life a Failure | 2.311 | 0.35 | 5.479 | | |
Crying Spells | 2.483 | 0.404 | 8.21 | | |
Depressed | 3.691 | 0.314 | 11.913 | | |
Blues | 3.288 | 0.347 | 8.133 | | |
Sad | 3.288 | 0.474 | 7.39 | | |
Enjoyed Life | 2.503 | 0.403 | 6.78 | | |
Good as Others | 1.293 | 0.189 | 4.008 | | |
Everything an Effort | 2.44 | 0.409 | 11.855 | | |
Poor Appetite | 2.351 | 0.255 | 8.51 | 0.502 | 0.219 | 2.286
Difficulty Concentrating | 1.696 | 0.267 | 5.737 | | |
Talked Less than Others | 1.795 | 0.372 | 7.059 | | |
Restless Sleep | 2.087 | 0.285 | 7.221 | | |
Not Get Going | 2.49 | 0.258 | 7.613 | | |
Fearful | 2.389 | 0.333 | 5.133 | | |
Lonely | 3.130 | 0.463 | 5.343 | | |
People Unfriendly | 1.297 | 0.507 | 4.061 | | |
People Disliked Me | 1.404 | 0.359 | 4.254 | | |
DIABETES × HEART ATTACK
Bothered by Things | 3.58 | 0.774 | 4.624 | | |
Life a Failure | 4.89 | 0.643 | 7.599 | | |
Crying Spells | 3.515 | 0.782 | 4.496 | | |
Depressed | 8.438 | 0.607 | 13.908 | | |
Blues | 6.375 | 0.548 | 11.63 | | |
Sad | 7.352 | 0.653 | 11.262 | | |
Happy | 5.825 | 0.369 | 15.782 | | |
Hopeful | 3.28 | 0.587 | 5.588 | | |
Enjoyed Life | 5.197 | 0.53 | 9.805 | | |
Good as Others | 2.565 | 0.599 | 4.279 | | |
Everything an Effort | 4.881 | 0.558 | 8.752 | | |
Poor Appetite | 4.364 | 0.642 | 6.801 | | |
Difficulty Concentrating | 3.996 | 0.368 | 10.861 | | |
Talked Less than Others | 2.963 | 0.703 | 4.217 | | |
Restless Sleep | 4.42 | 0.475 | 9.298 | | |
Not Get Going | 5.039 | 0.662 | 7.608 | | |
Fearful | 6.803 | 0.676 | 10.07 | | |
Lonely | 7.105 | 0.934 | 7.603 | | |
People Unfriendly | 3.782 | 0.605 | 6.248 | | |
People Disliked Me | 3.822 | 0.677 | 5.642 | | |
DIABETES × HEART FAILURE
Bothered by Things | 3.53 | 0.493 | 7.163 | | |
Life a Failure | 4.024 | 1.027 | 3.918 | | |
Crying Spells | 5.479 | 0.692 | 7.917 | | |
Depressed | 8.684 | 0.601 | 14.453 | | |
Blues | 6.834 | 0.819 | 8.345 | | |
Sad | 8.017 | 0.641 | 12.5 | | |
Happy | 5.455 | 0.403 | 13.544 | 0.822 | 0.423 | 1.940 {5}
Hopeful | 2.413 | 0.502 | 4.806 | | |
Enjoyed Life | 5.134 | 0.341 | 15.04 | | |
Good as Others | 2.939 | 0.477 | 6.162 | | |
Everything an Effort | 4.408 | 0.453 | 9.735 | | |
Poor Appetite | 4.07 | 0.552 | 7.369 | | |
Difficulty Concentrating | 3.411 | 0.659 | 5.175 | | |
Talked Less than Others | 3.495 | 0.53 | 6.597 | | |
Restless Sleep | 3.773 | 0.506 | 7.451 | | |
Not Get Going | 3.255 | 0.448 | 7.266 | | |
Fearful | 5.127 | 0.837 | 6.125 | | |
Lonely | 6.448 | 0.661 | 9.754 | | |
People Unfriendly | 3.022 | 0.858 | 3.521 | | |
People Disliked Me | 3.607 | 0.668 | 5.4 | | |
DIABETES × HEART ATTACK × HEART FAILURE
Crying Spells6.2412.0563.0355.481.8442.972
Depressed6.542.8072.336.6052.8522.316
Blues7.2163.2592.2146.9383.3082.097
Sad4.6521.6442.834.6461.7112.716
Happy4.1411.8792.2054.2982.1052.042
Enjoyed Life 2.5271.2711.987
Restless Sleep4.9651.9272.5765.0261.9552.571
DIABETES × HIGH BP × HEART FAILURE
Crying Spells5.2092.3492.2185.5362.1262.604
People Unfriendly3.4751.252.783.0271.1352.666
People Disliked Me2.0120.9872.0381.9460.9642.018
DIABETES × HIGH BP × HEART FAILURE,
WITHOUT HEART ATTACK
Crying Spells7.8143.4382.2737.6983.282.347
DIABETES × HIGH BP × HEART ATTACK × HEART FAILURE
Depression4.3882.0242.1683.6161.732.09
Talked Less than Others5.2821.4953.5325.5011.4723.737
DIABETES × SILENT CVD × HEART FAILURE
Blues5.2142.651.968
Sad4.4262.142.069
Happy4.3931.5392.8553.2881.6122.039
Enjoyed Life3.4931.5892.198
Restless Sleep3.3691.4432.3342.7121.2892.105
Lonely4.3762.2381.956 {6}
DIABETES × SILENT CVD × HEART FAILURE, WITHOUT HEART ATTACK
Happy4.4491.8762.3724.1311.9662.101
Enjoyed Life5.1261.9942,5714.9592.0572.411
Good as Others3.3631.7621.908 {7}3.3161.5432.148
DIABETES × SILENT CVD × HEART ATTACK × HEART FAILURE
Depression4.1241.2613.273.2931.7681.863 {8}
All of the empty cells that appear within the table reflect analyses that were not found to be statistically significant. See Table S1 for notes {1} through {8}.
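As a reading aid for the b, S.E., and z columns, the following minimal sketch assumes that z is the usual ratio of an estimate to its standard error and that significance is judged against a two-sided standard normal reference; the table itself does not state either point, and the function names are illustrative only. Because the tabled b and S.E. values are rounded, the recomputed ratio matches the printed z only approximately.

```python
import math


def wald_z(b: float, se: float) -> float:
    """Ratio of an estimate to its standard error (assumed definition of z)."""
    return b / se


def two_sided_p(z: float) -> float:
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# "Depressed" under DIABETES, column (A) of Table 1: b = 2.239, S.E. = 0.314.
z = wald_z(2.239, 0.314)
print(f"recomputed z = {z:.3f}, two-sided p = {two_sided_p(z):.2e}")

# |z| = 1.96 is the conventional two-sided 5% cutoff; compare the printed z
# values of a clearly significant row (2.01) and a flagged row (1.863 {8}).
for printed_z in (2.01, 1.863):
    print(f"z = {printed_z}: p = {two_sided_p(printed_z):.4f}")
```

Under these assumptions, the z values carrying notes {5} through {8} (1.940, 1.956, 1.908, 1.863) fall just below the 1.96 cutoff; Table S1 gives the author's own explanation of those entries.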
Table 2. Depression in Diabetes and Heart Failure and as Comorbid Conditions in Targeted, Synergistic Subgroups: Full or Sequential MIMIC Models with Formative Indicators as Exogenous Predictors or Illness Context Mediators {1, 2, 3}.
| CHRONIC CONDITIONS/SUBGROUPS and CES-D Depression Items | (A) Formative Indicators for 4 Non-Traditional CES-D Items: b | (A) S.E. | (A) z {4} | (B) Formative Indicators for All 20 CES-D Items: b | (B) S.E. | (B) z {4} |
|---|---|---|---|---|---|---|
| I. OVERALL OR MAIN EFFECTS | | | | | | |
| DIABETES | | | | | | |
| (A): Mediated {5}; (B): Non-Mediated, Full and Sequential | | | | | | |
| Bothered by Things | 0.958 | 0.311 | 3.084 | | | |
| Life a Failure | 1.039 | 0.334 | 3.114 | | | |
| Crying Spells | 1.19 | 0.394 | 3.017 | | | |
| Depressed | 1.849 | 0.313 | 5.909 | | | |
| Blues | 1.817 | 0.336 | 5.403 | | | |
| Sad | 1.738 | 0.45 | 3.867 | | | |
| Happy | 1.065 | 0.387 | 2.749 | | | |
| Hopeful | 0.451 | 0.185 | 2.434 | | | |
| Enjoyed Life | 1.042 | 0.398 | 2.617 | | | |
| Everything an Effort | 1.318 | 0.247 | 5.337 | | | |
| Poor Appetite | 1.349 | 0.255 | 5.298 | | | |
| Difficulty Concentrating | 1.283 | 0.357 | 3.589 | | | |
| Talked Less than Others | 0.593 | 0.276 | 2.147 | | | |
| Restless Sleep | 1.204 | 0.245 | 4.908 | | | |
| Not Get Going | 1.343 | 0.319 | 4.21 | | | |
| Fearful | 1.386 | 0.441 | 3.143 | | | |
| Lonely | 1.707 | 0.474 | 3.597 | | | |
| HEART FAILURE | | | | | | |
| (A): Mediated {6}; (B): Non-Mediated, Full | | | | | | |
| Bothered by Things | 0.785 | 0.184 | 4.266 | 0.44 | 0.21 | 2.091 |
| Bothered by Things (Formative Indicator) | | | | 0.607 | 0.184 | 3.288 |
| Life a Failure | 0.663 | 0.296 | 2.244 | | | |
| Life a Failure (Formative Indicator) | | | | 0.422 | 0.214 | 1.974 |
| Crying Spells | 0.642 | 0.273 | 2.351 | | | |
| Depressed | 0.935 | 0.31 | 3.02 | | | |
| Blues | 0.968 | 0.325 | 2.974 | | | |
| Blues (Formative Indicator) | | | | 0.515 | 0.159 | 3.246 |
| Sad | 0.84 | 0.294 | 2.857 | | | |
| Sad (Formative Indicator) | | | | 0.443 | 0.15 | 2.943 |
| Happy | 0.629 | 0.247 | 2.546 | | | |
| Happy (Formative Indicator) | | | | 0.447 | 0.142 | 3.145 |
| Hopeful | 0.385 | 0.162 | 2.376 | | | |
| Hopeful (Formative Indicator) | | | | 0.323 | 0.116 | 2.788 |
| Enjoyed Life | 0.78 | 0.312 | 2.496 | | | |
| Enjoyed Life (Formative Indicator) | | | | 0.535 | 0.13 | 4.126 |
| Everything an Effort | 0.811 | 0.162 | 5.01 | | | |
| Everything an Effort (Formative Indicator) | | | | 0.563 | 0.132 | 4.256 |
| Poor Appetite | 1.126 | 0.209 | 5.401 | 0.809 | 0.223 | 3.625 |
| Poor Appetite (Formative Indicator) | | | | 0.888 | 0.174 | 5.11 |
| Talked Less than Others | 0.578 | 0.202 | 2.868 | | | |
| Talked Less than Others (Formative Indicator) | | | | 0.455 | 0.197 | 2.306 |
| Restless Sleep | 0.779 | 0.223 | 3.49 | 0.432 | 0.2 | 2.156 |
| Restless Sleep (Formative Indicator) | | | | 0.592 | 0.199 | 2.972 |
| Not Get Going | | | | 0.498 | 0.229 | 2.17 |
| Not Get Going (Formative Indicator) | 0.908 | 0.239 | 3.798 | 0.622 | 0.143 | 4.341 |
| Lonely | 0.82 | 0.345 | 2.379 | | | |
| Lonely (Formative Indicator) | 0.456 | 0.205 | 2.226 | 0.456 | 0.205 | 2.226 |
| II. DIABETES IN REFINED SUBGROUPS | | | | | | |
| DIABETES × HEART ATTACK | | | | | | |
| Depression | | | | 1.723 | 0.916 | 1.881 {7} |
| Bothered by Things | 3.58 | 0.774 | 4.624 | | | |
| Life a Failure | 4.89 | 0.643 | 7.599 | | | |
| Crying Spells | 3.515 | 0.782 | 4.496 | | | |
| Depressed | 8.438 | 0.607 | 13.908 | | | |
| Blues | 6.375 | 0.548 | 11.63 | | | |
| Sad | 7.352 | 0.653 | 11.262 | | | |
| Happy | 5.825 | 0.369 | 15.782 | | | |
| Hopeful | 3.28 | 0.587 | 5.588 | | | |
| Enjoyed Life | 5.197 | 0.53 | 9.805 | | | |
| Good as Others | 2.565 | 0.599 | 4.279 | | | |
| Everything an Effort | 4.881 | 0.558 | 8.752 | | | |
| Poor Appetite | 4.364 | 0.642 | 6.801 | | | |
| Difficulty Concentrating | 3.996 | 0.368 | 10.861 | | | |
| Talked Less than Others | 2.963 | 0.703 | 4.217 | | | |
| Restless Sleep | 4.42 | 0.475 | 9.298 | | | |
| Not Get Going | 5.039 | 0.662 | 7.608 | | | |
| Fearful | 6.803 | 0.676 | 10.07 | | | |
| Lonely | 7.105 | 0.934 | 7.603 | | | |
| People Unfriendly | 3.782 | 0.605 | 6.248 | | | |
| People Disliked Me | 3.822 | 0.677 | 5.642 | | | |
| DIABETES × HEART FAILURE | | | | | | |
| (A): Mediated {8}; (B): Non-Mediated, Full, Sequential {8} | | | | | | |
| Depression | 15.863 | 0.409 | 38.8 | | | |
| Happy (Formative Indicator) | | | | 0.774 | 0.346 | 2.237 |
| DIABETES × HEART ATTACK × HEART FAILURE | | | | | | |
| Crying Spells | 7.874 | 3.027 | 2.601 | 5.122 | 1.526 | 3.356 |
| Depressed | 8.842 | 4.105 | 2.154 | 6.703 | 2.375 | 2.822 |
| Blues | 9.143 | 2.927 | 3.123 | 5.936 | 2.554 | 2.324 |
| Happy | 5.584 | 2.814 | 1.984 | 4.385 | 1.717 | 2.554 |
| Hopeful | 2.576 | 1.274 | 2.022 | 2.191 | 1.138 | 1.925 {9} |
| Restless Sleep | 6.151 | 1.539 | 3.997 | 4.979 | 1.693 | 2.94 |
| People Disliked Me | 5.178 | 1.993 | 2.598 | | | |
| DIABETES × HIGH BP × HEART FAILURE | | | | | | |
| Crying Spells | 5.209 | 2.349 | 2.218 | 5.831 | 2.574 | 2.265 |
| Crying Spells (Formative Indicator) | | | | 2.365 | 1.017 | 2.324 |
| People Unfriendly | 3.475 | 1.25 | 2.78 | 3.591 | 1.279 | 2.807 |
| People Unfriendly (Formative Indicator) | 2.402 | 1.13 | 2.126 | 2.402 | 1.13 | 2.126 |
| People Disliked Me | 2.012 | 0.987 | 2.038 | 2.001 | 0.937 | 2.135 |
| DIABETES × HIGH BP × HEART FAILURE, WITHOUT HEART ATTACK | | | | | | |
| Crying Spells | 7.814 | 3.438 | 2.273 | 8.793 | 3.763 | 2.337 |
| Crying Spells (Formative Indicator) | | | | 4.235 | 1.701 | 2.49 |
| DIABETES × HIGH BP × HEART ATTACK × HEART FAILURE | | | | | | |
| (A) and (B): Sequential | | | | | | |
| Depression | 2.971 | 1.416 | 2.098 | | | |
| Talked Less than Others | 6.237 | 1.63 | 3.827 | 5.777 | 1.483 | 3.895 {10} |
| Talked Less than Others (Formative Indicator) | | | | 3.808 | 1.752 | 2.174 |
| DIABETES × SILENT CVD × HEART FAILURE | | | | | | |
| Blues | 5.232 | 2.655 | 1.971 | 5.582 | 2.655 | 2.103 |
| Sad | 4.442 | 2.142 | 2.074 | 5.808 | 2.657 | 2.186 |
| Happy | 4.403 | 1.542 | 2.855 | 5.768 | 2.029 | 2.842 |
| Happy (Formative Indicator) | | | | 1.72 | 0.79 | 2.177 |
| Enjoyed Life | 3.504 | 1.591 | 2.203 | 4.402 | 1.775 | 2.48 |
| Restless Sleep | 3.377 | 1.445 | 2.337 | 4.163 | 1.74 | 2.392 |
| Lonely | 4.388 | 2.239 | 1.96 | 4.15 | 2.14 | 1.939 {11} |
| DIABETES × SILENT CVD × HEART FAILURE, WITHOUT HEART ATTACK | | | | | | |
| Happy | 4.449 | 1.876 | 2.372 | 5.666 | 2.543 | 2.228 |
| Enjoyed Life | 5.126 | 1.994 | 2.571 | 5.98 | 2.47 | 2.421 |
| Good as Others | 3.363 | 1.762 | 1.908 {12} | 3.724 | 1.956 | 1.904 {12} |
| DIABETES × SILENT CVD × HEART ATTACK × HEART FAILURE | | | | | | |
| Depression | 4.124 | 1.261 | 3.27 | 6.092 | 2.099 | 2.903 |
| III. DIABETES AND HEART FAILURE: SUBGROUPS FURTHER REFINED BY EXCESS WEIGHT | | | | | | |
| DIABETES × HEART ATTACK × HEART FAILURE × EXCESS WEIGHT | | | | | | |
| Crying Spells | | | | 13.383 | 4.855 | 2.757 |
| Depressed | 6.864 | 2.781 | 2.468 | 8.826 | 4.038 | 2.186 |
| Blues | 10.714 | 2.482 | 4.317 | 11.414 | 3.506 | 3.255 |
| Blues (Formative Indicator) | | | | 3.041 | 1.465 | 2.075 |
| Sad | 6.899 | 3.084 | 2.237 | 7.831 | 3.98 | 1.968 |
| Happy | 5.521 | 1.9 | 2.905 | 6.872 | 2.609 | 2.634 |
| Enjoyed Life | 6.139 | 2.189 | 2.804 | | | |
| Poor Appetite | 8.186 | 2.073 | 3.949 | 8.573 | 1.908 | 4.493 |
| Poor Appetite (Formative Indicator) | | | | 4.386 | 1.994 | 2.199 |
| Restless Sleep | 3.765 | 1.598 | 2.355 | 4.506 | 1.711 | 2.633 |
| Not Get Going | 5.149 | 2.412 | 2.135 | 5.66 | 1.953 | 2.898 |
| DIABETES × HIGH BP × HEART FAILURE × EXCESS WEIGHT | | | | | | |
| (A): Full; (B): Full, Sequential | | | | | | |
| Talked Less than Others | 4.49 | 2.021 | 2.222 | | | |
| Enjoyed Life | | | | 2.961 | 1.496 | 1.980 {13} |
| DIABETES × HIGH BP × HEART ATTACK × HEART FAILURE × EXCESS WEIGHT | | | | | | |
| (A) and (B): Full, Sequential | | | | | | |
| People Unfriendly | 4.551 | 2.271 | 2.004 {14} | 4.169 | 2.003 | 2.082 {14} |
| DIABETES × SILENT CVD × HEART FAILURE × EXCESS WEIGHT | | | | | | |
| Life a Failure | 6.121 | 2.82 | 2.171 | | | |
| Good as Others | | | | 9.882 | 2.179 | 4.535 |
| Not Get Going | 5.502 | 1.888 | 2.914 | 5.206 | 2.369 | 2.198 |
| Not Get Going (Formative Indicator) | | | | 3.518 | 1.686 | 2.087 |
| DIABETES × SILENT CVD × HEART ATTACK × HEART FAILURE × EXCESS WEIGHT {15} | | | | | | |
| Enjoyed Life (Formative Indicator) | | | | 6.093 | 2.187 | 2.786 |
All confounders are unspecified. The analyses in the three right columns {(B) Formative Indicators for All 20 CES-D Items} provide the most comprehensive adjustment for unspecified confounders. All of the empty cells that appear within the table reflect analyses that were not found to be statistically significant. See Section 4.2 and its notes in Appendix C regarding the different types of model specifications conducted within some of the disease groups/subgroups. See Table S2 for notes {1} through {15}.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
