1. Introduction
Multimorbidity, usually defined as the coexistence of two or more chronic conditions, is now routine in the care of older adults and is closely linked to functional decline, poorer quality of life, greater health-care use, and higher mortality [1,2,3,4,5]. The composition of multimorbidity also shapes downstream clinical decisions, including perioperative and procedural risk assessment in older patients with combined chronic disease and frailty [4,6]. Yet clinical practice and much of the evidence base remain organised around single diseases. General internists therefore need to manage hypertension, diabetes, heart disease, chronic lung disease, arthritis, cancer, stroke, and psychiatric or emotional disorders not only as separate diagnoses, but as combinations that change over time [7].
Disease count is useful because it offers a simple measure of burden and allows comparison across settings [5,8]. Its limitation is that two patients with the same number of diagnoses may have very different clinical profiles, treatment priorities, and future risks [5,9]. Hypertension with diabetes is not clinically equivalent to chronic lung disease with arthritis, even though both represent two-condition multimorbidity. Measures that retain disease composition may therefore add clinically relevant information beyond the number of conditions alone.
This distinction matters in internal medicine, where management is rarely determined by count alone [10]. Treatment burden, surveillance needs, drug interactions, functional limitation, and patient priorities depend on which conditions cluster together [10,11]. A count of two or three conditions can describe overall load, but it does not show whether a patient is entering a cardiometabolic pathway, developing a cardio-respiratory-musculoskeletal profile, or remaining in a relatively stable combination that requires coordinated long-term follow-up [12,13,14].
A longitudinal view is needed because multimorbidity develops over time [13,14]. Some early combinations may be branching points from which later profiles diverge, whereas other combinations may become persistent after cardiometabolic and musculoskeletal burden has accumulated. Identifying these profiles could help clinicians recognise earlier stages of complexity, anticipate follow-up needs, and plan integrated chronic disease management before multimorbidity becomes highly entrenched [13].
In this study, lock-in denotes low downstream diversification and relative persistence of a cumulative disease profile, whereas stabilisation sensitivity describes how much the transition structure changes in a simulation that increases the self-transition tendency of selected states.
We analysed multimorbidity as transitions between disease combinations using harmonised longitudinal data from national ageing cohorts in China, England, and the United States [15,16]. The objective was to identify clinically recognisable states associated with rapid accumulation, downstream branching, or later persistence, and to assess whether these roles recurred across populations with different baseline disease burdens.
2. Materials and Methods
2.1. Study Design and Data Sources
We analysed three nationally representative longitudinal ageing cohorts: the China Health and Retirement Longitudinal Study (CHARLS), the English Longitudinal Study of Ageing (ELSA), and the U.S. Health and Retirement Study (HRS). Each cohort collects repeated information on physician-diagnosed chronic conditions (Figure 1). The first eligible wave around 2010–2011 was defined as the analytic baseline: raw wave 1 in CHARLS, raw wave 5 in ELSA, and raw wave 10 in HRS. Subsequent waves were recoded relative to this analytic baseline. Transitions were constructed only between adjacent analytic survey waves among participants observed with complete disease data at both waves; transitions spanning a missing intermediate wave were not imputed.
The three cohorts were used to examine whether broad transition roles were visible across populations with different baseline disease burdens and health-care contexts. Cohorts were not pooled because the aim was not to estimate a single average transition probability across countries. Each cohort was analysed separately and then compared descriptively to identify recurring state roles relevant to clinical medicine rather than to one national cohort alone.
2.2. Chronic Conditions and Multimorbidity States
Eight physician-diagnosed chronic conditions were included because they were available with comparable definitions across all three cohorts: hypertension, diabetes, cancer, chronic lung disease, heart disease, stroke, psychiatric or emotional disorders, and arthritis or rheumatism. This harmonised set captured common cardiometabolic, respiratory, mental health, malignant, cerebrovascular, and musculoskeletal conditions relevant to older adults, but it did not include other clinically important ageing-related conditions such as chronic kidney disease, osteoporosis, or cognitive impairment. At each wave, disease status was represented by a binary vector. Unique combinations of the eight conditions were defined as multimorbidity states; the state 00000000 represented no recorded condition. Disease count was the sum of conditions in the vector.
The eight-condition framework yields a possible state space of 256 profiles. In practice, only a subset of these profiles was observed in each cohort and window. This sparsity is clinically informative because it indicates that progression was concentrated around recurring combinations rather than distributed evenly across all theoretically possible profiles.
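The state encoding described above can be illustrated with a minimal Python sketch. The condition order, the short condition labels, and the helper names `encode_state` and `disease_count` are assumptions made for illustration only; the study's own analyses were run in R and its variable coding is not reported here.

```python
from itertools import product

# Hypothetical ordering and labels for the eight harmonised conditions.
CONDITIONS = [
    "hypertension", "diabetes", "cancer", "lung",
    "heart", "stroke", "psych", "arthritis",
]

def encode_state(diagnosed):
    """Return an 8-character binary string, one position per condition."""
    return "".join("1" if c in diagnosed else "0" for c in CONDITIONS)

def disease_count(state):
    """Disease count is the number of conditions flagged in the state."""
    return state.count("1")

# The full theoretical state space: 2**8 = 256 profiles.
ALL_STATES = ["".join(bits) for bits in product("01", repeat=len(CONDITIONS))]
```

Under this assumed ordering, a participant with hypertension and diabetes only would be coded `11000000`, with a disease count of 2.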
2.3. State-Transition Analysis
For each cohort and transition interval, movement from state i at wave t to state j at wave t + 1 was tabulated. Nodes represented observed multimorbidity states, and directed edges represented observed transitions. Edge weights were the number of participants moving between states. Because the included diagnoses were treated as chronic conditions and apparent disappearance may reflect reporting variation, a disease-level irreversibility rule was applied: once a condition had been recorded, it was carried forward in later waves. The resulting states should therefore be interpreted as cumulative diagnostic-history profiles rather than purely contemporaneous clinical status. This assumption is central to the transition framework, and its influence was evaluated in sensitivity analyses without the carry-forward rule (Figure S4; Table S6).
This approach treats each observed disease combination as a clinically recognisable profile at a given time and each transition as a change in that profile over follow-up. Network terminology is used as a compact way to describe movement between profiles.
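As a rough illustration of the carry-forward rule and the adjacent-wave tabulation, the following Python sketch applies a cumulative OR across waves and counts transitions only where both adjacent waves are observed. The function names, the use of `None` for a missing wave, and the choice to carry diagnoses across a missing wave are assumptions of this sketch, not details confirmed by the study.

```python
from collections import Counter

def carry_forward(states):
    """Apply the irreversibility rule to one participant's wave sequence.

    `states` is a list of 8-character binary strings, with None marking a
    wave without complete disease data (an assumption of this sketch).
    """
    out, seen = [], 0
    for s in states:
        if s is None:
            out.append(None)
            continue
        seen |= int(s, 2)  # once recorded, a condition stays recorded
        out.append(format(seen, "08b"))
    return out

def count_transitions(histories):
    """Count moves between adjacent waves; gaps are skipped, not imputed."""
    counts = Counter()
    for states in histories:
        cumulative = carry_forward(states)
        for src, dst in zip(cumulative, cumulative[1:]):
            if src is not None and dst is not None:
                counts[(src, dst)] += 1
    return counts
```

In this sketch a participant who reports hypertension at one wave and no condition at the next remains in the hypertension state, which is why the resulting profiles reflect cumulative diagnostic history.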
Transition probabilities were estimated empirically as follows:
$$P_{ij} = \frac{N_{ij}}{\sum_{k} N_{ik}}$$
where $N_{ij}$ is the number of individuals moving from state i to state j, and the denominator is the total number of individuals in state i at the start of the interval.
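The empirical estimate amounts to row-normalising the transition counts. A minimal Python sketch follows; the dictionary layout keyed by `(from_state, to_state)` and the function name are illustrative assumptions.

```python
from collections import defaultdict

def transition_probabilities(counts):
    """Divide each transition count by the total leaving its source state."""
    row_totals = defaultdict(int)
    for (src, _dst), n in counts.items():
        row_totals[src] += n
    return {
        (src, dst): n / row_totals[src]
        for (src, dst), n in counts.items()
    }
```

For example, if 10 participants start an interval in a state and 3 of them move to a given destination, the estimated probability of that transition is 0.3.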
2.4. Trajectory Measures
Each state’s outgoing transition distribution was summarised using four measures.
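Accumulation speed was defined as the expected one-interval increase in disease count:
$$\mathrm{AccSpeed}_i = \sum_{j} P_{ij}\,(c_j - c_i)$$
where $P_{ij}$ is the estimated probability of moving from state i to state j and $c_i$ is the disease count of state i.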
Branching was used as a descriptive measure of downstream diversification. It combined outgoing transition entropy with the variance in downstream disease counts:
Persistence was assessed using a lock-in score:
Normalised entropy was defined as follows:
$$H_i^{\mathrm{norm}} = \frac{-\sum_{j} P_{ij} \log P_{ij}}{\log(\mathrm{out\_degree}_i)}$$
when out_degree_i > 1 and as 0 otherwise.
These measures were selected because they correspond to distinct clinical questions. Accumulation speed asks whether participants in a profile tend to acquire additional diagnoses over the next interval. Branching asks whether a profile is followed by one dominant pathway or several downstream profiles with differing disease burden. Lock-in asks whether a profile tends to persist once reached. Stabilisation sensitivity asks whether changing the self-transition tendency of selected states would alter the transition structure. These measures describe cohort-level transition architecture and are not intended for individual risk prediction.
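Two of these measures follow directly from their verbal definitions and can be sketched in Python. The sketch assumes that each state's outgoing row is a dictionary from destination state to probability, and that entropy is normalised by the logarithm of the number of observed destinations. Branching and lock-in are not implemented here because their exact functional forms are not stated in the text above.

```python
import math

def accumulation_speed(src, row):
    """Expected one-interval change in disease count from state `src`.

    `row` maps each destination state (binary string) to its probability.
    """
    c0 = src.count("1")
    return sum(p * (dst.count("1") - c0) for dst, p in row.items())

def normalised_entropy(row):
    """Shannon entropy of the outgoing row, scaled to [0, 1].

    Assumption: normalisation divides by log(out-degree); rows with a
    single destination return 0.
    """
    probs = [p for p in row.values() if p > 0]
    if len(probs) <= 1:
        return 0.0
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(probs))
```

A row that always stays in place has accumulation speed 0 and normalised entropy 0, whereas a row spread evenly over several higher-count destinations scores high on both.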
2.5. Stabilisation Sensitivity
To assess whether particular states had disproportionate influence on long-run trajectory structure, two targeted stabilisation scenarios were compared within each cohort window: increasing self-transition probability by a fixed increment (alpha = 0.30) for the five highest-branching states, or for the five states with the highest accumulation speed. Remaining outgoing probabilities were reduced proportionally so that each transition row summed to 1. The resulting distribution was compared with the baseline distribution using L1 distance. Further sensitivity analyses compared alternative alpha values with a random eligible-state benchmark using a support-weighted transition-matrix L1 distance. Specifically, the row-wise L1 distance between perturbed and baseline transition matrices was averaged with weights proportional to from-state support, so that frequently visited states contributed more to the overall distance. For each source state, the row-wise L1 distance was calculated as:
$$d_i = \sum_{j} \left| P'_{ij} - P_{ij} \right|$$
where $P_{ij}$ and $P'_{ij}$ are the baseline and perturbed probabilities of moving from state i to state j.
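The perturbation and the support-weighted distance can be sketched in Python as follows. This is an illustration under stated assumptions rather than the study's code: the self-transition probability is capped at 1 when the increment would exceed it, rows are dictionaries from destination state to probability, and all function names are hypothetical.

```python
def stabilise_row(row, state, alpha=0.30):
    """Raise the self-transition of `state` by alpha and rescale the rest.

    Assumption: the new self-transition probability is capped at 1.
    """
    p_self = row.get(state, 0.0)
    new_self = min(1.0, p_self + alpha)
    rest = 1.0 - p_self
    scale = (1.0 - new_self) / rest if rest > 0 else 0.0
    new_row = {dst: p * scale for dst, p in row.items() if dst != state}
    new_row[state] = new_self
    return new_row

def row_l1(base_row, new_row):
    """Row-wise L1 distance between two outgoing distributions."""
    keys = set(base_row) | set(new_row)
    return sum(abs(new_row.get(k, 0.0) - base_row.get(k, 0.0)) for k in keys)

def weighted_l1(base, perturbed, support):
    """Average row-wise L1 distance, weighted by from-state support."""
    total = sum(support.values())
    return sum(
        support[s] / total * row_l1(base[s], perturbed[s]) for s in base
    )
```

When the cap is not reached, the row-wise distance for a stabilised state equals twice alpha, because the mass added to the self-transition is removed in equal total from the other destinations. The weighted distance therefore depends mainly on how much from-state support the targeted states carry.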
2.6. Cross-Cohort Reproducibility
For cross-cohort comparisons, analyses were restricted to states with at least 50 from-state observations within a cohort window. States were ranked by each dynamic measure and expressed as within-window percentile ranks. Rankings were summarised across windows within each cohort using from-state support as weights. States recurring in at least two cohorts were treated as reproducible key states. Sensitivity analyses examined thresholds of 30, 50, and 100 from-state observations (Table S7). To separate dynamic behaviour from state-space geometry, accumulation, branching, and lock-in scores were regressed on disease count and log-transformed from-state support within each cohort window, using from-state support as weights; residual scores were used to identify states whose dynamic rank exceeded what disease count and support alone would predict. Additional checks decomposed BranchScore into entropy-only and downstream disease-count-variance ranks (Table S11). We also compared observed profile counts with independent-prevalence expectations (Table S12) and implemented a disease-count-preserving permutation null model that preserved each transition’s from-state and target disease count while randomising target disease-combination identity among compatible irreversible states (Table S13).
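The ranking step can be illustrated with a short Python sketch covering the support threshold, within-window percentile ranks, and the support-weighted summary across windows. The percentile-rank convention used here (share of eligible states with a score at or below the state's own) and the function names are assumptions; the regression adjustment and the permutation null model are not sketched.

```python
def percentile_ranks(scores):
    """Map each state to the share of states scoring at or below it."""
    n = len(scores)
    return {
        s: sum(v <= x for v in scores.values()) / n
        for s, x in scores.items()
    }

def weighted_rank(windows, min_support=50):
    """Support-weighted mean percentile rank per state within one cohort.

    `windows` is a list of (scores, support) dictionary pairs, one per
    cohort window; states below `min_support` in a window are dropped.
    """
    num, den = {}, {}
    for scores, support in windows:
        eligible = {
            s: v for s, v in scores.items() if support[s] >= min_support
        }
        for s, r in percentile_ranks(eligible).items():
            num[s] = num.get(s, 0.0) + support[s] * r
            den[s] = den.get(s, 0) + support[s]
    return {s: num[s] / den[s] for s in num}
```

Re-running the summary with `min_support` set to 30 or 100 mirrors the threshold sensitivity analyses described above.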
2.7. Statistical Analysis
All analyses were conducted separately within each cohort using R version 4.3.3 (R Foundation for Statistical Computing, Vienna, Austria) in RStudio version 2024.12.1+563 (Posit Software, PBC, Boston, MA, USA). The study was descriptive and was designed to characterise transition patterns rather than estimate causal effects between diseases.
4. Discussion
In this longitudinal analysis of three national ageing cohorts, multimorbidity evolved through clinically recognisable disease profiles rather than through disease count alone. Early low-burden states, including hypertension, diabetes, heart disease, chronic lung disease, psychiatric or emotional disorders, and simple dyads, were more likely to accumulate further disease and branch into different downstream profiles. Later persistent states were dominated by cardiometabolic-musculoskeletal combinations, especially profiles involving hypertension, diabetes, heart disease, and arthritis.
These findings support the view that the clinical meaning of multimorbidity depends on composition as well as burden [12,17,18]. Patients with the same number of diagnoses may differ in prognosis, treatment complexity, monitoring needs, and vulnerability to functional decline [17,19]. The present analysis extends this idea longitudinally: states with similar counts did not occupy the same role in the transition structure. Some low-burden profiles opened into multiple possible pathways, whereas some higher-burden profiles were more stable and persistent.
The early branching pattern is clinically important because general internists often encounter patients with one or two common chronic conditions long before severe multimorbidity is established. In this study, hypertension and diabetes were frequent branching states, consistent with their central role in cardiometabolic multimorbidity. Rather than being only markers of existing burden, such conditions may mark cohort-level profiles in which observed future disease combinations are more heterogeneous.
The additional count-preserving null analysis required a more conservative interpretation of the early branching result. Low-burden states had more possible downstream combinations in the finite irreversible state space, and the observed BranchScore rankings did not exceed the disease-count-preserving null distribution. Thus, early branching should be read primarily as a structural feature of the cumulative transition framework, not as evidence that those profiles independently generate clinically open futures.
These population-level findings should not be read as deterministic sequences or validated risk predictions for individual patients. Their value is in drawing attention to common profiles in which routine follow-up may need to include broader assessment: emerging cardiometabolic disease, respiratory symptoms, musculoskeletal pain and function, and mental health, rather than isolated disease-specific monitoring alone.
The later persistence of cardiometabolic-musculoskeletal combinations has a different clinical implication. Profiles involving hypertension, diabetes, heart disease, and arthritis combine vascular risk, symptom burden, pain, mobility limitation, medication complexity, and competing treatment priorities [11,12,13,18]. The repeated presence of arthritis in persistent states is notable. In older adults, arthritis may complicate physical activity, self-management, and functional reserve, and may therefore help sustain long-term complexity once cardiometabolic disease has accumulated [20,21].
For clinical management, these persistent profiles are less about early prevention of any single condition and more about long-term integration of care. They may require medication reconciliation, renal and cardiovascular monitoring, attention to analgesic safety, support for physical activity within functional limits, and coordination across disease-specific recommendations [22]. The present study cannot test whether such care models improve outcomes, but it helps identify the kinds of profiles for which single-disease pathways are likely to be insufficient.
The cross-cohort pattern was consistent despite differences in baseline disease burden. CHARLS appeared to capture a more active phase of early accumulation and branching, HRS showed stronger persistence of higher-burden profiles, and ELSA was generally intermediate. Similar trajectory roles may therefore emerge at different stages of population-level multimorbidity development. For clinicians, the practical message is not that a single sequence applies to all patients, but that common early combinations and later cardiometabolic-musculoskeletal profiles deserve attention across health-care settings.
Cross-cohort reproducibility also strengthens the clinical interpretation. The cohorts differ in population composition, disease prevalence, and health-care context, yet similar roles recurred. At the same time, differences between cohorts may reflect age structure, baseline disease burden, diagnostic access, reporting behaviour, and health-system context. This balance between reproducibility and local variation is consistent with the way clinicians encounter multimorbidity: recognisable patterns recur, but their timing and burden differ across patients and settings.
The stabilisation analyses should be interpreted descriptively. They do not show that intervening on a state would prevent later disease. However, they indicate that selected high-branching states can produce larger structural perturbations than randomly selected eligible states in the transition matrix. This supports a cautious clinical focus on early profiles that lead to multiple downstream pathways, while recognising that intervention effectiveness must be tested in outcome-oriented studies.
The state-transition framework was useful because it separated three roles that disease count tends to merge: fast accumulation, branching into heterogeneous futures, and persistence once complex cardiometabolic-musculoskeletal disease is established. Together, these distinctions translate more directly into clinical follow-up priorities than disease count alone.
This study has several strengths. It used harmonised longitudinal data from three large national cohorts and examined specific disease combinations rather than relying only on counts. The state-transition framework distinguished accumulation, branching, persistence, and structural influence, allowing clinically interpretable roles to be assigned to common multimorbidity profiles.
Several limitations should be considered. Chronic conditions were based on self-reported physician diagnoses and may be affected by diagnostic access, recall, and reporting differences across countries [23]. The analysis was restricted to eight harmonised conditions and therefore cannot capture the full clinical complexity of older adults, including chronic kidney disease, osteoporosis, cognitive impairment, frailty, and other conditions that may alter multimorbidity pathways. Follow-up intervals and measurement procedures differed across cohorts, which may affect direct comparability. Disease prevalence and from-state support also shaped the transition graph; high-prevalence conditions naturally generated more observed dyads and triads. We therefore used support thresholds and count/support-adjusted checks, but residual prevalence effects may remain.
Additional limitations relate to the state definition and interpretation. The irreversibility assumption is clinically plausible for many included diagnoses and reduces bias from inconsistent reporting, but it also means that the analysis represents cumulative diagnostic history rather than contemporaneous disease status. This choice can increase apparent persistence and may particularly affect conditions with fluctuating symptoms or reporting, such as psychiatric or emotional disorders, chronic lung disease, and arthritis. Accordingly, lock-in may partly reflect administrative or record-based accumulation rather than stable biological persistence of every condition at every wave. Sensitivity analyses without the carry-forward rule changed the top-ranked accumulation and branching states more than the lock-in states (Figure S4; Table S6), reinforcing the need to interpret early branching as a property of the chosen cumulative-history framework. Because the analysis was confined to eight conditions, profiles with fewer diagnoses have more possible downstream combinations than profiles with more diagnoses, which can contribute to higher apparent branching for low-burden states. Count-stratified and support-adjusted analyses reduced but did not fully remove this effect. The disease-count-preserving permutation analysis further indicated that the branching pattern was largely compatible with the geometry of irreversible disease-count accumulation, whereas selected persistent profiles showed stronger evidence of disease-combination identity beyond disease count alone. Finally, disease severity, medication use, laboratory values, frailty, functional status, and health-care utilisation were not incorporated into the state definitions. The profiles should therefore be understood as broad clinical combinations rather than detailed phenotypes suitable for individual prognostication.