1. Introduction
Migraine is the second leading cause of disability worldwide [
1], yet its molecular origins and pathways of susceptibility remain insufficiently understood [
2]. Epidemiological evidence consistently demonstrates that metabolic syndrome (MetS) is a prevalent and clinically significant comorbidity in individuals with migraine, with cardiometabolic burden associated with a modest but consistently reported elevation in migraine risk [
3,
4,
5]. However, this association is typically interpreted as a comorbid relationship arising from shared risk factors rather than as an interaction within a coupled biological system. As a result, traditional risk factor paradigms fail to explain how peripheral metabolic dysregulation might contribute to the neural vulnerability characteristic of migraine.
To move beyond traditional risk factor paradigms, a systems biology perspective emphasizes that metabolic and neural circuits share a core functional imperative: both operate as predictive, interconnected networks designed to preserve homeostasis [
2,
6]. Metabolic networks anticipate and regulate energy demands, whereas neural circuits predict sensory threats and drive adaptive responses. Within this framework, MetS represents not only a cluster of metabolic abnormalities but also a state of heightened peripheral signal volatility, characterized by lipid-driven endothelial stress [
7], inflammatory drift, and instability in energy-related signalling [
6]. Conversely, migraine reflects a form of neural hypersensitivity in which cortical excitability and brainstem threat-prediction systems function near their stability margins [
2]. This functional interdependence raises the possibility that chronic metabolic disturbances may increase the burden of peripheral signalling fluctuations on neural systems, thereby contributing to vulnerability in the transitions that manifest clinically as migraine attacks [
8].
Building on this system-level view, multiomics studies have begun to identify shared genetic, lipidomic, and inflammatory mechanisms linking metabolic dysfunction with migraine [
5,
8]. For example, these studies have highlighted that in the context of cardiometabolic burden, migraine is accompanied by altered inflammatory signalling and dysregulated lipid metabolism [
5]. However, most studies focus on patients with established disease or conceptualize risk as a static accumulation of cardiometabolic factors. Such approaches provide limited insight into the molecular features preceding migraine diagnosis and the pathways through which metabolic disturbances are relayed towards neural targets [
9].
To address these unresolved diagnosis-stratified and pathway-level questions, divergence-based proteomic analysis provides a relevant quantitative basis. Theory predicts that systems approaching a state shift exhibit increasing variance, diminished resilience, and network reorganization, features that have been validated as early warning signatures [
10] across complex disorders, including cancer metastasis, diabetic kidney disease, epilepsy, and depression [
11,
12,
13]. Because these signatures manifest as quantifiable changes in molecular variability and network structure, they can be detected at scale in population-level proteomic distributions, thereby illuminating early molecular transitions that may link peripheral metabolic states to emerging neurological vulnerability. However, it remains unknown whether systematic proteomic divergence patterns can be detected within prediagnostic profiles linking MetS to incident migraine in large population cohorts.
This study therefore aimed to develop a diagnosis-stratified framework by characterizing proteomic variability before migraine in individuals with MetS and examining how this variability is organized across pathways, bridge proteins, and druggable targets. To achieve this goal, we leveraged time-stratified analyses of 47,620 UK Biobank participants with deep proteomic profiling (2923 proteins) to characterize the network features of cross-system proteomic variability (
Figure 1). We identified a distinct diagnosis-anchored window of proteomic divergence peaking approximately five years before the diagnosis of migraine in individuals with MetS, a pattern not observed in the parallel within-stratum analysis of the NoMetS stratum. Through directed network diffusion modelling, we further revealed pathways exhibiting directional consistency with a MetS-to-migraine directional axis, and we identified candidate bridge proteins positioned at the interface of metabolic volatility and neural susceptibility. Together, these findings delineate cross-system proteomic variability at the MetS–migraine interface and provide a basis for mechanistic investigation of the identified variability window, pathways, and bridge proteins.
3. Discussion
To address the limited understanding of how metabolic dysregulation might relate to migraine susceptibility over time, we integrated prospective cohort data with baseline proteomics stratified by time-to-migraine diagnosis and directed diffusion-based network modelling to explore when, and through which molecular pathways, MetS-related perturbations might be associated with increased migraine risk. Our study reframes the MetS–migraine comorbidity by integrating time-stratified proteomics with directed network dynamics, characterizing the relationship not as a static accumulation of risk but as a dynamic cross-system vulnerability pattern. We identify a proteomic divergence pattern peaking 4.71–6.76 years before migraine onset within a convergent T2 window in the MetS group. In the NoMetS stratum, the parallel within-stratum analysis showed no T2-centered peak. This proteomic divergence pattern is structurally underpinned by a directionally consistent subnetwork comprising 11 novel pathways alongside established routes, which intersects with the T2-window divergence signal to prioritize seven candidate bridge proteins (e.g., MFGE8, SRC, NRGN) whose tissue-expression profiles place them at the interface of metabolic, endothelial, and neural tissues. Collectively, these data suggest that MetS may act as a persistent source of peripheral signal volatility rather than a static risk factor, with this variability mapped to these molecular relays, thereby supporting a framework for understanding how chronic metabolic noise may be associated with reduced resilience of neural systems.
Against this system-level backdrop, we next examined how individual MetS components related to migraine risk at the epidemiologic level. Epidemiologically, the association between MetS and migraine was unevenly distributed across the syndrome components. Cox regression confirmed an overall link (HR 1.09, 95% CI 1.01–1.18). However, in the mutually adjusted models, only low HDL-C levels (HR 1.12,
p = 0.008) and elevated triglyceride levels (HR 1.12,
p = 0.005) retained independent associations with incident migraine; central obesity, hypertension, and hyperglycaemia did not reach significance. This selective concentration of risk within the lipid domain aligns with earlier reports implicating HDL dysfunction [
3] and triglyceride-rich lipoproteins [
5] in migraine and vascular dysfunction. Mechanistically, chronic dyslipidaemia produces oxidized lipoproteins and bioactive ceramides that activate endothelial cells and sustain low-grade vascular inflammation [
14,
15]. Because these bioactive species are continually generated and cleared [
14,
15], circulating lipids impose recurrent signalling bursts rather than the relatively constant mechanical load of adiposity. Notably, our observation that independent associations were confined to lipid components rather than adiposity measures aligns with this mechanistic distinction. Within our systems framework, this pattern suggests that the MetS–migraine link is preferentially loaded onto dynamic lipid signalling rather than static anthropometric burden, which is consistent with the hypothesis that migraine susceptibility may be more responsive to chronic signalling noise than to structural pressure alone. While these epidemiologic associations do not establish causality, they are consistent with lipid-driven systemic perturbations, rather than adiposity alone, representing one plausible epidemiologic substrate for the T2-window proteomic divergence observed in this analysis, and they suggest this axis as a priority for future investigation.
At the T2 window, the convergence of the peak ICI and heightened protein fluctuation indicates pronounced proteomic-profile divergence [
16]. These findings suggest that the epidemiological risk observed at the population level is accompanied by elevated proteomic-profile divergence in this diagnosis-anchored window [
11,
16]. Similar divergence-based molecular patterns have also been discussed in studies of cancer and viral infections [
12,
16,
17]. In the NoMetS stratum, the parallel within-stratum analysis showed variation across windows with no clearly distinct T2 peak. Consequently, T2 represents a diagnosis-anchored window of elevated divergence in proteomic profiles in the MetS group, providing a focused basis for dissecting the specific molecular pathways potentially linking metabolic disturbances to neural susceptibility [
11,
12,
16].
Within this vulnerability window, the set of pathways identified by network modelling exhibited MetS-to-migraine directional consistency. Specifically, the recovery of NF-κB–mediated inflammatory signalling and VEGF-related vascular pathways [
18,
19] provides an internal plausibility check, indicating that the directed diffusion framework (TieDIE) [
20] can recover pathways with prior pathway-level relevance to metabolic and vascular dysfunction and migraine. This pattern supports the interpretability of the network ranking and reduces concern that the findings are driven solely by topological artefacts. Beyond these known axes, the 11 novel pathways further refine the cross-system interface. Pathways involving LUBAC-mediated linear ubiquitination and NF-κB regulatory mechanisms [
21,
22] suggest potential amplification of systemic stress; integrin adhesion and receptor tyrosine kinase modules [
23,
24] suggest complex endothelial sensing and barrier gating; and oestrogen-responsive neuroplasticity pathways [
25] connect this network pattern to neural circuits implicated in female-predominant migraine susceptibility. Collectively, these patterns outline a metabolically anchored, endothelial-facing, and neuronally connected subnetwork compatible with a cross-system vulnerability architecture rather than diffuse, nonspecific coupling. Although hub-driven effects cannot be fully excluded, the persistence of this topology under degree-preserving randomization points to a non-random, direction-consistent subnetwork pattern, in keeping with the cross-system vulnerability architecture suggested by the epidemiologic and proteomic divergence findings.
At the structural intersection between ICI-contributing proteins from the T2 window and the identified directed subnetwork, seven proteins emerged as candidate bridge nodes. Tissue specificity analysis revealed that these proteins are not uniformly distributed but are distinctly enriched in peripheral, endothelial, and neural tissues. Their known functions align with this anatomical segregation. In peripheral tissues, MFGE8 and IKBKG regulate lipid uptake and inflammatory signalling [
22,
26], suggesting that they sense systemic metabolic fluctuations. In vascular-enriched tissues, SRC [
27] and FGF2 [
28] have opposing effects on endothelial integrity: SRC promotes barrier disassembly, while FGF2 supports barrier stabilization. The concurrent identification of these countervailing factors suggests that the high variance observed in T2 may reflect a molecular “tug-of-war” and a state of homeostatic strain at the vascular interface. Finally, in the central nervous system, NRGN and STAT5B modulate synaptic plasticity and excitability. Taken together, these anatomically distinct but functionally connected nodes are compatible with a hypothesized cross-system network pattern and are consistent with a conceptual link from metabolic noise to neural susceptibility with endothelial involvement highlighted by the network and tissue-expression analyses.
In the exploratory pathway-based drug–target proximity analysis, three previously approved agents, namely, valproic acid, carvedilol and terazosin, exhibited the strongest alignment with the novel pathway set. Valproic acid, a guideline-recommended prophylactic for migraine [
29], was enriched across ten pathways. Beyond its GABAergic and sodium-channel actions, valproic acid reportedly reduces endothelial ICAM-1 and VCAM-1 expression via NF-κB inhibition, which is consistent with the inflammatory–endothelial axis highlighted here. Carvedilol, a vasodilating β-blocker widely used for MetS-related hypertension, was enriched across seven pathways; its antioxidant activity has been linked to improved endothelium-dependent vasodilatation [
30], and β-blockers have demonstrated migraine-preventive efficacy in randomized trials, aligning with the metabolic–vascular tier of the proposed network pattern. Terazosin, an α1-adrenergic antagonist prescribed for MetS-related conditions, was enriched across four pathways; although it has not previously been implicated in migraine, it activates Pgk1 and confers neuroprotection in rodent stroke and sepsis models [
31], which may be relevant to the neuronal vulnerability emphasized in our framework. Taken together, these convergent signals provide biologically coherent repurposing hypotheses that may help prioritize future pharmacoepidemiologic and, where feasible, experimental or clinical evaluation in MetS-related contexts.
This study has several limitations. The observational design precluded definitive causal inference, and the directionality suggested by diffusion-based modelling reflected direction-consistent patterns under a directed diffusion framework rather than proven biological flow. Because proteomic measurements were obtained only at baseline and the timing of MetS onset was unavailable, we could not determine whether the diagnosis-anchored T2 window corresponds to a clinically forward-looking transition point relative to MetS duration. The Olink panel, although comprehensive, is enriched for cardiometabolic and inflammatory proteins and may underrepresent neuron-specific markers. Additionally, NPX is Olink’s arbitrary log2-scaled unit and supports relative quantification only; measurements for low-abundance analytes near or below the limit of detection may have reduced reliability. Although batch- and plate-related variation was minimized through UKB-PPP quality-control procedures, technical variation inherent to proximity extension assays cannot be fully excluded, and imputation of missing protein values may introduce additional measurement uncertainty. Medication use was represented by baseline treatment categories, which may have contributed to residual misclassification of MetS components. Replication in independent cohorts with denser temporal sampling and broader ancestral representation is needed. Finally, although these projected drug repurposing signals were biologically coherent, their clinical relevance in individuals with MetS requires experimental and pharmacoepidemiologic validation.
4. Materials and Methods
4.1. Study Participants
The UK Biobank included approximately 500,000 adults aged 40–69 years who were enrolled between 2006 and 2010 [
32]. To examine incident migraine epidemiologically, we excluded individuals who had migraine at baseline and applied a two-year landmark period [
33] to reduce the likelihood of reverse causation; this resulted in a cohort of 452,471 participants. For proteomic analyses, we restricted the sample to UK Biobank participants with baseline Olink Explore 3072 proteomic data (
n = 53,013). After successful linkage was established with the curated MetS and migraine labels, the final proteomic subset comprised 47,620 participants.
4.2. Proteomic Profiling
Plasma proteins were assessed using the Olink Explore 3072 platform, which employs proximity extension assays (PEA) across various panels, including cardiometabolic, inflammatory, neurologic, and oncologic. Protein levels are reported as normalized protein expression (NPX) values on a log2 scale. The dataset, which included 2923 distinct protein targets, was derived from the QC-processed UK Biobank Pharma Proteomics Project (UKB-PPP) NPX dataset and used for further analysis [
34,
35]. These PEA-based protein profiles provide high-specificity, multiplex proteomic measurements at population scale, supporting analyses of diagnosis-anchored proteomic divergence and pathway-level molecular context in this cohort.
4.3. Exposure and Outcome Definitions
MetS was defined using the 2005 International Diabetes Federation (IDF) criteria [
36]. Participants met the MetS classification if they had central obesity, identified by a waist circumference of ≥94 cm for men or ≥80 cm for women, plus at least two of the following factors: elevated triglycerides (≥1.7 mmol/L); low high-density lipoprotein (HDL) cholesterol (<1.03 mmol/L for men or <1.29 mmol/L for women); elevated blood pressure (systolic ≥ 130 mmHg or diastolic ≥ 85 mmHg); and elevated glucose (≥5.6 mmol/L), with baseline lipid-lowering, antihypertensive, or insulin medication use counted toward the corresponding component.
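The classification rule above can be written as a small function. This is a minimal sketch only: the function name, argument names, and the per-component `treated_*` flags are hypothetical, and how baseline medication categories map onto those flags is an assumption left to the caller rather than something stated in the text.

```python
def has_mets_idf(sex, waist_cm, tg, hdl, sbp, dbp, glucose,
                 treated_tg=False, treated_hdl=False,
                 treated_bp=False, treated_glucose=False):
    """Return True if a participant meets the 2005 IDF MetS criteria.

    Lipids and glucose are in mmol/L, blood pressure in mmHg.
    The treated_* flags (hypothetical) mark medication counted toward
    the corresponding component.
    """
    male = sex == "male"
    # Central obesity is mandatory under the IDF definition.
    central_obesity = waist_cm >= (94 if male else 80)
    components = [
        tg >= 1.7 or treated_tg,                        # elevated triglycerides
        hdl < (1.03 if male else 1.29) or treated_hdl,  # low HDL cholesterol
        sbp >= 130 or dbp >= 85 or treated_bp,          # elevated blood pressure
        glucose >= 5.6 or treated_glucose,              # elevated glucose
    ]
    # Central obesity plus at least two further components.
    return central_obesity and sum(components) >= 2
```

Under this sketch a participant without central obesity is never classified as MetS, regardless of how many other components are present.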
Migraine diagnoses were determined from linked health records with the International Classification of Diseases, 10th Revision (ICD-10) code G43, supplemented by self-reported migraine history. At baseline, participants reported whether they had ever had a migraine (past or current) and, where applicable, the date of their migraine diagnosis. These self-reported responses were combined with ICD-10 G43 records to define prevalent and incident migraine. Prevalent migraine at baseline was identified by an ICD-10 G43 record dated on or before baseline, self-reported migraine at baseline, or a self-reported diagnosis prior to baseline. Incident migraine was defined as the first occurrence of either ICD-10 G43 coding or self-reported migraine after baseline. Follow-up duration was measured from the baseline assessment date to the earliest of incident migraine, death, or administrative censoring (30 November 2022).
4.4. Missing Data Imputation
Covariate missingness included item non-response (e.g., “do not know” or “prefer not to answer”) and incomplete questionnaire or assessment entries in the UK Biobank baseline data. Missing covariate data, including MetS components and adjustment variables, were imputed in R using a published random forest strategy with predictive mean matching [
34], implemented with the missRanger package. Five imputed datasets were generated; each used ten iterations and 200 trees. Responses recorded as “do not know” were treated as missing and were imputed, whereas “prefer not to answer” responses were set to missing after imputation. Age, sex, and outcome/time variables were not included. Before imputation, patterns of covariate missingness were summarized by variable and evaluated for associations with age and sex to inform missing-at-random assumptions, and out-of-bag prediction errors from the missRanger models were summarized across imputations as a measure of imputation model performance. Accordingly, for questionnaire-derived categorical variables in
Supplementary Table S1, percentages were calculated using variable-specific available denominators.
Protein expression values were imputed using the miceforest package in Python. Proteins missing in more than 30% of participants were excluded as predictors for imputing each protein. A single dataset was imputed with up to five iterations, while all the other parameters remained at their default settings. NPX values were scaled from 0 to 1 and centred on the median after analysis [
34].
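As an illustration of the NPX rescaling step, the sketch below min-max scales one protein's values to the 0–1 range and then subtracts the median. The function name and the order of the two operations are assumptions; the text does not specify either, and the published pipeline may differ.

```python
from statistics import median

def scale_and_centre(npx_values):
    """Min-max scale one protein's NPX values to [0, 1], then centre on the median."""
    lo, hi = min(npx_values), max(npx_values)
    if hi == lo:
        # A constant protein carries no spread; return zeros (assumed convention).
        return [0.0] * len(npx_values)
    scaled = [(v - lo) / (hi - lo) for v in npx_values]
    centre = median(scaled)
    return [s - centre for s in scaled]
```

After this transformation each protein has a range of exactly 1 and a median of 0, which puts proteins with different dynamic ranges on a comparable footing.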
In the proteomic analysis cohort, MetS status was determined within each of the five imputed covariate datasets and consolidated for each participant through majority voting. The probability score was calculated as the mean value across imputations. Migraine labels were sourced from the first imputed dataset, as outcome variables were not imputed.
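The consolidation across imputations can be sketched as follows. The function name is hypothetical, and the strict majority rule (more than half of the imputations) is an assumption that is unambiguous for an odd number of imputed datasets such as five.

```python
def consolidate_mets(imputed_labels):
    """Combine per-imputation MetS labels (0/1) for one participant.

    Returns (majority_label, probability_score), where the score is the
    mean label across imputations.
    """
    probability = sum(imputed_labels) / len(imputed_labels)
    return probability > 0.5, probability
```

For example, a participant classified as MetS in three of five imputed datasets receives the label `True` with a probability score of 0.6.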
4.5. sJSD-Based Proteomic Divergence Analysis
To identify proteomic signatures that preceded disease onset, a nested case–control design stratified by MetS status was employed. All analyses were conducted separately within the MetS and non-MetS (NoMetS) strata to describe within-stratum proteomic variability patterns; the NoMetS stratum served as a parallel within-stratum reference. Within each stratum, propensity score matching was applied to balance baseline covariates and to define a fixed reference pool [
37] (details in
Supplementary Figure S3). Incident migraine cases were subsequently grouped into five time windows based on time-to-event quintiles, which represented the intervals from baseline to migraine diagnosis.
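The quintile-based windowing can be illustrated with the standard library. This is a sketch under assumptions: the function name is hypothetical, windows are indexed 0–4 in order of increasing time to diagnosis, and how those indices map onto the T1–T5 labels used in the paper is not specified here.

```python
from bisect import bisect_left
from statistics import quantiles

def assign_windows(years_to_diagnosis):
    """Assign each case to one of five windows by time-to-event quintile.

    Returns a list of window indices (0 = shortest interval from
    baseline to diagnosis, 4 = longest).
    """
    cuts = quantiles(years_to_diagnosis, n=5)  # four quintile cut points
    return [bisect_left(cuts, t) for t in years_to_diagnosis]
```

Because the cut points are estimated from the cases themselves, each window holds roughly one fifth of the incident cases.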
Proteomic signatures were quantified using single-sample Jensen–Shannon divergence (sJSD) [
16]. For each stratum, protein expression distributions in cases were compared with those in the reference pool by fitting Gaussian distributions to reference NPX values for each protein, transforming both reference and case NPX values into cumulative probabilities using the normal cumulative distribution function, and calculating the Jensen–Shannon divergence across all measured proteins. An inconsistency index (ICI) was derived as the mean divergence, where higher scores indicate greater deviation from the reference proteomic state.
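The steps described above can be sketched in a few lines. This is one plausible reading, not the published sJSD implementation: it assumes that each case's per-protein cumulative probabilities are normalised into a discrete distribution across proteins, that the reference profile is the normalised mean of the reference samples' cumulative probabilities, and that the window-level ICI is the mean divergence over cases. All function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def jsd(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def normalise(values):
    total = sum(values)
    return [v / total for v in values]

def window_ici(reference, cases):
    """Mean divergence of case profiles from the reference proteomic state.

    reference, cases: lists of samples; each sample is a list of NPX
    values, one per protein, in the same protein order.
    """
    n_proteins = len(reference[0])
    # Fit one Gaussian per protein on the reference pool only.
    fits = []
    for j in range(n_proteins):
        column = [sample[j] for sample in reference]
        fits.append(NormalDist(mean(column), stdev(column)))

    def to_cdf(sample):
        return [fits[j].cdf(sample[j]) for j in range(n_proteins)]

    ref_cdf = [to_cdf(sample) for sample in reference]
    ref_profile = normalise([mean(r[j] for r in ref_cdf) for j in range(n_proteins)])
    return mean(jsd(normalise(to_cdf(case)), ref_profile) for case in cases)
```

A case sitting at the reference mean for every protein yields a divergence close to zero, while a case that is extreme in opposite directions across proteins yields a larger value, which is the behaviour the ICI is meant to capture.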
Differences across time-to-event windows were assessed using pairwise permutation tests (5000 iterations) between the peak window and the others, with multiple-comparison control using Benjamini-Hochberg and Holm methods. The null distribution of the window score was estimated via a 10,000-iteration resampling procedure. Stability was evaluated using 1000-iteration bootstraps on protein subsets and case samples, and peak frequency across windows was tested by chi-square goodness-of-fit. Additional sensitivity analyses examined consistency across different matching specifications and case compositions in each stratum.
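A pairwise permutation test between the peak window and one comparison window, followed by Benjamini-Hochberg adjustment, might look like the sketch below. The test statistic (absolute difference in mean divergence), the add-one p value estimator, and the function names are assumptions; the text specifies only the number of iterations and the correction methods.

```python
import random
from statistics import mean

def permutation_p(peak, other, n_iter=5000, seed=0):
    """Two-sided permutation p value for a difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(peak) - mean(other))
    pooled = list(peak) + list(other)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)  # relabel samples at random
        diff = abs(mean(pooled[:len(peak)]) - mean(pooled[len(peak):]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_iter + 1)  # add-one estimator avoids p = 0

def benjamini_hochberg(p_values):
    """Return BH-adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In use, `permutation_p` would be called once per comparison window against the peak window, and the resulting four p values passed together to `benjamini_hochberg`.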
4.6. Network Proximity Analysis
Disease-associated pathways were identified through overrepresentation analysis using Fisher’s exact test (one-sided,
p ≤ 0.05) applied to disease–gene associations from the Open Targets Platform (v25.09) [
38]. Network-based proximity between disease and pathway genes was quantified as the average shortest-path distance within the STRING protein-protein interaction (PPI) network (v12.0) [
39,
40]. Statistical significance was determined by comparing observed distances against a null distribution generated through degree-preserving randomization (1000 iterations). Pathways with
z scores ≤ −2.0 were considered significantly proximal to each disease. Pathways showing significant proximity to both MetS and migraine were designated as shared pathways. Pathway annotations were obtained from Reactome through the Molecular Signatures Database (MSigDB; v2025.1) [
41,
42].
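The proximity z score can be sketched on a toy graph as follows. Two points are assumptions made for illustration: the null distribution is built by resampling degree-matched node sets (used here as a simple stand-in for the degree-preserving randomization described above), and the graph is assumed to be connected. The function names and the adjacency-dictionary representation are hypothetical.

```python
import random
from collections import deque
from statistics import mean, stdev

def bfs_distances(graph, source):
    """Shortest-path lengths from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def avg_shortest_path(graph, set_a, set_b):
    """Average shortest-path distance over all reachable (a, b) pairs."""
    lengths = []
    for a in set_a:
        dist = bfs_distances(graph, a)
        lengths.extend(dist[b] for b in set_b if b in dist)
    return mean(lengths)

def degree_matched_sample(graph, nodes, rng):
    """Replace each node with a random node of the same degree."""
    by_degree = {}
    for node, neighbours in graph.items():
        by_degree.setdefault(len(neighbours), []).append(node)
    return [rng.choice(by_degree[len(graph[n])]) for n in nodes]

def proximity_z(graph, set_a, set_b, n_iter=1000, seed=0):
    """z score of the observed distance against the degree-matched null."""
    rng = random.Random(seed)
    observed = avg_shortest_path(graph, set_a, set_b)
    null = [
        avg_shortest_path(graph,
                          degree_matched_sample(graph, set_a, rng),
                          degree_matched_sample(graph, set_b, rng))
        for _ in range(n_iter)
    ]
    return (observed - mean(null)) / stdev(null)
```

A strongly negative z score indicates that the two gene sets sit closer together than degree-matched random sets, which is the sense in which pathways with z ≤ −2.0 were treated as proximal.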
Pathways shared between MetS and migraine were evaluated through Pearson correlation and linear regression analyses of proximity profiles. Statistical significance was assessed against degree-preserving randomized networks to control for network topology effects. Each pathway was classified as either “Known” or “Novel” based on structured pathway-level literature review. Pathways with published pathway-level associations with MetS, migraine, or a core MetS component were designated “Known”, whereas pathways for which no such pathway-level evidence was identified were designated “Novel”. Here, “Novel” refers to pathways that were previously underrecognized at the pathway level in the context of MetS and migraine, rather than entirely unprecedented biological mechanisms. These shared pathways provided a disease-anchored candidate set for subsequent directed diffusion analysis, ensuring that signal propagation was restricted to routes jointly proximal to both disease modules rather than the entire interactome.
4.7. Mediator Subnetwork Identification and Directional Characterization
To identify signalling pathways linking MetS and migraine, Tied Diffusion Through Interacting Events (TieDIE) [
20] was applied to the Superpathway Directed Signalling Network (v2.0), which integrates curated directed signalling relationships from multiple pathway resources and is well suited for directionality inference. Disease-associated genes from the Open Targets Platform were used as the source set (MetS) and target set (migraine), with genes filtered to those present in the network. Initial heat values were diffused across the network using a matrix exponentiation–based heat diffusion kernel. Forward diffusion propagated heat from MetS genes along directed edges, whereas reverse diffusion propagated heat from migraine genes through transposed edges, effectively running the algorithm in the opposite direction. Because diffusion involves continuous heat decay, nodes located closer to the source in the forward direction display stronger red heat, whereas nodes closer to the target in the reverse direction display stronger blue heat. Nodes exhibiting high heat from both diffusion processes (above the per-direction 75th percentile) were extracted as the mediator subnetwork, representing genes positioned along putative molecular paths connecting MetS and migraine. The 75th percentile cut-off was chosen as a compromise between network sparsity and coverage, and robustness to this choice was evaluated by repeated analyses at the 70th and 80th percentile thresholds.
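The two-direction diffusion and the mediator cut-off can be illustrated on a toy network. This sketch is not the TieDIE implementation: it assumes a kernel of the form exp(tM) built from an out-degree Laplacian, evaluates it with a truncated Taylor series instead of a full matrix exponential, and uses an arbitrary diffusion time `t`. All names and defaults are hypothetical.

```python
from statistics import quantiles

def diffuse(edges, nodes, initial_heat, t=0.1, terms=30):
    """Approximate exp(t * M) @ h0 on a directed graph.

    M moves heat from u to v along each edge (u, v): heat leaves u at a
    rate equal to its out-degree and arrives at its successors.
    """
    index = {n: i for i, n in enumerate(nodes)}
    size = len(nodes)
    M = [[0.0] * size for _ in range(size)]
    for u, v in edges:
        M[index[v]][index[u]] += 1.0  # inflow to v from u
        M[index[u]][index[u]] -= 1.0  # outflow from u
    term = [initial_heat.get(n, 0.0) for n in nodes]
    total = term[:]
    for k in range(1, terms + 1):
        # k-th Taylor term: (t * M)^k h0 / k!
        term = [t / k * sum(M[r][c] * term[c] for c in range(size))
                for r in range(size)]
        total = [a + b for a, b in zip(total, term)]
    return dict(zip(nodes, total))

def mediators(forward_heat, reverse_heat):
    """Nodes above the per-direction 75th percentile in both directions."""
    f_cut = quantiles(forward_heat.values(), n=4, method="inclusive")[2]
    r_cut = quantiles(reverse_heat.values(), n=4, method="inclusive")[2]
    return {n for n in forward_heat
            if forward_heat[n] > f_cut and reverse_heat[n] > r_cut}
```

Reverse diffusion is obtained by calling `diffuse` on the transposed edge list, for example `[(v, u) for u, v in edges]`, seeded with the target genes. On a simple chain A→B→C, forward heat decays from A towards C and reverse heat decays from C towards A, matching the red and blue gradients described above.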
Pathway enrichment within the mediator subnetwork was evaluated using Fisher’s exact test for the 50 shared pathways identified above. For each pathway, directionality was calculated as the mean MetS-direction heat minus the mean migraine-direction heat across mediator genes. Positive values indicate relative proximity to the MetS source, whereas negative values indicate relative proximity to the migraine target. Directional balance was quantified using a distance-to-diagonal metric, defined as |red_heat − blue_heat|/(red_heat + blue_heat). Bridge score was calculated as the product of red and blue heat. Statistical significance was assessed using the Mann-Whitney U test and generalized linear models that included pathway size and node degree as covariates. Robustness was further evaluated by varying the heat thresholds (70th, 75th, and 80th percentiles) and false discovery rate (FDR) cut-offs (0.01, 0.05, and 0.10).
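The three pathway-level quantities defined above are simple functions of the forward (red) and reverse (blue) heat. The sketch below computes all of them from the pathway's mean heats; whether the distance-to-diagonal and bridge score are evaluated on pathway means or per gene is an assumption, and the function name is hypothetical.

```python
from statistics import mean

def pathway_direction_metrics(red_heat, blue_heat):
    """Summarise one pathway from its mediator genes' heat values.

    red_heat:  MetS-direction (forward) heat per mediator gene
    blue_heat: migraine-direction (reverse) heat per mediator gene
    """
    red, blue = mean(red_heat), mean(blue_heat)
    return {
        # > 0: closer to the MetS source; < 0: closer to the migraine target
        "directionality": red - blue,
        # 0 = perfectly balanced, 1 = heat from one direction only
        "distance_to_diagonal": abs(red - blue) / (red + blue),
        # large only when heat is high in both directions
        "bridge_score": red * blue,
    }
```

For a pathway with mean red heat 0.5 and mean blue heat 0.3, this gives a directionality of 0.2, a distance-to-diagonal of 0.25, and a bridge score of 0.15.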
4.8. Drug–Pathway Prioritization and Target Contextualization
Candidate drugs for the direction-consistent novel pathways identified in the directed diffusion analysis were prioritized using ProXimal Pathway Enrichment Analysis (PxEA) [
43,
44]. PxEA evaluates the topological proximity between drug targets and pathway genes within the PPI network and calculates enrichment statistics similar to gene set enrichment analysis. Statistical significance was determined by permutation testing (1000 iterations), and multiple testing was controlled using the Benjamini-Hochberg method. Before PxEA, drugs were filtered for indications related to MetS or migraine.
Drug target information was retrieved from DrugBank (version 5.1.10) [
45]. The overlap between candidate pathway-mapped drug targets and the top 500 dynamic network biomarker proteins identified in the Olink proteomic dataset was examined. For these overlapping proteins, cross-tissue expression validation was performed using data from the Human Protein Atlas [
46] and the GTEx Consortium [
47]. Associations between protein expression and MetS status were evaluated using analysis of covariance (ANCOVA), adjusted for age and sex.
Effect sizes are reported as Cohen’s d values.
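For reference, Cohen's d with a pooled standard deviation can be computed as below. This is the unadjusted two-group form; how the effect size was derived from the age- and sex-adjusted ANCOVA is not stated, so the sketch should be read as the textbook definition rather than the exact calculation used. The function name is hypothetical.

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Standardised mean difference using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) +
                  (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)
```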
4.9. Statistical Analysis
All statistical analyses were performed in R (version 4.2.1) and Python (version 3.9.13). Associations between MetS and incident migraine were examined using Cox proportional hazards regression implemented in the survival package. Within the Cox regression and other statistical models described in this section, R functions were used with their default parameter settings unless explicitly stated otherwise. Three sequential models with increasing levels of covariate adjustment were specified. Model 1 included age, sex, and ethnicity. Model 2 was additionally adjusted for socioeconomic indicators (income, education, and the Townsend deprivation index [tertiles]). Model 3 further incorporated lifestyle factors (smoking status, alcohol consumption, sleep duration, and physical activity). The proportional hazards assumption was verified using Schoenfeld residuals.
Cox models were separately fitted for each of the multiply imputed datasets. Estimates were pooled using Rubin’s rules to combine within- and between-imputation variance, resulting in overall hazard ratios (HRs), standard errors, and 95% confidence intervals (CIs). Dose–response relationships were assessed by modelling the number of MetS components (0–4) as both categorical and continuous exposures. Linear trend tests were conducted across the ordered categories. The independent associations of individual MetS components (central obesity, hypertension, hyperglycaemia, elevated triglycerides, and reduced HDL cholesterol [HDL-C]) were evaluated using mutually adjusted Cox models that included all five components simultaneously. Subgroup analyses were stratified by age, sex, ethnicity, socioeconomic status, and lifestyle factors. Interaction p values were obtained by comparing Cox models with and without multiplicative interaction terms for each subgroup variable using a likelihood-ratio chi-square test; p values from each imputed dataset were then combined using Fisher’s method.
p values less than 0.05 were considered to indicate statistical significance; interaction
p values were Bonferroni-adjusted for multiple testing across the subgroup comparisons. Additional methodological details are provided in the
Supplementary Methods.
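The two pooling steps described in this section, Rubin's rules for Cox estimates and Fisher's method for interaction p values, can be sketched as follows. The inputs are assumed to be per-imputation log hazard ratios with their squared standard errors; the confidence interval uses a normal quantile (1.96) rather than the t reference distribution with Rubin's degrees of freedom, which is a simplification. Function names are hypothetical.

```python
import math
from statistics import mean, variance

def pool_rubin(log_hrs, squared_ses):
    """Pool per-imputation estimates with Rubin's rules.

    Returns (pooled log HR, pooled standard error).
    """
    m = len(log_hrs)
    pooled = mean(log_hrs)
    within = mean(squared_ses)      # average within-imputation variance
    between = variance(log_hrs)     # between-imputation variance
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)

def hazard_ratio_ci(log_hr, se, z=1.96):
    """Back-transform to the HR scale with an approximate 95% CI."""
    return (math.exp(log_hr),
            math.exp(log_hr - z * se),
            math.exp(log_hr + z * se))

def fisher_combine(p_values):
    """Combine p values with Fisher's method.

    The statistic X = -2 * sum(ln p) follows a chi-square distribution
    with 2k degrees of freedom, whose survival function has a closed
    form for even degrees of freedom.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # X / 2
    return math.exp(-half_stat) * sum(
        half_stat ** i / math.factorial(i) for i in range(k))
```

With a single p value, `fisher_combine` returns that value unchanged, which is a convenient sanity check on the closed-form survival function.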