Sweat Proteomics in Cystic Fibrosis: Discovering Companion Biomarkers for Precision Medicine and Therapeutic Development

In clinical routine, the diagnosis of cystic fibrosis (CF) is still challenging despite international consensus on diagnosis guidelines and tests. For decades, the classical Gibson and Cooke test measuring sweat chloride concentration has been a keystone, yet it may provide normal or equivocal results. As of now, despite the combination of sweat testing, CFTR genotyping, and CFTR functional testing, a small fraction (1–2%) of diagnoses remain inconclusive, which justifies the search for new CF biomarkers. More importantly, in the context of precision medicine, with a view to early diagnosis, better prognosis, appropriate clinical follow-up, and new therapeutic development, discovering companion biomarkers of CF severity and phenotypic rescue is of utmost interest. To date, previous sweat proteomic studies have already documented disease-specific variations of sweat proteins (e.g., in schizophrenia and tuberculosis). In the current study, sweat samples from 28 healthy control subjects and 14 patients with CF were analyzed by nanoUHPLC-Q-Orbitrap-based shotgun proteomics, to look for CF-associated changes in sweat protein composition and abundance. A total of 1057 proteins were identified and quantified at an individual level, by a shotgun label-free approach. Notwithstanding similar proteome composition, enrichment, and functional annotations, control and CF samples featured distinct quantitative proteome profiles significantly correlated with CF, accounting for the respective inter-individual variabilities of control and CF sweat. All in all: (i) 402 sweat proteins were differentially abundant between controls and patients with CF, (ii) 68 proteins varied in abundance between F508del homozygous patients and patients with another genotype, (iii) 71 proteins were differentially abundant according to the pancreatic function, and (iv) 54 proteins changed in abundance depending on the lung function.
The functional annotation of pathophysiological biomarkers highlighted eccrine gland cell perturbations in: (i) protein biosynthesis and trafficking, (ii) CFTR proteostasis and membrane stability, and (iii) cell-cell adherence, membrane integrity, and cytoskeleton crosstalk. Cytoskeleton-related biomarkers were of utmost interest because of the consistency between variations observed here in CF sweat and variations previously documented in other CF tissues. From a clinical stance, nine candidate biomarkers of CF diagnosis (CUTA, ARG1, EZR, AGA, FLNA, MAN1A1, MIA3, LFNG, SIAE) and seven candidate biomarkers of CF severity (ARG1, GPT, MDH2, EML4 (F508del homozygous), MGAT1 (pancreatic insufficiency), IGJ, TOLLIP (lung function impairment)) were deemed suitable for further verification.


Materials and Methods
The study was conducted according to the guidelines of the Declaration of Helsinki and approved by the Ethics Committee of Cliniques Universitaires Saint-Luc-Université Catholique de Louvain faculty hospital (ClinicalTrials.gov identifier: NCT03993600, date of approval 4 December 2018). A material transfer agreement was signed with the University of Liège for sample analysis. Informed consent was obtained from all subjects involved in the study.

Sweat Collection
Sweat samples were collected from 30 healthy volunteers (15 females, 15 males) and 15 patients with CF (11 males, 4 females) under the most standardized and spectroscopically pure conditions following the current recommendations for the Gibson and Cooke sweat test [16]. In short, the volar region of the forearms was chosen based on its high density in eccrine glands and low density in apocrine/apoeccrine glands together with its easy access. Sweat samples were collected from each forearm successively, from fasting and well-hydrated individuals. Before sampling, the tested region was washed with 70% ethanol, rinsed with ultrapure water, and dried using ashless filter paper. Sweat secretion was stimulated by pilocarpine iontophoresis using pilocarpine gel-padded (Pilogel ® discs, ELITechGroup, Brussels, Belgium) electrodes. A 5 mA current (Webster Sweat Inducer, Model 3700, ELITechGroup, Brussels, Belgium) was applied for 5 min. After stimulation, the electrodes were removed and a Macroduct Sweat Collector (ELITechGroup, Brussels, Belgium) (blue dye removed with 70% ethanol and ashless filter paper) was attached in the place of the cathode to collect sweat for 30 min. At the end of the collection, the tubing was uncoiled, cut off, and connected to a needle and syringe to transfer sweat to a 0.6 mL micro-tube.

Sample Preparation for Shotgun Proteomics
Pure, undiluted sweat samples were processed in three series of ten control samples and two series of seven and eight CF samples. Five sample preparation rounds (1 per series) were performed to avoid any technical bias that might come from a single sample preparation experiment.
Sweat protein concentration was estimated using the Pierce Micro BCA™ Protein Assay kit (#23235, ThermoFisher Scientific, Waltham, MA, USA) according to the manufacturer's instructions. Ten micrograms of proteins were precipitated by incubation in 90% acetonitrile for 30 min at 4 °C followed by centrifugation for 10 min at 4 °C, at 10,000× g. The protein pellet was re-suspended in 50 mM ammonium bicarbonate and then incubated in: (i) 10 mM DTT (dithiothreitol) for 40 min at 56 °C, under stirring at 600 rpm (Thermomixer comfort, Eppendorf, Hamburg, Germany), to reduce disulfide bonds, (ii) 20 mM iodoacetamide protected from light for 30 min at room temperature to alkylate/block cysteine residues, (iii) 11 mM DTT protected from light for 10 min at room temperature to quench the residual iodoacetamide, (iv) mass-spectrometry grade trypsin (Pierce™ Trypsin Protease, MS Grade, ThermoFisher Scientific, Waltham, MA, USA) at a 1:50 enzyme:protein ratio (protein concentration = 0.25 µg·µL⁻¹) for 18 h at 37 °C, under stirring at 600 rpm (Thermomixer comfort, Eppendorf, Hamburg, Germany), (v) MS-grade trypsin in 80% acetonitrile, at a 1:100 enzyme:protein ratio for 3 h at 37 °C, under stirring at 600 rpm (Thermomixer comfort, Eppendorf, Hamburg, Germany). Digestion was stopped by adding TFA (trifluoroacetic acid) to a final concentration of 0.5% (v/v). Samples were dried in a vacuum concentrator and re-suspended at 3.75 µg/20 µL in 0.1% TFA. At this step, aliquots from each control sample were collected and mixed in three 10-sample pools, considering three series of 10 individual samples and an average pooled sample per series. CF samples were only processed as individual samples. No CF pooled library was prepared.
Individual samples and pooled samples were desalted with C18 Zip Tips according to the manufacturer's recommendations, dried, and re-suspended at 3 µg/9 µL (injection volume) in 0.1% TFA spiked with an equivalent of MassPREP Digestion Standard Mixture 1 (MPDS Mix 1, #186002865, Waters, Milford, MA, USA) corresponding to 50 fmol of ADH (alcohol dehydrogenase 1 from Saccharomyces cerevisiae) content per injection volume.
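The quantities quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes that the enzyme:protein ratios refer to the full 10 µg of precipitated protein (i.e., no loss during precipitation), which the text does not state, and all variable names are hypothetical.

```python
# Worked arithmetic for the digestion and re-suspension steps.
# Assumption: the full 10 µg of protein is recovered after precipitation.
protein_ug = 10.0          # protein precipitated per sample (µg)
digest_conc = 0.25         # protein concentration during digestion (µg/µL)

digest_volume_ul = protein_ug / digest_conc   # volume implied by 0.25 µg/µL
trypsin_first_ug = protein_ug / 50            # 1:50 enzyme:protein, 18 h step
trypsin_second_ug = protein_ug / 100          # 1:100 enzyme:protein, 3 h step


def concentration(mass_ug, volume_ul):
    """Return a concentration in µg/µL."""
    return mass_ug / volume_ul


pre_desalt_conc = concentration(3.75, 20.0)   # re-suspension before desalting
injection_conc = concentration(3.0, 9.0)      # re-suspension for injection

print(f"digestion volume: {digest_volume_ul:.0f} µL")
print(f"trypsin: {trypsin_first_ug:.2f} µg then {trypsin_second_ug:.2f} µg")
print(f"pre-desalting: {pre_desalt_conc:.4f} µg/µL")
print(f"injection: {injection_conc:.3f} µg/µL")
```

Under this assumption, the digestion volume is 40 µL and the two trypsin additions amount to 0.2 µg and 0.1 µg, respectively.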

Liquid Chromatography and Mass Spectrometry Data Acquisition
The control individual samples and pooled samples were randomly sorted into three series of ten individual samples plus one pooled sample while the CF samples were randomly sorted into two series of seven and eight individual samples. All samples were analyzed using an ACQUITY UPLC M-Class liquid chromatography system (Waters, Milford, MA, USA) coupled to a Q-Exactive Plus Hybrid Quadrupole-Orbitrap mass spectrometer (ThermoFisher Scientific, Waltham, MA, USA). Five acquisition rounds (1 per series) were performed to avoid any technical bias that might come from a single LC-MS acquisition series.
The mass acquisition was operated in data-dependent positive ion mode. Source parameters were set at: (i) 2.3 kV for spray voltage, (ii) 270 °C for capillary temperature, (iii) S-lens RF level = 50.0.
For individual samples, MS spectra were obtained for scans between m/z 400 and m/z 1600 with a mass resolution of 70,000 at m/z 200, an Automated Gain Control (AGC) of 3 × 10⁶, a maximum Injection Time (IT) of 200 ms, and an internal lock mass calibration at m/z 445.12003. MS/MS spectra were obtained for the top 10 most intense ions of each MS scan (TopN = 10) with a mass resolution of 17,500 at m/z 200, an isolation window of 1.6 m/z with an isolation offset of 0.5 m/z, an AGC of 1 × 10⁵, a maximum IT of 200 ms, and an (N)CE at 28. The exclusion of single-charged ions and a dynamic exclusion of 10 s were enabled.
For each pooled sample, the MS acquisition consisted of a two-round strategy of three injections each. During both rounds, MS spectra were obtained for scans between m/z 400 and m/z 528.3, m/z 524.3 and m/z 662.8, or m/z 658.8 and m/z 1600, in three independent analyses, respectively, with a mass resolution of 70,000 at m/z 200, an AGC of 3 × 10⁶, a maximum IT of 200 ms, and internal lock mass calibrations at m/z 445.12003, m/z 536.16537, and m/z 684.20295, respectively. During the first acquisition round, MS/MS spectra were obtained for the top 25 most intense ions of each MS scan (TopN = 25) with a mass resolution of 17,500 at m/z 200, an isolation window of 1.6 m/z with an isolation offset of 0.5 m/z, an AGC of 1 × 10⁵, a maximum IT of 250 ms, and an (N)CE at 28. For the second acquisition round, an exclusion list for all signals related to peptides identified in the first round with more than 4 PSM (peptide-spectrum matches) was uploaded to the methods. During the second acquisition round, MS/MS spectra were obtained for the top 10 most intense ions of each MS scan (TopN = 10) with a mass resolution of 17,500 at m/z 200, an isolation window of 1.6 m/z with an isolation offset of 0.5 m/z, an AGC of 1 × 10⁵, a maximum IT of 600 ms, and an (N)CE at 28. The exclusion of single-charged ions and a 15 s dynamic exclusion were enabled for both rounds.

Bioinformatic Analysis
Raw MS data were submitted to protein identification and label-free quantification (LFQ) by the MaxQuant software [17] (version 1.6.6.0) using default settings when not specified otherwise. Identification consisted of a search against a custom-made reviewed Uniprot Homo sapiens database (20443 Homo sapiens entries + 4 MPDS Mix 1 entries, release date 8 August 2019) with Carbamidomethyl (C) set as a fixed modification, Oxidation (M) and Deamidation (NQ) set as variable modifications, and a minimum of two peptides (including one unique peptide) required. LFQ was enabled and separated between control and CF groups with a minimum LFQ ratio count of 1, no Fast LFQ, and no requirement of MSMS for LFQ comparison. The 'match between runs' (MbR) option was enabled and tuned to allow matches from the library (pooled samples considered as parameter group 1, 'match from') and between individual samples (parameter groups 0 and 2, 'match from and to'). A match time window of 2.5 min was used.
MaxQuant output data (proteingroups.txt) were submitted to statistical analysis using the Perseus software [18] (version 1.6.10.43). 'Only identified by site', 'REVERSED', and Contaminant data were filtered out. LFQ intensities were log2-transformed and proteins with less than 50% of valid values were filtered out. Principal Component Analysis (PCA) was performed on Z-score-normalized LFQ intensities.
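The filtering and transformation steps described above can be sketched as follows. This is a minimal illustration in plain Python, not the Perseus implementation: the protein identifiers and intensities are made up, zero intensities are treated as missing values, and the Z-score is assumed to be computed per protein (row-wise), which the text does not specify.

```python
import math
from statistics import mean, stdev


def log2_transform(row):
    """log2-transform LFQ intensities; zero or missing values become None."""
    return [math.log2(v) if v and v > 0 else None for v in row]


def has_enough_valid_values(row, min_fraction=0.5):
    """Keep a protein only if at least min_fraction of its values are valid."""
    return sum(v is not None for v in row) / len(row) >= min_fraction


def zscore(row):
    """Z-score the valid values of one protein, leaving missing values as None."""
    valid = [v for v in row if v is not None]
    m, s = mean(valid), stdev(valid)
    return [None if v is None else (v - m) / s for v in row]


# Hypothetical LFQ intensities (one list of sample values per protein).
lfq = {
    "PROT_1": [1024.0, 2048.0, 0.0, 4096.0],
    "PROT_2": [0.0, 0.0, 0.0, 512.0],
}

logged = {name: log2_transform(row) for name, row in lfq.items()}
kept = {name: row for name, row in logged.items() if has_enough_valid_values(row)}
normalized = {name: zscore(row) for name, row in kept.items()}
print(normalized)
```

In this toy input, PROT_2 has a single valid value out of four and is therefore discarded, while PROT_1 is retained and normalized.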
Computation of Pearson's correlation coefficients (PCC) and average-linkage hierarchical clustering based on Euclidean distance were used to classify samples according to their quantitative profiles of pairwise correlation with the other samples in the cohort, without a priori. Available clinical parameters were tested for the significance of their effects on the proteome profile clustering by PERMANOVA.
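The clustering principle described above (each sample represented by its vector of correlations with all samples, then grouped by average linkage on Euclidean distances) can be sketched as below. This is a naive standard-library illustration under stated assumptions, not the software routine used in the study; the sample names and abundance values are invented, and the number of clusters is passed explicitly.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_profiles(samples):
    """Represent each sample by its correlations with every sample."""
    names = list(samples)
    return {a: [pearson(samples[a], samples[b]) for b in names] for a in names}


def euclidean(u, v):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def average_linkage(profiles, n_clusters):
    """Agglomerative clustering: merge the two clusters with the smallest
    mean pairwise Euclidean distance until n_clusters remain."""
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                dists = [euclidean(profiles[a], profiles[b])
                         for a in clusters[i] for b in clusters[j]]
                d = sum(dists) / len(dists)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical log2 abundances of four proteins in four samples.
samples = {
    "ctrl_1": [10.0, 12.0, 14.0, 16.0],
    "ctrl_2": [10.5, 12.1, 13.8, 16.2],
    "cf_1": [16.0, 13.0, 12.5, 10.0],
    "cf_2": [15.5, 13.5, 12.0, 10.4],
}
clusters = average_linkage(correlation_profiles(samples), n_clusters=2)
print(clusters)
```

With these toy values, the two control-like samples and the two CF-like samples end up in separate clusters, mirroring the "without a priori" grouping discussed in the text.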
PERMANOVA (PERmutational Multivariate ANalysis Of VAriance) tests were performed with the PAST software [19] (version 4.04) using the hierarchical clustering distance matrix with a number of permutations set to 999.
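For readers unfamiliar with the test, a one-way PERMANOVA on a distance matrix can be sketched as follows. This is a generic illustration of the pseudo-F statistic and its permutation p-value, written with the standard library only; it is not the PAST implementation, and the one-dimensional toy data, group labels, and random seed are assumptions made for the example.

```python
import random


def pseudo_f(dist, labels):
    """Pseudo-F statistic computed from a symmetric distance matrix."""
    n = len(labels)
    groups = sorted(set(labels))
    ss_total = sum(dist[i][j] ** 2
                   for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(dist[i][j] ** 2
                         for p, i in enumerate(idx) for j in idx[p + 1:]) / len(idx)
    ss_among = ss_total - ss_within
    a = len(groups)
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(dist, labels, n_perm=999, seed=0):
    """Return the observed pseudo-F and its permutation p-value."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    # The observed labelling counts as one permutation.
    return observed, (hits + 1) / (n_perm + 1)


# Toy example: six samples placed on a line, three per group.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
labels = ["control"] * 3 + ["CF"] * 3
dist = [[abs(a - b) for b in points] for a in points]

f_obs, p_value = permanova(dist, labels, n_perm=999)
print(f"pseudo-F = {f_obs:.1f}, p = {p_value:.3f}")
```

Note that with only six samples the number of distinct labellings is small, so the attainable p-value is bounded from below; larger cohorts do not suffer from this limitation to the same extent.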
Sparse Partial Least Squares (SPLS) regression was performed as an unsupervised multivariate analysis to test the association between highly correlated covariates from the proteomic data (matrix X of multivariate predictors) and the clinical data (matrix Y of multivariate responses). Before SPLS processing, the predictors and responses were centered. The sparsity tuning parameter η and the number of latent components k were selected from the candidate ranges η ∈ (0.1, 0.9) and 1 ≤ k ≤ 15. The values η = 0.69 and k = 2 minimized the mean squared prediction error (MSPE, Supplementary Figure S1).
Control versus CF group comparison was achieved by a two-sample Student's t-test with a p-value-based threshold. Proteins with a p-value below 0.05 were considered significantly differentially expressed between the control and CF groups. A second comparison was performed by a two-sample Student's t-test with a permutation-based FDR calculation and a q-value-based threshold.
Differentially abundant proteins were characterized based on both the p-value (a less stringent cut-off threshold (p < 0.05) for biological relevance) and q-value (a more stringent permutation-based FDR threshold (q < 0.05) for clinical biomarker relevance) thresholds. On top of the q-value cut-off, stringent data filtering based on sample occurrence, difference significance, and difference value was applied to a shortlist of candidate clinical biomarkers.
Control versus CF Volcano plot visualization was achieved by a two-sample Student's t-test with a permutation-based FDR calculation (test = t-test; side = both; number of randomizations = 250; no grouping in randomizations; FDR = 0.05; s0 = 0.1).
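The logic of a t-test with an s0 fudge factor and a permutation-based FDR can be sketched as below. This is a simplified illustration in the spirit of the s0-modified statistic, assuming a pooled-variance two-sample test and whole-sample label permutations; it is not the Perseus implementation, and the protein names, abundance values, and threshold are hypothetical.

```python
import math
import random
from statistics import mean, variance


def s0_statistic(x, y, s0=0.1):
    """Two-sample t-like statistic with s0 added to the standard error."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    se = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return (mean(x) - mean(y)) / (se + s0)


def permutation_fdr(data, n_ctrl, threshold, n_rand=250, s0=0.1, seed=0):
    """Estimate the FDR of calling |statistic| >= threshold.

    data maps each protein to its values, control samples first."""
    rng = random.Random(seed)
    observed = {p: s0_statistic(v[:n_ctrl], v[n_ctrl:], s0) for p, v in data.items()}
    called = [p for p, d in observed.items() if abs(d) >= threshold]
    n = len(next(iter(data.values())))
    false_calls = 0
    for _ in range(n_rand):
        order = rng.sample(range(n), n)       # same shuffle for all proteins
        for v in data.values():
            shuffled = [v[i] for i in order]
            d = s0_statistic(shuffled[:n_ctrl], shuffled[n_ctrl:], s0)
            if abs(d) >= threshold:
                false_calls += 1
    fdr = (false_calls / n_rand) / len(called) if called else 0.0
    return called, min(fdr, 1.0)


# Hypothetical log2 LFQ values: four controls followed by four CF samples.
data = {
    "PROT_A": [20.0, 20.2, 19.8, 20.1, 17.0, 17.3, 16.9, 17.1],
    "PROT_B": [15.0, 15.4, 14.8, 15.1, 15.2, 14.9, 15.3, 15.0],
}
called, fdr = permutation_fdr(data, n_ctrl=4, threshold=5.0)
print(called, round(fdr, 3))
```

The s0 term damps the statistic of proteins whose difference is small but whose variance is even smaller, which is why the volcano plot threshold appears as a curve rather than as two straight cut-offs.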
Functional annotations of identified proteins and over-representation/enrichment tests were conducted using the online search engine powered by the PANTHER Classification system. The PANTHER Overrepresentation test (release date: 28 July 2020 and 24 February 2021) parsed the PANTHER database (version 16.0, release date: 1 December 2020) using the Homo sapiens reference list, the PANTHER-Gene Ontology-Slim, and the PANTHER Protein Class annotation dataset. Only p < 0.05 items were retained and considered significantly over-represented.
Visualization of interaction networks was performed by submitting SwissProt accession IDs to STRING (version 11.0, https://string-db.org, accessed on 13 October 2021).
For concision, all protein names were abbreviated with the related gene names.

Characterization of Sweat Actin
Sweat actin concentration was determined for each individual sample using the "Total Protein Approach" to estimate absolute quantification [20]. A sweat aliquot equivalent to 5 ng of actin was diluted with ultrapure water (q.s. 10 µL, final concentration 0.5 ng·µL⁻¹). Phalloidin-TRITC stock solution was prepared by dissolving 0.1 mg of phalloidin-TRITC (P1951, Phalloidin-TRITC peptide from Amanita phalloides, Sigma-Aldrich) in 1 mL of 50% methanol (stock concentration 0.1 mg·mL⁻¹). Buffer A consisted of a 20 mM potassium acetate/20 mM Tris acetate, pH 7.5 solution. F-actin microfilaments were labeled with phalloidin-TRITC by diluting sweat aliquots with buffer B (1:10 dilution of dye conjugate stock solution in buffer A) to a final concentration of 0.25 ng·µL⁻¹ (final volume: 20 µL) and incubating for 45 min, at room temperature, under stirring at 600 rpm, protected from light.
To eliminate residual dye conjugates, labeled F-actin was precipitated with other sweat proteins by incubation in 90% acetonitrile for 30 min at 4 °C followed by centrifugation for 10 min at 4 °C, at 10,000× g. The supernatant containing free dye conjugates in solution was discarded. The protein pellet containing phalloidin-TRITC-F-actin was re-suspended in 20 µL of buffer A. Twenty µL of labeled F-actin solution were put on a microscope slide covered with a cover slip sealed with nail polish.
For each individual microscope slide, 20 snapshots were taken as follows: 10 snapshots at an operator-chosen position and 10 snapshots at a random position.
Images were automatically analyzed using ImageJ (version 1.53c). Image analysis consisted of an in-house macro implementation: pixel-to-µm scaling, conversion to 8-bit format, auto-thresholding using the MaxEntropy method [21], selection of thresholded particles (size filter = 200 pixels, to filter noise particles out), and particle analysis (count, area, perimeter, length, and width). Control versus CF comparison was achieved by an unpaired t-test using GraphPad Prism (version 7.00).

Experimental Design and Statistical Rationale
All sweat samples were collected under steady-state conditions from subjects with no known acute or chronic illness in controls and no exacerbation in patients with CF, no drug (controls) or additional drug (CF) use at the time of collection, no cosmetic use or skin damage at the site of collection, and no clinical sign of dehydration. Female subjects were neither pregnant nor lactating. Patients with CF had a confirmed diagnosis and were clinically stable, having a Forced Expiratory Volume in one second (FEV1) % predicted ≥ 30% and an O2 saturation ≥ 92%. Patients with CF, tested under stable conditions, were not enrolled in other clinical trials or under CFTR modulator therapies. All subjects were asked to be fasting and kept well-hydrated for a minimum of 8 h before collection.
Due to the small and variable sweat volumes collected, together with the relatively low and variable protein concentrations of sweat, plus the need to store sweat samples for further analyses, no technical replicate could be performed for each sample. For the same reasons, and given the high similarity between the control and CF proteomes as well as the use of the inter-sample 'match between runs' option, no CF sample library was analyzed. To account for technical variability, sweat samples were processed and analyzed in three series of ten control samples plus a pooled sample and two series of seven and eight CF samples. The inter-series technical variability for a given group was negligible when compared to the biological variability [22]. Inter-sample technical variability was reduced by separating LFQ normalization between the control and CF groups: considering the equivalent amount of proteins processed and injected for all samples, the computation of both non-normalized data and separated LFQ normalized data drew the same general biological conclusions (both differential proteomic analysis and differential quantification of the quality control standard digest spike-in were also similar). However, global LFQ normalization led to a quantification bias (e.g., the 1:1:1:1 ratio of the standard protein digest spike-in was lost). Protein abundances between the two groups were not suitable for global normalization (Supplementary Figure S2).

The Proteome of CF Sweat
Sweat samples from 30 controls and 15 patients with CF (see Supplementary Tables S1 and S2 for complete clinical data summary) were analyzed by nanoLC-MS/MS (Figure 1). Based on chromatogram discrepancy and poor correlation with the other sample data, likely of technical origin, two female control samples and one male CF sample were discarded (Supplementary Figure S3). Clinical data of the remaining controls and patients with CF are summarized in Tables 1 and 2. Considering a minimum of two peptides (including one unique peptide) and an FDR below 0.01 for protein identification, a total of 1057 proteins were identified, accounting for data filtering (Supplementary Table S3) and the standard protein mixture (MPDS Mix 1, Waters) for quality control. About 520 ± 18 (mean ± SEM, control: 542 ± 23, CF: 476 ± 26) proteins were peptide-spectrum matching hits while 317 ± 7 proteins (control: 310 ± 9, CF: 330 ± 11) required matching between runs for identification, for an average total of 837 ± 14 proteins (control: 853 ± 18, CF: 805 ± 22) identified in each sample (Figure 2A). A total of 314 proteins were consistently identified across all samples.
The comparison of protein identifications between control and CF sweat proteomes emphasized a near-complete overlap and a high degree of similarity: 98% of proteins were common to the control and CF groups and only 18 out of 1057 identified proteins were exclusive to control sweat (Figure 2B). Exclusive proteins were in the low abundance and low occurrence tiers so one could not consider them biologically relevant. The classification and over-representation analysis of identified proteins highlighted the predominance of (i) proteins related to proteolytic activity, proteases, and peptidases as well as their respective inhibitors, (ii) cytoskeletal proteins, i.e., protein components and regulators (actin and Actin-Binding Proteins (ABP)) of the actin cytoskeleton organization and dynamics, (iii) proteins of reactive oxygen species metabolism and oxidative stress, (iv) markers of UPR and ER stress, (v) components and regulators of the proteasome, or (vi) proteins of all major metabolic pathways, among the over-represented proteins mapped to annotation clusters of the PANTHER Classification system and Gene Ontology Enrichment analysis, mapping protein IDs against PANTHER GO Slim annotation datasets (Supplementary Tables S4-S7).

Analysis of Sweat Proteome Profiles
A total of 1057 identified proteins were suitable for protein label-free quantification. For further statistical differential analysis between control and CF sample groups, only proteins identified and quantified in at least 50% of a sample group were used, amounting to 947 proteins.
First of all, the hierarchical clustering of control and CF samples together confirmed CF-specific/control-specific proteome profiles of PCC since samples were sorted into eight clusters, grouping samples from the same subject group (Figure 3A, upper panel), without a priori. Only four samples (one control and three CF) were mismatched. According to PERMANOVA, the variations in proteome profiles between control and CF were significantly correlated with Na⁺ and Cl⁻ concentrations, protein concentration, Na⁺ amount, and CF status (Figure 3A, lower panel). Then, the hierarchical clustering of CF samples into five clusters of CF proteome profiles highlighted the inter-individual biological variability of CF sweat (Figure 3B, upper panel). According to PERMANOVA, the variations in CF sweat proteome profiles were significantly correlated with sweat Na⁺ and K⁺ concentrations, protein concentration, and K⁺ amount (Figure 3B, lower panel). Interestingly, no significant correlation with CFTR genotype or clinical manifestations (pancreatic insufficiency, diabetes, airway infection status, or lung function impairment) was observed. [...] Table S8A). In particular, SPLS confirmed the correlation between sweat ion composition and protein profiles (Supplementary Figure S4).

Characterization of Candidate CF Biomarkers
The clustering of sweat proteome profiles, like the sample grouping by PCA (Figure 4A), resulted from the differential abundance of 402 out of 947 proteins (Supplementary Table S8B), as tested by a supervised two-sample t-test. CF was associated with a decrease in the abundance of 351 of the 402 differentially abundant proteins, the remaining 51 being more abundant in CF sweat. The proteome dynamic correlated with a partial depletion in CF sweat (as visualized in Figure 4B). Of note, 17 of the 20 most abundant proteins in sweat [22] were in significantly differential abundance between the control and CF groups (Supplementary Table S8B, highlighted in blue). In addition, kallikreins (KLK5, KLK11 (ESG-specific), KLK14) were significantly decreased in CF sweat (Supplementary Table S8B, highlighted in yellow).
One hundred and eighty-nine proteins were differentially abundant between the control and CF groups (11 up in CF, 178 down, Supplementary Table S8C) after a supervised two-sample t-test with permutation-based FDR calculation. Data filtering based on the sample occurrence (protein found in all samples), significance (p < 0.001), and value (control-CF log2 LFQ difference ≥ 2 or ≤ −2) of the difference in protein abundance highlighted nine candidate CF biomarkers of potential clinical relevance (Figure 4). Decreases in the protein abundances of ARG1, CUTA, MAN1A1, AGA, EZR, SIAE, LFNG, MIA3, and FLNA (Figure 4C, panel a) between the control and CF groups were the most statistically significant. Only EZR and FLNA were closely functionally related, being involved in the integrity and stability of the cortical actin network.
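The three filtering criteria described above (occurrence in all samples, p < 0.001, and an absolute log2 LFQ difference of at least 2) translate into a simple selection rule. The sketch below uses hypothetical field names and deliberately invented placeholder records; none of the numerical values are results from this study.

```python
def shortlist(proteins, n_samples, p_max=0.001, min_abs_diff=2.0):
    """Return the genes meeting all three biomarker filtering criteria."""
    return [
        rec["gene"]
        for rec in proteins
        if rec["n_detected"] == n_samples           # found in all samples
        and rec["p_value"] < p_max                  # significance of the difference
        and abs(rec["log2_diff"]) >= min_abs_diff   # value of the difference
    ]


# Placeholder records (invented values, for illustration only).
proteins = [
    {"gene": "GENE_A", "n_detected": 42, "p_value": 0.0002, "log2_diff": 2.6},
    {"gene": "GENE_B", "n_detected": 40, "p_value": 0.0001, "log2_diff": 3.1},
    {"gene": "GENE_C", "n_detected": 42, "p_value": 0.0300, "log2_diff": 2.4},
    {"gene": "GENE_D", "n_detected": 42, "p_value": 0.0005, "log2_diff": -1.2},
    {"gene": "GENE_E", "n_detected": 42, "p_value": 0.0004, "log2_diff": -2.2},
]

candidates = shortlist(proteins, n_samples=42)
print(candidates)
```

Here GENE_B fails the occurrence criterion, GENE_C the significance criterion, and GENE_D the difference criterion, so only GENE_A and GENE_E are retained.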
According to the PANTHER-Gene Ontology functional annotation and over-representation test (Supplementary Tables S9-S12), proteins involved in: (i) the hydrolase activity (both protease/peptidase and glycosidase activity) of the lysosome, (ii) the structure and function of the proteasome, (iii) protein processing and mechanisms of ER stress and UPR, (iv) the structure of desmosomal anchoring junctions, and (v) the structure, organization, and dynamics of the actin cytoskeleton were over-represented in the set of differentially abundant proteins ( Figure 5). Of note, the differential phenotypes of CFTR/F508del CFTR physical and functional interactions could be described in sweat since subsets of sweat CF biomarkers cross-checked the CFTR interactome (n = 82, Figure 6A) and F508del CFTR interactome (n = 23, Figure 6B) as established by Pankow et al. [23].

CF Biomarker Profiles Partially Reflect CF Severity Related to CFTR Genotype
To determine if CF biomarker profiles correlate with genotype, pancreatic status, airway infection status, or spirometry, computation of Pearson's correlation coefficients, average-linkage hierarchical clustering of samples based on Euclidean distance, and PERMANOVA testing of available clinical parameters were applied to CF biomarkers only (n = 402).
Firstly, the hierarchical clustering of all samples using only the protein abundance profiles of CF biomarkers generated CF and control groups sub-divided into five clusters of protein abundance profiles (Supplementary Figure S5A, upper panel), without a priori.
According to PERMANOVA, the variations in CF biomarker profiles between control and CF were significantly correlated with sweat Na⁺, Cl⁻, and K⁺ concentrations, protein concentration, Na⁺, Cl⁻, and K⁺ amounts, and CF status (Supplementary Figure S5A, lower panel). Secondly, the hierarchical clustering of CF samples (n = 14) using only the protein abundance profiles of CF biomarkers sorted samples into six clusters (Supplementary Figure S5B, upper panel), without a priori. According to PERMANOVA, the variations in CF sweat proteome profiles were significantly correlated with CFTR genotype (F508del homozygous status) (Supplementary Figure S5B, lower panel). No significant correlation with clinical manifestations (pancreatic insufficiency, diabetes, airway infection status, or lung function impairment) was observed.

Characterization of CF Severity Biomarkers
The correlation between sweat CF biomarker profiles and CFTR genotype, as it was observed by PERMANOVA of hierarchical clustering, resulted from the differential abundance of 68 proteins between F508del homozygous patients and patients with other genotypes (Supplementary Table S13), as tested by a supervised two-sample t-test. Dermcidin (DCD), the most abundant protein in sweat, was significantly less abundant in F508del homozygous sweat (Supplementary Table S13, highlighted in blue).
Concurrently, applying the same supervised statistics to patients with or without pancreatic insufficiency, 71 proteins were found differentially abundant (Supplementary Table S14).
According to the PANTHER-Gene Ontology functional annotation and over-representation test, proteins involved in: (i) the hydrolase activity (both protease/peptidase and glycosidase activity) of the lysosome and (ii) the structure and function of the ribosome were over-represented in both F508del homozygous and PI patients (Supplementary Tables S15-S34).
Meanwhile, 54 proteins were differentially abundant in correlation with lung function, between patients with normal or mildly impaired lung function (≥70% FEV1 %) and patients with moderate to severe lung function impairment (<70% FEV1 %) (Supplementary Table S35). According to the PANTHER-Gene Ontology functional annotation and over-representation test, hydrolases and desmosomal linkers to intermediate filaments were over-represented in this subset (Supplementary Tables S36-S43).
Data filtering based on the sample occurrence (protein found in all samples), significance (p < 0.001), and value (control-CF log2 LFQ difference ≥ 2 or ≤ −2) of the difference in protein abundance highlighted seven candidate biomarkers of CF severity with potential clinical relevance (Figure 4). Differential abundances of: (i) ARG1, GPT, MDH2, and EML4 between F508del homozygous and F508del heterozygous patients (Figure 4C, panel b), (ii) MGAT1 between pancreatic insufficient and pancreatic sufficient patients (Figure 4C, panel c), and (iii) IGJ and TOLLIP between patients with normal lung function/mild lung function impairment and moderate/severe lung function impairment (Figure 4C, panel d) were the most statistically significant.
Nevertheless, the latter results were generated from a small number of patients and would gain clinical relevance with a larger cohort and subsequent dataset.

Actin Dynamics in CF Sweat
In light of the functional annotation of the sweat proteome, 35 out of all identified proteins are involved in the organization and dynamics of the actin cytoskeleton (Supplementary Tables S4-S7). Supervised differential proteomics reported 13 ABPs with significant changes in protein abundance between the control and CF groups while actin abundance remained steady between healthy subjects and patients with CF.
In regard to the differential abundance of ABPs and functionally associated proteins, the observation of sweat F-actin (Figure 7A) featured significant differences in microfilament organization between control and CF sweat. In detail, particles assimilated to microfilaments in CF sweat were significantly more abundant (Figure 7B, panel a, fold change (FC) = 1.81, p < 0.001 ***), longer and larger with significantly greater mean microfilament area (Figure 7B,

Discussion
The current study was designed to achieve the first thorough and in-depth characterization of the sweat proteome of patients with CF versus healthy subjects, at an individual level. A standardized and optimized workflow from the sample collection and preparation to the LC-MS analysis and bio-informatic data processing was developed. With a particular emphasis on sweat sampling in a non-invasive and reproducible way, the methodology followed the Gibson and Cooke sweat test gold standard guidelines. Accounting for all analyzed samples, 1057 proteins were identified and quantified by their inter-individual relative abundance. Comparing 14 patients with CF to 28 healthy subjects as a control cohort, relative protein abundances between control and CF sweat samples were computed to: (i) evaluate whether sweat protein profiles would help discriminate patients with CF from healthy subjects, (ii) test the correlation between CF severity and sweat protein profiles, and (iii) characterize biomarkers of CF disease and severity in sweat.
First and foremost, the statistical analysis of sweat proteome profiles without a priori partially reflected the clinical diagnosis and conclusions of the sweat test since CF status alongside Na + and Cl − concentrations correlated with the control versus CF sample clustering. However, the clinical relevance of sweat proteome profiles was hindered by the respective inter-individual variabilities of control sweat-inter-subject variability correlated with Na + amount [22]-and CF sweat-inter-patient variability correlated with K + concentration and K + amount alongside Na + concentration and sweat protein concentration. On a side note, these correlations underlined the susceptibility of protein content to the shifts in the mechanisms of eccrine ion secretion between control subjects and patients with CF. The fact remains that the combined physiological and pathophysiological inter-individual variability of sweat allowed control versus CF sample correlation and clustering mismatches, with no evidence to rationalize or exclude mismatched samples as outliers. Therefore, the statistical analysis of whole sweat proteome profiles without a priori could not discriminate patients with CF from the healthy population. Neither could it discriminate against patients based on their genotype or the clinical manifestations, failing to report CF severity.
However, in the analysis without a priori group assignment, CF samples tended to be distributed together among clusters in correlation with CF status, even though the a priori control and CF sample groups were not fully recovered. So, CF pathophysiology partly influenced the protein composition of sweat. Applying a supervised statistical analysis for differential proteomics between the CF and control groups, over a third of all characterized sweat proteins (351 out of 947) were significantly less abundant in CF sweat. Functionally, the over-representation of differentially abundant proteins involved in protein processing, ER stress, and UPR pathways was consistent with this apparent depletion. This observation had to be related to the phenotypes of impaired CFTR processing due to CFTR mutations and extended to the global protein machinery of eccrine gland cells. Thirteen out of fourteen patients with CF carried the F508del mutation, whose phenotype is the absence of functional CFTR at the membrane due to impaired CFTR production and trafficking and the degradation of misfolded proteins. The over-representation of proteins in disease-specific abundance related to proteasome- and lysosome-mediated proteolysis, alongside markers of UPR, was in total agreement with the prevalence of the F508del mutation in the patient cohort.
To sum up, the protein composition of CF sweat compared to control sweat highlighted that the CF pathophysiology of the eccrine gland, e.g., defects in the mechanisms of protein processing resulting in ER stress, the onset of the UPR, and proteolysis, can be indirectly monitored by sweat proteomics. From a pathophysiological standpoint, the F508del mutation globally affected the protein machinery of the eccrine gland beyond the sole processing of CFTR, hence the disease-driven depletion of the CF sweat proteome.
When considering the 402 differentially abundant proteins, the proteome profiles of CF sweat correlated with the CFTR F508del mutation, i.e., sweat was a matrix of candidate biomarkers of both CF diagnosis and discrimination between F508del homozygous and F508del heterozygous status. The pathophysiology of F508del homozygous patients is frequently associated with the onset of pancreatic insufficiency (PI) [24]. Interestingly, the most differentially abundant proteins were both F508del homozygous and PI biomarkers. Yet, the recruitment of F508del heterozygous patients with PI helped characterize F508del homozygous- and PI-specific biomarkers. The over-representation of proteins in genotype- or PI-related abundance involved in the protease activity of the cytoplasm and the structure and function of the ribosome echoed the correlation between CFTR mutation classes, phenotypes of protein processing defects, and CF severity.
In brief, the protein composition of CF sweat highlighted that factors of CF severity (CFTR genotype) can be monitored by sweat proteomics. From a pathophysiological perspective, ribosomal stalk proteins were described as modifiers of CF severity when the silencing of corresponding genes elicited the partial phenotype rescue of F508del CFTR processing defects [25]. Here, ribosomal stalk proteins uL11, P0 (uL10), and P2 plus ribosomal proteins uL4 and eL6 were sweat markers of CF disease and severity, respectively.
As for the pathophysiological relevance of sweat proteins in differential abundance, 17 proteins from the core proteome (systematically found in all samples) and the top 20 most abundant proteins of sweat [22] were in CF-specific abundance. To a greater extent, proteins in the high abundance and high occurrence tiers were all affected by CF. Dermcidin, the most abundant protein in sweat, was characterized as a biomarker of F508del homozygosity and PI. Moreover, a number of previous results about the CF-specific abundance of proteins were confirmed: (i) the decreased levels of kallikreins were already described in sweat [26] and correlated with a decreased enzymatic activity in CF plasma [27], (ii) the decreased arginase-1 abundance in CF sweat would correlate with the decreased abundance and reduced enzymatic activity in CF plasma [28], and (iii) the under-expression of filamins A and B in CF sweat could be correlated with the decreased cell levels of filamins [29]. Interestingly, in the current work, arginase-1 and filamin A were characterized as promising candidate biomarkers for CF diagnosis and estimation of CF severity from CF sweat. More importantly, some biomarkers found in CF sweat were already described in previous studies on the secretome and proteome of bronchial epithelial cell lines [30–32], the proteome of nasal epithelial cells [33], the proteome of CF serum [34], and the proteome of CF urine exosomes [35] (Supplementary Table S44). Inconsistencies in the protein abundances of these biomarkers were observed between studies and could be partially explained by the nature of the models (e.g., in vitro cell lines versus patient tissue samples). Still, proteins related to CFTR proteostasis and membrane stability plus cytoskeleton crosstalk (e.g., FLNA, EZR, VCL, SET, COL6A1, and HSPA5) were among the ones in good agreement throughout. In total, 82 sweat proteins in CF-specific abundance (Supplementary Table S45) were listed in the CFTR interactome [23].
In addition, in CF sweat, the decreased levels of CFTR interactors, e.g., ERM proteins (EZR, MSN) or filamins, plus the under-expression of all the protein constituents of the desmosome highlighted: (i) the defects in CFTR homeostasis and membrane stability as well as cell-cell adherence and membrane integrity of the eccrine gland cells, and (ii) the potential of sweat analysis to remotely monitor some aspects of CF pathophysiology in other epithelia. From a clinical standpoint, sweat CFTR interactors and F508del CFTR interactors (Supplementary Table S46) in CF-specific abundance are of utmost interest in the search for biomarkers of phenotypic rescue to benefit new therapeutic developments.
Concurrently, the functional annotation of proteins in disease-specific abundance provided new insights into the pathophysiology of CF sweat, i.e., the significant over-representation of proteins related to sweat actin organization and dynamics.
More precisely, protein abundances of lysozyme C (LYZ) and the Actin-Binding Protein (ABP) cofilin-1 (CFL) were significantly increased in CF sweat while actin levels remained steady. Both LYZ and CFL are known to alter actin dynamics in relation to variations in ionic strength and concentration ratio to actin, as discussed below and summarized in Figure 8. Changes in LYZ and CFL abundance correlated with CF-specific changes in sweat actin organization, i.e., larger, more abundant microfilaments resulting from disease-induced F-actin polymerization and bundling. Initially described as an Actin-Depolymerizing Factor (ADF) [36], CFL regulates actin dynamics as a whole from F-actin/G-actin balance and treadmilling to the spatial organization of the actin cytoskeleton [37,38]. CFL concentration ratio to actin (CFL:ACT ratio), pH, oxidative stress, and CFL phosphorylation are well-described modulators of CFL actin-related biological function and activity [39].
At a higher CFL:ACT ratio, F-actin microfilaments are severed and prone to the nucleation and polymerization of new branches. Conversely, F-actin microfilaments are severed and completely depolymerized into G-actin at a lower CFL:ACT ratio [40,41]. In CF sweat, the higher CFL abundance in relation to steady actin levels increased the CFL:ACT ratio. So, CF sweat G-actin was more likely to nucleate and be polymerized into F-actin while microfilaments were more likely to elongate.
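The ratio argument above can be illustrated with a minimal sketch. The abundance values below are hypothetical placeholders in arbitrary units, not measurements from this study, and the helper `cfl_act_ratio` is introduced here for illustration only. It simply shows that a higher CFL abundance at a steady actin level raises the CFL:ACT ratio.

```python
def cfl_act_ratio(cfl_abundance: float, actin_abundance: float) -> float:
    """Return the cofilin-to-actin abundance ratio (CFL:ACT)."""
    if actin_abundance <= 0:
        raise ValueError("actin abundance must be positive")
    return cfl_abundance / actin_abundance


# Hypothetical relative abundances (arbitrary units), NOT study data.
control = {"CFL": 1.0, "ACT": 10.0}
cf = {"CFL": 2.0, "ACT": 10.0}  # CFL increased, actin steady

control_ratio = cfl_act_ratio(control["CFL"], control["ACT"])
cf_ratio = cfl_act_ratio(cf["CFL"], cf["ACT"])

# Fold change of the ratio equals the fold change of CFL when actin is steady.
fold_change = cf_ratio / control_ratio
print(f"control={control_ratio:.2f}, CF={cf_ratio:.2f}, fold change={fold_change:.1f}")
```

With actin held constant, any fold change in CFL abundance translates directly into the same fold change in the CFL:ACT ratio.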
This phenomenon would be accentuated by CFL sensitivity to pH [42]. At a slightly acidic pH (6.5), CFL is less likely to divert actin dynamics from nucleation and polymerization [40]. Since sweat pH is neutral to slightly acidic, varying between pH 5 (at the lowest sweat rate) and pH 7 [43], with no significant change correlated with CF [44], F-actin microfilaments would be more likely to stabilize and elongate further.
In the meantime, tropomyosins, known competitors of CFL for access to actin microfilaments [45,46], and MTPN, an inhibitor of actin polymerization via interaction with actin-capping proteins [47], were decreased in CF sweat.
Although (de)phosphorylation of its Ser3 residue is an important modulator of intracellular CFL activity, it is noteworthy that no protein involved in CFL regulation by (de)phosphorylation (slingshot phosphatases and LIM kinases) was identified in control or CF sweat. Interestingly enough, an upstream regulator of LIM kinases, the small Rho GTPase Rac1, was under-expressed in CF sweat. On that note, Rac1 is involved in the regulation of both actin polymerization by CFL and actin bundling by IRSp53/BAIAP2. IRSp53 bundles F-actin microfilaments by means of its SH3 domain (recruitment of other ABPs and actin regulators) and its IMD domain (direct actin binding) [48,49]. The respective CF-related abundances of Rac1 (down) and IRSp53 (up) suggested a promotion of IMD-mediated actin bundling by IRSp53 without regulation.
Indeed, polycationic proteins and peptides such as LYZ and cathelicidin-derived LL-37 were described as promoters of F-actin nucleation, polymerization, and bundling [51,52]. F-actin microfilaments are bound together through electrostatic (hydrophobic) interactions, mediated by intercalated polycations, with sensitivity to changes in ionic strength, polycation concentration, and actin concentration [53]. Polycation-induced F-actin bundling is promoted by an increase in polycation concentration while prevented by an increase in ionic strength or actin concentration. Sweat ionic strength is mainly influenced by NaCl concentration. In physiological conditions, sweat is hypotonic due to ion reabsorption along the duct of the eccrine gland. In CF, sweat is hypertonic (up to plasma concentration) as the absence of a functional CFTR prevents Cl− reabsorption. The presence of polycation-bound F-actin bundles has already been described in the pathologically viscous sputum [54] that is secreted and accumulates in the pulmonary airways of patients with CF, a bio-fluid with a neutral to slightly acidic pH [55,56] and an increased ionic strength [57] similar to those of CF sweat. In the same way, reports of the higher viscosity of CF sweat [44] could be explained by extensive LYZ-mediated F-actin bundling of polymerizing, unbranched microfilaments.
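To make the link between NaCl concentration and ionic strength concrete, the following minimal sketch applies the standard definition of ionic strength, I = 1/2 Σ c_i z_i², to two NaCl-only solutions. The concentrations are hypothetical illustrative values, not data from this study, and all other sweat ions are neglected as a simplifying assumption. The function name `ionic_strength` is introduced here for illustration.

```python
def ionic_strength(ions: dict[str, tuple[float, int]]) -> float:
    """Ionic strength I = 1/2 * sum(c_i * z_i**2).

    `ions` maps an ion label to (concentration in mmol/L, charge number);
    the result is expressed in mmol/L.
    """
    return 0.5 * sum(conc * charge ** 2 for conc, charge in ions.values())


# Hypothetical NaCl-only compositions in mmol/L (illustrative, NOT study data).
control_sweat = {"Na+": (30.0, +1), "Cl-": (30.0, -1)}
cf_sweat = {"Na+": (100.0, +1), "Cl-": (100.0, -1)}

i_control = ionic_strength(control_sweat)
i_cf = ionic_strength(cf_sweat)
print(i_control, i_cf)  # 30.0 100.0
```

For a 1:1 electrolyte such as NaCl, the ionic strength equals the salt concentration, which is why NaCl concentration can serve as a first approximation of sweat ionic strength under these assumptions.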
Besides their interaction with actin, polycationic proteins and peptides are also known for their antimicrobial properties [58], essential to skin host defenses and the innate immune system. While trapped in F-actin bundles, polycationic proteins and peptides reversibly lose their antimicrobial activity [59], which adds to the risk of pulmonary infection linked to higher mucus viscosity in patients with CF. Nonetheless, the prevalence of skin infections in patients with CF does not suggest that defects in innate immune defenses occur in the skin as they do in CF sputum and airways. The predominance in skin defenses of dermcidin (DCD), a polyanionic precursor of antimicrobial peptides and the most abundant protein in sweat, together with the absence of CF changes in DCD abundance, could explain these observations [60–62].
Interestingly, in the event of a deep correlation, yet to be established, between sweat and mucus F-actin bundling states and subsequent viscosities, the monitoring of sweat F-actin dynamics could prove useful in the early evaluation of CF severity.
To sum up, CF was correlated with disease-specific proteome profiles, contributing to non-physiological inter-individual variations of the sweat proteome. The characterization of differentially abundant proteins (control vs. CF, F508del heterozygous vs. F508del homozygous, PS vs. PI, normal/mild vs. moderate/severe lung function impairment) showcased nine candidate biomarkers of CF diagnosis (CUTA, ARG1, EZR, AGA, FLNA, MAN1A1, MIA3, LFNG, SIAE) and seven candidate biomarkers of CF severity (ARG1, GPT, MDH2, and EML4 for F508del homozygous status; MGAT1 for pancreatic insufficiency; TOLLIP and IGJ for lung function impairment), as well as potential markers of CFTR phenotypic rescue to be further investigated for clinical relevance. On that note, the pathophysiology of ABPs in sweat compared to other CF tissues also deserves further investigation. In conclusion, sweat proved to be an informative bio-fluid to improve the understanding of CF pathophysiology by providing candidate biomarkers of interest for precision medicine and therapeutic development.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/cells11152358/s1, Figure S1: Mean Square Prediction Error (MSPE) plot for the determination of SPLS parameters; Figure S2: Separated LFQ normalization was applied to raw data since global LFQ normalization elicited a quantification bias; Figure S3: Discarded samples based on chromatogram discrepancy and poor correlation with other samples' protein profiles; Figure S4: Plot of the estimated coefficients of the first four sweat proteins selected as important variables for all responses, following SPLS regression; Figure S5: Sweat CF biomarker profiles discriminated patients with CF from control subjects, in correlation with CFTR genotype; Tables S1–S46.

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Acknowledgments:
The authors thank all the healthy volunteers and patients with CF who agreed to participate in this study. The authors acknowledge the valuable contributions of Nancy Rosière, Nanou Tanteliarisoa, and Lisette Trzpiot (ULiège) to the conduct of this work, through their technical and experimental support. The authors thank Halehsadat Nekoee Zahraei and Anne-Françoise Donneau (Biostatistics Unit, Department of Public Health, ULiège) for their help with the SPLS analysis.

Conflicts of Interest:
The authors declare no conflict of interest.