Mass Spectrometry-Based Proteomics of Human Milk to Identify Differentially Expressed Proteins in Women with Breast Cancer versus Controls

It is thought that accurate risk assessment and early diagnosis of breast cancer (BC) can help reduce cancer-related mortality. Proteomics analysis of breast milk may provide biomarkers of risk and occult disease. Our group works on the analysis of human milk samples from women with BC and controls to investigate alterations in protein patterns of milk that could be related to BC. In the current study, we used mass spectrometry (MS)-based proteomics analysis of 12 milk samples from donors with BC and matched controls. Specifically, we used one-dimensional (1D)-polyacrylamide gel electrophoresis (PAGE) coupled with nanoliquid chromatography tandem MS (nanoLC-MS/MS), followed by bioinformatics analysis. We confirmed the dysregulation of several proteins identified previously in a different set of milk samples. We also identified additional dysregulations in milk proteins shown to play a role in cancer development, such as Lactadherin isoform A, O-linked N-acetylglucosamine (GlcNAc) transferase, galactosyltransferase, recoverin, perilipin-3 isoform 1, histone-lysine methyltransferase, or clathrin heavy chain. Our results expand our current understanding of using milk as a biological fluid for identification of BC-related dysregulated proteins. Overall, our results also indicate that milk has the potential to be used for BC biomarker discovery, early detection and risk assessment in young, reproductively active women.


Introduction
BC is one of the most common cancers worldwide and in the United States [1–3]. Accurate risk assessment and earlier detection would benefit all women, especially young women, for whom mammography is not effective due to their dense breast tissue [4], and reproductively active women, who might be temporarily at a higher risk of pregnancy-related BC [5,6]. A biomarker is a protein, set of proteins or other molecules whose dysregulation is consistently associated with a disease or disorder. One of the most robust and common tools for the discovery of protein biomarkers is MS, a precise method for the identification, quantitation and characterization of proteins and their post-translational modifications [7]. Early diagnosis and risk assessment of BC could be achieved non-invasively by the discovery of BC biomarkers in different types of bodily fluids, and much research has been published on this subject [8,9]. Still, there remains a need for more research in this field to provide a comprehensive biomarker signature for BC based on the protein biomarkers found in bodily fluids. Human milk, directly derived from the breast ducts, has been studied for BC investigations [4,5,8,10–13] and is accepted as a proper microenvironment for the purpose of BC biomarker discovery [1–6,10,13,14].
We previously investigated protein dysregulations in 10 human milk samples (from 5 women with BC and 5 controls) using 1D-SDS-PAGE coupled with nanoLC-MS/MS and identified several dysregulated (upregulated or downregulated) proteins [5]. In a second study, we focused on one of these comparison pairs, a within-woman comparison. Specifically, both samples (BC and control) were donated by the same woman, one from the breast in which BC was diagnosed 24 months after donation and one from the contralateral breast. We performed 2D-SDS-PAGE coupled with nanoLC-MS/MS to achieve a more comprehensive investigation of dysregulated proteins in this pair of samples and identified several dysregulated proteins [15]. Most of the proteins identified in our previous work have been shown to be potentially involved in cancer development, and some have been reported to be dysregulated in either cancer or cancer cell lines (reviewed in our previous studies [5,15]). In the present study, we used 1D-SDS-PAGE coupled with nanoLC-MS/MS to analyze a new set of paired milk samples (n = 6 pairs). Five of the 6 comparison pairs are BC vs. control pairs. Four of these are across-women comparisons, meaning that the BC sample is milk combined from the left and right breasts of a woman diagnosed with BC, compared with milk combined from the left and right breasts of another woman with no cancer diagnosis. The fifth is a within-woman comparison, meaning that the BC sample came from the right breast of a woman diagnosed with cancer in the right breast and the control sample came from her unaffected left breast. We also analyzed one comparison pair from the right and left breasts of a woman without BC, to investigate the protein differences between the milk from the two breasts. In these 6 pairs, we identified several protein dysregulations (upregulations or downregulations), some of which were also identified in our previous studies. These dysregulated proteins might be considered as potential future biomarkers for BC early detection and risk assessment.

Human Subjects and Milk Samples
Analyses were performed on 12 human milk samples collected with IRB approval from the University of Massachusetts, Amherst. The procedure for sample collection has been described elsewhere [10,13]. Briefly, milk samples received at the laboratory between 2008 and 2015 were aliquoted and maintained at −20 °C. We attempted to match cases and controls for mother's age at sample donation and age at first birth, the number of live births, and the length of time samples were maintained at −20 °C (Table 1). The participants who donated milk and were diagnosed with BC comprised two categories: (1) they were diagnosed with BC before milk donation, or (2) they were diagnosed with BC after milk donation. Table 1 provides the participant demographics that were used for assigning the comparison pairs. As shown in Table 1, analyses were conducted on milk donated by 10 women. For 8 women (4 with BC and 4 controls), samples prepared by combining milk from the right and left breasts were analyzed. These samples provided 4 comparison pairs with the following sample codes: 1_BC vs. 2_Con, 3_BC vs. 4_Con, 5_BC vs. 6_Con and 7_BC vs. 8_Con. The 9th woman provided two milk samples, one from the right breast diagnosed with cancer, and a control sample from the left breast, in which there was no cancer, allowing a within-woman comparison (9_R_BC vs. 9_L_Con). Lastly, the 10th woman, who did not have BC, donated milk from her right and left breasts, allowing a within-woman comparison of protein patterns from two control breasts (10_R_Con vs. 10_L_Con). As seen in Table 1, Sample 3_BC was donated 6.2 years after the participant was diagnosed with BC. We compared this sample with a milk sample from a woman who was never diagnosed with BC, to observe whether alterations in protein pattern remain years after the BC was removed.
* Codes for milk. The date after the participant ID indicates the date at which the samples were received at the lab and stored at −20 °C. IDC = invasive ductal carcinoma, DCIS = ductal carcinoma in situ. ER/PR/Her2 = estrogen receptor/progesterone receptor/human epidermal growth factor receptor 2. BC = milk (combined from left and right breasts) came from a woman diagnosed with breast cancer. Con = milk (combined from left and right breasts) came from a woman with no cancer diagnosis; control. NA = not applicable. For samples 9 and 10, separate milk samples from the left and right breasts were analyzed; 9_R_BC indicates that the milk came from the right breast of a woman diagnosed with cancer in the right breast; 9_L_Con indicates that the milk came from the left breast (control) of the same woman whose cancer was diagnosed in the right breast, whereas for participant 10 (no BC), each milk sample came from a breast considered a control.
Comparison pairs (BC versus control) were assigned in an attempt to minimize differences in BC risk factors including mother's age, her age at first birth, and number of births. It was not possible to match BC and control samples on baby's age. Comparison pairs were analyzed at the same time to minimize potential errors resulting from possible deviations in the performance of the instruments. Except for samples from participants 9 and 10 (milk samples 9_R_BC, 9_L_Con, 10_R_Con, 10_L_Con), all samples are mixtures of milk from the right and left breasts. For participant 9 (a woman with BC in the right breast) and 10 (a woman without BC), milk was taken separately from the right and left breasts, and the comparison was between the milk from right and left breasts.
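The pairing scheme described above can be written down as a small data structure, which makes the sample and pair counts easy to check. The following is only an illustrative sketch: the class name `ComparisonPair` and its field names are hypothetical, while the sample codes are those given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComparisonPair:
    first: str           # BC sample (right-breast sample for participant 10)
    second: str          # control sample it is compared with
    within_woman: bool   # True when both samples come from the same donor
    bc_vs_control: bool  # False only for the control-vs-control pair


# Sample codes as listed in the text; the structure itself is an assumption.
PAIRS = [
    ComparisonPair("1_BC", "2_Con", within_woman=False, bc_vs_control=True),
    ComparisonPair("3_BC", "4_Con", within_woman=False, bc_vs_control=True),
    ComparisonPair("5_BC", "6_Con", within_woman=False, bc_vs_control=True),
    ComparisonPair("7_BC", "8_Con", within_woman=False, bc_vs_control=True),
    ComparisonPair("9_R_BC", "9_L_Con", within_woman=True, bc_vs_control=True),
    ComparisonPair("10_R_Con", "10_L_Con", within_woman=True, bc_vs_control=False),
]

bc_pairs = [p for p in PAIRS if p.bc_vs_control]
samples = {code for p in PAIRS for code in (p.first, p.second)}

print(len(PAIRS), len(bc_pairs), len(samples))  # 6 5 12
```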

Reagents
All the chemicals used in this study were from Sigma-Aldrich (St. Louis, MO, USA).

MS-Based Proteomics Analysis
As described in our previous study [5], the following procedure was used for MS-based proteomics analysis of human milk, with the aim of identifying dysregulated proteins in BC vs. control. The milk samples were thawed, and a Bradford assay was conducted to determine the total protein concentration in each sample. Then, 800 µg of protein from each sample was separated by 11% sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE), and a Coomassie Blue stained gel was obtained for the milk samples. Each of the 12 gel lanes was divided into 30 protein bands; the bands were excised, cut into very small pieces and subjected to in-gel trypsin digestion, as described previously [5]. After overnight in-gel trypsin digestion, the peptides were extracted and purified by Zip-Tip reverse phase chromatography (C18 Ziptip™; Millipore, Billerica, MA, USA). The clean, concentrated peptide mixture was analyzed by nanoLC-MS/MS (a NanoAcquity UPLC coupled with a QTOF Ultima API MS; Waters, Milford, MA, USA), as described elsewhere [16]. The MS raw data from MassLynx software (MassLynx version 4.1, Waters) were converted to peak list (pkl) files by ProteinLynx Global Server software (PLGS version 2.4, Waters) as described elsewhere [17], using the following parameters: a background polynomial of order 5 and a background threshold of 35%, Savitzky-Golay smoothing type, 2 iterations and window of 3 channels, centroid top of 80% of peaks and minimum peak width of 4 channels.
The resulting pkl files from PLGS were submitted to our in-house Mascot server (version 2.5.1; Matrix Science, London, UK; www.matrixscience.com, accessed on 16 October 2022) for protein identification using the following parameters: NCBI_20150706 database (69146588 sequences; 24782014966 residues) (NCBI: National Center for Biotechnology Information), Homo sapiens (human) (312165 sequences) as taxonomy, trypsin as enzyme, carbamidomethyl (cysteine) as fixed modification, acetylation (lysine), oxidation (methionine) and phosphorylation (serine, threonine and tyrosine) as variable modifications, peptide mass tolerance of ±1.3 Da (one 13C isotope), fragment mass tolerance of ±0.8 Da and a maximum of one missed cleavage. The exported results from the Mascot server (in the format of Mascot .DAT files) were then analyzed with Scaffold software (Scaffold version 4.2.1, Proteome Software Inc., Portland, OR, USA) for statistical analysis of the paired comparison groups and to verify the identified proteins based on the MS/MS data, using the following parameters [18]: protein threshold of minimum 90% probability and minimum two peptides identified by the Protein Prophet algorithm, and peptide threshold of minimum 20% probability by the Scaffold Local FDR (false discovery rate) algorithm. To investigate protein dysregulations, differences with a Fisher's exact test p-value ≤ 0.05 and a fold change ≥ 2 were considered statistically significant. Fold change for upregulation (total spectra count of the BC sample divided by total spectra count of the control sample) is shown with positive numbers, and fold change for downregulation (total spectra count of the control sample divided by total spectra count of the BC sample) is shown with negative numbers.
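The signed fold-change convention and the significance criteria described above can be illustrated with a minimal sketch. It is not the Scaffold implementation. The function names are hypothetical, the 2 × 2 table layout (spectra assigned to the protein versus all other spectra in each sample) is an assumption about how spectral counts enter Fisher's exact test, and the handling of zero counts is likewise an assumption, since the text does not describe it.

```python
from math import comb


def signed_fold_change(bc_count: int, control_count: int) -> float:
    """Positive value: up in BC (BC/control). Negative value: down in BC (control/BC)."""
    if bc_count <= 0 or control_count <= 0:
        # Assumption: zero counts are left undefined in this sketch.
        raise ValueError("both spectral counts must be positive")
    if bc_count >= control_count:
        return bc_count / control_count
    return -(control_count / bc_count)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are at most as likely as the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def is_dysregulated(bc_count: int, control_count: int,
                    bc_total: int, control_total: int,
                    alpha: float = 0.05, min_fold: float = 2.0) -> bool:
    """Apply both criteria from the text: p-value <= 0.05 and |fold change| >= 2."""
    p = fisher_exact_two_sided(bc_count, bc_total - bc_count,
                               control_count, control_total - control_count)
    return p <= alpha and abs(signed_fold_change(bc_count, control_count)) >= min_fold
```

For example, `signed_fold_change(5, 20)` returns `-4.0`, which corresponds to a four-fold downregulation in the BC sample under this convention.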

Data Availability
The data generated during the current study are available from the corresponding author on reasonable request, under Clarkson University's Material Transfer Agreement.

Results and Discussion
One hundred µg of protein from each of the 12 milk samples comprising the 6 pairs was separated by SDS-PAGE. The gel image is shown in Figure 1. For further proteomics analysis, eight hundred µg of protein from each of the 12 milk samples was separated by SDS-PAGE (Supplementary Materials Figure S1; the lanes in the image were rearranged to present each sample next to its pair). Visual inspection of the 100 µg and 800 µg gel images indicates that the overall protein pattern is very similar among all milk samples. There are, however, some differences that can be discerned directly from the gel. For example, both samples from pair 10 (milk from the left and right breasts of a woman who did not have BC, Supplementary Materials Figure S1) lack a major band in the 63 kDa region that is present in both the cancers and controls of the other four pairs. Examination of the results from the database search identifies this region as corresponding to immunoglobulins.
To identify proteins potentially associated with BC, we applied nanoLC-MS/MS analysis on 30 sets of trypsin-digested bands from six pairs of milk samples. As shown in Table 1, the first four pairs included milk from a woman diagnosed with BC and milk from a woman without BC (control or Con). Pairs were constructed to minimize differences in woman's age, age at first birth, and number of live births. Baby's age was substantially less for the control samples as compared to the BC samples of the first three pairs. The 5th pair (#9R/L) included milk from the left and right breasts of a woman diagnosed with cancer in only one breast, and the 6th pair included milk from the left and right breasts of a woman with no cancer diagnosis in either breast. This 6th pair (#10L/R) provides a baseline for the number of proteins that can be expected to be differentially expressed in the milk of the left and right breasts of a healthy, non-symptomatic woman.
Analysis using nanoLC-MS/MS revealed several significantly differentially expressed proteins (p-value ≤ 0.05 and fold change ≥ 2) among the 5 paired comparisons of BC and control milk samples. Some of the differentially expressed proteins were also observed in the single comparison between the milk from the left and right breasts of control #10 (woman without cancer). To determine which of the differentially expressed proteins might be markers of BC or BC risk, we identified a subset of these proteins that were similarly dysregulated in our previous studies [5,15] and present them in Table 2, along with information on whether these proteins were differentially expressed in the control comparison (participant 10). Next, we focused only on those proteins for which the differential expression was limited to comparisons between cancer and control (some examples are shown in Supplementary Materials Figure S2a-d).
Table 2 lists all proteins that were differentially expressed in our present comparisons of cancer and control breast milk samples. Some of these proteins were also identified in our previous comparisons of cancer and control milk samples [5,15]. Some of the proteins differentially expressed between cancer and control were also differentially expressed in the comparison between the two control breast milk samples from participant 10 (shaded in Table 2).
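The filtering step described above, keeping proteins that are dysregulated in at least one BC vs. control pair but not in the control vs. control pair from participant 10, can be sketched as a simple set operation. The function name and the protein labels (`protein_A` and so on) are hypothetical placeholders and do not represent results of this study.

```python
def bc_specific_proteins(bc_hits, control_hits):
    """Map each protein dysregulated in >= 1 BC pair, and not in the
    control-vs-control pair, to the sorted list of pairs showing it."""
    result = {}
    for pair, proteins in bc_hits.items():
        for protein in proteins:
            if protein not in control_hits:
                result.setdefault(protein, []).append(pair)
    return {protein: sorted(pairs) for protein, pairs in sorted(result.items())}


# Hypothetical placeholder data, not results from the study.
bc_hits = {
    "1_BC vs. 2_Con": {"protein_A", "protein_B"},
    "3_BC vs. 4_Con": {"protein_A", "protein_C"},
    "9_R_BC vs. 9_L_Con": {"protein_B"},
}
control_hits = {"protein_C"}  # dysregulated in 10_R_Con vs. 10_L_Con

candidates = bc_specific_proteins(bc_hits, control_hits)
for protein, pairs in candidates.items():
    print(protein, len(pairs), pairs)
```

In this toy input, `protein_C` is excluded because it also differs between the two control breasts, while `protein_A` and `protein_B` are retained with the pairs in which they were observed.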

Differentially Expressed Proteins in BC vs. Control That Were Identified in the Current Study (and Also Identified as Differentially Expressed in Our Previous Studies on Human Milk)
Examples of some of the most important dysregulated proteins are shown in Supplementary Materials Figure S2. The spectral counts and fold changes are shown in the graphs. These proteins are important in our comparison study because the same dysregulation was observed in multiple comparison pairs in the current study and in our previous studies (mostly in multiple comparison pairs). Additionally, these proteins were not dysregulated between the control samples from the right and left breasts of participant 10. They include proteins from the casein, albumin, lactoferrin and bile salt stimulated lipase families.
Several of the dysregulated proteins were observed in the comparison pair of 3_BC vs. 4_Con (Table 2). In this pair, the BC sample was donated 6.2 years after the woman was diagnosed with cancer. The aberrant expression of proteins related to BC could either remain or disappear after the cancer is treated, depending on the cause of the dysregulation, the type of biomarker and whether or not the biomarker has a specific relationship with the therapy [19].

Dysregulated Proteins Specific to the Current Study
In addition to the differentially expressed proteins identified in other studies, we also identified several differentially expressed proteins specific to the current study (Table 2).
For all the protein families in Table 2, we discuss selected functions, the number of milk pairs that showed dysregulation in the current study and in our previous studies, and possible roles or dysregulation previously reported in cancer, based on the literature (Table 3). As seen in Table 3, some of these dysregulations were observed in multiple comparison pairs, while others were specific to individual pairs. This is likely because of the wide variation in the time between milk donation and cancer diagnosis across the samples. Additionally, we conducted the study regardless of BC subtype, in a small set of 5 cancer-control pairings. We still considered these dysregulated proteins because, based on the literature, we found a possible relationship between these proteins (or proteins from the same family, or the genes that encode them) and cancer development, and in some cases the dysregulation was also observed by other research groups using different methods. The functions of these proteins, as well as the possible relationships between them and cancer, are shown in Table 4.

Table 3. Protein functions, type of dysregulation, number of pairs that showed dysregulation and possible role/dysregulation previously found in cancer, based on the literature, for the proteins discussed in Table 2.

- Involved in purine catabolism
- Downregulation observed in BC patients [30].
- Involved in uric acid synthesis (which has antioxidant activity) [31].

- A protease inhibitor that protects tissues from enzymatic damage
- The gene might be involved in cancer development [34].
- Upregulated in lung cancer tissues [35].
- Upregulated in prostate cancer tissues [36].

Zn-alpha2-glycoprotein
- One upregulation in 1 out of 5 pairs
- One dysregulation in control samples from participant 10
- One upregulation in one pair of within-woman comparison [15]
- Lipid degradation
- In high levels, could cause body fat deficiency and cachexia
- Reported to be a potential biomarker in different cancers, including BC [37].
- Upregulated in BC tumors [38].
- Upregulated in advanced BC tumors [39].
- High gene expression has been reported in BC [40].

- Cell membrane protein, might be involved in ion channel transport.
- Upregulation is reported in ovarian cancer [60,61].

Table 4. Protein functions, type of dysregulation, number of milk pairs that showed dysregulation and possible role/dysregulation previously found in cancer, based on the literature, for the proteins discussed in Table 3.

Cancer-related investigations: Downregulated in ER-positive BC progression, although upregulated in triple-negative BC [62]. High expression of MFG-E8 (the gene that encodes lactadherin) observed in breast carcinomas [63].

Protein family: O-linked N-acetylglucosamine (GlcNAc) transferase
Dysregulation in the current study: One upregulation in 1 out of 5 pairs
Selected functions: Enzyme involved in protein glycosylation
Cancer-related investigations: Upregulated in cancers (including BC) and involved in cancer progression [64]. Upregulated in BC and plays a role in cancer cell glycolysis [65]. Upregulated in BC cell lines [66]. Upregulated in prostate cancer cell lines [67]. Upregulated in lung and colon cancer tissues [68].

Protein family: Enolase
Dysregulation in the current study: One upregulation in 1 out of 5 pairs
Selected functions: Enzyme involved in glycolysis
Cancer-related investigations: Upregulated in different types of cancers, including BC [69,70]. Elevated levels in BC resulting from environmental pollutants [71]. Upregulated in BC tissues [72].

Protein family: Galactosyltransferase
Dysregulation in the current study: One upregulation in 1 out of 5 pairs
Selected functions: Enzyme for galactose transfer
Cancer-related investigations: Plays a role in BC cell line proliferation [73]. Plays a role in cell adhesion in a BC cell line [74]. Plays a role in cell transformation to malignancy [75]. Upregulated in malignant BC tissues and cell lines [75]. Upregulated in lung cancer cells [75–77].

Cancer-related investigations: [91]. Upregulated in mouse mammary gland tumors [92]. Upregulated in the M4A4 BC cell line [93].

Protein family: Human protein disulfide isomerase (Hpdi)
Dysregulation in the current study: One upregulation in 1 out of 5 pairs; one dysregulation in control samples from participant 10
Selected functions: Enzyme involved in protein folding
Cancer-related investigations: Involved in cancer development and progression [94]. Upregulated in different types of cancers [95].

Protein family: Elongation factor
Dysregulation in the current study: One upregulation in 1 out of 5 pairs
Selected functions: Plays a role in cell cycle and protein translation
Cancer-related investigations: Upregulation has been reported in different cancers [96,97]. Overexpression is reported in BC tumors [98].

Protein family: Clathrin

In both the current study and our previous studies [5,15], we observed several protein differences in the within-woman comparisons of cancer and control (samples 9_R_BC and 9_L_Con in the current study). These differences are important because the genetic and epigenetic differences between donors, which have to be considered in across-women comparisons, are eliminated in this case. However, when interpreting our paired comparison strategy, it must be considered that the discrepancies in protein dysregulations among different BC vs. control pairs might be due to the wide range in time between milk donation and cancer diagnosis across the samples (as shown in Table 1).
In addition to the dysregulated proteins reported in this study, several immunoglobulins and other components of the immune system were frequently observed to differ between pairs (data not shown). However, we did not observe a consistent pattern between BC and control samples, and these data are not discussed here. Varying immune responses, both to conditions unrelated to cancer and to cancer itself, may affect immunoglobulin expression.

Conclusions
In this study, we performed MS-based proteomics on 12 human milk samples. These comprised 5 paired BC vs. control comparisons, used to identify dysregulated proteins in milk from women with BC, and one comparison between the right and left breasts of a woman without BC, used to investigate differences between the protein patterns of milk from the two breasts of the same donor. Most of the proteins that we found to be dysregulated in BC vs. control have potential roles in cancer progression and tumor development/growth and have been shown to be dysregulated in cancer.
Based on our current and published studies [5,15], the tentative draft biomarker signature that we have identified so far contains downregulated caseins, bile salt stimulated lipase, xanthine dehydrogenase/oxidase, lactoferrins, lactate dehydrogenase and fatty acid synthase, and upregulated Zn-alpha2-glycoprotein and antichymotrypsin. Although this signature was built from three independent studies, it is still fragile because the sample sizes were small, and our findings must be confirmed in a larger study. Yet, despite the limitations of this and previous studies, our findings support the use of breast milk to examine the BC microenvironment and for BC biomarker discovery. Therefore, identifying dysregulated proteins in human milk by MS-based proteomics could serve as a tool for detecting BC and assessing BC risk.

Limitations
This pilot study with 12 milk samples has several limitations. First, we compared the protein profiles of only 6 pairs of human milk samples, a small sample size that could have led to spurious findings. Second, the disparity in baby's age between the BC and control milk samples could underlie some of the observed differences in protein expression. Third, the time between milk donation and cancer diagnosis varied greatly, which effectively made each pair a unique analysis and made comparisons across samples difficult. Despite these limitations, some consistencies were observed for proteins differentially expressed in the milk of women with cancer, and these findings support the need for further research.
Another limitation of the current study concerns the types of proteins that we identified. While we know the identity of most proteins, it is clear that more than one isoform of some proteins is present in the milk samples and was identified in the current proteomics study. It is premature to say which isoforms contribute to the onset and/or progression of BC and which actually protect the breast and prevent BC from developing. Despite this, the identification of dysregulated proteins in more than one study, together with the later identification of additional new proteins, demonstrates the power of proteomics in biomarker discovery and warrants further investigation.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/proteomes10040036/s1, Figure S1: SDS-PAGE of milk samples. Eight hundred µg of protein was loaded in each well. For better understanding, the gel lanes were cropped, and comparison pairs are shown next to each other; Figure S2: Dysregulated proteins in BC vs. control, also found to be dysregulated in our previous studies on human milk, which did not show any dysregulations in control samples from participant 10. Each bar graph shows total spectral counts in BC (in red) vs. control (in blue) for different proteins within the same family. The bars are labeled by the corresponding comparison pair and the fold change (FC) for each comparison. A red label means that the corresponding pair showed inconsistency compared to the other pairs in terms of up- or downregulation.