1. Introduction
The study of the molecules responsible for the mechanisms behind body function, better known as omics, has revolutionized the study of the human body. After the conclusion of the Human Genome Project (2001), efforts have shifted toward the study and understanding of the human proteome. The increased sensitivity and specificity of the available technology have led to the discovery of a growing number of proteins, making possible the creation of databases for the proteomes of different organs, as in the case of the Human Eye Proteome Project (HEPP) [
1]. The human microbiome project has also been created to understand the correlation between human health and the human microbiome [
2]. Increasing our overall knowledge of the proteomic composition of the human body can provide insight into the mechanisms involved in pathological states.
Technological advances in the field of quantitative proteomics have allowed for their use in the study of the mechanisms and treatment of disease. With the new techniques, a greater number and variety of biological samples can be analyzed. Each biological sample is a possible source of biomarkers for the study of certain diseases.
Table S2 contains a summary of the total identified proteins found in the cited work. Some of these proteins are being considered for use as biomarkers for disease. Some of the biological samples that can be acquired with noninvasive techniques are cerumen, tears, stools, and saliva. Notably, despite its accessibility, cerumen has not been widely studied as a biofluid [
3]. Other samples, such as vitreous humor and aqueous humor, are acquired by invasive methods but can provide valuable insight into many pathologies, especially, but not limited to, those of the eye. These biological samples can be used to find new approaches to the treatment of disease that can eventually translate into progress in personalized medicine.
This review focuses on the uses of quantitative proteomics methodology in the diagnosis and treatment of disease. It centers on research done in human biological samples to serve as a guide for future research. Although a large number of biological samples are currently being used in research, our discussion is limited to research done with mass spectrometry technology in the following samples: cerumen, saliva, vitreous humor, aqueous humor, tears, nipple aspirate fluid, breast milk, cervicovaginal fluid, nasal secretions, and stools.
Table S1 contains a summary of the diseases studied with each sample type in the cited works.
2. Techniques in Quantitative Proteomics
Before addressing the theme of non-conventional body fluids in quantitative proteomics, it is important to acknowledge two scientists who revolutionized the study of proteins through the application of mass spectrometry. John B. Fenn and Koichi Tanaka shared a Nobel Prize in Chemistry (2002) for the development of electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI), respectively [
4]. Their contribution enabled scientists to identify and quantify proteins with a high level of precision. The following section is dedicated to the technology available, giving special consideration to sample preparation and to the mass spectrometry technology for protein analysis (ESI and MALDI).
Research in proteomics requires a thorough understanding of each phase of sample preparation and analysis. Some of the most important steps are sample collection, sample conditioning, mass spectrometry processing, comparison of theoretical data with experimental data, and finally, the use of specialized software that can generate lists and tables with the analyzed data. There are also a number of variables that need to be considered, as they can often affect the results. One of them is the denaturing of the sample caused by errors in collection or mishandling. Others include poorly performed digestions, desalting errors, and loss of relevant data due to failure to perform depletions. The mass spectrometry and labelling methods selected for the quantitative proteomics assays can also affect the results. For this reason, before developing a study design with this type of technology, it is fundamental to understand which method works best for the desired analysis. This can be achieved by studying previous research, especially studies that used clinical samples.
The acquisition of clinical samples is a delicate process, as it often requires permits and is limited by sample availability. When collecting a sample, it is important to follow previously specified conditions of temperature and storage so that the sample is not contaminated or altered. To achieve a good digestion, it is essential to know the initial concentration of the sample. Reaction salts from each phase must be removed before mass spectrometry analysis so that they do not interfere with the sample analysis [
5]. One of the uses of biological samples in research is their potential as a source for biomarkers. A biomarker is defined as “any substance, structure or process that can be measured in the body or its products and influence or predict the incidence of outcome or disease.” [
6]. In order for a substance to be considered a potential biomarker, some of the most important qualities it must have are precision, sensitivity, and reproducibility [
7]. Since it is very difficult to find many of those traits in the molecules studied, few potential biomarkers have been found despite the apparent availability of proteins in the samples. The increased precision of the available technology can help identify a greater number of proteins that may eventually become biomarkers.
As previously mentioned, two main ionization technologies are available for sample processing in mass spectrometry-based quantitative proteomics analysis: electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI). Both techniques can be used with a variety of mass spectrometers. ESI uses electrical energy to transfer ions from solution into the gas phase. Neutral compounds can also be analyzed with ESI MS once they are ionized [
8]. After they obtain their charge, the ions travel through the analyzer to the detector, which can identify them according to their mass/charge (m/z) ratio [
8]. The signals are then analyzed and recorded as a mass spectrum in a computer [
8]. MALDI MS uses a laser beam to irradiate a solid sample embedded in an organic matrix. This causes the formation of protonated molecules. The ionized samples then travel through a mass analyzer, which in MALDI instruments is usually a time-of-flight (TOF) analyzer. The measured flight time is then translated into the m/z of the sample [
9]. A major setback in the use of mass spectrometry is the high cost associated with acquiring the equipment, which limits the type of institutions that can own and use the technology [
8].
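As a rough illustration of the m/z values described above, the following Python sketch computes the m/z of a protonated ion at several charge states and the flight time of an ion in an idealized linear TOF analyzer. It assumes positive-ion mode, where each charge comes from an added proton, and it ignores initial velocity spread, delayed extraction, and reflectron effects. The function names, the 1000 Da peptide mass, the 1 m drift length, and the 20 kV accelerating voltage are hypothetical choices made for illustration and are not taken from the cited works.

```python
import math

PROTON_MASS_DA = 1.007276               # mass of a proton in daltons
DALTON_KG = 1.66053906660e-27           # one dalton in kilograms
ELEMENTARY_CHARGE_C = 1.602176634e-19   # elementary charge in coulombs


def mz_of_protonated_ion(neutral_mass_da, charge):
    """Return m/z of an [M + zH]z+ ion (assumes charging by protonation only)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


def ideal_tof_flight_time(mz, drift_length_m, accel_voltage_v):
    """Return the flight time in seconds for an idealized linear TOF analyzer.

    Energy balance z*e*U = m*v**2 / 2 gives t = L * sqrt(m / (2*z*e*U)),
    so the flight time grows with the square root of m/z.
    """
    if mz <= 0 or drift_length_m <= 0 or accel_voltage_v <= 0:
        raise ValueError("all arguments must be positive")
    mass_per_charge_kg = mz * DALTON_KG
    return drift_length_m * math.sqrt(
        mass_per_charge_kg / (2 * ELEMENTARY_CHARGE_C * accel_voltage_v)
    )


if __name__ == "__main__":
    peptide_mass_da = 1000.0  # hypothetical neutral peptide mass
    for z in (1, 2, 3):
        mz = mz_of_protonated_ion(peptide_mass_da, z)
        t_us = ideal_tof_flight_time(mz, 1.0, 20000.0) * 1e6
        print(f"z={z}: m/z = {mz:.4f}, ideal flight time = {t_us:.2f} us")
```

The same neutral molecule therefore appears at different m/z values depending on its charge state, which is why ESI spectra of peptides often show several peaks per analyte.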
An important step in MS quantitative proteomics is sample labelling. Labelling enables the identification and quantification of the samples by different methods. There are two main types of quantification methods, absolute and relative, which are based on the absolute or relative abundance of the proteins in the samples. Most of the available techniques perform relative quantification. With stable isotope labelling methods, the quantitative analysis is achieved by calculating the amount of protein from the ratio of the peak intensities of the isotope-labelled ions. The principle behind it is to tag samples with stable isotopes so they can be differentiated by their mass. Some of the better known relative quantification isotope strategies are isotope-coded affinity tag (ICAT), isobaric tags for relative and absolute quantification (iTRAQ), dimethyl labeling, 16O/18O labeling, and stable isotope labeling with amino acids in cell culture (SILAC). There is also the label-free method in which, as the name suggests, the sample is not labeled [
10]. In this technique, the quantity of a protein is determined by the peak intensity of its peptide ions. Even though this type of technique does not require the labelling step and, in theory, can detect more proteins, it is less precise than the labelling methods mentioned above [
10]. Another method that can be used in the quantification of proteins is 2D gel electrophoresis, which is widely used in protein separation and quantification [
10]. Each method has advantages and disadvantages that determine the scope of its usage. However, many of the methods have been optimized through constant use, eliminating many of the disadvantages initially reported [
7,
10,
11,
12]. Some of the pros and cons of the previously mentioned labelling methods are described in
Figure 1.
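The ratio-based principle behind relative quantification can be sketched in Python as follows. This is a minimal illustration and not the procedure of any cited study: the function names and intensity values are hypothetical, and summarizing a protein by the median of its peptide ratios (for labelled pairs) or by its summed peptide intensities (for label-free runs) is an assumption made here for simplicity. Real pipelines add normalization, missing-value handling, and statistical testing.

```python
import math
from statistics import median


def labelled_log2_ratio(peptide_pairs):
    """Return the log2 heavy/light ratio of one protein.

    peptide_pairs: list of (light_intensity, heavy_intensity) tuples,
    one tuple per peptide assigned to the protein.
    """
    ratios = [
        heavy / light
        for light, heavy in peptide_pairs
        if light > 0 and heavy > 0  # skip peptides missing in either channel
    ]
    if not ratios:
        raise ValueError("no peptide pair with two positive intensities")
    return math.log2(median(ratios))


def label_free_log2_ratio(intensities_a, intensities_b):
    """Return log2(B/A) from summed peptide ion intensities of two separate runs."""
    total_a = sum(intensities_a)
    total_b = sum(intensities_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("summed intensities must be positive")
    return math.log2(total_b / total_a)


if __name__ == "__main__":
    # Hypothetical peak intensities for three peptides of one protein.
    pairs = [(100.0, 200.0), (50.0, 100.0), (10.0, 40.0)]
    print("labelled log2 ratio:", labelled_log2_ratio(pairs))
    print("label-free log2 ratio:", label_free_log2_ratio([100.0, 300.0], [50.0, 50.0]))
```

In the labelled case both intensities come from the same spectrum, so run-to-run variation largely cancels in the ratio. In the label-free case the two intensity lists come from separate runs, which is one reason the text above describes label-free quantification as less precise.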
At present, the technology for protein identification and quantification is constantly being studied and modified to address the problems encountered in its use. The choice of method will depend on the availability of mass spectrometry instruments and on how well the experimenter knows the labels to be used. It is also important to understand that variables such as the type and quantity of sample can determine which method is best suited for each experiment. These technologies will hopefully keep advancing, and their limitations will continue to be addressed. Other strengths and uses will also be discovered as they are applied to relatively unexplored samples.
4. Saliva
Saliva is one of the most accessible body fluids. It has been studied in the search for biomarkers for a number of diseases, including oral cancer [
13,
14]. In one study, myosin and actin were evaluated as possible biomarkers for oral cell carcinoma. The study revealed that both proteins showed differential expression in precancerous and cancerous lesions, which can help distinguish between them [
14]. These findings can eventually help achieve earlier and more targeted treatment that can improve the prognosis of oral cancer.
Immune diseases such as chronic graft vs host disease (CGVH) have also been studied using saliva. CGVH is a complication of allogeneic hematopoietic stem cell transplantation that can lead to impaired organ function [
6]. It commonly affects the oral cavity and skin. A research study using unstimulated saliva found 102 differentially expressed proteins, including downregulation of those associated with oral antimicrobial host immunity. The study used label-free LC-MS/MS quantification to achieve the results [
15]. Alteration of proteins associated with immune response was also found in another study performed with isobaric tags for relative and absolute quantification (iTRAQ) and tandem mass spectrometry (MS/MS) [
16]. Additionally, two downregulated proteins were proposed as potential biomarkers: IL-1 receptor antagonist and cystatin B. Another immune-related pathology of interest in the search for biomarkers in saliva using quantitative proteomic tools is Sjögren’s syndrome [
17,
18]. Since there is a lack of clinical tests available for diagnosis, biomarkers found in an accessible body fluid such as saliva would aid in early diagnosis and treatment.
For years, a large amount of research into the mechanism and treatment of human immunodeficiency virus (HIV) infection has been reported. Studies based on the quantitative analysis of saliva have been done to search for new ways to monitor the progression of the disease and to understand the mechanism of infection [
19]. HIV-associated neurocognitive disorders (HAND) have also been investigated through quantitative proteomics analysis of saliva for the possible involvement of the gut in the pathophysiological mechanisms. Using liquid chromatography mass spectrometry, the researchers found 58 proteins that were correlated with cognitive scores. These results reveal an apparent oral modulation of brain function during HIV infection [
20]. Quantitative proteomic analysis of saliva has also been considered for the development of less invasive glucose monitoring tools in diabetic patients [
21] and for biomarker discovery in Down syndrome comorbidities [
22].
5. Vitreous Humor
The human vitreous humor is an aqueous solution that fills the posterior compartment of the eye [
23]. Although the substructures of the vitreous and their proteins have been investigated recently [
24], studies have centered on using vitreous humor as a diagnostic tool for retinal pathologies. The proximity of the vitreous humor and the retina makes it possible for retinal components to diffuse into the vitreous humor [
25], which can make it an ideal source of biomarkers for retinal pathologies.
Despite its many advantages, a major setback to the use of vitreous humor is the invasiveness of the techniques used to remove the fluid. This also poses a problem in acquiring samples for study purposes since, ethically, healthy eyes cannot be subjected to the procedures for the sole purpose of extracting the fluid [
1]. To address this issue, some researchers have used as controls samples from patients who are undergoing eye surgery for various diseases but are presumed to have healthy vitreous humor [
25,
26], while others have used corneal transplant donor eyes [
27,
28]. The first study conducted with corneal transplant donor eyes as controls, carried out with type 2 diabetes patients [
27], revealed 29 differentially expressed proteins, eight of which were increased and 21 of which were decreased in proliferative diabetic retinopathy (PDR) patients. With the exclusion of serum proteins, 19 proteins were differentially identified in the vitreous fluid of PDR patients compared to those without the disease [
27]. Among the proteins identified, the ones that were documented for the first time in vitreous humor were: N(G), N(G)-dimethylarginine dimethylaminohydrolase 1 (DDAH 1), tubulin alpha-1B chain, gamma-enolase, cytosolic acyl coenzyme A thioester hydrolase (ACOT1), malate dehydrogenase (MDH), and phosphatidylethanolamine-binding protein 1 (PEBP1) [
27].
The pathways associated with PDR have also been studied with quantitative proteomic techniques [
28]. Among the findings were differentially expressed proteins involved in glycolysis/gluconeogenesis, complement and coagulation cascades, gap junction, and phagosome pathways. A study conducted in 2015 compared proliferative diabetic retinopathy with non-proliferative diabetic retinopathy using label-free quantitative proteomics analysis [
29]. The study found 230 proteins that were significantly increased in PDR when compared with non-proliferative retinopathy [
29]. In another study [
30], 96 proteins from previously published studies were selected to create a list of possible biomarkers for diabetic retinopathy (DR) in PDR and non-proliferative diabetic retinopathy. Based on the results of the study, carried out with semi-quantitative multiple reaction monitoring (SQ-MRM) and stable isotope dilution with multiple reaction monitoring (SID-MRM), the authors proposed a protein marker panel composed of APO4, C7, CLU, and ITIH2 for future studies [
30].
Exudative, or wet, age-related macular degeneration (AMD) is commonly associated with alteration of the retinal pigment epithelium and its basal membrane and can lead to blindness [
31]. Using bottom-up analysis with capillary electrophoresis–mass spectrometry (CE-MS) and LC-MS/MS for the identification of proteins in the vitreous, a study identified 97 proteins, 19 of which were significantly increased in AMD patients [
31]. Among the upregulated proteins were albumin and serotransferrin. Another retinal pathology that has been studied with vitreous humor is idiopathic epiretinal membrane (iEM). To understand the underlying mechanism, reversed-phase high-performance liquid chromatography (RP-HPLC) coupled with electrospray ionization tandem mass spectrometry (ESI-MS/MS) was used to analyze vitreous humor samples of patients with iEM [
32].
6. Aqueous Humor
As with vitreous humor, aqueous humor (AH) collection cannot be performed in adults who do not suffer from ocular diseases or who are not undergoing eye surgery [
1]. Still, it is an excellent source of potential biomarkers for various eye pathologies. One cause of blindness in the elderly population, age-related macular degeneration (AMD), has been studied with the use of AH for the possibility of finding biomarkers for early detection. An assessment of the complete composition of the AH of AMD patients was first done in 2012 using multiple reaction monitoring mass spectrometry (MRM-MS) [
33]. Another study, which used LC-ESI-MS/MS to analyze the AH of AMD patients, reported four proteins that were significantly increased [
34]. The authors proposed Rpn2, one of the overexpressed proteins, as a potential biomarker for the disease [
34]. A separate study using MALDI-TOF-MS/MS identified 78 proteins, 68 of which were differentially expressed in people with wet AMD compared with controls [
35]. Further research into AH and AMD may yield the desired biomarkers for the disease. In the study of body fluids with quantitative proteomics, protein expression can reveal information regarding the progression of disease. Keratoconus (KC) is a condition in which the cornea adopts a conical shape due to corneal thinning and protrusion. To better understand the mechanism behind keratoconus, label-free LC-MS/MS quantitative proteomics was used in KC patients and controls [
36]. The study found different expression levels in 16 proteins. Some of these proteins are known to play a role in the regulation of proteolysis and in responses to hypoxia and hydrogen peroxide. Other research using quantitative proteomic analysis of AH has explored juvenile idiopathic arthritis-associated uveitis [
37], diabetic retinopathy [
38], and branch retinal vein occlusion induced macular edema [
39].
7. Tears
Tear fluid has been used to study a diverse array of diseases. Unlike vitreous humor and aqueous humor, it is readily accessible and the methods for acquiring it are noninvasive. Dry eye (DE) is a multifactorial ocular surface disease. The effort to better understand this pathology has prompted the study of differences in protein expression in tear fluid [
40,
41]. In one study, protein expression in DE was analyzed with iTRAQ and LC-MS/MS, and a total of 386 proteins were identified [
41]. Participants were divided into four groups: non DE (NDE) or control, mild DE (MDE), moderate-to-severe DE (MSDE), and mixed DE (MXDE) [
41]. Downregulation of lipocalin-1, lysozyme, and prolactin-inducible protein was present in all DE subgroups. The number of downregulated proteins was also greater in MSDE than in MDE [
41].
Another approach to the study of DE focused on the different clinical phenotypes of patients with DE and how they could affect the tear proteome [
42]. This research was done using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF/TOF MS). Participants were subdivided into four groups: healthy controls, aqueous-deficient dry eye (DRYaq), lipid-deficient dry eye (DRYlip), and a combination of the two (DRYaqlip). Downregulation of PRR4 and upregulation of mammaglobin B and lipophilin A were seen in DRYaq and DRYaqlip patients when compared with controls and DRYlip patients. Their results demonstrated that different clinical phenotypes are associated with different alterations of the tear film proteome [
42]. Other research has been done to understand DE-associated factors, such as contact lens-related dry eye [
43].
The association of dry eye syndrome with type 2 diabetes was studied using two-dimensional strong cation-exchange/reversed-phase nano-scale LC-MS [
44]. The study found increased expression of proteins in patients with diabetes and dry eye syndrome, including annexin A1, elastase 2, clusterin, and apolipoprotein AII. Diabetic retinopathy, another well-known diabetes complication, has also been studied in tear fluid using a quantitative approach [
45]. Recently, a study addressed the search for new biomarkers for the differentiation of thyroid-associated orbitopathy (TAO) and dry eye syndrome [
46]. With the use of matrix-assisted laser desorption/ionization mass spectrometry, the authors identified deregulated proteins in TAO and dry eye. Among the findings, proteins downregulated in TAO compared with dry eye included proline-rich protein 1, uridine diphosphate glucose dehydrogenase, calgranulin A, transcription activator BRG1, annexin A1, cystatin, heat shock protein 27, and galectin [
46]. Other diseases that have been studied to define differences in their tear proteome expression are primary open angle glaucoma and pseudoexfoliative glaucoma [
47].
As previously mentioned, tears have been used to study a variety of diseases, including some that are not ocular in nature. The search for non-invasive biomarkers for Alzheimer’s disease has led to quantitative proteomics research in tear fluid. Using LC-MS/MS and selected reaction monitoring (SRM) based targeted proteomics, researchers found that a combination of lipocalin-1, dermcidin, lysozyme C, and lacritin as biomarkers showed 81% sensitivity and 77% specificity [
48]. Quantitative approaches have also been used to test ocular responses to laser platforms for refractive surgery as they manifest in tear proteins. This type of research can help understand the effect of different surgical procedures on the body [
49]. Other pathologies that have been studied using tears and quantitative proteomics tools are vernal keratoconjunctivitis [
49], multiple sclerosis [
50], and primary open angle glaucoma [
51].
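For readers less familiar with the sensitivity and specificity figures reported for the Alzheimer’s disease tear panel above, the following Python sketch shows how both are computed from classification counts. The function names are illustrative and the counts are hypothetical, chosen only so that the result matches 81% and 77%; they are not the cohort sizes of the cited study.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of diseased subjects that the biomarker panel correctly flags."""
    diseased = true_positives + false_negatives
    if diseased == 0:
        raise ValueError("no diseased subjects in the sample")
    return true_positives / diseased


def specificity(true_negatives, false_positives):
    """Fraction of healthy subjects that the biomarker panel correctly clears."""
    healthy = true_negatives + false_positives
    if healthy == 0:
        raise ValueError("no healthy subjects in the sample")
    return true_negatives / healthy


if __name__ == "__main__":
    # Hypothetical counts: 100 patients and 100 controls.
    print(f"sensitivity = {sensitivity(81, 19):.0%}")
    print(f"specificity = {specificity(77, 23):.0%}")
```

A panel with these values would miss roughly one in five patients and flag roughly one in four healthy subjects, which helps put the reported performance in context when comparing candidate biomarkers.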
8. Nipple Aspirate Fluid
Nipple aspirate fluid (NAF) is a ductal fluid that can be extracted from the breast through the nipple with a non-invasive technique [
52]. NAF has primarily been studied as a source of cancer biomarkers. The need for better diagnostic tools for early breast cancer detection, coupled with the non-invasiveness of the NAF extraction technique, makes it a promising source of biomarkers. However, each sample yields a low abundance of proteins, which affects its processing. This setback has been addressed by studying different pre-fractionation technique platforms for NAF, providing methodological information for future quantitative studies [
53].
In the early 2000s, several studies in the field of proteomics explored the use of NAF as a source of breast cancer biomarkers [
54,
55,
56]. More recently, quantitative liquid chromatography tandem mass spectrometry (LC-MS/MS) was used to compare parent estrogens and their metabolites in nipple aspirate fluid, ductal lavage supernatant, and serum in BRCA1/2 mutation carriers [
57]. The goal of this study was to compare measurements of estradiol and estrone metabolites (EM) and parent estrogens (PE) in different samples to determine which sample provided better results for biomarker discovery. Although serum yielded the most promising results, further studies comparing biological samples could generate more comprehensive data on the biomarkers discovered in each of them.
10. Cervicovaginal Fluid
Cervicovaginal fluid (CVF) originates from the vagina, cervix, endometrium, and oviducts [
63]. Over the years, several research projects have focused on identifying the proteomic composition of CVF [
63,
64]. CVF is known to play a crucial role in innate immune defense, as seen in studies regarding HIV transmission [
65]. An iTRAQ-based study was conducted to analyze CVF from high-risk HIV-exposed seronegative (HESN) individuals compared with low-risk HESN individuals and HIV-positive patients. The research revealed that serpin A5 was upregulated and myeloblastin was downregulated in high-risk HESN individuals when compared with the other groups [
66]. Further study into the mechanism behind HIV resistance and development of treatments is recommended [
66].
CVF has also been studied for its potential use in cervical cancer screening. Using label-free quantitative analysis as well as qualitative identification, a study found alpha-actinin-4 to be a potential biomarker in CVF for cervical cancer [
67]. CVF is easily accessible and has a number of potential biomarkers, which makes it ideal for the development of self-diagnostic testing for cervical cancer.
12. Broncho Alveolar Lavage Fluid
Broncho alveolar lavage fluid (BALF) has been used in the study of pulmonary diseases. Idiopathic pulmonary fibrosis (IPF) is a chronic, progressive interstitial pneumonia with poor prognosis. To understand more about the underlying mechanism of the pathology and to differentiate it from other fibrocystic pneumonias, candidate biomarkers in BALF have been investigated [
71]. The first gel-free quantitative analysis of BALF in IPF patients found upregulation of the profibrotic cytokine CCL24 and overexpression of osteopontin [
72].
Chronic obstructive pulmonary disease (COPD) is a complex and heterogeneous disease with major public health importance. A study using nano-reverse phase liquid chromatography mass spectrometry (RPLC/MS) found 423 proteins, 76 of which displayed altered expression. The study also observed upregulated expression of alcohol metabolism enzymes, including ADH1B, ALDH2, and ALDH3A1. This was the first report of an association between alcohol metabolism and COPD and its possible implications [
73]. Another study focused on the association between lung cancer and COPD, using matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF/TOF) mass spectrometry to help with the understanding of the pathogenic pathways of both diseases and possible protein biomarkers [
74].
The use of BALF has also been explored to help with the understanding of some inflammatory and immune diseases. Acute respiratory distress syndrome (ARDS) is caused by a response to infection or other inflammatory triggers and has a high mortality rate. Currently, there are no biomarkers for ARDS that can provide prognostic information for clinical management. To investigate new possible biomarkers, a study used iTRAQ and LC-MS/MS to compare BALF protein content in different stages of the disease [
75]. Their findings demonstrated differences in absolute protein levels in the different stages studied [
75].
Chronic graft dysfunction is a complication of lung transplantation and is the major cause of morbidity and mortality in transplant patients. The search for non-invasive biomarkers has led to the investigation of BALF with the use of quantitative proteomic techniques. Lung cancer is another area of interest for BALF research. Primary lung adenocarcinoma has a poor prognosis. A study using liquid chromatography mass spectrometry found 33 overexpressed proteins in patient samples that are potential future biomarkers [
76].