Clinical neuroscientists and practitioners have gained access to an increasing array of tools to assist in the diagnosis of neurodegenerative disease dementias. Various neuroimaging techniques and a number of cerebrospinal fluid (CSF) biomarkers can now complement a diagnosis that was once based solely on careful clinical and neuropsychological assessment of symptoms and could only be positively confirmed at autopsy [1]. These additional biomarkers can be extremely informative, as many neurological diseases present with similar sets of cognitive, behavioral, and/or movement symptoms, particularly in early disease stages. While neuroimaging-based techniques, including structural and functional Magnetic Resonance Imaging (MRI) and Positron Emission Tomography (PET), are currently the most commonly used diagnostic measures, they require sophisticated on-site technology and expertise available only in specialized centers, and they are expensive [2]. The field could therefore benefit from the increasing availability of biomarkers in blood, CSF, or other biofluids, which are more widely attainable through minimally invasive means, simpler to interpret, and measurable on more routine diagnostic equipment [3].
A series of National Institute on Aging and Alzheimer's Association consensus conferences suggested a number of criteria that a biomarker of neurodegenerative disease should fulfill [4]. A putative marker should be linked to the fundamental neuropathology of the disease and validated in neuropathologically confirmed cases. Ideally, a marker would be able to detect the disease before the onset of symptoms, distinguish between neurodegenerative disorders, and not be affected by treatment with symptom-relieving drugs. Practically, a marker should be non- or minimally invasive, simple to execute, and relatively inexpensive. Based on these principles, a new research framework, “AT(N)”, was proposed for clear delineation of Alzheimer’s disease (AD) from other disorders. In this framework [1], an indication of amyloid pathology (A+), by amyloid PET or in CSF, is necessary for assigning a subject to an AD diagnosis. The disease can be further classified by the presence or absence of tau fibrillation (T), measured by PET or as phosphorylated tau (pTau) in CSF, and by the extent of neurodegeneration (N), measured by structural MRI or total tau in CSF. Despite this improvement in defining AD in biological terms, these markers alone do not allow for clear staging and prognosis of AD. For example, classification of a case as A+T+ may predict progression from mild cognitive impairment (MCI) to dementia, but with a highly variable timeframe. Because of this variability, the AT(N) framework was designed to flexibly accommodate the addition of further biomarker groups, such as vascular and synuclein markers, that may aid in the overall characterization of neurodegenerative disorders as distinct clinical entities and likely treatment groups.
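To make the logic of this classification concrete, the short sketch below shows how binary biomarker calls might be combined into an AT(N) profile. The function and its inputs are purely illustrative and are not part of the published framework; in practice, each flag would be derived from a validated, platform- and cohort-specific cut-off.

```python
def atn_profile(amyloid_positive: bool, tau_positive: bool, neurodegeneration_positive: bool) -> str:
    """Combine binary biomarker calls into an AT(N) profile string.

    Each flag is assumed to come from a validated, platform-specific cut-off,
    e.g., amyloid PET or CSF amyloid-beta for A, tau PET or CSF pTau for T,
    and structural MRI or CSF total tau for (N).
    """
    profile = (
        f"A{'+' if amyloid_positive else '-'}"
        f"T{'+' if tau_positive else '-'}"
        f"(N){'+' if neurodegeneration_positive else '-'}"
    )
    # In this framework, only A+ profiles are placed on the Alzheimer's continuum.
    suffix = "Alzheimer's continuum" if amyloid_positive else "not on the Alzheimer's continuum"
    return f"{profile}: {suffix}"


# Example: amyloid- and tau-positive, without evidence of neurodegeneration
print(atn_profile(True, True, False))  # A+T+(N)-: Alzheimer's continuum
```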
Biofluids fulfill the practicality recommendations for a biomarker, being relatively easy and economical to obtain. CSF is the primary fluid of choice, being in intimate contact with the interstitial fluid of the brain and carrying molecules secreted by neurons and glia, excreted metabolic waste, and material from dying synapses, axons, and cells that indicate neurodegeneration [5]. However, although the lumbar puncture procedure used to obtain CSF is generally considered straightforward, safe, and tolerable, it is not routinely performed in many neurology clinics owing to patient and clinician reluctance [8]. The procedure is also not particularly well suited to multiple short-term repeat measures, such as those used to assess target engagement, pharmacokinetics, or the acute pharmacodynamic response to a novel drug. This has led to a widespread belief that the “holy grail” of neurodegenerative disease research lies in a blood-based biomarker [10].
In blood-derived fluids (plasma and serum), central nervous system (CNS)-specific proteins are diluted by proteins from all other peripheral tissue sources, leading to potentially very low concentrations that require ultrasensitive quantification [6]. Proteins may also be regulated and modified by different processes in the CNS versus the periphery, resulting in a lack of correlation between their abundance in CSF and blood [11]. Blood may also be presumed to be more labile, being in contact with many more secretory and excretory tissues than CSF. Finally, blood, and to a lesser extent CSF, is a complex mixture of proteins and metabolites that spans a very large range of abundances. In plasma, protein concentrations range from the most abundant protein, human serum albumin, at around 50 mg/mL, to signaling proteins such as IL-6 in the low pg/mL range [13]. These large differences in protein abundance mean that there is currently no perfect technique for quantifying a large number of analytes spanning this dynamic range.
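As a rough, back-of-the-envelope illustration of this span, using the approximate figures quoted above:

```python
import math

# Approximate plasma concentrations quoted above, converted to pg/mL
albumin_pg_per_ml = 50e-3 * 1e12  # 50 mg/mL  ->  5 x 10^10 pg/mL
il6_pg_per_ml = 5.0               # low-abundance cytokines sit in the low pg/mL range

# Roughly ten orders of magnitude separate the two
print(f"~{math.log10(albumin_pg_per_ml / il6_pg_per_ml):.0f} orders of magnitude")
```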
Proteomic approaches are an excellent companion in the search for novel neurodegenerative disease biomarkers. Recent improvements in the reproducibility and sensitivity of liquid chromatography tandem mass spectrometry (LC-MS/MS) instrumentation [16], coupled with the development of immunoassay-based single-molecule quantification and multiplexing [17], offer a wide range of tools, from hypothesis-free target discovery through to the accurate, sensitive, and simultaneous quantification of a small number of specific targets. While the available proteomic techniques together span most of the range of protein abundances in a complex biofluid, from ultrasensitive (~0.05 pg/mL) through to extremely abundant (~50 mg/mL), careful selection and design of experiments are important to maximize the likelihood of accurately quantifying a target of interest (Figure 1). In this review, we introduce the toolbox of techniques available to the biomarker researcher, discuss the advantages and disadvantages of the major technologies, and finally highlight some of the key discoveries to date in the field of protein biomarkers for neurodegeneration.
4. Considerations for Accurate and Reproducible Findings
If the field wishes to discover reliable, quantifiable biomarkers for neurodegenerative dementias, then data from multiple large studies across heterogeneous populations must be comparable. In the final section of this review, we will discuss some technical considerations important for the accuracy of these techniques, along with recommendations for reporting that will improve our ability to compare data and achieve sample sizes sufficient to draw population-level conclusions despite the inherent variability of human samples.
4.1. Preanalytical Effects
In addition to post-data-collection processing and platform-specific variability, preanalytical factors can affect the accuracy and reproducibility of measured analytes. The effects of preanalytical factors have already been systematically reviewed [126]. Here, we aim to emphasize the importance of standardizing these factors to ensure reliable measurements across multiple centers. Preanalytical factors are divided into two subgroups: in vivo and in vitro factors. These include, but are not limited to, collection methods and materials, hemolytic contamination of samples, sample handling, storage temperature, thaw conditions, sample stability prior to processing, and kit lot-to-lot variability [129]. Much has been written on the importance of collecting and storing CSF only in polypropylene plasticware, as polystyrene and other materials can bind “sticky” proteins such as amyloid-β or prion proteins [131]. Freeze-thaw cycles (the number of times a stored sample is thawed and refrozen) are often investigated as a cause of protein degradation over repeated uses [133]. Protein integrity varies across analytes and biofluids, and the maximum acceptable number of freeze-thaw cycles is specific to each analyte and platform, depending on detection sensitivity. Ideally, sample collection methods and times should be strictly controlled to minimize diurnal effects, and possible differences in analyte concentrations between fasting and nonfasting biofluids, which can affect levels of hormones, triglycerides, and other metabolism-related markers, should be accounted for. Levels of certain proteins may also vary widely from day to day, and it is therefore important to examine the biotemporal stability of an analyte before considering its use as a biomarker [23].
4.2. Matrix Effects
Biofluid composition is also an important consideration when using a multiplex immunoassay system. Matrix effects can negatively impact the ability of highly sensitive immunoassays to accurately quantify certain analytes [134]. As with label-free proteomics techniques, complex matrices with a high abundance of albumin and immunoglobulins can affect antibody binding and increase background, masking low-abundance proteins. These low-abundance proteins often lie near immunoassay limits of detection, making accurate quantification difficult. In some cases, such as with CSF, increasing the sample volume may allow these low-abundance proteins to be detected. However, for more complex biofluids, the sample matrix has been found to inhibit detection of certain analytes in spike-recovery experiments, and increasing sample volume would not improve quantification [135]. In a comparison of standards of known concentration spiked into immunoassay buffer versus serum and plasma matrices, measured analyte concentrations were significantly lower in the presence of either human sample matrix than in buffer. This inhibitory effect has been investigated in a number of other studies examining the quantification of low-abundance proteins in complex biofluids [136].
These sources of interference in immunoassay detection can lead to misinterpretation of assay results, which can affect clinical or research outcomes. Inhibitory effects may vary between immunoassay detection systems and contribute to inaccurate measurements, increasing the difficulty of comparing quantification across multiple platforms. Because of possible matrix effects, it is generally recommended that analyte quantification in undiluted samples be interpreted as relative rather than absolute; that is, the measurement should be interpreted in relation to other sample concentrations measured on the same platform. Dilution of samples in immunoassay buffers often improves quantification accuracy by mitigating such matrix effects, yielding more nearly absolute quantification. When evaluating a new immunoassay, it is important to consider possible sources of interference and to assess dilution linearity and spike-recovery performance in order to determine optimal sample conditions. Some assays may not be suited to analyte detection in all matrices, as each sample matrix requires individual optimization. For CSF, the dilution factors needed for absolute quantification can cause analyte measurements to fall below the limit of detection.
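To make these validation steps concrete, the sketch below illustrates how percent spike recovery and dilution linearity might be computed for a candidate assay. The numerical values are hypothetical, and acceptance criteria would need to come from the assay developer's own validation data.

```python
def percent_recovery(measured_spiked: float, measured_unspiked: float, spiked_in: float) -> float:
    """Percent recovery of a known spike in a sample matrix.

    (measured in spiked sample - endogenous level) / amount spiked in * 100.
    Values well below 100% suggest matrix interference.
    """
    return (measured_spiked - measured_unspiked) / spiked_in * 100.0


def dilution_linearity(measured: dict) -> dict:
    """Back-calculate neat concentrations from a dilution series.

    `measured` maps dilution factor -> concentration measured in the diluted
    sample; multiplying by the dilution factor should return roughly the same
    neat concentration at every dilution if the assay dilutes linearly.
    """
    return {factor: conc * factor for factor, conc in measured.items()}


# Hypothetical example: 100 pg/mL spiked into plasma
print(percent_recovery(measured_spiked=148.0, measured_unspiked=62.0, spiked_in=100.0))  # 86.0 (%)

# Hypothetical dilution series (dilution factor -> measured pg/mL)
print(dilution_linearity({2: 55.0, 4: 26.0, 8: 12.5}))  # back-calculated: 110, 104, 100 pg/mL
```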
4.3. Data Processing
A difficult challenge in standardizing data arises from the technical aspects of the proteomic workflow itself. The adoption of different quantification techniques for proteins of variable abundance makes comparison across studies difficult. LC performance can vary substantially over time and can introduce significant variability into an experiment [36]. Simple measures can be taken to improve monitoring of day-to-day instrument variability and to demonstrate instrument reliability, such as spiking samples with retention time calibrators and monitoring abundant peptides in automatic QC systems like AutoQC in Panorama [138].
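As a simple illustration of such monitoring, the sketch below flags calibrator peptides whose retention time drifts beyond an acceptable window. The peptide names and tolerance are hypothetical and would in practice be set from historical QC data for the instrument in question.

```python
# Hypothetical expected retention times (minutes) for spiked calibrator peptides
EXPECTED_RT = {"calib_pep_a": 12.4, "calib_pep_b": 27.9, "calib_pep_c": 41.3}
RT_TOLERANCE_MIN = 0.5  # assumed acceptable drift; derived from historical QC runs in practice


def flag_rt_drift(observed_rt: dict) -> list:
    """Return calibrator peptides whose retention time drifted beyond tolerance (or are missing)."""
    flagged = []
    for peptide, expected in EXPECTED_RT.items():
        observed = observed_rt.get(peptide)
        if observed is None or abs(observed - expected) > RT_TOLERANCE_MIN:
            flagged.append(peptide)
    return flagged


# Example QC run: peptide b has drifted by ~0.8 min and is flagged
print(flag_rt_drift({"calib_pep_a": 12.5, "calib_pep_b": 28.7, "calib_pep_c": 41.2}))  # ['calib_pep_b']
```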
How to appropriately normalize data and compare across studies is a more difficult problem with very little consensus, and the field should consider a series of questions. The first is whether input protein concentration should be normalized before proteomic quantification, as is standard in LC-MS/MS workflows, or whether the same volume of each fluid should be used per assay (as applies to ELISA workflows). The second is whether distribution-based normalization methods (e.g., median or quantile normalization) are appropriate in this context, given that they rest on the assumption that most analytes do not change in abundance between conditions and that roughly symmetric proportions of proteins increase and decrease in abundance. If the integrity of the blood-brain barrier is compromised by a neurodegenerative process, this may lead to proteome-wide increases in CSF protein concentration, invalidating the assumption that most proteins will not change in abundance between conditions [139]. Where panels of proteins have been selected precisely because they are likely to vary between disease conditions, the same assumption is also violated and distribution-based normalization may be rendered inappropriate. The alternative approach, selecting a subset of “housekeeping” proteins to which to normalize, is also problematic, as a number of studies have shown significant disease-related differences in the abundant biofluid proteins, which would be the most obvious candidates for selection. We would argue that there are currently insufficient high-quality data available to select a panel of normalizing peptides/proteins that are stable across neurodegenerative conditions, and establishing whether such stable proteins exist should be an additional priority of hypothesis-free proteomic experiments. The current gold standard in quantification and reproducibility may therefore be smaller-scale targeted experiments, in which ratiometric comparison to a heavy-labeled standard with proven linearity, or a standard curve allowing a concentration to be reported, may be the most reliable means of quantification. As targeted assays do not allow for hypothesis-free discovery, they are best used in replication cohorts to confirm findings that arise from untargeted methods.
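To illustrate the two strategies contrasted above, the sketch below shows median normalization, with its implicit assumption that most analytes are unchanged, alongside ratiometric quantification against a heavy-labeled internal standard. All values are hypothetical.

```python
import statistics


def median_normalize(sample_intensities: dict, reference_median: float) -> dict:
    """Scale one sample so that its median intensity matches a reference median.

    This behaves well only if most analytes are unchanged between samples; a
    proteome-wide shift (e.g., from blood-brain barrier leakage) violates that
    assumption and the scaling will remove real biological signal.
    """
    factor = reference_median / statistics.median(sample_intensities.values())
    return {protein: value * factor for protein, value in sample_intensities.items()}


def concentration_from_heavy_ratio(light_area: float, heavy_area: float,
                                   heavy_spike_conc: float) -> float:
    """Targeted, ratiometric quantification against a heavy-labeled internal standard.

    Assumes an equal instrument response for the light and heavy forms and a
    linear response over the relevant range (both to be demonstrated empirically).
    """
    return (light_area / heavy_area) * heavy_spike_conc


# Hypothetical example: the endogenous peptide gives 0.6x the signal of a 10 ng/mL heavy spike
print(concentration_from_heavy_ratio(light_area=3.0e5, heavy_area=5.0e5, heavy_spike_conc=10.0))  # 6.0 ng/mL
```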
4.4. Multisite Variability
It is important to conduct replication studies to assess intersite and interuser variability using the same platform and data-processing methods. Seemingly trivial or unnoticed differences in techniques, materials, or environmental conditions can affect results. It is not sufficient to assume that employing the same sample-processing procedures, the same multiplex assay kits or LC setup, and standardized data reporting will eliminate variability. In an extensive multisite study involving six different laboratories, Breen et al. [118] found that every analyte measured showed at least one significant laboratory or assay lot-to-lot effect, despite a consensus protocol being followed at all sites. Care should be taken to establish systems for determining assay reproducibility, such as including standardized plate-to-plate controls to minimize plate effects across multiple sites and ordering assays in batches to ensure lot consistency. Even so, controlling for every source of variability and assessing the performance of all available technologies and platforms is often unrealistic due to financial and resource limitations.
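As one simple way of operationalizing such controls, the sketch below computes the coefficient of variation of a shared control sample within and across sites and flags batches that exceed a preset limit. The values and acceptance limit are illustrative only and would be set during assay validation.

```python
import statistics


def percent_cv(values: list) -> float:
    """Coefficient of variation (%) of repeated measurements of the same control sample."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


# Hypothetical concentrations (pg/mL) of one shared control sample measured at each site
control_by_site = {
    "site_A": [101.0, 98.5, 103.2, 99.8],
    "site_B": [95.1, 97.4, 92.8, 96.0],
    "site_C": [110.5, 108.9, 112.3, 109.7],
}

MAX_ACCEPTABLE_CV = 15.0  # assumed acceptance limit; set per assay during validation

for site, values in control_by_site.items():
    print(f"{site}: within-site CV = {percent_cv(values):.1f}%")

# Pooling all sites exposes the between-site component of variability
pooled = [v for values in control_by_site.values() for v in values]
overall_cv = percent_cv(pooled)
print(f"overall CV = {overall_cv:.1f}%", "FLAG" if overall_cv > MAX_ACCEPTABLE_CV else "OK")
```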
5. Future Directions
Proteomics is a relatively new and rapidly growing field and has yet to develop clear standards for reporting data and consistent methods that allow confident comparison of datasets. The complexity of, and similarity between, neurodegenerative diseases means that studies of large, diverse populations are required to define biomarkers that are both sensitive and specific. It is therefore of critical importance that the field as a whole adopts stringent and detailed reporting criteria to build knowledge on a scale that will help delineate and stratify subjects across populations in a biologically informative manner. While proteomics-specific journals have begun to adopt set reporting criteria, clinical journals do not generally require this level of detail, and the field suffers as a result. At a bare minimum, every study should provide a data table that includes each peptide and/or protein confidently detected in the proteomic experiment (including retention time and m/z data for LC-MS/MS), its abundance in each individual sample, and per-group summary statistics (a minimal illustration is sketched at the end of this section). A list of significantly changed proteins with a fold change and p/q value is not sufficient for thorough examination of the data. As a field, a decision should be made to use a standardized protein reference, as switching between UniProt IDs [140], gene names, and other reference formats often leads to errors and data loss. We propose using both the Ensembl gene ID [141], which is clearly linked to a genomic locus and reference version, and a more descriptive identifier such as the gene symbol, for ease of interpreting results. Similarly, clinical and demographic data should be provided at the individual subject level to allow for modeling of age, sex, and other important demographic variables. The development and adoption of user-friendly resources such as the CSF Proteome Resource and the Plasma Proteome Database [14] to allow cross-study comparison is also critically important. Adoption of standards along these lines will likely allow the biomarker discovery pipeline to advance at a pace matching the rapid improvement of the discovery technologies themselves.
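As a purely illustrative example of the minimum per-analyte reporting described above, the sketch below defines one such record. The field names are our suggestion rather than an established standard, and the example values simply show the proposed pairing of a UniProt accession, Ensembl gene ID, and gene symbol alongside per-sample abundances.

```python
from dataclasses import dataclass, field


@dataclass
class AnalyteRecord:
    """One row of a minimal per-analyte reporting table for an LC-MS/MS study.

    Field names are illustrative suggestions only, not an established standard.
    """
    peptide_sequence: str
    uniprot_id: str                # e.g., "P10636" (tau)
    ensembl_gene_id: str           # stable, versioned genomic reference, e.g., "ENSG00000186868"
    gene_symbol: str               # human-readable identifier, e.g., "MAPT"
    retention_time_min: float
    precursor_mz: float
    abundance_by_sample: dict = field(default_factory=dict)  # per-sample values, not just group means

    def group_summary(self, group_samples: list) -> tuple:
        """Per-group mean and n, computed from the individual-sample abundances."""
        values = [self.abundance_by_sample[s] for s in group_samples if s in self.abundance_by_sample]
        return (sum(values) / len(values), len(values)) if values else (float("nan"), 0)


# Illustrative values only
record = AnalyteRecord(
    peptide_sequence="EXAMPLEPEPTIDEK",
    uniprot_id="P10636",
    ensembl_gene_id="ENSG00000186868",
    gene_symbol="MAPT",
    retention_time_min=35.2,
    precursor_mz=703.4,
    abundance_by_sample={"ctrl_01": 1.2e6, "ctrl_02": 1.1e6, "ad_01": 2.3e6},
)
print(record.group_summary(["ctrl_01", "ctrl_02"]))  # (1150000.0, 2)
```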