Proteomics encompasses a multitude of techniques and technologies that are applied to a variety of scientific questions. However, often only a subset of these techniques can be applied to a specific question because each comes with limitations. For example, harsh surfactants cannot be used when studying protein complexes and interactions, which can leave some complexes insoluble and therefore unanalyzed. One of the most common applications of proteomics is the quantification of the abundance of ‘proteins’, a term that encompasses proteoforms, Open Reading Frame (ORF) products and protein complexes. Due to the sheer number of distinct proteoforms, their range of abundances and the variability of this abundance across cells, tissues and environmental conditions, the accurate quantification of abundance changes is challenging and subject to bias and errors.
Normalization is defined as the process of returning something to a normal condition or state. In system-wide -omics analysis, normalization tries to account for bias or errors, systematic or otherwise, to make samples more comparable [1] and to measure differences in the abundance of molecules that are due to biological changes rather than bias or errors. In proteomics, whether gel-based or Liquid Chromatography-Mass Spectrometry (LC-MS)-based, a great deal of work has gone into software solutions that attempt normalization towards the end of an acquisition, using either gel densitometry images or ion intensity values from technical replicates. At this point, normalization aims to overcome differences in staining intensity or ionization efficiency that are often beyond the direct control of the researcher. However, there are many earlier points in the experimental process at which normalization of some type could be applied to make the samples more comparable. Normalization is traditionally understood to be a function that is performed post acquisition of data to account for random variance and “batch effects” (which will be discussed later). However, when considering the actual function of normalization, namely enabling proper proportionate comparison of different biological samples, the methods that can achieve this are highly varied. Simple approaches, such as performing protein-level quantitation prior to sample digestion, can easily be considered normalization steps. Additionally, normalization of samples can introduce robustness by increasing the reproducibility of measurements within technical replicates. A further consideration is that normalization needs to be applied across biological replicates and treatments. With all these considerations in mind, the term normalization in proteomics, and indeed in other ‘omics’-style system-wide analyses, becomes more of a “strategy” or experimental design approach than a single technique.
Therefore, when deciding how best to remove bias and systematic error, numerous appropriate methodologies are available, but it is unclear which should be applied.
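As an illustration of protein-level quantitation acting as a normalization step before digestion, the following minimal Python sketch computes the lysate volume that carries the same protein mass for each sample. The function name, sample names and concentrations are hypothetical and serve only to show the arithmetic.

```python
def equalize_protein_input(conc_ug_per_ul, target_ug):
    """Return, per sample, the lysate volume (in µL) containing target_ug of protein.

    conc_ug_per_ul: dict mapping sample name -> protein concentration (µg/µL)
    target_ug:      protein mass (µg) to take forward from every sample
    """
    volumes = {}
    for sample, conc in conc_ug_per_ul.items():
        if conc <= 0:
            raise ValueError(f"non-positive concentration for {sample}")
        # volume = mass / concentration
        volumes[sample] = target_ug / conc
    return volumes


# Hypothetical protein assay results (µg/µL), for illustration only
concentrations = {"control_1": 2.0, "control_2": 2.5, "treated_1": 1.0}
volumes = equalize_protein_input(concentrations, target_ug=50.0)
```

Taking equal protein mass forward in this way removes loading differences early, but it does not correct for any bias in the protein assay itself.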
In our own work, we have often wondered whether the strategies we apply throughout our experimental work to minimize variation and normalize data are actually achieving that objective. The wide range of biological samples that pass through our Core Facility means that some of the strategies developed by other researchers to normalize data in specific samples or situations do not apply to the samples that we analyze. In this article, we review the means of normalization and the potential multi-technique approaches that can be used to achieve a normalized workflow, while also pointing out our misgivings about the way those means and workflows are applied. In some cases, we have no solutions but would like to flag the issue to the wider proteomics community so that some form of consensus may be reached.
2. Cellular Normalization
The quantification of proteins by relative abundance or fold-change in comparison to a reference or control is arguably no longer sufficient to infer cellular events of biological significance [2]. Therefore, absolute quantification based on protein copies per cell has become a desirable objective in recent years [4]. The key advantage of absolute quantification is the ability to compare absolute values across multiple published data sets, though this is currently unachievable. In addition, normalizing such measurements is technically challenging due to many confounding factors, such as the estimation of cell number in a sample, the degree of cell variability and the changing protein expression between individual cells within the same population. Although it may be argued that these problems can be overcome with methods such as Stable Isotope Labelling by Amino acids in Cell culture (SILAC) [5], where label incorporation occurs in the cell and therefore allows cellular-level normalization, this technique incurs an additional expense per experiment and is susceptible to many of the issues outlined below, such as the choice of parameter to normalize to. Additionally, label-free quantitative methods are also desirable due to the greater range of proteome coverage achieved when more than one peptide is required for identification [6].
As highlighted by Wiśniewski et al. (2014) [3], current methods of determining protein copy number per cell require an accurate estimate of the number of cells in the sample. In microbial culturing, a common method of measuring cells is via the Optical Density (OD) at 600 nm, whereby the amount of light scattered is proportional to the number of cells present. However, it has been shown that fluorescent reporter proteins that absorb light at the same wavelength introduce a significant bias [7]. While it has been suggested that this can be corrected by measuring OD at the alternative wavelength of 700 nm, the remaining issue is that the method yields an estimate of cell abundance rather than an absolute value [8]. Furthermore, cell viability is not taken into account, which is particularly problematic for experiments in which treatments or conditions may significantly impact microbial viability [9].
In tissue culture procedures, manual counting with a haemocytometer remains commonly employed by many laboratories due to its cost-effective nature and wide applicability to virtually any homogeneous cell culture. However, because the task is arduous, estimates are based on a relatively small sample and are subject to the judgement of the laboratory analyst. Automated image-based cell counters reduce this bias considerably but rely on software algorithms that must accurately identify a cell type based on cell size and morphology. Therefore, aggregates of cells, which can be characteristic of many cell lines, are unlikely to be detected by these algorithms [10]. Alternatively, flow cytometry with commercially available counting beads allows rapid and accurate quantification of cells, with the additional advantage of being able to differentiate between cell types in complex or heterogeneous samples [11]. However, this technique is less versatile due to the reduced availability of relevant bead standards, which are considerably more expensive than manual techniques. In addition, this technique requires a high level of technical expertise. Another caveat is that a single-cell suspension is fundamental for flow cytometry, and therefore significant processing of tissues and adherent cell lines is required to avoid the formation of aggregates [12].
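The arithmetic behind the two counting approaches discussed above can be sketched as follows. This is a minimal illustration and not a protocol: the haemocytometer function assumes a chamber whose counted squares are 1 mm × 1 mm with a depth of 0.1 mm (0.1 µL per square), and the bead-based function assumes a known number of counting beads spiked into a known sample volume. All function names and example values are hypothetical.

```python
def haemocytometer_cells_per_ml(counts_per_square, dilution_factor=1.0):
    """Estimate cells/mL from counts in 1 mm x 1 mm squares of a 0.1 mm deep chamber."""
    if not counts_per_square:
        raise ValueError("at least one square must be counted")
    mean_count = sum(counts_per_square) / len(counts_per_square)
    # Each square holds 0.1 µL = 1e-4 mL, hence the factor of 1e4 to reach cells/mL.
    return mean_count * dilution_factor * 1e4


def bead_based_cells_per_ul(cell_events, bead_events, beads_added, sample_volume_ul):
    """Estimate cells/µL from flow cytometry events using spiked-in counting beads."""
    if bead_events <= 0 or sample_volume_ul <= 0:
        raise ValueError("bead events and sample volume must be positive")
    # The cell-to-bead event ratio scales the known bead concentration.
    return (cell_events / bead_events) * (beads_added / sample_volume_ul)


# Hypothetical values, for illustration only
manual_estimate = haemocytometer_cells_per_ml([48, 52, 50, 50], dilution_factor=2)
bead_estimate = bead_based_cells_per_ul(20000, 5000, 50000, 100)
```

Because the manual estimate is extrapolated from a handful of squares, a miscount of a few cells per square is multiplied by the same factor of 10^4, which is one way the small sample size discussed above translates into uncertainty.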
There are numerous methods to either count or estimate the number of cells in a sample. However, none of these methods typically accounts for the inherent variability present within a homogeneous cell population. For example, estimations based on cell volume fail to take into account differences in cell size, which in turn is affected by cell cycle stage [13]. These issues can be circumvented by the “proteomic ruler” approach. This methodology shows that there is a proportional relationship between DNA mass, cell counts, and the cumulative MS signal of histone-derived peptides. Therefore, accurate measurement of the DNA content of a proteomic sample allows this relationship to be exploited to predict the copy number per cell of an individual protein based on its MS signal [3]. A rapid method for measuring DNA is absorbance at 260 nm (A260), although the accuracy of this measurement can be compromised by the presence of protein, meaning that DNA must be effectively isolated from the sample. The purity of the DNA can, however, be assessed by calculating the A260/A280 ratio, whereby a ratio of approximately 1.8 is indicative of an uncontaminated DNA sample [14]. While the proteomic ruler is a promising normalization method, it does not account for differences in protein abundance between cells of the same population, which result from stochastic or pulsatile gene expression [15]. In addition, it is well documented that histone mRNA and protein biosynthesis increase 35-fold during the S-phase of the cell cycle [16]. As such, the proportion of cells in S-phase within a culture at the time of harvest could result in an artificially high histone MS signal and calculated cell count. Furthermore, this disparity is increased in complex samples, such as whole tissue lysates with multiple cell types, or mixed samples in which more than one organism is present. Importantly, it has not been determined whether these are legitimate issues with the proteomic ruler. One practice that can mitigate the skewing of data by S-phase fluctuations is to work with homogeneous cell lines that are synchronized by serum starvation prior to treatments, keeping the supplemental fetal calf serum at a low percentage [17], inducing cell senescence or, if feasible, working with terminally differentiated cells. However, this is not useful for studying cell lines normally utilized for experiments or animal models.
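The calculations referred to in this paragraph can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the published implementation: it assumes that the histone mass per cell approximately equals the DNA mass per cell, so that the histone MS signal can be used to scale any protein's signal to a mass per cell, and it uses the conventional factor of 50 ng/µL per A260 unit for double-stranded DNA at a 1 cm path length. The function names, the purity tolerance and the example values are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def dna_conc_ng_per_ul(a260, dilution_factor=1.0, path_cm=1.0):
    """dsDNA concentration from A260, using 50 ng/µL per absorbance unit at 1 cm."""
    return a260 * 50.0 * dilution_factor / path_cm


def is_dna_pure(a260, a280, target=1.8, tolerance=0.1):
    """Flag a DNA preparation as acceptably pure if A260/A280 is close to ~1.8.

    The tolerance is an arbitrary illustrative choice, not a published cut-off.
    """
    return abs(a260 / a280 - target) <= tolerance


def copies_per_cell(protein_signal, histone_signal, dna_mass_per_cell_pg, mw_da):
    """Estimate protein copies per cell from MS signals (proteomic-ruler style).

    Assumes histone mass per cell ~ DNA mass per cell (a simplification).
    """
    # Protein mass per cell (pg), scaled by the protein-to-histone signal ratio
    protein_mass_pg = dna_mass_per_cell_pg * protein_signal / histone_signal
    protein_mass_g = protein_mass_pg * 1e-12
    # mass (g) / molar mass (g/mol) gives moles; times Avogadro gives molecules
    return protein_mass_g / mw_da * AVOGADRO


# Hypothetical example: a 50 kDa protein carrying 1% of the histone signal
# in a cell assumed to contain 6.5 pg of DNA
example_copies = copies_per_cell(1.0, 100.0, 6.5, 50_000)
```

In this formulation any inflation of the histone signal, for example from a high proportion of S-phase cells, lowers every estimated copy number proportionally, which is why the cell cycle concerns raised above matter.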
Similar normalization methods based on genomic DNA copy numbers have also emerged, particularly in relation to metabolomics [18]. These, like the proteomic ruler, rely on the assumption that the genome size and ploidy are known. This may pose an issue with plant matter and some microorganisms due to polyploidy or conversion between ploidy states [19], exemplifying a bias of these normalization protocols towards mammalian systems. In these circumstances, flow cytometry is required to determine ploidy. However, flow cytometry does not account for conversion between ploidy states, which is known to be affected by several factors [21]. In addition, copy number variation, which is widely reported in disease states [22], would significantly influence such normalization methods.
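To show how sensitive DNA-based normalization is to the assumed ploidy, the following minimal sketch converts a measured DNA mass into an estimated cell number. It uses the widely quoted conversion of roughly 0.978 × 10^9 base pairs per picogram of double-stranded DNA; the function names, genome size and DNA mass are hypothetical values chosen for illustration.

```python
BP_PER_PG = 0.978e9  # approximate base pairs per picogram of dsDNA


def dna_mass_per_cell_pg(haploid_genome_bp, ploidy):
    """DNA mass per cell (pg) for a given haploid genome size and ploidy."""
    return ploidy * haploid_genome_bp / BP_PER_PG


def cells_from_dna(total_dna_pg, haploid_genome_bp, ploidy):
    """Estimated number of cells that contributed the measured DNA mass."""
    return total_dna_pg / dna_mass_per_cell_pg(haploid_genome_bp, ploidy)


# Hypothetical organism with a 0.978 Gbp haploid genome and 1 µg (1e6 pg) of DNA
genome_bp = 0.978e9
assumed_diploid = cells_from_dna(1e6, genome_bp, ploidy=2)
actually_tetraploid = cells_from_dna(1e6, genome_bp, ploidy=4)
```

In this example, assuming diploidy for a sample that is in fact tetraploid doubles the estimated cell number, and any per-cell quantity derived from it would be halved accordingly.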
As mentioned, SILAC is a highly reliable method for protein quantification and has been reported to have minimal errors [23]. The errors that can occur are the in vivo conversion of labelled arginine into other amino acids (typically proline [24]), incomplete incorporation of isotopic amino acids, and errors in sample mixing. The conversion of arginine to proline can negatively influence the accuracy of protein quantification. The extent of labelling and the amino acid conversion profile could be problematic for primary cells, as many of these do not proliferate well and consequently may not achieve consistent label incorporation for particular proteins, which could lead to further errors. One solution that has been presented is to incorporate label-swap replicates to increase the reliability of expression ratios, but this would need to be used together with post-analysis statistical normalization, which has its own limitations [25].
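Two of the quantities discussed above lend themselves to a short worked example. The sketch below is an illustration under simple assumptions rather than the procedure of any particular software package: label incorporation is estimated as the heavy fraction of the total signal for a peptide from fully labelled cells, and a label-swap pair is combined by averaging the log2 ratio of the forward experiment with the inverted log2 ratio of the reverse experiment. The function names and values are hypothetical.

```python
import math


def incorporation_efficiency(heavy_intensity, light_intensity):
    """Fraction of a peptide's signal found in the heavy channel (0 to 1)."""
    total = heavy_intensity + light_intensity
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return heavy_intensity / total


def label_swap_log2_ratio(forward_h_over_l, reverse_h_over_l):
    """Combine a label-swap pair into one log2(treated/control) estimate.

    forward: treated cells carry the heavy label, so H/L = treated/control
    reverse: treated cells carry the light label, so H/L = control/treated
    """
    forward = math.log2(forward_h_over_l)
    reverse = -math.log2(reverse_h_over_l)  # invert to treated/control
    return (forward + reverse) / 2.0


# Hypothetical values, for illustration only
efficiency = incorporation_efficiency(98.0, 2.0)
combined = label_swap_log2_ratio(4.0, 0.25)
```

A bias that inflates the heavy channel in both experiments pushes the forward and inverted reverse ratios in opposite directions, so it largely cancels in the average, which is the rationale for label-swap replicates.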
3. Normalization During and After Lysis and Protein Extraction
A key part of any proteomics workflow is sample disruption for the extraction of protein. How these steps are conducted depends heavily on the experimental aim, the separation technology to be used (electrophoresis or chromatography, for instance), and the chemistry of the sample [26]. It is outside the scope of this review to describe all conceivable strategies. However, it is important to note that the chemicals used and the handling of these initial stages of sample disruption are well documented to affect downstream results significantly (for example, carbamylation of lysine caused by excessive heating of urea solutions [28]), and therefore the uniformity and type of sample handling at these stages are important when considering the effectiveness of normalization strategies [29].
There are several critical points in cell lysis: protein extraction, separation, solubilization and removal of unwanted contaminants or molecules, and each is a point where bias can be introduced. Using cell lines as an example, the workflow is as follows: the sample is harvested from the growth media and centrifuged. It can then be washed and resuspended in the relevant lysis buffer [30] before progressing to protein extraction and ultimately to analysis. The first step must be completed at a consistent time point across biological replicates and samples (typically as measured by cellular normalization strategies) to avoid carrying over variance in growth time that could affect normalization strategies during and after lysis and protein extraction [31]. This is because biological variance, such as different growth times, usually carries a greater bias effect than technical variance [32].
Furthermore, an important consideration is to limit random chemical modifications that can be introduced during sample handling. Endogenous proteases can often be a source of random cleavage events, which will in turn affect the efficacy of subsequent deliberate proteolytic cleavage [33]. The approach most commonly used in proteomics is to add protease inhibitors to samples as soon as tissues are dissociated [34] or, in the case of liquid-based samples, at the point of collection [35].
In our own work, we assume that once the chosen lysis and protein extraction protocol has been deemed fit for purpose, uniform and accurate handling is the most reliable method of limiting technical variance at this stage. Nonetheless, more research into other normalization strategies at the time of, or just after, sample disruption would be beneficial, as there are currently few, if any, experimental studies explicitly evaluating how normalization is affected by these factors at this early stage.