We performed a panoramic study of human proteins and their proteoforms in a cancer cell line (HepG2) and in normal liver tissue. Some of these data have been published previously [7,19]. We generated the lists of proteins identified in liver and HepG2 cell extracts by two approaches: treatment with trypsin according to the FASP protocol [23], and separation according to pI/Mw by 2DE followed by sectional analysis of the gel by ESI LC-MS/MS. A total of 20,462 proteoforms encoded by 3773 genes were identified in HepG2 cells [7], and 14,667 proteoforms encoded by 3305 genes were identified in liver [19]. Here, we present further analyses of these data. The numbers of proteins detected by these methods are summarized in Figure 1.
The bottom ellipses show the number of proteins (genes) detected by shotgun mass spectrometry using the FASP protocol (left ellipse: liver, 1221; right ellipse: HepG2 cells, 1467). Only 666 proteins were detected in both liver and HepG2 cells, while 555 proteins were detected only in liver and 801 were detected only in HepG2 cells. This limited overlap reflects both the detection sensitivity of our experiment and the levels of proteins in liver and HepG2 cells: some proteins are abundant enough to be detected in both samples, whereas others are detected only in liver and not in HepG2 cells, or vice versa. This interpretation is supported by the experiments using sectional analysis (top ellipses), in which many more proteins were detected. Using sectional analysis, a total of 1920 proteins were detected in both liver and HepG2 cells (including many that had previously been detected only in liver or only in HepG2 cells). Again, many proteins were detected only in liver (1385) or only in HepG2 cells (1853). Concerning sensitivity, it is relevant to stress that only 293 proteins were detected in HepG2 cells but not in liver (167 in the reverse case) using both types of experiments. This is consistent with the sensitivity issue described above. It is also interesting to compare our data with those published by Wiśniewski et al. [14]. Most of the abovementioned 293 proteins (detected in HepG2 cells only) were also identified by Wiśniewski et al. [14], who further showed that their levels are much higher in HepG2 cells than in hepatocytes. Interestingly, despite the greater sensitivity of detection and the larger number of proteins detected by Wiśniewski et al. [14], they did not detect 30 of these 293 proteins (Supplementary Table S1).
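The overlap counts quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis: the function names `totals` and `jaccard` are our own, and the Jaccard index is added only as one convenient way to express the overlap. It confirms that the per-region counts sum to the per-sample totals reported in the text and computes the fraction of the 293 HepG2-only proteins that were also identified by Wiśniewski et al.

```python
# Counts taken from the text (proteins/genes per Venn region).
shotgun = {"both": 666, "liver_only": 555, "hepg2_only": 801}
sectional = {"both": 1920, "liver_only": 1385, "hepg2_only": 1853}


def totals(counts):
    """Return (liver total, HepG2 total, union) for one two-set Venn diagram."""
    liver = counts["both"] + counts["liver_only"]
    hepg2 = counts["both"] + counts["hepg2_only"]
    union = liver + counts["hepg2_only"]
    return liver, hepg2, union


def jaccard(counts):
    """Shared proteins divided by all proteins seen in either sample."""
    return counts["both"] / totals(counts)[2]


# Liver and HepG2 totals match the ellipse sizes (1221, 1467)
# and the gene counts from sectional analysis (3305, 3773).
shotgun_totals = totals(shotgun)
sectional_totals = totals(sectional)

# Of the 293 proteins detected only in HepG2 cells, 30 were not
# detected by Wisniewski et al.; the rest were.
hepg2_only_robust = 293
missed_elsewhere = 30
shared_fraction = (hepg2_only_robust - missed_elsewhere) / hepg2_only_robust

if __name__ == "__main__":
    print("shotgun (liver, HepG2, union):", shotgun_totals)
    print("sectional (liver, HepG2, union):", sectional_totals)
    print(f"Jaccard shotgun:   {jaccard(shotgun):.3f}")
    print(f"Jaccard sectional: {jaccard(sectional):.3f}")
    print(f"fraction also seen by Wisniewski et al.: {shared_fraction:.3f}")
```

The sectional totals reproduce the 3305 and 3773 genes given for liver and HepG2 cells, so the two sets of figures in the text are mutually consistent.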
The main part of this study is a set of proteoform profiles that we generated by combining 2DE with LC ESI-MS/MS. We present these profiles as 3D graphical images. Some profiles are very similar in both samples and contain only one or two proteoforms (peaks). Often, however, proteins have many proteoforms, and for some of them the profiles differ markedly between liver and HepG2 cells. The most abundant peak usually has pI/Mw coordinates that agree with the theoretical ones. The profiles of some proteins contain an exceptionally large number of proteoforms; most of these come from HepG2 cells. Keeping in mind the cancerous nature of these cells, we paid special attention to proteins that are already used, or are under consideration for use, as tumor biomarkers. Of note, the list of such markers is very long [26]. Since our object here is HCC, we narrowed the analysis to biomarkers for this tumor (Table 1).
The best-known protein, and the only one approved for clinical use as a marker for HCC, is alpha-fetoprotein (FETA) [27,28]. FETA levels in serum may increase with hepatocyte regeneration and with the development of HCC [29]. It remains the most commonly used screening biomarker for HCC [10,28]. However, increased serum levels of FETA may also result from other liver diseases (hepatitis, liver cirrhosis, etc.), which decreases the specificity of FETA testing for HCC. Furthermore, FETA is not expressed at high levels in all HCC patients, which decreases sensitivity. Importantly, although FETA is not always a good marker for HCC, one of its proteoforms serves as a more specific biomarker. A fucosylated form of serum alpha-fetoprotein (AFP) is most closely associated with HCC. This proteoform is designated AFP-L3 and is used as a more specific biomarker for HCC [30]. In our case, 18 proteoforms of FETA were detected in HepG2 extracts (Figure 2). Even more proteoforms (35) were observed when sectional analysis with higher resolution was applied (Figure 3). In liver extract, this protein was not detected with sufficient reliability (at least two significant sequences). This is in line with its use as an HCC biomarker. Several other proteins from the list of HCC biomarkers (Table 1) were detected in HepG2 cells only (GPC3, FUCO2, KITH, SRC, SRPK1) (Figure 2). Other proteins were detected in both samples (Figure 4). For instance, the profiles of heat shock protein beta 1 (HSPB1) and fibrinogen gamma chain (FIBG) are very similar, whereas those of HSP74, ANXA2, ZA2G, CYB5, PGRC1, CATB, and HPT differ. In all cases, many proteoforms are present in HepG2 cells but not in liver, and vice versa. For instance, in the case of haptoglobin (HPT), which exhibits decreased levels in HCC [11], we observed a strong simplification of the profile in HepG2 cells compared to liver (Figure 4). In the case of heat shock protein beta 1 (HSPB1) and annexin A2 (ANXA2), the profiles are very similar in liver and HepG2 cells but show a clear anodic shift of peaks in HepG2 cells, which may be due to phosphorylation, a known PTM for these proteins [41,42]. Zinc-alpha-2-glycoprotein (ZA2G) is characterized by a set of more than 30 different proteoforms distributed all around the gel. Many of these proteoforms have a greater Mw than the theoretical Mw (this protein can be heavily glycosylated [43]).
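The comparison of experimental and theoretical pI/Mw coordinates used in this paragraph can be illustrated with a short sketch. It is not the procedure used to build the profiles. The class `Spot`, the function `describe`, the tolerances `pi_tol` and `mw_tol`, and all numerical values are hypothetical and serve only to show how a proteoform might be labeled as anodically shifted (more acidic pI, as expected after phosphorylation) or as heavier than predicted (as expected after glycosylation).

```python
from dataclasses import dataclass


@dataclass
class Spot:
    """One proteoform position on the 2DE gel (hypothetical structure)."""
    pi: float      # experimental isoelectric point
    mw_kda: float  # experimental molecular weight, kDa


def describe(spot, theo_pi, theo_mw_kda, pi_tol=0.2, mw_tol=0.10):
    """Label a spot relative to the theoretical pI/Mw of the protein.

    pi_tol (pH units) and mw_tol (fraction of the theoretical Mw) are
    arbitrary illustrative thresholds, not values from the study.
    """
    labels = []
    d_pi = spot.pi - theo_pi
    if d_pi < -pi_tol:
        labels.append("anodic shift (more acidic pI)")
    elif d_pi > pi_tol:
        labels.append("cathodic shift (more basic pI)")
    rel_mw = (spot.mw_kda - theo_mw_kda) / theo_mw_kda
    if rel_mw > mw_tol:
        labels.append("Mw above theoretical")
    elif rel_mw < -mw_tol:
        labels.append("Mw below theoretical")
    return labels or ["congruent with theoretical pI/Mw"]


if __name__ == "__main__":
    # Hypothetical protein with theoretical pI 6.0 and Mw 23.0 kDa.
    theo_pi, theo_mw = 6.0, 23.0
    for spot in (Spot(6.05, 23.5), Spot(5.6, 23.0), Spot(6.0, 30.0)):
        print(spot, "->", describe(spot, theo_pi, theo_mw))
```

A fixed tolerance is the simplest possible choice. In practice the acceptable deviation would depend on the resolution of the gel sections.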