Transcriptomic and Proteomic Analysis of Clear Cell Foci (CCF) in the Human Non-Cirrhotic Liver Identifies Several Differentially Expressed Genes and Proteins with Functions in Cancer Cell Biology and Glycogen Metabolism

Clear cell foci (CCF) of the liver are considered to be pre-neoplastic lesions of hepatocellular adenomas and carcinomas. They are hallmarked by glycogen overload and activation of AKT (v-akt murine thymoma viral oncogene homolog)/mTOR (mammalian target of rapamycin)-signaling. Here, we report the transcriptome and proteome of CCF extracted from human liver biopsies by laser capture microdissection. We found 14 genes and 22 proteins differentially expressed in CCF, and the majority of these were expressed at lower levels in CCF. Using immunohistochemistry, the reduced expression of STBD1 (starch-binding domain-containing protein 1), USP28 (ubiquitin-specific peptidase 28), monad/WDR92 (WD repeat domain 92), CYB5B (cytochrome b5 type B), and HSPE1 (10 kDa heat shock protein, mitochondrial) was validated in CCF in independent specimens. Knockout of Stbd1, the gene coding for starch-binding domain-containing protein 1, in mice did not have a significant effect on liver glycogen levels, indicating that additional factors are required for glycogen overload in CCF. Usp28 knockout mice did not show changes in glycogen storage in diethylnitrosamine-induced liver carcinoma, demonstrating that CCF are distinct from this type of cancer model, despite the decreased USP28 expression. Moreover, our data indicate that decreased USP28 expression is a novel factor contributing to the pre-neoplastic character of CCF. In summary, our work identifies several novel and unexpected candidates that are differentially expressed in CCF and that have functions in glycogen metabolism and tumorigenesis.


Introduction
The early processes underlying human hepatocellular carcinogenesis are poorly understood. Very diverse conditions, such as the cirrhotic liver, non-cirrhotic liver with glycogen storage disease type I [1], as well as metabolic disorders (alpha-1-antitrypsin deficiency and hemochromatosis), obesity [2], hyperinsulinism, alcohol abuse, and type 2 diabetes mellitus [3,4], are known to be risk factors for hepatocellular carcinoma (HCC) development. While high-grade dysplastic nodules in liver cirrhosis are accepted as pre-neoplastic lesions of HCC [5], the situation in the absence of liver cirrhosis is less clear, even though 15-20% of HCC occurs in non-cirrhotic livers [6].
To better understand the mechanisms underlying carcinogenesis and to improve early diagnosis and treatment of HCC, it is important to better characterize the precursor stages, i.e., the pre-neoplastic lesions. This is increasingly important as primary liver cancer is the fifth most frequent malignancy worldwide and HCC arising in the background of type 2 diabetes and obesity is becoming more common [2].
In the human cirrhotic liver, different types of foci of altered hepatocytes were described by Bannasch: glycogen-storing foci (clear cell foci (CCF), with pale hematoxylin and eosin (HE) staining), mixed cell foci, and basophilic foci [7]. These foci are also well known in diverse animal models of hepatocarcinogenesis [8] and their progression to hepatocellular adenomas and HCC is well described [7,9-11]. Moreover, using the intraportal pancreatic islet transplantation model of hepatocarcinogenesis [12-15], we found the AKT (v-akt murine thymoma viral oncogene homolog)/mTOR (mammalian target of rapamycin) and the Ras (rat sarcoma)/MAPK (mitogen-activated protein kinase) pathways to be activated throughout the development of CCF to HCC, where they play important roles as major oncogenic downstream effectors of insulin signaling [16,17]. The lipogenic phenotype is characterized by increased lipogenesis and storage of lipid droplets. These alterations have also been described in human HCC, where they are associated with unfavorable prognosis [18,19]. Recently, we described that CCF in human non-cirrhotic livers share many molecular and metabolic characteristics with pre-neoplastic liver foci of the hormonal model of hepatocarcinogenesis after intraportal pancreatic islet transplantation [20]. Specifically, we found an increase in glycogen storage, reduced glucose-6-phosphatase activity, and an upregulation of enzymes regulating glycolysis, de novo lipogenesis, and beta-oxidation, as well as overexpression of the insulin receptor and activated AKT/mTOR and Ras/MAPK pathways in CCF from both human livers and the rat model [20]. Similarly, in the mouse, hepatocarcinogenesis is associated with activation of the insulin/AKT/mTOR signaling pathway, the transcriptional regulator ChREBP (carbohydrate-response element-binding protein) [16,21], as well as the lipogenic pathway [18,22,23].
Although these data hint at several pathways and regulators, a comprehensive inventory of gene and protein expression in human CCF is missing.
In the current work, we applied microarray analysis and proteomics after laser microdissection as unbiased approaches to identify and further characterize CCF of human non-cirrhotic liver parenchyma. To this end, we compared RNA and protein expression in CCF with neighboring tissue as a control and found several genes and proteins with significantly altered expression. Starch-binding domain-containing protein 1 (STBD1), ubiquitin-specific peptidase 28 (USP28), WD repeat-containing protein 92 (WDR92)/Monad, and heat shock protein family E (Hsp10) member 1 (HSP10) were among the candidates with the highest differential expression, and their expression was validated by immunohistochemistry.

More RNAs Have Reduced Expression in CCF Compared to Controls
After standard processing of the microarray dataset, as described in the Materials and Methods Section, we performed the following tests to identify any problems with sample quality, normalization, or signal quality. First, we determined how many transcripts were above the non-detection threshold in each sample and did not identify any samples to exclude (Supplementary Figure S1A). Also, the mean number of transcripts of control and CCF samples did not show any statistically significant differences (Supplementary Figure S1A,B; n = 18, Student's t-test, p (unpaired) = 0.104; p (paired) = 0.063). The distributions of log-transformed signal intensities per sample were quite similar between samples (Supplementary Figure S2). From these observations, we concluded that the quality of the data was acceptable for further analysis.
Using cluster analysis, samples clustered in a patient-dependent manner in most cases (Figure 1A), indicating that differences between patients were more extensive than those between CCF and control samples. A larger degree of heterogeneity is not uncommon for human tissue samples in general, and we suggest that transcripts with statistically significant differences between CCF and control samples are likely to be quite robust. On the other hand, we may miss differentially expressed genes due to the higher degree of noise in the data. Taking the inter-patient heterogeneity into account, we compared RNA expression between CCF and controls by calculating fold-changes (CCF/control) per patient. Fourteen transcripts (Table 1 and Figure 1B) had at least 2-fold higher or lower expression in CCF than in control samples, three of these with higher and 11 with lower expression in CCF samples. Interestingly, all three transcripts with increased expression in CCF coded for long non-coding RNAs (lnc-FOXG1-6:17, LINC01124:6, and LINC02290:27). However, very little is known about the function of any of these lncRNAs. LINC01124:6 is annotated as a bidirectional, 2129 bp long lncRNA encoded by one exon, while LINC02290:27 is an intergenic lncRNA of 463 bp length encoded on 4 exons (LNCIPEDIA v 5.2, www.lncipedia.org).

Figure 1. The transcriptome of CCF (clear cell foci) and control samples from human liver biopsy specimens. Control and CCF samples were laser-capture micro-dissected from cryosections and total RNA was used for microarray analysis. (A) Cluster analysis of the full dataset revealed larger inter-patient differences than differences between CCF and control samples. (B) Scatter plot of transcripts (mRNAs, miRNAs, and lncRNAs) with statistically significant differential expression in CCF vs. control samples. More transcripts show lower expression in CCF than in control samples. Horizontal red line: p-value = 0.05; vertical red lines: FC = 2-fold down- or up-regulation, respectively.

Ten of the eleven genes with lower expression in CCF than in control samples are protein-coding. The gene with the lowest expression was POU2AF1 (encoding POU class 2 associating factor 1), which is important for the regulation of B-cell maturation [40]. The other downregulated genes have diverse functions in actin binding (FHOD3), calcium-signaling (CALML3), prefoldin-like complex and signaling components (monad/WDR92), protein N-acetylgalactosaminyl transfer (GALNT6), G protein-coupled receptor-mediated signaling (GPR174), nucleotide binding (CNBD1), protein de-ubiquitination (USP28), and as yet unidentified functions (C15orf48, ZNF880, and the long non-coding RNA lnc-C15orf41-2:1) (Table 1).
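The per-patient fold-change comparison and the cutoffs drawn in Figure 1B can be illustrated with a minimal Python sketch. The function names, the toy signal values, and the choice of averaging per-patient log2 ratios are assumptions made for illustration and are not taken from the authors' pipeline; only the 2-fold and p = 0.05 cutoffs come from the text.

```python
import math

def paired_log2_fold_changes(ccf, control):
    """Return log2(CCF/control) for each patient (paired samples)."""
    if len(ccf) != len(control):
        raise ValueError("paired data need equal numbers of samples")
    return [math.log2(c / k) for c, k in zip(ccf, control)]

def mean_fold_change(ccf, control):
    """Average the per-patient log2 ratios and convert back to a linear fold change."""
    logs = paired_log2_fold_changes(ccf, control)
    return 2 ** (sum(logs) / len(logs))

def classify(fold_change, p_value, fc_cutoff=2.0, alpha=0.05):
    """Label a transcript using the cutoffs drawn in Figure 1B."""
    if p_value >= alpha:
        return "not significant"
    if fold_change >= fc_cutoff:
        return "up"
    if fold_change <= 1.0 / fc_cutoff:
        return "down"
    return "below FC cutoff"

# Hypothetical normalized signals for one transcript in three patients.
ccf_signal = [100.0, 120.0, 80.0]
control_signal = [400.0, 480.0, 320.0]
fc = mean_fold_change(ccf_signal, control_signal)  # every per-patient ratio is 0.25
label = classify(fc, p_value=0.01)                 # hypothetical p-value
```

Averaging on the log2 scale treats a 2-fold increase and a 2-fold decrease symmetrically, which is one common way to summarize paired ratios; other summaries are possible.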

More Proteins Have Reduced Expression in CCF Compared to Control Samples
Using mass-spectrometric analysis, we obtained quantitative data of 995 to 2253 proteins in 14 samples (seven control + seven CCF samples from the same patients), with an average of 1474 proteins identified per sample. A total of 504 proteins was identified in all 14 samples (Supplementary Figure S3A). The average number of identified proteins in control and CCF samples did not differ (means/geometric means: 1583/1559 and 1365/1315 proteins; control and CCF, respectively. Supplementary Figure S3B; Student's t-test, p (unpaired) = 0.300). Also, there was no statistically significant difference between the number of identified proteins from CCF and control samples per patient (Supplementary Figure S3C; paired Student's t-test, p = 0.137). The normalized log2-transformed expression data approximated normal distribution in all samples (Supplementary Figure S4).
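The difference between the unpaired and the paired Student's t-test used for these comparisons can be sketched as follows. This is a minimal illustration with hypothetical protein counts, not the study data, and it only computes the t statistic and degrees of freedom; obtaining a p-value would additionally require the t distribution, for example from a statistics library.

```python
import math
from statistics import mean, stdev, variance

def paired_t(x, y):
    """t statistic and degrees of freedom for paired samples (x_i - y_i)."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

def unpaired_t(x, y):
    """Student's t statistic (pooled variance) and degrees of freedom."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    t = (mean(x) - mean(y)) / math.sqrt(pooled * (1 / nx + 1 / ny))
    return t, nx + ny - 2

# Hypothetical numbers of identified proteins for three patients.
control = [1500, 1700, 1600]
ccf = [1400, 1500, 1300]
t_paired, df_paired = paired_t(control, ccf)
t_unpaired, df_unpaired = unpaired_t(control, ccf)
```

The paired test works on within-patient differences, so it removes the between-patient variation that dominates the unpaired comparison.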
As in the cluster analysis of the microarray data, cluster analysis of the proteomic data could not clearly separate control and CCF samples into distinct groups (Figure 2A). Therefore, the higher heterogeneity between patients than between CCF and control samples was also present at the protein level. Furthermore, more proteins were expressed at a lower level in CCF compared to control samples (Figure 2B and Table 2). The expression of three proteins was decreased more than 2-fold: these were cytochrome b5 type B (CYB5B, 2.3-fold lower), mitochondrial 10 kDa heat shock protein (HSP10/HSPE1; 2.2-fold lower), and starch-binding domain-containing protein 1 (STBD1; 2.2-fold lower). Further, 19 proteins showed a downregulation of more than 1.5-fold (Figure 2B and Table 2). In contrast, there were only three proteins with more than 1.5-fold increased expression: these were GSTM4, RAB12, and RAB35. No proteins with ≥2-fold higher expression in CCF than in controls were identified.

To validate data obtained with these high-throughput methods, the protein expression of two microarray candidates (monad/WDR92 and USP28) and five proteomics candidates (the downregulated STBD1, CYB5B, and HSPE1, as well as the two upregulated RAB12 and RAB35) was analyzed in liver sections containing CCF using immunohistochemistry. Reduced expression of monad/WDR92, USP28, STBD1, CYB5B, and HSPE1 in CCF was confirmed (Figure 3 and Table 3). Increased expression of RAB12 and RAB35 in CCF could not be verified, as in most samples there was either no difference between CCF and the surrounding tissue, or a slightly lower expression in CCF (Table 3 and Figure 3). As negative controls, we used proteins with unchanged expression, like TRAP1, Cullin-3, ACSL4, COPS7A, and A-Raf. These proteins had fold changes close to one (1.37, 1.25, 1.11, 1.03, and 1.02, respectively) in the proteomic dataset. Immunohistochemical analysis confirmed no clear difference in expression of these proteins when comparing CCF to surrounding tissue (specimens from 12-15 patients analyzed, Supplementary Figure S5).

From these results, we conclude that the proteomic data from laser-capture micro-dissected samples reflect the protein composition of CCF and surrounding tissue. Our immunohistochemical data of WDR92/monad and USP28 indicate that the observed differences in protein expression in CCF and surrounding tissue could be due to regulation at the transcript level, as the mRNAs coding for these proteins were reduced in CCF according to microarray analysis.

Table 3. Validation of protein expression as predicted by high-throughput analysis (see Tables 1 and 2 for fold changes) in CCF by immunohistochemical analysis. Arrows indicate higher (↑), lower (↓), and unchanged (←→) expression in CCF compared to unaltered liver tissue. Numbers in the immunohistochemistry column denote the number of individual patient samples analyzed.

Protein | High-Throughput Analysis | Immunohistochemistry

Loss of STBD1 in Mice Does Not Cause Glycogen Accumulation in the Liver
Starch-binding domain-containing protein 1 (STBD1) is N-terminally anchored within the membrane of the endoplasmic reticulum and binds glycogen via its C-terminally located family 20 starch binding module [41]. An Atg8 family interacting motif (AIM) and interaction with GABARAPL1 (gamma-aminobutyric acid receptor-associated protein-like 1) implicate STBD1 in autophagic glycogen degradation, so-called glycophagy [42]. In a mouse model of Pompe disease, deletion of STBD1 suppressed lysosomal glycogen accumulation, suggesting that STBD1 is involved in the transfer of glycogen from the cytoplasm to lysosomes [43].
The reduced expression of STBD1 in CCF observed in the present study could contribute to the hepatocellular glycogen accumulation. To test whether loss of STBD1 would result in glycogen accumulation in the liver, glycogen concentrations were determined in livers of nine-month-old male wild-type (WT) and Stbd1-knock-out (KO) mice with access to food throughout or fasted for 16 h prior to sacrifice. Fasting resulted in significant reductions of glycogen in both WT and Stbd1-KO mice (Figure 4). However, there was no significant difference between WT and Stbd1-KO mice under either condition (Figure 4). From these results, we conclude that under fasting and normal conditions, loss of STBD1 does not have a significant effect on glycogen degradation in otherwise healthy mice.

Diethylnitrosamine (DEN)-Induced Hepatocellular Carcinomas in Usp28-KO Mice Do Not Accumulate More Glycogen than in Control Mice
Knockout of Usp28 in mice promotes liver carcinogenesis in diethylnitrosamine (DEN)-injected mice. Although the mechanism has been reported to involve p53 through 53BP1 [44-46], no major impact of this regulatory axis was found in the DEN-induced HCCs of Usp28-KO mice [25]. As USP28 levels were downregulated in CCF, USP28 could mediate its effects at least partially via glycogen regulation. To investigate whether loss of USP28 may affect glycogen metabolism in carcinoma, we compared glycogen accumulation in DEN-induced HCC of WT and Usp28-KO mice using Periodic acid-Schiff (PAS) staining. However, no difference in PAS staining could be detected, suggesting that lack of USP28 does not cause glycogen differences in CCF (Figure 5).



Discussion
In this study, we identified novel genes and proteins that are differentially regulated in human CCF in comparison to the surrounding liver tissue to better understand the processes leading to glycogen accumulation and the possible tumor development. We found only a small number of genes/proteins with significant changes higher than two-fold. The biggest challenge was the heterogeneity of the samples, which may be due to e.g. different origins of tumors, patient medication, dietary status or age. Although desirable, it was not feasible to further increase the number of samples, as the number of appropriate specimens is limited. Furthermore, laser-capture micro-dissection of CCF is very labor-intensive: for each specimen, about 90 CCF and control samples had to be excised to obtain sufficient material for RNA and protein extraction. Hence, even though the statistical support of the omics data is not very strong and the datasets show high degrees of heterogeneity, the top scoring candidates do seem to be robust, as we were able to validate the differential expression of several candidates by immunohistochemistry using specimens from independent patient material.
Comparing the microarray and the proteomics data, no protein was found for which the respective RNA was changed accordingly. Therefore, these proteins may not be regulated at the transcript level to the same degree that they are regulated at the protein level. Since the genome-wide correlation between RNA level and protein expression is estimated to be 40-50% [47], the identified proteins may belong to the 50-60% of proteins that are mainly regulated at the post-transcriptional level.
Notably, in both the RNA and protein datasets, more genes/proteins were downregulated than upregulated in CCF, and the ratios of upregulated versus downregulated hits were similar (RNA: 3 up, 11 down (FC cutoff 2-fold), protein: 6 up, 22 down (FC cutoff 1.5-fold)). This could indicate a general non-specific replacement of cellular components by glycogen. However, we did not find indications that signal intensities of RNA and proteins were generally lower in CCF samples compared with control samples. Hence, we conclude that the differences in RNA and protein levels we observed between CCF and surrounding tissue were due to regulatory processes within cells.

Non-Coding RNAs
We found several non-coding RNAs to be differentially expressed in CCFs compared to controls. Unfortunately, no function is known for any of these. However, lnc-FOXG1-6:17 is an antisense lncRNA of PRKD1 encoded by two exons (LNCIPEDIA v 5.2, www.lncipedia.org). According to current understanding, antisense transcripts are implicated in gene regulation, which can act in cis through transcriptional interference, double-stranded RNA/RNA masking, double-stranded RNA/RNA adenine to inosine editing, or double-stranded RNA/RNA interference. Cis and trans antisense lncRNAs may affect gene expression more globally through chromatin modifications (reviewed in Reference [48]). Hence, it is not clear if and how lnc-FOXG1-6:17 would affect the expression of PRKD1, which codes for Serine/threonine-protein kinase D1 (PRKD1). PRKD1 itself is implicated in cell proliferation, cell motility, invasion, protein transport, and apoptosis [49], and would represent an interesting candidate in regard to tumor development and progression in CCF. Its RNA and protein expression, though, is low in liver when compared to other tissues (human protein atlas [50-52]), and its mRNA level was not altered, according to our microarray analysis. Hence, a direct and strong effect of lnc-FOXG1-6:17 on PRKD1 expression seems unlikely.

STBD1
Glycogen accumulation in CCF is usually explained by decreased gluconeogenesis due to lower activity of glucose-6-phosphatase, increased glycogen synthesis, and reduced glycogen degradation mediated by insulin/AKT-signaling [7]. In our current work, we observed reduced expression of starch-binding domain-containing protein 1 (STBD1) in CCF compared to control/surrounding tissue in the proteomic dataset and validated this finding by immunohistochemistry. STBD1 is an N-terminally membrane-anchored [53] glycogen-binding [41,54] protein that localizes glycogen to perinuclear sites/ER, late endosomes, and lysosomes, and is most abundantly expressed in muscle and liver [41]. The lower expression of STBD1 that we observed in CCF together with its involvement in the lysosomal glycogen degradation pathway [43] suggests that this reduction of STBD1 expression could be an additional factor contributing to glycogen accumulation in CCF. However, a significant glycogen accumulation in the livers of Stbd1-KO mice was not observed. This was unexpected, as 10% of glycogen degraded in the liver passes through the lysosomal pathway [55] and the loss of STBD1 should therefore result in an appreciable accumulation of glycogen in the liver. One explanation for the lack of this glycogen accumulation is the compensation of STBD1-mediated glycogen translocation into lysosomes by cytoplasmic glycogen degradation. Alternative mechanisms of glycogen translocation are also possible, as the loss of STBD1 in a mouse model of lysosomal glycogen overload (acid alpha-glucosidase-KO mice) did not completely antagonize lysosomal glycogen accumulation [43]. Hence, we conclude that reduction of STBD1 in CCF may only contribute to glycogen accumulation in combination with yet unidentified factors that favor lysosomal glycogen degradation.
As large glycogen granules are preferentially degraded via the lysosomal pathway [55], it would be of interest to determine whether glycogen granules in CCF are of a larger size than in normal liver tissue. An alternative interpretation of the reduction of STBD1 in CCF could be a rerouting of carbohydrate use. Reducing the utilization of glycogen for lysosomal degradation and glucose export would result in increased intracellular glucose availability. Indeed, cancer cells have been shown to route glucose through glycogen, and this glucose was able to support the pentose phosphate pathway more effectively [56,57].

USP28
With regard to hepatocellular carcinogenesis, USP28 is of particular interest. Functionally, USP28 is a deubiquitinating enzyme, catalyzing the deubiquitination of target proteins [58,59], thereby counteracting ubiquitin-dependent proteasomal protein degradation. Known target proteins of USP28 are 53BP1 [44-46], claspin [59], MYC [60], LSD1 [61], histone H2A [62], and HIF-1alpha [63]. In particular, lack of USP28 results in an earlier onset and greater tumor burden in a mouse model of chemically (diethylnitrosamine (DEN)) induced hepatocellular carcinoma [25]. The same study also reports that USP28 expression is reduced in patients with hepatocellular carcinoma when comparing carcinoma with control liver tissue from the same patient [25], in line with the data from the current study. Functions of USP28 in different cancer cell types reveal a variety of cancer-relevant effects of USP28: it elicits stem-cell-like characteristics through LSD1-stabilization [61], it acts on p53 and GATA4, affecting cellular senescence [64], it affects cell proliferation through deubiquitination of histone H2A [62], and it sensitizes cells to DNA damage via its interactions with 53BP1 and claspin [59]. Finally, USP28 has been targeted in several studies for the treatment of different cancers, such as non-small cell lung cancer, breast cancer, intestinal cancers, gliomas, and bladder cancer [65]. Our data (reduced USP28 mRNA in CCF, validated at the protein level by immunohistochemistry) reveal changes in USP28 expression in small human hepatocellular foci, providing further evidence for CCF being very early lesions in hepatocellular carcinogenesis or pre-neoplastic lesions, respectively.
As we did not find any evidence that a lack of USP28 would induce glycogen overload in DEN-induced hepatocellular carcinoma, we propose that the reduced expression of USP28 preferentially affects the reported p53/senescence pathway axis to promote tumorigenesis and cancerogenesis in CCF rather than having an additional function in regulating glycogen metabolism. To our knowledge, this is the first report of altered USP28 expression in early clear cell lesions, and it will be interesting to investigate the mechanisms that trigger the reduced USP28 expression in CCF. Our results of reduced USP28 mRNA expression (microarrays) and reduced USP28 protein expression (IHC) suggest that the differential regulation is mediated at the transcript level. What causes the differential USP28 expression and what effects it has on CCF progression to tumors will be the focus of future studies.

Conclusions
With our omics approach, we identified several new genes/proteins that show differential expression in CCF. Initial functional studies indicate that reduced expression of STBD1 may not have a direct effect on glycogen levels despite its role in glycogen degradation. Most likely, other factors also contribute and need to be identified in dedicated studies.
Furthermore, the reduced expression of USP28, a gene known to affect onset and progression of liver and breast cancer [25], is likely to mediate glycogen-independent oncogenic functions. Our work is the first to show that the expression of USP28 is already decreased in CCF, underscoring their pre-neoplastic character.

Human Liver Specimens
Liver samples originated from a former cohort [20] without signs of liver cirrhosis, obtained from human liver resections taken during surgery. Specimens were from patients (age ranging from 42 to 77 years) with liver metastases of tumors of different origin (two colon carcinomas, two neuroendocrine tumors, one gastrointestinal stromal tumor) or with cholangiocellular carcinoma (n = 4). Tissue samples (1.5 × 1.5 × 0.5 cm) were collected and frozen in liquid nitrogen-cooled isopentane and stored at −80 °C until cryosectioning. Experiments were reviewed and permitted by the ethical committee of the Universitaetsmedizin Greifswald (No. BB 67/10).

Cryosectioning
Cryosections were made at −16 to −20 °C using a Cryostar NX (Thermo Scientific, Waltham, MA, USA) disinfected with Leica Cryofect Disinfectant Spray (Leica, Wetzlar, Germany) before use. All equipment was cleaned with RNase AWAY (Molecular Bio Products, San Diego, CA, USA) and glassware was additionally rinsed with RNase-free water (Aqua B. Braun, Ecotainer, B. Braun Melsungen AG, Melsungen, Germany). A new blade was used for every sample to reduce the risk of cross-contamination. Samples were attached to an object-holder with 0.9% sodium chloride solution (NaCl 0.9%; B. Braun Melsungen AG, Melsungen, Germany). Specimens with clear cell foci were identified by H&E staining of 8 µm sections and validated by PAS reaction. In regions with CCF, two to four consecutive cryosections with a thickness of 12 µm were put on membrane-slides (Leica Frame Slides, Nuclease and human nucleic acid free, PET-Membrane, 1.4 µm, Leica, Wetzlar, Germany) and incubated for one minute in 70% ethanol at −16 °C. Membrane slides were collected in a box, vacuum-sealed (Severin FS 3602, Severin Elektrogeräte GmbH, Sundern, Germany), and stored at −80 °C until laser micro-dissection.

Laser Micro-Dissection
CCF and control samples were isolated by laser micro-dissection using a Leica LMD 6500 System (Leica Microsystems, Wetzlar, Germany) wiped with RNase AWAY (Molecular Bio Products, San Diego, CA, USA). Sample boxes were thawed for 30 min on ice before staining the cryosections according to a modified H&E-staining protocol. Briefly, sections were incubated in DEPC-water (0.1% diethyl-pyrocarbonate, Sigma-Aldrich, St. Louis, MO, USA) for 10 s, stained with hemalaun for 50 s, and washed for 10 s with DEPC-water before staining with eosin for 10 s. Slides were then incubated in 90% ethanol for 30 s.
Material from the same patient and same sample type was collected in one tube and stored on dry ice during collection. For storage, 800 µL TRIzol Reagent (Life Technologies, Carlsbad, CA, USA) was added and samples were stored at −80 °C.
Nine sample pairs (N = 9) were of suitable quality for microarray analysis and seven sample pairs (N = 7) for proteomic analysis.

RNA and Protein Isolation
Samples (on average 91 (CCF) and 96 (control) dissected tissue pieces per patient) were homogenized in 4 mL PTFE vials pre-cooled in liquid nitrogen, with one 8 mm stainless-steel bead, using a Micro-Dismembranator (Sartorius AG, Göttingen, Germany) at 2600 rotations per minute (RPM) for two min.
The homogenate was transferred into a 1.7 mL centrifuge tube (Sorenson Bioscience Inc., Murray, UT, USA) and the homogenization vials were flushed with another 200 µL TRIzol. Samples stored on dry ice were thawed at room temperature for ten min, then centrifuged at 4 °C and 12,000× g for ten min (Heraeus Fresco 17 Centrifuge Refrigerated, Thermo Scientific, Waltham, MA, USA) to remove crude debris.
RNA and proteins were isolated using TRIzol extraction according to the manufacturer's protocol with the following changes: RNA was precipitated with isopropanol overnight at −20 °C, RNA was washed twice with 70% ethanol, and pellets were resuspended by incubation on water ice for three hours, followed by 30 min of incubation at room temperature.

Microarray Analysis
Processing of purified RNA for microarray analysis and the microarray analysis were carried out at OakLabs (Hennigsdorf, Germany) according to their standard procedures. Briefly, RNA concentrations were between 17 and 66 ng/µL in volumes of 30 or 40 µL H2O with integrity numbers (RIN, Bioanalyzer, Agilent Technologies, USA) between 6.3 and 7.9 (Supplementary Table S1). RNA was labeled using the Low-Input QuickAmp Labeling Kit (Agilent Technologies, USA) and cRNA was hybridized with ArrayXS Human (OakLabs, Germany) at 65 °C for 17 h using the Agilent Gene Expression Hybridization Kit (Agilent Technologies, USA), washed once with Agilent Gene Expression Wash Buffer 1 for one minute at room temperature, followed by a second wash with preheated (37 °C) Gene Expression Wash Buffer 2 for one minute.
Microarrays were scanned with a SureScan Microarray Scanner (Agilent Technologies, USA) and Agilent's Feature Extraction software was used to detect features. Signals from control probes were removed, and the means of signals from replicate probes and of signals from all probes of a target were determined before normalization of the background-subtracted signals. Data from all samples were quantile normalized using ranked mean quantiles [66]. Normalized data were statistically analyzed by paired analysis of variance (ANOVA) (clear cell foci vs. control samples from the same patient).
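The quantile normalization step can be illustrated with a minimal Python sketch. It is not the OakLabs pipeline: the function name, the toy signal values, and the simple handling of ties (tied values keep their input order) are assumptions made for illustration only. Each sample is sorted, the mean signal at each rank is computed across samples, and every value is then replaced by the mean belonging to its rank.

```python
def quantile_normalize(samples):
    """Quantile-normalize equal-length lists of signals (one list per sample)."""
    n_probes = len(samples[0])
    # Sort each sample, then average across samples at each rank.
    sorted_cols = [sorted(col) for col in samples]
    rank_means = [
        sum(col[r] for col in sorted_cols) / len(samples)
        for r in range(n_probes)
    ]
    normalized = []
    for col in samples:
        # Probe indices ordered by ascending signal; ties keep input order
        # (a simplification, real implementations often average tied ranks).
        order = sorted(range(n_probes), key=lambda i: col[i])
        out = [0.0] * n_probes
        for rank, probe_idx in enumerate(order):
            out[probe_idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Hypothetical background-subtracted signals for four probes in three samples.
raw = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
norm = quantile_normalize(raw)
```

After normalization, all samples share the same set of values and differ only in which probe carries which value, which is what makes the distributions directly comparable.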

Gel Electrophoresis and Silver Staining
Protein concentrations were determined by Bradford assay (Quick Start Bradford Protein Assay, Bio-Rad Laboratories Inc., Hercules, CA, USA) (average concentration: 1.24 ± 0.44 µg/µL). To assess protein integrity, 13.7-20 µg total protein per sample was resolved under reducing conditions on Novex NuPAGE gels (4-12% bis-tris protein gels) using LDS sample buffer and MES running buffer according to the manufacturer's protocol (Life Technologies, Carlsbad, CA, USA). Gels were stained using the Silver Staining Plus Kit and visualized on the ChemiDoc XRS+ system (Bio-Rad Laboratories Inc., Hercules, CA, USA) according to the manufacturer's protocol.

Proteomics Sample Preparation and LC-MS/MS
Sample preparation and mass-spectrometric analysis were carried out by the proteomic facility of Porto Conte Ricerche (Alghero, Italy). Protein extracts were subjected to on-filter reduction, alkylation, and trypsin digestion according to the filter-aided sample preparation (FASP) protocol, with slight modifications. Briefly, protein extracts were diluted in 8 M urea in Tris-HCl 100 mM pH 8.8, and buffer was exchanged using Microcon Ultracel YM-10 filtration devices (Millipore, Billerica, MA, USA). Proteins were reduced in 10 mM dithiothreitol (DTT) for 30 min, alkylated in 50 mM iodoacetamide for 20 min, and washed five times (3× in 8 M urea and 2× in ammonium bicarbonate) before trypsin digestion on the filter (1:100 enzyme-to-protein ratio) at 37 °C overnight. Peptides were collected by centrifugation, followed by an additional wash with an elution solution (70% acetonitrile plus 1% formic acid). Finally, the peptide mixture was dried and reconstituted in 0.2% formic acid to an approximate final concentration of 1 µg/µL. Peptide mixture concentration was estimated by measuring absorbance at 280 nm with a NanoDrop 2000 spectrophotometer (Thermo Scientific, San Jose, CA, USA) and a standard curve made from MassPREP E. coli Digest Standard (Waters, Milford, MA, USA).
Liquid chromatography with tandem mass spectrometry (LC-MS/MS) analyses were carried out using a Q Exactive mass spectrometer (Thermo Scientific) interfaced with an UltiMate 3000 RSLCnano LC system (Thermo Scientific). After loading, peptide mixtures (4 µg per run) were concentrated and desalted on a trapping pre-column (Acclaim PepMap C18, 75 µm × 2 cm nanoViper, 3 µm, 100 Å, Thermo Scientific), using 0.2% formic acid at a flow rate of 5 µL/min. The peptide separation was performed at 35 °C using a C18 column (EASY-Spray column, 15 cm × 75 µm ID, PepMap C18, 3 µm, Thermo Scientific) at a flow rate of 300 nL/min, using a 485 min gradient from 1 to 50% eluent B (0.2% formic acid in 95% acetonitrile) in eluent A (0.2% formic acid in 5% acetonitrile). MS data were acquired using a data-dependent top10 method dynamically choosing the most abundant precursor ions from the survey scan, under direct control of the Xcalibur software (version 1.0.2.65 SP2), where a full-scan spectrum (from 300 to 1700 m/z) was followed by tandem mass spectra (MS/MS). The instrument was operated in positive mode with a spray voltage of 1.8 kV and a capillary temperature of 275 °C. Survey and MS/MS scans were performed in the Orbitrap with resolution of 70,000 and 17,500 at 200 m/z, respectively. The automatic gain control was set to 1,000,000 ions and the lock mass option was enabled on a protonated polydimethylcyclosiloxane background ion as internal recalibration for accurate mass measurements. The dynamic exclusion was set to 30 s. Higher Energy Collisional Dissociation (HCD), performed at the far side of the C-trap, was used as the fragmentation method, by applying a 25 eV value for normalized collision energy, and an isolation width of m/z 2.0. Nitrogen was used as the collision gas.
Peptide identification was performed using Proteome Discoverer (version 1.4; Thermo Scientific) with Sequest-HT as the search engine for protein identification, according to the following criteria: Database UniprotKB, taxonomy human (release 2014_10); Precursor mass tolerance: 10 ppm; Fragment mass tolerance: 0.02 Da; Static modification: cysteine carbamidomethylation; Dynamic modification: methionine oxidation; Percolator for peptide validation (false discovery rate (FDR) < 1% based on peptide q-value). Results were filtered in order to keep only rank 1 peptides, and protein grouping was allowed according to the maximum parsimony principle.
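To make the scale of the two mass tolerances concrete, the following short Python sketch converts a relative tolerance in ppm into an absolute window. The function names and the example mass of 1500 Da are hypothetical and serve only as illustration; this is not part of the Proteome Discoverer workflow.

```python
def ppm_window(mass, tolerance_ppm):
    """Return the (low, high) mass window for a relative tolerance in ppm."""
    delta = mass * tolerance_ppm / 1e6
    return mass - delta, mass + delta


def within_ppm(observed, theoretical, tolerance_ppm=10.0):
    """True if the observed mass deviates from theory by at most tolerance_ppm."""
    return abs(observed - theoretical) / theoretical * 1e6 <= tolerance_ppm


# Hypothetical precursor mass of 1500 Da with the 10 ppm tolerance from the text:
# the window is 1500 +/- 0.015 Da.
low, high = ppm_window(1500.0, 10.0)
```

A ppm tolerance therefore scales with the mass of the precursor, whereas the 0.02 Da fragment tolerance is a fixed absolute window.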
Protein abundance was expressed by means of the normalized spectral abundance factor (NSAF). NSAF was calculated as follows: NSAF = SAF_i/N, where the subscript i denotes a protein identity and N is the total number of proteins, while SAF is a protein spectral abundance factor that is defined as the protein spectral counts divided by its length (number of residues or molecular weight). In this approach, the spectral counts of each protein were divided by its length and normalized to the average number of spectral counts in a given analysis. In order to eliminate discontinuity due to spectral counts = 0, a correction factor, set to 2, was used. The NSAF log ratio (RNSAF) was calculated according to the following formula: RNSAF = log2(NSAF1/NSAF2), where RNSAF is the log2 ratio of the abundance of a protein in sample groups 1 (clear cell foci, NSAF1) and 2 (control, NSAF2).
Proteins showing RNSAF > 0.5 or < −0.5 were considered as differentially abundant between groups. A two-tailed t-test was applied, using in-house software, in order to evaluate the statistical significance of differences between groups. No correction for multiple testing was applied. To address the potentially higher rate of false-positive hits, the most interesting candidates were validated by immunohistochemistry.
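The abundance calculation and the RNSAF cut-off can be sketched in Python as follows. This is an illustrative sketch, not the in-house software. It follows the formula as written in the text (SAF divided by the total number of proteins N) and assumes that the correction factor of 2 is added to every spectral count, a detail the text does not state. All function names and example values are hypothetical. The t-test is not reproduced here.

```python
import math


def saf(spectral_count, length, correction=2.0):
    """Spectral abundance factor: corrected spectral count divided by length.

    Assumption: the correction factor is added to the raw spectral count.
    """
    return (spectral_count + correction) / length


def nsaf(saf_value, n_proteins):
    """NSAF as written in the text: SAF_i / N."""
    return saf_value / n_proteins


def rnsaf(nsaf_ccf, nsaf_control):
    """log2 ratio of NSAF in CCF (group 1) over control (group 2)."""
    return math.log2(nsaf_ccf / nsaf_control)


def classify(rnsaf_value, threshold=0.5):
    """Apply the RNSAF > 0.5 or < -0.5 criterion."""
    if rnsaf_value > threshold:
        return "higher in CCF"
    if rnsaf_value < -threshold:
        return "lower in CCF"
    return "not differentially abundant"


# Hypothetical protein of 300 residues in runs with 1000 identified proteins.
nsaf_ccf = nsaf(saf(4, 300), 1000)     # 4 spectra in CCF
nsaf_ctrl = nsaf(saf(10, 300), 1000)   # 10 spectra in control
ratio = rnsaf(nsaf_ccf, nsaf_ctrl)     # log2((4 + 2) / (10 + 2)) = -1.0
label = classify(ratio)
```

With the correction factor, a protein with zero spectral counts in one group still yields a finite log ratio, which is the discontinuity the correction is meant to avoid.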

Cluster Analysis and Plotting of Omics Data
Cluster analysis of proteomic and microarray data was done using Perseus Software (version 1.6.0.7). Microarray data: quantile-normalized data were filtered by intensity (signal ≥ 10, in 8 out of 9 samples within one group; groups: normal and CCF). Settings for hierarchical clustering: Distance-Euclidean; Linkage-Average (process with k-means, number of clusters: 300, maximal number of iterations: 10, number of restarts: 1).
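The intensity filter applied before clustering can be illustrated with a small Python sketch. It is not the Perseus implementation, and it assumes the criterion means "at least 8 of 9 samples in at least one group"; the function name and the example signals are hypothetical.

```python
def passes_intensity_filter(signals_by_group, min_signal=10.0, min_samples=8):
    """Keep a probe if, in at least one group, at least `min_samples`
    samples have a signal of `min_signal` or higher."""
    return any(
        sum(1 for s in signals if s >= min_signal) >= min_samples
        for signals in signals_by_group.values()
    )


# Hypothetical probe: expressed in normal tissue, mostly below threshold in CCF.
probe = {
    "normal": [25, 31, 18, 40, 12, 22, 19, 9, 27],  # 8 of 9 samples >= 10
    "CCF": [4, 6, 11, 3, 8, 5, 12, 7, 2],           # 2 of 9 samples >= 10
}
keep = passes_intensity_filter(probe)
```

Filtering per group rather than across all samples retains probes that are detectable in only one condition, which are exactly the candidates of interest in a CCF versus control comparison.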
Data plotting was done using R (3.5.2) with the following packages: ggplot2, ggrepel, and gridExtra.

Immunohistochemistry and Histochemistry
Histochemistry of formaldehyde-fixed and paraffin-embedded specimens was performed as previously described [14].
Immunohistochemistry was carried out on formaldehyde-fixed and paraffin-embedded specimens according to standard immunohistochemical protocols for deparaffinization and embedding. Endogenous peroxidase was blocked with Novocastra™ Peroxidase Block (#RE7101, Leica Biosystems) for 15 min at room temperature and sections were blocked with Universal Block (Dako) for 20 min. Primary antibodies and unmasking techniques are listed in Supplementary Table S2. LSAB2 System-HRP (#K0675, Dako) and Liquid DAB+ Substrate Chromogen System (#K3468, Dako) were used for signal amplification and staining.

Mouse Models
Stbd1-KO mice [43] and the respective control animals were bred and housed at Duke University, USA. All animal procedures were done in accordance with Duke University Institutional Animal Care and Use Committee-approved guidelines.
Sections of Usp28-KO mice were prepared as described [25].

Glycogen Quantification in Mouse Liver
Glycogen was quantified in mouse liver as described previously [67].
Supplementary Materials: The following are available online, Figure S1: Number of transcripts per sample, Figure S2: Histograms of log2 transformed RNA expression, Figure S3: Number of proteins identified per sample, Figure S4: Histograms of log2 transformed protein expression, Figure S5: Immunohistochemistry of negative controls, Table S1: RNA concentration and quality, Table S2: Antibodies used in this study.