Integrated Proteomics Based on 2D Gel Electrophoresis and Mass Spectrometry with Validations: Identification of a Biomarker Compendium for Oral Submucous Fibrosis—An Indian Study

Oral Submucous Fibrosis (OSMF) is a chronic debilitating disease more frequently found in the South East Asian population. This disease is a public health priority, as it is grouped under oral potentially malignant disorders, with malignant transformation rates of around 7 to 13%. Hence, early identification of high-risk OSMF patients is of the utmost importance to prevent malignant transformation. Proteomic expression profiling is a promising method for identifying differentially expressed proteins for disease prognosis and risk stratification in OSMF. In this study, overexpressed proteins in OSMF, OSMF transformed into oral squamous cell carcinoma (OSCC) and normal tissues were evaluated by proteomic analysis using two-dimensional electrophoresis (2DE) and mass spectrometry, which revealed 23 upregulated proteins. Validation was done using immunohistochemistry for three secretory proteins, namely 14-3-3ε (n = 130), carbonic anhydrase 1 (CA 1) (n = 125) and heat shock protein 70 (HSP 70) (n = 117), which showed significant overexpression in OSMF and OSCC compared to normal tissue. To the best of our knowledge, the present study is the first of its kind in India to assess the altered expression of proteins in OSMF and in OSMF that has undergone malignant transformation, providing better knowledge of the molecular pathways involved in disease progression. The current study shows that the biomarkers studied can be potentially useful for risk stratification of OSMF to OSCC and may serve as novel targets for therapeutic intervention. Clinical validation of the targets can further pave the way for precision medicine to improve the quality of life of OSMF patients.


Introduction
Oral submucous fibrosis is a chronic, potentially malignant disorder of the oral mucosa, which is more widespread in the Indian subcontinent [1,2]. The main etiological determinant is the areca nut, which stimulates fibroblast proliferation and collagen synthesis and also reduces collagen breakdown. It contains alkaloids, namely arecoline, arecaidine, guvacine and guvacoline, which undergo nitrosation to form nitrosamines, which in turn alkylate DNA, leading to malignant transformation on prolonged exposure. Listed as a Group I carcinogen by the International Agency for Research on Cancer, the areca nut is a common component of betel quid, which is predominantly used in Southeast Asia [3]. Among the various oral potentially malignant disorders (OPMD), OSMF has a significant malignant transformation rate, ranging from 7 to 13% [4]. Chewing areca nut causes continuous local irritation, leading to injury-related chronic inflammation, oxidative stress, and cytokine production. Oxidative stress and the successive creation of reactive oxygen species (ROS) induce cell proliferation, cell aging or apoptosis, depending on the amount of ROS production. In the event of chronic exposure, these events lead to preneoplastic changes in the oral cavity and, subsequently, to oral malignancy [5]. Furthermore, the epithelial to mesenchymal transition has been found to be involved in the pathogenesis of OSMF. The betel quid-induced tissue injury releases ROS, mediating TGF-β-induced epithelial to mesenchymal transition and playing an important role in the fibrosis of OSMF [6].
Studies on proteomic analysis in OSMF are sparse in the existing literature. An earlier study performed in oral cancers progressing from OSMF showed that ANXA4 and FLNA have great prognostic value for patient survival and could be potential targets for therapeutic interventions [7]. In another study, two-dimensional electrophoresis-based proteomic approaches were used to detect the differentially expressed proteins between OSMF and normal tissue. A total of 88 proteins with altered expression levels were identified, and cyclophilin A was proposed as a potential biomarker and therapeutic target for OSMF [8]. In addition, a study conducted in India showed differentially expressed proteins between OSMF and normal tissue by a proteome analysis with two-dimensional electrophoresis and Matrix-Assisted Laser Desorption Ionization Imaging Time of Flight (MALDI TOF) mass spectrometry, which showed that 15 proteins were upregulated and that 10 proteins were downregulated in the OSMF tissues compared to normal tissue [9]. MALDI-IMS-based proteome analysis, used to analyse the differences in protein expression between OSCC tissues and adjacent non-cancerous OSMF tissues, showed nine differently expressed proteins, of which NCOA7 was shown to be upregulated in OSCC tissues by immunohistochemical staining and Western blotting and correlated with the clinicopathological parameters [10].
With the current existing knowledge, studies concerning proteomic profiling in OSMF are sparse. The present study shows the proteomic expression profile in normal, OSMF and OSCC samples, with validation of the top upregulated proteins using immunohistochemistry (IHC) for a better understanding of protein targets. Our study shows comprehensive differentially expressed profiles of OSMF and OSMF along with OSCC in the same sample compared with the normal samples. The targets have been additionally validated in a large series of clinical samples.

Patient Tissue Samples
The study was approved by the Institutional Ethical Committee (IEC No. 19/DEC/72/118) and was conducted at the Department of Oral Medicine and Radiology, Sri Ramachandra Institute of Higher Education and Research, from December 2019 until January 2021. Written informed consent was obtained from all participants, followed by the collection of tissue samples. Patient demographic details, medical history, habits and details of clinical examination were recorded in the proforma. Patients with no habit of chewing areca nut and no clinical signs of OSMF were included in the normal group, and mucosal tissue was collected during the extraction of the third molar. Patients who were clinically diagnosed with OSMF with histological confirmation and patients who were clinically and histologically diagnosed with OSCC along with pre-existing OSMF were included in their respective study groups. In addition, formalin-fixed paraffin-embedded (FFPE) sections from normal, OSMF and OSCC were obtained for validation studies. All the tissue samples were snap-frozen and stored in liquid nitrogen until used for RNA and protein extraction.

Protein Extraction from Tissue Samples
Tissue extracts were made by crushing the pooled samples (Normal = 10, OSMF = 12, OSMF with OSCC = 6) with liquid nitrogen in a pre-chilled mortar and pestle, and they were then dissolved in a lysis buffer (7 M urea, CO(NH2)2; 2 M thiourea, SC(NH2)2; 4% CHAPS; 20 mM phenylmethylsulfonyl fluoride; and 20 mM dithiothreitol). Tissue samples were then sonicated for 10 min and centrifuged for 15 min at 4 °C at 12,000 rpm [11]. The extracted proteins were quantified using the Bradford method before being aliquoted and stored at −80 °C for further analysis.

2D Gel Electrophoresis
Two-dimensional electrophoresis of the treated proteins was performed as previously stated [12]. In the first dimension, 13 cm IPG strips of pH 3-10 (GE Healthcare, Uppsala, Sweden) were used, and an active/passive rehydration process was carried out. Proteins were focused for 50,000 Vh in an IPGPhor III (GE Healthcare, Uppsala, Sweden) apparatus with the following IEF conditions: 100 V gradient for 1 h, 300 V gradient for 2 h, 1000 V gradient for 1 h, 5000 V gradient for 5 h, and 5000 V step held for 7 h at a constant temperature (20 °C). Following isoelectric focusing (IEF), each IPG strip was placed in an equilibration solution containing 2% DTT, followed by incubation in another buffer containing 2.5% iodoacetamide in place of DTT. The second dimension PAGE (12.5%) was performed in an SE600 (GE Healthcare, Uppsala, Sweden) at 1 W/gel for 1 h and 13 W/gel for 3 h.
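As a rough consistency check on the stated focusing total, the volt-hours of the IEF programme can be summed step by step. The sketch below assumes that each gradient step ramps linearly from the previous voltage, starting at 0 V, and that the final step is held at constant voltage; these ramp assumptions and all names in the code are illustrative and are not taken from the instrument settings.

```python
# IEF programme as listed in the text: (mode, target voltage in V, duration in h).
IEF_STEPS = [
    ("gradient", 100, 1),
    ("gradient", 300, 2),
    ("gradient", 1000, 1),
    ("gradient", 5000, 5),
    ("step", 5000, 7),
]


def volt_hours(steps, start_voltage=0.0):
    """Sum volt-hours over the programme.

    Assumption: a gradient step ramps linearly from the previous voltage
    to its target, so its contribution is the mean voltage times duration.
    A step is held at its target voltage for the whole duration.
    """
    total = 0.0
    previous = start_voltage
    for mode, target, hours in steps:
        if mode == "gradient":
            total += (previous + target) / 2 * hours
        elif mode == "step":
            total += target * hours
        else:
            raise ValueError(f"unknown step mode: {mode}")
        previous = target
    return total


total_vh = volt_hours(IEF_STEPS)
print(f"Estimated total: {total_vh:.0f} Vh")
```

Under these assumptions the programme sums to roughly 51,000 Vh, which is in line with the approximately 50,000 Vh stated above.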

In-Gel Trypsin Digestion and MALDI-TOF
Proteins were stained separately with colloidal Coomassie blue G-250 and scanned with a high precision scanner (ScanMaker 9700XL, Microtek). The gel image analysis programme PDQuest 8.01 (Bio-Rad) was used to detect protein expression levels [13]. In the gel, the protein spots of interest were digested with trypsin before being examined by mass spectrometry. Gel fragments were rinsed with Milli-Q water before being treated with a decolorizing solution containing 50% acetonitrile and 25% ammonium bicarbonate. Discoloured gel fragments were thoroughly dehydrated in 100% acetonitrile (ACN) for 10 min before being vacuum dried for 30 min.
The gel pieces were rehydrated/trypsinized for 30 min on ice in 5 µL of trypsin buffer (10 mM ammonium bicarbonate in 10% ACN) containing 400 ng trypsin (Sigma Aldrich, USA) and then incubated for 16 h at 37 °C in 25 µL of buffer (40 mM ammonium bicarbonate in 10% ACN). Following incubation, the peptides were extracted twice by sonication (10 min) with 25 µL of 0.1% trifluoroacetic acid (TFA) in 60% ACN, followed by 20 µL of 100% ACN. Extracted peptides were vacuum-dried for 90 min and kept at 4 °C [14,15]. Mass accuracy was externally calibrated for peptide mass fingerprint analysis using a peptide standard range of 700-4000 Da. Internal calibration was performed using enzyme autolysis peaks. The raw spectra produced by MS were analysed using the SNAP algorithm in the FlexAnalysis software 2.4 (Bruker Daltonics). The NCBI database was searched with the Mascot search engine V2.2 (Matrix Science, UK) [16,17].

Pathway Analysis and Gene Ontology
The gene symbols of differentially expressed proteins were added into the PANTHER (www.pantherdb.org (accessed on 8 December 2021)) database for functional categorization and pathway analysis. STRING (www.string.db.org (accessed on 8 December 2021)) was used to create protein networks [18]. Correlations were formed directly (physically) and indirectly (functionally) from four separate sources: genetic context, high throughput testing, prior knowledge, and conserved co-expression. The integration maps were built utilizing quantitatively combined interaction data from multiple sources.

Immunohistochemistry
Immunohistochemistry was done on 4 µm sections obtained from formalin-fixed paraffin-embedded (FFPE) tissue samples. Sections were taken on slides coated with 3-aminopropyl triethoxysilane (APES). The sections were deparaffinised in xylene and rehydrated using absolute alcohol. Endogenous peroxidase activity was quenched by immersing the sections for 10 min in 0.03% hydrogen peroxide in distilled water, followed by a distilled water wash. Antigen retrieval was done with 0.05 M Tris EDTA buffer (pH 9) in a pressure cooker for 20 min. Sections were pre-incubated with 2% bovine serum albumin (BSA) for 40 min. The sections were incubated with primary antibody CA 1 (Santa Cruz-393490 CA1 antibody F-5; 1:50 dilution in 1% BSA), 14-3-3ε (Santa Cruz-23957 14-3-3ε antibody 8C3; 1:250 dilution in 1% BSA) and HSP 70 (Vitro SA-MAD000531Q mouse anti-human HSP 70 monoclonal antibody clone W27; prediluted) overnight at 4 °C in 100% moisture. The Polyexcel HRP/DAB detection system (PathnSitu, 1257 Pleasanton, CA, USA) was used to detect expression. Hematoxylin counterstained sections were dehydrated using ascending grades of isopropyl alcohol and xylene and mounted in DPX. Known positive controls and negative controls were used. The expression of 14-3-3ε, CA 1 and HSP 70 was graded and compared to the absolute normal oral mucosa. In 10 high power fields (40X), the number of positive cells was counted in the epithelium and connective tissue (connective tissue cells such as fibroblasts and inflammatory cells), and the % positivity was computed. Counting was done on a computer display using the software ProgRes CapturePro v2.8.8 [19-21]. Briefly, CA 1, HSP 70 and 14-3-3ε expression was assessed semi-quantitatively by evaluating the percentage of epithelial and connective tissue (CT) cells expressing the respective proteins; samples in which 20% or more of the cells were stained were considered positive. The staining intensity was measured at several levels of the epithelium (basal, stratum spinosum and superficial).
Similarly, expression in the connective tissue was also counted. Scoring for IHC was done by an oral pathologist who was blinded to the clinical details of the included samples [22][23][24].
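The semi-quantitative scoring rule described above can be sketched as follows. This is a minimal illustration, assuming that counts from the 10 high power fields are pooled before the percentage is computed (the text does not state whether fields were pooled or averaged); the function names and the example counts are hypothetical.

```python
POSITIVITY_CUTOFF = 20.0  # percent of stained cells, as stated in the text


def percent_positive(fields):
    """Percentage of stained cells over all fields.

    fields: list of (positive_cells, total_cells) tuples, one per
    high power field. Assumption: counts are pooled across fields.
    """
    positive = sum(p for p, _ in fields)
    total = sum(t for _, t in fields)
    if total == 0:
        raise ValueError("no cells were counted")
    return 100.0 * positive / total


def is_positive(fields, cutoff=POSITIVITY_CUTOFF):
    """A sample is scored positive when at least `cutoff` % of cells stain."""
    return percent_positive(fields) >= cutoff


# Hypothetical counts for 10 high power fields (not data from the study).
example_fields = [
    (15, 50), (8, 40), (15, 60), (5, 45), (10, 55),
    (9, 50), (14, 48), (7, 52), (11, 50), (6, 50),
]

print(f"{percent_positive(example_fields):.1f}% positive cells")
print("scored positive" if is_positive(example_fields) else "scored negative")
```

The same function can be applied separately to epithelial and connective tissue counts, since the two compartments were scored independently.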

Statistical Analysis
Student's t-test was employed in 2D-gel electrophoresis to calculate statistically significant differences in the relative abundance of particular protein spots between two groups. A value of p < 0.05 was considered statistically significant. Clinicopathological parameters and immunoexpression-based correlations were analysed using SPSS (IBM Corporation version 16) [25].
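For illustration, the two-group comparison of spot abundance can be sketched with a pooled-variance Student's t statistic. The spot volumes below are hypothetical and the replicate number (three per group) is an assumption, since the number of replicate gels is not stated here. To stay within the Python standard library, the sketch compares the statistic against the tabulated two-sided critical value for 4 degrees of freedom instead of computing an exact p-value.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Hypothetical normalised spot volumes (not data from the study).
normal_spot = [1.0, 1.2, 0.9]
osmf_spot = [2.1, 2.4, 2.0]

t_stat, df = student_t(osmf_spot, normal_spot)

# Two-sided critical value of t for alpha = 0.05 and df = 4.
T_CRITICAL_DF4 = 2.776
significant = df == 4 and abs(t_stat) > T_CRITICAL_DF4

print(f"t = {t_stat:.3f}, df = {df}, significant at 0.05: {significant}")
```

With other replicate numbers the critical value changes, so a statistics package that reports exact p-values would normally be preferred in practice.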

Quantitative Protein Profiling Using 2D Gel Electrophoresis and MALDI-TOF
A comparative proteomic study was performed on pooled tissue samples taken from patients with OSMF and from patients with OSMF with histologically proven OSCC, which were compared to pooled normal samples (Figure 1). The image analysis platform yielded more than 50 protein pairs, 23 of which were upregulated in tumour samples when compared to nearby normal protein samples. The top 23 differentially expressed proteins were chosen for mass spectrometry from the list of differentially expressed proteins. Table 1 shows the protein identification details for the differentially regulated proteins, including the database accession number, mass spectroscopy probability score and percentage of sequence coverage match. The top 23 differentially upregulated proteins are listed in Table 2, in which three secretory proteins, namely carbonic anhydrase 1 (CA 1), 14-3-3 epsilon (14-3-3ε) and heat shock protein 70 (HSP 70), were taken up further in the current study for validation with immunohistochemistry.

Functional Classification of Identified Proteins and Biological Network Analysis
The STRING database was used to analyse protein-protein interactions for all 23 differentially elevated proteins. A cluster analysis found that among the 23 proteins discovered, two (Serpin B4 and Dermcidin) stood out as individual actors and had not been documented to interact with other proteins identified in this study. The remaining proteins interacted with one another, either directly or indirectly, via other protein networks (Figure 2). Serum albumin interacted with alpha-enolase, heat shock protein, myosin and lumican to form a network that connected to other protein networks such as keratin and annexin proteins. A cluster analysis was performed for all the interaction networks by K means clustering, and they were grouped into three major clusters; one was with all contractile proteins, conserved proteins (HSP), cytosolic proteins (carbonic anhydrase, enolase) including profilin, myosin light chain family members and tropomyosin. Other major clusters included all heat shock proteins, inflammatory proteins, serpin family proteins and annexin.
All differently expressed upregulated proteins were divided into three groups based on their molecular function, their biological process and their cellular components (Figure S1A-C). For the molecular function of the identified proteins, most of them had binding and catalytic activity, followed by transporter activity. Most of them were cytoskeletal proteins and were mainly involved in the cellular and metabolic processes (Figure S2A). When all functional categories were summarized, it was evident that all these differently expressed proteins played a critical role in tumour development, as they were mainly involved in biological regulation, including glycolysis, the epidermal growth factor/fibroblast growth factor signalling pathway, the cytoskeletal system regulation by Rho-GTPase and the apoptosis signalling pathway (Figure S2B). It was found that the three differently expressed secretory proteins (CA 1, 14-3-3ε and HSP 70), which were considered for further validation, were mainly involved in cellular processes, biological regulation and molecular functions (Figure 3 and Tables S1-S3).

Validation Studies in Clinical Samples Using Immunohistochemistry
Immunohistochemistry was done using clinical samples of normal, OSMF and OSCC. Initially, IHC was performed with a sample size of n = 130 for the three proteins, namely CA 1, 14-3-3ε and HSP 70. However, five sections of CA 1 and 13 sections of HSP 70 got washed off during the IHC procedure and hence were not available for scoring and statistical analysis. Thus, the final sample size taken for statistical evaluation was n = 130 for 14-3-3ε, n = 125 for CA 1 and n = 117 for HSP 70.
Positive immunoexpression for CA 1 was found in 70.4% (88/125) of samples. Among the 40 patients with OSCC, CA 1 overexpression was found in 75% (30/40), and among the OSMF patients, CA 1 overexpression was found in 77.8% (56/72) compared to 15.4% (2/13) of normal samples, which was found to be statistically significant (p < 0.001; χ² = 21.169) (Table 3). Based on different degrees of epithelial abnormalities, we found a statistically significant association (p = 0.025; χ² = 11.144). Our results showed epithelial atrophy among the OSMF patients showing positive CA 1 overexpression. There was no significant association between CA 1 expression and degrees of inflammation, fibrosis and vascularity. Figure 4 shows the negative immunoexpression of CA 1 in normal samples. Figure 5 shows the CA 1-positive immunoexpression in epithelial cells in OSMF samples, with cytoplasmic and nuclear positivity. In OSCC samples, CA 1 demonstrates strong cytoplasmic and nuclear positivity in malignant epithelial cells (Figure 6).
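The chi-square statistic reported for CA 1 can be reproduced from the positive counts given in this section (30/40 OSCC, 56/72 OSMF, 2/13 normal), with the negative counts obtained by subtraction. The sketch below is a plain Pearson chi-square on the resulting 3 x 2 table and is not the SPSS procedure itself; the function and variable names are illustrative. For 2 degrees of freedom the chi-square survival function has the closed form exp(−x/2), which is used here to obtain the p-value without external libraries.

```python
from math import exp

# CA 1 immunoexpression: (positive, negative) per group.
# Negatives are derived as total minus positive from the reported counts.
observed = {
    "OSCC": (30, 10),
    "OSMF": (56, 16),
    "Normal": (2, 11),
}


def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    rows = list(table.values())
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(col_totals)
    chi2 = 0.0
    for row in rows:
        row_total = sum(row)
        for obs, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            chi2 += (obs - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return chi2, df


chi2, df = pearson_chi_square(observed)

# Closed-form survival function, valid only for df == 2.
p_value = exp(-chi2 / 2)

print(f"chi-square = {chi2:.3f}, df = {df}, p = {p_value:.2e}")
```

The computed statistic agrees with the reported value of 21.169, and the p-value is well below 0.001, which SPSS displays as 0.000.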

Discussion
The current study shows CA 1, 14-3-3ε and HSP 70 to be potentially useful markers in identifying OSMF patients who are at the risk of malignant transformation and require immediate intervention. Risk stratification of OPMD requires diagnostic tools with increased specificity and sensitivity, which will enable early detection of oral cancer [26]. The current study has shown that the discovery of protein biomarkers based on proteomic analysis can help in the assessment of disease prognosis and also aid in the development of targeted therapy, as described in existing literature [27,28]. We identified 23 top upregulated proteins in OSMF and validated three of them, namely 14-3-3ε, CA 1 and HSP 70. Previous studies have shown that proteomic analysis employs efficient approaches supported by bioinformatics to identify and quantify the total protein content of cells, tissues or biological fluids [29]. Proteomic analysis helps in assessing the protein interactions involved in the cellular process, thereby aiding in understanding the molecular pathogenesis of the disease [30]. MALDI-TOF, which was employed in our study, is known for its sensitivity, ease of use and its capacity to cover a range from small molecules (100 Da) to large proteins (>300 kDa), thereby allowing measurement of metabolites, lipids, peptides and proteins [31].
In our study, the Protein ANalysis Through Evolutionary Relationships (PANTHER) analysis was performed to gain a better understanding of the functions of all differently expressed proteins. The proteins were categorized based on cellular localization, molecular function and biological process [32]. To the best of our knowledge, there was only one previously published Indian study on the proteomic analysis of OSMF and normal tissue that identified 15 upregulated proteins, of which HSP-70, enolase and Lumican were also recognized in the current study [9]. However, our study, apart from samples from OSMF patients, additionally included samples from patients with histologically proven OSCC along with OSMF, which were compared to normal samples; this helped in evaluating the target proteins involved in the malignant transformation of OSMF. Among the 23 upregulated proteins identified, 14-3-3ε, CA 1 and HSP 70 were found to be secretory proteins. Therefore, these protein targets were chosen for further validation using IHC in a larger number of samples.
14-3-3 proteins have been found to play a vital role at the interface between cancer, aging and age-related neurodegenerative diseases [33]. The 14-3-3 protein family consists of 28-33 kDa acidic proteins that are found in all eukaryotes and are phosphorylated serine/threonine binding proteins that bind to various kinases, phosphatases, transmembrane receptors and transcription factors [34,35]. The 14-3-3 proteins generally interact with proteins involved in functions such as regulation, localization or catalysis. It is generally accepted that 14-3-3 proteins act in two ways: they either act as adapters or exhibit chaperone-like activity [36,37]. The 14-3-3 proteins regulate critical biological processes such as cell proliferation, growth and apoptosis through interaction with their partners and are thus also involved in the regulation of various tumours [38,39]. There are seven well-recognized isoforms of 14-3-3 proteins, namely 14-3-3ζ, 14-3-3σ, 14-3-3β, 14-3-3ε, 14-3-3γ, 14-3-3η and 14-3-3τ/θ in various cancers, of which 14-3-3ε has been implicated in kidney, liver, squamous cell, breast and stomach cancers [40,41]. Overexpression of 14-3-3ζ, among the various 14-3-3 proteins, has been observed in OPMD and OSCC using IHC and has therefore been suggested to play an important role in tumour development and progression [42]. The literature on 14-3-3ε in oral cancer, however, appears to be sparse, and, to our knowledge, this is the first report of its kind in OSMF. Overexpression of 14-3-3ε in OSMF and OSCC suggests a role in the initiation and progression of OSCC, and it may therefore be worthwhile to further investigate its association with oral tumorigenesis.
Currently, there are no studies available to correlate CA 1 in OSMF, and this study is the first of its kind showing increased expression and significant association in OSMF and OSCC when compared to normal samples. The CAs are grouped into six distinctive classes (α, β, γ, δ, τ, η), of which the α-class is detected in humans, mainly expressed in tissues differing in pH and metabolic rate, and is involved in catalyzed CO2 reactions of various physiological processes [43,44]. Fifteen isoforms of CAs are encoded, among which 12 (CAs I-IV, CAs VA-VB, CAs VI-VII, CA IX and CAs XII-XIV) coordinate with zinc at their active site [45]. Based on cellular localization, the CA family is further subdivided into cytosolic (CAs I, II, III, VII and VIII), membrane-associated (CAs IV, IX, XII and XIV), mitochondrial (CA VA and VB), secreted isoenzymes (CA VI) and acatalytic CA isoforms (CAs X and XI). The CAs which have been implicated in cancer include CA IX, XII and XIV, among which CA IX in OSCC has been explored extensively [46,47]. Similarly, a study assessing CA IX in the plasma of OSMF and OSCC patients has shown increased expression [2]. The available literature on the role of CA 1 in OSCC is limited. A study done by Li et al. (2021) using IHC showed a significant correlation of CA I and CA II in OSCC [48].
HSPs are a group of proteins highly expressed during biological stress, especially inflammation, and they are thought to promote tumorigenesis by inhibiting apoptosis [49,50]. Heat shock proteins are broadly classified into five major families as the HSP 100 family (100-104 kDa), HSP 90 family (82-90 kDa), HSP 70 family (68-75 kDa), HSP 60 family (58-65 kDa) and the small HSP family (15-30 kDa), based on apparent molecular weight, amino acid sequence homologies and their functional aspects [51]. Increased expression of HSP 70 was noted in leukoplakia with dysplasia and OSCC, and it was suggested to be a marker for epithelial malignancy [52]. A study done on HSP 70 by Thubashini et al. (2011) on OSMF and OSCC using IHC showed increased expression, attributed to increased copper levels, leading to elevated oxidative stress levels in OSMF [53]. A similar significant correlation has been observed in leukoplakia and OSCC compared to normal, and hence, the authors recommended further exploration of therapeutic approaches for OPMD and OSCC [54]. The current study also showed a high expression in OSMF and OSCC, which is consistent with the previous study.
To the best of our knowledge, this is the first study showing 14-3-3ε and CA 1 in OSMF and their interrelationships. The current study of validating the secretory markers can pave the way for the assessment of 14-3-3ε and CA 1 in saliva samples in the future. Clinically validated targets help in the development of precision therapy, thereby improving patient care [55]. Our study shows that 14-3-3ε and CA 1 are potential new targets identified for OSMF as biomarkers, along with HSP 70, which has previously been shown to be involved in OSMF as a marker of pathogenesis. Although HSP 70 has been explored in the literature, data on its correlation with the stages of OSMF and OSCC are sparse. The proteins validated using IHC will pave the way for the identification of novel targets for the therapy of OSMF and for preventing its malignant transformation. These three overexpressed proteins can be offered as a panel of IHC-based biomarkers for risk stratification of OSMF.

Conclusions
South East Asia has a unique and increasing burden of OSMF, and research focused exclusively on OSMF patients is scarce. The current study shows proteomic profiling using clinical samples, along with validation in a large series of cases. Validation studies revealed a panel of biomarkers, namely CA 1, 14-3-3ε and HSP 70, to be potentially useful in identifying OSMF patients with an increased risk of oral cancer development. With currently no efficient treatment modality available, the results obtained may play a significant role in the evolution of a novel targeted therapy and in preventing malignant transformation.