The Complex Subtype-Dependent Role of Connexin 43 (GJA1) in Breast Cancer

Gap junction transmembrane channels allow the transfer of small molecules between the cytoplasm of adjacent cells. They are formed by proteins named connexins (Cxs) that have long been considered as a tumor suppressor. This widespread view has been challenged by recent studies suggesting that the role of Connexin 43 (Cx43) in cancer is tissue- and stage-specific and can even promote tumor progression. High throughput profiling of invasive breast cancer has allowed for the construction of subtyping schemes that partition patients into at least four distinct intrinsic subtypes. This study characterizes Cx43 expression during cancer progression with each of the tumor subtypes using a compendium of publicly available gene expression data. In particular, we show that Cx43 expression depends greatly on intrinsic subtype. Tumor grade also co-varies with patient subtype, resulting in Cx43 co-expression with grade in a subtype-dependent manner. Better survival was associated with a high expression of Cx43 in unstratified and luminal tumors but with a low expression in Her2e subtype. A better understanding of Cx43 regulation in a subtype-dependent manner is needed to clarify the context in which Cx43 is associated with tumor suppression or cancer progression.


Introduction
Connexin 43 (Cx43), a protein encoded by the Gap Junction protein alpha 1 gene (GJA1), forms gap junction transmembrane channels facilitating communication between the cytoplasm of two adjacent cells. Small molecules, including metabolites, second messengers and electrical signals pass through these channels in a process called Gap Junction Intercellular Communication (GJIC). Cx43 transcription is thought to be regulated both by transcription factors and by epigenetic mechanisms [1], but is also regulated at the protein level by post-transcriptional modifications, trafficking to and from the plasma membrane and gating of the channels [2].
The breast epithelium is composed of two layers of cells: an inner layer of luminal cells surrounded by an outer layer of basal cells, composed mainly of myoepithelial cells but also comprising stem and progenitor cell populations [3]. It is well established that Cx43 is expressed mainly in the basal layer; however, a few studies showed Cx43 expression in luminal cells [4][5][6]. A study using transmission electron microscopy reported gap junctions to be present between the basal and the luminal layers in normal breast tissues, although the exact connexin involved was not determined [7]. A few studies have also demonstrated the expression of Cx43 in fibroblasts surrounding the breast epithelium and in endothelial cells [8][9][10].
The role of Cx43 in breast cancer is controversial. On the one hand, Cx43 has long been considered a tumor suppressor [11] with studies demonstrating it was under-expressed at the mRNA and the protein level in cancer cell lines [12,13] or aberrant localization and phosphorylation in tumors [12][13][14][15][16].
Cx43 has also been linked to the control of processes associated with breast cancer progression and metastasis such as proliferation, invasion, migration and apoptosis [17]. Moreover, it was shown in vivo and in vitro that metastatic capacity was increased in tumors cells showing a weak GJIC capacity and a lower number of gap junction plaques [18,19]. Re-expression of Cx43 in tumor cells led to reduced growth of tumors in nude mice and fewer metastases to the lungs [20,21]. Mice expressing a mutant form of Cx43 (G60S) also showed increased breast tumor metastasis to the lung [3].
On the other hand, much evidence suggested that Cx43 is involved in later stages of breast cancer progression. For instance, it has been suggested that Cx43 mediates the interaction between tumor and endothelial cells to facilitate adhesion and extravasation at secondary sites [22][23][24]. Cx43 has also been found to be expressed at higher levels in lymph node metastasis than in the corresponding primary tumor [25]. The context of expression that allows Cx43 to act as a tumor suppressor or promoter has not been elucidated and therefore precludes its targeting in breast cancer therapies [11].
Breast cancer is highly heterogeneous, with both intra-and inter-tumoral molecular variability. During the last decade, high throughput techniques have generated a body of new data in many diseases including breast cancer. Genome-wide gene expression profiling has produced classification schemes including the intrinsic subtypes consisting of luminal A (LumA), luminal B (LumB), basal-like and HER2-enriched (Her2e) tumors. Luminal tumors are generally characterized by the expression of the estrogen receptor alpha (ERα) and the progesterone receptors (PR). Most Her2e tumors harbor a genomic amplification of chromosome 17q12 that contains the erb-b2 receptor tyrosine kinase 2 gene (ERBB2/HER2). Approximately half of Her2e tumors express ERα. Basal-like tumors are often negative for ERα or PR receptors as well as for HER2 and also express basal cytokeratins [26,27].
This study aims to investigate Cx43's ambiguous role during cancer progression with each of the breast tumor intrinsic subtypes using a compendium of publicly available gene expression data with large samples. Here, we report that Cx43 expression depends greatly on intrinsic subtype. Tumor grade also co-varies with patient subtype, resulting in Cx43 co-expression with grade in a subtype-dependent manner. Better survival was associated with a high expression of Cx43 in unstratified and luminal tumors but with a low expression in the Her2e subtype.

GJA1 Expression and Localization in the Breast
We first investigated the tissue localization and level of expression of Cx43 protein in human samples of both morphologically normal breast tissue and tumors using the Human Protein Atlas. This is a public database containing a large collection of normal and cancer tissue slides which have been probed with various antibodies followed by a hematoxylin counterstain [28]. Cx43 is a membrane channel and is usually considered to be expressed in the myoepithelial cell. A typical punctate staining of junctional plaques formed by Cx43 channels was observed for normal tissues. The staining could be observed in the myoepithelial layer, as expected, but also in some luminal cells (Figure 1a). Although an under-expression of Cx43 protein is observed in some of the 21 cancer samples available (Figure 1a), others show a clear over-expression, mostly in well differentiated luminal-like neoplastic cells, which did not appeared to be associated with a basal layer (Figure 1c). In other samples, Cx43 could also be seen in a layer of cells separating neoplastic tissue from stroma, although this layer sometimes adhered poorly to both adjacent compartments (Figure 1d,e). Cx43 was also observed in samples with poorly differentiated cell and tissue morphology (Figure 1f,g). Interestingly, Cx43 protein could also be found in spindle-shaped cells in the stroma (Figure 1h). Overall, some normal punctate patterns could be observed in some tumors (Figure 1d We next compared transcript levels of GJA1 in breast tumor samples and in the non-cancerous adjacent tissues using microarray-based data of The Cancer Genome Atlas project (TCGA) breast invasive carcinoma cohort (BRCA) of clinical samples. We observed a far greater variance in mRNA expression in tumor samples compared to tumor adjacent morphologically normal breast tissue ( Figure 1i, the Fligner-Killeen test of homogeneity of variances, p value < 10 −12 ). We also used whole sample Cx43 protein levels obtained for 105 TCGA samples by mass spectrometry. GJA1 mRNA and protein level are significantly correlated (Figure 1j, (Pearson correlation rho = 0.6515, p value < e −13 ). Our results confirm that, in breast cancer, GJA1 is concurrently dysregulated at both the protein and the mRNA level.

GJA1 Expression Varies with Breast Cancer Subtype
We then speculated that GJA1 variability could be linked to the molecular heterogeneity of breast cancer. When we compared GJA1 mRNA expression after stratifying patient samples by their intrinsic subtype (Pam50 by genefu [29]) (Figure 2a), the increase in variance in gene expression of tumor samples relative to normal tissue was observed in every subtype. The LumA had a mean expression level statistically indistinguishable from morphologically normal samples, but a small significant progressive decrease in expression is observed from LumB to basal and Her2e subtypes (Figure 2a). A similar pattern was observed in the four other datasets, although normal breast tissues were only used in TCGA and Curtis datasets ( Figure A1). A similar pattern was also observed at the protein level in the TCGA dataset ( Figure 2b). Together, these results suggest that the expression of GJA1 is strongly associated with tumor subtype and is more variable in each subtype in comparison to morphologically normal tissue.

Somatic DNA-Level Events of GJA1 Do Not Drive Expression Changes of GJA1 in Breast Cancer
We next asked if underlying DNA-level somatic copy number changes in the genomic loci harboring GJA1 influence gene expression levels. For the TCGA dataset, a few tumors had amplification or deletion of GJA1, compared to genes known to be amplified in breast cancer ( Figure 3a).
Moreover, tumors with GJA1 amplification did not show an increase in expression while only deep deletions reduced expression (−2 in called copy number, as shown in Figure 3c). Most luminal tumors with the highest expression of GJA1 were found to have either a normal copy number or single deletion (Figure 3b-d). Moreover, in tumors with a GJA1 gain or amplification, a slight but significant decrease in expression, rather than an increase, could be observed compared to normal tissues ( Figure 3c). GJA1 mRNA also weakly negatively correlated with DNA copy number (Figure 3c), suggesting that Cx43 over-expression in breast cancer is not driven by DNA amplification. To validate our procedure, we used the HSF2 gene, a close neighbor of GJA1 on chr6q22 which shares similar copy number in 99% of TCGA's breast cancer cases. In contrast to what was observed for GJA1, HSF2 mRNA was positively correlated with the copy number of its own gene ( Figure 3c). Moreover, somatic point mutation data showed that, in the TCGA cohort, only three breast cancer patients out of 977 harbored at least one GJA1 mutation, accounting for 0.31% of the tumors (Figure 3e). Only one tumor with an extremely high number of total mutations (TCGA-AN-A046) was found to have both a GJA1 mutation and a slightly higher expression of the gene (2.76) compared to normal range (−1.01 to 2.20) (Figure 3c,e). Together, these results argue that loss or amplification of the GJA1 gene likely does not dictate mRNA and protein dysregulation in breast cancer.  d   RB1  PTEN  CCAR2  CDKN2A  GJA1  ERBB2  ASPH  FGFR1  RAB11FIP1  CCND1  FLG  ATP1A4  USH2A  AKT3  MDM4  GOLPH3L

GJA1 Level Is Dependent on Hormonal Receptor Status
Because Cx43 level varies through the mammary gland development and the reproductive cycle, it has been suggested that it could be regulated by hormones, similar to what has been observed in other tissues [4,[30][31][32]. We thus next investigated whether the GJA1 mRNA level was directly linked with hormonal receptors status.
Consistent with the subtype-specific expression of Cx43, ERα-or PR-positive breast tumors had a significantly higher expression of GJA1 mRNA compared to ERα-or PR-negative tumors (Figures 4a and A1-A3). Results were similar for all five datasets, except for PR in the NKI dataset where the low number of samples did not allow statistical significance to be reached (Figures 4a, A2a and A3a). However, there were no strong correlations between GJA1 expression and ESR1 mRNA, with total protein level (ERα), or with the activated form of ERα phosphorylated on serine 118 (ERα_pS118) (Figures 4b-e and A1b,c). While only weak correlations were observed in most individual subtypes, a stronger correlation between GJA1 and ESR1 mRNA and protein was observed when the tumors were pooled (Figure 4b-d). As expected, ESR1 mRNA was better correlated with total ERα (Pearson's rho = 0.9011, Spearman's rho = 0.8969) than with ERα_pS118 proteins (Pearson's rho = 0.5459, Spearman's rho = 0.6407). ER−alpha_pS118 RPPA.tcga.compatible GJA1 (A_24_P55287) X: ER−alpha_pS118 RPPA.tcga.compatible in tcga (probe A_23_P100800) Y: GJA1 exprs in tcga (probe A_24_P55287)  Y: GJA1 exprs in tcga (probe A_24_P55287) PGR (A_32_P49197) GJA1 (A_24_P55287) X: PGR exprs in tcga (probe A_32_P49197) Y: GJA1 exprs in tcga (probe A_24_P55287) (Subtype Pooled patient for which pam50.genefu contain Pooled , n = 550 ) n = 488  Stronger correlations between PGR mRNA and protein levels and GJA1 mRNA levels were observed, not only in unstratified (pooled) analysis, but also in individual subtypes within most datasets (Figures 4b-e and A1b,c). This association was stronger in cancer samples than in normal breast tissues in all datasets for which normal tissues were available. Similar to ERα, total PR protein was well correlated with PGR mRNA (Pearson' rho = 0.8593 and Spearman's rho = 0.8723).
Tumors positive for the HER2 receptor by histochemistry (TCGA dataset) did not express significantly different levels of Cx43 mRNA. However, when HER2 status was given by HER2 amplicon probes or HER2 mRNA expression (Vanvliet, NKI and Curtis datasets), HER2+ tumors had a significantly lower level of GJA1. No direct correlation was observed between GJA1 and the HER2 (ERBB2) mRNA (Figures 4a-e and A4a-d). A good correlation was observed between HER2 mRNA and total HER2 protein level (Pearson's rho = 0.8344, Spearman's rho 0.68634). The correlation between HER2 protein (HER2 and HER2_pY1248 activated form) and GJA1 mRNA was not stronger than that observed for HER2 mRNA (Figures 4b-e and A3d).
Together, the significant differences observed in GJA1 mRNA level in individual subtypes and with receptor status suggest that GJA1 level is dependent on the molecular context provided by such subtypes. In addition, GJA1 does not vary directly with ESR1 and HER2 mRNA and protein levels but shows a stronger correlation with PGR mRNA and PR protein in tumor samples.

GJA1 mRNA Is Dysregulated at the Early Stages of Breast Cancer and Is Reduced with Grade When Tumors Are Pooled
To reconcile evidence supporting both tumor-suppressive and -promoting roles, it has been suggested that Cx43 function could depend on tissue type or evolve with tumor stage [11]. We therefore investigated whether GJA1 expression in primary breast tumor changed with stage and grade at the mRNA level in breast cancer. Since grade/stage are strongly associated with subtype, we first stratified our cohorts by intrinsic subtype. We used the Curtis dataset, as GJA1 expression was available for invasive tumors (stages 0 to IV) and "normal" adjacent tissue for numerous samples. A significant dysregulation of GJA1 expression occurred at the early stages in all breast cancer subtypes, although both over-expression and downregulation could be observed ( Figures 5 and A5). Most of the GJA1 over-expressing luminal tumors were found to be of low stage (0-II). However, a reduction was observed in early stage basal-like and Her2e tumors ( Figures 5 and A5). A significant increase in GJA1 was also observed in the invasive stage I compared with stage 0 in all subtypes ( Figure 5).   We then investigated whether or not the expression of GJA1 could be linked to tumors' grade. Our analysis revealed that GJA1 mRNA expression was significantly decreased with grade when all tumors were pooled, but not when they were stratified by intrinsic subtype (Figures 6 and A6). A significant decrease in GJA1 with grades in LumB tumors could be observed only in the Vanvliet's dataset but not in other datasets analyzed (Figures 6 and A6). Interestingly, basal and Her2e tumors, which express a low level of GJA1 (Figure 2), account for an important proportion of grade 3 tumors, thus reducing the mean GJA1 expression for this grade ( Figure 6). Moreover, grade 1 tumors are mostly luminal A and B, with a subset of GJA1 over-expressing tumors, introducing an upward bias in this grade. Grade 2 tumors consist of a more balanced mix of all the subtypes (Figures 2 and 6). These results suggest that an observed reduction in GJA1 with grade in pooled tumors is likely a bias induced by the pooling of the tumors' subtypes.

In Her2e Breast Tumors, a Low Expression of GJA1 Is Associated with a Better Prognosis
To gain further insight into the role of Cx43 in breast cancer, we analyzed how the level of GJA1 mRNA expression in each subtype was associated with outcome. Observations that Cx43 expression was associated with a worse prognostic in ER-negative [33] and Her2e [34] tumors have been previously reported using the web-based platform KMPlotter [35] while ER-positive tumors had a better prognosis [33]. Investigating further the results of BreastMark and KMPlotter Web platforms, survival analysis showed that pooled and luminal tumors with high levels of GJA1 mRNA were associated with a better prognostic (hazard ratio < 1), although results were not always statistically significant (Figures 7 and A7a,b). Conversely, basal-like and Her2e tumors followed an opposite trend (hazard ratio > 1), with high expression of GJA1 strongly associated with a worse prognosis in the Her2e subtype ( Figure 7. GJA1 is associated with a diverging outcome depending on breast cancer subtype. The Kaplan-Meyer plots show survival curves for patients with breast tumors expressing either high (blue) or low (red) levels of GJA1 mRNA in pooled tumors or in individual intrinsic subtypes. TCGA, NKI, Vanvliet and both Curtis datasets were aggregated for the analysis. The best cutoff was determined as the percentile lending the lowest log rank test p value ( Figure A12 ROC curves have shown that the highest area Under the curve (AUC) for GJA1 was obtained when tumors were pooled ( Figure A8a) and GJA1 was then ranked, at worst, in the eleven first percentiles when compared to all the probes present in the five datasets ( Figure A8c). The log rank test was highly significant for all the analyses (Figures 7, A9 and A12) and for a vast range of cutoffs ( Figure A10c), suggesting that GJA1 has the greatest discriminating power when cohorts are unstratified. This is in line with a differential expression of GJA1 in luminal vs. basal and Her2e tumors that also have diverging prognostics (Figure 2a).
When analyzing individual subtypes, a high expression was also significantly associated with a better prognosis in all analyses for LumA and for most analyses for LumB tumors (Figures 7, A9 and A12). However, survival curves in most analyses as well as the hazard ratio for a wide range of cutoffs ( Figure A11) showed that this tendency is reversed in Basal and Her2e tumors where GJA1 is mostly associated with a worse prognosis. This result was most significant in Her2e tumors, especially with smaller cutoffs while significance was rarely reached for Basal tumors.
However, GJA1 ROC curves showed that GJA1 did not consistently identify bad prognosis tumors with a high specificity and sensitivity ( Figure A8a). These results suggest that although stratifying tumors revealed that the role of GJA1 possibly differs in different breast cancer subtypes, GJA1 should not be used as a clinical marker. These results also highlight once again how analyses using pooled tumor subtypes might induce biases and hide diverging results that are subtype-specific.

Discussion
Traditionally, Cx43 was considered as a tumor suppressor in the breast, with many studies reporting decreased Cx43 expression in tumor compared to normal breast tissue via both in vivo and in vitro studies [3,12,13,[18][19][20][21]. However, other studies contradict these findings [8,[22][23][24][25]. This recent evidence has cast doubt on Cxs tumor's suppressive role, suggesting that the Cxs function in cancer was tissue-and tumor stage-dependent [11,17]. At least four different subtypes of breast cancer have been identified, each having unique molecular profiles, responses to treatment and prognostics. Our evidence suggests that the role of Cx43 is dependent on subtype.

Cx43 Expression Is Dysregulated in Breast Cancer
Early studies first showed a dramatic downregulation of GJA1 at the mRNA and the protein level in breast cancer cell lines as well as in rat and human breast tumors [12,13]. Conversely, other studies showed an increase in a subset of tumors [15]. Most of these studies analyzed a limited number of samples and were conducted either prior to the intrinsic subtype classification of breast cancer or did not use such classification. Our results, with several large cohorts of breast cancer clinical samples, reconcile these contradictory data by demonstrating that the observed dysregulation can involve both he increased and decreased expression of the Cx43 protein and mRNA. These observations are consistent with more recent reports at the protein level [15,16,25,36].

Dysregulation of Cx43 Is Linked to Hormonal Receptor Status and Tumor Subtype
Our results showed that the expression of Cx43 in breast tumors was lower in basal and Her2e than in normal tissues and that Cx43 levels vary greatly within luminal subtypes. This subtype-dependent expression was also shown by more recent studies, the result of which also support a higher expression of Cx43 mRNA and protein in luminal tumors than in basal-like and Her2e subtypes [36,37]. Because the intrinsic subtypes are characterized by, among others, hormonal receptor status, we wanted to evaluate whether a functional link could be captured in whole-tumor expression profiles between Cx43 and ERα, PR or HER2. Whole-tumor expression has been used by others, both to assess the content of specific cell types in samples and to decipher functional links between genes [38,39]. Using this method, we showed that GJA1 mRNA increases in a subset of ERα-and PR-positive tumors and in the luminal subtypes, which are largely ERα-and PR-positive. These results were not surprising as much evidence supports a link between Cx43 and hormones in breast tissue [32] and in other tissues [40][41][42][43]. GJA1 is also expressed at lower levels when HER2 status is positive and within the Her2e breast cancer subtype, except in the TCGA dataset. In an early study, it was reported that Cx43 gap junctions were dramatically reduced in breast tumors, and that this reduction was considered to occur regardless of ERα, PR or HER2 status [12]. More recent studies have reported that Cx43 protein expression correlated positively with PR and ERα status [44,45] and negatively with HER2 protein expression [45]. However, Conklin et al. reported that no correlation was observed between Cx43 and HER2 protein in tissue microarrays [44].
Our results suggest a direct relationship between GJA1 and PR expression in breast cancer samples. Our analysis shows that GJA1 level correlates with PR mRNA and protein in several subtypes. These results suggest that either PR or GJA1 levels are dependent on the relative amount of some cell types co-expressing both genes, or that a functional link exists in the regulation of these genes in the same cell type or via paracrine signaling. Accumulating evidence has shown that ERα and PR are expressed in cell populations that do not totally overlap. GJA1 is usually associated with basal cells while PR is thought to be expressed mainly in hormone-responsive luminal cell [1]. However, PR has been detected in some human breast basal cells, especially within immature lobules [1], suggesting an expression in primitive basal progenitor cells. PR has been suggested to coordinate basal cell proliferation, either via paracrine or autocrine stimulation [1]. It was also reported that the unliganded progesterone receptor isoform A (PRA) could activate Cx43 transcription by interacting with AP-1 heterodimers composed of FRA2 and JUND [42]. More studies are needed to better understand Cx43 localization and regulation, as well as its potential link with hormones. This knowledge is essential to further understand mammary gland morphogenesis and how Cx43 and hormones are involved in breast cancer.
Several other questions remain unanswered regarding the link between GJA1 and ERα, PR and HER2. While the receptor's protein and mRNA levels were well correlated in our study, their functional status in the samples is unknown. Protein expression data for some phosphorylated forms of ER (ERα_pS118) and HER2 (HER2_pY1248) receptor were available. Beyond single phosphorylation, the activation of these receptors is mostly dependent on complex post-transcriptional processing which affects receptors' specific functions and gene transcription. As a result, prognostic significance of ERα has been shown to be phosphorylation site-specific [46]. Therefore, it cannot be excluded that GJA1 mRNA expression can be regulated by ERα or HER2 and that these links could not be captured by expression profiles from breast cancer samples. Regardless of the precise nature of the link between GJA1 and hormone receptors, our results suggest that GJA1 level is dependent on the overall molecular context provided by each breast cancer subtype and that this might relate to PGR level, at least in some subtypes.

Upregulation of Cx43 mRNA Is Not Driven by DNA Amplification in Breast Cancer
Somatic DNA-level chromosomal aberrations are a defining characteristic of cancer and are common in breast carcinoma. Genomic loss and amplification cause decreases and increases in the transcription of genes in the region and often with concomitant effects of protein expression. We found that GJA1 is rarely the target of such somatic events, and when it occurred it was often in Her2e and basal subtypes, consistent with the observation that these two subtypes generally have an increased amount of genomic instability in comparison to the luminal subtypes. Our results are also in accordance with previous studies that have shown that the region of human chromosome 6 where GJA1 is located (6q22.31) has a relatively low level of amplification and deletions [47]. Cx43 was rarely mutated in breast cancer samples. Together, these results suggest that GJA1 dysregulation at the mRNA and protein levels involves a dysregulation of other factors impacting the transcription (epigenetics, transcription factors) or mRNA stability.

Cx43 mRNA Level Is Dysregulated at the Early Stages of Breast Cancer
It was previously reported that, in primary tumors, Cx43 protein expression correlated with clinical stages [45]. Our analysis of microarray data from large cohorts of primary tumors suggested that Cx43 is decreased in a subset of tumors during early carcinogenesis (stage 0) and is re-expressed at higher levels in stage I tumors. While stage 0 of the luminal subtypes showed an increased variance of expression, those of basal-like and Her2e tumors had a significantly reduced expression compared to normal tissues. However, we could not observe a robust mRNA reduction, or increase, in later stages compared either to early tumor stages or to normal tissues.
A previous study investigated immunohistochemistry for Cx43 protein expression in ductal carcinoma in situ (DCIS), DCIS with microinvasion, DCIS with invasive ductal carcinoma (IDC) and IDC alone. In pooled tumors as well as in most subtypes, the lowest expression of Cx43 protein occurred neither in DCIS nor IDC alone but precisely in DCIS with microinvasion where only three out of thirty-seven cases (8%) were positive [36]. On the other side, out of 193 invasive lesions, sixty-three (33%) expressed Cx43. When looking specifically at the Her2e subtype, Cx43 was not expressed in a lower number of DCIS with microinvasion as in other subtypes since Cx43 was rarely expressed. Cx43 was present in only one of twenty IDC samples while the remaining twenty-six samples of other groups (DCIS, DCIS with microinvasion and DCIS with IDC) were all negatives. Whether or not the stromal compartment was included in the analysis was not specified. These results are consistent with our observation that, in all subtypes, DCIS (typically stage 0) had a lower expression than invasive stage I tumors. Together, these result point to Cx43 dysregulation as an early event in tumorigenesis, similar to what has been observed in the early stages of cervix, endometrial and thyroid cancers [48].
Moreover, it should be noted that while breast cancer stages are based on the size and the spreading of the disease in the tissue or to distant sites, the mRNA expression profiles we used only account for gene expression in whole primary tumors. Important morphologic information is therefore lost. During cancer progression, localized neoplastic cells acquire the capacity to invade surrounding tissues, and eventually reach the blood or lymphatic vasculature, allowing them to spread to other organs [49]. Depending on the stage and their location within the tumor or the tissue, these tumor cell populations face different challenges depending on the processes accomplished and on the microenvironment surrounding them [49]. Microarray data do not make it possible to either finely assess the expression of specific cells according to their specific localization in the tumor or to distinguish tumor gene expression from the stroma. A comprehensive study of events occurring early in carcinogenesis and accounting for the geographical localization within the tumors at primary or distant sites, and for the different cell populations in a subtype-dependent manner, is therefore the next logical step in further understanding Cx43's role in tumor progression.

The Apparent Grade-Dependent Decrease in Cx43 Is Linked to Its Low Expression in More Aggressive Subtypes
Tumor grade is a measure of the degree of abnormality of tumor cells and of dedifferentiation of cancer tissues compared to normal breast tissue. Our results showed that, when all tumors are pooled, the GJA1 mRNA level seems to increase in grade 1 tumors compared to normal tissues and gradually decrease with increasing grade. However, stratifying the tumors by subtype showed that within an intrinsic subtype, the distribution of the tumors within each grade varies considerably. Indeed, luminal A and B tumors are more frequently of grade 1 or 2 and some of them over-express GJA1 (Figure 2), while most basal and Her2e tumors, that express a low level of Cx43, are mostly of grade 3. As a result, GJA1 mRNA is not lost with grade in individual subtypes. These results suggest that the observed correlation in pooled tumors is, in fact, a bias attributable to the pooling of the tumors, and reflects the high grade of basal and Her2e tumors. These results also highlight how pooling the different intrinsic subtypes, expressing varying degrees of GJA1, can introduce important biases in cohort analysis and will likely yield different results depending on the composition, in terms of the subtypes, of the cohorts studied.

High Expression of Cx43 Is Associated with a Good Prognostic in Luminal Subtypes, but with a Worse Prognostic in Her2e Tumors
Our results showed that, consistent with its ascribed role as a tumor suppressor in breast cancer, Cx43 was expressed at lower levels in more aggressive basal and Her2e subtypes than in luminal subtypes. Survival analysis of pooled tumors therefore showed a better survival of tumors highly expressing Cx43. However, as with grade, pooling breast cancer subtypes to analyze the effect of GJA1 on the outcome introduces biases. Tumors expressing low levels of GJA1 are overrepresented in aggressive basal and Her2e tumors, likely dragging down the survival of the group expressing a low level of GJA1. Therefore, performing survival analysis on pooled tumors, a good prognosis patient is automatically segregated into the curve of tumors highly expressing GJA1, and vice versa.
Paradoxically, the prognostic associated with Cx43 expression diverged depending on the intrinsic subtype, with a good prognosis in luminal tumors and an opposite trend in Her2e tumors. A previous study using immunohistochemistry found no correlation between Cx43 protein level and patient outcome [44]. However, similar to our results, more recent studies using expression array-based survival curves found that a high GJA1 expression was associated with a better prognosis in ERα-positive breast cancer tumors, while an opposite trend was observed in ERα-negative tumors [33] and Her2e tumors [34]. The worse prognosis associated with GJA1 in Her2e tumors suggests that GJA1 function in breast cancer might not just be tissue-and stage-dependent, as suggested by others [11,17], but might also be subtype-dependent.
Cx43 has been reported to be expressed both in epithelial and stromal cells types. The molecular landscape provided by different cell types and/or by different breast cancer subtypes might provide different context, possibly allowing Cx43 to assume different functions and leading to different outcomes. It could be hypothesized that such context may provide different sets of interacting partners for GJA1, and its expression might even be driven by a different set of transcriptional or epigenetic regulators. In addition, an important determinant of the capacity of Cx43 to assume its channel function is unarguably its proper membrane localization. From array based mRNA expression data, it is until now impossible to assess neither the cellular localization nor the functional status of Cx43. It is very likely that these important and relevant information would contribute to a more complete understanding of the functions of Cx43 in breast cancer. For instance, a recent study demonstrated that over-expressing Cx43 in two different HER2-positive breast cancer cell lines lead to a diverging ability to proliferate, migrate, form mammospheres and form tumors in mice. Tumorigenic characteristics of the cancer cells were enhanced when functional gap junction channels could not be formed upon Cx43 over-expression, but were reduced when membrane gap junctions plaques allowed cells to communicate [34]. These aspects of Cx43 biology might explain its different roles according to subtypes but also possibly within subtypes and should therefore be addressed. Additional researches are required to better understand the context that allows Cx43 to suppress or promote carcinogenesis in different intrinsic subtypes.

Gene Expression
We used 4K samples over different expression platforms. Vanvliet used Affymetrix Human Genome U133A (data processed with Robust Multi-Array Average (RMA)) [50]. Curtis discovery and Curtis validation used the Illumina HT-12 v3 platform (expression given as a Log2 intensity level) [47]. The Cancer Genome Atlas (TCGA) used a custom Agilent G4502A 244K array (expression given as Log2 Lowess normalized ratio) [51]. NKI used a Hu25K Agilent platform (samples were hybridized against a pool of equal amount of RNA from each patient and gene expression is given as a log10 of intensity ratio) [52]. Normalized signal per probe or probe set mRNA expression was downloaded for tumor samples for all five datasets. Breast cancer intrinsic subtype was assigned to each sample with the Pam50 molecular subtyping algorithm using the R genefu package [29]. Survival for each case was determined as in [27].
ERα, PR and HER2 status provided in the original publication of the dataset was used. For TCGA, ER, PR and HER2 status was obtained by immunohistochemistry (IHC). For NKI, ER and PR status was determined by IHC and the sample was considered positive if at least 10% of the cells were positive. For Vanvliet, ER and PR status was determined with the Bioconductor package ROCR based on the expression of the probe 205225_at and validated with IHC when available. NKI and Vanvliet HER2 status was determined using the probes of the HER2 amplicon genes. For Curtis datasets, ER, PR and HER2 status was based on mRNA expression. In the TCGA dataset, the level of some proteins has been investigated with reverse phase protein assay (RPPA). Data were available for total ERα, PR and HER2 as well as for the phosphorylated forms of ERα (pS118) and HER2 (pY1248) that are at least partially indicative of the activation status [46,53]. GJA1 protein level obtained by mass spectrometry for 105 TCGA samples was retrieved from the protein report found at the Clinical Proteomic Tumor Analysis Consortium (CPTAC) data portal [54]. Levels are given as the log2 of the ratio of each sample with respect to a pooled reporter sample.

DNA Alteration
Copy number alterations (CNAs) were measured in the TCGA dataset with Affymetrix 6.0 single nucleotide polymorphism (SNP) arrays and segmented using Circular Binary Segmentation (labeled here as Relative linear copy number values) [51]. Data were further processed by TCGA using GISTIC 2.0 to assign the Putative copy number calls per gene (−2: Homozygous deletion, −1: Hemizygous deletion, 0: Neutral/no change, 1: Gain, 2: High level amplification) [51]. Mutations were detected using whole-exome sequencing after controlling for germline and normal adjacent tissue mutations [51]. Linear and called CNA data as well as mutation data for the TCGA dataset were retrieved using R via cBioportal [55,56]. A total of 977 patients had data for mutations [51].

Survival Analysis
Survival data was available for all five datasets. The log-rank test was used to estimate significance and hazard ratios (95% CI) were computed via Cox regression using survival package [57]. ROC curves were computed using the pROC package [58]. Kaplan-Meier plots were used to visualize the data. The best cutoff to determine tumors expressing high or low levels of Cx43 was selected using either the ROC curves or based on the smallest p value of the log rank test computed for each threshold between 10 and 90. For aggregated datasets analysis, each cohort was first split into groups based on the selected threshold and datasets were pooled only after splitting.
In addition, survival analyses were computed using the BreastMark and KMPlotter web platforms that use several well-known dataset [35,59]. BreastMark allows thresholds of 25, 50 and 75 percentile to be selected to split the different cohorts used before aggregating them. For each analysis, we selected the threshold giving the best results. Since less samples were available in KMPlotter for statistical computation, we only included analyses for which there were at least 100 samples to draw both high and low expression curves for each subtype.

Statistical Analysis
All statistical analyses were carried out with R version 3.4.3 [60]. For two class comparisons, (cancer vs. normal tissues; positive vs. negative hormonal status) the Wilcoxon-Mann-Witney test was used. When more than two classes were compared, the Kruskall-Wallis test was used followed by the Dunn post-hoc test to assess the statistical significance for each pair of samples (to compare subtypes). For stages and grades, differential gene expression was assessed using Limma package [61]. A Benjamini-Hochberg correction was applied to adjust the p values for multiple testing. Because each subtype had a different number of patients, when correlation tests were performed between the expression of GJA1 and the expression of other genes, a non-parametric bootstrap procedure was used for each subtype to derive the mean correlation coefficient and a percent confidence interval.

Conclusions
Our study has clarified the expression pattern of GJA1 mRNA in breast cancer and showed that GJA1 expression, as well as its prognostic significance, is dependent on breast cancer subtype. We also highlighted important biases that are introduced in analyzing pooled tumors. These biases need to be taken into consideration when studying GJA1, but also numerous other genes that are known to be linked, for instance, to ERα expression. Breast cancers are heterogeneous and genetically diverse and the lack of recognition of this molecular heterogeneity might explain the conflicting results from the literature, not only for GJA1, but potentially for other tumor suppressors or oncogenes. Overall, these results clearly showed that the molecular context where Cx43 is expressed in general, and the tumor subtypes of breast cancer in particular, should be taken into account when investigating Cx43's role in carcinogenesis.

Conflicts of Interest:
The authors declare no conflict of interest.    I  II  III IV  Breast I  II  III  IV  Breast I  II  III  IV  Breast I  II  III  IV  Breast I  II  III           Threshold p value  Threshold p value Threshold p value Threshold p value Threshold p value Threshold p value

Curtis Validation
Cox regression hazard ratio Figure A11. Cox regression hazard ratio associated with GJA1 for different thresholds. Graph of the Cox regression hazard ratio to compare the survival of patients with Cx43 expressed at high or low levels according to varying thresholds (10-90 percentile). Horizontal red lines indicate a neutral hazard ratio of 1. Vertical red lines indicate the threshold with the lowest log rank test p value as determined in Figure A10, used for subsequent survival analysis. Results given for pooled tumors or individual breast cancer subtypes in our five datasets (TCGA, NKI, Vanvliet and Curtis discovery and Curtis validation) as well as in aggregated datasets (Pooled datasets).