The HPSE Gene Insulator—A Novel Regulatory Element That Affects Heparanase Expression, Stem Cell Mobilization, and the Risk of Acute Graft versus Host Disease

The HPSE gene encodes heparanase (HPSE), a key player in cancer, inflammation, and autoimmunity. We have previously identified a strong HPSE gene enhancer involved in self-regulation of heparanase by negative feedback exerted in a functional rs4693608 single-nucleotide polymorphism (SNP) dependent manner. In the present study, we analyzed the HPSE gene insulator region, located in intron 9 and containing rs4426765, rs28649799, and rs4364254 SNPs. Our results indicate that this region exhibits HPSE regulatory activity. SNP substitutions lead to modulation of a unique DNA-protein complex that affects insulator activity. Analysis of interactions between enhancer and insulator SNPs revealed that rs4693608 has a major effect on HPSE expression and the risk of post-transplantation acute graft versus host disease (GVHD). The C alleles of insulator SNPs rs4364254 and rs4426765 modify the activity of the HPSE enhancer, resulting in altered HPSE expression and increased risk of acute GVHD. Moreover, rs4426765 correlated with HPSE expression in activated mononuclear cells, as well as with CD3 levels and lymphocyte counts following G-CSF mobilization. rs4693084 and rs28649799 were found to be associated with CD34+ levels. Our study provides new insight into the mechanism of HPSE gene regulation and its impact on normal and pathological processes in the hematopoietic system.


Study Population
The retrospective study included 674 consecutive patients (284 females and 390 males) with hematologic malignancies. The patients were transplanted at the Bone Marrow Transplantation Unit of the Chaim Sheba Medical Center (Tel Hashomer, Israel). All patients gave written informed consent, and the study was approved by the Ethics Committee of the Sheba Medical Center and the Israeli Ministry of Health (#4247). Patient characteristics are presented in Supplementary Table S1. Three hundred and sixty-five patients received grafts from human leukocyte antigen (HLA)-identical siblings, and 309 patients were transplanted from unrelated donors. The median age was 51 years (range 15 to 80 years). Conditioning regimens before HSCT, prophylaxis against GVHD, administration of granulocyte-colony stimulating factor (G-CSF), and prophylaxis with antibiotics and antimycotics were performed as previously described [9,26,30].
The effect of functional HPSE SNPs on G-CSF-mediated peripheral blood stem cell mobilization was analyzed in 280 consenting healthy stem cell transplantation donors. Donors received G-CSF for 5 days. The daily dose of G-CSF was adjusted to donor weight (approximately 10 µg/kg/day), as recommended on the G-CSF label (480, 600, 780, or 960 µg/day) [31]. Total leucocytes from 108 unrelated healthy Israeli volunteers (53 females and 55 males) served as normal controls for analyzing the association of new insulator SNPs with heparanase mRNA expression levels. Their median age was 38 years (range 18 to 59 years). In addition, 104 peripheral blood (PB) samples from normal healthy individuals were examined for HPSE gene expression in untreated and LPS-treated mononuclear cells (MNCs). Their median age was 39 years (range 25 to 56 years). All normal subjects gave their written informed consent and the study was approved by the Ethics Committee of the Sheba Medical Center and the Israeli Ministry of Health (#4247 and #2341-01-SMC). Human total leukocytes from healthy individuals and umbilical cord blood, and primary cells from patients with T-ALL and B-ALL, were used for nuclear protein extraction. All subjects gave their written informed consent.

SNPs Analysis
The genotypes of rs4693084 and rs4364254 SNPs were obtained by allele-specific amplification. Polymerase chain reaction (PCR) fragments were amplified from genomic DNA (Wizard® Genomic DNA Purification kit, Promega, Madison, WI, USA). PCR reactions were performed as previously described [22]. Genotypes of rs4693608, rs4426765, and rs28649799 SNPs were identified using a Real-Time SNP Assay. Custom-specific primers and probes were purchased from Bio Search Technologies (Novato, CA, USA). The PCR reaction mixture consisted of 5 µL of AccuStart Genotype Tough Mix Rox (Quanta, Gaithersburg, MD, USA) and 0.25 µL of primers and probes mix, according to the manufacturer's instructions. Reactions were performed using an ABI PRISM 7700 sequence detector (Applied Biosystems, Warrington, UK).

DNA Constructs
A 223 bp fragment of intron 9, which includes SNPs rs4426765, rs28649799, and rs4364254, was amplified by PCR and cloned into the PCR II-TOPO vector (Invitrogen by Life Technologies, Carlsbad, CA, USA) using the forward and reverse primers listed in Supplementary Table S2. The insulator fragments were digested with Hind III and Xho I and ligated into the phosphatase-treated pGL4.26 (luc2/minP/Hygro) vector (Promega, Madison, WI, USA), followed by transformation into JM109 Competent cells (Promega, Madison, WI, USA). Three sense and three antisense DNA constructs were prepared (Figure 1), and their orientation was confirmed by DNA sequencing.
Figure 1. Structure of DNA constructs for luciferase reporter assay. Cloning of the HPSE gene fragment of intron 9 was performed using PCR II-TOPO vector. The insulator fragments were digested with Hind III and Xho I with subsequent ligation into phosphatase-treated pGL4.26 (luc2/minP/Hygro) vector. Three sense and three antisense DNA constructs were prepared. The constructs were designated with Roman numerals I, II, and III, and represent possible allelic variants in the general population.

Luciferase Reporter Assay
HT-1080, H1229, MCF7, PC3, RPMI8226, SU-DHL4, KG-1, Jurkat, CEM, NALM6, and MOLT3 cells were selected for the luciferase assay, performed in six-well or 12-well culture dishes according to the manufacturer's instructions (AMAXA biosystems, Lonza, Germany). Cells were transfected using the Ingenio Electroporation Kit (Mirus Bio, Madison, WI, USA) and Nucleofector™ (AMAXA biosystems, Lonza, Germany). Two µg of DNA construct were used for each electroporation. Twenty-four hours after transfection, the cells were lysed in Passive Lysis Buffer (Promega, Madison, WI, USA). Firefly luciferase activity was measured using the Luciferase Assay System (Promega, Madison, WI, USA) and a 20/20n Luminometer (Turner BioSystems, Sunnyvale, CA, USA). For normalization, total protein was quantified by the Bradford method (Bio-Rad, Hercules, CA, USA). Luciferase activity of pGL4.26 without insert was set to 100%. All experiments were performed in triplicate and analyzed using a t-test. A p value ≤ 0.05 was considered statistically significant.
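The normalization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `relative_luciferase` and the well layout are hypothetical, and averaging the per-well luminescence-to-protein ratios across the triplicate is an assumed choice that the text does not specify.

```python
from statistics import mean


def relative_luciferase(construct_wells, vector_wells):
    """Return construct activity as a percentage of the empty vector.

    Each well is a (raw_luminescence, total_protein) pair. Dividing by
    total protein mirrors the Bradford-based normalization; the empty
    pGL4.26 vector defines 100%. Averaging per-well ratios is an
    assumption made for this sketch.
    """
    def per_protein(wells):
        return mean(lum / protein for lum, protein in wells)

    baseline = per_protein(vector_wells)
    return 100.0 * per_protein(construct_wells) / baseline


# Hypothetical triplicate readings (arbitrary units), not study data.
empty_vector = [(1000, 10), (1100, 11), (900, 9)]
construct = [(3000, 10), (2400, 8), (3600, 12)]
activity = relative_luciferase(construct, empty_vector)
```

With these invented readings the construct comes out at three times the empty-vector signal per unit protein, that is, 300% relative activity.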

Electromobility Shift Assay (EMSA)
The primers used to create allele-specific biotin-labeled probes are presented in Supplementary Table S2. The method for generating the oligonucleotide probes was previously published [22]. The altered bases are marked in bold. Nuclear protein extracts were prepared using a Nuclear Extraction Kit (Millipore, Temecula, CA, USA). The Gelshift Chemiluminescent EMSA Kit (Active Motif, Rixensart, Belgium) was used according to the manufacturer's protocol. EMSA reactions were performed as previously described [22].

Statistical Analysis
Genotype and allele frequencies of the SNPs were calculated by direct counting. All individuals were distributed into three groups according to the three possible genotypes for each SNP. The mean mRNA relative quantification was calculated for each group. The Aspin-Welch unequal variances t-test was used to assess the association between the HPSE gene SNPs and the relative HPSE mRNA in normal total leukocytes and MNCs treated with LPS. This test is more reliable when the two groups have unequal variances and/or unequal sizes. Because G-CSF-mediated peripheral blood stem cell mobilization is a continuous variable, some groups were small owing to the low frequency of the corresponding alleles, and a normal distribution of the variables could not be assumed, the Mann-Whitney U test was chosen for this analysis. Acute GVHD was graded using the Glucksberg criteria [32]. The cumulative incidence of acute GVHD was calculated for both enhancer and insulator HPSE SNPs. Statistical analysis was carried out for grades II-IV acute GVHD. In the analysis of the cumulative incidence of GVHD, relapse was considered a competing risk [33]. Time to clinical event was measured from the date of HSCT. An analysis of the disparity between patient and donor was performed. A p value ≤ 0.05 was considered statistically significant. The NCSS 2007 software (NCSS, Kaysville, UT, USA) was used for statistical analysis.
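To make the two group-comparison tests concrete, the following minimal Python sketch computes the Aspin-Welch t statistic with its Welch-Satterthwaite degrees of freedom, and the Mann-Whitney U statistic by pairwise counting. It is not the NCSS implementation used in the study; the function names and the sample values are hypothetical, and converting the statistics to p values (which requires the t distribution and the U distribution or its normal approximation) is deliberately left out.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Aspin-Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    # Squared standard errors of each group mean (sample variance / n).
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


def mann_whitney_u(a, b):
    """U statistic for sample a: pairs with a > b count 1, ties count 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x in a for y in b)


# Hypothetical expression values for two genotype groups, not study data.
group_1 = [1, 2, 3, 4, 5]
group_2 = [2, 4, 6, 8, 10]
t_stat, dof = welch_t(group_1, group_2)
u_stat = mann_whitney_u(group_1, group_2)
```

The Welch form does not pool the two variances, which is why it tolerates the unequal group sizes that arise when one genotype is rare.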

Detection of Insulator Activity in Cancer Cell Lines of Hematological and Non-Hematological Origin
The ChIP-seq data, published in the Ensembl Database, identified a 200 bp regulatory region in the 9th intron of the HPSE gene (http://apr2020.archive.ensembl.org/Homo_sapiens/Variation/Mappings?db=core;r=4:83301922-83302922;v=rs4426765;vdb=variation;vf=66317526, accessed on 20 July 2021). The region was defined as an insulator according to the binding site of the CCCTC-binding factor (CTCF). The substitution of C to A for SNP rs4426765 disrupts the binding site of this transcription factor. Two additional polymorphic SNPs, rs4364254 and rs28649799, are located in this region and can also alter the regulatory activity of the insulator.
To determine the presence of an active insulator in intron 9 and to study the functional effects of the three polymorphic SNPs, we used a luciferase reporter gene with a minimal promoter and measured luciferase activity. Cell lines derived from seven hematological malignancies (RPMI8226, SU-DHL4, KG-1, Jurkat, CEM, NALM6, and MOLT3) and four solid tumors (HT-1080, H1229, PC-3, and MCF7) were transiently transfected with each of the six DNA constructs or the empty vector (Figure 1). Our results show that a 200 bp fragment of the HPSE gene exhibits regulatory activity in both the sense and antisense directions (Table 1). Of note, in most of the tested cell lines, the relative luciferase activity was higher in constructs carrying the insulator in the antisense direction. Moreover, the constructs that included the A-A-T allele combination yielded increased levels of luciferase activity compared to the other constructs (Table 1). In addition, the relative luciferase activity was significantly higher in the solid tumor (especially HT-1080), multiple myeloma (RPMI8226), and lymphoma (SU-DHL4) cells than in the leukemic cell lines, particularly the CEM and MOLT3 acute lymphocytic leukemia cells. The effect of rs4426765 and rs4364254 SNPs on luciferase activity was evident in solid tumors and some of the hematological cancer cell lines (Table 1). Specifically, allele C of both SNPs significantly suppressed luciferase activity in the antisense direction.

Effect of rs4426765, rs28649799, and rs4364254 SNPs on the Binding of Nuclear Proteins to the Insulator Region
Nuclear extracts of hematological and solid tumor cell lines (NALM6, 018Z, H1229, and PANC-1), primary leucocytes from healthy adults and umbilical cord blood, as well as primary cells from patients with T-ALL and B-ALL were incubated with biotin-labeled probes (Supplementary Table S2) and subjected to electrophoretic mobility shift assay. Each probe included a corresponding SNP alteration. The DNA-protein complexes appeared as a band shift pattern. The specificity of the protein-oligonucleotide interaction was determined as previously described [22].
Analysis of normal leukocyte samples from healthy controls revealed gel shift bands for both allelic probes in two (rs28649799 and rs4364254) out of the three SNPs, while in samples from umbilical cord blood nuclear-protein complexes were formed only with the C alleles of rs4426765 and rs4364254 SNPs (Figure 2). Remarkably, in the analyzed malignant cell lines, the band of the DNA-protein complex was shifted significantly more than in the normal cell samples (Figure 2). Furthermore, the formation of DNA/protein complexes was found predominantly with the C or G alleles of each SNP (Figure 2). EMSA analysis of samples from patients with B-ALL and T-ALL exhibited the presence of large DNA/protein complexes formed by all three SNP probes (Figure 2). Each probe disclosed its own pattern. The affinity of the protein for the C allele of rs4426765 and the G allele of rs28649799 was higher than for the corresponding A allele. As seen in Figure 2, the number of nuclear proteins associated with the rs4426765 probe was so large that it led to disappearance of the unbound probes. Additionally, the affinity of the T-ALL patient samples for DNA was lower compared to that of B-ALL cells. The disappearance of unbound probes was less noticeable for rs28649799 and rs4364254. These results indicate that the highest amount of nuclear proteins binds to the insulator region that contains the CTCF binding site.

Figure 2. Electromobility shift assay (EMSA) of rs4426765, rs28649799, and rs4364254 SNPs using allele-specific oligonucleotide probes. Nuclear protein extracts from healthy donors, hematological malignancies, and solid tumor cell lines were incubated with three allele-specific biotin-labeled probes and analyzed by electrophoretic mobility shift assay. The order of the SNPs (rs4426765, rs28649799, and rs4364254) corresponds to their location in the insulator. In normal leukocytes, gel shift bands were detected for both allelic probes of rs28649799 and rs4364254 SNPs. In cord blood samples, nuclear-protein complexes were formed only with the C alleles of rs4426765 and rs4364254 SNPs. In the malignant cell lines, the band representing DNA-protein complexes shifted to a higher extent than in normal samples. The binding of protein complexes was found predominantly with the C, G, and C alleles of each SNP. Analysis of B-ALL and T-ALL primary samples exhibited the presence of large DNA/protein complexes formed by all three SNP probes. The affinity of the protein for the C allele of rs4426765 and the G allele of rs28649799 was higher compared to the A allele. The number of nuclear proteins bound to the rs4426765 probes was so large that the bands of unbound probes disappeared. Less noticeable was the disappearance of unbound probes for rs28649799 and rs4364254 SNPs.

Acute GVHD
To assess the role of the HPSE insulator in predicting the risk of acute GVHD post-HSCT, two patient groups from our previous studies [9,26] were pooled and analyzed. Three hundred and ten patients developed acute GVHD: 53 (17.1%) grade I, 125 (40.3%) grade II, 49 (15.8%) grade III, and 83 (26.7%) grade IV, while 261 did not.
The univariate analysis of the cumulative incidence of clinically significant acute GVHD (grades II-IV) in association with two enhancer (rs4693608, rs4693084) and three insulator (rs4426765, rs28649799, and rs4364254) SNPs and their combinations is presented in Table 2. rs4693608 is the main SNP associated with the risk of acute GVHD. The cumulative incidence of acute GVHD on day 100 (grade II-IV) was 47 Table 2). Two additional SNPs (rs4693084 and rs4364254) tended to correlate with the risk of acute GVHD (p = 0.067 and p = 0.076, respectively, Table 2).
We subsequently analyzed the interaction between enhancer (rs4693608) and insulator (rs4364254 and rs4426765) SNPs. For this, the cumulative incidence of acute GVHD was calculated for each genotype separately (Table 3). This approach disclosed that the rs4364254 CC genotype reduces the risk of acute GVHD in patients with the heterozygous rs4693608 AG genotype (20.7%; 95% CI 10.2-42.2). An opposite effect was obtained for the AA-TC insulator genotype. This SNP combination dictates an increased risk of acute GVHD in both rs4693608 AG (55.2%; 95% CI 42.0-72.4) and GG (46.7%; 95% CI 30.2-72.0) carriers (Table 3). The low frequency of certain genotypes (AA-CC-TT, AA-CC-TC, AA-AA-CC, and AA-CC-CC) in the general population did not allow assessing the risk of acute GVHD in these patients. We therefore combined them into one group, which demonstrated a high risk of acute GVHD: 66.7% (95% CI 46.6-95.4) (Table 3).
Table 3. Univariate analysis of cumulative incidence of clinically significant (II-IV) acute GVHD: examination of interaction between patient rs4693608, rs4426765, and rs4364254 SNPs.
In our previous studies [25,26], the combination of rs4693608 and rs4364254 SNPs was evaluated. This approach allowed distribution of all possible HPSE genotype combinations into three groups (HR, MR, and LR) correlating with high, intermediate, and low heparanase mRNA expression levels and high, intermediate, and low risk of acute GVHD, respectively. In the present study, the cumulative incidence of acute GVHD (grade II-IV) on day 100 post-HSCT was 47.1% (95% CI 40.4-55.1) for recipients of group HR, 41.1% (95% CI 35.5-47.5) for recipients of group MR, and 27.6% (95% CI 21.6-35.3) for recipients of group LR (p = 0.000121, Table 2).
Based on enhancer-insulator interaction results (Table 3), updated SNP groups were allocated. The N-HR group includes all individuals with the rs4693608 AA genotype and AG-AA-TC recipients from the AG heterozygote group. The N-MR group consists of carriers AG-AC-TC, AG-CC-TC, AG-NN-TT, and GG-AA-TC, and the N-LR group includes the genotypes GG-AC-TC, GG-NN-TT, GG-NN-CC, and AG-NN-CC. NN is any genotype (AA, AC, or CC) of rs4426765 SNP. The cumulative incidence of day 100 grade II-IV acute GVHD was 49.2% (95% CI 43.1-56.0) for N-HR recipients, 39.6% (95% CI 33.9-46.4) for N-MR, and 22.5% (95% CI 16.5-30.7) in the N-LR group, respectively (p < 0.00001, Table 2).
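The updated grouping can be read as a lookup rule over the three genotypes, written in the order rs4693608-rs4426765-rs4364254. The Python sketch below encodes only the combinations listed in the paragraph above; the function name `classify_n_group` is hypothetical, and returning `None` for any combination the text does not assign (for example GG-CC-TC) is a choice made for this sketch rather than a rule from the study.

```python
def classify_n_group(rs4693608, rs4426765, rs4364254):
    """Map an enhancer-insulator genotype to N-HR, N-MR, or N-LR.

    Genotypes are strings such as "AG", "AC", "TC". Combinations not
    listed in the text return None.
    """
    if rs4693608 == "AA":
        return "N-HR"                      # all rs4693608 AA carriers
    if rs4693608 == "AG":
        if rs4364254 == "CC":
            return "N-LR"                  # AG-NN-CC
        if rs4364254 == "TT":
            return "N-MR"                  # AG-NN-TT
        if rs4364254 == "TC":
            if rs4426765 == "AA":
                return "N-HR"              # AG-AA-TC
            if rs4426765 in ("AC", "CC"):
                return "N-MR"              # AG-AC-TC, AG-CC-TC
    if rs4693608 == "GG":
        if rs4364254 in ("TT", "CC"):
            return "N-LR"                  # GG-NN-TT, GG-NN-CC
        if rs4364254 == "TC":
            if rs4426765 == "AA":
                return "N-MR"              # GG-AA-TC
            if rs4426765 == "AC":
                return "N-LR"              # GG-AC-TC
    return None                            # not assigned in the text


example = classify_n_group("AG", "AA", "TC")
```

Here `example` evaluates to `"N-HR"`, matching the AG-AA-TC recipients that the text moves from the heterozygote group into the high-risk group.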
We previously reported that disparities between recipient and donor pairs in SNP combinations of the HPSE gene significantly increase the likelihood of developing acute GVHD post-HSCT [9,26]. All recipient-donor pairs were divided into three groups according to the potential risk for acute GVHD development. The first group, D1, contained pairs with a high risk of developing acute GVHD (HR-MR and HR-LR pairs). The second group, D2, consisted of pairs with a moderate risk of acute GVHD development (HR-HR, MR-MR, MR-HR, and MR-LR). The third group, D3, included three pairs with a low risk of developing acute GVHD (LR-LR, LR-MR, and LR-HR pairs) [26]. We then performed a univariate analysis of the cumulative incidence of clinically significant acute GVHD in the renewed recipient-donor pairs (

Correlation between Enhancer and Insulator HPSE SNPs and Heparanase mRNA Levels in MNCs before and after LPS Treatment
Using activated PB MNCs, we have previously shown statistically significant associations between HPSE expression levels and rs4693608, rs4364254, and their genotype combinations among healthy individuals [9]. HPSE expression in steady-state MNCs showed little or no difference among individuals with various HPSE gene genotypes (Table 4). In contrast, analysis of the correlation between new insulator SNPs (rs4426765 and rs28649799) and HPSE gene expression in LPS-treated MNCs revealed a highly significant association with rs4426765. The level of HPSE expression was relatively high in carriers of the AA and AC genotypes and relatively low in carriers of the CC genotype (Table 4). The ratio test revealed a strong correlation between rs4693608, rs4426765, and rs4364254 SNP genotypes and the increase in HPSE expression levels in response to LPS stimulation. Healthy people with genotypes AA, AA, and TT increased HPSE levels to a greater extent than their counterparts with genotypes GG, CC, and CC, respectively (Table 4).
Next, we investigated how interactions between the three significant SNPs (rs4693608, rs4426765, and rs4364254) affect HPSE expression in LPS-stimulated MNCs. First, the interaction between the two insulator SNPs (rs4426765 and rs4364254) was analyzed. We allocated all individuals into three groups depending on the level of HPSE expression. Group 1 included genotypes AA-TT and AA-TC, group 2 included AC-TT and AC-TC, and group 3 included the CC-TC, CC-CC, AC-CC, and AA-CC genotypes. Comparison of heparanase expression between AA-TT and AA-TC (group 1), between AC-TT and AC-TC (group 2), and between AC-CC and CC-CC (group 3) did not reveal any differences (Table 5), whereas highly significant differences were observed between the three groups (Table 5). Group 1 exhibited a higher level of HPSE mRNA compared to the other two groups (p = 0.025), while group 3 expressed a low level of HPSE (p = 0.0029).
Subsequently, the interaction between the enhancer and insulator SNPs was studied. It became clear that the best way to understand how insulator SNPs alter enhancer activity is to analyze heterozygous individuals with intermediate HPSE expression (Table 5). Heterozygous AG individuals with genotypes AA-TT and AA-TC in the insulator disclosed relatively high levels of HPSE expression, while heterozygous individuals with two or more C alleles in the insulator SNPs exhibited relatively low levels of HPSE gene expression (p = 0.023, Table 5). Thus, we concluded that an active insulator with C alleles decreases the activity of the HPSE gene enhancer and thereby diminishes HPSE expression levels. Of note, the same analysis of individuals homozygous for rs4693608 (genotypes AA and GG) did not reveal any difference (Table 5). These results are consistent with the enhancer-insulator interaction found in the group of patients with acute GVHD (Table 3). Recipients with the AG-AA-TC genotype had a higher cumulative incidence of acute GVHD, while those with the AG-NN-CC genotype had a lower risk (Table 3). Association analysis between the updated enhancer-insulator groups and the relative expression of HPSE mRNA in LPS-treated MNCs disclosed that individuals with the N-LR genotype displayed low levels of heparanase compared to those with the N-MR and N-HR genotypes (p = 0.0041) (Table 4).

Association between Enhancer and Insulator HPSE Gene SNPs and the Levels of HPSE mRNA in Normal Leukocytes
Our previous study showed that HPSE rs4693608 and rs4364254 SNPs correlated with HPSE mRNA expression in total leucocytes [25]. Subsequent investigations have suggested that this correlation is restricted to neutrophils [9]. In the present study, new rs4426765 and rs28649799 insulator SNPs were analyzed for correlation with HPSE mRNA levels in total leukocytes among healthy individuals. In addition, the interaction between enhancer and insulator SNPs was examined. Healthy individuals with genotypes GG (rs4693608) or CC (rs4364254) disclosed relatively low levels of HPSE (p = 0.0028 and 0.00049, respectively), while individuals with genotypes AA (rs4693608) or TT (rs4364254) expressed relatively high levels of HPSE mRNA. In contrast, the new insulator SNPs (rs4426765 and rs28649799) disclosed no correlation (Table 6). Previous genotype combination groups HR, MR, and LR are presented in Table 6. Correlation analysis between the updated groups of enhancer-insulator SNPs and the relative expression of HPSE mRNA in normal total leucocytes was performed. Individuals with the N-LR genotype combination exhibited low levels of heparanase compared to those with the N-MR and N-HR genotypes (p = 0.000076) (Table 6). This association was stronger than that obtained with the previous rs4693608-rs4364254 SNP combinations (p = 0.000076 for the N-LR group vs. p = 0.00098 for the LR group compared to the other groups, respectively, Table 6). The results indicate that the complex interaction between the rs4693608 enhancer SNP and the rs4426765 and rs4364254 insulator SNPs significantly determines heparanase expression variability among normal persons.

G-CSF-Mediated Peripheral Blood Stem Cell Mobilization
G-CSF-mobilized peripheral blood stem/progenitor CD34+ cells are the main graft source used for HSCT [34]. We have previously shown that G-CSF treatment resulted in high-affinity binding of heparanase to the enhancer region of the HPSE gene, followed by increased heparanase expression in donor cells [22,27].
Two hundred and eighty donors (106 females and 174 males) were included in the analyses. The median age was 45 (range, 19-73) years. All the observed polymorphisms were in Hardy-Weinberg equilibrium. The parameters of mobilization included in the analyses are presented in Table 7. One enhancer SNP (rs4693084) and one insulator SNP (rs28649799) were found to correlate with CD34+ levels (Table 7). The overall CD34+ yield, as well as CD34+ × 10^6/kg, were higher in carriers of the TT genotype compared to those with GG and GT genotypes (p = 0.0068 and p = 0.025, respectively). The insulator SNP rs28649799 also revealed an association with the low-frequency G allele (p = 0.039). Analysis of the interaction between these two SNPs indicated that only TT-AA, GG-AG, and TT-AG disclosed an increased number of CD34+ cells, whereas the GT-AG genotype showed a lower level, similar to that of other genotype combinations. The TT-AG genotype exhibited the highest CD34+ mobilization yield. The additive effect of both SNPs may explain this result. However, the frequency of genotypes TT-AA, GG-AG, and TT-AG in the general population is low. Therefore, additional studies applying large population cohorts are needed to clarify the function of both SNPs in G-CSF-mediated mobilization.
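The text does not state how Hardy-Weinberg equilibrium was assessed. One common approach, shown here only as an assumed illustration, is a chi-square goodness-of-fit test on the three genotype counts, with allele frequencies obtained by direct counting. The function name and the counts are hypothetical, and the sketch assumes both alleles are present in the sample (otherwise an expected count is zero).

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for deviation from Hardy-Weinberg proportions.

    n_aa, n_ab, n_bb are observed counts of the two homozygotes and the
    heterozygote. Assumes both alleles occur, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)        # allele frequency by direct counting
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical genotype counts, not study data.
chi2_balanced = hwe_chi_square(25, 50, 25)
chi2_skewed = hwe_chi_square(30, 40, 30)
```

With one degree of freedom, a statistic above 3.841 corresponds to p < 0.05; the first invented sample fits the expected proportions exactly, while the second, with a deficit of heterozygotes, exceeds that threshold.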

Discussion
Even though noncoding regions constitute over 98% of the human genome, epigenetic profiling studies have shown that more than 80% of the human genome is potentially functional [35]. Insulators are classes of DNA sequence elements that have a common ability to protect genes from inappropriate signals proceeding from their surrounding environment [36]. An insulator element may act as an enhancer-blocker by disrupting enhancer-promoter interactions, when positioned between enhancer and promoter regions, without rendering the enhancer inactive. Insulators can also protect genes by acting as "barriers" that prevent the advance of nearby condensed chromatin that may otherwise silence gene expression. While barrier activity prevents transcriptional repression, its enhancer-blocking property interferes with transcriptional activation [36].
In the present study, we identified a new regulatory region in intron 9 of the HPSE gene. According to ChIP-seq data, the CCCTC-binding factor binds to this region. In mammals, a zinc-finger protein, CTCF, binds to most of the known insulator sequences [36], suggesting that the newly identified 200 bp regulatory region is an insulator. There are three polymorphic SNPs in this region of the HPSE gene. C to A substitution in SNP rs4426765 destroys the CTCF binding site. Analysis of the luciferase reporter revealed cell type- and SNP-dependent regulatory activity in this region. EMSA disclosed binding of the DNA/protein complex to both alleles, with a higher affinity for allele C. Importantly, all the examined malignant cell lines and primary leukemia samples disclosed a more prominent shift of the main bands compared to normal cell samples. This result is likely due to the formation of DNA/protein complexes with additional proteins that were bound to the probes. Binding of protein complexes was found mainly to the C or G alleles of rs4426765, rs28649799, and rs4364254.
While subsets of CTCF binding sites are known to be associated with insulators, many insulators can function independently of CTCF [29]. Correlation analyses disclosed that HPSE insulator function is not dictated solely by binding of the CTCF transcription factor. Our results suggest that additional proteins are involved in the formation of a DNA-protein complex in the insulator region and that insulator SNPs can alter the composition and affinity of these complexes. Figure 3 presents a summary diagram describing the identified significant correlations between enhancer and insulator SNPs with heparanase expression, G-CSF mobilization, and the risk of acute GVHD after HSCT. The location of the enhancer (red rectangle) and insulator (blue rectangle) is indicated relative to the HPSE gene map (Figure 3A), and the effect of HPSE SNPs on various processes is shown in Figure 3B.
Correlation analysis of HPSE enhancer and insulator SNPs in different groups of healthy individuals and transplanted patients revealed that rs4364254 and rs4426765 play an important role in the insulator activity. We have previously described a significant correlation between rs4364254 and progesterone receptor (PR) expression in breast cancer patients [27]. High expression of the progesterone receptor correlated with the C allele, and low expression with the T allele (p = 0.002) [27]. PR-positive breast cancers exhibit improved prognosis [37,38]. In addition, the C allele of rs4364254 SNP was found to be associated with a low risk of developing veno-occlusive disease of the liver [39]. These results suggest that the CC genotype of the rs4364254 insulator SNP decreases the enhancer activity and thereby diminishes the expression level of the HPSE gene and possibly its protumorigenic and proinflammatory activities.
Our results emphasize the role of the CTCF-associated rs4426765 in lymphocyte activity and mobilization after G-CSF treatment. Carriers of the CC genotype showed a significantly lower level of HPSE expression following incubation with LPS than carriers of the AC and AA genotypes. In addition, significantly improved G-CSF-induced CD3 levels and higher lymphocyte counts were found in CC genotype donors. These results complement our previously published findings, in which we analyzed the effect of LPS on the ability of DNA/protein complexes to bind the strong intron 2 enhancer of the HPSE gene [22]. Exposure to LPS resulted in the disappearance of DNA/protein complexes in normal MNCs and decreased the affinity of DNA/protein complexes in U937 monocytic cells [22]. In contrast, exposure to G-CSF caused high-affinity interaction of nuclear proteins with the enhancer region, associated with increased HPSE expression in donor cells [22,27]. It is conceivable that the HPSE intron insulator strongly regulates the function of the enhancer and thereby affects heparanase involvement in lymphocyte activation and function. Additional support for the importance of the rs4426765 SNP in lymphocyte function was provided by the EMSA results in B- and T-ALL patient samples. A large number of nuclear proteins bind to the insulator region containing the CTCF binding site. The affinity of nuclear proteins for allele C was higher than that for allele A. The size of DNA/protein complexes decreased with increasing distance from SNP rs4426765 toward the distal polymorphisms along the insulator nucleotide sequence. Correlation analysis of HPSE enhancer and insulator SNPs in different groups of healthy individuals and transplanted patients revealed that rs4364254 and rs4426765 play an important role in the insulator activity.
We have previously described a significant correlation between rs4364254 and progesterone receptor (PR) expression in breast cancer patients [27]. High expression of the progesterone receptor correlated with the C allele, and low expression with the T allele (p = 0.002) [27]. PR-positive breast cancers exhibit improved prognosis [37,38]. In addition, the C allele of the rs4364254 SNP was found to be associated with a low risk of developing veno-occlusive disease of the liver [39]. These results suggest that the CC genotype of the rs4364254 insulator SNP decreases the enhancer activity and thereby diminishes the expression level of the HPSE gene and possibly its protumorigenic and proinflammatory activities. The enhancer SNP rs4693608 had the leading role in the expression of the HPSE gene in normal neutrophils and LPS-activated MNCs, as well as in predicting the risk of acute GVHD post-transplantation. Analysis of the interaction between enhancer and insulator SNPs (especially in individuals heterozygous for the enhancer SNP rs4693608) provides insight into the mechanism by which an insulator can alter enhancer activity. The CC genotype of the rs4364254 SNP reduces the activity of the HPSE gene enhancer. Individuals with the AG-CC genotype (rs4693608 AG, rs4364254 CC) were found to have low levels of HPSE expression in normal neutrophils and LPS-activated MNCs, and a low risk of developing acute GVHD. In contrast, the AA-TC genotype of the rs4426765 and rs4364254 insulator SNPs increased the level of HPSE expression in LPS-activated MNCs. Moreover, the AA-TC insulator genotype increased the risk of acute GVHD in carriers of the AG and GG enhancer genotypes.
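To make the combined genotype notation concrete (for example, AG-CC for the enhancer SNP rs4693608 paired with the insulator SNP rs4364254, or AA-TC for the insulator pair rs4426765 and rs4364254), the following Python sketch builds such labels from per-SNP allele calls. The sample names, allele calls, the `ALLELE_ORDER` table, and the helper functions are hypothetical and serve only as an illustration; they are not data or code from this study.

```python
from collections import Counter

# Order in which the two alleles of each SNP are written. This ordering is an
# assumption chosen to match the notation in the text (e.g. "TC" for rs4364254).
ALLELE_ORDER = {
    "rs4693608": "AG",  # enhancer SNP
    "rs4426765": "AC",  # insulator SNP
    "rs4364254": "TC",  # insulator SNP
}


def genotype(snp, allele1, allele2):
    """Return an order-independent genotype string, e.g. ('C', 'T') -> 'TC'."""
    order = ALLELE_ORDER[snp]
    # order.index raises ValueError for an allele not listed for this SNP.
    alleles = sorted((allele1.upper(), allele2.upper()), key=order.index)
    return "".join(alleles)


def combined_genotype(calls, snps):
    """Join the genotypes of the given SNPs with '-', e.g. 'AG-CC'."""
    return "-".join(genotype(snp, *calls[snp]) for snp in snps)


# Hypothetical allele calls for three individuals (illustration only).
samples = {
    "donor_1": {"rs4693608": ("G", "A"), "rs4426765": ("C", "C"), "rs4364254": ("C", "C")},
    "donor_2": {"rs4693608": ("A", "G"), "rs4426765": ("A", "A"), "rs4364254": ("C", "T")},
    "donor_3": {"rs4693608": ("G", "G"), "rs4426765": ("A", "A"), "rs4364254": ("T", "C")},
}

# Enhancer-insulator label (rs4693608 with rs4364254).
enhancer_insulator = {
    name: combined_genotype(calls, ["rs4693608", "rs4364254"])
    for name, calls in samples.items()
}

# Insulator-insulator label (rs4426765 with rs4364254).
insulator_pair = {
    name: combined_genotype(calls, ["rs4426765", "rs4364254"])
    for name, calls in samples.items()
}

# Number of individuals per combined insulator genotype.
insulator_counts = Counter(insulator_pair.values())

print(enhancer_insulator)
print(insulator_pair)
print(insulator_counts)
```

Grouping individuals by such labels is the step that precedes any comparison of expression levels or clinical outcomes between genotype groups; the statistical comparison itself is not shown here.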
Heparanase is known to play a decisive role in tumor metastasis and angiogenesis [5]. High expression of heparanase is frequently observed in many primary human tumors of various etiologies, in correlation with high vessel density and a poor clinical outcome [40,41,42,43]. The enzyme is also involved in inflammation [44], fibrosis [45], diabetes [46], and kidney dysfunction [47]. The enhancer and insulator of the HPSE gene may be involved in the development of these and other pathological processes. It is important to note, however, that the interaction between enhancer and insulator can vary depending on the cell type and the heparanase-related process.
Heparanase-inhibiting compounds, antibodies, and small molecules are being developed [48,49,50]. Four different heparin mimics were subjected to clinical trials in cancer patients. However, none of the molecules reached the stage of approval for clinical use.
Elucidation of the precise mechanism(s) of HPSE gene regulation will facilitate rational development of heparanase-specific inhibitors.
It is worth noting that over the past several years, new technologies have been developed (e.g., nuclease-deficient zinc fingers, TALEs, and CRISPR fusions) with the potential to treat diseases by modulating gene expression, an approach termed cis-regulation therapy (CRT) [51]. These approaches can be used to modify the activity of gene-regulatory elements such as promoters, enhancers, silencers, and insulators, thereby modulating the expression levels of their target genes to achieve therapeutic effects. Future development of such technologies will hopefully allow fine-tuning of heparanase expression for the treatment of diseases associated with dysregulation of HPSE gene expression.

Supplementary Materials:
The following are available online at https://www.mdpi.com/article/10.3390/cells10102523/s1, Table S1: Characteristics of transplant recipients and donors; Table S2: Primer sequences for insulator cloning and generation of allele-specific probes.