Differential Expression Profile of lncRNAs from Primary Human Hepatocytes Following DEET and Fipronil Exposure

While the synthesis and use of new chemical compounds is at an all-time high, the study of their potential impact on human health is quickly falling behind, and new methods are needed to assess their impact. We chose to examine the effects of two common environmental chemicals, the insect repellent N,N-diethyl-m-toluamide (DEET) and the insecticide fluocyanobenpyrazole (fipronil), on transcript levels of long non-protein coding RNAs (lncRNAs) in primary human hepatocytes using a global RNA-Seq approach. While lncRNAs are believed to play a critical role in numerous important biological processes, many still remain uncharacterized, and their functions and modes of action remain largely unclear, especially in relation to environmental chemicals. RNA-Seq showed that 100 µM DEET significantly increased transcript levels for 2 lncRNAs and lowered transcript levels for 18 lncRNAs, while fipronil at 10 µM increased transcript levels for 76 lncRNAs and decreased levels for 193 lncRNAs. A mixture of 100 µM DEET and 10 µM fipronil increased transcript levels for 75 lncRNAs and lowered transcript levels for 258 lncRNAs. This indicates a more-than-additive effect on lncRNA transcript expression when the two chemicals were presented in combination versus each chemical alone. Differentially expressed lncRNA genes were mapped to chromosomes, analyzed by proximity to neighboring protein-coding genes, and functionally characterized via gene ontology and molecular mapping algorithms. While further testing is required to assess the organismal impact of changes in transcript levels, this initial analysis links several of the dysregulated lncRNAs to processes and pathways critical to proper cellular function, such as the innate and adaptive immune response and the p53 signaling pathway.

of lncRNA transcription sites with up-or down-regulated transcripts after 100 µM DEET exposure to closest protein-coding gene TSS; percentages on the Y axis in (B) refer to the ratio of lncRNAs that fall into categories within a certain range from the TSS of the closest protein-coding gene to the total number of lncRNAs that fall within these ranges (36 total since a single gene can span more than one range category). (C) Genomic distance in kb and orientation of lncRNA transcription sites with up-or down-regulated transcripts after 100 µM DEET exposure to closest protein-coding gene TSS. Percentages on the Y axis in (C) refer to the ratio of lncRNAs that fall into categories within a certain range both before and after the closest protein-coding gene TSS to the total number of lncRNAs that fall within these ranges (36 total in this case). D-F represents the same data points for the 10 µM fipronil treatment and G-I represents the same data points for the 100 µM DEET plus 10 µM fipronil treatment. Table S1. Chromosome (chrom) distribution of lncRNAs significantly dysregulated (P ≤ 0.01) after primary human hepatocytes were exposed to 100 µM DEET (DT), 10 µM fipronil (Fip), or a mixture of 100 µM DEET and 10 µM fipronil (DT+Fip). a Total coding and noncoding genes based on Ensembl release 87 [25]. b LncRNAs vs total = the number of dysregulated lncRNAs from a single chromosome divided by the total number of known genes, coding and noncoding, on that chromosome (represented as a percentage). c Two genes omitted since chromosome location not well-established.   Table S2. Chromosomal distribution of protein-coding genes significantly dysregulated (P ≤ 0.01) after primary human hepatocytes were exposed to 100 µM DEET (DT), 10 µM fipronil (Fip), or a mixture of 100 µM DEET and 10 µM fipronil (DT+Fip). a Total coding and noncoding genes based on Ensembl release 87 [25]. b Coding vs total = the number of dysregulated coding genes from a single chromosome divided by the total number of known genes, coding and noncoding, on that chromosome (represented as a percentage).

TABLES
c Two genes omitted since chromosome location not well-established .     Table S1 shows the distribution of lncRNAs whose transcripts were differentially expressed (P ≤ 0.01) in primary human hepatocytes for each chromosome. In the 100 µM DEET treatment, chromosome 16 had the most lncRNAs affected (three) while only two lncRNAs were dysregulated on each of chromosomes 5, 6, 7, 9, 11, and 19. Since there were only 20 lncRNAs in total affected by the DEET-only treatment, the rest of the chromosomes only had one or no lncRNAs dysregulated by the treatment. There were many more lncRNAs whose transcripts were up-or downregulated by the 10 µM fipronil treatment (269 total or 13.5X as many), which led to all 23 of the chromosomes having lncRNAs that were affected by the treatment. The three chromosomes with the most lncRNAs affected were chromosomes 1, 7, and 10, while the chromosomes with the lowest number of lncRNAs affected were chromosomes 13 and 18. For the mixture of 100 µM DEET and 10 µM fipronil treatments, the three chromosomes that had the most lncRNAs dysregulated by the treatment were chromosomes 1, 6, and 7 while chromosomes 13, 18, and 21 were the three with the lowest number of affected lncRNAs.

S2. Comparison of Dysregulated lncRNA and Protein-Coding Gene Chromosomal Distribution
We compared our findings on the chromosomal distribution of lncRNA genes with up-and downregulated transcripts to the chromosomal distribution of protein-coding genes dysregulated after primary human hepatocytes were exposed to both DEET and fipronil using the percentage of genes dysregulated versus the total number of genes on each chromosome. Table S2 shows the chromosomal distribution of protein-coding genes significantly dysregulated (P ≤ 0.01) after primary human hepatocytes were treated with 100 µM DEET, 10 µM fipronil, and a mixture of 100 µM DEET plus 10 µM fipronil. It takes into account the most recent Ensembl gene annotations from December of 2016 [25] in calculations comparing the number of dysregulated protein-coding genes to the total number of coding and noncoding genes on each chromosome. For the 100 µM DEET treatment the three chromosomes that had the most dysregulated protein-coding genes were chromosome 7 (0.57%), chromosome 1 (0.50%), and chromosome 4 (0.57%). These percentages did not correspond with the top three genes that had the most lncRNAs affected in the DEET-only treatment. In the 10 µM fipronil treatment, the three chromosomes with the most dysregulated protein-coding genes were chromosome 1 (7.92%), chromosome 12 (7.53%), and chromosome 16 (7.49%) and the three chromosomes with the lowest number of dysregulated proteincoding genes were chromosome 13 (3.84%), chromosome 14 (4.72%), and chromosome 18 (4.73%). Chromosomes 13 and 14 had the least number of lncRNAs with differentially expressed transcripts in the 10 µM fipronil treatment and no lncRNAs dysregulated in the 100 µM DEET treatment, which suggests that these two chromosomes play a smaller role in the response of liver cells to DEET and fipronil at these concentrations than the other chromosomes. There was little correlation between the chromosomes with the most dysregulated lncRNAs and protein-coding genes in response to 10 µM fipronil. The three chromosomes with the highest number of differentially expressed transcripts from protein-coding genes in response to the mixture of 100 µM DEET plus 10 µM fipronil were chromosome 19 (12.27%), chromosome 1 (10.99%), and chromosome 12 (10.21%). There was no obvious correlation with the dysregulated lncRNA profile and the protein-coding gene profile in response to the mixture either. Chromosome 13 (as it did in the 100 µM DEET and 10 µM fipronil treatments) had the lowest percentage of protein-coding genes affected by the 100 µM DEET plus 10 µM fipronil treatment (5.57%), suggesting that chromosome 13 is less important in the human hepatocyte response to a mixture of DEET and fipronil than any other chromosomes if we base our assumption on the percent of dysregulated genes versus the total number of genes per chromosome.
In summary, 20 lncRNA genes in primary human hepatocytes had transcripts that were significantly up-or downregulated (P ≤ 0.01) by 100 µM DEET, 269 by 10 µM fipronil (13.5X the number dysregulated by 100 µM DEET only), and 333 by a mixture of 100 µM DEET and 10 µM fipronil (1.2X the number dysregulated by 10 µM fipronil alone and 16.7X the number dysregulated by 100 µM DEET alone). Therefore, 0.04% of the total number of known genes (both coding and noncoding) were lncRNAs dysregulated by exposure to 100 µM DEET, 0.48% were lncRNAs affected by the 10 µM fipronil treatment, and 0.59% were lncRNAs dysregulated by the 100 µM DEET plus 10 µM fipronil treatment. We observed a more-than-additive effect with the mixture of 100 µM DEET and 10 µM fipronil together (333 lncRNAs whose transcripts were up-or downregulated) versus a purely additive effect that would have totaled 289 lncRNAs with differentially expressed transcripts, which was the sum of the dysregulated lncRNAs from the 100 µM DEET treatment and the 10 µM fipronil treatment. This reveals that more lncRNA transcripts, and protein-coding transcripts from our previous study [18], were up-or downregulated in primary human hepatocytes to metabolize a mixture of 100 µM DEET and 10 µM fipronil together than either chemical alone. Interestingly, the concentration of fipronil used was 10-fold lower than the concentration of DEET used (10 µM fipronil versus 100 µM DEET) suggesting that a 100 µM fipronil dosage might elicit a much stronger response than a 100 µM DEET dosage. These findings underline the importance of studying environmental chemicals as they are more typically encountered, which is in combination rather than singly, especially to vulnerable populations like agricultural field workers, for example, that may encounter these substances more regularly.

S3. GREAT Analysis of LncRNA-Coding Gene Relationships
Figure S1A-C displays the number of neighboring protein-coding genes per dysregulated input lncRNA sequence, the absolute distance to the closest TSS in kilobases (kb) for each dysregulated lncRNA transcription site, and the orientation and distance of each dysregulated lncRNA gene to its closest TSS, respectively, after 100 µM DEET exposure. Some of the protein-coding genes with differentially expressed transcripts that we identified previously and lay within 1000 kb of these 20 dysregulated lncRNA genes were not designated as targets in the GREAT analysis, but we chose to include them in all further downstream analysis. In Figure S1A, only 4 dysregulated lncRNAs had one neighboring protein-coding gene while 16 of the dysregulated lncRNAs had two neighboring protein-coding genes within 1000 kb. Figure S1B shows that 2 lncRNAs were within 0-5 kb of their closest protein-coding gene TSS, 14 lncRNAs were within 5-50 kb, 18 were within 50-500 kb, and 2 were 500-1000 kb away. This implies that in previous gene-based tools focused on this type of analysis, over half of our dataset would have been excluded based on distance to the nearest TSS. Figure S1C shows the distance and orientation of the 20 dysregulated lncRNAs versus their closest neighboring protein-coding genes, where 23 were upstream of the TSS and 13 were downstream. Similar bar graphs are displayed for the 10 µM fipronil treatment (Figure S1D-F) and the 100 µM DEET plus 10 µM fipronil mixture ( Figure S1G-I).

S4.1 Highly up-regulated in liver cancer (HULC) transcripts down-regulated
HULC transcripts were significantly down-regulated in the 10 µM fipronil treatment (log2FC of -0.57). While HULC expression is known to influence the development of hepatocellular carcinoma (HCC), the most common type of liver cancer, there are conflicting reports regarding its over or under-expression in relation to HCC [66]. In pancreatic cancer HULC overexpression is positively correlated with larger tumors and decreased survivability [67] and in gastric cancer HULC overexpression contributed to lymph node metastasis [68]. However, liver studies by Yang et al (2015) established that higher levels of HULC in HCC resulted in less vascular invasion and increased survivability in some instances, which conflicts with other HCC studies [69]. It is also known that HULC can act as a miRNA sponge to reduce miRNA activity [70]. One miRNA type that HULC interacts with is mir-372, which is known to suppress tumorigenesis in certain types of cancer like endometrial carcinoma. The up-regulation of HULC could therefore potentially inhibit the tumor suppression capability of mir-372 [71]. Wu et al. (2015) demonstrated that down-regulation of mir-372 was correlated with tumor metastasis and poor prognosis in HCC [72]. Therefore, downregulation of HULC may also be a positive in certain situations where an abundance of mir-372 may be necessary to combat HCC. Focused research is needed to determine in what scenarios the up-or downregulation of HULC can serve as a prognostic or diagnostic indicator of liver disease, which is likely to differ between stages and types.

S4.2. H19 transcripts up-regulated
The H19 gene, sometimes referred to as long intergenic non-protein coding RNA 8, codes for an lncRNA whose transcripts were up-regulated in both the 10 µM fipronil and 100 µM DEET plus 10 µM fipronil treatments (log2FC of +1.55 and +0.59, respectively). Overexpression of H19 is associated with tumorigenesis in several different tissue types, and blocking H19 in breast cancer and HCC cells reduced their ability to grow and develop [73,74]. H19 also has the ability to affect p53 (discussed in more detail later) which has been called the "master regulator" due to its ability to prevent genome mutations. H19 was up-regulated in cells containing mutant p53 where oxygen conditions were low, but its expression remained normal when oxygen levels were sufficient [75].

S4.3. Metastasis associated lung adenocarcinoma transcript 1 (MALAT1) transcripts down-regulated
MALAT1 transcripts were significantly downregulated in the 100 µM DEET, 10 µM fipronil, and 100 µM DEET plus 10 µM fipronil treatments (log2FC of -1.34, -1.87, and -2.28 respectively) and is one of the most studied lncRNAs. It plays a multitude of roles in processes including gene splicing and nuclear organization, but its overexpression is related to several types of cancer. MALAT1 overexpression promotes malignancy in cancer cells and controls gene expression of several metastasis-associated transcripts in lung cancer cells [76]. However, the down-regulation of MALAT1 (as we observed in primary human hepatocytes) was shown to induce tumor progression in a recent breast cancer study [77]. Therefore the over-or under-expression of MALAT1 is likely very tissue specific. Downregulation of MALAT1 as we observed in our data set could play a completely different role in primary liver cells than it does in other tissues and systems. We must consider that many of the studies determining the role of MALAT1 were conducted in immortalized cell lines or model non-human organisms and not primary human hepatocytes [68].

S4.4. Nuclear enriched abundant transcript 1 (NEAT1) transcripts down-regulated
NEAT1 transcripts were also downregulated in all three of our treatment conditions (log2FC of -1.10 for 100 µM DEET, -1.22 for 10 µM fipronil, and -1.52 for the 100 µM DEET plus 10 µM fipronil mixture). The lncRNA NEAT1 plays an important role in the formation of nuclear paraspeckles, which are subnuclear bodies formed in response to stress, viral infection, and circadian rhythm maintenance. NEAT1 is more of an architectural components that interacts with proteins that have a direct role in transcription and RNA processing. Downregulation of NEAT1 results in impairment or inhibition of paraspeckle formation that could have a profound effect on some of the processes already mentioned above like response to viral infection [78]. Interestingly, recent studies demonstrate that NEAT1 and MALAT1 co-localize to hundreds of genomic sites likely due to cues from the transcription process and not specific DNA sequences [79]. Therefore, disruption of either of these lncRNAs could affect many signaling pathways in a positive or negative manner either by their interaction with one another or on genetic components that they both may influence.

S4.5. X-inactive specific transcript (XIST) and TSIX transcripts down-regulated
X-inactive specific transcript (XIST) and TSIX transcripts were both significantly downregulated by the 10 µM fipronil and 100 µM DEET plus 10 µM fipronil treatments (XIST, log2FC of -0.60 and -0.93, respectively; TSIX, log2FC of -0.59 and -0.97, respectively). Both XIST and TSIX are lncRNAs involved in the process of X inactivation where certain genes on the sex chromosomes are repressed to equalize gene expression among the sexes. The function of XIST is to coat certain regions of the sex chromosomes and suppress gene expression at the coating site (i.e., X inactivation) while TSIX, the antisense of XIST, functions to downregulate the expression of XIST when necessary [80]. It would seem logical that the repression of both of these lncRNAs would disrupt functionality of the entire X inactivation system but further study is required.

S4.6. Maternally expressed 3 (MEG3) transcripts down-regulated
MEG3 transcripts were downregulated in both the 10 µM fipronil and 100 µM DEET plus 10 µM fipronil treatments (log2FC of -1.19 and -1.70, respectively). This lncRNA activates p53 and functions as a tumor suppressor, so its expression is typically reduced or lost in cancer cells. A 2015 study revealed that MEG3 interacts with several TGF-β pathway genes, which are important in several cellular processes like cell growth, differentiation, and apoptosis [81,82]. Therefore, dysregulation of MEG3 could negatively influence all of these processes.