Bridging Plant and Human Radiation Response and DNA Repair through an In Silico Approach

The mechanisms of response to radiation exposure are conserved in plants and animals. The DNA damage response (DDR) pathways are the predominant molecular pathways activated upon exposure to radiation, both in plants and animals. The conserved features of DDR in plants and animals might facilitate interdisciplinary studies that cross traditional boundaries between animal and plant biology in order to expand the collection of biomarkers currently used for radiation exposure monitoring (REM) in environmental and biomedical settings. Genes implicated in trans-kingdom conserved DDR networks, often triggered by ionizing radiation (IR) and UV light, are deposited in biological databases. In this study, we have applied an innovative approach utilizing data pertinent to plant and human genes from publicly available databases towards the design of a ‘plant radiation biodosimeter’, that is, a plant and DDR gene-based platform that could serve as a reliable REM biomarker for assessing environmental radiation exposure and associated risk. From our analysis, in addition to REM biomarkers, a significant number of genes, both in human and Arabidopsis thaliana, not yet characterized as DDR genes, are suggested as possible DNA repair players. Last but not least, we provide an example of the applicability of an Arabidopsis thaliana-based plant system for monitoring the role of the cancer-related DNA repair genes BRCA1, BARD1 and PARP1 in processing DNA lesions.


Introduction
Both prokaryotic and eukaryotic cells exposed to radiation acquire different types of DNA lesions (e.g., single-strand breaks (SSB), double-strand breaks (DSB), mismatches, modified bases etc.). This genotoxic effect induced by radiation leads to the activation of DNA damage response (DDR) pathways. DDR can be defined as the sum of functions (sensors, transducers, effectors) that orchestrate DNA damage sensing and signal transduction, triggering either DNA repair, cell survival or cell death.
The molecular bases of natural radiotolerance have been investigated in the IR-resistant fungus Ustilago maydis, which relies on a highly efficient machinery for homologous recombinational (HR) repair of DNA damage, and particularly on the activity of the BRH2 gene, a homolog of the human BRCA2 gene [12]. Recently, the γ-ray-responsive transcriptome of the radioresistant basidiomycetous fungus Cryptococcus neoformans has been identified by Jung et al. [13], who found a novel transcription factor containing a basic leucine zipper domain, named BDR1 (bZIP TF for DNA damage response), able to modulate the expression of DNA repair genes. BDR1 gene expression was in turn regulated by the highly conserved DDR protein kinase RAD53 [13]. Animals and plants display different levels of radiosensitivity, with a radiotolerance range of 0.001-1 and 1-100 Gy, respectively [14]. Plants have been exposed to IR, which is part of the natural background radiation, throughout evolution, with the consequent enhancement of DNA repair mechanisms necessary to cope with genotoxic stress. It has been reported that radioresistance positively correlates with genome size, since polyploidy facilitates protection against DNA damage [14,15]. Coniferous trees (e.g., pine trees) are very radiosensitive and show severe damage leading to mortality when exposed to doses >17 Gy. On the contrary, deciduous trees (e.g., birch, alder and aspen) shed their irradiated foliage on the ground and thus can withstand radiation doses up to 90 Gy. The most radiotolerant plants are the herbaceous (weeds) and pasture plants (e.g., grasses and legumes), which are able to withstand doses up to 870 Gy [16,17].

Bioinformatics Approaches for the Identification of Candidate Genes for the Plant Radiation Dosimeter
The basic idea was to discover through in silico analysis those genes that could serve as reliable biomarkers or a 'global tool' for the development of a 'non-mammalian radiation dosimetry'. Using meta-analysis and bioinformatics procedures, Nikitaki et al. [18,19] have recently identified unique human gene biomarkers specific for different types of cellular stresses (e.g., IR, replication or oxidative stress), pointing out the potential application, in terms of experimental exploitation and technical advancement, of the selected gene products in the detection of harmful environmental stresses. The investigation described herein has been expanded to develop an in silico-plant-based platform, useful for REM in a non-mammalian and relatively inexpensive model, such as plants ('the plant radiation dosimeter'). From a biophysical point of view, IR and non-IR exhibit substantially different DNA damage patterns, thus inducing different DDR pathways dominated by distinct genes/proteins [20]. The expected differences in radiation-induced damage at the protein/lipid level could also account for different profiles of gene induction. To this end, we focused on those genes encoding products that serve as 'exclusive' biomarkers for the in planta detection of radiation exposure and identification of radiation quality. These universal biomarkers could be expressed in different plant populations/communities specifically linked to different geographical areas. As for radiation quality, attention was given to genes responsive to X-rays, γ-rays and non-IR (UV-A, UV-B or UV-C). In silico searches were performed on a plethora of plant species based on Gene Ontology (GO) terms. These GO terms served as filters in plant databases and, for each plant species, the corresponding gene lists were connected to the model plant Arabidopsis thaliana, based on their orthologous counterpart. 
Based on a protein-protein interaction network, we detected the most important orthologues in terms of functionality (nodes). The final sorting was performed according to orthologue multiplicity (number of orthologues in different species). The detailed procedure is described below.

Selection of Gene Ontology Terms
Beginning with the broad GO term 'response to radiation' through the QuickGO platform (http://www.ebi.ac.uk/QuickGO) [21], child GO terms that fall within the selected criteria were explored. Since QuickGO provides only the direct descendants of each term, the search for direct descendants of the resultant GO terms was repeated until all child terms of the initial input were collected. At this step, 55 GO terms were retrieved, among which only 7 were eligible as sufficient, i.e., there was no need to include further child terms (Table 1). The hierarchical relations of the selected GO terms are presented in Figure 1, including the initial GO term 'response to radiation'; however, this term was not included in the final selection.
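The repeated search for direct descendants described above amounts to a breadth-first traversal of the term hierarchy. A minimal sketch is given below, assuming that the direct children of each term are already available as a dictionary; the function name `collect_descendants` and the toy hierarchy are hypothetical and are not taken from QuickGO.

```python
from collections import deque


def collect_descendants(root, direct_children):
    """Return every term reachable from `root` by repeatedly following direct-child links."""
    seen = set()
    queue = deque([root])
    while queue:
        term = queue.popleft()
        for child in direct_children.get(term, ()):
            if child not in seen:  # visit each term once, even if it has several parents
                seen.add(child)
                queue.append(child)
    return seen


# Hypothetical toy hierarchy used only for illustration (not the 55 terms of the study).
toy_children = {
    "response to radiation": ["response to ionizing radiation", "response to light stimulus"],
    "response to ionizing radiation": ["response to X-ray", "response to gamma radiation"],
    "response to light stimulus": ["response to UV"],
    "response to UV": ["response to UV-A", "response to UV-B", "response to UV-C"],
}

descendants = collect_descendants("response to radiation", toy_children)
```

The starting term itself is not part of the returned set, in line with the exclusion of 'response to radiation' from the final selection.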

Orthologous Genes
In order to detect reliable and comprehensive biomarkers for assessing radiation exposure, all the available plant species from the Ensembl Plants annotation system [22] were investigated for genes orthologous to those of the reference plant A. thaliana. For each GO term and for every available plant (a total of 39, including the model plant), the counterpart of each orthologue was found in A. thaliana. At this step, 273 lists of genes were derived, that is, the product of the 39 plants and the seven selected GO terms. For each GO term, the corresponding 39 lists were unified, ending up with seven sets of genes with a total number of 410 different genes.
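The unification of the per-species lists can be illustrated with a short sketch. It assumes that each list is keyed by a (species, GO term) pair and already contains A. thaliana identifiers; the function name `unify_per_term` and all identifiers in the toy input are placeholders.

```python
def unify_per_term(lists_by_species_and_term):
    """Merge per-species gene lists into one set of A. thaliana genes per GO term."""
    per_term = {}
    for (_species, term), genes in lists_by_species_and_term.items():
        per_term.setdefault(term, set()).update(genes)
    return per_term


# Hypothetical input: (species, GO term) -> A. thaliana orthologue identifiers.
toy_lists = {
    ("species_1", "response to UV-B"): ["AT_gene_1", "AT_gene_2"],
    ("species_2", "response to UV-B"): ["AT_gene_2", "AT_gene_3"],
    ("species_1", "response to X-ray"): ["AT_gene_4"],
    ("species_2", "response to X-ray"): ["AT_gene_4", "AT_gene_1"],
}

per_term = unify_per_term(toy_lists)
distinct_genes = set().union(*per_term.values())

# In the study itself, 39 species and 7 GO terms give 39 * 7 = 273 lists.
n_lists_in_study = 39 * 7
```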

Exclusion of Common Genes
Given that in this study biodosimeters specific for radiation-quality are sought, genes categorized into more than one of the desired radiation subsets (X-ray, γ-ray, UV-A, UV-B and UV-C) had to be excluded from the subsequent steps of the analysis. Using the Draw Venn application [23], the corresponding lists of intersections among all the possible combinations of the sets were provided (Figure 2).
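The exclusion step can be expressed as a simple set operation: a gene is kept only if it occurs in exactly one radiation subset. The sketch below is an illustration under that reading, not the Draw Venn procedure itself; `exclusive_genes` and the toy gene labels are hypothetical.

```python
from collections import Counter


def exclusive_genes(gene_sets):
    """Keep, for each radiation type, only the genes that appear in no other set."""
    counts = Counter(gene for genes in gene_sets.values() for gene in set(genes))
    return {
        name: {gene for gene in genes if counts[gene] == 1}
        for name, genes in gene_sets.items()
    }


# Hypothetical toy sets; g2 and g3 are shared and are therefore dropped.
toy_sets = {
    "X-ray": {"g1", "g2", "g3"},
    "gamma-ray": {"g3", "g4"},
    "UV-B": {"g2", "g5"},
}

exclusive = exclusive_genes(toy_sets)
```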

Protein-Protein Interactions Network
The initial screening resulted in an overall number of candidate genes that was too large to be informative. In order to reduce this number, STRING v10 (http://string-db.org) [24] was utilized. STRING v10 is a database and browser that, for a given set of genes, provides protein-protein interaction networks, allowing the user to set the criteria of the interaction prediction methods.
The resulting protein interaction network (a detail of which is presented in Figure 3) revealed genes/gene products that serve as nodes of dense cliques; those genes were selected as the most critical ones for plant functionality. The selected genes are presented in Table 2.
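The text does not state how the nodes of dense cliques were determined from the STRING network. One possible way to formalize the idea is sketched below, assuming the network is available as a list of undirected edges: maximal cliques are enumerated with the basic Bron-Kerbosch recursion and the proteins belonging to cliques above a chosen size are retained. The threshold `min_size`, the function names and the toy edges are assumptions made for illustration only.

```python
def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(r)  # r cannot be extended any further
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def clique_members(edges, min_size=3):
    """Return the nodes that belong to at least one maximal clique of `min_size` or more."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    members = set()
    for clique in maximal_cliques(adj):
        if len(clique) >= min_size:
            members |= clique
    return members


# Hypothetical toy network: p1, p2 and p3 form a triangle; p4 and p5 hang off it.
toy_edges = [("p1", "p2"), ("p1", "p3"), ("p2", "p3"), ("p3", "p4"), ("p4", "p5")]
core = clique_members(toy_edges)
```

For a network of a few hundred proteins this unoptimized recursion is usually adequate; a pivoting variant would be preferable for much larger graphs.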

Figure 3. Detail of the protein-protein interaction network generated with STRING v10.0, where the 237 A. thaliana genes selected in the previous step were set as input. The network was rearranged in order to better identify some of the key genes.
Table 2. Selected genes, the products of which would serve as 'exclusive'/highly specific biomarkers for the identification of radiation quality in planta, and therefore as specific indicators of the geographical region in which the 'biosensor plant' lives. These genes encode products that appeared as nodes of dense cliques of a protein-protein interaction network, created using STRING v10 (Figure 3). Genes are listed along with a number indicative of the multiplicity of their orthologues across the plant species under study. The symbol '#' indicates the number of species.

Ranking Based on Multiplicity
To further reduce the number of suggested genes, candidates highlighted in the previous step were ranked according to their multiplicity of incidence among the plant species under study. Genes that  (Table 2) were used for establishing the plant radiation dosimetry.
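Ranking by multiplicity can be sketched as a sort on the number of distinct species in which an orthologue of each candidate was found. The function name `rank_by_multiplicity` and the toy data are hypothetical; ties are broken alphabetically here, which is an arbitrary choice.

```python
def rank_by_multiplicity(species_by_gene):
    """Sort genes by the number of distinct species carrying an orthologue (descending)."""
    return sorted(
        ((gene, len(set(species))) for gene, species in species_by_gene.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Hypothetical toy data: candidate gene -> species in which an orthologue was found.
toy_species = {
    "AT_gene_1": ["sp1", "sp2", "sp3"],
    "AT_gene_2": ["sp1"],
    "AT_gene_3": ["sp2", "sp3"],
}

ranking = rank_by_multiplicity(toy_species)
```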

Human Orthologues of the Resulting Genes
The candidate genes retrieved through in silico analyses could also be experimentally validated. To this end, their distinct expression profiles in relation to the different regions of the electromagnetic spectrum first need to be estimated in order to determine radiation-specific gene up-regulation. In this way, only those genes induced at a certain wavelength range would be included in the plant radiation dosimeter. Experimental validation would also help to elucidate the function of genes that are annotated to a GO term. For instance, a given gene which is annotated to the GO term 'response to UV-B' could likely respond to UV-C as well. Conversely, the fact that a given gene is not annotated to a specific GO term does not necessarily mean that the gene does not belong under this term, but rather that this has not been verified yet. Last but not least, in order to provide a better understanding of the 'plant biodosimeter genes', we searched in human for orthologues of the proposed plant genes (Table 2), the products of which could serve as biodosimeters (Table 3). In addition, we examined whether the human orthologous genes are annotated to any of the initial GO terms (Table 1), that is, whether the human orthologues shown in Table 3 have also been characterized as genes involved in the response to radiation. To this end, for the genes in the second column of Table 3, the RefSeq [25] accession code of their corresponding encoded proteins was identified (Table 3, third column). For every protein, OrthoGroups were found in OrthoMCL-DB [26] (Table 3, fourth column) and scanned for orthologues in H. sapiens. The human proteins, along with their Ensembl protein identifiers (i.e., ENSP) [26], are presented in the fifth column of Table 3. The corresponding gene names (according to the HUGO Gene Nomenclature Committee (HGNC) [27]) were assigned to the retrieved proteins (Table 3, sixth column).
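The chain of look-ups behind Table 3 (plant gene, RefSeq protein, OrthoGroup, human ENSP, HGNC symbol) can be sketched as nested dictionary look-ups. The sketch assumes that each mapping has already been exported to a dictionary; the function name `human_orthologues` and every identifier below are placeholders, not real accessions.

```python
def human_orthologues(at_gene, gene_to_refseq, refseq_to_groups, group_members, ensp_to_hgnc):
    """Follow plant gene -> RefSeq protein -> OrthoGroup -> human ENSP -> HGNC symbol."""
    rows = []
    for protein in gene_to_refseq.get(at_gene, []):
        for group in refseq_to_groups.get(protein, []):
            for member in group_members.get(group, []):
                if member.startswith("ENSP"):  # keep only human Ensembl protein identifiers
                    rows.append((at_gene, protein, group, member, ensp_to_hgnc.get(member)))
    return rows


# Hypothetical placeholder mappings used only for illustration.
gene_to_refseq = {"AT_gene_1": ["NP_placeholder_1"]}
refseq_to_groups = {"NP_placeholder_1": ["OG_placeholder_1"]}
group_members = {"OG_placeholder_1": ["NP_placeholder_1", "ENSP_placeholder_1"]}
ensp_to_hgnc = {"ENSP_placeholder_1": "SYMBOL_1"}

table_rows = human_orthologues(
    "AT_gene_1", gene_to_refseq, refseq_to_groups, group_members, ensp_to_hgnc
)
```

Each returned tuple corresponds to one row of a table laid out like Table 3, from the plant gene to the human gene symbol.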
Notably, among these proteins, there are also several key DSB repair proteins like RAD54, RAD51, LIG4 (DNA LIGASE 4), as well as proteins participating in HR, MMR (e.g., MSH5 or MutS-HOMOLOG 5), NER (ERCC5, ERCC3; or EXCISION REPAIR CROSS-COMPLEMENTING 5 and 3, respectively) and DDR (RPA1 or REPLICATION PROTEIN A1), further highlighting the pivotal role of the DNA damage repair components in optimized radiation biodosimetry.
Table 3. Human orthologues of the resulting genes, proposed as exclusive biomarkers for the detection of the exposure to the several types of the electromagnetic spectrum. (Column groups: Arabidopsis thaliana, Ortho Group, Homo sapiens.)

Comparison between Arabidopsis thaliana and Homo sapiens DNA Repair Mechanisms
Given that DDR pathways are the principal molecular pathways triggered following exposure to IR and non-IR, both in mammals and plants, the DNA repair mechanisms were analyzed comparatively in the model plant and animal species, Arabidopsis thaliana and human, respectively.

Selection of Gene Ontology Terms
At the QuickGO (http://www.ebi.ac.uk/QuickGO) [21] platform, child terms that fall within the selected criteria were explored, beginning with the broad term 'DNA repair'. Following steps similar to those described in Section 2.2.1, we ended up with the six GO terms presented in Table 4 and Figure 4.

Human DNA Repair Genes
From Ensembl [28], Biomart, Ensembl Genes 83, the dataset 'Homo sapiens genes (GRCh38.p5)' was chosen. For each search, a separate GO term listed in Table 4 was used as 'Filter'. By choosing 'Ensembl Gene ID' and 'HGNC symbol' under 'Attributes - Features', six '.txt' files were created for Homo sapiens. The Venn diagram of the initial genes is presented in Figure 5. As expected, all five DNA repair mechanisms are subsets of DNA repair, given that these GO terms are child terms of DNA repair (Figure 4). This Venn diagram was created manually, because it exceeds the maximum number of elements supported for automated creation by Draw Venn; however, the subsets were determined using Draw Venn [23]. The contents of the Venn diagram are found in Table S1 of the Supplementary Information.
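The expectation that each mechanism-specific gene list is a subset of the 'DNA repair' list can be checked directly on the exported files. The sketch below assumes tab-separated exports whose header contains a column named 'Ensembl Gene ID'; the helper name `read_gene_ids` and the identifiers in the toy exports are placeholders, and the in-memory strings stand in for the '.txt' files.

```python
import csv
import io


def read_gene_ids(tsv_text, column="Ensembl Gene ID"):
    """Read one exported table and return the set of gene identifiers it contains."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row[column] for row in reader if row[column]}


# Hypothetical exports standing in for two of the six '.txt' files.
dna_repair_txt = (
    "Ensembl Gene ID\tHGNC symbol\n"
    "ENSG_placeholder_1\tSYMBOL_1\n"
    "ENSG_placeholder_2\tSYMBOL_2\n"
    "ENSG_placeholder_3\tSYMBOL_3\n"
)
mechanism_txt = (
    "Ensembl Gene ID\tHGNC symbol\n"
    "ENSG_placeholder_1\tSYMBOL_1\n"
    "ENSG_placeholder_3\tSYMBOL_3\n"
)

parent = read_gene_ids(dna_repair_txt)
child = read_gene_ids(mechanism_txt)
is_subset = child <= parent  # True when every mechanism gene is also a DNA repair gene
```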

Arabidopsis thaliana DNA Repair Genes
In a similar manner, for A. thaliana, the dataset 'Arabidopsis thaliana genes (TAIR10 (2010-09-TAIR10))' was chosen from Ensembl Plants [22], Biomart, Plant Mart. For each search, a separate GO term listed in Table 4 was used as 'Filter'. By selecting 'Gene stable ID', 'Gene name' and 'RefSeq protein ID' under 'Attributes - Features', six '.txt' files were generated for Arabidopsis thaliana. A Venn diagram for these genes is presented in Figure 6. This diagram was created manually, because it exceeds the maximum number of elements supported for automated creation by the software used; however, the subsets were determined using the Draw Venn application [23]. The contents of this diagram can be found in Table S2 of the Supplementary Information.
Figure 6. Venn diagram of the Arabidopsis thaliana genes that were found under each of the GO terms listed in Table 4.

Identification of Orthologies between Homo sapiens and Arabidopsis thaliana DNA Repair Genes
The derived twelve sets of genes were used as input to OrthoMCL-DB (http://orthomcl.cbil.upenn.edu) [29], providing for each gene a group of orthologues in the species under study. The results for the two species were collected and stored in a relational database. Data were combined in order to demonstrate known orthologies, propose new orthologies and, most importantly, suggest roles in DNA repair for orthologous genes. The orthologue pairing process, as well as the assignment of new roles to genes, is illustrated in Figure 7. Initial genes that are annotated under the specific Gene Ontology (GO) terms are shown in parentheses. The same genes are represented in bold in the Supplementary Information (Tables S5-S10), where the analytical results of this procedure can be found. Each one of these genes was used as input to OrthoMCL-DB, resulting in one or more Ortho Groups containing orthologous genes across several organisms. Ortho Groups A, B, and C contain both human and Arabidopsis genes. Ortho Groups D and E include only plant genes, while Ortho Group F contains only human genes.
The overall procedure can be better described using the following example. As shown in Figure 7, the H. sapiens gene '(a)' belongs to Ortho Groups A and B. The Arabidopsis gene 'i' was identified in group A, by virtue of orthology, without having been previously annotated under the initial GO term. Ortho Group B also contains the human gene '(a)', together with the A. thaliana gene '(ii)'. For this reason, the previously characterized genes '(a)' and '(ii)' were paired as orthologues. These kinds of pairs are presented in Tables 5 and 6. By using the A. thaliana gene '(ii)' as query in OrthoMCL-DB, we identified Ortho Group D, which also contains the not yet annotated A. thaliana gene 'iv'. On the other hand, the H. sapiens gene '(e)' belongs to group F, but since group F does not contain any A. thaliana gene, gene '(e)' was eventually excluded from the results.
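The pairing logic of the worked example can be written down as a short sketch. It encodes only the group memberships stated in the text for genes '(a)', 'i', '(ii)', 'iv' and '(e)'; the contents of Ortho Groups C and E are not described and are therefore omitted. The `Hs:`/`At:` prefixes and the function name `classify` are illustrative conventions, not part of OrthoMCL-DB.

```python
def classify(groups, annotated):
    """Split Ortho Group contents into known pairs, new candidates and excluded genes."""
    known_pairs, new_candidates, retained = set(), set(), set()
    for members in groups.values():
        at_genes = {m for m in members if m.startswith("At:")}
        if not (members & annotated) or not at_genes:
            continue  # no annotated seed gene, or no A. thaliana member in this group
        retained |= members & annotated
        hs_known = {m for m in members & annotated if m.startswith("Hs:")}
        at_known = at_genes & annotated
        known_pairs |= {(h, a) for h in hs_known for a in at_known}
        new_candidates |= members - annotated  # found by orthology only
    excluded = annotated - retained
    return known_pairs, new_candidates, excluded


# Memberships as stated in the example; annotated genes are those shown in parentheses.
groups = {
    "A": {"Hs:a", "At:i"},
    "B": {"Hs:a", "At:ii"},
    "D": {"At:ii", "At:iv"},
    "F": {"Hs:e"},
}
annotated = {"Hs:a", "At:ii", "Hs:e"}

known_pairs, new_candidates, excluded = classify(groups, annotated)
```

In this toy case the annotated genes '(a)' and '(ii)' are paired, 'i' and 'iv' emerge as new candidates, and '(e)' is dropped, as in the description above.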

Of note, protein names instead of gene names were used at this step. Therefore, the ENSP (Ensembl protein ID) [7] nomenclature was used for Homo sapiens and the RefSeq (Reference Sequence) [8] nomenclature for Arabidopsis thaliana. For instance, the proteins corresponding to the gene OGG1 (according to HGNC [9]) are ENSP00000305584 and NP_173590 in human and Arabidopsis thaliana, respectively. Those DNA repair genes classified in orthologous groups are presented in Table 5, which contains condensed information presented in Table S5. The sets of orthologous genes between Homo sapiens (Hs) and Arabidopsis thaliana (At) retrieved for the five main DNA repair mechanisms are presented in Table 6.

'New Genes' Emerging from Comparative Analysis
The sets of orthologous genes between H. sapiens and A. thaliana involved in the five main DNA repair mechanisms, retrieved as previously described, are reported in Table 6. Genes already known to participate in these mechanisms (termed 'old genes') were included in Table 6. The bioinformatic analysis revealed a significant number of 'new' genes (e.g., the 'c', 'i' and 'iv' genes described in Figure 7). Analytical results are available in the Supplementary Information (Tables S5-S10), where detailed lists of both 'old' and 'new' genes are provided. Possible relations are presented in the form of Venn diagrams (Figures 8 and 9; Supplementary Information: Tables S3 and S4). Given that the newly retrieved genes (Table 7, columns 8 and 9) have not yet been characterized and assigned to those specific GO terms, these genes are considered as novel candidate players in DNA repair. These results imply that there are unexplored areas in A. thaliana for further research and discoveries. As shown in Table 7, 300 DNA repair genes are already known, whereas 243 'entirely new' genes are suggested (see also Figure 9). Of those, 87 and 8 candidates are involved in the DSB repair pathways HR and NHEJ, respectively. These 'new genes' could possibly play an auxiliary or parallel role in DSB repair, as recently discovered in the case of backup NHEJ [30]. Therefore, these genes could be described as 'genes in new roles'.
Our results are presented in Venn diagrams (Figures 8 and 9) in a concise manner, avoiding the repetition of the common genes among the mechanisms. For the sake of completeness, we have included in these diagrams the intersection with the initial DNA repair genes, referred to as 'established DNA repair genes' (black dashed line in Figures 8 and 9). In these Venn diagrams, only the genes of Supplementary Information Tables S5-S10 that are not in bold, i.e., those genes that have arisen from the present analysis, are included. Thus, it is not surprising that some genes are found outside the DNA repair set (blue line in Figures 8 and 9) and therefore fall into the 'established DNA repair' group.

An In Vitro Approach Monitoring Key DNA Repair Genes
Germline variants in the human BRCA1 gene are associated with familial breast and ovarian cancers [31]. The human BARD1 (BRCA1-associated RING domain protein 1) is essential for the sequestration of BRCA1 at DNA damage sites [32]. Moreover, the DDR-related protein PARP1 (Poly(ADP-ribose) polymerase-1) [33] was shown to mediate the function of BRCA1 in DDR [32]. Herein, the DNA-CL (crosslinks) comet assay, an assay modified for detection of DNA interstrand cross-links, was used to assess the effect of the Arabidopsis thaliana genes BRCA1, BARD1 and PARP1 on DNA damage repair. To this end, the alkaline/neutral (A/N) protocol of the comet assay described in Angelis et al. [34], with an additional enzyme treatment step, was employed. Isolated nuclei from chopped Arabidopsis thaliana seedlings were embedded into a 0.7% agarose gel and lysed for 1 h. Then, the lysed nuclei were enzymatically treated. In particular, they were equilibrated for 20 min in restriction endonuclease SalI buffer and then 50 µL of SalI solution (1 U/mL) was added. Each gel was then spread on a microscope slide, covered with Parafilm and incubated in a sterile moist chamber for 50 min at 37 °C. The restriction enzyme digestion was stopped using TE buffer. Following enzyme treatment, the slides were dipped for 20 min into a DNA-unwinding solution (0.3 M NaOH; 10 mM EDTA), neutralized for 5 min in 1× TBE buffer and electrophoresed in the same buffer at 1 V/cm for 5 min.
As shown in Figure 10, comparison of DDR kinetics revealed a similarly reduced ability of the AtBRCA1 and AtBARD1 mutants to remove CL. Whereas wild type AtCol0 removed all CL after 3 h of DDR, residual DNA damage was observed in the AtBRCA1 mutant. The half-life of CL in the AtBRCA1 and AtBARD1 mutants (t1/2 = 10 h) is roughly eight times longer than in Col0 (t1/2 = 1.2 h). Therefore, functional AtBRCA1 and AtBARD1 are required for the efficient repair of DNA interstrand cross-linked adducts generated by mitomycin C. Figure 11 shows the effect of SSB accumulation during the early stages of base excision repair (BER) recovery in Arabidopsis plants due to PARP1 impairment, either by a knockout mutation (the AtPARP1 mutant) or by two PARP1 inhibitors: the selective PARP1 inhibitor AG14361, developed by Pfizer for the sensitization of human breast cancer cells prior to irradiation treatment, and the non-specific PARP inhibitor 3-aminobenzamide (3-ABA). Of interest, PARP1-mediated signaling is conserved among kingdoms, since the selective HsPARP1 inhibitor AG14361 is also effective in Arabidopsis and exhibits the same repair kinetic behavior as the knockout AtPARP1 mutation, contrary to 3-ABA. The above observations lead to the suggestion that the genes AtBRCA1, AtBARD1 and AtPARP1 play an equally important role in DNA damage repair in plants, like Arabidopsis thaliana, as in animals and humans.
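To give a feel for what the two reported cross-link half-lives imply, the short sketch below compares them under the assumption of simple first-order (exponential) removal. This kinetic model is our assumption for illustration only; the text reports the half-lives but not the model used to derive them, and a purely exponential curve would not reproduce the complete removal reported for the wild type at 3 h.

```python
def fraction_remaining(t_hours, half_life_hours):
    """Fraction of lesions left after t_hours, assuming first-order (exponential) removal."""
    return 0.5 ** (t_hours / half_life_hours)


HALF_LIFE_COL0 = 1.2     # h, wild type, as reported in the text
HALF_LIFE_MUTANT = 10.0  # h, AtBRCA1 and AtBARD1 mutants, as reported in the text

# Residual cross-links after 3 h of repair under the assumed model.
wild_type_left = fraction_remaining(3.0, HALF_LIFE_COL0)
mutant_left = fraction_remaining(3.0, HALF_LIFE_MUTANT)

# Ratio of the two half-lives (10 / 1.2, i.e. about 8.3).
fold_difference = HALF_LIFE_MUTANT / HALF_LIFE_COL0
```

Under this assumption the mutants would retain most of their cross-links at a time point where the wild type has removed the large majority of them.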

Figure 11. Effect of the knockdown mutation of AtPARP1 and of specific and broad-spectrum inhibitors of PARP1 on SSB repair kinetics obtained by an A/N comet assay. SSBs were generated by a 1 h treatment with 2 mM MMS (methyl methanesulfonate) in AtPARP1 and in Arabidopsis Col1, in the presence of the inhibitors 3 mM 3-aminobenzamide (3-ABA) and 10 µM HsPARP1-specific AG14361.

Conclusions
The systematic bioinformatic approach employed in this study to select candidate genes for the 'plant radiation dosimeter' (Figure 12) revealed that, although the last common ancestor of human and A. thaliana dates back approximately 1.5 billion years [35], the fundamental mechanisms underlying the maintenance of genome integrity, as well as their associated genes, are conserved between animals and plants. In particular, we have identified plant genes with human counterparts that can be used as 'signature radiation genes' to allow direct comparisons of the primary mechanisms governing the DNA damage repair response to different types of radiation (ionizing and non-ionizing) across the tree of life. The expression patterns (up- or down-regulation) of the genes classified under each part of the electromagnetic spectrum should be evaluated experimentally after exposing the plant to only that component of the spectrum. Some of the proposed genes are expected to be exploitable as biomarkers of exposure to electromagnetic radiation for assessing radiation risk in the environment. The main conclusion is that the results of the in silico analysis performed herein are expected to provide a foundation for future research efforts to design a radiation biodosimeter. Apart from the REM biomarkers, a large number of putative genes suggested to participate in DDR were also identified in human and Arabidopsis thaliana. Our studies support the further development of a plant-based radiation biodosimeter. We believe that a low-cost, non-animal system for monitoring and estimating radiation risk is highly important, since it helps establish a reliable methodology that avoids the ethical issues associated in most cases with the use of animals or human cells.
In addition, this plant-based platform may be used in other settings, for example to screen specialized drugs that target DNA repair, such as PARP inhibitors, or to predict the response to inhibitors of the DNA damage sensors ATM and ATR and to inhibitors of nonhomologous end joining. As recently discussed in Stover et al. [36], producing and validating reliable biomarkers will help boost the efficiency of DNA repair-targeted therapies and exploit their role(s) in cancer treatment. Here too, plants may prove useful as a first-step screening tool in all of the above cases.
Abbreviations: IR, ionizing radiation; UV, ultraviolet radiation; PPi, protein-protein interaction. See the recent work by Pateras et al. [20] for a detailed description of all DDR pathways.

Supplementary Materials:
The following are available online at http://www.mdpi.com/2072-6694/9/6/65/s1. Table S1: Human DNA repair genes (contents of the Venn diagram presented in Figure 5 in the main text); Table S2: Arabidopsis thaliana DNA repair genes (contents of the Venn diagram presented in Figure 6 in the main text); Table S3: Homo sapiens genes that stemmed from the present analysis (contents of the Venn diagram shown in Figure 8 in the main text); Table S4: Arabidopsis thaliana genes that stemmed from the present analysis (contents of the Venn diagram shown in Figure 9 in the main text); Table S5: DNA repair genes in human and Arabidopsis thaliana, grouped according to orthology; Table S6: BER genes in human and Arabidopsis thaliana, grouped according to orthology; Table S7: NER genes in human and Arabidopsis thaliana, grouped according to orthology; Table S8: MMR genes in human and Arabidopsis thaliana, grouped according to orthology; Table S9: HR genes in human and Arabidopsis thaliana, grouped according to orthology; Table S10: NHEJ genes in human and Arabidopsis thaliana, grouped according to orthology.