Characterization of the Radiation Desiccation Response Regulon of the Radioresistant Bacterium Deinococcus radiodurans by Integrative Genomic Analyses

Numerous genes are overexpressed in the radioresistant bacterium Deinococcus radiodurans after exposure to radiation or prolonged desiccation. It was shown that the DdrO and IrrE proteins play a major role in regulating the expression of approximately twenty genes. The transcriptional repressor DdrO blocks the expression of these genes under normal growth conditions. After exposure to genotoxic agents, the IrrE metalloprotease cleaves DdrO and relieves gene repression. At present, many questions remain, such as the number of genes regulated by DdrO. Here, we present the first ChIP-seq analysis performed at the genome level in Deinococcus species coupled with RNA-seq, which was achieved in the presence or not of DdrO. We also resequenced our laboratory stock strain of D. radiodurans R1 ATCC 13939 to obtain an accurate reference for read alignments and gene expression quantifications. We highlighted genes that are directly under the control of this transcriptional repressor and showed that the DdrO regulon in D. radiodurans includes numerous other genes than those previously described, including DNA and RNA metabolism proteins. These results thus pave the way to better understand the radioresistance pathways encoded by this bacterium and to compare the stress-induced responses mediated by this pair of proteins in diverse bacteria.


Introduction
Deinococcus radiodurans is one of the most resistant bacteria to genotoxic agent exposure and desiccation isolated to date [1][2][3][4]. Unlike radiosensitive organisms, once exposed to huge γ-ray doses, or after prolonged desiccation, D. radiodurans is able to reconstruct an intact genome in a few hours from several hundred DNA fragments [5]. Many factors contribute to the radioresistance of D. radiodurans, including efficient DNA repair mechanisms [5][6][7][8], a condensed nucleoid limiting the dispersion of genome fragments after irradiation [9,10], and the protection of proteins against oxidative damage [11]. Thus, the exceptional ability of this bacterium to overcome severe DNA damaging conditions is described as a combination of active and passive mechanisms acting in synergy within the cell, enabling survival following these stresses.
The exposure of D. radiodurans to γ-rays, or its recovery from desiccation, results in a rapid upregulation of the expression of numerous genes [12,13], even if constitutively expressed genes are also involved in the mechanisms of radioresistance. In many bacterial species, expression of DNA repair genes is under the control of LexA, the repressor of the well-known SOS response (for review [14]). D. radiodurans encodes two LexA homologs The p17235 and p17236 shuttle vectors (E. coli/D. radiodurans) code for a spectinomycin resistance gene, in addition to a repU gene encoding the RepU protein essential for plasmid replication in D. radiodurans. The p17235 and p17236 plasmids contain the wild-type repU gene or a repU Ts gene encoding a thermosensitive protein (RepU Ts ), respectively. To construct the p17238 plasmid used for conditional expression of ddrO, the native genomic ddrO gene was amplified by PCR using the NE28-NE29 primers and the PCR product was cloned into the p17236 plasmid between the BamHI/NotI sites. The same strategy was used to construct the p17237 plasmid. Following transformation of D. radiodurans strain GY14125 (non-homogenotized ∆ddrOΩcat) with p17236 or p17237 plasmids, both expressing ddrO gene, the transformants were streaked several times on plates supplemented with chloramphenicol and spectinomycin, and the complete deletion from all chromosome copies of native D. radiodurans ddrO was analyzed by diagnostic PCR.
∆ddrO strains complemented by ddrO expressed, under its own promoter region, from a plasmid with wild-type or thermosensitive (prepU ts ) replication, were grown at a permissive temperature (30 • C) with spectinomycin and chloramphenicol. Cells were diluted in fresh medium with antibiotics and grown at permissive temperature (30 • C) and cells in exponential growth (A 650nm~0 .5) were harvested by centrifugation, washed two times with TGY2X and reused to A 650nm = 0.1 in fresh medium without antibiotics. Then, the temperature was shifted to 37 • C (non-permissive temperature for the thermosensitive replication plasmid). At 1,4,6,8,16, and 24 h, aliquots of 20 mL were removed for fluorescence microscopy and transcriptome analysis or Western blot analysis.

RNA Extraction, cDNA Library Construction, and Sequencing
For each aliquot, total RNA was isolated using the Fast RNA Pro Blue Kit (MP Biomedicals, Illkirch-Graffenstaden, France) and the FastPrep-24 instrument, according to the manufacturer's protocols. Extracted RNA was rigorously treated with TURBO DNAfree (Invitrogen, Waltham, MA, USA), according to the manufacturer's instructions and the absence of DNA genomic contamination was checked by quantitative PCR (qPCR). The quality and quantity of treated RNA were analyzed using a DeNovix DS-11 spectrophotometer (DeNovix Inc., Wilmington, DE, USA) and the Bioanalyzer 2100 system (Agilent Technologies, Santa Clara, CA, USA) with an RNA integrity number ≥6 for cDNA library preparation. The rRNA depletion and Illumina libraries were made following the Illumina protocol (High-throughput sequencing facility of the I2BC, Gif sur Yvette, France). The cDNA samples were sequenced using Illumina NextSeq v. NS500446 (High-throughput sequencing facility of the I2BC, Gif sur Yvette, France), yielding, on average, 22.8 × 10 6 50 nt. paired-end reads (±6.8 × 10 6 reads).

RNA-Seq Data Analysis
Read sequences were mapped on our reference genome sequence with BWA mem (v. 0.7.9a-r786) using default settings, and coverage values of all genomic features were computed with the bedtools "coverage" command (v2.17.0) [45]. RNA differential gene expression analysis was performed with the DESeq R-package v. 1. 39.0] [46].

Western Blot Analysis
The protein extractions and Western blot analyses were performed as previously described [20]. The membranes were incubated overnight at 4 • C with a 1:5000 dilution of monoclonal rabbit anti-HA antibodies (Sigma-Aldrich, Saint-Louis, MO, USA) or 1:5000 dilution of monoclonal rabbit anti-FLAG antibodies (Sigma-Aldrich, Saint-Louis, MO, USA).
Immune complexes were eluted in 200 µL of elution buffer. The eluted samples (20 µL) were saved for control Western blots, and the remainder was incubated for 2 h at 37 • C with shaking with 100 µg/mL Proteinase K. Then, the supernatant was incubated overnight at 65 • C to reverse crosslinking with 100 µg/mL RNAse A. The DNA was purified using the PCR Clean-up kit (Macherey-Nagel, Duren, Germany). Three independent ChIP experiments were performed for "IP" samples.

ChIP-Seq
Raw FASTQ files were obtained from sequencing three "IP" samples comprising 19,7, and 3 × 10 6 sequences, respectively, in addition to the "Mock" (19 × 10 6 sequences) and the "Input" (13 × 10 6 sequences) samples. The quality score was verified with FASTQC software (https://www.bioinformatics.babraham.ac.uk/projects/fastqc) (accessed on 10 September 2021) and Illumina adaptor sequences were removed with Cutadapt software (https://cutadapt.readthedocs.io/en/stable) (accessed on 10 September 2021). Sequence alignments on the genomic sequence were performed with Bowtie2 software (http://bowtie-bio.sourceforge.net/bowtie2/index.shtml) (accessed on 10 September 2021). Output SAM files were converted and indexed into BAM files, using the Samtools software (http://www.htslib.org) (accessed on 10 September 2021). They were used both for visualization with IGV, and additional conversion into BED files with Bedtools software (https://bedtools.readthedocs.io/en/latest/index.html) (accessed on 10 September 2021) providing the input file format required by bPeaks programs (Available online: https://cran.r-project.org/web/packages/bPeaks/index.html) (accessed on 10 September 2021) to perform peak calling. Searches for conserved motifs were performed by MEME and FIMO (https://meme-suite.org/meme) (accessed on 10 September 2021) with a Match p-value < 1.0 × 10 −4 . Prediction of E. coli-like gene promoter elements and transcription start sites in gene promoters was carried out using BPROM (http://www.fruitfly.org/seq_tools/promoter.html) (accessed on 10 September 2021) [47]. To sort data from Chip-seq and RNA seq and to integrate them with the conserved motifs found by MEME and FIMO, we used an in-house script (File S1) that defines, without a priori knowledge, different lists of candidate genes to be DdrO targets, and hence provides detailed information about the process, which was applied to obtain the results presented in the main text and Supplementary Materials files.

Western Blot Analysis of RDR Tagged-Proteins
Exponentially growing bacteria (15 mL, A 650nm = 0.3), grown at 30 • C, were exposed to 1 or 5 µg/mL mitomycin C. After 3 h at 30 • C with continuous shaking, cells were Cells 2021, 10, 2536 6 of 28 harvested by centrifugation at 4 • C and the pellets washed with 1X cold saline-sodium citrate (SSC) buffer. Then, the bacteria pellets were re-suspended in 150 µL of SSC 1X with 0.4 mM protease inhibitor cocktail (Roche) and cells disrupted with a FastPrep Instrument using 0.1 g of glass beads (500 µm) and four pulses of 30 s. Cell debris were removed by centrifugation at 20,000× g for 10 min at 4 • C, and the supernatant fluid collected and placed in sterile microcentrifuge tubes. Protein concentrations were determined by Bradford assay (Bio-Rad). Proteins were subjected to electrophoresis through a 12% Glycine SDS-PAGE gel (Mini-PROTEAN TGX Stain-Free Precast gel, Bio-Rad, Hercules, CA, USA) and transferred onto a polyvinylidene difluoride (PVDF) membrane (GE Healthcare, Chicago, IL, USA). All these experiments were performed three times as previously described [20] with a 15,000 dilution of anti-V5 rabbit primary antibody (Abcam, Cambridge, UK) or with a 1:5000 dilution of monoclonal rabbit anti-HA antibodies (Sigma-Aldrich, Budapest, Hungary).

Sensitivity Assay to DNA-Damaging Agents Mitomycin C and UVC
Bacteria were grown in TGY2X liquid medium at 30 • C to an A 650nm = 1 and sequential dilutions of cells were spotted on TGY plates supplemented (or not) with mitomycin C (60 ng/mL and 80 ng/mL at final concentration), exposed (or not) to UVC at a dose rate of 3.5 J/m 2 /s.

Deposition of Sequences and of Expression Data
The complete sequence and annotation of the genome were deposited with GenBank under accession numbers CP068791, CP068792, CP068793, and CP068794. The complete high-throughput sequence data were deposited with the Gene Expression Omnibus (GEO) data bank under accession number GSE175875 (RNA-seq and ChIP-seq).

Genome Sequencing
Two genome sequences of the D. radiodurans strain R1 are available in databases [48,49]. The genome is distributed over four replicons: two chromosomes, one megaplasmid, and a plasmid (Table 1). A nucleotide polymorphism between the two complete genome sequences was reported, in addition to several insertions, deletions, or substitutions frequently found in bacterial genomes [49]. To promote accurate RNA-seq and ChIP-seq analysis, and for searching for conserved binding motifs for the DdrO protein, we sequenced the D. radiodurans genome of strain R1 ATCC 13939 maintained in our laboratory. We opted for Illumina NextSeq Oxford sequencing coupled with Nanopore Technologies GridION to unambiguously locate the repeated elements that may misassemble short sequences in size. Merging both sets of sequences produced an ensemble of four high quality contigs, totaling 3,578,820 nt, with a 450-fold average coverage. Among the 3230 predicted genes, 3147 encode proteins.
As shown in Table 1, the size of chromosome 2 and the two plasmids, in addition to the number of CDS encoded by D. radiodurans deduced from our sequence, are closer to those published by White et al. [48] than these published by Hua [49]. The large sequence insertions revealed in the more recent release were not found here. However, because the sequence published by White et al. [48] contains many errors, the degree of identity of genes was better with the genome sequence published by Hua [49], with a higher percentage of genes found between these two releases when a threshold of 90% of maximum bit score was applied ( Table 1). The sequence origin for each chromosomal element and plasmid was adjusted to correspond to the genome sequence of White et al. [48]. In addition to the orthologs of CDSs with both previously sequenced genomes as listed in Table S3, numerous genes or pseudo genes from different transposon families were found.

In Vivo Identification of DdrO Binding Sites by ChIP-Seq Assays
To localize in vivo the chromosomal regions bound by the DdrO protein, we constructed the GY 18218 strain expressing a V5-tagged DdrO protein, in all the genome copies, from the native promoter region of ddrO ( Figure S1). Cells expressing the recombinant protein, tagged at its C-terminal end, displayed the same growth rate as the wild-type strain and the expression of DdrO-V5 did not affect the resistance of the strain to DNA damaging agents (mitomycin C and UV) ( Figure S1). These results demonstrate that DdrO-V5 protein is functional and remained cleavable by IrrE under stress conditions. D. radiodurans GY 18218 and R1 strains were grown to mid-log phase and ChIP-seq was performed on DNA precipitated by the ChIP grade anti-V5 antibody. The Input sample (chromosomal DNA of the GY 18218 strain), the Mock sample (immuno-precipitated (IP) DNA of the wild-type strain), and three replicates of the DdrO-V5 IP sample were used to prepare sequencing libraries. The DNA regions over-represented in the DdrO-V5 IP sample and corresponding to potential binding sites for DdrO-V5 were identified using the bPeaks program [50].
A total of 136 ChIP-enriched peaks were found, mainly (110/136) within intergenic regions of genes encoding proteins ( Figure 1A and Table S4), whereas 26 peaks were intragenic or found at the vicinity of genes coding for tRNA. Significant peaks, as illustrated for five genes, ddrA, ddrB, ddrC, gyrA, and gyrB ( Figure 2), were identified in the promoter region of 18 of 25 genes reported as belonging to the RDR regulon ( Figure 2 and Figure S2) [21,26,33].

Figure 1.
Overview of the computational strategy used to integrate omics data and identified candidate genes for inclusion in the DdrO regulon. (A). ChIP-seq analyses, i.e., defining the genomic regions for which interactions between DNA and the DdrO protein were observed. The three IP replicates were compared to the Input and Mock controls. Peaks identified in both comparisons were retained. An additional filter was applied to focus on only the peaks located in intergenic regions. (B). Sequences of the peaks identified in (A) were used to search for over-represented DNA motifs, applying the MEME program. Four position specific scoring matrixes (PSSMs) were retained, because of their redundancies. PSSM were used as inputs for the FIMO program, scanning sequences between −500 and +100 of all annotated CDS. Positive matches were retained and are referred to as "RDRM promoters". (C). RNA-seq data was used to identify differentially expressed genes, comparing each time point (4,6,8,16, and 24 h) to the first (1 h). In the W37 strain, genes identified as differentially expressed in less than two comparisons were selected (DE < 2), whereas in the D37 strain, genes identified as differentially expressed in more than three comparisons were selected (DE > 3). Common genes from the two lists were retained and an additional filter was applied to focus further analyses on only these genes which exhibited differential expression (upor downregulation) at intermediate times, i.e., 6, 8, and 16 h. (D). Results obtained in (A-C) were integrated to define four lists in the DdrO regulon. The first list is comprised of genes for which (i) a peak was detected upstream of the gene location (ChIP-seq results), (ii) specific differential expression was observed in the D37 strain (transcriptome results), and (iii) an RDRM motif was found in the gene's promoter region (sequence analysis). Other lists (L2-L3-L4) matched with only two criteria (green tick mark). These genes are worth considering as potential targets for DdrO. Overview of the computational strategy used to integrate omics data and identified candidate genes for inclusion in the DdrO regulon. (A). ChIP-seq analyses, i.e., defining the genomic regions for which interactions between DNA and the DdrO protein were observed. The three IP replicates were compared to the Input and Mock controls. Peaks identified in both comparisons were retained. An additional filter was applied to focus on only the peaks located in intergenic regions. (B). Sequences of the peaks identified in (A) were used to search for over-represented DNA motifs, applying the MEME program. Four position specific scoring matrixes (PSSMs) were retained, because of their redundancies. PSSM were used as inputs for the FIMO program, scanning sequences between −500 and +100 of all annotated CDS. Positive matches were retained and are referred to as "RDRM promoters". (C). RNA-seq data was used to identify differentially expressed genes, comparing each time point (4,6,8,16, and 24 h) to the first (1 h). In the W37 strain, genes identified as differentially expressed in less than two comparisons were selected (DE < 2), whereas in the D37 strain, genes identified as differentially expressed in more than three comparisons were selected (DE > 3). Common genes from the two lists were retained and an additional filter was applied to focus further analyses on only these genes which exhibited differential expression (upor downregulation) at intermediate times, i.e., 6, 8, and 16 h. (D). Results obtained in (A-C) were integrated to define four lists in the DdrO regulon. The first list is comprised of genes for which (i) a peak was detected upstream of the gene location (ChIP-seq results), (ii) specific differential expression was observed in the D37 strain (transcriptome results), and (iii) an RDRM motif was found in the gene's promoter region (sequence analysis). Other lists (L2-L3-L4) matched with only two criteria (green tick mark). These genes are worth considering as potential targets for DdrO. Figure 2. Visualization, through IGV, of the binding peaks obtained from genome analysis. Tag density profiles are illustrated for 2 IP (purple), the Input (dark grey), and the Mock (light grey) for five known DdrO-regulated genes: ddrA, ddrB, ddrC, gyrA, and gyrB. The green lines indicate the size of each peak identified by bPeaks. Genes are represented by green boxes, their location on either strand is indicated by > (strand +) and < (strand −). The genomic coordinates of each locus are indicated on the X-axis. dro_0898 is split into two CDS (DR0903-DR0904) in the White et al. annotation [48].
A careful inspection of DdrO-V5 IP tag density through the IGV program of the seven missing genes showed that a lower coverage of reads was observed at the promoter region of mutL, and small peaks were observed in the promoter regions of recQ and sbcD that fell below the threshold used for peak detection with the bPeaks program. No peaks were detected in the region upstream of the hutU, irrI, frnE, and rsr genes. To identify candidate binding sites of the DdrO protein, the nucleotide sequences of the ChIP peaks, between 151 and 1401 in length, were compared using MEME [51], to search for palindromic or non-palindromic motifs with an occurrence of one motif per sequence or any number of repetitions ( Figure 1B). A total of 41 peak sequences, located in the promoter regions, contained a conserved DNA motif close to the RDRM sequence, with some loci containing two motifs (Table S4), as illustrated for DdrA ( Figure 2). A broad peak was observed in its promoter region because it contains two RDRM (Table S4). Interestingly, based on ChIPseq data, the number of RDRM reported here is larger than that predicted by previous in silico analyses [19,33]. No other conserved sequence pattern was found, with the exception of the predicted core promoters, either from the 41 ChIP peaks sequences or from the other sequences lacking an RDRM. However, a degenerate RDRM may not have been detected due to the threshold used for these bioinformatic analyses. Altogether, these results confirmed in vivo the role of the RDRM for DdrO binding to the D. radiodurans genome. Independently, we investigated whether an RDRM was found in the promoter regions of other genes encoded by D. radiodurans. We monitored, with FIMO, their presence in a set of sequences covering the regions located between −500 and +100 nucleotides from the start of translation of all the 3147 CDS encoded by D. radiodurans. A total of 222 putative RDRM-like sequences ( Figure 1B, Table S5) were found, including the 41 detected by MEME and eight other potential sites, that were not detected by MEME, Figure 2. Visualization, through IGV, of the binding peaks obtained from genome analysis. Tag density profiles are illustrated for 2 IP (purple), the Input (dark grey), and the Mock (light grey) for five known DdrO-regulated genes: ddrA, ddrB, ddrC, gyrA, and gyrB. The green lines indicate the size of each peak identified by bPeaks. Genes are represented by green boxes, their location on either strand is indicated by > (strand +) and < (strand −). The genomic coordinates of each locus are indicated on the X-axis. dro_0898 is split into two CDS (DR0903-DR0904) in the White et al. annotation [48].
A careful inspection of DdrO-V5 IP tag density through the IGV program of the seven missing genes showed that a lower coverage of reads was observed at the promoter region of mutL, and small peaks were observed in the promoter regions of recQ and sbcD that fell below the threshold used for peak detection with the bPeaks program. No peaks were detected in the region upstream of the hutU, irrI, frnE, and rsr genes. To identify candidate binding sites of the DdrO protein, the nucleotide sequences of the ChIP peaks, between 151 and 1401 in length, were compared using MEME [51], to search for palindromic or non-palindromic motifs with an occurrence of one motif per sequence or any number of repetitions ( Figure 1B). A total of 41 peak sequences, located in the promoter regions, contained a conserved DNA motif close to the RDRM sequence, with some loci containing two motifs (Table S4), as illustrated for DdrA ( Figure 2). A broad peak was observed in its promoter region because it contains two RDRM (Table S4). Interestingly, based on ChIP-seq data, the number of RDRM reported here is larger than that predicted by previous in silico analyses [19,33]. No other conserved sequence pattern was found, with the exception of the predicted core promoters, either from the 41 ChIP peaks sequences or from the other sequences lacking an RDRM. However, a degenerate RDRM may not have been detected due to the threshold used for these bioinformatic analyses. Altogether, these results confirmed in vivo the role of the RDRM for DdrO binding to the D. radiodurans genome. Independently, we investigated whether an RDRM was found in the promoter regions of other genes encoded by D. radiodurans. We monitored, with FIMO, their presence in a set of sequences covering the regions located between −500 and +100 nucleotides from the start of translation of all the 3147 CDS encoded by D. radiodurans. A total of 222 putative RDRM-like sequences ( Figure 1B, Table S5) were found, including the 41 detected by MEME and eight other potential sites, that were not detected by MEME, but six of which were located far from the start of coding sequences and outside the ChIP-seq peaks.
Based on ChIP-seq results, 89 genes, sometimes included in operons, may be regulated by DdrO, but many enriched peaks were located within the intergenic region of divergently transcribed genes. It is possible that only one of the two divergent genes may be under the control of DdrO.

Transcriptome Analysis of D. radiodurans in Response to the Depletion of DdrO
In parallel, to further characterize the RDR regulon in D. radiodurans, we compared transcriptome profiles of cells expressing, or not, DdrO. Because DdrO is essential for cell viability [17,20], we used a conditional gene inactivation system [20,52]. In this system, ∆ddrO cells expressed the DdrO protein under control of its own promoter region at 30 • C from a temperature-sensitive (repU Ts ) replication vector ( Figure 3A) [20]. Shifting the culture to 37 • C, a non-permissive temperature, resulted in an inability of the plasmid to replicate during successive cell divisions, leading to the depletion of DdrO, in contrast with a derivative of this expression vector, containing the wild-type repU + gene, that did not cause depletion of DdrO at 37 • C [20,52]. The ∆ddrO/ddrO + (prepU Ts ) and ∆ddrO/ddrO + (prepU+) strains grown at 37 • C are denoted D37 and W37, respectively. Under our experimental conditions, the number of cells carrying the repU + vector is proportional to the increase in cell mass at 37 • C without a selective antibiotic ( Figure 3B). In contrast, the number of cells carrying the repU TS vector stopped increasing and remained stable over 24 h ( Figure 3B). The growth curves of both strains exhibited a comparable doubling time over 6 h. However, after this time lapse, the growth of the D37 strain also stopped, coinciding with the stress triggered by the depletion of DdrO ( Figure 3B).
In a first attempt, we analyzed the effect of DdrO depletion on the expression of DdrD, DdrO, PprA, and RecA proteins belonging to the RDR regulon. For this purpose, we used derivatives of the strain ∆ddrO (prepU Ts :ddrO + ) expressing DdrO-FLAG, PprA-HA, DdrD-HA, or RecA-HA tagged proteins. Depletion of the DdrO repressor, in cells grown at the non-permissive temperature, resulted in the complete loss of DdrO after 4 h at 37 • C ( Figure 3C) and an increasing amount of PprA, DdrD, and RecA proteins during the kinetics ( Figure S3).
In a second step, the kinetics of gene expression changes induced by DdrO depletion were analyzed for both strains, from three independent cultures and at six time points (1,4,6,8,16, and 24 h) ( Figure 4A). RNA sequencing data was performed from 36 samples, corresponding for each sample to an average sequencing depth of 647-fold the genome sequence.
A two-fold change in expression threshold for the ratio in these experiments was applied, together with a p-value < 0.01. Principal component analysis (PCA) confirmed that the transcriptome of the three biological replicates was clustered at each time point, showing the reproducibility of the experiments and the transcriptome patterns evolved as cells progressed through the time course of the experiment ( Figure 4B). The datasets are separate, in addition to the PC1 and the PC2 levels, which together explained approximately 75% of the variance.
To compare the transcriptome, the 1 h time point was used as the reference, giving time for the genome to stabilize its expression after shifting the temperature. We first compared, for each strain, the deregulation of all genes along the time course. The results of the differential expression for all genes in W37 and D37 are presented in Figure 5 and Tables S4 and S5. After an incubation of 24 h at 37 • C, numerous genes were deregulated, as 2129 unique genes i.e., 67.7% of all genes in the W37 strain, and 2330 unique genes, i.e., 74% in the D37 strain, were up-or downregulated at a minimum of one time point during the time course, showing that a cascade of cell regulation occurred into each strain over 24 h. Moreover, genes reported in one time point were often found in the following time point. From the set of 350 and 358 regulated genes in W37 and D37, respectively, during a time point between 1 and 4 h, 159 (45%) in W37 and 195 (54%) in D37 remained regulated during all time points (Table S6 and Figure 5). After 6 h at 37 • C, although most of the cells in the D37 strain lost the thermosensitive plasmid, 679 and 1011 genes were deregulated in W37 and D37, respectively, with 501 genes shared between them (Table S6), and several upregulated genes found in W37 were downregulated in D37. These results showed that, rapidly after the temperature shift, the expression patterns of W37 and D37 changed differentially and the depletion of DdrO in D37 directly or indirectly deregulated the expression of other genes.
viability [17,20], we used a conditional gene inactivation system [20,52]. In this system, ∆ddrO cells expressed the DdrO protein under control of its own promoter region at 30 °C from a temperature-sensitive (repUTs) replication vector ( Figure 3A) [20]. Shifting the culture to 37 °C, a non-permissive temperature, resulted in an inability of the plasmid to replicate during successive cell divisions, leading to the depletion of DdrO, in contrast with a derivative of this expression vector, containing the wild-type repU + gene, that did not cause depletion of DdrO at 37 °C [20,52]. The ∆ddrO/ddrO + (prepUTs) and ∆ddrO/ddrO + (prepU+) strains grown at 37 °C are denoted D37 and W37, respectively. Under our experimental conditions, the number of cells carrying the repU + vector is proportional to the increase in cell mass at 37 °C without a selective antibiotic ( Figure 3B). In contrast, the number of cells carrying the repUTS vector stopped increasing and remained stable over 24 h ( Figure 3B). The growth curves of both strains exhibited a comparable doubling time over 6 h. However, after this time lapse, the growth of the D37 strain also stopped, coinciding with the stress triggered by the depletion of DdrO ( Figure 3B).   To compare the transcriptome, the 1 h time point was used as the reference, givin time for the genome to stabilize its expression after shifting the temperature. We fir compared, for each strain, the deregulation of all genes along the time course. The resu of the differential expression for all genes in W37 and D37 are presented in Figure 5 an Tables S4 and S5. After an incubation of 24 h at 37 °C, numerous genes were deregulate as 2129 unique genes i.e., 67.7% of all genes in the W37 strain, and 2330 unique genes, i. 74% in the D37 strain, were up-or downregulated at a minimum of one time point durin the time course, showing that a cascade of cell regulation occurred into each strain ov 24 h. Moreover, genes reported in one time point were often found in the following tim point. From the set of 350 and 358 regulated genes in W37 and D37, respectively, durin a time point between 1 and 4 h, 159 (45%) in W37 and 195 (54%) in D37 remained regulate during all time points (Table S6 and Figure 5). After 6 h at 37 °C, although most of the ce in the D37 strain lost the thermosensitive plasmid, 679 and 1011 genes were deregulate in W37 and D37, respectively, with 501 genes shared between them (Table S6), and sever upregulated genes found in W37 were downregulated in D37. These results showed tha rapidly after the temperature shift, the expression patterns of W37 and D37 change differentially and the depletion of DdrO in D37 directly or indirectly deregulated th expression of other genes. To confirm the loss of the prepU Ts plasmid at 37 • C at the transcriptome level, we investigated the expression profiles of ddrO and spr encoding resistance to spectinomycin. As shown in Table S7, the spr gene was downregulated in D37, confirming the loss of the prepU Ts plasmid in growing cells at 37 • C (Figure 3), whereas the ddrO gene was upregulated in D37. When ∆ddrO/ddrO + strains were constructed, only the CDS encoding the DdrO protein was deleted from the genome. Therefore, the upstream region, containing the promoter and the 5 UTR region of ddrO, is duplicated, one located on the chromosomal locus, the second on the plasmid. The RDRM is located in the ddrO gene 153 nucleotides upstream of the ATG in the vicinity of the predicted promoter. The 5 UTR reads density profiles of ddrO were very low in the W37 strain, but were augmented in the D37 strain as soon as the cell lost the plasmid, supporting the findings of previous studies that show that DdrO regulates the expression of its own gene [17,26] ( Figure S4). Cells 2021, 10, x FOR PEER REVIEW 13 of 30  Table  S6) and the number of common genes among indicated time intervals (illustrated by solid black points, linked by black lines). The right panel shows the increasing number of unique genes that are differentially expressed at a minimum of one time point after the temperature shift. Number of deregulated genes (|FC| ≥ 2, p-value ≤ 0.01).
To confirm the loss of the prepUTs plasmid at 37 °C at the transcriptome level, we investigated the expression profiles of ddrO and spr encoding resistance to spectinomycin. As shown in Table S7, the spr gene was downregulated in D37, confirming the loss of the prepUTs plasmid in growing cells at 37 °C ( Figure 3), whereas the ddrO gene was upregulated in D37. When ∆ddrO/ddrO + strains were constructed, only the CDS encoding the DdrO protein was deleted from the genome. Therefore, the upstream region, containing the promoter and the 5′UTR region of ddrO, is duplicated, one located on the chromosomal locus, the second on the plasmid. The RDRM is located in the ddrO gene 153 nucleotides upstream of the ATG in the vicinity of the predicted promoter. The 5′UTR reads density profiles of ddrO were very low in the W37 strain, but were augmented in the D37 strain as soon as the cell lost the plasmid, supporting the findings of previous studies that show that DdrO regulates the expression of its own gene [17,26] ( Figure S4).
Twenty genes of the RDR regulon were upregulated in the D37 strain, often from the beginning of the experimental temperature shift, with increasing fold changes as cells progressed through the time course. The dr2256 gene encoding a transketolase, in addition to the ddrF and ssb genes, were also upregulated later (6 or 8 h, Table S7), and the uvrA gene (dr1771) was upregulated only after 16 h at 37 °C. The other genes, such as uvrD (dr1775) and irrI (dr0171), were not upregulated or only changed in a late stage of the experiment. In contrast, drA0151, encoding the first gene of the hut operon, was strongly downregulated in the W37 strain and in the D37 strain, but this gene was not reported as being under the control of the DdrO/IrrE proteins [24].
We also wondered if other D. radiodurans genes displayed a transcriptome pattern comparable to the RDR regulon genes. For this purpose, we selected genes as differentially expressed (DE) in the W37 strain in two or fewer comparisons (DE ≤ 2), and in more than  Table S6) and the number of common genes among indicated time intervals (illustrated by solid black points, linked by black lines). The right panel shows the increasing number of unique genes that are differentially expressed at a minimum of one time point after the temperature shift. Number of deregulated genes (|FC| ≥ 2, p-value ≤ 0.01).
Twenty genes of the RDR regulon were upregulated in the D37 strain, often from the beginning of the experimental temperature shift, with increasing fold changes as cells progressed through the time course. The dr2256 gene encoding a transketolase, in addition to the ddrF and ssb genes, were also upregulated later (6 or 8 h, Table S7), and the uvrA gene (dr1771) was upregulated only after 16 h at 37 • C. The other genes, such as uvrD (dr1775) and irrI (dr0171), were not upregulated or only changed in a late stage of the experiment. In contrast, drA0151, encoding the first gene of the hut operon, was strongly downregulated in the W37 strain and in the D37 strain, but this gene was not reported as being under the control of the DdrO/IrrE proteins [24].
We also wondered if other D. radiodurans genes displayed a transcriptome pattern comparable to the RDR regulon genes. For this purpose, we selected genes as differentially expressed (DE) in the W37 strain in two or fewer comparisons (DE ≤ 2), and in more than three comparisons in the D37 strain (DE > 3), considering most of the profiles exhibited by the predicted RDR genes (see Methods, Table S4). A total of 436 genes displayed similar transcriptome profiles ( Figure 1C, Table S8), reduced to 151 genes when an additional filter was applied to focus on genes which exhibited differential expression (up-or downregulation) at only intermediate time points, i.e., 6, 8, and 16 h. A total of 260, of the 436 identified genes, were upregulated in the D37 strain (Table S9), and 60% of these were distributed into four functional categories. A total of 31 genes encoding proteins involved in DNA metabolism, and including most of the previously predicted RDR regulon, were found, but also the recG and recO genes encoding a primosomal protein N', in addition to two DNA polymerase III subunits (DR0001 and DR0507), polA, two putative helicases (DR0837 and DR1572), ligA, mutS, and recN ( Figure 6). Interestingly, several genes encoding transcription factors involved in stress responses, such as LexA1, LexA2, or the Phage Shock Protein A PspA, were also upregulated in D37 ( Figure 6). Therefore, several regulatory networks were likely triggered in response to induction of the RDR regulon. downregulation) at only intermediate time points, i.e., 6, 8, and 16 h. A total of 260, of the 436 identified genes, were upregulated in the D37 strain (Table S9), and 60% of these were distributed into four functional categories. A total of 31 genes encoding proteins involved in DNA metabolism, and including most of the previously predicted RDR regulon, were found, but also the recG and recO genes encoding a primosomal protein N', in addition to two DNA polymerase III subunits (DR0001 and DR0507), polA, two putative helicases (DR0837 and DR1572), ligA, mutS, and recN ( Figure 6). Interestingly, several genes encoding transcription factors involved in stress responses, such as LexA1, LexA2, or the Phage Shock Protein A PspA, were also upregulated in D37 ( Figure 6). Therefore, several regulatory networks were likely triggered in response to induction of the RDR regulon. In addition, 15 genes encoding putative proteases and peptidases or regulators of protease activity, 15 genes coding for ABC transporters, permeases, and efflux In addition, 15 genes encoding putative proteases and peptidases or regulators of protease activity, 15 genes coding for ABC transporters, permeases, and efflux components, and >100 genes coding for uncharacterized proteins or of unknown function were also deregulated with similar expression patterns ( Figure 6).
We also investigated the presence of downregulated genes during DdrO depletion. Using the same settings, 176 downregulated genes were found ( Figure S5 and Table S10) to be widely distributed in the different COG categories, with a transcriptomic repression mainly beginning at 6 h when D37 cells are depleted in DdrO.
Our RNA seq analysis exhibited 436 deregulated genes, but the overall up-or downexpression of these genes may be a consequence of a cascade of regulation occurring when cells lost the ddrO gene.

Integration of the Data: The DdrO Map in D. radiodurans
To map the DdrO regulon, we integrated the results obtained in ChIP-Seq, RNAseq, and motif research. When candidates from these three experiments were compared,  37 genes met all three criteria, or 42 considering that several genes are included in putative operons (Table 2), such as the cinA-ligT-recA operon previously described by Makarova et al. [33], and 47 other genes met two criteria ( Figures 1D and 7). A consensus motif was searched for by MEME from all of the RDRM sequences listed in Table 2. This motif is consistent with those previously described [20,34]. In addition, 47 genes matched only two criteria. Among these, six genes upregulated in D37 and a ChIP peak was located in each promoter region, but sequences did not exhibit any RDRM or other conserved motif (Table S12). Howeve cannot exclude the possibility that the DdrO protein bound to a degenerate R sequence, which was not reported because of the criteria used here for in silico ana Among these, dr2606 encodes a predicted primosomal protein N′ and dr1790 encod the yellow protein, belonging to the ancient yellow/major royal jelly (MRJ) protein fa The deletion of dr1790 in D. radiodurans increased its membrane permeability decreased the cell growth rate and survival upon exposure to hydrogen peroxid radiation [53].
Twenty-six other genes were only associated with a DdrO peak and also wi RDRM in their promotor region (Figure 7 and Table S13). From this set of 26 genes, 1 divergently transcribed from genes that matched all criteria. It is thus likely that onl of the two genes that share the same intergenic region was regulated by DdrO. Sev the 13 other genes, such as dr0001 encoding DNA polymerase III subunit beta, Based on these results, the DdrO regulon comprises 16 previously predicted RDR genes, mainly involved in DNA repair pathways. Moreover, three other genes involved in DNA metabolism (recG, helD, and DNA ligase ligA), four genes associated with different metabolic pathways, five genes involved in translation, and seven new genes encoding proteins of unknown function (Figure 7, Table 2 and Table S11) are also under the control of DdrO. Surprisingly, two genes encoding transposases, drC0017 and dr1296, also matched all the criteria. One copy of drC0017 and two copies of the second IS (dr1296 and drC0033) are present in the genome. According to the Chandler classification (https://www-is.biotoul. fr/index.php) (accessed on 10 September 2021) dr1296 encodes the ISDra5 and drC0017 is part of the transposable element TnDra1. Because drC0033 and dr1296 CDS exhibited 100% sequence identity, we were not able to determine, by RNA-seq, whether one of the two genes displayed an upregulation during the time course. Moreover, only a small peak was observed for drC0033 but was below the threshold established for this study.
In addition, 47 genes matched only two criteria. Among these, six genes were upregulated in D37 and a ChIP peak was located in each promoter region, but their sequences did not exhibit any RDRM or other conserved motif (Table S12). However, we cannot exclude the possibility that the DdrO protein bound to a degenerate RDRM sequence, which was not reported because of the criteria used here for in silico analyses. Among these, dr2606 encodes a predicted primosomal protein N and dr1790 encodes for the yellow protein, belonging to the ancient yellow/major royal jelly (MRJ) protein family. The deletion of dr1790 in D. radiodurans increased its membrane permeability and decreased the cell growth rate and survival upon exposure to hydrogen peroxide and radiation [53].
Twenty-six other genes were only associated with a DdrO peak and also with an RDRM in their promotor region (Figure 7 and Table S13). From this set of 26 genes, 13 are divergently transcribed from genes that matched all criteria. It is thus likely that only one of the two genes that share the same intergenic region was regulated by DdrO. Seven of the 13 other genes, such as dr0001 encoding DNA polymerase III subunit beta, were described as upregulated upon exposure to gamma rays [12], but was not reported as differentially expressed in a ∆irrE mutant [24]. The uvrA and uvrD genes were also reported as being upregulated in the first 1.5 h when cells are exposed to large doses of gamma rays [12] and were differentially expressed very early in a ∆irrE mutant. These results suggest that the expression of these genes, including uvrA and uvrD, may be augmented very early during the time course (<1 h) or are under the control of DdrO and other regulatory elements, and the simple depletion of DdrO did not modify their expression under our experimental procedures.
Finally, 16 genes were only found to be upregulated in D37 and not in W37, and an RDRM was detected near their respective promoter region, but no ChIP-Seq peak was reported from the ChIP-seq analysis (Table S14). However, careful inspection of all the peaks with the IGV program showed that a small peak, that fell below the threshold used to analyze ChIP-seq data, was observed in the promoter regions of five genes (sbcD, recQ, drC0012 encoding a putative transcriptional regulator, drC0033 (transposase), and dr1707 encoding DNA polymerase I). Table 2. List of genes belonging to the DdrO regulon in D. radiodurans matching all criteria (transcriptomic, ChIP-seq, and RDRM). Previous experiments: (1) RDR genes previously predicted by bioinformatic analyses [19,33] (2) by in vitro EMSA assays which analyzed the binding of DdrO to RDRM sequences located in the promoter regions of these genes [26] or shown to be regulated after exposure to radiation by (3) transcriptomic [12], (4) proteomic studies [54], or (5) in a ∆irrE mutant [24]. Logo of the RDRM consensus performed by MEME from all the RDRM sequences listed in Table 2 [19,33] (2) by in vitro EMSA assays which analyzed the binding of DdrO to RDRM sequences located in the promoter regions of these genes [26] or shown to be regulated after exposure to radiation by (3) transcriptomic [12], (4) proteomic studies [54], or (5) in a ∆irrE mutant [24]. Logo of the RDRM consensus performed by MEME from all the RDRM sequences listed in Table 2. ND: Not determined.   [19,33] (2) by in vitro EMSA assays which analyzed the binding of DdrO to RDRM sequences located in the promoter regions of these genes [26] or shown to be regulated after exposure to radiation by (3) transcriptomic [12], (4) proteomic studies [54], or (5) in a ∆irrE mutant [24]. Logo of the RDRM consensus performed by MEME from all the RDRM sequences listed in Table 2. ND: Not determined.   [19,33] (2) by in vitro EMSA assays which analyzed the binding of DdrO to RDRM sequences located in the promoter regions of these genes [26] or shown to be regulated after exposure to radiation by (3) transcriptomic [12], (4) proteomic studies [54], or (5) in a ∆irrE mutant [24]. Logo of the RDRM consensus performed by MEME from all the RDRM sequences listed in Table 2. ND: Not determined.  List of genes belonging to the DdrO regulon in D. radiodurans matching all criteria (transcriptomic, ChIP-seq, and RDRM). Previous experiments: (1) RDR genes previously predicted by bioinformatic analyses [19,33] (2) by in vitro EMSA assays which analyzed the binding of DdrO to RDRM sequences located in the promoter regions of these genes [26] or shown to be regulated after exposure to radiation by (3) transcriptomic [12], (4) proteomic studies [54], or (5) in a ∆irrE mutant [24]. Logo of the RDRM consensus performed by MEME from all the RDRM sequences listed in Table 2. ND: Not determined.  Table 2. List of genes belonging to the DdrO regulon in D. radiodurans matching all criteria (transcriptomic, ChIP-seq, and RDRM). Previous experiments: (1) RDR genes previously predicted by bioinformatic analyses [19,33] (2) by in vitro EMSA assays which analyzed the binding of DdrO to RDRM sequences located in the promoter regions of these genes [26] or shown to be regulated after exposure to radiation by (3) transcriptomic [12], (4) proteomic studies [54], or (5) in a ∆irrE mutant [24]. Logo of the RDRM consensus performed by MEME from all the RDRM sequences listed in Table 2. ND: Not determined.  Table 2. List of genes belonging to the DdrO regulon in D. radiodurans matching all criteria (transcriptomic, ChIP-seq, and RDRM). Previous experiments: (1) RDR genes previously predicted by bioinformatic analyses [19,33] (2) by in vitro EMSA assays which analyzed the binding of DdrO to RDRM sequences located in the promoter regions of these genes [26] or shown to be regulated after exposure to radiation by (3) transcriptomic [12], (4) proteomic studies [54], or (5) in a ∆irrE mutant [24]. Logo of the RDRM consensus performed by MEME from all the RDRM sequences listed in shown to be regulated after exposure to radiation by (3) transcriptomic [12], (4) proteomic studies [54], or (5) in a ∆irrE mutant [24]. Logo of the RDRM consensus performed by MEME from all the RDRM sequences listed in   List of genes belonging to the DdrO regulon in D. radiodurans matching all criteria (transcriptomic, ChIP-seq, and RDRM). Previous experiments: (1) RDR genes previously predicted by bioinformatic analyses [19,33] (2) by in vitro EMSA assays which analyzed the binding of DdrO to RDRM sequences located in the promoter regions of these genes [26] or shown to be regulated after exposure to radiation by (3) transcriptomic [12], (4) proteomic studies [54], or (5) in a ∆irrE mutant [24]. Logo of the RDRM consensus performed by MEME from all the RDRM sequences listed in Table 2. ND: Not determined.    [19,33] (2) by in vitro EMSA assays which analyzed the binding of DdrO to RDRM sequences located in the promoter regions of these genes [26] or shown to be regulated after exposure to radiation by (3) transcriptomic [12], (4) proteomic studies [54], or (5) in a ∆irrE mutant [24]. Logo of the RDRM consensus performed by MEME from all the RDRM sequences listed in Table 2. ND: Not determined.   List of genes belonging to the DdrO regulon in D. radiodurans matching all criteria (transcriptomic, ChIP-seq, and RDRM). Previous experiments: (1) RDR genes previously predicted by bioinformatic analyses [19,33] (2) by in vitro EMSA assays which analyzed the binding of DdrO to RDRM sequences located in the promoter regions of these genes [26] or shown to be regulated after exposure to radiation by (3) transcriptomic [12], (4) proteomic studies [54], or (5) in a ∆irrE mutant [24]. Logo of the RDRM consensus performed by MEME from all the RDRM sequences listed in Table 2. ND: Not determined.

Consensus motif TATGYYHTTARCRKA
A highly condensed structure of the D. radiodurans chromatin may have locally impaired or decreased the efficiency of the ChIP experiments, and thus, these genes may also belong to the DdrO regulon. The location of the RDRM was also analyzed in the promoter region of all genes matching two or three of our criteria. The RDRM sequences were mainly located in the vicinity of the predicted position of E. coli-like −35 and −10 promoter consensus sequences (Tables S9, S11, and S12). These results are consistent with previous reports showing that RDRM in D. deserti was found both upstream (−50 bp) and downstream (+20 bp) of transcriptional start sites (TSS) potentially overlapping with the RNA polymerase binding site [17].
In parallel, the amount of protein encoded by the newly identified genes was analyzed in wild-type cells and in a ∆irrE mutant after exposure to mitomycin C (MMC). For this purpose, we monitored, by Western blot analysis, the expression of C-terminal-tagged recombinant proteins. The cellular levels of five recombinant proteins (RecG, LigA, DdrN, DRA0166, and DR2173) increased in wild-type cells in response to MMC, but remained constant in an ∆irrE mutant, thus corroborating our data (Figure 8). A highly condensed structure of the D. radiodurans chromatin may have locally impaired or decreased the efficiency of the ChIP experiments, and thus, these genes may also belong to the DdrO regulon. The location of the RDRM was also analyzed in the promoter region of all genes matching two or three of our criteria. The RDRM sequences were mainly located in the vicinity of the predicted position of E. coli-like −35 and −10 promoter consensus sequences (Tables S9, S11, and S12). These results are consistent with previous reports showing that RDRM in D. deserti was found both upstream (−50 bp) and downstream (+20 bp) of transcriptional start sites (TSS) potentially overlapping with the RNA polymerase binding site [17].
In parallel, the amount of protein encoded by the newly identified genes was analyzed in wild-type cells and in a ∆irrE mutant after exposure to mitomycin C (MMC). For this purpose, we monitored, by Western blot analysis, the expression of C-terminaltagged recombinant proteins. The cellular levels of five recombinant proteins (RecG, LigA, DdrN, DRA0166, and DR2173) increased in wild-type cells in response to MMC, but remained constant in an ∆irrE mutant, thus corroborating our data (Figure 8). We were not able to detect dr1297 or dr0561 tagged proteins, but these genes encode a predicted ABC transporter and a sugar ABC transporter, respectively, containing predicted transmembrane regions or a predicted periplasmic peptide signal that could lead to a low solubility of these proteins. In agreement with our gene expression data, previous transcriptome studies have shown that dr0561 was not upregulated in a ∆irrE strain when compared to a wild-type strain upon exposure to γ-ray irradiation [24], lending support to the idea that this predicted transporter is regulated in a DdrO/IrrE manner.
Among the set of genes predicted in silico by Makarova et al. [33], the frnE and rsr genes were upregulated in D37 over the time course of the experiment, but no DdrO-peak was found in their respective promoter regions. We analyzed the expression of the FrnE and Rsr proteins, and of DdrR (DR0053) predicted as belonging to the RDR regulon in D. deserti [19], although ddrR was not upregulated during our time course experiment and no peak was observed in its promoter region. The cellular levels of three recombinant proteins increased in response to MMC in wild-type cells, and also in a ∆irrE mutant We were not able to detect dr1297 or dr0561 tagged proteins, but these genes encode a predicted ABC transporter and a sugar ABC transporter, respectively, containing predicted transmembrane regions or a predicted periplasmic peptide signal that could lead to a low solubility of these proteins. In agreement with our gene expression data, previous transcriptome studies have shown that dr0561 was not upregulated in a ∆irrE strain when compared to a wild-type strain upon exposure to γ-ray irradiation [24], lending support to the idea that this predicted transporter is regulated in a DdrO/IrrE manner.
Among the set of genes predicted in silico by Makarova et al. [33], the frnE and rsr genes were upregulated in D37 over the time course of the experiment, but no DdrO-peak was found in their respective promoter regions. We analyzed the expression of the FrnE and Rsr proteins, and of DdrR (DR0053) predicted as belonging to the RDR regulon in D. deserti [19], although ddrR was not upregulated during our time course experiment and no peak was observed in its promoter region. The cellular levels of three recombinant proteins increased in response to MMC in wild-type cells, and also in a ∆irrE mutant ( Figure S6). These results strongly suggest that DdrR, FrnE, and Rsr proteins are induced by genotoxic stress but not in a DdrO/IrrE-dependent manner.
With the exception of HflX, which was found to be induced in wild-type and ∆irrE cells after exposure to MMC, no change in protein quantity was observed for DRA0275, DR0217, DR0685, DR1571, DR1572, DR2255, and DR2174 after exposure to MMC ( Figure S6). However, as already reported in D. radiodurans, the upregulation of several genes at the transcriptomic level was not always observed at the proteomic level [54]. This may be due to their abundance in the cell, their very transient expression, or their instability. A mechanism of translational regulation may also occur after transcription of these genes. Alternatively, it is possible that the genotoxic conditions after MMC exposure did not exhibit appropriate deleterious effects in cells to trigger induction of these proteins when compared to an exposure to other stresses, such as gamma radiation, Methyl MethaneSulfonate (MMS), or desiccation [27].

Discussion
An RDR regulon was proposed several years ago based on the presence of the RDRM sequence, a common 17 bp palindromic sequence, located in the promoter region of the most highly ionizing radiation and desiccation upregulated genes in D. radiodurans and D. geothermalis [33]. To date, identification of putative DdrO target genes in D. radiodurans has been mostly proposed by a combination of bioinformatic analyses based on microarray gene expression data [33] and a validation, in vitro, by Electrophoretic Mobility Shift Assay (EMSA) experiments [26]. Here, we combined two large scale approaches to identify DdrO targets in vivo with reliable accuracy. Analysis among transcriptome data, identification of enriched DdrO binding sites, and the presence of an RDRM in the promoter region of D. radiodurans genes allowed us to identify at least 35 DdrO target genes matching all criteria ( Table 2) and to propose other genes that may also be regulated by this transcription factor (Tables S10-S12).
Up to 70% of the identified target genes were previously predicted to be part of the RDR regulon [19,26,33]. The ddrF gene (dr0219), absent from the genome annotation published by Hua and Hua [49], is identified here as belonging to the RDR regulon, as initially described [33]. In addition, we highlighted 18 new DdrO target genes, including genes involved in DNA maintenance such as dr1289, dr1572, and dr2069, encoding the RecG helicase, HelD superfamily I helicase IV, and the replicative DNA ligase LigA, respectively. In agreement with these data, transcriptome analysis of cells recovering from exposure to ionizing radiation showed no upregulation of expression of these three genes in a ∆irrE mutant compared to the wild-type strain [24]. In E. coli, RecG plays an important role in DNA repair, recombination, and replication [55], whereas HelD, from Bacillus subtilis or Mycobacterium smegmatis, is associated with transcriptional pathways [56,57]. In D. radiodurans, cells devoid of RecG exhibit a delay in growth and double strand break (DSB) repair kinetics, and a decrease in resistance to γ-irradiation and H 2 O 2 [58,59], whereas the ∆dr1572 mutant exhibited a greater sensitivity to H 2 O 2 , but no change in resistance to ionizing radiation and to MMC when compared to the wild-type strain [60].
The identification of ligA as a target gene is interesting, because DNA ligases are implicated in DNA repair and are essential in other fundamental processes within the cell [61]. Ligase activity is crucial during DNA recombination and replication, explaining the constitutive expression of DNA ligase during all phases of the cell cycle [61]. Therefore, DdrO binding on the ligA promoter region should not completely repress gene expression, to thus ensure a minimum level of DNA ligase activity. However, in response to elevated amounts of DNA damage, and particularly to DSB, the basal level of LigA may not be sufficient for accurate Single Strand Annealing (SSA) [8] and Extended Synthesis-Dependent Strand Annealing (ESDSA) mechanisms, or for homologous recombination [5].
We identified new DdrO target genes, ddrN and dr2255, encoding putative GNAT family acetyltransferases that may be involved in post-translational modification (PTM) pathways. The ∆irrE cells recovering from exposure to ionizing radiation exhibited no upregulation of ddrN expression compared to a wild-type strain [24]. PTMs in bacteria play crucial roles in various cellular pathways, including after metabolic shifts and stress adaptation [62]. Acetylation is known to modify a variety of substrates involved in RNA metabolism, enzymatic activity, or DNA-related mechanisms [63]. In E. coli, acetylation of the chromosomal replication initiation protein DnaA leads to an arrest of DNA replication [64]. Moreover, acetylation of histone-like nucleoid protein HU in M. tuberculosis alters the in vitro DNA-binding capacity of HU and the DNA structure, which may affect gene transcription and other protein-DNA interactions [65]. D. radiodurans HU protein has been reported as a major actor of nucleoid compaction [52,66] and may also be acetylated. Further analyses are required to understand the impact of acetylation activity in response to DNA damage in D. radiodurans.
Surprisingly, two genes, drC0017 and dr1287, encoding transposases belonging to the Tn3 family and to the ISDra5 (IS5 family), respectively, were identified as belonging to the DdrO regulon. Transposons are major actors of genome remodeling and play an important role to create diversity and to facilitate adaptation of the host to extreme environmental conditions. Insertion sequences are abundant in D. radiodurans and IS transposition is a major event in spontaneous, in addition to induced, mutagenesis [67]. It has been previously shown that ISDra2, ISDra5, ISDra3, ISDra4, and IS2621 belonging to different families (IS200/IS605, IS5, IS630, IS630, and IS4, respectively) were transpositionally active in this organism under normal growth conditions and transposition was enhanced in cells recovering from DNA damage. Transposable element expression and movement are generally tightly regulated and different mechanisms control their gene expression. In E. coli, LexA protein represses expression of the Tn5 transposase gene [68]. Further studies are required to better understand how DdrO contributes to the regulation of transposition events of these two ISs.
We also identified several genes encoding proteins of unknown function, such as dr2173 and the drA0165-drA0166 operon, which were strongly upregulated in response to a depletion of DdrO. These genes were not upregulated in irradiated ∆irrE cells [24]. Domains of unknown function (DUF) found in DR2173 and DRA0166 are widely conserved in bacteria. A DUF4132 within DRA0166 may be involved in the molybdopterin biosynthesis. DR2173 also shares an N-terminal WGR domain with the MolR protein, which may be involved in regulation of molybdate biosynthesis in E. coli [69] and was described as belonging to the LexA regulon. However, molybdate-metabolism associated genes, such as D. radiodurans moeA or moeB, were not differentially expressed in D37 compared to W37. The role of these strongly upregulated unknown genes in response to DNA damage also remains to be discovered.
DdrO-structure and biochemicals data suggested that, in response to DNA damaging conditions, upregulation of the expression of genes of the RDR regulon is dependent on a dynamic balance between DdrO dimers bound to DNA and the IrrE-cleavable DdrO monomer forms [25,28]. Thus, the cleavage of DdrO monomers by IrrE would reduce the amount of DdrO dimers able to bind to DNA, leading to the induction of the expression of genes controlled by this regulator. Here, the transcriptome data showed that identified RDR regulon genes were not all upregulated at the same time during the DdrO depletion. The expression of some genes, such as ddrC, ddrD, and pprA, was strongly upregulated at early times, whereas ssb, uvrD, and ddrF showed a late upregulation, suggesting that DdrO bound with more or less affinity to DNA according to the divergent RDRM sequences that may diverge. Therefore, after exposure of ionizing radiation, some genes would be upregulated earlier than others during cell recovery. It has been shown that, following extended DdrO-depletion, D. radiodurans cells were engaged in an apoptotic-like cell death (ALD) pathway leading to morphological alterations, such as larger cells, membrane blebbing, and DNA fragmentation [20]. In Caulobacter crescentus, the BapE endonuclease was reported to be involved in DNA fragmentation upon severe and extensive DNA damage [70]. BapE induction is triggered only in the case of prolonged LexA self-cleavage and was not described as part of early induced SOS response genes [70]. Instead, our results did not allow us to identify new genes belonging to the RDR regulon and differentially expressed at late times, suggesting that ALD would be triggered by long-lasting induction of one or more genes from the RDR regulon, or by a cascade of regulation events following depletion of DdrO.
Despite the identification of many promoter regions containing a putative RDRM sequence in the D. radiodurans genome, we showed that only a small proportion of these are bound by DdrO. Based on the presence of the RDRM sequence and induction of expression in response to ionizing radiation, rsr, frnE, irrI, ddrR, and the hutU operon were previously described as part of the RDR regulon in D. radiodurans [19,33]. In agreement with Wang et al. [26], we showed that expression of the hutU operon, rsR, frnE, and irrI is not under the control of DdrO, but also ddrR. However, we showed that the quantity of RsR, FrnE, and DdrR proteins is induced in response to exposure to MMC ( Figure S6) in concordance with gene expression data [12,13,19]. Thus, identification of the RDRM sequence is not sufficient to enable DdrO binding. Many factors, such as DNA structure or binding competition between multiple transcription factors, can affect accessibility to a genome region for DdrO [71]. If our data showed that DdrO appears to bind exclusively to the RDRM sequence, then some RDR regulon genes, such as ddrA, gyrA, or dr1572, were also described as being under the control of another major regulator, DdrI [72], highlighting the regulatory cross-talk in the D. radiodurans DNA damage response.
The D. radiodurans genome encodes more than one hundred predicted transcriptional regulators, but few studies have been undertaken to identify the genes they may regulate. The RDR regulon of the radioresistant bacterium D. radiodurans, characterized here by integrative genomic analyses, paves the way for further studies to better depict the regulatory networks underlying the mechanisms that contribute to the extreme radiation tolerance of this fascinating bacterium.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10 .3390/cells10102536/s1. Figure S1: A ddrO allele replacement by a gene expressing the recombinant DdrO-V5 protein did not modify the growth rate. Figure S2. Visualization, through IGV, of the tag density profiles for 20 genes reported as belonging to the RDR regulon. Figure S3. Western blot analysis of the expression of the recombinant RecA-HA, PprA-HA, and DdrD-HA proteins in D37 at 37 • C. Figure S4. Visualization using Tablet software [73] of the DdrO gene reads mapping at 4 h at the D. radiodurans ddrO genome locus and on the replication plasmids in strains W37 and D37, respectively. Figure S5. Hierarchical clustering of genes whose expression is specifically downregulated in response to DdrO depletion. Figure S6. Expression of several proteins after exposure to MMC in an IrrE-and DdrO-independent manner. Table S1. Bacterial strains and plasmids used in this study. Table S2. Primers used for strain and plasmid constructions. Restriction sites are indicated in bold. Table S3. List of CDS in the D. radiodurans strain ATCC 13939 genome sequence. The paralogs found in this sequence, and the orthologs in other D. radiodurans R1 sequence releases, are indicated with a threshold of 80% of the maximum bit score applied. Table S4. Data integration for all CDS. List of CDS (Column A). RDR genes predicted by Makarova et al. [33] (Column B). RNA-seq data (columns C-F): Number of time points with differential expression in W37 (column C) or in D37 (column E), or only upregulated at intermediate time points, i.e., 6, 8, and 16 h in W37 (column D) or in D37 (column F). ChIP-seq data (Columns G-L). Number of peaks detected in the promoter region of each CDS (Column G), peak name (Column H), and their coordinates on Genome Orsay (Columns K-L). In the case of the presence of two peaks, all of the information for the second peak is given in columns Y-AF. Search for palindromic or non-palindromic motifs ("palindrome" or "no palindrome", respectively) with MEME or/and FIMO with an occurrence of one motif per sequence or any number of repetitions ("0-1" or "any", respectively) (Columns M-X). The sequences of each motif close to the RDRM is indicated for each peak. For palindromic motifs, only the sequence found on one DNA strand is indicated. Table S5. Conserved motifs predicted by FIMO. The four position specific scoring matrix (PSSM) from MEME analysis ( Figure 1B) was used as inputs for the FIMO program, scanning sequences between −500 and +100 of all annotated CDS, for palindromic or non-palindromic motifs with an occurrence of one motif per sequence or any number of repetitions. For a search for palindromic sequences, FIMO listed the motifs found on the both strands. Table S6. Number of unique and common up-and downregulated genes in W37 and D37, respectively, during the time course. For each time lapse, a 2-fold change expression threshold for the ratio experiments was applied together with a p-value < 0.01. Table S7. Variation