Computational elucidation of regulatory network 1 responding to acid stress in Lactococcus lactis MG1363

predict and validate regulons related to acid stress response in Lactococcus lactis 33 MG1363. A total of 51 regulons were identified, and 14 of them have computational 34 verified significance. Among these 14 regulons, five of them were computationally 35 predicted to be connected with acid stress response with (i) known transcriptional 36 factors in MEME suite database successfully mapped in Lactococcus lactis MG1363; 37 and (ii) differentially expressed genes between pH values of 6.5 (control) and 5.1 38 (treatment). Validated by 36 literature confirmed acid stress response related proteins 39 and genes, 33 genes in Lactococcus lactis MG1363 were found having orthologous 40 genes using BLAST, associated to six regulons. An acid response related regulatory 41 network was constructed, involving two trans-membrane proteins, eight regulons 42 ( llrA , llrC , hllA , ccpA , NHP6A, rcfB , regulons #8 and #39), nine functional modules, 43 and 33 genes with orthologous genes known to be associated to acid stress. Our 44 RECTA pipeline provides an effective way to construct a reliable gene regulatory 45 network based on regulon elucidation. The predicted resistance pathways could serve 46 as promising candidates for better acid tolerance engineering in Lactococcus lactis . It 47 has a strong application power and can be effectively applied to other bacterial 48 genomes, where the elucidation of the transcriptional regulation network is needed. 49


INTRODUCTION
Lactococcus lactis (L.lactis) is one of the mesophilic Gram-positive lactic acidproducing bacteria.It is has been widely applied in dairy fermentations such as cheese and milk product (Wegmann et al. 2007).Several studies have provided evidence of its essential roles in desecrating and delivering proteins or vaccine for immune treatment such as diabetes (Ma, Liu, et al. 2014), malaria (Ramasamy et al. 2006), tumor (Bermudez-Humaran et al. 2005, Zhang et al. 2016) and infections (Hanniffy et al. 2007).Holding the advantage of higher acid tolerance to protect vectors from resolving during delivery inside of the animal body, L. lactis has more potential and safety in oral drug development (Hols et al. 1999).Moreover, it has been found that L. lactis, along with some Lactobacillus, Bifidobacterium and other gut microbiota, were associated with obesity (Million et al. 2012).Such studies lead to the possibility and availability of L. lactis in metagenome study to investigate the effect of microbial interaction between L. lactis and other species in the human body.The optimal pH range for their growth is between 6.3-6.9 (Hutkins and Nannen 1993).The production of lactic acid, a weak organic acid with a pKa of 3.86, leads to pH drop in the extracellular media, reduced metabolic capability, decreased growth rate and ultimately reduced cell viability (van de Guchte et al. 2002).It is now well established that Lactococcus have evolved stress-sensing systems, which enable them to tolerate harsh environmental conditions (Carvalho et al. 2013, Hutkins and Nannen 1993, van de Guchte, Serror, Chervaux, Smokvina, Ehrlich and Maguin 2002).
The reason that bacteria maintain the protection mechanism to against acid stress is to withstand the deleterious effects caused by the harmful high level of protons in the exposed environment.Acid stress is known to change the level of the alarmones (guanosine tetraphosphate and guanosine pentaphosphate, collectively referred to as (p)ppGpp) (Hauryliuk et al. 2015) and leads to stringent response (Rallu et al. 2000).Many mechanisms or genes related to the acid stress response (ASR) have been identified.Proton-pumping activity as the direct regulator to acid stress response controls the intracellular pH level by pumping extra protons out of the cell (Koebmann et al. 2000, Lund et al. 2014), and the increase of alkaline compounds level also counters the acidification found in streptococci (Shabayek and Spellerberg 2017).Cell repairing for acid damage by chaperones or proteases, such as GroES, GroEL, GrpE, HrcA, DnaK, DnaJ and Clp (Frees et al. 2003, Jayaraman et al. 1997), and hdeA/B and Hsp31 in Escherichia coli (E.coli) (Kern et al. 2007, Mujacic andBaneyx 2007), the arginine deiminase system (ADI) (Budin-Verneuil et al. 2006, Ryan et al. 2009, Sun et al. 2012, Zuniga et al. 2002) and glutamate decarboxylases (GAD) pathways, etc. (Hoskins et al. 2001, Nomura et al. 1999, Sanders et al. 1998) have been proved to be associated with the acid response.Additionally, transcriptional regulators, sigma factors, and two-component signal transduction system (TCSs) have also been demonstrated to be responsible for ASR by modifying gene expression (Cotter and Hill 2003).These genes or pathways suggest low pH has widespread or global adverse effects on cell functions and inflicts response at genomic, metabolic, and macromolecular levels.To better understand the mechanism that controls the acid tolerance and responds to the acid stress in L. lactis, we considered MG1363, a strain extensively studied for acid resistance, to carry out computational analyses (Carvalho, Turner, Fonseca, Solopova, Catarino, Kuipers, Voit, Neves and Santos 2013, Hartke et al. 1996, Linares et al. 2010, Sanders et al. 1999).
To adequately describe the transcriptional state and gene regulation responsible for ASR in L. lactis, a gene regulatory network (GRN) integrating all individual pathways is needed.Genomic and transcriptomic analyses have been widely used for elucidating GRN hierarchies and offering insight into the coordination of response capabilities (Arnoldini et al. 2012, Carvalho, Turner, Fonseca, Solopova, Catarino, Kuipers, Voit, Neves and Santos 2013, Levine et al. 2013, Locke et al. 2011).
However, an obstacle to this kind of genome-scale network is the lack of welldocumented transcription factors (TFs) and the regulatory mechanisms at the transcription unit level in MG1363.One way to study the mechanism of transcriptional regulation in microbe genomics is the regulon prediction.A regulon is a group of co-regulated operons, which contains single or multiple consecutive genes along the genome (Cao, Ma, et al. 2017, Mao et al. 2015, Zhou et al. 2014).Genes in the same operon are controlled by the same promoter and are co-regulated by one or a set of TFs (Jacob et al. 1960).The elucidation of regulons can improve the identification of transcriptional genes, and thus, reliably predict the gene transcription regulation networks (Liu, Zhou, et al. 2016).
There are three ways for regulon prediction, which includes: (i) Predicting new operons for a known regulon (Kumka andBauer 2015, Tan et al. 2001).This method combines motif profiling with a comparative genomic strategy to search for related regulon members and carries out systematical gene regulation study.(ii) Integrating cis-regulatory motif (motif for short) comparison and clustering to find significantly enriched motif candidates (Gupta et al. 2007, Ma et al. 2013).The candidate motifs are then assembled into regulons.(iii) Performing ab initio novel regulon inference using de novo motif finding strategy (Novichkov et al. 2010).The approach uses phylogenetic footprinting technique which mostly relies on the reference verification (Blanchette et al. 2002, Katara et al. 2012, Liu, Zhang, et al. 2016), and can perform a horizontal sequential comparison to predict regulons in target organisms by searching known functional related regulons or TFs from other relevant species.One algorithm for phylogenetic footprinting analysis called Motif Prediction by Phylogenetic Footprinting (MP3) has been developed by Dr. Liu etc. and used for regulon prediction in E. coli (Liu, Zhang, Zhou, Li, Fennell, Wang, Kang, Liu and Ma 2016).MP3 was then integrated into DMINDA web server along with other algorithms like the Database of Prokaryotic Operons 2.0 (DOOR2) (Cao, Ma, Chen andXu 2017, Mao et al. 2014), BOttleneck BROken (BoBro) (Li, Ma, et al. 2011) and BoBro-based motif Comparison (BBC) (Ma, Liu, Zhou, Yin, Li and Xu 2013), to construct a complete pipeline for regulon prediction.In the latest research, a newly developed pipeline called Single-cell Regulatory Network Inference and Clustering (SCENIC) combines motif finding from co-expression gene modules (CEMs) with regulon prediction for single-cell clustering and analysis (Aibar et al. 2017).Such mentality builds up a way of regulon application in single-cell and metagenomic research.Nevertheless, without a suitable regulon database, researchers need to build up the library first through operon identification, CEM analysis, motif prediction and comparison (Jensen et al. 2005).
Here, we designed a computational framework of regulon identification based on comparative genomics and transcriptomics analysis (RECTA) to elucidate the GRN responding to acid stress in MG1363.The general framework is showcased in Figure 1 with six steps: (i) MG1363 co-expression gene modules (CEMs) and differentially expressed genes (DEG) were generated from microarray data by hcluster package and Wilcoxon test in R, respectively.MG1363 operons were predicted from genome sequence using DOOR2 webserver and assigned into each CEM; (ii) For each CEM, the 300bp upstream to the promoter was extracted and the sequences were used to find motifs using DMINDA2.0;(iii) The top five significant motifs in each CEM were reassembled by their similarity comparison and clustering to predict regulons; (iv) The motifs were compared to known transcription factor binding sites (TFBSs) in the MEME suite, and the TFs corresponding to these TFBSs were mapped to MG1363 using BLAST.Only regulons with DEGs and mapped TF were kept as ASR-related regulons; (v) Experimentally identified ASR-related genes in other organisms were mapped to MG1363 using BLAST and allocated to corresponding regulons for further verification; and (vi) The relationship between regulons and functional gene modules were established to elucidate the overall ASR mechanism in MG1363.As a result, 14 regulons are identified, literature verified or putative, to be connected to ASR.Eight regulons, related to nine functional modules and 33 associated genes, which are considered as the essential elements in acid resistance in MG1363.This proposed computational pipeline and the above results significantly expand the current understanding of the ASR system, providing a new method to predict systematic regulatory network based on regulon clustering.

Data Acquisition
L. lactic MG1363 genome sequence was downloaded from NCBI (GenBank accession number: AM406671).The microarray dataset contains eight samples under different acid stress conditions for MG1363 was downloaded from the Gene Expression Omnibus (GEO) Database (Series number: GSE47012).The data has been treated with LOWESS normalization by the provider.The details on cell culture preparation and data processing can be found in the previous study (Carvalho, Turner, Fonseca, Solopova, Catarino, Kuipers, Voit, Neves and Santos 2013).This dataset has all the bacteria grown in the basic conditions: a 2-liter fermenter in chemically defined medium containing 1% (w/v) glucose at 30°C.The control and treatment samples were grown at a pH of 6.5 and 5.1, respectively.

Operon identification
The genome-scale operons of MG1363 were identified by DOOR2.It is a one-stop operon-centered resource including operons, alternative transcriptional units, motifs, terminators, conserved operons information across multiple species (Mao, Ma, Zhou, Chen, Zhang, Yang, Mao, Lai and Xu 2014).Operons were predicted by the back-end prediction algorithm, with a prediction accuracy of 90-95%, based on the features of intergenic distance, neighborhood conservation, short DNA motifs, length ratio between gene pairs, and newly developed transcriptomic features trained from the strand-specific RNA-seq dataset (Chou et al. 2015, Li, Liu, et al. 2011).

Gene differential expression analysis and co-expression analysis
DEGs were identified based on the Wilcoxon signed-rank test (Bauer 1972) between the control and treatment, which was performed in R. The gene co-expression analysis was performed using a hierarchical clustering method ('hcluster' package in R (Antoine Lucas 2006)) to detect the CEMs under the acid stress in MG1363.

Motif finding and regulon prediction
Genes from each CEM were first mapped to the identified operons to retrieve the basic transcription units.Next, 300 bps in the upstream of the translation starting sites for each operon were extracted, in which motif finding was carried out using the web server DMINDA (Ma, Zhang, et al. 2014, Yang et al. 2017).DMINDA is a dominant motif prediction tool, embraced five analytical algorithms to find, scan, and compare motifs (Li, Liu, Ma and Xu 2011, Liu et al. 2017, Ma, Liu, Zhou, Yin, Li and Xu 2013), including a phylogenetic footprint framework to elucidate the mechanism of transcriptional regulation at a system level in prokaryotic genomes (Li, Ma, Mao, Yin, Zhu and Xu 2011, Liu, Zhang, Zhou, Li, Fennell, Wang, Kang, Liu and Ma 2016, Liu, Zhou, Li, Zhang, Zeng, Liu and Ma 2016).All sequences were uploaded to the server and default parameters were used to find the top five significant motifs (p-value < 0.05) in each cluster.The identified motifs were subjected to motif comparison and grouped into regulons based a sequence similarity cutoff using the BBC program in DMINDA (Ma, Liu, Zhou, Yin, Li and Xu 2013).

Regulon validation based on TF BLAST and DEG filtering.
Each highly conserved motif was considered to contain the same TFBS among species.Therefore, a comparison study was performed using TOMTOM in the MEME Suite (Bailey et al. 2009) between identified motif and public-domain TFBS databases, including DPINTERACT, JASPAR, RegTransBase, Prodoric Release and Yeastract, to find TFBSs and corresponding TFs with significant p-values in other prokaryotic species.Those TFs were then mapped to MG1363 using BLAST by default parameters to predict the connection between regulons and TFs in MG1363.On the other hand, since genes without differential expression were supposed not to react to pH changes, and thus, irrelevant to ASR, regulons without DEG were supposed not involved in the GRN, and thus, excluded from the following steps.

Regulon validation based on known ASR proteins from the literature.
To validate the performance of the above computational pipeline for regulon prediction, a literature-based validation was performed.Thirty-six ASR-related proteins and genes in other organisms including L. lactis, E. coli, Streptococci, etc. were first manually collected from literature, and their sequence was retrieved from the NCBI and UniProt databases.They were used to examine the existing known mechanisms in response to pH changes in MG1363 using the BLAST program by default parameters on NCBI.Such literature-based validation can either confirm the putative regulons when known ASR-related genes can be found in the significant regulons or expand our results to some insufficiently significant regulons, which indicate both false positive and true negative rate to evaluate the computational pipeline.

Predicted operons and CEM generation
A total of 1,565 operons with 2,439 coding genes of MG1363 (Dataset S1) were retrieved from the DOOR2 database.By co-expression analysis, the 1,565 operons were grouped into 124 co-expressed clusters.Among these clusters, two large ones contain more than 200 operons.Each of them was removed from the subsequent analyses as larger clusters may have higher chances to induce false positive operons which were connected with true operons by co-expression analysis.For the rest 122 clusters covering 2,122 genes, 26 (21%) of them contain operons no more than ten; the smallest cluster had two operons, and most of the clusters (90%) contained operons in a range of 10 to 50 (Dataset S2 and Figure S1).

Predicted regulons based on motif finding and clustering.
Using BoBro in the DMINDA web server, multiple motif sequences were identified from the 300 bps in the upstream of the translation start sites for each operon.Only the top five significant motifs (p-value < 0.05) were selected in each cluster, giving rise to a total of 610 (122×5) identified.The motif comparison-and-clustering analysis was then performed on the 610 motifs, and 51 motif clusters were identified with a motif similarity 0.8 as a cutoff.Intuitively, the operons sharing highly similar motifs in each motif cluster are supposed to be regulated by the same TF and tend to be in the same regulon.Hence, these 51 motif clusters correspond to 51 regulons (Dataset S3).
Table 1 should be placed here.
Additionally, 86 down-regulated genes and 55 up-regulated genes (Dataset S5), resulted from DEG analysis, were integrated into the regulons.Regulons #10, #37, #44 and #47 were found lack of DEGs.Thus gene (llmg_0271), related to regulon #10, was not likely to respond to acid stress in MG1363 even has successfully mapped to MG1363, and was then grouped into the potential candidate.On the contrary, ccpA and llrA were still retained due to their involvements in regulons #15 and #12 with DEGs, respectively.

Verified regulons based on literature verification
Altogether, 36 literature-supported ASR-related transporters were successfully mapped to MG1363 using blast with an E-value cutoff as 1e-10 and resulted in a total of 33 mapped genes.All the 36 transporters were categorized into nine modules based on their biological functions or regulated pathways, including L-lactate dehydrogenase (LDH), GAD, ADI, urea degradation, F1/F0ATPase, acid stress, protein repair and protease, envelope alterations and DNA repair.The 33 mapped genes compose of 22 operons and six regulons: llrA, llrC, hllA, NHP6A, regulon #8 and #39, which were subjected, one or more, to each functional module (Table 2).
Regulon llrA, llrC, and hllA have already been computationally identified in Table 1 and supported again by literature verification results.NHP6A which interestingly has a homologous TF in human and fungi but not in L. lactis (Kolodrubetz andBurgum 1990, Stillman 2010), yet failed to map in MG1363.Here, we are using NHP6A to represent regulon #20, as their relationship has been predicted computationally in Table 1.Regulon #39 was identified to be regulated by llrD, one of the six twocomponent regulatory systems in MG1363 (O'Connell-Motherway, van Sinderen, Morel-Deville, Fitzgerald, Ehrlich and Morel 2000).Regulons #8 (llmg_1803) and #39 (llrD) were not included in the 14 significant regulons in Table 1.For NHP6A, regulons #8 and #39 were the enrichment by literature validation as it expanded regulon results of RECTA pipeline.Among the nine functional modules, llrA were found connected to five of them, and NHP6A related to three.On the other hand, the GAD and urea degradation functional modules were failed to connect to any previous regulons.
Table 2 should be placed here.
Compare to the regulon verification based on TF blast and DEG, the literature verification identified two more regulons (#8 and #39) that lay in the insignificant group, however, with no sign of ccpA regulon.Thus, such result indicated a possible false positive rate of 1/5 and a true negative rate of 2/37 of our computational pipeline, indicating a reliability and feasibility of using RECTA to predict the ASRrelated regulons.In Figure 2, we are showing the processes and results for both literature verification and computational pipeline in details.The final eight regulons predicted from both parts were then compared to construct a GRN response to acid stress, integrating with other information found in the literature.

A model of regulatory network in response to pH change
According to the results outlined above, we are presenting a working model of the transcriptional regulatory network in response to acid stress response in MG1363 (Figure 3).The network consists of two transmembrane proteins (Dataset S6), eight regulons, nine functional modules, and 33 orthologous genes known for ASR in other bacteria that are also contributing in MG1363.
Figure 3 should be placed here.
The network is subjected to respond to the change of intracellular proton level.The signal is captured by H+ sensor and regulons are initiated to be regulated.Although significance was not shown for rcfB in our computational result, it has been reported to recognize and regulate promoter P170 (Madsen et al. 1999), P1 and P3 (Hindre et al. 2004, Rince et al. 1994), which are activated by the ACiD box and essential to acid response (Madsen et al. 2005).With the ACiD box, operons like groESL, lacticin 481 and lacZ have been proved to be regulated by rcfB, while als, aldB, etc. have not (Madsen, Hindre, Le Pennec, Israelsen and Dufour 2005).The homologous comparative study also predicted the existence of the ACiD box in llrA (Akyol et al. 2008, O'Connell-Motherway, van Sinderen, Morel-Deville, Fitzgerald, Ehrlich andMorel 2000).With such evidence, we separated rcfB from regulon 39 and predicted that rcfB is first triggered by H+ sensor and act as the global initiator that controls the other seven regulons.It was reasonable that rcfB-related regulon #39 failed to show significant TF matching result after CEM treatment in the operon clustering step.RcfB worked as a trustworthy global factor; its differential expression should be less significant than regulons directly respond to acid stress, thus lead to the failure of being predicted by the RECTA pipeline.Nevertheless, the less amount of microarray data sets (eight) also limited the real performance to the ASR.However, the mechanism of how H+ sensor is activating and regulating the GRN and rcfB remains unclear.In the seven regulons, three of them, llrA, llrC and hllA, were documentary verified to be related to ASR; regulons #8 and #39 showed less significant in regulon prediction; NHP6A was considered as putative regulon due to its failure mapping in MG1363; and ccpA was another putative regulon without literature support.
The six downstream regulons (llrA,llrC,hllA,NHP6A,regulon #39,and regulon #8) other than ccpA, interact with each other to regulate six ASR-related functional modules, including ADI system, DNA repair, LDH, protein repair, envelope alterations, and F0/F1ATPase.The ADI pathway, which generates ATP and protects cells from acid stress (Zuniga, Perez and Gonzalez-Candelas 2002), is under the regulation of NHP6A, llrC, llrA, and hllA.Another important pathway is the LDH (EC 1.1.1.27)under the regulation of NHP6A and llrA, which converts pyruvate and H+ to lactate which is exported outside of cells (Dennis and Kaplan 1960).Chaperons which take part in macromolecule protection and repairing are subjected to regulon llrA.The function of chaperons includes providing protection to against environmental stress, helping protein folding, and repairing damaged proteins, and have been demonstrated to show clear linkage with acid stress in numerous Grampositive bacteria (Frees, Vogensen and Ingmer 2003, Jayaraman, Penders and Burne 1997, Kern, Malki, Abdallah, Tagourti and Richarme 2007, Mujacic and Baneyx 2007).F0/F1ATPase, controlled by llrA and regulon #8, also plays an important role in maintaining normal cellular pH, which pumps H+ out of cells at the expense of ATP (Amachi et al. 1998, Koebmann, Nilsson, Kuipers and Jensen 2000, Lund, Tramonti and De Biase 2014, O'Sullivan and Condon 1999).The GAD (Nomura, Nakajima, Fujita, Kobayashi, Kimoto, Suzuki andAso 1999, Sanders, Leenhouts, Burghoorn, Brands, Venema andKok 1998) and urea degradation (Cotter and Hill 2003) functional modules are missing reliable associations with the regulons in MG1363 while maintaining functions in ASR mechanism in other species.

DISCUSSION
In this study, we designed a computational framework, RECTA, to construct an eightregulons enrolled ASR regulatory network.The framework provides a useful tool and will be a starting point toward a more systems-level understanding of the question (Cao, Wei, et al. 2017).The identified motifs and regulons suggest the acid resistance is a coordinated response regarding regulons, although most of which have not been identified or experimentally verified.From the three well-identified regulons-llrA, llrC, and hllA, it appears the gene regulation is also complex, as these regulons also interact with other proteins and TFs.F0/F1ATPase is directly involved in the concentration regulations of the intracellular proton.Other pathways are responsible for repairing the damage caused by acid stress, such as DNA repair, protein repair, and cell envelops alterations.However, there were also several reported ASR-related genes or transporters such as htrA in Clostridium spp.(Alsaker et al. 2010), CovS/CovR acid response regulator in Streptococcus (Cumley et al. 2012), cyclopropane fatty acid synthase (cfa) for cell-membrane modification (Budin-Verneuil et al. 2005), and oxidative damage protectant genes like sodA, nox-1 and nox-2 (Santi et al. 2009) that failed to map to MG1363.Using more gene expression datasets for CEM and DEG analyses could be a way to strengthen the result of our computational pipeline, which might cover more significant regulons to construct a more solid and complete regulatory network.
Homology mapping at the genomic level showed very long distance in evolution between MG1363 and currently well-annotated model species.Hence, the functional analysis for MG1363 is limited, and it is hard to apply gene functional enrichment to verify our prediction results.With more expression dataset and experiments about protein-protein interactions, the ASR mechanism can be largely improved in L. lactic MG1363.
In summary, we found that the ASR at the transcriptome level in MG1363 is an orchestrated complex network.Functional annotation shows these regulons are involved in many levels of biological processes, including but not limited to DNA expression, transcription, and metabolism.Our method builds a TF-regulons-GRN relationship so that the new ASR-related genes can be predicted.Besides, the low false positive and true negative rate indicate the RECTA pipeline as sensitive and reasonable.In fact, considering the high accuracy, we regarded ccpA as the putative regulon, though not connected to any related functional modules, while more robust proves are required.Such results expand current pathways to those that can corroborate cell structures-cell wall, cell membranes-and related functions.Our findings suggest that acid has profound adverse effects and inflict systems-level response.Such predicted response pathways can inform better resistance design.
Looking forward to the acid tolerance advantage of L. lactis, which makes its prospective application in drug and vaccine delivery, the effects on anti-obesity, and the metagenomic study, the ASR-related GRN in L. lactis shows an excellent research value.Fully understanding of its theory may contribute to the development of Lactococcus therapy, and can even expand to other close species by genetic modification.Furthermore, our computational pipeline provides an effective way to construct reliable GRN based on regulon prediction, integrating CEMs, DEG analysis, motif finding, and comparative genomics study.It has a durable application power and can be effectively applied to other bacterial genomes, where the elucidation of the transcriptional regulation network is needed.Step 1: microarray data was used to generate co-expressed gene clusters and DEG, and MG1363 genome sequence was used to find operons.Step 2: a motif finding progress was carried out to identify all statistically significant motifs in each of the CEMs.Step 3: a regulon finding procedure was designed to identify all the possible regulon candidates encoded in the genome based on motif comparison and clustering.

ADI
Step 4: the motifs of each of these regulons were compared to known TFBSs, and DEG analysis between low pH condition and normal condition was used to figure out the ASR-related regulons.Step 5: regulon validation based on literature information verified the significant putative regulons and expanded the results to some insufficiently significant regulons.Step 6: the ASR-related GRN in MG1363 was predicted and described with eight regulons, nine functional modules, and 33 genes.The combination of the above information forms a genome-scale regulatory network constructed for ASR.
Figure 2. Regulon prediction using RECTA pipeline (red) and validation and enrichment using literature information and gene blast (blue).All processes were shown in rectangles and results were highlighted with corresponding background colors.In the computational pipeline, 51 regulons with assigned motifs and operons were analyzed sequentially through significant TFBS pairing, DEG conformation, and TF blast.Only regulons contained DEGs (ten) and had related mapped TF (eight) were believed to be the final predicted ASR-related regulons (five).These five regulons were then merged into four and using the corresponding TFs to represent their names.In the literature validation process, known ASR-related transporters were first mapped to the MG1363 genome and resulted in 33 genes.Those genes were then searched in 51 regulons and determine six related regulons.All regulons resulted from both computational pipeline and literature validation were combined, along with the information of functional modules, to determine the GRN.RcfB is assumed to be the overall activator for the rest seven regulons and controls the ASR functional module solely.Three kinds of literature verified and significant ASR-related regulons, llrA, llrC and hllA, two insufficient significant regulon llrD (regulon #39) and regulon #8 (llmg_1803) predicted via our workflow but results under 0.8 motif similarity cutoff or could not find hit, and one putative significant regulon NHP6A control the seven functional modules which are experimentally verified in close species MG1363.The other significant regulon ccpA failed to be confirmed by any literature proved genes or transporters.Two extra functional modules, GAD, and urea degradation show no direct connection to all seven the regulons.One or more homology genes are found in MG1363 for all the nine modules using BLAST.The solid arrows indicate regulation between regulons/TFs and functional modules/genes, and the dashed arrows indicate uncertain control processes.Besides, two ovals indicate two trans-membrane proteins, one is confirmed as F0/F1ATPase and the other one with dash line that we still not find its related information in the public-domain literature.

TABLE LEGEND
Table 1．14 significant regulons that are verified and mapped to known TFs.According to analyses, operon numbers and DEG determination (yes or no), matched template TFs and mapped TFs were assigned for each significant regulon, respectively, and were aligned based on regulon ID number.Five regulons containing DEG and having the corresponding TF at the same time were bolded, as which were being computationally verified regulons to be responsible for acid stress in MG1363.

Figure 1
Figure 1 should be placed here.

Figure 2
Figure 2 should be placed here.

Figure 3 .
Figure 3.A working model of the transcriptional gene regulatory network in response to pH change in L. lactis.The mechanism is activated by the change of proton signal

Table 2．Known
ASR-related gene mapping from literature in response to pH change.Literature-supported ASR-related genes found in close species or other L. lactis strains.The template transporters and genes were first identified in published studies from the NCBI and UniProt databases.L. lactis Il1403 was used as the organism which is very close to MG1363 if template gene existed.Only 36 templates that successfully mapped to the MG1363 genome were listed, which resulted in 33 genes.All mapped genes and corresponding templated were organized by their regulated pathways which were further used as functional modules.Mapped genes were searched in 51 regulons to build the connections between functional modules and regulons.