The ability of bacteria and archaea to modulate metabolic process, defensive response, and pathogenic capabilities depend on their repertoire of genes and capacity to regulate the expression of them. Transcription factors (TFs) have fundamental roles in controlling these processes. TFs are proteins dedicated to favor and/or impede the activity of the RNA polymerase. In prokaryotes these proteins have been grouped into families that can be found in most of the different taxonomic divisions. In this work, the association between the expansion patterns of 111 protein regulatory families was systematically evaluated in 1351 non-redundant prokaryotic genomes. This analysis provides insights into the functional and evolutionary constraints imposed on different classes of regulatory factors in bacterial and archaeal organisms. Based on their distribution, we found a relationship between the contents of some TF families and genome size. For example, nine TF families that represent 43.7% of the complete collection of TFs are closely associated with genome size; i.e., in large genomes, members of these families are also abundant, but when a genome is small, such TF family sizes are decreased. In contrast, almost 102 families (56.3% of the collection) do not exhibit or show only a low correlation with the genome size, suggesting that a large proportion of duplication or gene loss events occur independently of the genome size and that various yet-unexplored questions about the evolution of these TF families remain. In addition, we identified a group of families that have a similar distribution pattern across Bacteria
, suggesting common functional and probable coevolution processes, and a group of families universally distributed among all the genomes. Finally, a specific association between the TF families and their additional domains was identified, suggesting that the families sense specific signals or make specific protein-protein contacts to achieve the regulatory roles.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited