Article

From Genomes to Applications: Comparative Analysis of Aeribacillus pallidus Reveals a Thermophilic Chassis for Biotechnology

by Songül Yaşar Yıldız 1 and Nadja Radchenkova 2,*
1 Department of Bioengineering, Faculty of Engineering and Natural Sciences, Istanbul Medeniyet University, 34700 Istanbul, Turkey
2 The “Stephan Angeloff” Institute of Microbiology, Bulgarian Academy of Sciences, 1113 Sofia, Bulgaria
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(20), 10866; https://doi.org/10.3390/app152010866
Submission received: 3 September 2025 / Revised: 2 October 2025 / Accepted: 6 October 2025 / Published: 10 October 2025

Abstract

Thermophilic microorganisms represent an untapped reservoir of thermostable biocatalysts and stress-resilient biomolecules for industrial biotechnology. Aeribacillus pallidus, a Gram-positive moderate thermophile, has attracted attention for its enzymatic versatility and environmental adaptability, yet its genomic potential remains underexplored. Here, we present a comparative genomic analysis of 13 A. pallidus strains to uncover conserved and strain-specific traits relevant to biotechnology. Genomes ranged from 3.24 to 4.98 Mb, with GC content largely conserved (~39%) except for GS3372 (57.4%), indicating possible horizontal gene transfer. All strains encoded complete central metabolic pathways, while carbohydrate-active enzyme profiling revealed abundant glycoside hydrolases and glycosyltransferases, with GS3372 and MHI3390 enriched for lignocellulose-degrading enzymes. Secondary metabolite mining identified diverse biosynthetic gene clusters, including terpenes, sactipeptides, and bacteriocins, with PI8, W-12, and 8m3 exhibiting the greatest biosynthetic diversity. A core set of heat shock and universal stress proteins underscored robust thermotolerance. Phylogenomic and pan-genome analyses revealed high intraspecific diversity and an open pan-genome structure. Collectively, these findings position A. pallidus as a promising thermophilic chassis organism for sustainable applications, including biomass conversion, biofuel production, bioremediation, and the synthesis of heat-stable antimicrobial agents.

1. Introduction

Aeribacillus pallidus was first described in 1987 under the name Bacillus pallidus [1]. Later, in 2004, Banat et al. reclassified the species as Geobacillus pallidus [2]. Finally, in 2010, Miñana-Galbis et al. introduced the new genus Aeribacillus based on 16S rRNA phylogeny and chemotaxonomic differences, and the species name was revised to Aeribacillus pallidus [3]. When the genus Aeribacillus was first established, it included only a single species, A. pallidus. However, over time, additional thermophilic species such as A. composti, A. alveayuensis, and A. kexueae have also been identified and classified within this genus. A. pallidus is a Gram-positive, rod-shaped bacterium that is motile (via peritrichous or lophotrichous flagella) and forms ellipsoidal endospores [4]. As a moderate thermophile, A. pallidus grows in the temperature range of roughly 30–70 °C, with an optimum around 55–60 °C.
The thermophilic and metabolically versatile characteristics of A. pallidus position it as a promising candidate for a wide range of industrial and biotechnological applications. Published studies have demonstrated its potential in diverse fields, including the production of thermostable enzymes and the bioremediation of industrial wastewaters, highlighting its relevance in sustainable and high-temperature bioprocesses.
The thermostable enzymes produced by A. pallidus are of significant industrial relevance due to their ability to function under harsh process conditions, such as elevated temperatures and extreme pH levels. For instance, the pectate lyase secreted by the TD1 strain has been identified as a potential biocatalyst in the degradation of pectin, with applications in fruit juice clarification, textile processing, and the paper industry [4]. Similarly, amylases derived from A. pallidus strains exhibit high-temperature activity, making them suitable for starch conversion processes in starch-based sweetener production and various food industry applications.
An alkaline protease isolated from A. pallidus strain C10 has been evaluated for its utility in textile processing (e.g., wool treatment) and as an additive in laundry detergent formulations [5]. This protease demonstrated optimal activity at around 60 °C and pH 9, aligning well with the operating conditions of many commercial detergents. Experimental results indicated its effectiveness in removing protein-based stains and its stability within detergent formulations, underscoring its commercial potential [5,6].
Lipases from A. pallidus have also attracted attention for their thermostable properties. These enzymes are valuable in biodiesel production through transesterification of fats and oils, as well as in the food industry for applications such as cheese ripening and the synthesis of flavor compounds [7]. Additionally, ongoing research investigates the ability of A. pallidus strains to produce lignocellulose-degrading enzymes such as xylanases and cellulases. Preliminary findings suggest that this bacterium possesses the enzymatic machinery necessary to contribute to composting processes and biomass degradation [8].
Certain strains of A. pallidus have been investigated for their potential role in the bioremediation of petroleum and fuel products, particularly for the biological removal of undesirable sulfur- and nitrogen-containing compounds (biodesulfurization and biodenitrogenation). Notably, genomic analysis of the A. pallidus W-12 strain, which was isolated from a petroleum reservoir, revealed the presence of gene clusters associated with the degradation of organosulfur compounds such as dibenzothiophene and nitrogenous aromatic compounds [9]. Studies on this strain suggest that it may enable efficient biodesulfurization of petroleum derivatives at elevated temperatures, particularly around 60 °C. Given that conventional hydrodesulfurization methods are energy-intensive and costly, the use of thermophilic microorganisms like A. pallidus offers a promising, eco-friendly, and economically viable alternative. The W-12 strain, in particular, holds potential for application in in situ biotreatment of oil wells or for sulfur removal from refinery waste streams, including gas and wastewater treatment systems.
Thermostable bacteriocins produced by A. pallidus, such as pallidocyclin, hold promising potential as antimicrobial agents in both the food industry and healthcare applications. Due to its high thermal stability, pallidocyclin can be incorporated as a natural preservative in heat-processed or pasteurized foods to inhibit the growth of spore-forming bacteria responsible for spoilage. For instance, it may help extend the shelf life of dairy products by suppressing heat-resistant spores of Geobacillus and Bacillus species that can survive pasteurization [10]. In a study by Lücking et al. (2024), A. pallidus strains isolated from milk and cocoa samples were found to harbor gene clusters resembling those responsible for the production of known circular and linear bacteriocins, such as amylocyclin and lacticin Z [11]. This genomic evidence suggests that A. pallidus may also produce yet-undiscovered antimicrobial peptides. Given the growing interest in bacteriocins as natural food preservatives and potential alternatives to conventional antibiotics, A. pallidus-derived antimicrobial compounds are considered promising candidates for future biotechnological exploration and development.
While the previously mentioned studies highlight several promising uses of A. pallidus, they likely represent only a fraction of the organism’s full biotechnological potential. As new strains are isolated from diverse environments, novel enzymes and metabolites adapted to unique conditions are continually being identified. These include enzymes for use in detergent formulations, food processing, and environmental biotechnology (e.g., wastewater treatment and bioremediation), as well as naturally occurring antimicrobial compounds for food safety applications. The growing interest in A. pallidus is reflected in the increasing number of publications in recent years, underscoring its emerging relevance in practical applications.
The study of thermophilic microorganisms such as A. pallidus not only expands our understanding of life in extreme environments but also presents unique opportunities for the development of high-performance biocatalysts and biomolecules tailored for industrial use. Genomic approaches offer powerful tools to explore and exploit this potential. Through whole-genome sequencing and annotation, researchers can uncover genes encoding thermostable enzymes, metabolic pathways for valuable biomolecule synthesis, and traits linked to environmental resilience [12,13,14].
Such genomic insights are instrumental in guiding downstream applications, including the identification of novel enzymes with specific catalytic properties that are applicable in sectors such as pharmaceuticals, bioenergy, and green chemistry [15]. For instance, enzymes capable of functioning at high temperatures are of significant interest due to their stability, efficiency, and reduced contamination risks, especially in industries like textile manufacturing, food processing, and biodiesel production.
Moreover, the genetic blueprint revealed through genome analysis allows researchers to map out entire metabolic pathways responsible for the biosynthesis of antibiotics, pigments, biosurfactants, biofuels, and bioplastics. This information is crucial for metabolic engineering and the development of optimized microbial cell factories. In particular, genomic information supports the identification and enhancement of bioconversion processes, making them more efficient and cost-effective—thus promoting sustainable industrial practices [16].
To our knowledge, this is the first comprehensive comparative genomic study of Aeribacillus pallidus, covering 13 strains isolated from diverse environments. Previous studies have primarily focused on single strains or on enzymatic characterizations [4,5,6,7,9], but no work has systematically explored the full genomic repertoire of the species. By integrating subsystem annotation, CAZyme profiling, secondary metabolite mining, stress protein analysis, and pan-genome reconstruction, this study provides the most comprehensive view to date of the biotechnological potential of A. pallidus. These results strengthen the case for A. pallidus as a candidate chassis organism for industrial biotechnology and lay the foundation for future applied research.

2. Materials and Methods

2.1. Whole Genome Sequence and Accession Numbers of Nucleotide Sequence

The complete genomic sequences of 13 Aeribacillus pallidus strains were retrieved from the NCBI database in FASTA format and subjected to stringent quality assessment to ensure data reliability for subsequent analyses. The whole-genome shotgun (WGS) assemblies for all strains are publicly available in the DDBJ/EMBL/GenBank repositories, with their respective accession numbers provided in Table 1. This approach ensured that the most current and comprehensive genomic data were utilized in the study, enabling robust comparative and functional genomic evaluations.

2.2. Genome Annotation and Functional Categorization of A. pallidus Strains

The genome annotation and gene prediction of A. pallidus strains were performed using a multi-platform strategy integrating several automated annotation tools. The genomic assemblies were first uploaded to the RAST server (Rapid Annotations using Subsystems Technology; http://rast.nmpdr.org/), which provided automated predictions for coding sequences (CDSs), rRNA and tRNA genes, as well as functional subsystem assignments [17]. To validate and enhance the accuracy of these annotations, results were compared with those obtained from the Bacterial and Viral Bioinformatics Resource Center (BV-BRC; https://www.bv-brc.org/) [18].
Further manual verification of key protein-coding genes—particularly those associated with essential metabolic pathways—was conducted via BLASTp searches (https://blast.ncbi.nlm.nih.gov/Blast.cgi (access date: 14 July 2025)), referencing well-curated protein databases such as UniProt (http://www.uniprot.org/) and NCBI RefSeq (https://www.ncbi.nlm.nih.gov/refseq/ (access date: 16 July 2025)). These comparisons supported the refinement of gene function predictions and helped ensure annotation reliability.
To better characterize enzymatic functions and metabolic capabilities, gene products were also analyzed through the Kyoto Encyclopedia of Genes and Genomes (KEGG; http://www.genome.jp/kegg/ (access date: 17 July 2025)) and ExPASy (http://www.expasy.org/ (access date: 17 July 2025)) platforms. This enabled the mapping of enzyme-coding genes to known biochemical pathways.
In parallel, orthology-based functional classification was conducted using the eggNOG database (evolutionary genealogy of genes: Non-supervised Orthologous Groups), which clusters protein-coding genes into orthologous groups (OGs) based on shared ancestry and conserved function [19]. Functional categories were assigned according to COG (Clusters of Orthologous Groups) designations, enriched with Gene Ontology (GO) terms, KEGG pathways, and protein domain data from SMART and PFAM.
To visualize and explore these annotations in a genome context, the MicroScope Microbial Genome Annotation & Analysis Platform (MaGe; https://mage.genoscope.cns.fr/microscope/home/index.php (access date: 21 July 2025)) was employed [20]. This platform facilitated comparative analyses and COG-based functional categorization across all 13 A. pallidus strains analyzed in the study, enabling comprehensive profiling of their genomic architecture and biotechnological potential.

2.3. Prediction of CAZymes and Secondary Metabolite Biosynthetic Gene Clusters

To explore the carbohydrate-active enzyme (CAZyme) repertoire of A. pallidus strains, genome sequences were analyzed using the dbCAN3 meta server [21]. This platform integrates multiple bioinformatics tools to ensure accurate functional annotation. Specifically, CAZyme predictions were made using: (i) HMMER searches against the dbCAN CAZy domain HMM library; (ii) DIAMOND-based similarity comparisons against the curated CAZy database; (iii) HMMER searches against the dbCAN substrate-specific HMM profiles. A gene was retained as a confidently predicted CAZyme only if it was supported by at least two out of the three analytical methods, following the recommended confidence threshold.
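The two-out-of-three retention rule described above can be illustrated with a minimal sketch. The tool labels and gene identifiers below are hypothetical placeholders, not output from dbCAN3, and the function is only an illustration of the voting logic rather than the server's own implementation.

```python
from typing import Dict, Set


def confident_cazymes(hits_by_tool: Dict[str, Set[str]], min_support: int = 2) -> Set[str]:
    """Return gene IDs that are reported by at least `min_support` tools."""
    support: Dict[str, int] = {}
    for genes in hits_by_tool.values():
        for gene in genes:
            support[gene] = support.get(gene, 0) + 1
    return {gene for gene, count in support.items() if count >= min_support}


# Hypothetical per-tool hit sets (gene IDs are made up for illustration).
hits = {
    "hmmer_dbcan": {"gene_001", "gene_002", "gene_003"},
    "diamond_cazy": {"gene_002", "gene_003", "gene_004"},
    "hmmer_dbcan_sub": {"gene_003", "gene_005"},
}

retained = confident_cazymes(hits)
print(sorted(retained))
```

In this toy input, only the genes found by two or three of the methods are kept; genes supported by a single method are discarded.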
In addition to CAZyme profiling, secondary metabolite biosynthetic potential was assessed using the antiSMASH pipeline (Antibiotics and Secondary Metabolite Analysis Shell) [22]. This tool enables the detection and annotation of biosynthetic gene clusters (BGCs) responsible for the synthesis of a broad spectrum of secondary metabolites. The identified clusters were then cross-referenced with entries in the antiSMASH reference database to determine putative metabolite classes and to evaluate their similarity to known bioactive compounds.
This integrative approach provided insights into the enzymatic capabilities of A. pallidus strains related to polysaccharide degradation as well as their potential for producing industrially and pharmaceutically relevant secondary metabolites.

2.4. Whole-Genome Phylogenomic Clustering of Aeribacillus Strains

To elucidate the genomic relationships within the Aeribacillus genus, a phylogenomic analysis was performed based on whole-genome sequences of 33 strains, encompassing both type and environmental isolates. The dataset included 13 Aeribacillus pallidus strains and 20 additional strains that cluster within or close to the Aeribacillus genus, including those previously classified as Aeribacillus composti, Aeribacillus alveayuensis, Aeribacillus kexueae and Aeribacillus sp. Genomic similarity-based clustering was implemented through Mash distance estimation, which utilizes MinHash sketches to rapidly approximate genome-wide mutation distances and genome similarity levels. These Mash distances are strongly correlated with Average Nucleotide Identity (ANI), a widely accepted metric for quantifying sequence-level homology across microbial genomes, particularly within species boundaries [23].
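As a rough illustration of how a MinHash Jaccard estimate is converted into a Mash distance, the sketch below applies the published Mash formula, D = -(1/k) ln(2j / (1 + j)), and the common approximation ANI ≈ 1 - D. The function names and the example Jaccard value are assumptions made for illustration; the default k-mer size of 18 follows the settings reported for this analysis.

```python
import math


def mash_distance(jaccard: float, k: int = 18) -> float:
    """Mash distance from a Jaccard estimate j: D = -(1/k) * ln(2j / (1 + j))."""
    if not 0.0 <= jaccard <= 1.0:
        raise ValueError("Jaccard index must lie in [0, 1]")
    if jaccard == 0.0:
        return 1.0  # no shared k-mers: distance is capped at 1
    if jaccard == 1.0:
        return 0.0  # identical sketches
    return -math.log(2.0 * jaccard / (1.0 + jaccard)) / k


def approximate_ani(distance: float) -> float:
    """Approximate ANI (as a fraction) from a Mash distance, assuming ANI ~ 1 - D."""
    return 1.0 - distance


# Illustrative value only: half of the compared sketch hashes are shared.
example_distance = mash_distance(0.5)
print(round(example_distance, 4), round(100 * approximate_ani(example_distance), 2))
```

The approximation ANI ≈ 1 - D is most reliable for closely related genomes, which is the regime relevant to species-level comparisons.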
Pairwise Mash distances were computed and assembled into a distance matrix, which served as the input for constructing a phylogenetic tree with the neighbor-joining algorithm. This method iteratively joins the pair of nodes that minimizes the estimated total branch length, yielding an unrooted tree; negative branch lengths are set to zero and the difference is transferred to the adjacent branch to maintain overall accuracy [24].
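A single joining step of the textbook neighbor-joining procedure can be sketched as follows. This is a generic illustration, not the platform's implementation: the function name is hypothetical, the five-taxon matrix is a standard toy example rather than data from this study, and the handling of negative branch lengths follows the convention described above.

```python
from typing import List, Tuple


def nj_join_step(names: List[str], d: List[List[float]]) -> Tuple[str, str, float, float]:
    """Pick the pair to join and return (name_i, name_j, branch_i, branch_j)."""
    n = len(names)
    if n < 3:
        raise ValueError("neighbor joining needs at least three taxa")
    totals = [sum(row) for row in d]
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            # Q criterion: the pair with the smallest value is joined.
            q = (n - 2) * d[i][j] - totals[i] - totals[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    _, i, j = best
    branch_i = 0.5 * d[i][j] + (totals[i] - totals[j]) / (2 * (n - 2))
    branch_j = d[i][j] - branch_i
    # A negative branch is set to zero and the whole distance goes to the other branch.
    if branch_i < 0:
        branch_i, branch_j = 0.0, d[i][j]
    elif branch_j < 0:
        branch_i, branch_j = d[i][j], 0.0
    return names[i], names[j], branch_i, branch_j


# Toy symmetric distance matrix for five taxa (illustrative values only).
taxa = ["a", "b", "c", "d", "e"]
matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(nj_join_step(taxa, matrix))
```

A full tree is obtained by replacing the joined pair with a new node, recomputing distances to it, and repeating until three nodes remain.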
To delineate genome clusters corresponding to putative species, MicroScope Genome Clusters (MICGCs) were formed using a graph-based community detection strategy. Genomes were represented as nodes in a network, with edges denoting Mash-based similarity. Edges corresponding to ANI values below 94% were removed, effectively separating genomes into species-level clusters. Only genomes meeting stringent quality thresholds—≥90% completeness and ≤5% contamination, as assessed by CheckM—were retained for downstream clustering [25].
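The quality filter and the 94% ANI edge cut-off can be illustrated with the sketch below. Two simplifications are assumed and should be read as such: ANI is approximated as 1 minus the Mash distance, and clusters are taken as connected components of the thresholded graph, which is a simpler stand-in for the Louvain community detection used in the study. Genome names and values are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def passes_quality(completeness: float, contamination: float) -> bool:
    """Thresholds stated in the text: >= 90% completeness and <= 5% contamination."""
    return completeness >= 90.0 and contamination <= 5.0


def species_clusters(
    quality: Dict[str, Tuple[float, float]],
    distances: Dict[Tuple[str, str], float],
    min_ani: float = 0.94,
) -> List[Set[str]]:
    """Group genomes by connected components after dropping edges below `min_ani`."""
    kept = {g for g, (comp, cont) in quality.items() if passes_quality(comp, cont)}
    neighbors: Dict[str, Set[str]] = {g: set() for g in kept}
    for (a, b), dist in distances.items():
        if a in kept and b in kept and (1.0 - dist) >= min_ani:
            neighbors[a].add(b)
            neighbors[b].add(a)
    clusters: List[Set[str]] = []
    seen: Set[str] = set()
    for start in sorted(kept):
        if start in seen:
            continue
        component: Set[str] = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


# Hypothetical genomes: (completeness %, contamination %) and pairwise Mash distances.
quality = {"A": (99.0, 1.0), "B": (98.0, 2.0), "C": (95.0, 0.5), "D": (85.0, 1.0)}
distances = {("A", "B"): 0.02, ("A", "C"): 0.20, ("B", "C"): 0.19, ("A", "D"): 0.01}
print(species_clusters(quality, distances))
```

Genome D is excluded before clustering because it fails the completeness threshold, even though it is very close to A.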
Community detection was performed using the Louvain modularity optimization algorithm [26]. Parameter settings were carefully chosen for optimal sensitivity and specificity: a k-mer size of 18, sketch size of 5000, and resolution value of 2. These parameters enabled accurate identification of genome communities reflecting biologically meaningful phylogenetic groupings, supporting both species-level resolution and reconstruction of evolutionary relationships across the Aeribacillus genus.

2.5. Pan-Genome and Core-Genome Analysis

To investigate the genomic diversity and shared gene content within the Aeribacillus genus, a pan/core genome analysis was performed by comparing the whole-genome sequences (WGS) of A. pallidus strains with those of other available Aeribacillus species. This analysis was conducted using the MicroScope Microbial Genome Annotation and Analysis Platform (https://www.genoscope.cns.fr/agc/microscope/home/index.php (access date: 4 August 2025)), which offers an integrated framework for genome comparison, annotation, and visualization.
The platform enabled functional annotation of protein-coding genes across all selected genomes, allowing for detailed interspecies comparisons. For homolog identification, relaxed parameters were applied to ensure comprehensive detection: a minimum of 50% amino acid identity and at least 80% alignment coverage were required between sequences. These permissive thresholds facilitated the identification of both conserved core genes and variable gene sets across the genomes, enabling robust delineation of shared and strain-specific functions [27].
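The homolog criteria and the resulting core/accessory split can be sketched as follows. The function names and gene family labels are hypothetical, and one assumption is made explicit: alignment coverage is computed relative to the longer of the two sequences, since the text does not state which sequence length is used as the denominator.

```python
from typing import Dict, Set, Tuple


def is_homolog(
    identity_pct: float,
    aligned_length: int,
    length_a: int,
    length_b: int,
    min_identity: float = 50.0,
    min_coverage: float = 80.0,
) -> bool:
    """Apply the stated thresholds: >= 50% identity and >= 80% alignment coverage."""
    # Assumption: coverage is measured against the longer sequence.
    coverage_pct = 100.0 * aligned_length / max(length_a, length_b)
    return identity_pct >= min_identity and coverage_pct >= min_coverage


def pan_core(families_by_genome: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (pan, core, accessory) gene family sets across all genomes."""
    family_sets = list(families_by_genome.values())
    if not family_sets:
        return set(), set(), set()
    pan = set().union(*family_sets)
    core = set.intersection(*family_sets)
    return pan, core, pan - core


# Hypothetical family assignments for three genomes.
genomes = {
    "strain_1": {"groEL", "dnaK", "fam_X"},
    "strain_2": {"groEL", "dnaK"},
    "strain_3": {"groEL", "dnaK", "fam_Y"},
}
pan, core, accessory = pan_core(genomes)
print(len(pan), sorted(core), sorted(accessory))
```

Families present in every genome form the core, and the remainder of the pan-genome is the variable (accessory plus strain-specific) fraction.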
This comparative approach provided insights into the evolutionary relationships, functional overlap, and adaptation strategies of Aeribacillus species. Subsystem-level gene comparisons for A. pallidus strains were further refined using annotations derived from the RAST server, allowing the identification of key genomic features and functional elements that may underlie ecological versatility or biotechnological relevance.

3. Results

3.1. General Genomic Properties of A. pallidus Strains

The comparative genomic analysis of 13 A. pallidus strains revealed notable variation in genome architecture despite overall taxonomic consistency. Genome sizes ranged from 3.24 Mb (A. pallidus NRS-2058) to 4.98 Mb (A. pallidus GS3372), with most strains clustering around 3.8–4.0 Mb. GC content was highly conserved across the majority of strains (38.7–39.3%), except for A. pallidus GS3372, which exhibited an unusually high GC content of 57.4%, suggesting possible horizontal gene transfer events or misclassification (Table 2).
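For readers who wish to reproduce this kind of check, GC content and a simple outlier flag can be computed as in the sketch below. The 5-percentage-point deviation cut-off is an arbitrary assumption for illustration, and the input dictionary uses only the range endpoints and the GS3372 value quoted above, not the full per-strain data from Table 2.

```python
from statistics import median
from typing import Dict, List


def gc_content(sequence: str) -> float:
    """GC percentage over unambiguous bases (A, C, G, T); other symbols are ignored."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / (gc + at)


def gc_outliers(gc_by_genome: Dict[str, float], max_deviation: float = 5.0) -> List[str]:
    """Names whose GC% differs from the median by more than `max_deviation` points."""
    centre = median(gc_by_genome.values())
    return sorted(
        name for name, gc in gc_by_genome.items() if abs(gc - centre) > max_deviation
    )


# Values quoted in the text; the first two keys are labels for the range endpoints.
reported = {"conserved_low": 38.7, "conserved_high": 39.3, "GS3372": 57.4}
flagged = gc_outliers(reported)
print(gc_content("ATGCGCNNAT"), flagged)
```

A flag of this kind only indicates that a genome deserves closer inspection; it does not by itself distinguish horizontal gene transfer from misclassification.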
Assembly quality metrics showed variability among strains. The number of contigs with predicted protein-encoding genes (PEGs) ranged from as few as 1 (in complete genomes such as PI8, BK1, and KCTC3564) to over 200 in draft genomes like TD1 and SJP27. N50 values spanned from 39,514 bp (MHI3391) to 281,598 bp (MHI3390), while L50 values ranged from 1 to 27, indicating differing assembly continuity.
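The N50 and L50 statistics referred to above follow standard definitions and can be computed as in this minimal sketch. The contig lengths are invented for illustration and do not correspond to any assembly in this study.

```python
from typing import List, Tuple


def n50_l50(contig_lengths: List[int]) -> Tuple[int, int]:
    """Return (N50, L50): the contig length and contig count at which the
    cumulative length of the largest contigs reaches half the assembly size."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig is required")
    half_total = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise AssertionError("unreachable: cumulative sum always reaches half the total")


# Made-up draft assembly and a single-contig (complete) assembly.
draft = [100, 80, 50, 30, 20, 10, 10]
complete = [3_500_000]
print(n50_l50(draft), n50_l50(complete))
```

A complete single-contig genome has an L50 of 1 and an N50 equal to the genome length, which is why the complete assemblies sit at the low end of the L50 range.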
The number of annotated subsystems, as predicted by RAST, was relatively consistent (ranging from 284 to 307), highlighting a conserved functional gene set. The total number of predicted coding sequences (CDSs) varied from 3481 (NRS-2058) to 5592 (GS3372), again emphasizing the genomic outlier status of the latter. RNA gene counts varied moderately, from 37 to 105, with the highest numbers found in complete genome assemblies (Supplementary Material S1, Tables S1–S13).
Collectively, these genomic features indicate that while A. pallidus maintains a largely conserved genomic backbone, certain strains—most notably GS3372—display significant deviations that merit further phylogenetic and functional scrutiny.
A comprehensive comparative analysis of the complete genome annotations for all 13 A. pallidus strains was performed to assess the distribution of protein-encoding genes (PEGs), RNA elements, and CRISPR loci. For each annotated feature, its presence or absence was systematically determined across the strains, enabling the identification of both conserved and strain-specific elements. The resulting dataset is presented as a detailed supplementary table, in which each row corresponds to a unique annotated feature and each column indicates its occurrence in a given strain (Supplementary Material S2, Table S14). This comparative matrix provides a clear overview of the genetic repertoire of the strains, highlighting potential functional diversity, strain-specific adaptations, and conserved genomic elements that may contribute to the ecological versatility and biotechnological potential of A. pallidus.
Moreover, genome annotation revealed that all A. pallidus strains possess complete gene sets for key central metabolic pathways, including glycolysis, gluconeogenesis, the pentose phosphate pathway, and the tricarboxylic acid (TCA) cycle. These pathways collectively enable efficient carbohydrate catabolism, energy generation, and the provision of biosynthetic precursors essential for cellular growth. Furthermore, with the exception of strains NRS-2058 and GS3372, the genomes encode the full complement of enzymes required for ethanol fermentation, indicating the potential to metabolize sugars into ethanol under anaerobic conditions. Overall, these findings suggest that A. pallidus has the genetic capacity to carry out a broad range of fundamental metabolic processes necessary for survival and adaptation in diverse environments.

3.2. Comparative Functional Annotation and COG Categories of A. pallidus Strains

The comprehensive subsystem-based annotation of 13 A. pallidus strains revealed both conserved and divergent functional capacities across the genomes. Core metabolic functions, such as protein biosynthesis, transcription, RNA processing, and central carbohydrate metabolism, are highly conserved, underscoring their essential roles in basic cellular processes. All strains carried substantial gene complements for these pathways; protein biosynthesis genes, for example, ranged from 114 to 155 across the dataset, indicating consistent translational potential among strains (Figure 1).
Conversely, notable variations were observed in subsystems related to stress response, secondary metabolism, and iron acquisition, indicating adaptive divergence among strains. For instance, the gene count for resistance to antibiotics and toxic compounds ranged from 10 to 22, with strains like GS3372 and 8m3 showing higher resistance-related gene counts. This may reflect enhanced environmental resilience in these strains, potentially due to habitat-specific selective pressures.
Subsystems linked to secondary metabolism were only present in a few strains (e.g., NRS-2058, MHI3390, GS3372), suggesting limited but specialized biosynthetic capacities. Similarly, iron acquisition and siderophore biosynthesis genes were absent in certain strains (e.g., NRS-2058) but highly represented in others (e.g., W-12, BK1), pointing to niche-specific metal scavenging strategies.
Some subsystem categories (e.g., “Cell Wall and Capsule”, “Carbohydrates”, “DNA Metabolism”) are further diversified by subcategories that vary significantly among strains, which may contribute to phenotypic plasticity. Additionally, stress-related pathways such as oxidative stress, detoxification, and osmotic stress showed moderately conserved yet strain-specific enrichment, indicating distinct stress adaptation strategies.
In summary, while A. pallidus strains maintain a shared genomic foundation supporting core life functions, they also exhibit notable variation in environmental adaptation traits. These differences highlight the potential of specific strains (e.g., GS3372, W-12, MHI3390) for biotechnological applications requiring robust stress tolerance, metabolic flexibility, or secondary metabolite production.
A functional categorization of protein-coding genes (CDSs) based on COG (Clusters of Orthologous Groups) classification was performed across 13 A. pallidus strains (Table 3). The analysis revealed a broadly conserved genomic architecture, particularly in core cellular processes. Among all COG categories, amino acid transport and metabolism, general function prediction only, and transcription were consistently among the most populated, indicating that amino acid biosynthesis, regulatory control, and metabolic flexibility are central to A. pallidus biology.
The amino acid transport and metabolism category exhibited the highest number of CDSs across nearly all strains, ranging from 145 in SJP27 to 224 in GS3372. This highlights the adaptive importance of amino acid utilization pathways in diverse environmental contexts. Similarly, genes related to posttranslational modification, protein turnover, chaperones and energy production and conversion also showed high representation, supporting the organism’s robust metabolic and stress-handling capacities.
Strain-specific variations were notable in certain categories. For example, GS3372 displayed a particularly high number of genes involved in carbohydrate transport and metabolism and cell wall/membrane/envelope biogenesis, suggesting possible niche-specific adaptations such as cell surface remodeling or enhanced carbohydrate processing capability. On the other hand, some strains like W-12 and TD1 showed relatively balanced distributions across categories, indicating generalist genomic profiles.
Functional categories related to defense mechanisms, cell motility, and secondary metabolite biosynthesis were less populated, suggesting these traits may be less central or more strain-specific in A. pallidus. Despite this, the presence of even low-copy-number genes in categories like signal transduction mechanisms and inorganic ion transport points to environmental sensing and regulatory complexity.
In conclusion, the COG-based functional profiling reveals that while A. pallidus strains share a common genomic backbone with enriched core metabolic functions, there are distinguishable differences in categories tied to environmental adaptation. These findings provide a basis for selecting specific strains for applications requiring enhanced metabolic or stress-resistance capabilities.

3.3. Thermal Stress Response

Thermophilic bacteria possess a diverse set of stress response proteins that are essential for maintaining protein homeostasis and cellular function under extreme conditions such as high temperature, oxidative stress, pH fluctuations, and nutrient limitation. Among these, universal stress proteins (USPs) and heat shock proteins (HSPs) play pivotal roles by acting as molecular chaperones or proteases, enabling adaptation, survival, and colonization [28,29].
The comparative genomic analysis of 13 A. pallidus strains revealed a conserved presence of key stress-related proteins, while a few showed strain-specific distributions, indicating possible functional specialization or differential stress adaptation strategies (Table 4).
Both GroEL and GroES were present in all 13 strains, indicating their essential and universal function in protein folding under thermal stress. GroEL (60 kDa) and GroES (10 kDa) form a chaperonin complex that facilitates proper protein folding, particularly under heat stress, regulated by sigma factor σ32 [30,31].
Similarly, DnaK, DnaJ, and GrpE were consistently detected in all strains, highlighting the conserved nature of this chaperone machinery. This complex assists in preventing protein aggregation and facilitates protein refolding under both heat shock and oxidative conditions [32].
The HtpG protein, a bacterial homolog of Hsp90, was found in all strains, reinforcing its importance in maintaining protein homeostasis under thermal stress [33].
ClpB and ClpC were universally detected, underlining their role in disaggregating stress-denatured proteins and cooperating with other chaperones like DnaK [34,35]. ClpP and ClpX were also broadly conserved, being absent from only a few strains, consistent with their central role in ATP-dependent proteolysis of misfolded proteins [35,36,37]. ClpE and MecA, though present in many strains, showed variable distribution, suggesting strain-specific regulation or compensatory mechanisms.
HslU and HslV were present in the majority of strains, supporting their role in proteolytic degradation under heat stress to prevent toxic protein aggregates [38,39]. The co-occurrence of HslO in some strains points to additional layers of regulatory or structural stabilization.
The DegP/HtrA protease, which is crucial for degrading misfolded periplasmic proteins during heat and envelope stress, was found in all strains. This confirms its significance in bacterial envelope integrity under thermal fluctuations [40]. Both HflC and HflK were consistently present, indicating their involvement in modulating FtsH protease activity and maintaining membrane protein quality during stress [41].
The Lon protease, involved in degrading regulatory and damaged proteins under various stresses (including antibiotic and heat stress), was detected in all strains, reinforcing its role in protein quality control and cellular fitness [42]. The presence of sHSPs across the majority of strains suggests an auxiliary role in protecting against protein aggregation by functioning as holdases during acute stress exposure.
This distribution pattern strongly supports the hypothesis that A. pallidus strains rely on a core set of stress response elements for thermal adaptation, while accessory chaperones and proteases may contribute to niche-specific fitness and adaptive plasticity. These findings not only underscore the ecological resilience of A. pallidus but also highlight potential targets for industrial applications where thermal stability is a desired trait.

3.4. CAZyme and Secondary Metabolite Identification

The CAZyme profile among the analyzed A. pallidus strains exhibits both diversity and conservation across different enzyme families (Table 5). Several auxiliary activity (AA) families, particularly AA4 and AA6, are widely represented and conserved, occurring in nearly all strains with relatively high copy numbers. In contrast, AA1 and AA3 are more sporadically distributed, with AA1 present in approximately half of the strains and AA3 mostly restricted to only a few.
Carbohydrate-binding modules (CBMs), especially CBM50, are notably abundant and consistently detected across all strains, suggesting a potentially conserved role in carbohydrate recognition or cell wall remodeling. Other CBMs like CBM20, CBM34, CBM48, and CBM96 appear in very few strains, indicating more specialized or strain-specific functions.
Glycoside hydrolases (GHs), particularly GH18, GH13, GH23, and GH188, show high abundance and widespread presence among the strains. GH188, for instance, is one of the most frequently encountered CAZyme families, with copy numbers ranging from 3 to 7 per strain. This suggests a key role in the degradation of complex polysaccharides such as chitin or peptidoglycan-like substrates. Similarly, members of GH18 and GH4 are prevalent and may contribute to robust carbohydrate-degrading capabilities under thermophilic conditions.
Glycosyltransferases (GTs), especially GT2, GT4, GT51, and GT119, are also among the most represented families. Notably, GT4 and GT119 are found in nearly all strains, with GT4 reaching copy numbers as high as 8, implying a significant role in the biosynthesis of glycan structures such as exopolysaccharides (EPS), cell wall polymers, or glycoproteins.
Strain GS3372 stands out with the highest overall number of CAZyme families, including a particularly elevated count of CE4 (Carbohydrate Esterase Family 4, n = 10), GH188, and various GTs. This suggests that GS3372 may possess enhanced capabilities for complex carbohydrate deconstruction and could be a promising candidate for biotechnological exploitation in biomass conversion or biocatalysis.
In contrast, strains like PI8 and 8 show a more moderate CAZyme profile, suggesting narrower functional specialization or reduced adaptability to a broad spectrum of carbohydrate substrates. However, these strains still maintain core GH and GT families, indicating that they are not entirely devoid of carbohydrate metabolism potential.
Overall, the CAZyme profiles of A. pallidus strains reflect a balance between conserved core functionalities and strain-specific adaptations. The abundance of GH and GT families suggests that these strains harbor significant potential for biotechnological applications such as thermostable glycosidase production, polysaccharide modification, or biofuel precursor generation. Particularly, strains like GS3372 and MHI3390 could be prioritized for future functional characterization and industrial enzyme development due to their broad and enriched CAZyme profiles.
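The distinction drawn above between conserved, sporadic, and strain-restricted CAZyme families can be expressed as a simple prevalence rule. The following is a minimal sketch, not the procedure used in this study: the family names (`FAM_A` etc.), strain labels, copy numbers, and both cut-offs (`conserved_at`, `rare_at`) are hypothetical and are not taken from Table 5.

```python
def classify_families(counts, n_strains, conserved_at=0.9, rare_at=0.25):
    """Label each enzyme family by the fraction of strains that carry it.

    counts: dict mapping family -> dict mapping strain -> copy number.
    The two cut-offs are illustrative assumptions, not published thresholds.
    """
    result = {}
    for family, per_strain in counts.items():
        # A family counts as present in a strain if it has at least one copy.
        present = sum(1 for copies in per_strain.values() if copies > 0)
        prevalence = present / n_strains
        if prevalence >= conserved_at:
            label = "conserved"
        elif prevalence <= rare_at:
            label = "strain-specific"
        else:
            label = "sporadic"
        result[family] = (label, prevalence)
    return result


# Hypothetical toy counts for four strains (not data from this study).
toy_counts = {
    "FAM_A": {"s1": 3, "s2": 4, "s3": 5, "s4": 3},
    "FAM_B": {"s1": 1, "s2": 0, "s3": 1, "s4": 0},
    "FAM_C": {"s1": 0, "s2": 0, "s3": 0, "s4": 2},
}
labels = classify_families(toy_counts, n_strains=4)
```

Applied to a real dbCAN-style count table, the same rule would separate families such as those found in nearly all strains from those restricted to a few; the appropriate cut-offs would need to be chosen for the dataset at hand.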
Genome mining revealed the presence of diverse biosynthetic gene clusters (BGCs) associated with secondary metabolite production across the 13 A. pallidus strains (Figure 2). The comparative analysis indicated substantial inter-strain variability in the type and confidence level of predicted clusters. While all strains harbored terpene and terpene-precursor biosynthetic elements, the confidence levels varied widely. Strains such as W-12, PI8, 8m3, and 8 consistently exhibited high-confidence predictions (red) for NRPS, NRP-metallophore, and terpenoid clusters, suggesting a higher secondary metabolic potential compared to others.
Sactipeptides were the most broadly distributed metabolite class, with medium- to high-confidence predictions in the majority of strains. In contrast, glycocin and CDPS clusters were rarely detected and reached high confidence only in a few strains, notably NRS-1637 and 8. This indicates that certain strains may possess niche or strain-specific secondary metabolite capabilities that could be of functional or industrial interest.
Interestingly, W-12 and PI8 stood out as the most biosynthetically diverse, harboring multiple cluster types with high confidence, including betalactones, RiPP-like peptides, and NRP-derived metabolites. On the other hand, strains like NRS-2058 and SJP27 showed a lower overall number of confidently predicted clusters, with many of their BGC predictions falling into the “not significant” (gray) category. These differences likely reflect genomic divergence and ecological adaptation within the A. pallidus lineage.
Overall, the results suggest that certain A. pallidus strains, especially PI8, W-12, and 8m3, may serve as promising candidates for further exploration of novel bioactive compounds, including antimicrobial peptides and specialized terpenoids. These strains could be prioritized for downstream experimental validation and compound isolation studies.

3.5. Phylogenomic Clustering Reveals Genetic Relationships Among Aeribacillus Strains

A comprehensive phylogenomic tree was constructed using whole-genome sequences of Aeribacillus pallidus and closely related Aeribacillus species, as shown in the cladogram (Figure 3). The tree illustrates the evolutionary relationships among 35 strains, including publicly available genomes. Branch lengths reflect the evolutionary divergence based on core-genome similarities, while bootstrap-like support values are indicated on internal nodes.
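As an illustration of how a whole-genome distance of this kind can be computed, the sketch below assumes a Mash-style k-mer distance [24]; this section does not state which distance measure underlies Figure 3, so that choice is an assumption. For simplicity the code uses the exact Jaccard index over all k-mers instead of a MinHash sketch, ignores reverse complements, and runs on short made-up sequences.

```python
import math


def kmers(seq, k):
    """Return the set of all k-mers in a sequence (forward strand only)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def mash_distance(seq_a, seq_b, k=21):
    """Mash-style distance D = -(1/k) * ln(2j / (1 + j)), j = Jaccard index.

    Simplification: j is computed exactly from the full k-mer sets,
    whereas Mash estimates it from MinHash sketches.
    """
    a, b = kmers(seq_a, k), kmers(seq_b, k)
    union = a | b
    if not union:
        raise ValueError("sequences are shorter than k")
    j = len(a & b) / len(union)
    if j == 0:
        return 1.0  # no shared k-mers: maximal distance
    d = -math.log(2 * j / (1 + j)) / k
    return min(1.0, max(0.0, d))


# Made-up sequences sharing 4 of 6 distinct 4-mers.
d_example = mash_distance("ACGTTGCA", "ACGTTGCC", k=4)
```

A matrix of such pairwise distances can then be passed to a tree-building method such as neighbor joining to obtain a clustering tree.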
The cladogram reveals clear genomic clustering within the A. pallidus clade, with several well-supported subclades. For instance, A. pallidus strains such as NRS-2058, KCTC3564, and BK1 form a robustly supported group (branch support values < 1 × 10⁻²), suggesting a close evolutionary relationship that may reflect shared ecological niches or genomic traits. Interestingly, certain A. pallidus strains, such as GS3372 and SJP27, cluster distantly from the main group, indicating greater genomic divergence and potentially unique adaptations or functional capabilities.
A subset of strains annotated as Aeribacillus sp. (e.g., FSL K6-2211, FSL M8-0235) showed close clustering with A. pallidus isolates, supporting their taxonomic affiliation with this species. Similarly, the group containing A. pallidus PI8, 8, and W-12 is genetically distinct but still retains tight internal branching, suggesting intraspecific variability within the species.
On the other hand, several strains classified as Aeribacillus composti (e.g., KCTC 33824, HB-1, NRS-1512) formed distinct monophyletic groups, clearly separating from A. pallidus at the species level. However, the presence of intermixed branching patterns and intermediate placements of certain Aeribacillus sp. strains indicates ambiguity in current species delineations within the genus. These findings suggest that some taxonomic classifications may not fully reflect the underlying genomic relationships.
Overall, the whole-genome-based cladogram supports the genomic coherence of A. pallidus while also revealing substantial intraspecies diversity and strain-specific divergence. This diversity may reflect ecological adaptations and holds significance for the species’ biotechnological potential.

3.6. Pan/Core Genome Analysis

A comprehensive pan-genome analysis was performed using 38 whole-genome sequences belonging to the Aeribacillus genus, including multiple strains of A. pallidus, A. composti, A. alveayuensis, A. kexueae, and other unclassified Aeribacillus spp. (Table 6). The analysis aimed to quantify the extent of genomic diversity within the genus by categorizing coding sequences (CDS) into core, variable, and strain-specific components.
The results demonstrated a highly conserved core genome, with approximately 1100–1200 genes present across all strains. However, the size of the accessory genome (variable and strain-specific CDS) varied substantially among the species and even among strains of the same species. Notably, A. composti strains showed remarkable genomic conservation, with strain-specific CDS percentages consistently below 5%, and in some cases (e.g., NRS-1631) as low as 0.18%, highlighting their high genomic similarity and potential clonal relatedness.
In contrast, A. alveayuensis 24KAM51 and A. pallidus NRS-2058 exhibited the highest proportion of strain-specific genes, at 21.1% and 30.2%, respectively. These findings suggest a greater degree of genomic plasticity or niche-specific adaptation in these strains. Similarly, A. pallidus GS3372 stood out due to its relatively large genome and the highest count of unique genes (3125 CDS), indicating a potentially distinct ecological role or acquisition of exogenous elements through horizontal gene transfer.
Across the genus, the proportion of core genes ranged from approximately 23% to 34% of the total CDS, while the variable genome represented the largest fraction in most strains, comprising 65–77% of the annotated coding potential. This distribution underscores the open nature of the Aeribacillus pan-genome, characterized by extensive genomic diversity, likely driven by environmental adaptation mechanisms and the acquisition of mobile genetic elements.
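The partition of coding sequences into core, variable, and strain-specific components can be sketched as follows. This is a minimal illustration that assumes gene families have already been clustered into a presence/absence table; the genome names and family identifiers are hypothetical and are not taken from Table 6. Here "variable" excludes strain-specific families, which may differ from the definition used for the percentages reported above.

```python
from collections import Counter


def partition_pangenome(presence):
    """Split gene families into core, variable, and strain-specific sets.

    presence: dict mapping genome name -> set of gene-family identifiers.
    """
    n_genomes = len(presence)
    # Count the number of genomes in which each family occurs.
    occurrence = Counter(fam for fams in presence.values() for fam in fams)
    core = {f for f, n in occurrence.items() if n == n_genomes}
    specific = {f for f, n in occurrence.items() if n == 1}
    variable = set(occurrence) - core - specific
    return core, variable, specific


def per_genome_fractions(presence):
    """Fraction of each genome's families falling into each component."""
    core, variable, specific = partition_pangenome(presence)
    fractions = {}
    for genome, fams in presence.items():
        total = len(fams)
        fractions[genome] = {
            "core": len(fams & core) / total,
            "variable": len(fams & variable) / total,
            "specific": len(fams & specific) / total,
        }
    return fractions


# Hypothetical toy input: three genomes, seven gene families.
toy = {
    "genome_A": {"f1", "f2", "f3", "f4"},
    "genome_B": {"f1", "f2", "f3", "f5"},
    "genome_C": {"f1", "f2", "f6", "f7"},
}
fractions = per_genome_fractions(toy)
```

In this toy input, families `f1` and `f2` form the core, `f3` is variable, and the remaining four are strain-specific, so `genome_C` carries half core and half strain-specific families.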
Collectively, this analysis highlights significant inter- and intra-species genomic variability within the Aeribacillus genus. Strains with higher proportions of unique or variable genes—such as GS3372, NRS-2058, and 24KAM51—may harbor novel traits of ecological or biotechnological interest and warrant further functional exploration. Conversely, the highly conserved nature of A. composti strains suggests a more stable genome architecture, making them suitable models for studying core functional traits of the genus.

4. Discussion

The comparative genomic analysis of Aeribacillus pallidus strains presented in this study provides a comprehensive understanding of the species’ metabolic capabilities, stress tolerance mechanisms, and potential for biotechnological exploitation. While A. pallidus has been recognized as a thermophilic bacterium producing a variety of thermostable enzymes [4,5,6], the present genomic dataset allows a deeper exploration of the genetic determinants underpinning these traits and reveals strain-specific features that may expand its industrial relevance.
Despite a conserved GC content (~39%) and the presence of complete central metabolic pathways in all strains, notable genomic heterogeneity was observed, particularly in strain GS3372, which displayed a markedly higher GC content (57.4%) and a larger genome than the other strains. Such deviations likely reflect horizontal gene transfer events, conferring novel metabolic capacities and environmental adaptability [9]. The open nature of the Aeribacillus pan-genome identified here, with a core genome representing only 23–34% of coding sequences, aligns with previous observations in thermophilic bacteria where genomic plasticity enables adaptation to diverse niches [16].
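The GC deviation of GS3372 can be made explicit with a short check against the species median. The sketch below uses the GC values listed in Table 2; the 5-percentage-point threshold and the function name `gc_outliers` are arbitrary choices made for illustration, not criteria used in this study.

```python
from statistics import median

# GC content (%) per strain, as listed in Table 2.
gc_content = {
    "NRS-2058": 39.0, "W-12": 38.9, "TD1": 38.8, "SJP27": 39.2,
    "PI8": 39.0, "NRS-1637": 38.7, "MHI3391": 39.1, "MHI3390": 39.0,
    "KCTC3564": 39.3, "GS3372": 57.4, "BK1": 39.2, "8m3": 38.9, "8": 38.8,
}


def gc_outliers(values, threshold=5.0):
    """Return strains whose GC content differs from the median by more
    than `threshold` percentage points (threshold is an assumption)."""
    centre = median(values.values())
    return {
        strain: round(gc - centre, 1)
        for strain, gc in values.items()
        if abs(gc - centre) > threshold
    }


outliers = gc_outliers(gc_content)
```

With these values the median is 39.0% and only GS3372 is flagged, at 18.4 percentage points above the median.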
Core metabolic modules—including glycolysis, gluconeogenesis, the pentose phosphate pathway, and the TCA cycle—were conserved across all strains, consistent with earlier biochemical characterizations [4,14]. Furthermore, the ability to perform ethanol fermentation in most strains suggests metabolic flexibility that could be leveraged for bioethanol production under thermophilic conditions.
All strains possessed a conserved suite of heat shock proteins (HSPs) and universal stress proteins (USPs), underscoring their essential roles in thermotolerance and protein homeostasis. The universal presence of GroEL/GroES and DnaK/DnaJ/GrpE complexes confirms their critical role in preventing protein aggregation and assisting refolding under stress [30,32]. Similarly, Clp proteases, HtpG, and DegP/HtrA contribute to proteostasis and membrane stability during heat and oxidative stress [31,43]. The presence of accessory chaperones such as ClpE and HslO in a strain-dependent manner may indicate fine-tuned adaptation to specific ecological niches. These robust stress adaptation systems reinforce the suitability of A. pallidus for industrial processes requiring high thermal stability.
The CAZyme profiles revealed an abundant and conserved set of glycoside hydrolases (GH13, GH18, GH23, GH188) and glycosyltransferases (GT2, GT4, GT51, GT119), highlighting a strong capacity for polysaccharide degradation and glycan biosynthesis. GH188’s high copy number across strains suggests specialization in chitin or peptidoglycan modification, with potential applications in biomass conversion and biocontrol. Strains GS3372 and MHI3390 showed the most enriched CAZyme repertoires, including elevated CE4 and multiple GT families, indicating enhanced lignocellulose-degrading capacity. This finding aligns with prior studies demonstrating lignocellulolytic activity in thermophilic A. pallidus [8]. Such capabilities are valuable for biofuel production and the generation of biopolymers from renewable biomass.
Genome mining revealed diverse biosynthetic gene clusters (BGCs), including terpenes, sactipeptides, betalactones, NRPS-derived metabolites, and RiPP-like peptides. Particularly, strains PI8, W-12, and 8m3 displayed multiple high-confidence BGCs, suggesting a greater capacity to produce novel bioactive compounds. Prior work has reported the production of the thermostable bacteriocin pallidocyclin in A. pallidus PI8 [10], and genomic evidence points to additional, as-yet-undiscovered antimicrobial peptides [11]. The diversity of BGCs observed here suggests applications in food preservation, pharmaceuticals, and agriculture, especially where heat stability is advantageous.
Phylogenomic analysis confirmed the taxonomic coherence of A. pallidus while revealing substantial intraspecific diversity. Outlier strains such as GS3372 and SJP27 formed distinct branches, reflecting possible ecological specialization. The large accessory genome and frequent strain-specific genes suggest ongoing diversification driven by environmental pressures, paralleling trends in other extremophiles adapted to fluctuating environments [14,15].
The combination of thermostable enzymes, robust stress response networks, enriched CAZyme repertoires, and diverse BGCs underscores A. pallidus’ potential as a versatile biotechnological resource. Possible applications include thermostable enzyme production for starch, pectin, and lipid processing [5,6,7], lignocellulose-based biofuel generation, bioremediation of petroleum-contaminated environments [9], and production of heat-stable antimicrobial agents [10,11]. Future work should focus on experimental validation of candidate genes, heterologous expression of promising enzymes, and metabolomic profiling to verify BGC products. Integrating genomic data with systems biology and metabolic engineering could establish A. pallidus as a chassis organism for high-temperature industrial processes.
In comparison with other thermophilic genera, Aeribacillus pallidus shows both shared and unique genomic features. Like Geobacillus species, which are well known for their thermostable xylanases and lipases [44], A. pallidus harbors abundant glycoside hydrolases and proteases that support efficient biomass degradation and industrial enzyme applications. Likewise, parallels can be drawn with Thermus, a genus famous for heat-stable DNA polymerases such as Taq polymerase [45], underscoring the potential of A. pallidus to yield additional thermostable biomolecules. Several Bacillus species are widely used for alkaline proteases [46] and amylases [47] in food and detergent industries, and our findings suggest that A. pallidus possesses similar enzymatic repertoires. Importantly, distinctive features such as the enriched CAZyme families in GS3372 and the presence of diverse bacteriocin-related biosynthetic gene clusters indicate that A. pallidus may complement and extend the biotechnological capacities of these established thermophilic genera.
In summary, this comparative genomic study provides novel insights into the diversity and adaptive capacity of A. pallidus. Our findings build upon previous studies by extending the analysis from single strains to a broader species-level framework, thereby highlighting both conserved and unique traits. These results not only reinforce the relevance of A. pallidus in industrial biotechnology but also open new avenues for targeted experimental validation and metabolic engineering in future research.

5. Conclusions

This study provides the first comprehensive comparative genomic analysis of 13 Aeribacillus pallidus strains, revealing both conserved and strain-specific traits that underpin their ecological resilience and biotechnological potential. The conserved presence of core metabolic pathways—including glycolysis, gluconeogenesis, the pentose phosphate pathway, and the TCA cycle—confirms the robust metabolic foundation of A. pallidus. Meanwhile, the genomic diversity observed, particularly in strains such as GS3372, highlights the species’ adaptive plasticity and suggests ongoing evolutionary diversification through horizontal gene transfer and niche-specific adaptation.
The identification of a broad repertoire of carbohydrate-active enzymes (CAZymes) and diverse secondary metabolite biosynthetic gene clusters (BGCs) underscores the potential of A. pallidus as a source of thermostable enzymes and novel bioactive compounds. Strains with enriched CAZyme profiles, such as GS3372 and MHI3390, present promising candidates for biomass conversion and lignocellulose degradation, while those with multiple high-confidence BGCs, such as PI8, W-12, and 8m3, may serve as valuable sources of antimicrobial peptides and other specialized metabolites.
In addition, the conserved suite of heat shock proteins and universal stress proteins across all strains reinforces the ability of A. pallidus to thrive under extreme conditions. This resilience, combined with metabolic versatility, positions A. pallidus as a strong candidate for use as a thermophilic chassis organism in industrial biotechnology, particularly in applications requiring high process temperatures, such as biofuel production, thermotolerant biocatalysis, and bioremediation in harsh environments.
Future research should focus on experimental validation of the predicted metabolic pathways and biosynthetic gene clusters, as well as the functional characterization of novel enzymes and metabolites. Integrating these genomic insights with systems biology and metabolic engineering approaches could accelerate the development of optimized A. pallidus strains tailored to specific industrial applications, thereby contributing to sustainable bioprocessing solutions in the emerging bioeconomy.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/app152010866/s1, Table S1: Annotation of Aeribacillus pallidus NRS-2058 genome; Table S2: Annotation of Aeribacillus pallidus W-12 genome; Table S3: Annotation of Aeribacillus pallidus TD1 genome; Table S4: Annotation of Aeribacillus pallidus SJP27 genome; Table S5: Annotation of Aeribacillus pallidus PI8 genome; Table S6: Annotation of Aeribacillus pallidus NRS-1637 genome; Table S7: Annotation of Aeribacillus pallidus MHI3391 genome; Table S8: Annotation of Aeribacillus pallidus MHI3390 genome; Table S9: Annotation of Aeribacillus pallidus KCTC3564 genome; Table S10: Annotation of Aeribacillus pallidus GS3372 genome; Table S11: Annotation of Aeribacillus pallidus BK1 genome; Table S12: Annotation of Aeribacillus pallidus 8m3 genome; Table S13: Annotation of Aeribacillus pallidus 8 genome; Table S14: Function comparison of Aeribacillus pallidus strains.

Author Contributions

S.Y.Y. and N.R. carried out all the work, prepared figures and tables, wrote the main text and reviewed the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Scholz, T.; Demharter, W.; Hensel, R.; Kandler, O. Bacillus pallidus sp. nov., a new thermophilic species from sewage. Syst. Appl. Microbiol. 1987, 9, 91–96.
  2. Banat, I.M.; Marchant, R.; Rahman, T.J. Geobacillus debilis sp. nov., a novel obligately thermophilic bacterium isolated from a cool soil environment, and reassignment of Bacillus pallidus to Geobacillus pallidus comb. nov. Int. J. Syst. Evol. Microbiol. 2004, 54, 2197–2201.
  3. Minana-Galbis, D.; Pinzon, D.L.; Loren, J.G.; Manresa, A.; Oliart-Ros, R.M. Reclassification of Geobacillus pallidus (Scholz et al. 1988) Banat et al. 2004 as Aeribacillus pallidus gen. nov., comb. nov. Int. J. Syst. Evol. Microbiol. 2010, 60, 1600–1604.
  4. Yasawong, M.; Areekit, S.; Pakpitchareon, A.; Santiwatanakul, S.; Chansiri, K. Characterization of thermophilic halotolerant Aeribacillus pallidus TD1 from Tao dam hot spring, Thailand. Int. J. Mol. Sci. 2011, 12, 5294–5303.
  5. Timilsina, P.M.; Pandey, G.R.; Shrestha, A.; Ojha, M.; Karki, T.B. Purification and characterization of a noble thermostable algal starch liquefying alpha-amylase from Aeribacillus pallidus BTPS-2 isolated from geothermal spring of Nepal. Biotechnol. Rep. 2020, 28, e00551.
  6. Yildirim, V.; Baltaci, M.O.; Ozgencli, I.; Sisecioglu, M.; Adiguzel, A.; Adiguzel, G. Purification and biochemical characterization of a novel thermostable serine alkaline protease from Aeribacillus pallidus C10: A potential additive for detergents. J. Enzym. Inhib. Med. Chem. 2017, 32, 468–477.
  7. Ktata, A.; Krayem, N.; Aloulou, A.; Bezzine, S.; Sayari, A.; Chamkha, M.; Karray, A. Purification, biochemical and molecular study of lipase producing from a newly thermoalkaliphilic Aeribacillus pallidus for oily wastewater treatment. J. Biochem. 2020, 167, 89–99.
  8. López López, M.J.; Jurado Rodríguez, M.d.M.; López González, J.A.; Estrella González, M.J.; Martínez Gallardo, M.R.; Toribio Gallardo, A.J.; Suárez Estrella, F. Characterization of Thermophilic Lignocellulolytic Microorganisms in Composting. Front. Microbiol. 2021, 12, 697480.
  9. Filippidou, S.; Jaussi, M.; Junier, T.; Wunderlin, T.; Jeanneret, N.; Regenspurg, S.; Li, P.-E.; Lo, C.-C.; Johnson, S.; McMurry, K. Genome sequence of Aeribacillus pallidus strain GS3372, an endospore-forming bacterium isolated in a deep geothermal reservoir. Genome Announc. 2015, 3, 00981-15.
  10. Kita, K.; Yoshida, S.; Masuo, S.; Nakamura, A.; Ishikawa, S.; Yoshida, K.-i. Genes encoding a novel thermostable bacteriocin in the thermophilic bacterium Aeribacillus pallidus PI8. J. Appl. Microbiol. 2023, 134, lxad293.
  11. Lücking, G.; Albrecht, K.; Märtlbauer, E.; Schauer, K. Draft genome sequences of two thermophilic, spore-forming Aeribacillus pallidus strains isolated from dairy products. Microbiol. Resour. Announc. 2024, 13, e00896-23.
  12. Stavridou, E.; Karapetsi, L.; Nteve, G.M.; Tsintzou, G.; Chatzikonstantinou, M.; Tsaousi, M.; Martinez, A.; Flores, P.; Merino, M.; Dobrovic, L. Landscape of microalgae omics and metabolic engineering research for strain improvement: An overview. Aquaculture 2024, 587, 740803.
  13. Davey, J.W.; Hohenlohe, P.A.; Etter, P.D.; Boone, J.Q.; Catchen, J.M.; Blaxter, M.L. Genome-wide genetic marker discovery and genotyping using next-generation sequencing. Nat. Rev. Genet. 2011, 12, 499–510.
  14. Yildiz, S.Y.; Radchenkova, N.; Arga, K.Y.; Kambourova, M.; Toksoy Oner, E. Genomic analysis of Brevibacillus thermoruber 423 reveals its biotechnological and industrial potential. Appl. Microbiol. Biotechnol. 2015, 99, 2277–2289.
  15. Yaşar Yıldız, S. Genomic insights into Thermomonas hydrothermalis: Potential applications in industrial biotechnology. World J. Microbiol. Biotechnol. 2025, 41, 30.
  16. Yasar Yildiz, S.; Finore, I.; Leone, L.; Romano, I.; Lama, L.; Kasavi, C.; Nicolaus, B.; Toksoy Oner, E.; Poli, A. Genomic analysis provides new insights into biotechnological and industrial potential of Parageobacillus thermantarcticus M1. Front. Microbiol. 2022, 13, 923038.
  17. Aziz, R.K.; Bartels, D.; Best, A.A.; DeJongh, M.; Disz, T.; Edwards, R.A.; Formsma, K.; Gerdes, S.; Glass, E.M.; Kubal, M. The RAST Server: Rapid annotations using subsystems technology. BMC Genom. 2008, 9, 75.
  18. Olson, R.D.; Assaf, R.; Brettin, T.; Conrad, N.; Cucinell, C.; Davis, J.J.; Dempsey, D.M.; Dickerman, A.; Dietrich, E.M.; Kenyon, R.W. Introducing the bacterial and viral bioinformatics resource center (BV-BRC): A resource combining PATRIC, IRD and ViPR. Nucleic Acids Res. 2023, 51, D678–D689.
  19. Huerta-Cepas, J.; Szklarczyk, D.; Heller, D.; Hernández-Plaza, A.; Forslund, S.K.; Cook, H.; Mende, D.R.; Letunic, I.; Rattei, T.; Jensen, L.J. eggNOG 5.0: A hierarchical, functionally and phylogenetically annotated orthology resource based on 5090 organisms and 2502 viruses. Nucleic Acids Res. 2019, 47, D309–D314.
  20. Vallenet, D.; Engelen, S.; Mornico, D.; Cruveiller, S.; Fleury, L.; Lajus, A.; Rouy, Z.; Roche, D.; Salvignol, G.; Scarpelli, C. MicroScope: A platform for microbial genome annotation and comparative genomics. Database 2009, 2009, bap021.
  21. Zheng, J.; Ge, Q.; Yan, Y.; Zhang, X.; Huang, L.; Yin, Y. dbCAN3: Automated carbohydrate-active enzyme and substrate annotation. Nucleic Acids Res. 2023, 51, W115–W121.
  22. Medema, M.H.; Blin, K.; Cimermancic, P.; De Jager, V.; Zakrzewski, P.; Fischbach, M.A.; Weber, T.; Takano, E.; Breitling, R. antiSMASH: Rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences. Nucleic Acids Res. 2011, 39, W339–W346.
  23. Konstantinidis, K.T.; Tiedje, J.M. Genomic insights that advance the species definition for prokaryotes. Proc. Natl. Acad. Sci. USA 2005, 102, 2567–2572.
  24. Ondov, B.D.; Treangen, T.J.; Melsted, P.; Mallonee, A.B.; Bergman, N.H.; Koren, S.; Phillippy, A.M. Mash: Fast genome and metagenome distance estimation using MinHash. Genome Biol. 2016, 17, 132.
  25. Vallenet, D.; Calteau, A.; Dubois, M.; Amours, P.; Bazin, A.; Beuvin, M.; Burlot, L.; Bussell, X.; Fouteau, S.; Gautreau, G. MicroScope: An integrated platform for the annotation and exploration of microbial gene functions through genomic, pangenomic and metabolic comparative analysis. Nucleic Acids Res. 2020, 48, D579–D589.
  26. Blondel, V.D.; Guillaume, J.-L.; Lambiotte, R.; Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008, 2008, P10008.
  27. Deb, S. Pan-genome evolution and its association with divergence of metabolic functions in Bifidobacterium genus. World J. Microbiol. Biotechnol. 2022, 38, 231.
  28. Khan, Z.; Shahwar, D. Role of heat shock proteins (HSPs) and heat stress tolerance in crop plants. In Sustainable Agriculture in the Era of Climate Change; Springer: Berlin/Heidelberg, Germany, 2020; pp. 211–234.
  29. Luo, D.; Wu, Z.; Bai, Q.; Zhang, Y.; Huang, M.; Huang, Y.; Li, X. Universal stress proteins: From gene to function. Int. J. Mol. Sci. 2023, 24, 4725.
  30. Gragerov, A.; Nudler, E.; Komissarova, N.; Gaitanaris, G.A.; Gottesman, M.E.; Nikiforov, V. Cooperation of GroEL/GroES and DnaK/DnaJ heat shock proteins in preventing protein misfolding in Escherichia coli. Proc. Natl. Acad. Sci. USA 1992, 89, 10341–10344.
  31. Hu, C.; Yang, J.; Qi, Z.; Wu, H.; Wang, B.; Zou, F.; Mei, H.; Liu, J.; Wang, W.; Liu, Q. Heat shock proteins: Biological functions, pathological roles, and therapeutic opportunities. MedComm 2022, 3, e161.
  32. Mayer, M.; Bukau, B. Hsp70 chaperones: Cellular functions and molecular mechanism. Cell. Mol. Life Sci. 2005, 62, 670.
  33. Eelager, M.P.; Masti, S.P.; Chougale, R.B.; Dalbanjan, N.P.; Kumar, S.P. Noni (Morinda citrifolia) leaf extract incorporated methylcellulose active films: A sustainable strategy for browning inhibition in apple slice packaging. Int. J. Biol. Macromol. 2024, 269, 132270.
  34. Katikaridis, P.; Bohl, V.; Mogk, A. Resisting the heat: Bacterial disaggregases rescue cells from devastating protein aggregation. Front. Mol. Biosci. 2021, 8, 681439.
  35. Queraltó, C.; Álvarez, R.; Ortega, C.; Díaz-Yáñez, F.; Paredes-Sabja, D.; Gil, F. Role and regulation of Clp proteases: A target against gram-positive bacteria. Bacteria 2023, 2, 21–36.
  36. Michel, A.; Agerer, F.; Hauck, C.R.; Herrmann, M.; Ullrich, J.; Hacker, J.; Ohlsen, K. Global regulatory impact of ClpP protease of Staphylococcus aureus on regulons involved in virulence, oxidative stress response, autolysis, and DNA repair. J. Bacteriol. 2006, 188, 5783–5796.
  37. Jensen, C.; Fosberg, M.J.; Thalsø-Madsen, I.; Bæk, K.T.; Frees, D. Staphylococcus aureus ClpX localizes at the division septum and impacts transcription of genes involved in cell division, T7-secretion, and SaPI5-excision. Sci. Rep. 2019, 9, 16456.
  38. Rohrwild, M.; Coux, O.; Huang, H.; Moerschell, R.P.; Yoo, S.J.; Seol, J.H.; Chung, C.H.; Goldberg, A.L. HslV-HslU: A novel ATP-dependent protease complex in Escherichia coli related to the eukaryotic proteasome. Proc. Natl. Acad. Sci. USA 1996, 93, 5808–5813.
  39. Kebe, N.M.; Samanta, K.; Singh, P.; Lai-Kee-Him, J.; Apicella, V.; Payrot, N.; Lauraire, N.; Legrand, B.; Lisowski, V.; Mbang-Benet, D.-E. The HslV protease from Leishmania major and its activation by C-terminal HslU peptides. Int. J. Mol. Sci. 2019, 20, 1021.
  40. Zarzecka, U.; Modrak-Wójcik, A.; Figaj, D.; Apanowicz, M.; Lesner, A.; Bzowska, A.; Lipinska, B.; Zawilak-Pawlik, A.; Backert, S.; Skorko-Glonek, J. Properties of the HtrA protease from bacterium Helicobacter pylori whose activity is indispensable for growth under stress conditions. Front. Microbiol. 2019, 10, 961.
  41. Yi, L.; Liu, B.; Nixon, P.J.; Yu, J.; Chen, F. Recent advances in understanding the structural and functional evolution of FtsH proteases. Front. Plant Sci. 2022, 13, 837528.
  42. Kirthika, P.; Lloren, K.K.S.; Jawalagatti, V.; Lee, J.H. Structure, substrate specificity and role of lon protease in bacterial pathogenesis and survival. Int. J. Mol. Sci. 2023, 24, 3422.
  43. Laksanalamai, P.; Robb, F.T. Small heat shock proteins from extremophiles: A review. Extremophiles 2004, 8, 1–11.
  44. Najar, I.N.; Thakur, N. A systematic review of the genera Geobacillus and Parageobacillus: Their evolution, current taxonomic status and major applications. Microbiology 2020, 166, 800–816.
  45. Gelfand, D.H. Taq DNA polymerase. In PCR Technology: Principles and Applications for DNA Amplification; Springer: Berlin/Heidelberg, Germany, 1989; pp. 17–22.
  46. Keay, L.; Moser, P.W.; Wildi, B.S. Proteases of the genus Bacillus. II. Alkaline proteases. Biotechnol. Bioeng. 1970, 12, 213–249.
  47. Cordeiro, C.A.M.; Martins, M.L.L.; Luciano, A.B. Production and properties of alpha-amylase from thermophilic Bacillus sp. Braz. J. Microbiol. 2002, 33, 57–61.
Figure 1. Gene count per subsystem in A. pallidus strains.
Figure 2. Secondary metabolite similarity confidence per strain of A. pallidus.
Figure 3. Genome clustering tree of Aeribacillus family.
Table 1. List of publicly available A. pallidus genomes and their DDBJ/EMBL/GenBank accession numbers.

| Strain | DDBJ/EMBL/GenBank Accession Number |
|---|---|
| NRS-2058 | JBCNBW010000001.1 |
| KCTC3564 | CP017703.1 |
| 8m3 | LWBR01000001.1 |
| W-12 | QURG01000001.1 |
| 8 | LVHY01000001.1 |
| TD1 | SFCD01000001.1 |
| BK1 | CP160301.1 |
| PI8 | AP022323.1 |
| MHI3390 | JAVLRY010000001.1 |
| NRS-1637 | JARTFV010000001.1 |
| GS3372 | JYCD01000002.1 |
| SJP27 | JBFQFV010000001.1 |
| MHI3391 | JAVLRZ010000001.1 |
Table 2. General genomic properties of the A. pallidus strains.

| Genome | Size (bp) | GC Content (%) | N50 | L50 | Number of Contigs (with PEGs) | Number of Subsystems | Number of Coding Sequences | Number of RNAs |
|---|---|---|---|---|---|---|---|---|
| NRS-2058 | 3,245,679 | 39.0 | 58,619 | 18 | 109 | 284 | 3481 | 75 |
| W-12 | 3,839,138 | 38.9 | 99,755 | 12 | 140 | 304 | 4283 | 37 |
| TD1 | 3,748,965 | 38.8 | 45,881 | 26 | 207 | 300 | 4300 | 53 |
| SJP27 | 3,377,776 | 39.2 | 47,336 | 24 | 209 | 293 | 3933 | 69 |
| PI8 | 3,833,114 | 39.0 | - | 1 | 1 | 304 | 4229 | 104 |
| NRS-1637 | 3,863,896 | 38.7 | 71,699 | 16 | 120 | 302 | 4385 | 88 |
| MHI3391 | 3,782,925 | 39.1 | 39,514 | 27 | 171 | 301 | 4326 | 87 |
| MHI3390 | 3,911,438 | 39.0 | 281,598 | 4 | 66 | 307 | 4413 | 87 |
| KCTC3564 | 4,089,457 | 39.3 | - | 1 | 1 | 300 | 4535 | 104 |
| GS3372 | 4,985,863 | 57.4 | 66,220 | 25 | 185 | 302 | 5592 | 72 |
| BK1 | 3,935,118 | 39.2 | - | 1 | 1 | 301 | 4293 | 105 |
| 8m3 | 3,818,610 | 38.9 | 136,754 | 9 | 79 | 306 | 4245 | 68 |
| 8 | 3,903,800 | 38.8 | 85,252 | 15 | 169 | 301 | 4478 | 95 |
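The N50 and L50 columns in Table 2 summarize assembly contiguity. As a minimal sketch, assuming the conventional definitions (N50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly, and L50 is the number of contigs in that set), the two statistics could be computed as follows. The function name `n50_l50` and the contig lengths in the usage note are hypothetical and are not taken from the study; the tool the authors used may apply slightly different conventions.

```python
from typing import Iterable, Tuple


def n50_l50(contig_lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of contig lengths in bp."""
    lengths = sorted(contig_lengths, reverse=True)  # longest contigs first
    if not lengths:
        raise ValueError("at least one contig length is required")

    half_total = sum(lengths) / 2  # half of the total assembly size
    running = 0
    for rank, length in enumerate(lengths, start=1):
        running += length
        # The first contig that pushes the running sum to at least half
        # of the assembly defines N50 (its length) and L50 (its rank).
        if running >= half_total:
            return length, rank

    # The running sum always reaches the total, so the loop always returns.
    raise AssertionError("unreachable")
```

For a hypothetical five-contig assembly with lengths 100, 80, 60, 40 and 20 bp, `n50_l50` returns an N50 of 80 and an L50 of 2. For a single-contig assembly L50 is 1, which agrees with the value listed for the complete genomes (PI8, KCTC3564 and BK1) in Table 2.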
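Table 3. Number of genes associated with the general cluster of orthologous group (COG) functional categories.
Values are CDS counts per strain.

| COG Category | COG Category Description | NRS-2058 | W-12 | TD1 | SJP27 | PI8 | NRS-1637 | MHI3391 | MHI3390 | KCTC3564 | GS3372 | BK1 | 8m3 | 8 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Cellular Processes and Signaling | | | | | | | | | | | | | |
| D | Cell cycle control, cell division, chromosome partitioning | 49 | 43 | 50 | 48 | 45 | 46 | 47 | 47 | 46 | 67 | 44 | 42 | 48 |
| M | Cell wall/membrane/envelope biogenesis | 157 | 154 | 153 | 118 | 138 | 153 | 155 | 175 | 184 | 216 | 163 | 158 | 150 |
| N | Cell motility | 73 | 75 | 73 | 61 | 73 | 75 | 70 | 71 | 70 | 86 | 70 | 72 | 71 |
| O | Post-translational modification, protein turnover, chaperones | 104 | 118 | 113 | 116 | 119 | 117 | 120 | 120 | 121 | 132 | 121 | 118 | 116 |
| T | Signal transduction mechanisms | 116 | 147 | 161 | 124 | 145 | 149 | 149 | 151 | 150 | 224 | 148 | 146 | 148 |
| U | Intracellular trafficking, secretion, and vesicular transport | 55 | 48 | 49 | 46 | 51 | 56 | 52 | 54 | 59 | 85 | 47 | 51 | 55 |
| V | Defense mechanisms | 42 | 56 | 67 | 52 | 65 | 68 | 57 | 74 | 64 | 68 | 59 | 50 | 71 |
| W | Extracellular structures | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 |
| Z | Cytoskeleton | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 0 |
| | Information Storage and Processing | | | | | | | | | | | | | |
| A | RNA processing and modification | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
| B | Chromatin structure and dynamics | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| J | Translation, ribosomal structure and biogenesis | 189 | 169 | 176 | 169 | 170 | 170 | 169 | 172 | 172 | 201 | 169 | 169 | 170 |
| K | Transcription | 188 | 268 | 272 | 240 | 264 | 264 | 267 | 264 | 270 | 389 | 257 | 270 | 269 |
| L | Replication, recombination and repair | 216 | 275 | 239 | 377 | 329 | 317 | 337 | 307 | 578 | 251 | 463 | 280 | 421 |
| | Metabolism | | | | | | | | | | | | | |
| C | Energy production and conversion | 144 | 245 | 248 | 205 | 234 | 237 | 224 | 239 | 229 | 254 | 226 | 246 | 237 |
| E | Amino acid transport and metabolism | 278 | 302 | 309 | 251 | 295 | 300 | 292 | 296 | 282 | 465 | 279 | 295 | 299 |
| F | Nucleotide transport and metabolism | 100 | 95 | 101 | 93 | 99 | 97 | 106 | 99 | 100 | 111 | 96 | 95 | 98 |
| G | Carbohydrate transport and metabolism | 150 | 213 | 224 | 178 | 222 | 228 | 214 | 222 | 217 | 294 | 206 | 215 | 231 |
| H | Coenzyme transport and metabolism | 113 | 146 | 153 | 135 | 146 | 137 | 137 | 152 | 155 | 182 | 135 | 146 | 137 |
| I | Lipid transport and metabolism | 110 | 125 | 131 | 83 | 123 | 125 | 100 | 121 | 100 | 154 | 102 | 122 | 126 |
| P | Inorganic ion transport and metabolism | 171 | 262 | 250 | 201 | 240 | 243 | 239 | 270 | 221 | 295 | 246 | 263 | 243 |
| Q | Secondary metabolites biosynthesis, transport and catabolism | 53 | 79 | 72 | 49 | 73 | 74 | 56 | 78 | 72 | 105 | 62 | 72 | 74 |
| | Poorly Characterized | | | | | | | | | | | | | |
| S | Function unknown | 1068 | 1036 | 1004 | 923 | 985 | 1049 | 1035 | 1042 | 1073 | 1463 | 994 | 1024 | 1059 |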
Table 4. Thermal stress related proteins of A. pallidus strains. (Black boxes indicate absence of the corresponding gene; white boxes indicate presence).
Strains (columns): NRS-2058, W-12, TD1, SJP27, PI8, NRS-1637, MHI3391, MHI3390, KCTC3564, GS3372, BK1, 8m3, 8
Proteins (rows):
Heat shock protein 60 kDa family chaperone GroEL
Heat shock protein 10 kDa family chaperone GroES
RNA polymerase sigma factor RpoD
RNA polymerase heat shock sigma factor SigI
Chaperone protein DnaK
Chaperone protein DnaJ
Heat shock protein GrpE
Chaperone protein HtpG
ClpCP protease substrate adapter protein MecA
ATP-dependent Clp protease, ATP-binding subunit ClpC
ATP-dependent Clp protease proteolytic subunit ClpP (EC 3.4.21.92)
ATP-dependent Clp protease ATP-binding subunit ClpX
Putative membrane-bound ClpP-class protease associated with aq_911
ATP-dependent Clp protease, ATP-binding subunit ClpE
Chaperone protein ClpB (ATP-dependent unfoldase)
33 kDa chaperonin HslO
ATP-dependent protease subunit HslV (EC 3.4.25.2)
ATP-dependent hsl protease ATP-binding subunit HslU
Serine protease, DegP/HtrA, do-like (EC 3.4.21.-)
HflK protein
HflC protein
Lon-like protease with PDZ domain
Small heat shock protein
[Presence/absence boxes are graphical and are not reproduced in this text version.]
Table 5. Carbohydrate-active enzymes (CAZymes) annotation of A. pallidus strains.
Table 5. Carbohydrate-active enzymes (CAZymes) annotation of A. pallidus strains.
HMM ProfileCAZyme Classes/Associated ModuleNRS-2058W-12TD1SJP27PI8NRS-1637MHI3391MHI3390KCTC3564GS3372BK18m38
AA1Auxiliary Activity Family 11101001024110
AA3Auxiliary Activity Family 31000000001000
AA4Auxiliary Activity Family 40444444453444
AA6Auxiliary Activity Family 61111111110111
CBM20Carbohydrate-binding Module Family 201000000000000
CBM34Carbohydrate-binding Module Family 341000000010000
CBM48Carbohydrate-binding Module Family 481000000000000
CBM50Carbohydrate-binding Module Family 504333334431333
CBM68Carbohydrate-binding Module Family 681000000000000
CBM96Carbohydrate-binding Module Family 961000000000000
CE1Carbohydrate Esterase Family 11231221315222
CE4Carbohydrate Esterase Family 445544445410444
CE7Carbohydrate Esterase Family 70010010000001
CE9Carbohydrate Esterase Family 91111212211211
CE14Carbohydrate Esterase Family 143211211214221
GH1Glycosyl Hydrolase Family 10220222220222
GH3Glycosyl Hydrolase Family 30111111111111
GH4Glycosyl Hydrolase Family 40224222220220
GH13_2Glycosyl Hydrolase Family 13/Subf 21000000000000
GH13_9Glycosyl Hydrolase Family 13/Subf 91000000000000
GH13_14Glycosyl Hydrolase Family 13/Subf 141000000000000
GH13_20Glycosyl Hydrolase Family 13/Subf 201000000010000
GH13_29Glycosyl Hydrolase Family 13/Subf 291111111110111
GH13_31Glycosyl Hydrolase Family 13/Subf 313000000020000
GH13_45Glycosyl Hydrolase Family 13/Subf 451111111120111
GH15Glycosyl Hydrolase Family 150000000003000
GH18Glycosyl Hydrolase Family 183232433344323
GH20Glycosyl Hydrolase Family 200000000001000
GH23Glycosyl Hydrolase Family 231111111112111
GH25Glycosyl Hydrolase Family 250000001000000
GH31_1Glycosyl Hydrolase Family 31/Subf 11000000010000
GH31_2Glycosyl Hydrolase Family 31/Subf 20000010000001
GH32Glycosyl Hydrolase Family 321111111110111
GH38Glycosyl Hydrolase Family 380010100100000
GH57Glycosyl Hydrolase Family 570000001221002
GH73Glycosyl Hydrolase Family 731111120000112
GH84Glycosyl Hydrolase Family 840000000001000
GH109Glycosyl Hydrolase Family 1090000100001000
GH130_4Glycosyl Hydrolase Family 130/Subf 40111111110111
GH170Glycosyl Hydrolase Family 1700000101100101
GH171Glycosyl Hydrolase Family 17100000000 2000
GH176Glycosyl Hydrolase Family 1761000000000000
GH177Glycosyl Hydrolase Family 1770000110000000
GH179Glycosyl Hydrolase Family 1790221221311132
GH188Glycosyl Hydrolase Family 1880565375665567
GT1Glycosyl Transferase Family 10000000001000
GT2Glycosyl Transferase Family 260001132312301
GT4Glycosyl Transferase Family 47771568538656
GT5Glycosyl Transferase Family 51000000001005
GT8Glycosyl Transferase Family 80111111110110
GT27Glycosyl Transferase Family 270000000000000
GT28Glycosyl Transferase Family 283400000001000
GT35Glycosyl Transferase Family 351044444445444
GT51Glycosyl Transferase Family 515555555554551
GT61Glycosyl Transferase Family 510000000002000
GT100Glycosyl Transferase Family 1001000000000000
GT108Glycosyl Transferase Family 1080111111110111
GT111Glycosyl Transferase Family 1111000000000000
GT118Glycosyl Transferase Family 1180100000000000
GT119Glycosyl Transferase Family 1196133333333333
GT121Glycosyl Transferase Family 1210010110300111
PL12Polysaccharide Lyase Family 120000000010000
PL12_1Polysaccharide Lyase Family 12/Subf 10000000001000
PL12_3Polysaccharide Lyase Family 12/Subf 30100000000000
SLHSurface layer (S-layer) Homology1000000002000
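The HMM profile names in Table 5 combine a CAZyme class prefix (AA, CBM, CE, GH, GT or PL), a family number and, where present, a subfamily after an underscore (as in GH13_31). As a rough sketch of how per-class totals could be tallied from such a table, the snippet below parses these identifiers and sums counts by class. The helper names `parse_cazy_id` and `class_totals` are hypothetical, and the input shown in the usage note is only a small subset of the GS3372 column, so the result is an illustration and not a published total.

```python
import re
from collections import Counter
from typing import Dict, Optional, Tuple

# Class prefix, family number, optional "_subfamily" suffix (e.g. GH13_31).
_CAZY_ID = re.compile(r"^([A-Z]+)(\d+)(?:_(\d+))?$")


def parse_cazy_id(profile: str) -> Tuple[str, Optional[int], Optional[int]]:
    """Split an HMM profile name into (class, family, subfamily)."""
    match = _CAZY_ID.match(profile)
    if match is None:
        # Modules without a family number (e.g. SLH) are returned unchanged.
        return profile, None, None
    cls, family, subfamily = match.groups()
    return cls, int(family), int(subfamily) if subfamily else None


def class_totals(counts: Dict[str, int]) -> Dict[str, int]:
    """Sum per-profile gene counts into per-class totals."""
    totals: Counter = Counter()
    for profile, n in counts.items():
        cls, _, _ = parse_cazy_id(profile)
        totals[cls] += n
    return dict(totals)
```

Applied to a few GS3372 entries from Table 5, `class_totals({"GH15": 3, "GH20": 1, "GH84": 1, "GT1": 1, "GT61": 2, "PL12_1": 1})` gives 5 for GH, 3 for GT and 1 for PL.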
Table 6. Pan/core genome analysis of Aeribacillus genus.

| Organism | CDS | CDS (Without Artefact Fam.) | Pan CDS | Core CDS | Var CDS | Strain Specific CDS | Core CDS (%) | Var CDS (%) | Strain Spe. CDS (%) |
|---|---|---|---|---|---|---|---|---|---|
| A. alveayuensis 24KAM51 | 7047 | 6666 | 6666 | 2179 | 4487 | 1404 | 32.688 | 67.312 | 21.062 |
| A. alveayuensis DSM 19092 | 3359 | 3161 | 3161 | 1103 | 2058 | 361 | 34.894 | 65.106 | 11.42 |
| A. composti B-14742 | 3980 | 3755 | 3755 | 1175 | 2580 | 102 | 31.292 | 68.708 | 2.716 |
| A. composti B-14745 | 3630 | 3418 | 3418 | 1138 | 2280 | 164 | 33.294 | 66.706 | 4.798 |
| A. composti HB-1 | 4026 | 3765 | 3765 | 1161 | 2604 | 80 | 30.837 | 69.163 | 2.125 |
| A. composti KCTC 33824 | 4006 | 3781 | 3781 | 1158 | 2623 | 70 | 30.627 | 69.373 | 1.851 |
| A. composti NRS-1511 | 4099 | 3873 | 3873 | 1177 | 2696 | 88 | 30.39 | 69.61 | 2.272 |
| A. composti NRS-1512 | 3933 | 3715 | 3715 | 1178 | 2537 | 96 | 31.709 | 68.291 | 2.584 |
| A. composti NRS-1630 | 4153 | 3903 | 3903 | 1172 | 2731 | 20 | 30.028 | 69.972 | 0.512 |
| A. composti NRS-1631 | 4140 | 3885 | 3885 | 1173 | 2712 | 7 | 30.193 | 69.807 | 0.18 |
| A. composti NRS-1632 | 4075 | 3836 | 3836 | 1169 | 2667 | 57 | 30.474 | 69.526 | 1.486 |
| A. composti NRS-1633 | 3941 | 3706 | 3706 | 1166 | 2540 | 76 | 31.462 | 68.538 | 2.051 |
| A. composti NRS-2045 | 3953 | 3733 | 3733 | 1177 | 2556 | 36 | 31.53 | 68.47 | 0.964 |
| A. kexueae KCTC 33881 | 3488 | 3325 | 3325 | 1097 | 2228 | 949 | 32.992 | 67.008 | 28.541 |
| A. pallidus 8 | 4273 | 3946 | 3946 | 1172 | 2774 | 50 | 29.701 | 70.299 | 1.267 |
| A. pallidus 8m3 | 4046 | 3806 | 3806 | 1188 | 2618 | 59 | 31.214 | 68.786 | 1.55 |
| A. pallidus BK1 | 4122 | 3860 | 3860 | 1147 | 2713 | 65 | 29.715 | 70.285 | 1.684 |
| A. pallidus GS3372 | 5519 | 5298 | 5298 | 1223 | 4075 | 3125 | 23.084 | 76.916 | 58.985 |
| A. pallidus KCTC3564 | 4524 | 4200 | 4200 | 1161 | 3039 | 289 | 27.643 | 72.357 | 6.881 |
| A. pallidus MHI3390 | 4199 | 3945 | 3945 | 1188 | 2757 | 185 | 30.114 | 69.886 | 4.689 |
| A. pallidus MHI3391 | 4078 | 3837 | 3837 | 1143 | 2694 | 154 | 29.789 | 70.211 | 4.014 |
| A. pallidus NRS-1637 | 4159 | 3898 | 3898 | 1172 | 2726 | 16 | 30.067 | 69.933 | 0.41 |
| A. pallidus NRS-2058 | 3450 | 3277 | 3277 | 1084 | 2193 | 991 | 33.079 | 66.921 | 30.241 |
| A. pallidus PI8 | 4000 | 3735 | 3735 | 1158 | 2577 | 93 | 31.004 | 68.996 | 2.49 |
| A. pallidus SJP27 | 3648 | 3451 | 3451 | 1117 | 2334 | 160 | 32.367 | 67.633 | 4.636 |
| A. pallidus TD1 | 4037 | 3817 | 3817 | 1192 | 2625 | 172 | 31.229 | 68.771 | 4.506 |
| A. pallidus W-12 | 4071 | 3857 | 3857 | 1190 | 2667 | 89 | 30.853 | 69.147 | 2.307 |
| Aeribacillus sp. FSL K6-1121 | 4423 | 4141 | 4141 | 1176 | 2965 | 156 | 28.399 | 71.601 | 3.767 |
| Aeribacillus sp. FSL K6-1305 | 4170 | 3894 | 3894 | 1157 | 2737 | 80 | 29.712 | 70.288 | 2.054 |
| Aeribacillus sp. FSL K6-2211 | 4365 | 4061 | 4061 | 1151 | 2910 | 114 | 28.343 | 71.657 | 2.807 |
| Aeribacillus sp. FSL K6-2833 | 4233 | 3947 | 3947 | 1165 | 2782 | 86 | 29.516 | 70.484 | 2.179 |
| Aeribacillus sp. FSL K6-2848 | 4567 | 4251 | 4251 | 1171 | 3080 | 159 | 27.546 | 72.454 | 3.74 |
| Aeribacillus sp. FSL K6-3256 | 3976 | 3713 | 3713 | 1175 | 2538 | 30 | 31.646 | 68.354 | 0.808 |
| Aeribacillus sp. FSL K6-8210 | 4348 | 4038 | 4038 | 1155 | 2883 | 107 | 28.603 | 71.397 | 2.65 |
| Aeribacillus sp. FSL K6-8394 | 3911 | 3682 | 3682 | 1124 | 2558 | 213 | 30.527 | 69.473 | 5.785 |
| Aeribacillus sp. FSL M8-0235 | 4012 | 3749 | 3749 | 1136 | 2613 | 153 | 30.301 | 69.699 | 4.081 |
| Aeribacillus sp. FSL M8-0254 | 4169 | 3940 | 3940 | 1144 | 2796 | 266 | 29.036 | 70.964 | 6.751 |
| Aeribacillus sp. FSL W8-0870 | 4343 | 4046 | 4046 | 1178 | 2868 | 81 | 29.115 | 70.885 | 2.002 |
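The three percentage columns of Table 6 appear to follow from the count columns: each is the corresponding CDS count divided by the CDS count without artefact families. The sketch below recomputes them for A. pallidus GS3372 from the counts listed in the table (5298 filtered CDS, 1223 core, 4075 variable, 3125 strain-specific). The function name `pan_genome_percentages` is hypothetical, and rounding to three decimals is an assumption chosen to match the precision shown in the table.

```python
from typing import Tuple


def pan_genome_percentages(
    filtered_cds: int,
    core: int,
    variable: int,
    strain_specific: int,
    ndigits: int = 3,
) -> Tuple[float, float, float]:
    """Return (core %, variable %, strain-specific %) of the filtered CDS."""
    if filtered_cds <= 0:
        raise ValueError("filtered_cds must be positive")
    # Each percentage uses the artefact-filtered CDS count as denominator.
    return tuple(
        round(100 * n / filtered_cds, ndigits)
        for n in (core, variable, strain_specific)
    )


# Counts for A. pallidus GS3372 as listed in Table 6.
gs3372 = pan_genome_percentages(5298, 1223, 4075, 3125)
print(gs3372)
```

For GS3372 this reproduces the listed values of 23.084, 76.916 and 58.985. In every row the core and variable counts sum to the filtered CDS count, so the first two percentages sum to 100, while strain-specific CDS are a subset of the variable fraction.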
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.


Yıldız, S.Y.; Radchenkova, N. From Genomes to Applications: Comparative Analysis of Aeribacillus pallidus Reveals a Thermophilic Chassis for Biotechnology. Appl. Sci. 2025, 15, 10866. https://doi.org/10.3390/app152010866
