1. Introduction
Wild relatives of cultivated legumes represent an essential component of plant genetic resources and play a key role in sustainable agriculture, biodiversity conservation, and crop improvement. These species often possess valuable adaptive traits such as tolerance to drought, nutrient-poor soils, pests, and diseases, making them particularly important under conditions of climate change and environmental stress [1,2]. In addition to their genetic value, wild legumes contribute significantly to ecosystem functioning through biological nitrogen fixation, improvement of soil fertility, and stabilization of plant communities in marginal environments [3,4].
Legume–microbe interactions play a fundamental role in soil ecological processes. Rhizosphere microorganisms influence nutrient cycling, plant growth, and stress tolerance, and shape plant distribution and population stability [5,6]. Bacterial groups such as Proteobacteria and Actinobacteriota are often associated with nitrogen fixation, nutrient mineralization, and plant growth promotion, while fungal communities contribute to organic matter decomposition, soil aggregation, and plant health [7,8]. Understanding soil microbiome composition associated with wild legumes is therefore essential for evaluating ecological adaptation and conservation potential.
Bulgaria represents one of the biodiversity-rich regions in Europe due to its diverse climatic conditions, complex geological structure, and heterogeneous landscapes. The country lies at the intersection of Mediterranean, continental, and steppe climatic zones, creating favorable conditions for the development of diverse plant communities and wild relatives of cultivated species [9,10]. Among these, wild legumes such as Pisum elatius, Cicer montbrettii, Vicia incisa, and species within the genus Lupinus represent valuable genetic resources with potential agricultural and ecological importance.
Pisum elatius M. Bieb. is considered one of the most important wild relatives of cultivated pea (Pisum sativum L.). The species is widely distributed in southeastern Europe and the Mediterranean region and typically occurs in dry, rocky habitats, grasslands, and shrub vegetation [11,12]. Wild populations of P. elatius demonstrate considerable morphological and ecological variability, reflecting adaptation to local environmental conditions. These populations are particularly valuable for breeding programs aimed at improving abiotic stress tolerance and disease resistance [13].
Cicer montbrettii Jaub. & Spach is a wild relative of chickpea and is endemic to southeastern Europe. In Bulgaria, the species is primarily associated with Strandzha Mountain habitats, where it occurs in forest–steppe and Mediterranean transitional ecosystems [14]. The species is adapted to marginal soils and environmental stress, making it an important genetic resource for chickpea improvement [15].
Vicia incisa, belonging to the Vicia sativa complex, is considered a rare species in Bulgaria and occurs mainly in Strandzha and Eastern Rhodope habitats. This species exhibits rapid growth, environmental adaptability, and high biomass production potential, making it suitable for forage production and soil improvement [16]. Additionally, species within the genus Lupinus have been reported in Strandzha and Eastern Rhodopes, although their distribution remains insufficiently studied.
The selected study regions (Kaliakra, Strandzha, and the Eastern Rhodopes) represent three ecologically contrasting environments. The Kaliakra region, located along the Northern Black Sea coast, is characterized by coastal steppe ecosystems, shallow rendzina soils, and strong maritime influence. The climate is relatively dry, with high solar radiation and strong winds. The geological substrate consists mainly of Sarmatian limestone and clay conglomerates, which form shallow soils with limited water retention [17,18]. These conditions create stressful environments that influence plant–microbe interactions. The Strandzha Mountain region is characterized by a transitional Mediterranean climate with higher precipitation and moderate temperatures. The soils are primarily cinnamon forest soils and yellow-earth podzolic soils rich in organic matter. Vegetation includes mesophilic forests and transitional ecosystems supporting diverse plant communities [19]. The Eastern Rhodopes represent volcanic landscapes with a Mediterranean-continental transitional climate, heterogeneous terrain, and diverse vegetation. These environmental gradients create multiple ecological niches for wild legumes and associated microbial communities.
Recent advances in next-generation sequencing technologies allow for comprehensive characterisation of soil microbial communities. Metagenomic approaches enable analysis of microbial diversity, ecological interactions, and functional potential of soil microbiomes [20,21]. Comparative microbiome studies across ecological gradients are particularly important for understanding plant–soil–microbe interactions and identifying microbial indicators of ecosystem health.
The present study contributes to agronomy by linking ecological and metagenomic characterisation of wild legume habitats with their potential application in sustainable agricultural systems. From an agronomic perspective, wild relatives of legumes represent an important reservoir of beneficial rhizosphere microorganisms. Understanding the structure and function of the rhizosphere microbiomes associated with these crop wild relatives provides opportunities for identifying beneficial microorganisms that can be developed as biofertilizers, biostimulants, or biocontrol agents, thereby contributing to improved soil fertility, nutrient use efficiency, crop productivity, and crop resilience under environmental stress conditions.
2. Materials and Methods
2.1. Study Regions and Site Selection
Field surveys were conducted across three ecologically distinct regions of Bulgaria that are known to host or potentially host wild legume populations: Cape Kaliakra (Northern Black Sea Coast), Eastern Rhodopes (Madzharovo volcanic basin), and Strandzha Mountain (forest–steppe transition zone). Site coordinates, elevation, habitat structure, and dominant vegetation were recorded at each location. Habitat identification was guided by mapping, historical floristic records, and previous observations from national biodiversity monitoring programs. The survey targeted priority wild Fabaceae taxa, including Pisum elatius, Cicer montbrettii, Vicia incisa, and species within the Vicia sativa complex. These species were selected due to their ecological importance, conservation relevance, and their role as wild relatives of economically important legume crops, as well as their occurrence across contrasting environmental gradients.
From each region, rhizosphere soil samples were collected from three independent locations associated with natural habitats of wild legume species. Metagenomic analyses were performed separately for each sampling site, and the results were subsequently combined for regional comparison. The sequencing data generated in this study are available in the NCBI Sequence Read Archive under BioProject PRJNA1444713 (Submission ID: SUB16088483).
2.2. Field Expeditions and Sampling Design
Two major fieldwork phases were conducted to capture the main stages of plant development: Phase I (May), during mass flowering and early pod formation, and Phase II (late June), during pod maturation and full seed ripening. Field surveys were conducted along two predefined expedition routes selected to encompass the ecological and geographical variability of the study region. These transects covered coastal, steppe, and inland habitats, allowing for a comprehensive assessment of population characteristics, habitat conditions, and associated plant communities (Figure 1). A total of 14 composite rhizosphere soil samples were collected (3–5 subsamples per site; four sites per region) from a depth of 2–5 cm, corresponding to the active root zone. Subsamples were homogenized to obtain representative composite samples, and the results are reported as aggregated values at the regional level (Supplementary Table S1).
Expedition route 1: Plovdiv → Kavarna → Cape Kaliakra → Bolata → Rusalka → Kamen Bryag → Yaylata → Tyulenovo → Shabla → Durankulak → Plovdiv.
This route traversed the Northern Black Sea coast, including Natura 2000 sites and protected areas characterized by coastal cliffs, steppe vegetation, and limestone plateaus. The chosen locations are known for hosting fragmented and vulnerable plant populations, making them particularly relevant for monitoring.
Expedition route 2: Plovdiv → Harmanli → Borislavtsi → Madzharovo → Stambolovo → Plovdiv.
This inland transect covered transitional Mediterranean and continental habitats, including semi-natural grasslands, rocky slopes, and anthropogenically influenced areas. These sites provided contrast to the coastal localities and allowed for evaluation of population responses under different climatic and land-use conditions.
At each site, a standardized protocol was applied to characterize both the plant populations and their environmental context. To assess population size, the number of individuals per patch was counted or estimated, depending on patch density. For clumped populations, individuals were quantified within clearly defined patch boundaries, whereas belt transects were used to estimate abundance in more dispersed populations. Patch size, continuity, and degree of fragmentation were recorded to evaluate population structure and potential isolation effects. To determine population-level reproductive status at the time of sampling, plants were assigned to phenological categories (vegetative growth, budding, flowering, fruiting, or senescence). Rapid floristic surveys were carried out within and around each population patch, and dominant and co-occurring species were noted to characterize community type, competition intensity, and habitat preferences. Habitat vulnerability indicators were systematically evaluated, including evidence of soil erosion, tourism pressure, grazing intensity (livestock tracks and browsing signs), and vegetation degradation. These parameters were important for identifying threats to population viability and habitat stability. Seed material was collected only when populations were sufficiently robust and reproductive output allowed for sustainable harvesting. All sampling activities complied with the ethical and regulatory requirements of the Ministry of Environment and Water, including adherence to national guidelines for the conservation of protected species and habitats. Collected seeds were stored under controlled conditions and used exclusively for scientific analysis.
2.3. Morphological Characterisation of Plant Populations
Morphological characterisation of Pisum elatius and co-occurring Fabaceae species was conducted through detailed field observations following established botanical and taxonomic guidelines [11,22,23]. For each population, vegetative and reproductive traits were recorded to capture intra- and interspecific variability relevant to species identification, ecotypic differentiation, and habitat-associated morphological patterns. Vegetative traits included stem architecture, noting whether plants exhibited an erect or procumbent growth habit, the degree of branching, and any structural adaptations typical of wild pea phenotypes. Leaf morphology was examined by documenting leaflet shape, size, and arrangement, together with the development and configuration of tendrils, which are important diagnostic features in the genus Pisum and related legumes.
Reproductive characteristics were also systematically recorded. These included flower morphology (color, corolla size, symmetry, and the number of flowers per inflorescence), all key descriptors in Fabaceae systematics [24]. Pod traits, such as length, cross-sectional shape, coloration, and dehiscence pattern, were assessed in mature individuals to support reliable species identification and comparison across habitats. Additionally, seed characteristics, including seed size, shape, surface texture, and pigmentation, were documented, as these traits often reflect both genetic lineage and environmental influences [1]. To support taxonomic verification and long-term documentation, herbarium vouchers were prepared for all recorded populations following standard herbarium protocols [25]. Vouchers were pressed, dried, labeled with georeferenced locality data, and deposited in institutional collections for future reference and comparative studies.
2.4. Soil Sampling and Physicochemical Measurements
To characterize the edaphic conditions associated with the studied plant populations, rhizosphere soil samples were collected from representative individuals of Pisum elatius, Cicer montbrettii, and Vicia incisa wherever these species occurred along the expedition routes. Sampling focused on the immediate root zone, where soil–plant–microbe interactions are most intensive, as emphasized by [5,26].
Rhizosphere soil was carefully removed from the upper 2–5 cm of the soil profile by gently shaking soil adhering to fine roots, following minimally disruptive protocols described by Vandenkoornhuyse et al. (2015) [
27]. For each habitat, three to five composite samples were collected. Each composite consisted of subsamples taken from multiple individuals within the same patch, ensuring representative coverage of local microhabitat heterogeneity. Samples were homogenized and stored in sterile polyethylene bags at 4 °C, following recommendations for minimizing post-collection changes in microbial activity [
28]. In the laboratory, soil samples were air-dried, sieved to 2 mm, and analysed for basic physicochemical characteristics. Soil pH was determined in a 1:2.5 soil-to-water suspension using a calibrated pH electrode, following ISO 10390 (2005) and the procedures outlined by Jones (2001) [
29,
30]. Electrical conductivity (EC) was measured in the same 1:2.5 extract with a conductivity meter and expressed in µS cm⁻¹, according to standard soil salinity assessment methods described by Rhoades (1996) [
31]. These parameters were used to characterize the edaphic properties of each habitat and to support ecological interpretation of population structure and microbial community differences. Soil pH and EC are widely recognized as key drivers of microbial community assembly, influencing nutrient cycling, enzymatic activity, and root–microbe interactions [
30,
32].
2.5. DNA Extraction and Amplicon Sequencing
Total genomic DNA was extracted from 0.25 g of each rhizosphere soil sample using the DNeasy PowerSoil Pro Kit (QIAGEN, Hilden, Germany), following the manufacturer’s protocol to obtain high-quality nucleic acids suitable for downstream molecular analyses. This extraction method combines mechanical bead-beating and chemical lysis to efficiently disrupt microbial cells embedded in soil particles and organic matter, thereby providing representative DNA extracts of the rhizosphere microbial community [
33]. Two taxonomically informative marker regions were selected for high-throughput amplicon sequencing. For bacterial community profiling, the V1–V3 hypervariable regions of the 16S rRNA gene were amplified using the universal primer pair 27F (5′-AGAGTTTGATCMTGGCTCAG-3′) and 534R (5′-ATTACCGCGGCTGCTGG-3′), which are widely used for assessing prokaryotic diversity across environmental samples [
34,
35]. For fungal community analysis, the ITS1 region of the ribosomal internal transcribed spacer (ITS) was amplified using the ITS1F (5′-CTTGGTCATTTAGAGGAAGTAA-3′) and ITS5 (5′-GGAAGTAAAAGTCGTAACAAGG-3′) primer pair, commonly employed for fungal diversity and taxonomic identification [
36,
37].
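The 27F primer listed above contains the IUPAC ambiguity code M (A or C), so it is synthesized as a mixture of sequences. The following minimal Python sketch expands a degenerate primer into the unambiguous sequences it encodes; the function name `expand_primer` is illustrative and is not part of the laboratory workflow described here.

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer: str) -> list[str]:
    """Return every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


PRIMER_27F = "AGAGTTTGATCMTGGCTCAG"   # sequence as given in the text
PRIMER_534R = "ATTACCGCGGCTGCTGG"     # sequence as given in the text

variants_27f = expand_primer(PRIMER_27F)
variants_534r = expand_primer(PRIMER_534R)

for seq in variants_27f:
    print(seq)
```

For 27F this yields two sequences that differ only at the M position, whereas 534R has no ambiguity codes and expands to itself.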
PCR amplification was performed using region-specific primers containing Illumina adapter overhang sequences to facilitate downstream library preparation following established amplicon sequencing workflows [
20,
38]. Amplification reactions were conducted in 25 μL volumes containing 12.5 μL of KAPA HiFi HotStart ReadyMix (Roche, Basel, Switzerland), 0.2 μM of each primer, and approximately 10 ng of template DNA. PCR cycling conditions included an initial denaturation at 95 °C for 3 min, followed by 25–30 cycles of denaturation at 95 °C for 30 s, annealing at 55 °C for 30 s, extension at 72 °C for 30 s, and a final extension at 72 °C for 5 min, following widely adopted high-fidelity amplification protocols for amplicon sequencing [
39].
PCR products were verified by electrophoresis on 1.5% agarose gels stained with GelRed nucleic acid stain (Biotium, Fremont, CA, USA). Amplicons were purified using AMPure XP beads (Beckman Coulter, Brea, CA, USA) to remove primer dimers and non-specific amplification products.
Purified amplicons were subsequently indexed using dual-index barcodes with the Nextera XT Index Kit (Illumina, San Diego, CA, USA) to enable multiplexing of samples in a single sequencing run. Dual indexing significantly reduces index hopping and barcode misassignment during high-throughput sequencing [
40]. Both libraries were quantified using the Qubit 4 Fluorometer (Thermo Fisher Scientific, Waltham, MA, USA) and normalized prior to pooling. Accurate library quantification ensures even representation of samples in multiplexed sequencing runs [
41,
42]. The pooled libraries were subjected to paired-end sequencing (2 × 300 bp) on the Illumina MiSeq platform (Illumina, San Diego, CA, USA), following standard manufacturer protocols. High-quality paired-end reads were generated to ensure robust coverage and accurate taxonomic assignment [
20,
42]. All sequencing procedures were carried out by a certified sequencing service provider, Novogene Europe (Cambridge, UK), which followed strict quality control standards for library preparation, cluster generation, and data output.
2.6. Library Preparation
Amplicon libraries were prepared following the standard Illumina amplicon sequencing workflow with minor modifications commonly applied in microbial community studies [
20,
42]. After successful PCR amplification of the bacterial 16S rRNA gene (V1–V3 regions) and fungal ITS1 region, library preparation was performed in several sequential steps. Initial PCR products were purified using magnetic bead–based cleanup with AMPure XP magnetic beads (Beckman Coulter, Brea, CA, USA) to remove primer dimers, residual nucleotides, enzymes, and non-specific amplification fragments.
Purified amplicons were subjected to a second PCR step to incorporate dual-index barcodes and Illumina sequencing adapters using the Nextera XT Index Kit (Illumina, San Diego, CA, USA). Dual indexing minimizes barcode misassignment and allows for accurate multiplexing of multiple samples within a single sequencing run [
40,
42]. Libraries were quantified using the Qubit dsDNA HS Assay Kit and Qubit 4 Fluorometer (Thermo Fisher Scientific, Waltham, MA, USA). Library fragment size distribution and quality were assessed using the Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). Accurate quantification and size verification are critical for ensuring even sample representation in multiplexed sequencing runs and improving downstream data quality [43]. Equimolar amounts of each indexed library were pooled into a single sequencing-ready mixture. The pooled libraries underwent final quality control to verify concentration, fragment size distribution, and overall integrity prior to sequencing. These procedures follow established best practices for amplicon sequencing and ensure generation of high-quality paired-end reads suitable for microbial community analysis [
20,
21]. Library preparation and sequencing were conducted according to Illumina recommendations and widely accepted protocols for microbial amplicon sequencing, ensuring reliable and reproducible characterisation of bacterial and fungal diversity in rhizosphere soil samples.
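Equimolar pooling combines a fluorometric concentration with a mean fragment size. The sketch below illustrates the usual conversion, assuming an average mass of 660 g/mol per base pair of double-stranded DNA. The library names, concentrations, fragment sizes, and the 40 fmol target are hypothetical values chosen for illustration and are not measurements from this study.

```python
def library_molarity_nm(conc_ng_per_ul: float, mean_size_bp: float) -> float:
    """Convert a dsDNA concentration (ng/µL) to nM, assuming 660 g/mol per bp."""
    return conc_ng_per_ul * 1e6 / (660.0 * mean_size_bp)


def equimolar_volumes(libraries: dict, target_fmol: float) -> dict:
    """Volume (µL) of each library that contributes target_fmol of DNA.

    `libraries` maps a name to (concentration in ng/µL, mean size in bp).
    Because 1 nM equals 1 fmol/µL, volume = target_fmol / molarity_nM.
    """
    volumes = {}
    for name, (conc, size) in libraries.items():
        volumes[name] = target_fmol / library_molarity_nm(conc, size)
    return volumes


# Hypothetical indexed libraries: (Qubit ng/µL, mean fragment size in bp).
example_libraries = {
    "lib_A": (6.6, 500),
    "lib_B": (3.3, 500),
}
pool_plan = equimolar_volumes(example_libraries, target_fmol=40.0)
for name, vol in pool_plan.items():
    print(f"{name}: {vol:.2f} µL")
```

In this toy case the library with half the concentration contributes twice the volume, so each library adds the same number of molecules to the pool.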
2.7. Bioinformatics and Sequence Processing
Bioinformatic processing of the Illumina-generated paired-end reads was performed using the QIIME 2 analytical platform (version 2024.10), which provides a reproducible, modular framework for amplicon-based microbial community analysis [
21]. Raw FASTQ files were demultiplexed and imported into QIIME 2, followed by standardized sequence quality control procedures. Initial inspection of read quality profiles guided trimming of low-quality regions and removal of sequencing artefacts. This step ensured that downstream analyses were based on high-quality sequence data and followed widely accepted recommendations for amplicon-based studies [
44]. Quality filtering and trimming were performed using QIIME 2 integrated tools, removing ambiguous bases, low-quality reads, and sequencing artefacts. High-resolution Amplicon Sequence Variants (ASVs) were inferred using the DADA2 algorithm implemented in the q2-dada2 plugin [
43]. The DADA2 workflow includes error-rate learning, dereplication, denoising, merging of paired-end reads, and removal of sequencing errors. This method improves taxonomic resolution compared to traditional Operational Taxonomic Unit (OTU) clustering and allows for identification of biologically meaningful sequence variants. During this process, chimeric sequences were identified and removed using the consensus method implemented in DADA2, reducing artificial inflation of diversity estimates and improving taxonomic accuracy. Taxonomic assignment of the resulting ASVs was conducted using a Naïve Bayes classifier implemented in the QIIME 2 feature-classifier plugin [
44]. For bacterial communities, 16S rRNA gene sequences were classified against the SILVA reference database (release 138.1), a curated and widely used resource in microbial ecology [
45]. Fungal ITS sequences were classified using the UNITE reference database (version 9.0), which provides curated species hypothesis clusters for fungal ITS regions [
46]. The resulting feature tables, taxonomic assignments, and representative sequences were exported for downstream ecological and statistical analyses, including alpha diversity, beta diversity, ordination analyses, and community composition comparisons.
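Merging paired-end reads during denoising only works if the truncated forward and reverse reads still overlap. The following sketch shows the arithmetic under stated assumptions: the amplicon length of 500 bp and the truncation lengths are hypothetical placeholders (the truncation settings used in this study are not given in the text), and the 12-nucleotide minimum overlap reflects the default of the DADA2 merging step as we understand it.

```python
def merged_overlap(amplicon_len: int, trunc_f: int, trunc_r: int) -> int:
    """Number of bases shared by forward and reverse reads after truncation.

    amplicon_len is the full amplicon length covered by the read pair
    (primer to primer). A negative value means the reads do not meet.
    """
    return trunc_f + trunc_r - amplicon_len


def can_merge(amplicon_len: int, trunc_f: int, trunc_r: int,
              min_overlap: int = 12) -> bool:
    """True if the truncated reads overlap by at least min_overlap bases."""
    return merged_overlap(amplicon_len, trunc_f, trunc_r) >= min_overlap


# Hypothetical settings for a ~500 bp amplicon sequenced as 2 x 300 bp.
for trunc_f, trunc_r in [(280, 240), (260, 240)]:
    overlap = merged_overlap(500, trunc_f, trunc_r)
    status = "ok" if can_merge(500, trunc_f, trunc_r) else "too short"
    print(f"truncate at {trunc_f}/{trunc_r}: overlap {overlap} nt ({status})")
```

A check of this kind is useful for long amplicons, where aggressive quality trimming can leave too little overlap for the reads to be joined.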
2.8. Alpha Diversity
Alpha diversity metrics were calculated to assess microbial richness and evenness within each rhizosphere soil sample. Multiple complementary diversity indices were applied to capture different aspects of within-sample microbial diversity. Species richness was estimated using the Chao1 index, a non-parametric estimator that accounts for the number of rare taxa and provides a more accurate approximation of true richness in undersampled or heterogeneous microbial communities [
47]. To evaluate both species richness and the relative distribution of taxa, the Shannon diversity index was calculated. This metric incorporates both abundance and evenness, offering a comprehensive measure of community complexity and diversity [
48]. The Shannon index is widely used in microbial ecology studies to assess differences in microbial community structure across environmental gradients [
49]. Additionally, the ACE (Abundance-based Coverage Estimator) index was calculated as an alternative richness estimator emphasizing the contribution of less abundant taxa. The ACE index improves sensitivity for detecting diversity patterns in microbial communities characterized by a large proportion of rare species, which is commonly observed in soil microbiomes [
50]. Alpha diversity metrics were calculated using QIIME 2 (https://qiime2.org, accessed on 20 September 2025) and further processed in R (version 4.3.2; R Foundation for Statistical Computing, Vienna, Austria; https://www.r-project.org/, accessed on 28 September 2025) using the phyloseq and vegan packages for ecological diversity analysis. These indices provided a robust characterisation of alpha diversity, allowing for comparison of microbial richness and community structure across the different habitats and plant-associated rhizospheres examined in this study.
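To make two of the indices above concrete, the following minimal Python sketch computes the Shannon index and the bias-corrected Chao1 estimator from a vector of ASV counts. It is an illustration only and not the QIIME 2 or vegan implementation. The natural logarithm is an assumption (other tools may use base 2), and the count vectors are hypothetical.

```python
import math


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    s_obs = sum(1 for c in counts if c > 0)   # observed taxa
    f1 = sum(1 for c in counts if c == 1)     # singletons
    f2 = sum(1 for c in counts if c == 2)     # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical ASV count vectors for two samples.
even_sample = [10, 10, 10, 10]
skewed_sample = [5, 3, 2, 1, 1, 1, 0]

print(round(shannon(even_sample), 4), chao1(even_sample))
print(round(shannon(skewed_sample), 4), chao1(skewed_sample))
```

For the perfectly even sample the Shannon index equals ln 4 and Chao1 equals the observed richness, while the three singletons in the skewed sample raise its Chao1 estimate above the six taxa actually observed.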
2.9. Beta Diversity and Statistical Analysis
Beta diversity analyses were performed to evaluate differences in microbial community composition among the various habitats and plant-associated rhizosphere samples. Community dissimilarity was quantified using Bray–Curtis dissimilarity, a widely applied abundance-based index that captures differences in community composition while being robust to variations in sampling depth and suitable for zero-inflated ecological datasets [
51]. This measure is particularly appropriate for microbial community data, where taxa abundance distributions are often uneven and dominated by rare species. Principal Coordinates Analysis (PCoA) was performed to visualize patterns of microbial community variation and identify major axes of compositional differences among samples. PCoA is a commonly used ordination method that transforms distance matrices into orthogonal axes representing variation in community composition [
52]. This approach facilitates interpretation of ecological gradients and clustering patterns associated with environmental factors and plant species. Beta diversity analyses were conducted using QIIME 2 (https://qiime2.org, accessed on 2 October 2025) and further visualized using R (version 4.3.2; R Foundation for Statistical Computing, Vienna, Austria; https://www.r-project.org/, accessed on 20 September 2025) with the vegan and phyloseq packages. These analyses provided a comprehensive assessment of beta diversity, enabling evaluation of how microbial communities differ across environmental gradients, geographical regions, and plant-associated rhizospheres. Statistical differences in microbial community composition among regions were assessed using PERMANOVA (Permutational Multivariate Analysis of Variance) based on Bray–Curtis dissimilarity matrices. Due to the aggregated nature of the dataset at the regional level, PERMANOVA was applied to evaluate overall compositional differences among regions. A significance level of p < 0.05 was used. Differences in taxa abundance were evaluated using the Kruskal–Wallis test followed by Dunn’s post hoc test with Benjamini–Hochberg correction. These non-parametric approaches are widely used for microbiome data because they do not assume normally distributed abundances.
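The Bray–Curtis and PERMANOVA steps described above can be illustrated with a short, dependency-free Python sketch. It is not the vegan or QIIME 2 implementation. It computes pairwise Bray–Curtis dissimilarities, the one-way pseudo-F statistic from the distance matrix, and a permutation p-value. The abundance table, the region labels, the 999 permutations, and the random seed are hypothetical choices made for illustration.

```python
import random


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den else 0.0


def distance_matrix(samples):
    """Square matrix of pairwise Bray-Curtis dissimilarities."""
    n = len(samples)
    return [[bray_curtis(samples[i], samples[j]) for j in range(n)]
            for i in range(n)]


def pseudo_f(dist, groups):
    """One-way PERMANOVA pseudo-F computed from a distance matrix."""
    n = len(groups)
    labels = sorted(set(groups))
    # Total sum of squares: squared distances over all pairs, divided by n.
    ss_total = sum(dist[i][j] ** 2
                   for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares, each group scaled by its own size.
    ss_within = 0.0
    for g in labels:
        idx = [i for i, lab in enumerate(groups) if lab == g]
        ss = sum(dist[i][j] ** 2 for k, i in enumerate(idx) for j in idx[k + 1:])
        ss_within += ss / len(idx)
    ss_between = ss_total - ss_within
    return (ss_between / (len(labels) - 1)) / (ss_within / (n - len(labels)))


def permanova(dist, groups, permutations=999, seed=0):
    """Return (pseudo-F, permutation p-value) for the given grouping."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)          # reassign labels at random
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical counts: three samples per region, four taxa.
samples = [
    [30, 5, 2, 1], [28, 6, 3, 1], [32, 4, 2, 2],      # region A
    [5, 30, 10, 2], [6, 28, 12, 1], [4, 33, 9, 3],    # region B
    [10, 10, 20, 8], [12, 9, 18, 9], [11, 12, 21, 7], # region C
]
groups = ["A"] * 3 + ["B"] * 3 + ["C"] * 3

dist = distance_matrix(samples)
f_stat, p_value = permanova(dist, groups)
print(f"pseudo-F = {f_stat:.2f}, p = {p_value:.3f}")
```

With only a few samples per group, the number of distinct label permutations is small, which limits the smallest p-value that the test can return.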
2.10. Differential Abundance and Biomarker Detection
To identify microbial taxa that were differentially abundant among habitats and host legume species, we employed LEfSe (Linear Discriminant Analysis Effect Size), a widely used method for biomarker discovery in microbial ecology [
53]. LEfSe integrates non-parametric statistical testing with effect-size estimation, enabling detection of biologically meaningful differences in microbial relative abundances across predefined groups. The workflow began with a Kruskal–Wallis test to identify taxa showing statistically significant differences among the studied habitats. A significance threshold of α = 0.05 was applied, consistent with recommended parameters for LEfSe-based analyses. Taxa passing this initial screening step were subsequently subjected to Linear Discriminant Analysis (LDA) to estimate the magnitude of their effect size, with an LDA score cutoff of 2.0 used to retain only robust and discriminatory biomarkers. This approach allowed us to pinpoint bacterial and fungal taxa that consistently characterized the microbial communities associated with environmental gradients across the Kaliakra, Strandzha, and Eastern Rhodopes regions. By integrating statistical significance with effect-size estimation, LEfSe provided a reliable means of identifying biomarkers that contribute to habitat-specific microbial signatures and potential ecological differentiation among legume-associated rhizosphere communities.
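The initial Kruskal–Wallis screening step can be sketched as follows. This is a simplified illustration and not the LEfSe code: it handles exactly three groups, matching the three regions compared here, so that the chi-square approximation with two degrees of freedom has the closed form exp(-H/2). The subsequent LDA effect-size step is not shown, and the relative abundances are hypothetical.

```python
import math


def kruskal_wallis_3(groups):
    """Kruskal-Wallis H and approximate p-value for exactly three groups."""
    if len(groups) != 3:
        raise ValueError("this sketch supports exactly three groups (df = 2)")
    values = sorted((v, gi) for gi, g in enumerate(groups) for v in g)
    n = len(values)
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and values[j][0] == values[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2       # mean of the 1-based ranks i+1..j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t           # accumulates the tie correction
        i = j
    rank_sums = [0.0, 0.0, 0.0]
    for (_, gi), r in zip(values, ranks):
        rank_sums[gi] += r
    h = (12 / (n * (n + 1))
         * sum(rs ** 2 / len(g) for rs, g in zip(rank_sums, groups))
         - 3 * (n + 1))
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:                  # all values identical
        return 0.0, 1.0
    h /= correction
    return h, math.exp(-h / 2)           # chi-square tail for df = 2


# Hypothetical relative abundances (%) of one taxon in three regions.
taxon_by_region = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h_stat, p_val = kruskal_wallis_3(taxon_by_region)
passes_screen = p_val < 0.05             # alpha as stated in the text
print(f"H = {h_stat:.2f}, p = {p_val:.4f}, retained = {passes_screen}")
```

Taxa that pass this screen would then be ranked by their LDA score. With few samples per group the chi-square approximation is coarse, so the p-values should be read as indicative.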
2.11. Ethical Considerations and Permit Compliance
All sampling activities were conducted in compliance with Bulgarian biodiversity conservation legislation. Collection of seeds and plant material followed the regulations of the Ministry of Environment and Water, Republic of Bulgaria, and adhered to established limits for sampling within protected habitats. No destructive sampling of endangered or legally protected species was undertaken.
4. Discussion
4.1. Regional Structuring of Rhizosphere Microbial Communities
The present study demonstrates clear regional differentiation of rhizosphere microbial communities associated with wild legume habitats across Kaliakra, Strandzha, and the Eastern Rhodopes. Beta diversity analysis based on Bray–Curtis dissimilarity, supported by PERMANOVA, confirmed that microbial community composition differs significantly among regions. These patterns indicate that geographical and environmental factors play a key role in shaping microbial assemblages.
The observed clustering of samples in the PCoA ordination reflects distinct compositional profiles among regions. Strandzha samples formed a well-defined cluster, whereas Kaliakra and Eastern Rhodopes showed partial overlap, indicating differences in community structure across environmental gradients. Such spatial structuring of soil microbiomes has been widely reported and is typically associated with variation in soil physicochemical properties, vegetation composition, and climatic conditions.
4.2. Ecological Differentiation of Wild Legume Habitats
The three investigated regions (Kaliakra, Strandzha, and the Eastern Rhodopes) represent ecologically contrasting environments that influence both plant populations and associated rhizosphere microbial communities. Coastal steppe habitats in Kaliakra are characterized by shallow calcareous soils, strong winds, and limited water availability, creating stressful environmental conditions for plant establishment. Such environments are known to promote adaptive traits in wild legumes and select for stress-tolerant microbial communities [11,54]. In contrast, the Eastern Rhodopes represent heterogeneous volcanic landscapes with transitional Mediterranean-continental climate conditions. The diverse terrain and vegetation provide multiple ecological niches, which likely contributed to the more balanced microbial communities observed in this region. Similar relationships between habitat heterogeneity and microbial diversity have been reported in Mediterranean ecosystems, where environmental variability promotes community stability and resilience [55]. The Strandzha region differs from the other two areas by its higher precipitation and forest-influenced ecosystems. These conditions support increased organic matter accumulation and more complex plant communities, which may explain the higher abundance of Actinobacteriota and enhanced microbial metabolic activity observed in this region. Previous studies have demonstrated that vegetation structure and organic matter availability strongly influence rhizosphere microbial composition [8,34].
4.3. Soil Properties and Microbial Community Structure
Soil physicochemical characteristics differed among the three regions and likely contributed to microbial community differentiation. Alkaline soils in Kaliakra favored Firmicutes-dominated communities, which are often associated with stress tolerance and drought adaptation. In contrast, Strandzha soils supported higher proportions of Actinobacteriota, commonly linked to organic matter decomposition and nutrient cycling. The Eastern Rhodopes showed more balanced bacterial community composition, suggesting more stable ecological conditions. Similar patterns have been observed in Mediterranean ecosystems, where soil heterogeneity and vegetation diversity influence microbial community assembly and ecosystem functioning [
5,
54]. These results suggest that environmental gradients across Bulgarian wild legume habitats shape microbial community composition while maintaining ecological functionality.
4.4. Taxa-Specific Variation and Ecological Implications
Significant differences in the relative abundance of selected taxa, including Bacillus and Fusarium, indicate region-specific microbial signatures. These taxa are widely reported as key components of soil microbial communities and are frequently associated with plant–microbe interactions.
Members of the genus Bacillus are commonly reported as plant growth-promoting bacteria with roles in nutrient mobilization, phytohormone production, and stress tolerance. Their occurrence across the studied habitats is consistent with their widespread presence in rhizosphere environments associated with wild legumes.
Similarly, Fusarium species are commonly detected in soil ecosystems and represent an important component of fungal communities. The variation in their relative abundance across regions reflects differences in microbial community structure under contrasting environmental conditions.
The observed differences in taxa abundance support the concept that rhizosphere microbial communities are shaped by both environmental filtering and plant-associated factors. The coexistence of bacterial and fungal taxa across all regions indicates that core microbial groups are maintained across habitats, while their relative abundances vary depending on local conditions.
4.5. Functional Prediction and Microbial Activity
Functional prediction indicated that chemoheterotrophy and aerobic chemoheterotrophy dominated across all regions, suggesting that carbon cycling is the primary predicted ecological process in rhizosphere soils. The higher abundance of these functions in Strandzha suggests enhanced organic matter decomposition under more humid conditions. Despite taxonomic differences among regions, functional profiles were relatively similar, indicating functional redundancy within microbial communities. Functional redundancy is considered an important mechanism for maintaining ecosystem stability under environmental fluctuations [
54]. These results suggest that wild legume habitats in Bulgaria support functionally stable microbial communities that contribute to plant adaptation, nutrient cycling, and ecosystem resilience. The microbial communities associated with wild legumes may therefore represent an important reservoir of beneficial microorganisms that could be exploited in agricultural systems. Genera such as
Bacillus,
Pseudomonas, and
Variovorax are widely recognized for their plant growth-promoting properties, including nitrogen fixation, phytohormone production, and stress mitigation. Their presence in natural habitats suggests that wild legumes serve as ecological niches for beneficial microbiota that could be harnessed for crop improvement. The observed differences in the relative abundance of
Bacillus and
Fusarium among regions highlight the influence of environmental and geographic factors on microbial community composition. The higher abundance of
Bacillus in certain regions may be associated with its well-documented role as a plant growth-promoting bacterium, contributing to nutrient cycling, phytohormone production, and stress tolerance. In contrast, the elevated abundance of
Fusarium in specific regions may reflect environmental conditions that are favorable for phytopathogenic fungi or differences in host plant–microbe interactions. These findings suggest that regional ecological factors shape the distribution of both beneficial and potentially harmful microorganisms, reinforcing the importance of local microbiome characterization for sustainable agricultural management.
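The reasoning behind functional redundancy can be illustrated with a minimal sketch: if regions differ more in taxonomic composition than in predicted functional composition, different taxa are presumably performing similar functions. The example below compares mean pairwise Bray-Curtis dissimilarity for the two kinds of profile. All abundance values are hypothetical placeholders chosen for illustration and are not data from this study; the helper names `bray_curtis` and `mean_pairwise` are likewise assumptions introduced here.

```python
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two equal-length abundance vectors."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den else 0.0


def mean_pairwise(profiles):
    """Mean Bray-Curtis dissimilarity over all pairs of regions."""
    pairs = list(combinations(sorted(profiles), 2))
    total = sum(bray_curtis(profiles[p], profiles[q]) for p, q in pairs)
    return total / len(pairs)


# Hypothetical relative abundances of three phyla per region (NOT study data).
taxa = {
    "Kaliakra": [0.50, 0.20, 0.30],
    "Strandzha": [0.20, 0.50, 0.30],
    "Eastern Rhodopes": [0.30, 0.30, 0.40],
}

# Hypothetical relative abundances of three predicted functions (NOT study data).
functions = {
    "Kaliakra": [0.45, 0.35, 0.20],
    "Strandzha": [0.50, 0.35, 0.15],
    "Eastern Rhodopes": [0.45, 0.40, 0.15],
}

tax_d = mean_pairwise(taxa)
fun_d = mean_pairwise(functions)

print(f"mean taxonomic dissimilarity:  {tax_d:.3f}")
print(f"mean functional dissimilarity: {fun_d:.3f}")
# A functional value well below the taxonomic value is the pattern
# that would be read as functional redundancy.
print("pattern consistent with redundancy:", fun_d < tax_d)
```

Such a comparison is only descriptive. Functional profiles inferred from taxonomy tend to be smoother than the taxonomic table they are derived from, so a lower functional dissimilarity should be treated as suggestive rather than conclusive.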
The observed pairwise differences suggest that the microbial communities in Kaliakra are more distinct, potentially reflecting unique environmental conditions such as soil properties, climate, or plant–microbe interactions. In contrast, the lack of significant differences between Strandzha and the Eastern Rhodopes indicates a greater degree of ecological similarity between these regions. This pattern supports the hypothesis that regional environmental gradients play a key role in shaping microbial community composition, with certain locations acting as distinct ecological niches.
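As an illustration of how a pairwise difference between two regions can be assessed, the sketch below implements a two-group permutation test on a PERMANOVA-style pseudo-F statistic computed from Bray-Curtis dissimilarities. It assumes this general family of test for illustration only and does not reproduce the exact procedure or software used in this study. The sample profiles are hypothetical and are not study data, and the function names (`pseudo_f`, `permutation_p`) are introduced here.

```python
import random
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two equal-length abundance vectors."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den else 0.0


def pseudo_f(dist, labels):
    """Pseudo-F from a symmetric distance matrix and one label per sample."""
    n = len(labels)
    groups = sorted(set(labels))
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))


def permutation_p(dist, labels, n_perm=999, seed=0):
    """Observed pseudo-F and its permutation p-value (labels are shuffled)."""
    rng = random.Random(seed)
    f_obs = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= f_obs:
            hits += 1
    return f_obs, (hits + 1) / (n_perm + 1)


# Hypothetical relative-abundance profiles, four samples per region (NOT study data).
samples = [
    [0.60, 0.30, 0.10], [0.55, 0.30, 0.15], [0.65, 0.25, 0.10], [0.58, 0.32, 0.10],
    [0.20, 0.30, 0.50], [0.25, 0.30, 0.45], [0.18, 0.32, 0.50], [0.22, 0.28, 0.50],
]
labels = ["Kaliakra"] * 4 + ["Strandzha"] * 4

# Full symmetric dissimilarity matrix.
dist = [[bray_curtis(a, b) for b in samples] for a in samples]

f_obs, p_value = permutation_p(dist, labels)
print(f"pseudo-F = {f_obs:.2f}, permutation p = {p_value:.3f}")
```

With only a few samples per region the number of distinct label arrangements is small, which limits how low the permutation p-value can go. When several region pairs are tested, a multiple-comparison correction would normally be applied.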
From an agronomic perspective, the identification of region-specific microbial taxa such as Bacillus supports the potential use of native microbiomes as sources of biofertilizers and biocontrol agents, while the distribution of Fusarium highlights potential phytosanitary risks associated with specific environments.
4.6. Implications for Conservation and Sustainable Agriculture
Wild legume populations represent valuable genetic and microbial resources. The observed microbial diversity associated with
Pisum elatius,
Cicer montbrettii, and
Vicia incisa highlights the ecological importance of these species. Conservation of wild legume habitats is therefore essential for preserving both plant genetic diversity and beneficial microbial communities. Integrating ecological, floristic, and metagenomic approaches provides a comprehensive framework for evaluating crop wild relatives. Such approaches are increasingly recommended for biodiversity conservation and sustainable agricultural applications [
13,
25].
The results demonstrate that wild legume habitats harbor diverse and functionally relevant microbial communities with potential applications in sustainable agriculture. These microbiomes may serve as sources of beneficial microorganisms for the development of biofertilizers, biostimulants, and biological control agents. Incorporating such microbiomes into agricultural systems may improve nutrient use efficiency, enhance plant resilience to abiotic stress, and reduce dependence on chemical inputs.
Despite the robustness of the applied analytical framework, certain limitations should be acknowledged. The study focuses primarily on taxonomic composition, and further research integrating functional metagenomics or transcriptomics would provide deeper insights into microbial activity and ecosystem functioning. Additionally, expanding the number of sampling sites and environmental parameters would help to better resolve the drivers of microbial community variation.
Future studies should also explore the functional roles of key taxa identified in this work, particularly in relation to plant health, nutrient cycling, and stress tolerance, to fully harness their potential in sustainable agricultural systems.
4.7. Limitations and Future Perspectives
Although the present study provides valuable insights into the structure and diversity of rhizosphere microbial communities, certain limitations should be acknowledged. The analysis was conducted at the regional level, and therefore the results primarily reflect overall patterns of community differentiation rather than within-region variability.
Future studies should include a larger number of sampling sites and replicate samples to better capture spatial variability and improve statistical resolution. Additionally, integration of functional metagenomics, transcriptomics, or metabolomics approaches would provide deeper insights into microbial activity and ecological functions. Research should also focus on the isolation and characterization of key microbial taxa identified in this study to evaluate their potential applications in agriculture, particularly in relation to plant growth promotion, stress tolerance, and disease suppression.