Article

Genetic Characterization of the Arabic-Speaking Population from the Casablanca-Settat Region Using Autosomal STR Markers: Understanding the Interplay of Geography and Language in Moroccan Population History

1 Laboratory of Biosciences, Integrated and Molecular Functional Exploration, Faculty of Sciences and Techniques, University Hassan 2nd Mohammedia, Casablanca 28650, Morocco
2 Biology, Health and Environment Team, Higher Institute of Nursing Professions and Health Techniques, Ouarzazate 45000, Morocco
3 Bio-Geosciences and Materials Engineering Laboratory, Higher Normal School, Hassan II University, Casablanca 21100, Morocco
4 Genetic Fingerprinting Department, National Scientific Police Laboratories, Casablanca 20250, Morocco
5 Institute of Criminalistics, Royal Gendarmerie, Rabat 10000, Morocco
* Author to whom correspondence should be addressed.
Forensic Sci. 2026, 6(1), 22; https://doi.org/10.3390/forensicsci6010022
Submission received: 12 January 2026 / Revised: 10 February 2026 / Accepted: 19 February 2026 / Published: 21 February 2026

Abstract

Background/Objectives: The Casablanca-Settat region of Morocco, located at the interface between Arab and Amazigh cultural zones, has only recently been investigated using autosomal short tandem repeat (STR) markers. The objective of this study was to characterize the genetic diversity and forensic efficiency of 15 autosomal STR loci in the Casablanca-Settat population and to evaluate its genetic relationships with other Moroccan populations. Methods: Fifteen autosomal STR loci were genotyped in 138 unrelated Arabic-speaking individuals from the Casablanca-Settat region. Allele frequencies, Hardy–Weinberg equilibrium, and standard forensic parameters were calculated. The genetic structure of the population was further examined through comparative analyses with 12 previously published Moroccan reference populations using multivariate and phylogenetic approaches. Results: A total of 146 distinct alleles were identified across the 15 loci. D18S51 was the most polymorphic marker (Ho = 0.9203), whereas D3S1358, TPOX, D5S818, and D16S539 exhibited lower allelic diversity. No statistically significant deviation from Hardy–Weinberg equilibrium was detected after correction for multiple testing. The combined power of discrimination exceeded 0.99, and the combined power of exclusion reached 0.99999965, demonstrating the high forensic efficiency of the STR panel. Population structure analyses positioned the Casablanca-Settat population within an intermediate genetic cluster, closely related to central Moroccan populations, consistent with historical gene flow and admixture. Conclusions: This study provides robust autosomal STR reference data for the Casablanca-Settat population, confirming the suitability of these markers for forensic identification in Morocco and offering valuable insights into regional population structure and genetic diversity.

1. Introduction

Genetic admixture among human populations has long been a subject of interest in anthropology, particularly in the context of the founder effect and genetic drift theories [1,2]. To assess genetic heterogeneity within and between populations [3,4], a range of molecular tools and statistical approaches are employed to better understand the underlying genetic structure and affinities among human groups. It is well established that geographically proximate populations tend to share greater genetic similarity, whereas this similarity typically decreases with increasing geographic distance [1,5].
Advancements in molecular biology, especially the use of highly polymorphic markers such as Short Tandem Repeats (STRs), have revolutionized the direct analysis of gene pools and gene flow among populations [6,7]. STR markers, due to their high mutation rates and codominant inheritance, are widely used in forensic investigations, kinship analysis, and population genetics [8,9]. Their utility in capturing both recent demographic events and deeper ancestral lineages makes them powerful tools for dissecting the complex genetic history of human populations [10].
This study focuses on the Casablanca-Settat region of Morocco, a geographically and ethnically diverse area with a rich history of migration and gene flow [11]. Located on the Atlantic coastal plain, this region has been shaped by multiple historical layers, including Arab, Amazigh (Berber), Sub-Saharan African, and European influences [12,13]. These historical migrations and socio-cultural interactions have blurred genetic boundaries between ethnic groups, so that social and cultural factors often distinguish groups more clearly than biological differences do.
Despite this complexity, certain practices such as endogamy and consanguinity have played a role in preserving local genetic signatures [11,14]. Recent studies have highlighted how these matrimonial behaviors influence genetic structure by increasing homozygosity and conserving lineage-specific alleles [14,15].
In this context, we present the first comprehensive autosomal STR dataset for an Arabic-speaking population from the Casablanca-Settat region. This study represents an ideal case to explore the relationship between linguistic affiliation (Arabophone) and genetic diversity, and to determine whether language correlates with distinct genetic signatures or whether genetic diversity transcends linguistic boundaries.
This study aims to investigate the genetic diversity of an Arabic-speaking population from the Casablanca-Settat region through the analysis of autosomal STR markers. We analyze allele distributions to understand the population’s genetic structure and historical admixture, and to test the usefulness of STRs for forensic and population studies in Morocco. This research helps expand knowledge of North African genetic diversity and supports the creation of regional forensic databases.

2. Materials and Methods

2.1. Population Samples

Saliva samples were collected from 146 healthy, unrelated Arabophone individuals whose four grandparents originated from localities in the Casablanca-Settat region. All procedures were carried out in strict accordance with ethical standards, and written informed consent was obtained from all participants. Participants were recruited from seven locations: El Jadida, Sidi Bennour, Zemmamra, Laounate, Hed Oulad Fraj, Sebt Saiss, and Beni Hellal. After quality control, eight of the 146 samples were excluded due to insufficient DNA quality or incomplete profiles, leaving 138 individuals for statistical analysis.

2.2. Ethics Statement

The study was conducted in accordance with the guidelines of the Biomedical Research Ethics Committee (CERBC) of Casablanca, Morocco, and complied with the principles of the Declaration of Helsinki (2008 version). Ethical approval was granted on 22 April 2022 (registration number N°6/19). Written informed consent was secured from all participants prior to sample collection.

2.3. DNA Extraction and STR Amplification

Genomic DNA was extracted from the collected buccal swabs using the manual phenol-chloroform-isoamyl alcohol method. The extracted DNA was quantified using a NanoDrop™ 2000/2000c spectrophotometer. A DNA input of 1 ng was used for the amplification of 15 STR loci and the Amelogenin marker included in the Investigator IDplex Plus Kit (Qiagen, Hilden, Germany), covering the following loci: D2S1338, D3S1358, D5S818, D7S820, D8S1179, D13S317, D16S539, D18S51, D19S433, D21S11, CSF1PO, FGA, TH01, TPOX, and vWA.
Samples were amplified in multiplex using fluorescently labeled primers. The PCR products were then analyzed using the 3500 Genetic Analyzer System (Applied Biosystems, Foster City, CA, USA), with POP-4™ polymer for the 3500/SeqStudio™ Flex and the 3500 Genetic Analyzer 8-Capillary Array, 36 cm (Applied Biosystems, Foster City, CA, USA). The SST 550 BTO size standard (Qiagen) was used to determine the fragment sizes. Allelic designations were obtained by comparison with the allelic ladder, and genotyping was performed using the GeneMapper ID-X software (v1.6, Applied Biosystems).

2.4. Forensic Parameter Estimation

Allele frequencies and Hardy–Weinberg equilibrium (HWE) were evaluated using GENEPOP software v.3.3 [16]. HWE was assessed by an exact test with an initial significance threshold of p < 0.05, and Bonferroni correction was applied to adjust for multiple testing across the 15 STR loci, resulting in a corrected significance level of p < 0.0033. Observed heterozygosity (Ho), expected heterozygosity (He), power of discrimination (PD), matching probability (PM), polymorphic information content (PIC), and power of exclusion (PE) were calculated using CERVUS software v.2.0.
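As an illustration of how the per-locus parameters listed above are commonly computed, the following Python sketch derives them from a list of genotypes at one locus. It is not the GENEPOP or CERVUS implementation. The function names `locus_parameters` and `bonferroni_threshold` are hypothetical, He is taken as the uncorrected value 1 − Σp², PIC follows the Botstein formula, and PE uses the Brenner and Morris approximation based on observed heterozygosity; the software used in this study may apply different estimators or sample-size corrections.

```python
from collections import Counter
from itertools import combinations


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


def locus_parameters(genotypes):
    """Forensic parameters for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    All formulas below are textbook definitions (an assumption, see lead-in).
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("at least one genotype is required")

    # Allele frequencies by direct counting over 2N chromosomes.
    allele_counts = Counter(allele for g in genotypes for allele in g)
    freqs = [count / (2 * n) for count in allele_counts.values()]

    # Genotype counts, ignoring the order of the two alleles.
    genotype_counts = Counter(tuple(sorted(g)) for g in genotypes)

    ho = sum(1 for a, b in genotypes if a != b) / n   # observed heterozygosity
    sum_p2 = sum(p * p for p in freqs)
    he = 1 - sum_p2                                   # expected heterozygosity
    pm = sum((c / n) ** 2 for c in genotype_counts.values())  # matching prob.
    pd = 1 - pm                                       # power of discrimination
    pic = 1 - sum_p2 - sum(
        2 * p * p * q * q for p, q in combinations(freqs, 2)
    )
    hom = 1 - ho
    pe = ho ** 2 * (1 - 2 * ho * hom ** 2)            # power of exclusion

    return {"Ho": ho, "He": he, "PM": pm, "PD": pd, "PIC": pic, "PE": pe}
```

With 15 loci, `bonferroni_threshold(0.05, 15)` gives approximately 0.0033, the corrected level quoted above.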

2.5. Inter-Population Analysis

In order to determine the genetic relationships of our sample with other populations, we compared our data with previously published STR databases from various Moroccan regions and two global populations (Table S1). Central Morocco was represented by datasets from the central and central-western regions [17], Chaouia-Ouardigha [18], west-central Morocco [19], and the combined regions of Doukkala and Jbala [20]. The Middle Atlas Mountains were represented by the Azrou population [21]. Southern Morocco included samples from the far south [22] and the Souss region [23]. Nationwide data covering various Moroccan regions were obtained from broader studies [24,25]. For external comparison, we included reference populations from Spain [26] and China [27].
Nei’s genetic distances between pairs of populations were estimated using the dist.genpop function in the R package adegenet v.2.1.11, and Neighbor-Joining (NJ) phylogenetic trees were constructed using MEGAX software (v.12.0.1). To display the genetic relationships between populations visually, a two-dimensional Principal Coordinate Analysis (PCoA) based on Nei’s distance matrix was performed using Python v.3.12.4 (skbio library). The eigenvalues were calculated to determine the proportion of total genetic variation captured by each axis, which indicates how faithfully the two-dimensional plot represents the distance matrix. Scatter plots of Principal Coordinate 1 (PCo1) and Principal Coordinate 2 (PCo2), the two axes capturing the largest share of the genetic variation, were then generated.
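For readers unfamiliar with the distance underlying these analyses, the sketch below computes Nei's standard genetic distance, D = −ln(I), from per-locus allele frequencies in plain Python. It assumes that this is the variant of Nei's distance used in the analysis (the method option passed to dist.genpop is not stated here). The function names and the data layout are hypothetical, and the code is not the adegenet implementation.

```python
import math


def nei_distance(pop_x, pop_y):
    """Nei's standard genetic distance between two populations.

    pop_x, pop_y: dict mapping locus name -> dict of allele -> frequency.
    Only loci present in both populations are used.
    """
    loci = sorted(set(pop_x) & set(pop_y))
    if not loci:
        raise ValueError("the populations share no loci")

    jx = jy = jxy = 0.0
    for locus in loci:
        fx, fy = pop_x[locus], pop_y[locus]
        alleles = set(fx) | set(fy)
        jx += sum(fx.get(a, 0.0) ** 2 for a in alleles)
        jy += sum(fy.get(a, 0.0) ** 2 for a in alleles)
        jxy += sum(fx.get(a, 0.0) * fy.get(a, 0.0) for a in alleles)

    if jxy == 0.0:
        return math.inf  # no shared alleles at any locus

    # Averaging over loci cancels in the ratio, so plain sums suffice.
    identity = jxy / math.sqrt(jx * jy)
    return -math.log(identity)


def distance_matrix(populations):
    """Symmetric matrix of pairwise distances.

    populations: dict mapping population label -> frequency dict as above.
    Returns (labels, matrix) with matrix as a list of lists.
    """
    labels = list(populations)
    matrix = [[0.0] * len(labels) for _ in labels]
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            d = nei_distance(populations[labels[i]], populations[labels[j]])
            matrix[i][j] = matrix[j][i] = d
    return labels, matrix
```

A matrix of this form is the kind of input that a PCoA or an NJ tree builder takes; identical frequency profiles give a distance of zero, and the distance grows as the populations share fewer alleles.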

3. Results

3.1. Allele Frequency Distribution and Forensic Parameters

To characterize the genetic structure of 138 individuals from the Casablanca-Settat region of Morocco, 15 autosomal STR loci were analyzed. Across the 15 STR loci, 146 alleles were identified with frequencies ranging from 0.0068 to 0.4658. Supplementary File S1 summarizes the allele frequency data for 15 autosomal STR loci collected from 138 unrelated individuals. Table 1 provides an overview of the main forensic parameters. The most polymorphic marker was D18S51 (Ho = 0.9203), while the least polymorphic loci were D3S1358 and TPOX, each with only seven distinct alleles. Observed heterozygosity values varied between 0.7029 (TPOX) and 0.9203 (D18S51), with a mean of approximately 0.80, confirming a high level of genetic diversity. No statistically significant deviation from Hardy–Weinberg equilibrium was detected after Bonferroni correction. A positive correlation between allele number and PIC was observed, supporting the informativeness of highly polymorphic loci.
The power of discrimination (PD) ranged from 0.8573 (TPOX) to 0.9679 (D18S51), and the power of exclusion (PE) varied from 0.5741 to 0.9935 (Table 2). The combined power of exclusion (CPE) was 0.99999965, and the combined matching probability (CMP) was 3.85 × 10⁻¹⁸, confirming the robustness of this marker set for forensic applications. Overall, the combined power of discrimination (CPD) was virtually 100%, highlighting the efficiency of these markers for both forensic and population genetics studies in the Moroccan population.
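The combined parameters follow from the per-locus values under the usual assumption that the loci are independent. The short Python sketch below shows the standard product formulas; the function names are illustrative, and no per-locus values from Table 1 or Table 2 are reproduced here.

```python
import math


def combined_matching_probability(pm_values):
    """CMP: product of the per-locus matching probabilities."""
    return math.prod(pm_values)


def combined_power_of_discrimination(pm_values):
    """CPD: probability that two unrelated profiles differ at some locus."""
    return 1 - combined_matching_probability(pm_values)


def combined_power_of_exclusion(pe_values):
    """CPE: probability that at least one locus excludes a non-parent."""
    return 1 - math.prod(1 - pe for pe in pe_values)
```

In double-precision arithmetic, 1 − 3.85 × 10⁻¹⁸ evaluates to exactly 1.0, which is one reason the CPD is usually reported as virtually 100% and the CMP is the more informative figure to quote.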

3.2. Interpopulation Analysis

Population relationships were examined using pairwise Nei’s genetic distances (Table 3), which were graphically represented by means of Principal Coordinates Analysis (PCoA) (Figure 1) and a Neighbor-Joining (NJ) phylogenetic tree (Figure 2). These complementary approaches allowed us to situate the Arabic-speaking Casablanca-Settat population (P9) within its broader geographical, linguistic, and historical context.
In the PCoA (Figure 1), the first two coordinates accounted for 81.13% of the total genetic variance (70.80% explained by Coordinate 1 and 10.33% by Coordinate 2). This high cumulative percentage, derived from the calculated eigenvalues, indicates that the two-dimensional plot faithfully represents the population structure. The study population (P9) occupied a central position within the Moroccan cluster, showing its lowest genetic distances with Moroccan Arabs WC1 (0.0184) and WC3 (0.0206). Similarly, the phylogenetic tree (Figure 2) revealed that P9 clusters tightly with various Moroccan groups, including both Berber and Arab populations from different regions of the country. This close clustering reflects a strong genetic relatedness likely shaped by shared ancestry and geographic proximity. The Spanish population was positioned on a neighboring branch, suggesting a moderate level of genetic affinity, possibly reflecting historical gene flow across the Mediterranean. In contrast, the Chinese population formed a distinct and distant branch, clearly separated along Coordinate 1, highlighting the significant genetic divergence between the Moroccan groups and this external reference population.

4. Discussion

In the present study, the analyzed STR loci demonstrated high forensic efficiency, as reflected by their strong discriminatory power across all markers (Table 1). This observation aligns well with previous reports on other Moroccan populations, including Chaouia [15], Central Morocco [18], and Souss [23]. In particular, the Souss population exhibited remarkable allelic diversity at the D18S51 locus, a pattern also observed in our studied population. Moreover, loci such as D19S433, FGA, and D18S51 consistently showed high heterozygosity values (>0.85), underscoring their pronounced polymorphism and reinforcing their relevance for both forensic identification and population genetic studies throughout Morocco.
From a forensic perspective, the high levels of polymorphism and heterozygosity observed across the analyzed STR loci highlight the importance of generating region-specific allele frequency data for Morocco. The present dataset provides a reliable reference for the Casablanca-Settat region, supporting accurate statistical calculations in individual identification, kinship analysis, and paternity testing. Given the demographic complexity and internal migration characterizing this region, the use of localized population data is essential to avoid biased likelihood estimates that may arise from extrapolating frequencies from geographically or genetically distant groups. In this context, our results represent a valuable contribution toward the development and refinement of a national Moroccan STR database, enhancing both the reliability and equity of forensic genetic applications.
The genetic structure observed in the Casablanca-Settat population (P9) reflects both its historical and geographical context. Its central position in multivariate analyses (Figure 1) indicates a genetically intermediate profile relative to other Moroccan groups, consistent with the region’s role as a contact zone between Arab and Amazigh (Berber) communities [29]. This pattern aligns with the heterogeneous genetic character of North African populations, shaped by long-term migrations and cultural interactions [30,31].
The Arabic-speaking Casablanca-Settat population (study population) occupies an intermediate position within the Moroccan cluster, showing close genetic affinities with both Arab- and Berber-speaking groups from different regions of the country.
Phylogenetic analysis demonstrates that the population from the Casablanca-Settat region occupies a pivotal, intermediate position within the Moroccan genetic cluster. It exhibits its strongest genetic affinity with Arabic-speaking groups from West-Central Morocco, specifically “Moroccan Arabs WC 3” (P8) and “Moroccan Arabs WC 2” (P14). This proximity is consistent with the geographical (Figure 3) and linguistic profile of P9, suggesting a shared genetic heritage among Arabic-speaking populations in the West-Central Atlantic plains. Minor branching patterns observed in the NJ tree likely reflect local demographic histories, such as endogamy, tribal structure, or regional isolation, rather than deep genetic divergence [11,32].
Interestingly, Berber-speaking populations from Azrou (Middle Atlas) and from southern Morocco cluster in close proximity to the Arabic-speaking Casablanca-Settat population, indicating a transitional or admixed genetic core linking Arabophone and Amazigh groups. This close genetic affinity suggests substantial shared ancestry and long-term admixture despite linguistic differentiation, reinforcing the notion of limited genetic discontinuity among Moroccan populations [13,14,33]. This finding supports the hypothesis that Arabization in Morocco was predominantly cultural and linguistic rather than genetic [17,18,23].
The clustering of Moroccan Mixed (P2, P3) populations, which include individuals with diverse Moroccan origins, likely represents an average Moroccan genetic background shaped by multiple ancestral contributions [13,32]. The close positioning of the study population (P9) to these groups further supports its intermediate genetic status within the Moroccan genetic landscape [17,18,23].
The Spanish reference population branches adjacent to the Moroccan cluster, indicating a moderate level of genetic affinity between populations from Morocco and the Iberian Peninsula. This proximity is consistent with documented historical contacts across the Strait of Gibraltar, including trans-Mediterranean migrations and exchanges during different historical periods, notably the Islamic presence in Al-Andalus (8th–15th centuries) [34]. Similar signatures have been documented in Y-chromosome studies [35,36]. In contrast, the Chinese population forms a clearly distinct and distant branch, serving as an external outgroup and confirming the robustness of the genetic distance estimates and the overall tree topology.
The observed proximity between the Arabic-speaking Casablanca-Settat group and the Berber-speaking groups from Azrou and Souss likely reflects Morocco’s long history of population mobility and interaction shaped by migrations, trade routes, and tribal confederations (Figure 1). This close genetic affinity supports the existence of extensive historical gene flow between regions traditionally inhabited by Arab and Berber communities. Moreover, the gradient observed from the Berber mountain groups (Azrou) to the Arab-dominant plains (Chaouia) reveals a clinal pattern of genetic variation [4,37], a well-documented feature in North African genetic structure [10,14]. Such clines generally result from continuous gene flow across geographic and cultural boundaries rather than sharp population divisions. In this context, Morocco’s genetic landscape appears to be shaped by gradual transitions and admixture, reflecting complex demographic dynamics rather than discrete, isolated clusters.
While the use of STR markers provides a robust and validated framework for assessing genetic diversity and is the gold standard for forensic identification, this approach has certain inherent limitations when exploring complex evolutionary histories. The current study, based on a standard STR panel, primarily captures recent demographic events and overall population affinities. However, the moderate mutation rate and limited number of loci compared to genome-wide assays may constrain the ability to detect more ancient or subtle admixture events. Therefore, our results should be viewed as a foundational genetic characterization of the “ThisStudyWC4” (P9) population within the current Moroccan forensic framework.
To refine the resolution of these findings, future research would benefit from the integration of high-density SNP data or Whole-Genome Sequencing (WGS). These high-resolution markers would allow for more sophisticated analyses, such as calculating ancestral proportions and dating gene flow events with greater precision. Furthermore, the current analysis could be strengthened by including sub-Saharan African reference populations to better quantify the historical gene flow from sub-Saharan Africa that has shaped North African genomes.
Finally, as this study focuses on autosomal variation, it does not distinguish between maternal and paternal demographic histories. The future inclusion of uniparental markers (mtDNA and Y-STRs) will be essential to identify potential sex-biased migration patterns and to reconstruct the lineage-specific contributions of Arab and Berber groups. Such multi-marker approaches will ultimately provide a more comprehensive picture of Moroccan population dynamics and further enhance the national forensic genetic databases.

5. Conclusions

This study refines our understanding of Morocco’s genetic landscape by characterizing an Arabic-speaking population from the Casablanca-Settat region. The findings demonstrate that a shared linguistic identity does not necessarily imply genetic homogeneity.
Overall, the data suggest a clinal gradient extending from the Middle Atlas through southern Morocco to the Atlantic plains, shaped by migration, admixture and sociopolitical history. Additionally, both trans-Mediterranean affinities (linking Morocco and Iberia) and east–west differentiation underscore Morocco’s role as both a crossroads and a barrier between Europe and sub-Saharan Africa.
These findings highlight the complex interplay of language, geography, and history in shaping North African genetic diversity. Future genome-wide and uniparental marker studies, integrated with linguistic, archaeological and historical evidence, will be essential to fully elucidate the demographic history and microevolutionary processes underlying this mosaic ancestry.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/forensicsci6010022/s1. Table S1. Observed allele frequencies for 15 Short Tandem Repeat (STR) loci in the study population (N = 138). Allele frequencies were calculated using the direct counting method (2N = 276 chromosomes). This table presents raw observed frequencies. These raw frequencies include all detected variants, including rare alleles with a count of 1 (n = 1) or 2 (n = 2), which correspond to observed frequencies of approximately 0.0036 and 0.0072, respectively. Note that these raw values differ from the adjusted frequencies used in the main analysis (e.g., Table 1), where a conservative Minimum Allele Frequency (MAF) threshold of (5/2N) (0.0181) was applied to rare alleles in accordance with NRC II recommendations.

Author Contributions

Conceptualization, B.E.H., H.Y., and T.F.; Methodology, O.E., F.C., and B.E.H.; Formal analysis and investigation, O.E., F.C., A.H., and B.E.H.; Writing—original draft preparation, O.E., H.E.O., and B.E.H.; Writing—review and editing, O.E., A.H., and B.E.H.; Supervision, B.E.H., H.E.O., and T.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Biomedical Research Ethics Committee (CERBC) of Casablanca, Morocco (CERBC/06/19; 22 April 2022).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study. Written informed consent has been obtained from the participants to publish this paper.

Data Availability Statement

Data and materials will be made available upon request.

Acknowledgments

The authors are grateful to all subjects for their contribution and participation in this study.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

DNA    Deoxyribonucleic Acid
HWE    Hardy–Weinberg Equilibrium
NJ     Neighbour-Joining
PCA    Principal Component Analysis
PCR    Polymerase Chain Reaction
PD     Power of Discrimination
PIC    Polymorphism Information Content
PM     Matching Probability
STR    Short Tandem Repeat
TPI    Typical Paternity Index

References

  1. Rando, J.C.; Pinto, F.; González, A.M.; Hernández, M.; Larruga, J.M.; Cabrera, V.M.; Bandelt, H.J. Mitochondrial DNA Analysis of Northwest African Populations Reveals Genetic Exchanges with European, near-Eastern, and Sub-Saharan Populations. Ann. Hum. Genet. 1998, 62, 531–550.
  2. Serradell, J.M.; Lorenzo-Salazar, J.M.; Flores, C.; Lao, O.; Comas, D. Modelling the Demographic History of Human North African Genomes Points to a Recent Soft Split Divergence between Populations. Genome Biol. 2024, 25, 201.
  3. Claerhout, S.; Verstraete, P.; Warnez, L.; Vanpaemel, S.; Larmuseau, M.; Decorte, R. CSYseq: The First Y-Chromosome Sequencing Tool Typing a Large Number of Y-SNPs and Y-STRs to Unravel Worldwide Human Population Genetics. PLoS Genet. 2021, 17, e1009758.
  4. Handley, L.J.L.; Manica, A.; Goudet, J.; Balloux, F. Going the Distance: Human Population Genetics in a Clinal World. Trends Genet. 2007, 23, 432–439.
  5. Vicente, M.; Schlebusch, C.M. African Population History: An Ancient DNA Perspective. Curr. Opin. Genet. Dev. 2020, 62, 8–15.
  6. Butler, J.M. Genetics and Genomics of Core Short Tandem Repeat Loci Used in Human Identity Testing. J. Forensic Sci. 2006, 51, 253–265.
  7. Micka, K.; Sprecher, C.; Lins, A.; Comey, C.; Koons, B.; Crouse, C.; Endean, D.; Pirelli, K.; Lee, S.; Duda, N.; et al. Validation of Multiplex Polymorphic STR Amplification Sets Developed for Personal Identification Applications. J. Forensic Sci. 1996, 41, 582–590.
  8. Vajpayee, K.; Dash, H.R.; Parekh, P.B.; Shukla, R.K. PCR Inhibitors and Facilitators—Their Role in Forensic DNA Analysis. Forensic Sci. Int. 2023, 349, 111773.
  9. Zuniga, A.; Molina, Y.; Amaya, K.; Moya, Z.; Soriano, P.; Pineda, D.; Pinto, Y.; Zablah, I. Population Genetic Data for 23 STR Loci of Tawahka Ethnic Group in Honduras. Forensic Sci. 2025, 5, 72.
  10. Attaoui, A.; Foddha, H.; Othman, H.; Ben Abdennebi, H.; Haj Khelil, A. Utility of Regional STR Marker Variations in Tunisian and Sub-Saharan Populations: Insights into Forensic and Population Genetics. Front. Bioinform. 2025, 5, 1550730.
  11. Khair, A.E.; Dahbi, N.; Cheffi, K.; Talbi, J.; Hilali, A.; Ossmani, H.E. The Endogamous Marriage in the Population of Doukkala (Morocco). Interdiscip. Soc. Stud. 2023, 2, 1877–1882.
  12. Habibeddine, L.; Ouardani, M.; El Ossmani, H.; Amzazi, S.; Talbi, J. Study of Consanguinity of the Population of Northern Morocco. J. Forensic Res. 2018, 9.
  13. Moral, P.; Kandil, M.; Fernandez-Santander, A.; Esteban, E.; Valveny, N. The History of Iberian and Moroccan Populations: Evidence from Genetic Data (DNA Studies and Classical Polymorphisms). In Prehistoric Iberia: Genetics, Anthropology, and Linguistics; Arnaiz-Villena, A., Martínez-Laso, J., Gómez-Casado, E., Eds.; Springer: Boston, MA, USA, 2000; pp. 51–64. ISBN 978-1-4615-4231-5.
  14. Arauna, L.R.; Mendoza-Revilla, J.; Mas-Sandoval, A.; Izaabel, H.; Bekada, A.; Benhamamouch, S.; Fadhlaoui-Zid, K.; Zalloua, P.; Hellenthal, G.; Comas, D. Recent Historical Migrations Have Shaped the Gene Pool of Arabs and Berbers in North Africa. Mol. Biol. Evol. 2017, 34, 318–329.
  15. Cheffi, K.; El Khair, A.; Dahbi, N.; Talbi, J.; Hilali, A.; El Ossmani, H. Genetic Analysis Based on 15 Autosomal Short Tandem Repeats (STRs) in the Chaouia Population, Western Center Morocco, and Genetic Relationships with Worldwide Populations. Mol. Genet. Genomics 2023, 298, 931–941.
  16. Raymond, M.; Rousset, F. GENEPOP (Version 1.2): Population Genetics Software for Exact Tests and Ecumenicism. J. Hered. 1995, 86, 248–249.
  17. Gaibar, M.; Esteban, M.E.; Via, M.; Harich, N.; Kandil, M.; Fernández-Santander, A. Usefulness of Autosomal STR Polymorphisms beyond Forensic Purposes: Data on Arabic- and Berber-Speaking Populations from Central Morocco. Ann. Hum. Biol. 2012, 39, 297–304.
  18. Essoubaiy, O.; Gazzaz, B.; Yahia, H.; El Ossmani, H.; Talbi, J.; El Houate, B.; Fechtali, T. Anthropogenetic Study of the Arabic—Speaking Population of Chaouia Ouardigha (Morocco) Based on Autosomal STRs. Egypt. J. Forensic Sci. 2024, 14, 17.
  19. El Khair, A.; Cheffi, K.; Dahbi, N.; Talbi, J.; Hilali, A.; El Ossmani, H. Exploring the Genetic Landscape of the Doukkala Population (Morocco) Using 15 Autosomal Short Tandem Repeats (STRs). Span. J. Leg. Med. 2024, 50, 54–61.
  20. Reguig, A.; El Houate, B.; Yahia, H.; Naasse, Y.; Rouba, H.; Harich, N. Genetic Diversity of 15 Autosomal STR Loci in a Moroccan Population. Int. J. Sci. Res. Publ. 2015, 323.
  21. El Ossmani, H.; Bouchrif, B.; Aboukhalid, R.; Bouabdillah, M.; Gazzaz, B.; Zaoui, D.; Chafik, A.; Talbi, J. Assessment of Phylogenetic Structure of Berber-Speaking Population of Azrou Using 15 STRs of Identifiler Kit. Leg. Med. 2010, 12, 52–56.
  22. Ossmani, H.E.; Talbi, J.; Bouchrif, B.; Chafik, A. Allele Frequencies of 15 Autosomal STR Loci in the Southern Morocco Population with Phylogenetic Structure among Worldwide Populations. Leg. Med. 2009, 11, 155–158.
  23. Dahbi, N.; Cheffi, K.; El Khair, A.; Habbibeddine, L.; Talbi, J.; Hilali, A.; El Ossmani, H. Genetic Characterization of the Berber-Speaking Population of Souss (Morocco) Based on Autosomal STRs. Mol. Genet. Genomic Med. 2023, 11, e2156.
  24. Bentayebi, K.; Abada, F.; Ihzmad, H.; Amzazi, S. Genetic Ancestry of a Moroccan Population as Inferred from Autosomal STRs. Meta Gene 2014, 2, 427–438.
  25. Bouabdellah, M.; Ouenzar, F.; Aboukhalid, R.; Elmzibri, M.; Squalli, D.; Amzazi, S. STR Data for the 15 AmpFlSTR Identifiler Loci in the Moroccan Population. Forensic Sci. Int. Genet. Suppl. Ser. 2008, 1, 306–308.
  26. Camacho, M.V.; Benito, C.; Figueiras, A.M. Allelic Frequencies of the 15 STR Loci Included in the AmpFlSTR Identifiler PCR Amplification Kit in an Autochthonous Sample from Spain. Forensic Sci. Int. 2007, 173, 241–245.
  27. Deng, Y.; Zhu, B.; Shen, C.; Wang, H.; Huang, J.; Li, Y.; Qin, H.; Mu, H.; Su, J.; Wu, J.; et al. Genetic Polymorphism Analysis of 15 STR Loci in Chinese Hui Ethnic Group Residing in Qinghai Province of China. Mol. Biol. Rep. 2011, 38, 2315–2322.
  28. Coudray, C.; Guitard, E.; Keyser-Tracqui, C.; Melhaoui, M.; Cherkaoui, M.; Larrouy, G.; Dugoujon, J.-M. Population Genetic Data of 15 Tetrameric Short Tandem Repeats (STRs) in Berbers from Morocco. Forensic Sci. Int. 2007, 167, 81–86.
  29. Stepanova, A.V. Origin of the Berber Tribal Confederation of Ṣanhādja. Kalmyk Sci. Cent. Russ. Acad. Sci. PAH 2018, 11, 2–13.
  30. Arredi, B.; Poloni, E.S.; Paracchini, S.; Zerjal, T.; Fathallah, D.M.; Makrelouf, M.; Pascali, V.L.; Novelletto, A.; Tyler-Smith, C. A Predominantly Neolithic Origin for Y-Chromosomal DNA Variation in North Africa. Am. J. Hum. Genet. 2004, 75, 338–345.
  31. Ennafaa, H.; Cabrera, V.M.; Abu-Amero, K.K.; González, A.M.; Amor, M.B.; Bouhaha, R.; Dzimiri, N.; Elgaaïed, A.B.; Larruga, J.M. Mitochondrial DNA Haplogroup H Structure in North Africa. BMC Genet. 2009, 10, 8.
  32. Vilà-Valls, L.; Abdeli, A.; Lucas-Sánchez, M.; Bekada, A.; Calafell, F.; Benhassine, T.; Comas, D. Understanding the Genomic Heterogeneity of North African Imazighen: From Broad to Microgeographical Perspectives. Sci. Rep. 2024, 14, 9979.
  33. Falchi, A.; Giovannoni, L.; Calo, C.M.; Piras, I.S.; Moral, P.; Paoli, G.; Vona, G.; Varesi, L. Genetic History of Some Western Mediterranean Human Isolates through mtDNA HVR1 Polymorphisms. J. Hum. Genet. 2006, 51, 9–14.
  34. Haghnavaz, J.; Alerasoul, S. Islam in Andalusia (Historical Spain). Res. J. Humanit. Soc. Sci. 2014, 5, 384–388.
  35. Bosch, E.; Calafell, F.; Pérez-Lezaun, A.; Clarimón, J.; Comas, D.; Mateu, E.; Martínez-Arias, R.; Morera, B.; Brakez, Z.; Akhayat, O.; et al. Genetic Structure of North-West Africa Revealed by STR Analysis. Eur. J. Hum. Genet. 2000, 8, 360–366.
  36. Navarro-López, B.; Baeta, M.; Moreno-López, O.; Kleinbielen, T.; Raffone, C.; Granizo-Rodríguez, E.; Ferragut, J.F.; Alvarez-Gila, O.; Barbaro, A.; Picornell, A.; et al. Y-Chromosome Analysis Recapitulates Key Events of Mediterranean Populations. Heliyon 2024, 10, e35329.
  37. Luis, J.R.; Lacau, H.; Fadhlaoui-Zid, K.; Alfonso-Sanchez, M.A.; Garcia-Bertrand, R.; Herrera, R.J. Afghanistan: Conduits of Human Migrations Identified Using AmpFlSTR Markers. Int. J. Legal Med. 2019, 133, 1659–1666.
Figure 1. Principal Coordinates Analysis (PCoA) based on Nei’s genetic distances among 14 reference populations. Each point represents a population. The first two coordinates (PCo1 and PCo2) explain 70.80% and 10.33% of the total genetic variation, respectively.
Figure 2. Neighbor-Joining (NJ) tree showing the evolutionary relationships among the studied populations based on Nei’s genetic distance. Branch lengths are proportional to genetic distance, with the scale bar indicating 0.01 genetic distance units. China (Qinghai) was included as an out-group. Population abbreviations and labels are defined in Table 1.
Figure 3. The study population (P9) is located in central-western Morocco. Figure 1 highlights clinal genetic variation between Amazigh (Berber) groups, such as P1 (Azrou), P5 (South), and P11 (Middle Atlas), and Arabophone groups, including P14 (Chaouia), P8 (Chaouia-Ouardigha), and P10 (Abda-Chaouia and Tadla). This pattern is consistent with gradual gene flow shaped by geography and by historical and sociolinguistic transitions. The analysis also reflects historical trans-Mediterranean gene flow between Moroccan populations and Southern Europe, especially Spain, highlighting long-standing connectivity. Population codes are provided in Table 1.
Table 1. Characteristics of the Moroccan and global populations included for comparative analysis.

| Sample Size | Abbreviation | Code | Ethnicity and Language | Location | Reference |
|---|---|---|---|---|---|
| 209 | Moroccan Berbers EC | P6 | Berber from Asni and Bouhria | Central and Eastern Morocco | [28] |
| 201 | Moroccan Berbers Azrou | P1 | Berber-speaking of Azrou | Central Middle Atlas region of Morocco | [21] |
| 75 | Moroccan Berbers MA | P11 | Berber-speaking | Middle Atlas of Morocco | [17] |
| 150 | Moroccan Berbers South | P5 | Berber-speaking from Souss | Southern Morocco | [23] |
| 204 | Moroccan Arabs South | P12 | Arabic-speaking | Southern Morocco | [22] |
| 80 | Moroccan Arabs NWC | P13 | Arabic-speaking | North-Central and West-Central Morocco | [20] |
| 71 | Moroccan Arabs WC 1 | P10 | Arabic-speaking from Abda, Chaouia, Doukkali, and Tadla | Central-West Coast of Morocco | [17] |
| 150 | Moroccan Arabs WC 2 | P14 | Arabic-speaking of Chaouia | Western Center of Morocco | [15] |
| 153 | Moroccan Arabs WC 3 | P8 | Arabic-speaking of Chaouia-Ouardigha | Western Center of Morocco | [18] |
| 138 | This Study WC 4 | P9 | Arabic-speaking | Central-West Morocco | This study |
| 425 | Moroccan Mixed 1 | P3 | Berbers, Arabs, and Sahraouis | Various parts of Morocco | [25] |
| 320 | Moroccan Mixed 2 | P2 | Arabs, Berbers, and Sahrawi | Various parts of Morocco | [24] |
| 342 | Spain Caucasian | P4 | Caucasian in Spain | Spain | [26] |
| 2975 | China Qinghai | P7 | Chinese | China (Qinghai province) | [27] |

Table 2. Forensic parameters of 15 STR loci in the Moroccan population sample (N = 138).

| Parameter | TH01 | D3S1358 | vWA | D21S11 | TPOX | D7S820 | D19S433 | D5S818 | D2S1338 | D16S539 | CSF1PO | D13S317 | FGA | D18S51 | D8S1179 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ho | 0.7391 | 0.7319 | 0.8188 | 0.8551 | 0.7029 | 0.7826 | 0.8623 | 0.7464 | 0.8116 | 0.7754 | 0.8043 | 0.7609 | 0.8478 | 0.9203 | 0.8478 |
| He | 0.7911 | 0.7634 | 0.8176 | 0.834 | 0.6884 | 0.7683 | 0.8072 | 0.7556 | 0.8297 | 0.7811 | 0.7434 | 0.756 | 0.8406 | 0.866 | 0.8174 |
| PE | 0.7715 | 0.7062 | 0.8462 | 0.9088 | 0.5741 | 0.7309 | 0.8371 | 0.6999 | 0.9053 | 0.7623 | 0.66 | 0.7045 | 0.9131 | 0.9935 | 0.8531 |
| PD | 0.9243 | 0.905 | 0.9418 | 0.9538 | 0.8573 | 0.9124 | 0.9389 | 0.9025 | 0.9537 | 0.9212 | 0.8892 | 0.9036 | 0.9547 | 0.9679 | 0.943 |
| PIC | 0.759 | 0.7243 | 0.7926 | 0.8154 | 0.6428 | 0.7344 | 0.7832 | 0.7178 | 0.8124 | 0.7502 | 0.6985 | 0.7191 | 0.8207 | 0.8519 | 0.7938 |
| p-HWE | 0.1799 | 0.3707 | 0.1213 | 0.9915 | 0.9652 | 0.2893 | 0.0525 | 0.0824 | 0.0441 | 0.8611 | 0.9088 | 0.1917 | 0.9998 | 0.2854 | 0.3009 |

Ho, observed heterozygosity; He, expected heterozygosity; PD, power of discrimination; PE, power of exclusion; PIC, polymorphic information content; p-HWE, p-value of the Hardy–Weinberg Equilibrium exact test.
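
The per-locus values in Table 2 can be combined across loci and screened for Hardy–Weinberg departures with a few lines of code. The sketch below is illustrative only and rests on two assumptions that this section does not state: the multiple-testing correction is taken to be Bonferroni, and the combined powers are taken to follow the usual product rule for independent loci, combined = 1 − Π(1 − x_i). The function names are hypothetical, and the results need not reproduce the combined figures reported in the article, which may have been computed with different software or conventions. The complement Π(1 − x_i) is returned rather than the combined power itself, because values this close to 1 are rounded to exactly 1.0 in double precision.

```python
from math import prod

LOCI = ["TH01", "D3S1358", "vWA", "D21S11", "TPOX", "D7S820", "D19S433",
        "D5S818", "D2S1338", "D16S539", "CSF1PO", "D13S317", "FGA",
        "D18S51", "D8S1179"]

# Per-locus values copied from Table 2, in the same column order as LOCI.
PD = [0.9243, 0.905, 0.9418, 0.9538, 0.8573, 0.9124, 0.9389, 0.9025,
      0.9537, 0.9212, 0.8892, 0.9036, 0.9547, 0.9679, 0.943]
PE = [0.7715, 0.7062, 0.8462, 0.9088, 0.5741, 0.7309, 0.8371, 0.6999,
      0.9053, 0.7623, 0.66, 0.7045, 0.9131, 0.9935, 0.8531]
P_HWE = [0.1799, 0.3707, 0.1213, 0.9915, 0.9652, 0.2893, 0.0525, 0.0824,
         0.0441, 0.8611, 0.9088, 0.1917, 0.9998, 0.2854, 0.3009]


def combined_complement(per_locus):
    """Return prod(1 - x_i), i.e. 1 minus the combined power.

    Assumes the loci are independent (product rule).
    """
    return prod(1.0 - x for x in per_locus)


def significant_loci(p_values, alpha=0.05, n_tests=1):
    """Loci whose p-value falls below alpha / n_tests.

    n_tests=1 gives the uncorrected test; n_tests=len(p_values) gives a
    Bonferroni correction (an assumption, not necessarily the authors' method).
    """
    threshold = alpha / n_tests
    return [locus for locus, p in zip(LOCI, p_values) if p < threshold]


if __name__ == "__main__":
    print(significant_loci(P_HWE))                        # → ['D2S1338']
    print(significant_loci(P_HWE, n_tests=len(P_HWE)))    # → []
    print("1 - combined PD:", combined_complement(PD))
    print("1 - combined PE:", combined_complement(PE))
```

Under these assumptions, only D2S1338 (p = 0.0441) falls below the nominal 0.05 level, and no locus falls below the Bonferroni threshold of 0.05/15 ≈ 0.0033, which is consistent with the statement that no significant deviation remains after correction for multiple testing.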
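Table 3. Pairwise Nei’s genetic distances among 14 Moroccan and global populations.

| | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | P12 | P13 | P14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| P1 | 0 | | | | | | | | | | | | | |
| P2 | 0.03438 | 0 | | | | | | | | | | | | |
| P3 | 0.034561 | 0.000192 | 0 | | | | | | | | | | | |
| P4 | 0.029068 | 0.047355 | 0.047341 | 0 | | | | | | | | | | |
| P5 | 0.010162 | 0.032986 | 0.033263 | 0.02799 | 0 | | | | | | | | | |
| P6 | 0.026936 | 0.037678 | 0.037209 | 0.048953 | 0.022288 | 0 | | | | | | | | |
| P7 | 0.098683 | 0.124181 | 0.123161 | 0.08881 | 0.097979 | 0.109046 | 0 | | | | | | | |
| P8 | 0.022007 | 0.03189 | 0.032046 | 0.033993 | 0.016364 | 0.018806 | 0.096576 | 0 | | | | | | |
| P9 | 0.024809 | 0.038801 | 0.038639 | 0.037827 | 0.020945 | 0.026719 | 0.088831 | 0.01837 | 0 | | | | | |
| P10 | 0.028381 | 0.036317 | 0.036366 | 0.036014 | 0.024453 | 0.032932 | 0.103868 | 0.024574 | 0.024721 | 0 | | | | |
| P11 | 0.035689 | 0.05087 | 0.050296 | 0.044481 | 0.028174 | 0.028966 | 0.096757 | 0.030051 | 0.030372 | 0.033371 | 0 | | | |
| P12 | 0.013 | 0.029154 | 0.029186 | 0.022648 | 0.014553 | 0.025969 | 0.091007 | 0.019546 | 0.027499 | 0.031338 | 0.026851 | 0 | | |
| P13 | 0.021419 | 0.040558 | 0.040378 | 0.039304 | 0.018546 | 0.034092 | 0.098043 | 0.030699 | 0.035055 | 0.0287 | 0.030905 | 0.019013 | 0 | |
| P14 | 0.014985 | 0.029672 | 0.029819 | 0.029591 | 0.012032 | 0.024137 | 0.09055 | 0.019592 | 0.020627 | 0.026551 | 0.028066 | 0.009559 | 0.024609 | 0 |
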
Numerical values represent the genetic divergence between each pair of populations. Population abbreviations and labels (P1–P14) are defined in Table 1. Genetic distances were estimated pairwise using the dist.genpop function in the R package adegenet (v.2.1.11). Lower values indicate higher genetic similarity, while higher values reflect greater genetic differentiation.
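
To read the matrix from the perspective of one population, the distances can be looked up symmetrically and ranked. The following is a minimal sketch, not the authors' workflow (which used `dist.genpop` in R): it copies the lower triangle of Table 3 into a dictionary and ranks all other populations by their distance to the study population (P9). The names `LOWER`, `distance`, and `ranked_neighbours` are hypothetical helpers introduced for illustration.

```python
# Lower triangle of Table 3 (diagonal zeros omitted). The list stored under
# "Pk" holds the distances from Pk to P1, P2, ..., P(k-1), in that order.
LOWER = {
    "P2": [0.03438],
    "P3": [0.034561, 0.000192],
    "P4": [0.029068, 0.047355, 0.047341],
    "P5": [0.010162, 0.032986, 0.033263, 0.02799],
    "P6": [0.026936, 0.037678, 0.037209, 0.048953, 0.022288],
    "P7": [0.098683, 0.124181, 0.123161, 0.08881, 0.097979, 0.109046],
    "P8": [0.022007, 0.03189, 0.032046, 0.033993, 0.016364, 0.018806,
           0.096576],
    "P9": [0.024809, 0.038801, 0.038639, 0.037827, 0.020945, 0.026719,
           0.088831, 0.01837],
    "P10": [0.028381, 0.036317, 0.036366, 0.036014, 0.024453, 0.032932,
            0.103868, 0.024574, 0.024721],
    "P11": [0.035689, 0.05087, 0.050296, 0.044481, 0.028174, 0.028966,
            0.096757, 0.030051, 0.030372, 0.033371],
    "P12": [0.013, 0.029154, 0.029186, 0.022648, 0.014553, 0.025969,
            0.091007, 0.019546, 0.027499, 0.031338, 0.026851],
    "P13": [0.021419, 0.040558, 0.040378, 0.039304, 0.018546, 0.034092,
            0.098043, 0.030699, 0.035055, 0.0287, 0.030905, 0.019013],
    "P14": [0.014985, 0.029672, 0.029819, 0.029591, 0.012032, 0.024137,
            0.09055, 0.019592, 0.020627, 0.026551, 0.028066, 0.009559,
            0.024609],
}

LABELS = [f"P{i}" for i in range(1, 15)]


def distance(a, b):
    """Symmetric lookup of Nei's distance between populations a and b."""
    i, j = LABELS.index(a), LABELS.index(b)
    if i == j:
        return 0.0
    if i < j:
        i, j = j, i  # always read from the row of the higher-numbered label
    return LOWER[LABELS[i]][j]


def ranked_neighbours(pop):
    """All other populations sorted by increasing distance to pop."""
    pairs = [(other, distance(pop, other)) for other in LABELS if other != pop]
    return sorted(pairs, key=lambda pair: pair[1])


if __name__ == "__main__":
    ranking = ranked_neighbours("P9")
    print(ranking[:3])   # → [('P8', 0.01837), ('P14', 0.020627), ('P5', 0.020945)]
    print(ranking[-1])   # → ('P7', 0.088831)
```

With the codes in Table 1, the closest populations to P9 are P8 (Chaouia-Ouardigha) and P14 (Chaouia), followed by P5 (Souss), while the out-group P7 (China, Qinghai) is the most distant.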
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Essoubaiy, O.; Hakem, A.; Chbel, F.; Yahia, H.; EL Ossmani, H.; Fechtali, T.; El Houate, B. Genetic Characterization of the Arabic-Speaking Population from the Casablanca-Settat Region Using Autosomal STR Markers: Understanding the Interplay of Geography and Language in Moroccan Population History. Forensic Sci. 2026, 6, 22. https://doi.org/10.3390/forensicsci6010022
