Review

Predicting Physical Appearance from Low Template: State of the Art and Future Perspectives

by Francesco Sessa 1,*,†, Emina Dervišević 2,†, Massimiliano Esposito 3, Martina Francaviglia 4, Mario Chisari 3, Cristoforo Pomara 4,‡ and Monica Salerno 4,*,‡
1 Department of Psychology and Health Sciences, Faculty of Human Sciences, Education, and Sports, Pegaso University, 80143 Naples, Italy
2 Department of Forensic Medicine, Faculty of Medicine, University of Sarajevo, 71000 Sarajevo, Bosnia and Herzegovina
3 Faculty of Medicine and Surgery, “Kore” University of Enna, 94100 Enna, Italy
4 Department of Medical, Surgical and Advanced Technologies “G.F. Ingrassia”, University of Catania, 95121 Catania, Italy
* Authors to whom correspondence should be addressed.
† These authors contributed equally to the study.
‡ These authors share last authorship.
Genes 2026, 17(1), 59; https://doi.org/10.3390/genes17010059
Submission received: 11 December 2025 / Revised: 30 December 2025 / Accepted: 1 January 2026 / Published: 5 January 2026
(This article belongs to the Special Issue Advanced Research in Forensic Genetics)

Abstract

Background/Objectives: Forensic DNA phenotyping (FDP) enables the prediction of externally visible characteristics (EVCs) such as eye, hair, and skin color, ancestry, and age from biological traces. However, low template DNA (LT-DNA), often derived from degraded or trace samples, poses significant challenges due to allelic dropout, contamination, and incomplete profiles. This review evaluates recent advances in FDP from LT-DNA, focusing on the integration of machine learning (ML) models to improve predictive accuracy and operational readiness, while addressing ethical and population-related considerations. Methods: A comprehensive literature review was conducted on FDP and ML applications in forensic genomics. Key areas examined include SNP-based trait modeling, genotype imputation, epigenetic age estimation, and probabilistic inference. Comparative performance of ML algorithms (Random Forests, Support Vector Machines, Gradient Boosting, and deep learning) was assessed using datasets such as the 1000 Genomes Project, UK Biobank, and forensic casework samples. Ethical frameworks and validation standards were also analyzed. Results: ML approaches significantly enhance phenotype prediction from LT-DNA, achieving AUC > 0.9 for eye color and improving SNP recovery by up to 15% through imputation. Tools like HIrisPlex-S and VISAGE panels remain robust for eye and hair color, with moderate accuracy for skin tone and emerging capabilities for age and facial morphology. Limitations persist in admixed populations and traits with polygenic complexity. Interpretability and bias mitigation remain critical for forensic admissibility. Conclusions: ML integration strengthens FDP from LT-DNA, offering valuable investigative leads in challenging scenarios. Future directions include multi-omics integration, portable sequencing platforms, inclusive reference datasets, and explainable AI to ensure accuracy, transparency, and ethical compliance in forensic applications.

1. Introduction

The forensic application of genetic data has evolved from a tool of straightforward identity verification to a complex system capable of predicting human phenotypes from biological traces, a field known as Forensic DNA Phenotyping (FDP) [1,2]. This paradigm shift is driven by technological advances in genomics, particularly genome-wide association studies (GWAS), next-generation sequencing (NGS), and increasingly sophisticated computational modeling [3]. FDP allows investigators to derive investigative leads from crime scene DNA by predicting externally visible characteristics (EVCs) such as eye, hair, and skin color, biogeographic ancestry, and chronological age, even in the absence of a known reference profile [4].
The idea of forensic DNA intelligence is to extract any information from DNA that can help guide criminal investigations, especially when no match exists in short tandem repeat (STR) databases [5]. Of practical importance are clues to the physical phenotype, due to the high heritability of human appearance traits [6]. While early forensic genetics relied on STR profiling for identity matching, STR markers are neutral and do not provide phenotypic information [7]. FDP has thus emerged as an essential tool in cases involving unidentified perpetrators, cold cases, mass disasters, or heavily degraded biological remains [8].
The notion that physical features can be “read” from DNA began to gain scientific credibility in the early 2000s with the publication of large-scale GWAS that identified single nucleotide polymorphisms (SNPs) associated with human pigmentation and morphology. Early SNP-based systems such as IrisPlex and HIrisPlex-S demonstrated the feasibility of predicting eye and hair color with moderate to high accuracy. As understanding of complex trait genetics expanded, FDP broadened to include traits like freckles, hair structure, male pattern baldness, and tall stature, as well as subcontinental ancestry and age from multiple tissues [1].
A major challenge in FDP is the prediction of complex traits, which are polygenic and often influenced by environmental and developmental factors. Traits like facial morphology or skin tone involve many small-effect variants, requiring big data approaches and machine learning (ML) models trained on large, diverse reference datasets (Figure 1). For simpler traits with high heritability, targeted analysis of a limited number of markers using massively parallel sequencing (MPS) has proven effective. However, achieving universal genomic predictors for all traits will require the construction of much larger databases derived from whole-genome sequencing (WGS), including rare variants that may influence phenotypes. Furthermore, for accurate appearance prediction, age estimation must be considered; this can be derived from epigenetic modifications, particularly DNA methylation markers, which reflect biological age [3].
Technological limitations persist, especially in cases involving LT-DNA or degraded skeletal remains, which are common in real-world forensics. LT-DNA samples, often collected from trace biological material such as epithelial cells, touched objects, or aged remains, are susceptible to allelic dropout, amplification errors, and partial genotyping. Despite these challenges, tools like the Ion AmpliSeq™ HIrisPlex-S panel, validated using Ion Torrent MPS technology, have shown remarkable resilience. In a study analyzing 63 bone samples aged up to 80 years, full pigmentation profiles could be recovered from over 55% of cases, with successful eye and hair color predictions made from partial profiles in many others. Nonetheless, certain SNPs, especially those relevant to skin color, showed consistent underperformance, highlighting the need for primer redesign in future iterations of the panel [9].
Importantly, the correctness and utility of forensic phenotyping are fundamentally tied to data quality, genetic coverage, and technological sensitivity. As shown by the example of paleoanthropology, where DNA is extracted from highly degraded remains, there is potential for adapting high-throughput, low-input genomic methods to forensic contexts. The slow pace of their forensic adoption, however, is due in part to technical, regulatory, and funding constraints [10,11,12,13,14].
This review explores the scientific and technological advances that are reshaping the field of FDP, with particular emphasis on the application of ML for predicting physical appearance from LT-DNA. It begins by defining LT-DNA and outlining the unique challenges it presents in forensic investigations. The development of FDP is then examined, from its early use in identity testing to its current application in predicting EVCs, biogeographic ancestry, and chronological age. Particular attention is given to the integration of ML approaches, which have become essential for modeling complex traits from partial or degraded genetic data. The review also addresses the practical and technical limitations associated with compromised samples and highlights how recent technological and computational advancements aim to mitigate these issues. Finally, it discusses future perspectives in the field, including the integration of multi-omics data, the expansion of reference databases to enhance predictive accuracy across diverse populations, and the ethical considerations surrounding the forensic use of predictive genetic information.

2. Low Template DNA: Challenges and Opportunities

2.1. Definition and Characteristics

LT-DNA represents a critical category of forensic genetic evidence characterized by a minute quantity of recoverable DNA, typically considered to be below 100 picograms (pg), or fewer than approximately 15–20 human diploid cells [15]. Such low-level samples are frequently encountered in crime scene investigations involving biological traces, including touch DNA, aged stains, shed hairs, or severely degraded tissue [16]. The forensic utility of LT-DNA lies in its potential to provide investigative leads in cases where standard DNA profiling techniques may fail due to insufficient template quantity, stochastic effects, or contamination risks [17].
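The correspondence between the 100 pg threshold and a count of roughly 15–20 cells can be checked with simple arithmetic. The sketch below assumes about 6.6 pg of nuclear DNA per human diploid cell, a commonly used approximation that is not stated in this review; the function names are illustrative.

```python
# Assumption (not from this review): ~6.6 pg of nuclear DNA per human diploid cell.
PG_PER_DIPLOID_CELL = 6.6


def diploid_cell_equivalents(dna_pg: float, pg_per_cell: float = PG_PER_DIPLOID_CELL) -> float:
    """Convert a DNA mass in picograms to an approximate number of diploid cells."""
    if dna_pg < 0 or pg_per_cell <= 0:
        raise ValueError("DNA mass must be non-negative and per-cell mass positive")
    return dna_pg / pg_per_cell


def is_low_template(dna_pg: float, threshold_pg: float = 100.0) -> bool:
    """Apply the commonly cited 100 pg working threshold for LT-DNA."""
    return dna_pg < threshold_pg


if __name__ == "__main__":
    cells = diploid_cell_equivalents(100.0)
    print(f"100 pg corresponds to roughly {cells:.0f} diploid cells")
```

Under this assumption, 100 pg works out to about 15 cells, at the lower end of the range quoted above.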
A defining feature of LT-DNA analysis is its susceptibility to stochastic variation during PCR amplification, leading to challenges such as allelic dropout (failure to detect one allele of a heterozygous locus), allelic drop-in (spurious alleles introduced during amplification), and peak imbalance in electropherograms [18]. These artifacts compromise interpretation reliability, necessitating both technical rigor and statistical caution in forensic applications. To address these limitations, laboratories have developed specialized protocols and technologies aimed at maximizing the information recovered from low-level DNA while minimizing noise and erroneous profiles [19].
One of the central technical questions in processing LT-DNA is whether to concentrate the available extract into a single amplification or split it across replicates. A decision-theoretical study investigating this dilemma demonstrated that performing two replicates yields a higher expected net gain (ENG) than a single amplification, provided that the DNA quantity is sufficient to produce peak heights as low as 43 relative fluorescence units (RFU) on average. This finding underscores the benefit of replicate analyses in enhancing the reliability and information content of LT-DNA profiles, particularly when operating near the analytical sensitivity threshold of current instrumentation [20].
Analytical threshold (AT) determination is another key factor in managing LT-DNA profiles. Given the low signal intensity and high background noise often observed in such samples, accurate AT setting becomes crucial for distinguishing true alleles from spurious peaks. Research has shown that baseline signal patterns are influenced by reagent kits, laboratory conditions, instrument calibration, and the number of amplification cycles. In response, some laboratories now calculate the AT dynamically using baseline signal distributions from negative control profiles. One study proposed a comparative framework for evaluating AT methods and introduced a real-time analysis tool that allows forensic analysts to tailor ATs to specific experimental conditions, improving data accuracy while preserving true allele signals in most LT-DNA scenarios, though challenges persist in extremely low-template and high-cycle cases [21,22] (Figure 2).
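As an illustration of a dynamically calculated AT, the sketch below uses one widely described rule: the mean of the baseline noise in negative controls plus k standard deviations. The choice of k, the function names, and the example values are assumptions made here for illustration; the methods compared in the cited studies may differ.

```python
import statistics


def analytical_threshold(baseline_rfu, k=3.0):
    """Return mean + k * SD of baseline noise (RFU) measured in negative controls.

    k is a laboratory choice; the value 3.0 is an illustrative default only.
    """
    if len(baseline_rfu) < 2:
        raise ValueError("at least two baseline measurements are required")
    mean = statistics.fmean(baseline_rfu)
    sd = statistics.stdev(baseline_rfu)  # sample standard deviation
    return mean + k * sd


def call_peaks(peaks_rfu, threshold):
    """Keep only peaks whose height reaches the analytical threshold."""
    return {allele: height for allele, height in peaks_rfu.items() if height >= threshold}


if __name__ == "__main__":
    # Hypothetical baseline noise readings from a negative control.
    noise = [4, 6, 5, 7, 3, 5, 6, 4]
    at = analytical_threshold(noise)
    print(call_peaks({"12": 150, "13": 7}, at))
```

Because the threshold is recomputed from each run's own negative controls, it adapts to the kit, instrument, and cycle number rather than relying on a single fixed RFU value.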
Beyond analytical thresholds, optimizing amplification strategies is essential when dealing with limited or degraded DNA. A study involving 155 forensic case samples with DNA concentrations between 5 and 14.3 pg/μL revealed that samples exceeding 10 pg/μL and with a degradation index below 3 (a ratio indicating the relative amplification success of short versus long DNA fragments, used to assess sample degradation) were significantly more likely to yield informative profiles. Applying empirically derived thresholds for DNA quantity and quality, researchers found that forensic laboratories could reduce analytical workload by 32% while retaining 83% of usable profiles, offering a pragmatic approach to resource allocation in high-throughput forensic contexts [23].
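A triage rule of this kind can be sketched as follows. The cut-offs (more than 10 pg/μL and a degradation index below 3) are taken from the description above; the record layout, the function names, and the definition of the degradation index as the ratio of a short-target to a long-target concentration are assumptions for illustration, and a laboratory would need to derive and validate its own thresholds.

```python
from dataclasses import dataclass


@dataclass
class QuantResult:
    sample_id: str
    conc_pg_per_ul: float
    degradation_index: float


def degradation_index(short_target_conc: float, long_target_conc: float) -> float:
    """Assumed definition: short-fragment concentration divided by long-fragment concentration."""
    if long_target_conc <= 0:
        raise ValueError("long-target concentration must be positive")
    return short_target_conc / long_target_conc


def worth_profiling(q: QuantResult, min_conc: float = 10.0, max_di: float = 3.0) -> bool:
    """Flag samples that exceed the quantity cut-off and fall below the degradation cut-off."""
    return q.conc_pg_per_ul > min_conc and q.degradation_index < max_di


if __name__ == "__main__":
    # Hypothetical samples, not data from the cited study.
    batch = [
        QuantResult("S1", 12.0, 1.8),
        QuantResult("S2", 8.0, 1.5),
        QuantResult("S3", 12.0, 4.2),
    ]
    print([q.sample_id for q in batch if worth_profiling(q)])
```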
Recent developments in post-PCR processing have also shown promise for improving allele recovery from low-template and trace DNA samples. A study evaluating the Amplicon RX post-PCR clean-up method found significant improvements in allele recovery and signal intensity when compared to standard 29- and 30-cycle protocols using the GlobalFiler amplification kit. Particularly at DNA concentrations as low as 0.001 ng/μL (i.e., 1 pg/μL), the Amplicon RX method yielded superior results, reinforcing its value in analyzing extremely low-template and degraded samples. While all methods struggled at the lowest tested concentration (0.0001 ng/μL), the clean-up procedure still outperformed traditional protocols, emphasizing the need for continued innovation in post-amplification refinement techniques [24].
Despite these methodological advances, significant limitations remain in working with LT-DNA. The inherent stochasticity of low-copy amplification, variability in laboratory-specific workflows, and the impact of environmental factors such as temperature and humidity all contribute to result uncertainty [25,26]. Additionally, inter-laboratory standardization for defining what constitutes “low template” remains inconsistent, although the commonly cited threshold of 100 pg has become widely accepted as a working guideline [27]. Importantly, successful profiling from LT-DNA is not solely a function of DNA quantity, but rather a balance of factors including degradation, inhibition, locus-specific amplification efficiency, and the effectiveness of downstream analysis tools [28].
In conclusion, LT-DNA represents both a challenge and an opportunity in forensic genetics. Its successful analysis requires careful consideration of DNA quantity, quality, amplification strategies, threshold calibration, and post-PCR treatment. The body of evidence now supports the use of multi-replicate strategies, adaptive thresholding, and optimized amplification-clean-up protocols as key methods to enhance the reliability of LT-DNA profiling. As technological advancements in sequencing sensitivity and forensic assay design continue to evolve, it is anticipated that the definition and analytical handling of LT-DNA will become increasingly standardized and robust, improving the evidentiary value of this critical class of forensic samples.

2.2. Sources of LT-DNA

LT-DNA is frequently obtained from minute or degraded samples encountered in forensic casework, including touch DNA, shed or rootless hair shafts, trace biological fluids, and skeletal remains such as aged bone and teeth. While these sources can be pivotal in linking a suspect to a crime scene or identifying unknown remains, they each present distinct challenges related to DNA quantity, degradation, contamination risk, and extraction complexity [29,30].
Touch DNA, or transfer DNA, is one of the most common sources of LT-DNA in forensic casework. It consists of DNA deposited via skin contact on surfaces, items, or human skin. The DNA is primarily derived from shed epithelial cells (corneocytes), which are often anucleated or contain degraded nuclear material. Studies have shown that to recover a full STR profile using direct PCR, at least 40 buccal cells or 4000 corneocytes are needed from touch samples collected via swab or tape lift. When extracted using conventional workflows, 8000 or more corneocytes may be required to obtain comparable results [31]. Collection methods significantly impact DNA recovery, though no single swabbing technique has proven superior. The widely used double-swab method showed modest improvement in DNA yield (average 13.7% more DNA) but not necessarily in STR recovery [32]. Further complicating touch DNA analysis is the indirect transfer of human DNA into the environment, onto dust, furniture, or air particles, raising the possibility of retrieving DNA from individuals who were merely present in a room, not necessarily involved in a criminal act [33].
Hair shafts, especially shed or rootless hairs, are another key yet problematic source of LT-DNA. Hair shafts contain highly degraded and fragmented nuclear DNA, often in insufficient quantities for STR profiling, particularly when subjected to environmental stress such as UV exposure or chemical treatment [34]. While mitochondrial DNA (mtDNA) can be recovered more consistently from hair shafts, hair nuDNA typing has been improved by massively parallel sequencing (MPS) platforms, which enable simultaneous analysis of mtDNA, STRs, SNPs, and Amelogenin markers [35]. DNA extraction methods, such as the Investigator and MinElute protocols, significantly affect outcomes; the Investigator method has shown superior results in terms of depth of coverage, SNP genotyping success, and lower dropout rates [35]. Additionally, telogen hair roots, those not in active growth phase, have historically yielded poor DNA results. However, using Hematoxylin staining to quantify nuclear content before selecting roots for testing has been shown to double the success rate of DNA profiling, from 32% to 69% [36].
Trace biological fluids such as saliva, sweat, and dried blood smears are also common sources of LT-DNA. These samples often contain small quantities of nucleated cells, and the DNA yield varies widely. Fluids such as urine and sweat yield the least DNA due to their low cellular content, whereas blood and semen typically contain higher concentrations of nucleated cells and are more amenable to traditional STR analysis [37]. Identification and proper collection of such fluids, especially when partially degraded or exposed, are essential to ensure successful analysis. Innovative strategies using fluorescent nucleic acid dyes have enhanced the visualization and targeted collection of cellular material, improving downstream processing success rates [38].
Skeletal remains, including bones and teeth, are routinely encountered in forensic investigations involving long postmortem intervals or extreme environmental exposure. These matrices are inherently challenging due to the degradation of DNA over time by temperature, humidity, microbial activity, and pH. Among skeletal elements, the petrous part of the temporal bone has demonstrated the highest DNA yield, up to 30 times that of other bones, due to its dense structure and protective anatomy. Several studies have confirmed that petrous bones outperform femurs, metacarpals, and even teeth in terms of STR profile completeness and reduced dropout rates [39,40]. Similarly, teeth remain a viable DNA source even when exposed to harsh environmental or chemical conditions, such as prolonged submersion in acids, due to their resilient enamel structure. Full or partial DNA profiles can still be recovered from teeth exposed to sulfuric, hydrochloric, or nitric acids for up to 120 h [41].
Aged or degraded skeletal samples require optimized pre-analytical strategies, including the use of decalcification agents, silica-based extraction protocols, and automated systems such as the Maxwell FSC DNA IQ platform. These methods improve yield and reduce handling time, though the success of DNA recovery still largely depends on selecting the most suitable skeletal elements and applying strict contamination controls [42].
In conclusion, LT-DNA is obtained from a wide range of forensic materials, each presenting unique challenges that must be addressed through tailored extraction, quantification, and amplification protocols. The advent of MPS technologies, nuclear content screening, and enhanced collection methods has significantly expanded the ability to recover probative genetic information from even the most limited and compromised DNA sources. Continued innovation in these areas, combined with better triage methods and decision thresholds, will be essential for maximizing the utility of LT-DNA in forensic casework.

2.3. Current Forensic Approaches for LT-DNA Analysis

LT-DNA analysis involves working with very small amounts of DNA and presents a series of challenges related to sensitivity, reproducibility, and interpretation. Despite these limitations, significant advances in molecular techniques and analytical strategies have allowed forensic scientists to maximize the evidentiary value of minimal and degraded DNA samples, which are increasingly encountered in real-world forensic contexts.
One of the central technical approaches in LT-DNA analysis is the increase in PCR cycle numbers during STR amplification to enhance sensitivity. By extending the amplification process beyond the standard 28–30 cycles to 34 or more, low levels of DNA can be more effectively amplified, increasing the likelihood of detecting alleles that would otherwise remain below analytical thresholds. However, this approach is not without drawbacks. Stochastic effects, including allelic and locus drop-out, allelic drop-in, and peak imbalance, become more pronounced as the DNA quantity decreases and amplification cycles increase. These artifacts can compromise the reliability of the resulting profiles, especially in mixed samples or degraded material. Nonetheless, validation studies have demonstrated that replicate testing, even at very low template levels (e.g., 10–100 pg), can produce reliable single-source profiles when a consensus approach is used. Such strategies have helped forensic laboratories implement protocols that balance sensitivity with interpretive robustness, even in high-stakes criminal investigations [43].
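The consensus approach mentioned above can be sketched in a few lines: an allele is reported only if it is observed in a minimum number of replicate amplifications, which filters out sporadic drop-in while tolerating occasional drop-out. The minimum of two replicates, the data layout, and the example genotypes are assumptions for illustration, not a validated laboratory protocol.

```python
from collections import Counter


def consensus_profile(replicates, min_count=2):
    """Build a consensus profile from replicate amplifications.

    replicates: list of dicts mapping locus name -> set of observed alleles.
    An allele is kept only if it appears in at least min_count replicates.
    """
    loci = set().union(*(rep.keys() for rep in replicates))
    consensus = {}
    for locus in sorted(loci):
        counts = Counter(
            allele for rep in replicates for allele in rep.get(locus, set())
        )
        consensus[locus] = {a for a, n in counts.items() if n >= min_count}
    return consensus


if __name__ == "__main__":
    # Hypothetical replicate results for two STR loci.
    reps = [
        {"D3S1358": {"15", "16"}, "vWA": {"17"}},
        {"D3S1358": {"15"}, "vWA": {"17", "19"}},
        {"D3S1358": {"15", "16", "18"}},
    ]
    print(consensus_profile(reps))
```

In this example, allele 18 at D3S1358 and allele 19 at vWA each appear in a single replicate and are therefore excluded from the consensus as possible drop-in.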
In addition to replicate testing, alternative strategies for improving the quality of LT-DNA results have gained traction. Among these is single-cell DNA (scDNA) analysis, which offers a solution to the limitations of traditional “bulk” extraction methods used for mixed samples. In conventional workflows, crime scene stains containing multiple contributors are extracted en masse, producing a homogenized mixture that complicates deconvolution. In contrast, scDNA approaches involve subsampling individual cells from the mixture before DNA extraction, thereby enabling the generation of highly probative, single-source DNA profiles. This technique is particularly advantageous in resolving complex mixtures, such as those involving relatives with similar genetic profiles, or when detecting low-level contributors who may be masked in standard bulk analysis. Although scDNA techniques are still in development and face technical hurdles related to sensitivity, contamination risk, and cost, they represent a promising direction for the future of LT-DNA analysis and forensic genomics as a whole [15].
Beyond individual technical innovations, forensic laboratories are also integrating probabilistic genotyping software (PGS) to enhance interpretation of low-template or complex DNA profiles. These software platforms employ statistical modeling to assign likelihood ratios to competing hypotheses about DNA contributors, incorporating drop-out probabilities, stutter modeling, and peak height variability. (Stutter refers to a minor PCR artifact, caused by slippage during amplification, in which a small peak appears one repeat unit shorter than the true allele.) In LT-DNA contexts, where traditional binary interpretation methods often fall short, PGS allows analysts to extract meaningful conclusions from incomplete or uncertain data. Probabilistic genotyping is now considered best practice in the analysis of complex mixtures and degraded samples and is increasingly used in courtrooms worldwide [44].
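To make the idea of a likelihood ratio with drop-out concrete, the sketch below works through a deliberately simplified single-locus, single-contributor case: only allele A is observed, and the person of interest is an A/B heterozygote. It assumes Hardy-Weinberg proportions, no drop-in, no stutter, a per-allele drop-out probability d, and a homozygote drop-out probability of d squared. These are simplifying assumptions made here for illustration; actual PGS platforms use considerably richer models.

```python
def lr_single_allele_observed(p_a: float, d: float) -> float:
    """Likelihood ratio when only allele A is observed at one locus.

    Hp: the person of interest (heterozygous A/B) is the donor.
    Hd: an unknown, unrelated person is the donor.
    p_a: population frequency of allele A.
    d:   per-allele drop-out probability (assumed; homozygote drop-out taken as d**2).
    """
    if not (0 < p_a < 1) or not (0 < d < 1):
        raise ValueError("p_a and d must lie strictly between 0 and 1")
    # Under Hp: allele A is detected and allele B drops out.
    numerator = (1 - d) * d
    # Under Hd: the donor is either A/A (not fully dropped out)
    # or A/x with x != A, where A is detected and x drops out.
    p_hom = p_a ** 2 * (1 - d ** 2)
    p_het = 2 * p_a * (1 - p_a) * (1 - d) * d
    return numerator / (p_hom + p_het)


if __name__ == "__main__":
    for dropout in (0.01, 0.3):
        print(dropout, round(lr_single_allele_observed(0.1, dropout), 2))
```

With a low drop-out probability, the missing B allele counts against the proposition that the heterozygous person of interest is the donor (LR below 1), whereas a higher drop-out probability makes the same observation supportive. This is why the drop-out parameter matters so much in LT-DNA interpretation.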
Another major advancement supporting LT-DNA analysis is the implementation of NGS, which enables massively parallel processing of STRs, SNPs, and other markers from minimal input DNA. NGS offers several advantages over capillary electrophoresis (CE)-based STR analysis, including the ability to detect additional sequence variation within STR alleles, improved resolution of mixture components, and simultaneous analysis of mitochondrial and nuclear DNA. In LT-DNA contexts, NGS improves both the quantity and quality of data recovered from challenging samples, such as aged bones, hair shafts, or trace skin cells. However, NGS is not yet universally adopted in forensic laboratories due to its cost, complexity, and validation requirements. Continued technological refinement and standardization will be essential for widespread implementation in routine casework.
Despite these technical advances, LT-DNA analysis is not without controversy. Ethical and legal debates persist regarding the admissibility and interpretation of low-level DNA evidence. Concerns about contamination, false inclusions, and the risk of overstating the probative value of partial profiles have led courts and scholars to scrutinize the scientific reliability and legal fairness of LT-DNA testimony. Furthermore, as forensic DNA analysis becomes more powerful and intrusive, especially with the rise in familial searching, phenotyping, and investigative genetic genealogy, important questions arise about privacy, consent, and the potential infringement of civil liberties. Drawing on historical parallels from the early days of DNA profiling, some commentators argue that current advances must be met with a renewed commitment to protecting individual rights and ensuring transparent, accountable use of forensic science [43].
Looking ahead, the future of LT-DNA analysis lies in the integration of multidisciplinary approaches. As highlighted by the INTERPOL Forensic Science Managers Symposium, the field is advancing through the convergence of technologies such as rapid DNA testing, lineage marker analysis, microhaplotypes, proteomics, and epigenetic clocks for age prediction. These methods are being combined with enhanced extraction protocols, automation, and increasingly sophisticated bioinformatic tools to overcome the inherent limitations of low-template samples. Quality assurance remains a top priority, with over 70 new international guidance documents published in recent years to standardize procedures, enhance reliability, and promote cross-laboratory reproducibility [44].
In conclusion, current forensic approaches in LT-DNA analysis represent a dynamic and rapidly evolving area of forensic science. Through the combined use of increased PCR cycles, replicate consensus profiling, probabilistic modeling, single-cell analysis, and NGS technologies, laboratories are increasingly capable of obtaining meaningful results from the smallest and most degraded samples. At the same time, the legal and ethical frameworks surrounding these technologies must evolve in parallel to ensure that forensic innovations serve justice while respecting fundamental rights.

3. From Genotype to Phenotype: The Rise of FDP

The emergence of FDP marks a paradigm shift in modern forensic science. Where traditional DNA profiling has long been limited to identifying individuals through STR markers, FDP opens the possibility of predicting EVCs directly from genetic material, particularly in human remains [45]. By bridging the gap between genotype and phenotype, FDP enables investigators to infer aspects of an unknown person’s physical appearance, ancestry, and even age from DNA found at a crime scene [46]. This approach offers valuable investigative leads, especially in cases where conventional methods reach a dead end [1].
FDP currently focuses on a defined but gradually expanding set of traits, each with its own level of scientific maturity and predictive accuracy. Eye color, for instance, is among the most reliable phenotypic traits to be predicted from DNA, thanks to well-characterized variants in genes such as HERC2 and OCA2 [47,48]. In contrast, traits like skin pigmentation and hair color are more genetically complex, influenced by multiple genes and moderated by environmental factors. While red and black hair can be predicted with reasonable confidence, intermediate hair shades and skin tones remain more difficult to assess, particularly in admixed or non-European populations [49,50,51].
Secondary traits such as freckling and sun sensitivity can further refine a DNA-based appearance profile, though they are typically used in conjunction with primary pigmentation traits [52]. A more ambitious goal is the prediction of facial morphology, which remains one of the most challenging areas due to its highly polygenic nature and sensitivity to non-genetic influences like age, nutrition, and development. Although some genetic variants (e.g., in PAX3 or EDAR) have been associated with facial shape, practical applications in forensic casework are still in the early stages [53].
In parallel, advancements in epigenetics have introduced the possibility of estimating chronological age from DNA, using methylation markers at specific CpG sites. Though age is not a fixed phenotype in the genetic sense, these DNA methylation-based “epigenetic clocks” offer promising tools for narrowing suspect or victim pools, particularly when traditional identifiers are absent [54]. Moreover, biogeographical ancestry inference provides essential context for phenotype prediction, as genetic variant frequencies differ across populations [55].
Tools like IrisPlex, HIrisPlex, and HIrisPlex-S panels, as well as the VISible Attributes through GEnomics (VISAGE) Enhanced Tool, have made FDP more accessible in forensic settings, allowing for targeted genotyping of key SNPs related to eye, hair, and skin color. Genotyping 24 genetic markers (SNPs and indels) enables rapid and reliable prediction of eye and hair color using the HIrisPlex system; the inclusion of 17 additional markers extends this capability to skin color prediction through the HIrisPlex-S system. While these panels show strong performance under ideal laboratory conditions, their predictive accuracy often diminishes in real-world scenarios involving degraded, mixed, or low-template DNA. As a result, integrating FDP into investigations requires not only scientific rigor but also careful consideration of its limitations and the probabilistic nature of the predictions it offers [56,57,58,59]. It is important to note that current forensic prediction panels (IrisPlex, HIrisPlex-S, VISAGE) primarily employ multinomial logistic regression for trait inference. These models are interpretable and validated for forensic use but differ fundamentally from ML approaches discussed later, which aim to capture complex genotype–phenotype relationships using non-linear, high-dimensional methods.

3.1. Forensic Eye Color Prediction from DNA: Insights from Genetic Markers

In forensic science, biological traces such as blood, saliva, semen, or epithelial cells recovered from crime scenes are primarily used for human identification through DNA profiling. However, recent advances have enabled the use of DNA to predict EVCs [60].
Eye color prediction is based on the analysis of specific SNPs associated with pigmentation genes [47]. These SNPs influence melanin synthesis and distribution in the iris, which, despite being covered by the transparent cornea, exhibits color variation due to both pigment concentration and structural light scattering in the stroma. The variation in iris color is primarily determined by the number and distribution of stromal melanocytes and the type of melanin, especially eumelanin, present [61].
Eye color is a polygenic trait, with multiple genes contributing to its expression [57]. Among these, the SNP rs12913832 in the HERC2 gene has the most significant impact, particularly in European populations. Although located in an intronic region of HERC2, this SNP regulates the expression of the OCA2 gene, which encodes a protein essential for melanin transport. Individuals with the GG genotype at rs12913832 typically have blue eyes, the AA genotype correlates with brown eyes, and AG heterozygotes often exhibit intermediate shades such as green or hazel [62,63].
Other genes, such as SLC24A4, SLC45A2, TYR, TYRP1, ASIP or IRF4, also contribute to eye color variation, especially for intermediate phenotypes. However, their predictive power is generally lower and may vary across populations. Polymorphisms in these genes, while informative, are often population-specific and less conserved than the HERC2-OCA2 regulatory axis [64,65,66].
To operationalize these genetic insights, the IrisPlex system was developed. This forensic tool uses a panel of six SNPs (Table 1) to predict eye color using a multinomial logistic regression model. The system outputs probabilities for three eye color categories: blue, brown, and intermediate. A prediction is considered reliable when the highest probability exceeds a threshold of 0.7. For example, a prediction of 0.89 for blue, 0.08 for brown, and 0.03 for intermediate would be interpreted as blue eyes [58].
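The model structure and decision rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the IrisPlex implementation: the SNP names (`snp_a`, `snp_b`), the coefficients, the choice of brown as the reference category, and the "inconclusive" label are all assumptions introduced here. Only the three categories, the 0.7 threshold, and the worked probabilities come from the text.

```python
import math

# Hypothetical coefficients for two illustrative SNPs;
# these are NOT the published IrisPlex parameters.
INTERCEPTS = {"blue": -1.0, "intermediate": -0.5}
BETAS = {
    "blue": {"snp_a": 2.0, "snp_b": 0.3},
    "intermediate": {"snp_a": 0.8, "snp_b": 0.1},
}

def predict_probs(allele_counts, reference="brown"):
    """Multinomial logistic regression with one reference category.

    allele_counts maps each SNP to the number of copies (0, 1, 2)
    of the counted allele.
    """
    scores = {
        cat: INTERCEPTS[cat]
        + sum(BETAS[cat][snp] * n for snp, n in allele_counts.items())
        for cat in INTERCEPTS
    }
    denom = 1.0 + sum(math.exp(s) for s in scores.values())
    probs = {cat: math.exp(s) / denom for cat, s in scores.items()}
    probs[reference] = 1.0 / denom  # reference category has score 0
    return probs

def call_category(probs, threshold=0.7):
    """Report the top category only if its probability exceeds the threshold."""
    category, p_max = max(probs.items(), key=lambda kv: kv[1])
    return category if p_max > threshold else "inconclusive"

# Worked example from the text: 0.89 / 0.08 / 0.03 is called as blue.
print(call_category({"blue": 0.89, "brown": 0.08, "intermediate": 0.03}))
```

Keeping the probability model separate from the reporting rule makes it possible to change the threshold without retraining or touching the model parameters.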
Validation studies have demonstrated that IrisPlex achieves over 90% accuracy for blue and brown eyes in European populations. However, prediction accuracy for intermediate eye colors remains lower (around 73%), with sensitivity as low as 1.1%, reflecting the complex genetic architecture of these phenotypes [58].
As detailed in Section 2.3, LT-DNA introduces stochastic effects and allelic dropout that may compromise prediction accuracy. Future challenges in iris color prediction lie in integrating newly identified genetic markers to enhance predictive accuracy. A recent publication highlighted that whole-exome sequencing of 150 individuals has uncovered 27 previously unreported variants associated with eye color, offering promising new targets for prediction models. Notably, the SNP rs2253104 in the ARFIP2 gene emerged as a key predictor, selected by multiple feature selection methods and contributing to the most accurate regression models. These findings suggest that expanding SNP panels with newly validated variants could significantly improve prediction outcomes, especially when working with degraded or limited DNA samples where maximizing the information from each locus is crucial. Validating and adapting these new markers for use in mini-amplicon assays could be a key step forward in applying iris prediction to challenging forensic samples [47].
Additionally, high-sensitivity genotyping platforms, including MassARRAY (MALDI-TOF mass spectrometry for multiplex SNP genotyping), NGS/MPS (parallel sequencing of multiple loci), and real-time PCR (using allele-specific probes or high-resolution melting analysis), have enhanced the precision and robustness of SNP detection [51,67]. Replicate testing and consensus genotyping further mitigate random amplification errors by allowing analysts to identify consistent results across multiple runs. Moreover, the integration of probabilistic models into prediction tools enables the communication of uncertainty levels, helping to avoid overinterpretation of marginal or ambiguous profiles. Enforcing strict quality control thresholds, such as minimum allele calling criteria and prediction probability cutoffs (e.g., >0.7), also ensures that only high-confidence phenotype predictions are reported. Nevertheless, certain limitations remain. Intermediate eye colors, for instance, continue to be difficult to predict accurately due to their polygenic inheritance patterns and relatively low heritability. Furthermore, variations in allele frequencies across populations can influence prediction outcomes, highlighting the necessity of validating predictive models in diverse genetic backgrounds to ensure their broader applicability and forensic reliability [68].
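The replicate-and-consensus idea mentioned above can be illustrated with a short Python sketch. The function name `consensus_genotype`, the `min_support` parameter, and the rule of rejecting ties are assumptions made for illustration; laboratories define their own consensus criteria.

```python
from collections import Counter

def consensus_genotype(replicate_calls, min_support=2):
    """Return the genotype supported by replicate runs, or None.

    replicate_calls: genotype strings such as "AG", or None for a no-call.
    A genotype is accepted only if it appears in at least `min_support`
    replicates and strictly more often than any other call.
    """
    # Sort alleles so that "AG" and "GA" count as the same genotype.
    normalized = ["".join(sorted(call)) for call in replicate_calls if call]
    if not normalized:
        return None
    ranked = Counter(normalized).most_common()
    top, n_top = ranked[0]
    tied = len(ranked) > 1 and ranked[1][1] == n_top
    return top if n_top >= min_support and not tied else None

# One replicate lost the A allele (dropout), two agree on the heterozygote.
print(consensus_genotype(["AG", "GA", "GG"]))
```

Returning `None` rather than a best guess keeps an unsupported locus out of the downstream prediction, in line with the quality thresholds discussed above.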
While eye color is generally considered a low-risk trait in terms of privacy, its use in FDP must adhere to ethical standards and legal frameworks. Misinterpretation of probabilistic predictions, especially in cases with low confidence or ambiguous results, can lead to investigational bias. Therefore, careful communication of FDP findings to law enforcement is crucial to avoid wrongful suspicion [11,69].

3.2. Hair Color Prediction from Biological Traces

When biological traces are collected, forensic experts apply extraction protocols designed to preserve DNA quality even from degraded or low template samples [70]. Following DNA isolation and quantification, molecular techniques are used to identify SNPs associated with hair pigmentation [71].
Specific DNA regions are involved in the production, distribution, and degradation of melanin, the pigment primarily responsible for hair, eye, and skin coloration [56]. Hair color is determined by the type and quantity of melanin in the hair shaft: eumelanin (black/brown pigment) and pheomelanin (red/yellow pigment). The balance between these two types of melanin, regulated by a network of genes and genetic variants, gives rise to the wide spectrum of human hair colors, from black and brown to blond and red [72].
Hair color is a complex trait shaped by the combined influence of many genes and their interactions, each contributing in different ways to the final pigmentation phenotype. Among these, MC1R (Melanocortin 1 Receptor) plays a particularly prominent role. Variants in this gene are closely linked to red hair, as MC1R controls the switch between the production of eumelanin (dark pigment) and pheomelanin (red/yellow pigment). When MC1R function is disrupted by specific genetic variants, the balance shifts toward pheomelanin, leading to the characteristic red or auburn hair tones [73,74,75].
Other genes also contribute significantly to hair color diversity. HERC2 and OCA2, which are more widely recognized for their role in determining eye color, also influence melanin synthesis in the hair, particularly affecting the variations observed in blond and brown shades [48].
Genes such as SLC24A4, SLC45A2, and TYR further support the pigmentation process by regulating melanosome function and the activity of tyrosinase, an enzyme critical for melanin biosynthesis. Variants in these genes are commonly associated with lighter hair tones [76].
Lastly, IRF4 has emerged as another important contributor. This gene influences pigmentation through its regulatory functions in melanocyte biology. Not only is it associated with lighter hair colors, but it is also implicated in age-related changes in pigmentation, such as the gradual transition to gray hair [77].
The HIrisPlex system, developed as an extension of the earlier IrisPlex model for eye color prediction, integrates both eye and hair color SNPs into a single assay. As summarized in Table 2, the system uses 24 SNPs across multiple pigmentation genes [56].
The model outputs a probability for each hair color category, allowing forensic experts to assess the likely hair color of the DNA donor. A prediction is usually accepted when one color category exceeds a predefined probability threshold (e.g., >0.7). For example, a sample yielding 0.82 probability for brown, 0.10 for black, 0.06 for blond, and 0.02 for red would result in a brown hair color prediction [78].
The predictive accuracy of hair color varies by color category and population group. For example, red hair can be predicted with over 80–90% accuracy, primarily due to the strong effect of MC1R variants [79]. Blond and brown hair predictions are also relatively reliable (75–85%) but may show variation across different ethnic backgrounds. Black hair, being the most common worldwide, is typically predicted with high sensitivity but may have slightly lower specificity due to overlapping SNP effects [80].
Prediction models have been validated mainly in European populations, where hair color diversity is greatest. Performance in non-European populations can differ because of allele frequency differences and additional contributing variants. Ongoing research continues to expand databases and improve prediction accuracy across ancestrally diverse groups [81].
The utility of hair color prediction from biological traces hinges on the ability to generate accurate genotypes from compromised samples [82].
In cases involving severely degraded DNA, MPS-based typing offers advantages by enabling parallel analysis of multiple short DNA fragments and improving coverage at key SNP sites [67,83,84]. Still, probabilistic interpretation remains essential, particularly when prediction confidence falls below threshold values [82].
Although hair color is a relatively non-sensitive trait in forensic terms, the use of predictive models still raises questions about genetic privacy, the potential for profiling, and public trust. Moreover, probabilistic results should be presented with appropriate caveats, particularly when based on partial or degraded DNA samples [83].

3.3. Skin Pigmentation Prediction in Forensic Science from Biological Traces

The HIrisPlex-S model further incorporates skin color prediction and uses a total of 41 SNPs (Table 3), enabling a broader EVC profile from a single DNA sample. This comprehensive approach improves the informative value of FDP in real-world investigations [84].
While the SNP tables provide a comprehensive overview of markers used in FDP, performance varies significantly across panels and traits, especially under degraded-DNA conditions. The HIrisPlex-S system, which integrates eye, hair, and skin color prediction, has demonstrated robust performance in forensic contexts, including aged bone samples [59], with successful recovery of pigmentation profiles in over 55% of cases [9]. However, certain SNPs associated with skin tone (e.g., rs1426654 in SLC24A5 and rs12913832 in HERC2) show reduced amplification success in highly degraded samples, leading to underperformance in intermediate pigmentation categories. VISAGE panels, designed for massively parallel sequencing, offer improved sensitivity and multiplexing, enabling better SNP recovery from LT-DNA and degraded skeletal remains. Inter-laboratory validation studies [59] confirm that VISAGE achieves higher reproducibility and lower dropout rates compared to HIrisPlex-S, particularly for skin color and ancestry inference. Nonetheless, both panels exhibit population-specific limitations: predictive accuracy is highest in European cohorts and declines in admixed populations, underscoring the need for inclusive reference datasets and ongoing calibration.
Skin pigmentation prediction from biological traces is an emerging and powerful facet of FDP, allowing forensic scientists to infer EVCs of unknown individuals when conventional identification techniques are unfeasible. Alongside eye and hair color, skin pigmentation provides essential descriptive features that can generate investigative leads from DNA alone. This is particularly relevant in cases involving unknown perpetrators, unidentified human remains, or degraded biological evidence [85].
Biological traces recovered from crime scenes can contain sufficient nuclear DNA to permit genotyping, including the typing of SNPs associated with pigmentation traits [44]. Human skin color is primarily determined by the quantity and type of melanin produced by melanocytes in the basal layer of the epidermis. The density, distribution, and synthesis of melanin granules, regulated by a network of pigmentation genes, are key factors in determining skin tone [86]. These processes are influenced by several genes that affect melanosome formation, melanocyte biology, and melanin biosynthesis pathways [87]. Numerous genes have been associated with variations in skin pigmentation, many of which have been included in forensic prediction panels. Furthermore, skin color is a continuous trait influenced by both genetic and environmental factors (e.g., sun exposure, tanning behavior, and age, all of which can affect observed skin tone) [88]. Its prediction from DNA is complex due to its polygenic nature, wide variation across populations, and the influence of evolutionary history. Despite this, robust predictive models have been developed, especially those designed to distinguish between broad pigmentation categories (light, intermediate, dark) across major global ancestries [89]. This classification system simplifies the continuous nature of skin tone into practical categories for forensic investigation. The model utilizes logistic regression trained on a large reference dataset that includes individuals from diverse ancestral backgrounds [90].
In validation studies, HIrisPlex-S demonstrated high accuracy, particularly in predicting light and dark pigmentation. Intermediate skin tones are more challenging due to greater phenotypic and genetic diversity and less clear boundaries between categories. The prediction accuracy is also influenced by the genetic background of the individual: for example, predictions are more reliable in populations of European or sub-Saharan African descent, and less so in admixed or South Asian populations [81].
The prediction of skin color from LT-DNA is a promising yet complex frontier in forensic genetics. LT-DNA samples are inherently prone to degradation and contamination [70]. These factors can compromise the integrity of genetic markers, particularly SNPs used in pigmentation prediction models such as HIrisPlex-S. Indeed, LT-DNA samples frequently result in incomplete genetic profiles, which limits the number of informative SNPs available for analysis. This can hinder the statistical confidence of phenotype predictions, especially in individuals with intermediate or admixed ancestry, where subtle genetic variations play a significant role in pigmentation. Advanced probabilistic models and imputation techniques are being developed to address these limitations, but their effectiveness remains contingent on the quality and completeness of the input DNA [9].
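One simple way to reason about an incomplete profile, sketched below, is to average a model's output over the possible genotypes at each missing SNP, weighting each genotype by its expected frequency. This is a hedged illustration and not the method of any published tool: it assumes Hardy-Weinberg equilibrium, independence between SNPs, known population allele frequencies, and a generic `predict` function. All names and numbers here are hypothetical.

```python
from itertools import product

def hwe_prior(freq):
    """Genotype prior for 0, 1, or 2 copies of the counted allele (HWE assumed)."""
    return {0: (1 - freq) ** 2, 1: 2 * freq * (1 - freq), 2: freq ** 2}

def marginal_prediction(observed, missing_freqs, predict):
    """Average predict() over every genotype combination at the missing SNPs.

    observed:      SNP -> allele count for successfully typed markers
    missing_freqs: SNP -> population allele frequency for dropped-out markers
    predict:       function mapping a complete genotype dict to category probabilities
    """
    snps = list(missing_freqs)
    priors = [hwe_prior(missing_freqs[s]) for s in snps]
    totals = {}
    for combo in product((0, 1, 2), repeat=len(snps)):
        weight = 1.0
        for prior, count in zip(priors, combo):
            weight *= prior[count]
        full_profile = {**observed, **dict(zip(snps, combo))}
        for category, p in predict(full_profile).items():
            totals[category] = totals.get(category, 0.0) + weight * p
    return totals

def toy_predict(profile):
    """Toy stand-in for a trained model (purely illustrative)."""
    p_light = min(1.0, 0.2 + 0.2 * profile["snp_a"] + 0.2 * profile["snp_b"])
    return {"light": p_light, "dark": 1.0 - p_light}

print(marginal_prediction({"snp_a": 2}, {"snp_b": 0.5}, toy_predict))
```

The number of combinations grows as 3 to the power of the number of missing SNPs, so this exhaustive form is only practical when a few markers have dropped out. The averaged probabilities are also typically less decisive than those from a full profile, consistent with the reduced statistical confidence described above.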
Despite these challenges, LT-DNA-based skin color prediction remains a valuable tool in forensic investigations. When used responsibly, it can provide critical leads in otherwise unsolvable cases, offering a glimpse into the physical appearance of unknown individuals and narrowing suspect pools in a scientifically grounded manner.

3.4. Freckles and Sun Sensitivity Prediction in Forensic Science from Biological Traces

Among the EVCs that can be predicted from biological traces, freckles and sun sensitivity may seem minor at first glance, but they can contribute meaningfully to the construction of a biogeographic and phenotypic profile of an unknown individual [91]. Freckles (ephelides) are small, pigmented macules that typically appear on sun-exposed areas of fair skin. Sun sensitivity refers to a person’s susceptibility to burning rather than tanning upon UV exposure, typically resulting from lower eumelanin levels and higher pheomelanin content [86]. These traits, rooted in pigmentation biology, are closely associated with genes involved in melanin synthesis and regulation, particularly the MC1R gene, located on chromosome 16 [92]. Several common MC1R variants (e.g., R151C, R160W, D294H) are collectively referred to as “R alleles” and are associated with the red hair color (RHC) phenotype, but also contribute independently to freckling and UV sensitivity, even in individuals without red hair [93]. Other genes, such as ASIP, TYR, and IRF4, may modulate freckling and pigmentation patterns, although their effects are smaller and often population-specific [94].
With appropriate genetic analysis, forensic scientists can derive probabilistic predictions about whether an individual is likely to have freckled skin and/or increased sensitivity to sunlight. The relevant SNPs are included in comprehensive forensic phenotyping tools such as HIrisPlex-S [85]. These models use multinomial logistic regression or ML classifiers trained on large, multi-ethnic datasets [51,95]. For freckles, individuals are categorized as likely or unlikely to have them based on the presence of specific MC1R variants and other associated SNPs [3,96,97,98,99]. For sun sensitivity, prediction involves assessing genetic profiles associated with reduced melanin production and increased burn tendency, with individuals assigned to one of three categories (high likelihood of sunburn with minimal tanning, moderate sensitivity, or low sensitivity/likely to tan) [100].
Prediction accuracy for freckles and sun sensitivity is generally good when the relevant SNPs are reliably genotyped, and the individual belongs to a well-represented population in the model’s training data (typically of European ancestry) [51]. For instance, freckle prediction can reach accuracies above 75–80%, particularly when individuals carry two or more non-functional MC1R alleles. Sun sensitivity prediction is somewhat more variable due to the continuous nature of the phenotype and its interaction with environmental factors (e.g., lifetime sun exposure, behavior, geography) [101].
However, there are several limitations [100]:
- Polygenic complexity: many small-effect genes and gene-environment interactions contribute to these traits.
- Environmental influence: freckling can be influenced by UV exposure history, making it a dynamic, rather than strictly genetically determined, trait.
- Population bias: predictive models perform best in individuals of European ancestry, where pigmentation variation is highest and best studied; accuracy drops in admixed or underrepresented populations.
- LT-DNA.

3.5. Facial Morphology Prediction in Forensic Science from Biological Traces

In the field of FDP, the prediction of facial morphology, the shape and structure of the human face, represents a frontier with considerable scientific promise and complex challenges [102]. While traits like eye, hair, and skin color have reached relatively high predictive reliability, facial morphology prediction is more intricate due to the polygenic and multifactorial nature of facial features. However, recent advances in genomics, 3D facial imaging, and ML models are progressively transforming facial prediction from a speculative vision into an emerging forensic tool. The ability to reconstruct aspects of a person’s facial structure from biological traces such as blood, saliva, or touch DNA could significantly support investigations, particularly when no biometric or documentary information is available [102,103].
Facial morphology is a highly heritable trait governed by the interaction of hundreds of genes and regulatory elements. GWAS have identified dozens of loci associated with specific facial traits, including: facial width and height, nasal bridge length and tip projection, lip thickness, chin prominence, brow ridge shape, mandibular and maxillary dimensions [6,104].
Key genes such as PAX3, EDAR, DCHS2, RUNX2, and PRDM16 have been linked to facial development, often through their role in craniofacial growth, tissue differentiation, and skeletal patterning. Variants in these genes influence the position and projection of facial landmarks that define individual appearance. Notably, the expression of these traits is further modulated by age, sex, ancestry, and environmental influences (e.g., nutrition, trauma), which makes prediction from DNA alone inherently probabilistic [105,106].
In addition to genotyping, the ancestry and sex of the donor are determined, as both play pivotal roles in shaping facial morphology. Ancestry informs baseline structural patterns common in different populations (e.g., nasal breadth, cheekbone prominence), while sex influences sexually dimorphic traits such as jaw angle and brow thickness [107].
To transform genetic data into facial predictions, researchers apply ML techniques trained on large databases of paired genotype and 3D facial imaging data. These datasets often consist of thousands of individuals with detailed facial scans and genome-wide SNP data. Key modeling approaches include partial least squares regression (PLSR) to link SNPs to principal components of facial variation; convolutional neural networks (CNNs) to analyze facial geometry from genetic and demographic inputs; deep generative models (e.g., variational autoencoders or GANs) to create realistic 3D face renderings from DNA data [108].
Recent examples include the FaceBase project, VisiGen, and research from the Penn State Forensic Anthropology Lab, which has shown that genetic data can explain 10–20% of the variance in certain facial traits, modest but meaningful progress toward practical forensic use [109,110]. Even so, substantial limitations remain in predicting features influenced by environmental and developmental factors: traits such as facial asymmetry, age-related changes, and expression dynamics are poorly captured by genotype-based models. This modest predictive power underscores that current models are research-grade and unsuitable for operational forensic deployment.
The predictive power of facial trait modeling is strongest when combined with known ancestry, sex, and age estimates, particularly in young adults, whose facial structures are relatively stable. Nonetheless, these models often lack precision at the individual level and are more effective at generating composite depictions or narrowing demographic search pools than at identifying exact facial characteristics. Reproducibility remains a major challenge in facial morphology prediction. Variability in imaging techniques, SNP panels, and population structure can lead to inconsistent results across studies. Moreover, the lack of standardized validation protocols and limited availability of forensic-grade datasets restrict the generalizability of current models. According to recent publications, the highest predictability is observed for traits such as nose width, facial width, intercanthal distance (the distance between the eyes), and lip thickness. Moderate predictability applies to features like brow ridge prominence, chin shape, and nasal tip projection. In contrast, traits such as facial asymmetry, expression-related features, and age-related morphological changes exhibit low predictability [46,111].
In forensic contexts, these predictions are used as investigative leads rather than as confirmatory evidence. For example, in cold cases, a predicted facial image may be released to the public to generate tips. In mass disaster scenarios, facial predictions can assist in identifying unknown victims when traditional methods, such as fingerprinting or dental records, are unavailable. Similarly, in cases involving unidentified remains, predicted facial features can complement skeletal reconstructions and ancestry estimations, enhancing the overall identification process. Reproducibility remains limited due to variability in imaging protocols, SNP panels, and population structure. Most datasets are European-biased, reducing generalizability and increasing misclassification risk in admixed populations. Future efforts should focus on expanding training datasets to include diverse populations, integrating multi-omics data to capture non-genetic influences, and establishing standardized pipelines for model development and validation to enhance reproducibility and forensic reliability.
Facial prediction from LT-DNA represents a cutting-edge intersection of forensic genetics and computational modeling. Facial predictions derived from LT-DNA are probabilistic rather than definitive and are best used to generate investigative leads rather than serve as conclusive evidence [3,102]. Indeed, despite promising advances, current models explain only 10–20% of variance in facial traits, which is insufficient for individual-level reconstruction. These outputs remain research-grade and are not forensically deployable. Reproducibility is constrained by differences in imaging protocols, SNP panels, and population structure. Beyond technical limitations, predicting facial morphology raises significant ethical concerns. Unlike eye, hair, and skin color, facial features are closely linked to ancestry and identity, making them particularly sensitive. Generated facial composites risk being misinterpreted as accurate likenesses, potentially leading to investigational bias or discrimination. These challenges underscore the need for transparent communication of uncertainty, strict governance, and clear disclaimers when using such predictions as investigative leads rather than confirmatory evidence.

3.6. Age Prediction in Forensic Science Based on Biological Traces

Estimating chronological age from biological traces has become a valuable forensic tool, particularly when conventional identifiers are unavailable [112]. Unlike skeletal assessments, molecular approaches rely on biomarkers that change predictably over time [113]. Among these, DNA methylation at specific CpG sites is the most reliable indicator of age, outperforming earlier methods based on telomere length or mtDNA mutations, which showed high variability and limited forensic utility [114,115,116].
Epigenetic clocks exploit methylation changes at selected loci to predict age with high accuracy. Forensic models prioritize simplicity and robustness for degraded or LT-DNA. A widely used approach by Bekaert et al. employs four CpG sites in genes such as ELOVL2, FHL2, KLF14, and TRIM59, achieving mean absolute deviations of about 3–4 years in blood samples. ELOVL2 is particularly informative because its methylation increases consistently with age across tissues, including blood, saliva, and buccal cells [117]. The VISAGE Consortium has further advanced this field by creating multiplex assays compatible with NGS, enabling sensitive and robust age estimation from trace DNA [118].
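To make the idea of a small-panel epigenetic clock and its error metric concrete, the following Python sketch predicts age as a weighted sum of methylation fractions and computes the mean absolute deviation (MAD) against known ages. The linear form, the intercept, and the weights are assumptions chosen for readability. They are not the coefficients of the cited model, and published models may use non-linear terms.

```python
# Hypothetical intercept and weights (years per unit methylation fraction);
# NOT the published coefficients of any forensic age model.
INTERCEPT = 10.0
WEIGHTS = {"ELOVL2": 60.0, "FHL2": 30.0, "KLF14": 40.0, "TRIM59": 20.0}

def predict_age(methylation):
    """methylation: gene -> methylation fraction (0 to 1) at the selected CpG site."""
    return INTERCEPT + sum(WEIGHTS[gene] * methylation[gene] for gene in WEIGHTS)

def mean_absolute_deviation(predicted, actual):
    """Average absolute difference between predicted and chronological ages."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

sample = {"ELOVL2": 0.5, "FHL2": 0.3, "KLF14": 0.1, "TRIM59": 0.2}
print(predict_age(sample))

# MAD over two illustrative donors with known ages of 54 and 44 years.
print(mean_absolute_deviation([57.0, 40.0], [54.0, 44.0]))
```

A MAD of 3 to 4 years, as reported above for blood, means that individual predictions are typically off by that amount, so casework reports usually state an age range rather than a single value.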
One of the strengths of DNA methylation-based age prediction is its applicability across various forensic sample types [119]. Blood and saliva are most commonly used due to their higher DNA yield and stable methylation profiles. Semen requires tissue-specific models, as methylation patterns vary significantly between tissues. Touch DNA, derived from epithelial cells, presents greater challenges due to low DNA quantities, but ongoing research is improving its feasibility. Even skeletal remains can be analyzed for methylation-based age estimation, offering valuable insights in cold cases or mass grave investigations [112,120].
In forensic casework, age prediction serves multiple roles. It can assist in suspect profiling when no known individual is linked to the biological evidence, providing an estimated age range to guide investigations. In victim identification, especially in cases involving unidentified remains, age estimates help correlate findings with missing persons databases. Additionally, age prediction is sometimes used in legal and immigration contexts to determine whether individuals are minors, although this application remains ethically and legally contentious [121,122].
Technological platforms for methylation analysis include pyrosequencing, quantitative PCR, and NGS. Pyrosequencing remains popular for targeted assays because of its cost-effectiveness and compatibility with forensic workflows, while NGS offers higher sensitivity and multiplexing for degraded samples. Recent VISAGE Consortium developments integrate methylation-based age estimation into multiplex panels, enabling simultaneous prediction of age and other traits from trace DNA [123,124].
Despite these advances, LT-DNA poses challenges such as allelic dropout and incomplete methylation profiles, which can reduce accuracy [125]. Optimized assays using short amplicons, replicate testing, and probabilistic models help mitigate these issues [126]. When combined with strict quality control and robust statistical frameworks, methylation-based age prediction provides actionable investigative leads, even from compromised samples [127].

3.7. Ancestry Prediction in Forensic Science from Biological Traces

Ancestry prediction from biological traces has become a powerful tool in forensic science, offering critical insights into the biogeographical origins of unknown individuals when traditional identification methods are unavailable [128]. This approach is particularly valuable in criminal investigations, mass disasters, and the identification of unidentified remains, where DNA evidence may be the only clue. Unlike cultural or self-reported ethnicity, forensic ancestry inference relies on genetic markers that reflect population-level differences shaped by evolutionary history, migration, and genetic drift. These markers, primarily SNPs, are distributed across the genome and exhibit frequency patterns that vary among continental and subcontinental populations [129].
The foundation of ancestry prediction lies in the analysis of ancestry-informative markers (AIMs), SNPs selected for their high allele frequency differences between populations. Panels such as the SNPforID 34-plex and the Precision ID Ancestry Panel include from tens to more than a hundred AIMs optimized for forensic use. These panels can distinguish major continental ancestries (e.g., African, European, East Asian, Native American, South Asian) and, in some cases, provide finer resolution within regions [130,131]. The VISAGE Consortium has developed advanced multiplex assays compatible with MPS, enabling ancestry inference from degraded or LT-DNA samples commonly encountered in forensic contexts [59,132].
Ancestry prediction nevertheless faces several challenges, particularly with LT-DNA, where allelic dropout and incomplete profiles reduce the number of informative markers. Recent advances in sequencing sensitivity and bioinformatic tools have improved the robustness of ancestry inference from trace samples: probabilistic models and ML algorithms can now integrate partial genotypes and assign ancestry with high confidence, even from compromised DNA [90,133].
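A minimal sketch of how a probabilistic model can work with a partial genotype is given below: each reference population is scored by the likelihood of the observed AIM genotypes, and markers that dropped out are simply skipped. The population labels, marker names, and allele frequencies are hypothetical. The sketch assumes Hardy-Weinberg equilibrium, independent markers, and equal prior probabilities, none of which is guaranteed in casework.

```python
import math

# Hypothetical allele frequencies for three illustrative AIMs.
# Frequencies must lie strictly between 0 and 1 to avoid log(0).
REF_FREQS = {
    "pop_A": {"aim1": 0.9, "aim2": 0.2, "aim3": 0.7},
    "pop_B": {"aim1": 0.1, "aim2": 0.8, "aim3": 0.4},
}

def genotype_likelihood(count, freq):
    """P(genotype | allele frequency) under HWE; count = copies of the counted allele."""
    return ((1 - freq) ** 2, 2 * freq * (1 - freq), freq ** 2)[count]

def ancestry_posterior(genotypes):
    """genotypes: AIM -> 0, 1, or 2, or None where the marker dropped out."""
    log_lik = {
        pop: sum(
            math.log(genotype_likelihood(g, freqs[aim]))
            for aim, g in genotypes.items()
            if g is not None  # missing markers contribute no information
        )
        for pop, freqs in REF_FREQS.items()
    }
    peak = max(log_lik.values())  # subtract the maximum for numerical stability
    weights = {pop: math.exp(v - peak) for pop, v in log_lik.items()}
    total = sum(weights.values())
    return {pop: w / total for pop, w in weights.items()}

print(ancestry_posterior({"aim1": 2, "aim2": None, "aim3": 1}))
```

With no typed markers at all, the function returns the uniform prior, which makes explicit that confidence comes only from the loci that were recovered.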
In forensic casework, ancestry prediction serves multiple purposes. It can help narrow suspect pools by providing investigators with information about the likely population background of an unknown individual. In missing persons cases, ancestry estimates can guide comparisons with databases and assist in facial reconstruction efforts. In mass disaster scenarios, ancestry inference can support victim identification when other biological or contextual information is lacking. Importantly, ancestry prediction is not intended to identify individuals directly but to provide investigative leads that complement other forensic evidence [134].
The accuracy of ancestry prediction depends on several factors, including the number and informativeness of SNPs used, the diversity of reference populations, and the complexity of individual genetic backgrounds. Admixed individuals, those with ancestry from multiple populations, pose particular challenges, as their genetic profiles may not align neatly with reference categories. To address this, modern forensic tools incorporate admixture analysis and ancestry deconvolution, estimating proportional ancestry contributions from different populations. These methods enhance the interpretability of results and reduce the risk of misclassification [135].
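The notion of proportional ancestry can be illustrated with a two-population maximum-likelihood sketch. It assumes that an admixed individual's allele frequency at each marker is a weighted mix of the two source frequencies, and it searches a grid of mixing proportions for the best fit. Dedicated admixture software uses more refined models and optimizers; the function names, the grid search, and the frequencies below are assumptions for illustration only.

```python
import math

def admixture_log_lik(q, genotypes, freqs_a, freqs_b):
    """Log-likelihood of the genotypes if a fraction q of ancestry comes from population A."""
    total = 0.0
    for aim, count in genotypes.items():
        f = q * freqs_a[aim] + (1 - q) * freqs_b[aim]  # mixed allele frequency
        total += math.log(math.comb(2, count) * f ** count * (1 - f) ** (2 - count))
    return total

def estimate_proportion(genotypes, freqs_a, freqs_b, steps=100):
    """Grid search for the proportion q in [0, 1] that maximizes the likelihood."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda q: admixture_log_lik(q, genotypes, freqs_a, freqs_b))

# Hypothetical frequencies, strictly between 0 and 1 so the logarithm is defined.
aims = ["aim1", "aim2", "aim3", "aim4"]
freqs_a = {aim: 0.9 for aim in aims}
freqs_b = {aim: 0.1 for aim in aims}

heterozygous_profile = {aim: 1 for aim in aims}
print(estimate_proportion(heterozygous_profile, freqs_a, freqs_b))
```

With only a handful of markers the likelihood surface is flat, so the estimate carries wide uncertainty. This is one reason panel size and marker informativeness matter for admixed individuals.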
As forensic genomics continues to evolve, ancestry prediction from biological traces, especially LT-DNA, will play an increasingly important role in investigative workflows. With ongoing improvements in marker panels, sequencing technologies, and computational methods, the ability to infer ancestry from even the most challenging samples is becoming more accurate and accessible, offering valuable insights in the pursuit of justice.

4. Integrating ML with LT-DNA in FDP

The integration of ML into FDP represents a transformative advancement in the ability to predict EVCs from genetic material, particularly in challenging cases involving LT-DNA. LT-DNA samples, often recovered from trace biological material such as touch DNA or degraded remains, contain minimal quantities of DNA and are prone to allelic dropout, contamination, and stochastic effects. These limitations pose significant challenges for traditional statistical models, which often rely on complete and high-quality genotypic data. ML, with its capacity to handle high-dimensional, noisy, and incomplete datasets, offers a powerful solution for enhancing phenotype prediction from LT-DNA [95,136]. It is important to clarify that we define ML as a class of algorithms capable of learning complex, often non-linear relationships from data without relying on pre-specified parametric forms. Examples include Random Forests, Support Vector Machines, Gradient Boosting, and deep neural networks. In contrast, traditional statistical models, such as logistic regression, assume a fixed functional form and are widely used in operational FDP systems (e.g., IrisPlex, HIrisPlex-S, VISAGE). While logistic regression is sometimes grouped under supervised learning, it is not considered ML in the contemporary sense of high-dimensional, non-linear modeling.

4.1. Common ML Algorithms and Their Forensic Applications

ML models are particularly well-suited to forensic genomics due to several key factors: the high dimensionality of genomic data, the complex and non-linear relationships between SNPs and phenotypic traits, and the need to integrate diverse data types, including genotype, gene expression, and epigenetic modifications such as DNA methylation. Commonly used ML algorithms in FDP include Random Forests (RFs), which are valued for their robustness and interpretability; Support Vector Machines (SVMs), which perform well with small sample sizes and high-dimensional data; Gradient Boosting Machines (GBMs), which often outperform RFs in accuracy; and deep learning models such as neural networks (NNs), which are capable of modeling intricate genotype–phenotype relationships. Simple models like k-nearest neighbors (KNN) can also be effective in small datasets, particularly when combined with feature selection techniques [137,138]. Recent comparative studies provide quantitative evidence of ML performance in FDP. Katsara et al. [95] evaluated RFs, SVMs, and GBMs for eye color prediction using HIrisPlex SNPs in 1200 individuals. RF achieved an AUC of 0.93 for blue/brown classification, outperforming SVMs (AUC = 0.91) and logistic regression (AUC = 0.89) under 10-fold cross-validation. For skin color prediction, Zaorska et al. [91] applied neural networks to a Polish cohort (n = 800), reporting precision = 0.82 and recall = 0.79 for light pigmentation, compared to RF precision = 0.78 and recall = 0.75. VISAGE validation studies further demonstrated that ML-based genotype imputation improved SNP recovery from LT-DNA samples by 12–15%, enabling phenotype prediction from partial profiles. These findings underscore the advantage of ML in handling incomplete and noisy data, particularly in LT-DNA contexts (Table 4).
While deep learning models achieved slightly higher accuracy for complex traits, their lack of interpretability raises concerns for forensic admissibility, making RF and SVM preferred in operational contexts.
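Because the comparisons above are expressed as AUC values, it may help to recall what the metric measures. The following minimal sketch computes AUC as the probability that a randomly chosen positive sample (e.g., a blue-eyed individual) receives a higher predicted score than a randomly chosen negative one, with ties counted as one half. The labels and scores in the usage note are invented for illustration and are not taken from any cited study.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

For example, `roc_auc([1, 1, 0, 0, 1, 0], [0.9, 0.8, 0.7, 0.3, 0.6, 0.2])` counts 8 correctly ordered pairs out of 9, giving an AUC of about 0.89.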
A critical challenge in applying ML to FDP is data imbalance and overfitting. Most training datasets, such as HIrisPlex and VISAGE, are predominantly composed of individuals of European ancestry, which can bias predictions and reduce accuracy in admixed or underrepresented populations [95,139]. Overfitting occurs when models capture noise or population-specific patterns rather than generalizable features, leading to poor performance on forensic casework samples.
To mitigate imbalance and overfitting, strategies such as feature selection (e.g., LASSO, recursive feature elimination) and regularization techniques (ridge regression, dropout layers in neural networks) are employed. These approaches improve generalization and reduce variance when working with high-dimensional genomic data.
These approaches collectively enhance robustness when working with LT-DNA, where incomplete and noisy profiles are common. However, further research is needed to validate these methods across diverse populations and forensic-grade datasets.
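The difference between LASSO-style feature selection and ridge-style regularization can be made concrete under a simplifying assumption. If the predictors are orthonormal, the LASSO solution is the soft-thresholded least-squares coefficient and the ridge solution is a proportional shrinkage of it. The sketch below shows both operators; the coefficient values in the usage note are hypothetical and do not correspond to real SNP effects.

```python
def soft_threshold(beta, lam):
    """LASSO update under an orthonormal design: shrink by lam and set small effects to zero."""
    if beta > lam:
        return beta - lam
    if beta < -lam:
        return beta + lam
    return 0.0

def ridge_shrink(beta, lam):
    """Ridge update under an orthonormal design: shrink proportionally, never exactly to zero."""
    return beta / (1.0 + lam)
```

Applied to hypothetical coefficients `[1.2, -0.05, 0.3, 0.02]` with `lam = 0.1`, soft-thresholding removes the two weak effects entirely, whereas ridge shrinkage keeps all four predictors with slightly reduced weights. This is why L1 penalties are used for marker selection and L2 penalties for variance reduction.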
To provide a concise overview, Table 5 summarizes commonly used ML algorithms in FDP, their input data types, dataset sizes, predicted traits, and reported performance metrics. These results highlight that Random Forest and Gradient Boosting consistently achieve high AUC values for eye color prediction, while neural networks show promise for complex traits such as skin pigmentation. Deep learning approaches applied to facial morphology explain only a modest proportion of variance (10–20%), underscoring the need for larger and more diverse training datasets. For age estimation, regularized regression models using methylation data achieve mean absolute deviations of approximately 3–4 years, demonstrating their forensic utility.
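For readers less familiar with the age-estimation metric, the mean absolute deviation is simply the average absolute difference between predicted and chronological age. A minimal sketch follows; the ages in the usage note are invented.

```python
def mean_absolute_deviation(true_ages, predicted_ages):
    """Average absolute difference (in years) between chronological and predicted ages."""
    if len(true_ages) != len(predicted_ages) or not true_ages:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(true_ages, predicted_ages)) / len(true_ages)
```

For instance, `mean_absolute_deviation([25, 40, 60, 33], [28, 37, 64.5, 31.5])` returns 3.0 years.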

4.2. Logistic Regression vs. Random Forest vs. Deep Learning in FDP: Accuracy, Data Requirements, and Admissibility

Operational FDP panels such as IrisPlex, HIrisPlex-S, and VISAGE rely on multinomial logistic regression because of its interpretability, stability under limited SNP sets, and extensive forensic validation. In contrast, RF, SVM, GBM, and deep learning architectures (CNNs, GANs) are increasingly explored in research for modeling complex, non-linear genotype–phenotype relationships and integrating diverse data types, but they remain largely experimental in operational pipelines.
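The interpretability of multinomial logistic regression comes from its simple form: each trait category receives a linear score from the allele counts, and the scores are converted to probabilities with a softmax. The sketch below illustrates this, assuming the last category is the reference with a score of zero. All intercepts and coefficients passed to it are hypothetical placeholders, not the published parameters of any operational panel.

```python
import math

def predict_probabilities(allele_counts, intercepts, coefficients):
    """Category probabilities from minor-allele counts (0, 1 or 2 per SNP).

    intercepts[k] and coefficients[k] describe non-reference category k;
    the reference category is appended last with a fixed score of 0.
    """
    scores = [b0 + sum(b * x for b, x in zip(bs, allele_counts))
              for b0, bs in zip(intercepts, coefficients)]
    scores.append(0.0)                        # reference category
    m = max(scores)                           # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]
```

Because every coefficient maps one SNP to one category, the contribution of each marker to a reported probability can be inspected directly, which is the property that courts and validation studies rely on.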
Comparative evidence suggests that ML models can offer incremental accuracy gains under controlled conditions. For example, RF achieved AUC ≈ 0.93 for eye color prediction, outperforming SVM (≈0.91) and logistic regression (≈0.89) in cross-validation on HIrisPlex SNPs [95]. Neural networks achieved precision ≈ 0.82 and recall ≈ 0.79 for skin color in a Polish cohort, slightly higher than RF (precision ≈ 0.78, recall ≈ 0.75) [91]. However, these improvements often diminish in LT-DNA casework due to allelic dropout and incomplete profiles. Facial morphology prediction remains particularly challenging, with deep learning explaining only 10–20% of variance in research cohorts [7,109,110,111], which is insufficient for individual-level reconstruction.
Modeling assumptions differ markedly. Logistic regression assumes a fixed parametric form and works well with modest sample sizes, making it straightforward to validate. RF, SVM, and GBM capture non-linear interactions and tolerate incomplete data but require larger, diverse datasets and careful regularization to avoid overfitting [95]. Deep learning demands thousands of samples and rich paired inputs (e.g., genotype and 3D facial scans), otherwise risking spurious, population-specific signals [7,109,110,111].
Overfitting and population bias remain critical concerns. FDP training datasets are often European-ancestry-heavy, which can skew predictions and reduce generalizability in admixed populations [95,135]. Cross-validation alone is insufficient; external and inter-laboratory validation on forensic-grade LT-DNA is essential [44,59]. Imputation of missing SNPs using ML can improve completeness but must include uncertainty estimates, and imputed calls should never be reported as hard genotypes [139,140,141].
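One way to respect the requirement that imputed calls carry uncertainty is to keep the full genotype posterior and derive a dosage from it, instead of collapsing it to a hard genotype. The following minimal sketch shows this; the field names and the 0.9 confidence threshold are illustrative assumptions, not values from any validated protocol.

```python
def summarize_imputed_call(posteriors, min_confidence=0.9):
    """Summarize an imputed SNP from its posteriors (P(0), P(1), P(2) copies of the counted allele)."""
    if len(posteriors) != 3 or abs(sum(posteriors) - 1.0) > 1e-6:
        raise ValueError("expected three posterior probabilities summing to 1")
    dosage = posteriors[1] + 2.0 * posteriors[2]           # expected allele count
    best = max(range(3), key=lambda g: posteriors[g])      # most likely genotype
    return {
        "imputed": True,                                   # always flagged, never reported as typed
        "dosage": dosage,
        "most_likely_genotype": best,
        "max_posterior": posteriors[best],
        "low_confidence": posteriors[best] < min_confidence,
    }
```

A downstream prediction model can then consume the dosage (here a value between 0 and 2) and the report can state explicitly which loci were imputed and with what confidence.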
Interpretability and admissibility are decisive for forensic use. Courts favor transparent methods with known error rates and peer-reviewed acceptance [14]. Logistic regression meets these standards; RF and GBM can be partially explained using feature importance and SHAP/LIME [136], but these tools are not yet standardized for legal contexts. Deep learning poses greater interpretability challenges, limiting its current admissibility.
Logistic regression remains the operational standard for FDP due to its interpretability and validation history. ML approaches are promising for research but remain experimental until broader validation, inclusive datasets, and courtroom-ready interpretability are achieved.

4.3. Future Directions and Operational Readiness

One of the most critical applications of ML in LT-DNA analysis is genotype imputation. Due to dropout events, many SNPs may be missing in LT-DNA samples. Established imputation tools such as Beagle and IMPUTE2, which rely on statistical haplotype models, and newer deep learning-based networks can reconstruct missing genotypes with high accuracy, enabling more complete input data for phenotype prediction. Additionally, ML models can be more tolerant of noise and missing data than traditional approaches. Algorithms like RF and GBM can manage incomplete datasets by focusing on the most informative loci, while feature selection methods such as LASSO (Least Absolute Shrinkage and Selection Operator) and recursive feature elimination help reduce dimensionality and improve model performance [140].
Another promising strategy is the use of synthetic data augmentation to enhance model training. Techniques such as Synthetic Minority Oversampling Technique (SMOTE) or Generative Adversarial Networks (GANs) can simulate plausible low-quality or rare-case samples, thereby improving the generalizability of ML models to real-world forensic scenarios. This is particularly valuable in FDP, where the availability of high-quality, annotated training data is often limited [139,141].
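The core step of SMOTE is simple: a synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. The sketch below implements that single step in plain Python, under the assumption that samples are numeric feature vectors. The function name and parameters are our own. Note that interpolating genotype counts yields non-integer values, so such samples resemble dosages more than observed genotypes.

```python
import random

def smote_sample(minority, k=2, rng=None):
    """Create one synthetic sample by interpolating between a minority sample and a near neighbour."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = rng or random.Random()
    base = rng.choice(minority)
    others = [x for x in minority if x is not base]
    # The k nearest neighbours of the base sample by squared Euclidean distance.
    neighbours = sorted(
        others, key=lambda x: sum((a - b) ** 2 for a, b in zip(base, x))
    )[:k]
    neighbour = rng.choice(neighbours)
    gap = rng.random()                         # position along the segment, in [0, 1)
    return [a + gap * (b - a) for a, b in zip(base, neighbour)]
```

Synthetic samples generated this way stay within the region spanned by existing minority samples, so they can rebalance a training set but cannot substitute for genuinely new reference data from underrepresented populations.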
Operational tools such as HIrisPlex and HIrisPlex-S already predict eye, hair, and skin color from SNP panels using logistic regression and probabilistic models. However, as FDP expands to include more complex traits such as facial morphology, freckles, sun sensitivity, and age estimation, ML is likely to become increasingly important. For example, deep learning models trained on large datasets of paired genotype and 3D facial imaging data have shown promise in predicting facial structure, while epigenetic clocks based on DNA methylation patterns use regression models to estimate chronological age with high accuracy, even from degraded or LT-DNA samples [95,142].
To ensure forensic reliability, ML models must be trained and validated on diverse, well-annotated datasets such as the 1000 Genomes Project or the UK Biobank. Cross-validation and external validation using real forensic case data are essential to assess model robustness under practical constraints. Moreover, ethical considerations must guide the deployment of ML in forensic contexts, particularly regarding privacy, consent, and the communication of probabilistic predictions [143].
In summary, the integration of ML with LT-DNA analysis significantly enhances the predictive power and applicability of FDP. By enabling accurate phenotype inference from compromised samples, ML-driven FDP provides valuable investigative leads in cases where traditional methods fall short, marking a new era in forensic science.

5. Ethical Frameworks, Limitations, and Future Directions in FDP from LT-DNA

FDP has emerged as a powerful tool for predicting EVCs such as eye, hair, and skin color, as well as ancestry and age, from biological traces (Figure 3). However, when applied to LT-DNA samples, those containing minimal and often degraded genetic material, FDP faces significant technical, ethical, and interpretive challenges [144]. LT-DNA is particularly vulnerable to allelic dropout, contamination, and stochastic amplification errors, which can compromise the accuracy and reliability of phenotype predictions. These limitations are especially pronounced in complex traits like facial morphology, where prediction models rely on the integration of hundreds of genetic markers and are sensitive to even minor genotyping errors [1,145].
Prediction accuracy varies widely depending on the trait and the quality of the DNA. Eye color prediction remains the most robust, with AUC values exceeding 0.9 in European populations. Hair color and skin tone predictions are moderately accurate, while traits such as facial structure, freckles, and sun sensitivity are more difficult to predict reliably, particularly from LT-DNA. Moreover, most predictive models have been trained on individuals of European ancestry, limiting their applicability and accuracy in admixed or underrepresented populations. This population bias can lead to skewed or misleading results when applied in diverse forensic contexts [135] (Table 6).
Ethical and legal concerns further complicate the use of FDP in LT-DNA scenarios. Privacy remains a central issue, particularly under frameworks such as the EU General Data Protection Regulation (GDPR), which classifies genetic data as a special category requiring strict compliance with principles of lawfulness, purpose limitation, and data minimization. FDP applications in surveillance or public appeals raise heightened risks because they infer sensitive personal traits—such as ancestry or appearance—from biological material without consent. The secondary use of publicly available genomic data or biobank resources for forensic purposes introduces additional ethical challenges, including unclear consent scope, governance gaps, and potential erosion of public trust. Recent European debates on the European Health Data Space and U.S. discussions around investigative genetic genealogy highlight the tension between investigative utility and individual rights [69,146,147,148].
Algorithmic bias is another critical concern: ML models trained predominantly on European datasets often underperform for admixed or underrepresented populations, leading to inaccurate or misleading predictions. This bias can amplify inequities and undermine the legitimacy of FDP in diverse forensic contexts. To mitigate these risks, inclusive reference datasets, transparent validation across populations, and clear reporting of limitations are essential [149,150].
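Transparent validation across populations can start with something as simple as reporting performance separately for each population group instead of a single pooled figure. A minimal sketch follows, assuming each validation record carries a group label, the true phenotype, and the predicted phenotype. The group names and records in the usage note are hypothetical.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """Per-group accuracy from (group, true_label, predicted_label) records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, predicted in records:
        total[group] += 1
        correct[group] += int(truth == predicted)
    return {group: correct[group] / total[group] for group in total}
```

For example, the records `[("A", "blue", "blue"), ("A", "brown", "brown"), ("B", "blue", "brown"), ("B", "brown", "brown")]` give an accuracy of 1.0 for group A and 0.5 for group B, a gap that a pooled accuracy of 0.75 would hide.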
Court admissibility standards further demand rigorous attention. Under frameworks such as Daubert and Rule 702 in the U.S., and proportionality principles in Europe, FDP evidence must demonstrate scientific validity, known error rates, and peer-reviewed acceptance. Uncertainty should be communicated explicitly through probabilistic outputs, confidence intervals, and verbal scales, avoiding deterministic language that could mislead investigators or courts. Failure to convey uncertainty risks investigational bias or wrongful suspicion, particularly in high-stakes cases [14,151].
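As an illustration of how uncertainty might be reported, the sketch below computes a Wilson score confidence interval for an error rate observed in a validation study and maps a predicted probability to a verbal category. The Wilson interval is a standard formula. The verbal cut-offs, by contrast, are purely hypothetical placeholders that a laboratory would need to define and validate.

```python
import math

def wilson_interval(errors, n, z=1.96):
    """Wilson score interval for an error rate (z = 1.96 gives roughly 95% coverage)."""
    if n <= 0 or not 0 <= errors <= n:
        raise ValueError("require n > 0 and 0 <= errors <= n")
    p = errors / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

def verbal_label(probability):
    """Map a predicted probability to a verbal category (cut-offs are hypothetical)."""
    if probability >= 0.9:
        return "strong support"
    if probability >= 0.7:
        return "moderate support"
    return "inconclusive"
```

With 8 errors in 100 validation samples, the interval runs from roughly 4% to 15%, which conveys far more than the point estimate of 8% alone.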
Finally, public perception and informed consent remain pivotal. Surveys and stakeholder interviews in Europe and the U.S. reveal mixed attitudes toward FDP, with concerns about genetic privacy, discrimination, and misuse. Transparent communication of capabilities and limitations, coupled with robust governance and consent frameworks, is essential to maintain trust and ensure that FDP serves justice without compromising fundamental rights [152,153].
To address these concerns, ethical frameworks must evolve alongside technological advancements. Clear guidelines are needed to govern the use of FDP, particularly in cases involving LT-DNA, where the margin for error is higher. Transparency in reporting, including the communication of prediction probabilities and confidence intervals, is essential to prevent misuse or misinterpretation. Additionally, the development of inclusive models trained in diverse populations is critical to ensuring fairness and accuracy across global forensic applications [154].
To ensure that FDP is applied responsibly and transparently, especially in cases involving LT-DNA, it is essential to establish operational safeguards that address both scientific limitations and ethical risks. To operationalize these safeguards, we propose a checklist that outlines key considerations for the ethical deployment of FDP in forensic investigations (Figure 4).
Looking ahead, several promising directions are emerging to enhance FDP from LT-DNA. One key area is the integration of multi-omics data, combining genomics with transcriptomics, epigenomics, and proteomics, to enable more dynamic and context-sensitive predictions. For example, integrating DNA methylation profiles can improve age estimation, while gene expression data may offer insights into lifestyle or environmental exposures. Another promising direction is the development of portable and rapid phenotyping platforms, such as those based on Oxford Nanopore sequencing and edge-based ML inference, which could allow on-site analysis of LT-DNA in real time [155,156].
ML and artificial intelligence (AI) are also playing an increasingly significant role in FDP. ML models can manage noisy, incomplete data and are well-suited to the imputation of missing genotypes, a common issue in LT-DNA samples. Techniques such as SHAP values and LIME are being explored to improve model interpretability, which is crucial for courtroom credibility. Furthermore, synthetic data generation using generative adversarial networks (GANs) may help augment training datasets, improving model robustness in low-quality scenarios [157,158,159].
Ultimately, the future of FDP in LT-DNA contexts depends on balancing scientific innovation with ethical responsibility. By addressing current limitations, enhancing model inclusivity, and ensuring transparent communication, forensic science can harness the full potential of DNA phenotyping while safeguarding individual rights and public trust.

6. Conclusions

The intersection of LT-DNA analysis and ML is poised to revolutionize FDP. While ML approaches show promise for improving phenotype prediction from LT-DNA, most operational systems currently rely on logistic regression, which remains the forensic standard due to its interpretability and validation status. By enabling the prediction of EVCs from minimal and often degraded biological material, this integration expands the investigative potential of forensic science beyond traditional identity matching. ML models, with their ability to manage high-dimensional, noisy, and incomplete data, are particularly well-suited to the challenges posed by LT-DNA. They facilitate genotype imputation, enhance trait prediction accuracy, and support probabilistic interpretation of complex phenotypes such as facial morphology and age.
Despite these advances, significant limitations remain. Prediction accuracy varies by trait and is often reduced in LT-DNA due to allelic dropout and partial genotyping. Moreover, the majority of current models are trained on European-ancestry datasets, limiting their generalizability across diverse populations. Ethical concerns, including privacy, consent, and the risk of overinterpreting probabilistic predictions, must be addressed through robust legal and regulatory frameworks.
Looking ahead, the future of FDP in LT-DNA contexts lies in the integration of multi-omics data, the development of portable and rapid sequencing technologies, and the refinement of interpretable ML models. These innovations, coupled with efforts to diversify training datasets and standardize forensic protocols, will be essential for transitioning FDP from experimental research to routine forensic practice.
To strengthen future research and operational readiness, we recommend the following actions:
  • Validation across diverse populations: Systematically validate ML-FDP models on non-European and admixed cohorts to mitigate algorithmic bias and improve global applicability.
  • Integration of multi-omics data: Incorporate DNA methylation, transcriptomics, and proteomics into predictive frameworks to enhance accuracy for complex traits such as age and facial morphology.
  • Portable sequencing workflows: Develop rapid, field-deployable FDP solutions using portable sequencing platforms (e.g., Oxford Nanopore) combined with edge-based ML inference for on-site analysis.
  • Interpretability and transparency: Establish forensic ML interpretability standards using explainable AI tools (e.g., SHAP, LIME) to ensure transparency, accountability, and courtroom admissibility.
  • Global collaboration: Promote international initiatives to create inclusive reference datasets and harmonized validation protocols across laboratories, ensuring reproducibility and fairness.
These steps will accelerate the transition of FDP from experimental research to operational forensic practice while safeguarding ethical and legal principles.

Author Contributions

Conceptualization, F.S., E.D., C.P. and M.S.; methodology, F.S., E.D., M.E., M.F., M.C., C.P. and M.S.; validation, F.S., E.D., C.P. and M.S.; formal analysis, F.S., E.D., M.E., M.F., M.C., C.P. and M.S.; writing—original draft preparation, F.S. and E.D.; writing—review and editing, F.S., E.D., M.E., M.F., M.C., C.P. and M.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data were included in the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
FDP       Forensic DNA Phenotyping
LT-DNA    Low Template DNA
ML        Machine Learning
EVC       Externally Visible Characteristics
SNP       Single Nucleotide Polymorphism
STR       Short Tandem Repeat
GWAS      Genome-Wide Association Studies
NGS       Next-Generation Sequencing
MPS       Massively Parallel Sequencing
WGS       Whole-Genome Sequencing
PCR       Polymerase Chain Reaction
RFU       Relative Fluorescence Units
AT        Analytical Threshold
scDNA     Single-Cell DNA
PGS       Probabilistic Genotyping Software
AIMs      Ancestry-Informative Markers
CNN       Convolutional Neural Network
GAN       Generative Adversarial Network
SMOTE     Synthetic Minority Oversampling Technique
LASSO     Least Absolute Shrinkage and Selection Operator
PLSR      Partial Least Squares Regression
CpG       Cytosine-phosphate-Guanine (dinucleotide)
MAD       Mean Absolute Deviation
qPCR      Quantitative Polymerase Chain Reaction
VISAGE    VISible Attributes through GEnomics
AI        Artificial Intelligence
SHAP      SHapley Additive exPlanations
LIME      Local Interpretable Model-Agnostic Explanations

References

  1. Kayser, M.; Branicki, W.; Parson, W.; Phillips, C. Recent Advances in Forensic DNA Phenotyping of Appearance, Ancestry and Age. Forensic Sci. Int. Genet. 2023, 65, 102870.
  2. Haidar, M.; Mousawi, F.; Al-Matrouk, A.K. Forensic DNA Phenotyping Using Next-Generation Sequencing. In Next Generation Sequencing (NGS) Technology in DNA Analysis; Academic Press: Cambridge, MA, USA, 2024; pp. 289–310.
  3. Pośpiech, E.; Teisseyre, P.; Mielniczuk, J.; Branicki, W. Predicting Physical Appearance from DNA Data—Towards Genomic Solutions. Genes 2022, 13, 121.
  4. Vajpayee, K.; Shukla, R.K. DNA Phenotyping: The Technique of the Future. In Handbook of DNA Profiling; Springer: Singapore, 2022.
  5. Ge, J.; Budowle, B. Forensic Investigation Approaches of Searching Relatives in DNA Databases. J. Forensic Sci. 2021, 66, 430–443.
  6. Richmond, S.; Howe, L.J.; Lewis, S.; Stergiakouli, E.; Zhurov, A. Facial Genetics: A Brief Overview. Front. Genet. 2018, 9, 462.
  7. Claes, P.; Hill, H.; Shriver, M.D. Toward DNA-Based Facial Composites: Preliminary Results and Validation. Forensic Sci. Int. Genet. 2014, 13, 208–216.
  8. Ketsekioulafis, I.; Katsos, K.; Kolentinis, C.; Kouzos, D.; Moraitis, K.; Spiliopoulou, C.; Sakelliadis, E.I. Humanitarian Forensic Medicine: A Systematic Review. Int. J. Leg. Med. 2025, 139, 747–761.
  9. Kukla-Bartoszek, M.; Szargut, M.; Pośpiech, E.; Diepenbroek, M.; Zielińska, G.; Jarosz, A.; Piniewska-Róg, D.; Arciszewska, J.; Cytacka, S.; Spólnicka, M.; et al. The Challenge of Predicting Human Pigmentation Traits in Degraded Bone Samples with the MPS-Based HIrisPlex-S System. Forensic Sci. Int. Genet. 2020, 47, 102301.
  10. Fleckhaus, J.; Bugert, P.; Al-Rashedi, N.A.M.; Rothschild, M.A. Investigation of the Impact of Biogeographic Ancestry on DNA Methylation Based Age Predictions Comparing a Middle East and a Central European Population. Forensic Sci. Int. Genet. 2023, 67, 102923.
  11. Zieger, M.; Scudder, N. Ethical and Legal Reflections on Secondary Research Using Genetic Data Acquired for Criminal Investigation Purposes. Forensic Sci. Int. Genet. 2025, 75, 103178.
  12. Gareau-Léonard, A.; Mousseau, V.; Crispino, F.; Milot, E. Forensic DNA Phenotyping: Examining Knowledge and Operational View from Police Officers. Forensic Sci. Int. 2025, 10, 100586.
  13. Zieger, M. Forensic DNA Phenotyping in Europe: How Far May It Go? J. Law Biosci. 2022, 9, lsac024.
  14. Sessa, F.; Chisari, M.; Esposito, M.; Karaboue, M.A.A.; Salerno, M.; Cocimano, G. Ethical, Legal and Social Implications (ELSI) Regarding Forensic Genetic Investigations (FGIs). J. Acad. Ethics 2024, 23, 617–637.
  15. Huffman, K.; Ballantyne, J. Single Cell Genomics Applications in Forensic Science: Current State and Future Directions. iScience 2023, 26, 107961.
  16. van Oorschot, R.A.H.; Meakin, G.E.; Kokshoorn, B.; Goray, M.; Szkuta, B. DNA Transfer in Forensic Science: Recent Progress towards Meeting Challenges. Genes 2021, 12, 1766.
  17. Hicks, T.; Buckleton, J.; Castella, V.; Evett, I.; Jackson, G. A Logical Framework for Forensic DNA Interpretation. Genes 2022, 13, 957.
  18. Schulte, J.; Marciano, M.A.; Scheurer, E.; Schulz, I. A Systematic Approach to Improve Downstream Single-Cell Analysis for the DEPArrayTM Technology. J. Forensic Sci. 2023, 68, 1875–1893.
  19. Laurin, N.; Boulianne, H.; Frégeau, C. Comparative Analysis of Two Rapid DNA Technologies for the Processing of Blood and Saliva-Based Samples. Forensic Sci. Int. Genet. 2023, 67, 102928.
  20. Gittelson, S.; Steffen, C.R.; Coble, M.D. Low-Template DNA: A Single DNA Analysis or Two Replicates? Forensic Sci. Int. 2016, 264, 139–145.
  21. Chen, D.; Tan, M.; Xue, J.; Wu, M.; Song, J.; Wu, Q.; Liu, G.; Zheng, Y.; Xiao, Y.; Lv, M.; et al. Optimizing Analytical Thresholds for Low-Template DNA Analysis: Insights from Multi-Laboratory Negative Controls. Genes 2024, 15, 117.
  22. Martínez-Gómez, J.; Laso-Izquierdo, S.; Vera-Yánez, A.; Fernández-Serrano, J.J.; Gomes, C. Preliminary Results of Reduced Polymerase Chain Reaction (PCR) Volumes When Analysing Low Template DNA Samples with GlobalfilerTM and YfilerTM Plus Kits. DNA 2025, 5, 2.
  23. Ido, A.; Kirshenbaum, L.; Waiskopf, O.; Voskoboinik, L. Optimizing Amplification Threshold of Low Template DNA. J. Forensic Sci. 2025, 70, 1521–1526.
  24. Aljanahi, N.S.; Alketbi, S.K.; Almheiri, M.M.; Alshehhi, S.A.; Sanqoor, A.N.; Alghanim, H.J. Enhancing Trace DNA Profile Recovery in Forensic Casework Using the Amplicon RX Post-PCR Clean-up Kit. Sci. Rep. 2025, 15, 3324.
  25. Tozzo, P.; Mazzobel, E.; Marcante, B.; Delicati, A.; Caenazzo, L. Touch DNA Sampling Methods: Efficacy Evaluation and Systematic Review. Int. J. Mol. Sci. 2022, 23, 15541.
  26. Sessa, F.; Pomara, C.; Esposito, M.; Grassi, P.; Cocimano, G.; Salerno, M. Indirect DNA Transfer and Forensic Implications: A Literature Review. Genes 2023, 14, 2153.
  27. Costa, C.; Figueiredo, C.; Costa, S.; Ferreira, P.M.; Amorim, A.; Prieto, L.; Pinto, N. The Impact of Parameter Variation in the Quantification of Forensic Genetic Evidence. Sci. Rep. 2025, 15, 2524.
  28. Butler, J.M.; Iyer, H.; Press, R.; Taylor, M.K.; Vallone, P.M.; Willis, S. DNA Mixture Interpretation. 2024. Available online: https://nvlpubs.nist.gov/nistpubs/ir/2024/NIST.IR.8351.pdf (accessed on 10 December 2025).
  29. Sessa, F.; Salerno, M.; Bertozzi, G.; Messina, G.; Ricci, P.; Ledda, C.; Rapisarda, V.; Cantatore, S.; Turillazzi, E.; Pomara, C. Touch DNA: Impact of Handling Time on Touch Deposit and Evaluation of Different Recovery Techniques: An Experimental Study. Sci. Rep. 2019, 9, 9542.
  30. Sessa, F.; Panepinto, E.; Salerno, M.; Chisari, M.; Esposito, M.; Cocimano, G.; Pomara, C. Impact of Liquid Antibacterial Soap and Hand Sanitizer on DNA Transfer in Forensic Investigations: An Experimental Study. Forensic Sci. Res. 2025, 10, owae068.
  31. Kanokwongnuwut, P.; Martin, B.; Taylor, D.; Kirkbride, K.P.; Linacre, A. How Many Cells Are Required for Successful DNA Profiling? Forensic Sci. Int. Genet. 2021, 51, 102453.
  32. Kallupurackal, V.; Kummer, S.; Voegeli, P.; Kratzer, A.; Dørum, G.; Haas, C.; Hess, S. Sampling Touch DNA from Human Skin Following Skin-to-Skin Contact in Mock Assault Scenarios—A Comparison of Nine Collection Methods. J. Forensic Sci. 2021, 66, 1889–1900.
  33. Fantinato, C.; Gill, P.; Fonneløp, A.E. Investigative Use of Human Environmental DNA in Forensic Genetics. Forensic Sci. Int. Genet. 2024, 70, 10302.
  34. Rayimoglu, G.; Yonar, F.C.; Anılanmert, B. From One Strand Dyed/Undyed Hair With/Without Root to Fast and Successful STR Profiling and Evaluation With Principle Component Analysis. Electrophoresis 2025, 46, 1292–1307.
  35. Peng, D.; Wang, N.; Zang, Y.; Liu, Z.; Liu, Z.; Geng, J.; Cong, B.; Sun, H.; Wu, R. Concurrent Genotyping of Mitochondrial DNA and Nuclear DNA in Rootless Hair Shafts and Blood Samples for Enhanced Analysis. Forensic Sci. Int. Genet. 2025, 75, 103176.
  36. Admire, L.; Carson, M.; Crawford, K.; Nguyen, E.; Daniels, T. Hair Root Staining with Hematoxylin: Increasing the Rate of Obtaining DNA Profiles in Forensic Casework. Forensic Sci. Int. 2023, 343, 111544.
  37. Shrivastava, P.; Kumawat, R.K.; Kushwaha, P.; Rana, M. Biological Sources of DNA: The Target Materials for Forensic DNA Typing. In Handbook of DNA Profiling; Springer Nature: Heidelberg, Germany, 2022.
  38. Madden, I.; Taylor, D.; Mitchell, N.; Goray, M.; Henry, J. Predicting Probative Levels of Touch DNA on Tapelifts Using DiamondTM Nucleic Acid Dye. Forensic Sci. Int. Genet. 2024, 70, 103024.
  39. Haarkötter, C.; Vinueza-Espinosa, D.C.; Gálvez, X.; Saiz, M.; Medina-Lozano, M.I.; Lorente, J.A.; Álvarez, J.C. A Comparison between Petrous Bone and Tooth, Femur and Tibia DNA Analysis from Degraded Skeletal Remains. Electrophoresis 2023, 44, 1559–1568.
  40. Di Stefano, B.; Zupanič Pajnič, I.; Concato, M.; Bertoglio, B.; Calvano, M.G.; Sorçaburu Ciglieri, S.; Bosetti, A.; Grignani, P.; Addoum, Y.; Vetrini, R.; et al. Evaluation of a New DNA Extraction Method on Challenging Bone Samples Recovered from a WWII Mass Grave. Genes 2024, 15, 672.
  41. Sapan, T.Ü.; Karaboğa, N. Determination of DNA Recovery from Human Teeth Exposed to Various Acids. Int. J. Leg. Med. 2025, 139, 1453–1463.
  42. Varela Morillas, Á.; Suhling, K.; Frascione, N. Unlocking the Potential of Forensic Traces: Analytical Approaches to Generate Investigative Leads. Sci. Justice 2022, 62, 310–326.
  43. Oosthuizen, T.; Howes, L.M. The Development of Forensic DNA Analysis: New Debates on the Issue of Fundamental Human Rights. Forensic Sci. Int. Genet. 2022, 56, 102606.
  44. Butler, J.M. Recent Advances in Forensic Biology and Forensic DNA Typing: INTERPOL Review 2019–2022. Forensic Sci. Int. 2023, 6, 100311.
  45. Fabbri, M.; Alfieri, L.; Mazdai, L.; Frisoni, P.; Gaudio, R.M.; Neri, M. Application of Forensic DNA Phenotyping for Prediction of Eye, Hair and Skin Colour in Highly Decomposed Bodies. Healthcare 2023, 11, 647.
  46. Kayser, M. Forensic DNA Phenotyping: Predicting Human Appearance from Crime Scene Material for Investigative Purposes. Forensic Sci. Int. Genet. 2015, 18, 33–48.
  47. Kukla-Bartoszek, M.; Teisseyre, P.; Pośpiech, E.; Karłowska-Pik, J.; Zieliński, P.; Woźniak, A.; Boroń, M.; Dąbrowski, M.; Zubańska, M.; Jarosz, A.; et al. Searching for Improvements in Predicting Human Eye Colour from DNA. Int. J. Leg. Med. 2021, 135, 2175–2187.
  48. Brancato, D.; Coniglio, E.; Bruno, F.; Agostini, V.; Saccone, S.; Federico, C. Forensic DNA Phenotyping: Genes and Genetic Variants for Eye Color Prediction. Genes 2023, 14, 1604.
  49. Carratto, T.M.T.; Guimarães de Oliveira, M.L.; Mendes-Junior, C.T. Forensic DNA Phenotyping in the Next-Generation Sequencing Era. In Next Generation Sequencing (NGS) Technology in DNA Analysis; Academic Press: Cambridge, MA, USA, 2023; ISBN 9780323991445.
  50. Liu, F.; Wen, B.; Kayser, M. Colorful DNA Polymorphisms in Humans. Semin. Cell Dev. Biol. 2013, 24, 562–575.
  51. Terrado-Ortuño, N.; May, P. Forensic DNA Phenotyping: A Review on SNP Panels, Genotyping Techniques, and Prediction Models. Forensic Sci. Res. 2025, 10, owae013.
  52. Kukla-Bartoszek, M.; Pośpiech, E.; Woźniak, A.; Boroń, M.; Karłowska-Pik, J.; Teisseyre, P.; Zubańska, M.; Bronikowska, A.; Grzybowski, T.; Płoski, R.; et al. DNA-Based Predictive Models for the Presence of Freckles. Forensic Sci. Int. Genet. 2019, 42, 252–259.
  53. Naqvi, S.; Hoskens, H.; Wilke, F.; Weinberg, S.M.; Shaffer, J.R.; Walsh, S.; Shriver, M.D.; Wysocka, J.; Claes, P. Decoding the Human Face: Progress and Challenges in Understanding the Genetics of Craniofacial Morphology. Annu. Rev. Genom. Hum. Genet. 2022, 23, 383–412. [Google Scholar] [CrossRef]
  54. Bell, C.G.; Lowe, R.; Adams, P.D.; Baccarelli, A.A.; Beck, S.; Bell, J.T.; Christensen, B.C.; Gladyshev, V.N.; Heijmans, B.T.; Horvath, S.; et al. DNA Methylation Aging Clocks: Challenges and Recommendations. Genome Biol. 2019, 20, 249. [Google Scholar] [CrossRef] [PubMed]
  55. Cheung, E.Y.Y.; Gahan, M.E.; McNevin, D. Prediction of Biogeographical Ancestry from Genotype: A Comparison of Classifiers. Int. J. Leg. Med. 2017, 131, 901–912. [Google Scholar] [CrossRef]
  56. Chaitanya, L.; Breslin, K.; Zuñiga, S.; Wirken, L.; Pośpiech, E.; Kukla-Bartoszek, M.; Sijen, T.; Knijff, P.d.; Liu, F.; Branicki, W.; et al. The HIrisPlex-S System for Eye, Hair and Skin Colour Prediction from DNA: Introduction and Forensic Developmental Validation. Forensic Sci. Int. Genet. 2018, 35, 123–135. [Google Scholar] [CrossRef] [PubMed]
  57. Walsh, S.; Liu, F.; Wollstein, A.; Kovatsi, L.; Ralf, A.; Kosiniak-Kamysz, A.; Branicki, W.; Kayser, M. The HIrisPlex System for Simultaneous Prediction of Hair and Eye Colour from DNA. Forensic Sci. Int. Genet. 2013, 7, 98–115. [Google Scholar] [CrossRef] [PubMed]
  58. Walsh, S.; Liu, F.; Ballantyne, K.N.; Van Oven, M.; Lao, O.; Kayser, M. IrisPlex: A Sensitive DNA Tool for Accurate Prediction of Blue and Brown Eye Colour in the Absence of Ancestry Information. Forensic Sci. Int. Genet. 2011, 5, 170–180. [Google Scholar] [CrossRef]
  59. Xavier, C.; de la Puente, M.; Mosquera-Miguel, A.; Freire-Aradas, A.; Kalamara, V.; Ralf, A.; Revoir, A.; Gross, T.E.; Schneider, P.M.; Ames, C.; et al. Development and Inter-Laboratory Evaluation of the VISAGE Enhanced Tool for Appearance and Ancestry Inference from DNA. Forensic Sci. Int. Genet. 2022, 61, 102779. [Google Scholar] [CrossRef]
  60. Bukyya, J.L.; Tejasvi, M.L.A.; Avinash, A.; Chanchala, H.P.; Talwade, P.; Afroz, M.M.; Pokala, A.; Neela, P.K.; Shyamilee, T.K.; Srisha, V. DNA Profiling in Forensic Science: A Review. Glob. Med. Genet. 2021, 08, 135–143. [Google Scholar] [CrossRef]
  61. Mackey, D.A. What Colour Are Your Eyes? Teaching the Genetics of Eye Colour & Colour Vision. Edridge Green Lecture RCOphth Annual Congress Glasgow May 2019. Eye 2022, 36, 704–715. [Google Scholar] [CrossRef]
  62. Shapturenko, M.N.; Vakula, S.I.; Kandratsiuk, A.V.; Gudievskaya, I.G.; Shinkevich, M.V.; Luhauniou, A.U.; Borovko, S.R.; Marchenko, L.N.; Kilchevsky, A.V. HERC2 (Rs12913832) and OCA2 (Rs1800407) Genes Polymorphisms in Relation to Iris Color Variation in Belarusian Population. Forensic Sci. Int. Genet. Suppl. Ser. 2019, 7, 331–332. [Google Scholar] [CrossRef]
  63. Salvo, N.M.; Andersen, J.D.; Janssen, K.; Meyer, O.L.; Berg, T.; Børsting, C.; Olsen, G.H. Association between Variants in the OCA2-HERC2 Region and Blue Eye Colour in HERC2 Rs12913832 AA and AG Individuals. Genes 2023, 14, 698. [Google Scholar] [CrossRef] [PubMed]
  64. Rafati, A.; Hosseini, S.M.; Far, H.Z. Using the Irisplex System for Eye Color Prediction on Skeletal Remaining from the Past 30 Years. J. Forensic Sci. Med. 2023, 9, 367–370. [Google Scholar] [CrossRef]
  65. Becher, D.; Jmel, H.; Kheriji, N.; Sarno, S.; Kefi, R. Genetic Landscape of Forensic DNA Phenotyping Markers among Mediterranean Populations. Forensic Sci. Int. 2024, 354, 111906. [Google Scholar] [CrossRef]
  66. Martinez-Cadenas, C.; Penãa-Chilet, M.; Ibarrola-Villava, M.; Ribas, G. Gender Is a Major Factor Explaining Discrepancies in Eye Colour Prediction Based on HERC2/OCA2 Genotype and the IrisPlex Model. Forensic Sci. Int. Genet. 2013, 7, 453–460. [Google Scholar] [CrossRef]
  67. Carratto, T.M.T.; Moraes, V.M.S.; Recalde, T.S.F.; de Oliveira, M.L.G.; Mendes-Junior, C.T. Applications of Massively Parallel Sequencing in Forensic Genetics. Genet. Mol. Biol. 2022, 45, e20220077. [Google Scholar] [CrossRef]
  68. Atwood, L.; Raymond, J.; Sears, A.; Bell, M.; Daniel, R. From Identification to Intelligence: An Assessment of the Suitability of Forensic DNA Phenotyping Service Providers for Use in Australian Law Enforcement Casework. Front. Genet. 2021, 11, 568701. [Google Scholar] [CrossRef]
  69. Sessa, F. Secondary transfer and its involvement in forensic investigation. Investig. Predict. DNA 2026, 289–300. [Google Scholar] [CrossRef]
  70. Bhoyar, L.; Mehar, P.; Chavali, K. An Overview of DNA Degradation and Its Implications in Forensic Caseworks. Egypt. J. Forensic Sci. 2024, 14, 15. [Google Scholar] [CrossRef]
  71. Pośpiech, E.; Chen, Y.; Kukla-Bartoszek, M.; Breslin, K.; Aliferi, A.; Andersen, J.D.; Ballard, D.; Chaitanya, L.; Freire-Aradas, A.; van der Gaag, K.J.; et al. Towards Broadening Forensic DNA Phenotyping beyond Pigmentation: Improving the Prediction of Head Hair Shape from DNA. Forensic Sci. Int. Genet. 2018, 37, 241–251. [Google Scholar] [CrossRef]
  72. van Daal, A. The Genetic Basis of Human Pigmentation. Forensic Sci. Int. Genet. Suppl. Ser. 2008, 1, 541–543. [Google Scholar] [CrossRef]
  73. Suarez, P.; Baumer, K.; Hall, D. Further Insight into the Global Variability of the OCA2-HERC2 Locus for Human Pigmentation from Multiallelic Markers. Sci. Rep. 2021, 11, 22530. [Google Scholar] [CrossRef]
  74. Andrade, E.S.; Fracasso, N.C.A.; Strazza Júnior, P.S.; Simões, A.L.; Mendes-Junior, C.T. Associations of OCA2-HERC2 SNPs and Haplotypes with Human Pigmentation Characteristics in the Brazilian Population. Leg. Med. 2017, 24, 78–83. [Google Scholar] [CrossRef]
  75. Branicki, W.; Brudnik, U.; Wojas-Pelc, A. Interactions between HERC2, OCA2 and MC1R May Influence Human Pigmentation Phenotype. Ann. Hum. Genet. 2009, 73, 160–170. [Google Scholar] [CrossRef]
  76. Le, L.; Escobar, I.E.; Ho, T.; Lefkovith, A.J.; Latteri, E.; Haltaufderhyde, K.D.; Dennis, M.K.; Plowright, L.; Sviderskaya, E.V.; Bennett, D.C.; et al. SLC45A2 Protein Stability and Regulation of Melanosome PH Determine Melanocyte Pigmentation. Mol. Biol. Cell 2020, 31, 2687–2702. [Google Scholar] [CrossRef]
  77. Pośpiech, E.; Kukla-Bartoszek, M.; Karłowska-Pik, J.; Zieliński, P.; Woźniak, A.; Boroń, M.; Dąbrowski, M.; Zubańska, M.; Jarosz, A.; Grzybowski, T.; et al. Exploring the Possibility of Predicting Human Head Hair Greying from DNA Using Whole-Exome and Targeted NGS Data. BMC Genom. 2020, 21, 538. [Google Scholar] [CrossRef]
  78. Branicki, W.; Liu, F.; Van Duijn, K.; Draus-Barini, J.; Pośpiech, E.; Walsh, S.; Kupiec, T.; Wojas-Pelc, A.; Kayser, M. Model-Based Prediction of Human Hair Color Using DNA Variants. Hum. Genet. 2011, 129, 443–454. [Google Scholar] [CrossRef]
  79. Zorina-Lichtenwalter, K.; Lichtenwalter, R.N.; Zaykin, D.V.; Parisien, M.; Gravel, S.; Bortsov, A.; Diatchenko, L. A Study in Scarlet: MC1R as the Main Predictor of Red Hair and Exemplar of the Flip-Flop Effect. Hum. Mol. Genet. 2019, 28, 2093–2106. [Google Scholar] [CrossRef]
  80. Kukla-Bartoszek, M.; Pośpiech, E.; Spólnicka, M.; Karłowska-Pik, J.; Strapagiel, D.; Żądzińska, E.; Rosset, I.; Sobalska-Kwapis, M.; Słomka, M.; Walsh, S.; et al. Investigating the Impact of Age-Depended Hair Colour Darkening during Childhood on DNA-Based Hair Colour Prediction with the HIrisPlex System. Forensic Sci. Int. Genet. 2018, 36, 26–33. [Google Scholar] [CrossRef]
  81. Navarro-López, B.; Baeta, M.; Moreno-López, O.; Olalde, I.; de Pancorbo, M.M.; Suárez-Ulloa, V.; Martos-Fernández, R.; Martínez-Jarreta, B.; Jiménez, S. Exploring Eye, Hair, and Skin Pigmentation in a Spanish Population: Insights from Hirisplex-S Predictions. Genes 2024, 15, 1330. [Google Scholar] [CrossRef]
  82. Browne, T.N.; Freeman, M. Next Generation Sequencing: Forensic Applications and Policy Considerations. WIREs Forensic Sci. 2024, 6, e1531. [Google Scholar] [CrossRef]
  83. Cabrejas-Olalla, A.; Athanasiadis, G.; Jørgensen, F.G.; Cheng, J.Y.; Kjærgaard, P.C.; Schierup, M.H.; Mailund, T. Genetic Predictions of Eye and Hair Colour in the Danish Population. Forensic Sci. Int. Genet. 2025, 78, 103267. [Google Scholar] [CrossRef]
  84. Breslin, K.; Wills, B.; Ralf, A.; Ventayol Garcia, M.; Kukla-Bartoszek, M.; Pospiech, E.; Freire-Aradas, A.; Xavier, C.; Ingold, S.; de La Puente, M.; et al. HIrisPlex-S System for Eye, Hair, and Skin Color Prediction from DNA: Massively Parallel Sequencing Solutions for Two Common Forensically Used Platforms. Forensic Sci. Int. Genet. 2019, 43, 102152. [Google Scholar] [CrossRef]
  85. Perez Palomeque, G.; Khacha-ananda, S.; Monum, T.; Wunnapuk, K. Prediction of Skin Color Using Forensic DNA Phenotyping in Asian Populations: A Focus on Thailand. Biomolecules 2025, 15, 548. [Google Scholar] [CrossRef]
  86. Del Bino, S.; Duval, C.; Bernerd, F. Clinical and Biological Characterization of Skin Pigmentation Diversity and Its Consequences on UV Impact. Int. J. Mol. Sci. 2018, 19, 2668. [Google Scholar] [CrossRef]
  87. D’Mello, S.A.N.; Finlay, G.J.; Baguley, B.C.; Askarian-Amiri, M.E. Signaling Pathways in Melanogenesis. Int. J. Mol. Sci. 2016, 17, 1144. [Google Scholar] [CrossRef]
  88. Maroñas, O.; Phillips, C.; Söchtig, J.; Gomez-Tato, A.; Cruz, R.; Alvarez-Dios, J.; De Cal, M.C.; Ruiz, Y.; Fondevila, M.; Carracedo, Á.; et al. Development of a Forensic Skin Colour Predictive Test. Forensic Sci. Int. Genet. 2014, 13, 34–44. [Google Scholar] [CrossRef]
  89. Irving-Pease, E.K.; Muktupavela, R.; Dannemann, M.; Racimo, F. Quantitative Human Paleogenetics: What Can Ancient DNA Tell Us About Complex Trait Evolution? Front. Genet. 2021, 12, 703541. [Google Scholar] [CrossRef]
  90. Barash, M.; McNevin, D.; Fedorenko, V.; Giverts, P. Machine Learning Applications in Forensic DNA Profiling: A Critical Review. Forensic Sci. Int. Genet. 2024, 69, 102994. [Google Scholar] [CrossRef]
  91. Zaorska, K.; Zawierucha, P.; Nowicki, M. Prediction of Skin Color, Tanning and Freckling from DNA in Polish Population: Linear Regression, Random Forest and Neural Network Approaches. Hum. Genet. 2019, 138, 635–647. [Google Scholar] [CrossRef] [PubMed]
  92. Herraiz, C.; Garcia-Borron, J.C.; Jiménez-Cervantes, C.; Olivares, C. MC1R Signaling. Intracellular Partners and Pathophysiological Implications. Biochim. Biophys. Acta Mol. Basis Dis. 2017, 1863, 2448–2461. [Google Scholar] [CrossRef] [PubMed]
  93. Davies, J.R.; Randerson-Moor, J.; Kukalizch, K.; Harland, M.; Kumar, R.; Madhusudan, S.; Nagore, E.; Hansson, J.; Höiom, V.; Ghiorzo, P.; et al. Inherited Variants in the MC1R Gene and Survival from Cutaneous Melanoma: A BioGenoMEL Study. Pigment. Cell Melanoma Res. 2012, 25, 384–394. [Google Scholar] [CrossRef]
  94. Jacobs, L.C.; Hamer, M.A.; Gunn, D.A.; Deelen, J.; Lall, J.S.; Van Heemst, D.; Uh, H.W.; Hofman, A.; Uitterlinden, A.G.; Griffiths, C.E.M.; et al. A Genome-Wide Association Study Identifies the Skin Color Genes IRF4, MC1R, ASIP, and BNC2 Influencing Facial Pigmented Spots. J. Investig. Dermatol. 2015, 135, 1735–1742. [Google Scholar] [CrossRef]
  95. Katsara, M.A.; Branicki, W.; Walsh, S.; Kayser, M.; Nothnagel, M. Evaluation of Supervised Machine-Learning Methods for Predicting Appearance Traits from DNA. Forensic Sci. Int. Genet. 2021, 53, 102507. [Google Scholar] [CrossRef] [PubMed]
  96. Cha, M.Y.; Choi, J.E.; Lee, D.S.; Lee, S.R.; Lee, S.I.; Park, J.H.; Shin, J.H.; Suh, I.S.; Kim, B.H.; Hong, K.W. Novel Genetic Associations for Skin Aging Phenotypes and Validation of Previously Reported Skin GWAS Results. Appl. Sci. 2022, 12, 11422. [Google Scholar] [CrossRef]
  97. Fridman, C.; Ferreira, M.A.; Marano, L.A.; Forlenza, B.S. Analysis of Genetic Polymorphisms Associated with the Presence of Freckles for Phenotypic Prediction. Forensic Sci. Int. Genet. Suppl. Ser. 2022, 8, 26–28. [Google Scholar] [CrossRef]
  98. Elkins, K.M.; Garloff, A.T.; Zeller, C.B. Additional Predictions for Forensic DNA Phenotyping of Externally Visible Characteristics Using the ForenSeq and Imagen Kits. J. Forensic Sci. 2023, 68, 608–613. [Google Scholar] [CrossRef]
  99. Bastiaens, M.; Ter Huurne, J.; Gruis, N.; Bergman, W.; Westendorp, R.; Vermeer, B.J.; Bavinck, J.N.B. The Melanocortin-1-Receptor Gene Is the Major Freckle Gene. Hum. Mol. Genet. 2001, 10, 1701–1708. [Google Scholar] [CrossRef] [PubMed]
  100. Walsh, S.; Chaitanya, L.; Breslin, K.; Muralidharan, C.; Bronikowska, A.; Pospiech, E.; Koller, J.; Kovatsi, L.; Wollstein, A.; Branicki, W.; et al. Global Skin Colour Prediction from DNA. Hum. Genet. 2017, 136, 847–863. [Google Scholar] [CrossRef] [PubMed]
  101. Guida, S.; Guida, G.; Goding, C.R. MC1R Functions, Expression, and Implications for Targeted Therapy. J. Investig. Dermatol. 2022, 142, 293–302. [Google Scholar] [CrossRef]
  102. Alshehhi, A.; Almarzooqi, A.; Alhammadi, K.; Werghi, N.; Tay, G.K.; Alsafar, H. Advancement in Human Face Prediction Using DNA. Genes 2023, 14, 136. [Google Scholar] [CrossRef]
  103. Stephan, C.N.; Caple, J.M.; Guyomarc’h, P.; Claes, P. An Overview of the Latest Developments in Facial Imaging. Forensic Sci. Res. 2019, 4, 10–28. [Google Scholar] [CrossRef]
  104. Weinberg, S.M.; Roosenboom, J.; Shaffer, J.R.; Shriver, M.D.; Wysocka, J.; Claes, P. Hunting for Genes That Shape Human Faces: Initial Successes and Challenges for the Future. Orthod. Craniofac. Res. 2019, 22, 207–212. [Google Scholar] [CrossRef]
  105. Navarro-López, B.; Wilke, F.; Suárez-Ulloa, V.; Baeta, M.; Martos-Fernández, R.; Moreno-López, O.; Olalde, I.; Martínez-Jarreta, B.; Jiménez, S.; Walsh, S.; et al. Exploring the Association between SNPs and Facial Morphology in a Spanish Population. Sci. Rep. 2025, 15, 13826. [Google Scholar] [CrossRef]
  106. Qian, W.; Zhang, M.; Wan, K.; Xie, Y.; Du, S.; Li, J.; Mu, X.; Qiu, J.; Xue, X.; Zhuang, X.; et al. Genetic Evidence for Facial Variation Being a Composite Phenotype of Cranial Variation and Facial Soft Tissue Thickness. J. Genet. Genom. 2022, 49, 934–942. [Google Scholar] [CrossRef]
  107. Hopman, S.M.J.; Merks, J.H.M.; Suttie, M.; Hennekam, R.C.M.; Hammond, P. Face Shape Differs in Phylogenetically Related Populations. Eur. J. Hum. Genet. 2014, 22, 1268–1271. [Google Scholar] [CrossRef] [PubMed]
  108. Yuan, M.; Goovaerts, S.; Vanneste, M.; Matthews, H.; Hoskens, H.; Richmond, S.; Klein, O.D.; Spritz, R.A.; Hallgrimsson, B.; Walsh, S.; et al. Mapping Genes for Human Face Shape: Exploration of Univariate Phenotyping Strategies. PLoS Comput. Biol. 2024, 20, e1012617. [Google Scholar] [CrossRef]
  109. Liu, F.; van der Lijn, F.; Schurmann, C.; Zhu, G.; Chakravarty, M.M.; Hysi, P.G.; Wollstein, A.; Lao, O.; de Bruijne, M.; Ikram, M.A.; et al. A Genome-Wide Association Study Identifies Five Loci Influencing Facial Morphology in Europeans. PLoS Genet. 2012, 8, e1002932. [Google Scholar] [CrossRef] [PubMed]
  110. Samuels, B.D.; Aho, R.; Brinkley, J.F.; Bugacov, A.; Feingold, E.; Fisher, S.; Gonzalez-Reiche, A.S.; Hacia, J.G.; Hallgrimsson, B.; Hansen, K.; et al. FaceBase 3: Analytical Tools and FAIR Resources for Craniofacial and Dental Research. Development 2020, 147, dev191213. [Google Scholar] [CrossRef]
  111. Xiong, Z.; Dankova, G.; Howe, L.J.; Lee, M.K.; Hysi, P.G.; De Jong, M.A.; Zhu, G.; Adhikar, K.; Li, D.; Li, Y.; et al. Novel Genetic Loci Affecting Facial Shape Variation in Humans. Elife 2019, 8, e49898. [Google Scholar] [CrossRef] [PubMed]
  112. Castagnola, M.J.; Medina-Paz, F.; Zapico, S.C. Uncovering Forensic Evidence: A Path to Age Estimation through DNA Methylation. Int. J. Mol. Sci. 2024, 25, 4917. [Google Scholar] [CrossRef]
  113. Zhu, B.; Li, D.; Han, G.; Yao, X.; Gu, H.; Liu, T.; Liu, L.; Dai, J.; Liu, I.Z.; Liang, Y.; et al. Multiplexing and Massive Parallel Sequencing of Targeted DNA Methylation to Predict Chronological Age. Front. Aging 2025, 6, 1467639. [Google Scholar] [CrossRef]
  114. Mathur, A.; Taurin, S.; Alshammary, S. New Insights into Methods to Measure Biological Age: A Literature Review. Front. Aging 2024, 5, 1395649. [Google Scholar] [CrossRef]
  115. Maulani, C.; Auerkari, E.I. Age Estimation Using DNA Methylation Technique in Forensics: A Systematic Review. Egypt. J. Forensic Sci. 2020, 10, 38. [Google Scholar] [CrossRef]
  116. Gerra, M.C.; Dallabona, C.; Cecchi, R. Epigenetic Analyses in Forensic Medicine: Future and Challenges. Int. J. Leg. Med. 2024, 138, 701–719. [Google Scholar] [CrossRef]
  117. Bekaert, B.; Kamalandua, A.; Zapico, S.C.; Van De Voorde, W.; Decorte, R. Improved Age Determination of Blood and Teeth Samples Using a Selected Set of DNA Methylation Markers. Epigenetics 2015, 10, 922–930. [Google Scholar] [CrossRef]
  118. Woźniak, A.; Heidegger, A.; Piniewska-Róg, D.; Pośpiech, E.; Xavier, C.; Pisarek, A.; Kartasińska, E.; Boroń, M.; Freire-Aradas, A.; Wojtas, M.; et al. Development of the VISAGE Enhanced Tool and Statistical Models for Epigenetic Age Estimation in Blood, Buccal Cells and Bones. Aging 2021, 13, 6459–6484. [Google Scholar] [CrossRef]
  119. Marcante, B.; Marino, L.; Cattaneo, N.E.; Delicati, A.; Tozzo, P.; Caenazzo, L. Advancing Forensic Human Chronological Age Estimation: Biochemical, Genetic, and Epigenetic Approaches from the Last 15 Years: A Systematic Review. Int. J. Mol. Sci. 2025, 26, 3158. [Google Scholar] [CrossRef]
  120. Onofri, M.; Delicati, A.; Marcante, B.; Carlini, L.; Alessandrini, F.; Tozzo, P.; Carnevali, E. Forensic Age Estimation through a DNA Methylation-Based Age Prediction Model in the Italian Population: A Pilot Study. Int. J. Mol. Sci. 2023, 24, 5381. [Google Scholar] [CrossRef]
  121. Correia Dias, H.; Cunha, E.; Corte Real, F.; Manco, L. Age Prediction in Living: Forensic Epigenetic Age Estimation Based on Blood Samples. Leg. Med. 2020, 47, 101763. [Google Scholar] [CrossRef]
  122. Refn, M.R.; Kampmann, M.L.; Morling, N.; Tfelt-Hansen, J.; Børsting, C.; Pereira, V. Prediction of Chronological Age and Its Applications in Forensic Casework: Methods, Current Practices, and Future Perspectives. Forensic Sci. Res. 2023, 8, 85–97. [Google Scholar] [CrossRef] [PubMed]
  123. Kurdyukov, S.; Bullock, M. DNA Methylation Analysis: Choosing the Right Method. Biology 2016, 5, 3. [Google Scholar] [CrossRef] [PubMed]
  124. De Chiara, L.; Leiro-Fernandez, V.; Rodríguez-Girondo, M.; Valverde, D.; Botana-Rial, M.I.; Fernández-Villar, A. Comparison of Bisulfite Pyrosequencing and Methylation-Specific qPCR for Methylation Assessment. Int. J. Mol. Sci. 2020, 21, 9242. [Google Scholar] [CrossRef]
  125. Naue, J.; Hoefsloot, H.C.J.; Kloosterman, A.D.; Verschure, P.J. Forensic DNA Methylation Profiling from Minimal Traces: How Low Can We Go? Forensic Sci. Int. Genet. 2018, 33, 17–23. [Google Scholar] [CrossRef] [PubMed]
  126. Manco, L.; Dias, H.C. DNA Methylation Analysis of ELOVL2 Gene Using Droplet Digital PCR for Age Estimation Purposes. Forensic Sci. Int. 2022, 333, 111206. [Google Scholar] [CrossRef] [PubMed]
  127. Żarczyńska, M.; Żarczyński, P.; Tomsia, M. Nucleic Acids Persistence—Benefits and Limitations in Forensic Genetics. Genes 2023, 14, 1643. [Google Scholar] [CrossRef] [PubMed]
  128. Wen, Y.; Liu, J.; Su, Y.; Chen, X.; Hou, Y.; Liao, L.; Wang, Z. Forensic Biogeographical Ancestry Inference: Recent Insights and Current Trends. Genes Genom. 2023, 45, 1229–1238. [Google Scholar] [CrossRef]
  129. Than, K.Z.; Muisuk, K.; Woravatin, W.; Suwannapoom, C.; Srikummool, M.; Srithawong, S.; Lorphengsy, S.; Kutanan, W. Genetic Structure and Forensic Utility of 23 Autosomal STRs of the Ethnic Lao Groups From Laos and Thailand. Front. Genet. 2022, 13, 954586. [Google Scholar] [CrossRef]
  130. Fondevila, M.; Phillips, C.; Santos, C.; Freire Aradas, A.; Vallone, P.M.; Butler, J.M.; Lareu, M.V.; Carracedo, Á. Revision of the SNPforID 34-Plex Forensic Ancestry Test: Assay Enhancements, Standard Reference Sample Genotypes and Extended Population Studies. Forensic Sci. Int. Genet. 2013, 7, 63–74. [Google Scholar] [CrossRef] [PubMed]
  131. Zhang, S.; Bian, Y.; Chen, A.; Zheng, H.; Gao, Y.; Hou, Y.; Li, C. Developmental Validation of a Custom Panel Including 273 SNPs for Forensic Application Using Ion Torrent PGM. Forensic Sci. Int. Genet. 2017, 27, 50–57. [Google Scholar] [CrossRef]
  132. Ruiz-Ramírez, J.; de la Puente, M.; Xavier, C.; Ambroa-Conde, A.; Álvarez-Dios, J.; Freire-Aradas, A.; Mosquera-Miguel, A.; Ralf, A.; Amory, C.; Katsara, M.A.; et al. Development and Evaluations of the Ancestry Informative Markers of the VISAGE Enhanced Tool for Appearance and Ancestry. Forensic Sci. Int. Genet. 2023, 64, 102853. [Google Scholar] [CrossRef]
  133. Sun, K.; Yao, Y.; Yun, L.; Zhang, C.; Xie, J.; Qian, X.; Tang, Q.; Sun, L. Application of Machine Learning for Ancestry Inference Using Multi-InDel Markers. Forensic Sci. Int. Genet. 2022, 59, 102702. [Google Scholar] [CrossRef]
  134. Kling, D.; Phillips, C.; Kennett, D.; Tillmar, A. Investigative Genetic Genealogy: Current Methods, Knowledge and Practice. Forensic Sci. Int. Genet. 2021, 52, 102474. [Google Scholar] [CrossRef]
  135. Peterson, R.E.; Kuchenbaecker, K.; Walters, R.K.; Chen, C.Y.; Popejoy, A.B.; Periyasamy, S.; Lam, M.; Iyegbe, C.; Strawbridge, R.J.; Brick, L.; et al. Genome-Wide Association Studies in Ancestrally Diverse Populations: Opportunities, Methods, Pitfalls, and Recommendations. Cell 2019, 179, 589–603. [Google Scholar] [CrossRef]
  136. Tan, M.; Tan, Y.; Jiang, H.; Xue, J.; Wu, Q.; Zheng, Y.; Liu, G.; Xiao, Y.; Lv, M.; Liao, M.; et al. Explainable Artificial Intelligence in Forensic DNA Analysis: Alleles Identification in Challenging Electropherograms Using Supervised Machine Learning Methods. Forensic Sci. Int. Genet. 2025, 78, 103289. [Google Scholar] [CrossRef]
  137. Gaonkar, B.; Shinohara, R.T.; Davatzikos, C. Interpreting Support Vector Machine Models for Multivariate Group Wise Analysis in Neuroimaging. Med. Image Anal. 2015, 24, 190–204. [Google Scholar] [CrossRef]
  138. Couronné, R.; Probst, P.; Boulesteix, A.L. Random Forest versus Logistic Regression: A Large-Scale Benchmark Experiment. BMC Bioinform. 2018, 19, 27. [Google Scholar] [CrossRef]
  139. Ausmees, K.; Nettelblad, C. Achieving Improved Accuracy for Imputation of Ancient DNA. Bioinformatics 2023, 39, btac738. [Google Scholar] [CrossRef]
  140. Mochurad, L.; Horun, P. Improvement Technologies for Data Imputation in Bioinformatics. Technologies 2023, 11, 154. [Google Scholar] [CrossRef]
  141. Ausmees, K.; Sanchez-Quinto, F.; Jakobsson, M.; Nettelblad, C. An Empirical Evaluation of Genotype Imputation of Ancient DNA. G3 Genes Genomes Genet. 2022, 12, jkac089. [Google Scholar] [CrossRef] [PubMed]
  142. Alharbi, W.S.; Rashid, M. A Review of Deep Learning Applications in Human Genomics Using Next-Generation Sequencing Data. Hum. Genom. 2022, 16, 26. [Google Scholar] [CrossRef] [PubMed]
  143. Kolobkov, D.; Mishra Sharma, S.; Medvedev, A.; Lebedev, M.; Kosaretskiy, E.; Vakhitov, R. Efficacy of Federated Learning on Genomic Data: A Study on the UK Biobank and the 1000 Genomes Project. Front. Big Data 2024, 7, 1266031. [Google Scholar] [CrossRef] [PubMed]
  144. Chen, J.; Chen, A.; Tao, R.; Zhu, R.; Zhang, H.; You, X.; Li, C.; Zhang, S. Solution to a Case Involving the Interpretation of Trace Degraded DNA Mixtures. Int. J. Leg. Med. 2024, 138, 2325–2330. [Google Scholar] [CrossRef]
  145. Diepenbroek, M.; Bayer, B.; Anslinger, K. Pushing the Boundaries: Forensic DNA Phenotyping Challenged by Single-Cell Sequencing. Genes 2021, 12, 1362. [Google Scholar] [CrossRef]
  146. Toom, V.; Wienroth, M.; M’Charek, A.; Prainsack, B.; Williams, R.; Duster, T.; Heinemann, T.; Kruse, C.; MacHado, H.; Murphy, E. Approaching Ethical, Legal and Social Issues of Emerging Forensic DNA Phenotyping (FDP) Technologies Comprehensively: Reply to “Forensic DNA Phenotyping: Predicting Human Appearance from Crime Scene Material for Investigative Purposes” by Manfred Kayser. Forensic Sci. Int. Genet. 2016, 22, e1–e4. [Google Scholar] [CrossRef] [PubMed]
  147. Ogbogu, U.; Ahmed, N. Ethical, Legal, and Social Implications (ELSI) Research: Methods and Approaches. Curr. Protoc. 2022, 2, e354. [Google Scholar] [CrossRef] [PubMed]
  148. García, Ó. Forensic Genealogy. Social, Ethical, Legal and Scientific Implications. Span. J. Leg. Med. 2021, 47, 112–119. [Google Scholar] [CrossRef]
  149. Piraianu, A.I.; Fulga, A.; Musat, C.L.; Ciobotaru, O.R.; Poalelungi, D.G.; Stamate, E.; Ciobotaru, O.; Fulga, I. Enhancing the Evidence with Algorithms: How Artificial Intelligence Is Transforming Forensic Medicine. Diagnostics 2023, 13, 2992. [Google Scholar] [CrossRef]
  150. Nikita, E.; Nikitas, P. On the Use of Machine Learning Algorithms in Forensic Anthropology. Leg. Med. 2020, 47, 101771. [Google Scholar] [CrossRef]
  151. Guillen, M.; Lareu, M.V.; Pestoni, C.; Salas, A.; Carracedo, A. Ethical-Legal Problems of DNA Databases in Criminal Investigation. J. Med. Ethics 2000, 26, 266–271. [Google Scholar] [CrossRef]
  152. Bernhardt, B.A.; Roche, M.I.; Perry, D.L.; Scollon, S.R.; Tomlinson, A.N.; Skinner, D. Experiences with Obtaining Informed Consent for Genomic Sequencing. Am. J. Med. Genet. A 2015, 167, 2635–2646. [Google Scholar] [CrossRef]
  153. Hendriks, S.; Grady, C.; Ramos, K.M.; Chiong, W.; Fins, J.J.; Ford, P.; Goering, S.; Greely, H.T.; Hutchison, K.; Kelly, M.L.; et al. Ethical Challenges of Risk, Informed Consent, and Posttrial Responsibilities in Human Research with Neural Devices: A Review. JAMA Neurol. 2019, 76, 1506–1514. [Google Scholar] [CrossRef] [PubMed]
  154. Ballantyne, K.N.; Summersby, S.; Pearson, J.R.; Nicol, K.; Pirie, E.; Quinn, C.; Kogios, R. A Transparent Approach: Openness in Forensic Science Reporting. Forensic Sci. Int. 2024, 8, 100474. [Google Scholar] [CrossRef]
  155. Bollé, T.; Casey, E.; Jacquet, M. The Role of Evaluations in Reaching Decisions Using Automated Systems Supporting Forensic Analysis. Forensic Sci. Int. Digit. Investig. 2020, 34, 301016. [Google Scholar] [CrossRef]
  156. Kayser, M.; Parson, W. Transitioning from Forensic Genetics to Forensic Genomics. Genes 2018, 9, 3. [Google Scholar] [CrossRef] [PubMed]
  157. Mohsin, K. Artificial Intelligence in Forensic Science. Int. J. Forensic Res. 2023, 4, 172–173. [Google Scholar] [CrossRef]
  158. Galante, N.; Cotroneo, R.; Furci, D.; Lodetti, G.; Casali, M.B. Applications of Artificial Intelligence in Forensic Sciences: Current Potential Benefits, Limitations and Perspectives. Int. J. Leg. Med. 2022, 137, 445–458. [Google Scholar] [CrossRef] [PubMed]
  159. Sessa, F.; Esposito, M.; Cocimano, G.; Sablone, S.; Karaboue, M.A.A.; Chisari, M.; Albano, D.G.; Salerno, M. Artificial Intelligence and Forensic Genetics: Current Applications and Future Perspectives. Appl. Sci. 2024, 14, 2113. [Google Scholar] [CrossRef]
Figure 1. The integration of ML with FDP has significantly enhanced the ability to extract meaningful information from LT-DNA, though the accuracy of predictions remains fundamentally tied to data quality, genetic coverage, and technological sensitivity (created with BioRender, https://www.biorender.com).
Figure 2. Optimized workflow for LT-DNA analysis in forensic laboratories. Key steps include: (1) sample assessment for DNA concentration and degradation index; (2) replicate analysis to improve reliability and ENG; and (3) dynamic threshold calibration based on baseline signal distributions. (Created with BioRender: www.biorender.com).
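Steps (2) and (3) of the workflow in Figure 2 can be illustrated with a minimal Python sketch. The caption does not give formulas, so both rules here are assumptions chosen for illustration: the threshold is taken as the baseline mean plus `k` standard deviations, and the replicate consensus keeps alleles observed in at least `min_count` replicates. The function names, `k`, and `min_count` are hypothetical and not taken from the article.

```python
from collections import Counter
from statistics import mean, stdev


def analytical_threshold(baseline_signal, k=3.0):
    """Assumed rule: threshold = mean + k * sample SD of the baseline (noise) signal."""
    if len(baseline_signal) < 2:
        raise ValueError("need at least two baseline measurements")
    return mean(baseline_signal) + k * stdev(baseline_signal)


def consensus_alleles(replicates, min_count=2):
    """Assumed rule: keep alleles observed in at least `min_count` replicate analyses."""
    counts = Counter(allele for rep in replicates for allele in set(rep))
    return {allele for allele, n in counts.items() if n >= min_count}


# Illustrative values only; these are not data from the article.
baseline = [4, 6, 5, 5]
threshold = analytical_threshold(baseline)

replicate_calls = [{"A", "G"}, {"A"}, {"G", "T"}]
consensus = consensus_alleles(replicate_calls)
print(threshold, sorted(consensus))
```

A laboratory would replace both rules with its own validated procedure. The sketch only shows how a baseline-derived threshold and a replicate consensus fit together.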
Figure 3. Relative prediction accuracy of EVCs from DNA, ranked by trait complexity. Eye color shows the highest accuracy (AUC > 0.9), followed by hair and skin color. Predictions for facial morphology, freckles, and sun sensitivity are less reliable, especially from LT-DNA. (Created with BioRender: www.biorender.com).
Figure 4. Ethical deployment checklist for FDP. Designed to assist forensic practitioners, legal professionals, and oversight bodies in evaluating scientific validity, privacy safeguards, and procedural accountability of FDP-based predictions.
Table 1. This table summarizes six SNPs located on various chromosomes, each associated with genes involved in eye color prediction.
| SNP ID | Gene | Chromosome | Position | Reference Genome |
|---|---|---|---|---|
| rs12913832 | HERC2 | 15 | 28120472 | GRCh38 38.1/141 |
| rs1800407 | OCA2 | 15 | 27985172 | GRCh38 38.1/142 |
| rs12896399 | LOC105370627 | 14 | 92307319 | GRCh38 38.1/141 |
| rs16891982 | SLC45A2 | 5 | 33951588 | GRCh38 38.1/141 |
| rs1393350 | TYR | 11 | 89277878 | GRCh38 38.1/141 |
| rs12203592 | IRF4 | 6 | 396321 | GRCh37 37.1/131 |
Table 2. This table summarizes twenty-four SNPs located on various chromosomes, each associated with genes involved in eye and hair color prediction.
SNP ID | Gene | Chromosome | Position | Reference Genome
rs312262906 (rs796296176) | MC1R | 16 | 89919342 | GRCh38.p7 38.3/149
rs11547464 | MC1R | 16 | 89919683 | GRCh38 38.1/141
rs885479 | MC1R | 16 | 89919746 | GRCh38 38.1/141
rs1805008 | MC1R | 16 | 89919736 | GRCh38 38.1/141
rs1805005 | MC1R | 16 | 89919436 | GRCh38 38.1/141
rs1805006 | MC1R | 16 | 89919510 | GRCh38 38.1/141
rs1805007 | MC1R | 16 | 89919709 | GRCh38 38.1/141
rs1805009 | TUBB3 | 16 | 89920138 | GRCh38 38.1/141
rs201326893 | MC1R | 16 | 89919714 | GRCh38 38.1/141
rs2228479 | MC1R | 16 | 89919532 | GRCh38 38.1/141
rs1110400 | MC1R | 16 | 89919722 | GRCh38 38.1/141
rs28777 | SLC45A2 | 5 | 33958854 | GRCh38 38.1/141
rs16891982 | SLC45A2 | 5 | 33951588 | GRCh38 38.1/141
rs12821256 | KITLG | 12 | 88934558 | GRCh38 38.1/141
rs4959270 | LOC105374875 | 6 | 457748 | GRCh37.p5 37.3/135
rs12203592 | IRF4 | 6 | 396321 | GRCh37 37.1/131
rs1042602 | TYR | 11 | 89178528 | GRCh38 38.1/141
rs1800407 | OCA2 | 15 | 27985172 | GRCh38 38.1/142
rs2402130 | SLC24A4 | 14 | 92334859 | GRCh38 38.1/141
rs12913832 | HERC2 | 15 | 28120472 | GRCh38 38.1/141
rs2378249 | PIGU | 20 | 34630286 | GRCh38.p7 38.3/151
rs12896399 | LOC105370627 | 14 | 92307319 | GRCh38 38.1/141
rs1393350 | TYR | 11 | 89277878 | GRCh38 38.1/141
rs683 | TYRP1 | 9 | 12709305 | GRCh38 38.1/141
Table 3. This table summarizes forty-one SNPs located on various chromosomes, each associated with genes involved in eye, hair, and skin color prediction.
SNP ID | Gene | Chromosome | Position | Reference Genome
rs312262906 (rs796296176) | MC1R | 16 | 89919342 | GRCh38.p7 38.3/149
rs11547464 | MC1R | 16 | 89919683 | GRCh38 38.1/141
rs885479 | MC1R | 16 | 89919746 | GRCh38 38.1/141
rs1805008 | MC1R | 16 | 89919736 | GRCh38 38.1/141
rs1805005 | MC1R | 16 | 89919436 | GRCh38 38.1/141
rs1805006 | MC1R | 16 | 89919510 | GRCh38 38.1/141
rs1805007 | MC1R | 16 | 89919709 | GRCh38 38.1/141
rs1805009 | TUBB3 | 16 | 89920138 | GRCh38 38.1/141
rs201326893 | MC1R | 16 | 89919714 | GRCh38 38.1/141
rs2228479 | MC1R | 16 | 89919532 | GRCh38 38.1/141
rs1110400 | MC1R | 16 | 89919722 | GRCh38 38.1/141
rs28777 | SLC45A2 | 5 | 33958854 | GRCh38 38.1/141
rs16891982 | SLC45A2 | 5 | 33951588 | GRCh38 38.1/141
rs12821256 | KITLG | 12 | 88934558 | GRCh38 38.1/141
rs4959270 | LOC105374875 | 6 | 457748 | GRCh37.p5 37.3/135
rs12203592 | IRF4 | 6 | 396321 | GRCh37 37.1/131
rs1042602 | TYR | 11 | 89178528 | GRCh38 38.1/141
rs1800407 | OCA2 | 15 | 27985172 | GRCh38 38.1/142
rs2402130 | SLC24A4 | 14 | 92334859 | GRCh38 38.1/141
rs12913832 | HERC2 | 15 | 28120472 | GRCh38 38.1/141
rs2378249 | PIGU | 20 | 34630286 | GRCh38.p7 38.3/151
rs12896399 | LOC105370627 | 14 | 92307319 | GRCh38 38.1/141
rs1393350 | TYR | 11 | 89277878 | GRCh38 38.1/141
rs683 | TYRP1 | 9 | 12709305 | GRCh38 38.1/141
rs3114908 | ANKRD11 | 16 | 89317317 | GRCh38.p14
rs1800414 | OCA2 | 15 | 27951891 | GRCh38 38.1/141
rs10756819 | BNC2 | 9 | 16858086 | GRCh38 38.1/141
rs2238289 | HERC2 | 15 | 28208069 | GRCh38 38.1/141
rs17128291 | SLC24A4 | 14 | 92416482 | GRCh38.p14
rs6497292 | HERC2 | 15 | 28251049 | GRCh38.p14
rs1129038 | HERC2 | 15 | 28111713 | GRCh38 38.1/141
rs1667394 | HERC2 | 15 | 28285036 | GRCh38 38.1/141
rs1126809 | TYR | 11 | 89284793 | GRCh38 38.1/141
rs1470608 | OCA2 | 15 | 28042975 | GRCh38.p14
rs1426654 | SLC24A5 | 15 | 48134287 | GRCh38 38.1/141
rs6119471 | ASIP | 20 | 34197406 | GRCh38.p14
rs1545397 | OCA2 | 15 | 27942626 | GRCh38.p14
rs6059655 | RALY | 20 | 34077942 | GRCh38.p7 38.3/151
rs12441727 | OCA2 | 15 | 28026629 | GRCh38.p14
rs3212355 | MC1R | 16 | 89917970 | GRCh38.p14
rs8051733 | DEF8 | 16 | 89957798 | GRCh38.p14
Table 4. Performance of ML algorithms for FDP traits across datasets.
Trait | Algorithm | Dataset | AUC | Precision | Recall | Reference
Eye color | Random Forest | HIrisPlex SNPs | 0.93 | 0.90 | 0.88 | [95]
Eye color | SVM | HIrisPlex SNPs | 0.91 | 0.88 | 0.86 | [95]
Skin color | Neural Network | Polish cohort | --- (*) | 0.82 | 0.79 | [91]
Skin color | Random Forest | Polish cohort | --- (*) | 0.78 | 0.75 | [91]
* AUC not reported for skin color in Zaorska et al. [91].
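For readers comparing the columns of Table 4, the sketch below shows how precision, recall, and AUC are computed from a binary classifier's scores. It is illustrative only: the labels, scores, and 0.5 decision cutoff are made-up values, not data from [95] or [91], and the function names are hypothetical.

```python
def precision_recall(y_true, y_pred):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = blue eyes, 0 = not blue; scores are model outputs.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

precision, recall = precision_recall(y_true, y_pred)
area = auc(y_true, scores)
print(round(precision, 3), round(recall, 3), round(area, 3))
```

Precision and recall depend on the chosen cutoff, whereas AUC summarizes ranking quality across all cutoffs, which is one reason the three columns need not move together.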
Table 5. Summary of ML algorithms applied in FDP: input type, dataset size, accuracy, and key references.
Algorithm | Input Type | Dataset Size | Trait(s) Predicted | Reported Accuracy (AUC/Precision/Recall) | Reference
Random Forest | SNP array (HIrisPlex) | ~1200 individuals | Eye color (blue/brown) | AUC = 0.93; Precision = 0.90; Recall = 0.88 | [95]
SVM | SNP array (HIrisPlex) | ~1200 individuals | Eye color | AUC = 0.91; Precision = 0.88; Recall = 0.86 | [95]
Neural Network | SNP array | ~800 individuals | Skin color | Precision = 0.82; Recall = 0.79 | [91]
Gradient Boosting | SNP array | ~1200 individuals | Eye color | AUC = 0.92 | [95]
Deep Learning (CNN) | SNP + 3D facial scans | ~3000 individuals | Facial morphology | Explains 10–20% variance in facial traits | [7]
Elastic Net | DNA methylation (CpG sites) | ~500 individuals | Age estimation | MAD ≈ 3.5 years | [117]
Accuracy metrics vary by population and validation method; most studies used cross-validation. IrisPlex and HIrisPlex-S use logistic regression, not ML in the contemporary sense.
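The logistic-regression approach mentioned in the note above can be sketched as a multinomial model that maps allele counts (0, 1, or 2 copies) to category probabilities. The intercepts and coefficients below are placeholders invented for illustration; they are not the published IrisPlex or HIrisPlex-S parameters, and treating "brown" as the reference category is likewise an assumption of this sketch.

```python
import math

# Hypothetical parameters for two non-reference categories.
INTERCEPTS = {"blue": -1.0, "intermediate": -1.5}
COEFFS = {
    "blue": {"rs12913832": 2.0, "rs1800407": 0.5},
    "intermediate": {"rs12913832": 0.8, "rs1800407": 0.9},
}


def predict_probs(genotype):
    """Return category probabilities from per-SNP allele counts.

    genotype maps SNP ID -> count (0, 1, 2) of the modeled allele.
    A missing SNP (e.g. allelic dropout) raises KeyError rather than
    being silently treated as 0.
    """
    logits = {"brown": 0.0}  # reference category has logit 0
    for category, intercept in INTERCEPTS.items():
        logits[category] = intercept + sum(
            beta * genotype[snp] for snp, beta in COEFFS[category].items()
        )
    total = sum(math.exp(v) for v in logits.values())
    return {category: math.exp(v) / total for category, v in logits.items()}


probs = predict_probs({"rs12913832": 2, "rs1800407": 0})
best = max(probs, key=probs.get)
print(best, {k: round(v, 3) for k, v in probs.items()})
```

Raising an error on a missing genotype is a deliberate choice here: with LT-DNA, an untyped locus is not the same as a homozygous reference call, so incomplete profiles need explicit handling rather than a default value.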
Table 6. Comparative prediction accuracies for major traits across populations.
Trait | Category | Accuracy (%) | Notes
Eye color | Blue/Brown | >90 | High reliability in Europeans
Eye color | Intermediate | 70 | Low sensitivity (1–2%)
Hair color | Red | 85–90 | Strong MC1R effect
Hair color | Blond/Brown | 75–85 | Moderate accuracy
Skin color | Light/Dark | 80–85 | Reliable in European/African cohorts
Skin color | Intermediate | <70 | Challenging, especially in admixed populations
