Evaluation of Density Functional Theory-Generated Data for Infrared Spectroscopy of Novel Psychoactive Substances Using Unsupervised Learning

Abstract: Novel psychoactive substances (NPSs) are compounds designed by modifying the chemical structures of prohibited substances, offering alternatives for consumption and evading legislation. The rapid emergence of these substances presents challenges for public health and forensic assessment because of the lack of analytical standards. A viable alternative for establishing these standards involves leveraging in silico methods to acquire spectroscopic data. This study assesses the efficacy of using infrared spectroscopy (IRS) data derived from density functional theory (DFT) for analyzing NPSs. Various functionals were employed to generate infrared spectra for five distinct NPS categories: amphetamines, benzodiazepines, synthetic cannabinoids, cathinones, and fentanyls. PRISMA software was developed to streamline data management. Unsupervised learning techniques, including Hierarchical Cluster Analysis (HCA), Principal Component Analysis (PCA), and t-distributed stochastic neighbor embedding (t-SNE), were used to refine the assessment process. Our findings reveal no significant disparities among the different functionals used to generate infrared spectra data. Additionally, the application of unsupervised learning demonstrated adequate segregation of NPSs within their respective groups. In conclusion, integrating theoretical data and dimension reduction techniques proves to be a powerful strategy for evaluating the spectroscopic characteristics of NPSs. This underscores the potential of this combined methodology as a diagnostic tool for distinguishing IR spectra across various NPS groups, facilitating the evaluation of new, unknown compounds.


Introduction
New drugs have been developed through chemical modifications of originally prohibited structures to circumvent prohibitions and legislation. These substances are referred to as new psychoactive substances (NPSs), also known as designer drugs, legal highs, and bath salts, among other terms. NPSs belong to various chemical groups, including cannabinoids, benzodiazepines, cathinones, fentanyls, and amphetamines, which rank among the most consumed drugs [1].
Understanding the pharmacological properties, risk factors, side effects, prevalence, and use of NPSs is essential.
Amphetamines, popular for their stimulating effects, are synthetic drugs that boost neurotransmitter release, including dopamine, norepinephrine, and serotonin, leading to alertness, energy, euphoria, and cognitive enhancement. Their chemical structure allows for various analogs, contributing to novel psychoactive substances (NPSs). However, their excessive or prolonged use poses significant health risks, including increased heart rate, high blood pressure, insomnia, anxiety, psychosis, tolerance, dependence, cardiovascular issues, neurological damage, and psychiatric disorders. Concerns also arise regarding impurities, potency variations, and contamination in clandestinely produced NPS amphetamines, exacerbating risks. The consumption of NPS amphetamines has risen because of factors such as online availability, perceived legality, and novelty appeal. The clandestine nature of production and distribution complicates regulatory efforts, making monitoring and control challenging and highlighting the need for proactive measures to address evolving drug landscapes and mitigate the associated harm [2][3][4].
Benzodiazepines are gaining attention as novel psychoactive substances (NPSs) because of their easy availability, recreational use, and potential for misuse [5,6]. They are central nervous system depressants that enhance gamma-aminobutyric acid (GABA) effects, resulting in sedative, anxiolytic, muscle relaxant, and anticonvulsant effects [7]. Despite their therapeutic benefits, benzodiazepine misuse can cause drowsiness, dizziness, confusion, impaired coordination, and memory issues. Prolonged or high-dose use may lead to tolerance, dependence, withdrawal symptoms, respiratory depression, coma, and even death, especially when combined with other depressants. Concerns about counterfeit production raise worries about purity, dosage variability, and contamination, increasing consumption risks. Benzodiazepines have become popular among recreational users seeking relaxation, euphoria, or relief from anxiety or insomnia symptoms [8,9]. Their accessibility through illicit online markets and counterfeit production contributes to their rising prevalence in the NPS market. Concurrent use of other substances presents additional risks and challenges for harm reduction efforts and clinical management [6].
Synthetic cannabinoids have emerged as a class of novel psychoactive substances (NPSs) that mimic the effects of the natural cannabinoids present in cannabis. Recreational drug users are increasingly attracted to these drugs for alternative experiences or to circumvent drug laws. These chemically engineered compounds interact with cannabinoid receptors in the brain and body, targeting CB1 and CB2 receptors. By imitating the properties of delta-9-tetrahydrocannabinol (THC), the primary psychoactive component of cannabis, synthetic cannabinoids activate cannabinoid receptors and modulate neurotransmitter release. However, they often display greater potency and receptor affinity than natural cannabinoids, resulting in unpredictable and potentially harmful outcomes [10,11]. The use of these cannabinoids carries significant risks because of their pharmacological effects and variability in chemical composition. Short-term effects can include altered perception, euphoria, relaxation, impaired coordination, and anxiety, with common adverse reactions including agitation, paranoia, hallucinations, seizures, and psychosis, especially with high doses or prolonged use. Synthetic cannabinoids have been associated with numerous emergency department visits, acute toxicities, and fatalities, underscoring their potential for harm [12]. Marketed as "legal highs", herbal blends, or incense products under diverse brand names, they are readily available online, contributing to their widespread use among vulnerable populations. Consumption patterns in the NPS market include smoking, vaporization, oral ingestion, and even intravenous administration. The online accessibility and perceived legality of synthetic cannabinoids have fueled their adoption among numerous demographics, including children, teens, and individuals on probation or parole. Furthermore, the anonymity provided by online markets enables discreet purchasing and consumption, enhancing their appeal. Notably, synthetic cannabinoids are increasingly being incorporated into edible products, such as infused sweets or baked goods, which makes their psychoactive effects difficult to anticipate. This trend poses significant challenges for regulatory authorities and public health organizations in monitoring and controlling the distribution and consumption of these substances [13].
Synthetic cathinones, often marketed under diverse brand names such as "bath salts", "plant food", or "research chemicals", constitute a considerable subset of novel psychoactive substances (NPSs), imitating the effects of the natural cathinone found in the khat plant [14]. These synthetic derivatives of cathinone, a stimulant compound found in the leaves of the Catha edulis plant, act as monoamine reuptake inhibitors. They enhance neurotransmitter release and inhibit reuptake, mainly affecting dopamine, norepinephrine, and serotonin levels in the brain. This pharmacological profile mirrors that of amphetamines and MDMA, leading to increased energy, euphoria, alertness, and sociability [15]. They pose significant risks because of their strong stimulant effects and potential for adverse reactions. Short-term effects may include increased heart rate, elevated blood pressure, hyperthermia, agitation, hallucinations, paranoia, and psychosis. Prolonged use or high doses can result in tolerance, dependence, cardiovascular complications, seizures, renal failure, or even death. The unpredictable nature of synthetic cathinones, compounded by variability in chemical composition and potency, poses challenges for clinical management and harm reduction efforts [16]. Consumption of synthetic cathinones may occur through various routes, such as oral ingestion, insufflation, smoking, or intravenous administration, reflecting diverse use patterns within the NPS market. However, the clandestine nature of manufacturing and distribution, alongside the continuous emergence of new synthetic cathinone analogs, poses major challenges for regulatory efforts and public health interventions to mitigate the associated harms. They are readily available online and through illicit distribution channels. This accessibility, coupled with perceived legality and ease of acquisition, has fueled their widespread use, particularly among vulnerable populations, young people, and young adults [17,18].
Fentanyl, a potent synthetic opioid far more powerful than morphine, has led to a sharp rise in opioid overdoses and deaths. Fentanyl analogs and synthetic opioids have become a significant concern for public health and law enforcement agencies worldwide. There is a severe opioid crisis in the United States, and fentanyl plays a significant role [19,20]. Illegally produced fentanyl, often mixed with heroin or sold, has flooded the illicit drug market, exacerbating the epidemic. Fentanyl analogs are popular among opioid users seeking more potent effects or use in combination with other drugs. They are available online and in illegal markets, contributing to widespread use. Efforts are needed to address the growing phenomenon of fentanyl analogs and synthetic opioids, including regulation, education, and harm reduction. Fentanyl products need to be characterized early, because potent chemical analogs emerge that evade regulatory action. The availability and lethality of fentanyl have stressed healthcare systems, regulators, and communities, underscoring the need for immediate action to manage supply and demand. Fentanyl-like compounds act strongly on opioid receptors in the brain and spinal cord, producing pain relief, sedation, and euphoria. They carry a high risk of overdose, even in small doses, and variations in potency make medical application difficult. There are serious risks for users, including respiratory depression and overdose. The manufacture and distribution of fentanyl analogs in counterfeit pharmaceutical forms add risk and challenge to the public health response [21,22]. The emergence of carfentanil, an even more potent fentanyl analog, poses an unprecedented risk to public health. Its potency is extreme, even in small doses [23,24].
Grasping these factors is crucial for health professionals and healthcare providers to analyze potential outcomes and interactions effectively. Comprehending the structure of these substances is necessary to know them precisely. It can offer an additional understanding of their effects and intoxication [25]. Knowledge of the associated risks and side effects is essential for developing effective intervention and treatment strategies. By analyzing trends and usage patterns, public health and law enforcement agencies can empower themselves to conduct prevention and education efforts, thus minimizing harm to the public. Providing this information to the public enhances awareness of the risks linked to psychoactive substance use, dispels misconceptions and stigma, and promotes informed discussion. Moreover, beyond being aware of health-related properties, it is imperative to verify conditions to detect and uncover the structure of these substances.
A consequence of drug trafficking is that forensic assessment faces challenges beyond the detection of compounds, involving strategies for collecting and disseminating data. Forensic scientists must be informed to confront new challenges through new methodologies and technologies [26]. The main difficulty that drug analysts encounter is finding appropriate methods to extract parameters or characteristics considered crucial for the treatment and contextualization of drug profile data. It results in difficulties in identifying NPSs, which is crucial for understanding these substances' chemical structure and properties, requiring significant time, financial, and technological resources. This identification is essential for determining the chemical structure and underlying properties. Only after a substance is detected and identified can its appearance be monitored by disseminating national and international records [26]. Intelligence tools play the role of organized memory to provide accurate, timely, helpful, and meaningful information [27].
Increasing technological and scientific development has underscored the importance of science in the legal world. Legal systems have evolved toward a reality increasingly reliant on the natural sciences to aid investigative mechanisms and enforce the law [2][3][4][5]. In this context, chemical analysis tools are essential in providing accurate, timely, helpful, and meaningful information that can unveil the characteristics of these substances' properties [6]. Scientific and technological tools strive to enhance the transparency and expediency of judicial decisions [28][29][30][31][32][33].
Spectroscopic methods are widely used in chemistry. Infrared spectroscopy (IRS) is vital in forensics, aiding in identifying and characterizing substances. It uses molecular vibration principles to reveal chemical structures, assisting in identifying illicit drugs and explosives. IRS's precision allows for detailed analysis without compromising evidence integrity, which is crucial in legal proceedings. Its speed and simplicity make it worthwhile for on-site investigations, enhancing law enforcement capabilities. It complements methods like mass spectrometry and chromatography, providing comprehensive forensic information. IRS also helps trace evidence origins and manufacturing processes, linking evidence to suspects or criminal groups. IRS is indispensable in forensic analysis but poses challenges without comparison standards [34].
For the evaluation of psychoactive substances, many forensic laboratories follow the international recommendations of the Scientific Working Group for the Analysis of Seized Drugs (SWGDRUG). The recommended techniques for spectroscopic characterization are grouped into categories, as shown in Figure 1. Following these recommendations helps ensure the correct identification of these substances [35].
Figure 1. Categories for drug analysis as suggested by SWGDRUG [36]. * When used with a wavelength range.
When a validated technique from Category A is used, at least one other method from any other category should also be employed. When a Category A technique cannot be used, at least three different techniques should be employed, with two being from Category B (and not correlated with each other).

Structural and Spectroscopical Analyses
Starting with the crystallographic structure obtained from the PubChem database (https://pubchem.ncbi.nlm.nih.gov) [6,78], the geometry optimization calculation for all molecules was conducted using the MMFF94 method (Merck Molecular Force Field 94) within Avogadro® software (1.2.0, University of Pittsburgh Department of Chemistry, Pittsburgh, Pennsylvania, United States of America) [79][80][81]. The MMFF94 force field, tailored to experimental parameters, is adept at handling numerous combinations of functional groups [79,82], making it well suited for energy minimization calculations.
For all NPS groups, we determined the minimum energy structure and calculated its vibrational frequencies to confirm that the structure was indeed a minimum. We multiplied the vibrational frequencies by a scaling factor to better align them with the experimental spectrum. This adjustment addresses two issues: the approximation inherent in the electronic structure calculation and the anharmonic nature of the potential energy surface. The scaling factors for each vibrational frequency calculation method were as follows: (a) B3LYP: 0.9654; (b) B3PW91: 0.9643; (c) M062X: 0.9462; and (d) PBE0: 0.9596. All data were converted to transmittance, and a Gaussian transformation was applied to fit the data to the same dimension [89,90]. The resulting matrix has 75 rows (samples) and 3600 columns (variables), one per 1 cm⁻¹ interval, to account for all possible signals. These data can be obtained by contacting the authors.
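The scaling and Gaussian step described above can be illustrated with a minimal Python sketch. It is not the PRISMA implementation: the function name `broadened_spectrum`, the grid start of 400 cm⁻¹, the 20 cm⁻¹ peak width, and the simple inverted-and-normalized curve used as a stand-in for transmittance are all assumptions made for illustration. Only the scaling factors and the 3600-point, 1 cm⁻¹ grid come from the text.

```python
import numpy as np

# Scaling factors quoted in the text for each functional.
SCALING = {"B3LYP": 0.9654, "B3PW91": 0.9643, "M062X": 0.9462, "PBE0": 0.9596}

def broadened_spectrum(freqs_cm1, intensities, functional,
                       grid_start=400, n_points=3600, fwhm=20.0):
    """Return (grid, transmittance-like curve) on a 1 cm^-1 grid.

    grid_start and fwhm are illustrative assumptions, not values from the study.
    """
    scaled = np.asarray(freqs_cm1, dtype=float) * SCALING[functional]
    inten = np.asarray(intensities, dtype=float)
    grid = grid_start + np.arange(n_points, dtype=float)  # 1 cm^-1 step
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    # Sum one Gaussian per vibrational mode (rows: grid points, columns: modes).
    diff = grid[:, None] - scaled[None, :]
    absorbance = (inten[None, :] * np.exp(-0.5 * (diff / sigma) ** 2)).sum(axis=1)
    # Normalize to [0, 1], then invert to a percent scale (assumed convention).
    peak = absorbance.max()
    if peak > 0:
        absorbance = absorbance / peak
    transmittance = 100.0 * (1.0 - absorbance)
    return grid, transmittance

# Two made-up harmonic modes (frequency in cm^-1, arbitrary intensity).
grid, t = broadened_spectrum([1650.0, 3050.0], [120.0, 40.0], "B3LYP")
```

Placing every spectrum on the same fixed grid is what allows spectra from different molecules and functionals to be stacked as rows of a single matrix.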
The spectroscopic data were evaluated by multivariate statistical analysis to elucidate the main characteristics of each group. Figure 2 outlines a flowchart depicting the sequential steps involved in the computational procedure.
Because of the substantial number of molecules involved in this study, we created PRISMA 2.0 software (dos Santos, C., São Carlos, Brazil) [91,92] to automate various tasks. Its implementation facilitated swift analysis of ORCA output, the generation and normalization of IR spectra, and the identification of potential errors. It also played a pivotal role in minimizing errors during the creation of input files. When negative frequencies were found, input parameters (geometry and precision) were adjusted, and the DFT calculation was repeated with the new atomic coordinates. This process was iterated until no negative vibrational frequencies were present in the results. Molecular similarity (RMSD, Root Mean Square Deviation) [93,94] was calculated between the original and optimized structures using the VMD (Visual Molecular Dynamics) program [95]. The data generated in ORCA were then used in the PRISMA program to produce the infrared spectrum. Developed in the Python™ language [96] with the PyQt5 graphical API [97][98][99][100][101][102], PRISMA employs a layered architecture [103] and encompasses modules for the following:

• Generating input files for different DFT methods.
• Analyzing vibrational frequency results in the ORCA results file.
• Generating a .csv file of the spectrum normalized by the ORCA file.
• Batch generating .csv files to process chosen folders with ORCA results: each file is analyzed individually, and the output includes the molecule identification data, the DFT method, and the spectral data (frequency range and intensities).
A PRISMA interface example is provided in Figures S1 and S2 in the Supplementary Materials Section.
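The RMSD comparison between original and optimized structures mentioned earlier in this section was performed in VMD. As a hedged illustration of the quantity itself, the sketch below computes an RMSD after optimal superposition with the Kabsch algorithm in NumPy. Whether the study superposed structures before measuring RMSD is not stated, so the alignment step is an assumption, and `kabsch_rmsd` and the toy coordinates are hypothetical.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate arrays after optimal superposition."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape:
        raise ValueError("coordinate arrays must have the same shape")
    # Remove translation by centering both structures on their centroids.
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)
    # Optimal rotation from the SVD of the covariance matrix (Kabsch algorithm).
    H = P.T @ Q
    U, S, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against improper rotations
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T
    P_rot = P @ R.T
    return float(np.sqrt(((P_rot - Q) ** 2).sum() / len(P)))

# Toy four-atom structure (angstroms) and a rotated, translated copy of it.
ref = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 1.2, 0.0], [0.0, 0.0, 0.9]])
theta = 0.7
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
moved = ref @ Rz.T + np.array([1.0, -2.0, 0.5])
same = kabsch_rmsd(ref, moved)        # rigid-body motion only

distorted = moved.copy()
distorted[1] += np.array([0.3, 0.0, 0.0])  # displace one atom
changed = kabsch_rmsd(ref, distorted)
```

A rigid rotation and translation leave the RMSD at essentially zero, while displacing a single atom yields a positive value, which is the behavior expected of a structural deviation measure.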

Unsupervised Learning Evaluation
The resulting matrix with all IR spectra was submitted to multivariate analysis. We performed Hierarchical Cluster Analysis (HCA), Principal Component Analysis (PCA), and t-distributed stochastic neighbor embedding (t-SNE) to evaluate the data [104][105][106][107]. We performed these analyses in R® (4.3.2, Posit Software, Boston, MA, USA) [108]. The links to the GitHub code are available in the Supplementary Materials Section.
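The analyses in this study were run in R. Purely as an illustration of the PCA step, the following Python sketch obtains scores and explained variance ratios from a singular value decomposition of the mean-centered matrix. The helper `pca_scores` is hypothetical, and the random matrix only mimics the 75 × 3600 shape quoted in the text; it is not the study's data.

```python
import numpy as np

def pca_scores(X, n_components=5):
    """Return (scores, explained_variance_ratio) from a mean-centered SVD."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                 # center each variable (column)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = S ** 2 / (X.shape[0] - 1)         # variance carried by each component
    ratio = var / var.sum()
    scores = U[:, :n_components] * S[:n_components]
    return scores, ratio[:n_components]

rng = np.random.default_rng(0)
# Synthetic stand-in for the 75 x 3600 spectral matrix (not the study's data).
X = rng.normal(size=(75, 3600))
scores, ratio = pca_scores(X)
```

The cumulative sum of `ratio` corresponds to the cumulative explained variance reported for the leading principal components, and the columns of `scores` are what a score plot displays.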

Structural and Spectroscopical Analyses
Tables 2-6 present the Root Mean Square Deviation (RMSD) values [95] for all molecules within each group, corresponding to the different functionals utilized. Each set of NPSs includes an illustrative example of the obtained infrared spectra, shown in Figures 3-7. Comprehensive infrared spectra for all molecules can be found in the Supplementary Materials (Figures S3-S77).

Results for the Studied Amphetamines
According to the results obtained for amphetamines, Table 2 shows that the minimum energy structures obtained in all cases differ, on average, by less than 0.3 angstroms from the crystallographic structure. The M062X functional showed the highest average deviation, while the other methods showed lower, equivalent deviations. Regarding the infrared spectra, Figure 3 shows the overlap of the results obtained for amphetamine. The main bands are highlighted. Visually, the M062X results are slightly shifted relative to the others.

Results for the Studied Benzodiazepines
Table 3 shows that the deviations for benzodiazepines were also less than 0.3 angstroms on average. A greater deviation is observed for the diclazepam molecule, but it is still only around 0.5 angstroms. Once again, the M062X method showed the highest deviations. Figure 4 shows the spectra obtained with each functional for the diazepam molecule. In this case, the signals do not overlap perfectly but fall within the same region.

Results for the Studied Synthetic Cannabinoids
Table 4 shows the RMSD results for synthetic cannabinoids. For the JWH-203 molecule, the B3LYP and B3PW91 methods showed an RMSD value above 2.0 angstroms, in contrast to the M062X and PBE0 methods, whose values were less than 0.5 angstroms. The molecule AM-1248 also showed a value greater than 1.0 angstroms in relation to the crystallographic structure for all methods, with the B3LYP and M062X methods showing the highest deviations. On average, the PBE0 method showed the lowest deviations. Figure 5 presents the spectrum for the cannabinol molecule. The bands appear in the same regions for all methods, with slight displacement between them.

Results for the Studied Cathinones
The results of the structural deviations between the calculated minimum energy and X-ray structures for cathinones are presented in Table 5. In this case, all functionals returned approximately the same average value, with the M062X method showing a slightly higher value, and cathinone having the highest individual deviation, around 0.95 angstroms. Figure 6, which presents the spectra for cathinone, shows that there is no significant disparity between the simulated bands across the methods.

Results for the Studied Opioids
Table 6 presents the results for fentanyls. In this case, a significant disparity is observed for the M062X method for some molecules, such as fentanyl and P-F-acetylfentanyl. The other methods show similar results for obtaining minimum energy structures, indicating equivalent deviations from the crystallographic structure. Figure 7 also shows that the M062X method presents a different shift for the fentanyl bands, while the other methods are more reproducible.

Results of Hierarchical Cluster Analysis (HCA)
Figure 8 shows the results of the Hierarchical Cluster Analysis, with an accuracy of 0.95 and a 95% confidence interval of (0.9189, 0.9717). Table 7 presents the results of the exploratory analysis in each class, while Table 8 displays the confusion matrix results of the HCA modeling. This analysis enables the validation of qualitative methods and ensures excellent reliability of the statistical procedure applied to the samples of interest. Thus, this evaluation makes it possible to indicate, with greater confidence, the probability that a new sample with characteristics like those used to create the model belongs to a given class [109,110].

Results of Principal Component Analysis (PCA)
Table 9 presents the results for variance from the Principal Component Analysis. The explained and cumulative variance up to the fifth principal component are presented. Figure 9 presents score plots for the first three principal components, combined in 2D views. Figure 10 shows a 3D view of the scores, and the ellipses represent the 95% confidence interval.
To verify the quality of the PCA, the variance explained in each principal component shows that more than 99% of all the original information is contained in just two components. Assessing the residual chart in Figure 11 is also a way to verify the quality of the PCA model. It revealed a random scatter pattern around zero, indicating a satisfactory fit of the model to the data. This observation suggests that the principal components adequately capture the variability present in the dataset without any systematic bias or unexplained patterns. The absence of discernible trends or deviations in the residuals plot provides confidence in the reliability of the PCA results and the validity of the underlying assumptions coming from the PCA [111,112].
Figure 12 shows the results of the t-SNE analysis. The best separation was obtained with three dimensions.

Discussion
In this work, we have evaluated the feasibility of generating spectroscopic information for NPSs through DFT calculations. According to the results from Section 3.1, all functionals performed similarly in predicting both the structure and the infrared spectra. However, M062X showed the largest deviations for all groups of molecules.
Regarding Section 3.2, HCA produced a clear separation among the different groups in the dendrogram. The numerical values in Tables 7 and 8 confirm our findings, as the values in the confusion matrix indicate that our model is highly sensitive and specific. HCA also yielded favorable precision and accuracy values. The No Information Rate provides a benchmark for whether our classifier outperforms random assignment, and the observed accuracy significantly surpasses it, affirming the adequacy of the classifier. The Kappa statistic measures how classification results compare with values assigned by chance. With a value of 0.9374, the agreement can be considered excellent, as values exceeding 0.8 are typically deemed very good [113]. Principal Component Analysis further corroborated the separation of groups, validating our observations from HCA. Table 9 indicates that only two principal components account for more than 99% of the variance.
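For readers unfamiliar with the statistics discussed above, the sketch below shows how accuracy, the No Information Rate, and Cohen's Kappa follow from a confusion matrix. The function `classification_summary` and the 3-class matrix are hypothetical and chosen only to keep the arithmetic easy to follow; they do not reproduce Table 8.

```python
def classification_summary(cm):
    """Accuracy, No Information Rate and Cohen's kappa from a square confusion matrix.

    cm[i][j] = number of samples predicted as class i whose reference class is j.
    """
    n = sum(sum(row) for row in cm)
    k = len(cm)
    observed = sum(cm[i][i] for i in range(k)) / n           # overall accuracy
    row_tot = [sum(cm[i]) for i in range(k)]                  # predicted totals
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # reference totals
    nir = max(col_tot) / n                                    # share of the largest reference class
    expected = sum(r * c for r, c in zip(row_tot, col_tot)) / n ** 2  # chance agreement
    kappa = (observed - expected) / (1 - expected)
    return observed, nir, kappa

# Hypothetical 3-class confusion matrix, for illustration only.
cm = [[18, 1, 0],
      [2, 19, 1],
      [0, 0, 19]]
acc, nir, kappa = classification_summary(cm)
```

In this toy case 56 of 60 samples lie on the diagonal, so the accuracy is about 0.933, the No Information Rate is 1/3, and the Kappa is 0.9: the agreement beyond chance, divided by the maximum possible agreement beyond chance.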
The application of the t-SNE algorithm resulted in excellent separation of the five substance categories in the reduced three-dimensional space, as illustrated in Figure 12. Each category formed a distinct cluster, underscoring the ability of t-SNE to delineate the characteristics and underlying patterns within the multi-dimensional space of NPS data. Such clustering supports the comprehension and classification of these substances, contributing to the understanding and management of NPS-related phenomena. PCA also provided reasonable separation; however, t-SNE exhibited superior performance in preserving local structures and capturing nonlinear relationships among the data points.
Considering the lack of information about these substances and the costs of analysis, infrastructure, and human resources involved in obtaining spectroscopic data, DFT methods combined with unsupervised analysis gave insights into the spectroscopic behavior of each group, providing a viable alternative for acquiring data and improving knowledge of NPSs [90,114].

Conclusions
This research used computational methods to obtain spectroscopic details about different new psychoactive substance (NPS) groups. We tested four different functionals to carry out the calculations. Our analysis revealed that the M062X functional showed slightly higher discrepancies in determining structures and infrared spectra; however, these differences were not considered significant in numerical terms. The distinct NPS groups could still be effectively distinguished through an unsupervised approach, with a clear separation among the clusters indicating the robustness of our strategy. Our results underscore the value of combining quantum chemistry with multivariate statistics. By employing Density Functional Theory (DFT) to simulate and predict infrared spectra, researchers can gain insights into molecular structures, detect chemical compositions, and discern subtle differences between compounds with remarkable accuracy. Integrating experimental data with computational modeling strengthens the precision and reliability of forensic analyses, aiding in identifying and characterizing unknown substances. By generating infrared data with DFT calculations, forensic chemists gain a potent tool to untangle complex chemical puzzles, consolidating information regarding NPSs and contributing substantively to unveiling their properties.

Figure 1. Categories for drug analysis as suggested by SWGDRUG [36]. * when used with a wavelength range.

Figure 2. Flowchart of the computational procedure.

Figure 3. Theoretical infrared spectra for amphetamine across all functionals.

Figure 4. Theoretical infrared spectra for diazepam across all functionals.

Figure 7. Theoretical infrared spectra for fentanyl across all functionals.

Table 1. The most straightforward structures for each group of NPSs.

Table 2. RMSD for the molecules in the amphetamines group.

Table 3. RMSD for the molecules in the benzodiazepines group.

Table 4. RMSD for the molecules in the cannabinoids group.

Table 5. RMSD for the molecules in the cathinones group.

Table 6. RMSD for the molecules in the opioids group.

Table 9. Principal Component Analysis results for variance.
