Article

Validating the Use of Natural Language Processing and Text Mining for Hospital-Based Violence Intervention Programs and Criminal Justice Articles

by Cyril S. Ku 1,*, Katheryne Pugliese 2, Jared R. Dmello 3, Morgan R. Peltier 4, Robert Green 5 and Sheetal Ranjan 6,7

1 Department of Computer Science, William Paterson University, Wayne, NJ 07470, USA
2 Department of Criminal Justice, John Jay College of Criminal Justice, New York, NY 10019, USA
3 School of Social Sciences, Adelaide University, Adelaide, SA 5005, Australia
4 Jersey Shore University Hospital, Hackensack Meridian School of Medicine, Nutley, NJ 07110, USA
5 School of Criminal Justice, Rutgers University, Newark, NJ 07102, USA
6 Department of Justice Studies & Sociology, Montclair State University, Montclair, NJ 07043, USA
7 Department of Psychiatry, Hackensack Meridian School of Medicine, Nutley, NJ 07110, USA
* Author to whom correspondence should be addressed.
Information 2025, 16(12), 1098; https://doi.org/10.3390/info16121098
Submission received: 27 October 2025 / Revised: 20 November 2025 / Accepted: 20 November 2025 / Published: 11 December 2025

Abstract

Hospital-based violence intervention programs (HVIPs) are a form of community violence intervention designed to address trauma resulting from violent injuries. This public health approach has been implemented across the United States since the 1990s, with numerous qualitative and quantitative studies evaluating its effectiveness. Manual systematic reviews by domain experts have helped identify major themes and research gaps. While these reviews are valuable for synthesizing the existing literature, this process can be time-consuming and labor-intensive, given the vast amount of research in public health and criminal justice. To meet the urgent need for accessible insights into the violence-related literature, more efficient methods are essential. Recent advances in artificial intelligence (AI) offer promising tools to streamline this process. This study applies AI, specifically natural language processing techniques, to analyze recurring themes in the HVIP-related literature at the intersection of criminal justice and public health. The findings indicate that text-mining methods can enhance and accelerate the systematic review process, while also revealing new insights. The results underscore the potential of AI-driven tools to support evidence-based practices and highlight the importance of interdisciplinary collaboration to improve the effectiveness and implementation of HVIPs.

1. Introduction

Hospital-based violence intervention programs (HVIPs) were conceived to address community violence, with the pioneering program established in Oakland, California, in 1994. Their expansion was propelled by the Department of Justice’s Office for Victims of Crime in 1998, and the U.S. now has over forty HVIPs, which are predominantly funded through the Victims of Crime Act and the Department of Justice (see https://www.thehavi.org). These programs craft individually tailored plans for victims, addressing issues like mental health, substance abuse, housing, employment, and education. They also help victims by harnessing resources from hospitals, law enforcement, communities, victims’ families, and social networks to address the underlying issues that put them at risk of further victimisation or retaliatory violence. While HVIPs’ effectiveness is generally gauged by metrics like reduced medical costs [1], improved personal outcomes [2], and reduced violent revictimization [1], existing qualitative and quantitative research has presented a fragmented picture. Notably, limited research has explored the convergence of HVIPs and the criminal justice system, which is an area of significant overlap. To fill this research gap, a domain expert in this area of research needs to first identify the relevant literature at the intersection of HVIPs and criminal justice [3,4].
Systematic reviews are a structured methodology for collating and evaluating evidence from multiple studies on a particular topic [5]. The first step is to gather relevant articles for the study. This process is time-intensive and often demands significant resources as the volume of research publications continues to grow. Advances in natural language processing (NLP) and text mining may offer researchers a solution to this literature selection. These methods could also make the process more repeatable and better able to incorporate future studies evaluating progress with HVIPs as the literature continues to grow. However, the integration of NLP methods into criminal justice research poses potential ethical quandaries like the obfuscation of oversight and accountability in the research process [6,7]. These challenges highlight the importance of developing NLP methods that address ethical concerns and empower scholars and practitioners to analyze broad sets of data.
This study examines the effectiveness of NLP and text-mining techniques in accelerating the literature selection process for systematic reviews, with a focus on the intersection of HVIPs and the criminal justice literature. This study begins with the manual collection of relevant publications that explore the involvement of criminal justice agencies in HVIPs. This curated body of literature is then analyzed using NLP and text-mining methods. The process is conducted using a visual programming tool. The goal is to showcase the benefits of these technologies in streamlining systematic reviews and to identify existing gaps in the literature, thus offering valuable insights for researchers, practitioners, and stakeholders in both the public health and criminal justice domains.

2. Literature Review

Criminal justice scholars have called for more focus on interventions that draw from an interdisciplinary framework to address community violence [8,9]. These types of interventions operate under the assumption that violence impacts not only the physical health of victims but also their mental health, their families, and the larger community. HVIPs have emerged as a promising method for intervening in cycles of violence by providing trauma-informed care to victims of violence and improving service delivery through community collaboration to collectively address the issues that contribute to ongoing neighbourhood challenges. This community-driven focus to address the cycles of violence draws from a transdisciplinary approach that relies on coordinated collaboration between health and justice agencies [9].
Many HVIPs utilize a community-coordinated response (CCR) model, which aims to bring together local legal and community-based agencies to collaborate on negotiations and more community-driven decisions. This collaboration draws from the expertise of community members and local service providers who are knowledgeable of the culture of the community to address violence as a public health issue rather than an individual-centred criminal justice issue [9]. Nevertheless, criminal justice agencies are important to HVIPs and the CCR model [10]. For example, law enforcement officers are often present when victims are initially brought to the hospital, especially for serious injuries like gunshot wounds. Additionally, many victims of violence have previously been involved with the criminal justice system, thus further demonstrating the overlap between HVIPs and law enforcement. For this reason, law enforcement and corrections agencies are key members of the CCR model for HVIPs. Integrating criminal justice agencies into CCR teams allows stakeholders to organize activities and interventions that are coordinated around the varying goals of stakeholders within the community [11].
Extensive research on HVIPs has shown positive findings regarding the individual impact that programs have on victims of violence [12,13,14,15]. This body of research relies on quantitative evaluation methods, which demonstrate the positive changes in clients, such as improvements in their mental and physical health and reductions in re-hospitalization. Considerably less research has examined community engagement between stakeholders, which is fundamental to successful HVIP outcomes [10]. In particular, there has been very little consideration of the collaboration between public health and criminal justice practitioners, which is crucial to HVIP success [10]. To reinforce these collaborations, a review of the current research needs to be conducted to better understand how these fields are interwoven.

NLP in Criminal Justice

NLP is a form of artificial intelligence (AI) that trains computers to learn and predict human language [16]. The use of AI in the public sector has streamlined communication with individuals and automated tasks in areas with large caseloads, such as criminal justice and public health settings. Criminal justice practitioners have utilized NLP tools to undertake administrative tasks like analyzing legal documents to ease the workload of corrections officers [6]. More recently, correctional settings have been utilizing NLP to administer and analyze large quantities of risk assessments that predict criminal behaviour and assist practitioners in making decisions in corrections [17]. However, this technique remains underutilized in research and practice.
The use of NLP in the field of criminal justice has raised questions regarding the ethics of using AI simulators to make legal decisions [6,7]. Završnik [7] (p. 568) argues that the computational complexity inherent to NLP causes decision-making processes to be “hidden from human oversight,” which detracts from the human rights that are intended within the criminal justice system. Similarly, Hunter and colleagues [18] argue that allowing machines to make decisions inherently raises ethical flags pertaining to oversight and accountability in the research process. Nonetheless, scholars have also emphasised the benefits of incorporating AI into the academic realm by engaging in ethically sound research protocols. For example, Schnoebelen [19] argues that focusing on specific goals or task processes is essential for engaging in ethical machine learning and NLP techniques in a research environment. In their study on gun violence, Patton et al. [20] found that hand models outperformed distant models, further contributing to the academic debate on the use of these technologies to assist in the research process.
There is some indication that the use of AI resources could be beneficial in criminal justice and health settings [21,22]. For example, Cook et al. [21] highlight that the quick detection of suicide risks could be crucial to averting a health-related emergency (see also: Morrow [23]). This is in line with the broader literature that has used NLP to study the intersection of justice and health [24,25,26]. The public health and medical fields have called for enhanced data analytics and AI-assisted approaches to research more broadly [27,28,29], so continued innovations in research focusing on healthcare approaches to justice necessitate a similar application. Past scholarship in this space also demonstrates that both the public health and criminological spaces are engaging in similar debates and considerations about how to process larger amounts of data more effectively to better inform policy and practice in real-world implementations [30].
NLP and text mining have also been used in criminal justice research to facilitate the analysis of high-volume data sources [31,32]. This approach has been implemented on a variety of resources, including social media posts [33,34], journal articles, technical reports [35,36], court records [37], personal narratives [38], and electronic health records [39]. This can yield a considerably quicker approach to learning about a topic of interest or understanding qualitative data. In their study on the police perception of de-policing, Mourtgos & Adams [32] used structural topic modeling, which is a form of NLP, to digitize open-ended survey responses and obtain a better understanding of the use of language across a large sample of police officers. Coulter et al. [31] used NLP to analyze pre-sentence reports in mainstream and indigenous sentencing courts to pick up on differences in language between each correctional setting. They argue that text-mining and NLP are effective in the examination of legal texts, highlighting where criminal justice research can use these resources to better understand large quantities of qualitative data. Given the mixed results of past research regarding hand coding versus computer-assisted methods (see Patton et al. [20]), the current study seeks to expand on this body of literature by using these approaches to analyze the body of scholarship related to HVIPs and criminal justice as a case study.

3. The Current Study

Developing a systematic literature review is an intensive process. There have been several step-by-step guides that inform researchers on how to automate this process [40,41]. Nevertheless, few studies have explored how NLP technology can be used to draw conclusions on a specific topic [42,43], and to the best of our knowledge, no studies have used NLP for systematic review synthesis in the criminal justice-related literature. While the initial goal of this study was to compare the efficacy of manual literature review analysis and NLP literature analysis, commitment changes among the authors repositioned the goals of this study. Rather than comparing these processes, this study draws from elements of both manual search and NLP analysis strategies to offer guidelines on how to best utilize NLP tools to analyze the literature and ensure efficiency and accuracy during this process. We posit that this study will provide a base for developing NLP and text-mining protocols for researchers and practitioners.

4. Methodology

This study utilizes two types of methods to determine how NLP software can be used to facilitate the systematic literature review process. This study was conducted in two parts. First, a domain expert (a criminal justice doctoral candidate) with familiarity and experience in working on HVIP projects in the Healthcare Approaches to Justice Collaborative (HAJC) [44] was tasked with identifying the current body of literature at the intersection of criminal justice and HVIPs. Next, NLP methods were used by a computer scientist (Professor) in the HAJC to highlight common topics in the literature and identify themes.

4.1. The Manual Search Process Conducted by the Domain Expert

The first part of this study consisted of a manual systematic literature review by a domain expert to identify related publications. The databases used in this study were ProQuest (https://www.proquest.com/), PubMed (https://pubmed.ncbi.nlm.nih.gov/), Google Scholar (https://scholar.google.com/), Google (https://www.google.com/), and Web of Science (http://www.webofscience.com). The following initial search terms were utilised: ‘Hospital-based violence intervention’, ‘Hospital-based violence intervention programs’, ‘violence intervention’, ‘violence intervention programs’, ‘hospital-based violence victims’, ‘hospital-based violence offenders’, ‘hospital-based violence intervention AND criminal justice’, ‘hospital-based violence intervention programs AND criminal justice’, ‘hospital-based violence intervention AND law enforcement’, and ‘hospital-based violence intervention programs AND law enforcement’. Articles were then acquired, and the reference lists were hand-searched to identify other potentially relevant studies for inclusion in this study.
Full-text papers were then manually reviewed by the domain expert to determine whether they were in the scope of this literature search. The articles were selected for review if they: (a) were written in English; (b) were published within the last 20 years; (c) involved a qualitative, quantitative, or mixed-methods empirical assessment; and (d) evaluated a hospital intervention in connection with criminal justice and law enforcement, currently in operation or otherwise. No a priori exclusion criteria were applied so all types of victims of violence (e.g., including studies focused on adults, women, gang violence, or domestic violence) would be utilized for our study, as narrowing the violence to one type would limit our ability to assess the research.
Utilizing the search terms mentioned above, several hundred thousand results appeared, including overlapping ones. For example, searching Google Scholar for ‘hospital-based violence intervention program’ yielded 33,800 results. Scholarly publications, working papers, technical reports, conference proceedings, “in-house” studies, newspaper articles, and blogs are just some examples of these results. Following the suggestion of Haddaway et al. [45], only the first 300 results for each search term were reviewed within the databases. Studies that did not fit within this research and/or overlapped within databases were excluded. This resulted in 147 papers remaining before independent review. Out of these 147 papers, 66 were eliminated because they (1) were written more than 20 years ago, or (2) had a title that seemed to be relevant to our research but were not freely available online. In the end, 81 articles were manually selected for review. The PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow diagram for this search and screening process is shown in Figure 1.
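The screening steps described above (capping the reviewed results per search term and dropping overlapping hits) can be sketched in a few lines of Python. This is only an illustration: the function name `screen`, the toy titles, and the naive title-based duplicate key are assumptions made here, not part of the authors' manual procedure.

```python
from itertools import islice

MAX_RESULTS_PER_TERM = 300  # cap on reviewed results per search term, as described in the text


def screen(results_by_term, max_per_term=MAX_RESULTS_PER_TERM):
    """Keep the first `max_per_term` hits per search term and drop duplicates."""
    seen = set()
    kept = []
    for hits in results_by_term.values():
        for title in islice(hits, max_per_term):
            key = title.strip().lower()  # naive duplicate key (an assumption)
            if key not in seen:
                seen.add(key)
                kept.append(title)
    return kept


# Toy, made-up search results keyed by search term
toy_results = {
    "hospital-based violence intervention": ["Study A", "Study B", "study a "],
    "violence intervention programs": ["Study B", "Study C"],
}
unique_titles = screen(toy_results)

# Counts reported in the text: 147 papers before independent review, 66 eliminated
remaining_after_review = 147 - 66
```

With the toy input, the duplicate "study a " and the repeated "Study B" are dropped, and the reported counts reconcile to the 81 articles retained for review.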
We chose a smaller corpus of 81 abstracts because they tend to follow a relatively consistent structure and writing style, which reduces methodological noise and makes the results easier to interpret and validate. Using a smaller, more homogeneous dataset also allowed us to cross-check the automated output against manual review, thereby strengthening the validity of the approach in this proof-of-concept study.

4.2. NLP and Topic Modeling

Typically, the next phase of a systematic literature review would be to manually review the articles, code the articles, and determine the recurrent themes and conclusions that arise from the literature. At times, this phase of the process can be the lengthiest. Therefore, the use of NLP can help to speed up this process by allowing AI technology to compile information and identify the key links and themes that the articles share. The second part of this study utilizes publications collected from the manual literature search to be analyzed using NLP and text-mining tools to extract the key topics or themes of the literature.
For years, programmers have relied on languages such as Python and R to perform NLP and text-mining tasks using specialized libraries like the Natural Language Toolkit (NLTK) [46]. To reduce the need for manual coding, visual programming environments have emerged, including Orange [47], RapidMiner [48], WEKA [49], and SAS Enterprise Miner [50] with Text Miner.
In this study, we utilized SAS Enterprise Miner [51] with Text Miner [52] version 15.2 as our software tool. This visual programming platform integrates NLP tasks—such as analyzing grammatical structure and semantic meaning—with text-mining techniques that examine word frequencies, patterns, relationships between terms and documents, and topic modeling. One of its key features is the ability to group similar documents into clusters, based on the frequency and distribution of terms across the entire corpus and within individual documents [53]. SAS Text Miner provides a user-friendly and accessible environment while still leveraging established algorithms like topic modeling.
Figure 2 presents the process diagram created in the SAS Visual Programming environment. The NLP and text-mining workflow is carried out through four primary nodes: File Import, Text Parsing, Text Filter, and Topic Modeling. These components work in sequence to facilitate data preprocessing and thematic analysis. Each of these nodes is briefly described in the subsections that follow.

4.2.1. File Import

The manual literature collection process requires an individual to review the full article to determine whether it will be included for analysis. While we originally intended to use the 81 articles to be input into the File Import node for validation, upon inspection, most had lengthy descriptions that included the objective, method, result, discussion, and conclusion sub-sections. Therefore, we used more synthesized descriptions from the study's abstract. In addition to simplifying the NLP process, the use of abstracts indicates the strength of the manual literature process and speeds up the analysis process. Please note the following terminology for our validation study using SAS Text Miner: (1) a word is a term, (2) an abstract is the same as a document, and (3) all the documents form a corpus. Therefore, the 81 abstracts (hereinafter referred to as the corpus) are used for the ‘File Import’ node of the process.

4.2.2. Text Parsing

Before text mining algorithms could be run, NLP software was used to parse the information in the corpus. This process converts the text within the corpus (in unstructured form) to a vector representation (a structured, numerical form such as a frequency matrix). The conversion of textual data into numerical form allows data mining algorithms to work on textual data and move beyond syntax (structure) to semantics (meaning of words). To complete the parsing process, the output of the File Import node was imported to the Text Parsing node, which allows Text Miner to identify the different parts of speech, noun groups, and multi-word terms.
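To make the idea of a vector representation concrete, the following minimal Python sketch builds a term-by-document frequency matrix from two made-up sentences. It is not the SAS Text Parsing node: it only lowercases and splits on letters, and it does not perform the part-of-speech tagging or noun-group detection described above. The names `tokenize` and `term_document_matrix` are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic runs only (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def term_document_matrix(docs):
    """Return the sorted vocabulary and one row of term counts per document."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[t] for t in vocab] for c in counts]
    return vocab, matrix


# Toy corpus (made-up sentences, not taken from the 81 abstracts)
toy_corpus = [
    "Violence intervention reduces reinjury.",
    "Hospital programs address violence and trauma.",
]
vocab, matrix = term_document_matrix(toy_corpus)
```

Each row of `matrix` is the numerical form of one document, and each column corresponds to one term in `vocab`.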

4.2.3. Text Filtering (Text Filter)

Subsequently, text filtering was used to reduce the number of terms included in the text analysis. After parsing the texts, the Text Filter node was used to filter out terms present in the corpus. The node configures the term weights to be based on the frequency with which they appear in the texts and removes extracted words that are not related to the topic of study. For example, if a term appears in fewer than four documents, then the Text Filter node filters the term out. Each filtered term can also be manually reviewed by the researcher and modified for inclusion and exclusion in the analysis to ensure that important terms are not accidentally filtered out. This step can be one of the more time-consuming tasks, in which a user must manually explore all the terms and treat similar terms as synonyms. Because this study works with articles that have already been manually selected for relevance, we decided to let the software decide which terms to drop and which terms to keep, without researcher review.
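The document-frequency rule in the example above (drop any term that appears in fewer than four documents) can be sketched as follows. This is an illustrative stand-in for the Text Filter node, not its implementation; it ignores term weighting and synonym handling, and the function name and toy documents are assumptions.

```python
from collections import Counter

MIN_DOCS = 4  # threshold taken from the example in the text


def filter_terms(tokenized_docs, min_docs=MIN_DOCS):
    """Drop every term that occurs in fewer than `min_docs` documents."""
    doc_freq = Counter()
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))  # count each term once per document
    kept = {t for t, n in doc_freq.items() if n >= min_docs}
    return [[t for t in tokens if t in kept] for tokens in tokenized_docs]


# Toy tokenized documents (made up for illustration)
toy_docs = [
    ["violence", "trauma", "hospital"],
    ["violence", "youth"],
    ["violence", "trauma"],
    ["violence", "firearm"],
]
filtered = filter_terms(toy_docs)
```

In this toy corpus only "violence" occurs in four documents, so it is the only term that survives the filter.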

4.2.4. Topic Modeling (Text Topic and Text Cluster)

After filtering out and determining the words that are to be used in the analysis, the Text Topic node was used to create topics or themes from the input text. The topics were then modelled to create clusters, which is a process known as topic modeling. SAS Text Miner provides both Latent Semantic Indexing (LSI) and Latent Dirichlet Allocation (LDA) for topic modeling. We selected LDA because it offers a probabilistic generative framework that models documents as mixtures of latent topics and topics as distributions over words, which favors topic coherence and allows for greater interpretability of the results compared to the linear algebraic decomposition approach used by LSI. LDA has also been successfully applied in several prior projects by the research team, thus ensuring methodological continuity and familiarity. Within SAS Text Miner, hyperparameters such as the number of topics, maximum iterations, and convergence criteria can be adjusted. For this study, we retained the software’s default settings, which produced stable and reliable results. This approach allowed us to leverage a robust, well-established algorithm while maintaining accessibility for the intended audience of criminal justice and HVIP researchers, many of whom may not have advanced programming expertise in languages such as Python or R.
This process is used to outline the major topics or themes that are present in an unstructured text, and it is often used to examine large bodies of text. The SAS Text Miner Text Topic node performs topic modeling. Each word in the body of the article contributes to the theme or topic, and each article may belong to multiple topics. For the topic formation process in the Text Topic node, the sequence of high-weighted terms was set to a multi-term topic of five, thus combining five terms into each topic group.
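The description of LDA as modeling documents as mixtures of topics, and topics as distributions over words, can be illustrated with a small numerical sketch. The probabilities below are invented for illustration, and the code only evaluates the mixture for one document; it does not fit an LDA model and says nothing about how SAS Text Miner estimates these quantities.

```python
# Hypothetical topic-word distributions: each topic's probabilities sum to 1
topic_word = {
    "trauma": {"ptsd": 0.5, "injury": 0.4, "firearm": 0.1},
    "violence": {"ptsd": 0.1, "injury": 0.3, "firearm": 0.6},
}

# Hypothetical topic mixture for a single document (sums to 1)
doc_topic = {"trauma": 0.7, "violence": 0.3}


def word_probability(word, doc_topic, topic_word):
    """P(word | doc) = sum over topics of P(topic | doc) * P(word | topic)."""
    return sum(weight * topic_word[topic].get(word, 0.0)
               for topic, weight in doc_topic.items())


vocab = ["ptsd", "injury", "firearm"]
doc_word = {w: word_probability(w, doc_topic, topic_word) for w in vocab}
```

Because the document leans toward the hypothetical "trauma" topic, words that are probable under that topic receive more weight in the document's overall word distribution.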

4.2.5. Number of Topics

The SAS Text Miner default setting for the number of topics is 25 (with each topic consisting of 5 terms). Using this setting, the software produced the following 16 topics from (A) to (P) shown in Table 1.
Terms with a ‘+’ sign in front of them mean that they are a root term. For example, ‘+estimates’ is the root term for ‘estimated’, ‘estimating’, and ‘estimation’. The Term Cutoff and Doc Cutoff columns specify the weightings that the software uses to determine whether a term should be in the topic and whether a document should be associated with that topic, respectively. The # of Terms column indicates the total number of terms that make up a topic. The final column, # of Docs, refers to the number of documents that belong to that topic. The total number of documents does not add up to 81 (the input abstracts) because a document can belong to multiple topics. A key finding to point out here is that the simple analysis of abstracts is sufficient in highlighting the key topics that are the focus of this corpus.
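The reason the per-topic document counts can exceed the corpus size is easiest to see with a small sketch. The weights and the single shared cutoff below are made up; as the table columns described above indicate, the software reports its own cutoffs for each topic, which this simplified example does not reproduce.

```python
def docs_per_topic(doc_weights, doc_cutoff):
    """Count, for each topic, the documents whose weight meets the cutoff."""
    counts = {}
    for weights in doc_weights.values():
        for topic, weight in weights.items():
            if weight >= doc_cutoff:
                counts[topic] = counts.get(topic, 0) + 1
    return counts


# Hypothetical document-topic weights for three documents and two topics
toy_weights = {
    "doc1": {"A": 0.6, "B": 0.5},   # above the cutoff for both topics
    "doc2": {"A": 0.1, "B": 0.7},
    "doc3": {"A": 0.4, "B": 0.05},
}
counts = docs_per_topic(toy_weights, doc_cutoff=0.3)
total_assignments = sum(counts.values())
```

Here `doc1` is counted under both topics, so the assignments total four even though there are only three documents.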
The topics were all related to HVIPs, violence, patients/victims, criminal justice, demographics, or related themes. The software generated only 16 topics (as opposed to the default setting of 25), which indicates that the literature collected by the domain expert was concentrated on these topics. During this process, it would be ideal for the software to generate fewer topics because this means that the manual literature search is focused and not scattered across many themes. As mentioned above, the topic modeling process produces a cluster of terms that form a topic. In our analysis of topics, we found that some terms overlapped among the topics. For example, ‘+woman’ appears in Topics (C) and (O); ‘+survivor’ appears in Topics (E) and (M); ‘+youth’ appears in Topics (A) and (L); ‘+state’ appears in Topics (E) and (L); ‘abuse’ appears in Topic (H) and Topic (M); and ‘+client’ appears in Topics (D), (N), and (O). To address these overlapping terms, we tested whether we could narrow down the number of topics to produce a set of topics that did not have any overlapping terms. To do this, we reduced the number of topics one by one and checked the uniqueness of the terms. When we reached 11 topics, the only overlapping term was ‘+youth’ (topics t7 and t8 in Table 2). When testing for 10 and 9 topics, the results were the same. Therefore, we concluded that 11 topics formed a stable cluster for the corpus.
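The uniqueness check applied at each topic count can be expressed as a short helper. The topic labels and most of the terms below are placeholders rather than the contents of Table 1 or Table 2; in practice the check would be re-run on the software's output each time the number of topics is reduced.

```python
from collections import defaultdict


def overlapping_terms(topics):
    """Return each term that appears in more than one topic, with those topics."""
    where = defaultdict(list)
    for name, terms in topics.items():
        for term in set(terms):
            where[term].append(name)
    return {t: sorted(names) for t, names in where.items() if len(names) > 1}


# Placeholder topics (hypothetical terms, not the study's actual topic contents)
toy_topics = {
    "t1": ["+youth", "+program", "+hospital"],
    "t2": ["+youth", "+firearm", "+injury"],
    "t3": ["+woman", "abuse", "+partner"],
}
overlaps = overlapping_terms(toy_topics)
```

An empty result would mean every term is unique to one topic; here the helper flags "+youth" as shared by `t1` and `t2`.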
Our initial analysis consisted of using topics and themes interchangeably, which is the goal of topic modeling. However, literature synthesis requires a thematic analysis wherein topics are clustered based on common themes. For example, if a topic consists of ‘apple’, ‘orange’, ‘banana’, ‘pear’, and ‘kiwi’, the theme is ‘fruit’. From the topics listed in Table 2, we manually grouped the topics into 4 themes based on the topics provided from the analysis (Table 3). We finally decided that a higher level of abstraction could group the topics into 2 themes, as shown in Table 3 below. The topic modeling algorithm in SAS Text Miner confirmed that all 81 abstracts have the major theme of HVIPs, criminal justice, or both.

4.2.6. Highlighting Key Themes

The topic groups consist of documents that are grouped based on similar topics identified in the corpus (in this case, the abstracts of the collected literature). Drawing from these topic groups, we can further cluster the groups of documents into themes that allow for a synthesis of the literature and a better understanding of information. In a manual review of the literature, the author would need to read over each article and code it by hand based on the information detected from the literature.
It is important to note that a specific article likely has topics that fall across all four themes. The manual search process conducted at the start of this project helps to ensure that articles fall within the confines of criminal justice/public health research. The discussion of the themes created from the identified topics assures that studies in both lines of research (criminal justice and public health) take on these specific topics. While the topics do not provide ample information on the literature that we collected and analyzed, it is possible to draw some conclusions on the trends in the literature at the intersection of public health and criminal justice. Below is a brief discussion of each theme and the importance it plays in the HVIP literature.
Theme 1: Trauma and Reinjury. Topics related to trauma and reinjury form one of the major themes identified in the body of articles analyzed in this study. A key goal of HVIPs is to provide trauma-informed care that helps individuals avoid reinjury through social assistance and clinical sessions that address the trauma incurred from their injury or from earlier experiences. Topics in this theme encompass key mental health concepts, such as post-traumatic stress disorder (PTSD) and symptoms related to the trauma these events may cause. Document groups also cover reinjury, or “injury recidivism” (highlighted in Topic t5). These topics outline some of the more measurable outcomes of HVIPs, such as rehospitalization and reinjury, which individuals may experience if their trauma is not sufficiently addressed. Identifying this theme clarifies the public health goals and outcomes of HVIPs and shows where the criminal justice-related HVIP literature explores these topics.
Theme 2: Practitioners and Program Development. The second theme relates to practitioner and program development. Drawing on the topics that recur across the corpus, this theme supports a more thorough discussion of programmatic information, such as staffing, client management, and case processing. Given the hospital-based setting in which HVIPs are implemented, it is not surprising that practitioner and program development is prevalent in the selected body of research.
Theme 3: Domestic and Gender-based Violence. The third theme highlights key intersections between HVIPs and victimization, covering intimate partner violence (IPV), elder abuse, and domestic violence as they relate to service delivery. The presence of these topics in the corpus points to the role that HVIPs play for women who are victims of violence, a population that is often challenging for service providers to reach [54,55]. A distinctive set of topics in this theme concerns elder abuse, which suggests where HVIPs can be leveraged to address it. Elder abuse is a growing topic in criminal justice as researchers and practitioners have started to shed light on the unique needs and risks that the elderly population faces [56]. This review, therefore, identifies where trauma-informed hospital programs can be used to address the needs of these populations.
Theme 4: Violence and Victimization. The fourth and final theme covers violence and victims of violence. These topics describe the population affected by violence and victimization, as seen in Topic t3 (Table 3), where terms for men, and Black men specifically, recur. Topics in this theme also appear to relate to gun violence (firearms) and possible avenues of violence prevention. HVIPs and HVIP services are oriented toward individuals who are victims of violence. The recurrent topics in this theme highlight the needs of clients and the importance of paying special attention to gun violence.

5. Discussion

Using NLP technology and manual review, we synthesized and thematically grouped the 81 article abstracts selected for this study. We showed that NLP methods (text parsing, text filtering) and text-mining algorithms (topic modeling) together form an effective software tool for analyzing large quantities of literature. Our study revealed three key findings: (1) article abstracts are effective for determining the key themes in a literature synthesis; (2) the key topics reveal the range of information that exists in the HVIP and criminal justice-related literature; and (3) topics can be grouped into central themes that identify the key ideas currently present in the HVIP and criminal justice-related literature and point out potential gaps in that literature.
First, we found that the manual literature search was an effective starting point for this review and that abstracts alone are sufficient for analyzing the literature with NLP methods and producing quality results. This result comes from the Text Topics phase, which analyzed the abstracts of all the studies and identified 16 topic clusters. Because the software can generate up to 25 topic clusters, the smaller number identified indicates that the articles selected through the manual review are related to one another. Additionally, because we could narrow the topics down to 11 topic clusters, an analysis of the abstracts alone appears sufficient for this research and for outlining the basic ideas present in the text. The previous literature has stated that combining NLP with manual research methods reduces the time needed for literature synthesis while also helping to organize and properly contextualize the data [30]. Therefore, using manual literature review together with NLP could help to organize the research around the most relevant information.
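The text parsing and text filtering steps referred to in this discussion can be illustrated with a minimal Python sketch. It is not the SAS Text Miner implementation used in the study: the stop-word list, the `min_docs` threshold, the function names, and the toy abstracts are all assumptions introduced only for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration; a real run would use a fuller list.
STOP_WORDS = {"a", "an", "and", "are", "the", "of", "in", "to", "for", "is", "with", "on"}


def parse(abstract):
    """Text parsing: lowercase an abstract and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", abstract.lower())


def filter_terms(parsed_docs, min_docs=2):
    """Text filtering: drop stop words and terms found in fewer than `min_docs` documents."""
    doc_freq = Counter()
    for tokens in parsed_docs:
        # Count each term once per document, ignoring stop words.
        doc_freq.update(set(tokens) - STOP_WORDS)
    keep = {term for term, n in doc_freq.items() if n >= min_docs}
    return [[t for t in tokens if t in keep] for tokens in parsed_docs]


# Toy abstracts invented for this example (not taken from the study corpus).
abstracts = [
    "Hospital-based programs reduce reinjury in youth.",
    "Reinjury and trauma are outcomes of violence.",
    "Programs address trauma in youth.",
]
filtered = filter_terms([parse(a) for a in abstracts])
print(filtered)
```

Terms that survive filtering would then be passed to a topic model. Raising `min_docs` removes rarer vocabulary, which is one reason short, homogeneous abstracts can yield a compact set of topics.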
Next, central themes in the literature are apparent in the topics identified using NLP software. These central themes reveal the importance of treatment, violence and victimization, programmatic organization, and domestic abuse within the HVIP literature that intersects with public health and criminal justice. The identification of these specific themes in the topics also pinpoints some key gaps in the literature. For example, some HVIPs, such as Project HEAL based in New Jersey [57], have started to integrate victims of childhood abuse. These organizations have argued for the importance of targeting individuals who have grown up with adverse childhood experiences (ACEs) and the long-term trauma that individuals carry. This framework is a key facet of the criminal justice literature and appears to be less frequent in the themes identified in this study. More research exploring HVIP clients who have experienced childhood victimization is necessary to better understand their needs.
The implementation of NLP technology is evolving in criminal justice research and practice [31,32]. This approach builds on other implementations of AI in healthcare research settings; for example, scholars have used machine learning to predict delirium risks in hospitalized patients [39]. These techniques have been applied in Australia and New Zealand, thus streamlining the use of AI in research and practice to better detect cultural nuances in criminal justice settings [38]; specifically, this technique can be applied to provide more nuanced insights into marginalized and historically targeted populations [31]. As these practices expand globally, emphasis needs to be placed on how to ethically leverage these technologies to serve the societal good. For example, within the practitioner setting, units often contend with growing caseloads and significant staffing shortages; AI can serve as a mechanism for streamlining efforts while still providing powerful insights to ensure data-driven policy and practice [42,43].
A note of caution is warranted regarding algorithmic biases in NLP. Topic modeling and related methods may inadvertently reflect or amplify biases present in the underlying literature, such as the over- or under-representation of certain populations, perspectives, or intervention outcomes. These biases carry the risk of misinterpretation if automated outputs are applied uncritically in practice. Therefore, careful validation, transparency in algorithmic design, and continued human oversight are essential to mitigate bias and ensure the responsible application of NLP tools in HVIP research.

6. Limitations and Future Research

Methodologically, this study sought to use NLP and text mining processes to validate whether articles identified by a domain expert were relevant to the research on HVIPs and criminal justice. The domain expert did not record the amount of effort or time that it took her to collect all the articles, as this was not the original goal of her task. Future research should measure the time required by each method to assess whether a software tool saves time and to evaluate how efficiently each approach uses resources. However, given our findings, it is likely that significant time savings would be achieved if SAS Text Miner or a similar text mining tool were used from the start. In addition, this tool provides further insights into the text that would not otherwise be available in the manual process, while automating the protocol to enable big data approaches in a shorter time interval. This would improve our ability to remain updated on this literature, which would help practitioners stay informed on current evidence-based findings and approaches.
Future research can extend this prototype study in several important ways. First, while we selected LDA for its interpretability and prior validation, systematic comparisons with alternative methods such as latent semantic indexing (LSI) could clarify the impact of algorithm choice and hyperparameter settings on topic quality. Additionally, the mapping from terms/topics to themes was based on manual judgement. This process was carried out collaboratively by the interdisciplinary team of the authors. The emergence of themes would be strengthened through explicit machine-based metrics (e.g., topic coherence, perplexity) and formal inter-rater reliability measures among human evaluators. Second, although we focused on a smaller, homogeneous set of abstracts to establish feasibility, scaling the analysis to the full 12,000-document corpus will allow us to evaluate the method’s scalability, yield additional insights, and explore the potential for automated document filtering. Third, applying topic modeling across publication years could reveal the temporal evolution of HVIP research, highlighting how thematic emphases have emerged, shifted, or declined over time. Together, these directions point toward a broader research agenda that leverages NLP not only for methodological rigor but also for advancing the substantive understanding of HVIPs.
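Two of the validation measures named above can be sketched in a few lines of Python. The sketch assumes the UMass variant of topic coherence (one of several coherence measures) and Cohen's kappa for two raters (one of several inter-rater statistics). Neither was computed in this study, and the toy documents and theme assignments below are invented for illustration.

```python
import math
from collections import Counter


def umass_coherence(top_terms, docs):
    """UMass coherence: sum of log((D(w_m, w_l) + 1) / D(w_l)) over ordered term pairs."""
    doc_sets = [set(d) for d in docs]

    def doc_freq(*terms):
        # Number of documents that contain every term in `terms`.
        return sum(1 for s in doc_sets if all(t in s for t in terms))

    score = 0.0
    for m in range(1, len(top_terms)):
        for l in range(m):
            denom = doc_freq(top_terms[l])
            if denom:  # skip terms absent from the corpus to avoid dividing by zero
                score += math.log((doc_freq(top_terms[m], top_terms[l]) + 1) / denom)
    return score


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:  # both raters used a single identical label throughout
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy tokenized documents (hypothetical, not the study corpus).
docs = [["reinjury", "trauma"], ["reinjury", "trauma"], ["reinjury", "gun"], ["gun", "youth"]]
coherent = umass_coherence(["reinjury", "trauma"], docs)
incoherent = umass_coherence(["reinjury", "youth"], docs)

# Hypothetical theme assignments for four topics by two reviewers.
reviewer_1 = ["Trauma", "Trauma", "Violence", "Violence"]
reviewer_2 = ["Trauma", "Trauma", "Violence", "Trauma"]
kappa = cohens_kappa(reviewer_1, reviewer_2)
print(coherent, incoherent, kappa)
```

Higher (less negative) coherence indicates that a topic's top terms tend to co-occur in the same documents, while kappa values closer to 1 indicate stronger agreement between reviewers on the topic-to-theme mapping.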
Finally, our future research in text mining will use generative AI technology. Generative AI [58] refers to a subset of AI models that can produce content, such as text, images, or music, which often mimics the data it has been trained on. Large language models like ChatGPT [59] are exemplary instances of this technology in the domain of NLP. Trained on vast amounts of text, ChatGPT can generate human-like, coherent, and contextually relevant sentences. This capability positions this technology as a powerful tool for text mining. Unlike traditional text-mining methods that search for explicit patterns in data, large language models can interpret, generate, and modify content, thus making them suitable for more complex tasks. They can extract themes from large datasets, summarize content, and even translate between different terminologies or contexts. These newer approaches, including community detection algorithms and transformer-based models such as BERTopic, may offer finer control over topic resolution and should be explored to assess whether their additional complexity yields deeper insights. We plan to use large language models to perform and expedite the meta-analysis and meta-synthesis processes.

7. Conclusions

Research on HVIPs spans diverse lenses and disciplinary frameworks, thus making it essential for scholars to adopt holistic approaches that transcend disciplinary boundaries. Our findings illustrate that multiple pathways exist for engaging with this body of work, ranging from manual review and coding by domain experts to automated processes employing machine learning and NLP. Manual review provides researchers with deep contextual access but is time-intensive, whereas AI-enabled analysis can rapidly synthesize large corpora through advanced grouping mechanisms. As documented by Hunter and colleagues [18] in the broader criminal justice system, these approaches can be adapted to applied settings, in which they inform healthcare-oriented responses to justice and violence intervention. Ultimately, this study demonstrates that effective collaboration in healthcare approaches to justice requires not only methodological innovation but also disciplinary integration, thus uniting medical, public health, and criminological scholarship within a shared framework. In practice, this calls for cross-disciplinary teams capable of developing and implementing data-driven strategies for violence intervention.
This study applied established NLP and text-mining methods, specifically LDA topic modeling implemented through SAS Text Miner, to analyze recurring themes in the HVIP-related literature at the intersection of criminal justice and public health. The results demonstrate how well-known NLP techniques can accelerate the systematic review process and reveal novel insights, thereby underscoring the potential of AI-driven tools to support interdisciplinary research in text summarization, extraction, and classification without requiring advanced programming expertise. This study highlights the translational and practical value of using reproducible NLP tools to support evidence-based inquiry across applied domains. Rather than pursuing methodological innovation, our focus was on adapting proven approaches such as LDA for real-world contexts. Future research may extend this work by integrating newer NLP models, including transformer-based and hybrid methods, to bridge applied work with emerging methodological developments in information science.

Author Contributions

Conceptualization, C.S.K.; Data curation, C.S.K. and R.G.; Formal analysis, C.S.K. and M.R.P.; Investigation, C.S.K., K.P. and R.G.; Methodology, C.S.K.; Project administration, C.S.K. and S.R.; Resources, C.S.K., R.G. and S.R.; Software, C.S.K.; Supervision, M.R.P. and S.R.; Validation, C.S.K., K.P., J.R.D. and M.R.P.; Visualization, C.S.K., K.P. and J.R.D.; Writing—original draft, C.S.K.; Writing—review & editing, C.S.K., K.P., J.R.D. and S.R. All authors have read and agreed to the published version of the manuscript.

Funding

Cyril S. Ku’s research for this work was partially supported by the National Science Foundation under Grant No. 2028011.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original data presented in the study are openly available in ProQuest at https://www.proquest.com/, PubMed at https://pubmed.ncbi.nlm.nih.gov/, Google Scholar at https://scholar.google.com/, Google at https://www.google.com/, and Web of Science at http://www.webofscience.com.

Acknowledgments

The authors gratefully acknowledge Christine Neudecker for her valuable contributions to this work. Christine conducted the manual systematic review that initially motivated our exploration of NLP-driven approaches to enhance the process. She also played a significant role in several phases of this research, including conceptualization, data collection, investigation, and drafting the original manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AI: Artificial Intelligence
HAJC: Healthcare Approaches to Justice Collaborative
HVIP: Hospital-based Violence Intervention Program
NLP: Natural Language Processing
PRISMA: Preferred Reporting Items for Systematic reviews and Meta-Analyses

References

  1. Gorman, E.; Coles, Z.; Baker, N.; Tufariello, A.; Edemba, D.; Ordonez, M.; Walling, P.M.; Livingston, D.H.M.; Bonne, S. Beyond recidivism: Hospital-based violence intervention and early health and social outcomes. J. Am. Coll. Surg. 2022, 235, 927–939.
  2. Richardson, J.B., Jr.; Wical, W.; Kottage, N.; Bullock, C. Shook Ones: Understanding the Intersection of Nonfatal Violent Firearm Injury, Incarceration, and Traumatic Stress Among Young Black Men. Am. J. Men’s Health 2020, 14, 1557988320982181.
  3. Affinati, S.; Patton, D.; Hansen, L.; Ranney, M.; Christmas, A.B.; Violano, P.; Sodhi, A.; Robinson, B.; Crandall, M.; from the Eastern Association for the Surgery of Trauma Injury Control and Violence Prevention Section and Guidelines Section. Hospital-based violence intervention programs targeting adult populations: An Eastern Association for the Surgery of Trauma evidence-based review. Trauma Surg. Acute Care Open 2016, 1, e000024.
  4. Strong, B.L.; Shipper, A.G.; Downton, K.D.; Lane, W.G. The effects of health care–based violence intervention programs on injury recidivism and costs: A systematic review. J. Trauma Acute Care Surg. 2016, 81, 961–970.
  5. Xiao, Y.; Watson, M. Guidance on conducting a systematic literature review. J. Plan. Educ. Res. 2019, 39, 93–112.
  6. Barrio, F. The Procrustean Nature of AI and the Legal Implications of Its Use in the Criminal System Decision Making of Argentina. In Government Response to Disruptive Innovation: Perspectives and Examinations; Edwards, S., III, Masterson, J., Eds.; IGI Global: Hershey, PA, USA, 2023; pp. 80–92.
  7. Završnik, A. Criminal justice, artificial intelligence systems, and human rights. In ERA Forum; Springer: Berlin/Heidelberg, Germany, 2020; Volume 20, pp. 567–583.
  8. Braga, A.A.; Kennedy, D.M. A Framework for Addressing Violence and Serious Crime: Focused Deterrence, Legitimacy, and Prevention; Cambridge University Press: Cambridge, UK, 2021.
  9. Ranjan, S.; Shah, A.K.; Strange, C.C.; Stillman, K. Hospital-based violence intervention: Strategies for cultivating internal support, community partnerships, and strengthening practitioner engagement. J. Aggress. Confl. Peace Res. 2022, 14, 14–25.
  10. Ranjan, S.; Neudecker, C.H.; Strange, C.C.; Wojcik, M.L.; Shah, A.; Solhkhah, R. Hospital-based violence intervention programs (HVIPs): Making a case for qualitative evaluation designs. Crime Delinq. 2023, 69, 487–509.
  11. Ranjan, S.; Dmello, J.R. Proposing a unified framework for coordinated community response. Violence Against Women 2022, 28, 1873–1889.
  12. Cheng, T.L.; Wright, J.L.; Markakis, D.; Copeland-Linder, N.; Menvielle, E. Randomized trial of a case management program for assault-injured youth: Impact on service utilization and risk for reinjury. Pediatr. Emerg. Care 2008, 24, 130–136.
  13. Cooper, C.; Eslinger, D.M.; Stolley, P.D. Hospital-based violence intervention programs work. J. Trauma Acute Care Surg. 2006, 61, 534–540.
  14. Snider, C.E.; Jiang, D.; Logsetty, S.; Chernomas, W.; Mordoch, E.; Cochrane, C.; Mahmood, J.; Woodward, H.; Klassen, T.P. Feasibility and efficacy of a hospital-based violence intervention program on reducing repeat violent injury in youth: A randomized control trial. Can. J. Emerg. Med. 2020, 22, 313–320.
  15. Zatzick, D.; Russo, J.; Lord, S.P.; Varley, C.; Wang, J.; Berliner, L.; Jurkovich, G.; Whiteside, L.K.; O’Connor, S.; Rivara, F.P. Collaborative care intervention targeting violence risk behaviors, substance use, and posttraumatic stress and depressive symptoms in injured adolescents: A randomized clinical trial. JAMA Pediatr. 2014, 168, 532–539.
  16. Khurana, D.; Koli, A.; Khatter, K.; Singh, S. Natural language processing: State of the art, current trends and challenges. Multimed. Tools Appl. 2023, 82, 3713–3744.
  17. Rigano, C. Using artificial intelligence to address criminal justice needs. Natl. Inst. Justice J. 2019, 280, 1–10.
  18. Hunter, D.; Bagaric, M.; Stobbs, N. A Framework for the Efficient and Ethical Use of Artificial Intelligence in the Criminal Justice System. Fla. State Univ. Law Rev. 2019, 47, 749.
  19. Schnoebelen, T. Goal-oriented design for ethical machine learning and NLP. In Proceedings of the First ACL Workshop on Ethics in Natural Language Processing, Valencia, Spain, 4 April 2017; pp. 88–93.
  20. Patton, D.U.; Frey, W.R.; McGregor, K.A.; Lee, F.T.; McKeown, K.; Moss, E. Contextual analysis of social media: The promise and challenge of eliciting context in social media posts with natural language processing. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, New York, NY, USA, 7–9 February 2020; pp. 337–342.
  21. Cook, B.L.; Progovac, A.M.; Chen, P.; Mullin, B.; Hou, S.; Baca-Garcia, E. Novel Use of Natural Language Processing (NLP) to Predict Suicidal Ideation and Psychiatric Symptoms in a Text-Based Mental Health Intervention in Madrid. In Computational and Mathematical Methods in Medicine; John Wiley & Sons Ltd.: Hoboken, NJ, USA, 2016; pp. 1–8.
  22. Janssen, L.; Pokhilenko, I.; Drost, R.; Paulus, A.; Evers, S. Criminal Justice Costs And Benefits Of Mental Health Interventions. Int. J. Technol. Assess. Health Care 2019, 35, 9–10.
  23. Morrow, D.; Zamora-Resendiz, R.; Beckham, J.C.; Kimbrel, N.A.; Oslin, D.W.; Tamang, S.; Crivelli, S.; Million Veteran Program Suicide Exemplar Work Group. A case for developing domain-specific vocabularies for extracting suicide factors from healthcare notes. J. Psychiatr. Res. 2022, 151, 328–338.
  24. Huang, T.T.; Socrates, V.; Gilson, A.; Safranek, C.; Chi, L.; Wang, E.; Puglisi, L.B.; Brandt, C.; Wang, K. Identifying Incarceration Status in the Electronic Health Record Using Natural Language Processing in Emergency Department Settings. medRxiv 2023.
  25. Patra, B.G.; Sharma, M.M.; Vekaria, V.; Adekkanattu, P.; Patterson, O.V.; Glicksberg, B.; Lepow, L.A.; Ryu, E.; Biernacka, J.M.; Furmanchuk, A.; et al. Extracting social determinants of health from electronic health records using natural language processing: A systematic review. J. Am. Med. Inform. Assoc. JAMIA 2021, 28, 2716–2727.
  26. Wang, E.A.; Long, J.B.; McGinnis, K.A.; Wang, K.H.; Wildeman, C.J.; Kim, C.; Bucklen, K.B.; Fiellin, D.A.; Bates, J.; Brandt, C.; et al. Measuring exposure to incarceration using the electronic health record. Med. Care 2019, 57, S157–S163.
  27. Colling, C.; Khondoker, M.; Patel, R.; Fok, M.; Harland, R.; Broadbent, M.; McCrone, P.; Stewart, R. Predicting high-cost care in a mental health setting. BJPsych Open 2020, 6, e10.
  28. Gichoya, J.W.; McCoy, L.G.; Celi, L.A.; Ghassemi, M. Equity in essence: A call for operationalising fairness in machine learning for healthcare. BMJ Health Care Inform. 2021, 28, e100289.
  29. Maël, L.E. Terminology development for Digital Forensics using Natural Language Processing. Master’s Thesis, University of Dundee, Dundee, UK, 2022.
  30. Parker, R.D.; Mancini, K.; Abram, M.D. Natural Language Processing Enhanced Qualitative Methods: An Opportunity to Improve Health Outcomes. Int. J. Qual. Methods 2023, 22, 16094069231214144.
  31. Coulter, D.; Forkan, A.R.M.; Kang, Y.B.; Trounson, J.; Anthony, T.; Marchetti, E.; Shepherd, S. Pre-sentence reports for Aboriginal and Torres Strait Islander people: An analysis of language and sentiment. Trends Issues Crime Crim. Justice 2022, 659, 1–11.
  32. Mourtgos, S.M.; Adams, I.T. The rhetoric of de-policing: Evaluating open-ended survey responses from police officers with machine learning-based structural topic modeling. J. Crim. Justice 2019, 64, 101627.
  33. Choi, J.A.; Ku, C.S. Identifying the Public’s Changing Concerns During a Global Health Crisis: Text Mining and Comparative Analysis of Tweets During the COVID-19 Pandemic. In International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing; Springer: Cham, Switzerland, 2022; pp. 141–151.
  34. McCosker, A.; Farmer, J.; Soltani Panah, A. Community Responses to Family Violence: Charting Policy Outcomes Using Novel Data Sources, Text Mining and Topic Modelling; Swinburne University of Technology: Melbourne, Australia, 2020.
  35. Goin, D.E.; Rudolph, K.E.; Ahern, J. Predictors of firearm violence in urban communities: A machine-learning approach. Health Place 2018, 51, 61–67.
  36. Sahri, Z.; Shuhidan, S.M.; Sanusi, Z.M. An ontology-based representation of the financial criminology domain using text analytics processing. Int. J. Comput. Sci. Netw. Secur. 2018, 18, 56–62.
  37. Pina-Sánchez, J.; Grech, D.; Brunton-Smith, I.; Sferopoulos, D. Exploring the origin of sentencing disparities in the Crown Court: Using text mining techniques to differentiate between court and judge disparities. Soc. Sci. Res. 2019, 84, 102343.
  38. Adily, A.; Karystianis, G.; Butler, T. Text mining police narratives to identify types of abuse and victim injuries in family and domestic violence events. Trends Issues Crime Crim. Justice 2021, 630, 1–12.
  39. Wong, A.; Young, A.T.; Liang, A.S.; Gonzales, R.; Douglas, V.C.; Hadley, D. Development and validation of an electronic health record–based machine learning model to estimate delirium risk in newly hospitalized patients without known cognitive impairment. JAMA Netw. Open 2018, 1, e181018.
  40. Feng, L.; Chiam, Y.K.; Lo, S.K. Text-mining techniques and tools for systematic literature reviews: A systematic literature review. In Proceedings of the 24th Asia-Pacific Software Engineering Conference, Nanjing, China, 4–8 December 2017; pp. 41–50.
  41. van Dinter, R.; Tekinerdogan, B.; Catal, C. Automation of systematic literature reviews: A systematic literature review. Inf. Softw. Technol. 2021, 136, 106589.
  42. Atkinson, C.F. Cheap, Quick, and Rigorous: Artificial Intelligence and the Systematic Literature Review. Soc. Sci. Comput. Rev. 2023, 42, 376–393.
  43. de la Torre-López, J.; Ramírez, A.; Romero, J.R. Artificial intelligence to automate the systematic review of scientific literature. Computing 2023, 105, 2171–2194.
  44. Healthcare Approaches to Justice Collaborative, Montclair State University. HAJC. Available online: https://www.montclair.edu/chss/about-the-college/chss-initiatives/healthcare-approaches-to-justice-collaborative/ (accessed on 20 July 2025).
  45. Haddaway, N.R.; Collins, A.M.; Coughlin, D.; Kirk, S. The Role of Google Scholar in Evidence Reviews and Its Applicability to Grey Literature Searching; NIH: National Library of Medicine: Bethesda, MD, USA, 2015.
  46. Bird, S.; Klein, E.; Loper, E. Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit; O’Reilly Media: Sebastopol, CA, USA, 2009.
  47. Demsar, J.; Curk, T.; Erjavec, A.; Gorup, C.; Hocevar, T.; Milutinovic, M.; Mozina, M.; Polajnar, M.; Toplak, M.; Staric, A.; et al. Orange: Data Mining Toolbox in Python. J. Mach. Learn. Res. 2013, 14, 2349–2353.
  48. Ertek, G.; Tapucu, D.; Arin, I. Text Mining with RapidMiner Chapter. In RapidMiner: Data Mining Use Cases and Business Analytics Applications, 1st ed.; Chapman and Hall/CRC: Boca Raton, FL, USA, 2014.
  49. Frank, E.; Hall, M.A.; Witten, I.H. The WEKA Workbench. In Data Mining: Practical Machine Learning Tools and Techniques, 4th ed.; Morgan Kaufmann: San Francisco, CA, USA, 2016.
  50. Matignon, R. Data Mining Using SAS® Enterprise Miner; John Wiley & Sons: Hoboken, NJ, USA, 2007.
  51. SAS Institute Inc. SAS® Enterprise Miner 15.2: Reference Help; SAS Institute Inc.: Cary, NC, USA, 2018.
  52. SAS Institute Inc. SAS® Text Miner 15.2: Reference Help; SAS Institute Inc.: Cary, NC, USA, 2018.
  53. Chakraborty, G.; Pagolu, M.; Garla, S. Text Mining and Analysis: Practical Methods, Examples, and Case Studies Using SAS®; SAS Institute Inc.: Cary, NC, USA, 2013.
  54. Rodríguez, M.; Valentine, J.M.; Son, J.B.; Muhammad, M. Intimate Partner Violence and Barriers to Mental Health Care for Ethnically Diverse Populations of Women. Trauma Violence Abus. 2009, 10, 358–374.
  55. Wilson, J.L.; Uthman, C.; Nichols-Hadeed, C.; Kruchten, R.; Thompson Stone, J.; Cerulli, C. Mental health therapists’ perceived barriers to addressing intimate partner violence and suicide: Families, Systems, & Health. Fam. Syst. Health 2021, 39, 188–197.
  56. Jackson, S.L. The shifting conceptualization of elder abuse in the United States: From social services, to criminal justice, and beyond. Int. Psychogeriatr. 2016, 28, 1–8.
  57. Hackensack Meridian Health. Project HEAL. Available online: https://www.hackensackmeridianhealth.org/en/project-heal (accessed on 21 July 2025).
  58. Okadome, T. Essentials of Generative AI; 2025th Edition; Springer: Berlin/Heidelberg, Germany, 2025.
  59. OpenAI. ChatGPT. Available online: https://chat.openai.com (accessed on 20 July 2025).
Figure 1. PRISMA diagram of manual HVIP/criminal justice search and screening process.
Figure 2. SAS Text Miner workflow.
Table 1. Document topics.
| Topic | Term Cutoff | Doc Cutoff | # of Terms | # of Docs |
|---|---|---|---|---|
| (A) ed, +youth, eds, +process, staff | 0.058 | 0.04 | 72 | 39 |
| (B) +cost, +cost, +estimate, +vip, different | 0.058 | 0.043 | 54 | 40 |
| (C) ipv, +woman, +provider, +partner, partner violence | 0.057 | 0.05 | 44 | 28 |
| (D) +manager, +client, +insight, +case, +relationship | 0.058 | 0.047 | 67 | 40 |
| (E) +justice system, +survivor, support, justice, +state | 0.058 | 0.035 | 79 | 35 |
| (F) +control, +group, treatment, reinjury, control group | 0.058 | 0.055 | 62 | 49 |
| (G) +vip, recidivism, injury recidivism, +associate, success | 0.058 | 0.048 | 68 | 36 |
| (H) elder, +old, abuse, +adult, inclusion | 0.057 | 0.049 | 52 | 34 |
| (I) ptsd, prevalence, pediatric, psychological, +score | 0.058 | 0.047 | 71 | 53 |
| (J) hvips, hvip, +barrier, existing, +literature | 0.058 | 0.044 | 63 | 44 |
| (K) +firearm, +firearm injury, +assault, +hospital, patient | 0.058 | 0.041 | 63 | 44 |
| (L) +youth, +attitude, +gun, +state, awareness | 0.058 | 0.043 | 61 | 43 |
| (M) +survivor, domestic, +service, +referral, abuse | 0.058 | 0.051 | 70 | 55 |
| (N) gang, +client, success, +goal, interpersonal violence | 0.058 | 0.036 | 77 | 35 |
| (O) +client, penetrating, +woman, +wound, stab | 0.058 | 0.046 | 74 | 53 |
| (P) +man, black, +young, black, +black man | 0.058 | 0.051 | 64 | 42 |
Table 2. Topics with minimum overlapping terms.
| Topic | Term Cutoff | Doc Cutoff | # of Terms | # of Docs |
|---|---|---|---|---|
| (t1) +control, +group, reinjury, repeat, treatment | 0.058 | 0.056 | 66 | 46 |
| (t2) ipv, +woman, +provider, +partner, partner violence | 0.058 | 0.048 | 47 | 29 |
| (t3) +young, black, +man, +male, +violent injury | 0.058 | 0.049 | 56 | 38 |
| (t4) +manager, +client, +case, +insight, +relationship | 0.058 | 0.044 | 60 | 40 |
| (t5) +vip, recidivism, +cost, injury recidivism, +cost | 0.058 | 0.045 | 62 | 38 |
| (t6) elder, abuse, +old, +adult, inclusion | 0.057 | 0.049 | 49 | 38 |
| (t7) +youth, +gun, +attitude, prevention, awareness | 0.058 | 0.042 | 61 | 41 |
| (t8) ed, +youth, eds, +process, staff | 0.058 | 0.043 | 80 | 46 |
| (t9) +firearm, +client, +firearm injury, +assault, +victim | 0.058 | 0.048 | 81 | 48 |
| (t10) hvips, ptsd, +practice, +symptom, mental health | 0.058 | 0.045 | 76 | 48 |
| (t11) +survivor, domestic, +referral, +service, domestic violence | 0.058 | 0.05 | 66 | 55 |
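Table 2 retains topics with minimal overlapping terms. The source does not state how overlap was measured, so the following is only a sketch of one common way to quantify it: the Jaccard similarity between the descriptive term sets of two topics. The term lists are the five descriptive terms shown for a few Table 1 topics; the `jaccard` helper and the "any shared term" flagging rule are assumptions for illustration.

```python
from itertools import combinations


def jaccard(terms_a, terms_b):
    """Jaccard similarity: shared terms divided by all distinct terms in both topics."""
    a, b = set(terms_a), set(terms_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Descriptive terms for five Table 1 topics (top five terms only).
topics = {
    "A": ["ed", "+youth", "eds", "+process", "staff"],
    "L": ["+youth", "+attitude", "+gun", "+state", "awareness"],
    "E": ["+justice system", "+survivor", "support", "justice", "+state"],
    "M": ["+survivor", "domestic", "+service", "+referral", "abuse"],
    "H": ["elder", "+old", "abuse", "+adult", "inclusion"],
}

# Score every pair of topics and flag pairs that share at least one term.
overlaps = {
    (x, y): jaccard(topics[x], topics[y]) for x, y in combinations(topics, 2)
}
flagged = [pair for pair, score in overlaps.items() if score > 0]
print(flagged)
```

Pairs with high similarity are candidates for merging or removal when reducing a larger topic set to a smaller one. A full analysis would compare complete term lists rather than only the five descriptive terms.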
Table 3. Mappings between topics and themes.
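| Topic | 4 Themes | 2 Themes |
|---|---|---|
| (t1) +control, +group, reinjury, repeat, treatment | Trauma and Reinjury | Public Health |
| (t10) hvips, ptsd, +practice, +symptom, mental health | Trauma and Reinjury | Public Health |
| (t5) +vip, recidivism, +cost, injury recidivism, +cost | Trauma and Reinjury | Public Health |
| (t4) +manager, +client, +case, +insight, +relationship | Practitioner and Program Development | Public Health |
| (t8) ed, +youth, eds, +process, staff | Practitioner and Program Development | Public Health |
| (t2) ipv, +woman, +provider, +partner, partner violence | Domestic and Gender-based Violence | Criminal Justice |
| (t6) elder, abuse, +old, +adult, inclusion | Domestic and Gender-based Violence | Criminal Justice |
| (t11) +survivor, domestic, +referral, +service, domestic violence | Domestic and Gender-based Violence | Criminal Justice |
| (t3) +young, black, +man, +male, +violent injury | Violence and Victimization | Criminal Justice |
| (t7) +youth, +gun, +attitude, prevention, awareness | Violence and Victimization | Criminal Justice |
| (t9) +firearm, +client, +firearm injury, +assault, +victim | Violence and Victimization | Criminal Justice |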
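The two-level mapping in Table 3 can also be written as a small lookup structure, which makes the roll-up from topics to the four themes and then to the two broader themes explicit. This is a sketch based on the table as read here; the variable and function names are hypothetical, and the mapping itself was produced by manual judgement rather than by code.

```python
from collections import defaultdict

# Topic-to-theme assignments as listed in Table 3.
TOPIC_TO_THEME = {
    "t1": "Trauma and Reinjury",
    "t10": "Trauma and Reinjury",
    "t5": "Trauma and Reinjury",
    "t4": "Practitioner and Program Development",
    "t8": "Practitioner and Program Development",
    "t2": "Domestic and Gender-based Violence",
    "t6": "Domestic and Gender-based Violence",
    "t11": "Domestic and Gender-based Violence",
    "t3": "Violence and Victimization",
    "t7": "Violence and Victimization",
    "t9": "Violence and Victimization",
}

# Roll-up of the four themes into the two broader themes.
THEME_TO_DOMAIN = {
    "Trauma and Reinjury": "Public Health",
    "Practitioner and Program Development": "Public Health",
    "Domestic and Gender-based Violence": "Criminal Justice",
    "Violence and Victimization": "Criminal Justice",
}


def topics_by_domain():
    """Group topic identifiers under the broader theme that their theme belongs to."""
    grouped = defaultdict(list)
    for topic, theme in TOPIC_TO_THEME.items():
        grouped[THEME_TO_DOMAIN[theme]].append(topic)
    return dict(grouped)


print(topics_by_domain())
```

Keeping the mapping in one place would make it straightforward to re-run a roll-up if reviewers later reassign a topic to a different theme.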
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
