1. Introduction
The DASH (Domestic Abuse, Stalking and Harassment and Honour-Based Violence risk assessment [1]) has been used by most UK police forces to manage risk in domestic abuse cases since 2009. However, recent events have suggested that the DASH has “obvious problems” [2]. In this paper, we review the effectiveness of the DASH in predicting which perpetrators of domestic abuse go on to commit a further act of “deadly” domestic violence or are “persistent” perpetrators of domestic abuse.
The DASH risk assessment is a multi-agency tool used by police, health professionals, housing officers, social workers, and domestic abuse specialists. The DASH checklist includes 27 questions covering issues such as coercive control, physical and sexual violence, stalking and harassment, impact on children, and “honour-based” violence. The assessment is completed through a structured face-to-face or telephone interview with the victim of the offence. Consent is normally required, but in perceived high-risk or safeguarding situations the DASH may be completed without consent. Responses are recorded as “Yes”, “No”, or “Refused”. Practitioners may add contextual notes to clarify or expand on answers. A risk level of Standard, Medium, or High is then assigned to the victim for the incident. This risk level is determined by the victim’s responses to the DASH, officers’ knowledge of previous incidents of domestic abuse, safeguarding plans already in place, and the officer’s professional judgement. Typically, a high-risk category is given if there are 14 or more “Yes” answers, or if there is evidence of escalation or serious threats, or if the practitioner’s professional judgement indicates concern even though fewer risk factors are disclosed. Hence, the DASH is designed to be objective, but the addition of professional judgement is crucial, especially if it is believed that the victim has minimized or omitted details.
The use of the DASH has been reviewed by a number of authors [3,4,5,6,7,8,9,10,11]. Although it was originally conceived as a structured professional judgement scheme, and some authors have argued that it is not a predictive tool, in practice practitioners use it as one [11]. Indeed, Ariza et al. [4] argued that placing people into risk categories (Standard, Medium, or High) is itself a predictive task. Further, the evidence that has called into question the effectiveness of the DASH (see below) is based upon its poor ability to predict future instances of domestic abuse.
Previous Research on DASH
Chalkley and Strang [
11] identified 107 cases of high violence (domestic murders and near murders) and compared these cases to 214 control cases where “less-deadly” violence was perpetrated. Sixty-seven of the high-violence cases had a previous DASH assessment. Of these, 45 (67%) were not classified as high-risk at the previous assessment stage, with only 33% being assessed as high-risk. The proportion of high-risk in the matched control sample is not provided.
Chalkley and Strang [
11] conclude that the DASH failed to predict the majority of deadly domestic violence cases (see also [
8,
12]). This is clearly true, but could be rectified by lowering the threshold for classification as “high-risk”, thus capturing far more offenders who go on to commit high violence. Of course, such a change in threshold would come at the cost of producing more false positive results (labelling someone high-risk who does not go on to commit such an act of high violence), which would create a far greater workload and would spread limited resources even more thinly. In turn, this may lead to poorer management of the high-risk offenders and more, rather than fewer, instances of further violence. Hence, merely changing the thresholds of categorization does not “improve” the risk assessment.
To judge whether the DASH (or any other risk assessment scheme) is effective in identifying high-risk offenders, information is needed about its performance both in the group that went on to commit a violent act and in the group that did not commit further offences. Signal detection theory can then be used to provide a bias-free measure of prediction (one that is not contingent on the particular threshold used to make a “high” vs. “low” decision). This is achieved by constructing the Receiver Operating Characteristic (ROC) curve, which plots the true-positive rate (correctly predicting an act of violence) against the false-positive rate (incorrectly predicting an act of violence) across all possible thresholds, and calculating the area under the curve (AUC). Such analyses are routinely used in many areas of medical science, including the prediction of violence, as well as in many other fields (see [13]). AUCs range from 0 to 1.0, with values near 0.5 indicating an instrument with no predictive value and values above 0.70 regarded as a large effect size [14].
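As an illustration of the distinction between threshold-dependent and threshold-free measures, the following minimal sketch computes true- and false-positive rates at two thresholds and the AUC for a small set of ratings. The data are hypothetical and the function names (`auc`, `rates_at_threshold`) are our own; the AUC is computed here by its pairwise (Mann-Whitney) interpretation, which is only one of several equivalent ways to obtain it.

```python
from itertools import product


def auc(pos_scores, neg_scores):
    """Probability that a randomly chosen positive case scores higher than
    a randomly chosen negative case (ties count as half)."""
    wins = 0.0
    for p, n in product(pos_scores, neg_scores):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def rates_at_threshold(pos_scores, neg_scores, threshold):
    """True- and false-positive rates when scores >= threshold are called 'high'."""
    tpr = sum(s >= threshold for s in pos_scores) / len(pos_scores)
    fpr = sum(s >= threshold for s in neg_scores) / len(neg_scores)
    return tpr, fpr


# Hypothetical ratings (0 = Standard, 1 = Medium, 2 = High); not study data.
reoffended = [2, 2, 1, 1, 1, 0, 0, 1]
did_not_reoffend = [0, 0, 1, 0, 2, 0, 1, 0, 0, 0]

# Lowering the threshold from High (2) to Medium (1) raises both rates together.
for threshold in (2, 1):
    print(threshold, rates_at_threshold(reoffended, did_not_reoffend, threshold))

# The AUC summarises discrimination across all thresholds at once.
print(auc(reoffended, did_not_reoffend))
```

Moving the threshold changes where on the ROC curve the instrument operates but leaves the AUC unchanged, which is why the AUC is preferred for comparing instruments.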
Turner et al. [
10] looked at whether officers in a large metropolitan police force in the UK were able to distinguish cases of “serious harm” using risk ratings based on the DASH. Only 5.7% of these future serious harm perpetrators were classified as high-risk. This study did use ROC analysis and found an AUC of 0.54 in identifying these future serious harm events. This suggests little value in the predictions based on the DASH.
Turner et al. [
9] performed similar analyses of the DASH and compared these to alternate methods using information already available in police databases. They also found that the DASH assessment had little predictive accuracy (AUC ≈ 0.55), while an algorithm based upon police records was far more successful (AUC ≈ 0.75). Adding the DASH information to the predictions from the police records alone did not produce any additional increase in predictive efficacy (see also Grogger et al. [
15]).
The extant literature on DASH therefore suggests little value in the DASH assessment procedure when viewed purely from the perspective of a “risk prediction tool” (see Section 4) and it is these findings that have called into question its continued use.
However, any instrument might fail if it is not being used in the correct manner. Robinson et al. [
6] focused on the implementation of the DASH, in particular its use in the classification, identification, and assessment of risk of domestic abuse. They carried out a UK-wide exercise using in-depth fieldwork with three police forces. The DASH was used inconsistently, if at all, and there were also inconsistencies in how responses were recorded. For example, they uncovered instances of the DASH being used inappropriately, such as an officer cutting and pasting from a previous assessment or only completing it if they felt a prosecution was likely. Sebire and Barling [
16] used intraclass correlation tests to establish the DASH’s inter-rater reliability. They used information from known police history, initial police reports, and responses of victims to complete the DASH. They concluded that the location and volume of cases that officers are carrying could be affecting officers’ decision making, and that officers in a real-life setting would use other extraneous factors aside from the DASH. This suggests that responses to DASH items were not the deciding factor when it comes to assessing risk.
The present study was commenced to establish if the existing use of the DASH was warranted in a particular police force in the UK. In response to a review by Her Majesty’s Inspectorate, Dyfed-Powys Police implemented changes in their handling of domestic violence cases. These included the setting up of a Vulnerability Desk to aid frontline officers in their response to domestic abuse incidents and a commitment to “100%” DASH completion for all reported domestic abuse incidents. Furthermore, training for all frontline staff was introduced in “Domestic Abuse Matters”, to educate officers and staff on the complexities of domestic abuse from the perspective of the victim. Officers were given a separate input from internal force trainers on how to complete a DASH assessment. A Secondary Risk Assessment Unit was introduced to provide consistency of approach to risk identification, assessment, and management. When a call is received in the force control room, an officer is dispatched to the scene (where appropriate). The Vulnerability Desk, made up of officers and police staff, is informed of the ongoing domestic incident. Its staff can interrogate police systems and inform the officer in live time of any previous criminal history and other relevant information concerning those involved in the incident. This provides officers with a more comprehensive understanding of previous incidents to help inform the overall risk assessment. The information provided by the Vulnerability Desk is added to the System for Tasking and Operational Resource Management (known as STORM) log (the log of the initial call). This information provides the officer with background knowledge which they can explore during the completion of the DASH with the victim. The officer at the scene completes a DASH with the victim and enters the responses into their Mobile Data Terminal.
The attending officer then assigns a risk rating of “Standard”, “Medium”, or “High” risk, according to the responses the victim gives to the DASH items, the information provided by colleagues on the Vulnerability Desk, and their own professional judgement of the situation, having attended the scene. All this information is logged on the crime management system, along with the officer’s rationale for their risk stratification. The officer’s supervisor reviews the risk rating before the end of their shift and will either agree with the officer’s risk grading or change it, documenting their reasons for the change. Given these attempts at improvement in the use of the DASH, Dyfed-Powys Police aimed to evaluate if the DASH was able to predict future cases of serious domestic violence. This was the overall objective of the current research.
3. Results
Sixty-two offenders in the Deadly group had an index offence DASH, of which 16 (25.8%) were rated as high-risk. For the Persistent offenders, 427 had an index offence DASH, of which 69 (16.2%) were rated as high-risk. For the control group, 432 offenders had an index offence DASH, of which 23 (5.3%) were rated as high-risk. A Chi-square test showed this pattern of results to be highly significant (χ2 (4, 921) = 65.6, p < 0.001). This was also the case when the Deadly group was compared to the control group (χ2 (2, 494) = 38.4, p < 0.001) or when the Persistent group was compared to the control group (χ2 (2, 859) = 52.6, p < 0.001). Thus, DASH ratings were far higher for the Deadly and Persistent groups compared to the control group, with a rate of high-risk ratings approximately five times greater for the Deadly group compared to the control group.
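The proportions quoted above follow directly from the reported counts, as the short check below shows. Only the high-risk counts and group sizes are taken from the text; the breakdown into Standard and Medium ratings is not reported here, so the sketch does not attempt to reproduce the Chi-square statistics.

```python
# (number rated high-risk, number with an index-offence DASH), as reported above
groups = {
    "Deadly": (16, 62),
    "Persistent": (69, 427),
    "Control": (23, 432),
}

# Proportion of each group rated high-risk
rates = {name: high / total for name, (high, total) in groups.items()}

for name, rate in rates.items():
    print(f"{name}: {100 * rate:.1f}% rated high-risk")

# Relative rate of high-risk ratings, Deadly vs. control
deadly_vs_control = rates["Deadly"] / rates["Control"]
print(f"Deadly vs. control: {deadly_vs_control:.1f} times")
```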
The risk ratings were used to predict group membership using signal detection theory. When comparing the Deadly and control groups, the AUC was 0.67 (95% CI [0.59, 0.75], p < 0.001), illustrating that the DASH risk stratification procedure performed moderately well at distinguishing the Deadly group from the control group. The analysis was repeated for the male offenders only (N = 44), as there were too few Deadly female offenders (N = 18) to analyse separately. This analysis produced a similar result (AUC = 0.69; 95% CI [0.59, 0.79], p < 0.001). An analysis of the Persistent offenders in comparison to the control group produced an AUC of 0.62 (95% CI [0.58, 0.66], p < 0.001), illustrating a modest ability to distinguish between Persistent offenders and controls. The AUC for the male offenders (N = 376 persistent, N = 302 control) was 0.60 (95% CI [0.56, 0.65], p < 0.001), and for the female offenders (N = 51 persistent, N = 130 control) it was 0.64 (95% CI [0.54, 0.73], p = 0.003). A similar analysis was performed using the DASH tool in a purely actuarial manner, by summing the number of endorsed items rather than using the risk ratings. Nearly identical powers of prediction were achieved.
3.1. Performance on Individual Items of the DASH
The ability of each of the DASH items to distinguish between the Deadly and the control offenders was assessed by examining the rate of endorsement for that item (the Persistent offenders were omitted from this analysis, but these data are available in the Supplementary Materials). Table 2 illustrates these results.
A total of 15 of the 27 items (55.6%) did not produce a significant difference (defined here as p < 0.05) between the two groups. This finding is consistent with previous research suggesting that only a few of the DASH items are predictive of future serious harm (see Section 4).
3.2. DASH Short Form
The finding that over half of the DASH items are not predictive of future serious domestic violence suggests some of these items could be eliminated and the performance of the instrument might improve. Adding in non-predictive items to the risk analysis would only serve to add noise and error into the risk evaluation and make the decisions of the police officers regarding the level of risk of the offender more difficult to make. To examine this issue, we performed a multiple logistic regression in which all 27 DASH items were entered to predict deadly violence (vs. control). In the resulting model, only five of the items were significantly predictive and had large odds ratios. These were the following:
Q27. “Do you know if the alleged offender has ever been in trouble with the police or has a criminal history?”
Q24. “Has the alleged offender had problems in the past year with drugs (prescription or other), alcohol or mental health leading to problems in a normal life?”
Q26. “Has the alleged offender ever breached bail/an injunction and/or any agreement for when they can see the injured person (IP) and/or the children?”
Q16. “Has the alleged offender ever used weapons or objects to hurt the IP?”
Q6. “Has the IP separated or tried to separate from the A/O within the past year?”
The scores from these five items were then added together to produce a DASH-short form (DASH-SF) which has a range of scores from 0 to 5. Analysis of this shortened version to predict deadly violence showed an AUC of 0.80 (95% CI [0.73, 0.87], p < 0.001). This is significantly greater than the AUC obtained from the full DASH items (p < 0.001). The DASH-SF was also a predictor of persistent offending (AUC = 0.71, 95% CI [0.64, 0.75], p < 0.001), which was also greater than the full DASH items (p < 0.001).
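The scoring of the DASH-SF described above can be sketched as follows. The item labels follow the question numbers listed above; the dictionary format for responses and the function name `dash_sf_score` are illustrative assumptions, as is the choice to score “No”, “Refused”, and missing answers as 0.

```python
# The five items retained in the short form, by DASH question number
DASH_SF_ITEMS = ("Q27", "Q24", "Q26", "Q16", "Q6")


def dash_sf_score(responses):
    """Count 'Yes' answers on the five short-form items (range 0 to 5).

    `responses` maps question labels to "Yes", "No", or "Refused".
    Assumption: anything other than "Yes" (including a missing item) scores 0.
    """
    return sum(1 for item in DASH_SF_ITEMS if responses.get(item) == "Yes")


# Hypothetical set of responses, not study data
example = {"Q27": "Yes", "Q24": "No", "Q26": "Yes", "Q16": "Refused", "Q6": "Yes", "Q15": "Yes"}
print(dash_sf_score(example))
```

Items outside the short form (such as Q15 in the example) are ignored by construction.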
3.3. Prediction from Criminological Variables
Much research (see, for example, Kroner et al. [
19]) has shown that criminological variables are strong predictors of future violence. Recent research has shown that such variables are also effective in domestic abuse cases. For instance, Turner et al. [
9] demonstrated the value of such variables (which produced an AUC of 0.78 in their study) and found that the addition of DASH information did not increase the magnitude of prediction produced by these criminological variables alone. We conducted a similar analysis in the present sample, where four criminological variables (number of previous offences, number of previous violent offences, age at first offence, and age at first violent offence) were taken from the Police National Computer records for each offender. We used a regression analysis to predict Deadly violence (vs. control) from these variables and found strong effects for the number of offences and the number of violent offences, but not for the age-related variables. We then produced a simple coding scheme where the number of offences and the number of violent offences were each coded (e.g., for violent offences 0 = 0 offences, 1 = 1 offence, 2 = 2–3 offences, 3 = 4–9 offences, and 4 = 10 or more offences) to produce a score from 0 to 8. We labelled this the CV-score (criminological variable). A signal detection analysis showed that this CV-score was a strong predictor of deadly violence with an AUC = 0.82 (95% CI [0.76, 0.89], p < 0.001). The CV-score was also predictive of persistent offending (AUC = 0.70, 95% CI [0.66, 0.74], p < 0.001).
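The CV-score coding can be sketched as below. The bands for violent offences are those given in the text; applying the same bands to the total number of offences is an assumption made for illustration only, since the text gives the violent-offence coding as an example. The function names `band` and `cv_score` are likewise illustrative.

```python
def band(count):
    """Code an offence count as 0-4 using the bands given for violent offences:
    0 = none, 1 = one, 2 = 2-3, 3 = 4-9, 4 = 10 or more."""
    if count < 0:
        raise ValueError("offence count cannot be negative")
    if count == 0:
        return 0
    if count == 1:
        return 1
    if count <= 3:
        return 2
    if count <= 9:
        return 3
    return 4


def cv_score(n_offences, n_violent_offences):
    """Sum of the two banded counts, giving a score from 0 to 8.

    Assumption: total offences are banded in the same way as violent offences.
    """
    return band(n_offences) + band(n_violent_offences)


# Hypothetical offender with 5 previous offences, 2 of them violent
print(cv_score(5, 2))
```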
These analyses show that both the DASH-SF and the CV-score are predictive of deadly violence. We therefore tested whether, together, they may be more predictive than either alone. We performed regression analyses to predict membership of the Deadly violence group in which one of the variables was entered at step 1 and the other at step 2, to see if its addition improved the model’s fit. Adding the CV-score at step 2 improved the model’s fit when the DASH-SF was entered at step 1 (p < 0.01). Likewise, adding the DASH-SF at step 2 improved the model’s fit when the CV-score was entered at step 1. Both the CV-score (Exp(B) = 1.54, p < 0.001) and the DASH-SF (Exp(B) = 1.73, p < 0.01) were significant predictors in the final model. Hence, the CV-score and the DASH-SF make non-redundant predictions of future deadly violence.
Finally, we produced a predictor variable that combined the DASH-SF with the CV-score by the simple addition of the two, which we termed DASH-CV. This was strongly predictive of membership of the Deadly violence group (AUC = 0.84, 95% CI [0.78, 0.91], p < 0.001), although the AUC was not significantly greater than that of the DASH-SF or CV-score alone. Despite being based on items that predicted deadly violence, the DASH-CV was also strongly predictive of membership of the Persistent offending group (AUC = 0.74, 95% CI [0.70, 0.78], p < 0.001).
4. Discussion
The study found that the DASH had predictive validity for both deadly violence and for persistent domestic abuse. The AUC of 0.67 for deadly violence is regarded as a “moderate” effect size [
14]. The present results therefore differ from previous examinations of the DASH that failed to find that it had any predictive validity [
8,
9,
10,
11,
12].
Whilst the predictive validity of the DASH is modest, it does not appear to be out of line with other instruments designed to predict domestic violence. The meta-analytic review of van der Put et al. [
20] looked at a range of instruments designed to predict violence in general or domestic violence. Overall, they found an AUC of 0.65 (though, when correcting for possible missing studies, this reduced to 0.60), with those specifically designed for domestic violence also producing an AUC of 0.65. Limiting studies to those that looked at severe/near fatal violence (as in the present study) produced an AUC of 0.66. Looking at some of the most used and studied domestic violence risk instruments across the world, they showed that the DVSI [
21] had an AUC of 0.61, the B-SAFER [
22] an AUC of 0.60, the DA [
23] an AUC of 0.66, the ODARA [
24] an AUC of 0.69, and the SARA [
25] an AUC of 0.64. Hence, the DASH appears to perform as well as these other domestic violence instruments and there is no obvious evidence that one instrument is a better predictor than any other.
Why might there be a discrepancy between our study, which shows that the DASH predicts deadly violence at a level commensurate with other domestic violence risk prediction schemes, and the previous studies that found little validity for the DASH? Any scheme designed to predict domestic abuse (or other forms of violence) is dependent on the correct usage of the instrument. As discussed earlier, Robinson et al. [
7,
8] have shown that the DASH is often not used in the correct manner, with instances of cutting and pasting from previous reports, only completing the DASH if it was felt worthwhile, etc. The current research was partly driven by recognition of the increasing importance of accurately assessing the risk of domestic abuse, and by changes made in the Dyfed-Powys police force to improve the quality of the police response to domestic incidents. These included greater training in the use of the DASH. These improvements may have produced higher-quality DASH assessments, allowing the DASH to be predictive of future incidents of domestic violence. We stress that this possible reason for the DASH’s efficacy in this study compared to the previous studies is speculative, as a direct comparison to other forces was not conducted.
The current study also found that the DASH performed moderately well in distinguishing persistent offenders from the control group. The AUC of 0.62 is also a moderate effect size [
14], although it is lower than that for deadly violence (this difference was not statistically significant). We also show that modifications to the DASH improved this predictive validity (DASH-SF: AUC = 0.70, DASH-CV: AUC = 0.74) even though these modifications to the DASH scheme used information about item prediction of deadly violence. It seems likely that further development of the DASH based on items that are related to the prediction of persistent offending may well produce even stronger predictive efficacy. No previous studies have looked at the efficacy of the DASH in predicting persistent offenders. These persistent offenders may commit lower-level abuse/offences than the Deadly group but do so with a frequency that could inflict serious psychological harm on their victims. As such, identification of such individuals and an understanding of the reasons for such action is important for risk management and intervention, and in safeguarding victims.
Despite the “success” of the DASH in predicting deadly violence, it was notable that many of the individual DASH items were not predictive. Again, this should not be regarded as specific to the DASH scheme, as it has been shown that many items in other violence risk assessment tools are also not predictive [
26]. Previous studies have also shown that many of the DASH items are not useful in terms of predicting future violence [
3,
10]. It is therefore of interest to examine which items are predictive.
The present study found that Q27 (criminal history) was the largest predictor of a deadly offender, which is consistent with the findings of Almond et al. [
3] and Turner et al. [
10]. However, Thornton [
8,
12] and Chalkley and Strang [
11] found that males in the control sample had significantly more arrests and convictions than those in the deadly domestic offender sample. This discrepancy could be due to differences in the control groups used in the studies. Q26 (breach of bail or injunction) was also a strong predictor of deadly violence.
Q24 (Alcohol, drugs, or mental health) had the next greatest effect on predicting deadly offenders. Chalkley and Strang [
11] found more drug abuse and mental health difficulties in female deadly offenders compared to the control group. They also found self-harm to be present at a higher rate for deadly male offenders compared to the control group.
Two questions probing the type of violence previously used were also predictive of future deadly violence. Q16 (Use of weapons) had a large effect size in identifying deadly offenders. This finding concurs with that of Thornton [
8,
12], that female offenders who have used weapons were almost five times more likely to be deadly domestic abuse offenders. Q18 (strangle/choke/suffocate/drown) had an odds ratio of 3.01 in the current study. Almond et al. [
3] calculated it to have an odds ratio of 2.00 in identifying violent recidivists, compared to non-recidivists.
Q6 (Separation from the offender) had a large effect on predicting deadly offenders in this current study, with an odds ratio of 2.46. This is also consistent with the findings of Almond et al. [
3], who found an odds ratio of 2.23 for separation from the victim.
Q15 (Controlling or excessively jealous) was found to have a medium effect on predicting deadly offenders in the current study. However, none of the other studies that looked at the DASH items found this item to have a significant effect. This may be due to the definition of harm in Turner et al. [
10], which eliminated any non-physical abuse, such as coercive and controlling behaviours. Thus, it is not surprising that, if we eliminate non-physical abuse and controlling behaviours in our definition of harm, an item evaluating controlling behaviour in the perpetrator may not be an effective predictor. Almond et al. [
3] also did not find this item to have an effect in predicting outcomes in any of their groups of offenders. The fact that the current research has highlighted controlling or excessively jealous behaviour as having a medium effect size may be due to the impact of the training of front-line officers and staff in the Dyfed-Powys Police on “Domestic Abuse Matters”. This training programme educated staff on the various guises of domestic abuse, including the insidious nature of coercive control and how to recognize this.
Many of the DASH items were not predictive of deadly violence. Including these items would serve to add in noise and error to the evaluation and make the task of the police officers more difficult. We therefore wanted to evaluate if eliminating these non-predictive items would lead to a more accurate and effective instrument for the assessment of domestic abuse. To evaluate this, we took five of the most predictive items of deadly violence to produce a “DASH-shortform” assessment (DASH-SF). As predicted, this instrument was able to predict with a higher level of accuracy (AUC = 0.80) than the full DASH alone. We also showed that merely using two pieces of information from the PNC records (the CV-score) can also produce an effective deadly-violence predictor (AUC = 0.82). Finally, with the two combined (DASH-CV), an instrument with a large effect size (AUC = 0.84) was achieved.
Although this preliminary work is encouraging, these initial results should be treated with caution. Developing and testing a risk prediction instrument on the construction sample alone can take advantage of random variations that are in its favour, and which might not be replicated in another sample of domestic abuse offenders. It is important to test the DASH-CV in a new independent sample to see if the improvement in prediction accuracy over the standard DASH indeed reflects an instrument with better accuracy or was merely due to the inflation caused by selecting items because of their efficacy in this particular sample.
This paper focuses solely on the predictive ability of the DASH. However, the point of risk assessment is not merely to predict future events but to devise a safety plan to prevent harm, safeguard potential victims, and mitigate against any future adverse events. Indeed, the DASH was originally devised as an example of a structured professional judgement where the items of the DASH are a starting point for the risk assessment. The use of a few key risk indicators, such as the DASH-SF, is unlikely to help the assessor understand the needs of the victim/perpetrator and the reasons (both distal and proximal) that mediate domestic abuse. It is appreciated that most police forces, and many other agencies and sectors, have only limited resources and time to be able to perform in-depth psychological formulations to understand the drivers of domestic abuse (see, for example, Snowden et al. [
27]), and that the use of checklists such as the DASH may be needed to assess the need for intervention and to determine when urgent safeguarding action is required. We would note, however, that the DASH was originally put forward as an example of a “structured professional judgement” scheme (which uses psychological formulation as its basis), in which the data gathered from such an assessment are then subjected to professional scrutiny to determine risks and needs. However, in practice, this does not appear to be how the DASH is being used.
While we have emphasized that the DASH, at least in this study, has a predictive accuracy commensurate with other risk assessment schemes, it is still the case that the majority of the perpetrators of deadly violence were not regarded as “high risk” in their previous DASH assessment. This “low risk paradox” often leads to the idea that the risk assessment instrument does not work. The low risk paradox has been identified in many risk assessment schemes and is known as the “prevention paradox” [
28]. In most risk assessment schemes, the large majority of cases are given a “low” rating with only a few receiving “high” ratings (in the present study, only 5.5% of the total cases were placed in the “high-risk” category). Hence, even if the classification scheme is good (but not perfect), the fact that there are many more “low-risk” cases produces the finding that more people in the low-risk category have the target outcome (such as heart disease, breast cancer, suicide, deadly domestic violence, etc.), even though the proportion of target events is far higher in the high-risk group. Hence, the evaluation of an instrument based solely on the proportion of “correct” identifications of those that commit the act of violence, or on a comparison of the number of people who commit the act in each group, is inappropriate and misleading.
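The arithmetic behind this paradox can be made concrete with a small worked example. All of the figures below are hypothetical and chosen only for illustration; they are not estimates from the present study.

```python
# Hypothetical cohort: few high-risk cases, many lower-risk cases
high_n, high_events = 110, 11      # 10% of high-risk cases have the outcome
low_n, low_events = 1890, 19       # about 1% of lower-risk cases have the outcome

high_rate = high_events / high_n
low_rate = low_events / low_n

# The outcome is roughly ten times more likely in the high-risk group ...
rate_ratio = high_rate / low_rate

# ... yet most outcomes still come from the much larger lower-risk group.
share_from_low = low_events / (high_events + low_events)

print(f"Outcome rate, high-risk: {high_rate:.1%}")
print(f"Outcome rate, lower-risk: {low_rate:.1%}")
print(f"Rate ratio: {rate_ratio:.1f}")
print(f"Share of all outcomes arising in the lower-risk group: {share_from_low:.0%}")
```

In this example the classification separates the groups well, but counting how many of those with the outcome were rated “high” would still make it look as though the instrument had failed.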
Limitations and Future Directions
The study is based solely on official police data. It is appreciated that a large proportion of domestic violence incidents are not reported, with recent figures suggesting only 20–24% of such incidents are being reported to the police [
29]. Such a situation leads to inaccuracies in our dependent variable. It seems probable, however, that a greater percentage of “deadly violence” is likely to be reported and to be present in official records than that of lower-level violence. Thus, the effect of this low level of reporting of domestic violence would be that our control group may be contaminated by having people who did commit a further act of domestic abuse, but that this was not reported/recorded. Any effect that serves to put “noise” into the dependent variable will reduce the ability of any instrument to correctly predict group membership. Thus, it is likely that the figures presented here, and in many other studies using official police and conviction data, will be lower than the “true” figures achievable if all such incidents were reported.
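The attenuating effect of unreported incidents on the AUC can be illustrated with a small deterministic example. The scores are hypothetical, and the function `auc` is our own pairwise (Mann-Whitney) implementation rather than the procedure used in the study. Two offenders who did reoffend are treated as controls because their reoffending went unrecorded.

```python
def auc(pos_scores, neg_scores):
    """Pairwise AUC: chance that a positive case outscores a negative case
    (ties count as half)."""
    total = 0.0
    for p in pos_scores:
        for n in neg_scores:
            total += 1.0 if p > n else 0.5 if p == n else 0.0
    return total / (len(pos_scores) * len(neg_scores))


# Hypothetical risk scores with perfectly recorded outcomes
true_reoffenders = [2, 3, 4, 4, 5, 5]
true_controls = [0, 0, 1, 1, 1, 2, 2, 3]

# Same scores, but two reoffenders (scores 4 and 5) were never reported,
# so official records place them in the control group.
recorded_reoffenders = [2, 3, 4, 5]
recorded_controls = true_controls + [4, 5]

auc_true = auc(true_reoffenders, true_controls)
auc_recorded = auc(recorded_reoffenders, recorded_controls)

print(f"AUC with complete outcome data: {auc_true:.2f}")
print(f"AUC with under-reported outcomes: {auc_recorded:.2f}")
```

The instrument and the scores are identical in both cases; only the accuracy of the outcome labels differs.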
The study was well powered for its intended purpose but was not powered to be able to look at group differences related to the effectiveness of the DASH in identifying the risk of domestic abuse. Larger-scale studies are needed to examine differential predictiveness for perpetrators of different genders, in heterosexual vs. non-heterosexual relationships, in different cultural settings, in groups such as older adults [
30], etc. Likewise, it would be of interest to examine different forms of domestic abuse, such as physical assaults vs. coercive control, etc.
The data presented were based on cases where a DASH was completed (or, at least, the majority of the DASH was completed), as our aim was to examine if a completed DASH was predictive of future domestic abuse. However, the exclusion of cases where the DASH was incomplete may have removed higher-risk and/or chaotic cases and does not therefore represent true operational conditions. Future research is needed to examine the implications of incomplete DASHes (and the reason for the non-completion) for future domestic abuse.