Improving Pre-Trial Risk Assessment
John Ursino, MA,
Zachary Hamilton, Ph.D., and
Alex Kigerl, Ph.D.
There are currently over 400,000 pretrial detainees in U.S. city and county jails (Sawyer & Wagner, 2023). Pretrial release decisions occur after an individual is charged with a criminal offense and brought before a judge for their first court appearance. Here, judges decide whether a defendant is to be detained, released into the community, or required to pay bail/bond to gain pretrial release (Scott-Hayward & Fradella, 2022). There are increasing concerns that current pretrial detention practices worsen jail crowding, and that cash bail/bond disproportionately affects individuals living in poverty, women, and racial/ethnic minorities (Mayson, 2018).
Recent jail statistics support these concerns. In 2023, over 60% of detainees in county and city jails were pretrial defendants, with people of color (POC) making up nearly 70% of this population (Sawyer & Wagner, 2023). The toll that the bail/bond system takes on lower-income defendants is also apparent, as the average annual income for those who could not afford bail ranges from $11,000 to $16,000 (Rabuy & Kopf, 2016). These disparities run contrary to the ideals of fairness and impartiality that are central to court proceedings. Therefore, courts have been seeking ways to counteract these concerning trends by adopting pretrial risk assessments to help increase the number of defendants provided pretrial release, while maintaining public safety (Desmarais et al., 2021).
Pretrial Risk Assessments
Pretrial risk assessments (PRAs) are administered prior to a defendant’s first appearance and help judges make release decisions by providing a rating of risk or release recommendations. Most PRAs attempt to identify those at greatest risk for ‘failure to appear’ (FTA) at their next court date, or to receive a ‘new criminal arrest’ (NCA) or a ‘new violent criminal arrest’ (NVCA) if they were to be released pretrial (Desmarais et al., 2021). In turn, PRAs can help to reduce the use of detention and cash bail/bond, reserving supervision and jail resources for only the highest risk defendants (Scott-Hayward & Fradella, 2022). To be worthwhile, PRAs must be efficient, accurate, and standardized, and must reduce the biases that are more commonly observed in human decision-making. However, while PRAs have the potential to improve release decisions, there are some important limitations (Desmarais et al., 2021).
PRA Limitations
PRAs are often adopted ‘off-the-shelf,’ meaning they are created in one jurisdiction and then implemented in other agencies/jurisdictions (Picard-Fritsche et al., 2017). While more convenient than creating a tool from scratch, assessments adopted off-the-shelf demonstrate reduced accuracy due to a phenomenon called ‘predictive shrinkage.’ Predictive shrinkage occurs when the agency adopting the assessment has different statutes and practices, and serves a substantially different population, than the jurisdiction for which the tool was originally designed (Hamilton et al., 2016). Despite these concerns, off-the-shelf tools are an alluring choice for agencies looking to adopt a PRA, and many contain only five to ten items, making them efficient and cheap to administer (Scott-Hayward & Fradella, 2022). Across the three most common PRAs, the Ohio Risk Assessment System (ORAS), the Public Safety Assessment (PSA), and the Virginia Pretrial Risk Assessment Instrument-Revised (VPRAI-R), most assessment items are static, meaning they measure defendant characteristics that cannot change over time. Criminal history items are common static indicators and are often strong predictors of future criminal behavior (Andrews & Bonta, 2010).
Unfortunately, assessments that consist of only, or mostly, static items have been shown to increase racial/ethnic and gender bias (Butler et al., 2022). Research has shown that an over-reliance on criminal history items inflates risk scores for racial/ethnic minorities, due to disproportionate minority contact reflected through local policing practices and prosecutorial decisions (Eckhouse et al., 2019). Similarly, women offend at lower rates than men, and tools that use predominantly criminal history indicators inflate the risk posed by women, especially for violent crimes (Salisbury et al., 2009). Consequently, contemporary assessments tend to mistakenly ‘overclassify’ risk of future offending, assigning a defendant a higher than appropriate risk rating (Hamilton et al., 2023). Further, current PRAs often fail to gather information on defendants’ pretrial circumstances (e.g., residential stability, substance use, mental health needs, and employment), which are often used by judges when considering release and/or applying conditions that allow for successful pretrial supervision.
Balancing utility against efficiency is a formidable challenge faced by assessment developers and the agencies their tools serve. Fortunately, the development of a new PRA, the Personal Recognizance Interview and Needs Screen (PRINS), provides a different perspective, using improved development methods aimed at resolving these limitations.
Overcoming Limitations
PRAs applied off-the-shelf are adopted ‘as is’ and many agencies may feel like a ‘square peg in a round hole’, needing to gather potentially irrelevant information from pretrial defendants to score a tool. Tools that include items that the court system does not routinely collect can create complications for the agency and extend the workloads of staff. Further, items that are consequential to judicial decisions, representing the defendant’s release circumstances, are commonly ignored (DeMichele et al., 2018; Scott-Hayward & Fradella, 2022). Conversely, the PRINS was developed via a localized approach. Localization means that a tool is created with the context of a local jurisdiction in mind. Localized tools overcome limitations of off-the-shelf assessments by using jurisdiction-specific data to inform their development and are crafted by carefully considering the agency’s needs and the population it serves (Hamilton et al., 2022). For example, short assessments are sometimes viewed as not especially useful because they are more likely to leave out information that judges consider important when making pretrial release decisions (DeMichele et al., 2018). When judges and the courtroom work group are skeptical of what the tool captures, this often results in the assessment recommendations being overridden, which reduces accuracy (Lowder et al., 2023). In contrast, the creators of the PRINS collaborated with local stakeholders to ensure that all of the important information the tool uses is already commonly gathered by the court system, such as details about defendants’ involvement in substance abuse treatment, employment, and living situation. This approach also helped to tailor the outcomes the PRA predicts.
In particular, the King County stakeholders sought more specific outcomes than the three commonly predicted by PRAs (failure to appear [FTA], new criminal arrest [NCA], and new violent criminal arrest [NVCA]), resulting in the offense types being further broken down into ‘any’, felony, violent, property, drug, and domestic violence sub-categories. Overall, the efficiency and utility of PRAs can be greatly increased via localization.
Tailoring assessment items to an agency’s policies, jurisdiction, resources, and population can also increase accuracy. For one, tailoring involves strategically selecting the most useful items for that agency (Hamilton et al., 2016). As a result, buy-in from judges and stakeholders is improved, reducing the likelihood of overrides (Hamilton et al., 2016). Further, by using local data, the strongest predictors for that population can be given more weight (i.e., larger scores), and thus, more importance (Stevenson, 2018).
Similar methods can be used to reduce bias in the assessment outcomes (Duwe, 2014; Hamilton et al., 2022). To reduce female overclassification, the PRINS was developed using gender-responsive modeling. This method recognizes that there are different pathways to crime for males and females, and as a result, not all assessment items predict FTA and arrest outcomes equally across genders (Van Voorhis et al., 2010). For example, among individuals arrested for drug-related crimes, significantly more females attribute the onset of their substance abuse disorder to symptoms of post-traumatic stress and/or prior victimization (Reichert & Bostwick, 2010). PRINS developers created male- and female-specific samples, then selected and weighted the items that were most important for each gender-specific sample. Gender-responsive models have proven to be capable of reducing gender bias (Duwe & Rocque, 2019; Hamilton et al., 2022). This is due, in part, to their ability to identify and incorporate more dynamic items that are important to that agency’s population (Andrews & Bonta, 2010; Hamilton et al., 2022). Unlike static items, dynamic items are changeable measures of a defendant’s personal circumstances and can be used to establish pretrial supervision services that reduce their risk of FTA or arrest (Scott-Hayward & Fradella, 2022). Further, including dynamic items reduces the need for criminal history indicators, which can exacerbate racial/ethnic bias (Butler et al., 2022). Specifically, bias can be reduced by removing predictors that are correlated with being non-white and supplementing their removal by including a greater number of dynamic-need items (Butler et al., 2022).
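The idea of gender-specific item weighting can be illustrated with a minimal sketch. The item names and weights below are hypothetical placeholders chosen only for illustration; they are not the actual PRINS items or weights, and the simple additive scoring rule is an assumption rather than the developers' method.

```python
# Hypothetical item weights per gender-specific model.
# These are illustrative values, NOT the real PRINS items or weights.
WEIGHTS = {
    "male": {"prior_fta": 3, "unstable_housing": 1, "unemployed": 2},
    "female": {"prior_fta": 2, "unstable_housing": 3, "unemployed": 1},
}


def risk_score(gender: str, items: dict[str, bool]) -> int:
    """Sum the gender-specific weights of every item marked present."""
    table = WEIGHTS[gender]
    return sum(weight for name, weight in table.items() if items.get(name, False))


# The same interview answers can yield different scores under each model.
answers = {"prior_fta": True, "unstable_housing": True, "unemployed": False}
male_score = risk_score("male", answers)      # 3 + 1
female_score = risk_score("female", answers)  # 2 + 3
print(male_score, female_score)
```

In this sketch, the same responses produce different totals because each model assigns its own weight to each item, which is the core of a gender-responsive scoring approach.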
Creating the PRINS
In 2017, King County, Washington stakeholders sought to adopt a PRA as a response to growing concerns of bias and overuse of cash bail in pretrial decision-making. When doing so, they wanted a tool that could demonstrate reductions in bias, while maintaining efficiency and accuracy. Subsequently, researchers from the Nebraska Center for Justice Research (NCJR) were asked to develop a PRA that was specific to the King County population. Creating a local tool offset concerns of predictive shrinkage and allowed for more sophisticated development methods, increased efficiency, and reduced bias.
Development occurred in three phases. Phase 1 involved collaboration with local subject matter experts (SMEs) to tailor the tool items and scoring to the existing court system infrastructure and data collection. This also allowed NCJR to identify items that were most important for the King County population, and to create risk level categories (RLCs) that reflect the distribution of Low, Moderate, and High-Risk defendants. Unlike most PRAs that focus on criminal history, the PRINS incorporated information regarding defendants’ education, social support, community supervision, employment, and other demographic information. Item weighting and scoring were completed via a gender-responsive approach in an effort to reduce gender bias. Once items were selected and weighted, the tool was developed and deployed via software. Named the PRINS, King County’s homegrown tool was designed to address the limitations of contemporary tools.
In Phase 2, the tool was ‘piloted’ in the courts. During this time, judges were not allowed to consider the PRINS’ findings when making release decisions. This process allowed data to be collected as a ‘naturalized experiment’ so that researchers could assess the accuracy of the tool. Phase 3 began in 2022 by analyzing the 28,147 cases collected during Phase 2 to evaluate tool performance and make updates. The final product was a highly accurate assessment tailored to King County’s needs.
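To illustrate how a total score can be translated into the risk level categories (RLCs) created in Phase 1, the sketch below maps scores to Low, Moderate, and High using cut points. The cut points shown are hypothetical; in practice they would be set from the local score distribution, and the actual PRINS thresholds are not reproduced here.

```python
from bisect import bisect_right

# Hypothetical cut points: scores below 4 are Low, 4 through 7 are Moderate,
# and 8 or above are High. Real thresholds would come from local data.
CUT_POINTS = [4, 8]
LEVELS = ["Low", "Moderate", "High"]


def risk_level(score: int) -> str:
    """Map a total score to a risk level category using the cut points."""
    return LEVELS[bisect_right(CUT_POINTS, score)]


for score in (3, 4, 8):
    print(score, risk_level(score))
```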
Testing the PRINS
As judges were not allowed to consider PRINS results, we were able to compare the tool’s accuracy to the release decisions made by court officials. To do so, we compared the PRINS risk level categories (RLCs) to three judicial release decisions:
1. personal recognizance (ROR; i.e., no conditions),
2. conditional, and
3. bail/bond release.
Although comparing the PRINS to judges’ release decisions was important evidence for local stakeholders, to provide a broader context of the tool’s results we compared its accuracy to commonly used pretrial assessment tools. Specifically, we used King County data to code the items and responses of the ORAS, VPRAI-R, and PSA. We then tested the tools’ accuracy in predicting FTA, NCA, and NVCA using the data collected during Phase 2.
Comparing Accuracy to Judicial Decisions
When comparing RLCs to release decisions, ideally, we would see more Low-Risk defendants released via ROR, Moderate-Risk defendants receiving conditional release, and High-Risk defendants receiving bail/bond conditions. The comparison of RLCs and release decisions showed that following PRINS recommendations would have led to more Low-Risk defendants being provided ROR release, yet their release would not have increased FTAs or new arrest outcomes. The comparison also highlighted the PRINS’ ability to limit the use of bail/bond by identifying High-Risk defendants more accurately (Hamilton et al., 2022). These findings demonstrate that, had the King County Court been using the PRINS to guide release decisions, more people would have been released, and fewer FTAs and NCA/NVCAs would have occurred.
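A comparison of this kind can be sketched as a cross-tabulation of risk level against release decision, alongside the observed failure rate in each risk level. The records below are invented for illustration only and do not reflect King County data; the variable and function names are likewise hypothetical.

```python
from collections import Counter

# Hypothetical pilot records (NOT King County data):
# (risk level, judicial release decision, any FTA or new arrest)
records = [
    ("Low", "ROR", False),
    ("Low", "bail/bond", False),
    ("Low", "conditional", False),
    ("Moderate", "conditional", True),
    ("High", "ROR", True),
    ("High", "bail/bond", True),
]

# Count how often each risk level received each release decision.
crosstab = Counter((level, decision) for level, decision, _ in records)


def failure_rate(level: str) -> float:
    """Share of defendants in a risk level with an FTA or new arrest."""
    outcomes = [failed for lvl, _, failed in records if lvl == level]
    return sum(outcomes) / len(outcomes)


# Low-Risk defendants who did not receive ROR are candidates for less
# restrictive release if the tool's recommendation had been followed.
low_not_ror = sum(
    n for (level, decision), n in crosstab.items()
    if level == "Low" and decision != "ROR"
)
print(low_not_ror, failure_rate("Low"), failure_rate("High"))
```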
Comparing Accuracy to Other PRAs
Next, we compared the PRINS to three contemporary PRAs. Figure 2 displays the results of the tool comparison. The vertical axis represents the probability that a defendant will experience the predicted outcome (e.g., FTA). The horizontal axis represents defendants’ PRA scores. Ideally, as risk scores increase along the horizontal axis, the probability of failure increases, producing a steep upward slope. A flatter slope indicates that a tool has difficulty distinguishing between Low-Risk and High-Risk defendants.
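The slope interpretation can be made concrete with a small sketch that computes the observed failure rate at each score and then fits a straight line through those rates. This is only an illustration under simplifying assumptions: the data are invented, the least-squares line is unweighted, and it is not necessarily the method used to produce the figure.

```python
from collections import defaultdict


def failure_rate_by_score(pairs):
    """Return {score: observed share of defendants with the outcome}."""
    totals = defaultdict(lambda: [0, 0])  # score -> [failures, defendants]
    for score, failed in pairs:
        totals[score][0] += int(failed)
        totals[score][1] += 1
    return {s: f / n for s, (f, n) in sorted(totals.items())}


def slope(rates):
    """Unweighted least-squares slope of failure rate on score."""
    xs, ys = list(rates), list(rates.values())
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Hypothetical (score, failed) pairs for two imaginary tools.
steep_tool = [(0, False), (0, False), (1, False), (1, True), (2, True), (2, True)]
flat_tool = [(0, False), (0, True), (1, False), (1, True), (2, False), (2, True)]

steep_slope = slope(failure_rate_by_score(steep_tool))
flat_slope = slope(failure_rate_by_score(flat_tool))
print(steep_slope, flat_slope)
```

In this toy example, the first tool's failure rate rises with every score point, while the second tool's failure rate is the same at every score, so its slope is zero and its scores carry no information about the outcome.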
The figure shows that the PRINS (in red) has the steepest slope of all the tools across the three outcomes. On the high end of the risk scale, there is a 14-percentage-point difference in NCA probability between the PRINS and the other tools. The pattern is also notable for the other outcomes, with the PRINS improving predictive accuracy by 12 percentage points for NVCA and 11-15 percentage points for FTA. Overall, these results show the improved accuracy of the PRINS.
Comparing Bias
While improvements in accuracy were expected, we also anticipated that including more dynamic indicators would improve prediction for POC. To test this effect, we separated the sample into White and POC groups. We took the average score of the three non-PRINS PRAs to create a three-tool average, highlighting the difference between tools. Figure 3 plots the probabilities of the three outcomes for POC defendants. The PRINS prediction for NCA shows notable improvement over the other tools in the high-risk half of the scale, where the PRINS improves accuracy by 10 percentage points. A similar trend is observed for NVCA, where PRINS extends to 20 percentage points over the other tools’ average, and 10 percentage points for FTA. Overall, the figure demonstrates that the inclusion of dynamic indicators in a localized tool helps to improve predictive accuracy for POC.
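The arithmetic behind a three-tool average and a percentage-point gap can be sketched as follows. The probabilities are invented for illustration and are not the study's results, and averaging predicted probabilities at a single point on the scale is an assumption made here for simplicity.

```python
from statistics import mean

# Hypothetical predicted probabilities of a new arrest at the top of each
# tool's risk scale. These are NOT the values reported in the study.
other_tools = {"ORAS": 0.40, "PSA": 0.45, "VPRAI-R": 0.35}
prins = 0.50

# Average the three comparison tools, then express the gap between the
# PRINS and that average in percentage points.
three_tool_average = mean(list(other_tools.values()))
gap_points = round((prins - three_tool_average) * 100, 1)
print(gap_points)
```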
Next, we assessed if gender-specific scoring improved prediction for female defendants. Figure 4 provides comparisons between the PRINS and other tools’ average for our subsample of females. These plots illustrate that the PRINS provides a substantial improvement for female NCA and NVCA prediction, as differences again extend to over 20 percentage points by the end of the risk scale. While FTA distinctions are less substantial, a similar pattern is seen, as the PRINS provides greater prediction for females in the higher risk end of the scale.
Conclusion
As noted, PRAs are relatively new to the criminal justice field. While contemporary PRAs are efficient, they are not without limitations. We caution agencies considering adopting these, or any, tools in anticipation of an easy fix to problems of bias or drastic pretrial release increases. Specifically, defendant populations will differ from a tool’s original development sample, reducing an off-the-shelf tool’s accuracy when applied in a new location. Without applying the assessment innovations described here, one can anticipate performance reductions and overclassification of both females and POC. The PRINS provides a necessary improvement to the PRA landscape by increasing accuracy through localization, gender-responsive scoring, and the inclusion of a greater number of dynamic items that are commonly collected by the local agency. Further, dynamic items provide valuable information for pretrial release decisions and reduce potential racial/ethnic and gender biases. Given that PRAs are often adopted to reduce the use of bail/bond, it is critical to ensure tools are not perpetuating biases with a greater impact for economically disadvantaged populations. Overall, we believe that the PRINS provides a much-needed advancement and template for PRA development that will ultimately improve defendants’ lives and public safety.
_______________________________________
John Ursino, MA, is a Research Assistant and Doctoral Student in the School of Criminology and Criminal Justice at the University of Nebraska at Omaha. His research interests include risk and needs assessment and correctional programming, and his research has appeared in the Journal of Criminal Justice. John can be reached at jursino@unomaha.edu
Zachary Hamilton, Ph.D., is a Professor in the School of Criminology and Criminal Justice at the University of Nebraska at Omaha and the Associate Director of the Nebraska Center for Justice Research. His research has focused on risk and need assessment development, quantitative methods, and correctional program and policy evaluation. His research has appeared in Criminology and Public Policy, Justice Quarterly, and Crime and Delinquency, among others. Dr. Hamilton can be reached at zhamilton@unomaha.edu
Alex Kigerl, Ph.D., is a Research Associate in the School of Criminology and Criminal Justice at the University of Nebraska at Omaha and the Nebraska Center for Justice Research. His research focus has been on machine learning, quantitative methods, risk assessment development, and cybercrime. His research has appeared in Social Science Computer Review, Justice Quarterly, and Criminal Justice and Behavior, among others. Dr. Kigerl can be reached at akigerl@unomaha.edu
References
Andrews, D. A., & Bonta, J. (2010). The psychology of criminal conduct. Routledge.
Butler, L. C., Hamilton, Z., Krushas, A. E., Kigerl, A., & Kowalski, M. (2022). Racial Bias and Amelioration Strategies for Juvenile Risk Assessment. Handbook on Inequalities in Sentencing and Corrections among Marginalized Populations, 70-118.
Desmarais, S. L., Zottola, S. A., Duhart Clarke, S. E., & Lowder, E. M. (2021). Predictive validity of pretrial risk assessments: A systematic review of the literature. Criminal Justice and Behavior, 48(4), 398–420.
DeMichele, M., Baumgartner, P., Wenger, M., Barrick, K., Comfort, M., & Misra, S. (2018). The public safety assessment: A re-validation and assessment of predictive utility and differential prediction by race and gender in Kentucky.
Duwe, G. (2014). The development, validity, and reliability of the Minnesota screening tool assessing recidivism risk (MnSTARR). Criminal Justice Policy Review, 25(5), 579-613.
Duwe, G., & Rocque, M. (2019). The predictive performance of the Minnesota screening tool assessing recidivism risk (MnSTARR): An external validation.
Eckhouse, L., Lum, K., Conti-Cook, C., & Ciccolini, J. (2019). Layers of bias: A unified approach for understanding problems with risk assessment. Criminal Justice and Behavior, 46(2), 185-209.
Hamilton, M. (2019). The sexist algorithm. Behavioral sciences & the law, 37(2), 145-157.
Hamilton, Z., Tollefsbol, E. T., Campagna, M., & Van Wormer, J. (2016). Customizing criminal justice assessments. In Handbook on Risk and Need Assessment (pp. 349-393). Routledge.
Hamilton, Z., Kigerl, A., Ursino, J., & Krushas, A. (2022). Personal Recognizance Interview and Needs Screen (PRINS): Evaluation and Revalidation.
Hamilton, Z., Kowalski, M., & Kigerl, A. (2022). Prediction is Local: The Benefits of Optimization. Justice Quarterly, 39(4), 722-744.
Hamilton, Z., Kowalski, M., Campagna, M., Kobie, A., & Kigerl, A. (2023, Forthcoming). Comparing Meters to Yards: A Nationally Representative Evaluation of Gender Bias in Risk Assessment. Justice Quarterly.
Lowder, E. M., Kamara, Z. B., & Kent, A. (2023). Pretrial Decision-Making Matrices: The Role of Risk and Charge Weighting in Risk Assessment–Guided Decisions. Criminal Justice and Behavior.
Mayson, S. G. (2018). Bias in, bias out. Yale Law Journal, 128, 2218.
Picard-Fritsche, S., Rempel, M., Tallon, J. A., Adler, J., & Reyes, N. (2017). Demystifying risk assessment. Center for Court Innovation.
Rabuy, B., & Kopf, D. (2016, May 10). Detaining the poor: How money bail perpetuates an endless cycle of poverty and jail time.
Reichert, J., & Bostwick, L. (2010, November). Post-traumatic stress disorder and victimization–ICJIA. Illinois Criminal Justice Information Authority. Retrieved November 28, 2021.
Salisbury, E. J., Van Voorhis, P., & Spiropoulos, G. V. (2009). The predictive validity of a gender-responsive needs assessment: An exploratory study. Crime & Delinquency, 55(4), 550-585.
Sawyer, W., & Wagner, P. (2023, March 14). Mass incarceration: The whole pie 2023. Prison Policy Initiative. https://www.prisonpolicy.org/reports/pie2023.html
Scott-Hayward, C. S., & Fradella, H. F. (2022). Abolishing Bail. Transforming Criminal Justice: An Evidence-Based Agenda for Reform, 97.
Van Voorhis, P., Wright, E. M., Salisbury, E., & Bauman, A. (2010). Women’s risk factors and their contributions to existing risk/needs assessment: The current status of a gender-responsive supplement. Criminal Justice and Behavior, 37(3), 261-288.
Zottola, S. A., Desmarais, S. L., Stewart, D. K., Duhart Clarke, S. E., & Monahan, J. (2023). Pretrial Risk Assessment, Release Recommendations, and Racial Bias. Criminal Justice and Behavior.