Investigating the factors influencing Repeatedly Crash-Involved Drivers (RCIDs): A Random Parameter Hazard-Based Duration approach

Accid Anal Prev. 2024 Dec 8:211:107876. doi: 10.1016/j.aap.2024.107876. Online ahead of print.

Abstract

Repeatedly Crash-Involved Drivers (RCIDs) pose significant challenges to traffic safety, contributing disproportionately to crash occurrences and their severe consequences. While existing research has explored factors influencing crash involvement, the literature often neglects the influence of a driver's crash history and inter-crash intervals on their evolving crash risk. Many traditional models also fail to address unobserved heterogeneity, limiting their ability to capture the complex interplay of factors contributing to repeated crash involvement. This study investigates the factors influencing RCIDs using a hybrid methodology that integrates machine learning with a Random Parameter Hazard-Based Duration Model (HBDM). Machine learning techniques are employed to identify the most critical factors affecting RCID involvement, which are then incorporated into the HBDM framework. By leveraging machine learning's capacity to analyze complex relationships within high-dimensional data and the HBDM's ability to address unobserved heterogeneity, this approach provides a comprehensive understanding of RCID behavior. Key findings reveal that male drivers, individuals with histories of distracted or alcohol-impaired driving, and those with prior traffic violations exhibit heightened crash risks. Roadway conditions, vehicle age, and regional variations also emerge as significant contributors. Drivers with extensive crash histories demonstrate dynamic risk profiles, with cumulative hazard estimates indicating increased crash likelihood over time for those with multiple prior incidents. In addition, the unobserved heterogeneity parameter (Theta) points to latent, driver-specific risk factors, especially among higher-tier drivers, highlighting the complex nature of repeated crash involvement. These findings offer a more nuanced understanding of RCIDs and underscore the need for targeted interventions that account for both observable risks and deeper, unmeasured influences on driver behavior.
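To make the role of the cumulative hazard and of Theta more concrete, the following is a minimal sketch of one common hazard-based duration specification: a Weibull baseline with gamma-distributed unobserved heterogeneity. The abstract does not state which baseline distribution or heterogeneity form the study uses, so this functional form is an assumption, and the names `lam` (scale), `p` (duration-dependence parameter), and `theta` (heterogeneity variance) are illustrative rather than the authors' notation. Random parameters and covariates are omitted.

```python
import math


def cumulative_hazard(t, lam, p, theta):
    """Cumulative hazard H(t) for an assumed Weibull baseline with gamma heterogeneity.

    H(t) = ln(1 + theta * (lam * t)**p) / theta, which reduces to the
    plain Weibull form (lam * t)**p when theta is 0.
    """
    base = (lam * t) ** p
    if theta == 0.0:
        return base
    return math.log1p(theta * base) / theta


def survival(t, lam, p, theta):
    """Probability that the time to the next crash exceeds t: S(t) = exp(-H(t))."""
    return math.exp(-cumulative_hazard(t, lam, p, theta))


def hazard(t, lam, p, theta):
    """Instantaneous crash hazard h(t) = dH/dt under the same assumed form (t > 0)."""
    return lam * p * (lam * t) ** (p - 1) / (1.0 + theta * (lam * t) ** p)
```

Under this assumed form, a larger `theta` lowers the population-level hazard at longer durations relative to the plain Weibull, which is one way latent driver-specific differences can show up in estimated inter-crash durations.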

Keywords: Duration Between Crashes; Hazard-Based Duration Models (HBDM); Machine Learning; Random Forest; Random Parameters; Repeatedly Crash-Involved Drivers (RCIDs); Traffic Safety; Unobserved Heterogeneity.