Background: Each year, the New South Wales Police Force attends vast numbers of domestic violence (DV) incidents and records them in the WebCOPS (Web-based interface for the Computerised Operational Policing System) database as both structured quantitative data and unstructured free text describing the incident, the victim, and the person of interest (POI). Although the structured data are used for reporting purposes, the free text remains untapped for DV reporting and surveillance.
Objective: In this paper, we explore whether text mining can automatically identify mental health disorders from this unstructured text.
Methods: We used a training set of 200 recorded DV events to design a knowledge-driven approach based on lexical patterns in the text that suggest mental health disorders for POIs and victims.
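To make the idea of lexical patterns concrete, the following is a minimal sketch of a rule-based matcher in Python. It is not the authors' implementation: the term lists, the cue phrases, the word-window sizes, and the names `ROLES`, `DISORDERS`, `CUE`, and `extract_mentions` are all illustrative assumptions, since the abstract does not list the actual dictionaries or rules.

```python
import re

# Hypothetical mini-lexicons (assumptions for illustration only;
# the paper's actual dictionaries are not given in the abstract).
ROLES = {
    "POI": r"poi|accused|defendant",
    "victim": r"victim|vic",
}
DISORDERS = {
    "depression": r"depress(?:ion|ed|ive)",
    "anxiety disorder": r"anxiety|panic (?:attack|disorder)s?",
    "alcohol abuse": r"alcohol(?:ism| abuse| dependence)|alcoholic",
}
# Hypothetical cue phrases that link a person to a condition.
CUE = r"(?:suffers? from|(?:has|was|is|been) diagnosed with|has a history of)"


def build_patterns():
    """Compile one pattern per (role, disorder) pair:
    role, up to 4 intervening words, cue phrase, up to 3 words, disorder term."""
    patterns = []
    for role, role_rx in ROLES.items():
        for disorder, dis_rx in DISORDERS.items():
            rx = re.compile(
                rf"\b(?:{role_rx})\b(?:\W+\w+){{0,4}}?\W+{CUE}"
                rf"\W+(?:\w+\W+){{0,3}}?(?:{dis_rx})",
                re.IGNORECASE,
            )
            patterns.append((role, disorder, rx))
    return patterns


PATTERNS = build_patterns()


def extract_mentions(text):
    """Return the set of (role, disorder) pairs whose pattern matches the text."""
    return {(role, disorder) for role, disorder, rx in PATTERNS if rx.search(text)}


if __name__ == "__main__":
    narrative = (
        "The POI stated he suffers from depression. "
        "The victim has been diagnosed with panic disorder."
    )
    print(sorted(extract_mentions(narrative)))
```

A sketch like this ignores negation ("no history of depression"), sentence boundaries, and misspellings, all of which a production rule set for police narratives would likely need to handle.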
Results: On an evaluation set of 100 DV events, precision was 97.5% for mental health disorders related to POIs and 87.1% for those related to victims. After applying our approach to a large-scale corpus of almost half a million DV events, we identified 77,995 events (15.83%) that mentioned mental health disorders, with 76.96% (60,032/77,995) of those linked to POIs versus 16.47% (12,852/77,995) for the victims and 6.55% (5111/77,995) for both. Depression was the most common mental health disorder mentioned for both victims (22.25%, 3269) and POIs (18.70%, 8944), followed by alcohol abuse for POIs (12.19%, 5829) and various anxiety disorders (eg, panic disorder, generalized anxiety disorder) for victims (11.66%, 1714).
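The breakdown of the 77,995 events can be checked with a few lines of arithmetic. The sketch below uses only the counts reported above; the variable names are illustrative, and the implied corpus size is derived from the reported 15.83% rather than being a figure stated in the text.

```python
# Counts reported in the Results; the three groups are treated as mutually
# exclusive because they sum exactly to the total.
TOTAL_WITH_MENTION = 77_995
COUNTS = {"POI": 60_032, "victim": 12_852, "both": 5_111}


def shares(counts, total):
    """Return each group's share of the total, in percent."""
    return {group: 100 * n / total for group, n in counts.items()}


SHARES = shares(COUNTS, TOTAL_WITH_MENTION)

# Corpus size implied by "77,995 events (15.83%)"; an estimate, not a reported value.
IMPLIED_CORPUS_SIZE = TOTAL_WITH_MENTION / 0.1583

if __name__ == "__main__":
    for group, pct in SHARES.items():
        print(f"{group}: {pct:.2f}%")
    print(f"implied corpus size: about {IMPLIED_CORPUS_SIZE:,.0f} events")
```

The computed shares agree with the reported percentages to within 0.01 percentage points, and the implied corpus size is consistent with "almost half a million" events.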
Conclusions: The results suggest that text mining can automatically extract targeted information from police-recorded DV events to support further public health research into the nexus between mental health disorders and DV.
Keywords: domestic violence; mental health disorders; police narratives; rule-based approach; text mining.
©George Karystianis, Armita Adily, Peter Schofield, Lee Knight, Clara Galdon, David Greenberg, Louisa Jorm, Goran Nenadic, Tony Butler. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 13.09.2018.