A recommendation and risk classification system for connecting rough sleepers to essential outreach services

Published online by Cambridge University Press:  22 January 2021

Harrison Wilde
Affiliation:
Department of Statistics, University of Warwick, Coventry, United Kingdom
Lucia L. Chen
Affiliation:
School of Informatics, University of Edinburgh, Edinburgh, United Kingdom
Austin Nguyen
Affiliation:
Data Science, Tripadvisor, Needham, Massachusetts, USA
Zoe Kimpel
Affiliation:
Master in Data Science, Northwestern University, Chicago, Illinois, USA
Joshua Sidgwick
Affiliation:
The Alan Turing Institute, London, United Kingdom
Adolfo De Unanue
Affiliation:
Departamento de Matemáticas, Instituto Tecnologico Autonomo de Mexico, Mexico City, Mexico
Davide Veronese
Affiliation:
Master in Public Policy Candidate, Harvard Kennedy School, Cambridge, Massachusetts, USA
Bilal Mateen
Affiliation:
The Alan Turing Institute, London, United Kingdom
Rayid Ghani
Affiliation:
Machine Learning Department and Heinz College of Information Systems and Public Policy, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
Sebastian Vollmer*
Affiliation:
Department of Statistics, University of Warwick, Coventry, United Kingdom; The Alan Turing Institute, London, United Kingdom
*Corresponding author. E-mail: svollmer@turing.ac.uk

Abstract

Rough sleeping is a chronic experience faced by some of the most disadvantaged people in modern society. This paper describes work carried out in partnership with Homeless Link (HL), a UK-based charity, to develop a data-driven approach that better connects people sleeping rough on the streets with outreach service providers. HL's platform has grown exponentially in recent years, leading to thousands of alerts per day during extreme weather events; this overwhelms the volunteer-based system that HL currently relies upon to process alerts. To solve this problem, we propose a human-centered machine learning system that augments the volunteers' efforts by prioritizing alerts according to the likelihood of making a successful connection with a rough sleeper. This addresses capacity and resource limitations whilst allowing HL to quickly, effectively, and equitably process all of the alerts that it receives. Initial evaluation using historical data shows that our approach increases the rate at which rough sleepers are found following a referral by at least 15% based on labeled data. This implies a greater overall increase when the alerts with unknown outcomes are considered, and suggests the benefit of a trial over a longer period to assess the models in practice. The discussion and modeling process are carried out with careful consideration of ethics, transparency, and explainability, owing to the sensitive nature of the data involved and the vulnerability of the people who are affected.
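The prioritization idea in the abstract can be illustrated with a minimal sketch. It assumes that each alert already carries a model-estimated probability of a successful connection; the `Alert` class, the `prioritize` function, and the `capacity` parameter are hypothetical names introduced here for illustration and are not taken from the paper. Alerts beyond the current capacity are deferred rather than discarded, in keeping with the stated aim of processing all alerts.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Alert:
    alert_id: str
    score: float  # hypothetical: model-estimated probability of a successful connection


def prioritize(alerts: List[Alert], capacity: int) -> Tuple[List[Alert], List[Alert]]:
    """Rank alerts by descending score and split them at the volunteers' capacity.

    Returns (process_now, deferred). Deferred alerts stay in ranked order so
    they can be handled as soon as capacity frees up.
    """
    ranked = sorted(alerts, key=lambda a: a.score, reverse=True)
    return ranked[:capacity], ranked[capacity:]


alerts = [Alert("a1", 0.20), Alert("a2", 0.85), Alert("a3", 0.55)]
process_now, deferred = prioritize(alerts, capacity=2)
print([a.alert_id for a in process_now])  # highest-scoring alerts first
```

A human reviewer would still decide what to do with each alert; the ranking only determines the order in which alerts are presented.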

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Author(s), 2021. Published by Cambridge University Press in association with Data for Policy
Figure 1. The proposed alert prioritization process.

Figure 2. StreetLink's platform is experiencing exponential growth, likely due to increased awareness of both the problem and StreetLink itself. This increased demand has not been matched by an increase in resources. The number of referrals made and the number of people connected with follow similar but slightly smaller exponential growth, indicating the severity of the challenge that StreetLink faces.

Table 1. Examples of label mappings for the Referral and Positive Outcome Models.

Table 2. Metric definitions.

Table 3. Baseline Homeless Link statistics by fold/month.

Table 4. Results table for the best Positive Outcome Model.

Table 5. Results table for the best Referral Model.

Figure 3. Plot illustrating the distribution of scores for the chosen Positive Outcome and Referral Models. This particular plot is for February of 2019.

Figure 4. The average precision, recall, and found rate across all folds for the best examples of each classifier type. Points of seeming discontinuity arise from the nature of our temporal nested cross validation: each fold's test set spans one month and so contains a different number of alerts.
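The month-by-month folds mentioned in the Figure 4 caption can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' exact procedure: it assumes an expanding training window (all alerts before the test month) and omits any inner, nested loop for model selection. The function name `monthly_temporal_folds` is hypothetical.

```python
from datetime import date
from typing import List, Tuple

Month = Tuple[int, int]  # (year, month)


def monthly_temporal_folds(alert_dates: List[date]) -> List[Tuple[Month, List[int], List[int]]]:
    """Build one fold per calendar month, skipping the earliest month.

    Each fold tests on the alerts of a single month and trains on every alert
    from strictly earlier months, so no future information leaks into training.
    """
    months = sorted({(d.year, d.month) for d in alert_dates})
    folds = []
    for test_month in months[1:]:  # the first month has no earlier training data
        train_idx = [i for i, d in enumerate(alert_dates) if (d.year, d.month) < test_month]
        test_idx = [i for i, d in enumerate(alert_dates) if (d.year, d.month) == test_month]
        folds.append((test_month, train_idx, test_idx))
    return folds


dates = [date(2019, 1, 5), date(2019, 1, 20), date(2019, 2, 3),
         date(2019, 3, 1), date(2019, 3, 9), date(2019, 3, 30)]
for month, train_idx, test_idx in monthly_temporal_folds(dates):
    print(month, len(train_idx), len(test_idx))
```

Because each test month holds a different number of alerts, any metric computed over the top fraction of a fold's alerts is averaged over test sets of unequal size, which is consistent with the discontinuities the caption describes.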

Figure 5. Log-scaled feature group importance scores from each fold for the chosen random forest models. Temporal aggregates include all of the features generated by counting positive outcomes, referrals, and the total number of alerts received in varying time windows; spatial aggregates are similar but use varying proximities; LDA topics include location and activity topics extracted from free text; manual topics count keywords defined manually as important; platform includes the features that indicate whether the alert originated from the web, the mobile app, or a phone call.
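As a rough illustration of the temporal aggregates described in the Figure 5 caption, the sketch below counts past events in several look-back windows. The window lengths (7, 28, and 90 days), the feature names, and both function names are assumptions made for this example; the caption does not state the values used in the paper.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable, List


def count_in_window(event_times: Iterable[datetime], now: datetime, window_days: int) -> int:
    """Count events in the half-open interval [now - window_days, now)."""
    start = now - timedelta(days=window_days)
    return sum(1 for t in event_times if start <= t < now)


def temporal_aggregates(event_times: List[datetime], now: datetime,
                        windows=(7, 28, 90)) -> Dict[str, int]:
    """Build one count feature per look-back window (window sizes are hypothetical)."""
    return {f"alerts_last_{w}d": count_in_window(event_times, now, w) for w in windows}


past_alerts = [datetime(2019, 1, 15), datetime(2019, 2, 20), datetime(2019, 2, 27)]
print(temporal_aggregates(past_alerts, now=datetime(2019, 3, 1)))
```

The same pattern would apply to counts of referrals or positive outcomes by passing a different list of event times. Excluding events at or after `now` keeps the features free of information from the future.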
