
A Machine Learning architecture to forecast Irregular Border Crossings and Asylum requests for policy support in Europe: a case study

Published online by Cambridge University Press:  20 December 2024

Claudio Bosco*
Affiliation:
European Commission, Joint Research Centre (JRC), Ispra, Italy
Umberto Minora
Affiliation:
European Commission, Joint Research Centre (JRC), Ispra, Italy
Anna Rosińska
Affiliation:
European Commission, Joint Research Centre (JRC), Ispra, Italy
Maurizio Teobaldelli
Affiliation:
ARCADIA SIT S.R.L., Milano, Italy
Martina Belmonte
Affiliation:
European Commission, Joint Research Centre (JRC), Ispra, Italy
* Corresponding author: Claudio Bosco; Email: claudio.bosco@ec.europa.eu

Abstract

Anticipating future migration trends is instrumental to the development of effective policies to manage the challenges and opportunities that arise from population movements. However, anticipation is challenging. Migration is a complex system with multifaceted drivers, such as demographic structure, economic disparities, political instability, and climate change. Measurements carry inherent uncertainties, and the majority of migration theories are either under-specified or difficult to act on. Moreover, forecasting approaches generally target specific migration flows, which makes generalisation difficult.

In this paper, we present the results of a case study to predict Irregular Border Crossings (IBCs) through the Central Mediterranean Route and Asylum requests in Italy. We applied a set of Machine Learning techniques in combination with a suite of traditional data to forecast migration flows. We then applied an ensemble modelling approach to aggregate the results of the different Machine Learning models and improve predictive capacity.

Our results show the potential of this modelling architecture for producing forecasts of IBCs and Asylum requests over 6 months. On a validation set, the explained variance of our models is as high as 80%. This study offers a robust basis for the construction of timely forecasts. In the discussion, we comment on how this approach could benefit migration management in the European Union at various levels of policy making.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press

Table 1. Time delay embedding example for vector X of size Nx1 into a matrix of size NxL, where L is set to 6
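
The embedding described in this caption can be illustrated with a minimal sketch. It assumes that row t of the matrix holds the current value followed by its L-1 previous values, and that positions before the start of the series are left empty (`None`). The article's table may order or pad the lags differently, and the function name `time_delay_embedding` and the sample series are hypothetical.

```python
from typing import List, Optional, Sequence


def time_delay_embedding(
    x: Sequence[float], n_lags: int = 6
) -> List[List[Optional[float]]]:
    """Embed a length-N series into an N x n_lags matrix.

    Row t is [x[t], x[t-1], ..., x[t-n_lags+1]]. Entries that would fall
    before the start of the series are None (an assumed padding choice).
    """
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")
    return [
        [x[t - k] if t - k >= 0 else None for k in range(n_lags)]
        for t in range(len(x))
    ]


# Illustrative values only, not data from the article.
series = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
embedded = time_delay_embedding(series, n_lags=6)

for row in embedded:
    print(row)
```

Rows that still contain `None` would typically be dropped before training, so a series of length N yields N-L+1 complete rows.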

Figure 1. Plot showing the RMSE obtained with Random Forest using 1) a different number of variables (num.vars) ranging from 1 to 17 and 2) forty-eight different sets of optimal hyperparameters for the learning algorithm (ntree: number of trees to grow; mtry: number of variables randomly sampled as candidates at each split; sampsize: size of sample to draw).

Figure 2. The flowchart reports the modelling architecture applied to forecast asylum applications in Italy and IBCs along the Central Mediterranean Route.

Table 2. Modelling results for all the models we applied to predict first-time asylum applications in Italy. MAE and explained variance in training and validation were calculated for each model and compared with the same metrics calculated for a trivial model. Further information is available in the electronic supplementary material.
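
As a rough illustration of the two metrics named in this caption, the sketch below computes MAE and explained variance for a forecast and for a trivial baseline. Two assumptions are made here that the caption does not confirm: explained variance is taken as 1 - Var(residuals) / Var(observed), and the trivial model is a persistence forecast that repeats the value observed `horizon` steps earlier. All function names and numbers are hypothetical.

```python
from statistics import fmean, pvariance
from typing import List, Sequence


def mean_absolute_error(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between observations and predictions."""
    return fmean([abs(o - p) for o, p in zip(observed, predicted)])


def explained_variance(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """1 - Var(residuals) / Var(observed); undefined if observed is constant."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    return 1.0 - pvariance(residuals) / pvariance(observed)


def persistence_forecast(series: Sequence[float], horizon: int) -> List[float]:
    """Assumed trivial model: the prediction for step t is the value at t - horizon."""
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    return list(series[:-horizon])  # aligned with series[horizon:]


# Illustrative monthly counts, not data from the article.
counts = [120.0, 150.0, 130.0, 170.0, 160.0, 200.0, 190.0, 230.0]
horizon = 2
targets = counts[horizon:]
baseline = persistence_forecast(counts, horizon)

print("baseline MAE:", mean_absolute_error(targets, baseline))
print("baseline explained variance:", explained_variance(targets, baseline))
```

Under this definition, a forecast that is off by a constant amount still scores an explained variance of 1, which is one reason to report MAE alongside it.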

Figure 3. Results of the modelling ensemble to predict asylum applications in Italy (validation set of data). In blue are the observed values and in black are the modelling predictions. The dashed black line on the right represents the predictions beyond the last available observation (August to December 2023). In red is the model uncertainty.

Figure 4. Feature Importance plot for the estimate of first-time asylum applications in Italy, obtained with the Random Forest model (similar results were obtained for the ANN and GBDT models). cpg.faostat: Consumer Price Index general by FAOSTAT. fpi.faostat: Food Price Index by FAOSTAT. 6M: 6-month lag. Suffixes are ISO 3 country codes.

Table 3. Modelling results in predicting Irregular Border Crossings along the Central Mediterranean Migratory Route. MAE and explained variance in training and validation were calculated for all the applied models and compared with the same metrics calculated for a trivial model. Further information is available in the electronic supplementary material.

Figure 5. Results of the modelling ensemble to predict IBCs over the Central Mediterranean Route (validation set of data). In blue are the observed values, and in black are the predictions of the model. The dashed black line on the right represents the predictions beyond the last available observation (August to December 2023). In red is the model uncertainty.

Figure 6. Example of a Feature Importance plot for the estimate of IBCs. It was obtained by applying PFI to the RF model (similar results were obtained for the ANN and GBDT models). cpg.faostat: Consumer Price Index general by FAOSTAT. fpi.faostat: Food Price Index by FAOSTAT. cpf.faostat: the FAOSTAT monthly Food Consumer Price Index. inflation.food_price: Monthly food price inflation estimates for fragile countries by the World Bank. 6M: 6-month lag. Prefixes are ISO 3 country codes.
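
Assuming PFI here stands for permutation feature importance, the general idea can be sketched as follows: shuffle one feature column at a time and record how much the prediction error grows. This is a generic illustration with a toy predictor, not the authors' implementation. The use of MAE as the score, the number of repeats, and every name in the code are assumptions.

```python
import random
from statistics import fmean
from typing import Callable, List, Sequence


def mae(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error between observations and predictions."""
    return fmean([abs(o - p) for o, p in zip(observed, predicted)])


def permutation_importance(
    predict: Callable[[List[float]], float],
    X: List[List[float]],
    y: Sequence[float],
    n_repeats: int = 10,
    seed: int = 0,
) -> List[float]:
    """Mean increase in MAE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    base_error = mae(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the target
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mae(y, [predict(row) for row in X_perm]) - base_error)
        importances.append(fmean(increases))
    return importances


def toy_predict(row: List[float]) -> float:
    """Hypothetical fitted model: uses feature 0 and ignores feature 1."""
    return 3.0 * row[0]


# Synthetic data for illustration only.
X = [[float(i), float((i * 7) % 5)] for i in range(20)]
y = [3.0 * row[0] for row in X]
importances = permutation_importance(toy_predict, X, y)
print(importances)
```

In this toy setting the ignored feature receives an importance of exactly zero, while the feature the predictor depends on receives a positive value.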

Supplementary material: File

Bosco et al. supplementary material

File 1.1 MB