Predicting social assistance beneficiaries: On the social welfare damage of data biases

Published online by Cambridge University Press:  22 January 2024

Stephan Dietrich*, UNU-MERIT, Maastricht, Netherlands
Daniele Malerba, IDOS, Bonn, Germany
Franziska Gassmann, UNU-MERIT, Maastricht, Netherlands
*Corresponding author: Stephan Dietrich; Email: s.dietrich@maastrichtuniversity.nl

Abstract

Cash transfer programs are the most common anti-poverty tool in low- and middle-income countries, reaching more than one billion people globally. Benefits are typically targeted using prediction models. In this paper, we develop an extended targeting assessment framework for proxy means testing that accounts for societal sensitivity to targeting errors. Using a social welfare framework, we weight targeting errors based on their position in the welfare distribution and adjust for different levels of societal inequality aversion. While this approach provides a more comprehensive assessment of targeting performance, our two case studies show that bias in the data, particularly in the form of label bias and unstable proxy means testing weights, leads to a substantial underestimation of welfare losses, disadvantaging some groups more than others.
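
As a rough illustration of the idea in the abstract, the sketch below weights targeting errors by where they fall in the welfare distribution. It assumes a CRRA (Atkinson-type) utility function with inequality-aversion parameter `rho`, a fixed poverty line, and a flat transfer; none of these choices are confirmed by the text above, and the function names (`crra_utility`, `social_welfare`, `welfare_loss_pct`) are hypothetical. The welfare loss is expressed as the percentage shortfall in social welfare relative to perfect targeting.

```python
import math


def crra_utility(c, rho):
    """CRRA utility of consumption c (assumed functional form, not from the paper).

    rho is the inequality-aversion parameter: rho = 0 weights everyone equally,
    larger rho puts more weight on poorer households.
    """
    if c <= 0:
        raise ValueError("consumption must be positive")
    if rho == 1:
        return math.log(c)
    return c ** (1 - rho) / (1 - rho)


def social_welfare(consumption, rho):
    """Additive social welfare: the sum of household utilities."""
    return sum(crra_utility(c, rho) for c in consumption)


def welfare_loss_pct(consumption, predicted_poor, poverty_line, transfer, rho):
    """Percentage welfare shortfall of model-based targeting vs. perfect targeting.

    Perfect targeting pays the transfer to every household below the poverty
    line; model-based targeting pays it to every household flagged as poor.
    """
    actual_poor = [c < poverty_line for c in consumption]
    perfect = [c + transfer if p else c for c, p in zip(consumption, actual_poor)]
    model = [c + transfer if p else c for c, p in zip(consumption, predicted_poor)]
    w_perfect = social_welfare(perfect, rho)
    w_model = social_welfare(model, rho)
    # Divide by the absolute value because CRRA welfare is negative for rho > 1.
    return 100 * (w_perfect - w_model) / abs(w_perfect)


# Illustrative (made-up) data: the poorest household is wrongly excluded
# and a non-poor household is wrongly included.
consumption = [50, 80, 120, 200]
predicted_poor = [False, True, True, False]

for rho in (0, 1, 2):
    loss = welfare_loss_pct(consumption, predicted_poor, 100, 10, rho)
    print(f"rho={rho}: welfare loss = {loss:.3f}%")
```

In this toy example both allocations spend the same budget, so with `rho = 0` the two errors cancel out and the loss is zero; as `rho` rises, excluding the poorest household is penalized more heavily than the inclusion error is rewarded.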

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press

Table 1. Poverty headcount in Malawi and Tanzania data

Table 2. PMT and test data used for assessment comparisons

Table 3. Confusion matrix Tanzania

Table 4. Malawi confusion matrix

Figure 1. Actual (first row) and predicted (second and third rows) poor households. Notes: The x and y axes show two-dimensional test data of the PMT variables: the PMT variables were first standardized and then rescaled using multidimensional scaling. "Actual" refers to true (consumption) poverty status; "linear" and "grbt" refer to poverty status predicted with the linear and xgboost models, respectively.

Figure 2. Marginal welfare loss of unit transfer with linear and xgboost model. Notes: Welfare loss computed as percentage change in welfare in comparison to perfect targeting assuming different levels of inequality aversion.

Figure 3. Welfare loss depending on consumption measurement module and by household size. Notes: Welfare loss computed as percentage change in welfare in comparison to perfect targeting assuming different levels of inequality aversion. Evaluation based on the same test data, but models were trained with data only including recall or diary data.

Figure 4. Welfare loss distribution by household size. Notes: Welfare loss computed as percentage change in welfare in comparison to perfect targeting assuming different levels of inequality aversion. Evaluation based on the same test data, but models were trained with data only including harvest or lean season data.

Table A1. Public cash transfer programs with PMT in East Africa

Table A2. Summary of PMT variables, Malawi

Table A3. Lean season balance tests

Table A4. Summary of PMT variables, Tanzania

Table A5. Hyper-parameter grid search gradient boosting model

Figure A1. Feature importance/coefficients.

Table A6. Performance summary of the PMT models

Figure A2. Marginal welfare loss of unit transfer with diary and recall PMT (only using personal diaries with high supervision frequency treatment to train diary model and for model validation).

Figure A3. Welfare loss predictions using a fixed quota instead of fixed poverty line approach.

Figure A4. Welfare losses after removing household size from PMT input list, by distinguishing between urban and rural households, and the age of the household head.

Figure A5. Welfare losses if consumption is converted to account for household economies of scale. Notes: Welfare loss computed as percentage change in welfare in comparison to perfect targeting assuming different levels of inequality aversion. Evaluation based on the same test data, but models were trained with data only including recall or diary data.

Table A7. Month-on-month variation in household size, Malawi phone survey

Figure A6. Variance of predicted poverty due to household size sampling variability.
