The Causes and Consequences of Refugee Flows: A Contemporary Reanalysis



INTRODUCTION
The world faces a forcible displacement crisis: more than one hundred million individuals have either fled their countries or been displaced within them.1 International displacement has risen dramatically in recent decades due to conflict and political instability in countries from Afghanistan to Venezuela, with Ukraine producing the largest outflow of refugees in a single year in recorded history.2 Forcible displacement has tremendous human costs, and its causes and consequences are the focus of significant social science research. However, empirical studies of refugee flows have been limited by a lack of reliable data.
Existing results are often derived from refugee population stock measures (total end-of-year population counts of refugees within host countries) rather than actual flow estimates. Stock-based results are subject to fundamental questions about internal validity. Much prior research is based on data collected before 2000, when the United Nations Refugee Agency (UNHCR) began systematically tracking refugee and asylum seeker (REF/ASY)3 numbers globally. The quality of pre-2000 data is limited, with many missing values that much existing research does not properly account for. Analyses focused on origin countries are often missing data because asylum countries were not then reporting arrivals to the UNHCR; separately, asylum countries that were reporting arrival data often did not collect national origin.
In this study, we seek to address these issues. Following a multiyear collaboration with the UNHCR, culminating in the release of new international displacement flow data (1962-2022),4 we reevaluate 28 studies published over decades on the causes and consequences of refugee flows.5 Using country-specific reporting timelines, we update these articles' results to account for possible measurement error introduced when missing values were treated as 0s. We also temporally extend these studies so that results are based on contemporary observations less affected by historical data-quality issues. Our goal is to understand how existing results change when we address these issues, an important question for a body of work with significant influence on academic and policy discourses.
In brief, we observe large inconsistencies between the newly released flow numbers and the stock-based estimates upon which decades of research are based, and we find that the inappropriate treatment of missing historical values is widespread. We produce significantly different results when we replicate the existing literature following three different approaches: first, replacing the old stock-based data with the newly introduced flow data; second, correcting the treatment of missing historical values; and third, temporally extending the studies.
Specifically, we find that in 19 articles on flow causes, ≈ 74% of findings replicate; in 9 articles focused on flow consequences, only ≈ 50% replicate. These percentages are conservative: we assess only whether previously reported results maintain statistical significance and/or whether the sign of estimated coefficients reverses. A stricter standard would also assess substantial changes in magnitude (toward 0), likely driving these percentages lower still.
A subset of our replications are "theory-based": these consist of reanalyses of studies that focus on refugee flows but adopt some other measure of REF/ASY (e.g., stocks).6 These studies contribute substantially to the ongoing debate about the effects of refugees on political violence in host countries; as Savun and Gineste (2019, 88) note, "security consequences associated with refugee flows are among the most widely studied aspects of forced migration." Thus, we view these replications as complements to the original work, offering new empirical insights, and we clearly distinguish in the results which replications are theory-based. In our replications, effects of refugees on security conditions are attenuated, suggesting that the literature's identification of refugees as sources of violent instability is likely overstated. The contrast between our findings and those based on stocks points to potentially important differences in the effects of refugee inflows versus sustained presence.
The new data also reveal that forced displacement is much more common than reported: Rubin and Moore (2007, 91) note that "[f]orced migration is a relatively rare event … around 82% [of country-year cases] experienced no forced migration …." When we extend this study to 2000-21, we calculate that number to be 22%.7

Refugee Measurement
The UNHCR has tracked REF/ASY flows since 1962. Flow records were used primarily for operational purposes and were not centralized until recently. In 2019, the UNHCR released a draft flow dataset. We engaged in extensive discussions with UNHCR staff about the data, including possible additions/modifications to capture new international movements and apparent inconsistencies across data versions.8 We also compared statistical results using redacted and unredacted data versions to help validate the UNHCR's decision to release only redacted data to protect individual asylum seekers' identities in cases of very small dyadic flows.
The UNHCR ultimately released the final "Forced Displacement Flow Dataset" in 2022. The new flow data are depicted in Figure 1, with additional details in Appendix A.1.1 of the Supplementary Material.

Flows and Stock-Based Estimates Compared
Actual flows diverge from stock-based estimates for several reasons.9 First, researchers estimate flows from stocks as follows, where $s_{i,t}$ is the end-of-year stock, $\hat{f}_{i,t}$ is the estimated flow, i denotes either the directed dyad (i.e., sending-receiving country pair) or the asylum/origin country, depending on the unit of interest, and t refers to the year:

$$\hat{f}_{i,t} = s_{i,t} - s_{i,t-1}$$

Stock-based estimates calculated this way do not account for naturalizations, returns, or resettlements (hereafter "stock departures"); births and deaths; or any other variable affecting host-country stock levels. We find that a substantial number of (directed-dyad) cases involve simultaneous (same-year) stock departures and directed flows. In ≈ 45.40% of asylum-country-year observations, inflows and stock departures co-occur; in ≈ 20.81% of cases, stock departures are greater than or equal to inflows (see Figure 2). By capturing new arrivals, the new flow data avoid this issue.
Second, under the stock-based estimation approach, years of "negative" flow are set to 0. We calculate that nearly one-third (≈ 29.68%) of all first-differenced observations result in negative values that are converted to 0s. In just under half of these directed-dyad-year cases (≈ 48.56%), the new data report positive values instead.
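The first-differencing and zero-truncation steps described above can be illustrated with a minimal Python sketch. The function name and all numbers are hypothetical and chosen only for illustration; the sketch assumes that stocks change solely through arrivals and stock departures (ignoring births, deaths, and definitional revisions).

```python
def stock_based_flows(stocks):
    """First-difference end-of-year stocks; negative differences are set to 0."""
    return [max(0, curr - prev) for prev, curr in zip(stocks, stocks[1:])]


# Hypothetical asylum country: yearly arrivals and stock departures
# (naturalizations, returns, resettlements). Values are illustrative only.
arrivals = [500, 400, 300]
departures = [100, 600, 300]

# Build the end-of-year stock series from an assumed starting stock of 1,000.
stocks = [1000]
for arrived, departed in zip(arrivals, departures):
    stocks.append(stocks[-1] + arrived - departed)

estimates = stock_based_flows(stocks)

for year, (actual, estimate) in enumerate(zip(arrivals, estimates), start=1):
    print(f"year {year}: actual arrivals = {actual}, stock-based estimate = {estimate}")
```

In this toy series, every stock-based estimate is at or below the actual number of arrivals, and the year in which departures exceed arrivals is recorded as a flow of 0 even though people arrived.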
Third, stock-based flow estimates suffer from major left-censoring. The UNHCR begins tracking REF/ASY arrivals for different countries in different years. Under the first-differences approach, estimates may capture preexisting populations (not inflows) for the first year in which a positive value is reported. To quantify this potential issue, we compare the sum of refugee stocks for all directed-dyad-year observations corresponding to the first year of UNHCR reporting to the sum of the new data's flows for those same years. Results suggest that many stock-based flow estimates capture preexisting refugee populations rather than new flows, a source of significant potential error in statistical estimates (see Figure 2). Preexisting population values do not enter into the new flow data.
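A short Python sketch can make the left-censoring problem concrete. It assumes, hypothetically, that unreported years are treated as a stock of 0 before first-differencing; the years and values are invented for illustration and are not taken from the UNHCR data.

```python
# Hypothetical reported end-of-year stocks; None marks years before reporting began.
reported_stock = {1990: None, 1991: None, 1992: 80_000, 1993: 82_000}

# Hypothetical actual arrivals in the reported years.
actual_inflow = {1992: 3_000, 1993: 2_500}


def naive_estimates(stock_by_year):
    """First-difference stocks after treating unreported years as 0 (assumed practice)."""
    years = sorted(stock_by_year)
    filled = {year: (stock_by_year[year] or 0) for year in years}
    return {
        year: max(0, filled[year] - filled[prev])
        for prev, year in zip(years, years[1:])
    }


naive = naive_estimates(reported_stock)
first_reported_year = min(y for y, v in reported_stock.items() if v is not None)

print("first reported year:", first_reported_year)
print("stock-based 'inflow' in that year:", naive[first_reported_year])
print("actual inflow in that year:", actual_inflow[first_reported_year])
```

Under these assumptions, the entire preexisting population is booked as an inflow in the first reported year, while later years are unaffected by this particular problem.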
Fourth, until 2007, stock data include population values for third-country resettlements, erroneously depicting "flows" into countries where REF/ASY eventually resettled, sometimes years after displacement. The new data prioritize asylum seeker applications to reflect increases in the year of their actual arrival.
Fifth, the stock data include "non-flow increases": adjustments to stock values due to methodological revision, legislative change, or other host-country changes to how REF/ASY are defined or calculated. These positive reestimations produce apparent flow increases that do not reflect actual new arrivals. In the new flow data, these changes have been removed.10
Sixth, stock-based flows lag actual flows in countries that use their asylum systems to grant refugee status; asylum seekers enter into stock data only after their asylum applications have been processed and approved, sometimes years after arrival.11 The new flow data prioritize asylum seeker applications to capture movements during the years in which they occurred.
Overall, how do the new flows compare with stock-based estimates? The stock $s_{i,t}$ is a function of stock departures. When stock departures occur simultaneously (within the same year) with inflows, measures of flows are attenuated: $\hat{f}_{i,t} \le f_{i,t}$, where $\hat{f}_{i,t}$ is the stock-based estimate and $f_{i,t}$ the actual flow (i.e., the number of stock departures in a given year reduces the calculable inflows by that number). This is consistent with patterns in the data: in ≈ 81% of directed-dyad-year cases, flow values are strictly larger than stock-based estimates (and larger than or equal in ≈ 84% of cases). Overall, we calculate that the new flow data capture 14,227,372 more flows than the stock-based data from 1962 to 2022:12 for every ≈ 5 flows reported under the stock-based approach, the new data report one additional flow.

Figure 2 notes. Plot 1: The figure shows that significant numbers of refugees/asylum seekers are often naturalized, resettled, and/or returned in the same years that refugees continue to arrive. In such years, stock-based inflow estimates will be skewed downward. Data on naturalizations, resettlements, returns, and inflows provided by the UNHCR (UNHCR 2021a, 2021b). Plot 2: This figure provides strong evidence of a left-censoring effect in stock-based flow estimates. Specifically, inflows estimated using stocks show significant spikes on the first year of UNHCR country reporting that likely reflect preexisting refugee populations, not actual new inflows. Plot 3: This figure displays estimated "bias" in stock-based estimates of inflows given by $\frac{\text{inflow}_{i,t}}{\text{stock}_{i,t} - \text{stock}_{i,t-1}}$. The figure displays the distribution of resulting percentages for (a) all asylum-country-year observations and (b) all origin-country-year observations. Overall, these percentages fall well above 100%, indicating that stock-based estimates generally significantly underestimate actual inflows.
We directly compare flow values with their stock-based estimates, estimating bias as $\text{inflow}_{i,t} / (\text{stock}_{i,t} - \text{stock}_{i,t-1})$, expressed as a percentage. Figure 2 displays the distribution of resulting percentages for (a) all asylum-country-year observations and (b) all origin-country-year observations. Overall, these percentages fall above 100%.13,14 In Appendix A.1.3 of the Supplementary Material, we supplement this analysis by reporting for each asylum country the correlation between inflows to that country and stock-based estimates. Results indicate that stock-based estimates tend to significantly underestimate flows; in > 10% of cases, the two variables are either not correlated or are negatively correlated.
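The bias ratio can be sketched in a few lines of Python. The series below are hypothetical, and the function name is ours; the sketch drops years with a non-positive stock change (dropping zero changes is our assumption, made to avoid division by zero) and shows why a median can be a more informative summary than a mean when a few ratios are extreme.

```python
from statistics import mean, median


def bias_percentages(inflows, stocks):
    """Return 100 * inflow_t / (stock_t - stock_{t-1}) for years with a positive stock change."""
    percentages = []
    for t in range(1, len(stocks)):
        delta = stocks[t] - stocks[t - 1]
        if delta <= 0:
            # The stock-based estimate would be set to 0 here, so the ratio is undefined.
            continue
        percentages.append(100 * inflows[t] / delta)
    return percentages


# Hypothetical end-of-year stocks and actual inflows for one country.
# inflows[0] is a placeholder: no stock change can be computed for the first year.
stocks = [1000, 1100, 1300, 1250, 1260]
inflows = [0, 120, 300, 50, 900]

pct = bias_percentages(inflows, stocks)
pct_mean = mean(pct)
pct_median = median(pct)

print("bias percentages:", pct)
print("mean:", pct_mean, "median:", pct_median)
```

Values above 100% indicate that actual inflows exceed the stock-based estimate; a single year with a small stock change and a large inflow is enough to pull the mean far from the typical value.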

Pre-2000 Data Missingness and Quality Issues
Three major issues are associated with UNHCR data generation and reporting patterns before the year 2000, when the UNHCR standardized approaches to data collection and when many asylum states adopted information and communication technologies that significantly improved reporting.15 The empirical problems discussed below persist beyond the year 2000, but are significantly reduced; we use the pre-/post-2000 framing for analytical parsimony.16
The first empirical issue is the inappropriate treatment of missing data. Until recently, centralized data on when the UNHCR began tracking REF/ASY in each country were unavailable. In the absence of positive displacement values, many panel datasets set country-/dyad-year observations to 0.17,18 While some missing positive refugee values for yearly country/dyadic observations may reflect true 0s, others still reflect positive values that the UNHCR did not collect. Nearly every study we replicated follows the practice of setting such observations to 0 when they precede country-specific data collection timelines. For an asylum-country-year panel dataset 1962-99, this practice results in ≈ 49.82% of observations being set to 0.19 We supplement our analysis with UNHCR-supplied data on centralized collection efforts by country from 1970 on.20 Patterns in data collection are depicted in Figure 3. Many countries' data do not appear in centralized records until long after statistics began to be collected. Before 2000, UNHCR collected asylum seeker data only from several dozen industrialized countries; in 2000, when they centralized data collection, that number jumped to 137 countries, with more countries being added every year.
Using the new flow data, we produce panel datasets with observations set to NA (rather than 0) for years before data collection began.21 As we show in Appendix A.3.1 of the Supplementary Material (and in the full set of results posted in a secondary appendix in the Dataverse; see Shaver et al. 2024), this replacement produces additional changes in several results.
The second empirical issue emerges because UNHCR records are mostly constructed from asylum state records: studies using origin-country panel datasets are missing some unknown (potentially very substantial) number of REF/ASY outflows. These missing values were not captured by corresponding inflow data from asylum countries that were not yet reporting data to the UNHCR (see Figure 3). Approximately 68% of the "causes" studies we replicate (and ≈ 40% of all of the studies we replicate) use origin-country panel datasets.
The third empirical issue is that until 2000, a significant amount of UNHCR data for tracked REF/ASY are missing national origin information. For research designs in which REF/ASY origin is relevant, missingness on this variable introduces noise (and potentially bias) to results. We display this pattern in Figure 3.
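The difference between coding pre-reporting years as 0 and coding them as NA can be shown with a small Python sketch. All names and numbers are hypothetical; the sketch assumes a single asylum country whose reporting starts in a known year, and it assumes (for illustration only) that a year without a recorded value after reporting begins is a true 0.

```python
def build_panel(years, reported, reporting_start, pre_reporting_value):
    """Build a country-year panel, filling years before reporting began with a chosen value."""
    panel = {}
    for year in years:
        if year < reporting_start:
            panel[year] = pre_reporting_value  # 0 (common practice) or None (NA)
        else:
            # Assumption: after reporting starts, an absent record is a true 0.
            panel[year] = reported.get(year, 0)
    return panel


def mean_observed(panel):
    """Average over non-missing observations only."""
    values = [v for v in panel.values() if v is not None]
    return sum(values) / len(values)


years = range(1995, 2001)
reported = {1998: 400, 1999: 600, 2000: 500}  # hypothetical inflows

zero_panel = build_panel(years, reported, reporting_start=1998, pre_reporting_value=0)
na_panel = build_panel(years, reported, reporting_start=1998, pre_reporting_value=None)

print("mean inflow, pre-reporting years coded 0: ", mean_observed(zero_panel))
print("mean inflow, pre-reporting years coded NA:", mean_observed(na_panel))
```

In this toy panel, zero-imputation halves the apparent average inflow, which is the kind of systematic skew toward 0 that the NA coding is meant to avoid.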
We cannot directly correct for these final two issues. However, by 2000, these problems are substantially eliminated. For this reason, our analyses include contemporary replications: we extend studies through the most recent date possible and analyze them from the period beginning in 2000. These are our preferred specifications, as all four issues that we raise are either resolved or substantially mitigated.22

12 This figure excludes stateless and Palestinian refugees from both the new flow and stock datasets (see Appendix A.1.1 of the Supplementary Material).
13 Outlying values are produced when flows significantly exceed the stock-based estimates, with resulting means exceeding 1,000%; we therefore use median values (134.92% and 212.84% for the asylum and origin countries, respectively).
14 This approach drops observations in cases where Δstock < 0 and values are set to 0. At the asylum- and origin-country levels, these percentages are 15.13% and 24.83%, respectively.
15 See Marbach (2018) for additional discussion of these issues and a proposed solution.
16 See Appendix A.1.4 of the Supplementary Material for a more detailed treatment.
17 Typically, authors constructed balanced panels, setting country/dyad-year observations to the same starting year and assigning 0s to observations without refugee values.
18 This practice is widespread. Of the 28 articles we analyze, 24 impute 0s; 4 avoided this issue by limiting their analyses to modern periods.
19 For a directed-dyad panel, the percentage is ≈ 50.75%.
20 The exception is the set of 37 countries that supplied data to the UNHCR on asylum seeker flows 1970 to 1999. For these countries, initial reporting years are unknown; we know only that these 37 countries reported asylum figures for some or all years during this period. We discuss this in more detail later.
21 As discussed below, data on flows were secured at the level of the asylum country (not the origin country). We can identify only a subset of observations with missing data in our origin country-year panel dataset. We therefore construct origin-country panel datasets using a more complicated procedure, described in Appendix A.2.4 of the Supplementary Material.
22 Data issues aside, there are other reasons why contemporary study results might differ from previous results, ranging from overall displacement numbers to the evolution of the international response to flows.

REPLICATIONS
We used Google Scholar to search general and social science academic journals for articles engaging in quantitative research on global refugee flows. Our search query limited cases to those that (i) reference refugee flows, (ii) include the terms "UNHCR" and "data,"23 and (iii) were published by a major publisher,24 returning 1,556 responsive articles. We manually inspected each, eliminating studies that (i) did not deal with causes or consequences of refugee flows, (ii) were entirely qualitative, (iii) incorporated data on refugees only as a control or in secondary (tertiary, etc.) analyses, or (iv) were single-country studies. This produced 35 qualifying studies. We were unable to obtain replication materials for seven of these. We replicated the remaining 28 by (i) correcting incorrectly imputed zeros and (ii) replacing the old stock-based measures with the new flow data. We assess whether previously reported results maintain statistical significance and/or whether the sign of estimated coefficients reverses.25
A detailed description of these articles appears in Appendix A.2.2 of the Supplementary Material; more information on replication procedures is included in Appendix A.2.4 of the Supplementary Material.

Figure 3 notes. Pre-2000 origin-country data are very likely missing many outflows from those countries, for many of the countries to which they fled did not report them. Plot 3: This figure plots (in green) the number of countries reporting refugee or asylum seeker inflow data to the UNHCR. Lines in red and light blue disaggregate yearly country totals between refugees and asylum seekers, respectively. The darker blue depicts the total number of countries per year for which actual refugee or asylum seeker numbers were reported. Differences in the green and blue lines may reflect cases in which potential asylum states had data sharing agreements in place with the UNHCR but did not have any numbers to actually report. Plot 4: This figure displays stock-based inflow estimates for each asylum country for the decade preceding and following the year in which a UNHCR reporting process was put in place in that country. Mean stock-based inflow estimates for all asylum countries are plotted for each year. Points depict individual asylum-country values. If post-reporting process trends generally reflect actual pre-reporting process trends, then pre-reporting values adopted by scholars (virtually all 0s) are likely systematically skewed toward 0.
By assessing how existing results change when we address the empirical issues described above, this effort falls into the class of "broad" (Dafoe 2014), "statistical" (Hamermesh 2007), or "wide" (Pesaran 2003) replications involving reestimating test results with the use of new data or related modifications (e.g., adopting alternatively constructed variables). The studies we replicate form the backbone of research on the causes and consequences of refugee flows and have influenced research agendas, curricula, and policy discourses.26 Replication results inform causal inferences in cases in which the original authors' testing strategies were well-identified (save for the empirical corrections we apply), but more generally, they update our understanding of the "published record … recognized [as] state of the art" (King 2006, 119), providing direction for additional scholarly inquiry and the reexamination of their policy implications.27
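The replication criterion used in this section (whether a result maintains statistical significance and whether the coefficient's sign reverses) can be sketched in Python as follows. This is one possible reading rather than the procedure actually used: the 0.05 threshold, the treatment of originally null results, the function name, and every coefficient and p-value below are assumptions made for illustration.

```python
def replicates(orig_coef, orig_p, new_coef, new_p, alpha=0.05):
    """Classify a test as replicating under an assumed reading of the criterion.

    - If significance status changes between original and replication: does not replicate.
    - If both are significant: replicates only when the coefficient keeps its sign.
    - If both are non-significant: treated as replicating (assumption).
    """
    orig_sig = orig_p < alpha
    new_sig = new_p < alpha
    if orig_sig != new_sig:
        return False
    if orig_sig and new_sig:
        return (orig_coef > 0) == (new_coef > 0)
    return True


# Hypothetical tests: (original coef, original p, replication coef, replication p)
tests = [
    (0.8, 0.01, 0.5, 0.03),   # still significant, same sign
    (0.8, 0.01, 0.5, 0.20),   # loses significance
    (0.8, 0.01, -0.5, 0.01),  # sign reverses
    (0.1, 0.40, -0.1, 0.60),  # null in both
]

outcomes = [replicates(*test) for test in tests]
share = sum(outcomes) / len(outcomes)
print("outcomes:", outcomes)
print(f"share replicating: {share:.0%}")
```

A stricter variant could additionally compare coefficient magnitudes, which would tend to lower the share of tests counted as replicating.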

RESULTS AND DISCUSSION
Results are succinctly presented in Tables A1 and A2 in Appendix A.3.1 of the Supplementary Material. Complete replication regression results (alongside original estimates) appear in a supplementary appendix posted to Dataverse (Shaver et al. 2024). Of the 62 total tests from causes articles, ≈ 74% replicate. More significantly, of 20 total tests from consequences articles, only ≈ 50% replicate.28 We classify 14 of the 28 (50%) articles we replicate as "plausibly causally identified."29 Of these, ≈ 69% of results on causes replicate and ≈ 50% focused on consequences replicate. This is quantitatively and substantively consistent with the results in our full sample. We present and discuss a select set of results below.

Causes
With respect to the causes of flows, updated study results rarely overturned original findings; however, they frequently supported hypotheses discarded by the original authors as statistically unsupported.
Our findings confirm the central roles played by the "push factors" of political violence and state repression in driving international displacement. We corroborate results that link civil war/insurgency to outflows, estimating larger effects of these factors than did Echevarria and Gardeazabal (2021), Davenport, Moore, and Poe (2003), and Moore and Shellman (2004); we also uncover a larger effect of state repression on outflows than Rubin and Moore (2007). Steele (2017, 9) has observed that "current understanding tends to equate wars or violence with an increase in displacement, but we can and need to be more precise." Our replications amplify her call. Future work might incorporate the new flow data into global analyses of potential heterogeneous effects across factors such as the timing of violence, its spatial distribution and intensity, and the technologies used to perpetrate it.
We also replicated papers focused on "pull factors" that incentivize international over internal displacement and influence the choice of international destination. In replications of Moore and Shellman (2004; 2007) and Turkoglu and Chadefaux (2019), we find limited support for the idea that refugees are motivated by economic opportunity or democratic institutions in destination countries. This finding contrasts with the framing of asylum seekers as opportunists, as echoed in prominent political discourse. Other results raise tensions warranting further study: for instance, regarding the role of alliance dynamics, we fail to substantiate Moorthy and Brathwaite's (2019) finding that the presence of formal alliances positively influences dyadic flows. However, whereas Moore and Shellman (2007) do not find this, we do.
A more subtle theme of our replications is the underexplored role of factors discouraging or restricting individuals from seeking refuge abroad. On the one hand, some updated findings point to the role of restrictions. In our replication of Echevarria and Gardeazabal (2021), we estimate larger effects of country size, proximity to potential asylum states, and island status. We estimate a larger effect size than Moore and Shellman (2007) of potential asylum state contiguity. On the other hand, we find little connection in the Moore and Shellman (2007) replication between conflict and repression in potential asylum states and refugee inflows. These and other such findings encourage further, broader inquiry into the set of factors responsible for restricting international displacement, from border securitization (Simmons and Kenwick 2022) to severe weather and natural disasters along border regions under climate change.

Consequences
With respect to the consequences of inflows, we observe significant changes from previous results.In our replications of this seminal literature, the relationship between the arrival of refugees and the onset of war and political violence is attenuated: we find that refugees are only infrequently conduits of violence, and the conditions under which forced displacement poses a risk to host countries appear to be specific.
With respect to refugees' connection to terrorism, our replications find only partial support for Choi and Salehyan's (2013) analysis linking these variables. This is consistent with Milton, Spencer and Findley's (2013) results and our corresponding replication (though we estimate smaller effect sizes). We corroborate Polo and Wucherpfennig's (2022) causal finding that refugee influx is positively associated with terrorism in the specific case of refugees from communities with ties to transnational terrorist organizations; we find additional evidence that the association for refugees originating from countries without ties is negative. Findings highlight the potential heterogeneous treatment effects of inflows on terrorism, with potential implications for more tailored policy responses and programming.
When we reexamine work on refugees and governments' respect for human rights, we fail to confirm either of Wright and Moorthy's (2018) findings: our results indicate neither that an influx of refugees positively influences repression nor that this relationship is moderated by development.We do not recover Chu's (2020) findings relating to refugees from rival and nonrival origin states and hosts' respect for human rights.
Finally, when we replicate work linking refugees to inter- and intrastate conflict, our findings do not substantiate Salehyan and Gleditsch's (2006) seminal research associating refugees with civil war diffusion.30 Our results partially support Salehyan's (2008) findings that flows between states can provoke militarized disputes: flows between a given dyad increase the probability that the receiving state initiates a conflict with the sender, but we do not find that they increase the probability of sender-state initiation.
Collectively, these findings speak against the new politics of fear, challenging political narratives that frame refugees as security threats and the restrictive state policies they underpin. Our findings do not indicate that there are no effects of refugee inflows on violent instability, but it seems that refugees play a role in producing or facilitating political violence, wittingly or not, only under particular circumstances. The differences between our findings and those based on refugee stocks highlight potentially important differences between the effects of refugee inflows into a country and the effects of sustained refugee presence. These differences warrant further exploration, particularly because of the "growing difficulty [of] uprooted people … in finding lasting solutions to their plight" (Crisp 2021, 3).
We conclude by noting a publication bias against null results (Esarey and Wu 2016; Gerber and Malhotra 2008). Our replications sometimes supported hypotheses that were discarded when tested with lower-quality data because they lacked statistical support. Other meaningful relationships relating to refugee flows may have therefore gone undiscovered, which scholars might now retest with these new data.

SUPPLEMENTARY MATERIAL
To view supplementary material for this article, please visit https://doi.org/10.1017/S0003055424000285.

DATA AVAILABILITY STATEMENT
Research documentation and data that support the findings of this study are openly available at the American Political Science Review Dataverse: https://doi.org/10.7910/DVN/JADOZL. Limitations on data availability are discussed in the text and/or Supplementary Material.

AUTHORS CONTRIBUTIONS
The first two authors are the primary authors and are listed in order of contribution. The remaining authors are (former) Political Violence Lab members, listed in order of their respective contributions (and alphabetically where contributions were equal).

ACKNOWLEDGEMENTS
We are grateful for the opportunity to work with the United Nations Refugee Agency (UNHCR) on its release of the data that enabled this project. We thank the University of California, Merced's University Library Collection Services for supporting the acquisition of data for this project. The Political Violence Lab, based at the University of California, Merced, acknowledges the support it has received from the University of California, Merced and the University of California Washington Center. We further thank the American Political Science Review editorial team and the three anonymous referees, Lamis Abdelaaty, Kyle Beardsley, Mietek Boduszyński, Alex Bollfrass, Alex Braithwaite, Mateo Villamizar Chaparro, Elaine Denny, James Fearon, Guy Grossman, Biz Herman, Connor Huff, Adam Lichtenheld, Bryce Loidolt, Eric Mvukiyehe, Fouad Pervez, Ryan Powers, David Siegel, Abbey Steele, Yang-Yang Zhou, and

FIGURE 1. Global Dyad-Year Refugee and Asylum Seeker Outflows

FIGURE 2. Measurement Issues Associated with Stock-Based Flow Data
participants of Duke University's Political Economy Seminar Series; Harvard University's Political Violence Workshop; the Annual Peace Science Society International Conference; and the Graduate Students in International Relations etc. (GSISE) Seminar for comments on this project. For research assistance, we gratefully acknowledge Political Violence Lab interns Mairead Allen, Mia