How urban riots influence political behaviour: Vote choices after the 2011 London riots

What are the electoral consequences of urban riots? We argue that riots highlight the economic and social problems suffered by those who participate, inducing potential electoral allies to mobilize. These allies can then punish local incumbents at the ballot box. We test this hypothesis with fine-grained geographic data that capture how exposure to the 2011 London riots changed vote choices in the subsequent 2012 mayoral election. We find that physical proximity to both riot locations and the homes of rioters raised turnout and reduced the vote for the incumbent Conservative mayor. These results are partly driven by a change in the turnout and vote choices of white residents. This provides support for the view that riots can help shift votes against incumbents who oppose the implied policy goals of rioters.


Introduction
Minorities have limited leverage at the ballot box: their turnout and vote choices can be outweighed by those of larger groups. For their preferred candidates to be elected, minorities need other groups to mobilize and vote for them. We argue that riots can help minorities achieve this: riots publicly communicate the challenges these groups face, and this information can induce larger groups to turn out and support the candidates more likely to address these problems. 1 In this way, riots can be politically salient despite their often inchoate nature. Enos, Kaufman and Sands (2019) examine citizen responses after the Los Angeles riot of 1992 and find that the attitudes and behaviours of both white and Black voters moved in a progressive direction, with increased registrations and greater voter support in ballot initiatives. 2 Yet it is possible for individuals to change their views on specific issues without changing their choice of candidate.
Gillion (2020) looks at voting in elections and finds that local protests in Democratic areas increased turnout and support for the Democrats. He interprets this as evidence that the "silent majority" was sympathetic to the rioters. This however does not show whether riots change voting behaviour in a way that can lead to the election of more sympathetic politicians. For that, we need to look at a case where the incumbent was seen as hostile to the rioters. This article focuses on one such case.
We test whether the 2011 London riots changed voting behaviour in the 2012 election for the mayor of London. There are important reasons to expect that previous findings, for example from Los Angeles in 1992 (e.g. Enos et al (2019)), will not apply to London in 2011: the riots happened several decades apart and London has a different racial history. It is also a less segregated city than Los Angeles. Therefore our study also tests whether previous findings are generalisable.
Following the 2011 London riots, voters concerned with law and order would likely have favoured the incumbent Conservative mayor, Boris Johnson, whose party is closely associated with law and order. Therefore backlash resulting from a negative reaction to the violence (Feinberg, Willer and Kovacheff, 2017) should have resulted in an increase in support for the Conservative incumbent. 3 On the other hand, the Conservative party is associated with post-2008 financial crisis austerity and is generally seen as less sympathetic towards ethnic minorities and the poor. Consequently, a change in attitude in favour of the rioters should have manifested itself in a shift of the vote away from the Conservative Party. From this logic, the Conservative Party's vote share can be used to assess whether the riots strengthened or weakened the electoral prospects of candidates favourable to the rioters.
1 Olzak and Shanahan (1996) define a riot as a violent event involving thirty or more individuals motivated by racial grievances.
2 They build on a literature that suggests that US riots stimulated Black activism and increased support from a sympathetic public (Sears and McConahay, 1973).
In this article we measure exposure to the riots as physical proximity to the riot locations or to where the rioters lived, since the rioters' most likely allies are those individuals who were physically proximate to the events. These two measures likely capture different mechanisms: proximity to riot location would have provided first-hand information, while proximity to the homes of rioters would have provided information through social interactions and local networks. Exposure to a riot captures an average between positive and negative responses; e.g. some people are concerned and/or fearful and vote Conservative, others become informed and switch their vote away from the Conservatives. The size of these groups and of their responses will depend on the type of proximity, i.e. whether it is to riot location or to rioter residence.
We deploy a difference-in-differences specification that takes advantage of the localised data that is generated by riots that happen in urban areas (Dancygier, 2010). We estimate treatment effects on each electoral ward, and find that exposure to the riots increased turnout and reduced the Conservative party's vote share in the first election after 2011. 4 Since the Conservative party was associated with austerity and seen as the party of law and order, these shifts are consistent with our hypothesis that the riots helped mobilize voters in favour of parties more sympathetic to the rioters.
3 Most existing work finds no evidence of backlash after the 1960s riots in the US (e.g. Bellisfield, 1972), but recently Wasow (2020) has shown that violent protest in the 1960s shifted votes to the Republicans.
4 Consequently, we are only estimating the short-term effect of the riots.
To delve deeper into our results, we turn to ecological inference estimates and focus on the behaviour of white voters. We find that our aggregate results are in part due to white voters who turned out more and reduced their votes for the incumbent Conservative mayor as a result of exposure to the riots (where exposure is defined as either living close to where the riots happened or to where the rioters lived). 5

Case selection and research question
The 2011 London riots followed the police shooting of a Black suspect, Mark Duggan, on 4 August, which prompted a protest near the site of the shooting that turned into generalised unrest across London and other parts of England. The riots lasted several days, generating substantial media and political attention and leading to damage to property, numerous arrests, and subsequent criminal prosecutions. Spatial research shows that rioters conformed to the rational model of decision-making, with riots occurring close to areas of deprivation (Baudains, Braithwaite and Johnson, 2013b; Kawalerowicz and Biggs, 2015; Leon-Ablan and Kawalerowicz, 2021). 6 We take media reports of riots and geocode their locations. Our unit of observation is the electoral ward, and we use data from three mayoral elections (2004, 2008, and 2012): turnout for 2008 and 2012, and vote shares by party for 2004, 2008, and 2012. The mayoral election is London-wide and uses the supplementary vote (SV) system, which means that both first and second preferences count. We have data on 3,552 riot-related arrests from the Metropolitan Police. Details of sources and variable construction are in the online appendix.
5 Black voters are limited in their ability to punish the Conservative incumbent because they overwhelmingly are Labour supporters (Sanders et al., 2014), and so there are few votes to be switched away from the Conservatives. Using ecological inference (King, 1997), we find that in the 2008 mayoral election, the last before the riots, 0.4 per cent of Black voters and 41.3 per cent of white voters in our sample voted for the Conservative candidate. We find no evidence that exposure to the riots changed the turnout of Black voters. This absence of change is consistent with empowerment theory (Bobo and Gilliam, 1990): when individuals lack empowerment, their political involvement is low. It is also consistent with signalling theory (e.g. Gillion, 2020; Gause, 2020): riots are unlikely to convey information (e.g. about police brutality) that the Black community does not already know. In this sense riots can be seen as opening the eyes of a "blind" white majority.
6 The top three reasons given by participants in the 2011 London riots were poverty, policing, and government policy (Lewis et al., 2011).

Identification and specification
We use a difference-in-differences estimator, where the "policy change" is the riots in 2011, with treatment status defined as either proximity to the riots or proximity to where the rioters lived. 7 Riots are likely to change the behaviour of those who are more closely exposed (Gillion, 2020). For example, those who observe the violence first-hand or know it is happening near their homes will respond differently from those whose exposure is exclusively through television news reports. In short, we exploit variation in treatment intensity induced by distance to riot location and rioter residence.
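To make the estimator concrete, the following minimal Python sketch computes a two-group, two-period difference-in-differences from cell means. The function name `did_estimate` and the ward-level numbers are hypothetical and chosen only for illustration; the estimates reported in this article come from a regression with controls and borough and year effects, not from this simplified calculation.

```python
from statistics import mean


def did_estimate(rows):
    """Two-by-two difference-in-differences from cell means.

    rows: iterable of (treated, post, outcome) tuples, where treated and
    post are 0 or 1 and outcome is a ward-level quantity such as the
    Conservative vote share.
    """
    cells = {(g, p): [] for g in (0, 1) for p in (0, 1)}
    for treated, post, outcome in rows:
        cells[(treated, post)].append(outcome)
    m = {key: mean(values) for key, values in cells.items()}
    # Change in treated wards minus change in control wards.
    return (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])


# Hypothetical Conservative vote shares (per cent); NOT the article's data.
rows = [
    (1, 0, 40.0), (1, 0, 44.0),  # treated wards, before the riots
    (1, 1, 39.0), (1, 1, 43.0),  # treated wards, after the riots
    (0, 0, 41.0), (0, 0, 45.0),  # control wards, before the riots
    (0, 1, 42.0), (0, 1, 44.0),  # control wards, after the riots
]
estimate = did_estimate(rows)
```

With these made-up values the treated wards fall by one point on average while the control wards do not change, so the sketch returns -1.0.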
We construct a treatment group of electoral wards with centroids between 0.5 and 3 kms from the locations where the riots happened, and a control group of all electoral wards with centroids between 3 and 5.5 kms from where riots happened. 8 We exclude all areas within 500 metres of the riots to address the concern that riots may have happened in particular types of areas (e.g. those with a large ethnic minority population or with a large number of low-skilled workers). 9 Figure 1 illustrates this identification strategy. Our second treatment is proximity to rioter residence, defined as all wards that had at least one resident charged for rioting. The control group is all wards between 0.5 and 5.5 kms from the riots that had no residents charged for riot-related offences. 10 Figure 2 illustrates this identification strategy.
7 An important feature of the 2011 London riots is that participants travelled in order to congregate in a small number of locations, and so most individuals rioted away from where they lived (Baudains, Braithwaite and Johnson, 2013a).
8 Tables A1 and A2 in the online appendix show balance tests, and in our regressions we control for the factors that are significant in these tables. The average distance between wards with charged residents and the nearest riot location is 2.98 kms; hence our choice of a 3km threshold to allocate wards between the treatment and control groups. Our results are robust to varying the treatment and control bands (see the online appendix).
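The distance-band assignment described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure actually used: it assumes centroids and riot locations are given as (latitude, longitude) pairs, uses the haversine great-circle formula for straight-line distance, and treats the 3 km cut-off as belonging to the treatment band. The names `haversine_km` and `assign_group` are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def assign_group(centroid, riot_locations, inner=0.5, cutoff=3.0, outer=5.5):
    """Classify a ward by the distance from its centroid to the nearest riot.

    Returns "treatment" (inner to cutoff km), "control" (cutoff to outer km),
    or None when the ward is dropped (closer than inner or farther than outer).
    Which side of each boundary is inclusive is an assumption of this sketch.
    """
    nearest = min(haversine_km(*centroid, *riot) for riot in riot_locations)
    if nearest < inner or nearest > outer:
        return None
    return "treatment" if nearest <= cutoff else "control"


# Hypothetical coordinates, for illustration only.
riots = [(51.5, -0.1)]
group = assign_group((51.51, -0.1), riots)  # roughly 1.1 km north of the riot
```

The same function covers the robustness checks that vary the bands, since `inner`, `cutoff` and `outer` are parameters.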
The identifying assumption is that the data exhibit common trends: in the absence of the riots, vote shares and turnout would have evolved in the same way in the treatment and control wards (conditional on the controls). The riots took place in a number of locations across the city, and London is a patchwork of many small neighbourhoods that vary substantially in terms of income and ethnic and social composition (Manley and Johnston, 2014); therefore both our treatment and control groups include poor and rich areas, more and less diverse areas, and areas with high and low concentrations of working class residents. Figures A1 and A2 in the online appendix plot the data for the pre-riots period and show that trends were parallel. A violation of the common trends assumption when treatment is proximity to the riot locations would require that areas close to the riots shift away from the Conservative Party faster than other areas. Looking at Figure 1, the treated areas are those between the inner and middle rings. They follow no clear spatial pattern other than being equidistant from riot locations, and it is difficult to think of what changes these areas could have experienced relative to the control areas (wards between the middle and outer rings). A violation of the common trends assumption when treatment is proximity to the homes of rioters would require that the areas shaded dark in Figure 2 follow different trends or experience different shocks between 2008 and 2012. However, the dark and light-coloured areas in the map follow no clear spatial pattern, making shocks of the required type unlikely.
9 In the online appendix we show that our results are robust to including these areas.
10 Table A4 in the online appendix shows the number of wards near the riots (treatment 1), the number of wards that had residents charged for participating in the riots (treatment 2), and both.
11 The Conservative and Labour parties are the two main political parties in the UK.
treatment and control areas were evolving in similar ways prior to the 2011 riots.
To look at whether treated areas changed their voting behaviour more than control areas, we estimate a difference-in-differences regression. We consider two outcomes: turnout in ward i in election t and the share of the votes cast for the Conservative candidate in ward i in election t. Depending on the specification, the variable treatment_i is either (i) a dummy that equals 1 if ward i was between 0.5 and 3 kms from a riot and 0 if it was between 3 and 5.5 kms (all other wards are dropped), or (ii) a dummy that equals 1 if at least one resident of ward i was charged with riot-related offences and 0 otherwise (looking only at wards between 0.5 and 5.5 kms from a riot). The variable post_t equals 1 for elections after the 2011 riots, and 0 otherwise. 12 The coefficient on the interaction between these two variables is the difference-in-differences estimate.
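One way to write a specification consistent with these definitions is sketched below. The notation is an assumption introduced here for illustration: $y_{it}$ is the outcome (turnout or Conservative vote share) in ward $i$ and election $t$, $X_i$ collects the ward-level controls, $\mu_{b(i)}$ is an effect for the borough containing ward $i$, and $\lambda_t$ is a year effect.

```latex
y_{it} = \alpha
       + \beta \, \mathrm{treatment}_i
       + \delta \, (\mathrm{treatment}_i \times \mathrm{post}_t)
       + X_i' \gamma
       + \mu_{b(i)} + \lambda_t + \varepsilon_{it}
```

In this notation $\delta$ is the difference-in-differences coefficient. The variable $\mathrm{post}_t$ does not enter on its own because the year effects $\lambda_t$ absorb it.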
We also include controls measured in 2011, and borough and year effects. We adjust the standard errors to take account of spatial and serial correlation following the procedure in Conley (1999). 13
12 We have three elections (2004, 2008, and 2012), and so post_t equals 1 only for 2012, which is captured by the year dummy; this is why we do not include the non-interacted post_t in the regression equation.
13 Figures A3 and A4 show turnout and the share of the Labour vote as a function of distance to the nearest riot location. Figure A5 shows the distribution of charged individuals across wards.

Results
Table 1 shows the main results. We find that the riots generated a positive effect on turnout (columns 1 and 2) in the treated areas (those close to a riot or to a rioter residence). They also led to a larger decrease in the fraction of the vote received by the Conservative candidate in treated areas: the interaction shows that in treatment wards the Conservative vote share fell 1.5 percentage points more (column 3) or 2.2 percentage points more (column 4) than it did in control wards. Given that the election was won by the Conservative candidate Boris Johnson with 51 per cent of the vote against Labour's 48.5 per cent, the effects are quantitatively large. 14 These results are robust to considering as treated only those wards with more than the median number of charged individuals (Table A6 in the online appendix) and to changing the size of the treatment and control groups (Tables A7, A8 and A9). The results are also robust to two placebo tests. The first takes as treated those areas near Charing Cross station (the centre of London) and as control those areas that are farther away (Table A10 and Figure A7), and finds no effect. The second placebo test uses propensity score matching to find areas similar to those that experienced riots, but that did not experience any unrest. Treatment is then defined on the basis of distance to these alternative riot locations (Table A11 and Figure A8). Again, this placebo test does not replicate the results in Table 1.
14 Figure A6 shows these coefficients graphically.
Notes to Tables 1 and 2: All columns are estimated by OLS. Standard errors in parentheses; *** p<0.001, ** p<0.01, * p<0.05, + p<0.10. Standard errors are adjusted for spatial correlation following the procedure in Conley (1999). Variable definitions and sources can be found in the online appendix.

Ecological inference results
To explore our results in more detail, we examine whether the changes we found were driven by the behaviour of a particular subgroup of the population. We do this by replicating the estimation for white and for Black voters, using ecological inference. 15 Table 2 shows the results: we find evidence that treatment has a positive effect on the turnout of white voters, who also switch away from the Conservative party at a faster rate than whites in control areas. Black voters also switch away from the Conservatives, but there is no evidence of a change in Black turnout. 16 Following Bobo and Gilliam (1990), the relative lack of empowerment felt by Black voters may explain the low and unchanging turnout. It is also possible that, consistent with signalling theory (e.g. Gause, 2020), the riots conveyed no new information to these voters.

Conclusion
We contribute to understandings of the effect of political violence on voting behaviour. Drawing on the work of Enos et al (2019), we hypothesized that riots can raise voters' awareness of the problems and challenges faced by the groups that riot, and that this translates into electoral support for candidates with more progressive objectives. Our case involves a vote where the incumbent was seen as unsympathetic to the rioters, and therefore extends Enos et al (2019)'s work on ballot initiatives to the core democratic process of electing a representative. Our findings from the 2011 London riots confirm our expectations. We show that it is not just proximity to the riots that matters, but also closeness to where rioters lived. Future work should delve deeper into the mechanisms that connect these two measures of exposure to the changes in voting behaviour they generate. Finally, we uncover some evidence suggesting the results are partly due to a change in the behaviour of white voters, both in terms of turnout and vote choice. Black voters shift their vote away from the Conservatives, but show no change in turnout; we hypothesize that this is due to a feeling of effective (if not formal) disenfranchisement. Further work should address this issue.
In a period of increasing minority protest and violence, these dynamics may point to the future nature of electoral politics. Rather than the African American voting blocs identified by Fording (1997; 2001) as a result of the 1960s riots, recent riots have elicited a more heterogeneous and cross-racial response. In the era of Black Lives Matter, there may be long-term consequences for voting behaviour, mobilising white voters in support of progressive parties.
16 These results are robust to changing the size of the treatment and control groups; see Tables A12, A13, A14 in the online appendix. They are also robust to using weighted least squares instead of OLS, as shown in Table A5 in the online appendix.

Treatment variables
Proximity to riots (treated (near riot)): The riot locations are taken from newspaper and media reports and geocoded. A ward is considered to have been near the riots if its centroid is between 0.5 and 3 kms from where a riot happened. Distance is measured as the crow flies.
Proximity to the residence of rioters (treated (near rioter)): The data on where the charged rioters lived is from the Metropolitan Police. This is at the LSOA level, which we aggregate up to the electoral ward level. Wards are considered treated if at least one of their residents was charged.
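The aggregation from LSOAs to wards behind the second treatment variable can be sketched as follows. This is a minimal illustration, assuming the police data can be reduced to a count of charged residents per LSOA and that a lookup from LSOA to ward is available; the function name `ward_treatment`, the `min_charged` parameter and the identifiers in the example are hypothetical.

```python
def ward_treatment(lsoa_charged, lsoa_to_ward, min_charged=1):
    """Aggregate LSOA-level counts of charged residents to wards.

    lsoa_charged: dict mapping LSOA code -> number of residents charged.
    lsoa_to_ward: dict mapping LSOA code -> ward code.
    Returns a dict mapping ward code -> 1 if the ward total is at least
    min_charged, else 0.
    """
    # Start every ward at zero so wards with no charged residents are kept.
    totals = {ward: 0 for ward in lsoa_to_ward.values()}
    for lsoa, count in lsoa_charged.items():
        totals[lsoa_to_ward[lsoa]] += count
    return {ward: int(total >= min_charged) for ward, total in totals.items()}


# Hypothetical lookup and counts, for illustration only.
lsoa_to_ward = {"L1": "A", "L2": "A", "L3": "B", "L4": "C"}
lsoa_charged = {"L2": 2, "L4": 1}

treated = ward_treatment(lsoa_charged, lsoa_to_ward)
heavily_treated = ward_treatment(lsoa_charged, lsoa_to_ward, min_charged=2)
```

Setting `min_charged=2` mirrors the robustness check in which only wards with two or more charged residents count as treated.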

Controls
Fraction Black: From the 2011 UK census. We use the variable that measures the fraction of black Caribbean, black African and mixed race in the ward.

A.2 Common trends
We first consider two balance tables, the first (Table A1) for the proximity to the riots treatment and the second (Table A2) for the proximity to where the rioters lived treatment.
Notes to Tables A1 and A2: The values displayed for t-tests are the differences in the means across the groups. The values displayed for F-tests are the F-statistics. Standard errors are clustered at variable ward mayorN. ***, **, and * indicate significance at the 1, 5, and 10 percent critical level.
FIGURE A1: Common trends check for Conservative vote share, where treatment is proximity to the riots (0=control; 1=treatment). Horizontal axis: year; vertical axis: Conservative vote share; series: control and treatment.
In particular, a violation of parallel trends would require that the areas close to the riots move away from the Conservative party faster than other areas. Looking at Figure 1 in the text, this corresponds to the areas between the inner and middle rings. These areas follow no clear spatial pattern, other than being equidistant from riot locations, and it is difficult to think of what changes these areas could have been experiencing relative to the control areas (wards between the middle and outer rings).

A.4 Ecological Inference
We have no individual-level voting data and so we need to estimate the fraction of whites and the fraction of Blacks who (i) turned out to vote, and who (ii) voted for the Conservative party in each electoral ward, in the local elections held in 2004, 2008 and 2012.
We do so by using the ecological inference method (EI) developed by King (1997).
A potential issue with our estimates is that they are what Herron and Shotts (2003) and Adolph et al. (2003) call second-stage estimates, which in general makes them inconsistent. A solution proposed by Adolph et al. (2003) is to use weighted least squares (WLS), as it generally produces estimates with negligible bias. We replicate the estimation in Table 2 of the main text using weighted least squares with weights that are equal to the inverse of the ei standard errors. The results are presented in Table A5 below and are largely in line with those in the main text.
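The weighting scheme can be illustrated with a short Python sketch of weighted least squares for a single regressor, where each ward's weight is the inverse of the standard error of its EI estimate, as described above. This is a simplified sketch, not the estimation code behind Table A5: it omits the controls and fixed effects, and the function name `wls_simple` and the example values are hypothetical.

```python
def wls_simple(x, y, se):
    """Weighted least squares of y on a constant and one regressor x.

    y:  second-stage dependent variable (e.g. the EI estimate for each ward).
    se: standard error of each EI estimate; the weight is 1 / se, so wards
        with noisier estimates count for less.
    Returns (intercept, slope).
    """
    w = [1.0 / s for s in se]
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Hypothetical values: x is a treatment dummy, y an EI estimate per ward.
x = [0, 0, 1, 1]
y = [1.0, 3.0, 5.0, 9.0]
se = [1.0, 1.0, 1.0, 0.5]  # the last ward is estimated most precisely
intercept, slope = wls_simple(x, y, se)
```

With a binary regressor the slope equals the difference between the weighted means of the two groups, which makes it easy to see how precisely estimated wards pull the coefficient towards their own values.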
The Ecological Inference method in King (1997) relies on three key assumptions: (i) parameters vary across wards following a truncated bivariate normal distribution, (ii) there is no spatial autocorrelation, and (iii) the parameters are uncorrelated with the regressors. Tam Cho (1998) examines the robustness of EI estimates to violations of these three key assumptions. While she finds that EI estimates are typically robust to violations of assumptions (i) and (ii), the violation of assumption (iii) can lead to biased and inconsistent coefficients. This assumption is also known as the 'no aggregation bias assumption', since it requires that no bias be introduced by the aggregation of individual-level data. In our case, this assumption would be violated if (i) the parameter varies across wards, and (ii) this variation is correlated with some other variable.
In our setting, the main concern is that the response of the two groups to the riots, in terms of their turnout and change in their Conservative vote, is conditioned by the ethnic composition of the electoral ward. We expect the turnout and vote choices of Black voters to change little on the basis of whether they live in a more or less ethnically diverse neighborhood, and so the main concern is selection of whites into particular neighborhoods. Following Enos, Kaufman, and Sands (2019), we establish the likely direction of the bias in the weighted regression. Under the plausible assumption that whites who live in more diverse neighborhoods (defined as those with a larger fraction of Black residents) are more left-leaning, our expectation is that treated whites in diverse neighborhoods will show an increase in turnout and a decrease in their Conservative vote, while treated whites in less diverse neighborhoods will show an increase in turnout but a switch in their vote in favor of the Conservatives. In other words, we expect turnout to be largely unbiased in the unweighted regressions, while the positive and negative effects on the Conservative vote will approximately cancel out.
Turning to the weighted least squares regression, it will underweight whites in diverse neighborhoods and overweight whites in less diverse areas. Hence WLS should bias the coefficient on the white vote (for the Conservative party) towards zero. In practice, the WLS results are in line with those reported in Table 2.
Table A6: Robustness to categorizing as 'treated' only wards with more than the median number of individuals charged, i.e. areas that were heavily treated. The median number of charged individuals is 2. Treatment is discrete, and equals 1 if two or more residents of the ward were charged with riot-related offences.
Notes: Standard errors in parentheses; *** p<0.001, ** p<0.01, * p<0.05, + p<0.10. Standard errors are adjusted for spatial correlation following the procedure in Conley (1999).
Robustness of the results in Table 1 to using distances: 0.5-1 kms and 1-1.5 kms. It is reassuring that the coefficients are largely in line with those in Table 1, although the small number of observations makes these estimates difficult to interpret. Notice that the number of observations drops non-linearly as a result of the reduction in distance; this is because once the distances are relatively small there will be few wards that meet the proximity condition.
Notes: Standard errors in parentheses; *** p<0.001, ** p<0.01, * p<0.05, + p<0.10. Standard errors are adjusted for spatial correlation following the procedure in Conley (1999).
Table A10: Placebo 1: We replicate the analysis in Table 1 where instead of looking at proximity to riot locations we look at proximity to Charing Cross, the train station that marks the center of London. Wards within 5kms of Charing Cross are considered to be treated, areas 5-10kms of Charing Cross are the control group (see the map in Figure A7). This addresses the possible concern that inner London areas might be different from outer neighborhoods. The absence of a significant difference between treatment and control is consistent with our understanding of London: it is a patchwork city where poor and rich, diverse and homogeneous areas are in close proximity to each other, and not organized in concentric circles like in many other large cities.
Table A11: Placebo 2: We replicate the analysis in Table 1 but instead of looking at proximity to riot locations we look at proximity to areas that are similar in key characteristics to the riot locations but did not experience any riots. The characteristics we consider are: fraction Black, fraction white, unemployment rate, fraction with high qualifications and the crime rate in 2010. We then use the Stata command teffects nnmatch and output the list of nearest neighbors that this command creates using propensity score matching. For each riot location we pick the most similar area that is in a different borough. We then use ArcGIS to place these locations on a map and calculate a new full set of distances between these locations and all the wards. The map in Figure A8 shows the original riot locations and the placebo locations. We then use these distances to construct new treatment (0.5-3kms) and control (3-5.5kms) wards. The negative coefficient on the variable did in column 1 shows that, if anything, areas less affected by the riots experienced a drop in turnout, while in Table 1 we find an increase in turnout in treated areas. There is no effect on the share of the vote that goes to the Conservative candidate.
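The selection of placebo locations can be illustrated with a simplified Python sketch. It is a stand-in for the Stata matching step, not a reproduction of it: it assumes similarity is measured by Euclidean distance on standardized covariates and that each riot area is matched to the closest non-riot area in a different borough. The function name `pick_placebos`, the field names and the values are hypothetical, and only two of the five characteristics are used to keep the example short.

```python
import math
from statistics import pstdev


def pick_placebos(riot_areas, candidates, covariates):
    """For each riot area, pick the most similar candidate in another borough.

    riot_areas, candidates: lists of dicts with "name", "borough" and one
    numeric entry per covariate. Candidates are areas without riots.
    Returns a dict mapping riot area name -> chosen placebo area name.
    """
    everyone = riot_areas + candidates
    # Standardize so that no covariate dominates because of its units.
    scale = {c: (pstdev([a[c] for a in everyone]) or 1.0) for c in covariates}

    def distance(a, b):
        return math.sqrt(sum(((a[c] - b[c]) / scale[c]) ** 2 for c in covariates))

    placebos = {}
    for riot in riot_areas:
        eligible = [c for c in candidates if c["borough"] != riot["borough"]]
        best = min(eligible, key=lambda c: distance(riot, c))
        placebos[riot["name"]] = best["name"]
    return placebos


# Hypothetical areas, for illustration only.
riot_areas = [{"name": "R1", "borough": "X", "frac_black": 0.30, "unemp": 0.12}]
candidates = [
    {"name": "C1", "borough": "X", "frac_black": 0.30, "unemp": 0.12},  # same borough
    {"name": "C2", "borough": "Y", "frac_black": 0.28, "unemp": 0.11},
    {"name": "C3", "borough": "Z", "frac_black": 0.05, "unemp": 0.04},
]
placebos = pick_placebos(riot_areas, candidates, ["frac_black", "unemp"])
```

In this example C1 is identical to the riot area but is excluded for being in the same borough, so C2 is chosen. The placebo locations would then be fed into the same distance-band construction used for the real riot locations.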