Getting the Message Out: Why Mail-Delivered GOTV Interventions Succeed or Fail

ABSTRACT Mail-delivered get-out-the-vote (GOTV) field experiments have been found to increase voter turnout in some but not all contexts. We hypothesize that mail-delivered GOTV interventions are more successful in low-salience elections and test this in a systematic way for the first time. Relying on a systematic literature review and a meta-regression framework, we find that primary elections have a strong and significant positive impact on the success of mail-delivered GOTV interventions, whereas other commonly used measures of election salience, such as voter turnout, margin of victory, and a dummy for local elections, do not. These results highlight the possibility of fostering voter turnout using GOTV mail messages, especially in primary elections.

The proportion of registered voters who cast their vote during elections is one of the most common indicators of political participation in democratic countries. It is often assumed that the higher the voter turnout, the healthier the democracy. Still, voter turnout has been declining globally since the 1980s (Solijonov 2016), prompting partisan and nonpartisan actors to develop get-out-the-vote (GOTV) interventions during election campaigns to increase voter turnout through various means. Mail delivery, although not as effective as in-person contact (Bhatti et al. 2016), has several unique advantages relative to other commonly used GOTV techniques, such as phone messages, canvassing, and emails. First, phone and email address banks typically lack contact information for many registered voters in the district or area under study. Conversely, a mailing can be sent to any registered voter whose registered residence is in the district or region under study. Mail delivery therefore makes it easier for a GOTV intervention to reach a representative sample of the population and to mobilize groups of citizens who might be unwilling to answer phone calls and emails. Second, mail delivery is cheaper per capita than phone calls and canvassing. Multiple studies have used this contact method and, because of its low cost, have been able to mobilize more voters with limited resources. Third, all recipients of the mailing receive the same written text. There is no direct discussion with the person delivering the message, making the treatment more uniform across recipients.
One potential downside of relying on mail-delivered messages, however, is that contact rates usually cannot be estimated if contact is defined as reading the letter rather than merely receiving it. Phone calls and canvassing typically have contact rates below 50% (Bhatti et al. 2016; Ha and Karlan 2009; McNulty 2005), whereas mail-delivered messages are received by everyone they are sent to unless they are lost in the mail. However, recipients may toss these mailings without opening them, and it is not possible to estimate how many will read the messages.
This article reviews existing studies on the efficacy of mail-delivered GOTV interventions and asks whether low-salience elections generate large effect sizes for such interventions in field experiments. Randomized controlled experiments in the field are the gold standard method to assess the efficacy of GOTV interventions because they enable intervention effect sizes to be estimated with a lower risk of bias due to confounding factors than do observational studies. The study presented here relies on meta-regressions that include both a broader range of election characteristics than other meta-analytical studies and research conducted in different countries. Most GOTV interventions have been carried out in the American context, and many claims about the efficacy of GOTV in one context (e.g., a competitive party primary) may be refined by considering other contextual elements (e.g., the general voter turnout in that election). Using meta-regressions makes it possible to control for many contextual elements and better distinguish among the main predictors of GOTV intervention success.

THE SUCCESS OF MAIL-DELIVERED GOTV INTERVENTIONS
Field experiments on mail-delivered GOTV interventions have already been reviewed in three meta-analyses (Green and Gerber 2015; Green, McGrath, and Aronow 2013; Ouimet et al. 2014). Green, McGrath, and Aronow (2013) report an effect size (ES) of +0.16 percentage point (confidence interval [CI] = +0.08 to +0.25) per GOTV mailing in general or +2.85 points (CI = +2.69 to +3.01) per social pressure mailing. Social pressure mailings inform voters that their turnout records are being scrutinized by researchers. For four of the six types of GOTV messages they tested, Ouimet and coauthors (2014) report a slightly positive but not statistically significant estimated pooled effect. However, interventions using social pressure messages are generally successful (ES = +3.6 points; 95% CI = +2.1 to +5.1 points). Green and Gerber (2015) report a pooled effect size of +0.5 percentage point per nonpartisan mailing involving no social pressure (95% CI = +0.2 to +0.8); +0.01 per partisan mailing involving no social pressure (95% CI = -0.1 to +0.1); and +2.3 per social pressure mailing (95% CI = +1.3 to +3.3). Overall, GOTV mail messages seem to slightly increase recipients' likelihood of voting, and social pressure mail messages have a substantially larger effect of more than two percentage points.
Although they provide separate estimates for different message types, these three studies do not determine the pooled effect of covariates, such as country, election salience, and election type, on GOTV interventions' effect sizes. They also include studies that have not been peer reviewed (some of those included by Green and Gerber [2015] are anonymous) and do not include those conducted outside the United States.

MAIL-DELIVERED GOTV INTERVENTION SUCCESS BY ELECTION SALIENCE
Research on mail-delivered GOTV interventions relies on two main factors to explain their relative success (large effect sizes) or failure (small effect sizes): the intervention itself (the message type) and the context in which it is delivered (the election's salience). Studies examining message types generally find greater success for "social pressure" types of messages. However, results are much more contested for election salience. Moreover, definitions of election salience vary between studies. Some use voter turnout as a proxy for election salience (Bhatti et al. 2015; Gerber et al. 2017; Matland and Murray 2012); others suggest that the type of election (party primary, general election, special election, etc.) defines how salient an election is (Abrajano and Panagopoulos 2011; Murray and Matland 2014; Panagopoulos 2011); a third group of studies uses other metrics such as the highest office level (countrywide, regional, local, etc.) or the number of relevant countrywide offices at stake (president, upper chamber, lower chamber, etc.) to define an election's salience (Bhatti et al. 2015, 2018; Fieldhouse et al. 2013; Foos and John 2018; Gerber et al. 2017; Gerber, Green, and Larimer 2008; Gerber, Green, and Shachar 2003; Matland and Murray 2012; Murray and Matland 2014; Panagopoulos 2014); and, finally, some studies base their definition of election salience on how competitive the election is (Gerber, Green, and Larimer 2008; Gerber et al. 2017; Panagopoulos 2011). We next explain each of these four approaches and the extent to which each is related to GOTV interventions' success or failure.
First, many studies have made the association between voter turnout and election salience. Arceneaux and Nickerson (2009) suggest that voter turnout provides the most accurate estimation of election salience. This definition has since been borrowed or adopted by several GOTV studies relying on mail messages, all of which associate high voter turnout with high election salience (Bhatti et al. 2015; Gerber et al. 2017; Matland and Murray 2012).
The relative effect sizes of GOTV interventions for low- and high-propensity voters in high- and low-turnout elections have also been examined. Arceneaux and Nickerson (2009) find that low-propensity voters are easier to mobilize through door-to-door canvassing in high-turnout contexts, whereas high-propensity voters are easier to mobilize in low-turnout elections. Results are less stark in studies that rely on mail-delivered GOTV interventions. In low-turnout settings, Matland and Murray (2012) show that chronic nonvoters are harder to mobilize than occasional nonvoters, and Panagopoulos (2013) finds effect sizes among voters who voted in the previous elections and those who did not to be similar. However, again in mostly low-turnout settings, coauthors (2013, 2017) find that chronic nonvoters are more receptive to GOTV mailings than people who voted in the past. Finally, in high-turnout settings, Bhatti and colleagues (2015, 2018) find that low-propensity voters are more likely to be mobilized by a GOTV mailing than high-propensity voters. Overall, there is no consensus on whether high- or low-propensity voters are more likely to react to GOTV efforts; the extent to which voter turnout determines which types of voters will react to the message is also unclear.
Second, election salience has also been linked with election type. In the United States, party primaries and special elections (elsewhere known as by-elections) have been described as low-salience elections (Abrajano and Panagopoulos 2011; Murray and Matland 2014; Panagopoulos 2011). The reasons for labelling these as low salience are often implicit: special elections and primary elections are typically associated with lower voter turnout. Panagopoulos (2011) also suggests that special elections are low salience because more voters are unaware of them and media coverage is low.
Third, some studies make a distinction between the salience of first-order elections and second-order elections. Local elections and European Union elections are deemed to be second-order, low-salience elections (Fieldhouse et al. 2013; Foos and John 2018; Gerber et al. 2017; Gerber, Green, and Shachar 2003; Murray and Matland 2014; Panagopoulos 2014), whereas countrywide elections, especially presidential ones (Gerber, Green, and Larimer 2008; Matland and Murray 2012), are first-order, high-salience elections. Bhatti and coworkers (2015, 2018) consider the Danish municipal elections to be high-salience elections, but this seems to be due to high voter turnout, not to the election taking place at the local level. In the same vein, Gerber and coauthors (2017) define election salience based partly on the number of elections occurring at the same time as first-order elections for countrywide offices.
Fourth, some researchers consider an election's level of competitiveness an indicator of election salience. Gerber, Green, and Larimer (2008) suggest election salience is higher in battleground swing US states than in those states where races are typically decided by large margins. Gerber and colleagues (2017) similarly include in their measure of election salience a dichotomous measure of whether a race is rated as "toss-up" by the Cook Political Report, whereas Panagopoulos (2011) suggests that competitive races are more salient than uncompetitive ones.
Following Arceneaux and Nickerson (2009), authors have generally assumed that low-salience elections, in whichever way they are defined, yield larger effect sizes of GOTV interventions. The empirical findings are mixed, however. Fieldhouse and colleagues (2014) find that successful GOTV campaigns are more likely in high-turnout areas. Rogers and coauthors (2017) instead suggest that low-turnout settings make electoral mobilization easier. Hill and Kousser (2016) find more potential for the mobilization of voters in primary elections than in general elections. However, Gerber, Green, and Larimer (2008) do not find a relationship between a district's level of competitiveness and the treatment's effect size, and Panagopoulos (2011) finds similar effect sizes in general, local, and special elections of various levels of competitiveness. Matland and Murray (2012) suggest that a noncompetitive race is likely to foster higher GOTV intervention success, whereas Fieldhouse and colleagues (2013) find just the opposite.
Although the results of these studies vary widely, the majority view seems to be that low-salience elections are contexts in which GOTV interventions are more effective because low voter interest and attention at the outset enlarge the pool of potential voters who may be receptive to mail-delivered messages. However, the concept of election salience is contested and needs to be disaggregated.
In this article, we test four hypotheses, which, along with the main theoretical expectations and findings in this section, are summarized in figure 1.
(H1) In elections with low voter turnout (low-salience elections), mail-delivered GOTV interventions will have larger effect sizes than in elections with high voter turnout.
Some voters who normally abstain in these contexts might be easier to mobilize, whereas the few who abstain in high-turnout settings might be harder-to-convince chronic nonvoters.
(H2) In primary elections (low-salience elections), mail-delivered GOTV interventions will have larger effect sizes than in general elections.
Primary elections do not directly lead to elected office, and voters might have weaker incentives to vote.
(H3) In local elections (low-salience elections), mail-delivered GOTV interventions will have larger effect sizes than in federated state or countrywide elections.
Local elections generally foster low media attention and low turnout.
(H4) In elections where the margin of victory is greater (low-salience elections), mail-delivered GOTV interventions will have larger effect sizes than in elections where the margin of victory is smaller.
Elections with a smaller margin of victory are deemed to be more competitive and therefore could have higher voter engagement at the outset.

DATA AND METHODS
We conducted a systematic literature review using keywords related to voter turnout and experimental studies on nine bibliographic databases; we complemented this search with reference lists of included studies and the literature reviews of Green and colleagues (2013), Ouimet and coauthors (2014), and Green and Gerber (2015; see online appendix A for a list of included studies and online appendix B for examples of a research query). Exclusion criteria for studies were as follows: (1) GOTV field experiments including interactions through other means than mail; (2) nonfield experiments or interventions in which there was no random assignment to control and treatment groups; (3) voter turnout measured subjectively through self-reporting instead of official postelection voting records; (4) non-peer-reviewed publications; (5) no reporting of numbers or percentages of voters and nonvoters for one or both experimental groups; (6) secondary data; (7) elections created for study purposes; (8) studies involving [...]. Experiments were categorized according to several covariates: the level of the highest office at stake for this election (local, federated state, countrywide, or European Union); the number of countrywide offices at stake; the type of election (general or primary); the country where the experiment took place; the winner's margin of victory when candidates were the same for all participants in the experiment; and the type of GOTV message.
For this last element, we created six categories: standard reminders, which remind voters about the upcoming election; civic duty messages, which call on voters' sense of civic duty while urging them to vote; past turnout messages, which reveal whether the person receiving the message voted in at least one past election; neighbors messages, which reveal whether some neighbors of the person receiving the message voted in at least one past election; advocacy messages, which argue in favor of or against a candidate, party, or policy position; and other messages, which include messages that go beyond standard reminders but do not fit into the other categories.
In total, the sample includes data from 125 field experiments reported in 39 publications. Meta-regressions were estimated in R using the metafor package (Viechtbauer 2010). Meta-regressions pool estimates from different studies and give more weight to those with more units. They add several covariates into the equation to help explain the treatment effect on an outcome variable (in this case, intervention effect size). Meta-regressions are most appropriate for evaluating the impact of covariates on effect sizes across multiple studies. They can be used with several control variables to test whether a given covariate is associated with higher effects (Harrer et al. 2021); this makes it possible to test treatment effects across multiple studies with multiple levels of voter turnout, multiple types of elections (primary, special, and so on), and using different message types in different countries. The goal is to have a comprehensive picture of the effects of mail-delivered GOTV interventions. As suggested by Berkey and coworkers (1995) and Viechtbauer (2010), we used a restricted maximum-likelihood estimation (REML) random-effects estimator to compare levels of residual heterogeneity accounted for by different models. Linear slopes were fit on all experiments within each message type. The impact of covariates on intervention effect size was estimated using risk differences. Table 1 presents descriptive statistics for nine study-level variables for the experiments included in the systematic literature review. Figure 2 illustrates the bivariate relationship between treatment effect magnitude and voter turnout in the control group for all mail experiments. Larger dots (sized in proportion to the inverse of the variance of the measured effect) indicate larger samples that are weighted more heavily in the meta-regressions.
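The effect measure and weighting described in this section can be sketched in a few lines. The example below is only an illustration: the counts are hypothetical, and it uses the DerSimonian-Laird moment estimator of between-study variance as a simpler stand-in for the REML estimator that the article reports fitting with metafor in R. It computes a risk difference and its sampling variance for each experiment, then pools them with inverse-variance weights.

```python
import math

def risk_difference(treat_voted, treat_n, ctrl_voted, ctrl_n):
    """Return (risk difference, sampling variance) for one experiment."""
    p_t = treat_voted / treat_n
    p_c = ctrl_voted / ctrl_n
    rd = p_t - p_c
    var = p_t * (1 - p_t) / treat_n + p_c * (1 - p_c) / ctrl_n
    return rd, var

def pool_random_effects(effects, variances):
    """DerSimonian-Laird random-effects pooling (not REML).

    Returns the pooled estimate, its standard error, and tau^2.
    """
    w = [1.0 / v for v in variances]          # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2

# Hypothetical counts, NOT data from the article:
# (treated voters, treated n, control voters, control n)
experiments = [
    (520, 1000, 500, 1000),
    (2150, 5000, 2050, 5000),
    (330, 800, 300, 800),
]
results = [risk_difference(*e) for e in experiments]
effects = [r[0] for r in results]
variances = [r[1] for r in results]
pooled, se, tau2 = pool_random_effects(effects, variances)
print(f"pooled RD = {100 * pooled:.2f} points (SE = {100 * se:.2f})")
```

Because each weight is the inverse of a variance that shrinks with sample size, larger experiments pull the pooled estimate toward their own result, which is the behavior the dot sizes in figure 2 are meant to convey.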
Visually, there appears to be a small curvilinear or decreasing trend overall: studies where turnout in the control group is higher see smaller intervention effects. The average effect size of a GOTV intervention (see model 0) is +1.3 percentage points (p < 0.001; 95% CI = +0.9 to +1.6 points). Models 1 through 5 present the results of univariate meta-regression models testing our four hypotheses. In the context of bivariate models, there is support for H1 and H2. Model 1 shows that the higher the turnout, the lower the effect of GOTV interventions. Effect size is predicted to stand 1.8 points lower (p < 0.05; 95% CI = -3.5 to -0.1 points) when voter turnout reaches 100% than when it stands at 0%. Model 2 shows that turnout squared seems to explain a greater portion of the variance than turnout alone (ES = -9.7 points; p < 0.05), indicating a potential curvilinear relationship, as in figure 2. This finding would be in line with Arceneaux and Nickerson (2009), who suggest that the impact of turnout on effect size can be curvilinear. In model 3, effect size is predicted to be 1.3 points higher (p < 0.01; CI = +0.4 to +2.0) for primary elections than general elections. In contrast, we do not find support for H3 and H4: salience, as measured by whether an election is local and by the size of the margin of victory, does not affect the intervention's efficacy. We then explored whether these effects hold when put in a single model (model 6). Model 7 adds controls for (1) message type, (2) a dummy for experiments conducted in the United States, and (3) the number of countrywide offices at stake for robustness. Model 8 adds a squared turnout coefficient. Margin of victory is not added to models 6-8, because it reduces the number of included experiments from 125 to 100 and does not have a significant impact on effect size when added in tests.
In the three models, primary elections are still associated with a large effect when controlling for all other factors (+1.3 percentage points; p < 0.01; CI = +0.4 to +2.1 in model 7), but the statistically significant effect on turnout and on its squared version disappears. We did not find evidence of a linear or curvilinear relationship between turnout in the control group and effect size when controlling for other factors.
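To make the logic of a single-moderator model such as model 3 concrete, the following sketch fits an inverse-variance weighted regression of effect size on a primary-election dummy. It is a simplification under stated assumptions: the effect sizes and variances are hypothetical, the weights are fixed-effect weights (no between-study variance term), and only one moderator is included, whereas the article's models use a REML random-effects estimator and, in models 6-8, several covariates.

```python
def weighted_meta_regression(y, v, x):
    """Inverse-variance weighted regression of effects y on one moderator x.

    y: effect sizes, v: their sampling variances, x: moderator values.
    Returns (intercept, slope, standard error of the slope).
    """
    w = [1.0 / vi for vi in v]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    se_slope = (1.0 / sxx) ** 0.5  # valid when weights are 1/variance
    return intercept, slope, se_slope

# Hypothetical effect sizes (percentage points) and variances,
# NOT values from the article. primary = 1 marks a primary election.
effects = [0.5, 1.0, 0.8, 2.4, 2.0]
variances = [0.04, 0.09, 0.05, 0.16, 0.10]
primary = [0, 0, 0, 1, 1]

intercept, slope, se_slope = weighted_meta_regression(effects, variances, primary)
print(f"general elections: {intercept:.2f} points")
print(f"primary vs. general: {slope:+.2f} points (SE = {se_slope:.2f})")
```

With a 0/1 moderator, the intercept is the weighted mean effect in general elections and the slope is the difference between the weighted means of the two groups, which is how a coefficient on a primary-election dummy can be read.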

RESULTS
Using predicted probabilities and the coefficients found in model 7, it is possible to estimate the average effect size of a GOTV intervention in the best and worst possible scenarios, and therefore to get a general idea of the variation in effect size across multiple contexts. In the worst scenario (100% turnout in a nonlocal general election in the United States with no countrywide office at stake and an advocacy type of message), the predicted effect size is -1.4 percentage points, a negative effect that falls marginally short of statistical significance. By contrast, in the best scenario (0% turnout in a local primary election outside the United States with three offices at stake and a neighbors message), the predicted effect size is +5.1 percentage points (p < 0.001; 95% CI = +2.6 to +7.6 points).
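A scenario prediction of this kind is a linear combination of the model coefficients, with a standard error derived from their covariance matrix. The sketch below shows the arithmetic only. The primary-election coefficient of 1.3 is the value reported for model 7; every other number (the intercept, the turnout coefficient, and the whole covariance matrix) is a made-up placeholder, and the three-term model is far smaller than model 7, so the output does not reproduce the article's predictions.

```python
import math

def predict(coefs, cov, scenario):
    """Predicted effect size and 95% CI for one scenario.

    coefs: dict of coefficient name -> value (must include "intercept").
    cov: covariance matrix of the coefficients, in the same order as coefs.
    scenario: dict of covariate name -> value for every non-intercept term.
    """
    names = list(coefs)
    x = [1.0 if n == "intercept" else float(scenario[n]) for n in names]
    estimate = sum(coefs[n] * xi for n, xi in zip(names, x))
    k = len(x)
    variance = sum(x[i] * cov[i][j] * x[j] for i in range(k) for j in range(k))
    se = math.sqrt(variance)
    return estimate, estimate - 1.96 * se, estimate + 1.96 * se

# Only "primary" comes from the article; the rest are HYPOTHETICAL.
coefs = {"intercept": 1.0, "turnout": -1.5, "primary": 1.3}
cov = [
    [0.09, 0.00, 0.00],
    [0.00, 0.49, 0.00],
    [0.00, 0.00, 0.16],
]

# Hypothetical scenario: a primary election with 20% control-group turnout.
est, low, high = predict(coefs, cov, {"turnout": 0.2, "primary": 1})
print(f"predicted effect = {est:+.2f} points (95% CI = {low:+.2f} to {high:+.2f})")
```

Extreme scenarios such as 0% or 100% turnout sit at the edge of, or beyond, the observed data, so the corresponding intervals widen as the covariate values move away from typical experiments.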

DISCUSSION
Our study shows that, overall, mail-delivered GOTV messages seem to have significant positive effects on voter turnout in most contexts; this finding is similar to those from Green and Gerber (2015) and others. However, its principal contribution is to identify how election salience contributes to the success (or not) of mail-delivered GOTV interventions. The results show that it is easier to mobilize voters when turnout is low and in primary elections. However, in a model including both turnout in the control group and a dummy for primary elections, the effect of turnout is eliminated: only primary elections are associated with higher intervention success. Because all primary elections in our dataset were held in the United States, this finding should be treated with caution, and we encourage scholars to study their effects in other countries to test whether this finding holds regardless of context. No effect is found for two other measures of salience: local elections and margin of victory.
Many studies using mail-delivered GOTV interventions claim that their effect size is partly due to the context in which an election takes place, because it is the context that determines whether the election is salient or not. Fieldhouse and coauthors (2014), among others, identified greater efficacy of their intervention in high-turnout areas in a first-order countrywide election, which they deemed salient. Although our results do not contradict this result specifically (we have no data on high- and low-turnout areas within an election), across multiple contexts primary elections are positively associated with mail intervention success, whereas other ways of measuring election salience, including voter turnout, are not. Although these results do not extend to other types of GOTV interventions than those delivered by mail, and the target populations differ from one study to the next, our systematic look at mail-delivered GOTV interventions suggests that studies should be cautious about claiming that an intervention had greater or lower success because the election had low voter turnout, took place at the local level, or was noncompetitive: none of these factors had a statistically significant effect on intervention success in the meta-regression models. In summary, only limited support is found for the effect of election salience on intervention success. On a related note, researchers have referred to "election salience" to talk about phenomena (election type, voter turnout, competitiveness, and so on) that are not only different in nature but also have different effects on intervention success. Local elections have also been described as being either high or low salience in different contexts. Therefore, although we do not suggest a new definition of election salience, scholars in the GOTV field should remain aware of the multiple ways of defining this concept.
Finally, the results suggest that not all GOTV interventions are equal. As previous studies have found (notably Gerber, Green, and Larimer 2008), some nonstandard types of messages, especially those that tap into social pressure by revealing the past turnout of recipients' neighbors, are associated with larger intervention effect sizes. Future mail-delivered field experiments are therefore likely to be more effective in primary elections and in other contexts when using nonstandard types of messages.

DATA AVAILABILITY STATEMENT
Research documentation and data that support the findings of this study are openly available at the PS: Political Science & Politics Harvard Dataverse at https://doi.org/10.7910/DVN/DPDU4P.
