Efficacy of bacterial vaccines to prevent respiratory disease in swine: a systematic review and network meta-analysis

Abstract A systematic review and network meta-analysis (MA) was conducted to address the question, ‘What is the efficacy of bacterial vaccines to prevent respiratory disease in swine?’ Four electronic databases and the grey literature were searched to identify clinical trials in healthy swine where at least one intervention arm was a commercially available vaccine for one or more bacterial pathogens associated with respiratory disease in swine, including Mycoplasma hyopneumoniae, Actinobacillus pleuropneumoniae, Actinobacillus suis, Bordetella bronchiseptica, Pasteurella multocida, Streptococcus suis, Haemophilus parasuis, and Mycoplasma hyorhinis. To be eligible, trials had to measure at least one of the following outcomes: incidence of clinical morbidity, mortality, lung lesions, or total antibiotic use. There were 179 eligible trials identified in 146 publications. Network MA was undertaken for morbidity, mortality, and the presence or absence of non-specific lung lesions. However, there was not a sufficient body of research evaluating the same interventions and outcomes to allow a meaningful synthesis of the comparative efficacy of the vaccines. To build this body of research, additional rigor in trial design and analysis, and detailed reporting of trial methods and results are warranted.


Introduction
Porcine respiratory disease complex (PRDC), which refers to respiratory disease caused by viruses or bacteria in swine, is the leading single cause of death during the nursery stage (47.3%), and is responsible for the majority of swine deaths during the grower-finisher production stage (75.1%) (USDA, 2016a). Swine respiratory disease is multifactorial, involving the interplay between environmental factors, host characteristics, and infectious disease agents (Opriessnig et al., 2011). Bacterial pathogens involved in respiratory disease may induce disease, or act as co-infections, making the animal more susceptible to other disease agents (Opriessnig et al., 2011). Bacterial pathogens frequently involved in PRDC include Actinobacillus pleuropneumoniae, Actinobacillus suis, Bordetella bronchiseptica, and Mycoplasma hyopneumoniae (Opriessnig et al., 2011; VanAlstine, 2012). Treatment of respiratory disease is a major source of antimicrobial use in swine production (Karriker et al., 2012). In the USA in 2012, the National Animal Health Monitoring System (NAHMS), which is run by the United States Department of Agriculture (USDA), reported that 59.9% of swine farm nursery sites used injectable antibiotics and 41.2% used water-soluble antibiotics to treat respiratory disease. Similarly, the NAHMS reported that 72.8% of swine farm grower-finisher sites used injectable antibiotics and 64.2% used water-soluble antibiotics to treat respiratory disease (USDA, 2016b).
Antibiotic use in animals or humans leads to the development of antimicrobial resistance (AMR). The World Health Organization (WHO) Global Action Plan on Antibiotic Resistance (2015) stated that AMR is currently one of the most critical threats to global health, food security, and economic development. To minimize the development of AMR, it is therefore essential that antibiotics be used judiciously in both animal and human medicine. The American Association of Swine Veterinarians (AASV) endorsed a position statement regarding the judicious use of antimicrobials in swine production that encourages veterinarians to implement preventive strategies (AASV, 2019). One such approach is the use of vaccinations, intended to prevent disease and therefore the use of antibiotics to treat clinical illness. Swine veterinarians and producers require information on the effectiveness of alternative non-antibiotic treatments, such as vaccinations, to mitigate the impact of major bacterial pathogens involved in swine respiratory disease without relying on antibiotics.
The systematic review method has been recognized for decades as a transparent and replicable method for synthesizing and assessing the available evidence on the effectiveness of interventions (Higgins and Green, 2011). The use of systematic review methods to synthesize scientific data for use in policy-making has been endorsed internationally by the WHO, the European Food Safety Authority (EFSA), and the Codex Alimentarius Commission (CAC) (EFSA, 2010; CAC, 2014; WHO, 2018). Meta-analysis (MA), or the statistical pooling of estimates of intervention effect sizes across individual studies, may also be combined with systematic review to provide a summary estimate of intervention efficacy (Higgins and Green, 2011). However, one important limitation of a standard (pairwise) MA is that it only allows the comparison of pairs of treatments, such as a specific vaccine to a placebo or the comparison of two vaccines.
Often veterinarians and producers are faced with multiple possible treatment options, and thus they require comparisons of relative efficacy between more than two alternatives. A network meta-analysis (NMA) generates estimates of the relative efficacy of multiple interventions by incorporating direct head-to-head comparisons of interventions based on the available literature, as well as indirect comparisons of interventions across a network of trials (Dias et al., 2011; Cipriani et al., 2013). For example, if treatment A is compared to treatment B in one or more studies, and treatment B is compared to treatment C in one or more studies, the relative relationship between treatments A and C can be inferred indirectly from the available information under some standard assumptions. Thus, given a dataset of studies investigating the efficacy of intervention options for the same outcome, within which the underlying NMA model assumptions are met, NMA allows the comparison of multiple interventions using direct and indirect evidence (Lu and Ades, 2004). A systematic review followed by an NMA therefore provides a rigorous method for synthesizing evidence regarding the relative effectiveness of available swine respiratory vaccines.
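To make the indirect comparison concrete, the following minimal sketch shows the simplest case of two independent pairwise estimates that share treatment B. It is an illustration only, not the model used in this review: on the log OR scale the indirect estimate is the sum of the two direct estimates, and, assuming the two sources are independent, their variances add. All numerical inputs are hypothetical.

```python
import math

def indirect_log_or(d_ab, se_ab, d_bc, se_bc):
    """Indirect log OR of C versus A from two independent estimates.

    d_ab is the log OR of B versus A and d_bc is the log OR of C versus B.
    """
    d_ac = d_ab + d_bc                        # effects add on the log scale
    se_ac = math.sqrt(se_ab**2 + se_bc**2)    # variances add under independence
    return d_ac, se_ac

# Hypothetical inputs: OR of 0.70 for B vs A and OR of 0.90 for C vs B.
d_ac, se_ac = indirect_log_or(math.log(0.70), 0.20, math.log(0.90), 0.25)
or_ac = math.exp(d_ac)
ci_low = math.exp(d_ac - 1.96 * se_ac)
ci_high = math.exp(d_ac + 1.96 * se_ac)
print(f"Indirect OR (C vs A): {or_ac:.2f} ({ci_low:.2f} to {ci_high:.2f})")
```

The indirect standard error is always larger than either input standard error, which is one reason sparse networks yield imprecise comparisons.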
The objective of this study was to conduct a systematic review and NMA to address the question: 'What is the relative efficacy of commercially produced bacterial vaccines to prevent respiratory disease in swine from controlled trials with natural disease exposure?'.

Protocol
A review protocol was prepared a priori, and was reported in accordance with the PRISMA-P guidelines. The protocol was published in the University of Guelph's institutional repository (https://atrium.lib.uoguelph.ca/xmlui/handle/10214/10046). The protocol is also available through Systematic Reviews for Animals and Food (SYREAF) (http://www.syreaf.org/contact/). The Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for NMA (PRISMA-NMA) guidelines were used for the reporting of this systematic review (Hutton et al., 2015).

Eligibility criteria
Primary research studies available in English were eligible for inclusion in the review. In addition, primary studies must have included the following elements based on the PICOS components: Population (P): Healthy swine at any stage of production; Intervention (I): At least one intervention arm was a commercially available vaccine or a commercially produced injectable autogenous vaccine (derived from culture) for bacterial pathogens associated with respiratory disease in swine, specifically M. hyopneumoniae, A. pleuropneumoniae, A. suis, Bordetella bronchiseptica, Pasteurella multocida, Streptococcus suis, Haemophilus parasuis, and Mycoplasma hyorhinis; Comparator (C): Negative control group, sham treatment, saline placebo, or other alternative treatment (including another vaccine); Outcomes (O): At least one of the following outcomes was evaluated: morbidity (clinical morbidity as defined by the authors, cough index, or general or specific lung lesions), mortality, or total antibiotic use; and Study design (S): Controlled trials with natural disease exposure.

Search
The search strategy comprised four distinct concepts: swine at any stage of production, vaccines, bacteria associated with respiratory disease in swine, and respiratory outcomes. A list of relevant search terms was compiled for each concept within the search strategy. Search terms for each part of the strategy were combined using the Boolean operator 'OR', and parts were linked using the Boolean operator 'AND'. The full search strategy as developed for Science Citation Index is listed in Table 1. The search strings were modified as necessary to meet the formatting requirements of each of the remaining databases. Database searches were conducted on 20 August 2018 via the library resources at the University of Guelph, Canada. Search results were uploaded to EndNoteX7 (Clarivate Analytics, Philadelphia, by the search for potential eligibility. All reviewers conducted a pre-test of the first 250 titles and abstracts to ensure clarity of understanding and consistent application of the questions. Thereafter, the following questions were used to assess eligibility:
(1) Is this a primary study that evaluated the use of vaccines for bacterial causes of respiratory disease in live swine? YES, NO (exclude), UNCLEAR
(2) Is there a concurrent comparison group (i.e. controlled trial with natural or deliberate disease exposure, or analytical observational study)? YES, NO (exclude), UNCLEAR
(3) Is the text available in English? YES (include for full-text screening), NO (exclude), UNCLEAR (include for full-text screening)
Citations were excluded if both reviewers responded 'NO' to any of the questions; agreement was at the include/exclude level. Disagreements were resolved by consensus or by a third reviewer (JMS or CBW) if consensus could not be reached. Full texts were acquired for citations that passed the initial eligibility screening or for which the results were unclear.
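The Boolean structure of the search strategy described above (terms joined by 'OR' within a concept, concepts joined by 'AND') can be sketched as follows. The terms shown are hypothetical placeholders chosen for illustration; they are not the terms listed in Table 1, and the quoting and truncation syntax would need to be adapted to each database.

```python
def build_query(concepts):
    """Join terms with OR inside each concept, then join concepts with AND."""
    groups = []
    for terms in concepts:
        # Quote multi-word terms so they are searched as phrases.
        quoted = [f'"{term}"' if " " in term else term for term in terms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)

# Hypothetical example terms for the four concepts (not the review's term lists).
concepts = [
    ["swine", "pig*", "piglet*"],                          # population
    ["vaccin*", "bacterin*"],                              # intervention
    ["Mycoplasma hyopneumoniae", "Streptococcus suis"],    # bacterial targets
    ["pneumonia", "respiratory disease", "lung lesion*"],  # outcomes
]
query = build_query(concepts)
print(query)
```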
Two independent reviewers conducted the full-text screening, using the first 10 articles as a pre-test. The following questions were used to evaluate the eligibility of the full-text articles:
(1) Is the full text available with >500 words? YES, NO (EXCLUDE)
(2) Is the full text available in English? YES, NO (EXCLUDE)
(3) Does this study assess the use of a monovalent or polyvalent commercially available vaccine or a commercially produced injectable autogenous vaccine (derived from culture) for one or more of the following bacterial pathogens associated with respiratory disease in swine: Mycoplasma hyopneumoniae, Actinobacillus pleuropneumoniae, Actinobacillus suis, Bordetella bronchiseptica, Pasteurella multocida, Streptococcus suis, Haemophilus parasuis, or Mycoplasma hyorhinis? YES, NO (EXCLUDE)
(4) Is there a concurrent comparison group (i.e. controlled trial with natural or deliberate disease exposure, or analytical observational study)? YES, NO (EXCLUDE)
(5) Is at least one of the following outcomes described: clinical morbidity (as defined by the authors), cough index, lung lesions at slaughter, mortality, or antibiotic use? YES, NO (EXCLUDE)
(6) Is the study a controlled trial with natural disease exposure? YES (moves to data extraction stage); NO, the study is a controlled trial with deliberate disease induction (indicate the pathogen targets included in the vaccine, but exclude from data extraction); NO, the study is an observational study (indicate the pathogen targets included in the vaccine, but exclude from data extraction)
Studies were included in the NMA if sufficient data were reported to enable the calculation of the log odds ratio (OR) and the standard error of the log OR, based on the extraction of the prioritized outcome metrics. The criteria for inclusion in the NMA are described in the statistical analysis section.

Data collection
Two reviewers used a standardized form to independently extract data from all citations meeting the full-text screening criteria. The data extraction form was created using a hierarchical nesting of forms in DistillerSR, which allowed data from multiple comparisons or from multiple outcomes to be linked. The forms were pretested on the first 10 articles by all reviewers to ensure consistency and for training in the use of nested forms. Discrepancies in data extraction were resolved by consensus, with mediation by JMS and CBW if an agreement could not be reached.

Study characteristics
Extracted study characteristics included country of conduct, year(s) the study was conducted, months of data collection, whether the study was a commercial or research trial, the number of herds/farms enrolled in the study, the reason for vaccinating (endemic disease, prevention of clinical disease, response to a disease outbreak, or not reported), farm- and animal-level inclusion/exclusion criteria, and whether the study reported clinically ill animals at the time the intervention was given.

Interventions
Details on each intervention arm included vaccine name, the target bacterium or bacteria for the vaccine, other infectious disease targets included in the vaccine, dose, route, frequency of administration, the stage of production at which the intervention was administered and at which the outcome was measured, the type of vaccine (killed, modified live, autogenous, or not reported), and whether there were any concurrent treatments. A description of other intervention arms (e.g. a non-eligible vaccine, placebo, or alternative treatment) also was extracted, as well as the number of animals and pens enrolled and analyzed in each treatment group. Information was collected on whether animals in different intervention groups were comingled within pens or whether each pen only contained animals in the same intervention group.

Eligible outcomes
Outcomes eligible for inclusion were morbidity (clinical morbidity defined by the authors as observed illness or illness requiring treatment, cough index, and general and specific lung lesions), mortality, and total antibiotic use. Morbidity related to respiratory disease was included as a critical outcome in the review protocol, whereas lung lesions were not included as an eligible outcome in the protocol. However, as most studies included lung lesions as a proxy for morbidity (clinical or subclinical), we added this outcome beginning at the title and abstract screening level as a protocol modification. For trials in which some animals had clinical disease at the start of the study, data were only extracted for incident cases (i.e. data were only extracted when they were reported separately for the animals that did not have clinical disease at the start of the trial). Multiple specific outcomes within the same outcome category often were reported for both morbidity and lung lesions. In these instances, a single outcome within each category was extracted based on priorities identified at the start of the data extraction process. For morbidity, if both general and respiratory-related morbidity were reported within the same trial, respiratory-related morbidity data were extracted. For lung lesions, information on general lung lesions (i.e. not specific to a pathogen) was extracted when presented. If an article also included disease-specific lung lesions, a single measure for disease-specific lesions was extracted; the prioritization (based on author description) was as follows: (1) Lung lesions consistent with enzootic pneumonia or Mycoplasma spp.; (2) Lung lesions consistent with Actinobacillus pleuropneumoniae (APP), including APP index; (3) Slaughterhouse pleurisy evaluation system (SPES) scores; and (4) Pathology described as pleuritic or pleurisy.
When extracting data on effects and effect sizes, relative effect measures (OR, risk ratio (RR), or mean difference) and corresponding precision estimates were extracted when these data were provided. Information on any additional adjusted variables used in the calculation of effect sizes was also extracted, as were losses to follow-up. When relative effect measures were not reported, arm-level data were extracted, including variance measures for continuous data. If a measure of variance was not reported, or could not be calculated from the available information, data for that comparison were not extracted. When results were presented only in a graph, with no numerical labels, data were not extracted. When data were reported as a frequency of observations within ordinal categories (e.g. for lung lesions, data in some trials were reported as the number of pigs with a score of 0, the number of pigs with a score between 1 and 20, the number of pigs with a score between 21 and 40, etc.), the data were collapsed into a binary outcome for data extraction (e.g. lung lesions were present or not present).
Some publications included multiple trials, or trials in multiple herds where the results were reported separately by herd. In both of these instances, the results were extracted separately and are reported as separate trials.

Geometry of the network
We used a visual approach to qualitatively evaluate the geometry of the network, to determine if some pairwise comparisons dominated and to determine whether the network appeared to have a star or web-like structure. We also evaluated whether there were intervention comparisons that were not linked to the network (i.e. did not have an intervention in common with one or more other published studies).
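Although the geometry was assessed visually in this review, the check for comparisons that are not linked to the network can also be expressed as a search for connected components in a graph whose nodes are interventions and whose edges are trial comparisons. The sketch below is illustrative only; the intervention labels are hypothetical.

```python
from collections import defaultdict

def connected_components(comparisons):
    """Group interventions into sets that are linked by at least one trial."""
    graph = defaultdict(set)
    for a, b in comparisons:
        graph[a].add(b)
        graph[b].add(a)
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:                      # depth-first traversal
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    # Largest component first; smaller ones are disconnected from it.
    return sorted(components, key=len, reverse=True)

# Hypothetical pairwise comparisons, one tuple per trial arm pairing.
comparisons = [
    ("placebo", "vaccine_A"),
    ("placebo", "vaccine_B"),
    ("vaccine_A", "vaccine_B"),
    ("vaccine_C", "vaccine_D"),   # shares no intervention with the rest
]
components = connected_components(comparisons)
print(components)
```

In this example, vaccines C and D form a separate component, so no indirect estimate relative to the placebo could be obtained for them.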

Risk of bias in individual studies
Risk of bias was assessed at the outcome level for the three main outcomes for trials included in the NMAs using the Cochrane risk-of-bias tool for randomized trials (RoB 2.0). This tool provides a framework for assessing the likelihood that bias could be introduced in each of the five domains. Specifically, the risk of bias domains included in the tool are: risk of bias arising from the randomization process; risk of bias due to deviations from the intended intervention(s); risk of bias due to missing outcome data; risk of bias in the measurement of the outcome; and risk of bias in the selection of the reported result (Higgins et al., 2016).
All outcomes included in an NMA were evaluated using this tool, regardless of whether interventions were allocated at the individual or pen level, or whether animals within intervention groups were comingled or grouped by intervention within pens. Signaling questions are a component of the RoB 2.0 tool that is used to elicit information on the use of trial features that are relevant to the potential for bias in a trial. The signaling questions were modified to address common practices in livestock health for animals that are grouped in pens. For assessing risk of bias due to allocation, the Cochrane risk of bias tool includes a question on whether the authors described the method for generating the random allocation sequence. This question was modified to include a response for studies where allocation to intervention group(s) was reported as 'random', but no information was provided on the actual method for generating the random sequence. In the RoB 2.0 domain related to deviations from the intended intervention, there is a question on whether participants were aware of their intervention group allocations; for each of the studies in this review, this question was answered as a 'no' because the participants in these trials were pigs. An additional question in this domain asks about blinding of study personnel, and for the purposes of this review, this question was clarified to refer to the blinding of animal caregivers.
The overall risk of bias within each domain was calculated based on the algorithms suggested by Higgins et al. (2016), with the exception of the domain related to randomization. For this domain, allocation concealment was not considered as a criterion related to bias due to the randomization process, as all eligible pens and animals are included in swine trials. Furthermore, it is unlikely that a producer or investigator would have any treatment preference for a given pen, as the differential economic value of animals or pens would not be known at the time interventions are allocated. This approach is consistent with the risk of bias assessments of other livestock studies (Moura et al., 2019).

Summary measures
The summary effect size was the log OR, which was converted back to the RR using the baseline risk and a Bayesian method of estimation. For the morbidity outcome, the posterior mean and standard deviation of the baseline risk mean were −2.2448 and 1.4449, and the posterior mean and standard deviation of the baseline risk standard deviation were 1.3853 and 0.2455. For the mortality outcome, the posterior mean and standard deviation of the baseline risk mean were −2.7471 and 0.7528, and the posterior mean and standard deviation of the baseline risk standard deviation were 0.7437 and 0.0779. For the lung lesion outcome, the posterior mean and standard deviation of the baseline risk mean were 1.378 and 1.7094, and the posterior mean and standard deviation of the baseline risk standard deviation were 1.6433 and 0.2941.
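As an illustration of how an OR can be expressed as an RR for a given baseline risk, the sketch below uses the standard identity RR = OR / (1 − p0 + p0 × OR), where p0 is the risk in the comparison group. It is a point-estimate simplification and not the authors' procedure, which was carried out within the Bayesian model. It also assumes that the reported baseline risk mean is on the log-odds scale, which the negative values suggest but which is not stated explicitly; the OR of 0.5 is hypothetical.

```python
import math

def baseline_risk(mean_log_odds):
    """Convert a baseline value on the log-odds scale to a probability."""
    return 1.0 / (1.0 + math.exp(-mean_log_odds))

def or_to_rr(odds_ratio, p0):
    """Risk ratio implied by an odds ratio at comparison-group risk p0."""
    return odds_ratio / (1.0 - p0 + p0 * odds_ratio)

# Posterior mean of the baseline risk mean for morbidity (assumed log-odds scale).
p0_morbidity = baseline_risk(-2.2448)
rr = or_to_rr(0.5, p0_morbidity)   # hypothetical OR of 0.5
print(f"baseline risk = {p0_morbidity:.3f}, RR = {rr:.3f}")
```

When the baseline risk is low, the OR and RR are close; they diverge as the baseline risk increases.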
For studies that had zero cells for some data points, the ORs could not be calculated. Such trials were therefore excluded from the analyses.
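The calculation referred to above can be sketched from arm-level counts as follows. The function is a minimal illustration using the usual large-sample standard error of the log OR; it returns no estimate when any cell of the 2 × 2 table is zero, which mirrors the exclusion rule described here. The counts are hypothetical.

```python
import math

def log_odds_ratio(events_trt, n_trt, events_ctl, n_ctl):
    """Log OR and its standard error from arm-level counts, or None if any cell is zero."""
    a, b = events_trt, n_trt - events_trt   # treated: events, non-events
    c, d = events_ctl, n_ctl - events_ctl   # control: events, non-events
    if 0 in (a, b, c, d):
        return None                         # OR is undefined or infinite
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se

# Hypothetical trial: 10/100 events in vaccinated pigs, 20/100 in controls.
estimate = log_odds_ratio(10, 100, 20, 100)
# Hypothetical trial with no events in the vaccinated arm.
zero_cell = log_odds_ratio(0, 50, 5, 50)
print(estimate, zero_cell)
```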

Planned methods of analysis
After data extraction was complete, a treatment map of all reported interventions was compiled (Table 2). Vaccines for the same bacterial target produced by different companies were considered to be different interventions, because the specific bacterial or viral strains used in the manufacture of the vaccine may differ, or the products may contain different antigens or adjuvants. However, if the same vaccine or intervention arm was administered via different routes or in a different number of doses, then this was considered to be the same intervention. Vaccines with identical bacterial targets that were produced in different countries by the same company were also considered to be part of the same intervention arm. Vaccines that incorporated an additional treatment, such as an immune stimulant, were considered to be separate interventions from those vaccines that did not include the additional treatment. Vaccines containing only viral respiratory targets, non-commercial vaccines, or other identified interventions were included as comparison arms in the MA, but because these were not the focus of this review, they are not shown in the ranked results.
Some studies provided separate morbidity results for more than one stage of production. In order to conduct the NMA, only one set of results was included per trial. To identify the most relevant set of results for each trial, we first prioritized overall results representing the entire production process, followed by results from the grower/finisher stage, followed by results from the nursery stage. The grower phase was prioritized over the nursery phase as most vaccines were given on entry to the nursery and would be expected to take some period of time before immunological protection could be achieved.
The methodological approach for conducting network meta-analyses has been described in detail elsewhere (Dias et al., 2011; O'Connor et al., 2013). For all studies included in the review, results were converted into log ORs for the present analysis. If the authors reported an RR, this was converted into a log OR using the reported risk of disease in the placebo group. When authors reported the probability of an outcome in each intervention group based on a statistical model, that probability was converted to a log OR using a process described elsewhere (Hu et al., 2019).
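The conversion of a reported RR into a log OR follows from the definition of the two measures: the risk in the treated group is the RR multiplied by the placebo-group risk, and the OR is the ratio of the resulting odds. The sketch below illustrates this arithmetic under that assumption; the input values are hypothetical, and the function name is not taken from the cited methods.

```python
import math

def rr_to_log_or(risk_ratio, control_risk):
    """Log OR implied by a risk ratio and the risk in the placebo (control) group."""
    treated_risk = risk_ratio * control_risk
    if not (0.0 < control_risk < 1.0 and 0.0 < treated_risk < 1.0):
        raise ValueError("risks must lie strictly between 0 and 1")
    odds_treated = treated_risk / (1.0 - treated_risk)
    odds_control = control_risk / (1.0 - control_risk)
    return math.log(odds_treated / odds_control)

# Hypothetical trial: RR of 0.5 with a placebo-group risk of 20%.
log_or = rr_to_log_or(0.5, 0.20)
print(f"log OR = {log_or:.3f}, OR = {math.exp(log_or):.3f}")
```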

Selection of prior distributions in Bayesian analysis
The choice of prior probability distributions was based on an approach reported previously (Dias et al., 2011). Accordingly, we assessed both σ ∼ U (0,2) and σ ∼ U (0,5). The results suggested that σ ∼ U (0,5) was preferred, and so we retained this prior in the model.

Implementation and output
All posterior samples were generated using Markov Chain Monte Carlo (MCMC) simulation, which was implemented using Just Another Gibbs Sampler (JAGS) software (Plummer, 2015). All statistical analyses were performed using R software (version 3.5.2) (RCore, 2015). JAGS was called from R via the rjags package (version 4-8) (Plummer, 2015). Three chains were simulated in the analysis, and convergence was assessed using Gelman-Rubin diagnostics. We discarded 5000 'burn-in' model iterations and based all inferences on a further 10,000 model iterations. The model output included all possible pairwise comparisons using log ORs (for inconsistency assessment), RRs (for comparative efficacy reporting), and the treatment failure rankings (for comparative efficacy reporting).
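For readers unfamiliar with the Gelman-Rubin diagnostic, the sketch below computes the basic potential scale reduction factor for a single parameter from three chains, after discarding 5000 burn-in draws and retaining 10,000. It is a simplified version written for illustration (without the degrees-of-freedom correction applied by some software) and is not the code used in this review. The chains are independent random draws standing in for MCMC output.

```python
import random
from statistics import mean, variance

def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [mean(chain) for chain in chains]
    within = mean([variance(chain) for chain in chains])   # W
    between_over_n = variance(chain_means)                 # B / n
    pooled = (n - 1) / n * within + between_over_n
    return (pooled / within) ** 0.5

def simulate_chain(rng, center, n_total=15000, burn_in=5000):
    """Stand-in for one MCMC chain: normal draws with the burn-in removed."""
    draws = [rng.gauss(center, 1.0) for _ in range(n_total)]
    return draws[burn_in:]

rng = random.Random(1)
mixed = [simulate_chain(rng, 0.0) for _ in range(3)]           # same target
stuck = [simulate_chain(rng, c) for c in (0.0, 0.0, 3.0)]      # one chain elsewhere
rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
print(f"agreeing chains: {rhat_mixed:.3f}; disagreeing chains: {rhat_stuck:.3f}")
```

Values close to 1 indicate that the between-chain and within-chain variability agree; values well above 1 indicate that the chains have not converged to a common distribution.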

Assessment of model fit
To assess the fit of the model, we examined the residual deviance between the predicted values of the log ORs from the NMA model and the observed value for each study (Dias et al., 2010).
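As a rough illustration of this check, the sketch below computes each study's residual deviance contribution under a normal likelihood for the observed log OR, that is, the squared standardized difference between the observed and predicted values. This is a simplification: in a Bayesian analysis the contributions would usually be averaged over the posterior draws, whereas the sketch uses a single set of predicted values. All numbers are hypothetical.

```python
def residual_deviance(observed, predicted, std_errors):
    """Per-study and total residual deviance under a normal likelihood."""
    contributions = [
        ((y - mu) / se) ** 2
        for y, mu, se in zip(observed, predicted, std_errors)
    ]
    return contributions, sum(contributions)

# Hypothetical observed log ORs, model-predicted log ORs, and standard errors.
observed = [-0.80, -0.10, 0.30]
predicted = [-0.60, -0.20, 0.10]
std_errors = [0.40, 0.25, 0.50]
per_study, total = residual_deviance(observed, predicted, std_errors)
print(per_study, total)
```

A total that is close to the number of data points suggests an adequate fit, and studies with unusually large contributions merit closer inspection.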

Assessment of inconsistency
NMA relies on an assumption of consistency between direct and indirect intervention effects, apart from the usual variation that stems from a random-effects MA model (White et al., 2012). For example, if one trial compares the direct effect of a treatment A with the effect of treatment B, and another study compares the efficacy of treatments B and C, then the effect of A relative to B and B relative to C can be used to infer the (indirect) effect of A relative to C. The assessment of inconsistency compares whether the direct effects and the indirect effects give a similar result. We used the back-calculation method to assess the consistency assumption in our NMA (Dias et al., 2010). We compared the estimates from the direct and indirect models and considered the standard deviation of each estimate, rather than relying on the P-values.
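The idea behind back-calculation can be sketched as follows. If the network estimate for a comparison is treated as a precision-weighted average of the direct and indirect estimates, then the indirect estimate can be recovered from the network and direct estimates. The code is an illustration under that assumption, with hypothetical inputs and function names; it is not the implementation used for this review.

```python
import math

def back_calculate_indirect(d_net, se_net, d_dir, se_dir):
    """Recover the indirect estimate from the network and direct estimates."""
    prec_ind = 1 / se_net**2 - 1 / se_dir**2
    if prec_ind <= 0:
        raise ValueError("network estimate must be more precise than the direct one")
    d_ind = (d_net / se_net**2 - d_dir / se_dir**2) / prec_ind
    return d_ind, math.sqrt(1 / prec_ind)

def inconsistency(d_dir, se_dir, d_ind, se_ind):
    """Difference between direct and indirect estimates and its standard deviation."""
    return d_dir - d_ind, math.sqrt(se_dir**2 + se_ind**2)

# Hypothetical network and direct log ORs for one comparison.
d_ind, se_ind = back_calculate_indirect(d_net=-0.392, se_net=0.24, d_dir=-0.5, se_dir=0.3)
diff, sd_diff = inconsistency(-0.5, 0.3, d_ind, se_ind)
print(f"indirect = {d_ind:.3f} (SD {se_ind:.3f}); difference = {diff:.3f} (SD {sd_diff:.3f})")
```

Comparing the difference with its standard deviation, rather than with a P-value threshold, corresponds to the approach described above.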

Risk of bias across studies (across the network)
To describe the quality of the evidence network, a modification of the Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach for NMA was used (Salanti et al., 2014; Papakonstantinou et al., 2018). The GRADE framework provides a method of evaluating the quality or certainty of evidence and the strength of the recommendations derived from that evidence. The GRADE for NMA was conducted using the Confidence in Network Meta-Analysis (CINeMA) online software (http://cinema.ispm.ch). CINeMA calculates intervention effects using a frequentist approach, which is based on the metafor package in R (Viechtbauer, 2010). The contribution matrix of direct and indirect evidence for the risk of bias was based on this analysis. The GRADE approach in CINeMA examines six different domains: within-study bias, across-studies bias, indirectness, imprecision, heterogeneity, and incoherence.
Rather than presenting indirectness and overall within-study bias, we replaced these domains with an evaluation of the contribution of studies based on their reporting of randomization and blinding. We believed that randomization and blinding would be more variable across studies compared to overall bias risk, and would therefore be more informative. In a GRADE assessment for NMA, indirectness refers to the differences between the populations, interventions, and outcomes in the included studies and the populations, interventions, and outcomes that were the target of the NMA (Salanti et al., 2014). We felt that all of the trials included in our NMA would have minimal indirectness because the review was restricted to relevant populations. We recognize that the swine population varies with regard to management, herd health practices, and prevalent pathogens. However, these data may not be consistently reported in trial reports, and attempting to extract this information was therefore beyond the scope of this review.

Jan M. Sargeant et al.
To characterize randomization as reported in each of the included studies, we sorted each trial into one of three categories: (1) the authors reported random allocation to intervention group(s) and provided information on the random sequence generation (low risk); (2) the authors reported random allocation to intervention group(s) without providing information on how the sequence was generated (some concerns); and (3) the allocation method was non-random or no information on the method of allocation was provided (high risk). For blinding, we categorized the trials based on reporting of the following: (1) outcome assessors and caregivers were both blinded (low risk); (2) either outcome assessors or caregivers were blinded (some concerns); and (3) neither outcome assessors nor caregivers were blinded or no information was provided (high risk).
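The two categorization rules above can be written as simple decision functions. The sketch below is illustrative; the function and argument names are hypothetical, and a value of None stands for 'no information provided', which the rules treat in the same way as a negative answer.

```python
def randomization_risk(reported_random, sequence_method_described):
    """Three-level rating based on how random allocation was reported."""
    if reported_random and sequence_method_described:
        return "low risk"
    if reported_random:
        return "some concerns"
    return "high risk"        # non-random, or allocation method not reported

def blinding_risk(assessors_blinded, caregivers_blinded):
    """Three-level rating based on who was reported as blinded."""
    n_blinded = int(bool(assessors_blinded)) + int(bool(caregivers_blinded))
    return {2: "low risk", 1: "some concerns", 0: "high risk"}[n_blinded]

# Hypothetical trial: described as 'random' without a method; only assessors blinded.
print(randomization_risk(True, False), blinding_risk(True, None))
```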
The process for assessing across-studies bias in an NMA is not well developed. Further, no pairwise comparisons in this review included more than 10 trials, which is the number typically believed to be necessary for an accurate across-studies bias assessment (Sterne et al., 2000). Therefore, across-studies bias was not evaluated.
The imprecision assessment portion of the GRADE analysis examines whether the boundaries of the confidence intervals for the intervention effect indicate a clinically appreciable benefit or harm, or whether the intervention effects are clinically ambiguous (i.e. the confidence intervals span values representing both benefits and harms, or benefit or harm and a null value). For this analysis, we used an OR cut-point of 0.8 to represent a clinically meaningful OR (i.e. an OR<0.8 represents appreciable benefit and an OR>1.25, the reciprocal of 0.8, represents appreciable harm). For an NMA, the major impact of heterogeneity is whether it will affect decision-making; heterogeneity is thereby judged by the variability in effects in relation to a clinically important effect size. Thus, ORs of 0.8 and 1.25 were also used in the assessment of heterogeneity. We did not present the results of the incoherence analysis from CINeMA, which measures the consistency of the network, because we conducted and presented the consistency analysis results based on the Bayesian analysis described previously in the Methods section.
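One way to read the cut-points is as a rule for classifying a confidence interval for the OR. The sketch below is a simplified illustration of that reading and does not reproduce the rating algorithm implemented in CINeMA; the category labels and the example intervals are hypothetical.

```python
def classify_interval(lower, upper, benefit=0.8, harm=1.25):
    """Classify an OR confidence interval against clinically important cut-points."""
    if upper < benefit:
        return "appreciable benefit"      # whole interval below 0.8
    if lower > harm:
        return "appreciable harm"         # whole interval above 1.25
    if lower >= benefit and upper <= harm:
        return "no appreciable effect"    # whole interval between the cut-points
    return "ambiguous"                    # interval crosses a cut-point

# Hypothetical 95% confidence intervals for four comparisons.
examples = [(0.40, 0.70), (0.60, 1.10), (1.30, 2.00), (0.90, 1.10)]
labels = [classify_interval(low, high) for low, high in examples]
print(labels)
```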

Additional analyses
No additional analyses were conducted.

Study selection
The database and grey literature searches identified 2157 unique citations, 1621 of which were excluded at the title/abstract eligibility screening stage, and an additional 390 were excluded at full-text screening (Fig. 1). The excluded citations included 81 challenge trials and eight observational studies that evaluated bacterial vaccines in swine and examined at least one eligible outcome. Following the full-text screening, there were 179 trials from 146 publications eligible for data extraction.

Study characteristics
The 179 trials were conducted in 29 countries; the most commonly reported country was France (N = 19/179; 10.6%), followed by Belgium (N = 11/179; 6.1%). The country in which the trial was conducted was not reported for 37/179 (20.7%) trials. The year when the trial was conducted was not reported in most studies (155/179; 86.6%). Trials in which the year was reported were conducted between 1981 and 2013. The majority (165/179; 92.2%) of the trials were conducted on commercial farms, with the remainder either conducted in research herds (6/179; 3.4%) or in an unreported setting (8/179; 4.5%). Trials were conducted on a single farm for 132/179 (73.7%) of the trials, with the number of farms within a trial ranging from 1 to 16. The number of farms enrolled in the trial was not reported in 7/179 trials (3.9%). There were two key reasons for vaccinating reported in the trials: the presence of endemic disease (125/179; 69.8%), or the prevention of clinical disease (38/179; 21.2%). Vaccinating because of a disease outbreak was not reported for any trial, and the reason for vaccinating was not reported in 16/179 trials (8.9%). Full details on the characteristics of the 179 trials that were included in the review are available in Supplementary file Table S1.
Complete outcome data could not be extracted from 44 trials because insufficient information was reported in the study; reasons included continuous outcomes reported as means without a measure of variability, results presented only in figures, and proportions provided without stating a denominator. As a result, data for the NMA were extracted from only 135 trials. Of the 135 trials for which data could be extracted, 26 co-mingled pigs receiving different treatments within pens, 30 did not, and in 79 it was not possible to determine whether treatments were co-mingled within pens.
Due to the large number of studies and outcomes, the remainder of the results focus on the three most common outcomes: clinical morbidity, mortality, and the presence or absence of non-specific lung lesions. In addition, cough index and total antibiotic use were each measured in three trials.

Risk of bias within studies by outcome
Risk of bias within each of the studies included in the NMA was determined using the Cochrane risk of bias domains for each of the three outcomes for which NMA were conducted. The results are available in the Supplementary material Figs S1-3. The ratings indicating 'some concerns' about the possible presence of bias were primarily the result of non-reporting of key trial design features. The probability of bias due to intervention group allocation was rated as high for approximately 40-60% of the trials, depending on the outcome. A high rating of bias under this domain was driven by studies where allocation to treatment group was not random or where the authors provided no information about the method of allocating pigs to different treatment groups, and where there were imbalances between the treatment groups at baseline. Across all outcomes, trials were categorized from low to high risk of bias due to deviations from the intended interventions and missing outcome data. For mortality, all studies were at a low risk of bias from outcome measurement, because outcome assessors were not likely to be biased in their recordings of mortality regardless of whether they were blinded. In contrast, the risk of bias related to outcome measurement in trials reporting morbidity or lung lesions ranged from low risk to high risk, because of the potential for knowledge of intervention status to influence assessors' measurement of the subjective outcome. For bias due to selective reporting of outcomes, all trials for all outcomes were assigned a rating of 'some concerns' about the presence of bias because an a priori trial protocol is needed to judge whether a trial is at a high or low risk of bias in this domain, and a priori protocols were not available for any trial.

Study results
Not all of the trials that measured a given outcome were included in the NMA because some intervention arms were collapsed into a single category, some trials did not provide a description of one or more vaccine interventions, and some trials had zero cells that precluded the calculation of an OR. In addition, only a small number of trials assessed some of the outcomes, and the NMA would be uninformative due to wide confidence intervals if there was only a small number of contributing trials. As a result, NMA was only conducted for the three most commonly reported outcomes: clinical morbidity (n = 20 trials contributing to the network), mortality (n = 54 trials contributing to the network), and presence/absence of non-specific lung lesions (n = 27 trials contributing to the network) (Fig. 1).
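To illustrate why zero cells preclude a ratio estimate, the sketch below computes a risk ratio with a 95% Wald interval on the log scale and returns `None` when either arm has no events. The function name and the counts in the usage note are hypothetical, and returning `None` simply mirrors the exclusion described above rather than any continuity correction; the same problem arises for an OR.

```python
import math

def risk_ratio(events_trt, n_trt, events_ctl, n_ctl):
    """Return (RR, lower, upper) using a 95% Wald interval on the log scale.

    Returns None when either arm has zero events, because log(RR) and its
    standard error are then undefined.
    """
    if events_trt == 0 or events_ctl == 0:
        return None
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    # Standard error of log(RR) for two independent binomial arms.
    se = math.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    lower = math.exp(math.log(rr) - 1.96 * se)
    upper = math.exp(math.log(rr) + 1.96 * se)
    return rr, lower, upper
```

With hypothetical counts of 5/100 events in a vaccinated arm and 10/100 in a control arm, `risk_ratio(5, 100, 10, 100)` gives an RR of 0.5 with an interval around it, whereas `risk_ratio(0, 100, 10, 100)` returns `None`.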

Incidence of clinical morbidity
Structure of the network
Of the 24 trials that measured clinical morbidity, three did not describe the specific vaccine used in one or more intervention arms and one trial had no animals with morbidity in one or more intervention groups, and therefore a RR could not be calculated. Thus, there were 20 trials with 27 treatment comparisons that reported a clinical morbidity outcome and contributed to the NMA. The network of intervention comparisons for the clinical morbidity outcome is shown in Fig. 2; the number of comparisons for each intervention is shown in parentheses beside the intervention node, and the abbreviations are defined in Table 2. Although some vaccine-to-vaccine comparisons were present in the network, a non-active treatment (not treated, placebo, or sham treatment) was the most common intervention arm reported. The network was generally sparse (small number of trials contributing to most comparisons), and the predominance of some intervention arms suggested a non-random distribution of intervention evaluations. None of the trials reported any statistical adjustments for pen effects (i.e. statistical non-independence of pigs within pens). Although data from all intervention arms with clinical morbidity outcomes informed the NMA, relative results for vaccine efficacy are presented only for eligible vaccines with a bacterial pathogen target and placebo controls. For the group of trials with morbidity as an outcome, comparative results are presented for 14 vaccines plus a non-active intervention group.

Results of individual studies and synthesis of results
RRs for all pairwise comparisons for the bacterial vaccines, including comparisons to a non-active control, are available in the Supplementary material Table S2. Figure 3 illustrates the relative ranking of each eligible vaccine for preventing morbidity along with 95% credibility intervals; the mean rankings are available in the Supplementary material Table S3. For the majority of interventions, the credibility intervals were wide and overlapping; of the 14 vaccine intervention arms, six represented a single direct comparison, six represented two comparisons, and two intervention arms were represented by more than two comparisons. Based on the relative ranking, Ingelvac HP-1 and Respisure/Respisure One ranked as more efficacious than a non-active control. However, these results were based on a small number of comparisons, and therefore these relative rankings should be interpreted with caution. These results are consistent with the distribution of the probability of failure (morbidity) for each vaccine (see Supplementary material Fig. S4), where there are no substantial differences in the probability of failure (morbidity) between vaccine interventions.
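One way to see how relative rankings and their intervals can be derived is to rank the interventions within each posterior draw and then summarize the ranks. The sketch below assumes posterior draws of an effect where smaller is better (for example, a log RR versus a common reference); the function name, the toy draws, and the simple index-based percentile are assumptions for illustration and are not the authors' code.

```python
from statistics import mean

def rank_summary(draws):
    """Summarize per-draw ranks for each intervention.

    draws maps an intervention name to a list of posterior draws of an effect
    where smaller is better. Returns {name: (mean_rank, lower, upper)}, where
    lower and upper are taken by index at roughly the 2.5% and 97.5% positions
    of the sorted ranks (rank 1 = best).
    """
    names = list(draws)
    n_draws = len(draws[names[0]])
    if any(len(draws[name]) != n_draws for name in names):
        raise ValueError("all interventions need the same number of draws")
    ranks = {name: [] for name in names}
    for i in range(n_draws):
        # Order interventions from smallest (best) to largest effect in draw i.
        ordered = sorted(names, key=lambda name: draws[name][i])
        for position, name in enumerate(ordered, start=1):
            ranks[name].append(position)
    summary = {}
    for name in names:
        ordered_ranks = sorted(ranks[name])
        lower = ordered_ranks[int(0.025 * (n_draws - 1))]
        upper = ordered_ranks[int(0.975 * (n_draws - 1))]
        summary[name] = (mean(ordered_ranks), lower, upper)
    return summary

# Hypothetical draws for two unnamed vaccines and a non-active control.
example_draws = {
    "control": [0.0, 0.0, 0.0, 0.0],
    "vaccine_A": [-0.5, -0.2, 0.1, -0.3],
    "vaccine_B": [-0.1, -0.4, -0.2, 0.2],
}
example_summary = rank_summary(example_draws)
```

With so few draws and overlapping effects, the two hypothetical vaccines end up with the same mean rank, which echoes the caution above about rankings built on a small number of comparisons.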
Measures of the consistency of the direct and indirect comparisons for morbidity as an outcome are available in the Supplementary material Table S4. There was no evidence of inconsistency between the direct and indirect estimates because the credible intervals for all comparisons included the null value of no difference between the two values.

Mortality
Structure of the network
Of the 81 trials that measured mortality, 14 did not describe the specific vaccine used in one or more intervention arms, eight did not have extractable data, two applied the interventions in sows and measured the outcome in offspring, two did not have an intervention arm that linked to the network (i.e. was not common to an intervention arm in any other trials), and one trial had no mortality in one or more intervention groups, and therefore a RR could not be calculated. Thus, there were 54 trials that reported mortality as an outcome and contributed to the NMA. None of the trials included adjustments for pen effects. The network for all intervention arms that included estimates of mortality is shown in Fig. 4; the number of comparisons involving each intervention is shown in parentheses beside the intervention node and the acronyms are defined in Table 2. More intervention arms contributed to this network than to the morbidity outcome network. The most common intervention arm in this network also was the non-active treatment arm, and most of the trials included comparisons between a vaccine and a non-active treatment arm, rather than vaccine-to-vaccine comparisons.

Results of individual studies and synthesis of results
Relative results are presented only for eligible vaccines and placebo controls. For mortality as an outcome, comparative results are presented for 28 vaccines plus a non-active intervention group.
RRs for all pairwise comparisons for the bacterial vaccines, including comparisons to a placebo, are available in the Supplementary material Table S5. Figure 5 shows the relative ranking of each eligible vaccine for preventing mortality, with 95% credibility intervals. The three vaccines with the highest average ranks were Ingelvac HP-1 and Respisure, which both target M. hyopneumoniae, and Porcilis APP, which targets A. pleuropneumoniae. However, the results for Ingelvac HP-1 and Respisure are each based on only one trial. The mean rankings are available in the Supplementary material Table S6 and the distribution of the probability of failure (mortality) for each vaccine is available in Supplementary material Fig. S5.
An evaluation of the consistency of the direct and indirect comparisons for mortality as an outcome is available in Supplementary material Table S7. Similar to the results for the clinical morbidity outcome, there was no evidence of inconsistency between direct and indirect estimates.

282
Jan M. Sargeant et al.

Presence or absence of non-pathogen-specific lung lesions at slaughter
Structure of the network
Of the 137 trials that measured lung lesions at slaughter, 39 evaluated lesions specific to a bacterial pathogen and 14 presented the results using a continuous outcome scale. Thus, there were 84 trials that measured the presence or absence of non-specific lung lesions at slaughter. Of these, 40 did not present the data in a form that could be extracted, 12 did not provide a description of one or more intervention arms, four trials had zero cells thereby precluding a calculation of a RR, and one trial did not link to the network. Therefore, there were 27 trials that reported the presence or absence of non-specific lung lesions at slaughter as an outcome that contributed to the NMA. The network for all intervention arms for which non-specific lung lesions were reported is shown in Fig. 6; the number of comparisons involving each intervention is shown in parentheses beside the intervention node and the acronyms for the vaccine labels are defined in Table 2. Again, the most common intervention was a non-active treatment. None of the trials included an adjustment for pen effects.
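The exclusion counts reported in the three "Structure of the network" paragraphs can be tallied to confirm the number of trials contributing to each network. The sketch below uses only counts stated in the text; the helper name and the dictionary labels are illustrative.

```python
def remaining_after_exclusions(measured, exclusions):
    """Subtract the exclusion counts from the number of trials measured."""
    remaining = measured - sum(exclusions.values())
    if remaining < 0:
        raise ValueError("exclusions exceed the number of trials measured")
    return remaining

# Clinical morbidity: 24 trials measured the outcome.
morbidity = remaining_after_exclusions(
    24, {"vaccine not described": 3, "zero cell": 1}
)

# Mortality: 81 trials measured the outcome.
mortality = remaining_after_exclusions(
    81,
    {
        "vaccine not described": 14,
        "data not extractable": 8,
        "intervention in sows, outcome in offspring": 2,
        "not linked to network": 2,
        "zero cell": 1,
    },
)

# Lung lesions: 137 trials, first narrowed to presence/absence of
# non-specific lesions, then to trials contributing to the network.
binary_nonspecific = remaining_after_exclusions(
    137, {"pathogen-specific lesions": 39, "continuous scale": 14}
)
lung_lesions = remaining_after_exclusions(
    binary_nonspecific,
    {
        "data not extractable": 40,
        "intervention not described": 12,
        "zero cell": 4,
        "not linked to network": 1,
    },
)

print(morbidity, mortality, binary_nonspecific, lung_lesions)  # 20 54 84 27
```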

Results of individual studies and synthesis of results
Results are presented for the 18 eligible vaccines that contributed to the network plus a non-active intervention. RRs with 95% confidence intervals for all pairwise comparisons are available in the Supplementary material Table S8. Figure 7 shows the relative ranking of each eligible vaccine for preventing lung lesions, with 95% credibility intervals. As with clinical morbidity and mortality, several vaccines appeared to be more efficacious at preventing lung lesions than a non-active treatment; however, eligible vaccine rankings were generally based on a small number of comparisons. The mean rankings are available in the Supplementary material Table S9. These results are consistent with the distribution of the probability of failure (presence of lung lesions) for each vaccine (see Supplementary material Fig. S6). The analysis of the consistency between the direct and indirect comparisons for non-specific lung lesions as an outcome is shown in Supplementary material Table S10. Similar to the results for morbidity and mortality outcomes, there was no evidence of inconsistency between the direct and indirect estimates.
Fig. 2. The network of intervention arms in a network meta-analysis of the relative efficacy of bacterial vaccines to prevent morbidity in swine. The size of the circle provides a relative indication of the number of intervention arms, the width of the line provides a relative indication of the number of direct comparisons between interventions that were reported in the literature, and the number of arms for each intervention is shown in parentheses beside the intervention node.

Risk of bias across studies
There were a large number of possible combinations of comparisons between intervention arms, and so we present the risk of bias results for only those comparisons between eligible vaccines. Estimates of the risk of bias across studies are presented for the outcomes of clinical morbidity, mortality, and the presence or absence of non-specific lung lesions in Supplementary Tables S11-13, respectively. The results focus on the risk of bias due to the domains of randomization, blinding, imprecision, and heterogeneity. For each of the three outcomes, the contribution of studies to the RR for each comparison between the eligible vaccines was calculated both based on the approach to randomization and based on the approach to blinding; these results are presented in Supplementary material Figs S7-12.
Bias due to imprecision was a major concern for many of the comparisons under each of the three outcome headings; this reflects the large confidence intervals surrounding the effect estimates, which resulted from the small number of trials contributing to each comparison. There were no comparisons with major concerns about bias due to heterogeneity for any of the outcomes. This result was expected based on the wide confidence intervals on the RRs, which again were due to the small number of studies contributing to any given comparison. Across all outcomes, the risk of bias due to the randomization process was high or unclear for a substantive component of the evidence for many intervention comparisons. Most of the evidence contributing to the treatment comparisons was derived from studies with a high risk of bias associated with the lack of blinding of outcome assessors and livestock caregivers.

Summary of evidence
Despite a seemingly large number of trials evaluating one or more bacterial vaccines for swine respiratory disease, there was an insufficient body of evidence available from which to draw firm, evidence-based conclusions about the relative efficacy of available commercial vaccines. Several factors exacerbated the lack of data: there was little replication of each specific vaccine intervention in the existing literature, with only one or two trials contributing to most estimates; the metrics and measurements used for outcomes varied considerably across trials; and small sample sizes in trials investigating rare outcomes resulted in zero cells, rendering it difficult to calculate effect sizes.

Limitations of the body of literature
Clinical trials are fundamental to understanding the efficacy of an intervention. However, a single trial does not provide a definitive answer to a clinical question, as the results of a single trial represent only one random result from a distribution of possible results. Even when trials are addressing identical research questions, differences in the results between trials may occur due to subtle differences in the populations, interventions, or outcomes studied, or due to differences in the disease exposure experience of the experimental animals. Thus, replication of studies is essential to understand the true efficacy of an intervention.
Fig. 4. The network of intervention arms in a network meta-analysis of the relative efficacy of bacterial vaccines to prevent mortality in swine. The size of the circle provides a relative indication of the number of intervention arms, the width of the line provides a relative indication of the number of direct comparisons between interventions that were reported in the literature, and the number of arms for each intervention is shown in parentheses beside the intervention node.
In creating the network of vaccine interventions, we combined vaccines produced by the same company for the same bacterial target, even if the dose, frequency, or route of administration differed. This was done to increase the power of the network, although it introduced some heterogeneity into the intervention definition. Nonetheless, even after some vaccine interventions were combined, the majority of intervention arms were represented in only one or two trials. In addition, a non-active treatment arm was the most common comparison group, meaning that most of the vaccine-to-vaccine comparisons depended on indirect evidence, which is not as strong as direct evidence. The networks of interventions presented for each outcome may assist researchers in identifying which direct vaccine-to-vaccine or vaccine-to-placebo studies could be undertaken in the future to help build a more robust network for evaluating the relative efficacy of these vaccines.
There also was considerable variation in the methods and metrics used to measure morbidity and lung lesions, as well as variation in the outcomes that were reported in different trials. When the same outcome is measured in different ways (e.g. morbidity based on clinical signs versus morbidity based on antibiotic treatment requirements), heterogeneity is introduced into the results, thus decreasing the precision of the efficacy estimates. Variation in the outcomes reported across trials also reduces the power of evidence synthesis, because different outcomes (e.g. non-specific lung lesions versus lung lesions specific to a particular pathogen) cannot be meaningfully combined in the same MA. One possible solution is to develop and use a core set of outcomes for all trials addressing a given health issue, such as the prevention of respiratory disease in swine. This approach has been used for trials investigating several different human health care issues (for examples of recent protocols for developing core outcome sets, see Wuytack et al. (2018) and O'Donnell et al. (2019)). Guidelines for developing core outcome sets have been recently published by the Core Outcome Measures in Effectiveness Trials (COMET) initiative (Williamson et al., 2017). Creation of a core outcome set does not restrict researchers from reporting other outcomes in their trial publications, but rather provides a recommendation for a minimum set of outcomes to be included in all trials, thus allowing trial results to be compared or synthesized across studies.
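Returning to the point that most vaccine-to-vaccine comparisons relied on indirect evidence through the non-active arm, a simple adjusted indirect comparison (often called the Bucher method) shows why such evidence is less precise. This is a swapped-in frequentist illustration, not the Bayesian NMA used in this review, and the input values in the usage note are hypothetical.

```python
import math

def indirect_log_rr(log_rr_a, se_a, log_rr_b, se_b):
    """Indirect estimate of vaccine A versus vaccine B via a common control.

    log_rr_a and log_rr_b are the log RRs of A and B, each versus the same
    non-active control, with standard errors se_a and se_b. The variances
    add, so the indirect standard error is at least as large as either input.
    """
    estimate = log_rr_a - log_rr_b
    se = math.sqrt(se_a ** 2 + se_b ** 2)
    return estimate, se
```

For instance, with hypothetical RRs of 0.6 (SE of the log RR 0.3) and 0.8 (SE 0.4) versus control, the indirect RR of A versus B is 0.75 with an SE of 0.5 on the log scale, larger than either direct standard error.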
The network of studies would have been considerably larger had we included challenge trials as an eligible study design. Challenge trials are often conducted during the development and assessment phases of an intervention to provide proof of concept for efficacy. However, they differ from trials with natural disease exposure in that the disease challenge may not be representative of the disease exposure affecting animals in field conditions, and challenge trials often are conducted in more controlled settings than are typical in most livestock facilities (Sargeant et al., 2014a, 2014b). Challenge trials also are more likely to indicate beneficial treatment effects compared to natural exposure trials investigating the same intervention(s) (Egger et al., 1997; Wisener et al., 2014). This is partially related to publication bias, because challenge trials tend to involve smaller numbers of animals than natural exposure trials and studies with small sample sizes also are more likely to be published if they show statistically significant results. Therefore, the published results of challenge trials may not be representative of the true range of results from those trials (Egger and Smith, 1998). For these reasons, we did not include challenge studies in this review.
We included in the present analysis all trials that met the eligibility criteria, regardless of whether or not other strategies to prevent respiratory disease were in place. Thus, the included trials could be evaluating whether vaccines worked as a single preventive strategy or whether there was an added benefit beyond other preventive measures already in place, such as biosecurity, all-in all-out management, or prophylactic antibiotics. As the volume of research on vaccine efficacy grows, it may be important to consider this difference and to clearly report any additional disease prevention measures that affect the animals in the trial. Although the pigs in all of the included studies were housed in pens, none of the studies included an adjustment for the non-independence of pigs within pens. This is a necessary step regardless of whether or not animals in different intervention groups are co-mingled within pens. When non-independence or clustering within pens is not accounted for in the analysis, the confidence intervals will be inappropriately narrow and the P-values will be inappropriately small. This in turn might lead researchers to overestimate the precision and increase the probability of a type I error (i.e. there appears to be a significant association when one does not exist) (Schukken et al., 2003). Statistical methods are available to control for non-independence, and should be applied in all studies in which animals are grouped or multiple farms are included.
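The effect of ignoring clustering within pens can be illustrated with the standard design effect, 1 + (m − 1) × ICC, where m is the number of pigs per pen and ICC is the intra-class correlation. The sketch below assumes equal pen sizes; the function names and the example values of m and ICC are hypothetical and are not estimates from the included trials.

```python
import math

def design_effect(pigs_per_pen, icc):
    """Variance inflation from clustering, assuming equal pen sizes."""
    return 1 + (pigs_per_pen - 1) * icc

def adjusted_se(naive_se, pigs_per_pen, icc):
    """Inflate a standard error that was computed as if pigs were independent."""
    return naive_se * math.sqrt(design_effect(pigs_per_pen, icc))
```

With a hypothetical 20 pigs per pen and an ICC of 0.05, the design effect is 1.95, so a standard error computed under independence would need to be multiplied by roughly 1.4; an unadjusted analysis would therefore report confidence intervals that are too narrow.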
Additionally, this review included both trials in which individual pigs were allocated to each treatment group (with or without co-mingling within pens), as well as those with allocation to intervention occurring at the pen level. If trial pens are grouped within barns, then it is possible that pigs in different intervention groups are sharing the same air space. This may mean that pigs in a non-active treatment group are afforded some protection from infection due to herd immunity, which could bias vaccine efficacy results toward the null.
In addition to the challenge of insufficient data for meaningful synthesis, there were also issues related to potential biases in the trials that were included in the review. In some cases, trial reports did not include key design features such as random allocation to treatment groups and the blinding of study personnel and animal caregivers to intervention status. In other instances, information related to the key features needed to assess the potential for bias was not provided in the trial reports, making it impossible to assess bias risks. This is not an uncommon issue in the animal health literature; deficiencies in reporting have been documented in numerous studies (Wellman and O'Connor, 2007; Burns and O'Connor, 2008; Sargeant et al., 2009a, 2009b; Brace et al., 2010; Winder et al., 2019). Poor reporting of study design features has been associated with exaggerated treatment effects in numerous studies in both the animal and human health literature (Moher et al., 1998; Burns and O'Connor, 2008; Sargeant et al., 2009a, 2009b). In response to concerns about reporting deficiencies in livestock trials, the REFLECT statement was developed by an expert consensus process. The REFLECT statement consists of a 22-item checklist to provide guidance on the components of trials that should be reported in publications, as well as an explanation and elaboration document that provides additional information about each item on the checklist. The REFLECT statement methods and elaboration publications were co-published in multiple journals (O'Connor et al., 2010c, 2010d; Sargeant et al., 2010a, 2010b) and also are available online (http://www.reflect-statement.org/; https://meridian.cvm.iastate.edu/). Improved reporting of trial design, methods, and results will help readers to judge the validity of trial results and to make meaningful comparisons across studies.

Limitations of the review
Although we attempted to conduct a comprehensive search, it is possible that not all of the relevant existing literature was captured. We used the names of bacterial respiratory pathogens and broad terms for vaccination, but did not include the commercial names of specific vaccines, as these may vary over time and by country; thus, we may have missed trial reports that used only a vaccine name without mentioning that it was a respiratory pathogen vaccine. Additionally, our review included only English-language articles. Therefore, the results of this review may not be representative of the entire body of literature assessing bacterial vaccines for respiratory disease in swine. Further, we collapsed intervention arms and combined similar outcomes to improve the number of treatment and outcome replications included in the NMA. There may have been some misclassification of the morbidity outcome, in that not all trials provided information as to whether all treated animals were treated specifically for respiratory disease. It is possible that our decision rules for combining interventions or outcomes impacted our results. If there were any substantive differences in the treatments or outcomes that were combined in the NMA, the differential impacts of those variations would be obscured. However, we attempted to be transparent in terms of how the data were manipulated for the analysis, allowing the reader to consider whether they agree with the grouping decisions.
Fig. 7. Ranking forest plot for intervention arms evaluating non-specific lung lesions at slaughter as an outcome for the efficacy of bacterial vaccines for swine respiratory disease. Relative ranking and 95% credibility intervals are shown.

Conclusions
This review identified 179 trials evaluating vaccines for bacterial causes of respiratory disease in swine. However, because of the variability in outcome measurements and the small number of trials replicating each of the vaccine interventions, the comparative efficacy results from the NMA had wide confidence intervals. In many cases, the confidence intervals spanned values indicating both appreciable benefits and appreciable harms due to the vaccine intervention, and so clear conclusions about the efficacy of vaccines to manage respiratory disease in swine could not be made. There were also deficiencies in the reporting of key design features in many of the trials, which resulted in a high or unclear risk of bias. The limitations for research synthesis identified by this review were a function of the body of work rather than the synthesis methods. The same limitations would apply to an expert opinion or narrative review based on this body of work, although they may not be as apparent or transparent as they are when systematic review methods are employed. Future research could use the networks presented in this review to target gaps in vaccine trial replication to build a more robust body of literature. Adhering to recommendations for the reporting of livestock trials will improve the comparability of studies and the assessment of the potential for bias in trial results.
Supplementary material. The supplementary material for this article can be found at https://doi.org/10.1017/S1466252319000173
Author contributions. JMS developed the review protocol, coordinated the project team, assisted with data analysis and interpretation of the results, and wrote the manuscript drafts. BD, MB, KC, KD, JD, CM, MR, and BW conducted relevance screening, extracted data, conducted risk of bias assessments, commented on manuscript drafts, and approved the final manuscript version. DH conducted the data analysis, provided guidance for the interpretation of the results, commented on manuscript drafts, and approved the final manuscript. CW assisted with the development of the review protocol, provided guidance on the conduct of the analysis and interpretation of the results, and approved the final manuscript draft. AMOC assisted with the development of the review protocol, provided guidance on the interpretation of the results, commented on the manuscript drafts, and approved the final manuscript draft. TOS assisted with the development of the review protocol, provided guidance on the interpretation of the results, commented on the manuscript drafts, and approved the final manuscript. CBW assisted with the development of the review protocol, co-coordinated the research team, assisted with data screening, data extraction, and risk of bias assessment, provided guidance on the interpretation of the results, commented on the manuscript drafts, and approved the final manuscript draft.
Financial support. Support for this project was provided by The Pew Charitable Trusts.
Conflict of interest. None of the authors has conflicts to declare.