RELATIVE EFFECTIVENESS IN BREAST CANCER TREATMENT: A HEALTH PRODUCTION APPROACH

Background: Pharmaceuticals’ relative effectiveness has come to the fore in the policy arena, reflecting the need to understand how relative efficacy (what can work) translates into added benefit in routine clinical use (what does work). European payers and licensing authorities assess value for money and post-launch benefit–risk profiles, and efforts to standardize assessments of relative effectiveness across the European Union (EU) are under way. However, the ways that relative effectiveness differs across EU healthcare settings are poorly understood. Methods: To understand which factors influence differences in relative effectiveness, we developed an analytical framework that treats the healthcare system as a health production function. Using evidence on breast cancer from England, Spain, and Sweden as a case study, we investigated the reasons why the relative effectiveness of a new drug might vary across healthcare systems. Evidence was identified from a literature review and national clinical guidance. Results: The review included thirteen international studies and thirty country-specific studies. Cross-country differences in population age structure, deprivation, and educational attainment were consistently associated with variation in outcomes. Screening intensity appeared to drive differences in survival, although the impact on mortality was unclear. Conclusions: The way efficacy translates into relative effectiveness across health systems is likely to be influenced by a range of complex and interrelated factors. These factors could inform government and payer policy decisions on ways to optimize relative effectiveness, and help increase understanding of the potential transferability of data on relative effectiveness from one health system to another.

Relative effectiveness can be defined as "the extent to which an intervention does more good than harm compared with one or more alternative interventions under the usual circumstances of healthcare practice" (1). This contrasts with relative efficacy, a comparison "under ideal circumstances" that is usually associated with controlled clinical trials (2). "Comparative effectiveness" is closely related to relative effectiveness (3). Towse et al. (3) propose an analytical framework, which draws upon production function theory, that describes how certain sets of inputs and processes yield specified outcomes. The aim is to systematically identify and quantify the potential determinants of relative effectiveness. This study reports a first assessment of the framework to help understand the contextual differences between countries that could be associated with differences in effectiveness and relative effectiveness. In recognition of ongoing efforts to develop European Union (EU)-level approaches to assessment, our case study focuses on breast cancer in three countries in the EU.
Research funding was received from Pfizer. The views expressed here are those of the authors and not necessarily those of the funder. We thank all of our anonymous referees for their constructive comments.

OBJECTIVE
To highlight potential cross-country differences in the relative effectiveness of a new drug, we reviewed studies investigating reasons for differences in health outcomes in breast cancer. We also reviewed relevant national clinical guidelines and health technology assessment (HTA) reports to understand similarities and differences in the management of breast cancer. We show how our analytical framework can help to understand the factors that might drive differences in relative effectiveness across different settings.

METHODS
In a separate study in this issue (3), we set out an analytical framework that uses a health production function approach, with health as the output of interest (4). Inputs ("factors" or "determinants") are classified according to the level at which they operate: patient level (i.e., individuals' clinical or sociodemographic characteristics); provider level; and the level of the healthcare environment or system. The relative effectiveness of a drug is the additional net output (health) achieved by adding a new drug to usual care or substituting it for another treatment.
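The definition above can be written compactly. The notation below is an illustrative assumption and does not appear in Towse et al. (3): $H$ is health output, $X$ denotes the inputs at each level for country $c$, and $T$ is the treatment strategy.

```latex
% Health production function: health output as a function of inputs at three levels
H = f\left(X_{\text{patient}},\, X_{\text{provider}},\, X_{\text{system}},\, T\right)

% Relative effectiveness of a new drug in country c: the additional net health
% output when the new strategy replaces (or is added to) usual care,
% holding that country's inputs fixed
\mathrm{RE}_{c} =
  f\left(X^{c}_{\text{patient}},\, X^{c}_{\text{provider}},\, X^{c}_{\text{system}},\, T_{\text{new}}\right)
  - f\left(X^{c}_{\text{patient}},\, X^{c}_{\text{provider}},\, X^{c}_{\text{system}},\, T_{\text{usual}}\right)
```

Under this reading, relative effectiveness can differ between two countries even when the drug is identical, because the inputs $X^{c}$ differ.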
In this study, we use breast cancer as a case study to identify evidence on the factors associated with health outcomes, drawing on findings from England, Spain, and Sweden.
In selecting a disease area for our case study, we considered several potential tracer conditions including cardiovascular disease, Alzheimer's disease, schizophrenia, cancer, osteoporosis, and rheumatoid arthritis. We selected breast cancer because it is a common condition, is a high clinical priority in all three countries, and new drugs enter the market regularly. Outcomes are driven by both drug and nondrug interventions, as well as by the coordination of care across different settings, and the care pathway covers prevention, early detection, diagnosis, surgery, and adjuvant therapy.
The selection of countries was mainly driven by the likelihood that data would be available for most of the factors we wanted to investigate. We, therefore, decided to limit our choice to countries with similar gross domestic product (high income countries), that had good data on usage and cost, and that varied in technology diffusion and health outcomes. Pragmatically, national clinical guidelines would be accessible only if published in English, Spanish, or Swedish, and this factor helped us to finalize our selection. Our three study countries, England, Spain, and Sweden, have published clinical guidelines on breast cancer, which provide an indication of national priorities and inputs that may influence outcomes. Two of the three countries (England and Sweden) have also assessed the cost-effectiveness of (some) breast cancer drugs.
To identify the data that would be needed to populate a health production model for breast cancer, we undertook a review of the literature. We also reviewed national clinical guidelines and HTA reports.

Literature Review
A recent review of studies explored the extent of any variation in relative efficacy and relative effectiveness of medicines used in one or more EU Member States (5). The review found little empirical evidence on cross-country differences, and no cross-border observational studies to compare effectiveness in routine practice. For the purpose of this article, we, therefore, simplified our approach: focusing on breast cancer, we searched for studies that investigated determinants of health outcomes such as mortality or quality of life in one or more of our countries.
Titles and abstracts from the searches were screened for eligibility by two of the review team (R.P.P., A.M.). To be eligible for inclusion, studies needed to explicitly investigate determinants driving differences in outcomes, either across countries (international comparative studies) or within countries (individual country studies). Potentially eligible studies were identified by two authors (R.P.P., A.M.) and assessed for inclusion by one author (R.P.P.). Figure 1 shows the study selection process. One member of the review team (R.P.P.) extracted the data from each study into a template, providing details of the study design, countries covered by the study, data sources, health outcomes and findings (see online Supplementary Tables 1 and 2). As shown in Table 2, the factors identified were then grouped into the framework categories reflecting the level of influence (individual, provider, and national level) using the template from Table 1 in Towse et al. 2015 (3). These data were checked by a second reviewer (A.M.).

Clinical Guidance Review
To identify similarities and differences in recommended care pathways across our study countries, clinical guidelines for the treatment of breast cancer and relevant HTA reports were reviewed. We searched the Web sites of national HTA agencies (England and Wales, Sweden), Ministries of Health (Spain) and Royal Colleges (Spain), and consulted experts (Sweden). Comparative data on screening programs, and treatment recommendations by stage of disease were extracted and tabulated.

Individual Level Factors
At the individual level, several demographic factors were consistently associated with poorer outcomes in breast cancer patients, including older age, socio-economic status, and lifestyle factors (smoking status). Older women (aged 75 and over) had lower survival rates than younger women. This is partly explained by stage at diagnosis (13), because older women are more likely to present with late stage disease; however, a Swedish study found that survival differences persisted and were more pronounced in older women with late stage disease than in clinically comparable (but younger) women. Older women underwent less intensive diagnostic activity, and less aggressive treatment, even after adjusting for comorbidity (33). Evidence from England and Sweden suggested that women with lower socio-economic status have worse survival, after adjusting for tumor size and age (19;42) and that better educated women are likely to have a better prognosis (39;40). A Swedish study found that smoking status independently increased the risk of death (after adjusting for age and stage of disease) (44). We found no direct evidence on treatment concordance (adherence).
In terms of individuals' clinical characteristics, there was strong evidence that disease stage at diagnosis is an important, and perhaps the most important, predictor of cross-country differences in 5-year survival. However, stage at diagnosis is not, in itself, an "explanation"; rather, it begs the question of why disease stage differs across countries. Possible reasons include screening intensity, access to diagnosis and treatment, and public awareness (which we consider below). Tumor pathology, in particular, the proportion of women with node negative disease, accounts for some differences in survival (13), and Swedish studies found that genetic (familial) determinants also affect prognosis and survival (37;38;43). Women with specific comorbidities may have fewer treatment options, for instance if they are unsuitable for radiotherapy or chemotherapy (33). However, we found no study that explicitly tested the impact of co-morbidity on survival.
Table. Summary of factors associated with breast cancer outcomes
Age: Older patients tend to have worse prognosis and therefore lower survival rates. This is partly due to greater co-morbidity, but the observed large differences in the intensity of treatment of older patients cannot be explained by co-morbidity alone.
Introduction of multidisciplinary teams was associated with improvements in processes of care for breast cancer patients, but not with improved survival.
Data quality/comparability: There are differences between countries in the methods and specificity of certifying cause of death, which partially explain differences in reported breast cancer mortality rates.
Awareness of symptoms may lead to early diagnosis and access to treatment, improving health outcomes.
The distribution of tumor biology can differ across populations and lead to differences in cancer outcomes across countries.
National/regional guidelines/regulations: Site-specialist multidisciplinary teams introduced as part of a national initiative changed treatment patterns and increased surgical specialization, but the improvement in survival rates was not statistically significant.
Service delivery and organisation: Screening intensity is associated with increased incidence of early stage breast cancer, which in turn leads to improved overall survival rates (lead time bias, length time bias). Improved mortality rates are less evident.
Access issues (local/regional/national), diagnosis and treatment: Inequalities of access can be partly explained by the national total expenditure on healthcare. Lower use of radiotherapy may adversely affect cancer outcomes. Fewer GPs per thousand population is associated with delays in diagnosis of cancer and worsened prognosis.
Economy, income and expenditure: Positive association between survival rates and countries' national income; expenditure on health as a proportion of GNP; and public expenditure on health as percentage of total health expenditure.
Note. Two international studies (8;14) and one based in Spain (31) discussed the quality and efficacy of care in relation to their findings but none formally tested for it.

Provider Level Factors
There was less evidence on which features of the healthcare system influence survival, and our searches found no cross-country analyses. Studies from England have investigated the role played by access (travel time) and by multidisciplinary teams (MDTs). Travel time to the GP (general practitioner) was correlated with stage at diagnosis, but there was no consistent relationship between travel time to hospital and survival or stage at diagnosis (20). MDTs improved the process of care but did not significantly improve survival at 1, 3, or 5 years (23;25). However, if average survival for a breast cancer patient is around 7 to 8 years after diagnosis (15), longer follow-up periods may be needed to detect an effect.
Other studies have considered access to diagnostic facilities and to treatments, and waiting times between symptom onset and treatment. The importance of access to diagnostic facilities is well-recognized, and we discuss this in relation to screening programs (see below). An English study analyzed data from the Northern and Yorkshire Cancer Registry and Information Service (NYCRIS) to compare 3-year survival rates for those diagnosed between 1982 and 1990 with cases diagnosed between 1991 and 1999 (24). In all age groups, 3-year survival improved significantly between the two periods. Stage at diagnosis explained all the improvement in those aged over 65, and explained most of the improvement in women aged below 65. Although the uptake of systemic treatment (chemotherapy and hormone treatment) increased substantially over time, systemic treatment had no statistically significant effect in explaining improvements in prognosis in any age group or overall. However, there are several reasons why this "negative" finding for treatment effect needs to be interpreted carefully. First, 3-year survival may be too short a time to robustly assess the impact of systemic therapy on mortality. In addition, data on stage at diagnosis were missing for a large proportion of cases, particularly in the earlier period. This "stage migration" could have led to greater misclassification bias in the first period, which could, therefore, overstate the role of stage in explaining survival improvement. Lastly, the study did not test for an interaction between stage at diagnosis and treatment uptake, so did not isolate the effect of earlier treatment per se. Further details of this study (24) are available in online Supplementary Table 1.
Finally, the quality and consistency of data recording is known to vary across countries, and there are differences between countries in the methods and specificity of certifying cause of death (6;8). However, a recent analysis found that even "implausibly extreme" assumptions about data errors could not account for all the observed cross-country differences in survival (18).

National / Environmental Factors
There are national screening programs in operation in England and Wales (63) and in Sweden (64). In Spain, screening programs are managed and run on a regional basis. Table 3 summarizes the characteristics of the screening programs in terms of the target population and screening interval, based on the review of clinical guidelines.
The intensity of screening activity was strongly associated with improved survival, although evidence for an impact on mortality rates was mixed (6;32). Both national screening programs and opportunistic screening increased the incidence of early stage breast cancer. This improves overall survival rates, reflecting both the effect of earlier treatment and lead time bias. However, countries that have not introduced screening have also seen improvements in survival (6;8), suggesting that other factors play a role.
Evidence on the role of national guidelines was sparse, in terms of both the extent of implementation and the effect on outcomes. Our review of national guidance found few differences in recommendations for treatment of breast cancer, but variation in the date of issue and the scope of guidance, as well as its implementation, may be important. A Swedish study investigated regional differences in survival, and found that suboptimal diagnostic activity in one county explained the variation. Services were reorganized in this county: multidisciplinary working was better staffed and co-ordinated, screening and diagnostic activity were quality assured, and treatment recommendations were implemented. When guideline adherence improved in these ways, survival also improved (34). An evaluation of the effects of the 1995 Calman-Hine report, which introduced national cancer guidelines, found that adherence varied across English regions (23). A study found evidence that care processes had improved as a result of both the Calman-Hine report (65) and the subsequent English Cancer Strategy (2000), but improvements in survival were not statistically significant (25).
Several international studies found that countries with higher national income, and that spent a greater proportion on healthcare, also had better survival rates (7;12;13;16;18). This may be due to improved access to care. For example, countries with higher national income may be able to afford better equipped hospitals; the number of in-patient beds and computerized tomography (CT) scanners per million population were found to be positively associated with survival (16). However, some of this improvement in survival may be an artefact of improved detection methods (e.g., screening programs), which increase the incidence of "over diagnosed" cancers (see Table 1).

DISCUSSION
Our case study is not a definitive assessment of the validity of our framework, but rather a first attempt to explore how a health production approach can help identify the factors that should be considered in an assessment of the relative effectiveness of a new drug. These factors could potentially be used to optimize effectiveness in routine practice. Engagement from a broad group of stakeholders (including providers) would be crucial to the success of this process, and we set out below the types of challenge they would need to resolve.

Choice of Outcome Measure
Cross-country differences in breast cancer outcomes are well documented (6;7;13;14;64). However, the outcome measure used to assess relative performance across countries can give very different results in terms of ranking. When our three countries are assessed by 5-year survival rates, Sweden is ranked first and the United Kingdom is ranked last (14); but an analysis of mortality trends from 1989 to 2006 ranked Spain first and Sweden last (6). To understand this apparent discrepancy, we need to recognize that survival is a "complex indicator of a country's performance" (7). Longer survival may reflect later death and/or earlier diagnosis, and earlier diagnosis may reflect screening intensity. But earlier diagnosis that does not lead to later death is of questionable benefit to patients. Comparisons based on survival may, therefore, be misleading, if differences in survival do not reflect reductions in mortality. A recent international comparison suggested screening did not play a direct part in reductions in mortality (65). Both survival and mortality may need to be considered alongside incidence if valid assessments of prognosis are to be made (15;66).
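The gap between survival and mortality can be illustrated with a small sketch of lead time bias. All names and ages below are hypothetical and chosen only for illustration; the key assumption, stated in the code, is that earlier diagnosis does not change the age at death.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    screen_detect_age: float   # hypothetical age at which screening would detect the tumour
    clinical_detect_age: float  # hypothetical age at symptomatic (clinical) diagnosis
    death_age: float            # assumed NOT to depend on when the diagnosis is made


def five_year_survival(patients, screened):
    """Share of patients still alive 5 years after their diagnosis."""
    alive = 0
    for p in patients:
        diagnosis_age = p.screen_detect_age if screened else p.clinical_detect_age
        if p.death_age - diagnosis_age >= 5:
            alive += 1
    return alive / len(patients)


def mean_age_at_death(patients):
    """Mortality-side summary; independent of the timing of diagnosis."""
    return sum(p.death_age for p in patients) / len(patients)


# Illustrative cohort (made-up ages, not data from any study)
cohort = [
    Patient(58, 61, 64),
    Patient(60, 62, 66),
    Patient(55, 59, 70),
    Patient(63, 64, 67),
]

surv_clinical = five_year_survival(cohort, screened=False)
surv_screened = five_year_survival(cohort, screened=True)
print(surv_clinical, surv_screened, mean_age_at_death(cohort))
```

In this toy cohort, measured 5-year survival rises when diagnosis is moved earlier, while the age at death is unchanged by construction. It is a sketch of the mechanism only and says nothing about the size of the effect in real screening programs.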

Data Limitations
A limitation is that we have only identified factors reported in the literature, and there may be other important drivers that have not been assessed. For example, we found no study that isolated the impact of hormone replacement therapy (HRT) on outcomes. HRT is associated with an increase in the risk of breast cancer (67;68), but only an estimated 3 of 100 breast cancers are related to use of HRT (69). As use of HRT varies and breast cancers induced by HRT may be less aggressive, variations in HRT prescribing across countries are likely to influence international differences in survival rates in a complicated way.
Most of the evidence related to the individual level, which probably reflects data availability: cancer registries include an array of patient characteristics, but comparable information on countries' healthcare provider systems must be added from external sources. Where access to treatment was assessed, this typically did not take account of dose or duration of treatment. Conversely, we found more evidence on national factors, such as screening programs. Subsequent studies need to further elucidate the factors that may influence breast cancer outcomes, ideally in consultation with clinical experts and possibly drawing on additional (unpublished) data sources such as those documenting differences in resource availability, or spend on breast cancer. They would need to take account of evidence of the impact of genetic variations on both prognosis and choice of therapy.

Causality or Association?
A further shortcoming of our review is that it reports associations between health outcomes and various factors, but it is less clear whether the relationships are causal. This is because most of our studies are retrospective analyses of observational data. The quality of this type of study is heavily dependent upon the number of observations, the underlying data quality (which is rarely reported in journal articles), the functional form of the model and whether there are confounding factors that are not, perhaps cannot be, taken into account. To explore causality would require different study designs, such as randomized trials. However, these are not feasible when investigating the impact of national factors. Even if associations are robust, they shed little light on drivers relating to the inputs and activities included in the care given, which will impact on how a treatment is used and what, if anything, it displaces. There may also be interactions and correlations between the factors we identified, both within and between different levels; for instance, national income is likely to be correlated with individuals' educational level, and individuals' stage at diagnosis will be linked to system level screening policy. This problem is perhaps more complex for breast cancer than for some other diseases, such as acute conditions, although most chronic diseases are managed through a combination of screening, diagnosis, lifestyle alterations or interventions, and drug treatment.

CONCLUSIONS AND POLICY IMPLICATIONS
Based on our review of studies comparing breast cancer outcomes and of guidelines/HTA reports in three European countries, we believe that the way efficacy translates into relative effectiveness across health systems is likely to be influenced by a range of complex and interrelated factors. These comprise not only the genetic and other biological and behavioral patient factors mentioned by Eichler et al. (2) (which we term "individual" patient level factors in our model) but also the characteristics of the providers and healthcare environment and system-level factors. For example, the importance of stage at diagnosis begs the question of why stage of disease differs across countries. Arguably, this finding reflects the conclusion of Eichler et al. (2) that "where there is an apparent large gap between efficacy and effectiveness, one is not looking at a drug problem but at a healthcare delivery problem, and the focus of remedial action should be shifted to improving real life performance."
Relative effectiveness is a current policy issue in Europe, and this is why our case study is focused here. In principle, the same issues arise in any context where drugs are approved centrally but where there may be significant regional variations in how the drugs are used in practice and, therefore, differences in relative effectiveness. By recognizing that impediments to improving health can arise at several levels, policy makers in any jurisdiction can begin to explore ways to optimize relative effectiveness. Studies that show differences in relative effectiveness between countries, or that identify factors suggesting these exist, provide one way to identify how health system performance can be improved.
Careful consideration of the determinants within our framework may also aid discussions on the extent to which evidence for HTA based decision making can be shared across health systems, and identify the data required for robust comparisons. In some cases, it will be reasonable to expect evidence on relative effectiveness to be transferable; in other cases, it may be possible to anticipate and adjust for expected differences in relative effectiveness between countries, and so use evidence from one country in another. In other cases, however, an understanding of relative effectiveness in a country may generate questions that cannot be answered by existing evidence and that require a bespoke study.
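As a minimal sketch of what "anticipate and adjust" could mean in practice, the example below reweights a stage-specific treatment benefit by each country's stage mix at diagnosis. It assumes, purely for illustration, that the benefit within each stage is the same in both countries and that stage mix is the only thing that differs; the function name, the stage labels, and every number are hypothetical and are not taken from the studies reviewed.

```python
def weighted_average(values_by_stage, stage_shares):
    """Average a stage-specific quantity over a country's stage distribution."""
    if abs(sum(stage_shares.values()) - 1.0) > 1e-9:
        raise ValueError("stage shares must sum to 1")
    return sum(values_by_stage[stage] * share for stage, share in stage_shares.items())


# Hypothetical gain in 5-year survival from a new drug, in percentage points,
# assumed identical across countries within each stage
gain_by_stage = {"early": 2.0, "late": 6.0}

# Hypothetical stage distributions at diagnosis
source_mix = {"early": 0.75, "late": 0.25}  # country where the evidence was generated
target_mix = {"early": 0.5, "late": 0.5}    # country where the evidence is to be used

source_gain = weighted_average(gain_by_stage, source_mix)
target_gain = weighted_average(gain_by_stage, target_mix)
print(source_gain, target_gain)
```

The same stage-specific evidence yields a different population-level benefit in the two settings. Where provider or system level factors also differ, as the framework suggests they often will, a simple reweighting of this kind would not be sufficient.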

CONFLICTS OF INTEREST
Puig-Peiro, M.Sc. reports grants from Pfizer during the conduct of the study and grants from The Association of the British Pharmaceutical Industry outside the submitted work. At the time of writing the report, Dr. Puig-Peiro was working at the Office of Health Economics. Her new affiliation is the Catalan Health Service and she has no conflicts of interest. Dr. Mason reports grants from Pfizer (contract with OHE Consulting) during the conduct of the study and grants from Novartis (contract with OHE Consulting) outside the submitted work. Dr. Mestre-Ferrandiz reports grants from Pfizer during the conduct of the study and from The Association of the British Pharmaceutical Industry outside the submitted work. Professor Towse reports grants from Pfizer during the conduct of the study and from The Association of the British Pharmaceutical Industry outside the submitted work. Dr. McGrath reports grants from Pfizer during the conduct of the study and from Pfizer and AstraZeneca outside the submitted work. Professor Jönsson reports personal fees from Pfizer during the conduct of the study.