Hospitalisations with infections related to antimicrobial-resistant bacteria from the French nationwide hospital discharge database, 2016

Massive use of antibiotics has led to increased bacterial resistance to these drugs, making infections more difficult to treat. Few studies have assessed the overall antimicrobial resistance (AMR) burden, and there is a paucity of comprehensive data to inform health policies. This study aims to assess the overall annual incident number of hospitalised patients with AMR infection in France, using the National Hospital Discharge database. All incident hospitalisations with acute infections in 2016 were extracted. Infections which could be linked with an infecting microorganism were first analysed. Then, an extrapolation of bacterial species and resistance status was performed, according to age class, gender and infection site to estimate the total number of AMR cases. Resistant bacteria caused 139 105 (95% CI 127 920–150 289) infections, resulting in a 12.3% (95% CI 11.3–13.2) resistance rate. ESBL-producing Enterobacteriaceae and methicillin-resistant Staphylococcus aureus were the most common resistant bacteria (>50%), causing respectively 49 692 (95% CI 47 223–52 142) and 19 493 (95% CI 15 237–23 747) infections. Although assumptions are needed to provide national estimates, information from PMSI is comprehensive, covering all acute bacterial infections and a wide variety of microorganisms.


Introduction
Antibiotics have markedly reduced the mortality associated with infectious diseases in the 20th century. However, their remarkable efficacy has been accompanied by extensive use in human and animal medicine: worldwide antibiotic consumption by humans has increased by about 40% from 2000 to 2010 [1,2]. As a consequence, the prevalence of antibiotic-resistant bacteria (AMRB) has increased worldwide, limiting our ability to fight infectious diseases [3][4][5][6], and resulting in a higher risk of morbidity from infections [7][8][9].
Antimicrobial resistance (AMR) has thus emerged as a major public health issue over the past decade [10]. Despite implementation of several national and global action plans, worldwide antibiotic consumption and AMR rates remain high [10][11][12]. In France, for example, 29% of Klebsiella pneumoniae were resistant to third-generation cephalosporins in 2016 [11]. According to the Centers for Disease Control and Prevention, infections due to resistant microorganisms were associated with 25 000 deaths in Europe in 2007 [13] and exceeded 23 000 deaths in the USA in 2013 [12]. A review on AMR commissioned by the UK Prime Minister estimated that 700 000 AMR-attributable deaths occurred globally in 2014 [14]. According to that study, the current rising AMR trend would cause 10 million deaths worldwide in 2050, thereby becoming the leading cause of mortality, with an annual mean of 390 000 deaths expected in Europe. The French Institute for Public Health estimated that, in 2012, 158 000 infections were caused by multidrug-resistant bacteria (MDRB) in France, and 12 500 deaths were associated with these infections [15].
Nationwide estimates of the overall burden of AMR [14][15][16][17][18] are however hampered by the lack of comprehensive and national data. Thus, most of the time surveillance data have been used to derive such estimates. However, surveillance reports focus on defined microorganismantibiotic pairs, and collect selected data. The European Antimicrobial Resistance Surveillance Network (EARS-net), for example, collects data from local and clinical laboratories in 30 countries, but surveillance is limited to invasive infections and specific AMRB, and its coverage varies by country. It was estimated that EARS-Net covers 18% of hospital bed-days in France (except for Streptococcus pneumoniae, for which it was estimated at 67%) [11,16]. Surveillance data have thus been extrapolated to the national level, using weighting from the literature. These predictive models require strong assumptions and use different sources, which have often their own specifications [19].
In 2014, the World Health Organization (WHO) revised its codification of infections and AMR within the 10th revision of the international classification of diseases (ICD-10) [20]. In France, specific codes were added in 2015 to the French ICD-10 version such as 'U82.10' (methicillin-resistant Staphylococcus aureus; MRSA) or 'U83.70' (emerging highly resistant bacteria), to allow codification of colonisation or infection with AMRB which pose specific management issues (i.e. isolation precautions or complex therapeutic management) as per current French recommendations. Those refinements provide an opportunity to use the French National Hospital Discharge Database (French acronym PMSI) and ICD-10 codes to estimate the actual incidence of AMR based on real nationwide data, while using limited assumptions and estimations.
This study was undertaken to estimate the incidence of AMR in French hospitals in 2016, using the PMSI database.

Data collection
The study population included all incident hospitalisations with an acute infection caused by Streptococcus, Staphylococcus, Enterobacteriaceae or other Gram-negative bacteria identified in 2016 from the PMSI.
The PMSI database covers all hospitalisations (also referred to as stays) in French publicly funded and private hospitals. Because this database is used for reimbursement purposes, each hospitalisation description is standardised, following the Technical Agency for Hospital Information (ATIH) recommendations. It contains ICD-10-coded hospital diagnoses, medical procedures performed during each stay and individual data such as age, sex and geographic area of residence. Diagnoses are coded as primary diagnosis (PD: condition requiring hospitalisation), related diagnosis (RD: adds information to PD) and significant associated diagnosis (SAD: complications and co-morbidities potentially affecting the course or cost of hospitalisation). Anonymised PMSI data from 2016 were extracted from the French National Health Data System (SNDS) that records every individual's demographic information, healthcare encounters and drug reimbursements [21]. We restricted the study to metropolitan France (henceforth referred to as France), excluding overseas territories, and representing 96% of its population in 2016, and hospitalisations in short-stay institutions (medicine, surgery, obstetrics).
Stays with admission between 1 January and 31 December 2016 were selected, when the PD, RD or SAD contained at least one specific infection code. Only stays longer than 1 day were considered as hospitalisation. To identify stays with acute infections, lists of ICD-10 codes corresponding to infections, microorganisms and resistance markers were established in collaboration with infectious diseases and PMSI coding specialists (Supplementary Tables S1 and S2). Codes only corresponding to colonisation with AMR were excluded from these lists. Most data available in the literature focuses on the first infection episode in a given year; in order to provide comparable data, only the first infectious episode in patients hospitalised within 2016 was selected. Because of the structure of the database and possible missing bacterial data, the first infectious episode was defined as the first infection that occurred at an anatomical site, also referred to as 'incident infection'.

Selection and recoding algorithm, and extrapolation
Using PMSI to estimate the incidence of AMR among hospitalised patients proved challenging. Three major constraints of the database were identified.
The first one is related to database structure. Indeed, in the database, diagnoses are not linked: infections are not linked with their causal bacteria or bacteria with their resistance status, except for some specific codes such as 'J13' (pneumonia due to S. pneumoniae) or 'U82.10' (MRSA) (Table S1). Therefore, linking an infection to its aetiological microorganism and its AMR status was not possible when several infections, pathogens and/or resistance statuses were coded for a given stay. Consequently, stays with several infection sites were excluded, except for blood infections and medical device infections. Indeed, stays with blood infection and another site recorded were categorised as infection of the recorded site. Likewise, medical device infection was favoured when associated with an infection at the same site, e.g. 'T82.6' (infection due to cardiac valve prosthesis) and 'I33.0' (infective endocarditis).
The second limitation is related to French coding practices and its clinical perspective. Indeed, in France, coding rules imply that AMR status is notified only when resistance implies changing usual clinical management. Consequently, natural resistance or resistance that does not result in modification of conventional management or therapeutic difficulties would not be recorded.
Finally, some stays with infection might not be associated with an infecting microorganism. No microorganism code may indicate a lack of microbiological testing (i.e. the physician treated the infection empirically); or it may indicate negative results of samplings, part of which may be secondary to effective antibiotic therapy administered before sampling.
From the first 2016 stay with one infection, two groups were defined: (M+) stays, with at least one microorganism coded; and (M−), with no microorganism coded ( Fig. 1). For M+, to minimise the number of excluded stays with multiple bacteria codes and only one specific AMR code recorded, resistance was attributed to the most relevant bacteria coded, excluding the other bacterial codes (e.g. methicillin resistance was assigned to S. aureus, ESBL resistance to Enterobacteriaceae or other Gramnegative bacteria and vancomycin resistance to Enterococcus). If attribution was not possible, stays with several microorganisms were excluded. In addition, to minimise inappropriate AMR categorisation (e.g. natural resistance), non-concordant microorganism-resistance pairs were recoded to the nearest most relevant (e.g. penicillin-resistant Enterobacteriaceae as ESBL, and vancomycin-resistant Streptococcus as Enterococcus). Relevant microorganism-resistance pairs are listed in Supplementary Table S2. Finally, when assignment to a given microorganism was not possible because of several resistance codes or impossible recoding, AMR status was retained but resistance was classified as 'unknown'.
Given the above limitation and the large number of stays with missing bacteria codes, we first analysed stays with complete information, i.e. those with an infection associated with an identified microorganism, whether resistant or susceptible. Extrapolation was then performed on the M− group, assuming that these stays were similar to group M+, conditional on gender, age and infection site stratum (classified as below). Thus, microorganism and AMR distributions were extrapolated from group 2 M. Opatowski et al.
M+, according to those parameters. For descriptive purposes, resistance was also extrapolated according to some patients' and stay recorded characteristics: Charlson comorbidity index, infection as PD, length of stay, surgical procedure during the stay and in-hospital death. Consequently, some minor resistance-distribution differences could be noted, reflecting marginal differences in distributions of these variables between the M+ and M− groups. When <10 M+ stays were found for a given set of gender, age class and infection site, extrapolation was not performed and the corresponding observations were excluded. 95% CI were estimated for extrapolated number of stays and resistance rates. Moreover, the distributions of the different categories within a variable were calculated from the estimated number of stays. Since confidence interval were defined independently for each class, it was thus not relevant to estimate confidence interval for distributions.
The following variables were used to describe the infections: site and bacterial species involved, AMR status and resistance type. Infection sites were stratified into 13 categories (Table S1): (1) urinary and genital tracts, (2) medical device associated, (3) skin and soft tissues, (4) lower respiratory tract, (5) abdomen and digestive tract, (6) bacteraemia and sepsis (henceforth blood infection), (7) infection during pregnancy, (8) bone and joint, (9) newborn infection, (10) heart and mediastinum, (11) ear, nose and throat, (12) eye, (13) nervous system. Medical device infections included foreign body infections and those associated with diagnostic procedures. Bacterial species were classified as Staphylococcus, Streptococcus, Enterobacteriaceae or other Gram-negative bacteria and by species. Resistance was categorised as to (1) penicillin, (2) methicillin, (3) extended-spectrum β-lactamase (ESBL), (4) other β-lactams resistance mechanisms, (5) vancomycin and related, (6) quinolones, (7) multiple antibiotics, (8) highly drug-resistant, (9) MDRB, (10) other or unspecified or (11) unknown (as explained above). In the French coding system, a MDRB code means that 'multidrug-resistant' is written in the bacteriology reports; conversely 'resistant to multiple antibiotics' is used when the bacteria is resistant to several antibiotics but multidrug-resistant is not written. It was assumed  Table 2. Because of insufficient sample size, 452 stays could not be extrapolated. b The first infectious episode in patients hospitalized within 2016 was defined as the first infection that occurred at an anatomical site.

Epidemiology and Infection
that MDRB refers to a bacterium resistant to at least three classes of antibiotic to which it is normally sensitive. Some specific resistances cannot currently be extracted from the PMSI, for example, resistance to third-generation cephalosporins is coded within group [4]. Microorganisms were considered susceptible to antibiotics when no resistance was coded (i.e. there was indeed no resistance marker detected, or if present, the resistance was not clinically significant and did not require a modification of conventional management of infection).

Analyses
To validate M+ selection and reclassification, S. aureus, Enterobacteriaceae and Streptococcus estimated resistance rates were compared with those available from 2016 EARS-net data for France [11]. Rates were estimated from M+ blood infections (primary and secondary) and meningitis (Enterobacteriaceae and S. pneumoniae only) in order to be compatible with EARS-net data. EARS-net ESBL-pE rates were estimated from rates provided for third-generation cephalosporin-resistant isolates [11,24]. The same validation was conducted on the extrapolated population.
The overall population was described after extrapolation. The characteristics of patients (gender, age and Charlson index), stays (infection as PD, duration, surgery during the stay, in-hospital mortality) and infections (site) were studied. For each infection site, resistance rates were estimated. Microorganism distribution in the population, associated resistance rates and frequencies of the most frequent bacterium-resistance pairs were studied. Because of the very large size of the sample, and the wide variety of hospitalisations and infections, no statistical test was used to compare the two groups of patients having resistant or susceptible bacterial infections.
Finally, incidences of total AMRB and of the most frequent bacterium-resistance pairs were estimated, by dividing the number of hospitalisations with AMRB infection by the total number of days of hospitalisations in short-stay institutions (medicine, surgery, obstetrics), in metropolitan France in 2016 [25]. They were expressed as the number of cases per 1000 patient-days.
Statistical analyses were computed with SAS Enterprise Guide (v 7.13 software, SAS Institute Inc., Cary, NC, USA). The study and data extraction were approved by the French Data Protection Agency (CNIL, approval DE-2016-176).

Results
In 2016, among 1 602 762 infection-associated stays, 70.8% were retained for analysis, corresponding to 1 135 310 first stays without prior infection at the same site in the corresponding year ( Fig. 1). During the selection process, 19.2% hospitalisations were excluded because of several infection or bacterial codes. Bacteria were coded in 30.9% of the selected sample (group M+), 12.5% of which were antibiotic-resistant, with 4.7% of resistance codes attributed to one of several bacteria, 20.3% of bacterium-resistance pairs recoded and 20.1% of resistances classified as 'unknown' (15.2% with inconsistent bacterium-resistance pairs and 4.9% with several resistance codes). Extrapolation yielded a total of 139 105 (95% CI 127 920-150 289) stays associated with an AMR infection, corresponding to 12.3% (95% CI 11.3-13.2) of the sample. Since extrapolation was not performed when the matching subgroup sample was deemed too small (<10 stays), a total of 452 stays were excluded from the analyses.
Likewise, estimates of resistance rates recorded from the extrapolated sample were consistent with those provided by EARS-net for MRSA, the emerging highly drug-resistant bacteria, ESBL-p E.

Epidemiology and Infection
coli and K. pneumoniae as well as for penicillin-resistant S. pneumoniae (Table 1).

Patient and stays
Around 58% of patients infected with AMRB were >65 years old and >20% had a Charlson index >2, as compared with 52% and 16% of those infected with a susceptible strain, respectively (   had an average AMR rate similar to the overall mean rate (12.3% vs 12.2%, respectively). MRSA were principally isolated from lower respiratory tract (32.3%), skin and soft tissues (31.3%) and primary blood infections (8.7%) ( Table 4). ESBL-p Enterobacteriaceae were mostly identified in urinary and genital tract, gastrointestinal and lower respiratory tract infections (respectively, 43.6%, 28.3%, 12.2% for E. coli and 32.3%, 13.8% and 26.9% for K. pneumoniae). More than a third of carbapenem-resistant K. pneumoniae was identified in lower respiratory tract, and around 20% in urinary and genital tract infections. Half of the vancomycin-resistant Enterococcus was associated with gastrointestinal infections. Finally, around 76% of penicillin-resistant S. pneumoniae were identified in lower respiratory tract.

Discussion
This innovative real-life study used an administrative medical database to assess the nationwide AMR incidence in France. It was estimated that around 140 000 (95% CI 130 000-150 000) hospitalisations were associated with an infection due to AMRB in 2016. The PMSI database provides a number of information allowing detailed description of infections and associated stays. However, our analysis also emphasises the difficulties encountered when using such data sources for an objective other than their original purpose, particularly when analysing infections and AMR. Indeed, contrarily to the majority of studies in other areas where simply identifying the diagnosis of a given disease is sufficient, studies dealing with microbial resistance require identifying both the infection and the associated microorganism responsible for the infection, as well as the potentially associated resistance marker.
One of the strengths of this database is that it covers all hospital stays in France. Administrative databases are increasingly used for epidemiological and diseases burden studies in various countries, but they have not yet been used to measure the AMR burden [26][27][28][29]. They provide access to 'real-life' data of the national population, and recent works illustrate the value of such data, particularly in the field of infectious diseases [29][30][31][32][33]. Sahli et al. concluded that quality of coding of infection diagnoses in the PMSI was high for PD, with positive predictive values of correct coding ranging between 0.98 (95% CI 0.95-1.00) and 0.93 (95% CI 0.88-0.98), but inferior for RD 0.70 (95% CI 0.61-0.71) [31]. Moreover, Afshar et al. considered that ICD-10 codes would enable analyses of different pathogens and their drug-resistance patterns [33]. However, the database refund-management purposes led to some difficulties. Indeed, hospital reimbursements are determined by diagnosis, concomitant diseases and stay duration, directly extracted from PMSI. Consequently, concomitant events at diagnosis or complicating the disease are well encoded but information not used for reimbursement calculation could be absent or miscoded. Thus, first, a multi-level algorithm of selection and recoding was required, developed in association with infectious disease specialists and public health doctors specialised in PMSI-coding. Two Ph.D. students independently programmed the algorithm to verify it. For example, 35% of bacterium-resistance pairs were inconsistent in 2016, including natural resistance, inconsistent association between a given bacterium and a resistance marker, or resistance that would not usually influence management; 57% of them could be recoded. Moreover, 19% of stays were excluded because diagnostic codes were not linkable. Although recovery through resistance codes could induce a selection bias, thereby increasing the per cent of resistance, it corresponded to only 0.2% of the selected population and 0.7% of the stays excluded. Conversely, exclusion of stays with multiple codes could cause underestimation of AMR rates and bias stays descriptions: 16% of them were associated with at least one resistance code, and hospitalisations were more severe (longer stays, higher in-hospital mortality, higher Charlson score, older patients, data not shown). Finally, because around 70% of stays had no bacteria coded, an extrapolation was conducted, assuming that, conditioning on gender, age and infection site, stays with missing bacteria code had similar profiles to those with available information. An alternative approach to extrapolation was tested for the M− group, where all missing microorganisms were assumed to be susceptible (data not shown). However, resulting resistance rates were far below surveillance and literature data.
Database recoding and extrapolation were validated, comparing our study results with surveillance data from EARS-net [11]. Moreover, our incidence estimates were consistent with other 2016 French data, estimating MRSA and ESBL-p Enterobacteriaceae incidence to 0.34 and 0.95 hospitalisations per 1000 patients-days, in short-stay institutions [34]. This study estimated incidence from the first 3 months of 2016, from 1354 health institutions (short-stay, psychiatry, long-term care facilities…). In addition, our study estimates were consistent with the low range of the MDRB-infection estimate of 158 000 (127 000-245 000) generated for France in 2012 [15]. Several factors might explain our relatively low estimates. First, the latter study was based on 2012 data and used prevalence estimations whereas our study focused on first hospitalisations with acute infections and 2016 hospitalisation data. Moreover, most of the patients with several infections or bacteria were excluded (with 16% resistance), notably the most serious infections. Finally, in the PMSI, AMR should be coded only if it alters clinical management and rates can be strongly influenced by clinical practices. Indeed, if microbiological investigations are not performed or impractical and first-line treatments include broad-spectrum antibiotics, AMR is unlikely to alter clinical management. For example, urinary and genital infections were more frequently associated with bacteria codes, which can be explained by commonly obtained urine cultures in hospitals, unlike for example, bronchopulmonary secretions samplings. The distribution of infection sites among stays without coded microorganisms supports this hypothesis, with a higher proportion of lower respiratory tract and abdominal and gastrointestinal infections (Supplementary Table S3).
In this study, patients infected with AMRB were slightly older and had more comorbidities than the others. They underwent a surgical procedure more often during hospitalisation, and their length of stay was longer. Even though not attributable to AMRB, in-hospital death occurred more often during hospitalisation with an infection with ARMB than in those without ARMB.
In conclusion, estimating the AMR incidence based on hospital discharge database requires resolving complex problems in data handling, necessitating hypotheses to correct for missing or inconsistent ICD-10 codes. However, the resulting information, consistent with data derived from surveillance studies, is very thorough, indeed covering all acute bacterial infections and a variety of selected microorganisms, thereby providing more comprehensive information than most available estimates based on a microorganism or disease selection.