Geospatial cluster analyses of pneumonia-associated hospitalisations among adults in New York City, 2010–2014

Pneumonia is a leading cause of death in New York City (NYC). We identified spatial clusters of pneumonia-associated hospitalisation for persons residing in NYC, aged ⩾18 years during 2010–2014. We detected pneumonia-associated hospitalisations using an all-payer inpatient dataset. Using geostatistical semivariogram modelling, local Moran's I cluster analyses and χ2 tests, we characterised differences between ‘hot spots’ and ‘cold spots’ for pneumonia-associated hospitalisations. During 2010–2014, there were 141 730 pneumonia-associated hospitalisations across 188 NYC neighbourhoods, of which 43.5% (N = 61 712) were sub-classified as severe. Hot spots of pneumonia-associated hospitalisation spanned 26 neighbourhoods in the Bronx, Manhattan and Staten Island, whereas cold spots were found in lower Manhattan and northeastern Queens. We identified hot spots of severe pneumonia-associated hospitalisation in the northern Bronx and the northern tip of Staten Island. For severe pneumonia-associated hospitalisations, hot-spot patients were of lower mean age and a greater proportion identified as non-Hispanic Black compared with cold spot patients; additionally, hot-spot patients had a longer hospital stay and a greater proportion experienced in-hospital death compared with cold-spot patients. Pneumonia prevention efforts within NYC should consider examining the reasons for higher rates in hot-spot neighbourhoods, and focus interventions towards the Bronx, northern Manhattan and Staten Island.


Introduction
Pneumonia is a clinical syndrome characterised by infection of the lungs. Common aetiologic agents include Streptococcus pneumoniae and influenza virus, with clinical manifestations ranging from mild symptoms to severe illness and death [1]. Infections can occur within the community setting or in association with healthcare settings [2].
Together, 'pneumonia and influenza' rank as the third leading cause of death in New York City (NYC), with most deaths attributed to an underlying cause of pneumonia, not influenza [3]. During 2010-2014, 54.3% of pneumonia-associated hospitalisations among adults in NYC were due to community-acquired pneumonia (CAP), 30.2% to healthcare-associated pneumonia (HCAP), 14.0% to hospital-acquired pneumonia (HAP) and the remaining 1.6% to ventilator-associated pneumonia (VAP) [4]. While the distribution of each setting of acquisition is known to vary across the five boroughs of NYC, the distribution within each borough has not yet been assessed.
To develop pneumonia prevention strategies that consider the epidemiological variation within and between NYC boroughs, we sought to identify spatial clusters with significantly higher rates of pneumonia-associated hospitalisations for NYC residents aged ⩾18 years during 2010-2014. We also conducted cluster analyses of pneumonia-associated hospitalisations by severity and setting of acquisition.

Study site and population
Our study population consisted of persons residing in NYC, aged ⩾18 years who were admitted to a New York State (NYS) acute care facility with a pneumonia-associated hospitalisation during 1 January 2010-31 December 2014.

Data source and definitions
The NYC Neighborhood Tabulation Area (NTA) was the geographic unit of analysis for this study. NTAs were created by the NYC Department of City Planning using aggregates of whole census tracts [5]. Based on the 2010 US Census, 2168 census tracts define the geopolitical sub-divisions of NYC, and correspond to 195 NTAs. The NTA provides a statistically reliable alternative to low population denominators and high sampling error associated with individual census tracts. A majority of NTAs within NYC are residential neighbourhoods (N = 188); however, several non-residential NTAs throughout the city include public domains such as parks, correctional facilities and airports (N = 7) [5]. Individuals housed within facilities (correctional, psychiatric, substance abuse treatment, etc.) located in non-residential NTAs were excluded from our study population. For this investigation, we used the NTA shapefile in a projected geographic coordinate system, New York Long Island FIPS 3104 North American Datum of 1983/Universal Transverse Mercator zone 18N (NAD83/UTM zone 18N).
Hospital discharge data were obtained through the Statewide Planning and Research Cooperative System (SPARCS) [6]. SPARCS is an all-payer reporting system, mandated to collect data on inpatient and outpatient hospital visits across NYS under Section 28.16 of the Public Health Law [7]. Data for each inpatient admission include patient demographics, geocoded address information, admission and discharge dates, International Classification of Diseases, Ninth edition, Clinical Modification (ICD-9-CM) principal and secondary diagnoses, and patient disposition upon discharge (e.g. whether in-hospital death occurred), among other variables. We analysed the most recent database available for each year (July 2014 release for years 2009  An ICD-9-CM principal diagnosis code indicates the condition chiefly responsible for a patient's admission to the hospital [8]. Within SPARCS, healthcare staff are able to specify up to 24 secondary diagnosis codes on a medical discharge record for conditions that either coexist at the time of patient admission, develop subsequent to hospitalisation, affect the patient's course of treatment or lengthen the hospital stay. For this investigation, we defined a 'non-severe pneumoniaassociated hospitalisation' as any inpatient discharge record having a principal diagnosis of pneumonia. A 'severe pneumoniaassociated hospitalisation' was one with a principal diagnosis of sepsis or respiratory failure and a secondary diagnosis of pneumonia [9]. Taken together, we defined an 'overall pneumonia-associated hospitalisation' more broadly as one that could be non-severe or severe (Table 1). Hereafter we use the term 'patient' in reference to an individual who experienced a pneumoniaassociated hospitalisation.
Setting of acquisition for a pneumonia-associated hospitalisation was classified as part of a previous study in accordance with Infectious Disease Society of America (IDSA) and American Thoracic Society (ATS) professional guidelines [2,4].
NYC residency was based on a patient's home address within one of five NYC boroughs (Manhattan, Bronx, Brooklyn, Queens and Staten Island), which correspond to NYS counties (New York, Bronx, Kings, Queens and Richmond, respectively).

Ethical considerations
This investigation involved analyses of existing deidentified hospitalisation data. The Centers for Disease Control and Prevention (CDC) determined this activity to be public health non-research, NYC Department of Health and Mental Hygiene (DOHMH) determined this activity exempt from federal regulations for the protection of human research subjects, and Columbia University Medical Center (CUMC) approved the protocol under expedited review.

Analytical methods
Rates of pneumonia-associated hospitalisation We calculated average annual, age-adjusted rates of overall pneumonia-associated hospitalisation for each residential NTA. Rates were calculated for each geographic unit, however not for each categorical age group. Annual age-stratified hospitalisation frequencies were normalised with DOHMH population estimates modified from 2010 Census denominators, and age-adjusted using the direct method, according to 2000 Census guidelines [10,11]. For the direct method of standardisation, the number of hospitalisation events for each age group was first divided by the estimated population of each age group, then multiplied by a constant of 100 000 persons to calculate the age-specific hospitalisation rate. This age-specific rate was then multiplied by the proportion of the US standard population for the age group. Age-specific results were summed to calculate the age-adjusted hospitalisation rate [11].
Analogous calculations were conducted for severe pneumoniaassociated hospitalisation, as well as for CAP, HCAP and HAP. We excluded analyses of VAP at the NTA-level, given the large number of non-zero counts less than 10, which may have poor reliability when converted to age-adjusted rates [12].

Geostatistical analyses
We developed a semivariogram model to first evaluate if NTAlevel pneumonia-associated hospitalisation rates were significantly clustered. If the model demonstrated clustering, we calculated an average radius of clustering, referred to as the range of spatial autocorrelation. Spatial autocorrelation indicates if values for locations that are nearby to one another are more similar than values for locations that are more distant. If we were able to successfully fit a semivariogram model and define a range of spatial autocorrelation, we can assume that, within this average radius, NTAs that are nearby to one another have hospitalisation rates that are more similar than hospitalisation rates for NTAs that are more distant [13]. Fitting a semivariogram model required sequentially developing an empirical, experimental and statistical semivariogram (Electronic Supplementary Material, Methods) [14,15].
We developed semivariogram models for rates of overall pneumonia-associated hospitalisation, as well as for each subclassification of interest (severe pneumonia-associated hospitalisation rates, CAP, HCAP and HAP hospitalisation rates). Results were validated using the Incremental Spatial Autocorrelation (ISA) tool in ArcMap 10.2.1. This tool runs a global Moran's I analysis for increasing distances and provides a corresponding z-score to measure the intensity of spatial clustering. We compared the range of spatial autocorrelation from the semivariogram model to the first distance with a peak z-score, as calculated by the ISA tool. If results of the semivariogram model and ISA indicated clustering across NYC overall, we went on to assess local clustering, as described in the Cluster analyses section. The range of spatial autocorrelation was used as a fixed bandwidth parameter for spatial weighting within these cluster analyses.

Cluster analyses
To identify clusters, or neighbourhoods within close proximity of one another with similar pneumonia-associated hospitalisation rates, we conducted local Moran's I cluster analyses [16]. The geographical summation of local Moran's I relationships result in either statistically significant (1) clusters of nearby neighbourhoods with similarly high pneumonia-associated hospitalisation rates (hot spots); (2) clusters of nearby neighbourhoods with similarly low pneumonia-associated hospitalisation rates (cold spots); (3) neighbourhoods with high rates near low-rate neighbourhoods or neighbourhoods with low rates near high-rate neighbourhoods (spatial outliers); (4) or regions without statistically significant spatial clustering [16]. Statistical significance was estimated using 999 Monte Carlo simulations at a 5% level of significance. Additional details about geostatistical and cluster analyses are provided in Electronic Supplementary Material, Methods. Finally, to better understand the profile of clusters for overall, severe and CAP-associated hospitalisation, we compared mean age using the Wilcoxon rank-sum test, and examined the median and interquartile range (IQR) for hospital length of stay (LOS). Additionally, we used χ 2 tests to compare the proportion of patients in hot spots vs. cold spots according to sex, race/ethnicity, discharge location after hospitalisation and in-hospital death.
The highest average annual age-adjusted pneumoniaassociated hospitalisation rates were consistently seen across northern and southern Bronx, as well as pockets of northern Manhattan and Staten Island. These patterns were demonstrated for overall pneumonia-associated hospitalisation, severe pneumonia-associated hospitalisation, as well as pneumonia-associated hospitalisation stratified by setting of acquisition. While neighbourhoods with high rates were also seen in parts of Brooklyn and Queens, these higher rate neighbourhoods were less densely concentrated in comparison to the Bronx, Manhattan and Staten Island (Figs 1 and 2). Annual age-adjusted rates are shown in Supplementary Figures S1-S3.

Semivariogram modelling
Semivariogram modelling revealed a range of spatial autocorrelation of 4859 m for overall pneumonia-associated hospitalisation and 5740 m for severe pneumonia-associated hospitalisation. CAP-associated hospitalisation had a range of spatial autocorrelation of 6005 m ( Supplementary Fig. S4). We were unable to define a range of spatial autocorrelation for HCAP and HAP; therefore, cluster analyses for these subcategories were not performed. Output of the ISA analysis confirmed semivariogram modelling results.

Cluster analyses
Cluster analyses for overall pneumonia-associated hospitalisation Hot spots of overall pneumonia-associated hospitalisation spanned 20 neighbourhoods in the Bronx, four in Manhattan and two in Staten Island, while cold spots were demonstrated across five neighbourhoods in lower Manhattan and four in northeastern Queens (Fig. 3a). When comparing patients who resided within hot vs. cold spots, we found that the greatest proportion of patients residing in hot-spot neighbourhoods identified as non-Hispanic Black (N = 9788; 36.8%) ( Table 2). A remaining 29.4% identified as Latino/Hispanic (N = 7799), 19.1% as non-Hispanic other race (N = 5080) and 13.6% as non-Hispanic White (N = 3614). In contrast, the overwhelming majority of patients residing in cold spots identified as non-Hispanic White (N = 3113; 70.0%), while 16.4% identified as non-Hispanic other race (N = 730), 5.0% identified as Latino/Hispanic (N = 222) and 4.3% as non-Hispanic Black (N = 189).
The mean age at hospitalisation for patients within hot spots of overall pneumonia-associated hospitalisation was 64 years and 76 years for patients within cold-spot neighbourhoods (P < 0.01).
Cluster analyses for severe pneumonia-associated hospitalisation For severe pneumonia-associated hospitalisation, we found hot spots within nine neighbourhoods in northern Bronx, one neighbourhood in northern Manhattan, one in northeast Queens and three at the northern tip of Staten Island (Fig. 3b). Within hot spots of severe pneumonia-associated hospitalisation, the largest proportion of hospitalisations occurred among patients who identified as non-Hispanic Black (N = 2863; 35.5%), while 27.3% Table 1. ICD-9-CM classification of overall pneumonia-associated hospitalisation among adults in New York City, 2010-2014
The mean age at hospitalisation for patients within hot spots of severe pneumonia-associated hospitalisation was 64 years and 72 years for patients within cold-spot neighbourhoods (P < 0.01). For hospitalisations among hot-spot patients, the median LOS was 10.0 days (IQR: 6.0-18.0) compared with 5.0 days (IQR: 3.0-10.0) for cold-spot patients. Only 19.6% (N = 1581) of patients residing in hot spots were released to their home under self-care or covered skilled care; however, this is where the greatest proportion of cold-spot patients were discharged (N = 5118; 66.1%) (P < 0.01). In contrast, the largest proportion of patients residing in hot spots were released to skilled nursing facilities (N = 3910; 48.5%); this proportion was significantly less for cold-spot patients (N = 1003; 13.0%) (P < 0.01). Nearly onequarter of patients within hot spots experienced in-hospital death (N = 1976; 24.5%), whereas approximately one-tenth of cold-spot patients experienced in-hospital death (N = 780; 10.1%) (P < 0.01).
Cluster analyses for CAP-associated hospitalisation For CAP-associated hospitalisation, hot spots were revealed for 26 neighbourhoods across Bronx and northern Manhattan (Fig. 3c). Cold spots were observed in northeast Queens, as well as seven neighbourhoods in southern Manhattan and Brooklyn.
The mean age at hospitalisation for patients within hot spots of CAP-associated hospitalisation was 63 years, as opposed to 72 years for cold spots (P < 0.01). Once more, the largest proportion of patients residing in hot spots identified as non-Hispanic Black (N = 9063; 35.2%), with approximately one-third of individuals identifying as Latino/Hispanic (N = 8535; 33.2%) ( Table 4). For hospitalisations among hot-spot patients, the median LOS was 5.0 days (IQR: 3.0-10.0) compared with 6.0 days (IQR: 3.0-10.0) for cold-spot patients. In examining the severity of CAP-associated hospitalisations, proportions were similar for hot spots vs. cold spots (P > 0.05). We did not identify statistically significant differences in the proportion of patients discharged to the home under self-care or covered skilled care for CAP-associated hospitalisation clusters, and the proportion of patients who experienced in-hospital death was only slightly higher for hot spots (N = 2967; 11.5%) compared with cold spots (N = 780; 10.7%) (P = 0.05). However, a larger percentage of patients were discharged to skilled nursing facilities within hot spots (N = 4968; 19.3%) compared with cold spots (N = 1187; 16.2%) (P < 0.01).

Discussion
We found distinct spatial patterns in the rates of overall pneumonia-associated hospitalisation, severe pneumonia-associated hospitalisation and CAP-associated hospitalisation for NYC during 2010-2014. In applying a spatial framework to investigate the epidemiology of pneumonia, we incorporated the inter-relatedness of race/ethnicity, health behaviours and health services in representing rates of pneumonia-associated hospitalisation [18].
To reduce illness and death from pneumonia, it is important to understand where burden is greatest and the setting in which NYC residents acquire infection. Our previous study found that 60% of pneumonia-associated hospitalisations are attributable to CAP, and here we identified neighbourhoods for potential interventions [4]. Specific risk factors for each neighbourhood cannot be ascertained without regression modelling. However, patterns in the spatial clusters for CAP largely reflected segregation by race/ethnicity that exists across the city and inequities in poverty and chronic disease, as described below. Setting of acquisition results differed from estimates presented by Corrado et al. due to variations in the definition of pneumonia-associated hospitalisation. Corrado et al. present pneumonia-associated hospitalisations based on diagnostic codes among any of the discharge diagnoses, while we include only those based on principal diagnosis [4].
Throughout this study, we demonstrated that neighbourhoods in the Bronx and northern Staten Island were disproportionately affected by high rates of pneumonia-associated hospitalisation. Twenty of the 36 NTAs that make up the Bronx were classified as hot spots of overall pneumonia-associated hospitalisation, revealing a high burden across the borough. For Staten Island, only a small locality was defined as a hot spot of overall pneumonia-associated hospitalisation and severe pneumonia-associated hospitalisation. These communities have higher Latino/Hispanic and non-Hispanic Black populations compared with other neighbourhoods within this city. In the Bronx, 45% of individuals identify as being of White race, while approximately 64% of residents in Manhattan identify as White [10]. For neighbourhoods within the Staten Island hot spot, 39% of individuals identify as White, while remaining portions of the borough range between 70% showing NYC neighbourhoods according to average annual age-adjusted hospitalisation rates of community-acquired pneumonia-associated hospitalisation (a), healthcare-associated pneumonia-associated hospitalisation (b), hospital-acquired pneumonia-associated hospitalisation (c), and ventilator-associated pneumonia-associated hospitalisation (d). Labels indicate the five NYC boroughs (Manhattan, Bronx, Brooklyn, Queens and Staten Island). Hospitalisation rates were calculated for each residential Neighborhood Tabulation Area (NTA) based on hospital discharge data from the New York Statewide Planning and Research Cooperative System, and are divided into quartile classifications, with an equal number of residential NTAs in each class. Higher hospitalisation rates are shown in darker blue, and lower hospitalisation rates shown in lighter blue. Non-residential NTAs were excluded from the analysis and are shown in grey.

Epidemiology and Infection
and 85% White [19]. Area-based poverty may be one contributing factor to pneumonia-risk within NYC. Approximately 31% of individuals in the Bronx live below the Federal Poverty Level (FPL), representing the highest percentage among the five boroughs, and 20% live below the FPL in the Staten Island hot spot, the highest percentage in that borough [19,20].
Additionally, increased risk of pneumonia among individuals with chronic conditions such as asthma and diabetes has been described [21,22]. To this end, the Bronx has rates of adult hospitalisation for diabetes and asthma that are approximately twice the citywide rate [20]. Adults within the Staten Island hot spot also have higher rates of diabetes and asthma hospitalisations compared with the borough and city overall [19]. Finally, the rate of premature death (death before age of 65 years) in certain neighbourhoods of the Bronx is over four times the rate of premature death in higher socioeconomic status neighbourhoods of southern Manhattan [20]. These associations reinforce that efforts to reduce pneumonia morbidity and mortality cannot be dissociated from broader health equity interventions. Instead, pneumonia prevention would benefit from being conducted in conjunction with efforts to address underlying social conditions that make residents of these hot spots more vulnerable to infectious and chronic disease as well as premature death [23,24].
Cold spots of pneumonia-associated hospitalisation were densely concentrated in southern Manhattan and western Brooklyn. The average age of patients experiencing a severe pneumonia-associated hospitalisation was eight years lower within hot spots vs. cold spots. Given the high proportion of poverty in the Bronx and Staten Island, we hypothesise that premature age at hospitalisation is a downstream effect of disparities in access to care due to factors such as race, income inequality and access to health care [25][26][27]. Between 2010 and 2015, which overlaps the study period, there was a 4.1% increase in comprehensive health insurance coverage among NYC residents [28]. Additionally, in 2015, NYC launched an effort to expand community health centres in 25 underserved neighbourhoods to build primary care capacity [29]. As data become available, our analyses can be replicated to understand if and how efforts to improve community health have impacted rates and clusters of severe pneumoniaassociated hospitalisation. Furthermore, our results established that the LOS for patients admitted for severe pneumonia was longer and the proportion of patients who experienced in-hospital death was over two times higher for residents of hot spots vs. cold spots. Therefore, directing prevention and treatment efforts towards severe pneumonia-associated hospitalisation hot spots creates a potential opportunity to more substantially impact pneumonia-associated mortality.

Limitations
While this study adds to our understanding of the spatial epidemiology of pneumonia-associated hospitalisations within NYC, several limitations exist. Foremost, the quality of data for classifying pneumonia within SPARCS is unknown. ICD-9-CM coding is subject to differential facility management and medical charting practices, with the potential for inaccuracies [30,31]. We assumed that the use of respiratory failure or sepsis as a principal diagnosis represented a severe pneumonia-associated hospitalisation, but have not verified this through chart review. Additionally, hospitals may be incentivised to code and bill for diagnoses that maximise insurance reimbursement [31]. We controlled for this bias, in part, by utilizing hospitalisation definitions set forth by Lindenauer et al. [9] The accuracy of patient-level records in SPARCS is unknown, particularly with respect to race/ethnicity and residential information. For address data, precise reporting is a challenge for individuals of low socioeconomic status subject to housing instability, and this study does not take into account persons who reside, either short-term or long-term, within nonresidential NTAs [32]. Furthermore, it is important to distinguish that SPARCS uniquely identifies hospitalisations for which pneumonia has been listed as a discharge diagnosis, and not pneumonia-confirmed cases. Despite this, rates derived from SPARCS likely underestimate the incidence of pneumonia in NYC during 2010-2014, as we did not account for outpatient  (Figs 1 and 2). Non-residential NTAs were excluded from the analysis and are shown in grey. a Hot spot: clusters of nearby neighbourhoods with similarly high hospitalisation rates; high-low spatial outlier: neighbourhoods with high hospitalisation rates near neighbourhoods with low hospitalisation rates; low-high spatial outlier: neighbourhoods with low hospitalisation rates near neighbourhoods with high hospitalisation rates; cold spot: clusters of nearby neighbourhoods with similarly low hospitalisation rates; not significant: neighbourhoods without statistically significant spatial clustering. 6 P. A. Kache et al.
pneumonia cases. Limitations associated with the defined setting of acquisition are described in detail by Corrado et al. [4] Due to the large numbers of hospitalisations within each hot and cold spot, the proportional differences between the clusters are overwhelmingly statistically significant at a 5% level of significance. Results should therefore be interpreted with respect to clinical rather than statistical significance [33]. This study is based on average-annual age-adjusted rates for 2010-2014, without accounting for inter-annual variability in hospitalisations during this period. While exploring the drivers of this variation (e.g. seasonal and pandemic influenza) falls out of the scope of this study, annual rates are presented in Supplementary Figures S1-S3. Finally, results of the comparative analyses between hot and cold spots of overall, severe and CAP-associated hospitalisations may reflect confounding demographic and clinical factors. This limitation will be explored by developing future multivariable regression models. Based on the spatial autocorrelation of pneumonia-associated hospitalisation rates detected in this study, a geographically weighted regression would likely be the most suitable course of action. Given the constraints of SPARCS data described in this section, further research is required to identify covariates of CAP and HAP in NYC at the appropriate spatial scale. This study was built off of previous DOHMH examinations of the setting of acquisition, hospitalisation rates and mortality rates associated with pneumonia in NYC [4]. This study helps to reveal how pneumonia-associated hospitalisations vary within and between NYC boroughs. Additionally, our analyses detail geographic differences in the clinical and demographic characterisations of pneumonia-associated hospitalisations.
Reducing pneumonia mortality within NYC will likely require a systems approach, leveraging a combination of citywide initiatives and community engagement activities to raise health equity [34]. Certain strategies, such as improving standards for inpatient care can be addressed through governmental and hospital policy. However, efforts to target risk factors for pneumonia within the community must incorporate the specific needs, culture and infrastructure of NYC neighbourhoods. Based on results of this study, we recommend that additional research be focused on understanding the reasons for higher rates in hot spots through geographic regression modelling, and heterogeneities in clinical management of pneumonia across NYC be examined through facility-level analyses. We propose that resources intended for the improvement of pneumonia morbidity and mortality be directed towards neighbourhoods within the Bronx, as well as northern Manhattan and Staten Island. Setting of acquisition classifications: community-acquired pneumonia (CAP); healthcare-associated pneumonia (HCAP); hospital-acquired pneumonia (HAP); ventilator-associated pneumonia (VAP). g Other classification included patients who left against medical advice or discontinued care, or those discharged to: short-term general hospitals, facilities that provide custodial or supportive care, designated cancer centres or children's hospitals, federal healthcare facilities, hospice, inpatient rehabilitation facilities, Medicare-certified long-term care hospitals, psychiatric hospitals, critical access hospitals, or another type of healthcare institution not defined in the SPARCS code list. *Set to 5% level of significance. 8 P. A. Kache et al. Missing sex classification for n = 1. g Other classification included patients who left against medical advice or discontinued care, or those discharged to: short-term general hospitals, facilities that provide custodial or supportive care, designated cancer centres or children's hospitals, federal healthcare facilities, hospice, inpatient rehabilitation facilities, Medicare-certified long-term care hospitals, psychiatric hospitals, critical access hospitals or another type of healthcare institution not defined in the SPARCS code list. *Set to 5% level of significance.