Association between Salmonella infection and colon cancer: a nationwide registry-based cohort study

Laboratory data increasingly suggest that Salmonella infection may contribute to colon cancer (CC) development. Here, we examined epidemiologically the potential risk of CC associated with salmonellosis in the human population. We conducted a population-based cohort study using four health registries in Denmark. Person-level demographic data of all residents were linked to laboratory-confirmed non-typhoidal salmonellosis and to CC diagnoses in 1994–2016. Hazard ratios (HRs) for CC (overall/proximal/distal) associated with reported salmonellosis were estimated using Cox proportional hazard models. Potential effects of serovar, age, sex, inflammatory bowel disease and follow-up time post-infection were also assessed. We found no increased risk of CC ≥1 year post-infection (HR 0.99; 95% confidence interval (CI) 0.88–1.13). When stratifying by serovar, there was a significantly increased risk of proximal CC ≥1 year post-infection with serovars other than Enteritidis and Typhimurium (HR 1.40; 95% CI 1.03–1.90). CC risk was significantly increased in the first year post-infection (HR 2.08; 95% CI 1.48–2.93). The association between salmonellosis and CC in the first year post-infection can be explained by increased stool testing around the time of CC diagnosis. The association between proximal CC and non-Enteritidis/non-Typhimurium serovars is unclear and warrants further investigation. Overall, this study provides epidemiological evidence that notified Salmonella infections do not contribute significantly to CC risk in the studied population.


Introduction
Colon cancer (CC) is the third most common cancer in industrialised countries, with 1.1 million new diagnoses annually worldwide [1]. Although genetic, environmental and lifestyle-related exposures are the best-known risk factors for cancer, around 20% of the global cancer burden is estimated to be attributable to infectious agents, including bacteria [2]. Examples hereof concerning the gastrointestinal tract include Helicobacter pylori infection as risk factor for gastric cancer, and Salmonella Typhi infection as risk factor for gallbladder carcinoma in chronic typhoid carriers [3][4][5].
Several mechanisms have been identified through which bacteria can contribute to cancer formation. These include chronic inflammation, production of DNA-damaging toxins and manipulation of host cell signalling pathways [3,4,6]. The latter promotes bacterial uptake, intracellular survival and egress in case of Salmonella infection. Indeed, several Salmonella effector proteins have been shown to activate the major host cell signalling pathways AKT and MAPK, which are central to many signalling cascades and are often deregulated in cancers [4]. Salmonella is expected to contribute to carcinogenesis mainly under conditions of longlasting infections, an intact bacterial type 3 secretion System (T3SS), and with a background of host predisposition, in which significant numbers of pre-transformed cells are present in the intestine. This has been shown in vivo by experiments demonstrating a higher risk of colon carcinoma formation after infection with wild type vs. ΔprgH mutant S. Typhimurium (lacking the T3SS) strains in mice genetically predisposed to cancer (APC+/−) vs. normal mice [4].
Against this background of experimental data, population-based epidemiological studies addressing the association between Salmonella infection and CC are limited to one [7]. In a nationwide registry study in the Netherlands, an increased risk of CC was observed among patients who had a reported (severe) Salmonella infection between 20 and 60 years of age as compared to the baseline CC risk in the Dutch population [7]. This increased risk was significant following infection with S. Enteritidis and for the proximal part of the colon. Moreover, it was shown that among CC patients, the risk of having had a previously notified Salmonella infection was higher for individuals with pre-infectious inflammatory bowel disease (IBD), although numbers were small [7]. IBD is a known risk factor for both CC and salmonellosis, as this chronic condition is associated with recurrent episodes of gut inflammation and increased susceptibility to infection and testing [8,9].
Salmonella is a major cause of bacterial gastroenteritis worldwide, with over 90 000 infections reported to public health authorities in Europe each year [10]. In Denmark, an annual average of 1100 salmonellosis cases has been reported in recent years through the national surveillance system [11]. As most Salmonella infections are mild with self-limiting symptoms, the majority of infections go unreported. It is estimated that the true number of Salmonella infections (i.e. after correction for underreporting and underdiagnosis) is approximately 10 times higher than the number of infections reported in the national disease surveillance system in Denmark [12]. Each year, around 3400 people are diagnosed with CC in the Danish population [13]. Although screening programmes aiming at early detection of CC typically target the older population (i.e. individuals aged >50 years), the incidence of CC in young adults has increased during the last 25 years, being a cause for concern [14]. IR, incidence rate. a Including both 'newly exposed' and exposed. b per 100 000 person-years 2 Janneke W. Duijster et al.
In this study, we assessed the potential association between Salmonella infection and CC in Denmark. To this end, we made use of data from comprehensive health registries in Denmark to compare the incidence of CC among individuals with a previously reported salmonellosis to that of individuals without reported salmonellosis. In addition, we assessed potential effects in subgroup analyses as defined by age, sex, IBD and time since infection on the association between Salmonella infection and CC.

Data sources
We conducted a population-based cohort study with data from four health registries in Denmark between January 1994 and December 2016. Demographic characteristics including sex, date of birth, vital status (e.g. date of death, immigration and emigration), marital status and region of living of all people residing in Denmark were retrieved from the Danish Civil Registration System [15]. A second dataset included information on bacterial gastrointestinal infections, with recorded bacterial species and subspecies/serovar and date of diagnosis (Danish Register on Enteric Pathogens) [16]. The presence of IBD, i.e. ulcerative colitis and Crohn's disease with date of diagnosis, was obtained from the Danish National Patient Registry [17]. The fourth dataset contained all CC diagnoses from 1978 until December 2016 reported to the Danish Cancer Registry, with date of diagnosis and tumour location (based on ICD-10 code) [18]. Data of all four sources were matched using the CPR-number, which is a unique identifier used across all national registries [15].

Exposure and outcome definition
The exposure variable was defined as having or not having had a reported non-typhoidal Salmonella infection. Salmonella infection was categorised into infections with SE, ST or other serovars. For individuals with multiple Salmonella infections, only the first reported infection was considered. In analyses restricting exposure to a serovar, only the first infection of the serovar of interest was used. Considering a minimal development time of 1 year for CC formation after infection, which has been assumed previously to have a plausible relation to the infection [7,19], people were considered at risk of CC from 1 year after reported Salmonella infection onwards. Hence, we defined the exposure status as a time-varying variable with three states: individuals were 'unexposed' (reference) until first reported infection, 'newly exposed' in the first year post-infection and 'exposed' from 1 year post-infection onwards. We excluded individuals with a diagnosed CC between January 1978 and December 1993, to reduce the risk that CC had developed before the Salmonella infection occurred. The outcome studied was CC (ICD-10 codes C180-C187). For the analysis, we looked at CC overall and by colon subsite: proximal colon (C180-C185) and distal colon (C186, C187). In the analyses of risk of cancer in one colon subsite individuals were not censored for cancers in the other subsite.

Statistical analysis
Characteristics of the study population were presented descriptively. In the survival analyses, individuals were followed from birth or 1st January 1994, whichever was last. Follow-up ended at date of cancer diagnosis, death or the end of study (31st December 2016), whichever occurred first. In addition, risk time excluded periods where individuals were temporarily or permanently living outside of Denmark. Three of the potential confounders; geographical region, marital status and IBD status were time-varying variables with five (North Jutland, South Jutland, Middle Jutland, Zealand, Capital), four (unmarried, married, divorced, widowed) and two (yes, no) levels, respectively. We used Cox proportional hazard models to estimate hazard ratios (HRs) with 95% confidence intervals (CIs) for developing CC in individuals with a history of reported Salmonella infection vs. people without such history. The main comparison of interest was exposed vs. unexposed; hence, all analyses show the HRs for this comparison. Besides, in the main analysis the HRs for 'newly exposed' vs. unexposed were displayed to address the potential effect of testing/diagnostic bias in symptomatic individuals with yet undiagnosed CC [20]. Attained age was used as the time scale for the baseline hazard function, which was stratified by sex, year of birth, geographical region, marital status and IBD to adjust for potential confounding effects. Additionally, we conducted analyses to examine whether HRs varied by sex, attained age (<50, 50-59, 60-69, 70-79 and ≥80 years), age at infection (<50, 50-59, 60-69, 70-79 and ≥80 years), follow-up time postinfection (2nd-5th year, 6th-10th year and >10 years) and IBD. The proportional hazards assumption of the main analysis (salmonellosis overall and CC overall) was assessed using a test for homogeneity of the HR in the age intervals; <50, 50-59, 60-69, 70-79 and ≥80 years of age. Incidence curves of CC (overall and by subsite) in the exposed vs. unexposed group (stratified by Salmonella serovar) were also generated to graphically display the comparison; incidences at all ages were calculated by weighting the number of cancers and risk days within a time span of ±5 years using a parabolic kernel. In all analyses, P-values <0.05 were considered statistically significant. In accordance with privacy legislation, small numbers were not displayed in tables. Statistical analysis was performed using the PHREG procedure of SAS (version 9.4).

Results
During a total of 124.7 million person-years of follow-up, 54 902 individuals were diagnosed with CC, at a median age of 72 years (IQR: 64-80). Among those with a CC diagnosis, 278 individuals were diagnosed with CC after salmonellosis, of which 33 occurred within the first year post-infection. The median time span between infection and CC diagnosis was 7.5 years (IQR: 3.0-13.9). In the subsite-specific analyses, 29 422 individuals were diagnosed with proximal CC and 26 108 with distal CC. Table 1 shows the number of overall CC events and incidence rates (IRs) in the exposed and unexposed groups by different subgroups. The average IR of CC in the exposed group was 47.16 per 100 000 person-years at-risk, whereas in the unexposed group the IR was 44.02.

Risk of colon cancer
Adjusting for sex, year of birth, region of residence, IBD and marital status, the overall risk of CC did not differ between the exposed and unexposed groups (HR: 0.99 [95% CI 0.88-1.13]) (Table 2). Similarly, no differences were observed between these groups when stratifying by colon subsite and sex. However, within 1 year post-infection the overall risk of CC increased twofold compared to the unexposed group (HR: 2.08 [95% CI 1.48-2.93]). For the exposed group, when stratifying by serovar, an HR of 1.40 (95% CI 1.03-1.90) for cancer in the proximal colon was observed in individuals who had an infection with serovars other than SE and ST ( Table 2). The association between Salmonella infection and CC did not vary by attained age (Table 3). A test for homogeneity of HRs in the five age groups yielded a P-value of 0.59. Figure 1 shows the incidence of CC (overall and per colon subsite) by attained age for the different serovars. For both proximal and distal CC, the IRs were the lowest for ST in people aged above 60 years as compared to SE and other serovars. In the age-stratified analyses, a 1.87-fold (95% CI 1.00-3.50) increased risk of distal CC was observed in the exposed group aged 0-49 years (for Salmonella overall). For the proximal colon, the highest HR, although not significant, was also observed in the age group 0-49 years among those infected with other serovars (HR: 1.75; 95% CI 0.56-5.44). The estimated association between Salmonella and CC risk did not vary much by age at infection ( Table 4). The median observed ages at infection of different serovars in the total cohort were 37 years (IQR: 16-54) for SE, 30 years (IQR: 9-52) for ST and 32 years (IQR: 16-54) for other serovars.
There was no significant effect of follow-up time post-infection on CC risk; the HRs of proximal CC for people infected with other serovars were 1.62 (95% CI 0.96-2.74), 1.47 (95% CI 0.85-2.53) and 1.20 (95% CI 0.72-1.90) at 1-5 years, 5-10 years and >10 years post-infection, respectively (Table 5). With regards to the potential effect modification of IBD on CC risk, the HR for overall CC was not significantly higher for people with underlying IBD (HR 1.18; 95% CI 0.81-1.71) ( Table 2).

Discussion
We assessed the risk of CC after reported Salmonella infection in a 23-year follow-up of the entire Danish population. The risk of CC in individuals with reported salmonellosis was compared to the risk in individuals without a reported salmonellosis, accounting for potential confounding and modifying effects of age, sex, IBD and follow-up time post-infection. Overall, we observed no increased risk of CC among salmonellosis cases. The only significantly increased risk of CC concerned the proximal colon after infection with serovars other than SE and ST. The proximal part of the colon is the subsite of primary interest, as exposure to Salmonella is highest in this part of the large intestine located directly after the ileum, where Salmonella typically establishes infection. CC risk was highest at an attained age of <50 years for those infected with other serovars. For SE, the estimated HRs increased with increasing age, whereas for ST they decreased with increasing age, which may be due to differences in agespecific reporting between these serovars.

Epidemiology and Infection
In a recent Dutch study, a statistically significantly increased risk of cancer in the proximal colon was found among individuals with a history of SE infection [7]. This result was not confirmed in the current study, indicating that the previously observed association between SE infection and proximal CC is not generalisable to other study populations. On the one hand, the inconsistent findings might be explained by a more complex causal mechanism than originally anticipated and the existence of situational differences, but may also represent a chance finding. Indeed, our results seem to indicate a possible scenario of increased risk of proximal CC associated with infection with a Salmonella serovar other than SE or ST, but it could also be the result of type I error due to multiple hypothesis testing.
For surveillance design reasons, the two studies used different types of analyses and effect measures. The Dutch Salmonella surveillance system covers approximately 64% of the population; therefore, the risk of CC in individuals with a reported salmonellosis was compared to the baseline CC risk in the general Dutch population, expressed as standardised incidence ratios [7]. The Danish surveillance system covers the whole population, which allowed us to compare the risk of CC in people with those without a reported salmonellosis using Cox regression. Yet, both studies used individual-level data and the inclusion criteria were comparable. Apart from the aforementioned possibility of a chance finding, other and largely unknown factors might underlie this dissimilarity including, for instance, different serovar distributions and populations exposed to them. Disease outcomes (e.g. severity, antimicrobial resistance, etc.) and epidemiology (e.g. sources, modes of transmission, high-risk groups, etc.) differ by serovar, partly due to differences in exposure but also potential factors related to virulence, invasiveness and toxins of the bacterium itself [21]. The estimated number of Salmonella infections in Denmark is somewhat higher compared to the Netherlands, with respectively 18.1 and 15.8 infections per 10 000 inhabitants in Denmark and the Netherlands in 2017 [12,22]. In both the Netherlands and Denmark, SE accounts for a substantial part (25-30%) of the salmonellosis cases [11,23]. The successful implementation of a Salmonella control programme in the poultry production chain led to a marked reduction of domestically acquired human SE infections in Denmark since 1998, with most infections nowadays being attributable to foreign travel (78.2% in 2016) [11,24]. In contrast, most SE infections in the Netherlands remain domestically acquired; therefore, the groups of people infected with SE might not be fully comparable in terms of, e.g. general health status, lifestyle, socio-economic status, ethnicity and possible co-morbidities. Besides, with regards to serovars other than SE and ST, different distribution and exposure patterns, as well as specific strains, might also have contributed to these differences, as the genetic makeup of the strains themselves might be associated with their ability to transform [21]. It has been shown experimentally that Salmonella infection of pre-transformed fibroblasts and organoids induces full cell transformation [4]. Development from a pre-malignant state to an advanced carcinoma takes several years; however, Salmonella infection is likely to accelerate this process [4,25 ]. The results of a sub-analysis showed a twofold increased risk of CC (overall and per subsite) for individuals within the first year postinfection. Even though Salmonella could accelerate transformation, tumour development in less than 1 year seems implausible. We therefore consider this observation to reflect testing/diagnostic bias rather than the transformation capacity of Salmonella. Undiagnosed CC patients often present at their general practitioner (GP) with nonspecific symptoms resembling gastroenteritis, such as diarrhoea and frequent bowel movements. In a Danish cohort, it was shown that CC patients had significantly more GP consultations in the 9 months prior to the cancer diagnosis [17]. A similar pattern was observed in another Danish cohort study that examined the risk of IBD after a Salmonella-positive stool test. An increased risk of IBD was observed in the first year after Salmonella infection; however, this was even more pronounced in the first year following a negative stool test [9]. The association we found in the first year postinfection is compatible with these prior observations. An alternative hypothesis might be that people in an early stage of cancer are more susceptible to Salmonella infection due to dysbiosis or other changes in the gut microbiome [26], so the association observed in the first year after infection might also reflect reverse causality. Still, both the testing/diagnostic bias and the increased susceptibility could co-exist in people with an early-stage pre-diagnosed CC.
This study has some limitations. First, mainly severe infections, outbreak-related infections or infections with a suspected foreign source are included, as most people do not present at their GP with mild and self-limiting gastrointestinal complaints. Hence, we could not assess whether multiple mild Salmonella infections that are undiagnosed and unreported contribute to CC risk or not. This could be the subject of another study using, for instance, serology to measure the magnitude of exposure to Salmonella regardless of reporting bias. Second, in the Dutch cohort, the risk of cancer was only significantly increased for enteric infections and not for invasive (bloodstream) infections [7], but we were not able to address this observation in the current study. Third, we were not able to control for some of the main risk factors for CC, such as obesity, smoking and alcohol consumption. Although considering these variables would be relevant to explain CC risk along with the studied Salmonella infection, this would require a different study design as these types of time-varying variables are not generally present in national health registries.
In conclusion, the current study found no unusual CC risk associated with previously reported Salmonella infection overall. Therefore, although there is growing experimental evidence for a potential role of Salmonella in CC development, notified Salmonella infections do not appear to be an important driver of CC risk in the studied population. Indeed, the previously observed epidemiological association between SE infection and proximal CC was not confirmed here. The explanation for these differences, if not merely occurring by chance, is unclear. Financial support. This study was supported by KWF grant 'Bacterial food poisoning and colon cancer; a cell biological and epidemiological study' (2017-1-11001). No funding bodies had any role in study design, data collection and analysis, decision to publish or preparation of the manuscript.