COVID-19 and gender-specific difference: Analysis of public surveillance data in Hong Kong and Shenzhen, China, from January 10 to February 15, 2020

Financial support. The authors acknowledge a postdoctoral grant from The Second Affiliated Hospital of Zhengzhou University (to S.K.) and an operating grant support from the National Natural Science Foundation of China (grant nos. 81870942, 81471174, and 81520108011), a grant from the National Key Research and Development Program of China (grant no. 2018YFC1312200), and a grant from Innovation Scientists and Technicians Troop Constructions Projects of Henan Province of China (to M.X.).

To the Editor: An outbreak of coronavirus disease 2019 (COVID-19), which began in Wuhan, China, at the end of 2019, 1 has now reached more than 100 countries and poses a huge threat to global public health and the economy. 2 Given the risk of human-to-human transmission, the serial interval, defined as the time from symptom onset in a primary case (ie, the infector) to symptom onset in a secondary case (ie, the infectee), 3 is an essential quantity that, together with the basic reproduction number, drives the speed of spread.
We examined publicly available materials and collected records of COVID-19 transmission events in 2 neighboring large cities in south China, Hong Kong 4 and Shenzhen, 5 from January 10 to February 15, 2020, and we extracted the serial interval data. We identified 48 transmission events (21 in Hong Kong and 27 in Shenzhen), of which 40 included the gender of the primary case. The last onset date of the primary cases among all collected transmission events was February 2, 2020. Because the data were collected from the public domain, neither ethical approval nor individual consent was applicable. All data used in this work were publicly available from press releases of the Centre for Health Protection (CHP) of Hong Kong 4 and the COVID-19 outbreak situation reports of the Shenzhen Municipal Health Commission, 5 and the key R code is provided as a supplementary file online.
To explore the temporal patterns and the gender-specific difference of serial intervals, we adopted two regression models. Model 1 is a log-linear form for the percentage change, $E[\ln(SI_i(t))] = \alpha_0 + \alpha_1 G_i + \alpha_2 t$. Model 2 is a linear form for the unit change, $E[SI_i(t)] = \beta_0 + \beta_1 G_i + \beta_2 t$. Here, $E[\cdot]$ denotes the expectation, $SI_i(t)$ is the serial interval of the $i$th primary case whose onset date is the $t$th day, and $G_i$ denotes the gender of the $i$th primary case. Hence, $[\exp(\alpha_2) - 1] \times 100\%$ quantifies the percentage change, and $\beta_2$ quantifies the unit change (in days) in the serial interval per day of calendar date. The gender-specific difference can be interpreted similarly through $\alpha_1$ and $\beta_1$. We fit both models using the standard least-squares approach.
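The two least-squares fits can be sketched as follows. This is a minimal illustration, not the authors' supplementary R code: it assumes each model has an intercept, a gender indicator, and the onset day as regressors, and it runs on made-up toy records. The function names, the 0/1 gender coding, and all data values are hypothetical.

```python
import numpy as np


def fit_ols(y, gender, day):
    """Ordinary least-squares fit of y = c0 + c1*gender + c2*day.

    Returns the coefficient vector (c0, c1, c2).
    """
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), gender, day])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def percent_change(log_coef):
    """Convert a log-scale coefficient to a percentage change."""
    return (np.exp(log_coef) - 1.0) * 100.0


# Hypothetical toy records, NOT the study data.
day = np.array([0, 2, 5, 7, 10, 12, 15, 18, 20, 23], dtype=float)  # onset day t
gender = np.array([0, 1, 0, 1, 1, 0, 1, 0, 0, 1], dtype=float)     # 1 = male (assumed coding)
si = np.array([9, 6, 8, 5, 4, 7, 3, 5, 4, 2], dtype=float)         # serial interval, days

alpha = fit_ols(np.log(si), gender, day)  # Model 1: log-linear form
beta = fit_ols(si, gender, day)           # Model 2: linear form

print(f"Model 1: {percent_change(alpha[2]):.1f}% change in SI per calendar day")
print(f"Model 2: {beta[2]:.2f} days change in SI per calendar day")
```

In this sketch the third coefficient of each fit plays the role of the calendar-date effect, and the second coefficient plays the role of the gender effect; confidence intervals would require the residual variance and are omitted here.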
As shown in Figure 1, the serial interval decreased by 0.4 days per day (95% CI, 0.1−0.7), or 6.2% per day (95% CI, 0.4%−11.6%), from January 10 to February 2 in Hong Kong and Shenzhen. The Pearson correlation coefficient between the serial interval and calendar date was estimated at −0.37 (P < .01). The serial interval of male primary cases was 3.5 days (95% CI, 1.2−5.7) shorter than that of female primary cases, or 49.7% (95% CI, 15.3%−70.1%) lower in percentage terms. To verify this finding, we additionally conducted a Cox proportional hazards analysis using a formula similar to those of models 1 and 2 to calculate hazard ratio estimates. The associations of serial interval with calendar date and with gender remained consistent and statistically significant.
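As a worked check on how the percentage figures relate to the log-linear model, the relation [exp(coefficient) − 1] × 100% can be inverted to recover the log-scale coefficient implied by a reported percentage change. The snippet below is only arithmetic on the percentages quoted above; the helper name and variable names are hypothetical.

```python
import math


def log_coef_from_percent(pct_change):
    """Log-scale coefficient implied by a percentage change: ln(1 + pct/100)."""
    return math.log(1.0 + pct_change / 100.0)


alpha_time = log_coef_from_percent(-6.2)     # 6.2% decrease per calendar day
alpha_gender = log_coef_from_percent(-49.7)  # 49.7% lower for male primary cases

print(f"implied time coefficient:   {alpha_time:.3f}")    # -0.064
print(f"implied gender coefficient: {alpha_gender:.3f}")  # -0.687
```

A 49.7% reduction therefore corresponds to a serial interval for male primary cases of roughly half that for female primary cases, since exp(−0.687) ≈ 0.503.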
The shortening of the serial interval over time is likely due to the strengthening of public health control measures. Contact tracing and timely isolation of confirmed COVID-19 infections could lead to a shorter observed serial interval due to right censoring 'bias'. 6,7 As such, we call the serial interval observed under the effects of control measures the effective serial interval, which has a mean of 5.2 days in our data set. This result appears slightly but not significantly shorter than the previously estimated 'intrinsic' serial interval, with a mean of 7.5 days. 1 The mechanism behind the gender difference remains unknown, but it may be partly because male cases are more severe than female cases (ie, "officials recorded a 2.8% fatality rate for male patients versus 1.7% for female patients" 8 ).

Fig. 1. In both panels, red represents the female primary cases and blue represents the male primary cases. The dots are the observed (or median) serial intervals, and the bars are the ranges of serial intervals for multiple primary cases. The bold curves are the fitted results, and the dashed curves are the 95% confidence intervals.
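The right-censoring argument in the discussion can be illustrated with a small simulation. This is a deliberately crude sketch under stated assumptions, not an analysis of the study data: intrinsic serial intervals are drawn from a gamma distribution whose mean is set to the 7.5 days cited above, and isolation is represented by simply discarding intervals longer than a cutoff. The gamma shape, the cutoff, and the sample size are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical intrinsic serial intervals: gamma with mean 2.5 * 3.0 = 7.5 days.
intrinsic = rng.gamma(shape=2.5, scale=3.0, size=10_000)

# Crude stand-in for isolation: pairs whose interval would exceed the
# cutoff are assumed never to be observed (hypothetical cutoff).
cutoff_days = 10.0
observed = intrinsic[intrinsic <= cutoff_days]

print(f"intrinsic mean: {intrinsic.mean():.2f} days")
print(f"observed mean:  {observed.mean():.2f} days")
```

Because only the shorter intervals survive the cutoff, the observed mean is necessarily below the intrinsic mean, which is the direction of the effect described for the effective serial interval.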