A machine learning approach to zero-inflated Poisson (ZIP) regression is introduced to address a common difficulty arising from imbalanced financial data. The suggested ZIP can be interpreted as an adaptive weight adjustment procedure that removes the need for post-modeling re-calibration and results in a substantial enhancement of predictive accuracy. Notwithstanding the increased complexity due to the expanded parameter set, we utilize a cyclic coordinate descent optimization to implement the ZIP regression, with adjustments made to address saddle points. We also study how various approaches alleviate the potential drawbacks of incomplete exposures in insurance applications. The procedure is tested on real-life data. We demonstrate a significant improvement in performance relative to other popular alternatives, which justifies our modeling techniques.
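For readers unfamiliar with the zero-inflated Poisson distribution itself, the following is a minimal sketch of its textbook probability mass function and log-likelihood. It assumes a single rate `lam` and a single zero-inflation probability `pi` with no covariates; the function names and the example counts are hypothetical, and the sketch does not reproduce the regression, the coordinate descent optimizer, or the exposure handling described in the abstract.

```python
from math import exp, factorial, log


def zip_pmf(k, lam, pi):
    """Textbook zero-inflated Poisson probability of observing count k.

    lam: Poisson rate (> 0); pi: probability of a structural zero (0 <= pi < 1).
    """
    poisson = exp(-lam) * lam ** k / factorial(k)
    if k == 0:
        # A zero can be a structural zero or an ordinary Poisson zero.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def zip_log_likelihood(counts, lam, pi):
    """Sum of log-probabilities over independent observed counts."""
    return sum(log(zip_pmf(k, lam, pi)) for k in counts)


# Hypothetical claim counts, dominated by zeros as in imbalanced data.
example_counts = [0, 0, 0, 0, 1, 0, 2, 0, 0, 3]
example_ll = zip_log_likelihood(example_counts, lam=2.0, pi=0.3)
```

Setting `pi` to 0 recovers the ordinary Poisson distribution, which is one way to see the extra parameter as an adjustment for excess zeros.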
This chapter serves as a guide to common advanced statistical methods: multiple regression, two-way and three-way analysis of variance, logistic regression, multiple logistic regression, Spearman’s rho correlation, Wilcoxon rank-sum test, and the Kruskal-Wallis test. Each of these explanations is accompanied by a software guide to show how to conduct these procedures and interpret the results. There is also a brief description of common multivariate procedures.
The prevalence of extended-spectrum beta-lactamase (ESBL)-producing Escherichia coli and Klebsiella pneumoniae urinary tract infections (UTIs) is increasing worldwide. We investigated the prevalence, clinical findings, impact and risk factors of ESBL E. coli/K. pneumoniae UTI through a retrospective review of the medical records of children with UTI aged <15 years admitted to Prince of Songkla University Hospital, Thailand over 10 years (2004–2013). Thirty-seven boys and 46 girls had ESBL-positive isolates in 102 UTI episodes, compared with 85 boys and 103 girls with non-ESBL isolates in 222 UTI episodes. The age of presentation and gender were not significantly different between the two groups. The prevalence of ESBL rose between 2004 and 2008 before plateauing at around 30–40% per year, with a significant difference between first and recurrent UTI episodes of 27.3% and 46.5%, respectively (P = 0.003). Fever prior to UTI diagnosis was found in 78.4% of episodes in the non-ESBL group and 61.8% of episodes in the ESBL group (P = 0.003). Multivariate analysis indicated that children without fever (odds ratio (OR) 2.14, 95% confidence interval (CI) 1.23–3.74) and those with recurrent UTI (OR 2.67, 95% CI 1.37–5.19) were more likely to yield ESBL on culture. Congenital anomalies of the kidney and urinary tract were not linked to the presence of ESBL UTI. In conclusion, ESBL producers represented one-third of E. coli/K. pneumoniae UTI episodes but neither clinical condition nor imaging studies were predictive of ESBL infections. Recurrent UTI was the sole independent risk factor identified.
Chapter 5 teaches how data analysts can change the scale of a distribution by performing a linear transformation, which is the process of adding, subtracting, multiplying, or dividing the data by a constant. Adding and subtracting a constant will change the mean of a variable, but not its standard deviation or variance. Multiplying and dividing by a constant will change the mean, the standard deviation, and the variance of a dataset. A table shows students how linear transformations change the values of models of central tendency and variability. One special linear transformation is the z-score. A set of z-scores always has a mean of 0 and a standard deviation of 1. Putting datasets on a common scale permits comparisons across different units. But linear transformations, like the z-score transformation, force the data to have the desired mean and standard deviation. Yet, they do not change the shape of the distribution – only its scale. Indeed, all scales are arbitrary, and scientists can use linear transformations to give their data any mean and standard deviation they choose.
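The effect of these transformations can be checked numerically. The sketch below uses a small hypothetical dataset and hypothetical helper names, and assumes the population formula for the standard deviation (`statistics.pstdev`); with the sample formula the z-scores would be rescaled slightly.

```python
from statistics import mean, pstdev

# Hypothetical scores with mean 5 and population standard deviation 2.
scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]


def linear_transform(data, multiplier=1.0, shift=0.0):
    """Multiply every value by a constant, then add a constant."""
    return [multiplier * x + shift for x in data]


def z_scores(data):
    """Subtract the mean, then divide by the standard deviation."""
    m = mean(data)
    s = pstdev(data)
    return [(x - m) / s for x in data]


shifted = linear_transform(scores, shift=10.0)      # mean moves, spread does not
scaled = linear_transform(scores, multiplier=3.0)   # mean and spread both change
standardized = z_scores(scores)                     # mean 0, standard deviation 1

print(mean(shifted), pstdev(shifted))
print(mean(scaled), pstdev(scaled))
print(mean(standardized), pstdev(standardized))
```

The ordering of the values and the relative distances between them are the same in every transformed list, which is why the shape of the distribution is unchanged.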
A one-sample t-test is an NHST procedure that is appropriate when a z-test cannot be performed because the population standard deviation is unknown. The one-sample t-test follows all of the eight steps of the z-test, but requires modifications to accommodate the unknown population standard deviation. First, the formulas that used σy now use the estimated population standard deviation based on sample data instead. Second, degrees of freedom must be calculated. Finally, t-tests use a new probability distribution called a t-distribution.
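The first two modifications can be illustrated with a minimal sketch. The data, the hypothesized mean, and the function name are hypothetical; the sketch uses the common formula t = (sample mean − hypothesized mean) / (s / √n), with s computed using n − 1 in the denominator, and stops before the final step of comparing the result with a t-distribution.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, hypothesized_mean):
    """Return the observed t statistic and its degrees of freedom."""
    n = len(sample)
    s = stdev(sample)                 # estimated population SD (n - 1 denominator)
    standard_error = s / sqrt(n)
    t_observed = (mean(sample) - hypothesized_mean) / standard_error
    degrees_of_freedom = n - 1
    return t_observed, degrees_of_freedom


# Hypothetical sample tested against a hypothesized mean of 4.0.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
t_observed, df = one_sample_t(sample, 4.0)
print(round(t_observed, 3), df)
```

The observed value would then be compared with a critical value from the t-distribution with the computed degrees of freedom.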
This chapter also explains more about p-values. First, when p is lower than α, the null hypothesis is always rejected. Second, when p is higher than α, the null hypothesis is always retained. Therefore, we can determine whether p is smaller or larger than α by determining whether the null hypothesis was retained or rejected at that α. This chapter also discusses confidence intervals (CIs), which are ranges of plausible values for a population parameter. CIs can vary in width, which the researcher chooses. The 95% CI width is most common in social science research. Finally, one-sample t-tests can be used to test the hypothesis that the sample mean is equal to any value (not just the population mean).
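A confidence interval around a sample mean can be sketched as follows. The data and names are hypothetical, and the critical value 2.365 is the two-tailed t value for 7 degrees of freedom at α = .05 taken from a standard t-table; a different sample size or CI width would need a different critical value.

```python
from math import sqrt
from statistics import mean, stdev

# Two-tailed critical t for df = 7 and alpha = .05 (from a standard t-table).
T_CRITICAL_95_DF7 = 2.365


def confidence_interval(sample, t_critical):
    """Return (lower, upper) limits: mean +/- t_critical * s / sqrt(n)."""
    n = len(sample)
    m = mean(sample)
    margin = t_critical * stdev(sample) / sqrt(n)
    return m - margin, m + margin


# Hypothetical sample of eight scores (so df = 8 - 1 = 7).
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
lower, upper = confidence_interval(sample, T_CRITICAL_95_DF7)
print(round(lower, 2), round(upper, 2))
```

Any hypothesized value that falls inside the 95% interval would be retained by a two-tailed one-sample t-test at α = .05, and any value outside it would be rejected, which ties the CI back to the decision rule for p and α.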