Search results for Statistics and Probability

A NONPARAMETRIC TEST OF SIGNIFICANT VARIABLES IN GRADIENTS
Feng Yao, Taining Wang
Journal: Econometric Theory / Volume 37 / Issue 5 / October 2021
Published online by Cambridge University Press: 20 November 2020, pp. 959-1003
- Article

We propose a nonparametric test of significant variables in the partial derivative of a regression mean function. The derivative is estimated by local polynomial estimation and the test statistic is constructed through a variation-based measure of the derivative in the direction of variables of interest. We establish the asymptotic null distribution of the test statistic and demonstrate that it is consistent. Motivated by the null distribution, we propose a wild bootstrap test, and show that it exhibits the same null distribution, whether the null is valid or not. We perform a Monte Carlo study to demonstrate its encouraging finite sample performance. An empirical application is conducted showing how the test can be applied to infer certain aspects of regression structures in a hedonic price model.

Prevalence of Diarrheagenic Escherichia coli (DEC) and Salmonella spp. with zoonotic potential in urban rats in Salvador, Brazil
C. Pimentel Sobrinho, J. Lima Godoi, F. Neves Souza, C. Graco Zeppelini, V. Espirito Santo, D. Carvalho Santiago, R. Sady Alves, H. Khalil, T. Carvalho Pereira, M. Hanzen Pinna, M. Begon, S. Machado Cordeiro, J. Neves Reis, F. Costa
Journal: Epidemiology & Infection / Volume 149 / 2021
Published online by Cambridge University Press: 20 November 2020, e128
- Article
  - Open access

Studies evaluating the occurrence of enteropathogenic bacteria in urban rats (Rattus spp.) are scarce worldwide, specifically in the urban environments of tropical countries. This study aims to estimate the prevalence of diarrhoeagenic Escherichia coli (DEC) and Salmonella spp. with zoonotic potential in urban slum environments. We trapped rats between April and June 2018 in Salvador, Brazil. We collected rectal swabs from Rattus spp., cultured them for E. coli and Salmonella spp., and screened E. coli isolates by polymerase chain reaction to identify pathotypes. E. coli were found in 70% of Rattus norvegicus and in four Rattus rattus. DEC were isolated in 31.3% of the 67 brown rats (R. norvegicus). The pathotypes detected most frequently were shiga toxin E. coli in 11.9%, followed by atypical enteropathogenic E. coli in 10.4% and enteroinvasive E. coli in 4.5%. Of the five black rats (R. rattus), two presented DEC. Salmonella enterica was found in only one (1.4%) of 67 R. norvegicus. Our findings indicate that both R. norvegicus and R. rattus are hosts of DEC and, at lower prevalence, S. enterica, highlighting the importance of rodents as potential sources of pathogenic agents for humans.

Sociodemographic factors associated with patients hospitalised for coccidioidomycosis in California and Arizona, State Inpatient Database 2005–2011
D. Kupferwasser, L. G. Miller
Journal: Epidemiology & Infection / Volume 149 / 2021
Published online by Cambridge University Press: 20 November 2020, e127
- Article
  - Open access

Coccidioidomycosis is endemic in the Southwestern United States. Disseminated infection can be life-threatening and is responsible for hospitalisation and significant healthcare resource utilisation. There are limited data evaluating factors associated with hospitalisation for coccidioidomycosis. We conducted a cross-sectional study to assess incidence and factors associated with coccidioidomycosis-associated hospitalisation in California and Arizona. We analysed hospital discharge data obtained from the State Inpatient Database for California and Arizona between 2005 and 2011 and performed multivariable logistic regression examining factors associated with coccidioidomycosis-associated hospitalisation. During our time frame, we found 23 758 coccidioidomycosis-associated hospitalisations. Coccidioidomycosis incidence was over sixfold higher in Arizona compared to California (198.9 vs. 29.6/100 000 person-years). In our multivariable model, coccidioidomycosis-associated hospitalisation was associated with age group 40–49 years (referent group: age 18–29 years, adjusted odds ratio (aOR) = 1.50 (95% confidence interval (CI) 1.43–1.59)), African American race (referent group: Caucasian, aOR = 1.98 (95% CI 1.89–2.06)), residing in a large rural town (referent group: urban area, aOR = 2.28 (95% CI 2.19–2.39)), uncomplicated diabetes (aOR = 1.47 (95% CI 1.41–1.52)), chronic obstructive pulmonary disease (aOR = 1.59 (95% CI 1.54–1.65)) and higher number of comorbidities (aOR = 1.02 (95% CI 1.02–1.03) for each point in the Elixhauser score). Identifying persons at highest risk for hospitalisation with coccidioidomycosis may be helpful for future prevention efforts.

4 - Evaluation
from Part I - Introduction and Overview
Jianxin Wu, Nanjing University, China
Book: Essentials of Pattern Recognition
Published online: 08 December 2020
Print publication: 19 November 2020, pp 63-98
- Chapter

Summary

With the full system introduced in Chapter 3, now we are ready to discuss how to evaluate the performance of a pattern recognition system: a task that seems easy at first glance but is in fact quite complex. We introduce core concepts such as error and accuracy rates, under- and overfitting, and parameters and hyperparameters. We pay special attention to imbalanced problems. Finally, we present a brief introduction on how confident we can be of the evaluation outcomes. We establish the fact that errors are inevitable in most pattern recognition systems, and also introduce a decomposition of errors into different terms.

2 - Mathematical Background
from Part I - Introduction and Overview
Jianxin Wu, Nanjing University, China
Book: Essentials of Pattern Recognition
Published online: 08 December 2020
Print publication: 19 November 2020, pp 15-43
- Chapter

Summary

This book intends to be self-contained, and this chapter provides a short recap of (almost) all the necessary mathematical background that is required to understand the rest of this book.

Contents
Jianxin Wu, Nanjing University, China
Book: Essentials of Pattern Recognition
Published online: 08 December 2020
Print publication: 19 November 2020, pp v-viii
- Chapter

Index
Jianxin Wu, Nanjing University, China
Book: Essentials of Pattern Recognition
Published online: 08 December 2020
Print publication: 19 November 2020, pp 379-384
- Chapter

12 - Hidden Markov Model
from Part IV - Handling Diverse Data Formats
Jianxin Wu, Nanjing University, China
Book: Essentials of Pattern Recognition
Published online: 08 December 2020
Print publication: 19 November 2020, pp 266-290
- Chapter

Summary

HMM (hidden Markov model) is a key tool to handle sequences (time series data), but it is not the only one. We start this chapter with a very brief introduction to a few tools for such data, then devote the rest of this chapter to HMM. We first illustrate what the Markov property is and why it is so important, then naturally present HMM. Three basic problems are introduced in HMM: evaluation, decoding, and learning. Dynamic programming turns out to be the solution to the first two basic problems, and we also introduce Baum–Welch, an algorithm for learning HMM parameters.
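
As an illustration of how dynamic programming can address the evaluation problem mentioned in this summary, here is a minimal sketch of the standard forward recursion for a discrete HMM. It is not taken from the chapter; the function name `forward_likelihood` and the parameter names `pi`, `A`, `B` are assumptions chosen for this example, and the sketch assumes a non-empty observation sequence of integer symbol indices.

```python
def forward_likelihood(pi, A, B, observations):
    """Probability of an observation sequence under a discrete HMM.

    pi[i]   : initial probability of state i (hypothetical name)
    A[i][j] : transition probability from state i to state j
    B[i][k] : probability of emitting symbol k in state i
    """
    n_states = len(pi)
    # alpha[i] = P(first observation, first state = i)
    alpha = [pi[i] * B[i][observations[0]] for i in range(n_states)]
    for obs in observations[1:]:
        # Reuse the previous alpha values instead of enumerating all paths.
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][obs]
            for j in range(n_states)
        ]
    # Summing over the final state gives the sequence likelihood.
    return sum(alpha)
```

The cost grows linearly with the sequence length rather than exponentially, because each step only combines the previous step's per-state values.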

5 - Principal Component Analysis
from Part II - Domain-Independent Feature Extraction
Jianxin Wu, Nanjing University, China
Book: Essentials of Pattern Recognition
Published online: 08 December 2020
Print publication: 19 November 2020, pp 101-122
- Chapter

Summary

Part II introduces domain-independent feature extraction methods, and this chapter presents principal component analysis (PCA). We start from its motivation, using an example. Then we gradually discover and develop the PCA algorithm: starting from zero dimensions, then one dimension, and finally the complete algorithm. We analyze its errors in ideal and practical conditions, and establish the equivalence between maximum variance and minimum reconstruction error. Two important issues are also discussed: when we can use PCA, and the relationship between PCA and SVD (singular value decomposition).

Part II - Domain-Independent Feature Extraction
Jianxin Wu, Nanjing University, China
Book: Essentials of Pattern Recognition
Published online: 08 December 2020
Print publication: 19 November 2020, pp 99-100
- Chapter

Frontmatter
Jianxin Wu, Nanjing University, China
Book: Essentials of Pattern Recognition
Published online: 08 December 2020
Print publication: 19 November 2020, pp i-iv
- Chapter

11 - Sparse and Misaligned Data
from Part IV - Handling Diverse Data Formats
Jianxin Wu, Nanjing University, China
Book: Essentials of Pattern Recognition
Published online: 08 December 2020
Print publication: 19 November 2020, pp 245-265
- Chapter

Summary

There is no silver bullet: no model can fit all data. Hence, special data requires special algorithms. In this chapter, we deal with two types of special data: sparse data and sequences that can be aligned to each other. We will not dive deep into sparsity learning, which is very complex. Rather, we introduce key concepts: sparsity inducing loss functions, dictionary learning, and what exactly the word sparsity means. For the second part in this chapter, we introduce dynamic time warping (DTW), which deals with sequences that can be aligned with each other (but there are sequences that cannot be aligned, which we will discuss in the next chapter). We use our old tricks: ideas, visualizations, formalizations, to reach the DTW solution. The key idea behind its success is divide-and-conquer and the key technology is dynamic programming.
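
To make the dynamic programming idea behind DTW concrete, the following is a minimal sketch of the classic DTW recurrence for two numeric sequences. It is an illustration under assumptions, not the chapter's own formulation: the function name `dtw_distance` is hypothetical, and the absolute difference is assumed as the local cost.

```python
def dtw_distance(a, b):
    """Dynamic time warping cost between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # assumed local cost
            # Each subproblem extends the best of three smaller alignments.
            cost[i][j] = d + min(
                cost[i - 1][j],      # a[i-1] also matches the previous b element's partner
                cost[i][j - 1],      # b[j-1] also matches the previous a element's partner
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]
```

For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero, because the repeated 2 in the second sequence can be aligned with the single 2 in the first.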

13 - The Normal Distribution
from Part V - Advanced Topics
Jianxin Wu, Nanjing University, China
Book: Essentials of Pattern Recognition
Published online: 08 December 2020
Print publication: 19 November 2020, pp 293-315
- Chapter

Summary

The normal distribution is the most widely used continuous distribution, but many of its relevant properties are a little bit advanced for an undergraduate course. Hence, Part V introduces some of these advanced topics. This chapter devotes itself to properties of normal distributions: single- and multivariate normal distributions, moment and canonical parameterizations, sum and product, geometry and the Mahalanobis distance, and conditional distributions. We also show that with these properties, some algorithms will become much easier to understand. We use parameter estimation and the Kalman filter as two such examples.

15 - Convolutional Neural Networks
from Part V - Advanced Topics
Jianxin Wu, Nanjing University, China
Book: Essentials of Pattern Recognition
Published online: 08 December 2020
Print publication: 19 November 2020, pp 333-364
- Chapter

Summary

We cannot miss deep learning in a modern pattern recognition textbook, and we introduce CNN (convolutional neural networks) in this chapter. Although the mathematical derivation of CNN, especially the back-propagation process and gradient computation, is complex, we use a lot of useful tools to help readers understand what exactly is going on in a CNN. Hence, this chapter focuses on accessibility rather than completeness. In its exercise problems, we introduce more relevant topics and methods.

6 - Fisher’s Linear Discriminant
from Part II - Domain-Independent Feature Extraction
Jianxin Wu, Nanjing University, China
Book: Essentials of Pattern Recognition
Published online: 08 December 2020
Print publication: 19 November 2020, pp 123-140
- Chapter

Summary

Unlike PCA, which is unsupervised, Fisher's linear discriminant (FLD) uses the labels associated with data points, and so may yield better linear features and accuracy than PCA. We start by illustrating this motivation, and practice the problem-solving framework by gradually developing the correct mathematical formulation of the relatively simple idea behind FLD. We discuss various practical issues: the solution for the binary case, the scenario where this solution breaks down, and how to generalize from tasks with only two categories to many categories.

Transmission dynamics of COVID-19 among index case family clusters in Beijing, China
Ying Cao, Yueh Wang, Aritra Das, Calvin Q. Pan, Wen Xie
Journal: Epidemiology & Infection / Volume 149 / 2021
Published online by Cambridge University Press: 19 November 2020, e74
- Article
  - Open access

The outbreak of coronavirus disease-2019 (COVID-19) impacts public health dramatically around the world. The demographic characteristics, exposure history, dates of illness onset and dates of confirmed diagnosis were collected from the data of 24 family clusters from Beijing. The characteristics of the cases and the estimated key epidemiologic time-to-event distributions were described. The basic reproductive number (R0) was calculated. Among 89 confirmed COVID-19 patients from 24 family clusters, the median age was 38.0 years and 43.8% were male. The median incubation period was 5.08 days (95% confidence interval (CI) 4.17–6.21). The median serial interval was 6.00 days (95% CI 5.00–7.00). The basic reproductive number (R0) was 2.06 (95% CI 2.02–2.08). The median onset-to-care-seeking days and the median onset-to-hospital admission days were significantly reduced after 23 January 2020, which implied enhanced public health awareness among families. With epidemic containment measures in place, the results can inform health authorities about the possible extent of epidemic transmission within families. Furthermore, following initiation of interventions, public health measures are not only important for curbing the epidemic spread at the community level but also improve health-seeking behaviour at the individual level.

List of Tables
Jianxin Wu, Nanjing University, China
Book: Essentials of Pattern Recognition
Published online: 08 December 2020
Print publication: 19 November 2020, pp xi-xii
- Chapter

8 - Probabilistic Methods
from Part III - Classifiers and Tools
Jianxin Wu, Nanjing University, China
Book: Essentials of Pattern Recognition
Published online: 08 December 2020
Print publication: 19 November 2020, pp 173-195
- Chapter

Summary

This chapter is a succinct introduction to basic probabilistic methods for pattern recognition and machine learning. One focus is to clearly present the exact meanings of different terms, including the taxonomy of different probabilistic methods. We present a basic introduction to maximum likelihood and maximum a posteriori estimation, and a very brief example to showcase the concept of Bayesian estimation. For the nonparametric world, we start from the drawbacks of parametric methods, gradually analyze the properties preferred for a nonparametric one, and finally arrive at kernel density estimation, a typical nonparametric method.

Part IV - Handling Diverse Data Formats
Jianxin Wu, Nanjing University, China
Book: Essentials of Pattern Recognition
Published online: 08 December 2020
Print publication: 19 November 2020, pp 243-244
- Chapter

Notation
Jianxin Wu, Nanjing University, China
Book: Essentials of Pattern Recognition
Published online: 08 December 2020
Print publication: 19 November 2020, pp xvi-xvi
- Chapter

52383 results in Statistics and Probability
