
Index

Published online by Cambridge University Press: 14 September 2018

Vaclav Brezina
Affiliation:
Lancaster University

AIC (Akaike information criterion), 123, 124, 125
ANOVA, 13, 170, 192, 194, 195, 197, 198, 276, 277
  multi-way, 197
  one-way, 169, 170, 177, 192, 194, 262, 277
association measure, 67, 69, 70, 71, 72, 76, 79, 84, 276, 277
assumption
  of a test, 13
bias
  types of, 16, 17, 262, 265, 269
bootstrap test, 231, 232, 233, 235, 249
bootstrapping, 195, 231, 232, 233
case, 6
central tendency
  measures of, 10, 13
chi-squared
  test, 13, 113, 114, 117, 121, 126, 131, 202, 203, 204, 230, 264
cluster analysis, 153, 154, 159, 173, 174, 236, 237
  hierarchical agglomerative, 154, 159, 236
  variability-based neighbour, 236
coefficient of determination (r²), 147, 276
coefficient of variation, 50, 51, 52, 237
Cohen’s d, 72, 86, 190, 191, 195, 263, 272, 273, 276, 277, 278
Cohen’s κ, 90, 91
collinearity, 120, 121
collocation, 66, 67, 68, 69, 70, 71, 74, 75, 76, 77, 79, 94, 108, 245, 277
  collocation parameters notation (CPN), 74, 75, 77, 79
  graph, 75
  network, 76, 77, 79, 94
  span, 67
  window, 69, 70
collocation window, see collocation: span
complete separation, 120, 124
concordance index (C-index), 125, 126, 129, 133
confidence interval, 13, 20, 24, 30, 31, 116, 117, 128, 129, 144, 146, 150, 191, 199, 226, 233, 235, 242, 243, 246, 268, 272, 273
contingency table, 69, 70, 84, 108, 113, 114, 204
corpus, 6
  of interest, 80
  population-based, 18
  reference, 80, 81
corpus linguistics, 2
correlation, 121, 141, 142, 144, 146, 147, 148, 150, 151, 164, 165, 172, 191, 196, 197, 236, 276
  coefficient, 144, 145, 147, 150, 197, 237, 276
  interclass, 91
  matrix, 148, 150, 165, 172
  negative, 141, 144, 149, 151, 164, 172, 176
  Pearson’s, 142
  positive, 141, 142, 144, 151, 164
  rank biserial, 196
  Spearman’s correlation, 146
correspondence analysis, 200, 202, 204, 205, 206, 214
covariance, 142, 144, 146
Cramer’s V, 114
cross-tabulation, 108, 109, 110, 111, 112, 117, 133, 200, 202, 204, 206
data
  sparseness, 19
data point, see case
dataset, 6
degrees of freedom (df), 50, 114, 115, 117, 187, 188, 189, 190, 193, 194, 197
Delta P, 70, 72, 74
Dice, 70, 72, 76
directionality, 70, 74
dispersion, 10, 11, 13, 46, 47, 48, 50, 51, 53, 54, 61, 70, 74, 85
distance
  chi-squared, 200, 203, 204, 206
  Euclidean, 153, 154, 203, 204
  Manhattan, 153, 159, 173
DP (deviation of proportions), 52
effect
  fixed, 209, 211
  main, 125
  random, 209, 211
effect size, 14, 20, 30, 32, 91, 114, 115, 116, 117, 125, 128, 144, 145, 150, 190, 191, 195, 196, 197, 232, 233, 235, 262, 270, 272, 273, 274, 275, 276, 277, 278, 279
eigenvalue, 166
envelope of variation, 106, 185
eta squared (η²), 195, 276
factor analysis, 164, 165, 170, 172, 174, 176, 200, 205, 239
Fisher exact test, 113
Fleiss’s κ, 91
frequency
  absolute, 22, 42, 43, 44, 48, 54, 55, 57
  average reduced (ARF), 54, 55, 57, 61
  distribution, 8, 13, 25, 60
  expected, 70, 71, 84, 113, 114
  marginal, 109
  mean, 43, 231
  observed, 67, 68, 69, 70, 84, 113, 142
graph
  bar chart, 23, 103, 229
  boxplot, 23, 30, 228
  candlestick plot, 226, 228, 248
  correspondence plot, 200, 202, 205, 206
  dendrogram, 154, 158, 159, 237, 239, 241, 250
  error bars, 14, 24, 197, 226, 228
  forest plot, 273, 275
  geomapping, 27
  histogram, 8, 25
  line graph, 224, 243, 248
  mosaic plot, 109
  scatterplot matrix, 25, 26
  scree plot, 166, 167, 174, 239, 241
  sparkline, 226, 228, 279
  stacked bar chart, 103, 259
Gwet’s AC1, 90, 91
homoscedasticity, 13, 189, 192
independence of observations, 112, 120, 121, 189, 192
information/ink ratio, 23
intercept, 118, 120, 124, 125, 126, 127
inter-rater agreement, 87, 88, 90, 91, 92, 95, 129, 208, 245, 246, 247, 274
keywords, 79, 80, 81, 82, 83, 85, 86, 87, 93, 265, 276
  negative, 80, 81, 82, 83
  positive, 80, 81, 82, 83, 84, 87, 93
Kruskal–Wallis test, 195, 196, 197, 199
lemma, 40, 41, 60, 61
lexeme, 40, 41, 60
lexico-grammatical frame, 102, 106, 107, 119, 129, 130
line of the best fit, see regression line
linearity, 13, 120, 121
lockwords, 80, 81, 83, 220, 265
log Dice, 70, 72, 76, 276
log likelihood, 72, 84, 86, 113, 123, 124, 126
log odds, 121, 124, 125, 126, 127, 277
log odds ratio, 127, 277
log ratio, 72, 74, 85, 86
logistic regression, 106, 117, 118, 119, 120, 121, 122, 124, 125, 126, 127, 128, 129, 132, 134, 209, 210, 211, 276
Mann–Whitney U test, 13, 195, 196, 197, 231, 276
mean, 3, 4, 10, 11, 12, 13, 19, 24
  20% trimmed, 10
  grand, 192, 193
meaning fluctuation analysis, 245
median, 10, 11, 23, 199
meta-analysis, 14, 267, 268, 269, 270, 271, 272, 273, 274, 277
MI, 72
MI2, 70, 72
MI3, 72
minimum sensitivity, 72
mixed-effects models, 121, 208, 209, 211
model
  baseline (or null), 123
  parsimonious, 123
Monte Carlo, 166
MU, 72
multidimensional analysis, 170, 262
NHST (null-hypothesis significance testing), 12, 13, 20
node, 67, 68, 69, 70, 71, 75, 76, 94, 208, 245, 246
non-parametric test, 13, 196, 199, 231
normal distribution, 8, 13, 189, 195
normalized frequency, see relative frequency
normality, 13, 191, 192
null hypothesis, 12, 13, 20, 30, 84, 91, 111, 113, 114, 147, 202, 220
odds, 14, 116, 121, 125, 126, 127, 128, 129, 133, 276, 277
odds ratio, 14, 116, 127, 128, 129, 276
outlier, 9, 10, 23, 154
parallel analysis, 166
parametric test, 13
peaks and troughs, 242, 243, 245, 246, 247
percentage increase/decrease, 230
phi, 114
population, 6, 12, 13, 14, 15, 16, 18, 19, 20, 24, 49, 52, 82, 116, 128, 144, 145, 147, 150, 188, 189, 191, 192, 195, 199, 221, 231, 232, 259, 272
post-hoc test, 194, 195, 197, 199
predictor, 105, 109, 111, 117, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 132, 177, 208, 276
  interactions, 125
probability of superiority (PS), 197, 278
probability ratio, 115, 116
pseudo-R², 126
p-value, 12, 13
r (effect size), 14, 191, 192, 277
range₁, 11, 48
  interquartile, 11, 23
range₂, 11, 48
raw agreement, 89, 90
raw frequency, see frequency: absolute
regression line, 4, 26, 141
relative frequency, 43
repeated measures test, 189, 197
replication, 267
representativeness, 17, 18, 19, 30, 82, 221, 222, 223, 224, 234, 259
research design
  individual-text/speaker, 21, 22
  linguistic feature, 22, 105, 106, 108, 119, 270
  whole corpus, 21, 104, 272
rogue value, 9
sample, 3, 6
sampling
  random, 15, 16
sampling frame, 16, 17, 18, 43, 87, 221, 259
science, 2
simple maths parameter (SMP), 85, 86, 276
standard deviation, 11, 48, 49, 50, 142, 144, 152, 153, 168, 190, 191, 195, 237, 241, 276
  population, 52
  sample, 49, 50, 52, 187
standard error, 124, 128
statistical significance, 12, 20
statistics, 3
  descriptive, 4, 14, 18, 50, 221, 259
  inferential, 12, 13, 19, 20, 117
token, 39, 41, 54, 57, 58, 68, 270
t-score, 72, 73
t-test, 12, 13, 30, 31, 187, 189, 190, 191, 192, 194, 195, 197, 213, 231, 277
type, 39, 40, 41, 60
type/token ratio (TTR), 57, 58, 59, 163, 171, 175, 176
  moving average, 58
  standardized, 58
VARBRUL, 210, 211
variable, 6, 8, 10, 11, 24, 105, 106, 107, 108, 109, 112, 113, 115, 117, 119, 122, 123, 125, 129, 132, 141, 142, 147, 148, 152, 164, 165, 166, 168, 172
  ambient, 185
  categorical, 108, 118, 119, 121, 122, 154, 208
  dummy, 127
  explanatory, 6, 105, 108, 111, 117, 119, 120, 192, 197, 209, 269, 270
  judgement, 87, 89, 91, 92, 120, 208
  level of, 119
  lexico-grammatical, 106, 107, 117, 120
  linguistic, 6, 8, 13, 14, 19, 22, 25, 30, 38, 103, 105, 107, 108, 109, 110, 113, 118, 119, 123, 125, 139, 141, 142, 148, 151, 153, 158, 161, 163, 164, 166, 170, 171, 172, 174, 185, 187, 189, 190, 195, 196, 214, 224, 228, 229, 230, 232, 236, 237, 242, 262, 274, 275, 276, 278
  linguistic ambient, 107
  moderator, 274
  nominal, 7, 90, 105, 276
  ordinal, 7, 108, 142, 147
  outcome, 105, 107, 118, 119, 121, 122, 128, 129, 132, 208
  predictor, 118, 119, 120, 121, 123, 127, 132, 208
  scale, 7, 8, 119, 122, 141, 142, 147
  sociolinguistic, 185, 187, 207
  sociolinguistic Labovian, 185
variable context, 106, 207
variance, 13, 187, 188, 189, 190, 192, 193, 194, 272, 273, 275
variation
  functional, 104, 161, 170, 185
  percentage of, 51, 206
  register, 104, 177
Wald’s z, 128
whelk problem, 46
Zipf’s law, 44, 46
z-score₁, 72
z-score₂, 152, 153, 154

  • Index
  • Vaclav Brezina, Lancaster University
  • Book: Statistics in Corpus Linguistics
  • Online publication: 14 September 2018
  • Chapter DOI: https://doi.org/10.1017/9781316410899.011