This chapter covers what have traditionally been called grammaticality judgments in linguistics (which are more aptly referred to as acceptability judgments – see below). We examine such judgments from several angles, with the goal of assisting researchers in deciding whether and how to use this kind of data. Our goal in this chapter is to provide an introduction to the major themes that arise when using acceptability judgments as a data source for the construction of linguistic theories. Importantly, this chapter will not be a step-by-step guide for constructing a particular experiment, as the curious reader can find several fine introductions to the mechanics of experiment construction and analysis elsewhere (e.g., Chapters 7 and 8, and Cowart 1997). Instead, we intend this chapter to be an introduction to the theory underlying the methodology of acceptability judgment collection. Most of what follows will involve discussion of syntactic well-formedness judgments, because that is where the greatest amount of research about judgment data has been focused, but we believe that many of our remarks are also relevant for judgments at other levels of linguistic representation. Specific considerations regarding other sorts of judgments can be found elsewhere in this volume. For example, judgments about the lexicon and phonotactic well-formedness are generally gathered in the language documentation process (see Chapter 4); judgments about morphological processes might be gathered using the experimental methods that predominate in psycholinguistics (see Chapter 8); judgments about sociolinguistic variables might be gathered via a survey (see Chapter 6). For considerations specific to semantic judgments, see Matthewson (2004) and Chemla and Spector (2011).
So you want to investigate the language used by a group of people. One of the first questions you might ask yourself is: Who do I collect these data from? A crucial element of empirical linguistic work is to choose not only what type of data to collect (e.g., naturally occurring data, interview data, questionnaire data, experimental data; see Part I of this volume), but also which people to target for data collection. The most reliable method for finding out about the language use of a particular group of people would be to collect linguistic information from every single person in the population, the term used in the social sciences for all members of the community. Obviously, except for very small populations, this method is rather impractical, expensive, and time-consuming. Hence, most researchers only target “some people in the group in such a way that their responses and characteristics reflect those of the group from which they are drawn . . . This is the principle of sampling” (De Vaus 2001: 60). The subgroup of people that reflects the population as a whole (in terms of their social and linguistic characteristics), and therefore lends itself to generalizations above and beyond the scope of the study, is called a representative sample. The question we need to ask as linguists is: To what extent are the findings reported on the basis of a subsample representative of the linguistic habits of a certain population or group?
Digital filtering is a very commonly used seismic data processing technique, and it has many forms for different applications. This chapter begins by describing three ways to express digital filtering: the rational form, recursive formula, and block diagram. The names of the filters usually come from their effects on the frequency spectrum. In the rational form of a filter, the zeros are the roots of the numerator, and the poles are the roots of the denominator. Using the zeros and poles we can make the pole–zero representation on the complex z-plane as a convenient way to quantify the effect of a digital filter as a function of frequency. The rule of thumb is: poles add, zeros remove, and the magnitude of the effect of the pole or zero depends on their distance from the unit circle. Different types of filtering in seismic data processing are discussed in the chapter using several examples. In particular, f–k filtering is discussed in detail with its typical processing flow. Owing to the widespread application of inverse problems in geophysics, much attention is given to inverse filtering, which requires that the corresponding filter be invertible. It can be proven that a minimum-phase filter is always invertible because all of its zeros and poles are outside the unit circle on the complex z-plane. Minimum-phase filters therefore occupy an important position in seismic data processing. In general, a minimum-phase wavelet is preferred in seismic data processing because of stability concerns, while a zero-phase wavelet is preferred in seismic interpretation to maximize the seismic resolution. The final section of the chapter prepares the reader with the physical and mathematical background materials for inverse filtering. These materials are fundamental to the understanding of deconvolution, an application of inverse filtering, in the next chapter.
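The invertibility claim can be sketched numerically. The two-point filter below is an invented example, not one from the chapter; under the convention stated above, F(z) = 1 − 0.5z has its single zero at z = 2, outside the unit circle, so its inverse 1/F(z) = Σ (0.5z)^k converges and can be truncated:

```python
# Hypothetical minimum-phase filter F(z) = 1 - 0.5 z, with zero at z = 2
# (outside the unit circle), so a truncated series inverse is stable.
filt = [1.0, -0.5]
inv = [0.5 ** k for k in range(20)]   # truncated coefficients of 1/F(z)

# Convolving the filter with its truncated inverse should recover ~a unit spike.
out = [0.0] * (len(filt) + len(inv) - 1)
for i, a in enumerate(filt):
    for j, b in enumerate(inv):
        out[i + j] += a * b

print([round(v, 6) for v in out[:3]])  # → [1.0, 0.0, 0.0]
```

The residual error lives entirely in the last sample and shrinks geometrically as the truncation length grows, which is what stable invertibility means in practice.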
Functionality of a digital filter
A digital filter is represented by a sequence of numbers called weighting coefficients, which can be expressed as a time series or denoted by the z-transform. When the filter acts on an input digital signal, which can itself be expressed as a time series, filtering takes the form of a convolution of the filter with the input signal (Figure 5.1).
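The convolution just described can be sketched in a few lines; the filter weights and input series below are invented for illustration:

```python
def convolve(filt, signal):
    """Discrete convolution: out[m] = sum over k of filt[k] * signal[m - k]."""
    out = [0.0] * (len(filt) + len(signal) - 1)
    for k, w in enumerate(filt):
        for n, s in enumerate(signal):
            out[k + n] += w * s
    return out

filt = [1.0, -0.5]        # two-point filter (weighting coefficients)
spike = [0.0, 1.0, 0.0]   # unit spike delayed by one sample

# Output is the filter's impulse response, delayed by one sample.
print(convolve(filt, spike))  # → [0.0, 1.0, -0.5, 0.0]
```

Feeding the filter a delayed unit spike simply reproduces the weighting coefficients at the delayed position, which is the defining property of convolution.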
When we have a small amount of data, we can avoid statistics completely. In such cases, we can inspect and discuss each and every observation or data point. For example, if we measured the fundamental frequencies (F0) of three siblings’ speech, we might observe that Betty’s voice was 25 Hz lower than Sue’s, but 100 Hz higher than Frank’s. It would probably be uninteresting to report a statistic like the average pitch of the family. With a larger dataset, like F0 measurements taken from 1,000 men and 1,000 women, the situation is reversed. It is no longer possible to discuss each data point individually, and while it can still be useful to make graphs that display every observation, we will usually be less interested in individual points and more interested in the patterns or trends formed by groups of points.
This is where descriptive statistics come in. Descriptive statistics generally constitute the second step in a quantitative analysis. The first step is to display the data in a tabular or graphical format, using a histogram, bar chart, scatterplot, cross-tabulation, or other method. This will reveal any peculiarities of the data that will shape further analysis. For example, a severely skewed dataset may motivate a transformation, or the use of non-parametric statistics. The second step is the descriptive statistics themselves, which distill the complexities of the data down to a small, manageable set of numbers, abstracting away from details (and noise) in order to describe the basic overall properties of the data. This process can suggest the answers to existing questions or inspire new hypotheses to be tested.
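The second step above can be sketched with Python's standard library; the F0 values below are invented for illustration, not data from the chapter:

```python
import statistics

# Hypothetical F0 measurements (Hz) pooled from several speakers.
f0 = [118, 121, 125, 210, 215, 222, 98, 104, 230]

# Distill the data down to a small set of descriptive numbers.
print("mean:  ", round(statistics.mean(f0), 1))
print("median:", statistics.median(f0))
print("stdev: ", round(statistics.stdev(f0), 1))
```

Even this toy case shows why the first (graphical) step matters: the mean sits between two clusters of values that no individual speaker produces, a peculiarity a histogram would reveal immediately.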
Velocity analysis is synonymous with velocity model building (VMB) because the goal is to produce a velocity model for the subsurface. VMB is among the most common practices in seismology for two reasons. First, for any study area, its seismic velocity model is one of the main measurable results from geoscience. Second, a velocity model is a precondition for seismic migration and other seismic imaging methods to map subsurface reflectors and scatterers using reflected or scattered waves. Since seismic velocity is inferred from traveltimes of seismic waves, the resolution of each velocity model is limited by the frequency bandwidth and spatial coverage of seismic data. This chapter starts with definitions and measurements of different types of seismic velocities. The observations reveal the trends of seismic velocities as functions of pressure, temperature, and other physical parameters. The dominance of the 1D or V(z) velocity variation at large scale in the Earth leads to the classic refraction velocity analysis based on seismic ray tracing in a layer-cake velocity model.
The three common seismic velocity analysis methods based on NMO semblance, seismic migration, and seismic tomography are discussed in three sections. Assuming gentle to no changes in the reflector dip and lateral velocity variation, semblance velocity analysis provides stable estimates of the stacking velocity. The stacking velocity at each depth, marked by the corresponding two-way traveltime, is the average velocity of all layers above the depth. By taking advantage of the dependency of depth migration on velocity variations, migration velocity analysis enables the velocity model to be refined using horizons defined by data to accommodate lateral velocity variations. Nowadays most migration velocity analyses are conducted on common image gathers (CIGs). To constrain the velocity variation using all data together, tomographic velocity analysis derives or refines the velocity model through an inversion approach to update the velocity perturbations iteratively. Some practical issues in tomographic VMB are discussed in the final section, mostly on inversion artifacts and deformable layer tomography.
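The "average velocity of all layers above the depth" is conventionally a root-mean-square (RMS) average weighted by traveltime; the sketch below assumes that reading, and the layer model is invented for illustration:

```python
import math

# Hypothetical layer-cake model: interval velocities (m/s) and the
# two-way traveltime (s) spent in each layer.
v_int = [1500.0, 2000.0, 2500.0]
dt    = [0.4, 0.3, 0.3]

def v_rms(v, t, n):
    """Traveltime-weighted RMS velocity down to the base of layer n (0-indexed)."""
    num = sum(v[i] ** 2 * t[i] for i in range(n + 1))
    den = sum(t[i] for i in range(n + 1))
    return math.sqrt(num / den)

for n in range(len(v_int)):
    print(f"base of layer {n}: Vrms = {v_rms(v_int, dt, n):.0f} m/s")
```

Each deeper reflector, marked by its cumulative two-way traveltime, picks up an average over everything above it, which is why stacking velocities vary smoothly even across sharp interval-velocity contrasts.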
As linguists, how do we capture the passage of time in our empirical research? Long relegated to the periphery of core linguistics due to the legacy of de Saussure (1984 [1916]), the relationship between language and time has traditionally been associated with the research domain of historical linguistics. However, a number of domains of linguistic research have challenged the Saussurean dichotomy between diachrony and synchrony, and developed methodological approaches to take into account the relationship between language and time.
Various approaches have been adopted to analyze the passage of time and its effect on linguistic structure and processes. Linguists have assessed stability or instability in language through the observation of speech events, the linguistic behavior of individuals over their life span, successive generations of a given speech community, the history of a language over a longer span, or, at the broadest level, the evolution of language. Depending on their research questions, linguists have focused on the individual, the community, a specific language or dialect, or the language faculty as a whole. The various fields of linguistics have problematized the time dimension differently. First language (L1) acquisition studies examine the question of time alongside cognitive developmental stages in early childhood. While the focus of second language acquisition (SLA) is also on the individual, this field observes the development of interlanguage stages within a time span that can encompass a longer portion of the individual’s life span. Historical linguistics apprehends the time dimension through a much larger time scale, trying to understand, for example, how modern Romance languages, such as French, Spanish, Catalan, and Portuguese, emerged over centuries from spoken Latin, or how a specific linguistic phenomenon has evolved or grammaticalized over time. Sociolinguistics fits somewhere in between on this continuum, with one of its central research questions relating to linguistic change in progress at the community level, but also with an interest in the development of sociolinguistic competence at the individual level. These various objects of inquiry have directly impacted research design and methodological choices within the discipline.
Over the last few decades, corpus-linguistic methods have established themselves as among the most powerful and versatile tools to study language acquisition, processing, variation, and change. This development has been driven in particular by the following considerations:
a. technological progress (e.g., processor speeds as well as hard drive and RAM sizes);
b. methodological progress (e.g., the development of software tools, programming languages, and statistical methods);
c. a growing desire by many linguists for (more) objective, quantifiable, and replicable findings as an alternative to, or at least as an addition to, intuitive acceptability judgments (see Chapter 3);
d. theoretical developments such as the growing interest in cognitively and psycholinguistically motivated approaches to language in which frequency of (co-)occurrence plays an important role for language acquisition, processing, use, and change.
In this chapter, we will discuss a necessarily small selection of issues regarding (i) the creation, or compilation, of new corpora and (ii) the use of corpora once they have been compiled. Although this chapter encompasses both the creation and use of corpora, there is no expectation that any individual researcher would be engaged in both these kinds of activities. Different skills are called for when it comes to creating and using corpora, a point noted by Sinclair (2005: 1), who draws attention to the potential pitfalls of a corpus analyst building a corpus, specifically, the danger that the corpus will be constructed in a way that can only serve to confirm the analyst’s pre-existing expectations. Some of the issues addressed in this chapter are also dealt with in Wynne (2005), McEnery, Xiao, and Tono (2006), and McEnery and Hardie (2012) in a fairly succinct way, and more thoroughly in Lüdeling and Kytö (2008a, 2008b) and Beal, Corrigan, and Moisl (2007a, 2007b).
Acoustic analysis, once a method used primarily within the domain of phonetics, has become an increasingly necessary skill across the field of linguistics. To name just a few examples, phonologists sometimes appeal to acoustic data to substantiate theoretical arguments, sociolinguists tend to characterize vowel shifts and mergers in terms of their acoustic properties, and psycholinguists frequently draw on acoustic analysis techniques to construct stimuli for experiments.
The analysis of acoustic signals is mainly performed with the help of generally available software. Because of its capability for creating publication-quality graphics, Praat (Boersma and Weenink 1992–2012) was used to make the pictures in this chapter; it is a general set of tools for analyzing, synthesizing, and manipulating speech and other sounds, bundled into a single integrated computer program. Praat is available free of charge for all current major computer platforms (nowadays MacOS, Windows, Linux) and is continually updated to accommodate new operating system developments and new analysis methods.
Graphical software allows us to perform acoustic analysis by inspecting visualized speech. The types of visualization addressed in the present chapter are the waveform, the pitch curve, the intensity curve, the spectrum, the spectrogram, and formant tracks. These types of visualization will be seen to help in measuring the following articulatory, acoustic, and auditory quantities: glottal period, resonance frequencies, pitch, duration, intensity, noisiness, and place of articulation. Examples of practical uses for each of these measures will appear throughout the chapter.
Discrete spectral analysis is a suite of classic data processing tools aiming to quantify the energy distribution of seismic data over temporal or spatial scales. This chapter starts with the law of decomposition and superposition, which is the foundation of many seismic processing methods. According to Fourier theory, a seismic trace can be expressed as a linear superposition of harmonic functions of different frequencies with appropriate amplitudes and phases, thus enabling spectral analysis. Because seismic data are in digital form with limited time durations, classic spectral analysis is achieved using the discrete Fourier transform (DFT). Readers should pay special attention to the characteristics of the DFT, as these often differ from the continuous Fourier transform. The fast Fourier transform (FFT) is described as an example of how to improve computational efficiency in processing.
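The decomposition idea can be sketched directly: a signal built as a superposition of two harmonics is recovered peak-by-peak by the DFT. The frequencies and amplitudes below are invented for illustration, and the direct O(N²) transform stands in for the FFT, which produces the same result faster:

```python
import cmath
import math

def dft(x):
    """Direct discrete Fourier transform, O(N^2); the FFT computes the same thing."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

N = 64
# Superposition of a k=3 harmonic (amplitude 1.0) and a k=10 harmonic (amplitude 0.5).
x = [math.cos(2 * math.pi * 3 * n / N) + 0.5 * math.cos(2 * math.pi * 10 * n / N)
     for n in range(N)]

X = dft(x)
amp = [abs(Xk) / (N / 2) for Xk in X]   # single-sided amplitude spectrum

print(round(amp[3], 3), round(amp[10], 3))  # → 1.0 0.5
```

The amplitude spectrum shows a spike of 1.0 at bin 3 and 0.5 at bin 10 and is essentially zero elsewhere: the transform has separated the superposition back into its harmonic components.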
Discrete spectral analysis is discussed using several examples. Useful processing tricks in one type of processing are often borrowed to solve problems in another type of processing. To decompose seismic traces, for instance, we may use wavelets of fixed shape but varying amplitudes and lengths rather than harmonic functions as the basic building blocks. This enables wavelet decomposition and seismic wavelet analysis, as another application of the law of decomposition and superposition. Yet another usage of the law is in interpolation of digital data, which has many applications in the processing of seismic and non-seismic data.
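One way the law applies to interpolation is frequency-domain zero-padding; the sketch below (an invented band-limited signal, with a direct DFT/IDFT pair standing in for the FFT) doubles the sampling rate, and assumes no energy at or above the original Nyquist bin:

```python
import cmath
import math

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

N = 8
x = [math.cos(2 * math.pi * n / N) for n in range(N)]   # one cycle, coarsely sampled

# Insert zeros in the middle of the spectrum (between the positive- and
# negative-frequency halves), transform back, and rescale by the length ratio.
X = dft(x)
M = 2 * N
Xpad = X[:N // 2] + [0j] * (M - N) + X[N // 2:]
y = [(M / N) * v.real for v in idft(Xpad)]

print(round(y[2], 6), round(x[1], 6))   # even samples reproduce the original
```

The even-indexed output samples match the original series exactly, while the odd-indexed ones are new samples of the same superposition of harmonics evaluated halfway between: interpolation as decomposition, zero-padding, and re-superposition.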
This chapter provides an introduction to some of the key methods commonly used in psycholinguistic research. We will focus on three main types of methods: reaction-time-based methods, visual-attention-based methods, and brain-based methods, and also briefly mention other kinds of approaches. As will become clear over the course of this chapter, each of these categories consists of multiple experimental paradigms, and choosing the “right” one often comes down to which method is most appropriate for a particular research question. It would be inaccurate to characterize one method as better than the others, since each has its own strengths and weaknesses. In what follows, we will consider each method in some depth, and comment on the ease of implementation and data analysis.
For the most part, the discussion in this chapter will focus on language comprehension, but some discussion of production methods is also included (see also Bock 1996 for an in-depth review of production methods). This asymmetry is a reflection of the greater body of prior work that exists on language comprehension. In the past, production has received considerably less attention, mostly due to methodological challenges.
The fact is, Phaedrus, that writing involves a similar disadvantage to painting. The productions of painting look like living beings, but if you ask them a question, they maintain a solemn silence.
Plato 1975: 96
This chapter is on how we can bring the evidence from textual data to bear on linguistic analysis, primarily in historical linguistics. Linguistic analysis comes in many varieties, however, and before we discuss the value of textual data, we should briefly consider what the object of study of linguistics is. Much depends here on theoretical perspective: to mention some examples, linguists concerned primarily with language use may be interested in spoken language use, variation in register, interactive modes, language use as a marker of social status, including prestige-driven norms, and so on. Linguists working from a formal perspective will be interested in speakers’ language competence – the internalized grammar that is assumed to be the core of a speaker’s knowledge of language.
In this chapter, we discuss the notion and practice of “making an argument” in generative linguistics, taking examples from phonology, morphology, and syntax. Argumentation is central in linguistics, yet there are few explicit and thorough accounts of what it is to make an argument (though see Soames and Perlmutter 1979; Aarts 2001; Green and Morgan 2001; Kertész and Rákosi 2012 for argumentation in syntax). Our goal here is not to present a general philosophical discussion of argumentation, but rather to present the concept as it is typically practiced in linguistics. Through this, we will provide an overview of how to construct a linguistic analysis and support it. The chapter is structured as follows: we first outline how an argument is typically formulated, in abstract terms, based on the notion of supporting a hypothesis more generally. We then discuss various case studies of arguing for hypotheses of different degrees of abstraction, ranging from empirical arguments to theoretical arguments. We conclude with some discussion of writing style in argumentation.
Making an argument in linguistics
Making an argument is a creative exercise: one develops and motivates a hypothesis that provides insight into a set of facts. There are a few linguists whose work has come to be associated with a strong emphasis on argumentation. David Perlmutter is one, and he writes in the introduction to Perlmutter 2010 (xx):
I have tried to emphasize four things in my work in linguistics: explicit arguments for one hypothesis over others, extending the range of languages and phenomena for which linguistic theory is to be held accountable, making explicit the ways languages differ and the ways they are alike, and explanation in linguistics. All four were already present in my 1968 doctoral dissertation (Perlmutter 1971), especially in the chapter arguing for surface structure constraints on the order of clitic pronouns in Spanish and French (Perlmutter 1970b).
Variation analysis takes as its object of study differences in linguistic form with no apparent change in meaning or function. While other methods of linguistic analysis try to eliminate variation by finding structural or semantic contexts that disambiguate the choice of linguistic form, variation analysis seeks to understand variation by assessing which dimensions of the linguistic and/or social context correlate with the occurrence of a particular variant form. Linguistic variation is analyzed within different subfields of linguistics, such as sociolinguistics, historical linguistics, corpus linguistics, first and second language acquisition, and phonetics, each of which addresses slightly different research questions. The primary focus of this chapter is the analysis of linguistic variation within sociolinguistics, though the methods discussed here apply in principle to the other subfields. I begin by defining the central construct of variation analysis, the linguistic variable and its identification at the levels of phonetics/phonology and grammar, before proceeding through the steps of variation analysis: circumscribing the variable context, formulating and testing hypotheses through coding tokens for different independent variables, statistical testing, and interpreting results. I include some comments on the relationship between variation analysis and linguistic theory.
Identifying linguistic variables
The analysis of variation begins by noting that two or more linguistic forms are “different ways of saying the same thing,” a phrase that will serve as a good provisional definition for the central construct of variation analysis, the linguistic variable. In this case, “the same thing” refers to a single underlying form (in phonology) or a single meaning or function (in morphosyntax), and the “different ways” refers to the variant forms (or variants). For example, some Spanish speakers sometimes produce a word like avión ‘airplane’ with a final alveolar [n] and sometimes with a final velar [ŋ]. Thus, in Spanish there is a linguistic variable (n) with two variants, [n] and [ŋ]. Similarly, when referring to the future, English speakers sometimes use a form of the present tense, as in (1a, b), sometimes a modal, as in (1c), and sometimes a periphrastic construction, as in (1d). These variant forms have an underlying discourse function in common, a reference to future time.
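The coding step this definition sets up can be sketched as a simple tally; the tokens below are invented for illustration, not data from the chapter:

```python
from collections import Counter

# Hypothetical coded tokens of the Spanish variable (n): (word, variant produced).
tokens = [("avión", "ŋ"), ("avión", "n"), ("bien", "n"),
          ("pan", "ŋ"), ("pan", "n"), ("jamón", "ŋ")]

# Tally how often each variant occurs across all tokens.
counts = Counter(variant for _, variant in tokens)
total = sum(counts.values())
for variant, c in sorted(counts.items()):
    print(f"[{variant}]: {c}/{total} = {c / total:.0%}")
```

In a real study each token would also be coded for independent variables (following phonological context, speaker, style) so that variant rates can be cross-tabulated and tested statistically, as the chapter goes on to describe.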
There are two exciting facets of language description: the fieldwork experience, which is necessary for data collection, and the process of discovery and analysis that leads to the description of the target language. In order for our record of language structures to be as accurate as possible, data collection is best conducted using rigorous methodology. The goal of language description is often not to capture just one speaker’s internal grammar but to represent prevalent patterns for a community of speakers. In that sense, grammatical description is “fake” in that no one speaker will instantiate all the structures described in the grammar; at the same time, however, the grammar is “real” because the facts described therein are accepted by most speakers as accurately representing their language. The main product of descriptive fieldwork, whether a grammar or a targeted description of particular parts of a grammar, must therefore include data from a variety of speakers, favoring the most frequent patterns and noting common variations based on social or contextual factors.
Speakers and fieldworkers
A typical fieldwork project requires the participation of several speakers, in part due to differing talents and interests. The primary consultants, the speakers who participate on a regular basis in a project, will be those who are excited by language study. Some speakers show an amazing amount of linguistic sophistication even without linguistic training; for example, even if a speaker is unable to explain word class membership using terms such as “verb” and “noun,” she might still identify the lexical category of a word by providing paradigms or synonyms (Dixon 1992). Some speakers show initiative by bringing their own analyses to field sessions or by asking community members for their opinions on constructions discussed with the field linguist. Some speakers may be good storytellers, others able to repeat things slowly and exactly to aid with transcription (a surprisingly difficult task; see Chapter 12). Some speakers may be ideal for recording conversations and narratives but may be too prescriptive to help with translation (they may be more interested in “correcting” data than commenting on it).