SAC is an acronym for the Seismic Analysis Code, a command-line tool for basic operations on time series data, especially seismic data. SAC includes a graphical interface for viewing and picking waveforms. It defines a standard type of file for storing and retrieving time series and also reads files written in other data formats (SEG-Y, MSEED, GCF) used during field collection of seismic data. SAC is self-contained and does not rely on network access for any of its capabilities, including documentation, which makes it useful for field data quality control.
SAC reads data formats (CSS and GSE formats) used by nuclear test monitoring agencies. It also contains programming language constructs that provide basic methods for developing elaborate, multi-step analysis methodologies. Collectively, these features make SAC a useful interactive platform upon which customized analytical methods may be built and prototypical procedures may be developed.
SAC is widely known. The IRIS Data Management Center (DMC), one of the largest whole-Earth seismological data repositories in existence, allows data to be requested in SAC form. The instrument response information provided by the DMC's SEED reading program, rdseed, is usable by SAC in pole-zero or evalresp form. Owing to SAC's longevity, a rather large and well-debugged software tool ecosystem has evolved around its file format. One such tool, jweed, searches for and retrieves data held by the DMC. SAC data and SAC-compatible instrument response information are among its output options.
SAC has facilities for handling a basic type of three-dimensional (3D) data: function values evaluated on a regular grid of (X,Y) positions. This type of data can be viewed as evenly sampled in space and fits naturally into SAC file types. The file consists of the string of samples at each grid point, along with header information that flags the file as 3D and gives the X and Y dimensions of the grid. SAC calls data of this type XYZ.
The simplest example of 3D data is map elevation data. Imagine a section of land whose elevation is specified on a grid every 100 m eastwards and 100 m northwards. Over a plot of land 1 km by 1 km, the elevation can be described by a string of 121 values, with the first 11 values along the westernmost traverse and the final 11 along the easternmost traverse.
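To make the layout concrete, here is a minimal Python sketch of that elevation example, assuming row-major storage (one traverse after another); the names and synthetic values are illustrative and not part of SAC's actual interface:

```python
# An 11 x 11 grid of elevations stored as a flat run of 121 samples,
# one traverse after another, mirroring the XYZ layout described above.
# The data values here are synthetic.
nx, ny = 11, 11                    # 100 m spacing over 1 km x 1 km

# Traverse 0 (westernmost) first, traverse 10 (easternmost) last.
samples = [float(10 * ix + iy) for ix in range(nx) for iy in range(ny)]

def elevation(ix, iy):
    """Sample at the iy-th northward point of the ix-th eastward traverse."""
    return samples[ix * ny + iy]

assert len(samples) == 121
print(elevation(0, 0), elevation(10, 10))   # first and last grid samples
```

In an actual XYZ file, the header records the two grid dimensions, so a reader can recover exactly this indexing from the flat string of samples.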
Another type of data that can be represented in 3D is a spectrum as a function of time. If equal lengths of time are cut from a seismogram, the number of points in their FFTs will be the same. If the interval between spectral samples is constant, these values will also form a grid.
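As a hedged illustration of why equal-length windows yield such a grid, the following NumPy sketch (synthetic input, non-overlapping windows assumed) stacks the spectra of successive windows into a regular time-frequency array:

```python
import numpy as np

# Equal-length windows cut from a seismogram all yield FFTs with the
# same number of points, so their spectra stack into a regular grid:
# time along one axis, frequency along the other. Synthetic data here.
seismogram = np.random.randn(10_000)

win = 256                          # samples per window
hop = 256                          # non-overlapping windows
windows = [seismogram[i:i + win]
           for i in range(0, len(seismogram) - win + 1, hop)]

spectra = np.abs(np.fft.rfft(windows, axis=1))
print(spectra.shape)               # (n_windows, win//2 + 1): a regular grid
```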
A final type of inherently 3D data is a misfit function evaluated over a two-dimensional (2D) grid of points. The misfit minimum within the sampled parameter space yields the combination of parameters that best fits some observations, and the shape of the minimum provides a measure of the joint uncertainty of the parameter estimates.
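A minimal sketch of this use, with an invented quadratic misfit surface standing in for a real one: the grid minimum locates the best-fitting parameter pair.

```python
import numpy as np

# Synthetic misfit evaluated over a 2D grid of trial parameter values.
p1 = np.linspace(-1.0, 1.0, 51)              # trial values, parameter 1
p2 = np.linspace(-1.0, 1.0, 51)              # trial values, parameter 2
P1, P2 = np.meshgrid(p1, p2, indexing="ij")
misfit = (P1 - 0.28) ** 2 + 2.0 * (P2 + 0.2) ** 2

# The grid minimum gives the best-fitting combination; the shape of
# the surface around it reflects the joint parameter uncertainty.
i, j = np.unravel_index(np.argmin(misfit), misfit.shape)
print(f"best fit: p1 = {p1[i]:.2f}, p2 = {p2[j]:.2f}")
```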
The Arabic language is strongly structured and is considered one of the most highly inflected and derivational languages. Learning Arabic morphology is a basic step for language learners in developing language skills such as listening, speaking, reading, and writing. Arabic morphology is non-concatenative and allows a large number of affixes to attach to each root or stem, producing a combinatorial increase in possible inflected words. As such, Arabic lexical (morphological and phonological) rules may be confusing for second language learners. Our study indicates that research and development efforts on spelling and grammatical error checking do not provide adequate interpretations of second language learners' errors. In this paper we address issues related to error diagnosis and feedback for second language learners of Arabic verbs, and how these issues impact the development of a web-based intelligent language tutoring system. The major aim is to develop an Arabic intelligent language tutoring system that resolves these issues and helps second language learners improve their linguistic knowledge. Learners are encouraged to produce input freely in various situations and contexts, and are guided to recognize for themselves the erroneous functions of their misused expressions. Moreover, we propose a framework that allows for the individualization of the learning process and provides intelligent feedback that conforms to the learner's expertise for each class of error. Error diagnosis is not possible with current Arabic morphological analyzers, so constraint relaxation and edit distance techniques are employed to provide error-specific diagnosis and adaptive feedback to learners. We demonstrate the capabilities of these techniques in diagnosing errors related to Arabic weak verbs formed using complex morphological rules. As a proof of concept, we have implemented the components that diagnose learner errors and generate feedback, and have evaluated them against test data acquired from a real teaching environment. The experimental results were satisfactory, achieving a recall rate of 74.34 percent.
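The abstract names edit distance as one of the two diagnosis techniques; the sketch below shows the standard Levenshtein edit distance, with a romanized, purely illustrative verb example that is not drawn from the paper's data:

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[m][n]

# Rank candidate correct forms by closeness to the learner's input;
# the nearest form hints at which morphological rule was misapplied.
learner_input = "qawala"                  # romanized, illustrative only
candidates = ["qaala", "qul", "yaquulu"]
print(min(candidates, key=lambda c: edit_distance(learner_input, c)))
```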
We consider numbers and sizes of independent sets in graphs with minimum degree at least d. In particular, we investigate which of these graphs yield the maximum numbers of independent sets of different sizes, and which yield the largest random independent sets. We establish a strengthened form of a conjecture of Galvin concerning the first of these topics.
Paraphrase corpora are an essential but scarce resource in Natural Language Processing. In this paper, we present the Wikipedia-based Relational Paraphrase Acquisition (WRPA) method, which extracts relational paraphrases from Wikipedia, and the derived WRPA paraphrase corpus. The WRPA corpus currently covers person-related and authorship relations in English and Spanish, respectively, suggesting that, given adequate Wikipedia coverage, our method is independent of the language and the relation addressed. WRPA extracts entity pairs from structured information in Wikipedia by applying distant learning and, based on the distributional hypothesis, uses them as anchor points for candidate paraphrase extraction from the free text in the body of Wikipedia articles. Focussing on relational paraphrasing and taking advantage of Wikipedia's structured information allows for an automatic and consistent evaluation of the results. The characteristics of the WRPA corpus distinguish it from other types of corpora that rely on string similarity or transformation operations. WRPA relies on distributional similarity and is the result of the free use of language outside any reformulation framework. Validation results show a high precision for the corpus.
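A minimal sketch of the anchor-point step, with made-up sentences: a pair known to stand in the authorship relation (e.g., harvested from an infobox via distant learning) flags every sentence containing both entities as a candidate relational paraphrase.

```python
# Entity pairs and sentences here are illustrative stand-ins for
# Wikipedia infobox data and article text.
known_pairs = [("Cervantes", "Don Quixote")]   # authorship pair

sentences = [
    "Cervantes wrote Don Quixote in the early 17th century.",
    "Don Quixote, the masterpiece by Cervantes, appeared in 1605.",
    "Cervantes was born in Alcala de Henares.",
]

for e1, e2 in known_pairs:
    # Sentences mentioning both entities are candidate paraphrases
    # of the same relation.
    candidates = [s for s in sentences if e1 in s and e2 in s]
    print(candidates)
```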
A colouring of a graph G is called distinguishing if its stabilizer in Aut G is trivial. It has been conjectured that, if every automorphism of a locally finite graph moves infinitely many vertices, then there is a distinguishing 2-colouring. We study properties of random 2-colourings of locally finite graphs and show that the stabilizer of such a colouring is almost surely nowhere dense in Aut G and a null set with respect to the Haar measure on the automorphism group. We also investigate random 2-colourings in several classes of locally finite graphs where the existence of a distinguishing 2-colouring has already been established. It turns out that in all of these cases a random 2-colouring is almost surely distinguishing.
For $k$-graphs $F_0$ and $H$, an $F_0$-packing of $H$ is a family $\mathscr{F}$ of pairwise edge-disjoint copies of $F_0$ in $H$. Let $\nu_{F_0}(H)$ denote the maximum size $|\mathscr{F}|$ of an $F_0$-packing of $H$. Already in the case of graphs, computing $\nu_{F_0}(H)$ is NP-hard for most fixed $F_0$ (Dor and Tarsi [6]).
In this paper, we consider the case when $F_0$ is a fixed linear $k$-graph. We establish an algorithm which, for $\zeta > 0$ and a given $k$-graph $H$, constructs in time polynomial in $|V(H)|$ an $F_0$-packing of $H$ of size at least $\nu_{F_0}(H) - \zeta |V(H)|^k$. Our result extends one of Haxell and Rödl, who established the analogous algorithm for graphs.
This book is partially based on the material covered in several Cambridge Mathematical Tripos courses: the third-year undergraduate courses Information Theory (which existed and evolved over the last four decades under slightly varied titles) and Coding and Cryptography (a much younger and simplified course avoiding cumbersome technicalities), and a number of more advanced Part III courses (Part III is a Cambridge equivalent to an MSc in Mathematics). The presentation revolves, essentially, around the following core concepts: (a) the entropy of a probability distribution as a measure of ‘uncertainty’ (and the entropy rate of a random process as a measure of ‘variability’ of its sample trajectories), and (b) coding as a means to measure and use redundancy in information generated by the process.
Thus, the contents of this book include a more or less standard package of information-theoretical material which can be found nowadays in courses taught across the world, mainly at Computer Science and Electrical Engineering Departments and sometimes at Probability and/or Statistics Departments. What makes this book different is, first of all, a wide range of examples (a pattern that we followed from the onset of the series of textbooks Probability and Statistics by Example by the present authors, published by Cambridge University Press). Most of these examples are at the level adopted in Cambridge Mathematical Tripos exams. Therefore, our readers can make their own judgement about what level they have reached or want to reach.
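As a small worked instance of core concept (a) above, not an example taken from the book: the Shannon entropy of a probability distribution, in bits.

```python
import math

def entropy(p):
    """Shannon entropy H(p) = -sum p_i * log2(p_i), in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)

# A fair coin is maximally uncertain among two-outcome distributions;
# a biased coin carries less uncertainty.
print(entropy([0.5, 0.5]))    # 1.0 bit
print(entropy([0.9, 0.1]))    # ~0.469 bits
```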
A minimum feedback arc set of a directed graph $G$ is a smallest set of arcs whose removal makes $G$ acyclic. Its cardinality is denoted by $\beta(G)$. We show that a simple Eulerian digraph with $n$ vertices and $m$ arcs has $\beta(G) \geq \frac{m^2}{2n^2} + \frac{m}{2n}$, and this bound is optimal for infinitely many $m$, $n$. Using this result we prove that a simple Eulerian digraph contains a cycle of length at most $\frac{6n^2}{m}$, and has an Eulerian subgraph with minimum degree at least $\frac{m^2}{24n^3}$. Both estimates are tight up to a constant factor. Finally, motivated by a conjecture of Bollobás and Scott, we also show how to find long cycles in Eulerian digraphs.
We present a methodology for the extraction of narrative information from a large corpus. The key idea is to transform the corpus into a network, formed by linking the key actors and objects of the narration, and then to analyse this network to extract information about their relations. By representing information in a single network it is possible to infer relations between these entities, even when they have never been mentioned together. We discuss various types of information that can be extracted by our method, various ways to validate the information extracted, and two different application scenarios. Our methodology is very scalable, and addresses specific research needs in the social sciences.
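A minimal sketch of the corpus-to-network step, with toy sentences and a fixed actor list standing in for proper entity extraction:

```python
from itertools import combinations
from collections import Counter

# Link actors mentioned in the same sentence, weighting edges by
# co-mention counts. Actors and corpus are illustrative stand-ins.
actors = {"Alice", "Bob", "Carol"}
corpus = [
    "Alice met Bob in Paris.",
    "Bob criticised Carol.",
    "Alice and Bob signed the treaty.",
]

edges = Counter()
for sentence in corpus:
    mentioned = sorted(a for a in actors if a in sentence)
    for pair in combinations(mentioned, 2):
        edges[pair] += 1

print(edges)   # e.g. ('Alice', 'Bob') co-mentioned twice
# Indirect relations (e.g. Alice-Carol) can then be inferred from
# paths in the network even without any direct co-mention.
```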
We discuss our outreach efforts to introduce school students to network science and explain why researchers who study networks should be involved in such outreach activities. We provide overviews of modules that we have designed for these efforts, comment on our successes and failures, and illustrate the potentially enormous impact of such outreach efforts.
Comparisons rank objects according to their superiority or inferiority, and they can strongly influence a variety of evaluation processes. The Web facilitates qualitative and quantitative comparisons via online debates, discussion forums, product comparison sites, etc., and comparison analysis is becoming increasingly useful in many application areas. This study develops a method for classifying sentences in Korean text documents into several different comparative types to facilitate their analysis. We divide our study into two tasks: (1) extracting comparative sentences from text documents and (2) classifying comparative sentences into seven types. In the first task, we investigate many actual comparative sentences by referring to previous studies and construct a lexicon of comparisons. Sentences that contain elements from the lexicon are regarded as comparative sentence candidates. Next, we use machine learning techniques to eliminate non-comparative sentences from the candidates. In the second task, we roughly classify the comparative sentences using keywords and use a transformation-based learning method to correct initial classification errors. Experimental results show that our method could be suitable for practical use. We obtained an F1-score of 90.23% in the first task, an accuracy of 81.67% in the second task, and an overall accuracy of 88.59% for the integrated system with both tasks.
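The following sketch illustrates task (1) only, using English stand-ins for the Korean data and an invented toy lexicon: sentences matching lexicon entries become candidates, which a trained classifier would then filter.

```python
import re

# Toy comparison lexicon; entries are illustrative, not from the paper.
comparison_lexicon = ["better than", "worse than", "more", "less",
                      "superior to"]
pattern = re.compile("|".join(map(re.escape, comparison_lexicon)),
                     re.IGNORECASE)

sentences = [
    "Phone A's battery is better than Phone B's.",
    "The store opens at nine.",
    "Model X is superior to model Y in low light.",
]

# Lexicon matches yield comparative-sentence candidates; a machine
# learning step would then eliminate non-comparative candidates.
candidates = [s for s in sentences if pattern.search(s)]
print(candidates)
```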