This paper discusses a framework for designing online tasks that capitalizes on the possibilities the Internet and the Web offer for language learning. To present the framework, we draw on constructivist theories (Brooks and Brooks, 1993) and their application to educational technology (Newby, Stepich, Lehman and Russell, 1996; Jonassen, Mayes and McAleese, 1993); second language learning and learning autonomy (Benson and Voller, 1997); and distance education (Race, 1989; White, 1999). Our model balances, on the one hand, the independent language learner’s need for control and autonomy with, on the other, the possibilities that online task-based learning offers for new reading processes, taking into account new literacy models (Schetzer and Warschauer, 2000) and the effect that the new media have on students’ knowledge construction and understanding of texts. We explain how this model works in the design of reading tasks within the specific distance learning context of the Open University, UK. Trayectorias is a tool that consists of an open problem-solving Web-quest and provides students with ‘scaffolding’ that guides their navigation around the Web whilst modelling learning approaches and the new learning paradigms triggered by the medium. We then discuss a small-scale trial with a cohort of students (n = 23). The trial had a double purpose: (a) to evaluate to what extent the writing task fulfilled the investigators’ intentions; and (b) to obtain information about the students’ perceptions of the task.
In the rapidly changing environment of language learning and teaching, electronic literacies have an increasingly important role to play. While much research on new literacies focuses on the World Wide Web, the aim of this study is to investigate the importance of corpus consultation as a new type of literacy of particular relevance in the context of language learning and teaching. After briefly situating the theoretical and pedagogical context of the study in relation to authenticity and learner autonomy, the paper describes an empirical study involving eight postgraduate students of French. As part of a Masters course they write a short text and subsequently attempt to improve it by using concordancing software to consult a small corpus containing texts on a similar subject. The analysis of the results reveals a significant number of changes made by the learners, which may be classified as follows in order of frequency: grammatical errors (gender and agreement, prepositions, verb forms/mood, negation and syntax); misspellings, accents and hyphens; lexico-grammatical patterning (native language interference, choice of verb and inappropriate vocabulary); and capitalisation. The conclusion notes that the situation in which these students found themselves (i.e. faced with a text on which the teacher had indicated phrases that could be improved) arises in many cases every day, and suggests that corpus consultation may have a useful role to play in the context of interactive feedback, particularly in cases where traditional language learning resources are of little use.
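For readers unfamiliar with concordancing, the sketch below shows a minimal keyword-in-context (KWIC) lookup in Python. It is purely illustrative: the study itself used dedicated concordancing software over a purpose-built corpus, and the sample text here is invented.

```python
# Minimal keyword-in-context (KWIC) concordance, purely illustrative;
# not the concordancing software used in the study.

def concordance(text, keyword, width=30):
    """Return each occurrence of `keyword` with `width` characters of context."""
    lowered, target = text.lower(), keyword.lower()
    start, lines = 0, []
    while True:
        i = lowered.find(target, start)
        if i == -1:
            break
        left = text[max(0, i - width):i]
        right = text[i + len(target):i + len(target) + width]
        lines.append(f"{left:>{width}} {text[i:i + len(target)]} {right}")
        start = i + len(target)
    return lines

# invented sample text, standing in for a small corpus of similar texts
corpus = "The students write a short text. The text is then revised with corpus evidence."
for line in concordance(corpus, "text"):
    print(line)
```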
There are now many CALL authoring packages that can create interactive websites, and a large number of language teachers are writing materials for the whole range of such packages. Currently, each product stores its data in a different format, hindering interoperability, the pooling of digital resources and movement between software packages built on different technologies. The use of Extensible Markup Language (XML) for data storage goes a long way towards solving this problem and allows exercises to be converted easily. Starting from a desire to develop a common format between Hot Potatoes, WELTS (part of the WELL project) and the Interactive Language Learning package from London Metropolitan University, a new version of the Interactive Language Learning software, now renamed Guildhall Interactive Software for Multimedia On-line (GISMO), has made such conversion possible. Given the immense resources needed to develop the critical mass of material required to make online CALL relevant to an individual’s teaching practice, such a common approach is required to facilitate the pooling of resources. Should a bureaucratic or financial decision in an institution result in a change of software, teachers need to be able to convert their legacy material easily. XML technology can facilitate interoperability, thereby increasing potential accessibility by giving teachers and students the use of a greater amount of pedagogical material. It is further proposed, using these developments, to create a large pool of exercises for practice and assessment that is independent of the delivery approach employed. This will obviate the need for teachers to keep reproducing basic language learning material and allow online CALL to expand into more imaginative areas. This possibility raises the question of standards within XML: whether it is necessary to specify further how the material is stored, perhaps using a standard such as the ‘IMS Question & Test Interoperability Specification’, or whether XML is a sufficient standard in itself.
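As an illustration of the kind of conversion a common XML format makes possible, here is a minimal sketch in Python. The element names and attributes are invented for the example; they do not reproduce the actual Hot Potatoes, WELTS or GISMO schemas.

```python
# Hypothetical sketch of XML-based exercise storage and conversion.
# The element names below are invented for illustration only.
import xml.etree.ElementTree as ET

SOURCE = """
<exercise type="gap-fill">
  <prompt>Fill in the gap.</prompt>
  <item answer="went">Yesterday she ___ to school.</item>
</exercise>
"""

def convert(source_xml):
    """Read one exercise and re-emit it in a (hypothetical) target schema."""
    src = ET.fromstring(source_xml)
    dst = ET.Element("question", kind=src.get("type"))
    ET.SubElement(dst, "stem").text = src.findtext("prompt")
    for item in src.findall("item"):
        gap = ET.SubElement(dst, "gap", solution=item.get("answer"))
        gap.text = item.text
    return ET.tostring(dst, encoding="unicode")

print(convert(SOURCE))
```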
Probably the most important data type after vectors and free text is that of symbol strings of varying lengths. This type of data is commonplace in bioinformatics applications, where it can be used to represent proteins as sequences of amino acids, genomic DNA as sequences of nucleotides, promoters and other structures. Partly for this reason, a great deal of research has been devoted to it in the last few years. Many other application domains also deal with data in the form of sequences, so many of the techniques have a history of development within computer science, as for example in stringology, the study of string algorithms.
Kernels have been developed to compute the inner product between images of strings in high-dimensional feature spaces using dynamic programming techniques. Although sequences can be regarded as a special case of a more general class of structures for which kernels have been designed, we will discuss them separately for most of the chapter in order to emphasise their importance in applications and to aid understanding of the computational methods. In the last part of the chapter, we will show how these concepts and techniques can be extended to cover more general data structures, including trees, arrays, graphs and so on.
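As a concrete, deliberately simple example of a string kernel, the sketch below computes the p-spectrum kernel, whose feature space is indexed by all strings of length p and whose value counts the length-p substrings two strings share. The gap-weighted subsequence kernels treated in the chapter require the dynamic programming recursions developed there; this sketch only illustrates the general idea of an implicitly defined, high-dimensional string feature space.

```python
# Minimal p-spectrum string kernel: the feature space is indexed by all
# length-p strings, and k(s, t) counts (with multiplicity) the length-p
# substrings that s and t share.
from collections import Counter

def p_spectrum_kernel(s, t, p=3):
    """Inner product of the p-spectrum feature vectors of s and t."""
    phi_s = Counter(s[i:i + p] for i in range(len(s) - p + 1))
    phi_t = Counter(t[i:i + p] for i in range(len(t) - p + 1))
    return sum(phi_s[u] * phi_t[u] for u in phi_s)

# toy example with nucleotide strings
print(p_spectrum_kernel("GATTACA", "ATTACCA", p=3))   # shared 3-mers: ATT, TTA, TAC
```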
Certain kernels for strings based on probabilistic modelling of the data-generating source will not be discussed here, since Chapter 12 is entirely devoted to these kinds of methods. There is, however, some overlap between the structure kernels presented here and those arising from probabilistic modelling covered in Chapter 12.
The last decade has seen an explosion of readily available digital text that has rendered attempts to analyse and classify it by hand infeasible. As a result, the automatic processing of natural language text documents has become a major research interest of Artificial Intelligence (AI) and of computer science in general. It is probably fair to say that, after multivariate data, natural language text is the most important data format for applications. Its particular characteristics therefore deserve specific attention.
We will see how well-known techniques from Information Retrieval (IR), such as the rich class of vector space models, can be naturally reinterpreted as kernel methods. This new perspective enriches our understanding of the approach, as well as leading naturally to further extensions and improvements. The approach that this perspective suggests is based on detecting and exploiting statistical patterns of words in the documents. An important property of the vector space representation is that the primal–dual dialectic we have developed through this book has an interesting counterpart in the interplay between term-based and document-based representations.
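A minimal sketch of the basic vector space ('bag-of-words') construction may help fix ideas: documents are mapped to term-frequency vectors, the rows of a document-by-term matrix D, and the kernel matrix of document inner products is K = DDᵀ. The toy documents below are invented for illustration; weighting schemes such as tf-idf and the semantic refinements discussed in the chapter are omitted.

```python
# Basic vector space (bag-of-words) kernel: K = D @ D.T, where D is the
# document-by-term count matrix. Toy documents, illustrative only.
import numpy as np

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "kernel methods for text",
]
vocab = sorted({w for d in docs for w in d.split()})

D = np.array([[d.split().count(w) for w in vocab] for d in docs], dtype=float)
K = D @ D.T          # document-by-document kernel matrix (dual representation)
print(K)
```

The term-based (primal) view works with the columns of D, while the document-based (dual) view works only with K, mirroring the primal-dual interplay mentioned above.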
The goal of this chapter is to introduce the Vector Space family of kernel methods, highlighting their construction and the primal–dual dichotomy that they illustrate. Other kernel constructions can be applied to text, for example those using probabilistic generative models and string matching, but since these kernels are not specific to natural language text, they will be discussed separately in Chapters 11 and 12.
The previous chapter saw the development of some basic tools for working in a kernel-defined feature space resulting in some useful algorithms and techniques. The current chapter will extend the methods in order to understand the spread of the data in the feature space. This will be followed by examining the problem of identifying correlations between input vectors and target values. Finally, we discuss the task of identifying covariances between two different representations of the same object.
All of these important problems in kernel-based pattern analysis can be reduced to performing an eigen- or generalised eigen-analysis, that is, the problem of finding solutions of the equation Aw = λBw for given symmetric matrices A and B. These problems range from finding a set of k directions in the embedding space containing the maximum amount of variance in the data (principal components analysis (PCA)), through finding correlations between input and output representations (partial least squares (PLS)), to finding correlations between two different representations of the same data (canonical correlation analysis (CCA)). The Fisher discriminant analysis from Chapter 5 can also be cast as a generalised eigenvalue problem.
The importance of this class of algorithms is that the generalised eigenvalue problem provides an efficient way of optimising an important family of cost functions; it can be studied with simple linear algebra and can be solved or approximated efficiently using a number of well-known techniques from computational algebra.
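As an illustration of this common computational core, the sketch below solves a small generalised eigenvalue problem Aw = λBw with scipy.linalg.eigh, which handles symmetric A and positive definite B; with B equal to the identity this reduces to the ordinary eigen-analysis used, for example, in PCA. The matrices are random placeholders, purely for illustration.

```python
# Solve A w = lambda B w for symmetric A and symmetric positive definite B.
# Random placeholder matrices; with B = I this is ordinary eigen-analysis.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 5))
A = X + X.T                      # symmetric
B = X @ X.T + 5 * np.eye(5)      # symmetric positive definite

eigvals, eigvecs = eigh(A, B)    # generalised symmetric eigenproblem
print(eigvals)

# sanity check on the first generalised eigenpair
w, lam = eigvecs[:, 0], eigvals[0]
print(np.allclose(A @ w, lam * (B @ w)))   # True
```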
In this chapter we conclude our presentation of kernel-based pattern analysis algorithms by discussing three further common tasks in data analysis: ranking, clustering and data visualisation.
Ranking is the problem of learning a ranking function from a training set of ranked data. The number of ranks need not be specified, though typically the training data comes with a relative ordering, specified by assignment to one of an ordered sequence of labels.
Clustering is perhaps the most important and widely used method of unsupervised learning: it is the problem of identifying groupings of similar points that are relatively ‘isolated’ from each other or, in other words, of partitioning the data into dissimilar groups of similar items. The number of such clusters may not be specified a priori. As exact solutions are often computationally hard to find, effective approximations via relaxation procedures need to be sought.
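One common way of clustering with a kernel is kernel k-means, in which squared distances to cluster means are computed entirely from the kernel matrix. The sketch below is a minimal illustration under assumed toy data and an assumed Gaussian kernel; it is not the specific relaxation procedure developed in the chapter.

```python
# Minimal kernel k-means: cluster points using only their kernel matrix K,
# with distances to cluster means computed in the feature space.
import numpy as np

def kernel_kmeans(K, k, n_iter=20, seed=0):
    """Cluster n points given only their n x n kernel matrix K."""
    n = K.shape[0]
    labels = np.random.default_rng(seed).integers(k, size=n)
    for _ in range(n_iter):
        dist = np.full((n, k), np.inf)
        for c in range(k):
            idx = np.flatnonzero(labels == c)
            if idx.size == 0:
                continue
            # ||phi(x_i) - mu_c||^2 = K_ii - 2*mean_j K_ij + mean_jl K_jl
            dist[:, c] = (np.diag(K)
                          - 2 * K[:, idx].mean(axis=1)
                          + K[np.ix_(idx, idx)].mean())
        labels = dist.argmin(axis=1)
    return labels

# toy example: two well-separated blobs, Gaussian (RBF) kernel
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, (10, 2)), rng.normal(4.0, 0.5, (10, 2))])
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq_dists / 2.0)
print(kernel_kmeans(K, 2))
```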
Data visualisation is often overlooked in pattern analysis and machine learning textbooks, despite being very popular in the data mining literature. It is a crucial step in the process of data analysis, enabling an understanding of the relations that exist within the data by displaying them in such a way that the discovered patterns are emphasised. These methods will allow us to visualise the data in the kernel-defined feature space, something very valuable for the kernel selection process. Technically it reduces to finding low-dimensional embeddings of the data that approximately retain the relevant information.
There are two key properties that are required of a kernel function for an application. Firstly, it should capture the measure of similarity appropriate to the particular task and domain, and secondly, its evaluation should require significantly less computation than would be needed in an explicit evaluation of the corresponding feature mapping ϕ. Both of these issues will be addressed in the next four chapters but the current chapter begins the consideration of the efficiency question.
A number of computational methods can be deployed to shortcut the computation: some involve closed-form analytic expressions, others exploit recursive relations, and others are based on sampling. This chapter shows several of these methods in action, illustrating how to design new kernels for specific applications. It also paves the way for the final three chapters, which carry these techniques into the design of advanced kernels.
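The classic example of such a shortcut is the polynomial kernel, whose closed form evaluates an inner product in a space of monomial features without ever constructing them. The sketch below compares the explicit degree-2 feature map with the closed form; the data are random placeholders, purely for illustration.

```python
# Closed-form shortcut versus explicit feature map for the homogeneous
# degree-2 polynomial kernel k(x, z) = <x, z>^2, whose feature space consists
# of all ordered products x_i * x_j.
import numpy as np

def phi(x):
    """Explicit degree-2 feature map: all ordered products x_i * x_j."""
    return np.outer(x, x).ravel()

rng = np.random.default_rng(0)
x, z = rng.standard_normal(100), rng.standard_normal(100)

explicit = phi(x) @ phi(z)        # inner product in a 10,000-dimensional space
closed_form = (x @ z) ** 2        # one 100-dimensional inner product, squared
print(np.allclose(explicit, closed_form))   # True
```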
We will also return to an important theme already broached in Chapter 3, namely that kernel functions are not restricted to vectorial inputs: kernels can be designed for objects and structures as diverse as strings, graphs, text documents, sets and graph nodes. Given the different evaluation methods and the diversity of the types of data on which kernels can be defined, together with the methods for composing and manipulating kernels outlined in Chapter 3, it should be clear how versatile this approach to data modelling can be, allowing as it does for refined customisations of the embedding map ϕ to the problem at hand.
Pattern analysis deals with the automatic detection of patterns in data, and plays a central role in many modern artificial intelligence and computer science problems. By patterns we understand any relations, regularities or structure inherent in some source of data. By detecting significant patterns in the available data, a system can expect to make predictions about new data coming from the same source. In this sense the system has acquired generalisation power by ‘learning’ something about the source generating the data. There are many important problems that can only be solved using this approach, problems ranging from bioinformatics to text categorization, from image analysis to web retrieval. In recent years, pattern analysis has become a standard software engineering approach, and is present in many commercial products.
Early approaches were efficient in finding linear relations, while nonlinear patterns were dealt with in a less principled way. The methods described in this book combine the theoretically well-founded approach previously limited to linear systems, with the flexibility and applicability typical of nonlinear methods, hence forming a remarkably powerful and robust class of pattern analysis techniques.
A distinction has been drawn between statistical and syntactical pattern recognition: the former deals essentially with vectors under statistical assumptions about their distribution, while the latter deals with structured objects such as sequences or formal languages and relies much less on statistical analysis.