A text usually contains one or a few main topics, which are split into subtopics, which can in turn be refined into more detailed topics. In this article we describe a system that segments a text into topics and subtopics. Each segment is characterized by important key terms extracted from it and by its beginning and end positions in the text. A table of contents is built using the hierarchical and sequential relationships between the topical segments identified in a text. The table of contents generator relies on universal linguistic theories of the topic and comment of a sentence and on patterns of thematic progression in text. The theories of topic and comment are modeled both deterministically and probabilistically. The system is applied to English texts (news, World Wide Web, and encyclopedia texts) and is evaluated.
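For concreteness, one way such a table of contents can be assembled — a hypothetical sketch, not the authors' algorithm — is to nest segments by positional containment, so that a segment whose span lies inside another's becomes its subtopic:

```python
# Hypothetical sketch: building a table-of-contents tree from topical
# segments, each characterized by key terms and begin/end positions.
# The segment format and the nesting rule are our own assumptions.

def build_toc(segments):
    """Nest segments by positional containment: a segment whose span
    lies inside another segment's span becomes its child."""
    # Sort by begin position, then by decreasing length, so parents
    # precede the subsegments they contain.
    segments = sorted(segments,
                      key=lambda s: (s["begin"], -(s["end"] - s["begin"])))
    root = {"begin": 0, "end": float("inf"), "terms": ["<root>"], "children": []}
    stack = [root]
    for seg in segments:
        node = dict(seg, children=[])
        # Pop until the top of the stack contains the new segment.
        while not (stack[-1]["begin"] <= node["begin"]
                   and node["end"] <= stack[-1]["end"]):
            stack.pop()
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]

def print_toc(nodes, depth=0):
    for n in nodes:
        print("  " * depth + ", ".join(n["terms"]), f"({n['begin']}-{n['end']})")
        print_toc(n["children"], depth + 1)

segments = [
    {"begin": 0,   "end": 400, "terms": ["climate policy"]},
    {"begin": 0,   "end": 180, "terms": ["emission targets"]},
    {"begin": 180, "end": 400, "terms": ["carbon markets"]},
    {"begin": 400, "end": 700, "terms": ["renewable energy"]},
]
print_toc(build_toc(segments))
```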
Morphological analysis is a crucial component of several natural language processing tasks, especially for languages with a highly productive morphology, where stipulating a full lexicon of surface forms is not feasible. This paper describes HAMSAH (HAifa Morphological System for Analyzing Hebrew), a morphological processor for Modern Hebrew based on linguistically motivated finite-state rules and a broad-coverage lexicon. The set of rules comprehensively covers the morphological, morpho-phonological, and orthographic phenomena observable in contemporary Hebrew texts. Reliance on finite-state technology facilitates the construction of a highly efficient, completely bidirectional system for analysis and generation.
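As a toy illustration of the bidirectionality that finite-state technology affords (this is not HAMSAH itself; the lexicon and tags below are invented, and real systems compose transducers rather than enumerating pairs), a morphological relation can be run in both directions:

```python
# Toy illustration: a finite-state morphological relation, modelled here
# as an explicit set of (surface, analysis) pairs built from a small
# lexicon and a suffix table, can be queried in both directions.

LEXICON = {"walk": "walk+V", "talk": "talk+V"}
SUFFIXES = {"": "", "s": "+3sg", "ed": "+Past", "ing": "+Prog"}

# The relation: every licensed (surface form, analysis) pair.
RELATION = {(stem + suf, analysis + tag)
            for stem, analysis in LEXICON.items()
            for suf, tag in SUFFIXES.items()}

def analyze(surface):           # surface form -> analyses
    return [a for s, a in RELATION if s == surface]

def generate(analysis):         # analysis -> surface forms
    return [s for s, a in RELATION if a == analysis]

print(analyze("walked"))        # ['walk+V+Past']
print(generate("talk+V+Prog"))  # ['talking']
```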
This paper presents a short survey of some recent approaches relating two different areas, viz. deterministic chaos and computability. Chaos in classical physics may be approached via dynamical (equationally determined) systems or stochastic ones (random processes). Randomness, however, has also been modelled effectively with recursion-theoretic tools by P. Martin-Löf. We recall the connections of Martin-Löf randomness to Kolmogorov complexity and show some applications to dynamical systems. This allows us to introduce results that connect well-established notions of entropy and algorithmic information.
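The connections alluded to can be stated precisely. Two standard results in this area (given here for orientation; the paper's own formulations may differ) are Schnorr's characterization of Martin-Löf randomness via Kolmogorov complexity, and Brudno's theorem relating dynamical entropy to algorithmic information:

```latex
% Schnorr's theorem: Martin-Löf randomness coincides with
% incompressibility in terms of prefix-free Kolmogorov complexity K.
x \in \{0,1\}^{\omega} \text{ is Martin-L\"of random}
  \iff \exists c \,\forall n:\; K(x_1 \cdots x_n) \ge n - c .

% Brudno's theorem: for an ergodic system (X, T, \mu) coded by a finite
% generating partition with coding map \phi, the per-symbol complexity
% of the orbit of \mu-almost every point equals the Kolmogorov--Sinai
% entropy h_\mu(T):
\lim_{n \to \infty}
  \frac{K\!\left(\phi(x)\,\phi(Tx)\cdots\phi(T^{n-1}x)\right)}{n}
  \;=\; h_\mu(T) \quad \text{for } \mu\text{-a.e. } x .
```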
For complex tasks such as parse selection, the creation of labelled training sets can be extremely costly. Resource-efficient schemes for creating informative labelled material must therefore be considered. We investigate the relationship between two broad strategies for reducing the amount of manual labelling necessary to train accurate parse selection models: ensemble models and active learning. We show that popular active learning methods for reducing annotation costs can be outperformed by instead using a model class which uses the available labelled data more efficiently. For this, we use a simple type of ensemble model called the Logarithmic Opinion Pool (LOP). We furthermore show that LOPs themselves can benefit from active learning. As predicted by a theoretical explanation of the predictive power of LOPs, a detailed analysis of active learning using LOPs shows that component model diversity is a strong predictor of successful LOP performance. Other contributions include a novel active learning method, a justification of our simulation studies using timing information, and cross-domain verification of our main ideas using text classification.
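The LOP itself is simple to state: it combines the component models' distributions by a normalized, weighted geometric mean (equivalently, a weighted average in log space). A minimal sketch, with invented component probabilities:

```python
import numpy as np

# Minimal sketch of a Logarithmic Opinion Pool (LOP): component model
# distributions are combined as a normalized, weighted geometric mean,
# p(y) proportional to prod_i p_i(y)^{w_i}. The numbers are made up.

def lop(probs, weights):
    """probs: (n_models, n_outcomes), each row a distribution over the
    candidate parses; weights: length n_models, summing to 1."""
    log_pool = weights @ np.log(probs)   # weighted sum of log-probs
    pool = np.exp(log_pool)
    return pool / pool.sum()             # renormalize

# Three component models scoring four candidate parses.
probs = np.array([[0.5, 0.2, 0.2, 0.1],
                  [0.4, 0.3, 0.2, 0.1],
                  [0.6, 0.1, 0.2, 0.1]])
weights = np.array([1/3, 1/3, 1/3])
print(lop(probs, weights))
```

Intuitively, the geometric mean rewards outcomes on which diverse components agree, which is consistent with the finding above that component diversity predicts LOP performance.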
We present a new kind of ambient calculus in which the open capability is replaced by direct mobility of generic processes. The calculus comes equipped with a labelled transition system in which types play a major role: this system allows us to show interesting algebraic laws. As usual, types express the communication, access and mobility properties of the modelled system, and inferred types express the minimal constraints required for the system to be well behaved.
We model micro-architectures with non-pipelined and pipelined instruction processing using Maurer machines, basic thread algebra, and program algebra. We show that stored programs are executed as intended on these micro-architectures. We believe this work provides a new mathematical approach to modelling micro-architectures and to verifying their correctness and anticipated speed-up results.
The design and construction of lexical resources is a critical issue in Natural Language Processing (NLP). Real-world NLP systems need large-scale lexica that provide rich information about words and word senses at all levels (morphological, syntactic, lexical-semantic, etc.), but the construction of lexical resources is a difficult and costly task. The last decade has been highly influenced by the notion of reusability, that is, the use of the information in existing lexical resources when constructing new ones. It is unrealistic, however, to expect that the great variety of available lexical information resources could be converted into a single, standard representation schema in the near future. The purpose of this article is to present the ELHISA system, a software architecture for the integration of heterogeneous lexical information. We address, from the point of view of the information integration area, the problem of querying very different existing lexical information sources using a single, common query language. The integration in ELHISA is performed in a logical way, so the lexical resources undergo no modification when they are integrated into the system. ELHISA is primarily defined as a consultation system for accessing structured lexical information, and therefore does not have the capability to modify or update the underlying information. For this purpose, a General Conceptual Model (GCM) for describing diverse lexical data has been conceived. The GCM establishes a fixed vocabulary describing objects in the lexical information domain, their attributes, and the relationships among them. To integrate the lexical resources into the federation, a Source Conceptual Model (SCM) is built on top of each one, representing the lexical objects occurring in that particular source. To answer user queries, ELHISA must access the integrated resources and hence translate a query expressed in GCM terms into queries formulated in terms of the SCM of each source. The relation between the GCM and the SCMs is described explicitly by means of mapping rules called Content Description Rules. Data integration at the extensional level is achieved through a data cleansing process, which is needed to compare data arriving from different sources; it is in this process that the object identification step is carried out. Based on this architecture, a prototype named ELHISA has been built, and five resources covering a broad scope have so far been integrated into it for testing purposes. The fact that such heterogeneous resources have been integrated into the system with ease shows, in the opinion of the authors, the suitability of the approach taken.
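To make the rule-based translation concrete, here is a hypothetical sketch of how a Content Description Rule might rewrite a GCM-level query into each source's SCM vocabulary; all rule and field names are invented for illustration, and the article's actual rule language is richer:

```python
# Hypothetical sketch of ELHISA-style query translation. Each Content
# Description Rule maps a GCM attribute to one source's SCM vocabulary;
# the source itself is never modified. All names here are invented.

CONTENT_DESCRIPTION_RULES = {
    "wordnet_source": {"lemma": "word_form",      "pos": "syntactic_category"},
    "xml_dictionary": {"lemma": "entry/headword", "pos": "entry/@pos"},
}

def translate_query(gcm_query, source):
    """Rewrite a GCM query {attribute: value} into the SCM terms of
    one federated source."""
    rules = CONTENT_DESCRIPTION_RULES[source]
    return {rules[attr]: value for attr, value in gcm_query.items()}

gcm_query = {"lemma": "bank", "pos": "noun"}
for source in CONTENT_DESCRIPTION_RULES:
    print(source, "->", translate_query(gcm_query, source))
```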
We explore the way in which the refinement of individual ‘local’ components of a specification relates to the development of a ‘global’ system from a specification of requirements. The observational interpretation of specifications and refinements adds expressive power and flexibility, but introduces some subtle problems. Our study of these issues is carried out in the context of Casl architectural specifications. We introduce a definition of observational equivalence for Casl models, leading to an observational semantics for architectural specifications for which we prove important properties. Overall, this fulfills the long-standing goal of complementing the standard semantics of Casl specifications with an observational view that supports observational refinement of specifications in combination with Casl-style architectural design.
This study investigates the attitudes of 245 second language (L2) Spanish learners toward an online workbook used over two consecutive semesters. The treatment consisted of four hours of classroom instruction and one set of online homework per week. Students' attitudes toward the electronic workbook were measured by means of a survey administered after eight months of exposure to the workbook. The qualitative survey data were compared with quantitative data from two different language assessment tests. The results of these tests indicated a significant increase in grammar scores. These results are consonant with the positive student perceptions of the online workbook obtained in this and previous studies, which emphasize its benefits in terms of accessibility of the material, user-friendliness, and instant error feedback. More importantly, most students praised the usefulness of the online workbook for language learning, particularly in the areas of grammar and vocabulary acquisition. Despite participants' mostly positive attitudes, the survey also revealed some negative aspects of using the online workbook, such as the amount of time needed to complete the online exercises. This paper addresses these issues and provides suggestions for overcoming problems of this type.
What does learning in today's technology-enhanced environment mean? Is learning as an activity fundamentally changing as a result of the opportunities offered by new technologies and tools? How are the new communicative channels and increased social dimensions made possible by Web 2.0 technologies affecting the way students work and learn? And what does this mean for the role of teachers and institutions in terms of how they support students? This paper considers these questions and reports on findings from current research evaluating how students are actually using technologies, and what this research tells us about the ways in which patterns of learning might be changing. It considers the implications for individual teachers (in terms of designing and supporting learning activities for students) and for institutions, in terms of the impact on policy and the associated infrastructure needed to provide an environment that maximises the potential offered by new technologies.
Over the last decade, most studies in Computer-Mediated Communication (CMC) have highlighted how online synchronous learning environments implement a new literacy related to multimodal communication. The environment used in our experiment is based on a synchronous audio-graphic conferencing tool. This study concerns false beginners in an English for Specific Purposes (ESP) course who present a high degree of heterogeneity in their proficiency levels. A coding scheme was developed to translate the video data into the user actions and speech acts that occurred in the various modalities of the system (aural, text chat, text editing, websites). The paper aims to shed further light on, and increase our understanding of, multimodal communication structures through learner participation and learning practices. On the basis of evidence from an ongoing research investigation into online CALL literacy, we identify how learners use different modalities to collectively produce a written text, and how multimodal learning interaction affects the learners' focus and engagement in the learning process. The adopted methodology combines a quantitative analysis of the learners' participation in a writing task with regard to the use of multimodal tools, and a qualitative analysis focusing on how the multimodal dimension of communication enhances language and learning strategies. By looking at the relationship between how the learning tasks are designed by tutors and how they are carried out by learners, that is to say, taking into account the whole perception of multimodal communication for language learning purposes, we provide a framework for evaluating the potential of such an environment for language learning.
This paper presents a comprehensive picture of what has been investigated in terms of CALL effectiveness over the period 1981–2005, throwing light on why this question is still so difficult to answer unequivocally. The author looks at both strengths and weaknesses in this body of work, highlighting pitfalls and paradoxes in research procedures and providing valid design models. This includes the contribution of dedicated meta-analyses to this controversial field and a discussion of the benefits and limitations associated with this type of research. Substantial data drawn from three extensive studies (Felix, 2005a, b; Felix, 2006a) allow the author to present, for the first time, synthesized findings relating to the impact of technologies on language learning. The paper concludes with strategies for future work in the context of a proposed research agenda.
This paper describes an intercontinental project using both synchronous and asynchronous interactive tools, set up to internationalize the academic learning of Spanish language and culture. The objective of this case study was to investigate whether video-web communication tools can contribute to enriching the quality of foreign language curricula by facilitating a motivating virtual communication environment for purposeful interaction between non-native and native speakers of Spanish in accomplishing learning tasks. The project was carried out between a class of twenty Spanish-as-a-foreign-language students from the University of Utrecht, The Netherlands, and a class of twenty Chilean trainee Spanish teachers from the University of Concepción, Chile. Students interacted weekly, over two months, in dyads and small groups, making use of a video-web communication tool, Adobe Connect. The tool enabled synchronous interactions during which participants could see each other while talking and share audiovisual documents online. A blog was also used to promote collaboration, reflection, and the exchange of ideas about issues raised during the synchronous sessions. Qualitative data were collected through a questionnaire, analysis of recordings of learners' interaction sessions, and the project blog. Results show a positive impact on motivation and on learning outcomes, particularly regarding understanding of the use of language in given contexts and of cultural issues.
Although the success of automatic speech recognition (ASR)-based Computer Assisted Pronunciation Training (CAPT) systems is increasing, little is known about the pedagogical effectiveness of these systems. This is particularly regrettable because ASR technology still suffers from limitations that may result in the provision of erroneous feedback, possibly leading to learning breakdowns. To study the effectiveness of ASR-based feedback for improving pronunciation, we developed and tested a CAPT system providing automatic feedback on Dutch phonemes that are problematic for adult learners of Dutch. Thirty immigrants who were studying Dutch were assigned to one of three groups, using either the ASR-based CAPT system with automatic feedback, a CAPT system without feedback, or no CAPT system. Pronunciation quality was assessed for each participant before and after the training by human experts, who evaluated overall segmental quality and the quality of the phonemes addressed in the training. The participants' impressions of the CAPT system used were also studied through anonymous questionnaires. The results on global segmental quality show that the group receiving ASR-based feedback made the largest mean improvement, but the groups' mean improvements did not differ significantly. The group receiving ASR-based feedback showed a significantly larger improvement than the no-feedback group in the segmental quality of the problematic phonemes targeted.
We consider the classical coupon collector's problem in which each new coupon collected is of type $i$ with probability $p_i$, where $\sum_{i=1}^{n} p_i = 1$. We derive some formulas concerning $N$, the number of coupons needed to have a complete set of at least one of each type, that are computationally useful when $n$ is not too large. We also present efficient simulation procedures for determining $P(N > k)$, as well as analytic bounds for this probability.
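Both ingredients are standard enough to sketch. The inclusion-exclusion formula below gives $P(N > k)$ exactly and is computationally useful precisely when $n$ is not too large (it has $2^n - 1$ terms); the Monte Carlo estimator works for any $n$. This is a generic computation, not necessarily the authors' specific procedure:

```python
import random
from itertools import combinations

def p_n_greater_k_exact(p, k):
    """P(N > k) = P(some type is missing after k draws), computed by
    inclusion-exclusion over the set S of missing types:
    sum over nonempty S of (-1)^(|S|+1) * (1 - sum_{i in S} p_i)^k."""
    n = len(p)
    total = 0.0
    for r in range(1, n + 1):
        for S in combinations(range(n), r):
            total += (-1) ** (r + 1) * (1 - sum(p[i] for i in S)) ** k
    return total

def p_n_greater_k_mc(p, k, trials=100_000):
    """Monte Carlo: draw k coupons, check whether any type is missing."""
    types = range(len(p))
    misses = 0
    for _ in range(trials):
        seen = set(random.choices(types, weights=p, k=k))
        if len(seen) < len(p):
            misses += 1
    return misses / trials

p = [0.5, 0.3, 0.2]
print(p_n_greater_k_exact(p, 10))   # exact value
print(p_n_greater_k_mc(p, 10))      # should agree to ~2 decimals
```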
In many companies, legacy systems have been used to serve customers arriving at service counters. The demand of a customer arriving at a counter is divided into R subdemands (SDs), each of which is processed sequentially by the legacy system; when the final SD's service is completed, the customer leaves the counter. Internet users, on the other hand, want each SD to be processed through the Internet without going to a service counter. Because the applications that operate a web system differ from those of the legacy systems, companies may use the legacy systems for customers who reach the company through the Internet, without changing the legacy systems' applications. Such a web system, integrated with the legacy system, is called a front-end web system. Supposing that demands for the web system are generated according to a Poisson process, we show that the loss probability and macrostate distribution of the front-end web system are insensitive to the distribution of the discrete random variable R, as well as to the continuous distributions of other uncertain factors, under a certain restriction.
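The flavour of such insensitivity results can be illustrated with the classical Erlang loss system M/G/c/c, whose blocking probability depends on the service-time distribution only through its mean. The sketch below simulates that classical analogue (not the paper's front-end web model) with two service distributions of equal mean and compares against the Erlang B formula:

```python
import heapq
import random

def simulate_loss(arrival_rate, service_sampler, servers, n_arrivals=200_000):
    """Blocking probability of an M/G/c/c loss system: Poisson arrivals,
    general service times, c servers, no waiting room."""
    t, busy_until, blocked = 0.0, [], 0
    for _ in range(n_arrivals):
        t += random.expovariate(arrival_rate)
        while busy_until and busy_until[0] <= t:
            heapq.heappop(busy_until)           # release finished servers
        if len(busy_until) >= servers:
            blocked += 1                         # all servers busy: loss
        else:
            heapq.heappush(busy_until, t + service_sampler())
    return blocked / n_arrivals

def erlang_b(offered_load, c):
    """Erlang B via the standard inverse recursion."""
    inv_b = 1.0
    for k in range(1, c + 1):
        inv_b = 1.0 + inv_b * k / offered_load
    return 1.0 / inv_b

lam, mean_service, c = 3.0, 1.0, 4
print(simulate_loss(lam, lambda: random.expovariate(1 / mean_service), c))
print(simulate_loss(lam, lambda: mean_service, c))   # deterministic service
print(erlang_b(lam * mean_service, c))               # analytic value
```

All three printed values should agree (up to simulation noise), illustrating insensitivity: only the mean of the service distribution matters.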
In this article we derive some results characterizing the almost closed sets of a face-homogeneous random walk. We present a conjecture on the relation between discrete scattering of the fluid limit and the absence of nonatomic almost closed sets, and we illustrate the conjecture with random walks having both simple and nonsimple decompositions into almost closed sets.
In this article we consider an insurance company selling life insurance policies. New policies are sold at random points in time, and each policy stays active for an exponential amount of time with rate μ, during which the policyholder pays premiums continuously at rate r. When the policy expires, the insurance company pays a claim of random size. The aim is to compute the probability of eventual ruin starting with a given number of policies and a given level of insurance fund. We establish the remarkable result that the ruin probability is identical to the one in the standard compound Poisson model where the insurance fund increases at constant rate r and claims occur according to a Poisson process with rate μ.
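The compound Poisson reduction makes the ruin probability easy to approximate numerically. A minimal Monte Carlo sketch, assuming exponential claim sizes (our choice; the abstract does not fix the claim distribution) and approximating eventual ruin by ruin before a long horizon:

```python
import math
import random

# Sketch: ruin probability in the standard compound Poisson model the
# result reduces to. The fund starts at u, grows at constant rate r,
# claims arrive as a Poisson process with rate mu, and claim sizes are
# exponential with mean m (an assumption made here for illustration).

def ruin_probability(u, r, mu, claim_mean, horizon=500.0, trials=5_000):
    ruined = 0
    for _ in range(trials):
        fund, t = u, 0.0
        while t < horizon:
            dt = random.expovariate(mu)                  # time to next claim
            t += dt
            fund += r * dt                               # premiums accrue
            fund -= random.expovariate(1 / claim_mean)   # pay the claim
            if fund < 0:                                 # ruin at a claim epoch
                ruined += 1
                break
    return ruined / trials

# With exponential claims of mean m, eventual ruin has the closed form
# (mu*m/r) * exp(-(1/m - mu/r) * u), valid when r > mu*m (positive drift).
u, r, mu, m = 10.0, 2.0, 1.0, 1.5
print(ruin_probability(u, r, mu, m))
print((mu * m / r) * math.exp(-(1 / m - mu / r) * u))
```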