We prove that, for all values of the edge probability $p(n)$, the largest eigenvalue of the random graph $G(n, p)$ almost surely satisfies $\lambda_1(G)=(1+o(1))\max\{\sqrt{\Delta}, np\}$, where $\Delta$ is the maximum degree of $G$, and the $o(1)$ term tends to zero as $\max\{\sqrt{\Delta}, np\}$ tends to infinity.
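As a rough numerical illustration of this asymptotic statement (not drawn from the paper itself), the following Python sketch samples a few instances of $G(n, p)$ and compares the largest adjacency eigenvalue with $\max\{\sqrt{\Delta}, np\}$; the parameter values are arbitrary and the agreement is only approximate at these finite sizes.

    import numpy as np

    rng = np.random.default_rng(0)

    def lambda1_vs_bound(n, p):
        # Symmetric 0/1 adjacency matrix of G(n, p), no self-loops.
        upper = rng.random((n, n)) < p
        adj = np.triu(upper, k=1)
        adj = (adj | adj.T).astype(float)
        lam1 = np.linalg.eigvalsh(adj)[-1]      # largest eigenvalue
        delta = adj.sum(axis=1).max()           # maximum degree
        return lam1, max(np.sqrt(delta), n * p)

    for n, p in [(2000, 0.001), (2000, 0.01), (2000, 0.1)]:
        lam1, bound = lambda1_vs_bound(n, p)
        print(f"n={n}, p={p}: lambda_1={lam1:.2f}, max(sqrt(Delta), np)={bound:.2f}")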
In path planning design, potential fields can introduce force constraints that ensure curvature continuity of trajectories and thus facilitate path-tracking design. The parametric thrift of fractional potentials permits smooth variation of the potential as a function of the distance to obstacles without requiring the design of a geometric charge distribution. In the approach we use, the fractional order of differentiation is the risk coefficient associated with obstacles. A convex danger map towards a target and a convex geodesic distance map are defined. Real-time computation can also lead to the shortest minimum-danger trajectory, or to the least dangerous of the minimum-length trajectories.
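By way of illustration only (this is an assumed form, not the authors' construction), a repulsive potential whose decay with the distance to an obstacle is governed by a single fractional exponent can be sketched as follows; the exponent alpha plays the role of a per-obstacle risk coefficient, and a danger map is obtained by summing the contributions of all obstacles over a grid.

    import numpy as np

    def fractional_potential(dist, alpha, eps=1e-6):
        # Hypothetical form U(d) = d**(-alpha): larger alpha concentrates the
        # potential (and hence the perceived danger) near the obstacle.
        return np.maximum(dist, eps) ** (-alpha)

    def danger_map(points, obstacles, alphas):
        # Sum the fractional potentials of all obstacles, each with its own
        # risk coefficient alpha.
        total = np.zeros(len(points))
        for obs, alpha in zip(obstacles, alphas):
            total += fractional_potential(np.linalg.norm(points - obs, axis=1), alpha)
        return total

    xs = np.linspace(0.0, 1.0, 50)
    points = np.stack(np.meshgrid(xs, xs), axis=-1).reshape(-1, 2)
    field = danger_map(points,
                       obstacles=np.array([[0.5, 0.5], [0.2, 0.8]]),
                       alphas=[1.5, 0.5])
    print(field.shape)   # one danger value per grid point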
Classifier combination is an effective and broadly useful method of improving system performance. This article investigates in depth a large number of both well-established and novel classifier combination approaches for the word sense disambiguation task, studied over a diverse classifier pool which includes feature-enhanced Naïve Bayes, Cosine, Decision List, Transformation-based Learning and MMVC classifiers. Each classifier has access to the same rich feature space, comprising distance-weighted bag-of-lemmas, local n-gram context and specific syntactic relations, such as Verb-Object and Noun-Modifier. This study examines several key issues in system combination for the word sense disambiguation task, ranging from algorithmic structure to parameter estimation. Experiments using the standard SENSEVAL-2 lexical-sample data sets in four languages (English, Spanish, Swedish and Basque) demonstrate that the combination system obtains a significantly lower error rate than other systems participating in the SENSEVAL-2 exercise, yielding state-of-the-art performance on these data sets.
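A minimal sketch of one simple combination scheme, weighted voting, assuming each classifier in the pool exposes a function from an instance to a sense label and a held-out accuracy used as its weight; this illustrates the general idea rather than the article's specific combination algorithms.

    from collections import defaultdict

    def combine_by_weighted_vote(classifiers, weights, instance):
        # Each classifier casts a vote for a sense; votes are weighted and the
        # highest-scoring sense wins.
        scores = defaultdict(float)
        for clf, w in zip(classifiers, weights):
            scores[clf(instance)] += w
        return max(scores, key=scores.get)

    # Toy usage with stand-in classifiers; real members of the pool would be the
    # Naive Bayes, Cosine, Decision List, TBL and MMVC classifiers over the
    # shared feature space.
    pool = [lambda x: "bank/river", lambda x: "bank/finance", lambda x: "bank/finance"]
    held_out_accuracy = [0.55, 0.70, 0.65]
    print(combine_by_weighted_vote(pool, held_out_accuracy, {"context": "..."}))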
This paper presents a novel approach to word sense disambiguation. The underlying algorithm has two main components: (1) pattern learning from available sense-tagged corpora (SemCor), from dictionary definitions (WordNet) and from a generated corpus (GenCor); and (2) instance-based learning with automatic feature selection when training data are available for a particular word. The ideas described in this paper were implemented in a system that achieves excellent performance on the data provided during the SENSEVAL-2 evaluation exercise, for both the English all-words and English lexical-sample tasks.
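The second component can be pictured with a small sketch, assuming a 1-nearest-neighbour classifier and greedy forward feature selection scored by leave-one-out accuracy; the paper's actual implementation and feature set are not reproduced here.

    import numpy as np

    def one_nn_loo_accuracy(X, y, feats):
        # Leave-one-out accuracy of a 1-NN classifier restricted to the
        # feature subset `feats` (Manhattan/overlap distance).
        Xs = X[:, feats]
        correct = 0
        for i in range(len(y)):
            d = np.abs(Xs - Xs[i]).sum(axis=1).astype(float)
            d[i] = np.inf                 # exclude the instance itself
            correct += int(y[int(d.argmin())] == y[i])
        return correct / len(y)

    def greedy_feature_selection(X, y):
        # Add one feature at a time, keeping it only if accuracy improves.
        selected, remaining, best = [], list(range(X.shape[1])), 0.0
        while remaining:
            acc, f = max((one_nn_loo_accuracy(X, y, selected + [f]), f)
                         for f in remaining)
            if acc <= best:
                break
            best, selected = acc, selected + [f]
            remaining.remove(f)
        return selected, best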
Various Machine Learning (ML) approaches have been demonstrated to produce relatively successful Word Sense Disambiguation (WSD) systems. There remain unexplained differences among the performance measurements of different algorithms, so further investigation into which algorithm has the right ‘bias’ for this task is warranted. In this paper, we show that this is not easy to accomplish, owing to intricate interactions between information sources, parameter settings, and properties of the training data. We investigate the impact of parameter optimization on generalization accuracy in a memory-based learning approach to English and Dutch WSD. A ‘word-expert’ architecture was adopted, yielding a set of classifiers, each specialized in a single word form. The experts consist of multiple memory-based learning classifiers, each taking different information sources as input, combined in a voting scheme. We optimized the architectural and parametric settings for each individual word-expert by performing cross-validation experiments on the learning material. The results of these experiments show that varying both the algorithmic parameters and the information sources available to the classifiers leads to large fluctuations in accuracy. We demonstrate that optimization per word-expert leads to an overall significant improvement in the generalization accuracies of the produced WSD systems.
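A minimal sketch of per-word-expert parameter optimization, assuming a caller-supplied cross-validation scorer and a hypothetical parameter grid; the actual memory-based learner, feature sets and voting scheme used in the paper are not reproduced.

    from itertools import product
    from statistics import mean

    def optimize_word_expert(cv_score, examples, param_grid, n_folds=10):
        # Exhaustive search over the grid; cv_score(examples, params, fold) is
        # assumed to train on all folds except `fold` and return held-out accuracy.
        best_params, best_score = None, float("-inf")
        for values in product(*param_grid.values()):
            params = dict(zip(param_grid, values))
            score = mean(cv_score(examples, params, fold) for fold in range(n_folds))
            if score > best_score:
                best_params, best_score = params, score
        return best_params, best_score

    # Hypothetical grid for one word-expert's memory-based classifier.
    grid = {"k": [1, 3, 7, 15],
            "metric": ["overlap", "mvdm"],
            "feature_weighting": ["none", "gain_ratio"]}
    best_params, best_acc = optimize_word_expert(lambda ex, p, fold: 1.0 / p["k"],
                                                 [], grid)
    print(best_params, best_acc)   # stand-in scorer just favours small k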
This paper presents a comprehensive empirical exploration and evaluation of a diverse range of data characteristics which influence word sense disambiguation performance. It focuses on a set of six core supervised algorithms, including three variants of Bayesian classifiers, a cosine model, non-hierarchical decision lists, and an extension of the transformation-based learning model. Performance is investigated in detail with respect to the following parameters: (a) target language (English, Spanish, Swedish and Basque); (b) part of speech; (c) sense granularity; (d) inclusion and exclusion of major feature classes; (e) variable context width (further broken down by part of speech of the keyword); (f) number of training examples; (g) baseline probability of the most likely sense; (h) sense distributional entropy; (i) number of senses per keyword; (j) divergence between training and test data; (k) degree of (artificially introduced) noise in the training data; (l) the effectiveness of an algorithm's confidence rankings; and (m) a full keyword breakdown of the performance of each algorithm. The paper concludes with a brief analysis of similarities, differences, strengths and weaknesses of the algorithms and a hierarchical clustering of these algorithms based on agreement of sense classification behavior. Collectively, the paper constitutes the most comprehensive survey of evaluation measures and tests yet applied to sense disambiguation algorithms, and it does so over a diverse range of supervised algorithms, languages and parameter spaces within a single unified experimental framework.
Has system performance on Word Sense Disambiguation (WSD) reached a limit? Automatic systems do not perform nearly as well as humans on the task, and, judging from the results of the SENSEVAL exercises, recent improvements in system performance appear negligible or even negative. Still, systems do perform much better than the baselines, so something is being done right. System evaluation is crucial to explain these results and to show the way forward. Indeed, the success of any project in WSD is tied to the evaluation methodology used, and especially to the formalization of the task that the systems perform. The evaluation of WSD has turned out to be as difficult as designing the systems in the first place.
The aim of our paper is twofold: to offer some general reflections on the task of lexical semantic annotation and on the adequacy of existing lexical-semantic reference resources, and to give an overall description of the Italian lexical sample task for the SENSEVAL-2 experiment. We suggest how the SENSEVAL exercise (and a comparison between the two editions of the experiment) can be employed to evaluate the lexical reference resources used for annotation. We conclude with a few general remarks on the gap between the lexicon, a partially decontextualised object, and the corpus, where context plays a significant role.
This paper explores the role of domain information in word sense disambiguation. The underlying hypothesis is that domain labels, such as MEDICINE, ARCHITECTURE and SPORT, provide a useful way to establish semantic relations among word senses, which can be profitably used during the disambiguation process. Results obtained in the SENSEVAL-2 initiative confirm that, for a significant subset of words, domain information can be used to disambiguate with a very high level of precision.
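A minimal sketch of the underlying idea, assuming a domain-annotated lexicon mapping context words to domain labels; the actual resources and scoring used in the SENSEVAL-2 system are not reproduced.

    from collections import Counter

    def disambiguate_by_domain(sense_domains, context_words, word_domains):
        # sense_domains: {sense: domain label, e.g. "MEDICINE"}
        # word_domains:  {context word: domain label}, from a domain-annotated lexicon
        context_counts = Counter(word_domains[w] for w in context_words
                                 if w in word_domains)
        scores = {s: context_counts.get(d, 0) for s, d in sense_domains.items()}
        return max(scores, key=scores.get)

    # Toy example with hypothetical sense and domain labels.
    senses = {"operation#surgery": "MEDICINE", "operation#maths": "MATHEMATICS"}
    lexicon = {"hospital": "MEDICINE", "surgeon": "MEDICINE", "theorem": "MATHEMATICS"}
    print(disambiguate_by_domain(senses,
                                 ["the", "surgeon", "left", "the", "hospital"],
                                 lexicon))   # -> operation#surgery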
Gale, as he liked to be called by his friends and family, had extremely broad interests, both professionally and otherwise. His professional career at Bell Labs included radio astronomy, economics, statistics and computational linguistics.
We extend the proof-irrelevant model defined in Smith (1988) to the whole of Martin-Löf's logical framework. The main difference here is the existence of a type whose objects themselves represent types rather than proof-objects. This means that the model must now be able to distinguish between objects with different degrees of relevance: those that denote proofs are irrelevant, whereas those that denote types are not. In fact, a whole hierarchy of relevance exists.
Another difference is the higher level of detail in the formulation of the formal theory, such as the explicit manipulation of contexts and substitutions. This demands an equally detailed definition of the model, including the interpretation of contexts and substitutions.
We are thus led to a complete reformulation of the proof-irrelevant model. We present a model that is built up from an arbitrary model of the untyped lambda calculus. We also show how to extend it when the logical framework itself is enlarged with inductive definitions; in doing so, we introduce a variant of Church numerals.
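For orientation, the standard Church numerals are shown below in an untyped setting; the variant introduced in the paper, adapted to the proof-irrelevant model, is not reproduced here.

    # A Church numeral n encodes a natural number as the n-fold iterator of a
    # function: n = lambda f: lambda x: f(f(...f(x)...)).
    zero = lambda f: lambda x: x
    succ = lambda n: lambda f: lambda x: f(n(f)(x))

    def to_int(n):
        # Decode by counting applications of f.
        return n(lambda k: k + 1)(0)

    three = succ(succ(succ(zero)))
    print(to_int(three))   # 3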
As in Smith (1988), the model can only be defined in the absence of universes; it is useful for obtaining an elementary proof of consistency and for proving the independence of Peano's fourth axiom.
Mixins are modules that may contain deferred components, that is, components not defined in the module itself; moreover, in contrast to parameterised modules (like ML functors), they can be mutually dependent and allow their definitions to be overridden. In a preceding paper we defined a syntax and denotational semantics for a kernel language of mixin modules. Here, we take instead an axiomatic approach, giving a set of algebraic laws expressing the expected properties of a small set of primitive operators on mixins. Interpreting the axioms as rewriting rules, we obtain a reduction semantics for the language and prove the existence of normal forms. Moreover, we show that the model defined in the earlier paper satisfies the given axiomatisation.
We show that Friedman's proof of the existence of non-trivial βη-complete models of λ→ can be extended to System F. We isolate a set of conditions that are sufficient to ensure βη-completeness for a model of F (and α-completeness at the level of types), and we discuss which class of models we obtain. In particular, the model introduced in Barbanera and Berardi (1997), having as polymorphic maps exactly all possible Scott-continuous maps, is βη-complete, and is hence the first known complete non-syntactic model of F. In order to have a suitable framework in which to express the conditions and develop the proof, we also introduce the very natural notion of ‘polymax models’ of System F.
This paper introduces a temporal logic for coalgebras. Nexttime and lasttime operators are defined for a coalgebra, acting on predicates on the state space. They give rise to what is called a Galois algebra. Galois algebras form models of temporal logics like Linear Temporal Logic (LTL) and Computation Tree Logic (CTL). The mapping from coalgebras to Galois algebras turns out to be functorial, yielding indexed categorical structures. This construction gives many examples for coalgebras of polynomial functors on sets. More generally, it is shown how ‘fuzzy’ predicates on metric spaces, and predicates on presheaves, yield indexed Galois algebras in essentially the same coalgebraic manner.
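A minimal sketch of the simplest instance, assuming a coalgebra of the (finite) powerset functor, i.e. a transition system, with predicates read as subsets of the state space; the nexttime operator collects the states all of whose successors satisfy a predicate, and lasttime is its left (Galois) adjoint. The paper's general construction over arbitrary polynomial functors is not reproduced.

    def nexttime(succ, states, P):
        # States all of whose one-step successors lie in P.
        return {x for x in states if succ(x) <= P}

    def lasttime(succ, states, P):
        # States reachable in one step from some state in P (left adjoint of
        # nexttime: lasttime(P) <= Q iff P <= nexttime(Q)).
        return {x for x in states if any(x in succ(y) for y in P)}

    # Toy transition system: 0 -> 1 -> 2, 2 -> 2.
    succs = {0: {1}, 1: {2}, 2: {2}}
    states = {0, 1, 2}
    print(nexttime(succs.get, states, {2}))   # {1, 2}
    print(lasttime(succs.get, states, {0}))   # {1}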
This paper presents a simple implementation of the λ-calculus in the interaction net paradigm. It is based on a twofold translation. λ-terms are coded (for duplication) or decoded (for execution), and reduction is achieved by switching between these two states: decoding corresponds to head reduction and encoding to left reduction.
There are two main approaches to obtaining ‘topological’ cartesian-closed categories. Under one approach, one restricts to a full subcategory of topological spaces that happens to be cartesian closed – for example, the category of sequential spaces. Under the other, one generalises the notion of space – for example, to Scott's notion of equilogical space. In this paper, we show that the two approaches are equivalent for a large class of objects. We first observe that the category of countably based equilogical spaces has, in a precisely defined sense, a largest full subcategory that can be simultaneously viewed as a full subcategory of topological spaces. In fact, this category turns out to be equivalent to the category of all quotient spaces of countably based topological spaces. We show that the category is bicartesian closed with its structure inherited, on the one hand, from the category of sequential spaces, and, on the other, from the category of equilogical spaces. We also show that the category of countably based equilogical spaces has a larger full subcategory that can be simultaneously viewed as a full subcategory of limit spaces. This full subcategory is locally cartesian closed and the embeddings into limit spaces and countably based equilogical spaces preserve this structure. We observe that it seems essential to go beyond the realm of topological spaces to achieve this result.
This overview of Fortran 90 (F90) features is presented as a series of tables that illustrate the syntax and abilities of F90. Frequently, comparisons are made with similar features in the C++ and F77 languages and the Matlab environment.
These tables show that F90 has significant improvements over F77 and matches or exceeds newer software capabilities found in C++ and Matlab for dynamic memory management, user-defined data structures, matrix operations, operator definition and overloading, intrinsics for vector and parallel processors and the basic requirements for object-oriented programming.
They are intended to serve as a condensed quick-reference guide for programming in F90 and for understanding programs developed by others.
The programming process is similar in approach and creativity to writing a paper. In composition, you are writing to express ideas; in programming, you are expressing a computation. Both the programmer and the writer must adhere to the syntactic rules (grammar) of a particular language. In prose, the fundamental idea-expressing unit is the sentence; in programming, two units – statements and comments – are available.
Composition, from technical prose to fiction, should be organized broadly, usually through an outline. The outline should be expanded as the detail is elaborated and the whole reexamined and reorganized when structural or creative flaws arise. Once the outline settles, you begin the actual composition process using sentences to weave the fabric your outline expresses. Clarity in writing occurs when your sentences, both internally and globally, communicate the outline succinctly and clearly. We stress this approach here with the aim of developing a programming style that produces efficient programs humans can easily understand.
To a great extent, no matter which language you choose for your composition, the idea can be expressed with the same degree of clarity. Some subtleties can be better expressed in one language than another, but the fundamental reason for choosing your language is your audience: people do not know many languages, and if you want to address the American population, you had better choose English over Swahili.