During the discussion of relevance feedback in Section 9.1.2, we observed that if we have some known relevant and nonrelevant documents, then we can straightforwardly start to estimate the probability of a term t appearing in a relevant document P(t|R = 1), and that this could be the basis of a classifier that decides whether documents are relevant or not. In this chapter, we more systematically introduce this probabilistic approach to information retrieval (IR), which provides a different formal basis for a retrieval model and results in different techniques for setting term weights.
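The estimate mentioned above can be sketched in a few lines; the add-half smoothing and the representation of documents as term sets are illustrative assumptions, not prescribed by the text:

```python
def estimate_term_relevance(term, relevant_docs):
    """Estimate P(t | R = 1): the probability that term t appears in a
    relevant document, from a sample of known relevant documents.
    Each document is represented as a set of terms; add-half (Jeffreys)
    smoothing avoids zero probabilities for terms absent from the sample."""
    containing = sum(1 for doc in relevant_docs if term in doc)
    return (containing + 0.5) / (len(relevant_docs) + 1)
```

A classifier could then rank documents by combining such per-term estimates, which is the direction this chapter develops.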
Users start with information needs, which they translate into query representations. Similarly, there are documents, which are converted into document representations (the latter differing at least by how text is tokenized, but perhaps containing fundamentally less information, as when a nonpositional index is used). Based on these two representations, a system tries to determine how well documents satisfy information needs. In the Boolean or vector space models of IR, matching is done in a formally defined but semantically imprecise calculus of index terms. Given only a query, an IR system has an uncertain understanding of the information need. Given the query and document representations, a system has an uncertain guess of whether a document has content relevant to the information need. Probability theory provides a principled foundation for such reasoning under uncertainty.
In 1998, Łuczak, Rödl and Szemerédi [7] proved, by means of the Regularity Lemma, that there exists n0 such that, for any n ≥ n0 and any two-edge-colouring of Kn, there exists a pair of vertex-disjoint monochromatic cycles of opposite colours covering the vertices of Kn. In this paper we make use of an alternative method of finding useful structure in a graph, leading to a proof of the same result with a much smaller value of n0. The proof gives a polynomial-time algorithm for finding the two cycles.
We use a greedy probabilistic method to prove that, for every ε > 0, every m × n Latin rectangle on n symbols has an orthogonal mate, where m = (1 − ε)n. That is, we show the existence of a second Latin rectangle such that no pair of the mn cells receives the same pair of symbols in the two rectangles.
A nonholonomic system subjected to external noise from the environment, or internal noise in its own actuators, will evolve in a stochastic manner described by an ensemble of trajectories. This ensemble of trajectories is equivalent to the solution of a Fokker–Planck equation that typically evolves on a Lie group. If the most likely state of such a system is to be estimated, and plans for subsequent motions from the current state are to be made so as to move the system to a desired state with high probability, then modeling how the probability density of the system evolves is critical. Methods for solving Fokker–Planck equations that evolve on Lie groups then become important. Such equations can be solved using the operational properties of group Fourier transforms, in which irreducible unitary representation (IUR) matrices play a critical role. Therefore, we develop a simple approach for the numerical approximation of all the IUR matrices for two of the groups of most interest in robotics: the rotation group in three-dimensional space, SO(3), and the Euclidean motion group of the plane, SE(2). This approach uses the exponential mapping from the Lie algebras of these groups, and takes advantage of the sparse nature of the Lie algebra representation matrices. Other techniques for density estimation on groups are also explored. The computed densities are applied in the context of probabilistic path planning for a kinematic cart in the plane and flexible needle steering in three-dimensional space. In these examples the injection of artificial noise into the computational models (rather than noise in the actual physical systems) serves as a tool to search the configuration spaces and plan paths. Finally, we illustrate how density estimation problems arise in the characterization of physical noise in orientational sensors such as gyroscopes.
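For SO(3), the exponential mapping referred to above has a closed form, Rodrigues' formula, which a short sketch can make concrete (pure Python, with an illustrative small-angle cutoff; the abstract's actual algorithms operate on the IUR matrices, which this does not reproduce):

```python
import math

def so3_exp(wx, wy, wz):
    """Rodrigues' formula: map an axis-angle vector (wx, wy, wz) in the
    Lie algebra so(3) to a rotation matrix in SO(3)."""
    theta = math.sqrt(wx * wx + wy * wy + wz * wz)
    # Skew-symmetric matrix K of the (unnormalised) axis vector.
    K = [[0.0, -wz,  wy],
         [ wz, 0.0, -wx],
         [-wy,  wx, 0.0]]
    I = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    if theta < 1e-12:
        # First-order approximation near the identity.
        return [[I[i][j] + K[i][j] for j in range(3)] for i in range(3)]
    a = math.sin(theta) / theta
    b = (1.0 - math.cos(theta)) / (theta * theta)
    K2 = [[sum(K[i][k] * K[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    # exp(K) = I + (sin θ / θ) K + ((1 - cos θ) / θ²) K²
    return [[I[i][j] + a * K[i][j] + b * K2[i][j] for j in range(3)]
            for i in range(3)]
```

For example, so3_exp(0, 0, π/2) gives the rotation by 90° about the z-axis.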
Both the hopcount HN (the number of links) and the weight WN (the sum of the weights on links) of the shortest path between two arbitrary nodes in the complete graph KN with i.i.d. exponential link weights are computed. We consider the joint distribution of the pair (HN, WN) and derive, after proper scaling, the joint limiting distribution. One of the results is that HN and WN, properly scaled, are asymptotically independent.
We describe how Lie-theoretical methods can be used to analyze color related problems in machine vision. The basic observation is that the nonnegative nature of spectral color signals restricts these functions to be members of a limited, conical section of the larger Hilbert space of square-integrable functions. From this observation, we conclude that the space of color signals can be equipped with a coordinate system consisting of a half-axis and a unit ball with the Lorentz groups as natural transformation group. We introduce the theory of the Lorentz group SU(1, 1) as a natural tool for analyzing color image processing problems and derive some descriptions and algorithms that are useful in the investigation of dynamical color changes. We illustrate the usage of these results by describing how to compress, interpolate, extrapolate, and compensate image sequences generated by dynamical color changes.
In this paper, a geometrical approach is developed to generate simultaneously optimal (or near-optimal) smooth paths for a set of non-holonomic robots, moving only forward in a 2D environment cluttered with static and moving obstacles. The robots' environment is represented by a 3D geometric entity called Bump-Surface, which is embedded in a 4D Euclidean space. The multi-motion planning problem (MMPP) is resolved by simultaneously finding the paths for the set of robots represented by monoparametric smooth C2 curves onto the Bump-Surface, such that their inverse images onto the initial 2D workspace satisfy the optimization motion-planning criteria and constraints. The MMPP is expressed as an optimization problem, which is solved on the Bump-Surface using a genetic algorithm. The performance of the proposed approach is tested through a considerable number of simulated 2D dynamic environments with car-like robots.
Semple and Welsh [5] introduced the concept of correlated matroids, which relate to conjectures by Grimmett and Winkler [2], and Pemantle [4], respectively, that the uniformly random forest and the uniformly random connected subgraph of a finite graph have the edge-negative-association property. In this paper, we extend results of Semple and Welsh, and show that the Grimmett and Winkler, and Pemantle conjectures are equivalent to statements about correlated graphic matroids. We also answer some open questions raised in [5] regarding correlated matroids, and in particular show that the 2-sum of correlated matroids is correlated.
For a graph G and an integer t we let mcct(G) be the smallest m such that there exists a colouring of the vertices of G by t colours with no monochromatic connected subgraph having more than m vertices. For any non-trivial minor-closed family of graphs, we show that mcc2(G) = O(n^{2/3}) for every n-vertex graph G in the family. This bound is asymptotically optimal and it is attained for planar graphs. More generally, for every such family and every fixed t we show that mcct(G) = O(n^{2/(t+1)}). On the other hand, we have examples of graphs G with no Kt+3 minor and with mcct(G) = Ω(n^{2/(2t−1)}).
It is also interesting to consider graphs of bounded degrees. Haxell, Szabó and Tardos proved mcc2(G) ≤ 20000 for every graph G of maximum degree 5. We show that there are n-vertex 7-regular graphs G with mcc2(G)=Ω(n), and more sharply, for every ϵ > 0 there exists cϵ > 0 and n-vertex graphs of maximum degree 7, average degree at most 6 + ϵ for all subgraphs, and with mcc2(G) ≥ cϵn. For 6-regular graphs it is known only that the maximum order of magnitude of mcc2 is between and n.
We also offer a Ramsey-theoretic perspective of the quantity mcct(G).
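To fix the definition of mcct(G), it can be computed by brute force on small graphs; this exponential-time sketch is purely illustrative and is not an algorithm from the paper:

```python
from itertools import product

def mcc(adj, t):
    """Brute-force mcc_t(G): the minimum, over all t-colourings of the
    vertices, of the largest monochromatic connected component.
    adj is an adjacency list; runtime is O(t^n · poly(n))."""
    n = len(adj)
    best = n
    for colouring in product(range(t), repeat=n):
        seen = [False] * n
        worst = 0  # largest monochromatic component under this colouring
        for s in range(n):
            if seen[s]:
                continue
            # DFS restricted to the colour class of s.
            stack, comp = [s], 0
            seen[s] = True
            while stack:
                v = stack.pop()
                comp += 1
                for u in adj[v]:
                    if not seen[u] and colouring[u] == colouring[s]:
                        seen[u] = True
                        stack.append(u)
            worst = max(worst, comp)
        best = min(best, worst)
    return best
```

For the path on four vertices, an alternating 2-colouring leaves only singleton monochromatic components, so mcc2 = 1; any 2-colouring of a triangle puts two adjacent vertices in the same class, giving mcc2 = 2.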
Motor algebra, a 4D degenerate geometric algebra, offers a rigorous yet simple representation of the 3D velocity of a rigid body. Using this representation, we study 3D extended arm pointing and reaching movements. We analyze the choice of arm orientation about the vector connecting the shoulder and the wrist, in cases for which this orientation is not prescribed by the task. Our findings show that the changes in this orientation throughout the movement were very small, possibly indicating an underlying motion planning strategy. We additionally examine the decomposition of movements into submovements and reconstruct the motion by assuming superposition of the velocity profiles of the underlying submovements, analyzing both the translational and rotational components of the 3D spatial velocity. This movement decomposition method reveals a larger number of submovements than is found using previously applied submovement extraction methods that are based only on the analysis of the hand tangential velocity. The reconstructed velocity profiles and final orientations are relatively close to the actual values, indicating that single-axis submovements may be the basic building blocks underlying 3D movement construction.
This work examines the Cooperative Hunters problem, where a swarm of unmanned air vehicles (UAVs) is used for searching one or more “evading targets,” which are moving in a predefined area while trying to avoid detection by the swarm. By arranging themselves into efficient geometric flight configurations, the UAVs optimize their integrated sensing capabilities, enabling the search of a maximal territory.
Lexical semantic classes of verbs play an important role in structuring complex predicate information in a lexicon, thereby avoiding redundancy and enabling generalizations across semantically similar verbs with respect to their usage. Such classes, however, require many person-years of expert effort to create manually, and methods are needed for automatically assigning verbs to appropriate classes. In this work, we develop and evaluate a feature space to support the automatic assignment of verbs into a well-known lexical semantic classification that is frequently used in natural language processing. The feature space is general – applicable to any class distinctions within the target classification; broad – tapping into a variety of semantic features of the classes; and inexpensive – requiring no more than a POS tagger and chunker. We perform experiments using support vector machines (SVMs) with the proposed feature space, demonstrating a reduction in error rate ranging from 48% to 88% over a chance baseline accuracy, across classification tasks of varying difficulty. In particular, we attain performance comparable to or better than that of feature sets manually selected for the particular tasks. Our results show that the approach is generally applicable, and reduces the need for resource-intensive linguistic analysis for each new classification task. We also perform a wide range of experiments to determine the most informative features in the feature space, finding that simple, easily extractable features suffice for good verb classification performance.
Being able to identify which rhetorical relations (e.g., contrast or explanation) hold between spans of text is important for many natural language processing applications. Using machine learning to obtain a classifier which can distinguish between different relations typically depends on the availability of manually labelled training data, which is very time-consuming to create. However, rhetorical relations are sometimes lexically marked, i.e., signalled by discourse markers (e.g., because, but, consequently etc.), and it has been suggested (Marcu and Echihabi, 2002) that the presence of these cues in some examples can be exploited to label them automatically with the corresponding relation. The discourse markers are then removed and the automatically labelled data are used to train a classifier to determine relations even when no discourse marker is present (based on other linguistic cues such as word co-occurrences). In this paper, we investigate empirically how feasible this approach is. In particular, we test whether automatically labelled, lexically marked examples are really suitable training material for classifiers that are then applied to unmarked examples. Our results suggest that training on this type of data may not be such a good strategy, as models trained in this way do not seem to generalise very well to unmarked data. Furthermore, we found some evidence that this behaviour is largely independent of the classifiers used and seems to lie in the data itself (e.g., marked and unmarked examples may be too dissimilar linguistically and removing unambiguous markers in the automatic labelling process may lead to a meaning shift in the examples).
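The labelling-then-removal step described above could be sketched as follows; the three-entry marker map and the function name are hypothetical stand-ins for the much larger cue inventories used in this line of work:

```python
import re

# Hypothetical marker-to-relation mapping; real systems use larger,
# curated lists of discourse cues.
MARKERS = {
    "because": "explanation",
    "but": "contrast",
    "consequently": "result",
}

def auto_label(sentence):
    """Label a sentence by its discourse marker, then remove the marker,
    mimicking the training-data construction of Marcu and Echihabi (2002).
    Returns (relation, sentence_without_marker), or (None, sentence)
    if no marker is present."""
    for marker, relation in MARKERS.items():
        pattern = r"\b" + marker + r"\b\s*"
        if re.search(pattern, sentence, flags=re.IGNORECASE):
            stripped = re.sub(pattern, "", sentence, count=1,
                              flags=re.IGNORECASE)
            return relation, stripped.strip()
    return None, sentence
```

The paper's finding is precisely that classifiers trained on the output of such a procedure may not transfer well to examples that never contained a marker.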
This paper presents a robust parsing algorithm and semantic formalism for the interpretation of utterances in spoken negotiative dialogue with databases. The algorithm works in two passes: a domain-specific pattern-matching phase and a domain-independent semantic analysis phase. Robustness is achieved by limiting the set of representable utterance types to an empirically motivated subclass which is more expressive than propositional slot–value lists, but much less expressive than first-order logic. Our evaluation shows that in actual practice the vast majority of utterances that occur can be handled, and that the parsing algorithm is highly efficient and accurate.
Given a set of s points and a set of n^2 lines in three-dimensional Euclidean space such that each line is incident to n points but no n lines are coplanar, we show that s = Ω(n^{11/4}). This is the first non-trivial answer to a question recently posed by Jean Bourgain.
Let the random graph Rn be drawn uniformly at random from the set of all simple planar graphs on n labelled vertices. We see that with high probability the maximum degree of Rn is Θ(ln n). We consider also the maximum size of a face and the maximum increase in the number of components on deleting a vertex. These results extend to graphs embeddable on any fixed surface.
For a fixed ρ ∈ [0, 1], what is (asymptotically) the minimal possible density g3(ρ) of triangles in a graph with edge density ρ? We completely solve this problem by proving an exact formula for g3(ρ), given piecewise in terms of the integer parameter determined by the interval in which ρ lies.
Existing penalty-based haptic rendering approaches are based on penetration depth estimation in a strictly translational sense and cannot properly take object rotation into account. We propose a new six-degree-of-freedom (6-DOF) haptic rendering algorithm which is based on determining the closest-point projection of the inadmissible configuration onto the set of admissible configurations. Energy is used to define a metric on the configuration space. Once the projection is found, the 6-DOF wrench can be computed from the generalized penetration depth. The space is locally represented with exponential coordinates to make the algorithm more efficient. Examples compare the proposed algorithm with the existing approaches and show its advantages.
The research described in this paper concerns robot indoor navigation, emphasizing the aspects of sensor modelling and calibration, environment representation, and self-localization. The main point is that by combining all of these aspects, an effective navigation system is obtained. We present a model of the catadioptric image formation process. Our model simplifies the operations needed in the catadioptric image process. Once we know the model of the catadioptric sensor, we have to calibrate it with respect to the other sensors of the robot, in order to be able to fuse their information. When the sensors are mounted on a robot arm, we can use a hand-eye calibration algorithm to calibrate them. In our case the sensors are mounted on a mobile robot that moves over a flat floor, so the sensors have fewer degrees of freedom. For this reason we develop a calibration algorithm for sensors mounted on a mobile robot. Finally, combining all the previous results with a scan-matching algorithm that we develop, we build 3D maps of the environment. These maps are used for the self-localization of the robot and to carry out path-following tasks. In this work we present experiments which show the effectiveness of the proposed algorithms.
Large alphabet languages such as Chinese are very different from English, and therefore present different problems for text compression. In this article, we first examine the characteristics of Chinese, then we introduce a new variant of the Prediction by Partial Match (PPM) model especially for Chinese characters. Unlike the traditional PPM coding schemes, which encode an escape probability when a novel character occurs in the context, the new coding scheme directly encodes the order before encoding a symbol, without having to output an escape probability. This scheme achieves excellent compression rates in comparison with other schemes on a variety of Chinese text files.
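The order-first idea can be sketched without the arithmetic-coding machinery: for each character, emit the length of the longest context in which that character has already been observed. This is a hypothetical simplification; a real PPM coder also maintains frequency counts and entropy-codes both the order and the symbol:

```python
from collections import defaultdict

def encode_orders(text, max_order=2):
    """For each character, record the length of the longest preceding
    context in which that character has already been seen (-1 if the
    character is novel at every order), then update the context models.
    Only the chosen orders are returned, to illustrate the scheme."""
    seen = defaultdict(set)  # context string -> characters seen after it
    out = []
    for i, ch in enumerate(text):
        order = -1
        # Try the longest available context first, then shorter ones.
        for k in range(min(i, max_order), -1, -1):
            ctx = text[i - k:i]
            if ch in seen[ctx]:
                order = k
                break
        out.append((order, ch))
        # Update the models at every order up to max_order.
        for k in range(min(i, max_order) + 1):
            seen[text[i - k:i]].add(ch)
    return out
```

On "abab", the first two characters are novel everywhere (order −1), the second 'a' is found only in the empty context (order 0), and the second 'b' is predicted by the order-1 context "a".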