During the discussion of relevance feedback in Section 9.1.2, we observed that if we have some known relevant and nonrelevant documents, then we can straightforwardly start to estimate the probability of a term t appearing in a relevant document P(t|R = 1), and that this could be the basis of a classifier that decides whether documents are relevant or not. In this chapter, we more systematically introduce this probabilistic approach to information retrieval (IR), which provides a different formal basis for a retrieval model and results in different techniques for setting term weights.
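The estimate mentioned above can be sketched in a few lines; the add-half smoothing and the representation of documents as term sets are illustrative assumptions, not prescribed by the text:

```python
def estimate_term_relevance(term, relevant_docs):
    """Estimate P(t | R = 1): the probability that term t appears in a
    relevant document, from a sample of known relevant documents.
    Each document is represented as a set of terms; add-half (Jeffreys)
    smoothing avoids zero probabilities for terms absent from the sample."""
    containing = sum(1 for doc in relevant_docs if term in doc)
    return (containing + 0.5) / (len(relevant_docs) + 1)
```

A classifier could then rank documents by combining such per-term estimates, which is the direction this chapter develops.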
Users start with information needs, which they translate into query representations. Similarly, there are documents, which are converted into document representations (the latter differing at least by how text is tokenized, but perhaps containing fundamentally less information, as when a nonpositional index is used). Based on these two representations, a system tries to determine how well documents satisfy information needs. In the Boolean or vector space models of IR, matching is done in a formally defined but semantically imprecise calculus of index terms. Given only a query, an IR system has an uncertain understanding of the information need. Given the query and document representations, a system has an uncertain guess of whether a document has content relevant to the information need. Probability theory provides a principled foundation for such reasoning under uncertainty.
In 1998, Łuczak, Rödl and Szemerédi [7] proved, by means of the Regularity Lemma, that there exists n0 such that, for any n ≥ n0 and any two-edge-colouring of Kn, there exists a pair of vertex-disjoint monochromatic cycles of opposite colours covering the vertices of Kn. In this paper we make use of an alternative method of finding useful structure in a graph, leading to a proof of the same result with a much smaller value of n0. The proof gives a polynomial-time algorithm for finding the two cycles.
We use a greedy probabilistic method to prove that, for every ε > 0, every m × n Latin rectangle on n symbols has an orthogonal mate, where m = (1 − ε)n. That is, we show the existence of a second Latin rectangle such that no pair of the mn cells receives the same pair of symbols in the two rectangles.
A nonholonomic system subjected to external noise from the environment, or internal noise in its own actuators, will evolve in a stochastic manner described by an ensemble of trajectories. This ensemble of trajectories is equivalent to the solution of a Fokker–Planck equation that typically evolves on a Lie group. If the most likely state of such a system is to be estimated, and plans for subsequent motions from the current state are to be made so as to move the system to a desired state with high probability, then modeling how the probability density of the system evolves is critical. Methods for solving Fokker–Planck equations that evolve on Lie groups then become important. Such equations can be solved using the operational properties of group Fourier transforms, in which irreducible unitary representation (IUR) matrices play a critical role. Therefore, we develop a simple approach for the numerical approximation of all the IUR matrices for two of the groups of most interest in robotics: the rotation group in three-dimensional space, SO(3), and the Euclidean motion group of the plane, SE(2). This approach uses the exponential mapping from the Lie algebras of these groups, and takes advantage of the sparse nature of the Lie algebra representation matrices. Other techniques for density estimation on groups are also explored. The computed densities are applied in the context of probabilistic path planning for a kinematic cart in the plane and flexible needle steering in three-dimensional space. In these examples the injection of artificial noise into the computational models (rather than noise in the actual physical systems) serves as a tool to search the configuration spaces and plan paths. Finally, we illustrate how density estimation problems arise in the characterization of physical noise in orientational sensors such as gyroscopes.
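For SO(3), the exponential mapping referred to above has a closed form, Rodrigues' formula, which a short sketch can make concrete (pure Python, with an illustrative small-angle cutoff; the abstract's actual algorithms operate on the IUR matrices, which this does not reproduce):

```python
import math

def so3_exp(wx, wy, wz):
    """Rodrigues' formula: map an axis-angle vector (wx, wy, wz) in the
    Lie algebra so(3) to a rotation matrix in SO(3)."""
    theta = math.sqrt(wx * wx + wy * wy + wz * wz)
    # Skew-symmetric matrix K of the (unnormalised) axis vector.
    K = [[0.0, -wz,  wy],
         [ wz, 0.0, -wx],
         [-wy,  wx, 0.0]]
    I = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    if theta < 1e-12:
        # First-order approximation near the identity.
        return [[I[i][j] + K[i][j] for j in range(3)] for i in range(3)]
    a = math.sin(theta) / theta
    b = (1.0 - math.cos(theta)) / (theta * theta)
    K2 = [[sum(K[i][k] * K[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    # exp(K) = I + (sin θ / θ) K + ((1 - cos θ) / θ²) K²
    return [[I[i][j] + a * K[i][j] + b * K2[i][j] for j in range(3)]
            for i in range(3)]
```

For example, so3_exp(0, 0, π/2) gives the rotation by 90° about the z-axis.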
Both the hopcount HN (the number of links) and the weight WN (the sum of the weights on links) of the shortest path between two arbitrary nodes in the complete graph KN with i.i.d. exponential link weights are computed. We consider the joint distribution of the pair (HN, WN) and derive, after proper scaling, the joint limiting distribution. One of the results is that HN and WN, properly scaled, are asymptotically independent.
We describe how Lie-theoretical methods can be used to analyze color related problems in machine vision. The basic observation is that the nonnegative nature of spectral color signals restricts these functions to be members of a limited, conical section of the larger Hilbert space of square-integrable functions. From this observation, we conclude that the space of color signals can be equipped with a coordinate system consisting of a half-axis and a unit ball with the Lorentz groups as natural transformation group. We introduce the theory of the Lorentz group SU(1, 1) as a natural tool for analyzing color image processing problems and derive some descriptions and algorithms that are useful in the investigation of dynamical color changes. We illustrate the usage of these results by describing how to compress, interpolate, extrapolate, and compensate image sequences generated by dynamical color changes.
In this paper, a geometrical approach is developed to generate simultaneously optimal (or near-optimal) smooth paths for a set of non-holonomic robots, moving only forward in a 2D environment cluttered with static and moving obstacles. The robots' environment is represented by a 3D geometric entity called Bump-Surface, which is embedded in a 4D Euclidean space. The multi-motion planning problem (MMPP) is resolved by simultaneously finding the paths for the set of robots represented by monoparametric smooth C2 curves onto the Bump-Surface, such that their inverse images onto the initial 2D workspace satisfy the optimization motion-planning criteria and constraints. The MMPP is expressed as an optimization problem, which is solved on the Bump-Surface using a genetic algorithm. The performance of the proposed approach is tested through a considerable number of simulated 2D dynamic environments with car-like robots.
Semple and Welsh [5] introduced the concept of correlated matroids, which relate to conjectures by Grimmett and Winkler [2], and Pemantle [4], respectively, that the uniformly random forest and the uniformly random connected subgraph of a finite graph have the edge-negative-association property. In this paper, we extend results of Semple and Welsh, and show that the Grimmett and Winkler, and Pemantle conjectures are equivalent to statements about correlated graphic matroids. We also answer some open questions raised in [5] regarding correlated matroids, and in particular show that the 2-sum of correlated matroids is correlated.
For a graph G and an integer t we let mcct(G) be the smallest m such that there exists a colouring of the vertices of G by t colours with no monochromatic connected subgraph having more than m vertices. For any non-trivial minor-closed family of graphs, we show that mcc2(G) = O(n^{2/3}) for every n-vertex graph G in the family. This bound is asymptotically optimal and it is attained for planar graphs. More generally, for every such family and every fixed t we show that mcct(G) = O(n^{2/(t+1)}). On the other hand, we have examples of graphs G with no Kt+3 minor and with mcct(G) = Ω(n^{2/(2t−1)}).
It is also interesting to consider graphs of bounded degrees. Haxell, Szabó and Tardos proved mcc2(G) ≤ 20000 for every graph G of maximum degree 5. We show that there are n-vertex 7-regular graphs G with mcc2(G)=Ω(n), and more sharply, for every ϵ > 0 there exists cϵ > 0 and n-vertex graphs of maximum degree 7, average degree at most 6 + ϵ for all subgraphs, and with mcc2(G) ≥ cϵn. For 6-regular graphs it is known only that the maximum order of magnitude of mcc2 is between and n.
We also offer a Ramsey-theoretic perspective of the quantity mcct(G).
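To fix the definition of mcct(G), it can be computed by brute force on small graphs; this exponential-time sketch is purely illustrative and is not an algorithm from the paper:

```python
from itertools import product

def mcc(adj, t):
    """Brute-force mcc_t(G): the minimum, over all t-colourings of the
    vertices, of the largest monochromatic connected component.
    adj is an adjacency list; runtime is O(t^n · poly(n))."""
    n = len(adj)
    best = n
    for colouring in product(range(t), repeat=n):
        seen = [False] * n
        worst = 0  # largest monochromatic component under this colouring
        for s in range(n):
            if seen[s]:
                continue
            # DFS restricted to the colour class of s.
            stack, comp = [s], 0
            seen[s] = True
            while stack:
                v = stack.pop()
                comp += 1
                for u in adj[v]:
                    if not seen[u] and colouring[u] == colouring[s]:
                        seen[u] = True
                        stack.append(u)
            worst = max(worst, comp)
        best = min(best, worst)
    return best
```

For the path on four vertices, an alternating 2-colouring leaves only singleton monochromatic components, so mcc2 = 1; any 2-colouring of a triangle puts two adjacent vertices in the same class, giving mcc2 = 2.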
Motor algebra, a 4D degenerate geometric algebra, offers a rigorous yet simple representation of the 3D velocity of a rigid body. Using this representation, we study 3D extended arm pointing and reaching movements. We analyze the choice of arm orientation about the vector connecting the shoulder and the wrist, in cases for which this orientation is not prescribed by the task. Our findings show that the changes in this orientation throughout the movement were very small, possibly indicating an underlying motion planning strategy. We additionally examine the decomposition of movements into submovements and reconstruct the motion by assuming superposition of the velocity profiles of the underlying submovements, analyzing both the translational and rotational components of the 3D spatial velocity. This movement decomposition method reveals a larger number of submovements than is found using previously applied submovement extraction methods that are based only on the analysis of the hand tangential velocity. The reconstructed velocity profiles and final orientations are relatively close to the actual values, indicating that single-axis submovements may be the basic building blocks underlying 3D movement construction.
This work examines the Cooperative Hunters problem, where a swarm of unmanned air vehicles (UAVs) is used for searching one or more “evading targets,” which are moving in a predefined area while trying to avoid detection by the swarm. By arranging themselves into efficient geometric flight configurations, the UAVs optimize their integrated sensing capabilities, enabling the search of a maximal territory.
Lexical semantic classes of verbs play an important role in structuring complex predicate information in a lexicon, thereby avoiding redundancy and enabling generalizations across semantically similar verbs with respect to their usage. Such classes, however, require many person-years of expert effort to create manually, and methods are needed for automatically assigning verbs to appropriate classes. In this work, we develop and evaluate a feature space to support the automatic assignment of verbs into a well-known lexical semantic classification that is frequently used in natural language processing. The feature space is general – applicable to any class distinctions within the target classification; broad – tapping into a variety of semantic features of the classes; and inexpensive – requiring no more than a POS tagger and chunker. We perform experiments using support vector machines (SVMs) with the proposed feature space, demonstrating a reduction in error rate ranging from 48% to 88% over a chance baseline accuracy, across classification tasks of varying difficulty. In particular, we attain performance comparable to or better than that of feature sets manually selected for the particular tasks. Our results show that the approach is generally applicable, and reduces the need for resource-intensive linguistic analysis for each new classification task. We also perform a wide range of experiments to determine the most informative features in the feature space, finding that simple, easily extractable features suffice for good verb classification performance.
Being able to identify which rhetorical relations (e.g., contrast or explanation) hold between spans of text is important for many natural language processing applications. Using machine learning to obtain a classifier which can distinguish between different relations typically depends on the availability of manually labelled training data, which is very time-consuming to create. However, rhetorical relations are sometimes lexically marked, i.e., signalled by discourse markers (e.g., because, but, consequently etc.), and it has been suggested (Marcu and Echihabi, 2002) that the presence of these cues in some examples can be exploited to label them automatically with the corresponding relation. The discourse markers are then removed and the automatically labelled data are used to train a classifier to determine relations even when no discourse marker is present (based on other linguistic cues such as word co-occurrences). In this paper, we investigate empirically how feasible this approach is. In particular, we test whether automatically labelled, lexically marked examples are really suitable training material for classifiers that are then applied to unmarked examples. Our results suggest that training on this type of data may not be such a good strategy, as models trained in this way do not seem to generalise very well to unmarked data. Furthermore, we found some evidence that this behaviour is largely independent of the classifiers used and seems to lie in the data itself (e.g., marked and unmarked examples may be too dissimilar linguistically and removing unambiguous markers in the automatic labelling process may lead to a meaning shift in the examples).
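The labelling-then-removal step described above could be sketched as follows; the three-entry marker map and the function name are hypothetical stand-ins for the much larger cue inventories used in this line of work:

```python
import re

# Hypothetical marker-to-relation mapping; real systems use larger,
# curated lists of discourse cues.
MARKERS = {
    "because": "explanation",
    "but": "contrast",
    "consequently": "result",
}

def auto_label(sentence):
    """Label a sentence by its discourse marker, then remove the marker,
    mimicking the training-data construction of Marcu and Echihabi (2002).
    Returns (relation, sentence_without_marker), or (None, sentence)
    if no marker is present."""
    for marker, relation in MARKERS.items():
        pattern = r"\b" + marker + r"\b\s*"
        if re.search(pattern, sentence, flags=re.IGNORECASE):
            stripped = re.sub(pattern, "", sentence, count=1,
                              flags=re.IGNORECASE)
            return relation, stripped.strip()
    return None, sentence
```

The paper's finding is precisely that classifiers trained on the output of such a procedure may not transfer well to examples that never contained a marker.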
This paper presents a robust parsing algorithm and semantic formalism for the interpretation of utterances in spoken negotiative dialogue with databases. The algorithm works in two passes: a domain-specific pattern-matching phase and a domain-independent semantic analysis phase. Robustness is achieved by limiting the set of representable utterance types to an empirically motivated subclass which is more expressive than propositional slot–value lists, but much less expressive than first-order logic. Our evaluation shows that in actual practice the vast majority of utterances that occur can be handled, and that the parsing algorithm is highly efficient and accurate.
Given a set of s points and a set of n^2 lines in three-dimensional Euclidean space such that each line is incident to n points but no n lines are coplanar, we show that s = Ω(n^{11/4}). This is the first non-trivial answer to a question recently posed by Jean Bourgain.
Let the random graph Rn be drawn uniformly at random from the set of all simple planar graphs on n labelled vertices. We see that with high probability the maximum degree of Rn is Θ(ln n). We consider also the maximum size of a face and the maximum increase in the number of components on deleting a vertex. These results extend to graphs embeddable on any fixed surface.
For a fixed ρ ∈ [0, 1], what is (asymptotically) the minimal possible density g3(ρ) of triangles in a graph with edge density ρ? We completely solve this problem by proving an exact formula for g3(ρ), given piecewise in terms of the integer parameter determined by the interval in which ρ lies.
Existing penalty-based haptic rendering approaches are based on penetration depth estimation in a strictly translational sense and cannot properly take object rotation into account. We propose a new six-degree-of-freedom (6-DOF) haptic rendering algorithm which is based on determining the closest-point projection of the inadmissible configuration onto the set of admissible configurations. Energy is used to define a metric on the configuration space. Once the projection is found, the 6-DOF wrench can be computed from the generalized penetration depth. The space is locally represented with exponential coordinates to make the algorithm more efficient. Examples compare the proposed algorithm with the existing approaches and show its advantages.
The research described in this paper concerns robot indoor navigation, emphasizing the aspects of sensor modelling and calibration, environment representation, and self-localization. The main point is that by combining all of these aspects, an effective navigation system is obtained. We present a model of the catadioptric image formation process. Our model simplifies the operations needed in the catadioptric image process. Once we know the model of the catadioptric sensor, we have to calibrate it with respect to the other sensors of the robot, in order to be able to fuse their information. When the sensors are mounted on a robot arm, we can use a hand-eye calibration algorithm to calibrate them. In our case the sensors are mounted on a mobile robot that moves over a flat floor, so the sensors have fewer degrees of freedom. For this reason we develop a calibration algorithm for sensors mounted on a mobile robot. Finally, combining all the previous results with a scan-matching algorithm that we develop, we build 3D maps of the environment. These maps are used for the self-localization of the robot and to carry out path-following tasks. In this work we present experiments which show the effectiveness of the proposed algorithms.
Large alphabet languages such as Chinese are very different from English, and therefore present different problems for text compression. In this article, we first examine the characteristics of Chinese, then we introduce a new variant of the Prediction by Partial Match (PPM) model especially for Chinese characters. Unlike the traditional PPM coding schemes, which encode an escape probability when a novel character occurs in the context, the new coding scheme directly encodes the order before encoding a symbol, without having to output an escape probability. This scheme achieves excellent compression rates in comparison with other schemes on a variety of Chinese text files.
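The order-first idea can be sketched without the arithmetic-coding machinery: for each character, emit the length of the longest context in which that character has already been observed. This is a hypothetical simplification; a real PPM coder also maintains frequency counts and entropy-codes both the order and the symbol:

```python
from collections import defaultdict

def encode_orders(text, max_order=2):
    """For each character, record the length of the longest preceding
    context in which that character has already been seen (-1 if the
    character is novel at every order), then update the context models.
    Only the chosen orders are returned, to illustrate the scheme."""
    seen = defaultdict(set)  # context string -> characters seen after it
    out = []
    for i, ch in enumerate(text):
        order = -1
        # Try the longest available context first, then shorter ones.
        for k in range(min(i, max_order), -1, -1):
            ctx = text[i - k:i]
            if ch in seen[ctx]:
                order = k
                break
        out.append((order, ch))
        # Update the models at every order up to max_order.
        for k in range(min(i, max_order) + 1):
            seen[text[i - k:i]].add(ch)
    return out
```

On "abab", the first two characters are novel everywhere (order −1), the second 'a' is found only in the empty context (order 0), and the second 'b' is predicted by the order-1 context "a".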