Suppose M and N are distinct matroids on a set E such that, for every e ∈ E, the deletion of e from M equals the deletion of e from N or the contraction of e from M equals the contraction of e from N. In this note we prove that, apart from some easily specified exceptions, one of M and N must be a relaxation of the other.
This paper presents a constructive proof that for any planar digraph G on p vertices, there exists a subset S of the transitive closure of G such that the number of arcs in S is less than or equal to the number of arcs in G, and such that the diameter of G ∪ S is O(α(p, p)(log p)^2). Here the diameter refers to the maximum distance from a vertex υ to a vertex w where (υ, w) is from the transitive closure of G, which is also the transitive closure of G ∪ S. This result provides support for the author's previous conjecture that such a set S achieving a diameter polylogarithmic in the number of vertices exists for any digraph. The result also addresses an open question of Chazelle, who did some related work on trees and suggested the generalization to the planar case.
We continue the study of the following general problem on the vertex colourings of graphs. Suppose that some vertices of a graph G are assigned to some colours. Can this ‘precolouring’ be extended to a proper colouring of G with at most k colours (for some given k)? Here we investigate the complexity status of precolouring extendibility on some classes of perfect graphs, giving good characterizations (necessary and sufficient conditions) that lead to algorithms with linear or polynomial running time. It is also shown how a larger subclass of perfect graphs can be derived from graphs containing no induced path on four vertices.
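As an illustration of the precolouring extension question itself, the following is a minimal brute-force sketch. It is not the method of the paper, which gives good characterizations and linear or polynomial algorithms for particular classes of perfect graphs; it is an exponential-time backtracking search. The function name `extend_precolouring`, the adjacency-dictionary input, and the convention that colours are the integers 0, …, k − 1 are all assumptions made for this example.

```python
def extend_precolouring(adj, precolour, k):
    """Try to extend `precolour` to a proper colouring with colours 0..k-1.

    adj       -- dict mapping each vertex to a list of its neighbours
    precolour -- dict mapping some vertices to fixed colours
    Returns a full colouring as a dict, or None if no extension exists.
    Hypothetical helper: plain backtracking, exponential in the worst case.
    """
    colour = dict(precolour)
    # The precolouring must use allowed colours and must itself be proper.
    for v, c in colour.items():
        if not 0 <= c < k:
            return None
        if any(colour.get(u) == c for u in adj[v]):
            return None

    uncoloured = [v for v in adj if v not in colour]

    def solve(i):
        if i == len(uncoloured):
            return True
        v = uncoloured[i]
        for c in range(k):
            # A colour is admissible if no neighbour already has it.
            if all(colour.get(u) != c for u in adj[v]):
                colour[v] = c
                if solve(i + 1):
                    return True
                del colour[v]
        return False

    return colour if solve(0) else None


# Path on four vertices: both ends precoloured 0 cannot be extended with k = 2.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
two_colours = extend_precolouring(path, {0: 0, 3: 0}, 2)
three_colours = extend_precolouring(path, {0: 0, 3: 0}, 3)
```

The path example shows why the precoloured version can be harder than ordinary colouring: the path is 2-colourable, yet this particular precolouring needs a third colour.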
This paper is an introduction to natural language interfaces to databases (NLIDBs). A brief overview of the history of NLIDBs is first given. Some advantages and disadvantages of NLIDBs are then discussed, comparing NLIDBs to formal query languages, form-based interfaces, and graphical interfaces. An introduction to some of the linguistic problems NLIDBs have to confront follows, for the benefit of readers less familiar with computational linguistics. The discussion then moves on to NLIDB architectures, portability issues, restricted natural language input systems (including menu-based NLIDBs), and NLIDBs with reasoning capabilities. Some less explored areas of NLIDB research are then presented, namely database updates, meta-knowledge questions, temporal questions, and multi-modal NLIDBs. The paper ends with reflections on the current state of the art.
Let M be a matroid and let X_r count copies of M in a random matroid of rank r. The Poisson and normal convergence of X_r are investigated under some restriction.
We consider a string editing problem in a probabilistic framework. This problem is of considerable interest to many facets of science, most notably molecular biology and computer science. A string edit transforms one string into another by performing a series of weighted edit operations of overall maximum (minimum) cost. The problem is equivalent to finding an optimal path in a weighted grid graph. In this paper we provide several results regarding the typical behaviour of such a path. In particular, we observe that the cost of the optimal path (i.e. the edit distance) is almost surely (a.s.) equal to αn for large n, where α is a constant and n is the sum of the lengths of both strings. More importantly, we show that the edit distance is well concentrated around its average value. In the so-called independent model, in which all weights (in the associated grid graph) are statistically independent, we derive some bounds for the constant α. As a by-product of our results, we also present a precise estimate of the number of alignments between two strings. To prove these findings we use techniques of random walks, diffusion limiting processes, generating functions, and the method of bounded differences.
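The equivalence between string editing and an optimal path in a grid graph can be sketched with the classical dynamic programme below. This is a minimal illustration of the minimum-cost variant only, under assumed fixed weights (`sub_cost` and `indel_cost` are hypothetical parameters); the paper's probabilistic model, in which the grid weights are random, is not reproduced here.

```python
def edit_distance(a, b, sub_cost=1, indel_cost=1):
    """Minimum-cost path from grid node (0, 0) to node (len(a), len(b)).

    Horizontal and vertical grid edges are insertions/deletions; a diagonal
    edge is a match (cost 0) or a substitution (cost sub_cost).
    The weights are illustrative assumptions, not taken from the paper.
    """
    m, n = len(a), len(b)
    # dist[i][j] = cheapest way to turn a[:i] into b[:j]
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = i * indel_cost
    for j in range(1, n + 1):
        dist[0][j] = j * indel_cost
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            diagonal = dist[i - 1][j - 1] + (0 if a[i - 1] == b[j - 1] else sub_cost)
            dist[i][j] = min(
                diagonal,
                dist[i - 1][j] + indel_cost,  # delete a[i-1]
                dist[i][j - 1] + indel_cost,  # insert b[j-1]
            )
    return dist[m][n]


print(edit_distance("kitten", "sitting"))  # prints 3
```

Replacing `min` with `max` and the costs with rewards gives the maximum-cost variant mentioned in the abstract, over the same grid.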
For a graph G with m edges, let its Range of Subgraph Sizes (RSS) be
ρ(G) = {t : G contains a vertex-induced subgraph with t edges}.
G has a full RSS if ρ(G) = {0, 1, …, m}. We establish the threshold for a random graph to have a full RSS and give tight bounds on the likely RSS of a dense random graph.
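For small graphs, the definition of ρ(G) can be checked directly by enumerating all vertex subsets. The sketch below is only an illustration of the definition (the function names are hypothetical, and the enumeration is exponential in the number of vertices); it assumes a simple graph given as a vertex list and an edge list, and it says nothing about the random-graph thresholds studied in the paper.

```python
from itertools import combinations


def range_of_subgraph_sizes(vertices, edges):
    """Return the set of edge counts over all vertex-induced subgraphs."""
    vertices = list(vertices)
    edge_set = {frozenset(e) for e in edges}
    sizes = set()
    for k in range(len(vertices) + 1):
        for subset in combinations(vertices, k):
            chosen = set(subset)
            # An edge is induced when both of its endpoints are chosen.
            sizes.add(sum(1 for e in edge_set if e <= chosen))
    return sizes


def has_full_rss(vertices, edges):
    """True when every value 0, 1, ..., m occurs as an induced edge count."""
    m = len({frozenset(e) for e in edges})
    return range_of_subgraph_sizes(vertices, edges) == set(range(m + 1))


path4 = [(0, 1), (1, 2), (2, 3)]
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(has_full_rss(range(4), path4))   # prints True
print(has_full_rss(range(4), cycle4))  # prints False: no induced subgraph has 3 edges
```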
In a set of even cardinality n, each member ranks all the others in order of preference. A stable matching is a partition of the set into n/2 pairs, with the property that no two unpaired members both prefer each other to their partners under the matching. It is known that for some problem instances no stable matching exists. In 1985, Irving found an O(n^2) two-phase algorithm that would determine, for any instance, whether a stable matching exists, and if so, would find such a matching. Recently, Tan proved that Irving's algorithm, with a modified second phase, always finds a stable cyclic partition of the member set, which is a stable matching when each cycle has length two. In this paper we study the likely behaviour of the algorithm under the assumption that an instance of the ranking system is chosen uniformly at random. We prove that the likely number of basic steps, i.e. the individual proposals in the first phase and the rotation eliminations, involving subsets of members, in the second phase, is O(n log n), and that the likely size of a rotation is O((n log n)^{1/2}). We establish a ‘hyperbola law’ analogous to our past result on stable marriages. It states that at every step of the second phase, the product of the rank of proposers and the rank of proposal holders is asymptotic, in probability, to n^3. We show that every stable cyclic partition is likely to be almost a stable matching, in the sense that at most O((n log n)^{1/2}) members can be involved in the cycles of length three or more.
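The stability condition in this definition can be made concrete with a small checker that lists the blocking pairs of a given matching. This is a sketch of the definition only, not of Irving's or Tan's algorithm; the function name `blocking_pairs` and the dictionary-based input format are assumptions made for the example.

```python
def blocking_pairs(prefs, partner):
    """List pairs of non-partners who both prefer each other to their partners.

    prefs   -- dict: member -> list of all other members, most preferred first
    partner -- dict: member -> current partner (a perfect matching)
    A matching is stable exactly when the returned list is empty.
    """
    # rank[x][y] = position of y in x's preference list (lower is better)
    rank = {x: {y: i for i, y in enumerate(lst)} for x, lst in prefs.items()}
    members = sorted(prefs)
    found = []
    for i, x in enumerate(members):
        for y in members[i + 1:]:
            if partner[x] == y:
                continue
            x_prefers_y = rank[x][y] < rank[x][partner[x]]
            y_prefers_x = rank[y][x] < rank[y][partner[y]]
            if x_prefers_y and y_prefers_x:
                found.append((x, y))
    return found


# A four-member instance in which every one of the three matchings is blocked.
prefs = {
    "A": ["B", "C", "D"],
    "B": ["C", "A", "D"],
    "C": ["A", "B", "D"],
    "D": ["A", "B", "C"],
}
matching = {"A": "B", "B": "A", "C": "D", "D": "C"}
print(blocking_pairs(prefs, matching))  # prints [('B', 'C')]
```

In this instance A, B and C prefer one another cyclically and all rank D last, so whoever is paired with D forms a blocking pair with someone else; this is the kind of instance for which no stable matching exists.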
The Portable Extendable Traffic Information Collator (POETIC) is an information extraction system that extracts traffic information from free text occurring in police incident logs and initiates (simulated) broadcasts of traffic bulletins to motorists when appropriate. POETIC is a second stage prototype system; the initial prototype (TIC, see Evans and Hartley 1990) was limited to the practices and requirements of a single police force. In POETIC, the architecture and data representations have been generalised to make the system tailorable to many different police force ‘domains’. In this paper we describe these developments, and report on tests of the system on authentic input data from three police domains.
The thickness of sparse random graphs in the model G_{n,p} is closely related to the arboricity, provided p(n) is suitably small. This allows us to identify a range of p(n) for which the thickness is approximately np/2.
Let H be a graph on h vertices, and G be a graph on n vertices. An H-factor of G is a spanning subgraph of G consisting of n/h vertex-disjoint copies of H. The fractional arboricity of H is a(H) = max{|E′|/(|V′| − 1)}, where the maximum is taken over all subgraphs (V′, E′) of H with |V′| > 1. Let δ(H) denote the minimum degree of a vertex of H. It is shown that if δ(H) < a(H), then n^{−1/a(H)} is a sharp threshold function for the property that the random graph G(n, p) contains an H-factor. That is, there are two positive constants c and C so that for p(n) = cn^{−1/a(H)} almost surely G(n, p(n)) does not have an H-factor, whereas for p(n) = Cn^{−1/a(H)}, almost surely G(n, p(n)) contains an H-factor (provided h divides n). A special case of this answers a problem of Erdős.
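For a small graph H, the quantities in the hypothesis δ(H) < a(H) can be computed by brute force. The sketch below assumes the standard definition of fractional arboricity, a(H) = max |E′|/(|V′| − 1) over subgraphs with more than one vertex; since adding edges never lowers the ratio, it is enough to look at induced subgraphs. The function names and the edge-list input are assumptions made for this example, and the enumeration is exponential in h.

```python
from fractions import Fraction
from itertools import combinations


def fractional_arboricity(vertices, edges):
    """Maximum of |E'| / (|V'| - 1) over induced subgraphs with |V'| > 1."""
    vertices = list(vertices)
    edge_set = {frozenset(e) for e in edges}
    best = Fraction(0)
    for k in range(2, len(vertices) + 1):
        for subset in combinations(vertices, k):
            chosen = set(subset)
            induced = sum(1 for e in edge_set if e <= chosen)
            best = max(best, Fraction(induced, k - 1))
    return best


def min_degree(vertices, edges):
    """Minimum vertex degree of a simple graph."""
    degree = {v: 0 for v in vertices}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return min(degree.values())


# Triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
H_vertices = [0, 1, 2, 3]
H_edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
a_H = fractional_arboricity(H_vertices, H_edges)
print(a_H, min_degree(H_vertices, H_edges))  # prints 3/2 1
```

For this H the condition δ(H) < a(H) holds (1 < 3/2), whereas for the triangle alone it fails, since δ = 2 and a = 3/2.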
Let the K_p-independence number α_p(G) of a graph G be the maximum order of an induced subgraph in G that contains no K_p. (So the K_2-independence number is just the maximum size of an independent set.) For given integers r, p, m > 0 and graphs L_1,…,L_r, we define the corresponding Turán-Ramsey function RT_p(n, L_1,…,L_r, m) to be the maximum number of edges in a graph G_n of order n such that α_p(G_n) ≤ m and there is an edge-colouring of G_n with r colours such that the jth colour class contains no copy of L_j, for j = 1,…, r. In this continuation of [11] and [12], we will investigate the problem where, instead of α(G_n) = o(n), we assume (for some fixed p > 2) the stronger condition that α_p(G_n) = o(n). The first part of the paper contains multicoloured Turán-Ramsey theorems for graphs G_n of order n with small K_p-independence number α_p(G_n). Some structure theorems are given for the case α_p(G_n) = o(n), showing that there are graphs with fairly simple structure that are within o(n^2) of the extremal size; the structure is described in terms of the edge densities between certain sets of vertices.
The second part of the paper is devoted to the case r = 1, i.e., to the problem of determining the asymptotic value of
for p < q. Several results are proved, and some other problems and conjectures are stated.
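The K_p-independence number defined in this abstract can be computed by exhaustive search on small graphs, which may help in checking examples by hand. The sketch below is only an illustration of the definition: the function name and input format are assumptions, p ≥ 2 is assumed, and the search is exponential in the number of vertices.

```python
from itertools import combinations


def kp_independence_number(vertices, edges, p):
    """Largest number of vertices inducing a subgraph with no clique on p vertices."""
    vertices = list(vertices)
    edge_set = {frozenset(e) for e in edges}

    def contains_kp(chosen):
        # Does some p-subset of `chosen` have all of its pairs adjacent?
        return any(
            all(frozenset(pair) in edge_set for pair in combinations(cand, 2))
            for cand in combinations(chosen, p)
        )

    # Try the largest vertex subsets first and stop at the first K_p-free one.
    for k in range(len(vertices), -1, -1):
        for subset in combinations(vertices, k):
            if not contains_kp(subset):
                return k
    return 0


cycle5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(kp_independence_number(range(5), cycle5, 2))  # prints 2 (ordinary independence number)
print(kp_independence_number(range(5), cycle5, 3))  # prints 5 (the 5-cycle has no triangle)
```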
For a graph H and an integer r ≥ 2, the induced r-size-Ramsey number of H is defined to be the smallest integer m for which there exists a graph G with m edges with the following property: however one colours the edges of G with r colours, there always exists a monochromatic induced subgraph H′ of G that is isomorphic to H. This is a concept closely related to the classical r-size-Ramsey number of Erdős, Faudree, Rousseau and Schelp, and to the r-induced Ramsey number, a natural notion that appears in problems and conjectures due to, among others, Graham and Rödl, and Trotter. Here, we prove a result that implies that the induced r-size-Ramsey number of the cycle C_ℓ is at most c_r ℓ for some constant c_r that depends only upon r. Thus we settle a conjecture of Graham and Rödl, which states that the above holds for the path P_ℓ of order ℓ, and also generalise in part a result of Bollobás, Burr and Reimer that implies that the r-size-Ramsey number of the cycle C_ℓ is linear in ℓ. Our method of proof is heavily based on techniques from the theory of random graphs and on a variant of the powerful regularity lemma of Szemerédi.
This paper describes a natural language text extraction system, called MEDLEE, that has been applied to the medical domain. The system extracts, structures, and encodes clinical information from textual patient reports. It was integrated with the Clinical Information System (CIS), which was developed at Columbia-Presbyterian Medical Center (CPMC) to help improve patient care. MEDLEE is currently used on a daily basis to routinely process radiological reports of patients at CPMC.
In order to describe how the natural language system was made compatible with the existing CIS, this paper will also discuss engineering issues which involve performance, robustness, and accessibility of the data from the end users' viewpoint.
Also described are the three evaluations that have been performed on the system. The first evaluation was useful primarily for further refinement of the system. The two other evaluations involved an actual clinical application which consisted of retrieving reports that were associated with specified diseases. Automated queries were written by a medical expert based on the structured output forms generated as a result of text processing. The retrievals obtained by the automated system were compared to the retrievals obtained by independent medical experts who read the reports manually to determine whether they were associated with the specified diseases. MEDLEE was shown to perform comparably to the experts. The technique used to perform the last two evaluations was found to be a realistic evaluation technique for a natural language processor.