In this paper we focus on a specific search-related query expansion topic, namely search on Danish compounds and expansion to some of their synonymous phrases. Compounds constitute a specific issue in search, in particular in languages where they are written in one word, as is the case for Danish and the other Scandinavian languages. For such languages, expansion of the query compound into separate lemmas is a way of finding the often frequent alternative synonymous phrases in which the content of a compound can also be expressed. However, it is crucial to note that the number of irrelevant hits is generally very high when using this expansion strategy. The aim of this paper is therefore to examine how we can obtain better search results on split compounds, partly by looking at the internal structure of the original compound, partly by analyzing the context in which the split compound occurs. In this context, we pursue two hypotheses: (1) that some categories of compounds are more likely to have synonymous ‘split’ counterparts than others; and (2) that search results where both the search words (obtained by splitting the compound) occur in the same noun phrase, are more likely to contain a synonymous phrase to the original compound query. The search results from 410 enhanced compound queries are used as a test bed for our experiments. On these search results, we perform a shallow linguistic analysis and introduce a new, linguistically based threshold for retrieved hits. The results obtained by using this strategy demonstrate that compound splitting combined with a shallow linguistic analysis focusing on the argument structure of the compound head as well as on the recognition of NPs, can improve search by substantially bringing down the number of irrelevant hits.
We investigate a technique from the literature, called the phantom-types technique, that uses parametric polymorphism, type constraints, and unification of polymorphic types to model a subtyping hierarchy. Hindley-Milner type systems, such as the one found in Standard ML, can be used to enforce the subtyping relation, at least for first-order values. We show that this technique can be used to encode any finite subtyping hierarchy (including hierarchies arising from multiple interface inheritance). We formally demonstrate the suitability of the phantom-types technique for capturing first-order subtyping by exhibiting a type-preserving translation from a simple calculus with bounded polymorphism to a calculus embodying the type system of SML.
The usual definition of average degree for a non-regular lattice has the disadvantage that it takes the same value for many lattices with clearly different connectivity. We introduce an alternative definition of average degree, which better separates different lattices.
These measures are compared on a class of lattices and are analysed using a Markov chain describing a random walk on the lattice. Using the new measure, we conjecture the order of both the critical probabilities for bond percolation and the connective constants for self-avoiding walks on these lattices.
Let $s$ and $t$ be integers satisfying $s \geq 2$ and $t \geq 2$. Let $S$ be a tree of size $s$, and let $P_t$ be the path of length $t$. We show in this paper that, for every edge-colouring of the complete graph on $n$ vertices, where $n=224(s-1)^2t$, there is either a monochromatic copy of $S$ or a rainbow copy of $P_t$. So, in particular, the number of vertices needed grows only linearly in $t$.
For any partition of $\{1, 2, \ldots, n\}$ we define its increments $X_i$, $1 \leq i \leq n$, by $X_i = 1$ if $i$ is the smallest element in the partition block that contains it, $X_i = 0$ otherwise. We prove that for partially exchangeable random partitions (where the probability of a partition depends only on its block sizes in order of appearance), the law of the increments uniquely determines the law of the partition. One consequence is that the Chinese Restaurant Process CRP($\theta$) (the partition with distribution given by the Ewens sampling formula with parameter $\theta$) is the only exchangeable random partition with independent increments.
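The definition of the increments can be made concrete with a short sketch. The function names `increments` and `sample_crp` are hypothetical and not taken from the paper; the sampler assumes the standard sequential construction of CRP($\theta$), in which element $i$ opens a new block with probability $\theta/(\theta+i-1)$ and otherwise joins an existing block with probability proportional to its size.

```python
import random


def increments(blocks, n):
    """Return [X_1, ..., X_n], where X_i = 1 iff i is the smallest element of its block."""
    minima = {min(block) for block in blocks}
    return [1 if i in minima else 0 for i in range(1, n + 1)]


def sample_crp(n, theta, rng):
    """Sample a partition of {1, ..., n} by seating elements one at a time (assumes theta > 0)."""
    blocks = []
    for i in range(1, n + 1):
        # Element i opens a new block with probability theta / (theta + i - 1);
        # for i = 1 this probability is 1, so the first block always exists.
        if rng.random() < theta / (theta + i - 1):
            blocks.append([i])
        else:
            # Otherwise it joins an existing block, chosen proportionally to block size.
            sizes = [len(block) for block in blocks]
            chosen = rng.choices(range(len(blocks)), weights=sizes)[0]
            blocks[chosen].append(i)
    return blocks


# Blocks {1,3}, {2,5,6}, {4} have minima 1, 2 and 4.
print(increments([{1, 3}, {2, 5, 6}, {4}], 6))  # [1, 1, 0, 1, 0, 0]

sampled = sample_crp(20, 1.5, random.Random(0))
print(sampled, increments(sampled, 20))
```

In this construction each $X_i$ is decided by a separate coin flip, which illustrates the independence of the increments; the sum of the increments always equals the number of blocks.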
We consider the problem of reorienting an oriented matroid so that all its cocircuits are ‘as balanced as possible in ratio’. It is well known that any oriented matroid having no coloops has a totally cyclic reorientation, a reorientation in which every signed cocircuit $B = \{B^+, B^-\}$ satisfies $B^+, B^- \neq \emptyset$. We show that, for some reorientation, every signed cocircuit satisfies \[1/f(r) \leq |B^+|/|B^-| \leq f(r)\], where $f(r) \leq 14\,r^2\ln(r)$, and $r$ is the rank of the oriented matroid.
In geometry, this problem corresponds to bounding the discrepancies (in ratio) that occur among the Radon partitions of a dependent set of vectors. For graphs, this result corresponds to bounding the chromatic number of a connected graph by a function of its Betti number (corank) $|E|-|V|+1$.
A dominating set $\cal D$ of a graph $G$ is a subset of $V(G)$ such that, for every vertex $v\in V(G)$, either $v\in {\cal D}$ or there exists a vertex $u \in {\cal D}$ that is adjacent to $v$. We are interested in finding dominating sets of small cardinality. A dominating set $\cal I$ of a graph $G$ is said to be independent if no two vertices of ${\cal I}$ are connected by an edge of $G$. The size of a smallest independent dominating set of a graph $G$ is the independent domination number of $G$. In this paper we present upper bounds on the independent domination number of random regular graphs. This is achieved by analysing the performance of a randomized greedy algorithm on random regular graphs using differential equations.
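As an illustration of the definitions, the sketch below runs one simple randomized greedy procedure: pick a random remaining vertex, add it to the set, and delete it together with its neighbours. This is an assumed, simplified variant and is not claimed to be the algorithm analysed in the paper; the function names and the choice of the Petersen graph as a small 3-regular test case are also illustrative. Any maximal independent set produced this way is an independent dominating set, so its size is an upper bound on the independent domination number.

```python
import random


def petersen_graph():
    """Adjacency sets of the Petersen graph, a 3-regular graph on 10 vertices."""
    adj = {v: set() for v in range(10)}

    def add_edge(u, v):
        adj[u].add(v)
        adj[v].add(u)

    for i in range(5):
        add_edge(i, (i + 1) % 5)          # outer 5-cycle
        add_edge(i, i + 5)                # spokes
        add_edge(5 + i, 5 + (i + 2) % 5)  # inner pentagram
    return adj


def greedy_independent_dominating_set(adj, rng):
    """Pick random remaining vertices, removing each pick and its neighbours."""
    remaining = set(adj)
    chosen = set()
    while remaining:
        v = rng.choice(sorted(remaining))
        chosen.add(v)
        remaining -= adj[v] | {v}
    return chosen


def is_independent_dominating(adj, s):
    """Check both defining properties of an independent dominating set."""
    independent = all(adj[v].isdisjoint(s) for v in s)
    dominating = all(v in s or not adj[v].isdisjoint(s) for v in adj)
    return independent and dominating


graph = petersen_graph()
result = greedy_independent_dominating_set(graph, random.Random(1))
print(sorted(result), is_independent_dominating(graph, result))
```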
In this paper, we study percolation on finite Cayley graphs. A conjecture of Benjamini says that the critical percolation probability $p_c$ of any vertex-transitive graph satisfying a certain diameter condition can be bounded away from one. We prove Benjamini's conjecture for some special classes of Cayley graphs. We also establish a reduction theorem, which allows us to build Cayley graphs for large groups without increasing $p_c$.
Let $c$ be a constant and $(e_1,f_1), (e_2,f_2), \dots, (e_{cn},f_{cn})$ be a sequence of ordered pairs of edges on vertex set $[n]$ chosen uniformly and independently at random. Let $A$ be an algorithm for the on-line choice of one edge from each presented pair, and for $i= 1,\ldots,cn$ let $G_A(i)$ be the graph on vertex set $[n]$ consisting of the first $i$ edges chosen by $A$. We prove that all algorithms in a certain class have a critical value $c_A$ for the emergence of a giant component in $G_A(cn)$ (i.e., if $c < c_A$, then with high probability the largest component in $G_A(cn)$ has $o(n)$ vertices, and if $c > c_A$ then with high probability there is a component of size $\Omega(n)$ in $G_A(cn)$). We show that a particular algorithm in this class with high probability produces a giant component before $0.385 n$ steps in the process (i.e., we exhibit an algorithm that creates a giant component relatively quickly). The fact that another specific algorithm that is in this class has a critical value resolves a conjecture of Spencer.
In addition, we establish a lower bound on the time of emergence of a giant component in any process produced by an on-line algorithm and show that there is a phase transition for the off-line version of the problem of creating a giant component.
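The on-line edge-choice process described in this abstract can be simulated with a union-find structure. The sketch below is only an illustration under stated assumptions: the two choice rules (`always_first` and `prefer_large_merge`) are hypothetical examples and are not the algorithms analysed in the paper, and no critical values are claimed for them beyond the classical fact that `always_first` reproduces the ordinary random graph process.

```python
import random


class DisjointSets:
    """Union-find with path halving and union by size."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]


def random_edge(n, rng):
    """A uniformly random edge (pair of distinct vertices) on {0, ..., n-1}."""
    u, v = rng.sample(range(n), 2)
    return u, v


def always_first(ds, e, f):
    """Ignore the second edge; this is the ordinary random graph process."""
    return e


def prefer_large_merge(ds, e, f):
    """Hypothetical rule: keep the edge that merges the larger pair of components."""
    def gain(edge):
        a, b = ds.find(edge[0]), ds.find(edge[1])
        return 0 if a == b else ds.size[a] * ds.size[b]
    return e if gain(e) >= gain(f) else f


def largest_component(n, c, choose, rng):
    """Present int(c * n) random edge pairs on-line and return the largest component size."""
    ds = DisjointSets(n)
    for _ in range(int(c * n)):
        e, f = random_edge(n, rng), random_edge(n, rng)
        u, v = choose(ds, e, f)
        ds.union(u, v)
    return max(ds.size[ds.find(x)] for x in range(n))


for c in (0.2, 0.4, 0.6):
    rng = random.Random(0)
    print(c, largest_component(5000, c, always_first, rng),
          largest_component(5000, c, prefer_large_merge, rng))
```

Each rule sees only the current pair and the graph built so far, which matches the on-line setting; an off-line rule would instead need access to the whole sequence of pairs before choosing.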
A 3-connected graph $G$ is weakly 3-connected if, for every edge $e$ of $G$, at most one of $G\backslash e$ and $G/e$ is 3-connected. The main result of this paper is that any weakly 3-connected graph can be reduced to $K_4$ by a sequence of simple operations. This extends a result of Dawes [5] on minimally 3-connected graphs.
The notion of conductance introduced by Jerrum and Sinclair [8] has been widely used to prove rapid mixing of Markov chains. Here we introduce a bound that extends this in two directions. First, instead of measuring the conductance of the worst subset of states, we bound the mixing time by a formula that can be thought of as a weighted average of the Jerrum–Sinclair bound (where the average is taken over subsets of states with different sizes). Furthermore, instead of just the conductance, which in graph theory terms measures edge expansion, we also take into account node expansion. Our bound is related to the logarithmic Sobolev inequalities, but it appears to be more flexible and easier to compute.
In the case of random walks in convex bodies, we show that this new bound is better than the known bounds for the worst case. This saves a factor of $O(n)$ in the mixing time bound, which is incurred in all proofs as a ‘penalty’ for a ‘bad start’. We show that in a convex body in $\mathbb{R}^n$, with diameter $D$, random walk with steps in a ball with radius $\delta$ mixes in $O^*(nD^2/\delta^2)$ time (if idle steps at the boundary are not counted). This gives an $O^*(n^3)$ sampling algorithm after appropriate preprocessing, improving the previous bound of $O^*(n^4)$.
The application of the general conductance bound in the geometric setting depends on an improved isoperimetric inequality for convex bodies.
This article describes the scientific contributions of Milton Sobel. It motivates his research by considering his family background, his war experiences, and his mentors and fellow students at Columbia University. His contributions to sequential analysis, selection, ranking, group testing, and probabilistic combinatorics are highlighted.
Recently, Pellerey, Shaked, and Zinn [6] introduced a discrete-time analogue of the nonhomogeneous Poisson process. The purpose of this article is to provide some results for stochastic comparisons of the epoch times and the interepoch times of those processes. Also, we show the relationships between these processes and discrete record values and we provide several results for discrete weak record values.
In this article, we consider an insurance risk model where the claim and premium processes follow some time series models. We first consider the model proposed in Gerber [2,3]; then a model with dependent structure between premium and claim processes modeled by using Granger's causal model is considered. By using some martingale arguments, Lundberg-type upper bounds for the ruin probabilities under both models are obtained. Some special cases are discussed.
In this article, we obtain error bounds for exponential approximations to the classes of weighted residual and equilibrium lifetime distributions with monotone weight functions. These bounds are obtained for the class of distributions with increasing (decreasing) hazard rate and mean residual life functions.
We investigate some new properties of mean inactivity time (MIT) order and increasing MIT (IMIT) class of life distributions. The preservation property of MIT order under increasing and concave transformations, reversed preservation properties of MIT order, and IMIT class of life distributions under the taking of maximum are developed. Based on the residual life at a random time and the excess lifetime in a renewal process, stochastic comparisons of both IMIT and decreasing mean residual life distributions are conducted as well.
The concept of generalized order statistics was introduced as a unified approach to a variety of models of ordered random variables. The purpose of this article is to investigate conditions on the distributions and the parameters on which the generalized order statistics are based to establish the likelihood ratio ordering of general p-spacings and the hazard rate and the dispersive orderings of (normalizing) simple spacings from two samples. We thus strengthen and complement some results in Franco, Ruiz, and Ruiz [7] and Belzunce, Mercader, and Ruiz [5]. This article is a continuation of Hu and Zhuang [10].
We consider funding an interest-bearing warranty reserve with contributions after each sale. The problem for the manufacturer is to determine the initial level of the reserve fund and the amount to be put in after each sale, so as to ensure that the reserve fund covers all of the warranty liabilities with a prespecified probability over a fixed period of time. We assume a nonhomogeneous Poisson sales process, random warranty periods, and a constant failure rate for items under warranty. We derive the mean and variance of the reserve level as a function of time and provide a robust heuristic to aid the manufacturer in its decision.