This paper presents a moving object detection algorithm for H.264/AVC video streams that operates in the compressed domain. The method extracts and analyzes several syntax elements from any H.264/AVC-compliant bit stream. The number of analyzed syntax elements depends on the mode in which the method operates: the algorithm can perform either a spatiotemporal analysis in a single step or a two-step analysis that starts with a spatial analysis of each frame, followed by a temporal analysis of several subsequent frames. In each mode, either only (sub-)macroblock types and partition modes are analyzed or, additionally, quantization parameters. The evaluation of these syntax elements enables the algorithm to determine a “weight” for each 4×4 block of pixels that indicates the level of motion within this block. A final segmentation step then uses these weights to partition each frame into foreground and background, and hence indicates the positions and sizes of all moving objects. Our experiments show that the algorithm efficiently detects moving objects in the compressed domain and that it is configurable to process a large number of parallel bit streams in real time.
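As a concrete illustration of the final segmentation step, here is a minimal sketch (ours, not the paper's) of how per-4×4-block motion weights might be turned into a pixel-level foreground mask; the [0, 1] weight range, the fixed threshold, and all names are illustrative assumptions.

```python
import numpy as np

def segment_frame(block_weights, threshold=0.5):
    """Turn per-4x4-block motion weights into a pixel-level foreground mask.

    `block_weights` is an (H/4, W/4) array holding one motion weight per
    4x4 block of the frame; blocks whose weight exceeds `threshold` are
    declared foreground. Both the [0, 1] weight range and the fixed
    threshold are assumptions made for illustration.
    """
    fg_blocks = (block_weights > threshold).astype(np.uint8)
    # Expand each block decision to its 4x4 pixel area.
    return np.kron(fg_blocks, np.ones((4, 4), dtype=np.uint8)).astype(bool)

# Example: a 2x2-block frame where only the top-left block is moving.
mask = segment_frame(np.array([[0.9, 0.1], [0.2, 0.3]]))
assert mask.shape == (8, 8) and mask[:4, :4].all() and not mask[:4, 4:].any()
```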
We recently proposed a bidirectional hierarchical anchoring (BIHA) of motion fields for highly scalable video coding. The BIHA scheme employs piecewise-smooth motion fields, and uses breakpoints to signal motion discontinuities. In this paper, we show how the fundamental building block of the BIHA scheme can be used to perform bidirectional, occlusion-aware temporal frame interpolation (BOA-TFI). From a “parent” motion field between two reference frames, we use information about motion discontinuities to compose motion fields from both reference frames to the target frame; these are then inverted so that they can be used to predict the target frame. During the motion inversion process, we compute a reliable occlusion mask, which is used to guide the bidirectional motion-compensated prediction of the target frame. The scheme can be used in any state-of-the-art codec, but is most beneficial in conjunction with a highly scalable video coder that employs piecewise-smooth motion fields with motion discontinuities. We evaluate the proposed BOA-TFI scheme on a large variety of natural and challenging computer-generated sequences, and our results compare favorably to state-of-the-art TFI methods.
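The motion inversion and occlusion reasoning can be sketched as follows. This is a simplified nearest-pixel splatting scheme under our own assumptions (a dense per-pixel flow array, "last write wins" conflict handling), not the paper's discontinuity-guided method.

```python
import numpy as np

def invert_motion_field(flow, shape):
    """Invert a dense motion field by forward splatting.

    `flow[y, x]` holds the (dy, dx) motion of source pixel (y, x) toward
    the target frame of size `shape` = (h, w). Target pixels that receive
    no vector are flagged as occluded. Nearest-pixel splatting and the
    "last write wins" conflict rule are simplifying assumptions.
    """
    h, w = shape
    inv = np.zeros((h, w, 2), dtype=np.float32)
    occluded = np.ones((h, w), dtype=bool)
    for y in range(flow.shape[0]):
        for x in range(flow.shape[1]):
            dy, dx = flow[y, x]
            ty, tx = int(round(y + dy)), int(round(x + dx))
            if 0 <= ty < h and 0 <= tx < w:
                inv[ty, tx] = (-dy, -dx)   # vector back to the source frame
                occluded[ty, tx] = False   # this target pixel is observed
    return inv, occluded
```

The resulting mask would then gate the bidirectional prediction: occluded target pixels are predicted from only the reference frame in which they are visible.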
This paper proposes a novel model estimation method that uses nested Gibbs sampling for mixture-of-mixture models, in which the distribution of each component of the mixture is itself represented by a mixture model. Such models are suitable for analyzing multilevel data, such as videos and acoustic signals, which are composed of frame-wise observations. Deterministic procedures, such as the expectation–maximization algorithm, have been employed to estimate these kinds of models, but this approach often suffers from a large bias when the amount of data is limited. To avoid this problem, we introduce a Markov chain Monte Carlo-based model estimation method. In particular, we aim to identify a suitable sampling method for mixture-of-mixture models. Gibbs sampling is a possible approach, but this can easily lead to the local optimum problem when each component is represented by a multi-modal distribution. Thus, we propose a novel Gibbs sampling method, called “nested Gibbs sampling,” which represents the lower-level (fine) data structure based on elemental mixture distributions and the higher-level (coarse) data structure based on mixture-of-mixture distributions. We applied this method to a speaker clustering problem and conducted experiments under various conditions. The results demonstrated that the proposed method outperformed conventional sampling-based, variational Bayesian, and hierarchical agglomerative methods.
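To make the two-level sampling concrete, here is a toy sketch of a nested Gibbs sampler for a 1-D mixture of Gaussian mixtures. Unit variances, uniform mixture weights, a standard-normal prior on the means, and all function names are our simplifying assumptions, not the paper's model.

```python
import numpy as np

def nested_gibbs(x, n_clusters=2, n_components=3, n_iter=200, rng=None):
    """Toy nested Gibbs sampler for a 1-D mixture of Gaussian mixtures.

    Each higher-level cluster (e.g., a speaker) is itself a mixture of
    `n_components` unit-variance Gaussians over frames `x` (1-D array).
    """
    rng = rng or np.random.default_rng(0)
    n = len(x)
    z = rng.integers(n_clusters, size=n)              # coarse assignments
    mu = rng.normal(size=(n_clusters, n_components))  # component means
    for _ in range(n_iter):
        # Coarse level: p(x_i | cluster k) marginalizes the component.
        lik = np.exp(-0.5 * (x[:, None, None] - mu[None]) ** 2).mean(axis=2)
        p = lik / lik.sum(axis=1, keepdims=True)
        z = np.array([rng.choice(n_clusters, p=pi) for pi in p])
        # Fine level: resample component assignments and means per cluster.
        for k in range(n_clusters):
            xk = x[z == k]
            if xk.size == 0:
                continue
            w = np.exp(-0.5 * (xk[:, None] - mu[k][None, :]) ** 2)
            w /= w.sum(axis=1, keepdims=True)
            c = np.array([rng.choice(n_components, p=wi) for wi in w])
            for j in range(n_components):
                xkj = xk[c == j]
                if xkj.size:
                    prec = 1.0 + xkj.size             # prior + data precision
                    mu[k, j] = rng.normal(xkj.sum() / prec, prec ** -0.5)
    return z, mu
```

Marginalizing over the fine-level components when resampling the coarse-level assignments is what distinguishes this nesting from flat Gibbs sampling over all labels at once.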
Conversations in poster sessions at academic events, referred to as poster conversations, pose interesting and challenging topics for multi-modal signal and information processing. We have developed a smart posterboard for multi-modal recording and analysis of poster conversations. The smart posterboard has multiple sensing devices to record poster conversations, so we can review who came to the poster and what kind of questions or comments he/she made. The conversation analysis incorporates face and eye-gaze tracking for effective speaker diarization. It is demonstrated that eye-gaze information is useful for predicting turn-taking and also for improving speaker diarization. Moreover, high-level indexing of the interest and comprehension level of the audience is explored based on multi-modal behaviors during the conversation. This is realized by predicting the audience's speech acts, such as questions and reactive tokens.
State-of-the-art lossless image compression schemes, such as JPEG-LS and CALIC, have been proposed in the context-adaptive predictive coding framework. These schemes involve a prediction step followed by context-adaptive entropy coding of the residuals. However, the models for context determination proposed in the literature have been designed using ad-hoc techniques. In this paper, we take an alternative approach: we fix a simpler context model and then rely on a systematic technique to exploit spatial correlation for efficient compression. The essential idea is to decompose the image into binary bitmaps such that the spatial correlation that exists among non-binary symbols is captured as the correlation among a few bit positions. The proposed scheme then encodes the bitmaps in a particular order based on the simple context model. However, instead of encoding a bitmap as a whole, we partition it into rectangular blocks, induced by a binary tree, and then encode the blocks independently. The motivation for partitioning is to explicitly identify the blocks within which the statistical correlation remains the same. On a set of standard test images, the proposed scheme, using the same predictor as JPEG-LS, achieved an overall bit-rate saving of 1.56% against JPEG-LS.
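The decomposition idea can be illustrated with plain bit-plane slicing, shown below as one simple way of obtaining such binary bitmaps; the paper's exact decomposition, encoding order, and binary-tree block partitioning are not reproduced here.

```python
import numpy as np

def to_bitplanes(values, bits=8):
    """Decompose 8-bit values into binary bitmaps, one per bit position.

    Returns a (bits, H, W) array where planes[b][y, x] is a bit of
    pixel (y, x), most significant plane first. Nearby values share
    their high-order bits, so the top planes are nearly constant and
    cheap to encode: the cross-symbol spatial correlation is captured
    in a few bit positions.
    """
    img = values.astype(np.uint8)
    return np.stack([(img >> b) & 1 for b in reversed(range(bits))])

planes = to_bitplanes(np.array([[200, 201], [199, 198]]))
assert planes.shape == (8, 2, 2)      # 8 binary 2x2 bitmaps
assert planes[0].all()                # MSB plane is constant here
```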
There is a recent surge in research activities around “deep neural networks” (DNN). While the notion of neural networks has enjoyed cycles of enthusiasm, which may well continue their ebb and flow, concrete advances now abound. Significant performance improvements have been shown in a number of pattern recognition tasks. As a technical topic, DNN is covered in classes, and tutorial articles and related learning resources are available. Nonetheless, streams of questions never subside from students or researchers, and there appears to be a frustrating tendency among learners to treat DNN simply as a black box. This is an awkward and alarming situation in education. This paper is thus intended to help the reader properly understand DNN: not just its mechanism (what and how) but its motivation and justification (why). It is written from a developmental perspective with a comprehensive view, from the very basic but oft-forgotten principle of statistical pattern recognition and decision theory, through the problem stages that may be encountered during system design, to key ideas that led to the new advance. This paper can serve as a learning guide, with historical reviews and important references, helpful in reaching an insightful understanding of the subject.
We prove that every integer $n\geqslant 10$ such that $n\not \equiv 1\text{ mod }4$ can be written as the sum of the square of a prime and a square-free number. This makes explicit a theorem of Erdős that every sufficiently large integer of this type may be written in such a way. Our proof requires us to construct new explicit results for primes in arithmetic progressions. As such, we use the second author’s numerical computation regarding the generalised Riemann hypothesis to extend the explicit bounds of Ramaré–Rumely.
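As a concrete illustration (ours, not the paper's), here are the first few admissible cases, together with the reason the residue class $n\equiv 1 \text{ mod } 4$ must be excluded:

```latex
% Small instances of the theorem (illustrative):
\[
10 = 2^2 + 6, \qquad 11 = 3^2 + 2, \qquad 12 = 3^2 + 3, \qquad 14 = 3^2 + 5 .
\]
% Why n = 1 (mod 4) is excluded: every odd prime p satisfies
% p^2 = 1 (mod 8), so for such n the difference n - p^2 is divisible
% by 4 and hence never square-free; only p = 2 remains available.
\[
n \equiv 1 \pmod{4},\ p \text{ odd} \;\Longrightarrow\; n - p^2 \equiv 0 \pmod{4}.
\]
```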
The exponential growth in digital technology is leading us to a future in which all things and all people are connected all the time, something we refer to as The Infinite Network (TIN), which will cause profound changes in every industry. Here, we focus on the impact it will have on healthcare. TIN will change the essence of healthcare to a data-driven, continuous approach, as opposed to the event-driven, discrete approach used today. At the micro or individual level, smart sensing will play a key role, in the form of embedded sensors, wearable sensors, and sensing from smart medical devices. At the macro or aggregate level, healthcare will be provided by Intelligent Telehealth Networks that evolve from the telehealth networks available today. Traditional telemedicine has delivered remote care to patients in areas where doctors are not readily available, but has not been achieved at large scale. New advanced networks will deliver care at a much larger scale. The long-term future requires intelligent hybrid networks that combine artificial intelligence with human intelligence to provide continuity of care at higher quality and lower cost than is possible today.
In a paper entitled “Binary lambda calculus and combinatory logic”, John Tromp presents a simple way of encoding lambda calculus terms as binary sequences. In what follows, we study the number of binary strings of a given size that represent lambda terms, and derive results from their generating functions; in particular, the number of terms of size n grows roughly like 1.963447954…^n. In the second part, we use this approach to generate random lambda terms using Boltzmann samplers.
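Tromp's encoding assigns size i + 1 to the de Bruijn variable i (encoded 1^i 0), size |M| + 2 to an abstraction λM (encoded 00⟨M⟩), and size |M| + |N| + 2 to an application MN (encoded 01⟨M⟩⟨N⟩). The counting recurrence this induces can be computed with a short memoized sketch:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count(n, m):
    """Number of lambda terms of binary size n (Tromp's encoding) whose
    free de Bruijn indices are all <= m; count(n, 0) counts closed terms.
    """
    if n < 2:
        return 0
    total = 1 if n - 1 <= m else 0               # variable: 1^(n-1) 0
    total += count(n - 2, m + 1)                 # abstraction: 00 <body>
    total += sum(count(k, m) * count(n - 2 - k, m)  # application: 01 <M> <N>
                 for k in range(2, n - 3))       # each side has size >= 2
    return total

# The only closed term of size 4 is the identity, encoded 0010.
assert count(4, 0) == 1
# Successive ratios count(n+1, 0) / count(n, 0) approach ~1.9634 as n grows.
```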
We relate the so-called powercone models of mixed non-deterministic and probabilistic choice proposed by Tix, Keimel, Plotkin, Mislove, Ouaknine, Worrell, Morgan and McIver, to our own models of previsions. Under suitable topological assumptions, we show that they are isomorphic. We rely on Keimel's cone-theoretic variants of the classical Hahn–Banach separation theorems, using functional analytic methods, and on the Schröder–Simpson Theorem.
A k-uniform hypergraph H = (V, E) is called ℓ-orientable if there is an assignment of each edge e ∈ E to one of its vertices v ∈ e such that no vertex is assigned more than ℓ edges. Let H_{n,m,k} be a hypergraph drawn uniformly at random from the set of all k-uniform hypergraphs with n vertices and m edges. In this paper we establish the threshold for the ℓ-orientability of H_{n,m,k} for all k ⩾ 3 and ℓ ⩾ 2; that is, we determine a critical quantity c*_{k,ℓ} such that with probability 1 − o(1) the hypergraph H_{n,cn,k} has an ℓ-orientation if c < c*_{k,ℓ}, but fails to do so if c > c*_{k,ℓ}.
Our result has various applications, including sharp load thresholds for cuckoo hashing, load balancing with guaranteed maximum load, and massive parallel access to hard disk arrays.
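For small instances, ℓ-orientability can be checked directly by reducing it to a maximum-flow problem: one unit-capacity node per hyperedge, capacity-ℓ arcs from vertices to the sink, and an ℓ-orientation exists iff the flow saturates every hyperedge. The sketch below is our brute-force checker for experimentation, not the paper's probabilistic analysis.

```python
from collections import deque

def is_l_orientable(n, edges, l):
    """Check l-orientability of a hypergraph on vertices 0..n-1 by max flow.

    Network: source -> one node per hyperedge (cap 1) -> its vertices
    (cap 1) -> sink (cap l). An l-orientation exists iff max flow == m.
    """
    m = len(edges)
    S, T = 0, 1 + m + n                    # 1..m: edges, m+1..m+n: vertices
    cap = [dict() for _ in range(T + 1)]   # residual capacities
    for i, e in enumerate(edges):
        cap[S][1 + i] = 1
        for v in e:
            cap[1 + i][1 + m + v] = 1
            cap[1 + m + v].setdefault(T, l)
    flow = 0
    while True:
        parent = {S: None}                 # BFS for an augmenting path
        q = deque([S])
        while q and T not in parent:
            u = q.popleft()
            for w, c in cap[u].items():
                if c > 0 and w not in parent:
                    parent[w] = u
                    q.append(w)
        if T not in parent:
            return flow == m               # every hyperedge assigned?
        w = T                              # augment by 1 along the path
        while parent[w] is not None:
            u = parent[w]
            cap[u][w] -= 1
            cap[w][u] = cap[w].get(u, 0) + 1
            w = u
        flow += 1

# Three copies of one 3-edge can be spread over its 3 vertices with l = 1...
assert is_l_orientable(3, [(0, 1, 2)] * 3, 1)
# ...but four copies cannot.
assert not is_l_orientable(3, [(0, 1, 2)] * 4, 1)
```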
This paper first presents a method of motion planning and implementation for the self-recovery of an overturned six-legged robot. Previous studies have aimed at the static and dynamic stabilization of robots to prevent them from overturning. However, no one can guarantee that an overturn will never occur during the various applications of robots. Therefore, the problems involving overturning should be considered and solved during robot design and control. The design inspiration for multi-legged robots comes from nature, especially insects and mammals, and the self-recovery approach of an insect can likewise be imitated by robots. In this paper, such a self-recovery mechanism is reported: the inertial forces of the dangling legs are used to bias some legs to touch the ground, and the ground reaction forces exerted on the feet of the landing legs are used to support and push the body, enabling recovery without additional help. Based on this mechanism, a self-recovery approach named SSR (Sidewise-Self-Recovery) is presented and applied to multi-legged robots. Experiments on the NOROS robot are performed to validate the effectiveness of the self-recovery motions. The results show that SSR is a suitable method for multi-legged robots and that a hemispherical shell can help robots perform self-recovery.
The book Das Interpretationsproblem der Formalisierten Zahlentheorie und ihre Formale Widerspruchsfreiheit by Erik Stenius, published in 1952, contains a consistency proof for infinite ω-arithmetic based on a semantical interpretation. Despite the proof’s reference to semantics, the truth definition is in fact equivalent to a syntactical derivability or reduction condition. Based on this reduction condition, Stenius proves that the complexity of formulas in a derivation can be limited by the complexity of the conclusion. This independent result can also be proved by cut elimination for ω-arithmetic, which was done by Schütte in 1951.
In this paper we interpret the syntactic reduction in Stenius’ work as a method for cut elimination based on the invertibility of the logical rules. Through this interpretation, the constructivity of Stenius’ proof becomes apparent. This improvement was explicitly requested from Stenius by Paul Bernays in private correspondence (in a letter from Bernays begun on the 19th of September 1952 (Stenius & Bernays, 1951–75)). Bernays, who took a deep interest in Stenius’ manuscript, applied the described method in a proof of Herbrand’s theorem. In this paper we prove Herbrand’s theorem, as an application of Stenius’ work, based on lecture notes of Bernays (Bernays, 1961). The main result completely resolves Bernays’ suggestions for improvement by eliminating references to Stenius’ semantics and by showing the constructive nature of the proof. A comparison with Schütte’s cut elimination proof shows how Stenius’ simplification of the reduction of universal cut formulas, which in Schütte’s proof requires duplication and repositioning of the cuts, shifts the problematic case of the reduction to implications.
According to propositional contingentism, it is contingent what propositions there are. This paper presents two ways of modeling contingency in what propositions there are using two classes of possible worlds models. The two classes of models are shown to be equivalent as models of contingency in what propositions there are, although they differ as to which other aspects of reality they represent. These constructions are based on recent work by Robert Stalnaker; the aim of this paper is to explain, expand, and, in one aspect, correct Stalnaker’s discussion.