An extension of the well-known Particle Swarm Optimization (PSO) algorithm to multi-robot applications, denoted Robotic Darwinian PSO (RDPSO), has recently been proposed; it benefits from dynamically partitioning the whole population of robots. Although this strategy reduces the amount of information that robots must exchange, a further analysis of the communication complexity of the RDPSO is needed to evaluate the scalability of the algorithm, and the most adequate multi-hop routing protocol remains to be determined. Therefore, this paper starts by analyzing the architecture and characteristics of the RDPSO communication system, describing the dynamics of the communication data packet structure shared between teammates. This analysis is the first step toward a more scalable RDPSO implementation that optimizes the communication procedure between robots. Second, an ad hoc on-demand distance vector (AODV) reactive routing protocol is extended with RDPSO concepts so as to reduce the communication overhead within swarms of robots. Experimental results with teams of 15 real robots and 60 simulated robots show that the proposed methodology significantly reduces the communication overhead, thus improving the scalability and applicability of the RDPSO algorithm.
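For readers unfamiliar with the underlying optimizer, the sketch below shows the canonical PSO velocity and position update that RDPSO builds on. This is the textbook rule, not the paper's RDPSO variant; the coefficient values and function name are illustrative assumptions.

```python
import random

# Canonical PSO update that RDPSO extends; the coefficients below are
# illustrative defaults, not values taken from the paper.
def pso_step(position, velocity, personal_best, global_best,
             w=0.7, c1=1.5, c2=1.5):
    """One velocity/position update for a single particle (robot)."""
    new_velocity = [
        w * v
        + c1 * random.random() * (pb - x)   # cognitive pull toward own best
        + c2 * random.random() * (gb - x)   # social pull toward the swarm's best
        for v, x, pb, gb in zip(velocity, position, personal_best, global_best)
    ]
    new_position = [x + v for x, v in zip(position, new_velocity)]
    return new_position, new_velocity

# Example: one robot moving in the plane.
pos, vel = [0.0, 0.0], [0.1, -0.2]
pos, vel = pso_step(pos, vel, personal_best=[1.0, 1.0], global_best=[2.0, 0.5])
```

RDPSO's contribution lies in partitioning the swarm into multiple sub-swarms that each run such an update with their own social component, which is what reduces the information each robot must exchange.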
In this chapter, we examine visualization and debugging as well as the relationship of these services and the infrastructure provided by a SPS. Visualization and debugging tools help developers and analysts to inspect and to understand the current state of an application and the data flow between its components, thus mitigating the cognitive and software engineering challenges associated with developing, optimizing, deploying, and managing SPAs, particularly the large-scale distributed ones.
On the one hand, visualization techniques are important at development time, where the ability to picture the application layout and its live data flows can aid in refining its design.
On the other hand, debugging techniques and tools, which are sometimes integrated with visualization tools, are important because the continuous and critical nature of some SPAs requires the ability to effectively diagnose and address problems before and after they reach a production stage, where disruptions can have serious consequences.
This chapter starts with a discussion of software visualization techniques for SPAs (Section 6.2), including the mechanisms to produce effective visual representations of an application's data flow graph topology, its performance metrics, and its live status.
Debugging is intimately related to visualization. Hence, the second half of this chapter focuses on the different types of debugging tasks used in stream processing (Section 6.3).
Visualization
Comprehensive visualization infrastructure is a fundamental tool to support the development, understanding, debugging, and optimization of SPAs.
Let $H_d(n,p)$ signify a random $d$-uniform hypergraph with $n$ vertices in which each of the $\binom{n}{d}$ possible edges is present with probability $p=p(n)$ independently, and let $H_d(n,m)$ denote a uniformly distributed $d$-uniform hypergraph with $n$ vertices and $m$ edges. We derive local limit theorems for the joint distribution of the number of vertices and the number of edges in the largest component of $H_d(n,p)$ and $H_d(n,m)$ in the regime $(d-1)\binom{n-1}{d-1}p>1+\varepsilon$, resp. $d(d-1)m/n>1+\varepsilon$, where $\varepsilon>0$ is arbitrarily small but fixed as $n \to \infty$. The proofs are based on a purely probabilistic approach.
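As a consistency check on the stated regime, specializing to $d=2$ (ordinary graphs) recovers the classical giant-component threshold, namely average degree exceeding one:
\[
  (d-1)\binom{n-1}{d-1}p \,\Big|_{d=2} = (n-1)p > 1+\varepsilon,
  \qquad
  \frac{d(d-1)m}{n} \,\Big|_{d=2} = \frac{2m}{n} > 1+\varepsilon .
\]
Here $(n-1)p$ is the expected degree in $G(n,p)$ and $2m/n$ the average degree in $G(n,m)$, so both conditions place the random graph just inside the supercritical phase.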
The world has become information-driven, with many facets of business and government being fully automated and their systems being instrumented and interconnected. On the one hand, private and public organizations have been investing heavily in deploying sensors and the infrastructure to collect readings from these sensors on a continuous basis. On the other hand, the need to monitor and act on information from sensors in the field, in order to drive rapid decisions, adjust production processes and logistics choices, and, ultimately, better manage physical systems, is now fundamental to many organizations.
The emergence of stream processing was driven by increasingly stringent data management, processing, and analysis needs from business and scientific applications, coupled with the confluence of two major technological and scientific shifts: first, the advances in software and hardware technologies for database, data management, and distributed systems, and, second, the advances in supporting techniques in signal processing, statistics, data mining, and in optimization theory.
In Section 1.2, we will look more deeply into the data processing requirements that led to the design of stream processing systems and applications. In Section 1.3, we will trace the roots of the theoretical and engineering underpinnings that enabled these applications, as well as the middleware supporting them. While providing this historical perspective, we will illustrate how stream processing uses and extends these fundamental building blocks.
Stream processing has emerged from the confluence of advances in data management, parallel and distributed computing, signal processing, statistics, data mining, and optimization theory.
Stream processing is an intuitive computing paradigm where data is consumed as it is generated, computation is performed at wire speed, and results are immediately produced, all within a continuous cycle. The rise of this computing paradigm was the result of the need to support a new class of applications. These analytic-centric applications are focused on extracting intelligence from large quantities of continuously generated data, to provide faster, online, and real-time results. These applications span multiple domains, including environment and infrastructure monitoring, manufacturing, finance, healthcare, telecommunications, physical and cyber security, and, finally, large-scale scientific and experimental research.
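As a minimal illustration of this consume-as-generated pattern (not any particular stream processing system's API), a pipeline of Python generators already exhibits the essential shape: an unbounded source, a continuous operator, and immediately produced results. The sensor model and window size are invented for illustration.

```python
import random
import time
from itertools import islice

def sensor_stream():
    """Unbounded source: yields each reading as soon as it is 'generated'."""
    while True:
        yield random.gauss(20.0, 2.0)   # e.g. a temperature reading
        time.sleep(0.01)

def moving_average(stream, window=10):
    """Continuous operator: consumes one item at a time, emits a result immediately."""
    buf = []
    for reading in stream:
        buf.append(reading)
        if len(buf) > window:
            buf.pop(0)
        yield sum(buf) / len(buf)

# The continuous cycle: results are produced while data keeps arriving.
for avg in islice(moving_average(sensor_stream()), 20):
    print(f"running average: {avg:.2f}")
```

A real SPS distributes such operators across machines and adds flow control, fault tolerance, and management services, but the dataflow shape is the same.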
In this book, we have discussed the emergence of stream processing and the three pillars that sustain it: the programming paradigm, the software infrastructure, and the analytics, which together enable the development of large-scale high-performance SPAs.
In this chapter, we start with a quick recap of the book (Section 13.1), then look at the existing challenges and open problems in stream processing (Section 13.2), and end with a discussion on how this technology may evolve in the coming years (Section 13.3).
Book summary
In the two introductory chapters (Chapters 1 and 2) of the book, we traced the origins of stream processing as well as provided an overview of its technical fundamentals, and a description of the technological landscape in the area of continuous data processing.
Stream processing is a paradigm built to support natural and intuitive ways of designing, expressing, and implementing continuous online high-speed data processing. If we look at systems that manage the critical infrastructure that makes modern life possible, each of their components must be able to sense what is happening externally, by processing continuous inputs, and to respond by continuously producing results and actions. This pattern is very intuitive and is not very dissimilar from how the human body works, constantly sensing and responding to external stimuli. For this reason, stream processing is a natural way to analyze information as well as to interconnect the different components that make such processing fast and scalable.
We wrote this book as a comprehensive reference for students, developers, and researchers to allow them to design and implement their applications using the stream processing paradigm. In many domains, employing this paradigm yields results that better match the needs of certain types of applications, primarily along three dimensions.
First, many applications naturally adhere to a sense-and-respond pattern. Hence, engineering these types of applications is simpler, as both the programming model and the supporting stream processing systems provide abstractions and constructs that match the needs associated with continuously sensing, processing, predicting, and reacting.
Second, the stream processing paradigm naturally supports extensibility and scalability requirements. This allows stream processing applications to better cope with high data volumes, handle fluctuations in the workload and resources, and also readjust to time-varying data and processing characteristics.
In this chapter, we switch the focus from a conceptual description of the SPS architecture to the specifics of one such system, InfoSphere Streams. The concepts, entities, services, and interfaces described in Chapter 7 will now be made concrete by studying the engineering foundations of Streams. We start with a brief account of Streams' research roots and historical context in Section 8.2. In Section 8.3 we discuss user interaction with Streams' application runtime environment.
We then describe the principal components of Streams in Section 8.4, focusing on how these components interact to form a cohesive runtime environment that supports users and applications sharing a Streams instance. In the second half of this chapter we focus on services (Section 8.5), delving into the internals of Streams' architectural components and providing a broader discussion of their service APIs and their steady-state runtime life cycles. In short, we discuss Streams with a top-down description of its application runtime environment, starting with the overall architecture and followed by the specific services provided by the environment.
Finally, we discuss the facets of the architecture that are devoted to supporting application development and tuning.
Background and history
InfoSphere Streams can trace its roots to the System S middleware, which was developed between 2003 and 2009 at IBM Research [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]. The architectural foundations and programming language model in Streams are based on counterparts in System S.
In this paper, a new method is presented that allows an intelligent robotic manipulator system to track a human hand from a distance in 3D space and estimate its position and orientation in real time, with the goal of ultimately using the algorithm with a robotic spherical wrist system. In the proposed algorithm, several image processing and morphology techniques are used in conjunction with various mathematical formulas to calculate the hand position and orientation. The proposed technique was tested on a remote teleguided virtual robotic system. Experimental results show that the proposed method is robust in terms of the processing time required to estimate the position and orientation of the hand.
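The paper's exact pipeline is not given above, so the sketch below shows only the general shape of a segmentation-plus-moments approach of the kind the abstract describes: color thresholding, a morphological opening, and image moments for position and in-plane orientation. The HSV thresholds, the kernel size, and the assumption that the largest blob is the hand are all illustrative (OpenCV 4 API).

```python
import cv2
import numpy as np

# Illustrative skin-color range in HSV; real thresholds depend on lighting.
LOWER_SKIN = np.array([0, 40, 60], dtype=np.uint8)
UPPER_SKIN = np.array([25, 255, 255], dtype=np.uint8)

def hand_pose(frame):
    """Estimate the hand's image position (centroid) and in-plane orientation."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, LOWER_SKIN, UPPER_SKIN)
    # Morphological opening removes speckle noise from the segmentation mask.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    hand = max(contours, key=cv2.contourArea)   # assume largest blob is the hand
    m = cv2.moments(hand)
    if m["m00"] == 0:
        return None
    cx, cy = m["m10"] / m["m00"], m["m01"] / m["m00"]
    # In-plane orientation from second-order central moments.
    theta = 0.5 * np.arctan2(2 * m["mu11"], m["mu20"] - m["mu02"])
    return (cx, cy), theta
```

Recovering full 3D pose, as the paper targets, additionally requires depth or multi-view geometry on top of such per-frame 2D estimates.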
A recent paper by Pozdnyakov and Steele [6] is devoted to distributed sensor networks in which the sensors send summarized information to some remotely located fusion agents. The main focus is on sensors that send their information according to a so-called binary-plus-passive design. Some exact relations are presented in a sequential approach aimed at making decisions before a given budget is exhausted. In this paper, we provide asymptotic results for more general sensors as the budget increases. The essential ingredient is the observation that the setting can be modeled within the framework of certain stopped two-dimensional random walks. We also provide asymptotics for the cost for large decision boundaries and, finally, some comments on the probability of a positive/negative decision, as well as on the duration of the process until a decision is made in the binary-plus-passive design.
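A hedged illustration of the stopped-random-walk viewpoint mentioned above: each sensor report contributes a random (cost, information) increment, and the process is stopped when the accumulated cost first exceeds the budget. The increment distributions and budgets below are invented for illustration; they are not the paper's model.

```python
import random

def stopped_walk(budget, rng=random):
    """Run a 2D random walk of (cost, information) increments until the
    accumulated cost first exceeds the budget; return the stopping time
    and the information accumulated by then."""
    cost = info = 0.0
    steps = 0
    while cost <= budget:
        cost += rng.expovariate(1.0)      # illustrative cost increment
        info += rng.choice([0.0, 1.0])    # illustrative binary observation
        steps += 1
    return steps, info

# Crude Monte Carlo check: the mean stopping time grows linearly in the budget,
# the kind of first-order behavior the asymptotic results make precise.
for budget in (10, 100, 1000):
    runs = [stopped_walk(budget)[0] for _ in range(2000)]
    print(budget, sum(runs) / len(runs))
```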
Timed Concurrent Constraint Programming (tcc) is a declarative model for concurrency offering a logic for specifying reactive systems, i.e., systems that continuously interact with the environment. The universal tcc formalism (utcc) is an extension of tcc with the ability to express mobility. Here mobility is understood as the communication of private names, as is typical for mobile systems and security protocols. In this paper we consider the denotational semantics for tcc and extend it to a “collecting” semantics for utcc based on closure operators over sequences of constraints. Relying on this semantics, we formalize a general framework for data flow analyses of tcc and utcc programs by abstract interpretation techniques. The concrete and abstract semantics that we propose are compositional, thus allowing us to reduce the complexity of data flow analyses. We show that our method is sound and parametric with respect to the abstract domain; thus, different analyses can be performed by instantiating the framework. We illustrate how abstract domains previously defined for logic programming can be reused to perform, for instance, a groundness analysis for tcc programs, and we show the applicability of this analysis in the context of reactive systems. Furthermore, we use the abstract semantics to exhibit a secrecy flaw in a security protocol. Finally, we show how to perform an analysis that can establish that tcc programs are suspension-free, which can be useful for several purposes, such as optimizing compilation or debugging.
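To give a flavor of the kind of analysis the framework supports, the sketch below runs a groundness analysis over a deliberately tiny tcc-like fragment. The process syntax, the abstract store, and the conservative treatment of ask are simplifications invented for illustration; they are not the paper's utcc semantics or its closure-operator domain.

```python
# A toy groundness analysis for a tcc-like fragment.
# Processes: ('tell', var, rhs) | ('par', p, q) | ('ask', var, body).
# Abstract store: the set of variables known to be definitely ground.

def analyze(process, ground):
    """Over-approximate execution, tracking definitely-ground variables."""
    kind = process[0]
    if kind == 'tell':
        _, x, rhs = process
        # x becomes ground if the told value is a constant or a ground variable.
        if isinstance(rhs, int) or rhs in ground:
            return ground | {x}
        return ground
    if kind == 'par':
        # The shared store grows monotonically, so iterate to a fixpoint.
        _, p, q = process
        prev = None
        while ground != prev:
            prev = ground
            ground = analyze(q, analyze(p, ground))
        return ground
    if kind == 'ask':
        # The guard may never be entailed, so the body's tells cannot be assumed.
        return ground
    raise ValueError(f"unknown process: {kind}")

# tell(y = x) in parallel with tell(x = 5): the fixpoint grounds both variables.
prog = ('par', ('tell', 'y', 'x'), ('tell', 'x', 5))
print(analyze(prog, frozenset()))   # x and y are both reported ground
```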
This paper focuses on behavior recognition in an underwater application as a substitute for communication through acoustic transmissions, which can be unreliable. The importance of this work is that sensor information regarding other agents can be leveraged to perform behavior recognition, that is, recognizing which specific programmed behaviors robots are executing, and task assignment. This work illustrates the use of Behavior Histograms, Hidden Markov Models (HMMs), and Conditional Random Fields (CRFs) to perform behavior recognition. We present the challenges associated with each behavior recognition technique, along with results on individually selected test trajectories from simulated and real sonar data, and real-time recognition through a simulated mission.
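The paper's models and features are not reproduced here, but the HMM route can be made concrete with a minimal sketch: posit one HMM per programmed behavior, score an observed trajectory with the forward algorithm under each model, and report the best-scoring behavior. All parameters, behavior names, and the feature sequence below are invented for illustration.

```python
import numpy as np

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | HMM) for discrete observations."""
    alpha = start * emit[:, obs[0]]
    log_l = np.log(alpha.sum())
    alpha = alpha / alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ trans) * emit[:, o]   # predict, then weight by emission
        c = alpha.sum()
        log_l += np.log(c)                      # accumulate scaling factors
        alpha /= c
    return log_l

# One illustrative HMM per programmed behavior; all parameters are invented.
models = {
    "loiter":  (np.array([0.9, 0.1]),                    # initial distribution
                np.array([[0.95, 0.05], [0.10, 0.90]]),  # transitions
                np.array([[0.8, 0.2], [0.3, 0.7]])),     # emissions
    "transit": (np.array([0.5, 0.5]),
                np.array([[0.60, 0.40], [0.40, 0.60]]),
                np.array([[0.2, 0.8], [0.5, 0.5]])),
}

trajectory = [0, 0, 1, 0, 0, 0]   # a discretized feature sequence (illustrative)
best = max(models, key=lambda b: log_likelihood(trajectory, *models[b]))
print("recognized behavior:", best)
```

A CRF-based recognizer would replace this per-behavior generative scoring with a single discriminative sequence model over the same features.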
This chapter reduces the inference problem in probabilistic graphical models to an equivalent maximum weight stable set problem on a graph. We discuss methods for recognizing when the latter problem can be solved efficiently by appealing to perfect graph theory. Furthermore, practical solvers based on convex programming and message-passing are presented.
Tractability is the study of computational tasks with the goal of identifying which problem classes are tractable or, in other words, efficiently solvable. The class of tractable problems is traditionally assumed to be solvable in polynomial time by a deterministic Turing machine and is denoted by P. The class contains many natural tasks such as sorting a set of numbers, linear programming (the decision version), determining if a number is prime, and finding a maximum weight matching. Many interesting problems, however, lie in another class that generalizes P and is known as NP: the class of languages decidable in polynomial time on a non-deterministic Turing machine. We trivially have that P is a subset of NP (many researchers also believe that it is a strict subset). It is believed that many problems in the class NP are, in the worst case, intractable and do not admit efficient inference. Problems such as maximum stable set, the traveling salesman problem, and graph coloring are known to be NP-hard (at least as hard as the hardest problems in NP). It is, therefore, widely suspected that there are no polynomial-time algorithms for NP-hard problems.
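To make the reduction target concrete, the sketch below solves maximum weight stable set by brute-force enumeration; its exponential running time is exactly what the NP-hardness discussion above predicts for general graphs, and it is this cost that the perfect-graph machinery of the chapter avoids on suitable instances. The example graph and weights are illustrative.

```python
from itertools import combinations

def max_weight_stable_set(vertices, edges, weight):
    """Brute force: try all subsets, keep the heaviest one with no internal edge."""
    edge_set = {frozenset(e) for e in edges}
    best, best_w = frozenset(), 0.0
    for r in range(1, len(vertices) + 1):
        for subset in combinations(vertices, r):
            if any(frozenset(p) in edge_set for p in combinations(subset, 2)):
                continue                  # subset contains an edge: not stable
            w = sum(weight[v] for v in subset)
            if w > best_w:
                best, best_w = frozenset(subset), w
    return best, best_w

# A 5-cycle with unit weights: the maximum stable set has size 2.
verts = [0, 1, 2, 3, 4]
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(max_weight_stable_set(verts, edges, {v: 1.0 for v in verts}))
```

In the inference-to-stable-set reduction, each vertex encodes a local assignment and its weight encodes that assignment's score, so a maximum weight stable set corresponds to a most probable configuration.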