In this chapter, we consider a joint sampling and scheduling problem for optimizing data freshness in multisource systems. Data freshness is measured by a nondecreasing penalty function of the Age of Information, where all sources have the same age-penalty function. Sources take turns generating update samples and forward them to their destinations one by one through a shared channel with random delay. A scheduler chooses the update order of the sources, and a sampler determines when a source should generate a new sample in its turn. We aim to find the optimal scheduler–sampler pairs that minimize the total-average age-penalty (Ta-AP). We start the chapter with a brief explanation of the sampling problem in the context of single-source networks, along with some useful insights into, and applications of, Age of Information and its penalty functions. We then move on to multisource networks, where the problem becomes more challenging, and provide a detailed explanation of the model and its solution in this case. Finally, we conclude the chapter with an open question in this area and its inherent challenges.
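As a rough illustration of the quantity being optimized, the sketch below estimates the total-average age-penalty under two simple baselines, a round-robin scheduler and a zero-wait sampler, for an assumed quadratic penalty g(age) = age² and exponentially distributed channel delays. None of these choices come from the chapter, and the optimal scheduler–sampler pair derived there will generally differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_ta_ap(num_sources=3, num_updates=10_000, mean_delay=1.0,
                   penalty=lambda age: age**2):   # assumed nondecreasing age-penalty
    """Monte Carlo estimate of the total-average age-penalty (Ta-AP) under a
    round-robin scheduler and a zero-wait sampler (illustrative baselines only)."""
    ages = np.zeros(num_sources)                # current AoI of each source at its destination
    total_penalty = total_time = 0.0
    for k in range(num_updates):
        src = k % num_sources                   # round-robin scheduling
        delay = rng.exponential(mean_delay)     # random channel delay for this update
        grown = ages + delay                    # all ages grow linearly during the delay
        # accumulate the age-penalty integral over the delay (trapezoidal approximation)
        total_penalty += 0.5 * delay * np.sum(penalty(ages) + penalty(grown))
        ages = grown
        ages[src] = delay                       # zero-wait sample was generated `delay` ago
        total_time += delay
    return total_penalty / total_time

print(simulate_ta_ap())
```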
While Age of Information (AoI) has gained importance as a metric characterizing the freshness of information in information-update systems and time-critical applications, most previous studies on AoI have been theoretical. In this chapter, we compile a set of recent works reporting AoI measurements in real-life networks and experimental testbeds, and investigating practical issues such as synchronization, the role of various transport layer protocols, congestion control mechanisms, the application of machine learning for adaptation to network conditions, and device-related bottlenecks such as limited processing power.
In this chapter, we discuss the relationship between Age of Information and signal estimation error in real-time signal sampling and reconstruction. Consider a remote estimation system where samples of a scalar Gauss–Markov signal are taken at a source node and forwarded to a remote estimator through a channel that is modeled as a queue. The estimator reconstructs an estimate of the real-time signal value from causally received samples. The optimal sampling policy for minimizing the mean square estimation error is presented, in which a new sample is taken once the instantaneous estimation error exceeds a predetermined threshold. When the sampler has no knowledge of the current and past signal values, the optimal sampling problem reduces to one of minimizing a nonlinear Age of Information metric. In the AoI-optimal sampling policy, a new sample is taken once the expected estimation error exceeds a threshold. The threshold can be computed by low-complexity algorithms, and the insights behind these algorithms are provided. These optimal sampling results were established (i) for general service time distributions of the queueing server, (ii) for both stable and unstable scalar Gauss–Markov signals, and (iii) for sampling problems both with and without a sampling rate constraint.
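The following toy sketch illustrates the threshold structure of such a sampling policy on a discrete-time scalar Gauss–Markov signal. The signal parameters and the threshold value are arbitrary, and the channel delay is ignored, so this is only a caricature of the policies analyzed in the chapter, where the threshold is computed by low-complexity algorithms.

```python
import numpy as np

rng = np.random.default_rng(1)

# Discrete-time scalar Gauss-Markov signal: x[k+1] = a*x[k] + w[k], w[k] ~ N(0, q)
a, q, threshold, horizon = 0.95, 1.0, 2.0, 100_000

x = x_hat = 0.0
samples_taken, sq_err = 0, 0.0

for _ in range(horizon):
    x = a * x + rng.normal(scale=np.sqrt(q))
    x_hat = a * x_hat                  # estimator propagates the last delivered sample
    sq_err += (x - x_hat) ** 2
    # threshold rule: take a new sample once the instantaneous error gets too large
    if abs(x - x_hat) >= threshold:
        x_hat = x                      # channel delay is ignored in this toy sketch
        samples_taken += 1

print("MSE:", sq_err / horizon, "sampling rate:", samples_taken / horizon)
```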
This chapter explores Age of Information (AoI) in the context of the timely source coding problem. In most of the existing literature, service (transmission) times are based on a given distribution. In the timely source coding problem, by using source coding schemes, we design the transmission times of the status updates. We observe that the average age minimization problem is different from the traditional source coding problem, as the average age depends on both the first and the second moments of the codeword lengths. For the age minimization problem, we first consider a greedy source coding scheme in which all realizations are encoded. For this source coding scheme, we find the age-optimal real-valued codeword lengths. Then, we explore the highest-k selective encoding scheme, where instead of encoding all realizations, we encode only the k most probable realizations. For each source encoding scheme, we first determine the average age expressions and then, for a given pmf, characterize the age-optimal k value and find the corresponding age-optimal codeword lengths. Through numerical results, we show that selective encoding schemes achieve a lower average age than encoding all realizations.
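The snippet below illustrates the highest-k selective encoding idea on a toy pmf, reporting the first and second moments of the codeword lengths on which the average age depends. Shannon-type lengths -log2(p) are used here only as a simple stand-in; the age-optimal real-valued lengths derived in the chapter are generally different.

```python
import numpy as np

def codeword_length_moments(pmf, k):
    """Keep the k most probable realizations, renormalize, and report the first and
    second moments of (real-valued) codeword lengths, using Shannon-type lengths
    l_i = -log2(p_i) as an illustrative stand-in."""
    p = np.sort(np.asarray(pmf, dtype=float))[::-1][:k]
    p = p / p.sum()                        # conditional pmf over the encoded realizations
    lengths = -np.log2(p)
    return lengths @ p, (lengths**2) @ p   # E[L], E[L^2]

pmf = [0.4, 0.3, 0.15, 0.1, 0.05]          # made-up pmf for illustration
for k in range(1, len(pmf) + 1):
    m1, m2 = codeword_length_moments(pmf, k)
    print(f"k={k}: E[L]={m1:.3f}, E[L^2]={m2:.3f}")
```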
At the forefront of cutting-edge technologies, this text provides a comprehensive treatment of a crucial network performance metric, ushering in new opportunities for rethinking the whole design of communication systems. A detailed exposition of the communication and network theoretic foundations of Age of Information (AoI) gives the reader a solid background, and discussion of the implications for signal processing and control theory sheds light on the important potential of recent research. The text includes extensive real-world applications of this vital metric, including caching, the Internet of Things (IoT), and energy harvesting networks. The far-reaching applications of AoI include networked monitoring systems, cyber-physical systems such as the IoT, and information-oriented systems and data analytics applications ranging from the stock market to social networks. The future of this exciting subject in 5G communication systems and beyond makes this a vital resource for graduate students, researchers, and professionals.
This self-contained introduction to machine learning, designed from the start with engineers in mind, will equip students with everything they need to start applying machine learning principles and algorithms to real-world engineering problems. With a consistent emphasis on the connections between estimation, detection, information theory, and optimization, it includes: an accessible overview of the relationships between machine learning and signal processing, providing a solid foundation for further study; clear explanations of the differences between state-of-the-art techniques and more classical methods, equipping students with all the understanding they need to make informed technique choices; demonstration of the links between information-theoretical concepts and their practical engineering relevance; reproducible examples using Matlab, enabling hands-on student experimentation. Assuming only a basic understanding of probability and linear algebra, and accompanied by lecture slides and solutions for instructors, this is the ideal introduction to machine learning for engineering students of all disciplines.
We continue our discussion of hidden Markov models (HMMs) and consider in this chapter the solution of decoding problems. Specifically, given a sequence of observations, we would like to devise mechanisms that allow us to estimate the underlying sequence of state or latent variables. That is, we would like to recover the state evolution that “most likely” explains the measurements. We already know how to perform decoding for the case of mixture models with independent observations by using (38.12a)–(38.12b). The solution is more challenging for HMMs because of the dependency among the states.
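A standard way to solve this decoding problem is dynamic programming over the trellis of states. The sketch below is a generic log-domain Viterbi-style recursion for a discrete-observation HMM, with notation chosen here for illustration; it runs in O(T·S²) time for T observations and S states.

```python
import numpy as np

def viterbi(log_pi, log_A, log_B, obs):
    """Most likely hidden state sequence for an HMM (log-domain Viterbi recursion).
    log_pi: (S,) initial log-probs; log_A: (S,S) transition log-probs;
    log_B: (S,O) emission log-probs; obs: sequence of observation indices."""
    S, T = len(log_pi), len(obs)
    delta = np.zeros((T, S))            # best log-score of any path ending in each state
    psi = np.zeros((T, S), dtype=int)   # backpointers
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A    # scores[i, j]: come from state i, go to j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
    # backtrack the best path
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path
```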
The various reinforcement learning algorithms described in the last two chapters rely on estimating state values or state–action values directly.
One prominent application of the variational inference methodology of Chapter 36 arises in the context of topic modeling. In this application, the objective is to discover similarities between texts or documents such as news articles. For example, given a large library of articles, running perhaps into the millions, such as a database of newspaper articles written over 100 years, it would be useful to be able to discover in an automated manner the multitude of topics that are covered in the database and to cluster together articles dealing with similar topics such as sports or health or politics. In another example, when a user is browsing an article online, it would be useful to be able to identify automatically the subject matter of the article in order to recommend to the reader other articles of similar content. Latent Dirichlet allocation (or LDA) refers to the procedure that results from applying variational inference techniques to topic modeling in order to address questions of this type.
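As a quick illustration of the procedure (not of the chapter's derivation), the sketch below fits an LDA model with scikit-learn, whose implementation is based on variational inference, to a made-up toy corpus and prints the top words of each discovered topic; the corpus and the number of topics are illustrative choices only.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "the team won the match in the final minutes",
    "the election results shifted the parliament majority",
    "a balanced diet and exercise improve health",
    "the striker scored twice and the coach praised the defense",
    "the new health policy was debated in parliament",
]

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(docs)                 # document-term count matrix
vocab = vec.get_feature_names_out()

lda = LatentDirichletAllocation(n_components=3, random_state=0)
doc_topics = lda.fit_transform(X)           # per-document topic proportions

print(doc_topics.round(2))
for k, topic in enumerate(lda.components_):
    top_words = [vocab[i] for i in topic.argsort()[-4:][::-1]]
    print(f"topic {k}:", top_words)
```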
This chapter summarizes recent advances in the analysis of the optimization landscape of neural network training. We first review classical results for linear networks trained with a squared loss and without regularization. Such results show that, under certain conditions on the input-output data, spurious local minima are guaranteed not to exist, i.e., critical points are either saddle points or global minima. Moreover, the globally optimal weights can be found by factorizing certain matrices obtained from the input-output covariance matrices. We then review recent results for deep networks with parallel structure, positively homogeneous network mapping and regularization, and trained with a convex loss. Such results show that the non-convex objective on the weights can be lower-bounded by a convex objective on the network mapping. Moreover, when the network is sufficiently wide, local minima of the non-convex objective that satisfy a certain condition yield global minima of both the non-convex and convex objectives, and there is always a non-increasing path to a global minimizer from any initialization.
In this chapter we discuss the algorithmic and theoretical underpinnings of layer-wise relevance propagation (LRP), apply the method to a complex model trained for the task of visual question answering (VQA), and demonstrate that it produces meaningful explanations, revealing interesting details about the model’s reasoning. We conclude the chapter by commenting on the general limitations of current explanation techniques and interesting future directions.
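For concreteness, here is a minimal NumPy sketch of the basic LRP-ε redistribution rule through a single fully connected layer. It is only the elementary building block, not the full pipeline applied to the VQA model in the chapter, and the variable names are ours.

```python
import numpy as np

def lrp_dense(a, W, b, R_out, eps=1e-6):
    """Basic LRP-epsilon relevance redistribution through one dense layer.
    a: (d_in,) input activations; W: (d_in, d_out) weights; b: (d_out,) biases;
    R_out: (d_out,) relevance of the layer outputs. Returns relevance of the inputs."""
    z = a[:, None] * W                              # individual contributions z_ij = a_i * W_ij
    denom = z.sum(axis=0) + b                       # pre-activations of the layer
    denom = denom + eps * np.where(denom >= 0, 1.0, -1.0)   # epsilon stabilizer
    return (z / denom) @ R_out                      # R_i = sum_j (z_ij / denom_j) * R_j
```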
The maximum-likelihood (ML) formulation is one of the most formidable tools for the solution of inference problems in modern statistical analysis. It allows the estimation of unknown parameters in order to fit probability density functions (pdfs) onto data measurements. We introduce the ML approach in this chapter and limit our discussions to properties that will be relevant for the future developments in the text. The presentation is not meant to be exhaustive, but targets key concepts that will be revisited in later chapters. We also avoid anomalous situations and focus on the main features of ML inference that are generally valid under some reasonable regularity conditions.
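As a small worked example of the ML principle, the snippet below fits a Gaussian pdf to synthetic measurements using the closed-form maximum-likelihood estimates of the mean and variance; the data and parameter values are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.normal(loc=3.0, scale=1.5, size=5_000)   # synthetic measurements

# Closed-form ML estimates for a Gaussian pdf fitted to the measurements:
mu_ml = data.mean()                    # maximizes the log-likelihood over the mean
var_ml = ((data - mu_ml) ** 2).mean()  # ML variance uses 1/N, not the unbiased 1/(N-1)

print(f"ML estimates: mean={mu_ml:.3f}, variance={var_ml:.3f}")
```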
The temporal learning algorithms TD(0) and TD(λ) of the previous chapter are useful procedures for state value evaluation; i.e., they permit the estimation of the state value function for a given target policy by observing actions and rewards arising from this policy (on-policy learning) or another behavior policy (off-policy learning). In most situations, however, we are not interested in state values but rather in determining optimal policies, i.e., in selecting what optimal actions an agent should follow in a Markov decision process (MDP).
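To recall what state value evaluation looks like in code, the sketch below runs TD(0) on the classic five-state random-walk chain under a random policy; the environment, step size, and episode count are illustrative choices rather than anything taken from the chapter.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy 5-state random-walk chain: from states 1..3 move left/right with prob 1/2,
# states 0 and 4 are terminal, with reward 1 only when entering state 4.
num_states, alpha, gamma, episodes = 5, 0.1, 1.0, 5_000
V = np.zeros(num_states)                    # state value estimates (terminals stay 0)

for _ in range(episodes):
    s = 2                                   # start in the middle state
    while s not in (0, num_states - 1):
        s_next = s + rng.choice([-1, 1])    # action of the evaluated (random) policy
        r = 1.0 if s_next == num_states - 1 else 0.0
        # TD(0) update: move V(s) toward the bootstrapped target r + gamma * V(s')
        V[s] += alpha * (r + gamma * V[s_next] - V[s])
        s = s_next

print(np.round(V[1:-1], 3))   # estimates settle near [0.25, 0.5, 0.75]
```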