Search results for Communications and signal processing

4 - Optimization Landscape of Neural Networks
- By René Vidal, Zhihui Zhu, Benjamin D. Haeffele
Edited by Philipp Grohs, Universität Wien, Austria, Gitta Kutyniok, Ludwig-Maximilians-Universität München
Book:

Mathematical Aspects of Deep Learning

Published online:

29 November 2022

Print publication:

22 December 2022, pp 200-228
- Chapter
Summary

This chapter summarizes recent advances on the analysis of the optimization landscape of neural network training. We first review classical results for linear networks trained with a squared loss and without regularization. Such results show that under certain conditions on the input-output data spurious local minima are guaranteed not to exist, i.e., critical points are either saddle points or global minima. Moreover, the globally optimal weights can be found by factorizing certain matrices obtained from the input-output covariance matrices. We then review recent results for deep networks with parallel structure, positively homogeneous network mapping and regularization, and trained with a convex loss. Such results show that the non-convex objective on the weights can be lower-bounded by a convex objective on the network mapping. Moreover, when the network is sufficiently wide, local minima of the non-convex objective that satisfy a certain condition yield global minima of both the non-convex and convex objectives, and there is always a non-increasing path to a global minimizer from any initialization.

5 - Explaining the Decisions of Convolutional and Recurrent Neural Networks
- By Wojciech Samek, Leila Arras, Ahmed Osman, Grégoire Montavon, Klaus-Robert Müller
Edited by Philipp Grohs, Universität Wien, Austria, Gitta Kutyniok, Ludwig-Maximilians-Universität München
Book:

Mathematical Aspects of Deep Learning

Published online:

29 November 2022

Print publication:

22 December 2022, pp 229-266
- Chapter
Summary

In this chapter we discuss the algorithmic and theoretical underpinnings of layer-wise relevance propagation (LRP), apply the method to a complex model trained for the task of visual question answering (VQA), and demonstrate that it produces meaningful explanations, revealing interesting details about the model’s reasoning. We conclude the chapter by commenting on the general limitations of current explanation techniques and interesting future directions.

Subject Index
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

17 February 2023

Print publication:

22 December 2022, pp 1033-1052
- Chapter

31 - Maximum Likelihood
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

17 March 2023

Print publication:

22 December 2022, pp 1211-1275
- Chapter
Summary

The maximum-likelihood (ML) formulation is one of the most formidable tools for the solution of inference problems in modern statistical analysis. It allows the estimation of unknown parameters in order to fit probability density functions (pdfs) onto data measurements. We introduce the ML approach in this chapter and limit our discussions to properties that will be relevant for the future developments in the text. The presentation is not meant to be exhaustive, but targets key concepts that will be revisited in later chapters. We also avoid anomalous situations and focus on the main features of ML inference that are generally valid under some reasonable regularity conditions.
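As a rough illustration of fitting a pdf onto data measurements, the sketch below computes the ML estimates for a scalar Gaussian model. The Gaussian choice, the function names, and the sample values are assumptions made for this example and are not taken from the chapter.

```python
import math

def gaussian_ml_fit(samples):
    """Return the ML estimates (mean, variance) of a Gaussian fitted to samples."""
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    mean = sum(samples) / n
    # The ML variance estimate divides by n (not n - 1), so it is biased.
    var = sum((x - mean) ** 2 for x in samples) / n
    return mean, var

def gaussian_log_likelihood(samples, mean, var):
    """Log-likelihood of i.i.d. samples under a Gaussian with the given mean and variance."""
    n = len(samples)
    sq = sum((x - mean) ** 2 for x in samples)
    return -0.5 * n * math.log(2 * math.pi * var) - sq / (2 * var)

# Hypothetical measurements, used only to exercise the functions.
data = [2.1, 1.9, 2.4, 1.6, 2.0]
mu_hat, var_hat = gaussian_ml_fit(data)
```

Evaluating `gaussian_log_likelihood` at `(mu_hat, var_hat)` and at nearby parameter values is a quick way to check that the closed-form estimates do maximize the likelihood for this model.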

Frontmatter
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

17 March 2023

Print publication:

22 December 2022, pp i-iv
- Chapter

47 - Q-Learning
- By Ali H. Sayed
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

17 March 2023

Print publication:

22 December 2022, pp 1971-2007
- Chapter
Summary

The temporal learning algorithms TD(0) and TD($λ$) of the previous chapter are useful procedures for state value evaluation; i.e., they permit the estimation of the state value function $v^{π} (s)$ for a given target policy $π (a | s)$ by observing actions and rewards arising from this policy (on‐policy learning) or another behavior policy (off‐policy learning). In most situations, however, we are not interested in state values but rather in determining optimal policies, denoted by $π^{⋆} (a | s)$ (i.e., in selecting what optimal actions an agent should follow in a Markov decision process (MDP)).
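The summary does not spell out the algorithm, so the following is only a minimal sketch of the standard tabular Q-learning update on a hypothetical five-state chain MDP. The environment, the step size `alpha`, the discount `gamma`, and the exploration rate `epsilon` are all assumptions chosen for illustration.

```python
import random

N_STATES = 5        # states 0..4; state 4 is terminal (assumed toy MDP)
ACTIONS = (0, 1)    # 0 = move left, 1 = move right

def step(state, action):
    """Deterministic chain: reward 1 on reaching the last state, 0 otherwise."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn state-action values with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)                 # explore
            else:
                best = max(q[s])                        # exploit, random tie-break
                a = rng.choice([b for b in ACTIONS if q[s][b] == best])
            s_next, r, done = step(s, a)
            # Off-policy target: bootstrap from the greedy action in s_next.
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q

def greedy_policy(q):
    """Greedy action for every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]

q_table = q_learning()
policy = greedy_policy(q_table)
```

Because the target uses the maximum over next-state actions rather than the action the behavior policy actually takes, the learned values describe the greedy policy even though exploration is random.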

71 - Adversarial Attacks
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

24 February 2023

Print publication:

22 December 2022, pp 3065-3098
- Chapter
Summary

We have described a range of supervised learning algorithms in the previous chapters, including several neural network implementations and their training by means of the backpropagation algorithm. The performance of some of these algorithms has been demonstrated in practice to match or even exceed human performance in important applications. At the same time, it has also been observed that the algorithms are susceptible to adversarial attacks that can drive them to erroneous decisions under minimal perturbations to the data. For instance, adding small perturbations to an image that may not even be perceptible to the human eye has been shown to cause learning algorithms to classify the image incorrectly.

Preface
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

24 February 2023

Print publication:

22 December 2022, pp xxvii-xliv
- Chapter

38 - Hidden Markov Models
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

17 March 2023

Print publication:

22 December 2022, pp 1517-1562
- Chapter
Summary

The expectation-maximization (EM) algorithm can be used to estimate the underlying parameters of the conditional probability density functions (pdfs) by approximating the maximum-likelihood (ML) solution. We found that the algorithm operates on a collection of independent observations, where each observation is generated independently from one of the mixture components. In this chapter and the next, we extend this construction and consider hidden Markov models (HMMs), where the mixture component for one observation is now dependent on the component used to generate the most recent past observation.

57 - Principal Component Analysis
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

24 February 2023

Print publication:

22 December 2022, pp 2383-2423
- Chapter
Summary

Oftentimes, the dimension of the feature space, $h_{n} \in \mathbb{R}^{M}$, is prohibitively large either for computational or visualization purposes. In these situations, it becomes necessary to perform an initial dimensionality reduction step where each $h_{n}$ is replaced by a lower-dimensional vector $h_{n}' \in \mathbb{R}^{M'}$ with $M' \ll M$.

33 - Predictive Modeling
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

17 March 2023

Print publication:

22 December 2022, pp 1319-1351
- Chapter
Summary

Maximum likelihood (ML) is a powerful statistical tool that determines model parameters $θ$ in order to fit probability density functions (pdfs) onto data measurements. The estimated pdfs can then be used for at least two purposes. First, they can help construct optimal estimators or classifiers (such as the conditional mean estimator, the maximum a-posteriori (MAP) estimator, or the Bayes classifier) since, as we already know from previous chapters, these optimal constructions require knowledge of the conditional or joint probability distributions of the variables involved in the inference problem. Second, once a pdf is learned, we can sample from it to generate additional observations. For example, consider a database consisting of images of cats and assume we are able to characterize (or learn) the pdf distribution of the pixel values in these images. Then, we could use the learned pdf to generate “fake” cat-like images (i.e., ones that look like real cats). We will learn later in this text that this construction is possible and some machine-learning architectures are based on this principle: They use data to learn what we call a “generative model,” and then use the model to generate “similar” data. We provide a brief explanation to this effect in the next section, where we explain the significance of posterior distributions.

16 - Stochastic Optimization
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

17 February 2023

Print publication:

22 December 2022, pp 547-598
- Chapter

11 - Proximal Operator
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

17 February 2023

Print publication:

22 December 2022, pp 341-374
- Chapter

Frontmatter
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

17 February 2023

Print publication:

22 December 2022, pp i-iv
- Chapter

25 - Decentralized Optimization I: Primal Methods
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

17 February 2023

Print publication:

22 December 2022, pp 902-968
- Chapter

36 - Variational Inference
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

17 March 2023

Print publication:

22 December 2022, pp 1405-1471
- Chapter
Summary

In Chapters 33 and 34 we described three methods for approximating posterior distributions: the Laplace method, the Markov chain Monte Carlo (MCMC) method, and the expectation-propagation (EP) method. Given an observable $y$ and a latent variable $z$, the Laplace method approximates $f_{z ∣ y} (z ∣ y)$ by a Gaussian distribution and was seen to be suitable for problems with small-dimensional latent spaces because its implementation involves a matrix inversion. The Gaussian approximation, however, is not sufficient in many instances and can perform poorly. The MCMC method is more powerful, and also more popular, and relies on elegant sampling techniques and the Metropolis–Hastings algorithm. However, MCMC requires a large number of samples, does not perform well on complex models, and does not scale well to higher dimensions and large datasets. The EP method, on the other hand, limits the class of distributions from which the posterior is approximated to the Gaussian or exponential families, and can be analytically demanding. In this chapter, we develop a fourth powerful method for posterior approximation known as variational inference. One of its advantages is that it usually scales better to large datasets and large dimensions.

Preface
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

17 February 2023

Print publication:

22 December 2022, pp xxvii-xliv
- Chapter

Contents
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

24 February 2023

Print publication:

22 December 2022, pp vii-xxvi
- Chapter

32 - Expectation Maximization
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

17 March 2023

Print publication:

22 December 2022, pp 1276-1318
- Chapter
Summary

We formulated the maximum-likelihood (ML) approach in the previous chapter, where an unknown parameter $θ$ is estimated by maximizing the log-likelihood function. We showed there that in some cases of interest this problem can be solved analytically in closed form and an expression for the parameter estimate can be determined in terms of the observations. However, there are many important scenarios where the ML solution cannot be pursued in closed form, either due to mathematical intractability or due to missing data or hidden variables that are unobservable. In this chapter, we motivate and describe the expectation maximization (EM) procedure as a useful tool for constructing ML estimates under these more challenging conditions. We also illustrate how EM can be used to fit mixture models onto data.
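To make the mixture-fitting use of EM concrete, here is a minimal sketch for a one-dimensional Gaussian mixture. The data, the initial parameters, the variance floor, and the fixed iteration count are assumptions for illustration only; the chapter's own derivation may differ in notation and detail.

```python
import math

def normal_pdf(x, mean, var):
    """Gaussian density with the given mean and variance, evaluated at x."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(data, means, variances, weights, iterations=50):
    """Fit a 1-D Gaussian mixture by alternating E- and M-steps."""
    means, variances, weights = list(means), list(variances), list(weights)
    k, n = len(means), len(data)
    for _ in range(iterations):
        # E-step: posterior probability (responsibility) of each hidden
        # component label given the current parameters.
        resp = []
        for x in data:
            joint = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            resp.append([p / total for p in joint])
        # M-step: responsibility-weighted ML updates of the parameters.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)   # floor (assumed) to avoid collapse
    return means, variances, weights

# Hypothetical data drawn by hand around -2 and 3.
data = [-2.2, -2.0, -1.8, -2.1, -1.9, 3.0, 3.2, 2.8, 3.1, 2.9]
means, variances, weights = em_gmm_1d(data, [-1.0, 1.0], [1.0, 1.0], [0.5, 0.5])
```

The component labels play the role of the hidden variables: the E-step replaces them by their posterior probabilities, and the M-step then has the same closed form as an ordinary weighted ML fit.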

30 - Kalman Filter
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

17 March 2023

Print publication:

22 December 2022, pp 1154-1210
- Chapter
Summary

In this chapter we illustrate one important application of the linear mean-square-error (MSE) theory to the derivation of the famed Kalman filter. The filter is a powerful recursive technique for updating the estimates of the state (hidden) variables of a state-space model from noisy observations. The state evolution satisfies a Markovian property in the sense that the distribution of the state $x_{n}$ at time $n$ is only dependent on the most recent past state, $x_{n - 1}$. Likewise, the distribution of the observation $y_{n}$ at the same time instant is only dependent on the state $x_{n}$. The state and observation variables are represented by a linear state-space model, which will be shown to enable a powerful recursive solution. One key step in the argument is the introduction of the innovations process and the exploitation to great effect of the principle of orthogonality. In Chapter 35 we will allow for nonlinear state-space models and derive the class of particle filters by relying instead on the concept of sequential importance sampling.
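The recursion can be sketched for the scalar case as follows. This assumes a model of the form x_n = a x_{n-1} + u_n and y_n = c x_n + v_n with noise variances `q` and `r`; the parameter names and the scalar restriction are choices made for this example, not the chapter's notation.

```python
def kalman_filter_1d(observations, a, c, q, r, x0, p0):
    """Scalar Kalman filter; returns a list of (state estimate, error variance)."""
    x_hat, p = x0, p0
    estimates = []
    for y in observations:
        # Time update: propagate the estimate through the state evolution.
        x_pred = a * x_hat
        p_pred = a * p * a + q
        # Innovation: the part of y_n not predicted from past observations.
        e = y - c * x_pred
        s = c * p_pred * c + r      # innovation variance
        k = p_pred * c / s          # Kalman gain
        # Measurement update: correct the prediction using the innovation.
        x_hat = x_pred + k * e
        p = (1 - k * c) * p_pred
        estimates.append((x_hat, p))
    return estimates

# A constant state observed in unit-variance noise, with a very vague prior.
track = kalman_filter_1d([1.0, 3.0], a=1.0, c=1.0, q=0.0, r=1.0, x0=0.0, p0=1e9)
```

With `a = c = 1`, `q = 0`, and a nearly uninformative prior, the filter reduces to a running average of the observations, which gives a simple sanity check on the recursion.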
