Search results for Communications and signal processing

2 - Generalization in Deep Learning
- By K. Kawaguchi, Y. Bengio, L. Kaelbling
Edited by Philipp Grohs, Universität Wien, Austria, Gitta Kutyniok, Ludwig-Maximilians-Universität München
Book:

Mathematical Aspects of Deep Learning

Published online:

29 November 2022

Print publication:

22 December 2022, pp 112-148
- Chapter
Summary

This chapter provides theoretical insights into why and how deep learning can generalize well, despite its large capacity, complexity, possible algorithmic instability, non-robustness, and sharp minima, responding to an open question in the literature. We also discuss approaches to provide non-vacuous generalization guarantees for deep learning. On the basis of the theoretical observations, we propose new open problems.

43 - Undirected Graphs
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

17 March 2023

Print publication:

22 December 2022, pp 1740-1806
- Chapter
Summary

The discussion in the last two chapters focused on directed graphical models or Bayesian networks, where a directed link from a variable $x_{1}$ toward another variable $x_{2}$ carries with it an implicit connotation of “causal effect” by $x_{1}$ on $x_{2}$. In many instances, this implication need not be appropriate or can even be limiting. For example, there are cases where conditional independence relations cannot be represented by a directed graph. One such example is provided in Prob. 43.1. In this chapter, we examine another form of graphical representation where the links are not required to be directed anymore, and the probability distributions are replaced by potential functions. These are strictly positive functions defined over sets of connected nodes; they broaden the level of representation by graphical models. The potential functions carry with them a connotation of “similarity” or “affinity” among the variables, but can also be rolled back to represent probability distributions. Over undirected graphs, edges linking nodes will continue to reflect pairwise relationships between the variables but will lead to a fundamental factorization result in terms of the product of clique potential functions. We will show that these functions play a prominent role in the development of message-passing algorithms for the solution of inference problems.

41 - Bayesian Networks
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

17 March 2023

Print publication:

22 December 2022, pp 1643-1681
- Chapter
Summary

The inference of a random variable $x$ from observations $\{y_{1}, y_{2}, \dots, y_{N}\}$ requires that we evaluate the posterior distribution $f_{x \mid y_{1:N}}(x \mid y_{1}, \dots, y_{N})$, as happens, for example, in inference formulations based on mean-square-error (MSE), maximum a-posteriori (MAP), or probability of error metrics. In previous chapters, we described several techniques to facilitate the computation or approximation of such posterior distributions using Monte Carlo or variational inference methods. We will encounter other types of approximations in later chapters. For example, in the context of naïve Bayes classifiers in Chapter 55, we will assume that, conditioned on the latent variable $x$, the observations are independent of each other in order to write

Contents
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

17 February 2023

Print publication:

22 December 2022, pp vii-xxvi
- Chapter

10 - Dynamical Systems and Optimal Control Approach to Deep Learning
- By Weinan E, Jiequn Han, Qianxiao Li
Edited by Philipp Grohs, Universität Wien, Austria, Gitta Kutyniok, Ludwig-Maximilians-Universität München
Book:

Mathematical Aspects of Deep Learning

Published online:

29 November 2022

Print publication:

22 December 2022, pp 422-438
- Chapter
Summary

We give a concise review of the dynamical systems and control theory approach to deep learning. From the viewpoint of dynamical systems, the back-propagation algorithm in deep learning becomes a simple consequence of the variational equations in ODEs. From the viewpoint of control theory, deep learning is a case of mean-field control in that all the agents share the same control. As an application, we discuss a new class of algorithms for deep learning based on Pontryagin’s maximum principle in control theory.

21 - Convergence Analysis III: Stochastic Proximal Algorithms
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

17 February 2023

Print publication:

22 December 2022, pp 756-778
- Chapter

45 - Value and Policy Iterations
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

17 March 2023

Print publication:

22 December 2022, pp 1853-1916
- Chapter
Summary

We continue our treatment of Markov decision processes (MDPs) and focus in this chapter on methods for determining optimal actions or policies. We derive two popular methods known as value iteration and policy iteration, and establish their convergence properties. We also examine the Bellman optimality principle in the context of value and policy learning. In a later section, we extend the discussion to the more challenging case of partially observable MDPs (POMDPs), where the successive states of the MDP are unobservable to the agent, and the agent is only able to sense measurements emitted randomly by the MDP from the various states. We will define POMDPs and explain that they can be reformulated as belief‐MDPs with continuous (rather than discrete) states. This fact complicates the solution of the value iteration. Nevertheless, we will show that the successive value iterates share a useful property, namely, that they are piecewise linear and convex. This property can be exploited by computational methods to reduce the complexity of solving the value iteration for POMDPs.

6 - Stochastic Feedforward Neural Networks: Universal Approximation
- By Thomas Merkh, Guido Montúfar
Edited by Philipp Grohs, Universität Wien, Austria, Gitta Kutyniok, Ludwig-Maximilians-Universität München
Book:

Mathematical Aspects of Deep Learning

Published online:

29 November 2022

Print publication:

22 December 2022, pp 267-313
- Chapter
Summary

We take a look at the universal approximation question for stochastic feedforward neural networks. In contrast with deterministic networks, which represent mappings from inputs to outputs, stochastic networks represent mappings from inputs to probability distributions over outputs. Even if the sets of inputs and outputs are finite, the set of stochastic mappings is continuous. Moreover, the values of the output variables may be correlated, which requires that their values are computed jointly. A prominent class of stochastic feedforward networks are deep belief networks. We discuss the representational power in terms of compositions of Markov kernels expressed by the layers of the network. We investigate different types of shallow and deep architectures, and the minimal number of layers and units that are necessary and sufficient in order for the network to be able to approximate any stochastic mapping arbitrarily well. The discussion builds on notions of probability sharing, focusing on the case of binary variables and sigmoid units. After reviewing existing results, we present a detailed analysis of shallow networks and a unified analysis for a variety of deep networks.

18 - Gradient Noise
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

17 February 2023

Print publication:

22 December 2022, pp 642-682
- Chapter

59 - Logistic Regression
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

24 February 2023

Print publication:

22 December 2022, pp 2457-2498
- Chapter
Summary

In this chapter, we describe a popular discriminative approach for classification problems, known as logistic regression. Assuming binary classification with labels $\gamma \in \{\pm 1\}$ and features $h \in \mathbb{R}^{M}$, we explained earlier in expression (28.85) that the optimal Bayes classifier for predicting $\gamma$ is given by

54 - Decision Trees
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

24 February 2023

Print publication:

22 December 2022, pp 2313-2340
- Chapter
Summary

We mentioned earlier in Section 52.3 that the nearest-neighbor (NN) rule for classification and clustering treats equally all attributes within each feature vector, $h_{n} \in \mathbb{R}^{M}$. If, for example, some attributes are more relevant to the classification task than other attributes, then this aspect is ignored by the NN classifier because all entries of the feature vector will contribute similarly to the calculation of Euclidean distances and the determination of neighborhoods.

Contents
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

17 March 2023

Print publication:

22 December 2022, pp vii-xxvi
- Chapter

9 - Deep Generative Models and Inverse Problems
- By Alexandros G. Dimakis
Edited by Philipp Grohs, Universität Wien, Austria, Gitta Kutyniok, Ludwig-Maximilians-Universität München
Book:

Mathematical Aspects of Deep Learning

Published online:

29 November 2022

Print publication:

22 December 2022, pp 400-421
- Chapter
Summary

Deep generative models have recently been proposed as modular data-driven priors to solve inverse problems. Linear inverse problems involve the reconstruction of an unknown signal (e.g. a tomographic image) from an underdetermined system of noisy linear measurements. Most results in the literature require that the reconstructed signal has some known structure, e.g. it is sparse in some known basis (usually Fourier or wavelet). Such prior assumptions can be replaced with pre-trained deep generative models (e.g. generative adversarial networks (GANs) and variational autoencoders (VAEs)) with significant performance gains. This chapter surveys this rapidly evolving research area and includes empirical and theoretical results in compressed sensing for deep generative models.

1 - The Modern Mathematics of Deep Learning
- By Julius Berner, Philipp Grohs, Gitta Kutyniok, Philipp Petersen
Edited by Philipp Grohs, Universität Wien, Austria, Gitta Kutyniok, Ludwig-Maximilians-Universität München
Book:

Mathematical Aspects of Deep Learning

Published online:

29 November 2022

Print publication:

22 December 2022, pp 1-111
- Chapter
Summary

We describe the new field of the mathematical analysis of deep learning. This field emerged around a list of research questions that were not answered within the classical framework of learning theory. These questions concern: the outstanding generalization power of overparametrized neural networks, the role of depth in deep architectures, the apparent absence of the curse of dimensionality, the surprisingly successful optimization performance despite the non-convexity of the problem, understanding what features are learned, why deep architectures perform exceptionally well in physical problems, and which fine aspects of an architecture affect the behavior of a learning task in which way. We present an overview of modern approaches that yield partial answers to these questions. For selected approaches, we describe the main ideas in more detail.

22 - Variance-Reduced Methods I: Uniform Sampling
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

17 February 2023

Print publication:

22 December 2022, pp 779-815
- Chapter

4 - Gaussian Distribution
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

17 February 2023

Print publication:

22 December 2022, pp 132-166
- Chapter

Contents
Edited by Philipp Grohs, Universität Wien, Austria, Gitta Kutyniok, Ludwig-Maximilians-Universität München
Book:

Mathematical Aspects of Deep Learning

Published online:

29 November 2022

Print publication:

22 December 2022, pp v-xii
- Chapter

Author Index
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

17 March 2023

Print publication:

22 December 2022, pp 2121-2144
- Chapter

Notation
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

24 February 2023

Print publication:

22 December 2022, pp xlv-lii
- Chapter

67 - Convolutional Networks
Ali H. Sayed, École Polytechnique Fédérale de Lausanne
Book:

Inference and Learning from Data

Published online:

24 February 2023

Print publication:

22 December 2022, pp 2838-2904
- Chapter
Summary

Convolutional neural networks (CNNs) are prevalent in computer vision, image, speech, and language processing applications, where they have been successfully applied to perform classification tasks at high accuracy rates. One of their main attractions is the ability to operate directly on raw input signals, such as images, and to extract salient features automatically from the raw data. The designer does not need to worry about which features to select to drive the classification process.

6829 results in Communications and signal processing