In Chapter 6 we start by explaining an important property of mutual information known as tensorization (or single-letterization), which allows one to maximize and minimize mutual information between two high-dimensional vectors by reducing the problem to single-letter (scalar) optimizations. Next, we extend the information measures discussed in previous chapters from random variables to random processes by introducing the concepts of entropy rate (for a stochastic process) and mutual information rate (for a pair of stochastic processes).
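As a brief illustration of the flavor of these notions (our notation, assuming a memoryless channel $P_{Y^n|X^n}=\prod_{i=1}^n P_{Y_i|X_i}$ and a stationary process $\{X_i\}$): tensorization lets the $n$-dimensional maximization decompose into single-letter problems, and the entropy rate is the limiting per-letter entropy,
\[
\max_{P_{X^n}} I(X^n;Y^n) \;=\; \sum_{i=1}^n \max_{P_{X_i}} I(X_i;Y_i),
\qquad
H(\mathbb{X}) \;=\; \lim_{n\to\infty}\frac{1}{n}\,H(X_1,\dots,X_n).
\]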
This enthusiastic introduction to the fundamentals of information theory builds from classical Shannon theory through to modern applications in statistical learning, equipping students with a uniquely well-rounded and rigorous foundation for further study. The book introduces core topics such as data compression, channel coding, and rate-distortion theory using a unique finite blocklength approach. With over 210 end-of-part exercises and numerous examples, students are introduced to contemporary applications in statistics, machine learning, and modern communication theory. This textbook presents information-theoretic methods with applications in statistical learning and computer science, such as f-divergences, PAC-Bayes and variational principle, Kolmogorov’s metric entropy, strong data-processing inequalities, and entropic upper bounds for statistical estimation. Accompanied by additional stand-alone chapters on more specialized topics in information theory, this is the ideal introductory textbook for senior undergraduate and graduate students in electrical engineering, statistics, and computer science.
There are four fundamental optimization problems arising in information theory: I-projection, maximum likelihood, rate distortion, and capacity. In Chapter 5 we show that all of these problems have convex/concave objective functions, discuss iterative algorithms for solving them, and study the capacity problem in more detail. As an application, we show that the Gaussian distribution extremizes mutual information in various problems with second-moment constraints.
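As an illustration of the kind of iterative algorithm alluded to above, here is a minimal sketch (in Python/NumPy, not taken from the book) of the classical Blahut–Arimoto iteration for the capacity of a discrete memoryless channel; the function name and the binary-symmetric-channel example are ours.

```python
import numpy as np

def blahut_arimoto(W, tol=1e-9, max_iter=10000):
    """Capacity (in bits) of a discrete memoryless channel.

    W[x, y] = P(Y = y | X = x); each row of W must sum to 1.
    Returns (capacity estimate, capacity-achieving input distribution).
    """
    m = W.shape[0]
    p = np.full(m, 1.0 / m)                      # start from the uniform input
    for _ in range(max_iter):
        q = p @ W                                # output distribution induced by p
        with np.errstate(divide="ignore", invalid="ignore"):
            log_ratio = np.where(W > 0, np.log(W / q), 0.0)
        D = np.exp((W * log_ratio).sum(axis=1))  # exp of D(W(.|x) || q) for each input x
        p_new = p * D / np.sum(p * D)            # multiplicative update
        if np.abs(p_new - p).max() < tol:
            p = p_new
            break
        p = p_new
    q = p @ W
    with np.errstate(divide="ignore", invalid="ignore"):
        log2_ratio = np.where(W > 0, np.log2(W / q), 0.0)
    return float((p[:, None] * W * log2_ratio).sum()), p

# Binary symmetric channel with crossover 0.11: capacity ~ 1 - h(0.11) ~ 0.5 bit/use
W_bsc = np.array([[0.89, 0.11],
                  [0.11, 0.89]])
C, p_star = blahut_arimoto(W_bsc)
print(round(C, 4), p_star)
```

The multiplicative update is the standard alternating-maximization step; for the symmetric example above it converges immediately to the uniform input.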
Chapter 22 is a survey of results on fundamental limits obtained in the 75 years since Shannon. We discuss topics such as the strong converse, channel dispersion, error exponents, and finite-blocklength bounds. In particular, error exponents characterize the rate at which the probability of error converges to 0 or to 1 (depending on whether the coding rate is below or above capacity). Finite-blocklength results aim to prove computationally efficient bounds that tightly characterize non-asymptotic fundamental limits.
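To give one concrete formula of this type (stated informally, in our notation): for many channels the maximal number of messages $M^*(n,\epsilon)$ that can be transmitted over $n$ channel uses with error probability at most $\epsilon$ satisfies the normal approximation
\[
\log M^*(n,\epsilon) \;=\; nC - \sqrt{nV}\,Q^{-1}(\epsilon) + O(\log n),
\]
where $C$ is the capacity, $V$ is the channel dispersion, and $Q^{-1}$ is the inverse of the Gaussian complementary CDF.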
In this chapter, we discuss the decision-theoretic framework of statistical estimation and introduce several important examples. Section 28.1 presents the basic elements of a statistical experiment and of statistical estimation. Section 28.3 introduces the Bayes risk (average-case) and the minimax risk (worst-case) as the respective fundamental limits of statistical estimation in the Bayesian and frequentist settings, the latter being our primary focus in this part. We discuss several versions of the minimax theorem (and prove a simple one) equating the minimax risk with the worst-case Bayes risk. Two variants are then introduced that extend a basic statistical experiment to either large sample size or large dimension: Section 28.4 on independent observations and Section 28.5 on tensorization of experiments. Throughout this chapter the Gaussian location model (GLM), introduced in Section 28.2, serves as a running example, with a different focus at different points (the role of loss functions, parameter spaces, low versus high dimensions, etc.). In Section 28.6, we discuss a key result, known as Anderson’s lemma, for determining the exact minimax risk of the (unconstrained) GLM in any dimension for a broad class of loss functions; it provides a benchmark for the more general techniques introduced in later chapters.
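In one common notation (ours, not necessarily the chapter's), with loss $\ell$, estimator $\hat\theta$, parameter set $\Theta$, and prior $\pi$, the two risks are
\[
R^*_{\mathrm{Bayes}}(\pi) \;=\; \inf_{\hat\theta}\,\mathbb{E}_{\theta\sim\pi}\,\mathbb{E}\big[\ell(\theta,\hat\theta)\big],
\qquad
R^* \;=\; \inf_{\hat\theta}\,\sup_{\theta\in\Theta}\,\mathbb{E}\big[\ell(\theta,\hat\theta)\big],
\]
and a minimax theorem asserts that, under suitable conditions, $R^* = \sup_{\pi} R^*_{\mathrm{Bayes}}(\pi)$.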
In Chapter 12, we examine results for a large class of processes with memory, known as ergodic processes. We start the chapter with a quick review of the main concepts of ergodic theory and then state our main results: the Shannon–McMillan theorem, the compression limit, and the asymptotic equipartition property (AEP). Subsequent sections are dedicated to proofs of the Shannon–McMillan and ergodic theorems. Finally, in the last section we introduce Kolmogorov–Sinai entropy, which associates with a fully deterministic transformation a measure of how “chaotic” it is. This concept plays an important role in formalizing an apparent paradox: large mechanical systems (such as collections of gas particles) are, on the one hand, fully deterministic (described by Newton’s laws of motion) and, on the other hand, exhibit many probabilistic properties (the Maxwell distribution of velocities, fluctuations, etc.). Kolmogorov–Sinai entropy shows how these two notions can coexist. In addition, it was used to resolve a long-standing open problem in dynamical systems regarding the isomorphism of Bernoulli shifts.
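As a pointer to the form these results take (our notation): for a stationary ergodic process $\{X_i\}$ with entropy rate $H$, the Shannon–McMillan theorem states that
\[
-\frac{1}{n}\log P_{X_1^n}(X_1,\dots,X_n) \;\to\; H \qquad (n\to\infty)
\]
in probability (and almost surely under Breiman’s strengthening); this is the AEP underlying the compression limit.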
In Chapter 14 we first define a performance metric giving a full description of the binary hypothesis testing (BHT) problem. A key result in this theory, the Neyman–Pearson lemma, determines the form of the optimal test and at the same time characterizes this performance metric. We then specialize to the setting of iid observations and consider two types of asymptotics: Stein’s regime (where the type-I error is held constant) and Chernoff’s regime (where errors of both types are required to decay exponentially). In this chapter we discuss only Stein’s regime and find that the fundamental limit is given by the KL divergence; subsequent chapters address Chernoff’s regime.
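In formula form (our notation): if $\beta_n(\alpha)$ denotes the smallest type-II error achievable from $n$ iid observations while keeping the type-I error at most $\alpha$, then Stein’s regime corresponds to the statement that, for every fixed $\alpha\in(0,1)$,
\[
\lim_{n\to\infty} -\frac{1}{n}\log\beta_n(\alpha) \;=\; D(P\|Q),
\]
where $P$ and $Q$ are the observation distributions under the two hypotheses.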
In the previous chapter we introduced the concept of variable-length compression and studied its fundamental limits (with and without the prefix-free condition). In some situations, however, one may desire that the output of the compressor always has a fixed length, say k bits. Unless k is unreasonably large, this requires relaxing the losslessness condition. This is the focus of Chapter 11: compression in the presence of a (typically vanishingly small) probability of error. It turns out that allowing even a very small error enables several beautiful effects: the possibility to compress data via matrix multiplication over finite fields (linear compression); the possibility to reduce the compression length if side information is available at the decompressor (Slepian–Wolf); and the possibility to reduce the compression length if access to a compressed representation of the side information is available at the decompressor (Ahlswede–Körner–Wyner).
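In terms of asymptotic rates (an informal summary in our notation), for an iid source $X$ compressed with vanishing error probability, the first two effects correspond to the achievable rates
\[
R > H(X) \;\;\text{(lossless fixed-length, with linear compressors sufficing)}, \qquad R > H(X\mid Y) \;\;\text{(side information $Y$ at the decompressor, Slepian–Wolf)}.
\]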
In Chapter 19 we apply the methods developed in the previous chapters (namely the weak converse and random/maximal-coding achievability) to compute the channel capacity. This notion quantifies the maximal number of (data) bits that can be reliably communicated per channel use in the limit of using the channel many times. Formalizing this statement requires introducing the concept of a communication channel. Then, for special kinds of channels (memoryless and information-stable ones), we show that computing the channel capacity reduces to maximizing the (sequence of) mutual information. This result, known as Shannon’s noisy channel coding theorem, is very special in that it relates the value of a (discrete, combinatorial) optimization problem over codebooks to that of a (convex) optimization problem over information measures. It builds a bridge between the abstraction of information measures (Part I) and practical engineering problems.
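In symbols (informal, our notation): for a memoryless channel $P_{Y|X}$, Shannon’s theorem identifies the capacity as
\[
C \;=\; \max_{P_X} I(X;Y);
\]
for instance, for the binary symmetric channel with crossover probability $\delta$ this evaluates to $C = 1 - h(\delta)$ bits per channel use, with $h$ the binary entropy function.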
Chapter 1 introduces the first information measure – Shannon entropy. After studying its standard properties (chain rule, conditioning), we will briefly describe how one could arrive at its definition. We discuss axiomatic characterization, the historical development in statistical mechanics, as well as the underlying combinatorial foundation (“method of types”). We close the chapter with Han’s and Shearer’s inequalities, which both exploit the submodularity of entropy.
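For reference (standard definitions and statements, not specific to this book’s notation): the entropy of a discrete $X\sim P_X$, the chain rule, and Han’s inequality read
\[
H(X) \;=\; \sum_x P_X(x)\log\frac{1}{P_X(x)}, \qquad H(X,Y) \;=\; H(X) + H(Y\mid X),
\]
\[
H(X_1,\dots,X_n) \;\le\; \frac{1}{n-1}\sum_{i=1}^n H(X_1,\dots,X_{i-1},X_{i+1},\dots,X_n).
\]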