1 Introduction
Throughout, we assume that
$$ \begin{align*}X \text{ is a real Hilbert space with inner product } \langle\cdot,\cdot\rangle\end{align*} $$
and induced norm
$\|\cdot \|$
. Let
${\operatorname {Id}}$
denote the identity operator on X. Recall the following well-known definitions [Reference Bauschke and Combettes6, Reference Cegielski13].
Definition 1.1 Let
$T: X \rightarrow X$
and
$\mu>0$
. Then, T is
-
(i) nonexpansive if
$$ \begin{align*}(\forall x \in X)(\forall y \in X) \quad\|T x-T y\| \leqslant\|x-y\|; \end{align*} $$
-
(ii) firmly nonexpansive if
$$ \begin{align*}(\forall x \in X)(\forall y \in X) \quad \|T x-T y\|^2+\|({\operatorname{Id}}-T)x-({\operatorname{Id}}-T)y\|^2\leqslant \|x-y\|^2; \end{align*} $$
-
(iii)
$\mu $ -cocoercive if
$\mu T$ is firmly nonexpansive.
Definition 1.2 Let
$T: X \rightarrow X$
be nonexpansive. T is k-averaged if T can be represented as
$$ \begin{align*}T=(1-k)\,{\operatorname{Id}}+kN,\end{align*} $$
where
$N:X\rightarrow X$
is nonexpansive, and
$k \in [0,1]$
.
Averaged operators are important in optimization (see, e.g., [Reference Baillon, Bruck and Reich1, Reference Bartz, Dao and Phan3, Reference Bauschke, Bendit and Moursi5, Reference Bauschke and Combettes6, Reference Bauschke and Moursi9, Reference Bauschke, Moursi and Wang10, Reference Cegielski13–Reference Combettes and Yamada15, Reference Ogura and Yamada20, Reference Xu24]). Firmly nonexpansive operators, being
$1/2$
-averaged [Reference Bauschke and Combettes6, Proposition 4.4], form a proper subclass of the class of averaged operators. From the definition,
${\operatorname {Id}}$
is the only 0-averaged operator. When
$k \in (0,1]$
, various characterizations of k-averagedness (see [Reference Bartz, Dao and Phan3, Proposition 2.2], [Reference Bauschke and Combettes6, Reference Cegielski13]) are available, including
$$ \begin{align*}(\forall x \in X)(\forall y \in X) \quad \|Tx-Ty\|^2\leqslant\|x-y\|^2-\frac{1-k}{k}\|({\operatorname{Id}}-T)x-({\operatorname{Id}}-T)y\|^2 \tag{1.1}\end{align*} $$
and
$(\forall x \in X)(\forall y \in X)$
$$ \begin{align*}\|Tx-Ty\|^2\leqslant(2k-1)\|x-y\|^2+2(1-k)\langle x-y,\, Tx-Ty\rangle. \tag{1.2}\end{align*} $$
When
$k=0$
, while the historic Definition 1.2 gives
$T={\operatorname {Id}}$
(linear), characterization (1.2) gives
$T={\operatorname {Id}}+v$
(affine) for some
$v\in X$
, hence they are not equivalent in this case. From (1.1) or (1.2) and the fact that
$\mathrm {Id}$
is the only
$0$
-averaged operator, we can deduce that if an operator is
$k_0$
-averaged, then it is k-averaged for every
$k\geqslant k_0$
. This motivates the following definition, which was proposed by Bauschke, Bendit, and Moursi [Reference Bauschke, Bendit and Moursi5].
Definition 1.3 (Bauschke–Bendit–Moursi modulus of averagedness)
Let
$T: X \rightarrow X$
be nonexpansive. The Bauschke–Bendit–Moursi modulus of averagedness of T is defined by
$$ \begin{align*}k(T):=\inf\big\{k\in[0,1] \;\big|\; T \text{ is } k\text{-averaged}\big\}.\end{align*} $$
We call it the BBM modulus of averagedness.
It is natural to ask: How does the modulus of averagedness impact classifications of averaged operators? In view of Definition 1.3, if
$T: X \rightarrow X$
is firmly nonexpansive then
$k(T) \leqslant 1/2$
. Based on this, we define the following, which classifies the class of firmly nonexpansive operators using the modulus of averagedness.
Definition 1.4 (Normal and special nonexpansiveness)
Let
$T: X \rightarrow X$
. We say that T is normally (firmly) nonexpansive if
$k(T)<1/2$
, and T is specially (firmly) nonexpansive if
$k(T)=1/2$
.

Let
$\Gamma _0(X)$
denote the set of all proper lower semicontinuous convex functions from X to
$(-\infty ,+\infty ]$
. Recall that for
$f \in \Gamma _0(X),$
its proximal operator is defined by
$(\forall x\in X)\ \mathrm {P}_{f}(x):=\underset {u \in X}{\operatorname {argmin}}\left \{f(u)+\frac {1}{2}\|u-x\|^2\right \}$
. For a nonempty closed convex subset C of X, its indicator function is defined by
$\iota _{C}(x):=0$
if
$x\in C$
, and
$+\infty $
otherwise. If
$f=\iota _{C}$
, we write
$\mathrm {P}_{f}=P_{C}$
, the projection operator onto C. It is well known that
$\mathrm {P}_{f}$
is firmly nonexpansive [Reference Bauschke and Combettes6], which implies
$k(\mathrm {P}_{f}) \leqslant 1/2$
. Some natural questions arise: Given
$f \in \Gamma _0(X)$
, when is
$\mathrm {P}_{f}$
normally (or specially) nonexpansive? How can we calculate
$k(\mathrm {P}_{f})$
? In [Reference Bauschke, Bendit and Moursi5], these problems are essentially solved in the linear case, or in the smooth case on the real line.
The goal of this article is to classify averaged nonexpansive operators, including firmly nonexpansive operators, via the Bauschke–Bendit–Moursi modulus of averagedness in a general Hilbert space. We provide some fundamental properties of the modulus of averagedness of averaged mappings, firmly nonexpansive mappings, and proximal mappings. We determine what properties normally (or specially) nonexpansive operators possess by using monotone operator theory. One striking result is that the proximal mapping of a convex function has modulus of averagedness less than
$1/2$
if and only if the function is Lipschitz smooth. Many examples are provided to illustrate our results. The Bauschke–Bendit–Moursi modulus of averagedness turns out to be an extremely powerful tool for studying averaged operators and firmly nonexpansive operators.
The rest of the article is organized as follows. In Section 2, we explore some basic properties of the modulus function and show that a normally nonexpansive operator is a bi-Lipschitz homeomorphism. In Section 3, averagedness of operator compositions and some asymptotic behaviors of averaged operators are examined. In particular, the limiting operator of an averaged operator is a projection if and only if its BBM modulus is
$1/2$
In Sections 4 and 5, we investigate both normal and special nonexpansiveness of resolvents and proximal operators. Our most surprising results are Theorems 4.17 and 5.3, which characterize normal and special resolvents and proximal operators. In Section 6, we establish formulae for the modulus of averagedness of resolvents in terms of various values of maximally monotone operators. Finally, in Section 7, we extend a modulus-of-averagedness formula of Bauschke, Bendit, and Moursi for the composition of two projections in
$\mathbb {R}^2$
to general Hilbert spaces.
2 Bijective theorem
2.1 Auxiliary results
This section collects preparatory results on the modulus of averagedness used in later proofs. For any operator
$T: X \rightarrow X$
and any
$v \in X$
, the operator
$T+v$
is defined by
$$ \begin{align*}(\forall x\in X)\quad (T+v)x:=Tx+v.\end{align*} $$
Proposition 2.1 Let
$T: X \rightarrow X$
be nonexpansive and
$v \in X$
. Then,
-
(i)
$k(T+v)=k(T)$ .
-
(ii)
$k(T(\cdot +v))=k(T)$ .
Proof (i): The result follows by combining
$(T+v) x-(T+v) y=T x-T y$
with characterization (1.2).
(ii): The result follows by combining
$x-y=(x+v)-(y+v)$
with characterization (1.2).
Proposition 2.2 Let
$T: X \rightarrow X$
be nonexpansive. If
$k(T)>0$
, then T is
$k(T)$
-averaged. Moreover, T is
$\beta $
-averaged for every
$\beta \in [k(T),1]$
.
Proof Due to
$k(T)>0$
, we can use either characterization (1.1) or (1.2). The right-hand side of (1.1) or (1.2) is a continuous and increasing function of k; thus the result follows.
Let
${\operatorname {Fix}} T:=\{x \in X \mid T x=x\}$
denote the set of fixed points of
$T: X \rightarrow X$
. Our following result characterizes
$k(T)=0$
.
Proposition 2.3 Let
$T: X \rightarrow X$
be nonexpansive. Then,
$$ \begin{align*}k(T)=0 \iff (\exists v \in X)\ \ T={\operatorname{Id}}+v. \tag{2.1}\end{align*} $$
If, in addition,
${\operatorname {Fix}} T\neq \varnothing $
, then
$$ \begin{align*}k(T)=0 \iff T={\operatorname{Id}}. \tag{2.2}\end{align*} $$
Proof Suppose
$\exists v \in X: T=\mathrm {Id}+v$
. Obviously
$k({\operatorname {Id}})=0$
. Thus, by Proposition 2.1,
$k(T)=k(\mathrm {Id}+v)=0$
.
Suppose
$k(T)=0$
. Assume that for any
$v \in X$
:
$T \neq {\operatorname {Id}}+v$
. Then, there exist
$x_0, y_0 \in X$
such that
$(T-\mathrm {Id}) x_0 \neq (T-\mathrm {Id}) y_0$
, whence
$\left \|(T-\mathrm {Id}) x_0-(T-\mathrm {Id}) y_0\right \|^2>0$
. Our assumption implies
$T \neq {\operatorname {Id}}$
, and
${\operatorname {Id}}$
is the only 0-averaged operator, thus there exists a sequence
$(k_n)_{n \in \mathbb {N}}$
in
$(0,1]$
such that T is
$k_n$
-averaged and
$k_n \rightarrow 0$
. Now characterization (1.1) implies that for any
$n \in \mathbb {N}$
:
$$ \begin{align*}\|Tx_0-Ty_0\|^2\leqslant\|x_0-y_0\|^2-\frac{1-k_n}{k_n}\|(T-\mathrm{Id})x_0-(T-\mathrm{Id})y_0\|^2,\end{align*} $$
i.e.,
$$ \begin{align*}0\leqslant\|Tx_0-Ty_0\|^2\leqslant\|x_0-y_0\|^2-\frac{1-k_n}{k_n}\|(T-\mathrm{Id})x_0-(T-\mathrm{Id})y_0\|^2.\end{align*} $$
Note that
$\left \|(T-\mathrm {Id}) x_0-(T-\mathrm {Id}) y_0\right \|^2>0$
. Now letting
$n \rightarrow \infty $
yields
$ 0 \leqslant -\infty , $
which is a contradiction.
When
${\operatorname {Fix}} T\neq \varnothing $
, (2.2) follows from (2.1).
Proposition 2.4 Let
$T: X \rightarrow X$
be nonexpansive. Then, T is firmly nonexpansive if and only if
$ k(T)\leqslant 1/2$
.
Proof “
$\Leftarrow $
”: When
$0<k(T)<1/2$
, apply Proposition 2.2. When
$k(T)=0$
, apply Proposition 2.3.
“
$\Rightarrow $
”: The assumption implies that T is
$1/2$
-averaged. Hence,
$k(T)\leqslant 1/2$
.
Example 2.5 If
$T:X\rightarrow X$
is a constant mapping, i.e.,
$(\exists v\in X)(\forall x\in X) \ Tx=v,$
then
$k(T)=1/2$
.
Proof Because T is firmly nonexpansive,
$k(T)\leqslant 1/2$
. By (1.2), if T is k-averaged, then
$2k\geqslant 1$
, so
$k(T)\geqslant 1/2$
. Altogether,
$k(T)=1/2$
.
We end this section with a fact on convexity.
Fact 2.6 [Reference Bauschke, Bendit and Moursi5, Fact 1.3]
Let
$T_{1}, T_{2}: X \rightarrow X$
be nonexpansive and
$\lambda \in [0,1]$
. Then,
$k(\lambda T_{1} +(1-\lambda )T_{2})\leqslant \lambda k(T_{1})+(1-\lambda )k(T_{2}).$
Consequently,
$T\mapsto k(T)$
is a convex function on the set of averaged mappings, as well as on the set of firmly nonexpansive mappings.
Corollary 2.7 Let
$T: X \rightarrow X$
be nonexpansive and
$\lambda \in [0,1]$
. Then,
$k(\lambda T)\leqslant \lambda k(T)+(1-\lambda )/2$
.
2.2 Bijective theorem
In this section, we will show that a normally nonexpansive operator must be bijective and bi-Lipschitz. First, we prove that normally nonexpansive operators must be bi-Lipschitz and injective.
Lemma 2.8 Let
$T: X \rightarrow X$
be normally nonexpansive. Then, T is a bi-Lipschitz homeomorphism from X to
$\operatorname {ran} T$
. In particular, T is injective.
Proof In view of Proposition 2.3, we may assume
$k(T)>0$
. Then, T is
$k(T)$
-averaged by Proposition 2.2, i.e.,
$$ \begin{align*}(\forall x \in X)(\forall y \in X)\quad \|Tx-Ty\|^2\leqslant\|x-y\|^2-\frac{1-k(T)}{k(T)}\|({\operatorname{Id}}-T)x-({\operatorname{Id}}-T)y\|^2.\end{align*} $$
Since
$k(T)<\frac {1}{2}$
, there exists
$\alpha \in (0,\frac {1}{2})$
such that
$k(T)=\frac {1}{2}-\alpha $
. Substituting
$k(T)=\frac{1}{2}-\alpha$
into the above inequality and using the Cauchy–Schwarz inequality, we have
$$ \begin{align*}\frac{\frac{1}{2}+\alpha}{\frac{1}{2}-\alpha}\big(\|x-y\|-\|Tx-Ty\|\big)^2\leqslant\|x-y\|^2-\|Tx-Ty\|^2.\end{align*} $$
Now if
$\|x-y\|-\|T x-T y\|=0$
, then
$\left \|Tx-Ty\right \|=\|x-y\| \geqslant 2\alpha \|x-y\|$
since
$2\alpha <1$
. If
$\|x-y\|-\left \|Tx-Ty\right \| \neq 0$
, then
$2 \alpha \|x-y\| \leqslant \|T x-T y\|$
. Thus in both cases we have
$2 \alpha \|x-y\| \leqslant \|T x-T y\|$
. Combining it with
$\|T x-T y\| \leqslant \|x-y\|$
, we have
$$ \begin{align*}(\forall x \in X)(\forall y \in X)\quad 2\alpha\|x-y\|\leqslant\|Tx-Ty\|\leqslant\|x-y\|,\end{align*} $$
i.e., T is a bi-Lipschitz homeomorphism from X to
$\operatorname {ran} T$
.
Next, we make use of monotone operator theory to prove that normally nonexpansive operators must also be surjective.
Fact 2.9 [Reference Bauschke and Combettes6, Example 20.30]
Let
$T: X \rightarrow X$
be firmly nonexpansive. Then, T is maximally monotone.
Fact 2.10 ((Rockafellar–Vesely) [Reference Bauschke and Combettes6, Corollary 21.24])
Let
$A: X \rightrightarrows X$
be a maximally monotone operator such that
$ \lim _{\|x\| \rightarrow +\infty } \inf \|A x\|=+\infty. $
Then, A is surjective.
Lemma 2.11 Let
$T: X \rightarrow X$
be normally nonexpansive. Then, T is surjective.
Proof By Lemma 2.8, T is bi-Lipschitz since T is normally nonexpansive. Thus, there exists
$\varepsilon>0$
such that
$(\forall x\in X)(\forall y\in X)\ \varepsilon \|x-y\| \leqslant \|T x-T y\|$
. Letting
$y=0$
gives
$\varepsilon \|x\| \leqslant \|T x-T 0\|$
. Using the triangle inequality, we have
$$ \begin{align*}\|Tx\|\geqslant\|Tx-T0\|-\|T0\|\geqslant\varepsilon\|x\|-\|T0\|.\end{align*} $$
Thus,
$\lim _{\|x\| \rightarrow \infty }\|T x\|=\infty $
. Combining Fact 2.9 with Fact 2.10 we complete the proof.
Theorem 2.12 (bi-Lipschitz homeomorphism)
Let
$T: X \rightarrow X$
be normally nonexpansive. Then, T is a bi-Lipschitz homeomorphism of X. In particular, T is bijective.
Proof Combine Lemma 2.8 with Lemma 2.11.
Taking the contrapositive of Theorem 2.12, we obtain a lower bound for the modulus of averagedness.
Corollary 2.13 Let
$T: X \rightarrow X$
be nonexpansive. If T is not bijective, then
$k(T) \geqslant 1/2$
.
Remark 2.14 In terms of compact operators (see, e.g., [Reference Rudin23]), Theorem 2.12 implies that X is finite-dimensional if and only if there exists a normally nonexpansive compact operator on X.
Example 2.15 (Averagedness of projection)
Let C be a nonempty closed convex set in X and
$C \neq X$
. Then,
$P_C$
is specially nonexpansive.
Proof We have
$k(P_C) \leqslant 1/2$
since
$P_C$
is firmly nonexpansive. Now since
$C \neq X$
, let
$x_0 \in X \backslash C$
. Because
$P_C\left (x_0\right ) \in C$
and
$x_0 \in X \backslash C$
, we have
$P_C\left (x_0\right ) \neq x_0$
. However,
$P_C\left (x_0\right )=P_C\left (P_C\left (x_0\right )\right )$
. Thus,
$P_C$
is not injective. Therefore,
$P_C$
is specially nonexpansive by Corollary 2.13. Another way is to observe that
$P_C$
is not surjective.
Corollary 2.16 Let
$M \in \mathbb {R}^{n \times n}$
be nonexpansive. If
$\operatorname {det}(M)=0$
, then
$k(M) \geqslant 1 / 2$
.
Remark 2.17 Consider the matrix

Then, one can verify that
$k(A)=3/4>1/2$
. However,
$\operatorname {det}(A) \neq 0$
and thus A is a bi-Lipschitz homeomorphism of
$\mathbb {R}^{2}$
. Hence, the converse of Theorem 2.12 fails. We will show later that the converse of Theorem 2.12 does hold when T is a proximal operator (see Theorem 5.3).
3 Operator compositions and limiting operator
In this section, we examine the modulus of averagedness of operator compositions and explore its asymptotic properties.
3.1 Composition
Proposition 3.1 Let
$T_1$
and
$T_2$
be nonexpansive operators from X to X. Suppose one of the following holds:
-
(i)
$T_1$ is not surjective.
-
(ii)
$T_2$ is not injective.
-
(iii)
$T_1$ is bijective and
$T_2$ is not surjective.
-
(iv)
$T_2$ is bijective and
$T_1$ is not injective.
Then,
$k(T_1 T_2) \geqslant 1/2$
.
Proof Since
$T_1$
and
$T_2$
are nonexpansive operators, we have
$T_1 T_2$
is nonexpansive as well. Each one of the four conditions implies that
$T_1 T_2$
is not bijective. Now, use Corollary 2.13.
Ogura and Yamada [Reference Ogura and Yamada20] obtained the following result about the averagedness of operator compositions.
Fact 3.2 ([Reference Ogura and Yamada20, Theorem 3] (see also [Reference Combettes and Yamada15, Proposition 2.4]))
Let
$T_1: X \rightarrow X$
be
$\alpha _1$
-averaged, and let
$T_2: X \rightarrow X$
be
$\alpha _2$
-averaged, where
$\alpha _1, \alpha _2 \in (0,1)$
. Set
$$ \begin{align*}T:=T_1 T_2\quad\text{and}\quad \alpha:=\frac{\alpha_1+\alpha_2-2\alpha_1\alpha_2}{1-\alpha_1\alpha_2}.\end{align*} $$
Then,
$\alpha \in ( 0,1)$
and T is
$\alpha $
-averaged.
Reformulating this result via the modulus of averagedness, we have the following.
Proposition 3.3 Let
$T_1: X \rightarrow X$
and
$T_2: X \rightarrow X$
be nonexpansive. Suppose
$k\left (T_1\right )k\left (T_2\right ) \neq 1$
. Then,
$$ \begin{align*}k\left(T_1 T_2\right)\leqslant\frac{k\left(T_1\right)+k\left(T_2\right)-2 k\left(T_1\right) k\left(T_2\right)}{1-k\left(T_1\right) k\left(T_2\right)}.\end{align*} $$
Proof Let
$\varphi \left (T_1, T_2\right ):=\frac {k\left (T_1\right )+k\left (T_2\right )-2 k\left (T_1\right ) k\left (T_2\right )}{1-k\left (T_1\right ) k\left (T_2\right )}$
. We consider five cases.
Case 1:
$k\left (T_i\right )=1$
for some
$i \in \{1,2\}$
. Then,
$\varphi \left (T_1, T_2\right )=1$
. Since
$T_1$
and
$T_2$
are nonexpansive, we have
$T_1 T_2$
is nonexpansive, i.e.,
$k\left (T_1 T_2\right ) \leqslant 1=\varphi \left (T_1, T_2\right )$
.
Case 2:
$k\left (T_i\right ) \in (0,1)$
for each
$i \in \{1,2\}$
. Then, combining Proposition 2.2 and Fact 3.2, we have
$T_1 T_2$
is
$\varphi \left (T_1, T_2\right )$
-averaged. Thus,
$k\left (T_1T_2\right ) \leqslant \varphi \left (T_1, T_2\right )$
.
Case 3:
$k\left (T_1\right )=0$
and
$k\left (T_2\right ) \in (0,1)$
. Then, there exists
$v_1 \in X$
such that
$T_1={\operatorname {Id}}+v_1$
by Proposition 2.3. Thus,
$T_1 T_2=T_2+v_1$
and
$k\left (T_1 T_2\right )=k\left (T_2+v_1\right )=k\left (T_2\right )$
by Proposition 2.1. Since
$\varphi \left (T_1, T_2\right )=k\left (T_2\right )$
in this case, we have
$k\left (T_1T_2\right )=\varphi \left (T_1, T_2\right )$
.
Case 4:
$k\left (T_1\right ) \in (0,1)$
and
$k\left (T_2\right )=0$
. Then, there exists
$v_2 \in X$
such that
$T_2={\operatorname {Id}}+v_2$
by Proposition 2.3. Thus,
$T_1 T_2=T_1(\cdot +v_2)$
and
$k\left (T_1 T_2\right )=k\left (T_1(\cdot +v_2)\right )=k\left (T_1\right )$
by Proposition 2.1. Since
$\varphi \left (T_1,T_2\right )=k\left (T_1\right )$
in this case, we have
$k\left (T_1T_2\right )=\varphi \left (T_1, T_2\right )$
.
Case 5:
$k\left (T_1\right )=k\left (T_2\right )=0$
. Then, there exist
$v_1 \in X$
and
$v_2 \in X$
such that
$T_1={\operatorname {Id}}+v_1$
and
$T_2={\operatorname {Id}}+v_2$
by Proposition 2.3. Thus,
$T_1 T_2=\mathrm {Id}+v_2+v_1$
and
$k\left (T_1 T_2\right )=k\left (\mathrm {Id}+v_2+v_1\right )=0$
. Since
$\varphi \left (T_1,T_2\right )=0$
in this case, we have
$k\left (T_1T_2\right )=\varphi \left (T_1, T_2\right )$
.
Altogether, we complete the proof.
Proposition 3.4 Let C be a nonempty closed convex set in X and
$C \neq X$
. Then, for any nonexpansive operator
$T: X \rightarrow X$
:
$$ \begin{align*}\frac{1}{2}\leqslant k\left(T \circ P_C\right)\leqslant\frac{1}{2-k(T)}\end{align*} $$
and
$$ \begin{align*}\frac{1}{2}\leqslant k\left(P_C \circ T\right)\leqslant\frac{1}{2-k(T)}.\end{align*} $$
Proof Observe that
$P_C$
is neither surjective nor injective in this case. Thus, by Proposition 3.1, we have
$k(T \circ P_C) \geqslant 1/2$
and
$k(P_C \circ T) \geqslant 1/2$
. Now by Example 2.15,
$$ \begin{align*}\varphi\left(T, P_C\right)=\varphi\left(P_C, T\right)=\frac{k(T)+\frac{1}{2}-k(T)}{1-\frac{1}{2}k(T)}=\frac{1}{2-k(T)}.\end{align*} $$
Thus, by Proposition 3.3, we have
$k\left (T \circ P_C\right ) \leqslant \frac {1}{2-k(T)}$
and
$k\left (P_C \circ T\right ) \leqslant \frac {1}{2-k(T)}$
, which complete the proof.
Remark 3.5 In particular, if we let
$T=P_V$
and
$C=U$
, where U and V are both closed linear subspaces, then
$k\left (P_V P_U\right )=\frac {1+c_F}{2+c_F} \in \left [\frac {1}{2}, \frac {2}{3}\right ]$
, where
$c_F \in [0,1]$
(see [Reference Bauschke, Bendit and Moursi5, Corollary 3.3]). This coincides with the bounds we obtained as
$\frac {1}{2-k(P_U)}=\frac {2}{3}$
by Example 2.15.
We can generalize the results of two operator compositions to finite operator compositions.
Proposition 3.6 Let
$m \geqslant 2$
be an integer and let
$I=\{1, \ldots , m\}$
. For any
$i \in I$
, let
$T_i$
be nonexpansive from X to X. Suppose one of the following holds:
-
(i)
$T_1$ is not surjective.
-
(ii)
$T_m$ is not injective.
-
(iii)
$T_1$ is bijective and
$T_2 \cdots T_m$ is not surjective.
-
(iv)
$T_m$ is bijective and
$T_1 \cdots T_{m-1}$ is not injective.
Then,
$k\left (T_1 \cdots T_m\right ) \geqslant 1/2$
.
Proof Apply Proposition 3.1.
Corollary 3.7 Let
$C_1, \ldots , C_m$
be nonempty closed convex sets in X. If
$C_1 \neq X$
or
$C_m \neq X$
, then
$k\left (P_{C_1} \cdots P_{C_m}\right ) \geqslant 1/2$
.
The following result is about the modulus of averagedness of isometries.
Proposition 3.8 Let A be an
$n \times n$
orthogonal matrix and
$A \neq {\operatorname {Id}}$
. Then,
$k(A)=1$
.
Proof Since A is orthogonal, we have
$\left \|Ax-Ay\right \|=\|x-y\|$
. On the other hand, since
$A\neq{\operatorname{Id}}$
is linear,
$\operatorname {ran}(\operatorname {Id}-A)$
is not a singleton, so
${\operatorname{Id}}-A$
is not constant. If A were k-averaged for some
$k<1$
, then (1.1) would force
$({\operatorname{Id}}-A)x=({\operatorname{Id}}-A)y$
for all
$x,y\in X$
, a contradiction. Hence,
$k(A)=1$
.
Corollary 3.9 Let
$m \geqslant 1$
be an integer and let
$I=\{1, \ldots , m\}$
. For any
$i \in I$
, let
$A_i$
be an
$n \times n$
orthogonal matrix. Suppose that
$A_1 \cdots A_m \neq {\operatorname {Id}}$
. Then,
$k(A_1 \cdots A_m)=1$
.
3.2 Limiting operator
In this section, we discuss the asymptotic behavior of modulus of averagedness. Recall that a sequence
$\left (x_n\right )_{n \in \mathbb {N}}$
in a Hilbert space X is said to converge weakly to a point x in X if
$(\forall y\in X)\ \lim _{n \rightarrow \infty }\left \langle x_n, y\right \rangle =\langle x, y\rangle. $
We use the notation
$\lim _{n \rightarrow \infty }^w x_n$
for the weak limit of
$\left (x_n\right )_{n \in \mathbb {N}}$
. Recall for a nonexpansive operator
$T: X \rightarrow X$
,
${\operatorname {Fix}} T$
is closed and convex (see, e.g., [Reference Bauschke and Moursi9, Proposition 22.9]).
Fact 3.10 [Reference Bauschke and Combettes6, Proposition 5.16]
Let
$\alpha \in (0,1)$
and let
$T: X \rightarrow X$
be
$\alpha $
-averaged such that
$\operatorname {Fix} T \neq \varnothing $
. Then, for any
$x \in X$
,
$\left (T^n x\right )_{n \in \mathbb {N}}$
converges weakly to a point in
$\operatorname {Fix} T$
.
In view of the above fact, we propose the following type of operator.
Definition 3.1 (Limiting operator)
Let
$\alpha \in (0,1)$
and let
$T: X \rightarrow X$
be
$\alpha $
-averaged such that
$\operatorname {Fix} T \neq \varnothing $
. Define its limiting operator
$T_{\infty }: X \rightarrow X$
by
$x\mapsto \lim ^w_{n\rightarrow \infty } T^n x.$
Remark 3.11 The full domain and single-valuedness of
$T_{\infty }$
are guaranteed by Fact 3.10. Hence,
$T_{\infty }: X \rightarrow X$
is well defined.
Example 3.12
-
(i) [Reference Bauschke and Combettes6, Example 5.29] Let
$\alpha \in (0,1)$ and let
$T: X \rightarrow X$ be
$\alpha $ -averaged such that
${\operatorname {Fix}} T \neq \varnothing $ . Suppose T is linear. Then,
$T_{\infty }=P_{{\operatorname {Fix}} T}$ .
-
(ii) [Reference Bauschke and Combettes6, Proposition 5.9] Let
$\alpha \in (0,1)$ and let
$T: X \rightarrow X$ be
$\alpha $ -averaged such that
${\operatorname {Fix}} T \neq \varnothing $ . Suppose
${\operatorname {Fix}} T$ is a closed affine subspace of X. Then,
$T_{\infty }=P_{{\operatorname {Fix}} T}$ .
The limiting operator of an averaged mapping enjoys the following pleasing properties.
Proposition 3.13 Let
$\alpha \in (0,1)$
and let
$T: X \rightarrow X$
be
$\alpha $
-averaged such that
${\operatorname {Fix}} T \neq \varnothing , X$
. Then, the following hold:
-
(i)
${\operatorname {Fix}} T={\operatorname {Fix}} T_{\infty }=\operatorname {ran} T_{\infty }$ .
-
(ii)
$\left (T_{\infty }\right )^2=T_{\infty }$ .
-
(iii)
$k\left (T_{\infty }\right ) \in \left [\frac {1}{2}, 1\right ]$ .
Proof (i): If
$x \notin {\operatorname {Fix}} T$
, then
$T_{\infty }x \neq x$
since
$T_{\infty }x \in {\operatorname {Fix}} T$
by Fact 3.10. If
$x \in {\operatorname {Fix}} T$
, then
$T_{\infty }x=\lim ^w _{n \rightarrow \infty } T^n x=\lim ^w _{n \rightarrow \infty } x=x$
. Thus,
$\operatorname {Fix} T=\operatorname {Fix} T_{\infty }$
. The equality
${\operatorname {Fix}} T=\operatorname {ran} T_{\infty }$
follows by using Fact 3.10 again.
(ii): For any
$x \in X$
,
$T_{\infty } x \in \operatorname {ran} T_{\infty }$
, thus
$T_{\infty } x \in {\operatorname {Fix}} T_{\infty }$
by (i). Therefore,
$\left (T_{\infty }\right )^2 x=T_{\infty }\left (T_{\infty } x\right )=T_{\infty } x$
, which implies that
$\left (T_{\infty }\right )^2=T_{\infty }$
.
(iii): Since the norm is weakly lower-semicontinuous, we have
$$ \begin{align*}(\forall x\in X)(\forall y\in X)\quad \left\|T_{\infty} x-T_{\infty} y\right\|\leqslant\liminf_{n \rightarrow \infty}\left\|T^n x-T^n y\right\|.\end{align*} $$
As
$T: X \rightarrow X$
is nonexpansive, by induction we have for any
$n \in \mathbb {N}$
,
$\left \|T^n x-T^n y\right \| \leqslant \|x-y\|$
. Altogether,
$T_{\infty }$
is nonexpansive, which implies that
$k(T_{\infty }) \leqslant 1$
. On the other hand, we have
$\operatorname {ran} T_{\infty } = \operatorname {Fix} T \neq X$
by (i) and the assumption, so
$T_{\infty }$
is not surjective. Thus,
$k(T_{\infty }) \geqslant 1/2$
by Corollary 2.13.
The modulus of averagedness provides further insights into the limiting operator.
Theorem 3.14 Let
$\alpha \in (0,1)$
and let
$T: X \rightarrow X$
be
$\alpha $
-averaged such that
${\operatorname {Fix}} T \neq \varnothing , X$
. Then, the following are equivalent:
-
(i)
$T_{\infty }=P_{{\operatorname {Fix}} T}$ .
-
(ii)
$k\left (T_{\infty }\right ) \leqslant 1/2$ .
-
(iii)
$k\left (T_{\infty }\right )=1/2$ .
Proof (i)
$\Rightarrow $
(ii): Obvious.
(ii)
$\Rightarrow $
(i): The result follows by combining Proposition 3.13(i)&(ii) and the fact that if
$T: X \rightarrow X$
is firmly nonexpansive and
$T \circ T=T$
, then
$T=P_{{\operatorname {ran}} T}$
(see [Reference Bauschke and Moursi9, Exercise 22.5] or [Reference Bauschke, Moffat and Wang8, Theorem 2.1(xx)]).
(ii)
$\Leftrightarrow $
(iii): Apply Proposition 3.13(iii).
In the following, we discuss limiting operators on
$\mathbb {R}$
. The following extends [Reference Bauschke, Bendit and Moursi5, Proposition 2.8] from differentiable functions to locally Lipschitz functions. Below
$\partial _{L}g$
denotes the Mordukhovich limiting subdifferential [Reference Borwein and Zhu12, Reference Mordukhovich19, Reference Rockafellar and Wets22].
Lemma 3.15 Let
$g: \mathbb {R} \rightarrow \mathbb {R}$
be a locally Lipschitz function. Then, g is nonexpansive if and only if
$(\forall x\in \mathbb {R})\ \partial _{L}g(x)\subset [-1,1]$
, in which case
$k(g)=\left (1-\inf \partial _{L}g(\mathbb {R})\right ) / 2$
.
Proof The nonexpansiveness characterization of g follows from [Reference Borwein and Zhu12, Theorem 3.4.8]. Write
$g=(1-\alpha ){\operatorname {Id}}+\alpha N,$
where
$\alpha \in [0,1]$
and
$N:\mathbb {R}\rightarrow \mathbb {R}$
is nonexpansive. If
$\alpha =0$
, the result clearly holds. Let us assume
$\alpha>0$
. Then,
$N(x)=(g(x)-(1-\alpha )x)/\alpha $
and
$\partial _{L}N(x)= (\partial _{L}g(x)-(1-\alpha ))/\alpha $
. That N is nonexpansive is equivalent to
$$ \begin{align*}(\forall x\in\mathbb{R})\quad \frac{\partial_{L}g(x)-(1-\alpha)}{\alpha}\subset[-1,1],\quad\text{i.e.,}\quad \partial_{L}g(x)\subset[1-2\alpha,\,1],\end{align*} $$
from which
$$ \begin{align*}\alpha\geqslant\frac{1-\inf \partial_{L}g(\mathbb{R})}{2},\end{align*} $$
and the result follows.
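As a quick numerical companion to Lemma 3.15 (our sketch, not part of the original argument), the modulus of a nonexpansive function on $\mathbb{R}$ can be estimated from difference quotients, which by the mean value theorem probe $\inf \partial_{L}g(\mathbb{R})$:

```python
import numpy as np

def modulus_1d(g, lo=-10.0, hi=10.0, samples=100_000, rng=None):
    """Estimate k(g) = (1 - inf slope)/2 for a nonexpansive g: R -> R,
    where the slopes are difference quotients (g(x) - g(y))/(x - y)."""
    rng = np.random.default_rng(rng)
    x, y = rng.uniform(lo, hi, (2, samples))
    keep = np.abs(x - y) > 1e-9
    slopes = (g(x[keep]) - g(y[keep])) / (x[keep] - y[keep])
    return (1 - slopes.min()) / 2

print(modulus_1d(np.sin))                          # inf sin' = -1, so k = 1
print(modulus_1d(lambda x: np.clip(x, 0.0, 1.0)))  # P_[0,1]: inf slope 0, k = 1/2
```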
In Example 3.12 we see that if
$T: X \rightarrow X$
is
$\alpha $
-averaged and linear with
$\operatorname {Fix} T \neq \varnothing , X$
, then
$k\left (T_{\infty }\right )=1/2$
. The following example shows that this is no longer true in the nonlinear case.
Example 3.16 Let

Then, f is
$(3/4)$
-averaged and
$\operatorname {Fix} f=[0,1]$
. However,

and
$k\left (f_{\infty }\right )=3/4$
.
Proof By computation, we have

Applying Lemma 3.15, we obtain
$k(f)$
and
$k(f_{\infty })$
.
Next, we show that if
$T: X \rightarrow X$
is firmly nonexpansive, a stronger condition than averagedness, then on the real line it is true that
$k\left (T_{\infty }\right )=1/2$
.
Proposition 3.17 Let
$f: \mathbb {R} \rightarrow \mathbb {R}$
be firmly nonexpansive such that
${\operatorname {Fix}} f \neq \varnothing , \mathbb {R}$
. Then,
$f_{\infty }=P_{{\operatorname {Fix}} f}$
. Consequently,
$k\left (f_{\infty }\right )=1 / 2$
.
Proof Since f is firmly nonexpansive, we have f is nondecreasing and nonexpansive. Now as
${\operatorname {Fix}} f\subseteq \mathbb {R}$
is closed and convex, it must be of one of the forms
$[a,+\infty )$
,
$(-\infty ,b]$
or
$[a,b]$
with
$a,b\in \mathbb {R}$
because
$\operatorname {Fix} f \neq \varnothing , \mathbb {R}$
. Since the proofs for all cases are similar, let us assume that
${\operatorname {Fix}} f=[a,b]$
. When
$x\geqslant b$
, because f is nondecreasing, we have
$f(x)\geqslant f(b)=b$
,
$f^2(x)\geqslant f(b)=b$
, and induction yields
$f^n(x)\geqslant b$
. Then,
$f_{\infty }(x)\geqslant b$
by Fact 3.10. Since
$f_{\infty }(x)\in [a,b]$
by Fact 3.10 again, we derive that
$f_{\infty }(x)=b$
. Similar arguments give
$f_{\infty }(x)=a$
when
$x\leqslant a$
. Clearly, when
$x\in [a,b]$
,
$(\forall n\in \mathbb {N})\ f^{n}(x)=x$
, so
$f_{\infty }(x)=x$
. Altogether
$f_{\infty }=P_{{\operatorname {Fix}} f}$
.
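A minimal illustration of Proposition 3.17 (our sketch; the particular $f=({\operatorname{Id}}+P_{[0,1]})/2$ is a hypothetical choice of a firmly nonexpansive function with ${\operatorname{Fix}} f=[0,1]$):

```python
import numpy as np

def f(x):
    # f = (Id + P_[0,1]) / 2 is firmly nonexpansive with Fix f = [0, 1]
    return (x + np.clip(x, 0.0, 1.0)) / 2

for x0 in (5.0, -3.0, 0.4):
    x = x0
    for _ in range(100):          # iterate f; f^n(x0) converges
        x = f(x)
    print(x0, '->', round(x, 6))  # 1.0, 0.0, 0.4 = P_[0,1](x0), as predicted
```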
Motivated by Example 3.12 and Proposition 3.17, one might conjecture that
$k\left (T_{\infty }\right )=1 / 2$
whenever
$k\left (T\right ) \leqslant 1 / 2$
. However, this is not true in general. To find a counterexample, by Theorem 3.14, it suffices to find a firmly nonexpansive operator whose limiting operator is not a projection. We conclude this section with the following example from [Reference Bauschke, Dao, Noll and Phan7, Example 4.2].
Example 3.18 Suppose that
$X=\mathbb {R}^2$
. Let
$A=\mathbb {R} (1, 1)$
and
$B=\left \{(x, y) \in \mathbb {R}^2 \mid -y \leqslant x \leqslant 2\right \}$
. For
$z=(x, y)\in \mathbb {R}^2,$
we have
$P_A(z)=\left (\frac {x+y}{2}, \frac {x+y}{2}\right )$
and
$$ \begin{align*}P_B(z)=\begin{cases} z & \text{if } -y\leqslant x\leqslant 2,\\ \left(\frac{x-y}{2},\,\frac{y-x}{2}\right) & \text{if } x+y<0 \text{ and } x-y\leqslant 4,\\ (2,y) & \text{if } x>2 \text{ and } y\geqslant -2,\\ (2,-2) & \text{if } x-y>4 \text{ and } y<-2. \end{cases}\end{align*} $$
Then, the Douglas–Rachford operator
$T=\operatorname {Id}-P_A+P_B (2 P_A-\mathrm {Id})$
is firmly nonexpansive and has
$k(T_{\infty })>1/2$
. By Theorem 3.14, it suffices to show
$T_{\infty } \neq P_{{\operatorname {Fix}} T}$
. Indeed, by [Reference Bauschke, Dao, Noll and Phan7, Fact 3.1] we have
${\operatorname {Fix}} T=\big \{{s(1,1)} \mid {s\in [0,2]}\big \}$
because of
$A\cap {\operatorname {int}} B\neq \varnothing $
. Let
$z_0=(4,10)$
, and
$(\forall n \in \mathbb {N})$
$z_{n+1}=T z_n$
. Direct computations give
$$ \begin{align*}z_1=(-1,7),\quad z_2=(-2,3),\quad z_3=\left(-\tfrac{1}{2},\tfrac{1}{2}\right),\quad z_4=(0,0),\quad\text{and}\quad (\forall n\geqslant 4)\ z_n=z_4.\end{align*} $$
On the other hand, let
$z^*=(2, 2)$
, then
$\{z_4, z^*\} \subset {\operatorname {Fix}} T$
. Thus,
$T_{\infty }z_0=z_4$
while
$P_{{\operatorname {Fix}} T}z_0 \neq z_4$
as
$\left \|z_0-z^*\right \|=2 \sqrt {17} < \left \|z_0-z_4\right \|=2 \sqrt {29}$
.
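The computations above can be reproduced numerically; the sketch below (ours) implements the projections directly, with $P_B$ realized by comparing the feasible candidates among $z$, the projections onto the two bounding half-spaces, and the corner $(2,-2)$:

```python
import numpy as np

def P_A(z):                        # projection onto the line A = R(1, 1)
    m = (z[0] + z[1]) / 2
    return np.array([m, m])

def P_B(z):                        # projection onto B = {(x, y) : -y <= x <= 2},
    x, y = z                       # the intersection of {x + y >= 0} and {x <= 2}
    cands = [z,
             z + max(0.0, -(x + y)) / 2 * np.array([1.0, 1.0]),  # onto {x+y >= 0}
             np.array([min(x, 2.0), y]),                          # onto {x <= 2}
             np.array([2.0, -2.0])]                               # corner point
    feas = [c for c in cands if c[0] + c[1] >= -1e-12 and c[0] <= 2 + 1e-12]
    return min(feas, key=lambda c: np.linalg.norm(c - z))

def T(z):                          # T = Id - P_A + P_B(2 P_A - Id)
    return z - P_A(z) + P_B(2 * P_A(z) - z)

z = np.array([4.0, 10.0])
for n in range(1, 6):
    z = T(z)
    print(n, z)                    # z4 = (0, 0) is reached and then fixed,
                                   # while P_{Fix T}(4, 10) = (2, 2)
```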
4 Resolvent
Let
$A: X \rightrightarrows X$
be a set-valued operator, i.e., a mapping from X to its power set. Recall that the resolvent of A is
$J_A :=(\mathrm {Id}+A)^{-1}$
and the reflected resolvent of A is
$R_A :=2 J_A-\mathrm {Id}$
. The graph of A is
$ {\operatorname {gra}} A :=\{(x, u) \in X \times X \mid u \in A x\}$
and the inverse of A, denoted by
$A^{-1}$
, is the operator with graph
$\operatorname {gra} A^{-1} :=\{(u, x) \in X \times X \mid u \in A x\}$
. The domain of A is
$\operatorname {dom} A :=\{x \in X \mid A x \neq \varnothing \}$
. A is monotone, if
$$ \begin{align*}(\forall (x,u)\in{\operatorname{gra}} A)(\forall (y,v)\in{\operatorname{gra}} A)\quad \langle x-y,\ u-v\rangle\geqslant 0.\end{align*} $$
A is maximally monotone, if it is monotone and there is no monotone operator
$B: X \rightrightarrows X$
such that
$\operatorname {gra} A$
is properly contained in
$\operatorname {gra} B$
. Unless stated otherwise, we assume from now on that
$$ \begin{align*}A: X \rightrightarrows X \text{ is maximally monotone.}\end{align*} $$
Fact 4.1 ((Minty’s theorem) [Reference Bauschke and Combettes6, Proposition 23.8])
Let
$T: X \rightarrow X$
. Then, T is firmly nonexpansive if and only if T is the resolvent of a maximally monotone operator.
The goal of this section is to give characterizations of normal and special nonexpansiveness by using monotone operator theory.
4.1 Auxiliary results
We first provide a nice formula for the modulus of averagedness of
$(1-\lambda ){\operatorname {Id}}+\lambda T$
in terms of the modulus of averagedness of T. The following is an adaptation of [Reference Bauschke and Combettes6, Proposition 4.40]. For completeness, we include a simple proof.
Fact 4.2 Let
$T:X\rightarrow X$
be nonexpansive and let
$\lambda \in (0, 1]$
. For
$\alpha \in [0, 1]$
, T is
$\alpha $
-averaged if and only if
$(1-\lambda ){\operatorname {Id}}+ \lambda T$
is
$\lambda \alpha $
-averaged.
Proof Suppose T is
$\alpha $
-averaged. Then,
$T=(1-\alpha ){\operatorname {Id}}+\alpha R$
with R being nonexpansive. It follows that
$$ \begin{align*}(1-\lambda){\operatorname{Id}}+\lambda T=(1-\lambda){\operatorname{Id}}+\lambda\big((1-\alpha){\operatorname{Id}}+\alpha R\big)=(1-\lambda\alpha){\operatorname{Id}}+\lambda\alpha R,\end{align*} $$
so that
$(1-\lambda ){\operatorname {Id}} +\lambda T$
is
$\lambda \alpha $
-averaged. Because
$\lambda \in (0,1]$
, the reverse direction also holds.
Lemma 4.3 Let
$T: X \rightarrow X$
be nonexpansive. Then, for every
$\lambda \in [0,1],$
we have
$$ \begin{align*}k\big((1-\lambda){\operatorname{Id}}+\lambda T\big)=\lambda\, k(T). \tag{4.3}\end{align*} $$
Proof We split the proof into the following cases.
Case 1:
$\lambda =0$
. Clearly, (4.3) holds because
$k({\operatorname {Id}})=0$
.
Case 2:
$\lambda>0$
. We show (4.3) by two subcases.
Case 2.1:
$k((1-\lambda ) \mathrm {Id}+\lambda T)=0$
. By Proposition 2.3, there exists
$v \in X$
:
$ (1-\lambda ) \mathrm {Id}+\lambda T={\operatorname {Id}}+v$
so that
$T={\operatorname {Id}}+v/\lambda $
. Then,
$k((1-\lambda ) {\operatorname {Id}}+\lambda T)=0=k(T)$
by Proposition 2.3 again.
Case 2.2:
$k((1-\lambda ) \mathrm {Id}+\lambda T)>0$
. On one hand, we derive
$k((1-\lambda ){\operatorname {Id}}+\lambda T)\leqslant \lambda k(T)$
by Fact 4.2. On the other hand, since
$(1-\lambda ) \mathrm {Id}+\lambda T$
is
$\lambda $
-averaged, we have
$0<k((1-\lambda ) \mathrm {Id}+\lambda T)\leqslant \lambda $
. For every
$\beta \in [k((1-\lambda ) \mathrm {Id}+\lambda T), \lambda ]$
, the mapping
$(1-\lambda ) \mathrm {Id}+\lambda T$
is
$\beta $
-averaged. Write
$\beta =\lambda \alpha $
with
$\alpha =\beta /\lambda \in (0,1]$
. Fact 4.2 implies that T is
$\alpha $
-averaged, thus
$k(T)\leqslant \beta /\lambda $
. Taking infimum over
$\beta $
gives
$k(T)\leqslant k((1-\lambda ){\operatorname {Id}}+\lambda T)/\lambda $
, i.e.,
$\lambda k(T)\leqslant k((1-\lambda ){\operatorname {Id}}+\lambda T)$
. Therefore,
$k((1-\lambda ){\operatorname {Id}}+\lambda T)=\lambda k(T)$
.
Altogether, (4.3) holds.
Example 4.4 Let C be a nonempty closed convex set in X and
$C \neq X$
. Consider the reflector to C defined by
$R_C:=2 P_C-\mathrm {Id}$
. Then, the following hold:
-
(i)
$k(R_{C})=1$ .
-
(ii) For
$\lambda \in [0,1]$ ,
$k((1-\lambda ){\operatorname {Id}}+\lambda R_{C})=\lambda $ .
-
(iii) For
$\lambda \in [0,1]$ ,
$k((1-\lambda ){\operatorname {Id}}+\lambda P_{C})=\lambda /2$ .
Remark 4.5 This recovers [Reference Bauschke, Bendit and Moursi5, Example 2.3] for
$C=V$
, a closed subspace of X.
Example 4.6 Let
$A: X \rightrightarrows X$
be maximally monotone. Consider the reflected resolvent of A defined by
$R_{A}:=2J_{A}-{\operatorname {Id}}$
. Then,
$k\left (R_A\right )=2k\left (J_A\right )$
by Lemma 4.3. Consequently,
$k\left (R_A\right )<1$
(that is,
$R_A$
is
$\alpha $
-averaged for some
$\alpha \in [0, 1)$
) if and only if
$J_A$
is normally nonexpansive. Likewise,
$k\left (R_A\right )=1$
if and only if
$J_A$
is specially nonexpansive.
The following result concerning the Douglas–Rachford operator (see, e.g., [Reference Bauschke and Combettes6, Reference Bauschke and Moursi9]) is of independent interest.
Theorem 4.7 Let
$U, V$
be two closed subspaces of X, and
$U\neq V$
. Consider the Douglas–Rachford operator
$$ \begin{align*}T_{U,V}:=\frac{{\operatorname{Id}}+R_U R_V}{2}.\end{align*} $$
Then,
$k(T_{U,V})=1/2$
.
Proof We have
$R_U R_V \neq {\operatorname {Id}}$
since
$U \neq V$
. Note both
$R_U$
and
$R_V$
are orthogonal. Thus,
$k\left (R_U R_V\right )=1$
by Corollary 3.9. Therefore, by Lemma 4.3, we have
$k(T_{U,V})=k(R_{U}R_{V})/2=1/2.$
Remark 4.8 Let
$A,B:X\rightrightarrows X$
be two maximally monotone operators. The Douglas–Rachford operator related to
$(A,B)$
is
$$ \begin{align*}T_{A,B}:=\frac{{\operatorname{Id}}+R_A R_B}{2}.\end{align*} $$
It would be interesting to determine
$k(T_{A,B})$
in general.
Next, we recall Yosida regularizations of monotone operators. They are essential for our proofs in Section 4.2.
Definition 4.1 (Yosida regularization)
For
$\mu>0$
, the Yosida
$\mu $
-regularization of A is the operator
$$ \begin{align*}Y_\mu(A):=\big(\mu\,{\operatorname{Id}}+A^{-1}\big)^{-1}.\end{align*} $$
For Yosida regularization, we have the classic identity:
$Y_\mu (A)=\mu ^{-1}\left ({\operatorname {Id}}-J_{\mu A}\right )$
; see [Reference Rockafellar and Wets22, Lemma 12.14]. The following result is [Reference Bauschke and Combettes6, Theorem 23.7(iv)]. Here, we take the opportunity to give a detailed proof.
Proposition 4.9 For
$\alpha , \mu>0$
, the following formula holds
$$ \begin{align*}J_{\alpha Y_\mu(A)}=\frac{\mu}{\mu+\alpha}\,{\operatorname{Id}}+\frac{\alpha}{\mu+\alpha}\,J_{(\mu+\alpha)A}.\end{align*} $$
Proof First,
$$ \begin{align*}\alpha Y_\mu(A)=\alpha\big(\mu\,{\operatorname{Id}}+A^{-1}\big)^{-1}=\Big(\tfrac{\mu}{\alpha}\,{\operatorname{Id}}+(\alpha A)^{-1}\Big)^{-1}=Y_{\mu/\alpha}(\alpha A).\end{align*} $$
Thus, we only need to prove the formula holds for
$\alpha =1$
.
Let
$y \in X$
,
$z=J_{(\mu +1) A}(y)$
and
$x=\frac {\mu }{\mu +1} y+\frac {1}{\mu +1} z$
. We will prove
$x=J_{Y_{\mu (A)}}(y)$
. We have
$z=(\mu +1) x-\mu y$
,
$y-z=\frac {\mu +1}{\mu }(x-z)$
and
$Y_\mu (A)=\frac {1}{\mu }\left ({\operatorname {Id}}-J_{\mu A}\right )$
. Thus,
$$ \begin{align*}\frac{x-z}{\mu}=\frac{y-z}{\mu+1}\in Az \quad\text{because } z=J_{(\mu+1)A}(y),\end{align*} $$
so that $J_{\mu A}(x)=x-\mu(y-x)=z$ and $Y_\mu(A)(x)=\frac{1}{\mu}\big(x-J_{\mu A}(x)\big)=\frac{x-z}{\mu}=y-x$. Hence $x+Y_\mu(A)(x)=y$, i.e., $x=J_{Y_{\mu}(A)}(y)$.
Combining Proposition 4.9 and Lemma 4.3, we have the following.
Corollary 4.10 For any
$\alpha \in [0,1)$
, the following hold:
-
(i)
$J_{\alpha Y_{1-\alpha }(A)}=(1-\alpha ) \mathrm {Id}+\alpha J_A$ .
-
(ii)
$k(J_{\alpha Y_{1-\alpha }(A)})=\alpha k(J_A)$ .

Corollary 4.11 For
$\mu>0$
, the following hold:
-
(i)
$J_{Y_\mu (A)}=\frac {\mu }{\mu +1} {\operatorname {Id}}+\frac {1}{\mu +1} J_{(\mu +1) A}$ .
-
(ii)
$k(J_{Y_\mu (A)})=\frac {1}{\mu +1} k(J_{(\mu +1) A})$ .
Example 4.12 Let C be a nonempty closed convex set in X and
$C\neq X$
. Consider the normal cone to C defined by
$N_{C}(x):=\{u \in X \mid \sup _{c \in C} \langle c-x, u\rangle \leqslant 0\}$
if
$x \in C$
, and
$\varnothing $
otherwise. Then,
$$ \begin{align*}(\forall \alpha\in[0,1))\quad k\big(J_{\alpha Y_{1-\alpha}(N_C)}\big)=\frac{\alpha}{2}. \tag{4.4}\end{align*} $$
In particular,
$$ \begin{align*}(\forall \mu>0)\quad k\big(J_{Y_{\mu}(N_C)}\big)=\frac{1}{2(\mu+1)}. \tag{4.5}\end{align*} $$
Proof Apply Corollary 4.10 with
$A=N_{C}$
to obtain
$$ \begin{align*}J_{\alpha Y_{1-\alpha}(N_{C})}=(1-\alpha)\,{\operatorname{Id}}+\alpha J_{N_{C}}=(1-\alpha)\,{\operatorname{Id}}+\alpha P_{C}.\end{align*} $$
Using Lemma 4.3 and
$k(P_{C})=1/2$
because
$C\neq X$
, we have
$$ \begin{align*}k\big(J_{\alpha Y_{1-\alpha}(N_{C})}\big)=\alpha\, k(P_{C})=\frac{\alpha}{2}.\end{align*} $$
Finally, (4.5) follows from (4.4) by using
$\mu =(1-\alpha )/\alpha $
, together with $\alpha Y_{1-\alpha}(N_{C})=Y_{(1-\alpha)/\alpha}(\alpha N_{C})=Y_{(1-\alpha)/\alpha}(N_{C})$, where the last equality holds because $N_{C}$ is cone-valued, so $\alpha N_{C}=N_{C}$.
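For $X=\mathbb{R}$ and $C=[0,1]$, the identity behind this example can be checked numerically (our sketch): the claimed resolvent $J_{Y_\mu(N_C)}=\frac{\mu}{\mu+1}{\operatorname{Id}}+\frac{1}{\mu+1}P_C$ should solve $({\operatorname{Id}}+Y_\mu(N_C))x=y$, where $Y_\mu(N_C)=({\operatorname{Id}}-P_C)/\mu$.

```python
import numpy as np

P_C = lambda x: np.clip(x, 0.0, 1.0)        # C = [0, 1], so J_{N_C} = P_C
mu = 0.7
Y = lambda x: (x - P_C(x)) / mu             # Yosida regularization Y_mu(N_C)

y = np.linspace(-5.0, 5.0, 1001)
x = mu / (mu + 1) * y + 1 / (mu + 1) * P_C(y)   # claimed J_{Y_mu(N_C)}(y)
print(np.max(np.abs(x + Y(x) - y)))             # ~0: x solves (Id + Y)x = y
```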
Remark 4.13 Observe that Corollary 4.11(i) shows that
$Y_{\mu }(A)$
is the resolvent average of monotone operators
$0$
and
$(\mu +1)A$
(see, e.g., [Reference Bartz, Bauschke, Moffat and Wang4]).
4.2 Characterization of normally averaged mappings
The Yosida regularization of monotone operators provides the key. Recall that
$T: X \rightarrow X$
is
$\mu $
-cocoercive with
$\mu>0$
if
$\mu T$
is firmly nonexpansive, i.e.,
$$ \begin{align*}(\forall x\in X)(\forall y\in X)\quad \langle x-y,\ Tx-Ty\rangle\geqslant\mu\|Tx-Ty\|^2.\end{align*} $$
Fact 4.14 [Reference Bauschke and Combettes6, Proposition 23.21 (ii)]
$T: X \rightarrow X$
is
$\mu $
-cocoercive if and only if there exists a maximally monotone operator
$A: X\rightrightarrows X$
such that
$T=Y_\mu (A)$
.
Lemma 4.15 Let
$A:X\rightrightarrows X$
be maximally monotone. Suppose that
$J_A$
is normally nonexpansive. Then, A is single-valued with full domain, and cocoercive.
Proof If
$k(J_{A})=0$
, Proposition 2.3 shows that
$J_{A}={\operatorname {Id}}+ v$
for some
$v\in X$
. Then,
$A=-v$
is a constant operator, which is clearly single-valued with full domain, and cocoercive. Hence, we shall assume
$0<k\left (J_A\right )<1/2$
. Set
$$ \begin{align*}N:=\frac{1}{2k(J_A)}\Big(J_A-\big(1-2k(J_A)\big){\operatorname{Id}}\Big).\end{align*} $$
Then,
$J_A=\left (1-2 k\left (J_A\right )\right ) {\operatorname {Id}}+2 k\left (J_A\right ) N$
and N is nonexpansive with
$k(N)=1/2$
by Lemma 4.3. It follows from Fact 4.1 that N is firmly nonexpansive, i.e., there exists a maximally monotone operator
$B:X\rightrightarrows X$
such that
$N=J_B$
. Thus, by Corollary 4.10, we have
$$ \begin{align*}J_{2k(J_A)\,Y_{1-2k(J_A)}(B)}=\big(1-2k(J_A)\big){\operatorname{Id}}+2k(J_A)\,J_B=J_A.\end{align*} $$
Therefore,
$A=2 k\left (J_A\right ) Y_{1-2 k\left (J_A\right )}(B)$
. Since
$J_A$
is normally nonexpansive, we have
$2 k\left (J_A\right ) \in (0,1)$
. Thus,
$2 k\left (J_A\right ) Y_{1-2 k\left (J_A\right )}(B)$
, being a Yosida regularization, is a single-valued, full domain, and cocoercive operator due to Fact 4.14. Hence, A is single-valued with full domain, and cocoercive.
Lemma 4.16 Suppose
$A: X\rightrightarrows X$
is single-valued with full domain, and cocoercive. Then,
$J_A$
is normally nonexpansive.
Proof Since A is single-valued with full domain, and cocoercive, by Fact 4.14, there exist a maximally monotone operator
$B:X\rightrightarrows X$
and
$\mu>0$
such that
$A=Y_\mu (B)$
. Since B is maximally monotone, by Corollary 4.11, we have
$$ \begin{align*}J_A=J_{Y_\mu(B)}=\frac{\mu}{\mu+1}\,{\operatorname{Id}}+\frac{1}{\mu+1}\,J_{(\mu+1)B}.\end{align*} $$
Since B is maximally monotone and
$\mu +1>1$
, we have
$(\mu +1) B$
is maximally monotone as well. Thus,
$k(J_{(\mu +1) B}) \leqslant 1/2$
by Fact 4.1. Now, Lemma 4.3 gives
$$ \begin{align*}k(J_A)=\frac{1}{\mu+1}\,k\big(J_{(\mu+1)B}\big)\leqslant\frac{1}{2(\mu+1)}<\frac{1}{2}.\end{align*} $$
The main result of this section is as follows.
Theorem 4.17 (Characterization of normally averaged mapping)
Let
$A:X\rightrightarrows X$
be maximally monotone. Then,
$J_A$
is normally nonexpansive if and only if A is single-valued with full domain, and cocoercive.
In view of Fact 4.1, the characterization of special nonexpansiveness follows immediately as well.
Example 4.18 Let
$A\in \mathbb {S}_{++}^{n}$
, the set of
$n\times n$
positive definite symmetric matrices. Then,
$k\left (J_A\right )<1/2$
and
$k\left (J_{A^{-1}}\right )<1/2$
by Theorem 4.17.
The following fact follows from [Reference Bauschke, Moffat and Wang8, Theorem 2.1(i)&(iv)].
Fact 4.19 Let
$T:X\rightarrow X$
be firmly nonexpansive. Then, the following hold:
-
(i)
$T=J_{A}$ for a maximally monotone operator
$A:X\rightrightarrows X$ .
-
(ii) T is injective if and only if A is at most single-valued, i.e.,
$$ \begin{align*}(\forall x\in{\operatorname{dom}} A)\ Ax \text{ is empty or a singleton.}\end{align*} $$
-
(iii) T is surjective if and only if
${\operatorname {dom}} A=X$ .
5 Proximal operator
Let
$f \in \Gamma _0(X).$
Recall that the proximal operator of f is given by
$$ \begin{align*}(\forall x\in X)\quad \mathrm{P}_{f}(x):=\underset{u \in X}{\operatorname{argmin}}\left\{f(u)+\frac{1}{2}\|u-x\|^2\right\},\end{align*} $$
that the Moreau envelope of f with parameter
$\mu>0$
is defined by
$e_\mu f(x):=\min _{u \in X}(f(u)+\frac {1}{2 \mu }\|u-x\|^2)$
, and that the Fenchel conjugate of f is defined by
$f^*(y):=\sup _{x \in X}(\langle x, y\rangle -f(x))$
for
$y \in X$
. It is well known that
$\mathrm {P}_{f}=(\mathrm {Id}+\partial f)^{-1}$
, where
$\partial f$
is the subdifferential of f given by
$\partial f(x):=\{u \in X \mid (\forall y \in X) \ f(y)\geqslant f(x)+\langle u, y-x\rangle \}$
if
$x\in {\operatorname {dom}} f$
, and
$\varnothing $
if
$x\not \in {\operatorname {dom}} f$
. Also,
$\mathrm {P}_f$
is firmly nonexpansive, i.e.,
$k\left (\mathrm {P}_f\right ) \leqslant 1/2$
(see, e.g., [Reference Bauschke and Combettes6, Reference Rockafellar and Wets22]).
In this section, we will characterize the normal and special nonexpansiveness of
$\mathrm {P}_{f}$
. We begin with the following definition.
Definition 5.1 (L-smoothness)
Let
$L \in [0,+\infty )$
. Then, f is L-smooth on X if f is Fréchet differentiable on X and
$\nabla f$
is L-Lipschitz, i.e.,
$$ \begin{align*}(\forall x\in X)(\forall y\in X)\quad \|\nabla f(x)-\nabla f(y)\|\leqslant L\|x-y\|.\end{align*} $$
Fact 5.1 ((Baillon-Haddad) [Reference Baillon and Haddad2] (see also [Reference Bauschke and Combettes6, Corollary 18.17]))
Let
$f \in \Gamma _0(X)$
. Suppose f is Fréchet differentiable on X. Then,
$\nabla f$
is
$\mu $
-cocoercive if and only if
$\nabla f$
is
$\mu ^{-1}$
-Lipschitz continuous.
For further properties of L-smooth functions, see [Reference Bauschke and Combettes6, Reference Beck11, Reference Rockafellar and Wets22]. We also need
Fact 5.2 ((Moreau) [Reference Bauschke and Combettes6, Theorem 20.25])
Let
$f \in \Gamma _0(X)$
. Then,
$\partial f$
is maximally monotone.
The following interesting result characterizes an L-smooth function f via the modulus of averagedness of
$\mathrm {P}_{f}$
. It shows that, for proximal operators, not only can Theorem 4.17 be significantly improved, but the converse of Theorem 2.12 also holds.
Theorem 5.3 (Characterization of normal proximal operator)
Let
$f\in \Gamma _{0}(X)$
. Then, the following are equivalent:
-
(i)
$\mathrm {P}_{f}$ is normally nonexpansive.
-
(ii) There exists
$L>0$ such that f is L-smooth on X.
-
(iii)
$f^*$ is
$1/L$ -strongly convex for some
$L>0$ .
-
(iv)
$\mathrm {P}_{f^*}$ is a Banach contraction.
-
(v)
$\mathrm {P}_{f}$ is a bi-Lipschitz homeomorphism of X.
Proof “(i)
$\Leftrightarrow $
(ii)”: By Fact 5.2,
$\partial f$
is maximally monotone. Let
$A=\partial f$
in Theorem 4.17 and combine it with Fact 5.1.
“(ii)
$\Leftrightarrow $
(iii)”: Apply [Reference Bauschke and Combettes6, Theorem 18.15].
“(iii)
$\Leftrightarrow $
(iv)”: Apply [Reference Luo, Wang and Yang18, Corollary 3.6].
“(i)
$\Rightarrow $
(v)”: Apply Theorem 2.12.
“(v)
$\Rightarrow $
(i)”: The assumption implies that
$(\mathrm {P}_{f})^{-1}={\operatorname {Id}}+\partial f$
has full domain and is single-valued and Lipschitz; hence so is
$\partial f=\nabla f$
. By Fact 5.1,
$\nabla f$
is cocoercive. It remains to apply Theorem 4.17.
Remark 5.4 (1) Bi-Lipschitz homeomorphisms of a Euclidean space form an important class of operators. For instance, Hausdorff dimension, which plays a central role in fractal geometry and harmonic analysis, is bi-Lipschitz invariant (see [Reference Falconer17]). Theorem 5.3
$(\mathrm {i})\Leftrightarrow (\mathrm {v})$
thus provides a large class of such nonlinear operators.
(2) By endowing
$\Gamma _{0}(X)$
with the topology of epi-convergence (see, e.g., [Reference Planiden and Wang21, Proposition 3.5, Corollary 4.18]), Theorem 5.3
$(\mathrm {i}) \Leftrightarrow (\mathrm {ii})$
implies that most convex functions have their proximal mappings with modulus of averagedness exactly
$1/2$
, in the sense of co-meagerness (the complement of a meager set).
The characterization of special proximal operators follows immediately as well. The following example shows that
$\mathrm {P}_{f}$
being only bijective does not imply that
$\mathrm {P}_{f}$
is normally nonexpansive.
Example 5.5 Let
$X=\mathbb {R}$
. Define
$$ \begin{align*}\varphi(x)=\begin{cases} \ln x &\text{ if } x\geqslant e,\\ x/e &\text{ if } -e\leqslant x\leqslant e,\\ -\ln(-x) & \text{ if } x\leqslant -e. \end{cases}\end{align*} $$
Then, the following hold:
-
(i)
$\varphi $ is a proximal operator of a function in
$\Gamma _0(\mathbb {R})$ .
-
(ii)
$\varphi $ is a bijection.
-
(iii)
$\varphi $ is specially nonexpansive.
-
(iv) The inverse mapping of
$\varphi $ is
$$ \begin{align*}\varphi^{-1}(y)=\begin{cases} e^y &\text{ if } y\geqslant 1,\\ e y &\text{ if } -1\leqslant y\leqslant 1,\\ -e^{-y} & \text{ if } y\leqslant -1. \end{cases} \end{align*} $$
Proof (i):
$\varphi $
is a proximal operator because it is nonexpansive and increasing (see [Reference Bauschke and Combettes6, Proposition 24.31]). (ii): Obvious. (iii): We have that
$\varphi $
is differentiable with
$$ \begin{align*}\varphi'(x)=\begin{cases} 1/x &\text{ if } x\geqslant e,\\ 1/e &\text{ if } -e\leqslant x\leqslant e,\\ -1/x & \text{ if } x\leqslant -e. \end{cases}\end{align*} $$
Thus,
$\inf _{x \in \mathbb {R}} \varphi ^{\prime }(x)=0$
. By Lemma 3.15 or [Reference Bauschke, Bendit and Moursi5, Proposition 2.8],
$k(\varphi )=(1-\inf _{x \in \mathbb {R}} \varphi ^{\prime }(x))/{2}=1/2$
. (iv): Direct calculations.
Corollary 5.6 Let
$f\in \Gamma _{0}(X)$
. Suppose
${\operatorname {dom}} f \neq X$
. Then,
$\mathrm {P}_{f}$
is specially nonexpansive.
Proof Observe that
$\operatorname {dom} f \neq X$
implies
$\operatorname {dom} \partial f \neq X$
. Thus, f is not L-smooth for any
$L>0$
and the result follows by Theorem 5.3.
Remark 5.7 When C is a nonempty closed convex subset of X and
$C\neq X$
, obviously
$\iota _C \in \Gamma _0(X)$
and
$\operatorname {dom} \iota _C=C$
. By Corollary 5.6,
$P_C$
is specially nonexpansive, which recovers Example 2.15.
For the Moreau envelope, we have the following result.
Proposition 5.8 Let
$f\in \Gamma _{0}(X)$
and let
$\mu , \alpha>0$
. Then,
$$ \begin{align*}k\big(\mathrm{P}_{\alpha e_\mu f}\big)=\frac{\alpha}{\mu+\alpha}\,k\big(\mathrm{P}_{(\mu+\alpha)f}\big). \tag{5.1}\end{align*} $$
If, in addition, f is not Lipschitz smooth, then
$$ \begin{align*}k\big(\mathrm{P}_{\alpha e_\mu f}\big)=\frac{\alpha}{2(\mu+\alpha)}.\end{align*} $$
Proof By [Reference Bauschke and Moursi9, Theorem 27.9], we have
$$ \begin{align*}\mathrm{P}_{\alpha e_\mu f}=\frac{\mu}{\mu+\alpha}\,{\operatorname{Id}}+\frac{\alpha}{\mu+\alpha}\,\mathrm{P}_{(\mu+\alpha)f}.\end{align*} $$
It suffices to apply Lemma 4.3.
If, in addition, f is not Lipschitz smooth, then
$(\mu +\alpha )f$
is not Lipschitz smooth so that
$k(\mathrm {P}_{(\mu +\alpha )f})=1/2$
by Theorem 5.3. Use (5.1) to complete the proof.
Example 5.9 Let
$\mu , \alpha>0$
. Consider the Huber function defined by
$$ \begin{align*}H_{\mu}(x)=\begin{cases} \dfrac{\|x\|^2}{2\mu} &\text{ if } \|x\|\leqslant\mu,\\[4pt] \|x\|-\dfrac{\mu}{2} &\text{ if } \|x\|>\mu. \end{cases}\end{align*} $$
It is well-known that
$H_{\mu }=e_{\mu }\|\cdot \|$
and that
$\|\cdot \|$
is not Lipschitz smooth. Therefore, by Proposition 5.8,
$$ \begin{align*}k\big(\mathrm{P}_{\alpha H_{\mu}}\big)=k\big(\mathrm{P}_{\alpha e_{\mu}\|\cdot\|}\big)=\frac{\alpha}{2(\mu+\alpha)}.\end{align*} $$
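For $X=\mathbb{R}$ this is easy to verify numerically (our sketch): here $\mathrm{P}_{\lambda|\cdot|}$ is soft-thresholding, $H_\mu'$ is the gradient of $e_\mu|\cdot|$, and the identity $\mathrm{P}_{\alpha e_\mu f}=\frac{\mu}{\mu+\alpha}{\operatorname{Id}}+\frac{\alpha}{\mu+\alpha}\mathrm{P}_{(\mu+\alpha)f}$ is tested through the optimality condition $p+\alpha H_\mu'(p)=y$.

```python
import numpy as np

mu, alpha = 0.5, 2.0
soft = lambda x, lam: np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)  # P_{lam|.|}
dH = lambda x: np.where(np.abs(x) <= mu, x / mu, np.sign(x))         # H_mu'

y = np.linspace(-6.0, 6.0, 1201)
p = mu / (mu + alpha) * y + alpha / (mu + alpha) * soft(y, mu + alpha)
print(np.max(np.abs(p + alpha * dH(p) - y)))  # ~0: p = P_{alpha H_mu}(y)
print(alpha / (2 * (mu + alpha)))             # predicted modulus k = 0.4
```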
Example 5.10 Let C be a nonempty closed convex subset of X and
$C\neq X$
. Consider the support function of C defined by
$\sigma _C: X \rightarrow [-\infty ,+\infty ]: x \mapsto \sup _{c \in C}\langle c, x\rangle .$
Then, the following hold:
-
(i) If C is a singleton, then
$k(P_{C})=1/2$ and
$(\forall \lambda>0)\ k(\mathrm {P}_{\lambda \sigma _{C}})=0.$
-
(ii) If C contains more than one point, then
$k(P_{C})=1/2$ and
$(\forall \lambda>0)\ k(\mathrm {P}_{\lambda \sigma _{C}})=1/2.$
Proof The fact that
$k(P_{C})=1/2$
has been given by Example 2.15. Now observe that the support function
$\sigma _{C}$
has
$\mathrm {P}_{\lambda \sigma _{C}}={\operatorname {Id}}-\lambda P_{C}(\cdot /\lambda ).$
(i): We have
$\mathrm {P}_{\lambda \sigma _{C}}={\operatorname {Id}}+v$
for some
$v \in X$
. Then, apply Proposition 2.3.
(ii): The function
$\lambda \sigma _{C}$
is not Lipschitz smooth, since it is not differentiable at
$0$
. Apply Theorem 5.3 to derive
$k(\mathrm {P}_{\lambda \sigma _{C}})=1/2$
.
6 Compute modulus of averagedness via other constants or values
In this section, introducing the monotone value for monotone operators, the cocoercive value for cocoercive mappings, and the Lipschitz value for Lipschitz mappings, we provide various formulae to quantify the modulus of averagedness of resolvents and proximal operators.
6.1 Monotone value and cocoercive value
Recall that we assume
$A: X \rightrightarrows X$
is a maximally monotone operator. For
$\mu>0$
, we say that A is
$\mu $
-strongly monotone if
$A-\mu \mathrm {Id}$
is monotone, i.e.,
$$ \begin{align*}(\forall (x,u)\in{\operatorname{gra}} A)(\forall (y,v)\in{\operatorname{gra}} A)\quad \langle x-y,\ u-v\rangle\geqslant\mu\|x-y\|^2.\end{align*} $$
It is clear that if an operator is
$\mu _0$
-strongly monotone (or cocoercive), then it is
$\mu $
-strongly monotone (or cocoercive) for
$\mu \leqslant \mu _0$
. Observing this property, we define the following functions for a maximally monotone operator.
Definition 6.1 (Monotone value)
Suppose that A is strongly monotone. The monotone value (or best strong monotonicity constant) of A is defined by
$$ \begin{align*}m(A):=\sup\big\{\mu>0 \;\big|\; A \text{ is } \mu\text{-strongly monotone}\big\}.\end{align*} $$
Otherwise, we define
$m(A)=0$
.
Definition 6.2 (Cocoercive value)
Suppose A is single-valued with full domain, and cocoercive. The cocoercive value (or best cocoercivity constant) of A is defined by
$$ \begin{align*}c(A):=\sup\big\{\mu>0 \;\big|\; A \text{ is } \mu\text{-cocoercive}\big\}.\end{align*} $$
Otherwise, we define
$c(A)=0$
.
We present basic properties of monotone value and cocoercive value. Note an operator is
$\mu $
-cocoercive if and only if its inverse is
$\mu $
-strongly monotone.
Proposition 6.1 Let
$\mu> 0$
. The following hold:
-
(i) (duality)
$m\left (A\right )=c\left (A^{-1}\right )$ and
$m\left (A^{-1}\right )=c\left (A\right )$ .
-
(ii)
$m(\mu A)=\mu m(A)$ and
$c(\mu A)=\mu ^{-1} c(A)$ .
-
(iii)
$c(A)=+\infty $ if and only if A is a constant operator on X.
-
(iv)
$m(A+B) \geqslant m(A)+m(B)$ and
$$ \begin{align*}c\left(\left(A^{-1}+B^{-1}\right)^{-1}\right) \geqslant c(A)+c(B).\end{align*} $$
-
(v) (Yosida regularization)
$m(A+\mu {\operatorname {Id}})=m(A)+\mu $ and
$c\left (Y_\mu (A)\right )=c(A)+\mu .$
Proof (i), (ii), (iii) and (iv) can be directly verified. (v): Since
$Y_\mu (A)=\left (\mu \mathrm {Id}+A^{-1}\right )^{-1}$
, we have
$$ \begin{align*}c\big(Y_\mu(A)\big)=m\big((Y_\mu(A))^{-1}\big)=m\big(\mu\,{\operatorname{Id}}+A^{-1}\big)=\mu+m\big(A^{-1}\big)=\mu+c(A).\end{align*} $$
The following fact connects averaged operators with cocoercive mappings, and can be directly verified.
Fact 6.2 [Reference Xu24, Proposition 3.4(iii)]
Let
$T:X\rightarrow X$
be nonexpansive and
$\alpha \in (0, 1)$
. Then, T is
$\alpha $
-averaged if and only if
${\operatorname {Id}}-T$
is
$1/(2\alpha )$
-cocoercive.
Proposition 6.3 Let
$T:X\rightarrow X$
be nonexpansive. Then,
$$ \begin{align*}c\big({\operatorname{Id}}-T\big)=\frac{1}{2k(T)}.\end{align*} $$
Corollary 6.4 Let
$T: X \rightarrow X$
be normally nonexpansive. Then,
${\operatorname {Id}}-T$
is a Banach contraction with constant
$2 k(T)$
.
Proof By Proposition 6.3,
$\operatorname {Id}-T$
is cocoercive with constant
$1/(2 k(T))$
. Using the Cauchy–Schwarz inequality, we have that
${\operatorname {Id}}-T$
is Lipschitz with constant
$2 k(T)$
. The contraction property follows by
$2 k(T)<1$
since T is normally nonexpansive.
Remark 6.5 Lemma 2.11 can also be proved by using Corollary 6.4 and the Banach fixed-point theorem. Indeed, given a normally nonexpansive T and for any
$v \in X$
, the mapping
$x \mapsto x-T x+v$
is a Banach contraction and, therefore, has a fixed point
$x_0$
. Then,
$x_0=x_0-T x_0+v$
, which implies that
$T x_0=v$
; therefore, T is surjective.
The following result connects the modulus of averagedness of a resolvent to the cocoercive value of the associated maximally monotone operator.
Proposition 6.6 (Modulus of averagedness via cocoercive value)
Let
$A: X \rightrightarrows X$
be maximally monotone and
$\alpha>0$
. Then,
$$ \begin{align*}k\big(J_{\alpha A}\big)=\frac{1}{2}\cdot\frac{\alpha}{\alpha+c(A)}.\end{align*} $$
Proof In view of Proposition 6.1(ii), it suffices to prove the case when
$\alpha =1$
. Note that
$Y_1(A)=\operatorname {Id}-J_{A}$
. By Proposition 6.1(v),
$c\left (\operatorname {Id}-J_{A}\right )=c(A)+1$
. Now apply Proposition 6.3.
We have the following corollary in view of Proposition 6.1(i).
Corollary 6.7 (Modulus of averagedness via monotone value)
Let
$A:X\rightrightarrows X$
be maximally monotone and
$\alpha>0$
. Then,
$$ \begin{align*}k\big(J_{\alpha A}\big)=\frac{1}{2}\cdot\frac{\alpha}{\alpha+m\big(A^{-1}\big)}.\end{align*} $$
The following example illustrates our formulae in this section.
Example 6.8 Suppose that
$A:X\rightarrow X$
is a bounded linear operator and that A is skew, i.e.,
$(\forall x\in X) \langle {{x},{Ax}}\rangle =0.$
Then, A is maximally monotone, and the following hold:
-
(i) If
$A\equiv 0$ , then
$c(A)=+\infty $ . Clearly,
$$ \begin{align*}k(J_{A})=k({\operatorname{Id}})=0=\frac{1}{2}\frac{1}{1+\infty}.\end{align*} $$
-
(ii) If A is not a zero operator, then
$c(A)=m(A)=m(A^{-1})=c(A^{-1})=0$ . Therefore, the formulae give
$k(J_{A})=k(J_{A^{-1}})=1/2$ , which coincides with Theorem 4.17 because neither A nor
$A^{-1}$ is cocoercive.
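A numerical check of Proposition 6.6 on the simplest nonzero skew matrix (our sketch; the sampling estimator is the one based on characterization (1.2) for linear maps):

```python
import numpy as np

A = np.array([[0.0, -1.0], [1.0, 0.0]])   # skew: <x, Ax> = 0, hence c(A) = 0
J = np.linalg.inv(np.eye(2) + A)          # resolvent J_A = (Id + A)^{-1}

rng = np.random.default_rng(0)
X = rng.standard_normal((200_000, 2))
R = X - X @ J.T                           # (Id - J_A)x
k = np.max(np.einsum('ij,ij->i', R, R) / (2 * np.einsum('ij,ij->i', X, R)))
print(k)                                  # ~0.5 = (1/2) * 1/(1 + c(A))
```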
6.2 Lipschitz value
Definition 6.3 (Lipschitz value)
Let
$T: X \rightarrow X$
. The Lipschitz value (or best Lipschitz constant) of T is defined by
$$ \begin{align*}\ell(T):=\sup_{\substack{x,y\in X\\ x\neq y}}\frac{\|Tx-Ty\|}{\|x-y\|}.\end{align*} $$
Moreover, for a maximally monotone operator
$A: X \rightrightarrows X$
, define
$\ell (A)=+\infty $
if A is not single-valued with full domain.
The following formula connects Lipschitz value with cocoercive value. Note that we follow the convention that
$\inf \varnothing =+\infty $
,
$(+\infty )^{-1}=0$
and
$0^{-1}=+\infty $
.
Lemma 6.9
$\ell (A) \leqslant [c(A)]^{-1}$
.
Proof Suppose
$c(A) \in (0,+\infty )$
. Then, A is
$c(A)$
-cocoercive and, therefore,
$[c(A)]^{-1}$
-Lipschitz on X by the Cauchy–Schwarz inequality. Thus,
$\ell (A) \leqslant [c(A)]^{-1}$
.
Suppose
$c(A)=+\infty $
. It follows from Proposition 6.1(iii) that A is a constant operator. Thus,
$\ell (A)=0=[c(A)]^{-1}$
.
Suppose
$c(A)=0$
. Then,
$\ell (A) \leqslant +\infty =[c(A)]^{-1}$
.
Fact 6.10 [Reference Bauschke and Combettes6, Proposition 17.31]
Let f be convex and proper on X, and suppose that
$x \in \operatorname {int} \operatorname {dom} f$
. Then,
$f$ is Gâteaux differentiable at $x$ $\Leftrightarrow$ $\partial f(x)$ is a singleton,
in which case
$\partial f(x)=\{\nabla f(x)\}$
.
Proposition 6.11
$\ell (\partial f)=[c(\partial f)]^{-1}$
.
Proof Suppose
$c(\partial f) \in (0,+\infty )$
. Then,
$\partial f$
is single-valued with full domain. Thus,
$\partial f=\nabla f$
by Fact 6.10. Since
$\nabla f$
is
$c(\partial f)$
-cocoercive, by applying Fact 5.1, we have
$\ell (\nabla f)=[c(\nabla f)]^{-1}$
.
Suppose
$c(\partial f)=+\infty $
. Then,
$\partial f$
is a constant operator by Proposition 6.1. Thus,
$\ell (\partial f)=0=[c(\partial f)]^{-1}$
.
Suppose
$c(\partial f)=0$
. If
$\partial f$
is single-valued with full domain, then again by applying Facts 6.10 and 5.1, we have
$\partial f=\nabla f$
is not Lipschitz, thus
$\ell (\partial f)=+\infty =[c(\partial f)]^{-1}$
. If
$\partial f$
is not singe-valued, or not with full domain, then
$\ell (\partial f)=+\infty $
by the definition of Lipschitz value. Thus,
$\ell (\partial f)=+\infty =[c(\partial f)]^{-1}$
.
Now we are able to propose the following interesting formula for proximal operators.
Theorem 6.12 (Modulus of averagedness via Lipschitz value)
Let
$f \in \Gamma _0(X)$
. Then,
$$ \begin{align*}k\big(\mathrm{P}_{f}\big)=\frac{1}{2}\cdot\frac{\ell(\partial f)}{1+\ell(\partial f)}.\end{align*} $$
Proof By Fact 5.2,
$\partial f$
is maximally monotone. The result follows by letting
$A=\partial f$
in Proposition 6.6 and combining it with Proposition 6.11.
Using
$\ell (\alpha T)=\alpha \ell (T)$
for
$\alpha>0$
, we obtain the following result.
Corollary 6.13 Let
$f \in \Gamma _0(X)$
be L-smooth on X for some
$L>0$
and let
$\alpha>0$
. Then,
$$ \begin{align*}k\big(\mathrm{P}_{\alpha f}\big)=\frac{1}{2}\cdot\frac{\alpha\,\ell(\nabla f)}{1+\alpha\,\ell(\nabla f)}.\end{align*} $$
The following example illustrates our formulae in this section.
Example 6.14 Let C be a nonempty closed convex set in X and
$C \neq X$
. Consider the distance function of C defined by
$ d_C: X \rightarrow \mathbb{R}: x \mapsto \inf _{c \in C}\|x-c\|. $
Then, for any
$\alpha>0$
the following hold:
-
(i)
$k\left (\mathrm {P}_{\frac {\alpha }{2} d_C^2}\right )=\frac {1}{2} \frac {\alpha }{1+\alpha }$ .
-
(ii)
$c\left ({\operatorname {Id}}-P_C\right )=\ell \left ({\operatorname {Id}}-P_C\right )=1$ .
-
(iii)
$\mathrm {P}_{\frac {\alpha }{2}d_C^2}$ is a bi-Lipschitz homeomorphism of X.
Proof (i): By [Reference Beck11, Example 6.65],
$\mathrm {P}_{\frac {\alpha }{2} d_C^2}=\frac {1}{1+\alpha } {\operatorname {Id}}+\frac {\alpha }{1+\alpha } P_C$
. Thus, we have
$k\left (\mathrm {P}_{\frac {\alpha }{2} d_C^2}\right )=\frac {\alpha }{1+\alpha }k(P_C)=\frac {1}{2} \frac {\alpha }{1+\alpha }$
by Lemma 4.3 and Example 2.15.
(ii): We have
$c\left (\mathrm {Id}-P_C\right )=1$
by Proposition 6.3 and
$k(P_C)=1/2$
. On the other hand, since
$\frac {1}{2} d_C^2 \in \Gamma _0(X)$
and
$\nabla \frac {1}{2} d_C^2={\operatorname {Id}}-P_C$
(see, e.g., [Reference Bauschke and Combettes6, Corollary 12.31]), we have
$c\left (\mathrm {Id}-P_C\right )=\ell \left (\mathrm {Id}-P_C\right )=1$
by Proposition 6.11.
Consequently, Corollary 6.13 is verified by the results of (i) and (ii):
$$ \begin{align*}k\Big(\mathrm{P}_{\frac{\alpha}{2} d_C^2}\Big)=\frac{1}{2}\cdot\frac{\alpha\,\ell\big({\operatorname{Id}}-P_C\big)}{1+\alpha\,\ell\big({\operatorname{Id}}-P_C\big)}=\frac{1}{2}\,\frac{\alpha}{1+\alpha}.\end{align*} $$
(iii): By (i),
$k\left (\mathrm {P}_{\frac {\alpha }{2} d_C^2}\right )=\frac {1}{2} \frac {\alpha }{1+\alpha }<\frac {1}{2}$
, i.e.,
$\mathrm {P}_{\frac {\alpha }{2} d_C^2}$
is normally nonexpansive. The result follows by Theorem 2.12.
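Corollary 6.13 can also be tested on quadratics (our sketch): for $f=\frac{1}{2}\langle\cdot,Q\,\cdot\rangle$ with $Q$ symmetric positive semidefinite, $\nabla f=Q$, $\ell(\nabla f)=\lambda_{\max}(Q)$, and $\mathrm{P}_{\alpha f}=({\operatorname{Id}}+\alpha Q)^{-1}$; for a symmetric resolvent with eigenvalues $\lambda_i\in(0,1]$, characterization (1.2) gives $k=\max_i(1-\lambda_i)/2$.

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((3, 3))
Q = B @ B.T                               # f(x) = x^T Q x / 2, grad f = Q
L = np.linalg.eigvalsh(Q).max()           # Lipschitz value of grad f
alpha = 0.3
J = np.linalg.inv(np.eye(3) + alpha * Q)  # P_{alpha f} = (Id + alpha Q)^{-1}

lam = np.linalg.eigvalsh(J)               # J symmetric: k(J) = max (1 - lam)/2
print(((1 - lam) / 2).max())              # matches the formula below
print(0.5 * alpha * L / (1 + alpha * L))
```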
7 Bauschke, Bendit, & Moursi’s example generalized
The following example on the modulus of averagedness of
$P_{V}P_{U}$
extends [Reference Bauschke, Bendit and Moursi5, Example 3.5] in
$\mathbb {R}^2$
to a Hilbert space. Instead of using [Reference Bauschke, Bendit and Moursi5, Theorem 3.2], we provide a much simpler proof.
Example 7.1 Let
$\theta \in (0,\pi /2)$
. In the product Hilbert space
$\mathcal {H}=X\times X$
, define
$$ \begin{align*}U:=X\times\{0\}\quad\text{and}\quad V:=\big\{(y,(\tan\theta)\, y) \mid y\in X\big\}.\end{align*} $$
Then,
$$ \begin{align*}k(P_V P_U)=\frac{1+\cos\theta}{2+\cos\theta}. \tag{7.1}\end{align*} $$
Proof We have
$$ \begin{align*}P_U(x_1,x_2)=(x_1,0)\quad\text{and}\quad P_V(x_1,x_2)=\frac{1}{1+\tan^2\theta}\big(x_1+(\tan\theta)x_2,\ (\tan\theta)x_1+(\tan^2\theta)x_2\big),\end{align*} $$
so that
$$ \begin{align*}P_V P_U(x_1,x_2)=\big((\cos^2\theta)\,x_1,\ (\sin\theta\cos\theta)\,x_1\big).\end{align*} $$
Put
$T=P_VP_U$
. Then, T is k-averaged if and only if
$$ \begin{align*}(\forall x\in\mathcal{H})\quad \|Tx\|^2\leqslant(2k-1)\|x\|^2+2(1-k)\langle x, Tx\rangle, \tag{7.2}\end{align*} $$
by (1.2) and the linearity of T.
For
$x=(x_{1}, x_{2})$
with
$x_i\in X$
, we have
$$ \begin{align*}\|Tx\|^2=\|x_1\|^2\cos^2\theta,\quad \langle x,Tx\rangle=\|x_1\|^2\cos^2\theta+\langle x_1,x_2\rangle\sin\theta\cos\theta,\quad \|x\|^2=\|x_1\|^2+\|x_2\|^2.\end{align*} $$
Substitute above into (7.2) to obtain
$$ \begin{align*}\|x_1\|^2\cos^2\theta\leqslant(2k-1)\big(\|x_1\|^2+\|x_2\|^2\big)+2(1-k)\big(\|x_1\|^2\cos^2\theta+\langle x_1,x_2\rangle\sin\theta\cos\theta\big),\end{align*} $$
which can be simplified to
$$ \begin{align*}(1-2k)\tan^2\theta\,\|x_1\|^2-2(1-k)\tan\theta\,\langle x_1,x_2\rangle+(1-2k)(1+\tan^2\theta)\,\|x_2\|^2\leqslant 0. \tag{7.3}\end{align*} $$
When
$x_{2}=0$
, this gives
$(2k-1)(-\tan ^2\theta )\leqslant 0$
, so
$k\geqslant 1/2$
. If
$k=1/2$
, this gives
$$ \begin{align*}(\forall x_1\in X)(\forall x_2\in X)\quad \langle x_1,x_2\rangle\geqslant 0,\end{align*} $$
which is impossible. Thus,
$k>1/2$
. Dividing (7.3) by
$\|x_{2}\|^2$
and applying the Cauchy–Schwarz inequality (the worst case being $\langle x_1,x_2\rangle=-\|x_1\|\|x_2\|$), we have
$$ \begin{align*}(1-2k)\tan^2\theta\,\frac{\|x_1\|^2}{\|x_2\|^2}+2(1-k)\tan\theta\,\frac{\|x_1\|}{\|x_2\|}+(1-2k)(1+\tan^2\theta)\leqslant 0. \tag{7.4}\end{align*} $$
Substituting
$t=\|x_{1}\|/\|x_{2}\|$
into (7.4) yields
$$ \begin{align*}(\forall t\geqslant 0)\quad (1-2k)\tan^2\theta\,t^2+2(1-k)\tan\theta\,t+(1-2k)(1+\tan^2\theta)\leqslant 0,\end{align*} $$
which happens if and only if
$$ \begin{align*}4(1-k)^2\tan^2\theta-4(2k-1)^2\tan^2\theta\,(1+\tan^2\theta)\leqslant 0,\end{align*} $$
i.e.,
$(1-k)^2\leqslant (2k-1)^2(1+\tan ^2\theta )$
. Taking square roots of both sides, we have
$1-k\leqslant (2k-1)/\cos \theta $
, so that
$k\geqslant (1+\cos \theta )/(2+\cos \theta )$
. Hence,
$$ \begin{align*}k(P_V P_U)=\frac{1+\cos\theta}{2+\cos\theta}.\end{align*} $$
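A sampling check of (7.1) for $X=\mathbb{R}$ (our sketch, using the same (1.2)-based estimator for linear maps as before):

```python
import numpy as np

theta = 0.8                                 # any angle in (0, pi/2)
c, s = np.cos(theta), np.sin(theta)
T = np.array([[c * c, 0.0], [s * c, 0.0]])  # matrix of P_V P_U on R x R

rng = np.random.default_rng(2)
X = rng.standard_normal((200_000, 2))
R = X - X @ T.T                             # (Id - T)x
k = np.max(np.einsum('ij,ij->i', R, R) / (2 * np.einsum('ij,ij->i', X, R)))
print(k, (1 + c) / (2 + c))                 # both ~0.6292 for theta = 0.8
```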
Remark 7.2 Let
$U, V$
be two closed subspaces of
$\mathcal {H}$
. Recall that while the cosine of Dixmier angle between
$U, V$
is defined by
$$ \begin{align*}c_0(U,V):=\sup\big\{\,|\langle u,v\rangle| \;\big|\; u\in U,\ v\in V,\ \|u\|\leqslant 1,\ \|v\|\leqslant 1\,\big\},\end{align*} $$
the cosine of the Friedrichs angle between
$U, V$
is defined by
$$ \begin{align*}c_F(U,V):=\sup\big\{\,|\langle u,v\rangle| \;\big|\; u\in U\cap(U\cap V)^{\perp},\ v\in V\cap(U\cap V)^{\perp},\ \|u\|\leqslant 1,\ \|v\|\leqslant 1\,\big\}.\end{align*} $$
For more details on the angle between subspaces, see [Reference Bauschke, Bendit and Moursi5, Reference Deutsch16]. With
$U=X\times \{0\},\quad V=\big \{{(y,(\tan \theta ) y)} \mid {y\in X}\big \}$
given in Example 7.1, for
$\theta \in (0,\pi /2)$
, we have
$U\cap V=\{0\}$
so that
$(U\cap V)^{\perp }=\mathcal {H}=X\times X$
. Then,
$$ \begin{align*}c_0(U,V)=\sup\big\{\,|\langle x,y\rangle| \;\big|\; \|x\|\leqslant 1,\ \|y\|^2(1+\tan^2\theta)\leqslant 1\,\big\}=\sup\big\{\,\|y\| \;\big|\; \|y\|\leqslant\cos\theta\,\big\}=\cos\theta\end{align*} $$
and
$$ \begin{align*}c_F(U,V)=c_0(U,V)=\cos\theta.\end{align*} $$
Hence, both the Dixmier and Friedrichs angles between U and V are exactly
$\theta $
.