ON THE CONVERGENCE RATE OF THE KRASNOSEL'SKIĬ–MANN ITERATION

The Krasnosel'skiĭ–Mann (KM) iteration is a widely used method for solving fixed point problems. This paper investigates the convergence rate of the KM iteration. We first establish a new convergence rate for the KM iteration which improves the known big-O rate to little-o without any additional restrictions. The proof relies on the connection between the KM iteration and a useful technique on the convergence rate of summable sequences. We then apply the result to derive new convergence rates for the proximal point algorithm and the Douglas–Rachford method.


Introduction
Let $H$ be a real Hilbert space endowed with the inner product $\langle \cdot, \cdot \rangle$ and norm $\|\cdot\|$, and consider the following fixed point problem:

$$\text{find } x \in H \text{ such that } T(x) = x, \tag{1.1}$$

where $T$ is a nonexpansive mapping on $H$. Henceforth, the set of fixed points, $\mathrm{Fix}(T)$, of $T$ is always assumed to be nonempty. An iterative procedure for solving (1.1) is the Krasnosel'skiĭ–Mann (KM) iteration, which was first proposed in [14, 17]: for any initial point $x_0 \in H$,

$$x_{k+1} = (1 - \alpha_k) x_k + \alpha_k T(x_k) \quad (k \in \mathbb{N}), \tag{1.2}$$

where $\{\alpha_k\} \subset [0, 1]$ is a sequence of relaxation parameters. To simplify the notation, we let $\sigma_k := \sum_{j=0}^{k} \alpha_j (1 - \alpha_j)$ $(k \in \mathbb{N})$. The KM iteration can be specialised to the proximal point algorithm [18, 21], the Douglas–Rachford method [8, 16], the alternating direction method of multipliers [10, 16] and a three-operator splitting [6]. The convergence of (1.2) is well studied (see [1, 20]). In particular, under the assumption that $\lim_{k\to\infty} \sigma_k = \infty$, the sequence generated by (1.2) converges weakly to a point in $\mathrm{Fix}(T)$ [20, Theorem 2].
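As an illustration of the scheme (1.2), the following sketch runs the KM iteration on a hypothetical concrete example (not from the paper): a rotation of the plane, which is nonexpansive with the origin as its unique fixed point.

```python
import numpy as np

# A minimal numerical sketch of the KM iteration (1.2), assuming a
# concrete nonexpansive map: rotation of the plane by 90 degrees,
# whose unique fixed point is the origin.
theta = np.pi / 2
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
T = lambda x: Q @ x

def km(T, x0, alphas):
    """Iterate x_{k+1} = (1 - a_k) x_k + a_k T(x_k)."""
    x = x0
    for a in alphas:
        x = (1 - a) * x + a * T(x)
    return x

x_final = km(T, np.array([1.0, 0.0]), [0.5] * 200)
residual = np.linalg.norm(x_final - T(x_final))   # ||(I - T)(x_k)||
```

Note that the unrelaxed map $T$ itself does not converge here (it rotates forever), while the averaged KM step does, which is precisely why the relaxation parameters $\alpha_k$ are used.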
In this paper, we focus on analysing the convergence rate of $\{x_k\}$. Throughout, we use the quantity

$$\|(I - T)(x_k)\| \tag{1.3}$$

as a measure of the convergence rate, since $(I - T)(x) = 0$ if and only if $T(x) = x$, and the property $\lim_{k\to\infty} \|(I - T)(x_k)\| = 0$ always holds when $\mathrm{Fix}(T) \neq \emptyset$. Recently, Cominetti et al. [3] showed that (1.3) converges to zero at a rate of $O(1/\sqrt{\sigma_k})$ (big-O) when $\lim_{k\to\infty} \sigma_k = \infty$. Similar big-O results were also considered in [15]. Little-o rates of convergence for (1.3) were established by Davis and Yin for constant relaxation parameters [5, Theorem 1]. However, it was not clear whether the big-O rate in [3] can be improved to little-o.
The purpose of this paper is to show that (1.3) converges to zero at a rate of $o(1/\sqrt{\sigma_k})$ when $\lim_{k\to\infty} \sigma_k = \infty$. To achieve this goal, we consider a useful technique on the convergence rates of summable sequences which appeared in [7, Lemma 3.2]. We show that this technique can be applied to the KM iteration and establish that $\|(I - T)(x_k)\| = o(1/\sqrt{\sigma_k})$. This result improves the existing convergence rate of [3] without any additional restrictions.
The KM iteration generalises several other methods. In particular, we apply our result to analyse the proximal point algorithm and the Douglas–Rachford method. Recently, some results on convergence rates for these methods were established in [4, 11] by using constant relaxation parameters. We establish improved convergence rates for the proximal point algorithm and the Douglas–Rachford method under mild assumptions.
The rest of this paper is organised as follows. In Section 2, some preliminaries are presented. In Section 3, we improve the convergence rate of the KM iteration. Then, we discuss convergence rates for the proximal point algorithm and the Douglas–Rachford method in Sections 4 and 5, respectively.

Preliminaries
The following notation will be used in this paper: $\mathbb{R}$ denotes the set of real numbers; $\mathbb{N}$ denotes the set of nonnegative integers; $H$ denotes a real Hilbert space; for any $x, y \in H$, $\langle x, y \rangle$ denotes the inner product of $x$ and $y$ and, for any $z \in H$, $\|z\|$ denotes the norm of $z$, that is, $\|z\| = \sqrt{\langle z, z \rangle}$; for any $C \subset H$ and mapping $U : C \to C$, $\mathrm{Fix}(U) := \{x \in C : U(x) = x\}$ denotes the set of fixed points of $U$; for any set-valued mapping $A : H \to 2^H$, $\mathrm{gra}(A) := \{(x, u) \in H \times H : u \in A(x)\}$ denotes the graph of $A$; and the set of zero points of $A$ is denoted by $A^{-1}(0) := \{x \in H : 0 \in A(x)\}$.

A mapping $U : C \to C$ is said to be:
- nonexpansive if $\|U(x) - U(y)\| \le \|x - y\|$ for all $x, y \in C$;
- firmly nonexpansive if $\|U(x) - U(y)\|^2 \le \langle x - y, U(x) - U(y) \rangle$ for all $x, y \in C$.

A set-valued mapping $A : H \to 2^H$ is said to be:
- monotone if $\langle x - y, u - v \rangle \ge 0$ for all $(x, u), (y, v) \in \mathrm{gra}(A)$;
- maximal monotone if it is monotone and its graph is not properly contained in the graph of any other monotone mapping on $H$.

The maximal monotonicity of $A$ implies that $R(I + rA) = H$ for all $r > 0$. To simplify the notation in this paper, we let $r := 1$. Then we can define the resolvent $J_A$ of $A$ by $J_A(x) := (I + A)^{-1}(x)$ for all $x \in H$. The reflected resolvent $R_A$ of $J_A$ is defined by $R_A := 2J_A - I$ (see [1, 23]).
Let $A : H \to 2^H$ and $B : H \to 2^H$ be maximal monotone set-valued mappings. Then $J_A : H \to H$ is firmly nonexpansive, $R_A : H \to H$ is nonexpansive, $\mathrm{Fix}(R_A) = \mathrm{Fix}(J_A) = A^{-1}(0)$ and

$$J_A = \tfrac{1}{2}(I + R_A). \tag{2.3}$$

See [1, 2] for more details.
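These properties can be checked numerically on a small hypothetical example (not from the paper): the scalar operator $A(x) = 2x$, which is maximal monotone on $\mathbb{R}$ and gives $J_A(x) = x/3$ and $R_A(x) = -x/3$.

```python
# Numerical check of the resolvent properties for the hypothetical
# maximal monotone operator A(x) = 2x on the real line.
lam = 2.0
J = lambda x: x / (1 + lam)      # J_A = (I + A)^{-1}; here x/3
R = lambda x: 2 * J(x) - x       # reflected resolvent R_A = 2 J_A - I

x, y = 3.0, -1.5
# firm nonexpansiveness: |J(x) - J(y)|^2 <= <x - y, J(x) - J(y)>
firm_ok = (J(x) - J(y)) ** 2 <= (x - y) * (J(x) - J(y)) + 1e-12
# nonexpansiveness of R_A: |R(x) - R(y)| <= |x - y|
nonexp_ok = abs(R(x) - R(y)) <= abs(x - y) + 1e-12
# the averaging identity J_A = (I + R_A)/2
identity_ok = abs(J(x) - 0.5 * (x + R(x))) < 1e-12
```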
The following result will be the key to deducing convergence rates for the KM iteration.
Lemma 2.1 [7, Lemma 3.2]. Let $\{b_k\}$, $\{c_k\}$ be sequences of positive numbers. Assume that the sequence $\{b_k\}$ is nonsummable, the sequence $\{c_k\}$ is decreasing and $\sum_{k=0}^{\infty} b_k c_k < \infty$. Then

$$c_k = o\left(\frac{1}{\sum_{j=0}^{k} b_j}\right),$$

where the o-notation means that $s_k = o(1/t_k)$ if and only if $\lim_{k\to\infty} s_k t_k = 0$.

Remark 2.2. Dong [7] used Lemma 2.1 to analyse the proximal point algorithm.
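The mechanism of Lemma 2.1 can be observed numerically with concrete choices of sequences (these choices are an illustrative assumption, not from [7]): $b_k = 1/(k+1)$ is nonsummable, $c_k = 1/(k+1)^2$ is decreasing, and $\sum b_k c_k = \sum 1/(k+1)^3$ is finite.

```python
# Numerical illustration of Lemma 2.1 with the hypothetical choices
# b_k = 1/(k+1) (nonsummable) and c_k = 1/(k+1)^2 (decreasing, with
# sum b_k c_k = sum 1/(k+1)^3 < infinity).  The lemma then predicts
# c_k = o(1 / sum_{j<=k} b_j), i.e. c_k * (partial sum of b) -> 0.
partial_b = 0.0
ratios = []
for k in range(200000):
    partial_b += 1.0 / (k + 1)          # sum_{j<=k} b_j (harmonic sum)
    c_k = 1.0 / (k + 1) ** 2
    ratios.append(c_k * partial_b)       # should tend to zero
```

Here the product behaves like $\log(k)/k^2$, which indeed vanishes, matching the conclusion of the lemma.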

Krasnosel'skiȋ-Mann iteration
In this section, we study the convergence rate of the KM iteration in a Hilbert space. Using Lemma 2.1, we prove the following result.

Theorem 3.1. Let $C$ be a nonempty closed convex subset of $H$, let $T : C \to C$ be a nonexpansive mapping such that $\mathrm{Fix}(T) \neq \emptyset$ and let $\{x_k\}$ be the sequence generated by (1.2), where $x_0 \in C$ and $\{\alpha_k\}$ is a sequence in $[0, 1]$ such that $\sigma_k := \sum_{j=0}^{k} \alpha_j (1 - \alpha_j)$ for $k \in \mathbb{N}$ and $\lim_{k\to\infty} \sigma_k = \infty$. Then the convergence rate estimate

$$\|(I - T)(x_k)\| = o\left(\frac{1}{\sqrt{\sigma_k}}\right)$$

holds.

Proof. Let $u \in \mathrm{Fix}(T)$. By virtue of [1, Theorem 5.14], the following properties hold.
(i) For any $k \in \mathbb{N}$, $\|(I - T)(x_{k+1})\| \le \|(I - T)(x_k)\|$.
(ii) For any $l \in \mathbb{N}$, $\sum_{j=0}^{l} \alpha_j (1 - \alpha_j) \|(I - T)(x_j)\|^2 \le \|x_0 - u\|^2$.

By taking $l \to \infty$, we see that $\sum_{j=0}^{\infty} \alpha_j (1 - \alpha_j) \|(I - T)(x_j)\|^2 < \infty$. Since $\lim_{k\to\infty} \sigma_k = \infty$, the assumptions of Lemma 2.1 hold with $b_k := \alpha_k (1 - \alpha_k)$ and $c_k := \|(I - T)(x_k)\|^2$, and hence $\|(I - T)(x_k)\|^2 = o(1/\sigma_k)$. We can therefore conclude that $\|(I - T)(x_k)\| = o(1/\sqrt{\sigma_k})$. $\square$

Remark 3.2. A little-o rate implies the corresponding big-O rate, but the reverse implication does not hold. The little-o rate of [5, Theorem 1], obtained for constant relaxation parameters, follows from Theorem 3.1.
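The little-o behaviour in Theorem 3.1 can be observed numerically. The example below is a hypothetical one chosen for illustration: $T$ is a rotation by $\pi/3$ (nonexpansive, with $\mathrm{Fix}(T) = \{0\}$) and $\alpha_k := 1/2$, so the scaled residual $\sqrt{\sigma_k}\,\|(I - T)(x_k)\|$ should tend to zero.

```python
import numpy as np

# Observing the little-o rate of Theorem 3.1 on a hypothetical example:
# T is a rotation by pi/3 (nonexpansive, Fix(T) = {0}), alpha_k = 1/2.
theta = np.pi / 3
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
T = lambda x: Q @ x

x = np.array([1.0, 0.0])
alpha, sigma = 0.5, 0.0
scaled = []
for k in range(300):
    sigma += alpha * (1 - alpha)                      # sigma_k
    scaled.append(np.sqrt(sigma) * np.linalg.norm(x - T(x)))
    x = (1 - alpha) * x + alpha * T(x)                # KM step (1.2)
```

In this example the residual in fact decays linearly, so the scaled sequence vanishes much faster than Theorem 3.1 requires; the theorem guarantees only the limit, not a geometric rate.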

Proximal point algorithm
We consider the convergence rates for the proximal point algorithm.Theorem 3.1 can be applied directly to derive new convergence rates.
The proximal point algorithm is an algorithm for solving the inclusion problem $0 \in A(u)$, where $A$ is a maximal monotone set-valued mapping on $H$. This algorithm was first introduced by Martinet [18] and further developed by Rockafellar [21]. It is known that the sequence generated by the proximal point algorithm converges weakly to a point in $A^{-1}(0)$ under mild assumptions in infinite-dimensional Hilbert spaces.
The framework of the generalised proximal point algorithm for a maximal monotone set-valued mapping $A$ is as follows: given $x_0 \in H$, set

$$x_{k+1} = (1 - \beta_k) x_k + \beta_k J_A(x_k) \quad (k \in \mathbb{N}), \tag{4.1}$$

where $\{\beta_k\} \subset [0, 2]$ is a sequence of relaxation parameters and $J_A$ is the resolvent of $A$.
The convergence of (4.1) under some conditions has been discussed in [2, 9, 12, 13, 19, 21]. Using the definition of $R_A$, we can write (4.1) equivalently as

$$x_{k+1} = \left(1 - \frac{\beta_k}{2}\right) x_k + \frac{\beta_k}{2} R_A(x_k) \quad (k \in \mathbb{N}). \tag{4.2}$$

To simplify the notation, we let $\alpha_k := \beta_k/2$, so that $\sigma_k = \frac{1}{4} \sum_{j=0}^{k} \beta_j (2 - \beta_j)$. Since $R_A$ is nonexpansive, (4.2) can be viewed as the KM iteration and $\{x_k\}$ converges weakly to a point in $\mathrm{Fix}(R_A)$ $(= A^{-1}(0))$ when $\lim_{k\to\infty} \sigma_k = \infty$ and $\mathrm{Fix}(R_A) \neq \emptyset$.
Remark 4.1. Since $J_A$ is (firmly) nonexpansive, (4.1) can also be viewed as the KM iteration. However, in order to apply the KM iteration directly to (4.1), it is necessary to restrict the relaxation parameters to $\{\beta_k\} \subset [0, 1]$.

Using Theorem 3.1, we obtain new estimates of convergence rates for (4.1).
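A minimal sketch of the relaxed step (4.1), assuming the hypothetical operator $A(u) = u - 5$ (not from the paper), whose resolvent is $J_A(x) = (x + 5)/2$ and whose unique zero is $u = 5$:

```python
# Sketch of the generalised proximal point algorithm (4.1) for the
# hypothetical maximal monotone operator A(u) = u - 5, with resolvent
# J_A(x) = (I + A)^{-1}(x) = (x + 5)/2 and A^{-1}(0) = {5}.
J = lambda x: (x + 5.0) / 2.0

x = 0.0
beta = 1.5                              # relaxation parameter in [0, 2]
for k in range(50):
    x = (1 - beta) * x + beta * J(x)    # step (4.1)
```

Note that $\beta = 1.5$ lies outside $[0, 1]$, so this run uses the over-relaxed regime that only the reformulation (4.2) via $R_A$ covers, in line with Remark 4.1.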

Douglas-Rachford method
We next consider the Douglas–Rachford (DR) method. Theorem 3.1 can also be applied to improve the convergence rate of the DR method.
The DR method is a fundamental algorithm for solving the inclusion problem $0 \in (A + B)(u)$, where $A$ and $B$ are maximal monotone set-valued mappings on $H$. This method was first introduced by Douglas and Rachford [8] and further developed by Lions and Mercier [16] and Eckstein and Bertsekas [9].
The framework of the DR method for maximal monotone set-valued mappings $A$ and $B$ is as follows: given $x_0 \in H$, set

$$x_{k+1} = x_k + \gamma_k \bigl( J_B (2J_A - I)(x_k) - J_A(x_k) \bigr) \quad (k \in \mathbb{N}), \tag{5.1}$$

where $\{\gamma_k\} \subset [0, 2]$ is a sequence of relaxation parameters. Under appropriate assumptions, the sequence generated by (5.1) converges weakly to a point $x^* \in H$ such that $x^* \in \mathrm{Fix}(R_B R_A)$ and $J_A(x^*) \in (A + B)^{-1}(0)$ (see [1, 2, 9, 16]). Using (2.3), we can write (5.1) in the equivalent form

$$x_{k+1} = \left(1 - \frac{\gamma_k}{2}\right) x_k + \frac{\gamma_k}{2} R_B R_A(x_k) \quad (k \in \mathbb{N}). \tag{5.2}$$

To simplify the notation, we let $\alpha_k := \gamma_k/2$, so that $\sigma_k = \frac{1}{4} \sum_{j=0}^{k} \gamma_j (2 - \gamma_j)$. Since $R_B R_A$ is nonexpansive, (5.2) can be viewed as the KM iteration and $\{x_k\}$ converges weakly to a point in $\mathrm{Fix}(R_B R_A)$ when $\lim_{k\to\infty} \sigma_k = \infty$ and $\mathrm{Fix}(R_B R_A) \neq \emptyset$. Note that it is not guaranteed that the sequence $\{x_k\}$ generated by (5.2) converges weakly to a point in $(A + B)^{-1}(0)$. Svaiter [22] showed that the shadow sequence $\{J_A(x_k)\}$ converges weakly to a point in $(A + B)^{-1}(0)$ when $\gamma_k := 1$ $(k \in \mathbb{N})$ and $(A + B)^{-1}(0) \neq \emptyset$. By using the demiclosedness principle, Bauschke and Combettes [1, Proposition 25.17] showed weak convergence of $\{J_A(x_k)\}$ when $\sum_{j=0}^{\infty} \gamma_j (2 - \gamma_j) = \infty$. On the other hand, the worst-case convergence rate of $\{J_A(x_k)\}$ has recently been analysed. He and Yuan [11, Theorem 3.1] showed that $\|J_A(x_{k+1}) - J_A(x_k)\|$ converges to zero at a rate of $O(1/\sqrt{k})$ when $H$ is finite dimensional, $\gamma \in (0, 2)$ and $\gamma_k := \gamma$ $(k \in \mathbb{N})$.
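A minimal sketch of the DR step with $\gamma_k := 1$, assuming the hypothetical operators $A(u) = u - 1$ and $B(u) = u + 1$ (not from the paper), so that $(A + B)(u) = 2u$ has its unique zero at $u = 0$; the shadow sequence $J_A(x_k)$ should approach that zero.

```python
# Sketch of the Douglas-Rachford method with gamma_k = 1 for the
# hypothetical operators A(u) = u - 1 and B(u) = u + 1, for which
# (A + B)^{-1}(0) = {0}.  Resolvents: J_A(x) = (x+1)/2, J_B(x) = (x-1)/2.
JA = lambda x: (x + 1.0) / 2.0
JB = lambda x: (x - 1.0) / 2.0

x = 3.0
for k in range(60):
    x = x + 1.0 * (JB(2 * JA(x) - x) - JA(x))   # DR step, gamma_k = 1

shadow = JA(x)   # shadow sequence J_A(x_k)
```

In this run $x_k$ itself converges to $-1$, a point of $\mathrm{Fix}(R_B R_A)$ that is not a zero of $A + B$, while the shadow $J_A(x_k)$ converges to the zero $0$; this illustrates why the shadow sequence, rather than $\{x_k\}$, is the object of interest.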