DOI 10.1007/s11228-016-0376-5

On Proximal Subgradient Splitting Method for Minimizing the sum of two Nonsmooth Convex Functions

José Yunier Bello Cruz¹ (yunier.bello@gmail.com)

¹ Department of Mathematical Sciences, Northern Illinois University, DeKalb, IL 60115, USA

Received: 3 May 2015 / Accepted: 16 May 2016 / Published online: 30 May 2016

© Springer Science+Business Media Dordrecht 2016

Abstract In this paper we present a variant of the proximal forward-backward splitting iteration for solving nonsmooth optimization problems in Hilbert spaces, when the objective function is the sum of two nondifferentiable convex functions. The proposed iteration, which will be called the Proximal Subgradient Splitting Method, extends the classical subgradient iteration to important classes of problems, exploiting the additive structure of the objective function. The weak convergence of the generated sequence is established using different stepsizes and under suitable assumptions. Moreover, we analyze the complexity of the iterates.

Keywords Convex problems · Nonsmooth optimization problems · Proximal forward-backward splitting iteration · Subgradient method

Mathematics Subject Classification (2010) 65K05 · 90C25 · 90C30

1 Introduction

The purpose of this paper is to study the convergence properties of a variant of the proximal forward-backward splitting method for solving the following optimization problem:

$$\min\ f(x)+g(x)\quad\text{s.t.}\quad x\in\mathcal{H}, \qquad (1)$$

where $\mathcal{H}$ is a nontrivial real Hilbert space, and $f:\mathcal{H}\to\overline{\mathbb{R}}:=\mathbb{R}\cup\{+\infty\}$ and $g:\mathcal{H}\to\overline{\mathbb{R}}$ are two proper lower semicontinuous and convex functions. We are interested in the case where both functions $f$ and $g$ are nondifferentiable, and when the domain of $f$ contains the domain of $g$. The solution set of this problem will be denoted by $S$, which is a closed and convex subset of the domain of $g$.



Problem (1) has recently received much attention from the optimization community due to its broad applications to several different areas such as control, signal processing, system identification, machine learning and image restoration; see, for instance, [18,19,24,32] and the references therein.

A special case of problem (1) is the nonsmooth constrained optimization problem, obtained by taking $g=\delta_C$, where $\delta_C$ is the indicator function of a nonempty closed and convex set $C$ in $\mathcal{H}$, defined by $\delta_C(y):=0$ if $y\in C$ and $+\infty$ otherwise. Then, problem (1) reduces to the constrained minimization problem

$$\min\ f(x)\quad\text{s.t.}\quad x\in C. \qquad (2)$$

Another important case of problem (1), which has attracted much interest in signal denoising and data mining, is the following optimization problem with $\ell_1$-regularization

$$\min\ f(x)+\lambda\|x\|_1\quad\text{s.t.}\quad x\in\mathcal{H}, \qquad (3)$$
where $\lambda>0$ and the norm $\|\cdot\|_1$ is used to induce sparsity in the solutions. Moreover, problem (3) covers the important and well-studied $\ell_1$-regularized least squares minimization problem, when $\mathcal{H}=\mathbb{R}^n$ and $f(x)=\|Ax-b\|_2^2$, where $A\in\mathbb{R}^{m\times n}$, $m\ll n$, and $b\in\mathbb{R}^m$, which is a convex approximation of the very famous $\ell_0$-minimization problem; see [12]. Recently, this problem became popular in signal processing and statistical inference; see, for instance, [23,43].

We focus here on the so-called proximal forward-backward splitting iteration [32], which contains a forward gradient step on $f$ (an explicit step) followed by a backward proximal step on $g$ (an implicit step). The main idea of our approach consists of replacing, in the forward step of the proximal forward-backward splitting iteration, the gradient of $f$ by a subgradient of $f$ (note that here $f$ is assumed nondifferentiable in general). In the particular case that $g$ is the indicator function, the proposed iteration reduces to the classical projected subgradient iteration.

To describe and motivate our iteration, we first recall the definition of the so-called proximal operator $\mathrm{prox}_g:\mathcal{H}\to\mathcal{H}$ associated with a proper lower semicontinuous convex function $g$, where $\mathrm{prox}_g(z)$, for $z\in\mathcal{H}$, is the unique solution of the following strongly convex optimization problem

$$\min\ g(y)+\frac{1}{2}\|y-z\|^2\quad\text{s.t.}\quad y\in\mathcal{H}. \qquad (4)$$

Note that the norm $\|\cdot\|$ is induced by the inner product of $\mathcal{H}$, i.e., $\|x\|:=\sqrt{\langle x,x\rangle}$ for all $x\in\mathcal{H}$. The proximal operator $\mathrm{prox}_g$ is well-defined and has many attractive properties, e.g., it is continuous and firmly nonexpansive, i.e., for all $x,y\in\mathcal{H}$, $\|\mathrm{prox}_g(x)-\mathrm{prox}_g(y)\|^2\le\|x-y\|^2-\|[x-\mathrm{prox}_g(x)]-[y-\mathrm{prox}_g(y)]\|^2$. This nice property can be used to construct algorithms to solve optimization problems [39]; for other properties and algebraic rules see [3,18,19]. If $g=\delta_C$ is the indicator function, the orthogonal projection onto $C$, $P_C(x):=\{y\in C:\|x-y\|=\mathrm{dist}(x,C)\}$, coincides with $\mathrm{prox}_{\delta_C}(x)$ for all $x\in\mathcal{H}$ [2]. For an exhaustive discussion about the evaluation of the proximity operator of a wide variety of functions see Section 6 of [32]. Now, let us recall the definition of the subdifferential operator $\partial g:\mathcal{H}\rightrightarrows\mathcal{H}$ by $\partial g(x):=\{w\in\mathcal{H}: g(y)\ge g(x)+\langle w,\,y-x\rangle,\ \forall y\in\mathcal{H}\}$. We also recall the relation of the proximal operator $\mathrm{prox}_{\alpha g}$ with the subdifferential operator $\partial g$, i.e., $\mathrm{prox}_{\alpha g}=(\mathrm{Id}+\alpha\partial g)^{-1}$, and, as a direct consequence of the first-order optimality condition of (4), we have the following useful inclusion:

$$\frac{z-\mathrm{prox}_{\alpha g}(z)}{\alpha}\in\partial g\big(\mathrm{prox}_{\alpha g}(z)\big), \qquad (5)$$
for any $z\in\mathcal{H}$ and $\alpha>0$.
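For concreteness, here is a minimal numerical sketch (assuming $\mathcal{H}=\mathbb{R}^n$; the function names are illustrative, not from the paper) of two proximal operators relevant to problems (2) and (3): soft-thresholding, which realizes $\mathrm{prox}_{t\|\cdot\|_1}$, and the projection onto a Euclidean ball, which realizes $\mathrm{prox}_{\delta_C}$ for $C=B[0;r]$.

```python
import numpy as np

def prox_l1(z, t):
    """Soft-thresholding: the proximal operator of t*||.||_1 at z (t > 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def proj_ball(z, r):
    """Projection onto the closed Euclidean ball B[0; r]; equals the prox of its indicator."""
    nz = np.linalg.norm(z)
    return z if nz <= r else (r / nz) * z

if __name__ == "__main__":
    z = np.array([1.5, -0.2, 0.7])
    print(prox_l1(z, 0.5))     # componentwise shrinkage toward zero
    print(proj_ball(z, 1.0))   # rescaling onto the unit ball
```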

The iteration proposed in this paper, called Proximal Subgradient Splitting Method, is motivated by the well-known fact that $x^*\in S$ if and only if there exists $u^*\in\partial f(x^*)$ such that $x^*=\mathrm{prox}_{\alpha g}(x^*-\alpha u^*)$. Thus, the iteration generalizes the proximal forward-backward splitting iteration for the differentiable case, as a fixed point iteration of the above equation, which is defined as follows: starting at $x^0$ belonging to the domain of $g$, set

$$x^{k+1}=\mathrm{prox}_{\alpha_k g}\big(x^k-\alpha_k u^k\big), \qquad (6)$$

where $u^k\in\partial f(x^k)$ and the stepsize $\alpha_k$ is positive for all $k\in\mathbb{N}$. Iteration (6) recovers the classical subgradient iteration [38], when $g\equiv0$, and the proximal point iteration [39], when $f\equiv0$. Moreover, it covers important situations in which $f$ is nondifferentiable, and it can also be seen as a forward-backward Euler discretization of the subgradient flow differential inclusion

$$\dot x(t)\in-\big[\partial f(x(t))+\partial g(x(t))\big],$$
with variable $x:\mathbb{R}_+\to\mathcal{H}$; see [32]. Actually, if the derivative on the left side is replaced by the divided difference $(x^{k+1}-x^k)/\alpha_k$, then the discretization obtained is $(x^k-x^{k+1})/\alpha_k\in\partial f(x^k)+\partial g(x^{k+1})$, which is the proximal subgradient iteration (6).

The nondifferentiability of the function $f$ has a direct impact on the computational effort, and the importance of such problems, when $f$ is nonsmooth, is underlined because they occur frequently in applications. Nondifferentiability arises, for instance, in the problem of minimizing the total variation of a signal over a convex set, in the problem of minimizing the sum of two set-distance functions, in problems involving maxima of convex functions, in Dantzig selector-type problems, in the non-Gaussian image denoising problem, and in Tikhonov regularization problems with $L^1$ norms, among others; see, for instance, [4,13,17,26]. The iteration of the proximal subgradient splitting method, proposed in (6), can be applied in these important instances, extending the classical subgradient iteration to more general problems such as (3). In problem (1), $f$ is usually assumed to be differentiable, as in [35], which is not necessarily the case in this work. Moreover, the convergence of the iteration (6) to a solution of (1) has been established in the literature when the gradient of $f$ is globally Lipschitz continuous and the stepsizes $\alpha_k$, $k\in\mathbb{N}$, are chosen very small, i.e., for all $k$, $\alpha_k$ is less than some constant related to the Lipschitz constant of the gradient of $f$; see, for instance, [19]. Recently, when $f$ is continuously differentiable but the Lipschitz constant is not available, the steplengths can be chosen using backtracking procedures; see [6,10,32,35].

It is important to mention that the forward-backward iteration also finds applications in solving more general problems, like variational inequality and inclusion problems; see, for instance, [9,11,14,15,42] and the references therein. On the other hand, the standard convergence analysis of this iteration, for solving these general problems, requires at least a co-coercivity assumption on the operator, together with stepsizes lying within a suitable interval; see, for instance, Theorem 25.8 of [3]. Note that co-coercive operators are monotone and Lipschitz continuous, but the converse does not hold in general; see [44]. However, for gradients of lower semicontinuous, proper and convex functions, co-coercivity is equivalent to the global Lipschitz continuity assumption. This nice and surprising fact, which is strongly used in the convergence analysis of the proximal forward-backward method for problem (1) when $f$ is differentiable, is known as the Baillon-Haddad Theorem; see Corollary 18.16 of [3].

The main aim of this work is to remove the differentiability assumption on $f$ in the forward-backward splitting method, extending the classical projected subgradient method and containing, as a particular case, a new proximal subgradient iteration for more general problems.

This work is organized as follows. The next subsection provides our notation and assumptions, and some preliminary results that will be used in the remainder of this paper. The proximal subgradient splitting method and its weak convergence are analyzed, for different choices of stepsizes, in Section 2. Finally, Section 3 gives some concluding remarks.

1.1 Assumptions and Preliminaries

In this section, we present our assumptions, classical definitions and some results needed for the convergence analysis of the proposed method.

We start by recalling some definitions and notation used in this paper, which are standard and follow [3,32]. Throughout this paper, we write $p:=q$ to indicate that $p$ is defined to be equal to $q$. We write $\mathbb{N}$ for the nonnegative integers $\{0,1,2,\dots\}$ and recall that the extended real number system is $\overline{\mathbb{R}}:=\mathbb{R}\cup\{+\infty\}$. The closed ball centered at $x\in\mathcal{H}$ with radius $\gamma>0$ will be denoted by $B[x;\gamma]$, i.e., $B[x;\gamma]:=\{y\in\mathcal{H}:\|y-x\|\le\gamma\}$. The domain of any function $h:\mathcal{H}\to\overline{\mathbb{R}}$, denoted by $\mathrm{dom}(h)$, is defined as $\mathrm{dom}(h):=\{x\in\mathcal{H}: h(x)<+\infty\}$. The optimal value of problem (1) will be denoted by $s:=\inf\{(f+g)(x): x\in\mathcal{H}\}$, noting that when $S\ne\emptyset$, $s=\min\{(f+g)(x): x\in\mathcal{H}\}=(f+g)(x^*)$ for any $x^*\in S$. Finally, $\ell_1(\mathbb{N})$ denotes the set of summable sequences in $[0,+\infty)$.

Throughout this paper we assume the following:

A1. $\partial f$ is bounded on bounded sets of the domain of $g$, i.e., there exists $\zeta=\zeta(V)>0$ such that $\partial f(x)\subseteq B[0;\zeta]$ for all $x\in V$, where $V$ is any bounded and closed subset of $\mathrm{dom}(g)$.

A2. $\partial g$ has bounded elements on the domain of $g$, i.e., there exists $\rho\ge0$ such that $\partial g(x)\cap B[0;\rho]\ne\emptyset$ for all $x\in\mathrm{dom}(g)$.

In connection with Assumption A1, we recall that $\partial f$ is locally bounded on its open domain. In finite-dimensional spaces, this result implies that A1 always holds when $\mathrm{dom}(f)$ is open. A widely used sufficient condition for A1 is the Lipschitz continuity of $f$ on $\mathrm{dom}(g)$.

Furthermore, the boundedness of the subgradients is crucial for the convergence analysis of many classical subgradient methods in Hilbert spaces and it has been widely considered in the literature; see, for instance, [1,8,9,38].

Regarding Assumption A2, we emphasize that it holds trivially for important instances of problem (1), e.g., problems (2) and (3), because $\partial\delta_C(x)=N_C(x)$ and $\partial\|x\|_1=\{u\in\mathcal{H}:\|u\|_\infty\le1,\ \langle u,x\rangle=\|x\|_1\}$, respectively, or when $\mathrm{dom}(g)$ is a bounded set, or also when $\mathcal{H}$ is a finite-dimensional space. Note that Assumption A2 allows instances where $\partial g(x)$ is an unbounded set, as is the case in particular when $g$ is the indicator function. It is an existence condition, which is in general weaker than A1.
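As a small numerical illustration of Assumption A2 in the setting of problem (3) (assuming $\mathcal{H}=\mathbb{R}^n$; the helper name is hypothetical), the sign vector is a bounded element of $\partial\|x\|_1$, and the subgradient inequality can be checked directly:

```python
import numpy as np

def l1_subgradient(x):
    """Return sign(x), an element of the subdifferential of ||.||_1 at x
    (it satisfies ||u||_inf <= 1 and <u, x> = ||x||_1)."""
    return np.sign(x)

rng = np.random.default_rng(0)
x = rng.standard_normal(5)
u = l1_subgradient(x)
# Subgradient inequality: ||y||_1 >= ||x||_1 + <u, y - x> for every y.
for _ in range(1000):
    y = rng.standard_normal(5)
    assert np.linalg.norm(y, 1) >= np.linalg.norm(x, 1) + u @ (y - x) - 1e-12
print("sup-norm of u:", np.linalg.norm(u, np.inf))  # bounded by 1, uniformly in x
```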

Let us end this section by recalling the well-known concepts of quasi-Fejér and Fejér convergence.

Definition 1.1 Let $S$ be a nonempty subset of $\mathcal{H}$. A sequence $(x^k)_{k\in\mathbb{N}}$ in $\mathcal{H}$ is said to be quasi-Fejér convergent to $S$ if and only if for all $x\in S$ there exists a sequence $(\epsilon_k)_{k\in\mathbb{N}}$ in $\ell_1(\mathbb{N})$ such that $\|x^{k+1}-x\|^2\le\|x^k-x\|^2+\epsilon_k$ for all $k\in\mathbb{N}$. When $(\epsilon_k)_{k\in\mathbb{N}}$ is the null sequence, we say that $(x^k)_{k\in\mathbb{N}}$ is Fejér convergent to $S$.

This definition originates in [22] and has been elaborated further in [16]. In the following we present two well-known facts for quasi-Fejér convergent sequences.


Fact 1.1 If the sequence $(x^k)_{k\in\mathbb{N}}$ is quasi-Fejér convergent to $S$, then:

(a) The sequence $(x^k)_{k\in\mathbb{N}}$ is bounded.

(b) $(x^k)_{k\in\mathbb{N}}$ is weakly convergent if and only if all weak accumulation points of $(x^k)_{k\in\mathbb{N}}$ belong to $S$.

Proof Item (a) follows from Proposition 3.3(i) of [16], and Item (b) follows from Theorem 3.8 of [16].

2 The Proximal Subgradient Splitting Method

In this section we propose the proximal subgradient splitting method, extending the classical subgradient iteration. We prove that the sequence of points generated by the proposed method converges weakly to a solution of (1) under different strategies for choosing the stepsizes. Moreover, we present a complexity analysis for the generated sequence.

The method is formally stated as follows:

PSS Method. Initialization step: take $x^0\in\mathrm{dom}(g)$. Iterative step: given $x^k$, choose $u^k\in\partial f(x^k)$ and a stepsize $\alpha_k>0$, and compute $x^{k+1}$ by (6), i.e., $x^{k+1}=\mathrm{prox}_{\alpha_k g}(x^k-\alpha_k u^k)$. Stopping criterion: if $x^{k+1}=x^k$, then stop.

If PSS Method stops at step $k$, then $x^k=\mathrm{prox}_{\alpha_k g}(x^k-\alpha_k u^k)$ with $u^k\in\partial f(x^k)$, implying that $x^k$ is a solution of problem (1). Then, from now on, we assume that PSS Method generates an infinite sequence $(x^k)_{k\in\mathbb{N}}$. Moreover, it follows directly from (6) that the sequence $(x^k)_{k\in\mathbb{N}}$ belongs to $\mathrm{dom}(g)$.
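As an illustration only, the following minimal sketch implements iteration (6) in $\mathbb{R}^n$ on an assumed nonsmooth instance of problem (3), namely $f(x)=\|Ax-b\|_1$ and $g=\lambda\|\cdot\|_1$ (for which $A^\top\mathrm{sign}(Ax-b)\in\partial f(x)$), using the exogenous stepsizes analyzed in Section 2.1 below and the best-value bookkeeping of (9). The data and helper names are hypothetical, not from the paper.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t*||.||_1 (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def pss(A, b, lam, x0, iters=500):
    """Proximal subgradient splitting iteration (6) for
       min ||Ax - b||_1 + lam*||x||_1, a nonsmooth instance of problem (3).
       Uses exogenous stepsizes alpha_k = beta_k / max(1, ||u^k||) with
       beta_k = 1/(k+1), which satisfy condition (13)."""
    x = x0.copy()
    F = lambda v: np.linalg.norm(A @ v - b, 1) + lam * np.linalg.norm(v, 1)
    best = F(x)                                  # (f+g)^0_best, recursion (9)
    for k in range(iters):
        u = A.T @ np.sign(A @ x - b)             # a subgradient of f at x^k
        beta = 1.0 / (k + 1)
        alpha = beta / max(1.0, np.linalg.norm(u))
        x = soft_threshold(x - alpha * u, alpha * lam)  # x^{k+1} = prox_{alpha g}(x^k - alpha u^k)
        best = min(best, F(x))                   # (f+g)^k_best
    return x, best

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((20, 50))
    xtrue = np.zeros(50); xtrue[:3] = [1.0, -2.0, 0.5]
    b = A @ xtrue
    x, best = pss(A, b, lam=0.1, x0=np.zeros(50))
    print("best objective value found:", best)
```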

Before the formal analysis of the convergence properties of PSS Method, we discuss below the necessity of taking a (forward) subgradient step on $f$ instead of another (backward) proximal step.

Remark 2.1 To evaluate the proximal operator of $f$, it is necessary to solve a strongly convex minimization problem of the form (4). Thus, in the context of problem (1), we assume that it is hard to evaluate the proximal operator of $f$, ruling out the possibility of using the standard and very powerful iteration known as the Douglas-Rachford splitting method, presented in [17].

Such situations appear mainly when $f$ has a complicated algebraic expression, and therefore it may be impossible to solve subproblem (4) explicitly or efficiently. Indeed, very often in applications the formula for the proximity operator is not available in closed form, and ad hoc algorithms or approximation procedures have to be used to compute $\mathrm{prox}_{\alpha f}$. This happens, for instance, when applying proximal methods to image deblurring with total variation [5], or to structured sparsity regularization problems in machine learning and inverse problems [31]. A classical problem of the form (1), in which a subgradient of $f$ is easily available while $\mathrm{prox}_f$ does not have an explicit formula, is the dual formulation of the following constrained convex problem:

$$\min\ h_0(y)\quad\text{subject to}\quad h_i(y)\le0\ \ (i=1,\dots,n), \qquad (7)$$


where $h_i:\mathbb{R}^m\to\mathbb{R}$ $(i=0,\dots,n)$ are convex. It has as dual problem
$$\min_{x\in\mathbb{R}^n}\ f(x)+\delta_{\mathbb{R}^n_+}(x),$$
with $f:\mathbb{R}^n\to\overline{\mathbb{R}}$ defined as $f(x)=-\inf_{y\in\mathbb{R}^m}\big\{h_0(y)+\sum_{i=1}^{n}x_i h_i(y)\big\}$. It is well known that
$$\partial f(x)=\mathrm{conv}\Big\{-h(y_x)\ :\ -f(x)=h_0(y_x)+\sum_{i=1}^{n}x_i h_i(y_x)\Big\},$$
where $h(y):=(h_1(y),\dots,h_n(y))$ and $\mathrm{conv}\{S\}$ denotes the convex hull of a set $S$.
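To make this concrete, the following minimal sketch uses an assumed toy instance (not from the paper) with $h_0(y)=\tfrac12\|y\|^2$ and affine $h_i(y)=a_i^\top y+c_i$, so that the inner infimum has the closed-form minimizer $y_x=-A^\top x$; the vector $-h(y_x)$ is then a subgradient (here, in fact the gradient) of the dual function $f$, obtained simply by solving the inner problem.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 4, 6
A = rng.standard_normal((n, m))   # rows a_i of the affine constraints h_i(y) = a_i^T y + c_i
c = rng.standard_normal(n)

def inner_minimizer(x):
    """Minimizer y_x of h0(y) + sum_i x_i h_i(y), with h0(y) = 0.5*||y||^2."""
    return -A.T @ x

def dual_value_and_subgradient(x):
    """f(x) = -inf_y {h0(y) + sum_i x_i h_i(y)} and a subgradient -h(y_x) of f at x."""
    y = inner_minimizer(x)
    val = -(0.5 * y @ y + x @ (A @ y + c))
    sub = -(A @ y + c)            # -(h_1(y_x), ..., h_n(y_x))
    return val, sub

# Numerical check against the closed form f(x) = 0.5*||A^T x||^2 - c^T x,
# whose gradient is A A^T x - c (which equals -h(y_x) here).
x = rng.standard_normal(n)
val, sub = dual_value_and_subgradient(x)
assert np.isclose(val, 0.5 * np.linalg.norm(A.T @ x) ** 2 - c @ x)
assert np.allclose(sub, A @ A.T @ x - c)
print("subgradient of the dual function at x:", sub)
```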

However, computing $\mathrm{prox}_f$ does not look like an easy problem. This argument is widely used in the literature to motivate the projected subgradient method, which can be easily modified to cover problems like (1), where $g$ is not necessarily the indicator function. Indeed, consider problem (7) when $n=m$, with an additional and simple constraint $g_0$, that is,
$$\min_{y\in\mathbb{R}^n}\ h_0(y)\quad\text{subject to}\quad h_i(y)\le0\ \ (i=1,\dots,n),\ \ g_0(y)\le0, \qquad (8)$$
which can now be rewritten as $\min_{x\in\mathbb{R}^n}f(x)+\delta_{\mathbb{R}^n_+}(x)+\lambda g_0(x)$, $\lambda>0$, where
$$f(x)=-\inf_{y\in\mathbb{R}^n}\Big\{h_0(y)+\sum_{i=1}^{n}x_i h_i(y)\Big\}.$$
This last problem is a particular case of (1), by taking $g=\delta_{\mathbb{R}^n_+}+\lambda g_0$. Note that if $\mathrm{dom}(g_0)\subseteq\mathbb{R}^n_+$, then $g=\lambda g_0$.

Thus, PSS Method uses the proximal operator of $g$ and an explicit subgradient step on $f$ (i.e., the proximal operator of $f$ is never evaluated), which is, in general, much easier to implement than the proximal operator of $f+g$ or of $f$, as required by the standard proximal point iteration and the Douglas-Rachford algorithm, respectively, for solving nonsmooth problems such as (1); see, for instance, [17]. Furthermore, note that in our case the subgradient iteration applied to the sum $f+g$ is not possible, because the domains of $f$ and $g$ are not the whole space.

In the following we prove a crucial property of the iterates generated by PSS Method.

Lemma 2.1 Let $(x^k)_{k\in\mathbb{N}}$ and $(u^k)_{k\in\mathbb{N}}$ be the sequences generated by PSS Method. Then, for all $k\in\mathbb{N}$ and $x\in\mathrm{dom}(g)$,
$$\|x^{k+1}-x\|^2\le\|x^k-x\|^2+2\alpha_k\big[(f+g)(x)-(f+g)(x^k)\big]+\alpha_k^2\|u^k+w^k\|^2,$$
where $w^k\in\partial g(x^k)$ is arbitrary.

Proof Take any $x\in\mathrm{dom}(g)$. Note that (5) and (6) imply that $\bar w^{k+1}:=\frac{x^k-x^{k+1}}{\alpha_k}-u^k$, with $u^k\in\partial f(x^k)$ as defined by PSS Method, belongs to $\partial g(x^{k+1})$; in particular, $x^k-x^{k+1}=\alpha_k(u^k+\bar w^{k+1})$. Then,
$$
\begin{aligned}
\alpha_k^2\|u^k+\bar w^{k+1}\|^2+\|x^k-x\|^2-\|x^{k+1}-x\|^2
&=\|x^{k+1}-x^k\|^2+\|x^k-x\|^2-\|x^{k+1}-x\|^2\\
&=2\langle x^k-x^{k+1},\,x^k-x\rangle\\
&=2\alpha_k\langle u^k,\,x^k-x\rangle+2\langle x^k-x^{k+1}-\alpha_k u^k,\,x^k-x\rangle\\
&=2\alpha_k\langle u^k,\,x^k-x\rangle+2\alpha_k\Big\langle\tfrac{x^k-x^{k+1}}{\alpha_k}-u^k,\,x^{k+1}-x\Big\rangle
 +2\alpha_k\Big\langle\tfrac{x^k-x^{k+1}}{\alpha_k}-u^k,\,x^k-x^{k+1}\Big\rangle\\
&=2\alpha_k\langle u^k,\,x^k-x\rangle+2\alpha_k\langle\bar w^{k+1},\,x^{k+1}-x\rangle
 +2\alpha_k\langle u^k,\,x^{k+1}-x^k\rangle+2\|x^k-x^{k+1}\|^2.
\end{aligned}
$$
Now, using again that $\frac{x^k-x^{k+1}}{\alpha_k}-u^k=\bar w^{k+1}\in\partial g(x^{k+1})$ and the convexity of $g$ and $f$, the above equality leads to
$$
\begin{aligned}
2\langle x^k-x^{k+1},\,x^k-x\rangle
&\ge2\alpha_k\big[f(x^k)-f(x)+g(x^{k+1})-g(x)\big]+2\alpha_k\langle u^k,\,x^{k+1}-x^k\rangle+2\|x^k-x^{k+1}\|^2\\
&=2\alpha_k\big[(f+g)(x^k)-(f+g)(x)\big]+2\alpha_k\big[g(x^{k+1})-g(x^k)\big]
 +2\alpha_k\langle u^k,\,x^{k+1}-x^k\rangle+2\alpha_k^2\|u^k+\bar w^{k+1}\|^2\\
&\ge2\alpha_k\big[(f+g)(x^k)-(f+g)(x)\big]+2\alpha_k\langle w^k+u^k,\,x^{k+1}-x^k\rangle+2\alpha_k^2\|u^k+\bar w^{k+1}\|^2,
\end{aligned}
$$
for any $w^k\in\partial g(x^k)$. We thus have shown that
$$
\begin{aligned}
\|x^{k+1}-x\|^2
&\le\|x^k-x\|^2+2\alpha_k\big[(f+g)(x)-(f+g)(x^k)\big]+2\alpha_k^2\langle u^k+w^k,\,u^k+\bar w^{k+1}\rangle-\alpha_k^2\|u^k+\bar w^{k+1}\|^2\\
&=\|x^k-x\|^2+2\alpha_k\big[(f+g)(x)-(f+g)(x^k)\big]+\alpha_k^2\|u^k+w^k\|^2-\alpha_k^2\|w^k-\bar w^{k+1}\|^2.
\end{aligned}
$$
Note that $w^k\in\partial g(x^k)$ is arbitrary and the result follows.

Since subgradient methods, like the method proposed here, are not descent methods, it is common to keep track of the best point found so far, i.e., the one with minimum function value among the iterates. At each step, we set it recursively as $(f+g)^0_{\mathrm{best}}:=(f+g)(x^0)$ and
$$(f+g)^k_{\mathrm{best}}:=\min\big\{(f+g)^{k-1}_{\mathrm{best}},\,(f+g)(x^k)\big\}, \qquad (9)$$
for all $k$. Since $\big((f+g)^k_{\mathrm{best}}\big)_{k\in\mathbb{N}}$ is a decreasing sequence, it has a limit (which can be $-\infty$).

When the function $f$ is differentiable and its gradient is Lipschitz continuous, it is possible to prove the complexity of the iterates generated by PSS Method; see [35]. In our setting ($f$ is not necessarily differentiable) we expect, of course, slower convergence.

Next we present a convergence rate result for the sequence of best functional values $\big((f+g)^k_{\mathrm{best}}\big)_{k\in\mathbb{N}}$ toward $\min\{(f+g)(x):x\in\mathcal{H}\}$.

Lemma 2.2 Let $\big((f+g)^k_{\mathrm{best}}\big)_{k\in\mathbb{N}}$ be the sequence defined by (9). If $S\ne\emptyset$ then, for all $k\in\mathbb{N}$,
$$(f+g)^k_{\mathrm{best}}-\min_{x\in\mathcal{H}}(f+g)(x)\le\frac{[\mathrm{dist}(x^0,S)]^2+C_k\sum_{i=0}^{k}\alpha_i^2}{2\sum_{i=0}^{k}\alpha_i},$$
where $C_k:=\max\big\{\|u^i+w^i\|^2\ :\ 0\le i\le k\big\}$ with $w^i\in\partial g(x^i)$ $(i=0,\dots,k)$ arbitrary.

Proof Define $x^*:=P_S(x^0)$. Note that $x^*$ exists because $S$ is a nonempty closed and convex subset of $\mathcal{H}$. By applying Lemma 2.1 $k+1$ times, for $i\in\{0,1,\dots,k\}$, at $x^*\in S$, we get
$$
\begin{aligned}
\|x^{k+1}-x^*\|^2
&\le\|x^k-x^*\|^2+2\alpha_k\big[(f+g)(x^*)-(f+g)(x^k)\big]+\alpha_k^2\|u^k+w^k\|^2\\
&\le\|x^0-x^*\|^2+2\sum_{i=0}^{k}\alpha_i\big[(f+g)(x^*)-(f+g)(x^i)\big]+\sum_{i=0}^{k}\alpha_i^2\|u^i+w^i\|^2\\
&\le[\mathrm{dist}(x^0,S)]^2+2\Big[\min_{x\in\mathcal{H}}(f+g)(x)-(f+g)^k_{\mathrm{best}}\Big]\sum_{i=0}^{k}\alpha_i+C_k\sum_{i=0}^{k}\alpha_i^2, \qquad (10)
\end{aligned}
$$
where $(f+g)^k_{\mathrm{best}}$ is defined by (9), and the result follows after simple algebra.


Next we establish the rate of convergence of the objective values at the ergodic sequence $(\bar x^k)_{k\in\mathbb{N}}$ of $(x^k)_{k\in\mathbb{N}}$, which is defined recursively as $\bar x^0=x^0$ and, given $\sigma_0=\alpha_0$ and $\sigma_k=\sigma_{k-1}+\alpha_k$, we define
$$\bar x^k=\left(1-\frac{\alpha_k}{\sigma_k}\right)\bar x^{k-1}+\frac{\alpha_k}{\sigma_k}\,x^k.$$
After an easy induction, we have $\sigma_k=\sum_{i=0}^{k}\alpha_i$ and
$$\bar x^k=\frac{1}{\sigma_k}\sum_{i=0}^{k}\alpha_i x^i, \qquad (11)$$
for all $k\in\mathbb{N}$.
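As a quick illustration (with arbitrary synthetic data, not from the paper), the following sketch runs the recursive update above and checks numerically that it reproduces the weighted average (11):

```python
import numpy as np

rng = np.random.default_rng(3)
alphas = rng.uniform(0.1, 1.0, size=10)   # stepsizes alpha_0, ..., alpha_9
xs = rng.standard_normal((10, 3))         # iterates x^0, ..., x^9

# Recursive form: xbar^k = (1 - alpha_k/sigma_k) * xbar^{k-1} + (alpha_k/sigma_k) * x^k.
xbar = xs[0].copy()
sigma = alphas[0]
for k in range(1, len(xs)):
    sigma += alphas[k]
    xbar = (1.0 - alphas[k] / sigma) * xbar + (alphas[k] / sigma) * xs[k]

# Closed form (11): xbar^k = (1/sigma_k) * sum_i alpha_i * x^i.
xbar_direct = (alphas[:, None] * xs).sum(axis=0) / alphas.sum()
assert np.allclose(xbar, xbar_direct)
print(xbar)
```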

The following result is similar to Lemma 2.2, considering now the ergodic sequence defined by (11).

Lemma 2.3 Let $(\bar x^k)_{k\in\mathbb{N}}$ be the ergodic sequence defined by (11). If $S\ne\emptyset$, then
$$(f+g)(\bar x^k)-\min_{x\in\mathcal{H}}(f+g)(x)\le\frac{[\mathrm{dist}(x^0,S)]^2+C_k\sum_{i=0}^{k}\alpha_i^2}{2\sum_{i=0}^{k}\alpha_i},$$
where $C_k=\max\big\{\|u^i+w^i\|^2\ :\ 0\le i\le k\big\}$ with $w^i\in\partial g(x^i)$ $(i=0,\dots,k)$ arbitrary.

Proof Proceeding as in the proof of Lemma 2.2 up to inequality (10), and dividing by $2\sum_{i=0}^{k}\alpha_i$, we get
$$
\begin{aligned}
\sum_{i=0}^{k}\frac{\alpha_i}{\sigma_k}\Big[(f+g)(x^i)-\min_{x\in\mathcal{H}}(f+g)(x)\Big]
&\le\frac{1}{2\sigma_k}\Big([\mathrm{dist}(x^0,S)]^2-\|x^{k+1}-x^*\|^2\Big)+\frac{C_k}{2\sigma_k}\sum_{i=0}^{k}\alpha_i^2\\
&\le\frac{1}{2\sigma_k}\Big([\mathrm{dist}(x^0,S)]^2+C_k\sum_{i=0}^{k}\alpha_i^2\Big), \qquad (12)
\end{aligned}
$$
where $\sigma_k:=\sum_{i=0}^{k}\alpha_i$. Using the convexity of $f+g$, after noting that $\frac{\alpha_i}{\sigma_k}\in[0,1]$ for all $i\in\{0,1,\dots,k\}$ and $\sum_{i=0}^{k}\frac{\alpha_i}{\sigma_k}=1$, together with (11) in the above inequality (12), the result follows.

Next we focus on constant stepsizes, motivated by the fact that we are interested in quantifying the progress of the proposed method toward an approximate solution.

Corollary 2.4 Let $(x^k)_{k\in\mathbb{N}}$ be the sequence generated by PSS Method with the stepsizes $\alpha_k$ constant equal to $\alpha$, let $\big((f+g)^k_{\mathrm{best}}\big)_{k\in\mathbb{N}}$ be the sequence defined by (9) and let $(\bar x^k)_{k\in\mathbb{N}}$ be the ergodic sequence as in (11). Then, the iteration attains the optimal rate at $\alpha=\frac{\mathrm{dist}(x^0,S)}{\sqrt{C_k}}\cdot\frac{1}{\sqrt{k+1}}$, i.e., for all $k\in\mathbb{N}$,
$$(f+g)^k_{\mathrm{best}}-\min_{x\in\mathcal{H}}(f+g)(x)\le\frac{[\mathrm{dist}(x^0,S)]^2+\alpha^2(k+1)C_k}{2(k+1)\alpha}\le\frac{\mathrm{dist}(x^0,S)\cdot\sqrt{C_k}}{\sqrt{k+1}}$$
and
$$(f+g)(\bar x^k)-\min_{x\in\mathcal{H}}(f+g)(x)\le\frac{[\mathrm{dist}(x^0,S)]^2+\alpha^2(k+1)C_k}{2(k+1)\alpha}\le\frac{\mathrm{dist}(x^0,S)\cdot\sqrt{C_k}}{\sqrt{k+1}},$$
where $C_k=\max\big\{\|u^i+w^i\|^2\ :\ 0\le i\le k\big\}$ with $w^i\in\partial g(x^i)$ $(i=0,\dots,k)$ arbitrary.


Proof If we consider constant stepsizes, i.e., $\alpha_k=\alpha$ for all $k\in\mathbb{N}$, then the optimal rate is obtained when $\alpha=\frac{\mathrm{dist}(x^0,S)}{\sqrt{C_k}}\cdot\frac{1}{\sqrt{k+1}}$, which follows from minimizing the right-hand side of Lemmas 2.2 and 2.3.
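For completeness, the minimization behind this choice of $\alpha$ is the following routine calculus step (written out here; it is not a display from the paper). Writing $d:=\mathrm{dist}(x^0,S)$ and $C:=C_k$, the common right-hand side with $\alpha_i\equiv\alpha$ is
$$\phi(\alpha)=\frac{d^2+\alpha^2(k+1)C}{2(k+1)\alpha}=\frac{d^2}{2(k+1)\alpha}+\frac{C\alpha}{2},\qquad
\phi'(\alpha)=-\frac{d^2}{2(k+1)\alpha^2}+\frac{C}{2}=0\iff\alpha=\frac{d}{\sqrt{C}}\cdot\frac{1}{\sqrt{k+1}},$$
and substituting this $\alpha$ back gives $\phi(\alpha)=\frac{d\sqrt{C}}{\sqrt{k+1}}$, which is the bound stated in Corollary 2.4.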

Note that, under Assumption A2, $C_k\le\big(\max_{0\le i\le k}\|u^i\|+\rho\big)^2$. Hence, when $\partial f$ is bounded on $\mathrm{dom}(g)$ (which occurs, under our assumptions, when $\mathrm{dom}(g)$ is bounded), Assumption A1 implies that $C_k\le(\zeta+\rho)^2$ for all $k\in\mathbb{N}$. In this case, our analysis shows that the expected error of the iterates generated by PSS Method with constant stepsizes after $k$ iterations is $O\big((k+1)^{-1/2}\big)$. Hence, we can find an $\varepsilon$-solution of problem (1) within $O(\varepsilon^{-2})$ iterations. Of course, this is worse than the rate $O(k^{-1})$, and the $O(\varepsilon^{-1})$ iteration count, of the proximal forward-backward iteration for differentiable and convex $f$ with Lipschitz continuous gradient; see, for instance, [35]. However, as shown in Theorem 3.2.1 of [37], the worst-case error after $k$ iterations of the classical subgradient iteration is attained and equal to $O\big((k+1)^{-1/2}\big)$ for general nonsmooth problems.
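Making the iteration count explicit (a direct rearrangement of the bound above), the optimal constant stepsize yields an $\varepsilon$-solution as soon as
$$\frac{\mathrm{dist}(x^0,S)\,\sqrt{C_k}}{\sqrt{k+1}}\le\varepsilon\iff k+1\ge\frac{[\mathrm{dist}(x^0,S)]^2\,C_k}{\varepsilon^2},$$
and, since $C_k\le(\zeta+\rho)^2$ in the case discussed above, it suffices that $k+1\ge\frac{[\mathrm{dist}(x^0,S)]^2(\zeta+\rho)^2}{\varepsilon^2}$, i.e., $O(\varepsilon^{-2})$ iterations.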

2.1 Exogenous Stepsizes

In this subsection we analyze the convergence of PSS Method using exogenous stepsizes, i.e., the positive exogenous sequence of stepsizes $(\alpha_k)_{k\in\mathbb{N}}$ satisfies $\alpha_k=\frac{\beta_k}{\eta_k}$, where $\eta_k:=\max\{1,\|u^k\|\}$ for all $k$, and
$$\sum_{k=0}^{\infty}\beta_k^2<+\infty\qquad\text{and}\qquad\sum_{k=0}^{\infty}\beta_k=+\infty. \qquad (13)$$
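As a concrete admissible choice (an illustrative assumption; any sequence satisfying (13) works), $\beta_k=1/(k+1)$ gives $\sum_k\beta_k^2<+\infty$ and $\sum_k\beta_k=+\infty$. A minimal sketch of the resulting stepsize rule:

```python
import numpy as np

def exogenous_stepsize(k, u):
    """Stepsize alpha_k = beta_k / eta_k with beta_k = 1/(k+1) (so that
    sum beta_k^2 < +inf and sum beta_k = +inf, i.e. condition (13))
    and eta_k = max(1, ||u^k||)."""
    beta = 1.0 / (k + 1)
    eta = max(1.0, np.linalg.norm(u))
    return beta / eta

# Example: the stepsize shrinks like 1/(k+1), further damped by large subgradients.
u = np.array([3.0, -4.0])                             # ||u|| = 5
print([exogenous_stepsize(k, u) for k in range(5)])   # [0.2, 0.1, 0.0667, 0.05, 0.04]
```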

We begin with a useful consequence of Lemma 2.1.

Corollary 2.5 Let $x\in\mathrm{dom}(g)$. Then, for all $k\in\mathbb{N}$,
$$\|x^{k+1}-x\|^2\le\|x^k-x\|^2+\frac{2\beta_k}{\eta_k}\big[(f+g)(x)-(f+g)(x^k)\big]+\big(1+2\rho+\rho^2\big)\beta_k^2,$$
where $\rho\ge0$ is as defined in Assumption A2.

Proof The result follows by noting that $\eta_k\ge\|u^k\|$ and $\eta_k\ge1$ for all $k\in\mathbb{N}$, and choosing $w^k\in\partial g(x^k)$ such that $\|w^k\|\le\rho$ for all $k\in\mathbb{N}$, in view of Assumption A2. Then,
$$\frac{\|u^k+w^k\|^2}{\eta_k^2}\le\frac{\|u^k\|^2}{\eta_k^2}+\frac{2\|u^k\|\,\|w^k\|}{\eta_k^2}+\frac{\|w^k\|^2}{\eta_k^2}\le1+2\rho+\rho^2.$$
Now, Lemma 2.1 implies the desired result.

Now we define the auxiliary set
$$S_{\mathrm{lev}}(x^0):=\big\{x\in\mathrm{dom}(g)\ :\ (f+g)(x)\le(f+g)(x^k),\ \forall k\in\mathbb{N}\big\}. \qquad (14)$$
When the solution set of problem (1) is nonempty, $S_{\mathrm{lev}}(x^0)\ne\emptyset$ because $S\subseteq S_{\mathrm{lev}}(x^0)$.

Next, we prove the two main results of this subsection.

Theorem 2.6 Let $(x^k)_{k\in\mathbb{N}}$ be the sequence generated by PSS Method with exogenous stepsizes. If there exists $\bar x\in S_{\mathrm{lev}}(x^0)$, then:

(a) The sequence $(x^k)_{k\in\mathbb{N}}$ is quasi-Fejér convergent to $L_{f+g}(\bar x):=\{x\in\mathrm{dom}(g):(f+g)(x)\le(f+g)(\bar x)\}$.

(b) $\lim_{k\to\infty}(f+g)(x^k)=(f+g)(\bar x)$.

(c) The sequence $(x^k)_{k\in\mathbb{N}}$ is weakly convergent to some $\tilde x\in L_{f+g}(\bar x)$.

Proof By assumption there exists $\bar x\in S_{\mathrm{lev}}(x^0)$, i.e., $(f+g)(\bar x)\le(f+g)(x^k)$ for all $k\in\mathbb{N}$.

(a) To show that $(x^k)_{k\in\mathbb{N}}$ is quasi-Fejér convergent to $L_{f+g}(\bar x)$ (which is nonempty because $\bar x\in L_{f+g}(\bar x)$), we use Corollary 2.5, for any $x\in L_{f+g}(\bar x)\subseteq\mathrm{dom}(g)$, establishing that $\|x^{k+1}-x\|^2\le\|x^k-x\|^2+(1+2\rho+\rho^2)\beta_k^2$ for all $k\in\mathbb{N}$. Thus, $(x^k)_{k\in\mathbb{N}}$ is quasi-Fejér convergent to $L_{f+g}(\bar x)$.

(b) The sequence $(x^k)_{k\in\mathbb{N}}$ is bounded by Fact 1.1(a), and hence it has accumulation points in the sense of the weak topology. To prove that
$$\lim_{k\to\infty}(f+g)(x^k)=(f+g)(\bar x), \qquad (15)$$
we use Corollary 2.5, with $x=\bar x\in L_{f+g}(\bar x)\subseteq\mathrm{dom}(g)$, to get
$$\frac{\beta_k}{\eta_k}\big[(f+g)(x^k)-(f+g)(\bar x)\big]\le\frac12\big(\|x^k-\bar x\|^2-\|x^{k+1}-\bar x\|^2\big)+\frac12\big(1+2\rho+\rho^2\big)\beta_k^2.$$
Summing the above inequality from $k=0$ to $m$, we have
$$\sum_{k=0}^{m}\frac{\beta_k}{\eta_k}\big[(f+g)(x^k)-(f+g)(\bar x)\big]\le\frac12\big(\|x^0-\bar x\|^2-\|x^{m+1}-\bar x\|^2\big)+\frac12\big(1+2\rho+\rho^2\big)\sum_{k=0}^{m}\beta_k^2,$$
and taking the limit as $m$ goes to $\infty$, using (13), the left-hand series converges. Moreover, since $(x^k)_{k\in\mathbb{N}}$ is bounded, Assumption A1 provides $\zeta>0$ such that $\|u^k\|\le\zeta$ for all $k\in\mathbb{N}$, so that $\eta_k\le\max\{1,\zeta\}$ and hence
$$\sum_{k=0}^{\infty}\beta_k\big[(f+g)(x^k)-(f+g)(\bar x)\big]\le\max\{1,\zeta\}\sum_{k=0}^{\infty}\frac{\beta_k}{\eta_k}\big[(f+g)(x^k)-(f+g)(\bar x)\big]<+\infty. \qquad (16)$$
Then, (16) together with (13) implies that there exists a subsequence $\big((f+g)(x^{i_k})\big)_{k\in\mathbb{N}}$ of $\big((f+g)(x^k)\big)_{k\in\mathbb{N}}$ such that
$$\lim_{k\to\infty}\big[(f+g)(x^{i_k})-(f+g)(\bar x)\big]=0. \qquad (17)$$
Indeed, if (17) does not hold, then there exist $\sigma>0$ and $\tilde k$ such that $(f+g)(x^k)-(f+g)(\bar x)\ge\sigma$ for all $k\ge\tilde k$, and using (16), we get
$$+\infty>\sum_{k=\tilde k}^{\infty}\beta_k\big[(f+g)(x^k)-(f+g)(\bar x)\big]\ge\sigma\sum_{k=\tilde k}^{\infty}\beta_k,$$
in contradiction with (13). Next, define $\varphi_k:=(f+g)(x^k)-(f+g)(\bar x)$, which is nonnegative for all $k$ because $\bar x\in S_{\mathrm{lev}}(x^0)$. Then, for any $u^k\in\partial f(x^k)$ and $w^k\in\partial g(x^k)$, we get
$$\varphi_k-\varphi_{k+1}=(f+g)(x^k)-(f+g)(x^{k+1})\le\langle u^k+w^k,\,x^k-x^{k+1}\rangle\le\|u^k+w^k\|\,\|x^k-x^{k+1}\|\le(\zeta+\rho)\|x^k-x^{k+1}\|, \qquad (18)$$
where $\zeta>0$ is such that $\|u^k\|\le\zeta$ for all $k\in\mathbb{N}$ ($\zeta$ exists in virtue of the boundedness of $(x^k)_{k\in\mathbb{N}}$ and Assumption A1) and $\|w^k\|\le\rho$ for all $k\in\mathbb{N}$ ($\rho$ exists because $w^k\in\partial g(x^k)$ is arbitrary, in view of Assumption A2).

Using Corollary 2.5, with $x=x^k$, we have $\|x^k-x^{k+1}\|\le\sqrt{1+2\rho+\rho^2}\,\beta_k$, which together with (18) implies that
$$\varphi_k-\varphi_{k+1}\le\sqrt{1+2\rho+\rho^2}\,(\zeta+\rho)\,\beta_k=:\bar\rho\,\beta_k, \qquad (19)$$
for all $k\in\mathbb{N}$. From (17), there exists a subsequence $(\varphi_{i_k})_{k\in\mathbb{N}}$ of $(\varphi_k)_{k\in\mathbb{N}}$ such that $\lim_{k\to\infty}\varphi_{i_k}=0$. If the claim given in (15) does not hold, then there exist some $\delta>0$ and a subsequence $(\varphi_{\ell_k})_{k\in\mathbb{N}}$ of $(\varphi_k)_{k\in\mathbb{N}}$ such that $\varphi_{\ell_k}\ge\delta$ for all $k\in\mathbb{N}$. Thus, we can construct a third subsequence $(\varphi_{j_k})_{k\in\mathbb{N}}$ of $(\varphi_k)_{k\in\mathbb{N}}$, where the indices $j_k$ are chosen in the following way:
$$j_0:=\min\{m\ge0\ |\ \varphi_m\ge\delta\},\qquad j_{2k+1}:=\min\{m\ge j_{2k}\ |\ \varphi_m\le\delta/2\},\qquad j_{2k+2}:=\min\{m\ge j_{2k+1}\ |\ \varphi_m\ge\delta\},$$
for each $k$. The existence of the subsequences $(\varphi_{i_k})_{k\in\mathbb{N}}$ and $(\varphi_{\ell_k})_{k\in\mathbb{N}}$ of $(\varphi_k)_{k\in\mathbb{N}}$ guarantees that the subsequence $(\varphi_{j_k})_{k\in\mathbb{N}}$ of $(\varphi_k)_{k\in\mathbb{N}}$ is well-defined for all $k\ge0$. It follows from the definition of $j_k$ that
$$\varphi_m\ge\frac{\delta}{2}\ \ \text{for}\ \ j_{2k}\le m\le j_{2k+1}-1\qquad\text{and}\qquad\varphi_m<\delta\ \ \text{for}\ \ j_{2k+1}\le m\le j_{2k+2}-1, \qquad (20)$$
for all $k$, and hence
$$\varphi_{j_{2k}}-\varphi_{j_{2k+1}}\ge\frac{\delta}{2}, \qquad (21)$$
for all $k\in\mathbb{N}$. In view of (16), and recalling that $\varphi_k=(f+g)(x^k)-(f+g)(\bar x)\ge0$ for all $k\in\mathbb{N}$,
$$
\begin{aligned}
+\infty>\sum_{k=0}^{\infty}\beta_k\varphi_k
&\ge\sum_{k=0}^{\infty}\ \sum_{m=j_{2k}}^{j_{2k+1}-1}\beta_m\varphi_m
\ \ge\ \frac{\delta}{2}\sum_{k=0}^{\infty}\ \sum_{m=j_{2k}}^{j_{2k+1}-1}\beta_m
\ =\ \frac{\delta}{2\bar\rho}\sum_{k=0}^{\infty}\ \sum_{m=j_{2k}}^{j_{2k+1}-1}\bar\rho\,\beta_m\\
&\ge\frac{\delta}{2\bar\rho}\sum_{k=0}^{\infty}\ \sum_{m=j_{2k}}^{j_{2k+1}-1}\big(\varphi_m-\varphi_{m+1}\big)
\ =\ \frac{\delta}{2\bar\rho}\sum_{k=0}^{\infty}\big(\varphi_{j_{2k}}-\varphi_{j_{2k+1}}\big)
\ \ge\ \frac{\delta}{2\bar\rho}\sum_{k=0}^{\infty}\frac{\delta}{2}=+\infty,
\end{aligned}
$$
where we have used (20) in the second inequality, (19) in the third inequality and (21) in the last one, a contradiction. Thus, $\lim_{k\to\infty}(f+g)(x^k)=(f+g)(\bar x)$, establishing (b).

(c) Let $\tilde x$ be a weak accumulation point of $(x^k)_{k\in\mathbb{N}}$, and note that $\tilde x$ exists by Item (a) and Fact 1.1(a). From now on, we use $(x^{i_k})_{k\in\mathbb{N}}$ to denote any subsequence of $(x^k)_{k\in\mathbb{N}}$ that converges weakly to $\tilde x$. Since $f+g$ is weakly lower semicontinuous, using (15), we get
$$(f+g)(\tilde x)\le\liminf_{k\to\infty}(f+g)(x^{i_k})=\lim_{k\to\infty}(f+g)(x^k)=(f+g)(\bar x),$$
implying that $(f+g)(\tilde x)\le(f+g)(\bar x)$ and thus $\tilde x\in L_{f+g}(\bar x)$. As a consequence, all weak accumulation points of $(x^k)_{k\in\mathbb{N}}$ belong to $L_{f+g}(\bar x)$ and, since $(x^k)_{k\in\mathbb{N}}$ is quasi-Fejér convergent to $L_{f+g}(\bar x)$, we get from Fact 1.1(b) that $(x^k)_{k\in\mathbb{N}}$ converges weakly to $\tilde x\in L_{f+g}(\bar x)$.

Theorem 2.7 Let $(x^k)_{k\in\mathbb{N}}$ be the sequence generated by PSS Method with exogenous stepsizes. Then,


(a) $\liminf_{k\to\infty}(f+g)(x^k)=\inf_{x\in\mathcal{H}}(f+g)(x)=s$ (possibly $s=-\infty$).

(b) If $S\ne\emptyset$, then $\lim_{k\to\infty}(f+g)(x^k)=\min_{x\in\mathcal{H}}(f+g)(x)$ and $(x^k)_{k\in\mathbb{N}}$ converges weakly to some $\bar x\in S$.

(c) If $S=\emptyset$, then $(x^k)_{k\in\mathbb{N}}$ is unbounded.

Proof (a) Since $(x^k)_{k\in\mathbb{N}}\subset\mathrm{dom}(g)$, we get $s\le\liminf_{k\to\infty}(f+g)(x^k)$. Suppose that $s<\liminf_{k\to\infty}(f+g)(x^k)$. Hence, there exists $\hat x$ such that
$$(f+g)(\hat x)<\liminf_{k\to\infty}(f+g)(x^k). \qquad (22)$$
It follows from (22) that there exists $\bar k\in\mathbb{N}$ such that $(f+g)(\hat x)\le(f+g)(x^k)$ for all $k\ge\bar k$. Since $\bar k$ is finite, we can assume without loss of generality that $(f+g)(\hat x)\le(f+g)(x^k)$ for all $k\in\mathbb{N}$. Using the definition of $S_{\mathrm{lev}}(x^0)$, given in (14), we have that $\hat x\in S_{\mathrm{lev}}(x^0)$. By Theorem 2.6(b), $\lim_{k\to\infty}(f+g)(x^k)=(f+g)(\hat x)$, in contradiction with (22).

(b) Since $S\ne\emptyset$, take $x^*\in S$ and note that this implies $L_{f+g}(x^*)=S$. Since $(x^k)_{k\in\mathbb{N}}\subset\mathrm{dom}(g)$, we get $(f+g)(x^*)\le(f+g)(x^k)$ for all $k\in\mathbb{N}$, implying that $x^*\in S_{\mathrm{lev}}(x^0)$. By applying Items (b) and (c) of Theorem 2.6, with $\bar x=x^*$, we get that $\lim_{k\to\infty}(f+g)(x^k)=(f+g)(x^*)$ and that $(x^k)_{k\in\mathbb{N}}$ converges weakly to some $\tilde x\in S$, respectively.

(c) Assume that $S$ is empty but the sequence $(x^k)_{k\in\mathbb{N}}$ is bounded. Let $(x^{\ell_k})_{k\in\mathbb{N}}$ be a subsequence of $(x^k)_{k\in\mathbb{N}}$ such that $\lim_{k\to\infty}(f+g)(x^{\ell_k})=\liminf_{k\to\infty}(f+g)(x^k)$. Since $(x^{\ell_k})_{k\in\mathbb{N}}$ is bounded, without loss of generality (i.e., refining $(x^{\ell_k})_{k\in\mathbb{N}}$ if necessary), we may assume that $(x^{\ell_k})_{k\in\mathbb{N}}$ converges weakly to some $\bar x\in\mathrm{dom}(g)$. By the weak lower semicontinuity of $f+g$ on $\mathrm{dom}(g)$,
$$(f+g)(\bar x)\le\liminf_{k\to\infty}(f+g)(x^{\ell_k})=\lim_{k\to\infty}(f+g)(x^{\ell_k})=\liminf_{k\to\infty}(f+g)(x^k)=s, \qquad (23)$$
using Item (a) in the last equality. By (23), $\bar x\in S$, in contradiction with the hypothesis, and the result follows.

For exogenous stepsizes, Theorem 2.7(a) guarantees the convergence of $\big((f+g)(x^k)\big)_{k\in\mathbb{N}}$ to the optimal value of problem (1), in the sense that $\liminf_{k\to\infty}(f+g)(x^k)=s$, implying the convergence of $\big((f+g)^k_{\mathrm{best}}\big)_{k\in\mathbb{N}}$, defined in (9), to $s$. It is important to mention that, in the proofs of the above two crucial results, we have used a similar idea to the one recently presented in [7] for a different instance.

In the following we present a direct consequence of Lemmas 2.2 and 2.3, when the stepsizes satisfy (13).

Corollary 2.8 Let $(\bar x^k)_{k\in\mathbb{N}}$ be the ergodic sequence defined by (11) and let $(\beta_k)_{k\in\mathbb{N}}$ be as in (13). If $S\ne\emptyset$, then, for all $k\in\mathbb{N}$,
$$(f+g)^k_{\mathrm{best}}-\min_{x\in\mathcal{H}}(f+g)(x)\le\frac{\zeta\,[\mathrm{dist}(x^0,S)]^2+(1+2\rho+\rho^2)\sum_{i=0}^{k}\beta_i^2}{2\sum_{i=0}^{k}\beta_i}.$$
