
On some boosted methods for DC programming and the extension of the DCA to Hadamard Manifolds


Academic year: 2022


UNIVERSIDADE FEDERAL DE GOIÁS (UFG)

INSTITUTO DE MATEMÁTICA E ESTATÍSTICA (IME) PROGRAMA DE PÓS GRADUAÇÃO EM MATEMÁTICA

ELIANDERSON MENESES SANTOS

On some boosted methods for DC programming and the extension of the DCA to Hadamard Manifolds

GOIÂNIA 2022

26/01/22 14:55 SEI/UFG - 2607181 - Termo de Ciência e de Autorização (TECA)

Página 1 de 2 https://sei.ufg.br/sei/controlador.php?acao=documento_impri…d8a4bf20dc84e37282efa9d3ba480c58c855b1a7de0ecdfdf75811bd787

UNIVERSIDADE FEDERAL DE GOIÁS
INSTITUTO DE MATEMÁTICA E ESTATÍSTICA

STATEMENT OF ACKNOWLEDGMENT AND AUTHORIZATION (TECA) FOR MAKING ELECTRONIC VERSIONS OF THESES AND DISSERTATIONS AVAILABLE IN THE UFG DIGITAL LIBRARY

As the holder of the copyright, I authorize the Universidade Federal de Goiás (UFG) to make available, free of charge, through the Biblioteca Digital de Teses e Dissertações (BDTD/UFG), regulated by CEPEC Resolution No. 832/2007, without reimbursement of copyright, in accordance with Law 9.610/98, the document according to the permissions marked below, for the purposes of reading, printing and/or download, in order to disseminate Brazilian scientific production, as of this date.

The content of the theses and dissertations made available in the BDTD/UFG is the sole responsibility of the author. By submitting the final product, the author and the advisor commit that the work contains no violation of any copyright or other third-party rights.

1. Identification of the bibliographic material: [ ] Dissertation [ X ] Thesis

2. Full name of the author: Elianderson Meneses Santos

3. Title of the work: On some boosted methods for DC programming and the extension of the DCA to Hadamard Manifolds

4. Document access information (this field must be filled in by the advisor)

Agrees with the full release of the document: [ X ] YES [ ] NO¹

[1] In this case the document will be embargoed for up to one year from the defense date. After this period, any release will occur only upon:

a) consultation with the author and the advisor;

b) a new Statement of Acknowledgment and Authorization (TECA), signed and inserted in the thesis or dissertation file.

The document will not be made available during the embargo period.

Embargo cases:

[ ] Patent registration request;

[ ] Submission of an article to a scientific journal;

[ ] Publication as a book chapter;

[ ] Publication of the dissertation/thesis as a book.

Note: This statement must be signed in SEI by the advisor and by the author.


Document electronically signed by ELIANDERSON MENESES SANTOS, Student, on 03/01/2022, at 16:32, according to official Brasília time, pursuant to § 3 of art. 4 of Decree No. 10.543 of 13 November 2020.

Document electronically signed by Orizon Pereira Ferreira, Professor of Higher Education, on 04/01/2022, at 04:16, according to official Brasília time, pursuant to § 3 of art. 4 of Decree No. 10.543 of 13 November 2020.

The authenticity of this document can be verified at https://sei.ufg.br/sei/controlador_externo.php?acao=documento_conferir&id_orgao_acesso_externo=0, by entering the verification code 2607181 and the CRC code 0ABC1C53.

Reference: Process No. 23070.060746/2021-87 SEI No. 2607181

ELIANDERSON MENESES SANTOS

On some boosted methods for DC programming and the extension of the DCA to Hadamard Manifolds

GOIÂNIA 2022

Thesis presented to the Programa de Pós-graduação em Matemática of the Instituto de Matemática e Estatística (IME), Universidade Federal de Goiás (UFG), as a requirement for obtaining the degree of Doctor in Mathematics.

Area of concentration: Optimization

Advisor: Prof. Dr. Orizon Pereira Ferreira

Co-advisor: Prof. Dr. João Carlos de Oliveira Souza

Catalog record prepared by the author through the Programa de Geração Automática of the Sistema de Bibliotecas da UFG.

CDU 51

Santos, Elianderson Meneses

On some boosted methods for DC programming and the extension of the DCA to Hadamard Manifolds [manuscript] / Elianderson Meneses Santos. - 2022.

142 leaves: ill.

Advisor: Prof. Dr. Orizon Pereira Ferreira; co-advisor: Dr. João Carlos de Oliveira Souza.

Thesis (Doctorate) - Universidade Federal de Goiás, Instituto de Matemática e Estatística (IME), Programa de Pós-Graduação em Matemática, Goiânia, 2022.

Bibliography.

Includes symbols, graphs, tables, algorithms.

1. Difference of convex functions. 2. DC optimization. 3. DCA. 4. Kurdyka-Lojasiewicz property. 5. Optimization on Riemannian manifolds. I. Ferreira, Orizon Pereira, advisor. II. Title.

26/01/22 14:55 SEI/UFG - 2586804 - Ata de Defesa de Tese

Página 1 de 2 https://sei.ufg.br/sei/controlador.php?acao=documento_impri…5652d57a585032460c8c39a5b7c72f74b7e7a927219b62bd6422adac75

UNIVERSIDADE FEDERAL DE GOIÁS
INSTITUTO DE MATEMÁTICA E ESTATÍSTICA

THESIS DEFENSE MINUTES

Minutes No. 11 of the thesis defense session of Elianderson Meneses Santos, which confers the degree of Doctor in Mathematics, in the area of concentration of Optimization.

On the seventeenth day of December of the year two thousand and twenty-one, starting at ten o'clock, by web video conference, the public session of the defense of the thesis entitled "On some boosted methods for DC programming and the extension of the DCA to Hadamard Manifolds" was held. The proceedings were opened by the advisor and chair of the committee, Professor Doctor Orizon Pereira Ferreira - IME/UFG, with the participation of the other members of the examining committee: Professor Doctor Leandro da Fonseca Prudente - IME/UFG, internal full member; Professor Doctor Glaydston de Carvalho Bento - IME/UFG, internal full member; Professor Doctor João Xavier da Cruz Neto - MAT/UFPI, external full member; Professor Doctor Welington Luis de Oliveira - MINES ParisTech, France, external full member; and the co-advisor, Professor Doctor João Carlos de Oliveira Souza - MAT/UFPI, external full member. During the examination, the committee members did not suggest any change to the title of the work. The examining committee met in closed session to conclude the evaluation of the thesis, and the candidate was approved by its members. After the results were announced by Professor Doctor Orizon Pereira Ferreira - IME/UFG, chair of the examining committee, the proceedings were closed and, for the record, these minutes were drawn up and are signed by the members of the examining committee, on the seventeenth day of December of the year two thousand and twenty-one.

TITLE SUGGESTED BY THE COMMITTEE

On some boosted methods for DC programming and the extension of the DCA to Hadamard Manifolds

Document electronically signed by JOÃO XAVIER DA CRUZ NETO, External User, on 20/12/2021, at 13:48, according to official Brasília time, pursuant to § 3 of art. 4 of Decree No. 10.543 of 13 November 2020.

Document electronically signed by JOÃO CARLOS DE OLIVEIRA SOUZA, External User, on 20/12/2021, at 13:54, according to official Brasília time, pursuant to § 3 of art. 4 of Decree No. 10.543 of 13 November 2020.


Document electronically signed by Orizon Pereira Ferreira, Professor of Higher Education, on 20/12/2021, at 15:32, according to official Brasília time, pursuant to § 3 of art. 4 of Decree No. 10.543 of 13 November 2020.

Document electronically signed by Leandro Da Fonseca Prudente, Professor of Higher Education, on 20/12/2021, at 16:09, according to official Brasília time, pursuant to § 3 of art. 4 of Decree No. 10.543 of 13 November 2020.

Document electronically signed by Welington Luis de Oliveira, External User, on 26/12/2021, at 18:55, according to official Brasília time, pursuant to § 3 of art. 4 of Decree No. 10.543 of 13 November 2020.

Document electronically signed by Glaydston De Carvalho Bento, Professor of Higher Education, on 03/01/2022, at 11:38, according to official Brasília time, pursuant to § 3 of art. 4 of Decree No. 10.543 of 13 November 2020.

The authenticity of this document can be verified at https://sei.ufg.br/sei/controlador_externo.php?acao=documento_conferir&id_orgao_acesso_externo=0, by entering the verification code 2586804 and the CRC code AFBAE386.

Reference: Process No. 23070.060746/2021-87 SEI No. 2586804

ELIANDERSON MENESES SANTOS

On some boosted methods for DC programming and the extension of the DCA to Hadamard Manifolds

Thesis defended in the Programa de Pós-Graduação of the Instituto de Matemática e Estatística - IME of the Universidade Federal de Goiás as a partial requirement for obtaining the degree of Doctor in Mathematics, approved on 17 December 2021 by the examining committee composed of the professors:

Prof. Dr. Orizon Pereira Ferreira
Instituto de Matemática e Estatística - IME - UFG
Committee Chair

Prof. Dr. João Carlos de Oliveira Souza
Departamento de Matemática - DM - UFPI

Prof. Dr. Glaydston de Carvalho Bento
Instituto de Matemática e Estatística - IME - UFG

Prof. Dr. João Xavier da Cruz Neto
Departamento de Matemática - DM - UFPI

Prof. Dr. Leandro da Fonseca Prudente
Instituto de Matemática e Estatística - IME - UFG

Prof. Dr. Welington Luis de Oliveira
Centre de Mathématiques Appliquées - CMA - Mines ParisTech

All rights reserved. Total or partial reproduction of this work without authorization from the university, the author and the advisor is prohibited.

Elianderson Meneses Santos

He received a teaching degree (licenciatura plena) in mathematics from the Universidade Federal do Piauí - UFPI in September 2013 and a master's degree in mathematics, also from UFPI, with the dissertation defended in October 2015. He taught mathematics in undergraduate courses at UFPI from May 2014 to March 2016, and in undergraduate courses at the Universidade Estadual do Piauí - UESPI from April 2016 to February 2019. He began the doctorate in mathematics at the Instituto de Matemática e Estatística of the Universidade Federal de Goiás in March 2019.

To my family.

Acknowledgments

First, I thank God for giving me the gift of life and for allowing me to finish this program despite all the difficulties I faced along the way.

I thank my family for all their support, not only during my doctoral studies but throughout my personal and academic life.

I thank my partner Cíntia for being by my side at all times, always encouraging me, not letting me lose heart, and accompanying me in the greatest challenges and adventures of recent times.

I thank my friends, who, even from a distance, never stopped supporting me. Special thanks go to the friends of the Quarteto, Renan, Rômulo, Kellwyn, Phelps, Danilo, Helson and Neves, and also to the Malditos Amigos crowd, especially Atécio, Magari and João Neto, for their companionship. I also thank my friends Rafael Emanuel and Hércules Bezerra.

I thank my advisors, Professors Orizon Pereira Ferreira and João Carlos de Oliveira Souza, for their patience, availability, companionship and good advice. I thank both for guiding me along the first paths of academic research in mathematics, both in its technical aspects and in good ethical practice.

I thank the professors and staff of IME-UFG, whose work at the institute also helped me finish this program.

I also thank my colleagues in the doctoral program for their companionship while we took the same courses, especially Riemannian Geometry and Optimization; among them I particularly acknowledge the support of Danilo and Thamara.

I thank Professors Glaydston de Carvalho Bento, João Xavier da Cruz Neto, Leandro da Fonseca Prudente and Welington Luis de Oliveira for agreeing to serve on the defense committee of this doctoral thesis, and for all their excellent remarks and suggestions. I also thank Maurício Silva Louzeiro, whose tips and remarks also contributed to the final version of this thesis.

I thank CAPES for the financial support.

"Resgate suas forças e se sinta bem Rompendo a sombra da própria loucura Cuide de quem corre do seu lado E quem te quer bem Essa é a coisa mais pura Fragmentos da realidade Estilo mundo cão Tem gente que desanda Por falta de opção E toda fé que eu tenho Eu tô ligado que ainda é pouco Os bandidos de verdade Tão em Brasília, tudo solto Eu faço da dificuldade A minha motivação A volta por cima Vem na continuação O que se leva dessa vida É o que se vive, é o que se faz Saber muito é muito pouco

‘Stay Will’esteja em paz"

Chorão, Pontes indestrutíveis - Charlie Brown Jr.

Resumo

Santos, E. M. Sobre alguns métodos impulsionados para programação DC e a extensão do DCA para variedades de Hadamard. Goiânia, 2021. 142p. Doctoral thesis. Instituto de Matemática e Estatística - IME, Universidade Federal de Goiás.

This thesis presents some new methods for the optimization of DC functions. The first, called BSSM, is proposed to solve DC optimization problems over $\mathbb{R}^n$ in which the first DC component is differentiable and the second is possibly non-differentiable. The second method, called nmBDCA, is a non-monotone extension of the BDCA for DC optimization problems over $\mathbb{R}^n$ in which both DC components are non-differentiable. The third method combines the BSSM with the nmBDCA to handle DC optimization problems over a closed convex set $C$ with linear constraints, where the first DC component of the objective function is the sum of a smooth convex function and a non-differentiable convex function, and the second DC component is non-differentiable. The last method presented in this thesis is an extension of the DCA to the context of DC optimization on Hadamard manifolds.

Keywords

Difference of convex functions, DC optimization, DCA, Kurdyka-Łojasiewicz property, optimization on Riemannian manifolds.

Abstract

Santos, E. M. On some boosted methods for DC programming and the extension of the DCA to Hadamard Manifolds. Goiânia, 2021. 142p. PhD Thesis. Instituto de Matemática e Estatística - IME, Universidade Federal de Goiás.

In this thesis some new methods for DC optimization are presented. The first one, called BSSM, is proposed to solve DC problems over $\mathbb{R}^n$ where the first DC component is differentiable and the second one is non-smooth. The second method, called nmBDCA, is a non-monotone extension of the BDCA to deal with DC problems over $\mathbb{R}^n$ where both DC components are non-smooth. The third method is a combination of the BSSM with the nmBDCA to deal with DC problems over a closed convex set $C$ with linear constraints, where the first DC component of the objective function is a sum of a smooth convex function and a non-differentiable convex function, and the second DC component is non-smooth. The last method is an extension of the DCA to the context of DC optimization on Hadamard manifolds.

Keywords

Difference of convex functions, DC optimization, DCA, Kurdyka-Łojasiewicz property, optimization on Riemannian manifolds.

Contents

1 Introduction
2 Preliminaries
  2.1 Basic concepts and results of optimization on $\mathbb{R}^n$
    2.1.1 Some facts about locally Lipschitz functions
    2.1.2 On closed convex sets with linear constraints
    2.1.3 On the Kurdyka-Łojasiewicz property
  2.2 Basic concepts and results of optimization on Hadamard manifolds
    2.2.1 Basic concepts and results of Riemannian geometry
    2.2.2 Concepts and results of optimization in Hadamard Manifolds
    2.2.3 On the Fenchel conjugate in Hadamard manifolds
  2.3 Brief comments about DC programming
3 Boosted scaled subgradient method
  3.1 BSSM for DC programming
    3.1.1 Well definedness and partial asymptotic convergence analysis
    3.1.2 Iteration-complexity bounds
    3.1.3 Full convergence under the Kurdyka-Łojasiewicz property
  3.2 BSSM for DC programming with linear constraints
    3.2.1 Partial asymptotic convergence analysis
    3.2.2 Full convergence for quadratic objective functions
  3.3 Numerical Experiments
    3.3.1 Academical examples
    3.3.2 Fermat-Weber location problem
4 Non-monotone Boosted DC Algorithm
  4.1 The nmBDCA for DC programming
  4.2 Analysis of nmBDCA for $g$ possibly non-smooth
    4.2.1 Well definedness
    4.2.2 Asymptotic convergence analysis
    4.2.3 Iteration-complexity analysis
  4.3 Analysis of nmBDCA for $g$ continuously differentiable
    4.3.1 Well definedness of nmBDCA and a computation of an inferior bound for the step-length
    4.3.2 Iteration complexity bounds
    4.3.3 Full convergence under the Kurdyka-Łojasiewicz property
  4.4 Numerical experiments
5 Non-monotone BSDCA for linearly constrained DC programming
  5.1 The Boosted Scaled DC Algorithm with non-monotone line search
    5.1.1 Well definedness of the nm-BSDCA
    5.1.2 Convergence analysis
    5.1.3 Iteration-complexity bounds
  5.2 Application: The constrained $\ell_{1-2}$ minimization problem
6 The DC Algorithm in Hadamard manifolds
  6.1 Duality in DC optimization in Hadamard Manifolds
  6.2 DCA on Hadamard Manifolds
  6.3 Convergence analysis of DCA
7 Final remarks
Bibliography

CHAPTER 1

Introduction

A general DC problem is a non-convex and non-smooth optimization problem of the form

\[ \min_{x \in \mathcal{F}} \phi(x) = g(x) - h(x), \tag{PDC} \]

where $g : \mathcal{F} \to \mathbb{R}$ and $h : \mathcal{F} \to \mathbb{R}$ are both convex and lower semicontinuous functions, and $\mathcal{F}$ is a feasible set, which can be a convex subset $C$ of $\mathbb{R}^n$ or a Riemannian manifold $\mathcal{M}$. The function $\phi$ is known as a DC function, i.e., a function that can be expressed as a difference of two convex functions, and each of those functions is said to be a DC component of $\phi$.

The first and most popular method developed to deal with DC programs, named the Difference of Convex Algorithm or simply DCA, was first presented in [68]; see also [69]. In those works the authors establish some results on duality in DC optimization and then present the DCA, which is based on the conjugate function and the duality relations of convex functions. Below we present the original formulation of the DCA, which we will refer to as the natural form of DCA, or simply DCA when there is no risk of confusion. This algorithm aims to minimize the function $\phi = g - h$ defined above when $\mathcal{F} = \mathbb{R}^n$. The algorithm is as follows:

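Algorithm 1 DC Algorithm

1: Choose an initial point $x^0 \in \operatorname{dom}(g)$. Set $k = 0$.

2: Take $\xi^k \in \partial h(x^k)$, and compute

\[ x^{k+1} \in \partial g^*(\xi^k). \tag{1-1} \]

3: If $x^{k+1} = x^k$, then STOP and return $x^k$. Otherwise, go to Step 4.

4: Set $k \leftarrow k+1$ and go to Step 2.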

The conjugate function $g^* : \mathbb{R}^n \to \mathbb{R}$ of the real-valued function $g$ is defined as $g^*(y) := \sup_{x \in \mathbb{R}^n} \{\langle x, y\rangle - g(x)\}$; see e.g. [60, p. 473]. Since $g$ is convex, $g^*$ is also convex on $\mathbb{R}^n$. Note that the next iterate $x^{k+1}$ of DCA in (1-1) is a subgradient of $g^*$ at $\xi^k$; see e.g. [68,69] and [4]. On the other hand, from the definition of the conjugate function, it is not hard to see that Algorithm 1 is equivalent to an alternative formulation of the DCA, called the simplified form of DCA, which is the following:
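As a small numerical illustration of the conjugate function defined above, the sketch below approximates $g^*(y)$ by maximizing $xy - g(x)$ over a finite grid. The function name `conjugate_on_grid`, the grid bounds, and the test function $g(x) = x^2/2$ (whose conjugate is $y^2/2$) are assumptions chosen for illustration only; they do not come from the thesis.

```python
def conjugate_on_grid(g, y, lo=-10.0, hi=10.0, n=20001):
    """Approximate g*(y) = sup_x { x*y - g(x) } for a scalar function g
    by taking the maximum over n equally spaced points in [lo, hi]."""
    step = (hi - lo) / (n - 1)
    return max((lo + i * step) * y - g(lo + i * step) for i in range(n))


def g(x):
    # Illustrative choice: g(x) = x^2 / 2, whose conjugate is g*(y) = y^2 / 2.
    return 0.5 * x * x


for y in (-2.0, 0.0, 1.0, 3.0):
    approx = conjugate_on_grid(g, y)
    exact = 0.5 * y * y
    print(f"y = {y:+.1f}  grid value = {approx:.6f}  closed form = {exact:.6f}")
```

The grid maximum is only a lower estimate of the supremum, so this is a sanity check rather than a way to compute conjugates in general.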

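Algorithm 2 [4, Section 2.3.1] DC Algorithm (Simplified form)

1: Choose an initial point $x^0 \in \operatorname{dom}(g)$. Set $k = 0$.

2: Take $\xi^k \in \partial h(x^k)$, and define the next iterate $x^{k+1}$ by

\[ x^{k+1} \in \operatorname*{argmin}_{x \in \mathbb{R}^n} \left\{ g(x) - \langle \xi^k, x - x^k \rangle \right\}. \tag{1-2} \]

3: If $x^{k+1} = x^k$, then STOP and return $x^k$. Otherwise, go to Step 4.

4: Set $k \leftarrow k+1$ and go to Step 2.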

We also remark that both formulations of DCA given above are equivalent in the following sense: given the current iterate $x^k$ of DCA, the next iterate $x^{k+1}$ satisfies (1-1) if and only if it satisfies (1-2).
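The simplified form of the DCA can be sketched on a toy one-dimensional problem. The example below is an illustration under assumptions that are not taken from the thesis: the DC decomposition $g(x) = x^4/4$, $h(x) = x^2/2$, the function names, and the replacement of the exact stopping test $x^{k+1} = x^k$ by a tolerance. For this choice the subproblem in Step 2 has the closed-form solution $x^{k+1} = \sqrt[3]{\xi^k}$ with $\xi^k = x^k$.

```python
import math


def phi(x):
    """Toy DC objective phi = g - h with g(x) = x^4/4 and h(x) = x^2/2."""
    return x**4 / 4.0 - x**2 / 2.0


def dca(x0, tol=1e-10, max_iter=200):
    """Simplified-form DCA for the toy problem above.

    Step 2: xi_k = h'(x_k) = x_k, then
            x_{k+1} = argmin_x { x^4/4 - xi_k * (x - x_k) },
    whose optimality condition x^3 = xi_k gives x_{k+1} = cbrt(xi_k).
    Returns the list of iterates.
    """
    x = x0
    history = [x]
    for _ in range(max_iter):
        xi = x                                            # subgradient of h at x_k
        x_next = math.copysign(abs(xi) ** (1.0 / 3.0), xi)  # real cube root
        history.append(x_next)
        if abs(x_next - x) <= tol:                        # tolerance instead of x_{k+1} = x_k
            break
        x = x_next
    return history


iterates = dca(0.5)
print(len(iterates) - 1, "iterations, last iterate", iterates[-1])
```

The critical points of this toy objective are $-1$, $0$ and $1$; starting from a positive point the iterates move toward $1$ and the objective values do not increase along the way.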

It is worth noting that several works on DC optimization assume that the DC components $g$ and $h$ of the DC function $\phi$ are strongly convex. This assumption is not restrictive, since we can always add the same strongly convex function $\bar{f} : \mathcal{F} \to \mathbb{R}$ to each DC component of $\phi$ to obtain $\bar{g} : \mathcal{F} \to \mathbb{R}$ and $\bar{h} : \mathcal{F} \to \mathbb{R}$ given by $\bar{g} := g + \bar{f}$ and $\bar{h} := h + \bar{f}$. In this case, both $\bar{g}$ and $\bar{h}$ are strongly convex and $\phi := g - h = \bar{g} - \bar{h}$. Therefore, without loss of generality, we can always consider the DC problem (PDC) with both DC components strongly convex. Moreover, a well-established result in the study of DCA is the following: if the functions $g$ and $h$ are strongly convex, then every cluster point $\bar{x}$ of the sequence $(x^k)_{k \in \mathbb{N}}$ generated by the DCA satisfies $\partial g(\bar{x}) \cap \partial h(\bar{x}) \neq \emptyset$, i.e., $\bar{x}$ is a critical point of the DC function $\phi := g - h$.
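As a worked illustration of this regularization on $\mathbb{R}^n$, assume (for concreteness only; this particular choice is not prescribed by the text) that the added strongly convex function is the quadratic $\bar{f} = \frac{\rho}{2}\|\cdot\|^2$ with $\rho > 0$. Applying the simplified form of the DCA to the pair $(\bar{g}, \bar{h})$, with $\xi^k \in \partial h(x^k)$, then gives a proximal-type subproblem:

```latex
\begin{align*}
\bar g(x) &= g(x) + \tfrac{\rho}{2}\|x\|^2, \qquad
\bar h(x) = h(x) + \tfrac{\rho}{2}\|x\|^2, \qquad
\bar g - \bar h = g - h, \\
\partial \bar h(x^k) &= \partial h(x^k) + \rho\, x^k, \\
x^{k+1} &\in \operatorname*{argmin}_{x \in \mathbb{R}^n}
  \Big\{ g(x) + \tfrac{\rho}{2}\|x\|^2
         - \langle \xi^k + \rho\, x^k,\; x - x^k \rangle \Big\} \\
  &= \operatorname*{argmin}_{x \in \mathbb{R}^n}
  \Big\{ g(x) - \langle \xi^k,\; x - x^k \rangle
         + \tfrac{\rho}{2}\|x - x^k\|^2 \Big\}.
\end{align*}
```

The last equality holds because $\frac{\rho}{2}\|x\|^2 - \rho\langle x^k, x - x^k\rangle = \frac{\rho}{2}\|x - x^k\|^2 + \frac{\rho}{2}\|x^k\|^2$, and the constant term does not affect the minimizer.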

In recent years, interest in DC theory has grown considerably, and a large body of work devoted to DC optimization in different contexts has been developed. This interest is due especially to the fact that DC functions have a wide range of practical applications, including, for example, computational biology [45, 46], machine learning [2,66,70], image analysis [43,44,58], cryptography [23,67], the minimum sum-of-squares clustering problem [6,30,57], the bilevel hierarchical clustering problem [55], clusterwise linear regression [10], the multicast network design problem [37], the multidimensional scaling problem [3,6], and the Fermat-Weber location problem [25,27]; see also [15]. An extensive list of works dealing with DC optimization can be found in the recent review [47], which celebrates the 30th birthday of DC programming and DCA.

Interest in optimization on Riemannian manifolds has also increased in recent years. However, for DC optimization on Riemannian manifolds specifically, only a few works, with some specific algorithms or numerical experiments, have been proposed; see [1,63]. We also remark that the Fenchel conjugate of a function in this setting has been recently established in [19].

In this sense, the first aim of this thesis is to study the DC problem (PDC) in the unconstrained case, i.e., when $\mathcal{F} = \mathbb{R}^n$. This study is divided into two chapters.

In the first one we present a scaled subgradient method for DC programming and we prove that the negative scaled generalized subgradient at the current iterate is a descent direction for the objective function from an auxiliary point. Thus, instead of applying the Armijo line search and computing the next iterate from the current iterate, both the line search and the new iterate are computed from that auxiliary point along the direction of the negative scaled generalized subgradient. Consequently, the proposed method, called BSSM, has asymptotic convergence properties and iteration-complexity bounds similar to those of the usual descent methods for minimizing differentiable convex functions with Armijo line search. The second part of our study of unconstrained DC problems extends the applicability of the BDCA, which was originally proposed in [5] for differentiable DC functions, and then in [6] for non-differentiable DC functions whose first DC component is differentiable but whose second one is non-smooth. In our approach, we develop a version of the BDCA for non-differentiable DC functions in which both DC components are non-differentiable. This version is made possible by applying a non-monotone line search instead of the usual monotone line search employed in [5,6].

Under suitable assumptions, we show that any cluster point of the sequence generated by our method, called nmBDCA, is a critical point of the problem, and then we provide some iteration-complexity bounds. Some numerical experiments show that the nmBDCA outperforms the DCA such as its monotone version.
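To make the idea of a line search started from an auxiliary point concrete, here is a minimal sketch of a monotone boosted step on a toy one-dimensional problem. Everything specific in it is an assumption for illustration: the objective $g(x) = x^4/4$, $h(x) = x^2/2$, the sufficient-decrease test $\phi(y + \lambda d) \le \phi(y) - \rho \lambda^2 d^2$ (one Armijo-type rule of the kind used for boosted DC methods in the literature), and the parameter names `lam_bar`, `rho`, `beta`. It is not the nmBDCA analyzed in this thesis, whose line search is non-monotone.

```python
import math


def phi(x):
    """Toy DC objective: g(x) = x^4/4 (differentiable), h(x) = x^2/2."""
    return x**4 / 4.0 - x**2 / 2.0


def dca_point(x):
    """Solve the DCA subproblem at x: minimize g(y) - h'(x) * y, i.e. y^3 = x."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def boosted_dca(x0, lam_bar=1.0, rho=0.1, beta=0.5, tol=1e-10, max_iter=200):
    """Return (approximate critical point, number of iterations used)."""
    x = x0
    for k in range(1, max_iter + 1):
        y = dca_point(x)          # auxiliary point (plain DCA step)
        d = y - x                 # search direction used at y
        if abs(d) <= tol:
            return y, k
        lam = lam_bar
        # Assumed Armijo-type test, evaluated from y rather than from x.
        while phi(y + lam * d) > phi(y) - rho * lam**2 * d**2:
            lam *= beta
            if lam < 1e-12:       # give up and keep the plain DCA point
                lam = 0.0
                break
        x = y + lam * d
    return x, max_iter


x_plain, n_plain = boosted_dca(0.5, lam_bar=0.0)   # lam_bar = 0 reduces to plain DCA
x_boost, n_boost = boosted_dca(0.5)
print("plain DCA  :", n_plain, "iterations, x =", x_plain)
print("boosted DCA:", n_boost, "iterations, x =", x_boost)
```

Setting `lam_bar=0.0` disables the extra step, which gives a plain DCA baseline for comparing iteration counts on the same problem.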

The second aim of this thesis is to present a method that combines the strategies of the BSSM and nmBDCA to study DC problems under linear constraints. We prove that every cluster point of the sequence generated by the proposed method is a critical point for the problem, and we establish some iteration-complexity bounds. Moreover, we show that the proposed method retrieves both BSSM and nmBDCA under some reasonable assumptions.

The third and last aim of this thesis is to study the DC problem (PDC) in the context of Riemannian manifolds. In this sense, in the last chapter we propose a primal-dual study of the DC problem (PDC) in Hadamard manifolds, and then we present an extension of the DCA [68] to solve such problems. We remark that our method can also be seen as a practical application of the Fenchel conjugate recently presented in [19]. We prove that the DCA in Hadamard manifolds is a descent method, and that every cluster point of the generated sequence is a critical point for the problem under consideration. Moreover, we also prove a primal-dual asymptotic convergence result.

This thesis is organized as follows. Chapter 2 presents some notation and basic results that will be used throughout the text. Chapter 3 presents a boosted scaled subgradient method (BSSM) to solve DC problems where the first DC component is differentiable and the second one is non-smooth. Chapter 4 presents a boosted DC algorithm with a non-monotone line search (nmBDCA) to solve DC problems where both DC components are possibly non-smooth. Chapter 5 presents a Boosted Scaled DC Algorithm with non-monotone line search (nmBSDCA) to deal with DC problems where the first DC component is a sum of a smooth convex function and a non-differentiable convex function, and the second DC component is convex and non-smooth. Chapter 6 presents the DC problem on a Hadamard manifold and its dual, and also presents an extension of the DC Algorithm given in [68] to this context, as well as its convergence analysis. It is worth noting that the content of Chapter 6 is based on our work [36], which is available online. We also remark that throughout Chapters 3 and 4, $\phi$ denotes a DC function with DC components $g$ and $h$. In Chapter 5, $\phi$ is a DC function with DC components $g := g_1 + g_2$ and $h$, where $g_1$ is convex and differentiable and $g_2$ and $h$ are convex but not necessarily differentiable functions. In Chapter 6, $g$ and $h$ denote convex functions on a Hadamard manifold $\mathcal{M}$.

CHAPTER 2

Preliminaries

In this chapter we present some notation, definitions, and results that will be used throughout this thesis.

2.1 Basic concepts and results of optimization on $\mathbb{R}^n$

In this section, we present some preliminary concepts that will be used throughout the next three chapters of this thesis.

2.1.1 Some facts about locally Lipschitz functions

This subsection presents some notation, definitions, and results about locally Lipschitz functions. First we define convex functions; see [40, Definition 1.1.1, p. 144, and Proposition 1.1.2, p. 145].

Definition 2.1 A function $\psi : \mathbb{R}^n \to \mathbb{R}$ is said to be convex if

\[ \psi(\lambda x + (1-\lambda) y) \le \lambda \psi(x) + (1-\lambda) \psi(y), \quad \forall x, y \in \mathbb{R}^n,\ \forall \lambda \in [0,1]. \]

We say that $\psi$ is strictly convex when the last inequality is strict for $x \neq y$ and $\lambda \in (0,1)$. Moreover, for a given $\sigma > 0$, the function $\psi$ is said to be strongly convex with modulus $\sigma$, or $\sigma$-strongly convex, if $\psi - (\sigma/2)\|\cdot\|^2$ is convex, or equivalently, if

\[ \psi(\lambda x + (1-\lambda) y) \le \lambda \psi(x) + (1-\lambda) \psi(y) - \frac{\sigma}{2}\lambda(1-\lambda)\|x - y\|^2, \tag{2-1} \]

for all $x, y \in \mathbb{R}^n$ and all $\lambda \in [0,1]$.

Remark 2.2 To make the text easier to read, throughout this thesis, especially in Chapter 5, we will use the following abuse of notation to refer to convex functions: we will say that the function $\psi$ is convex if, and only if, $\psi$ is $0$-strongly convex, i.e., when $\psi$ satisfies (2-1) with $\sigma = 0$. This abuse of notation also appears in Lemma 2.13.
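The characterization of strong convexity through the convexity of $\psi - (\sigma/2)\|\cdot\|^2$ can be probed numerically. The sketch below samples random points and checks the corresponding interpolation inequality, written with the factor $\frac{\sigma}{2}\lambda(1-\lambda)$, for the scalar function $\psi(x) = x^2$. The function name, sampling ranges, and tolerance are assumptions made for illustration; a random check can refute the inequality but never proves it.

```python
import random


def satisfies_strong_convexity(psi, sigma, trials=10000, seed=0, slack=1e-9):
    """Sample-based check of
    psi(l*x + (1-l)*y) <= l*psi(x) + (1-l)*psi(y) - (sigma/2)*l*(1-l)*(x-y)^2
    for scalar x, y in [-5, 5] and l in [0, 1)."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.uniform(-5, 5), rng.uniform(-5, 5)
        lam = rng.random()
        lhs = psi(lam * x + (1 - lam) * y)
        rhs = (lam * psi(x) + (1 - lam) * psi(y)
               - 0.5 * sigma * lam * (1 - lam) * (x - y) ** 2)
        if lhs > rhs + slack:
            return False  # found a violating sample
    return True


def psi(x):
    # psi(x) = x^2: psi - (sigma/2) x^2 is convex exactly when sigma <= 2.
    return x * x


print(satisfies_strong_convexity(psi, 2.0))
print(satisfies_strong_convexity(psi, 2.5))
```

With $\sigma = 0$ the same routine checks plain convexity, matching the convention of Remark 2.2.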


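The next two definitions can be found in [26].

Definition 2.3 We say that $\psi : \mathbb{R}^n \to \mathbb{R}$ is locally Lipschitz if, for all $x \in \mathbb{R}^n$, there exist a constant $K_x > 0$ and a neighborhood $U_x$ of $x$ such that $|\psi(x) - \psi(y)| \le K_x \|x - y\|$, for all $y \in U_x$.

If $\psi : \mathbb{R}^n \to \mathbb{R}$ is convex, then $\psi$ is locally Lipschitz; see [26, p. 34].

Definition 2.4 Let $\psi : \mathbb{R}^n \to \mathbb{R}$ be a locally Lipschitz function. The Clarke subdifferential of $\psi$ at $x \in \mathbb{R}^n$ is given by $\partial_c \psi(x) = \{v \in \mathbb{R}^n : \psi^\circ(x;d) \ge \langle v, d\rangle,\ \forall d \in \mathbb{R}^n\}$, where $\psi^\circ(x;d)$ is the generalized directional derivative of $\psi$ at $x$ in the direction $d$, given by

\[ \psi^\circ(x;d) = \limsup_{u \to x,\ t \downarrow 0} \frac{\psi(u + td) - \psi(u)}{t}. \]

If $\psi$ is convex, then $\partial_c \psi(x)$ coincides with the subdifferential $\partial \psi(x)$ in the sense of convex analysis, and $\psi^\circ(x;d)$ coincides with the usual directional derivative $\psi'(x;d)$; see [26, p. 36]. We recall that if $\psi : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable, then $\partial_c \psi(x) = \{\nabla \psi(x)\}$ for any $x \in \mathbb{R}^n$.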

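Theorem 2.5 Let $\psi : \mathbb{R}^n \to \mathbb{R}$ be a locally Lipschitz function. Then, for all $x \in \mathbb{R}^n$, the following hold:

(i) $\partial_c \psi(x)$ is a non-empty, convex, compact subset of $\mathbb{R}^n$ and $\|v\| \le K_x$, for all $v \in \partial_c \psi(x)$, where $K_x > 0$ is the Lipschitz constant of $\psi$ around $x$;

(ii) $\psi^\circ(x;d) = \max\{\langle v, d\rangle : v \in \partial_c \psi(x)\}$.

Proof. See [26, Proposition 2.1.2, p. 27].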
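Item (ii) can be illustrated on the absolute value function at the origin, where the Clarke subdifferential is the interval $[-1, 1]$ and the support function in direction $d$ equals $|d|$. The sketch below estimates the generalized directional derivative by maximizing difference quotients over sample points near $x$. The function name, the sampling radius, and the step `t` are assumptions for illustration; the estimate is crude and only meant as a sanity check.

```python
def clarke_directional_derivative(psi, x, d, radius=1e-3, t=1e-6, samples=201):
    """Crude estimate of the generalized directional derivative of psi at x
    in direction d: the largest difference quotient (psi(u + t d) - psi(u)) / t
    over sample points u in [x - radius, x + radius]."""
    best = float("-inf")
    for i in range(samples):
        u = x - radius + 2.0 * radius * i / (samples - 1)
        best = max(best, (psi(u + t * d) - psi(u)) / t)
    return best


for d in (-2.0, 0.5, 1.0):
    estimate = clarke_directional_derivative(abs, 0.0, d)
    # Support function of [-1, 1]: the maximum is attained at an endpoint.
    support = max(v * d for v in (-1.0, 1.0))
    print(d, estimate, support)
```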

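Theorem 2.6 Let $\psi_1, \psi_2 : \mathbb{R}^n \to \mathbb{R}$ be convex functions. Then, for every $x, d \in \mathbb{R}^n$, the following assertions hold:

(i) If $\psi_1$ is differentiable, then $(\psi_1 - \psi_2)'(x;d) = \langle \nabla \psi_1(x), d\rangle - \psi_2'(x;d)$;

(ii) $\partial_c(\psi_1 \pm \psi_2)(x) \subseteq \partial \psi_1(x) \pm \partial \psi_2(x)$, and equality holds if either $\psi_1$ or $\psi_2$ is differentiable.

Proof. See [26, Proposition 2.3.1, p. 38, and Corollary 1, p. 39].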

Proposition 2.7 Let $\psi : \mathbb{R}^n \to \mathbb{R}$ be convex and let $(u^k)_{k \in \mathbb{N}}$ be such that $\lim_{k \to \infty} u^k = u$. If $(v^k)_{k \in \mathbb{N}}$ is a sequence such that $v^k \in \partial \psi(u^k)$ for every $k \in \mathbb{N}$, then $(v^k)_{k \in \mathbb{N}}$ is bounded and its cluster points belong to $\partial \psi(u)$.

Proof. See [40, Propositions 6.2.1 and 6.2.2, p. 282].

Lemma 2.8 Let $\psi : \mathbb{R}^n \to \mathbb{R}$ be a strongly convex function with modulus $\sigma > 0$, and let $\bar{\psi} : \mathbb{R}^n \to \mathbb{R}$ be convex. Then $\psi + \bar{\psi}$ is a strongly convex function with modulus $\sigma$.

Proof. See [14, Lemma 5.20, p. 119].

Theorem 2.9 Let $\psi : \mathbb{R}^n \to \mathbb{R}$ be a strongly convex function and $C \subseteq \mathbb{R}^n$ a closed and convex set. Then $\psi$ has a unique minimizer $x^* \in C$. Moreover, there exists $v \in \partial \psi(x^*)$ such that $\langle v, x - x^*\rangle \ge 0$, for all $x \in C$.

Proof. See [14, Theorem 5.25, p. 122 and Corollary 3.68, p. 76].

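Theorem 2.10 The following statements are equivalent:

(i) $\psi : \mathbb{R}^n \to \mathbb{R}$ is a strongly convex function with modulus $\sigma > 0$.

(ii) $\psi(y) \ge \psi(x) + \langle v, y - x\rangle + (\sigma/2)\|y - x\|^2$, for all $x, y \in \mathbb{R}^n$ and all $v \in \partial \psi(x)$.

(iii) $\langle w - v, x - y\rangle \ge \sigma \|y - x\|^2$, for all $x, y \in \mathbb{R}^n$, all $w \in \partial \psi(x)$ and all $v \in \partial \psi(y)$.

Proof. See [14, Theorem 5.24, p. 119].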

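The following definition appears in [14, p. 107].

Definition 2.11 A differentiable function $\psi : \mathbb{R}^n \to \mathbb{R}$ has Lipschitz continuous gradient with constant $L > 0$ whenever $\|\nabla \psi(x) - \nabla \psi(y)\| \le L \|x - y\|$, for all $x, y \in \mathbb{R}^n$.

Lemma 2.12 (Descent lemma) Assume that $\psi$ satisfies Definition 2.11. Then, for all $x, d \in \mathbb{R}^n$ and all $\lambda \in \mathbb{R}$, there holds

\[ \psi(x + \lambda d) \le \psi(x) + \lambda \langle \nabla \psi(x), d\rangle + \frac{L \lambda^2}{2}\|d\|^2. \]

Proof. See [14, Lemma 5.7, p. 109].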

The following lemma extends the descent lemma above.

Lemma 2.13 Let $\psi : \mathbb{R}^n \to \mathbb{R}$ be a function given by $\psi := \psi_1 - \psi_2$. Assume that $\psi_1$ satisfies Definition 2.11 and $\psi_2$ is strongly convex with modulus $\sigma \ge 0$. Then, for all $x, d \in \mathbb{R}^n$ and all $\lambda \in \mathbb{R}$, there holds

\[ \psi(x + \lambda d) \le \psi(x) + \lambda \langle \nabla \psi_1(x) - w, d\rangle + \frac{(L - \sigma)}{2}\lambda^2 \|d\|^2, \quad \forall w \in \partial \psi_2(x). \]

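Proof. Let $x\in\mathbb{R}^n$ and take an arbitrary $w\in\partial\psi_2(x)$. Define the function $p:\mathbb{R}^n\to\mathbb{R}$ by $p(z)=\psi_1(z)-\langle w,z\rangle$. Thus we have $\nabla p(z)=\nabla\psi_1(z)-w$ and, since $\nabla\psi_1$ is Lipschitz continuous with constant $L$, we obtain that $\nabla p$ is also Lipschitz continuous with constant $L$. Given $d\in\mathbb{R}^n$ and $\lambda\in\mathbb{R}$, applying Lemma 2.12 to $p$ we obtain

$$p(x+\lambda d)\le p(x)+\lambda\langle\nabla\psi_1(x)-w,d\rangle+\frac{L\lambda^2}{2}\|d\|^2.$$

Since $p(z)=\psi_1(z)-\langle w,z\rangle$, the last inequality is equivalent to

$$\psi_1(x+\lambda d)\le\psi_1(x)+\lambda\langle w,d\rangle+\lambda\langle\nabla\psi_1(x)-w,d\rangle+\frac{L\lambda^2}{2}\|d\|^2. \tag{2-2}$$

Since $\psi_2$ is strongly convex with modulus $\sigma\ge0$ and $w\in\partial\psi_2(x)$, it follows from item (ii) of Theorem 2.10 that

$$\lambda\langle w,d\rangle\le\psi_2(x+\lambda d)-\psi_2(x)-\frac{\sigma\lambda^2}{2}\|d\|^2.$$

Hence, the last inequality together with (2-2) yields

$$\psi_1(x+\lambda d)-\psi_2(x+\lambda d)\le\psi_1(x)-\psi_2(x)+\lambda\langle\nabla\psi_1(x)-w,d\rangle+\frac{(L-\sigma)\lambda^2}{2}\|d\|^2,$$

which, taking into account that $\psi=\psi_1-\psi_2$, concludes the proof.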
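The inequality of Lemma 2.13 can be checked numerically on a simple instance. The sketch below is only an illustration under assumptions that are not part of the thesis: it takes $\psi_1(x)=\tfrac12\sum_i a_ix_i^2$ with $a_i>0$ (so that $L=\max_i a_i$) and $\psi_2(x)=\tfrac{\sigma}{2}\|x\|^2$ (so that $\partial\psi_2(x)=\{\sigma x\}$). The names `psi1`, `psi2` and `descent_gap` are hypothetical.

```python
import random

def psi1(x, a):
    """Assumed smooth part: 0.5 * sum(a_i * x_i^2); its gradient is Lipschitz with L = max(a)."""
    return 0.5 * sum(ai * xi * xi for ai, xi in zip(a, x))

def grad_psi1(x, a):
    """Gradient of psi1, computed componentwise."""
    return [ai * xi for ai, xi in zip(a, x)]

def psi2(x, sigma):
    """Assumed strongly convex part: (sigma/2) * ||x||^2, with modulus sigma."""
    return 0.5 * sigma * sum(xi * xi for xi in x)

def descent_gap(x, d, lam, a, sigma):
    """Right-hand side minus left-hand side of the inequality in Lemma 2.13."""
    L = max(a)
    w = [sigma * xi for xi in x]  # the only subgradient of psi2 at x
    g = [gi - wi for gi, wi in zip(grad_psi1(x, a), w)]
    x_new = [xi + lam * di for xi, di in zip(x, d)]
    inner = sum(gi * di for gi, di in zip(g, d))
    norm_d_sq = sum(di * di for di in d)
    psi_x = psi1(x, a) - psi2(x, sigma)
    psi_new = psi1(x_new, a) - psi2(x_new, sigma)
    rhs = psi_x + lam * inner + 0.5 * (L - sigma) * lam ** 2 * norm_d_sq
    return rhs - psi_new

if __name__ == "__main__":
    random.seed(0)
    a = [1.0, 3.0, 0.5]
    x = [random.uniform(-2, 2) for _ in a]
    d = [random.uniform(-2, 2) for _ in a]
    print(descent_gap(x, d, 1.5, a, 0.7))  # nonnegative up to rounding
```

For this particular choice the gap equals $\tfrac{\lambda^2}{2}\sum_i(L-a_i)d_i^2$, so it is nonnegative for every real $\lambda$, including negative step sizes.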

Remark 2.14 It is worth noting that in Lemma 2.13 it is sufficient to assume that $\nabla\psi_1$ is Lipschitz continuous with constant $L>0$. In this case, [26, Corollary of Proposition 2.2.1, p. 32] ensures that if $\psi_1$ is continuously differentiable then $\psi_1$ is locally Lipschitz. Hence, by item (ii) of Theorem 2.6 we have $\partial_c\psi(x)=\{\nabla\psi_1(x)\}-\partial\psi_2(x)$ and the result follows. We also note that Lemma 2.13 generalizes Lemma 2.12. Indeed, taking $\psi_2\equiv0$, Lemma 2.13 becomes exactly Lemma 2.12.

2.1.2 On closed convex sets with linear constraints

In this subsection we present some preliminary results about convex sets with linear constraints. The definition below can be found, for example, in [7, Section 2].

Definition 2.15 Let

$$C:=\{x\in\mathbb{R}^n : \langle a_i,x\rangle\le b_i,\ i=1,\dots,p\}, \tag{2-3}$$

where $a_i\in\mathbb{R}^n$, $b_i\in\mathbb{R}$ for all $i\in\mathcal{I}:=\{1,2,\dots,p\}$. The set of active constraints at the point $x\in C$ is given by

$$\mathcal{I}(x):=\{i\in\mathcal{I} : \langle a_i,x\rangle=b_i\}.$$
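As an illustration of the active set of Definition 2.15, the following minimal sketch computes the active indices for a polyhedron given by rows $a_i$ and right-hand sides $b_i$. The function name `active_set`, the 0-based indexing and the tolerance `tol` are assumptions made for the example only.

```python
def active_set(A, b, x, tol=1e-12):
    """Return the 0-based indices i with <a_i, x> = b_i (up to tol).

    A is a list of rows a_i, b the list of right-hand sides, and x a point
    assumed to satisfy <a_i, x> <= b_i for every i.
    """
    active = []
    for i, (a_i, b_i) in enumerate(zip(A, b)):
        value = sum(aij * xj for aij, xj in zip(a_i, x))
        if abs(value - b_i) <= tol:
            active.append(i)
    return active

if __name__ == "__main__":
    # Unit box [0, 1]^2 written as four linear inequalities.
    A = [[1, 0], [-1, 0], [0, 1], [0, -1]]
    b = [1, 0, 1, 0]
    print(active_set(A, b, [1, 0.5]))    # only x_1 <= 1 is active
    print(active_set(A, b, [0.5, 0.5]))  # interior point: no active constraint
```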


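The next definition can be found, for example, in [14, p. 36].

Definition 2.16 Let $\mathcal{F}\subseteq\mathbb{R}^n$ be a set and $x\in\mathcal{F}$. The normal cone of $\mathcal{F}$ at $x$ is defined as

$$N_{\mathcal{F}}(x)=\{v\in\mathbb{R}^n : \langle v,z-x\rangle\le0,\ \forall z\in\mathcal{F}\}.$$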

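Example 2.17 If $C$ is given by (2-3), then for every $x\in C$ we have

$$N_C(x)=\Big\{\sum_{i=1}^{p}\mu_ia_i : \mu_i\ge0,\ \mu_i(\langle a_i,x\rangle-b_i)=0,\ i=1,\dots,p\Big\}. \tag{2-4}$$

Indeed, denote by $N$ the set on the right-hand side of (2-4) and take $v\in N$. Then there exist $\mu_i\ge0$, $i=1,\dots,p$, satisfying the complementarity condition in (2-4), such that $v=\sum_{i=1}^{p}\mu_ia_i$. Since $x\in C$, we have $\langle a_i,x\rangle\le b_i$ for all $i=1,\dots,p$. Thus, for all $z\in C$,

$$\langle v,z-x\rangle=\sum_{i=1}^{p}\mu_i\langle a_i,z\rangle-\sum_{i=1}^{p}\mu_i\langle a_i,x\rangle\le\sum_{i=1}^{p}\mu_i(b_i-\langle a_i,x\rangle)=0,$$

which means that $N\subseteq N_C(x)$. Assume by contradiction that there exists $v\in N_C(x)\setminus N$. Since $N$ is a closed convex set (it is a finitely generated cone), by the strict separation theorem [14, Theorem 2.33, p. 31] there exist $u\in\mathbb{R}^n\setminus\{0\}$ and $z\in\mathbb{R}$ such that $\langle u,w\rangle<z<\langle u,v\rangle$ for all $w\in N$. Since $0\in N$, we have $z>0$. Note that for any $i\in\mathcal{I}(x)$ and all $\mu\ge0$ we have $\mu\langle u,a_i\rangle<z$. Then, for $\mu>0$ we obtain $\langle u,a_i\rangle<(1/\mu)z$, and letting $\mu\to\infty$ we conclude that $\langle u,a_i\rangle\le0$ for all $i\in\mathcal{I}(x)$. Then, for $i\in\mathcal{I}(x)$ and all $\bar{t}>0$, we have $\langle a_i,x+\bar{t}u\rangle\le b_i$. If $\langle a_i,x\rangle<b_i$ and $\langle u,a_i\rangle\le0$, then a similar argument shows that $\langle a_i,x+\bar{t}u\rangle\le b_i$ for $\bar{t}>0$. If $\langle a_i,x\rangle<b_i$ and $\langle u,a_i\rangle>0$, then for all $0<\bar{t}<(b_i-\langle a_i,x\rangle)/\langle u,a_i\rangle$ we have $\langle a_i,x+\bar{t}u\rangle\le b_i$. Thus, setting

$$\bar{t}:=\min\Big\{\frac{b_i-\langle a_i,x\rangle}{\langle a_i,u\rangle} : \langle a_i,u\rangle>0\Big\}>0,$$

we ensure that $x+\bar{t}u\in C$. Finally, $\langle v,u\rangle>z>0$ yields $\langle v,(x+\bar{t}u)-x\rangle=\langle v,\bar{t}u\rangle>\bar{t}z>0$, which is a contradiction because $v\in N_C(x)$. Therefore, $N_C(x)=N$. In particular, if $a_i=0\in\mathbb{R}^n$ and $b_i=0\in\mathbb{R}$ for all $i=1,\dots,p$, then $C=\mathbb{R}^n$ and $N_C(x)=\{0\}$ for every $x\in\mathbb{R}^n$.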
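The description (2-4) can be illustrated numerically. The sketch below assembles a vector $v=\sum_i\mu_ia_i$ from given multipliers, after checking nonnegativity and complementarity, so that the inequality $\langle v,z-x\rangle\le0$ of Definition 2.16 can be tested on sample points $z\in C$. The helper names `dot` and `normal_cone_element`, as well as the tolerance, are hypothetical and not taken from the thesis.

```python
import random

def dot(u, v):
    """Euclidean inner product of two lists of equal length."""
    return sum(ui * vi for ui, vi in zip(u, v))

def normal_cone_element(A, b, x, mu, tol=1e-12):
    """Return v = sum_i mu_i * a_i after checking the conditions in (2-4).

    Raises ValueError if some mu_i is negative or if the complementarity
    condition mu_i * (<a_i, x> - b_i) = 0 fails.
    """
    for a_i, b_i, m in zip(A, b, mu):
        if m < 0 or abs(m * (dot(a_i, x) - b_i)) > tol:
            raise ValueError("multipliers violate the conditions in (2-4)")
    return [sum(m * a_i[j] for a_i, m in zip(A, mu)) for j in range(len(x))]

if __name__ == "__main__":
    # Unit box [0, 1]^2; at x = (1, 0) the constraints x_1 <= 1 and -x_2 <= 0 are active.
    A = [[1, 0], [-1, 0], [0, 1], [0, -1]]
    b = [1, 0, 1, 0]
    x = [1, 0]
    v = normal_cone_element(A, b, x, [2, 0, 0, 3])
    random.seed(0)
    z = [random.random(), random.random()]  # a sample point of the box
    print(v, dot(v, [zi - xi for zi, xi in zip(z, x)]))  # second value is <= 0
```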

The following proposition will be useful later in Chapter 5.

Proposition 2.18 If $\lim_{n\to+\infty}x^n=\bar{x}$, $v^n\in N_C(x^n)$ and $\lim_{n\to+\infty}v^n=\bar{v}$, then $\bar{v}\in N_C(\bar{x})$.

Proof. See [61, Proposition 6.6, p. 202].

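The next lemma will be used in Chapters 3 and 5 to guarantee the feasibility of an auxiliary point generated by some of our methods.

Lemma 2.19 Let $C$ be given by (2-3) and take $x,z\in C$. If $d=z-x\neq0$ and $\mathcal{I}(z)\subseteq\mathcal{I}(x)$, then $z+td\in C$ for all $t\in[0,\varepsilon]$, where

$$0<\varepsilon:=\begin{cases}\min\Big\{\dfrac{b_i-\langle a_i,z\rangle}{|\langle a_i,d\rangle|} : i\in\mathcal{I}\setminus\mathcal{I}(x),\ \langle a_i,d\rangle\neq0\Big\}, & \text{if } \mathcal{I}\setminus\mathcal{I}(x)\neq\emptyset;\\[2mm] +\infty, & \text{else.}\end{cases} \tag{2-5}$$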


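Proof. Take $i\in\mathcal{I}(x)$. If $i\in\mathcal{I}(z)\subseteq\mathcal{I}(x)$, it holds that

$$\langle a_i,d\rangle=\langle a_i,z\rangle-\langle a_i,x\rangle=b_i-b_i=0.$$

Otherwise, if $i\in\mathcal{I}(x)\setminus\mathcal{I}(z)$, then we conclude that

$$\langle a_i,d\rangle=\langle a_i,z\rangle-\langle a_i,x\rangle=\langle a_i,z\rangle-b_i<0.$$

Thus, $\langle a_i,d\rangle\le0$ for all $i\in\mathcal{I}(x)$. Since $z\in C$, it holds that $\langle a_i,z\rangle-b_i\le0$ for all $i\in\mathcal{I}$. Hence, combining the last two inequalities we obtain

$$\langle a_i,z+td\rangle-b_i\le0,\qquad\forall t\ge0,\ \forall i\in\mathcal{I}(x). \tag{2-6}$$

In the following, consider $i\in\mathcal{I}\setminus\mathcal{I}(x)$. Since $z\in C$ and $\mathcal{I}(z)\subseteq\mathcal{I}(x)$, for each $i\in\mathcal{I}\setminus\mathcal{I}(x)\subseteq\mathcal{I}\setminus\mathcal{I}(z)$ we know that $\langle a_i,z\rangle<b_i$. Note that, if $\langle a_i,d\rangle=0$, then, taking into account that $z\in C$, we have $\langle a_i,z+td\rangle-b_i=\langle a_i,z\rangle-b_i\le0$ for all $t\ge0$. On the other hand, if $\langle a_i,d\rangle\neq0$ then, letting

$$\varepsilon_i:=\frac{b_i-\langle a_i,z\rangle}{|\langle a_i,d\rangle|}>0,$$

we have

$$\langle a_i,z+td\rangle-b_i=\langle a_i,z\rangle-b_i+t\langle a_i,d\rangle\le0,\qquad\forall t\in[0,\varepsilon_i],\ \forall i\in\mathcal{I}\setminus\mathcal{I}(x). \tag{2-7}$$

Therefore, the definition of $\varepsilon$ in (2-5) together with (2-6) and (2-7) imply the desired result.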
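The bound $\varepsilon$ of (2-5) is straightforward to compute. The following sketch is a minimal illustration, assuming that the caller supplies points $x,z\in C$ with $z\neq x$ and $\mathcal{I}(z)\subseteq\mathcal{I}(x)$ (these hypotheses are not checked). The function name `max_extrapolation_step`, the 0-based indices and the tolerance are assumptions; a minimum over an empty index set is taken to be $+\infty$.

```python
import math

def dot(u, v):
    """Euclidean inner product of two lists of equal length."""
    return sum(ui * vi for ui, vi in zip(u, v))

def max_extrapolation_step(A, b, x, z, tol=1e-12):
    """Compute epsilon from (2-5) for C = {y : <a_i, y> <= b_i} and d = z - x.

    Only constraints that are inactive at x and satisfy <a_i, d> != 0
    contribute; if there is none, the result is +infinity.
    """
    d = [zi - xi for zi, xi in zip(z, x)]
    ratios = []
    for a_i, b_i in zip(A, b):
        is_active_at_x = abs(dot(a_i, x) - b_i) <= tol
        slope = dot(a_i, d)
        if not is_active_at_x and abs(slope) > tol:
            ratios.append((b_i - dot(a_i, z)) / abs(slope))
    return min(ratios) if ratios else math.inf

if __name__ == "__main__":
    # Unit box [0, 1]^2, x on the face x_1 = 1 and z in the interior.
    A = [[1, 0], [-1, 0], [0, 1], [0, -1]]
    b = [1, 0, 1, 0]
    x, z = [1, 0.25], [0.5, 0.5]
    eps = max_extrapolation_step(A, b, x, z)
    d = [zi - xi for zi, xi in zip(z, x)]
    print(eps, [zi + eps * di for zi, di in zip(z, d)])
```

Because (2-5) divides by $|\langle a_i,d\rangle|$ regardless of sign, the bound is conservative for constraints along which $d$ is a direction of decrease.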

2.1.3 On the Kurdyka-Łojasiewicz property

Next we present the definition of the Kurdyka-Łojasiewicz property.

Definition 2.20 Let $C^1[(0,+\infty)]$ be the set of all continuously differentiable functions defined on $(0,+\infty)$, $F:\mathbb{R}^n\to\mathbb{R}$ be a locally Lipschitz function and $\partial_cF(\cdot)$ be the Clarke subdifferential of $F$. The function $F$ is said to have the Kurdyka-Łojasiewicz property at $\bar{x}$ if there exist $\eta\in(0,+\infty]$, a neighborhood $U$ of $\bar{x}$ and a continuous concave function $\gamma:[0,\eta)\to\mathbb{R}_+$ (called desingularizing function) such that $\gamma(0)=0$, $\gamma\in C^1[(0,+\infty)]$ and $\gamma'(t)>0$ for all $t\in(0,\eta)$. In addition, $\gamma$ satisfies

$$\gamma'(F(x)-F(\bar{x}))\,\mathrm{dist}(0,\partial_cF(x))\ge1,\qquad\forall x\in U\cap\{x\in\mathbb{R}^n \mid F(\bar{x})<F(x)<F(\bar{x})+\eta\}.$$


To simplify the text, from now on in this thesis we will write “$f$ is KŁ at $x$” instead of “the function $f$ has the Kurdyka-Łojasiewicz property at $x$”. The next remarks show that a large class of functions satisfies the KŁ property.

Remark 2.21 S. Łojasiewicz proved in 1963 that real-analytic functions satisfy an inequality of the above type with $\gamma(t)=t^{1-\theta}$, where $\theta\in[1/2,1)$; see [52].
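As a small numerical illustration of the inequality in Definition 2.20 with the desingularizing function $\gamma(t)=t^{1-\theta}$, consider the real-analytic function $F(x)=x^2$ on $\mathbb{R}$ at the point $0$. This example and the name `kl_product` are assumptions chosen for the sketch; since $F$ is smooth, its Clarke subdifferential reduces to $\{F'(x)\}$.

```python
def kl_product(x, theta=0.5):
    """Evaluate gamma'(F(x) - F(0)) * dist(0, dF(x)) for F(x) = x^2.

    Here gamma(t) = t**(1 - theta), so gamma'(t) = (1 - theta) * t**(-theta),
    and dist(0, dF(x)) = |2x|. The point x must be nonzero, so F(x) > F(0).
    """
    gap = x * x  # F(x) - F(0)
    gamma_prime = (1.0 - theta) * gap ** (-theta)
    return gamma_prime * abs(2.0 * x)

if __name__ == "__main__":
    for x in (0.5, -0.1, 0.01):
        # With theta = 1/2 the product equals 1 up to rounding.
        print(x, kl_product(x), kl_product(x, theta=0.75))
```

With $\theta=1/2$ the product is identically $1$, so the inequality holds with equality; for $\theta\in(1/2,1)$ it equals $2(1-\theta)|x|^{1-2\theta}$, which is at least $1$ for $x$ close enough to $0$.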

Remark 2.22 Let $A\subset\mathbb{R}^n$ and $B\subset\mathbb{R}^n\times\mathbb{R}$. The set $B$ is called semianalytic if each point of $\mathbb{R}^n\times\mathbb{R}$ admits a neighborhood $V\subset\mathbb{R}^n\times\mathbb{R}$ for which $B\cap V$ assumes the form

$$\bigcup_{i=1}^{p}\bigcap_{j=1}^{q}\{(x,y)\in V : f_{ij}(x,y)=0,\ g_{ij}(x,y)>0\},$$

where the functions $f_{ij},g_{ij}:V\to\mathbb{R}$ are real-analytic for all $i=1,\dots,p$ and $j=1,\dots,q$. Then, the set $A$ is called subanalytic if each point of $\mathbb{R}^n$ admits a neighborhood $V\subset\mathbb{R}^n$ and a bounded semianalytic subset $B\subset\mathbb{R}^n\times\mathbb{R}$ such that $A\cap V=\{x\in\mathbb{R}^n : (x,y)\in B\}$. Finally, a function $f:\mathbb{R}^n\to\mathbb{R}$ is called subanalytic if its graph is a subanalytic subset of $\mathbb{R}^n\times\mathbb{R}$. It is worth pointing out that a subanalytic function that is continuous when restricted to its closed domain satisfies the KŁ property with desingularizing function $\gamma(t)=Dt^{\theta}/\theta$ with $D>0$ and $\theta\in(0,1]$; for more details see [20, Theorem 3.1]. For examples of subanalytic functions see, e.g., [8,20,21].

2.2 Basic concepts and results of optimization on Hadamard manifolds

In this section, we recall some concepts, notations, and basic results about Riemannian manifolds and optimization. For more details see, for example, [32,59,62,71]. Let us begin with concepts about Riemannian manifolds.

2.2.1 Basic concepts and results of Riemannian geometry

In this subsection we present some basic results about Riemannian geometry that will be used throughout Chapter 6. For the next definitions, please see [32].

Definition 2.23 A differentiable manifold of dimension $n$ is a set $M^n$ and a family of injective mappings $\mathbf{x}_\alpha:U_\alpha\subset\mathbb{R}^n\to M$ of open sets $U_\alpha$ of $\mathbb{R}^n$ into $M$ such that:

(i) $\bigcup_\alpha\mathbf{x}_\alpha(U_\alpha)=M$;

(ii) for any pair $\alpha,\beta$ with $\mathbf{x}_\alpha(U_\alpha)\cap\mathbf{x}_\beta(U_\beta)=W\neq\emptyset$, the sets $\mathbf{x}_\alpha^{-1}(W)$ and $\mathbf{x}_\beta^{-1}(W)$ are open sets in $\mathbb{R}^n$ and the mappings $\mathbf{x}_\beta^{-1}\circ\mathbf{x}_\alpha$ are differentiable;

(iii) the family $\{(U_\alpha,\mathbf{x}_\alpha)\}$ is maximal relative to the conditions (i) and (ii) above.

The index $n$ in the notation $M^n$ indicates the dimension of $M$. When there is no confusion, this index can be omitted. Unless explicitly stated otherwise, throughout this section $M$ denotes a differentiable (possibly Riemannian) manifold of dimension $n$.

Definition 2.24 Let $M$ be a differentiable manifold. A differentiable function $\alpha:(-\varepsilon,\varepsilon)\to M$ is called a (differentiable) curve in $M$. Suppose that $\alpha(0)=p\in M$, and let $\mathcal{D}$ be the set of functions on $M$ that are differentiable at $p$. The tangent vector to the curve $\alpha$ at $t=0$ is a function $\alpha'(0):\mathcal{D}\to\mathbb{R}$ given by

$$\alpha'(0)f=\frac{d(f\circ\alpha)}{dt}\Big|_{t=0},\qquad f\in\mathcal{D}.$$

A tangent vector at $p$ is the tangent vector at $t=0$ of some curve $\alpha:(-\varepsilon,\varepsilon)\to M$ with $\alpha(0)=p$. The set of all tangent vectors to $M$ at $p$ is denoted by $T_pM$. The set $TM=\{(p,v) : p\in M,\ v\in T_pM\}$ is called the tangent bundle of $M$.

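Definition 2.25 Let $M_1^n$ and $M_2^m$ be differentiable manifolds. A mapping $F:M_1\to M_2$ is differentiable at $p\in M_1$ if, given a parametrization $\mathbf{y}:V\subset\mathbb{R}^m\to M_2$ at $F(p)$, there exists a parametrization $\mathbf{x}:U\subset\mathbb{R}^n\to M_1$ at $p$ such that $F(\mathbf{x}(U))\subset\mathbf{y}(V)$ and the mapping

$$\mathbf{y}^{-1}\circ F\circ\mathbf{x}:U\subset\mathbb{R}^n\to\mathbb{R}^m$$

is differentiable at $\mathbf{x}^{-1}(p)$. $F$ is differentiable on an open set of $M_1$ if $F$ is differentiable at all points of this open set.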

Definition 2.26 A vector field $X$ on a differentiable manifold $M$ is a correspondence that associates to each point $p\in M$ a vector $X(p)\in T_pM$. The field is differentiable if the mapping $X:M\to TM$ is differentiable. The set of all vector fields on $M$ of class $C^\infty$ is denoted by $\mathcal{X}(M)$.

Definition 2.27 A Riemannian metric on a differentiable manifold $M$ is a correspondence which associates to each point $p$ of $M$ an inner product (that is, a symmetric, bilinear, positive-definite form) $\langle\!\langle\cdot,\cdot\rangle\!\rangle_p$ on the tangent space $T_pM$. A differentiable manifold $M$ with a given Riemannian metric is called a Riemannian manifold.

Definition 2.28 A differentiable mapping $c:I\to M$ of an open interval $I\subset\mathbb{R}$ into a differentiable manifold $M$ is called a parametrized curve. A vector field along a curve $c:I\to M$ is a differentiable mapping that associates to every $t\in I$ a tangent vector $V(t)\in T_{c(t)}M$. The vector field $\frac{dc}{dt}$ is called the tangent vector field of $c$. The restriction of a curve $c$ to a closed interval $[a,b]\subset I$ is called a segment. If $M$ is a Riemannian manifold, we define the length of a segment by

$$\ell_a^b(c)=\int_a^b\langle\!\langle c'(t),c'(t)\rangle\!\rangle^{1/2}\,dt.$$

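Definition 2.29 An affine connection $\nabla$ on a differentiable manifold $M$ is a mapping $\nabla:\mathcal{X}(M)\times\mathcal{X}(M)\to\mathcal{X}(M)$, which is denoted by $(X,Y)\mapsto\nabla_XY$ and which satisfies:

(i) $\nabla_{fX+gY}Z=f\nabla_XZ+g\nabla_YZ$;

(ii) $\nabla_X(Y+Z)=\nabla_XY+\nabla_XZ$;

(iii) $\nabla_X(fY)=f\nabla_XY+X(f)Y$,

for all $X,Y,Z\in\mathcal{X}(M)$ and $f,g\in\mathcal{D}(M)$.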

Proposition 2.30 Let $M$ be a differentiable manifold and let $\nabla$ be an affine connection on $M$. There exists a unique correspondence which associates to a vector field $V$ along the differentiable curve $c:I\to M$ another vector field $\frac{DV}{dt}$ along $c$, called the covariant derivative of $V$ along $c$, such that:

(i) $\frac{D}{dt}(V+W)=\frac{DV}{dt}+\frac{DW}{dt}$.

(ii) $\frac{D}{dt}(fV)=\frac{df}{dt}V+f\frac{DV}{dt}$, where $V$ is a vector field along $c$ and $f$ is a differentiable function on $I$.

(iii) If $V(t)=Y(c(t))$, then $\frac{DV}{dt}=\nabla_{dc/dt}Y$.

Proof. See [32, p. 50].

Definition 2.31 Let $M$ be a differentiable manifold and let $\nabla$ be an affine connection on $M$. A vector field $V$ along a curve $c:I\to M$ is called parallel if $\frac{DV}{dt}=0$ for all $t\in I$.

Proposition 2.32 Let $M$ be a differentiable manifold with an affine connection $\nabla$. Let $c:I\to M$ be a differentiable curve in $M$ and let $V_0$ be a vector tangent to $M$ at $c(t_0)$, $t_0\in I$. Then there exists a unique parallel vector field $V$ along $c$ such that $V(t_0)=V_0$ (the field $V(t)$ is called the parallel transport of $V(t_0)$ along $c$).

Proof. See [32, p. 52].

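Definition 2.33 A parametrized curve $\gamma:I\to M$ is a geodesic at $t_0\in I$ if $\frac{D}{dt}\big(\frac{d\gamma}{dt}\big)=0$ at $t=t_0$; if $\gamma$ is a geodesic at $t$ for all $t\in I$, we say that $\gamma$ is a geodesic. If $[a,b]\subset I$ and $\gamma:I\to M$ is a geodesic, the restriction of $\gamma$ to $[a,b]$ is called a geodesic segment joining $\gamma(a)$ to $\gamma(b)$.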


When there is no confusion, we will use the notation $P_{q\leftarrow p}$ for the parallel transport along the geodesic segment $\gamma$ joining $p$ to $q$. Following [32, Chapter 3], let $M$ be a Riemannian manifold. The exponential map $\exp_p:T_pM\to M$ at $p\in M$ is defined by $\exp_p(v)=\gamma_v(1,p)$ for each $v\in T_pM$, where $\gamma(\cdot)=\gamma_v(\cdot,p)$ is the geodesic starting at $p$ with velocity $v$. Then $\exp_p(tv)=\gamma_v(t,p)$ for each real number $t$.

Definition 2.34 Let $M$ be a Riemannian manifold and $p,q\in M$. Consider the set $\Gamma_{p,q}:=\{c:[a,b]\to M \mid c \text{ is a piecewise differentiable curve joining } p \text{ and } q\}$. The Riemannian distance from $p$ to $q$ is $d(p,q):=\inf\{\ell(c) \mid c\in\Gamma_{p,q}\}$.

A Riemannian manifold $M$ is complete if the geodesics in $M$ are defined for all $t\in\mathbb{R}$.

Theorem 2.35 (Hopf-Rinow) Let $M$ be a connected Riemannian manifold. Then the following conditions are equivalent:

(i) $M$ is geodesically complete at a point $p\in M$.

(ii) $M$ is geodesically complete, i.e., the geodesics in $M$ are defined for all $t\in\mathbb{R}$.

(iii) For a fixed point $p\in M$, the set $B[p,r]:=\{q\in M : d(p,q)\le r\}$ is compact for any $r>0$.

(iv) For any $p\in M$ and any $r>0$, $B[p,r]$ is compact.

(v) $(M,d)$ is complete as a metric space. Namely, any Cauchy sequence of $M$ is a convergent sequence.

Moreover, each one of the items (i)-(v) above implies the following:

(vi) For any two points $p,q\in M$ there exists a geodesic $\gamma$ (called minimal geodesic) joining $p$ to $q$ such that $\ell(\gamma)=d(p,q)$.

Proof. See [62, p. 84].

Definition 2.36 The curvature $R$ of a Riemannian manifold $M$ is a correspondence that associates to every pair $X,Y\in\mathcal{X}(M)$ a mapping $R(X,Y):\mathcal{X}(M)\to\mathcal{X}(M)$ given by

$$R(X,Y)Z=\nabla_Y\nabla_XZ-\nabla_X\nabla_YZ+\nabla_{(XY-YX)}Z,\qquad Z\in\mathcal{X}(M),$$

where $\nabla$ is the Riemannian connection of $M$.

Remark 2.37 The field $XY-YX\in\mathcal{X}(M)$ is the unique vector field given by the bracket operation of $X$ and $Y$. For more details about the bracket operation, see [32, Chapter 0, Section 5].


Definition 2.38 Let $\sigma\subset T_pM$ be a two-dimensional subspace of $T_pM$ and let $x,y\in\sigma$ be two linearly independent vectors. Then the sectional curvature of $M$ at $p$ relative to the section $\sigma$ is given by

$$K(x,y)=\frac{\langle\!\langle R(x,y)x,y\rangle\!\rangle_p}{|x|^2|y|^2-\langle\!\langle x,y\rangle\!\rangle_p^2}.$$

The next definition can be found in [62, p. 222].

Definition 2.39 A complete simply connected Riemannian manifold $M$ of nonpositive sectional curvature is called a Hadamard manifold.

From now on in this work we will denote a Hadamard manifold by $\mathcal{M}$. We remark that, due to the Hadamard-Cartan Theorem [62, p. 222], if $\mathcal{M}$ is Hadamard, then the exponential map $\exp_p:T_p\mathcal{M}\to\mathcal{M}$ is a diffeomorphism for every $p\in\mathcal{M}$, and $\exp_p^{-1}:\mathcal{M}\to T_p\mathcal{M}$ denotes its inverse. Denote by $\overline{\mathbb{R}}$ the extended real line, i.e., $\overline{\mathbb{R}}=\mathbb{R}\cup\{\pm\infty\}$. The domain of a function $f:\mathcal{M}\to\overline{\mathbb{R}}$ is denoted by $\mathrm{dom}(f):=\{p\in\mathcal{M} : f(p)<+\infty\}$. Following [9] and [71], we present below some concepts on optimization on Hadamard manifolds.
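To make the exponential map, its inverse and the Riemannian distance concrete, the sketch below uses a simple one-dimensional Hadamard manifold that is an assumption of this illustration and not an example taken from the thesis: the positive half-line $(0,+\infty)$ endowed with the metric $\langle\!\langle u,v\rangle\!\rangle_p=uv/p^2$. The map $p\mapsto\ln p$ is an isometry onto $\mathbb{R}$, so geodesics are $\gamma(t)=p\,e^{tv/p}$. The function names are hypothetical.

```python
import math

def exp_map(p, v):
    """Exponential map exp_p(v) on (0, +inf) with metric <<u, v>>_p = u*v / p**2."""
    return p * math.exp(v / p)

def log_map(p, q):
    """Inverse exponential map exp_p^{-1}(q), a tangent vector at p."""
    return p * math.log(q / p)

def riemannian_norm(p, v):
    """Norm of the tangent vector v at p induced by the metric."""
    return abs(v) / p

def dist(p, q):
    """Riemannian distance d(p, q) = |ln(q / p)|."""
    return abs(math.log(q / p))

if __name__ == "__main__":
    p, q = 2.0, 5.0
    v = log_map(p, q)
    print(exp_map(p, v))                      # recovers q up to rounding
    print(riemannian_norm(p, v), dist(p, q))  # both equal ln(5/2)
```

In this model the identity $\|\exp_p^{-1}q\|=d(p,q)$ can be verified directly.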

2.2.2 Concepts and results of optimization in Hadamard Manifolds

This subsection presents some definitions, notations and preliminary results about optimization in Riemannian manifolds.

Definition 2.40 A subset $A\subset\mathcal{M}$ is said to be convex if for any two points $p$ and $q$ in $A$, the geodesic joining $p$ to $q$ is contained in $A$, that is, if $\gamma:[a,b]\to\mathcal{M}$ is a geodesic such that $p=\gamma(a)$ and $q=\gamma(b)$, then $\gamma((1-t)a+tb)\in A$ for all $t\in[0,1]$.

Definition 2.41 A function $\psi:\mathcal{M}\to\overline{\mathbb{R}}$ is proper if $\mathrm{dom}(\psi)\neq\emptyset$ and $\psi(p)>-\infty$ holds for all $p\in\mathcal{M}$.

Next we define convex functions on a manifold. The definition below can be found, for example, in [71, p. 60].

Definition 2.42 (Convex function) Let $A$ be a convex subset of $\mathcal{M}$. A function $\psi:A\to\overline{\mathbb{R}}$ is convex if $\psi(\gamma(t))\le(1-t)\psi(p)+t\psi(q)$, for all $p,q\in A\cap\mathrm{dom}(\psi)$, all $t\in[0,1]$ and every geodesic $\gamma:[0,1]\to\mathcal{M}$ such that $\gamma(0)=p$ and $\gamma(1)=q$.

If the inequality in the above definition is strict, then $\psi$ is said to be strictly convex. From [71] we know that a function $\psi:\mathcal{M}\to\overline{\mathbb{R}}$ is convex (resp. strictly convex) if and only if for every geodesic $\gamma:[a,b]\to\mathcal{M}$, the function $\psi\circ\gamma:[a,b]\to\mathbb{R}$ is convex (resp. strictly convex) in the usual sense, i.e., $(\psi\circ\gamma)((1-s)t_1+st_2)\le(1-s)(\psi\circ\gamma)(t_1)+s(\psi\circ\gamma)(t_2)$, for all $s\in[0,1]$ and all $t_1,t_2\in[a,b]$.


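Definition 2.43 Let $A$ be a convex subset of $\mathcal{M}$. A function $\psi:A\to\mathbb{R}$ is said to be $\sigma$-strongly convex for $\sigma>0$ if, for any $p,q\in A\cap\mathrm{dom}(\psi)$ and any geodesic $\gamma:[0,1]\to\mathcal{M}$ joining $p$ to $q$, the composition $\psi\circ\gamma:[0,1]\to\mathbb{R}$ is $\sigma$-strongly convex, i.e.,

$$(\psi\circ\gamma)(t)\le(1-t)\psi(p)+t\psi(q)-\frac{\sigma}{2}t(1-t)\ell(\gamma)^2,\qquad\forall t\in[0,1].$$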

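Following [19], for each $p\in\mathcal{M}$, from now on we will denote by $T_p\mathcal{M}$ the tangent space to $\mathcal{M}$ at $p$. Its dual space, called the cotangent space to $\mathcal{M}$ at $p$, will be denoted by $T_p^*\mathcal{M}$. The duality product between $X\in T_p\mathcal{M}$ and $\xi\in T_p^*\mathcal{M}$ is denoted by $\langle\xi,X\rangle=\xi(X)$. Moreover, the tangent bundle and cotangent bundle of $\mathcal{M}$ will be denoted respectively by $T\mathcal{M}$ and $T^*\mathcal{M}$, where

$$T\mathcal{M}=\bigcup_{p\in\mathcal{M}}\{p\}\times T_p\mathcal{M},\qquad T^*\mathcal{M}=\bigcup_{p\in\mathcal{M}}\{p\}\times T_p^*\mathcal{M}.$$

The Riemannian metric of $\mathcal{M}$ provides a linear bijective correspondence between the tangent and cotangent spaces via the Riesz map and its inverse; see [48, Chapter 11]. They are defined as

$$\flat:T_p\mathcal{M}\ni X\mapsto X^\flat\in T_p^*\mathcal{M},\qquad\langle X^\flat,Y\rangle=X^\flat(Y)=\langle\!\langle X,Y\rangle\!\rangle_p,\quad\forall Y\in T_p\mathcal{M},$$

and

$$\sharp:T_p^*\mathcal{M}\ni\xi\mapsto\xi^\sharp\in T_p\mathcal{M},\qquad\langle\!\langle\xi^\sharp,Y\rangle\!\rangle_p=\xi(Y)=\langle\xi,Y\rangle,\quad\forall Y\in T_p\mathcal{M}.$$

Note that these isomorphisms further introduce an inner product on the cotangent space $T_p^*\mathcal{M}$, which we will also denote by $\langle\cdot,\cdot\rangle_p$. From these facts, we introduce below the definition of the subdifferential of a convex function, which can be found, for example, in [19, Definition 2.7].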

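Definition 2.44 The subdifferential $\partial\psi$ at a point $p\in\mathcal{M}$ of a proper, convex function $\psi:\mathcal{M}\to\overline{\mathbb{R}}$ is given by

$$\partial\psi(p):=\{\xi\in T_p^*\mathcal{M} \mid \psi(q)\ge\psi(p)+\langle\xi,\exp_p^{-1}q\rangle,\ \forall q\in\mathcal{M}\}.$$

Theorem 2.45 Let $\mathcal{M}$ be a Hadamard manifold and $\psi:\mathcal{M}\to\overline{\mathbb{R}}$ be a proper function. The following statements hold:

(i) If $\psi$ is convex, then $\psi(p)\ge\psi(q)+\langle\xi,\exp_q^{-1}p\rangle$, for all $p,q\in\mathrm{dom}(\psi)$ and all $\xi\in\partial\psi(q)$;

(ii) If $\psi$ is $\sigma$-strongly convex for $\sigma\ge0$, then $\psi(p)\ge\psi(q)+\langle\xi,\exp_q^{-1}p\rangle+\frac{\sigma}{2}d^2(p,q)$, for all $p,q\in\mathrm{dom}(\psi)$ and all $\xi\in\partial\psi(q)$.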


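Proof. The proof of item (i) follows directly from Definition 2.44. To prove item (ii), assume that $\psi$ is $\sigma$-strongly convex. Let $p,q\in\mathcal{M}$ and let $\gamma:[0,1]\to\mathcal{M}$ be the unique geodesic joining $q$ to $p$, that is, $\gamma(t)=\exp_q(t\exp_q^{-1}p)$. Thus, $\gamma'(0)=\exp_q^{-1}p$ and $\ell(\gamma)=\|\exp_q^{-1}p\|=d(p,q)$. Hence, from Definition 2.43 we have

$$(\psi\circ\gamma)(t)\le(1-t)\psi(q)+t\psi(p)-\frac{\sigma}{2}t(1-t)d^2(p,q),\qquad\forall t\in[0,1],$$

which implies that

$$t(\psi(\gamma(t))-\psi(p))+(1-t)(\psi(\gamma(t))-\psi(q))\le-\frac{\sigma}{2}t(1-t)d^2(p,q),\qquad\forall t\in[0,1].$$

Multiplying the last inequality by $1/t$ with $t\in(0,1]$ we have

$$\psi(q)-\psi(p)+\frac{\psi(\gamma(t))-\psi(q)}{t}\le-\frac{\sigma}{2}(1-t)d^2(p,q),\qquad\forall t\in(0,1]. \tag{2-8}$$

Taking the limit in (2-8) as $t$ goes to $0$ we obtain

$$\psi(q)-\psi(p)+\psi'(q;v)\le-\frac{\sigma}{2}d^2(p,q), \tag{2-9}$$

where

$$\psi'(q;v)=\lim_{t\to0^+}\frac{\psi(\exp_q(tv))-\psi(q)}{t}$$

and $v=\gamma'(0)=\exp_q^{-1}p$. Thus, by using [29, Proposition 3.2] we conclude that

$$\psi(p)\ge\psi(q)+\langle\xi,\exp_q^{-1}p\rangle+\frac{\sigma}{2}d^2(p,q),\qquad\forall p,q\in\mathcal{M},\ \forall\xi\in\partial\psi(q).$$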
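The inequality in item (ii) of Theorem 2.45 can be checked on a toy instance. The sketch below is an illustration under assumptions not taken from the thesis: the manifold is the positive half-line $(0,+\infty)$ with metric $\langle\!\langle u,v\rangle\!\rangle_p=uv/p^2$, for which $\exp_q^{-1}p=q\ln(p/q)$ and $d(p,q)=|\ln(p/q)|$, and the function is $\psi(p)=(\ln p)^2$. Along a geodesic joining $p$ and $q$, $\psi\circ\gamma$ is the quadratic $(\ln p+t(\ln q-\ln p))^2$, so $\psi$ is $2$-strongly convex in the sense of Definition 2.43. The name `strong_convexity_gap` is hypothetical.

```python
import math

def psi(p):
    """Assumed test function psi(p) = (ln p)^2 on (0, +inf)."""
    return math.log(p) ** 2

def strong_convexity_gap(p, q, sigma=2.0):
    """Left-hand side minus right-hand side of Theorem 2.45 (ii).

    xi is the differential of psi at q; it acts on a tangent vector X by
    the duality product <xi, X> = psi'(q) * X.
    """
    xi = 2.0 * math.log(q) / q       # psi'(q)
    log_q_p = q * math.log(p / q)    # exp_q^{-1} p
    d_squared = math.log(p / q) ** 2 # d(p, q)^2
    return psi(p) - (psi(q) + xi * log_q_p + 0.5 * sigma * d_squared)

if __name__ == "__main__":
    print(strong_convexity_gap(3.0, 0.5))             # zero up to rounding for sigma = 2
    print(strong_convexity_gap(3.0, 0.5, sigma=1.0))  # positive for a smaller modulus
```

For $\sigma=2$ the inequality holds with equality in this example; for any smaller modulus the gap equals $(1-\sigma/2)\,d^2(p,q)$.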

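The next definition can be found in [24, p. 363].

Definition 2.46 Let $\mathcal{M}$ be a Hadamard manifold. A function $\psi:\mathcal{M}\to\overline{\mathbb{R}}$ is said to be lower semi-continuous, or lsc, at $p\in\mathcal{M}$ if $\liminf_{x\to p}\psi(x)=\psi(p)$.

Definition 2.47 A function $\psi:\mathcal{M}\to\overline{\mathbb{R}}$ is said to be 1-coercive at $\bar{p}\in\mathcal{M}$ if

$$\lim_{d(p,\bar{p})\to+\infty}\frac{\psi(p)}{d(p,\bar{p})}=+\infty.$$

If $\psi$ is 1-coercive at every $\bar{p}\in\mathcal{M}$, then $\psi$ is said to be 1-coercive on $\mathcal{M}$.

Proposition 2.48 Assume that $\psi:\mathcal{M}\to\overline{\mathbb{R}}$ is lsc and 1-coercive on $\mathcal{M}$. Then the global minimizer set of $\psi$ is nonempty.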
