On some boosted methods for DC programming and the extension of the DCA to Hadamard Manifolds


UNIVERSIDADE FEDERAL DE GOIÁS (UFG)

INSTITUTO DE MATEMÁTICA E ESTATÍSTICA (IME)

GRADUATE PROGRAM IN MATHEMATICS

ELIANDERSON MENESES SANTOS

On some boosted methods for DC programming and the extension of the DCA to Hadamard Manifolds

GOIÂNIA 2022



Thesis presented to the Graduate Program in Mathematics of the Instituto de Matemática e Estatística (IME), Universidade Federal de Goiás (UFG), as a requirement for obtaining the degree of Doctor in Mathematics.

Area of concentration: Optimization

Advisor: Prof. Dr. Orizon Pereira Ferreira

Co-advisor: Prof. Dr. João Carlos de Oliveira Souza

Cataloging record prepared by the author through the Automatic Generation Program of the UFG Library System.

Santos, Elianderson Meneses

On some boosted methods for DC programming and the extension of the DCA to Hadamard Manifolds [manuscript] / Elianderson Meneses Santos. - 2022.

142 leaves: ill.

Advisor: Prof. Dr. Orizon Pereira Ferreira; co-advisor: Dr. João Carlos de Oliveira Souza.

Thesis (Doctorate) - Universidade Federal de Goiás, Instituto de Matemática e Estatística (IME), Graduate Program in Mathematics, Goiânia, 2022.

Bibliography.

Includes symbols, graphs, tables, algorithms.

1. Difference of convex functions. 2. DC optimization. 3. DCA. 4. Kurdyka-Łojasiewicz property. 5. Optimization on Riemannian manifolds. I. Ferreira, Orizon Pereira, advisor. II. Title.

CDU 51


UNIVERSIDADE FEDERAL DE GOIÁS, INSTITUTO DE MATEMÁTICA E ESTATÍSTICA

MINUTES OF THE THESIS DEFENSE

Minutes no. 11 of the thesis defense session of Elianderson Meneses Santos, which confers the degree of Doctor in Mathematics, in the area of concentration of Optimization.

On the seventeenth day of December of the year two thousand and twenty-one, starting at ten o'clock, the public defense session of the thesis entitled “On some boosted methods for DC programming and the extension of the DCA to Hadamard Manifolds” was held by web video conference. The proceedings were opened by the advisor and chair of the committee, Professor Dr. Orizon Pereira Ferreira (IME/UFG), with the participation of the other members of the examining committee: Professor Dr. Leandro da Fonseca Prudente (IME/UFG), internal member; Professor Dr. Glaydston de Carvalho Bento (IME/UFG), internal member; Professor Dr. João Xavier da Cruz Neto (MAT/UFPI), external member; Professor Dr. Welington Luis de Oliveira (MINES ParisTech, France), external member; and the co-advisor, Professor Dr. João Carlos de Oliveira Souza (MAT/UFPI), external member. During the examination, the committee members did not suggest any change to the title of the work. The examining committee met in closed session to conclude its evaluation of the thesis, and the candidate was approved by its members. After the results were announced by Professor Dr. Orizon Pereira Ferreira (IME/UFG), chair of the examining committee, the proceedings were closed and, for the record, these minutes were drawn up and signed by the members of the examining committee on the seventeenth day of December of the year two thousand and twenty-one.

TITLE SUGGESTED BY THE COMMITTEE

On some boosted methods for DC programming and the extension of the DCA to Hadamard Manifolds


ELIANDERSON MENESES SANTOS

On some boosted methods for DC programming and the extension of the DCA to Hadamard Manifolds

Thesis defended in the Graduate Program of the Instituto de Matemática e Estatística (IME) of the Universidade Federal de Goiás as a partial requirement for obtaining the degree of Doctor in Mathematics, approved on December 17, 2021, by the examining committee composed of the following professors:

Prof. Dr. Orizon Pereira Ferreira, Instituto de Matemática e Estatística (IME), UFG, Chair of the committee

Prof. Dr. João Carlos de Oliveira Souza, Departamento de Matemática (DM), UFPI

Prof. Dr. Glaydston de Carvalho Bento, Instituto de Matemática e Estatística (IME), UFG

Prof. Dr. João Xavier da Cruz Neto, Departamento de Matemática (DM), UFPI

Prof. Dr. Leandro da Fonseca Prudente, Instituto de Matemática e Estatística (IME), UFG

Prof. Dr. Welington Luis de Oliveira, Centre de Mathématiques Appliquées (CMA), Mines ParisTech

Elianderson Meneses Santos

He received a teaching degree (licenciatura plena) in mathematics from the Universidade Federal do Piauí (UFPI) in September 2013 and a master's degree in mathematics, also from UFPI, defending his dissertation in October 2015. He taught mathematics in undergraduate courses at UFPI from May 2014 to March 2016, and in undergraduate courses at the Universidade Estadual do Piauí (UESPI) from April 2016 to February 2019. He began his doctorate in mathematics at the Instituto de Matemática e Estatística of the Universidade Federal de Goiás in March 2019.

To my family.

Acknowledgments

First of all, I thank God for giving me the gift of life and for allowing me to finish this degree despite all the difficulties I faced along the way.

I thank my family for all their support, not only during my doctoral studies but throughout my personal and academic life.

I thank my partner Cíntia for being by my side at all times, always encouraging me, not letting me lose heart, and accompanying me through the greatest challenges and adventures of recent times.

I thank my friends, who never stopped supporting me even from a distance. Special thanks go to my friends from Quarteto, Renan, Rômulo, Kellwyn, Phelps, Danilo, Helson, and Neves, and also to the Malditos Amigos crowd, especially Atécio, Magari, and João Neto, for their companionship. I also thank my friends Rafael Emanuel and Hércules Bezerra.

I thank my advisors, Professors Orizon Pereira Ferreira and João Carlos de Oliveira Souza, for their patience, availability, companionship, and good advice. I thank them both for guiding me through the first steps of academic research in mathematics, both in its technical aspects and in good ethical practices along the way.

I thank the professors and staff of IME-UFG, whose work at the institute also helped me complete this degree.

I also thank my colleagues in the doctoral program for their companionship during the time we took the same courses, especially Riemannian Geometry and Optimization; among them I particularly acknowledge the support of Danilo and Thamara.

I thank Professors Glaydston de Carvalho Bento, João Xavier da Cruz Neto, Leandro da Fonseca Prudente, and Welington Luis de Oliveira for agreeing to serve on the defense committee of this doctoral thesis and for all their excellent comments and suggestions. I also thank Maurício Silva Louzeiro, whose tips and comments also contributed to the final version of this thesis.

I thank CAPES for the financial support.


"Resgate suas forças e se sinta bem Rompendo a sombra da própria loucura Cuide de quem corre do seu lado E quem te quer bem Essa é a coisa mais pura Fragmentos da realidade Estilo mundo cão Tem gente que desanda Por falta de opção E toda fé que eu tenho Eu tô ligado que ainda é pouco Os bandidos de verdade Tão em Brasília, tudo solto Eu faço da dificuldade A minha motivação A volta por cima Vem na continuação O que se leva dessa vida É o que se vive, é o que se faz Saber muito é muito pouco

‘Stay Will’esteja em paz"

Chorão, Pontes indestrutíveis - Charlie Brown Jr.


Abstract

Santos, E. M. On some boosted methods for DC programming and the extension of the DCA to Hadamard Manifolds. Goiânia, 2021. 142p. PhD Thesis. Instituto de Matemática e Estatística - IME, Universidade Federal de Goiás.

In this thesis some new methods for DC optimization are presented. The first one, called BSSM, is proposed to solve DC problems over $\mathbb{R}^n$ where the first DC component is differentiable and the second one is non-smooth. The second method, called nmBDCA, is a non-monotone extension of the BDCA to deal with DC problems over $\mathbb{R}^n$ where both DC components are non-smooth. The third method is a combination of the BSSM with the nmBDCA to deal with DC problems over a closed convex set $C$ with linear constraints, where the first DC component of the objective function is the sum of a smooth convex function and a non-differentiable convex function, and the second DC component is non-smooth. The last method is an extension of the DCA to the context of DC optimization on Hadamard manifolds.

Keywords

Difference of convex functions, DC optimization, DCA, Kurdyka-Łojasiewicz property, optimization on Riemannian manifolds.


CHAPTER 1 Introduction

A general DC problem is a non-convex and non-smooth optimization problem of the form

\[ \min_{x \in \mathcal{F}} \varphi(x) = g(x) - h(x), \tag{PDC} \]

where $g : \mathcal{F} \to \mathbb{R}$ and $h : \mathcal{F} \to \mathbb{R}$ are both convex and lower semicontinuous functions, and $\mathcal{F}$ is a feasible set, which can be a convex subset $C$ of $\mathbb{R}^n$ or a Riemannian manifold $\mathcal{M}$. The function $\varphi$ is known as a DC function, i.e., a function that can be expressed as a difference of two convex functions, and each of those functions is said to be a DC component of $\varphi$.

The first and most popular method developed to deal with DC programs, named the Difference of Convex Algorithm, or simply DCA, was first presented in [68]; see also [69]. In those works the authors establish some duality results in DC optimization and then present the DCA, which is based on the conjugate function and the duality relations of convex functions. In the following we present the original formulation of the DCA, which we will refer to as the natural form of DCA, or simply DCA when there is no risk of confusion. This algorithm aims to minimize the function $\varphi = g - h$ defined above when $\mathcal{F} = \mathbb{R}^n$. The algorithm is as follows:

Algorithm 1 DC Algorithm

1: Choose an initial point $x^0 \in \operatorname{dom}(g)$. Set $k = 0$.

2: Take $\xi^k \in \partial h(x^k)$, and compute

\[ x^{k+1} \in \partial g^*(\xi^k). \tag{1-1} \]

3: If $x^{k+1} = x^k$, then STOP and return $x^k$. Otherwise, go to Step 4.

4: Set $k \leftarrow k+1$ and go to Step 2.

The conjugate function $g^* : \mathbb{R}^n \to \mathbb{R}$ of the real-valued function $g$ is defined as $g^*(y) := \sup_{x \in \mathbb{R}^n} \{ \langle x, y \rangle - g(x) \}$; see e.g. [60, p. 473]. Since $g$ is convex, $g^*$ is also convex on $\mathbb{R}^n$. Note that the next iterate $x^{k+1}$ of DCA in (1-1) is a subgradient of $g^*$ at $\xi^k$; see e.g. [68, 69] and [4]. On the other hand, from the definition of the conjugate function, it is not hard to see that Algorithm 1 is equivalent to an alternative formulation of the DCA, called the simplified form of DCA, which is the following:

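Algorithm 2 [4, Section 2.3.1] DC Algorithm (Simplified form)

1: Choose an initial point $x^0 \in \operatorname{dom}(g)$. Set $k = 0$.

2: Take $\xi^k \in \partial h(x^k)$, and define the next iterate $x^{k+1}$ as

\[ x^{k+1} \in \operatorname*{argmin}_{x \in \mathbb{R}^n} \left\{ g(x) - \langle \xi^k, x - x^k \rangle \right\}. \tag{1-2} \]

3: If $x^{k+1} = x^k$, then STOP and return $x^k$. Otherwise, go to Step 4.

4: Set $k \leftarrow k+1$ and go to Step 2.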

We also remark that both formulations of DCA given above are equivalent in the following sense: given the current iterate $x^k$ of DCA, the next iterate $x^{k+1}$ satisfies (1-1) if and only if it satisfies (1-2).
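To make the simplified form concrete, the following is a minimal Python sketch on a toy instance chosen here purely for illustration (it is not taken from the thesis): $g(x) = \tfrac12\|x\|^2$ and $h(x) = \|x\|_1$. For this choice the subproblem in Step 2 has the closed-form solution $x^{k+1} = \xi^k$, and the sign vector is one valid subgradient of $h$. The function names are hypothetical.

```python
def sign_subgradient(x):
    """Return one subgradient of h(x) = ||x||_1: the sign vector (0 is chosen at 0)."""
    return [(v > 0) - (v < 0) for v in x]


def dca_simplified(x0, max_iter=100):
    """Simplified-form DCA for phi = g - h with g = 0.5*||x||^2 and h = ||x||_1."""
    x = [float(v) for v in x0]
    for k in range(max_iter):
        xi = sign_subgradient(x)              # Step 2: xi^k in the subdifferential of h at x^k
        # argmin_x { 0.5*||x||^2 - <xi, x - x^k> } is attained at x = xi (closed form)
        x_next = [float(s) for s in xi]
        if x_next == x:                       # Step 3: stopping test x^{k+1} = x^k
            return x, k
        x = x_next                            # Step 4
    return x, max_iter


def phi(x):
    """Objective phi(x) = 0.5*||x||^2 - ||x||_1 of the toy instance."""
    return 0.5 * sum(v * v for v in x) - sum(abs(v) for v in x)


x_star, iterations = dca_simplified([0.3, -2.0, 5.0])
```

On this instance the method stops at a point whose entries are $\pm 1$. Started at the origin with the subgradient $0 \in \partial h(0)$, it stops immediately, which illustrates that the stopping test only detects critical points, not necessarily minimizers.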

It is worth noting that several works on DC optimization assume that the DC components $g$ and $h$ of the DC function $\varphi$ are strongly convex. This assumption is not restrictive, since we can always add the same strongly convex function $\bar{f} : \mathcal{F} \to \mathbb{R}$ to each DC component of $\varphi$ to obtain $\bar{g} : \mathcal{F} \to \mathbb{R}$ and $\bar{h} : \mathcal{F} \to \mathbb{R}$ given by $\bar{g} := g + \bar{f}$ and $\bar{h} := h + \bar{f}$. In this case, both $\bar{g}$ and $\bar{h}$ are strongly convex and $\varphi := g - h = \bar{g} - \bar{h}$. Therefore, without loss of generality, we can always consider the DC problem (PDC) with both DC components strongly convex. Moreover, a well-established result in the study of DCA is the following: if the functions $g$ and $h$ are strongly convex, then every cluster point $\bar{x}$ of the sequence $(x^k)_{k \in \mathbb{N}}$ generated by the DCA satisfies $\partial g(\bar{x}) \cap \partial h(\bar{x}) \neq \emptyset$, i.e., $\bar{x}$ is a critical point of the DC function $\varphi := g - h$.
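The regularization argument can be checked numerically. The sketch below is only an illustration under assumptions made here: the one-dimensional components $g(x) = |x|$ and $h(x) = |x - 1|$, the regularizer $\bar f(x) = (\rho/2)x^2$ with $\rho = 2$, and all function names are hypothetical choices, not taken from the thesis.

```python
RHO = 2.0  # modulus of the illustrative strongly convex regularizer


def g(x):
    return abs(x)


def h(x):
    return abs(x - 1.0)


def f_bar(x):
    return 0.5 * RHO * x * x


def g_bar(x):
    return g(x) + f_bar(x)


def h_bar(x):
    return h(x) + f_bar(x)


def max_decomposition_gap(points):
    """Largest difference between g - h and g_bar - h_bar over the sample points."""
    return max(abs((g(x) - h(x)) - (g_bar(x) - h_bar(x))) for x in points)


def midpoint_gap(func, x, y):
    """0.5*func(x) + 0.5*func(y) - func(midpoint); zero means no curvature on [x, y]."""
    return 0.5 * func(x) + 0.5 * func(y) - func(0.5 * (x + y))


points = [i / 4.0 for i in range(-20, 21)]
gap = max_decomposition_gap(points)
```

Both decompositions give the same objective, while only the regularized component shows strictly positive curvature on a segment where $g$ is linear, for example on $[1, 3]$.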

In recent years, interest in DC theory has increased considerably, and a large body of work devoted to DC optimization in different contexts has been developed. This interest is due especially to the fact that DC functions have a wide range of practical applications, including, for example, computational biology [45, 46], machine learning [2, 66, 70], image analysis [43, 44, 58], cryptography [23, 67], the minimum sum-of-squares clustering problem [6, 30, 57], the bilevel hierarchical clustering problem [55], clusterwise linear regression [10], the multicast network design problem [37], the multidimensional scaling problem [3, 6], and the Fermat-Weber location problem [25, 27]; see also [15]. An extensive list of works dealing with DC optimization can be found in the recent review [47], which celebrates the 30th birthday of DC programming and DCA.

Interest in optimization on Riemannian manifolds has also increased in recent years. However, for DC optimization on Riemannian manifolds specifically, only a few works, with some specific algorithms or numerical experiments, have been proposed; see [1, 63]. We also remark that the Fenchel conjugate of a function has recently been established in [19].

In this sense, the first aim of this thesis is to study the DC problem (PDC) in the unconstrained case, i.e., when $\mathcal{F} = \mathbb{R}^n$. This study is divided into two chapters.

In the first one we present a scaled subgradient method for DC programming and we prove that the negative scaled generalized subgradient at the current iterate is a descent direction for the objective function from an auxiliary point. Thus, instead of applying the Armijo line search and computing the next iterate from the current iterate, both the line search and the new iterate are computed from that auxiliary point along the direction of the negative scaled generalized subgradient. Consequently, the proposed method, called BSSM, has asymptotic convergence properties and iteration-complexity bounds similar to those of the usual descent methods for minimizing differentiable convex functions with Armijo line search. The second part of our study of unconstrained DC problems extends the applicability of the BDCA, which was originally proposed in [5] for differentiable DC functions, and then in [6] for non-differentiable DC functions where the first DC component is differentiable but the second one is non-smooth. In our approach, we develop a version of the BDCA for DC functions where both DC components are non-differentiable. This version was made possible by applying a non-monotone line search instead of the usual monotone line search employed in [5, 6].

Under suitable assumptions, we show that any cluster point of the sequence generated by our method, called nmBDCA, is a critical point of the problem, and then we provide some iteration-complexity bounds. Some numerical experiments show that the nmBDCA outperforms the DCA as well as its monotone version.

The second aim of this thesis is to present a method that combines the strategies of the BSSM and nmBDCA to study DC problems under linear constraints. We prove that every cluster point of the sequence generated by the proposed method is a critical point of the problem, and we establish some iteration-complexity bounds. Moreover, we show that the proposed method recovers both BSSM and nmBDCA under some reasonable assumptions.

The third and last aim of this thesis is to study the DC problem (PDC) in the context of Riemannian manifolds. To this end, in the last chapter we propose a primal-dual study of the DC problem (PDC) on Hadamard manifolds, and then we present an extension of the DCA [68] to solve such problems. We remark that our method can also be seen as a practical application of the Fenchel conjugate recently presented in [19]. We prove that the DCA on Hadamard manifolds is a descent method, and that every cluster point of the generated sequence is a critical point of the problem under consideration. Moreover, we also prove a primal-dual asymptotic convergence result.

This thesis is organized as follows. Chapter 2 presents some notation and basic results that will be used throughout the text. Chapter 3 presents a boosted scaled subgradient method (BSSM) to solve DC problems where the first DC component is differentiable and the second one is non-smooth. Chapter 4 presents a boosted DC algorithm with a non-monotone line search (nmBDCA) to solve DC problems where both DC components are possibly non-smooth. Chapter 5 presents a Boosted Scaled DC Algorithm with non-monotone line search (nmBSDCA) to deal with DC problems where the first DC component is the sum of a smooth convex function and a non-differentiable convex function, and the second DC component is convex and non-smooth. Chapter 6 presents the DC problem on a Hadamard manifold and its dual, and also presents an extension of the DC Algorithm given in [68] to this context, as well as its convergence analysis. It is worth noting that the content of Chapter 6 is based on our work [36], which is available online. We also remark that throughout Chapters 3 and 4, $\varphi$ denotes a DC function with DC components $g$ and $h$. In Chapter 5, $\varphi$ is a DC function with DC components $g := g_1 + g_2$ and $h$, where $g_1$ is convex and differentiable and $g_2$ and $h$ are convex, but not necessarily differentiable, functions. In Chapter 6, $g$ and $h$ denote convex functions on a Hadamard manifold $\mathcal{M}$.


CHAPTER 2 Preliminaries

In this chapter we present some notations, definitions, and results that will be used throughout the text of this thesis.

2.1 Basic concepts and results of optimization on $\mathbb{R}^n$

In this section, we present some preliminary concepts that will be used throughout the next three chapters of this thesis.

2.1.1 Some facts about locally Lipschitz functions

This subsection presents some notation, definitions, and results about locally Lipschitz functions. First we define convex functions; see [40, Definition 1.1.1, p. 144, and Proposition 1.1.2, p. 145].

Definition 2.1 A function $\psi : \mathbb{R}^n \to \mathbb{R}$ is said to be convex if

\[ \psi(\lambda x + (1-\lambda) y) \le \lambda \psi(x) + (1-\lambda) \psi(y), \quad \forall x, y \in \mathbb{R}^n,\ \forall \lambda \in [0,1]. \]

We say that $\psi$ is strictly convex when the last inequality is strict for $x \neq y$ and $\lambda \in (0,1)$. Moreover, for a given $\sigma > 0$, the function $\psi$ is said to be strongly convex with modulus $\sigma$, or $\sigma$-strongly convex, if $\psi - (\sigma/2)\|\cdot\|^2$ is convex or, equivalently, if

\[ \psi(\lambda x + (1-\lambda) y) \le \lambda \psi(x) + (1-\lambda) \psi(y) - \frac{\sigma}{2}(1-\lambda)\lambda \|x - y\|^2, \tag{2-1} \]

for all $x, y \in \mathbb{R}^n$ and all $\lambda \in [0,1]$.

Remark 2.2 To make the text easier to read, throughout this thesis, especially in Chapter 5, we will use the following abuse of notation to refer to convex functions: we will say that the function $\psi$ is convex if, and only if, $\psi$ is $0$-strongly convex, i.e., when $\psi$ satisfies (2-1) with $\sigma = 0$. This abuse of notation also appears in Lemma 2.13.
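The first characterization in Definition 2.1, namely that $\psi - (\sigma/2)\|\cdot\|^2$ is convex, lends itself to a quick numerical sanity check. The sketch below tests the convexity inequality on a finite grid for the illustrative one-dimensional function $\psi(x) = x^2 + |x|$ (an assumption made here, as are the function names). A grid test can only refute strong convexity with a given modulus; passing it is a necessary condition, not a proof.

```python
def psi(x):
    """Illustrative function: psi(x) = x^2 + |x| is strongly convex with modulus 2."""
    return x * x + abs(x)


def passes_grid_test(func, sigma, points, lambdas, tol=1e-12):
    """Test convexity of func - (sigma/2)*x^2 on a finite grid of points and weights."""
    def q(x):
        return func(x) - 0.5 * sigma * x * x

    for x in points:
        for y in points:
            for lam in lambdas:
                z = lam * x + (1.0 - lam) * y
                if q(z) > lam * q(x) + (1.0 - lam) * q(y) + tol:
                    return False  # convexity inequality violated at (x, y, lam)
    return True


grid = [i / 2.0 for i in range(-8, 9)]
weights = [0.25, 0.5, 0.75]
ok_sigma_2 = passes_grid_test(psi, 2.0, grid, weights)
ok_sigma_3 = passes_grid_test(psi, 3.0, grid, weights)
```

With $\sigma = 2$ the shifted function is $|x|$, which is convex, so the test passes. With $\sigma = 3$ the shifted function is $|x| - x^2/2$, and the test finds a violation, for instance between $x = 1$ and $y = 2$.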

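The next two definitions can be found in [26].

Definition 2.3 We say that $\psi : \mathbb{R}^n \to \mathbb{R}$ is locally Lipschitz if, for all $x \in \mathbb{R}^n$, there exist a constant $K_x > 0$ and a neighborhood $U_x$ of $x$ such that $|\psi(x) - \psi(y)| \le K_x \|x - y\|$ for all $y \in U_x$.

If $\psi : \mathbb{R}^n \to \mathbb{R}$ is convex, then $\psi$ is locally Lipschitz; see [26, p. 34].

Definition 2.4 Let $\psi : \mathbb{R}^n \to \mathbb{R}$ be a locally Lipschitz function. Clarke's subdifferential of $\psi$ at $x \in \mathbb{R}^n$ is given by $\partial_c \psi(x) = \{ v \in \mathbb{R}^n \mid \psi^\circ(x; d) \ge \langle v, d \rangle,\ \forall d \in \mathbb{R}^n \}$, where $\psi^\circ(x; d)$ is the generalized directional derivative of $\psi$ at $x$ in the direction $d$, given by

\[ \psi^\circ(x; d) = \limsup_{u \to x,\ t \downarrow 0} \frac{\psi(u + t d) - \psi(u)}{t}. \]

If $\psi$ is convex, then $\partial_c \psi(x)$ coincides with the subdifferential $\partial \psi(x)$ in the sense of convex analysis, and $\psi^\circ(x; d)$ coincides with the usual directional derivative $\psi'(x; d)$; see [26, p. 36]. We recall that if $\psi : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable, then $\partial_c \psi(x) = \{ \nabla \psi(x) \}$ for any $x \in \mathbb{R}^n$.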
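As a small worked illustration of Definition 2.4 (an example added here, not part of the original text), consider $\psi(x) = |x|$ on $\mathbb{R}$ at the point $x = 0$. The reverse triangle inequality bounds every difference quotient by $|d|$, and the bound is attained along $u = 0$, so the Clarke subdifferential is the interval $[-1, 1]$, in agreement with the convex subdifferential of the absolute value.

```latex
\[
  \frac{|u + t d| - |u|}{t} \le \frac{t\,|d|}{t} = |d|
  \quad \text{for all } u \in \mathbb{R},\ t > 0,
  \qquad
  \frac{|0 + t d| - |0|}{t} = |d| ,
\]
\[
  \psi^\circ(0; d) = |d| ,
  \qquad
  \partial_c \psi(0) = \{ v \in \mathbb{R} \mid |d| \ge v\,d,\ \forall d \in \mathbb{R} \} = [-1, 1] = \partial \psi(0).
\]
```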

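Theorem 2.5 Let $\psi : \mathbb{R}^n \to \mathbb{R}$ be a locally Lipschitz function. Then, for all $x \in \mathbb{R}^n$, the following hold:

(i) $\partial_c \psi(x)$ is a non-empty, convex, compact subset of $\mathbb{R}^n$ and $\|v\| \le K_x$ for all $v \in \partial_c \psi(x)$, where $K_x > 0$ is the Lipschitz constant of $\psi$ around $x$;

(ii) $\psi^\circ(x; d) = \max \{ \langle v, d \rangle : v \in \partial_c \psi(x) \}$.

Proof. See [26, Proposition 2.1.2, p. 27].

Theorem 2.6 Let $\psi_1, \psi_2 : \mathbb{R}^n \to \mathbb{R}$ be convex functions. Then, for every $x, d \in \mathbb{R}^n$, the following assertions hold:

(i) If $\psi_1$ is differentiable, then $(\psi_1 - \psi_2)^\circ(x; d) = \langle \nabla \psi_1(x), d \rangle - \psi_2'(x; d)$;

(ii) $\partial_c (\psi_1 \pm \psi_2)(x) \subseteq \partial \psi_1(x) \pm \partial \psi_2(x)$, and equality holds if either $\psi_1$ or $\psi_2$ is differentiable.

Proof. See [26, Proposition 2.3.1, p. 38, and Corollary 1, p. 39].

Proposition 2.7 Let $\psi : \mathbb{R}^n \to \mathbb{R}$ be convex and let $(u^k)_{k \in \mathbb{N}}$ be such that $\lim_{k \to \infty} u^k = u^*$. If $(v^k)_{k \in \mathbb{N}}$ is a sequence such that $v^k \in \partial \psi(u^k)$ for every $k \in \mathbb{N}$, then $(v^k)_{k \in \mathbb{N}}$ is bounded and its cluster points belong to $\partial \psi(u^*)$.

Proof. See [40, Propositions 6.2.1 and 6.2.2, p. 282].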

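Lemma 2.8 Let $\psi : \mathbb{R}^n \to \mathbb{R}$ be a strongly convex function with modulus $\sigma > 0$, and let $\bar{\psi} : \mathbb{R}^n \to \mathbb{R}$ be convex. Then $\psi + \bar{\psi}$ is a strongly convex function with modulus $\sigma > 0$.

Proof. See [14, Lemma 5.20, p. 119].

Theorem 2.9 Let $\psi : \mathbb{R}^n \to \mathbb{R}$ be a strongly convex function and $C \subseteq \mathbb{R}^n$ a closed and convex set. Then $\psi$ has a unique minimizer $x^* \in C$ over $C$. Moreover, there exists $v \in \partial \psi(x^*)$ such that $\langle v, x - x^* \rangle \ge 0$ for all $x \in C$.

Proof. See [14, Theorem 5.25, p. 122 and Corollary 3.68, p. 76].

Theorem 2.10 The following statements are equivalent:

(i) $\psi : \mathbb{R}^n \to \mathbb{R}$ is a strongly convex function with modulus $\sigma > 0$.

(ii) $\psi(y) \ge \psi(x) + \langle v, y - x \rangle + (\sigma/2) \|y - x\|^2$ for all $x, y \in \mathbb{R}^n$ and all $v \in \partial \psi(x)$.

(iii) $\langle w - v, x - y \rangle \ge \sigma \|y - x\|^2$ for all $x, y \in \mathbb{R}^n$, all $w \in \partial \psi(x)$ and all $v \in \partial \psi(y)$.

Proof. See [14, Theorem 5.24, p. 119].

The following definition appears in [14, p. 107].

Definition 2.11 A differentiable function $\psi : \mathbb{R}^n \to \mathbb{R}$ has Lipschitz continuous gradient with constant $L > 0$ whenever $\|\nabla \psi(x) - \nabla \psi(y)\| \le L \|x - y\|$ for all $x, y \in \mathbb{R}^n$.

Lemma 2.12 (Descent lemma) Assume that $\psi$ satisfies Definition 2.11. Then, for all $x, d \in \mathbb{R}^n$ and all $\lambda \in \mathbb{R}$, there holds

\[ \psi(x + \lambda d) \le \psi(x) + \lambda \langle \nabla \psi(x), d \rangle + \frac{L \lambda^2}{2} \|d\|^2. \]

Proof. See [14, Lemma 5.7, p. 109].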

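The following lemma extends the descent lemma above.

Lemma 2.13 Let $\psi : \mathbb{R}^n \to \mathbb{R}$ be a function given by $\psi := \psi_1 - \psi_2$. Assume that $\psi_1$ satisfies Definition 2.11 and $\psi_2$ is strongly convex with modulus $\sigma \ge 0$. Then, for all $x, d \in \mathbb{R}^n$ and all $\lambda \in \mathbb{R}$, there holds

\[ \psi(x + \lambda d) \le \psi(x) + \lambda \langle \nabla \psi_1(x) - w, d \rangle + \frac{(L - \sigma)}{2} \lambda^2 \|d\|^2, \quad \forall w \in \partial \psi_2(x). \]

Proof. Let $x \in \mathbb{R}^n$ and take an arbitrary $w \in \partial \psi_2(x)$. Define the function $p : \mathbb{R}^n \to \mathbb{R}$ by $p(z) = \psi_1(z) - \langle w, z \rangle$. Thus $\nabla p(z) = \nabla \psi_1(z) - w$ and, since $\nabla \psi_1$ is Lipschitz continuous with constant $L$, $\nabla p$ is also Lipschitz continuous with constant $L$. Given $d \in \mathbb{R}^n$ and $\lambda \in \mathbb{R}$, applying Lemma 2.12 to $p$, we obtain

\[ p(x + \lambda d) \le p(x) + \lambda \langle \nabla \psi_1(x) - w, d \rangle + \frac{L \lambda^2}{2} \|d\|^2. \]

Since $p(z) = \psi_1(z) - \langle w, z \rangle$, the last inequality is equivalent to

\[ \psi_1(x + \lambda d) \le \psi_1(x) + \lambda \langle w, d \rangle + \lambda \langle \nabla \psi_1(x) - w, d \rangle + \frac{L \lambda^2}{2} \|d\|^2. \tag{2-2} \]

Since $\psi_2$ is strongly convex with modulus $\sigma \ge 0$ and $w \in \partial \psi_2(x)$, it follows from item (ii) of Theorem 2.10 that

\[ \lambda \langle w, d \rangle \le \psi_2(x + \lambda d) - \psi_2(x) - \frac{\sigma \lambda^2}{2} \|d\|^2. \]

Hence, the last inequality together with (2-2) yields

\[ \psi_1(x + \lambda d) - \psi_2(x + \lambda d) \le \psi_1(x) - \psi_2(x) + \lambda \langle \nabla \psi_1(x) - w, d \rangle + \frac{(L - \sigma) \lambda^2}{2} \|d\|^2, \]

which, taking into account that $\psi = \psi_1 - \psi_2$, concludes the proof.

Remark 2.14 It is worth noting that in Lemma 2.13 it is sufficient to assume that $\nabla \psi_1$ is Lipschitz continuous with constant $L > 0$. In this case, [26, Corollary of Proposition 2.2.1, p. 32] ensures that if $\psi_1$ is continuously differentiable then $\psi_1$ is locally Lipschitz. Hence, by item (ii) of Theorem 2.6 we have $\partial_c \psi(x) = \{ \nabla \psi_1(x) \} - \partial \psi_2(x)$ and the result follows. We also note that Lemma 2.13 generalizes Lemma 2.12. Indeed, taking $\psi_2 \equiv 0$, Lemma 2.13 becomes exactly Lemma 2.12.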
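The inequality of Lemma 2.13 can be checked numerically on a small instance. The sketch below makes illustrative assumptions that are not from the thesis: $\psi_1(x) = \tfrac12 x^\top A x$ in $\mathbb{R}^2$ with a fixed symmetric positive definite matrix $A$ (so $L$ is the largest eigenvalue of $A$), and $\psi_2(x) = (\sigma/2)\|x\|^2 + \|x\|_1$ with $\sigma = 1$, whose subdifferential contains $\sigma x + \operatorname{sign}(x)$. All names are hypothetical.

```python
import math
import random

A = [[2.0, 1.0], [1.0, 3.0]]  # symmetric positive definite (illustrative choice)
SIGMA = 1.0                   # strong convexity modulus of psi2
# Largest eigenvalue of a symmetric 2x2 matrix; it is the Lipschitz constant of grad psi1
L = 0.5 * (A[0][0] + A[1][1]) + math.sqrt((0.5 * (A[0][0] - A[1][1])) ** 2 + A[0][1] ** 2)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def grad_psi1(x):
    return [dot(row, x) for row in A]


def psi1(x):
    return 0.5 * dot(x, grad_psi1(x))


def psi2(x):
    return 0.5 * SIGMA * dot(x, x) + sum(abs(v) for v in x)


def subgrad_psi2(x):
    """One element of the subdifferential of psi2 at x: sigma*x + sign(x)."""
    return [SIGMA * v + ((v > 0) - (v < 0)) for v in x]


def psi(x):
    return psi1(x) - psi2(x)


def bound_gap(x, d, lam):
    """Right-hand side minus left-hand side of the inequality in Lemma 2.13."""
    w = subgrad_psi2(x)
    slope = dot([gi - wi for gi, wi in zip(grad_psi1(x), w)], d)
    rhs = psi(x) + lam * slope + 0.5 * (L - SIGMA) * lam * lam * dot(d, d)
    x_new = [xi + lam * di for xi, di in zip(x, d)]
    return rhs - psi(x_new)


random.seed(0)
worst_gap = min(
    bound_gap([random.uniform(-3, 3) for _ in range(2)],
              [random.uniform(-3, 3) for _ in range(2)],
              random.uniform(-2, 2))
    for _ in range(1000)
)
```

Up to rounding error, the smallest gap over the random samples should be non-negative, which is what the lemma predicts for this instance.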

2.1.2 On closed convex sets with linear constraints

In this subsection we present some preliminary results about convex sets with linear constraints. The definition below can be found, for example, in [7, Section 2].

Definition 2.15 Let

\[ C := \{ x \in \mathbb{R}^n : \langle a^i, x \rangle \le b_i,\ i = 1, \ldots, p \}, \tag{2-3} \]

where $a^i \in \mathbb{R}^n$ and $b_i \in \mathbb{R}$ for all $i \in \mathcal{I} := \{1, 2, \ldots, p\}$. The set of active constraints at the point $x \in C$ is given by $\mathcal{I}(x) := \{ i \in \mathcal{I} : \langle a^i, x \rangle = b_i \}$.

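The next definition can be found, for example, in [14, p. 36].

Definition 2.16 Let $\mathcal{F} \subseteq \mathbb{R}^n$ and $x \in \mathcal{F}$. The normal cone of $\mathcal{F}$ at $x$ is defined as

\[ N_{\mathcal{F}}(x) = \{ v \in \mathbb{R}^n : \langle v, z - x \rangle \le 0,\ \forall z \in \mathcal{F} \}. \]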

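Example 2.17 If $C$ is given by (2-3), then for every $x \in C$ we have

\[ N_C(x) = \left\{ \sum_{i=1}^{p} \mu_i a^i : \mu_i \ge 0,\ \mu_i (\langle a^i, x \rangle - b_i) = 0,\ i = 1, \ldots, p \right\}. \tag{2-4} \]

Indeed, denote by $N$ the set on the right-hand side of (2-4) and take $v \in N$. Then there exist $\mu_i \ge 0$, $i = 1, \ldots, p$, such that $v = \sum_{i=1}^{p} \mu_i a^i$. If $x \in C$, then $\langle a^i, x \rangle \le b_i$ for all $i = 1, \ldots, p$. Thus, for all $z \in C$,

\[ \langle v, z - x \rangle = \sum_{i=1}^{p} \mu_i \langle a^i, z \rangle - \sum_{i=1}^{p} \mu_i \langle a^i, x \rangle \le \sum_{i=1}^{p} \mu_i (b_i - \langle a^i, x \rangle) = 0, \]

which means that $N \subseteq N_C(x)$. Assume by contradiction that there exists $v \in N_C(x) \setminus N$. Since $N$ is closed and convex, by the strict separation theorem [14, Theorem 2.33, p. 31] there exist $u \in \mathbb{R}^n \setminus \{0\}$ and $z \in \mathbb{R}$ such that $\langle u, w \rangle < z < \langle u, v \rangle$ for all $w \in N$. Since $0 \in N$, we have $z > 0$. Note that for any $i \in \mathcal{I}(x)$ and all $\mu \ge 0$ we have $\mu \langle u, a^i \rangle < z$. Then, for $\mu > 0$ we obtain $\langle u, a^i \rangle < (1/\mu) z$, and letting $\mu \to \infty$ we conclude that $\langle u, a^i \rangle \le 0$ for all $i \in \mathcal{I}(x)$. Then, for $i \in \mathcal{I}(x)$ and all $\bar{t} > 0$, we have $\langle a^i, x + \bar{t} u \rangle \le b_i$. If $\langle a^i, x \rangle < b_i$ and $\langle u, a^i \rangle \le 0$, then a similar argument shows that $\langle a^i, x + \bar{t} u \rangle \le b_i$ for $\bar{t} > 0$. If $\langle a^i, x \rangle < b_i$ and $\langle u, a^i \rangle > 0$, then for all $0 < \bar{t} < (b_i - \langle a^i, x \rangle) / \langle u, a^i \rangle$ we have $\langle a^i, x + \bar{t} u \rangle \le b_i$. Thus, setting

\[ \bar{t} := \min \left\{ \frac{b_i - \langle a^i, x \rangle}{\langle a^i, u \rangle} \;\middle|\; \langle a^i, u \rangle > 0 \right\} > 0 \]

we ensure that $x + \bar{t} u \in C$. Finally, $\langle v, u \rangle > z > 0$ yields $\langle v, (x + \bar{t} u) - x \rangle = \langle v, \bar{t} u \rangle > \bar{t} z > 0$, which is a contradiction since $v \in N_C(x)$. Therefore, $N_C(x) = N$. In particular, if $a^i = 0 \in \mathbb{R}^n$ and $b_i = 0 \in \mathbb{R}$ for all $i = 1, \ldots, p$, then $C = \mathbb{R}^n$ and $N_C(x) = \{0\}$ for every $x \in \mathbb{R}^n$.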
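The description (2-4) of the normal cone can be illustrated on a concrete polyhedron. The sketch below assumes, purely for illustration, that $C$ is the unit box $[0,1]^2$ written in the form (2-3); the data, the function names, and the 0-based constraint indices are choices made here. It builds a vector from non-negative multipliers on active constraints only and checks the defining inequality $\langle v, z - x \rangle \le 0$ on a grid of feasible points.

```python
# C = {x in R^2 : <a^i, x> <= b_i} describing the unit box [0, 1]^2 (illustrative data)
A = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
B = [1.0, 1.0, 0.0, 0.0]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def active_set(x, tol=1e-12):
    """Indices (0-based) of the constraints that hold with equality at x."""
    return [i for i, (a, b) in enumerate(zip(A, B)) if abs(dot(a, x) - b) <= tol]


def normal_vector(x, multipliers):
    """Build v = sum_i mu_i a^i with mu_i >= 0 supported on the active constraints."""
    active = active_set(x)
    v = [0.0, 0.0]
    for i, mu in multipliers.items():
        if mu < 0 or i not in active:
            raise ValueError("multipliers must be non-negative and vanish on inactive constraints")
        v = [vj + mu * aj for vj, aj in zip(v, A[i])]
    return v


x = [1.0, 0.0]                           # corner of the box: constraints 0 and 3 are active
v = normal_vector(x, {0: 2.0, 3: 0.5})   # v = 2*a^0 + 0.5*a^3 = (2, -0.5)
grid = [[i / 4.0, j / 4.0] for i in range(5) for j in range(5)]
max_inner = max(dot(v, [z[0] - x[0], z[1] - x[1]]) for z in grid)
```

The largest inner product over the grid is attained at $z = x$ and equals zero, consistent with $v \in N_C(x)$.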

The following proposition will be useful later, in Chapter 5.

Proposition 2.18 If $\lim_{n \to +\infty} x^n = \bar{x}$, $v^n \in N_C(x^n)$ and $\lim_{n \to +\infty} v^n = \bar{v}$, then $\bar{v} \in N_C(\bar{x})$.

Proof. See [61, Proposition 6.6, p. 202].

The next lemma will be used in Chapters 3 and 5 to guarantee the feasibility of an auxiliary point generated by some of our methods.

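Lemma 2.19 Let $C$ be given by (2-3) and take $x, z \in C$. If $d = z - x \ne 0$ and $\mathcal{I}(z) \subseteq \mathcal{I}(x)$, then $z + td \in C$ for all $t \in [0, \varepsilon]$, where

$$0 < \varepsilon := \begin{cases} \min\left\{ \dfrac{b_i - \langle a^i, z\rangle}{|\langle a^i, d\rangle|} : i \in \mathcal{I} \setminus \mathcal{I}(x),\ \langle a^i, d\rangle \ne 0 \right\}, & \text{if } \mathcal{I} \setminus \mathcal{I}(x) \ne \emptyset; \\ +\infty, & \text{otherwise.} \end{cases} \tag{2-5}$$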


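Proof. Take $i \in \mathcal{I}(x)$. If $i \in \mathcal{I}(z) \subseteq \mathcal{I}(x)$, it holds that

$$\langle a^i, d\rangle = \langle a^i, z\rangle - \langle a^i, x\rangle = b_i - b_i = 0.$$

Otherwise, if $i \in \mathcal{I}(x) \setminus \mathcal{I}(z)$, then we conclude that

$$\langle a^i, d\rangle = \langle a^i, z\rangle - \langle a^i, x\rangle = \langle a^i, z\rangle - b_i < 0.$$

Thus, $\langle a^i, d\rangle \le 0$ for all $i \in \mathcal{I}(x)$. Since $z \in C$, it holds that $\langle a^i, z\rangle - b_i \le 0$ for all $i \in \mathcal{I}$. Hence, combining the last two inequalities we obtain

$$\langle a^i, z + td\rangle - b_i \le 0, \quad \forall t \ge 0,\ \forall i \in \mathcal{I}(x). \tag{2-6}$$

In the following, consider $i \in \mathcal{I} \setminus \mathcal{I}(x)$. Since $z \in C$ and $\mathcal{I}(z) \subseteq \mathcal{I}(x)$, for each $i \in \mathcal{I} \setminus \mathcal{I}(x) \subseteq \mathcal{I} \setminus \mathcal{I}(z)$ we know that $\langle a^i, z\rangle < b_i$. Note that, if $\langle a^i, d\rangle = 0$, then, taking into account that $z \in C$, we have $\langle a^i, z + td\rangle - b_i = \langle a^i, z\rangle - b_i \le 0$ for all $t \ge 0$. On the other hand, if $\langle a^i, d\rangle \ne 0$, then, letting

$$\varepsilon_i := \frac{b_i - \langle a^i, z\rangle}{|\langle a^i, d\rangle|} > 0,$$

we have

$$\langle a^i, z + td\rangle - b_i = \langle a^i, z\rangle - b_i + t \langle a^i, d\rangle \le 0, \quad \forall t \in [0, \varepsilon_i],\ \forall i \in \mathcal{I} \setminus \mathcal{I}(x). \tag{2-7}$$

Therefore, the definition of $\varepsilon$ in (2-5) together with (2-6) and (2-7) imply the desired result.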
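The step bound (2-5) is straightforward to evaluate numerically. The following is a minimal sketch, not the thesis's implementation: the helper names `dot`, `active_set` and `max_step`, the tolerance `tol`, and the unit-square data are all assumptions made for illustration, and indices are 0-based, whereas the text uses $i = 1, \ldots, p$.

```python
import math
from typing import Sequence, Set

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Euclidean inner product <u, v>."""
    return sum(ui * vi for ui, vi in zip(u, v))


def active_set(A: Sequence[Vector], b: Vector, x: Vector, tol: float = 1e-12) -> Set[int]:
    """Indices i (0-based) with <a^i, x> = b_i up to tol, as in Definition 2.15."""
    return {i for i in range(len(A)) if abs(dot(A[i], x) - b[i]) <= tol}


def max_step(A: Sequence[Vector], b: Vector, x: Vector, z: Vector, tol: float = 1e-12) -> float:
    """epsilon from (2-5) for d = z - x; assumes x, z in C and I(z) contained in I(x)."""
    d = [zi - xi for zi, xi in zip(z, x)]
    inactive_at_x = set(range(len(A))) - active_set(A, b, x, tol)
    ratios = [
        (b[i] - dot(A[i], z)) / abs(dot(A[i], d))
        for i in inactive_at_x
        if abs(dot(A[i], d)) > tol
    ]
    # The minimum over an empty index set is taken to be +infinity.
    return min(ratios) if ratios else math.inf


# Hypothetical data: the unit square {x : x1 <= 1, x2 <= 1, -x1 <= 0, -x2 <= 0}.
A = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
b = [1.0, 1.0, 0.0, 0.0]
x = [0.0, 0.0]    # active constraints at x (0-based): {2, 3}
z = [0.5, 0.25]   # interior point, so I(z) is empty and I(z) is contained in I(x)

eps = max_step(A, b, x, z)
d = [zi - xi for zi, xi in zip(z, x)]
boosted = [zi + eps * di for zi, di in zip(z, d)]
print(eps, boosted)
```

For this data the sketch gives $\varepsilon = 1$ and $z + \varepsilon d = (1, 0.5)$, a point on the boundary of the square, which is consistent with the lemma: the whole segment $z + td$, $t \in [0, \varepsilon]$, stays feasible.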

2.1.3 On the Kurdyka-Łojasiewicz property

Next we present the definition of the Kurdyka-Łojasiewicz property.

Definition 2.20 Let $C^1[(0,+\infty)]$ be the set of all continuously differentiable functions defined on $(0,+\infty)$, let $F : \mathbb{R}^n \to \mathbb{R}$ be a locally Lipschitz function and let $\partial_c F(\cdot)$ be the Clarke subdifferential of $F$. The function $F$ is said to have the Kurdyka-Łojasiewicz property at $x^*$ if there exist $\eta \in (0,+\infty]$, a neighborhood $U$ of $x^*$ and a continuous concave function $\gamma : [0,\eta) \to \mathbb{R}_+$ (called desingularizing function) such that $\gamma(0) = 0$, $\gamma \in C^1[(0,+\infty)]$, $\gamma'(t) > 0$ for all $t \in (0,\eta)$, and

$$\gamma'(F(x) - F(x^*))\, \operatorname{dist}(0, \partial_c F(x)) \ge 1, \quad \forall x \in U \cap \{x \in \mathbb{R}^n \mid F(x^*) < F(x) < F(x^*) + \eta\}.$$


To simplify the text, from now on in this thesis we will write “f is KŁ at $x^*$” instead of “the function f has the Kurdyka-Łojasiewicz property at $x^*$”. The next remarks show that a large class of functions satisfies the KŁ property.

Remark 2.21 S. Łojasiewicz proved in 1963 that real-analytic functions satisfy an inequality of the above type with $\gamma(t) = t^{1-\theta}$, where $\theta \in [1/2, 1)$; see [52].
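As a quick numerical illustration of the KŁ inequality with a desingularizing function of the form $\gamma(t) = t^{1-\theta}$, the sketch below takes the real-analytic function $F(x) = x^2$ on $\mathbb{R}$ with $x^* = 0$ and $\theta = 1/2$. These choices, and the helper names, are assumptions made here for illustration. Since $F$ is smooth, its Clarke subdifferential reduces to $\{F'(x)\}$, so $\operatorname{dist}(0, \partial_c F(x)) = |F'(x)|$.

```python
def F(x: float) -> float:
    """Illustrative real-analytic function with minimizer x* = 0."""
    return x * x


def dF(x: float) -> float:
    """Derivative of F; for smooth F the Clarke subdifferential is {dF(x)}."""
    return 2.0 * x


def gamma_prime(t: float, theta: float = 0.5) -> float:
    """Derivative of gamma(t) = t**(1 - theta) for t > 0."""
    return (1.0 - theta) * t ** (-theta)


def kl_product(x: float) -> float:
    """Left-hand side gamma'(F(x) - F(x*)) * dist(0, dF(x)) of the KL inequality at x* = 0."""
    return gamma_prime(F(x) - F(0.0)) * abs(dF(x))


# Sample points approaching x* = 0 from the right, plus two other points.
samples = [10.0 ** (-k) for k in range(1, 8)] + [-0.3, 2.0]
values = [kl_product(x) for x in samples]
print(values)
```

Analytically, $\gamma'(x^2)\,|F'(x)| = \tfrac{1}{2|x|} \cdot 2|x| = 1$ for every $x \ne 0$, so the inequality holds with equality in this example and the printed values agree with 1 up to rounding.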

Remark 2.22 Let $A \subset \mathbb{R}^n$ and $B \subset \mathbb{R}^n \times \mathbb{R}$. The set $B$ is called semianalytic if each point of $\mathbb{R}^n \times \mathbb{R}$ admits a neighborhood $V \subset \mathbb{R}^n \times \mathbb{R}$ for which $B \cap V$ assumes the form

$$\bigcup_{i=1}^{p} \bigcap_{j=1}^{q} \{(x,y) \in V : f_{ij}(x,y) = 0,\ g_{ij}(x,y) > 0\},$$

where the functions $f_{ij}, g_{ij} : V \to \mathbb{R}$ are real-analytic for all $i = 1, \ldots, p$ and $j = 1, \ldots, q$.

Then, the set $A$ is called subanalytic if each point of $\mathbb{R}^n$ admits a neighborhood $V \subset \mathbb{R}^n$ and a bounded semianalytic subset $B \subset \mathbb{R}^n \times \mathbb{R}$ such that $A \cap V = \{x \in \mathbb{R}^n : (x,y) \in B\}$.

Finally, a function $f : \mathbb{R}^n \to \mathbb{R}$ is called subanalytic if its graph is a subanalytic subset of $\mathbb{R}^n \times \mathbb{R}$. It is worth pointing out that a subanalytic function that is continuous when restricted to its closed domain satisfies the KŁ property with desingularizing function $\gamma(t) = D t^{\theta}/\theta$, with $D > 0$ and $\theta \in (0,1]$; for more details see [20, Theorem 3.1]. For examples of subanalytic functions see, e.g., [8,20,21].

2.2 Basic concepts and results of optimization on Hadamard manifolds

In this section, we recall some concepts, notations, and basic results about Riemannian manifolds and optimization. For more details see, for example, [32,59,62,71]. Let us begin with concepts about Riemannian manifolds.

2.2.1 Basic concepts and results of Riemannian geometry

In this subsection we present some basic results about Riemannian geometry that will be used throughout Chapter 6. For the next definitions, see [32].

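Definition 2.23 A differentiable manifold of dimension $n$ is a set $M^n$ and a family of injective mappings $\mathbf{x}_\alpha : U_\alpha \subset \mathbb{R}^n \to M$ of open sets $U_\alpha$ of $\mathbb{R}^n$ into $M$ such that:

(i) $\bigcup_\alpha \mathbf{x}_\alpha(U_\alpha) = M$;

(ii) for any pair $\alpha, \beta$ with $\mathbf{x}_\alpha(U_\alpha) \cap \mathbf{x}_\beta(U_\beta) = W \ne \emptyset$, the sets $\mathbf{x}_\alpha^{-1}(W)$ and $\mathbf{x}_\beta^{-1}(W)$ are open sets in $\mathbb{R}^n$ and the mappings $\mathbf{x}_\beta^{-1} \circ \mathbf{x}_\alpha$ are differentiable;

(iii) the family $\{(U_\alpha, \mathbf{x}_\alpha)\}$ is maximal relative to conditions (i) and (ii) above.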

The index $n$ in the notation $M^n$ indicates the dimension of $M$. When there is no confusion, this index can be omitted. Unless explicitly stated otherwise, throughout this section $M$ denotes a differentiable (possibly Riemannian) manifold of dimension $n$.

Definition 2.24 Let $M$ be a differentiable manifold. A differentiable function $\alpha : (-\varepsilon, \varepsilon) \to M$ is called a (differentiable) curve in $M$. Suppose that $\alpha(0) = p \in M$, and let $\mathcal{D}$ be the set of functions on $M$ that are differentiable at $p$. The tangent vector to the curve $\alpha$ at $t = 0$ is a function $\alpha'(0) : \mathcal{D} \to \mathbb{R}$ given by

$$\alpha'(0) f = \left. \frac{d(f \circ \alpha)}{dt} \right|_{t=0}, \quad f \in \mathcal{D}.$$

A tangent vector at $p$ is the tangent vector at $t = 0$ of some curve $\alpha : (-\varepsilon, \varepsilon) \to M$ with $\alpha(0) = p$. The set of all tangent vectors to $M$ at $p$ is denoted by $T_pM$. The set $TM = \{(p,v) : p \in M,\ v \in T_pM\}$ is called the tangent bundle of $M$.

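Definition 2.25 Let $M_1^n$ and $M_2^m$ be differentiable manifolds. A mapping $F : M_1 \to M_2$ is differentiable at $p \in M_1$ if, given a parametrization $\mathbf{y} : V \subset \mathbb{R}^m \to M_2$ at $F(p)$, there exists a parametrization $\mathbf{x} : U \subset \mathbb{R}^n \to M_1$ at $p$ such that $F(\mathbf{x}(U)) \subset \mathbf{y}(V)$ and the mapping

$$\mathbf{y}^{-1} \circ F \circ \mathbf{x} : U \subset \mathbb{R}^n \to \mathbb{R}^m$$

is differentiable at $\mathbf{x}^{-1}(p)$. $F$ is differentiable on an open set of $M_1$ if $F$ is differentiable at all of the points of this open set.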

Definition 2.26 A vector field $X$ on a differentiable manifold $M$ is a correspondence that associates to each point $p \in M$ a vector $X(p) \in T_pM$. The field is differentiable if the mapping $X : M \to TM$ is differentiable. The set of all vector fields on $M$ of class $C^\infty$ is denoted by $\mathcal{X}(M)$.

Definition 2.27 A Riemannian metric on a differentiable manifold $M$ is a correspondence which associates to each point $p$ of $M$ an inner product (that is, a symmetric, bilinear, positive-definite form) $\langle\langle \cdot, \cdot \rangle\rangle_p$ on the tangent space $T_pM$. A differentiable manifold $M$ with a given Riemannian metric is called a Riemannian manifold.

Definition 2.28 A differentiable mapping $c : I \to M$ of an open interval $I \subset \mathbb{R}$ into a differentiable manifold $M$ is called a parametrized curve. A vector field along a curve $c : I \to M$ is a differentiable mapping that associates to every $t \in I$ a tangent vector $V(t) \in T_{c(t)}M$. The vector field $\frac{dc}{dt}$ is called the tangent vector field of $c$. The restriction of a curve $c$ to a closed interval $[a,b] \subset I$ is called a segment. If $M$ is a Riemannian manifold, we define the length of a segment by

$$\ell_a^b(c) = \int_a^b \langle\langle c'(t), c'(t) \rangle\rangle^{1/2}\, dt.$$
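The length integral above can be approximated by numerical quadrature once a metric is fixed. The sketch below is an illustration under assumptions not taken from the text: it uses the hyperbolic upper half-plane $\{(x,y) : y > 0\}$ with metric $\langle\langle u, v \rangle\rangle_p = (u_1 v_1 + u_2 v_2)/y^2$, the curve $c(t) = (0, e^t)$ on $[0,1]$, and a midpoint rule; the function names are hypothetical.

```python
import math
from typing import Callable, Tuple

Point = Tuple[float, float]


def metric(p: Point, u: Point, v: Point) -> float:
    """Half-plane metric <<u, v>>_p = (u . v) / y**2 at p = (x, y), with y > 0."""
    return (u[0] * v[0] + u[1] * v[1]) / (p[1] ** 2)


def segment_length(
    c: Callable[[float], Point],
    dc: Callable[[float], Point],
    a: float,
    b: float,
    n: int = 1000,
) -> float:
    """Midpoint-rule approximation of the integral of <<c'(t), c'(t)>>^(1/2) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        total += math.sqrt(metric(c(t), dc(t), dc(t))) * h
    return total


def curve(t: float) -> Point:
    """Vertical curve c(t) = (0, e^t)."""
    return (0.0, math.exp(t))


def velocity(t: float) -> Point:
    """Tangent vector field c'(t) = (0, e^t)."""
    return (0.0, math.exp(t))


length = segment_length(curve, velocity, 0.0, 1.0)
print(length)
```

Here $\langle\langle c'(t), c'(t) \rangle\rangle^{1/2} = e^t / e^t = 1$, so the exact length of the segment is 1, although its Euclidean length is $e - 1$. The printed value agrees with 1 up to rounding.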

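Definition 2.29 An affine connection $\nabla$ on a differentiable manifold $M$ is a mapping

$$\nabla : \mathcal{X}(M) \times \mathcal{X}(M) \to \mathcal{X}(M),$$

which is denoted by $(X,Y) \mapsto \nabla_X Y$ and which satisfies:

(i) $\nabla_{fX+gY} Z = f \nabla_X Z + g \nabla_Y Z$;

(ii) $\nabla_X (Y + Z) = \nabla_X Y + \nabla_X Z$;

(iii) $\nabla_X (fY) = f \nabla_X Y + X(f) Y$,

for all $X, Y, Z \in \mathcal{X}(M)$ and $f, g \in \mathcal{D}(M)$.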

Proposition 2.30 Let $M$ be a differentiable manifold and let $\nabla$ be an affine connection on $M$. There exists a unique correspondence which associates to a vector field $V$ along the differentiable curve $c : I \to M$ another vector field $\frac{DV}{dt}$ along $c$, called the covariant derivative of $V$ along $c$, such that:

(i) $\frac{D}{dt}(V + W) = \frac{DV}{dt} + \frac{DW}{dt}$, where $W$ is a vector field along $c$.

(ii) $\frac{D}{dt}(fV) = \frac{df}{dt} V + f \frac{DV}{dt}$, where $f$ is a differentiable function on $I$.

(iii) If $V(t) = Y(c(t))$ for some $Y \in \mathcal{X}(M)$, then $\frac{DV}{dt} = \nabla_{dc/dt} Y$.

Proof. See [32, p. 50].

Definition 2.31 Let $M$ be a differentiable manifold and let $\nabla$ be an affine connection on $M$. A vector field $V$ along a curve $c : I \to M$ is called parallel if $\frac{DV}{dt} = 0$ for all $t \in I$.

Proposition 2.32 Let $M$ be a differentiable manifold with an affine connection $\nabla$. Let $c : I \to M$ be a differentiable curve in $M$ and let $V_0$ be a vector tangent to $M$ at $c(t_0)$, $t_0 \in I$. Then there exists a unique parallel vector field $V$ along $c$ such that $V(t_0) = V_0$ (the field $V(t)$ is called the parallel transport of $V(t_0)$ along $c$).

Proof. See [32, p. 52].

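Definition 2.33 A parametrized curve $\gamma : I \to M$ is a geodesic at $t_0 \in I$ if $\frac{D}{dt}\left(\frac{d\gamma}{dt}\right) = 0$ at $t = t_0$; if $\gamma$ is a geodesic at $t$ for all $t \in I$, we say that $\gamma$ is a geodesic. If $[a,b] \subset I$ and $\gamma : I \to M$ is a geodesic, the restriction of $\gamma$ to $[a,b]$ is called a geodesic segment joining $\gamma(a)$ to $\gamma(b)$.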


When there is no confusion, we will use the notation $P_{q \leftarrow p}$ for the parallel transport along the geodesic segment $\gamma$ joining $p$ to $q$. Following [32, Chapter 3], let $M$ be a Riemannian manifold. The exponential map $\exp_p : T_pM \to M$ at $p \in M$ is defined by $\exp_p(v) = \gamma_v(1,p)$ for each $v \in T_pM$, where $\gamma(\cdot) = \gamma_v(\cdot, p)$ is the geodesic starting at $p$ with velocity $v$. Then $\exp_p(tv) = \gamma_v(t,p)$ for each real number $t$.

Definition 2.34 Let $M$ be a Riemannian manifold and $p, q \in M$. Consider the set $\Gamma_{p,q} := \{c : [a,b] \to M \mid c \text{ is a piecewise differentiable curve joining } p \text{ and } q\}$. The Riemannian distance from $p$ to $q$ is $d(p,q) := \inf\{\ell(c) \mid c \in \Gamma_{p,q}\}$.
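The exponential map and the Riemannian distance can be written in closed form on simple manifolds. The sketch below is an illustration under assumptions that are not part of the text: the manifold is the positive half-line $(0,+\infty)$ with metric $\langle\langle u, w \rangle\rangle_p = uw/p^2$, which is isometric to $\mathbb{R}$ through $t \mapsto \ln t$, so that geodesics are $\gamma_v(t,p) = p\, e^{tv/p}$ and $d(p,q) = |\ln(q/p)|$. The function names are hypothetical.

```python
import math


def exp_map(p: float, v: float) -> float:
    """exp_p(v) = gamma_v(1, p) on (0, +inf) with metric <<u, w>>_p = u * w / p**2."""
    return p * math.exp(v / p)


def norm(p: float, v: float) -> float:
    """Length <<v, v>>_p^(1/2) of the tangent vector v at p in this metric."""
    return abs(v) / p


def distance(p: float, q: float) -> float:
    """Riemannian distance in closed form, assumed from the isometry t -> log(t)."""
    return abs(math.log(q / p))


p, v = 2.0, -3.0
q = exp_map(p, v)
print(q, distance(p, q), norm(p, v))
```

The distance from $p$ to $\exp_p(v)$ coincides with the norm of $v$ at $p$ (here $1.5$), which reflects the fact that the geodesic $t \mapsto \exp_p(tv)$, $t \in [0,1]$, is minimal on this manifold.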

A Riemannian manifold $M$ is complete if the geodesics in $M$ are defined for all $t \in \mathbb{R}$.

Theorem 2.35 (Hopf-Rinow) Let $M$ be a connected Riemannian manifold. Then the following conditions are equivalent:

(i) $M$ is geodesically complete at a point $p \in M$.

(ii) $M$ is geodesically complete, i.e., the geodesics in $M$ are defined for all $t \in \mathbb{R}$.

(iii) For a fixed point $p \in M$, the set $B[p,r] := \{q \in M : d(p,q) \le r\}$ is compact for any $r > 0$.

(iv) For any $p \in M$ and any $r > 0$, $B[p,r]$ is compact.

(v) $(M,d)$ is complete as a metric space, that is, any Cauchy sequence in $M$ is convergent.

Moreover, each one of the items (i)-(v) above implies the following:

(vi) For any two points $p, q \in M$ there exists a geodesic $\gamma$ (called minimal geodesic) joining $p$ to $q$ such that $\ell(\gamma) = d(p,q)$.

Proof. See [62, p. 84].

Definition 2.36 The curvature $R$ of a Riemannian manifold $M$ is a correspondence that associates to every pair $X, Y \in \mathcal{X}(M)$ a mapping $R(X,Y) : \mathcal{X}(M) \to \mathcal{X}(M)$ given by

$$R(X,Y)Z = \nabla_Y \nabla_X Z - \nabla_X \nabla_Y Z + \nabla_{(XY - YX)} Z, \quad Z \in \mathcal{X}(M),$$

where $\nabla$ is the Riemannian connection of $M$.

Remark 2.37 The field $XY - YX \in \mathcal{X}(M)$ is the unique vector field given by the bracket operation of $X$ and $Y$. For more details about the bracket operation, see [32, Chapter 0, Section 5].


Definition 2.38 Let $\sigma \subset T_pM$ be a two-dimensional subspace of $T_pM$ and let $x, y \in \sigma$ be two linearly independent vectors. Then the sectional curvature of $M$ at $p$ relative to the section $\sigma$ is given by

$$K(x,y) = \frac{\langle\langle R(x,y)x, y \rangle\rangle_p}{|x|^2 |y|^2 - \langle\langle x, y \rangle\rangle_p^2}.$$

The next definition can be found in [62, p. 222].

Definition 2.39 A complete simply connected Riemannian manifold $M$ of nonpositive sectional curvature is called a Hadamard manifold.

From now on in this work we will denote a Hadamard manifold by $\mathcal{M}$. We remark that, due to the Hadamard-Cartan Theorem [62, p. 222], if $\mathcal{M}$ is Hadamard, then the exponential map $\exp_p : T_p\mathcal{M} \to \mathcal{M}$ is a diffeomorphism for every $p \in \mathcal{M}$, and $\exp_p^{-1} : \mathcal{M} \to T_p\mathcal{M}$ denotes its inverse. Denote by $\overline{\mathbb{R}}$ the extended real line, i.e., $\overline{\mathbb{R}} = \mathbb{R} \cup \{\pm\infty\}$. The domain of a function $f : \mathcal{M} \to \overline{\mathbb{R}}$ is denoted by $\operatorname{dom}(f) := \{p \in \mathcal{M} : f(p) < +\infty\}$. Following [9] and [71], we present below some concepts on optimization on Hadamard manifolds.

2.2.2 Concepts and results of optimization in Hadamard Manifolds

This subsection presents some definitions, notations and preliminary results about optimization in Riemannian manifolds.

Definition 2.40 A subset $A \subset \mathcal{M}$ is said to be convex if for any two points $p$ and $q$ in $A$, the geodesic joining $p$ to $q$ is contained in $A$, that is, if $\gamma : [a,b] \to \mathcal{M}$ is a geodesic such that $p = \gamma(a)$ and $q = \gamma(b)$, then $\gamma((1-t)a + tb) \in A$ for all $t \in [0,1]$.

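Definition 2.41 A function $\psi : \mathcal{M} \to \overline{\mathbb{R}}$ is proper if $\operatorname{dom}(\psi) \ne \emptyset$ and $\psi(p) > -\infty$ holds for all $p \in \mathcal{M}$.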

Next we define convex functions on a manifold. The definition below can be found for example in [71, p. 60].

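Definition 2.42 (Convex function) Let $A$ be a convex subset of $\mathcal{M}$. A function $\psi : A \to \overline{\mathbb{R}}$ is convex if $\psi(\gamma(t)) \le (1-t)\psi(p) + t\psi(q)$ for all $p, q \in A \cap \operatorname{dom}(\psi)$, all $t \in [0,1]$ and every geodesic $\gamma : [0,1] \to \mathcal{M}$ such that $\gamma(0) = p$ and $\gamma(1) = q$.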

If the inequality in the above definition is strict, then $\psi$ is said to be strictly convex. From [71] we know that a function $\psi : \mathcal{M} \to \mathbb{R}$ is convex (resp. strictly convex) if and only if, for every geodesic $\gamma : [a,b] \to \mathcal{M}$, the function $\psi \circ \gamma : [a,b] \to \mathbb{R}$ is convex (resp. strictly convex) in the usual sense, i.e., $(\psi \circ \gamma)((1-s)t_1 + st_2) \le (1-s)(\psi \circ \gamma)(t_1) + s(\psi \circ \gamma)(t_2)$ for all $s \in [0,1]$ and all $t_1, t_2 \in [a,b]$.


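Definition 2.43 Let $A$ be a convex subset of $\mathcal{M}$. A function $\psi : A \to \overline{\mathbb{R}}$ is said to be $\sigma$-strongly convex for $\sigma > 0$ if, for any $p, q \in A \cap \operatorname{dom}(\psi)$ and any geodesic $\gamma : [0,1] \to \mathcal{M}$ joining $p$ to $q$, the composition $\psi \circ \gamma : [0,1] \to \mathbb{R}$ is $\sigma$-strongly convex, i.e.,

$$(\psi \circ \gamma)(t) \le (1-t)\psi(p) + t\psi(q) - \frac{\sigma}{2} t(1-t)\,\ell(\gamma)^2, \quad \text{for all } t \in [0,1].$$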
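The inequality in Definition 2.43 can be checked numerically on a concrete example. The sketch below rests on assumptions made only for illustration: the manifold is $(0,+\infty)$ with metric $\langle\langle u, w \rangle\rangle_p = uw/p^2$ (isometric to $\mathbb{R}$ via $t \mapsto \ln t$, hence Hadamard), the geodesic joining $p$ to $q$ is $\gamma(t) = p^{1-t} q^t$ with $\ell(\gamma) = |\ln(q/p)|$, and the test function is $\psi(p) = (\ln p)^2$. The function names are hypothetical.

```python
import math


def geodesic(p: float, q: float, t: float) -> float:
    """Geodesic from p to q on (0, +inf) with metric u*w/p**2: gamma(t) = p**(1-t) * q**t."""
    return p ** (1.0 - t) * q ** t


def psi(p: float) -> float:
    """Illustrative test function psi(p) = (ln p)^2."""
    return math.log(p) ** 2


def strong_convexity_gap(p: float, q: float, t: float, sigma: float = 2.0) -> float:
    """Right-hand side minus left-hand side of the inequality in Definition 2.43."""
    length = abs(math.log(q / p))  # l(gamma) = d(p, q) for this metric
    rhs = (1.0 - t) * psi(p) + t * psi(q) - 0.5 * sigma * t * (1.0 - t) * length ** 2
    return rhs - psi(geodesic(p, q, t))


# Gap along the geodesic from p = 0.5 to q = 8 for sigma = 2.
gaps = [strong_convexity_gap(0.5, 8.0, k / 10.0) for k in range(11)]
print(gaps)
```

Along any geodesic, $(\psi \circ \gamma)(t) = ((1-t)\ln p + t \ln q)^2$ is a quadratic in $t$ with second derivative $2\,\ell(\gamma)^2$, so the gap vanishes for $\sigma = 2$ (up to rounding), stays nonnegative for smaller $\sigma$ and becomes negative for larger $\sigma$. In this example $\psi$ is therefore $2$-strongly convex, and $2$ is the largest admissible modulus.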

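Following [19], for each $p \in \mathcal{M}$, from now on we will denote by $T_p\mathcal{M}$ the tangent space to $\mathcal{M}$ at $p$. Its dual space, called the cotangent space to $\mathcal{M}$ at $p$, will be denoted by $T_p^*\mathcal{M}$. The duality product between $X \in T_p\mathcal{M}$ and $\xi \in T_p^*\mathcal{M}$ is denoted by $\langle \xi, X \rangle = \xi(X)$. Moreover, the tangent bundle and the cotangent bundle of $\mathcal{M}$ will be denoted respectively by $T\mathcal{M}$ and $T^*\mathcal{M}$, where

$$T\mathcal{M} = \bigcup_{p \in \mathcal{M}} \{p\} \times T_p\mathcal{M}, \qquad T^*\mathcal{M} = \bigcup_{p \in \mathcal{M}} \{p\} \times T_p^*\mathcal{M}.$$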

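The Riemannian metric of $\mathcal{M}$ provides a linear bijective correspondence between the tangent and cotangent spaces via the Riesz map and its inverse; see [48, Chapter 11]. They are defined as

$$\flat : T_p\mathcal{M} \ni X \mapsto X^\flat \in T_p^*\mathcal{M}, \quad \langle X^\flat, Y \rangle = X^\flat(Y) = \langle\langle X, Y \rangle\rangle_p, \quad \forall Y \in T_p\mathcal{M},$$

and

$$\sharp : T_p^*\mathcal{M} \ni \xi \mapsto \xi^\sharp \in T_p\mathcal{M}, \quad \langle\langle \xi^\sharp, Y \rangle\rangle_p = \xi(Y) = \langle \xi, Y \rangle, \quad \forall Y \in T_p\mathcal{M}.$$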

Note that these isomorphisms also induce an inner product on the cotangent space $T_p^*\mathcal{M}$, which we will also denote by $\langle \cdot, \cdot \rangle_p$. Based on these facts, we introduce below the definition of the subdifferential of a convex function, which can be found, for example, in [19, Definition 2.7].

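Definition 2.44 The subdifferential $\partial\psi$ at a point $p \in \mathcal{M}$ of a proper, convex function $\psi : \mathcal{M} \to \overline{\mathbb{R}}$ is given by

$$\partial\psi(p) := \{\xi \in T_p^*\mathcal{M} \mid \psi(q) \ge \psi(p) + \langle \xi, \exp_p^{-1} q \rangle,\ \forall q \in \mathcal{M}\}.$$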

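Theorem 2.45 Let $\mathcal{M}$ be a Hadamard manifold and $\psi : \mathcal{M} \to \overline{\mathbb{R}}$ be a proper function. The following statements hold:

(i) If $\psi$ is convex, then $\psi(p) \ge \psi(q) + \langle \xi, \exp_q^{-1} p \rangle$ for all $p, q \in \operatorname{dom}(\psi)$ and all $\xi \in \partial\psi(q)$;

(ii) If $\psi$ is $\sigma$-strongly convex for $\sigma \ge 0$, then $\psi(p) \ge \psi(q) + \langle \xi, \exp_q^{-1} p \rangle + \frac{\sigma}{2} d^2(p,q)$ for all $p, q \in \operatorname{dom}(\psi)$ and all $\xi \in \partial\psi(q)$.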


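Proof. The proof of item (i) follows directly from Definition 2.44. To prove item (ii), assume that $\psi$ is $\sigma$-strongly convex. Let $p, q \in \mathcal{M}$ and let $\gamma : [0,1] \to \mathcal{M}$ be the unique geodesic joining $q$ to $p$, that is, $\gamma(t) = \exp_q(t \exp_q^{-1} p)$. Thus, $\gamma'(0) = \exp_q^{-1} p$ and $\ell(\gamma) = \|\exp_q^{-1} p\| = d(p,q)$. Hence, from Definition 2.43 we have

$$(\psi \circ \gamma)(t) \le (1-t)\psi(q) + t\psi(p) - \frac{\sigma}{2} t(1-t) d^2(p,q), \quad \forall t \in [0,1],$$

which implies that

$$t(\psi(\gamma(t)) - \psi(p)) + (1-t)(\psi(\gamma(t)) - \psi(q)) \le -\frac{\sigma}{2} t(1-t) d^2(p,q), \quad \forall t \in [0,1].$$

Multiplying the last inequality by $1/t$, with $t \in (0,1]$, we have

$$\psi(q) - \psi(p) + \frac{\psi(\gamma(t)) - \psi(q)}{t} \le -\frac{\sigma}{2}(1-t) d^2(p,q), \quad \forall t \in (0,1]. \tag{2-8}$$

Taking the limit in (2-8) as $t$ goes to $0$ we obtain

$$\psi(q) - \psi(p) + \psi'(q;v) \le -\frac{\sigma}{2} d^2(p,q), \tag{2-9}$$

where

$$\psi'(q;v) = \lim_{t \to 0^+} \frac{\psi(\exp_q(tv)) - \psi(q)}{t}$$

and $v = \gamma'(0) = \exp_q^{-1} p$. Thus, by using [29, Proposition 3.2] we conclude that

$$\psi(p) \ge \psi(q) + \langle \xi, \exp_q^{-1} p \rangle + \frac{\sigma}{2} d^2(p,q), \quad \forall p, q \in \mathcal{M},\ \forall \xi \in \partial\psi(q).$$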
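As a sanity check of item (ii), the following worked computation is a sketch under assumptions that are not part of the text above: the manifold is taken to be $(0,+\infty)$ with metric $\langle\langle u, w \rangle\rangle_p = uw/p^2$ (isometric to $\mathbb{R}$ via $t \mapsto \ln t$, hence Hadamard), the function is $\psi(p) = (\ln p)^2$, which is $2$-strongly convex in the sense of Definition 2.43 because $(\psi \circ \gamma)(t) = ((1-t)\ln q + t \ln p)^2$ along the geodesic from $q$ to $p$, and $\xi$ is the differential of $\psi$ at $q$, acting on tangent vectors by $\langle \xi, X \rangle = \psi'(q) X$.

```latex
% Assumed setting (illustrative, not from the thesis):
%   M = (0,+\infty), <<u,w>>_p = uw/p^2, psi(p) = (ln p)^2, sigma = 2.
\begin{align*}
\exp_q^{-1} p &= q \ln(p/q), \qquad d(p,q) = |\ln p - \ln q|, \\
\langle \xi, \exp_q^{-1} p \rangle &= \psi'(q)\, q \ln(p/q) = 2 \ln q \,(\ln p - \ln q), \\
\psi(q) + \langle \xi, \exp_q^{-1} p \rangle + \tfrac{\sigma}{2} d^2(p,q)
  &= (\ln q)^2 + 2 \ln q\,(\ln p - \ln q) + (\ln p - \ln q)^2 \\
  &= (\ln p)^2 = \psi(p).
\end{align*}
```

In this example the inequality of item (ii) therefore holds with equality for $\sigma = 2$.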

The next definition can be found in [24, p. 363].

Definition 2.46 Let $\mathcal{M}$ be a Hadamard manifold. A function $\psi : \mathcal{M} \to \mathbb{R}$ is said to be lower semicontinuous, or lsc, at $p \in \mathcal{M}$ if $\liminf_{x \to p} \psi(x) = \psi(p)$.

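Definition 2.47 A function $\psi : \mathcal{M} \to \mathbb{R}$ is said to be 1-coercive at $\bar{p} \in \mathcal{M}$ if

$$\lim_{d(p,\bar{p}) \to +\infty} \frac{\psi(p)}{d(p,\bar{p})} = +\infty.$$

If $\psi$ is 1-coercive at every $\bar{p} \in \mathcal{M}$, then $\psi$ is said to be 1-coercive on $\mathcal{M}$.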

Proposition 2.48 Assume that $\psi : \mathcal{M} \to \mathbb{R}$ is lsc and 1-coercive on $\mathcal{M}$. Then the set of global minimizers of $\psi$ is nonempty.
