Comput. Appl. Math. vol.26 número2

(1)

ISSN 0101-8205 www.scielo.br/cam

Angular analysis of two classes of non-polyhedral convex

cones: the point of view of optimization theory

∗

ALFREDO IUSEM1 and ALBERTO SEEGER2

1_{Instituto de Matemática Pura e Aplicada, Estrada Dona Castorina 110} Jardim Botânico, Rio de Janeiro, Brazil

2_{University of Avignon, Department of Mathematics, 33 rue Pasteur, 84000 Avignon, France} E-mails: [email protected] / [email protected]

Abstract. There are three related concepts that arise in connection with the angular analysis of a convex cone: antipodality, criticality, and Nash equilibria. These concepts are geometric in nature but they can also be approached from the perspective of optimization theory. A detailed angular analysis of polyhedral convex cones has been carried out in a recent work of ours. This note focus on two important classes of non-polyhedral convex cones: elliptic cones in an Euclidean vector space and spectral cones in a space of symmetric matrices.

Mathematical subject classification: 52A40, 90C26.

Key words: antipodal pairs, convex cones, maximal angle, critical angle, Nash angular equi-libria, elliptic cones, spectral cones.

1 Introduction

The Euclidean spaceRd_{is equipped with the usual inner product}_h_u, v_{i =}uTv

and the associated norm_{k ∙ k}. The symbol6d refers to the unit sphere inRd.

We also use the notation

4(Rd)_≡ nontrivial closed convex cones inRd.

#703/07. Received: 08/II/07. Accepted: 07/III/07.

(2)

That a convex cone K inRd_{is nontrivial means that K is different from}

{0_}and different from the spaceRd_{itself. The symbol K}+_{is used to indicate the positive} dual cone of K .

There are various tools that serve to describe the angular structure of a convex cone. The following definition recalls the main conceptual ingredients used in this note.

Definition 1. Let K _∈4(Rd₎_{. Let}

ˉ

u andv_ˉ be two unit vectors in K .

i) (u_ˉ,v)_ˉ is an antipodal pair of K ifu and_ˉ v_ˉ achieve the maximal angle

θmax(K)= sup

u,v∈K∩6d

arccos_hu, v_i. (1)

ii) The angleθ (u_ˉ,v)_ˉ ₌arccos_{h ˉ}u,v_ˉ_iformed by a critical pair(u_ˉ,v)_ˉ is called a critical angle. That a pair(u_ˉ,v)_ˉ is critical means that

ˉ

v_{− h ˉ}u,v_ˉ_{i ˉ}u _∈ K+ and u_ˉ _{− h ˉ}u,v_ˉ_{i ˉ}v_∈ K+.

The adjective proper is added whenu and_ˉ v_ˉare not collinear, i.e.,_{|h ˉ}u,v_ˉ_i|

6= 1. The set of all proper critical angles of K , denoted by(K), is called the angular spectrum of K .

iii) (u_ˉ,v)_ˉ is a Nash angular equilibrium of K if

θ (u_ˉ,v)_ˉ _≥θ (u_ˉ, v) _∀v_∈ K _∩6d,

θ (u_ˉ,v)_ˉ _≥θ (u,v)_ˉ _∀u _∈K _∩6d.

The numberθ (u_ˉ,v)_ˉ is then called a Nash angle of K .

(3)

show that the numberθmax(K)plays a role in estimating the efficiency of certain interior point methods for solving feasibility systems with inequalities described by K . On the other hand,θmax(K)is related to

ρ(K) ₌ min

Q∈4(Rd₎ Q unpointed

haus(K,Q), (2)

a number which has been suggested in [4] as tool for measuring the degree of pointedness of K . Here

haus(K,Q)₌max

max

x∈K∩6d

dist(x,Q), max

x∈Q∩6d

dist(x,K)

stands for the bounded Pompeiu-Hausdorff metric on4(Rd₎_{. In general, the}

evaluation of (2) is a cumbersome task even for cones having a relatively simple structure. Fortunately, the least distance problem (2) is related to the angle maximization problem (1) which, in principle, is easier to solve because the decision variables u, vlive in a standard Euclidean space. In fact, one has the following striking formula [9]

ρ(K)₌cos

θmax(K) 2

.

Moreover, if K is not a half-line and admits(u_ˉ,v)_ˉ as antipodal pair, then the closed convex cone

Q₌K _{∩ [}R(u_ˉ _{− ˉ}v)_]⊥ ₊ R(u_ˉ _{− ˉ}v)

is unpointed and lies at minimal distance from K .

Of course,θmax(K)is not the only critical angle of interest. The smallest proper critical angle plays also a relevant role in the description of the cone, namely, it can be used as tool for measuring its degree of solidity. By an index of solidity we understand any continuous function G_:4(Rd₎

→R_{satisfying the axioms:}

i) G(K)₌0 if and only if K is not solid,

ii) G(K)₌1 if and only if K is a half-space,

iii) G(U(K))₌G(K)for any orthonormal matrix U ,

(4)

The Frobenius coefficient

Gfrob(K)= (

radius of the largest ball contained in K and centered in a unit vector

is the first example of index of solidity that comes to mind. An alternative choice is

Gangular(K)=ρ(K+)=cos

θmax(K+) 2

. (3)

What is bothering about the expression (3) is that it involves the dual cone K+ and not the original cone K itself. However, this problem can be remediated since it is possible to write (3) in the equivalent form

Gangular(K)= (

sinhθmin(K) 2

i

if K is solid

0 if K is not solid,

whereθmin(K)indicates the smallest proper critical angle of K .

We have explained in few words why the maximal angle and the smallest proper critical angle are mathematical objects of interest. The intermediate critical angles are perhaps less useful, but in any case they provide additional information on the geometric structure of the cone. We now come back to the main stream of the presentation. The main fact to be remembered about Definition 1 is that every antipodal pair is a Nash angular equilibrium, and that every Nash angular equilibrium is a critical pair.

It is useful to split the angular spectrum of K in two disjoint pieces:

(K)₌nash(K)∪ord(K).

The first piece collects the proper critical angles that are formed by a Nash angular equilibrium. The remaining proper critical angles are said to be “ordinary” and they are thrown in the setord(K).

(5)

In this note we study two important classes of non-polyhedral convex cones arising in practice: elliptic cones in an Euclidean vector space and spectral cones in a space of symmetric matrices.

2 Angular Analysis of Elliptic Cones

In this section we consider d ₌n₊1 with n _≥2. Recall that the elliptic cone E(A)associated to a symmetric positive definite matrix A_∈Rn×n_{is the closed}

convex cone inRn+1_{given by}

E₍_A₎₌n₍_x_,_r₎_∈Rn+1_:√_xT_Ax

≤r

o

.

Elliptic cones are used in a great diversity of areas. The three dimensional case has applications in contact problems with orthotropic friction law [3, 17] and in electromagnetic scattering [16], just to mention two concrete examples. General background on higher dimensional elliptic cones can be found in [4, 7] among other references.

The trace of the elliptic cone E₍_A₎ _{over the unit sphere} ₆_n₊₁ _{is the set all} vectors(x,r)_∈Rn+1_{such that}

√

xT_Ax _≤_r_, ₍₄₎

kx_k2₊r2₌1. (5) The Cartesian representation (4)-(5) is not always the best way of describing E₍_A₎_∩₆_n

+1. As explained in the next lemma, this set can also be described by using a parametric representation. The notation

e(C)₌

α _∈Rn_{: h}α,Cα_{i ≤}1

stands for the ellipsoid associated to a symmetric positive definite matrix C _∈

Rn×n_.

Lemma 1. Decompose the symmetric positive definite matrix A _∈ Rn×n _in

(6)

and Q _{= [}q1∙ ∙ ∙qn]is an orthonormal matrix whose columns are formed with

corresponding eigenvectors q1, . . . ,qn. Then,

z _∈E₍_A₎_∩₆_n

+1 ⇐⇒ z = Qα, p

1_{− k}α_k2

wi t h α_∈e(I ₊D). (6)

Proof. Take any z ₌ (x,r) and write its first component in the form x ₌

Qα with α _∈ Rn_{. The fact that z has unit length is expressed by the}

rela-tion r ₌ p1_{− k}α_k2_{. Membership of z in}_E₍_A₎_{corresponds to the inequality}

hα, (I ₊D)α_{i ≤}1.

We refer to (6) as the canonical parametrization of E₍_A₎ _∩ ₆_n₊_{1. Since}

e(I ₊ D) is contained in Bn, the closed unit ball of Rn, the square root

op-eration in (6) is well defined. For the sake of convenience, we introduce the function9 _:Bn→Rn+1given by

9(α)₌ Qα,p1_{− k}α_k2

.

Although9depends explicitly on the collection_{q1, . . .qn}of eigenvectors of

A, the spherical product

(α, β)_∈ Bn×Bn 7→8(α, β)= h9(α), 9(β)i

is a more intrinsic concept. Indeed, by orthonormality of Q, one simply has

8(α, β)_{= h}α, β_{i +}p1_{− k}α_k2p₁_{− k}_β_k2_.

There is a very interesting theory behind the definition of8, but we shall not elaborate on this subject more than strictly necessary. The use of this special type of vector product will be clear in a moment.

2.1 Antipodal and critical pairs inE₍_A₎

In what follows we use the parametrization ofE(A)_∩6n+1described in Lem-ma 1. Arbitrary points inE₍_A₎_∩₆_n

+1, say u andv, will be represented in the parametric form

(7)

Since9is a bijection between e(I₊D)andE(A)_∩6n+1, the angle maximization problem (1) becomes

minimize_hα, β_{i +}p1_{− k}α_k2p₁

− kβ_k2 ₍₇₎ subject toα, β _∈e(I ₊D).

A careful analysis of the above variational problem leads to a full characterization of the set of antipodal pairs ofE(A).

Theorem 1. Decompose the symmetric positive definite matrix A _∈ Rn×n _as

in Lemma 1. Suppose that the smallest eigenvalue of A, denoted byμmin(A),

has multiplicity r , that is to say, the eigenspace associated to μmin(A) is r

-dimensional. Then,

(a) the maximal angle ofE₍_A₎_{is given by}

θmax(E(A))=arccos

μmin(A)−1

μmin(A)+1

.

(b) (u_ˉ,v)_ˉ is an antipodal pair ofE(A)if and only

ˉ

u₌

r

X

i=1

αiqi,

s

μmin(A)

μmin(A)+1 !

, v_ˉ ₌ ₋ r

X

i=1

αiqi,

s

μmin(A)

μmin(A)+1 !

,

with coefficientsα1, . . . , αr ∈Rsuch that r

X

i=1

α_i2₌ 1 μmin(A)+1

. (8)

Proof. Given the specific structure of (7), one sees that this minimization problem is solved by takingαr+1= ∙ ∙ ∙ =αn=0 and the first r coefficients of αas in (8). The parameter vectorβ must have opposite orientation with respect

(8)

Remark 1. If the eigenvalueμmin(A)is simple, that is to say, r =1, then the elliptic coneE₍_A₎_{has exactly two antipodal pairs, namely,}

ˉ

u ₌ _√ 1

μmin(A)+1

±q1, p

μmin(A)

,

ˉ

v ₌ _√ 1

μmin(A)+1

∓q1, p

μmin(A)

.

Since one antipodal pair is obtained from the other by permuting the order of u andv, one may say that the antipodal pair is "unique". In case of higher order multiplicity, i.e. r _≥ 2, the collection of antipodal pairs can be parametrized with an r -dimensional parameter vector as in (8).

We recall a result on angular spectra of elliptic cones obtained recently in [7]. As shown in the next theorem, computing the angular spectrum ofE₍_A₎ is essentially the same job as computing all the eigenvalues of the matrix A. The critical pairs ofE₍_A₎_{are obtained by using the eigenvectors of A.}

Theorem 2. Let A _∈ Rn×n _{be symmetric and positive definite. The vectors} (x,r)and(y,s)form a proper critical pair ofE₍_A₎_{if and only if the following}

three conditions hold:

(a) y_{= −}x,

(b) s₌r ₌p1_{− k}x_k2₌√_xT_Ax,

(c) x is an eigenvector of A.

In this case, the corresponding critical angle takes the value

θ ₌arccos

μ₋1

μ₊1

, (9)

whereμis the eigenvalue of A associated to x.

The proof technique used in [7] doesn’t rely on the canonical parametriza-tion ofE₍_A₎_∩₆_n

(9)

2.2 Nash angular equilibria inE(A)

Elliptic cones are, no doubt, very special objects. According to Theorem 2, the angular spectrum ofE(A)has at most n elements. Indeed,

(E(A))_{= {}θ1, . . . , θn},

where the i -th critical angle

θi =arccos

μi −1 μi +1

is obtained directly from the i -th eigenvalueμi of A.

The theorem stated below is a fundamental result on Nash angular equilibria of elliptic cones. Several consequences of this theorem will be presented as soon as the proof is completed.

Theorem 3. Let (u_ˉ,v)_ˉ be a proper critical pair ofE₍_A₎_{. Denote by} _θ _the

corresponding critical angle and byμthe corresponding eigenvalue, that is to say,μis the unique solution of the equation(9). The following four conditions are then equivalent:

(a) (u_ˉ,v)_ˉ is a Nash angular equilibrium ofE(A),

(b) u minimizes the linear form_ˉ _{h ∙},v_ˉ_ioverE(A)_∩6n+1, (c) v_ˉminimizes the linear form_{h ˉ}u, _{∙ i}overE₍_A₎_∩₆_n₊₁_, (d) μ_≤1₊2μmin(A)

Proof. Take, for instance, θ ₌ θk, that is to say, consider the critical angle

that derives fromμ ₌ μk. As in Lemma 1, we form Q = [q1∙ ∙ ∙qn] with an

orthonormal basis of eigenvectors of A. The diagonal matrix D is formed with the corresponding eigenvalues that we arrange in nondecreasing orderμmin(A)=

μ1 ≤ ∙ ∙ ∙ ≤ μn. Either by using the parametric representation ofu andˉ vˉ as in

Lemma 1, or by applying Theorem 2, one gets

ˉ

u ₌ _√ 1

1₊μ(qk,

√_μ),

ˉ

v ₌ _√ 1

1₊μ(−qk,

√_μ).

(10)

For the sake of clarity, we divide the proof of the theorem in five steps.

Step 1: We study the behavior of_{h ˉ}u, _{∙ i}over a special path. For each t _{∈ [}0,1_], consider the point

zt =(1+rt2)−

1/2

(xt,rt),

with

xt = −

√

t qk+

√

1₋t q1, and rt =

q

xtTAxt.

Note that xthas unit length, and so does zt. Note also that, in view of the equality

in the definition of rt, the vector zt belongs to bd[E(A)], the boundary ofE(A).

In short, _{zt : t ∈ [0,1]} is a continuous path on the set 6n+1 ∩bd[E(A)]. We will study the behavior of the univariate functionφ _{: [}0,1_{] →} R_{given by} φ(t)_{= h ˉ}u,zti.In view of (10) and the definition of zt, after some simplification

one arrives at the expression

φ(t)₌ −

√

t₊√μ√tμ₊(1₋t)μ1 √

1₊μ√1₊tμ₊(1₋t)μ1

. (11)

Observe that φ (1) ₌ (μ₋1)/(μ₊1) _{= h ˉ}u,v_ˉ_i.Let us examine the sign of

φ′₍₁₎_{. Since}

φ′(t)₌

−1 √

t+

√_μ(μ 1−μ1) √

t(μ−μ1)+μ1

√

t(μ₋μ1)+μ1+1− − √

t+√μ√t(μ−μ1)+μ1

(μ−μ1) √

t(μ−μ1)+μ1+1

2_[t(μ−μ1)+μ1+1]√μ+1

,

one gets

φ′(1)₌ 1

2(μ₊1)3/2

"

−1₊ √_μ(μ

−μ1)

√_μ

p

μ₊1₋ −1+

√_μ√_μ

(μ₋μ1)

√ μ₊1

#

= 1

2(μ₊1)2(μ−2μ1−1).

With this information at hand, we can proceed with the next step.

Step 2: We prove that(c)_⇒(d). Suppose on the contrary thatμ >2μ1−1. Soφ′₍₁_{) >}_{0, and therefore}

(11)

for some t _{∈ [}0,1_]closed enough to 1. Since zt belongs toE(A)∩6n+1for all

t _{∈ [}0,1_], the vectorv_ˉ is not a minimizer of _{h ˉ}u, _{∙ i}overE₍_A₎_∩₆_n

+1. This contradiction confirms that (c) implies (d).

Step 3: We derive a so-called “extrapolation property” associated to condition

(c). Consider the inequality

h ˉu,v_ˉ_{i ≤ h ˉ}u, v_{i ∀} v_∈E(A)_∩6n+1.

We represent a pointv_∈E(A)_∩6n+1in terms of a parameter vectorβ∈Rnas described in Lemma 1, i.e.

v₌8(β) with β _∈e(I ₊D). (12)

It follows that, forvas in (12), one has

h ˉu, v_{i =} _√ 1

1₊μ(βk+

√

μp1_{− k}β_k2_). For convenience we make the change of variables

r ₌

p

1_{− k}β_k2

kβ_k , δi = βi

kβ_k for i =1, . . . ,n.

Thus,

h ˉu, v_{i =} δk+r

√_μ

p

(1₊μ)(1₊r2₎ ≥

−|δk| +r√μ

p

(1₊μ)(1₊r2₎

= p −|δk|

(1₊μ)(1₊r2₎ +

√_μ

q

(1₊μ)(1₊_r12) .

(13)

The rightmost expression in (13) is clearly an increasing function of r on the positive half-line. Observe that, sincevbelongs toE₍_A₎_and_μ_i _≥_μ₁₍₁_≤_i _≤

n), it holds that

r2_≥

n

X

i=1

δ_i2μi =δ2kμ+

X

i6=k

δ_i2μi ≥δ2kμ+

 

X

i6=k δ_i2

 μ1

=δ2_kμ₊ 1₋δ_k2μ1,

(12)

using the facts thatPn i=1δ

2

i = 1 in the rightmost equality of (14). Replacing

(14) in (13), and calling t₌δ2_k _{∈ [}0,1_], one obtains

h ˉu, v_{i ≥} −

√

t₊√μ√tμ₊(1₋t)μ1 √

1₊μ√1₊tμ₊(1₋t)μ1 = h ˉ

u,zti,

where zt is defined as in Step 1. Summarizing, we have shown the following

extrapolation property: given an arbitrary pointv _∈ 6n+1∩E(A), there exists another point zt belonging to6n+1∩bd[E(A)] and lying farther away fromuˉ thanv.

Step 4: We prove that(d)_⇒(c). In view of the extrapolation property derived in Step 3, it suffices to prove that

h ˉu,zti ≥ h ˉu,vˉi ∀t ∈ [0,1],

or equivalently, thatφ _{: [}0,1_{] →}R_{attains its minimum at t} ₌ _{1. Under the}

assumptionμ _≤ 1₊2μ1, one has of course μ1 ≥ max{0, (μ−1)/2}. We consider two cases, according to the value of this maximum, i.e. to the sign of

μ₋1. First we look at the right hand side of (11) as a function ofμ1, which we will now denote asφ(t, μ1). By rewriting (11) as

φ (t, μ1) = −

√

t

√

1₊μ√1₊tμ₊(1₋t)μ1 +

√_μ √

1₊μ

q

1₊ _tμ₊₍₁1₋_t)μ 1

,

one sees thatφ(t,_∙)is nondecreasing in the non-negative half-line for all t _∈ [0,1_]. Now we study the two cases of interest.

i) μ₋1<0. Observe that

h ˉu,zti =φ(t, μ1)≥φ (t,0)= − √

t₊√μ√μt

√

μ₊1√1₊μt

= √

t(μ₋1)

√

μ₊1√1₊μt

= _√ μ−1

μ₊1 q

1

t +μ .

(13)

Since the rightmost expression in (15) is decreasing as a function of t in the interval_[0,1_], one gets

h ˉu,z_ˉti ≥φ (1,0)=(μ−1)/(μ+1)= h ˉu,vˉi

for all t _{∈ [}0,1_], as we needed to prove.

ii) μ₋1_≥0. In this case we write

h ˉu,zti = φ (t, μ1)≥φ

t,μ−1

2

= − √

2t₊√μ√t(μ₊1)₊μ₋1

(μ₊1)√t₊1 .

(16)

For t _{∈ [}0,1_], denote by ψ (t)as the rightmost expression of (16). The functionψ _{: [}0,1_{] →}R_{is derivable and}

(μ₊1)(t₊1)ψ′(t) ₌

−√2

2√t +

√_μ(μ +1)

2√t(μ₊1)₊μ₋1 _√

t₊1

−

−√2t₊√μ√t(μ₊1)₊μ₋1 2√t₊1

.

After a tedious simplification one arrives at

ψ′(t)₌ 1

2(μ₊1)(t₊1)3/2 − r

2

t +

2√μ

√

t(μ₊1)₊μ₋1 !

. (17)

It follows from (17) thatψ′₍_t₎ _≤ _{0 if and only if} _(μ₋₁₎₍₁₋_t₎ _≥ _0. We conclude thatψis nonincreasing in the whole interval_[0,1_]. Hence,

h ˉu,_ˉzti ≥ψ(t)≥ψ (1) =

1

μ₊1

−√2₊√μ√2μ

√ 2

!

= μ_μ−1

+1 = h ˉu,vˉi

(14)

Step 5: Completion of the proof. We have shown insofar that(c) _⇐⇒ (d). By using a mutatis mutandis argument one proves similarly that(b)_⇐⇒ (d). For completing the proof of the theorem it is now enough to observe that (a) is

the conjunction of (b) and (c).

Corollary 1. For a proper critical pair(u_ˉ,v)_ˉ of an elliptic cone K inRn+1_to

be a Nash angular equilibrium, it is necessary and sufficient that

k ˉu_{− ˉ}v_{k ≥}

√ 2

2 diam(K ∩6n+1). (18) Proof. Let K ₌ E₍_A₎_{and suppose that the proper critical pair} ₍_u_ˉ_,_v)_ˉ _is as-sociated with the eigenvalue μ. Since _{h ˉ}u,v_ˉ_{i =} (μ₋1)/(μ ₊1), one has k ˉu_{− ˉ}v_{k =}2/√μ₊1. On the other hand, the diameter of K _∩6n+1is attained with an antipodal pair of K , so one has

diam(K _∩6n+1)=2/ p

μmin(A)+1. It is clear that the conditionμ_≤2μmin(A)+1 is equivalent to

2 √

μ₊1 ≥ √

2 2

2 √

μmin(A)+1

,

so the announced result follows from Theorem 3.

Corollary 2. For a proper critical angleθ of an elliptic cone K to be a Nash angle, it is necessary and sufficient that

arccos

1₊cosθmax(K) 2

≤θ. (19)

Proof. It is a matter of reformulating Corollary 1.

2.3 Nash threshold coefficient ofE₍_A₎

(15)

cone. This idea is corroborated by the relation (19) in Corollary 2 or, equiva-lently, by the relation (18) in Corollary 1.

In connection with the above observation, recall that the Nash threshold

coeffi-cient of a nontrivial closed convex cone K inRd_{is the largest constant}_β

∈ [0,1_] such that

k ˉu_{− ˉ}v_{k ≥}β diam(K _∩6d) ∀(uˉ,v)ˉ ∈Nash(K), (20)

where Nash(K) stands for the set of all Nash angular equilibria of K . If βK

denotes the Nash threshold coefficient of K , then the infimal-value

β∗₌ inf

K∈4(Rd₎βK

corresponds to the largest constantβ for which the inequality (20) holds uni-formly with respect to all elements in4(Rd₎_{. After a considerable amount of}

effort we were able to prove in [8, Corollary 2] that

(

in dimension d greater or equal than three,

the infimal-valueβ∗is equal to√3/3.

Unless K has a special structure, it is hopeless to derive a simple formula for evaluatingβKitself. As far as elliptic cones are concerned, we are now in position

to establish the following result.

Proposition 1. The Nash threshold coefficient of an elliptic cone E₍_A₎ _is

given by

βE(A)=

s

μmin(A)+1

μ⋆₍_A₎₊₁ , (21)

where μ⋆(A) denotes the largest eigenvalue of A that is less or equal than

2μmin(A)+1. In particular, inf

K∈4(Rd₎ K elliptic

βK =

√ 2/2,

and this infimum is attained by any elliptic coneE(A)such that 2μmin(A)+1 is

(16)

Proof. Letμ1≤ ∙ ∙ ∙ ≤μnbe the eigenvalues of A. In view of Corollary 1, the

numberβE(A)corresponds to the largest constantβ ∈ [0,1]such that

2 √

μk+1 ≥

β _√ 2

μmin(A)+1

holds for any k_{∈ {}1, . . . ,n_}satisfyingμk ≤2μmin(A)+1. A matter of

simpli-fication leads directly to the formula (21).

3 Angular analysis of spectral cones

We now work in a linear space of dimension d ₌ (1/2)n(n₊1)with n _≥ 2. More precisely, we are concerned with a special class of convex cones in

S_n_≡_{real symmetric matrices of size n}_×_n_.

As usual,S_n_{is equipped with the Frobenius inner product}_h_A,B_{i =}trace(A B)

and the associated norm. For the sake of convenience we write 6(S_n₎ ₌

{A_∈S_n _{: k}_A_{k =}₁_}_{and reserve the symbol} 6n for the unit sphere in the

Eu-clidean spaceRn_.

For an arbitrary closed convex coneM_inS_n_{, the computation of the maximal}

angleθmax(M)is in general a cumbersome task. The same remark applies to the computation of the other possible critical angles. The purpose of this section is to derive useful calculus rules for computing critical angles at least for a special class of convex cones inS_n_.

Recall that a convex coneM_inS_n _{is said to be spectral (or weakly unitarily}

invariant) if

A_∈M _=⇒ _UT_AU _∈M _{for all} _U _∈O_n ₍₂₂₎

with O_n _{denoting the group of orthonormal matrices of size n} _×_{n. Notice,} incidentally, that the concept of spectrality applies to an arbitrary set inS_n _and

not just for a convex cone.

The next two lemmas are part of the folklore on weakly unitarily invariant sets and functions, see for instance the references [1, 2, 10, 11, 15]. In the sequel the notationλ(A)₌(λ1(A), . . . , λn(A))stands for the vector of eigenvalues of

(17)

Lemma 2. A convex coneM_inS_n _{is spectral if and only if there is a}

permu-tation invariant convex cone K inRn_{such that}

M₌

A_∈S_n _: _λ(_A₎_∈ _K _.

Furthermore, such K is unique and given by

KM=

x _∈Rn_: _diag₍_x₎_∈M _.

One refers to KM as the permutation invariant convex cone induced byM.

Recall that a set K inRn _{is called permutation invariant if}₅₍_K₎

= K for all

5_∈5n, with5ndenoting the set of n×n permutation matrices.

Lemma 3. One has:

(a) the dual of a permutation invariant convex cone in Rn _{is a permutation}

invariant convex cone inRn_.

(b) If M _{is a spectral convex cone in}S_n_{, then its dual}M+_{is a spectral convex}

cone inS_n_{. Furthermore,}M+_{can be computed by using the formula}

M+₌

A_∈S_n _: _λ(_A₎_∈ _K+

M .

The most popular example of spectral convex cone inS_n_{is the Loewner cone}

of positive semidefinite symmetric matrices. A list of more elaborate spectral convex cones includes

M_{= {}A_∈S_n_: λ1(A)+. . .+λm(A)≥0} (1≤m≤n−1), (23)

M_{= {}_A_∈S_n_: _γPn

i=rλi(A)≤Pmi=1λi(A)} (1≤m,r≤n, γ ≥0),

M_{= {}A_∈S_n_: max_{0, λn(A)} +γkAk ≤δtrace(A)} (γ≥0, δ∈R),

M_{= {}A_∈S_n_: q

[λn(A)−λ1(A)]2+ kAk2≤δtrace(A)} (δ≥0), M_{= {}A_∈S_n_: _γ_k_A_{k ≤}_trace₍_A₎_} _{(γ >}₀_),

(18)

vector which is obtained by rearranging in nondecreasing order the components of x _∈Rn_{, then one gets}

KM = {x ∈Rn: x₁↑+. . .+x_m↑ ≥0}, (24)

KM = {x ∈Rn: γ Pn_i₌_ℓx_i↑≤Pm_i₌₁x_i↑},

and so on. The example (23) is perhaps the most interesting one since it appears in concrete problems of optimization [12] and principal components analysis [14].

Be aware that KMmay posses some properties that are lacking inM, think for

instance of polyhedrality. It is not difficult to see that the convex cone (24) is polyhedral, but the spectral convex cone (23) is not. The lack of polyhedrality in (23) is due to a “curvature” effect introduced by the eigenvalue functions

λi : Sn→R.

3.1 Antipodal pairs in a spectral cone

Is there any link between the angular structure ofM_{and that of K}M? Answering

this question is not a trivial matter. Our first task will be comparing the maximal angle

θmax(M) = sup

A,B∈M

kAk=1,kBk=1

arccos_hA,B_i

of the spectral coneM_inS_n _{and the maximal angle}

θmax(KM) = sup

u,v∈KM

kuk=1,kvk=1

arccos_hu, v_i.

of the permutation invariant cone KM. The last term is easier to evaluate because

KMhas in principle a simpler structure and, in any case, it lives in a vector space

of lower dimension.

As explained in the next theorem, the antipodal pairs ofM_{and those of K}M

are related through the eigenvalue mapλ_:S_n _→Rn_{. We start by establishing a}

commutation principle for optimization problems with spectral data. Recall that a spectral set inS_n_{is defined by means of the relation (22). Similarly, a function} 8onS_n_{is said to be spectral (or weakly unitarily invariant) if}

(19)

Lemma 4 (Commutation principle). LetN _⊂ S_n _{be a spectral set and}8 _: S_n _→ R_{be a spectral function. Let} _Aˉ_,_Bˉ _∈S_n_{. If} _{B is a local minimum (or a}ˉ

local maximum) overN_{of the fonction}_{h ˉ}_A, _{∙ i +}8(_∙), then A andˉ B commute.ˉ

Proof. Suppose that B is a local minimum ofˉ _{h ˉ}A, _{∙ i +}8(_∙) overN_{. This} means thatBˉ _∈N_and

h ˉA,Bˉ_{i +}8(Bˉ)_{≤ h ˉ}A,B_{i +}8(B) _∀B _∈N_∩O_ε₍_Bˉ₎ ₍₂₅₎

withO_ε(Bˉ)denoting an open ball of radiusε >0 and centerB. Takeˉ Xˉ _∈ O_n so that

ˉ

XTBˉXˉ ₌E ₌diag(λ(Bˉ)).

By definition of spectral set, X E XT _{belong to}_N_{for all X}

∈O_n_{. By a continuity} argument, there is a smallδ > 0 such that X E XT _∈ O_ε₍_Bˉ₎ _{whenever X} _∈ O_δ(Xˉ). In view of (25), one gets

h ˉA,X Eˉ XˉT_{i +}8(X Eˉ XˉT)_{≤ h ˉ}A,X E XT_{i +}8(X E XT) _∀X _∈O_n_∩O_δ₍_Xˉ_).

But, by spectrality of8, the terms8(X Eˉ XˉT₎_and₈₍_{X E X}T₎_{cancel out. The}

conclusion is thatX is a local solution to the optimization problemˉ

minimize f(X)_{= h ˉ}A,X E XT_i

with respect to X _∈O_n, (26)

and hence it satisfies the first order necessary optimality conditions for this prob-lem. Before writing down these optimality conditions, observe that f is not defined onS_n_{but on the space}M_n_{of arbitrary real matrices of size n}_×_{n. The}

Frobenius inner product onM_n_{is given by}_h_X_,_Y_{i =}_trace₍_XT_Y_)._{By rewriting}

the contraint (26) as X XT

=I and introducing the Lagrangean function

(X,M)_∈M_n_×M_n _7→ _L₍_X_,_M₎_{= h ˉ}_A_,_{X E X}T_{i − h}_M_,_{X X}T ₋_I_i_,

we see thatX satisfiesˉ

∇ML(Xˉ,Mˉ) = ˉXXˉT −I =0, (27)

(20)

for someMˉ _∈M_n_{. Since} _Xˉ−1

= ˉXT _{by (27), we get from (28) that}

ˉ

ABˉ _{= ˉ}A(X Eˉ XˉT)₌(1/2)(Mˉ _{+ ˉ}MT)

is a symmetric matrix. It follows thatAˉBˉ ₌(AˉBˉ)T _{= ˉ}BA. The case of a localˉ

maximum is treated in a similar way.

It is possible to derive more sophisticated versions of the above commutation principle but Lemma 4 is general enough to cover all our needs. Everything is now ready to state:

Theorem 4. A spectral closed convex coneM_inS_n _{and the associated}

per-mutation invariant closed convex cone KMhave the same maximal angle, i.e.,

θmax(M)=θmax(KM).Furthermore, the following statements are equivalent:

(a) (Aˉ,Bˉ)_∈S_n_×S_n_{is an antipodal pair of}M_.

(b) there exist an antipodal pair(u_ˉ,v)_ˉ of KMand a matrix Q∈Onsuch that

ˉ

A₌ Qdiag(u_ˉ)QT _and_B_ˉ

= Qdiag(v)_ˉ QT_.

Proof. Let(u_ˉ,v)_ˉ be an antipodal pair of KM. Write Aˉ = diag(uˉ)and Bˉ =

diag(v)._ˉ One has_{k ˉ}A_{k = k ˉ}u_{k =} 1, _{k ˉ}B_{k = k ˉ}v_{k =} 1, and alsoλ(Aˉ) _{= ˉ}u↑_,

λ(Bˉ)_{= ˉ}v↑_._{Since K}_M_{is permutation invariant, the vectors}_u_ˉ↑_and_v_ˉ↑_{remain in}

KM, and therefore Aˉ,Bˉ ∈M.The conclusion is that

θmax(M)≥arccosh ˉA,Bˉi =arccosh ˉu,vˉi =θmax(KM).

The reverse inequality is obtained by exploiting the commutation principle stated in Lemma 4. The proof runs as follows. Take matricesAˉ,Bˉ _∈M_{of unit length} realizing the maximal angle inM_{, i.e.,}

h ˉA,Bˉ_{i =} min

A,B∈M_∩6(S_n)hA,Bi.

In particular, B is a minimizer of the linear formˉ _{h ˉ}A, _{∙ i}over the spectral set M_∩₆₍S_n₎_{. Lemma 4 implies that}_{A and}ˉ _{B commute. Hence,}ˉ _{A and}ˉ _{B can be}ˉ

simultaneously diagonalized by means of a matrix Q_∈O_n_{. This means that}

(21)

for suitable vectorsu_ˉ,v_ˉ _∈Rn_{. Hence,}

h ˉA,Bˉ_{i = h}Qdiag(u_ˉ)QT,Qdiag(v)_ˉ QT_{i = h}diag(u_ˉ),diag(v)_ˉ _{i = h ˉ}u,v_ˉ_i.

Observe also thatu_ˉ,v_ˉare unit vectors and, by spectrality ofM_{, they are in K}M.

Hence, one gets

h ˉA,Bˉ_{i ≥} inf

u,v∈KM∩6nh

u, v_i

and the desired inequalityθmax(M)≤θmax(KM).The second part of the theorem

is implicit in the above proof.

3.2 Critical pairs in a spectral cone

We now compare the angular spectra ofM_{and K}M. The commutation principle

stated in Lemma 4 plays again a crucial role.

Theorem 5. A spectral closed convex coneM_inS_n _{and the associated}

per-mutation invariant closed convex cone KMhave the same collection of proper

critical angles, i.e., (M) ₌ (KM).Furthermore, the following statements

are equivalent:

(a) (Aˉ,Bˉ)_∈S_n_×S_n_{is a critical pair of}M_.

(b) there exist a critical pair(u_ˉ,v)_ˉ of KMand a matrix Q ∈ On such that

ˉ

A₌ Qdiag(u_ˉ)QT andBˉ ₌ Qdiag(v)_ˉ QT.

Proof. Suppose thatAˉ,Bˉ _∈M_∩₆₍S_n₎_{satisfy the criticality conditions}

ˉ

B_{− h ˉ}A,Bˉ_{i ˉ}A_∈M+_, ₍₃₀₎ ˉ

A_{− h ˉ}A,Bˉ_{i ˉ}B_∈M+. (31)

We shall prove that A andˉ B commute. To do this, we come back to the veryˉ

definition of a critical pair. It is not difficult to see that the system (30)-(31) can be written in the form

− [ ˉB₋ηAˉ_{] ∈}∂9M(Aˉ),

− [ ˉA₋ηBˉ_{] ∈}∂9M(Bˉ),

(22)

with∂ standing for the subdifferential operator in the sense of convex analysis,

9Mdenoting the indicator function ofM, andη= h ˉA,Bˉi. Let us examine more

closely for instance the relation (32). Standard rules of convex analysis show that (32) amounts to saying thatB minimizes the linear functionˉ _{h ˉ}A₋ηBˉ, _{∙ i}

over the setM_{. In view of Lemma 4, one obtains the commutation property}

(Aˉ ₋ηBˉ)Bˉ _{= ˉ}B(Aˉ₋ηBˉ),

confirming in this way thatAˉBˉ _{= ˉ}BA. By proceeding to a simultaneous diago-ˉ

nalization as in (29) one obtains

Qdiag(v)_ˉ QT

−ηQdiag(u_ˉ)QT

∈M+_,

Qdiag(u_ˉ)QT

−ηQdiag(v)_ˉ QT

∈M+_.

By spectrality ofM+_{we can drop the common orthonormal transformation Q} and write simply

diag(v_ˉ₋ηu_ˉ)_∈M+_,

diag(u_ˉ₋ηv)_ˉ _∈M+_.

By recalling Lemmas 2 and 3, one sees that(u_ˉ,v)_ˉ is necessarily a critical pair of KM. The proof of the reverse implication(b)⇒(a)is omitted since it offers

no difficulty.

Remark 2. As one sees from Theorem 5, a critical pair(Aˉ,Bˉ)ofM_{is formed} necessarily with a couple of commuting matrices. The components of the vectors

ˉ

u,v_ˉproduced by the simultaneous diagonalization (29) may not be arranged in a similar order, i.e., there may be no common permutation matrix5_∈5nsuch that 5u ₌u↑_and_5v₌ _v↑_{. This explains why we cannot infer that}_(λ(_A_ˉ_{), λ(}_B_ˉ₎₎ is a critical pair of KM.

Remark 3. Recall that the critical angles of a closed convex cone can be sepa-rated into two disjoint groups: Nash angles and ordinary critical angles. Suppose that(Aˉ,Bˉ)is a Nash angular equilibrium ofM_{, i.e., one has the combination of} the following two conditions:

ˉ

B minimizes_{h ˉ}A, _{∙ i}overM_∩₆₍S_n_), ₍₃₃₎

ˉ

(23)

In view of Lemma 4, either one of these conditions imply that AˉBˉ _{= ˉ}BA.ˉ

By pluggingAˉ ₌Qdiag(u_ˉ)QT andBˉ ₌ Qdiag(v)_ˉ QT into (33)-(34) and work-ing out the details, one concludes that(u_ˉ,v)_ˉ is necessarily a Nash angular equi-librium of KM.

REFERENCES

[1] B. Dacorogna and P. Maréchal, Convex S O(N)_×S O(n)-invariant functions and refinements of von Neumann’s inequality. Ann. Fac. Sci. Toulouse Math., 16 (2007), 71–89.

[2] C. Davis, All convex invariant functions of hermitian matrices. Archiv der Math., 8 (1957), 276–278.

[3] Z.Q. Feng, M. Hjiaj, G. de Saxcé and Z. Mróz, Effect of frictional anisotropy on the quasistatic motion of a deformable solid sliding on a planar surface. Comput. Mech., 37 (2006), 349–361. [4] A. Iusem and A. Seeger, Measuring the degree of pointedness of a closed convex cone: a

metric approach. Math. Nachrichten, 279 (2006), 599–618.

[5] A. Iusem and A. Seeger, On vectors achieving the maximal angle of a convex cone. Math.

Programming, 104 (2005), 501–523.

[6] A. Iusem and A. Seeger, On convex cones with infinitely many critical angles. Optimization, 56 (2007), 115–128.

[7] A. Iusem and A. Seeger, Searching for critical angles in a convex cone. Math. Programming, 2007, in press.

[8] A. Iusem and A. Seeger, Antipodal pairs, critical pairs, and Nash angular equilibria in convex cones. Optimization Methods and Software, submitted.

[9] A. Iusem and A. Seeger, Antipodality in convex cones and distance to unpointedness, May 2006, submitted.

[10] A. Lewis, Convex analysis of on the Hermitian matrices. SIAM J. Optim., 6 (1996), 164–177. [11] J.E. Martinez-Legaz, On convex and quasiconvex spectral functions. Proceedings of the 2nd Catalan Days on Applied Mathematics (Odeillo, 1995), 199-208, Collect. Études, Presses Univ. Perpignan, Perpignan, 1995.

[12] M. Overton and R.S. Womersley, Optimality conditions and duality theory for minimizing sums of the largest eigenvalues of symmetric matrices. Math. Programming, 62 (1993) 321–357.

[13] J. Peña and J. Renegar, Computing approximate solutions for conic systems of constraints.

Math. Programming, 87 (2000), 351–383.

[14] G. Saporta, Probabilités, Analyse des Donnés et Statistique, Editions Technip, Paris, 1990. [15] A. Seeger, Convex analysis of spectrally defined matrix functions. SIAM J. Optim., 7 (1997),

(24)

[16] D.S. Wang and L.N. Medgyesi-Mitschang, Electromagnetic scattering from finite circular and elliptic cones. IEEE Trans. Antennas and Propagation, 33 (1985), 488–497.