
Academic year: 2023


Introduction to Convex Analysis Microeconomics II - Tutoring Class

Professor: V. Filipe Martins-da-Rocha TA: Cinthia Konichi

April 2010

1 Basic Concepts and Results

This is a first look at the basic results of convex analysis that will be used extensively during the course. Convexity is a central concept in optimization theory: when it is assumed, necessary conditions for optimality become sufficient conditions.

For an introduction to convex analysis as applied to optimization theory see Florenzano and Le Van (2001) and Izmailov and Solodov (2005). Borwein and Lewis (2000) and Bertsekas et al. (2003) are other useful references. For a thorough treatment see Rockafellar (1970).

We begin by characterizing convex sets. Let E be a real vector space. Then the following definitions are appropriate:

Definition 1 A set $D \subset E$ is said to be convex if, for any $x, y \in D$, the set $\{\alpha x + (1-\alpha) y : \alpha \in [0,1]\}$ is contained in $D$.

The point $\alpha x + (1-\alpha) y$ is called a convex combination of $x$ and $y$ (with parameter $\alpha$). By induction, it is easily seen that $D$ is convex if and only if $\sum_{k=1}^{p} \lambda_k x_k \in D$ for every finite set $\{x_1, \ldots, x_p\}$ of $p$ elements of $D$ and for every system of $p$ nonnegative real coefficients $\{\lambda_1, \ldots, \lambda_p\}$ such that $\sum_{k=1}^{p} \lambda_k = 1$. Hence, a subset $D$ of $E$ is convex if and only if every convex combination of finitely many elements of $D$ belongs to $D$.

Some examples of convex sets are the open (or closed) balls in a normed vector space, line segments, the vector space itself and the empty set.
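The definition can be checked numerically on one of these examples. The following is a minimal sketch, assuming $E = \mathbb{R}^2$ with the Euclidean norm; the helper names `convex_combination` and `in_closed_unit_ball` are illustrative and not part of the notes. It samples convex combinations of two points of the closed unit ball and confirms that they stay in the ball.

```python
import math


def convex_combination(x, y, alpha):
    """Return alpha*x + (1 - alpha)*y for points given as tuples."""
    return tuple(alpha * xi + (1 - alpha) * yi for xi, yi in zip(x, y))


def in_closed_unit_ball(p, tol=1e-12):
    """Membership test for the closed Euclidean unit ball of R^2."""
    return math.hypot(p[0], p[1]) <= 1.0 + tol


# Two points of the ball (assumed example data).
x, y = (1.0, 0.0), (0.0, -1.0)

# Sample the segment {alpha*x + (1-alpha)*y : alpha in [0, 1]}.
points = [convex_combination(x, y, k / 10) for k in range(11)]
all_inside = all(in_closed_unit_ball(p) for p in points)
print(all_inside)
```

A finite sample can only fail to find a counterexample; it is an illustration of Definition 1, not a proof of convexity.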

EPGE-FGV

Figure 1: A convex and a non-convex set

Lemma 2 Let $\Lambda$ be an arbitrary set and $\{D_\lambda\}_{\lambda \in \Lambda}$ be a family of convex subsets of $E$. Then $D = \bigcap_{\lambda \in \Lambda} D_\lambda$ is convex.

Proof: Let $x \in D$ and $y \in D$. Then $x \in D_\lambda$ and $y \in D_\lambda$ for all $\lambda \in \Lambda$. Since the sets $D_\lambda$, $\lambda \in \Lambda$, are convex, $\alpha x + (1-\alpha) y \in D_\lambda$ for any $\alpha \in [0,1]$ and all $\lambda \in \Lambda$. Hence, $\alpha x + (1-\alpha) y \in D$, that is, $D$ is convex.

Thus, in an optimization problem, for example, the set defined by an arbitrary number of convex constraints on the same space is convex.

With any set $C \subset E$ we can associate another set, called the convex hull of $C$ and denoted by $\operatorname{co} C$, which is the intersection of all convex subsets of $E$ containing $C$.

By Lemma 2, the convex hull is convex. Moreover, it is the smallest convex subset of $E$ containing $C$.

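Lemma 3 Let $D \subset E$ be a convex set. Then $\operatorname{cl} D$ is convex.¹

Proof:² Let $x, y \in \operatorname{cl} D$ and $\alpha \in (0,1)$. Then there are sequences $(x_n)_{n \in \mathbb{N}}$ and $(y_n)_{n \in \mathbb{N}}$ in $D$ such that $x_n \to x$ and $y_n \to y$. By the convexity of $D$, $\alpha x_n + (1-\alpha) y_n \in D$, $\forall n \in \mathbb{N}$. Since $\alpha x_n + (1-\alpha) y_n \to \alpha x + (1-\alpha) y$, it follows that $\alpha x + (1-\alpha) y \in \operatorname{cl} D$.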

Figure 2: The closure and the convex hull of a set C

¹ Here $\operatorname{cl} D$ denotes the closure of $D$. Recall that the closure of an arbitrary set $B$ in a normed vector space is the set of limits of convergent sequences of points of $B$.

² For convenience, assume that $E$ is a normed real vector space and take the usual notion of convergence.

Definition 4 Let $D \subset E$ be a convex set. The function $f : D \to \mathbb{R}$ is convex in $D$ when, for any $x \in D$, $y \in D$ and $\alpha \in [0,1]$, we have:

$$f(\alpha x + (1-\alpha) y) \le \alpha f(x) + (1-\alpha) f(y)$$

The function $f$ is said to be strictly convex if the inequality above is strict for all $x \neq y$ and $\alpha \in (0,1)$.

Figure 3: An illustration of a convex function
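The inequality in Definition 4 is easy to test on a grid of parameters. The sketch below assumes $E = \mathbb{R}$ and uses $f(t) = t^2$ as a convex example and $f(t) = -t^2$ as a non-convex one; the function name `satisfies_convexity` and the test points are illustrative choices, not taken from the notes.

```python
def satisfies_convexity(f, x, y, alphas, tol=1e-12):
    """Check f(a*x + (1-a)*y) <= a*f(x) + (1-a)*f(y) for each a in alphas."""
    return all(
        f(a * x + (1 - a) * y) <= a * f(x) + (1 - a) * f(y) + tol
        for a in alphas
    )


alphas = [k / 20 for k in range(21)]  # grid on [0, 1]


def square(t):
    return t * t


def neg_square(t):
    return -t * t


print(satisfies_convexity(square, -1.0, 3.0, alphas))      # convex example
print(satisfies_convexity(neg_square, -1.0, 1.0, alphas))  # fails at alpha = 0.5
```

For `neg_square` the midpoint gives $f(0) = 0 > -1$, so the inequality is violated, in line with the fact that $-t^2$ is concave rather than convex.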

Definition 5 Let D ⊂ E be a convex set. The function f : D → R is concave in D if (−f) is convex in D.

The next lemma gives an equivalent characterization of concavity:

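Lemma 6 Let $D \subset E$ be a convex set. Then $f : D \to \mathbb{R}$ is concave if and only if the set

$$\operatorname{hypo} f := \{(x, \mu) \in D \times \mathbb{R} : f(x) \ge \mu\}$$

is convex. This set is called the hypograph of $f$.

Proof: First, suppose that $\operatorname{hypo} f$ is convex. Let $x \in D$ and $y \in D$. Clearly, $(x, f(x)) \in \operatorname{hypo} f$ and $(y, f(y)) \in \operatorname{hypo} f$. By the convexity of $\operatorname{hypo} f$, for all $\alpha \in [0,1]$, we have:

$$(\alpha x + (1-\alpha) y,\ \alpha f(x) + (1-\alpha) f(y)) = \alpha (x, f(x)) + (1-\alpha)(y, f(y)) \in \operatorname{hypo} f$$

By the definition of $\operatorname{hypo} f$, we have:

$$f(\alpha x + (1-\alpha) y) \ge \alpha f(x) + (1-\alpha) f(y)$$

So $f$ is concave.

Conversely, suppose now that $f$ is concave. Let $(x, c_1) \in \operatorname{hypo} f$ and $(y, c_2) \in \operatorname{hypo} f$. Since $f(x) \ge c_1$ and $f(y) \ge c_2$, by the concavity of $f$, for all $\alpha \in [0,1]$ we have:

$$f(\alpha x + (1-\alpha) y) \ge \alpha f(x) + (1-\alpha) f(y) \ge \alpha c_1 + (1-\alpha) c_2$$

which means that:

$$\alpha (x, c_1) + (1-\alpha)(y, c_2) = (\alpha x + (1-\alpha) y,\ \alpha c_1 + (1-\alpha) c_2) \in \operatorname{hypo} f$$

Hence, $\operatorname{hypo} f$ is convex.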
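As an illustration of Lemma 6, the following sketch takes the concave function $f(x) = -x^2$ on $D = \mathbb{R}$ (an assumed example) and checks that convex combinations of two points of its hypograph remain in the hypograph. The names `f`, `in_hypograph`, `p` and `q` are hypothetical.

```python
def f(x):
    """A concave function on the real line (assumed example)."""
    return -x * x


def in_hypograph(x, mu, tol=1e-12):
    """(x, mu) belongs to hypo f when f(x) >= mu."""
    return f(x) >= mu - tol


# Two points of hypo f: f(-1) = -1 >= -2 and f(2) = -4 >= -5.
p, q = (-1.0, -2.0), (2.0, -5.0)

combos = [
    (a * p[0] + (1 - a) * q[0], a * p[1] + (1 - a) * q[1])
    for a in (k / 10 for k in range(11))
]
all_in_hypograph = all(in_hypograph(x, mu) for x, mu in combos)
print(all_in_hypograph)
```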

We say that

$$\max f(x) \quad \text{subject to } x \in D \qquad (1)$$

is a convex maximization problem when $D \subset E$ is a convex set and $f : D \to \mathbb{R}$ is concave in $D$. The importance of the convexity assumption can be seen in the following result.

Theorem 7 Let $D \subset E$ be a convex set and $f : D \to \mathbb{R}$ a concave function in $D$. Then every local maximum of problem (1) is global. Moreover, the set of maximizers of the problem is convex. If $f$ is strictly concave, the problem has at most one maximizer.

Proof: Suppose by way of contradiction that $\bar{x} \in D$ is a local maximizer that is not global. Then there exists $y \in D$ such that $f(y) > f(\bar{x})$. Define $x(\alpha) = \alpha y + (1-\alpha) \bar{x}$. By the convexity of $D$, $x(\alpha) \in D$ for all $\alpha \in [0,1]$. By the concavity of $f$, for all $\alpha \in (0,1]$, we have:

$$f(x(\alpha)) \ge \alpha f(y) + (1-\alpha) f(\bar{x}) = f(\bar{x}) + \alpha (f(y) - f(\bar{x})) > f(\bar{x})$$

Taking $\alpha > 0$ sufficiently small, we can guarantee that the point $x(\alpha)$ is arbitrarily close to $\bar{x}$ while $f(x(\alpha)) > f(\bar{x})$. This contradicts the fact that $\bar{x}$ is a local maximizer of problem (1). Hence, any local solution must be a global solution.

Let $S \subset D$ be the set of (global) maximizers and $\bar{v} \in \mathbb{R}$ the optimal value of the problem. Note that $f(x) = \bar{v}$ for any $x \in S$. For any $x \in S$, $\bar{x} \in S$ and $\alpha \in [0,1]$, by the concavity of $f$, we have:

$$f(\alpha x + (1-\alpha) \bar{x}) \ge \alpha f(x) + (1-\alpha) f(\bar{x}) = \alpha \bar{v} + (1-\alpha) \bar{v} = \bar{v}$$

which, since $\bar{v}$ is the optimal value, implies that $f(\alpha x + (1-\alpha) \bar{x}) = \bar{v}$ and then $\alpha x + (1-\alpha) \bar{x} \in S$.

Suppose now that $f$ is strictly concave and that there exist $x \in S$ and $\bar{x} \in S$ with $x \neq \bar{x}$. Let $\alpha \in (0,1)$. Since $x$ and $\bar{x}$ are global maximizers and $\alpha x + (1-\alpha) \bar{x} \in D$ by the convexity of $D$, it follows that:

$$f(\alpha x + (1-\alpha) \bar{x}) \le f(x) = f(\bar{x}) = \bar{v}$$

However, as $f$ is strictly concave:

$$f(\alpha x + (1-\alpha) \bar{x}) > \alpha f(x) + (1-\alpha) f(\bar{x}) = \alpha \bar{v} + (1-\alpha) \bar{v} = \bar{v} \qquad (2)$$

which is a contradiction.
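The role of concavity in Theorem 7 can be seen on a small numerical example. The sketch below is only a discrete proxy: it works on a grid of $[-2, 3]$ and calls a grid point a "local maximizer" when its value is at least that of its grid neighbours. The functions `concave` and `nonconcave` and the helper `local_maximizers` are assumptions made for illustration.

```python
def local_maximizers(f, grid):
    """Grid points whose value is >= the values at both grid neighbours."""
    result = []
    for i, x in enumerate(grid):
        left = f(grid[i - 1]) if i > 0 else float("-inf")
        right = f(grid[i + 1]) if i < len(grid) - 1 else float("-inf")
        if f(x) >= left and f(x) >= right:
            result.append(x)
    return result


grid = [-2 + 0.25 * k for k in range(21)]  # -2.0, -1.75, ..., 3.0


def concave(x):
    """Strictly concave: a single (global) maximizer at x = 1."""
    return -(x - 1) ** 2


def nonconcave(x):
    """Not concave: two local maximizers with different values."""
    return -(x * x - 1) ** 2 + 0.5 * x


print(local_maximizers(concave, grid))     # one point
print(local_maximizers(nonconcave, grid))  # two points, only one is global
```

For `nonconcave` the grid point $-1$ is a local maximizer with value $-0.5$, while the global maximum on the grid is $0.5$ at $x = 1$; this is exactly the situation that concavity rules out.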

2 Projection and Convex Sets

Henceforth, let $E$ be a real vector space equipped with an inner product $\langle \cdot, \cdot \rangle : E \times E \to \mathbb{R}$, and let $\|\cdot\| : E \to \mathbb{R}_+$ be the norm generated by this inner product.

Definition 8 Let $B \subset E$ be a nonempty set and let $x_0 \in E$ be an arbitrary point. The distance from the point $x_0$ to the set $B$ is given by the function $d_B : E \to \mathbb{R}_+$, where

$$d_B(x_0) := \inf_{x \in B} \|x - x_0\|$$

The set $P_B(x_0) := \{x \in B : \|x - x_0\| = d_B(x_0)\}$ is called the projection of $x_0$ on $B$.

Note that, since $B \neq \emptyset$ and $\|\cdot\| \ge 0$, this function is well defined. It is easy to see that $d_B$ is continuous.³ If $E$ is finite dimensional and $B$ is a closed set, the infimum is attained. Indeed, we can take a sequence in $B$ whose distance from $x_0$ converges to $d_B(x_0)$. This sequence of distances is bounded, which implies that the sequence in $B$ is bounded. Therefore, by the Bolzano-Weierstrass Theorem,⁴ it admits a convergent subsequence, whose limit belongs to $B$ because $B$ is closed. By continuity of the norm, the distance from this limit to $x_0$ equals $d_B(x_0)$.

Although under closedness alone the minimizer is not necessarily unique, adding convexity guarantees uniqueness. The geometrical intuition ($E = \mathbb{R}^2$) for this result is that the distance is realized along a path which is orthogonal to the set. Therefore, two different projections imply two different paths that are orthogonal to the set, and these two points together with $x_0$ form a triangle. However, the convexity of $B$ implies that the line segment joining the projections is in $B$. Hence we have a non-degenerate triangle with two right angles, a contradiction.
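Both situations can be illustrated in $\mathbb{R}^2$. The sketch below assumes two example sets that are not in the notes: a closed but non-convex two-point set, for which the projection of a suitable point has two elements, and the closed unit disc, which is closed and convex and therefore has a single projection. The helper names are hypothetical; the formula used for the disc (rescaling a point outside the disc to norm one) is the standard closed form for that set.

```python
import math


def dist(p, q):
    """Euclidean distance in R^2."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def project_on_finite_set(x0, B):
    """Return (d_B(x0), P_B(x0)) for a finite set B, as a distance and a list."""
    d = min(dist(x, x0) for x in B)
    return d, [x for x in B if math.isclose(dist(x, x0), d)]


def project_on_unit_disc(x0):
    """Projection on the closed unit disc (closed and convex)."""
    n = math.hypot(x0[0], x0[1])
    return x0 if n <= 1.0 else (x0[0] / n, x0[1] / n)


# Closed, non-convex set: two isolated points at equal distance from x0.
d, proj = project_on_finite_set((0.0, 1.0), [(-1.0, 0.0), (1.0, 0.0)])
print(d, proj)

# Closed and convex set: the projection is a single point.
p = project_on_unit_disc((3.0, 4.0))
print(p)
```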

Let’s do the formal statement:

Theorem 9 Let $E$ be a finite dimensional real vector space, with a norm defined by an inner product. Let $D \subset E$ be a closed and convex set, and fix $x_0 \in E$. Then

1. $\bar{x} \in P_D(x_0)$ if and only if $\bar{x} \in D$ and $\langle \bar{x} - x_0, \bar{x} - y \rangle \le 0$, $\forall y \in D$;

2. there is a unique $x^\star \in D$ such that $\|x^\star - x_0\| = d_D(x_0)$.

Figure 4: Geometrical intuition for the unique minimum

³ Let $x, y \in E$. By the triangle inequality, $\|x - x'\| \le \|x - y\| + \|y - x'\|$, $\forall x' \in E$, so $\inf_{x' \in B} \|x - x'\| \le \|x - y\| + \inf_{x' \in B} \|y - x'\|$. On the other hand, $\|y - x'\| \le \|y - x\| + \|x - x'\|$, $\forall x' \in E$. Therefore $|d_B(x) - d_B(y)| \le \|x - y\|$, i.e., $d_B$ is a Lipschitz function (and thus continuous).

⁴ Note that the assertion that a bounded sequence has a convergent subsequence may fail in an infinite dimensional space. For example, in a space of real sequences, consider the sequence $(x_n)_{n \in \mathbb{N}}$ such that $x_n = (y_{n,t})_{t \in \mathbb{N}}$ with $y_{n,t} = 1$ if $t = n$ and $0$ otherwise. It is easily seen that $(x_n)_{n \in \mathbb{N}}$ is bounded; however, it has no convergent subsequence (under the usual definition of convergence).

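Proof: Let us prove the first assertion and then use it to demonstrate the second one. Let $\bar{x} \in P_D(x_0)$. Then, since $D$ is convex, for any $\alpha \in (0,1)$ and any $y \in D \setminus \{\bar{x}\}$ the point $(1-\alpha)\bar{x} + \alpha y$ belongs to $D$, so

$$\|\bar{x} - x_0\| \le \|(1-\alpha)\bar{x} + \alpha y - x_0\| \ \Rightarrow\ \|\bar{x} - x_0\|^2 - \|(1-\alpha)\bar{x} + \alpha y - x_0\|^2 \le 0$$

$$\langle \bar{x} - x_0, \bar{x} - x_0 \rangle - \langle \bar{x} - x_0 - \alpha\bar{x} + \alpha y,\ \bar{x} - x_0 - \alpha\bar{x} + \alpha y \rangle \le 0$$

$$\langle \bar{x} - x_0, \bar{x} - x_0 \rangle - \langle \bar{x} - x_0,\ \bar{x} - x_0 - \alpha\bar{x} + \alpha y \rangle - \langle -\alpha\bar{x} + \alpha y,\ \bar{x} - x_0 - \alpha\bar{x} + \alpha y \rangle \le 0$$

$$\langle \bar{x} - x_0,\ \alpha\bar{x} - \alpha y \rangle - \langle -\alpha\bar{x} + \alpha y,\ \bar{x} - x_0 - \alpha\bar{x} + \alpha y \rangle \le 0$$

$$\langle \bar{x} - x_0,\ \alpha\bar{x} - \alpha y \rangle - \langle -\alpha\bar{x} + \alpha y,\ \bar{x} - x_0 \rangle - \langle -\alpha\bar{x} + \alpha y,\ -\alpha\bar{x} + \alpha y \rangle \le 0$$

$$2\alpha \langle \bar{x} - x_0, \bar{x} - y \rangle - \alpha^2 \|\bar{x} - y\|^2 \le 0$$

Dividing both sides of the last inequality by $2\alpha > 0$ and letting $\alpha \to 0$, we get $\langle \bar{x} - x_0, \bar{x} - y \rangle \le 0$.

On the other hand, let $\bar{x} \in D$ be such that $\langle \bar{x} - x_0, \bar{x} - y \rangle \le 0$, $\forall y \in D$. Note that, for any $y \in D$, $\langle \bar{x} - x_0, \bar{x} - y \rangle = \|\bar{x} - x_0\|^2 + \langle \bar{x} - x_0, x_0 - y \rangle$. By the Cauchy-Schwarz inequality we have $\langle \bar{x} - x_0, y - x_0 \rangle \le \|\bar{x} - x_0\| \|y - x_0\|$. Therefore, $\forall y \in D$,

$$0 \ge \langle \bar{x} - x_0, \bar{x} - y \rangle \ \Rightarrow\ \|\bar{x} - x_0\|^2 \le \langle \bar{x} - x_0, y - x_0 \rangle \le \|\bar{x} - x_0\| \|y - x_0\|$$

If $x_0 \in D$, then taking $y = x_0$ gives $\|\bar{x} - x_0\|^2 \le 0$, so $\bar{x} = x_0 \in P_D(x_0)$. If $x_0 \notin D$, then $\|\bar{x} - x_0\| > 0$ and dividing by it gives $\|\bar{x} - x_0\| \le \|x_0 - y\|$, $\forall y \in D$, so $\bar{x} \in P_D(x_0)$. Thus (1) is proved.

Existence of a minimizer follows from the argument given before the theorem ($D$ is closed and $E$ is finite dimensional). For uniqueness, let $x, x' \in P_D(x_0)$. Then, $\forall y \in D$, $\langle x - x_0, x - y \rangle \le 0$ and $\langle x' - x_0, x' - y \rangle \le 0$. In particular, $\langle x - x_0, x - x' \rangle \le 0$ and $\langle x' - x_0, x' - x \rangle \le 0$. Adding the two inequalities,

$$0 \ge \langle x - x_0, x - x' \rangle - \langle x' - x_0, x - x' \rangle = \langle x - x', x - x' \rangle = \|x - x'\|^2$$

Hence, $x = x'$.

Therefore, for a closed and convex $D \subset E$, it is possible to define a function $p_D : E \to D$ such that $P_D(x) = \{p_D(x)\}$, $\forall x \in E$.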
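The characterization in item 1 of Theorem 9 can be checked numerically. The sketch below assumes $D = [0,1]^2$, a closed convex box for which the projection is the componentwise clamp (a standard fact, not derived in the notes), and tests the inequality $\langle \bar{x} - x_0, \bar{x} - y \rangle \le 0$ on randomly drawn points $y \in D$. The names `project_on_box` and `check_variational_inequality` are hypothetical.

```python
import random


def inner(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))


def project_on_box(x0, lo=0.0, hi=1.0):
    """Projection on the box [lo, hi]^n: clamp each coordinate."""
    return tuple(min(max(t, lo), hi) for t in x0)


def check_variational_inequality(x0, samples=1000, seed=0, tol=1e-12):
    """Test <xbar - x0, xbar - y> <= 0 for random y in [0, 1]^n."""
    rng = random.Random(seed)
    xbar = project_on_box(x0)
    diff = tuple(a - b for a, b in zip(xbar, x0))
    for _ in range(samples):
        y = tuple(rng.uniform(0.0, 1.0) for _ in x0)
        if inner(diff, tuple(a - b for a, b in zip(xbar, y))) > tol:
            return False
    return True


print(project_on_box((2.0, -0.5)))
print(check_variational_inequality((2.0, -0.5)))
```

Random sampling cannot prove the inequality for every $y \in D$; it only fails to find a violation, which is what the theorem predicts.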

3 Separating Hyperplane Theorems

Before going to the results, let us define some objects.

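Definition 10 For $a \in E \setminus \{0\}$ and $c \in \mathbb{R}$, the set

$$H(a, c) := \{x \in E : \langle a, x \rangle = c\}$$

is said to be a hyperplane.

Note that $E$ may be written as the union of the hyperplane and two disjoint sets (the open half-spaces). That is, $E = H(a, c) \cup \{x \in E : \langle a, x \rangle < c\} \cup \{x \in E : \langle a, x \rangle > c\}$.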

Definition 11 Let $B_1, B_2 \subset E$. The hyperplane $H(a, c)$ is said to separate $B_1$ and $B_2$ if

$$\forall x \in B_1,\ \langle a, x \rangle \le c \le \langle a, y \rangle,\ \forall y \in B_2$$

If both inequalities are strict, we say that $H(a, c)$ strictly separates $B_1$ and $B_2$.

In the geometric sense, separation means that one set is on one side of the hyperplane and the other set is on the other side (see Figure 5). The next result is important for convex sets:

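Lemma 12 Let $D \subset E$ be a convex set and let $H(a, c)$ be a hyperplane such that $H(a, c) \cap D = \emptyset$. Then $D \cap \{x \in E : \langle a, x \rangle < c\} = \emptyset$ or $D \cap \{x \in E : \langle a, x \rangle > c\} = \emptyset$.

Proof: Assume by way of contradiction that there are $x', x'' \in D$ with $\langle a, x' \rangle < c$ and $\langle a, x'' \rangle > c$. Then $\bar{x} = \lambda x' + (1-\lambda) x'' \in H(a, c)$, where

$$\lambda = \frac{\langle a, x'' \rangle - c}{\langle a, x'' \rangle - \langle a, x' \rangle} \in (0,1)$$

By convexity, $\bar{x} \in D$. Therefore, $\bar{x} \in D \cap H(a, c)$, a contradiction.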
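The choice of $\lambda$ in this proof can be verified on a concrete case. The sketch below assumes $E = \mathbb{R}^2$, the hyperplane $H(a, c)$ with $a = (1, 1)$ and $c = 1$, and two points on opposite sides of it; all of these values, and the function name `crossing_point`, are illustrative.

```python
def inner(u, v):
    """Euclidean inner product."""
    return sum(p * q for p, q in zip(u, v))


def crossing_point(a, c, x1, x2):
    """Given <a, x1> < c < <a, x2>, return (lam, xbar) with <a, xbar> = c.

    lam follows the formula of the proof:
    lam = (<a, x2> - c) / (<a, x2> - <a, x1>).
    """
    s1, s2 = inner(a, x1), inner(a, x2)
    lam = (s2 - c) / (s2 - s1)
    xbar = tuple(lam * p + (1 - lam) * q for p, q in zip(x1, x2))
    return lam, xbar


a, c = (1.0, 1.0), 1.0
lam, xbar = crossing_point(a, c, (0.0, 0.0), (2.0, 2.0))
print(lam, xbar, inner(a, xbar))
```

Here $\langle a, x' \rangle = 0$ and $\langle a, x'' \rangle = 4$, so $\lambda = 3/4$ and the segment crosses the hyperplane at $(0.5, 0.5)$.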

3.1 Support Theorem

The next lemma states that a point that does not belong to the closure of a convex set can be strictly separated from that set.

Lemma 13 (Minkowski Lemma) Let $D \subset E$ be a non-empty convex set, where $E$ is finite dimensional. If $x \notin \operatorname{cl} D$, then there are $a \in E \setminus \{0\}$ and $c \in \mathbb{R}$ such that $x \in H(a, c)$ and $\langle a, y \rangle > c$, $\forall y \in D$.

Figure 5: Strict and non-strict separability

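Proof: By Lemma 3, we know that $\operatorname{cl} D$ is convex. By Theorem 9, $x$ has a unique projection $\bar{x}$ on $\operatorname{cl} D$, that is, $P_{\operatorname{cl} D}(x) = \{\bar{x}\}$. Let $a = \bar{x} - x$ ($a \neq 0$ since $x \notin \operatorname{cl} D$) and $c = \langle a, x \rangle$, so that $x \in H(a, c)$. By Theorem 9, $\langle \bar{x} - x, \bar{x} - y \rangle \le 0$, $\forall y \in D$, that is, $\langle a, y \rangle = \langle \bar{x} - x, y \rangle \ge \langle \bar{x} - x, \bar{x} \rangle$. Therefore, $\langle a, y \rangle \ge \langle \bar{x} - x, \bar{x} \rangle = \|\bar{x} - x\|^2 + \langle \bar{x} - x, x \rangle = \|\bar{x} - x\|^2 + c > c$, since $\bar{x} \neq x$.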
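The construction in this proof is explicit enough to compute. The sketch below assumes that $D$ is the closed unit disc of $\mathbb{R}^2$ (so $\operatorname{cl} D = D$ and the projection of an outside point is its rescaling to norm one) and builds $a = \bar{x} - x$ and $c = \langle a, x \rangle$ for a point $x$ outside the disc. The function name and the sample of boundary points are hypothetical.

```python
import math


def inner(u, v):
    """Euclidean inner product."""
    return sum(p * q for p, q in zip(u, v))


def minkowski_hyperplane_for_unit_disc(x):
    """Return (a, c) with a = xbar - x and c = <a, x>, xbar the projection of x."""
    n = math.hypot(x[0], x[1])
    if n <= 1.0:
        raise ValueError("x must lie outside the closed unit disc")
    x_bar = (x[0] / n, x[1] / n)
    a = (x_bar[0] - x[0], x_bar[1] - x[1])
    return a, inner(a, x)


a, c = minkowski_hyperplane_for_unit_disc((2.0, 0.0))

# Boundary points of the disc: the inequality <a, y> > c is tightest there.
samples = [
    (math.cos(2 * math.pi * k / 360), math.sin(2 * math.pi * k / 360))
    for k in range(360)
]
strictly_on_one_side = all(inner(a, y) > c for y in samples)
print(a, c, strictly_on_one_side)
```

With $x = (2, 0)$ the projection is $(1, 0)$, so $a = (-1, 0)$ and $c = -2$, while $\langle a, y \rangle \ge -1$ on the disc.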

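Definition 14 A hyperplane $H(a, c)$ is said to support a set $B$ if $H(a, c) \cap \operatorname{fr} B \neq \emptyset$⁵ and either $B \cap \{x \in E : \langle a, x \rangle < c\} = \emptyset$ or $B \cap \{x \in E : \langle a, x \rangle > c\} = \emptyset$. If $x_0 \in H(a, c) \cap \operatorname{fr} B$, it is said that $H(a, c)$ supports $B$ at $x_0$.

Figure 6: An illustration of Minkowski Lemma

⁵ Here $\operatorname{fr} B$ denotes the boundary of $B$. Recall that $x \in E$ belongs to $\operatorname{fr} B$ if and only if there are sequences in $B$ and in $B^c$ which both converge to $x$. Thus, $\operatorname{fr} B \subset \operatorname{cl} B$.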

That is, a hyperplane supports a set if it contains at least one boundary point of the set and all the points of the set are on the same side of the hyperplane. Note that a set can admit more than one supporting hyperplane at the same point.

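Theorem 15 (Support Theorem) Let $E$ be finite dimensional and $D \subset E$ be a non-empty convex set. If $x \in \operatorname{fr} D$, then there are $a \in E \setminus \{0\}$ and $c \in \mathbb{R}$ such that $x \in H(a, c)$ and $\langle a, y \rangle \ge c$, $\forall y \in D$, i.e., the hyperplane $H(a, c)$ supports $D$ at $x$.

Proof: Since $x \in \operatorname{fr} D$, there is $(x_n)_{n \in \mathbb{N}}$ such that $x_n \to x$ and $x_n \notin \operatorname{cl} D$, $\forall n \in \mathbb{N}$. By the Minkowski Lemma, for each $n \in \mathbb{N}$, there are $a_n \in E \setminus \{0\}$ and $c_n \in \mathbb{R}$ such that $x_n \in H(a_n, c_n)$ and $\langle a_n, y \rangle > c_n$, $\forall y \in D$. Note that

$$\langle a_n, x_n \rangle = c_n \ \Rightarrow\ \left\langle \frac{a_n}{\|a_n\|}, x_n \right\rangle = \frac{c_n}{\|a_n\|}$$

Since $(x_n)_{n \in \mathbb{N}}$ is convergent, it is bounded. Furthermore, $\left( \frac{a_n}{\|a_n\|} \right)_{n \in \mathbb{N}}$ is bounded. Then, by the Cauchy-Schwarz inequality, $\left( \frac{c_n}{\|a_n\|} \right)_{n \in \mathbb{N}}$ is bounded. Therefore, there is an infinite subset $N' \subset \mathbb{N}$ such that $\left( \frac{c_n}{\|a_n\|} \right)_{n \in N'}$ and $\left( \frac{a_n}{\|a_n\|} \right)_{n \in N'}$ are convergent. Let $a \in E$ and $c \in \mathbb{R}$ be the respective limits of those subsequences. Since $\left\| \frac{a_n}{\|a_n\|} \right\| = 1$, $\forall n \in \mathbb{N}$, we have $\|a\| = 1$, so $a \in E \setminus \{0\}$. By continuity of the inner product, $\langle a, x \rangle = c$, i.e., $x \in H(a, c)$. On the other hand, since $\left\langle \frac{a_n}{\|a_n\|}, y \right\rangle > \frac{c_n}{\|a_n\|}$, $\forall y \in D$, $\forall n \in \mathbb{N}$, passing to the limit gives $\langle a, y \rangle \ge c$.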

3.2 Separating Theorems

Finally, the most important result of this material:

Theorem 16 (Separating Hyperplane) Let $E$ be finite dimensional and $D_1, D_2 \subset E$ be disjoint non-empty convex sets. Then there are $a \in E \setminus \{0\}$ and $c \in \mathbb{R}$ such that

$$\forall x \in D_1,\ \langle a, x \rangle \le c \le \langle a, y \rangle,\ \forall y \in D_2$$

i.e., there is a hyperplane $H(a, c)$ which separates $D_1$ and $D_2$.

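Proof: Define $D = D_2 - D_1 = \{y - x : y \in D_2,\ x \in D_1\}$. Then $D$ is convex and $0 \notin D$ because $D_1 \cap D_2 = \emptyset$. If $0 \notin \operatorname{cl} D$, we can apply the Minkowski Lemma. Otherwise, $0 \in \operatorname{fr} D$ and we apply the Support Theorem. In both cases, there exists $a \in E \setminus \{0\}$ such that $\langle a, z \rangle \ge 0$, $\forall z \in D$. Therefore,

$$\forall x \in D_1,\ \langle a, x \rangle \le \langle a, y \rangle,\ \forall y \in D_2$$

In particular, the function $\langle a, \cdot \rangle$ is bounded from below on $D_2$ and bounded from above on $D_1$. Then,

$$\forall x \in D_1,\ \langle a, x \rangle \le \sup_{x' \in D_1} \langle a, x' \rangle \le \inf_{x'' \in D_2} \langle a, x'' \rangle \le \langle a, y \rangle,\ \forall y \in D_2$$

Since $D_1$ and $D_2$ are non-empty, the supremum and the infimum above are finite, and defining

$$c = \frac{\sup_{x' \in D_1} \langle a, x' \rangle + \inf_{x'' \in D_2} \langle a, x'' \rangle}{2}$$

the result holds.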

If, in addition to being convex, the sets are closed and at least one of them is compact, then we have strict separation. The intuition is that under compactness the sets cannot be, at the same time, arbitrarily near and disjoint.

Theorem 17 (Strict Separating Hyperplane) Let $E$ be finite dimensional and $D_1, D_2 \subset E$ be disjoint, non-empty, closed and convex sets. In addition, assume that $D_1$ is compact. Then there are $a \in E \setminus \{0\}$ and $b, c \in \mathbb{R}$ such that

$$\forall x \in D_1,\ \langle a, x \rangle \le b < c \le \langle a, y \rangle,\ \forall y \in D_2$$

i.e., there is a hyperplane which strictly separates $D_1$ and $D_2$ (for instance $H(a, (b+c)/2)$).

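Proof: Since $d_{D_2}$ is continuous and $D_1$ is compact, by the Weierstrass Theorem

$$\exists x^\star \in \arg\min \{d_{D_2}(x) : x \in D_1\}$$

As $D_1$ is compact, $D_2$ is closed and $D_1 \cap D_2 = \emptyset$, we have $d_{D_2}(x^\star) > 0$. Let $y^\star = p_{D_2}(x^\star)$ be the projection of $x^\star$ on $D_2$ and define $a = y^\star - x^\star \neq 0$.

By Theorem 9, we have $0 \ge \langle y^\star - x^\star, y^\star - y \rangle$, $\forall y \in D_2$, which implies $\langle a, y \rangle \ge \langle a, y^\star \rangle$, $\forall y \in D_2$. Define $c_2 = \langle a, y^\star \rangle$. Note that, for any $x \in D_1$, we have

$$\|y^\star - x\| \ge d_{D_2}(x) \ge d_{D_2}(x^\star) = \|x^\star - p_{D_2}(x^\star)\| = \|x^\star - y^\star\|$$

so $x^\star$ is the projection of $y^\star$ on $D_1$. Hence, again by Theorem 9, we have $\langle -a, x \rangle \ge \langle -a, x^\star \rangle$, $\forall x \in D_1$. Define $c_1 = \langle a, x^\star \rangle$, so that $\langle a, x \rangle \le c_1$ for all $x \in D_1$. Note that $c_2 - c_1 = \|y^\star - x^\star\|^2 > 0$. Therefore, let $b = c_1 + (c_2 - c_1)/4$ and $c = c_1 + (c_2 - c_1)/2$, so that $c_1 < b < c < c_2$. Then, we have

$$\forall x \in D_1,\ \langle a, x \rangle \le b < c \le \langle a, y \rangle,\ \forall y \in D_2$$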
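The steps of this proof can be traced on two disjoint closed discs in $\mathbb{R}^2$, where the closest pair of points lies on the segment joining the centres (a standard fact assumed here, not derived in the notes). In this sketch $b$ and $c$ are taken strictly between $c_1 = \langle a, x^\star \rangle$ and $c_2 = \langle a, y^\star \rangle$, namely $b = c_1 + (c_2 - c_1)/4$ and $c = c_1 + (c_2 - c_1)/2$; this choice, the function name `separate_discs` and the example data are assumptions of the sketch.

```python
import math


def inner(u, v):
    """Euclidean inner product."""
    return sum(p * q for p, q in zip(u, v))


def separate_discs(p, r, q, s):
    """Strictly separate the closed discs D1 = B(p, r) and D2 = B(q, s).

    Returns (a, b, c) with <a, x> <= b < c <= <a, y> for x in D1, y in D2.
    """
    gap = math.hypot(q[0] - p[0], q[1] - p[1])
    if gap <= r + s:
        raise ValueError("the discs must be disjoint")
    u = ((q[0] - p[0]) / gap, (q[1] - p[1]) / gap)   # unit vector from p to q
    x_star = (p[0] + r * u[0], p[1] + r * u[1])      # point of D1 closest to D2
    y_star = (q[0] - s * u[0], q[1] - s * u[1])      # projection of x_star on D2
    a = (y_star[0] - x_star[0], y_star[1] - x_star[1])
    c1, c2 = inner(a, x_star), inner(a, y_star)
    b = c1 + (c2 - c1) / 4
    c = c1 + (c2 - c1) / 2
    return a, b, c


a, b, c = separate_discs((0.0, 0.0), 1.0, (4.0, 0.0), 1.0)

angles = [2 * math.pi * k / 360 for k in range(360)]
d1_ok = all(inner(a, (math.cos(t), math.sin(t))) <= b for t in angles)
d2_ok = all(inner(a, (4.0 + math.cos(t), math.sin(t))) >= c for t in angles)
print(a, b, c, d1_ok, d2_ok)
```

For these discs $x^\star = (1, 0)$, $y^\star = (3, 0)$, $a = (2, 0)$, $c_1 = 2$ and $c_2 = 6$, so $c_2 - c_1 = 4 = \|y^\star - x^\star\|^2$, matching the identity used in the proof.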


4 Applications

In this section, we will give an important application of the results seen above.

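Lemma 18 Assume $E$ is finite dimensional and let $D \subset E$ be a convex set. If $f : D \to \mathbb{R}$ is concave and $x \in \operatorname{int} D$, then the following set is non-empty:

$$\partial f(x) = \{b \in E : f(y) + \langle b, y - x \rangle \le f(x),\ \forall y \in D\}$$

Proof: By Lemma 6, $\operatorname{hypo} f$ is convex. Writing the points of the hypograph with the function value first, $(f(x), x)$ is a boundary point of $\operatorname{hypo} f$. Then, by the Support Theorem, there are $a \in (\mathbb{R} \times E) \setminus \{0\}$ and $c \in \mathbb{R}$ such that $(f(x), x) \in H(a, c)$ and $\langle a, z \rangle \le c$, $\forall z \in \operatorname{hypo} f$. Let $a = (a_1, a_2)$ where $a_1 \in \mathbb{R}$ and $a_2 \in E$. Analogously, let $z = (z_1, z_2)$. For each $z \in \operatorname{hypo} f$ we have $a_1 z_1 + \langle a_2, z_2 \rangle \le c$. Since for all $v < z_1$, $(v, z_2) \in \operatorname{hypo} f$, it must be the case that $a_1 \ge 0$. Otherwise, for $v$ sufficiently negative the inequality would be violated.

Now we have to prove that $a_1 > 0$. Assume by way of contradiction that $a_1 = 0$. Since $(f(x), x) \in H(a, c)$, we have $\langle a_2, x \rangle = c$. But for $\delta > 0$ sufficiently small we have $x' = x + \delta a_2 \in D$ (because $x \in \operatorname{int} D$), and $\langle a_2, x' \rangle = \langle a_2, x \rangle + \delta \|a_2\|^2 \le c$ implies $a_2 = 0$, a contradiction because the Support Theorem guarantees that $a \neq 0$. Therefore, $a_1 > 0$. Define $b = a_2 / a_1$ and $\hat{c} = c / a_1$. Since $(f(y), y) \in \operatorname{hypo} f$ for every $y \in D$, we have $f(y) + \langle b, y \rangle \le \hat{c} = f(x) + \langle b, x \rangle$. Thus, $f(y) + \langle b, y - x \rangle \le f(x)$, i.e., $b \in \partial f(x)$.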
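A one-dimensional example makes the set $\partial f(x)$ concrete. The sketch below assumes $f(x) = -x^2$ on $D = \mathbb{R}$; with the sign convention of the definition above, $b = 2x$ belongs to $\partial f(x)$, because $f(y) + 2x(y - x) - f(x) = -(y - x)^2 \le 0$. The grid `ys` and the function names are hypothetical, and the membership test only checks finitely many points $y$.

```python
def f(x):
    """A concave function on the real line (assumed example)."""
    return -x * x


def in_superdifferential(b, x, ys, tol=1e-12):
    """Check f(y) + b*(y - x) <= f(x) on the finite sample ys."""
    return all(f(y) + b * (y - x) <= f(x) + tol for y in ys)


ys = [k / 10 for k in range(-50, 51)]  # grid on [-5, 5]
x = 1.5
b = 2 * x  # candidate element of the set at x, under this sign convention

print(in_superdifferential(b, x, ys))    # the inequality holds on the grid
print(in_superdifferential(0.0, x, ys))  # b = 0 fails, e.g. at y = 0

# x maximizes f(x') + b*x' over the grid.
best = max(ys, key=lambda y: f(y) + b * y)
print(best)
```

Note that with this convention $b$ is the negative of the derivative $f'(x) = -2x$.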

Note the implication of the Support Theorem as applied to concave functions. For each interior point of the function's domain,⁶ there is a concave programming problem whose solution is given by this point. In fact, for any $x \in \operatorname{int} D$ and $b \in \partial f(x)$ we have:

$$x \in \arg\max_{x' \in D} \{f(x') + \langle b, x' \rangle\}$$

Hence, the supporting hyperplanes give us a way to distort (rotate around a point) the graph of the objective function in such a way that any point may be turned into a global optimum. One may then think about rotating the function around a special point: a solution of a constrained optimization problem.

The Lagrange Theorem builds upon this principle, but instead of using only the hypograph of the function, it uses the convex sets defined by the constraint functions.

In this way, it is possible to create restrictions on the hyperplanes which can arise as a support at the constrained optimum. Indeed, under some conditions, we can uniquely identify such a hyperplane. Then we can solve the global optimization problem first.

⁶ If the function is assumed to be continuous over the entire domain, then the property is valid for all points. An important remark (we omit the proof): a concave function is continuous on the interior of its domain.

References

D.P. Bertsekas, A. Nedić, and A.E. Ozdaglar. Convex analysis and optimization. Athena Scientific, Belmont, Mass., 2003.

J.M. Borwein and A.S. Lewis. Convex analysis and nonlinear optimization. Springer Hong Kong, 2000.

M. Florenzano and C. Le Van. Finite dimensional convexity and optimization. Springer, 2001.

A. Izmailov and M. Solodov. Otimização, volume 1: Condições de Otimalidade, Elementos de Análise Convexa e de Dualidade. IMPA, 2005.

R.T. Rockafellar. Convex analysis. Princeton University Press, 1970.
