July 22–26, 2016 · Levent Tunçel · Semidefinite programming techniques in combinatorial optimization

(1)

Semidefinite programming techniques in

combinatorial optimization

Levent Tunçel

Dept. of Combinatorics and Optimization, Faculty of Mathematics, University of Waterloo

July 22–26, 2016

(2)

We will start with the discussion of various forms of Semidefinite Programming (SDP) problems and some necessary background (no previous background on SDPs is required).

Then, we will formulate various problems in combinatorial optimization, graph theory, and discrete mathematics in general either as SDP problems or as nonconvex optimization problems with natural and useful SDP relaxations.

We will continue with some geometric representations of graphs as they relate to SDP, more recent work on the Lovász theta body and its extensions, lift-and-project methods, and conclude with some of the more recent work in these research areas and the research area of lifted SDP-representations (or extended formulations) and some

(3)

For some positive integer $d$, let $\mathcal{F} \subseteq \{0,1\}^d$. Suppose $c \in \mathbb{R}^d$ is given. Then, the problem

$$\max \; c^\top x \quad \text{subject to: } x \in \mathcal{F}$$

is a combinatorial optimization problem.

For the purposes of this short course, the above is our definition of a combinatorial optimization problem.

(4)

How do we solve such problems?

Of course, if $\mathcal{F}$ is given as a list (as part of the input), this is trivial (we evaluate $c^\top x$ for each $x$ in the list, ...).

What if $\mathcal{F}$ is given implicitly? For example, let $G = (V,E)$ be a simple undirected graph. Then $S \subseteq V$ is a stable set in $G$ if for every $i,j \in S$, $\{i,j\} \notin E$. (No two elements in $S$ are joined by an edge in $G$.) Then we define $\mathcal{F}$ in terms of the input graph $G$:

$$\mathcal{F} := \left\{ x \in \{0,1\}^V : x \text{ is the incidence vector of a stable set in } G \right\}.$$

How hard is this problem? Stability number of $G$:

$$\alpha(G) := \max\{|S| : S \text{ is a stable set in } G\}$$
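To make the definition concrete, here is a minimal brute-force sketch of $\alpha(G)$. The function names and the 5-cycle example are illustrative assumptions, not part of the course material, and the enumeration takes exponential time, which is exactly why the question "how hard is this problem?" matters.

```python
from itertools import combinations

def is_stable(subset, edges):
    """Return True if no two nodes of `subset` are joined by an edge."""
    chosen = set(subset)
    return not any(i in chosen and j in chosen for i, j in edges)

def stability_number(nodes, edges):
    """Brute-force alpha(G): try subsets from largest to smallest."""
    for size in range(len(nodes), -1, -1):
        for subset in combinations(nodes, size):
            if is_stable(subset, edges):
                return size, set(subset)

# Hypothetical example: the 5-cycle 0-1-2-3-4-0.
c5_nodes = list(range(5))
c5_edges = [(i, (i + 1) % 5) for i in range(5)]
alpha, witness = stability_number(c5_nodes, c5_edges)
```

The empty set is always stable, so the search always returns a value.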

(5)

We can consider a convex optimization approach.

Line segment joining $u$ and $v$ in $\mathbb{R}^n$:

$$\{\lambda u + (1-\lambda) v : \lambda \in [0,1]\}.$$

Convex sets: $F \subseteq \mathbb{R}^n$ is convex if for every pair of points $u, v \in F$, the line segment joining $u$ and $v$ lies entirely in $F$.

Fact: Intersection of an arbitrary collection of convex sets is convex.

(6)

Externally: Convex Hulls:

Let $F \subseteq \mathbb{R}^n$. The intersection of all convex sets containing $F$ is called the convex hull of $F$ and is denoted by $\operatorname{conv}(F)$.

(7)

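Internally: Convex combinations of $x^{(1)}, x^{(2)}, \ldots, x^{(k)}$:

$$\sum_{i=1}^{k} \lambda_i x^{(i)},$$

for some $\lambda \in \mathbb{R}^k_+$ such that $\sum_{i=1}^{k} \lambda_i = 1$.

Extreme points of convex sets: Let $F \subseteq \mathbb{R}^n$ be a convex set. $x \in F$ is called an extreme point of $F$ if $\nexists\, u, v \in F \setminus \{x\}$ such that $x = \tfrac{1}{2}u + \tfrac{1}{2}v$.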

(8)

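function. Then the optimization problem $\max\{f(x) : x \in \mathcal{F}\}$ is equivalent to (by adding a new variable $x_{d+1}$),

$$\max\{x_{d+1} : f(x) \ge x_{d+1},\ x \in \mathcal{F},\ \ell_{d+1} \le x_{d+1} \le u_{d+1}\},$$

where $\ell_{d+1} \in \mathbb{R}$ is a lower bound on the minimum value of $f$ over $\mathcal{F}$ and $u_{d+1} \in \mathbb{R}$ is an upper bound on the maximum value of $f$ over $\mathcal{F}$.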

We used:

Theorem

d

(9)

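So, without loss of generality, we may assume that the objective function is linear. By redefining $d$, $c$ and $\mathcal{F}$, our problem becomes

$$\max\left\{c^\top x : x \in \mathcal{F}\right\},$$

for a given vector $c \in \mathbb{R}^d$, and for a given compact set $\mathcal{F} \subset \mathbb{R}^d$. Now, we have

$$\max\left\{c^\top x : x \in \mathcal{F}\right\} = \max\left\{c^\top x : x \in \operatorname{conv}(\mathcal{F})\right\}.$$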

(10)

We denote by $H$ the $d$-dimensional unit hypersphere (in the context of the next theorem this is the set of all nontrivial objective functions).

Theorem

The set of all $c \in H$ for which the optimal solution of $\max\left\{c^\top x : x \in \operatorname{conv}(\mathcal{F})\right\}$ is not unique has zero $(d-1)$-dimensional Hausdorff measure.

An implication is that, if we were to pick the objective function $c$ from the hypersphere $H$, each point in $H$ having the “same”

(11)

In case $\mathcal{F}$ is a finite set (as in combinatorial optimization), $\operatorname{conv}(\mathcal{F})$ is a polytope (the convex hull of finitely many points; equivalently, a bounded intersection of finitely many closed half spaces). Then, the situation is particularly nice.

(12)

Picture on the board...

(13)

If we are able to get our hands on a convex compact set $F$ which is tractable (that is, we can optimize a linear function over it in polynomial time) and $F = \operatorname{conv}(\mathcal{F})$, then for almost all objective function vectors $c$, the problem

$$\max\left\{c^\top x : x \in F\right\}$$

will have a unique maximizer $\hat{x}$. Thus, $\hat{x} \in \operatorname{ext}(F) \subseteq \mathcal{F}$. Therefore, $\hat{x}$ will also be an optimal solution of our original (non-convex) problem

$$\max\left\{c^\top x : x \in \mathcal{F}\right\}.$$

(14)

How do we “compute” or approximate the convex hull?

(15)

Linear Optimization

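Let $A \in \mathbb{R}^{m \times n}$ be given.

$$\text{(LP1)} \quad \begin{array}{ll} \text{Maximize} & c^\top x \\ \text{subject to} & Ax \le b, \quad (x \in \mathbb{R}^n). \end{array}$$

All vectors are column vectors.

For $u, v \in \mathbb{R}^m$, $u \le v$ means $u_i \le v_i$, $\forall i \in \{1,2,\ldots,m\}$.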

(16)

Let us use another form for our linear optimization problem:

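Let $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$ and $c \in \mathbb{R}^n$ be given. Then, we have the LP problem

$$\text{(LP)} \quad \begin{array}{ll} \text{Max} & c^\top x \\ \text{subject to} & Ax = b, \\ & x \ge 0. \end{array}$$

$x \ge 0$ or $x \in \mathbb{R}^n_+$ ($x$ must lie in the nonnegative orthant, which is a convex cone).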

(17)

$K \subseteq \mathbb{R}^n$ is a convex cone if $\forall x, v \in K$ and $\alpha \in \mathbb{R}_{++}$, we have $\alpha x \in K$ and $(x+v) \in K$.

Replace $\mathbb{R}^n_+$ by an arbitrary convex cone, to get a more general convex optimization problem.

(18)

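Let $\mathbb{S}^n$ denote the space of $n$-by-$n$ symmetric matrices with entries in $\mathbb{R}$.

Definition. Let $X \in \mathbb{S}^n$.

$X$ is positive semidefinite if

$$h^\top X h \ge 0, \quad \forall h \in \mathbb{R}^n.$$

$X$ is positive definite if

$$h^\top X h > 0, \quad \forall h \in \mathbb{R}^n \setminus \{0\}.$$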

(19)

The set of $n$-by-$n$ symmetric positive semidefinite matrices forms a closed convex cone in $\mathbb{S}^n$. We denote it by $\mathbb{S}^n_+$.

The set of $n$-by-$n$ symmetric positive definite matrices forms an open convex cone in $\mathbb{S}^n$. We denote it by $\mathbb{S}^n_{++}$.

(20)

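For $A, B \in \mathbb{S}^n$, we use the trace inner-product:

$$\langle A, B \rangle := \operatorname{Tr}(A^\top B) = \sum_{i=1}^{n} \sum_{j=1}^{n} A_{ij} B_{ij},$$

we write

$A \succeq B$ to mean $(A-B)$ is positive semidefinite;

$A \succ B$ to mean $(A-B)$ is positive definite.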

(21)

Note that for $X \in \mathbb{S}^n$ all eigenvalues $\lambda_j(X)$ are real. Also,

$$X \succeq 0 \iff \lambda(X) \ge 0$$

and

$$X \succ 0 \iff \lambda(X) > 0.$$

(22)

Theorem

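For every $X \in \mathbb{S}^n$, there exists $Q \in \mathbb{R}^{n \times n}$, orthogonal ($Q^\top Q = I$), such that

$$X = Q \operatorname{Diag}(\lambda(X)) Q^\top.$$

Let us denote by $q^{(1)}, q^{(2)}, \ldots, q^{(n)}$ the columns of $Q$ and denote by $e_j$ the $j$th unit vector. Then,

$$X q^{(j)} = Q \operatorname{Diag}(\lambda(X)) e_j = \lambda_j(X) Q e_j = \lambda_j(X) q^{(j)},$$

that is, $q^{(j)}$ is an eigenvector of $X$ associated with the eigenvalue $\lambda_j(X)$.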

By the above observations

(23)

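Some commonly used norms on $\mathbb{S}^n$. For every $h \in \mathbb{R}^n$, $p \in [1,+\infty]$,

$$\|h\|_p := \left( \sum_{j=1}^{n} |h_j|^p \right)^{1/p}.$$

In the matrix space, we have the Frobenius norm:

$$\|X\|_F := \langle X, X \rangle^{1/2} = \|\lambda(X)\|_2 = \sqrt{\sum_{j=1}^{n} (\lambda_j(X))^2}.$$

Operator $p$-norm (for every $p \in [1,+\infty]$):

$$\|X\|_p := \max\{\|Xh\|_p : h \in \mathbb{R}^n,\ \|h\|_p = 1\}.$$

Note that

$$\|X\|_2 = \max_j \{|\lambda_j(X)|\}.$$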

(24)

For $X \in \mathbb{S}^n_{++}$, we can define its unique symmetric positive semidefinite square root as

$$X^{1/2} := Q \left[\operatorname{Diag}(\lambda(X))\right]^{1/2} Q^\top.$$

We extend this definition to $X \in \mathbb{S}^n_+$.

Another notion of “square-root”:

Theorem

(Cholesky Decomposition) Let $X \in \mathbb{S}^n$. Then $X \in \mathbb{S}^n_+$ iff there exists $B \in \mathbb{R}^{n \times n}$ lower-triangular such that $X = BB^\top$.

(25)

Theorem

Let $X \in \mathbb{S}^n$. Then, TFAE

(a) $X$ is positive semidefinite;

(b) $\lambda_j(X) \ge 0$, for every $j \in \{1,2,\ldots,n\}$;

(c) there exist $\mu \in \mathbb{R}^n_+$ and $h^{(i)} \in \mathbb{R}^n$, for every $i \in \{1,2,\ldots,n\}$, such that $X = \sum_{i=1}^{n} \mu_i h^{(i)} h^{(i)\top}$;

(d) there exists $B \in \mathbb{R}^{n \times n}$ such that $X = BB^\top$ (here, $B$ can be chosen as a lower triangular matrix, giving the Cholesky decomposition of $X$);

(e) for every nonempty $J \subseteq \{1,2,\ldots,n\}$, $\det(X_J) \ge 0$, where $X_J := [X_{ij} : i,j \in J]$;

(f) for every $S \in \mathbb{S}^n_+$, $\langle X, S \rangle \ge 0$.
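As a small illustration of criteria (c) and (e), the sketch below builds a matrix from a nonnegative combination of rank-one terms and checks every principal minor with exact integer arithmetic. The helper names and the sample data are hypothetical, and the Laplace-expansion determinant is only sensible for tiny matrices. The second matrix shows why (e) must range over all index sets $J$ and not only the leading ones.

```python
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total

def principal_minors(x):
    """Yield (J, det(X_J)) for every nonempty index set J."""
    n = len(x)
    for size in range(1, n + 1):
        for idx in combinations(range(n), size):
            yield idx, det([[x[i][j] for j in idx] for i in idx])

def is_psd_by_minors(x):
    """Criterion (e): every principal minor is nonnegative."""
    return all(value >= 0 for _, value in principal_minors(x))

def rank_one_sum(mus, hs):
    """Criterion (c): X = sum_i mu_i * h_i h_i^T."""
    n = len(hs[0])
    return [[sum(mu * h[r] * h[c] for mu, h in zip(mus, hs)) for c in range(n)]
            for r in range(n)]

# Hypothetical data: nonnegative weights, arbitrary integer vectors.
x_psd = rank_one_sum([2, 0, 1], [[1, 1, 0], [0, 1, 0], [1, -1, 2]])
# Leading principal minors of x_bad are 0 and 0, yet x_bad is not PSD.
x_bad = [[0, 0], [0, -1]]
```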

(26)

Lemma

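(Schur Complement Lemma) Let $X \in \mathbb{S}^n$ and $T \in \mathbb{S}^m_{++}$. Then

$$M := \begin{bmatrix} T & U^\top \\ U & X \end{bmatrix} \succeq 0 \iff X - U T^{-1} U^\top \succeq 0.$$

Moreover, $M \succ 0$ iff $X - U T^{-1} U^\top \succ 0$.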

(27)

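Proof.

$$\begin{bmatrix} I & 0 \\ U T^{-1} & I \end{bmatrix} \begin{bmatrix} T & 0 \\ 0 & X - U T^{-1} U^\top \end{bmatrix} \begin{bmatrix} I & T^{-1} U^\top \\ 0 & I \end{bmatrix} = M.$$

Note that we wrote

$$M = L \begin{bmatrix} T & 0 \\ 0 & X - U T^{-1} U^\top \end{bmatrix} L^\top,$$

where $L$ is nonsingular (in our case $\det(L) = 1$). Therefore,

$$M \succeq 0 \iff X - U T^{-1} U^\top \succeq 0.$$

Also,

$$M \succ 0 \iff X - U T^{-1} U^\top \succ 0.$$

$(X - U T^{-1} U^\top)$ is called the Schur complement of $T$ in $M$.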
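The factorization used in the proof can be checked on a small instance. The sketch below uses exact rational arithmetic and hypothetical $2 \times 2$ blocks $T$, $U$, $X$ chosen only for illustration; it verifies that $L \,\mathrm{blockdiag}(T, X - UT^{-1}U^\top)\, L^\top$ reproduces $M$.

```python
from fractions import Fraction as Fr

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def inverse_2x2(t):
    """Inverse of a nonsingular 2x2 matrix via the adjugate formula."""
    d = t[0][0] * t[1][1] - t[0][1] * t[1][0]
    return [[t[1][1] / d, -t[0][1] / d], [-t[1][0] / d, t[0][0] / d]]

def block(a, b, c, d):
    """Assemble the block matrix [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

# Hypothetical data: T is positive definite, U and X are arbitrary (X symmetric).
T = [[Fr(2), Fr(1)], [Fr(1), Fr(2)]]
U = [[Fr(1), Fr(0)], [Fr(1), Fr(1)]]
X = [[Fr(3), Fr(1)], [Fr(1), Fr(2)]]

T_inv = inverse_2x2(T)
UTU = matmul(matmul(U, T_inv), transpose(U))
schur = [[X[i][j] - UTU[i][j] for j in range(2)] for i in range(2)]

I2 = [[Fr(1), Fr(0)], [Fr(0), Fr(1)]]
Z2 = [[Fr(0), Fr(0)], [Fr(0), Fr(0)]]
L = block(I2, Z2, matmul(U, T_inv), I2)
D = block(T, Z2, Z2, schur)
M = block(T, transpose(U), U, X)
factored = matmul(matmul(L, D), transpose(L))
```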


(28)

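One interesting special case is $\begin{bmatrix} 1 & x^\top \\ x & X \end{bmatrix}$. We have

$$\begin{bmatrix} 1 & x^\top \\ x & X \end{bmatrix} \succeq 0 \iff X - xx^\top \succeq 0$$

and

$$\begin{bmatrix} 1 & x^\top \\ x & X \end{bmatrix} \succ 0 \iff X - xx^\top \succ 0.$$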

(29)

Theorem

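(1) $\mathbb{S}^n_{++} = \operatorname{int}\left(\mathbb{S}^n_+\right)$.

(2) Let $X \in \mathbb{S}^n$. Then, TFAE

(a) $X$ is positive definite;

(b) $\lambda_j(X) > 0$, for every $j \in \{1,2,\ldots,n\}$;

(c) there exist $\mu \in \mathbb{R}^n_{++}$ and linearly independent $h^{(i)} \in \mathbb{R}^n$, for every $i \in \{1,2,\ldots,n\}$, such that $X = \sum_{i=1}^{n} \mu_i h^{(i)} h^{(i)\top}$;

(d) there exists $B \in \mathbb{R}^{n \times n}$ nonsingular such that $X = BB^\top$ (here, $B$ can be chosen as a lower triangular matrix, giving the Cholesky decomposition of $X$);

(e) for every $J_k := \{1,2,\ldots,k\}$, $k \in \{1,2,\ldots,n\}$, $\det(X_{J_k}) > 0$;

(f) for every $S \in \mathbb{S}^n_+ \setminus \{0\}$, $\langle X, S \rangle > 0$;

(g) $X \succeq 0$ and $\operatorname{rank}(X) = n$.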

(30)

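Given $C, A_1, A_2, \ldots, A_m \in \mathbb{S}^n$ and $b \in \mathbb{R}^m$, we have

$$\text{(P)} \quad \begin{array}{ll} \inf & \langle C, X \rangle \\ \text{s.t.} & \langle A_i, X \rangle = b_i, \quad \forall i \in \{1,2,\ldots,m\}, \\ & X \succeq 0, \end{array}$$

$$\text{(D)} \quad \begin{array}{ll} \sup & b^\top y \\ \text{s.t.} & \sum_{i=1}^{m} y_i A_i \preceq C. \end{array}$$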

(31)

For SDP: a Slater point is $\bar{y} \in \mathbb{R}^m$ such that

$$\sum_{i=1}^{m} \bar{y}_i A_i \prec C.$$

Theorem

(A Strong Duality Theorem) Suppose (D) has a Slater point. If the objective value of (D) is bounded from above, then (P) attains its optimum value and the optimum values of (P) and (D) coincide.

(32)

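Why do we need extra assumptions for the duality theorem of SDP?

Consider the example (with parameter $\gamma > 0$):

$$C := \begin{bmatrix} \gamma & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad b := \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad A_1 := \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad A_2 := \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}.$$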

(33)

Then for every feasible solution of (D), we have $y_2 = 0$ and $y_1 \le 0$. So, the set of feasible solutions of (D) (in the $S$-space, $S := C - \sum_{i=1}^{m} A_i y_i$) is

$$\left\{ \begin{bmatrix} \gamma & 0 & 0 \\ 0 & S_{22} & 0 \\ 0 & 0 & 0 \end{bmatrix} : S_{22} \ge 0 \right\}.$$

Moreover, every feasible solution of (D) is optimal in (D).

(34)

For the primal, it is also true that the set of feasible solutions and the set of optimal solutions coincide, which is (in the $X$-space):

$$\left\{ \begin{bmatrix} 1 & 0 & X_{31} \\ 0 & 0 & 0 \\ X_{31} & 0 & X_{33} \end{bmatrix} : X_{33} \ge X_{31}^2 \right\}.$$

Even though both (P) and (D) have finite optimal objective values and both of these values are attained, there is a duality gap of $\gamma$.
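The example can be checked by hand, and the following sketch does the same arithmetic exactly. It assumes $\gamma = 1$ (any $\gamma > 0$ would do), tests positive semidefiniteness through the principal-minor criterion, and uses helper names that are purely illustrative.

```python
from fractions import Fraction as Fr
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** c * e * det([row[:c] + row[c + 1:] for row in m[1:]])
               for c, e in enumerate(m[0]))

def is_psd(x):
    """All principal minors nonnegative."""
    n = len(x)
    return all(det([[x[i][j] for j in idx] for i in idx]) >= 0
               for size in range(1, n + 1)
               for idx in combinations(range(n), size))

def inner(a, b):
    """Trace inner product <A, B>."""
    return sum(a[i][j] * b[i][j] for i in range(len(a)) for j in range(len(a)))

gamma = Fr(1)  # assumption: any gamma > 0 works
C = [[gamma, 0, 0], [0, 0, 0], [0, 0, 0]]
A1 = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
A2 = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
b = [0, 1]

def primal_point(x31, x33):
    return [[1, 0, x31], [0, 0, 0], [x31, 0, x33]]

def dual_slack(y1, y2):
    """S = C - y1*A1 - y2*A2."""
    return [[C[i][j] - y1 * A1[i][j] - y2 * A2[i][j] for j in range(3)] for i in range(3)]

X = primal_point(Fr(1, 2), Fr(1))          # X33 >= X31^2 holds
primal_value = inner(C, X)
y = (Fr(-1), Fr(0))                        # y1 <= 0, y2 = 0
dual_value = b[0] * y[0] + b[1] * y[1]
gap = primal_value - dual_value
```

Trying a dual point with $y_2 \neq 0$, for instance `dual_slack(Fr(-1), Fr(1, 10))`, produces a slack matrix with a negative $2 \times 2$ principal minor, which matches the claim that $y_2 = 0$ is forced.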

(35)

In the domain of (LP): Let $v^*$ denote the optimal objective value...

Suppose we have a feasible solution $\bar{x}$ for (P) and a feasible solution $\bar{y}$ of (D). Then

$$c^\top \bar{x} \le v^* \le b^\top \bar{y}.$$

Moreover, using an optimal $\bar{x}$ for (P) and an optimal $\bar{y}$ of (D), we have

$$c^\top \bar{x} = v^* = b^\top \bar{y}.$$

(36)

In the domain of (SDP): Let $v^*$ denote the optimal objective value...

Suppose we have a feasible solution $\bar{X}$ for (P) and a feasible solution $\bar{y}$ of (D). Then

$$\langle C, \bar{X} \rangle \ge v^* \ge b^\top \bar{y}.$$

Moreover, using an optimal $\bar{X}$ for (P) and an optimal $\bar{y}$ of (D) (for a suitably defined (D), or under a Slater type assumption), we have

$$\langle C, \bar{X} \rangle = v^* = b^\top \bar{y}.$$

(37)

In the domain of Combinatorial Optimization problems (e.g., 0,1 Integer Programming Problems): Suppose we have an integer feasible solution $\bar{x}$ of the LP or SDP relaxation, and a feasible solution $\bar{y}$ of the dual of the convex relaxation $C_k$. Then

$$c^\top \bar{x} \le v^* \le b^\top \bar{y}.$$

Moreover, using an optimal $\bar{x}$ for the original Integer Programming Problem and an optimal solution $\bar{y}$ of the dual of the convex relaxation $C_d = \operatorname{conv}(\mathcal{F})$, we have

$$c^\top \bar{x} = v^* = b^\top \bar{y}.$$

(38)

We can solve such semidefinite optimization problems efficiently both in terms of computational complexity theory and practical computation.

However, in terms of both computational complexity theory and practical computation, the situation in SDP is much worse than that of LP. (That is, there are very many deep, open problems.)

(39)

We can consider complexity analyses based on the ellipsoid method, which typically assumes the existence of balls with radii $R$ and $r$ (the former for a ball containing some optimal solutions and the latter for a ball contained in the set of feasible solutions intersected with the first ball). It is also possible to analyze the algorithms for convex optimization in the case of unknown $R$, $r$, and to express the computational complexity of the algorithms in terms of $R$ and $r$. For example, Freund and Vera [2009] prove that if a convex set $F \subseteq \mathbb{R}^d$ is given by a separation oracle, the problem of computing $\bar{x} \in F$ can be solved in

$$O\left( d \left[ \ln(d) + d \ln\left( \frac{1}{r} + \frac{R}{r} \right) + \ln(R+1) \right] \right)$$

iterations, where each iteration makes one call to the separation oracle (and all the constants hidden by $O(\cdot)$ are at most 2).

(40)

Many interior-point algorithms lead to similar polynomial iteration complexity bounds, under similar (but not the same) assumptions.

See, for instance, Nesterov and Nemirovskii [1994], Alizadeh [1995], Nesterov and Todd [1997, 1998], Nesterov, Todd and Ye [1999], Nemirovskii and T. [2005, 2010].

For a bit complexity analysis of the SDP feasibility problem (which admits an exponential bound in the size of the input), see Porkolab and Khachiyan [1997].

(41)

Given $A_1, A_2, \ldots, A_m \in \mathbb{S}^n \cap \mathbb{Z}^{n \times n}$ and $b \in \mathbb{Z}^m$, the SDP-Feasibility Problem is to decide whether there exists $X \in \mathbb{S}^n_+$ such that $\langle A_i, X \rangle = b_i$, for every $i \in \{1,2,\ldots,m\}$.

Open Problem: Is SDP-feasibility in $\mathcal{P}$? In fact,

Open Problem: Is SDP-feasibility in $\mathcal{NP}$?

Theorem

(Ramana [1997]) In the real number computational model, the problem of deciding SDP feasibility is in $\mathcal{NP} \cap \text{co-}\mathcal{NP}$.

Theorem

(Ramana [1997]) In the Turing machine model, if SDP feasibility is in $\mathcal{NP}$ then it is in $\mathcal{NP} \cap \text{co-}\mathcal{NP}$.

(42)

We can start with a possibly nonlinear algebraic representation of the constraint

x∈ F.

(43)

(I) Homogeneous Lifting to the matrix space $\mathbb{S}^{d+1}$: To enforce the constraint $x \in \{0,1\}^d$, it suffices to write

$$x_j^2 - x_j = 0 \quad \forall j \in \{1,2,\ldots,d\}.$$

Consider representing $x$ in the space $\mathbb{S}^{d+1}$ by the matrix

$$\begin{bmatrix} 1 & x^\top \\ x & xx^\top \end{bmatrix} =: Y.$$

Then, the linear constraints on $Y$,

$$Y_{00} = 1, \quad \operatorname{diag}(Y) = Y e_0,$$

together with the requirements that $Y \succeq 0$ and

$$\operatorname{rank}(Y) = 1,$$

allow us to transform any 0,1 integer programming formulation to an SDP formulation with an additional rank constraint.

Removing the rank constraint leads to an SDP relaxation.
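A minimal sketch of this lifting, with hypothetical helper names: it forms $Y$ from a vector $x$ and tests the two linear constraints. For a 0,1 vector they hold because $x_j^2 = x_j$; for a fractional vector the rank-one lift violates $\operatorname{diag}(Y) = Ye_0$.

```python
def lift(x):
    """Return Y = [1; x][1; x]^T as a list of rows, indexed 0..d."""
    xh = [1] + list(x)
    return [[a * b for b in xh] for a in xh]

def satisfies_linear_constraints(y):
    """Check Y_00 = 1 and diag(Y) = Y e_0 (column 0 of Y)."""
    return y[0][0] == 1 and all(y[j][j] == y[j][0] for j in range(len(y)))

Y_binary = lift([1, 0, 1])          # a 0,1 point
Y_fractional = lift([0.5, 1])       # x_1 = 0.5 gives Y_11 = 0.25 but Y_10 = 0.5
```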

(44)

the constraint $x \in \{-1,1\}^d$, it suffices to write

$$x_j^2 = 1 \quad \forall j \in \{1,2,\ldots,d\}.$$

Consider representing $x$ in the space $\mathbb{S}^d$ by the matrix $xx^\top =: X$.

Then, the linear constraints on $X$,

$$\operatorname{diag}(X) = \bar{e}, \quad X \succeq 0,$$

together with the requirement that

$$\operatorname{rank}(X) = 1,$$

allow us to transform any $-1,1$ pure quadratic programming

(45)

Goemans and Williamson [1995] used the following SDP relaxation of MaxCut

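$$\begin{array}{ll} \max & -\tfrac{1}{4} \langle W, X \rangle + \tfrac{1}{4} \langle W, \bar{e}\bar{e}^\top \rangle \\ \text{subject to:} & \operatorname{diag}(X) = \bar{e}, \\ & X \succeq 0, \end{array}$$

where $W_{ij}$ is either zero (if $\{i,j\} \notin E$) or is equal to the given nonnegative weight of the edge $\{i,j\}$, in proving their theorem:

Theorem

(Goemans and Williamson [1995]) Let $G = (V,E)$ with $w \in \mathbb{Q}^E_+$ be given. Then a cut in $G$ of value at least $0.87856 \cdot (\text{weight of the max. cut})$ can be computed in polynomial time.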
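To see why this SDP is a relaxation of MaxCut, note that for $x \in \{-1,1\}^V$ and $X = xx^\top$ the objective equals the weight of the cut defined by the signs of $x$. The sketch below checks this on a hypothetical weighted triangle by enumerating all sign vectors; it does not implement the Goemans and Williamson rounding algorithm.

```python
from itertools import product

def sdp_objective(w, x):
    """-(1/4)<W, xx^T> + (1/4)<W, ee^T> for a sign vector x."""
    n = len(x)
    inner_wx = sum(w[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    inner_we = sum(w[i][j] for i in range(n) for j in range(n))
    return (-inner_wx + inner_we) / 4

def cut_weight(w, x):
    """Total weight of edges whose endpoints get different signs."""
    n = len(x)
    return sum(w[i][j] for i in range(n) for j in range(i + 1, n) if x[i] != x[j])

# Hypothetical symmetric weight matrix of a triangle with edge weights 1, 2, 3.
W = [[0, 1, 2],
     [1, 0, 3],
     [2, 3, 0]]

values = {x: (sdp_objective(W, x), cut_weight(W, x))
          for x in product((-1, 1), repeat=3)}
best_cut = max(cut for _, cut in values.values())
```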

(46)

(III) Vector representation of variables: Another approach is to start with a formulation that is based on polynomial equations and inequalities. Suppose we are using the $x \in \{-1,1\}^d$ formulation.

Beat all the higher degree monomials down to quadratics by adding new equations and new variables.

E.g., consider the system

$$x_1^4 x_2^2 + x_2^3 x_3 + x_1^5 - 1 \ge 0, \qquad 2x_1^3 - x_2^4 \ge 0$$

(47)

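$$x_1^4 x_2^2 + x_2^3 x_3 + x_1^5 - 1 \ge 0, \qquad 2x_1^3 - x_2^4 \ge 0$$

which is equivalent to the quadratic system:

$$x_5 x_6 + x_6 x_7 + x_1 x_5 - 1 \ge 0, \qquad 2x_1 x_4 - x_6^2 \ge 0,$$

$$x_4 = x_1^2, \quad x_5 = x_4^2, \quad x_6 = x_2^2, \quad x_7 = x_2 x_3, \quad \ldots$$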
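The substitution can be sanity-checked numerically. In the sketch below (function names are illustrative), the new variables $x_4, \ldots, x_7$ are computed from the displayed defining equations, and the left-hand sides of the quadratic system are compared with those of the original system at random integer points.

```python
import random

def original(x1, x2, x3):
    """Left-hand sides of the two original polynomial inequalities."""
    return (x1**4 * x2**2 + x2**3 * x3 + x1**5 - 1, 2 * x1**3 - x2**4)

def lifted(x1, x2, x3):
    """Left-hand sides of the quadratic system after substituting x4..x7."""
    x4 = x1 * x1      # x4 = x1^2
    x5 = x4 * x4      # x5 = x4^2 = x1^4
    x6 = x2 * x2      # x6 = x2^2
    x7 = x2 * x3      # x7 = x2 * x3
    return (x5 * x6 + x6 * x7 + x1 * x5 - 1, 2 * x1 * x4 - x6 * x6)

rng = random.Random(0)
samples = [tuple(rng.randint(-9, 9) for _ in range(3)) for _ in range(200)]
all_equal = all(original(*p) == lifted(*p) for p in samples)
```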

(48)

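Say we have a quadratic inequality

$$x_1^2 + 12 x_1 x_2 + 23 x_2 x_3 + x_4^2 \le 32.$$

Let us represent each variable $x_j$ by a vector $v^{(j)} \in \mathbb{R}^d$. Then our quadratic inequality becomes

$$\left\langle v^{(1)}, v^{(1)} \right\rangle + 12 \left\langle v^{(1)}, v^{(2)} \right\rangle + 23 \left\langle v^{(2)}, v^{(3)} \right\rangle + \left\langle v^{(4)}, v^{(4)} \right\rangle \le 32.$$

The quadratic equations $x_j^2 = 1\ \forall j$ become

$$\left\langle v^{(j)}, v^{(j)} \right\rangle = 1 \quad \forall j.$$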

(49)

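Recall that

$$X \in \mathbb{S}^d_+ \iff X = BB^\top$$

for some $B \in \mathbb{R}^{d \times d}$. Define

$$B^\top =: \left[ v^{(1)}\ v^{(2)}\ \cdots\ v^{(d)} \right].$$

Then

$$X_{ij} = \left\langle v^{(i)}, v^{(j)} \right\rangle \quad \forall i,j.$$

I.e., Semidefinite Optimization allows us to express a problem by enforcing linear constraints on the variables $\left\langle v^{(i)}, v^{(j)} \right\rangle$.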
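As an illustration, the vectors $v^{(i)}$ can be read off as the rows of a factor $B$. The sketch below uses a textbook Cholesky recursion and assumes $X$ is positive definite so that every pivot is nonzero; the sample matrix is hypothetical. For a singular positive semidefinite $X$ a different factorization (for example, one based on eigenvalues) would be needed.

```python
import math

def cholesky(x):
    """Lower-triangular B with X = B B^T; assumes X is positive definite."""
    n = len(x)
    b = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(b[i][k] * b[j][k] for k in range(j))
            if i == j:
                b[i][j] = math.sqrt(x[i][i] - s)
            else:
                b[i][j] = (x[i][j] - s) / b[j][j]
    return b

def dot(u, v):
    return sum(a * c for a, c in zip(u, v))

# Hypothetical positive definite matrix with unit diagonal.
X = [[1.0, 0.5, 0.0],
     [0.5, 1.0, 0.5],
     [0.0, 0.5, 1.0]]

B = cholesky(X)
vectors = B                      # v^(i) is row i of B, i.e. column i of B^T
gram = [[dot(vi, vj) for vj in vectors] for vi in vectors]
max_error = max(abs(gram[i][j] - X[i][j]) for i in range(3) for j in range(3))
```

Here the unit diagonal of $X$ corresponds to each $v^{(i)}$ being a unit vector.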

(50)

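$\operatorname{diag}(X) = \bar{e}$ is equivalent to:

$$\left\langle v^{(j)}, v^{(j)} \right\rangle = 1 \quad \forall j.$$

Similarly,

$$X_{11} + 12 X_{12} + 23 X_{23} + X_{44} \le 32$$

is equivalent to:

$$\left\langle v^{(1)}, v^{(1)} \right\rangle + 12 \left\langle v^{(1)}, v^{(2)} \right\rangle + 23 \left\langle v^{(2)}, v^{(3)} \right\rangle + \left\langle v^{(4)}, v^{(4)} \right\rangle \le 32.$$

If there are no additional restrictions on the dimension of $v^{(j)}$, and all the constraints are as above, then the SDP formulation is an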

(51)

For example, Arora, Rao and Vazirani [2004] use an approach as above to derive an SDP relaxation for the sparsest cut problem.

de Carli Silva, Harvey and Sato [2016] utilize SDP techniques to derive polynomial time algorithms that approximate sums of symmetric positive semidefinite matrices. Such algorithms have applications in finding sparsifiers of hypergraphs, sparse solutions to semidefinite programs, sparsifiers of unique games etc.

(52)

(IV) SoS Relaxations of prescribed degree:

Since any system of polynomial inequalities can be reformulated as a system of quadratic inequalities, the above results can be

translated to the setting of polynomial optimization problems (POP):

$$\begin{array}{ll} \min & p_0(x) \\ \text{s.t.} & p_i(x) \ge 0, \quad i \in \{1,2,\ldots,m\}, \end{array}$$

where $p_0, p_1, \ldots, p_m : \mathbb{R}^n \to \mathbb{R}$ are polynomials.

Before going forward, let’s take a break and consider a question about a single multivariate polynomial.

(53)

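Just a question... How can I convince you that

$$\begin{aligned} f(x) := {} & 83 - 108x_1 + 216x_2 + x_3^2x_1^2 - 2x_3^3x_1 - 32x_1x_2^3 \\ & + 24x_1^2x_2^2 - 8x_1^3x_2 - 144x_1x_2^2 + 72x_1^2x_2 \\ & - 216x_1x_2 + x_3^2 + 54x_1^2 + 216x_2^2 + 432x_1^5x_2^2x_4 \\ & - 432x_1^4x_2^3x_4 + 144x_1^3x_2^4x_4 - 576x_1^3x_2^3x_4^2 \\ & + 256x_1^2x_2^2x_4^4 + 864x_1^4x_2^2x_4^2 + 768x_1^3x_2^2x_4^3 \\ & - 16x_1^2x_2^5x_4 - 12x_1^3 + 96x_2^3 + x_1^4 + 16x_2^4 + x_3^4 \\ & + 96x_1^2x_2^4x_4^2 - 256x_1^2x_2^3x_4^3 - 12x_1^3x_2^5 \\ & + 54x_1^4x_2^4 - 108x_1^5x_2^3 + 81x_1^6x_2^2 + x_1^2x_2^6 \\ & + 2x_3^2x_1 - 2x_3^3 \ \ge\ 2, \quad \forall x \in \mathbb{R}^4? \end{aligned}$$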

(54)

What if I claim ...

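$$f(x) = (x_1 - 2x_2 - 3)^4 + x_3^2 (x_3 - x_1 - 1)^2 + x_1^2 x_2^2 (3x_1 - x_2 + 4x_4)^4 + 2 \ \ge\ 2, \quad \forall x \in \mathbb{R}^4?$$

That is, $f(x) = \text{Sum-of-Squares} + 2$, $\forall x \in \mathbb{R}^4$.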
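The claimed identity is easy to spot-check. The sketch below evaluates the expanded polynomial and the sum-of-squares form at random integer points using exact integer arithmetic; the function names are illustrative, and a numerical agreement at sample points is supporting evidence and not a proof (a symbolic expansion would settle it).

```python
import random

def f_expanded(x1, x2, x3, x4):
    """The 36-term expanded polynomial from the question."""
    return (83 - 108*x1 + 216*x2 + x3**2*x1**2 - 2*x3**3*x1 - 32*x1*x2**3
            + 24*x1**2*x2**2 - 8*x1**3*x2 - 144*x1*x2**2 + 72*x1**2*x2
            - 216*x1*x2 + x3**2 + 54*x1**2 + 216*x2**2 + 432*x1**5*x2**2*x4
            - 432*x1**4*x2**3*x4 + 144*x1**3*x2**4*x4 - 576*x1**3*x2**3*x4**2
            + 256*x1**2*x2**2*x4**4 + 864*x1**4*x2**2*x4**2 + 768*x1**3*x2**2*x4**3
            - 16*x1**2*x2**5*x4 - 12*x1**3 + 96*x2**3 + x1**4 + 16*x2**4 + x3**4
            + 96*x1**2*x2**4*x4**2 - 256*x1**2*x2**3*x4**3 - 12*x1**3*x2**5
            + 54*x1**4*x2**4 - 108*x1**5*x2**3 + 81*x1**6*x2**2 + x1**2*x2**6
            + 2*x3**2*x1 - 2*x3**3)

def f_sos(x1, x2, x3, x4):
    """The sum-of-squares certificate plus the constant 2."""
    return ((x1 - 2*x2 - 3)**4 + x3**2 * (x3 - x1 - 1)**2
            + x1**2 * x2**2 * (3*x1 - x2 + 4*x4)**4 + 2)

rng = random.Random(1)
points = [tuple(rng.randint(-6, 6) for _ in range(4)) for _ in range(300)]
identity_holds = all(f_expanded(*p) == f_sos(*p) for p in points)
lower_bound_holds = all(f_expanded(*p) >= 2 for p in points)
```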

(55)

There is a lot of work in the area of solving (POP) by utilizing Linear Optimization, Semidefinite Optimization and Convex Optimization techniques. See Lasserre [2001-...], Parrilo [2003-...], Laurent [2003-...], Gouveia, Parrilo, Thomas [2009].

(56)

Lasserre uses the connections to Putinar’s Theorem [1993]:

Theorem

Suppose $\mathcal{F} := \{x \in \mathbb{R}^n : p_i(x) \ge 0,\ i \in \{1,2,\ldots,m\}\}$ is compact, the polynomials $p_i$ have even degree, and their highest degree homogeneous parts do not have common zeroes in $\mathbb{R}^n$ except $0$.

Then every polynomial that is positive on $\mathcal{F}$ can be written as a nonnegative combination of polynomials of the form

$$[h_0(x)]^2 + \sum_{i=1}^{m} [h_i(x)]^2 p_i(x).$$

(57)

Perhaps one of the most fundamental problems here is the $K$-moment problem, which is, given $K \subset \mathbb{R}^n$, to decide when a real valued function $f$ of the set of monomials in $n$ variables is a moment function $\int_K x^m \, d\mu$ for some nonnegative Borel measure $\mu$ on $K$. Schmüdgen [1991] characterized the solutions to the $K$-moment problem (called $K$-moment sequences) for all compact semi-algebraic sets $K$ in terms of the positive definiteness of matrices arising from the moment functions. Schmüdgen’s proof utilizes the Positivstellensatz in proving the above-mentioned algebraic fact. The result also generalizes many other preexisting beautiful results such as Handelman’s Theorem [1988]; some of these connections are old and they generalize results some of which go all the way back to Minkowski in the late 1800s.

(58)

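Positivstellensatz (Stengle [1974]):

$\mathcal{F} := \{x \in \mathbb{R}^n : p_i(x) \ge 0,\ i \in \{1,2,\ldots,m\}\} = \emptyset$ iff there exists $g \in \operatorname{cone}(p_1, p_2, \ldots, p_m)$ such that $g(x) = -1$.

That is, iff there exist $s_0, s_J, \ldots \in \mathrm{SoS}(n,*)$ such that

$$g = \sum_{J \subseteq \{1,2,\ldots,m\}} s_J \prod_{i \in J} p_i = -1.$$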

(59)

The Positivstellensatz is a “common” generalization of Farkas’ Lemma

Lemma

(Farkas’ Lemma) Let $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$ be given. Then exactly one of the following systems has a solution:

(I) $Ax = b$, $x \ge 0$;

(II) $A^\top y \ge 0$, $b^\top y < 0$.

and Hilbert’s Nullstellensatz [1901] (characterizing when a system of polynomial equations has no solution over $\mathbb{C}^n$): Only for this slide, let $p_i$ be polynomials in complex variables ($x \in \mathbb{C}^n$). Then exactly one of the following systems has a solution:

(I) $p_i(x) = 0$, $\forall i \in \{1,2,\ldots,m\}$;

(II) $\exists$ polynomials $h_i$ such that $\sum_{i=1}^{m} h_i(x) p_i(x) = -1$.

(60)

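Given $x \in \mathbb{R}^n$ and a polynomial $f : \mathbb{R}^n \to \mathbb{R}$ of degree $2d$, let

$$h(x) := [1, x_1, x_2, \ldots, x_n, x_1^2, x_1x_2, x_1x_3, \ldots, x_2^2, \ldots, x_n^d]^\top \in \mathbb{R}^N,$$

where $N := \binom{n+d}{d}$. We are interested in

$$\mathcal{F}(f) := \left\{ X \in \mathbb{S}^N : [h(x)]^\top X h(x) = f(x) \right\}.$$

The following well-known fact connects SoS and semidefinite optimization.

Theorem

Let $\bar{z} \in \mathbb{R}$. Then $[f(x) - \bar{z}]$ is SoS iff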

(61)

Let $G = (V,E)$ be given. A unit distance representation of $G$ is $v : V \to \mathbb{R}^d$ for some $d \ge 1$ such that

$$\|v^{(i)} - v^{(j)}\|_2 = 1, \quad \forall \{i,j\} \in E.$$

Theorem

(Lovász [2003]) Let $G = (V,E)$ be a given graph. Then, the optimal objective value of the following semidefinite optimization problem is finite, it is attained and is equal to the square of the radius of the smallest ball containing a unit distance representation of $G$:

$$\begin{array}{lll} t(G) := & \min & t \\ & \text{subject to:} & \operatorname{diag}(X) - t\bar{e} \le 0, \\ & & X_{ii} - 2X_{ij} + X_{jj} = 1, \quad \forall \{i,j\} \in E, \\ & & X \in \mathbb{S}^V_+. \end{array}$$
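For intuition, here is a small sketch that builds a feasible solution of this SDP for a triangle from an explicit unit distance representation: the corners of an equilateral triangle with side 1 centered at the origin. The choice of graph and coordinates is an illustrative assumption; $X$ is the Gram matrix of the points, so it is positive semidefinite by construction, and the sketch only checks feasibility with $t = 1/3$, not optimality.

```python
import math

# Equilateral triangle with side length 1, centered at the origin.
radius = 1 / math.sqrt(3)
points = [(radius * math.cos(2 * math.pi * k / 3),
           radius * math.sin(2 * math.pi * k / 3)) for k in range(3)]
edges = [(0, 1), (1, 2), (0, 2)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Gram matrix X_ij = <v_i, v_j>.
X = [[dot(p, q) for q in points] for p in points]

# Smallest t with diag(X) - t*e <= 0 for this particular X.
t = max(X[i][i] for i in range(3))

# Each edge constraint X_ii - 2 X_ij + X_jj equals ||v_i - v_j||^2.
edge_values = [X[i][i] - 2 * X[i][j] + X[j][j] for i, j in edges]
```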

(62)

Theorem

(Lovász [2003]) Let $G = (V,E)$ be a given graph. Then, the optimal objective value of the following semidefinite optimization problem is finite, it is attained and is equal to the square of the radius of the smallest ball containing a smallest hypersphere representation of $G$:

$$\begin{array}{ll} \min & t \\ \text{subject to:} & \operatorname{diag}(X) - t\bar{e} = 0, \\ & X_{ii} - 2X_{ij} + X_{jj} = 1, \quad \forall \{i,j\} \in E, \\ & X \in \mathbb{S}^V_+. \end{array}$$

(63)

We can consider generalizations of such representations to ellipsoids: find a unit distance representation of a given graph $G$ which lies in the “smallest” ellipsoid. See de Carli Silva [2013], de Carli Silva and T. [2013]. Many of these problems are hard, and there are many open problems related to such ellipsoidal representation problems.

(64)

Unit representations of graphs in a ball and unit representations of graphs on hyperspheres are closely connected to orthonormal representations of graphs:

$\{u^{(i)} \in \mathbb{R}^d : i \in V\}$ is called an orthonormal representation of $G$ if

$$\|u^{(i)}\|_2 = 1, \ \forall i \in V, \quad \text{and} \quad \langle u^{(i)}, u^{(j)} \rangle = 0, \ \forall \{i,j\} \in \bar{E}.$$

We will cover the connections of such representations to the stable set problem in the next section.

(65)

We have $A$ and $b$ given, describing a nonempty polytope

$$P := \left\{ x \in \mathbb{R}^d : Ax \le b,\ 0 \le x \le \bar{e} \right\}.$$

We are interested in 0,1 vectors in $P$:

$$P_I := \operatorname{conv}\left( P \cap \{0,1\}^d \right).$$

(66)

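We will consider operators $\Gamma$ that take a compact convex set $C_k \subseteq [0,1]^d$ and return a compact convex set $C_{k+1}$ such that

$$C_k \cap \{0,1\}^d \subseteq C_{k+1} := \Gamma(C_k) \subseteq C_k,$$

$$C_{k+1} \ne C_k \ \text{ unless } \ C_k = \operatorname{conv}\left( C_k \cap \{0,1\}^d \right).$$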

(67)

Note that the given system of equations and inequalities is:

$$Ax \le b, \quad 0 \le x \le \bar{e}, \quad x_j^2 - x_j = 0, \ \forall j \in \{1,2,\ldots,d\}.$$

(68)

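$$P := \left\{ x \in \mathbb{R}^d : Ax \le b,\ 0 \le x \le \bar{e} \right\}.$$

We are interested in 0,1 vectors in $P$:

$$P_I := \operatorname{conv}\left( P \cap \{0,1\}^d \right).$$

$$\mathrm{BCC}_{(j)}(P) := \operatorname{conv}\left\{ (P \cap \{x : x_j = 0\}) \cup (P \cap \{x : x_j = 1\}) \right\},$$

where $j \in \{1,2,\ldots,d\}$. Since the inclusions

$$P_I \subseteq \mathrm{BCC}_{(j)}(P) \subseteq P$$

are clear for every $j$, it makes sense to consider applying this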

(69)

Let us define

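$$J := \{j_1, j_2, \ldots, j_k\} \subseteq \{1,2,\ldots,d\}.$$

Let us denote

$$\mathrm{BCC}_{(J)}(P) := \mathrm{BCC}_{(j_k)}\left( \mathrm{BCC}_{(j_{k-1})}\left( \cdots \mathrm{BCC}_{(j_1)}(P) \cdots \right) \right).$$

It is easy to check that in the above context, the operators $\mathrm{BCC}_{(j)}$ commute with each other.

Therefore, the notation $\mathrm{BCC}_{(J)}(\cdot)$ is justified.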

(70)

A beautiful, fundamental property of these operators is:

Lemma

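For every $J \subseteq \{1,2,\ldots,d\}$, we have

$$\mathrm{BCC}_{(J)}(P) = \operatorname{conv}\left( P \cap \{x : x_j \in \{0,1\},\ \forall j \in J\} \right).$$

The lemma directly leads to the convergence theorem.

Theorem

(Balas [1974]) Let $P$ be as above. Then

$$\mathrm{BCC}_{(\{1,2,\ldots,d\})}(P) = P_I.$$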

(71)

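[Figure: diagram of lift-and-project operators. Polyhedral operators: $\mathrm{BCC}_{(j)}$, $\mathrm{LS}_0$, $\mathrm{LS}$, $\mathrm{SA}$, $\mathrm{SA}'$, $\mathrm{BZ}$, $\mathrm{BZ}''$, $\mathrm{BZ}'$. PSD operators: $\mathrm{LS}_+$, $\mathrm{SA}_+$, $\mathrm{SA}'_+$, $\mathrm{BZ}_+$, $\mathrm{BZ}''_+$, $\mathrm{BZ}'_+$, $\mathrm{Las}$. The diagram marks which operators are tractable with a weak separation oracle for $P$ and which are tractable with a facet description of $P$.]

Figure: Various properties of lift-and-project operators (Au and T. [2011, 2015, 2016]).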

(72)

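Lovász and Schrijver [1991] proposed:

$$M_0(K) := \left\{ Y \in \mathbb{R}^{(d+1) \times (d+1)} : Ye_0 = Y^\top e_0 = \operatorname{diag}(Y),\ Ye_i \in K,\ Y(e_0 - e_i) \in K,\ \forall i \in \{1,2,\ldots,d\} \right\},$$

$$\mathrm{LS}_0(K) := \{Ye_0 : Y \in M_0(K)\}.$$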

(73)

Tighter,

$$M(K) := M_0(K) \cap \mathbb{S}^{d+1},$$

$$\mathrm{LS}(K) := \{Ye_0 : Y \in M(K)\}.$$

(74)

and tighter,

$$M_+(K) := M_0(K) \cap \mathbb{S}^{d+1}_+,$$

$$\mathrm{LS}_+(K) := \{Ye_0 : Y \in M_+(K)\}.$$

(75)

Lemma

Let K be as above. Then

$$K_I \subseteq \mathrm{LS}_+(K) \subseteq \mathrm{LS}(K) \subseteq \mathrm{LS}_0(K) \subseteq K.$$

(76)

Theorem

(Lovász and Schrijver [1991]) Let $P$ be as above. Then $P \supseteq \mathrm{LS}_0(P) \supseteq \mathrm{LS}_0^2(P) \supseteq \cdots \supseteq \mathrm{LS}_0^d(P) = P_I$. Similarly for $\mathrm{LS}$ as well as $\mathrm{LS}_+$.

(77)

Moreover, the relaxations obtained after a few iterations are still tractable if the original relaxation $P$ is.

Theorem

(Lovász and Schrijver [1991]) Let $P$ be as above. If we have a polynomial time weak separation oracle for $P$ then we can optimize any linear function over any of $\mathrm{LS}_0^k(P)$, $\mathrm{LS}^k(P)$, $\mathrm{LS}_+^k(P)$ in polynomial time, provided $k = O(1)$.

There is a wide spectrum of lift-and-project type operators: Balas [1974], Sherali and Adams [1990], Lovász and Schrijver [1991], Balas, Ceria and Cornuéjols [1993], Kojima and T. [2000], Lasserre [2001], de Klerk and Pasechnik [2002], Parrilo [2003], Bienstock and Zuckerberg [2004].

(78)

Such a general method (it applies toevery combinatorial optimization problem)...

Can it be really good on any problem?

(79)

Let $G = (V,E)$ be an undirected graph.

We define the fractional stable set polytope as

$$\mathrm{FRAC}(G) := \left\{ x \in [0,1]^V : x_i + x_j \le 1 \text{ for all } \{i,j\} \in E \right\}.$$

This polytope is used as the initial approximation to the convex hull of incidence vectors of the stable sets of $G$, which is called the stable set polytope:

$$\mathrm{STAB}(G) := \operatorname{conv}\left( \mathrm{FRAC}(G) \cap \{0,1\}^V \right).$$

(80)

Let us define the class of odd-cycle inequalities. Let $H$ be the node set of an odd-cycle in $G$; then the inequality

$$\sum_{i \in H} x_i \le \frac{|H| - 1}{2}$$

is valid for $\mathrm{STAB}(G)$. We define

$$\mathrm{OC}(G) := \{x \in \mathrm{FRAC}(G) : x \text{ satisfies all odd-cycle constraints for } G\}.$$
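A quick check on the 5-cycle (an illustrative choice of graph, with hypothetical helper names) shows both halves of the story: every incidence vector of a stable set satisfies the odd-cycle inequality, while the all-halves point lies in $\mathrm{FRAC}(G)$ and violates it, so $\mathrm{OC}(G)$ is strictly tighter than $\mathrm{FRAC}(G)$ here.

```python
from itertools import product

nodes = range(5)
edges = [(i, (i + 1) % 5) for i in nodes]   # the 5-cycle

def in_frac(x):
    """Membership in FRAC(G): box constraints and edge constraints."""
    return all(0 <= xi <= 1 for xi in x) and all(x[i] + x[j] <= 1 for i, j in edges)

def satisfies_odd_cycle(x):
    """Odd-cycle inequality for H = all five nodes: sum x_i <= (|H| - 1) / 2."""
    return sum(x) <= (len(x) - 1) / 2

# 0,1 points of FRAC(G) are exactly the incidence vectors of stable sets.
stable_vectors = [x for x in product((0, 1), repeat=5) if in_frac(x)]
half = (0.5,) * 5
```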

(81)

If $H$ is an odd-anti-hole then the inequality

$$\sum_{i \in H} x_i \le 2$$

is valid for $\mathrm{STAB}(G)$.

(82)

If we have an odd-wheel in $G$ with hub node represented by $x_{2k+2}$ and the rim nodes represented by $x_1, x_2, \ldots, x_{2k+1}$, then the odd-wheel inequality

$$k x_{2k+2} + \sum_{i=1}^{2k+1} x_i \le k$$

is valid for $\mathrm{STAB}(G)$.

(83)

Based on these classes of inequalities we define the polytopes OC(G),ANTI-HOLE(G),WHEEL(G).

(84)

Theorem

(Lovász and Schrijver [1991]) For every graph $G$, $\mathrm{LS}_0(G) = \mathrm{LS}(G) = \mathrm{OC}(G)$.

Note that this theorem provides compact lifted representations of the odd-cycle polytope of $G$ (in the spaces $\mathbb{R}^{(\{0\} \cup V) \times (\{0\} \cup V)}$ and $\mathbb{S}^{\{0\} \cup V}$). This polytope can have exponentially many facets in the worst case.

However, $M(G)$ is represented by $|V|(|V|+1)/2$ variables and $O\left(|V|^3\right)$ linear inequalities.

(85)

What about $\mathrm{LS}_0^2(G)$, $\mathrm{LS}^2(G)$?

(86)

What about $\mathrm{LS}_0^2(G)$, $\mathrm{LS}^2(G)$?

There exist graphs $G$ for which

$$\mathrm{LS}_0^2(G) \ne \mathrm{LS}^2(G) \quad \text{(Au and T. [2009])}.$$

Open Problem: Give good combinatorial characterizations for $\mathrm{LS}_0^2(G)$ and $\mathrm{LS}^2(G)$.

Some partial results by Lipták [1999] and by Lipták and T. [2003].

(87)

A clique in $G$ is a subset of nodes in $G$ such that every pair of them is joined by an edge. The clique polytope of $G$ is defined by

$$\mathrm{CLQ}(G) := \left\{ x \in \mathbb{R}^V_+ : \sum_{j \in C} x_j \le 1 \text{ for every clique } C \text{ in } G \right\}.$$

Optimizing a linear function over $\mathrm{FRAC}(G)$ is easy!

Linear optimization over $\mathrm{CLQ}(G)$ (and $\mathrm{STAB}(G)$) is $\mathcal{NP}$-hard!

(88)

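Orthonormal Representations of Graphs and the Theta Body of $G := (V,E)$

$u^{(1)}, u^{(2)}, \ldots, u^{(|V|)} \in \mathbb{R}^d$ such that

$$\langle u^{(i)}, u^{(j)} \rangle = 0, \ \text{for all } i \ne j,\ \{i,j\} \notin E, \quad \text{and}$$

$$\langle u^{(i)}, u^{(i)} \rangle = 1, \ \text{for all } i \in V.$$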

(89)

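$$\mathrm{TH}(G) := \left\{ x \in \mathbb{R}^V_+ : \sum_{i \in V} \left( c^\top u^{(i)} \right)^2 x_i \le 1,\ \forall \text{ ortho. representations and } c \in \mathbb{R}^d \text{ s.t. } \|c\|_2 = 1 \right\}$$

$$\mathrm{TH}(G) \supseteq \mathrm{STAB}(G)$$

but infinitely many linear inequalities!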

(90)

Let $\chi(G)$ denote the chromatic number of $G$. Clique number:

$$\omega(G) := \max\{|C| : C \text{ is a clique in } G\}$$

$G$ is perfect if for every node-induced subgraph $H$ of $G$, we have

$$\chi(H) = \omega(H).$$

(91)

Theorem

Let $G = (V,E)$. Then TFAE

(i) $G$ is perfect

(ii) $\mathrm{TH}(G)$ is a polytope

(iii) $\mathrm{TH}(G) = \mathrm{CLQ}(G)$

(iv) $\mathrm{STAB}(G) = \mathrm{CLQ}(G)$

(v) $G$ does not contain an odd-hole or odd anti-hole

(vi) the ideal generated by $(x_v^2 - x_v),\ \forall v \in V$; $x_u x_v,\ \forall \{u,v\} \in E$ is $(1,1)$-SoS.

Follows from the work of Chudnovsky, Chvátal, Fulkerson, Gouveia, Grötschel, Lovász, Parrilo, Robertson, Schrijver, Seymour, Rekha Thomas, Robin Thomas.

(92)

TH(·) behaves well under duality and graph complementation:

Theorem

For every graph $G = (V,E)$,

$$[\mathrm{TH}(G)]^o \cap \mathbb{R}^V_+ = \mathrm{TH}(\overline{G}).$$

That is, the polar of $\mathrm{TH}(G)$, when restricted to the nonnegative orthant, coincides with the $\mathrm{TH}(\cdot)$ set of the complement of $G$. (The polar of a set intersected with the nonnegative orthant is called the antiblocker of the set, which is another notion of duality.)

Convex sets that satisfy such a beautiful property have been characterized (under mild assumptions). See, de Carli Silva [2013],

(93)

There is a strong connection between LS+(G) and TH(G):

Theorem

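(Lovász and Schrijver [1991]) Let $G = (V,E)$. Then

$$\mathrm{TH}(G) = \left\{ x \in \mathbb{R}^V : \begin{bmatrix} 1 \\ x \end{bmatrix} = Ye_0;\ Y_{ij} = 0,\ \forall \{i,j\} \in E;\ Ye_0 = \operatorname{diag}(Y);\ Y \succeq 0 \right\}.$$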

(94)

Theorem

(Lovász and Schrijver [1991]) For every graph $G$,

$$\mathrm{LS}_+(G) \subseteq \mathrm{OC}(G) \cap \text{ANTI-HOLE}(G) \cap \mathrm{WHEEL}(G) \cap \mathrm{CLQ}(G) \cap \mathrm{TH}(G).$$

Open Problem: Give full, elegant, combinatorial characterizations for $\mathrm{LS}_+(G)$.

(95)

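Is $\mathrm{LS}_+(G)$ polyhedral for every $G$?

Let $G_{\alpha\beta}$ be the graph in the following figure:

Figure: The graph $G_{\alpha\beta}$.

(96)

A two dimensional cross-section of the compact convex relaxation $\mathrm{LS}_+(G_{\alpha\beta})$ has a nonpolyhedral piece on its boundary.

We say that $z \in \mathbb{R}^8$ is an $\alpha\beta$-point if $\alpha$ and $\beta$ are both nonnegative and

$$z_i := \begin{cases} \alpha & \text{if } i \in \{1,2,3,4\}, \\ \beta & \text{if } i \in \{5,6,7,8\}. \end{cases}$$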

(97)

Theorem

(Bianchi, Escalante, Nasini, T. [2014]) An $\alpha\beta$-point with $\tfrac{1}{4} \le \alpha \le \tfrac{1}{2}$ belongs to $\mathrm{LS}_+(G_{\alpha\beta})$ if and only if

$$\beta \le \frac{3 - \sqrt{1 + 8(-1 + 4\alpha)^2}}{8}.$$

Moreover, if an $\alpha\beta$-point is in the set with the maximum allowed value for $\beta$, then it is an extreme point of the set.

(98)

The SDP relaxation LS₊(G) of STAB(G) is stronger than TH(G).

By following the same line of reasoning used for perfect graphs, MWSSP can be solved in polynomial time for the class of graphs for which LS₊(G) = STAB(G).

We call these $\mathrm{LS}_+$-perfect graphs.

(99)

If $G'$ is a node-induced subgraph of $G$ ($G' \subseteq G$), we consider every point in $\mathrm{STAB}(G')$ as a set of points in $\mathrm{STAB}(G)$, although they do not belong to the same space (for the missing nodes, we take direct sums with the interval $[0,1]$, since originally $\mathrm{STAB}(G) \subseteq \mathrm{STAB}(G') \oplus [0,1]^{V(G) \setminus V(G')}$).

With this notation, given any family of graphs $\mathcal{F}$ and a graph $G$, we denote by $\mathcal{F}(G)$ the relaxation of $\mathrm{STAB}(G)$ defined by

$$\mathcal{F}(G) := \bigcap_{G' \subseteq G;\ G' \in \mathcal{F}} \mathrm{STAB}(G').$$

(100)

A graph is called near-bipartite if after deleting the closed neighborhood of any node, the resulting graph is bipartite. Let us denote by $\mathrm{NB}$ the class of all near-bipartite graphs.

For every graph $G$,

$$\mathrm{LS}_+(G) \subseteq \mathrm{NB}(G)$$

and

$$\mathrm{NB}(G) \subseteq \mathrm{CLQ}(G) \cap \mathrm{OC}(G) \cap \text{ANTI-HOLE}(G) \cap \mathrm{WHEEL}(G).$$

(101)


(102)

Open Problem: Find a combinatorial characterization of $\mathrm{LS}_+$-perfect graphs.

Current best characterization (Bianchi, Escalante, Nasini, T. [2014]):

$$\mathrm{LS}_+(G) \subseteq \mathrm{NB}(G) \cap \mathrm{TH}(\hat{G}).$$

(103)

What is the smallest graph which is $\mathrm{LS}_+$-imperfect?

In a related context, Knuth (1993) asked: what is the smallest graph for which $\mathrm{STAB}(G) \ne \mathrm{LS}_+(G)$?

(104)

[Figure: drawing of the graph $G_2$; the labels recovered from the drawing are the weights 1/4, 1/3, 1/3, 17/40, 17/40.]

Figure: Little graph that could! $G_2$ with corresponding weights

(105)

Proposition

(Lipták, T., 2003) $G_2$ is the smallest graph for which $\mathrm{LS}_+(G) \ne \mathrm{STAB}(G)$.

(106)

The $\mathrm{LS}$-rank of $P$ is the smallest $k$ for which $\mathrm{LS}^k(P) = P_I$. Analogously, $\mathrm{LS}_0$-rank of $P$, $\mathrm{LS}_+$-rank of $P_I$ relative to $P$ ...

We denote these ranks by $r(G)$, $r_0(G)$, and $r_+(G)$, respectively.

(107)

Theorem

(Lipták, T. [2003]) For every graph $G = (V,E)$,

$$r_+(G) \le \left\lfloor \frac{|V|}{3} \right\rfloor.$$

$$n_+(k) := \min\{|V(G)| : r_+(G) = k\}.$$

Open Problem: What are the values of $n_+(k)$ for every $k \in \mathbb{Z}_+$? In particular,

Conjecture (Lipták, T. [2003]): Is it true that $n_+(k) = 3k$ for all $k \in \mathbb{Z}_+$?

(108)


k= 1 (triangle is the answer);

k= 2 the above graph G₂ is the answer;

k= 3, Escalante, Montelar, Nasini (2006);

k= 4?

(109)

What about the polyhedral graph ranks?

Conjecture (Lipták, T. [2003]): $r_0(G) = r(G)$ $\forall$ graphs $G$.

True for:

bipartite graphs, series-parallel graphs, perfect graphs and odd-star-subdivisions of graphs in $\mathcal{B}$ (which contains cliques and wheels, among many other graphs), antiholes and graphs that have $\mathrm{LS}_0$-rank $\le 2$. Also true for all 8-node graphs, and for 9-node graphs that contain a 7-hole or a 7-antihole as an induced subgraph (Au [2008]).

(110)

[2000], Goemans and T. [2001], Laurent [2002], Laurent [2003], Aguilera, Bianchi and Nasini [2004], Escalante, Montelar and Nasini [2006], Arora, Bollobás, Lovász and Tourlakis [2006], Cheung [2007], Georgiou, Magen, Pitassi, Tourlakis [2007], Schoenebeck, Trevisan and Tulsiani [2007], Charikar, Makarychev and Makarychev [2009], Mathieu and Sinclair [2009], Raghavendra and Steurer [2009], Benabbas and Magen [2010], Karlin, Mathieu and Thach Nguyen [2011], Chan, Lee, Raghavendra and Steurer [2013], Thapper and Živný [2016].

Many of the lower bound proofs have been unified/generalized:

Hong and T. [2008].

Other work on convex relaxation methods on the stable set

(111)

Stronger “lower bound” results via study of extended complexity.

(112)

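[Figure: the same diagram of lift-and-project operators ($\mathrm{BCC}_{(j)}$, $\mathrm{LS}_0$, $\mathrm{LS}$, $\mathrm{SA}$, $\mathrm{SA}'$, $\mathrm{BZ}$, $\mathrm{BZ}''$, $\mathrm{BZ}'$, $\mathrm{LS}_+$, $\mathrm{SA}_+$, $\mathrm{SA}'_+$, $\mathrm{BZ}_+$, $\mathrm{BZ}''_+$, $\mathrm{BZ}'_+$, $\mathrm{Las}$), grouped into PSD operators and polyhedral operators and marked as tractable with a weak separation oracle for $P$ or with a facet description of $P$.]

Figure: Various properties of lift-and-project operators (Au and T. [2011,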

(113)

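Denote $\{0,1\}^d$ by $\mathcal{F}$. Define $\mathcal{A} := 2^{\mathcal{F}}$. For each $x \in \mathcal{F}$, we define the vector $x^{\mathcal{A}} \in \mathbb{R}^{\mathcal{A}}$ such that

$$x^{\mathcal{A}}_\alpha = \begin{cases} 1, & \text{if } x \in \alpha; \\ 0, & \text{otherwise.} \end{cases}$$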

(114)

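For any given $x \in \mathcal{F}$, if we define $Y^x_{\mathcal{A}} := x^{\mathcal{A}} (x^{\mathcal{A}})^\top$, then the following must hold:

$Y^x_{\mathcal{A}} e_0 = (Y^x_{\mathcal{A}})^\top e_0 = \operatorname{diag}(Y^x_{\mathcal{A}}) = x^{\mathcal{A}}$;

$Y^x_{\mathcal{A}} e_\alpha \in \left\{ 0, x^{\mathcal{A}} \right\}$, $\forall \alpha \in \mathcal{A}$;

$Y^x_{\mathcal{A}} \in \mathbb{S}^{\mathcal{A}}_+$;

$Y^x_{\mathcal{A}}[\alpha, \beta] = 1 \iff x \in \alpha \cap \beta$;

If $\alpha_1 \cap \beta_1 = \alpha_2 \cap \beta_2$, then $Y^x_{\mathcal{A}}[\alpha_1, \beta_1] = Y^x_{\mathcal{A}}[\alpha_2, \beta_2]$.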
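These properties can be verified by brute force for a tiny case. The sketch below takes $d = 2$, so $\mathcal{F}$ has 4 points and $\mathcal{A}$ has 16 subsets. It assumes that the unit vector written $e_0$ above is the one indexed by the full set $\mathcal{F} \in \mathcal{A}$; the variable and function names are illustrative. Positive semidefiniteness is not tested separately because $Y^x_{\mathcal{A}}$ is a rank-one matrix of the form $zz^\top$.

```python
from itertools import combinations, product

d = 2
F = list(product((0, 1), repeat=d))                      # the set {0,1}^d
A = [frozenset(c) for r in range(len(F) + 1)             # all subsets of F
     for c in combinations(F, r)]
full = A.index(frozenset(F))                             # assumed index of e_0

def lift_vector(x):
    """x^A: entry alpha is 1 exactly when x belongs to alpha."""
    return [1 if x in alpha else 0 for alpha in A]

def lift_matrix(x):
    """Y = x^A (x^A)^T."""
    xa = lift_vector(x)
    return [[a * b for b in xa] for a in xa]

def check_properties(x):
    xa, Y = lift_vector(x), lift_matrix(x)
    n = len(A)
    column = lambda k: [Y[i][k] for i in range(n)]
    zero = [0] * n
    return (column(full) == xa                                   # Y e_0 = x^A
            and Y[full] == xa                                    # Y^T e_0 = x^A
            and [Y[i][i] for i in range(n)] == xa                # diag(Y) = x^A
            and all(column(k) in (zero, xa) for k in range(n))   # Y e_alpha in {0, x^A}
            and all((Y[i][j] == 1) == (x in (A[i] & A[j]))       # entry rule
                    for i in range(n) for j in range(n)))

results = {x: check_properties(x) for x in F}
```

The last property in the list (entries depend only on $\alpha \cap \beta$) follows from the entry rule that the sketch checks.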

(115)

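Given $S \subseteq [d]$ and $t \in \{0,1\}$, we define

$$S|_t := \{x \in \mathcal{F} : x_i = t,\ \forall i \in S\}.$$

For any integer $i \in [0,d]$, define

$$\mathcal{A}_i := \{S|_1 \cap T|_0 : S, T \subseteq [d],\ S \cap T = \emptyset,\ |S| + |T| \le i\}$$

and

$$\mathcal{A}^+_i := \{S|_1 : S \subseteq [d],\ |S| \le i\}.$$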

(116)

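Let $\widetilde{\mathrm{SA}}^k(P)$ denote the set of matrices $Y \in \mathbb{R}^{\mathcal{A}^+_1 \times \mathcal{A}_k}$ that satisfy all of the following conditions:

(SA 1) $Y[\mathcal{F}, \mathcal{F}] = 1$;

(SA 2) $\hat{x}(Ye_\alpha) \in K(P)$ for every $\alpha \in \mathcal{A}_k$;

(SA 3) For each $S|_1 \cap T|_0 \in \mathcal{A}_{k-1}$, impose $Ye_{S|_1 \cap T|_0} = Ye_{S|_1 \cap T|_0 \cap j|_1} + Ye_{S|_1 \cap T|_0 \cap j|_0}$, $\forall j \in [d] \setminus (S \cup T)$;

(SA 4) For each $\alpha \in \mathcal{A}^+_1$, $\beta \in \mathcal{A}_k$ such that $\alpha \cap \beta = \emptyset$, impose $Y[\alpha, \beta] = 0$;

(SA 5) For every $\alpha_1, \alpha_2 \in \mathcal{A}^+_1$, $\beta_1, \beta_2 \in \mathcal{A}_k$ such that $\alpha_1 \cap \beta_1 = \alpha_2 \cap \beta_2$, impose $Y[\alpha_1, \beta_1] = Y[\alpha_2, \beta_2]$.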

(117)

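Let $\widetilde{\mathrm{SA}}^k_+(P)$ denote the set of matrices $Y \in \mathbb{S}^{\mathcal{A}_k}_+$ that satisfy all of the following conditions:

(SA+ 1) (SA 1), (SA 2) and (SA 3);

(SA+ 2) For each $\alpha, \beta \in \mathcal{A}_k$ such that $\operatorname{conv}(\alpha) \cap \operatorname{conv}(\beta) \cap P = \emptyset$, impose $Y[\alpha, \beta] = 0$;

(SA+ 3) For any $\alpha_1, \alpha_2, \beta_1, \beta_2 \in \mathcal{A}_k$ such that $\alpha_1 \cap \beta_1 = \alpha_2 \cap \beta_2$, impose $Y[\alpha_1, \beta_1] = Y[\alpha_2, \beta_2]$.

Define

$$\mathrm{SA}^k(P) := \left\{ x \in \mathbb{R}^d : \exists Y \in \widetilde{\mathrm{SA}}^k(P) : Ye_{\mathcal{F}} = \hat{x} \right\}$$

and

$$\mathrm{SA}^k_+(P) := \left\{ x \in \mathbb{R}^d : \exists Y \in \widetilde{\mathrm{SA}}^k_+(P) : \hat{x}(Ye_{\mathcal{F}}) = \hat{x} \right\}.$$

The $\mathrm{SA}^k_+$ operator extends the lifted space of the $\mathrm{SA}^k$ operator to a set of square matrices, and imposes an additional positive semidefiniteness constraint. Moreover, $\mathrm{SA}^k_+$ refines the operator $\mathrm{LS}^k_+$.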