Quasi-Randomness and Algorithmic Regularity for Graphs with General Degree Distributions*


Noga Alon^1**, Amin Coja-Oghlan^2***, Hiệp Hàn^3†, Mihyun Kang^4‡, Vojtěch Rödl^5§, and Mathias Schacht^3¶

1 School of Mathematics and Computer Science, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel

[email protected]

2 University of Edinburgh, School of Informatics, Edinburgh EH8 9AB, UK [email protected]

3 Humboldt-Universität zu Berlin, Institut für Informatik, Unter den Linden 6, 10099 Berlin, Germany {hhan,schacht}@informatik.hu-berlin.de

4 Technische Universität Berlin, Institut für Mathematik, Straße des 17. Juni 136, 10623 Berlin, Germany [email protected]

5 Department of Mathematics and Computer Science, Emory University, Atlanta, GA 30322, USA [email protected]

Abstract. We deal with two intimately related subjects: quasi-randomness and regular partitions. The purpose of the concept of quasi-randomness is to measure how much a given graph “resembles” a random one. Moreover, a regular partition approximates a given graph by a bounded number of quasi-random graphs. Regarding quasi-randomness, we present a new spectral characterization of low discrepancy, which extends to sparse graphs. Concerning regular partitions, we introduce a concept of regularity that takes into account vertex weights, and show that if G = (V, E) satisfies a certain boundedness condition, then G admits a regular partition. In addition, building on the work of Alon and Naor [4], we provide an algorithm that computes a regular partition of a given (possibly sparse) graph G in polynomial time. As an application, we present a polynomial time approximation scheme for MAX CUT on (sparse) graphs without “dense spots”.

Key words: quasi-random graphs, Laplacian eigenvalues, regularity lemma, Grothendieck’s inequality.

1 Introduction and Results

This paper deals with quasi-randomness and regular partitions. Loosely speaking, a graph is quasi-random if the global distribution of the edges resembles the expected edge distribution of a random graph. Furthermore, a regular partition approximates a given graph by a constant number of quasi-random graphs. Such partitions are of algorithmic importance, because a number of NP-hard problems can be solved in polynomial time on graphs that come with regular partitions. In this section we present our main results and discuss related work. The remaining sections contain the proofs and detailed descriptions of the algorithms.

1.1 Quasi-Randomness: Discrepancy and Eigenvalues

Random graphs are well known to have a number of remarkable properties (e.g., excellent expansion). Therefore, quantifying how much a given graph “resembles” a random one is an important problem, both from a structural and an algorithmic point of view. Providing such measures is the purpose of the notion of quasi-randomness. While this concept is rather well developed for dense graphs (i.e., graphs G = (V, E) with |E| = Ω(|V|²)), less is known in the sparse case, which we deal with in the present work. In fact, we shall deal with (sparse) graphs with general degree distributions, including but not limited to the ubiquitous power-law degree distributions (cf. [1]).

* An extended abstract version of this work appeared in the Proceedings of ICALP 2007.

** Supported by an ISF grant, an ERC Advanced grant, and by the Hermann Minkowski Minerva Center for Geometry at Tel Aviv University.

*** Supported by DFG grant CO 646.

† Supported by DFG within the research training group “Methods for Discrete Structures”.

‡ Supported by DFG grant PR 296/7-3 and DFG grant KA 2748/2-1.

§ Supported by NSF Grants DMS 0300529 and DMS 0800070.

¶ Supported by DFG grant SCHA 1263/1-1 and GIF Grant 889/05.

We will mainly consider two types of quasi-random properties: low discrepancy and eigenvalue separation. The low discrepancy property concerns the global edge distribution and basically states that every set S of vertices approximately spans as many edges as we would expect in a random graph with the same degree distribution. More precisely, if G = (V, E) is a graph, then we let d_v signify the degree of v ∈ V. Furthermore, the volume of a set S ⊂ V is vol(S) = Σ_{v∈S} d_v. In addition, if S, T ⊂ V are disjoint sets, then e(S, T) denotes the number of S-T-edges in G, and e(S) is two times the number of edges spanned by the set S. For not necessarily disjoint sets S, T ⊂ V we let e(S, T) = e(S \ T, T \ S) + e(S ∩ T).

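Disc(ε): We say that G has discrepancy at most ε (“G has Disc(ε)” for short) if

$$\forall S \subset V:\quad \left| e(S) - \frac{\mathrm{vol}(S)^2}{\mathrm{vol}(V)} \right| < 2\varepsilon \cdot \mathrm{vol}(V). \tag{1}$$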

To explain (1), let d = (d_v)_{v∈V}, and let G(d) signify a random graph with expected degree distribution d; that is, any two vertices v, w are adjacent with probability p_vw = d_v d_w / vol(V) independently. Then in G(d) the expected number of edges inside of S ⊂ V equals

$$\frac12 \sum_{(v,w) \in S^2} p_{vw} = \frac12 \cdot \frac{\mathrm{vol}(S)^2}{\mathrm{vol}(V)}.$$

Consequently, (1) just says that for any set S the actual number of edges inside of S must not deviate from what we expect in G(d) by more than an ε-fraction of the total volume.
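To make condition (1) concrete, the following minimal sketch evaluates the left-hand side of (1) for every subset S of a small graph by brute force. It is an illustration only, not one of the algorithms of this paper: the function names (`max_discrepancy`, `has_disc`) are hypothetical, the graph is assumed to be simple with vertices 0, ..., n−1, and the running time is exponential in n.

```python
from itertools import combinations


def max_discrepancy(n, edges):
    """Return max over S of |e(S) - vol(S)^2 / vol(V)|.

    e(S) is twice the number of edges inside S, as in Section 1.1.
    Brute force over all 2^n subsets, so only usable for tiny graphs.
    """
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    vol_v = sum(deg)
    worst = 0.0
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            s = set(subset)
            e_s = 2 * sum(1 for u, v in edges if u in s and v in s)
            vol_s = sum(deg[v] for v in s)
            worst = max(worst, abs(e_s - vol_s ** 2 / vol_v))
    return worst


def has_disc(n, edges, eps):
    """Check Disc(eps), i.e. condition (1), by exhaustive search."""
    vol_v = 2 * len(edges)
    return max_discrepancy(n, edges) < 2 * eps * vol_v


# Complete graph K4: every vertex has degree 3 and vol(V) = 12.
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(max_discrepancy(4, k4))
```

For K4 the worst set has two vertices, where e(S) = 2 and vol(S)²/vol(V) = 3. The exhaustive search is exactly the exponential inspection that the eigenvalue conditions of this section are meant to avoid.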

An obvious problem with the bounded discrepancy property (1) is that it seems quite difficult to check whether G = (V, E) satisfies this condition. This is because apparently one would have to inspect an exponential number of subsets S ⊂ V. Therefore, we consider a second property that refers to the eigenvalues of a certain matrix representing G. More precisely, we will deal with the normalized Laplacian L(G), whose entries (ℓ_vw)_{v,w∈V} are defined as

$$\ell_{vw} = \begin{cases} 1 & \text{if } v = w \text{ and } d_v \ge 1, \\ -(d_v d_w)^{-1/2} & \text{if } v, w \text{ are adjacent}, \\ 0 & \text{otherwise.} \end{cases}$$

Due to the normalization by the geometric mean √(d_v d_w) of the vertex degrees, L(G) turns out to be appropriate for representing graphs with general degree distributions. Moreover, L(G) is well known to be positive semidefinite, and the multiplicity of the eigenvalue 0 equals the number of connected components of G (cf. [9]).

Eig(δ): Letting 0 = λ_1[L(G)] ≤ · · · ≤ λ_{|V|}[L(G)] denote the eigenvalues of L(G), we say that G has δ-eigenvalue separation (“G has Eig(δ)”) if 1 − δ ≤ λ_2[L(G)] ≤ λ_{|V|}[L(G)] ≤ 1 + δ.

As the eigenvalues of L(G) can be computed in polynomial time (within arbitrary numerical precision), we can essentially check efficiently whether G has Eig(δ) or not.
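As an illustration of this check, the sketch below builds the normalized Laplacian of a small graph and tests Eig(δ) numerically. It is a sketch under stated assumptions rather than a prescribed method: the eigenvalues are obtained with a textbook cyclic Jacobi iteration (any symmetric eigenvalue solver would do), the function names are hypothetical, and the comparison is subject to floating-point error.

```python
import math


def normalized_laplacian(n, edges):
    """Normalized Laplacian of a simple graph on vertices 0..n-1."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    lap = [[0.0] * n for _ in range(n)]
    for v in range(n):
        if deg[v] >= 1:
            lap[v][v] = 1.0
    for u, v in edges:
        lap[u][v] = lap[v][u] = -1.0 / math.sqrt(deg[u] * deg[v])
    return lap


def symmetric_eigenvalues(matrix, max_sweeps=100, tol=1e-24):
    """Eigenvalues of a symmetric matrix (ascending) via cyclic Jacobi rotations."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that annihilates the (p, q) entry.
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # A <- J^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def has_eig(n, edges, delta):
    """Check Eig(delta): 1 - delta <= lambda_2 and lambda_max <= 1 + delta."""
    eig = symmetric_eigenvalues(normalized_laplacian(n, edges))
    return 1 - delta <= eig[1] and eig[-1] <= 1 + delta


k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
eig = symmetric_eigenvalues(normalized_laplacian(4, k4))
```

For the complete graph K4 the normalized Laplacian has the eigenvalue 0 once and the eigenvalue 4/3 three times, so K4 has Eig(δ) precisely for δ ≥ 1/3.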

It is not difficult to see that Eig(δ) provides a sufficient condition for Disc(ε). That is, for any ε > 0 there is a δ > 0 such that any graph G that has Eig(δ) also has Disc(ε). However, while the converse implication is true if G is dense (i.e., vol(V) = Ω(|V|²)), it is false for sparse graphs. In fact, providing a necessary condition for Disc(ε) in terms of eigenvalues has been an open problem in the area of sparse quasi-random graphs since the work of Chung and Graham [11]. Concerning this problem, we basically observe that the reason why Disc(ε) does in general not imply Eig(δ) is the existence of a small set of “exceptional” vertices.

ess-Eig(δ): We say that G has essential δ-eigenvalue separation (“G has ess-Eig(δ)”) if there is a set W ⊂ V of volume vol(W) ≥ (1 − δ)vol(V) such that the following is true. Let L(G)_W = (ℓ_vw)_{v,w∈W} denote the minor of L(G) induced on W × W, and let λ_1[L(G)_W] ≤ · · · ≤ λ_{|W|}[L(G)_W] signify its eigenvalues. Then we require that 1 − δ < λ_2[L(G)_W] ≤ λ_{|W|}[L(G)_W] < 1 + δ.

Theorem 1. There is a constant γ > 0 such that the following is true for all graphs G = (V, E) and all ε > 0.

1. If G has ess-Eig(ε), then G satisfies Disc(10√ε).

2. If G has Disc(γε²), then G satisfies ess-Eig(ε).

The main contribution is the second implication. Its proof is based on Grothendieck’s inequality and the duality theorem for semidefinite programs. In effect, the proof actually provides us with an efficient algorithm that computes a set W as in the definition of ess-Eig(ε). The second part of Theorem 1 is best possible, up to the precise value of the constant γ (see Section 6).

1.2 The Algorithmic Regularity Lemma

Loosely speaking, a regular partition of a graph G = (V, E) is a partition (V_1, . . . , V_t) of V such that for “most” index pairs 1 ≤ i < j ≤ t the bipartite subgraph spanned by V_i and V_j is quasi-random. Thus, a regular partition approximates G by quasi-random graphs. Furthermore, the number t of classes may depend on a parameter ε that governs the accuracy of the approximation, but it does not depend on the order of the graph G itself. Therefore, if for some class of graphs we can compute regular partitions in polynomial time, then this graph class will admit polynomial time algorithms for various problems that are NP-hard in general.

In the sequel we introduce a new concept of regular partitions that takes into account a given “ambient” weight distribution D = (D_v)_{v∈V}, which is an arbitrary sequence of rationals between 1 and n = |V|. We will see at the end of this section how this relates to the notion of quasi-randomness discussed in the previous section. Let G = (V, E) be a graph. For subsets X, Y ⊂ V we set

$$\varrho(X, Y) = \frac{e(X, Y)}{D(X)\,D(Y)}, \qquad \text{where } D(U) = \sum_{u \in U} D_u \text{ for any } U \subset V.$$

Further, we say that for disjoint sets X, Y ⊂ V the pair (X, Y) is (ε, D)-regular if for all X′ ⊂ X, Y′ ⊂ Y satisfying D(X′) ≥ εD(X), D(Y′) ≥ εD(Y) we have

$$\left| e(X', Y') - \varrho(X, Y)\, D(X')\, D(Y') \right| \le \varepsilon \cdot \frac{D(X)\, D(Y)}{D(V)}. \tag{2}$$

Roughly speaking, (2) states that the bipartite graph spanned by X and Y is “quasi-random” with respect to the vertex weights D.
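The definition of an (ε, D)-regular pair can be checked directly on very small instances. The following sketch enumerates all admissible subsets X′, Y′ and tests (2). It is meant only to illustrate the definition: the names `is_regular_pair`, `edge_count` and `weight` are hypothetical, D is assumed to be given as a dictionary on all of V, and the enumeration is exponential in |X| + |Y|.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of the given collection as a set."""
    items = list(items)
    for k in range(len(items) + 1):
        for chosen in combinations(items, k):
            yield set(chosen)


def weight(D, U):
    """D(U): total weight of the vertex set U."""
    return sum(D[u] for u in U)


def edge_count(edges, xs, ys):
    """e(X, Y) for disjoint sets X, Y: the number of X-Y edges."""
    return sum(1 for u, v in edges
               if (u in xs and v in ys) or (u in ys and v in xs))


def is_regular_pair(edges, D, X, Y, eps):
    """Check condition (2) for the disjoint pair (X, Y) by exhaustive search."""
    total = weight(D, D.keys())  # D(V)
    dx, dy = weight(D, X), weight(D, Y)
    rho = edge_count(edges, X, Y) / (dx * dy)
    bound = eps * dx * dy / total
    for xs in subsets(X):
        if weight(D, xs) < eps * dx:
            continue
        for ys in subsets(Y):
            if weight(D, ys) < eps * dy:
                continue
            expected = rho * weight(D, xs) * weight(D, ys)
            if abs(edge_count(edges, xs, ys) - expected) > bound:
                return False
    return True


# Complete bipartite graph K_{2,2} with weights equal to the degrees.
k22 = [(0, 2), (0, 3), (1, 2), (1, 3)]
deg_k22 = {0: 2, 1: 2, 2: 2, 3: 2}

# Perfect matching on the same vertex classes, weights equal to the degrees.
matching = [(0, 2), (1, 3)]
deg_matching = {0: 1, 1: 1, 2: 1, 3: 1}
```

In K_{2,2} every sub-pair spans exactly the expected number of edges, so the pair is regular for every ε > 0. In the perfect matching, the sub-pair ({0}, {2}) spans one edge where 1/2 is expected, which violates (2) for ε = 1/4.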

In the present notation, Szemerédi’s original regularity lemma [23] states that every graph G admits a regular partition with respect to the weight distribution D(v) = n for all v ∈ V. However, if G is sparse (i.e., |E| ≪ |V|²), then such a regular partition is not helpful because the bound on the r.h.s. of (2) exceeds |E|. To obtain an appropriate bound, we would have to consider a weight distribution such that D(v) ≪ n for (at least) some v ∈ V. But with respect to such weight distributions regular partitions do not necessarily exist. The basic obstacle is the presence of large “dense spots” (X, Y), where e(X, Y) is far bigger than the term D(X)D(Y) suggests. To rule these out, we consider the following notion.

(C, η, D)-boundedness. Let C ≥ 1 and η > 0. A graph G is (C, η, D)-bounded if for all X, Y ⊂ V with D(X), D(Y) ≥ ηD(V) we have ϱ(X, Y)D(V) ≤ C.

To illustrate the boundedness condition, consider a random graph G(D) with expected degree sequence D such that D(V) ≫ n. Then for any two disjoint sets X, Y ⊂ V we have E[e(X, Y)] = D(X)D(Y)/D(V) + o(D(V)). Hence, Chernoff bounds imply that for all X, Y simultaneously we have e(X, Y) = D(X)D(Y)/D(V) + o(D(V)) with probability 1 − o(1) as n → ∞. Therefore, for any fixed η > 0 the random graph G(D) is (1 + o(1), η, D)-bounded with probability 1 − o(1).
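The boundedness condition can likewise be tested by brute force on toy examples. The sketch below computes the largest value of ϱ(X, Y)D(V) over all pairs of sufficiently heavy sets. The helper names are hypothetical, e(X, Y) for overlapping sets is computed literally from the convention e(X, Y) = e(X \ Y, Y \ X) + e(X ∩ Y) of Section 1.1, η is assumed to be positive, and the search takes time 4^n.

```python
from itertools import combinations


def all_subsets(vertices):
    """Yield every subset of the vertex collection as a set."""
    vertices = list(vertices)
    for k in range(len(vertices) + 1):
        for chosen in combinations(vertices, k):
            yield set(chosen)


def e_pair(edges, X, Y):
    """e(X, Y) = e(X \\ Y, Y \\ X) + e(X & Y); e(S) counts inner edges twice."""
    only_x, only_y, both = X - Y, Y - X, X & Y
    cross = sum(1 for u, v in edges
                if (u in only_x and v in only_y) or (u in only_y and v in only_x))
    inner = 2 * sum(1 for u, v in edges if u in both and v in both)
    return cross + inner


def max_scaled_density(edges, D, eta):
    """Max of rho(X, Y) * D(V) over all X, Y with D(X), D(Y) >= eta * D(V)."""
    total = sum(D.values())
    worst = 0.0
    for X in all_subsets(D):
        dx = sum(D[v] for v in X)
        if dx == 0 or dx < eta * total:
            continue
        for Y in all_subsets(D):
            dy = sum(D[v] for v in Y)
            if dy == 0 or dy < eta * total:
                continue
            worst = max(worst, e_pair(edges, X, Y) * total / (dx * dy))
    return worst


def is_bounded(edges, D, C, eta):
    """Check (C, eta, D)-boundedness by exhaustive search."""
    return max_scaled_density(edges, D, eta) <= C


# A single edge {0, 1} with unit weights.
single_edge = [(0, 1)]
unit = {0: 1, 1: 1}
```

For a single edge with unit weights, the pair ({0}, {1}) gives ϱ(X, Y)D(V) = 2, whereas X = Y = {0, 1} gives 1. Raising η to 1 excludes the small sets, which is how the parameter η keeps small “dense spots” from violating the condition.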

Now, we can state the following algorithmic regularity lemma for graphs with general degree distributions, which not only ensures the existence of regular partitions, but also that such a partition can be computed efficiently. We let ⟨D⟩ signify the encoding length of a weight distribution D = (D_v)_{v∈V}, i.e., the number of bits that are needed to write down the rationals (D_v)_{v∈V}. Observe that ⟨D⟩ ≥ n.

Theorem 2. For any two numbers C ≥ 1 and ε > 0 there exist η > 0 and n_0 > 0 such that for all n ≥ n_0 and every sequence of rationals D = (D_v)_{v∈V} with |V| = n and 1 ≤ D_v ≤ n for all v ∈ V the following holds. If G = (V, E) is a (C, η, D)-bounded graph and D(V) ≥ η⁻¹n, then there is a partition P = {V_i : 0 ≤ i ≤ t} of V that satisfies the following two properties:

REG1. (a) ηD(V) ≤ D(V_i) ≤ εD(V) for all i = 1, . . . , t,
(b) D(V_0) ≤ εD(V), and
(c) |D(V_i) − D(V_j)| < max_{v∈V} D_v for all 1 ≤ i < j ≤ t.

REG2. Let L be the set of all pairs (i, j), 1 ≤ i < j ≤ t, such that (V_i, V_j) is not (ε, D)-regular. Then

$$\sum_{(i,j) \in L} D(V_i)\, D(V_j) \le \varepsilon\, D(V)^2.$$

Furthermore, for fixed C and ε the partition P can be computed in polynomial time. More precisely, there exist a function f and a polynomial Π such that the partition P can be computed in time f(C, ε) · Π(⟨D⟩).

Condition REG1 states that all of the classes V_1, . . . , V_t have approximately the same, non-negligible weight, while the “exceptional” class V_0 has a “small” weight. Also note that due to REG1(a) the number of classes t of the partition P is bounded by 1/η, which only depends on C and ε, but not on G, D, or n. Moreover, REG2 requires that the total weight of the irregular pairs (V_i, V_j) is small relative to the total weight. Thus, a partition P that satisfies REG1 and REG2 approximates G by a bounded number of bipartite quasi-random graphs.
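Condition REG1 only concerns the weights of the classes and is therefore easy to verify for a given partition. A minimal sketch follows, in which the function name `check_reg1` and the representation of the partition (a list of vertex sets plus a separate exceptional set) are assumptions made for illustration.

```python
def check_reg1(D, classes, exceptional, eps, eta):
    """Verify REG1(a)-(c) for classes V_1..V_t and the exceptional class V_0.

    D maps each vertex to its weight D_v; classes is a list of vertex sets.
    """
    total = sum(D.values())  # D(V)
    weights = [sum(D[v] for v in cls) for cls in classes]
    # (a) every regular class has weight between eta*D(V) and eps*D(V)
    cond_a = all(eta * total <= w <= eps * total for w in weights)
    # (b) the exceptional class is light
    cond_b = sum(D[v] for v in exceptional) <= eps * total
    # (c) any two classes differ in weight by less than the largest vertex weight
    d_max = max(D.values())
    cond_c = all(abs(weights[i] - weights[j]) < d_max
                 for i in range(len(weights))
                 for j in range(i + 1, len(weights)))
    return cond_a and cond_b and cond_c


D = {v: 2 for v in range(8)}
classes = [{0, 1}, {2, 3}, {4, 5}, {6, 7}]
```

With eight vertices of weight 2 split into four classes of weight 4, REG1 holds for ε = η = 1/4 and fails for ε = 0.2, since each class then exceeds εD(V) = 3.2.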

We illustrate the use of Theorem 2 with the example of the MAX CUT problem. While approximating MAX CUT within a ratio better than 16/17 is NP-hard on general graphs [19,24], the following theorem provides a polynomial time approximation scheme for (C, η, D)-bounded graphs.

Theorem 3. For any δ > 0 and C ≥ 1 there exist two numbers η > 0, n_0 > 0 and a polynomial time algorithm ApxMaxCut such that for all n ≥ n_0 and every sequence of rationals D = (D_v)_{v∈V} with |V| = n and 1 ≤ D_v ≤ n for all v ∈ V the following is true. If G = (V, E) is a (C, η, D)-bounded graph and D(V) > η⁻¹n, then ApxMaxCut outputs a cut of G that approximates the maximum cut up to an additive error of δ|D(V)|.

Finally, let us discuss a few examples and applications of the above results.

1. If we let D(v) = n for all v ∈ V, then Theorem 2 is just an algorithmic version of Szemerédi’s regularity lemma. Such a result was established previously in [3].

2. Suppose that D(v) = d̄ for some number d̄ = d̄(n) = o(n). Then the above notions of regularity and boundedness coincide with those of the classical “sparse regularity lemma” of Kohayakawa [21] and Rödl (unpublished). Hence, Theorem 2 provides an algorithmic version of this regularity concept. This result has not been published previously (although it may have been known to experts in the area that this can be derived from Alon and Naor [4]). Actually devising an algorithm for computing a sparse regular partition is mentioned as an open problem in [21].

3. For a given graph G = (V, E) we could just use the degree sequence as a weight distribution, i.e., D(v) = d_v for all v ∈ V. Then D(U) = vol(U) for all U ⊂ V. Hence, the notion of regularity (2) is closely related to the notion of quasi-randomness from Section 1.1. The resulting regularity concept is a generalization of the “classical” sparse regularity lemma. The new concept allows for graphs with highly irregular degree distributions.

1.3 Further Related Work

Quasi-random graphs with general degree distributions were first studied by Chung and Graham [10]. They considered the properties Disc(ε) and Eig(δ), and a number of further related ones (e.g., concerning weighted cycles). Chung and Graham observed that Eig(δ) implies Disc(ε), and that the converse is true in the case of dense graphs (i.e., vol(V) = Ω(|V|²)).

Regarding the step from discrepancy to eigenvalue separation, Butler [8] proved that any graph G such that for all sets X, Y ⊂ V the bound

$$\left| e(X, Y) - \frac{\mathrm{vol}(X)\,\mathrm{vol}(Y)}{\mathrm{vol}(V)} \right| \le \varepsilon \sqrt{\mathrm{vol}(X)\,\mathrm{vol}(Y)} \tag{3}$$

holds, satisfies Eig(O(ε(1 − ln ε))). His proof builds upon the work of Bilu and Linial [6], who derived a similar result for regular graphs, and on the earlier related work of Bollobás and Nikiforov [7].

Butler’s result relates to the second part of Theorem 1 as follows. The r.h.s. of (3) refers to the volumes of the sets X, Y, and may thus be significantly smaller than ε vol(V). By comparison, the second part of Theorem 1 just requires that the “original” discrepancy condition Disc(δ) is true, i.e., we just need to bound |e(S) − vol(S)²/vol(V)| in terms of the total volume vol(V). Hence, Butler shows that the “original” eigenvalue separation condition Eig follows from a stronger version of the discrepancy property. By contrast, Theorem 1 shows that the “original” discrepancy condition Disc implies a weak form of eigenvalue separation ess-Eig, thereby answering a question posed by Chung and Graham [10,11]. Furthermore, relying on Grothendieck’s inequality and SDP duality, the proof of Theorem 1 employs quite different techniques than [6,7,8].

In the present work we consider a concept of quasi-randomness that takes into account vertex degrees. Other concepts that do not refer to the degree sequence (and are therefore restricted to approximately regular graphs) were studied by Chung, Graham and Wilson [12] (dense graphs) and by Chung and Graham [11] (sparse graphs). Also in this setting it has been an open problem to derive eigenvalue separation from low discrepancy. Concerning this simpler concept of quasi-randomness, our techniques yield a result similar to Theorem 1 as well. The proof is similar and we omit the details.

Szemerédi’s original regularity lemma [23] has become an important tool in various areas, including extremal graph theory and property testing. Alon, Duke, Lefmann, Rödl, and Yuster [3] presented an algorithmic version, and showed how this lemma can be used to provide polynomial time approximation schemes for dense instances of NP-hard problems (see also [22] for a faster algorithm). Moreover, Frieze and Kannan [13] introduced a different algorithmic regularity concept, which yields better efficiency in terms of the desired approximation guarantee. Both [3,13] encompass Theorem 3 in the case that D(v) = n for all v ∈ V. The sparse regularity lemma from Kohayakawa [21] and Rödl (unpublished) is related to the notion of quasi-randomness from [11]. This concept of regularity has proved very useful in the theory of random graphs, see Gerke and Steger [15].

2 Preliminaries

2.1 Notation

If S ⊂ V is a subset of some set V, then we let 1_S ∈ R^V denote the vector whose entries are 1 on the components corresponding to elements of S, and 0 otherwise. More generally, if ξ ∈ R^V is a vector, then ξ_S ∈ R^V signifies the vector obtained from ξ by replacing all components with indices in V \ S by 0. Moreover, if A = (a_vw)_{v,w∈V} is a matrix, then A_S = (a_vw)_{v,w∈S} denotes the minor of A induced on S × S. Further, for a vector ξ ∈ R^V we let ‖ξ‖ signify the ℓ_2-norm, and for a matrix M ∈ R^{V×V} we let

$$\|M\| = \max_{0 \ne \xi \in \mathbb{R}^V} \frac{\|M\xi\|}{\|\xi\|} = \max_{\xi, \eta \in \mathbb{R}^V \setminus \{0\}} \frac{\langle M\xi, \eta \rangle}{\|\xi\| \cdot \|\eta\|}$$

denote the spectral norm.

If ξ = (ξ_v)_{v∈V} is a vector, then diag(ξ) signifies the V × V matrix with diagonal ξ and off-diagonal entries equal to 0. In particular, E = diag(1) denotes the identity matrix (of any size). Moreover, if M is a ν × ν matrix, then diag(M) ∈ R^ν signifies the vector comprising the diagonal entries of M. If both A = (a_ij)_{1≤i,j≤ν} and B = (b_ij)_{1≤i,j≤ν} are ν × ν matrices, then we let ⟨A, B⟩ = Σ_{i,j=1}^{ν} a_ij b_ij. If M is a symmetric ν × ν matrix, then

$$\lambda_1[M] \le \cdots \le \lambda_\nu[M] = \lambda_{\max}[M]$$

denote the eigenvalues of M. We will occasionally need the Courant-Fischer characterizations of λ_2 and λ_max, which read

$$\lambda_2[M] = \max_{0 \ne \zeta \in \mathbb{R}^\nu}\ \min_{\xi \perp \zeta,\ \|\xi\| = 1} \langle M\xi, \xi \rangle, \qquad \lambda_{\max}[M] = \max_{\zeta \in \mathbb{R}^\nu,\ \|\zeta\| = 1} \langle M\zeta, \zeta \rangle \tag{4}$$

(see [5, Chapter 7]). Recall that a symmetric matrix M is positive semidefinite if λ_1[M] ≥ 0. In this case we write M ≥ 0. Furthermore, M is positive definite if λ_1[M] > 0, denoted as M > 0. If M, M′ are symmetric, then M ≥ M′ (resp. M > M′) denotes the fact that M − M′ ≥ 0 (resp. M − M′ > 0).


2.2 Grothendieck’s inequality

An important ingredient to our proofs and algorithms is Grothendieck’s inequality. Let M = (m_ij)_{i,j∈𝓘} be a matrix. Then the cut-norm of M is

$$\|M\|_{\mathrm{cut}} = \max_{I, J \subset \mathcal{I}} \left| \sum_{(i,j) \in I \times J} m_{ij} \right|.$$
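For very small matrices the cut-norm can be evaluated exactly from its definition, which may help to fix ideas. The sketch below is a brute-force illustration with a hypothetical function name. It enumerates the row sets I only, because for a fixed I the best column set J consists either of all columns with positive restricted column sum or of all columns with negative one. The running time is still exponential in the number of rows.

```python
from itertools import combinations


def cut_norm(m):
    """Exact cut-norm of a small matrix given as a list of rows.

    Enumerates every row subset I; for fixed I the optimal J takes all
    columns whose sum over I is positive, or all whose sum is negative.
    """
    n_rows, n_cols = len(m), len(m[0])
    best = 0
    for k in range(n_rows + 1):
        for rows in combinations(range(n_rows), k):
            col_sums = [sum(m[i][j] for i in rows) for j in range(n_cols)]
            positive = sum(c for c in col_sums if c > 0)
            negative = -sum(c for c in col_sums if c < 0)
            best = max(best, positive, negative)
    return best


print(cut_norm([[1, -1], [-1, 1]]))
print(cut_norm([[1, 1], [1, 1]]))
```

In the first example the entries cancel on the full index set, and the maximum is attained on a single entry. In the second example the maximum is attained by taking all rows and all columns.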

In addition, consider the following optimization problem:

$$\mathrm{SDP}(M) = \max \sum_{i,j \in \mathcal{I}} m_{ij} \langle x_i, y_j \rangle \quad \text{s.t. } \forall i \in \mathcal{I}: \|x_i\| = \|y_i\| = 1,\ x_i, y_i \in \mathbb{R}^{2|\mathcal{I}|}. \tag{5}$$

This can be reformulated as a linear optimization problem over the cone of positive semidefinite 2|𝓘| × 2|𝓘| matrices, i.e., as a semidefinite program (see Alizadeh [2]).

Lemma 4. For any ν × ν matrix M we have

$$\mathrm{SDP}(M) = \frac12 \max \left\langle \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes M,\ X \right\rangle \quad \text{s.t. } \operatorname{diag}(X) = \mathbf{1},\ X \ge 0,\ X \in \mathbb{R}^{2\nu \times 2\nu}. \tag{6}$$

Proof. Let x_1, . . . , x_{2ν} ∈ R^{2ν} be a family of unit vectors such that SDP(M) = Σ_{i,j=1}^{ν} m_ij ⟨x_i, x_{j+ν}⟩. Then we obtain a positive semidefinite matrix X = (x_{i,j})_{1≤i,j≤2ν} by setting x_{i,j} = ⟨x_i, x_j⟩. Since x_{i,i} = ‖x_i‖² = 1 for all i, this matrix satisfies diag(X) = 1. Moreover,

$$\left\langle \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes M,\ X \right\rangle = 2 \sum_{i,j=1}^{\nu} m_{ij}\, x_{i,j+\nu} = 2 \sum_{i,j=1}^{\nu} m_{ij} \langle x_i, x_{j+\nu} \rangle. \tag{7}$$

Hence, the optimization problem on the r.h.s. of (6) yields an upper bound on SDP(M).

Conversely, if X = (x_{i,j}) is a feasible solution to (6), then there exist vectors x_1, . . . , x_{2ν} ∈ R^{2ν} such that x_{i,j} = ⟨x_i, x_j⟩, because X is positive semidefinite. Moreover, since diag(X) = 1, we have 1 = x_{i,i} = ‖x_i‖². Thus, x_1, . . . , x_{2ν} is a feasible solution to (5), and (7) shows that the resulting objective function values coincide. □

Grothendieck [17] established the following relation between SDP(M) and ‖M‖_cut.

Theorem 5. There is a constant θ > 1 such that for all matrices M we have ‖M‖_cut ≤ SDP(M) ≤ θ · ‖M‖_cut.

Since by Lemma 4 SDP(M) can be stated as a semidefinite program, an optimal solution to SDP(M) can be approximated in polynomial time within any numerical precision, e.g., via the ellipsoid method [18]. By applying an appropriate rounding procedure to a near-optimal solution to SDP(M), Alon and Naor [4] obtained the following algorithmic result.

Theorem 6. There are a constant θ′ > 0 and a polynomial time algorithm ApxCutNorm that on input M computes two sets I, J ⊂ 𝓘 such that

$$\theta' \cdot \|M\|_{\mathrm{cut}} \le \left| \sum_{i \in I,\, j \in J} m_{ij} \right|.$$

Alon and Naor presented a randomized algorithm that guarantees an approximation ratio θ′ > 0.56, and a deterministic one with θ′ ≥ 0.03. Finally, we need the following dual characterization of SDP. The proof can be found in the next section, Section 2.3.

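Lemma 7. For any symmetric n × n matrix Q we have

$$\mathrm{SDP}(Q) = n \cdot \min_{z \in \mathbb{R}^n,\ z \perp \mathbf{1}} \lambda_{\max}\left[ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes Q - \operatorname{diag}\begin{pmatrix} z \\ z \end{pmatrix} \right].$$

2.3 Proof of Lemma 7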

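The proof of Lemma 7 relies on the duality theorem for semidefinite programs. For a symmetric n × n matrix Q set

$$\mathcal{Q} = \frac12 \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes Q.$$

Furthermore, let

$$\mathrm{DSDP}(Q) = \min \langle \mathbf{1}, y \rangle \quad \text{s.t. } \mathcal{Q} \le \operatorname{diag}(y),\ y \in \mathbb{R}^{2n}.$$

Lemma 8. We have SDP(Q) = DSDP(Q).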

Proof. By Lemma 4 we can rewrite the vector program SDP(Q) in the standard form of a semidefinite program:

$$\mathrm{SDP}(Q) = \max \langle \mathcal{Q}, X \rangle \quad \text{s.t. } \operatorname{diag}(X) = \mathbf{1},\ X \ge 0,\ X \in \mathbb{R}^{(2n) \times (2n)}.$$

Since DSDP(Q) is the dual of SDP(Q), the lemma follows directly from the SDP duality theorem as stated in [20, Corollary 2.2.6]. □

To infer Lemma 7, we shall simplify DSDP and reformulate this semidefinite program as an eigenvalue minimization problem. First, we show that it suffices to optimize over y′ ∈ R^n rather than y ∈ R^{2n}.

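Lemma 9. Let

$$\mathrm{DSDP}'(Q) = \min 2\langle \mathbf{1}, y' \rangle \quad \text{s.t. } \mathcal{Q} \le \operatorname{diag}\left( \begin{pmatrix} 1 \\ 1 \end{pmatrix} \otimes y' \right),\ y' \in \mathbb{R}^n.$$

Then DSDP(Q) = DSDP′(Q).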

Proof. Since for any feasible solution y′ to DSDP′(Q) the vector y = (1, 1)^T ⊗ y′ is a feasible solution to DSDP(Q), we conclude that DSDP(Q) ≤ DSDP′(Q). Thus, we just need to establish the converse inequality DSDP′(Q) ≤ DSDP(Q).

To this end, let F(Q) ⊂ R^{2n} signify the set of all feasible solutions y to DSDP(Q). We shall prove that F(Q) is closed under the linear operator

$$I: \mathbb{R}^{2n} \to \mathbb{R}^{2n}, \quad (y_1, \ldots, y_n, y_{n+1}, \ldots, y_{2n}) \mapsto (y_{n+1}, \ldots, y_{2n}, y_1, \ldots, y_n),$$

i.e., I(F(Q)) ⊂ F(Q); note that I just swaps the first and the last n entries of y. To see that this implies the assertion, consider an optimal solution y = (y_i)_{1≤i≤2n} ∈ F(Q). Then ½(y + Iy) ∈ F(Q), because F(Q) is convex. Now, let y′ = (y′_i)_{1≤i≤n} be the projection of ½(y + Iy) onto the first n coordinates. Since ½(y + Iy) is a fixed point of I, we have ½(y + Iy) = (1, 1)^T ⊗ y′. Hence, the fact that ½(y + Iy) is feasible for DSDP(Q) implies that y′ is feasible for DSDP′(Q). Thus, we conclude that DSDP′(Q) ≤ 2⟨1, y′⟩ = ⟨1, y⟩ = DSDP(Q).

To show that F(Q) is closed under I, consider a vector y ∈ F(Q). Since diag(y) − 𝒬 is positive semidefinite, we have

$$\forall \eta \in \mathbb{R}^{2n}: \quad \langle (\operatorname{diag}(y) - \mathcal{Q})\eta, \eta \rangle \ge 0. \tag{8}$$

The objective is to show that diag(Iy) − 𝒬 is positive semidefinite, i.e.,

$$\forall \xi \in \mathbb{R}^{2n}: \quad \langle (\operatorname{diag}(Iy) - \mathcal{Q})\xi, \xi \rangle \ge 0. \tag{9}$$

To derive (9) from (8), we decompose y into its two halves y = (u, v)^T with u, v ∈ R^n. Then Iy = (v, u)^T. Moreover, let ξ = (α, β)^T ∈ R^{2n} be any vector, and set η = Iξ = (β, α)^T. We obtain

$$\langle (\operatorname{diag}(Iy) - \mathcal{Q})\xi, \xi \rangle = \langle \operatorname{diag}(v)\alpha, \alpha \rangle + \langle \operatorname{diag}(u)\beta, \beta \rangle - \frac{\langle Q\alpha, \beta \rangle + \langle Q\beta, \alpha \rangle}{2} = \langle (\operatorname{diag}(y) - \mathcal{Q})\eta, \eta \rangle \overset{(8)}{\ge} 0,$$

thereby proving (9). □

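Proof of Lemma 7. Let

$$\mathrm{DSDP}''(Q) = n \cdot \min_{z \in \mathbb{R}^n,\ z \perp \mathbf{1}} \lambda_{\max}\left[ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes Q + \operatorname{diag}\left( \begin{pmatrix} 1 \\ 1 \end{pmatrix} \otimes z \right) \right].$$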

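By Lemmas 8 and 9, it suffices to prove that DSDP′(Q) = DSDP″(Q).

To see that DSDP″(Q) ≤ DSDP′(Q), consider an optimal solution y′ to DSDP′(Q). Let λ = n⁻¹⟨1, y′⟩ and z = 2(λ1 − y′). Then ⟨z, 1⟩ = 2(nλ − ⟨1, y′⟩) = 0, whence z is a feasible solution to DSDP″(Q). Furthermore, as y′ is a feasible solution to DSDP′(Q), we have

$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes Q = 2\mathcal{Q} \le 2 \operatorname{diag}\left( \begin{pmatrix} 1 \\ 1 \end{pmatrix} \otimes y' \right) = 2\lambda E - \operatorname{diag}\left( \begin{pmatrix} 1 \\ 1 \end{pmatrix} \otimes z \right),$$

where E is the identity matrix. Hence, the matrix

$$2\lambda E - \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes Q - \operatorname{diag}\left( \begin{pmatrix} 1 \\ 1 \end{pmatrix} \otimes z \right)$$

is positive semidefinite. This implies that all eigenvalues of the matrix (0 1; 1 0) ⊗ Q + diag((1, 1)^T ⊗ z) are bounded by 2λ, i.e.,

$$\lambda_{\max}\left[ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes Q + \operatorname{diag}\left( \begin{pmatrix} 1 \\ 1 \end{pmatrix} \otimes z \right) \right] \le 2\lambda.$$

As a consequence,

$$\mathrm{DSDP}''(Q) \le n\, \lambda_{\max}\left[ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes Q + \operatorname{diag}\left( \begin{pmatrix} 1 \\ 1 \end{pmatrix} \otimes z \right) \right] \le 2n\lambda = 2\langle \mathbf{1}, y' \rangle = \mathrm{DSDP}'(Q).$$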

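Conversely, consider an optimal solution z to DSDP″(Q). Set

$$\mu = \lambda_{\max}\left[ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes Q + \operatorname{diag}\left( \begin{pmatrix} 1 \\ 1 \end{pmatrix} \otimes z \right) \right] = n^{-1}\, \mathrm{DSDP}''(Q), \qquad y' = \frac12 (\mu \mathbf{1} - z).$$

Since all eigenvalues of (0 1; 1 0) ⊗ Q + diag((1, 1)^T ⊗ z) are bounded by µ, the matrix µE − (0 1; 1 0) ⊗ Q − diag((1, 1)^T ⊗ z) is positive semidefinite, i.e.,

$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes Q \le \mu E - \operatorname{diag}\left( \begin{pmatrix} 1 \\ 1 \end{pmatrix} \otimes z \right).$$

Therefore,

$$\mathcal{Q} = \frac12 \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes Q \le \frac12 \left( \mu E - \operatorname{diag}\left( \begin{pmatrix} 1 \\ 1 \end{pmatrix} \otimes z \right) \right) = \operatorname{diag}\left( \begin{pmatrix} 1 \\ 1 \end{pmatrix} \otimes y' \right).$$

Hence, y′ is a feasible solution to DSDP′(Q). Furthermore, since z ⊥ 1 we obtain

$$\mathrm{DSDP}'(Q) \le 2\langle \mathbf{1}, y' \rangle = \mu n = \mathrm{DSDP}''(Q),$$

as desired. □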

3 Quasi-Randomness: Proof of Theorem 1

3.1 From Essential Eigenvalue Separation to Low Discrepancy

We prove the first part of Theorem 1. Suppose that G = (V, E) is a graph that admits a set W ⊂ V of volume vol(W) ≥ (1 − ε)vol(V) such that the eigenvalues of the minor ℒ_W = L(G)_W of the normalized Laplacian satisfy

$$1 - \varepsilon \le \lambda_2[\mathcal{L}_W] \le \lambda_{\max}[\mathcal{L}_W] \le 1 + \varepsilon. \tag{10}$$

We may assume without loss of generality that ε < 0.01. Our goal is to show that G has Disc(10√ε).

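Let ∆ = (√d_v)_{v∈W} ∈ R^W. Hence, ∆ is a real vector indexed by the elements of W. Moreover, let L_W denote the matrix whose vw’th entry is (d_v d_w)^{−1/2} if v, w are adjacent, and 0 otherwise (v, w ∈ W), so that ℒ_W = E − L_W. Further, let ℳ_W = vol(V)⁻¹∆∆^T − L_W. Then for all unit vectors ξ ⊥ ∆ we have

$$\mathcal{L}_W \xi - \xi = -L_W \xi = \mathcal{M}_W \xi. \tag{11}$$

Moreover, for all S ⊂ W

$$\left| \langle \mathcal{M}_W \Delta_S, \Delta_S \rangle \right| = \left| \frac{\mathrm{vol}(S)^2}{\mathrm{vol}(V)} - e(S) \right|. \tag{12}$$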

The key step of the proof is to derive the following bound on the operator norm of ℳ_W.

Lemma 10. We have ‖ℳ_W‖ ≤ 10√ε.

If it were the case that W = V, then Lemma 10 would be immediate. For if W = V, then ∆ is an eigenvector of ℒ = ℒ_W with eigenvalue 0. Hence, the definition ℳ_W = ‖∆‖⁻²∆∆^T − E + ℒ_W ensures that ℳ_W∆ = 0. Moreover, for all ξ ⊥ ∆ we have ℳ_Wξ = (ℒ_W − E)ξ, whence (10) implies that ‖ℳ_W‖ ≤ max{|λ_2[ℒ_W] − 1|, |λ_max[ℒ_W] − 1|} ≤ ε.

But of course generally W is a proper subset of V. In this case ∆ is not necessarily an eigenvector of ℒ_W. In fact, the smallest eigenvalue of ℒ_W may be strictly positive. In order to prove Lemma 10 we will investigate the eigenvector ζ of ℒ_W with the smallest eigenvalue λ_1[ℒ_W] and show that it is “close” to ∆. Then, we will use (10) to derive the desired bound on ‖ℳ_W‖.

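Proof of Lemma 10. Let ζ be a unit length eigenvector of ℒ_W with eigenvalue λ_1[ℒ_W]. There is a decomposition ∆ = ‖∆‖ · (sζ + tχ), where s² + t² = 1 and χ ⊥ ζ is a unit vector. Since ⟨ℒ_W∆, ∆⟩ = e(W, V \ W) ≤ vol(V \ W) ≤ ε vol(V) and ‖∆‖² = vol(W) ≥ (1 − ε)vol(V) ≥ 0.99 vol(V), we have

$$2\varepsilon \ge \|\Delta\|^{-2} \langle \mathcal{L}_W \Delta, \Delta \rangle = s^2 \langle \mathcal{L}_W \zeta, \zeta \rangle + t^2 \langle \mathcal{L}_W \chi, \chi \rangle. \tag{13}$$

As χ is perpendicular to the eigenvector ζ with eigenvalue λ_1[ℒ_W], Courant-Fischer (4) and (10) yield ⟨ℒ_Wχ, χ⟩ ≥ λ_2[ℒ_W] ≥ ½. Hence, (13) implies 2ε ≥ t²/2. Consequently,

$$t^2 \le 4\varepsilon, \quad \text{and thus} \quad s^2 \ge 1 - 4\varepsilon. \tag{14}$$

Now, let ξ ⊥ ∆ be a unit vector, and decompose ξ = xζ + yη, where η ⊥ ζ is a unit vector. Because ζ = s⁻¹(∆/‖∆‖ − tχ), we have

$$x = \langle \zeta, \xi \rangle = s^{-1} \left\langle \frac{\Delta}{\|\Delta\|}, \xi \right\rangle - \frac{t}{s} \langle \chi, \xi \rangle = -\frac{t}{s} \langle \chi, \xi \rangle.$$

Hence, (14) implies x² ≤ 5ε and y² ≥ 1 − 5ε. Combining these two estimates with (10) and (11), we conclude that ‖ℳ_Wξ‖ = ‖ℒ_Wξ − ξ‖ ≤ x(1 − λ_1[ℒ_W]) + y‖ℒ_Wη − η‖ ≤ 3√ε. Hence, we have established that

$$\sup_{0 \ne \xi \perp \Delta} \frac{\|\mathcal{M}_W \xi\|}{\|\xi\|} \le 3\sqrt{\varepsilon}. \tag{15}$$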

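Furthermore, since ‖∆‖² = vol(W), (12) implies

$$\begin{aligned} \frac{|\langle \mathcal{M}_W \Delta, \Delta \rangle|}{\|\Delta\|^2} &= \left| \frac{\mathrm{vol}(W)}{\mathrm{vol}(V)} - \frac{e(W)}{\mathrm{vol}(W)} \right| \le \left| \frac{\mathrm{vol}(W)}{\mathrm{vol}(V)} - \frac{e(W)}{\mathrm{vol}(V)} \right| + \left| \frac{e(W)}{\mathrm{vol}(W)} - \frac{e(W)}{\mathrm{vol}(V)} \right| \\ &= \frac{e(W, V \setminus W)}{\mathrm{vol}(V)} + \frac{e(W)\,(\mathrm{vol}(V) - \mathrm{vol}(W))}{\mathrm{vol}(V)\,\mathrm{vol}(W)} \\ &\le \frac{e(W, V \setminus W)}{\mathrm{vol}(V)} + \frac{\mathrm{vol}(V \setminus W)}{\mathrm{vol}(V)} \le \frac{2\,\mathrm{vol}(V \setminus W)}{\mathrm{vol}(V)}. \end{aligned} \tag{16}$$

As we are assuming that vol(W) ≥ (1 − ε)vol(V), we obtain ‖∆‖⁻²|⟨ℳ_W∆, ∆⟩| ≤ 2ε. Finally, combining this last estimate with (15), we conclude that ‖ℳ_W‖ ≤ 10√ε. □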

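Lemma 10 easily implies that G has Disc(10√ε). For let R ⊂ V be arbitrary, set S = R ∩ W, and let T = R \ W. Since ‖∆_S‖² = vol(S) ≤ vol(V), Lemma 10 and (12) imply that

$$\left| \frac{\mathrm{vol}(S)^2}{\mathrm{vol}(V)} - e(S) \right| \le \|\mathcal{M}_W\| \cdot \|\Delta_S\|^2 \le 10\sqrt{\varepsilon}\, \mathrm{vol}(V). \tag{17}$$

Furthermore, as vol(W) ≥ (1 − ε)vol(V),

$$e(R) - e(S) \le e(T) + 2e(S, T) \le 2\,\mathrm{vol}(T) \le 2\,\mathrm{vol}(V \setminus W) \le 2\varepsilon\, \mathrm{vol}(V),$$

and

$$\frac{\mathrm{vol}(R)^2 - \mathrm{vol}(S)^2}{2\,\mathrm{vol}(V)} \le \frac{\mathrm{vol}(T)^2}{2\,\mathrm{vol}(V)} + \frac{\mathrm{vol}(S)\,\mathrm{vol}(T)}{\mathrm{vol}(V)} \le \frac{\mathrm{vol}(V \setminus W)^2}{2\,\mathrm{vol}(V)} + \mathrm{vol}(V \setminus W) \le 2\varepsilon\, \mathrm{vol}(V).$$

Combining these two estimates with (17), we see that

$$\left| \frac{\mathrm{vol}(R)^2}{\mathrm{vol}(V)} - e(R) \right| < 20\sqrt{\varepsilon}\, \mathrm{vol}(V),$$

i.e., G satisfies Disc(10√ε).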

3.2 From Low Discrepancy to Essential Eigenvalue Separation

In this section we establish the second part of Theorem 1. Let θ denote the constant from Theorem 5 and set γ = 10⁻⁶/θ. Assume that G = (V, E) is a graph that has Disc(γε²) for some ε < 0.001. In addition, we may assume without loss of generality that G has no isolated vertices. Let d_v denote the degree of v ∈ V, let n = |V|, and set d̄ = vol(V)/n = Σ_{v∈V} d_v/n. Our goal is to show that G has ess-Eig(ε). To this end, we introduce an additional property.

Cut(δ): We say G has Cut(δ) if the matrix M = (m_vw)_{v,w∈V} with entries

$$m_{vw} = \frac{d_v d_w}{\mathrm{vol}(V)} - e(v, w)$$

has cut norm ‖M‖_cut < δ · vol(V); here e(v, w) = 1 if {v, w} ∈ E and e(v, w) = 0 otherwise.
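The matrix M encodes the discrepancy of vertex sets through the bilinear form ⟨M1_S, 1_T⟩. As a small sanity check, the sketch below builds M for a simple graph and evaluates this form. The names `cut_matrix` and `bilinear` are hypothetical, and vertices are assumed to be numbered 0, ..., n−1.

```python
def cut_matrix(n, edges):
    """Matrix M with entries d_v * d_w / vol(V) - e(v, w) for a simple graph."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    vol_v = sum(deg)
    adjacent = {frozenset(edge) for edge in edges}
    return [[deg[v] * deg[w] / vol_v - (1 if frozenset((v, w)) in adjacent else 0)
             for w in range(n)]
            for v in range(n)]


def bilinear(m, S, T):
    """<M 1_S, 1_T>: the sum of the entries m[v][w] with v in S and w in T."""
    return sum(m[v][w] for v in S for w in T)


# Path 0 - 1 - 2: degrees (1, 2, 1), vol(V) = 4.
path = [(0, 1), (1, 2)]
M = cut_matrix(3, path)

# S = {0, 1}: vol(S) = 3 and e(S) = 2, so vol(S)^2 / vol(V) - e(S) = 0.25.
print(bilinear(M, {0, 1}, {0, 1}))
```

On the path with S = {0, 1} the form evaluates to vol(S)²/vol(V) − e(S), and for the disjoint sets {0} and {2} it evaluates to vol(S)vol(T)/vol(V) − e(S, T). These are the two quantities compared in the proof of Proposition 11.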

Proposition 11. For any δ > 0 the following is true: if G satisfies Disc(0.01δ), then G satisfies Cut(δ).

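Proof. Suppose that G = (V, E) has Disc(0.01δ). We shall prove below that for any two S, T ⊂ V

$$|\langle M\mathbf{1}_S, \mathbf{1}_T \rangle| \le 0.03\,\delta\, \mathrm{vol}(V) \quad \text{if } S \cap T = \emptyset, \tag{18}$$

$$|\langle M\mathbf{1}_S, \mathbf{1}_T \rangle| \le 0.02\,\delta\, \mathrm{vol}(V) \quad \text{if } S = T. \tag{19}$$

To see that (18) and (19) imply the assertion, consider two arbitrary subsets X, Y ⊂ V. Letting Z = X ∩ Y and combining (18) and (19), we obtain

$$|\langle M\mathbf{1}_X, \mathbf{1}_Y \rangle| \le |\langle M\mathbf{1}_{X \setminus Z}, \mathbf{1}_{Y \setminus Z} \rangle| + |\langle M\mathbf{1}_Z, \mathbf{1}_{Y \setminus Z} \rangle| + |\langle M\mathbf{1}_Z, \mathbf{1}_{X \setminus Z} \rangle| + 2\,|\langle M\mathbf{1}_Z, \mathbf{1}_Z \rangle| \le \delta\, \mathrm{vol}(V).$$

Since this bound holds for any X, Y, we conclude that ‖M‖_cut ≤ δ vol(V).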

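To prove (18), we note that Disc(0.01δ) implies for disjoint sets S and T

$$\left| e(S) - \frac{\mathrm{vol}(S)^2}{\mathrm{vol}(V)} \right| \le 0.02\,\delta\, \mathrm{vol}(V), \qquad \left| e(T) - \frac{\mathrm{vol}(T)^2}{\mathrm{vol}(V)} \right| \le 0.02\,\delta\, \mathrm{vol}(V), \tag{20}$$

$$\left| e(S \cup T) - \frac{(\mathrm{vol}(S) + \mathrm{vol}(T))^2}{\mathrm{vol}(V)} \right| \le 0.02\,\delta\, \mathrm{vol}(V). \tag{21}$$

If S and T are disjoint, then e(S ∪ T) = e(S) + e(T) + 2e(S, T), and thus (20)–(21) yield

$$\begin{aligned} |\langle M\mathbf{1}_S, \mathbf{1}_T \rangle| &= \left| e(S, T) - \frac{\mathrm{vol}(S)\,\mathrm{vol}(T)}{\mathrm{vol}(V)} \right| \\ &= \frac12 \left| e(S \cup T) - e(S) - e(T) - \frac{(\mathrm{vol}(S) + \mathrm{vol}(T))^2 - \mathrm{vol}(S)^2 - \mathrm{vol}(T)^2}{\mathrm{vol}(V)} \right| \\ &\le \frac12 \left| e(S) - \frac{\mathrm{vol}(S)^2}{\mathrm{vol}(V)} \right| + \frac12 \left| e(T) - \frac{\mathrm{vol}(T)^2}{\mathrm{vol}(V)} \right| + \frac12 \left| e(S \cup T) - \frac{(\mathrm{vol}(S) + \mathrm{vol}(T))^2}{\mathrm{vol}(V)} \right| \\ &\le 0.03\,\delta\, \mathrm{vol}(V), \end{aligned}$$

whence (18) follows. Finally, as |⟨M1_S, 1_S⟩| = |e(S) − vol(S)²/vol(V)|, (19) follows from (20). □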

Let D = diag(d_v)_{v∈V} be the matrix with the vertex degrees on the diagonal. Let ℳ = D^{−1/2} M D^{−1/2}. Then the vw’th entry of ℳ is

$$\frac{\sqrt{d_v d_w}}{\mathrm{vol}(V)} - (d_v d_w)^{-1/2} \ \text{ if } v, w \text{ are adjacent, and} \quad \frac{\sqrt{d_v d_w}}{\mathrm{vol}(V)} \ \text{ otherwise.}$$

Establishing the following lemma is the key step.

Lemma 12. Suppose that SDP(M) < ε²vol(V)/64. Then there exists a subset W ⊂ V of volume vol(W) ≥ (1 − ε) · vol(V) such that ‖ℳ_W‖ < ε.

Proof. Recall that d̄ = vol(V)/n. Lemma 7 implies that there is a vector 1 ⊥ z ∈ R^V such that

$$\lambda_{\max}\left[ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes M - \operatorname{diag}\begin{pmatrix} z \\ z \end{pmatrix} \right] = \mathrm{SDP}(M)/n < \varepsilon^2 \bar d / 64. \tag{22}$$

Basically W is going to be the set of all v such that |z_v| is small (and such that d_v is not too small). On the minor induced on W × W the diagonal matrix diag(z, z) has little effect, and thus (22) will imply the desired bound on ‖ℳ_W‖.

To carry out the details we need to define W precisely, bound ‖ℳ_W‖, and prove that vol(W) ≥ (1 − ε)vol(V).

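Let y = D⁻¹z and U = {v ∈ V : d_v > εd̄/8}. Let y′ = (y_v)_{v∈U} and z′ = (z_v)_{v∈U}. Since all entries of the restricted diagonal matrix D_U exceed εd̄/8, we have

$$\begin{aligned} \lambda_{\max}\left[ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes \mathcal{M}_U - \operatorname{diag}\begin{pmatrix} y' \\ y' \end{pmatrix} \right] &= \lambda_{\max}\left[ \left( \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \otimes D_U^{-1/2} \right) \cdot \left( \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes M_U - \operatorname{diag}\begin{pmatrix} z' \\ z' \end{pmatrix} \right) \cdot \left( \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \otimes D_U^{-1/2} \right) \right] \\ &\le 8 (\varepsilon \bar d)^{-1}\, \lambda_{\max}\left[ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes M_U - \operatorname{diag}\begin{pmatrix} z' \\ z' \end{pmatrix} \right] \\ &\le 8 (\varepsilon \bar d)^{-1}\, \lambda_{\max}\left[ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes M - \operatorname{diag}\begin{pmatrix} z \\ z \end{pmatrix} \right] \overset{(22)}{<} \varepsilon/8. \end{aligned} \tag{23}$$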

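Let W = {v ∈ U : |y_v| < ε/8} and let y″ = (y_v)_{v∈W}. Then ‖diag(y″, y″)‖ < ε/8, because the norm of a diagonal matrix equals the largest absolute value of an entry on the diagonal. Therefore, (23) yields

$$\begin{aligned} \lambda_{\max}\left[ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes \mathcal{M}_W \right] &\le \lambda_{\max}\left[ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes \mathcal{M}_W - \operatorname{diag}\begin{pmatrix} y'' \\ y'' \end{pmatrix} \right] + \left\| \operatorname{diag}\begin{pmatrix} y'' \\ y'' \end{pmatrix} \right\| \\ &\le \lambda_{\max}\left[ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes \mathcal{M}_U - \operatorname{diag}\begin{pmatrix} y' \\ y' \end{pmatrix} \right] + \left\| \operatorname{diag}\begin{pmatrix} y'' \\ y'' \end{pmatrix} \right\| \le \varepsilon/4. \end{aligned} \tag{24}$$

Further, (24) implies that ‖ℳ_W‖ < ε. To see this, consider a pair ξ, η ∈ R^W of unit vectors. Then (24) and Courant-Fischer (4) yield

$$\begin{aligned} \varepsilon/2 \ge 2 \lambda_{\max}\left[ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes \mathcal{M}_W \right] &\ge \left\langle \left( \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes \mathcal{M}_W \right) \begin{pmatrix} \xi \\ \eta \end{pmatrix}, \begin{pmatrix} \xi \\ \eta \end{pmatrix} \right\rangle = \left\langle \begin{pmatrix} \mathcal{M}_W \eta \\ \mathcal{M}_W \xi \end{pmatrix}, \begin{pmatrix} \xi \\ \eta \end{pmatrix} \right\rangle \\ &= \langle \mathcal{M}_W \eta, \xi \rangle + \langle \mathcal{M}_W \xi, \eta \rangle = 2 \langle \mathcal{M}_W \xi, \eta \rangle \quad [\text{as } \mathcal{M}_W \text{ is symmetric}]. \end{aligned}$$

Since this holds for any pair ξ, η, we conclude that ‖ℳ_W‖ ≤ ε/4 < ε.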

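Finally, we need to show that $\mathrm{vol}(W)$ is large. To this end, we consider the set $S = \{v \in V : z_v < 0\}$. Since $\mathrm{vol}(V) = \bar d n \ge \bar d|S|$, we have

\begin{align*}
\frac{\varepsilon^2\,\mathrm{vol}(V)}{32} \ge \frac{\varepsilon^2\bar d|S|}{32} = \frac{\varepsilon^2\bar d}{64}\cdot\left\|\binom{\mathbf{1}_S}{\mathbf{1}_S}\right\|^2
&\ge \lambda_{\max}\left[\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes M - \operatorname{diag}\binom{z}{z}\right]\cdot\left\|\binom{\mathbf{1}_S}{\mathbf{1}_S}\right\|^2 \quad\text{[due to (22)]}\\
&\ge \left\langle\left(\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes M - \operatorname{diag}\binom{z}{z}\right)\cdot\binom{\mathbf{1}_S}{\mathbf{1}_S}, \binom{\mathbf{1}_S}{\mathbf{1}_S}\right\rangle \quad\text{[by Courant-Fischer (4)]}\\
&= 2\langle M\mathbf{1}_S, \mathbf{1}_S\rangle - 2\sum_{v\in S} z_v. \tag{25}
\end{align*}

Further, Theorem 5 implies $|\langle M\mathbf{1}_S, \mathbf{1}_S\rangle| \le \|M\|_{\mathrm{cut}} \le \mathrm{SDP}(M) \le \varepsilon^2\,\mathrm{vol}(V)/64$. Inserting this into (25) and recalling that $z_v < 0$ for all $v \in S$, we conclude that $\sum_{v\in S}|z_v| \le \varepsilon^2\,\mathrm{vol}(V)/16$. Since $z \perp \mathbf{1}$, this actually implies $\sum_{v\in V}|z_v| \le \varepsilon^2\,\mathrm{vol}(V)/8$. As $z = Dy$ and $|y_v| > \varepsilon/8$ for all $v \in V\setminus W$, we obtain

\[
\varepsilon\,\mathrm{vol}(V\setminus W)/8 \le \sum_{v\in V\setminus W} d_v|y_v| = \sum_{v\in V\setminus W}|z_v| \le \varepsilon^2\,\mathrm{vol}(V)/8.
\]

Hence, $\mathrm{vol}(V\setminus W) \le \varepsilon\,\mathrm{vol}(V)$, which implies $\mathrm{vol}(W) \ge (1-\varepsilon)\,\mathrm{vol}(V)$. □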

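Finally, we show how Lemma 12 implies that $G$ has ess-Eig($\varepsilon$). Assume that $G$ has Disc($\gamma\varepsilon^2$). By Proposition 11 this implies that $G$ satisfies Cut($100\gamma\varepsilon^2$). Hence, Theorem 5 shows $\mathrm{SDP}(M) \le \beta\varepsilon^2\,\mathrm{vol}(V)$ for some $0 < \beta \le 100\theta\gamma$.

Thus, by Lemma 12 and our choice of $\gamma$ there is a set $W$ such that $\mathrm{vol}(W) \ge (1-\varepsilon/10)\,\mathrm{vol}(V)$ and $\|\mathcal{M}_W\| < \varepsilon/10$.

Furthermore, $\mathcal{M}_W$ relates to the minor $L_W$ of the Laplacian as follows. Let $\mathcal{L}_W = E - L_W$ be the matrix whose $vw$'th entry is $(d_v d_w)^{-1/2}$ if $v, w \in W$ are adjacent, and $0$ otherwise. Moreover, let $\Delta = (\sqrt{d_v})_{v\in W} \in \mathbb{R}^W$. Then $\mathcal{M}_W = \mathrm{vol}(V)^{-1}\Delta\Delta^T - \mathcal{L}_W$. Therefore, for all unit vectors $\xi \perp \Delta$ we have

\[
|\langle L_W\xi, \xi\rangle - 1| = |\langle\mathcal{L}_W\xi, \xi\rangle| = |\langle\mathcal{M}_W\xi, \xi\rangle| \le \|\mathcal{M}_W\| < \varepsilon/10. \tag{26}
\]

Consequently, by Courant-Fischer (4),

\[
\lambda_2[L_W] = \max_{0\ne\zeta\in\mathbb{R}^W}\ \min_{\xi\perp\zeta,\ \|\xi\|=1}\langle L_W\xi, \xi\rangle \ge \min_{\xi\perp\Delta,\ \|\xi\|=1}\langle L_W\xi, \xi\rangle \ge 1-\varepsilon. \tag{27}
\]

To bound $\lambda_{\max}[L_W]$ as well, we need to compute $\|L_W\Delta\|^2$. To this end, recall that the row of $L_W$ corresponding to a vertex $v \in V$ contains a one at position $v$. For $w \ne v$ the entry is $-(d_v d_w)^{-1/2}$ if $v$ and $w$ are adjacent, and $0$ otherwise.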

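Hence, the $v$-entry of the vector $L_W\Delta$ equals

\[
\Delta_v - \sum_{w\in W:\{v,w\}\in E}\frac{\Delta_w}{\sqrt{d_v d_w}} = \sqrt{d_v} - \frac{e(v, W)}{\sqrt{d_v}} = \frac{d_v - e(v, W)}{\sqrt{d_v}}.
\]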
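The entry formula above can be checked numerically on a toy graph. The following is a minimal sketch, not part of the paper: the function name `laplacian_minor_times_delta`, the adjacency-dictionary input, and the example graph are all assumptions made for illustration. It computes each entry of $L_W\Delta$ once directly from the row of $L_W$ and once from the closed form $(d_v - e(v,W))/\sqrt{d_v}$.

```python
import math

def laplacian_minor_times_delta(adj, W):
    """Return (direct, closed): the entries of L_W * Delta computed two ways.

    adj: dict vertex -> set of neighbours (simple undirected graph, hypothetical input format).
    W:   subset of the vertices; degrees d_v are taken in the whole graph.
    """
    deg = {v: len(adj[v]) for v in adj}               # d_v
    delta = {v: math.sqrt(deg[v]) for v in W}         # Delta_v = sqrt(d_v), v in W
    direct, closed = [], []
    for v in sorted(W):
        nbrs_in_W = [w for w in adj[v] if w in delta]
        # Row v of L_W: 1 on the diagonal, -(d_v d_w)^(-1/2) for neighbours w in W.
        direct.append(delta[v] - sum(delta[w] / math.sqrt(deg[v] * deg[w])
                                     for w in nbrs_in_W))
        # Closed form: (d_v - e(v, W)) / sqrt(d_v).
        closed.append((deg[v] - len(nbrs_in_W)) / math.sqrt(deg[v]))
    return direct, closed

# Toy example (an assumption): a 5-cycle 0-1-2-3-4 with the chord {0, 2}.
adj = {0: {1, 2, 4}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {0, 3}}
direct, closed = laplacian_minor_times_delta(adj, W={0, 1, 2, 3})
```

Vertices all of whose neighbours lie in $W$ contribute a zero entry, which is why the sum in (28) only sees edges leaving $W$.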

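Since $\|\Delta\|^2 = \sum_{v\in W} d_v = \mathrm{vol}(W) \ge (1-\varepsilon/10)\,\mathrm{vol}(V)$, we obtain

\[
\frac{\|L_W\Delta\|^2}{\|\Delta\|^2} = \sum_{v\in W}\frac{(e(v, W) - d_v)^2}{d_v\cdot\mathrm{vol}(W)} \le \frac{1}{1-\varepsilon/10}\sum_{v\in W}\frac{d_v - e(v, W)}{\mathrm{vol}(V)} \le \frac{2\,\mathrm{vol}(V\setminus W)}{\mathrm{vol}(V)} < \varepsilon/5. \tag{28}
\]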

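Further, decomposing any unit vector $\eta \in \mathbb{R}^W$ as $\eta = \alpha\|\Delta\|^{-1}\Delta + \beta\xi$ with a unit vector $\xi \perp \Delta$ and $\alpha^2 + \beta^2 = 1$, we get

\begin{align*}
\langle L_W\eta, \eta\rangle
&= \left\langle L_W\left(\alpha\|\Delta\|^{-1}\Delta + \beta\xi\right), \alpha\|\Delta\|^{-1}\Delta + \beta\xi\right\rangle\\
&= \frac{\alpha^2}{\|\Delta\|^2}\cdot\langle L_W\Delta, \Delta\rangle + \frac{\alpha\beta}{\|\Delta\|}\cdot\langle L_W\Delta, \xi\rangle + \frac{\alpha\beta}{\|\Delta\|}\cdot\langle L_W\xi, \Delta\rangle + \beta^2\langle L_W\xi, \xi\rangle\\
&= \frac{\alpha^2}{\|\Delta\|^2}\cdot\langle L_W\Delta, \Delta\rangle + \frac{2\alpha\beta}{\|\Delta\|}\cdot\langle L_W\Delta, \xi\rangle + \beta^2\langle L_W\xi, \xi\rangle,
\end{align*}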

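where the last step follows from the fact that $L_W$ is symmetric. Hence, using (26) and (28), we get

\begin{align*}
\langle L_W\eta, \eta\rangle
&\le \frac{\alpha^2}{\|\Delta\|^2}\cdot\|L_W\Delta\|\cdot\|\Delta\| + \frac{2\alpha\beta}{\|\Delta\|}\cdot\|L_W\Delta\|\cdot\|\xi\| + \beta^2\langle L_W\xi, \xi\rangle\\
&\le \alpha^2\sqrt{\varepsilon/5} + 2\alpha\beta\sqrt{\varepsilon/5} + \beta^2\left(1 + |\langle L_W\xi, \xi\rangle - 1|\right)\\
&\le \sqrt{\varepsilon/5}\,(\alpha^2 + 2\alpha\beta) + \beta^2(1 + \varepsilon/10)\\
&\le 3\sqrt{\varepsilon/5}\cdot|\alpha| + (1-\alpha^2)(1 + \varepsilon/10).
\end{align*}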

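Differentiating the last expression, we find that the maximum is attained at $\alpha = \frac{3}{2}\sqrt{\varepsilon/5}/(1+\varepsilon/10)$. Plugging this value in, we obtain $\langle L_W\eta, \eta\rangle \le 1+\varepsilon$. Hence, by Courant-Fischer (4), $\lambda_{\max}[L_W] = \max_{\|\eta\|=1}\langle L_W\eta, \eta\rangle \le 1+\varepsilon$.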
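The maximization step can be written out as follows. The abbreviations $c$, $\delta$ and $f$ are introduced here only for this computation and are not notation from the paper; the derivation is a sketch of the routine calculus behind "plugging this value in".

```latex
\[
c=\sqrt{\varepsilon/5},\qquad \delta=\varepsilon/10,\qquad
f(\alpha)=3c\,\alpha+(1-\alpha^{2})(1+\delta),
\]
\[
f'(\alpha)=3c-2\alpha(1+\delta)=0
\iff \alpha=\frac{3c}{2(1+\delta)},
\]
\[
f\!\left(\frac{3c}{2(1+\delta)}\right)
=1+\delta+\frac{9c^{2}}{4(1+\delta)}
\le 1+\frac{\varepsilon}{10}+\frac{9\varepsilon}{20}
<1+\varepsilon .
\]
```

Since $f$ is a concave quadratic in $\alpha$, this critical value is its global maximum, so the bound holds for every admissible $\alpha$.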

Thus, (27) shows that $G$ has ess-Eig($\varepsilon$).

4 The Algorithmic Regularity Lemma: Proof of Theorem 2

In this section we establish Theorem 2. The proof is conceptually similar to Szemerédi’s original proof of the “dense” regularity lemma [23] and its adaptation for sparse graphs due to Kohayakawa [21] and Rödl (unpublished). A new aspect here is that we deal with a different (more general) notion of regularity; this requires various technical modifications of the previous arguments. More importantly, we present an algorithm for actually computing a regular partition of a sparse graph efficiently.

In order to find a regular partition efficiently, we crucially need an algorithm to check for a given weight distribution $D = (D_v)_{v\in V}$, a given graph $G$, and a pair $(A, B)$ of vertex sets whether $(A, B)$ is $(\varepsilon, D)$-regular. While [3] features a (purely combinatorial) algorithm for this problem in dense graphs, this approach does not work in the sparse case. In Section 4.1 we present an algorithm Witness that does. It is based on Grothendieck’s inequality and the semidefinite relaxation of the cut norm (see Theorem 6). Then, in Section 4.2 we will show how Witness can be used to compute a regular partition to establish Theorem 2.

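Throughout this section, we let $0 < \varepsilon < 10^{-7}$ be an arbitrarily small but fixed number, and $C \ge 1$ signifies an arbitrarily large but fixed number. In addition, we define a sequence $(t_k)_{k\ge 1}$ by

\[
t_1 = \lceil 1/\varepsilon^2\rceil \quad\text{and}\quad t_{k+1} = \left\lceil 2200^2 C^2 t_k^6\, 2^{t_k}/\varepsilon^{4(k+1)}\right\rceil. \tag{29}
\]

Note that due to that choice we have

\[
t_{k+1} \ge 2200\,C\,t_k^{2.5}. \tag{30}
\]

Further, let

\[
k^* = \lceil 10^6 C^2\varepsilon^{-3}\rceil \quad\text{and}\quad \eta = \min\left\{\frac{\varepsilon^{8k^*}}{12800^2\, t_{k^*}^6\, C^4},\ \frac{1}{t_{k^*}^2}\right\} \tag{31}
\]

and choose $n_0 = n_0(C, \varepsilon) > 0$ big enough. We let $G = (V, E)$ be a graph on $n = |V| > n_0$ vertices, and let $D = (D_v)_{v\in V}$ be a sequence of rationals with $1 \le D_v \le n$ for all $v \in V$. We will always assume that $G$ is $(C, \eta, D)$-bounded, and that $D(V) \ge \eta^{-1}n$.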
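To get a feeling for how quickly the sequence (29) grows, the recurrence can be evaluated exactly with rational arithmetic. The sketch below is illustrative only: the helper name `next_t` is hypothetical, and the value $\varepsilon = 1/2$ is deliberately far outside the range $0 < \varepsilon < 10^{-7}$ assumed in the text, because for admissible $\varepsilon$ already $t_2$ contains the factor $2^{t_1}$ with $t_1 \ge 10^{14}$ and cannot be written out.

```python
from fractions import Fraction
from math import ceil

def next_t(t_k, k, C, eps):
    """t_{k+1} = ceil(2200^2 * C^2 * t_k^6 * 2^{t_k} / eps^{4(k+1)}), as in (29)."""
    value = (Fraction(2200) ** 2 * Fraction(C) ** 2 * t_k ** 6 * 2 ** t_k
             / Fraction(eps) ** (4 * (k + 1)))
    return ceil(value)

# Illustration only: eps = 1/2 violates the standing assumption eps < 10^-7.
eps, C = Fraction(1, 2), 1
t1 = ceil(1 / eps ** 2)          # t_1 = ceil(1 / eps^2)
t2 = next_t(t1, 1, C, eps)       # t_2 from (29) with k = 1

# Inequality (30) for k = 1, squared so that only integers are compared:
# t_2 >= 2200 * C * t_1^{2.5}  <=>  t_2^2 >= (2200 * C)^2 * t_1^5.
holds_30 = t2 ** 2 >= (2200 * C) ** 2 * t1 ** 5
```

Even in this toy setting $t_3$ would involve $2^{t_2}$, which reflects the tower-type growth of the bounds.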

4.1 The Procedure Witness

The subroutine Witness is given a graph $G$, a weight distribution $D$, vertex sets $A$, $B$, and a number $\varepsilon > 0$. Witness either outputs “yes”, in which case $(A, B)$ is $(\varepsilon, D)$-regular in $G$, or “no”. In the latter case the algorithm also produces a “witness of irregularity”, i.e., a pair of sets $X^* \subset A$, $Y^* \subset B$ for which the regularity condition (2) is violated with $\varepsilon$ replaced by $\varepsilon/200$. Witness employs the algorithm ApxCutNorm from Theorem 6.

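Algorithm 13. Witness($G, D, A, B, \varepsilon$)

1. Set up the matrix $M = (m_{vw})_{(v,w)\in A\times B}$ with entries

\[
m_{vw} = \begin{cases} 1 - \varrho(A, B)D_vD_w & \text{if $v, w$ are adjacent in $G$,}\\ -\varrho(A, B)D_vD_w & \text{otherwise.}\end{cases}
\]

Call ApxCutNorm($M$) to compute sets $X \subset A$, $Y \subset B$ such that $|\langle M\mathbf{1}_X, \mathbf{1}_Y\rangle| \ge \frac{3}{100}\|M\|_{\mathrm{cut}}$.

2. If $|\langle M\mathbf{1}_X, \mathbf{1}_Y\rangle| < \frac{3\varepsilon}{100}\cdot\frac{D(A)D(B)}{D(V)}$, then return “yes”.

3. If not, let $X' = A\setminus X$.

– If $D(X) \ge \frac{3\varepsilon}{100}D(A)$, then let $X^* = X$.

– If $D(X) < \frac{3\varepsilon}{100}D(A)$ and $|e(X', Y) - \varrho(A, B)D(X')D(Y)| > \frac{\varepsilon D(A)D(B)}{100\,D(V)}$, set $X^* = X'$.

– Otherwise, set $X^* = X \cup X'$.

4. Let $Y' = B\setminus Y$.

– If $D(Y) \ge \frac{\varepsilon}{200}D(B)$, then let $Y^* = Y$.

– If $D(Y) < \frac{\varepsilon}{200}D(B)$ and $|e(X^*, Y') - \varrho(A, B)D(X^*)D(Y')| > \frac{\varepsilon D(A)D(B)}{200\,D(V)}$, let $Y^* = Y'$.

– Otherwise, set $Y^* = Y \cup Y'$.

5. Answer “no” and output $(X^*, Y^*)$ as an $(\varepsilon/200, D)$-witness.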
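The control flow of Steps 1 to 5 can be sketched in a few lines. This is a toy illustration under stated assumptions, not the paper's implementation: ApxCutNorm is replaced by an exhaustive search over all pairs of subsets (which attains the cut norm exactly and hence trivially meets the $\frac{3}{100}$ guarantee, but takes exponential time), the density is taken to be $\varrho(A,B) = e(A,B)/(D(A)D(B))$, and the names `witness`, `subsets`, `dev` and the input format are hypothetical.

```python
from fractions import Fraction
from itertools import chain, combinations

def subsets(s):
    """All subsets of s (exponential; only for toy inputs)."""
    items = sorted(s)
    return (set(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)))

def witness(edges, D, A, B, eps):
    """Return ("yes", None) or ("no", (X_star, Y_star)), following Steps 1-5.

    edges: set of frozenset({u, v}); D: dict vertex -> weight on all of V.
    """
    wt = lambda S: sum(D[v] for v in S)                              # D(S)
    e = lambda S, T: sum(frozenset((u, v)) in edges for u in S for v in T)
    rho = Fraction(e(A, B), wt(A) * wt(B))   # assumed definition of rho(A, B)
    dev = lambda S, T: e(S, T) - rho * wt(S) * wt(T)                 # <M 1_S, 1_T>
    scale = Fraction(wt(A) * wt(B), wt(D.keys()))                    # D(A)D(B)/D(V)

    # Step 1: brute-force stand-in for ApxCutNorm.
    X, Y = max(((S, T) for S in subsets(A) for T in subsets(B)),
               key=lambda pair: abs(dev(*pair)))
    # Step 2
    if abs(dev(X, Y)) < Fraction(3, 100) * eps * scale:
        return "yes", None
    # Step 3
    X_c = set(A) - X
    if wt(X) >= Fraction(3, 100) * eps * wt(A):
        X_star = X
    elif abs(dev(X_c, Y)) > eps * scale / 100:
        X_star = X_c
    else:
        X_star = X | X_c
    # Step 4
    Y_c = set(B) - Y
    if wt(Y) >= eps * wt(B) / 200:
        Y_star = Y
    elif abs(dev(X_star, Y_c)) > eps * scale / 200:
        Y_star = Y_c
    else:
        Y_star = Y | Y_c
    # Step 5
    return "no", (X_star, Y_star)

eps = Fraction(1, 2)   # toy value, far larger than the eps < 10^-7 of the text
# Complete bipartite pair with D_v = degree: every deviation vanishes.
full = {frozenset((a, b)) for a in (0, 1) for b in (2, 3)}
result_regular = witness(full, {v: 2 for v in range(4)}, {0, 1}, {2, 3}, eps)
# All edges of the pair attached to vertex 0: the pair is far from uniform.
star = {frozenset((0, 2)), frozenset((0, 3))}
result_irregular = witness(star, {v: 1 for v in range(4)}, {0, 1}, {2, 3}, eps)
```

Exact rational arithmetic is used so that the threshold comparisons in Steps 2 to 4 are not affected by rounding.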

Lemma 14. Suppose that $A, B \subset V$ are disjoint.

1. If Witness($G, D, A, B, \varepsilon$) answers “yes”, then the pair $(A, B)$ is $(\varepsilon, D)$-regular.

2. If the answer is “no”, then $(A, B)$ is not $(\varepsilon/200, D)$-regular. In this case Witness outputs an $(\varepsilon/200, D)$-witness, i.e., a pair $(X^*, Y^*)$ of subsets $X^* \subset A$, $Y^* \subset B$ such that $D(X^*) \ge \frac{\varepsilon}{200}D(A)$, $D(Y^*) \ge \frac{\varepsilon}{200}D(B)$, and

\[
|e(X^*, Y^*) - \varrho(A, B)D(X^*)D(Y^*)| > \frac{\varepsilon}{200}\cdot\frac{D(A)D(B)}{D(V)}.
\]

Moreover, there exist a function $f$ and a polynomial $\Pi$ such that the running time of Witness is bounded by $f(C, \varepsilon)\cdot\Pi(\langle D\rangle)$.

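Proof. Note that for any two subsets $S \subset A$ and $T \subset B$ we have

\[
\langle M\mathbf{1}_S, \mathbf{1}_T\rangle = e(S, T) - \varrho(A, B)D(S)D(T).
\]

Therefore, if the sets $X \subset A$ and $Y \subset B$ computed by ApxCutNorm are such that

\[
|\langle M\mathbf{1}_X, \mathbf{1}_Y\rangle| < \frac{3\varepsilon}{100}\cdot\frac{D(A)D(B)}{D(V)},
\]

then by Theorem 6 we have

\[
|e(S, T) - \varrho(A, B)D(S)D(T)| \le \|M\|_{\mathrm{cut}} \le \frac{100}{3}\,|\langle M\mathbf{1}_X, \mathbf{1}_Y\rangle| < \varepsilon\,\frac{D(A)D(B)}{D(V)}
\]

for all $S \subset A$ and $T \subset B$. Thus, if Witness answers “yes” then the pair $(A, B)$ is $(\varepsilon, D)$-regular.

On the other hand, if ApxCutNorm yields sets $X$, $Y$ such that $|\langle M\mathbf{1}_X, \mathbf{1}_Y\rangle| \ge \frac{3\varepsilon}{100}\cdot\frac{D(A)D(B)}{D(V)}$, then Witness has to guarantee that the output pair $(X^*, Y^*)$ is an $(\varepsilon/200, D)$-witness.

Indeed, if $D(X) \ge \frac{3\varepsilon}{100}D(A)$ and $D(Y) \ge \frac{\varepsilon}{200}D(B)$ then $(X, Y)$ actually is an $(\varepsilon/200, D)$-witness. However, as ApxCutNorm does not guarantee any lower bound on $D(X)$ and $D(Y)$, let us assume first that $D(X) < \frac{3\varepsilon}{100}D(A)$ and $D(Y) \ge \frac{\varepsilon}{200}D(B)$. Then Step 3 of Witness sets $X' = A\setminus X$. We have $D(X') \ge \frac{3}{100}D(A)$. If $X'$ itself satisfies $|e(X', Y) - \varrho(A, B)D(X')D(Y)| > \frac{\varepsilon D(A)D(B)}{100\,D(V)}$ then $(X', Y)$ obviously is an $(\varepsilon/200, D)$-witness. Otherwise, by the triangle inequality, we deduce

\[
\left|e(X\cup X', Y) - \frac{e(A, B)\,D(X\cup X')D(Y)}{D(A)D(B)}\right| \ge \frac{2\varepsilon}{100}\cdot\frac{D(A)D(B)}{D(V)}
\]

and thus, $(X\cup X', Y)$ is an $(\varepsilon/200, D)$-witness.

In the case $D(X) < \frac{3\varepsilon}{100}D(A)$ and $D(Y) < \frac{\varepsilon}{200}D(B)$ we simply repeat the argument for $Y$, and hence Witness outputs an $(\varepsilon/200, D)$-witness for $(A, B)$.

The running time of Witness is dominated by Step 1, i.e., the execution of ApxCutNorm. By Theorem 6 the running time of ApxCutNorm is polynomial in the encoding length of the input matrix. Moreover, the construction of $M$ in Step 1 shows that its encoding length is of the form $f(C, \varepsilon)\cdot\Pi(\langle D\rangle)$ for a certain function $f$ and a polynomial $\Pi$, as claimed. □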

4.2 The Algorithm Regularize

In order to compute the desired regular partition of the input graph $G$, the algorithm Regularize starts with an arbitrary initial partition $\mathcal{P}^1 = \{V_i^1 : 0 \le i \le s_1\}$ such that each class $V_i^1$ ($1 \le i \le s_1$) has a “decent” weight $D(V_i^1)$. In the subsequent steps, Regularize computes a sequence $(\mathcal{P}^k)$ of partitions such that $\mathcal{P}^{k+1}$ is a “more regular” refinement of $\mathcal{P}^k$ ($k \ge 1$). The algorithm halts as soon as it can verify that $\mathcal{P}^k$ satisfies both REG1 and REG2 of Theorem 2. To this end Regularize applies the subroutine Witness to each pair $(V_i^k, V_j^k)$ of the current partition $\mathcal{P}^k$. By Lemma 14 this yields a set $\mathcal{L}^k$ of pairs $(i, j)$ such that all $(V_i^k, V_j^k)$ with $(i, j) \notin \mathcal{L}^k$ are $(\varepsilon, D)$-regular. Hence, $\mathcal{P}^k$ satisfies REG2 as soon as

\[
\sum_{(i,j)\in\mathcal{L}^k} D(V_i^k)D(V_j^k) < \varepsilon D(V)^2. \tag{32}
\]

In this case the algorithm Regularize stops and outputs $\mathcal{P}^k$. As we will see, all partitions $\mathcal{P}^k$ satisfy REG1 by construction. Consequently, once (32) holds, Regularize has obtained the desired regular partition.
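The stopping test (32) only needs the class weights and the list of pairs flagged by Witness. A minimal sketch follows; the function name `passes_stop_test` and the list-based representation (index 0 holding the exceptional class) are assumptions made for illustration.

```python
from fractions import Fraction

def passes_stop_test(class_weights, irregular_pairs, eps):
    """Check (32): sum over flagged pairs of D(V_i) * D(V_j) < eps * D(V)^2.

    class_weights[i] is D(V_i^k); index 0 is the exceptional class V_0^k.
    irregular_pairs is the set of pairs (i, j) for which Witness answered "no".
    """
    total = sum(class_weights)                                   # D(V)
    bad = sum(class_weights[i] * class_weights[j] for i, j in irregular_pairs)
    return bad < eps * total ** 2

# Toy data: exceptional class of weight 1 and three classes of weight 3.
weights = [1, 3, 3, 3]
eps = Fraction(1, 10)   # toy value, not in the range assumed in the text
one_bad_pair = passes_stop_test(weights, {(1, 2)}, eps)
two_bad_pairs = passes_stop_test(weights, {(1, 2), (1, 3)}, eps)
```

With these toy weights a single flagged pair contributes 9, below the threshold 10, while two flagged pairs contribute 18 and force another refinement round.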

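Algorithm 15. Regularize($G, C, D, \varepsilon$)

1. Fix an arbitrary partition $\mathcal{P}^1 = \{V_i^1 : 0 \le i \le s_1\}$ for some $s_1 \le t_1$ with the property

– $D(V)/t_1 - \max_{v\in V}D_v < D(V_i^1) \le D(V)/t_1$ for all $1 \le i \le s_1$ and

– $D\bigl(V\setminus\bigcup_{i\in[s_1]}V_i^1\bigr) \le D(V)/t_1$.

Set $V_0^1 = V\setminus\bigcup_{i\in[s_1]}V_i^1$ and set $k^* = \lceil 1000^2C^2\varepsilon^{-3}\rceil$.

2. For $k = 1, 2, 3, \dots, k^*$ do

3. Initially, let $\mathcal{L}^k = \emptyset$. For each pair $(V_i^k, V_j^k)$ ($i < j$) of classes of partition $\mathcal{P}^k$

4. call the procedure Witness($G, D, V_i^k, V_j^k, \varepsilon$). If it answers “no” and hence outputs an $(\varepsilon/200, D)$-witness $(X_{ij}^k, X_{ji}^k)$ for $(V_i^k, V_j^k)$, then add $(i, j)$ to $\mathcal{L}^k$.

5. If $\sum_{(i,j)\in\mathcal{L}^k}D(V_i^k)D(V_j^k) < \varepsilon(D(V))^2$, then output the partition $\mathcal{P}^k$ and halt.

6. Else construct a refinement $\mathcal{P}^{k+1}$ of $\mathcal{P}^k$ as follows:

– First construct the unique minimal partition $\mathcal{C}^k$ of $V\setminus V_0^k$, which refines $\{X_{ij}^k, V_i^k\setminus X_{ij}^k\}$ for every $i = 1, \dots, s_k$ and every $j \ne i$. More precisely, we define the equivalence relation $\equiv_i^k$ on $V_i^k$ by letting $u \equiv_i^k v$ iff for all $j$ such that $(i, j) \in \mathcal{L}^k$ it is true that $u \in X_{ij}^k \Leftrightarrow v \in X_{ij}^k$, and we let $\mathcal{C}^k$ be the set of all equivalence classes of the relations $\equiv_i^k$ ($1 \le i \le s_k$).

– Set $\alpha_k = \varepsilon^{4(k+1)}/(2200^2C^2t_k^6\,2^{t_k})$ and split each vertex class of $\mathcal{C}^k$ into blocks with weight between $\alpha_kD(V)$ and $\alpha_kD(V) + \max_{v\in V}D_v$ and possibly one exceptional block of smaller weight. More precisely, we construct a refinement $\mathcal{C}_*^k = \{V_{0,1}^{k+1}, \dots, V_{0,r_k}^{k+1}, V_1^{k+1}, \dots, V_{s_{k+1}}^{k+1}\}$ of $\mathcal{C}^k$ such that:

• $r_k \le |\mathcal{C}^k| \le s_k2^{s_k}$,

• $D(V_{0,q}^{k+1}) < \alpha_kD(V)$ for all $q \in [r_k]$, and

• $\alpha_kD(V) \le D(V_i^{k+1}) < \alpha_kD(V) + \max_{v\in V}D_v$ for all $i \in [s_{k+1}]$.

– Let $V_0^{k+1} = V_0^k \cup \bigcup_{q\in[r_k]}V_{0,q}^{k+1}$ and set $\mathcal{P}^{k+1} = \{V_i^{k+1} : 0 \le i \le s_{k+1}\}$.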
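The two parts of Step 6 can be sketched as follows. This is an illustrative reading of the step, not the authors' code: the function name `refine`, the dictionary-based inputs, and the greedy order in which vertices are packed into blocks are assumptions. The witness dictionary is assumed to contain an entry for `(i, j)` and one for `(j, i)` whenever Witness flagged the pair, with the entry for `(i, j)` being the witness set inside class `i`.

```python
from collections import defaultdict
from fractions import Fraction

def refine(classes, witnesses, D, alpha):
    """One pass of Step 6 (sketch).

    classes:   dict i -> set of vertices, for i >= 1 (exceptional class excluded)
    witnesses: dict (i, j) -> witness set X_ij, a subset of classes[i]
    D:         dict vertex -> weight on all of V; alpha plays the role of alpha_k
    Returns (new_classes, leftover); leftover is to be merged into the exceptional class.
    """
    target = alpha * sum(D.values())                  # alpha_k * D(V)
    sets_in_class = defaultdict(list)
    for (i, _j), X in witnesses.items():
        sets_in_class[i].append(X)
    new_classes, leftover = [], set()
    for i, V_i in classes.items():
        # Part 1: equivalence classes of the relation on V_i, i.e. vertices
        # grouped by their membership pattern in the witness sets X_ij.
        cells = defaultdict(list)
        for v in sorted(V_i):
            cells[tuple(v in X for X in sets_in_class[i])].append(v)
        # Part 2: greedily cut every cell into blocks of weight >= target; each
        # block then weighs less than target + max_v D_v.
        for cell in cells.values():
            block, weight = [], 0
            for v in cell:
                block.append(v)
                weight += D[v]
                if weight >= target:
                    new_classes.append(set(block))
                    block, weight = [], 0
            leftover.update(block)                    # remaining weight < target
    return new_classes, leftover

# Toy data: two classes of four unit-weight vertices, one flagged pair.
classes = {1: {0, 1, 2, 3}, 2: {4, 5, 6, 7}}
witnesses = {(1, 2): {0, 1}, (2, 1): {4}}
D = {v: 1 for v in range(8)}
new_classes, leftover = refine(classes, witnesses, D, alpha=Fraction(1, 4))
```

A class with $m$ witness sets splits into at most $2^m$ cells, which matches the bound $|\mathcal{C}^k| \le s_k2^{s_k}$ used in the algorithm.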

Step 6 is the central step of the algorithm. In the first part of that step we construct a joint refinement of the previous partition $\mathcal{P}^k$ and all the witnesses of irregularity $(X_{ij}^k, X_{ji}^k)$ discovered in Step 4. As in Szemerédi’s original proof, it will turn out that a bounded parameter (the so-called index defined below) of the partition $\mathcal{C}^k$ increases by $\Omega(\varepsilon^3)$ compared to $\mathcal{P}^k$. Since $\mathcal{P}^k$ consists of $s_k$ classes and for every $i = 1, \dots, s_k$ there are at most $s_k - 1$ witness sets $X_{ij}$ ($j \ne i$), the refinement $\mathcal{C}^k$ contains at most $s_k2^{s_k-1} < s_k2^{s_k}$ vertex classes. In the second part of Step 6 we split the classes of $\mathcal{C}^k$ into pieces of almost equal weight. Here for each class of $\mathcal{C}^k$ we may get one class of left-over vertices $V_{0,q}^{k+1}$ of smaller weight, which together with $V_0^k$ form the new exceptional class $V_0^{k+1}$. Due to the construction in Step 6, the bound $s_1 \le t_1$, and (29), for any $k \ge 0$ the partition $\mathcal{P}^{k+1}$ consists of at most

\[
s_{k+1} + 1 \le \left\lceil 2200^2C^2t_k^6\,2^{t_k}/\varepsilon^{4(k+1)}\right\rceil = t_{k+1}
\]

classes. Moreover, our choice (31) of $\eta$ and the construction in Step 1 ensure that

\[
\varepsilon^2D(V) \ge D(V_i^{k+1}) \ge \sqrt{\eta}\,D(V) \quad\text{for all } 1 \le i \le s_{k+1} \tag{33}
\]