Quasi-Randomness and Algorithmic Regularity for Graphs with General Degree Distributions*
Noga Alon^1,**, Amin Coja-Oghlan^2,***, Hiệp Hàn^3,†, Mihyun Kang^4,‡, Vojtěch Rödl^5,§, and Mathias Schacht^3,¶
1 School of Mathematics and Computer Science, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel
2 University of Edinburgh, School of Informatics, Edinburgh EH8 9AB, UK [email protected]
3 Humboldt-Universität zu Berlin, Institut für Informatik, Unter den Linden 6, 10099 Berlin, Germany {hhan,schacht}@informatik.hu-berlin.de
4 Technische Universität Berlin, Institut für Mathematik, Straße des 17. Juni 136, 10623 Berlin, Germany [email protected]
5 Department of Mathematics and Computer Science, Emory University, Atlanta, GA 30322, USA [email protected]
Abstract. We deal with two intimately related subjects: quasi-randomness and regular partitions. The purpose of the concept of quasi-randomness is to measure how much a given graph “resembles” a random one. Moreover, a regular partition approximates a given graph by a bounded number of quasi-random graphs. Regarding quasi-randomness, we present a new spectral characterization of low discrepancy, which extends to sparse graphs. Concerning regular partitions, we introduce a concept of regularity that takes into account vertex weights, and show that if $G=(V,E)$ satisfies a certain boundedness condition, then $G$ admits a regular partition. In addition, building on the work of Alon and Naor [4], we provide an algorithm that computes a regular partition of a given (possibly sparse) graph $G$ in polynomial time. As an application, we present a polynomial time approximation scheme for MAX CUT on (sparse) graphs without “dense spots”.
Key words: quasi-random graphs, Laplacian eigenvalues, regularity lemma, Grothendieck’s inequality.
1 Introduction and Results
This paper deals with quasi-randomness and regular partitions. Loosely speaking, a graph is quasi-random if the global distribution of the edges resembles the expected edge distribution of a random graph. Furthermore, a regular partition approximates a given graph by a constant number of quasi-random graphs. Such partitions are of algorithmic importance, because a number of NP-hard problems can be solved in polynomial time on graphs that come with regular partitions. In this section we present our main results and discuss related work. The remaining sections contain the proofs and detailed descriptions of the algorithms.
1.1 Quasi-Randomness: Discrepancy and Eigenvalues
Random graphs are well known to have a number of remarkable properties (e.g., excellent expansion). Therefore, quantifying how much a given graph “resembles” a random one is an important problem, both from a structural and an algorithmic point of view. Providing such measures is the purpose of the notion of quasi-randomness. While this concept is rather well developed for dense graphs (i.e., graphs $G=(V,E)$ with $|E|=\Omega(|V|^2)$), less is known in the sparse case, which we deal with in the present work. In fact, we shall actually deal with (sparse) graphs with general degree distributions, including but not limited to the ubiquitous power-law degree distributions (cf. [1]).
* An extended abstract version of this work appeared in the Proceedings of ICALP 2007.
** Supported by an ISF grant, an ERC Advanced grant, and by the Hermann Minkowski Minerva Center for Geometry at Tel Aviv University.
*** Supported by DFG grant CO 646.
† Supported by DFG within the research training group “Methods for Discrete Structures”.
‡ Supported by DFG grant PR 296/7-3 and DFG grant KA 2748/2-1.
§ Supported by NSF Grants DMS 0300529 and DMS 0800070.
¶ Supported by DFG grant SCHA 1263/1-1 and GIF Grant 889/05.
We will mainly consider two types of quasi-random properties: low discrepancy and eigenvalue separation. The low discrepancy property concerns the global edge distribution and basically states that every set $S$ of vertices approximately spans as many edges as we would expect in a random graph with the same degree distribution. More precisely, if $G=(V,E)$ is a graph, then we let $d_v$ signify the degree of $v\in V$. Furthermore, the volume of a set $S\subset V$ is $\mathrm{vol}(S)=\sum_{v\in S}d_v$. In addition, if $S,T\subset V$ are disjoint sets, then $e(S,T)$ denotes the number of $S$-$T$-edges in $G$ and $e(S)$ is two times the number of edges spanned by the set $S$. For not necessarily disjoint sets $S,T\subset V$ we let $e(S,T)=e(S\setminus T,T\setminus S)+e(S\cap T)$.
Disc(ε): We say that $G$ has discrepancy at most $\varepsilon$ (“$G$ has Disc(ε)” for short) if
$$\forall S\subset V:\quad \left|e(S)-\frac{\mathrm{vol}(S)^2}{\mathrm{vol}(V)}\right|<2\varepsilon\cdot\mathrm{vol}(V). \tag{1}$$
To explain (1), let $d=(d_v)_{v\in V}$, and let $G(d)$ signify a random graph with expected degree distribution $d$; that is, any two vertices $v,w$ are adjacent with probability $p_{vw}=d_vd_w/\mathrm{vol}(V)$ independently. Then in $G(d)$ the expected number of edges inside of $S\subset V$ equals $\frac12\sum_{(v,w)\in S^2}p_{vw}=\frac12\mathrm{vol}(S)^2/\mathrm{vol}(V)$. Consequently, (1) just says that for any set $S$ the actual number of edges inside of $S$ must not deviate from what we expect in $G(d)$ by more than an $\varepsilon$-fraction of the total volume.
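To make condition (1) concrete, the following minimal sketch evaluates it by brute force on a tiny graph. The function name `min_discrepancy_eps` and the edge-list representation are assumptions made for this illustration only; the enumeration over all subsets takes exponential time, which is exactly the obstacle discussed next.

```python
from itertools import combinations

def min_discrepancy_eps(n, edges):
    """Return the infimum of all eps for which the graph has Disc(eps).

    Hypothetical helper: enumerates every non-empty subset S of {0..n-1}
    and evaluates |e(S) - vol(S)^2 / vol(V)| / (2 vol(V)), where e(S) is
    twice the number of edges spanned by S, as in the text.
    """
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    vol_v = sum(deg)
    worst = 0.0
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            s = set(subset)
            e_s = 2 * sum(1 for u, v in edges if u in s and v in s)
            vol_s = sum(deg[v] for v in s)
            worst = max(worst, abs(e_s - vol_s ** 2 / vol_v))
    return worst / (2 * vol_v)

# Complete graph on 4 vertices: the worst sets are the pairs of vertices.
k4 = [(u, v) for u in range(4) for v in range(u + 1, 4)]
print(min_discrepancy_eps(4, k4))
```

Since the inequality in (1) is strict, the graph has Disc(ε) for every ε strictly larger than the returned value.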
An obvious problem with the bounded discrepancy property (1) is that it seems quite difficult to check whether $G=(V,E)$ satisfies this condition. This is because apparently one would have to inspect an exponential number of subsets $S\subset V$. Therefore, we consider a second property that refers to the eigenvalues of a certain matrix representing $G$. More precisely, we will deal with the normalized Laplacian $\mathcal{L}(G)$, whose entries $(\ell_{vw})_{v,w\in V}$ are defined as
$$\ell_{vw}=\begin{cases}1 & \text{if } v=w \text{ and } d_v\ge 1,\\ -(d_vd_w)^{-1/2} & \text{if } v,w \text{ are adjacent},\\ 0 & \text{otherwise.}\end{cases}$$
Due to the normalization by the geometric mean $\sqrt{d_vd_w}$ of the vertex degrees, $\mathcal{L}(G)$ turns out to be appropriate for representing graphs with general degree distributions. Moreover, $\mathcal{L}(G)$ is well known to be positive semidefinite, and the multiplicity of the eigenvalue 0 equals the number of connected components of $G$ (cf. [9]).
Eig(δ): Letting $0=\lambda_1[\mathcal{L}(G)]\le\cdots\le\lambda_{|V|}[\mathcal{L}(G)]$ denote the eigenvalues of $\mathcal{L}(G)$, we say that $G$ has δ-eigenvalue separation (“$G$ has Eig(δ)”) if $1-\delta\le\lambda_2[\mathcal{L}(G)]\le\lambda_{|V|}[\mathcal{L}(G)]\le 1+\delta$.
As the eigenvalues of $\mathcal{L}(G)$ can be computed in polynomial time (within arbitrary numerical precision), we can essentially check efficiently whether $G$ has Eig(δ) or not.
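As an illustration of how Eig(δ) can be checked numerically, here is a minimal sketch that builds the normalized Laplacian and computes its spectrum. The cyclic Jacobi iteration is a stand-in chosen only to keep the example free of external libraries, not a method prescribed by the paper; all function names are hypothetical.

```python
import math

def normalized_laplacian(n, edges):
    """Normalized Laplacian as a list of rows, following the definition above."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    lap = [[0.0] * n for _ in range(n)]
    for v in range(n):
        if deg[v] >= 1:
            lap[v][v] = 1.0
    for u, v in edges:
        lap[u][v] = lap[v][u] = -1.0 / math.sqrt(deg[u] * deg[v])
    return lap

def jacobi_eigenvalues(matrix, sweeps=50, tol=1e-22):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations (sorted)."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, theta) / (abs(theta) + math.sqrt(theta * theta + 1.0))
                c = 1.0 / math.sqrt(t * t + 1.0)
                s = t * c
                for k in range(n):  # apply the rotation to columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # apply the rotation to rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))

def eigenvalue_separation(n, edges):
    """Smallest delta such that 1-delta <= lambda_2 <= lambda_max <= 1+delta."""
    eig = jacobi_eigenvalues(normalized_laplacian(n, edges))
    return max(abs(eig[1] - 1.0), abs(eig[-1] - 1.0))

k4 = [(u, v) for u in range(4) for v in range(u + 1, 4)]
print(eigenvalue_separation(4, k4))
```

For the complete graph on four vertices the non-zero eigenvalues all equal 4/3, so the sketch reports a separation of about 1/3.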
It is not difficult to see that Eig(δ) provides a sufficient condition for Disc(ε). That is, for any $\varepsilon>0$ there is a $\delta>0$ such that any graph $G$ that has Eig(δ) also has Disc(ε). However, while the converse implication is true if $G$ is dense (i.e., $\mathrm{vol}(V)=\Omega(|V|^2)$), it is false for sparse graphs. In fact, providing a necessary condition for Disc(ε) in terms of eigenvalues has been an open problem in the area of sparse quasi-random graphs since the work of Chung and Graham [11]. Concerning this problem, we basically observe that the reason why Disc(ε) does in general not imply Eig(δ) is the existence of a small set of “exceptional” vertices.
ess-Eig(δ): We say that $G$ has essential δ-eigenvalue separation (“$G$ has ess-Eig(δ)”) if there is a set $W\subset V$ of volume $\mathrm{vol}(W)\ge(1-\delta)\mathrm{vol}(V)$ such that the following is true. Let $\mathcal{L}(G)_W=(\ell_{vw})_{v,w\in W}$ denote the minor of $\mathcal{L}(G)$ induced on $W\times W$, and let $\lambda_1[\mathcal{L}(G)_W]\le\cdots\le\lambda_{|W|}[\mathcal{L}(G)_W]$ signify its eigenvalues. Then we require that $1-\delta<\lambda_2[\mathcal{L}(G)_W]\le\lambda_{|W|}[\mathcal{L}(G)_W]<1+\delta$.
Theorem 1. There is a constant $\gamma>0$ such that the following is true for all graphs $G=(V,E)$ and all $\varepsilon>0$.
1. If $G$ has ess-Eig(ε), then $G$ satisfies Disc($10\sqrt{\varepsilon}$).
2. If $G$ has Disc($\gamma\varepsilon^2$), then $G$ satisfies ess-Eig(ε).
The main contribution is the second implication. Its proof is based on Grothendieck’s inequality and the duality theorem for semidefinite programs. In effect, the proof actually provides us with an efficient algorithm that computes a set $W$ as in the definition of ess-Eig(ε). The second part of Theorem 1 is best possible, up to the precise value of the constant $\gamma$ (see Section 6).
1.2 The Algorithmic Regularity Lemma
Loosely speaking, a regular partition of a graph $G=(V,E)$ is a partition $(V_1,\dots,V_t)$ of $V$ such that for “most” index pairs $1\le i<j\le t$ the bipartite subgraph spanned by $V_i$ and $V_j$ is quasi-random. Thus, a regular partition approximates $G$ by quasi-random graphs. Furthermore, the number $t$ of classes may depend on a parameter $\varepsilon$ that rules the accuracy of the approximation, but it does not depend on the order of the graph $G$ itself. Therefore, if for some class of graphs we can compute regular partitions in polynomial time, then this graph class will admit polynomial time algorithms for various problems that are NP-hard in general.
In the sequel we introduce a new concept of regular partitions that takes into account a given “ambient” weight distribution $D=(D_v)_{v\in V}$, which is an arbitrary sequence of rationals between 1 and $n=|V|$. We will see at the end of this section how this relates to the notion of quasi-randomness discussed in the previous section. Let $G=(V,E)$ be a graph. For subsets $X,Y\subset V$ we set
$$\varrho(X,Y)=\frac{e(X,Y)}{D(X)D(Y)},\qquad\text{where } D(U)=\sum_{u\in U}D_u \text{ for any } U\subset V.$$
Further, we say that for disjoint sets $X,Y\subset V$ the pair $(X,Y)$ is $(\varepsilon,D)$-regular if for all $X'\subset X$, $Y'\subset Y$ satisfying $D(X')\ge\varepsilon D(X)$, $D(Y')\ge\varepsilon D(Y)$ we have
$$|e(X',Y')-\varrho(X,Y)D(X')D(Y')|\le\varepsilon\cdot\frac{D(X)D(Y)}{D(V)}. \tag{2}$$
Roughly speaking, (2) states that the bipartite graph spanned by $X$ and $Y$ is “quasi-random” with respect to the vertex weights $D$.
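Condition (2) can be tested directly on very small examples. The sketch below is a brute-force check under assumed conventions (an edge list, a weight dictionary, and the hypothetical name `is_regular_pair`); it is meant only to illustrate the definition and runs in exponential time.

```python
from itertools import combinations

def nonempty_subsets(items):
    """Yield every non-empty subset of items as a tuple."""
    for k in range(1, len(items) + 1):
        yield from combinations(items, k)

def is_regular_pair(x, y, edges, weight, eps):
    """Brute-force test of condition (2) for disjoint vertex lists x and y."""
    adjacent = {frozenset(e) for e in edges}

    def e(a, b):  # number of edges between two disjoint sets
        return sum(1 for u in a for v in b if frozenset((u, v)) in adjacent)

    def d(a):  # total weight D(a)
        return sum(weight[v] for v in a)

    d_total = sum(weight.values())
    rho = e(x, y) / (d(x) * d(y))
    bound = eps * d(x) * d(y) / d_total
    for xs in nonempty_subsets(x):
        if d(xs) < eps * d(x):
            continue
        for ys in nonempty_subsets(y):
            if d(ys) < eps * d(y):
                continue
            if abs(e(xs, ys) - rho * d(xs) * d(ys)) > bound:
                return False
    return True

# Complete bipartite graph K_{2,2} with degree weights: perfectly regular.
k22 = [(0, 2), (0, 3), (1, 2), (1, 3)]
print(is_regular_pair([0, 1], [2, 3], k22, {v: 2 for v in range(4)}, 0.25))
# A perfect matching with unit weights: the single edge {0, 2} violates (2).
matching = [(0, 2), (1, 3)]
print(is_regular_pair([0, 1], [2, 3], matching, {v: 1 for v in range(4)}, 0.25))
```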
In the present notation, Szemerédi’s original regularity lemma [23] states that every graph $G$ admits a regular partition with respect to the weight distribution $D(v)=n$ for all $v\in V$. However, if $G$ is sparse (i.e., $|E|\ll|V|^2$), then such a regular partition is not helpful because the bound on the r.h.s. of (2) exceeds $|E|$. To obtain an appropriate bound, we would have to consider a weight distribution such that $D(v)\ll n$ for (at least) some $v\in V$. But with respect to such weight distributions regular partitions do not necessarily exist. The basic obstacle is the presence of large “dense spots” $(X,Y)$, where $e(X,Y)$ is far bigger than the term $D(X)D(Y)$ suggests. To rule these out, we consider the following notion.
$(C,\eta,D)$-boundedness. Let $C\ge1$ and $\eta>0$. A graph $G$ is $(C,\eta,D)$-bounded if for all $X,Y\subset V$ with $D(X),D(Y)\ge\eta D(V)$ we have $\varrho(X,Y)D(V)\le C$.
To illustrate the boundedness condition, consider a random graph $G(D)$ with expected degree sequence $D$ such that $D(V)\gg n$. Then for any two disjoint sets $X,Y\subset V$ we have $\mathrm{E}[e(X,Y)]=D(X)D(Y)/D(V)+o(D(V))$. Hence, Chernoff bounds imply that for all $X,Y$ simultaneously we have $e(X,Y)=D(X)D(Y)/D(V)+o(D(V))$ with probability $1-o(1)$ as $n\to\infty$. Therefore, for any fixed $\eta>0$ the random graph $G(D)$ is $(1+o(1),\eta,D)$-bounded with probability $1-o(1)$.
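The boundedness condition can likewise be evaluated exhaustively on toy inputs. The following sketch returns the smallest admissible constant C for a given η; the name `min_bound_constant` is hypothetical, and for overlapping sets the code uses the convention $e(X,Y)=e(X\setminus Y,Y\setminus X)+e(X\cap Y)$ stated in Section 1.1.

```python
from itertools import combinations

def nonempty_subsets(items):
    """Yield every non-empty subset of items as a frozenset."""
    for k in range(1, len(items) + 1):
        for chosen in combinations(items, k):
            yield frozenset(chosen)

def min_bound_constant(vertices, edges, weight, eta):
    """Smallest C for which the graph is (C, eta, D)-bounded (brute force)."""
    adjacent = {frozenset(e) for e in edges}

    def between(a, b):  # edges between two disjoint sets
        return sum(1 for u in a for v in b if frozenset((u, v)) in adjacent)

    def inside(s):  # e(S): twice the number of edges spanned by S
        return 2 * sum(1 for edge in adjacent if edge <= s)

    def e(x, y):  # convention for not necessarily disjoint sets
        return between(x - y, y - x) + inside(x & y)

    def d(s):
        return sum(weight[v] for v in s)

    total = d(vertices)
    heavy = [s for s in nonempty_subsets(vertices) if d(s) >= eta * total]
    return max(e(x, y) * total / (d(x) * d(y)) for x in heavy for y in heavy)

k4 = [(u, v) for u in range(4) for v in range(u + 1, 4)]
degrees = {v: 3 for v in range(4)}
print(min_bound_constant(list(range(4)), k4, degrees, 0.5))
```

On the complete graph with four vertices and degree weights, the maximum is attained by two disjoint pairs of vertices, giving C = 4/3.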
Now, we can state the following algorithmic regularity lemma for graphs with general degree distributions, which ensures not only the existence of regular partitions, but also that such a partition can be computed efficiently. We let $\langle D\rangle$ signify the encoding length of a weight distribution $D=(D_v)_{v\in V}$, i.e., the number of bits that are needed to write down the rationals $(D_v)_{v\in V}$. Observe that $\langle D\rangle\ge n$.
Theorem 2. For any two numbers $C\ge1$ and $\varepsilon>0$ there exist $\eta>0$ and $n_0>0$ such that for all $n\ge n_0$ and every sequence of rationals $D=(D_v)_{v\in V}$ with $|V|=n$ and $1\le D_v\le n$ for all $v\in V$ the following holds. If $G=(V,E)$ is a $(C,\eta,D)$-bounded graph and $D(V)\ge\eta^{-1}n$, then there is a partition $P=\{V_i:0\le i\le t\}$ of $V$ that satisfies the following two properties:
REG1. (a) $\eta D(V)\le D(V_i)\le\varepsilon D(V)$ for all $i=1,\dots,t$,
(b) $D(V_0)\le\varepsilon D(V)$, and
(c) $|D(V_i)-D(V_j)|<\max_{v\in V}D_v$ for all $1\le i<j\le t$.
REG2. Let $L$ be the set of all pairs $(i,j)$, $1\le i<j\le t$, such that $(V_i,V_j)$ is not $(\varepsilon,D)$-regular. Then
$$\sum_{(i,j)\in L}D(V_i)D(V_j)\le\varepsilon D(V)^2.$$
Furthermore, for fixed $C$ and $\varepsilon$ the partition $P$ can be computed in polynomial time. More precisely, there exist a function $f$ and a polynomial $\Pi$ such that the partition $P$ can be computed in time $f(C,\varepsilon)\cdot\Pi(\langle D\rangle)$.
Condition REG1 states that all of the classes $V_1,\dots,V_t$ have approximately the same, non-negligible weight, while the “exceptional” class $V_0$ has a “small” weight. Also note that due to REG1(a) the number $t$ of classes of the partition $P$ is bounded by $1/\eta$, which only depends on $C$ and $\varepsilon$, but not on $G$, $D$, or $n$. Moreover, REG2 requires that the total weight of the irregular pairs $(V_i,V_j)$ is small relative to the total weight. Thus, a partition $P$ that satisfies REG1 and REG2 approximates $G$ by a bounded number of bipartite quasi-random graphs.
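The three parts of REG1 are simple weight inequalities, so they are easy to verify for a candidate partition. The sketch below is an illustration with assumed names (`satisfies_reg1`, with `classes[0]` playing the role of the exceptional class $V_0$); it does not compute a partition, it only checks one.

```python
def satisfies_reg1(classes, weight, eps, eta):
    """Check REG1 (a)-(c); classes[0] is V_0 and classes[1:] are V_1..V_t."""
    def d(s):
        return sum(weight[v] for v in s)

    total = sum(d(c) for c in classes)
    regular = [d(c) for c in classes[1:]]
    part_a = all(eta * total <= w <= eps * total for w in regular)
    part_b = d(classes[0]) <= eps * total
    # All pairwise differences are below max D_v iff the largest gap is.
    part_c = max(regular) - min(regular) < max(weight.values())
    return part_a and part_b and part_c

# Nine unit-weight vertices: four classes of two vertices plus V_0 = {8}.
unit = {v: 1 for v in range(9)}
parts = [[8], [0, 1], [2, 3], [4, 5], [6, 7]]
print(satisfies_reg1(parts, unit, eps=0.25, eta=0.2))
print(satisfies_reg1(parts, unit, eps=0.2, eta=0.2))
```

In the first call each class has weight 2, which lies between ηD(V) = 1.8 and εD(V) = 2.25; in the second call the upper bound drops to 1.8 and part (a) fails.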
We illustrate the use of Theorem 2 with the example of the MAX CUT problem. While approximating MAX CUT within a ratio better than $\frac{16}{17}$ is NP-hard on general graphs [19,24], the following theorem provides a polynomial time approximation scheme for $(C,\eta,D)$-bounded graphs.
Theorem 3. For any $\delta>0$ and $C\ge1$ there exist two numbers $\eta>0$, $n_0>0$ and a polynomial time algorithm ApxMaxCut such that for all $n\ge n_0$ and every sequence of rationals $D=(D_v)_{v\in V}$ with $|V|=n$ and $1\le D_v\le n$ for all $v\in V$ the following is true. If $G=(V,E)$ is a $(C,\eta,D)$-bounded graph and $D(V)>\eta^{-1}n$, then ApxMaxCut outputs a cut of $G$ that approximates the maximum cut up to an additive error of $\delta|D(V)|$.
Finally, let us discuss a few examples and applications of the above results.
1. If we let $D(v)=n$ for all $v\in V$, then Theorem 2 is just an algorithmic version of Szemerédi’s regularity lemma. Such a result was established previously in [3].
2. Suppose that $D(v)=\bar d$ for some number $\bar d=\bar d(n)=o(n)$. Then the above notions of regularity and boundedness coincide with those of the classical “sparse regularity lemma” of Kohayakawa [21] and Rödl (unpublished). Hence, Theorem 2 provides an algorithmic version of this regularity concept. This result has not been published previously (although it may have been known to experts in the area that this can be derived from Alon and Naor [4]). Actually devising an algorithm for computing a sparse regular partition is mentioned as an open problem in [21].
3. For a given graph $G=(V,E)$ we could just use the degree sequence as a weight distribution, i.e., $D(v)=d_v$ for all $v\in V$. Then $D(U)=\mathrm{vol}(U)$ for all $U\subset V$. Hence, the notion of regularity (2) is closely related to the notion of quasi-randomness from Section 1.1. The resulting regularity concept is a generalization of the “classical” sparse regularity lemma. The new concept allows for graphs with highly irregular degree distributions.
1.3 Further Related Work
Quasi-random graphs with general degree distributions were first studied by Chung and Graham [10]. They considered the properties Disc(ε) and Eig(δ), and a number of further related ones (e.g., concerning weighted cycles). Chung and Graham observed that Eig(δ) implies Disc(ε), and that the converse is true in the case of dense graphs (i.e., $\mathrm{vol}(V)=\Omega(|V|^2)$).
Regarding the step from discrepancy to eigenvalue separation, Butler [8] proved that any graph $G$ such that for all sets $X,Y\subset V$ the bound
$$|e(X,Y)-\mathrm{vol}(X)\mathrm{vol}(Y)/\mathrm{vol}(V)|\le\varepsilon\sqrt{\mathrm{vol}(X)\mathrm{vol}(Y)} \tag{3}$$
holds, satisfies Eig($O(\varepsilon(1-\ln\varepsilon))$). His proof builds upon the work of Bilu and Linial [6], who derived a similar result for regular graphs, and on the earlier related work of Bollobás and Nikiforov [7].
Butler’s result relates to the second part of Theorem 1 as follows. The r.h.s. of (3) refers to the volumes of the sets $X$, $Y$, and may thus be significantly smaller than $\varepsilon\,\mathrm{vol}(V)$. By comparison, the second part of Theorem 1 just requires that the “original” discrepancy condition Disc(δ) is true, i.e., we just need to bound $|e(S)-\mathrm{vol}(S)^2/\mathrm{vol}(V)|$ in terms of the total volume $\mathrm{vol}(V)$. Hence, Butler shows that the “original” eigenvalue separation condition Eig follows from a stronger version of the discrepancy property. By contrast, Theorem 1 shows that the “original” discrepancy condition Disc implies a weak form of eigenvalue separation ess-Eig, thereby answering a question posed by Chung and Graham [10,11]. Furthermore, relying on Grothendieck’s inequality and SDP duality, the proof of Theorem 1 employs quite different techniques than [6,7,8].
In the present work we consider a concept of quasi-randomness that takes into account vertex degrees. Other concepts that do not refer to the degree sequence (and are therefore restricted to approximately regular graphs) were studied by Chung, Graham and Wilson [12] (dense graphs) and by Chung and Graham [11] (sparse graphs). Also in this setting it has been an open problem to derive eigenvalue separation from low discrepancy. Concerning this simpler concept of quasi-randomness, our techniques yield a result similar to Theorem 1 as well. The proof is similar and we omit the details.
Szemerédi’s original regularity lemma [23] has become an important tool in various areas, including extremal graph theory and property testing. Alon, Duke, Lefmann, Rödl, and Yuster [3] presented an algorithmic version, and showed how this lemma can be used to provide polynomial time approximation schemes for dense instances of NP-hard problems (see also [22] for a faster algorithm). Moreover, Frieze and Kannan [13] introduced a different algorithmic regularity concept, which yields better efficiency in terms of the desired approximation guarantee. Both [3,13] encompass Theorem 3 in the case that $D(v)=n$ for all $v\in V$. The sparse regularity lemma from Kohayakawa [21] and Rödl (unpublished) is related to the notion of quasi-randomness from [11]. This concept of regularity has proved very useful in the theory of random graphs, see Gerke and Steger [15].
2 Preliminaries
2.1 Notation
If $S\subset V$ is a subset of some set $V$, then we let $\mathbf{1}_S\in\mathbb{R}^V$ denote the vector whose entries are 1 on the components corresponding to elements of $S$, and 0 otherwise. More generally, if $\xi\in\mathbb{R}^V$ is a vector, then $\xi_S\in\mathbb{R}^V$ signifies the vector obtained from $\xi$ by replacing all components with indices in $V\setminus S$ by 0. Moreover, if $A=(a_{vw})_{v,w\in V}$ is a matrix, then $A_S=(a_{vw})_{v,w\in S}$ denotes the minor of $A$ induced on $S\times S$. Further, for a vector $\xi\in\mathbb{R}^V$ we let $\|\xi\|$ signify the $\ell_2$-norm, and for a matrix $M\in\mathbb{R}^{V\times V}$ we let
$$\|M\|=\max_{0\ne\xi\in\mathbb{R}^V}\frac{\|M\xi\|}{\|\xi\|}=\max_{\xi,\eta\in\mathbb{R}^V\setminus\{0\}}\frac{\langle M\xi,\eta\rangle}{\|\xi\|\cdot\|\eta\|}$$
denote the spectral norm.
If $\xi=(\xi_v)_{v\in V}$ is a vector, then $\mathrm{diag}(\xi)$ signifies the $V\times V$ matrix with diagonal $\xi$ and off-diagonal entries equal to 0. In particular, $E=\mathrm{diag}(\mathbf{1})$ denotes the identity matrix (of any size). Moreover, if $M$ is a $\nu\times\nu$ matrix, then $\mathrm{diag}(M)\in\mathbb{R}^\nu$ signifies the vector comprising the diagonal entries of $M$. If both $A=(a_{ij})_{1\le i,j\le\nu}$, $B=(b_{ij})_{1\le i,j\le\nu}$ are $\nu\times\nu$ matrices, then we let $\langle A,B\rangle=\sum_{i,j=1}^{\nu}a_{ij}b_{ij}$. If $M$ is a symmetric $\nu\times\nu$ matrix, then
$$\lambda_1[M]\le\cdots\le\lambda_\nu[M]=\lambda_{\max}[M]$$
denote the eigenvalues of $M$. We will occasionally need the Courant-Fischer characterizations of $\lambda_2$ and $\lambda_{\max}$, which read
$$\lambda_2[M]=\max_{0\ne\zeta\in\mathbb{R}^\nu}\ \min_{\xi\perp\zeta,\ \|\xi\|=1}\langle M\xi,\xi\rangle,\qquad \lambda_{\max}[M]=\max_{\zeta\in\mathbb{R}^\nu,\ \|\zeta\|=1}\langle M\zeta,\zeta\rangle \tag{4}$$
(see [5, Chapter 7]).
Recall that a symmetric matrix $M$ is positive semidefinite if $\lambda_1[M]\ge0$. In this case we write $M\ge0$. Furthermore, $M$ is positive definite if $\lambda_1[M]>0$, denoted as $M>0$. If $M,M'$ are symmetric, then $M\ge M'$ (resp. $M>M'$) denotes the fact that $M-M'\ge0$ (resp. $M-M'>0$).
2.2 Grothendieck’s inequality
An important ingredient to our proofs and algorithms is Grothendieck’s inequality. Let $M=(m_{ij})_{i,j\in\mathcal{I}}$ be a matrix. Then the cut-norm of $M$ is
$$\|M\|_{\mathrm{cut}}=\max_{I,J\subset\mathcal{I}}\Big|\sum_{(i,j)\in I\times J}m_{ij}\Big|.$$
In addition, consider the following optimization problem:
$$\mathrm{SDP}(M)=\max\sum_{i,j\in\mathcal{I}}m_{ij}\langle x_i,y_j\rangle\quad\text{s.t. } \forall i\in\mathcal{I}:\ \|x_i\|=\|y_i\|=1,\ x_i,y_i\in\mathbb{R}^{2|\mathcal{I}|}. \tag{5}$$
This can be reformulated as a linear optimization problem over the cone of positive semidefinite $2|\mathcal{I}|\times2|\mathcal{I}|$ matrices, i.e., as a semidefinite program (see Alizadeh [2]).
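For intuition about the cut-norm defined above, the following sketch evaluates it exactly on tiny matrices. It is a brute-force illustration (the name `cut_norm` is hypothetical) and not the Alon-Naor algorithm: for each row set $I$ it picks the best column set $J$ greedily, which is optimal once $I$ is fixed, so the running time is exponential only in the number of rows.

```python
from itertools import product

def cut_norm(m):
    """Exact cut-norm of a small matrix given as a list of rows."""
    rows, cols = len(m), len(m[0])
    best = 0.0
    for row_mask in product((0, 1), repeat=rows):
        col_sums = [sum(m[i][j] for i in range(rows) if row_mask[i])
                    for j in range(cols)]
        # For a fixed row set I, the optimal J takes either all columns with
        # positive sum or all columns with negative sum.
        positive = sum(s for s in col_sums if s > 0)
        negative = -sum(s for s in col_sums if s < 0)
        best = max(best, positive, negative)
    return best

print(cut_norm([[1, -1], [-1, 1]]))
print(cut_norm([[1, 2], [3, 4]]))
```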
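Lemma 4. For any $\nu\times\nu$ matrix $M$ we have
$$\mathrm{SDP}(M)=\frac12\max\left\langle\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes M,\ X\right\rangle\quad\text{s.t. } \mathrm{diag}(X)=\mathbf{1},\ X\ge0,\ X\in\mathbb{R}^{2\nu\times2\nu}. \tag{6}$$
Proof. Let $x_1,\dots,x_{2\nu}\in\mathbb{R}^{2\nu}$ be a family of unit vectors such that $\mathrm{SDP}(M)=\sum_{i,j=1}^{\nu}m_{ij}\langle x_i,x_{j+\nu}\rangle$. Then we obtain a positive semidefinite matrix $X=(x_{i,j})_{1\le i,j\le2\nu}$ by setting $x_{i,j}=\langle x_i,x_j\rangle$. Since $x_{i,i}=\|x_i\|^2=1$ for all $i$, this matrix satisfies $\mathrm{diag}(X)=\mathbf{1}$. Moreover,
$$\left\langle\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes M,\ X\right\rangle=2\sum_{i,j=1}^{\nu}m_{ij}x_{i,j+\nu}=2\sum_{i,j=1}^{\nu}m_{ij}\langle x_i,x_{j+\nu}\rangle. \tag{7}$$
Hence, the optimization problem on the r.h.s. of (6) yields an upper bound on $\mathrm{SDP}(M)$.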
Conversely, if $X=(x_{i,j})$ is a feasible solution to (6), then there exist vectors $x_1,\dots,x_{2\nu}\in\mathbb{R}^{2\nu}$ such that $x_{i,j}=\langle x_i,x_j\rangle$, because $X$ is positive semidefinite. Moreover, since $\mathrm{diag}(X)=\mathbf{1}$, we have $1=x_{i,i}=\|x_i\|^2$. Thus, $x_1,\dots,x_{2\nu}$ is a feasible solution to (5), and (7) shows that the resulting objective function values coincide. □
Grothendieck [17] established the following relation between $\mathrm{SDP}(M)$ and $\|M\|_{\mathrm{cut}}$.
Theorem 5. There is a constant $\theta>1$ such that for all matrices $M$ we have $\|M\|_{\mathrm{cut}}\le\mathrm{SDP}(M)\le\theta\cdot\|M\|_{\mathrm{cut}}$.
Since by Lemma 4 $\mathrm{SDP}(M)$ can be stated as a semidefinite program, an optimal solution to $\mathrm{SDP}(M)$ can be approximated in polynomial time within any numerical precision, e.g., via the ellipsoid method [18]. By applying an appropriate rounding procedure to a near-optimal solution to $\mathrm{SDP}(M)$, Alon and Naor [4] obtained the following algorithmic result.
Theorem 6. There are a constant $\theta'>0$ and a polynomial time algorithm ApxCutNorm that on input $M$ computes two sets $I,J\subset\mathcal{I}$ such that $\theta'\cdot\|M\|_{\mathrm{cut}}\le\big|\sum_{i\in I,j\in J}m_{ij}\big|$.
Alon and Naor presented a randomized algorithm that guarantees an approximation ratio $\theta'>0.56$, and a deterministic one with $\theta'\ge0.03$. Finally, we need the following dual characterization of SDP. The proof can be found in the next section, Section 2.3.
Lemma 7. For any symmetric $n\times n$ matrix $Q$ we have
$$\mathrm{SDP}(Q)=n\cdot\min_{z\in\mathbb{R}^n,\ z\perp\mathbf{1}}\lambda_{\max}\left[\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes Q-\mathrm{diag}\binom{z}{z}\right].$$
2.3 Proof of Lemma7
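The proof of Lemma 7 relies on the duality theorem for semidefinite programs. For a symmetric $n\times n$ matrix $Q$ set $\mathcal{Q}=\frac12\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes Q$. Furthermore, let
$$\mathrm{DSDP}(Q)=\min\langle\mathbf{1},y\rangle\quad\text{s.t. } \mathcal{Q}\le\mathrm{diag}(y),\ y\in\mathbb{R}^{2n}.$$
Lemma 8. We have $\mathrm{SDP}(Q)=\mathrm{DSDP}(Q)$.
Proof. By Lemma 4 we can rewrite the vector program $\mathrm{SDP}(Q)$ in the standard form of a semidefinite program:
$$\mathrm{SDP}(Q)=\max\langle\mathcal{Q},X\rangle\quad\text{s.t. } \mathrm{diag}(X)=\mathbf{1},\ X\ge0,\ X\in\mathbb{R}^{(2n)\times(2n)}.$$
Since $\mathrm{DSDP}(Q)$ is the dual of $\mathrm{SDP}(Q)$, the lemma follows directly from the SDP duality theorem as stated in [20, Corollary 2.2.6]. □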
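To infer Lemma 7, we shall simplify DSDP and reformulate this semidefinite program as an eigenvalue minimization problem. First, we show that it suffices to optimize over $y'\in\mathbb{R}^n$ rather than $y\in\mathbb{R}^{2n}$.
Lemma 9. Let $\mathrm{DSDP}'(Q)=\min 2\langle\mathbf{1},y'\rangle$ s.t. $\mathcal{Q}\le\mathrm{diag}\big(\binom{1}{1}\otimes y'\big)$, $y'\in\mathbb{R}^n$. Then $\mathrm{DSDP}(Q)=\mathrm{DSDP}'(Q)$.
Proof. Since for any feasible solution $y'$ to $\mathrm{DSDP}'(Q)$ the vector $y=\binom{1}{1}\otimes y'$ is a feasible solution to $\mathrm{DSDP}(Q)$, we conclude that $\mathrm{DSDP}(Q)\le\mathrm{DSDP}'(Q)$. Thus, we just need to establish the converse inequality $\mathrm{DSDP}'(Q)\le\mathrm{DSDP}(Q)$.
To this end, let $\mathcal{F}(Q)\subset\mathbb{R}^{2n}$ signify the set of all feasible solutions $y$ to $\mathrm{DSDP}(Q)$. We shall prove that $\mathcal{F}(Q)$ is closed under the linear operator
$$I:\mathbb{R}^{2n}\to\mathbb{R}^{2n},\quad(y_1,\dots,y_n,y_{n+1},\dots,y_{2n})\mapsto(y_{n+1},\dots,y_{2n},y_1,\dots,y_n),$$
i.e., $I(\mathcal{F}(Q))\subset\mathcal{F}(Q)$; note that $I$ just swaps the first and the last $n$ entries of $y$. To see that this implies the assertion, consider an optimal solution $y=(y_i)_{1\le i\le2n}\in\mathcal{F}(Q)$. Then $\frac12(y+Iy)\in\mathcal{F}(Q)$, because $\mathcal{F}(Q)$ is convex. Now, let $y'=(y'_i)_{1\le i\le n}$ be the projection of $\frac12(y+Iy)$ onto the first $n$ coordinates. Since $\frac12(y+Iy)$ is a fixed point of $I$, we have $\frac12(y+Iy)=\binom{1}{1}\otimes y'$. Hence, the fact that $\frac12(y+Iy)$ is feasible for $\mathrm{DSDP}(Q)$ implies that $y'$ is feasible for $\mathrm{DSDP}'(Q)$. Thus, we conclude that $\mathrm{DSDP}'(Q)\le2\langle\mathbf{1},y'\rangle=\langle\mathbf{1},y\rangle=\mathrm{DSDP}(Q)$.
To show that $\mathcal{F}(Q)$ is closed under $I$, consider a vector $y\in\mathcal{F}(Q)$. Since $\mathrm{diag}(y)-\mathcal{Q}$ is positive semidefinite, we have
$$\forall\eta\in\mathbb{R}^{2n}:\ \langle(\mathrm{diag}(y)-\mathcal{Q})\eta,\eta\rangle\ge0. \tag{8}$$
The objective is to show that $\mathrm{diag}(Iy)-\mathcal{Q}$ is positive semidefinite, i.e.,
$$\forall\xi\in\mathbb{R}^{2n}:\ \langle(\mathrm{diag}(Iy)-\mathcal{Q})\xi,\xi\rangle\ge0. \tag{9}$$
To derive (9) from (8), we decompose $y$ into its two halves $y=\binom{u}{v}$ ($u,v\in\mathbb{R}^n$). Then $Iy=\binom{v}{u}$. Moreover, let $\xi=\binom{\alpha}{\beta}\in\mathbb{R}^{2n}$ be any vector, and set $\eta=I\xi=\binom{\beta}{\alpha}$. We obtain
$$\langle(\mathrm{diag}(Iy)-\mathcal{Q})\xi,\xi\rangle=\langle\mathrm{diag}(v)\alpha,\alpha\rangle+\langle\mathrm{diag}(u)\beta,\beta\rangle-\frac{\langle Q\alpha,\beta\rangle+\langle Q\beta,\alpha\rangle}{2}=\langle(\mathrm{diag}(y)-\mathcal{Q})\eta,\eta\rangle\overset{(8)}{\ge}0,$$
thereby proving (9). □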
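Proof of Lemma 7. Let
$$\mathrm{DSDP}''(Q)=n\cdot\min_{z\in\mathbb{R}^n,\ z\perp\mathbf{1}}\lambda_{\max}\left[\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes Q+\mathrm{diag}\left(\binom{1}{1}\otimes z\right)\right].$$
By Lemmas 8 and 9, it suffices to prove that $\mathrm{DSDP}'(Q)=\mathrm{DSDP}''(Q)$.
To see that $\mathrm{DSDP}''(Q)\le\mathrm{DSDP}'(Q)$, consider an optimal solution $y'$ to $\mathrm{DSDP}'(Q)$. Let $\lambda=n^{-1}\langle\mathbf{1},y'\rangle$ and $z=2(\lambda\mathbf{1}-y')$. Then $\langle z,\mathbf{1}\rangle=2(n\lambda-\langle\mathbf{1},y'\rangle)=0$, whence $z$ is a feasible solution to $\mathrm{DSDP}''(Q)$. Furthermore, as $y'$ is a feasible solution to $\mathrm{DSDP}'(Q)$, we have
$$\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes Q=2\mathcal{Q}\le2\,\mathrm{diag}\left(\binom{1}{1}\otimes y'\right)=2\lambda E-\mathrm{diag}\left(\binom{1}{1}\otimes z\right),$$
where $E$ is the identity matrix. Hence, the matrix $2\lambda E-\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes Q-\mathrm{diag}\big(\binom{1}{1}\otimes z\big)$ is positive semidefinite. This implies that all eigenvalues of $\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes Q+\mathrm{diag}\big(\binom{1}{1}\otimes z\big)$ are bounded by $2\lambda$, i.e.,
$$\lambda_{\max}\left[\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes Q+\mathrm{diag}\left(\binom{1}{1}\otimes z\right)\right]\le2\lambda.$$
As a consequence,
$$\mathrm{DSDP}''(Q)\le n\,\lambda_{\max}\left[\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes Q+\mathrm{diag}\left(\binom{1}{1}\otimes z\right)\right]\le2n\lambda=2\langle\mathbf{1},y'\rangle=\mathrm{DSDP}'(Q).$$
Conversely, consider an optimal solution $z$ to $\mathrm{DSDP}''(Q)$. Set
$$\mu=\lambda_{\max}\left[\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes Q+\mathrm{diag}\left(\binom{1}{1}\otimes z\right)\right]=n^{-1}\mathrm{DSDP}''(Q),\qquad y'=\frac12(\mu\mathbf{1}-z).$$
Since all eigenvalues of $\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes Q+\mathrm{diag}\big(\binom{1}{1}\otimes z\big)$ are bounded by $\mu$, the matrix $\mu E-\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes Q-\mathrm{diag}\big(\binom{1}{1}\otimes z\big)$ is positive semidefinite, i.e., $\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes Q\le\mu E-\mathrm{diag}\big(\binom{1}{1}\otimes z\big)$. Therefore,
$$\mathcal{Q}=\frac12\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes Q\le\frac12\left(\mu E-\mathrm{diag}\left(\binom{1}{1}\otimes z\right)\right)=\mathrm{diag}\left(\binom{1}{1}\otimes y'\right).$$
Hence, $y'$ is a feasible solution to $\mathrm{DSDP}'(Q)$. Furthermore, since $z\perp\mathbf{1}$ we obtain
$$\mathrm{DSDP}'(Q)\le2\langle\mathbf{1},y'\rangle=\mu n=\mathrm{DSDP}''(Q),$$
as desired. □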
3 Quasi-Randomness: Proof of Theorem 1
3.1 From Essential Eigenvalue Separation to Low Discrepancy
We prove the first part of Theorem 1. Suppose that $G=(V,E)$ is a graph that admits a set $W\subset V$ of volume $\mathrm{vol}(W)\ge(1-\varepsilon)\mathrm{vol}(V)$ such that the eigenvalues of the minor $\mathcal{L}_W$ of the normalized Laplacian satisfy
$$1-\varepsilon\le\lambda_2[\mathcal{L}_W]\le\lambda_{\max}[\mathcal{L}_W]\le1+\varepsilon. \tag{10}$$
We may assume without loss of generality that $\varepsilon<0.01$. Our goal is to show that $G$ has Disc($10\sqrt{\varepsilon}$).
Let $\Delta=(\sqrt{d_v})_{v\in W}\in\mathbb{R}^W$. Hence, $\Delta$ is a real vector indexed by the elements of $W$. Moreover, let $L_W$ denote the matrix whose $vw$'th entry is $(d_vd_w)^{-1/2}$ if $v,w$ are adjacent, and 0 otherwise ($v,w\in W$), so that $\mathcal{L}_W=E-L_W$. Further, let $M_W=\mathrm{vol}(V)^{-1}\Delta\Delta^T-L_W$. Then for all unit vectors $\xi\perp\Delta$ we have
$$\mathcal{L}_W\xi-\xi=-L_W\xi=M_W\xi. \tag{11}$$
Moreover, for all $S\subset W$
$$|\langle M_W\Delta_S,\Delta_S\rangle|=\left|\frac{\mathrm{vol}(S)^2}{\mathrm{vol}(V)}-e(S)\right|. \tag{12}$$
The key step of the proof is to derive the following bound on the operator norm of $M_W$.
Lemma 10. We have $\|M_W\|\le10\sqrt{\varepsilon}$.
If it were the case that $W=V$, then Lemma 10 would be immediate. For if $W=V$, then $\Delta$ is an eigenvector of $\mathcal{L}=\mathcal{L}_W$ with eigenvalue 0. Hence, the definition $M_W=\|\Delta\|^{-2}\Delta\Delta^T-E+\mathcal{L}_W$ ensures that $M_W\Delta=0$. Moreover, for all $\xi\perp\Delta$ we have $M_W\xi=(\mathcal{L}_W-E)\xi$, whence (10) implies that $\|M_W\|\le\max\{|\lambda_2[\mathcal{L}_W]-1|,|\lambda_{\max}[\mathcal{L}_W]-1|\}\le\varepsilon$.
But of course generally $W$ is a proper subset of $V$. In this case $\Delta$ is not necessarily an eigenvector of $\mathcal{L}_W$. In fact, the smallest eigenvalue of $\mathcal{L}_W$ may be strictly positive. In order to prove Lemma 10 we will investigate the eigenvector $\zeta$ of $\mathcal{L}_W$ with the smallest eigenvalue $\lambda_1[\mathcal{L}_W]$ and show that it is “close” to $\Delta$. Then, we will use (10) to derive the desired bound on $\|M_W\|$.
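Proof of Lemma 10. Let $\zeta$ be a unit length eigenvector of $\mathcal{L}_W$ with eigenvalue $\lambda_1[\mathcal{L}_W]$. There is a decomposition $\Delta=\|\Delta\|\cdot(s\zeta+t\chi)$, where $s^2+t^2=1$ and $\chi\perp\zeta$ is a unit vector. Since $\langle\mathcal{L}_W\Delta,\Delta\rangle=e(W,V\setminus W)\le\mathrm{vol}(V\setminus W)\le\varepsilon\,\mathrm{vol}(V)$ and $\|\Delta\|^2=\mathrm{vol}(W)\ge(1-\varepsilon)\mathrm{vol}(V)\ge0.99\,\mathrm{vol}(V)$, we have
$$2\varepsilon\ge\|\Delta\|^{-2}\langle\mathcal{L}_W\Delta,\Delta\rangle=s^2\langle\mathcal{L}_W\zeta,\zeta\rangle+t^2\langle\mathcal{L}_W\chi,\chi\rangle. \tag{13}$$
As $\chi$ is perpendicular to the eigenvector $\zeta$ with eigenvalue $\lambda_1[\mathcal{L}_W]$, Courant-Fischer (4) and (10) yield $\langle\mathcal{L}_W\chi,\chi\rangle\ge\lambda_2[\mathcal{L}_W]\ge\frac12$. Hence, (13) implies $2\varepsilon\ge t^2/2$. Consequently,
$$t^2\le4\varepsilon,\quad\text{and thus}\quad s^2\ge1-4\varepsilon. \tag{14}$$
Now, let $\xi\perp\Delta$ be a unit vector, and decompose $\xi=x\zeta+y\eta$, where $\eta\perp\zeta$ is a unit vector. Because $\zeta=s^{-1}\left(\frac{\Delta}{\|\Delta\|}-t\chi\right)$, we have $x=\langle\zeta,\xi\rangle=s^{-1}\left\langle\frac{\Delta}{\|\Delta\|},\xi\right\rangle-\frac{t}{s}\langle\chi,\xi\rangle=-\frac{t}{s}\langle\chi,\xi\rangle$. Hence, (14) implies $x^2\le5\varepsilon$ and $y^2\ge1-5\varepsilon$. Combining these two estimates with (10) and (11), we conclude that $\|M_W\xi\|=\|\mathcal{L}_W\xi-\xi\|\le x(1-\lambda_1[\mathcal{L}_W])+y\|\mathcal{L}_W\eta-\eta\|\le3\sqrt{\varepsilon}$. Hence, we have established that
$$\sup_{0\ne\xi\perp\Delta}\frac{\|M_W\xi\|}{\|\xi\|}\le3\sqrt{\varepsilon}. \tag{15}$$
Furthermore, since $\|\Delta\|^2=\mathrm{vol}(W)$, (12) implies
$$\begin{aligned}\frac{|\langle M_W\Delta,\Delta\rangle|}{\|\Delta\|^2}&=\left|\frac{\mathrm{vol}(W)}{\mathrm{vol}(V)}-\frac{e(W)}{\mathrm{vol}(W)}\right|\le\left|\frac{\mathrm{vol}(W)}{\mathrm{vol}(V)}-\frac{e(W)}{\mathrm{vol}(V)}\right|+\left|\frac{e(W)}{\mathrm{vol}(W)}-\frac{e(W)}{\mathrm{vol}(V)}\right|\\&=\frac{e(W,V\setminus W)}{\mathrm{vol}(V)}+\frac{e(W)(\mathrm{vol}(V)-\mathrm{vol}(W))}{\mathrm{vol}(V)\mathrm{vol}(W)}\\&\le\frac{e(W,V\setminus W)}{\mathrm{vol}(V)}+\frac{\mathrm{vol}(V\setminus W)}{\mathrm{vol}(V)}\le\frac{2\,\mathrm{vol}(V\setminus W)}{\mathrm{vol}(V)}.\end{aligned} \tag{16}$$
As we are assuming that $\mathrm{vol}(W)\ge(1-\varepsilon)\mathrm{vol}(V)$, we obtain $\|\Delta\|^{-2}|\langle M_W\Delta,\Delta\rangle|\le2\varepsilon$. Finally, combining this last estimate with (15), we conclude that $\|M_W\|\le10\sqrt{\varepsilon}$. □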
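Lemma 10 easily implies that $G$ has Disc($10\sqrt{\varepsilon}$). For let $R\subset V$ be arbitrary, set $S=R\cap W$, and let $T=R\setminus W$. Since $\|\Delta_S\|^2=\mathrm{vol}(S)\le\mathrm{vol}(V)$, Lemma 10 and (12) imply that
$$\left|\frac{\mathrm{vol}(S)^2}{\mathrm{vol}(V)}-e(S)\right|\le\|M_W\|\cdot\|\Delta_S\|^2\le10\sqrt{\varepsilon}\,\mathrm{vol}(V). \tag{17}$$
Furthermore, as $\mathrm{vol}(W)\ge(1-\varepsilon)\mathrm{vol}(V)$,
$$e(R)-e(S)\le e(T)+2e(S,T)\le2\,\mathrm{vol}(T)\le2\,\mathrm{vol}(V\setminus W)\le2\varepsilon\,\mathrm{vol}(V),\quad\text{and}$$
$$\frac{\mathrm{vol}(R)^2-\mathrm{vol}(S)^2}{2\,\mathrm{vol}(V)}\le\frac{\mathrm{vol}(T)^2}{2\,\mathrm{vol}(V)}+\frac{\mathrm{vol}(S)\mathrm{vol}(T)}{\mathrm{vol}(V)}\le\frac{\mathrm{vol}(V\setminus W)^2}{2\,\mathrm{vol}(V)}+\mathrm{vol}(V\setminus W)\le2\varepsilon\,\mathrm{vol}(V).$$
Combining these two estimates with (17), we see that $\left|\frac{\mathrm{vol}(R)^2}{\mathrm{vol}(V)}-e(R)\right|<20\sqrt{\varepsilon}\,\mathrm{vol}(V)$, i.e., $G$ satisfies Disc($10\sqrt{\varepsilon}$).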
3.2 From Low Discrepancy to Essential Eigenvalue Separation
In this section we establish the second part of Theorem 1. Let $\theta$ denote the constant from Theorem 5 and set $\gamma=10^{-6}/\theta$. Assume that $G=(V,E)$ is a graph that has Disc($\gamma\varepsilon^2$) for some $\varepsilon<0.001$. In addition, we may assume without loss of generality that $G$ has no isolated vertices. Let $d_v$ denote the degree of $v\in V$, let $n=|V|$, and set $\bar d=\mathrm{vol}(V)/n=\sum_{v\in V}d_v/n$. Our goal is to show that $G$ has ess-Eig(ε). To this end, we introduce an additional property.
Cut(δ): We say $G$ has Cut(δ) if the matrix $M=(m_{vw})_{v,w\in V}$ with entries
$$m_{vw}=\frac{d_vd_w}{\mathrm{vol}(V)}-e(v,w)$$
has cut norm $\|M\|_{\mathrm{cut}}<\delta\cdot\mathrm{vol}(V)$; here $e(v,w)=1$ if $\{v,w\}\in E$ and $e(v,w)=0$ otherwise.
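Proposition 11. For any $\delta>0$ the following is true: if $G$ satisfies Disc($0.01\delta$), then $G$ satisfies Cut(δ).
Proof. Suppose that $G=(V,E)$ has Disc($0.01\delta$). We shall prove below that for any two $S,T\subset V$
$$|\langle M\mathbf{1}_S,\mathbf{1}_T\rangle|\le0.03\,\delta\,\mathrm{vol}(V)\quad\text{if } S\cap T=\emptyset, \tag{18}$$
$$|\langle M\mathbf{1}_S,\mathbf{1}_T\rangle|\le0.02\,\delta\,\mathrm{vol}(V)\quad\text{if } S=T. \tag{19}$$
To see that (18) and (19) imply the assertion, consider two arbitrary subsets $X,Y\subset V$. Letting $Z=X\cap Y$ and combining (18) and (19), we obtain
$$|\langle M\mathbf{1}_X,\mathbf{1}_Y\rangle|\le|\langle M\mathbf{1}_{X\setminus Z},\mathbf{1}_{Y\setminus Z}\rangle|+|\langle M\mathbf{1}_Z,\mathbf{1}_{Y\setminus Z}\rangle|+|\langle M\mathbf{1}_Z,\mathbf{1}_{X\setminus Z}\rangle|+2|\langle M\mathbf{1}_Z,\mathbf{1}_Z\rangle|\le\delta\,\mathrm{vol}(V).$$
Since this bound holds for any $X,Y$, we conclude that $\|M\|_{\mathrm{cut}}\le\delta\,\mathrm{vol}(V)$.
To prove (18), we note that Disc($0.01\delta$) implies for disjoint sets $S$ and $T$
$$\left|e(S)-\frac{\mathrm{vol}(S)^2}{\mathrm{vol}(V)}\right|\le0.02\,\delta\,\mathrm{vol}(V),\qquad\left|e(T)-\frac{\mathrm{vol}(T)^2}{\mathrm{vol}(V)}\right|\le0.02\,\delta\,\mathrm{vol}(V), \tag{20}$$
$$\left|e(S\cup T)-\frac{(\mathrm{vol}(S)+\mathrm{vol}(T))^2}{\mathrm{vol}(V)}\right|\le0.02\,\delta\,\mathrm{vol}(V). \tag{21}$$
If $S$ and $T$ are disjoint, (20)–(21) yield
$$\begin{aligned}|\langle M\mathbf{1}_S,\mathbf{1}_T\rangle|&=\left|e(S,T)-\frac{\mathrm{vol}(S)\mathrm{vol}(T)}{\mathrm{vol}(V)}\right|\\&=\frac12\left|e(S\cup T)-e(S)-e(T)-\frac{(\mathrm{vol}(S)+\mathrm{vol}(T))^2-\mathrm{vol}(S)^2-\mathrm{vol}(T)^2}{\mathrm{vol}(V)}\right|\\&\le\frac12\left|e(S)-\frac{\mathrm{vol}(S)^2}{\mathrm{vol}(V)}\right|+\frac12\left|e(T)-\frac{\mathrm{vol}(T)^2}{\mathrm{vol}(V)}\right|+\frac12\left|e(S\cup T)-\frac{(\mathrm{vol}(S)+\mathrm{vol}(T))^2}{\mathrm{vol}(V)}\right|\\&\le0.03\,\delta\,\mathrm{vol}(V),\end{aligned}$$
whence (18) follows. Finally, as $|\langle M\mathbf{1}_S,\mathbf{1}_S\rangle|=\left|e(S)-\frac{\mathrm{vol}(S)^2}{\mathrm{vol}(V)}\right|$, (19) follows from (20). □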
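The proof of (18) rests on the identity that, for disjoint $S$ and $T$, both $e(S,T)$ and $\mathrm{vol}(S)\mathrm{vol}(T)$ can be recovered from the corresponding quantities for $S$, $T$ and $S\cup T$. The sketch below checks this identity numerically on a small example; the function name `cross_discrepancy` and the example graph are assumptions made for illustration.

```python
def cross_discrepancy(n, edges, s, t):
    """Return e(S,T) - vol(S)vol(T)/vol(V) computed directly and via the identity.

    The second value uses only e(S), e(T), e(S u T) and the volumes, exactly
    as in the proof of (18); s and t must be disjoint sets of vertices.
    """
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    vol_v = sum(deg)

    def vol(a):
        return sum(deg[v] for v in a)

    def e_inside(a):  # twice the number of edges spanned by a
        return 2 * sum(1 for u, v in edges if u in a and v in a)

    e_st = sum(1 for u, v in edges if (u in s and v in t) or (u in t and v in s))
    direct = e_st - vol(s) * vol(t) / vol_v
    edge_part = e_inside(s | t) - e_inside(s) - e_inside(t)
    vol_part = ((vol(s) + vol(t)) ** 2 - vol(s) ** 2 - vol(t) ** 2) / vol_v
    return direct, 0.5 * (edge_part - vol_part)

# A triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
example = [(0, 1), (1, 2), (2, 3), (0, 2)]
direct, via_identity = cross_discrepancy(4, example, {0}, {1, 2})
print(direct, via_identity)
```

Both values agree (here 0.75), which is the equality used in the second line of the displayed chain above.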
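Let $D=\mathrm{diag}(d_v)_{v\in V}$ be the matrix with the vertex degrees on the diagonal. Let $\mathcal{M}=D^{-1/2}MD^{-1/2}$. Then the $vw$'th entry of $\mathcal{M}$ is $\frac{\sqrt{d_vd_w}}{\mathrm{vol}(V)}-(d_vd_w)^{-1/2}$ if $v,w$ are adjacent, and $\frac{\sqrt{d_vd_w}}{\mathrm{vol}(V)}$ otherwise. Establishing the following lemma is the key step.
Lemma 12. Suppose that $\mathrm{SDP}(M)<\varepsilon^2\mathrm{vol}(V)/64$. Then there exists a subset $W\subset V$ of volume $\mathrm{vol}(W)\ge(1-\varepsilon)\cdot\mathrm{vol}(V)$ such that $\|\mathcal{M}_W\|<\varepsilon$.
Proof. Recall that $\bar d=\mathrm{vol}(V)/n$. Lemma 7 implies that there is a vector $\mathbf{1}\perp z\in\mathbb{R}^V$ such that
$$\lambda_{\max}\left[\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes M-\mathrm{diag}\binom{z}{z}\right]=\mathrm{SDP}(M)/n<\varepsilon^2\bar d/64. \tag{22}$$
Basically $W$ is going to be the set of all $v$ such that $|z_v|$ is small (and such that $d_v$ is not too small). On the minor induced on $W\times W$ the diagonal matrix $\mathrm{diag}\binom{z}{z}$ has little effect, and thus (22) will imply the desired bound on $\|\mathcal{M}_W\|$.
To carry out the details we need to define $W$ precisely, bound $\|\mathcal{M}_W\|$, and prove that $\mathrm{vol}(W)\ge(1-\varepsilon)\mathrm{vol}(V)$.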
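Let $y=D^{-1}z$ and $U=\{v\in V:d_v>\varepsilon\bar d/8\}$. Let $y'=(y_v)_{v\in U}$ and $z'=(z_v)_{v\in U}$. Since all entries of the restricted diagonal matrix $D_U$ exceed $\varepsilon\bar d/8$, we have
$$\begin{aligned}\lambda_{\max}\left[\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes\mathcal{M}_U-\mathrm{diag}\binom{y'}{y'}\right]&=\lambda_{\max}\left[\left(\begin{pmatrix}1&0\\0&1\end{pmatrix}\otimes D_U^{-1/2}\right)\cdot\left(\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes M_U-\mathrm{diag}\binom{z'}{z'}\right)\cdot\left(\begin{pmatrix}1&0\\0&1\end{pmatrix}\otimes D_U^{-1/2}\right)\right]\\&\le8(\varepsilon\bar d)^{-1}\lambda_{\max}\left[\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes M_U-\mathrm{diag}\binom{z'}{z'}\right]\\&\le8(\varepsilon\bar d)^{-1}\lambda_{\max}\left[\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes M-\mathrm{diag}\binom{z}{z}\right]\overset{(22)}{<}\varepsilon/8.\end{aligned} \tag{23}$$
Let $W=\{v\in U:|y_v|<\varepsilon/8\}$ and let $y''=(y_v)_{v\in W}$. Then $\left\|\mathrm{diag}\binom{y''}{y''}\right\|<\varepsilon/8$, because the norm of a diagonal matrix equals the largest absolute value of an entry on the diagonal. Therefore, (23) yields
$$\begin{aligned}\lambda_{\max}\left[\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes\mathcal{M}_W\right]&\le\lambda_{\max}\left[\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes\mathcal{M}_W-\mathrm{diag}\binom{y''}{y''}\right]+\left\|\mathrm{diag}\binom{y''}{y''}\right\|\\&\le\lambda_{\max}\left[\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes\mathcal{M}_U-\mathrm{diag}\binom{y'}{y'}\right]+\left\|\mathrm{diag}\binom{y''}{y''}\right\|\le\varepsilon/4.\end{aligned} \tag{24}$$
Further, (24) implies that $\|\mathcal{M}_W\|<\varepsilon$. To see this, consider a pair $\xi,\eta\in\mathbb{R}^W$ of unit vectors. Then (24) and Courant-Fischer (4) yield
$$\begin{aligned}\varepsilon/2\ge2\lambda_{\max}\left[\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes\mathcal{M}_W\right]&\ge\left\langle\left(\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes\mathcal{M}_W\right)\cdot\binom{\xi}{\eta},\binom{\xi}{\eta}\right\rangle=\left\langle\binom{\mathcal{M}_W\eta}{\mathcal{M}_W\xi},\binom{\xi}{\eta}\right\rangle\\&=\langle\mathcal{M}_W\eta,\xi\rangle+\langle\mathcal{M}_W\xi,\eta\rangle=2\langle\mathcal{M}_W\xi,\eta\rangle\quad[\text{as }\mathcal{M}_W\text{ is symmetric}].\end{aligned}$$
Since this holds for any pair $\xi,\eta$, we conclude that $\|\mathcal{M}_W\|\le\varepsilon/4<\varepsilon$.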
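Finally, we need to show that $\mathrm{vol}(W)$ is large. To this end, we consider the set $S=\{v\in V:z_v<0\}$. Since $\mathrm{vol}(V)=\bar dn\ge\bar d|S|$, we have
$$\begin{aligned}\frac{\varepsilon^2\mathrm{vol}(V)}{32}\ge\frac{\varepsilon^2\bar d|S|}{32}&=\frac{\varepsilon^2\bar d}{64}\cdot\left\|\binom{\mathbf{1}_S}{\mathbf{1}_S}\right\|^2\\&\ge\lambda_{\max}\left[\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes M-\mathrm{diag}\binom{z}{z}\right]\cdot\left\|\binom{\mathbf{1}_S}{\mathbf{1}_S}\right\|^2\quad[\text{due to (22)}]\\&\ge\left\langle\left(\begin{pmatrix}0&1\\1&0\end{pmatrix}\otimes M-\mathrm{diag}\binom{z}{z}\right)\cdot\binom{\mathbf{1}_S}{\mathbf{1}_S},\binom{\mathbf{1}_S}{\mathbf{1}_S}\right\rangle\quad[\text{by Courant-Fischer (4)}]\\&=2\langle M\mathbf{1}_S,\mathbf{1}_S\rangle-2\sum_{v\in S}z_v.\end{aligned} \tag{25}$$
Further, Theorem 5 implies $|\langle M\mathbf{1}_S,\mathbf{1}_S\rangle|\le\|M\|_{\mathrm{cut}}\le\mathrm{SDP}(M)\le\varepsilon^2\mathrm{vol}(V)/64$. Inserting this into (25) and recalling that $z_v<0$ for all $v\in S$, we conclude that $\sum_{v\in S}|z_v|\le\varepsilon^2\mathrm{vol}(V)/16$. Since $z\perp\mathbf{1}$, this actually implies $\sum_{v\in V}|z_v|\le\varepsilon^2\mathrm{vol}(V)/8$. As $z=Dy$ and $|y_v|>\varepsilon/8$ for all $v\in V\setminus W$, we obtain
$$\varepsilon\,\mathrm{vol}(V\setminus W)/8\le\sum_{v\in V\setminus W}d_v|y_v|=\sum_{v\in V\setminus W}|z_v|\le\varepsilon^2\mathrm{vol}(V)/8.$$
Hence, $\mathrm{vol}(V\setminus W)\le\varepsilon\,\mathrm{vol}(V)$, which implies $\mathrm{vol}(W)\ge(1-\varepsilon)\mathrm{vol}(V)$. □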
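Once a dual vector $z$ as in (22) is available, the set $W$ from the proof of Lemma 12 is obtained by two simple threshold tests. The sketch below spells this out; the function name `essential_set` is hypothetical, and the vector `z` in the example is made-up input standing in for an SDP dual solution, which the sketch does not compute.

```python
def essential_set(deg, z, eps):
    """Construct W as in the proof: first U = {v : d_v > eps*dbar/8},
    then W = {v in U : |y_v| < eps/8} with y = D^{-1} z.

    Assumes every degree is positive (no isolated vertices) and that z is
    orthogonal to the all-ones vector.
    """
    n = len(deg)
    d_bar = sum(deg) / n
    u = [v for v in range(n) if deg[v] > eps * d_bar / 8]
    return [v for v in u if abs(z[v] / deg[v]) < eps / 8]

# Hypothetical input: a 4-regular graph on four vertices and a vector z that
# sums to zero; vertices 2 and 3 carry large dual weight and are discarded.
degrees = [4, 4, 4, 4]
z = [0.01, -0.01, 1.0, -1.0]
print(essential_set(degrees, z, eps=0.5))
```

The discarded vertices are exactly those where the diagonal correction $\mathrm{diag}\binom{z}{z}$ would otherwise distort the spectrum of the minor.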
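Finally, we show how Lemma 12 implies that $G$ has ess-Eig($\varepsilon$). Assume that $G$ has Disc($\gamma\varepsilon^2$). By Proposition 11 this implies that $G$ satisfies Cut($100\gamma\varepsilon^2$). Hence, Theorem 5 shows $\mathrm{SDP}(M) \le \beta\varepsilon^2\mathrm{vol}(V)$ for some $0 < \beta \le 100\theta\gamma$.
Thus, by Lemma 12 and our choice of $\gamma$ there is a set $W$ such that $\mathrm{vol}(W) \ge (1-\varepsilon/10)\mathrm{vol}(V)$ and $\|\mathcal{M}_W\| < \varepsilon/10$.
Furthermore, $\mathcal{M}_W$ relates to the minor $L_W$ of the Laplacian as follows. Let $\mathcal{L}_W = E - L_W$ be the matrix whose $vw$'th entry is $(d_vd_w)^{-1/2}$ if $v, w \in W$ are adjacent, and $0$ otherwise. Moreover, let $\Delta = (\sqrt{d_v})_{v\in W} \in \mathbb{R}^W$. Then $\mathcal{M}_W = \mathrm{vol}(V)^{-1}\Delta\Delta^T - \mathcal{L}_W$. Therefore, for all unit vectors $\xi \perp \Delta$ we have
\[
|\langle L_W\xi,\xi\rangle-1|=|\langle\mathcal{L}_W\xi,\xi\rangle|=|\langle\mathcal{M}_W\xi,\xi\rangle|\le\|\mathcal{M}_W\|<\varepsilon/10. \tag{26}
\]
Combining (26) with Courant-Fischer (4), we obtain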
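\[
\lambda_2[L_W]=\max_{0\ne\zeta\in\mathbb{R}^W}\ \min_{\xi\perp\zeta,\ \|\xi\|=1}\langle L_W\xi,\xi\rangle\ \ge\ \min_{\xi\perp\Delta,\ \|\xi\|=1}\langle L_W\xi,\xi\rangle\ \ge\ 1-\varepsilon. \tag{27}
\]
To bound $\lambda_{\max}[L_W]$ as well, we need to compute $\|L_W\Delta\|^2$. To this end, recall that the row of $L_W$ corresponding to a vertex $v \in V$ contains a one at position $v$. For $w \ne v$ the entry is $-(d_vd_w)^{-1/2}$ if $v$ and $w$ are adjacent, and $0$ otherwise.
Hence, the $v$-entry of the vector $L_W\Delta$ equals
\[
\Delta_v-\sum_{w\in W:\{v,w\}\in E}\frac{\Delta_w}{\sqrt{d_vd_w}}=\sqrt{d_v}-\frac{e(v,W)}{\sqrt{d_v}}=\frac{d_v-e(v,W)}{\sqrt{d_v}}.
\]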
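The closed form for the $v$-entry of $L_W\Delta$ can be compared against a direct row-by-row evaluation on a small graph. This is a minimal sketch under stated assumptions: the graph is given as a hypothetical dictionary `adj` of neighbour sets, every vertex of `W` has positive degree, and the function names are illustrative rather than taken from the paper.

```python
import math

def entries_by_definition(adj, W):
    """Evaluate (L_W * Delta)_v directly from the rows of the Laplacian minor."""
    deg = {v: len(adj[v]) for v in adj}
    result = {}
    for v in W:
        total = math.sqrt(deg[v])            # diagonal entry 1 times Delta_v
        for w in W:
            if w != v and w in adj[v]:       # off-diagonal entry -(d_v d_w)^(-1/2)
                total -= math.sqrt(deg[w]) / math.sqrt(deg[v] * deg[w])
        result[v] = total
    return result

def entries_closed_form(adj, W):
    """Evaluate (d_v - e(v, W)) / sqrt(d_v) for every v in W."""
    deg = {v: len(adj[v]) for v in adj}
    inside = set(W)
    return {v: (deg[v] - len(adj[v] & inside)) / math.sqrt(deg[v]) for v in W}

# Hypothetical example: the path 0 - 1 - 2 - 3 with W = {0, 1, 2}.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
W = [0, 1, 2]
direct = entries_by_definition(adj, W)
closed = entries_closed_form(adj, W)
```

In this example only vertex 2 has a neighbour outside `W`, so it is the only vertex with a nonzero entry, in line with the factor $d_v - e(v,W)$.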
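Since $\|\Delta\|^2 = \sum_{v\in W}d_v = \mathrm{vol}(W) \ge (1-\varepsilon/10)\mathrm{vol}(V)$, we obtain
\[
\frac{\|L_W\Delta\|^2}{\|\Delta\|^2}=\sum_{v\in W}\frac{(e(v,W)-d_v)^2}{d_v\cdot\mathrm{vol}(W)}\le\frac{1}{1-\varepsilon/10}\sum_{v\in W}\frac{d_v-e(v,W)}{\mathrm{vol}(V)}\le\frac{2\,\mathrm{vol}(V\setminus W)}{\mathrm{vol}(V)}<\varepsilon/5. \tag{28}
\]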
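Further, decomposing any unit vector $\eta \in \mathbb{R}^W$ as $\eta = \alpha\|\Delta\|^{-1}\Delta + \beta\xi$ with a unit vector $\xi \perp \Delta$ and $\alpha^2 + \beta^2 = 1$, we get
\begin{align*}
\langle L_W\eta,\eta\rangle
&=\left\langle L_W\left(\alpha\|\Delta\|^{-1}\Delta+\beta\xi\right),\ \alpha\|\Delta\|^{-1}\Delta+\beta\xi\right\rangle\\
&=\frac{\alpha^2}{\|\Delta\|^2}\cdot\langle L_W\Delta,\Delta\rangle+\frac{\alpha\beta}{\|\Delta\|}\cdot\langle L_W\Delta,\xi\rangle+\frac{\alpha\beta}{\|\Delta\|}\cdot\langle L_W\xi,\Delta\rangle+\beta^2\langle L_W\xi,\xi\rangle\\
&=\frac{\alpha^2}{\|\Delta\|^2}\cdot\langle L_W\Delta,\Delta\rangle+\frac{2\alpha\beta}{\|\Delta\|}\cdot\langle L_W\Delta,\xi\rangle+\beta^2\langle L_W\xi,\xi\rangle,
\end{align*}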
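where the last step follows from the fact that $L_W$ is symmetric. Hence, using (26) and (28), we get
\begin{align*}
\langle L_W\eta,\eta\rangle
&\le\frac{\alpha^2}{\|\Delta\|^2}\cdot\|L_W\Delta\|\cdot\|\Delta\|+\frac{2\alpha\beta}{\|\Delta\|}\cdot\|L_W\Delta\|\cdot\|\xi\|+\beta^2\langle L_W\xi,\xi\rangle\\
&\le\alpha^2\sqrt{\varepsilon/5}+2\alpha\beta\sqrt{\varepsilon/5}+\beta^2\left(1+|\langle L_W\xi,\xi\rangle-1|\right)\\
&\le\sqrt{\varepsilon/5}\,(\alpha^2+2\alpha\beta)+\beta^2(1+\varepsilon/10)\\
&\le 3\sqrt{\varepsilon/5}\cdot|\alpha|+(1-\alpha^2)(1+\varepsilon/10).
\end{align*}
Differentiating the last expression, we find that the maximum is attained at $\alpha = \frac{3}{2}\sqrt{\varepsilon/5}/(1+\varepsilon/10)$. Plugging this value in, we obtain $\langle L_W\eta,\eta\rangle \le 1+\varepsilon$. Hence, by Courant-Fischer (4), $\lambda_{\max}[L_W] = \max_{\|\eta\|=1}\langle L_W\eta,\eta\rangle \le 1+\varepsilon$.
Thus, (27) shows that $G$ has ess-Eig($\varepsilon$).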
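The final maximisation over $\alpha$ can be sanity-checked numerically. The sketch below evaluates the upper bound $3\sqrt{\varepsilon/5}\,|\alpha| + (1-\alpha^2)(1+\varepsilon/10)$ at the stationary point stated above and compares it with a grid over $\alpha \in [-1,1]$; the function names and the sample values of $\varepsilon$ are assumptions made purely for illustration.

```python
import math

def rayleigh_upper_bound(alpha, eps):
    """Upper bound on <L_W eta, eta> as a function of alpha."""
    return 3 * math.sqrt(eps / 5) * abs(alpha) + (1 - alpha ** 2) * (1 + eps / 10)

def maximiser(eps):
    """Stationary point alpha = (3/2) * sqrt(eps/5) / (1 + eps/10)."""
    return 1.5 * math.sqrt(eps / 5) / (1 + eps / 10)

def check(eps, grid=1000):
    """Return (value at the maximiser, largest value found on a grid over [-1, 1])."""
    best = rayleigh_upper_bound(maximiser(eps), eps)
    grid_max = max(rayleigh_upper_bound(i / grid, eps) for i in range(-grid, grid + 1))
    return best, grid_max

# Illustrative sample values of eps (assumptions, chosen only for the check).
results = {eps: check(eps) for eps in (1e-3, 1e-2, 1e-1)}
```

For each sampled $\varepsilon$ the value at the stationary point dominates the grid and stays below $1+\varepsilon$, consistent with the closed form $(1+\varepsilon/10) + \frac{9\varepsilon/20}{1+\varepsilon/10}$ obtained by substitution.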
4 The Algorithmic Regularity Lemma: Proof of Theorem 2
In this section we establish Theorem 2. The proof is conceptually similar to Szemerédi's original proof of the "dense" regularity lemma [23] and its adaptation for sparse graphs due to Kohayakawa [21] and Rödl (unpublished). A new aspect here is that we deal with a different (more general) notion of regularity; this requires various technical modifications of the previous arguments. More importantly, we present an algorithm for actually computing a regular partition of a sparse graph efficiently.
In order to find a regular partition efficiently, we crucially need an algorithm to check for a given weight distribution $D = (D_v)_{v\in V}$, a given graph $G$, and a pair $(A, B)$ of vertex sets whether $(A, B)$ is $(\varepsilon, D)$-regular. While [3] features a (purely combinatorial) algorithm for this problem in dense graphs, this approach does not work in the sparse case. In Section 4.1 we present an algorithm Witness that does. It is based on Grothendieck's inequality and the semidefinite relaxation of the cut norm (see Theorem 6). Then, in Section 4.2 we will show how Witness can be used to compute a regular partition to establish Theorem 2.
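Throughout this section, we let $0 < \varepsilon < 10^{-7}$ be an arbitrarily small but fixed number, and $C \ge 1$ signifies an arbitrarily large but fixed number. In addition, we define a sequence $(t_k)_{k\ge1}$ by
\[
t_1=\lceil 1/\varepsilon^2\rceil\quad\text{and}\quad t_{k+1}=\left\lceil 2200^2C^2t_k^6\,2^{t_k}/\varepsilon^{4(k+1)}\right\rceil. \tag{29}
\]
Note that due to that choice we have
\[
t_{k+1}\ge 2200\,C\,t_k^{2.5}. \tag{30}
\]
Further, let
\[
k^*=\lceil 10^6C^2\varepsilon^{-3}\rceil\quad\text{and}\quad\eta=\min\left\{\frac{\varepsilon^{8k^*}}{12800^2\,t_{k^*}^6\,C^4},\ \frac{1}{t_{k^*}^2}\right\} \tag{31}
\]
and choose $n_0 = n_0(C, \varepsilon) > 0$ big enough. We let $G = (V, E)$ be a graph on $n = |V| > n_0$ vertices, and let $D = (D_v)_{v\in V}$ be a sequence of rationals with $1 \le D_v \le n$ for all $v \in V$. We will always assume that $G$ is $(C, \eta, D)$-bounded, and that $D(V) \ge \eta^{-1}n$.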
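To get a feeling for how fast the sequence (29) grows, its first step can be evaluated exactly with rational arithmetic. The sketch below is illustrative only: the helper name `next_t` is hypothetical, and the demo uses $\varepsilon = 1/2$ and $C = 1$, which lies outside the range $\varepsilon < 10^{-7}$ assumed in this section, because for admissible $\varepsilon$ even $2^{t_1}$ is far too large to write down.

```python
from fractions import Fraction
import math

def next_t(t_k, k, C, eps):
    """One step of recurrence (29): t_{k+1} = ceil(2200^2 C^2 t_k^6 2^{t_k} / eps^{4(k+1)})."""
    value = Fraction(2200 ** 2 * C ** 2 * t_k ** 6 * 2 ** t_k) / eps ** (4 * (k + 1))
    return math.ceil(value)

# Demo parameters (assumptions for illustration, not admissible values from the paper).
eps = Fraction(1, 2)
C = 1

t1 = math.ceil(1 / eps ** 2)     # t_1 = ceil(1 / eps^2)
t2 = next_t(t1, 1, C, eps)       # t_2 from (29) with k = 1

# Inequality (30) for k = 1: t_2 >= 2200 * C * t_1^{2.5}.
holds_30 = t2 >= 2200 * C * t1 ** 2.5
```

Already $t_3$ involves the factor $2^{t_2}$, so the sequence has tower-type growth and only its first terms can be computed explicitly.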
4.1 The Procedure Witness
The subroutine Witness is given a graph $G$, a weight distribution $D$, vertex sets $A$, $B$, and a number $\varepsilon > 0$.
Witness either outputs "yes", in which case $(A, B)$ is $(\varepsilon, D)$-regular in $G$, or "no". In the latter case the algorithm also produces a "witness of irregularity", i.e., a pair of sets $X^* \subset A$, $Y^* \subset B$ for which the regularity condition (2) is violated with $\varepsilon$ replaced by $\varepsilon/200$. Witness employs the algorithm ApxCutNorm from Theorem 6.
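Algorithm 13. Witness($G, D, A, B, \varepsilon$)
1. Set up the matrix $M = (m_{vw})_{(v,w)\in A\times B}$ with entries
\[
m_{vw}=\begin{cases}1-\varrho(A,B)D_vD_w&\text{if }v,w\text{ are adjacent in }G,\\-\varrho(A,B)D_vD_w&\text{otherwise.}\end{cases}
\]
Call ApxCutNorm($M$) to compute sets $X \subset A$, $Y \subset B$ such that $|\langle M\mathbf{1}_X,\mathbf{1}_Y\rangle| \ge \frac{3}{100}\|M\|_{\mathrm{cut}}$.
2. If $|\langle M\mathbf{1}_X,\mathbf{1}_Y\rangle| < \frac{3\varepsilon}{100}\cdot\frac{D(A)D(B)}{D(V)}$, then return "yes".
3. If not, let $X' = A\setminus X$.
– If $D(X) \ge \frac{3\varepsilon}{100}D(A)$, then let $X^* = X$.
– If $D(X) < \frac{3\varepsilon}{100}D(A)$ and $|e(X',Y)-\varrho(A,B)D(X')D(Y)| > \frac{\varepsilon D(A)D(B)}{100\,D(V)}$, set $X^* = X'$.
– Otherwise, set $X^* = X\cup X'$.
4. Let $Y' = B\setminus Y$.
– If $D(Y) \ge \frac{\varepsilon}{200}D(B)$, then let $Y^* = Y$.
– If $D(Y) < \frac{\varepsilon}{200}D(B)$ and $|e(X^*,Y')-\varrho(A,B)D(X^*)D(Y')| > \frac{\varepsilon D(A)D(B)}{200\,D(V)}$, let $Y^* = Y'$.
– Otherwise, set $Y^* = Y\cup Y'$.
5. Answer "no" and output $(X^*, Y^*)$ as an $(\varepsilon/200, D)$-witness.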
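The control flow of Witness can be sketched as follows. This is not the algorithm of Theorem 6: ApxCutNorm is replaced here by an exhaustive search over all pairs of subsets, which is exact (so the $\frac{3}{100}$ guarantee holds trivially) but only feasible for tiny inputs. The data layout (edges as a set of `frozenset` pairs, weights as a dictionary) and all function names are assumptions, and $\varrho(A,B)$ is taken to be $e(A,B)/(D(A)D(B))$, as suggested by the way this term is expanded in the proof of Lemma 14.

```python
from itertools import chain, combinations

def all_subsets(items):
    """All subsets of items, as tuples (exponential; for tiny examples only)."""
    items = list(items)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))

def witness(edges, D, A, B, eps):
    """Sketch of Witness: returns ("yes", None) or ("no", (X_star, Y_star))."""
    weight = lambda S: sum(D[v] for v in S)
    count = lambda S, T: sum(1 for u in S for v in T if frozenset((u, v)) in edges)
    rho = count(A, B) / (weight(A) * weight(B))          # assumed definition of rho(A, B)
    dev = lambda S, T: count(S, T) - rho * weight(S) * weight(T)   # <M 1_S, 1_T>
    scale = weight(A) * weight(B) / sum(D.values())      # D(A) D(B) / D(V)

    # Step 1: brute-force stand-in for ApxCutNorm (NOT the SDP-based routine).
    X, Y = max(((set(S), set(T)) for S in all_subsets(A) for T in all_subsets(B)),
               key=lambda pair: abs(dev(*pair)))
    # Step 2: small cut norm means the pair is regular.
    if abs(dev(X, Y)) < 3 * eps / 100 * scale:
        return "yes", None
    # Step 3: make the first witness set heavy enough.
    X_comp = set(A) - X
    if weight(X) >= 3 * eps / 100 * weight(A):
        X_star = X
    elif abs(dev(X_comp, Y)) > eps * scale / 100:
        X_star = X_comp
    else:
        X_star = X | X_comp
    # Step 4: the same for the second witness set.
    Y_comp = set(B) - Y
    if weight(Y) >= eps / 200 * weight(B):
        Y_star = Y
    elif abs(dev(X_star, Y_comp)) > eps * scale / 200:
        Y_star = Y_comp
    else:
        Y_star = Y | Y_comp
    # Step 5: report the witness of irregularity.
    return "no", (X_star, Y_star)
```

For instance, with unit weights on four vertices, `A = [0, 1]` and `B = [2, 3]`, a perfect matching between `A` and `B` is reported as irregular, whereas the complete bipartite graph yields "yes" because every deviation $e(S,T) - \varrho(A,B)D(S)D(T)$ vanishes.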
Lemma 14. Suppose that $A, B \subset V$ are disjoint.
1. If Witness($G, D, A, B, \varepsilon$) answers "yes", then the pair $(A, B)$ is $(\varepsilon, D)$-regular.
2. If the answer is "no", then $(A, B)$ is not $(\varepsilon/200, D)$-regular. In this case Witness outputs an $(\varepsilon/200, D)$-witness, i.e., a pair $(X^*, Y^*)$ of subsets $X^* \subset A$, $Y^* \subset B$ such that $D(X^*) \ge \frac{\varepsilon}{200}D(A)$, $D(Y^*) \ge \frac{\varepsilon}{200}D(B)$, and
\[
|e(X^*,Y^*)-\varrho(A,B)D(X^*)D(Y^*)|>\frac{\varepsilon}{200}\cdot\frac{D(A)D(B)}{D(V)}.
\]
Moreover, there exist a function $f$ and a polynomial $\Pi$ such that the running time of Witness is bounded by $f(C,\varepsilon)\cdot\Pi(\langle D\rangle)$.
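Proof. Note that for any two subsets $S \subset A$ and $T \subset B$ we have
\[
\langle M\mathbf{1}_S,\mathbf{1}_T\rangle=e(S,T)-\varrho(A,B)D(S)D(T).
\]
Therefore, if the sets $X \subset A$ and $Y \subset B$ computed by ApxCutNorm are such that
\[
|\langle M\mathbf{1}_X,\mathbf{1}_Y\rangle|<\frac{3\varepsilon}{100}\cdot\frac{D(A)D(B)}{D(V)},
\]
then by Theorem 6 we have
\[
|e(S,T)-\varrho(A,B)D(S)D(T)|\le\|M\|_{\mathrm{cut}}\le\frac{100}{3}\,|\langle M\mathbf{1}_X,\mathbf{1}_Y\rangle|<\varepsilon\,\frac{D(A)D(B)}{D(V)}
\]
for all $S \subset A$ and $T \subset B$. Thus, if Witness answers "yes" then the pair $(A, B)$ is $(\varepsilon, D)$-regular.
On the other hand, if ApxCutNorm yields sets $X$, $Y$ such that $|\langle M\mathbf{1}_X,\mathbf{1}_Y\rangle| \ge \frac{3\varepsilon}{100}\cdot\frac{D(A)D(B)}{D(V)}$, then Witness has to guarantee that the output pair $(X^*, Y^*)$ is an $(\varepsilon/200, D)$-witness.
Indeed, if $D(X) \ge \frac{3\varepsilon}{100}D(A)$ and $D(Y) \ge \frac{\varepsilon}{200}D(B)$, then $(X, Y)$ actually is an $(\varepsilon/200, D)$-witness. However, as ApxCutNorm does not guarantee any lower bound on $D(X)$ and $D(Y)$, let us assume first that $D(X) < \frac{3\varepsilon}{100}D(A)$ and $D(Y) \ge \frac{\varepsilon}{200}D(B)$. Then Step 3 of Witness sets $X' = A\setminus X$. We have $D(X') \ge \frac{3}{100}D(A)$. If $X'$ itself satisfies $|e(X',Y)-\varrho(A,B)D(X')D(Y)| > \frac{\varepsilon D(A)D(B)}{100\,D(V)}$, then $(X', Y)$ obviously is an $(\varepsilon/200, D)$-witness. Otherwise, by the triangle inequality, we deduce
\[
\left|e(X\cup X',Y)-\frac{e(A,B)\,D(X\cup X')\,D(Y)}{D(A)D(B)}\right|\ge\frac{2\varepsilon}{100}\cdot\frac{D(A)D(B)}{D(V)}
\]
and thus, $(X\cup X', Y)$ is an $(\varepsilon/200, D)$-witness.
In the case $D(X) < \frac{3\varepsilon}{100}D(A)$ and $D(Y) < \frac{\varepsilon}{200}D(B)$ we simply repeat the argument for $Y$, and hence Witness outputs an $(\varepsilon/200, D)$-witness for $(A, B)$.
The running time of Witness is dominated by Step 1, i.e., the execution of ApxCutNorm. By Theorem 6 the running time of ApxCutNorm is polynomial in the encoding length of the input matrix. Moreover, the construction of $M$ in Step 1 shows that its encoding length is of the form $f(C,\varepsilon)\cdot\Pi(\langle D\rangle)$ for a certain function $f$ and a polynomial $\Pi$, as claimed. $\square$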
4.2 The Algorithm Regularize
In order to compute the desired regular partition of the input graph $G$, the algorithm Regularize starts with an arbitrary initial partition $\mathcal{P}^1 = \{V_i^1 : 0 \le i \le s_1\}$ such that each class $V_i^1$ ($1 \le i \le s_1$) has a "decent" weight $D(V_i^1)$.
In the subsequent steps, Regularize computes a sequence $(\mathcal{P}^k)$ of partitions such that $\mathcal{P}^{k+1}$ is a "more regular" refinement of $\mathcal{P}^k$ ($k \ge 1$). The algorithm halts as soon as it can verify that $\mathcal{P}^k$ satisfies both REG1 and REG2 of Theorem 2. To this end Regularize applies the subroutine Witness to each pair $(V_i^k, V_j^k)$ of the current partition $\mathcal{P}^k$. By Lemma 14 this yields a set $\mathcal{L}^k$ of pairs $(i, j)$ such that all $(V_i^k, V_j^k)$ with $(i, j) \notin \mathcal{L}^k$ are $(\varepsilon, D)$-regular. Hence, $\mathcal{P}^k$ satisfies REG2 as soon as
\[
\sum_{(i,j)\in\mathcal{L}^k}D(V_i^k)D(V_j^k)<\varepsilon D(V)^2. \tag{32}
\]
In this case the algorithm Regularize stops and outputs $\mathcal{P}^k$. As we will see, all partitions $\mathcal{P}^k$ satisfy REG1 by construction. Consequently, once (32) holds, Regularize has obtained the desired regular partition.
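The stopping rule (32) only involves the class weights and the list of pairs flagged by Witness, so it can be written down in a few lines. The sketch below assumes a hypothetical dictionary `class_weights` mapping each class index $i$ (including the exceptional index $0$) to $D(V_i^k)$, so that the values sum to $D(V)$; the function name is illustrative.

```python
def satisfies_reg2(class_weights, irregular_pairs, eps):
    """Check condition (32): total weight product of flagged pairs is below eps * D(V)^2."""
    total = sum(class_weights.values())                      # D(V)
    irregular_mass = sum(class_weights[i] * class_weights[j]
                         for i, j in irregular_pairs)        # sum over L^k
    return irregular_mass < eps * total ** 2

# Hypothetical weights: exceptional class 0 plus three classes of weight 3.
weights = {0: 1, 1: 3, 2: 3, 3: 3}
few_bad_pairs = satisfies_reg2(weights, [(1, 2)], 0.1)            # 9 < 10
many_bad_pairs = satisfies_reg2(weights, [(1, 2), (1, 3)], 0.1)   # 18 < 10 fails
```

With one flagged pair the partition would be accepted, while with two flagged pairs Regularize would proceed to the refinement step.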
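Algorithm 15. Regularize($G, C, D, \varepsilon$)
1. Fix an arbitrary partition $\mathcal{P}^1 = \{V_i^1 : 0 \le i \le s_1\}$ for some $s_1 \le t_1$ with the properties
– $D(V)/t_1 - \max_{v\in V}D_v < D(V_i^1) \le D(V)/t_1$ for all $1 \le i \le s_1$, and
– $D\big(V\setminus\bigcup_{i\in[s_1]}V_i^1\big) \le D(V)/t_1$.
Set $V_0^1 = V\setminus\bigcup_{i\in[s_1]}V_i^1$ and set $k^* = \lceil 1000^2C^2\varepsilon^{-3}\rceil$.
2. For $k = 1, 2, 3, \ldots, k^*$ do
3. Initially, let $\mathcal{L}^k = \emptyset$. For each pair $(V_i^k, V_j^k)$ ($i < j$) of classes of the partition $\mathcal{P}^k$
4. call the procedure Witness($G, D, V_i^k, V_j^k, \varepsilon$). If it answers "no" and hence outputs an $(\varepsilon/200, D)$-witness $(X_{ij}^k, X_{ji}^k)$ for $(V_i^k, V_j^k)$, then add $(i, j)$ to $\mathcal{L}^k$.
5. If $\sum_{(i,j)\in\mathcal{L}^k}D(V_i^k)D(V_j^k) < \varepsilon(D(V))^2$, then output the partition $\mathcal{P}^k$ and halt.
6. Else construct a refinement $\mathcal{P}^{k+1}$ of $\mathcal{P}^k$ as follows:
– First construct the unique minimal partition $\mathcal{C}^k$ of $V\setminus V_0^k$, which refines $\{X_{ij}^k, V_i^k\setminus X_{ij}^k\}$ for every $i = 1, \ldots, s_k$ and every $j \ne i$. More precisely, we define the equivalence relation $\equiv_i^k$ on $V_i^k$ by letting $u \equiv_i^k v$ iff for all $j$ such that $(i, j) \in \mathcal{L}^k$ it is true that $u \in X_{ij}^k \Leftrightarrow v \in X_{ij}^k$, and we let $\mathcal{C}^k$ be the set of all equivalence classes of the relations $\equiv_i^k$ ($1 \le i \le s_k$).
– Set $\alpha_k = \varepsilon^{4(k+1)}/(2200^2C^2t_k^6\,2^{t_k})$ and split each vertex class of $\mathcal{C}^k$ into blocks with weight between $\alpha_kD(V)$ and $\alpha_kD(V) + \max_{v\in V}D_v$ and possibly one exceptional block of smaller weight. More precisely, we construct a refinement $\mathcal{C}^k_* = \{V_{0,1}^{k+1}, \ldots, V_{0,r_k}^{k+1}, V_1^{k+1}, \ldots, V_{s_{k+1}}^{k+1}\}$ of $\mathcal{C}^k$ such that:
  • $r_k \le |\mathcal{C}^k| \le s_k2^{s_k}$,
  • $D(V_{0,q}^{k+1}) < \alpha_kD(V)$ for all $q \in [r_k]$, and
  • $\alpha_kD(V) \le D(V_i^{k+1}) < \alpha_kD(V) + \max_{v\in V}D_v$ for all $i \in [s_{k+1}]$.
– Let $V_0^{k+1} = V_0^k \cup \bigcup_{q\in[r_k]}V_{0,q}^{k+1}$ and set $\mathcal{P}^{k+1} = \{V_i^{k+1} : 0 \le i \le s_{k+1}\}$.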
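The two parts of Step 6 can be sketched as follows. The first function groups the vertices of each class by their membership pattern in the witness sets, which yields the equivalence classes of $\equiv_i^k$; the second splits one class greedily into blocks whose weight first reaches a target (playing the role of $\alpha_kD(V)$), leaving at most one lighter left-over block. The data layout (a dictionary of classes, witness sets keyed by ordered pairs $(i,j)$ with $X_{ij}^k \subset V_i^k$) and the function names are assumptions for illustration, and the greedy order is one possible choice rather than the paper's prescription.

```python
from collections import defaultdict

def refine_by_witnesses(classes, witnesses):
    """Common refinement: split each class V_i by membership in every X_ij."""
    refined = []
    for i, members in classes.items():
        partners = sorted(j for (a, j) in witnesses if a == i)
        groups = defaultdict(set)
        for v in members:
            # Two vertices are equivalent iff they lie in the same witness sets.
            signature = tuple(v in witnesses[(i, j)] for j in partners)
            groups[signature].add(v)
        refined.extend(groups.values())
    return refined

def split_into_blocks(vertex_class, D, target):
    """Greedy split into blocks of weight in [target, target + max D_v)
    plus one left-over block of weight below target (possibly empty)."""
    blocks, current, weight = [], set(), 0
    for v in sorted(vertex_class):
        current.add(v)
        weight += D[v]
        if weight >= target:          # block is heavy enough: close it
            blocks.append(current)
            current, weight = set(), 0
    return blocks, current

# Hypothetical toy input: two classes and one witness pair.
classes = {1: {1, 2, 3, 4}, 2: {5, 6}}
witnesses = {(1, 2): {1, 2}, (2, 1): {5}}
refined = refine_by_witnesses(classes, witnesses)

unit_weights = {v: 1 for v in range(1, 8)}
blocks, leftover = split_into_blocks(set(range(1, 8)), unit_weights, 3)
```

A block is closed as soon as its weight reaches the target, so its weight exceeds the target by less than the weight of the last vertex added, which matches the third bullet of Step 6.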
Step 6 is the central step of the algorithm. In the first part of that step we construct a joint refinement of the previous partition $\mathcal{P}^k$ and all the witnesses of irregularity $(X_{ij}^k, X_{ji}^k)$ discovered in Step 4. As in Szemerédi's original proof, it will turn out that a bounded parameter (the so-called index defined below) of the partition $\mathcal{C}^k$ increases by $\Omega(\varepsilon^3)$ compared to $\mathcal{P}^k$. Since $\mathcal{P}^k$ consists of $s_k$ classes and for every $i = 1, \ldots, s_k$ there are at most $s_k - 1$ witness sets $X_{ij}^k$ ($j \ne i$), the refinement $\mathcal{C}^k$ contains at most $s_k2^{s_k-1} < s_k2^{s_k}$ vertex classes. In the second part of Step 6 we split the classes of $\mathcal{C}^k$ into pieces of almost equal weight. Here for each class of $\mathcal{C}^k$ we may get one class of left-over vertices $V_{0,q}^{k+1}$ of smaller weight, which together with $V_0^k$ form the new exceptional class $V_0^{k+1}$. Due to the construction in Step 6, the bound $s_1 \le t_1$, and (29), for any $k \ge 0$ the partition $\mathcal{P}^{k+1}$ consists of at most
\[
s_{k+1}+1\le\left\lceil 2200^2C^2t_k^6\,2^{t_k}/\varepsilon^{4(k+1)}\right\rceil=t_{k+1}
\]
classes. Moreover, our choice (31) of $\eta$ and the construction in Step 1 ensure that
\[
\varepsilon^2D(V)\ge D(V_i^{k+1})\ge\sqrt{\eta}\,D(V)\quad\text{for all }1\le i\le s_{k+1} \tag{33}
\]