
Note:

Throughout the paper we bound errors, with high probability (whp), by $\frac{1}{m^{\Omega(1)}}$. These errors are less than $\epsilon$ if $m$ is chosen to be a sufficiently large polynomial in $\frac{1}{\epsilon}$. Read whp as: with probability $1 - \frac{1}{m^{\Omega(1)}}$.

Throughout the paper we leave the errors in terms of $m$. The reader should think of adding up all the errors, union bounding the failure probabilities, and fixing all the parameters in terms of $\epsilon$; we then choose a sufficiently large $m = \frac{1}{\epsilon^{\Omega(1)}}$ to make the sum of all the errors $O(\epsilon)$ and the total failure probability $O(\epsilon)$.

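▶ Lemma 6. For any degree-2 multilinear polynomial $p = \sum_{i,j\in[n]} a_{ij} x_i x_j + \sum_{l\in[n]} b_l x_l + C$, we have the following bound:

$$\left| \mathbb{E}_{x\sim\{\pm 1\}^n} \mathrm{sgn}(p(x)) - \mathbb{E}_{x\sim N^n(0,1)} \mathrm{sgn}(p(x)) \right| \le O\left( \left[ \sum_{i=1}^{n} \frac{\mathrm{Inf}_i^2(p)}{(\mathrm{Var}[p])^2} \right]^{1/9} \right),$$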

where the $i$th influence of $p$ is defined as $\mathrm{Inf}_i(p) = \mathbb{E}\left|\frac{\partial p}{\partial x_i}\right|^2 = 2\sum_{j\in[n]} a_{ij}^2 + b_i^2$. Think of the $i$th influence as the variance of $p$ along the $i$th coordinate.

Observe that $\mathrm{Var}[p] \le \sum_{i=1}^{n} \mathrm{Inf}_i(p) \le 2\,\mathrm{Var}[p]$. We now define the notion of regularity for polynomials, which essentially means that no single variable has an influence that is very large compared to the rest of the variables.

▶ Definition 7. We say that the polynomial $p$ is $\tau$-regular if $\max_{i\in[n]} \mathrm{Inf}_i(p) \le \tau\,\mathrm{Var}[p]$.

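Thus for a $\tau$-regular polynomial $p$ we can bound the replacement error above by $O(\tau^{1/9})$, because

$$\sum_{i=1}^{n} \frac{\mathrm{Inf}_i^2(p)}{(\mathrm{Var}[p])^2} \le \left[\max_i \mathrm{Inf}_i(p)\right] \sum_{i=1}^{n} \frac{\mathrm{Inf}_i(p)}{(\mathrm{Var}[p])^2} \le 2\tau.$$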
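As a sanity check on the influence bounds, the following Python sketch enumerates the Boolean cube for a small random polynomial and evaluates the variance and the influences directly from the definition $\mathbb{E}|\partial p/\partial x_i|^2$. The function name `variance_and_influences`, the matrix convention (symmetric `A` with zero diagonal) and the random instance are assumptions made for illustration only.

```python
import itertools
import numpy as np

def variance_and_influences(A, b, C=0.0):
    """Exact Var[p] and Inf_i(p) over the uniform cube {-1, +1}^n.

    p(x) = x^T A x + b^T x + C with A symmetric and zero on the diagonal
    (so p is multilinear). Inf_i is computed from the definition
    E|dp/dx_i|^2 by enumerating all 2^n points.
    """
    n = len(b)
    cube = np.array(list(itertools.product([-1.0, 1.0], repeat=n)))
    values = np.einsum("ki,ij,kj->k", cube, A, cube) + cube @ b + C
    grads = 2.0 * cube @ A + b  # row k holds the gradient of p at point k
    return values.var(), (grads ** 2).mean(axis=0)

rng = np.random.default_rng(0)
n = 6
A = rng.normal(size=(n, n))
A = (A + A.T) / 2.0
np.fill_diagonal(A, 0.0)
b = rng.normal(size=n)

var, inf = variance_and_influences(A, b)
tau = inf.max() / var                # smallest tau for which p is tau-regular
ratio = np.sum(inf ** 2) / var ** 2  # quantity inside the bound of Lemma 6
print(f"Var[p]={var:.4f}  sum Inf={inf.sum():.4f}  tau={tau:.4f}  ratio={ratio:.4f}")
```

On any such instance the printed values should satisfy $\mathrm{Var}[p] \le \sum_i \mathrm{Inf}_i(p) \le 2\,\mathrm{Var}[p]$ and `ratio <= 2 * tau`, matching the chain of inequalities above.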

Note that when we apply this, we pick $\tau = \epsilon^{O(1)}$.

Regularity Lemma

We will use the following Regularity Lemma from [6]:

▶ Lemma 8. Every multilinear degree-2 polynomial $p : \{\pm 1\}^n \to \mathbb{R}$ can be written as a decision tree of depth $D = \frac{1}{\tau} \cdot O\left(\log \frac{1}{\tau\theta}\right)^{O(1)}$ such that, with probability $(1-\theta)$ over a random leaf, the resulting polynomial $p_\alpha$ is either

(i) $\tau$-regular, or
(ii) $\mathrm{Var}(p_\alpha) < \theta \|p\|_2^2$.

Note that when we apply this Regularity Lemma we will choose $\theta = \frac{1}{m}$ and $\tau = \epsilon^{O(1)}$, so that $D = (\log m)^{O(1)}$. After all the parameters are fixed we finally pick $m = \frac{1}{\epsilon^{\Omega(1)}}$ large enough that all the errors are bounded by $O(\epsilon)$.

2.2 Eigenvalues of polynomials, Central Limit Theorem.

Eigenvalues

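Let $p : \mathbb{R}^n \to \mathbb{R}$ be a multilinear polynomial of degree 2. Thus there exist a real symmetric matrix $A$, a vector $B$ and a constant $C$ such that

$$p(x) = x^t A x + B^t x + C.$$

The eigenvalues of $p$ are defined to be the eigenvalues $\lambda_1, \dots, \lambda_n$ of the real symmetric matrix $A$. Since $p$ is a multilinear polynomial, the diagonal of $A$ is zero, so its trace vanishes and we have $\sum_{i=1}^{n} \lambda_i = 0$.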

We have the following expression for variance of the polynomial from [3]:

$$\mathrm{Var}[p] = \sum_{i=1}^{n} (b_i^2 + 2a_{ii}^2) + \sum_{1\le i<j\le n} a_{ij}^2.$$

The eigenvalues capture a lot of information about the polynomial. For instance if all the eigenvalues are small then the polynomial behaves like a single Gaussian. Let’s define this notion of regular polynomials.

▶ Definition 9. If all the eigenvalues of a polynomial $p$ are small relative to its variance, that is $|\lambda_{\max}(p)| \le \epsilon\sqrt{\mathrm{Var}[p]}$, then it is called an $\epsilon$-regular polynomial.
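To make the eigenvalue notion of regularity concrete, here is a small Python sketch that computes the eigenvalues of a random multilinear quadratic form and the smallest $\epsilon$ for which it is $\epsilon$-regular. The helper `eigen_regularity` is hypothetical, and the sketch assumes a symmetric zero-diagonal $A$ and standard Gaussian inputs, for which $\mathrm{Var}[x^t A x + B^t x] = 2\|A\|_F^2 + \|B\|_2^2$; this normalisation of the coefficients may differ from the one used in the variance expression above.

```python
import numpy as np

def eigen_regularity(A, b):
    """Return (eigenvalues, Var[p], smallest eps with |lambda_max| <= eps * sqrt(Var[p])).

    Assumes A is symmetric with zero diagonal and x ~ N(0, I), so that
    Var[x^T A x + b^T x] = 2 * ||A||_F^2 + ||b||_2^2 (an assumption of this sketch).
    """
    eigenvalues = np.linalg.eigvalsh(A)
    variance = 2.0 * np.sum(A ** 2) + np.sum(b ** 2)
    eps = np.max(np.abs(eigenvalues)) / np.sqrt(variance)
    return eigenvalues, variance, eps

rng = np.random.default_rng(1)
n = 50
A = rng.normal(size=(n, n))
A = (A + A.T) / 2.0
np.fill_diagonal(A, 0.0)  # multilinear: no x_i^2 terms
b = rng.normal(size=n)

eigenvalues, variance, eps = eigen_regularity(A, b)
print("sum of eigenvalues:", eigenvalues.sum())
print("p is eps-regular for every eps >=", eps)
```

The eigenvalue sum is zero up to rounding because the trace of $A$ is zero, and under the stated normalisation `eps` can never exceed $1/\sqrt{2}$ since $\lambda_{\max}^2 \le \|A\|_F^2$.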

Central Limit theorem

We need the following Central Limit Theorem from [3] (Lemma 31 in their paper).

It essentially says that if all the eigenvalues of a degree-2 polynomial $p$ are small, then the polynomial is well approximated by a single Gaussian with the same mean and variance. That is,

▶ Lemma 10. Let $p : \mathbb{R}^n \to \mathbb{R}$ be a degree-2 polynomial over independent standard Gaussians. If $|\lambda_{\max}(p)| \le \epsilon\sqrt{\mathrm{Var}[p]}$, then $p$ is $O(\epsilon)$-close to the Gaussian $N(\mathbb{E}[p], \mathrm{Var}[p])$ in total variation distance (hence also in Kolmogorov distance).

2.3 Definition of L and basic facts.

We define $L$ as follows: $L$ is determined by a hash function $h : [n] \to [m]$ and a sign function $\sigma : [n] \to \{\pm 1\}$ via

$$L(y)_i = \sigma(i)\, y_{h(i)}.$$

For each $i \in [n]$, $h(i)$ is uniformly random on $[m]$ and $\sigma(i)$ is uniformly random in $\{\pm 1\}$; $h$ and $\sigma$ are chosen from 8-wise independent families. Thus $L$ can be represented by an $n \times m$ matrix whose $i$th row is $c_i = \sigma(i) e_{h(i)}$, where $e_j$ is the $j$th standard basis vector of $\mathbb{R}^m$. It is depicted in the following figure:

Figure 1: Construction of $L$ (an $n \times m$ matrix whose $i$th row is $c_i$).
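The construction can be sketched in a few lines of Python. This is only an illustration: the function `sample_L` is hypothetical, and it draws $h$ and $\sigma$ with fully independent randomness as a stand-in for the 8-wise independent families used in the text.

```python
import numpy as np

def sample_L(n, m, rng):
    """Sample h: [n] -> [m] and sigma: [n] -> {-1, +1}; return (h, sigma, L).

    Fully independent randomness is used here in place of the
    8-wise independent families (an assumption of this sketch).
    """
    h = rng.integers(0, m, size=n)
    sigma = rng.choice([-1.0, 1.0], size=n)
    L = np.zeros((n, m))
    L[np.arange(n), h] = sigma  # row i is sigma(i) * e_{h(i)}
    return h, sigma, L

rng = np.random.default_rng(2)
n, m = 10, 4
h, sigma, L = sample_L(n, m, rng)

y = rng.normal(size=m)
x = L @ y  # x_i = sigma(i) * y_{h(i)}
print(L)
```

Each row of `L` has exactly one nonzero entry, so applying $L$ simply copies coordinates of $y$ into the $n$ output positions with random signs.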

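Note that the rows of $L$ satisfy the following properties:

$$\mathbb{E}_L[\langle c_i, c_j\rangle^1] = \mathbb{E}_L[\langle c_i, c_j\rangle^3] = \delta_{ij}, \qquad \mathbb{E}_L[\langle c_i, c_j\rangle^2] = \begin{cases} 1, & \text{if } i = j, \\ \frac{1}{m}, & \text{otherwise.} \end{cases}$$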
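These moments can be checked exactly for a pair of distinct rows, since $\langle c_i, c_j\rangle = \sigma(i)\sigma(j)\,I\{h(i) = h(j)\}$ depends only on the joint distribution of two hash values and two signs. The Python sketch below enumerates that distribution with exact rational arithmetic; the function name `row_moment` and the choice $m = 8$ are illustrative assumptions.

```python
import itertools
from fractions import Fraction

def row_moment(m, k):
    """Exact E[<c_i, c_j>^k] for i != j, with <c_i, c_j> = sigma(i) sigma(j) 1{h(i) = h(j)}."""
    total = Fraction(0)
    count = 0
    for hi, hj in itertools.product(range(m), repeat=2):
        for si, sj in itertools.product([-1, 1], repeat=2):
            inner = si * sj * (1 if hi == hj else 0)
            total += Fraction(inner) ** k
            count += 1
    return total / count

m = 8
moments = {k: row_moment(m, k) for k in (1, 2, 3)}
print(moments)
```

For $i \ne j$ the first and third moments vanish and the second moment equals $\frac{1}{m}$, in agreement with the displayed properties; for $i = j$ the inner product is identically 1.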

Note that this is a standard Johnson-Lindenstrauss matrix. In the following lemma we show that such matrices preserve $L_2$ norms and inner products of vectors, to give a feel for the kind of computations we need. In fact $L^t$ preserves a lot more structure, as we shall see in the next section.

▶ Lemma 11. For any $n$ and $\epsilon > 0$, there exists an $m = \mathrm{poly}\left(\frac{1}{\epsilon}\right)$ and an explicit family of linear transformations $L^t$ (with seed length $O(\log n)$ from $\{\pm 1\}^m \to \{\pm 1\}^n$) so that for any two unit vectors $v^1, v^2 \in \mathbb{R}^n$ we have

$$|\langle L^t v^1, L^t v^2\rangle - \langle v^1, v^2\rangle| < \epsilon \quad \text{with probability } 1 - 2\epsilon \text{ over } L.$$

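Proof. We know that $L^t v^1 = \sum_{i=1}^{n} v^1_i c_i$ and $L^t v^2 = \sum_{j=1}^{n} v^2_j c_j$. Thus we have

$$\begin{aligned}
\langle L^t v^1, L^t v^2\rangle &= \Big\langle \sum_{i=1}^{n} v^1_i c_i, \sum_{j=1}^{n} v^2_j c_j \Big\rangle = \sum_{i,j\in[n]} v^1_i v^2_j \langle c_i, c_j\rangle, \\
\langle L^t v^1, L^t v^2\rangle - \langle v^1, v^2\rangle &= \sum_{i\ne j\in[n]} v^1_i v^2_j \langle c_i, c_j\rangle, \\
\left(\langle L^t v^1, L^t v^2\rangle - \langle v^1, v^2\rangle\right)^2 &= \sum_{\substack{i_1\ne j_1 \\ i_2\ne j_2}} v^1_{i_1} v^1_{i_2} v^2_{j_1} v^2_{j_2} \langle c_{i_1}, c_{j_1}\rangle \langle c_{i_2}, c_{j_2}\rangle.
\end{aligned}$$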

Note that when averaged with respect to $\mathbb{E}_L$, the only terms that survive are those that are paired either as $(i_1 = i_2,\ j_1 = j_2)$ or as $(i_1 = j_2,\ i_2 = j_1)$. The rest of the terms average to 0 because of the signs $\sigma$: the expectation $\mathbb{E}_\sigma[\sigma(i_1)\sigma(i_2)\sigma(j_1)\sigma(j_2)]$ survives only if the indices are paired, and we already have the constraints $i_1 \ne j_1$, $i_2 \ne j_2$. Thus we have

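$$\begin{aligned}
\mathbb{E}_L\left(\langle L^t v^1, L^t v^2\rangle - \langle v^1, v^2\rangle\right)^2 &= \sum_{i\ne j} (v^1_i)^2 (v^2_j)^2\, \mathbb{E}_L\langle c_i, c_j\rangle^2 + \sum_{i\ne j} v^1_i v^2_i v^1_j v^2_j\, \mathbb{E}_L\langle c_i, c_j\rangle^2 \\
&= \frac{1}{m} \sum_{i\ne j} \left[(v^1_i)^2 (v^2_j)^2 + v^1_i v^2_i v^1_j v^2_j\right] \\
&\le \frac{1}{m}\left(\|v^1\|_2^2 \|v^2\|_2^2 + \langle v^1, v^2\rangle^2\right) \le \frac{2}{m}.
\end{aligned}$$

Thus, using Chebyshev's inequality, we have

$$|\langle L^t v^1, L^t v^2\rangle - \langle v^1, v^2\rangle| \le \frac{1}{m^{1/3}} \quad \text{with probability } 1 - \frac{2}{m^{1/3}} \text{ over } L.$$

Now we choose $m = \frac{1}{\epsilon^3}$ to have

$$|\langle L^t v^1, L^t v^2\rangle - \langle v^1, v^2\rangle| \le \epsilon \quad \text{with probability } 1 - 2\epsilon \text{ over } L.$$

This completes the proof. ◀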

To see that norms are preserved too, just choose $v^1 = v^2$ above.
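The second-moment bound in the proof of Lemma 11 can also be observed empirically. The Python sketch below is a Monte Carlo illustration under assumed parameters ($n = 30$, $m = 16$, 4000 trials) and, as a simplification, fully independent $h$ and $\sigma$ rather than 8-wise independent ones; the helper `sketch` is a hypothetical name.

```python
import numpy as np

def sketch(v, h, sigma, m):
    """Compute L^t v = sum_i v_i * sigma(i) * e_{h(i)} without forming L."""
    return np.bincount(h, weights=sigma * v, minlength=m)

rng = np.random.default_rng(3)
n, m, trials = 30, 16, 4000

v1 = rng.normal(size=n)
v1 /= np.linalg.norm(v1)
v2 = rng.normal(size=n)
v2 /= np.linalg.norm(v2)

sq_errors = np.empty(trials)
for t in range(trials):
    h = rng.integers(0, m, size=n)
    sigma = rng.choice([-1.0, 1.0], size=n)
    err = sketch(v1, h, sigma, m) @ sketch(v2, h, sigma, m) - v1 @ v2
    sq_errors[t] = err ** 2

# Closed form from the proof: (1/m) * sum_{i != j} [(v1_i v2_j)^2 + v1_i v2_i v1_j v2_j].
d = np.sum((v1 * v2) ** 2)
exact = (1.0 - d + (v1 @ v2) ** 2 - d) / m

print("empirical second moment:", sq_errors.mean())
print("closed form from proof :", exact)
print("upper bound 2/m        :", 2.0 / m)
```

The empirical mean should track the closed form and stay below $\frac{2}{m}$; Chebyshev's inequality then converts this second-moment bound into the probability statement of the lemma.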

Note

Throughout the paper we will compute such expected moments, bound them by $\frac{1}{m^{\Omega(1)}}$, and then use Markov's or Chebyshev's inequality (we cannot use high moments because $L$ has limited independence). Think of these errors as small: after all the parameters are fixed we pick $m = \frac{1}{\epsilon^{\Omega(1)}}$, a sufficiently large polynomial in $\frac{1}{\epsilon}$, to bound all the terms by $O(\epsilon)$. We showed the constants explicitly in the above lemma, but later on we will not compute them exactly and will just denote them by $O(1)$.

2.4 Technical Lemmas involving L

We show that the transformation $p \mapsto p_L$ does not change the variance by much. If $p(x) = x^t A x + B^t x + C$ then $p_L(y) = y^t (L^t A L) y + (B^t L) y + C$. Note that this is just a basic moment computation and does not involve anything nontrivial.

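▶ Lemma 12. If $p(x) = x^t A x + B^t x + C$ is a multilinear polynomial and $p_L(y) = y^t (L^t A L) y + (B^t L) y + C$, then

$$\mathbb{E}_L \mathrm{Var}[p_L] = \sum_{i=1}^{n} b_i^2 + \left(1 + \frac{3}{m}\right) \|A\|_F^2 = \mathrm{Var}[p] + \frac{3}{m} \|A\|_F^2.$$

Proof. We know that

$$\mathrm{Var}[p] = \sum_{i=1}^{n} (b_i^2 + a_{ii}^2) + \|A\|_F^2 = \sum_{i=1}^{n} b_i^2 + \|A\|_F^2.$$

Let's compute the same for $p_L$. Note that $L^t A L = \sum_{i,j\in[n]} a_{ij}\, c_i^t c_j$, where each row $c_i$ is treated as a row vector. Thus,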

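$$\begin{aligned}
\|L^t A L\|_F^2 &= \sum_{i_1,j_1,i_2,j_2\in[n]} a_{i_1,j_1} a_{i_2,j_2} \langle c_{i_1}, c_{i_2}\rangle \langle c_{j_1}, c_{j_2}\rangle \\
&= \sum_{\substack{i_1\ne j_1 \\ i_2\ne j_2}} \sigma(i_1)\sigma(i_2)\sigma(j_1)\sigma(j_2)\, a_{i_1,j_1} a_{i_2,j_2}\, I\{h(i_1) = h(i_2),\ h(j_1) = h(j_2)\}.
\end{aligned}$$

Let's take the expectation over $\sigma$. We know that $\mathbb{E}_\sigma[\sigma(i_1)\sigma(i_2)\sigma(j_1)\sigma(j_2)] \ne 0$ iff $(i_1, j_1) = (i_2, j_2)$ or $(i_1, j_1) = (j_2, i_2)$.

Let $T_1$ denote the terms of the first kind; then $T_1 = \sum_{i_1,j_1} a_{i_1,j_1}^2 = \|A\|_F^2$. Let $T_2$ denote the terms of the second kind; then $T_2 = \sum_{i_1,j_1} a_{i_1,j_1}^2\, I\{h(i_1) = h(j_1)\}$ and thus $\mathbb{E}_L[T_2] = \sum_{i_1,j_1} a_{i_1,j_1}^2 \frac{1}{m} = \frac{1}{m} \|A\|_F^2$. Also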

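$$\begin{aligned}
\langle B^t L, B^t L\rangle &= \sum_{i_1,i_2\in[n]} b_{i_1} b_{i_2} \langle c_{i_1}, c_{i_2}\rangle = \sum_{i_1,i_2} \sigma(i_1)\sigma(i_2)\, b_{i_1} b_{i_2}\, I\{h(i_1) = h(i_2)\}, \\
\mathbb{E}_\sigma \langle B^t L, B^t L\rangle &= \sum_{i} b_i^2.
\end{aligned}$$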

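We now compute $\sum_{l\in[m]} (L^t A L)_{ll}^2$, writing $c_i^l$ for the $l$th coordinate of $c_i$:

$$\begin{aligned}
\sum_{l\in[m]} (L^t A L)_{ll}^2 &= \sum_{l=1}^{m} \Big( \sum_{i,j\in[n]} a_{ij}\, c_i^l c_j^l \Big)^2 \\
&= \sum_{i_1,i_2,j_1,j_2} a_{i_1 j_1} a_{i_2 j_2} \sum_{l=1}^{m} c_{i_1}^l c_{i_2}^l c_{j_1}^l c_{j_2}^l \\
&= \sum_{\substack{i_1\ne j_1 \\ i_2\ne j_2}} \sigma(i_1)\sigma(i_2)\sigma(j_1)\sigma(j_2)\, a_{i_1,j_1} a_{i_2,j_2}\, I\{h(i_1) = h(i_2) = h(j_1) = h(j_2)\}.
\end{aligned}$$

Let's take the expectation over $\sigma$. We know that $\mathbb{E}_\sigma[\sigma(i_1)\sigma(i_2)\sigma(j_1)\sigma(j_2)] \ne 0$ iff $(i_1, j_1) = (i_2, j_2)$ or $(i_1, j_1) = (j_2, i_2)$, so

$$\mathbb{E}_\sigma \sum_{l\in[m]} (L^t A L)_{ll}^2 = 2 \sum_{i_1,j_1} a_{i_1 j_1}^2\, I\{h(i_1) = h(j_1)\}.$$

Thus,

$$\mathbb{E}_L \sum_{l\in[m]} (L^t A L)_{ll}^2 = \frac{2}{m} \|A\|_F^2. \qquad ◀$$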
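For very small $n$ and $m$ the two expectations computed in this proof can be verified exactly by enumerating every pair $(h, \sigma)$. The following Python sketch does this for an assumed instance with $n = 4$ and $m = 3$; the helper names are hypothetical, and the uniform distribution over all $(h, \sigma)$ is a fully independent special case of the limited-independence families in the text.

```python
import itertools
import numpy as np

def sketch_matrix(h, sigma, m):
    """Build the n x m matrix whose i-th row is sigma(i) * e_{h(i)}."""
    n = len(h)
    L = np.zeros((n, m))
    L[np.arange(n), h] = sigma
    return L

def exact_expectations(A, m):
    """Average ||L^t A L||_F^2 and sum_l (L^t A L)_{ll}^2 over every (h, sigma)."""
    n = A.shape[0]
    frob, diag = [], []
    for h in itertools.product(range(m), repeat=n):
        for sigma in itertools.product([-1.0, 1.0], repeat=n):
            L = sketch_matrix(np.array(h), np.array(sigma), m)
            M = L.T @ A @ L
            frob.append(np.sum(M ** 2))
            diag.append(np.sum(np.diag(M) ** 2))
    return np.mean(frob), np.mean(diag)

rng = np.random.default_rng(5)
n, m = 4, 3
A = rng.normal(size=(n, n))
A = (A + A.T) / 2.0
np.fill_diagonal(A, 0.0)  # multilinear: zero diagonal

fro2 = np.sum(A ** 2)
frob_mean, diag_mean = exact_expectations(A, m)
print("E||L^t A L||_F^2    :", frob_mean, "vs", (1 + 1 / m) * fro2)
print("E sum_l (L^tAL)_ll^2:", diag_mean, "vs", (2 / m) * fro2)
```

The two averages agree with $T_1 + \mathbb{E}_L[T_2] = \left(1 + \frac{1}{m}\right)\|A\|_F^2$ and with $\frac{2}{m}\|A\|_F^2$, which together account for the factor $\left(1 + \frac{3}{m}\right)$ in the lemma.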

In the following lemma we prove bounds on $\mathrm{Var}_L[\mathrm{Var}_y[p_L]]$. This will help us show that $\mathrm{Var}_y[p_L] = \Theta(\mathrm{Var}[p])$ whp.

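▶ Lemma 13.

$$\mathrm{Var}_L[\mathrm{Var}_y[p_L]] = \frac{O(1)}{m}.$$

Proof. From Lemma 12 we have

$$\begin{aligned}
\mathrm{Var}_y[p_L] &= \|L^t A L\|_F^2 + \|B^t L\|_2^2 + \sum_{l=1}^{m} (L^t A L)_{ll}^2 \\
&= \sum_{i_1,i_2,j_1,j_2} a_{i_1 j_1} a_{i_2 j_2} \langle c_{i_1}, c_{i_2}\rangle \langle c_{j_1}, c_{j_2}\rangle + \sum_{r_1,r_2} b_{r_1} b_{r_2} \langle c_{r_1}, c_{r_2}\rangle + \sum_{l=1}^{m} (L^t A L)_{ll}^2,
\end{aligned}$$

where

$$\sum_{l\in[m]} (L^t A L)_{ll}^2 = \sum_{\substack{i_1\ne j_1 \\ i_2\ne j_2}} \sigma(i_1)\sigma(i_2)\sigma(j_1)\sigma(j_2)\, a_{i_1,j_1} a_{i_2,j_2}\, I\{h(i_1) = h(i_2) = h(j_1) = h(j_2)\}.$$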

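Thus we have

$$\begin{aligned}
\mathrm{Var}_y[p_L] - \mathbb{E}_L[\mathrm{Var}_y[p_L]] ={}& \sum_{(i_1,j_1)\ne(i_2,j_2)} a_{i_1 j_1} a_{i_2 j_2} \langle c_{i_1}, c_{i_2}\rangle \langle c_{j_1}, c_{j_2}\rangle + \sum_{r_1\ne r_2} b_{r_1} b_{r_2} \langle c_{r_1}, c_{r_2}\rangle \\
&+ \sum_{l=1}^{m} (L^t A L)_{ll}^2 - \frac{3}{m} \|A\|_F^2.
\end{aligned}$$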

We skip the elaborate yet simple moment calculations, but observe that when squared and averaged over $L$, each term above carries at least a factor of $\frac{1}{m}$. Also, the corresponding coefficients can be bounded using Cauchy-Schwarz and noting that $\|B\|_2^2 \le 1$ and $\|A\|_F^2 \le 1$.

Thus

$$\mathbb{E}_L \left( \mathrm{Var}_y[p_L] - \mathbb{E}_L[\mathrm{Var}_y[p_L]] \right)^2 = O\left( \frac{\mathrm{Var}^2[p]}{m} \right). \qquad ◀$$

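Now we put these two lemmas together to show that $\mathrm{Var}_y[p_L] = \Theta(\mathrm{Var}[p])$ whp. We omit the proof, as it is a direct consequence of Chebyshev's inequality using Lemma 12 and Lemma 13.

▶ Lemma 14.

$$|\mathrm{Var}_y[p_L] - \mathrm{Var}[p]| \le O\left( \frac{\mathrm{Var}[p]}{m^{1/3}} \right) \quad \text{with probability } 1 - \frac{1}{m^{1/3}} \text{ over } L.$$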

The following lemma will also be useful. Intuitively, it means that $L$ does not perturb an eigenvalue of $A$ by a large amount whp. In fact this implies that all the eigenvalues of $A$ lie in the pseudospectrum of $L^t A L$.

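▶ Lemma 15. Let $\lambda$ be an eigenvalue of $A$ and let the unit vector $v$ be the corresponding eigenvector. Then we have

$$\mathbb{E}_L \left\| (L^t A L - \lambda I_{m\times m}) L^t v \right\|_2^2 = O\left( \frac{1}{m} \right).$$

Proof. Substituting $Av = \lambda v$, we have

$$(L^t A L - \lambda I_{m\times m}) L^t v = L^t A L L^t v - L^t A v.$$

Expanding the product $L^t A L L^t v$ we have

$$\begin{aligned}
L^t A L L^t v &= \sum_{i,j,k\in[n]} a_{i,j}\, c_i \langle c_j, c_k\rangle v_k, \\
L^t A L L^t v - L^t A v &= \sum_{\substack{i \\ j\ne k}} a_{i,j}\, c_i \langle c_j, c_k\rangle v_k.
\end{aligned}$$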

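Thus

$$\left\| L^t A L L^t v - L^t A v \right\|_2^2 = \sum_{\substack{i_1, i_2 \\ j_1\ne k_1 \\ j_2\ne k_2}} a_{i_1,j_1} a_{i_2,j_2} v_{k_1} v_{k_2} \langle c_{i_1}, c_{i_2}\rangle \langle c_{j_1}, c_{k_1}\rangle \langle c_{j_2}, c_{k_2}\rangle.$$

A term survives $\mathbb{E}_\sigma$ only if all the indices $\{i_1, i_2, j_1, j_2, k_1, k_2\}$ are paired appropriately. However, when we take $\mathbb{E}_h$, since $j_1 \ne k_1$ and $j_2 \ne k_2$, every term carries at least a factor of $\frac{1}{m}$. The corresponding coefficient can be bounded using Cauchy-Schwarz and noting that $\|v\|_2^2 = 1$ and $\|A\|_F^2 \le 1$. Thus we have

$$\mathbb{E}_L \left\| (L^t A L - \lambda I_{m\times m}) L^t v \right\|_2^2 = \frac{O(1)}{m}. \qquad ◀$$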
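The $\frac{1}{m}$ decay in Lemma 15 can be illustrated numerically. The Python sketch below estimates the expected squared residual by Monte Carlo for a random zero-diagonal symmetric $A$ normalised so that $\|A\|_F = 1$, using its top eigenpair. The function `mean_sq_residual`, the instance size and the trial count are assumptions of this illustration, and fully independent $h$, $\sigma$ are used in place of the limited-independence families.

```python
import numpy as np

def mean_sq_residual(A, lam, v, m, trials, rng):
    """Monte Carlo estimate of E_L ||(L^t A L - lam * I) L^t v||_2^2."""
    n = A.shape[0]
    total = 0.0
    for _ in range(trials):
        h = rng.integers(0, m, size=n)
        sigma = rng.choice([-1.0, 1.0], size=n)
        L = np.zeros((n, m))
        L[np.arange(n), h] = sigma
        w = L.T @ v                        # L^t v
        r = L.T @ (A @ (L @ w)) - lam * w  # (L^t A L - lam I) L^t v
        total += r @ r
    return total / trials

rng = np.random.default_rng(4)
n = 20
A = rng.normal(size=(n, n))
A = (A + A.T) / 2.0
np.fill_diagonal(A, 0.0)
A /= np.linalg.norm(A)  # Frobenius norm 1

eigenvalues, eigenvectors = np.linalg.eigh(A)
lam, v = eigenvalues[-1], eigenvectors[:, -1]

results = {m: mean_sq_residual(A, lam, v, m, trials=500, rng=rng) for m in (8, 64, 512)}
for m, value in results.items():
    print(f"m={m:4d}  mean squared residual={value:.5f}  m * residual={m * value:.3f}")
```

If the bound holds with a modest constant, the last column `m * residual` should stay roughly bounded as $m$ grows, which is the sense in which $L^t v$ is an approximate eigenvector of $L^t A L$.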