
3 The regular case

3.1 Gaussian PRG: $\big|\,\mathbb{E}_{x\sim N^n(0,1)}\,\mathrm{sgn}(p(x)) - \mathbb{E}_{y\sim N^m(0,1)}\,\mathrm{sgn}(p_L(y))\,\big|$

In this section we show that in the Gaussian setting $p$ cannot distinguish between $x$ and $Ly$. The main idea is that to understand the average sign of a degree-2 polynomial one only needs to keep track of the top few eigenvalues and the total mass in the rest of the eigenvalues.

This is because either the remaining eigenvalues contribute very little mass to the total polynomial, in which case the truncated part can be ignored, or these eigenvalues are individually small but together contribute a significant fraction of the total mass (we call this part the eigenregular part), in which case all of them can be replaced by a single Gaussian with the same total mass via the CLT tools used in [3].

Thus let us think of the polynomial $p$ as the top few eigenvalues together with a lump of mass coming from the rest of the eigenvalues. The Johnson–Lindenstrauss-like matrix $L$ that we use preserves the top eigenvalue structure of the polynomial and also keeps the eigenregular part eigenregular.

It introduces some negligible dependence between the top eigenvalue part and the eigenregular part, which we remove at the outset so as to keep the two parts independent.

To begin with, let $p(x) = x^t A x + B^t x + C$ be a degree-2 multilinear polynomial with $|A|_F = 1$. Since $A$ is a real symmetric matrix, diagonalize it as $A = V \Lambda V^t$, where $V$ is an orthonormal matrix whose columns are the eigenvectors of $A$. Let the eigenvalues of $A$ be $|\lambda_1| \ge |\lambda_2| \ge \dots \ge |\lambda_n|$. Now let $k+1$ be the first index with $|\lambda_{k+1}| < \delta$, where we will choose $\delta = \epsilon^{O(1)}$ later on. Since $\sum_{i \in [n]} \lambda_i^2 = 1$, we know that $k \le 1/\delta^2 = (1/\epsilon)^{O(1)} \ll m$. Let $V_k$ denote the matrix of the first $k$ eigenvectors of $V$ and $\Lambda_k$ the top $k \times k$ diagonal submatrix of $\Lambda$ containing the top $k$ eigenvalues of $A$.

▶ Definition 16. Define $A_1 = V_k \Lambda_k V_k^t$ to be the top eigenpart of $A$ and $A_2 = V_{>k} \Lambda_{>k} V_{>k}^t$ to be the lower eigenpart of $A$; we have $A = A_1 + A_2$.

Accordingly, decompose $p(x) = q_1(x) + r_1(x)$ where
$$q_1(x) = x^t A_1 x + B^t V_k V_k^t x + C,$$
$$r_1(x) = x^t A_2 x + B^t V_{>k} V_{>k}^t x.$$
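As a concreteness check, here is a minimal numerical sketch of Definition 16 and the split above. Everything in it (the planted spectrum, the values of $n$ and $\delta$, and the random choices of $A$, $B$, $C$) is an illustrative assumption of ours, not the paper's construction; it only confirms that the split is exact and that $k$ stays small. Later sketches continue from this one and assume its variables are still in scope.

```python
# Sketch of Definition 16: split A into a top eigenpart A1 and a lower
# eigenpart A2, and p into q1 + r1. Parameters are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
n, delta = 400, 0.3

# Symmetric A with |A|_F = 1: two planted eigenvalues above delta, small bulk.
lam0 = np.concatenate(([0.8, -0.5], 0.02 * rng.standard_normal(n - 2)))
lam0 /= np.linalg.norm(lam0)
Qmat, _ = np.linalg.qr(rng.standard_normal((n, n)))   # random orthogonal basis
A = Qmat @ np.diag(lam0) @ Qmat.T
B = rng.standard_normal(n)
B /= np.linalg.norm(B)                                # unit norm, for convenience
C = 0.1

lam, V = np.linalg.eigh(A)                  # A = V diag(lam) V^t
order = np.argsort(-np.abs(lam))            # |lam_1| >= |lam_2| >= ... >= |lam_n|
lam, V = lam[order], V[:, order]
k = int(np.argmax(np.abs(lam) < delta))     # k+1 = first index with |lam| < delta
Vk, Vrest = V[:, :k], V[:, k:]

A1 = Vk @ np.diag(lam[:k]) @ Vk.T           # top eigenpart
A2 = Vrest @ np.diag(lam[k:]) @ Vrest.T     # lower eigenpart
assert np.allclose(A, A1 + A2)

q1 = lambda x: x @ A1 @ x + (B @ Vk) @ (Vk.T @ x) + C
r1 = lambda x: x @ A2 @ x + (B @ Vrest) @ (Vrest.T @ x)

x = rng.standard_normal(n)
p_x = x @ A @ x + B @ x + C
print(k, abs(p_x - (q1(x) + r1(x))))        # k is small; the split is exact
```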

Note that $q_1(x)$ and $r_1(x)$ are independent of each other because the columns of $V_k$ are orthogonal to the columns of $V_{>k}$. In the following lemma we replace $r_1(x)$ by a single Gaussian with the same mass, thereby ignoring the finer structure of $r_1(x)$. Let $z$ be a one-dimensional Gaussian independent of $x$.

▶ Lemma 17. Given $\epsilon > 0$, let $\delta$ be a sufficiently large power of $\epsilon$, $\delta = \epsilon^{O(1)}$. If $p(x)$ can be written as a sum of two independent polynomials, that is, $p(x) = q_1(x) + r_1(x)$ where $|\lambda_{\max}(r_1)| < \delta$, then
$$\Big|\,\mathbb{E}_{x \sim N^n(0,1)}\,\mathrm{sgn}(p(x)) - \mathbb{E}_{\substack{x \sim N^n(0,1)\\ z \sim N(0,1)}}\,\mathrm{sgn}\big(q_1(x) + \sqrt{\mathrm{Var}[r_1]}\, z\big)\,\Big| < O(\epsilon).$$

Proof. We consider two cases.

Case I: $r_1$ has very small variance, that is, $\sqrt{\mathrm{Var}[r_1(x)]} < \delta$. Then we can use Lemma 5 to see that replacing $r_1(x)$ by $\sqrt{\mathrm{Var}[r_1]}\, z$ incurs an error of at most $O(\delta^{1/3})$. By an appropriate choice of $\delta = \epsilon^{O(1)}$, made later on, this error is $O(\epsilon)$.

Case II: $\sqrt{\mathrm{Var}[r_1(x)]} > \delta$. Then every eigenvalue $\lambda$ of $r_1(x)$ satisfies $|\lambda| < \delta \le \sqrt{\mathrm{Var}[r_1]}$. A polynomial all of whose eigenvalues are small compared to its variance is called eigenregular, and we can use Lemma 10 to replace $r_1(x)$ by $\sqrt{\mathrm{Var}[r_1]}\, z$ and incur an error of at most $O(\epsilon)$. Note that we are using the independence of $q_1(x)$ and $r_1(x)$ in a convolution argument used to insert $q_1$ after applying the CLT.

Thus in either case the lemma holds after an appropriate choice of $\delta = \epsilon^{O(1)}$. ◀
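A Monte Carlo illustration of the replacement step in Lemma 17, continuing the sketch above (so n, A1, A2, Vk, Vrest, B, C, rng are still in scope). We substitute the closed form $\mathrm{Var}[x^t M x + b^t x] = 2|M|_F^2 + |b|_2^2$ for Gaussian $x$; the rigorous error bounds come from the paper's Lemmas 5 and 10, which are not reproduced here.

```python
# Replace the eigenregular part r1 by a single Gaussian of matching variance
# and compare average signs (continues the previous sketch).
import numpy as np

N = 20_000
X = rng.standard_normal((N, n))                       # rows are samples of x
p_vals  = ((X @ (A1 + A2)) * X).sum(axis=1) + X @ B + C
q1_vals = ((X @ A1) * X).sum(axis=1) + X @ (Vk @ (Vk.T @ B)) + C

# Var[r1] in closed form: quadratic part 2|A2|_F^2, linear part |Vrest^t B|^2.
var_r1 = 2 * np.linalg.norm(A2) ** 2 + np.linalg.norm(Vrest.T @ B) ** 2
z = rng.standard_normal(N)

lhs = np.mean(np.sign(p_vals))
rhs = np.mean(np.sign(q1_vals + np.sqrt(var_r1) * z))
print(lhs, rhs, abs(lhs - rhs))    # agree up to O(eps) plus Monte Carlo noise
```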

To keep the presentation simple, we henceforth assume that $L^t V_k$ has exactly orthonormal columns, that is, $V_k^t L L^t V_k = I_{k \times k}$. The exact computation proceeds by first using the Gram–Schmidt process to orthonormalize $\{L^t v_1, \dots, L^t v_k\}$. However, this would not be very different from the exact analysis, because $L$ approximately preserves inner products and norms whp and we can union bound since $k$ is a small constant depending on $\epsilon$. In particular, we have the following lemma.

▶ Lemma 18. $\mathbb{E}_L \big|V_k^t L L^t V_k - I_{k\times k}\big|_F^2 = O\!\big(k^2/m\big)$.

Proof. This is a straightforward computation. Replacing $I_{k\times k} = V_k^t V_k$, we have $V_k^t L L^t V_k - I_{k\times k} = V_k^t (L L^t - I_{n\times n}) V_k$. Writing $c_i$ for the $i$-th row of $L$ (so that the off-diagonal entries of $LL^t$ are $\langle c_{i_1}, c_{i_2}\rangle$), this gives
$$\big|V_k^t (L L^t - I_{n\times n}) V_k\big|_F^2 = \sum_{a,b\in[k]} \Big(\sum_{i_1\ne i_2} v^a_{i_1} v^b_{i_2} \langle c_{i_1}, c_{i_2}\rangle\Big)^2 = \sum_{a,b\in[k]} \sum_{\substack{i_1\ne i_2\\ i_3\ne i_4}} v^a_{i_1} v^b_{i_2} v^a_{i_3} v^b_{i_4} \langle c_{i_1}, c_{i_2}\rangle \langle c_{i_3}, c_{i_4}\rangle.$$
Taking expectations over $L$, this gives $\mathbb{E}_L \big|V_k^t (L L^t - I_{n\times n}) V_k\big|_F^2 = O\!\big(k^2/m\big)$. ◀
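A numerical look at Lemma 18, continuing the same setup. We take $L$ to be an $n \times m$ matrix of i.i.d. $N(0, 1/m)$ entries; that particular distribution is our assumption for illustration (the paper only needs $L$ to be JL-like).

```python
# E_L |Vk^t L L^t Vk - I|_F^2 should scale like k^2/m (continues the sketch).
import numpy as np

k = Vk.shape[1]
for m in (50, 100, 200):
    errs = []
    for _ in range(50):
        L = rng.standard_normal((n, m)) / np.sqrt(m)
        G = Vk.T @ L @ (L.T @ Vk)              # V_k^t L L^t V_k, a k x k matrix
        errs.append(np.linalg.norm(G - np.eye(k)) ** 2)
    print(m, np.mean(errs), "vs k^2/m =", k * k / m)
```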

Let $y \sim N^m(0,1)$ be a Gaussian independent of $x, z$. Since the Gaussian distribution is invariant under rotations and $V_k^t L$ has orthonormal rows by our assumption, $V_k^t x \sim N^k(0,1)$ and $[V_k^t L]\, y \sim N^k(0,1)$ are identically distributed. Thus $q_1(x) = [x^t V_k]\, \Lambda_k\, [V_k^t x] + B^t V_k [V_k^t x] + C$ is identically distributed as $[y^t L^t V_k]\, \Lambda_k\, [V_k^t L y] + B^t V_k [V_k^t L y] + C$, which is exactly $q_1(Ly)$.

Thus we have
$$\Big|\,\mathbb{E}_{x \sim N^n(0,1)}\,\mathrm{sgn}(p(x)) - \mathbb{E}_{\substack{y \sim N^m(0,1)\\ z \sim N(0,1)}}\,\mathrm{sgn}\big(q_1(Ly) + \sqrt{\mathrm{Var}[r_1]}\, z\big)\,\Big| < O(\epsilon).$$

Let us look at $p(Ly)$. We have
$$p(Ly) = y^t L^t A L y + B^t L y + C = y^t [L^t V]\, \Lambda\, [V^t L] y + B^t L y + C.$$

Let $P$ denote the projection matrix onto the vector space spanned by $L^t v_1, \dots, L^t v_k$. It can be expressed as the $m \times m$ matrix $P \stackrel{\mathrm{def}}{=} L^t V_k (V_k^t L L^t V_k)^{-1} V_k^t L$. Note that $P^2 = P$ and $P^t = P$. Since $V_k^t L L^t V_k = I_{k\times k}$, this simplifies to $P = L^t V_k V_k^t L$.
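A small check of the projection matrix and its simplification, continuing the setup (Vk, n, rng in scope); the choice $m = 200$ and the i.i.d. Gaussian $L$ are, as before, our illustrative assumptions.

```python
# P = L^t Vk (Vk^t L L^t Vk)^{-1} Vk^t L is an exact projection; under the
# idealized assumption Vk^t L L^t Vk = I it collapses to L^t Vk Vk^t L.
import numpy as np

m = 200
L = rng.standard_normal((n, m)) / np.sqrt(m)
LtVk = L.T @ Vk                                        # m x k
P_exact  = LtVk @ np.linalg.inv(LtVk.T @ LtVk) @ LtVk.T
P_approx = LtVk @ LtVk.T

print(np.linalg.norm(P_exact @ P_exact - P_exact))     # P^2 = P (fp error)
print(np.linalg.norm(P_exact - P_exact.T))             # P^t = P
print(np.linalg.norm(P_exact - P_approx))              # small once m >> k^2
```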

Now, as before, we break $p(Ly)$ into two pieces $p(Ly) = q_2(y) + r_2(y)$, where
$$q_2(y) = y^t L^t A_1 L y + B^t L P y + C,$$
$$r_2(y) = y^t L^t A_2 L y + B^t L [I-P] y.$$

The goal is to carry out a similar CLT-like analysis, but the problem is that $q_2(y)$ and $r_2(y)$ are not independent. We refine $r_2(y)$ to $r_3(y)$ to make it independent of $q_2(y)$ by separating out the part of it that correlates with $q_2(y)$. That is, define
$$r_3(y) = y^t [I-P] L^t A_2 L [I-P] y + B^t L [I-P] y,$$
$$s(y) = y^t P L^t A_2 L [I-P] y + y^t L^t A_2 L P y.$$

Observe that $r_3(y)$ is independent of $q_2(y)$, since $q_2$ depends only on $Py$ while $r_3$ depends only on $[I-P]y$. We have $p(Ly) = q_2(y) + r_3(y) + s(y)$. First let us get rid of $s(y)$ by showing that $\mathrm{Var}[s]$ is small whp over $L$ and invoking Lemma 5.
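The decomposition $p(Ly) = q_2(y) + r_3(y) + s(y)$ is in fact an exact algebraic identity for any matrix $P$ (with $I-P$ defined accordingly), which is easy to confirm numerically, continuing the sketch above (m, L, LtVk are in scope from the previous sketch; the idealized $P = L^t V_k V_k^t L$ is used).

```python
# Verify p(Ly) = q2(y) + r3(y) + s(y) exactly (continues the previous sketch).
import numpy as np

P = LtVk @ LtVk.T                     # idealized projection from the text
IP = np.eye(m) - P
M1, M2 = L.T @ A1 @ L, L.T @ A2 @ L

q2 = lambda y: y @ M1 @ y + (B @ L @ P) @ y + C
r3 = lambda y: y @ (IP @ M2 @ IP) @ y + (B @ L @ IP) @ y
s  = lambda y: y @ (P @ M2 @ IP) @ y + y @ (M2 @ P) @ y

y = rng.standard_normal(m)
pLy = (L @ y) @ A @ (L @ y) + B @ (L @ y) + C
print(abs(pLy - (q2(y) + r3(y) + s(y))))   # zero up to floating-point error
```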

▶ Lemma 19. $\mathrm{Var}[s] = O\!\big(1/m^{\Omega(1)}\big)$ with probability $1 - O\!\big(1/m^{\Omega(1)}\big)$ over $L$.

Proof. It suffices to show that $|L^t A_2 L P|_F$ is small. Since $P$ is a projection matrix we have
$$|L^t A_2 L P|_F^2 = \mathrm{Tr}\big[L^t A_2 L P L^t A_2 L\big] = |L^t A_2 L L^t V_k|_F^2.$$
Since $A = A_1 + A_2$, we have
$$L^t A_2 L = L^t A L - L^t A_1 L = L^t A L - L^t V_k \Lambda_k V_k^t L.$$
Thus
$$L^t A_2 L L^t V_k = (L^t A L) L^t V_k - L^t V_k \Lambda_k \underbrace{V_k^t L L^t V_k}_{I_{k\times k}} = (L^t A L) L^t V_k - L^t V_k \Lambda_k,$$
and hence
$$|L^t A_2 L L^t V_k|_F^2 = \sum_{l=1}^{k} \big|(L^t A L) L^t v_l - \lambda_l L^t v_l\big|_2^2.$$
Now we can use Lemma 15 to bound this, giving $\mathbb{E}_L |L^t A_2 L L^t V_k|_F^2 = O(k/m)$. The lemma now follows by Markov's inequality and by noting that $\mathrm{Var}[s] = O\big(|L^t A_2 L P|_F^2\big)$. ◀
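A numerical look at Lemma 19, continuing the setup. For Gaussian $y$, $\mathrm{Var}[y^t S y] = 2|(S+S^t)/2|_F^2$, so we can evaluate $\mathrm{Var}[s]$ in closed form; the range of $m$ and the Gaussian choice of $L$ are, as before, illustrative.

```python
# Var[s] shrinks as m grows (continues the sketch; exact projection P used).
import numpy as np

for m2 in (50, 100, 200):
    vals = []
    for _ in range(10):
        L2 = rng.standard_normal((n, m2)) / np.sqrt(m2)
        W = L2.T @ Vk
        P2 = W @ np.linalg.inv(W.T @ W) @ W.T          # exact projection
        M2b = L2.T @ A2 @ L2
        S = P2 @ M2b @ (np.eye(m2) - P2) + M2b @ P2    # s(y) = y^t S y
        Ssym = (S + S.T) / 2
        vals.append(2 * np.linalg.norm(Ssym) ** 2)     # Var[y^t S y]
    print(m2, np.mean(vals))                           # decays with m
```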

Now we can apply Lemma 5 to remove $s$. That is,
$$\Big|\,\mathbb{E}_{y \sim N^m(0,1)}\,\mathrm{sgn}(q_2(y) + r_3(y)) - \mathbb{E}_{y \sim N^m(0,1)}\,\mathrm{sgn}(p_L(y))\,\Big| \le O\!\big(1/m^{\Omega(1)}\big)$$
with probability $1 - O\!\big(1/m^{\Omega(1)}\big)$ over $L$.

Now that $q_2(y)$ and $r_3(y)$ are independent, to proceed with the CLT-like analysis we first show that the largest eigenvalue of $r_3(y)$ is at most $\sqrt{\delta}$.

δ.

▶ Lemma 20. $\lambda_{\max}[r_3(y)] \le \sqrt{\delta}$ whp.

Proof. We want to show that the eigenvalues of $[I-P](L^t A_2 L)[I-P]$ are small. Its eigenvalues interlace those of $L^t A_2 L$ because $[I-P]$ is a projection matrix. Thus it suffices to bound the eigenvalues of $L^t A_2 L$, where $A_2 = V_{>k} \Lambda_{>k} V_{>k}^t$. Note that $A_2$ is a symmetric matrix with spectrum $0^k, \lambda_{k+1}, \lambda_{k+2}, \dots, \lambda_n$. To bound the eigenvalues of $L^t A_2 L$ we bound $\mathrm{Tr}(L^t A_2 L)^4 = |(L^t A_2 L)^2|_F^2$. We have
$$|(L^t A_2 L)^2|_F^2 = \sum_{j_1 \cdots j_8 \in [n]} (A_2)_{j_1 j_2} (A_2)_{j_3 j_4} (A_2)_{j_5 j_6} (A_2)_{j_7 j_8} \langle c_{j_1}, c_{j_5}\rangle \langle c_{j_2}, c_{j_3}\rangle \langle c_{j_4}, c_{j_8}\rangle \langle c_{j_6}, c_{j_7}\rangle,$$
$$\mathbb{E}_L |(L^t A_2 L)^2|_F^2 = \sum_{j_1, j_2, j_4, j_6 \in [n]} (A_2)_{j_1 j_2} (A_2)_{j_2 j_4} (A_2)_{j_1 j_6} (A_2)_{j_6 j_4} + \frac{O(1)}{m} = \mathrm{Tr}(A_2^4) + \frac{O(1)}{m} \le \delta^2 + \frac{O(1)}{m}.$$
This shows that the maximum absolute eigenvalue of $r_3(y)$ is at most $O(\sqrt{\delta})$ whp, which lets us either remove it as a low-variance term or apply the CLT machinery to $r_3(y)$. ◀
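A check of Lemma 20's trace argument, continuing the setup: the fourth moment of the spectrum of $L^t A_2 L$ stays near $\mathrm{Tr}(A_2^4) \le \delta^2$, so the top eigenvalue sits well below $\sqrt{\delta}$. Parameters are illustrative, as before.

```python
# Spectrum of L^t A2 L via Tr((L^t A2 L)^4) (continues the previous sketches).
import numpy as np

M2c = L.T @ A2 @ L                                  # m x m, symmetric
ev = np.linalg.eigvalsh((M2c + M2c.T) / 2)          # symmetrize fp noise
tr4 = np.trace(np.linalg.matrix_power(M2c, 4))
print(np.max(np.abs(ev)), tr4 ** 0.25, np.sqrt(delta))  # lam_max <= Tr^(1/4) < sqrt(delta)
```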

Now that $q_2$ and $r_3$ are independent polynomials, and since Lemma 20 gives $\lambda_{\max}[r_3(y)] \le \sqrt{\delta}$, we can use a slight variant of Lemma 17 to bound the following error:

$$\Big|\,\mathbb{E}_{\substack{y \sim N^m(0,1)\\ z \sim N(0,1)}}\,\mathrm{sgn}\big(q_2(y) + \sqrt{\mathrm{Var}[r_3]}\, z\big) - \mathbb{E}_{y \sim N^m(0,1)}\,\mathrm{sgn}(q_2(y) + r_3(y))\,\Big| < O(\epsilon).$$

We now bound the remaining term that finishes the telescoping for the Gaussian PRG part.

▶ Lemma 21.
$$\Big|\,\mathbb{E}_{\substack{y \sim N^m(0,1)\\ z \sim N(0,1)}}\,\mathrm{sgn}\big(q_1(Ly) + \sqrt{\mathrm{Var}[r_1]}\, z\big) - \mathbb{E}_{\substack{y \sim N^m(0,1)\\ z \sim N(0,1)}}\,\mathrm{sgn}\big(q_2(y) + \sqrt{\mathrm{Var}[r_3]}\, z\big)\,\Big| \le O(\epsilon) \quad \text{whp.}$$

Proof. Since $y$ and $z$ are independent, it suffices to show that $\mathrm{Var}_y[q_1(Ly) - q_2(y)]$ and $|\mathrm{Var}[r_1] - \mathrm{Var}[r_3]|$ are both small and invoke Lemma 5.

We have
$$q_2(y) - q_1(Ly) = B^t [L L^t - I]\, V_k V_k^t L\, y,$$
and hence
$$\mathrm{Var}\big[q_2(y) - q_1(Ly)\big] = \big|B^t [L L^t - I]\, V_k V_k^t L\big|_2^2.$$
Since $L^t V_k$ has orthonormal columns, this simplifies further to
$$\mathrm{Var}\big[q_2(y) - q_1(Ly)\big] = \big|B^t [L L^t - I]\, V_k\big|_2^2 = \sum_{l=1}^{k} \Big[\sum_{j_1 \ne j_2 \in [n]} \langle c_{j_1}, c_{j_2}\rangle\, b_{j_1} v^l_{j_2}\Big]^2.$$
Thus
$$\mathbb{E}_L\, \mathrm{Var}\big[q_2(y) - q_1(Ly)\big] = O\Big(\frac{k\,|B|_2^2}{m}\Big).$$

To see that $\mathrm{Var}[r_1] \approx \mathrm{Var}[r_3]$, note that $\mathrm{Var}[r_3] \approx \mathrm{Var}[r_2]$ because $\mathrm{Var}[s(y)]$ is small, as shown above. Now to show that $\mathrm{Var}[r_1] \approx \mathrm{Var}[r_2]$ we need to show the following:

$|A_2|_F \approx |L^t A_2 L|_F$. This follows from Lemma 12.

$|B^t V_{>k} V_{>k}^t|_2 \approx |B^t L [I-P]|_2$. To show this, note that $|B^t V_{>k} V_{>k}^t|_2^2 = \sum_{j=k+1}^{n} \langle B, v_j\rangle^2$. Since $L^t V_k$ has orthonormal columns, we have
$$|B^t L [I-P]|_2^2 = |B^t L|_2^2 - \sum_{l=1}^{k} \langle B^t L, L^t v_l\rangle^2.$$
Now the claim follows by noting that $L^t$ approximately preserves the norms and inner products of vectors, and that since $k$ is a constant we can union bound. ◀
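The two variance estimates in the proof of Lemma 21 can also be checked numerically, continuing the setup (L, LtVk, m from the earlier sketches; closed-form Gaussian variances as before).

```python
# |Var[r1] - Var[r3]| and Var[q2(y) - q1(Ly)] are both small (continuation).
import numpy as np

P = LtVk @ LtVk.T
IP = np.eye(m) - P
M2d = L.T @ A2 @ L

var_r1 = 2 * np.linalg.norm(A2) ** 2 + np.linalg.norm(Vrest.T @ B) ** 2
var_r3 = 2 * np.linalg.norm(IP @ M2d @ IP) ** 2 + np.linalg.norm(B @ L @ IP) ** 2

# Linear form of q2(y) - q1(Ly): B^t [L L^t - I] Vk Vk^t L, variance = |w|^2.
w = ((B @ L) @ L.T - B) @ Vk @ (Vk.T @ L)
print(abs(var_r1 - var_r3), np.linalg.norm(w) ** 2)   # both O(k/m)-ish
```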

To summarize, we telescoped the Gaussian PRG error as
$$\Big|\,\mathbb{E}_{x\sim N^n(0,1)}\,\mathrm{sgn}(p(x)) - \mathbb{E}_{y\sim N^m(0,1)}\,\mathrm{sgn}(p_L(y))\,\Big| \le \Big|\,\mathbb{E}_{x\sim N^n(0,1)}\,\mathrm{sgn}(p(x)) - \mathbb{E}_{\substack{y\sim N^m(0,1)\\ z\sim N(0,1)}}\,\mathrm{sgn}\big(q_1(Ly) + \sqrt{\mathrm{Var}[r_1]}\,z\big)\,\Big|$$
$$+\ \Big|\,\mathbb{E}_{\substack{y\sim N^m(0,1)\\ z\sim N(0,1)}}\,\mathrm{sgn}\big(q_1(Ly) + \sqrt{\mathrm{Var}[r_1]}\,z\big) - \mathbb{E}_{\substack{y\sim N^m(0,1)\\ z\sim N(0,1)}}\,\mathrm{sgn}\big(q_2(y) + \sqrt{\mathrm{Var}[r_3]}\,z\big)\,\Big|$$
$$+\ \Big|\,\mathbb{E}_{\substack{y\sim N^m(0,1)\\ z\sim N(0,1)}}\,\mathrm{sgn}\big(q_2(y) + \sqrt{\mathrm{Var}[r_3]}\,z\big) - \mathbb{E}_{y\sim N^m(0,1)}\,\mathrm{sgn}(q_2(y)+r_3(y))\,\Big|$$
$$+\ \Big|\,\mathbb{E}_{y\sim N^m(0,1)}\,\mathrm{sgn}(q_2(y)+r_3(y)) - \mathbb{E}_{y\sim N^m(0,1)}\,\mathrm{sgn}(p_L(y))\,\Big|$$

and showed that each of these terms is small whp over $L$. This completes the analysis of the Gaussian PRG error term.
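Finally, an end-to-end Monte Carlo comparison of the two sides of the telescoping, continuing the setup (again with our illustrative i.i.d. Gaussian $L$; the paper's $L$ is generated pseudorandomly).

```python
# Compare E sgn(p(x)) for x ~ N^n(0,1) against E sgn(p(Ly)) for y ~ N^m(0,1).
import numpy as np

N = 20_000
X = rng.standard_normal((N, n))
lhs = np.mean(np.sign(((X @ A) * X).sum(axis=1) + X @ B + C))

Y = rng.standard_normal((N, m))
XL = Y @ L.T                                  # rows are samples of Ly
rhs = np.mean(np.sign(((XL @ A) * XL).sum(axis=1) + XL @ B + C))
print(lhs, rhs, abs(lhs - rhs))               # close, up to O(eps) + MC noise
```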

Now we move back from the Gaussian to the Boolean setting to finish the analysis for regular polynomials.