MEASURES OF PSEUDORANDOMNESS FOR FINITE SEQUENCES: TYPICAL VALUES

(1)

SEQUENCES: TYPICAL VALUES

N. ALON, Y. KOHAYAKAWA, C. MAUDUIT, C. G. MOREIRA, AND V. R ¨ODL

Abstract. Mauduit and Sárközy introduced and studied certain nu- merical parameters associated to finite binary sequencesEN∈ {−1,1}^N in order to measure their ‘level of randomness’. Those parameters, the normality measure N(EN), thewell-distribution measure W(EN), and thecorrelation measureCk(EN)of orderk, focus on different combinatorial aspects ofEN. In their work, amongst others, Mauduit and Sárközy (i) investigated the relationship among those parameters and their minimal possible value, (ii) estimated N(EN), W(EN), and Ck(EN) for certain explicitly constructed sequencesEN suggested to have a ‘pseudorandom nature’, and (iii) investigated the value of those parameters for genuinely random sequencesEN.

In this paper, we continue the work in the direction of (iii) above and determine the order of magnitude ofN(EN),W(EN), andCk(EN) for typicalEN. We prove that, for mostEN ∈ {−1,1}^N, bothW(EN) andN(EN) are of order√

N, whileCk(EN) is of order q

Nlog`N k

´for any given 2≤k≤N/4.

Date: Copy produced on April 12, 2006.

1991Mathematics Subject Classification. 68R15.

Key words and phrases. Random sequences, pseudorandom sequences, finite words, normality, correlation, well-distribution, discrepancy.

Part of this work was done at IMPA, whose hospitality the authors gratefully acknowledge. This research was partially supported by IM-AGIMB/IMPA. The first author was partially supported by the Israel Science Foundation, by a USA-Israeli BSF grant, by NSF grant CCR-0324906, by the James Wolfensohn fund and by the State of New Jersey.

The second author was partially supported by FAPESP and CNPq through a Tem´atico–

ProNEx project (Proc. FAPESP 2003/09925–5) and by CNPq (Proc. 306334/2004–6 and 479882/2004–5). The third author was partially supported by the Brazil/France Agree- ment in Mathematics (Proc. CNPq 60-0014/01-5 and 69-0140/03-7). The fourth author was partially supported by MCT/CNPq through a ProNEx project (Proc. CNPq 662416/1996–1) and by CNPq (Proc. 300647/95–6). The fifth author was partially supported by NSF Grant DMS 0300529. The authors gratefully acknowledge the support of a CNPq/NSF cooperative grant (910064/99–7, 0072064).

(2)

Contents

1. Introduction and statement of results 2

1.1. Measures of pseudorandomness for finite binary sequences 2 1.2. Typical values ofW(E_N),C_k(E_N), andN(E_N) 3 1.3. Minimal values ofW(EN),Ck(EN), andN(EN) 5 2. Estimates forW(EN),Ck(EN), andN(EN) for random sequencesEN 5

2.1. Estimates for the binomial distribution 6

2.2. The well-distribution measureW 8

2.3. The correlation measureCk 11

2.4. The normality measureN 20

3. SmallW(EN) andN(EN) 23

3.1. The probability of havingW(EN) small 23

3.2. The probability of havingN(EN) small 33

4. Concluding remarks 40

References 41

1. Introduction and statement of results

1.1. Measures of pseudorandomness for finite binary sequences. In a series of papers, Mauduit and S´ark¨ozy studied finite pseudorandom binary sequences E_N = (e₁, . . . , e_N) ∈ {−1,1}^N. In particular, they investigated in [11] the ‘measures of pseudorandomness’ to be defined shortly. The read- ers interested in detailed discussions concerning the definitions below and further related literature are referred to [10] and [11].

Let k ∈ N, M ∈ N, X ∈ {−1,1}^k, a ∈ Z, b ∈ N, b > 0, and D = (d₁, . . . , d_k) ∈ N^k with 0 ≤ d₁ < · · · < d_k < N be given. Below, we write cardS for the cardinality of a setS, and ifS is a set of numbers, then we write P

S for the sum P

s∈Ss. We let

T(E_N, M, X) = card{n: 0≤n < M, n+k≤N, and

(e_n+1, e_n+2, . . . , e_n+k) =X}, (1) U(EN, M, a, b) =X

{e_a+jb: 1 ≤j ≤M, 1≤a+jb ≤N for all j}, (2) and

V(E_N, M, D) =X

{e_n+d₁e_n+d₂. . . e_n+d_k: 1≤n≤M, n+d_k ≤N}. (3) In words,T(E_N, M, X) is the number of occurrences of the patternXinE_N, counting only those occurrences whose first symbol is among the first M elements of E_N. The quantity U(E_N, M, a, b) is the ‘discrepancy’ of E_N on an M-element arithmetic progression contained in {1, . . . , N}. Finally, V(E_N, M, D) is the ‘correlation’ among k length M segments ofE_N ‘rela- tively positioned’ according toD= (d₁, . . . , d_k).

(3)

The normality measure of EN is defined as N(EN) = max

k max

X max

M

T(EN, M, X)−M 2^k

, (4)

where the maxima are taken over all k ≤ log₂N, X ∈ {−1,1}^k, and 0 <

M ≤N + 1−k. Thewell-distribution measure of EN is defined as W(E_N) = max{|U(E_N, M, a, b)|:

a,b, andM such that 1≤a+b < a+M b≤N}. (5) Finally, thecorrelation measure of order kof EN is defined as

C_k(E_N) = max{|V(E_N, M, D)|:M and Dsuch that M+d_k≤N}. (6) 1.2. Typical values of W(EN), C_k(EN), and N(EN). In [4], Cassaigne, Mauduit, and S´ark¨ozy studied, amongst others, the values of W(E_N) and C_k(E_N) for random binary sequences E_N, with all the 2^N sequences in {−1,1}^N equiprobable, and the minimum possible values for W(EN) and C_k(E_N). They proved the following theorems. (Below and elsewhere in this paper, we write log for the natural logarithm.)

Theorem A. For allε >0, there are numbersN₀ =N₀(ε)andδ=δ(ε)>0 such that for N ≥N0 we have

δ√

N < W(E_N)<6p

NlogN (7)

with probability at least 1−ε.

Theorem B. For every integer k ≥ 2 and real ε > 0, there are numbers N₀=N₀(ε, k) andδ =δ(ε, k)>0 such that for all N ≥N₀ we have

δ

√

N < Ck(EN)<5p

kNlogN (8)

As it turns out, an improvement of the upper bound in Theorem A may be deduced from a proof of a closely related result of Erd˝os and Spencer.

Indeed, an argument in [7, Chapter 8] tells us that one may drop the loga- rithmic factor in (7), at the expense of increasing the multiplicative constant.

In this paper, we give stronger versions of Theorems A and B.

Theorem 1. For any given ε > 0 there exist N₀ and δ > 0 such that if N ≥N0, then

δ

√

N < W(EN)< 1 δ

√

N (9)

Theorem1above is essentially proved in Erd˝os and Spencer [7, Chapter 8].

However, we give our alternative proof for this result because an idea in this proof is also used in the proofs of Theorems4,5, and6 below.

We next state a result that establishes the typical order of magnitude of C_k(E_N) for a wide range ofk, including values ofkproportional toN.

(4)

Theorem 2. Let0< ε0≤1/16be fixed and letε1 =ε1(N) = (log logN)/logN. There is a constant N₀ =N₀(ε₀) such that ifN ≥N₀, then, with probability at least 1−ε0, we have

2 5

s Nlog

N k

< C_k(E_N)<

s

(2 +ε₁)Nlog

N N

k

<

s

(3 +ε₀)Nlog N

k

< 7 4

s Nlog

N k

(10) for every integer kwith 2≤k≤N/4.

Our next result tells us thatCk(EN) is concentrated around its meanE(Ck) in the case in which kis small.

Theorem 3. For any fixed constant ε > 0 and any integer function k = k(N) with 2 ≤k≤ logN −log logN, there is a constant N0 for which the following holds. If N ≥N₀, then the probability that

1−ε < Ck(EN)

E(C_k) <1 +ε (11) holds is at least 1−ε.

We suspect that the upper bound on k in Theorem3 may be weakened.

We now turn to the normality measureN(EN).

Theorem 4. For any given ε > 0 there exist N₀ and δ > 0 such that if N ≥N0, then

δ

√

N <N(EN)< 1 δ

√

N (12)

We shall show that the lower bounds in Theorems1 and4(i.e., the lower bounds in (9) and (12)) are in a sense best possible. In Section3 we prove the following two results.

Theorem 5. For any δ > 0, there is c(δ) >0 and N₀ =N₀(δ) such that, for any N ≥N0, we have

P(W(EN)< δ

√

N)> c(δ). (13)

Theorem 6. For any δ > 0, there is c(δ) >0 and N₀ =N₀(δ) such that, for any N ≥N₀, we have

P(N(E_N)< δ√

N)> c(δ). (14)

In Section4, we make some simple remarks to show that the upper bounds in (9) and (12) are in a sense best possible. Those remarks and Theorems5 and 6tells us thatW(E_N)/√

N and N(E_N)/√

N do not converge in distribution as N → ∞ to a distribution that is concentrated at a point.

(5)

1.3. Minimal values ofW(EN),Ck(EN), andN(EN). The minimal possible values ofW(E_N),C_k(E_N), andN(E_N) have also been investigated in the literature, see, e.g., [1, 4, 9, 15]. We include this brief section for the convenience of the reader. It follows from the classical results of Roth [15]

and Matouˇsek and Spencer [9] that the order of magnitude of min

W(EN) :EN ∈ {−1,1}^N (15) isN^1/4. For any fixed evenk≥2, the order of magnitude of minE_NC_k(EN), where the minimum is again taken over allE_N ∈ {−1,1}^N, has been estab- lished up to a polylogarithmic factor recently:

s 1 2

N k+ 1

<min

EN

C_k(E_N)≤ 7 4

pkNlogN , (16) where we supposeN ≥N0(k) for the upper bound. The lower bound in (16) is proved in [1], whereas the upper bound in (16) follows from Theorem 2 above. It is easy to see that, for any odd k, we have min_E_NC_k(E_N) = 1.

Some further results giving lower bounds forC_k(EN) are given in [1].

Turning to the normality measure N(E_N), we mention that the best bounds that we know of for the minimal value ofN(E_N) (E_N ∈ {−1,1}^N) are as follows:

1

2 +o(1)

log₂N ≤min

EN

N(EN)≤3N^1/3(logN)^2/3 (17) for all large enough N. The upper bound in (17) is proved in [1] construc- tively, where a suitable, explicit algebraic construction based on finite fields is given. It is interesting to compare the upper bounds in (17) and (12) in Theorem4. The lower bound in (17) may be proved simply (see the remarks at the end of Section 2.4). It would be interesting to close the rather wide gap in (17). The following problem was already raised in [1].

Problem 7. Is there an absolute constant α >0 for which we have minEN

N(E_N)> N^α (18)

for all large enough N?

The authors believe that the answer to Problem7 is positive.

2. Estimates forW(EN), Ck(EN), and N(EN) for random sequences E_N

We shall prove Theorems 1–4 in this section. Recall that these results concern random elements EN from the uniform space {−1,1}^N. In this section, unless stated otherwise, E_N will always stand for such a random sequence.

(6)

2.1. Estimates for the binomial distribution. We give in this section some standard facts about the binomial distribution.

We start with the following form of the de Moivre–Laplace theorem (see, e.g., [3, Chapter I, Theorem 6]). Let Bi(n, p) denote the binomial distribution with parameters n and p. Moreover, write S(n, p) for the sum of n independent Bernoulli random variables with meanp. We first consider the symmetric casep= 1/2.

Fact 8. (i) For any `=`(n)∈Z with`=o(n^2/3), we have P

S(n,1/2) = jn

2 k

+`

= 1 2ⁿ

n bn/2c+`

= r 2

πn e^−2`²^/n(1 +o(1)), (19) (ii) For anyc=c(n)>0 withc=o(n^1/6), we have

P

S(n,1/2)≥jn 2 k

+c√ n

= X

`≥c√ n

1 2ⁿ

n bn/2c+`

= r2

π +o(1)

!Z ∞ c

e^−2x²dx

. (20) In particular, if we further have that c→ ∞, then

P

S(n,1/2)≥jn 2 k

+c√ n

= e^−2c² 2c√

2π(1 +o(1)). (21) (iii) The estimates (20) and (21) hold for the lower tail

P

S(n,1/2)≤jn 2

k−c√ n as well.

In what follows, we shall often be concerned with sums of±1 independent random variables. We let

S^±(n) = X

1≤i≤n

Xi (22)

where Xi (1≤i≤n) are independent random variables with mean 0, that is,

P(Xi =−1) =P(Xi = +1) = 1/2.

Clearly, (S^±(n) +n)/2 is binomially distributed with parametersnand 1/2.

Let us now state a well known estimate for large deviations of S^±(n) (see, e.g., [2, Appendix A]).

Fact 9. Let Xi (1 ≤ i ≤ n) be independent ±1 random variables with mean0. LetS^±(n) =P

1≤i≤nX_i. For any real number a >0, we have P(S^±(n)> a)< e^−a²^/2n. (23)

(7)

We now prove a lower estimate for the symmetric binomial distribution.

As usual, we let{x}=x− bxc.

Fact 10. Let n and `be integers with

−jn 2 k

≤`≤ln 2 m

. (24)

If n is sufficiently large, then

P

S(n,1/2) =jn 2 k

+`

= 1 2ⁿ

n bn/2c+`

≥2−4(`+{n/2})²/n−n

n bn/2c

= (1 +o(1))2−4(`+{n/2})²/n

r 2

πn. (25) Proof. We shall in fact prove the following statement, which implies Fact10:

(†) if0≤`≤n/2 and nis large enough, then n

bn/2c −`

≥2−4(`+{n/2})²/n

n bn/2c

. (26)

We remark that if n is even, or else if n is odd and ` ≤ 0, then (†) is equivalent to what is claimed in Fact 10. If nis odd and ` >0, then (†) is slightly stronger, because in this case we have

n bn/2c+`

=

n dn/2e −`

=

n bn/2c −`+ 1

≥2−4(`−1+{n/2})²/n

n bn/2c

≥2−4(`+{n/2})²/n

n bn/2c

, (27) where we used (26) in the first inequality in (27).

We shall now prove (†). Let

a(`) = 2^4(`+{n/2})²^/n

n bn/2c −`

(28) for all 0≤`≤n/2. It is easy to check that

a(0), a(bn/2c)≥ n

bn/2c

(29) for all large enoughn. Therefore, it suffices to show that

min{a(`) : 0≤`≤n/2}

is achieved either at`= 0 or at`=bn/2c. In particular, if we show thata(`) is unimodal, we are done. Let

b(`) = a(`+ 1)

a(`) = 24(2`+1+2{n/2})/n bn/2c −`

dn/2e+`+ 1 (30) for all 0≤`≤n/2−1. Observe that, for all large enoughn, we have

b(0) = 2^4/n bn/2c

dn/2e+ 1 >1 (31)

(8)

and

b(bn/2c −1) = 1

n2^4(1−1/n)<1. (32) We now show thatb(`) is itself unimodal. Let

c(`) = b(`+ 1)

b(`) = 2^8/n

bn/2c −`−1 bn/2c −`

dn/2e+`+ 1 dn/2e+`+ 2

= 2^8/n

1− n+ 1

(bn/2c −`) (dn/2e+`+ 2)

(33) for all 0≤`≤n/2−2. Note thatc(0)>1 for sufficiently large nand c(`) is decreasing. Therefore, b(`) is unimodal, ‘starts’ withb(0)>1, and ‘ends’

withb(bn/2c −1)<1. Therefore we conclude that a(`) is indeed unimodal,

as required.

In Section2.4, we shall need results similar to Fact9for the tail of Bi(n, p) for arbitrary 0 < p < 1. We shall use the following estimates (see [8, Theorem 2.1]).

Fact 11. Let X have binomial distributionBi(n, p) and setλ=E(X) =pn.

Then, for any t≥0, we have P X ≥E(X) +t

≤exp

− t² 2(λ+t/3)

(34) and

P X≤E(X)−t

≤exp

−t² 2λ

. (35)

2.2. The well-distribution measure W. Our aim in this section is to prove Theorem 1. Let E_N = (e₁, . . . , e_N) ∈ {−1,1}^N be drawn uniformly at random; we wish to estimate

W(E_N) = max

a,b,M

X

1≤j≤M

e_a+jb, (36)

where the maximum is taken over all integersa,b, andM with 1≤a+b <

a+bM ≤N.

We start with the lower bound in (9). Observe that Fact 8tells us that, for any fixed C >0, we have

Nlim→∞P





X

1≤j≤N

e_j

≥C√ N



= 2 r2

π Z ∞

C/2

e^−2x²dx >0. (37) The lower bound forW(E_N) for a typicalE_N follows from (37) and the fact that

2 r2

π Z ∞

C/2

e^−2x²dx→1

(9)

asC →0. Indeed, the facts above show that, for anyε >0, there isδ >0 so that the lower bound in (9) holds with probability at least 1−ε. It remains to prove the upper bound for W(EN) for typical sequences EN.

LetK be an integer. The arithmetic progressions of the form 1≤a+b <

a+ 2b < · · · < a+T b ≤ K, where −b+ 1 ≤ a ≤ 0 and T is the largest integer satisfyinga+T b≤K, will be calledcomplete arithmetic progressions in[1, K]∩Z. Our first auxiliary lemma gives an estimate, as a function ofK, for the maximal value of the ‘discrepancy’

S_a,b^(K)=

X

1≤a+jb≤K

e_a+jb

(38) over complete arithmetic progressions in [1, K], for typical sequences

(e₁, . . . , e_K)∈ {−1,1}^K.

Lemma 12. Let K be a positive integer andC a positive real number. Let (e₁, . . . , e_K) be drawn uniformly at random from {−1,1}^K. The probability that there are integers aand b with −b+ 1≤a≤0 such that

S_a,b^(K)=

X

1≤a+jb≤K

ea+jb

> C

√

K (39)

is O e^−C²^/4

, as long as, say,C ≥2.

Proof. For any given integers a and b with −b+ 1 ≤ a ≤ 0, the complete arithmetic progression 1≤a+b < a+2b <· · · ≤Kin [1, K] has eitherbK/bc or dK/be terms. Fix a large constant C, and observe that (23) in Fact 9 tells us that, if b≤K, then the probability that we have

S_a,b^(K)=

X

1≤a+jb≤K

e_a+jb

> C

√ K

is less than 2e^−bC²^/4.

Now, givenb, there arebarithmetic progressions as above, and the possible values forbare 1, . . . , K−1. Therefore, the probability that there area and bfor which S_a,b^(K) is larger thanC√

K is O

X

1≤b<K

be^−bC²^/4

=O e^−C²^/4

, (40)

where we used that C ≥ 2 and hence the terms in the sum in (40) drop

geometrically.

Let us now consider intervals of integers of the form

B_m,r = (m2^r,(m+ 1)2^r]∩Z, (41) wherem and r are non-negative integers. Clearly,|B_m,r|= 2^r. We refer to theB_m,r asblocks. Put

k= 1 +blog₂Nc, (42)

(10)

and observe that all blocksBm,r contained in [1, N] haver < k.

Lemma 12 above tells us that, given a block B_m,r, the probability that the discrepancy on some complete arithmetic progression contained inBm,r

is greater thanC(k−r)√

2^r isO(e^−C²^(k−r)²^/4); that is, the probability that we have

maxa,b

X{e_a+jb:a+jb∈Bm,r}

> C(k−r)√

2^r (43)

is O(e^−C²^(k−r)²^/4). On the other hand, for anyr, we have at most N/2^r ≤ 2^k−r blocks B_m,r contained in [1, N]. Therefore, if C is large enough, the probability that (43) holds for some blockBm,r ⊂[1, N] is

O

X

0≤r<k

2^k−re^−C²^(k−r)²^/4

=O e^−C²^/4

, (44)

which tends to 0 as C → ∞. Therefore, to complete our proof, we may suppose that (43) does not hold for any B_m,r, that is, we may suppose that

(*) for allm and r withB_m,r ⊂[1, N], we have maxa,b

X{e_a+jb:a+jb∈Bm,r}

≤C(k−r)

√ 2^r. If we are able to deduce from (*) thatW(E_N) =O(√

N), then we shall be done. Let us now state and prove the following combinatorial observation.

Fact 13. Given any integers 1 ≤ p < q ≤ N, it is possible to write the integer interval I_p,q= [p, q]∩Zas the disjoint union of blocks B_m,r so that, for any r, we use at most two blocks of the form Bm,r.

Proof. To see this, first consider the smallest f for which [p, q] ⊂ [1,2^f) (that is, f = dlog₂(q + 1)e), and then consider the ‘dyadic’ ruler in the interval [1,2^f): the level 0 mark in our ruler is 1; the level1 mark is 2^f−1; the set oflevel 2 marks is{2^f−2,2^f−1+ 2^f⁻²}; the set oflevel 3 marks is

{2^f−3,2^f−2+ 2^f⁻³,2^f−1+ 2^f−3,2^f⁻¹+ 2^f−2+ 2^f−3};

and so on. Let Lbe the mark of smallest level contained in [p, q]. One may then use the binary expansion ofq−Lto obtain a decomposition of (L, q]∩Z into blocks, using, for eachr, at most one block of the formB_m,r. Similarly, one may use the binary expansion ofL−p+1 to obtain such a decomposition

of [p, L]∩Z.

We now use (*) and Fact 13 above to prove that W(E_N) = O(√ N).

Givena,b, andM with 1≤a+b < a+M b≤N, we write [a+b, a+M b]∩Z as a disjoint union of blocks such that, for each r, we make use of at most two blocks of the formB_m,r. LetBbe the collection of blocksB_m,r that we

(11)

have used in our partition of [a+b, a+M b]∩Z. We have

X

1≤j≤M

e_a+jb

≤ X

Bm,r∈B

X{e_a+jb:a+jb∈B_m,r}

≤ X

0≤r<k

2×C(k−r)

√ 2^r

= 2C2^k/2 X

1≤`≤k

`2^−`/2 =O(2^k/2) =O(

√ N).

This completes the proof of Theorem 1.

2.3. The correlation measure Ck. In this section we prove Theorems 2 and 3.

2.3.1. Proof of Theorem2. We shall first prove the upper estimate forCk(EN) for typical sequencesE_N. Indeed, we prove the following lemma.

Lemma 14. Let ε = ε(N) = (log logN)/logN. With probability tending to 1 as N → ∞, we have

C_k(E_N)<

s

(2 +ε)Nlog

N N

k

Proof. We first consider a fixed integer k with 2 ≤ k ≤ N/4. Fix a se- quenceD= (d₁, . . . , d_k) with 0≤d₁ <· · ·< d_k< N. SinceE_N = (e_i)1≤i≤N

is drawn uniformly at random from{−1,1}^N, the sequence

(e_1+d₁e_1+d₂. . . e_1+d_k, e_2+d₁e_2+d₂. . . e_2+d_k, . . . , eN−d_k+d1eN−d_k+d2. . . e_N) is a random element of the uniform space {−1,1}^N^−d^k. Recall (6), and let an integer M with M +d_k ≤ N be fixed. Clearly, V(E_N, M, D) has the same distribution asS^±(M). We now use (23) with

a= s

(2 +ε)Nlog

N N

k

,

whereε= (log logN)/logN is as in the statement of our lemma, to deduce that

|V(E_N, M, D)|>

s

(2 +ε)Nlog

N N

k

(46) holds with probability less than

exp

− 1

2M(2 +ε)Nlog

N N

k

≤

N N

k

−(1+ε/2)

. (47) Summing over all possible choices of M and D, we get an extra factor of N ^N_k

, which gives an upper bound of n

N ^N_ko−ε/2

for the probability

(12)

of the event that (46) should hold for some M and D. We shall now sum these bounds over 2≤k≤N/4. For estimating this sum, we observe that

1

1−1/3^δ ≤ 1

δ (48)

for all 0 < δ ≤ δ₀, where δ₀ > 0 is some suitably small absolute constant, and

N k−1

N k

≤ 1

3 (49)

for 2< k≤N/4. Using (48) and (49), we see that the probability that (46) holds for some M,D, andk withk in the range 2≤k≤N/4 is at most

X

2≤k≤N/4

N

N k

−ε/2

≤

N N

2

−ε/2

X

`≥0

1 3

`ε/2

≤ 2 ε

N

N 2

−ε/2

=O

1 (log logN)√

logN

=o(1), (50)

and our result follows.

The proof of Theorem2will be complete if we prove the following result.

Lemma 15. With probability tending to 1 as N → ∞, we have C_k(EN)> 2

5 s

Nlog N

k

We start with an auxiliary result.

Fact 16. Letm=bN/3c. For every sufficiently largeN, the following hold.

(i) If 2≤k≤logm, then s

Nlog N/3

k

≥0.99 s

Nlog N

k

. (52)

(ii) If logm < k≤N/4, then s

Nlog N/3

k

≥ s

1−10⁻¹⁰ 3 Nlog

N k

. (53)

We include the proof of Fact16 for completeness. The reader may prefer to go directly to the proof of Lemma15, given below.

Proof of Fact 16. Throughout this proof, we suppose thatN is larger than a suitably large absolute constant.

(13)

Let us start with the proof of (i). Suppose 2 ≤ k ≤ logm. Using that (a/b)^b ≤ ^a_b

≤(ea/b)^b, we have N/3

k

≥ N

3k k

≥ 1

3e k

eN k

k

≥ 1

3e k

N k

. (54)

Therefore, log

N/3 k

≥ 1−klog(3e) log ^N_k

! log

N k

≥

1− 1 + log 3 log(N/k)

log

N k

≥

1− 3 logN

log

N k

. (55) Inequality (52) follows from (55), and (i) is proved.

Let us now turn to (ii). Suppose logm < k ≤ N/4. Since ^N/3_k

≥

bN/3c k

= ^m_k

, it clearly suffices to prove that m

k

≥ N

k

(1−10⁻¹⁰)/3

(56) for all large enoughN. Ifk < m/2, thenm−k > m/2> k. Moreover,

m k

= m

m−k

(57) and

N m−k

(1−10⁻¹⁰)/3

≥ N

k

(1−10⁻¹⁰)/3

. (58)

Therefore, it suffices to consider kwith 1

2 N

3

= 1

2m≤k≤ N

4. (59)

Now, the left-hand side of (56) is decreasing in the range (59), while the right-hand side is increasing in that range. Therefore, it suffices to verify (56) atk=bN/4c.

To check (56) fork =bN/4c, we may use the fact that if r =r(n) is an integer function of nwith limn→∞r(n)/n=x, then

n r

= 2(H(x)+o(1))n, (60)

whereo(1)→0 asn→ ∞, and

H(x) =−xlog₂x−(1−x) log₂(1−x) (61) for all 0 ≤ x ≤ 1 (H(0) = H(1) = 0). Of course, this is an immediate consequence of Stirling’s formula

n! =√ 2πn

n e

n 1 +O

1 n

. (62)

(14)

We now note that 1 3H

3 4

= 1 3H

1 4

> 1−10⁻¹⁰

3 H

1 4

, (63)

and hence (56) does indeed follow for k = bN/4c as long as N is large

enough. This completes the proof of Fact 16.

We now give the proof of the main result in this section.

Proof of Lemma 15. Setm=bN/3c, and recall that we writeS(m,1/2) for a random variable with binomial distribution Bi(m,1/2). Fix 2≤k≤N/4.

We are interested in the largest integer r for which P

S(m,1/2)≥ 1

2(m+r)

≥k²(logN)

m+ 1 k−1

−1

(64) holds. Indeed, we let

r(m) =r_k(m) = max{r∈N: inequality (64) holds}. (65) We need the following lower bounds forr(m).

Fact 17. For every sufficiently large N, the following hold.

(i) If 2≤k≤logm, then r(m)≥0.99

s 2mlog

m+ 1 k−1

. (66)

(ii) If logm < k≤N/4, then r(m)≥(1−10⁻¹⁰)

s m log 2log

m+ 1 k−1

. (67)

(iii) If 2≤k≤N/4, then r(m)≥ 2

5 s

Nlog N

k

. (68)

Proof. In this proof, we may and shall suppose that N is larger than a suitably large absolute constant for the inequalities below to hold.

We start observing that we have k²(logN)

m+ 1 k−1

−1

=

m+ 1 k−1

−1+o(1)

(69) uniformly in kin the range 2 ≤k≤N/4; that is, for all η >0 there is N0

such that ifN ≥N₀, then, for all 2≤k≤N/4, we have k²logN ≤

m+ 1 k−1

η

. (70)

To check this assertion, note that if 2 ≤ k ≤ (logN)², then the left-hand side of (70) is polylogarithmic in N, whereas ^m+1_k−1

≥ N/3. On the other

(15)

hand, if (logN)²< k≤N/4, then the left-hand side of (70) isO(N²logN), whereas ^m+1_k−1

≥ (m+ 1)/(k−1)k−1

≥(4/3 +o(1))^k−1 ≥(5/4)^(log^N)²⁻¹, which is superpolynomial in N. Therefore, (69) does hold.

We now turn to the proof of (i). Suppose 2≤k≤logm. Set r =

&

0.99 s

2mlog

m+ 1 k−1

'

. (71)

We shall show that (64) holds for the value of r given in (71). Let c= r+ 1

2√

m = (1 +o(1))0.99 s

1 2log

m+ 1 k−1

. (72)

We now use (21) in Fact 8(ii) to deduce that P

S(m,1/2)≥ 1

2(m+r)

≥P

S(m,1/2)≥jm 2

k +c√

m

= e^−2c² 2c√

2π(1 +o(1))≥ 1 100

m+ 1 k−1

−0.99 log

m+ 1 k−1

−1/2

, (73) which is clearly larger than

k²(logN)

m+ 1 k−1

−1

(74) for large enoughN (see (69)). This completes the proof of (64), and hence (i) follows. We now turn to (ii). Suppose logm < k≤N/4. Set

r =

&

(1−10⁻¹⁰) s

m log 2log

m+ 1 k−1

'

. (75)

We shall show that (64) holds for the value of r given in (75). Let

`=

r+ 1 2

= (1 +o(1))1−10⁻¹⁰ 2√

log 2 s

mlog

m+ 1 k−1

. (76)

We now use (25) in Fact 10 to deduce that P

S(m,1/2)≥ 1

2(m+r)

≥P

S(m,1/2)≥jm 2

k +`

≥(1 +o(1))2−4(d(r+1)/2e+1/2)²/m

r 2 πm

≥(1 +o(1))2^−r²^/m2^−(6r+9)/m r 2

πm

≥(1 +o(1))

m+ 1 k−1

−1+10⁻¹⁰

≥k²(logN)

m+ 1 k−1

−1

for all large enough N, where, again, we used (69) for the last inequality.

This completes the proof of (64), and (ii) follows.

(16)

We now turn to (iii). Suppose first that 2≤k≤logm, so that (66) holds.

Using that ^m+1_k−1

≥ _k−1^N/3

≥ ^N/3_k ^1/2

in this range of k and Fact 16(i), we deduce that

r(m)≥0.99 s

2 N

3

log

m+ 1 k−1

≥(1 +o(1))0.99

√3 s

Nlog N/3

k

≥ 2 5

s Nlog

N k

. (77) Suppose now that logm < k≤N/4, so that (67) holds. Using that ^m+1_k−1

≥

N/3 k

^1−o(1)

in this range ofk and Fact16(ii), we deduce that r(m)≥ 1−10⁻¹⁰

√log 2 s

N 3

log

m+ 1 k−1

≥(1 +o(1))1−10⁻¹⁰

√3 log 2 s

Nlog N/3

k

≥ 2 5

s Nlog

N k

. (78) Inequality (68) follows from (77) and (78), and (iii) is proved.

To prove Lemma 15, we shall show that, with probability ≤2/k²logN, we have

Ck(EN)≤rk(m), (79) and then we shall sum over all 2≤k≤N/4. This gives that (79) holds for some k with 2≤k≤N/4 with probabilityO(1/logN) =o(1). Therefore, (79)fails forall 2≤k≤N/4 with probability 1−o(1), and Lemma15 will be proved.

Let 2 ≤ k ≤ N/4 be fixed. Our strategy to estimate the probability that (79) should hold will be as follows. Recall E_N = (e₁, . . . , e_N) and let u= (e1, . . . , em). Let D_k be the set of (k−1)-tuples D= (d1, . . . , dk−1) withm≤d₁ <· · ·< dk−1 ≤2m. For eachD∈ D_k, let

vD = (e1+d1e1+d2. . . e1+dk−1, . . . , em+d1em+d2. . . em+dk−1), (80) and letA(D) be the event that

|hu, v_Di|> r(m) =rk(m) (81) holds. It suffices to show that someA(D) (D∈ D_k) holds with probability at least 1−2/k²logN. For convenience, letX =X(EN) be the number of events A(D) (D∈ D_k) that hold forE_N. Let

p=p(m) =P

S(m,1/2)≥ 1

2(m+r(m))

. (82)

Because of (65), we have

E(X) =p|D_k|=p

m+ 1 k−1

≥k²logN. (83)

(17)

We have now arrived at the key claim: the events A(D) (D ∈ D_k) are pairwise independent.

Claim 18. For all distinctD and D⁰ ∈ D_k, we have

P(A(D)∩A(D⁰)) =p². (84) We postpone the proof of Claim18. To complete the proof of Lemma15, we make use of the following result, which gives a lower bound for the probability of a union of pairwise independent events. We shall in fact state a stronger lemma, which has as hypothesis that the events should be asymptotically negatively correlated. Versions of this lemma may be found in [5] and [6]. More recently, Petrov used this result to generalize the Borel–

Cantelli lemma [13,14]

Lemma 19. Let A₁, . . . , A_M be events in a probability space, each with probability at least p. Let ε≥0 be given, and suppose that

P(A_i∩A_j)≤p²(1 +ε) (85) for all i6=j. Then

P [

1≤j≤MAj

≥ M p

1 + (M −1)p(1 +ε)

= 1− 1−p+ (M−1)pε

1 + (M−1)p(1 +ε) >1−ε− 2

M p. (86) The proof of Lemma19is given below. We conclude the proof of Lemma15 combining Claim18 and Lemma19. It suffices to notice that we haveM =

m+1 k−1

pairwise independent eventsA(D) (that is,ε= 0), andpM ≥k²logN (see (83)). Inequality (86) then tells us that, with probability greater than 1−2/k²logN, the eventA(D) happens for someD∈ D_k. We conclude that (79) occurs with probability at most 2/k²logN, and hence, as observed above, summing over all 2≤k≤N/4, Lemma 15 follows.

We shall now prove Claim 18 and Lemma19.

Proof of Claim 18. Let us consider the eventsA(D) andA(D⁰) for two dis- tinct D, D⁰ ∈ D_k. Let D = (d1, . . . , dk−1) and D⁰ = (d⁰₁, . . . , d⁰_k−1), and recall that

m≤d1 <· · ·< dk−1 ≤2m, m≤d⁰₁ <· · ·< d⁰_k−1≤2m, and that

vD = (e1+d1e1+d2. . . e1+dk−1, . . . , em+d1em+d2. . . em+dk−1), v_D⁰ = (e_1+d⁰

1e_1+d⁰

2. . . e_1+d⁰

k−1, . . . , e_m+d⁰

1e_m+d⁰

2. . . e_m+d⁰

k−1).

Let v_i^D = ei+d1. . . ei+dk−1 and v_i^D⁰ = e_i+d⁰

1. . . e_i+d⁰

k−1 (1 ≤ i ≤ m) be the components of v_D and v_D⁰. For convenience, let us write ω ∈_U Ω if ω is a

(18)

uniformly distributed random element of Ω. We shall prove more than (84);

we shall prove that

(e1v₁^D, . . . , emv_m^D, e1v₁^D⁰, . . . , emv_m^D⁰)∈_U {−1,1}^2m. (87) Clearly, this proves Claim18, becausehu, v_Di=P

1≤i≤meiv^D_i andhu, v_Di= P

1≤i≤me_iv_i^D⁰.

To check (87), we make the following observation: we have

(x1, . . . , xm, y1, . . . , ym)∈_U {−1,1}^2m (88) if and only if

(x1, . . . , xm, x1y1, . . . , xmym)∈_U {−1,1}^2m. (89) This observation follows immediately from the fact that the map that takes the vector in (88) to the vector in (89) is a bijection (it is in fact an involu- tion).

We apply the observation above to

x_i =e_iv^D_i and y_i=e_iv_i^D⁰ (1≤i≤m) (90) to obtain that (87) holds if and only if

(x1, . . . , xm, x1y1, . . . , xmym)

= (e1v₁^D, . . . , emv_m^D, v^D₁ v^D₁⁰, . . . , v_m^Dv_m^D⁰)∈_U {−1,1}^2m. (91) However, (91) does hold. Indeed, since D6=D⁰, a moment’s thought shows that

(v^D₁ v^D₁⁰, . . . , v_m^Dv_m^D⁰)∈_U {−1,1}^m. (92) (To see (92), note that the value of v^D_i v^D_i ⁰ (1 ≤ i ≤ m) is determined by thee_j with

j∈i+D4D⁰=i+ (D\D⁰)∪(D⁰\D) (93) alone, and the family of setsi+D4D⁰ (1≤i≤m) form a “hypertree”. We shall not elaborate on this simple argument.) Fact (92) and the fact that thee_i (1≤i≤m) are not involved in (92) at all imply (91), and hence (87) does hold. This completes the proof of Claim18.

Proof of Lemma 19. Let f = P

1≤j≤Mχj, whereχj denotes the character- istic function of the event Aj. We have

Z

f dP= X

1≤j≤M

P(A_j)≥M p, (94) and

Z

f²dP=

Z X

1≤j≤M

χj+ 2 X

1≤i<j≤M

χiχj

dP

≤ Z

f dP+M(M −1)p²(1 +ε). (95)

(19)

LetX =S

1≤j≤MAj. By the Cauchy–Schwarz inequality, we have Z

X

1 dP Z

X

f² dP

≥Z

X

f dP 2

,

which, together with (95), gives P(X)Z

f dP+M(M−1)p²(1 +ε)

≥Z

X

f dP 2

.

Therefore P(X)≥Z

X

f dP

2.Z

f dP+M(M −1)p²(1 +ε)

≥ M²p²

M p+M(M−1)p²(1 +ε) = M p

1 + (M−1)p(1 +ε), (96)

and the lemma follows.

2.3.2. Proof of Theorem3. Theorem3follows simply from some well known results concerning concentration of measure. Letε and k=k(N) as in the statement of that theorem be given. We now observe that ifEN = (ej)1≤j≤N

and E_N⁰ = (e⁰_j)1≤j≤N ∈ {−1,1}^N differ in exactly one coordinate, then

|V(E_N, M, D)−V(E_N⁰ , M, D)| ≤2k (97) for any M and D. Indeed, suppose e⁰_j

0 = −e_j₀ and e_j = e⁰_j for all j 6=

j0. Note that ej0 is involved in at most k summands in the definition of V(E_N, M, D) (see (3)), and hence changing its value to −e_j₀ will change V(EN, M, D) by at most 2k. Therefore,

|C_k(EN)−Ck(E_N⁰ )| ≤2k (98) as well. We may now use, for instance, Lemma 1.2 from [12] to deduce that, for any t >0, we have

P |C_k(EN)−E(Ck)| ≥t

≤2 exp −t²/2k²N

. (99)

It now suffices to check that if

k≤logN−log logN, (100)

then, takingt=εE(C_k), we have that the right-hand side of (99) iso(1).

Theorem2 tells us that, ifN is large enough, then, say, E(C_k)≥ 1

5 s

Nlog N

k

. (101)

Condition (100) on kand (101) easily imply that t²

2k²N = ε²E(C_k)²

2k²N → ∞ (102)

asN → ∞, and this completes the proof of Theorem3.

(20)

2.4. The normality measure N. Recall that the normality measure of E_N = (e₁, . . . , e_N)∈ {−1,1}^N is defined as

N(EN) = max

k max

X max

M

T(EN, M, X)−M 2^k

, (103)

where the maxima are taken over all k ≤ log₂N, X ∈ {−1,1}^k, and 0 <

M ≤ N + 1−k, and T(EN, M, X) is the number of occurrences of the pattern X in E_N, counting only those occurrences starting with some e_j withj ≤M (see (1)). Our aim in this section is to prove Theorem 4.

Proof of Theorem 4. We start with the lower bound in (12). We takek= 1 in (103); in fact, we considerT(E_N, N,(1)), the number of occurrences of 1 inEN. Fact 8tells us that

N→∞lim P

T(E_N, N,(1))−N 2

≥C√ N

= 2 r2

π Z ∞

C

e^−2x²dx >0. (104) Since

2 r2

π Z ∞

C

e^−2x²dx→1

as C → 0, the lower bound in Theorem 4 follows. Indeed, the facts above show that, for any ε > 0, there is δ > 0 so that the lower bound in (12) holds with probability at least 1−ε. It remains to prove the upper bound forN(E_N) for typical sequences E_N.

The basic lemma that we shall use is Lemma20below. Recall that we refer to the sets B_m,r asblocks (recall (41)). For an integerk≥1,X ∈ {−1,1}^k, and Bm,r a block with

maxB_m,r = (m+ 1)2^r≤N−k+ 1, (105) we shall write T(E_N, Bm,r, X) for the number of occurrences of the pattern X inE_N, counting only those occurrences starting in B_m,r, that is,

T(E_N, B_m,r, X) = card{n∈B_m,r:E_N⁽ⁿ⁾=X}, (106) where

E_N⁽ⁿ⁾= (e_n, e_n+1, . . . , en+k−1), (107) and, as usual, E_N = (e₁, . . . , e_N).

Lemma 20. Let m andr be fixed non-negative integers with Bm,r ⊂[1, N].

For all D > 0, the probability that there is X ∈ {−1,1}^k with k ≤ log₂N satisfying (105) such that

T(E_N, B_m,r, X)−2^r−k > D√

2^r (108)

is at most O

e^−2D²^/9

+ 2(log₂N)²Nexp

− 3D 4 log₂N 2^r/2

. (109)

(21)

We postpone the proof of Lemma20and continue with the proof of The- orem 4. Let us apply Lemma 20 with

D=C(K−r) (110)

for all blocksB_m,r ⊂[1, N], where

K = 1 +blog₂Nc, (111)

and C is a large constant. There are at most N/2^r < 2^K−r blocks Bm,r

contained in [1, N] and any such block is such that r < K. Let us call a blockBm,r large if

r ≥4 log₂log₂N (112)

andsmall otherwise. We claim that, with the value ofDgiven in (110) with a large enough constantC, inequality (108) holds for some large blockBm,r

and some X∈ {−1,1}^k (k≤log₂N) with probabilityO

e^−2C²^/9 . To prove our claim, for convenience, let P0

r indicate sum over all r < K satisfying (112). Adding up (109) over all large blocksBm,r ⊂[1, N], we get

X0

r2^K−rO

e^−2C²^(K−r)²^/9

+X0

r2^K−r+1(log₂N)²Nexp

−3C(K−r) 4 log₂N 2^r/2

≤ X

1≤`≤K

2^`O

e^−2C²^`²^/9

+ (log₂N)²N²X⁰

rexp

− 3C

4 log₂N 2^r/2

=O

e^−2C²^/9 +O

1 N

=O

e^−2C²^/9

, (113)

as long asC is a large enough constant andN is large enough with respect toC, proving our claim.

Since the bound in (113) tends to 0 asC → ∞, we may and shall suppose henceforth that

(**) for all integers m, r, and k ≤ log₂N with B_m,r ⊂ [1, N] satisfying (105) and (112), and everyX∈ {−1,1}^k, we have

T(E_N, B_m,r, X)−2^r−k

≤C(K−r)√ 2^r.

Now fixk≤log₂N and fixM with 1≤M ≤N−k+ 1. Observe that we may write [1, M] as a disjoint union of blocks B_m,r (r ≤log₂M < K) with at most one block of the formBm,r for eachr. Indeed, such a decomposition of [1, M] may be read out from the binary expansion ofM. Let us write I for the set of the pairs (m, r) for which B_m,r occurs in this decomposition of [1, M]. Furthermore, let I =I+∪I− be the partition of I with

I₊={(m, r)∈I:r satisfies (112)}. (114)

(22)

For later reference, observe that

|I−|<4 log₂log₂N. (115) Observe also that if (m, r)∈I−, then

T(E_N, Bm,r, X)−2^r−k

≤2^r<(log₂N)⁴ (116) for anyX∈ {−1,1}^k. Using (**), (115), and (116), we see that, for anyX∈ {−1,1}^k, we have

T(E_N, M, X)−M2^−k

≤ X

(m,r)∈I

T(E_N, B_m,r, X)−2^r−k

= X

(m,r)∈I+

T(EN, Bm,r, X)−2^r−k

+ X

(m,r)∈I−

T(EN, Bm,r, X)−2^r−k

≤ X

0≤r<K

C(K−r)2^r/2+|I−|(log₂N)⁴ (117)

≤C2^K/2 X

1≤`≤K

`2^−`/2+ 4(log₂log₂N)(log₂N)⁴

=O(2^K/2) =O(

√ N),

as required. It now remains to prove Lemma20.

Proof of Lemma 20. Let non-negative integers m and r withB_m,r ⊂[1, N] be fixed, and let D be a positive real. Because of the form of the upper bound in (109), we may suppose thatD ≥D0 for some conveniently large absolute constantD₀. Indeed, if we prove an upper bound of the form (109) forD≥D0, then we may simply adjust the absolute constant in the bigO term to take into account the case in whichD < D₀ (making (109) greater than 1, say).

Let us now fixk≤log₂N for which (105) holds, and letX∈ {−1,1}^k be given. We shall first show that the probability that (108) happens for this fixed X is suitably small.

Let us decompose Bm,r in arithmetic progressions with difference k: for 0≤s < k, let

I_s ={j∈B_m,r:j ≡s (modk)}, (118) so that Bm,r =S

0≤s<kIs is a partition ofBm,r. Let

T(EN, Is, X) = card{n:n∈Is andE_N⁽ⁿ⁾=X}, (119) and observe that, clearly, T(EN, Bm,r, X) = P

0≤s<kT(EN, Is, X). In particular, if

T(E_N, I_s, X)− |I_s|2^−k ≤ 1

kD√

2^r (120)

for all 0≤s < k, then (108) fails. Let us fix 0≤s < k, and let us estimate the probability that (120) should fail.

(23)

Clearly, T(EN, Is, X) has binomial distribution Bi(|I_s|,2^−k). Therefore, using Fact11, we see that the probability that (120) should fail is at most

2 exp

− t² 2(λ+t/3)

, (121)

where t = (D/k)2^r/2 and λ=|I_s|2^−k ≤2^r−k. If λ≥ t/3, then (121) is at most

2 exp

−t² 4λ

≤2 exp

−D²2^r/k² 2^r−k+2

= 2e^−D²²^k−2^/k². (122) If, on the other hand,λ < t/3, then (121) is at most

2 exp

−3 4t

= 2 exp

−3D 4k2^r/2

≤2 exp

− 3D 4 log₂N2^r/2

. (123) Below, rather crudely, we bound (121) by the sum of (122) and (123).

Indeed, we add up (122) and (123) over all choices of s (0 ≤ s < k) and X ∈ {−1,1}^k, and get that the probability that (108) should hold for someX∈ {−1,1}^k (k≤log₂N) is

2k2^k

e^−D²²^k−2^/k² + exp

− 3D 4 log₂N2^r/2

. (124)

We now sum (124) over all 1≤k≤ blog₂Nc=K−1. We add up the terms in (124) separately.

We have

2 X

1≤k<K

k2^ke^−D²²^k−2^/k² = 4e^−D²^/2+ 2⁴e^−D²^/4+ 3×2⁴e^−2D²^/9S, (125) where S is dominated by the sum of a convergent geometric series as long asD is large enough. Therefore, the sum in (125) is

O

e^−2D²^/9

. (126)

Moreover,

2 X

1≤k<K

k2^kexp

− 3D 4 log₂N2^r/2

≤2(log₂N)²Nexp

− 3D 4 log₂N2^r/2

.

(127)

Lemma20 follows from (125)–(127).

3. Small W(E_N) and N(E_N)

3.1. The probability of having W(E_N) small. Our aim in this section is to prove Theorem 5, which tells us that, given any δ >0, the probability of the eventW(EN)≤δ√

N is bounded away from 0.

The proof of Theorem 5 is given in Section 3.1.2, but a key ‘positive correlation’ result is first proved in Section3.1.1. To be a little more precise, Claim26, needed in the proof of Theorem5, is proved by invoking a general correlation result given in Corollary 22 in Section3.1.1.