W W L CHEN

© W W L Chen, 1997, 2008.
This chapter is available free to all individuals, on the understanding that it is not to be used for financial gain, and may be downloaded and/or photocopied, with or without permission from the author.
However, this document may not be kept on any information storage and retrieval system without permission from the author, unless such system is not accessible to any individuals other than its owners.
Chapter 11
APPLICATIONS OF
REAL INNER PRODUCT SPACES
11.1. Least Squares Approximation
Given a continuous function $f : [a, b] \to \mathbb{R}$, we wish to approximate $f$ by a polynomial $g : [a, b] \to \mathbb{R}$ of degree at most $k$, such that the error
$$\int_a^b |f(x) - g(x)|^2 \, dx$$
is minimized. The purpose of this section is to study this problem using the theory of real inner product spaces. Our argument is underpinned by the following simple result in the theory.
PROPOSITION 11A. Suppose that $V$ is a real inner product space, and that $W$ is a finite-dimensional subspace of $V$. Given any $u \in V$, the inequality
$$\|u - \mathrm{proj}_W u\| \le \|u - w\|$$
holds for every $w \in W$.

In other words, the distance from $u$ to any $w \in W$ is minimized by the choice $w = \mathrm{proj}_W u$, the orthogonal projection of $u$ on the subspace $W$. Alternatively, $\mathrm{proj}_W u$ can be thought of as the vector in $W$ closest to $u$.
Proof of Proposition 11A. Note that
$$u - w = (u - \mathrm{proj}_W u) + (\mathrm{proj}_W u - w),$$
where $u - \mathrm{proj}_W u$ is orthogonal to $W$ while $\mathrm{proj}_W u - w \in W$, so that the two summands are orthogonal to each other. It follows from Pythagoras's theorem that
$$\|u - w\|^2 = \|(u - \mathrm{proj}_W u) + (\mathrm{proj}_W u - w)\|^2 = \|u - \mathrm{proj}_W u\|^2 + \|\mathrm{proj}_W u - w\|^2,$$
so that
$$\|u - w\|^2 - \|u - \mathrm{proj}_W u\|^2 = \|\mathrm{proj}_W u - w\|^2 \ge 0.$$
The result follows immediately.
Let $V$ denote the vector space $C[a, b]$ of all continuous real valued functions on the closed interval $[a, b]$, with inner product
$$\langle f, g \rangle = \int_a^b f(x) g(x) \, dx.$$
Then
$$\int_a^b |f(x) - g(x)|^2 \, dx = \langle f - g, f - g \rangle = \|f - g\|^2.$$
It follows that the least squares approximation problem is reduced to one of finding a suitable polynomial $g$ to minimize the norm $\|f - g\|$.
Now let $W = P_k[a, b]$ be the collection of all polynomials $g : [a, b] \to \mathbb{R}$ with real coefficients and of degree at most $k$. Note that $W$ is essentially $P_k$, although the variable is restricted to the closed interval $[a, b]$. It is easy to show that $W$ is a subspace of $V$. In view of Proposition 11A, we conclude that
$$g = \mathrm{proj}_W f$$
gives the best least squares approximation among polynomials in $W = P_k[a, b]$. This subspace is of dimension $k + 1$. Suppose that $\{v_0, v_1, \ldots, v_k\}$ is an orthogonal basis of $W = P_k[a, b]$. Then by Proposition 9L, we have
$$g = \frac{\langle f, v_0 \rangle}{\|v_0\|^2} v_0 + \frac{\langle f, v_1 \rangle}{\|v_1\|^2} v_1 + \ldots + \frac{\langle f, v_k \rangle}{\|v_k\|^2} v_k.$$
Example 11.1.1. Consider the function $f(x) = x^2$ in the interval $[0, 2]$. Suppose that we wish to find a least squares approximation by a polynomial of degree at most 1. In this case, we can take $V = C[0, 2]$, with inner product
$$\langle f, g \rangle = \int_0^2 f(x) g(x) \, dx,$$
and $W = P_1[0, 2]$, with basis $\{1, x\}$. We now apply the Gram-Schmidt orthogonalization process to this basis to obtain an orthogonal basis $\{1, x - 1\}$ of $W$, and take
$$g = \frac{\langle x^2, 1 \rangle}{\|1\|^2} \, 1 + \frac{\langle x^2, x - 1 \rangle}{\|x - 1\|^2} \, (x - 1).$$
Now
$$\langle x^2, 1 \rangle = \int_0^2 x^2 \, dx = \frac{8}{3} \quad \text{and} \quad \|1\|^2 = \langle 1, 1 \rangle = \int_0^2 dx = 2,$$
while
$$\langle x^2, x - 1 \rangle = \int_0^2 x^2 (x - 1) \, dx = \frac{4}{3} \quad \text{and} \quad \|x - 1\|^2 = \langle x - 1, x - 1 \rangle = \int_0^2 (x - 1)^2 \, dx = \frac{2}{3}.$$
It follows that
$$g = \frac{4}{3} + 2(x - 1) = 2x - \frac{2}{3}.$$
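The projection in Example 11.1.1 can be checked numerically. The sketch below (the helper names `inner` and `least_squares_poly` are ours, not from the text) runs Gram-Schmidt on the monomial basis $\{1, x, \ldots, x^k\}$ under the integral inner product and then expands $\mathrm{proj}_W f$ as in Proposition 9L:

```python
import numpy as np

def inner(f, g, a, b, n=20001):
    # Composite trapezoidal approximation of <f, g> = integral of f(x)g(x) over [a, b].
    x = np.linspace(a, b, n)
    return np.trapz(f(x) * g(x), x)

def least_squares_poly(f, a, b, k):
    # Gram-Schmidt on {1, x, ..., x^k}, then g = sum <f, v_i>/||v_i||^2 * v_i.
    basis = [np.poly1d([1.0])]
    for j in range(1, k + 1):
        p = np.poly1d([1.0] + [0.0] * j)          # the monomial x^j
        for v in basis:
            p = p - (inner(p, v, a, b) / inner(v, v, a, b)) * v
        basis.append(p)
    g = np.poly1d([0.0])
    for v in basis:
        g = g + (inner(f, v, a, b) / inner(v, v, a, b)) * v
    return g

g = least_squares_poly(lambda x: x**2, 0.0, 2.0, 1)
print(g.coefficients)   # close to [2, -2/3], i.e. g(x) = 2x - 2/3
```

The same routine applies unchanged to any continuous $f$ and any degree $k$, at the cost of quadrature error in place of the exact integrals used in the text.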
Example 11.1.2. Consider the function $f(x) = \mathrm{e}^x$ in the interval $[0, 1]$. Suppose that we wish to find a least squares approximation by a polynomial of degree at most 1. In this case, we can take $V = C[0, 1]$, with inner product
$$\langle f, g \rangle = \int_0^1 f(x) g(x) \, dx,$$
and $W = P_1[0, 1]$, with basis $\{1, x\}$. We now apply the Gram-Schmidt orthogonalization process to this basis to obtain an orthogonal basis $\{1, x - 1/2\}$ of $W$, and take
$$g = \frac{\langle \mathrm{e}^x, 1 \rangle}{\|1\|^2} \, 1 + \frac{\langle \mathrm{e}^x, x - 1/2 \rangle}{\|x - 1/2\|^2} \left( x - \frac{1}{2} \right).$$
Now
$$\langle \mathrm{e}^x, 1 \rangle = \int_0^1 \mathrm{e}^x \, dx = \mathrm{e} - 1 \quad \text{and} \quad \langle \mathrm{e}^x, x \rangle = \int_0^1 x \mathrm{e}^x \, dx = 1,$$
so that
$$\left\langle \mathrm{e}^x, x - \frac{1}{2} \right\rangle = \langle \mathrm{e}^x, x \rangle - \frac{1}{2} \langle \mathrm{e}^x, 1 \rangle = \frac{3}{2} - \frac{\mathrm{e}}{2}.$$
Also
$$\|1\|^2 = \langle 1, 1 \rangle = \int_0^1 dx = 1 \quad \text{and} \quad \left\| x - \frac{1}{2} \right\|^2 = \left\langle x - \frac{1}{2}, x - \frac{1}{2} \right\rangle = \int_0^1 \left( x - \frac{1}{2} \right)^2 dx = \frac{1}{12}.$$
It follows that
$$g = (\mathrm{e} - 1) + (18 - 6\mathrm{e}) \left( x - \frac{1}{2} \right) = (18 - 6\mathrm{e}) x + (4\mathrm{e} - 10).$$
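The closing arithmetic of Example 11.1.2 can be double-checked directly: dividing $\langle \mathrm{e}^x, x - 1/2 \rangle = 3/2 - \mathrm{e}/2$ by $\|x - 1/2\|^2 = 1/12$ gives the slope, and expanding the shifted form should recover the stated constant term. A quick sketch:

```python
import math

# Expand g = (e - 1) + (18 - 6e)(x - 1/2) from Example 11.1.2 and confirm
# it matches the stated form (18 - 6e)x + (4e - 10).
slope = 12 * (1.5 - math.e / 2)       # <e^x, x - 1/2> / ||x - 1/2||^2 = (3/2 - e/2) / (1/12)
intercept = (math.e - 1) - slope / 2  # constant term after expanding (x - 1/2)
print(abs(slope - (18 - 6 * math.e)) < 1e-12)      # True
print(abs(intercept - (4 * math.e - 10)) < 1e-12)  # True
```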
Remark. From the proof of Proposition 11A, it is clear that $\|u - w\|$ is minimized by the unique choice $w = \mathrm{proj}_W u$. It follows that the least squares approximation problem posed here has a unique solution.
11.2. Quadratic Forms
A real quadratic form in $n$ variables $x_1, \ldots, x_n$ is an expression of the form
$$\sum_{i=1}^n \sum_{\substack{j=1 \\ i \le j}}^n c_{ij} x_i x_j, \tag{1}$$
where $c_{ij} \in \mathbb{R}$ for every $i \le j$.
Example 11.2.1. The expression $5x_1^2 + 6x_1x_2 + 7x_2^2$ is a quadratic form in two variables $x_1$ and $x_2$. It can be written in the form
$$5x_1^2 + 6x_1x_2 + 7x_2^2 = \begin{pmatrix} x_1 & x_2 \end{pmatrix} \begin{pmatrix} 5 & 3 \\ 3 & 7 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.$$
Example 11.2.2. The expression $4x_1^2 + 5x_2^2 + 3x_3^2 + 2x_1x_2 + 4x_1x_3 + 6x_2x_3$ is a quadratic form in three variables $x_1$, $x_2$ and $x_3$. It can be written in the form
$$4x_1^2 + 5x_2^2 + 3x_3^2 + 2x_1x_2 + 4x_1x_3 + 6x_2x_3 = \begin{pmatrix} x_1 & x_2 & x_3 \end{pmatrix} \begin{pmatrix} 4 & 1 & 2 \\ 1 & 5 & 3 \\ 2 & 3 & 3 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}.$$
Note that in both examples, the quadratic form can be described in terms of a real symmetric matrix. In fact, this is always possible. To see this, note that given any quadratic form (1), we can write, for every $i, j = 1, \ldots, n$,
$$a_{ij} = \begin{cases} c_{ij} & \text{if } i = j, \\ \frac{1}{2} c_{ij} & \text{if } i < j, \\ \frac{1}{2} c_{ji} & \text{if } i > j. \end{cases} \tag{2}$$
Then
$$\sum_{i=1}^n \sum_{\substack{j=1 \\ i \le j}}^n c_{ij} x_i x_j = \sum_{i=1}^n \sum_{j=1}^n a_{ij} x_i x_j = \begin{pmatrix} x_1 & \ldots & x_n \end{pmatrix} \begin{pmatrix} a_{11} & \ldots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \ldots & a_{nn} \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$
The matrix
$$A = \begin{pmatrix} a_{11} & \ldots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \ldots & a_{nn} \end{pmatrix}$$
is clearly symmetric, in view of (2).
We are interested in the case when $x_1, \ldots, x_n$ take real values. In this case, we can write
$$x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$
It follows that a quadratic form can be written as
$$x^t A x,$$
where $A$ is an $n \times n$ real symmetric matrix and $x$ takes values in $\mathbb{R}^n$.
Many problems in mathematics can be studied using quadratic forms. Here we shall restrict our attention to two fundamental problems which are in fact related. The first is the question of what conditions the matrix $A$ must satisfy in order that the inequality
$$x^t A x > 0$$
holds for every non-zero $x \in \mathbb{R}^n$. The second is the question of whether it is possible to have a change of variables of the type $x = P y$, where $P$ is an invertible matrix, so that the quadratic form $x^t A x$ can be written in terms of the variables $y_1, \ldots, y_n$ with no cross terms.
Definition. A quadratic form $x^t A x$ is said to be positive definite if $x^t A x > 0$ for every non-zero $x \in \mathbb{R}^n$. In this case, we say that the symmetric matrix $A$ is a positive definite matrix.
To answer our first question, we shall prove the following result.
PROPOSITION 11B. A quadratic form $x^t A x$ is positive definite if and only if all the eigenvalues of the symmetric matrix $A$ are positive.
Our strategy here is to prove Proposition 11B by first studying our second question. Since the matrix $A$ is real and symmetric, it follows from Proposition 10E that it is orthogonally diagonalizable. In other words, there exists an orthogonal matrix $P$ and a diagonal matrix $D$ such that $P^t A P = D$, and so $A = P D P^t$. It follows that
$$x^t A x = x^t P D P^t x,$$
and so, writing
$$y = P^t x,$$
we have
$$x^t A x = y^t D y.$$
Also, since $P$ is an orthogonal matrix, we also have $x = P y$. This answers our second question.

Furthermore, in view of the Orthogonal diagonalization process, the diagonal entries in the matrix $D$ can be taken to be the eigenvalues of $A$, so that
$$D = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix},$$
where $\lambda_1, \ldots, \lambda_n \in \mathbb{R}$ are the eigenvalues of $A$. Writing
$$y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix},$$
we have
$$x^t A x = y^t D y = \lambda_1 y_1^2 + \ldots + \lambda_n y_n^2. \tag{3}$$
Note now that $x = 0$ if and only if $y = 0$, since $P$ is an invertible matrix. Proposition 11B now follows immediately from (3).
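Proposition 11B translates directly into a numerical test: check that the matrix is symmetric, then inspect its eigenvalues. A minimal sketch (the function name is ours, not the text's):

```python
import numpy as np

def is_positive_definite(A, tol=1e-12):
    # Proposition 11B: x^t A x is positive definite iff all eigenvalues
    # of the symmetric matrix A are positive.
    A = np.asarray(A, dtype=float)
    assert np.allclose(A, A.T), "A must be symmetric"
    return bool(np.all(np.linalg.eigvalsh(A) > tol))

A = np.array([[2, 2, 1], [2, 5, 2], [1, 2, 2]])   # matrix of Example 11.2.3
print(is_positive_definite(A))   # True: eigenvalues 7, 1, 1
B = np.array([[1, 1], [1, 1]])   # matrix of Example 11.2.5
print(is_positive_definite(B))   # False: eigenvalues 2, 0
```

The tolerance `tol` guards against eigenvalues that are zero up to rounding, such as the matrix $B$ above, which is positive semi-definite but not positive definite.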
Example 11.2.3. Consider the quadratic form $2x_1^2 + 5x_2^2 + 2x_3^2 + 4x_1x_2 + 2x_1x_3 + 4x_2x_3$. This can be written in the form $x^t A x$, where
$$A = \begin{pmatrix} 2 & 2 & 1 \\ 2 & 5 & 2 \\ 1 & 2 & 2 \end{pmatrix} \quad \text{and} \quad x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}.$$
The matrix $A$ has eigenvalues $\lambda_1 = 7$ and (double root) $\lambda_2 = \lambda_3 = 1$; see Example 10.3.1. Furthermore, we have $P^t A P = D$, where
$$P = \begin{pmatrix} 1/\sqrt{6} & 1/\sqrt{2} & 1/\sqrt{3} \\ 2/\sqrt{6} & 0 & -1/\sqrt{3} \\ 1/\sqrt{6} & -1/\sqrt{2} & 1/\sqrt{3} \end{pmatrix} \quad \text{and} \quad D = \begin{pmatrix} 7 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
Writing $y = P^t x$, the quadratic form becomes $7y_1^2 + y_2^2 + y_3^2$, which is clearly positive definite.
Example 11.2.4. Consider the quadratic form $5x_1^2 + 6x_2^2 + 7x_3^2 - 4x_1x_2 + 4x_2x_3$. This can be written in the form $x^t A x$, where
$$A = \begin{pmatrix} 5 & -2 & 0 \\ -2 & 6 & 2 \\ 0 & 2 & 7 \end{pmatrix} \quad \text{and} \quad x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}.$$
The matrix $A$ has eigenvalues $\lambda_1 = 3$, $\lambda_2 = 6$ and $\lambda_3 = 9$; see Example 10.3.3. Furthermore, we have $P^t A P = D$, where
$$P = \begin{pmatrix} 2/3 & 2/3 & -1/3 \\ 2/3 & -1/3 & 2/3 \\ -1/3 & 2/3 & 2/3 \end{pmatrix} \quad \text{and} \quad D = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 9 \end{pmatrix}.$$
Writing $y = P^t x$, the quadratic form becomes $3y_1^2 + 6y_2^2 + 9y_3^2$, which is clearly positive definite.
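The change of variables in Example 11.2.4 can be reproduced with a library eigendecomposition; `numpy.linalg.eigh` returns an orthogonal $P$ whose columns are orthonormal eigenvectors (possibly differing from the $P$ in the text by sign or column order, which does not affect the diagonal form):

```python
import numpy as np

# Orthogonally diagonalize A from Example 11.2.4 and confirm P^t A P = D,
# so that x^t A x = y^t D y with y = P^t x.
A = np.array([[5.0, -2.0, 0.0], [-2.0, 6.0, 2.0], [0.0, 2.0, 7.0]])
eigvals, P = np.linalg.eigh(A)           # eigenvalues in ascending order
D = P.T @ A @ P
print(np.round(eigvals, 6))              # eigenvalues 3, 6, 9
print(np.allclose(D, np.diag(eigvals)))  # True
```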
Example 11.2.5. Consider the quadratic form $x_1^2 + x_2^2 + 2x_1x_2$. Clearly this is equal to $(x_1 + x_2)^2$ and is therefore not positive definite. The quadratic form can be written in the form $x^t A x$, where
$$A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \quad \text{and} \quad x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.$$
It follows from Proposition 11B that the eigenvalues of $A$ are not all positive. Indeed, the matrix $A$ has eigenvalues $\lambda_1 = 2$ and $\lambda_2 = 0$, with corresponding eigenvectors
$$\begin{pmatrix} 1 \\ 1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 \\ -1 \end{pmatrix}.$$
Hence we may take
$$P = \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix} \quad \text{and} \quad D = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}.$$
Writing $y = P^t x$, the quadratic form becomes $2y_1^2$, which is not positive definite.
11.3. Real Fourier Series
Let $E$ denote the collection of all functions $f : [-\pi, \pi] \to \mathbb{R}$ which are piecewise continuous on the interval $[-\pi, \pi]$. This means that any $f \in E$ has at most a finite number of points of discontinuity, at each of which $f$ need not be defined but must have one sided limits which are finite. We further adopt the convention that any two functions $f, g \in E$ are considered equal, denoted by $f = g$, if $f(x) = g(x)$ for every $x \in [-\pi, \pi]$ with at most a finite number of exceptions.

It is easy to check that $E$ forms a real vector space. More precisely, let $\lambda \in E$ denote the function $\lambda : [-\pi, \pi] \to \mathbb{R}$, where $\lambda(x) = 0$ for every $x \in [-\pi, \pi]$. Then the following conditions hold:
• For every $f, g \in E$, we have $f + g \in E$.
• For every $f, g, h \in E$, we have $f + (g + h) = (f + g) + h$.
• For every $f \in E$, we have $f + \lambda = \lambda + f = f$.
• For every $f \in E$, we have $f + (-f) = \lambda$.
• For every $f, g \in E$, we have $f + g = g + f$.
• For every $c \in \mathbb{R}$ and $f \in E$, we have $cf \in E$.
• For every $c \in \mathbb{R}$ and $f, g \in E$, we have $c(f + g) = cf + cg$.
• For every $a, b \in \mathbb{R}$ and $f \in E$, we have $(a + b)f = af + bf$.
• For every $a, b \in \mathbb{R}$ and $f \in E$, we have $(ab)f = a(bf)$.
We now give this vector space $E$ more structure by introducing an inner product. For every $f, g \in E$, write
$$\langle f, g \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) g(x) \, dx.$$
The integral exists since the function $f(x) g(x)$ is clearly piecewise continuous on $[-\pi, \pi]$. It is easy to check that the following conditions hold:
• For every $f, g \in E$, we have $\langle f, g \rangle = \langle g, f \rangle$.
• For every $f, g, h \in E$, we have $\langle f, g + h \rangle = \langle f, g \rangle + \langle f, h \rangle$.
• For every $f, g \in E$ and $c \in \mathbb{R}$, we have $c \langle f, g \rangle = \langle cf, g \rangle$.
• For every $f \in E$, we have $\langle f, f \rangle \ge 0$, and $\langle f, f \rangle = 0$ if and only if $f = \lambda$.
Hence $E$ is a real inner product space.
The difficulty here is that the inner product space $E$ is not finite-dimensional. It is not straightforward to show that the set
$$\frac{1}{\sqrt{2}}, \; \sin x, \; \cos x, \; \sin 2x, \; \cos 2x, \; \sin 3x, \; \cos 3x, \; \ldots \tag{4}$$
in $E$ forms an orthonormal “basis” for $E$. The difficulty is to show that the set spans $E$.
Remark. It is easy to check that the elements in (4) form an orthonormal “system”. For every $k, m \in \mathbb{N}$, we have
$$\left\langle \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right\rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} \frac{1}{2} \, dx = 1;$$
$$\left\langle \frac{1}{\sqrt{2}}, \sin kx \right\rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} \frac{1}{\sqrt{2}} \sin kx \, dx = 0;$$
$$\left\langle \frac{1}{\sqrt{2}}, \cos kx \right\rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} \frac{1}{\sqrt{2}} \cos kx \, dx = 0;$$
as well as
$$\langle \sin kx, \sin mx \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} \sin kx \sin mx \, dx = \frac{1}{\pi} \int_{-\pi}^{\pi} \frac{1}{2} \big( \cos(k - m)x - \cos(k + m)x \big) \, dx = \begin{cases} 1 & \text{if } k = m, \\ 0 & \text{if } k \ne m; \end{cases}$$
$$\langle \cos kx, \cos mx \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} \cos kx \cos mx \, dx = \frac{1}{\pi} \int_{-\pi}^{\pi} \frac{1}{2} \big( \cos(k - m)x + \cos(k + m)x \big) \, dx = \begin{cases} 1 & \text{if } k = m, \\ 0 & \text{if } k \ne m; \end{cases}$$
and
$$\langle \sin kx, \cos mx \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} \sin kx \cos mx \, dx = \frac{1}{\pi} \int_{-\pi}^{\pi} \frac{1}{2} \big( \sin(k - m)x + \sin(k + m)x \big) \, dx = 0.$$
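These orthonormality computations can be spot-checked numerically. The sketch below builds the Gram matrix of the first few elements of the system (4) under $\langle f, g \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) g(x) \, dx$ and compares it with the identity matrix:

```python
import numpy as np

# Numerical check that 1/sqrt(2), sin x, cos x, ..., sin 3x, cos 3x are
# orthonormal under <f, g> = (1/pi) * integral of f g over [-pi, pi].
x = np.linspace(-np.pi, np.pi, 200001)

def ip(f_vals, g_vals):
    return np.trapz(f_vals * g_vals, x) / np.pi

funcs = [np.full_like(x, 1 / np.sqrt(2))]
for k in range(1, 4):
    funcs.append(np.sin(k * x))
    funcs.append(np.cos(k * x))

G = np.array([[ip(f, g) for g in funcs] for f in funcs])  # Gram matrix
print(np.allclose(G, np.eye(len(funcs)), atol=1e-8))      # True
```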
Let us assume that we have established that the set (4) forms an orthonormal basis for $E$. Then a natural extension of Proposition 9H gives rise to the following: Every function $f \in E$ can be written uniquely in the form
$$\frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx), \tag{5}$$
known usually as the (trigonometric) Fourier series of the function $f$, with Fourier coefficients
$$\frac{a_0}{\sqrt{2}} = \left\langle f, \frac{1}{\sqrt{2}} \right\rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} \frac{f(x)}{\sqrt{2}} \, dx$$
and, for every $n \in \mathbb{N}$,
$$a_n = \langle f, \cos nx \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos nx \, dx \quad \text{and} \quad b_n = \langle f, \sin nx \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin nx \, dx.$$
Note that the constant term in the Fourier series (5) is given by
$$\left\langle f, \frac{1}{\sqrt{2}} \right\rangle \frac{1}{\sqrt{2}} = \frac{a_0}{2}.$$
Example 11.3.1. Consider the function $f : [-\pi, \pi] \to \mathbb{R}$, given by $f(x) = x$ for every $x \in [-\pi, \pi]$. For every $n \in \mathbb{N} \cup \{0\}$, we have
$$a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} x \cos nx \, dx = 0,$$
since the integrand is an odd function. On the other hand, for every $n \in \mathbb{N}$, we have
$$b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} x \sin nx \, dx = \frac{2}{\pi} \int_0^{\pi} x \sin nx \, dx,$$
since the integrand is an even function. On integrating by parts, we have
$$b_n = \frac{2}{\pi} \left( \left[ -\frac{x \cos nx}{n} \right]_0^{\pi} + \int_0^{\pi} \frac{\cos nx}{n} \, dx \right) = \frac{2}{\pi} \left( \left[ -\frac{x \cos nx}{n} \right]_0^{\pi} + \left[ \frac{\sin nx}{n^2} \right]_0^{\pi} \right) = \frac{2(-1)^{n+1}}{n}.$$
We therefore have the (trigonometric) Fourier series
$$\sum_{n=1}^{\infty} \frac{2(-1)^{n+1}}{n} \sin nx.$$
Note that the function $f$ is odd, and this plays a crucial role in the vanishing of the Fourier coefficients $a_n$ corresponding to the even part of the Fourier series.
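The coefficients $b_n = 2(-1)^{n+1}/n$ of Example 11.3.1 can be checked by numerical quadrature:

```python
import numpy as np

# Numerically compute b_n = (1/pi) * integral of x sin(nx) over [-pi, pi]
# and compare with the closed form 2(-1)^(n+1)/n from Example 11.3.1.
x = np.linspace(-np.pi, np.pi, 400001)
for n in range(1, 6):
    bn = np.trapz(x * np.sin(n * x), x) / np.pi
    assert abs(bn - 2 * (-1) ** (n + 1) / n) < 1e-6
print("b_1, ..., b_5 match 2(-1)^(n+1)/n")
```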
Example 11.3.2. Consider the function $f : [-\pi, \pi] \to \mathbb{R}$, given by $f(x) = |x|$ for every $x \in [-\pi, \pi]$. For every $n \in \mathbb{N} \cup \{0\}$, we have
$$a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} |x| \cos nx \, dx = \frac{2}{\pi} \int_0^{\pi} x \cos nx \, dx,$$
since the integrand is an even function. Clearly
$$a_0 = \frac{2}{\pi} \int_0^{\pi} x \, dx = \pi.$$
Furthermore, for every $n \in \mathbb{N}$, on integrating by parts, we have
$$a_n = \frac{2}{\pi} \left( \left[ \frac{x \sin nx}{n} \right]_0^{\pi} - \int_0^{\pi} \frac{\sin nx}{n} \, dx \right) = \frac{2}{\pi} \left( \left[ \frac{x \sin nx}{n} \right]_0^{\pi} + \left[ \frac{\cos nx}{n^2} \right]_0^{\pi} \right) = \begin{cases} 0 & \text{if } n \text{ is even}, \\ -\dfrac{4}{\pi n^2} & \text{if } n \text{ is odd}. \end{cases}$$
On the other hand, for every $n \in \mathbb{N}$, we have
$$b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} |x| \sin nx \, dx = 0,$$
since the integrand is an odd function. We therefore have the (trigonometric) Fourier series
$$\frac{\pi}{2} - \sum_{\substack{n=1 \\ n \text{ odd}}}^{\infty} \frac{4}{\pi n^2} \cos nx = \frac{\pi}{2} - \sum_{k=1}^{\infty} \frac{4}{\pi (2k-1)^2} \cos(2k-1)x.$$
Note that the function $f$ is even, and this plays a crucial role in the vanishing of the Fourier coefficients $b_n$ corresponding to the odd part of the Fourier series.
Example 11.3.3. Consider the function $f : [-\pi, \pi] \to \mathbb{R}$, given for every $x \in [-\pi, \pi]$ by
$$f(x) = \mathrm{sgn}(x) = \begin{cases} +1 & \text{if } 0 < x \le \pi, \\ 0 & \text{if } x = 0, \\ -1 & \text{if } -\pi \le x < 0. \end{cases}$$
For every $n \in \mathbb{N} \cup \{0\}$, we have
$$a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} \mathrm{sgn}(x) \cos nx \, dx = 0,$$
since the integrand is an odd function. On the other hand, for every $n \in \mathbb{N}$, we have
$$b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} \mathrm{sgn}(x) \sin nx \, dx = \frac{2}{\pi} \int_0^{\pi} \sin nx \, dx,$$
since the integrand is an even function. It is easy to see that
$$b_n = -\frac{2}{\pi} \left[ \frac{\cos nx}{n} \right]_0^{\pi} = \begin{cases} 0 & \text{if } n \text{ is even}, \\ \dfrac{4}{\pi n} & \text{if } n \text{ is odd}. \end{cases}$$
We therefore have the (trigonometric) Fourier series
$$\sum_{\substack{n=1 \\ n \text{ odd}}}^{\infty} \frac{4}{\pi n} \sin nx = \sum_{k=1}^{\infty} \frac{4}{\pi (2k-1)} \sin(2k-1)x.$$
Example 11.3.4. Consider the function $f : [-\pi, \pi] \to \mathbb{R}$, given by $f(x) = x^2$ for every $x \in [-\pi, \pi]$. For every $n \in \mathbb{N} \cup \{0\}$, we have
$$a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} x^2 \cos nx \, dx = \frac{2}{\pi} \int_0^{\pi} x^2 \cos nx \, dx,$$
since the integrand is an even function. Clearly
$$a_0 = \frac{2}{\pi} \int_0^{\pi} x^2 \, dx = \frac{2\pi^2}{3}.$$
Furthermore, for every $n \in \mathbb{N}$, on integrating by parts, we have
$$a_n = \frac{2}{\pi} \left( \left[ \frac{x^2 \sin nx}{n} \right]_0^{\pi} - \int_0^{\pi} \frac{2x \sin nx}{n} \, dx \right) = \frac{2}{\pi} \left( \left[ \frac{x^2 \sin nx}{n} \right]_0^{\pi} + \left[ \frac{2x \cos nx}{n^2} \right]_0^{\pi} - \int_0^{\pi} \frac{2 \cos nx}{n^2} \, dx \right)$$
$$= \frac{2}{\pi} \left( \left[ \frac{x^2 \sin nx}{n} \right]_0^{\pi} + \left[ \frac{2x \cos nx}{n^2} \right]_0^{\pi} - \left[ \frac{2 \sin nx}{n^3} \right]_0^{\pi} \right) = \frac{4(-1)^n}{n^2}.$$
On the other hand, for every $n \in \mathbb{N}$, we have
$$b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} x^2 \sin nx \, dx = 0,$$
since the integrand is an odd function. We therefore have the (trigonometric) Fourier series
$$\frac{\pi^2}{3} + \sum_{n=1}^{\infty} \frac{4(-1)^n}{n^2} \cos nx.$$
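Truncating the series of Example 11.3.4 at $N$ terms gives an approximation to $x^2$ whose worst error, attained at $x = \pm\pi$ where every term has the same sign, is bounded by the tail $\sum_{n > N} 4/n^2 \approx 4/N$. A quick sketch:

```python
import numpy as np

# Partial sum of pi^2/3 + sum 4(-1)^n cos(nx)/n^2 compared against x^2.
# The tail bound sum_{n>N} 4/n^2 ~ 4/N = 0.004 for N = 1000.
x = np.linspace(-np.pi, np.pi, 201)
N = 1000
s = np.pi ** 2 / 3 + sum(4 * (-1) ** n / n ** 2 * np.cos(n * x) for n in range(1, N + 1))
err = np.max(np.abs(s - x ** 2))
print(err < 0.01)   # True
```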
Problems for Chapter 11
1. Consider the function $f : [-1, 1] \to \mathbb{R} : x \mapsto x^3$. We wish to find a polynomial $g(x) = ax + b$ which minimizes the error
$$\int_{-1}^{1} |f(x) - g(x)|^2 \, dx.$$
Follow the steps below to find this polynomial $g$:
a) Consider the real vector space $C[-1, 1]$. Write down a suitable real inner product on $C[-1, 1]$ for this problem, explaining carefully the steps that you take.
b) Consider now the subspace $P_1[-1, 1]$ of all polynomials of degree at most 1. Describe the polynomial $g$ in terms of $f$ and orthogonal projection with respect to the inner product in part (a). Give a brief explanation for your choice.
c) Write down a basis of $P_1[-1, 1]$.
d) Apply the Gram-Schmidt process to your basis in part (c) to obtain an orthogonal basis of $P_1[-1, 1]$.
e) Describe your polynomial in part (b) as a linear combination of the elements of your basis in part (d), and find the precise values of the coefficients.
2. For each of the following functions, find the best least squares approximation by linear polynomials of the form $ax + b$, where $a, b \in \mathbb{R}$:
a) $f : [0, \pi/2] \to \mathbb{R} : x \mapsto \sin x$
b) $f : [0, 1] \to \mathbb{R} : x \mapsto x^3$
c) $f : [0, 2] \to \mathbb{R} : x \mapsto \mathrm{e}^x$
3. Consider the quadratic form $2x_1^2 + x_2^2 + x_3^2 + 2x_1x_2 + 2x_1x_3$ in three variables $x_1, x_2, x_3$.
a) Write the quadratic form in the form $x^t A x$, where
$$x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$$
and where $A$ is a symmetric matrix with real entries.
b) Apply the Orthogonal diagonalization process to the matrix $A$.
c) Find a transformation of the type $x = P y$, where $P$ is an invertible matrix, so that the quadratic form can be written as $y^t D y$, where
$$y = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}$$
and where $D$ is a diagonal matrix with real entries. You should give the matrices $P$ and $D$ explicitly.
d) Is the quadratic form positive definite? Justify your assertion both in terms of the eigenvalues of $A$ and in terms of your solution to part (c).
4. For each of the following quadratic forms in three variables, write it in the form $x^t A x$, find a substitution $x = P y$ so that it can be written as a diagonal form in the variables $y_1, y_2, y_3$, and determine whether the quadratic form is positive definite:
a) $x_1^2 + x_2^2 + 2x_3^2 - 2x_1x_2 + 4x_1x_3 + 4x_2x_3$
b) $3x_1^2 + 2x_2^2 + 3x_3^2 + 2x_1x_3$
c) $3x_1^2 + 5x_2^2 + 4x_3^2 + 4x_1x_3 - 4x_2x_3$
d) $5x_1^2 + 2x_2^2 + 5x_3^2 + 4x_1x_2 - 8x_1x_3 - 4x_2x_3$
e) $x_1^2$
5. Determine which of the following matrices are positive definite:
a) $\begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}$
b) $\begin{pmatrix} 3 & 1 & 1 \\ 1 & 1 & 2 \\ 1 & 2 & 1 \end{pmatrix}$
c) $\begin{pmatrix} 6 & 1 & 7 \\ 1 & 1 & 2 \\ 7 & 2 & 9 \end{pmatrix}$
d) $\begin{pmatrix} 6 & -2 & -1 \\ -2 & 6 & -1 \\ -1 & -1 & 5 \end{pmatrix}$
e) $\begin{pmatrix} 3 & -2 & 4 \\ -2 & 6 & 2 \\ 4 & 2 & 3 \end{pmatrix}$
f) $\begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 2 & 0 \\ 0 & 1 & 0 & 1 \end{pmatrix}$
6. Find the trigonometric Fourier series for each of the following functions $f : [-\pi, \pi] \to \mathbb{R}$:
a) $f(x) = x|x|$ for every $x \in [-\pi, \pi]$
b) $f(x) = |\sin x|$ for every $x \in [-\pi, \pi]$
c) $f(x) = |\cos x|$ for every $x \in [-\pi, \pi]$
d) $f(x) = 0$ for every $x \in [-\pi, 0]$ and $f(x) = x$ for every $x \in (0, \pi]$
e) $f(x) = \sin x$ for every $x \in [-\pi, 0]$ and $f(x) = \cos x$ for every $x \in (0, \pi]$
f) $f(x) = \cos x$ for every $x \in [-\pi, 0]$ and $f(x) = \sin x$ for every $x \in (0, \pi]$
g) $f(x) = \cos(x/2)$ for every $x \in [-\pi, \pi]$