
FEDERAL UNIVERSITY OF SANTA CATARINA
TECHNOLOGICAL CENTER
ELECTRICAL AND ELECTRONICS ENGINEERING DEPARTMENT

New Insights on MSE Behavior of the Widely-Linear LMS Algorithm

Undergraduate Thesis presented to the Federal University of Santa Catarina as a requisite for the bachelor's degree in Electronics Engineering.

Enrique Theisen Rodrigues Pinto

Advisor: Leonardo S. Resende


ENRIQUE THEISEN RODRIGUES PINTO

NEW INSIGHTS ON MSE BEHAVIOR OF THE

WIDELY-LINEAR LMS ALGORITHM

Undergraduate Thesis presented to the Federal University of Santa Catarina as a requisite for the bachelor's degree in Electronics Engineering.

Advisor: Leonardo S. Resende.

FLORIANÓPOLIS

2019


Catalog record of the work prepared by the author through the Automatic Generation Program of the UFSC University Library.

Pinto, Enrique Theisen Rodrigues

New Insights on MSE Behavior of the Widely Linear LMS Algorithm / Enrique Theisen Rodrigues Pinto ; advisor, Leonardo Silva Resende, 2019.

46 p.

Undergraduate thesis - Universidade Federal de Santa Catarina, Centro Tecnológico, Undergraduate Program in Electronics Engineering, Florianópolis, 2019. Includes references.

1. Electronics Engineering. 2. WL-LMS. 3. Improper signal processing. 4. Mean Square Error. I. Resende, Leonardo Silva. II. Universidade Federal de Santa Catarina. Undergraduate Program in Electronics Engineering. III. Title.


Acknowledgements

Dedicated to:

My parents, for the foundation. My girlfriend, for the affection. My friends, for the joy. My teachers, for the direction.


RESUMO

This work presents a new, complete statistical model of the mean square error (MSE) for widely linear adaptive Wiener filtering involving improper signals. Besides the complete MSE model, the contributions include new tools for the comparative characterization of the steady-state performance of the widely linear LMS algorithm (WL-LMS) relative to the traditional LMS algorithm. Theory and simulations concerning linear AWGN channel equalization and widely linear system identification are presented and analysed.

Keywords: Widely linear processing, impropriety,


ABSTRACT

This work presents a novel complete statistical model of the mean square error (MSE) for widely linear adaptive Wiener filtering, which involves improper signals. Amongst the contributions are, in addition to the complete MSE model, new tools for comparative characterization of the steady state performance of the widely linear LMS algorithm in relation to the traditional LMS algorithm. Theory and simulations concerning linear AWGN channel equalization and widely linear system identification are presented and analysed.

Keywords: Widely linear processing, impropriety, LMS, WL-LMS,


List of Figures

3.1 Widely Linear Wiener filtering scheme, where (*) represents complex conjugation, and ν(n) and m(n) are additive Gaussian noise signals.
4.1 Used 4-QAM constellation with gray encoding.
4.2 Frequency response for c1.
4.3 Ψ for the c1 impulse response with σ_ν² ranging from 10⁻⁴ to 10² and ρ_u ranging from 0 to 1.
4.4 Zoomed-in Ψ for the c1 impulse response with σ_ν² ranging from 10⁻⁶ to 1 and ρ_u ranging from 0.95 to 1.
4.5 MSE convergence curve with ρ_u = 1, σ_ν² = 0.01, n_e = 100.
4.6 Frequency response of channel c2.
4.7 Minimum MSE ratio surface for the c2 impulse response with σ_ν² ranging from 10⁻⁸ to 10² and ρ ranging from 0 to 1.
4.8 c2 MSE convergence curve with ρ_u = 0.7, σ_ν² = 0.01, n_e = 300.
4.9 c2 MSE convergence curve with ρ_u = 0.9, σ_ν² = 0.0001, n_e = 50.
4.10 c2 MSE convergence curve with ρ_u = 1, σ_ν² = 10⁻⁸, n_e = 5.
4.11 c2 MSE convergence curve with ρ_u = 0, σ_ν² = 10⁻⁵, n_e = 300.
4.12 Ψ for the described widely-linear system to be identified.
4.13 MSE convergence curve for system identification with ρ_x = 0, σ_ν² …
4.14 MSE convergence curve for system identification with ρ_x = 1, σ_ν² = 0.0001, n_e = 100.
4.15 MSE convergence curve for system identification with ρ_x = 0.9, σ_ν² …

Acronyms

WLMSE Widely Linear Mean Square Estimation.
WL-LMS Widely Linear Least Mean Squares.
CLMS Complex Least Mean Squares.
MSE Mean Square Error.
MMSE Minimum Mean Square Error.
CAWGN Circular Additive White Gaussian Noise.
QAM Quadrature Amplitude Modulation.


Contents

1 Introduction
1.1 Objectives
1.1.1 General Objectives
1.1.2 Specific Objectives
2 Literature Review and Theoretical Introduction
2.1 Widely Linear Estimation
3 Methodology
3.1 Methodological Concerns
3.2 Theory and Simulation Procedures
4 Analysis
4.1 Statistical Analysis
4.1.1 Simplifying Assumptions
4.1.2 Mean Weight Vector Behavior
4.1.3 Mean Weight Error Vector Behavior
4.1.4 Mean Square Error Behavior
4.1.5 Excess MSE
4.1.6 Misadjustment
4.1.7 Algorithm
4.2 Application Specific Theory
4.2.1 Equalization
4.2.2 System Identification
4.3 Simulation
4.3.1 Equalization
4.3.2 System Identification
5 Conclusion
Bibliography
A Derivation 1
B Derivation 2
C Derivation 3


CHAPTER 1

Introduction

The technique of Widely Linear Mean Square Estimation (WLMSE) was introduced by B. Picinbono and P. Chevalier in [1] with the purpose of attaining improved estimation performance, compared to the strictly linear estimator, when dealing with noncircular signals. Deriving a widely linear algorithm based on the popular Complex Least Mean Squares (CLMS) algorithm, introduced by B. Widrow in [2], is straightforward and yields the algorithm henceforth referred to as Widely Linear Least Mean Squares (WL-LMS).

The question lies in whether the WL-LMS algorithm provides advantages over the ordinary CLMS in terms of convergence speed and Minimum Mean Square Error (MMSE). Efforts have been made to supply tools for analysing the difference between CLMS and WL-LMS behaviour.

Furthermore, the lack of a complete WL-LMS transient MSE statistical model in the literature motivates us to develop such a model, so that WL-LMS convergence may be analysed in more detail without the need for computationally intensive statistical simulations.


1.1 Objectives

1.1.1 General Objectives

This work aims to develop and present novel approaches for understanding the transient and steady-state characteristics of the WL-LMS MSE.

1.1.2 Specific Objectives

• Establish an accurate model for the WL-LMS transient MSE, so that statistical simulations with multiple random process realizations are not necessary to characterize it accurately.

• Introduce the MMSE ratio surface to visualize the steady-state advantage of the WL-LMS over the CLMS.

• Present simulation results to demonstrate and exemplify the application of the described techniques.

• Detail the derivations of the relevant equations and expressions in an instructive manner.


CHAPTER 2

Literature Review and Theoretical Introduction

Since its introduction in [1], several works have analysed the structure and applications of the WL-LMS algorithm, such as [3], [4], [5] and [6]. Among the cited examples, [4] and [5] mainly provide analysis based on the Strong Uncorrelating Transform (SUT) as a way to tap into the eigenstructure of the input vector correlation matrix. With a focus on application, [3] introduces widely linear prediction. As a solid framework, [7] details the theoretical foundations and tools of complex signal processing, including noncircular signals and the widely linear approach.

2.1 Widely Linear Estimation

A summary of the introductory theory of widely linear adaptive filtering is presented in this section.

Based on the popular Complex Least Mean Squares (CLMS) algorithm, introduced by B. Widrow in [2], one can easily derive the Widely Linear LMS (WL-LMS) algorithm.


The CLMS algorithm is given by

y(n) = w^H(n) x(n)    (2.1)

e(n) = d(n) − y(n)    (2.2)

w(n+1) = w(n) + µ e*(n) x(n)    (2.3)

where w(n) = [w_0(n) w_1(n) ... w_{N−1}(n)]^T is the filter coefficient vector, x(n) = [x_0(n) x_1(n) ... x_{N−1}(n)]^T is the input signal vector, y(n) is the output signal, d(n) is the desired signal, e(n) is the error signal, and µ is the step size.

Defining the extended, i.e. widely linear, input signal and filter coefficient vectors

x_e(n) = [x^T(n)  x^H(n)]^T    (2.4)

w_e(n) = [f^T(n)  g^T(n)]^T    (2.5)

where f(n) and g(n) are length-N filter coefficient vectors, and applying the necessary corrections to fit our definitions, the WL-LMS algorithm is defined in [6] as:

y(n) = w_e^H(n) x_e(n)    (2.6)

e(n) = d(n) − y(n)    (2.7)

w_e(n+1) = w_e(n) + µ e*(n) x_e(n)    (2.8)

with notation following closely that of the CLMS. Likewise, the derivation of most results related to the WL-LMS algorithm is essentially identical to the CLMS case. It is straightforward to prove that, when the WL-LMS algorithm converges, it converges on average to the optimum weight coefficient vector in the Mean Square Error (MSE) sense, given by:

w_eo = R_e^{−1} p_e    (2.9)

where R_e = E{x_e(n) x_e^H(n)} is the extended input autocorrelation matrix and p_e = E{x_e(n) d*(n)} is the extended cross-correlation vector, where E{·} denotes the expected value operator. It follows from the
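Since Chapter 3 notes that a language such as Python is a viable alternative to the MATLAB environment used in this work, the recursion (2.6)-(2.8) can be sketched as follows. This is an illustrative NumPy sketch, not code from this thesis: the function name `wl_lms` and its signature are assumed, and the error is conjugated in the update, the form consistent with y(n) = w_e^H(n) x_e(n).

```python
import numpy as np

def wl_lms(x, d, N, mu):
    """One realization of the WL-LMS adaptation (2.6)-(2.8).

    x : complex input sequence; d : desired sequence, aligned so that d[k]
    pairs with the tap-delay-line vector ending at sample N-1+k;
    N : taps per branch (f and g); mu : step size.
    Returns the final extended weight vector we = [f; g] and the errors."""
    we = np.zeros(2 * N, dtype=complex)
    err = np.empty(len(x) - N + 1, dtype=complex)
    for k in range(len(err)):
        xn = x[k:k + N][::-1]                  # tap-delay-line vector x(n)
        xe = np.concatenate([xn, xn.conj()])   # extended input, eq. (2.4)
        y = np.vdot(we, xe)                    # y(n) = we^H xe, eq. (2.6)
        e = d[k] - y                           # eq. (2.7)
        we = we + mu * np.conj(e) * xe         # update, eq. (2.8)
        err[k] = e
    return we, err
```

With an improper input and a widely linear relation between d(n) and x(n), the weights approach the optimum w_eo = R_e^{−1} p_e of (2.9) on average.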


definition that R_e may be rewritten as

R_e = [ R_xx    Q_xx
        Q_xx*   R_xx* ]    (2.10)

with R_xx = E{x(n) x^H(n)} and Q_xx = E{x(n) x^T(n)} being the input autocorrelation matrix and pseudo-correlation matrix, respectively. In the same manner, p_e may be expressed as

p_e = [ p_xd^T  q_xd^H ]^T    (2.11)

with p_xd = E{x(n) d*(n)} and q_xd = E{x(n) d(n)}, henceforth referred to as the input cross-correlation vector and the input pseudo-correlation vector, respectively.

It is simple to prove that the MSE cost function J = E{|e(n)|²} is minimized when w_e = w_eo. The minimum MSE is then

J^e_min = σ_d² − p_e^H R_e^{−1} p_e = σ_d² − p_e^H w_eo    (2.12)

where σ_d² is the variance of the desired signal, assuming it has zero mean. It follows as well that J^e_min never exceeds the CLMS minimum Mean Square Error (see Appendix A), given by:

J_min = σ_d² − p_xd^H R_xx^{−1} p_xd = σ_d² − p_xd^H w_o    (2.13)
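As a worked check of (2.12) and (2.13), the following illustrative NumPy sketch (not code from this work; all statistics are arbitrary assumed values) evaluates both minima for a toy scalar improper source and confirms that the widely linear minimum is the smaller of the two:

```python
import numpy as np

# Toy scalar (N = 1) second-order statistics of an improper input x:
rxx = np.array([[1.0 + 0j]])          # Rxx = E{x x^H}
qxx = np.array([[0.9j]])              # Qxx = E{x x^T}, nonzero -> improper
Re = np.block([[rxx, qxx], [qxx.conj(), rxx.conj()]])   # eq. (2.10)

# Assume d = f* x + g* x* + m with independent noise of power 0.01, so that
# pe = Re [f; g] and sigma_d^2 = [f; g]^H Re [f; g] + 0.01.
fg = np.array([1.0, 0.5])
pe = Re @ fg
sigma_d2 = (fg.conj() @ Re @ fg).real + 0.01

J_e_min = sigma_d2 - (pe.conj() @ np.linalg.solve(Re, pe)).real    # eq. (2.12)
pxd = pe[:1]                                                       # top block of pe
J_min = sigma_d2 - (pxd.conj() @ np.linalg.solve(rxx, pxd)).real   # eq. (2.13)
```

Here J_e_min recovers the 0.01 noise floor exactly, while the strictly linear J_min stays larger because it ignores the pseudo-correlation information.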


CHAPTER 3

Methodology

3.1 Methodological Concerns

Initial bibliographical research revealed the absence of an available WL-LMS MSE model. Since a solid MSE model is important for effectively characterizing the adaptation algorithm, the benefits it may provide in enabling simpler adaptive filter design methodologies motivated its development. The conducted research is mainly focused on obtaining an MSE model for the WL-LMS algorithm, as well as on verifying the accuracy of such a model. The secondary focus was comparing the performance of the WL-LMS and the CLMS and introducing a way of visualizing the steady-state advantage of the widely linear approach as a function of the signal impropriety coefficient and the additive noise variance.

Extensive simulations and theoretical analysis were conducted, concentrating on:

• Verifying that the simulation results agree at least minimally with the theory, as failures in this regard would indicate theoretical or simulation mistakes.


• Developing insight and maturity in both simulation contexts: equalization and system identification.

• Validating the MSE model under a multitude of conditions. Even though a particular simulation run may have been conducted to verify other aspects of the WL-LMS, e.g. convergence speed or MMSE, the accuracy of the model was still verified, ensuring that it was tested many times in various scenarios.

• Finding applicable performance indicators for the WL-LMS algorithm, so that one can predict its general behaviour and advantages when compared to the CLMS.

• Checking the validity of potential performance indicators.

3.2 Theory and Simulation Procedures

The MSE model was derived analytically under assumptions and with a procedure similar to those used in [9] to obtain an analogous model for the CLMS. While the majority of the procedure is identical, the WL-LMS MSE derivations must take into account the impropriety of some variables, i.e. E{x(n) x^T(n)} ≠ 0, for better accuracy.

Figure 3.1: Widely Linear Wiener filtering scheme, where (∗) represents complex conjugation, and ν(n) and m(n) are additive Gaussian noise signals.


All equalization and system identification simulations were conducted in MATLAB R2017b. This choice was mainly due to familiarity with the tool and its resources for producing quality plots. However, there is no inherent necessity for using MATLAB; it may even be preferable, in terms of performance, to consider a high-level object-oriented programming language such as Python, which provides a good balance between execution speed and development time, allied with versatile standard and community-developed libraries.

Each simulation consists of an ensemble of realizations of the random process of widely linear FIR Wiener filter adaptation with random inputs and circular additive white Gaussian noise, such as the one presented in Fig. 3.1. The ensemble average of each iteration's squared error then provides the estimate of the MSE at that iteration.


CHAPTER 4

Analysis

4.1 Statistical Analysis

For generality, to encompass all applications of adaptive Wiener filtering, a random process m(n) is added to the desired response d(n), modeling measurement noise. The error signal at instant n is then given by

e(n) = d(n) + m(n) − w_e^H(n) x_e(n)    (4.1)

4.1.1 Simplifying Assumptions

For mathematical tractability, the analysis is performed under the following set of simplifying assumptions:

A1 x(n) and d(n) are zero-mean complex-valued Gaussian random processes;

A2 m(n) is a zero-mean complex-valued white Gaussian random process, which is statistically independent of any other random signal;

A3 The statistical dependence between x_e(n) and w_e(n) can be neglected.


4.1.2 Mean Weight Vector Behavior

Taking the expected value on both sides of (2.8), and using the above simplifying assumptions, leads to the recursive equation for the transient behavior of w_e(n):

E{w_e(n+1)} = [I_2N − µR_e] E{w_e(n)} + µp_e    (4.2)

The steady-state behavior of E{w_e(n)} can be obtained from (4.2), assuming that the WL-LMS algorithm converges as n → ∞:

w_e∞ ≜ lim_{n→∞} E{w_e(n)}    (4.3)

Taking this into account in (4.2) leads to

w_e∞ = R_e^{−1} p_e    (4.4)

Comparing (4.4) and (2.9) reveals that the WL-LMS algorithm converges, on average, to the optimum solution.
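The fixed point of (4.2) can be checked numerically. The sketch below (illustrative toy statistics, not values from this work) iterates the mean recursion and compares its limit with R_e^{−1} p_e:

```python
import numpy as np

# Arbitrary toy augmented statistics (N = 1) with the structure of (2.10):
Re = np.array([[1.0, 0.5j], [-0.5j, 1.0]])    # Hermitian, eigenvalues 0.5 and 1.5
pe = np.array([1.0 + 0j, 0.2 - 0.1j])
mu = 0.1                                      # small step size, well below 1/lambda_max

w = np.zeros(2, dtype=complex)                # E{we(0)}
for _ in range(2000):
    w = (np.eye(2) - mu * Re) @ w + mu * pe   # mean recursion (4.2)

w_opt = np.linalg.solve(Re, pe)               # we_inf = Re^{-1} pe, eq. (4.4)
```

The iterate converges geometrically, since the eigenvalues of I_2N − µR_e (here 0.95 and 0.85) lie inside the unit circle.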

4.1.3 Mean Weight Error Vector Behavior

Defining the 2N × 1 augmented weight error vector at instant n,

v_e(n) ≜ w_e(n) − w_e∞    (4.5)

from (2.8) one obtains

v_e(n) = v_e(n−1) + µ e*(n−1) x_e(n−1)    (4.6)

Taking the expected value of (4.6), under the simplifying assumptions again, leads to

E{v_e(n)} = [I_2N − µR_e] E{v_e(n−1)} = [I_2N − µR_e]^n E{v_e(0)}    (4.7)

Hence, convergence in the mean of w_e(n) requires all the eigenvalues of I_2N − µR_e to lie inside the unit circle. In this case, the solution w_e(n) becomes unbiased as n → ∞. Now, using the orthogonal eigendecomposition R_e = QΛQ^H, where Λ is the diagonal matrix of the eigenvalues of R_e and Q is the unitary matrix of the associated eigenvectors (QQ^H = I_2N), (4.7) yields

E{v_e(n+1)} = Q[I_2N − µΛ]^{n+1} Q^H E{v_e(0)}    (4.8)

From (4.8), an upper limit for µ ensuring the convergence of the WL-LMS algorithm can be determined:

0 < µ < 1/λ_max    (4.9)

where λ_max denotes the largest eigenvalue of R_e.

4.1.4 Mean Square Error Behavior

From (4.1) and (4.5), making use of the simplifying assumptions once again, one obtains

E{|e(n)|²} = σ_d² + σ_m² − 2Re{p_e^H E{v_e(n)}} − 2Re{p_e^H w_e∞} + w_e∞^H R_e w_e∞ + 2Re{w_e∞^H R_e E{v_e(n)}} + tr[R_e V(n)]    (4.10)

where σ_m² = E{|m(n)|²} is the variance of the measurement noise, Re{z} denotes the real part of the complex-valued scalar z, tr[Z] denotes the trace of matrix Z, and

V(n) = E{v_e(n) v_e^H(n)}    (4.11)

is the weight error correlation matrix. Substituting (4.5) into (4.10) and using (4.4) yields

E{|e(n)|²} = J^e_min + J^e_ex(n)    (4.12)

where

J^e_min = σ_d² + σ_m² − p_e^H w_e∞    (4.13)

is the minimum MSE in (2.12), augmented by the measurement noise power, and

J^e_ex(n) = tr[R_e V(n)]    (4.14)


accounts for the excess MSE at time n of the stochastic gradient-based adaptation technique.

4.1.5 Excess MSE

Post-multiplying (4.6) by its Hermitian, taking the expected value, and using the simplifying assumptions yields

V(n) = V(n−1) + µ[A(n−1) + A^H(n−1)] + µ²B(n−1)    (4.15)

where

A(n−1) = E{v_e(n−1) x_e^H(n−1) e(n−1)}    (4.16)

and

B(n−1) = E{x_e(n−1) x_e^H(n−1) e(n−1) e*(n−1)}    (4.17)

Substituting (4.1) into (4.16), under the simplifying assumptions, it follows that

A(n−1) = −V(n−1) R_e    (4.18)

Using assumptions A1 and A2, the fourth-order moment in (4.17) can be obtained from the Gaussian moment-factoring theorem (also known as Isserlis' theorem), so that it can be rewritten as

B(n−1) = R_e E{|e(n−1)|²} + E{x_e(n−1) e*(n−1)} E{x_e^H(n−1) e(n−1)} + E{x_e(n−1) e(n−1)} E{x_e^H(n−1) e*(n−1)}    (4.19)

which, under A3 and after some algebraic manipulation, becomes

B(n−1) = R_e E{|e(n−1)|²} + R_e E{v_e(n−1)} E{v_e^H(n−1)} R_e + R_e J_2N E{v_e*(n−1)} E{v_e^T(n−1)} J_2N R_e    (4.20)

where

J_2N = [ 0_N  I_N
         I_N  0_N ]    (4.21)


Equations (4.20), (4.18), (4.15) and (4.14) constitute the recursive expression for updating J^e_ex(n) and, consequently, the MSE behavior in (4.12).

The steady-state MSE is reached after algorithm convergence, being defined as

J^e_∞ ≜ lim_{n→∞} E{|e(n)|²} = J^e_min + J^e_ex(∞)    (4.22)

where

J^e_ex(∞) = tr[R_e V_∞]    (4.23)

and

V_∞ ≜ lim_{n→∞} V(n)    (4.24)

From (4.20), (4.18) and (4.15), it can be shown that

tr[R_e V_∞] = (µ/2) J^e_∞ tr[R_e]    (4.25)

and

J^e_ex(∞) = µ tr[R_e] J^e_min / (2 − µ tr[R_e])    (4.26)

Finally, solving for J^e_∞ yields

J^e_∞ = J^e_min ( 1 + µ tr[R_e] / (2 − µ tr[R_e]) )    (4.27)

4.1.6 Misadjustment

By definition, the misadjustment of the WL-LMS algorithm is given by

ξ = J^e_ex(∞) / J^e_min = µ tr[R_e] / (2 − µ tr[R_e])    (4.28)

For a sufficiently small µ, this results in

ξ ≈ (1/2) µ tr[R_e]    (4.29)

which means that increasing µ increases the misadjustment. From (4.28), a stability upper bound can also be established for µ:

µ_max = 2 / tr[R_e]    (4.30)
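A small worked example of (4.28)-(4.30), with an assumed value for tr[R_e]:

```python
# Illustrative numbers only: tr[Re] = 4 is an assumed toy value.
tr_Re = 4.0
mu_max = 2.0 / tr_Re                       # stability bound (4.30) -> 0.5
mu = 0.1 * mu_max                          # step size well inside the bound
xi = mu * tr_Re / (2.0 - mu * tr_Re)       # misadjustment (4.28)
xi_approx = 0.5 * mu * tr_Re               # small-step approximation (4.29)
```

With µ at 10% of µ_max, the exact misadjustment is 0.2/1.8 ≈ 0.111, close to the small-step approximation 0.1, as (4.29) predicts.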


4.1.7 Algorithm

The complete derived algorithm for WL-LMS MSE estimation follows the update rule below:

Algorithm: WL-LMS MSE estimation update rule

1: Setup:
2: n ← 0
3: v_e(n) ← w_e(n) − w_eo
4: V(n) ← (w_e(n) − w_eo)(w_e(n) − w_eo)^H
5: J^e_min ← σ_d² + σ_m² − p_e^H w_e∞
6: J(n) ← J^e_min + tr[R_e V(n)]
7: Update:
8: while n < n_max do
9:   n ← n + 1
10:  v_e(n) ← [I_2N − µR_e] v_e(n−1)
11:  B(n) ← R_e J(n−1) + R_e v_e(n−1) v_e^H(n−1) R_e + R_e J_2N v_e*(n−1) v_e^T(n−1) J_2N R_e
12:  V(n) ← V(n−1) − µ(R_e V(n−1) + V(n−1) R_e) + µ² B(n)
13:  J(n) ← J^e_min + tr[R_e V(n)]
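The update rule above is deterministic, so it can be transcribed directly. The following illustrative NumPy sketch (function name and toy statistics are assumptions, not material from this work) also checks that the recursion settles at the steady-state value (4.27), with no Monte Carlo averaging involved:

```python
import numpy as np

def wl_lms_mse_model(Re, pe, sd2, sm2, mu, we0, n_max):
    """Deterministic WL-LMS MSE model: the update rule of Sec. 4.1.7."""
    n2 = Re.shape[0]
    N = n2 // 2
    J2 = np.block([[np.zeros((N, N)), np.eye(N)],
                   [np.eye(N), np.zeros((N, N))]])    # exchange matrix (4.21)
    weo = np.linalg.solve(Re, pe)                     # optimum weights (2.9)
    v = we0 - weo                                     # ve(0)
    V = np.outer(v, v.conj())                         # V(0)
    J_min = sd2 + sm2 - (pe.conj() @ weo).real        # J_min^e, eq. (4.13)
    J = [J_min + np.trace(Re @ V).real]               # J(0) from (4.12)/(4.14)
    I = np.eye(n2)
    for _ in range(n_max):
        B = (Re * J[-1]
             + Re @ np.outer(v, v.conj()) @ Re
             + Re @ J2 @ np.outer(v.conj(), v) @ J2 @ Re)   # eq. (4.20)
        V = V - mu * (Re @ V + V @ Re) + mu**2 * B          # eq. (4.15) with (4.18)
        v = (I - mu * Re) @ v                               # mean of ve, eq. (4.7)
        J.append(J_min + np.trace(Re @ V).real)
    return np.array(J)

# Toy augmented statistics (illustrative values, N = 1):
Re = np.array([[1.0, 0.5j], [-0.5j, 1.0]])
pe = np.array([1.0 + 0j, 0.2 - 0.1j])
sd2 = (pe.conj() @ np.linalg.solve(Re, pe)).real + 0.5   # chosen so J_min^e = 0.6
J = wl_lms_mse_model(Re, pe, sd2, 0.1, 0.05, np.zeros(2, complex), 4000)
```

After convergence, J[-1] matches J^e_min (1 + µ tr[R_e]/(2 − µ tr[R_e])) from (4.27), and the transient J(n) decays from its initial value without any ensemble of realizations.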

4.2 Application Specific Theory

Before presenting the simulation results themselves, some particularities of each case are analysed as a means to infer the more general behavior of the WL-LMS algorithm. The MMSE ratio function Ψ is introduced and studied, while the validity of the MSE model is verified in a multitude of simulations.

4.2.1 Equalization

In the equalization simulations the desired signal is the time-delayed transmitted signal d(n) = u(n − α), so as to compensate the filtering delay introduced by the channel and equalizer. We define the input signal as a gray-encoded 4-QAM stochastic process in which the "00" and "11" symbols occur with probability p/2 each and the remaining symbols occur with probability (1 − p)/2 each.

As in [7], the impropriety coefficient of a vector x(n) is defined as

ρ_x = 1 − det(R_xx − Q_xx R_xx*^{−1} Q_xx*) / det(R_xx)    (4.31)


Figure 4.1: Used 4-QAM constellation with gray encoding.

If x(n) is a scalar quantity, ρ_x reduces to

ρ_x = |q_x|² / r_x²    (4.32)

where q_x = E{x(n) x(n)} and r_x = E{x(n) x*(n)} are the input pseudo-correlation and autocorrelation, respectively.

Varying the probability parameter p allows one to alter the impropriety coefficient of the transmitted signal process. Writing the transmitted symbol as u = a + jb, suppressing the dependency on n for ease of notation, gives

q_u = E{a²} − E{b²} + j2E{ab}    (4.33)

r_u = E{a²} + E{b²}    (4.34)

Taking (a, b) ∈ {±√(E_s/2)} × {±√(E_s/2)}, with "×" denoting the Cartesian product and E_s being the symbol energy, one finds that

q_u = jE_s(2p − 1)    (4.35)

r_u = E_s    (4.36)

yielding

ρ_u = |2p − 1|²    (4.37)
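The derivation (4.33)-(4.37) can be confirmed exactly by enumerating the four constellation points. In this illustrative sketch (not code from this work), the diagonal pair is assumed to be the "00"/"11" symbols, matching the sign convention of (4.35):

```python
import numpy as np

def qam4_impropriety(p, Es=1.0):
    """Exact second-order statistics of the biased gray-coded 4-QAM source:
    the two diagonal symbols ("00" and "11") occur with probability p/2 each,
    the anti-diagonal symbols with probability (1 - p)/2 each."""
    a = np.sqrt(Es / 2)
    symbols = np.array([a + 1j * a, -a - 1j * a, a - 1j * a, -a + 1j * a])
    probs = np.array([p / 2, p / 2, (1 - p) / 2, (1 - p) / 2])
    q_u = np.sum(probs * symbols**2)           # pseudo-correlation, eq. (4.33)
    r_u = np.sum(probs * np.abs(symbols)**2)   # correlation, eq. (4.34)
    return q_u, r_u, np.abs(q_u)**2 / r_u**2   # rho_u via (4.32)
```

For p = 0.9 this returns q_u = 0.8j = jE_s(2p − 1) and ρ_u = |2p − 1|² = 0.64, in agreement with (4.35)-(4.37); for the unbiased source p = 0.5, the pseudo-correlation and ρ_u vanish.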

When performing simulations on equalization, one finds that the impropriety coefficient of the transmitted signal, the noise variance to transmitted signal power ratio (σ_ν²/σ_u²) and the impulse response of the channel all influence the relative performance of the equalizers. Note, however, that the impropriety coefficient of the input signal or vector, the transmitted signal impropriety coefficient, and the noise variance do not completely define the difference in performance between the two structures, i.e. each channel impulse response prompts a unique behaviour for each equalizer.

In light of this, it is useful to investigate some relevant indicators of the performance relation between the WL-LMS and the CLMS for a given channel impulse response. One important figure is the minimum MSE ratio Ψ = J^e_min / J_min, which gives relevant information about the possible steady-state advantage of the WL-LMS over the CLMS.

The MMSE ratio, when evaluated over multiple values of ρ_u and σ_ν², becomes an MMSE ratio surface, which has some interesting properties that facilitate its visualization:

• Ψ does not depend on the phase of q_u; the dependence is only on ρ_u. This proof is presented in Appendix B.

• Ψ depends on the noise variance to transmitted signal variance ratio (σ_ν²/σ_u²), not on the individual values of σ_ν² and σ_u², meaning that Ψ should be expressed as Ψ(σ_ν²/σ_u²) rather than Ψ(σ_ν², σ_u²). Proof for this statement is provided in Appendix C.

The aforementioned properties allow Ψ to be plotted as Ψ(σ_ν²/σ_u², ρ_u), yielding a helpful visual aid for characterizing the approximate steady-state advantages of the WL-LMS.

4.2.2 System Identification

In widely-linear system identification, the model is slightly modified according to the following equations:

y(n) = w^H(n) x(n)    (4.38)

e(n) = d(n) + ν(n) − y(n)    (4.39)

w(n+1) = w(n) + µ e*(n) x(n)    (4.40)

With y(n) being


The widely-linear MMSE for system identification is given by

J^e_min = σ_ν² + σ_d² − p_e^H R_e^{−1} p_e    (4.42)

Also, σ_d², p_xd, q_xd and R_e are easily expressed as

σ_d² = f_s^H(R_xx f_s + Q_xx g_s) + g_s^H(Q_xx* f_s + R_xx* g_s)    (4.43)

p_xd = R_xx f_s + Q_xx g_s    (4.44)

q_xd = Q_xx f_s* + R_xx g_s*    (4.45)

R_e = [ r_x I_N   q_x I_N
        q_x* I_N  r_x I_N ]    (4.46)

since

R_xx = r_x I_N    (4.47)

Q_xx = q_x I_N    (4.48)

Also, from (4.44) and (4.45),

p_e = R_e [ f_s
            g_s ]    (4.49)

It then follows that

p_e^H R_e^{−1} p_e = σ_d²    (4.50)

and finally that

J^e_min = σ_ν²    (4.51)

Analogously, the initial expression for the strictly linear MMSE is

J_min = σ_ν² + σ_d² − p_xd^H R_xx^{−1} p_xd    (4.52)

Using (4.44) and (4.46),

p_xd^H R_xx^{−1} p_xd = σ_d² + σ_x² g_s^H g_s (ρ_x − 1)    (4.53)

and finally

J_min = J^e_min + σ_x² g_s^H g_s (1 − ρ_x)    (4.54)


This also proves that J^e_min ≤ J_min for system identification, since σ_x² g_s^H g_s is always non-negative. As ρ_x increases, J_min approaches J^e_min, with equality for ρ_x = 1.

The result in (4.54) allows for a very intuitive justification: notice that σ_x² g_s^H g_s (1 − ρ_x) is precisely the power of the signal component that a strictly-linear estimator cannot capture, i.e. the widely-linear part of the system output.

The above derivation also shows that, as in strictly-linear channel equalization, Ψ depends only on the σ_ν²/σ_x² ratio (rather than on the individual values of the numerator and denominator), and that the phase of q_x has no effect on the steady-state behaviour, allowing Ψ to be written, without loss of information or generality, as a function of ρ_x. These two conclusions are combined in the expression derived from (4.51) and (4.54),

Ψ = ( 1 + (σ_x²/σ_ν²) g_s^H g_s (1 − ρ_x) )^{−1}    (4.55)

which depends only on the σ_ν²/σ_x² ratio and on ρ_x, providing a cleaner expression for Ψ in widely-linear system identification.
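Expression (4.55) can be cross-checked against a direct evaluation of (4.52). The sketch below (illustrative NumPy code with arbitrary assumed coefficients f_s and g_s, not values from this work) computes Ψ both ways:

```python
import numpy as np

def psi_closed_form(gs, rho, sx2, sn2):
    """MMSE ratio for widely linear system identification, eq. (4.55)."""
    return 1.0 / (1.0 + (sx2 / sn2) * np.vdot(gs, gs).real * (1.0 - rho))

def psi_direct(fs, gs, qx, sx2, sn2):
    """Psi = J_min^e / J_min evaluated from (4.43), (4.44), (4.51), (4.52)."""
    N = len(fs)
    Rxx, Qxx = sx2 * np.eye(N), qx * np.eye(N)          # eqs. (4.47)-(4.48)
    pxd = Rxx @ fs + Qxx @ gs                           # eq. (4.44)
    sd2 = (fs.conj() @ (Rxx @ fs + Qxx @ gs)
           + gs.conj() @ (Qxx.conj() @ fs + Rxx.conj() @ gs)).real   # eq. (4.43)
    J_min = sn2 + sd2 - (pxd.conj() @ np.linalg.solve(Rxx, pxd)).real  # eq. (4.52)
    return sn2 / J_min                                  # J_min^e = sn2, eq. (4.51)

# Assumed toy system: two-tap branches, qx = 0.8j -> rho_x = 0.64 for rx = 1.
fs = np.array([1.0, 0.2 + 0.1j])
gs = np.array([0.5, -0.3j])
qx = 0.8j
```

Both evaluations agree exactly, since the two routes differ only by the algebraic simplification leading from (4.52)-(4.53) to (4.55).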

The MMSE ratio is not relevant when the identified system is strictly linear, since the MMSE is equal for both the WL-LMS and CLMS. This happens because the strictly linear estimator is capable of completely characterizing the system.

4.3 Simulation

To compare the behaviors of the WL-LMS and the conventional CLMS algorithms, some simulations are presented. The simulations concern the equalization of a complex channel with Circular Additive White Gaussian Noise (CAWGN), and the identification of a widely linear system with CAWGN measurement noise added to d(n).


4.3.1 Equalization

Channel 1

Consider the channel impulse response vector c_1^H = [0.2 1 0.2], with frequency response represented in Fig. 4.2. One can plot the MMSE ratio surface for this channel as a way to visualize the regions where the WL-LMS performs better than the CLMS algorithm. In all simulations, σ_u² = 1 is assigned without loss of generality. The remarks made always relate only to the specified channel unless otherwise stated.

Figure 4.2: Frequency response for c1.

We notice in Fig. 4.3 that the WL-LMS performs better, for this channel, for signals with ρ_u near 1 and σ_ν² close to 10⁻¹, although J^e_min/J_min never drops below 0.5. It is also known in this context, proven in [7] and in Appendix A through a different method, that J^e_min = J_min if ρ_u = 0 for all possible channels.

In all the presented equalization simulations involving MSE convergence, some conventions and observations are established:

(i) The two dashed lines represent the minimum MSE for the WL-LMS and CLMS algorithms, the lower one always referring to J^e_min, as previously discussed (these lines may coincide).


Figure 4.3: Ψ for the c1 impulse response with σ_ν² ranging from 10⁻⁴ to 10² and ρ_u ranging from 0 to 1.

(ii) Since ρ_u is determined by the probability of occurrence of the symbols associated with the "01" and "10" bits, in situations concerning values of ρ_u near 1 the associated statistical noise will be substantial. It may also be desirable to analyse situations with σ_ν² nearing 1, adding further to the noise perceived in the MSE curves.

(iii) The step sizes of both algorithms are the same, in the current case µ_WL = µ_SL = 0.005.

(iv) All equalizer filters (f, g and w) have 16 coefficients, so that the WL-LMS and CLMS algorithms update a total of 32 and 16 coefficients, respectively.

(v) The equalization delay in the following adaptive equalizations is always α = 9.

In order to mitigate the noise effects mentioned in item (ii), the original MSE curves, which consist of the ensemble average over n_e realizations of the equalization process, undergo wavelet denoising as implemented by MATLAB's "wdenoise" function with its default parameters, and are then subjected to a moving average filter with an appropriate sample span. This


allows for a smaller number of realizations in each simulation, while still being able to visualize the obtained results. As additional information, the original MSE curves are preserved and plotted in a lighter shade.
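The moving-average stage of this post-processing is straightforward to reproduce outside MATLAB; the sketch below is an illustrative NumPy version of that stage only (the wavelet-denoising step is omitted, and the function name is an assumption):

```python
import numpy as np

def moving_average(curve, span):
    """Smooth an ensemble-averaged MSE curve with a centered moving average."""
    kernel = np.ones(span) / span
    return np.convolve(curve, kernel, mode="same")
```

Using mode="same" keeps the curve length, at the cost of edge bias over roughly span/2 samples at each end; the raw curve can still be plotted underneath in a lighter shade, as done here.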

Analysing Fig. 4.3, it is striking that, for channel c1, even in low-noise situations, the WL-LMS only exhibits advantageous behaviour in terms of the minimum MSE ratio if ρ_u is very near 1. The closer σ_ν² gets to 0, the closer ρ_u needs to be to 1 for the WL-LMS algorithm to be reasonably useful in this respect. This is noticeable in Fig. 4.4, which zooms in on the "tail" of the surface. However, Ψ eventually rises back to one when ρ_u = 1 and σ_ν² ≈ 10⁻¹³.

Figure 4.4: Zoomed-in Ψ for the c1 impulse response with σ_ν² ranging from 10⁻⁶ to 1 and ρ_u ranging from 0.95 to 1.

If ρ_u = 1 and σ_ν² = 10⁻², the WL-LMS converges faster (on average) and to a considerably lower MSE than the CLMS. Observing Fig. 4.5, it is clear that the widely linear approach produces considerably better results than its strictly linear counterpart. In fact, Ψ is calculated to be 0.5004, i.e. the WL-LMS minimum MSE is almost half of the CLMS's.


Figure 4.5: MSE convergence curve with ρ_u = 1, σ_ν² = 0.01, n_e = 100.

Channel 2

Now a complex baseband equivalent channel with impulse response c_2^H = [0.2 −0.5 −0.7j 1] is considered. For curiosity's sake, this channel's frequency response is shown in Fig. 4.6.

Once again one may plot Ψ to gain insight into the problem and understand the regions of fruitful MMSE performance for the WL-LMS estimator. The corresponding Ψ plot is given in Fig. 4.7, which reveals a completely different outcome in comparison to the previously studied channel. In the first presented channel, the regions of lower minimum MSE ratio are restricted to high values of ρ_u or higher noise power situations; in the second channel, however, it is possible to achieve values of the minimum MSE ratio lower than the global minimum of the c1 channel with ρ_u ≈ 0.6 and low noise power.

The shape of Ψ in channel equalization seems to be related to the linear independence of the elements in the channel impulse response. If c^H can be written as

c^H = z [h_1 h_2 ... h_{N−1}],  z ∈ C and h_i ∈ R    (4.56)


Figure 4.6: Frequency response of channel c2.

then it will be similar to Fig. 4.7. Further study is needed to present a more detailed explanation of why this is the case.

Figure 4.7: Minimum MSE ratio surface for the c2 impulse response with σ_ν² ranging from 10⁻⁸ to 10² and ρ ranging from 0 to 1.


Some convergence plots are now presented to assist in grasping the variety of possible behaviours the WL-LMS algorithm can deliver when compared to the CLMS. The span of the moving average filter is changed according to perceived need in the convergence plots regarding c2.

Fig. 4.8 considers a situation with medium to high ρ_u and considerable noise. WL-LMS convergence is a little slower than the CLMS's; however, the MMSE ratio is 0.555. As one would expect from Fig. 4.7, this is not an example of the WL-LMS's best behaviour. It is, however, still superior to the CLMS in terms of MMSE and overall convergence.

Figure 4.8: c2 MSE convergence curve with ρ_u = 0.7, σ_ν² = 0.01, n_e = 300.

Fig. 4.9 exhibits the MSE convergence curves for low noise and high ρ_u. The MMSE ratio is approximately 0.132, which describes a very advantageous situation with respect to the MMSE. The WL-LMS MSE, in this case, is also always smaller than the mean CLMS MSE for the entirety of the iterations. This behaviour is only reinforced for lower noise power and higher ρ_u, as visible in Fig. 4.10, in which the MMSE ratio is even closer to zero (equal to 1.352 · 10⁻⁷).

As a rule, the R_e eigenvalue spread becomes increasingly larger than the R_xx eigenvalue spread as the noise becomes smaller and ρ_u rises.


Figure 4.9: c2 MSE convergence curve with ρ_u = 0.9, σ_ν² = 0.0001, n_e = 50.

This would, at first thought, be an indicator of slower convergence; however, slower convergence does not mean that the WL-LMS algorithm cannot be superior in every aspect (smaller MMSE and smaller MSE at all iterations) compared to the CLMS in a particular scenario. It may happen that the initial values of the error components associated with the modes of slower convergence are already small enough not to make a significant difference. This means that comparing the eigenvalue spreads does not give enough information about the overall convergence behaviour to determine whether the WL-LMS will be superior or inferior to the CLMS in this respect.

Also, as an additional important note, if ρu = 0 and the channel is strictly linear (as has been considered so far), the WL-LMS and the CLMS are very similar in every aspect, almost always displaying overlapping or very close MSE curves.

This can be seen in Fig. 4.11, in which the MSE curve overlap is significant. A relevant indicator of how easy it is to perceive the similarity of the curves, in this case (ρu = 0), is how close the initial MSE values of the two algorithms already are.


28 CHAPTER 4. ANALYSIS

Figure 4.10: c2 MSE convergence curve with ρu=1, σν²=10⁻⁸, ne=5.

Figure 4.11: c2 MSE convergence curve with ρu=0, σν²=10⁻⁵, ne=300.


4.3.2 System Identification

Repeated simulations show that the impropriety coefficient ρx is crucial in determining the convergence speed, with higher values of ρx generally yielding faster WL-LMS convergence compared to strictly-linear cases. The input signal x(n) is taken to be (not necessarily circular) white noise, in order to excite the entire frequency spectrum of the system under identification. The correlation and pseudo-correlation are then varied for the MSE convergence plots to be analysed.

For convenience, rx and qx were taken to be

rx = 1    (4.57)

qx = jρx    (4.58)
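A minimal sketch of how such an input can be drawn, assuming a Gaussian source (the generator below and the sample size are illustrative assumptions): writing u(n) = a(n) + jb(n), the conditions ru = 1 and qu = jρx translate to E{a²} = E{b²} = 1/2 and E{ab} = ρx/2.

```python
import numpy as np

# Sketch: white noncircular Gaussian noise with r_u = E{u u*} = 1 and
# q_u = E{u u} = j*rho, matching (4.57)-(4.58).  Gaussian source assumed.
rng = np.random.default_rng(0)

def improper_white_noise(n, rho):
    # a, b jointly Gaussian with E{a^2} = E{b^2} = 1/2 and E{ab} = rho/2,
    # so E{u^2} = (E{a^2} - E{b^2}) + 2j E{ab} = j*rho and E{|u|^2} = 1.
    w1, w2 = rng.standard_normal(n), rng.standard_normal(n)
    a = np.sqrt(0.5) * w1
    b = np.sqrt(0.5) * (rho * w1 + np.sqrt(1.0 - rho**2) * w2)
    return a + 1j * b

u = improper_white_noise(200_000, rho=0.8)
r_hat = np.mean(u * np.conj(u))        # sample correlation, ≈ 1
q_hat = np.mean(u * u)                 # sample pseudo-correlation, ≈ 0.8j
```

The same construction with ρ = 0 recovers circular white noise, and ρ = 1 the maximally improper case.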

When ρx = 0 and gs = 0 (strictly-linear system), the MSE curves for both algorithms are nearly identical, similarly to the equalization scenario.

The following behaviour was observed consistently in a vast collection of widely-linear systems:

• In all conducted simulations, across many different systems both strictly and widely linear, WL-LMS convergence was always fastest when ρx = 1.

• For strictly-linear system identification (gs = 0), with Jmin = Jemin, WL-LMS was always faster than CLMS when ρx = 1. In this scope, there seems to be no apparent advantage in using any value of ρx other than 1.

• Generally, WL-LMS's MSE convergence slows down considerably (in comparison to ρx = 1 and ρx = 0) for some intermediate values of ρx.
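Simulations of the general shape behind these observations can be sketched as follows. This is a minimal, self-contained WL-LMS vs. CLMS comparison for widely-linear system identification; the conjugate-branch coefficients g_s, step size, noise level and ρx are illustrative assumptions (ρx = 0.5 is used here so that the conjugate branch carries information a strictly-linear filter cannot capture).

```python
import numpy as np

# Minimal WL-LMS vs. CLMS sketch for widely-linear system identification.
# g_s, mu, sigma_nu and rho_x are illustrative assumptions.
rng = np.random.default_rng(1)
N, mu, n_iter, sigma_nu, rho_x = 4, 0.05, 4000, 1e-2, 0.5
f_s = np.array([0.2, -0.5j, -0.7j, 1.0])     # linear branch, as in (4.59)
g_s = np.array([0.1, 0.3j, 0.0, -0.2])       # hypothetical conjugate branch

# White improper input with r_u = 1 and q_u = j*rho_x (cf. (4.57)-(4.58))
w1, w2 = rng.standard_normal(n_iter + N), rng.standard_normal(n_iter + N)
x = np.sqrt(0.5) * (w1 + 1j * (rho_x * w1 + np.sqrt(1 - rho_x**2) * w2))

f_wl = np.zeros(N, complex); g_wl = np.zeros(N, complex)   # WL-LMS weights
f_cl = np.zeros(N, complex)                                # CLMS weights
e_wl, e_cl = np.zeros(n_iter), np.zeros(n_iter)
for n in range(n_iter):
    xv = x[n:n + N][::-1]                    # regressor, newest sample first
    d = f_s.conj() @ xv + g_s.conj() @ xv.conj() + sigma_nu * rng.standard_normal()
    err = d - (f_wl.conj() @ xv + g_wl.conj() @ xv.conj())
    f_wl += mu * xv * np.conj(err)           # augmented (widely linear) update
    g_wl += mu * xv.conj() * np.conj(err)
    e_wl[n] = abs(err) ** 2
    err_c = d - f_cl.conj() @ xv             # strictly-linear (CLMS) update
    f_cl += mu * xv * np.conj(err_c)
    e_cl[n] = abs(err_c) ** 2

mse_wl, mse_cl = e_wl[-500:].mean(), e_cl[-500:].mean()
# CLMS floors near ||g_s||^2 (1 - rho_x^2); WL-LMS floors near sigma_nu^2.
```

Averaging e_wl and e_cl over independent runs, then smoothing with a moving average as in the plots above, yields the MSE convergence curves; even a single run's tail separates the two steady-state floors clearly in this scenario.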

The system to be identified in the following simulations is defined as

fsH = [0.2  −0.5j  −0.7j  1]T    (4.59)

gsH = [ ]T    (4.60)


To first characterize it, as usual, the Ψ for this system is plotted in Fig. 4.12 (notice the inversion in the color scheme, done to minimize black ink in print).

Figure 4.12: Ψ for the described widely-linear system to be identified.

The structure of Ψ for system identification, differently from what happens in linear channel equalization, does not rely as heavily on the system itself but on gsH gs. This means that the same overall structure is preserved, with differences of apparent horizontal translation: the higher gsH gs is, the further to the right the transition region moves.

It is important to notice that, since Jemin is always equal to σν², varying ρx so that Ψ approaches 0 does not mean in any way that the WL-LMS's MMSE is getting better, but that the CLMS's MMSE is getting progressively worse.
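For white improper inputs, the matrices involved are diagonal and Ψ can be evaluated both from the augmented Wiener equations and in closed form. The sketch below does both for an illustrative system (g_s, ρx and σν² are assumptions); the closed form Ψ = σν²/(σν² + gsᴴgs(1 − ρx²)) makes the dependence on gsᴴgs and the return of Ψ to 1 as ρx → 1 explicit.

```python
import numpy as np

# Sketch: Psi = Je_min / J_min for identification of a widely-linear system
# (f_s, g_s) driven by white improper noise with r_u = 1, q_u = j*rho_x.
# f_s matches (4.59); g_s, rho_x and sigma2_nu are illustrative assumptions.
f_s = np.array([0.2, -0.5j, -0.7j, 1.0])
g_s = np.array([0.1, 0.3j, 0.0, -0.2])
sigma2_nu, rho_x = 1e-4, 0.6
N = len(f_s)

R = np.eye(N)                                # Rxx = r_u I for white input
Q = 1j * rho_x * np.eye(N)                   # Qxx = q_u I
p = R @ f_s + Q @ g_s                        # E{x d*}
q = np.conj(Q) @ f_s + np.conj(R) @ g_s      # E{x* d*}
sigma2_d = (np.vdot(f_s, p) + np.vdot(g_s, q)).real + sigma2_nu

j_min = sigma2_d - np.vdot(p, np.linalg.solve(R, p)).real       # strictly linear
Re = np.block([[R, Q], [np.conj(Q), np.conj(R)]])
pe = np.concatenate([p, q])
je_min = sigma2_d - np.vdot(pe, np.linalg.solve(Re, pe)).real   # widely linear

psi = je_min / j_min
# Closed form for white inputs: Psi = s2nu / (s2nu + ||g_s||^2 (1 - rho_x^2))
psi_closed = sigma2_nu / (sigma2_nu + np.vdot(g_s, g_s).real * (1 - rho_x ** 2))
```

Note that je_min reduces exactly to σν² (the WL filter models the system perfectly), so all the variation in Ψ comes from the strictly-linear denominator, as stated above.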

To illustrate the convergence behavior, some MSE convergence plots are presented varying ρx.

The scenario in Fig. 4.13 demonstrates the incapacity of the CLMS to sufficiently describe the system, achieving considerably higher MSE throughout all iterations.

The scenario in Fig. 4.14 displays a better case for the CLMS, however the WL-LMS is still better in all aspects.


Figure 4.13: MSE convergence curve for system identification with ρx=0, σν²=0.0001, ne=100.

Fig. 4.15 illustrates behaviour typical of ρx near 1. Even having dramatically slower convergence than the ρx = 1 or ρx = 0 cases, its convergence and MMSE still outperform the CLMS.


Figure 4.14: MSE convergence curve for system identification with ρx=1, σν²=0.0001, ne=100.

Figure 4.15: MSE convergence curve for system identification with ρx=0.9, σν²=…

CHAPTER 5

Conclusion

In this work we presented an accurate and complete model of the WL-LMS MSE, with only minor deviations from the statistically measured values, and introduced the notion of an MMSE-ratio surface for assessing the steady-state advantages of the WL-LMS over the CLMS. The attained results satisfied the initial expectations and opened the way to new concepts worthy of further research.

Future work

During the development of this thesis, two concepts were discussed that may be deemed interesting enough to be the subject of future research. They are:

(i) Detailed analysis of the MSE statistical model, seeking more detailed information on the convergence behavior of the WL-LMS algorithm.

(ii) Deriving a modulation scheme involving the impropriety coefficients of the transmitted signal.


Bibliography

[1] B. Picinbono and P. Chevalier. Widely linear estimation with complex data. IEEE Transactions on Signal Processing, 43(8):2030–2033, Aug 1995.

[2] B. Widrow, J. McCool, and M. Ball. The complex LMS algorithm. Proceedings of the IEEE, 63(4):719–720, April 1975.

[3] B. Picinbono. Wide-sense linear mean square estimation and prediction. In 1995 International Conference on Acoustics, Speech, and Signal Processing, volume 3, pages 2032–2035, May 1995.

[4] S. C. Douglas and D. P. Mandic. Performance analysis of the conventional complex LMS and augmented complex LMS algorithms. In 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, pages 3794–3797, March 2010.

[5] C. Cheong Took, S. C. Douglas, and D. P. Mandic. On approximate diagonalization of correlation matrices in widely linear signal processing. IEEE Transactions on Signal Processing, 60(3):1469–1473, March 2012.

[6] S. Javidi, M. Pedzisz, S. L. Goh, and D. P. Mandic. The augmented complex least mean square algorithm with application to adaptive prediction problems. In 2008 IAPR Workshop on Cognitive Information Processing Syst., 2008.

[7] Peter J. Schreier and Louis L. Scharf. Statistical Signal Processing of Complex-Valued Data: The Theory of Improper and Noncircular Signals. Cambridge University Press, 2010.

[8] S. Haykin. Adaptive Filter Theory. Prentice-Hall, Upper Saddle River, NJ, USA, 1986.

[9] Lal Chand Godara. Smart Antennas. CRC Press, Boca Raton, FL, 2004.


APPENDIX A

Derivation 1

Proof that Jemin ≤ Jmin for equalization.

The widely-linear solution of the Wiener-Hopf equations for MMSE is given by

Re weopt = pe    (A.1)

Rewriting the above equation using (2.5), (2.10) and (2.11):

[Rxx Qxx; Qxx* Rxx*] [fo; go] = [pxd; qxd*]    (A.2)

which can be expanded into the following system of equations:

pxd = Rxx fo + Qxx go    (A.3)

qxd* = Qxx* fo + Rxx* go    (A.4)

Based on this, (2.13) can be expressed as

Jmin = σd² − (foH Rxx fo + foH Qxx go + goH Qxx* fo + goH Qxx* Rxx⁻¹ Qxx go)    (A.5)


in light of the properties QxxH = Qxx* and RxxT = Rxx*. Substituting (2.5) and (2.11) into (2.12), one obtains

Jemin = σd² − (pxdH fo + qxdT go)    (A.6)

which in turn, using (A.3) and (A.4), gives

Jemin = σd² − (foH Rxx fo + goH Qxx* fo + foH Qxx go + goH Rxx* go)    (A.7)

One can easily see that

Jmin = Jemin + goH (Rxx* − Qxx* Rxx⁻¹ Qxx) go    (A.8)

Re is an autocorrelation matrix, which implies that it is positive semi-definite. It is known that the Schur complement of a block of a positive semi-definite matrix is also positive semi-definite. The term Rxx* − Qxx* Rxx⁻¹ Qxx is precisely the Schur complement of Rxx in Re; therefore the quadratic form goH (Rxx* − Qxx* Rxx⁻¹ Qxx) go yields a positive real number or zero. Thus, the minimum MSE of the strictly-linear estimator is larger than or equal to the widely-linear estimator's minimum MSE.
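The Schur-complement step can be exercised numerically: building (Rxx, Qxx) from actual complex data guarantees a valid augmented covariance, so the complement must come out positive semi-definite. The sketch below (illustrative sizes and random mixing) checks this and the sign of the quadratic form in (A.8).

```python
import numpy as np

# Numeric sanity check of (A.8): for a valid pair (Rxx, Qxx) built from data,
# S = Rxx* - Qxx* Rxx^{-1} Qxx is positive semi-definite, hence g^H S g >= 0
# and J_min >= Je_min.  Sizes and the random mixing are illustrative.
rng = np.random.default_rng(2)
N, n_samp = 5, 10_000

A = rng.standard_normal((2 * N, 2 * N))           # random real/imag mixing
ab = A @ rng.standard_normal((2 * N, n_samp))     # correlated real Gaussians
z = ab[:N] + 1j * ab[N:]                          # improper complex vectors

Rxx = z @ z.conj().T / n_samp                     # sample correlation matrix
Qxx = z @ z.T / n_samp                            # sample pseudo-correlation
S = np.conj(Rxx) - np.conj(Qxx) @ np.linalg.solve(Rxx, Qxx)

eig_min = np.linalg.eigvalsh(S).min()             # S is Hermitian
g = rng.standard_normal(N) + 1j * rng.standard_normal(N)
quad = np.vdot(g, S @ g).real                     # the quadratic form in (A.8)
```

Any choice of g gives a non-negative quad (up to rounding), mirroring the argument above.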


APPENDIX B

Derivation 2

Proof that Ψ does not depend on the phase of qu in equalization.

The aim is to show that, knowing Re, pe and C, Ψ = Jemin/Jmin can be expressed as Ψ(ρu, σu²/σν²). Given

u(n) = a(n) + jb(n)    (B.1)

with

ru(k) = E{u(n)u*(n − k)}    (B.2)

qu(k) = E{u(n)u(n − k)}    (B.3)

consider that transmitted symbols are independent of each other, i.e. ru(k ≠ 0) = qu(k ≠ 0) = 0.

For ease of notation, dependence on k will be omitted, always referring to the same k. Additionally, ru(0) = ru and qu(0) = qu. Since ru ∈ ℝ⁺ and ρu ∈ [0, 1], by definition ru√ρu = |qu|.

In equalization, it is defined that

x(n) = cH u(n) + ν(n)    (B.4)


where u(n) is the vector of the N latest samples of u(n), and cH is the length-N FIR channel impulse response vector.

The matrix Qxx can be expressed as

Qxx = [qx(0) qx(1) ··· qx(N−1); qx(−1) qx(0) ··· qx(N−2); ⋮; qx(1−N) qx(2−N) ··· qx(0)]    (B.5)

qx(k) = cH Quu(k) c*    (B.6)

where Quu(k) = E{u(n)uT(n − k)}. Equation (B.6) presumes that E{ν(n)ν(n)} = 0, implying the added noise is not improper. Also,

qxd = [q0 q1 ··· qN−1]T    (B.7)

qα−i = ci · qu    (B.8)

in which ci is the i-th element of the channel impulse response vector cH, only taking values of i such that α − i falls inside the interval [0, N − 1].

Let qu1 = qu2 e^{jΔϕ}, where |qu1| = |qu2| = |q|. From Eqs. (B.5) and (B.7),

Qxx1 = Qxx2 e^{jΔϕ}    (B.9)

qxd1 = qxd2 e^{jΔϕ}    (B.10)

Based on the widely-linear Wiener-Hopf equations, given by

[Rxx Qxx; Qxx* Rxx*] [fo; go] = [pxd; qxd*]    (B.11)

it follows that

fo = (Rxx − Qxx Rxx*⁻¹ Qxx*)⁻¹ (pxd − Qxx Rxx*⁻¹ qxd*)    (B.12)

go = (Rxx* − Qxx* Rxx⁻¹ Qxx)⁻¹ (qxd* − Qxx* Rxx⁻¹ pxd)    (B.13)


After some algebra, it is clear that

fo1 = fo2    (B.14)

go1 = go2 e^{−jΔϕ}    (B.15)

Since Jemin = σd² − peH wo, using (B.14) and (B.15) it follows that Jemin1 = Jemin2.
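A numeric spot-check of this conclusion, under the same independence assumptions: the channel, delay, equalizer length and the convention x(n) = Σm cm u(n−m) + ν(n) below are illustrative assumptions, and the WL MMSE computed from the augmented normal equations comes out identical for two different phases of qu.

```python
import numpy as np

# Spot-check: the WL MMSE in equalization is unchanged when only the phase
# of q_u changes.  Channel, delay, equalizer length and the convention
# x(n) = sum_m c_m u(n-m) + nu(n) are assumptions for this sketch.
c = np.array([1.0, 0.5 - 0.3j, 0.2j])
N, alpha = 6, 3
ru, rho_u, s2nu = 1.0, 0.8, 1e-3
L = len(c)

def wl_mmse(phi):
    qu = ru * np.sqrt(rho_u) * np.exp(1j * phi)   # |q_u| fixed, phase varies
    cc = lambda k, cj: sum(c[l + k] * (np.conj(c[l]) if cj else c[l])
                           for l in range(L) if 0 <= l + k < L)
    Rxx = np.array([[ru * cc(j - i, True) + (s2nu if i == j else 0.0)
                     for j in range(N)] for i in range(N)])
    Qxx = np.array([[qu * cc(j - i, False) for j in range(N)] for i in range(N)])
    p = np.array([c[alpha - k] * ru if 0 <= alpha - k < L else 0 for k in range(N)])
    q = np.array([c[alpha - k] * qu if 0 <= alpha - k < L else 0 for k in range(N)])
    Re = np.block([[Rxx, Qxx], [np.conj(Qxx), np.conj(Rxx)]])
    pe = np.concatenate([p, np.conj(q)])          # pe = [pxd; qxd*]
    return ru - np.vdot(pe, np.linalg.solve(Re, pe)).real

j1, j2 = wl_mmse(0.0), wl_mmse(2.1)               # two arbitrary phases of q_u
```

Any pair of phases gives the same value, consistent with (B.14)-(B.15).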

Once it is proved that Jemin does not depend on the phase of qu, and knowing that Jmin does not consider any pseudo-correlation information, the conclusion is that Ψ can be expressed as Ψ(ρu), rather than as a function of the full complex qu.

APPENDIX C

Derivation 3

Proof that Ψ in channel equalization depends on the noise variance to transmitted signal power ratio (σν²/σu²), not on the individual values of σν² and σu².

The aim is to express Ψ only in terms of the ratio of the variances σu² and σν².

By definition and from (B.4), it is easy to show that

Rxx = [rx(0) rx(1) ··· rx(N−1); rx(−1) rx(0) ··· rx(N−2); ⋮; rx(1−N) rx(2−N) ··· rx(0)]    (C.1)

rx(k) = cH Ruu(k) c + δ0k σν²    (C.2)

in which δij is the Kronecker delta and Ruu(k) = E{u(n)uH(n − k)}.


Define the matrix F as

F = [f0 f1 ··· fN−1; f−1 f0 ··· fN−2; ⋮; f1−N f2−N ··· f0]    (C.3)

fk = Σ_{i=0}^{N−1} ci · ci+k*    (C.4)

where ci is the i-th element of the channel's impulse response vector cH. If i + k would result in a value outside [0, N − 1], ci+k is taken to be 0. Effectively, if ci were a discrete function c(n) instead of an ordinary vector, (C.4) would describe the autocorrelation of c(n) at lag k.

After some analysis, it is straightforward to see that Rxx can be written in the following way:

Rxx = σu² (F + IN σν²/σu²)    (C.5)

In much the same manner, define the matrix G as

G = [g0 g1 ··· gN−1; g−1 g0 ··· gN−2; ⋮; g1−N g2−N ··· g0]    (C.6)

gk = Σ_{i=0}^{N−1} ci · ci+k    (C.7)

following the same conventions on ci+k.

Again assuming that the added noise ν(n) is circular, and by the definition of the impropriety coefficient ρu, Qxx can be written as

Qxx = G σu² √ρu e^{jϕ}    (C.8)


Combining the widely-linear correlation matrix Re with (C.5) and (C.8):

Re = [F σu² + IN σν²   G σu² √ρu e^{jϕ}; G* σu² √ρu e^{−jϕ}   FT σu² + IN σν²]    (C.9)

Rewriting Re by factoring out σu² results in

Re = σu² [F + IN σν²/σu²   G √ρu e^{jϕ}; G* √ρu e^{−jϕ}   FT + IN σν²/σu²] = σu² (Re′ + I2N σν²/σu²)    (C.10)

Now, for analysing the cross-correlation vectors pxd and qxd: in similar fashion to (B.7),

pxd = [p0 p1 ··· pN−1]T    (C.11)

pα−i = ci · ru    (C.12)

It is then possible to write

pxd = p′ · ru    (C.13)

qxd = p′ · qu = p′ · ru √ρu e^{jϕ}    (C.14)

with p′ being the column vector of length N containing c shifted "down" by α elements.

The widely-linear cross-correlation vector pe can then be written as

pe = [p′ · ru; p′* · ru √ρu e^{−jϕ}] = pe′ · ru    (C.15)

Expressing the widely-linear MMSE and the strictly-linear MMSE using (C.10) and (C.15) yields

Ψ = [1 − pe′H (Re′ + I2N σν²/σu²)⁻¹ pe′] / [1 − p′H (F + IN σν²/σu²)⁻¹ p′]    (C.16)


Since the matrices and vectors in (C.16) involve σu² and σν² only through the ratio σν²/σu², it is correct to say that Ψ depends only on this ratio, rather than on the individual values in the numerator and denominator.
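This invariance can be spot-checked numerically; the channel, delay, equalizer length and the convention x(n) = Σm cm u(n−m) + ν(n) below are illustrative assumptions.

```python
import numpy as np

# Spot-check of this appendix: Psi is unchanged when sigma2_u and sigma2_nu
# are scaled by the same factor, i.e. it depends only on sigma2_nu/sigma2_u.
# Channel, delay and equalizer length are assumptions for this sketch.
c = np.array([1.0, 0.5 - 0.3j, 0.2j])
N, alpha, rho_u, phi = 6, 3, 0.8, 0.7
L = len(c)

def psi(s2u, s2nu):
    ru = s2u
    qu = ru * np.sqrt(rho_u) * np.exp(1j * phi)
    cc = lambda k, cj: sum(c[l + k] * (np.conj(c[l]) if cj else c[l])
                           for l in range(L) if 0 <= l + k < L)
    Rxx = np.array([[ru * cc(j - i, True) + (s2nu if i == j else 0.0)
                     for j in range(N)] for i in range(N)])
    Qxx = np.array([[qu * cc(j - i, False) for j in range(N)] for i in range(N)])
    p = np.array([c[alpha - k] * ru if 0 <= alpha - k < L else 0 for k in range(N)])
    q = np.array([c[alpha - k] * qu if 0 <= alpha - k < L else 0 for k in range(N)])
    Re = np.block([[Rxx, Qxx], [np.conj(Qxx), np.conj(Rxx)]])
    pe = np.concatenate([p, np.conj(q)])
    je = ru - np.vdot(pe, np.linalg.solve(Re, pe)).real     # WL MMSE
    jl = ru - np.vdot(p, np.linalg.solve(Rxx, p)).real      # strictly-linear MMSE
    return je / jl

psi_a = psi(1.0, 1e-3)
psi_b = psi(4.0, 4e-3)        # both powers scaled by 4: same ratio
```

Both MMSEs scale linearly with σu², so their ratio is unchanged, as (C.16) predicts.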
