

2.7 Experimental results

2.7.1 Linear autoregressive process with circular noise

This means that the matrix ${}^{H}H_k$ has the form
\[
{}^{H}H_k = \begin{pmatrix}
H_k^1 & H_k^2 & H_k^3 & H_k^4 \\
\imath H_k^2 \imath^{-1} & \imath H_k^1 \imath^{-1} & \imath H_k^4 \imath^{-1} & \imath H_k^3 \imath^{-1} \\
\jmath H_k^3 \jmath^{-1} & \jmath H_k^4 \jmath^{-1} & \jmath H_k^1 \jmath^{-1} & \jmath H_k^2 \jmath^{-1} \\
\kappa H_k^4 \kappa^{-1} & \kappa H_k^3 \kappa^{-1} & \kappa H_k^2 \kappa^{-1} & \kappa H_k^1 \kappa^{-1}
\end{pmatrix},
\]

where
\begin{align*}
H_k^1 &= (J_k^1)^H J_k^1 + \imath (J_k^2)^H J_k^2 \imath^{-1} + \jmath (J_k^3)^H J_k^3 \jmath^{-1} + \kappa (J_k^4)^H J_k^4 \kappa^{-1} + \mu_k I_N, \\
H_k^2 &= (J_k^1)^H J_k^2 + \imath (J_k^2)^H J_k^1 \imath^{-1} + \jmath (J_k^3)^H J_k^4 \jmath^{-1} + \kappa (J_k^4)^H J_k^3 \kappa^{-1}, \\
H_k^3 &= (J_k^1)^H J_k^3 + \imath (J_k^2)^H J_k^4 \imath^{-1} + \jmath (J_k^3)^H J_k^1 \jmath^{-1} + \kappa (J_k^4)^H J_k^2 \kappa^{-1}, \\
H_k^4 &= (J_k^1)^H J_k^4 + \imath (J_k^2)^H J_k^3 \imath^{-1} + \jmath (J_k^3)^H J_k^2 \jmath^{-1} + \kappa (J_k^4)^H J_k^1 \kappa^{-1}.
\end{align*}
Furthermore, we have that
\[
{}^{H}g_k = \begin{pmatrix} g_k \\ \imath g_k \imath^{-1} \\ \jmath g_k \jmath^{-1} \\ \kappa g_k \kappa^{-1} \end{pmatrix},
\]
where $g_k = (J_k^1)^H e_k + \imath (J_k^2)^H e_k \imath^{-1} + \jmath (J_k^3)^H e_k \jmath^{-1} + \kappa (J_k^4)^H e_k \kappa^{-1}$. Now we have all the necessary ingredients to compute the update rule given in (2.6.3).

Up until now, we have worked with vectors from $\mathbb{H}^{4N}$. Ideally, we would like to work with vectors directly in $\mathbb{H}^N$. Considering the definition of ${}^{H}q$ for $q \in \mathbb{H}^N$, this is done by taking the first $N$ elements of the vector ${}^{H}q$. By using the Banachiewicz inversion formula [237], relation (2.6.3) thus becomes [150]:

\[
w_{k+1} = w_k - (H_k^1)^{-1} g_k + (H_k^1)^{-1}
\begin{pmatrix} H_k^2 & H_k^3 & H_k^4 \end{pmatrix} T^{-1}
\begin{pmatrix}
-\imath H_k^2 \imath^{-1} (H_k^1)^{-1} g_k + \imath g_k \imath^{-1} \\
-\jmath H_k^3 \jmath^{-1} (H_k^1)^{-1} g_k + \jmath g_k \jmath^{-1} \\
-\kappa H_k^4 \kappa^{-1} (H_k^1)^{-1} g_k + \kappa g_k \kappa^{-1}
\end{pmatrix},
\]
where
\[
T = \begin{pmatrix}
\imath H_k^1 \imath^{-1} & \imath H_k^4 \imath^{-1} & \imath H_k^3 \imath^{-1} \\
\jmath H_k^4 \jmath^{-1} & \jmath H_k^1 \jmath^{-1} & \jmath H_k^2 \jmath^{-1} \\
\kappa H_k^3 \kappa^{-1} & \kappa H_k^2 \kappa^{-1} & \kappa H_k^1 \kappa^{-1}
\end{pmatrix}
- \begin{pmatrix} \imath H_k^2 \imath^{-1} \\ \jmath H_k^3 \jmath^{-1} \\ \kappa H_k^4 \kappa^{-1} \end{pmatrix}
(H_k^1)^{-1}
\begin{pmatrix} H_k^2 & H_k^3 & H_k^4 \end{pmatrix},
\]
which represents the quaternion-valued Levenberg-Marquardt (LM) algorithm.
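To make the reduction above concrete, the following small numerical check (a sketch only, with randomly generated real blocks standing in for the quaternion-valued ones) verifies the Banachiewicz-type identity that allows working with the first $N$ components only; all names in it are illustrative.

```python
# Hedged sketch: verifying, on real random matrices, the block identity used above -
# the first N entries of H^{-1}[g; h] depend only on A^{-1} and the Schur complement
# T = D - C A^{-1} B, as in the Banachiewicz inversion formula.
import numpy as np

rng = np.random.default_rng(0)
N = 4
A = rng.standard_normal((N, N)) + 5 * np.eye(N)        # well-conditioned top-left block
B = rng.standard_normal((N, 3 * N))
C = rng.standard_normal((3 * N, N))
D = rng.standard_normal((3 * N, 3 * N)) + 5 * np.eye(3 * N)
g, h = rng.standard_normal(N), rng.standard_normal(3 * N)

H = np.block([[A, B], [C, D]])
full = np.linalg.solve(H, np.concatenate([g, h]))[:N]   # first N entries of H^{-1}[g; h]

Ainv_g = np.linalg.solve(A, g)
T = D - C @ np.linalg.solve(A, B)
reduced = Ainv_g - np.linalg.solve(A, B @ np.linalg.solve(T, h - C @ Ainv_g))

print(np.allclose(full, reduced))                       # True
```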

In this case too, the gradient of the error function at different steps is computed using the well-known backpropagation scheme.

The algorithms compared were the gradient descent algorithm (GD), the QCP, RPR, DBD, and SAB algorithms, the conjugate gradient algorithms (CGHS, CGPR, CGFR, CGPB), the SCG algorithm, the quasi-Newton algorithm with symmetric rank-one updates (SR1), the quasi-Newton algorithm with Davidon-Fletcher-Powell updates (DFP), the quasi-Newton algorithm with Broyden-Fletcher-Goldfarb-Shanno updates (BFGS), the one step secant method (OSS), and the Levenberg-Marquardt algorithm (LM).

The tap input of the filter was 4, so the networks had 4 inputs, 4 hidden neurons on a single hidden layer, and one output. The activation function for the hidden layer was the fully quaternion hyperbolic tangent function, given by $G^2(q) = \tanh q = \frac{e^q - e^{-q}}{e^q + e^{-q}}$, and the activation function for the output layer was the identity $G^3(q) = q$. Training was done for 5000 epochs with 5000 randomly generated training samples.
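As an aside, a minimal sketch of the fully quaternion hyperbolic tangent is given below, assuming quaternions are stored as 4-component numpy arrays; it is not the code used in the experiments, and all function names are illustrative.

```python
# Hedged sketch (not the thesis code): the fully quaternion hyperbolic tangent
# G(q) = tanh q = (e^q - e^{-q}) (e^q + e^{-q})^{-1}, with quaternions stored as
# numpy arrays [a, b, c, d] = a + b*i + c*j + d*k.
import numpy as np

def qmul(p, q):
    """Hamilton product of two quaternions given as length-4 arrays."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return np.array([
        a1*a2 - b1*b2 - c1*c2 - d1*d2,
        a1*b2 + b1*a2 + c1*d2 - d1*c2,
        a1*c2 - b1*d2 + c1*a2 + d1*b2,
        a1*d2 + b1*c2 - c1*b2 + d1*a2,
    ])

def qinv(q):
    """Inverse q^{-1} = conj(q) / |q|^2."""
    conj = q * np.array([1.0, -1.0, -1.0, -1.0])
    return conj / np.dot(q, q)

def qexp(q):
    """Quaternion exponential: e^q = e^a (cos|v| + (v/|v|) sin|v|)."""
    a, v = q[0], q[1:]
    theta = np.linalg.norm(v)
    out = np.zeros(4)
    out[0] = np.cos(theta)
    if theta > 0:
        out[1:] = v / theta * np.sin(theta)
    return np.exp(a) * out

def qtanh(q):
    """Fully quaternion tanh; e^q and e^{-q} commute, so the quotient is well defined."""
    ep, em = qexp(q), qexp(-q)
    return qmul(ep - em, qinv(ep + em))

# quick check against the real tanh on a quaternion with zero vector part
print(qtanh(np.array([0.5, 0.0, 0.0, 0.0]))[0], np.tanh(0.5))
```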

To evaluate the effectiveness of the algorithms, we used a measure of performance called prediction gain, defined by $R_p = 10\log_{10}\frac{\sigma_x^2}{\sigma_e^2}$, where $\sigma_x^2$ represents the variance of the input signal and $\sigma_e^2$ represents the variance of the prediction error. The prediction gain is given in dB, and, because of the way it is defined, a higher prediction gain means better performance. After running each algorithm 50 times, the prediction gains are given in Table 2.1.
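A direct computation of the prediction gain, under the definition above, could look as follows (illustrative variable names, not the experimental code):

```python
import numpy as np

def prediction_gain(x, e):
    """R_p = 10*log10(var(x)/var(e)), in dB; x is the input signal, e the prediction error."""
    return 10.0 * np.log10(np.var(x) / np.var(e))

# toy usage: a predictor whose error has 10x smaller variance than the input
rng = np.random.default_rng(0)
x = rng.standard_normal(5000)
e = 0.316 * rng.standard_normal(5000)   # var(e) ~ 0.1 * var(x)
print(prediction_gain(x, e))            # ~10 dB
```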

We can see that QCP, SAB, and RPR performed approximately the same, followed by DBD, but all of them performed better than the gradient descent algorithm. CGHS and CGPR gave approximately the same results, with CGFR performing better and CGPB worse. The SCG algorithm was better than the conjugate gradient algorithms. Among the quasi-Newton methods, DFP and SR1 gave approximately the same results, with BFGS performing better and OSS worse. The best overall was the LM algorithm.

Table 2.1: Experimental results for the linear autoregressive process with circular noise

Algorithm     Prediction gain (dB)
GD            4.51 ± 6.64e-2
QCP           6.37 ± 1.08e-1
RPR           6.41 ± 1.47e-1
DBD           5.46 ± 1.48e-1
SAB           6.40 ± 1.31e-1
CGHS          5.17 ± 1.30e-1
CGPR          5.19 ± 8.14e-2
CGFR          6.91 ± 2.51e-1
CGPB          5.00 ± 9.57e-2
SCG           7.36 ± 9.25e-2
SR1           6.73 ± 2.34e-1
DFP           6.61 ± 2.15e-1
BFGS          7.23 ± 3.80e-1
OSS           5.11 ± 2.04e-1
LM            8.94 ± 3.33e-1
QESN [229]    3.57
AQESN [229]   3.51

2.7.2 3D Lorenz system

The 3D Lorenz system is given by the ordinary differential equations
\begin{align*}
\frac{dx}{dt} &= \alpha(y - x), \\
\frac{dy}{dt} &= -xz + \rho x - y, \\
\frac{dz}{dt} &= xy - \beta z,
\end{align*}
where $\alpha = 10$, $\rho = 28$, and $\beta = 2/3$. This represents a chaotic time series prediction problem, and was used to test quaternion-valued neural networks in [7, 23, 209, 212, 33, 229].

The tap input of the filter was 4, and so the networks had 4 inputs, 4 hidden neurons, and one output neuron. The networks were trained for 5000 epochs with 1337 training samples, which result from solving the 3D Lorenz system on the interval $[0, 25]$, with the initial conditions $(x, y, z) = (1, 2, 3)$.
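A possible way to generate this training series, sketched below with SciPy, assumes an evenly spaced sampling grid and the default Runge-Kutta integrator; these choices are assumptions, not the procedure used in the thesis, while the parameter values follow the text above.

```python
# Hedged sketch: generating the Lorenz training series with SciPy.
import numpy as np
from scipy.integrate import solve_ivp

alpha, rho, beta = 10.0, 28.0, 2.0 / 3.0

def lorenz(t, s):
    x, y, z = s
    return [alpha * (y - x), -x * z + rho * x - y, x * y - beta * z]

t_eval = np.linspace(0.0, 25.0, 1337)                # 1337 samples on [0, 25]
sol = solve_ivp(lorenz, (0.0, 25.0), [1.0, 2.0, 3.0], t_eval=t_eval, rtol=1e-9)
samples = sol.y.T                                     # shape (1337, 3): (x, y, z) per step
```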

The results after 50 runs of each algorithm are given in Table 2.2. The measure of performance was the prediction gain, as in the above experiment.

In this case, QCP and RPR performed best, SAB followed, and DBD was again last of the four, but still better than GD. Next, CGHS and CGPB performed approximately the same, CGFR slightly better, and the best was CGPR. In this experiment also, SCG had better results than the conjugate gradient algorithms. Among the quasi-Newton methods, DFP and SR1 performed approximately the same, OSS slightly better, and the best was BFGS. The best overall performance was attained by the LM algorithm.

Table 2.2: Experimental results for the 3D Lorenz system

Algorithm     Prediction gain (dB)
GD            7.56 ± 7.42e-1
QCP           10.59 ± 5.29e-1
RPR           11.07 ± 7.08e-1
DBD           9.35 ± 6.17e-1
SAB           10.33 ± 7.09e-1
CGHS          10.04 ± 6.65e-1
CGPR          11.31 ± 8.34e-1
CGFR          10.69 ± 5.81e-1
CGPB          10.12 ± 7.35e-1
SCG           12.58 ± 6.44e-1
SR1           11.74 ± 6.82e-1
DFP           11.27 ± 7.76e-1
BFGS          13.74 ± 6.30e-1
OSS           12.09 ± 7.80e-1
LM            31.45 ± 1.21e0
QESN [229]    17.73
AQESN [229]   18.92

2.7.3 4D Saito chaotic circuit

Lastly, we experimented on the 4D Saito chaotic circuit given by
\begin{align*}
\begin{pmatrix} \frac{dx_1}{dt} \\[4pt] \frac{dy_1}{dt} \end{pmatrix} &=
\begin{pmatrix} -1 & 1 \\ -\alpha_1 & \alpha_1\beta_1 \end{pmatrix}
\begin{pmatrix} x_1 - \eta\rho_1 h(z) \\[2pt] y_1 - \eta\frac{\rho_1}{\beta_1} h(z) \end{pmatrix}, \\[6pt]
\begin{pmatrix} \frac{dx_2}{dt} \\[4pt] \frac{dy_2}{dt} \end{pmatrix} &=
\begin{pmatrix} -1 & 1 \\ -\alpha_2 & \alpha_2\beta_2 \end{pmatrix}
\begin{pmatrix} x_2 - \eta\rho_2 h(z) \\[2pt] y_2 - \eta\frac{\rho_2}{\beta_2} h(z) \end{pmatrix},
\end{align*}
where
\[
h(z) = \begin{cases} 1, & z \ge -1 \\ -1, & z \le 1 \end{cases}
\]
is the normalized hysteresis value, $z = x_1 + x_2$, and $\rho_1 = \frac{\beta_1}{1 - \beta_1}$, $\rho_2 = \frac{\beta_2}{1 - \beta_2}$. The parameters are given by $(\alpha_1, \beta_1, \alpha_2, \beta_2, \eta) = (7.5, 0.16, 15, 0.097, 1.3)$. This is also a chaotic time series prediction problem, and was used as a benchmark for quaternion-valued neural networks in [5, 6, 7, 31, 210, 211, 32].

The networks had the same architectures as the ones described earlier, and were trained for 5000 epochs with 5249 training samples, which result from solving the 4D Saito chaotic circuit on the interval $[0, 10]$, with the initial conditions $(x_1, y_1, x_2, y_2) = (1, 0, 1, 0)$.
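A rough sketch of how such a series could be generated is given below; the fixed-step Euler scheme, the step size, and the initial hysteresis value are assumptions, not the thesis' procedure, while the equations, parameters, and $\rho_1, \rho_2$ follow the definitions above.

```python
# Hedged sketch: integrating the Saito circuit with a simple fixed-step scheme.
import numpy as np

a1, b1, a2, b2, eta = 7.5, 0.16, 15.0, 0.097, 1.3
r1, r2 = b1 / (1.0 - b1), b2 / (1.0 - b2)

def deriv(s, h):
    x1, y1, x2, y2 = s
    dx1 = -(x1 - eta * r1 * h) + (y1 - eta * r1 / b1 * h)
    dy1 = -a1 * (x1 - eta * r1 * h) + a1 * b1 * (y1 - eta * r1 / b1 * h)
    dx2 = -(x2 - eta * r2 * h) + (y2 - eta * r2 / b2 * h)
    dy2 = -a2 * (x2 - eta * r2 * h) + a2 * b2 * (y2 - eta * r2 / b2 * h)
    return np.array([dx1, dy1, dx2, dy2])

n, dt = 5249, 10.0 / 5249                    # 5249 samples on [0, 10]
s, h = np.array([1.0, 0.0, 1.0, 0.0]), 1.0   # initial state and hysteresis value (assumed)
samples = np.empty((n, 4))
for i in range(n):
    samples[i] = s
    s = s + dt * deriv(s, h)                 # forward Euler step (illustrative)
    z = s[0] + s[2]
    if h == 1.0 and z < -1.0:                # hysteresis switches only at the thresholds
        h = -1.0
    elif h == -1.0 and z > 1.0:
        h = 1.0
```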

The prediction gains after 50 runs of each algorithm are given in Table 2.3.

In this last experiment, the performances were similar to those in the previous experiments:

QCP, SAB, and RPR had approximately the same performance, followed by DBD, and finally by GD. CGPR had the best performance, followed closely by CGPB, and lastly by CGFR and CGHS. The performance of the SCG algorithm was similar to that in the previous experiments. Among the quasi-Newton algorithms, OSS had the best performance, followed closely by BFGS, and lastly by SR1 and DFP. The conclusion is the same: the LM algorithm had the best performance among all the tested algorithms.

Table 2.3: Experimental results for the 4D Saito chaotic circuit

Algorithm     Prediction gain (dB)
GD            5.76 ± 1.70e-1
QCP           11.49 ± 6.47e-1
RPR           11.58 ± 7.91e-1
DBD           6.28 ± 3.36e-1
SAB           11.55 ± 4.96e-1
CGHS          11.59 ± 4.09e-1
CGPR          13.64 ± 3.67e-1
CGFR          12.08 ± 5.30e-1
CGPB          13.02 ± 4.93e-1
SCG           15.32 ± 9.35e-1
SR1           11.71 ± 6.73e-1
DFP           11.10 ± 6.32e-1
BFGS          16.24 ± 5.06e-1
OSS           16.94 ± 7.70e-1
LM            25.36 ± 9.63e-1