CFD Analysis and Reduction
4.3 Incompressible Navier Stokes
4.3.3 Reduction
ROM is than the HDM when we need to compute results outside the original training interval. The domain subdivisions chosen, 50, 100 and 200, generate respectively 5 000, 20 000 and 80 000 triangles (the only type of element available in 2D in FreeFem++). We tried employing more subdivisions so as to reach a number of elements near $10^5$ to $10^6$, as would be expected of a high-fidelity model for an aircraft. However, for more than 200 subdivisions FreeFem++ seems to fail in updating the reduced state variable used in our ROMs. These variables are stored differently from the HDM variables: the former are stored in a standard array, whilst the latter are stored in a special type of array meant for FEM fields.
This means that while we were able to compute HDM responses for a mesh with 400 subdivisions, we were not able to compute and update the reduced state variable when such a large amount of memory is required. Still, we hope that with up to 200 subdivisions we will be able to catch the efficiency trend of the ROM. For each mesh test case (50x50, 100x100 and 200x200) we will present: the total time for computing the ROM, $t_{ROM}$; the time for computing the HDM, $t_{HDM}$; the time for computing the SVD required for the POD, $t_{SVD}$; the total number of degrees of freedom of the HDM, $N_{DOF}$; the relative velocity error, $e_u$; and the relative pressure error, $e_p$. Since the ROM will employ all the computed snapshots, it will always be of order 4, and we expect it to become faster to compute than the complete HDM.
POD basis
Before performing the model reduction itself, we will briefly discuss basis generation. In this case, since only one parameter is being varied, we feel there is no need for greedy or adaptive sampling. We will therefore focus on ensuring that we are correctly performing the POD of the HDM snapshots taken at Re = 1, 51, 101 and 151. As discussed in previous chapters, for the POD we compute the SVD of the snapshot matrix and take the first four left singular vectors. These vectors will form our basis $V$. It should be noted that since this is a static problem, the snapshot matrix is simply a matrix whose columns are the solutions computed by the HDM. We naturally expect the modes computed by Proper Orthogonal Decomposition to be orthogonal. In other words, we expect the following equality to hold:
$$ V^T V = \Lambda \tag{4.8} $$
Where $\Lambda$ is a diagonal matrix, equal to the identity matrix if the modes are orthonormal. Since the original author of the script claims to be performing POD, we expect the original modes of the script to be orthogonal too. But while the modes computed by our method are indeed orthonormal up to machine precision (see table 4.3), this does not seem to be the case for the modes originally computed by the script. In fact, as can be seen in table 4.4, those modes do not even seem to be orthogonal: the off-diagonal terms are of the same order of magnitude as the diagonal terms. This holds even before the renormalization performed in the original script, in which each computed mode is divided by the L-2 norm of the velocity terms of the mode.
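The orthonormality check of equation 4.8 can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: random data stands in for the HDM snapshots, and the "velocity terms" of a mode are taken, hypothetically, to be the first half of its entries. It also shows that rescaling each mode by a scalar changes the diagonal of $V^T V$ but cannot create off-diagonal terms.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical snapshot matrix: one column per sampled Reynolds number
# (Re = 1, 51, 101, 151); random data stands in for the HDM solutions.
n_dof, n_snap = 500, 4
S = rng.standard_normal((n_dof, n_snap))

# Thin SVD; the columns of U are the left singular vectors (the POD modes).
U, sigma, _ = np.linalg.svd(S, full_matrices=False)
V = U[:, :n_snap]

# Gram matrix V^T V of equation 4.8; it should equal the identity.
gram = V.T @ V
max_dev = np.max(np.abs(gram - np.eye(n_snap)))
print(f"largest deviation of V^T V from the identity: {max_dev:.1e}")

# Rescale each mode by a norm computed on part of its entries only (assumed
# here to be the first half, standing in for the velocity terms).
scale = np.linalg.norm(V[: n_dof // 2, :], axis=0)
gram_scaled = (V / scale).T @ (V / scale)
off_diag = gram_scaled - np.diag(np.diag(gram_scaled))
max_off = np.max(np.abs(off_diag))
print(f"largest off-diagonal term after rescaling: {max_off:.1e}")
```

A per-mode rescaling multiplies $V^T V$ on both sides by a diagonal matrix, so off-diagonal terms of the size seen in table 4.4 cannot be attributed to the renormalization alone.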
As can be seen in figures 4.35 through 4.38, there is no major change in the flow field when the Reynolds number goes from 1 to 151. We would naturally expect, then, that at least the very first mode of the POD would produce a flow field similar to those observed. However, the first original mode in
Table 4.3: Numerical result of the product $V^T V$ for the modes obtained by us.
1.00       3.63e-17   -2.31e-17  -1.37e-17
3.63e-17   1.00       2.98e-16   6.70e-19
-2.31e-17  2.98e-16   1.00       -2.06e-16
-1.37e-17  6.69e-19   -2.06e-16  1.00
Table 4.4: Numerical result of the product $V^T V$ for the original modes of the script.
(a) Before renormalization
3.70e+05   -2.34e+05  -1.06e+05  2.85e-08
-2.34e+05  1.48e+05   6.73e+04   -1.80e-08
-1.06e+05  6.73e+04   3.07e+04   -8.21e-09
2.85e-08   -1.80e-08  -8.21e-09  2.20e-21
(b) After renormalization
7.77e+07   -2.02e+08  -7.59e+08  7.75e+08
-2.02e+08  5.27e+08   1.98e+09   -2.02e+09
-7.59e+08  1.98e+09   7.41e+09   -7.57e+09
7.75e+08   -2.02e+09  -7.57e+09  7.73e+09
figure 4.2 does not seem to reproduce the average shape of the flow field at all. This is because in the original script, the average of the solution samples was subtracted from the snapshot matrix. When we remove this operation from the script, we obtain the modes in figures 4.43 through 4.46, where for the 1st mode we seem to obtain something closer to the average observed flow field. This average velocity field is also present in the modes computed by us, shown in figures 4.39 through 4.42. There are slight differences between the modes computed by us and those in the original script, with or without subtracting the average of the samples, and the original script’s modes always remain non-orthogonal.
One final observation is that the ROM response of the original script for Re = 101 has a non-negligible error, as shown in figure 4.12. This is unexpected, because the original author is computing a solution that was already sampled and fully integrated into the ROM (recall that the ROM of the original script is of 4th order and built from 4 samples). Since the model order equals the number of samples taken, the solutions at the snapshots belong to the generated reduced state-space. It should then be possible to compute these solutions exactly and to "interpolate" the HDM exactly: the difference between the HDM and the ROM at the snapshots should be nearly null.
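The statement that the snapshots belong to the reduced state-space can be illustrated numerically. The sketch below assumes random data in place of the HDM snapshots and only checks the basis, not a ROM solve: when all four left singular vectors are kept, projecting a snapshot onto the span of $V$ returns the snapshot itself, whereas an arbitrary vector is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(1)
n_dof, n_snap = 500, 4
S = rng.standard_normal((n_dof, n_snap))          # hypothetical snapshots

# Order-4 basis built from all left singular vectors of the 4 snapshots.
V, _, _ = np.linalg.svd(S, full_matrices=False)


def projection_error(w, V):
    """Relative distance between w and its orthogonal projection on span(V)."""
    return np.linalg.norm(w - V @ (V.T @ w)) / np.linalg.norm(w)


err_at_snapshots = [projection_error(S[:, k], V) for k in range(n_snap)]
err_elsewhere = projection_error(rng.standard_normal(n_dof), V)

print("snapshots:", err_at_snapshots)
print("arbitrary vector:", err_elsewhere)
```

The basis is therefore not what limits the accuracy at a sampled Reynolds number; any remaining error there has to come from how the reduced problem is built or solved.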
Galerkin Projection
We will now attempt to reduce the Lid-Driven cavity problem via Galerkin Projection + POD, whilst using the same samples used in [109]. This means that we will sample the driven cavity problem for Re = 1, 51, 101 and 151, obtain a reduced basis from these samples via the POD and use it to reduce the system presented in equation 4.3. The fully discretized fixed-point system is of the following form :
$$ b = a(w_{i-1}) \, w_i \tag{4.9} $$
Where $b$ is the right-hand side due to the boundary conditions (in this case, Dirichlet conditions at the walls), and $a$ is the system matrix. The system is accordingly reduced as:
$$ V^T b = V^T a(V q_{i-1}) \, V q_i \tag{4.10} $$
The algorithm for reduction can be stated as follows:
Figure 4.35: Velocity field for Re = 1.
Figure 4.36: Velocity field for Re = 51.
Figure 4.37: Velocity field for Re = 101.
Figure 4.38: Velocity field for Re = 151.
Figure 4.39: 1st POD mode computed through our method.
Figure 4.40: 2nd POD mode computed through our method.
Figure 4.41: 3rd POD mode computed through our method.
Figure 4.42: 4th POD mode computed through our method.
Figure 4.43: 1st mode computed from the original script, with the average.
Figure 4.44: 2nd mode computed from the original script, with the average.
Figure 4.45: 3rd mode computed from the original script, with the average.
Figure 4.46: 4th mode computed from the original script, with the average.
1. Compute the HDM at Re = 1, 51, 101 and 151.
2. Compute the SVD of the resulting snapshot matrix.
3. From the computed SVD, use all the left singular vectors to form the basis $V$. The ROM will be of order 4.
4. Assemble the matrices of problem 4.3
5. Transform these matrices into the reduced system 4.10
6. Solve the reduced system, compute reduced solution and iterate from step 4.
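The six steps above can be sketched on a small toy problem. This is not the FreeFem++ implementation: the matrix `A0`, the state-dependent diagonal term and the parameter `mu` (playing the role of the Reynolds number) are hypothetical stand-ins chosen so that the fixed-point iterations of equation 4.9 converge.

```python
import numpy as np


def system_matrix(w, mu, A0):
    """Toy stand-in for a(w): constant part plus a state-dependent diagonal."""
    return A0 + mu * np.diag(w)


def fixed_point(mu, A0, b, V=None, tol=1e-13, max_iter=200):
    """Iterate b = a(w_{i-1}) w_i; if V is given, use the reduced system 4.10."""
    x = np.zeros(A0.shape[0] if V is None else V.shape[1])
    for _ in range(max_iter):
        w_prev = x if V is None else V @ x           # lift q to the full space
        a = system_matrix(w_prev, mu, A0)            # step 4: assemble
        if V is None:
            x_new = np.linalg.solve(a, b)
        else:                                        # steps 5 and 6
            x_new = np.linalg.solve(V.T @ a @ V, V.T @ b)
        converged = np.linalg.norm(x_new - x) <= tol * np.linalg.norm(x_new)
        x = x_new
        if converged:
            break
    return x if V is None else V @ x


n = 60
A0 = 3.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
mu_samples = [0.2, 0.4, 0.6, 0.8]   # hypothetical analogue of Re = 1, 51, 101, 151

# Steps 1 to 3: snapshots, SVD, basis from all left singular vectors.
S = np.column_stack([fixed_point(mu, A0, b) for mu in mu_samples])
V, _, _ = np.linalg.svd(S, full_matrices=False)


def rel_error(mu):
    w_hdm = fixed_point(mu, A0, b)
    w_rom = fixed_point(mu, A0, b, V=V)
    return np.linalg.norm(w_rom - w_hdm) / np.linalg.norm(w_hdm)


err_snapshot = rel_error(0.6)    # a parameter value that was sampled
err_between = rel_error(0.5)     # a parameter value between two samples
print(f"error at a snapshot: {err_snapshot:.1e}, between snapshots: {err_between:.1e}")
```

Note that the reduced loop still assembles the full matrix `a` at every iteration before projecting it, which mirrors step 4 of the algorithm.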
To test our ROM, we will perform the previously mentioned sweep and scalability tests. We expect the ROM to give an exact solution at the snapshots, and to become significantly faster than the HDM as we use finer and finer meshes. After performing the sweep test (see figures 4.47 and 4.48), we can see that at the snapshots (Re = 1, 51, 101 and 151) the ROM gives an essentially exact solution: all errors are far below 1 %. This contrasts with the ROM initially provided by the script, which displays errors in velocity of around 10 % for Re = 101 (see figure 4.12). Throughout the sweep we obtain very acceptable errors for the velocity field in the interval Re = 1 to 151. The error for the pressure, however, is non-negligible. The reason behind these high pressure errors can be found in figures 4.49 and 4.50: the ROM fails to approximate the pressure appropriately at the corners. The HDM presents a very local and rapid variation of pressure at the corners, and it seems that the ROM fails to capture this localized phenomenon, resulting in the high error found in the sweep.
One way to avoid these high errors in pressure could be to use a hybrid ROM: using a full-order model at the corners and a ROM for the rest could help in correctly simulating these details whilst still being faster than the complete model.
When using models with more subdivisions (see table 4.5), we see that the errors do not change much, but the ROM becomes more efficient: for 200 subdivisions of the domain, corresponding to a total of 80 000 elements, the ROM takes about 30 % of the time of the HDM to compute a solution. Should we find a way to store the reduced system such that no prior HDM assembly is required, we could save close to half of the ROM computation time (compare $t_{Assembly}$ with $t_{ROM}$ in table 4.5) and become much more efficient. This can be made possible by using tensors rather than matrices for the discretization of the PDE in 4.3. We could then construct and store a reduced tensor for the advection term, rather than recomputing it each time. However, constructing such a tensor in FreeFem++ is not trivial, and our objective is to test several MOR techniques for several CFD models, rather than to optimize one specific implementation.
As a final note, the sudden increase in $t_{SVD}$ from 0.96 s to 3.56 s is due to the overflow of the function made available by FreeFem++ to compute the SVD. For 200 subdivisions, we were forced to interface the script with SciPy: the snapshot matrix was written to a file, which was read by an external Python script that produced a second file with the required left singular vectors. Since these operations rely on the hard drive and not on the RAM of the machine, the performance was naturally impacted. To avoid this issue we could use the fast SVD update techniques described in the bibliographical review, but the performance impact is not so severe as to warrant their use.
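The idea of storing a reduced tensor for the advection term can be sketched as follows. This is a sketch under an explicit assumption: the system matrix is taken to depend linearly on the previous iterate, $a(w)_{ij} = A0_{ij} + N_{ijk} w_k$, with a dense random third-order tensor `N` standing in for the discretized advection term. The names `A0`, `N` and `N_r` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 30, 4

# Hypothetical full-order operators: a(w)_ij = A0_ij + N_ijk w_k.
A0 = rng.standard_normal((n, n))
N = rng.standard_normal((n, n, n))
V, _ = np.linalg.qr(rng.standard_normal((n, r)))   # any orthonormal basis

# Offline stage: project once and store the small reduced operators.
A0_r = V.T @ A0 @ V                                              # shape (r, r)
N_r = np.einsum("ia,ijk,jb,kc->abc", V, N, V, V, optimize=True)  # shape (r, r, r)

# Online stage: the reduced matrix is built from reduced quantities only.
q = rng.standard_normal(r)
a_r_tensor = A0_r + N_r @ q

# Reference: assemble the full matrix at every step, then project it.
a_full = A0 + N @ (V @ q)
a_r_reference = V.T @ a_full @ V

print(np.max(np.abs(a_r_tensor - a_r_reference)))
```

With the reduced tensor stored, the online cost depends only on the ROM order `r`, not on the number of degrees of freedom `n`, which is where the assembly time would be saved.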
Figure 4.47: Relative velocity error $e_u$ (logarithmic axis) of the constructed ROM for different Reynolds numbers, for the mesh given in figure 4.13. Values below $10^{-3}$ are not shown.
Figure 4.48: Relative pressure error $e_p$ (logarithmic axis) of the constructed ROM for different Reynolds numbers, for the mesh given in figure 4.13. Values below $10^{-3}$ are not shown.
Table 4.5: Scalability test results for Re = 201. We also include the time spent assembling the HDM matrices for each solve, t_Assembly. The solution time for the ROM is given in seconds and as a percentage of the HDM solution time.
N    t_HDM     t_SVD   t_ROM           t_Assembly  N_DOF   e_u    e_p
50   5.64 s    0.22 s  4.08 s (72 %)   1.92 s      23003   1.9 %  4000 %
100  36.54 s   0.96 s  19.12 s (52 %)  8.25 s      91003   2.2 %  4570 %
200  266.35 s  3.59 s  79.36 s (30 %)  31.64 s     362003  2.0 %  4193 %
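The percentages in table 4.5 can be recomputed directly from the timings. The short check below uses only the values of the table, and assumes that the assembly time is counted inside the ROM solution time.

```python
# Timings copied from table 4.5, in seconds, keyed by the number of subdivisions.
t_hdm = {50: 5.64, 100: 36.54, 200: 266.35}
t_rom = {50: 4.08, 100: 19.12, 200: 79.36}
t_assembly = {50: 1.92, 100: 8.25, 200: 31.64}

# ROM solution time as a percentage of the HDM solution time.
rom_vs_hdm = {n: round(100 * t_rom[n] / t_hdm[n]) for n in t_hdm}

# Share of the ROM time spent assembling the HDM matrices
# (assumption: t_Assembly is included in t_ROM).
assembly_share = {n: round(100 * t_assembly[n] / t_rom[n]) for n in t_hdm}

print(rom_vs_hdm)       # {50: 72, 100: 52, 200: 30}
print(assembly_share)   # {50: 47, 100: 43, 200: 40}
```

Under that assumption, the assembly accounts for roughly 40 to 47 % of the ROM time on the three meshes.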
Figure 4.49: Pressure calculated by the ROM made from the mesh in figure 4.13 for Re=210.
Figure 4.50: Pressure calculated by the HDM made from the mesh in figure 4.13 for Re=210.
LMSQ
In this part we will test the Least mean squares reduction. This particular method is an optimization problem of the type:
$$ q_i = \arg\min_{q} \, \| b - a(V q_{i-1}) \, V q \|_2 \tag{4.11} $$
Here $b$ is the original right-hand side of the HDM and $a$ is the original HDM matrix. This is a least mean squares problem, and to solve it we need two ingredients: a cost function and the gradient of that cost function. We will start by deriving these quantities for the HDM, and then convert them to the ROM problem. For the HDM, we can simply base ourselves on the square of the residual, and obtain our cost functional and gradient as (using Einstein's notation):
$$ J = (f(w) - b)^T (f(w) - b), \qquad \frac{\partial J}{\partial w_j} = 2 \, (f_i(w) - b_i) \, \frac{\partial f_i(w)}{\partial w_j} \tag{4.12} $$
Here $f(w) = a(w_{k-1}) \, w$, with $k-1$ referring to the solution at the previous step. The question now is how to obtain $\partial f_i(w) / \partial w_j$. Fortunately, the manual of FreeFem++ [100] has an example with exactly the same system function (and code), where the incompressible Navier-Stokes equations are solved via a Newton method (instead of fixed-point iterations). The Newton method requires the differential of $f$ to work, and this differential (both the code and the analytical form) is supplied by the manual. In figures 4.51 and 4.52 we can see the evolution of the residual and its gradient, as given in equation 4.12, during the resolution of the HDM. As can be seen in these figures, the proposed residual and gradient decrease monotonically as the HDM converges, which is what would be expected of the true residual and its gradient.
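A generic way to check a cost functional and gradient of the form of equation 4.12, including the sign of the gradient, is a comparison against central finite differences. The sketch below is not the FreeFem++ code: `f` and `jac_f` are a hypothetical toy system function and its analytical Jacobian, used only to make the check runnable.

```python
import numpy as np


def f(w, A0, mu):
    """Hypothetical toy stand-in for the system function f(w)."""
    return A0 @ w + mu * w * w


def jac_f(w, A0, mu):
    """Analytical Jacobian df_i/dw_j of the toy f above."""
    return A0 + 2.0 * mu * np.diag(w)


def cost(w, A0, mu, b):
    """J = (f(w) - b)^T (f(w) - b)."""
    r = f(w, A0, mu) - b
    return r @ r


def grad_cost(w, A0, mu, b):
    """dJ/dw_j = 2 (f_i(w) - b_i) df_i/dw_j."""
    r = f(w, A0, mu) - b
    return 2.0 * jac_f(w, A0, mu).T @ r


def fd_grad(fun, w, h=1e-6):
    """Central finite-difference approximation of the gradient of fun at w."""
    g = np.zeros_like(w)
    for j in range(w.size):
        e = np.zeros_like(w)
        e[j] = h
        g[j] = (fun(w + e) - fun(w - e)) / (2.0 * h)
    return g


rng = np.random.default_rng(3)
n, mu = 8, 0.5
A0 = 3.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
w = rng.standard_normal(n)

g = grad_cost(w, A0, mu, b)
g_fd = fd_grad(lambda x: cost(x, A0, mu, b), w)
rel_err = np.linalg.norm(g - g_fd) / np.linalg.norm(g_fd)
wrong_sign_err = np.linalg.norm(-g - g_fd) / np.linalg.norm(g_fd)
print(f"analytical vs finite differences: {rel_err:.1e}")
print(f"same check with the sign flipped: {wrong_sign_err:.1e}")
```

A gradient coded with the wrong sign shows up immediately in such a check, as a relative difference of about 2 instead of a value near zero.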
The only uncertainty not yet accounted for is the sign of the gradient. However, if we have coded the wrong sign, this will be evident because the optimization algorithms will not converge, and correcting the sign is fairly simple. Now that we have shown that we have a working residual and gradient for the HDM, we
Figure 4.51: Evolution of the cost functional $J$ (logarithmic axis) throughout the iterations of the HDM for several Reynolds numbers (Re = 1, 51, 101, 151 and 201).
Figure 4.52: Evolution of the L-2 norm of the gradient of the cost functional, $DJ$ (logarithmic axis), throughout the iterations of the HDM for several Reynolds numbers (Re = 1, 51, 101, 151 and 201).
need to create one for the ROM. This can be obtained immediately through the following formulas (using Einstein’s notation):
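$$ J(q) = J(V q), \qquad \frac{\partial J}{\partial q_k} = \frac{\partial J}{\partial w_j} \frac{\partial w_j}{\partial q_k} = \frac{\partial J(V q)}{\partial w_j} \, V_{jk} \tag{4.13} $$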
Where $i$, $j$, $k$ are tensor indices. We should now be able to implement LMSQ reduction with minimal hassle.
Table 4.6: Scalability test results for Re = 201 for the (Lin) initialization.
N    t_HDM     t_SVD   t_ROM      N_DOF   e_u     e_p
50   6.21 s    0.30 s  61.829 s   23003   9.89 %  18.70 %
100  36.75 s   1.06 s  375 s      91003   10.1 %  18.6 %
200  265.93 s  4.22 s  2.59e+3 s  362003  10.2 %  18.6 %
In FreeFem++, we found two routines for gradient-based optimization: a Conjugate Gradient optimizer (CG) and a Broyden–Fletcher–Goldfarb–Shanno optimizer (BFGS). These optimization routines come with internal convergence checks: if the algorithm detects that it is diverging, it stops immediately and prints a warning to the console. Upon implementing model reduction with the cost functional and gradient obtained through equation 4.13, neither of the algorithms reported any divergence. If we used a gradient with the opposite sign, both algorithms stopped in fewer than 10 iterations and printed a divergence warning to the console. Despite this, only BFGS systematically reduced the cost functional, while the CG algorithm increased it each time. Even if we let the CG algorithm run for well over 10 iterations, we got increasingly higher residuals at each iteration. For this reason LMSQ reduction was implemented using BFGS only. The performance of the BFGS algorithm (and of most optimization algorithms, for that matter) depends highly upon the starting point given to the optimizer. For this reason, besides the usual sweep test with 10 optimizer iterations, we have decided to test different ways to initialize the optimizer:
• Initialization with null solution (Null).
• Initialization with solution linearly interpolated from neighboring samples using hat functions (Lin).
• Initialization with solution interpolated from neighboring samples via a cosine bump function (cos).
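The reduced cost functional and gradient of equation 4.13, together with the (Null) and (Lin) initializations, can be sketched as follows. Several assumptions are made: SciPy's `minimize` with `method="BFGS"` is swapped in for the FreeFem++ BFGS routine, the system function `f` and the snapshot solver `hdm_solve` are hypothetical toy stand-ins, and `mu` plays the role of the Reynolds number and must lie strictly inside the sampled range.

```python
import numpy as np
from scipy.optimize import minimize


def f(w, A0, mu):
    """Hypothetical toy stand-in for the system function f(w)."""
    return A0 @ w + mu * w * w


def jac_f(w, A0, mu):
    """Analytical Jacobian of the toy f."""
    return A0 + 2.0 * mu * np.diag(w)


def reduced_cost_and_grad(q, V, A0, mu, b):
    """J(q) = J(Vq) and dJ/dq_k = dJ/dw_j V_jk (equation 4.13)."""
    w = V @ q
    r = f(w, A0, mu) - b
    grad_w = 2.0 * jac_f(w, A0, mu).T @ r
    return r @ r, V.T @ grad_w


def hdm_solve(mu, A0, b, n_iter=100):
    """Fixed-point iterations, used here only to produce toy snapshots."""
    w = np.zeros_like(b)
    for _ in range(n_iter):
        w = np.linalg.solve(A0 + mu * np.diag(w), b)
    return w


n = 60
A0 = 3.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
mu_samples = np.array([0.2, 0.4, 0.6, 0.8])
S = np.column_stack([hdm_solve(m, A0, b) for m in mu_samples])
V, _, _ = np.linalg.svd(S, full_matrices=False)

mu = 0.5                                   # strictly between two samples
q_null = np.zeros(V.shape[1])              # (Null) initialization

# (Lin) initialization: hat-function weights on the two neighboring samples.
k = np.searchsorted(mu_samples, mu)        # index of the first sample above mu
t = (mu - mu_samples[k - 1]) / (mu_samples[k] - mu_samples[k - 1])
q_lin = V.T @ ((1.0 - t) * S[:, k - 1] + t * S[:, k])

results = {}
for name, q0 in {"Null": q_null, "Lin": q_lin}.items():
    j0, _ = reduced_cost_and_grad(q0, V, A0, mu, b)
    res = minimize(reduced_cost_and_grad, q0, args=(V, A0, mu, b),
                   jac=True, method="BFGS", options={"maxiter": 10})
    results[name] = (j0, res.fun)
    print(f"{name}: J before = {j0:.2e}, J after at most 10 iterations = {res.fun:.2e}")
```

On this toy problem the linear initializer already starts with a much smaller cost than the null one, which is the effect the different initializations are meant to probe; the behaviour on the actual cavity problem is what the sweep tests report.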
The results can be seen in figures 4.53 and 4.54. It seems that the only implementation that gives reasonable errors (below 10 %) is the one that uses the linear initializer. The errors for the cosine initializer are always near 100 %, except when we recompute the solution at the snapshots. This is expected, since there the exact solution is already given by the initializer. When we do not use previously computed snapshots (the Null initialization), the deviations are always above 100 %, and we do not obtain the exact "interpolation" observed for the Galerkin projection or for LMSQ with the snapshot-based initializers. Finally, the deviations in pressure given by the LMSQ method, in particular with the linear initializer, are nearly 100 times smaller than those observed for the Galerkin projection. It seems that the LMSQ method is more robust against localized variations. However, unlike the Galerkin projection, the LMSQ ROM actually takes more time to compute than the HDM, as can be seen in table 4.6. This might be a problem specific to the optimizers available in FreeFem++: the optimization routines print information to the console at each iteration, which slows them down. We have not yet found a way to control these console prints without affecting other important outputs (such as computation times), and designing, implementing and verifying an optimization routine falls outside the scope of this thesis.
Figure 4.53: Relative velocity error $e_u$ (logarithmic axis) of the constructed ROM for different Reynolds numbers and for the cos, Linear and Null initializations, for the mesh given in figure 4.13. Values below $10^{-3}$ are not shown.
Figure 4.54: Relative pressure error $e_p$ (logarithmic axis) of the constructed ROM for different Reynolds numbers and for the cos, Linear and Null initializations, for the mesh given in figure 4.13. Values below $10^{-3}$ are not shown.