
Chapter 2. Optimization Algorithms

2.3. Local optimization methods

2.3.1. Sequential quadratic programming

Sequential quadratic programming (SQP) is a local direct search method in which constraints are handled explicitly throughout the procedure. In this method, the solution is found by solving a sequence of quadratic programming (QP) problems [184]. SQP can be regarded as a generalization of Newton's method for unconstrained optimization [18, 75, 146, 147], since it finds a step away from the current point by minimizing a quadratic model of the problem. Given the general problem described in expressions (2.1) to (2.3), the main idea of the SQP method is to formulate a QP problem based on a quadratic approximation of the following Lagrangian function (2.4),

$$L(x, \lambda) = f(x) + \sum_{i=1}^{m} \lambda_i \, g_i(x) \tag{2.4}$$

where λ_i are the Lagrange multipliers. The QP problem is then obtained by linearizing the nonlinear constraints, (2.5),

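$$\min_{d \in \mathbb{R}^n} \; \frac{1}{2} d^T H_k d + \nabla f(x_k)^T d \tag{2.5}$$

subject to (2.6),

$$\nabla g_i(x_k)^T d + g_i(x_k) = 0, \quad i = 1, \ldots, m_e \tag{2.6}$$

and to (2.7),

$$\nabla g_i(x_k)^T d + g_i(x_k) \le 0, \quad i = m_e + 1, \ldots, m \tag{2.7}$$

where the solution d_k is the search direction vector and H_k is the Hessian matrix of the Lagrangian function.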

This problem can be solved with any QP algorithm. The SQP implementation consists of three main stages: (1) updating the Hessian matrix; (2) quadratic programming solution; (3) line search and objective function.
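As a minimal illustration of the subproblem (2.5) to (2.6), the sketch below solves one equality-constrained QP step through its KKT linear system. It is not the implementation described in this section: the helper names `solve_linear` and `qp_step_equality` are hypothetical, inequality constraints are left out, and the test problem (minimize x1² + x2² subject to x1 + x2 - 1 = 0) is an assumed example.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def qp_step_equality(H, grad_f, jac_g, g):
    """Solve min 1/2 d^T H d + grad_f^T d  s.t.  jac_g d + g = 0.

    Uses the KKT system  [H  J^T] [d  ]   [-grad_f]
                         [J  0  ] [lam] = [-g     ]
    Returns the step d and the multipliers lam.
    """
    n, m = len(grad_f), len(g)
    K = [[0.0] * (n + m) for _ in range(n + m)]
    for i in range(n):
        for j in range(n):
            K[i][j] = H[i][j]
    for i in range(m):
        for j in range(n):
            K[n + i][j] = jac_g[i][j]   # block J
            K[j][n + i] = jac_g[i][j]   # block J^T
    rhs = [-v for v in grad_f] + [-v for v in g]
    sol = solve_linear(K, rhs)
    return sol[:n], sol[n:]


# Assumed example: f(x) = x1^2 + x2^2, g(x) = x1 + x2 - 1, current point x_k = (0, 0).
H_k = [[2.0, 0.0], [0.0, 2.0]]   # exact Hessian of the Lagrangian here
grad_f_k = [0.0, 0.0]            # gradient of f at x_k
jac_g_k = [[1.0, 1.0]]           # gradient of g at x_k, as a row
g_k = [-1.0]                     # g(x_k)
d_k, lam_k = qp_step_equality(H_k, grad_f_k, jac_g_k, g_k)
```

Because the assumed objective is quadratic and the constraint linear, a single step from the origin lands on the constrained minimizer (0.5, 0.5), which shows the Newton-like character of the method.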

2.3.1.1. Updating the Hessian matrix

A positive definite quasi-Newton approximation of the Hessian of the Lagrangian function (2.4) is computed at each iteration with the BFGS update (2.8),

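$$H_{k+1} = H_k + \frac{q_k q_k^T}{q_k^T s_k} - \frac{H_k s_k s_k^T H_k^T}{s_k^T H_k s_k} \tag{2.8}$$

where s_k = x_{k+1} - x_k, and q_k is obtained by expression (2.9).

$$q_k = \left( \nabla f(x_{k+1}) + \sum_{i=1}^{m} \lambda_i \nabla g_i(x_{k+1}) \right) - \left( \nabla f(x_k) + \sum_{i=1}^{m} \lambda_i \nabla g_i(x_k) \right) \tag{2.9}$$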
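The update (2.8) can be sketched in a few lines of plain Python. The function name `bfgs_update` and the numerical values are illustrative assumptions; the code expects q_k^T s_k > 0, which is the condition under which the update stays positive definite.

```python
def bfgs_update(H, s, q):
    """Return H_{k+1} from expression (2.8), given H_k, s_k and q_k.

    H is a list of rows, s and q are lists of floats.
    """
    n = len(s)
    Hs = [sum(H[i][j] * s[j] for j in range(n)) for i in range(n)]  # H_k s_k
    qs = sum(q[i] * s[i] for i in range(n))                        # q_k^T s_k
    sHs = sum(s[i] * Hs[i] for i in range(n))                      # s_k^T H_k s_k
    if qs <= 0.0:
        raise ValueError("q_k^T s_k must be positive for this update")
    return [
        [H[i][j] + q[i] * q[j] / qs - Hs[i] * Hs[j] / sHs for j in range(n)]
        for i in range(n)
    ]


# Assumed data: identity start, a unit step along x1, and a gradient change q_k.
H0 = [[1.0, 0.0], [0.0, 1.0]]
s0 = [1.0, 0.0]
q0 = [2.0, 1.0]
H1 = bfgs_update(H0, s0, q0)
```

A useful check on any implementation is the secant condition H_{k+1} s_k = q_k, which holds for the values above.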

Powell [146, 147] recommends keeping the Hessian positive definite throughout the procedure, even though it might be indefinite at the solution point. A positive definite Hessian is maintained provided that q_k^T s_k is positive at each update and that H_k is initialized with a positive definite matrix. When q_k^T s_k is not positive, q_k is modified on an element-by-element basis so that q_k^T s_k > 0. The aim of this modification is to distort the elements of q_k that contribute to a positive definite update as little as possible. Therefore, in an initial phase of the modification, the most negative element of the elementwise product of q_k and s_k is repeatedly halved. This is continued until q_k^T s_k is greater than or equal to a small negative tolerance. If q_k^T s_k is still not positive after this procedure, q_k is modified by adding a vector v multiplied by a constant scalar w, that is, q_k = q_k + w v. Vector v is computed through expression (2.10).

$$v_i = \nabla g_i(x_{k+1}) \, g_i(x_{k+1}) - \nabla g_i(x_k) \, g_i(x_k) \tag{2.10}$$

if (q_k)_i w < 0 and (q_k)_i (s_k)_i < 0, for i = 1, …, m; otherwise v_i = 0. The scalar w is increased systematically until q_k^T s_k becomes positive.
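The first phase of this modification, the repeated halving, can be sketched as follows. This is only an illustration: the function name `halve_most_negative` and the tolerance value are assumptions, and the second phase (adding w v) is not shown.

```python
def halve_most_negative(q, s, tol=1e-5):
    """Halve the element of q with the most negative product q_i * s_i
    until q^T s >= -tol. Returns a modified copy of q.

    tol is an assumed value for the 'small negative tolerance'.
    """
    q = list(q)
    while sum(qi * si for qi, si in zip(q, s)) < -tol:
        products = [qi * si for qi, si in zip(q, s)]
        i = min(range(len(q)), key=lambda k: products[k])  # most negative term
        q[i] *= 0.5
    return q


# Assumed data: q^T s = 3 - 4 = -1 before the modification.
q_mod = halve_most_negative([3.0, -4.0], [1.0, 1.0])
# One halving gives q = [3.0, -2.0], so q^T s = 1 and the loop stops.
```

Only the offending element is touched, so the components of q_k that already contribute positively to q_k^T s_k are left undistorted. The loop stops as soon as q_k^T s_k reaches the tolerance, which does not by itself guarantee a strictly positive value; that case is what the second phase handles.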

2.3.1.2. Quadratic programming solution

A QP problem of the following form, obtained from expressions (2.5) to (2.7), is solved at each iteration, (2.11),

$$\min_{d \in \mathbb{R}^n} \; q(d) = \frac{1}{2} d^T H d + c^T d \tag{2.11}$$

subject to (2.12),

$$A_i d = b_i, \quad i = 1, \ldots, m_e \tag{2.12}$$

and (2.13),

$$A_i d \le b_i, \quad i = m_e + 1, \ldots, m \tag{2.13}$$

where A_i refers to the i-th row of the m-by-n matrix A. An active-set strategy may be used to solve this problem [65]. The procedure involves two phases: the first computes a feasible point, and the second generates an iterative sequence of feasible points that converges to the solution. In this method an active set Ā_k, which is an estimate of the constraints that are active at the solution point, is maintained. The active set is updated at each iteration k and is used to form a basis for the search direction d_k. The search direction d_k is computed so that it minimizes the objective function while remaining on any active constraint boundaries.

The feasible subspace for d_k is formed from a basis Z_k whose columns are orthogonal to the estimate of the active set Ā_k, that is, Ā_k Z_k = 0. Thus a search direction formed from any linear combination of the columns of Z_k is guaranteed to remain on the boundaries of the active constraints. The matrix Z_k is formed from the last m - l columns of the QR decomposition of the matrix Ā_k^T, where l is the number of active constraints, that is, those on the constraint boundaries. Once Z_k is found, a new search direction d_k is sought that minimizes q(d), where d_k is in the null space of the active constraints. Therefore d_k is a linear combination of the columns of Z_k: d_k = Z_k p, for some vector p. Substituting d_k = Z_k p into expression (2.11) and viewing it as a function of p gives expression (2.14).

$$q(p) = \frac{1}{2} p^T Z_k^T H Z_k p + c^T Z_k p \tag{2.14}$$

Differentiating this with respect to p yields expression (2.15).

$$\nabla q(p) = Z_k^T H Z_k p + Z_k^T c \tag{2.15}$$

The gradient ∇q(p) is referred to as the projected gradient of the quadratic function, because it is the gradient projected onto the subspace defined by Z_k. The minimum of q(p) in the subspace defined by Z_k occurs when the projected gradient is null, which corresponds to the solution of the following system of linear equations (2.16).

$$Z_k^T H Z_k p = -Z_k^T c \tag{2.16}$$
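The projected system (2.16) can be illustrated with a small sketch. The function names and the data are assumptions made for the example; in particular, the basis Z_k is supplied by hand here rather than obtained from a QR decomposition.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def null_space_direction(H, c, Z):
    """Solve Z^T H Z p = -Z^T c and return (p, d) with d = Z p.

    H is n-by-n, c has length n, Z is n-by-r (columns span the null space
    of the active constraints).
    """
    n, r = len(Z), len(Z[0])
    ZtHZ = [
        [sum(Z[i][a] * H[i][j] * Z[j][b] for i in range(n) for j in range(n))
         for b in range(r)]
        for a in range(r)
    ]                                                    # projected Hessian
    Ztc = [sum(Z[i][a] * c[i] for i in range(n)) for a in range(r)]
    p = solve_linear(ZtHZ, [-v for v in Ztc])
    d = [sum(Z[i][a] * p[a] for a in range(r)) for i in range(n)]
    return p, d


# Assumed data: one active constraint with row [1, 1], so Z = [1, -1]^T.
H = [[2.0, 0.0], [0.0, 2.0]]
c = [-2.0, -6.0]
Z = [[1.0], [-1.0]]
p, d = null_space_direction(H, c, Z)
```

For these values the direction is d = (-1, 1), which satisfies d1 + d2 = 0 and therefore stays on the boundary of the active constraint.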

A step is then taken of the form x_{k+1} = x_k + α d_k. At each iteration, due to the quadratic nature of the objective function, there are only two choices of step length α. A step of unity along d_k is the exact step to the minimum of the function restricted to the null space of Ā_k. If such a step can be taken without violating the constraints, then this is the solution of the QP problem. Otherwise, the step along d_k to the nearest constraint is less than unity and a new constraint is included in the active set at the next iteration. The distance to the constraint boundaries in any direction d_k is given by (2.17),

$$\alpha = \min_{i \in \{1, \ldots, m\}} \left\{ \frac{-(A_i x_k - b_i)}{A_i d_k} \right\} \tag{2.17}$$

which is defined for constraints not in the active set and for which the direction d_k points towards the constraint boundary, that is, A_i d_k > 0. When n independent constraints are included in the active set without the minimum having been located, the Lagrange multipliers λ_k are computed so as to satisfy the nonsingular set of linear equations (2.18).

$$\bar{A}_k^T \lambda_k = c \tag{2.18}$$

If all elements of λ_k are positive, then x_k is the optimal solution of the QP problem. However, if any component of λ_k is negative and does not correspond to an equality constraint, then the corresponding constraint is deleted from the active set and a new iterate is sought.
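The choice between a unit step and a shortened step, governed by expression (2.17), can be sketched as a ratio test. The function name `step_to_nearest_constraint` is hypothetical, and the restriction to constraints with A_i d_k > 0 (those the direction is moving towards) is the usual convention, assumed here.

```python
def step_to_nearest_constraint(A, b, x, d, active):
    """Return (alpha, blocking): the step length along d, capped at 1,
    and the index of the constraint that limits it (None for a full step).

    Constraints are A_i x <= b_i; indices in `active` are skipped.
    """
    alpha, blocking = 1.0, None
    for i, (row, b_i) in enumerate(zip(A, b)):
        if i in active:
            continue
        A_d = sum(a * dj for a, dj in zip(row, d))
        if A_d > 0.0:  # moving towards the boundary of constraint i
            A_x = sum(a * xj for a, xj in zip(row, x))
            ratio = -(A_x - b_i) / A_d
            if ratio < alpha:
                alpha, blocking = ratio, i
    return alpha, blocking


# Assumed data: bounds x1 <= 1 and x2 <= 1, starting from the origin.
A = [[1.0, 0.0], [0.0, 1.0]]
b = [1.0, 1.0]
alpha, blocking = step_to_nearest_constraint(A, b, [0.0, 0.0], [2.0, 0.0], set())
# The step is cut to alpha = 0.5 by constraint 0, which would join the active set.
```

When no constraint is met before the unit step, the function returns alpha = 1 and no blocking index, which corresponds to the first of the two step-length choices described above.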

2.3.1.3. Line search and objective function

The solution to the QP problem produces a vector d_k, which is used to form a new iterate, (2.19).

$$x_{k+1} = x_k + \alpha_k d_k \tag{2.19}$$

The step length parameter α_k is determined so as to produce a sufficient decrease in the objective function used for the line search, usually called a merit function. The form used by Han [75] and Powell [146, 147] is the following, (2.20),

$$\Psi(x) = f(x) + \sum_{i=1}^{m_e} r_i \, g_i(x) + \sum_{i=m_e+1}^{m} r_i \max[0, g_i(x)] \tag{2.20}$$

where r_i is the penalty parameter, obtained by expression (2.21).

$$r_i = (r_{k+1})_i = \max_i \left\{ \lambda_i, \frac{(r_k)_i + \lambda_i}{2} \right\}, \quad i = 1, \ldots, m \tag{2.21}$$

This allows a positive contribution from constraints that are inactive in the QP solution but were recently active.
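Expressions (2.20) and (2.21) translate almost directly into code. The sketch below is illustrative only: the function names `merit` and `update_penalties` and all numerical values are assumptions, and the equality term is coded exactly as written in (2.20).

```python
def merit(f_val, g_vals, r, m_e):
    """Evaluate expression (2.20).

    f_val  : objective value f(x)
    g_vals : constraint values g_i(x); the first m_e are equalities
    r      : penalty parameters r_i
    """
    eq_term = sum(r[i] * g_vals[i] for i in range(m_e))
    ineq_term = sum(r[i] * max(0.0, g_vals[i]) for i in range(m_e, len(g_vals)))
    return f_val + eq_term + ineq_term


def update_penalties(r, lam):
    """Apply expression (2.21): r_i <- max(lam_i, (r_i + lam_i) / 2)."""
    return [max(l_i, 0.5 * (r_i + l_i)) for r_i, l_i in zip(r, lam)]


# Assumed data: one equality and two inequalities, the second one satisfied.
psi = merit(1.0, [0.5, -2.0, 0.25], [2.0, 1.0, 4.0], m_e=1)

# Second constraint has a zero multiplier (inactive in the QP solution)
# but keeps half of its previous penalty.
r_new = update_penalties([1.0, 4.0], [3.0, 0.0])
```

In the second call the penalty of the inactive constraint drops from 4 to 2 rather than to 0, which is the positive contribution from recently active constraints mentioned above.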

Although similar to other traditional active-set algorithms [184], the SQP algorithm presents some differences:

(1) Strict feasibility with respect to bounds. The SQP algorithm takes every step in the region constrained by the bounds.

(2) Robustness. During the iterations, the SQP algorithm can attempt a step that fails. In this situation the algorithm attempts a smaller step.

(3) Refactored linear algebra routines. The SQP algorithm uses a different set of linear algebra routines to solve the QP problems. These routines are more efficient in both memory usage and speed than traditional active-set routines.

(4) Reformulated feasibility routines. The SQP algorithm has two approaches to the solution when the constraints are not satisfied: (a) it combines the objective and constraint functions into a single objective function. This modified problem can lead to a feasible solution, but it has more variables than the original problem, which can slow the solution of the QP problem [172, 186]; (b) when an attempted step causes the constraint violation to grow, it attempts to regain feasibility using a second-order approximation to the constraints. This technique can slow the solution by requiring more evaluations of the nonlinear constraint functions.

2.4. Global optimization methods

From the document: José António Silva de Carvalho Campos e Matos, Uncertainty Evaluation of Reinforced Concrete and Composite Structures Behavior (pages 42-46)