Chapter 2. Optimization Algorithms
2.3. Local optimization methods
2.3.1. Sequential quadratic programming
Sequential quadratic programming (SQP) is a local direct search method in which constraints are handled explicitly throughout the procedure. In this method, the solution is found by solving a sequence of quadratic programming (QP) problems [184]. SQP can be regarded as a generalization of Newton's method for unconstrained optimization [18, 75, 146, 147], as it finds a step away from the current point by minimizing a quadratic model of the problem. Given the general problem described in expressions (2.1) to (2.3), the main idea of the SQP method is to formulate a QP problem based on a quadratic approximation of the following Lagrangian function (2.4),
\[ L(x, \lambda) = f(x) + \sum_{i=1}^{m} \lambda_i \cdot g_i(x) \tag{2.4} \]
where λi are the Lagrange multipliers. The QP problem is then obtained by linearizing the nonlinear constraints, (2.5),
\[ \min_{d \in \mathbb{R}^n} \; \frac{1}{2} d^T H_k d + \nabla f(x_k)^T d \tag{2.5} \]
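subject to (2.6),
\[ \nabla g_i(x_k)^T d + g_i(x_k) = 0, \quad i = 1, \ldots, m_e \tag{2.6} \]
and to (2.7),
\[ \nabla g_i(x_k)^T d + g_i(x_k) \le 0, \quad i = m_e + 1, \ldots, m \tag{2.7} \]
where dk, the solution of this problem, is the search direction vector and Hk is the Hessian matrix of the Lagrangian function.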
This problem can be solved with any QP algorithm. The SQP implementation consists of three main stages: (1) updating the Hessian matrix; (2) quadratic programming solution; (3) line search and merit function.
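The structure of the QP problem (2.5)-(2.6) can be illustrated with a minimal sketch restricted to equality constraints. Several simplifications are assumed here and are not part of the procedure described in this section: the exact Hessian of the Lagrangian is used instead of a quasi-Newton approximation, the full step is always taken (no line search), and the QP is solved directly through its optimality (KKT) linear system. The function name `sqp_equality` and the test problem are hypothetical.

```python
import numpy as np

def sqp_equality(grad_f, g, jac_g, hess_lagrangian, x0, lam0, iterations=20):
    """Full-step SQP sketch for: minimize f(x) subject to g(x) = 0."""
    x = np.asarray(x0, dtype=float)
    lam = np.asarray(lam0, dtype=float)
    n, m = x.size, lam.size
    for _ in range(iterations):
        H = hess_lagrangian(x, lam)   # Hessian of L(x, lambda), see (2.4)
        A = jac_g(x)                  # m-by-n matrix, row i is grad g_i(x)^T
        # Optimality conditions of the QP (2.5)-(2.6):
        #   H d + A^T lambda = -grad f(x),   A d = -g(x)
        K = np.block([[H, A.T], [A, np.zeros((m, m))]])
        rhs = -np.concatenate([grad_f(x), g(x)])
        sol = np.linalg.solve(K, rhs)
        d, lam = sol[:n], sol[n:]
        x = x + d                     # unit step, no line search in this sketch
        if np.linalg.norm(d) < 1e-12:
            break
    return x, lam

# Hypothetical test problem: minimize x1 + x2 subject to x1^2 + x2^2 - 2 = 0.
grad_f = lambda x: np.array([1.0, 1.0])
g = lambda x: np.array([x @ x - 2.0])
jac_g = lambda x: 2.0 * x.reshape(1, -1)
hess_lagrangian = lambda x, lam: 2.0 * lam[0] * np.eye(2)

x_opt, lam_opt = sqp_equality(grad_f, g, jac_g, hess_lagrangian,
                              x0=[-1.5, -0.5], lam0=[1.0])
```

For this problem the iterates approach the constrained minimizer (-1, -1), with multiplier 0.5. Inequality constraints, the quasi-Newton update and the line search are what the three stages listed above add to this basic iteration.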
2.3.1.1. Updating the Hessian matrix
A positive definite quasi-Newton approximation of the Hessian of the Lagrangian function (2.4) is computed at each iteration using the BFGS update (2.8),
\[ H_{k+1} = H_k + \frac{q_k q_k^T}{q_k^T s_k} - \frac{H_k s_k s_k^T H_k^T}{s_k^T H_k s_k} \tag{2.8} \]
where sk = xk+1 - xk, and qk is obtained by expression (2.9).
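\[ q_k = \left( \nabla f(x_{k+1}) + \sum_{i=1}^{m} \lambda_i \cdot \nabla g_i(x_{k+1}) \right) - \left( \nabla f(x_k) + \sum_{i=1}^{m} \lambda_i \cdot \nabla g_i(x_k) \right) \tag{2.9} \]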
Powell [146, 147] recommends keeping the Hessian approximation positive definite throughout the procedure, even though the Hessian may be indefinite at the solution point. A positive definite Hessian is maintained provided that qkTsk is positive at each update and that Hk is initialized with a positive definite matrix. When qkTsk is not positive, qk is modified on an element-by-element basis so that qkTsk > 0. The aim of this modification is to distort as little as possible the elements of qk that contribute to a positive definite update. Therefore, in an initial phase of the modification, the most negative element of the element-wise product of qk and sk is repeatedly halved. This procedure is continued until qkTsk is greater than or equal to a small negative tolerance. If, after this procedure, qkTsk is still not positive, qk is modified by adding a vector v multiplied by a constant scalar w, that is, qk = qk + w v. Vector v is computed through expression (2.10).
\[ v_i = \nabla g_i(x_{k+1}) \cdot g_i(x_{k+1}) - \nabla g_i(x_k) \cdot g_i(x_k) \tag{2.10} \]
if (qk)i w < 0 and (qk)i (sk)i < 0, for i = 1, …, m; otherwise vi = 0. The scalar w is then increased systematically until qkTsk becomes positive.
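The update (2.8) can be written in a few lines. The following is a minimal sketch, assuming NumPy arrays for Hk, sk and qk. The function name `bfgs_update` and the numerical values are hypothetical, and the element-wise repair of qk described above is not reproduced: the sketch simply rejects an update for which qkTsk is not positive.

```python
import numpy as np

def bfgs_update(H, s, q):
    """Return H_{k+1} from H_k, s_k = x_{k+1} - x_k and q_k, following (2.8)."""
    qs = q @ s
    if qs <= 0.0:
        # Simplification of this sketch: no modification of q_k is attempted.
        raise ValueError("q_k^T s_k must be positive to keep H positive definite")
    Hs = H @ s
    # outer(Hs, Hs) equals H s s^T H^T
    return H + np.outer(q, q) / qs - np.outer(Hs, Hs) / (s @ Hs)

H0 = np.eye(2)                 # positive definite initialization
s = np.array([1.0, 0.5])       # hypothetical step x_{k+1} - x_k
q = np.array([2.0, 0.5])       # hypothetical gradient difference (2.9)
H1 = bfgs_update(H0, s, q)
```

Two properties can be checked on the result: the updated matrix satisfies H1 s = q, and it remains positive definite because qTs > 0 and H0 is positive definite.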
2.3.1.2. Quadratic programming solution
A QP problem of the following form, obtained from expressions (2.5) to (2.7), is solved at each iteration, (2.11),
\[ \min_{d \in \mathbb{R}^n} q(d) = \frac{1}{2} d^T H d + c^T d \tag{2.11} \]
subject to (2.12),
Aidk = bi, i = 1, …,me (2.12)
and (2.13),
Aidk ≤ bi, i = me+1,…,m (2.13)
where Ai refers to the i-th row of the m-by-n matrix A. An active-set strategy may be used to solve this problem [65]. The procedure involves two phases: the first computes a feasible point, while the second generates an iterative sequence of feasible points that converges to the solution. In this method an active set Āk, which is an estimate of the constraints that are active at the solution point, is maintained. The active set is updated at each iteration k and is used to form a basis for the search direction dk. The search direction dk is computed so that it minimizes the objective function while remaining on any active constraint boundaries.
The feasible subspace for dk is formed from a basis Zk whose columns are orthogonal to the estimate of the active set Āk (that is, ĀkZk = 0). Thus a search direction formed from any linear combination of the columns of Zk is guaranteed to remain on the boundaries of the active constraints. The matrix Zk is formed from the last n - l columns of the orthogonal factor of the QR decomposition of the matrix ĀkT, where l is the number of active constraints, that is, those lying on the constraint boundaries. Once Zk is found, a new search direction dk is sought that minimizes q(d), where dk is in the null space of the active constraints. Therefore dk is a linear combination of the columns of Zk: dk = Zk p, for some vector p. Substituting this into expression (2.11) and viewing the quadratic as a function of p yields expression (2.14).
\[ q(p) = \frac{1}{2} p^T Z_k^T H Z_k p + c^T Z_k p \tag{2.14} \]
Differentiating this with respect to p yields expression (2.15).
\[ \nabla q(p) = Z_k^T H Z_k p + Z_k^T c \tag{2.15} \]
This gradient is referred to as the projected gradient of the quadratic function, since it is the gradient projected onto the subspace defined by Zk. The minimum of q(p) in the subspace defined by Zk occurs when the projected gradient is null, which corresponds to the solution of the following system of linear equations (2.16).
\[ Z_k^T H Z_k p = -Z_k^T c \tag{2.16} \]
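The null-space computation leading to (2.16) can be sketched as follows. This is an illustrative example, assuming NumPy, an active-set matrix with linearly independent rows, and a projected Hessian that is nonsingular. The function name `null_space_step` and the data are hypothetical.

```python
import numpy as np

def null_space_step(H, c, A_active):
    """Minimize 0.5 d^T H d + c^T d subject to A_active d = 0."""
    l, n = A_active.shape
    # Complete QR factorization of the transposed active-set matrix: the
    # columns of Q after the first l are orthogonal to every active row.
    Q, _ = np.linalg.qr(A_active.T, mode="complete")
    Z = Q[:, l:]
    # Projected system (2.16): (Z^T H Z) p = -Z^T c
    p = np.linalg.solve(Z.T @ H @ Z, -Z.T @ c)
    return Z @ p                      # d = Z p

H = np.array([[2.0, 0.0], [0.0, 2.0]])
c = np.array([-2.0, -4.0])
A_active = np.array([[1.0, 1.0]])     # one active constraint: d1 + d2 = 0
d = null_space_step(H, c, A_active)
```

In this example the direction is d = (-0.5, 0.5), which can be verified by substituting d2 = -d1 into the quadratic and minimizing over d1.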
A step is then taken of the form xk+1 = xk + α dk. At each iteration, due to the quadratic nature of the objective function, there are only two choices of step length (α). A step of unity along dk is the exact step to the minimum of the function, restricted to the null space of Āk. If such a step can be taken without violating the constraints, then this is the solution of the problem. Otherwise, the step along dk to the nearest constraint is less than unity and a new constraint is included in the active set at the next iteration. The distance to the constraint boundaries in any direction dk is given by (2.17),
\[ \alpha = \min_{i \in \{1, \ldots, m\}} \left\{ \frac{-(A_i x_k - b_i)}{A_i d_k} \right\} \tag{2.17} \]
which is defined for constraints not in the active set. When n independent constraints are included in the active set, without location of the minimum, the Lagrange multipliers (λk) are computed in order to satisfy the nonsingular set of linear equations (2.18).
\[ \bar{A}_k^T \lambda_k = c \tag{2.18} \]
If all elements of λk are positive, then xk is the optimal solution of the QP problem. However, if any component of λk is negative, and the component does not correspond to an equality constraint, then the corresponding constraint is removed from the active set and a new iterate is sought.
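The choice between the unit step and the step to the nearest constraint, expression (2.17), can be sketched as below. As an assumption of this sketch, the minimum is taken only over inactive constraints with Ai dk > 0, that is, constraints towards which the step actually moves. The function name `step_to_nearest_constraint`, the tolerance and the data are hypothetical.

```python
import numpy as np

def step_to_nearest_constraint(A, b, x, d, active):
    """Return (alpha, blocking index) for the step x + alpha * d, alpha <= 1."""
    alpha, blocking = 1.0, None       # unit step unless a constraint blocks it
    for i in range(A.shape[0]):
        if i in active:
            continue                  # (2.17) is defined for inactive constraints
        Ad = A[i] @ d
        if Ad > 1e-12:                # assumption: only these can become violated
            ratio = -(A[i] @ x - b[i]) / Ad
            if ratio < alpha:
                alpha, blocking = ratio, i
    return alpha, blocking

A = np.array([[1.0, 0.0], [0.0, 1.0]])   # constraints x1 <= 1 and x2 <= 1
b = np.array([1.0, 1.0])
x = np.array([0.0, 0.0])
d = np.array([2.0, 1.0])
alpha, blocking = step_to_nearest_constraint(A, b, x, d, active=set())
```

Here the unit step would violate the first constraint, so the step is shortened to α = 0.5 and constraint 0 is the one to be added to the active set at the next iteration.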
2.3.1.3. Line search and merit function
The solution to the QP problem produces a vector dk which is used to form a new iterate, (2.19).
xk+1 = xk + αdk (2.19)
The step length parameter αk is determined so as to produce a sufficient decrease in a merit function. The merit function used by Han [75] and Powell [146, 147] has the following form, (2.20),
\[ \Psi(x) = f(x) + \sum_{i=1}^{m_e} r_i \cdot g_i(x) + \sum_{i=m_e+1}^{m} r_i \cdot \max[0, g_i(x)] \tag{2.20} \]
where ri is the penalty parameter, obtained by expression (2.21).
\[ r_i = (r_{k+1})_i = \max_i \left\{ \lambda_i, \frac{(r_k)_i + \lambda_i}{2} \right\}, \quad i = 1, \ldots, m \tag{2.21} \]
This allows a positive contribution from constraints that are inactive in the QP solution but were recently active.
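Expressions (2.20) and (2.21) translate directly into code. The following minimal sketch assumes NumPy arrays in which the first me entries correspond to the equality constraints. The function names `update_penalties` and `merit`, and all numerical values, are hypothetical.

```python
import numpy as np

def update_penalties(r_prev, lam):
    """Penalty update (2.21): r_i = max(lambda_i, ((r_k)_i + lambda_i) / 2)."""
    return np.maximum(lam, 0.5 * (r_prev + lam))

def merit(f_val, g_val, r, m_e):
    """Merit function (2.20); the first m_e entries of g_val are equalities."""
    eq_term = r[:m_e] @ g_val[:m_e]
    ineq_term = r[m_e:] @ np.maximum(0.0, g_val[m_e:])
    return f_val + eq_term + ineq_term

r_prev = np.array([1.0, 4.0])         # penalties from the previous iteration
lam = np.array([3.0, 1.0])            # multipliers from the current QP solution
r = update_penalties(r_prev, lam)     # -> [3.0, 2.5]

# One equality constraint (value 0.5) and one inequality constraint.
psi_satisfied = merit(1.0, np.array([0.5, -2.0]), r, m_e=1)   # inequality holds
psi_violated = merit(1.0, np.array([0.5, 2.0]), r, m_e=1)     # inequality violated
```

The second penalty stays at 2.5 although its multiplier dropped to 1.0, which is the averaging effect of (2.21). A satisfied inequality contributes nothing to the merit value (2.5 here), whereas a violated one is penalized (7.5 here).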
Although similar to other traditional active-set algorithms [184], SQP presents some differences: (1) strict feasibility with respect to bounds. The SQP algorithm takes every step within the region constrained by the bounds; (2) robustness. During its iterations the SQP algorithm may attempt a step that fails. In this situation the algorithm attempts a smaller step; (3) refactored linear algebra routines. The SQP algorithm uses a different set of linear algebra routines to solve the QP problems. These routines are more efficient in both memory usage and speed than the traditional active-set routines; (4) reformulated feasibility routines. The SQP algorithm has two new approaches for the case in which constraints are not satisfied: (a) the algorithm combines the objective and constraint functions into a merit function. This modified problem can lead to a feasible solution, but it has more variables than the original problem, which can slow the solution of the QP problem [172, 186]; (b) when an attempted step causes the constraint violation to grow, the algorithm attempts to regain feasibility using a second-order approximation to the constraints. This technique can slow the solution by requiring more evaluations of the nonlinear constraint functions.
2.4. Global optimization methods