
4.2 Algebraic Algorithms

4.2.2 The Maximum Likelihood Expectation Maximization (MLEM) algorithm

The MLEM algorithm, developed by Shepp and Vardi [104] and by Lange and Carson [68], presents a way to solve for

\hat{\lambda} = \arg\max_{\lambda} \Phi,

with Φ being either the unpenalized or the penalized objective function.

Footnote 5: There exist many possible solutions of λ compatible with the measurements p.

The appeal of this algorithm comes from the fact that it yields an iterative and monotonic scheme capable of dealing with the major difficulty encountered in the formulation of the ML estimator under a Poisson model: the maximization of Φ. How this maximization is achieved can be seen from the perspective of the optimization transfer principle (Appendix A.4).
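As a quick reminder of that principle (the full treatment is in Appendix A.4; the surrogate notation φ below is ours, not the thesis's): if a surrogate function φ(λ; λ^{(k)}) satisfies

\Phi(\lambda) - \Phi(\lambda^{(k)}) \;\ge\; \phi(\lambda; \lambda^{(k)}) - \phi(\lambda^{(k)}; \lambda^{(k)}) \quad \text{for all } \lambda,

then choosing \lambda^{(k+1)} = \arg\max_{\lambda} \phi(\lambda; \lambda^{(k)}) guarantees \Phi(\lambda^{(k+1)}) \ge \Phi(\lambda^{(k)}), i.e., a monotonic algorithm. The conditional expectation Q(λ, λ^{(k)}) constructed below plays precisely this surrogate role for the Poisson log-likelihood.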

Let's consider the unpenalized case, where the objective function is more easily written and manipulated in its log-likelihood form (see Eq. (4.7)).

L(\lambda) = \log\left(P(p \mid \lambda)\right) = \log \prod_d \frac{e^{-\bar{p}_d}\, \bar{p}_d^{\,p_d}}{p_d!}    (4.9)

= \sum_d \left[\, p_d \log(\bar{p}_d) - \bar{p}_d - \log(p_d!) \,\right].    (4.10)

By incorporating (4.4) into (4.10), one finds an expression in terms of λ, as follows:

L(\lambda) = \sum_d \left[\, p_d \log\!\left(\sum_b \lambda_b R_{db}\right) - \sum_b \lambda_b R_{db} - \log(p_d!) \,\right]    (4.11)

Unfortunately, maximization of (4.11) is intractable because the sum over voxels appears inside the logarithm, coupling all the λ_b. To overcome this problem, a first component present in every EM approach is used. Instead of the incomplete data p (i.e., observed data giving no direct access to the hidden data set λ), a complete data set (i.e., a set of random variables that in general was not observed, but that would have simplified the estimation had it been observed) is used.

In [104], Shepp and Vardi proposed to use as complete data the number of detections captured by detector tube d and emitted from voxel b (i.e., p_db). This choice simplifies the derivation of the ML estimator. In fact, if Eq. (4.9) is rewritten using the complete data, the log-likelihood is:

L(\lambda) = \log\left(P(p \mid \lambda)\right) = \log \prod_{d,b} \frac{e^{-\lambda_{db}}\, \lambda_{db}^{\,p_{db}}}{p_{db}!} = \sum_{d,b} \left[\, p_{db} \log(\lambda_{db}) - \lambda_{db} - \log(p_{db}!) \,\right],    (4.12)

where λ_{db} = λ_b R_{db} is the mean number of counts emitted from voxel b and detected in tube d. This way, maximization of the log-likelihood in (4.12) is much easier to perform than in (4.11). Indeed, taking the first derivative of (4.12),

\frac{\partial L(\lambda)}{\partial \lambda_b} = \frac{\partial}{\partial \lambda_b} \sum_{d,b} \left[ -\lambda_{db} + \log\!\left(\frac{\lambda_{db}^{\,p_{db}}}{p_{db}!}\right) \right]

= \frac{\partial}{\partial \lambda_b} \left[ -\sum_b \lambda_b \sum_d R_{db} + \sum_{d,b} p_{db} \log(\lambda_b R_{db}) + \mathrm{Cst} \right]

= \frac{\partial}{\partial \lambda_b} \left[ -\sum_b \lambda_b \sum_d R_{db} + \sum_{d,b} p_{db} \log(\lambda_b) + \mathrm{Cst} \right]

\frac{\partial L(\lambda)}{\partial \lambda_b} = 0 \;\Leftrightarrow\; -\sum_d R_{db} + \frac{\sum_d p_{db}}{\lambda_b} = 0

\;\Leftrightarrow\; \hat{\lambda}_b = \frac{\sum_d p_{db}}{\sum_d R_{db}}    (4.13)

Here, a second ingredient of the EM algorithm is incorporated. Since we do not have access to the complete data p_db, Eq. (4.12) is replaced by its conditional expectation given the measurements p_d and the current estimate λ^{(k)}. Let's define this as Q(λ, λ^{(k)}), which has the following form:

Q(\lambda, \lambda^{(k)}) = E\!\left[\log\left(P(p_{db} \mid \lambda)\right) \mid p_d, \lambda^{(k)}\right]    (4.14)

Further, recall that for independent Poisson variables X, Y with means λ_X, λ_Y, the variable X conditioned on the sum X + Y is binomially distributed, so that E[X \mid X+Y] = (X+Y)\,\lambda_X/(\lambda_X+\lambda_Y) [104]. Thus, the conditional expectation needed in Q(λ, λ^{(k)}) can be calculated as

E[p_{db} \mid p_d, \lambda] = \frac{p_d\, \lambda_{db}}{\sum_{b'} \lambda_{db'}} = \frac{p_d\, \lambda_b R_{db}}{\sum_{b'} \lambda_{b'} R_{db'}}    (4.15)

Then, substituting equation (4.15) into (4.13), we get

\lambda_b^{(k+1)} = \frac{\lambda_b^{(k)}}{\sum_d R_{db}} \sum_d \frac{p_d R_{db}}{\sum_{b'} \lambda_{b'}^{(k)} R_{db'}}    (4.16)

The pseudo-code for the MLEM algorithm is:

Algorithm 4.2.1: EM(MLEM)
for k ← 0 to n-iterations do
    p̄_d = Σ_b R_db λ_b^(k),   d = 1, ..., M
    for b ← 1 to N do
        C_b^(k) = Σ_d p_d R_db / p̄_d
        λ_b^(k+1) = λ_b^(k) C_b^(k) / Σ_d R_db
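For concreteness, the following is a minimal NumPy sketch of Algorithm 4.2.1, not the thesis's own implementation; all names (mlem, R, p, n_iter) and the small guard against division by zero are our assumptions:

import numpy as np

def mlem(R, p, n_iter, lam0=None):
    # Minimal sketch of Algorithm 4.2.1.
    # R: (M, N) system matrix R_db; p: (M,) measured counts p_d.
    N = R.shape[1]
    # Non-null starting image, as required for non-negativity (Section 4.2.3).
    lam = np.ones(N) if lam0 is None else lam0.astype(float).copy()
    sens = np.maximum(R.sum(axis=0), 1e-12)       # sum_d R_db for each voxel b
    for _ in range(n_iter):
        p_bar = R @ lam                           # forward projection: p_bar_d = sum_b R_db lam_b
        C = R.T @ (p / np.maximum(p_bar, 1e-12))  # update coefficients C_b^(k)
        lam = lam * C / sens                      # multiplicative update, Eq. (4.16)
    return lam

Because the update is purely multiplicative with non-negative factors, the non-negativity property discussed in the next subsection is immediate.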

4.2.3 Properties of the MLEM algorithm and stopping criteria

The principal characteristics of the MLEM algorithm are its non-negativity (it ensures non-negative pixel values for all generated images, provided one starts with a non-null image) and its count preservation (at every iteration, the number of emissions equals the number of detections).

One aspect that remains open is when the iterations should be stopped.

Several measures exist to check the quality of the reconstructed image, and they can be used as stopping criteria. In [72, 63], the authors present the Root Mean Square (RMS) value as a good figure-of-merit:

\mathrm{RMS}^{(k)} = \sqrt{\frac{\sum_b \left(f_b - \lambda_b^{(k)}\right)^2}{\sum_b f_b^2}},    (4.17)

with f_b the number of emissions from voxel b. Previous studies of the RMS value show that it first decreases continuously until a minimum is reached, and then starts to increase, which indicates that noise in the measured data begins to be added to the reconstructed image [64].

Since the RMS value is only useful in simulation studies, where the density distribution is known a priori, it cannot be applied to real studies.
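A minimal sketch of Eq. (4.17), usable only when the phantom f is known, as the text notes (names are ours):

import numpy as np

def rms(f, lam):
    # RMS figure-of-merit, Eq. (4.17); f_b = true emissions from voxel b,
    # lam = current reconstruction lambda^(k).
    return np.sqrt(np.sum((f - lam) ** 2) / np.sum(f ** 2))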

The likelihood of the objective function can also be used as a statistical stopping criterion:

L(\lambda) = \sum_d \left[\, p_d \log(\bar{p}_d) - \bar{p}_d - \log(p_d!) \,\right]    (4.18)

Using equation (4.4) in (4.18), the likelihood can be calculated as:

L(\lambda) = \sum_d \left[\, p_d \log\!\left(\sum_b \lambda_b R_{db}\right) - \sum_b \lambda_b R_{db} - \log(p_d!) \,\right]    (4.19)

The problem with using equation (4.19) is that, as the iterations continue, the likelihood keeps increasing (by the monotonicity of the algorithm), without indicating the point where noise begins to be added to the reconstructed image.
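To monitor (4.19) across iterations in practice, it can be evaluated directly; a minimal sketch under the same naming assumptions as above, where gammaln(p + 1) computes log(p_d!) stably for large counts:

import numpy as np
from scipy.special import gammaln

def log_likelihood(lam, R, p):
    # Poisson log-likelihood, Eq. (4.19): p_bar_d = sum_b lam_b R_db.
    p_bar = np.maximum(R @ lam, 1e-12)   # guard against empty tubes
    return np.sum(p * np.log(p_bar) - p_bar - gammaln(p + 1))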

In [22], the author presents a stopping criterion that consists in randomly splitting the projection data into two halves, A and B. One then reconstructs data set A and, at each iteration, computes the likelihood of data set B using the estimates obtained from A. It is shown that this likelihood increases only up to a certain point. At that point the iterations over A are stopped, and the procedure switches to data set B. Once both points of convergence are reached, the two estimates are summed to obtain the final estimate of the density distribution. This technique has shown good noise-rejection results, but, as remarked in [56], the cross-likelihood depends on the number of counts.
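A sketch of the scheme for one of the two halves follows; the roles of A and B are then swapped and the two estimates summed. The binomial split of the counts and all names here are our assumptions about the implementation details of [22]:

import numpy as np

def half_data_mlem(R, p, max_iter, seed=0):
    # Split the measured counts randomly into halves A and B.
    rng = np.random.default_rng(seed)
    pA = rng.binomial(p.astype(np.int64), 0.5)
    pB = p - pA                                    # held-out half
    lam = np.ones(R.shape[1])
    sens = np.maximum(R.sum(axis=0), 1e-12)
    best_ll, best_lam = -np.inf, lam.copy()
    for _ in range(max_iter):
        p_bar = np.maximum(R @ lam, 1e-12)
        lam = lam * (R.T @ (pA / p_bar)) / sens    # MLEM update driven by A only
        p_bar = np.maximum(R @ lam, 1e-12)
        ll_B = np.sum(pB * np.log(p_bar) - p_bar)  # likelihood of B (constants dropped)
        if ll_B <= best_ll:                        # cross-likelihood stopped increasing
            break
        best_ll, best_lam = ll_B, lam.copy()
    return best_lam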

In [56], Johnson proposed a variant of the cross-likelihood scheme in which the projection data set is divided into k subsets. Each subset is then subtracted in turn from the complete data set. Each subtracted data set is reconstructed and multiplied by 1/(k−1) to preserve the count number at each iteration. For each subtracted data set, the iterations are stopped when the likelihood of the held-out subset is maximized.

Another approach was proposed in [64], where a study of the multiplicative update coefficients of the MLEM algorithm allowed the authors to establish a stopping rule. At each iteration, the update coefficients are stored and histogrammed (a bookkeeping sketch follows below). The technique is based on the observation that the optimum iteration (identified in a simulation study by the RMS value) is always reached at the same value of the histogrammed coefficients. In previous work [65], the authors stated that a value of 0.8 produces images close to the optimal reconstructed image (within ±5 iterations).
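The bookkeeping behind this rule can be sketched as follows. How the per-iteration histogram is reduced to the single scalar compared against 0.8 is not detailed in the text above, so this sketch (all names ours) only records the histograms of the multiplicative factors:

import numpy as np

def mlem_with_coefficient_histograms(R, p, n_iter, bins=100):
    lam = np.ones(R.shape[1])
    sens = np.maximum(R.sum(axis=0), 1e-12)
    histograms = []                          # one histogram of the update factors per iteration
    for _ in range(n_iter):
        p_bar = np.maximum(R @ lam, 1e-12)
        factor = (R.T @ (p / p_bar)) / sens  # full multiplicative factor lambda^(k+1)/lambda^(k)
        histograms.append(np.histogram(factor, bins=bins))
        lam = lam * factor                   # apply the update
    return lam, histograms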
