A mathematical review of the generalized entropies and their matrix trace inequalities

Shigeru Furuichi*

Abstract– We review the properties of the generalized entropies in our previous papers in the following way. (1) A generalized Fannes' inequality is shown for the axiomatically characterized Tsallis entropy. (2) The maximum entropy principles in nonextensive statistical physics are revisited as an application of the Tsallis relative entropy defined for nonnegative matrices in the framework of matrix analysis. (3) A variational expression for the Tsallis relative entropy is derived and some related inequalities are studied.

Keywords: Matrix trace inequalities, Tsallis relative entropy, Fannes' inequality, variational expression, maximum entropy principle
1 Introduction
Three or four decades ago, extensions of the Shannon entropy were studied by many researchers [1]. In 1988, Tsallis introduced a one-parameter extended entropy for the analysis of physical models in statistical physics [2].
We denote the q-logarithmic function by

$$\ln_q x \equiv \frac{x^{1-q}-1}{1-q}$$

and the q-exponential function by

$$\exp_q(x) \equiv \begin{cases} (1+(1-q)x)^{\frac{1}{1-q}}, & \text{if } 1+(1-q)x > 0,\\ 0, & \text{otherwise} \end{cases}$$

for $q \in \mathbb{R}$, $q \neq 1$, $x > 0$. The functions $\exp_q(x)$ and $\ln_q x$ converge to $\exp(x)$ and $\log x$ as $q \to 1$, respectively. Note that we have the following relations:

$$\exp_q\{x+y+(1-q)xy\} = \exp_q(x)\exp_q(y),$$
$$\ln_q xy = \ln_q x + \ln_q y + (1-q)\ln_q x \ln_q y.$$
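These two relations are easy to verify numerically. The following Python sketch (the helper names `ln_q` and `exp_q` are ours, not from the paper) checks the product relations above and the convergence to the ordinary logarithm as $q \to 1$:

```python
import numpy as np

def ln_q(x, q):
    """q-logarithm; the ordinary log is recovered as q -> 1."""
    return np.log(x) if q == 1 else (x**(1 - q) - 1) / (1 - q)

def exp_q(x, q):
    """q-exponential; defined to be 0 when 1 + (1-q)x <= 0."""
    if q == 1:
        return np.exp(x)
    base = 1 + (1 - q) * x
    return base**(1 / (1 - q)) if base > 0 else 0.0

q, x, y = 0.7, 1.3, 0.4
# exp_q{x + y + (1-q)xy} = exp_q(x) exp_q(y)
assert abs(exp_q(x + y + (1 - q) * x * y, q) - exp_q(x, q) * exp_q(y, q)) < 1e-12
# ln_q(xy) = ln_q(x) + ln_q(y) + (1-q) ln_q(x) ln_q(y)
assert abs(ln_q(x * y, q) - (ln_q(x, q) + ln_q(y, q) + (1 - q) * ln_q(x, q) * ln_q(y, q))) < 1e-12
# convergence to the ordinary logarithm as q -> 1
assert abs(ln_q(x, 0.999999) - np.log(x)) < 1e-5
```

Both relations follow from $(1+(1-q)x)(1+(1-q)y) = 1+(1-q)(x+y+(1-q)xy)$, which the check above confirms in floating point.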
The set of all probability density functions on $\mathbb{R}$ is represented by

$$D_{cl}(\mathbb{R}) \equiv \left\{ f:\mathbb{R}\to\mathbb{R} : f(x) \ge 0,\ \int_{-\infty}^{\infty} f(x)\,dx = 1 \right\}.$$
*Department of Electronics and Computer Science, Tokyo University of Science, Onoda City, Yamaguchi, 756-0884, Japan. Email: furuichi@ed.yama.tus.ac.jp
Then the Tsallis entropy [2] is defined by

$$H_q(\phi(x)) \equiv -\int_{-\infty}^{\infty} \phi(x)^q \ln_q \phi(x)\,dx \qquad (1)$$

for any nonnegative real number $q$ and a probability distribution function $\phi(x) \in D_{cl}(\mathbb{R})$. In addition, the Tsallis relative entropy is defined by

$$D_q(\phi(x)|\psi(x)) \equiv \int_{-\infty}^{\infty} \phi(x)^q (\ln_q \phi(x) - \ln_q \psi(x))\,dx \qquad (2)$$

for any nonnegative real number $q$ and two probability distribution functions $\phi(x), \psi(x) \in D_{cl}(\mathbb{R})$. Taking the limit as $q \to 1$, the Tsallis entropy and the Tsallis relative entropy converge to the Shannon entropy $H_1(\phi(x)) \equiv -\int_{-\infty}^{\infty} \phi(x)\log\phi(x)\,dx$ and the Kullback-Leibler divergence $D_1(\phi(x)|\psi(x)) \equiv \int_{-\infty}^{\infty} \phi(x)(\log\phi(x)-\log\psi(x))\,dx$, respectively.
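The convergence $H_q \to H_1$ can be checked numerically. The sketch below (a minimal Python/NumPy illustration; the grid, the choice of a standard Gaussian test density, and the helper names are ours) computes $H_q$ by a Riemann sum and compares it with the closed-form differential entropy $\frac{1}{2}\log(2\pi e)$ of $N(0,1)$:

```python
import numpy as np

def ln_q(x, q):
    return np.log(x) if q == 1 else (x**(1 - q) - 1) / (1 - q)

def tsallis_entropy(phi, x, q):
    """H_q(phi) = -integral of phi^q ln_q(phi), by a Riemann sum on a uniform grid."""
    dx = x[1] - x[0]
    return -np.sum(phi**q * ln_q(phi, q)) * dx

x = np.linspace(-10.0, 10.0, 200001)
phi = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # standard Gaussian density

shannon = 0.5 * np.log(2 * np.pi * np.e)        # differential entropy of N(0,1)
assert abs(tsallis_entropy(phi, x, 1.0) - shannon) < 1e-6
assert abs(tsallis_entropy(phi, x, 1.001) - shannon) < 1e-2   # H_q -> H_1 as q -> 1
```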
We define two sets involving the constraints on the q-expectation and the q-variance:

$$C_q^{(c)} \equiv \left\{ f \in D_{cl}(\mathbb{R}) : \frac{1}{c_q}\int_{-\infty}^{\infty} x f(x)^q\,dx = \mu_q \right\}$$

and

$$C_q^{(g)} \equiv \left\{ f \in C_q^{(c)} : \frac{1}{c_q}\int_{-\infty}^{\infty} (x-\mu_q)^2 f(x)^q\,dx = \sigma_q^2 \right\}.$$
Then the q-canonical distribution $\phi_q^{(c)}(x) \in D_{cl}(\mathbb{R})$ and the q-Gaussian distribution $\phi_q^{(g)}(x) \in D_{cl}(\mathbb{R})$ were formulated [3, 4, 5, 6, 9, 10] by

$$\phi_q^{(c)}(x) \equiv \frac{1}{Z_q^{(c)}}\exp_q\left\{-\beta_q^{(c)}(x-\mu_q)\right\}, \qquad Z_q^{(c)} \equiv \int_{-\infty}^{\infty} \exp_q\left\{-\beta_q^{(c)}(x-\mu_q)\right\}dx$$

and

$$\phi_q^{(g)}(x) \equiv \frac{1}{Z_q^{(g)}}\exp_q\left\{-\frac{\beta_q^{(g)}(x-\mu_q)^2}{\sigma_q^2}\right\}, \qquad Z_q^{(g)} \equiv \int_{-\infty}^{\infty} \exp_q\left\{-\frac{\beta_q^{(g)}(x-\mu_q)^2}{\sigma_q^2}\right\}dx,$$

respectively.
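As a numerical sanity check on the q-Gaussian (a Python/NumPy sketch; the example value $q = 1/2$, the grid, and the helper name `exp_q` are our choices, and $\beta_q^{(g)} = 1/(3-q)$ is the value given in Theorem 1.2 below), the following verifies that $\phi_q^{(g)}$ normalizes to 1 and satisfies the q-variance constraint with $\mu_q = 0$, $\sigma_q = 1$:

```python
import numpy as np

def exp_q(u, q):
    """q-exponential for 0 <= q < 1; zero outside the support 1 + (1-q)u > 0."""
    return np.maximum(1 + (1 - q) * u, 0.0)**(1 / (1 - q))

q = 0.5                      # example value in [0, 1): compact support
beta = 1 / (3 - q)           # beta_q^{(g)} from Theorem 1.2
x = np.linspace(-3.0, 3.0, 600001)   # support is |x| < sqrt((3-q)/(1-q)) = sqrt(5)
dx = x[1] - x[0]

w = exp_q(-beta * x**2, q)   # unnormalized q-Gaussian (mu_q = 0, sigma_q = 1)
Z = np.sum(w) * dx           # partition function Z_q^{(g)}
phi = w / Z
c = np.sum(phi**q) * dx      # c_q = integral of phi^q

assert abs(np.sum(phi) * dx - 1.0) < 1e-6                 # phi is a density
assert abs(np.sum(x**2 * phi**q) * dx / c - 1.0) < 1e-6   # q-variance constraint
```

The second assertion is exactly the identity (17) proved in the Appendix, here confirmed by quadrature.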
Theorem 1.1 ([7]) If $\phi \in C_q^{(c)}$, then

$$H_q(\phi(x)) \le -c_q \ln_q \frac{1}{Z_q^{(c)}},$$

with equality if and only if

$$\phi(x) = \frac{1}{Z_q^{(c)}}\exp_q\left\{-\beta_q^{(c)}(x-\mu_q)\right\},$$

where $c_q \equiv \int_{-\infty}^{\infty} \phi(x)^q\,dx$.
Theorem 1.2 ([7]) If $\phi \in C_q^{(g)}$ for $0 < q < 3$, $q \neq 1$, then

$$H_q(\phi(x)) \le -c_q \ln_q \frac{1}{Z_q^{(g)}} + c_q \beta_q^{(g)} \bigl(Z_q^{(g)}\bigr)^{q-1},$$

with equality if and only if

$$\phi(x) = \frac{1}{Z_q^{(g)}}\exp_q\left\{-\beta_q^{(g)}(x-\mu_q)^2/\sigma_q^2\right\},$$

where $\beta_q^{(g)} = 1/(3-q)$.
2 Uniqueness theorem of Tsallis entropy
Here we deal with $n \times n$ matrices, whose set is denoted by $M_n(\mathbb{C})$, acting on the complex vector space $V$ over $\mathbb{C}$. In the sequel, the set of all density matrices (quantum states) is represented by $D_n(\mathbb{C}) \equiv \{X \in M_n(\mathbb{C}) : X \ge 0,\ \mathrm{Tr}[X] = 1\}$. For $-I \le X \le I$ and $\lambda \in (-1,0)\cup(0,1)$, we denote the generalized exponential function by $\exp_\lambda(X) \equiv (I+\lambda X)^{1/\lambda}$. As the inverse function of $\exp_\lambda(\cdot)$, for $X \ge 0$ and $\lambda \in (-1,0)\cup(0,1)$, we denote the generalized logarithmic function by $\ln_\lambda X \equiv \frac{X^\lambda - I}{\lambda}$. Then the Tsallis relative entropy and the Tsallis entropy for nonnegative matrices $X$ and $Y$ are defined by

$$D_\lambda(X|Y) \equiv \mathrm{Tr}[X^{1-\lambda}(\ln_\lambda X - \ln_\lambda Y)]$$

and

$$S_\lambda(X) \equiv -D_\lambda(X|I).$$

These entropies are generalizations of the von Neumann entropy [17] and the Umegaki relative entropy [19] in the sense that

$$\lim_{\lambda\to 0} S_\lambda(X) = S_0(X) \equiv -\mathrm{Tr}[X\log X]$$

and

$$\lim_{\lambda\to 0} D_\lambda(X|Y) = D_0(X|Y) \equiv \mathrm{Tr}[X(\log X - \log Y)].$$
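These matrix definitions and limits can be illustrated numerically. The sketch below (Python/NumPy; the helper names `mat_pow`, `ln_lam`, `D_lam`, `S_lam` and the random test states are ours) evaluates the matrix functions by eigendecomposition and checks that $S_\lambda$ approaches the von Neumann entropy for small $\lambda$:

```python
import numpy as np

def mat_pow(A, p):
    """A**p for a positive semidefinite matrix, via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.maximum(w, 0.0)**p) @ V.conj().T

def ln_lam(A, lam):
    """Generalized logarithm ln_lam(A) = (A**lam - I) / lam."""
    return (mat_pow(A, lam) - np.eye(len(A))) / lam

def D_lam(X, Y, lam):
    """Tsallis relative entropy Tr[X**(1-lam) (ln_lam X - ln_lam Y)]."""
    return float(np.trace(mat_pow(X, 1 - lam) @ (ln_lam(X, lam) - ln_lam(Y, lam))).real)

def S_lam(X, lam):
    """Tsallis entropy S_lam(X) = -D_lam(X|I)."""
    return -D_lam(X, np.eye(len(X)), lam)

rng = np.random.default_rng(0)
R1 = rng.standard_normal((3, 3)); X = R1 @ R1.T; X /= np.trace(X)   # random density
R2 = rng.standard_normal((3, 3)); Y = R2 @ R2.T; Y /= np.trace(Y)

# lam -> 0 recovers the von Neumann entropy -Tr[X log X]
w = np.linalg.eigvalsh(X)
S0 = -np.sum(w * np.log(w))
assert abs(S_lam(X, 1e-6) - S0) < 1e-4
assert D_lam(X, Y, 1e-6) >= 0.0   # the relative entropy of this pair is nonnegative
```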
Let $T_\lambda$ be a mapping from the set $D_n(\mathbb{C})$ of all density matrices to $\mathbb{R}^+$.

Axiom 2.1 We give the postulates which the Tsallis entropy should satisfy.

T1. Continuity: For $\rho \in D_n(\mathbb{C})$, $T_\lambda(\rho)$ is a continuous function with respect to the 1-norm $\|\cdot\|_1$.

T2. Invariance: For any unitary transformation $U$, $T_\lambda(U^*\rho U) = T_\lambda(\rho)$.

T3. Generalized mixing condition: For $\rho = \oplus_{k=1}^n \lambda_k \rho_k$ on $V = \oplus_{k=1}^n V_k$, where $\lambda_k \ge 0$, $\sum_{k=1}^n \lambda_k = 1$, $\rho_k \in D_n(\mathbb{C})$, we have the additivity

$$T_\lambda(\rho) = \sum_{k=1}^n \lambda_k^{1-\lambda} T_\lambda(\rho_k) + T_\lambda(\lambda_1,\cdots,\lambda_n),$$

where $(\lambda_1,\cdots,\lambda_n)$ represents the diagonal matrix $(\lambda_k \delta_{kj})_{k,j=1,\cdots,n}$.
Theorem 2.2 ([12]) If $T_\lambda$ satisfies Axiom 2.1, then $T_\lambda$ is uniquely given by the form

$$T_\lambda(\rho) = \mu_\lambda S_\lambda(\rho)$$

for a positive constant $\mu_\lambda$.
3 Generalized Fannes' inequality

We give a continuity property of the Tsallis entropy $S_\lambda(\rho)$. To do so, we state a few lemmas.
Lemma 3.1 For a density matrix $\rho$ on the complex vector space $V$ over $\mathbb{C}$, we have

$$S_\lambda(\rho) \le \ln_\lambda d,$$

where $d \equiv \dim V$.

Lemma 3.2 ([12]) If $f$ is a concave function with $f(0) = f(1) = 0$, then we have

$$|f(t+s) - f(t)| \le \max\{f(s), f(1-s)\}$$

for any $s \in [0,1/2]$ and $t \in [0,1]$ satisfying $0 \le s+t \le 1$.

Lemma 3.3 ([12]) For any real numbers $u, v \in [0,1]$ and $\lambda \in [-1,1]$, if $|u-v| \le \frac{1}{2}$, then $|\eta_\lambda(u) - \eta_\lambda(v)| \le \eta_\lambda(|u-v|)$, where $\eta_\lambda(x) \equiv -x^{1-\lambda}\ln_\lambda x$, so that $S_\lambda(\rho) = \mathrm{Tr}[\eta_\lambda(\rho)]$.
Theorem 3.4 ([12]) For two density matrices $\rho_1$ and $\rho_2$ on $V$ and $\lambda \in [-1,1]$, if $\|\rho_1-\rho_2\|_1 \le (1-\lambda)^{1/\lambda}$, then

$$|S_\lambda(\rho_1) - S_\lambda(\rho_2)| \le \|\rho_1-\rho_2\|_1^{1-\lambda}\ln_\lambda d + \eta_\lambda(\|\rho_1-\rho_2\|_1),$$

where we denote $\|A\|_1 \equiv \mathrm{Tr}\bigl[(A^*A)^{1/2}\bigr]$ for any matrix $A$.
By taking the limit as $\lambda \to 0$, we have the following Fannes' inequality (see p. 512 of [15], also [13, 14, 16]) as a corollary, since $\lim_{\lambda\to 0}(1-\lambda)^{1/\lambda} = 1/e$.

Corollary 3.5 For two density operators $\rho_1$ and $\rho_2$ on $V$, if $\|\rho_1-\rho_2\|_1 \le \frac{1}{e}$, then

$$|S_0(\rho_1) - S_0(\rho_2)| \le \|\rho_1-\rho_2\|_1 \ln d + \eta_0(\|\rho_1-\rho_2\|_1),$$

where $S_0$ represents the von Neumann entropy $S_0(\rho) = \mathrm{Tr}[\eta_0(\rho)]$ and $\eta_0(x) = -x\ln x$.
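The classical Fannes bound is easy to test numerically. The sketch below (Python/NumPy; the random test states, the perturbation size, and the helper `vn_entropy` are our choices) builds two nearby density matrices, checks the trace-norm hypothesis $t \le 1/e$, and verifies $|S(\rho_1)-S(\rho_2)| \le t\ln d - t\ln t$:

```python
import numpy as np

def vn_entropy(rho):
    """von Neumann entropy -Tr[rho log rho], via eigenvalues."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-15]
    return -np.sum(w * np.log(w))

rng = np.random.default_rng(1)
d = 4
M = rng.standard_normal((d, d)); rho1 = M @ M.T; rho1 /= np.trace(rho1)
N = rng.standard_normal((d, d)); sig = N @ N.T; sig /= np.trace(sig)
rho2 = 0.9 * rho1 + 0.1 * sig        # a nearby state: ||rho1 - rho2||_1 <= 0.2

t = np.sum(np.abs(np.linalg.eigvalsh(rho1 - rho2)))   # trace norm of the difference
assert t <= 1 / np.e                                   # hypothesis of the corollary
lhs = abs(vn_entropy(rho1) - vn_entropy(rho2))
rhs = t * np.log(d) - t * np.log(t)                    # t ln d + eta(t) with eta(t) = -t ln t
assert lhs <= rhs
```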
4 Maximum entropy principle in nonextensive statistical physics

The problem of the maximum entropy principle has been studied in both classical and quantum systems [3, 4, 5, 6, 7, 8]. We give the maximum entropy principle for the Tsallis entropy from the operator-theoretical point of view.
Theorem 4.1 ([8]) Let $Y = Z_\lambda^{-1}\exp_\lambda(-H/\|H\|)$, where $Z_\lambda \equiv \mathrm{Tr}[\exp_\lambda(-H/\|H\|)]$, for $\lambda \in (-1,0)\cup(0,1)$ and a Hermitian matrix $H$. We denote

$$C_\lambda \equiv \left\{X \in D_n(\mathbb{C}) : \mathrm{Tr}[X^{1-\lambda}H] \le \mathrm{Tr}[Y^{1-\lambda}H]\right\}.$$

If $X \in C_\lambda$, then $S_\lambda(X) \le S_\lambda(Y)$.
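Theorem 4.1 can be probed numerically. In the Python/NumPy sketch below (our construction: $\|H\|$ is read as the spectral norm, $H$ is taken positive, and the candidate states are mixtures of the generalized Gibbs state $Y$ with the ground-state projector of $H$), every candidate that lands in $C_\lambda$ is checked against the conclusion $S_\lambda(X) \le S_\lambda(Y)$:

```python
import numpy as np

def mat_pow(A, p):
    w, V = np.linalg.eigh(A)
    return (V * np.maximum(w, 0.0)**p) @ V.T

def S_lam(X, lam):
    lnX = (mat_pow(X, lam) - np.eye(len(X))) / lam
    return -float(np.trace(mat_pow(X, 1 - lam) @ lnX))

lam = 0.4
rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4)); H = M @ M.T        # a positive Hermitian H
Hn = H / np.linalg.norm(H, 2)                        # H / ||H||, spectral norm
G = mat_pow(np.eye(4) - lam * Hn, 1 / lam)           # exp_lam(-H/||H||)
Y = G / np.trace(G)                                  # generalized Gibbs state

w, V = np.linalg.eigh(H)
P0 = np.outer(V[:, 0], V[:, 0])                      # ground-state projector of H
cY = np.trace(mat_pow(Y, 1 - lam) @ H)
checked = 0
for s in np.linspace(0.0, 1.0, 21):
    X = (1 - s) * Y + s * P0
    if np.trace(mat_pow(X, 1 - lam) @ H) <= cY + 1e-12:   # X lies in C_lam
        assert S_lam(X, lam) <= S_lam(Y, lam) + 1e-10     # conclusion of Theorem 4.1
        checked += 1
assert checked >= 1   # at least s = 0 (X = Y) always qualifies
```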
Remark 4.2 Since $-x^{1-\lambda}\ln_\lambda x$ is a strictly concave function, $S_\lambda$ is strictly concave on the set $C_\lambda$. This means that the maximizer $Y$ is uniquely determined, so that we may regard $Y$ as a generalized Gibbs state. Thus we may define a generalized Helmholtz free energy by

$$F_\lambda(X,H) \equiv \mathrm{Tr}[X^{1-\lambda}H] - \|H\| S_\lambda(X).$$

This can also be represented by means of the Tsallis relative entropy as

$$F_\lambda(X,H) = \|H\| D_\lambda(X|Y) + \ln_\lambda Z_\lambda^{-1}\,\mathrm{Tr}[X^{1-\lambda}(\|H\| - \lambda H)].$$
We straightforwardly have the following corollary by taking the limit as $\lambda \to 0$.

Corollary 4.3 ([18]) Let $Y = Z_0^{-1}\exp(-H/\|H\|)$, where $Z_0 \equiv \mathrm{Tr}[\exp(-H/\|H\|)]$, for a Hermitian matrix $H$. If $X \in C_0$, then $S_0(X) \le S_0(Y)$.
5 A variational expression of the Tsallis relative entropy

In this section, we derive a variational expression of the Tsallis relative entropy as a parametric extension of that of the relative entropy in Lemma 1.2 of [21]. A variational expression of the relative entropy has been studied in the general setting of von Neumann algebras [22, 23].
Theorem 5.1 ([20]) For $\lambda \in (0,1]$, we have the following relations, where we write $e_\lambda^{X} \equiv \exp_\lambda(X)$.

1. If $A$ and $Y$ are positive matrices, then

$$\ln_\lambda \mathrm{Tr}[e_\lambda^{A+\ln_\lambda Y}] = \max\left\{\mathrm{Tr}[X^{1-\lambda}A] - D_\lambda(X|Y) : X \in D_n(\mathbb{C})\right\}.$$

2. If $X$ is a density matrix and $B$ is a Hermitian matrix, then

$$D_\lambda(X|e_\lambda^{B}) = \max\left\{\mathrm{Tr}[X^{1-\lambda}A] - \ln_\lambda \mathrm{Tr}[e_\lambda^{A+B}] : A = A^*\right\}.$$
Taking the limit as $\lambda \to 0$, Theorem 5.1 recovers Lemma 1.2 in [21]. If we set $Y = I$ and $B = 0$ in 1. and 2. of Theorem 5.1, respectively, then we obtain the following corollary.

Corollary 5.2

1. If $A$ is a positive matrix, then

$$\ln_\lambda \mathrm{Tr}[e_\lambda^{A}] = \max\left\{\mathrm{Tr}[X^{1-\lambda}A] + S_\lambda(X) : X \in D_n(\mathbb{C})\right\}.$$

2. For a density matrix $X$, we have

$$-S_\lambda(X) = \max\left\{\mathrm{Tr}[X^{1-\lambda}A] - \ln_\lambda \mathrm{Tr}[e_\lambda^{A}] : A = A^*\right\}.$$
We now derive some trace inequalities from the results obtained above. From 1. of Corollary 5.2, we have the generalized thermodynamic inequality

$$\ln_\lambda \mathrm{Tr}[e_\lambda^{H}] \ge \mathrm{Tr}[D^{1-\lambda}H] + S_\lambda(D) \qquad (3)$$

for a density matrix $D$ and a Hermitian matrix $H$. Putting $D = \frac{A}{\mathrm{Tr}[A]}$ and $H = \ln_\lambda B$ in Eq. (3), we have the generalized Peierls-Bogoliubov inequality (Theorem 3.3 of [11]):

$$(\mathrm{Tr}[A])^{1-\lambda}(\ln_\lambda \mathrm{Tr}[A] - \ln_\lambda \mathrm{Tr}[B]) \le \mathrm{Tr}[A^{1-\lambda}(\ln_\lambda A - \ln_\lambda B)] \qquad (4)$$

for nonnegative matrices $A$ and $B$.
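Both inequalities can be checked on random matrices. In the Python/NumPy sketch below (our construction: $H$ is scaled to spectral norm 1 so that $I + \lambda H > 0$ and $e_\lambda^{H}$ is well defined; helper names are ours), inequality (3) is tested for a random density matrix and (4) for a random pair of positive matrices:

```python
import numpy as np

def mat_pow(A, p):
    w, V = np.linalg.eigh(A)
    return (V * np.maximum(w, 0.0)**p) @ V.T

def ln_lam_scalar(x, lam):
    return (x**lam - 1) / lam

def ln_lam_mat(A, lam):
    return (mat_pow(A, lam) - np.eye(len(A))) / lam

lam = 0.5
rng = np.random.default_rng(3)

# generalized thermodynamic inequality (3)
M = rng.standard_normal((4, 4)); H = (M + M.T) / 2
H /= np.linalg.norm(H, 2)                       # ensures I + lam*H > 0 (our assumption)
R = rng.standard_normal((4, 4)); D = R @ R.T; D /= np.trace(D)   # density matrix
e_lam_H = mat_pow(np.eye(4) + lam * H, 1 / lam)                   # exp_lam(H)
S_D = -np.trace(mat_pow(D, 1 - lam) @ ln_lam_mat(D, lam))
assert ln_lam_scalar(np.trace(e_lam_H), lam) >= np.trace(mat_pow(D, 1 - lam) @ H) + S_D - 1e-10

# generalized Peierls-Bogoliubov inequality (4)
P = rng.standard_normal((4, 4)); A = P @ P.T
Q = rng.standard_normal((4, 4)); B = Q @ Q.T
lhs = np.trace(A)**(1 - lam) * (ln_lam_scalar(np.trace(A), lam) - ln_lam_scalar(np.trace(B), lam))
rhs = np.trace(mat_pow(A, 1 - lam) @ (ln_lam_mat(A, lam) - ln_lam_mat(B, lam)))
assert lhs <= rhs + 1e-10
```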
Lemma 5.3 ([20]) The following statements are equivalent.

1. $F_\lambda(A) = \ln_\lambda \mathrm{Tr}[e_\lambda^{A}]$ is convex in a Hermitian matrix $A$.

2. $f_\lambda(t) = \ln_\lambda \mathrm{Tr}[e_\lambda^{A+tB}]$ is convex in $t \in \mathbb{R}$.

Corollary 5.4 ([20]) For Hermitian matrices $A$ and $B$, we have

$$\ln_\lambda \mathrm{Tr}[e_\lambda^{A+B}] - \ln_\lambda \mathrm{Tr}[e_\lambda^{A}] \ge \frac{\mathrm{Tr}[B(e_\lambda^{A})^{1-\lambda}]}{(\mathrm{Tr}[e_\lambda^{A}])^{1-\lambda}}. \qquad (5)$$
Note that we also obtain the inequality (5) by putting $A = e_\lambda^{H}$ and $B = e_\lambda^{H+K}$ in the inequality (4). In addition, putting $A = \ln_\lambda D$ and $B = H - \ln_\lambda D$ for a density matrix $D$ in the inequality (5), we obtain the inequality (3). Thus we have the following proposition.
Proposition 5.5 ([20]) The following conditions are equivalent.
1. The generalized thermodynamic inequality (3).
2. The generalized Peierls-Bogoliubov inequality (4).
3. The trace inequality (5) given in Corollary 5.4.
For nonnegative real numbers $x, y$ and $0 < \lambda \le 1$, the relations $e_\lambda^{x+y} \le e_\lambda^{x+y+\lambda xy} = e_\lambda^{x}e_\lambda^{y}$ hold. These relations naturally motivate us to consider the following inequalities in the noncommutative case.

Proposition 5.6 ([20]) For nonnegative matrices $X$ and $Y$, and $0 < \lambda \le 1$, we have

$$\mathrm{Tr}[e_\lambda^{X+Y}] \le \mathrm{Tr}[e_\lambda^{X+Y+\lambda Y^{1/2}XY^{1/2}}].$$
Note that we also have the matrix inequality

$$e_\lambda^{X+Y} \le e_\lambda^{X+Y+\lambda Y^{1/2}XY^{1/2}}$$

for $\lambda \ge 1$ by an application of the Löwner-Heinz inequality [24, 25, 26].
Proposition 5.7 ([20]) For nonnegative matrices $X, Y$, and $\lambda \in (0,1]$, we have

$$\mathrm{Tr}[e_\lambda^{X+Y+\lambda XY}] \le \mathrm{Tr}[e_\lambda^{X}e_\lambda^{Y}]. \qquad (6)$$

Notice that the Golden-Thompson inequality [27, 28],

$$\mathrm{Tr}[e^{X+Y}] \le \mathrm{Tr}[e^{X}e^{Y}],$$

which holds for Hermitian matrices $X$ and $Y$, is recovered, in the particular case of nonnegative matrices $X$ and $Y$, by taking the limit as $\lambda \to 0$ in Proposition 5.7.
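The limiting Golden-Thompson inequality itself is easy to confirm numerically. The following Python sketch (a minimal check on random Hermitian matrices; the random construction is ours) uses `scipy.linalg.expm`:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(4)
M = rng.standard_normal((4, 4)); X = (M + M.T) / 2   # random Hermitian matrices
N = rng.standard_normal((4, 4)); Y = (N + N.T) / 2

# Golden-Thompson: Tr[e^(X+Y)] <= Tr[e^X e^Y]
assert np.trace(expm(X + Y)) <= np.trace(expm(X) @ expm(Y)) + 1e-10
```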
Since $\mathrm{Tr}[HZHZ] \le \mathrm{Tr}[H^2Z^2]$ for Hermitian matrices $H$ and $Z$ [23], easy calculations show that, for nonnegative matrices $X$ and $Y$,

$$\mathrm{Tr}[(I+X+Y+Y^{1/2}XY^{1/2})^2] \le \mathrm{Tr}[(I+X+Y+XY)^2]. \qquad (7)$$

This implies the inequality

$$\mathrm{Tr}[e_{1/2}^{X+Y+\frac{1}{2}Y^{1/2}XY^{1/2}}] \le \mathrm{Tr}[e_{1/2}^{X+Y+\frac{1}{2}XY}].$$

Thus we have

$$\mathrm{Tr}[e_{1/2}^{X+Y}] \le \mathrm{Tr}[e_{1/2}^{X}e_{1/2}^{Y}] \qquad (8)$$

from Proposition 5.6 and Proposition 5.7. Putting $B = \ln_{1/2}Y$ and $A = \ln_{1/2}Y^{-1/2}XY^{-1/2}$ in 2. of Theorem 5.1 and using Eq. (8), we have

$$D_{1/2}(X|Y) \ge \mathrm{Tr}[X^{1/2}\ln_{1/2}Y^{-1/2}XY^{-1/2}], \qquad (9)$$

which gives a lower bound on the Tsallis relative entropy in the case $\lambda = 1/2$.
6 Related matrix trace inequalities

In this section, we consider an extension of the following inequality [21]:

$$\mathrm{Tr}[X(\log X + \log Y)] \le \frac{1}{p}\mathrm{Tr}[X\log X^{p/2}Y^{p}X^{p/2}] \qquad (10)$$

for nonnegative matrices $X$ and $Y$, and $p > 0$.
Theorem 6.1 ([8])

1. For positive matrices $X$ and $Y$, $p \ge 1$ and $0 < \lambda \le 1$, we have

$$\mathrm{Tr}[X^{1-\lambda}(\ln_\lambda X - \ln_\lambda Y)] \le -\mathrm{Tr}[X\ln_\lambda(X^{-p/2}Y^{p}X^{-p/2})^{1/p}]. \qquad (11)$$

2. For positive matrices $X$ and $Y$, $0 < p < 1$ and $0 < \lambda \le 1$, the inequality

$$\mathrm{Tr}[X^{1-\lambda}(\ln_\lambda X - \ln_\lambda Y)] \le -\mathrm{Tr}[X\ln_\lambda(X^{-p/2}Y^{p}X^{-p/2})^{1/p}] \qquad (12)$$

does not hold in general.
Corollary 6.2

1. For positive matrices $X$ and $Y$, the trace inequality

$$D_\lambda(X|Y) \le -\mathrm{Tr}[X\ln_\lambda(X^{-1/2}YX^{-1/2})] \qquad (13)$$

holds.

2. For positive matrices $X$ and $Y$, and $p \ge 1$, we have the inequality (10).
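Inequality (10) can be verified numerically. In the Python/NumPy sketch below (our construction: random positive matrices, $p = 2$ as an example, matrix functions evaluated through eigendecomposition via our helper `mat_fun`), both sides are computed and compared:

```python
import numpy as np

def mat_fun(A, f):
    """Apply f to the eigenvalues of a symmetric matrix A."""
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.T

rng = np.random.default_rng(5)
M = rng.standard_normal((4, 4)); X = M @ M.T   # random positive matrices
N = rng.standard_normal((4, 4)); Y = N @ N.T

p = 2.0
Xp2 = mat_fun(X, lambda w: w**(p / 2))         # X^(p/2)
Yp = mat_fun(Y, lambda w: w**p)                # Y^p
lhs = np.trace(X @ (mat_fun(X, np.log) + mat_fun(Y, np.log)))
rhs = np.trace(X @ mat_fun(Xp2 @ Yp @ Xp2, np.log)) / p
assert lhs <= rhs + 1e-8                       # inequality (10)
```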
7 Conjecture

The trace inequality (7) leads us to conjecture the following inequalities.

Conjecture 7.1 For nonnegative matrices $X$ and $Y$, we have

1. $\mathrm{Tr}[(I+X+Y+Y^{1/2}XY^{1/2})^p] \le \mathrm{Tr}[(I+X+Y+XY)^p]$ for $p \ge 1$.
2. $\mathrm{Tr}[(I+X+Y+Y^{1/2}XY^{1/2})^p] \ge \mathrm{Tr}[(I+X+Y+XY)^p]$ for $0 \le p \le 1$.
Acknowledgement
The author was partially supported by the Japanese Ministry of Education, Science, Sports and Culture, Grant-in-Aid for Encouragement of Young Scientists (B), 17740068.
References
[1] J. Aczél and Z. Daróczy, On measures of information and their characterizations, Academic Press, 1975.

[2] C. Tsallis, Possible generalization of Boltzmann-Gibbs statistics, J. Stat. Phys., Vol. 52 (1988), pp. 479-487.

[3] S. Martínez, F. Nicolás, F. Pennini and A. Plastino, Tsallis' entropy maximization procedure revisited, Physica A, Vol. 286 (2000), pp. 489-502.

[4] C. Tsallis, R. S. Mendes and A. R. Plastino, The role of constraints within generalized nonextensive statistics, Physica A, Vol. 261 (1998), pp. 534-554.

[5] S. Abe, S. Martínez, F. Pennini and A. Plastino, Nonextensive thermodynamic relations, Phys. Lett. A, Vol. 281 (2001), pp. 126-130.

[6] S. Abe, Heat and entropy in nonextensive thermodynamics: transmutation from Tsallis theory to Rényi-entropy-based theory, Physica A, Vol. 300 (2001), pp. 417-423.

[7] S. Furuichi, On the maximum entropy principle in Tsallis statistics, preprint.

[8] S. Furuichi, Matrix trace inequalities on the Tsallis entropies, preprint.

[9] H. Suyari, The unique non self-referential q-canonical distribution and the physical temperature derived from the maximum entropy principle in Tsallis statistics, cond-mat/0502298.

[10] H. Suyari and M. Tsukada, Law of error in Tsallis statistics, IEEE Trans. Information Theory, Vol. 51 (2005), pp. 753-757.

[11] S. Furuichi, K. Yanagi and K. Kuriyama, Fundamental properties of Tsallis relative entropy, J. Math. Phys., Vol. 45 (2004), pp. 4868-4877.

[12] S. Furuichi, K. Yanagi and K. Kuriyama, A generalized Fannes' inequality, Journal of Inequalities in Pure and Applied Mathematics, Vol. 8 (2007), Issue 1, Article 5, 6 pp.

[13] R. Alicki and M. Fannes, Quantum Dynamical Systems, Oxford University Press, 2001.

[14] M. Fannes, A continuity property of the entropy density for spin lattice systems, Commun. Math. Phys., Vol. 31 (1973), pp. 291-294.

[15] M. A. Nielsen and I. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, 2000.

[16] M. Ohya and D. Petz, Quantum Entropy and Its Use, Springer-Verlag, 1993.

[17] J. von Neumann, Thermodynamik quantenmechanischer Gesamtheiten, Göttinger Nachrichten, pp. 273-291 (1927).

[18] W. Thirring, Quantum mechanics of large systems, Springer-Verlag, 1980.

[19] H. Umegaki, Conditional expectation in an operator algebra, IV (entropy and information), Kodai Math. Sem. Rep., Vol. 14 (1962), pp. 59-85.

[20] S. Furuichi, Trace inequalities in nonextensive statistical mechanics, Linear Alg. Appl., Vol. 418 (2006), pp. 821-827.

[21] F. Hiai and D. Petz, The Golden-Thompson trace inequality is complemented, Linear Alg. Appl., Vol. 181 (1993), pp. 153-185.

[22] H. Kosaki, Relative entropy for states: a variational expression, J. Operator Theory, Vol. 16 (1986), pp. 335-348.

[23] D. Petz, A variational expression for the relative entropy of states of a von Neumann algebra, Commun. Math. Phys., Vol. 114 (1988), pp. 345-349.

[24] K. Löwner, Über monotone Matrixfunktionen, Math. Z., Vol. 38 (1934), pp. 177-216.

[25] E. Heinz, Beiträge zur Störungstheorie der Spektralzerlegung, Math. Ann., Vol. 123 (1951), pp. 415-438.

[26] G. K. Pedersen, Some operator monotone functions, Proc. Amer. Math. Soc., Vol. 36 (1972), pp. 309-310.

[27] S. Golden, Lower bounds for the Helmholtz function, Phys. Rev., Vol. 137 (1965), pp. B1127-B1128.

[28] C. J. Thompson, Inequality with applications in statistical mechanics, J. Math. Phys., Vol. 6 (1965), pp. 1812-1813.
Appendix

Regarding Theorem 1.2, here we show that the q-Gaussian distribution

$$\phi(x) = Z_q^{-1}\exp_q\left\{-\beta_q(x-\mu_q)^2/\sigma_q^2\right\}, \qquad (14)$$

where $Z_q \equiv \int_{-\infty}^{\infty}\exp_q\left\{-\beta_q(x-\mu_q)^2/\sigma_q^2\right\}dx$ with $\beta_q = 1/(3-q)$, satisfies the constraints

$$\frac{1}{c_q}\int_{-\infty}^{\infty} x\,\phi(x)^q\,dx = \mu_q \qquad (15)$$

and

$$\frac{1}{c_q}\int_{-\infty}^{\infty} (x-\mu_q)^2\phi(x)^q\,dx = \sigma_q^2, \qquad (16)$$

where $c_q \equiv \int_{-\infty}^{\infty}\phi(x)^q\,dx$.
Proof: It is sufficient to prove the case $\mu_q = 0$ and $\sigma_q = 1$. Since the function $x\exp_q\left(-\frac{x^2}{3-q}\right)^q$ is an odd function, we see that $\phi(x)$ satisfies the first constraint. Showing the second constraint is equivalent to showing

$$\int_{-\infty}^{\infty} x^2\exp_q\left(-\frac{x^2}{3-q}\right)^q dx = \int_{-\infty}^{\infty}\exp_q\left(-\frac{x^2}{3-q}\right)^q dx. \qquad (17)$$
(1) $0 \le q < 1$: Since $\exp_q\left(-\frac{x^2}{3-q}\right)$ is equal to $\left(1-\frac{(1-q)x^2}{3-q}\right)^{\frac{1}{1-q}}$ if $-\sqrt{\frac{3-q}{1-q}} < x < \sqrt{\frac{3-q}{1-q}}$, and equal to 0 otherwise, the L.H.S. of Eq. (17) is calculated as

$$\int_{-\sqrt{\frac{3-q}{1-q}}}^{\sqrt{\frac{3-q}{1-q}}} x^2\exp_q\left(-\frac{x^2}{3-q}\right)^q dx = \left(\frac{3-q}{1-q}\right)^{3/2}\int_{-1}^{1} y^2\left(1-y^2\right)^{\frac{q}{1-q}}dy = 2\left(\frac{3-q}{1-q}\right)^{3/2}\int_{0}^{1} y^2\left(1-y^2\right)^{\frac{q}{1-q}}dy = \left(\frac{3-q}{1-q}\right)^{3/2}B\left(\frac{3}{2},\frac{1}{1-q}\right).$$
Also, the R.H.S. of Eq. (17) is calculated as

$$\int_{-\sqrt{\frac{3-q}{1-q}}}^{\sqrt{\frac{3-q}{1-q}}} \exp_q\left(-\frac{x^2}{3-q}\right)^q dx = \left(\frac{3-q}{1-q}\right)^{1/2}\int_{-1}^{1}\left(1-y^2\right)^{\frac{q}{1-q}}dy = 2\left(\frac{3-q}{1-q}\right)^{1/2}\int_{0}^{1}\left(1-y^2\right)^{\frac{q}{1-q}}dy = \left(\frac{3-q}{1-q}\right)^{1/2}B\left(\frac{1}{2},\frac{1}{1-q}\right).$$
In the above calculations, the following formula was used:

$$\int_{0}^{1} x^\alpha\left(1-x^\lambda\right)^\beta dx = \frac{1}{\lambda}B\left(\frac{\alpha+1}{\lambda},\beta+1\right), \qquad (\alpha,\beta > -1,\ \lambda > 0).$$

By the properties of the beta function and the gamma function, $\left(\frac{3-q}{1-q}\right)^{3/2}B\left(\frac{3}{2},\frac{1}{1-q}\right)$ coincides with $\left(\frac{3-q}{1-q}\right)^{1/2}B\left(\frac{1}{2},\frac{1}{1-q}\right)$.
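The coincidence of the two beta-function expressions can be checked directly (a short Python sketch using `scipy.special.beta`; the sample values of $q$ are our choices):

```python
import numpy as np
from scipy.special import beta

# r^(3/2) B(3/2, 1/(1-q)) = r^(1/2) B(1/2, 1/(1-q)) with r = (3-q)/(1-q)
for q in (0.0, 0.3, 0.6, 0.9):
    r = (3 - q) / (1 - q)
    s = 1 / (1 - q)
    assert abs(r**1.5 * beta(1.5, s) - r**0.5 * beta(0.5, s)) < 1e-10
```

The identity follows from $B(3/2,s)/B(1/2,s) = \frac{1}{2}/(s+\frac{1}{2}) = \frac{1-q}{3-q}$ with $s = \frac{1}{1-q}$.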
(2) $1 < q < 3$: The L.H.S. of Eq. (17) is calculated as

$$\int_{-\infty}^{\infty} x^2\exp_q\left(-\frac{x^2}{3-q}\right)^q dx = 2\left(\frac{3-q}{q-1}\right)^{3/2}\int_{0}^{\infty} y^2\left(1+y^2\right)^{\frac{q}{1-q}}dy = \left(\frac{3-q}{q-1}\right)^{3/2}B\left(\frac{q}{q-1}-\frac{3}{2},\frac{3}{2}\right).$$
The R.H.S. of Eq. (17) is calculated as

$$\int_{-\infty}^{\infty}\exp_q\left(-\frac{x^2}{3-q}\right)^q dx = 2\left(\frac{3-q}{q-1}\right)^{1/2}\int_{0}^{\infty}\left(1+y^2\right)^{\frac{q}{1-q}}dy = \left(\frac{3-q}{q-1}\right)^{1/2}B\left(\frac{q}{q-1}-\frac{1}{2},\frac{1}{2}\right).$$
In the above calculations, the following formula was used:

$$\int_{0}^{\infty}\frac{dx}{x^\alpha\left(1+x^\lambda\right)^\beta} = \frac{1}{\lambda}B\left(\beta-\frac{1-\alpha}{\lambda},\frac{1-\alpha}{\lambda}\right), \qquad (\alpha < 1,\ \beta > 0,\ \lambda > 0,\ \lambda\beta > 1-\alpha).$$

By the properties of the beta function and the gamma function, $\left(\frac{3-q}{q-1}\right)^{3/2}B\left(\frac{q}{q-1}-\frac{3}{2},\frac{3}{2}\right)$ coincides with $\left(\frac{3-q}{q-1}\right)^{1/2}B\left(\frac{q}{q-1}-\frac{1}{2},\frac{1}{2}\right)$, which completes the proof.