A mathematical review of the generalized entropies and their matrix trace inequalities

Shigeru Furuichi*

Abstract– We review the properties of the generalized entropies in our previous papers in the following way. (1) A generalized Fannes' inequality is shown for the axiomatically characterized Tsallis entropy. (2) The maximum entropy principles in nonextensive statistical physics are revisited as an application of the Tsallis relative entropy defined for nonnegative matrices in the framework of matrix analysis. (3) A variational expression for the Tsallis relative entropy is derived and some related inequalities are studied.

Keywords: Matrix trace inequalities, Tsallis relative entropy, Fannes' inequality, variational expression, maximum entropy principle
1 Introduction
Three or four decades ago, extensions of the Shannon entropy were studied by many researchers [1]. In 1988, Tsallis introduced a one-parameter extended entropy for the analysis of physical models in statistical physics [2].
We denote the q-logarithmic function by

$$\ln_q x \equiv \frac{x^{1-q}-1}{1-q}$$

and the q-exponential function by

$$\exp_q(x) \equiv \begin{cases} (1+(1-q)x)^{\frac{1}{1-q}}, & \text{if } 1+(1-q)x > 0,\\ 0, & \text{otherwise} \end{cases}$$

for $q \in \mathbb{R}$, $q \neq 1$, $x > 0$. The functions $\exp_q(x)$ and $\ln_q x$ converge to $\exp(x)$ and $\log x$ as $q \to 1$, respectively. Note that we have the following relations:

$$\exp_q\{x+y+(1-q)xy\} = \exp_q(x)\exp_q(y),$$
$$\ln_q xy = \ln_q x + \ln_q y + (1-q)\ln_q x \ln_q y.$$
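These two relations are easy to verify numerically. The following Python sketch (the helper names `ln_q` and `exp_q` are ours, not from the paper) checks the product relations above and the convergence to the ordinary logarithm as $q \to 1$:

```python
import numpy as np

def ln_q(x, q):
    """q-logarithm; the ordinary log is recovered as q -> 1."""
    return np.log(x) if q == 1 else (x**(1 - q) - 1) / (1 - q)

def exp_q(x, q):
    """q-exponential; defined to be 0 when 1 + (1-q)x <= 0."""
    if q == 1:
        return np.exp(x)
    base = 1 + (1 - q) * x
    return base**(1 / (1 - q)) if base > 0 else 0.0

q, x, y = 0.7, 1.3, 0.4
# exp_q{x + y + (1-q)xy} = exp_q(x) exp_q(y)
assert abs(exp_q(x + y + (1 - q) * x * y, q) - exp_q(x, q) * exp_q(y, q)) < 1e-12
# ln_q(xy) = ln_q(x) + ln_q(y) + (1-q) ln_q(x) ln_q(y)
assert abs(ln_q(x * y, q) - (ln_q(x, q) + ln_q(y, q) + (1 - q) * ln_q(x, q) * ln_q(y, q))) < 1e-12
# convergence to the ordinary logarithm as q -> 1
assert abs(ln_q(x, 0.999999) - np.log(x)) < 1e-5
```

Both relations follow from $(1+(1-q)x)(1+(1-q)y) = 1+(1-q)(x+y+(1-q)xy)$, which the check above confirms in floating point.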
The set of all probability density functions on $\mathbb{R}$ is represented by

$$D_{cl}(\mathbb{R}) \equiv \left\{ f:\mathbb{R}\to\mathbb{R} : f(x) \ge 0,\ \int_{-\infty}^{\infty} f(x)\,dx = 1 \right\}.$$
*Department of Electronics and Computer Science, Tokyo University of Science, Onoda City, Yamaguchi, 756-0884, Japan. Email: furuichi@ed.yama.tus.ac.jp
Then the Tsallis entropy [2] is defined by

$$H_q(\phi(x)) \equiv -\int_{-\infty}^{\infty} \phi(x)^q \ln_q \phi(x)\,dx \qquad (1)$$

for any nonnegative real number $q$ and a probability distribution function $\phi(x) \in D_{cl}(\mathbb{R})$. In addition, the Tsallis relative entropy is defined by

$$D_q(\phi(x)|\psi(x)) \equiv \int_{-\infty}^{\infty} \phi(x)^q (\ln_q \phi(x) - \ln_q \psi(x))\,dx \qquad (2)$$

for any nonnegative real number $q$ and two probability distribution functions $\phi(x), \psi(x) \in D_{cl}(\mathbb{R})$. Taking the limit as $q \to 1$, the Tsallis entropy and the Tsallis relative entropy converge to the Shannon entropy $H_1(\phi(x)) \equiv -\int_{-\infty}^{\infty} \phi(x)\log\phi(x)\,dx$ and the Kullback-Leibler divergence $D_1(\phi(x)|\psi(x)) \equiv \int_{-\infty}^{\infty} \phi(x)(\log\phi(x)-\log\psi(x))\,dx$, respectively.
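The convergence $H_q \to H_1$ can be checked numerically. The sketch below (a minimal Python/NumPy illustration; the grid, the choice of a standard Gaussian test density, and the helper names are ours) computes $H_q$ by a Riemann sum and compares it with the closed-form differential entropy $\frac{1}{2}\log(2\pi e)$ of $N(0,1)$:

```python
import numpy as np

def ln_q(x, q):
    return np.log(x) if q == 1 else (x**(1 - q) - 1) / (1 - q)

def tsallis_entropy(phi, x, q):
    """H_q(phi) = -integral of phi^q ln_q(phi), by a Riemann sum on a uniform grid."""
    dx = x[1] - x[0]
    return -np.sum(phi**q * ln_q(phi, q)) * dx

x = np.linspace(-10.0, 10.0, 200001)
phi = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # standard Gaussian density

shannon = 0.5 * np.log(2 * np.pi * np.e)        # differential entropy of N(0,1)
assert abs(tsallis_entropy(phi, x, 1.0) - shannon) < 1e-6
assert abs(tsallis_entropy(phi, x, 1.001) - shannon) < 1e-2   # H_q -> H_1 as q -> 1
```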
We define two sets involving the constraints on the q-expectation and the q-variance:

$$C_q^{(c)} \equiv \left\{ f \in D_{cl}(\mathbb{R}) : \frac{1}{c_q}\int_{-\infty}^{\infty} x f(x)^q\,dx = \mu_q \right\}$$

and

$$C_q^{(g)} \equiv \left\{ f \in C_q^{(c)} : \frac{1}{c_q}\int_{-\infty}^{\infty} (x-\mu_q)^2 f(x)^q\,dx = \sigma_q^2 \right\}.$$
Then the q-canonical distribution $\phi_q^{(c)}(x) \in D_{cl}(\mathbb{R})$ and the q-Gaussian distribution $\phi_q^{(g)}(x) \in D_{cl}(\mathbb{R})$ were formulated [3, 4, 5, 6, 9, 10] by

$$\phi_q^{(c)}(x) \equiv \frac{1}{Z_q^{(c)}}\exp_q\left\{-\beta_q^{(c)}(x-\mu_q)\right\}, \qquad Z_q^{(c)} \equiv \int_{-\infty}^{\infty} \exp_q\left\{-\beta_q^{(c)}(x-\mu_q)\right\}dx$$

and

$$\phi_q^{(g)}(x) \equiv \frac{1}{Z_q^{(g)}}\exp_q\left\{-\frac{\beta_q^{(g)}(x-\mu_q)^2}{\sigma_q^2}\right\}, \qquad Z_q^{(g)} \equiv \int_{-\infty}^{\infty} \exp_q\left\{-\frac{\beta_q^{(g)}(x-\mu_q)^2}{\sigma_q^2}\right\}dx,$$

respectively.
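As a numerical sanity check on the q-Gaussian (a Python/NumPy sketch; the example value $q = 1/2$, the grid, and the helper name `exp_q` are our choices, and $\beta_q^{(g)} = 1/(3-q)$ is the value given in Theorem 1.2 below), the following verifies that $\phi_q^{(g)}$ normalizes to 1 and satisfies the q-variance constraint with $\mu_q = 0$, $\sigma_q = 1$:

```python
import numpy as np

def exp_q(u, q):
    """q-exponential for 0 <= q < 1; zero outside the support 1 + (1-q)u > 0."""
    return np.maximum(1 + (1 - q) * u, 0.0)**(1 / (1 - q))

q = 0.5                      # example value in [0, 1): compact support
beta = 1 / (3 - q)           # beta_q^{(g)} from Theorem 1.2
x = np.linspace(-3.0, 3.0, 600001)   # support is |x| < sqrt((3-q)/(1-q)) = sqrt(5)
dx = x[1] - x[0]

w = exp_q(-beta * x**2, q)   # unnormalized q-Gaussian (mu_q = 0, sigma_q = 1)
Z = np.sum(w) * dx           # partition function Z_q^{(g)}
phi = w / Z
c = np.sum(phi**q) * dx      # c_q = integral of phi^q

assert abs(np.sum(phi) * dx - 1.0) < 1e-6                 # phi is a density
assert abs(np.sum(x**2 * phi**q) * dx / c - 1.0) < 1e-6   # q-variance constraint
```

The second assertion is exactly the identity (17) proved in the Appendix, here confirmed by quadrature.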
Theorem 1.1 ([7]) If $\phi \in C_q^{(c)}$, then

$$H_q(\phi(x)) \le -c_q \ln_q \frac{1}{Z_q^{(c)}},$$

with equality if and only if

$$\phi(x) = \frac{1}{Z_q^{(c)}}\exp_q\left\{-\beta_q^{(c)}(x-\mu_q)\right\},$$

where $c_q \equiv \int_{-\infty}^{\infty} \phi(x)^q\,dx$.
Theorem 1.2 ([7]) If $\phi \in C_q^{(g)}$ for $0 < q < 3$, $q \neq 1$, then

$$H_q(\phi(x)) \le -c_q \ln_q \frac{1}{Z_q^{(g)}} + c_q \beta_q^{(g)} \bigl(Z_q^{(g)}\bigr)^{q-1},$$

with equality if and only if

$$\phi(x) = \frac{1}{Z_q^{(g)}}\exp_q\left\{-\beta_q^{(g)}(x-\mu_q)^2/\sigma_q^2\right\},$$

where $\beta_q^{(g)} = 1/(3-q)$.
2 Uniqueness theorem of Tsallis entropy
Here we deal with $n \times n$ matrices, whose set is denoted by $M_n(\mathbb{C})$, acting on the complex vector space $V$ over $\mathbb{C}$. In the sequel, the set of all density matrices (quantum states) is represented by $D_n(\mathbb{C}) \equiv \{X \in M_n(\mathbb{C}) : X \ge 0,\ \mathrm{Tr}[X] = 1\}$. For $-I \le X \le I$ and $\lambda \in (-1,0)\cup(0,1)$, we denote the generalized exponential function by $\exp_\lambda(X) \equiv (I+\lambda X)^{1/\lambda}$. As the inverse function of $\exp_\lambda(\cdot)$, for $X \ge 0$ and $\lambda \in (-1,0)\cup(0,1)$, we denote the generalized logarithmic function by $\ln_\lambda X \equiv \frac{X^\lambda - I}{\lambda}$. Then the Tsallis relative entropy and the Tsallis entropy for nonnegative matrices $X$ and $Y$ are defined by

$$D_\lambda(X|Y) \equiv \mathrm{Tr}[X^{1-\lambda}(\ln_\lambda X - \ln_\lambda Y)]$$

and

$$S_\lambda(X) \equiv -D_\lambda(X|I).$$

These entropies are generalizations of the von Neumann entropy [17] and the Umegaki relative entropy [19] in the sense that

$$\lim_{\lambda\to 0} S_\lambda(X) = S_0(X) \equiv -\mathrm{Tr}[X\log X]$$

and

$$\lim_{\lambda\to 0} D_\lambda(X|Y) = D_0(X|Y) \equiv \mathrm{Tr}[X(\log X - \log Y)].$$
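These matrix definitions and limits can be illustrated numerically. The sketch below (Python/NumPy; the helper names `mat_pow`, `ln_lam`, `D_lam`, `S_lam` and the random test states are ours) evaluates the matrix functions by eigendecomposition and checks that $S_\lambda$ approaches the von Neumann entropy for small $\lambda$:

```python
import numpy as np

def mat_pow(A, p):
    """A**p for a positive semidefinite matrix, via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.maximum(w, 0.0)**p) @ V.conj().T

def ln_lam(A, lam):
    """Generalized logarithm ln_lam(A) = (A**lam - I) / lam."""
    return (mat_pow(A, lam) - np.eye(len(A))) / lam

def D_lam(X, Y, lam):
    """Tsallis relative entropy Tr[X**(1-lam) (ln_lam X - ln_lam Y)]."""
    return float(np.trace(mat_pow(X, 1 - lam) @ (ln_lam(X, lam) - ln_lam(Y, lam))).real)

def S_lam(X, lam):
    """Tsallis entropy S_lam(X) = -D_lam(X|I)."""
    return -D_lam(X, np.eye(len(X)), lam)

rng = np.random.default_rng(0)
R1 = rng.standard_normal((3, 3)); X = R1 @ R1.T; X /= np.trace(X)   # random density
R2 = rng.standard_normal((3, 3)); Y = R2 @ R2.T; Y /= np.trace(Y)

# lam -> 0 recovers the von Neumann entropy -Tr[X log X]
w = np.linalg.eigvalsh(X)
S0 = -np.sum(w * np.log(w))
assert abs(S_lam(X, 1e-6) - S0) < 1e-4
assert D_lam(X, Y, 1e-6) >= 0.0   # the relative entropy of this pair is nonnegative
```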
Let $T_\lambda$ be a mapping from the set $D_n(\mathbb{C})$ of all density matrices to $\mathbb{R}^+$.

Axiom 2.1 We give the postulates which the Tsallis entropy should satisfy.

T1. Continuity: For $\rho \in D_n(\mathbb{C})$, $T_\lambda(\rho)$ is a continuous function with respect to the 1-norm $\|\cdot\|_1$.

T2. Invariance: For any unitary transformation $U$, $T_\lambda(U^*\rho U) = T_\lambda(\rho)$.

T3. Generalized mixing condition: For $\rho = \oplus_{k=1}^n \lambda_k \rho_k$ on $V = \oplus_{k=1}^n V_k$, where $\lambda_k \ge 0$, $\sum_{k=1}^n \lambda_k = 1$, $\rho_k \in D_n(\mathbb{C})$, we have the additivity

$$T_\lambda(\rho) = \sum_{k=1}^n \lambda_k^{1-\lambda} T_\lambda(\rho_k) + T_\lambda(\lambda_1,\cdots,\lambda_n),$$

where $(\lambda_1,\cdots,\lambda_n)$ represents the diagonal matrix $(\lambda_k \delta_{kj})_{k,j=1,\cdots,n}$.
Theorem 2.2 ([12]) If $T_\lambda$ satisfies Axiom 2.1, then $T_\lambda$ is uniquely given by the form

$$T_\lambda(\rho) = \mu_\lambda S_\lambda(\rho)$$

for a positive constant $\mu_\lambda$.
3 Generalized Fannes' inequality

We give a continuity property of the Tsallis entropy $S_\lambda(\rho)$. To do so, we state a few lemmas.
Lemma 3.1 For a density matrix $\rho$ on the complex vector space $V$ over $\mathbb{C}$, we have

$$S_\lambda(\rho) \le \ln_\lambda d,$$

where $d \equiv \dim V$.

Lemma 3.2 ([12]) If $f$ is a concave function with $f(0) = f(1) = 0$, then we have

$$|f(t+s) - f(t)| \le \max\{f(s), f(1-s)\}$$

for any $s \in [0,1/2]$ and $t \in [0,1]$ satisfying $0 \le s+t \le 1$.

Lemma 3.3 ([12]) For any real numbers $u, v \in [0,1]$ and $\lambda \in [-1,1]$, if $|u-v| \le \frac{1}{2}$, then $|\eta_\lambda(u) - \eta_\lambda(v)| \le \eta_\lambda(|u-v|)$, where $\eta_\lambda(x) \equiv -x^{1-\lambda}\ln_\lambda x$, so that $S_\lambda(\rho) = \mathrm{Tr}[\eta_\lambda(\rho)]$.
Theorem 3.4 ([12]) For two density matrices $\rho_1$ and $\rho_2$ on $V$ and $\lambda \in [-1,1]$, if $\|\rho_1-\rho_2\|_1 \le (1-\lambda)^{1/\lambda}$, then

$$|S_\lambda(\rho_1) - S_\lambda(\rho_2)| \le \|\rho_1-\rho_2\|_1^{1-\lambda}\ln_\lambda d + \eta_\lambda(\|\rho_1-\rho_2\|_1),$$

where we denote $\|A\|_1 \equiv \mathrm{Tr}\bigl[(A^*A)^{1/2}\bigr]$ for any matrix $A$.
By taking the limit as $\lambda \to 0$, we have the following Fannes' inequality (see p. 512 of [15], also [13, 14, 16]) as a corollary, since $\lim_{\lambda\to 0}(1-\lambda)^{1/\lambda} = 1/e$.

Corollary 3.5 For two density operators $\rho_1$ and $\rho_2$ on $V$, if $\|\rho_1-\rho_2\|_1 \le \frac{1}{e}$, then

$$|S_0(\rho_1) - S_0(\rho_2)| \le \|\rho_1-\rho_2\|_1 \ln d + \eta_0(\|\rho_1-\rho_2\|_1),$$

where $S_0$ represents the von Neumann entropy $S_0(\rho) = \mathrm{Tr}[\eta_0(\rho)]$ and $\eta_0(x) = -x\ln x$.
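The classical Fannes bound is easy to test numerically. The sketch below (Python/NumPy; the random test states, the perturbation size, and the helper `vn_entropy` are our choices) builds two nearby density matrices, checks the trace-norm hypothesis $t \le 1/e$, and verifies $|S(\rho_1)-S(\rho_2)| \le t\ln d - t\ln t$:

```python
import numpy as np

def vn_entropy(rho):
    """von Neumann entropy -Tr[rho log rho], via eigenvalues."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-15]
    return -np.sum(w * np.log(w))

rng = np.random.default_rng(1)
d = 4
M = rng.standard_normal((d, d)); rho1 = M @ M.T; rho1 /= np.trace(rho1)
N = rng.standard_normal((d, d)); sig = N @ N.T; sig /= np.trace(sig)
rho2 = 0.9 * rho1 + 0.1 * sig        # a nearby state: ||rho1 - rho2||_1 <= 0.2

t = np.sum(np.abs(np.linalg.eigvalsh(rho1 - rho2)))   # trace norm of the difference
assert t <= 1 / np.e                                   # hypothesis of the corollary
lhs = abs(vn_entropy(rho1) - vn_entropy(rho2))
rhs = t * np.log(d) - t * np.log(t)                    # t ln d + eta(t) with eta(t) = -t ln t
assert lhs <= rhs
```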
4 Maximum entropy principle in nonextensive statistical physics

The problem of the maximum entropy principle has been studied in both classical and quantum systems [3, 4, 5, 6, 7, 8]. We give the maximum entropy principle for the Tsallis entropy from the operator-theoretical point of view.
Theorem 4.1 ([8]) Let $Y = Z_\lambda^{-1}\exp_\lambda(-H/\|H\|)$, where $Z_\lambda \equiv \mathrm{Tr}[\exp_\lambda(-H/\|H\|)]$, for $\lambda \in (-1,0)\cup(0,1)$ and a Hermitian matrix $H$. We denote

$$C_\lambda \equiv \left\{X \in D_n(\mathbb{C}) : \mathrm{Tr}[X^{1-\lambda}H] \le \mathrm{Tr}[Y^{1-\lambda}H]\right\}.$$

If $X \in C_\lambda$, then $S_\lambda(X) \le S_\lambda(Y)$.
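Theorem 4.1 can be probed numerically. In the Python/NumPy sketch below (our construction: $\|H\|$ is read as the spectral norm, $H$ is taken positive, and the candidate states are mixtures of the generalized Gibbs state $Y$ with the ground-state projector of $H$), every candidate that lands in $C_\lambda$ is checked against the conclusion $S_\lambda(X) \le S_\lambda(Y)$:

```python
import numpy as np

def mat_pow(A, p):
    w, V = np.linalg.eigh(A)
    return (V * np.maximum(w, 0.0)**p) @ V.T

def S_lam(X, lam):
    lnX = (mat_pow(X, lam) - np.eye(len(X))) / lam
    return -float(np.trace(mat_pow(X, 1 - lam) @ lnX))

lam = 0.4
rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4)); H = M @ M.T        # a positive Hermitian H
Hn = H / np.linalg.norm(H, 2)                        # H / ||H||, spectral norm
G = mat_pow(np.eye(4) - lam * Hn, 1 / lam)           # exp_lam(-H/||H||)
Y = G / np.trace(G)                                  # generalized Gibbs state

w, V = np.linalg.eigh(H)
P0 = np.outer(V[:, 0], V[:, 0])                      # ground-state projector of H
cY = np.trace(mat_pow(Y, 1 - lam) @ H)
checked = 0
for s in np.linspace(0.0, 1.0, 21):
    X = (1 - s) * Y + s * P0
    if np.trace(mat_pow(X, 1 - lam) @ H) <= cY + 1e-12:   # X lies in C_lam
        assert S_lam(X, lam) <= S_lam(Y, lam) + 1e-10     # conclusion of Theorem 4.1
        checked += 1
assert checked >= 1   # at least s = 0 (X = Y) always qualifies
```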
Remark 4.2 Since $-x^{1-\lambda}\ln_\lambda x$ is a strictly concave function, $S_\lambda$ is strictly concave on the set $C_\lambda$. This means that the maximizer $Y$ is uniquely determined, so that we may regard $Y$ as a generalized Gibbs state. Thus we may define a generalized Helmholtz free energy by

$$F_\lambda(X,H) \equiv \mathrm{Tr}[X^{1-\lambda}H] - \|H\| S_\lambda(X).$$

This can also be represented by means of the Tsallis relative entropy as

$$F_\lambda(X,H) = \|H\| D_\lambda(X|Y) + \ln_\lambda Z_\lambda^{-1}\,\mathrm{Tr}[X^{1-\lambda}(\|H\| - \lambda H)].$$
We straightforwardly have the following corollary by taking the limit as $\lambda \to 0$.

Corollary 4.3 ([18]) Let $Y = Z_0^{-1}\exp(-H/\|H\|)$, where $Z_0 \equiv \mathrm{Tr}[\exp(-H/\|H\|)]$, for a Hermitian matrix $H$. If $X \in C_0$, then $S_0(X) \le S_0(Y)$.
5 A variational expression of the Tsallis relative entropy

In this section, we derive a variational expression of the Tsallis relative entropy as a parametric extension of that of the relative entropy in Lemma 1.2 of [21]. A variational expression of the relative entropy has been studied in the general setting of von Neumann algebras [22, 23].
Theorem 5.1 ([20]) For $\lambda \in (0,1]$, we have the following relations, where we write $e_\lambda^{X} \equiv \exp_\lambda(X)$.

1. If $A$ and $Y$ are positive matrices, then

$$\ln_\lambda \mathrm{Tr}[e_\lambda^{A+\ln_\lambda Y}] = \max\left\{\mathrm{Tr}[X^{1-\lambda}A] - D_\lambda(X|Y) : X \in D_n(\mathbb{C})\right\}.$$

2. If $X$ is a density matrix and $B$ is a Hermitian matrix, then

$$D_\lambda(X|e_\lambda^{B}) = \max\left\{\mathrm{Tr}[X^{1-\lambda}A] - \ln_\lambda \mathrm{Tr}[e_\lambda^{A+B}] : A = A^*\right\}.$$
Taking the limit as $\lambda \to 0$, Theorem 5.1 recovers Lemma 1.2 in [21]. If we set $Y = I$ and $B = 0$ in 1. and 2. of Theorem 5.1, respectively, then we obtain the following corollary.

Corollary 5.2

1. If $A$ is a positive matrix, then

$$\ln_\lambda \mathrm{Tr}[e_\lambda^{A}] = \max\left\{\mathrm{Tr}[X^{1-\lambda}A] + S_\lambda(X) : X \in D_n(\mathbb{C})\right\}.$$

2. For a density matrix $X$, we have

$$-S_\lambda(X) = \max\left\{\mathrm{Tr}[X^{1-\lambda}A] - \ln_\lambda \mathrm{Tr}[e_\lambda^{A}] : A = A^*\right\}.$$
We now derive some trace inequalities from the results obtained above. From 1. of Corollary 5.2, we have the generalized thermodynamic inequality

$$\ln_\lambda \mathrm{Tr}[e_\lambda^{H}] \ge \mathrm{Tr}[D^{1-\lambda}H] + S_\lambda(D) \qquad (3)$$

for a density matrix $D$ and a Hermitian matrix $H$. Putting $D = \frac{A}{\mathrm{Tr}[A]}$ and $H = \ln_\lambda B$ in Eq. (3), we have the generalized Peierls-Bogoliubov inequality (Theorem 3.3 of [11]):

$$(\mathrm{Tr}[A])^{1-\lambda}(\ln_\lambda \mathrm{Tr}[A] - \ln_\lambda \mathrm{Tr}[B]) \le \mathrm{Tr}[A^{1-\lambda}(\ln_\lambda A - \ln_\lambda B)] \qquad (4)$$

for nonnegative matrices $A$ and $B$.
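Both inequalities can be checked on random matrices. In the Python/NumPy sketch below (our construction: $H$ is scaled to spectral norm 1 so that $I + \lambda H > 0$ and $e_\lambda^{H}$ is well defined; helper names are ours), inequality (3) is tested for a random density matrix and (4) for a random pair of positive matrices:

```python
import numpy as np

def mat_pow(A, p):
    w, V = np.linalg.eigh(A)
    return (V * np.maximum(w, 0.0)**p) @ V.T

def ln_lam_scalar(x, lam):
    return (x**lam - 1) / lam

def ln_lam_mat(A, lam):
    return (mat_pow(A, lam) - np.eye(len(A))) / lam

lam = 0.5
rng = np.random.default_rng(3)

# generalized thermodynamic inequality (3)
M = rng.standard_normal((4, 4)); H = (M + M.T) / 2
H /= np.linalg.norm(H, 2)                       # ensures I + lam*H > 0 (our assumption)
R = rng.standard_normal((4, 4)); D = R @ R.T; D /= np.trace(D)   # density matrix
e_lam_H = mat_pow(np.eye(4) + lam * H, 1 / lam)                   # exp_lam(H)
S_D = -np.trace(mat_pow(D, 1 - lam) @ ln_lam_mat(D, lam))
assert ln_lam_scalar(np.trace(e_lam_H), lam) >= np.trace(mat_pow(D, 1 - lam) @ H) + S_D - 1e-10

# generalized Peierls-Bogoliubov inequality (4)
P = rng.standard_normal((4, 4)); A = P @ P.T
Q = rng.standard_normal((4, 4)); B = Q @ Q.T
lhs = np.trace(A)**(1 - lam) * (ln_lam_scalar(np.trace(A), lam) - ln_lam_scalar(np.trace(B), lam))
rhs = np.trace(mat_pow(A, 1 - lam) @ (ln_lam_mat(A, lam) - ln_lam_mat(B, lam)))
assert lhs <= rhs + 1e-10
```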
Lemma 5.3 ([20]) The following statements are equivalent.

1. $F_\lambda(A) = \ln_\lambda \mathrm{Tr}[e_\lambda^{A}]$ is convex in a Hermitian matrix $A$.

2. $f_\lambda(t) = \ln_\lambda \mathrm{Tr}[e_\lambda^{A+tB}]$ is convex in $t \in \mathbb{R}$.

Corollary 5.4 ([20]) For Hermitian matrices $A$ and $B$, we have

$$\ln_\lambda \mathrm{Tr}[e_\lambda^{A+B}] - \ln_\lambda \mathrm{Tr}[e_\lambda^{A}] \ge \frac{\mathrm{Tr}[B(e_\lambda^{A})^{1-\lambda}]}{(\mathrm{Tr}[e_\lambda^{A}])^{1-\lambda}}. \qquad (5)$$
Note that we also obtain the inequality (5) by putting $A = e_\lambda^{H}$ and $B = e_\lambda^{H+K}$ in the inequality (4). In addition, putting $A = \ln_\lambda D$ and $B = H - \ln_\lambda D$ for a density matrix $D$ in the inequality (5), we obtain the inequality (3). Thus we have the following proposition.
Proposition 5.5 ([20]) The following conditions are equivalent.
1. The generalized thermodynamic inequality (3).
2. The generalized Peierls-Bogoliubov inequality (4).
3. The trace inequality (5) given in Corollary 5.4.
For nonnegative real numbers $x, y$ and $0 < \lambda \le 1$, the relations $e_\lambda^{x+y} \le e_\lambda^{x+y+\lambda xy} = e_\lambda^{x}e_\lambda^{y}$ hold. These relations naturally motivate us to consider the following inequalities in the noncommutative case.

Proposition 5.6 ([20]) For nonnegative matrices $X$ and $Y$, and $0 < \lambda \le 1$, we have

$$\mathrm{Tr}[e_\lambda^{X+Y}] \le \mathrm{Tr}[e_\lambda^{X+Y+\lambda Y^{1/2}XY^{1/2}}].$$
Note that we also have the matrix inequality

$$e_\lambda^{X+Y} \le e_\lambda^{X+Y+\lambda Y^{1/2}XY^{1/2}}$$

for $\lambda \ge 1$ by an application of the Löwner-Heinz inequality [24, 25, 26].
Proposition 5.7 ([20]) For nonnegative matrices $X, Y$, and $\lambda \in (0,1]$, we have

$$\mathrm{Tr}[e_\lambda^{X+Y+\lambda XY}] \le \mathrm{Tr}[e_\lambda^{X}e_\lambda^{Y}]. \qquad (6)$$

Notice that the Golden-Thompson inequality [27, 28],

$$\mathrm{Tr}[e^{X+Y}] \le \mathrm{Tr}[e^{X}e^{Y}],$$

which holds for Hermitian matrices $X$ and $Y$, is recovered, in the particular case of nonnegative matrices $X$ and $Y$, by taking the limit as $\lambda \to 0$ in Proposition 5.7.
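The limiting Golden-Thompson inequality itself is easy to confirm numerically. The following Python sketch (a minimal check on random Hermitian matrices; the random construction is ours) uses `scipy.linalg.expm`:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(4)
M = rng.standard_normal((4, 4)); X = (M + M.T) / 2   # random Hermitian matrices
N = rng.standard_normal((4, 4)); Y = (N + N.T) / 2

# Golden-Thompson: Tr[e^(X+Y)] <= Tr[e^X e^Y]
assert np.trace(expm(X + Y)) <= np.trace(expm(X) @ expm(Y)) + 1e-10
```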
Since $\mathrm{Tr}[HZHZ] \le \mathrm{Tr}[H^2Z^2]$ for Hermitian matrices $H$ and $Z$ [23], easy calculations show that, for nonnegative matrices $X$ and $Y$,

$$\mathrm{Tr}[(I+X+Y+Y^{1/2}XY^{1/2})^2] \le \mathrm{Tr}[(I+X+Y+XY)^2]. \qquad (7)$$

This implies the inequality

$$\mathrm{Tr}[e_{1/2}^{X+Y+\frac{1}{2}Y^{1/2}XY^{1/2}}] \le \mathrm{Tr}[e_{1/2}^{X+Y+\frac{1}{2}XY}].$$

Thus we have

$$\mathrm{Tr}[e_{1/2}^{X+Y}] \le \mathrm{Tr}[e_{1/2}^{X}e_{1/2}^{Y}] \qquad (8)$$

from Proposition 5.6 and Proposition 5.7. Putting $B = \ln_{1/2}Y$ and $A = \ln_{1/2}Y^{-1/2}XY^{-1/2}$ in 2. of Theorem 5.1 and using Eq. (8), we have

$$D_{1/2}(X|Y) \ge \mathrm{Tr}[X^{1/2}\ln_{1/2}Y^{-1/2}XY^{-1/2}], \qquad (9)$$

which gives a lower bound on the Tsallis relative entropy in the case $\lambda = 1/2$.
6 Related matrix trace inequalities

In this section, we consider an extension of the following inequality [21]:

$$\mathrm{Tr}[X(\log X + \log Y)] \le \frac{1}{p}\mathrm{Tr}[X\log X^{p/2}Y^{p}X^{p/2}] \qquad (10)$$

for nonnegative matrices $X$ and $Y$, and $p > 0$.
Theorem 6.1 ([8])

1. For positive matrices $X$ and $Y$, $p \ge 1$ and $0 < \lambda \le 1$, we have

$$\mathrm{Tr}[X^{1-\lambda}(\ln_\lambda X - \ln_\lambda Y)] \le -\mathrm{Tr}[X\ln_\lambda(X^{-p/2}Y^{p}X^{-p/2})^{1/p}]. \qquad (11)$$

2. For positive matrices $X$ and $Y$, $0 < p < 1$ and $0 < \lambda \le 1$, the inequality

$$\mathrm{Tr}[X^{1-\lambda}(\ln_\lambda X - \ln_\lambda Y)] \le -\mathrm{Tr}[X\ln_\lambda(X^{-p/2}Y^{p}X^{-p/2})^{1/p}] \qquad (12)$$

does not hold in general.
Corollary 6.2

1. For positive matrices $X$ and $Y$, the trace inequality

$$D_\lambda(X|Y) \le -\mathrm{Tr}[X\ln_\lambda(X^{-1/2}YX^{-1/2})] \qquad (13)$$

holds.

2. For positive matrices $X$ and $Y$, and $p \ge 1$, we have the inequality (10).
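Inequality (10) can be verified numerically. In the Python/NumPy sketch below (our construction: random positive matrices, $p = 2$ as an example, matrix functions evaluated through eigendecomposition via our helper `mat_fun`), both sides are computed and compared:

```python
import numpy as np

def mat_fun(A, f):
    """Apply f to the eigenvalues of a symmetric matrix A."""
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.T

rng = np.random.default_rng(5)
M = rng.standard_normal((4, 4)); X = M @ M.T   # random positive matrices
N = rng.standard_normal((4, 4)); Y = N @ N.T

p = 2.0
Xp2 = mat_fun(X, lambda w: w**(p / 2))         # X^(p/2)
Yp = mat_fun(Y, lambda w: w**p)                # Y^p
lhs = np.trace(X @ (mat_fun(X, np.log) + mat_fun(Y, np.log)))
rhs = np.trace(X @ mat_fun(Xp2 @ Yp @ Xp2, np.log)) / p
assert lhs <= rhs + 1e-8                       # inequality (10)
```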
7 Conjecture

The trace inequality (7) leads us to conjecture the following inequalities.

Conjecture 7.1 For nonnegative matrices $X$ and $Y$, we have

1. $\mathrm{Tr}[(I+X+Y+Y^{1/2}XY^{1/2})^p] \le \mathrm{Tr}[(I+X+Y+XY)^p]$ for $p \ge 1$.
2. $\mathrm{Tr}[(I+X+Y+Y^{1/2}XY^{1/2})^p] \ge \mathrm{Tr}[(I+X+Y+XY)^p]$ for $0 \le p \le 1$.
Acknowledgement
The author was partially supported by the Japanese Ministry of Education, Science, Sports and Culture, Grant-in-Aid for Encouragement of Young Scientists (B), 17740068.
References
[1] J. Aczél and Z. Daróczy, On measures of information and their characterizations, Academic Press, 1975.

[2] C. Tsallis, Possible generalization of Boltzmann-Gibbs statistics, J. Stat. Phys., Vol. 52 (1988), pp. 479-487.

[3] S. Martínez, F. Nicolás, F. Pennini and A. Plastino, Tsallis' entropy maximization procedure revisited, Physica A, Vol. 286 (2000), pp. 489-502.

[4] C. Tsallis, R. S. Mendes and A. R. Plastino, The role of constraints within generalized nonextensive statistics, Physica A, Vol. 261 (1998), pp. 534-554.

[5] S. Abe, S. Martínez, F. Pennini and A. Plastino, Nonextensive thermodynamic relations, Phys. Lett. A, Vol. 281 (2001), pp. 126-130.

[6] S. Abe, Heat and entropy in nonextensive thermodynamics: transmutation from Tsallis theory to Rényi-entropy-based theory, Physica A, Vol. 300 (2001), pp. 417-423.

[7] S. Furuichi, On the maximum entropy principle in Tsallis statistics, preprint.

[8] S. Furuichi, Matrix trace inequalities on the Tsallis entropies, preprint.

[9] H. Suyari, The unique non self-referential q-canonical distribution and the physical temperature derived from the maximum entropy principle in Tsallis statistics, cond-mat/0502298.

[10] H. Suyari and M. Tsukada, Law of error in Tsallis statistics, IEEE Trans. Information Theory, Vol. 51 (2005), pp. 753-757.

[11] S. Furuichi, K. Yanagi and K. Kuriyama, Fundamental properties of Tsallis relative entropy, J. Math. Phys., Vol. 45 (2004), pp. 4868-4877.

[12] S. Furuichi, K. Yanagi and K. Kuriyama, A generalized Fannes' inequality, Journal of Inequalities in Pure and Applied Mathematics, Vol. 8 (2007), Issue 1, Article 5, 6 pp.

[13] R. Alicki and M. Fannes, Quantum Dynamical Systems, Oxford University Press, 2001.

[14] M. Fannes, A continuity property of the entropy density for spin lattice systems, Commun. Math. Phys., Vol. 31 (1973), pp. 291-294.

[15] M. A. Nielsen and I. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, 2000.

[16] M. Ohya and D. Petz, Quantum Entropy and Its Use, Springer-Verlag, 1993.

[17] J. von Neumann, Thermodynamik quantenmechanischer Gesamtheiten, Göttinger Nachrichten, pp. 273-291 (1927).

[18] W. Thirring, Quantum mechanics of large systems, Springer-Verlag, 1980.

[19] H. Umegaki, Conditional expectation in an operator algebra, IV (entropy and information), Kodai Math. Sem. Rep., Vol. 14 (1962), pp. 59-85.

[20] S. Furuichi, Trace inequalities in nonextensive statistical mechanics, Linear Alg. Appl., Vol. 418 (2006), pp. 821-827.

[21] F. Hiai and D. Petz, The Golden-Thompson trace inequality is complemented, Linear Alg. Appl., Vol. 181 (1993), pp. 153-185.

[22] H. Kosaki, Relative entropy for states: a variational expression, J. Operator Theory, Vol. 16 (1986), pp. 335-348.

[23] D. Petz, A variational expression for the relative entropy of states of a von Neumann algebra, Commun. Math. Phys., Vol. 114 (1988), pp. 345-349.

[24] K. Löwner, Über monotone Matrixfunktionen, Math. Z., Vol. 38 (1934), pp. 177-216.

[25] E. Heinz, Beiträge zur Störungstheorie der Spektralzerlegung, Math. Ann., Vol. 123 (1951), pp. 415-438.

[26] G. K. Pedersen, Some operator monotone functions, Proc. Amer. Math. Soc., Vol. 36 (1972), pp. 309-310.

[27] S. Golden, Lower bounds for the Helmholtz function, Phys. Rev., Vol. 137 (1965), pp. B1127-B1128.

[28] C. J. Thompson, Inequality with applications in statistical mechanics, J. Math. Phys., Vol. 6 (1965), pp. 1812-1813.
Appendix

Regarding Theorem 1.2, here we show that the q-Gaussian distribution

$$\phi(x) = Z_q^{-1}\exp_q\left\{-\beta_q(x-\mu_q)^2/\sigma_q^2\right\}, \qquad (14)$$

where $Z_q \equiv \int_{-\infty}^{\infty}\exp_q\left\{-\beta_q(x-\mu_q)^2/\sigma_q^2\right\}dx$ with $\beta_q = 1/(3-q)$, satisfies the constraints

$$\frac{1}{c_q}\int_{-\infty}^{\infty} x\,\phi(x)^q\,dx = \mu_q \qquad (15)$$

and

$$\frac{1}{c_q}\int_{-\infty}^{\infty} (x-\mu_q)^2\phi(x)^q\,dx = \sigma_q^2, \qquad (16)$$

where $c_q \equiv \int_{-\infty}^{\infty}\phi(x)^q\,dx$.
Proof: It is sufficient to prove the case $\mu_q = 0$ and $\sigma_q = 1$. Since the function $x\exp_q\left(-\frac{x^2}{3-q}\right)^q$ is an odd function, we see that $\phi(x)$ satisfies the first constraint. Showing the second constraint is equivalent to showing

$$\int_{-\infty}^{\infty} x^2\exp_q\left(-\frac{x^2}{3-q}\right)^q dx = \int_{-\infty}^{\infty}\exp_q\left(-\frac{x^2}{3-q}\right)^q dx. \qquad (17)$$
(1) $0 \le q < 1$: Since $\exp_q\left(-\frac{x^2}{3-q}\right)$ is equal to $\left(1-\frac{(1-q)x^2}{3-q}\right)^{\frac{1}{1-q}}$ if $-\sqrt{\frac{3-q}{1-q}} < x < \sqrt{\frac{3-q}{1-q}}$, and equal to 0 otherwise, the L.H.S. of Eq. (17) is calculated as

$$\int_{-\sqrt{\frac{3-q}{1-q}}}^{\sqrt{\frac{3-q}{1-q}}} x^2\exp_q\left(-\frac{x^2}{3-q}\right)^q dx = \left(\frac{3-q}{1-q}\right)^{3/2}\int_{-1}^{1} y^2\left(1-y^2\right)^{\frac{q}{1-q}}dy = 2\left(\frac{3-q}{1-q}\right)^{3/2}\int_{0}^{1} y^2\left(1-y^2\right)^{\frac{q}{1-q}}dy = \left(\frac{3-q}{1-q}\right)^{3/2}B\left(\frac{3}{2},\frac{1}{1-q}\right).$$
Also, the R.H.S. of Eq. (17) is calculated as

$$\int_{-\sqrt{\frac{3-q}{1-q}}}^{\sqrt{\frac{3-q}{1-q}}} \exp_q\left(-\frac{x^2}{3-q}\right)^q dx = \left(\frac{3-q}{1-q}\right)^{1/2}\int_{-1}^{1}\left(1-y^2\right)^{\frac{q}{1-q}}dy = 2\left(\frac{3-q}{1-q}\right)^{1/2}\int_{0}^{1}\left(1-y^2\right)^{\frac{q}{1-q}}dy = \left(\frac{3-q}{1-q}\right)^{1/2}B\left(\frac{1}{2},\frac{1}{1-q}\right).$$
In the above calculations, the following formula was used:

$$\int_{0}^{1} x^\alpha\left(1-x^\lambda\right)^\beta dx = \frac{1}{\lambda}B\left(\frac{\alpha+1}{\lambda},\beta+1\right), \qquad (\alpha,\beta > -1,\ \lambda > 0).$$

By the properties of the beta function and the gamma function, $\left(\frac{3-q}{1-q}\right)^{3/2}B\left(\frac{3}{2},\frac{1}{1-q}\right)$ coincides with $\left(\frac{3-q}{1-q}\right)^{1/2}B\left(\frac{1}{2},\frac{1}{1-q}\right)$.
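The coincidence of the two beta-function expressions can be checked directly (a short Python sketch using `scipy.special.beta`; the sample values of $q$ are our choices):

```python
import numpy as np
from scipy.special import beta

# r^(3/2) B(3/2, 1/(1-q)) = r^(1/2) B(1/2, 1/(1-q)) with r = (3-q)/(1-q)
for q in (0.0, 0.3, 0.6, 0.9):
    r = (3 - q) / (1 - q)
    s = 1 / (1 - q)
    assert abs(r**1.5 * beta(1.5, s) - r**0.5 * beta(0.5, s)) < 1e-10
```

The identity follows from $B(3/2,s)/B(1/2,s) = \frac{1}{2}/(s+\frac{1}{2}) = \frac{1-q}{3-q}$ with $s = \frac{1}{1-q}$.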
(2) $1 < q < 3$: The L.H.S. of Eq. (17) is calculated as

$$\int_{-\infty}^{\infty} x^2\exp_q\left(-\frac{x^2}{3-q}\right)^q dx = 2\left(\frac{3-q}{q-1}\right)^{3/2}\int_{0}^{\infty} y^2\left(1+y^2\right)^{\frac{q}{1-q}}dy = \left(\frac{3-q}{q-1}\right)^{3/2}B\left(\frac{q}{q-1}-\frac{3}{2},\frac{3}{2}\right).$$
The R.H.S. of Eq. (17) is calculated as

$$\int_{-\infty}^{\infty}\exp_q\left(-\frac{x^2}{3-q}\right)^q dx = 2\left(\frac{3-q}{q-1}\right)^{1/2}\int_{0}^{\infty}\left(1+y^2\right)^{\frac{q}{1-q}}dy = \left(\frac{3-q}{q-1}\right)^{1/2}B\left(\frac{q}{q-1}-\frac{1}{2},\frac{1}{2}\right).$$
In the above calculations, the following formula was used:

$$\int_{0}^{\infty}\frac{dx}{x^\alpha\left(1+x^\lambda\right)^\beta} = \frac{1}{\lambda}B\left(\beta-\frac{1-\alpha}{\lambda},\frac{1-\alpha}{\lambda}\right), \qquad (\alpha < 1,\ \beta > 0,\ \lambda > 0,\ \lambda\beta > 1-\alpha).$$

By the properties of the beta function and the gamma function, $\left(\frac{3-q}{q-1}\right)^{3/2}B\left(\frac{q}{q-1}-\frac{3}{2},\frac{3}{2}\right)$ coincides with $\left(\frac{3-q}{q-1}\right)^{1/2}B\left(\frac{q}{q-1}-\frac{1}{2},\frac{1}{2}\right)$, which completes the proof.