
Figure 30: A fatgraph model for α-helix.

Figure 31: A fatgraph model for β-sheet.

For each boundary component in the fatgraph, we have a (possibly empty) sequence of unpaired hydrogens and oxygens, represented by the external matrices $\Lambda_1$ and $\Lambda_2$. To record these sequences, we introduce a combinatorial parameter, which we call the boundary point type spectrum, $\boldsymbol{\ell}_{\mathbf{i}\mathbf{j}} = \{\ell_{\mathbf{i}\mathbf{j}}\}$. It is a sequence of numbers $\ell_{\mathbf{i}\mathbf{j}}$, indexed by two sequences $\mathbf{i} = (i_1, \dots, i_K)$ and $\mathbf{j} = (j_1, \dots, j_K)$. For each boundary component, $\mathbf{i}$ records the numbers of consecutive unpaired oxygens, and $\mathbf{j}$ records the numbers of consecutive unpaired hydrogens.

For example, the diagram in figure 30 has the boundary point type spectrum $(\ell_{(1),(2)} = 1,\ \ell_{(2),(0)} = 1,\ \ell_{(0),(1)} = 1)$, while the diagram in figure 31 has the spectrum $(\ell_{(0),(0)} = 1,\ \ell_{(1,1),(1,1)} = 1,\ \ell_{(1),(1)} = 1)$. The total number $2l$ of unpaired hydrogens and oxygens is given by

\[
l = \sum_{K\ge 1}\ \sum_{(i_1,\dots,i_K)}\ \sum_{(j_1,\dots,j_K)}\ \sum_{L=1}^{K} i_L\,\ell_{(i_1,\dots,i_K),(j_1,\dots,j_K)}
  = \sum_{K\ge 1}\ \sum_{(i_1,\dots,i_K)}\ \sum_{(j_1,\dots,j_K)}\ \sum_{L=1}^{K} j_L\,\ell_{(i_1,\dots,i_K),(j_1,\dots,j_K)}.
\]
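The counting formula above can be checked directly on the two spectra quoted for figures 30 and 31. The following is a minimal sketch, assuming a spectrum is stored as a dictionary from pairs of index tuples to multiplicities; the function name `unpaired_totals` and the dictionary layout are illustrative choices, not notation from the text.

```python
def unpaired_totals(spectrum):
    """Return (sum over i_L, sum over j_L), each weighted by multiplicity.

    `spectrum` maps a pair (i, j) of equal-length tuples to the
    multiplicity l_{i,j} of boundary components of that type.
    """
    oxygens = 0
    hydrogens = 0
    for (i_seq, j_seq), mult in spectrum.items():
        if len(i_seq) != len(j_seq):
            raise ValueError("i and j must have the same length K")
        oxygens += mult * sum(i_seq)    # i counts unpaired oxygens
        hydrogens += mult * sum(j_seq)  # j counts unpaired hydrogens
    return oxygens, hydrogens


# Spectra quoted in the text for figures 30 and 31.
FIG30 = {((1,), (2,)): 1, ((2,), (0,)): 1, ((0,), (1,)): 1}
FIG31 = {((0,), (0,)): 1, ((1, 1), (1, 1)): 1, ((1,), (1,)): 1}

if __name__ == "__main__":
    for name, spec in (("figure 30", FIG30), ("figure 31", FIG31)):
        from_i, from_j = unpaired_totals(spec)
        print(name, "l from i:", from_i, "l from j:", from_j)
```

For both spectra the two sums agree, as the equality of the two expressions for $l$ requires.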

We also require two backbone spectra, $\mathbf{a} = \{a_i\}$ and $\mathbf{b} = \{b_i\}$, for the numbers of backbone segments of each type with $i$ peptide units. For the diagram in figure 30, we have $\{a_i\} = e_5$, $\{b_i\} = 0$, and for figure 31 we have $\{a_i\} = 0$, $\{b_i\} = e_5$.

Let $a = \sum_{i\ge 1} a_i$ be the total number of peptide units in the α-helix backbones, and $b = \sum_{i\ge 1} b_i$ be the total number in the β-sheet backbones.

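Definition 4. Let $N_{g,k,l}(\mathbf{a},\mathbf{b},\boldsymbol{\ell}_{ij})$ denote the number of protein fatgraphs with genus $g$, $k$ propagators, $2l$ marked points, backbone spectrum $\mathbf{a}$ for the untwisted backbones (i.e. α-helix), backbone spectrum $\mathbf{b}$ for the twisted backbones (i.e. β-sheet), and boundary point spectrum $\boldsymbol{\ell}_{ij}$. The generating function of the numbers $N_{g,k,l}(\mathbf{a},\mathbf{b},\boldsymbol{\ell}_{ij})$ is defined by

\[
F(x, y; \alpha, \beta, r_{ij}) = \sum_{a,b\ge 0} F_{a,b}(x, y; \alpha, \beta, r_{ij}),
\]

\[
F_{a,b}(x, y; \alpha, \beta, r_{ij})
= \frac{1}{a!\,b!} \sum_{g\ge 0}\ \sum_{k\ge a+b-1}\ \sum_{\boldsymbol{\ell}_{ij}}\ \sum_{\sum a_i = a}\ \sum_{\sum b_i = b}
N_{g,k,l}(\mathbf{a},\mathbf{b},\boldsymbol{\ell}_{ij})\, x^{2g-2} y^{k}
\prod_{i\ge 0} \alpha_i^{a_i} \beta_i^{b_i} \prod_{ij} r_{ij}^{\ell_{ij}}.
\]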

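Using Wick's theorem with the Wick contraction

\[
\langle A_{ab} B_{cd} \rangle
= \frac{1}{\mathrm{Vol}(H_N)^2} \int_{H_N^{\otimes 2}} dA\, dB\; A_{ab} B_{cd}\, e^{-N \operatorname{Tr} AB}
= \frac{1}{N}\, \delta_{ad}\delta_{bc}, \tag{42}
\]

we can express the generating function as a hermitian matrix integral.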

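Theorem 4.5. Let $Z_N(y;\alpha,\beta,r_{ij})$ denote the exponential of the generating function:

\[
Z_N(y;\alpha,\beta,r_{ij}) = \exp\big( F(1/N, y; \alpha, \beta, r_{ij}) \big).
\]

Then $Z_N(y;\alpha,\beta,r_{ij})$ is given as the partition function of the hermitian 2-matrix model with external fields $\Lambda_1$ and $\Lambda_2$:

\[
\begin{aligned}
Z_N(y;\alpha,\beta,r_{ij})
&= \frac{1}{\mathrm{Vol}_N^2} \int_{H_N^{\otimes 2}} dA\, dB\, \exp\Big[ -N \operatorname{Tr}\Big( AB
 - \sum_{i\ge 0} \alpha_i y^i (A + y^{-1/2}\Lambda_1)^i (B + y^{-1/2}\Lambda_2)^i \\
&\qquad\qquad - \sum_{i\ge 0} \beta_i y^i \big( (A + y^{-1/2}\Lambda_1)(B + y^{-1/2}\Lambda_2) \big)^i \Big) \Big] \\
&= \frac{1}{\mathrm{Vol}_N^2} \int_{H_N^{\otimes 2}} dA\, dB\; e^{-N \operatorname{Tr} V_{y,\alpha,\beta}(A,B;\Lambda_1,\Lambda_2)},
\end{aligned} \tag{43}
\]

where $\mathrm{Vol}_N = N^{N(N+1)/2}\,\mathrm{Vol}(H_N)$, and the $r_{ij}$ are defined by the single trace of a product of $\Lambda_1$'s and $\Lambda_2$'s as

\[
r_{(i_1,\dots,i_K),(j_1,\dots,j_K)} = \frac{1}{N} \operatorname{Tr}\big( \Lambda_1^{i_1} \Lambda_2^{j_1} \Lambda_1^{i_2} \Lambda_2^{j_2} \cdots \Lambda_1^{i_K} \Lambda_2^{j_K} \big).
\]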
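To make the definition of the parameters $r_{ij}$ concrete, the sketch below evaluates the normalised trace for small integer matrices using exact rational arithmetic. It is an illustration only: the helper names (`mat_mul`, `mat_pow`, `r_param`) and the sample matrices `LAM1`, `LAM2` are assumptions made for this example and do not come from the text.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def mat_pow(a, power):
    """Non-negative integer power of a square matrix (power 0 gives I)."""
    n = len(a)
    result = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    for _ in range(power):
        result = mat_mul(result, a)
    return result


def r_param(i_seq, j_seq, lam1, lam2):
    """(1/N) Tr(lam1^i_1 lam2^j_1 ... lam1^i_K lam2^j_K)."""
    n = len(lam1)
    prod = mat_pow(lam1, 0)  # start from the identity
    for i, j in zip(i_seq, j_seq):
        prod = mat_mul(prod, mat_pow(lam1, i))
        prod = mat_mul(prod, mat_pow(lam2, j))
    return Fraction(sum(prod[d][d] for d in range(n)), n)


# Hypothetical 2x2 symmetric (hence hermitian) external matrices.
LAM1 = [[1, 2], [2, 0]]
LAM2 = [[0, 1], [1, 3]]

if __name__ == "__main__":
    print(r_param((1,), (1,), LAM1, LAM2))
    print(r_param((1, 2), (3, 1), LAM1, LAM2))
```

Because the trace is cyclic, rotating the index pairs $(i_L, j_L)$ together leaves $r$ unchanged, which is why the labels in the cut-and-join operators can be taken modulo $K$.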

Proof. The construction is done similarly to section 3.3, by assigning the appropriate elements to the diagram elements as described above.

This generating function obeys the heat equation.

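Theorem 4.6. The generating function $Z_N(y;\alpha,\beta,r_{ij})$ satisfies the heat equation:

\[
\frac{\partial}{\partial y} Z_N(y;\alpha,\beta,r_{ij})
= \frac{1}{2N} \left( \operatorname{Tr} \frac{\partial^2}{\partial\Lambda_1 \partial\Lambda_2} + \operatorname{Tr} \frac{\partial^2}{\partial\Lambda_2 \partial\Lambda_1} \right) Z_N(y;\alpha,\beta,r_{ij}), \tag{44}
\]

where $\operatorname{Tr} \frac{\partial^2}{\partial\Lambda_1 \partial\Lambda_2}$ denotes

\[
\operatorname{Tr} \frac{\partial^2}{\partial\Lambda_1 \partial\Lambda_2} = \sum_{a,b=1}^{N} \frac{\partial^2}{\partial(\Lambda_1)_{ab}\, \partial(\Lambda_2)_{ba}}.
\]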

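Proof. The heat equation for the partition function $Z_N(y;\alpha,\beta,r_{ij})$ is obtained from the shift invariance of the matrix integral measures $dA$ and $dB$. We have

\[
\begin{aligned}
\frac{\partial}{\partial(\Lambda_1)_{ba}} Z_N(y;\alpha,\beta,r_{ij})
&= \frac{1}{\mathrm{Vol}_N^2} \int_{H_N^{\otimes 2}} dA\, dB\; N \sum_{i\ge 1} y^{i-1/2} \\
&\quad \times \Big[ \alpha_i \sum_{k=0}^{i-1} \Big( (A + y^{-1/2}\Lambda_1)^k (B + y^{-1/2}\Lambda_2)^i (A + y^{-1/2}\Lambda_1)^{i-k-1} \Big)_{ab} \\
&\qquad + i\beta_i \Big( (B + y^{-1/2}\Lambda_2) \big( (A + y^{-1/2}\Lambda_1)(B + y^{-1/2}\Lambda_2) \big)^{i-1} \Big)_{ab} \Big]
 e^{-N \operatorname{Tr} V_{y,\alpha,\beta}(A,B;\Lambda_1,\Lambda_2)} \\
&= \frac{1}{\mathrm{Vol}_N^2} \int_{H_N^{\otimes 2}} dX\, dY\; N \sum_{i\ge 1} y^{i-1/2}
 \Big( \alpha_i \sum_{k=0}^{i-1} X^k Y^i X^{i-k-1} + i\beta_i\, Y (XY)^{i-1} \Big)_{ab}
 e^{-N \operatorname{Tr} W_{y,\alpha,\beta}(X,Y;\Lambda_1,\Lambda_2)},
\end{aligned}
\]

where $X = A + y^{-1/2}\Lambda_1$, $Y = B + y^{-1/2}\Lambda_2$, and

\[
W_{y,\alpha,\beta}(X, Y; \Lambda_1, \Lambda_2)
= (X - y^{-1/2}\Lambda_1)(Y - y^{-1/2}\Lambda_2) - \sum_{i\ge 0} \alpha_i y^i X^i Y^i - \sum_{i\ge 0} \beta_i y^i (XY)^i.
\]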

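We then compute the derivative $\sum_{a,b=1}^{N} \partial/\partial(\Lambda_2)_{ab}$ to find

\[
\begin{aligned}
\operatorname{Tr} \frac{\partial^2}{\partial\Lambda_1 \partial\Lambda_2} Z_N(y;\alpha,\beta,r_{ij})
&= \frac{1}{\mathrm{Vol}_N^2} \int_{H_N^{\otimes 2}} dX\, dY\; N^2 \sum_{i\ge 1} y^{i-1} \\
&\quad \times \operatorname{Tr}\Big[ \Big( \alpha_i \sum_{k=0}^{i-1} X^k Y^i X^{i-k-1} + i\beta_i\, Y (XY)^{i-1} \Big) (X - y^{-1/2}\Lambda_1) \Big]
 e^{-N \operatorname{Tr} W_{y,\alpha,\beta}(X,Y;\Lambda_1,\Lambda_2)}.
\end{aligned}
\]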

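Exchanging the roles of $(X,\Lambda_1)$ and $(Y,\Lambda_2)$, we find

\[
\begin{aligned}
\operatorname{Tr} \frac{\partial^2}{\partial\Lambda_2 \partial\Lambda_1} Z_N(y;\alpha,\beta,r_{ij})
&= \frac{1}{\mathrm{Vol}_N^2} \int_{H_N^{\otimes 2}} dX\, dY\; N^2 \sum_{i\ge 1} y^{i-1} \\
&\quad \times \operatorname{Tr}\Big[ \Big( \alpha_i \sum_{k=0}^{i-1} Y^k X^i Y^{i-k-1} + i\beta_i\, (XY)^{i-1} X \Big) (Y - y^{-1/2}\Lambda_2) \Big]
 e^{-N \operatorname{Tr} W_{y,\alpha,\beta}(X,Y;\Lambda_1,\Lambda_2)}.
\end{aligned} \tag{45}
\]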

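On the other hand, the derivative with respect to $y$ is

\[
\begin{aligned}
\frac{\partial}{\partial y} Z_N(y;\alpha,\beta,r_{ij})
&= \frac{1}{\mathrm{Vol}_N^2} \int_{H_N^{\otimes 2}} dA\, dB\; \frac{N}{2} \sum_{i\ge 1} y^{i-1} \\
&\quad \times \operatorname{Tr}\Big[ \alpha_i \sum_{k=0}^{i-1} \Big( (A + y^{-1/2}\Lambda_1)^k A (A + y^{-1/2}\Lambda_1)^{i-k-1} (B + y^{-1/2}\Lambda_2)^i \\
&\qquad\qquad + (B + y^{-1/2}\Lambda_2)^k B (B + y^{-1/2}\Lambda_2)^{i-k-1} (A + y^{-1/2}\Lambda_1)^i \Big) \\
&\qquad + i\beta_i \Big( A (B + y^{-1/2}\Lambda_2) + (A + y^{-1/2}\Lambda_1) B \Big) \Big( (A + y^{-1/2}\Lambda_1)(B + y^{-1/2}\Lambda_2) \Big)^{i-1} \Big]
 e^{-N \operatorname{Tr} V_{y,\alpha,\beta}(A,B;\Lambda_1,\Lambda_2)} \\
&= \frac{1}{2N} \left( \operatorname{Tr} \frac{\partial^2}{\partial\Lambda_1 \partial\Lambda_2} + \operatorname{Tr} \frac{\partial^2}{\partial\Lambda_2 \partial\Lambda_1} \right) Z_N(y;\alpha,\beta,r_{ij}).
\end{aligned}
\]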

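The initial condition for this heat equation is found by setting $y = 0$ in (43):

\[
Z_N(y = 0;\alpha,\beta,r_{ij}) = \exp\Big( N \sum_{i\ge 0} \operatorname{Tr}\big( \alpha_i \Lambda_1^i \Lambda_2^i + \beta_i (\Lambda_1\Lambda_2)^i \big) \Big).
\]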

The above heat equation can be expressed as a cut-and-join equation.

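Theorem 4.7. Let $L_0$ and $L_2$ denote the following differential operators with respect to the parameters $r_{ij}$:

\[
\begin{aligned}
L_0 ={}& \sum_{K\ge 1}\ \sum_{\{i_L,j_L\}_{L=1}^{K}}\ \sum_{L=1}^{K} \sum_{M=1}^{K} \sum_{k=1}^{i_L} \sum_{\ell=1}^{j_M}
 r_{(i_L-k-1,\,i_{L+1},\dots,i_M),(j_L,\dots,j_{M-1},\,\ell)}\;
 r_{(k,\,i_{M+1},\dots,i_{L-1}),(j_M-\ell-1,\,j_{M+1},\dots,j_{L-1})} \\
& \times \frac{\partial}{\partial r_{(i_1,\dots,i_K),(j_1,\dots,j_K)}},
\end{aligned}
\]

\[
\begin{aligned}
L_2 ={}& \sum_{K,V\ge 1}\ \sum_{\{i_L,j_L\}_{L=1}^{K}}\ \sum_{\{s_Q,t_Q\}_{Q=1}^{V}}\ \sum_{L=1}^{K} \sum_{Q=1}^{V} \sum_{k=1}^{i_L} \sum_{u=1}^{t_Q}
 r_{(i_L-k-1,\,i_{L+1},\dots,i_{L-1},\,k,\,s_{Q+1},\dots,s_{Q-1},\,s_Q),(j_L,\,j_{L+1},\dots,j_{L-1},\,t_Q-u-1,\,t_{Q+1},\dots,t_{Q-1},\,u)} \\
& \times \frac{\partial^2}{\partial r_{(i_1,\dots,i_K),(j_1,\dots,j_K)}\, \partial r_{(s_1,\dots,s_V),(t_1,\dots,t_V)}},
\end{aligned}
\]

where the labels $L$, $M$ are defined modulo $K$, and the label $Q$ is defined modulo $V$.

The heat equation (44) is rewritten as the cut-and-join equation:

\[
\frac{\partial Z_N(y;\alpha,\beta,r_{ij})}{\partial y} = \Big( L_0 + \frac{1}{N^2} L_2 \Big) Z_N(y;\alpha,\beta,r_{ij}).
\]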

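Proof. By the chain rule applied to the right hand side of the heat equation (44), we find

\[
\begin{aligned}
\operatorname{Tr} \frac{\partial^2}{\partial\Lambda_1 \partial\Lambda_2} Z_N(y;\alpha,\beta,r_{ij})
={}& \sum_{K\ge 1}\ \sum_{\{i_L,j_L\}_{L=1}^{K}}
 \operatorname{Tr} \frac{\partial^2 r_{(i_1,\dots,i_K),(j_1,\dots,j_K)}}{\partial\Lambda_1 \partial\Lambda_2}\,
 \frac{\partial Z_N(y;\alpha,\beta,r_{ij})}{\partial r_{(i_1,\dots,i_K),(j_1,\dots,j_K)}} \\
& + \sum_{K,V\ge 1}\ \sum_{\{i_L,j_L\}_{L=1}^{K}}\ \sum_{\{s_Q,t_Q\}_{Q=1}^{V}}
 \operatorname{Tr}\Big( \frac{\partial r_{(i_1,\dots,i_K),(j_1,\dots,j_K)}}{\partial\Lambda_1}\, \frac{\partial r_{(s_1,\dots,s_V),(t_1,\dots,t_V)}}{\partial\Lambda_2} \Big) \\
& \qquad \times \frac{\partial^2 Z_N(y;\alpha,\beta,r_{ij})}{\partial r_{(i_1,\dots,i_K),(j_1,\dots,j_K)}\, \partial r_{(s_1,\dots,s_V),(t_1,\dots,t_V)}}.
\end{aligned} \tag{46}
\]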

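The coefficients in (46) are:

\[
\begin{aligned}
\operatorname{Tr} \frac{\partial^2 r_{(i_1,\dots,i_K),(j_1,\dots,j_K)}}{\partial\Lambda_1 \partial\Lambda_2}
&= \frac{1}{N} \sum_{L,M=1}^{K} \sum_{k=1}^{i_L} \sum_{\ell=1}^{j_M}
 \operatorname{Tr}\big( \Lambda_1^{i_L-k-1} \Lambda_2^{j_L} \cdots \Lambda_1^{i_M} \Lambda_2^{\ell} \big) \\
&\qquad \times \operatorname{Tr}\big( \Lambda_2^{j_M-\ell-1} \Lambda_1^{i_{M+1}} \Lambda_2^{j_{M+1}} \cdots \Lambda_1^{i_{L-1}} \Lambda_2^{j_{L-1}} \Lambda_1^{k} \big) \\
&= N \sum_{L,M=1}^{K} \sum_{k=1}^{i_L} \sum_{\ell=1}^{j_M}
 r_{(i_L-k-1,\,i_{L+1},\dots,i_M),(j_L,\dots,j_{M-1},\,\ell)}\;
 r_{(k,\,i_{M+1},\dots,i_{L-1}),(j_M-\ell-1,\,j_{M+1},\dots,j_{L-1})},
\end{aligned}
\]

\[
\begin{aligned}
\operatorname{Tr}\Big( \frac{\partial r_{(i_1,\dots,i_K),(j_1,\dots,j_K)}}{\partial\Lambda_1}\, \frac{\partial r_{(s_1,\dots,s_V),(t_1,\dots,t_V)}}{\partial\Lambda_2} \Big)
&= \frac{1}{N^2} \sum_{L=1}^{K} \sum_{Q=1}^{V} \sum_{k=1}^{i_L} \sum_{u=1}^{t_Q}
 \operatorname{Tr}\big( \Lambda_1^{i_L-k-1} \Lambda_2^{j_L} \Lambda_1^{i_{L+1}} \Lambda_2^{j_{L+1}} \cdots \Lambda_1^{i_{L-1}} \Lambda_2^{j_{L-1}} \Lambda_1^{k} \\
&\qquad\qquad \cdot\, \Lambda_2^{t_Q-u-1} \Lambda_1^{s_{Q+1}} \Lambda_2^{t_{Q+1}} \cdots \Lambda_1^{s_{Q-1}} \Lambda_2^{t_{Q-1}} \Lambda_1^{s_Q} \Lambda_2^{u} \big) \\
&= \frac{1}{N} \sum_{L=1}^{K} \sum_{Q=1}^{V} \sum_{k=1}^{i_L} \sum_{u=1}^{t_Q}
 r_{(i_L-k-1,\,i_{L+1},\dots,i_{L-1},\,k,\,s_{Q+1},\dots,s_{Q-1},\,s_Q),(j_L,\,j_{L+1},\dots,j_{L-1},\,t_Q-u-1,\,t_{Q+1},\dots,t_{Q-1},\,u)}.
\end{aligned}
\]

Thus we find the operators $L_0$ and $L_2$. Summing with the results of $\operatorname{Tr} \frac{\partial^2}{\partial\Lambda_2 \partial\Lambda_1}$ accounts for the factor $\frac{1}{2N}$ in (44).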

Merging backbones

We will now slightly relax the initial requirement that a backbone can only contain one type of connection between the peptide units by introducing another matrix M. To the endpoints of backbones, we attach the matrix M, with propagators connecting these endpoints to create a loop structure (figure 32).

Figure 32: Connecting α-helix and β-sheet backbones.

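Definition 5. The partition function of the protein matrix model for merged backbones is defined as the following hermitian 3-matrix model:

\[
\begin{aligned}
Z_N(y;\alpha^{(1)},\alpha^{(2)},\beta^{(1)},\beta^{(2)},r_{ij})
&= \frac{1}{\mathrm{Vol}(H_N)^3} \int_{H_N^{\otimes 3}} dA\, dB\, dM\, \exp\Big[ -N \operatorname{Tr}\Big( AB + \frac{1}{2} M^2 \\
&\qquad - \sum_{i\ge 0} y^i \Big\{ \alpha^{(1)}_i M (A + y^{-1/2}\Lambda_1)^i (B + y^{-1/2}\Lambda_2)^i M
 + \alpha^{(2)}_i M (B + y^{-1/2}\Lambda_2)^i (A + y^{-1/2}\Lambda_1)^i M \\
&\qquad\qquad + \beta^{(1)}_i M \big( (A + y^{-1/2}\Lambda_1)(B + y^{-1/2}\Lambda_2) \big)^i M
 + \beta^{(2)}_i M \big( (B + y^{-1/2}\Lambda_2)(A + y^{-1/2}\Lambda_1) \big)^i M \Big\} \Big) \Big] \\
&= \frac{1}{\mathrm{Vol}(H_N)^3} \int_{H_N^{\otimes 3}} dA\, dB\, dM\; e^{-N \operatorname{Tr} V_{y,\alpha,\beta}(A,B,M;\Lambda_1,\Lambda_2)}.
\end{aligned}
\]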

The propagator $\langle A_{ab} B_{cd} \rangle$ of this matrix model represents the hydrogen bonds, and the propagator $\langle M_{ab} M_{cd} \rangle$ represents the loop that connects the α-helices and β-sheets in the backbones.

The heat equation for the model is as follows.

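Theorem 4.8. The partition function $Z_N(y;\alpha^{(1)},\alpha^{(2)},\beta^{(1)},\beta^{(2)},r_{ij})$ obeys the heat equation

\[
\frac{\partial Z_N(y;\alpha^{(1)},\alpha^{(2)},\beta^{(1)},\beta^{(2)},r_{ij})}{\partial y}
= \frac{1}{2N} \operatorname{Tr}\left( \frac{\partial^2}{\partial\Lambda_1 \partial\Lambda_2} + \frac{\partial^2}{\partial\Lambda_2 \partial\Lambda_1} \right) Z_N(y;\alpha^{(1)},\alpha^{(2)},\beta^{(1)},\beta^{(2)},r_{ij}).
\]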

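Proof. First, we consider the derivative $\partial/\partial\Lambda_1$ of the partition function $Z_N(y;\alpha^{(1)},\alpha^{(2)},\beta^{(1)},\beta^{(2)},r_{ij})$:

\[
\begin{aligned}
\frac{\partial}{\partial(\Lambda_1)_{ba}} Z_N
&= \frac{1}{\mathrm{Vol}_N^3} \int_{H_N^{\otimes 3}} dA\, dB\, dM\; N \sum_{i\ge 1} y^{i-1/2} \\
&\quad \times \Big[ \alpha^{(1)}_i \sum_{k=0}^{i-1} \Big( (A + y^{-1/2}\Lambda_1)^k (B + y^{-1/2}\Lambda_2)^i M^2 (A + y^{-1/2}\Lambda_1)^{i-k-1} \Big)_{ab} \\
&\qquad + \alpha^{(2)}_i \sum_{k=0}^{i-1} \Big( (A + y^{-1/2}\Lambda_1)^k M^2 (B + y^{-1/2}\Lambda_2)^i (A + y^{-1/2}\Lambda_1)^{i-k-1} \Big)_{ab} \\
&\qquad + \beta^{(1)}_i \sum_{k=0}^{i-1} \Big( (B + y^{-1/2}\Lambda_2) \big( (A + y^{-1/2}\Lambda_1)(B + y^{-1/2}\Lambda_2) \big)^k M^2
 \big( (A + y^{-1/2}\Lambda_1)(B + y^{-1/2}\Lambda_2) \big)^{i-k-1} \Big)_{ab} \\
&\qquad + \beta^{(2)}_i \sum_{k=0}^{i-1} \Big( \big( (B + y^{-1/2}\Lambda_2)(A + y^{-1/2}\Lambda_1) \big)^k M^2
 \big( (B + y^{-1/2}\Lambda_2)(A + y^{-1/2}\Lambda_1) \big)^{i-k-1} (B + y^{-1/2}\Lambda_2) \Big)_{ab} \Big] \\
&\quad \times e^{-N \operatorname{Tr} V_{y,\alpha,\beta}(A,B,M;\Lambda_1,\Lambda_2)} \\
&= \frac{1}{\mathrm{Vol}_N^3} \int_{H_N^{\otimes 3}} dX\, dY\, dM\; N \sum_{i\ge 1} y^{i-1/2}
 \Big( \alpha^{(1)}_i \sum_{k=0}^{i-1} X^k Y^i M^2 X^{i-k-1} + \alpha^{(2)}_i \sum_{k=0}^{i-1} X^k M^2 Y^i X^{i-k-1} \\
&\qquad + \beta^{(1)}_i \sum_{k=0}^{i-1} Y (XY)^k M^2 (XY)^{i-k-1} + \beta^{(2)}_i \sum_{k=0}^{i-1} (YX)^k M^2 (YX)^{i-k-1} Y \Big)_{ab}
 e^{-N \operatorname{Tr} W_{y,\alpha,\beta}(X,Y,M;\Lambda_1,\Lambda_2)},
\end{aligned}
\]

where $X = A + y^{-1/2}\Lambda_1$, $Y = B + y^{-1/2}\Lambda_2$, and

\[
\begin{aligned}
W_{y,\alpha,\beta}(X, Y, M; \Lambda_1, \Lambda_2)
={}& (X - y^{-1/2}\Lambda_1)(Y - y^{-1/2}\Lambda_2) + \frac{1}{2} M^2 \\
& - \sum_{i\ge 0} y^i \big( \alpha^{(1)}_i M X^i Y^i M + \alpha^{(2)}_i M Y^i X^i M \big)
 - \sum_{i\ge 0} y^i \big( \beta^{(1)}_i M (XY)^i M + \beta^{(2)}_i M (YX)^i M \big).
\end{aligned}
\]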

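We now compute the derivative $\sum_{a,b=1}^{N} \partial/\partial(\Lambda_2)_{ab}$ to find

\[
\begin{aligned}
\operatorname{Tr} \frac{\partial^2}{\partial\Lambda_1 \partial\Lambda_2} Z_N
&= \frac{1}{\mathrm{Vol}_N^3} \int_{H_N^{\otimes 3}} dX\, dY\, dM\; N^2 \sum_{i\ge 1} y^{i-1} \\
&\quad \times \operatorname{Tr}\Big[ \Big( \alpha^{(1)}_i \sum_{k=0}^{i-1} X^k Y^i M^2 X^{i-k-1} + \alpha^{(2)}_i \sum_{k=0}^{i-1} X^k M^2 Y^i X^{i-k-1} \\
&\qquad + \beta^{(1)}_i \sum_{k=0}^{i-1} Y (XY)^k M^2 (XY)^{i-k-1} + \beta^{(2)}_i \sum_{k=0}^{i-1} (YX)^k M^2 (YX)^{i-k-1} Y \Big) (X - y^{-1/2}\Lambda_1) \Big]
 e^{-N \operatorname{Tr} W_{y,\alpha,\beta}(X,Y,M;\Lambda_1,\Lambda_2)} \\
&= \frac{1}{\mathrm{Vol}_N^3} \int_{H_N^{\otimes 3}} dX\, dY\, dM\; N^2 \sum_{i\ge 1} y^{i-1} e^{-N \operatorname{Tr} W_{y,\alpha,\beta}(X,Y,M;\Lambda_1,\Lambda_2)} \\
&\quad \times \operatorname{Tr}\Big( \alpha^{(1)}_i \sum_{k=1}^{i-1} M X^{i-k-1} (X - y^{-1/2}\Lambda_1) X^k Y^i M
 + \alpha^{(2)}_i \sum_{k=1}^{i-1} M Y^i X^{i-k-1} (X - y^{-1/2}\Lambda_1) X^k M \\
&\qquad + \beta^{(1)}_i \sum_{k=1}^{i-1} M (XY)^{i-k-1} (X - y^{-1/2}\Lambda_1) Y (XY)^k M
 + \beta^{(2)}_i \sum_{k=1}^{i-1} M (YX)^{i-k-1} Y (X - y^{-1/2}\Lambda_1) (YX)^k M \Big).
\end{aligned} \tag{47}
\]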

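Exchanging the roles of $(X,\Lambda_1)$ and $(Y,\Lambda_2)$, we find

\[
\begin{aligned}
\operatorname{Tr} \frac{\partial^2}{\partial\Lambda_2 \partial\Lambda_1} Z_N
&= \frac{1}{\mathrm{Vol}_N^3} \int_{H_N^{\otimes 3}} dX\, dY\, dM\; N^2 \sum_{i\ge 1} y^{i-1} e^{-N \operatorname{Tr} W_{y,\alpha,\beta}(X,Y,M;\Lambda_1,\Lambda_2)} \\
&\quad \times \operatorname{Tr}\Big( \alpha^{(1)}_i \sum_{k=1}^{i-1} M X^i Y^{i-k-1} (Y - y^{-1/2}\Lambda_2) Y^k M
 + \alpha^{(2)}_i \sum_{k=1}^{i-1} M Y^{i-k-1} (Y - y^{-1/2}\Lambda_2) Y^k X^i M \\
&\qquad + \beta^{(1)}_i \sum_{k=1}^{i-1} M (XY)^{i-k-1} X (Y - y^{-1/2}\Lambda_2) (XY)^k M
 + \beta^{(2)}_i \sum_{k=1}^{i-1} M (YX)^{i-k-1} (Y - y^{-1/2}\Lambda_2) X (YX)^k M \Big).
\end{aligned} \tag{48}
\]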

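Finally, we compute the derivative with respect to $y$. Writing $X = A + y^{-1/2}\Lambda_1$ and $Y = B + y^{-1/2}\Lambda_2$ as above, we find

\[
\begin{aligned}
\frac{\partial}{\partial y} Z_N
&= \frac{1}{\mathrm{Vol}_N^3} \int_{H_N^{\otimes 3}} dA\, dB\, dM\; \frac{N}{2} \sum_{i\ge 1} y^{i-1} \\
&\quad \times \operatorname{Tr}\Big[ \alpha^{(1)}_i \sum_{k=0}^{i-1} \Big( M X^k A X^{i-k-1} Y^i M + M X^i Y^k B Y^{i-k-1} M \Big) \\
&\qquad + \alpha^{(2)}_i \sum_{k=0}^{i-1} \Big( M Y^k B Y^{i-k-1} X^i M + M Y^i X^k A X^{i-k-1} M \Big) \\
&\qquad + \beta^{(1)}_i \sum_{k=0}^{i-1} M (XY)^k \big( A Y + X B \big) (XY)^{i-k-1} M \\
&\qquad + \beta^{(2)}_i \sum_{k=0}^{i-1} M (YX)^k \big( B X + Y A \big) (YX)^{i-k-1} M \Big]
 e^{-N \operatorname{Tr} V_{y,\alpha,\beta}(A,B,M;\Lambda_1,\Lambda_2)} \\
&= \frac{1}{2N} \left( \operatorname{Tr} \frac{\partial^2}{\partial\Lambda_1 \partial\Lambda_2} + \operatorname{Tr} \frac{\partial^2}{\partial\Lambda_2 \partial\Lambda_1} \right) Z_N.
\end{aligned} \tag{49}
\]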

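For the initial condition of the heat equation, we find

\[
\begin{aligned}
Z_N(y = 0;\alpha^{(1)},\alpha^{(2)},\beta^{(1)},\beta^{(2)},r_{ij})
&= \frac{1}{\mathrm{Vol}(H_N)} \int_{H_N} dM\, \exp\Big[ -\frac{N}{2} \operatorname{Tr} M \Big( I_N - 2 \sum_{i\ge 0} \big( \alpha^{(1)}_i \Lambda_1^i \Lambda_2^i + \alpha^{(2)}_i \Lambda_2^i \Lambda_1^i + \beta^{(1)}_i (\Lambda_1\Lambda_2)^i + \beta^{(2)}_i (\Lambda_2\Lambda_1)^i \big) \Big) M \Big] \\
&= \det\Big( I_N - 2 \sum_{i\ge 0} \big( \alpha^{(1)}_i \Lambda_1^i \Lambda_2^i + \alpha^{(2)}_i \Lambda_2^i \Lambda_1^i + \beta^{(1)}_i (\Lambda_1\Lambda_2)^i + \beta^{(2)}_i (\Lambda_2\Lambda_1)^i \big) \Big)^{-1/2} \\
&= \exp\Big[ \sum_{n=1}^{\infty} \frac{1}{n} \operatorname{Tr} \Big( \sum_{i\ge 0} \big( \alpha^{(1)}_i \Lambda_1^i \Lambda_2^i + \alpha^{(2)}_i \Lambda_2^i \Lambda_1^i + \beta^{(1)}_i (\Lambda_1\Lambda_2)^i + \beta^{(2)}_i (\Lambda_2\Lambda_1)^i \big) \Big)^{n} \Big],
\end{aligned}
\]

where we used Plemelj's formula

\[
\det(I_N + X) = \exp\Big( \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} \operatorname{Tr} X^n \Big).
\]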

Introducing N- and C-termini

We extend the protein matrix model further by introducing yet another external matrix $\Lambda$, which labels N- and C-termini of the backbones (figure 33 and figure 34). The boundary cycles containing $p$ backbone ends are labelled by the set of numbers $(i^{(1)}_1, \dots, i^{(1)}_{K_1} : \cdots : i^{(p)}_1, \dots, i^{(p)}_{K_p})$ that count the number of unpaired carboxyl oxygens ($\Lambda_1$) and $(j^{(1)}_1, \dots, j^{(1)}_{K_1} : \cdots : j^{(p)}_1, \dots, j^{(p)}_{K_p})$ that count the number of unpaired amino hydrogens ($\Lambda_2$), keeping their ordering on the boundary cycle.

Figure 33: Adding C- and N-ends of backbone

Figure 34: Two backbones with C- and N-ends

Let $\mathbf{n}_{ij;p}$ denote the extended boundary point type spectrum, which counts the number $n_{ij;p}$ of boundary components containing a sequence $\mathbf{i}$ of $\Lambda_1$'s, a sequence $\mathbf{j}$ of $\Lambda_2$'s, and $p$ backbone end points.

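Definition 6. The partition function of the protein matrix model with backbone endpoints is defined as the following hermitian 3-matrix model:

\[
\begin{aligned}
Z_N(y, \eta;\alpha^{(1)},\alpha^{(2)},\beta^{(1)},\beta^{(2)},r_{ij;p})
&= \frac{1}{\mathrm{Vol}(H_N)^3} \int_{H_N^{\otimes 3}} dA\, dB\, dM\, \exp\Big[ -N \operatorname{Tr}\Big( AB + \frac{1}{2} M^2 \\
&\qquad - \sum_{i\ge 0} y^i \eta\, (M + \eta^{-1/2}\Lambda) \Big\{ \alpha^{(1)}_i (A + y^{-1/2}\Lambda_1)^i (B + y^{-1/2}\Lambda_2)^i
 + \alpha^{(2)}_i (B + y^{-1/2}\Lambda_2)^i (A + y^{-1/2}\Lambda_1)^i \\
&\qquad\qquad + \beta^{(1)}_i \big( (A + y^{-1/2}\Lambda_1)(B + y^{-1/2}\Lambda_2) \big)^i
 + \beta^{(2)}_i \big( (B + y^{-1/2}\Lambda_2)(A + y^{-1/2}\Lambda_1) \big)^i \Big\} (M + \eta^{-1/2}\Lambda) \Big) \Big] \\
&= \frac{1}{\mathrm{Vol}(H_N)^3} \int_{H_N^{\otimes 3}} dA\, dB\, dM\; e^{-N \operatorname{Tr} V_{y,\eta,\alpha,\beta}(A,B,M;\Lambda_1,\Lambda_2,\Lambda)}.
\end{aligned} \tag{50}
\]

The parameter $r_{ij;p}$ is given by

\[
\begin{aligned}
r_{ij;p}
&= r_{(i^{(1)}_1,\dots,i^{(1)}_{K_1} \,:\, i^{(2)}_1,\dots,i^{(2)}_{K_2} \,:\, \dots \,:\, i^{(p)}_1,\dots,i^{(p)}_{K_p}),\,(j^{(1)}_1,\dots,j^{(1)}_{K_1} \,:\, j^{(2)}_1,\dots,j^{(2)}_{K_2} \,:\, \dots \,:\, j^{(p)}_1,\dots,j^{(p)}_{K_p})} \\
&= \frac{1}{N} \operatorname{Tr}\Big( \Lambda_1^{i^{(1)}_1} \Lambda_2^{j^{(1)}_1} \cdots \Lambda_1^{i^{(1)}_{K_1}} \Lambda_2^{j^{(1)}_{K_1}}\, \Lambda\,
 \Lambda_1^{i^{(2)}_1} \Lambda_2^{j^{(2)}_1} \cdots \Lambda_1^{i^{(2)}_{K_2}} \Lambda_2^{j^{(2)}_{K_2}}\, \Lambda \cdots
 \Lambda_1^{i^{(p)}_1} \Lambda_2^{j^{(p)}_1} \cdots \Lambda_1^{i^{(p)}_{K_p}} \Lambda_2^{j^{(p)}_{K_p}}\, \Lambda \Big).
\end{aligned}
\]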
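The colon-separated labelling of $r_{ij;p}$ can be read as a list of $p$ blocks, each closed by one factor of $\Lambda$. The sketch below evaluates the trace under that reading with exact arithmetic; the function name `r_extended`, the block representation, and the diagonal test matrices are hypothetical and serve only to illustrate the index structure.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def mat_pow(a, power):
    """Non-negative integer power of a square matrix (power 0 gives I)."""
    n = len(a)
    result = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    for _ in range(power):
        result = mat_mul(result, a)
    return result


def r_extended(blocks, lam1, lam2, lam):
    """(1/N) Tr of the product, over blocks, of
    lam1^i_1 lam2^j_1 ... lam1^i_K lam2^j_K followed by one factor lam.

    `blocks` is a list of (i_seq, j_seq) pairs, one per backbone end.
    """
    n = len(lam1)
    prod = mat_pow(lam1, 0)  # start from the identity
    for i_seq, j_seq in blocks:
        for i, j in zip(i_seq, j_seq):
            prod = mat_mul(prod, mat_pow(lam1, i))
            prod = mat_mul(prod, mat_pow(lam2, j))
        prod = mat_mul(prod, lam)  # one backbone end closes each block
    return Fraction(sum(prod[d][d] for d in range(n)), n)


# Hypothetical diagonal test matrices, chosen so the trace is easy to verify.
LAM1 = [[1, 0], [0, 2]]
LAM2 = [[3, 0], [0, 5]]
LAM = [[2, 0], [0, 1]]
BLOCKS = [((1,), (1,)), ((0,), (1,))]

if __name__ == "__main__":
    print(r_extended(BLOCKS, LAM1, LAM2, LAM))
```

With these diagonal matrices the product is diag(36, 50), so the normalised trace is 43. Cyclicity of the trace again means that rotating whole blocks does not change the value.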

The heat equations are as follows.

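Theorem 4.9. The partition function $Z_N(y, \eta;\alpha^{(1)},\alpha^{(2)},\beta^{(1)},\beta^{(2)},r_{ij;p})$ obeys the heat equations:

\[
\left[ \frac{\partial}{\partial y} - \frac{1}{2N} \operatorname{Tr}\left( \frac{\partial^2}{\partial\Lambda_1 \partial\Lambda_2} + \frac{\partial^2}{\partial\Lambda_2 \partial\Lambda_1} \right) \right] Z_N(y, \eta;\alpha^{(1)},\alpha^{(2)},\beta^{(1)},\beta^{(2)},r_{ij;p}) = 0, \tag{51}
\]

\[
\left[ \frac{\partial}{\partial\eta} - \frac{1}{2N} \operatorname{Tr} \frac{\partial^2}{\partial\Lambda^2} \right] Z_N(y, \eta;\alpha^{(1)},\alpha^{(2)},\beta^{(1)},\beta^{(2)},r_{ij;p}) = 0. \tag{52}
\]

Proof. The first equation is proven in the same way as for the previous model (i.e. $\Lambda = 0$). Here we focus on the proof of the second equation (52).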

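Consider the derivative with respect to $\Lambda$:

\[
\begin{aligned}
\operatorname{Tr} \frac{\partial^2}{\partial\Lambda^2} Z_N(y, \eta;\alpha^{(1)},\alpha^{(2)},\beta^{(1)},\beta^{(2)},r_{ij;p})
&= \frac{1}{\mathrm{Vol}_N^3} \int_{H_N^{\otimes 3}} dX\, dY\, dT\; N^2 \sum_{i\ge 0} y^i\, e^{-N \operatorname{Tr} W_{y,\eta,\alpha,\beta}(X,Y,T;\Lambda_1,\Lambda_2,\Lambda)} \\
&\quad \times \operatorname{Tr}\Big[ \Big\{ \alpha^{(1)}_i X^i Y^i + \alpha^{(2)}_i Y^i X^i + \beta^{(1)}_i (XY)^i + \beta^{(2)}_i (YX)^i \Big\} \\
&\qquad\qquad \times \Big( (T - \eta^{-1/2}\Lambda) T + T (T - \eta^{-1/2}\Lambda) \Big) \Big],
\end{aligned}
\]

where $X = A + y^{-1/2}\Lambda_1$, $Y = B + y^{-1/2}\Lambda_2$, $T = M + \eta^{-1/2}\Lambda$, and

\[
\begin{aligned}
W_{y,\eta,\alpha,\beta}(X, Y, T; \Lambda_1, \Lambda_2, \Lambda)
={}& (X - y^{-1/2}\Lambda_1)(Y - y^{-1/2}\Lambda_2) + \frac{1}{2} (T - \eta^{-1/2}\Lambda)^2 \\
& - \sum_{i\ge 0} y^i \eta\, \big( \alpha^{(1)}_i T X^i Y^i T + \alpha^{(2)}_i T Y^i X^i T \big)
 - \sum_{i\ge 0} y^i \eta\, \big( \beta^{(1)}_i T (XY)^i T + \beta^{(2)}_i T (YX)^i T \big).
\end{aligned}
\]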

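The derivative with respect to $\eta$ is given by

\[
\begin{aligned}
\frac{\partial}{\partial\eta} Z_N(y, \eta;\alpha^{(1)},\alpha^{(2)},\beta^{(1)},\beta^{(2)},r_{ij;p})
&= \frac{1}{\mathrm{Vol}_N^3} \int_{H_N^{\otimes 3}} dA\, dB\, dM\; \frac{N}{2} \sum_{i\ge 0} y^i\, e^{-N \operatorname{Tr} V_{y,\eta,\alpha,\beta}(A,B,M;\Lambda_1,\Lambda_2,\Lambda)} \\
&\quad \times \operatorname{Tr}\Big[ \Big\{ \alpha^{(1)}_i (A + y^{-1/2}\Lambda_1)^i (B + y^{-1/2}\Lambda_2)^i + \alpha^{(2)}_i (B + y^{-1/2}\Lambda_2)^i (A + y^{-1/2}\Lambda_1)^i \\
&\qquad\qquad + \beta^{(1)}_i \big( (A + y^{-1/2}\Lambda_1)(B + y^{-1/2}\Lambda_2) \big)^i + \beta^{(2)}_i \big( (B + y^{-1/2}\Lambda_2)(A + y^{-1/2}\Lambda_1) \big)^i \Big\} \\
&\qquad\qquad \times \Big( M (M + \eta^{-1/2}\Lambda) + (M + \eta^{-1/2}\Lambda) M \Big) \Big].
\end{aligned}
\]

Comparing these two results, we obtain the heat equation (52).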

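For the initial condition with $y = 0$ and $\eta = 0$, we find

\[
Z_N(y = 0, \eta = 0; \{\alpha^{(1)}_i\}, \{\alpha^{(2)}_i\}, \{\beta^{(1)}_i\}, \{\beta^{(2)}_i\}, \{r_{\{i\},\{j\};\{K\},p}\})
= \exp\Big[ \sum_{i\ge 0} \operatorname{Tr} \Lambda \Big( \alpha^{(1)}_i (\Lambda_1^i \Lambda_2^i) + \alpha^{(2)}_i (\Lambda_2^i \Lambda_1^i) + \beta^{(1)}_i (\Lambda_1\Lambda_2)^i + \beta^{(2)}_i (\Lambda_2\Lambda_1)^i \Big) \Big].
\]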

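The initial condition that keeps $\eta$ can also be considered, as follows:

\[
\begin{aligned}
& Z_N(y = 0, \eta; \{\alpha^{(1)}_i\}, \{\alpha^{(2)}_i\}, \{\beta^{(1)}_i\}, \{\beta^{(2)}_i\}, \{r_{\{i\},\{j\};\{K\},p}\}) \\
&\quad = \frac{1}{\mathrm{Vol}(H_N)} \int_{H_N} dM\, \exp\Big[ -N \operatorname{Tr}\Big\{ \frac{M^2}{2} - (M + \eta^{-1/2}\Lambda) \sum_{i\ge 0} \Big( \alpha^{(1)}_i (\Lambda_1^i \Lambda_2^i) + \alpha^{(2)}_i (\Lambda_2^i \Lambda_1^i) \\
&\qquad\qquad + \beta^{(1)}_i (\Lambda_1\Lambda_2)^i + \beta^{(2)}_i (\Lambda_2\Lambda_1)^i \Big) (M + \eta^{-1/2}\Lambda) \Big\} \Big].
\end{aligned}
\]

Finally, we express the heat equations as the cut-and-join equations. The indexing of $r$ makes the notation cumbersome, but a systematic computation gives the following result.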

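Theorem 4.10. Let $L_0$ and $L_2$ denote the following differential operators:

\[
\begin{aligned}
L_0 ={}& \sum_{p\ge 1} \sum_{\{K\}} \sum_{\{i\},\{j\}} \sum_{q=1}^{p} \sum_{r=1}^{p} \sum_{L=1}^{K_q} \sum_{M=1}^{K_r} \sum_{\ell=0}^{i^{(q)}_L-1} \sum_{m=0}^{j^{(r)}_M-1} \\
& r_{\substack{(i^{(r)}_1,\dots,i^{(r)}_M,\ \ell,\ i^{(q)}_{L+1},\dots,i^{(q)}_{K_q} \,:\, i^{(q+1)}_1,\dots \,:\, \cdots \,:\, \dots,i^{(r-1)}_{K_{r-1}}), \\ (j^{(r)}_1,\dots,j^{(r)}_{M-1},\ j^{(r)}_M-m-1,\ j^{(q)}_L,\dots,j^{(q)}_{K_q} \,:\, j^{(q+1)}_1,\dots \,:\, \cdots \,:\, \dots,j^{(r-1)}_{K_{r-1}})}} \\
& \times r_{\substack{(i^{(q)}_1,\dots,i^{(q)}_{L-1},\ i^{(q)}_L-\ell-1,\ i^{(r)}_{M+1},\dots,i^{(r)}_{K_r} \,:\, i^{(r+1)}_1,\dots \,:\, \cdots \,:\, \dots,i^{(q-1)}_{K_{q-1}}), \\ (j^{(q)}_1,\dots,j^{(q)}_{L-1},\ m,\ j^{(r)}_{M+1},\dots,j^{(r)}_{K_r} \,:\, j^{(r+1)}_1,\dots \,:\, \cdots \,:\, \dots,j^{(q-1)}_{K_{q-1}})}} \\
& \times \frac{\partial}{\partial r_{(i^{(1)}_1,\dots \,:\, \cdots \,:\, \dots,i^{(p)}_{K_p}),(j^{(1)}_1,\dots \,:\, \cdots \,:\, \dots,j^{(p)}_{K_p})}},
\end{aligned}
\]

\[
\begin{aligned}
L_2 ={}& \sum_{p,u\ge 1} \sum_{\{K\},\{V\}} \sum_{\{i,j\}} \sum_{\{s,t\}} \sum_{q=1}^{p} \sum_{w=1}^{u} \sum_{L=1}^{K_q} \sum_{R=1}^{V_w} \sum_{\ell=0}^{i^{(q)}_L-1} \sum_{b=0}^{t^{(w)}_R-1} \\
& r_{\substack{(s^{(w)}_1,\dots,s^{(w)}_R,\ \ell,\ i^{(q)}_{L+1},\dots,i^{(q)}_{K_q} \,:\, i^{(q+1)}_1,\dots \,:\, \cdots \,:\, \dots,i^{(q-1)}_{K_{q-1}} \,:\, i^{(q)}_1,\dots,i^{(q)}_{L-1},\ i^{(q)}_L-\ell-1,\ s^{(w)}_{R+1},\dots,s^{(w)}_{V_w} \,:\, s^{(w+1)}_1,\dots \,:\, \cdots \,:\, \dots,s^{(w-1)}_{V_{w-1}} :), \\ (t^{(w)}_1,\dots,t^{(w)}_{R-1},\ t^{(w)}_R-b-1,\ j^{(q)}_L,\dots,j^{(q)}_{K_q} \,:\, j^{(q+1)}_1,\dots \,:\, \cdots \,:\, \dots,j^{(q-1)}_{K_{q-1}} \,:\, j^{(q)}_1,\dots,j^{(q)}_{L-1},\ b,\ t^{(w)}_{R+1},\dots,t^{(w)}_{V_w} \,:\, t^{(w+1)}_1,\dots \,:\, \cdots \,:\, \dots,t^{(w-1)}_{V_{w-1}} :)}} \\
& \times \frac{\partial^2}{\partial r_{(i^{(1)}_1,\dots \,:\, \cdots \,:\, \dots,i^{(p)}_{K_p}),(j^{(1)}_1,\dots \,:\, \cdots \,:\, \dots,j^{(p)}_{K_p})}\ \partial r_{(s^{(1)}_1,\dots \,:\, \cdots \,:\, \dots,s^{(u)}_{V_u}),(t^{(1)}_1,\dots \,:\, \cdots \,:\, \dots,t^{(u)}_{V_u})}}.
\end{aligned}
\]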

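Let $M_0$ and $M_2$ denote the following differential operators:

\[
\begin{aligned}
M_0 ={}& \frac{1}{2} \sum_{p\ge 1} \sum_{\{K\}} \sum_{\{i\},\{j\}} \sum_{q=1}^{p} \sum_{r=1}^{p} \\
& r_{\substack{(i^{(r-1)}_1,\dots,i^{(r-1)}_{K_{r-1}},\ i^{(q)}_1,\dots,i^{(q)}_{K_q} \,:\, i^{(q+1)}_1,\dots \,:\, \cdots \,:\, \dots,i^{(r-2)}_{K_{r-2}} :), \\ (j^{(r-1)}_1,\dots,j^{(r-1)}_{K_{r-1}},\ j^{(q)}_1,\dots,j^{(q)}_{K_q} \,:\, j^{(q+1)}_1,\dots \,:\, \cdots \,:\, \dots,j^{(r-2)}_{K_{r-2}} :)}} \\
& \times r_{\substack{(i^{(q-1)}_1,\dots,i^{(q-1)}_{K_{q-1}},\ i^{(r)}_1,\dots,i^{(r)}_{K_r} \,:\, i^{(r+1)}_1,\dots \,:\, \cdots \,:\, \dots,i^{(q-2)}_{K_{q-2}} :), \\ (j^{(q-1)}_1,\dots,j^{(q-1)}_{K_{q-1}},\ j^{(r)}_1,\dots,j^{(r)}_{K_r} \,:\, j^{(r+1)}_1,\dots \,:\, \cdots \,:\, \dots,j^{(q-2)}_{K_{q-2}} :)}} \\
& \times \frac{\partial}{\partial r_{(i^{(1)}_1,\dots \,:\, \cdots \,:\, \dots,i^{(p)}_{K_p}),(j^{(1)}_1,\dots \,:\, \cdots \,:\, \dots,j^{(p)}_{K_p})}},
\end{aligned}
\]

\[
\begin{aligned}
M_2 ={}& \frac{1}{2} \sum_{p,u\ge 1} \sum_{\{K\},\{V\}} \sum_{\{i\},\{j\}} \sum_{\{s\},\{t\}} \sum_{q=1}^{p} \sum_{w=1}^{u} \\
& r_{\substack{(s^{(w-1)}_1,\dots,s^{(w-1)}_{V_{w-1}},\ i^{(q)}_1,\dots,i^{(q)}_{K_q} \,:\, i^{(q+1)}_1,\dots \,:\, \cdots \,:\, \dots,i^{(q-2)}_{K_{q-2}} \,:\, i^{(q-1)}_1,\dots,i^{(q-1)}_{K_{q-1}},\ s^{(w)}_1,\dots,s^{(w)}_{V_w} \,:\, s^{(w+1)}_1,\dots \,:\, \cdots \,:\, \dots,s^{(w-2)}_{V_{w-2}} :), \\ (t^{(w-1)}_1,\dots,t^{(w-1)}_{V_{w-1}},\ j^{(q)}_1,\dots,j^{(q)}_{K_q} \,:\, j^{(q+1)}_1,\dots \,:\, \cdots \,:\, \dots,j^{(q-2)}_{K_{q-2}} \,:\, j^{(q-1)}_1,\dots,j^{(q-1)}_{K_{q-1}},\ t^{(w)}_1,\dots,t^{(w)}_{V_w} \,:\, t^{(w+1)}_1,\dots \,:\, \cdots \,:\, \dots,t^{(w-2)}_{V_{w-2}} :)}} \\
& \times \frac{\partial^2}{\partial r_{(i^{(1)}_1,\dots \,:\, \cdots \,:\, \dots,i^{(p)}_{K_p}),(j^{(1)}_1,\dots \,:\, \cdots \,:\, \dots,j^{(p)}_{K_p})}\ \partial r_{(s^{(1)}_1,\dots \,:\, \cdots \,:\, \dots,s^{(u)}_{V_u}),(t^{(1)}_1,\dots \,:\, \cdots \,:\, \dots,t^{(u)}_{V_u})}}.
\end{aligned}
\]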

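The heat equations (51) and (52) can be rewritten as the cut-and-join equations:

\[
\frac{\partial Z_N(y, \eta;\alpha^{(1)},\alpha^{(2)},\beta^{(1)},\beta^{(2)},r_{ij;p})}{\partial y}
= \Big( L_0 + \frac{1}{N^2} L_2 \Big) Z_N(y, \eta;\alpha^{(1)},\alpha^{(2)},\beta^{(1)},\beta^{(2)},r_{ij;p}), \tag{53}
\]

\[
\frac{\partial Z_N(y, \eta;\alpha^{(1)},\alpha^{(2)},\beta^{(1)},\beta^{(2)},r_{ij;p})}{\partial \eta}
= \Big( M_0 + \frac{1}{N^2} M_2 \Big) Z_N(y, \eta;\alpha^{(1)},\alpha^{(2)},\beta^{(1)},\beta^{(2)},r_{ij;p}). \tag{54}
\]

Proof. First we derive the $L_0$ and $L_2$ operators from the chain rule. The $L_0$ operator comes from the following derivative: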

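\[
\begin{aligned}
\operatorname{Tr} \frac{\partial^2 r_{ij;p}}{\partial\Lambda_1 \partial\Lambda_2}
&= \frac{1}{N} \sum_{q=1}^{p} \sum_{r=1}^{p} \sum_{L=1}^{K_q} \sum_{M=1}^{K_r} \sum_{\ell=0}^{i^{(q)}_L-1} \sum_{m=0}^{j^{(r)}_M-1} \\
&\quad \operatorname{Tr}\Big( \Lambda_1^{\ell} \Lambda_2^{j^{(q)}_L} \Lambda_1^{i^{(q)}_{L+1}} \Lambda_2^{j^{(q)}_{L+1}} \cdots \Lambda_1^{i^{(q)}_{K_q}} \Lambda_2^{j^{(q)}_{K_q}}\, \Lambda \cdots \Lambda\,
 \Lambda_1^{i^{(r)}_1} \Lambda_2^{j^{(r)}_1} \cdots \Lambda_1^{i^{(r)}_M} \Lambda_2^{j^{(r)}_M-m-1} \Big) \\
&\quad \times \operatorname{Tr}\Big( \Lambda_2^{m} \Lambda_1^{i^{(r)}_{M+1}} \Lambda_2^{j^{(r)}_{M+1}} \cdots \Lambda_1^{i^{(r)}_{K_r}} \Lambda_2^{j^{(r)}_{K_r}}\, \Lambda \cdots \Lambda\,
 \Lambda_1^{i^{(q)}_1} \Lambda_2^{j^{(q)}_1} \cdots \Lambda_1^{i^{(q)}_L-\ell-1} \Big) \\
&= N \sum_{q=1}^{p} \sum_{r=1}^{p} \sum_{L=1}^{K_q} \sum_{M=1}^{K_r} \sum_{\ell=0}^{i^{(q)}_L-1} \sum_{m=0}^{j^{(r)}_M-1} \\
&\quad r_{\substack{(i^{(r)}_1,\dots,i^{(r)}_M,\ \ell,\ i^{(q)}_{L+1},\dots,i^{(q)}_{K_q} \,:\, i^{(q+1)}_1,\dots \,:\, \cdots \,:\, \dots,i^{(r-1)}_{K_{r-1}}), \\ (j^{(r)}_1,\dots,j^{(r)}_{M-1},\ j^{(r)}_M-m-1,\ j^{(q)}_L,\dots,j^{(q)}_{K_q} \,:\, j^{(q+1)}_1,\dots \,:\, \cdots \,:\, \dots,j^{(r-1)}_{K_{r-1}})}} \\
&\quad \times r_{\substack{(i^{(q)}_1,\dots,i^{(q)}_{L-1},\ i^{(q)}_L-\ell-1,\ i^{(r)}_{M+1},\dots,i^{(r)}_{K_r} \,:\, i^{(r+1)}_1,\dots \,:\, \cdots \,:\, \dots,i^{(q-1)}_{K_{q-1}}), \\ (j^{(q)}_1,\dots,j^{(q)}_{L-1},\ m,\ j^{(r)}_{M+1},\dots,j^{(r)}_{K_r} \,:\, j^{(r+1)}_1,\dots \,:\, \cdots \,:\, \dots,j^{(q-1)}_{K_{q-1}})}}.
\end{aligned}
\]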

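The $L_2$ operator is from

\[
\begin{aligned}
\operatorname{Tr}\Big( \frac{\partial r_{ij;p}}{\partial\Lambda_1}\, \frac{\partial r_{st;u}}{\partial\Lambda_2} \Big)
&= \frac{1}{N^2} \sum_{q=1}^{p} \sum_{w=1}^{u} \sum_{L=1}^{K_q} \sum_{R=1}^{V_w} \sum_{\ell=0}^{i^{(q)}_L-1} \sum_{b=0}^{t^{(w)}_R-1} \\
&\quad \operatorname{Tr}\Big( \Lambda_1^{\ell} \Lambda_2^{j^{(q)}_L} \Lambda_1^{i^{(q)}_{L+1}} \Lambda_2^{j^{(q)}_{L+1}} \cdots \Lambda_1^{i^{(q)}_{K_q}} \Lambda_2^{j^{(q)}_{K_q}}\, \Lambda \cdots \Lambda\,
 \Lambda_1^{i^{(q)}_1} \Lambda_2^{j^{(q)}_1} \cdots \Lambda_1^{i^{(q)}_L-\ell-1} \\
&\qquad \cdot\, \Lambda_2^{b} \Lambda_1^{s^{(w)}_{R+1}} \Lambda_2^{t^{(w)}_{R+1}} \cdots \Lambda_1^{s^{(w)}_{V_w}} \Lambda_2^{t^{(w)}_{V_w}}\, \Lambda \cdots \Lambda\,
 \Lambda_1^{s^{(w)}_1} \Lambda_2^{t^{(w)}_1} \cdots \Lambda_1^{s^{(w)}_R} \Lambda_2^{t^{(w)}_R-b-1} \Big) \\
&= \frac{1}{N} \sum_{q=1}^{p} \sum_{w=1}^{u} \sum_{L=1}^{K_q} \sum_{R=1}^{V_w} \sum_{\ell=0}^{i^{(q)}_L-1} \sum_{b=0}^{t^{(w)}_R-1} \\
&\quad r_{\substack{(s^{(w)}_1,\dots,s^{(w)}_R,\ \ell,\ i^{(q)}_{L+1},\dots,i^{(q)}_{K_q} \,:\, i^{(q+1)}_1,\dots \,:\, \cdots \,:\, \dots,i^{(q-1)}_{K_{q-1}} \,:\, i^{(q)}_1,\dots,i^{(q)}_{L-1},\ i^{(q)}_L-\ell-1,\ s^{(w)}_{R+1},\dots,s^{(w)}_{V_w} \,:\, s^{(w+1)}_1,\dots \,:\, \cdots \,:\, \dots,s^{(w-1)}_{V_{w-1}} :), \\ (t^{(w)}_1,\dots,t^{(w)}_{R-1},\ t^{(w)}_R-b-1,\ j^{(q)}_L,\dots,j^{(q)}_{K_q} \,:\, j^{(q+1)}_1,\dots \,:\, \cdots \,:\, \dots,j^{(q-1)}_{K_{q-1}} \,:\, j^{(q)}_1,\dots,j^{(q)}_{L-1},\ b,\ t^{(w)}_{R+1},\dots,t^{(w)}_{V_w} \,:\, t^{(w+1)}_1,\dots \,:\, \cdots \,:\, \dots,t^{(w-1)}_{V_{w-1}} :)}}.
\end{aligned}
\]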

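For the second heat equation, the $M_0$ operator is from

\[
\begin{aligned}
\operatorname{Tr} \frac{\partial^2 r_{ij;p}}{\partial\Lambda^2}
&= \frac{1}{N} \sum_{q=1}^{p} \sum_{r=1}^{p}
 \operatorname{Tr}\Big( \Lambda_1^{i^{(q)}_1} \Lambda_2^{j^{(q)}_1} \cdots \Lambda_1^{i^{(q)}_{K_q}} \Lambda_2^{j^{(q)}_{K_q}}\, \Lambda \cdots \Lambda\,
 \Lambda_1^{i^{(r-1)}_1} \Lambda_2^{j^{(r-1)}_1} \cdots \Lambda_1^{i^{(r-1)}_{K_{r-1}}} \Lambda_2^{j^{(r-1)}_{K_{r-1}}} \Big) \\
&\qquad \times \operatorname{Tr}\Big( \Lambda_1^{i^{(r)}_1} \Lambda_2^{j^{(r)}_1} \cdots \Lambda_1^{i^{(r)}_{K_r}} \Lambda_2^{j^{(r)}_{K_r}}\, \Lambda \cdots \Lambda\,
 \Lambda_1^{i^{(q-1)}_1} \Lambda_2^{j^{(q-1)}_1} \cdots \Lambda_1^{i^{(q-1)}_{K_{q-1}}} \Lambda_2^{j^{(q-1)}_{K_{q-1}}} \Big) \\
&= N \sum_{q=1}^{p} \sum_{r=1}^{p}
 r_{\substack{(i^{(r-1)}_1,\dots,i^{(r-1)}_{K_{r-1}},\ i^{(q)}_1,\dots,i^{(q)}_{K_q} \,:\, i^{(q+1)}_1,\dots \,:\, \cdots \,:\, \dots,i^{(r-2)}_{K_{r-2}} :), \\ (j^{(r-1)}_1,\dots,j^{(r-1)}_{K_{r-1}},\ j^{(q)}_1,\dots,j^{(q)}_{K_q} \,:\, j^{(q+1)}_1,\dots \,:\, \cdots \,:\, \dots,j^{(r-2)}_{K_{r-2}} :)}} \\
&\qquad \times r_{\substack{(i^{(q-1)}_1,\dots,i^{(q-1)}_{K_{q-1}},\ i^{(r)}_1,\dots,i^{(r)}_{K_r} \,:\, i^{(r+1)}_1,\dots \,:\, \cdots \,:\, \dots,i^{(q-2)}_{K_{q-2}} :), \\ (j^{(q-1)}_1,\dots,j^{(q-1)}_{K_{q-1}},\ j^{(r)}_1,\dots,j^{(r)}_{K_r} \,:\, j^{(r+1)}_1,\dots \,:\, \cdots \,:\, \dots,j^{(q-2)}_{K_{q-2}} :)}}.
\end{aligned}
\]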

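Finally, the $M_2$ operator is from

\[
\begin{aligned}
\operatorname{Tr}\Big( \frac{\partial r_{ij;p}}{\partial\Lambda}\, \frac{\partial r_{st;u}}{\partial\Lambda} \Big)
&= \frac{1}{N^2} \sum_{q=1}^{p} \sum_{w=1}^{u}
 \operatorname{Tr}\Big( \Lambda_1^{i^{(q)}_1} \Lambda_2^{j^{(q)}_1} \cdots \Lambda_1^{i^{(q)}_{K_q}} \Lambda_2^{j^{(q)}_{K_q}}\, \Lambda \cdots \Lambda\,
 \Lambda_1^{i^{(q-1)}_1} \Lambda_2^{j^{(q-1)}_1} \cdots \Lambda_1^{i^{(q-1)}_{K_{q-1}}} \Lambda_2^{j^{(q-1)}_{K_{q-1}}} \\
&\qquad \times \Lambda_1^{s^{(w)}_1} \Lambda_2^{t^{(w)}_1} \cdots \Lambda_1^{s^{(w)}_{V_w}} \Lambda_2^{t^{(w)}_{V_w}}\, \Lambda \cdots \Lambda\,
 \Lambda_1^{s^{(w-1)}_1} \Lambda_2^{t^{(w-1)}_1} \cdots \Lambda_1^{s^{(w-1)}_{V_{w-1}}} \Lambda_2^{t^{(w-1)}_{V_{w-1}}} \Big) \\
&= \frac{1}{N} \sum_{q=1}^{p} \sum_{w=1}^{u}
 r_{\substack{(s^{(w-1)}_1,\dots,s^{(w-1)}_{V_{w-1}},\ i^{(q)}_1,\dots,i^{(q)}_{K_q} \,:\, i^{(q+1)}_1,\dots \,:\, \cdots \,:\, \dots,i^{(q-2)}_{K_{q-2}} \,:\, i^{(q-1)}_1,\dots,i^{(q-1)}_{K_{q-1}},\ s^{(w)}_1,\dots,s^{(w)}_{V_w} \,:\, s^{(w+1)}_1,\dots \,:\, \cdots \,:\, \dots,s^{(w-2)}_{V_{w-2}} :), \\ (t^{(w-1)}_1,\dots,t^{(w-1)}_{V_{w-1}},\ j^{(q)}_1,\dots,j^{(q)}_{K_q} \,:\, j^{(q+1)}_1,\dots \,:\, \cdots \,:\, \dots,j^{(q-2)}_{K_{q-2}} \,:\, j^{(q-1)}_1,\dots,j^{(q-1)}_{K_{q-1}},\ t^{(w)}_1,\dots,t^{(w)}_{V_w} \,:\, t^{(w+1)}_1,\dots \,:\, \cdots \,:\, \dots,t^{(w-2)}_{V_{w-2}} :)}}.
\end{aligned}
\]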

We note that the first cut-and-join equation (53) expresses the cut/join manipulation of the hydrogen bonds, while the second cut-and-join equation (54) is for loops (or turns) in the backbones.

5 Topology of protein β-sheets

5.1 β-sheet topology

The α-helix and the β-sheet are two common protein secondary structures. While the α-helix is essentially a local structure with the participating residues all lying together along the backbone, the β-sheet involves interactions between residues which are far apart in the backbone (section 4.1). It is also more heterogeneous as a structure, consisting of both parallel and anti-parallel configurations of the participating β-strands. Furthermore, the β-sheet has an intrinsic structural flexibility compared to the α-helix, complicating structural analyses [30]. A better understanding of their structures and foldings is therefore crucial if we are to understand the folding mechanism of entire proteins.

The configurations of β-strands in a protein, often called β-sheet topologies, have been studied since the 1970s [79]. Early studies [79, 78, 87] identified some general rules (such as the preference for right-handedness in parallel β-sheets) from the investigation of individual proteins. As the amount of available data increased, studies used computer programmes to survey the database and found frequent patterns in the β-strand configurations [99, 80]. The information can be used to filter and rank a series of candidate structures by computing probabilities for different patterns [80]. Another approach is to assign a pseudoenergy to each pair of β-strand residues and solve the β-sheet topology prediction problem as an optimisation problem [29]. At least one study [44] has compared the two methods and found the latter to perform better. One may also combine the two methods by, for example, forbidding certain β-strand configurations that are not found in the database [89], or by incorporating the two in Bayesian modelling [18]. Other studies used integer programming techniques to predict β-sheet topologies [81, 34].

In order to study the β-sheet topology of proteins, we introduce a new model inspired by the protein fatgraph model described in section 4, which we call the protein metastructure. This model greatly simplifies the study of β-sheet topologies by amalgamating consecutive residues belonging to the same secondary structure, while still retaining the information needed to understand the configuration of β-strands. We give a detailed definition in section 5.2. Furthermore, each metastructure corresponds to a fatgraph, and this transition to fatgraphs allows us to compute topological invariants such as the number of boundary components and the genus associated to each protein. The details of this correspondence are described in section 2.1. Compared to the model described in section 4, our construction is much simpler, and only takes into account the hydrogen bonds that are part of β-sheets. In the following sections, we will analyse the topology of fatgraphs associated to proteins and suggest potential applications in the study of β-sheet topology.