Deterministic Modelling - School of Computing Science

(1)

[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers

Systems Biology 1

Deterministic Modelling

(2)

Timetable

• Friday: 11-2

• Monday: 10-12, 1-2

• Both days: mixture of ‘lecture’ and ‘lab’

work.

(3)

Overview

•

^Friday

•

Brief introduction to Systems Biology and modelling.

•

Brief introduction to deterministic modelling with Ordinary Differential Equations

•

^Monday

•

Brief introduction to exact stochastic simulation with the Gillespie algorithm

(4)

Practical work

•

We will use a solver written in Java

•

I’m assuming that everyone is happy with compiling and running Java code from the command line?

•

Speak now if not…

•

We do not have much time: I strongly advise you to complete the lab exercises (and build other models) outside the official teaching

time.

(5)

Lecture notes

• On Moodle.

• They are not exhaustive.

(6)

What is Systems Biology?

(7)

What is modelling? What is a model?

The purpose of building a model of a biological system is to formulate in either a mathematical or diagrammatical

manner, a representation of our understanding of the system. This process is useful in itself (it pulls together knowledge of the system into a single coherent place), but

can also be used to test our understanding via making predictions of system behaviour that can be verified in

the laboratory.

Q: can we gain knowledge from a model?

(8)

Types of model in SB

Networks Data analysis

Dynamic/temporal

This module

Next semester

(9)

What is modelling? What is a model?

•

For us, model = mathematical model

•

In particular, how things change over time.

[Two mathematical models describing how P behaves over time (t)]

(10)

What kind of equations?

•

P = 2t tells us the value of P at any time, t.

•

Our understanding of biological phenomena is not like this.

•

e.g. reactions tell us how things change

•

things = species like proteins and other interesting molecules

3 Deterministic Modelling

The protein decay example in the previous section is an example of building a Deterministic model.

The model is deterministic because when we solve the di↵erential equation, we will always get the same result. This is a very strong assumption – biological systems aren’t so well-behaved, but perhaps it’s still a useful model?

In building this model, we also slipped in a few other assumptions:

• Deterministic. We have assumed that the system always behaves in the same way.

• It makes sense to measure proteins as concentrations. Implies that there are large quantities of the species.

• There are no external influences. The system is not influenced by anything else.

• That the rate of decrease of protein concentration is proportional to protein concentration.

We will discuss the limitations of these assumption in class, and later, when we consider stochastic simulations.

3.1 From reactions to equations

Our model will often be described by a set of reactions. For example our decay example is described by the following reaction:

P ! 0 Here’s a more complex example:

A + B ^k!¹ A + B + B A ^k!² 0

i.e., an A combined with a B creates an additional B in the first reaction. In the second, A decays. The rates of the two reactions are k₁ and k₂. These are analogous to in the previous example. To describe this system, we need two di↵erential equations. One for A and one for B. A is involved in both reactions, so it stands to reason that both could potentially change A:

A˙_t = change caused by reaction 1 + change caused by reaction 2.

Reaction 2 looks like our protein decay example, so we’ll use the same model we used there:

A˙_t = change caused by reaction 1 k₂A_t.

Inspecting reaction 1, how does A change? There is one molecule of A on each side, so there is, in fact, no change in A:

A˙_t = 0 A_t = k₂A_t

Moving to B, it is only involved in reaction 1, and when reaction 1 happens, a B is produced. We now have to make an additional modelling assumption – how do the quantities of A and B a↵ect the rate at which this reaction happens? This most common assumption is mass action, where we assume that the rate at which the reaction occurs is proportional to A ⇥ B (mass action is actually a bit more involved than this, but it will do for our purposes). In other words, assuming mass action gives us the following di↵erential equation for B:

B˙_t = k₁AB Our system of ODEs is therefore given by:

A˙_t = k₂A_t B˙_t = k₁A_tB_t

8

(11)

What kind of equations?

• How things change = gradients

(12)

Ordinary Differential Equations

• Ordinary Differential Equations define gradients instead of values.

• Temporal logic.

• Petri nets.

• Process algebras.

2.3 Di↵erential equations

The examples given above use equations that tell us directly how much P there is at some time t (in the first example, it is always 3, in the second it is always 2t). Unfortunately, useful modelling of Biological systems is rarely so easy. In particular, it often makes more sense in biological systems to define how the quantities of di↵erent species are changing, rather than the quantities themselves. This is typically done by defining and solving di↵erential equations.

We do not have time to go into much detail on di↵erential equations here but I encourage you to read more. There are many introductory textbooks that cover these topics.

A di↵erential equation looks something like this:

dP

dt = P

The left hand side is read as “the rate of change of P with respect to time (t)” (although it looks like a division, it isn’t!) and the right hand side defines what this rate of change is. Some people also use ˙ P as a shorthand for

^dP_dt

.

The di↵erential equations for our two examples above are:

P ˙ = dP

dt = 0 and P ˙ = dP

dt = 2,

respectively. In the first example, P never changes (it is always 3) and so its rate of change is 0. In the second its rate of change is 2 (it increases by 2 units for every unit of time).

If we try and re-create the plots above using just the rate of change information we come across a prob- lem. In particular, ˙ P = 0 just tells us that P doesn’t change: we also need to know the starting value, P

₀

= 3. Similarly, in the second example we need to know that P

₀

= 0 to re-create the plot. These values are known as the initial conditions.

Let’s now build a model of a more realistic but still simple biological example.

2.4 Example: protein decay

Consider a species of proteins (P ) that ‘live’ for a while and then spontaneously decay. The proteins have a probability of 0.5 of decaying in a particular unit of time. In other words, for each currently suriving protein, we can toss a coin at the passing of each time unit and if the coin shows ‘head’, the protein decays. This suggests that after one time unit, roughly 50% of the proteins will have decayed.

After the next time unit, roughly 50% of the proteins that survived the previous time step will have themselves decayed. Every time unit, the number of proteins will roughly half. This behaviour can be seen in the following Figure, where we start with a concentration of 100 proteins (note that we’ve moved to concentrations (i.e. proteins per unit volume rather than actual numbers – more on this later):

This is a common shape of function in the biological world (spontaneous decay is commonly assumed), and corresponds to the following di↵erential equation:

P ˙

_t

= P

_t

where P

_t

means “the value of P at time t” (remember P

₀

?) and is a parameter (in this case it is equal to ln(0.5), but the particular value doesn’t matter.)

Hopefully this equation makes some intuitive sense: ˙ P

_t

is always negative (the concentration of proteins is always decreasing ) and it decreases faster the more P there is (the more times we toss the coin, the

5 • Temporal logic.

• Petri nets.

• Process algebras.

2.3 Di↵erential equations

The examples given above use equations that tell us directly how much P there is at some time t (in the first example, it is always 3, in the second it is always 2t). Unfortunately, useful modelling of Biological systems is rarely so easy. In particular, it often makes more sense in biological systems to define how the quantities of di↵erent species are changing, rather than the quantities themselves. This is typically done by defining and solving di↵erential equations.

We do not have time to go into much detail on di↵erential equations here but I encourage you to read more. There are many introductory textbooks that cover these topics.

A di↵erential equation looks something like this:

dP

dt = P

The left hand side is read as “the rate of change of P with respect to time (t)” (although it looks like a division, it isn’t!) and the right hand side defines what this rate of change is. Some people also use ˙ P as a shorthand for ^dP _dt .

The di↵erential equations for our two examples above are:

P ˙ = dP

dt = 0 and P ˙ = dP

dt = 2,

respectively. In the first example, P never changes (it is always 3) and so its rate of change is 0. In the second its rate of change is 2 (it increases by 2 units for every unit of time).

If we try and re-create the plots above using just the rate of change information we come across a prob- lem. In particular, ˙ P = 0 just tells us that P doesn’t change: we also need to know the starting value, P ₀ = 3. Similarly, in the second example we need to know that P ₀ = 0 to re-create the plot. These values are known as the initial conditions.

Let’s now build a model of a more realistic but still simple biological example.

2.4 Example: protein decay

Consider a species of proteins (P ) that ‘live’ for a while and then spontaneously decay. The proteins have a probability of 0.5 of decaying in a particular unit of time. In other words, for each currently suriving protein, we can toss a coin at the passing of each time unit and if the coin shows ‘head’, the protein decays. This suggests that after one time unit, roughly 50% of the proteins will have decayed.

After the next time unit, roughly 50% of the proteins that survived the previous time step will have themselves decayed. Every time unit, the number of proteins will roughly half. This behaviour can be seen in the following Figure, where we start with a concentration of 100 proteins (note that we’ve moved to concentrations (i.e. proteins per unit volume rather than actual numbers – more on this later):

This is a common shape of function in the biological world (spontaneous decay is commonly assumed), and corresponds to the following di↵erential equation:

P ˙ _t = P _t

where P _t means “the value of P at time t” (remember P ₀ ?) and is a parameter (in this case it is equal to ln(0.5), but the particular value doesn’t matter.)

Hopefully this equation makes some intuitive sense: ˙ P _t is always negative (the concentration of proteins is always decreasing ) and it decreases faster the more P there is (the more times we toss the coin, the

5 Shorthand

The rate of change of P with respect to t

(13)

Ordinary Differential Equations

Typically:

[the rate of change of P w.r.t t is a function of P, t, and other things]

[the rate of change of P (at time t) w.r.t t is a function of P (at time t), t, and other things (at time t)]

3 Deterministic Modelling

3.1 From reactions to equations

A + B ^k!¹ A + B + B A ^k!² 0

A˙_t = 0 A_t = k₂A_t

8

the amount that B increases at some time t depends on how much A is in the system at time t…

(14)

Roadmap

•

Given a set of reactions

•

we can define a set of differential equations

•

which we can solve (on a computer)

•

to tell us how much of each thing there is at any time t.

•

There will be one equation for each species (e.g. protein) of interest.

(15)

What is P?

• Our variables (e.g. P,A,B etc) are

concentrations of things we want to model.

• Mathematically we treat them as

continuous values - each one can take any value.

•

Q: implications of this? Assumptions?

•

[they should also be +ve]

(16)

Example: protein decay

• P = concentration of protein.

Q: where does that come from?

Q: what else do we need to know?

Q: what does changing lambda do?

(17)

Protein decay

0 2 4 6 8 10

0 20 40 60 80 100

t

P

more ‘heads’ we get). tells us how quickly P drops. The higher , the quicker the decrease. For example, the following plot shows the same curve for a range of values – the higest curve has = 0.1 and the lowest has = 2:

0 2 4 6 8 10

0 20 40 60 80 100

t

P

Note that is related to the probability of the coin showing heads, but isn’t exactly the same.

Now, given the di↵erential equation:

P˙_t = P_t

and an initial value, P₀ = 100, how do we actually get the plot above? We need the values of P_t and not the change in P_t. To translate from the latter to the former, we have to solve the di↵erential equation. A very small proportion of di↵erential equations that you will meet within Systems Biology can be solved analytically. That is, it is possible to write out an equation for P_t. This is one of those examples, and:

P_t = P₀exp( t).

In general, we won’t be so lucky and will have to resort to a numerical solver within a computer program of which there are many available. The Java code provided here is an example of the most simple method for numerically solving di↵erential equations [more details in class; non-examinable]. It’s important to note that this solver is very bad and should only be used for examples, and not any serious modellinga.

6

This particular differential equation can be solved analytically, the result, for an initial concentration of 100 can be seen below:

0 2 4 6 8 10

0 20 40 60 80 100

t

P

more ‘heads’ we get). tells us how quickly P drops. The higher , the quicker the decrease. For example, the following plot shows the same curve for a range of values – the higest curve has = 0.1 and the lowest has = 2:

0 2 4 6 8 10

0 20 40 60 80 100

t

P

Note that is related to the probability of the coin showing heads, but isn’t exactly the same.

Now, given the di↵erential equation:

P˙_t = P_t

and an initial value, P₀ = 100, how do we actually get the plot above? We need the values of P_t and not the change in P_t. To translate from the latter to the former, we have to solve the di↵erential equation. A very small proportion of di↵erential equations that you will meet within Systems Biology can be solved analytically. That is, it is possible to write out an equation for P_t. This is one of those examples, and:

P_t = P₀ exp( t).

In general, we won’t be so lucky and will have to resort to a numerical solver within a computer program of which there are many available. The Java code provided here is an example of the most simple method for numerically solving di↵erential equations [more details in class; non-examinable]. It’s important to note that this solver is very bad and should only be used for examples, and not any serious modellinga.

No useful models can be solved analytically. So we have to resort to 6

numerically solving on a computer.

(18)

ODE Solvers

•

Many packages exist.

•

Basic mathematical idea is quite simple.

•

I’ve provided a [very] simple solver written in Java.

•

Do not use for anything ‘real’!!

Using our solver, the results can be seen in the figure below. To generate this output, use:

>java DeterministicSolver decay 10 1e-2 1 out.txt

(see Appendices for explanation) and plot the results (out.txt) in your favourite plotting program.

7

Simulator output

>java DeterministicSolver decay 20 1e-3 1 out.txt

(19)

More complex example

To fully explain the Gillespie algorithm, we need a system with more than one reaction. Consider the following set of reactions:

A

^k

!

¹

B B

^k

!

²

0 Using A

₀

= 100, B

₀

= 1, k

₁

= 0.1, k

₂

= 0.1, an example output from the Gillespie simulator is shown below. To run this, do:

>java GillespieSolver simple 100 out.txt 1

In this example, we have N = 2 species and M = 2 reactions. We will use X

_n

to denote the current count of the nth species. We will also use k

_m

to denote the rate of the mth reaction. Assume that we are currently at time t. The Gillespie algorithm does the following:

• For each reaction, m, compute a

_m

= k

_m

h

_m

. h

_m

is the number of reactant combinations, described below.

• Compute a

₀

= P

m

a

_m

• Generate two uniform random variates, u

₁

and u

₂

(e.g. Math.random() in Java)

18

Q: can you predict what will happen if you change k1 or k2?

(20)

From reactions to equations

3 Deterministic Modelling

3.1 From reactions to equations

A + B ^k!¹ A + B + B A ^k!² 0

A˙_t = 0 A_t = k₂A_t

8

Q: how do the reactions change the species?

Q: how fast do the reactions happen?

Kinetic models define how fast reactions happen.

e.g. mass action, Michaelis-Menten A model inside a model!

(21)

Mass Action

3 Deterministic Modelling

3.1 From reactions to equations

A + B ^k!¹ A + B + B A ^k!² 0

A˙_t = 0 A_t = k₂A_t

8

3 Deterministic Modelling

The protein decay example in the previous section is an example of building a Deterministic model.

The model is deterministic because when we solve the di ↵erential equation, we will always get the same result. This is a very strong assumption – biological systems aren’t so well-behaved, but perhaps it’s still a useful model?

In building this model, we also slipped in a few other assumptions:

• Deterministic. We have assumed that the system always behaves in the same way.

• It makes sense to measure proteins as concentrations. Implies that there are large quantities of the species.

• There are no external influences. The system is not influenced by anything else.

• That the rate of decrease of protein concentration is proportional to protein concen- tration.

We will discuss the limitations of these assumption in class, and later, when we consider stochastic sim- ulations.

3.1 From reactions to equations

Our model will often be described by a set of reactions. For example our decay example is described by the following reaction:

P ! 0 Here’s a more complex example:

A + B

^k

!

¹

A + B + B A

^k

!

²

0 i.e., an A combined with a B creates an additional B in the first reaction. In the second, A decays. The rates of the two reactions are k

₁

and k

₂

. These are analogous to in the previous example. To describe this system, we need two di↵erential equations. One for A and one for B . A is involved in both reactions, so it stands to reason that both could potentially change A:

A ˙

_t

= change caused by reaction 1 + change caused by reaction 2.

Reaction 2 looks like our protein decay example, so we’ll use the same model we used there:

A ˙

_t

= change caused by reaction 1 k

₂

A

_t

.

Inspecting reaction 1, how does A change? There is one molecule of A on each side, so there is, in fact, no change in A:

A ˙

_t

= 0 A

_t

= k

₂

A

_t

Moving to B , it is only involved in reaction 1, and when reaction 1 happens, a B is produced. We now have to make an additional modelling assumption – how do the quantities of A and B a↵ect the rate at which this reaction happens? This most common assumption is mass action, where we assume that the rate at which the reaction occurs is proportional to A ⇥ B (mass action is actually a bit more involved than this, but it will do for our purposes). In other words, assuming mass action gives us the following di↵erential equation for B :

B ˙

_t

= k

₁

AB Our system of ODEs is therefore given by:

A ˙

_t

= k

₂

A

_t

B ˙

_t

= k

₁

A

_t

B

_t

8

Reaction 1 leaves A unchanged. Reaction 2 decreases A.

Reaction 1 increases B. Reaction 2 leaves B unchanged.

Mass Action: rate of reaction is proportional to the product of the

‘masses’ of the reactants.

Q: what are the main assumptions under- pinning mass action?

(22)

Mass Action Example

To complete our definition, we need initial concentrations (A = 100, B = 1) and rates k₁ = 0.01, k₂ = 0.5.

Running this model in our simulator gives the output shown in the Figure below. To produce this output, use:

>java DeterministicSolver maeg 100 1e-3 1 out.txt

Exercise: extend this model to also incorporate decay of B, with rate k3 = 0.01.

3.2 Mass Action Example: Lotka-Voltera Model

Consider the following system of reactions:

A + B ^k!¹ A + B + B B + C ^k!² C + C C ^k!³ 0

Taking the species in turn, and assuming mass action, we can derive a set of di↵erential equations:

A : A doesn’t change in any reaction, therefore:

A˙_t = 0

9

(23)

Class exercise

• What are the mass action ODEs for the following system of reactions:

To complete our definition, we need initial concentrations (A = 100, B = 1) and rates k

₁

= 0.01, k

₂

= 0.5.

Running this model in our simulator gives the output shown in the Figure below. To produce this output, use:

>java DeterministicSolver maeg 100 1e-3 1 out.txt

Exercise: extend this model to also incorporate decay of B , with rate k

₃

= 0.01.

3.2 Mass Action Example: Lotka-Voltera Model

Consider the following system of reactions:

A + B

^k

!

¹

A + B + B B + C

^k

!

²

C + C C

^k

!

³

0 Taking the species in turn, and assuming mass action, we can derive a set of di↵erential equations:

A : A doesn’t change in any reaction, therefore:

A ˙

_t

= 0

9

Tip: one equation per species, not per reaction…

(24)

Run it

B : B increases in reaction 1 (with rate k₁A_tB_t) and decreases in reaction 2 (with rate k₂B_tC_t).

Therefore:

B˙_t = k₁A_tB_t k₂B_tC_t

C : C increases in reaction 2 (with rate k₂B_tC_t) and decreases in reaction 3 (with rate k₃C_t). Therefore:

C˙_t = k₂B_tC_t k₃C_t

Using the initial conditions A₀ = 1, B₀ = 50, C₀ = 50 and the rates k₁ = 0.25, k₂ = 0.0025, k₃ = 0.125, we obtain the following system behaviour when we solve the di↵erential equations. To run this, use:

>java DeterministicSolver lotka 100 1e-4 1 out.txt

Notice that A remains constant (it has to: A˙ = 0), whilst the other two species oscillate. It’s quite complex behaviour from what is a very simple set of reactions.

Exercise: vary the initial conditions and rates and observe the changes in behaviour.

10

(25)

Other kinetic models

3.3 Other kinetic models

In the previous example, we used the mass action kinetic model. Many others are used in biological models. A popular one is the Michaelis-Menten kinetic model. For example, for the following reaction:

A ! A + B

the di↵erential equation governing B with mass action would be:

B ˙

_t

= kA

_t

With Michaelis-Menten kinetics, it would be:

B ˙

_t

= k

₁

A

_t

k

₂

+ A

_t

which requires two parameters: k

₁

and k

₂

. In the example project mmexample, the following system of reactions is encoded:

A

^k

!

¹

A + B A

^k²^,k

!

³

A + C A

^k

!

⁴

0 where reactions 1 and 3 use mass action kinetics, and reaction 2 Michaelis-Menten. Using A

₀

= 100, B

₀

= C

₀

= 0 and k

₁

= k

₂

= 0.1, k

₃

= 1, k

₄

= 0.1, we obtain the system behaviour shown in the next plot. To obtain these results, use:

>java DeterministicSolver mmexample 100 1e-3 1 out.txt

Exercise: Experiment with varying the values of k

₂

and k

₃

. What is their role?

11 3.3 Other kinetic models

In the previous example, we used the mass action kinetic model. Many others are used in biological models. A popular one is the Michaelis-Menten kinetic model. For example, for the following reaction:

A ! A + B

the di↵erential equation governing B with mass action would be:

B ˙

_t

= kA

_t

With Michaelis-Menten kinetics, it would be:

B ˙

_t

= k

₁

A

_t

k

₂

+ A

_t

which requires two parameters: k

₁

and k

₂

. In the example project mmexample, the following system of reactions is encoded:

A

^k

!

¹

A + B A

^k²^,k

!

³

A + C

A

^k

!

⁴

0 where reactions 1 and 3 use mass action kinetics, and reaction 2 Michaelis-Menten. Using A

₀

= 100, B

₀

= C

₀

= 0 and k

₁

= k

₂

= 0.1, k

₃

= 1, k

₄

= 0.1, we obtain the system behaviour shown in the next plot. To obtain these results, use:

>java DeterministicSolver mmexample 100 1e-3 1 out.txt

Exercise: Experiment with varying the values of k

₂

and k

₃

. What is their role?

11 3.3 Other kinetic models

In the previous example, we used the mass action kinetic model. Many others are used in biological models. A popular one is the Michaelis-Menten kinetic model. For example, for the following reaction:

A ! A + B

the di↵erential equation governing B with mass action would be:

B˙_t = kA_t With Michaelis-Menten kinetics, it would be:

B˙_t = k₁A_t k₂ + A_t

which requires two parameters: k₁ and k₂. In the example project mmexample, the following system of reactions is encoded:

A ^k!¹ A + B A ^k²^,k!³ A + C A ^k!⁴ 0

where reactions 1 and 3 use mass action kinetics, and reaction 2 Michaelis-Menten. Using A₀ = 100, B₀ = C₀ = 0 and k₁ = k₂ = 0.1, k₃ = 1, k₄ = 0.1, we obtain the system behaviour shown in the next plot. To obtain these results, use:

>java DeterministicSolver mmexample 100 1e-3 1 out.txt

Exercise: Experiment with varying the values of k₂ and k₃. What is their role?

11

Mass Action

Michaelis-Menten

12

Try other parameter values in the lab

(26)

Other kinetic models II

• Michaelis-Menten:

• Q: can you work out what real

phenomena the parameters correspond

to?

(27)

Summary

•

We have (briefly) covered some of the

ideas behind deterministic modelling with ordinary differential equations.

•

In the notes, you’ll see some exercises to try in the lab (now).

•

After the lab, we will look over these exercises and possible move onto

discussing data and parameter/model inference

(28)

Data

Note: what follows is another whole lecture…we will not cover it all!

(29)

Dr. Simon Rogers, School of Computing Science, University of Glasgow

To simulate, we need a model

• All simulations start with:

• A model

• Some parameters

• Some initial conditions:

• What if we start with:

• A model and no parameters?

• No model?

A + B 2B + A B + C 2C

C ⇥

Initial Value Problem

An initial value problem is an ODE (or a system of ODEs) and values of variables at t = 0.

>8

><

>>

:

A˙ = 0

B˙ = k₁ · A · B k₂ · B · C C˙ = k₂ · B · C k₃ · C

A|^t⁼⁰ = 1 B|^t⁼⁰ = 50 C|^t⁼⁰ = 50 k₁ = 0.25 k₂ = 0.0025 k3 = 0.125

Solving an initial value problem means finding the function of dependent variables that satisfies the initial condition and behaves by the law defined with the differential

equation.

Dr Vlad Vyshemirsky (University of Glasgow) Systems Biology 15 / 37

(30)

A model and no parameters

• Given a system, we can hypothesise a model.

• e.g. I can hypothesise that proteins decay without necessarily knowing how fast this will happen

(encoded as a parameter).

• Without parameters we can’t simulate.

• But we can measure data.

• It would be useful if we could ‘learn’ parameters from measured data.

(31)

A model and no parameters

• Recall our model of protein decay and the initial condition:

• We measure some data in the lab:

P˙ = P P₀ = 1

What is ?

(32)

Points and distributions

• Two broad strategies:

• Find the ‘best’ value - point estimates

• Find a distribution - Bayesian inference

What is ?

= 0.5

(33)

Point estimates

• Finding the ‘best’ value:

• What does ‘best’ mean?

What is ?

Each value produces a curve.

Which curve is best?

This curve best represents the experimental evidence.

(34)

Measures of ‘bestness’

• To automate the process of finding the best

curve, we need a formal definition of ‘bestness’.

• Non-statistical: e.g. squared error, absolute error (lower is better)

• Statistical: e.g. likelihood (higher is better)

(35)

Squared error

• Compute the squared difference between the model and

experimental data at the times when the data was measured.

• Why squared?

• Add the values for all data points together.

For a particular

= L( )

(36)

Squared error

• Mathematically speaking:

• Find the value that minimises L

• How?

• Guess lots of times and keep the best.

arg min L( )

(37)

Squared error

• First 10 guesses (best is highlighted)

(38)

Squared error

(39)

Squared error

= 0.2933

(40)

Alternatively

• Guess a value, compute L.

• Change it a bit, compute new L.

• If new L is better, keep new value. Otherwise throw it away.

= 0.2997

(41)

Help is at hand

• Many algorithms exist for optimisation.

• Most are smarter than these two.

• Different ways of updating the parameters.

• Necessary when problem gets more complex.

• e.g. Finding best values of many parameters at once.

• Using simbiology in Matlab gives:

• This is the ‘correct’ value.

= 0.3

(42)

Complexity

• For any optimisation routine:

• Each evaluation of L requires one model simulation.

• For models with many parameters, we may require millions of evaluations!

• Very slow.

• Active research area.

(43)

Is there a correct value?

• Imagine doing the biological experiment twice

What is ?

• Not the same - why?

• Stochasticity, contamination, measurement error

(44)

Distributions, not points

• Bayesian parameter estimation:

• Some (me included) would argue that point estimates are misleading.

• They represent a degree of certainty that is unjustified:

• Repeat the experiment, value changes.

• Alternative: Produce a distribution over values.

(45)

Do the experiment lots of times...

• Repeat the experiment many times.

• Find the best value for each experiment.

• Histogram the results:

• More informative:

• Tells us something about the uncertainty caused by things we can’t measure

(46)

Bayesian Parameter Estimation

• We can get this without repeating the experiment.

• Instead, we make assumptions about the things we can’t measure.

• i.e. all things we can’t measure, added together look like random variables of some particular type.

• e.g. deviations from model (noise) are Gaussian

(47)

Bayesian Parameter Estimation

• Unfortunately (for you) it’s hard - not supported by SimBiology.

• Fortunately (for me) it’s hard - requires research.

• Other benefits:

• It’s possible to incorporate prior knowledge.

• It’s possible to rigorously compare how well different

models explain data: comparing hypotheses (science!)

• Negatives?

• Even more evaluations of L: even slower!

• Normally requires expert tuning.

(48)

But ODE models are bad....

• What about learning parameters in stochastic models?

• What kind of data would we need?

• Single cell, molecule counts - very expensive.

• Microarray would be no good.

• Discrete data causes problems in standard algorithms (non-smooth).

• Even slower still.

• Active research area.

(49)

No model and no parameters

• Comparing hypotheses:

• Given some data, can we choose:

Receptor

Bound Receptor

Ligand

MAPKKK MAPKKK*

MAPKK MAPKK*

MAPK MAPK*

Receptor

Bound Receptor

Ligand

MAPKKK MAPKKK*

MAPKK MAPKK*

MAPK MAPK*

V

Yes:

(50)

No model and no parameters

• (Many) people have also tried to rely more heavily on data:

• Search for statistical (not necessarily biological) relationships between species:

Just data

Expand an existing model using data

(51)

Summary

• If we have a model and no parameters, we can find them with data.

• Point estimates:

• Easy (to understand and do).

• Too certain.

• Bayesian inference:

• Hard(er) (to understand and do).

• Reflects uncertainty.

• Needs additional assumptions - noise.

• Can use data to compare models.

• Can use data to build models.

Deterministic Modelling - School of Computing Science

Systems Biology 1

Timetable

• Friday: 11-2

• Monday: 10-12, 1-2

• Both days: mixture of ‘lecture’ and ‘lab’

work.

Overview

•

•

•

•

•

Practical work

•

•

•

•

Lecture notes

• On Moodle.

• They are not exhaustive.

What is Systems Biology?

Q: can we gain knowledge from a model?

Types of model in SB

This module

Next semester

•

•

What kind of equations?

•

•

•

•

3 Deterministic Modelling

What kind of equations?

• How things change = gradients

Ordinary Differential Equations

• Ordinary Differential Equations define gradients instead of values.

• Temporal logic.

• Petri nets.

• Process algebras.

2.3 Di↵erential equations

We do not have time to go into much detail on di↵erential equations here but I encourage you to read more. There are many introductory textbooks that cover these topics.

A di↵erential equation looks something like this:

dP

dt = P

The left hand side is read as “the rate of change of P with respect to time (t)” (although it looks like a division, it isn’t!) and the right hand side defines what this rate of change is. Some people also use ˙ P as a shorthand for

.

The di↵erential equations for our two examples above are:

P ˙ = dP

dt = 0 and P ˙ = dP

dt = 2,

respectively. In the first example, P never changes (it is always 3) and so its rate of change is 0. In the second its rate of change is 2 (it increases by 2 units for every unit of time).

If we try and re-create the plots above using just the rate of change information we come across a prob- lem. In particular, ˙ P = 0 just tells us that P doesn’t change: we also need to know the starting value, P

= 3. Similarly, in the second example we need to know that P

= 0 to re-create the plot. These values are known as the initial conditions.

Let’s now build a model of a more realistic but still simple biological example.

2.4 Example: protein decay

This is a common shape of function in the biological world (spontaneous decay is commonly assumed), and corresponds to the following di↵erential equation:

P ˙

= P

where P

means “the value of P at time t” (remember P

?) and is a parameter (in this case it is equal to ln(0.5), but the particular value doesn’t matter.)

Hopefully this equation makes some intuitive sense: ˙ P

is always negative (the concentration of proteins is always decreasing ) and it decreases faster the more P there is (the more times we toss the coin, the

5

• Temporal logic.

• Petri nets.

• Process algebras.

2.3 Di↵erential equations

We do not have time to go into much detail on di↵erential equations here but I encourage you to read more. There are many introductory textbooks that cover these topics.

A di↵erential equation looks something like this:

dP

dt = P

The left hand side is read as “the rate of change of P with respect to time (t)” (although it looks like a division, it isn’t!) and the right hand side defines what this rate of change is. Some people also use ˙ P as a shorthand for dP dt .

The di↵erential equations for our two examples above are:

P ˙ = dP

dt = 0 and P ˙ = dP

dt = 2,

The left hand side is read as “the rate of change of P with respect to time (t)” (although it looks like a division, it isn’t!) and the right hand side defines what this rate of change is. Some people also use ˙ P as a shorthand for ^dP _dt .

P ˙ _t = P _t

where P _t means “the value of P at time t” (remember P ₀ ?) and is a parameter (in this case it is equal to ln(0.5), but the particular value doesn’t matter.)

Hopefully this equation makes some intuitive sense: ˙ P _t is always negative (the concentration of proteins is always decreasing ) and it decreases faster the more P there is (the more times we toss the coin, the