[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Timetable
• Friday: 11-2
• Monday: 10-12, 1-2
• Both days: mixture of ‘lecture’ and ‘lab’
work.
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Overview
•
Friday•
Brief introduction to Systems Biology and modelling.•
Brief introduction to deterministic modelling with Ordinary Differential Equations•
Monday•
Brief introduction to exact stochastic simulation with the Gillespie algorithm[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Practical work
•
We will use a solver written in Java•
I’m assuming that everyone is happy with compiling and running Java code from the command line?•
Speak now if not…•
We do not have much time: I strongly advise you to complete the lab exercises (and build other models) outside the official teachingtime.
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Lecture notes
• On Moodle.
• They are not exhaustive.
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
What is Systems Biology?
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
What is modelling? What is a model?
The purpose of building a model of a biological system is to formulate in either a mathematical or diagrammatical
manner, a representation of our understanding of the system. This process is useful in itself (it pulls together knowledge of the system into a single coherent place), but
can also be used to test our understanding via making predictions of system behaviour that can be verified in
the laboratory.
Q: can we gain knowledge from a model?
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Types of model in SB
Networks Data analysis
Dynamic/temporal
This module
Next semester
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
What is modelling? What is a model?
•
For us, model = mathematical model•
In particular, how things change over time.[Two mathematical models describing how P behaves over time (t)]
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
What kind of equations?
•
P = 2t tells us the value of P at any time, t.•
Our understanding of biological phenomena is not like this.•
e.g. reactions tell us how things change•
things = species like proteins and other interesting molecules3 Deterministic Modelling
The protein decay example in the previous section is an example of building a Deterministic model.
The model is deterministic because when we solve the di↵erential equation, we will always get the same result. This is a very strong assumption – biological systems aren’t so well-behaved, but perhaps it’s still a useful model?
In building this model, we also slipped in a few other assumptions:
• Deterministic. We have assumed that the system always behaves in the same way.
• It makes sense to measure proteins as concentrations. Implies that there are large quantities of the species.
• There are no external influences. The system is not influenced by anything else.
• That the rate of decrease of protein concentration is proportional to protein concen- tration.
We will discuss the limitations of these assumption in class, and later, when we consider stochastic sim- ulations.
3.1 From reactions to equations
Our model will often be described by a set of reactions. For example our decay example is described by the following reaction:
P ! 0 Here’s a more complex example:
A + B k!1 A + B + B A k!2 0
i.e., an A combined with a B creates an additional B in the first reaction. In the second, A decays. The rates of the two reactions are k1 and k2. These are analogous to in the previous example. To describe this system, we need two di↵erential equations. One for A and one for B. A is involved in both reactions, so it stands to reason that both could potentially change A:
A˙t = change caused by reaction 1 + change caused by reaction 2.
Reaction 2 looks like our protein decay example, so we’ll use the same model we used there:
A˙t = change caused by reaction 1 k2At.
Inspecting reaction 1, how does A change? There is one molecule of A on each side, so there is, in fact, no change in A:
A˙t = 0 At = k2At
Moving to B, it is only involved in reaction 1, and when reaction 1 happens, a B is produced. We now have to make an additional modelling assumption – how do the quantities of A and B a↵ect the rate at which this reaction happens? This most common assumption is mass action, where we assume that the rate at which the reaction occurs is proportional to A ⇥ B (mass action is actually a bit more involved than this, but it will do for our purposes). In other words, assuming mass action gives us the following di↵erential equation for B:
B˙t = k1AB Our system of ODEs is therefore given by:
A˙t = k2At B˙t = k1AtBt
8
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
What kind of equations?
• How things change = gradients
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Ordinary Differential Equations
• Ordinary Differential Equations define gradients instead of values.
• Temporal logic.
• Petri nets.
• Process algebras.
2.3 Di↵erential equations
The examples given above use equations that tell us directly how much P there is at some time t (in the first example, it is always 3, in the second it is always 2t). Unfortunately, useful modelling of Biological systems is rarely so easy. In particular, it often makes more sense in biological systems to define how the quantities of di↵erent species are changing, rather than the quantities themselves. This is typically done by defining and solving di↵erential equations.
We do not have time to go into much detail on di↵erential equations here but I encourage you to read more. There are many introductory textbooks that cover these topics.
A di↵erential equation looks something like this:
dP
dt = P
The left hand side is read as “the rate of change of P with respect to time (t)” (although it looks like a division, it isn’t!) and the right hand side defines what this rate of change is. Some people also use ˙ P as a shorthand for
dPdt.
The di↵erential equations for our two examples above are:
P ˙ = dP
dt = 0 and P ˙ = dP
dt = 2,
respectively. In the first example, P never changes (it is always 3) and so its rate of change is 0. In the second its rate of change is 2 (it increases by 2 units for every unit of time).
If we try and re-create the plots above using just the rate of change information we come across a prob- lem. In particular, ˙ P = 0 just tells us that P doesn’t change: we also need to know the starting value, P
0= 3. Similarly, in the second example we need to know that P
0= 0 to re-create the plot. These values are known as the initial conditions.
Let’s now build a model of a more realistic but still simple biological example.
2.4 Example: protein decay
Consider a species of proteins (P ) that ‘live’ for a while and then spontaneously decay. The proteins have a probability of 0.5 of decaying in a particular unit of time. In other words, for each currently suriving protein, we can toss a coin at the passing of each time unit and if the coin shows ‘head’, the protein decays. This suggests that after one time unit, roughly 50% of the proteins will have decayed.
After the next time unit, roughly 50% of the proteins that survived the previous time step will have themselves decayed. Every time unit, the number of proteins will roughly half. This behaviour can be seen in the following Figure, where we start with a concentration of 100 proteins (note that we’ve moved to concentrations (i.e. proteins per unit volume rather than actual numbers – more on this later):
This is a common shape of function in the biological world (spontaneous decay is commonly assumed), and corresponds to the following di↵erential equation:
P ˙
t= P
twhere P
tmeans “the value of P at time t” (remember P
0?) and is a parameter (in this case it is equal to ln(0.5), but the particular value doesn’t matter.)
Hopefully this equation makes some intuitive sense: ˙ P
tis always negative (the concentration of proteins is always decreasing ) and it decreases faster the more P there is (the more times we toss the coin, the
5
• Temporal logic.
• Petri nets.
• Process algebras.
2.3 Di↵erential equations
The examples given above use equations that tell us directly how much P there is at some time t (in the first example, it is always 3, in the second it is always 2t). Unfortunately, useful modelling of Biological systems is rarely so easy. In particular, it often makes more sense in biological systems to define how the quantities of di↵erent species are changing, rather than the quantities themselves. This is typically done by defining and solving di↵erential equations.
We do not have time to go into much detail on di↵erential equations here but I encourage you to read more. There are many introductory textbooks that cover these topics.
A di↵erential equation looks something like this:
dP
dt = P
The left hand side is read as “the rate of change of P with respect to time (t)” (although it looks like a division, it isn’t!) and the right hand side defines what this rate of change is. Some people also use ˙ P as a shorthand for dP dt .
The di↵erential equations for our two examples above are:
P ˙ = dP
dt = 0 and P ˙ = dP
dt = 2,
respectively. In the first example, P never changes (it is always 3) and so its rate of change is 0. In the second its rate of change is 2 (it increases by 2 units for every unit of time).
If we try and re-create the plots above using just the rate of change information we come across a prob- lem. In particular, ˙ P = 0 just tells us that P doesn’t change: we also need to know the starting value, P 0 = 3. Similarly, in the second example we need to know that P 0 = 0 to re-create the plot. These values are known as the initial conditions.
Let’s now build a model of a more realistic but still simple biological example.
2.4 Example: protein decay
Consider a species of proteins (P ) that ‘live’ for a while and then spontaneously decay. The proteins have a probability of 0.5 of decaying in a particular unit of time. In other words, for each currently suriving protein, we can toss a coin at the passing of each time unit and if the coin shows ‘head’, the protein decays. This suggests that after one time unit, roughly 50% of the proteins will have decayed.
After the next time unit, roughly 50% of the proteins that survived the previous time step will have themselves decayed. Every time unit, the number of proteins will roughly half. This behaviour can be seen in the following Figure, where we start with a concentration of 100 proteins (note that we’ve moved to concentrations (i.e. proteins per unit volume rather than actual numbers – more on this later):
This is a common shape of function in the biological world (spontaneous decay is commonly assumed), and corresponds to the following di↵erential equation:
P ˙ t = P t
where P t means “the value of P at time t” (remember P 0 ?) and is a parameter (in this case it is equal to ln(0.5), but the particular value doesn’t matter.)
Hopefully this equation makes some intuitive sense: ˙ P t is always negative (the concentration of proteins is always decreasing ) and it decreases faster the more P there is (the more times we toss the coin, the
5
Shorthand
The rate of change of P with respect to t
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Ordinary Differential Equations
Typically:
[the rate of change of P w.r.t t is a function of P, t, and other things]
[the rate of change of P (at time t) w.r.t t is a function of P (at time t), t, and other things (at time t)]
3 Deterministic Modelling
The protein decay example in the previous section is an example of building a Deterministic model.
The model is deterministic because when we solve the di↵erential equation, we will always get the same result. This is a very strong assumption – biological systems aren’t so well-behaved, but perhaps it’s still a useful model?
In building this model, we also slipped in a few other assumptions:
• Deterministic. We have assumed that the system always behaves in the same way.
• It makes sense to measure proteins as concentrations. Implies that there are large quantities of the species.
• There are no external influences. The system is not influenced by anything else.
• That the rate of decrease of protein concentration is proportional to protein concen- tration.
We will discuss the limitations of these assumption in class, and later, when we consider stochastic sim- ulations.
3.1 From reactions to equations
Our model will often be described by a set of reactions. For example our decay example is described by the following reaction:
P ! 0 Here’s a more complex example:
A + B k!1 A + B + B A k!2 0
i.e., an A combined with a B creates an additional B in the first reaction. In the second, A decays. The rates of the two reactions are k1 and k2. These are analogous to in the previous example. To describe this system, we need two di↵erential equations. One for A and one for B. A is involved in both reactions, so it stands to reason that both could potentially change A:
A˙t = change caused by reaction 1 + change caused by reaction 2.
Reaction 2 looks like our protein decay example, so we’ll use the same model we used there:
A˙t = change caused by reaction 1 k2At.
Inspecting reaction 1, how does A change? There is one molecule of A on each side, so there is, in fact, no change in A:
A˙t = 0 At = k2At
Moving to B, it is only involved in reaction 1, and when reaction 1 happens, a B is produced. We now have to make an additional modelling assumption – how do the quantities of A and B a↵ect the rate at which this reaction happens? This most common assumption is mass action, where we assume that the rate at which the reaction occurs is proportional to A ⇥ B (mass action is actually a bit more involved than this, but it will do for our purposes). In other words, assuming mass action gives us the following di↵erential equation for B:
B˙t = k1AB Our system of ODEs is therefore given by:
A˙t = k2At B˙t = k1AtBt
8
the amount that B increases at some time t depends on how much A is in the system at time t…
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Roadmap
•
Given a set of reactions•
we can define a set of differential equations•
which we can solve (on a computer)•
to tell us how much of each thing there is at any time t.•
There will be one equation for each species (e.g. protein) of interest.[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
What is P?
• Our variables (e.g. P,A,B etc) are
concentrations of things we want to model.
• Mathematically we treat them as
continuous values - each one can take any value.
•
Q: implications of this? Assumptions?•
[they should also be +ve][email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Example: protein decay
• P = concentration of protein.
Q: where does that come from?
Q: what else do we need to know?
Q: what does changing lambda do?
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Protein decay
0 2 4 6 8 10
0 20 40 60 80 100
t
P
more ‘heads’ we get). tells us how quickly P drops. The higher , the quicker the decrease. For example, the following plot shows the same curve for a range of values – the higest curve has = 0.1 and the lowest has = 2:
0 2 4 6 8 10
0 20 40 60 80 100
t
P
Note that is related to the probability of the coin showing heads, but isn’t exactly the same.
Now, given the di↵erential equation:
P˙t = Pt
and an initial value, P0 = 100, how do we actually get the plot above? We need the values of Pt and not the change in Pt. To translate from the latter to the former, we have to solve the di↵erential equation. A very small proportion of di↵erential equations that you will meet within Systems Biology can be solved analytically. That is, it is possible to write out an equation for Pt. This is one of those examples, and:
Pt = P0exp( t).
In general, we won’t be so lucky and will have to resort to a numerical solver within a computer program of which there are many available. The Java code provided here is an example of the most simple method for numerically solving di↵erential equations [more details in class; non-examinable]. It’s important to note that this solver is very bad and should only be used for examples, and not any serious modellinga.
6
This particular differential equation can be solved analytically, the result, for an initial concentration of 100 can be seen below:
0 2 4 6 8 10
0 20 40 60 80 100
t
P
more ‘heads’ we get). tells us how quickly P drops. The higher , the quicker the decrease. For example, the following plot shows the same curve for a range of values – the higest curve has = 0.1 and the lowest has = 2:
0 2 4 6 8 10
0 20 40 60 80 100
t
P
Note that is related to the probability of the coin showing heads, but isn’t exactly the same.
Now, given the di↵erential equation:
P˙t = Pt
and an initial value, P0 = 100, how do we actually get the plot above? We need the values of Pt and not the change in Pt. To translate from the latter to the former, we have to solve the di↵erential equation. A very small proportion of di↵erential equations that you will meet within Systems Biology can be solved analytically. That is, it is possible to write out an equation for Pt. This is one of those examples, and:
Pt = P0 exp( t).
In general, we won’t be so lucky and will have to resort to a numerical solver within a computer program of which there are many available. The Java code provided here is an example of the most simple method for numerically solving di↵erential equations [more details in class; non-examinable]. It’s important to note that this solver is very bad and should only be used for examples, and not any serious modellinga.
No useful models can be solved analytically. So we have to resort to 6
numerically solving on a computer.
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
ODE Solvers
•
Many packages exist.•
Basic mathematical idea is quite simple.•
I’ve provided a [very] simple solver written in Java.•
Do not use for anything ‘real’!!Using our solver, the results can be seen in the figure below. To generate this output, use:
>java DeterministicSolver decay 10 1e-2 1 out.txt
(see Appendices for explanation) and plot the results (out.txt) in your favourite plotting program.
7
Simulator output
>java DeterministicSolver decay 20 1e-3 1 out.txt
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
More complex example
To fully explain the Gillespie algorithm, we need a system with more than one reaction. Consider the following set of reactions:
A
k!
1B B
k!
20
Using A
0= 100, B
0= 1, k
1= 0.1, k
2= 0.1, an example output from the Gillespie simulator is shown below. To run this, do:
>java GillespieSolver simple 100 out.txt 1
In this example, we have N = 2 species and M = 2 reactions. We will use X
nto denote the current count of the nth species. We will also use k
mto denote the rate of the mth reaction. Assume that we are currently at time t. The Gillespie algorithm does the following:
• For each reaction, m, compute a
m= k
mh
m. h
mis the number of reactant combinations, described below.
• Compute a
0= P
m
a
m• Generate two uniform random variates, u
1and u
2(e.g. Math.random() in Java)
18
Q: can you predict what will happen if you change k1 or k2?
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
From reactions to equations
3 Deterministic Modelling
The protein decay example in the previous section is an example of building a Deterministic model.
The model is deterministic because when we solve the di↵erential equation, we will always get the same result. This is a very strong assumption – biological systems aren’t so well-behaved, but perhaps it’s still a useful model?
In building this model, we also slipped in a few other assumptions:
• Deterministic. We have assumed that the system always behaves in the same way.
• It makes sense to measure proteins as concentrations. Implies that there are large quantities of the species.
• There are no external influences. The system is not influenced by anything else.
• That the rate of decrease of protein concentration is proportional to protein concen- tration.
We will discuss the limitations of these assumption in class, and later, when we consider stochastic sim- ulations.
3.1 From reactions to equations
Our model will often be described by a set of reactions. For example our decay example is described by the following reaction:
P ! 0 Here’s a more complex example:
A + B k!1 A + B + B A k!2 0
i.e., an A combined with a B creates an additional B in the first reaction. In the second, A decays. The rates of the two reactions are k1 and k2. These are analogous to in the previous example. To describe this system, we need two di↵erential equations. One for A and one for B. A is involved in both reactions, so it stands to reason that both could potentially change A:
A˙t = change caused by reaction 1 + change caused by reaction 2.
Reaction 2 looks like our protein decay example, so we’ll use the same model we used there:
A˙t = change caused by reaction 1 k2At.
Inspecting reaction 1, how does A change? There is one molecule of A on each side, so there is, in fact, no change in A:
A˙t = 0 At = k2At
Moving to B, it is only involved in reaction 1, and when reaction 1 happens, a B is produced. We now have to make an additional modelling assumption – how do the quantities of A and B a↵ect the rate at which this reaction happens? This most common assumption is mass action, where we assume that the rate at which the reaction occurs is proportional to A ⇥ B (mass action is actually a bit more involved than this, but it will do for our purposes). In other words, assuming mass action gives us the following di↵erential equation for B:
B˙t = k1AB Our system of ODEs is therefore given by:
A˙t = k2At B˙t = k1AtBt
8
Q: how do the reactions change the species?
Q: how fast do the reactions happen?
Kinetic models define how fast reactions happen.
e.g. mass action, Michaelis-Menten A model inside a model!
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Mass Action
3 Deterministic Modelling
The protein decay example in the previous section is an example of building a Deterministic model.
The model is deterministic because when we solve the di↵erential equation, we will always get the same result. This is a very strong assumption – biological systems aren’t so well-behaved, but perhaps it’s still a useful model?
In building this model, we also slipped in a few other assumptions:
• Deterministic. We have assumed that the system always behaves in the same way.
• It makes sense to measure proteins as concentrations. Implies that there are large quantities of the species.
• There are no external influences. The system is not influenced by anything else.
• That the rate of decrease of protein concentration is proportional to protein concen- tration.
We will discuss the limitations of these assumption in class, and later, when we consider stochastic sim- ulations.
3.1 From reactions to equations
Our model will often be described by a set of reactions. For example our decay example is described by the following reaction:
P ! 0 Here’s a more complex example:
A + B k!1 A + B + B A k!2 0
i.e., an A combined with a B creates an additional B in the first reaction. In the second, A decays. The rates of the two reactions are k1 and k2. These are analogous to in the previous example. To describe this system, we need two di↵erential equations. One for A and one for B. A is involved in both reactions, so it stands to reason that both could potentially change A:
A˙t = change caused by reaction 1 + change caused by reaction 2.
Reaction 2 looks like our protein decay example, so we’ll use the same model we used there:
A˙t = change caused by reaction 1 k2At.
Inspecting reaction 1, how does A change? There is one molecule of A on each side, so there is, in fact, no change in A:
A˙t = 0 At = k2At
Moving to B, it is only involved in reaction 1, and when reaction 1 happens, a B is produced. We now have to make an additional modelling assumption – how do the quantities of A and B a↵ect the rate at which this reaction happens? This most common assumption is mass action, where we assume that the rate at which the reaction occurs is proportional to A ⇥ B (mass action is actually a bit more involved than this, but it will do for our purposes). In other words, assuming mass action gives us the following di↵erential equation for B:
B˙t = k1AB Our system of ODEs is therefore given by:
A˙t = k2At B˙t = k1AtBt
8
3 Deterministic Modelling
The protein decay example in the previous section is an example of building a Deterministic model.
The model is deterministic because when we solve the di ↵erential equation, we will always get the same result. This is a very strong assumption – biological systems aren’t so well-behaved, but perhaps it’s still a useful model?
In building this model, we also slipped in a few other assumptions:
• Deterministic. We have assumed that the system always behaves in the same way.
• It makes sense to measure proteins as concentrations. Implies that there are large quantities of the species.
• There are no external influences. The system is not influenced by anything else.
• That the rate of decrease of protein concentration is proportional to protein concen- tration.
We will discuss the limitations of these assumption in class, and later, when we consider stochastic sim- ulations.
3.1 From reactions to equations
Our model will often be described by a set of reactions. For example our decay example is described by the following reaction:
P ! 0 Here’s a more complex example:
A + B
k!
1A + B + B A
k!
20
i.e., an A combined with a B creates an additional B in the first reaction. In the second, A decays. The rates of the two reactions are k
1and k
2. These are analogous to in the previous example. To describe this system, we need two di↵erential equations. One for A and one for B . A is involved in both reactions, so it stands to reason that both could potentially change A:
A ˙
t= change caused by reaction 1 + change caused by reaction 2.
Reaction 2 looks like our protein decay example, so we’ll use the same model we used there:
A ˙
t= change caused by reaction 1 k
2A
t.
Inspecting reaction 1, how does A change? There is one molecule of A on each side, so there is, in fact, no change in A:
A ˙
t= 0 A
t= k
2A
tMoving to B , it is only involved in reaction 1, and when reaction 1 happens, a B is produced. We now have to make an additional modelling assumption – how do the quantities of A and B a↵ect the rate at which this reaction happens? This most common assumption is mass action, where we assume that the rate at which the reaction occurs is proportional to A ⇥ B (mass action is actually a bit more involved than this, but it will do for our purposes). In other words, assuming mass action gives us the following di↵erential equation for B :
B ˙
t= k
1AB Our system of ODEs is therefore given by:
A ˙
t= k
2A
tB ˙
t= k
1A
tB
t8
Reaction 1 leaves A unchanged. Reaction 2 decreases A.
Reaction 1 increases B. Reaction 2 leaves B unchanged.
Mass Action: rate of reaction is proportional to the product of the
‘masses’ of the reactants.
Q: what are the main assumptions under- pinning mass action?
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Mass Action Example
To complete our definition, we need initial concentrations (A = 100, B = 1) and rates k1 = 0.01, k2 = 0.5.
Running this model in our simulator gives the output shown in the Figure below. To produce this output, use:
>java DeterministicSolver maeg 100 1e-3 1 out.txt
Exercise: extend this model to also incorporate decay of B, with rate k3 = 0.01.
3.2 Mass Action Example: Lotka-Voltera Model
Consider the following system of reactions:
A + B k!1 A + B + B B + C k!2 C + C C k!3 0
Taking the species in turn, and assuming mass action, we can derive a set of di↵erential equations:
A : A doesn’t change in any reaction, therefore:
A˙t = 0
9
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Class exercise
• What are the mass action ODEs for the following system of reactions:
To complete our definition, we need initial concentrations (A = 100, B = 1) and rates k
1= 0.01, k
2= 0.5.
Running this model in our simulator gives the output shown in the Figure below. To produce this output, use:
>java DeterministicSolver maeg 100 1e-3 1 out.txt
Exercise: extend this model to also incorporate decay of B , with rate k
3= 0.01.
3.2 Mass Action Example: Lotka-Voltera Model
Consider the following system of reactions:
A + B
k!
1A + B + B B + C
k!
2C + C C
k!
30
Taking the species in turn, and assuming mass action, we can derive a set of di↵erential equations:
A : A doesn’t change in any reaction, therefore:
A ˙
t= 0
9
Tip: one equation per species, not per reaction…
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Run it
B : B increases in reaction 1 (with rate k1AtBt) and decreases in reaction 2 (with rate k2BtCt).
Therefore:
B˙t = k1AtBt k2BtCt
C : C increases in reaction 2 (with rate k2BtCt) and decreases in reaction 3 (with rate k3Ct). Therefore:
C˙t = k2BtCt k3Ct
Using the initial conditions A0 = 1, B0 = 50, C0 = 50 and the rates k1 = 0.25, k2 = 0.0025, k3 = 0.125, we obtain the following system behaviour when we solve the di↵erential equations. To run this, use:
>java DeterministicSolver lotka 100 1e-4 1 out.txt
Notice that A remains constant (it has to: A˙ = 0), whilst the other two species oscillate. It’s quite complex behaviour from what is a very simple set of reactions.
Exercise: vary the initial conditions and rates and observe the changes in behaviour.
10
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Other kinetic models
3.3 Other kinetic models
In the previous example, we used the mass action kinetic model. Many others are used in biological models. A popular one is the Michaelis-Menten kinetic model. For example, for the following reaction:
A ! A + B
the di↵erential equation governing B with mass action would be:
B ˙
t= kA
tWith Michaelis-Menten kinetics, it would be:
B ˙
t= k
1A
tk
2+ A
twhich requires two parameters: k
1and k
2. In the example project mmexample, the following system of reactions is encoded:
A
k!
1A + B A
k2,k!
3A + C A
k!
40
where reactions 1 and 3 use mass action kinetics, and reaction 2 Michaelis-Menten. Using A
0= 100, B
0= C
0= 0 and k
1= k
2= 0.1, k
3= 1, k
4= 0.1, we obtain the system behaviour shown in the next plot. To obtain these results, use:
>java DeterministicSolver mmexample 100 1e-3 1 out.txt
Exercise: Experiment with varying the values of k
2and k
3. What is their role?
11
3.3 Other kinetic models
In the previous example, we used the mass action kinetic model. Many others are used in biological models. A popular one is the Michaelis-Menten kinetic model. For example, for the following reaction:
A ! A + B
the di↵erential equation governing B with mass action would be:
B ˙
t= kA
tWith Michaelis-Menten kinetics, it would be:
B ˙
t= k
1A
tk
2+ A
twhich requires two parameters: k
1and k
2. In the example project mmexample, the following system of reactions is encoded:
A
k!
1A + B A
k2,k!
3A + C
A
k!
40
where reactions 1 and 3 use mass action kinetics, and reaction 2 Michaelis-Menten. Using A
0= 100, B
0= C
0= 0 and k
1= k
2= 0.1, k
3= 1, k
4= 0.1, we obtain the system behaviour shown in the next plot. To obtain these results, use:
>java DeterministicSolver mmexample 100 1e-3 1 out.txt
Exercise: Experiment with varying the values of k
2and k
3. What is their role?
11
3.3 Other kinetic models
In the previous example, we used the mass action kinetic model. Many others are used in biological models. A popular one is the Michaelis-Menten kinetic model. For example, for the following reaction:
A ! A + B
the di↵erential equation governing B with mass action would be:
B˙t = kAt With Michaelis-Menten kinetics, it would be:
B˙t = k1At k2 + At
which requires two parameters: k1 and k2. In the example project mmexample, the following system of reactions is encoded:
A k!1 A + B A k2,k!3 A + C A k!4 0
where reactions 1 and 3 use mass action kinetics, and reaction 2 Michaelis-Menten. Using A0 = 100, B0 = C0 = 0 and k1 = k2 = 0.1, k3 = 1, k4 = 0.1, we obtain the system behaviour shown in the next plot. To obtain these results, use:
>java DeterministicSolver mmexample 100 1e-3 1 out.txt
Exercise: Experiment with varying the values of k2 and k3. What is their role?
11
Mass Action
Michaelis-Menten
12
Try other parameter values in the lab
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Other kinetic models II
• Michaelis-Menten:
• Q: can you work out what real
phenomena the parameters correspond
to?
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Summary
•
We have (briefly) covered some of theideas behind deterministic modelling with ordinary differential equations.
•
In the notes, you’ll see some exercises to try in the lab (now).•
After the lab, we will look over these exercises and possible move ontodiscussing data and parameter/model inference
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Data
Note: what follows is another whole lecture…we will not cover it all!
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Dr. Simon Rogers, School of Computing Science, University of Glasgow
To simulate, we need a model
• All simulations start with:
• A model
• Some parameters
• Some initial conditions:
• What if we start with:
• A model and no parameters?
• No model?
A + B 2B + A B + C 2C
C ⇥
Initial Value Problem
An initial value problem is an ODE (or a system of ODEs) and values of variables at t = 0.
>8
><
>>
:
A˙ = 0
B˙ = k1 · A · B k2 · B · C C˙ = k2 · B · C k3 · C
A|t=0 = 1 B|t=0 = 50 C|t=0 = 50 k1 = 0.25 k2 = 0.0025 k3 = 0.125
Solving an initial value problem means finding the function of dependent variables that satisfies the initial condition and behaves by the law defined with the differential
equation.
Dr Vlad Vyshemirsky (University of Glasgow) Systems Biology 15 / 37
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Dr. Simon Rogers, School of Computing Science, University of Glasgow
A model and no parameters
• Given a system, we can hypothesise a model.
• e.g. I can hypothesise that proteins decay without necessarily knowing how fast this will happen
(encoded as a parameter).
• Without parameters we can’t simulate.
• But we can measure data.
• It would be useful if we could ‘learn’ parameters from measured data.
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Dr. Simon Rogers, School of Computing Science, University of Glasgow
A model and no parameters
• Recall our model of protein decay and the initial condition:
• We measure some data in the lab:
P˙ = P P0 = 1
What is ?
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Dr. Simon Rogers, School of Computing Science, University of Glasgow
Points and distributions
• Two broad strategies:
• Find the ‘best’ value - point estimates
• Find a distribution - Bayesian inference
What is ?
= 0.5
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Dr. Simon Rogers, School of Computing Science, University of Glasgow
Point estimates
• Finding the ‘best’ value:
• What does ‘best’ mean?
What is ?
Each value produces a curve.
Which curve is best?
This curve best represents the experimental evidence.
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Dr. Simon Rogers, School of Computing Science, University of Glasgow
Measures of ‘bestness’
• To automate the process of finding the best
curve, we need a formal definition of ‘bestness’.
• Non-statistical: e.g. squared error, absolute error (lower is better)
• Statistical: e.g. likelihood (higher is better)
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Dr. Simon Rogers, School of Computing Science, University of Glasgow
Squared error
• Compute the squared difference between the model and
experimental data at the times when the data was measured.
• Why squared?
• Add the values for all data points together.
For a particular
= L( )
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Dr. Simon Rogers, School of Computing Science, University of Glasgow
Squared error
• Mathematically speaking:
• Find the value that minimises L
• How?
• Guess lots of times and keep the best.
arg min L( )
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Dr. Simon Rogers, School of Computing Science, University of Glasgow
Squared error
• First 10 guesses (best is highlighted)
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Dr. Simon Rogers, School of Computing Science, University of Glasgow
Squared error
• First 50 guesses (best is highlighted)
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Dr. Simon Rogers, School of Computing Science, University of Glasgow
Squared error
• First 100 guesses (best is highlighted)
= 0.2933
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Dr. Simon Rogers, School of Computing Science, University of Glasgow
Alternatively
• Guess a value, compute L.
• Change it a bit, compute new L.
• If new L is better, keep new value. Otherwise throw it away.
= 0.2997
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Dr. Simon Rogers, School of Computing Science, University of Glasgow
Help is at hand
• Many algorithms exist for optimisation.
• Most are smarter than these two.
• Different ways of updating the parameters.
• Necessary when problem gets more complex.
• e.g. Finding best values of many parameters at once.
• Using simbiology in Matlab gives:
• This is the ‘correct’ value.
= 0.3
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Dr. Simon Rogers, School of Computing Science, University of Glasgow
Complexity
• For any optimisation routine:
• Each evaluation of L requires one model simulation.
• For models with many parameters, we may require millions of evaluations!
• Very slow.
• Active research area.
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Dr. Simon Rogers, School of Computing Science, University of Glasgow
Is there a correct value?
• Imagine doing the biological experiment twice
What is ?
• Not the same - why?
• Stochasticity, contamination, measurement error
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Dr. Simon Rogers, School of Computing Science, University of Glasgow
Distributions, not points
• Bayesian parameter estimation:
• Some (me included) would argue that point estimates are misleading.
• They represent a degree of certainty that is unjustified:
• Repeat the experiment, value changes.
• Alternative: Produce a distribution over values.
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Dr. Simon Rogers, School of Computing Science, University of Glasgow
Do the experiment lots of times...
• Repeat the experiment many times.
• Find the best value for each experiment.
• Histogram the results:
• More informative:
• Tells us something about the uncertainty caused by things we can’t measure
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Dr. Simon Rogers, School of Computing Science, University of Glasgow
Bayesian Parameter Estimation
• We can get this without repeating the experiment.
• Instead, we make assumptions about the things we can’t measure.
• i.e. all things we can’t measure, added together look like random variables of some particular type.
• e.g. deviations from model (noise) are Gaussian
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Dr. Simon Rogers, School of Computing Science, University of Glasgow
Bayesian Parameter Estimation
• Unfortunately (for you) it’s hard - not supported by SimBiology.
• Fortunately (for me) it’s hard - requires research.
• Other benefits:
• It’s possible to incorporate prior knowledge.
• It’s possible to rigorously compare how well different
models explain data: comparing hypotheses (science!)
• Negatives?
• Even more evaluations of L: even slower!
• Normally requires expert tuning.
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Dr. Simon Rogers, School of Computing Science, University of Glasgow
But ODE models are bad....
• What about learning parameters in stochastic models?
• What kind of data would we need?
• Single cell, molecule counts - very expensive.
• Microarray would be no good.
• Discrete data causes problems in standard algorithms (non-smooth).
• Even slower still.
• Active research area.
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Dr. Simon Rogers, School of Computing Science, University of Glasgow
No model and no parameters
• Comparing hypotheses:
• Given some data, can we choose:
Receptor
Bound Receptor
Ligand
MAPKKK MAPKKK*
MAPKK MAPKK*
MAPK MAPK*
Receptor
Bound Receptor
Ligand
MAPKKK MAPKKK*
MAPKK MAPKK*
MAPK MAPK*
V
Yes:
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Dr. Simon Rogers, School of Computing Science, University of Glasgow
No model and no parameters
• (Many) people have also tried to rely more heavily on data:
• Search for statistical (not necessarily biological) relationships between species:
Just data
Expand an existing model using data
[email protected] www.dcs.gla.ac.uk/~srogers @sdrogers
Dr. Simon Rogers, School of Computing Science, University of Glasgow
Summary
• If we have a model and no parameters, we can find them with data.
• Point estimates:
• Easy (to understand and do).
• Too certain.
• Bayesian inference:
• Hard(er) (to understand and do).
• Reflects uncertainty.
• Needs additional assumptions - noise.
• Can use data to compare models.
• Can use data to build models.