• Nenhum resultado encontrado

Building a knowledge base for automated modeling

N/A
N/A
Protected

Academic year: 2023

Share "Building a knowledge base for automated modeling"

Copied!
24
0
0

Texto

(1)

University of Ljubljana

Faculty of Civil and Geodetic Engineering Institute of Sanitary Engineering

Domain library construction for knowledge- based equation discovery:

An application in limnology

Nataša Atanasova, Ljupčo Todorovski, Boris Kompare, Sašo Džeroski

(2)

Overview

• Motivation for automated modelling

• Domain knowledge for modelling of population dynamics in lake ecosystems

• Formalization of the domain knowledge for Lagramge 2

• Application of Lagramge 2: Lake Bled

(3)

Motivation for automated modelling-

modelling procedures

SYSTEM TO BE MODELLED

MEASUREMENTS

conceptual model

of the system modellingresults deduced from expert

knowledge

statistical models, ANN...

expert knowledge

induced from measurements DEDUCTION

INDUCTION

decision trees regression trees rules induction equation discovery

model induced from measurements and background knowledge EQ. DISC. TOOL LAGRAMGE 2.0 black - box

transparent

DEDUCTION +

INDUCTION

(4)

LAGRAMGE 2: how it works?

• from task specification and knowledge library to grammar

• using the grammar for equation discovery with lagramge

Todorovski (2003)

(5)

Generalized scheme of population dynamics processes and variables

NUTRIENTS POOL

PRIMARY PRODUCERS HERBIVOROUS ZOOPLANKTON

DETRITUS TEMPERATURE

LIGHT

¸¸

ZOOPLANKTON PREDATORS

INCOMING NUTRIENTS

OUTFLOW

Respiration (3)

Mortality_PP (4) Growth_PP (3)

Excretion_A

Mortality_A (4)

Sedimentation

Sedimentation Release_D

Mortality_A

Feeds_on (2) Feeds_on IN EACH LAYER:

Layer connecting process: Rising and Diffusion Sedimentation

Simple Loss process defined for PP and Animals

(6)

Formalized population dynamics knowledge

an example for primary producer mass balance

dpp/dt

Feeds_on (Zoo,{pp},{T}) Mortality (pp, {T},{L},{N})

Growth(pp,{T},{L},{N})

type 1: Cfmax* f(T) * f(food) *pp*zoo type 2:Cgmax* f(T) * f(food) *zoo pp*mmax* f(T)*f(L)*f(N)

Processes

Functions

)

) (

(T T Tref f =Θ

1

) (

+

= Iopt I

opt

I e L I F

N K N N f

s +

= ) ( type 1:

type 2:

type 3: pp*mmax* f(T, L)*f(N) pp*mmax* f(T, Topt)*f(L)*f(N)

min

) min

( T T

T T T

f

ref

=

=

min

3 . 2 exp )

( T T

T T T

f

opt opt

I K L I F

sI+

= ) (

2 2

)

( K N

N N f

s+

=

)

* exp(

1 )

(N k N

f =

Mass balance

Sedimentation (pp, {T})

Respiration (pp, {T},{L},{N}) type 1:...

type 2:...

type 3:...

type 1:...

type 2:...

(7)

Growth rate influences and limitations

0 5 10 15 20 25 30

-0.2 0 0.2 0.4 0.6 0.8 1 1.2 1.4

temperature

growth rate

Linear temperature response

(T-Tmin)/(Tref-Tmin) T/Tref

0 5 10 15 20 25 30

0 0.5 1 1.5 2 2.5 3 3.5 4

temperature

growth rate

Exponential temperature response

theta

0 5 10 15 20 25 30

0 20 40 60 80 100 120 140

temperature

growth rate

Exponential temperature response theta at Tref=0 exp(k.*T), ASTER

0 5 10 15 20 25 30

-0.2 0 0.2 0.4 0.6 0.8 1 1.2

temperature

growth rate

Optimum temperature curves

(8)

0 0.5 1 1.5 2 2.5 0.1

0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

daily mean light gro

wateth r

Light response curves

saturation curve photoinhibition curve

0 0.5 1 1.5 2 2.5

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9

1 Coupled effects of light and temperature

daily mean light gro

wateth r

T= 5 T=10 T=15 T=20 T=25

0 2 4 6 8 10

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

nutrient concentration gro

wth r ate

Nutrient limitation curves

Monod Monod2 exp

(9)

Process formalisation in the library

the process Growth_PP

Bendoricchio 1994, Topt is a variable

) , , ( ) ( ) ( )

max(Tref f T f L f P N C µ

µ =

) , , ( ) ( ) ,

( )

max(Tref f T Topt f L f P N C

µ µ =

dt pp

dpp = µ ⋅

) , , ( ) , ( )

max(Tref f T L f P N C µ

µ =

Talbot et al., 1991: the Aster model, most models

Growth of a primary producer (pp) is influenced by temperature and limited by light and nutrients

) ( ) ( ) ( )

, ,

(P N C f P f N f C

f =

[

( ), ( ), ( )

]

min )

, ,

(P N C f P f N f C

f = pp primary producer concentration

µ growth rate [1/time]

f(T) temperature influence function f(L) light limitation function

f(N,P,C) nutrients limitation function

n

C f N f P C f

N P

f ( ) ( ) ( )

) , ,

( + +

=

(10)

Library

process class PP_growth (Primary_producer pp, Inorganics ns, Temperatures ts, Lights ls)

process class PP_growth_type_1() is PP_growth

expression pp*const(max_growth_rate, 0.01, 0.1, 4)*

Food_limitations(ns)*Temp_influences(ts)*

Light_limitations(ls)

process class PP_growth_type_2() is PP_growth

expression pp*const(max_growth_rate, 0.01, 0.1, 4)*

Food_limitations(ns)*Light_limitations(ls)*

product({t1,t2}, t1 in ts, Temp_influence_opt(t1,t2)) process class PP_growth_type_3() is PP_growth

expression pp*const(max_growth_rate, 0.01, 0.1, 4)*

Food_limitations(ns)*

product({t,l}, t in ts, l in ls, Light_temp(t, l))

(11)

Combining scheme (mass balances)

combining scheme Lake(Primary_producer pp) time_deriv(pp) =

+ sum({food, ts, ls}, true, PP_growth(pp, food, ts, ls)) - sum({ts}, true, Loss(pp, ts))

- sum({ts,ns}, true, Respiration_PP(pp, ts, ns)) - sum({ts,ns}, true, Mortality_PP(pp, ts, ns)) - sum({}, true, Outflow(pp))

- sum({}, true, Sedimentation(pp)) + sum({pp1}, true, Rising(pp,pp1))

- sum({a, food, ts}, pp in food, Feeds_on(a, food, ts))*pp

(12)

The knowledge base

• Supports:

– Food chain modelling in a lake – 0 dimensional models

– N box models i.e., supports modelling of stratified lakes

– Fixed internal nutrient levels in primary producers and animals

• Complexity of the models is defined by the

expert, i.e. number of state variables and

processes

(13)

Example of automated modeling

Case study: Lake Bled

(14)

Oscillatoria rubescens in Lake of Bled, November 1999. Photo by Mirko Kunšič Vol. = 25.7 * 106 m3 Area = 1.47 * 106 m2 Max. depth = 30 m

(western basin) Avg. depth = 17.5 m Ae = 0.98*106 m2 Aw= 0.49*106 m2

(15)

Inducing a food-chain model for Lake Bled

• Data:

– Environmental Agency of the RS

– monthly measurements of physical, chemical and biological data – Long term data set 1985 to 2002

– Each variable is measured at two locations in the lake (western and eastern basin)

– Samples are taken each two meters from the surface to the bottom

– i.e, we have parameter measurements in two water columns

(16)

Sampling points in the

western and eastern basin

ARSO: Monitoring kakovosti ...

(17)

Measured data (variables) in lake Bled used for model induction

variable name description Units Frequency

q_krivica Inflow to the lake m3/day daily

q_misca Inflow to the lake m3/day daily

q_radovna Inflow to the lake m3/day daily

q_jezernica Outflow m3/day daily

q_natega Outflow m3/day daily

ortp_krivica, ortp_misca, ortp_radovna Nutrient (orthophosphate) concentration in the inflows

mg/l monthly

temp Water temperature of the streams and

lake

°C monthly

light Light intensity J/(m2*day) monthly

ortp Nutrient (orthophosphate) concentration

in the lake

mg/l monthly

phyto Phytoplankton biomass concentration in

the lake

mg/l monthly

daph Zooplankton (daphnia hyalina) biomass

concentration in the lake

mg/l monthly

(18)

Inducing a model on real data

• Three d.e. model: ortp-phyto-daph

• Induced on eastern basin data, euphotic zone (upper 10 m), 1996

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5

4.1.1996 23.2.1996 13.4.1996 2.6.1996 22.7.1996 10.9.1996 30.10.1996 time

phytoplankton, daphnia hyalina

0 0.002 0.004 0.006 0.008 0.01 0.012 0.014 0.016 0.018

inorganic phosphorus

phytoplankton daphnia hyalina phosphorus

(19)

Expert knowledge

• General form of the equations

Phytoplankton Daphnia hyalina temperature

light

Loss

Respiration

PP_growth Feeds_on

inflow of nutrients

Settling

Dissolved inorganic phosphorus

outflow of nutrients

_growth const

- outflow -

inflow PP

dt

dortp =

Feeds_on settling

n respiratio growth

_

= PP dt

dphyto

loss on

Feeds dt

ddaph

= _

(20)

Task specification

variable Inorganic ortp_krivica variable Inorganic ortp_misca variable Inorganic ortp_radovna variable Flow q_krivica

variable Flow q_misca variable Flow q_radovna variable Flow q_jezernica variable Flow q_natega variable Inorganic ortp

variable Primary_producer phyto variable Animal daph

variable Temperature temp variable Light light

process Inflow(ortp, ortp_krivica, q_krivica) inflow1 process Inflow(ortp, ortp_misca, q_misca) inflow2 process Inflow(ortp, ortp_radovna, q_radovna) inflow3

process Outflow(ortp, q_jezernica) outflow1 process Outflow(ortp, q_natega) outflow2

process PP_growth(phyto, {ortp}, {temp}, {light}) p1 process Feeds_on(daph, {phyto}, {temp}) p5

process Loss(phyto, {temp}) p6

process Respiration_PP(phyto, {temp},{}) p9 process Loss(daph, {temp}) p8

(21)

Step by step discovery of three d.e.

Phosphorus equation:

Using the growth term in the phosphorus eq., discover the rest of the processes in the phytoplankton eq.

+ +

+

+

+

=

)

10 7 _ 10

7 ( _

10 ) 7 _ _

10 7 _ _

10 7 _ _

(

6 6

6 6

6

natega ORTP q

jezernica ORTP q

radovna radovna q

misca ORTP misca q

krivica ORTP krivica q

dt ORTP dortp

phyto phyto

phyto daph temp

phyto

phyto temp dt

dphyto

+

+

+

=

5 5

. 3 01 20

. 10 1

009 . 0

4 18

7 . 046 2

. 0

2 2

2

2 2

3 2

2

5

001 . 0 5

4 . 01 16 . 1 02

.

0 phyto daph daph

phyto phyto daph temp

dt ddaph

+

+

=

(22)

Model performance

Jan96 Apr96 Jul96 Oct96 Jan97

0.005 0.01 0.015

0.02 Dissolved inorganic phosphorus concentration model measured

Jan960 Apr96 Jul96 Oct96 Jan97

1 2 3 4

5 Phytoplankton concentration

Jan960 Apr96 Jul96 Oct96 Jan97

0.2 0.4 0.6

0.8 Daphnia concentration

Jan950 Apr95 Jul95 Oct95 Jan96

0.01 0.02

0.03 Dissolved inorganic phosphorus concentration model measured

Jan950 Apr95 Jul95 Oct95 Jan96

1 2

3 Phytoplankton concentration

model measured

Jan950 Apr95 Jul95 Oct95 Jan96

0.5

1 Daphnia concentration

model measured

(23)

Conclusions

• Domain knowledge can be successfuly introduced in the equation (model) discovery procedure

• Lake modelling library to support automated modelling of lakes with the Lagramge 2 machine learning tool was

constructed.

• Library language formalism supports a precise formulation of the expert knowledge

• Lake Bled: induction of a simple model for complex dynamics

• Model improvement: increase of model complexity,

(24)

Acknowledgements

• Ministry of the Environment, Spatial Planning and Energy; Environmental Agency of the

Republic of Slovenia

• Špela Rekar, Environmental Agency of the

Republic of Slovenia

Referências

Documentos relacionados

Vdrzea lakes: in continuous or interrupted liaison with the white waters of the Amazon; they generally lie in the inundation plain of the rivers (lake Maica, Lake Redondo)3.