
S. V. Chistyakov, F. F. Nikitin, On regular differential games of pursuit with fixed duration, Vestnik S.-Petersburg Univ. Ser. 10. Prikl. Mat. Inform. Prots. Upr., 2014, Issue 4, 17–24


UDC 518.9 Vestnik of St. Petersburg University. Serie 10. 2014. Issue 4

S. V. Chistyakov, F. F. Nikitin

ON REGULAR DIFFERENTIAL GAMES OF PURSUIT WITH FIXED DURATION

St. Petersburg State University, 7/9, Universitetskaya embankment, St. Petersburg, 199034, Russian Federation

In any differential game the programmed maxmin is a guaranteed payoff of the first player. For a long time, owing to its simple geometric interpretation and to the difficulty of implementing Isaacs' method, the programmed maxmin was studied extensively. Researchers sought conditions under which the programmed maxmin is the value of the differential game. These conditions are called regularity conditions, and differential games satisfying them are called regular games. The programmed iteration method can be regarded as a non-smooth version of the dynamic programming method. Initially it was aimed at studying non-regular differential games. Later it became clear that its scope of application is wider. For example, the theory of differential games can be built on the results of the programmed iteration method. One more example is provided in this article. Based on the results of the programmed iteration method, the minimax theorem for convex-concave functions and the theorem on a measurable selector of a multi-valued map, we provide a simple proof of a well-known regularity condition for a linear differential game of approach with fixed duration. Bibliogr. 14.

Keywords: differential games, zero-sum games, regular games, programmed iteration method.

Chistyakov Sergei Vladimirovich – doctor of physical and mathematical sciences, professor; e-mail: svch50@mail.ru

Nikitin Fedor Fedorovich – candidate of physical and mathematical sciences, assistant; e-mail: fedor.nikitin@gmail.com

1. Introduction. In a zero-sum differential game the programmed maxmin [1] is the guaranteed payoff of the maximizing player. For a long time, owing to the simple geometric interpretation of the programmed maxmin function [2] and to the difficulty of applying Isaacs' method [3], research in differential games focused on finding conditions under which the programmed maxmin is also the guaranteed payoff of the second player. Such conditions are called regularity conditions, and games that possess the regularity property are called regular games. In other words, in a regular differential game the value of the game is equal to the programmed maxmin.

The method of programmed iterations [4–7], which can be regarded as a non-smooth version of the dynamic programming method, was originally developed for non-regular differential games. In these games the programmed maxmin does not coincide with the value of the game. Later it turned out that the scope of application of the programmed iteration method is wider [8–12]. In particular, the theory of differential games can be developed on the basis of its results [12]. In this paper we show how a known regularity condition for linear differential games with fixed duration can be derived from the programmed iteration method, the minimax theorem for convex-concave functions [13] and the theorem on measurable selectors of multi-valued maps [14].

Consider the game $\Gamma_T(t_0, x_0, y_0)$, in which the pursuer $P$ (in the space $\{x\} = \mathbb{R}^n$) and the evader $E$ (in the space $\{y\} = \mathbb{R}^m$) start from the positions $x(t_0) = x_0$ and $y(t_0) = y_0$ and move according to the systems of linear differential equations

$$\frac{dx}{dt} = A(t)x + B(t)u + f(t) \tag{1}$$

and

$$\frac{dy}{dt} = C(t)y + D(t)v + g(t). \tag{2}$$

Here $A(\cdot)$, $B(\cdot)$, $C(\cdot)$, $D(\cdot)$ are continuous matrix functions of the corresponding dimensions, $f(\cdot)$, $g(\cdot)$ are bounded measurable vector functions, and $u$, $v$ are the control vectors of the players, chosen from the sets $u \in P \in \operatorname{Comp} \mathbb{R}^p$, $v \in Q \in \operatorname{Comp} \mathbb{R}^q$. The payoff in the game is

$$H\big(x(T), y(T)\big) = \Big(\sum_{i=1}^{k} \big(x_i(T) - y_i(T)\big)^2\Big)^{1/2},$$

where $x(T) = \big(x_1(T), \dots, x_n(T)\big)$, $y(T) = \big(y_1(T), \dots, y_m(T)\big)$, $k \le \min\{n, m\}$.

Both players make their control decisions with full information about the game, i.e. they know the systems of differential equations, the initial conditions and the position $(t, x(t), y(t))$ at every moment $t \in [t_0, T]$.
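The setup above can be illustrated with a minimal numerical sketch. It integrates equations of the form (1) and (2) with an explicit Euler scheme and evaluates the terminal payoff $H$. The helper names (`mat_vec`, `euler`, `payoff`), the Euler scheme, and the concrete data (simple motion in the plane, constant controls) are assumptions chosen for illustration and are not taken from the paper.

```python
import math

def mat_vec(M, z):
    """Product of a matrix (list of rows) and a vector (list)."""
    return [sum(row[j] * z[j] for j in range(len(z))) for row in M]

def euler(A, B, f, z0, control, t0, T, steps=2000):
    """Explicit Euler scheme for dz/dt = A(t) z + B(t) w(t) + f(t) on [t0, T]."""
    dt = (T - t0) / steps
    z, t = list(z0), t0
    for _ in range(steps):
        Az, Bw, ft = mat_vec(A(t), z), mat_vec(B(t), control(t)), f(t)
        z = [z[i] + dt * (Az[i] + Bw[i] + ft[i]) for i in range(len(z))]
        t += dt
    return z

def payoff(x_T, y_T, k):
    """Terminal payoff H: Euclidean distance over the first k coordinates."""
    return math.sqrt(sum((x_T[i] - y_T[i]) ** 2 for i in range(k)))

# Hypothetical data: simple motion in the plane (A = C = 0, B = D = identity, f = g = 0).
ZERO = [[0.0, 0.0], [0.0, 0.0]]
EYE = [[1.0, 0.0], [0.0, 1.0]]
no_drift = lambda t: [0.0, 0.0]

# Pursuer moves with constant control (1, 0); evader stays at (3, 4).
x_T = euler(lambda t: ZERO, lambda t: EYE, no_drift, [0.0, 0.0], lambda t: [1.0, 0.0], 0.0, 1.0)
y_T = euler(lambda t: ZERO, lambda t: EYE, no_drift, [3.0, 4.0], lambda t: [0.0, 0.0], 0.0, 1.0)
print(payoff(x_T, y_T, 1), payoff(x_T, y_T, 2))
```

With $k = 1$ only the first coordinates enter the payoff, which is why the two printed values differ.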


We consider the game in the class of positional strategies [1]. This assumption is not essential, however, and the game could be formulated in other classes of strategies as well [2].

2. Regularity criterion. Let $C^t(t_*, x_*)$ (respectively $C^t(t_*, y_*)$) be the set of all positions reachable at the moment $t$ from the initial state $x_* = x(t_*)$ ($y_* = y(t_*)$) by means of measurable controls $u(\cdot)$ ($v(\cdot)$) with values $u(\tau) \in P$ ($v(\tau) \in Q$) for almost all $\tau \in [t_*, t]$. These sets are called reachability sets, and they are compact.

Let $D \subset (-\infty, T] \times \mathbb{R}^n \times \mathbb{R}^m$ be a set that, together with every position $(t_*, x_*, y_*)$, also contains the set

$$D(t_*, x_*, y_*) = \big\{(t, x, y) \in [t_*, T] \times \mathbb{R}^n \times \mathbb{R}^m \;\big|\; x \in C^t(t_*, x_*),\ y \in C^t(t_*, y_*)\big\}.$$

Obviously the set $D = D(t_0, x_0, y_0)$ has this property. We embed the game $\Gamma_T(t_0, x_0, y_0)$ in the family of games

$$\Gamma_T(D) = \big\{\Gamma_T(t_*, x_*, y_*) \;\big|\; (t_*, x_*, y_*) \in D\big\},$$

whose elements are differential games that differ in the initial position. Every game $\Gamma_T(t_*, x_*, y_*)$, $(t_*, x_*, y_*) \in D$, has a value [1]. The function

$$w(\cdot) : (t_*, x_*, y_*) \mapsto w(t_*, x_*, y_*)$$

maps every position $(t_*, x_*, y_*) \in D$ to the value $w(t_*, x_*, y_*)$. This function is called the value function of the game $\Gamma_T(D)$. The game $\Gamma_T(D)$ is regular if the value function is equal to the programmed maxmin function

$$w^{(0)}(\cdot) : (t_*, x_*, y_*) \mapsto w^{(0)}(t_*, x_*, y_*) = \max_{y \in C^T(t_*, y_*)} \min_{x \in C^T(t_*, x_*)} H(x, y), \quad (t_*, x_*, y_*) \in D. \tag{3}$$

Define the operator $\Phi : C(D) \to C(D)$ so that for any function $w(\cdot) \in C(D)$ and any position $(t_*, x_*, y_*) \in D$ [6]

$$(\Phi w)(t_*, x_*, y_*) = \max_{t \in [t_*, T]} \max_{y \in C^t(t_*, y_*)} \min_{x \in C^t(t_*, x_*)} w(t, x, y), \tag{4}$$

where $(\Phi w)(t_*, x_*, y_*)$ is the value at the position $(t_*, x_*, y_*)$ of the image of the function $w(\cdot)$.

The following theorem is easily proved from the results of the method of programmed iterations.

Theorem 1. The game $\Gamma_T(D)$ is regular if and only if the programmed maxmin function is a fixed point of the operator $\Phi$.
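The fixed-point criterion of Theorem 1 can be checked numerically on a toy example. The sketch below is an illustration under stated assumptions, not the authors' construction: it takes scalar simple motion $dx/dt = u$, $|u| \le \alpha$, $dy/dt = v$, $|v| \le \beta$ with $k = n = m = 1$, for which the reachability sets are intervals and the programmed maxmin has the closed form $\max\{0, |x - y| + (\beta - \alpha)(T - t)\}$ (derived here for this toy case only). The function names and grid sizes are hypothetical. The operator (4) is evaluated by brute force on grids.

```python
def w0(t, x, y, alpha, beta, T):
    """Programmed maxmin for the scalar toy game (closed form assumed for simple motion)."""
    return max(0.0, abs(x - y) + (beta - alpha) * (T - t))

def linspace(lo, hi, n):
    """n equally spaced points on [lo, hi]; a single point if the interval is degenerate."""
    if n == 1 or hi == lo:
        return [lo]
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

def phi_w0(t_s, x_s, y_s, alpha, beta, T, nt=21, ny=41, nx=201):
    """Brute-force evaluation of the operator (4) applied to w0 at the position (t_s, x_s, y_s)."""
    best = float("-inf")
    for t in linspace(t_s, T, nt):
        rx, ry = alpha * (t - t_s), beta * (t - t_s)   # radii of the reachability intervals
        xs = linspace(x_s - rx, x_s + rx, nx)
        for y in linspace(y_s - ry, y_s + ry, ny):
            inner = min(w0(t, x, y, alpha, beta, T) for x in xs)
            best = max(best, inner)
    return best

# Compare the operator image with the function itself at one position.
lhs = phi_w0(0.0, 0.0, 3.0, alpha=2.0, beta=1.0, T=1.0)
rhs = w0(0.0, 0.0, 3.0, alpha=2.0, beta=1.0, T=1.0)
print(lhs, rhs)
```

In this toy case the two numbers agree up to rounding, which is consistent with the programmed maxmin being a fixed point of the operator; the grid evaluation is only a sanity check and proves nothing in general.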

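3. Sufficient condition for regularity. Let $(x)_k$ ($(y)_k$) denote the projection of the vector $x \in \mathbb{R}^n$ ($y \in \mathbb{R}^m$) onto the space $\mathbb{R}^k$. The function

$$H(x, y) = \Big(\sum_{i=1}^{k} (x_i - y_i)^2\Big)^{1/2}$$

can be expressed as

$$H(x, y) = \max_{l \in \mathbb{R}^k,\ \|l\| \le 1} \big\langle l, (x)_k - (y)_k \big\rangle,$$

where $\langle \cdot, \cdot \rangle$ and $\|\cdot\|$ are the scalar product and the Euclidean norm in $\mathbb{R}^k$, respectively. Using this expression in (3) and the minimax theorem for convex-concave functions [13], we get

$$\begin{aligned}
w^{(0)}(t_*, x_*, y_*) &= \max_{y \in C^T(t_*, y_*)} \min_{x \in C^T(t_*, x_*)} \max_{l \in \mathbb{R}^k,\ \|l\| \le 1} \big\langle l, (x)_k - (y)_k \big\rangle = \\
&= \max_{l \in \mathbb{R}^k,\ \|l\| \le 1} \max_{y \in C^T(t_*, y_*)} \min_{x \in C^T(t_*, x_*)} \big\langle l, (x)_k - (y)_k \big\rangle.
\end{aligned} \tag{5}$$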
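The representation of the Euclidean norm as a maximum of linear functionals over the unit ball, used above to rewrite $H$, can be sanity-checked numerically. The following sketch is illustrative only; the helper names and the test vector are hypothetical. It compares the value attained at $l = z/\|z\|$ with randomly sampled points of the unit ball.

```python
import math
import random

def dot(a, b):
    """Scalar product of two vectors given as lists."""
    return sum(ai * bi for ai, bi in zip(a, b))

def norm(z):
    """Euclidean norm."""
    return math.sqrt(dot(z, z))

def maximizer(z):
    """A unit-ball vector l attaining max <l, z>: z / ||z||, or 0 when z = 0."""
    n = norm(z)
    return [zi / n for zi in z] if n > 0 else [0.0] * len(z)

def sample_unit_ball(k, rng):
    """Rejection sampling of a point in the closed unit ball of R^k."""
    while True:
        l = [rng.uniform(-1.0, 1.0) for _ in range(k)]
        if norm(l) <= 1.0:
            return l

rng = random.Random(0)
z = [3.0, -4.0, 12.0]            # stands in for (x)_k - (y)_k with k = 3
best = dot(maximizer(z), z)      # value of <l, z> at l = z / ||z||
sampled = max(dot(sample_unit_ball(3, rng), z) for _ in range(10000))
print(best, norm(z), sampled)
```

No sampled direction exceeds $\|z\|$, in line with the Cauchy–Schwarz inequality, while $l = z/\|z\|$ attains it.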

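By the Cauchy formula, for every $x \in C^t(t_*, x_*)$ there exists a control $u(\cdot)$ of player $P$ such that

$$x = W(t, t_*)x_* + \int_{t_*}^{t} W(t, \tau)\big[B(\tau)u(\tau) + f(\tau)\big]\,d\tau. \tag{6}$$

Here $W(t, \tau) = X(t)X^{-1}(\tau)$ and $X(t)$ is a fundamental matrix of solutions of the homogeneous linear differential equation corresponding to (1). Similarly, for every $y \in C^t(t_*, y_*)$ there exists a control $v(\cdot)$ of player $E$ such that

$$y = Z(t, t_*)y_* + \int_{t_*}^{t} Z(t, \tau)\big[D(\tau)v(\tau) + g(\tau)\big]\,d\tau, \tag{7}$$

where $Z(t, \tau) = Y(t)Y^{-1}(\tau)$ and $Y(t)$ is a fundamental matrix of solutions of the homogeneous linear differential equation corresponding to (2). Thus, if we define

$$h(\tau) = \big(W(T, \tau)f(\tau)\big)_k - \big(Z(T, \tau)g(\tau)\big)_k \tag{8}$$

and

$$\rho(l, t, x, y) = \Big\langle l,\ \big(W(T, t)x\big)_k - \big(Z(T, t)y\big)_k + \int_{t}^{T} h(\tau)\,d\tau \Big\rangle, \tag{9}$$

then from (5)–(9) it follows that

$$\begin{aligned}
w^{(0)}(t_*, x_*, y_*) &= \max_{l \in \mathbb{R}^k,\ \|l\| \le 1} \max_{v(\cdot)} \min_{u(\cdot)} \Big[\rho(l, t_*, x_*, y_*) + \int_{t_*}^{T} \big\langle l, \big(W(T, \tau)B(\tau)u(\tau)\big)_k - \big(Z(T, \tau)D(\tau)v(\tau)\big)_k \big\rangle\,d\tau\Big] = \\
&= \max_{l \in \mathbb{R}^k,\ \|l\| \le 1} \Big[\rho(l, t_*, x_*, y_*) + \max_{v(\cdot)} \min_{u(\cdot)} \int_{t_*}^{T} \big\langle l, \big(W(T, \tau)B(\tau)u(\tau)\big)_k - \big(Z(T, \tau)D(\tau)v(\tau)\big)_k \big\rangle\,d\tau\Big],
\end{aligned} \tag{10}$$

where the maximum and the minimum are taken over all controls $v(\cdot)$ and $u(\cdot)$ of players $E$ and $P$, respectively. Note that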

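$$\begin{aligned}
&\max_{v(\cdot)} \min_{u(\cdot)} \int_{t_*}^{T} \big\langle l, \big(W(T, \tau)B(\tau)u(\tau)\big)_k - \big(Z(T, \tau)D(\tau)v(\tau)\big)_k \big\rangle\,d\tau = \\
&\quad = \min_{u(\cdot)} \int_{t_*}^{T} \big\langle l, \big(W(T, \tau)B(\tau)u(\tau)\big)_k \big\rangle\,d\tau - \min_{v(\cdot)} \int_{t_*}^{T} \big\langle l, \big(Z(T, \tau)D(\tau)v(\tau)\big)_k \big\rangle\,d\tau = \\
&\quad = \int_{t_*}^{T} \min_{u \in P} \big\langle l, \big(W(T, \tau)B(\tau)u\big)_k \big\rangle\,d\tau - \int_{t_*}^{T} \min_{v \in Q} \big\langle l, \big(Z(T, \tau)D(\tau)v\big)_k \big\rangle\,d\tau = \\
&\quad = \int_{t_*}^{T} \max_{v \in Q} \min_{u \in P} \big\langle l, \big(W(T, \tau)B(\tau)u\big)_k - \big(Z(T, \tau)D(\tau)v\big)_k \big\rangle\,d\tau.
\end{aligned} \tag{11}$$

The equalities in (11) hold because each of the multi-valued maps

$$\tau \mapsto \Big\{u \in P \;\Big|\; \big\langle l, \big(W(T, \tau)B(\tau)u\big)_k \big\rangle = \min_{u' \in P} \big\langle l, \big(W(T, \tau)B(\tau)u'\big)_k \big\rangle\Big\}$$

and

$$\tau \mapsto \Big\{v \in Q \;\Big|\; \big\langle l, \big(Z(T, \tau)D(\tau)v\big)_k \big\rangle = \min_{v' \in Q} \big\langle l, \big(Z(T, \tau)D(\tau)v'\big)_k \big\rangle\Big\}$$

is upper semicontinuous and hence possesses a measurable selector [14].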
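The interchange of the minimum over controls with the integral can be illustrated on a small example. In the sketch below everything concrete is an assumption made for illustration: the coefficient vector `c(tau)` stands in for the linear functional $u \mapsto \langle l, (W(T,\tau)B(\tau)u)_k \rangle$, the control set is the box $P = [-1, 1]^2$, and integrals are approximated by the midpoint rule. The pointwise minimizer plays the role of the measurable selector.

```python
import math
import random

def c(tau):
    """Hypothetical coefficient vector, so that the integrand equals <c(tau), u>."""
    return [math.cos(3.0 * tau), tau - 0.5]

# Vertices of the box P = [-1, 1]^2; a linear functional attains its minimum at a vertex.
VERTICES = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]

def dot(a, b):
    """Scalar product of two vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def selector(tau):
    """Pointwise minimizer of <c(tau), u> over the box P."""
    return [-1.0 if ci > 0 else 1.0 for ci in c(tau)]

def integrate(g, t0, T, n=2000):
    """Midpoint rule on [t0, T]."""
    h = (T - t0) / n
    return h * sum(g(t0 + (i + 0.5) * h) for i in range(n))

def cost(control, t0=0.0, T=1.0):
    """Integral of <c(tau), control(tau)> over [t0, T]."""
    return integrate(lambda tau: dot(c(tau), control(tau)), t0, T)

def integral_of_pointwise_min(t0=0.0, T=1.0):
    """Integral of min over P of <c(tau), u>."""
    return integrate(lambda tau: min(dot(c(tau), v) for v in VERTICES), t0, T)

rng = random.Random(1)
table = [[rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for _ in range(10)]

def random_control(tau):
    """A piecewise-constant admissible control with values in P."""
    return table[min(int(tau * 10), 9)]

print(cost(selector), integral_of_pointwise_min(), cost(random_control))
```

The control built from the pointwise minimizer attains the integral of the pointwise minimum, and any other admissible control gives a cost that is at least as large.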

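From (10) and (11) we get

$$w^{(0)}(t_*, x_*, y_*) = \max_{l \in \mathbb{R}^k,\ \|l\| \le 1} \Big[\rho(l, t_*, x_*, y_*) + \int_{t_*}^{T} \max_{v \in Q} \min_{u \in P} \big\langle l, \big(W(T, \tau)B(\tau)u\big)_k - \big(Z(T, \tau)D(\tau)v\big)_k \big\rangle\,d\tau\Big]. \tag{12}$$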

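Theorem 2. If for every $\tau \in (-\infty, T]$ the function

$$\varphi(l, \tau) = \max_{v \in Q} \min_{u \in P} \big\langle l, \big(W(T, \tau)B(\tau)u\big)_k - \big(Z(T, \tau)D(\tau)v\big)_k \big\rangle \tag{13}$$

is concave in $l \in \mathbb{R}^k$, $\|l\| \le 1$, then the game $\Gamma_T(D)$ is regular.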

Proof. By Theorem 1 it suffices to prove that the programmed maxmin function $w^{(0)}(\cdot)$ is a fixed point of the operator $\Phi : C(D) \to C(D)$.

Let $(t_*, x_*, y_*) \in D$ be an arbitrary position. From (4), (12) and (13) it follows that

$$(\Phi w^{(0)})(t_*, x_*, y_*) = \max_{t \in [t_*, T]} \max_{y \in C^t(t_*, y_*)} \min_{x \in C^t(t_*, x_*)} \max_{l \in \mathbb{R}^k,\ \|l\| \le 1} \Big[\rho(l, t, x, y) + \int_{t}^{T} \varphi(l, \tau)\,d\tau\Big]. \tag{14}$$

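From (9) it follows that the function $\rho(l, t, x, y)$ is linear in $l$, $x$ and $y$. Then, since the function $\varphi(\cdot, \tau)$ is concave in $l \in \mathbb{R}^k$, $\|l\| \le 1$, the function

$$\chi(l, t, x, y) = \rho(l, t, x, y) + \int_{t}^{T} \varphi(l, \tau)\,d\tau$$

is convex in $x \in C^t(t_*, x_*)$ and concave in $l \in \mathbb{R}^k$, $\|l\| \le 1$. Therefore, by the minimax theorem [13], we can interchange the max and min operations on the right-hand side of (14). Hence

$$(\Phi w^{(0)})(t_*, x_*, y_*) = \max_{l \in \mathbb{R}^k,\ \|l\| \le 1} \max_{t \in [t_*, T]} \max_{y \in C^t(t_*, y_*)} \min_{x \in C^t(t_*, x_*)} \Big[\rho(l, t, x, y) + \int_{t}^{T} \varphi(l, \tau)\,d\tau\Big],$$

and, by (9),

$$(\Phi w^{(0)})(t_*, x_*, y_*) = \max_{l \in \mathbb{R}^k,\ \|l\| \le 1} \max_{t \in [t_*, T]} \max_{y \in C^t(t_*, y_*)} \min_{x \in C^t(t_*, x_*)} \Big[\Big\langle l,\ \big(W(T, t)x\big)_k - \big(Z(T, t)y\big)_k + \int_{t}^{T} h(\tau)\,d\tau \Big\rangle + \int_{t}^{T} \varphi(l, \tau)\,d\tau\Big],$$

or

$$\begin{aligned}
(\Phi w^{(0)})(t_*, x_*, y_*) = \max_{l \in \mathbb{R}^k,\ \|l\| \le 1} \max_{t \in [t_*, T]} \Big[&\min_{x \in C^t(t_*, x_*)} \big\langle l, \big(W(T, t)x\big)_k \big\rangle - \min_{y \in C^t(t_*, y_*)} \big\langle l, \big(Z(T, t)y\big)_k \big\rangle + \\
&+ \Big\langle l, \int_{t}^{T} h(\tau)\,d\tau \Big\rangle + \int_{t}^{T} \varphi(l, \tau)\,d\tau\Big].
\end{aligned} \tag{15}$$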

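Note that $W(T, t)W(t, \tau) = W(T, \tau)$. Then from (6) we get

$$\begin{aligned}
\min_{x \in C^t(t_*, x_*)} \big\langle l, \big(W(T, t)x\big)_k \big\rangle &= \min_{u(\cdot)} \Big\langle l,\ \big(W(T, t_*)x_*\big)_k + \int_{t_*}^{t} \Big[\big(W(T, \tau)B(\tau)u(\tau)\big)_k + \big(W(T, \tau)f(\tau)\big)_k\Big]\,d\tau \Big\rangle = \\
&= \Big\langle l,\ \big(W(T, t_*)x_*\big)_k + \int_{t_*}^{t} \big(W(T, \tau)f(\tau)\big)_k\,d\tau \Big\rangle + \int_{t_*}^{t} \min_{u \in P} \big\langle l, \big(W(T, \tau)B(\tau)u\big)_k \big\rangle\,d\tau,
\end{aligned} \tag{16}$$

where the last equality is justified in the same way as in (11). Similarly, we have

$$\min_{y \in C^t(t_*, y_*)} \big\langle l, \big(Z(T, t)y\big)_k \big\rangle = \Big\langle l,\ \big(Z(T, t_*)y_*\big)_k + \int_{t_*}^{t} \big(Z(T, \tau)g(\tau)\big)_k\,d\tau \Big\rangle + \int_{t_*}^{t} \min_{v \in Q} \big\langle l, \big(Z(T, \tau)D(\tau)v\big)_k \big\rangle\,d\tau. \tag{17}$$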

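From (16) and (17), taking into account (8) and (13), we get

$$\begin{aligned}
&\min_{x \in C^t(t_*, x_*)} \big\langle l, \big(W(T, t)x\big)_k \big\rangle - \min_{y \in C^t(t_*, y_*)} \big\langle l, \big(Z(T, t)y\big)_k \big\rangle = \\
&\quad = \big\langle l, \big(W(T, t_*)x_*\big)_k - \big(Z(T, t_*)y_*\big)_k \big\rangle + \Big\langle l, \int_{t_*}^{t} h(\tau)\,d\tau \Big\rangle + \int_{t_*}^{t} \varphi(l, \tau)\,d\tau.
\end{aligned}$$

Substituting this expression in (15), we conclude that

$$\begin{aligned}
(\Phi w^{(0)})(t_*, x_*, y_*) &= \max_{l \in \mathbb{R}^k,\ \|l\| \le 1} \max_{t \in [t_*, T]} \Big[\big\langle l, \big(W(T, t_*)x_*\big)_k - \big(Z(T, t_*)y_*\big)_k \big\rangle + \Big\langle l, \int_{t_*}^{t} h(\tau)\,d\tau \Big\rangle + \int_{t_*}^{t} \varphi(l, \tau)\,d\tau + \\
&\qquad\qquad + \Big\langle l, \int_{t}^{T} h(\tau)\,d\tau \Big\rangle + \int_{t}^{T} \varphi(l, \tau)\,d\tau\Big] = \\
&= \max_{l \in \mathbb{R}^k,\ \|l\| \le 1} \Big[\Big\langle l,\ \big(W(T, t_*)x_*\big)_k - \big(Z(T, t_*)y_*\big)_k + \int_{t_*}^{T} h(\tau)\,d\tau \Big\rangle + \int_{t_*}^{T} \varphi(l, \tau)\,d\tau\Big].
\end{aligned}$$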

From this, together with (9), (12) and (13), it follows that

$$(\Phi w^{(0)})(t_*, x_*, y_*) = w^{(0)}(t_*, x_*, y_*).$$

Since the position $(t_*, x_*, y_*) \in D$ was chosen arbitrarily, the last equality means that the programmed maxmin function $w^{(0)}(\cdot)$ is indeed a fixed point of the operator $\Phi$. End of proof.

Note. Obviously the conditions of Theorem 2 are satisfied when $A(\cdot) = C(\cdot)$, $B(\cdot) = D(\cdot)$ and $Q = a + \beta P$, where $\beta \le 1$.
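One way to see why the Note holds is sketched below. The sketch assumes, in addition to what the Note states, that $\beta \ge 0$ (an assumption introduced here so that scaling by $\beta$ preserves the minimum); with $A(\cdot) = C(\cdot)$ and $B(\cdot) = D(\cdot)$ we have $W = Z$, and the function from (13) splits into a concave part and a linear part.

```latex
\begin{aligned}
\varphi(l,\tau)
  &= \max_{v \in Q}\min_{u \in P}
     \big\langle l, (W(T,\tau)B(\tau)u)_k - (W(T,\tau)B(\tau)v)_k \big\rangle \\
  &= \min_{u \in P}\big\langle l, (W(T,\tau)B(\tau)u)_k \big\rangle
   - \min_{v \in a + \beta P}\big\langle l, (W(T,\tau)B(\tau)v)_k \big\rangle \\
  &= \min_{u \in P}\big\langle l, (W(T,\tau)B(\tau)u)_k \big\rangle
   - \big\langle l, (W(T,\tau)B(\tau)a)_k \big\rangle
   - \beta \min_{u \in P}\big\langle l, (W(T,\tau)B(\tau)u)_k \big\rangle \\
  &= (1-\beta)\,\min_{u \in P}\big\langle l, (W(T,\tau)B(\tau)u)_k \big\rangle
   - \big\langle l, (W(T,\tau)B(\tau)a)_k \big\rangle .
\end{aligned}
```

The pointwise minimum of functions linear in $l$ is concave in $l$, the factor $1 - \beta$ is non-negative when $\beta \le 1$, and the last term is linear in $l$, so $\varphi(\cdot, \tau)$ is concave.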

4. Conclusions. This paper has shown how a regularity condition for linear differential games with fixed duration can be derived from the results of the programmed iteration method, the minimax theorem for convex-concave functions and the theorem on measurable selectors of multi-valued maps.

References

1. Krasovskii N. N., Subbotin A. I. Game-theoretical control problems. London: Springer, 2011, 532 p.

2. Petrosyan L. A. Differencial’nye igry presledovanija (Differential pursuit games). Leningrad: Izd-vo Leningr. un-ta, 1977, 222 p.

3. Isaacs R. Differential games: A Mathematical Theory with Applications to Warfare and Pursuit, Control and Optimization. New York: John Wiley and Sons, Inc., 1965, 384 p.

4. Chentsov A. G. O strukture odnoj igrovoj zadachi sblizhenija (The structure of an approach problem). Dokl. Akad. Nauk of the USSR, 1975, vol. 224, pp. 1272–1275.

5. Chentsov A. G. Ob igrovoj zadache sblizhenija v zadannyj moment vremeni (On differential game of approach). Mat. sb., 1976, vol. 99, issue 3, pp. 394–420.

6. Chistyakov S. V., Petrosyan L. A. Ob odnom podhode k resheniyu igr presledovanija (On one approach for solutions of games of pursuit). Prikl. Mat. Mekh., 1977, vol. 41, pp. 825–832.

7. Chistyakov S. V. K resheniyu igrovyh zadach presledovanija (On solutions for game problems of pursuit). Prikl. Mat. Mekh., 1977, vol. 41, pp. 825–832.

8. Chistyakov S. V. O funkcional’nyh uravnenijah v igrah sblizhenija v zadannyj moment vremeni (On functional equations for differential games with fixed duration). Prikl. Mat. Mekh., 1982, vol. 41, pp. 874–877.

9. Chistyakov S. V. Programmnye iteracii i universal’nye ε-optimal’nye strategii v pozicionnoj differencial’noj igre (Programmed iterations and universal ε-optimal strategies in positional differential game). Dokl. Akad. Nauk of the USSR, 1991, vol. 319, pp. 1333–1335.

10. Chentsov A. G., Subbotin A. I. Iteracionnaja procedura postroenija minimaksnyh i vjazkostnyh reshenij uravnenija Gamil’tona–Jakobi (An iterative procedure for constructing minimax and viscosity solutions for the Hamilton–Jacobi equations and its generalization). Proc. Steklov Inst. Math., 1999, vol. 224, pp. 286–309.

11. Chistyakov S. V. Operatory znachenija v teorii differencial’nyh igr (Value operators in the theory of differential games). Izv. IMI Udm. State University, 2006, vol. 37, issue 3, pp. 169–172.

12. Chistyakov S. V., Nikitin F. F. Teorema sushhestvovanija i edinstvennosti reshenija obobshhennogo uravnenija Ajzeksa–Bellmana (Existence and uniqueness theorem for a generalized Isaacs–Bellman equation). Differential Equations, 2007, vol. 43, no. 6, pp. 757–766.

13. Fan Ky. Minimax theorem. Proc. Nat. Acad. Sci. USA, 1953, vol. 39, no. 1, pp. 42–47.

14. Castaing C., Valadier M. Convex analysis and measurable multifunctions. New York: Springer-Verlag, 1977, 277 p.


The article was received by the editorial office on June 26, 2014.
