
Technical Report

IPP-HURRAY! Research Group

Polytechnic Institute of Porto School of Engineering (ISEP-IPP)

Distributed Computer-Controlled Systems: the DEAR-COTS Approach

Paulo VERÍSSIMO (FCUL), António CASIMIRO (FCUL), Luís Miguel PINHO, Francisco VASQUES (FEUP), Luís RODRIGUES (FCUL), Eduardo TOVAR

HURRAY-TR-0021

October 2000


THIS WORK IS PARTIALLY SUPPORTED BY FCT UNDER PROJECT DEAR-COTS (Praxis/P/EEI/14187/1998)

Distributed Computer-Controlled Systems: the DEAR-COTS Approach

Paulo VERÍSSIMO, António CASIMIRO, Luís RODRIGUES University of Lisboa, FCUL

Bloco C5, Campo Grande, 1749-016 Lisboa

Portugal

Tel.: +351.21.7500087

E-mail: {pjv,casim,ler}@di.fc.ul.pt http://www.hurray.isep.ipp.pt

Luís Miguel PINHO, Eduardo TOVAR IPP-HURRAY! Research Group

Polytechnic Institute of Porto (ISEP-IPP) Rua Dr. António Bernardino de Almeida, 431 4200-072 Porto

Portugal

Tel.: +351.22.8340502, Fax: +351.22.8340529 E-mail: {lpinho, llf}@dei.isep.ipp.pt

http://www.hurray.isep.ipp.pt

Francisco VASQUES University of Porto (FEUP) Rua Dr. Roberto Frias 4050-123 Porto Portugal

Tel.: +351.22.5081702, Fax:

E-mail: [email protected] http://www.fe.up.pt/~vasques

Abstract:

This paper proposes a new architecture targeting real-time and reliable Distributed

Computer-Controlled Systems (DCCS). This architecture provides a structured approach for

the integration of soft and/or hard real-time applications with Commercial Off-The-Shelf

(COTS) components. The Timely Computing Base model is used as the reference model to

deal with the heterogeneity of system components with respect to guaranteeing the timeliness

of applications. The reliability and availability requirements of hard real-time applications are

guaranteed by a software-based fault-tolerance approach.

DISTRIBUTED COMPUTER-CONTROLLED SYSTEMS: THE DEAR-COTS APPROACH

P. Veríssimo, A. Casimiro, L. M. Pinho, F. Vasques, L. Rodrigues, E. Tovar

University of Lisboa, FCUL, Bloco C5, Campo Grande, 1749-016 Lisboa, Portugal, E-mail: pjv,casim,[email protected]

Polytechnic Institute of Porto, ISEP, Rua de São Tomé, 4200-072 Porto, Portugal, E-mail: lpinho,[email protected]

University of Porto, FEUP, Rua Dr. Roberto Frias, 4200-465 Porto, Portugal, E-mail: [email protected]


Keywords: Real-Time, Fault Tolerance, Distributed Embedded Systems, COTS

1. INTRODUCTION

Currently, there is a trend to incorporate Commercial Off-The-Shelf (COTS) components in Distributed Computer-Controlled Systems (DCCS), in order to minimise costs and development time. Therefore, there is a need for architectures that allow the integration of COTS components, hard real-time reliable applications and non-reliable and/or soft real-time applications.

This paper proposes the DEAR-COTS (Distributed Embedded ARchitecture using Commercial Off-The-Shelf components) architecture for supporting real-time and reliable DCCS. This architecture provides DCCS with a generic framework in which hard real-time applications can execute with their timeliness and reliability requirements guaranteed, whilst, at the same time, embodying soft real-time or non real-time applications. This framework is put together by means of a Timely Computing Base (TCB), which is used as the reference model to deal with the heterogeneity of system components with respect to guaranteeing the timeliness of applications.

The remainder of this paper is organised as follows. Section 2 presents some of the relevant work concerning dependable distributed real-time systems. Afterwards, Section 3 presents the proposed architecture. Sections 4 and 5 present the Hard Real-Time and Soft Real-Time Subsystems of the DEAR-COTS architecture. Finally, Section 6 draws some conclusions and points out possible future directions.

2. RELATED WORK

The problem of reliable real-time DCCS is reasonably well understood when considering synchronous and predictable environments. However, the use of COTS components and the integration, [...] soft or non real-time applications introduces new problems.

The guarantee of these properties cannot be achieved by an "ad-hoc" solution, as has been shown by structured approaches to the problem of designing dependable real-time systems (e.g. DELTA-4 (Powell, 1991), MARS (Kopetz et al., 1989) or GUARDS (Powell et al., 1999)). However, specialised architectures are costly; thus, there is a need for a fault-tolerant real-time distributed architecture for DCCS, based on highly available and low-cost COTS technologies.

A few works have addressed the implementation of timeliness requirements in environments where no real-time guarantees can be given (Cristian and Fetzer, 1999; Veríssimo and Almeida, 1995). The Timely Computing Base model presented in (Veríssimo et al., 2000), however, provides a generic framework to deal with this problem. It is used as a reference model in the design of the DEAR-COTS architecture. The appropriateness of the TCB model to COTS-based environments is discussed in (Casimiro et al., 2000) for a distributed system composed exclusively of COTS components. That work concludes that the system must be able to cope with the occurrence of residual timing failures.

Fault tolerance is the preferred means to achieve reliability. A fault-tolerant service can be implemented by co-ordinating a group of processes replicated on different nodes. There are three main replication approaches: active replication, primary-backup (passive) replication and semi-active replication (Powell, 1991). The use of COTS induces fail-uncontrolled replicas, so the use of active replication techniques becomes necessary. In active replication, all the replicas process the same inputs, keeping their internal state synchronised and all voting on the same outputs.
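As a rough illustration of the voting step in active replication, the following Python sketch consolidates the outputs of several replicas by strict majority. The function name `vote` and the strict-majority rule are assumptions made for this example; the text does not prescribe a particular voting algorithm.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value produced by a strict majority of replicas, or None.

    Hypothetical helper: with n replica outputs, a strict majority masks
    up to (n - 1) // 2 replicas that produce wrong values.
    """
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    # Accept the most frequent value only if more than half the replicas agree.
    return value if count > len(outputs) // 2 else None
```

With three replicas, one fail-uncontrolled replica producing an arbitrary value is outvoted by the other two; when no value reaches a majority the function returns `None`, leaving the reaction to the caller.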

It is also clear that the feasibility of the applications' requirements must be ensured by sound schedulability analysis techniques. The use of these techniques (e.g., the well-known Response Time Analysis (Audsley et al., 1993)) allows checking whether the task set will meet its deadlines, a required condition for hard real-time applications.
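To give an idea of what such an analysis computes, the sketch below implements the classic fixed-point recurrence of Response Time Analysis for fixed-priority preemptive scheduling, in a simplified form that ignores blocking and release jitter. The task tuple layout, the function names and the use of integer time units are assumptions of this example, not details taken from the cited work.

```python
from typing import List, Optional, Tuple

# Assumed task layout: (C, T, D) = worst-case execution time, period (or
# minimum inter-arrival time) and deadline, all in integer time units.
# Tasks are listed from highest to lowest priority, with D <= T.
Task = Tuple[int, int, int]


def response_time(tasks: List[Task], i: int) -> Optional[int]:
    """Worst-case response time of tasks[i], or None if it exceeds the deadline."""
    c_i, _, d_i = tasks[i]
    r = c_i
    while r <= d_i:
        # Interference from every higher-priority task released within [0, r).
        interference = sum(-(-r // t_j) * c_j for c_j, t_j, _ in tasks[:i])
        r_next = c_i + interference
        if r_next == r:  # fixed point reached
            return r
        r = r_next
    return None


def schedulable(tasks: List[Task]) -> bool:
    """True if every task meets its deadline under the simplified analysis."""
    return all(response_time(tasks, i) is not None for i in range(len(tasks)))
```

For the illustrative task set `[(1, 4, 4), (2, 6, 6), (3, 12, 12)]` the iteration converges to response times 1, 3 and 10, so all deadlines are met.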

3. THE DEAR-COTS ARCHITECTURE

The DEAR-COTS architecture provides an execution environment for real-time applications with reliability and availability requirements, through the use of COTS hardware and software. Its main purpose is to provide continuous and adequate service to the controlled system, in order to increase the confidence level put in the controlling system. The DEAR-COTS architecture is not targeted to [...] restricted set of failure assumptions (Laprie, 1992).

Besides the reliability and availability requirements to guarantee the correct behaviour of the supported real-time applications, DCCS also need to be interconnected with other parts of the overall system. Currently, this kind of system demands more flexibility and interconnectivity capabilities, while guaranteeing the availability and reliability requirements of the supported real-time applications. Hence, the integration of hard real-time applications, whose requirements have to be guaranteed, with soft real-time applications, where a more flexible approach can be used, is another goal of the DEAR-COTS architecture.

A common characteristic among all these applications is their real-time behaviour. This real-time behaviour is specified in compliance with timeliness requirements, which in essence call for a synchronous system model. However, the demand for an environment with flexibility and interconnectivity capabilities, based on possibly heterogeneous COTS components, makes the enforcement of timeliness assumptions very difficult. The Timely Computing Base (TCB) model provides a generic solution to this problem.

In the DEAR-COTS architecture the TCB model is used as a reference model to deal with the heterogeneity of system components and of the environment, with respect to timeliness. From a system model perspective, we devise a generic DEAR-COTS architecture to address the fundamental problems in a global and integrated way. From an engineering point of view, we devise specific mechanisms to deal with the reliability and availability requirements of hard real-time applications, and to deal with the requirements of soft real-time applications.

3.1 The TCB model

The TCB model provides a generic framework to deal with the problem of implementing applications with timeliness requirements in environments that are unpredictable or unreliable. The fundamental idea, which is applied to the DEAR-COTS architecture, is that systems are assumed to have two distinct parts with respect to synchrony. In a system with a TCB there is a control part, with synchronous properties, made of local TCB modules interconnected by a control channel. There is also a payload part, over a global network or payload channel, that may have any degree of synchronism. In particular, the payload part can either enjoy synchronous properties or be completely asynchronous.

The synchronous part is supposed to be a very small and simple part of the overall system. [...] synchronous properties would have to be assumed for the overall system. Applications execute in the payload part of the system and use a set of basic services provided by the control part, the TCB: Timely Execution, Duration Measurement and Timing Failure Detection. The definition of these services and the respective API can be found in (Veríssimo et al., 2000).

Applications can benefit from the TCB by construction, using TCB services to "be aware" of their timeliness and, therefore, to always behave in a manner that preserves the safety of the system. This is the case for many soft real-time applications, namely all those that can live with sporadic failures, those that can adapt to the available quality of service, and those that can switch to a fail-safe state when a timing failure is detected.

In fact, timing faults affecting components or applications with soft real-time requirements can be addressed with the help of a timing failure detection service and by applying adequate tolerance or safety measures, as explained in Section 5.
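The pattern of an application being "aware" of its timeliness and falling back to a safe behaviour can be sketched as follows. This is only an application-level illustration under stated assumptions: the function names, the injectable `clock` parameter and the fail-safe policy are hypothetical, and the code does not reproduce the TCB API defined in (Veríssimo et al., 2000), where detection is provided by the synchronous control part rather than by the application itself.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def run_with_timing_check(
    action: Callable[[], T],
    bound_s: float,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[T, bool]:
    """Run `action`, measure its duration and report whether it met `bound_s`.

    Hypothetical stand-in for duration measurement plus timing failure
    detection; the measurement here is local to the calling process.
    """
    start = clock()
    result = action()
    elapsed = clock() - start
    return result, elapsed <= bound_s


def control_step(
    read_input: Callable[[], T],
    bound_s: float,
    fail_safe_value: T,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Use the fresh value if it was timely, otherwise fall back to a safe one."""
    value, timely = run_with_timing_check(read_input, bound_s, clock)
    return value if timely else fail_safe_value
```

Passing the clock as a parameter keeps the sketch testable; a real deployment would rely on whatever timing source the underlying platform guarantees.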

3.2 A generic DEAR-COTS node

Given the above description of the TCB model, the architecture of a generic DEAR-COTS node, capable of simultaneously handling the requirements of hard and soft real-time applications, can be understood in a very intuitive manner. In fact, the basic idea of casting the heterogeneity in system synchrony into the architecture has been applied to the generic DEAR-COTS node, as represented in Figure 1.

[Figure 1: a generic node comprising a TCB (Timely Computing Base) module, synchronous, on the control channel; a hard real-time module DC_H, synchronous, on the real-time channel; and soft real-time modules DC_S1 (asynchronous) to DC_Sn (partially synchronous) on the general purpose channel.]

Fig. 1. DEAR-COTS node structure.

The several modules depicted in the figure are intended to fully characterise a node in terms of the synchrony assumptions. The TCB module acts as a gluing element, being always the most synchronous part of the system. It provides timely services to less synchronous modules through an interface that bridges the synchrony gap. The DC_H, DC_S1 and DC_Sn modules constitute the payload part of the system, and represent the several possible environments with respect to synchrony that may exist in a DEAR-COTS [...] with different requirements will be executed. DC_H represents a hard real-time subsystem (HRTS), with synchronous properties, which is supposed to exist in every node running distributed and/or replicated hard real-time applications. Different DC_S subsystems can be defined, according to the synchrony properties that they enjoy, namely the asynchronous one, DC_S1, and several partial synchrony models, DC_Sn (Cristian and Fetzer, 1999; Veríssimo and Almeida, 1995). With the help of a TCB, all these subsystems (including the asynchronous one) can support the execution of certain applications with timeliness requirements. Therefore, we will refer to all of them as soft real-time subsystems (SRTS).

Each node may have access to different communication channels with respect to synchrony. As in the TCB model, we assume that TCB modules communicate through a control channel, which can be implemented as a virtual channel over the physical (real-time) network or can be a separate network by itself. There is also a real-time channel serving DC_H subsystems, implemented over a real-time network, and a general purpose channel serving DC_S subsystems, where no real-time guarantees are provided.

This generic node architecture can be instantiated into different node configurations, depending on the modules used in a node. A DEAR-COTS system is built by interconnecting several DEAR-COTS nodes, choosing suitable configurations for each node.

3.3 Implementing a DEAR-COTS system

A DEAR-COTS system is built using distributed processing nodes, where distributed hard real-time and soft real-time applications may coexist. To ensure the desired level of fault tolerance (reliability and availability) for the supported hard real-time applications, specific components of these applications may be replicated. To ensure the timeliness properties of soft real-time applications and allow system resources to be shared by every (soft and hard real-time) application, processing nodes are modelled as DEAR-COTS nodes.

From a generic DEAR-COTS node it is possible to define particular node instances, with different characteristics and purposes. A DEAR-COTS node is characterised by the synchrony subsystems it is composed of. There are essentially three basic node types: hard real-time nodes (H), soft real-time nodes (S) and gateway nodes (H/S).

Hard real-time nodes are those where only a hard real-time subsystem (HRTS) exists. Therefore, they will exclusively be used to provide a framework to support reliable and available hard [...] these nodes is not essential to allow the implementation of hard real-time applications. However, as in any system assumed to be synchronous, and even more so in a COTS-based one, there is always a small probability that timing failures occur. The need for a TCB, as an instrument to amplify the coverage of synchronous assumptions, has to be assessed in terms of the dependability required by the supported DCCS application.

Soft real-time nodes only include a soft real-time subsystem (SRTS). The soft real-time subsystem provides the execution environment for the remote supervision and remote management of the Distributed Computer-Controlled System. The existence of a TCB in these nodes is crucial to allow the implementation of applications with timeliness requirements.

A gateway node integrates both a hard real-time subsystem (HRTS) and a soft real-time subsystem (SRTS). In these nodes there are two distinct and well-defined execution environments. The idea is to allow hard real-time components, executing in the HRTS, to interact in a controlled manner with soft real-time components, executing in the SRTS. The communication mechanisms between both subsystems must be carefully designed, guaranteeing that failures in the (less reliable) soft real-time subsystem do not interfere with the hard real-time subsystem (concerning its timing, availability and reliability requirements). Therefore, mechanisms for memory partitioning must be provided, and the inter-communication mechanisms must also guarantee the integrity of data transferred from the SRTS to the HRTS, by upgrading its confidence level.
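The three node types can be summarised with a small data-structure sketch. The `Node` class, its field names and the channel labels are hypothetical and introduced only for illustration; the mapping of modules to channels follows the description in Section 3.2 (control channel for TCB modules, real-time channel for the HRTS, general purpose channel for the SRTS).

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class Node:
    """Hypothetical description of a DEAR-COTS node by the modules it contains."""
    name: str
    has_hrts: bool = False
    has_srts: bool = False
    has_tcb: bool = False

    def node_type(self) -> str:
        """Classify the node as H, S or H/S from its payload subsystems."""
        if self.has_hrts and self.has_srts:
            return "H/S"
        if self.has_hrts:
            return "H"
        if self.has_srts:
            return "S"
        raise ValueError(f"node {self.name!r} has no payload subsystem")

    def channels(self) -> Set[str]:
        """Channels the node attaches to, one per kind of module present."""
        result = set()
        if self.has_tcb:
            result.add("control")
        if self.has_hrts:
            result.add("real-time")
        if self.has_srts:
            result.add("general-purpose")
        return result


# Illustrative configurations: the TCB is optional in the H node and
# present in the nodes that host an SRTS.
controller = Node("controller", has_hrts=True)
gateway = Node("gateway", has_hrts=True, has_srts=True, has_tcb=True)
supervisor = Node("supervisor", has_srts=True, has_tcb=True)
```

A system description is then simply a collection of such nodes, each with the configuration that suits its role.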

[Figure 2: DEAR-COTS H nodes, S nodes and an H/S node (the "DC Gateway") hosting HRT and SRT applications. Diagram labels: Wide Area Network; General purpose; Real-Time; Controller Area Network; Sensors/Actuators. Legend: H/S Node - Gateway node; S Node - Soft Real-Time node; H Node - Hard Real-Time node.]

Fig. 2. Generic DEAR-COTS system.

Figure 2 represents a system containing all the above node configurations. H and H/S nodes are interconnected by a real-time network, which provides the communication infrastructure for the hard real-time applications (interconnecting controllers, sensors and actuators). This real-time network also provides the DEAR-COTS architecture with a communication infrastructure to support the replica management mechanisms. At the level above, as there is the need to interconnect these nodes with the upper levels of the DCCS ([...] remote management), there is a general-purpose network interconnecting H/S and S nodes. The control channel required for the TCB is not necessarily an independent network. The assumption of a restricted channel with predictable timing characteristics (control) coexisting with possibly asynchronous channels is feasible in some of the current networks (Prycker, 1995).

At the hard real-time level, the HRTS is responsible for providing a framework for reliable execution. Hence, applications have guaranteed execution resources, including processing power, memory and communication infrastructure. This is the main reason for the need of a separate real-time communication network for the HRTS, where messages sent from one node to another are received and processed in a bounded time interval.

The SRTS provides a set of services to support the supervision and management level of the DCCS. At this system level, flexibility is a major goal, since new services can be provided as the system evolves. However, since there are no hard real-time guarantees, either the TCB services are used to allow applications to be aware of the available quality of service (and adapt themselves, if possible), or techniques such as best-effort scheduling or value-based scheduling must be used.

4. THE HARD REAL-TIME SUBSYSTEM

The set of H and H/S nodes provides a framework to support distributed hard real-time applications (Figure 3). The reliability and availability requirements are guaranteed by the software replication of COTS components, rather than the usual solution, which is to build software on top of specialised hardware.

[Figure 3: nodes, each with its own hardware and kernel, interconnected by a real-time network and hosting hard real-time applications.]

Fig. 3. HRTS structure.

A hard real-time application is made up of several tasks (processing units). These tasks are distributed over the H and H/S nodes. Each node has its own (non-distributed) COTS real-time kernel and hardware, which provide the desired multitasking support. An advantage of using both a COTS kernel and COTS hardware is that it allows for easy upgradability and portability of the system.

The set of H and H/S nodes also supports the active replication of software (Figure 4) with dissimilar [...] tolerance to design faults is increased, since nodes are considered as independent from the point of view of failures. At the same time flexibility is increased, since nodes are not just copies of each other, allowing for a more flexible design of real-time applications.

[Figure 4: tasks τ1, τ2 and τ3 and their replicas running on top of the HRTS support software, which comprises the Replica Manager and the Communication Manager.]

Fig. 4. Replicated Hard Real-Time Application.

The goal of the HRTS support software (Figure 4) is to provide distribution support (including both the application distribution itself and the replication management) to hard real-time applications. A layered approach is adopted to simplify the development of the HRTS support software:

The Communication Manager layer is responsible for the adequate communication services;

The Replica Manager layer is responsible for transparently managing the replicated components.

4.1 Scheduling model

In the HRTS, each application consists of a set of related tasks (τ1 ... τn), each task being a single processing unit. Tasks from the same application can be allocated to different nodes. In order to allow the use of current off-line schedulability analysis techniques (e.g., the well-known Response Time Analysis (Audsley et al., 1993)), each task is released by only one invocation event, but can be released an unbounded number of times. A periodic task is released by the runtime (temporal invocation), while a sporadic task can be released either by another task or by the environment. After being released, a task cannot suspend itself or be blocked while accessing remote data (remote blocking).

Tasks are allowed to communicate with each other either through shared data objects or through release event objects (which can also carry data). Shared data objects are used for asynchronous data communication between tasks, while release event objects are used for the release of sporadic tasks. Tasks are designed as small processing units which, in each invocation, read inputs, carry out the processing and output the results. The goal is to minimise task interaction, in order to improve the schedulability analysis and increase the sys[...] Ceiling Protocols (Sha et al., 1990).

4.2 Replication Model

As reliability is to be achieved through replication, it is important to define the replication unit. From the above definitions, two different entities could be considered: a single task or the complete hard real-time application. However, both solutions would be very restrictive. In the latter, in order to replicate part of the application, all of its tasks would have to be replicated, which would unnecessarily increase the processing load. In the former, each task output would have to be consolidated, which would increase the inter-task communication load.

Therefore, the notion of component is introduced. Applications are divided into components, each one being a set of tasks and resources that interact to perform a common job. A component can include tasks and resources from several nodes, or it can be located in just one node. In each node, several components may coexist. As an example, Figure 5 shows a real-time application with 4 tasks (τ1, τ2, τ3 and τ4). The application is divided into two different components (C1 and C2), which ensure the desired level of reliability.

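[Figure 5: component C1, containing tasks τ1 and τ2, and component C2, containing tasks τ3 and τ4, each shown together with a replica containing the replicated tasks.]

Fig. 5. Hard real-time application.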
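To make the effect of choosing the component as the replication unit concrete, the sketch below counts the task instances that result from a given replication degree per component. The `Component` class, the task names and the replication degrees are hypothetical and chosen only for illustration; they do not reproduce the configuration of Figure 5.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Component:
    """Hypothetical replication unit: a named set of tasks plus a replication degree."""
    name: str
    tasks: FrozenSet[str]
    replicas: int = 1  # 1 means the component is not replicated


def task_instances(components: Iterable[Component]) -> int:
    """Total number of task instances to be scheduled across all replicas."""
    return sum(len(c.tasks) * c.replicas for c in components)


# Illustrative application with four tasks split into two components.
c1 = Component("C1", frozenset({"t1", "t2"}), replicas=2)
c2 = Component("C2", frozenset({"t3", "t4"}), replicas=1)

# Replicating only C1 versus replicating the whole application as one unit.
partial = task_instances([c1, c2])
whole = task_instances([Component("App", c1.tasks | c2.tasks, replicas=2)])
```

In this example, replicating only C1 yields 6 task instances, against 8 when the complete application is the replication unit, which is the processing-load argument made above.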

A similar concept is the "capsule" defined in the Delta-4 architecture (Powell, 1991). Like the DEAR-COTS "component", a Delta-4 "capsule" is the unit of replication, embodying a set of tasks (referred to as threads) and objects. However, a "capsule" has its own thread scheduling and separate memory space, and is also the unit of distribution. Thus, the Delta-4 concept of "capsule" is closer to Unix processes, whilst the component presented here is a more lightweight concept, which is used to structure replication units.

By creating components, it is possible to define the replication degree of specific parts of the real-time application, according to the reliability of its components and the desired level of reliability for the application. However, by replicating components, efficiency decreases as the number of tasks and messages increases. Hence, it is possible to trade reliability for efficiency and vice-versa. Although efficiency should not be regarded as the [...]
