José Barateiro* Gonçalo Antunes José Borbinha
Aligning OAIS with the Enterprise Architecture
8th European Conference on Digital Archiving, 2010 Geneva, Switzerland
Instituto de Engenharia de Sistemas e Computadores Investigação e Desenvolvimento em Lisboa
Outline
• Digital Preservation as a Problem
• Context
• The Enterprise Architecture Perspective
• Zachman Framework
• TOGAF
• Reference Architecture
• Shaman RA
• OAIS Reference Model
• Modelling OAIS
• Conclusions
Digital Preservation as a Problem (1/2)
Generic and common requirements:
• Integrity: Effective preservation requires that the informational content of objects remains unchanged throughout their lifetime.
• Reliability: A copy (or representation) of any preserved object must survive throughout the system’s lifetime.
• Authenticity Assurance: A future consumer may require the accessed information to be trustworthy.
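The integrity requirement is often checked in practice by recording a digest of each object and recomputing it later. The following is a minimal sketch under that assumption; the use of SHA-256 and the function names `digest` and `is_intact` are illustrative choices, not something prescribed by the presentation.

```python
import hashlib


def digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of an object's content."""
    return hashlib.sha256(content).hexdigest()


def is_intact(content: bytes, recorded_digest: str) -> bool:
    """True if the content still matches the digest recorded earlier."""
    return digest(content) == recorded_digest


# Assumption: the digest is recorded when the object enters the archive.
original = b"preserved object"
recorded = digest(original)

print(is_intact(original, recorded))             # unchanged copy
print(is_intact(b"preserved 0bject", recorded))  # altered copy
```

A check like this only detects change; repairing a damaged object still depends on the reliability requirement, that is, on a surviving copy.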
Digital Preservation as a Problem (2/2)
• Provenance: A future consumer may require information concerning the origins of the object.
• Dealing with Obsolescence: Digital objects should be able to be exploited independently of any technological context (ideally…).
• Scalability: Digital preservation systems might be required to face technological evolution through the addition of new components.
• Heterogeneity: The components of a digital preservation system should be heterogeneous, because of technology disruption.
Approaching Problems
Systems Engineering
Enterprise Architecture
Risk Management
(http://grito.intraneia.pt) – National project
– Exclusive storage clusters (dedicated to digital preservation)
– Extended storage clusters (using surplus resources of computing clusters)
SHAMAN - Sustaining Heritage Access through Multivalent ArchiviNg (http://shaman-ip.eu/shaman)
– European project
– Three domains of focus: memory institutions, engineering and e-Science
– Strong focus on authenticity and integrity
– Definition of frameworks and architectures for digital preservation
Common ground: use of data grids (massive data sets, file management, user management, networking etc.)
Context
A reference architecture presents a way of recording a specific body of knowledge, with the purpose of making it available for further practical reuse.
According to the ANSI/IEEE Std. 1471-2000:
architecture is “the fundamental organization of a system, embodied in its components, their relationships to each other and the environment, and the principles governing its design and evolution”
Therefore, a reference architecture for digital preservation must provide a way to capture the knowledge in the domain, so that it can be instantiated in concrete architectures for real system implementations!
Reference Architecture – The concept
[Figure: UML package diagram “RA Packages”. Elements: Architecture; «reference architecture» SHAMAN; «framework» Viewpoints; «architecture» Domains (Memory, Industrial, e-Science); «system» Implementations (Memory, Industrial, e-Science); «references» Input (Motivation and Goals, Requirements); «references» Related Work (Relevant Specifications, Relevant Standards, Relevant Technologies); «reference model» OAIS. Relationship labels: considers, constrained by, use, accounts for, guided by, depends, derived.]
SHAMAN RA
Initial global view (1/2)
• The SHAMAN DoW
• The initial work…
• …
• SOA…
• TRAC criteria…
• … Generic focus (a model based on generic requirements and assumptions…).
SHAMAN RA
Initial global view (2/2)
SHAMAN RA
Information Lifecycle (1/2)
The digital preservation system
The interfaces of the digital preservation system
The context of the business
SHAMAN RA
Information Lifecycle (2/2)
Vulnerabilities
• Process
– Software Faults
– Software Obsolescence
• Data
– Media Faults
– Media Obsolescence
• Infrastructure
– Hardware Faults
– Hardware Obsolescence
– Communication Faults
– Network Service Failures
Threats
• Disasters
– Natural Disasters
– Human Operational Errors
• Attacks
– External Attacks
– Internal Attacks
• Management
– Organizational Failures
– Economic Failures
• Business Requirements
– Legal Requirements
– Stakeholders’ Requirements
From the lifecycle context
A taxonomy of vulnerabilities and threats to digital preservation (1/2)
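The taxonomy can be held as a small nested structure, which makes it easy to classify an observed risk. This is a sketch only: the two-level grouping follows one reading of the slide, and the names `TAXONOMY` and `classify` are hypothetical.

```python
from typing import Optional, Tuple

# Kind -> class -> leaf entries, as read from the slide.
TAXONOMY = {
    "Vulnerabilities": {
        "Process": ["Software Faults", "Software Obsolescence"],
        "Data": ["Media Faults", "Media Obsolescence"],
        "Infrastructure": [
            "Hardware Faults",
            "Hardware Obsolescence",
            "Communication Faults",
            "Network Service Failures",
        ],
    },
    "Threats": {
        "Disasters": ["Natural Disasters", "Human Operational Errors"],
        "Attacks": ["External Attacks", "Internal Attacks"],
        "Management": ["Organizational Failures", "Economic Failures"],
        "Business Requirements": [
            "Legal Requirements",
            "Stakeholders' Requirements",
        ],
    },
}


def classify(item: str) -> Optional[Tuple[str, str]]:
    """Return (kind, class) for a leaf of the taxonomy, or None if unknown."""
    for kind, classes in TAXONOMY.items():
        for cls, leaves in classes.items():
            if item in leaves:
                return kind, cls
    return None


print(classify("Media Obsolescence"))
print(classify("Legal Requirements"))
```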
[Table: the same taxonomy of vulnerabilities and threats, with each entry marked T, O and/or C; the individual markings did not survive extraction.]
From the lifecycle context
A taxonomy of vulnerabilities and threats to digital preservation (2/2)
Technology + Organization + Context
= Enterprise Architecture
Technology + Organization + Context
= Enterprise Architecture
http://www.zachmaninternational.com/index.php/the-zachman-framework
“The Zachman Framework is not a methodology for creating the implementation (an instantiation) of the object. The Zachman Framework is the ontology for describing the Enterprise. The Framework (ontology) is a STRUCTURE whereas a methodology is a PROCESS.”
TOGAF - The Open Group Architecture Framework
TOGAF overview
The SHAMAN Reference Architecture
The SHAMAN Reference Architecture Part 1 – Framework, which describes the architectural framework and respective viewpoints;
The SHAMAN Reference Architecture Part 2 – Process, which describes the process for the development of preservation architectures derived from the Reference Architecture;
The SHAMAN Reference Architecture Part 3 – Foundations, which describes the foundations of this work and provides references for the instantiation of concrete architectures;
The SHAMAN Reference Architecture Part 4 – Glossary, which contains definitions for the main terms used in this Reference Architecture.
The SHAMAN Reference Architecture
Viewpoint Framework
Structural View
[Figure: UML class diagram of the meta-model. Packages: Preservation Strategic Planning; Requirements and Conformance; Business Governance; Acting and Operation; System Building and Support (Data, Applications, Technology). Elements: Preservation principle, Principle, Constraint, Assumption, Requirement, Gap, Organization Unit, Actor, Role, Function, Process, Driver, Goal, Objective, Measure, Event, Service Quality, Contract, Preservation Service, Data Entity, Application Component, Technology Component, Platform Service, Policy, Strategy. Relationship labels: governs, orchestrates, applies to, meets, motivates, creates, addresses, realises, sets criteria, performs, tracked against, generates, resolves, assumed by, consumes, supplies, processed by, operates on, implements, implemented on, is according to, sets, determines.]
Architectural Meta-model
…moving from an informal way of expressing (OAIS Reference Model Figure F-1: Composite of Functional Entities)…
…to a more appropriately formal, traceable and objectively represented meta-model…
Moving beyond
1. Preservation Strategic Planning
2. Business Governance
3. Acting and Operation
4. System Building and Support
4.1. Data
4.2. Applications
4.3. Technology
5. Architecture Realization
Requirements and Conformance
The process
Modelling examples (1/4)
UML
[Figure: UML package diagram “oais”. OAIS functional entities: Ingest, Archival Storage, Data Management, Administration, Preservation Planning, Access. External actors: Producer, Consumer, Management. Other elements: Media (from Archival Storage), Database (from Data Management). «flow» items: SIP, AIP, DIP, Descriptive Information.]
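The «flow» relations of this model can also be written down as plain data. The sketch below assumes the endpoints given by the OAIS functional model (the slide text keeps only the flow labels), and the names `Flow`, `FLOWS`, `outgoing` and `incoming` are hypothetical.

```python
from typing import List, NamedTuple


class Flow(NamedTuple):
    source: str
    target: str
    item: str


# Six flows, matching the six «flow» labels on the slide.
FLOWS = [
    Flow("Producer", "Ingest", "SIP"),
    Flow("Ingest", "Archival Storage", "AIP"),
    Flow("Ingest", "Data Management", "Descriptive Information"),
    Flow("Archival Storage", "Access", "AIP"),
    Flow("Data Management", "Access", "Descriptive Information"),
    Flow("Access", "Consumer", "DIP"),
]


def outgoing(entity: str) -> List[Flow]:
    """Flows that leave the given entity."""
    return [f for f in FLOWS if f.source == entity]


def incoming(entity: str) -> List[Flow]:
    """Flows that arrive at the given entity."""
    return [f for f in FLOWS if f.target == entity]


for flow in outgoing("Ingest"):
    print(f"{flow.source} -> {flow.target}: {flow.item}")
```

Keeping the flows as data rather than as a drawing is one simple way to make the model traceable, since each relation can be queried and checked.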
Modelling examples (2/4)
BPMN
[Figure: BPMN diagram “OAIS Core BP”. «BusinessProcess» elements: Ingest, Data Management, Preservation Planning, Archival Storage, Administration, Access. «flow» items: AIP, Descriptive Information.]
Modelling examples (3/4)
UML
[Figure: UML activity diagram “Ingest Activity”. Partitions: Producer, Ingest, Administration, Archival Storage, Data Management. Actions: Submit SIP; Receive submission; Check SIP; Quality assurance; Report request; Generate report; Audit request; Generate audit report; Generate AIP; Generate descriptive info; Storage request; Receive data; Storage confirmation; Database update request; Receive descriptive info; Database update confirmation; Resubmit request. «structured» activities: Coordinate updates, Generate AIP. Objects: :SIP, :QA results, :Report Request, :Audit report, :AIP, :Descriptive info. Decisions: errors?, Report request?, audit?, each with [true] and [false] branches.]
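The main path of this activity (quality assurance, then AIP and descriptive information generation, then the two coordinated updates) can be sketched in a few lines. Everything below is illustrative: the QA rule, the in-memory dictionaries standing in for Archival Storage and Data Management, and all names are assumptions, and the optional report and audit branches are left out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SIP:
    identifier: str
    content: bytes


@dataclass
class Archive:
    storage: dict = field(default_factory=dict)   # stands in for Archival Storage
    database: dict = field(default_factory=dict)  # stands in for Data Management


def quality_assurance(sip: SIP) -> List[str]:
    """Hypothetical QA rule: a SIP needs an identifier and non-empty content."""
    errors = []
    if not sip.identifier:
        errors.append("missing identifier")
    if not sip.content:
        errors.append("empty content")
    return errors


def ingest(sip: SIP, archive: Archive) -> str:
    """Run the main ingest path; return the outcome reported to the Producer."""
    errors = quality_assurance(sip)
    if errors:  # "errors?" decision, [true] branch
        return "resubmit request: " + ", ".join(errors)
    aip = {"id": sip.identifier, "content": sip.content}     # Generate AIP
    info = {"id": sip.identifier, "size": len(sip.content)}  # Generate descriptive info
    archive.storage[aip["id"]] = aip      # storage request
    archive.database[info["id"]] = info   # database update request
    return "ingested"


archive = Archive()
print(ingest(SIP("obj-1", b"payload"), archive))
print(ingest(SIP("", b""), archive))
```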
Modelling examples (4/4)
BPMN
[Figure: BPMN diagram “Ingest”. Pools: Ingest, Archival storage, Data management, Administration. Groups: Co-ordinate updates, Generate AIP. Tasks: Receive SIP from Producer; Check SIP errors; Resubmit request; Quality assurance; Send report request; Receive report request; Generate report; Send report; Receive report; Send audit report request; Receive audit report request; Generate audit report; Send audit report; Receive audit report; Generate AIP; Generate descriptive info; Send database update request; Receive database update request; Update database; Send database update confirmation; Send storage request; Receive storage request; Store AIP; Send storage confirmation. Events: Report received; Audit report received; Database update confirmation received; Storage confirmation received. Data objects: SIP, Quality assurance results, Report, Audit report, AIP, Descriptive info. Gateways: has errors / no errors; Report request? (yes / no); Audit request? (yes / no).]
Deployment example
[Figure: component diagram “test”, with «flow» relations between the following elements.]
• Process Modeling Tool (ex: Enterprise Architect, Eclipse BPMN, XML Editor, Text Editor...)
• Processes: Specification (ex: BPMN, AGWL, UML Activity Diagrams, Petri net, DAG...)
• Process Execution Language Generator
• Process Execution Language: Specification (ex: BPEL, C-GWL, jPDL, ...)
• Service Orchestration (ex: JBOSS jBPM, Apache ODE, ...)
• Policy: Description (ex: Text, MS Word, PDF, XML, ...)
• ESB
• Services (ex: Search & Browse Integration Service, ...)
• Service wrapper (SOAP/REST over HTTP)
• Data Grid (ex: iRODS, ...)
• Database (ex: Oracle, MySQL, ...)
• Legacy Digital Library System (ex: DSPACE, Kopal, ...)
Conclusions
• Digital preservation is a very complex problem!
Therefore:
• We surveyed the main requirements for digital preservation and classified the threats and vulnerabilities that might endanger preservation into a taxonomy.
• We propose the alignment of OAIS with the Enterprise Architecture.
• We propose a process “inspired” by TOGAF to develop preservation architectures.
José Barateiro – [email protected]
Gonçalo Antunes – [email protected]
José Borbinha – [email protected]