Conceptualization and Evaluation of Interoperable and Modular IT-Framework Components for Exchanging
Big Data Information Sets
By
Philipp Urbauer
Supervisor: Stefan Sauermann
Co-supervisor: João Agostinho Batista Lacerda Pavão
Thesis submitted to the
UNIVERSIDADE DE TRÁS-OS-MONTES E ALTO DOURO to obtain the degree of
DOCTOR
in Electrical and Computer Engineering, in accordance with DR I série-No 151, Decreto-Lei No 115/2013 of 7 August, and the
Regulamento Geral dos Ciclos de Estudo Conducentes ao Grau de Doutor, DR, 2ª série-No 133 of 13 July 2016
Stefan Sauermann
Professor at the
Department of Life Science Engineering, University of Applied Sciences Technikum Wien
João Agostinho Batista Lacerda Pavão
Assistant Professor at the
Departamento de Engenharias, Escola de Ciências e Tecnologia, Universidade de Trás-os-Montes e Alto Douro
The members of the Jury recommend that the Universidade de Trás-os-Montes e Alto Douro accept the thesis entitled “Conceptualization and Evaluation of Interoperable and Modular IT-Framework Components for Exchanging Big Data Information Sets”, carried out by Philipp Urbauer, in partial fulfillment of the requirements for the degree of Doctor.
December 2018
President: Professor Doutor Vitor Manuel de Jesus Filipe, Associate Professor with Habilitation (Professor Associado com Agregação) at the Escola de Ciências e Tecnologia,
Universidade de Trás-os-Montes e Alto Douro
Jury members: Professora Doutora Paula Christina Ribeiro Coutinho de Oliveira,
Assistant Professor at the Escola de Ciências e Tecnologia, Universidade de Trás-os-Montes e Alto Douro
FH-Professor Doktor Johannes Martinek, FH-Professor at the Department of Life Science Engineering, University of Applied Sciences Technikum Wien
Professor Doutor Nelson Fernando Pacheco da Rocha, Full Professor at the Departamento de Ciências Médicas, Universidade de Aveiro
Professor Doutor Luís José Calçada Torres Pereira, Assistant Professor at the Escola de Ciências e Tecnologia,
Universidade de Trás-os-Montes e Alto Douro
FH-Professor Doktor Stefan Sauermann,
FH-Professor at the Department of Life Science Engineering, University of Applied Sciences Technikum Wien
Conceptualization and Evaluation of Interoperable and Modular IT-Framework Components for Exchanging
Big Data Information Sets
Philipp Urbauer
Submitted to the Universidade de Trás-os-Montes e Alto Douro in partial fulfillment of the requirements for the degree of
Doctor of Electrical and Computer Engineering
Resumo — Under the term “digitization 2.0”, smartphones, tablets, smart watches and wearable sensors are able to generate huge amounts of data in the context of the Internet of Things. Companies and research institutions are using these huge amounts of data in different research topics. There are expectations that combining data from different domains, for example health, environment or transport, may lead to new findings that improve several aspects of life, such as better treatment of diseases or more efficient service delivery in care. However, a major challenge is the diversity of data formats. The syntactic and semantic requirements associated with the data are very important quality factors for making the data exchangeable and comparable. This work investigates the applicability of interoperability standards and methods from the medical informatics domain to data from other domains, such as transport and environment, to foster exchange and improve data quality by using international standards. To this end, open data platforms were analyzed, and medical interoperability standards and related technologies were collected and selected according to well-defined criteria. As a result, a concept called the “Interoperable BDIS Directory” (IBD)-Profile was developed for the exchange of Big Data Information Sets (BDIS). The IBD-Profile is based on HL7 FHIR and RESTful web services and includes process descriptions and HL7 FHIR resource definitions. In three studies, prototypes were implemented and successfully passed conformance tests with HL7 FHIR validation tools. Finally, the developed concept was verified through an expert review according to IEEE 1028. The expert review confirms that the developed concept is relevant and that the IBD-Profile is a successful first step towards introducing interoperability for the desired objectives. However, further investigation of the concept is needed regarding the integration of streaming requirements and the improved interconnection of distributed data sources and consumers.
Keywords: Interoperability, Standardization, Open Data, Healthcare, Transport and Environment, Health Level 7 (HL7), Integrating the Healthcare Enterprises (IHE)
Conceptualization and Evaluation of Interoperable and Modular IT-Framework Components for Exchanging
Big Data Information Sets
Philipp Urbauer
Submitted to the University of Trás-os-Montes and Alto Douro in partial fulfillment of the requirements for the degree of
Doctor of Electrical and Computer Engineering
Abstract — Under the term “digitization 2.0”, smartphones, tablets, smart watches and wearable sensors are generating huge amounts of data in the context of the Internet of Things. Companies and research institutions are investigating and using these massive amounts of data for research. There are expectations that combining data from different domains, for example healthcare, environment or transport, might lead to new findings that improve several aspects of life, such as better treatment of diseases or a more efficient care path. However, a huge challenge is the diversity of data formats. The related syntactic and semantic requirements represent very important quality factors for making data exchangeable and comparable. This work investigates the applicability of interoperability standards and methods from the medical IT domain to data from other domains, such as transport and environment, to foster exchange and improve data quality by using international standards. To this end, open data platforms and formats were analyzed, and medical interoperability standards and related technologies were collected and selected according to well-defined criteria. Based on that, a concept called the “Interoperable BDIS Directory” (IBD)-Profile was developed for the exchange of Big Data Information Sets (BDIS). The IBD-Profile is based on HL7 FHIR and RESTful web services and includes process descriptions and HL7 FHIR resource definitions. In three technical feasibility studies (transmitting fitness tracker data, pollen exposure data and public transport data), prototypes were implemented and successfully passed conformance tests using HL7 FHIR validation tools. Finally, the developed concept was verified through an expert review according to IEEE 1028. The expert review confirmed the
concept to be meaningful and that the IBD-Profile is a successful first step to introduce interoperability for this purpose. However, the concept should be investigated further regarding the integration of streaming requirements as well as the improved interconnection of distributed data sources and sinks.
Key Words: Interoperability, Standardization, Open Data, Healthcare, Transport and Environment, Health Level 7 (HL7), Integrating the Healthcare Enterprises (IHE)
My gratitude goes to Professor Stefan Sauermann for his rigor in hardening my motivation during my studies and his support all along my path over the last years. I would like to thank Professor João Pavão for his warmhearted kindness and valuable support in all matters, from computer science and geography to history and cultural aspects.
Furthermore, I would like to express my gratitude to Professor Alexander Mense, who supported my path over all the years from the background with patience and farsightedness.
Additionally, I want to thank Professor Luis Pereira for his wisdom, which reaches much further than the Ph.D. context ever could but always leads to a solution, and Professor Rute Bastardo for sharpening my view of the really important things.
I would like to take the opportunity to thank the team of office A4.28 at the UAS Technikum Wien, Matthias Frohner, Veronika David, Mathias Forjan, and Birgit Pohn. Thank you for countless fruitful discussions and for sharing the pain in times when the suffering score reached its maximum level and could certainly not be counted as an outlier.
However, no medical device fulfills quality requirements without ISO 13485 and no house is durable without a stable foundation. Thus, my deepest gratitude goes to my wife Nikolett.
“Tövises az út a csillagokig.”
To all a heartfelt thank you! A todos, um sincero obrigado! Mindenkinek köszönöm! Ein herzliches Danke an euch!
Vienna/Vila Real, Philipp Urbauer
December 20, 2018
Resumo
Abstract
Acknowledgments
Index of Tables
Table of Figures
Glossary, Acronyms and Abbreviations
1 Introduction
1.1 Motivation and Objectives
1.2 Limits and Scope of the Thesis
1.3 Scientific Publications
1.4 Organization of the Thesis
2 Background
2.1 Standard Development Organizations
2.1.1 Institute of Electrical and Electronics Engineers (IEEE) and IEEE 11073 Group of Standards
2.1.2 Integrating the Healthcare Enterprises (IHE)
2.1.3 Personal Connected Health Alliance (PCHA)
2.1.4 Health Level Seven (HL7)
2.2.3 HL7 Fast Healthcare Interoperability Resources (FHIR)
2.3 State-of-the-Art Research and Related Work
2.4 Information Exchange in the Domains of Transport & Environment
2.5 Security & Privacy Perspectives
2.6 Necessity for Consciousness-Raising
3 Materials and Methods
3.1 Investigative Analysis of Data Platforms and Formats
3.2 Selection of Standards and Technical Basis
3.3 Conceptualization: Domain Overview and Components
3.4 Application in the Domains of Health, Environment and Transport
3.4.1 Scenario: Healthcare Data Integration
3.4.2 Scenario: Environmental Data Integration
3.4.3 Scenario: Transport Data Integration
3.5 Concluding Evaluation
4 Results
4.1 Investigative Analysis of Data Platforms and Formats
4.2 Selection of Standards and Technical Basis
4.2.1 User Requirements
4.2.2 Detailed Requirements
4.3 Business Domain Overview and Framework Components
4.3.1 Scenarios
4.3.2 IBD-Profile
4.3.3 BDIS structure and BDIS-Extension definition
4.3.4 Security & Privacy Considerations
4.4 Domain specific Feasibility Studies
4.4.1 Scenario: Healthcare Data Integration
4.4.2 Scenario: Environmental Data Integration
4.4.3 Scenario: Transport Data Integration
4.5 Experts Review
5 Discussion
6 Conclusion and Future Work
Bibliography
Index of Tables
3.1 Shows the selected open data platforms used for investigation of the applied data formats.
3.2 Shows the information about biometric indicators and interfaces for the investigated activity trackers, worn on the person's wrist. Information taken from (Marton et al., 2017).
3.3 Shows example data, received from the MUV pollen database. This data is integrated in the prototype of this feasibility study.
3.4 Shows the content of the questionnaire, which was handed to the experts during the experts review.
4.1 Shows the investigated interoperability standards and frameworks, selected in accordance with the user requirements.
4.2 Provides an overview of SDOs, profiles and standards covering the defined requirements.
4.3 Provides an overview of the IBD-Profile's actors and transactions.
4.4 Shows the standards applied in the described transactions of the IBD-Profile.
4.6 Shows the defined value-set for the wearable activity tracker scenario to support semantic interoperability, published in the proceedings of the DSAI 2018 conference.
4.7 Shows the defined code-system for the pollen exposure integration scenario to support semantic interoperability.
4.8 Shows the defined code-systems for the public transport data integration scenario to support semantic interoperability.
4.9 Shows the data describing the characteristics of the international experts participating in the review at the EU Connect-a-thon 2018 in The Hague.
4.10 Shows the results of the yes/no questions, answered during the international experts review performed at the EU Connect-a-thon 2018 in The Hague.
4.11 Shows the results of the descriptive questions, answered during the international experts review performed at the EU Connect-a-thon 2018 in The Hague.
Table of Figures
2.1 Shows the EIF model, its refinement process and the ReEIF model in accordance with eHealth Network (2015).
2.2 Shows an example IHE Profile, called Consistent Time (CT), whose purpose is to show how any kind of software may synchronize its system time with a time server according to a standardized process.
2.3 Shows the basic structure of a CDA document in accordance with the RIM.
2.4 Shows a truncated example HL7 version 2.6 message for transmission of weight data in the context of the Continua Design Guidelines of the PCHA (Personal Connected Health Alliance (PCHA), 2016).
2.5 Shows an example patient resource to describe the common characteristics of HL7 FHIR resources in general, according to the specification shown in Health Level Seven International (2018b).
2.6 Shows the security principles, described through the CIA-Triad, extended CIA-Triad and Security Star, which shall be considered with technical/non-technical measures.
2.7 Occurrence of thematic sub-areas in certification programs according to the professions within the EU. Published in Urbauer et al. (2014).
3.2 Shows the methodical approach for defining the framework components for the big data domain.
4.1 Shows open data platforms in the EU, listed in table 3.1, and the number of health and transport domain related data sets.
4.2 Shows open data platforms in the EU, listed in table 3.1, and the number of environment domain related data sets.
4.3 Shows the percentage breakdown between the most common file formats applied in the respective domains of health, environment and transport.
4.4 This sequence diagram describes the components and the workflow of the elderly living independently scenario, which focused on the use of wearable fitness tracker data to observe movement and vitality behavior.
4.5 This sequence diagram describes the components and the workflow of the chronic pollen allergy patient support system (CPAPSS), which combines vital parameter data with environmental pollen exposure data.
4.6 This sequence diagram describes the components and the workflow of the medical event triggered route guidance scenario, which combines data from the domains of transport and health.
4.7 Shows the “Interoperable Big Data Information Set (BDIS) Directory” IBD-Profile with its actors and related transactions.
4.8 Shows the interaction diagram for the BDIS-Feed [BDD-01] transaction.
4.9 Shows the interaction diagram for the BDIS-Query [BDD-02] transaction.
4.10 Shows the interaction diagram for the BDIS-Retrieve [BDD-03] transaction.
storage in different forms.
4.12 Shows the meta-data elements in the BDIS-Extension definition.
4.13 Shows the definition of the first sub-extension element, describing the “bdisCategory”.
4.14 Shows the definition of the “bdisGenerationTime” sub-extension element.
4.15 Shows the “bdisContext” sub-extension element from the BDIS-Extension definition.
4.16 Shows the definition of the url to be used for referencing the BDIS-Extension.
4.17 Shows the system architecture of the wearable fitness tracker scenario, based on the work published at DSAI 2018 (Urbauer et al., 2018).
4.18 Shows the contained-elements, describing external resources in-line inside the wearable fitness tracker HL7 FHIR observation resource.
4.19 Shows the implementation of the defined BDIS-Extension used in the wearable fitness tracker data observation resource.
4.20 Shows part three of the wearable fitness tracker HL7 FHIR observation resource, describing general meta-information and the first component including the heart rate data.
4.21 Shows the second component of the wearable fitness tracker HL7 FHIR observation resource, applied to include the “steps per minutes” values.
4.22 Shows the final component, containing the value for “number of seconds of sleep per minute”, for sleep analysis.
4.23 Shows the successful result of the BDIS conform wearable fitness tracker HL7 FHIR observation resource conformance validation by application of the HL7 Validator including the extended HL7 FHIR STU 3.0.1 specification.
(Urbauer et al., 2017).
4.25 Shows the first part of the HL7 FHIR observation resource, including the BDIS-Extension, used for pollen exposure data integration.
4.26 Shows the second part of the HL7 FHIR observation resource, including the component elements holding the meta-data and pollen exposure data.
4.27 Shows the successful result of the BDIS conform pollen exposure HL7 FHIR observation resource conformance validation by application of the HL7 Validator including the extended HL7 FHIR STU 3.0.1 specification.
4.28 Shows the system architecture of the public transport data integration scenario.
4.29 Shows the in-line-defined resources in the public transport HL7 FHIR observation resource.
4.30 Shows the BDIS-Extension used in the public transport HL7 FHIR observation resource.
4.31 Shows meta-data and the first component-element, holding data of a specific public bus with the line number 37A.
4.32 Shows the second component of the public transport HL7 FHIR observation resource, used for transmitting the “steps per minutes” values.
4.33 Shows the successful result of the BDIS conform public transport HL7 FHIR observation resource conformance validation, by application of the HL7 Validator including the extended HL7 FHIR STU 3.0.1 specification.
Glossary, Acronyms and Abbreviations
Glossary
Actor — Actors are pieces of software i.e. modules, which provide necessary functionality to exchange information based on IT-standards. Actors have specific purposes, for example building and storing documents or searching for documents etc. and communicate via ”Transactions”. Actors and their functionality are defined in ”Integration Profiles” together with ”Transactions”.
ATNA — A technical profile from IHE, describing the integration of audits and audit trails in medical information systems. Additionally, authentication on a software component level i.e. module level is described by the profile. It is the basic security profile of IHE, which has to be implemented when developing XD*-based software solutions.
Big Data Information Sets — The term big data refers in general to huge amounts of data. The term Big Data Information Sets (BDIS) was chosen, as for this work only excerpts of data could be taken fulfilling certain quality requirements. These requirements are stated in the respective chapters.
time or time span, including meta-data to describe the context of the data e.g. purpose of data acquisition, size, etc..”
Connect-a-thon — The Connect-a-thon is a testing event carried out by IHE International, as well as by affiliate organizations like IHE Europe or IHE Asia-Oceania, to provide a community-based approach to interoperability conformance testing. This includes no-peer and peer-to-peer tests under the control of test specialists called Monitors. These are technical experts with long-standing experience in the field of IT and testing. They visually observe processes and perform conformance tests, such as exchanging medical documents or running authentication and encryption processes.
DEC — An IHE profile describing the integration process of medical device data, like intensive care units or blood pumps data, into medical IT infrastructure. The profile is located in the Patient Care Device (PCD) technical framework and uses HL7 V2 and V3 as a communication standard.
Integration Profile — Integration Profiles are technical specifications based on international IT standards to solve real world scenarios. Each profile includes descriptions of example scenarios, an Actor & Transaction diagram describing the related processes, information formats and requirements through standards, as well as guidance regarding related security & privacy requirements.
IUA — Is a profile described in the IT infrastructure technical framework of IHE and focuses on the management of security tokens used for authorization purposes related to RESTful based web services. It is strongly connected to OAuth.
OID — Object Identifiers (OIDs) are internationally used, worldwide unique identifiers. OIDs are issued according to ISO/IEC 9834-1. In Austria, OIDs for eHealth are issued by the OID-Portal Austria.
Radiology or Laboratory.
Transaction — Transactions are used for communicating information between ”Actors”. This includes the specification of technologies to be used, for example SOAP based web-services together with WS-Security etc., as well as requirements of structuring this data and adding semantics to it.
XACML — Is an XML based standard for the technical implementation of policies in XML format. This includes a data model for defining XML-policies, but additionally adds actors and a protocol for policy decision making processes and enforcement.
XDR — Describes information exchange of medical data similar to XDS using SOAP-based web services, but with a focus on point-to-point communication in case no highly sophisticated medical IT infrastructure is available.
XDS — A profile from IHE, describing the content-agnostic exchange of health information between enterprises in healthcare (e.g. hospitals) based on web services using SOAP and related WS-* security technologies. The profile describes and explains the processes, the requirements, and the applicable and recommended technologies. HL7 CDA is strongly connected to this profile, as it defines interoperability for medical documents sent by IHE XDS based systems.
XUA — Is a profile in the IT infrastructure technical framework of IHE and focuses on enabling SOAP-based web services (e.g. IHE XDS or XDR) to use authorization data in the form of, e.g., SAML assertions holding the relevant authorization criteria.
Initials Expanded
ABAC Attribute-Based-Access-Control
AHD Application Hosting Device
ANSI American National Standards Institute
ATNA Audit Trail and Node Authentication
BAN Body Area Network
BDD Big Data Domain
BDIS Big Data Information Set
BLE Bluetooth Low Energy
BPPC Basic Patient Privacy Consent
BT Bluetooth
CDA Clinical Document Architecture
CEN Comité Européen de Normalisation
CIA-Triad Confidentiality, Integrity and Availability - Triad
CKAN Comprehensive Knowledge Archive Network
CPAPSS Chronic Pollen Allergy Patient Support System
CT Consistent Time
DEC Device Enterprise Communication
DICOM Digital Imaging and Communications in Medicine
ECG Electrocardiography
EHR Electronic Health Record
EIF European Interoperability Framework
EIRA European Interoperability Reference Architecture
epSOS European Patients Smart Open Services
ETSI European Telecommunications Standards Institute
EU European Union
FHIR HL7 Fast Healthcare Interoperability Resources
GDPR General Data Protection Regulation
HDP Health Device Profile
HIPAA Health Insurance Portability and Accountability Act
HIS Healthcare Information System
HL7 Health Level Seven
HL7 V2 HL7 Version 2
HL7 V3 HL7 Version 3
HTML HyperText Markup Language
HTTP(s) Hypertext Transfer Protocol (Secure)
IBD Interoperable BDIS Directory
ICD International Classification of Diseases
ICP IHE Certified Professional
ICT Information and Communication Technology
IEEE Institute of Electrical and Electronics Engineers
IHE Integrating the Healthcare Enterprises
IMRAD Introduction, Materials & Methods, Results, Discussion and Conclusion
IoT Internet of Things
ISA2 Interoperability Solutions for European Public Administrations 2
ISO International Organization for Standardization
IT Information Technology
ITS Intelligent Transport Systems
IUA Internet User Authorization
JSON JavaScript Object Notation
LOINC Logical Observation Identifiers Names and Codes
MDER Medical Device Encoding Rules
MLLP Minimal Lower Layer Protocol
MS Multiple Sclerosis
MSH Message Header
MUV Medical University of Vienna
NTP Network Time Protocol
OASIS Organization for the Advancement of Structured Information Standards
OBX Observation Segment of OBR
OBR Observation Request Group
OECD Organisation for Economic Co-operation and Development
OSI-Model Open Systems Interconnection Model
PAN Personal Area Network
PCHA Personal Connected Health Alliance
PDR Pollen Data Requester
PHD Personal Health Device
PHDSC Public Health Data Standards Consortium
PHR Personal Health Record
PID Patient Identifier
QRPH Quality, Research and Public Health
RBAC Role-Based-Access-Control
ReEIF Refined eHealth European Interoperability Framework
REST Representational State Transfer
RIM Reference Information Model
SDOs Standards Developing Organizations
SNOMED-CT Systematized Nomenclature of Medicine - Clinical Terms
SOA Service Oriented Architectures
STU Standard for Trial Use
TCP/IP Transmission Control Protocol/Internet Protocol
TISA Traveller Information Services Association
TPEG Transport Protocol Experts Group
UCUM Unified Code for Units of Measure
UML Unified Modeling Language
URL Uniform Resource Locator
VHA Vienna Hospital Association
WHO World Health Organization
WLAN Wireless Local Area Network
X73 ISO/IEEE 11073 Group of Standards
X73-OEP ISO/IEEE 11073-20601 Optimized Exchange Protocol
XDR Cross-Enterprise Document Reliable Interchange
XDS Cross-Enterprise Document Sharing
XD* Cross-Enterprise Family of Profiles (XDS, XDR, XDM, XD-Content Profiles, etc.)
XML Extensible Markup Language
XSLT Extensible Stylesheet Language Transformations
XUA Cross-Enterprise User Assertion
Abbreviation Significance
e.g. exempli gratia, for example
et al. et alii, and other persons
etc. et cetera, and so on
i.e. id est, that means
vid. vide, see also
vs. versus, comparison
1 Introduction
Digitization, i.e. the conversion of text, pictures or sound into a digital format that can be processed by a computer (Oxford Dictionaries, 2018), has accompanied society for years. At present, our society is on the edge of digitization 2.0, and technologies like the Internet of Things (IoT) or blockchain strongly influence this transformation process (Helbing, 2017). Systems and devices in the context of mobile computing, i.e. each smartphone, tablet, smart watch or wearable sensor, can be seen as highly productive sources of data. Examples are crowd-sourced approaches like the use of smartphone ECG applications for the diagnosis of health incidents such as syncope (Nyotowidjojo et al., 2016), or the use of smartphone-based data to compensate for missing infrastructure for floating car systems, i.e. providing individual sensor data from cars in order to decrease traffic (Briante et al., 2014). The EU project COBWEB (COBWE-Project, 2016) investigated and strengthened the potential of crowd-sourced environmental data, focusing on the quality of the data, its bias and risks. These examples show the wide range of applications of these technologies; industry and public services alike generate massive amounts of data. Hence, there is strong evidence that data volumes will grow further through digitization 2.0.
The expression big data is frequently used in this context. There are different definitions of the term “big data”, as shown in Press (2014). In summary, the term refers to extremely large volumes of generated and stored data that cannot be handled with common tools for storage and analysis. As the big data revolution refers not simply to the quantity of the data but, moreover, to the ability to handle and interpret it, the efficient management of Big Data Information Set(s) (BDIS) is of high interest for improving big data analytics outcomes and gaining insights into possibly new hidden values (Shaw, 2014; McKinsey Global Institute, 2011).
Furthermore, combining data silos from different data categories, i.e. domains, is expected to reveal new findings. Examples could be improving crime investigation (Open Data Bits, 2014) or gaining insight into the factors that influence diseases like Multiple Sclerosis (MS). There is first evidence that regional environmental influences have an impact on the frequency of MS relapses (Spelman et al., 2014). Combining collected health and environmental data sets from several geographically different EU regions may provide enormously valuable support for such disease research. Another hypothesis in this context could be that the combination of transportation and environmental data, like customer movement streams, with health data, like the peak of a flu outbreak, may be used to steer and optimize patient flows in hospitals. This may improve the quality and efficiency of treatment for patients while at the same time decreasing costs (Drazen and Rhoads, 2011). However, these are just a few examples of combining large piles of data from different domains, which may lead to additional value for society. Ahmed et al. (2017) investigated the role of big data in IoT and came to the conclusion that it is of high importance to liberate data from data silos in order to support cross-domain approaches, as research topics of the future will likely focus on the combinatorial approach. Moreover, important research challenges are to face the diversity of data, its semantics and its interconnectivity, as these are needed for increasing data quality, reliability and usability (Ahmed et al.,
The term interoperability is concerned with the requirements stated above and is recognized as a very important factor for improving the efficiency and sustainability of data, as indicated by the European Commission (2018c). This is shown in several eHealth-related projects regarding Electronic Health Records (EHR), Personal Health Records (PHR), medical IT systems and Personal Health Device (PHD) communication for interchanging medical information. Examples are the national Electronic Health Record in Austria (ELGA, 2018) as well as the EU-funded project European Patients Smart Open Services (epSOS) (EpSos-Project, 2016), which interconnects national EHRs in the EU. Industry, national and international governments as well as scientific societies see benefits of standardized interoperability for the sustainability of IT systems and for cost savings, as outlined by the European Interoperability Framework (EIF) (European Commission, 2018e).
1.1 Motivation and Objectives
The intention of this thesis is to explore the combination of Big Data Information Sets (BDIS) and to develop a structured method for combining these data sources in a standardized way, so that interoperability between different data sources can be introduced efficiently. Thus, the main research question is:
“Are interoperability standards and methods from the medical information technology domain applicable to other domains to support exchange and interpretation of Big Data Information Sets?”
The thesis is a conclusive study with five main objectives, described in the following. The first objective is to conduct an investigative analysis to get an overview of the open data platforms and their formats. This is followed by objective two, a selection process for applicable medical interoperability standards based on well-defined criteria. Subsequently, objective three is to establish a big data business domain overview and the conceptualization of re-usable, modular and combinable IT framework components for standardized exchange of BDIS. The fourth objective is the execution of feasibility studies in the domains of health, environment and transport as a proof of concept of the framework components and especially of the data syntax and semantics. This includes technical validation, i.e. conformance testing by well-defined state-of-the-art testing procedures using validation tools. Finally, the last objective is to evaluate the concept through an international expert review.
1.2 Limits and Scope of the Thesis
As stated in the first paragraphs of this thesis, big data is ubiquitous and independent of the different domains. Hence, to provide a meaningful context, the following points narrow the focus of the study and define its scope and limits:
- Big data analysis is out of scope as the focus lies on communication interfaces for data exchange between software components
- Three scenarios for health, environment and transport are selected to study feasibility
- A special focus lies on the syntax and semantics of the data exchange and the protocol
- The prototypes are tested in lab environments and the access to data is limited by the data providers. Hence, the data is selected according to defined criteria (i.e. availability, reliability etc.).
- The performance of the components i.e. data throughput and load balancing is out of focus in this work.
Based on the previously stated facts that the big data revolution does not simply refer to the quantity of the data, but moreover to the ability to efficiently manage and interpret BDIS, i.e. providing suitable syntactic structures and semantics through a data model, the following definition of BDIS is used in the context of this thesis:
”An enclosed set of multi-domain data (e.g. domains of health, environment, transport etc.), related to a specific point in time or time span, including meta-data to describe the context of the data, e.g. purpose of data acquisition, size, etc.”
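The definition above can be read as a simple data structure. The following is a minimal sketch under stated assumptions, not a model taken from this thesis: the class name `BigDataInformationSet`, its field names and the sample values are hypothetical and serve only to illustrate the three parts of the definition (multi-domain data, time reference, meta-data).

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List, Optional


@dataclass
class BigDataInformationSet:
    """Hypothetical container mirroring the BDIS definition."""
    # Data grouped by domain, e.g. {"health": [...], "environment": [...]}
    data: Dict[str, List[Any]]
    # Point in time the set relates to, or the start of a time span
    start: datetime
    # End of the time span; None means a single point in time
    end: Optional[datetime] = None
    # Meta-data describing the context, e.g. purpose of acquisition, size
    metadata: Dict[str, Any] = field(default_factory=dict)

    def domains(self) -> List[str]:
        """Return the names of the domains contained in the set."""
        return sorted(self.data)


# Made-up sample values, for illustration only
bdis = BigDataInformationSet(
    data={"health": [{"relapses": 3}], "environment": [{"temperature_c": 21.5}]},
    start=datetime(2018, 1, 1),
    end=datetime(2018, 1, 31),
    metadata={"purpose": "illustration only", "size": 2},
)
print(bdis.domains())
```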
1.3 Scientific Publications
Partial results and preliminary investigations were published in advance of finalizing this thesis. Hence, the contributions are listed here:
- Urbauer, Philipp; Frohner, Matthias; Forjan, Mathias; Pohn, Birgit; Sauermann, Stefan; Mense, Alexander (2012). A Closer Look on Standards Based Personal Health Device Communication: A Résumé over Four Years Implementing Telemonitoring Solutions. European Journal for Biomedical Informatics (EFMI-Journal-2012), Vol. 8 (2012), Issue 3, pp. 65-70, ISSN 1801-5603 (print)
- Urbauer, Philipp; Herzog, Juliane; Pohn, Birgit; Forjan, Mathias; Sauermann, Stefan (2014). Certification Programs for eHealth - Status Quo eHealth. eHealth Summit Austria - Health Informatics meets eHealth (Conference-2014), 22-23 May Vienna, Austria
- Philipp Urbauer, Stefan Sauermann, Matthias Frohner, Mathias Forjan, Birgit Pohn, Alexander Mense (2015). Applicability of IHE/Continua components for PHR systems: Learning from experiences. Computers in Biology and Medicine (Elsevier-Journal-2015), Vol 59, Pages 186-193, ISSN 0010-4825
- Urbauer, Philipp; Maximilian, Kmenta; Frohner, Matthias; Mense, Alexander; Sauermann, Stefan (2017). Propose of Standards based IT Architecture to enrich the Value of Allergy Data by Telemonitoring Data. eHealth Summit Austria - Health Informatics Meets eHealth (Conference-2017), 23-24 May Vienna, Austria
- Philipp Urbauer, Matthias Frohner, Veronika David, Stefan Sauermann (2018). Wearable Activity Trackers Supporting Elderly Living Independently: A Standards based Approach for Data Integration to Health Information Systems. Software Development and Technologies for Enhancing Accessibility and Fighting Info-exclusion (DSAI-Conference-2018), 20-22 June Thessaloniki, Greece
1.4 Organization of the Thesis
The basic structure of this thesis follows IMRAD (Introduction, Materials & Methods, Results, Discussion and Conclusion). Chapter 2 ”Background” provides an overview of SDOs, interoperability standards and security basics from the medical domain as well as state-of-the-art work related to this thesis. Chapter 3 ”Materials & Methods” describes the detailed process for reaching the five defined objectives as well as the infrastructure and components used. Chapter 4 ”Results” includes the results of the data format analysis and the standards selection process as well as the developed concept, containing a business domain overview and the conceptualization of the IT framework components. Furthermore, it describes the results of the executed feasibility studies, their technical validation and the results of the expert review. All results are subsequently discussed in chapter 5 ”Discussion”, and the final conclusion, including future perspectives, completes this thesis in chapter 6 ”Conclusion”.
2 Background
Interoperability describes the ability of heterogeneous IT applications to exchange data accurately, effectively and consistently in a reliable process and plays an important role in healthcare (Jardim, 2013). The focus lies on the interfaces of the IT components. This includes, for example, transmission technologies like Wi-Fi, ZigBee or Bluetooth (BL), but reaches far deeper, into the Application layer of the Open Systems Interconnection model (OSI model). This is because, in typical Internet-based communication procedures in healthcare, the data itself is structured and made semantically interpretable at this layer. Hence, interoperability is a technical quality measure that increases the possibilities of cooperation between healthcare professionals through reliable exchange of medical information (Iroju et al., 2013). According to Tolk et al. (2013), interoperability has seven levels:
- Level 0: No Interoperability: Closed systems which are not interacting with others.
- Level 1: Technical Interoperability: A common infrastructure along with a communication protocol is defined.
- Level 2: Syntactical Interoperability: Adds a specific format to structure the data which is communicated between the systems.
- Level 3: Semantic Interoperability: At this level the meaning of the data is included in the dataset exchanged e.g. by usage of codes from code-lists or value-sets.
- Level 4: Pragmatic Interoperability: This level adds the context of usage of the data i.e. the systems communicating know how the data is used.
- Level 5: Dynamic Interoperability: The communicating systems are able to understand and react to certain effects of operations during the communication process.
- Level 6: Conceptual Interoperability: Includes the technical specification of conceptual models and its proper description, independent of the concrete implementations.
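The seven levels can be written down as an ordered enumeration. The sketch below is illustrative only: the names `InteroperabilityLevel` and `shared_level` are hypothetical, and the rule that two systems can interoperate at most at the lower of their two levels is an assumption based on reading the levels as cumulative (each level "adds" to the previous one).

```python
from enum import IntEnum


class InteroperabilityLevel(IntEnum):
    """Levels 0 to 6 as listed above (after Tolk et al., 2013)."""
    NONE = 0
    TECHNICAL = 1
    SYNTACTICAL = 2
    SEMANTIC = 3
    PRAGMATIC = 4
    DYNAMIC = 5
    CONCEPTUAL = 6


def shared_level(a: InteroperabilityLevel,
                 b: InteroperabilityLevel) -> InteroperabilityLevel:
    """Assumption: two systems interoperate at most at the lower level."""
    return min(a, b)


# Hypothetical example: a gateway that only agrees on message syntax
# communicates with an archive that also supports coded semantics.
gateway = InteroperabilityLevel.SYNTACTICAL
archive = InteroperabilityLevel.SEMANTIC
print(shared_level(gateway, archive).name)
```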
Although level 1 is defined as technical interoperability, separate from the other levels, all levels in this definition strongly focus on technical characteristics and could therefore together be regarded as technical interoperability. However, a closer look at the ISA² program of the EU, which fosters the ”development of digital solutions that enable public administrations, businesses and citizens in Europe to benefit from interoperable cross-border and cross-sector public services” (European Commission, 2018a), reveals a much broader view and impact of interoperability. The new EIF (European Commission, 2018e) is part of the Communication (COM(2017)134), supersedes the former version from 2010 and provides guidance for integrating interoperability into digital public services. The current version includes 47 recommendations, reflecting several changes through new EU policies. The EIF defines four levels of interoperability to support an interoperability-by-design pattern. Furthermore, the EIF, the Interoperability Action Plan and the European Interoperability Reference Architecture (EIRA) are parts of the interoperability governance at the European level. The four levels of interoperability from the EIF (European Commission, 2018d) are:
- Legal Interoperability: Given the diversity of national legal frameworks, policies and strategies of the EU member states, it has to be ensured that organizations can work together. Interoperability checks should be executed to identify interoperability barriers like geographical restrictions, different data license models, restrictions on the use of specific technologies etc. In case of new legislation, requirements and impacts of ICT shall be considered as early as possible and along the whole process. However, data protection legislation needs to be fulfilled at the EU level as well as at the level of all member states concerned.
- Organizational Interoperability: Targets the integration or alignment of business processes and their alignment with responsibilities to achieve commonly agreed goals and benefits. These should be documented with accepted modeling approaches. Furthermore, organizational relationships need to be clearly defined and formalized regarding the establishment and operation of services.
- Semantic Interoperability: Ensures that the format and meaning of the data exchanged between the communicating parties are preserved and understood. This includes syntactic interoperability, i.e. the definition of the grammar and format of the data, and semantic interoperability, i.e. the meaning of data and its relations through the definition of terminologies and schemata. The latter assures that all communicating parties understand the exchanged data in the same way. The proper installation of an information management strategy for the management of meta-data, master data and reference data should minimize the probability of duplicate and fragmented data. The development of specifications for formats, terminologies and processes shall be driven within national or European communities to overcome the challenges of linguistic, cultural, legal and administrative requirements of the different member states.
- Technical Interoperability: Targets interface specifications, data presentation and exchange, and secure communication protocols, which are used by services and applications to link systems. Owing to the historically driven bottom-up development of legacy services in public administration, these systems were built to fulfill a specific aim, which resulted in fragmented ICT landscapes. Therefore, open specifications shall be used whenever and wherever possible.
These definitions show the complexity of interoperability. However, in the medical domain the Refined eHealth European Interoperability Framework (ReEIF) (eHealth Network, 2015) was developed in 2015. In this work the EIF definitions are refined into six levels of interoperability instead of four, as shown in Figure 2.1.
Figure 2.1 – Shows the EIF model, its refinement process and the ReEIF model according to eHealth Network (2015)
The left column shows the four layers of interoperability according to the EIF, i.e. legal, organizational, semantic and technical. In the refinement process these four levels were extended to six levels: organizational is split into the policy and care process levels, and technical into applications and IT infrastructure. The splits were justified by the fact that these levels have different actors and responsibilities and, from a technical interoperability perspective, different classes of standards.
The six levels of interoperability together with the related gray boxes represent the ReEIF model. At the ”Legal and Regulatory level”, legislation and regulations specify the scope and limits of interoperability across borders and also within countries or regions. The ”Policy level” focuses on organizational collaboration and its agreements and governance. The ”Care Process level” refers to integrating and aligning processes between collaborating organizations to realize integrated care pathways and shared workflows. The ”Information level” represents the semantic level, i.e. data models, descriptions of data, terminologies with codes and their linking to data. The ”Applications level” targets the communication standards for data exchange and therefore the import and export in Healthcare Information Systems (HIS). Finally, the ”IT Infrastructure level” refers to communication and network protocols as well as storage, backup and database engines (eHealth Network, 2015).
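The refinement from four EIF levels to six ReEIF levels described above can be summarized as a simple mapping. This is only a sketch of the relationship as stated in the text; the names `EIF_TO_REEIF`, `reeif_levels` and `eif_parent` are hypothetical helpers, not part of either framework.

```python
# EIF level -> ReEIF level(s), following the refinement described above
EIF_TO_REEIF = {
    "Legal": ["Legal and Regulatory"],
    "Organizational": ["Policy", "Care Process"],
    "Semantic": ["Information"],
    "Technical": ["Applications", "IT Infrastructure"],
}


def reeif_levels() -> list:
    """Return all ReEIF levels in top-down order."""
    return [level for levels in EIF_TO_REEIF.values() for level in levels]


def eif_parent(reeif_level: str) -> str:
    """Return the EIF level from which a ReEIF level was derived."""
    for eif_level, refined in EIF_TO_REEIF.items():
        if reeif_level in refined:
            return eif_level
    raise KeyError(reeif_level)


print(reeif_levels())
print(eif_parent("Care Process"))
```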
The previously stated definitions of interoperability demonstrate the complexity hidden behind this concept. Therefore, the main categories of the EIF are considered in this thesis, and a special focus is put on ReEIF ”Applications level” and ”Information level” interoperability in the concept to be developed.
2.1 Standard Development Organizations
Research work focusing on the application or development of standards in the domain of health, but also in the domains of transport and environment, has a clearly specified context and pursues clear aims that are supported by these standards. Standards Development Organizations (SDOs) play a very important role regarding interoperability in medical informatics and the interconnection of medical IT systems (Davis and LaCour, 2014). The Public Health Data Standards Consortium (PHDSC) provides a list of 17 SDOs, reaching from clinical trials, digital images, medical products, emergency data and public health to financial/business transactions and billing (Public Health Data Standards Consortium, 2018). All of the listed SDOs are concerned with developing, maintaining or at least using terminologies for medical devices and IT services used in several medical areas like pharmacy, radiology, laboratory or clinical trials. Moreover, they specify technical communication languages, i.e. at the application layer of the OSI model, and interfaces, and they provide tools for knowledge management like terminology databases. They define processes and workflows to align clinical pathways and thereby improve their efficiency, make them comparable and decrease costs. Furthermore, they provide guidance regarding security measures like encryption and authentication, and additionally provide privacy recommendations like policy implementation and integration. The following examples give a short overview of the manifold areas and kinds of activities of SDOs in the healthcare domain and briefly describe the SDOs that are important for this thesis and its related studies.
2.1.1 Institute of Electrical and Electronics Engineers (IEEE) and IEEE 11073 Group of Standards
IEEE is the world's largest technical professional organization (Institute of Electrical and Electronics Engineers, 2018) and has activities in nearly all areas of technology. In the context of interoperability of medical devices, the ISO/IEEE 11073 health informatics - medical/health device communication standards group develops the 11073 series of standards. Within this family of standards, the focus is the standardized communication between Personal Health Devices (PHD). The ISO/IEEE 11073 (X73) family includes several standards. The most prominent one is ISO/IEEE 11073-10101 (IEEE Standards Association, 2004), which describes the Nomenclature used to exchange data between PHDs and from PHDs to IT infrastructure. This specification strongly fosters semantic interoperability.
However, ISO/IEEE 11073-20601 (Optimized Exchange Protocol) (IEEE Standards Association, 2014) and its device specialization standards are also important. The Optimized Exchange Protocol (X73-OEP) defines a basic structure, encodings and processes for PHD data communication independent of the type of device. Devices such as blood pressure monitors, pulse oximeters or weight scales have device specialization standards, as special requirements arising from their use and the nature of their data are common. A complete list of device specializations and a general overview can be found in ISO/IEEE 11073-00103 (IEEE Standards Association, 2012). In this thesis the standards were used within the feasibility studies from a syntactic and semantic perspective.
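To illustrate how a nomenclature supports semantic interoperability, the sketch below models a few X73 terms. It relies on assumptions that should be checked against ISO/IEEE 11073-10101 itself: that a term is identified by a partition number and a 16-bit term code, that the 32-bit context-free code is computed as partition × 2^16 + term code, and that the three listed reference IDs and numbers match the standard (they are quoted from commonly published examples, not from this thesis). The class name `MdcTerm` is hypothetical.

```python
from typing import NamedTuple


class MdcTerm(NamedTuple):
    """Hypothetical representation of one nomenclature term."""
    ref_id: str      # symbolic reference identifier
    partition: int   # nomenclature partition (assumed: 2 = medical device terms)
    term_code: int   # 16-bit code within the partition

    @property
    def context_free_code(self) -> int:
        # Assumed rule: partition in the upper 16 bits, term code in the lower
        return self.partition * 2**16 + self.term_code


# Values to be verified against the standard before any real use
TERMS = [
    MdcTerm("MDC_PRESS_BLD_NONINV_SYS", 2, 18949),  # systolic blood pressure
    MdcTerm("MDC_PULS_OXIM_SAT_O2", 2, 19384),      # oxygen saturation
    MdcTerm("MDC_MASS_BODY_ACTUAL", 2, 57664),      # body weight
]

for term in TERMS:
    print(term.ref_id, term.context_free_code)
```

Because sender and receiver both refer to the same numeric code, the meaning of a measurement does not depend on device-specific labels.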
2.1.2 Integrating the Healthcare Enterprises (IHE)
Healthcare professionals, universities and industry companies unite under the initiative ”Integrating the Healthcare Enterprises”, IHE for short. The main mission of IHE is to improve the way IT systems in healthcare exchange information by promoting the use and application of medical IT standards (IHE International, 2018a). Therefore, IHE is mainly concerned with interoperability at the conceptual, dynamic and pragmatic levels according to Tolk et al. (2013). IHE additionally provides and collects tools and services that support developing and testing the specifications (IHE International, 2018f,b).
IHE uses a well-defined process (IHE International, 2018e) to reach the aim of improving or introducing interoperability in different fields of healthcare based on scenarios. This approach includes all necessary stakeholders, from healthcare professionals like physicians and nurses to technicians and developers, patients, management and external specialists. In the first phase of the process, the user requirements are collected and documented in the form of scenarios, and fitting standards (i.e. HL7, DICOM, IEEE, OASIS and others) are identified. Based on this, technical specifications are developed for these specific scenarios, i.e. descriptions of communication processes, data exchanged and terminologies needed. These specifications are called integration profiles and are collected in frameworks according to their medical domain, e.g. radiology, pharmacy or laboratory, but also IT infrastructure for systems focusing on medical documentation like EHR or PHR systems, and more.
In phase two, implementations of these integration profiles are subsequently tested extensively before and during a testing event called Connect-a-thon. These tests focus strongly on interoperability (no-peer and peer-to-peer tests) and are done with the IHE Gazelle testing framework (IHE International, 2018b). The tested products receive an integration statement and can then, in phase three, be applied and integrated in existing health IT systems. However, the process is iterative: a feedback loop ensures that observations like problems or missing coverage of requirements are fed back into the documentation of the requirements. This approach improves the quality of the specifications, supports knowledge transfer between all stakeholders, lowers the burden of implementation and decreases costs in the long run. IHE connects more than 135 organizations in its activities (IHE International, 2018d).
The basics of IHE terminology are necessary to understand the conceptual approaches in this thesis; therefore, the most important terms are briefly explained:
- IHE Actors: IHE Actors, in short actors, are pieces of software, e.g. modules, which use IHE interface specifications for communication. Actors are defined on a scenario level and communicate via IHE transactions. Some actors can be used in several IHE integration profiles, although they are defined once in a concrete integration profile. Furthermore, actors can be grouped together, which allows the combination of integration profiles in a Lego-like approach. An example could be to integrate a profile for requesting patient demographic data together with a profile for requesting proper patient identifiers, when communication takes place between different IT systems like radiology and laboratory information systems.
- IHE Transactions: IHE transactions are used between actors to communicate and exchange information. A transaction has a specific purpose and is based on interoperability standards used to define structure and semantics. Nevertheless, transactions can have different manifestations, e.g. requesting documents requires the ability to formulate different query parameters. Transactions, similar to actors, are sometimes used in more than one profile, e.g. in direct and uni-direct communication of patient information, as this is defined in different profiles.
- IHE Integration Profiles: An integration profile is the technical specification based on requirements collected from the different stakeholders and well-defined scenarios (see 4.3.1). Therefore, an integration profile is a collection of actors and transactions that describes the processes necessary to promote interoperability in a specific scenario. This includes a narrative description of the purpose of the profile, its application forms, its implications and boundaries. Furthermore, an integration profile includes overviews of and links to the standards used in the processes and transactions, as well as flow charts, sequence diagrams and transaction diagrams describing how the actors and transactions work together.
- IHE Frameworks: IHE frameworks are collections of documents which include the specifications for different scenarios in a medical domain, e.g. exchange of documents between organizations or integration of medical devices in enterprise IT systems. These documents commonly follow a schema of ”Volumes”: Vol.1 includes an overview, describing a specific scenario from a management perspective. That includes actors, their connections via transactions and the different manifestations, e.g. a pediatric option when demographic information of a woman is transmitted from a general hospital to an infant clinic. Vol.2 describes the transactions at a detailed level and therefore targets engineers. Vol.3 describes the interconnections of integration profiles through transactions and the content transmitted in the transactions. Vol.4 focuses on semantic interoperability and aims at describing codes and terminologies. However, not all volumes are necessarily defined and used in all existing IHE domain-specific frameworks.
Figure 2.2 shows a graphical example of using Actors and Transactions in a profile, which is stored in the IT-Infrastructure Technical Framework of IHE. The IHE Profile Consistent Time (CT) is shown, whose purpose is to show how any kind of software may synchronize its system time with a time server according to a standardized process. It has two Actors: the Time Client, the software module requesting time synchronization, and the Time Server, the component centrally providing time for synchronization with several system components. These two Actors use the so-called ”Maintain Time” Transaction to perform the time synchronization using the Network Time Protocol (NTP).
Figure 2.2 – Shows an example IHE Profile, called Consistent Time (CT), whose purpose is to show how any kind of software may synchronize its system time according to a standardized process with a time server.
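The relationship between actors, transactions and integration profiles can be sketched as a small data model, using the Consistent Time profile as the example. The classes `Actor`, `Transaction` and `IntegrationProfile` and their fields are hypothetical simplifications for illustration and are not taken from the IHE technical frameworks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Actor:
    name: str


@dataclass(frozen=True)
class Transaction:
    name: str
    initiator: Actor
    responder: Actor
    standard: str  # base standard the transaction relies on


@dataclass(frozen=True)
class IntegrationProfile:
    name: str
    actors: tuple
    transactions: tuple

    def transactions_of(self, actor: Actor) -> list:
        """Return the names of all transactions an actor takes part in."""
        return [t.name for t in self.transactions
                if actor in (t.initiator, t.responder)]


# The Consistent Time (CT) profile as described in the text
time_client = Actor("Time Client")
time_server = Actor("Time Server")
maintain_time = Transaction("Maintain Time", time_client, time_server, "NTP")

consistent_time = IntegrationProfile(
    name="Consistent Time (CT)",
    actors=(time_client, time_server),
    transactions=(maintain_time,),
)

print(consistent_time.transactions_of(time_client))
```

Because actors are plain values, the same `Actor` instance could be listed in a second profile, which mirrors the grouping of actors across profiles mentioned above.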
2.1.3 Personal Connected Health Alliance (PCHA)
An important part of interoperability in healthcare is the integration of medical devices. The PCHA focuses on the seamless integration of a completely interoperable measurement chain from the medical device to the professional medical information system (Personal Connected Health Alliance (PCHA), 2018b). The PCHA focuses on main activity fields like ”elderly living independently”, ”chronic diseases” and ”health & fitness”. However, the focus lies on the personal environment of patients or health-conscious persons. The PCHA is concerned with the conceptual, dynamic and pragmatic interoperability levels according to Tolk et al. (2013) and takes a similar approach to IHE (see 2.1.2). Based on its main fields of activity, it provides the Continua Guidelines for implementation guidance (Personal Connected Health Alliance (PCHA), 2016), which are basically the specifications for organizations interested in implementing parts of or a complete IT system. For this purpose, it provides a reference architecture, which is the basis for building an interoperable system and describes the interfaces and communication processes between the PHDs, apps on mobile platforms like smartphones, tablets or black-box systems, and medical IT infrastructure components. The PCHA furthermore provides a test suite and implementation support for members (Personal Connected Health Alliance (PCHA), 2018c). Additionally, a testing event called Plugfest is conducted, where the implementing organizations execute interoperability tests in the form of no-peer and peer-to-peer tests.
A major difference between the approaches of IHE and PCHA is that the PCHA provides certification of tested devices. This implies that the devices are tested in special test labs by authorized test organizations. In case of success, the devices are allowed to use the PCHA label and are published in the showcase database, which can be found in Personal Connected Health Alliance (PCHA) (2018a). However, IHE and PCHA work closely together, as the PCHA uses IHE specifications, but also standards from IEEE, HL7 and others, in its guidelines to integrate health-related data in medical information systems.
2.1.4 Health Level Seven (HL7)
Health Level Seven International is one of the core SDOs in healthcare IT, providing standards and frameworks for the exchange, integration, sharing and retrieval of electronic health information in several aspects. HL7 is concerned with interoperability at the syntactical, semantic, pragmatic, dynamic and conceptual levels according to Tolk et al. (2013). HL7 has affiliates in 50 countries and more than 1600 members (Health Level Seven International, 2014). According to Health Level Seven International (2018c), the provided standards are grouped into the reference categories primary standards, foundational standards, clinical and administrative domains, EHR profiles, implementation guides, rules and references, and education & awareness.
The most general and prominent HL7 standard is the Clinical Document Architecture (CDA), which focuses on the definition of XML-based clinical documents, e.g. laboratory reports. Similarly important is HL7 messaging in version 2 (HL7 V2) and version 3 (HL7 V3), which both focus on the exchange of clinical and administrative data in intramural and extramural contexts in the form of syntactically and semantically defined messages. The most recent standard is called HL7 Fast Healthcare Interoperability Resources (HL7 FHIR), which is a modern approach that conforms to today's dynamic and lightweight requirements of the IT sector (Health Level Seven International, 2018c). These core standards are integrated and used to a large extent in PCHA guidelines as well as in IHE frameworks and integration profiles. An example of this application is the integration of PHD data into telehealth service centers. This is done by using HL7 V2/V3 messages based on the IHE profile Device Enterprise Communication (DEC) from the Patient Care Device domain, i.e. technical framework. This latter example shows that standards are combined. All of these standards are described in more detail in 2.2, as they are fundamental for this thesis.
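To make the idea of a syntactically and semantically defined message more concrete, the following sketch builds and parses a single HL7 V2 observation segment (OBX) carrying a blood pressure value coded with X73 nomenclature. It is a deliberately reduced illustration under several assumptions: a real message contains further segments (e.g. a message header), escaping of delimiter characters is ignored, the helper names `build_obx` and `parse_obx` are hypothetical, and the MDC codes should be verified against ISO/IEEE 11073-10101.

```python
FIELD_SEP = "|"      # separates fields of a segment
COMPONENT_SEP = "^"  # separates components within a field


def build_obx(set_id, code, ref_id, value, unit_code, unit_ref_id):
    """Build a minimal numeric OBX segment (no escaping, no sub-ID)."""
    fields = [
        "OBX",
        str(set_id),                                      # OBX-1 set ID
        "NM",                                             # OBX-2 value type: numeric
        COMPONENT_SEP.join([str(code), ref_id, "MDC"]),   # OBX-3 observation identifier
        "",                                               # OBX-4 sub-ID, left empty
        str(value),                                       # OBX-5 observation value
        COMPONENT_SEP.join([str(unit_code), unit_ref_id, "MDC"]),  # OBX-6 units
    ]
    return FIELD_SEP.join(fields)


def parse_obx(segment):
    """Split a segment produced by build_obx back into its parts."""
    fields = segment.split(FIELD_SEP)
    code, ref_id, system = fields[3].split(COMPONENT_SEP)
    unit_ref_id = fields[6].split(COMPONENT_SEP)[1]
    return {"code": code, "ref_id": ref_id, "system": system,
            "value": float(fields[5]), "unit": unit_ref_id}


# Made-up measurement value; codes assumed from commonly published examples
segment = build_obx(1, 150021, "MDC_PRESS_BLD_NONINV_SYS", 120,
                    266016, "MDC_DIM_MMHG")
print(segment)
print(parse_obx(segment))
```

The field positions provide the syntax, while the coded identifier and unit provide the semantics, which is the combination of standards referred to above.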
2.2 Documents, Messages and Resources: Interoperability in eHealth
For the interoperable exchange of data between information systems, syntactical and semantic requirements are of high importance. In medical information technology the standards CDA, HL7 v2 and HL7 v3 messaging, and HL7 Fast Healthcare Interoperability Resources (FHIR) are of particular importance when it comes to these requirements. The general difference between these standards lies in the purpose for which the exchanged data is used. CDA focuses mainly on the definition of medical documents, which includes the aspects of persistence, wholeness, stewardship, context relation and human readability of data. The HL7 v2/v3 messaging standards, on the other hand, are message based, i.e. data is transported according to specifications, but may or may not be stored in an unspecified way, and the message may or may not be terminated. HL7 FHIR takes another approach and is an upcoming alternative to the document- and message-related approaches according to Brull (2013) and Bresnick (2018). In this case, data is structured in resources (i.e. like modules) which hold specific information but can have relationships, e.g. one resource for patient information and one for clinical trial data, which are (or may be) linked together. HL7 FHIR is a modern approach, whereas the others are well-known and widely applied specifications with their drawbacks, e.g. XML-based clinical documents can get extremely large through the markup. The modular approach of the HL7 FHIR standard supports integration with CDA documents as well. Nevertheless, the difference between documents, messages and resources should be kept in mind. The following sections briefly describe the HL7 standards, as some aspects are fundamental to this work.
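The resource-based approach can be illustrated with two linked resources represented as plain dictionaries in the JSON style used by HL7 FHIR. This is a simplified sketch: the text's example pairs patient information with clinical trial data, whereas an Observation is used here for brevity; the identifiers and values are made up, and the in-memory `store` with the `resolve` helper is a hypothetical stand-in for a FHIR server.

```python
import json

# One resource holding patient information (made-up content)
patient = {
    "resourceType": "Patient",
    "id": "example-patient",
    "name": [{"family": "Doe", "given": ["Jane"]}],
}

# A second resource holding a measurement, linked to the patient by reference
observation = {
    "resourceType": "Observation",
    "id": "example-weight",
    "status": "final",
    "code": {"text": "Body weight"},
    "subject": {"reference": "Patient/example-patient"},
    "valueQuantity": {"value": 72.5, "unit": "kg",
                      "system": "http://unitsofmeasure.org", "code": "kg"},
}

# Hypothetical in-memory store keyed by (resource type, id)
store = {(r["resourceType"], r["id"]): r for r in (patient, observation)}


def resolve(reference: str, resources: dict) -> dict:
    """Follow a relative reference of the form 'Type/id'."""
    resource_type, resource_id = reference.split("/")
    return resources[(resource_type, resource_id)]


linked = resolve(observation["subject"]["reference"], store)
print(json.dumps(linked, indent=2))
```

Each resource stays small and self-describing, and the relationship between them is expressed only through the reference, in contrast to a document that bundles everything into one unit.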
2.2.1 HL7 Clinical Document Architecture (CDA)
The Clinical Document Architecture (CDA) is a standard specified and developed by HL7 International and is part of the HL7 version 3 standards. CDA is an XML-based document-markup standard, which allows highly structured clinical documents to be developed for laboratory reports, radiology reports and any other kind of health-related document (Dolin et al., 2006). The first version, CDA release 1.0, was proposed in 2000. The current release 2.0 was published in 2005 as an official ANSI standard and was later accredited as an ISO standard in 2008 (Rodrigues et al., 2016).
Apart from narrative text, which may or may not be structured, pictures, sound files or other media can additionally be integrated in a CDA document. CDA is a well-adopted standard, which fulfills the necessary characteristics for documents in the healthcare sector. According to Müller et al. (2005) and Ferranti et al. (2006), this includes that a CDA document can exist in an unaltered state (Persistence) and can be maintained by an institution or person (Stewardship). Furthermore, a CDA document can provide a complete context to understand its content, condensed in one assembly of connected information (Context, Wholeness), and enables authentication by providing the ability to record or attest the signature of a healthcare professional responsible for the content (Potential for authentication). Finally, CDA provides the ability to present its information in human-readable form (Human Readability), as the CDA-XML containing the data can be displayed by applying Extensible Stylesheet Language Transformation (XSLT).
As CDA is part of the HL7 version 3 standards family, its specification is based on the Reference Information Model (RIM). Basically, a CDA document consists of a header element and a body element, where the latter is again structured in sub-elements like sections and entries. Figure 2.3 shows the basic structure of a CDA document in accordance with the RIM.
Figure 2.3 – Shows the basic structure of a CDA document in accordance to the RIM.
The CDA-header includes, among other things, information related to the involved organisations and persons like physicians, nurses etc. Furthermore, it incorporates meta-information describing the document itself, like the identifier of the document, the category of the document, time-related information and relations to other documents. This information is in a highly structured format connected with fixed semantic definitions, as these are crucial requirements for searching, exchanging and interpreting these documents in a medical information system like an EHR system or HIS (Boone, 2011).
The CDA-body includes the relevant clinical information, whereby at least narrative text describing the desired clinical context is included. This information can furthermore be provided in structured form to improve interoperability and further automated processing of its content. CDA proposes three levels of interoperability, depending on how the data is structured and described (HL7 Austria, 2013):
- CDA-Level 1: Level 1 focuses on human readability; data is therefore integrated in narrative, unstructured form or as an embedded document such as a PDF or text file. The data is stored in the CDA body.
- CDA-Level 2: Level 2 provides the possibility to structure this data in the body by using “sections” with a defined meaning, e.g. diagnoses or allergies. These sections are designated with codes to provide identification and improved processing by IT systems.
- CDA-Level 3: Level 3 enables the automated processing of information by IT systems, as detailed information inside “entries” is designated with codes, e.g. ICD-10 and LOINC codes for observations and UCUM codes for physical units. This is done in addition to the narrative text, which is still contained in a specified XML element. However, this requires templates to be developed for defining sets of rules.
The specification of the CDA standard provides XML elements which can be used to structure the body and are commonly referred to as “body structures”. As shown in Figure 2.3, the body is separated into thematic blocks called sections, which themselves can contain nested sections. Structures such as lists or tables are applicable, which are labeled as “paragraph”, “content”, “caption”, “table” or “list”. Sections always include a narrative text for human interaction. The computer readable parts are called CDA-entries (see Figure 2.3). These are used to encode information from the narrative text by using classes and attributes from the RIM. Example elements for entries are “observation”, i.e. a measurement such as a blood pressure measurement, “procedure”, i.e. a surgery, “observationMedia” for figures, graphs etc., or “supply” for medication. The standard allows these elements to be linked recursively or linearly. However, the RIM provides a huge set of elements for individual setup and definition of needs, and describing them would go beyond the scope of the overview given in this section. The complete RIM can be found in Health Level Seven International (2018e).
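The relation between coded sections (Level 2) and coded entries (Level 3) can be illustrated with a minimal sketch in Python. The CDA fragment below is a hypothetical, heavily truncated example written for this illustration only; it omits the header and is not a valid CDA document. The function name `read_sections` and the measured value are assumptions, while the element names, the HL7 v3 namespace and the LOINC codes for vital signs and body weight follow common CDA usage.

```python
import xml.etree.ElementTree as ET

# Namespace used by HL7 v3 / CDA documents.
NS = {"hl7": "urn:hl7-org:v3"}

# Hypothetical, truncated CDA body: one coded section (Level 2) holding
# narrative text plus one coded observation entry (Level 3).
SAMPLE_CDA = """\
<ClinicalDocument xmlns="urn:hl7-org:v3"
                  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <component>
    <structuredBody>
      <component>
        <section>
          <code code="8716-3" codeSystem="2.16.840.1.113883.6.1"
                codeSystemName="LOINC" displayName="Vital signs"/>
          <title>Vital signs</title>
          <text>Body weight: 72.5 kg</text>
          <entry>
            <observation classCode="OBS" moodCode="EVN">
              <code code="29463-7" codeSystem="2.16.840.1.113883.6.1"
                    codeSystemName="LOINC" displayName="Body weight"/>
              <value xsi:type="PQ" value="72.5" unit="kg"/>
            </observation>
          </entry>
        </section>
      </component>
    </structuredBody>
  </component>
</ClinicalDocument>
"""


def read_sections(cda_xml):
    """Return section code, narrative text and coded observations per section."""
    root = ET.fromstring(cda_xml)
    sections = []
    for section in root.iterfind(".//hl7:section", NS):
        code = section.find("hl7:code", NS)
        text = section.find("hl7:text", NS)
        observations = []
        # Only observation entries are handled in this sketch.
        for obs in section.iterfind("hl7:entry/hl7:observation", NS):
            obs_code = obs.find("hl7:code", NS)
            value = obs.find("hl7:value", NS)
            observations.append({
                "code": obs_code.get("code"),
                "value": float(value.get("value")),
                "unit": value.get("unit"),  # UCUM unit
            })
        sections.append({
            "section_code": code.get("code") if code is not None else None,
            "narrative": "".join(text.itertext()).strip() if text is not None else "",
            "observations": observations,
        })
    return sections


if __name__ == "__main__":
    for section in read_sections(SAMPLE_CDA):
        print(section)
```

The narrative text remains available for human readers, while the coded observation carries the same information in a form that an IT system can process without interpreting free text.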
2.2.2 HL7 Messaging (v2/v3)
HL7 version 2 messaging is a standard that has been continuously developed since 1989, which is reflected in several versions ranging from 2.1 to 2.8.2. When referring to all versions, the term HL7 v2.x is used (Health Level Seven International, 2018c). These versions are backwards compatible, and the main purpose of HL7 messaging is communication inside healthcare institutions to support clinical, administrative, logistic and financial workflows. HL7 v2.x messages are text-based messages with a specific syntax for structuring the information to be transmitted (Rodrigues, 2010). Figure 2.4 shows a truncated HL7 version 2.6 message for transmitting weight data in the context of the Continua Design Guidelines from the PCHA:
Figure 2.4 – Truncated example of an HL7 version 2.6 message for the transmission of weight data in the context of the Continua Design Guidelines of the PCHA (Personal Connected Health Alliance (PCHA), 2016).
HL7 v2.x messages are separated into segments, as demonstrated by MSH, PID or OBX. Each of these segments contains related information in the form of fields and is terminated with a carriage return. The first segment of a message is the message header, starting with its abbreviation “MSH”. PID denotes the patient identification segment; OBX stands for an observation segment and is part of an observation request group (OBR). As already stated, each of these segments includes specific fields which carry the relevant information and are specified by the v2 standard. Fields are separated by a pipe-character. The segment abbreviation is always at position 0, e.g. PID-0 contains the abbreviation “PID”, followed, counting from left to right, by the PID fields 1..n. These fields may be optional, conditional or required depending on each segment’s specification. Information inside a field is separated by the circumflex-character. An example like “Urbauer^Philipp^^^^^” shows the separation of family name and given name. If there is no information between two circumflexes, the corresponding component is not provided, i.e. in this example there is no second name, suffix, prefix or academic degree. Using the ampersand-character instead of the circumflex-character allows subgroups to be specified. The same approach is used to add semantics to these messages. This is shown by the SNOMED-CT code, which describes that the purpose of the OBR is the monitoring of a patient. Finally, the beginning of the MSH segment specifies these and other special characters used for separation: the field separator “|” directly after the abbreviation, followed by the encoding characters “^~\&”. The message itself is transmitted via TCP/IP, packed into the Minimal Lower Layer Protocol (MLLP), which can simply be described as adding header and trailer characters to the message.
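The separation rules described above can be sketched in a few lines of Python. This is a simplified illustration and not a complete HL7 v2.x parser: repetitions (“~”) and escape sequences (“\”) are ignored, and the sample message, including all identifiers, the local observation code and the measured value, is hypothetical and is not the message shown in Figure 2.4. The function names are chosen for this sketch only. The MLLP framing uses the start block character 0x0B and the end block characters 0x1C and 0x0D.

```python
# Hypothetical, shortened HL7 v2.6 message; all identifiers and values are invented.
SAMPLE_MESSAGE = "\r".join([
    "MSH|^~\\&|SendingApp|SendingFacility|||20180101120000||ORU^R01^ORU_R01|MSG0001|P|2.6",
    "PID|||12345^^^Hospital&1.2.3.4&ISO^PI||Urbauer^Philipp^^^^^",
    "OBX|1|NM|WEIGHT^Body weight^LOCAL||72.5|kg",
]) + "\r"

MLLP_START = b"\x0b"    # start block (vertical tab)
MLLP_END = b"\x1c\x0d"  # end block (file separator) plus carriage return


def parse_segments(message):
    """Return a list of (segment_name, fields) where fields[n] is field n."""
    segments = [s for s in message.split("\r") if s]
    if not segments or not segments[0].startswith("MSH"):
        raise ValueError("message must start with an MSH segment")
    field_sep = segments[0][3]  # the character directly after "MSH"
    parsed = []
    for segment in segments:
        fields = segment.split(field_sep)
        if fields[0] == "MSH":
            # MSH-1 is the field separator itself; re-insert it so that the
            # list index matches the field number as in all other segments.
            fields.insert(1, field_sep)
        parsed.append((fields[0], fields))
    return parsed


def split_components(field, comp_sep="^", sub_sep="&"):
    """Split one field into components and each component into subcomponents."""
    return [component.split(sub_sep) for component in field.split(comp_sep)]


def mllp_wrap(message, encoding="utf-8"):
    """Frame a message with the MLLP header and trailer characters."""
    return MLLP_START + message.encode(encoding) + MLLP_END


def mllp_unwrap(frame, encoding="utf-8"):
    """Remove the MLLP framing from one complete frame."""
    if not (frame.startswith(MLLP_START) and frame.endswith(MLLP_END)):
        raise ValueError("not a complete MLLP frame")
    return frame[len(MLLP_START):-len(MLLP_END)].decode(encoding)


if __name__ == "__main__":
    segments = dict(parse_segments(SAMPLE_MESSAGE))
    family, given = split_components(segments["PID"][5])[:2]
    print(family[0], given[0])                     # name components of PID-5
    print(split_components(segments["PID"][3])[3])  # subcomponents split at "&"
    print(mllp_wrap(SAMPLE_MESSAGE)[:1], mllp_wrap(SAMPLE_MESSAGE)[-2:])
```

In a real transmission the framed bytes would be written to a TCP socket, and the receiver would read until the end block characters are found before unwrapping and parsing the message.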
HL7 v2 messaging is very widely used for the communication between IT systems inside healthcare institutions, and its very pragmatic approach supports quick solutions. However, its drawback is that this approach leads to inconsistencies in the data due to different regional modeling approaches. With the publication of HL7 version 3 (Health Level Seven International, 2018c), the aim was to cover all communication needs derived from the healthcare system and to specify format, content, semantics and processes for messaging purposes (Huang et al., 2003, 2005). As HL7 v3 messaging is part of the v3 family of standards, it is, similar to CDA, based on the RIM as this is