Chapter 2 Related Work

2.2. Modeling and Querying Evolution in Semistructured Data

2.2.2. Temporal approaches

interval in a temporal XML graph, the query performance is dramatically increased. To achieve this, a new class of summaries, denoted TSummary, is introduced, which adds the time dimension to the well-known path summarization schemes. Within this framework, two new summaries are presented: LCP and Interval summaries. The indexing scheme TempIndex integrates these summaries with additional data structures. A query processing strategy based on TempIndex is presented, as well as a type of ancestor-descendant encoding, denoted temporal interval encoding. A persistent implementation of TempIndex is also presented, together with a comparison against a system based on a non-temporal path index and one based on DOM.

Finally, a language for updates is sketched, and it is shown that the cost of updating the index is compatible with real-world requirements.

In Gao and Snodgrass (2003) [26], a temporal XML query language, τXQuery, is presented.

The authors add valid time support to XQuery by minimally extending the syntax and semantics of XQuery. The goal is to move the complexity of handling time from the user/application code into the τXQuery processor. It is worth noting that the approach may also apply to transaction time queries. τXQuery utilizes the data model of XQuery. The few reserved words added to XQuery indicate three different kinds of valid time queries.

Representational queries have the same semantics as in XQuery, ensuring that τXQuery is upward compatible with XQuery. To write such queries, users have to know the representation of the timestamps and treat each timestamp as an ordinary element or attribute.

New syntax for current and sequenced queries makes these queries easier to write. A current query asks for information about the current state. Sequenced queries are applied independently at each point in time. To implement τXQuery, the stratum approach is adopted, in which a stratum accepts τXQuery expressions and maps each to a semantically equivalent conventional XQuery expression. The XQuery expression is then passed to an XQuery engine.

Once the XQuery engine obtains the result, the stratum possibly performs some additional processing and returns the result to the user. The advantage of this approach is that it exploits the existing techniques in an XQuery engine, such as query optimization and query evaluation, while not depending on a particular XQuery engine. The paper focuses on how to perform this mapping, in particular on mapping sequenced queries, which are by far the most challenging. The central issue in supporting sequenced queries (in any query language) is time-slicing the input data while retaining period timestamping.

Timestamps are distributed throughout an XML document, complicating the temporal slicing.

To address this, the authors propose four optimizations of the initial maximally-fragmented time-slicing approach: selected node slicing, copy-based per-expression slicing, in-place per-expression slicing, and idiomatic slicing, each of which reduces the number of constant periods over which the query is evaluated.
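The idea of evaluating a sequenced query over constant periods can be illustrated with a minimal Python sketch. It is not the mapping used by τXQuery: the integer time points, the half-open periods, and the helper names `constant_periods`, `timeslice`, and `sequenced_query` are assumptions made for illustration only. The sketch cuts the time line at every timestamp boundary (the maximally-fragmented case) and evaluates a snapshot query once per resulting period.

```python
from typing import Callable, List, Tuple

# Assumption: time points are integers and periods are half-open [start, end).
Period = Tuple[int, int]
Item = Tuple[str, Period]  # (value, validity period), a hypothetical flat input


def constant_periods(timestamps: List[Period]) -> List[Period]:
    """Cut the time line at every period boundary (maximal fragmentation)."""
    boundaries = sorted({t for period in timestamps for t in period})
    return list(zip(boundaries, boundaries[1:]))


def timeslice(items: List[Item], period: Period) -> List[str]:
    """Return the values whose validity covers the whole constant period."""
    start, end = period
    return [value for value, (s, e) in items if s <= start and end <= e]


def sequenced_query(items: List[Item], snapshot_query: Callable[[List[str]], object]):
    """Evaluate a snapshot query independently on each constant period."""
    periods = constant_periods([p for _, p in items])
    return [(p, snapshot_query(timeslice(items, p))) for p in periods]


items = [("a", (1, 5)), ("b", (3, 8))]
print(sequenced_query(items, len))
# [((1, 3), 1), ((3, 5), 2), ((5, 8), 1)]
```

Passing only the timestamps of the items a query actually touches to `constant_periods` yields fewer periods, which is the general effect the four optimizations aim for; how each optimization selects those timestamps is not modeled here.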

In Wang and Zaniolo (2003) [62], the authors present techniques for managing multiversion documents and supporting temporal queries on such documents. The proposed approach consists of a temporally grouped data model that represents the successive versions of a document as a single XML document, named V-Document. Using XML query languages such as XQuery, complex queries can be expressed both on the content of a particular version and on the temporal evolution of the document elements and their contents. The paper also discusses the advantages of applying the proposed scheme to XML-published relational data. Finally, efficient implementations of the approach are discussed.

In Wang and Zaniolo (2008) [63], the authors further extend and elaborate on the concepts presented in Wang and Zaniolo (2003) [62]. A number of case studies are performed; XChronicler, a tool for building V-Documents from the successive versions of arbitrary XML documents, is presented; and techniques for efficient storage and retrieval are discussed.
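A temporally grouped representation can be sketched in Python as follows. This is only an illustration under simplifying assumptions, not the XChronicler algorithm: each version is a flat dictionary of string fields, timestamps are ISO date strings, and the attribute names `vstart` and `vend`, the `now` marker, and the functions `build_vdocument` and `snapshot` are chosen here for illustration. Each distinct value of a field becomes one element carrying its validity interval, so unchanged values are stored once across versions.

```python
import xml.etree.ElementTree as ET

NOW = "now"  # assumed marker for values that are still current


def build_vdocument(versions):
    """Merge (timestamp, {field: value}) versions, sorted by timestamp."""
    root = ET.Element("vdocument")
    open_elems = {}  # field name -> element whose interval is still open
    for ts, fields in versions:
        # Close the interval of every field that changed or disappeared.
        for name, elem in list(open_elems.items()):
            if fields.get(name) != elem.text:
                elem.set("vend", ts)
                del open_elems[name]
        # Open a new interval for every new or changed field.
        for name, value in fields.items():
            if name not in open_elems:
                elem = ET.SubElement(root, name, vstart=ts, vend=NOW)
                elem.text = value
                open_elems[name] = elem
    return root


def snapshot(doc, t):
    """Reconstruct the version valid at time t (intervals are half-open)."""
    return {
        e.tag: e.text
        for e in doc
        if e.get("vstart") <= t and (e.get("vend") == NOW or t < e.get("vend"))
    }


versions = [
    ("2001-01-01", {"title": "Engineer", "dept": "R&D"}),
    ("2002-06-01", {"title": "Senior Engineer", "dept": "R&D"}),
]
doc = build_vdocument(versions)
print(ET.tostring(doc, encoding="unicode"))
print(snapshot(doc, "2001-12-31"))
```

In this sketch the history of `title` is available as two timestamped elements, while `dept` appears once with an open-ended interval, so both version snapshots and evolution queries can be answered from the same document.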

In Moon et al. (2008) [41], the authors work on the problem of managing the history of database information. Specifically, they propose the PRIMA system, which employs two key technologies. The first is a method for publishing the history of a relational database in XML, whereby the evolution of the schema and its underlying database are given a unified representation. This temporally grouped representation makes it easy to formulate sophisticated historical queries on any given schema version using standard XQuery. For this, the authors build upon and extend previous work presented in Wang and Zaniolo (2003) [62].

The second key technology is that schema evolution is transparent to the user. A user writes queries against the current schema, while retrieving the data from one or more schema versions. The system then performs the labour-intensive and error-prone task of rewriting such queries into equivalent ones for the appropriate versions of the schema. This feature is particularly important for historical queries spanning different schema versions. To realize this feature in PRIMA, Schema Modification Operators (SMOs) are introduced to represent the mappings between successive schema versions, and an XML integrity constraint language (XIC) is used to efficiently rewrite the queries using the constraints established by the SMOs. The scalability of the approach has also been tested.
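The role of SMOs in query rewriting can be illustrated with a deliberately small Python sketch. It does not reproduce PRIMA's XIC-based rewriting: the only operator modeled is a hypothetical `RenameColumn`, a query is reduced to the list of column names it references, and `rewrite_to_version` is a name introduced here. The sketch shows how a query written against the newest schema can be translated to an older version by undoing the recorded operators in reverse order.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RenameColumn:
    """Hypothetical SMO: a column renamed between two successive versions."""
    old: str
    new: str


def rewrite_to_version(columns: List[str], smos: List[RenameColumn],
                       target_version: int) -> List[str]:
    """Translate column names from the newest schema to target_version.

    Assumption: smos[i] maps schema version i to version i + 1.
    """
    names = list(columns)
    # Undo the operators applied after target_version, newest first.
    for smo in reversed(smos[target_version:]):
        names = [smo.old if n == smo.new else n for n in names]
    return names


# Version 0 -> 1 renames sal to salary; version 1 -> 2 renames name to full_name.
smos = [RenameColumn("sal", "salary"), RenameColumn("name", "full_name")]
query_columns = ["full_name", "salary"]  # written against version 2

print(rewrite_to_version(query_columns, smos, 0))  # ['name', 'sal']
print(rewrite_to_version(query_columns, smos, 1))  # ['name', 'salary']
```

A historical query spanning several schema versions would, under these assumptions, be rewritten once per version it touches, with the user only ever seeing the current column names.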

In Dyreson (2001) [19], the TTXPath data model and query language are sketched. TTXPath extends XPath with support for transaction time. To construct the TTXPath data model, snapshots of an XML document are obtained over time. The snapshots are then merged and transaction times are associated with each edge and node. The TTXPath query language extends XPath with temporal axes to enable a query to access past or future states, and with constructs to extract and compare times. TTXPath maximally reuses XPath and is fully backwards-compatible with XPath.

From the document Managing Evolution in Web Data through Complex Changes (pages 41-44)