
Query Optimization Under Bag And Bag-set Semantics For Multiple Heterogeneous Data Sources


Academic year: 2024


In particular, for relational queries, we describe the requirements that a set of views must meet in order to provide an equivalent rewriting of a CQ under bag and bag-set semantics. In a query optimization setting, the main requirement is efficient query rewriting for a given set of queries.

Motivation scenarios

Organization of thesis

Additionally, the problems of query rewriting using views and view selection are described through a detailed analysis of related work. The containment and equivalence problems for XPath queries are also presented through a detailed analysis of related work.

Operators

However, duplicates are not eliminated in the result of a projection applied to a bag. If we project onto the second attribute of the (set) relation edge(D) using bag operators, it is important to note that the result is a bag.
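The behaviour of bag projection can be illustrated with a minimal sketch. The relation contents and the helper name `bag_project` are hypothetical, chosen only for illustration; a bag is modelled as a `Counter` that maps each tuple to its multiplicity.

```python
from collections import Counter

def bag_project(bag, positions):
    """Project every tuple onto the given positions, keeping duplicates."""
    result = Counter()
    for row, multiplicity in bag.items():
        projected = tuple(row[i] for i in positions)
        result[projected] += multiplicity
    return result

# Hypothetical set-valued relation edge(D): every tuple occurs exactly once.
edge = Counter({(1, 2): 1, (3, 2): 1, (2, 4): 1})

# Projecting onto the second attribute yields a bag, not a set:
# the value 2 is produced by two different tuples.
second_attribute = bag_project(edge, [1])
```

Although the input is a set, the projected value 2 carries multiplicity 2 in the output, which is exactly why the result has to be treated as a bag.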

Expressions and substitutions

The definitions of common generalization and lgg of a set of expressions imply a simple condition on the form of the expressions. The construction of the lgg using the function hlgg is also illustrated above (recall that a, b and c are constants).
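As a rough illustration of what a least general generalization looks like, the sketch below computes a Plotkin-style anti-unification of two flat atoms. It is not the thesis's hlgg function: the representation of atoms as `(predicate, arguments)` pairs, the helper name `lgg_atoms`, and the fresh variable names `Z1`, `Z2`, ... are assumptions made for this example, and the fresh names are assumed not to clash with names already in use.

```python
def lgg_atoms(atom1, atom2):
    """Return the least general generalization of two flat atoms, or None.

    Atoms are (predicate, args) pairs. Positions where both atoms agree
    are kept; every distinct pair of disagreeing terms is replaced by one
    fresh variable, reused whenever the same pair occurs again.
    """
    pred1, args1 = atom1
    pred2, args2 = atom2
    if pred1 != pred2 or len(args1) != len(args2):
        return None  # no common generalization over the same predicate
    fresh = {}
    generalized = []
    for s, t in zip(args1, args2):
        if s == t:
            generalized.append(s)
        else:
            if (s, t) not in fresh:
                fresh[(s, t)] = f"Z{len(fresh) + 1}"
            generalized.append(fresh[(s, t)])
    return (pred1, tuple(generalized))

# a, b and c are constants; X is a variable shared by both atoms.
g1 = lgg_atoms(("p", ("a", "X", "b")), ("p", ("c", "X", "b")))
g2 = lgg_atoms(("p", ("a", "a")), ("p", ("b", "b")))
```

Reusing one fresh variable per pair of terms is what keeps the result least general: generalizing p(a, a) and p(b, b) gives p(Z1, Z1) rather than the weaker p(Z1, Z2).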

Queries and Views

Semantics of Queries

For any given set-valued database instance D of S, evaluating Q under set semantics, we have that for every tuple t in Q(D) there exists at least one valuation θ from Q to D such that t = θ(head(Q)). Bag-set semantics assumes that the database instance is set valued, whereas under bag semantics the database instance is bag valued.
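The difference between the three semantics can be made concrete with a small evaluator. This is a minimal sketch under stated assumptions: variables are strings starting with an uppercase letter, the head contains only variables, a database maps each predicate to a `Counter` of tuples, and the names `valuations` and `evaluate` are hypothetical.

```python
from collections import Counter

def is_var(term):
    return isinstance(term, str) and term[:1].isupper()

def valuations(body, db, theta=None):
    """Yield (valuation, multiplicity) for every way to satisfy the body."""
    theta = theta or {}
    if not body:
        yield dict(theta), 1
        return
    (pred, args), rest = body[0], body[1:]
    for row, mult in db.get(pred, Counter()).items():
        new = dict(theta)
        for term, value in zip(args, row):
            if is_var(term):
                if new.setdefault(term, value) != value:
                    break  # variable already bound to a different value
            elif term != value:
                break      # constant does not match
        else:
            for full, m in valuations(rest, db, new):
                yield full, mult * m

def evaluate(head, body, db, semantics):
    """Evaluate a CQ under 'set', 'bag-set' or 'bag' semantics."""
    if semantics in ("set", "bag-set"):
        # Both semantics assume a set-valued database instance.
        db = {p: Counter(dict.fromkeys(rows, 1)) for p, rows in db.items()}
    answer = Counter()
    for theta, mult in valuations(body, db):
        answer[tuple(theta[v] for v in head)] += mult
    return set(answer) if semantics == "set" else answer

# Q(X) :- edge(X, Y) over a hypothetical bag-valued instance.
db = {"edge": Counter({(1, 2): 2, (1, 3): 1})}
head, body = ("X",), [("edge", ("X", "Y"))]
```

Under set semantics the answer is {(1,)}; under bag-set semantics the tuple (1,) has multiplicity 2, one per valuation; under bag semantics it has multiplicity 3, because the stored multiplicities of the matched tuples are multiplied in.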

Containment and Equivalence

Deciding conjunctive query containment

Hence, the existence of a containment mapping is a necessary and sufficient condition for conjunctive query containment under set semantics. In summary, the existence of a containment mapping is a necessary and sufficient condition for set containment of CQs.
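The test stated above can be sketched as a backtracking search for a mapping from Q1 to Q2 that sends the head of Q1 to the head of Q2 and every subgoal of Q1 to some subgoal of Q2. The query representation (a `(head, body)` pair with uppercase strings as variables) and the function names are assumptions made for this example; the search is exponential in the worst case, in line with the NP-completeness of the problem.

```python
def is_var(term):
    return isinstance(term, str) and term[:1].isupper()

def extend(mapping, src_args, dst_args):
    """Extend mapping so that src_args maps onto dst_args, or return None."""
    if len(src_args) != len(dst_args):
        return None
    new = dict(mapping)
    for s, d in zip(src_args, dst_args):
        if is_var(s):
            if new.setdefault(s, d) != d:
                return None  # variable already mapped elsewhere
        elif s != d:
            return None      # constants must be preserved
    return new

def containment_mappings(q1, q2):
    """Yield every containment mapping from q1 to q2."""
    (head1, body1), (head2, body2) = q1, q2
    start = extend({}, head1, head2)
    if start is None:
        return

    def search(i, mapping):
        if i == len(body1):
            yield mapping
            return
        pred, args = body1[i]
        for pred2, args2 in body2:
            if pred2 == pred:
                nxt = extend(mapping, args, args2)
                if nxt is not None:
                    yield from search(i + 1, nxt)

    yield from search(0, start)

def set_contained(q2, q1):
    """True if q2 is contained in q1 under set semantics."""
    return next(containment_mappings(q1, q2), None) is not None

# Q1(X) :- e(X, Y)            Q2(X) :- e(X, Y), e(Y, Z)
q1 = (("X",), [("e", ("X", "Y"))])
q2 = (("X",), [("e", ("X", "Y")), ("e", ("Y", "Z"))])
```

Here Q2 is contained in Q1 (every node that starts a path of length two has an outgoing edge), while the converse containment fails because no mapping can send both subgoals of Q2 to the single subgoal of Q1.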

Conjunctive query equivalence

There is an isomorphic containment mapping from Qm1 to Qm2 if and only if Q1 ≡bs Q2. It is easy to see that the mapping µ is an isomorphic containment mapping from P1m to P2m.

Further related work on query containment and equivalence

Propositions 6 and 7 show that the bag and bag-set equivalence problems for CQs are equivalent to the graph isomorphism problem [CV93]. Therefore, deciding bag or bag-set equivalence of CQs is in NP [GJ79].

Answering queries using views

Useful views for rewriting a CQ

Therefore, a view definition whose body cannot be mapped to a subexpression of the query body cannot be used in any equivalent rewriting of the query. Regarding view V4, it is easy to verify that there is neither a substitution nor a mapping from body(V4) to a subexpression of Q; hence 3 and Lemma 1 imply that there is no (bag, bag-set, or set) equivalent rewriting of Q that uses it.

On constructing equivalent rewritings

More specifically, the set (also bag and bag-set) equivalent rewriting of Q using only view V2 is the following. Thus, there is no set (or bag or bag-set) equivalent rewriting of Q using only V5 and V6.

Selecting conjunctive views

Determining the search space of solutions

However, if we consider queries without self-joins in the given workload, the set-oriented view selection problem becomes tractable. This algorithm, called CGALG [ACGP07], first constructs all view definitions whose body is a generalization of a subexpression of the body of a query in the workload.

Related work on selecting views

Previously, the authors of [HRU96] proposed algorithms for selecting views in the case of data cubes and studied the complexity of the problem. For each set-valued database instance of S the following holds: the answer to Q is a set.

Table 3.1: Complexity results for the CQ containment problem.

Contained Queries without self-joins

However, there are two containment mappings from Q1 to Q2 (the first subgoal of Q1 maps to the first subgoal of Q2, and the other two subgoals of Q1 both map to either the second or the third subgoal of Q2), so all subgoals of Q2 are covered by the containment mappings from Q1. Then the following holds: Q2 ⊑b Q1 if and only if there is a containment mapping from the subgoals of Q1 to Q2.

Generalized-Pathstar Queries

Pathstar queries

We will prove that if Q2 ⊑bs Q1 then there is a variable-onto containment mapping from Q1 to Q2. Moreover, all the other paths of Q1 (not existing in Q2) must be mapped to prefixes of paths of Q2, since otherwise there would be no containment mapping.

Simple generalized-pathstar queries

Let Q1 be a simple generalized-pathstar query and Q2 a simple pathstar query of the same arity n, with n ≥ 1. Then Q2 ⊑b Q1 if and only if for every subset S′ of d-stars of Q2 and the set S of the corresponding d-stars of Q1 and the n-stars N1,.

Other syntactic restrictions

As a consequence of Proposition 1, we know that for every variable X of Q2 there is a containment mapping µ from Q1 to Q2 such that X ∈ µ(Q1). (If part) Proposition 2 implies that if there is a variable-onto containment mapping from Q1 to Q2, then Q2 ⊑bs Q1.

Conclusions and Future Work

In this chapter, we focus on both the bag-oriented and the bag-set-oriented versions of the view selection problem. Additionally, in the case of bag-set semantics, we formally describe the form of the views in bag-set equivalent rewritings.

Space of optimal solutions under bag semantics

On restricting the space of admissible viewsets

The optimal solution in Example 31 uses views constructed using generalizations of subexpressions of the queries. If there is a solution for P, then there is an optimal solution Λ = (V,R) for P such that the body of each view in V is either a subexpression of the body of a query in Q, or an lgg of two or more subexpressions of the bodies of queries in Q.

Minimum set of distinguished variables in candidate rewritings

In case (1), Proposition 14 suggests that the set of binding variables is the smallest set of variables that must appear in the head of a view definition so that the view can be used in an equivalent rewriting of a query. When the body is a proper lgg, the binding variables alone are not sufficient to determine the head variables of the view definition.

LGG-VSB Algorithm

The following statement summarizes the results of Proposition 13 and Proposition 14 and shows that if there is a solution to a given bag-oriented problem input, there exists an optimal viewset containing only subexpression views and lgviews. If there is a solution to P, then there exists an optimal solution Λ = (V,R) such that every view in V is a subexpression view or an lgview of subexpression views of queries in Q.

Space of Optimal solutions under bag-set semantics

Useful viewsets for rewriting CQs under bag-set semantics

In particular, duplicate subgoals of a CQ or of the expansion of a rewriting do not affect the existence of a bag-set equivalent rewriting. If there is any optimal solution to P, then there exists an optimal solution Λ = (V,R) to P such that the body of any view in V is either a subexpression of the body of a CQ in Q or a d-lgg of subexpressions of the bodies of CQs in Q.

Chain and Path queries

Chain-query workload

Now, if we consider the viewsets V1 = {V1, V2}, V2 = {V3}, V3 = {V1, V12} and V4 = {V1, V21}, we can easily verify that these are the only viewsets that contain chain views (i.e., views whose definitions are given by a chain query), give a bag-set equivalent rewriting of Q, and do not have any redundant views (i.e., we cannot drop any view and still obtain a viewset that gives a bag-set equivalent rewriting of Q). However, the viewset V6 is admissible for P (we can easily verify that this viewset is optimal for P), since V6(D) contains 3 tuples and gives an equivalent rewriting of Q.

Path-query workload

Therefore, constructing the definitions of $IR_i$ and $IR'_i$, we have that there is a substitution $h$ on the expansion of $IR_i$ such that $h(\mathrm{body}(IR_i^{exp})) = \mathrm{body}({IR'_i}^{exp})$ and $\mathrm{vars}(\mathrm{head}(IR_i)) \subseteq \mathrm{vars}(h(\mathrm{head}(IR'_i)))$. Furthermore, notice that the path groups above are related to the length divisions of P3.

Conclusions and Future Work

Semantics of Patterns

In addition, the result of applying a pattern to a tree can be thought of as a set of references to nodes of the tree or as a . Practically, the result of applying a pattern to a tree is calculated with respect to an ordering of nodes of the tree.

Figure 5.4: P is a pattern in XP{//,∗}

Core Pattern

For example, the tree t′ in Figure 5.4 is in CMod2(P) of the pattern P, where the circled node indicates the output of P. Here we note that a core path can specify either the exact distance between the images of two core nodes (if all edges in the core path are child edges) or the minimum acceptable distance (if there is at least one descendant edge in the core path).
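The distance constraint expressed by a path of this kind can be sketched as follows. This is an illustrative reading only: the path is modelled as a list of edge labels, with `"/"` assumed to denote an edge that fixes one level of depth and `"//"` an edge that allows one or more levels, and the function names are hypothetical.

```python
def core_path_distance(edges):
    """Return (length, exact) for a path given as a list of edge labels.

    length is the number of edges; exact is True only when every edge is
    "/", in which case the distance between the two images is fixed.
    """
    if any(e not in ("/", "//") for e in edges):
        raise ValueError("edge labels must be '/' or '//'")
    return len(edges), all(e == "/" for e in edges)

def distance_allowed(edges, actual):
    """Check whether an actual distance in a tree satisfies the path."""
    length, exact = core_path_distance(edges)
    return actual == length if exact else actual >= length
```

For instance, `distance_allowed(["/", "/"], 3)` is false because the path fixes the distance at 2, whereas `distance_allowed(["/", "//"], 5)` is true because a single `"//"` edge turns the length into a lower bound.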

Containment and Equivalence of sets of patterns

Deciding containment and equivalence of patterns

Then the existence of a homomorphism-like mapping that also compares the numbers decorating the descendant edges is sufficient to decide containment. Considering now the isomorphism h from N(cP1) to N(cP2), we have that both conditions of the definition of d-homomorphism hold.

Figure 5.6: P1 and P2 are equivalent

Containment of unions

More specifically, considering two equivalent patterns P1 and P2 in XP, the definition of equivalence implies that every node of a tree that is an image of out(P1) through an embedding e1 is also an image of out(P2) through an embedding e2. Q′1, Q′2 of patterns also in XP{//,[ ],∗} such that deciding the containment Q2 ⊑ Q1 reduces to checking containment between each pattern of Q′1 and a pattern of Q′2.

Figure 5.9: Translation of patterns

Complexity of containment problem

In the case where the patterns in both sets Q1 and Q2 are in XP{//,[ ]}, deciding containment is much easier. In particular, containment of unions can be decided in PTIME when all patterns in both unions are in XP{//,[ ]} [ZLWL09].

Further related work on containment and equivalence of patterns

In this case, the authors show that an intersection of patterns can be translated into an equivalent union of patterns in XP{//,[. Therefore, they showed that for the patterns in this fragment, the equivalence decision between two intersections is properly reduced to deciding whether two pattern unions are equivalent.

Answering a pattern using a single view

Continuing Example 54, the composition of the pattern R1 and the view V1 is also illustrated in Figure 5.10. In [ACG+09], several cases are studied in which at least one of the natural candidates is a rewriting.

Figure 5.10: Rewritings of a pattern

Otherwise, if the reachable path length is greater than 4, n is in the result of P2. Since we can decide whether there is a homomorphism between two patterns in XP{//,∗} in polynomial time with respect to the sizes of the patterns [MS04], we conclude that deciding Q2 ⊑ Q1 is in PTIME w.r.t.

Deciding equivalence of patterns in XP{//,∗}

Let P be a pattern such that there is a core path p in P with at least one descendant edge. In addition, we can see that the first core path of P1 has only one descendant edge, unlike its image in P2, which has two descendant edges appearing in different positions.

Containment of unions in the fragment XP{//,∗}

Then, replacing P by the 2-unrolling of (n1, n2) we obtain the three patterns P1, P2 and P3, which constitute the second instance of U (illustrated in the second column). Consider now a set Q of patterns in XP{//,∗} and the length kmax of the longest core path appearing in the patterns of Q.

Figure 6.4: The 1-unrolling of n1, n2, n3 of P4, illustrated in Figure 6.2

More specifically, the patterns in the unrolling set of P must be isomorphic to the patterns of a subset of the unrolling set of Q. Moreover, an additional requirement is that every pattern in the unrolling set of Q must be contained in P.

Figure 6.8: A union of single-view rewritings is required

Solving the problem of the existence of a union rewriting

This is implied by the existence of a homomorphic suffix of type 3 in the input set of views. Therefore, the natural candidates of a pattern in XP{//,∗} are not sufficient to find a possible union rewriting of the pattern using a set of views.

Figure 6.9: Homomorphic suffixes of a pattern in XP{//,∗}
