Cláudia Antunes 2006
2 Sistemas de Apoio à Decisão by Cláudia Antunes
3 Sistemas de Apoio à Decisão by Cláudia Antunes
Cubos
um cubo é a estrutura de dados usada para manipular os dados que instanciam um modelo multidimensional as arestas do cubo correspondem às diferentes dimensões
Date Prod uct L oc al TV VCRPC D1 D2 … Dn Lisbon … New York Célula: montante de vendas para os VCRs vendidos no D1 em NY a intersecção dos valores de cada dimensão (células) expressam a agregação dos factos registados na combinação dessas condições
4 Sistemas de Apoio à Decisão by Cláudia Antunes
Cubos
os cubos são criados para pré-calcular as agregações mais frequentes, para acelerar a exploração de dados
existem vários cubos para representar um só modelo O cubo que representa os factos ao mais baixo nível de detalhe designa-se por cubo base
O cubo ao mais alto nível de
detalhe por cubo apíce Date
Prod uct L oc al TV VCRPC 1989 1990 … 2000 Portugal … USA Agregação: montante de vendas para todos os produtos vendidos no ano 1989
em Portugal
Célula
5 Sistemas de Apoio à Decisão by Cláudia Antunes
When you create and use summarized tables,
application performance can be substantially
improved.
The summarized tables eliminate the need to run
summaries at execution time.
The presummarized data must be refreshed only
when the underlying data changes.
Every time that the user moves in the application a
new summarization must often be calculated.
However, each summarization could be stored
within a cube, which eliminates the processing
time of calculating every type of summarization.
6 Sistemas de Apoio à Decisão by Cláudia Antunes
! "
#
$
Online Analytical Processing (OLAP) is an
industry-accepted reporting technology that provides
high-performance analysis and easy reporting on large
volumes of data, represented as cubes.
Goals:
to provide fast and flexible access to data summarization, to view trends over time,
to look at different relationships in data by looking beyond traditional two-dimensional row and column data analysis.
7 Sistemas de Apoio à Decisão by Cláudia Antunes
Roll-up (drill-up)
summarize data
by climbing up hierarchy or by
dimension reduction
Date Prod uct L oc al TV VCRPC D1 D2 … Dn Lisbon … New York Date Prod uct L oc al TV VCRPC 19891990 … 2000 Lisbon … New York Roll-up (Date) 8 Sistemas de Apoio à Decisão by Cláudia AntunesDrill-down (roll down):
reverse of roll-up, detailing data from higher
level summary to lower level summary or
detailed data, or introducing new dimensions
Date Prod uct L oc al TV VCRPC D1 D2 … Dn Lisbon … New York Date Prod uct L oc al TV VCRPC 1989 1990 … 2000 Lisbon … New York Drill-down(Date) 9 Sistemas de Apoio à Decisão by Cláudia Antunes
Drill-through:
Detailing data from higher level summary to
lower level summary in a single member
Date Prod uct L oc al TV VCRPC
Jan89 Feb89…Dec89
Lisbon … New York Date Prod uct L oc al TV VCRPC 1989 1990 … 2000 Lisbon … New York Drill-through(1989) 10 Sistemas de Apoio à Decisão by Cláudia Antunes
Slice:
Select portions of data in a single dimension
Date Prod uct L oc al TV VCRPC D1 D2 … Dn Lisbon … New York Date Prod uct L oc al TV VCRPC D1 D2 … Dn Lisbon Slice(Local=Lisbon)
Dice:
Select portions of data in conjunction of dimensions
Date Prod uct L oc al TV VCRPC D1 D2 … Dn Lisbon … New York Date Prod uct L oc al TV VCRPC D1 Lisbon
Pivot:
Visualization of the hyper-dimensional cube in 2D
planes
Date Prod uct L oc al TV VCRPC D1 D2 … Dn Lisbon … New York 2 1 Paris 1 2 New York 1 1 Lisbon Lo ca l TV 2 New York 2 Lisbon Lo ca l PC P ro du ct 2004 2003 2002 2001 2000 1999 Date%
&
14 Sistemas de Apoio à Decisão by Cláudia Antunes
! "
$
The Multidimensional eXpressions (MDX)
language is a standardized, high-level
language that is used for querying
multidimensional data sources
Is considered the multidimensional
equivalent to SQL
As SQL returns a subset of two-dimensional
data from tables, MDX returns a subset of
multidimensional data from cubes
15 Sistemas de Apoio à Decisão by Cláudia Antunes
' (
Cube Dimensions Measures Levels Members Hierarchies Private Dimension 16 Sistemas de Apoio à Decisão by Cláudia Antunes17 Sistemas de Apoio à Decisão by Cláudia Antunes
[ ] The square brackets encircle cube
names, dimension names, hierarchy names,
and member names
. The dot separates the member names and
the dimension names
Example of a Member: [Time].[2ndhalf].[3rdquarter]
18 Sistemas de Apoio à Decisão by Cláudia Antunes
' (
Tuples
Tuples list dimensions and members to identify
individual
cells
, as well as a larger section of
cells in the cube.
Tuples that encompass more than one cell are
called
slices
.
( )
Parentheses encircle tuples.
19 Sistemas de Apoio à Decisão by Cláudia Antunes
' (
Examples of Tuples
A slice: ([Time].[2ndhalf]) A dice: ( [Time].[2ndhalf],
[Source].[Nonground].[Air] )
20 Sistemas de Apoio à Decisão by Cláudia Antunes
' (
Sets
A set is an ordered collection of zero, one or more
tuples.
{ } Curly brackets encircle sets.
{([Time].[1sthalf], [Source].[nonground].[air]),
(Time.[2nd half], [Source].[nonground].[sea]) }
: The colon defines a range in sets
{([Time].[1sthalf].[1stquarter]) :([Time].[2ndhalf].[4thquarter]) }
, The comma separates members within
tuples and separates tuples within sets.
21 Sistemas de Apoio à Decisão by Cláudia Antunes
)
(
The components that an MDX query needs in order to extract the requested information are
a component to specify the column headers a component to specify the row headers a pointer to the cube being accessed. General form of the SELECT statement: SELECT
{set defining the column headers} ON COLUMNS, {set defining the row headers} ON ROWS FROM [cube name]
A SELECT statement is used to select the dimensions and members to be returned and is referred to as an axis dimension. An axis dimension is expected to return data for multiple members.
22 Sistemas de Apoio à Decisão by Cláudia Antunes
)
(
The WHERE clause is used to restrict the returned
data to specific dimension and member criteria – a
slicer dimension. A slicer dimension is expected to
return data for a single member.
General form of the query with the addition of the
WHERE clause:
SELECT
{set defining the column headers} ON COLUMNS,
{set defining the row headers} ON ROWS
FROM [cube name]
WHERE (slicer)
)
(
Example
SELECT
{ ( [Route].[nonground].Members ) } ON COLUMNS, { ( [Time].[1st half].Members ) } ON ROWS
FROM TestCube WHERE [Measures].[Packages] 5078 5108 1stquarter sea air
)
(
Named Sets
A named set is an alias for a defined set.
Calculated Members
Members that are based on evaluated expressions in MDX computed at runtime based on data that already exists in the cube.
Advantages:
It helps to clarify the logic Increase the ease of maintenance.
Increases the efficiency of query execution because the set is only executed once at the beginning, stored, and
25 Sistemas de Apoio à Decisão by Cláudia Antunes
)
(
Use the WITH keyword as part of the SELECT
statement to create a named set or a calculated
member
Creating Named Sets:
WITH SET set_name AS 'set'
SELECT
{set defining the column headers} ON COLUMNS,
{set defining the row headers} ON ROWS
FROM [cube name]
WHERE (slicer)
The named set created using the WITH SET
statement is available only for that query.
26 Sistemas de Apoio à Decisão by Cláudia Antunes
)
(
Named Set example
WITH SET ngMembers AS ‘{([Route].[nonground].Members)} ' SELECT
[ngMembers] ON COLUMNS,
{ ( [Time].[1st half].Members ) } ON ROWS FROM TestCube
WHERE [Measures].[Packages]
27 Sistemas de Apoio à Decisão by Cláudia Antunes
)
(
Creating Calculated Members:
WITH MEMBER member-name AS
'value-expression'
SELECT
{set defining the column headers} ON COLUMNS,
{set defining the row headers} ON ROWS
FROM [cube name]
WHERE (slicer)
The <member_name> value is the fully qualified
name of the calculated member, including the
dimension or level to which the calculated member
is associated.
28 Sistemas de Apoio à Decisão by Cláudia Antunes
)
(
Calculated Members example
WITH MEMBER [Measures].[avgPacks] AS ‘avg([Measures].[Packages]) ’
SELECT
{ ( [Route].[nonground].Members ) } ON COLUMNS, { ( [Time].[1st half].Members ) } ON ROWS
FROM TestCube
WHERE [Measures].[avgPacks]
29 Sistemas de Apoio à Decisão by Cláudia Antunes
*
Children: returns the children of a member.
Members: returns the set of all members in a
dimension, hierarchy, or level.
CrossJoin: returns the cross product of two
sets.
TopCount: returns a specified number of items
from the top of a set, optionally ordering the
set first.
30 Sistemas de Apoio à Decisão by Cláudia Antunes
*
CrossJoin Example
SELECT
CROSSJOIN ({[Time].[1st half].Members},
{[Route].[nonground].Members}) ON COLUMNS, { ( [Source].[Eastern hemisphere].Members ) } ON ROWS FROM TestCube
31 Sistemas de Apoio à Decisão by Cláudia Antunes
*
TopCount Example
SELECT
TOPCOUNT ({[Source].[Eastern hemisphere].Members)}, 2, [Measures].[Packages]) ON COLUMNS, {[Time].[1st half].Members} ON ROWS
FROM TestCube