Oracle Corporation and its affiliates disclaim any responsibility for any damages caused by the use of this software in unsafe applications. This software and documentation may provide access to or information about third party content, products and services.
Preface
Audience
Documentation Accessibility
To reach Oracle Support Services, use a telecommunications relay service (TRS) to call Oracle Support at. An Oracle Support Services technician handles technical issues and provides customer support according to the Oracle service request process.
Related Documents
Conventions
Building Your Data Warehouse
Introduction to Data Warehousing
About This Guide
Before Using This Guide
What This Guide Is Not
What is a Data Warehouse?
The Key Characteristics of a Data Warehouse
Common Oracle Data Warehousing Tasks
Tasks Illustrated in this Guide
Tools for Administering the Data Warehouse
Setting Up Your Data Warehouse System
General Steps for Setting up a Data Warehouse System
Preparing the Environment
Balanced Hardware Configuration
How Many CPUs and What Clock Speed Do I Need?
How Much Memory Do I Need?
How Many Disks Do I Need?
How Do I Determine Sufficient I/O Bandwidth?
Verifying the Data Warehouse Hardware Configuration
About the dd Utility
In your test, you should include all the storage devices you intend to include for your database storage.
About the Orion Utility
Setting Up a Database for a Data Warehouse
How Should I Set the Memory Management Parameters?
Because the target memory initialization parameter is dynamic, you can change the target memory size at any time without changing the. The maximum memory size serves as an upper limit so that you do not accidentally set the target memory size too high.
What Other Initialization Parameter Settings Are Important?
The database then tunes to the target memory size, redistributing memory as needed between the system global area (SGA) and the overall program global area (PGA). Since certain SGA components either cannot shrink easily or must remain at a minimum size, the database also prevents you from setting the target memory size too low.
Accessing Oracle Warehouse Builder
For Windows, select Start>Programs>Oracle>Warehouse Builder, and then select Design Center. Follow the prompts and accept the default settings in the Repository Assistant to create a workspace and assign a user as the workspace owner.
Installing the Oracle Warehouse Builder Demonstration
For Windows, select Start>Programs>Oracle>Warehouse Builder, and then select OMB*Plus. Start the Design Center and log in as the workspace owner with the credentials you specified in the owbdemoinit.tcl script.
Identifying Data Sources and Importing Metadata
Overview of Data Sources
General Steps for Importing Metadata from Sources
About Workspaces, Projects, and other Devices in OWB
Indicate which files should be imported as described in "Launch the Import Metadata Wizard" on page 3-4. Import the metadata for both flat files as described in “Importing Flat File Data” on page 3-5.
Specifying Locations for the Flat Files
Specify where the flat files are located, as described in "Specifying locations for the flat files" on page 3-3. After defining the resources by importing their metadata, you then design the target schema as described in Chapter 4, "Defining Warehouses in Oracle Warehouse Builder".
Creating Modules in the Project
Launching the Import Metadata Wizard
Using the Flat File Sample Wizard
Importing the Flat File Data
Defining Warehouses in Oracle Warehouse Builder
General Steps for Defining a Relational Target Warehouse
Identifying the Warehouse Target Schema
About Flat File Sources in OWB
Simply put, SQL*Loader handles large amounts of data better, while SQL allows for a wider range of complex joins and transformations. However, to use SQL, you must define an external table in the repository module, as described in "Exercise: Adding External Tables to the Target Module" on page 4-3.
Exercise: Adding External Tables to the Target Module
About Dimensions
Hierarchies: A logical structure that uses ordered levels or a set of data values (for a value-based hierarchy) as a way to organize data.
Exercise: Understanding Dimensions
About Levels
Defining Level Attributes
Defining Hierarchies
Dimension Roles
Level Relationships
Dimension Example
Control Rows
To get the actual number of rows in a dimension, count the number of rows by including a WHERE clause that excludes the NULL rows. For example, to get a count for Products, count the number of rows including a WHERE clause to exclude NULL rows in Product.
Implementing a Dimension
Star Schema
Binding
You can also select the dimension node on the canvas and choose Bind from the Object menu. Create the implementation objects (tables or views) that you will use to store the dimension data. Map the attributes in each level of the dimension to the columns that store their data.
About Cubes
Defining a Cube
Cube Measures
Cube Dimensionality
For dimensions that use a relational implementation, a dimension role for each dimension to indicate what role the dimension reference performs in the cube.
Cube Example
Implementing a Cube
Relational Implementation of a Cube
For example, you create the SALES cube that references the TIMES and PRODUCTS dimensions and performs automatic binding for the cube. Alternatively, you can use an existing database table or view to store the cube data. Map the cube's measures and dimension references to the columns that store the cube's data.
Loading Data into Your Data Warehouse
Defining ETL Logic
About Mappings and Operators
Summary of Steps for Defining Mappings
Creating a Mapping
Warehouse Builder saves the mapping definition and inserts its name into the Project Navigator. Warehouse Builder opens the Mapping Editor for the mapping and displays the name of the mapping in the title bar. In the Project Navigator, locate the repository target module in the Databases folder and then in the Oracle Database folder.
Types of Operators
Adding Operators
Adding Operators that Bind to Workspace Objects
Create Unbound Operator with No Attributes
Select from Existing Workspace Object and Bind
Editing Operators
Connecting Operators
Connecting Attributes
Connecting Groups
You can use the mapping editor with an unbound table operator to quickly create staging area tables. Use the Add Table Operator dialog box to select and bind the source table operator in the mapping. The mapping should now look like Figure 5-4 with one source table and one staging area table with no attributes.
Setting Mapping Properties
Target Load Order
To change the loading order, select a target and use the shuttle buttons on the right to move the target up or down in the list.
Reset to Default
Setting Operator, Group, and Attribute Properties
Synchronizing Operators and Workspace Objects
Synchronizing An Operator
By default, Warehouse Builder selects the option to synchronize your selected operator with the corresponding object in the workspace. In this step, you also specify Synchronize from a workspace object to an operator or select the option Synchronize from an operator to a workspace object.
Synchronizing From a Workspace Object to an Operator
Prototype mappings using tables: When working in the design environment, you can choose to design the ETL logic using tables. For production, you may want to use the mapping to access other workspace object types such as views, materialized views, or cubes.
Synchronizing Operators based on Workspace Objects
Synchronizing from an Operator to a Workspace Object
Synchronization from an operator does not affect the dependent relationship between other operators and workspace objects.
Deploying to Target Schemas and Executing ETL Logic
About Deployment
What is a Control Center?
Configuring the Physical Details of Deployment
Deployment Actions
The Deployment Process
Deploying Objects
Starting ETL Jobs
Viewing the Data
Reporting on a Data Warehouse
SQL for Reporting and Analysis
Use of SQL Analytic Capabilities to Answer Business Queries
You can take advantage of the advanced SQL and PL/SQL capabilities that Oracle Database offers to translate business queries into SQL.
How to Add Totals to Reports Using the ROLLUP Function
When to Use the ROLLUP Function
This query is useful, but you may want to see totals for different categories in the same report.
How to Separate Totals at Different Levels Using the CUBE Function
When to Use the CUBE Function
How to Add Subtotals Using the GROUPING Function
When to Use the GROUPING Function
How to Combine Aggregates Using the GROUPING SETS Function
When to Use the GROUPING SETS Function
Instead of adding a separate query to get the totals per channel class, you can use the GROUPING SETS function as illustrated in the example below.
How to Calculate Rankings Using the RANK Function
When to Use the RANK Function
How to Calculate Relative Contributions to a Total
How to Perform Interrow Calculations with Window Functions
How to Calculate a Moving Average Using a Window Function
Use of Partition Outer Join to Handle Sparse Data
The result of a partition outer join is a UNION operation of the outer joins of each of the partitions in the logically partitioned table with the table on the other side of the join.
When to Use Partition Outer Join
Use of the WITH Clause to Simplify Business Queries
When to Use the WITH Clause
Managing a Data Warehouse
Refreshing a Data Warehouse
About Refreshing Your Data Warehouse
Using Rolling Windows to Offload Data
Optimizing Data Warehouse Operations
Avoiding System Overload
Monitoring System Performance
Monitoring Parallel Execution Performance
Monitoring I/O
Using Database Resource Manager
Optimizing the Use of Indexes and Materialized Views
Optimizing Storage Requirements
Using Data Compression to Improve Storage
Eliminating Performance Bottlenecks
Verifying That SQL Runs Efficiently
Analyzing Optimizer Statistics
Analyzing an Execution Plan
The first execution plan shows dynamic pruning, using the KEY values for Pstart and Pstop, respectively. Dynamic pruning means that the database has to figure out which partition or partitions to access at run time. In the case of static pruning, the database knows which partition or partitions to access at the time of parsing, leading to more efficient execution.
Using Hints to Improve Data Warehouse Performance
In that case, use hints to instruct the optimizer to use the optimal execution plan.
Using Advisors to Verify SQL Performance
Improving Performance by Minimizing Resource Consumption
Improving Performance: Partitioning
Improving Performance: Partition Pruning
Partition pruning significantly reduces the amount of data retrieved from disk and reduces processing time, improving query performance and resource utilization. An example of dynamic pruning is using operators or functions in the WHERE condition. Partition pruning affects the statistics of the objects where pruning will occur and will therefore also affect the execution plan of a statement.
Improving Performance: Partitionwise Joins
If you partition the index and table on different columns (using a globally partitioned index), partition pruning also eliminates index partitions even when the partitions of the underlying table cannot be eliminated. An example scenario for static pruning would be an SQL statement containing a WHERE condition with a constant literal on the partition key column.
Improving Performance: Query Rewrite and Materialized Views
Improving Performance: Indexes
Improving Performance: Compression
Improving Performance: DBMS_COMPRESSION Package
Improving Performance: table_compress Clause of CREATE TABLE and ALTER TABLE
Using Resources Optimally
Optimizing Performance with Parallel Execution
How Parallel Execution Works
Setting the Degree of Parallelism
About Wait Events
Backing up and Recovering a Data Warehouse
How Should I Handle Backup and Recovery for a Data Warehouse?
Strategies and Best Practices for Backup and Recovery
Best Practice A: Use ARCHIVELOG Mode
Archived redo logs can be sent and applied to the standby physical database, which is an exact replica of the primary database. The user can only back up the database while it is completely shut down after a clean shutdown. Usually, the only option for media recovery is to restore the entire database, which will lose all transactions since the last backup.
Is Downtime Acceptable?
Best Practice B: Use RMAN
Best Practice C: Use Read-Only Tablespaces
Best Practice D: Plan for NOLOGGING Operations
Additionally, subsequent data operations on which a nologging operation occurred also cannot be recovered even if those operations did not use nologging mode. The presence of NOLOGGING operations should be considered when designing the backup and recovery strategy. There are two approaches to backup and recovery in the presence of NOLOGGING operations: ETL or incremental backup.
Extraction, Transformation, and Loading
An example of implementing this approach is to back up the data warehouse every weekend and then save the files needed to support the ETL process for each night. Essentially, the data warehouse administrator is gaining better performance in the ETL process through logging operations, at a slightly more complex price and. A weakness of this approach is that the burden is on the data warehouse administrator to track all relevant changes that have occurred in the data warehouse.
Incremental Backup
Best Practice E: Not All Tablespaces are Equally Important
For example, sales data in a data warehouse can be critical and in the event of a recovery, this data must be online as soon as possible. But in the same data warehouse, a table that stores clickstream data from a company's website may be much less critical. In this scenario, the tablespaces that contain sales data should be backed up frequently, while the tablespaces that contain clickstream data should only be backed up once every week or two.
Securing a Data Warehouse
Overview of Data Warehouse Security
Why is Security Necessary for a Data Warehouse?
Using Roles and Privileges for Data Warehouse Security
Using Virtual Private Database in Data Warehouses
How Virtual Private Database Works
Overview of Oracle Label Security
How Oracle Label Security Works
Oracle Label Security mediates access to rows in database tables based on a label contained in the row, a label associated with each database session, and Oracle Label Security privileges assigned to the session. It provides access mediation on an application table after a user has been granted the standard database system and object privileges. If the user executes a SELECT statement on the table, Oracle Label Security will evaluate the selected rows and determine whether the user can access them based on the privileges and access labels assigned to the user.
How Data Warehouses Benefit from Labels
Overview of Fine-Grained Auditing in Data Warehouses
Overview of Transparent Data Encryption in Data Warehouses
In general, most organizations do not have the time or expertise to modify existing applications to call encryption routines. In other words, applications can use the same syntax to insert data into the application table, and Oracle Database will automatically encrypt the data before writing it to disk. On subsequent SELECT operations, the data will be transparently decrypted so that the application will continue to function normally.
Index