Archive

Archive for the ‘Burleson Article’ Category

Understanding Oracle knowledge engineering

Since the earliest days of business computing, the idea has been to identify well-structured activities and automate them.

The first business processes to be automated were well-structured redundant tasks such as payroll processing, tasks that take repetitive and well-structured components of a system and automate them.

As the decades passed, Information Systems became more sophisticated at capturing and deploying human intelligence within computer systems, and we see these types of systems:

  • Expert Systems – These online system capture a well-structured task and mimic human processing.  An example would be Mycin, a system that applies physician intelligence at analyzing blood samples.  An expert system makes the decision without the aid of any human intuition.

  • Decision Support Systems - A DSS is a computerized system that recognizes that human intuition is difficult to quantify and automate.  In a DSS the human makes the decision, guided by software that automates the well-structured aspects of the problem domain.

The line between an expert system and a decision support system blurs in some cases when what is thought to be an intuitive process is actually a well-structured problem with extremely complex decision rules.

In one notable case, a major soup manufacturer was about to loose a long-term employee of forty years, who knew every intricacy of the tricky soup vats within the company.

Initially setting out to create a DSS, the decision analyst quizzed the employee over a period of months and discovered that what was once thought to be intuition was actually the application of a large set of well structured decision rules.  When this soup vat expert would say something like “I have a feeling that the problem is X“, it appeared to be human intuition to those less knowledgeable observers.

However in reality it was the application of a long forgotten decision rule or an experiential case for which the individual had since lost conscious knowledge.  The application of the decision support system technology eventually led to an expert system.  This allowed the forty year worker to retire comfortably, with the knowledge that all of his years of decision rules had in fact been quantified, helping the soup company carry on without him making even faster and better decisions as a whole.

A knowledge engineering system for Oracle data cleansing

If we start by examining known data errors to find common patterns, a qualified software engineer can design Oracle-based programs to detect these types of errors and quickly clean-up a large amount of transposition errors, and successively refine the model to identify less obvious data anomalies.  We can also search for statistical “outliers”, data that violates the norms of the database as-a-whole.

By using well-understood best practices for Oracle data cleansing a robust and flexible system can be created to dramatically reduce data anomalies.  Using an iterative cycles of refining the decision rules, the DSS evolves to become increasingly accurate and powerful.

This is a DSS for Oracle data cleansing in a nutshell.  Note that we start by examining the “nature” of known data errors and seek “fishy” data (statistically valid outliners) for creating the suggestion lists for the human expert (the DQO).

The DQO then manually resolved the errors and works with the DBA to refine the decision rules until they are 100% complete and accurate using the “feedback loop” of successive rule refinement.  At that point, that component of the Oracle data cleansing is automated, becoming an “expert system” component of the DSS.

For expert Oracle data cleansing support and data scrubbing consulting, use an expert from BC.  We understand the powerful Oracle data unification tools, and we can aid in improving the data quality of any Oracle database, large or small.

Since the earliest days of business computing, the idea has been to identify well-structured activities and automate them.

The first business processes to be automated were well-structured redundant tasks such as payroll processing, tasks that take repetitive and well-structured components of a system and automate them.

As the decades passed, Information Systems became more sophisticated at capturing and deploying human intelligence within computer systems, and we see these types of systems:

  • Expert Systems – These online system capture a well-structured task and mimic human processing.  An example would be Mycin, a system that applies physician intelligence at analyzing blood samples.  An expert system makes the decision without the aid of any human intuition.

  • Decision Support Systems - A DSS is a computerized system that recognizes that human intuition is difficult to quantify and automate.  In a DSS the human makes the decision, guided by software that automates the well-structured aspects of the problem domain.

The line between an expert system and a decision support system blurs in some cases when what is thought to be an intuitive process is actually a well-structured problem with extremely complex decision rules.

In one notable case, a major soup manufacturer was about to loose a long-term employee of forty years, who knew every intricacy of the tricky soup vats within the company.

Initially setting out to create a DSS, the decision analyst quizzed the employee over a period of months and discovered that what was once thought to be intuition was actually the application of a large set of well structured decision rules.  When this soup vat expert would say something like “I have a feeling that the problem is X“, it appeared to be human intuition to those less knowledgeable observers.

However in reality it was the application of a long forgotten decision rule or an experiential case for which the individual had since lost conscious knowledge.  The application of the decision support system technology eventually led to an expert system.  This allowed the forty year worker to retire comfortably, with the knowledge that all of his years of decision rules had in fact been quantified, helping the soup company carry on without him making even faster and better decisions as a whole.

A knowledge engineering system for Oracle data cleansing

If we start by examining known data errors to find common patterns, a qualified software engineer can design Oracle-based programs to detect these types of errors and quickly clean-up a large amount of transposition errors, and successively refine the model to identify less obvious data anomalies.  We can also search for statistical “outliers”, data that violates the norms of the database as-a-whole.

By using well-understood best practices for Oracle data cleansing a robust and flexible system can be created to dramatically reduce data anomalies.  Using an iterative cycles of refining the decision rules, the DSS evolves to become increasingly accurate and powerful.

This is a DSS for Oracle data cleansing in a nutshell.  Note that we start by examining the “nature” of known data errors and seek “fishy” data (statistically valid outliners) for creating the suggestion lists for the human expert (the DQO).

The DQO then manually resolved the errors and works with the DBA to refine the decision rules until they are 100% complete and accurate using the “feedback loop” of successive rule refinement.  At that point, that component of the Oracle data cleansing is automated, becoming an “expert system” component of the DSS.

For expert Oracle data cleansing support and data scrubbing consulting, use an expert from BC.  We understand the powerful Oracle data unification tools, and we can aid in improving the data quality of any Oracle database, large or small.

References:

Categories: Burleson Article

Oracle utilities secrets

Oracle has “secret” undocumented utilities that are so powerful that they are reluctant to release the details to the general public. Oracle deliberately glosses-over these powerful utilities because they must write the documentation to the lowest common denominator, and these utilities can be extremely dangerous to Oracle professionals without the required work experience.

These utilities are often left inside an Oracle distribution for use exclusively by Oracle technical support, but expert Oracle DBA’s find them invaluable for advanced database operations. These undocumented utilities are described in the new book “Advanced Oracle Utilities: The Definitive Reference”.

A hidden Oracle utility is not a utility that is physically hidden in the Oracle software. Rather, a hidden utility is an executable or PL/SQL package that is either undocumented or where the documentation is difficult to find. For example, some PL/SQL packages are never loaded by the Oracle installer, yet their definitions remain in the operating system files.

The main directories where Oracle utilities reside are here:
$ORACLE_HOME/bin — This contains the binary executables used by the Oracle server. Most of the tools discussed in the Oracle Utilities book reside in this directory.
$ORACLE_HOME/plsql/demo — This contains a useful collection of SQL scripts related to many utilities, including the dbms_profiler utility.
$ORACLE_HOME/rdbms/admin — This contains many SQL scripts used for creating PL/SQL packages and their required environments.
$ORACLE_HOME/sqlplus/admin — This contains scripts used with autotrace and other utilities.
$ORACLE_HOME/otrace/admin — This is the administration directory for the Oracle Trace diagnostic tool.
$ORACLE_HOME/otrace/sysman — This is used by utilities such as oemctl and the Oracle Management Server (OMS).
$ORACLE_HOME/otrace/lib — This contains facility files used with the oerr utility.

Many of these undocumented utilities such as TKPROF have surfaced from the obscure and entered mainstream Oracle toolkits while others remain hidden inside the O/S.
Finding hidden Oracle utilities

The easiest way to find hidden utilities is to look for new packages within Oracle. The following query compares the packages in Oracle10g with new packages in Oracle11g, using a database link between two instances, each on a different release of Oracle:
select
object_name
from
dba_objects@oracle11g
where
owner = ‘SYS’
and
object_type = ‘PACKAGE BODY’
minus
select
object_name
from
dba_objects@oracle10g
where
owner = ‘SYS’
and
object_type = ‘PACKAGE BODY’;

This query will quickly display all package bodies owned by SYS that exist in Oracle11g but not in Oracle10g.
Finding hidden Oracle executable utilities

Discovering new binaries entails comparing two directories in the operating system and ignoring the duplicates entries. The dircmp UNIX command can be used to find only new entries within two directories:
dircmp — s
/u01/aoracle/product/9.0.3/bin
/u01/oracle/product/10.2.0/bin

Here we compare the bin directory of 9i with Oracle 10.2.
Using advanced Oracle utilities

Fortunately, these undocumented utilities are described in the new book “Advanced Oracle Utilities: The Definitive Reference”, where we see some extremely useful tools that are hidden inside Oracle:

BBED – This is the powerful Oracle block editor utility. BBED allows you to view and update the contents of Oracle data blocks. BBED should only be used by experts, or in conjunction with Oracle technical support.

Dbverify – This dbverify utility is very useful for

Oradebug – This powerful Oracle debugger has many valuable features for debugging Oracle sessions.

Orastack – This orastack memory stack utility allow you to see how RAM is being used within an Oracle session.

Again, these are just a small sample of the hidden utilities with Oracle

Categories: Burleson Article

Sequence of steps in Oracle SQL explain plan

When Codd and Date created the relational data model, the execution plan was an afterthought, largely because the SQL optimizer was always supposed to generate the best execution plan, and hence, there was not real need to understand the internal machinations of Oracle execution plans.

However, in the real world, all SQL tuning experts must be proficient in reading Oracle execution plans and understand the steps within a explain plans and the sequence that the steps are executed.  To successfully understand an explain plan you must be able to know the order that the plan steps are executed.

Reading an explain plan is important for many reasons, and Oracle SQL tuning experts reveal the explain plans to check many things:

·        Ensure that the tables will be joined in optimal order.

·        Determine the most restrictive indexes to fetch the rows.

·        Determine the best internal join method to use (e.g. nested loops, hash join).

·        Determine that the SQL is executing the steps in the optimal order.

Reading SQL execution plans has always been difficult, but there are some tricks to help determine the correct order that the explain plan steps are executed.

 

Ordering the sequence of execution plan steps

SQL execution plans are interpreted using a preorder traversal (reverse transversal) algorithm which you will see below.  Preorder traversal is a fancy way of saying:

1. That to read an execution plan, look for the innermost indented statement. That is generally the first statement executed but NOT always! (see example here where the innermost step is not the first step executed).

2. In most cases, if there are two statements at the same level, the first statement is executed first.

In other words, execution plans are read inside-out, starting with the most indented operation. Here are some general rules for reading an explain plan.

1. The first statement is the one that has the most indentation.

2. If two statements appear at the same level of indentation, the top statement is executed first.

To see how this works, take a look at this plan. Which operation is first to execute?

—————————————————————————

 

| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |

—————————————————————————

|   0 | SELECT STATEMENT   |      |    10 |   650 |     7  (15)| 00:00:01 |

|*  1 |  HASH JOIN         |      |    10 |   650 |     7  (15)| 00:00:01 |

|   2 |   TABLE ACCESS FULL| JOB  |     4 |   160 |     3   (0)| 00:00:01 |

|   3 |   TABLE ACCESS FULL| EMP  |    10 |   250 |     3   (0)| 00:00:01 |

—————————————————————————

The answer is that the full table scan operation on the job table will execute first.  Let’s look at another example plan and read it…

ID  Par Operation

 

0      SELECT STATEMENT Optimizer=FIRST_ROWS

1    0   TABLE ACCESS (BY INDEX ROWID) OF ‘EMP’

2    1     NESTED LOOPS

3    2       TABLE ACCESS (FULL) OF ‘DEPT’

4    2       INDEX (RANGE SCAN) OF ‘IX_EMP_01′ (NON-UNIQUE)

By reviewing this hierarchy of SQL execution steps, we see that the order of operations is 3,4, 2, 1.

Here is the graph for this execution plan:

To see how this query executes, we traverse the tree in reverse order. From the left most, deepest child, traverse the tree moving up, and to the right through each branch.

 

 

 

 

 

By reviewing this hierarchy of SQL execution steps, we see that the order of operations is 3,4, 2, 1:

 

SEQ  ID  Par Operation

0      SELECT STATEMENT Optimizer=CHOOSE

3    1    0   TABLE ACCESS (BY INDEX ROWID) OF ‘EMP’

4    2    1     NESTED LOOPS

2    3    2       TABLE ACCESS (FULL) OF ‘DEPT’

1    4    2       INDEX (RANGE SCAN) OF ‘IX_EMP_01′ (NON-UNIQUE)

Understanding the sequence of explain plan steps is a critical skill, so let’s try some more examples:

Consider this SQL query:

select
a.empid,
a.ename,
b.dname

 

from
emp a,
dept b

where
a.deptno=b.deptno;

We get this execution plan:

Execution Plan

 

———————————————————-

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=40 Card=150000 Bytes=3300000)

1 0 HASH JOIN (Cost=40 Card=150000 Bytes=3300000)

2 1 TABLE ACCESS (FULL) OF ‘DEPT’ (Cost=2 Card=1 Bytes=10)

3 1 TABLE ACCESS (FULL) OF ‘EMP’ (Cost=37 Card=150000 Bytes=1800000)

What is the order of operations here?

Answer:  Execution plan steps are 2, 3, 1

 

Consider this query:

select
a.empid,
a.ename,
b.dname

 

from
emp a,
dept b

where
a.deptno=b.deptno;

We get this execution plan:


Execution Plan

 

———————————————————-

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=864 Card=150000 Bytes=3300000)

1 0 HASH JOIN (Cost=864 Card=150000 Bytes=3300000)

2 1 TABLE ACCESS (BY INDEX ROWID) OF ‘DEPT’ (Cost=826 Card=1 Bytes=10)

3 2 INDEX (FULL SCAN) OF ‘IX_DEPT_01′ (NON-UNIQUE) (Cost=26 Card=1)

4 1 TABLE ACCESS (FULL) OF ‘EMP’ (Cost=37 Card=150000 Bytes=1800000)

What is the order of operations here?

Answer:  Execution plans steps are 3, 2, 4, 1

 

Here is the same query, but slightly different plan:


select
a.empid,
a.ename,
b.dname

 

from
emp a,
dept b

where
a.deptno=b.deptno;

We get this execution plan:


Execution Plan

 

———————————————————-

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=39 Card=150000 Byte=3300000)

1    0   NESTED LOOPS (Cost=39 Card=150000 Bytes=3300000)

2 1 TABLE ACCESS (FULL) OF ‘DEPT’ (Cost=2 Card=1 Bytes=10)

3 1 TABLE ACCESS (FULL) OF ‘EMP’ (Cost=37 Card=150000 Bytes=1800000)

What is the order of operations here?

Answer:  Execution plans steps are 2, 3, 1

 

Let’s find the SQL execution steps for a three table join:

select
a.ename,
a.salary,
b.dname,
c.bonus_amount,
a.salary*c.bonus_amount

 

from
emp a,
dept b,
bonus c

where
a.deptno=b.deptno
and
a.empid=c.empid;

What is the order of operations here?


Execution Plan

 

———————————————————-

0      SELECT STATEMENT Optimizer=CHOOSE (Cost=168 Card=82 Bytes=3936)

1    0   TABLE ACCESS (BY INDEX ROWID) OF ‘EMP’ (Cost=2 Card=1 Bytes=12)

2    1     NESTED LOOPS (Cost=168 Card=82 Bytes=3936)

3    2       MERGE JOIN (CARTESIAN) (Cost=4 Card=82 Bytes=2952)

4    3         TABLE ACCESS (FULL) OF ‘DEPT’ (Cost=2 Card=1 Bytes=10)

5    3         BUFFER (SORT) (Cost=2 Card=82 Bytes=2132)

6    5           TABLE ACCESS (FULL) OF ‘BONUS’ (Cost=2 Card=82 Bytes=2132)

7    2       INDEX (RANGE SCAN) OF ‘IX_EMP_01′ (NON-UNIQUE) (Cost=1 Card=1)

This is a little tougher….

The execution order is 4,6,5,3,7,2,1.

Let’s diagram it!

Here we see that step 2 has two children, three and seven, and step 3 has two children, four and five.  Step 5 has a lone child, step 6.

 

Following our rules for preorder traversal, the execution plan steps start at step 4.

Final Exam!  What are the steps for this execution plan?

Execution Plan

 

———————————————————-

0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2871 Card=2 Bytes=143)

1    0   UNION-ALL

2    1     SORT (GROUP BY) (Cost=2003 Card=1 Bytes=59)

3    2       FILTER

4    3         HASH JOIN (Cost=1999 Card=1 Bytes=59)

5    4           INDEX (FAST FULL SCAN) OF ‘XIN8OPS_FLT_LEG’ (UNIQUE)

6    4           INDEX (RANGE SCAN) OF ‘XIN3BAG_TAG_FLT_LEG’ (UNIQUE)

7    1     SORT (GROUP BY) (Cost=868 Card=1 Bytes=84)

8    7       FILTER

9    8         NESTED LOOPS (Cost=864 Card=1 Bytes=84)

10    9           HASH JOIN (Cost=862 Card=1 Bytes=57)

11   10             INDEX (FAST FULL SCAN) OF ‘XIN1SCHED_FLT_LEG’ (UNIQUE)

12   10             INDEX (FAST FULL SCAN) OF ‘XIN8OPS_FLT_LEG’ (UNIQUE)

13    9           INDEX (RANGE SCAN) OF ‘XIN2BAG_TAG_FLT_LEG’ (UNIQUE)

Answer:  The order of operations is 5, 6, 4, 3, 2, 11, 12, 10, 13, 9, 8, 7, 1.

Categories: Burleson Article

Rewriting SQL for faster performance

Because SQL is a declarative language, you can write the same query in many forms, each getting the same result but with vastly different execution plans and performance. Re-writing SQL for easier readability (and maintenance) plus faster performance is an important tuning tool.

Rewrite SQL to remove subqueries – Subqueries can be very problematic from a performance perspective.

Rewriting the SQL in PL/SQL – For certain queries rewriting SQL in PL/SQL can result in more than a 20x performance improvement.

Rewrite SQL to simplify query – Decomposing a query into multiple queries using the WITH clause (or global temporary tables) greatly aids performance.

In this example, we select all books that do not have any sales. Note that this is a non-correlated sub-query, but it could be re-written in several ways.
select
book_key
from
book
where
book_key NOT IN (select book_key from sales);

There are serious problems with subqueries that may return NULL values. It is a good idea to discourage the use of the NOT IN clause (which invokes a sub-query) and to prefer NOT EXISTS (which invokes a correlated sub-query), since the query returns no rows if any rows returned by the sub-query contain null values.
select
book_key
from
book
where
NOT EXISTS (select book_key from sales);

Subqueries can often be re-written to use a standard outer join, resulting in faster performance. As we may know, an outer join uses the plus sign (+) operator to tell the database to return all non-matching rows with NULL values. Hence we combine the outer join with a NULL test in the WHERE clause to reproduce the result set without using a sub-query.
select
b.book_key
from
book b,
sales s
where
b.book_key = s.book_key(+)
and
s.book_key IS NULL;

This execution plan will also be faster by eliminating the sub-query.
Rewriting SQL for better readability and faster response time speed

Here is an actual example of a poorly-written SQL query. It’s hard to read, and the execution plan is horrible:
SELECT ART.DEMO_MEMBER DEMO_MEMBER,
(SELECT PARAMETER_VALUE
FROM PPM_CIA_PREMIUM_VAL CIA_VAL
WHERE CIA_VAL.PRD_DEMO_CHRSTC_MEMBER = ART.DEMO_MEMBER
AND CIA_NAME IN (‘Baseline’)
AND PRMA_MKT_MEMBER = ?
AND INVOICE_TYPE_NAME IN (‘Weekly’)
AND GROUP_TYPE_CODE = ‘C’
) WEEKLY_VALS,
(SELECT PARAMETER_VALUE
FROM PPM_CIA_PREMIUM_VAL CIA_VAL
WHERE CIA_VAL.PRD_DEMO_CHRSTC_MEMBER = ART.DEMO_MEMBER
AND CIA_NAME IN (‘Baseline’)
AND PRMA_MKT_MEMBER = ?
AND INVOICE_TYPE_NAME IN (‘Monthly’)
AND GROUP_TYPE_CODE = ‘C’
) MONTHLY_VALS,
(SELECT PARAMETER_VALUE
FROM PPM_CIA_PREMIUM_VAL CIA_VAL
WHERE CIA_VAL.PRD_DEMO_CHRSTC_MEMBER = ART.DEMO_MEMBER
AND CIA_NAME IN (‘Baseline’)
AND PRMA_MKT_MEMBER = ?
AND INVOICE_TYPE_NAME IN (‘90Day’)
AND GROUP_TYPE_CODE = ‘C’
) NINETYDAY_VALS,
(SELECT PARAMETER_VALUE
FROM PPM_CIA_PREMIUM_VAL CIA_VAL
WHERE CIA_VAL.PRD_DEMO_CHRSTC_MEMBER = ART.DEMO_MEMBER
AND CIA_NAME IN (‘Baseline’)
AND PRMA_MKT_MEMBER = ?
AND INVOICE_TYPE_NAME IN (‘Annual’)
AND GROUP_TYPE_CODE = ‘C’
) ANNUAL_VALS,
‘C’ AS GROUP_TYPE_CODE
FROM
(SELECT DISTINCT PRD_DEMO_CHRSTC_MEMBER DEMO_MEMBER
FROM PPM_CIA_PREMIUM_VAL
WHERE CIA_NAME IN (‘Baseline’)
AND PRMA_MKT_MEMBER = ?
AND INVOICE_TYPE_NAME IN (‘Weekly’ ,
‘Monthly’,
‘Annual’ ,
‘90Day’)
AND GROUP_TYPE_CODE = ‘C’
GROUP BY PRD_DEMO_CHRSTC_MEMBER
) ART

UNION ALL
SELECT ART.DEMO_MEMBER DEMO_MEMBER,
(SELECT PARAMETER_VALUE
FROM PPM_CIA_PREMIUM_VAL CIA_VAL
WHERE CIA_VAL.PRD_DEMO_CHRSTC_MEMBER = ART.DEMO_MEMBER
AND CIA_NAME IN (‘Baseline’)
AND PRMA_MKT_MEMBER = ?
AND INVOICE_TYPE_NAME IN (‘Weekly’)
AND GROUP_TYPE_CODE = ‘T’
) WEEKLY_VALS,
(SELECT PARAMETER_VALUE
FROM PPM_CIA_PREMIUM_VAL CIA_VAL
WHERE CIA_VAL.PRD_DEMO_CHRSTC_MEMBER = ART.DEMO_MEMBER
AND CIA_NAME IN (‘Baseline’)
AND PRMA_MKT_MEMBER = ?
AND INVOICE_TYPE_NAME IN (‘Monthly’)
AND GROUP_TYPE_CODE = ‘T’
) MONTHLY_VALS,
(SELECT PARAMETER_VALUE
FROM PPM_CIA_PREMIUM_VAL CIA_VAL
WHERE CIA_VAL.PRD_DEMO_CHRSTC_MEMBER = ART.DEMO_MEMBER
AND CIA_NAME IN (‘Baseline’)
AND PRMA_MKT_MEMBER = ?
AND INVOICE_TYPE_NAME IN (‘90Day’)
AND GROUP_TYPE_CODE = ‘T’
) NINETYDAY_VALS,
(SELECT PARAMETER_VALUE
FROM PPM_CIA_PREMIUM_VAL CIA_VAL
WHERE CIA_VAL.PRD_DEMO_CHRSTC_MEMBER = ART.DEMO_MEMBER
AND CIA_NAME IN (‘Baseline’)
AND PRMA_MKT_MEMBER = ?
AND INVOICE_TYPE_NAME IN (‘Annual’)
AND GROUP_TYPE_CODE = ‘T’
) ANNUAL_VALS,
‘T’ AS GROUP_TYPE_CODE
FROM
(SELECT DISTINCT PRD_DEMO_CHRSTC_MEMBER DEMO_MEMBER
FROM PPM_CIA_PREMIUM_VAL
WHERE CIA_NAME IN (‘Baseline’)
AND PRMA_MKT_MEMBER = ?
AND INVOICE_TYPE_NAME IN (‘Weekly’ ,
‘Monthly’,
‘Annual’ ,
‘90Day’)
AND GROUP_TYPE_CODE = ‘T’
GROUP BY PRD_DEMO_CHRSTC_MEMBER
) ART
ORDER BY GROUP_TYPE_CODE

Look like Greek? That’s not a good SQL practice, to write convoluted SQL! Let’s look at the same query, rewritten for clarity and faster performance.

The above convoluted query can be completely re-written. This SQL runs faster and more efficiently.

Rewritten query:

The above convoluted query can be completely re-written. This SQL runs faster and more efficiently.
SELECT ART.PRD_DEMO_CHRSTC_MEMBER DEMO_MEMBER,
SUM(DECODE(ART.INVOICE_TYPE_NAME,’Weekly’,ART.parameter_value)) WEEKLY_VALS,
SUM(DECODE(ART.INVOICE_TYPE_NAME,’Monthly’,ART.parameter_value)) MONTHLY_VALS,
SUM(DECODE(ART.INVOICE_TYPE_NAME,’90Day’,ART.parameter_value)) NINETYDAY_VALS,
SUM(DECODE(ART.INVOICE_TYPE_NAME,’Annual’,ART.parameter_value)) ANNUAL_VALS,
ART.GROUP_TYPE_CODE
FROM PPM_CIA_PREMIUM_VAL ART
WHERE CIA_NAME IN (‘Baseline’)
AND PRMA_MKT_MEMBER = ?
AND INVOICE_TYPE_NAME IN (‘Weekly’ ,
‘Monthly’,
‘Annual’ ,
‘90Day’)
AND GROUP_TYPE_CODE IN (‘C’,'T’)
GROUP BY ART.PRD_DEMO_CHRSTC_MEMBER,ART.GROUP_TYPE_CODE
ORDER BY ART.GROUP_TYPE_CODE, ART.PRD_DEMO_CHRSTC_MEMBER;

As we see, SQL is very flexible and an equivalent query can be written in many ways, all giving the same result, but with radically different readability and execution response time.

Categories: Burleson Article

Oracle 11g R2 new parallel query management enhancements

The database industry is clearly in the midst of massive server consolidation, an economic imperative whereby the old-fashioned one database/one server approach of the client-server days has been rendered obsolete. Today, single servers with 32 and 64 CPU’s and hundreds of gigabytes of RAM can host dozens of large Oracle databases.

While the 2nd age of mainframe computing came about to facilitate easier DBA management, there remain the impressive benefits of having a back of dozens of CPU’s to perform full scans very quickly.

When invoking Oracle parallel query, there are many perils and pitfalls:

Setting parallelism on at the table or system level influences the optimizer, and sometimes makes full-scan operations appear cheaper than they really are.

Determining the optimal degree of parallelism is tricky. The real optimal degree of parallelism depends on the physical placement of the data blocks on disk as well as the number of processors on the server (cpu_count).

To relieve these issues with parallel query, in Oracle 11g Release 2, the following new parallel query parameters are included:

The parallel_degree_policy parameter

The parallel_min_time_threshold parameter

The parallel_degree_limit parameter

The parallel_force_local parameter

Let’s take a close look at these important enhancements to Oracle parallel query in 11g Release 2.
The parallel_degree_policy parameter

The parallel_degree_policy parameter is related to the amount of table data residing in the data buffer cache. Using parallel_degree_policy allows Oracle to bypass direct path reads when Oracle determines that lots of the table data blocks already reside in the data buffer cache.

In traditional 32-bit systems (limited by on ly a few gig of RAM for the SGA), direct path reads (which bypass the SGA were always faster than reading a large table through the data buffer. However, with the advent of 64-bit servers with dozens of gigabytes for the db_cache_size, large tables are often be fully cached, negating the need to always perform direct path reads for parallel large-table full-table scans.

Guy Harrison conducted some benchmark tests of parallel_degree_policy and we see details on how parallel_degree_policy evaluates the caching of large tables:

“If PARALLEL_DEGREE_POLICY is set to AUTO then Oracle might perform buffered IO instead of direct path IO. . . The documentation says that the optimizer decides whether or not to use direct path depending on the size of the table and the buffer cache and the likelihood that some data might be in memory.”
The parallel_min_time_threshold parameter

The parallel_min_time_threshold parameter only allows parallel query to be invoked against large tables or indexes, those where the num_rows suggests that it will take more than nn seconds to scan the table. The default for parallel_min_time_threshold is 30 (seconds), but you can now adjust this threshold according to your optimal definition of what constitutes a “large table”. This parameter appears to be related to the deprecated small_table_threshold parameter.
The parallel_degree_limit parameter

The parallel_degree_limit parameter seta a limit on the maximum degree of parallelism. The default is cpu_count*2.
The parallel_force_local parameter

The parallel_force_local parameter prohibits “parallel parallelism”, a case where parallel queries on a RAC node are limited only to the local instance node.

Categories: Burleson Article

Speed Oracle SQL with Temporary Tables

Using temporary tables with Dictionary Views

The prudent use of temporary tables can dramatically improve Oracle SQL performance. To illustrate the concept, consider the following example from the DBA world. In the query that follows, we want to identify all users who exist within Oracle who have not been granted a role. We could formulate the query as an anti-join with a noncorrelated subquery (against a complex view) as shown here:

select
username
from
dba_users
where
username NOT IN
(select grantee from dba_role_privs);

This query runs in 18 seconds. As you may remember from Chapter 12, these anti-joins can often be replaced with an outer join. However, we have another option by using CTAS. Now, we rewrite the same query to utilize temporary tables by selecting the distinct values from each table.

create table
temp1
as
select
username
from
dba_users;
create table
temp2
as
select distinct
grantee
from
dba_role_privs;
select
username
from
temp1
where
username not in
(select grantee from temp2);

With the addition of temporary tables to hold the intermediate results, this query runs in less than three seconds, a 6× performance increase. Again, it is not easy to quantify the reason for this speed increase, since the DBA views do not map directly to Oracle tables, but it is clear that temporary table show promise for improving the execution speed of certain types of Oracle SQL queries.

Using Temporary Tables

If the amount of data to be processed or utilized from your PL/SQL procedure is too large to fit comfortably in a PL/SQL table, use a GLOBAL TEMPORARY table rather than a normal table. A GLOBAL TEMPORARY table has a persistent definition but data is not persistent and the global temporary table generates no redo or rollback information. For example if you are processing a large number of rows, the results of which are not needed when the current session has ended, you should create the table as a temporary table instead:

create global temporary table
results_temp (...)
on commit preserve rows;

 

The “on commit preserve rows” clause tells the SQL engine that when a transaction is committed the table should not be cleared.

The global temporary table will be created in the users temporary tablespace when the procedure populates it with data and the DIRECT_IO_COUNT will be used to govern the IO throughput (this usually defaults to 64 blocks).

 

Categories: Burleson Article

The Top 10 Reasons for Poor Oracle Performance

What are the most common root-causes of poor Oracle performance?  Every expert will give you a different opinion, but I’ve prepared my list based on our hundreds of Oracle tuning engagements:

1. Bad Design – The number one offender to poor performance is over-normalization of Oracle tables, excessive (unused indexes) and 15-way table joins for what should be a simple fetch.

2. Poor server optimization – Setting the server kernel parameters and I/O configuration (e.g. direct I/O) has a profound impact on Oracle performance

3. Bad disk I/O configuration – Inappropriate use of RAID5, disk channel bottlenecks and poor disk striping.

4. Poor Optimizer Statistics – Prior to Oracle 10g (automatic statistics), a common cause of poor SQL performance was missing/stale CBO statistics and missing histograms.

5. Object contention – Failing to set ASSM, freelists or freelist_groups for DML-active tables and indexes can cause very slow DML performance.

6. Under-allocated RAM regions – Not allocating enough RAM for shared_pool_size, pga_aggregate_target and db_cache_size can cause an I/O-bound database.

7. Non-reentrant SQL – All SQL should use host variables/cursor_sharing=force to make SQL reusable within the library cache.

8. Un-set initialization parameters – Many of the initialization parameters are made to be set by the DBA (db_file_multiblock_read_count, optimizer_index_caching) and failing to set these parameters properly results in poorly optimized execution plans.

9. Excessive nested loop joins – In 64-bit Oracle systems we have gigabytes available for RAM sorts and hash joins.  Failing to set pga_aggregate_target to allow the CBO to choose hash joins can result in very slow SQL performance.

10. Human Misfeasance – The DBA’s failure to monitor their database (STATSPACK/AWR), set-up exception reporting alerts (OEM) and adjusting their instance to match changing workloads is a major cause of poor performance.

The Oracle Documentation lists these ten reasons for poor performance.  The BC list is similar, but our top-10 list is based on what we see with our clients:

1. Bad Connection Management

The application connects and disconnects for each database interaction. This problem is common with stateless middleware in application servers. It has over two orders of magnitude impact on performance, and it is totally unscalable.

2. Bad Use of Cursors and the Shared Pool

Not using cursors results in repeated parses. If bind variables are not used, then there is hard parsing of all SQL statements. This has an order of magnitude impact in performance, and it is totally unscalable. Use cursors with bind variables that open the cursor and execute it many times. Be suspicious of applications generating dynamic SQL.

3. Getting Database I/O Wrong

Many sites lay out their databases poorly over the available disks. Other sites specify the number of disks incorrectly, because they configure disks by disk space and not I/O bandwidth.

4. Redo Log Setup Problems

Many sites run with too few redo logs that are too small. Small redo logs cause system checkpoints to continuously put a high load on the buffer cache and I/O system. If there are too few redo logs, then the archive cannot keep up, and the database will wait for the archive process to catch up.

5. Serialization of data blocks in the buffer cache due to lack of free lists, free list groups, transaction slots (INITRANS), or shortage of rollback segments.

This is particularly common on INSERT-heavy applications, in applications that have raised the block size to 8K or 16K, or in applications with large numbers of active users and few rollback segments.

6. Long Full Table Scans

Long full table scans for high-volume or interactive online operations could indicate poor transaction design, missing indexes, or poor SQL optimization. Long table scans, by nature, are I/O intensive and unscalable.

7. In Disk Sorting

In disk sorts for online operations could indicate poor transaction design, missing indexes, or poor SQL optimization. Disk sorts, by nature, are I/O-intensive and unscalable.

8. High Amounts of Recursive (SYS) SQL

Large amounts of recursive SQL executed by SYS could indicate space management activities, such as extent allocations, taking place. This is unscalable and impacts user response time. Recursive SQL executed under another user ID is probably SQL and PL/SQL, and this is not a problem.

9. Schema Errors and Optimizer Problems

In many cases, an application uses too many resources because the schema owning the tables has not been successfully migrated from the development environment or from an older implementation. Examples of this are missing indexes or incorrect statistics. These errors can lead to sub-optimal execution plans and poor interactive user performance. When migrating applications of known performance, export the schema statistics to maintain plan stability using the DBMS_STATS package.

Likewise, optimizer parameters set in the initialization parameter file can override proven optimal execution plans. For these reasons, schemas, schema statistics, and optimizer settings should be managed together as a group to ensure consistency of performance.

10. Use of Nonstandard Initialization Parameters

These might have been implemented based on poor advice or incorrect assumptions. In particular, parameters associated with SPIN_COUNT on latches and undocumented optimizer features can cause a great deal of problems that can require considerable investigation.

Categories: Burleson Article

SQL tuning with views

The relational database model gave us the concept of views to simplify relational algebra, providing a way to “name” a complex query and treat it as-if it were a discrete table:
create view
myview
as
select   stuff from   tableA natural join  tableB
natiral join  tableC;
Now, we can logically simplify the query:
select stuff from myview where xxx=yyy;
The benefits to views
In sum, views provide a benefit in these areas:
·    Code reuse – Views ensure that everyone uses the exact same SQL to access their tables
·    Column access security – Using the “grant” security model, views can be used for column-level security, such that some columns in a table may be “hidden” by not specifying them in a view.
All benefits come at a cost, and one downside to using views is that it adds complexity to the SQL and makes it harder for the optimizer to service a query with the minimum amount of resources (either I/O or CPU resources, depending on your optimizer goal).
The problem of tuning SQL that contains views
While it’s clear that views are useful for end-user ad-hoc queries and cases where we want to simplify the syntax of complex SQL queries, we have a serious problem when queries contain views.
·    Predicate pushing – The downside to re-usable code is that where clause predicates must be “pushed” down the hierarchy of views, down to the base query.  This adds processing overhead to the optimizer and increases the chances of a bad execution plan.
·    Non mergeable views – Because a view is an encapsulation of a complex query, it is used as-if it were a discrete relational table.  Hence, Oracle must pre-materialize a view whenever it is used in a query.  This create a hidden sub-plan that must be exposed for SQL tuning.
·    Unnecessary overhead – Views are abused when they introduce unnecessary complexity.  For example, you may call a view that is composed of 10 tables where the result set only requires data from two tables.
·    Excessive hard parsing – Predicate pushing may result in a hard parse of the underlying SQL that is executed. Hence, it is important to make sure you use bind variables instead of literals in SQL code calling views. Thus, our SQL should look something like this instead for best performance:
SELECT *
FROM
vw_layer_two_dept_100
WHERE
empid=:b100;
To understand why views make it difficult to tune your SQL, let’s see what happens when a views is used with a where clause.  To function, Oracle must “push” the where clause predicate to the underlying view.
Predicate pushing with views
The Oracle SQL tuning problem becomes a nightmare when views are nested within other views.  Oracle supports pushing of predicates into a given view. Assume we have a set of nested views, like this, where view1 id referenced inside view2:
– View One
CREATE VIEW
vw_layer_one
AS
SELECT * FROM emp;
– view two
CREATE VIEW
vw_layer_two_dept_100
AS
SELECT * FROM vw_layer_one
WHERE deptno=100;
Then assume we issued this query:
SELECT *
FROM
vw_layer_two_dept_100
WHERE empid=100;
The predicate in this SQL is the where empid=100 statement. You may have one of tens or even hundreds of predicates. The Oracle optimizer is written to “push” predicates down into the views that are being referenced in the SQL. Thus, Oracle will transform the vw_layer_one view into a SQL statement that looks like this:
CREATE VIEW
vw_layer_one
AS
SELECT * FROM emp
WHERE
deptno=100
AND
empid=100;
Note that both the predicate from view two (where deptno=100) and the predicate from the SQL statement being executed (where empid=100) are pushed down into the final view that is executed. This can have significant performance benefits because now the bottom view can possibly use an index if one exists on deptno and/or empid.
Predicate pushing can get very complex and predicate pushing also has several performance gotchas:
·    Unnecessary overhead – Views are abused when they introduce unnecessary complexity.  For example, you may call a view that is composed of 10 tables where the result set only requires data from two tables.
·    Hard parsing – Predicate pushing may result in a hard parse of the underlying SQL that is executed. Hence, it is important to make sure you use bind variables instead of literals in SQL code calling views. Thus, our SQL should look something like this instead for best performance:
SELECT *
FROM
vw_layer_two_dept_100
WHERE
empid=:b100;

Abusing views in Oracle SQL
Views are very handy but they get badly abused, which is a shame. I’ve seen views that return 50 columns, and have 40 predicates used to return just two or three columns that could easily have been retrieved from a simple SQL query. This is clearly a case of view abuse, and can lead to badly performing views.
Stacked views can also mask performance problems. Again, they can result in innumerable columns being returned when all you really need are a few of those columns. Also, predicate pushing tends to break down as you stack more views on top of more views.
If you are going to start stacking views, carefully review the rules for predicate pushing in the Oracle documentation. But beware, the predicate pushing rules are long and involved and may change between each release of Oracle!
Merging views and SQL tuning
As we have seen, at optimization time, the CBO will attempt to “flatten out” the views by building and optimizing one large query against the base tables that comprise the view.  This is known as “merging” views, and a merge is critical to avoid a hidden sub-plan from being introduced into your SQL.
Whenever you are tuning SQL and you see the VIEW notation in  the explain plan, you may have a non-mergeable view:
select
count(1) from(
select distinct pat_id from
(
select
p.pat_id,
p.pat_last_name,
p.pat_first_name,
v.visit_id,
v.arrive_dt_tm,
v.depart_dt_tm,
r.test_id,
r.test_name,
r.result_dt_tm,
r.result_val
from
patient p,
pat_visit v,
pat_result r
where
p.pat_id=1
and
p.pat_id=v.pat_id
and
v.pat_id=r.pat_id)
);
Here we see the VIEW in the execution plan, indicating a sub-plan:

ID    PID    Operation Name                           Rows    Bytes    Cost    CPU Cost    IO Cost
0        SELECT STATEMENT                                1        2     8     25M          6
1    0      SORT AGGREGATE                               1        2
2    1        VIEW                                       1        2     8     25M          6
3    2          SORT UNIQUE NOSORT                       1       24     8     25M          6
4    3            HASH JOIN                            125     3000     7     8484914      6
5    4              NESTED LOOPS                         5       90     3     42157        3
6    5                INDEX UNIQUE SCAN    XPKPATIENT    1        9     0     1050         0
7    5                TABLE ACCESS FULL    PAT_VISIT     5       45     3     41107        3
8    4              TABLE ACCESS FULL     PAT_RESULT    25      150     3     63107        3

Note that the VIEW notation in an explain plan means that the view is not mergeable and Oracle must run a sub-plan (which is hidden).  Now, let’s revue some tips for ensuring that you get the fastest possible execution time when using views:
Tips for tuning SQL with views
It’s ironic that views make life simple for the developers but make life complex for the DBA’s who must tune the execution!  Here are some tricks for tuning SQL with views:
·    Use plan stability and SQL profiles – SQL profiles and optimizer plan stability )stored outlines) can swap-out a bad view plan with a correct plan using the base tables.
·    Use a stored procedure instead – Instead of a view, encapsulate the complex SQL inside a stored procedure.  This way, the SQL optimizer will not see any views because you manage the SQL within your own application.
·    Train your developers – If possible, teach the developers to write SQL using the base tables instead of relying on the cosmetic simplicity of views.
·    Optimize to use views – The all-powerful Oracle optimizer parameters can be used for holistic tuning of your entire workload.  For example, using optimizer_secure_view_merging causes Oracle to materialize the view results, resulting in faster query performance.
In sum, Oracle views are a necessary evil, but nesting of views with complex queries are a frequent cause of poor SQL performance.

Categories: Burleson Article

Tracking & auditing changes to your init.ora parameters

A very important auditing task for any production database is the ability to track changes to the powerful initialization parameters.  Many of these parameters are “silver bullets”, single parameters that have a profound impact on system-wide behavior.  This is especially true for SQL optimizer changes.

A single change to an optimizer parameter might effect thousands of SQL statements, a disaster in a controlled production environment.

Also see these important notes on Oracle best practices.

Auditing changes to init.ora parameters (via pfile or spfile) is an important DBA task.  Sometimes, users which have “alter system” privilege can make unauthorized changes to the initialization parameters in the spfile on a production database.  Hence, auditing changes to parameters is a critical DBA task.  Fortunately, it’s quite simple to audit these changes by implementing the audit_sys_operations=true.

Here is a method to track changes to the initialization parameters.  In order to track all changes to parameters we can use audit for the alter system statement for any specific user

We should follow below steps to track changes to init.ora parms:

1.       ALTER SYSTEM SET audit_trail=db SCOPE=SPFILE;

2.       SHUTDOWN IMMEDIATE

3.       STARTUP

4.       CREATE USER TEST IDENTIFIED BY TEST;

5.       GRANT DBA TO TEST;

6.       AUDIT ALTER SYSTEM BY test;

7.       CONN TEST/TEST

8.       ALTER SYSTEM SET AUDIT_TRAIL=db SCOPE=SPFILE;

9.    Create an alert script to notify the DBA when a parameter has changed.

Let’s start by finding the action_name in the dba_audit_trail view for the alter system command:

SQL> select username, timestamp, action_name from dba_audit_trail;
USERNAME                       TIMESTAMP ACTION_NAME
—————————— ——— —————————-
TEST                           29-MAY-09 ALTER SYSTEM

STEP 1 – We can track changes made by SYS user by setting audit_sys_operations parameter to TRUE.

SQL> alter system set audit_sys_operations=true scope=spfile;
System altered.

STEP 2 – Next, we bounce the instance to make the change take effect:
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area  285212672 bytes
Fixed Size                  1218992 bytes
Variable Size              92276304 bytes
Database Buffers          188743680 bytes
Redo Buffers                2973696 bytes
Database mounted.
Database opened.

Here we see our auditing parameters:

SQL> show parameter audit

NAME                                 TYPE        VALUE
———————————— ———– ——————————
audit_file_dest                      string      /home/oracle/oracle/product/10
.2.0/db_1/admin/kam/adump
audit_sys_operations                 boolean     TRUE
audit_syslog_level                   string
audit_trail                          string      DB
SQL> alter system set audit_trail=db scope=spfile;

System altered.

STEP 3 – Here we go to the adump directory and examine the audit files:
SQL> host
[oracle@localhost bin]$ cd /home/oracle/oracle/product/10.2.0/db_1/admin/kam/adump/
[oracle@localhost adump]$ ls
ora_5449.aud  ora_5476.aud  ora_5477.aud  ora_5548.aud  ora_5575.aud  ora_5576.aud

[oracle@localhost adump]$ cat ora_5576.aud
Audit file /home/oracle/oracle/product/10.2.0/db_1/admin/kam/adump/ora_5576.aud
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 – Production
With the Partitioning, OLAP and Data Mining options
ORACLE_HOME = /home/oracle/oracle/product/10.2.0/db_1/
System name:    Linux
Node name:      localhost.localdomain
Release:        2.6.18-92.el5
Version:        #1 SMP Tue Jun 10 18:49:47 EDT 2008
Machine:        i686
Instance name: TESTDB
Redo thread mounted by this instance: 1
Oracle process number: 15
Unix process pid: 5576, image: oracle@localhost.localdomain (TNS V1-V3)

Fri May 29 02:38:30 2009
ACTION : ‘alter system set audit_trail=db scope=spfile’
DATABASE USER: ‘/’
PRIVILEGE : SYSDBA
CLIENT USER: oracle
CLIENT TERMINAL: pts/2
STATUS: 0

Please beware that using the auditing command imposes additional work on the production database.

Categories: Burleson Article

How to tune Oracle SQL remotely

The central idea behind “swapping” SQL profiles is simple.  You define a SQL profile that specifies the SQL statement that you want to tune, and an alternative execution plan, in the form of hints.  When this SQL is executed and hits the library cache, Oracle detects that a SQL profile exists for this statement, and automatically applies the hints to change the execution plan.

Hence, we can tune SQL statements without ever touching the SQL statement itself.  To do this we use the DBMS_SQLTUNE package which has an import_sql_profile procedure which allows you to swap hints from one SQL profile into another SQL profile.

dbms_sqltune.import_sql_profile(
sql_text => ’select * from emp’,
profile => sqlprof_attr(‘ALL_ROWS’,'IGNORE_OPTIM_EMBEDDED_HINTS’)
category => ‘DEFAULT’,
name => ‘change_emp’,
force_match => &&6
);

To see how you can swap-out SQL profiles, let’s start by examining the dbms_sqltune package and the import_sql_profile procedure.

In the example below, he forces a query that specifies first_rows_10 to change to all_rows.  He does this trick when he executes dbms_sqltune.import_sql_profile with the profile sqlprof_attr(’ALL_ROWS’,’IGNORE_OPTIM_EMBEDDED_HINTS’):

 

1 – First, he executes the query and display the execution plan which shows an index full scan:

SQL> select /*+ first_rows(10) */ * from sh.customers order by cust_id;
—————————————————-
| Id  | Operation                   | Name         |
—————————————————-
|   0 | SELECT STATEMENT            |              |
|   1 |  TABLE ACCESS BY INDEX ROWID| CUSTOMERS    |
|   2 |   INDEX FULL SCAN           | CUSTOMERS_PK |
—————————————————-

 

2 – Next, he imports the all_rows hint into any query that matches the original query:

begin
dbms_sqltune.import_sql_profile(
name     => ‘test’,
category => ‘DEFAULT’,
sql_text => ’select /*+ first_rows(10) */ * from sh.customers order by cust_id’,
profile  => sqlprof_attr(‘ALL_ROWS’,'IGNORE_OPTIM_EMBEDDED_HINTS’)
);
end;
/

3 – Finally, we re-execute the original query and see that the plan has changed:

SQL> select /*+ first_rows(10) */ * from sh.customers order by cust_id;
—————————————-
| Id  | Operation          | Name      |
—————————————-
|   0 | SELECT STATEMENT   |           |
|   1 |  SORT ORDER BY     |           |
|   2 |   TABLE ACCESS FULL| CUSTOMERS |
—————————————-

 

Internally, SQL profiles are stored in the data dictionary with the SQL profile name, an attribute name, and the attribute value (the hint that is to be applied to the SQL).  Here is a query to display the hints within a SQL profile:

select
attr_val         hint_name
from
dba_sql_profiles prof,
sqlprof$attr     hnt
where
prof.signature = hnt.signature
and
name like (‘&profile_name’)

order by
attr#;

Categories: Burleson Article