Saturday, October 10, 2009

Using UNIX aliases with Oracle

A UNIX alias is a short command that is replaced with a larger command to save typing time. For example, we could create an alias to allow us to count the number of connected users on our oracle server:

root> alias numuse=’who|wc –l’

root> numuse

463

The whole point of UNIX aliases is to save typing. Here is a list on common UNIX aliases that can be added to your UNIX logon file for the UNIX Oracle user. These aliases perform common Oracle functions such as checking the Oracle alert log and transferring quickly between directories.

#*******************************
# UNIX aliases for Oracle DBAs
#*******************************
alias alert='tail -100 $DBA/$ORACLE_SID/bdump/alert_$ORACLE_SID.log|more'
alias errors='tail -100 $DBA/$ORACLE_SID/bdump/alert_$ORACLE_SID.log|more'
alias arch='cd $DBA/$ORACLE_SID/arch'
alias bdump='cd $DBA/$ORACLE_SID/bdump'
alias cdump='cd $DBA/$ORACLE_SID/cdump'
alias pfile='cd $DBA/$ORACLE_SID/pfile'
alias rm='rm -i'
alias sid='env|grep ORACLE_SID'
alias admin='cd $DBA/admin'

To illustrate how useful aliases are in Oracle administration, in the example below we can get to our pfile directory in a single command so we can view the contents of our init.ora file:

cheops*CPRO-/home/oracle
> pfile
cheops*CPRO-/u01/app/oracle/CPRO/pfile
>ls
initCPRO.ora

Aliases can also be used for sophisticated Oracle commands. For example, the following alias can be used to display all Oracle errors in the last 400 lines of the alert log:

cheops*testsid-/u01/app/oracle/admin/envtest/pfile
>alias errors='tail -100 $DBA/$ORACLE_SID/bdump/alert_$ORACLE_SID.log|grep ORA-'

cheops*testsid-/u01/app/oracle/admin/envtest/pfile
>errors
ORA-00604: error occurred at recursive SQL level 1
ORA-01089: immediate shutdown in progress - no operations are permitted
ORA-00604: error occurred at recursive SQL level 3
ORA-01089: immediate shutdown in progress - no operations are permitted

Enabling Your Database to perform hash joins

In cases where a very small table is being joined to a large table, the Oracle hash join will often dramatically speed-up the query. Hash joins are far faster than nested loop joins in some cases, but it is not easy to get your database to use hash joins.



Unfortunately, the hash join is more memory intensive than a nested loop join. To be faster than a nested loop join, we must set the hash_area_size large enough to hold the entire hash table in memory (about 1.6 times the sum of the rows in the table). If the hash join overflows the hash_area_size memory, the hash join will page into the TEMP tablespace, severely degrading the performance of the hash join. You can use the following script, hash_area.sql, to dynamically allocate the proper hash_area_size for your SQL query in terms of the size of your target table.



hash_area.sql
set heading off;
set feedback off;
set verify off;
set pages 999;

spool run_hash.sql

select
'alter session set hash_area_size='||trunc(sum(bytes)*1.6)||';'
from
dba_segments
where
segment_name = upper('&1');

spool off;

@run_hash


Here is the output from this script. As you see, we pass the driving table name, and the script generates the appropriate alter session command to ensure that we have enough space in hash_area_size RAM to hold the driving table.



SQL> @hash_area customer

alter session set hash_area_size=3774873;

In addition to seeing the hash_area_size, we must also be able to adjust the degree of parallelism in cases where we use a full-table scan to access the tables in a hash join.



This short tip does not have complete information about the hash join, but you can get complete information from Oracle High Performance SQL Tuning by Oracle Press. The hash join is very finicky, and there are many conditions that must be satisfied. It is not uncommon to find that a use_hash hint is ignored, and here are some common causes of this problem.



Check initialization parameters Make sure that you have the proper settings for optimizer_index_cost_adj and optimizer_max_permutations to limit the number of table join evaluations. Also check your values for hash_area_size and hash_multiblock_io_count.


Verify the driving Table Make sure that the smaller table is the driving table (the first table in the from clause when using the ordered hint). This is because a hash join builds the memory array using the driving table.


Analyze CBO statistics Check that tables and/or columns of the join tables are appropriately analyzed.


Check for skewed columns Histograms are recommended only for nonuniform column distributions. If necessary, you can override the join order chosen by the cost-based optimizer using the ordered hint.


Check RAM region Ensure that hash_area_size is large enough to hold the smaller table in memory. Otherwise, Oracle must write to the TEMP tablespace, slowing down the hash join. Oracle recommends that the hash_area_size for the driving table be set at 1.6 times the sum of bytes for the driving table, and you can use the hash_area.sql script to set the hash_area_size for your query session.


If you have followed these steps, you should be able to add a use_hash hint ton your SQL to invoke a hash join. Remember, it is always a good idea to run a explain plan on your SQL to ensure that the hash join is being invoked.

Searching for freelist and buffer busy waits

Whenever multiple insert or update tasks access a table, it is possible that Oracle may be forced to wait to access the first block in the table. The first block is called the segment header, and the segment header contains the freelist for the table.



The number of freelists for any table should be set to the high-water mark of concurrent inserts or updates.



The script below will tell you if you have waits for table or index freelists. If so, you need to identify the table and add additional freelists. In Oracle8i, you can add freelists with the ALTER table command, but in earlier versions of Oracle you must reorganize the table to add freelists.



The procedure for identifying the specific table associated with a freelist wait or a buffer busy wait is complex, but it is fully described in my Oracle Press book “Oracle High-Performance Tuning with STATSPACK.



column s_v format 999,999,999 heading 'Total Requests' new_value tnr

column count format 99999990 heading ‘count’ new_value cnt

column proc heading 'Ratio of waits'



PROMPT Current v$waitstat freelist waits...

PROMPT

set heading on;





prompt - This displays the total current waits on freelists

select

class,

count

from

v$waitstat

where

class = 'free list';





prompt - This displays the total gets in the database

select

sum(value) s_v

from

v$sysstat

where

name IN ('db block gets', 'consistent gets');



PROMPT Here is the ratio

select

&cnt/&tnr * 100 proc

from

dual;





Here is the listing from this script. Whenever waits appear in v$waitstat, you need to investigate to see which table or index is experiencing multiple concurrent inserts or updates.





SQL> @cont

Current v$waitstat freelist waits...

- This displays the total current waits on freelists



CLASS COUNT

------------------ ---------

free list 0

- This displays the total gets in the database



Total Num of Requests

---------------------

140318872

Here is the ratio



Ratio in %

----------

0



Please note the freelist contention also can be manifested as a buffer busy wait. This is because the block is already in the buffer, but cannot be accessed because another task has the segment header.



The section below describes the process the block address associated with a wait.



As we discussed, Oracle does not keep an accumulator to track individual buffer busy waits. To see them, you must create a script to detect them and then schedule the task to run frequently on your database server.



get_busy.ksh

#!/bin/ksh

# First, we must set the environment . . . .
ORACLE_SID=proderp
export ORACLE_SID
ORACLE_HOME=`cat /var/opt/oracle/oratab|grep \^$ORACLE_SID:|cut -f2 -d':'`
export ORACLE_HOME
PATH=$ORACLE_HOME/bin:$PATH
export PATH

SERVER_NAME=`uname -a|awk '{print $2}'`
typeset -u SERVER_NAME
export SERVER_NAME

# sample every 10 seconds
SAMPLE_TIME=10

while true
do

#*************************************************************
# Test to see if Oracle is accepting connections
#*************************************************************
$ORACLE_HOME/bin/sqlplus -s /< /tmp/check_$ORACLE_SID.ora
select * from v\$database;
exit
!

#*************************************************************
# If not, exit immediately . . .
#*************************************************************
check_stat=`cat /tmp/check_$ORACLE_SID.ora|grep -i error|wc -l`;
oracle_num=`expr $check_stat`
if [ $oracle_num -gt 0 ]
then
exit 0
fi

rm -f /export/home/oracle/statspack/busy.lst

$ORACLE_HOME/bin/sqlplus -s perfstat/perfstat< /tmp/busy.lst

set feedback off;
select
sysdate,
event,
substr(tablespace_name,1,14),
p2
from
v\$session_wait a,
dba_data_files b
where
a.p1 = b.file_id
;
!

var=`cat /tmp/busy.lst|wc -l`

echo $var
if [[ $var -gt 1 ]];
then
echo
**********************************************************************"
echo "There are waits"
cat /tmp/busy.lst|mailx -s "Prod block wait found"\
info@remote-dba.net \
Larry_Ellison@oracle.com
echo
**********************************************************************"
exit
fi

sleep $SAMPLE_TIME
done



As we can see from this script, it probes the database for buffer busy waits every 10 seconds. When a buffer busy wait is found, it mails the date, tablespace name, and block number to the DBA. Here is an example of a block alert e-mail:



SYSDATE SUBSTR(TABLESP BLOCK
--------- -------------- ----------
28-DEC-00 APPLSYSD 25654



Here we see that we have a block wait condition at block 25654 in the applsysd tablespace. The procedure for locating this block is beyond the scope of this tip, but complete directions are in Chapter 10 of Oracle High Performance Tuning with STATSPACK.

Monitoring Oracle Parallel Query with V$ Views

Many Oracle professionals are not aware that Oracle provides system views to show the internal state of parallel queries.



To see how many parallel query servers are busy at any given time, the following

query can be issued against the v$pq_sysstat view:





select
statistic,
value
from
v$pq_sysstat
where
statistic = 'Servers Busy';

STATISTIC VALUE
--------- -----
Servers Busy 30



In this case, we see that 30 parallel servers are busy at this moment. Do not be misled by this number. Parallel query servers are constantly accepting work or returning to idle status, so it is a good idea to issue the query many times over a one-hour period to get an accurate reading of parallel query activity. Only then will you receive a realistic measure of how many parallel query servers are being used.



There is one other method for observing parallel query from inside Oracle. If you are running Oracle on UNIX, you can use the ps command to see the parallel query background processes in action:



ps -ef|grep "ora_p"



In sum, Oracle has several tools that you can use to watch the Oracle parallel query processes in action. This information is very useful when you want to see if your parallelism is stressing your database server.

Monitoring Oracle Parallel Query with STATSPACK

In addition to monitoring parallel execution for individual queries, you can also monitor parallel query activity for your whole database. Using STATSPACK, you can query the stats$sysstat table to extract the number of parallelized queries that have been run during each time period between your STATSPACK snapshots. Here is the script that I use to monitor parallel queries:


rpt_parallel.sql
set pages 9999;

column nbr_pq format 999,999,999
column mydate heading 'yr. mo dy Hr.'

select
to_char(snap_time,'yyyy-mm-dd HH24') mydate,
new.value nbr_pq
from
perfstat.stats$sysstat old,
perfstat.stats$sysstat new,
perfstat.stats$snapshot sn
where
new.name = old.name
and
new.name = 'queries parallelized'
and
new.snap_id = sn.snap_id
and
old.snap_id = sn.snap_id-1
and
new.value > 1
order by
to_char(snap_time,'yyyy-mm-dd HH24')
;



Here is a sample of the output. This will quickly show the DBA the time periods when parallel full-table scans are being invoked:



SQL> @rpt_parallel

yr. mo dy hr. nbr_pq
------------- -------------
2001-03-12 20 3,521
2001-03-12 21 2,082
2001-03-12 22 2,832
2001-03-13 20 5,152
2001-03-13 21 1,835
2001-03-13 22 2,623
2001-03-14 20 4,274
2001-03-14 21 1,429
2001-03-14 22 2,313



In this example, we see that there appears to be a period each day between 8:00 p.m. and 10:00 p.m. when tasks are executing parallel queries against tables.

In practice, you may want to run the STATSPACK report to identify periods of high parallel query activity and then go to the stats$sql_summary table to examine and tune the individual parallel queries. Of course, you can also examine parallel query summaries since database start-up time by using the v$pq_sysstat view.



For additional details see the Oracle Press Book “Oracle High Performance Tuning with STATSPACK“.

Tuesday, October 6, 2009

Correlating UNIX processes with Oracle sessions

If we want to move outside of Oracle and see details for the session at the UNIX level, we must correlate the Oracle PIS with the UNIX PID. To see details of these processes you can write an Oracle script to filter the UNIX ps output to only include these processes:

#/bin/ksh
sqlplus system/oracle@prodlive<set pages 999
set feedback off
set echo off
set heading off

spool /tmp/run_pid.ksh

select
'ps -ef|grep '||spid||’grep –v grep’
from
v\$process
where
spid is not NULL
;

spool off;
!

# Execute the UNIX commands . . . .
chmod +x /tmp/*.ksh

Here is the output from this script. As we see, the SQL*Plus script builds the UNIX ps command for the Oracle PIDs and then executes the command.

root> /tmp/run_pid.ksh

jjahn 3231 3129 0 08:12:17 ? 0:00 oraclePA

bbeckero 2482 2377 0 07:59:26 ? 0:00 oraclePA

scarf 2376 785 0 07:59:03 ttyp9 0:00 telnetd

brobins 2377 2376 0 07:59:04 ttyp9 0:01 runmenu50 pamenu

monte 1372 1 0 Sep 21 ? 5:58 /opt/hpnp/bin/hpnpd

imels 1886 1878 0 Sep 21 ttyp1 0:00 tee -a

This script allows us to see the start time for the UNIX connection and also see the cumulative CPU consumption for each task.

Oracle9i and multiple block sizes

For the Oracle administrator, multiple blocksizes in Oracle9i are extremely important and exciting. For the first time, you will be able to customize your data buffer sizes according to the specific needs of your database. Prior to Oracle9i, your entire Oracle database had to have a single block size and this block size was determined at the time that the database was created.

With Oracle9i, we can define tablespaces with block sizes of 2K, 4K, 8K, 16K and 32K, and assign tables and indexes to the best block size to minimize I/O and best manage wasted space in our data buffers. When we combine the new data buffers for these block sizes, we get a total of seven separate and distinct data buffers to segregate our incoming table and index rows.

As we know, disk I/O is the single most expensive operation within an Oracle9i database, and multiple block sizes give us a powerful new tool to manage disk I/O with more power than ever before.

Until RAM memory becomes cheap enough that we can cache our whole database, we need to manage the RAM that we allocate to our data buffers. The allocation of tables and indexes according to block sizes is a balancing act.

If we allocate the data blocks too large, then we waste valuable data buffer space holding row data that Oracle will never reference. If we allocate the data block too small, and Oracle will have to do more disk I/O to satisfy a request. Here are some general rules for allocating data block sizes:

Segregate large-table full-table scans - Tables that experience large-table full-table scans will benefit from the largest supported block size and should be placed in a tablespace with your largest block size.

Set db_recycle_cache_size carefully - If you are not setting db_cache_size to the largest supported block size for your server, you should not use the db_recycle_cache_size parameter. Instead, you will want to create a db_32k_cache_size (or whatever your max is), and assign all tables that experience frequent large-table full-table scans to the largest buffer cache in your database.

The Data Dictionary cache uses the default block size - You should ensure that the data dictionary (e.g. your SYSTEM tablespace) is always fully cached in a data buffer pool. Remember, the block size of the data dictionary is not as important as ensuring that the data buffer associated with the SYSTEM tablespace has enough RAM to fully-cache all data dictionary blocks.

Indexes want large block sizes - Indexes will always favor the largest supported blocksize. You want to be able to retrieve as many index nodes as possible in a single I/O, especially for SQL that performs index range scans. Hence, all indexes should reside in tablespaces with a 32k block size.

Average row length determines block size - The block size for a tables’ tablespace should always be greater than the average row length for the table (dba_tables.avg_row_len). Not it is smaller than the average row length, rows chaining occurs and excessive disk I/O is incurred.

Use large blocks for data sorting – Your TEMP tablespace will benefit from the largest supported blocksize. This allows disk sorting to happen in large blocks with a minimum of disk I/O.