17
Configuring Instance Recovery Performance

This chapter offers guidelines for configuring the time to perform instance recovery.

This chapter contains the following sections:

Understanding Instance Recovery

Instance and crash recovery are the automatic application of redo log records to Oracle data blocks after a crash or system failure. During normal operation, if an instance is shutdown cleanly as when using a SHUTDOWN IMMEDIATE statement, rather than terminated abnormally, then the in-memory changes that have not already been written to the datafiles on disk are written to disk as part of the checkpoint performed during shutdown.

However, if a single instance database crashes or if all instances of an Oracle Real Application Cluster configuration crash, then Oracle performs crash recovery at the next startup. If one or more instances of an Oracle Real Application Cluster configuration crash, then a surviving instance performs instance recovery automatically. Instance and crash recovery occur in two steps: cache recovery followed by transaction recovery.

Cache Recovery (Rolling Forward)

During the cache recovery step, Oracle applies all committed and uncommitted changes in the redo log files to the affected data blocks. The work required for cache recovery processing is proportional to the rate of change to the database (update transactions each second) and the time between checkpoints.

Transaction Recovery (Rolling Back)

To make the database consistent, the changes that were not committed at the time of the crash must be undone (in other words, rolled back). During the transaction recovery step, Oracle applies the rollback segments to undo the uncommitted changes. The work required to do transaction recovery is proportional to the number and size of uncommitted transactions when the system fault occurred.

Checkpointing and Cache Recovery

Periodically, Oracle records a checkpoint. A checkpoint is the highest system change number (SCN) such that all data blocks less than or equal to that SCN are known to be written out to the data files. If a failure occurs, then only the redo records containing changes at SCNs higher than the checkpoint need to be applied during recovery. The duration of cache recovery processing is determined by two factors: the number of data blocks that have changes at SCNs higher than the SCN of the checkpoint, and the number of log blocks that need to be read to find those changes.

How Checkpoints Affect Performance

Frequent checkpointing writes dirty buffers to the datafiles more often than otherwise, and so reduces cache recovery time in the event of an instance failure. If checkpointing is frequent, then applying the redo records in the redo log between the current checkpoint position and the end of the log involves processing relatively few data blocks. This means that the cache recovery phase of recovery is fairly short.

However, in a high-update system, frequent checkpointing can reduce runtime performance, because checkpointing causes DBWn processes to perform writes.

Fast Instance Recovery Trade-offs

To minimize the duration of instance recovery, you must force Oracle to checkpoint often, thus keeping the number of redo log records to be applied during recovery to a minimum. However, in a high-update system, frequent checkpointing increases the overhead for normal database operations.

If daily operational efficiency is more important than minimizing recovery time, then decrease the frequency of writes to data files due to checkpoints. This should improve operational efficiency, but also increase instance recovery time.

See Also:

"Reducing Checkpoint Frequency to Optimize Runtime Performance" for information on how to maximize runtime performance
"Configuring the Duration of Cache Recovery" for information on how to minimize instance recovery time

Reducing Checkpoint Frequency to Optimize Runtime Performance

To reduce the checkpoint frequency and optimize runtime performance, you can do the following:

Size your online redo log files according to the amount of redo your system generates. A rough guide is to switch logs at most once every twenty minutes. Small log files can increase checkpoint activity and reduce performance. You should make all logs the same size.
Set the value of the LOG_CHECKPOINT_INTERVAL initialization parameter (in multiples of physical block size) to zero. This value eliminates interval checkpoints.
Set the value of the LOG_CHECKPOINT_TIMEOUT initialization parameter to zero. This value eliminates time-based checkpoints.
Set the value of FAST_START_MTTR_TARGET (and FAST_START_IO_TARGET) to zero to disable fast-start checkpointing.

See Also:
Oracle9i Database Concepts for a complete discussion of checkpoints

Configuring the Duration of Cache Recovery

There are several methods for tuning cache recovery to keep the duration of recovery within user-specified bounds. These include the following:

Initialization Parameters that Influence Cache Recovery Time

The initialization parameters in Example 17-1 influence cache recovery time.

Table 17-1 Initialization Parameters Influencing Cache Recovery

Parameter	Purpose
`FAST_START_MTTR_TARGET`	Lets you specify in seconds the expected mean time to recover (MTTR), which is the expected amount of time Oracle takes to perform recovery and startup the instance.
FAST_START_IO_TARGET	This initialization parameter has been deprecated in favour of `FAST_START_MTTR_TARGET`. This parameter specifies the upper limit on the number of dirty buffers.
LOG_CHECKPOINT_TIMEOUT	Limits the number of seconds between the most recent redo record and the checkpoint.
LOG_CHECKPOINT_INTERVAL	Limits the number of redo blocks generated between the most recent redo record and the checkpoint.
RECOVERY_PARALLELISM	Specifies the number of concurrent recovery processes to be used in instance or crash recovery.
LOG_PARALLELISM	Specifies the level of concurrency for redo allocation within Oracle.

Note:

Oracle recommends using the FAST_START_MTTR_TARGET initialization parameter to control the duration of startup after instance failure. Fast-start checkpointing is only available with Enterprise Edition.

The FAST_START_IO_TARGET initialization parameter has been deprecated in favor of the FAST_START_MTTR_TARGET parameter.

The initialization parameter DB_BLOCK_MAX_DIRTY_TARGET has been removed.

Use Fast-Start Checkpointing to Limit Instance Recovery Time

Oracle Enterprise Edition features include a fast-start fault recovery functionality to control instance recovery. This reduces the time required for cache recovery and makes the recovery bounded and predictable by limiting the number of dirty buffers and the number of redo records generated between the most recent redo record and the last checkpoint.

The foundation of fast-start recovery is the fast-start checkpointing architecture. Instead of the conventional event driven (that is, log switching) checkpointing, which does bulk writes, fast-start checkpointing occurs incrementally. Each DBWn process periodically writes buffers to disk to advance the checkpoint position. The oldest modified blocks are written first to ensure that every write lets the checkpoint advance. Fast-start checkpointing eliminates bulk writes and the resultant I/O spikes that occur with conventional checkpointing.

Administrators specify a target (bounded) time to complete the cache recovery phase of recovery with the FAST_START_MTTR_TARGET initialization parameter, and Oracle automatically varies the incremental checkpoint writes to meet that target. The FAST_START_MTTR_TARGET initialization parameter lets you specify in seconds the expected mean time to recover (MTTR), which is the expected amount of time Oracle takes to perform crash or instance recovery for a single instance.

FAST_START_MTTR_TARGET

The FAST_START_MTTR_TARGET initialization parameter simplifies the configuration of recovery time from instance or system failure. This parameter lets you specify the number of seconds crash or instance recovery is expected to take. The FAST_START_MTTR_TARGET is internally converted to a set of parameters that modify the operation of Oracle such that recovery time is as close to this estimate as possible.

Note:

You should disable or remove the FAST_START_IO_TARGET, LOG_CHECKPOINT_INTERVAL, and LOG_CHECKPOINT_TIMEOUT initialization parameters when using FAST_START_MTTR_TARGET. Setting these parameters to active values interferes with FAST_START_MTTR_TARGET, resulting in a different than expected value in the TARGET_MTTR column of the V$INSTANCE_RECOVERY view.

The maximum value for FAST_START_MTTR_TARGET is 3600, or one hour. If you set the value to more than 3600, then Oracle rounds it to 3600. There is no minimum value for FAST_START_MTTR_TARGET. However, this does not mean that you can target the recovery time as low as you want. The time to do a crash recovery is limited by the low limit of the target number of dirty buffers, which is 1000, as well as factors such as how long initialization and file open take.

If you set the value of FAST_START_MTTR_TARGET too low, then the effective mean time to recover (MTTR) target will be the best MTTR target the system can achieve. If you set the value of FAST_START_MTTR_TARGET to such a high value that even in the worst-case recovery would not take that long, then the effective MTTR target will be the estimated MTTR in the worst-case scenario when the whole buffer cache is dirty. Use the TARGET_MTTR column in the V$INSTANCE_RECOVERY view to see the effective MTTR.

Note:

The TARGET_MTTR column in V$INSTANCE_RECOVERY could be different than FAST_START_MTTR_TARGET if the latter is set too low or too high. Periodically check the MTTR_TARGET column in the V$INSTANCE_RECOVERY view and compare it with the parameter setting. Adjust the parameter setting if it is consistently different from the value in the target.

See Also:

"Monitoring Estimated MTTR: Example Scenario" for more information on setting FAST_START_MTTR_TARGET

Set LOG_CHECKPOINT_TIMEOUT to Influence the Amount of Redo

Set the initialization parameter LOG_CHECKPOINT_TIMEOUT to an integer value n to require that the latest checkpoint position follow the most recent redo block by no more than n seconds. In other words, at most, n seconds worth of logging activity can occur between the most recent checkpoint position and the last block written to the redo log. This forces the checkpoint position to keep pace with the most recent redo block.

You can also interpret LOG_CHECKPOINT_TIMEOUT as specifying an upper bound on the time a buffer can be dirty in the cache before DBWn must write it to disk. For example, if you set LOG_CHECKPOINT_TIMEOUT to 60, then no buffers remain dirty in the cache for more than 60 seconds. The default value for LOG_CHECKPOINT_TIMEOUT is 1800, or 30 minutes.

Set LOG_CHECKPOINT_INTERVAL to Influence the Amount of Redo

Set the initialization parameter LOG_CHECKPOINT_INTERVAL to a value n (where n is an integer) to require that the checkpoint position never follow the most recent redo block by more than n blocks. In other words, at most n redo blocks can exist between the checkpoint position and the last block written to the redo log. In effect, you are limiting the amount of redo blocks that can exist between the checkpoint and the end of the log.

Oracle limits the maximum value of LOG_CHECKPOINT_INTERVAL to 90% of the smallest log to ensure that the checkpoint advances into the current log before that log fills and a log switch is attempted.

LOG_CHECKPOINT_INTERVAL is specified in redo blocks. Redo blocks are the same size as operating system blocks. Use the LOG_FILE_SIZE_REDO_BLKS column in V$INSTANCE_RECOVERY to see the number of redo blocks corresponding to 90% of the size of the smallest log file.

See Also:

"Calculating Performance Overhead"
Chapter 15, "I/O Configuration and Design" for more information on tuning checkpoints

Use Parallel Recovery to Speed up Redo Application

Use parallel recovery to tune the cache recovery phase of recovery. Parallel recovery uses a division of labor approach to allocate different processes to different data blocks during the cache recovery phase of recovery.

For example, during recovery the redo log is read, and blocks that require redo application are parsed out. These blocks are subsequently distributed evenly to all recovery processes to be read into the buffer cache. Crash, instance, and media recovery of datafiles on different disk drives are good candidates for parallel recovery.

Use the RECOVERY_PARALLELISM initialization parameter to specify the number of concurrent recovery processes for instance or crash recovery. To use parallel processing, the value of RECOVERY_PARALLELISM must be greater than 1 and cannot exceed the value of the PARALLEL_MAX_SERVERS initialization parameter.

The LOG_PARALLELISM initialization parameter allows the parallel generation of redo and can increase the throughput of certain update-intensive workloads. If you are using Oracle on high-end servers that have more than 16 processors, and you are experiencing very high contention on the redo allocation latch, then you should consider enabling parallel redo.

Note:

The RECOVERY_PARALLELISM initialization parameter specifies the number of concurrent recovery processes for instance or crash recovery only.

Media recovery is not affected by this parameter. Use the PARALLEL clause in the RECOVER DATABASE statement for media recovery.

Recovery is usually I/O bound on reads to data blocks. Consequently, parallelism at the block level can only help recovery performance if it speeds up total I/Os. Performance on systems with efficient asynchronous I/O typically does not improve significantly with parallel recovery, unless the recovery is CPU-bound.

See Also:

Oracle9i Database Reference for more information on initialization parameters

Monitoring Cache Recovery

Use the V$INSTANCE_RECOVERY view to see the current recovery parameter settings. You can also use statistics from this view to calculate which parameter has the greatest influence on checkpointing. V$INSTANCE_RECOVERY contains the columns shown in Table 17-2.

Note:

The last three fields in V$INSTANCE_RECOVERY are new with Oracle9i, and they are the most important. With the initialization parameter FAST_START_MTTR_TARGET, the other seven fields of V$INSTANCE_RECOVERY are less useful.

Table 17-2 V$INSTANCE_RECOVERY View

Column	Description
RECOVERY_ESTIMATED_IOS	Contains the number of dirty buffers in the buffer cache. (In Standard Edition, the value of this field is always `NULL`).
ACTUAL_REDO_BLKS	Current number of redo blocks required to be read for recovery.
TARGET_REDO_BLKS	Goal for the maximum number of redo blocks to be processed during recovery. This value is the minimum of the next three columns (`LOG_FILE_SIZE_REDO_BLKS`, `LOG_CHKPT_TIMEOUT_REDO_BLKS`, `LOG_CHKPT_INTERVAL_REDO_BLKS`).
LOG_FILE_SIZE_REDO_BLKS	Number of redo blocks to be processed during recovery corresponding to 90% of the size of the smallest log file.
LOG_CHKPT_TIMEOUT_REDO_BLKS	Number of redo blocks that must be processed during recovery to satisfy `LOG_CHECKPOINT_TIMEOUT`.
LOG_CHKPT_INTERVAL_REDO_BLKS	Number of redo blocks that must be processed during recovery to satisfy `LOG_CHECKPOINT_INTERVAL`.
FAST_START_IO_TARGET_REDO_BLKS	This field is obsolete. It is retained for backward compatibility. The value of this field is always `NULL`.
TARGET_MTTR	Effective mean time to recover (MTTR) target in seconds. Usually, it should be equal to the value of the `FAST_START_MTTR_TARGET` initialization parameter. If `FAST_START_MTTR_TARGET` is set to such a small value that it is impossible to do a recovery within its time frame, then the `TARGET_MTTR` field contains the effective MTTR target, which is larger than `FAST_START_MTTR_TARGET`. If `FAST_START_MTTR_TARGET` is set to such a high value that even in the worst-case (the whole buffer cache is dirty) recovery would not take that long, then the `TARGET_MTTR` field contains the estimated MTTR in the worst-case scenario. This field is 0 if `FAST_START_MTTR_TARGET` is not specified.
ESTIMATED_MTTR	The current estimated mean time to recover (MTTR) in the number of seconds based on the number of dirty buffers and log blocks (gives the current estimated MTTR even if `FAST_START_MTTR_TARGET` is not specified).
CKPT_BLOCK_WRITES	Number of writes to disk that would be avoided if checkpointing has been disabled.

See Also:

Oracle9i Database Reference for more information on the V$INSTANCE_RECOVERY view

Monitoring Estimated MTTR: Example Scenario

The TARGET_MTTR field of V$INSTANCE_RECOVERY contains the MTTR target in effect. The ESTIMATED_MTTR field of V$INSTANCE_RECOVERY contains the estimated MTTR should a crash happen right away. Query these two fields to see if the system can keep up with your specified MTTR target.

For example, assume the initialization parameter setting is as follows:

FAST_START_MTTR_TARGET = 6      # seconds

Execute the following query after database open:

SELECT TARGET_MTTR, ESTIMATED_MTTR, CKPT_BLOCK_WRITES 
FROM V$INSTANCE_RECOVERY;

Oracle responds with the following:

TARGET_MTTR ESTIMATED_MTTR CKPT_BLOCK_WRITES
18          15             0

You see that TARGET_MTTR is 18 seconds, which is higher than the value of FAST_START_MTTR_TARGET specified (6 seconds). This means that it is impossible to recover the database within 6 seconds. 18 seconds is the minimum MTTR target that the system can achieve.

The 18 second minimum is calculated based on the absolute low limit of 1000 blocks on the target of number of dirty buffers (The deprecated initialization parameter FAST_START_IO_TARGET follows this low limit; that is, you cannot set FAST_START_IO_TARGET to less than 1000). The ESTIMATED_MTTR field contains the estimated mean time to recovery. Because the database has just opened, the system contains few dirty buffers. That is why ESTIMATED_MTTR can be lower than the minimum possible TARGET_MTTR.

Now assume that you use the following statement to modify FAST_START_MTTR_TARGET:

ALTER SYSTEM SET FAST_START_MTTR_TARGET = 30;

Reissue the query to V$INSTANCE_RECOVERY, and Oracle responds with the following:

TARGET_MTTR ESTIMATED_MTTR CKPT_BLOCK_WRITES
30          15             0

The ESTIMATED_MTTR field is still 15 seconds, which means that the estimated MTTR at the current time (should a crash happen immediately) is still 15 seconds. This is because no new redo is written, and no data block has become dirty.

Assume that heavy update activity occurs in the database and then you query V$INSTANCE_RECOVERY immediately afterward. Oracle responds with the following:

TARGET_MTTR ESTIMATED_MTTR CKPT_BLOCK_WRITES
30          36             54367

You see that the effective MTTR target is 30 seconds. The estimated MTTR at the current time (should a crash happen immediately) is 36 seconds. This is fine. This means that some checkpoints writes might not have finished yet, so the buffer cache contains more dirty buffers than targeted.

Assume that you wait for one minute and reissue the query to V$INSTANCE_RECOVERY. Oracle responds with the following:

TARGET_MTTR ESTIMATED_MTTR CKPT_BLOCK_WRITES
30          31             55230

The estimated MTTR at this time has dropped to 31 seconds. This is because more dirty buffers have been written out during this period. This is shown by the increase of CKPT_BLOCK_WRITES field of V$INSTANCE_RECOVERY.

Note:

The number of physical writes minus the number of physical writes non checkpoint (from V$SYSSTAT) equals the field CKPT_BLOCK_WRITES in V$INSTANCE_RECOVERY.

Calculating Performance Overhead

To calculate performance overhead, use the V$SYSSTAT view. For example, assume that you execute the following query:

SELECT NAME, VALUE 
FROM V$SYSSTAT
WHERE NAME IN ('physical reads','physical writes',
               'physical writes non checkpoint');

Oracle responds with the following:

NAME                                 VALUE
physical reads                        2376
physical writes                      14932
physical writes non checkpoint       11165

The first row shows the number of data blocks retrieved from disk. The second row shows the number of data blocks written to disk. The last row shows the number of writes to disk that would occur if you turned off checkpointing.

Use this data to calculate the overhead imposed by setting the FAST_START_MTTR_TARGET initialization parameter. To effectively measure the percentage of extra writes, mark the values for these statistics at different times, t_1 and t_2.

Calculate the percentage of extra I/Os generated by fast-start checkpointing using the following formula:

[((PW_2 - PW_1) - (PWNC_2 - PWNC_1)) / ((PR_2 - PR_1) + (PW_2 - PW_1))] x 100% = EIO

where the variables are described in Table 17-3.

Table 17-3 Variable Definitions

Variable	Definition
`*_1`	Value of prefixed variable at time `t_1`, which is any time after the database has been running for a while
`*_2`	Value of prefixed variable at time `t_2`, which is later than `t_1` and not immediately after changing any of the checkpoint parameters
`PWNC`	`physical` `writes` `non` `checkpoint`
`PW`	`physical` `writes`
`PR`	`physical` `reads`
`EIO`	Percentage of estimated extra I/Os generated by enabling checkpointing

It can take some time for database statistics to stabilize after instance startup or dynamic initialization parameter modification. After such events, wait until all blocks age out of the buffer cache at least once before taking measurements. If the percentage of extra I/Os is too high, then increase the value of FAST_START_MTTR_TARGET.

The number of extra writes caused by setting FAST_START_MTTR_TARGET to a nonzero value is application-dependent; it is not dependent on cache size.

Calculating Performance Overhead: Example Scenario

As an example, assume the initialization parameter setting is as follows:

FAST_START_MTTR_TARGET = 90  # 90 seconds

After the statistics stabilize, you issue this query on V$SYSSTAT:

SELECT NAME, VALUE 
FROM V$SYSSTAT
WHERE NAME IN ('physical reads','physical writes',
               'physical writes non checkpoint');

Oracle responds with the following:

NAME                               VALUE
physical reads                      2376
physical writes                    14932
physical writes non checkpoint     11165

The physical write checkpoint statistics can also be found in the CKPT_BLOCK_WRITES field of the V$INSTANCE_RECOVERY view. For example:

SELECT CKPT_BLOCK_WRITES 
FROM V$INSTANCE_RECOVERY;

Oracle responds with the following:

CKPT_BLOCK_WRITES    3767

It is consistent with the result from V$SYSSTAT: 3767 = 14932 - 11165.

After making updates for a few hours, you reissue the query. Oracle responds with the following:

NAME                               VALUE
physical reads                      3011
physical writes                    17467
physical writes non checkpoint     13231

Substitute the values from the SELECT statements in the formula to determine how much performance overhead you are incurring:

[((17467 - 14932) - (13231 - 11165)) / ((3011 - 2376) + (17467 - 14932))] x 100% = 14.8%

As the result indicates, enabling fast-start checkpointing generates about 15% more I/O than required had you not enabled fast-start checkpointing. After calculating the extra I/O, you decide you can afford more system overhead if you decrease recovery time.

To decrease recovery time, reduce the value for the parameter FAST_START_MTTR_TARGET to 60. After items in the buffer cache age out, calculate V$SYSSTAT statistics across a second interval to determine the new performance overhead. Query V$SYSSTAT:

SELECT NAME, VALUE FROM V$SYSSTAT
WHERE NAME IN ('physical reads', 'physical writes',
'physical writes non checkpoint');

Oracle responds with the following:

NAME                               VALUE
physical reads                      4652
physical writes                    28864
physical writes non checkpoint     21784

After making updates, reissue the query. Oracle responds with the following:

NAME                               VALUE
physical reads                      6000
physical writes                    35394
physical writes non checkpoint     26438

Calculate how much performance overhead you are incurring using the values from the two SELECT statements:

[(35394 - 28864) - (26438 - 21784)) / ((6000 - 4652) + (35394 - 28864))] x 100% = 23.8%

After changing the parameter, the percentage of I/Os performed by Oracle is now about 24% more than it would be if you disabled fast-start checkpointing.

Calibrating the MTTR

The FAST_START_MTTR_TARGET initialization parameter calculates internal system trigger values to limit the length of the redo log and the number of dirty data buffers in the data cache. This calculation uses estimated times to read a redo block and to read and write a data block.

Initially, internal defaults are used. These defaults are replaced by execution time estimates during system operation. However, the best values are obtained from measurements taken from an actual recovery from a failure.

Note:

To effectively align FAST_START_MTTR_TARGET, make sure that you perform several instance recoveries to ensure that the time to read a redo block and the time to read and write a data block are recorded accurately.

Before doing instance recoveries to calibrate the FAST_START_MTTR_TARGET, decide whether FAST_START_MTTR_TARGET is being calibrated for a database crash or a hardware crash. This is a consideration if your database files are stored in a file system or if your I/O subsystem has a memory cache, because there is a considerable difference in the read and write time to disk depending on whether or not the files are cached. The workload being run during the instance recovery should be a very good representation of the average workload on the system to ensure that the amount of redo records generated are similar.

MTTR Advisory

Starting with Oracle9i Release 2 (9.2), MTTR advisory is available to help you evaluate the effect of different MTTR settings on system performance in terms of extra physical writes.

How MTTR Advisory Works

When MTTR advisory is enabled, after the system runs a typical workload for a while, you can query V$MTTR_TARGET_ADVICE, which tells you the ratio of estimated number of cache writes under other MTTR settings to the number of cache writes under the current MTTR. For instance, a ratio of 1.2 indicates 20% more cache writes.

By looking at the different MTTR settings and their corresponding cache write ratio, you can decide which MTTR value fits your recovery and performance needs. V$MTTR_TARGET_ADVICE also gives the ratio on total physical writes (including direct writes), and the ratio on total I/Os (including reads).

Enabling MTTR Advisory

Enabling MTTR Advisory involves setting two initialization parameters:

STATISTICS_LEVEL

Make sure that STATISTICS_LEVEL is set to TYPICAL or ALL.

FAST_START_MTTR_TARGET

To enable MTTR advisory, set the initialization parameter FAST_START_MTTR_TARGET to a nonzero value. If FAST_START_MTTR_TARGET is not specified, then MTTR advisory will be OFF.

When MTTR advisory is ON, it simulates checkpoint queue behavior under five different MTTR settings:

Current FAST_START_MTTR_TARGET setting

Current setting times 0.1, 0.5, 1.5, and 2

Note:

Whenever you set FAST_START_MTTR_TARGET to a nonzero value, and while MTTR advisory is ON, Oracle Corporation recommends that you disable (set to 0) the following parameters:

LOG_CHECKPOINT_TIMEOUT
LOG_CHECKPOINT_INTERVAL
FAST_START_IO_TARGET

Because these initialization parameters either override FAST_START_MTTR_TARGET or potentially drive checkpoints more aggressively than FAST_START_MTTR_TARGET does, they can interfere with the simulation.

Viewing MTTR Advisory

Oracle9i Release 2 (9.2) provides a dynamic performance view for viewing statistics or advisories collected by MTTR advisory.

V$MTTR_TARGET_ADVICE

If MTTR advisory has been turned on, V$MTTR_TARGET_ADVICE shows the advisory information collected. Usually this view show five rows, corresponding to the current MTTR, 0.1 times the current MTTR, 0.5 times the current MTTR, 1.5 times the current MTTR and 2 times the current MTTR. However, if one or more of the 5 values are less than the smallest MTTR target the system can sustain, their corresponding rows are replaced with a single row corresponding to the smallest MTTR target the system can have. Similarly, if one or more of the 5 values are larger than the worst-case MTTR target the system can have, their corresponding rows are replaced with a single row corresponding to the worst-case MTTR target the system can have.

If MTTR advisory is currently OFF, the view shows information collected the last time MTTR advisory was on.

See Also:

"V$MTTR_TARGET_ADVICE" for column details of these views

Tuning Transaction Recovery

During the second phase of instance recovery, Oracle rolls back uncommitted transactions. Oracle uses two features, fast-start on-demand rollback and fast-start parallel rollback, to increase the efficiency of this recovery phase.

Note:

These features are part of fast-start fault recovery and are only available in the Oracle9i Enterprise Edition.

This section contains the following topics:

Using Fast-Start On-Demand Rollback

Using the fast-start on-demand rollback feature, Oracle automatically allows new transactions to begin as soon as the database opens, which is usually a very short time after cache recovery completes. If a user attempts to access a row that is locked by a terminated transaction, Oracle rolls back only those changes necessary to complete the transaction; in other words, it rolls them back on demand. Consequently, new transactions do not have to wait until all parts of a long transaction are rolled back.

Note:

Oracle does this automatically. You do not need to set any parameters or issue statements to use this feature.

Using Fast-Start Parallel Rollback

In fast-start parallel rollback, the background process SMON acts as a coordinator and rolls back a set of transactions in parallel using multiple server processes. Essentially, fast-start parallel rollback is to transaction recovery what parallel recovery is to cache recovery.

Fast-start parallel rollback is mainly useful when a system has transactions that run a long time before committing, especially parallel INSERT, UPDATE, and DELETE operations. SMON automatically decides when to begin parallel rollback and disperses the work among several parallel processes: process one rolls back one transaction, process two rolls back a second transaction, and so on.

One special form of fast-start parallel rollback is intra-transaction recovery. In intra-transaction recovery, a single transaction is divided among several processes. For example, assume eight transactions require recovery with one parallel process assigned to each transaction. The transactions are all similar in size except for transaction five, which is quite large. This means it takes longer for one process to roll this transaction back than for the other processes to roll back their transactions.

In this situation, Oracle automatically begins intra-transaction recovery by dispersing transaction five among the processes: process one takes one part, process two takes another part, and so on.

You control the number of processes involved in transaction recovery by setting the initialization parameter FAST_START_PARALLEL_ROLLBACK to one of three values listed in Table 17-4.

Table 17-4 FAST_START_PARALLEL_ROLLBACK Parameter Values

Value	Meaning
`FALSE`	Turns off fast-start parallel rollback.
`LOW`	Specifies that the number of recovery servers cannot exceed twice the value of the `CPU_COUNT` initialization parameter.
`HIGH`	Specifies that the number of recovery servers cannot exceed four times the value of the `CPU_COUNT` initialization parameter.

Parallel Rollback in an Oracle Real Application Clusters Configuration

In Oracle Real Application Clusters, you can perform fast-start parallel rollback on each instance. Within each instance, you can perform parallel rollback on transactions that are:

Online on a given instance
Offline and not being recovered on instances other than the given instance

After a rollback segment is online for a given instance, only this instance can perform parallel rollback on transactions on that segment.

Monitoring Progress of Fast-Start Parallel Rollback

Monitor the progress of fast-start parallel rollback by examining the V$FAST_START_SERVERS and V$FAST_START_TRANSACTIONS views. V$FAST_START_SERVERS provides information about all recovery processes performing fast-start parallel rollback. V$FAST_START_TRANSACTIONS contains data about the progress of the transactions.

See Also:

Oracle9i Database Administrator's Guide for more information on managing undo space
Oracle9i Real Application Clusters Deployment and Performance for more information on fast-start parallel rollback in an Oracle Real Application Clusters environment
Oracle9i Database Reference for more information about initialization parameters
Oracle9i Net Services Administrator's Guide for information on transparent application failure (TAF)

17 Configuring Instance Recovery Performance

Understanding Instance Recovery

Cache Recovery (Rolling Forward)

Transaction Recovery (Rolling Back)

Checkpointing and Cache Recovery

How Checkpoints Affect Performance

Fast Instance Recovery Trade-offs

Reducing Checkpoint Frequency to Optimize Runtime Performance

Configuring the Duration of Cache Recovery

Initialization Parameters that Influence Cache Recovery Time

Table 17-1 Initialization Parameters Influencing Cache Recovery

Use Fast-Start Checkpointing to Limit Instance Recovery Time

FAST_START_MTTR_TARGET

Set LOG_CHECKPOINT_TIMEOUT to Influence the Amount of Redo

Set LOG_CHECKPOINT_INTERVAL to Influence the Amount of Redo

Use Parallel Recovery to Speed up Redo Application

Monitoring Cache Recovery

Table 17-2 V$INSTANCE_RECOVERY View

Monitoring Estimated MTTR: Example Scenario

Calculating Performance Overhead

Table 17-3 Variable Definitions

Calculating Performance Overhead: Example Scenario

Calibrating the MTTR

MTTR Advisory

How MTTR Advisory Works

Enabling MTTR Advisory

STATISTICS_LEVEL

FAST_START_MTTR_TARGET

Viewing MTTR Advisory

V$MTTR_TARGET_ADVICE

Tuning Transaction Recovery

Using Fast-Start On-Demand Rollback

Using Fast-Start Parallel Rollback

Table 17-4 FAST_START_PARALLEL_ROLLBACK Parameter Values

Parallel Rollback in an Oracle Real Application Clusters Configuration

Monitoring Progress of Fast-Start Parallel Rollback

17
Configuring Instance Recovery Performance