Articles on Emerging Technologies, Cloud Computing and Cyber Security

Wednesday, March 13, 2019

Top 10 Backup and Recovery Best Practices

Top 10 Backup and Recovery Best Practices
Assuming that you are doing the Backup and Recovery basics
- Running in Archivelog mode
- Multiplexing the controlfile
- Taking regular backups
- Periodically doing a complete restore to test your procedures.
- Restore and recovery validate will not uncover nologging issues. Consider turning on force-logging if they need all transactions to be recovered, and not face nologging problems ( ALTER DATABASE FORCE LOGGING; )

1. Turn on block checking.
The aim is to detect, very early the presence of corrupt blocks in the database. This has a slight performance overhead, but will allow Oracle to detect early corruption caused by underlying disk, storage system, or I/O system problems.

SQL> alter system set db_block_checking = true scope=both;

2. Turn on Block Change Tracking tracking when using RMAN incremental backups (10g and higher)
The Change Tracking File contains information that allows the RMAN incremental backup process to avoid reading data that has not been modified since the last backup. When Block Change Tracking is not used, all blocks must be read to determine if they have been modified since the last backup.

SQL> alter database enable block change tracking using file '/u01/oradata/ora1/change_tracking.f';

3. Duplex redo log groups and members and have more than one archive log destination.
If an archivelog is corrupted or lost, by having multiple copies in multiple locations, the other logs will still be available and could be used.
If an online log is deleted or becomes corrupt, you will have another member that can be used to recover if required.

SQL> alter system set log_archive_dest_2='location=/new/location/archive2' scope=both;
SQL> alter database add logfile member '/new/location/redo21.log' to group 1;

4. When backing up the database with RMAN use the CHECK LOGICAL option.
This will cause RMAN to check for logical corruption within a block, in addition to the normal checksum verification. This is the best way to ensure that you will get a good backup.

RMAN> backup check logical database plus archivelog delete input;

5. Test your backups.
This will do everything except actually restore the database. This is the best method to determine if your backup is good and usable before being in a situation where it is critical and issues exist.
If using RMAN this can be done with:

RMAN> restore validate database;

6. When using RMAN have each datafile in a single backup piece
When doing a partial restore RMAN must read through the entire piece to get the datafile/archivelog requested. The smaller the backup piece the quicker the restore can complete. This is especially relevent with tape backups of large databases or where the restore is only on individual / few files.
However, very small values for filesperset will also cause larger numbers of backup pieces to be created, which can reduce backup performance and increase processing time for maintenance operations. So those factors must be weighed against the desired restore performance.

RMAN> backup database filesperset 1 plus archivelog delete input;

7. Maintain your RMAN catalog/controlfile
Choose your retention policy carefully. Make sure that it complements your tape subsystem retention policy, requirements for backup recovery strategy. If not using a catalog, ensure that your CONTROL_FILE_RECORD_KEEP_TIME parameter matches your retention policy.

SQL> alter system set control_file_record_keep_time=21 scope=both;

This will keep 21 days of backup records in the control file.

Run regular catalog maintenance.
REASON: Delete obsolete will remove backups that are outside your retention policy.
If obsolete backups are not deleted, the catalog will continue to grow until performance
becomes an issue.

RMAN> delete obsolete;

REASON: crosschecking will check that the catalog/controlfile matches the physical backups.
If a backup is missing, it will set the piece to 'EXPIRED' so when a restore is started,
that it will not be eligible, and an earlier backup will be used. To remove the expired
backups from the catalog/controlfile use the delete expired command.

RMAN> crosscheck backup;
RMAN> delete expired backup;

8. Prepare for loss of controlfiles.
This will ensure that you always have an up to date controlfile available that has been taken at the end of the current backup, rather then during the backup itself.

RMAN> configure controlfile autobackup on;

keep your backup logs
REASON: The backup log contains parameters for your tape access, locations on controlfile backups
that can be utilised if complete loss occurs.

9. Test your recovery
REASON: During a recovery situation this will let you know how the recovery will go without
actually doing it, and can avoid having to restore source datafiles again.

SQL> recover database test;

10. In RMAN backups do not specify 'delete all input' when backing up archivelogs
REASON: Delete all input' will backup from one destination then delete both copies of the archivelog where as 'delete input' will backup from one location and then delete what has been backed up. The next backup will back up those from location 2 as well as new logs from location 1, then delete all that are backed up. This means that you will have the archivelogs since the last backup available on disk in location 2 (as well as backed up once) and two copies backup up prior to the previous backup.

0 Comments: