Mistake #3: Not having a diversified backup strategy
In the last post, we discussed why you might or might not want to run in archivelog mode. Now we’ll discuss some related decisions.
As in the last two blog entries, I’ll be oversimplifying, for the sake of illustration. Additionally, there is often more than one way to satisfactorily resolve a given disaster, and the best method might not be the ones I’m using as examples here. It might not even involve resorting to backups at all. This blog is just a tool to help you understand how diversified backups could help you.
So, there are two basic kinds of database backups, physical and logical. Physical backups are copies of the files (or parts of the files) that comprise the database. Physical backups come in several forms, for example, copies of the database files made with operating system backup commands, RMAN backups (this tool comes with the Oracle software), and disk mirror copies.
Logical backups are collections of SQL statements that can recreate database objects and their associated data. For example, a table backup could be comprised of a CREATE TABLE statement, some GRANT statements, some CREATE INDEX statements, and many INSERT INTO [table] statements. Collectively, these statements would get your table restored.
Let’s think about some sample database disasters:
Scenario 1: A loss of the server on which the database is hosted, causing a loss of the whole database. Remember archived redo logs, from the last blog entry? Those are important to back up. If you restore a physical database backup, the data will be in the state it was in at the time you made the backup. You might need a more recent version of the data. You can use the archived redo logs to roll the database forward in time, which will bring your data as recent as the most recent available archived redo log.
What if you didn’t have a physical backup? Well, you’ll be creating a new, empty database and then loading your most recent logical backup into it. Let’s say that backup was from last night at 11:30 pm. You lost your server at 2:14 pm today. You will lose all transactions that happened since last night at 11:30 pm. Even if you have archived redo logs available, they won’t help you, because you can’t apply archived redo logs to logical backups.
Scenario 2: A loss of a single datafile. That might happen because of corruption or erroneous file deletion.
In this case, the database won’t be happy until you get the lost datafile back. If you have a physical backup AND you’re running in archivelog mode AND you have all the archivelogs since the last physical backup AND your online redo logs aren’t damaged, you will be able to make a complete recovery. Just fish the lost datafile out of the physical backup, then instruct Oracle to recover it. It will re-apply the transactions made to the lost datafile, using the archived redo logs and then the online redo logs. Good thing you made a physical backup!
What if you didn’t have a physical backup? Depending on what was in the missing datafile, you could lose some transactions. You might have to create an empty database and restore the most logical backup, losing all transactions since 11:30 pm last night. You might be fortunate and discover that missing datafile only had indexes in it, and then be able to just rebuild the indexes, assuming you have some way to reconstruct the index creation SQL (a logical backup can help with this). If the missing datafile contained data, you might do something fancy like restore the logical backup to a temporary database and then somehow transfer just the missing data to the damaged database, although there are some gotchas with that scenario. It really would have been best to have a physical backup with all archivelogs since the last backup.
Scenario 3: A user error, such as the payroll process accidentally being run twice this morning, at 10:37 am and 11:05 am.
It is now 12:44 pm. This might not be simple to fix. You’ll have some choices to make. How many tables are involved and how much data is modified by a payroll run? Are you knowledgeable enough to manually back out the second payroll run? If the back-out is simple enough and you are confident you can surgically remove only the second run, go for it.
If just one table is affected by the payroll run (probably not, but let’s pretend for the sake of this example), and if you have a logical backup, and if you are willing to lose non-payroll changes to that table since last night, you could restore just that table from last night’s logical backup, then re-run payroll. Once.
If many tables are affected or if it’s not clear how to fix the problem manually, you might find that it’s, overall, the best solution to put the database back to like it was at 11:04 am. This is possible only if you have a physical backup and only if you have all the archived redo logs made between the time of the backup and 11:04 am. You will lose all transactions made after 11:04 am. Once the recovery is done, the database will be in the state it was just before the second payroll run started.
Scenario 4: Erroneous but limited-scope data modification error. For example, 20 days after the fact, you realize that 75 rows had erroneously been deleted from a table that changes frequently.
Restoring the database via physical backup, back to how it was 20 days ago and leaving it that way is probably out of the question, unless you are conducting a forensic analysis of some sort and are restoring to a non-production server for this purpose. But you’ll still need a way to fix your production database.
If you have logical backups, you could restore the table to a different database or a different part of the production database, extract the missing rows from the restored table, and restore them to the deficient production table.
If you don’t have a logical backup, and you really need these rows back, you’ll probably have to use a physical backup, and those are less well-suited to this particular problem. In this case, you could restore the physical backup to another server, extract the missing 75 rows, and restore them to the deficient production table.
As you have seen, your best chance to recover from a wide variety of possible disasters is to have a variety of backup types. Sometimes a blend of backup types can be used to solve a recovery problem. Some very large databases are awkward to back up (or back up regularly), but with our target customer base (which is small to medium sized businesses), most of the databases are small enough to backup within a reasonable amount of time, using more than one backup strategy, and using a reasonable amount of space for the backup files.
Will your backup strategy meet your recovery needs?
Very nice article Mary Elizabeth. The language is so easy that even I understood most if it.