
Summary



This article describes how to manage media for backup and restore with DPX and the general concepts of media management.

Resolution



Overview

The following document provides an introduction to media management that is useful to those who are new to DPX and to enterprise-level backup/restore solutions. Since every environment has different needs and goals, this document covers various strategies and best practices, allowing you to choose which ones fit best in your particular environment.

 

Goal

The one common goal of all DPX users is to restore the files they want, whenever the files are needed. However, many users have different priorities when it comes to meeting this goal. For some, restoring data as quickly as possible is most important. For others, saving money on media usage by using as little media as possible is most important. Others want to restore data no matter what happens to the machine or to the building housing the data. The following sections will examine how DPX can help you meet your particular goal while managing your media to address your needs, regardless of your priorities.

 

Priorities Assessment

Before you begin to think about how you want to use DPX to meet your goal, it is recommended that you conduct a comprehensive audit of your existing backup and storage strategy. Some of the important items to consider are:

  • Data. How much data do you have now, what is the growth forecast for the near and long term, what type of data is it (small or large files), and how is it distributed (evenly, or are some servers/disks/volumes/partitions significantly larger than others)?
  • Retention. How long do you need to keep your various data (for example, do certain servers or databases need to be kept for many years for HIPAA/SEC/legal reasons, or is there other data you want to make sure is not kept for more than 30 days)?
  • Backup Time. What is your backup window (can backups run just at night or can they run 24/7)?
  • Restore Time. How soon must various types of data be restored (for example, must e-mail be restored within an hour, must other data be restored within one day)?
  • Disaster. From what type of disaster do you want to protect yourself (a disk drive crash, a building crash)?
  • Budget. How much money are you willing to invest toward your goal (for example, can you spend as much as needed for tapes and tape drives or do you have limited resources with which to work)?
  • People. How often do you want people moving tapes in and out of jukeboxes (in other words, will you send tapes offsite once a day, once a week)?

 

Backup Scheme

Once you have identified your priorities, define your backup jobs and media management scheme accordingly. First, you must decide on the backup variant you will be running: base, differential, incremental, or a combination of them.

  • Base backups provide quick restores since all or most of the data is on the same tape. However, they take the most time and use the most tapes since they require a full backup of every selected file.
  • Incremental backups utilize the least amount of media. These backups are quick because only the data that has changed since the last backup is backed up. However, restore times are slow since the data will likely be spread across more tapes that need to be mounted and read.
  • Differential backups provide advantages and disadvantages that are somewhat in between base and incremental backups because differential jobs back up only the files that have changed since the last base backup.
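The restore-time trade-off between these variants can be illustrated with a small sketch. It is a generic model of base, differential, and incremental backups, not a description of how DPX resolves restores internally. The Backup class and the restore_chain function are hypothetical names introduced only for this example, and the sketch assumes that an incremental backup captures changes since the previous backup of any kind.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    day: int   # day number on which the backup ran
    kind: str  # "base", "differential", or "incremental"


def restore_chain(history, target_day):
    """Return the backups that must be read to restore the state as of target_day."""
    # Only backups taken on or before the target day matter, oldest first.
    candidates = sorted((b for b in history if b.day <= target_day), key=lambda b: b.day)

    # Every restore starts from the most recent base backup.
    base_index = max((i for i, b in enumerate(candidates) if b.kind == "base"), default=None)
    if base_index is None:
        raise ValueError("no base backup on or before the target day")
    chain = [candidates[base_index]]
    later = candidates[base_index + 1:]

    # The latest differential holds every change since the base,
    # so it supersedes all backups between the base and itself.
    diff_index = max((i for i, b in enumerate(later) if b.kind == "differential"), default=None)
    if diff_index is not None:
        chain.append(later[diff_index])
        later = later[diff_index + 1:]

    # Every incremental taken after that point is still required.
    chain.extend(b for b in later if b.kind == "incremental")
    return chain


incremental_week = [Backup(0, "base")] + [Backup(d, "incremental") for d in range(1, 6)]
differential_week = [Backup(0, "base")] + [Backup(d, "differential") for d in range(1, 6)]

print(len(restore_chain(incremental_week, 5)))   # 6 backups to read
print(len(restore_chain(differential_week, 5)))  # 2 backups to read
```

In this model, restoring to day 5 of an incremental-only week touches six backups, while the differential week touches two, which is why incremental schemes tend to need more tape mounts at restore time.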

 

Media Pools

DPX uses media pools to create separate sets of tapes of the same type (for example, an LTO pool or a monthly pool). It is recommended that you group media according to one of the following distinctions:

  • Location. When you define a backup job, you specify the media pool you want to use, and DPX will select which tape it should use from within that pool. For this reason, if you have storage devices at remote sites, each separate site will typically need its own pools so that DPX does not try to use a tape at a different site. This is particularly the case with devices that do not have barcode readers and would not know which tapes are inside.
  • Media Type. You can create media pools based on media type (LTO4, LTO5, LTO6, LTO7), since media pools can only contain tapes of the same type.
  • Retention Period. Although you can create a separate media pool for any of your data (for example, Windows or NDMP data), it is recommended that you use media pools as a way to separate data you want to keep for different periods of time (a daily pool, a 90-day pool, or a 7-year pool). DPX recycles tapes, allowing you to reuse them, only when all the jobs on the tape have expired (passed the retention period) and the DPX catalog condense maintenance operation has run. Therefore, if the same tape has both data to be kept for a short time and data to be kept for a long time, you will not be able to reuse the tape until the long-term data has expired.
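The effect of mixing retention periods on one tape can be shown with a short calculation. This is a simplified sketch, not DPX code: the Job class, its fields, and the tape_reusable_on function are hypothetical, and the sketch ignores the additional requirement that the catalog condense operation must run before the tape is actually recycled.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Job:
    name: str
    run_date: date
    retention_days: int

    def expires_on(self) -> date:
        # A job expires once its retention period has passed.
        return self.run_date + timedelta(days=self.retention_days)


def tape_reusable_on(jobs) -> date:
    """Earliest date on which every job on the tape has expired."""
    # The tape is held until its longest-lived job expires.
    return max(job.expires_on() for job in jobs)


daily = Job("daily-files", date(2024, 1, 1), retention_days=30)
legal = Job("legal-archive", date(2024, 1, 1), retention_days=7 * 365)

print(tape_reusable_on([daily]))         # 2024-01-31
print(tape_reusable_on([daily, legal]))  # held until the legal archive expires
```

A tape holding only the 30-day job is eligible for reuse after a month, while the same tape with one 7-year job on it stays locked for the full seven years. Separate pools per retention period avoid this.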

Other best practices for creating media pools include the following:

  • Create alternate or "scratch" pools so that when you run out of tapes in your main media pool, a backup job will automatically use a tape from the alternate pool.
  • Create a separate pool for your catalog backups so that they are not written to the same tapes as the rest of your data. If you ever need to restore the catalog, you can retrieve a tape that you know contains only backups of your catalog.

 

Media

DPX uses its catalog database to keep track of where your data is. Therefore, you do not need to keep track of the tapes to which your data has been written, whether those tapes are currently onsite or offsite, or if it is time to bring tapes back from your offsite location when they can be overwritten. DPX does all of this for you.

A key concept about media management is the life cycle of a tape.

  • The life of a tape starts when a tape is new to DPX (that is, when it has never been used before with DPX), in which case, it has the status of "New."
  • Before you can start backing up data to a tape, you must label the tape. This involves writing a header at the beginning of the tape containing its name. This header is called a volser (volume serial). When you have labeled the tape, the status of the tape becomes "Empty."
  • When you write data to the tape, its status becomes "Appendable."
  • Eventually, the tape will fill up, and its status will be "Full."
  • When a DPX catalog condense maintenance operation runs, it will change the status of any "Full" tapes on which all the jobs have expired to "Empty." Note that tapes will only go from "Full" to "Empty" and will not go from "Full" to "Appendable."
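The life cycle above can be read as a small state machine. The following sketch models only the transitions described in this section. The event names ("label", "write", "fill", "condense") are hypothetical labels chosen for illustration, not DPX commands, and the "condense" event assumes that all jobs on the tape have already expired.

```python
from enum import Enum


class TapeStatus(Enum):
    NEW = "New"
    EMPTY = "Empty"
    APPENDABLE = "Appendable"
    FULL = "Full"


# Allowed transitions, keyed by (current status, event).
TRANSITIONS = {
    (TapeStatus.NEW, "label"): TapeStatus.EMPTY,              # volser header written
    (TapeStatus.EMPTY, "write"): TapeStatus.APPENDABLE,       # first data written
    (TapeStatus.APPENDABLE, "write"): TapeStatus.APPENDABLE,  # more data appended
    (TapeStatus.APPENDABLE, "fill"): TapeStatus.FULL,         # tape has filled up
    (TapeStatus.FULL, "condense"): TapeStatus.EMPTY,          # all jobs expired, condense ran
}


def next_status(status, event):
    """Return the status after the event, or raise if the event is not allowed."""
    try:
        return TRANSITIONS[(status, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not allowed while the tape is {status.value}") from None


status = TapeStatus.NEW
for event in ["label", "write", "write", "fill", "condense"]:
    status = next_status(status, event)
    print(event, "->", status.value)
```

There is deliberately no transition from "Full" to "Appendable": a full tape becomes usable again only by returning to "Empty".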

 

Offsite Media

Many organizations are required to store tapes offsite because of the importance of being able to recover data if a disaster destroys all the data at the primary data center. DPX provides numerous options to facilitate the process of sending tapes offsite and bringing them back onsite (see Import/Export Media below).

To keep track of whether a tape is currently onsite or offsite, DPX looks at the status field of the particular tape. The status field of the tape can be marked offsite either manually or automatically:

  • Manual Method. You can change the status of a tape to "offsite" by going to the Configure/Media screen, selecting your media pool, selecting the tape, and specifying Yes or No in the Status/Offsite field.
  • Automatic Method. You can automatically change the status of tapes to "offsite" by using the Mark Originals Offsite and Mark Twin Offsite capabilities of Backup Express.

Some users send their original tapes offsite because their policies allow them to retrieve offsite tapes fast enough that the delay in bringing a tape back for a restore is acceptable. In most environments, however, customers must have a copy of the data both onsite and offsite. The method you choose will depend on your company's priorities and hardware environment.

For example, if you have a significant amount of disk space, you might consider staging your data on some of that space by running a backup-to-disk job that you will keep in that location for a short period of time.

If your backup window is large enough, you can perform both a backup-to-disk as well as a backup-to-tape every night. The advantage will be that you can send the tape offsite for long-term storage, but if a request is made for a file that was recently backed up, it will likely still be on disk.

If you have a small backup window, you can perform a backup-to-disk to stage your data on a central server and then back up that staged data to tape during production hours. The advantage is that you will not touch your production servers when the data is transferred from disk to tape. The disadvantage is that when you want to restore data that is no longer on the disk, you must restore the data from tape to disk, and then launch another restore job to restore it from its backup-to-disk format back to its original format.

Users who have enough tape drives use the DPX twinning function to send the same data to two tapes at the same time, one of them being automatically marked offsite at the end of the job. Note, however, that when you have a set of twin tapes, you cannot append any more data to either of the twin tapes even if there is still space on those tapes. For this reason, users who take advantage of the twinning feature will often combine many tasks into one or two large jobs, rather than have many small jobs.


Robotic Devices

The DPX catalog also has the ability to keep track of the location of the tapes within your jukebox. Therefore, it is not necessary for you to put certain tapes in certain slots or to keep track of where various tapes are, because DPX will manage that process. It is important to use the DPX GUI to import and export tapes in and out of the jukebox, or to move tapes around in the jukebox. Otherwise, DPX will not know where the tapes are when it needs them.

If you do decide to move your tapes without the DPX GUI by opening the door of the jukebox and moving the tapes manually, it is very important that you make DPX aware that changes have been made to the jukebox.

To do this, go to the Device Control/Devices screen, select your jukebox, and click the Operate a Selected Jukebox button. DPX will obtain the updated inventory from your jukebox, assuming it has a barcode reader. If it does not have a barcode reader, you can perform a Read Tape Label operation so that DPX becomes synchronized with the new contents of the jukebox.

 

Import/Export Media

DPX provides a command line tool called BEXRPT that allows you to create reports about your media. For example, it specifies which tapes you must remove from your jukebox to be sent offsite (bexrpt -r jb_offsite), and which offsite tapes contain expired jobs and are ready to be brought back onsite (bexrpt -r onsite).

Once you know which tapes you want to import into or export from your jukebox, you can perform those operations on the Operate Jukebox screen. Use the Import Media and Export Media options to import and export tapes via the GUI. Note that when you export tapes from the jukebox, you can right-click either the Job Name or Media Status field to automatically select all the tapes that belong to that specific job or media status. You can then easily export all the selected tapes at the same time.

 

Restore Capabilities

DPX comes with built-in tools that enable you to restore the data for which you are looking, when you need it. The DPX catalog keeps track of where your information is so that you can easily restore it. Two such tools are:

  • Restore Search is a search engine that searches your catalog to find and select the files for which you are looking. You can specify date ranges, wildcards, and many other values to help you find the information you would like to restore.
  • Restore Preview is a tool that tells you ahead of time which tapes will be required to perform a restore job based on the files you selected to restore. It will also specify the status of those tapes and the size of the files that will be restored.