
Summary

This article describes how file operations on an OSSV primary affect the sizes of the SnapShots and Qtrees on the Secondary.

 

Symptoms

How do file operations on an OSSV primary affect the sizes of the SnapShots and Qtrees on the Secondary?

 

Resolution

Assumptions

This document describes the effect of data movement, data creation, and change journal map deletion on the size of the QTree and volume SnapShots when using the Catalogic Software Open Systems SnapVault (OSSV) agent for backup. It is assumed that the source operating system is Windows Server 2003 or 2008, and that the reader is familiar with the basic concepts of SnapVault, SnapShots, and filer management.

Introduction

Each OSSV backup job using the Catalogic Software OSSV agent for Windows creates (initial base backup) or updates (incremental backup) the data in the QTree(s) on the SnapVault secondary. At the conclusion of the backup job, a SnapShot is taken on that volume. Both the QTree(s) and SnapShot(s) take up space on the filer, and this document examines how common file operations, i.e. data movement and data creation, affect the sizes of these objects. We will also look at the effect of a "full incremental transfer" on these objects. Such a transfer occurs when the change journal on the OSSV primary is lost, forcing a subsequent incremental backup to run as a "base". The change journal can be lost when the server unexpectedly reboots without saving the active change journal (kept in memory) to disk.
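The relationship and the SnapShots created at the conclusion of each job can be inspected directly on the secondary filer. The volume name used below (ossv_vol) is a placeholder chosen for illustration only:

    secondary> snapvault status
    secondary> snap list ossv_vol

The snapvault status output lists the configured relationships and their current state, while snap list shows the SnapShots that exist on the destination volume.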

Results and Explanation

For this test, a total of 17 OSSV jobs were run while the data on the primary (source) disk was manipulated. The OSSV relationship was defined to back up a single disk on the Windows primary server to a volume on the secondary filer, while the sizes of the corresponding QTree and cumulative SnapShots were monitored. The QTree size was obtained by issuing the quota report command on the filer (after enabling quotas on the destination volume), while the (cumulative) SnapShot size was obtained by using the df -k command. The SnapShot usage numbers reported below therefore correspond to the sum of all the SnapShots pertaining to this job on the destination volume.
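For reference, the commands used to obtain these numbers are shown below. The volume name (ossv_vol) is a placeholder; substitute the name of the destination volume:

    secondary> quota on ossv_vol
    secondary> quota report
    secondary> df -k ossv_vol

The quota report output lists the disk space used by each QTree on the volume, while the df -k output contains one line for the active file system (/vol/ossv_vol/) and one line for the cumulative SnapShot usage (/vol/ossv_vol/.snapshot).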

The following graph shows the Volume Usage, SnapShot and QTree sizes (in MB) as a function of the 17 OSSV backup jobs. Note that Volume Usage is the sum of QTree Usage and SnapShot Usage, and is obtained by the df -k command.

 

The first job (point 1 on the X-axis) corresponds to the initial "base" backup, which defined the SnapVault relationship, while jobs 2 through 17 correspond to incremental updates to the relationship. During the timeframe when jobs (1) through (6) ran, no intentional changes were made to the data on the volume being backed up, and one can see that both the SnapShot and QTree sizes are for the most part constant. Both sizes do increase slightly during this period, which can be attributed to block movement and changes on the source volume between backup jobs as part of normal operating system activity. This increase is small and is therefore not clearly visible in the graph.

After job (6) completed, a total of 530 MB of data residing on the source volume was copied to a new location on the same volume. After the copy operation completed, the original data was deleted. This resulted in a situation where the location of the data on disk was modified without modifying the data itself (similar to what would happen if one ran a disk defragmentation on the volume). The amount of data on the source volume was therefore not increased, and the data itself was not changed; only its location on disk changed. One can see that the QTree size increased by the amount copied, while the total size of the SnapShots did not show any substantial increase. During the time when jobs (8) and (9) ran, no deliberate changes were made to the data on the source volume, and one can see that the increase in usage numbers during this period corresponds to that of runs (1) through (6) (slow increases in both SnapShot and QTree sizes).

After job (9) completed, a total of 530 MB of data was added to the source volume. One can see that the QTree size increased by that amount, while the cumulative SnapShot size stayed more or less constant. It is interesting to note that the effect of this operation on the QTree and SnapShot sizes is the same as that of "moving" data to a new location on the source volume (described in the previous paragraph). This is evident when comparing the behavior seen between (9) and (10) to that between (6) and (7). During the time when jobs (10) to (13) ran, no deliberate changes were made to the data on the source volume, and one can see that the increase in usage numbers during this period corresponds to that of runs (1) through (6) and (7) through (9) (slow increases in both SnapShot and QTree sizes).

After job (13) completed, the change journal that tracks changes on the source volume was deliberately deleted. This simulates what happens when the server encounters a problem that causes an unplanned reboot (blue screen). Since the change journal map is kept in memory, an unexpected reboot of the server causes this map to be lost, which means that a subsequent incremental OSSV backup job transfers all the allocated blocks on the source volume to the secondary filer. The DPX job logs this event, stating that the job will run as a "base" rather than an "incremental" backup. One can see that this behavior causes the total SnapShot size to increase by the total amount of data on the source volume, while the QTree size stays, for all intents and purposes, constant.

 

Data Expiration

The data captured in the SnapShots has a retention associated with it. This retention is defined as part of the SnapVault backup job definition through DPX. As part of the Catalog Condense job, DPX deletes the corresponding SnapShot(s) when an OSSV backup job's retention is met. This implies that the blocks "captured" by the SnapShots are the only blocks that are freed up as jobs expire. This is evident in the next graph, which shows the QTree, SnapShot and Volume sizes as above, but now as a function of Condense jobs (i.e. jobs that delete the expired SnapShots one by one). Each number on this graph corresponds to a single SnapShot being deleted, starting with the oldest one. After the 17th Condense job, no SnapShots (and therefore no recovery points) exist on the filer.
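The effect of each Condense job can be verified on the secondary filer by listing the remaining SnapShots on the destination volume; the volume name (ossv_vol) is again a placeholder:

    secondary> snap list ossv_vol
    secondary> df -k ossv_vol

As SnapShots are deleted by DPX (oldest first), the snap list output shrinks and the /vol/ossv_vol/.snapshot line of the df -k output decreases, while the active file system (QTree) usage remains unchanged.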

 

From this it is evident that the QTree size never decreases with time; the effect of operations that increase the QTree size is irreversible under normal OSSV operations. In fact, the size of the QTree will continue to increase until it reaches the total size of the disk being backed up. In the OSSV jobs discussed above, the drive being backed up has a maximum size of 4 GB, so the QTree in this example can continue to grow until it reaches a size of 4 GB, at which point the growth stops. Once this steady state is reached, the data movement operations that increased the size of the QTree earlier no longer have any effect on it; these types of operations are instead captured in the SnapShots.

The reason for this behavior stems from the fact that the files contained in the QTrees are sparse files, which are "inflated" as new blocks are occupied by data on the disk(s) being backed up. As soon as the files contained in the QTrees are fully inflated, the QTrees reach a steady state and all block-level changes and additions that occur on the source disks from that point on are captured in SnapShots.

The only way to "deflate" the QTree is to remove the SnapVault relationship (using the snapvault stop command on the filer) and recreate it by rerunning the first full backup. Typically, however, this is not something that needs to be done. Network Appliance recently introduced Advanced Single Instance Storage (ASIS) functionality as part of ONTAP 7.2.2. ASIS allows for the removal of duplicate blocks on the filer, which means that although the QTree size can be inflated, the space allocated on the volume will still be smaller, since duplicate blocks are removed at the volume level. ASIS does require a NetApp NearStore Personality license (see NetApp documentation for more information).
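A rough sketch of both options is shown below; the volume and QTree names (ossv_vol, ossv_qtree) are placeholders, and the exact syntax should be verified against the NetApp documentation for the ONTAP release in use. Note that snapvault stop deletes the destination QTree, so the relationship must be re-baselined afterwards with a new full backup:

    secondary> snapvault stop /vol/ossv_vol/ossv_qtree
    secondary> sis on /vol/ossv_vol
    secondary> sis start -s /vol/ossv_vol
    secondary> df -s ossv_vol

The sis on command enables ASIS deduplication on the volume, sis start -s scans the existing data for duplicate blocks, and df -s reports the space saved on the volume.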