Database Vacuuming

The copy-on-write method of writing to a persistent database leaves data objects that are no longer in use in the pack files. Because updated and new data objects are continually added to the end of the pack file, the question arises of how to keep the database at a reasonable size on disk. RDM uses an automatic vacuuming process to move information that is still in use out of sparsely populated pack files and into the current pack file. A vacuumed pack file becomes eligible for deletion once it is completely vacuumed and is no longer needed for recovery or for current snapshot read transactions. The process keeps database objects closer together in the pack files, which makes reads more efficient and uses disk space more efficiently.

Vacuumed (empty) pack files can only be deleted when the pack file ID is the lowest number in the pack file ID sequence.

It is expected behavior for empty pack files to exist in the middle of a pack file ID sequence. For example, the lowest pack file could be full while the following pack files, up to the current pack file, are vacuumed (empty). This is normal and depends on data usage.
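The lowest-ID deletion rule can be illustrated with a small model. This is a sketch only, not RDM code: the function name and the dictionary layout are hypothetical, and it assumes that once the lowest pack file is deleted, the next one becomes the lowest and is checked in turn. It also ignores the separate requirement that a file no longer be needed for recovery or snapshot reads.

```python
def deletable_pack_files(is_empty_by_id):
    """Return the pack file IDs that could be deleted under the lowest-ID rule.

    is_empty_by_id maps a pack file ID to True when that file has been
    completely vacuumed (hypothetical representation, for illustration).
    """
    deletable = []
    # Walk the IDs from lowest to highest and stop at the first file
    # that still holds data: nothing above it can be deleted yet.
    for pack_id in sorted(is_empty_by_id):
        if not is_empty_by_id[pack_id]:
            break
        deletable.append(pack_id)
    return deletable


# Lowest file (1) is still full, so the empty files 2 and 3 must stay.
print(deletable_pack_files({1: False, 2: True, 3: True, 4: False}))  # []

# Once file 1 is vacuumed, files 1 through 3 can go; file 4 holds data.
print(deletable_pack_files({1: True, 2: True, 3: True, 4: False}))   # [1, 2, 3]
```

The first call shows why empty files in the middle of the sequence are normal: they simply wait until every lower-numbered file has been vacuumed.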

The automatic vacuum process runs while the database is open for use, according to the vacuum conditions currently set. If the database is opened in SHARED mode, a separate vacuum thread runs in the background to perform the vacuuming. If the database is opened in EXCLUSIVE mode, the vacuuming process piggybacks on the write transactions performed by the application.

Conditions Triggering Vacuuming

The vacuuming process executes when RDM detects that one of the following conditions has been met. When vacuuming begins, it starts with the lowest-numbered pack file ID that meets the conditions for vacuuming. This is not necessarily the first pack file in the sequence. The vacuuming process stops when the conditions that started it have been addressed.
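As a rough illustration of how the starting pack file is chosen, the sketch below picks the lowest ID that satisfies a caller-supplied test. The function name, the per-file usage figures, and the 20% per-file test are all assumptions made for the example; the text does not specify how RDM evaluates an individual pack file.

```python
def first_vacuum_candidate(pack_ids, meets_conditions):
    """Return the lowest pack file ID for which meets_conditions is true.

    Returns None when no pack file qualifies.
    """
    for pack_id in sorted(pack_ids):
        if meets_conditions(pack_id):
            return pack_id
    return None


# Hypothetical percentage of each pack file that is still in use.
used_percent = {3: 95, 4: 10, 5: 60, 6: 5}

# Pack file 3 is first in the sequence but densely used, so vacuuming
# would start at pack file 4 under this assumed per-file test.
start = first_vacuum_candidate(used_percent, lambda pid: used_percent[pid] <= 20)
print(start)  # 4
```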

VACUUM_PERCENTAGE

When the total space used in the pack files drops to the currently defined VACUUM_PERCENTAGE, the vacuum process begins moving the used space into the current pack file. The default VACUUM_PERCENTAGE is 20%. See vacuum_percentage for information about this database option.
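The trigger can be read as a simple ratio test. The sketch below is a minimal model under two assumptions not stated in the text: that the percentage is measured against the total size of the pack files, and that reaching exactly the threshold counts as triggering. The function and parameter names are hypothetical.

```python
def vacuum_percentage_triggered(used_bytes, total_bytes, vacuum_percentage=20):
    """Return True when used space has dropped to the threshold or below.

    Integer arithmetic avoids floating-point rounding at the boundary.
    """
    if total_bytes == 0:
        return False  # an empty database has nothing to vacuum
    return used_bytes * 100 <= vacuum_percentage * total_bytes


# 150 MB in use out of 1000 MB of pack files is 15%, below the 20% default.
print(vacuum_percentage_triggered(150, 1000))  # True

# 500 MB in use out of 1000 MB is 50%, so vacuuming is not triggered.
print(vacuum_percentage_triggered(500, 1000))  # False
```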

DB_SIZE

When the current total size of the pack files (used and unused space) exceeds DB_SIZE, the vacuum process begins eliminating unused space to bring the database size under the DB_SIZE value. The pack files are vacuumed regardless of whether they meet the VACUUM_PERCENTAGE condition. Vacuuming continues until the total size is less than DB_SIZE. See db_size for information about this database option.
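The start and stop conditions for DB_SIZE can be sketched as a loop over pack files. This is a simplified model, not RDM's implementation: it assumes that vacuuming a pack file moves its used bytes into the current pack file, so the total shrinks by exactly that file's unused bytes, and it ignores the rule that files are deleted only from the lowest ID upward. All names are hypothetical.

```python
def vacuum_for_db_size(pack_files, db_size):
    """Simulate vacuuming until the total size is less than db_size.

    pack_files is a list of (used_bytes, unused_bytes) tuples in pack
    file ID order. Returns the indexes vacuumed and the resulting total.
    """
    total = sum(used + unused for used, unused in pack_files)
    vacuumed = []
    if total <= db_size:
        return vacuumed, total  # condition not met: size does not exceed DB_SIZE
    for index, (used, unused) in enumerate(pack_files):
        if total < db_size:
            break  # the condition that started vacuuming is addressed
        total -= unused  # used bytes are relocated, unused bytes are reclaimed
        vacuumed.append(index)
    return vacuumed, total


# Three pack files totalling 300 units against a DB_SIZE of 250.
print(vacuum_for_db_size([(10, 90), (50, 50), (80, 20)], 250))  # ([0], 210)
```

In this example, reclaiming the 90 unused units in the first pack file is enough to bring the total under the limit, so the remaining files are left alone.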

Manual Triggering

The vacuum process can also be triggered manually, either with the rdm-vacuum tool or through an API call. The tool is typically used to remove all unused space from a database image, producing the smallest possible image for inclusion in an archive or installation package.