With 6.5, we have made some tremendous improvements to our backup & restore technology. These improvements are aimed at better backup performance, improved consistency and lower storage requirements.
Better performance (the rate of backup & restore) has been the most requested improvement for backup to be adopted as an enterprise-grade tool. To achieve it, we made some fundamental changes to the tool, including leveraging value compression on the server, replacing the storage engine, modifying the storage format, limiting the size of the backup file, and isolating metadata.
This has resulted in significant improvements in backup operations such as full backup, incremental backup, merge and list, as well as in scalability for datasets in the single-digit TB range.
Based on our internal testing, we observe a ~4x performance improvement compared to previous versions of backup.
Historically, a backup was stored as one big file, which at times required ~1.5-2x more storage than the size of the original dataset, especially for merge operations. With the changes made to the storage engine, compression, file formats and metadata isolation, the required storage is reduced to ~40% of the original dataset.
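To put these ratios in perspective, here is a minimal back-of-the-envelope sketch. It assumes the ~1.5-2x figure is read as a multiplier on the original dataset size, and the function name and the 1 TB example are ours, purely for illustration. Actual numbers will depend on your data and how well it compresses.

```python
def estimate_backup_storage_gb(dataset_gb: float) -> dict:
    """Rough storage estimates using the approximate ratios quoted above.

    Assumption: the legacy ~1.5-2x figure is treated as a multiplier on the
    dataset size, and the new figure as ~40% of the dataset size.
    """
    return {
        "legacy_low_gb": dataset_gb * 1.5,   # ~1.5x of the dataset
        "legacy_high_gb": dataset_gb * 2.0,  # ~2x of the dataset
        "new_gb": dataset_gb * 0.40,         # ~40% of the dataset
    }


if __name__ == "__main__":
    # Hypothetical 1 TB (1000 GB) dataset.
    for label, size_gb in estimate_backup_storage_gb(1000).items():
        print(f"{label}: {size_gb:.0f} GB")
```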
Until now, value compression always occurred on the backup client. From 6.5, cbbackupmgr leverages server compression and backs up documents as compressed when possible. If the data is already compressed, it is backed up as compressed; if it is not compressed and value compression is selected, it is first compressed and then backed up. This improves performance because the size of the dataset is reduced both for transmission on the pipe and for the backup itself.
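The decision described above can be sketched roughly as follows. This is an illustration of the logic only, not how cbbackupmgr is implemented: the function and parameter names are hypothetical, and zlib is used as a stand-in codec simply because it ships with Python.

```python
import zlib


def prepare_value_for_backup(value: bytes,
                             already_compressed: bool,
                             compression_requested: bool) -> bytes:
    """Return the bytes that would be written to the backup (illustrative only)."""
    if already_compressed:
        # Value arrived compressed: back it up as is, no extra work.
        return value
    if compression_requested:
        # Value arrived uncompressed and compression was opted for:
        # compress first, then back up. zlib is a stand-in codec here.
        return zlib.compress(value)
    # Compression not requested: back up the value unchanged.
    return value


if __name__ == "__main__":
    doc = b'{"name": "' + b"a" * 1000 + b'"}'  # hypothetical document body
    stored = prepare_value_for_backup(doc, False, True)
    print(f"original: {len(doc)} bytes, stored: {len(stored)} bytes")
```

The smaller the stored value, the less there is to send over the wire and to write to the backup, which is where the performance gain comes from.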
Info Command (Developer Preview)
A new “info” command is introduced to provide a detailed listing of the backup files. In addition to the repositories and their sizes, it reports the type of each backup (full, incremental or merge), the number of views, indexes and FTS indexes, and whether the backup completed. We have also added an option to output a JSON document, which can be used for automation.
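As an example of the kind of automation the JSON output makes possible, here is a minimal sketch that flags backups that did not complete. Note that the JSON structure and field names below ("backups", "date", "type", "complete") are assumptions made up for illustration; check the actual output of the info command for the real schema before relying on anything like this.

```python
import json

# Hypothetical sample; the real info output may use different field names.
SAMPLE_INFO_JSON = """
{
  "name": "example-repo",
  "backups": [
    {"date": "backup-1", "type": "FULL", "complete": true},
    {"date": "backup-2", "type": "INCR", "complete": false}
  ]
}
"""


def incomplete_backups(info_json: str) -> list:
    """Return the identifiers of backups not marked as complete."""
    info = json.loads(info_json)
    return [
        backup["date"]
        for backup in info.get("backups", [])
        if not backup.get("complete", False)
    ]


if __name__ == "__main__":
    for name in incomplete_backups(SAMPLE_INFO_JSON):
        print(f"backup did not complete: {name}")
```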
Measuring Consistency (Developer Preview)
Since Couchbase Server is a completely distributed database, there is some time lag in the data distributed across nodes. With 6.5, we are providing the ability to measure this consistency for backups. To attain maximum consistency, a full backup can be run, followed by small incrementals to make up for the delta. In “disk-only” mode, only the documents persisted to disk are backed up, which provides better consistency across vBuckets.
There are several other tactical improvements such as support for Alternative Addresses, automatic bucket creation for restore, improved error messages, support for analytics, and FTS alias (superset of all FTS indexes backup). For more details on all the improvements in 6.5, read the documentation.