1. Introduction
Periodic backups are an essential part of any production database deployment: they help ensure data can be recovered in the event of a disaster and minimize data inconsistency when a restore is required.
Couchbase provides the cbbackupmgr utility, which has been improved over the years into an enterprise-grade backup and restore tool that can back up large data sets with high performance; we therefore recommend this tool for production use. It is worth mentioning that in Couchbase Server 6.5 we completely overhauled the backup storage engine and introduced a higher compression ratio, which results in much faster backup and restore, and a smaller storage footprint for each backup snapshot, in turn saving cost.
2. Best Practice
Although cbbackupmgr exists under Couchbase_HOME, it is not recommended to run this utility from any of the active nodes in the cluster, as it would compete for resources with active requests and could potentially hamper the performance of your database system.
It is, therefore, a best practice to provide a separate instance (for backup and restore needs) with only the Couchbase binaries installed but no Couchbase services running, so resources can be better managed for both the database cluster and the backup node.

As can be seen from the above figure, a separate backup/restore node is provisioned in addition to a five-node Couchbase cluster. Another best practice is to allocate sufficient storage to hold at least 5x the Couchbase data set size, so there is enough space to store the snapshots of the database required to meet the Recovery Point Objective (RPO) of the business.
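As a quick illustration of that 5x rule of thumb (the data-set size below is a made-up example):

```shell
# Hypothetical example: a 200 GiB Couchbase data set
DATASET_GIB=200

# Allocate at least 5x the data set size on the backup node
BACKUP_GIB=$((DATASET_GIB * 5))
echo "Provision at least ${BACKUP_GIB} GiB of backup storage"
```

The multiplier is a starting point, not a hard rule; your actual headroom depends on snapshot frequency and retention.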
3. Backup Strategy
cbbackupmgr provides a suite of commands that enables DBAs to implement a backup strategy that best suits their business needs. Here are some of the commands:
- cbbackupmgr backup – Backs up data from a Couchbase cluster.
- cbbackupmgr compact – Compacts a backup.
- cbbackupmgr merge – Merges backups together.
- cbbackupmgr config – Creates a new backup repository.
- cbbackupmgr list – Lists backups in the archive.
Using these commands, one can implement any of the three backup strategies described in the documentation. In the example below, we will describe the Periodic Merge strategy in the context of a Couchbase cluster running in a Kubernetes environment.
4. Periodic Merge
This backup strategy is intended to have the lowest database overhead, as it requires the least amount of time to back up changes and consumes practically no resources from the database cluster to consolidate data during the compaction and merge process (which happens on the backup node).
On a high level here is how Periodic Merge strategy works:
- Set up a backup repository using cbbackupmgr config.
- Take an incremental backup of the database (in the repository) using cbbackupmgr backup
- Perform backup compaction using cbbackupmgr compact so that disk space can be efficiently used.
- Merge ‘n’ oldest backups using cbbackupmgr merge so that the number of backups in the repository doesn’t grow infinitely and space requirements remain under check.
Note: The above steps are captured in the backup-with-periodic-merge.sh script, which we will later use in our Kubernetes setup to take periodic backups.
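The real backup-with-periodic-merge.sh script lives in the linked repo; the following is only a rough dry-run sketch of the same three steps. The paths, flags, and the RESTORE_POINTS threshold are illustrative assumptions, and each command is merely echoed rather than executed:

```shell
ARCHIVE=/backups                 # backup archive mount (assumed)
REPO=couchbase                   # repository name used later in this blog
CLUSTER=couchbase://cbdemo-srv.emart.svc
RESTORE_POINTS=3                 # illustrative: merge once we exceed this many snapshots

run() { echo "+ $*"; }           # dry-run helper; drop 'echo' to really execute

backup_cycle() {
  # Step 1: take an incremental backup into the repository
  run cbbackupmgr backup --archive "$ARCHIVE" --repo "$REPO" \
      --cluster "$CLUSTER" --username Administrator --password password

  # Step 2: compact the newest snapshot to reclaim disk space
  latest=$(ls "$ARCHIVE/$REPO" 2>/dev/null | sort | tail -1)
  run cbbackupmgr compact --archive "$ARCHIVE" --repo "$REPO" --backup "$latest"

  # Step 3: merge the two oldest snapshots when the count exceeds RESTORE_POINTS
  count=$(ls "$ARCHIVE/$REPO" 2>/dev/null | wc -l)
  if [ "$count" -gt "$RESTORE_POINTS" ]; then
    start=$(ls "$ARCHIVE/$REPO" | sort | head -1)
    end=$(ls "$ARCHIVE/$REPO" | sort | sed -n 2p)
    run cbbackupmgr merge --archive "$ARCHIVE" --repo "$REPO" \
        --start "$start" --end "$end"
  fi
}

backup_cycle
```

Running this against an empty archive simply prints the backup and compact commands; the merge branch only fires once enough snapshots accumulate.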
5. Backup Couchbase Data
In my last blog on the Couchbase Autonomous Operator, I described step by step how to deploy a self-healing, highly available Couchbase cluster using Persistent Volumes. Assuming you have already followed those steps and deployed the cluster, the steps below describe how you can set up automatic backups using a cronjob. It is considered best practice to back up your data regularly, and also to test restoring backups to confirm the restore process before disaster recovery is actually required.
This functionality is not provided by the Operator and left to the cluster administrator to define backup policies and test data restoration. This section describes some common patterns that may be employed to perform the required functions.
5.1. Create Storage Class
The Kubernetes resource definitions below illustrate a typical arrangement for a backup that saves the state of the entire cluster. We first need to define the StorageClass, which we will format using xfs for optimal performance.
# Create storage class for backup/restore operations
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  labels:
    k8s-addon: storage-aws.addons.k8s.io
  name: gp2-backup-storage
parameters:
  type: gp2
  fsType: xfs
provisioner: kubernetes.io/aws-ebs
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
Using the above definition in a backup-sc.yaml file, we can create the storage class like this:
$ kubectl create -f backup-sc.yaml -n emart
5.2. Create Persistent Volume
A persistent volume is claimed to keep data safe in the event of an outage. You will need to plan the claim size based on your expected data set size, the number of days of data retention, and whether incremental backups are used at all.
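One illustrative way to estimate the claim size, assuming made-up numbers for a full backup, a daily incremental delta, and a retention window:

```shell
FULL_GIB=50        # size of one full backup (assumed)
DELTA_GIB=2        # average daily incremental delta (assumed)
RETENTION_DAYS=14  # days of backups to keep (assumed)

# Rough claim size: one full snapshot plus the retained incrementals
CLAIM_GIB=$((FULL_GIB + DELTA_GIB * RETENTION_DAYS))
echo "Request at least ${CLAIM_GIB} GiB for backup-pvc"
```

With these numbers the estimate is 78 GiB, which you would then round up and sanity-check against the 5x rule from the best-practice section.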
# Define backup storage volume
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: backup-pvc
spec:
  storageClassName: gp2-backup-storage
  resources:
    requests:
      storage: 50Gi
  accessModes:
    - ReadWriteOnce
Save the above definition in backup-pvc.yaml and create the claim:
$ kubectl create -f backup-pvc.yaml -n emart
5.3. Configure Backup Repository
Before we can begin taking periodic snapshots of our data, we need to configure the backup archive location. A job is created to mount the persistent volume and initialize a backup repository. The repository is named couchbase, which maps to the cluster name in later specifications.
# Create a backup repository
kind: Job
apiVersion: batch/v1
metadata:
  name: couchbase-cluster-backup-config
spec:
  template:
    spec:
      containers:
        - name: backup-config
          image: couchbase/server:enterprise-6.5.0
          command: ["cbbackupmgr", "config", "--archive", "/backups", "--repo", "couchbase"]
          volumeMounts:
            - name: "couchbase-cluster-backup-volume"
              mountPath: "/backups"
      volumes:
        - name: couchbase-cluster-backup-volume
          persistentVolumeClaim:
            claimName: backup-pvc
      restartPolicy: Never
Save the above definition in config.yaml and create the backup repository:
$ kubectl create -f config.yaml -n emart
5.4. Run Backup as CronJob
Create a cronjob as described in the periodic-backup.yaml file, which takes a backup of the Couchbase cluster by a) downloading the backup script into the pod, and b) running the script to back up the cluster data onto the persistent storage volume.
kind: CronJob
apiVersion: batch/v1beta1
metadata:
  name: couchbase-cluster-backup-create
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            # Delete the backup-with-periodic-merge script so that a new one can be pulled with each run
            - name: delete-script
              image: couchbase/server:enterprise-6.5.0
              command: ["rm", "/backups/backup-with-periodic-merge.sh"]
              volumeMounts:
                - name: "couchbase-cluster-backup-volume"
                  mountPath: "/backups"
          initContainers:
            # Download the backup script from the git repo
            - name: wget-backup-script
              image: couchbase/server:enterprise-6.5.0
              command: ["wget", "https://raw.githubusercontent.com/couchbaselabs/cboperator-hol/master/eks/cb-operator-guide/files/sh/backup-with-periodic-merge.sh", "-P", "/backups/."]
              volumeMounts:
                - name: "couchbase-cluster-backup-volume"
                  mountPath: "/backups"
            # Make the backup script executable
            - name: chmod-script
              image: couchbase/server:enterprise-6.5.0
              command: ["chmod", "700", "/backups/backup-with-periodic-merge.sh"]
              volumeMounts:
                - name: "couchbase-cluster-backup-volume"
                  mountPath: "/backups"
            # Run the script so it can do a) Backup b) Compaction c) Merge with each snapshot
            - name: periodic-merge
              image: couchbase/server:enterprise-6.5.0
              command: ["sh", "-c", "/backups/backup-with-periodic-merge.sh --cluster cbdemo-srv.emart.svc"]
              volumeMounts:
                - name: "couchbase-cluster-backup-volume"
                  mountPath: "/backups"
          volumes:
            - name: couchbase-cluster-backup-volume
              persistentVolumeClaim:
                claimName: backup-pvc
          restartPolicy: Never
In the above YAML we run the backup every 5 minutes, but you can change the frequency to meet your business RPO. As our Couchbase cluster is deployed in the emart namespace, we will deploy the backup cronjob in the same namespace:
$ kubectl apply -f periodic-backup.yaml -n emart
cronjob.batch/couchbase-cluster-backup-create created
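The schedule field uses standard cron syntax, so you can map your target RPO to an expression before editing periodic-backup.yaml. Here is a small helper sketch; the RPO labels and the 02:00 UTC choice are arbitrary examples:

```shell
# Map a target RPO to a CronJob schedule (illustrative pairs; cron syntax is standard)
rpo_to_schedule() {
  case "$1" in
    5min)   echo "*/5 * * * *" ;;   # the demo value used above
    hourly) echo "0 * * * *"   ;;
    daily)  echo "0 2 * * *"   ;;   # daily at 02:00 UTC
    *)      echo "unknown RPO" >&2; return 1 ;;
  esac
}

# Patching the running CronJob in place would look something like this
# (assumes the resource name used in this blog):
# kubectl patch cronjob couchbase-cluster-backup-create -n emart \
#   -p "{\"spec\":{\"schedule\":\"$(rpo_to_schedule daily)\"}}"
rpo_to_schedule daily
```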
5.5. Validate Periodic Backup Job
At this point, you can watch the cronjob kick in every 5 minutes. Once it becomes active, it executes the three initContainers (wget-backup-script, chmod-script, periodic-merge) in sequential order, followed by the container (delete-script):
$ kubectl get pods -n emart -w
NAME                                               READY   STATUS            RESTARTS   AGE
backup-node                                        1/1     Running           0          1d
cbdemo-0000                                        1/1     Running           0          5d
cbdemo-0001                                        1/1     Running           0          5d
cbdemo-0002                                        1/1     Running           0          5d
cbdemo-0003                                        1/1     Running           0          5d
cbdemo-0004                                        1/1     Running           0          5d
couchbase-operator-7654d844cb-gn4bw                1/1     Running           0          5d
couchbase-operator-admission-7ff868f54c-5pklx      1/1     Running           0          5d
couchbase-cluster-backup-create-1580357820-tz2hg   0/1     Pending           0          2s
couchbase-cluster-backup-create-1580357820-tz2hg   0/1     Pending           0          2s
couchbase-cluster-backup-create-1580357820-tz2hg   0/1     Init:0/3          0          2s
couchbase-cluster-backup-create-1580357820-tz2hg   0/1     Init:1/3          0          3s
couchbase-cluster-backup-create-1580357820-tz2hg   0/1     Init:2/3          0          4s
couchbase-cluster-backup-create-1580357820-tz2hg   0/1     Init:2/3          0          6s
couchbase-cluster-backup-create-1580357820-tz2hg   0/1     PodInitializing   0          27s
couchbase-cluster-backup-create-1580357820-tz2hg   0/1     Completed         0          30s
You can view the logs of each initContainer after the pod shows status Completed. The initContainer we are interested in is called periodic-merge:
$ kubectl logs couchbase-cluster-backup-create-1580357820-tz2hg -n emart -c periodic-merge
---------------------------------------------------------
START STEP 1: BACKUP : Thu Jan 30 04:17:12 UTC 2020
Running backup...
Command: cbbackupmgr backup --archive /backups --repo couchbase --cluster couchbase://cbdemo-srv.emart.svc --username Administrator --password password --threads 2
Warning: Progress bar disabled because terminal width is less than 80 characters
Backup successfully completed
Backed up bucket "gamesim-sample" succeeded
Mutations backedup: 586, Mutations failed to backup: 0
Deletions backedup: 0, Deletions failed to backup: 0
Backed up bucket "travel-sample" succeeded
Mutations backedup: 0, Mutations failed to backup: 0
Deletions backedup: 0, Deletions failed to backup: 0
---------------------------------------------------------
START STEP 2: COMPACTION : Thu Jan 30 04:17:20 UTC 2020
List of backup snapshots...
2020-01-28T23_01_37.592188562Z
2020-01-28T23_03_34.160387835Z
2020-01-28T23_05_08.103740281Z
2020-01-30T04_17_12.702824188Z
Latest backup name is: 2020-01-30T04_17_12.702824188Z
Compacting the backup...
Command: cbbackupmgr compact --archive /backups --repo couchbase --backup 2020-01-30T04_17_12.702824188Z
Compaction succeeded, 0 bytes freed
---------------------------------------------------------
START STEP 3: Merging old backups : Thu Jan 30 04:17:24 UTC 2020
Size      Items   Name
604.93MB  -       + couchbase
192.00MB  -         + 2020-01-28T23_01_37.592188562Z
192.00MB  -           + beer-sample
37B       0             analytics.json
414B      0             bucket-config.json
192.00MB  7303          + data
192.00MB  7303            shards (1024)
2B        0             full-text.json
1.94KB    1             gsi.json
784B      1             views.json
192.02MB  -         + 2020-01-28T23_03_34.160387835Z
192.02MB  -           + travel-sample
0B        0             analytics.json
416B      0             bucket-config.json
192.00MB  31591         + data
192.00MB  31591           shards (1024)
2B        0             full-text.json
15.57KB   10            gsi.json
2B        0             views.json
64.02MB   -         + 2020-01-28T23_05_08.103740281Z
64.02MB   -           + travel-sample
0B        0             analytics.json
416B      0             bucket-config.json
64.00MB   0             + data
64.00MB   0               shards (1024)
2B        0             full-text.json
15.57KB   10            gsi.json
2B        0             views.json
156.89MB  -         + 2020-01-30T04_17_12.702824188Z
92.88MB   -           + gamesim-sample
0B        0             analytics.json
417B      0             bucket-config.json
92.88MB   586           + data
92.88MB   586             shards (1024)
2B        0             full-text.json
1.95KB    1             gsi.json
501B      1             views.json
64.02MB   -           + travel-sample
0B        0             analytics.json
416B      0             bucket-config.json
64.00MB   0             + data
64.00MB   0               shards (1024)
2B        0             full-text.json
15.57KB   10            gsi.json
2B        0             views.json
Start 2020-01-28T23_01_37.592188562Z, End 2020-01-28T23_03_34.160387835Z
Merging old backups...
Command: cbbackupmgr merge --archive /backups --repo couchbase --start 2020-01-28T23_01_37.592188562Z --end 2020-01-28T23_03_34.160387835Z
Merge completed successfully
Size      Items   Name
412.92MB  -       + couchbase
192.02MB  -         + 2020-01-28T23_03_34.160387835Z
192.02MB  -           + travel-sample
37B       0             analytics.json
416B      0             bucket-config.json
192.00MB  31591         + data
192.00MB  31591           shards (1024)
2B        0             full-text.json
15.57KB   10            gsi.json
2B        0             views.json
64.02MB   -         + 2020-01-28T23_05_08.103740281Z
64.02MB   -           + travel-sample
0B        0             analytics.json
416B      0             bucket-config.json
64.00MB   0             + data
64.00MB   0               shards (1024)
2B        0             full-text.json
15.57KB   10            gsi.json
2B        0             views.json
156.89MB  -         + 2020-01-30T04_17_12.702824188Z
92.88MB   -           + gamesim-sample
0B        0             analytics.json
417B      0             bucket-config.json
92.88MB   586           + data
92.88MB   586             shards (1024)
2B        0             full-text.json
1.95KB    1             gsi.json
501B      1             views.json
64.02MB   -           + travel-sample
0B        0             analytics.json
416B      0             bucket-config.json
64.00MB   0             + data
64.00MB   0               shards (1024)
2B        0             full-text.json
15.57KB   10            gsi.json
2B        0             views.json
Note: As can be seen from the logs above, before the merge step there were four backups available, and after the merge there are three backup snapshots, matching the restore points configured in the backup-with-periodic-merge.sh script.
This concludes the backup section.
6. Restoring
Much like a backup, we can restore data to a new Couchbase cluster with a Kubernetes Job.
kind: Job
apiVersion: batch/v1
metadata:
  name: couchbase-cluster-restore
spec:
  template:
    spec:
      containers:
        - name: couchbase-cluster-restore
          image: couchbase/server:enterprise-6.0.2
          command: ["cbbackupmgr", "restore", "--archive", "/backups", "--repo", "couchbase", "--cluster", "couchbase://cbdemo-srv.emart.svc", "--username", "Administrator", "--password", "password"]
          volumeMounts:
            - name: "couchbase-cluster-backup-volume"
              mountPath: "/backups"
      volumes:
        - name: couchbase-cluster-backup-volume
          persistentVolumeClaim:
            claimName: backup-pvc
      restartPolicy: Never
If you would rather create a temporary backup-restore pod to see which backups are available, or to troubleshoot an issue, you can mount the same persistentVolumeClaim in a new pod. Here is the definition of the pod, which can be stored in backup-pod.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: backup-node
spec:
  # specification of the pod's contents
  containers:
    - name: backup-pod
      image: couchbase/server:enterprise-6.5.0
      # Just spin & wait forever
      command: ["/bin/bash", "-c", "--"]
      args: ["while true; do sleep 30; done;"]
      volumeMounts:
        - name: "couchbase-cluster-backup-volume"
          mountPath: "/backups"
  volumes:
    - name: couchbase-cluster-backup-volume
      persistentVolumeClaim:
        claimName: backup-pvc
  restartPolicy: Never
Run kubectl to bring up the pod temporarily:
$ kubectl apply -f br/backup-pod.yaml -n emart
$ kubectl get pods -n emart
NAME                                            READY   STATUS    RESTARTS   AGE
backup-node                                     1/1     Running   0          3d1h
cbdemo-0000                                     1/1     Running   0          7d1h
cbdemo-0001                                     1/1     Running   0          7d1h
cbdemo-0002                                     1/1     Running   0          7d1h
cbdemo-0003                                     1/1     Running   0          7d1h
cbdemo-0004                                     1/1     Running   0          7d1h
couchbase-operator-7654d844cb-gn4bw             1/1     Running   0          7d2h
couchbase-operator-admission-7ff868f54c-5pklx   1/1     Running   0          7d2h
Once backup-node is Running, we can log in to that pod:
$ kubectl exec -it backup-node -n emart -- /bin/bash
root@backup-node:/#
Then execute the cbbackupmgr list command to view existing backups:
# cbbackupmgr list --repo couchbase --archive /backups
Size      Items   Name
256.04MB  -       + couchbase
0B        -         + 2020-01-30T04_17_12.702824188Z
0B        -           + gamesim-sample
0B        0             analytics.json
0B        0             + data
0B        0               Error: no data shards were found
0B        0             full-text.json
0B        0             gsi.json
0B        0             views.json
128.02MB  -         + 2020-01-30T04_18_13.021340423Z
....
You can also run the cbbackupmgr restore command manually:
# cbbackupmgr restore --archive /backups --repo couchbase --cluster couchbase://cbdemo-srv.emart.svc --username Administrator --password password |
Once you are done restoring, simply delete the pod:
$ kubectl delete -f backup-pod.yaml -n emart
7. Conclusion
We walked step by step through how to configure a backup cronjob that automates taking periodic backups at a predefined interval. We used a backup-with-periodic-merge.sh script that executes a) backup, b) compaction, and c) merge in a single run. This script was then used in the periodic-backup.yaml file, which automated the backup process within the Kubernetes environment. We hope you will use the best practices described in this blog, take regular backups, and regularly validate those backups using the restore command.