This tutorial uses Microsoft Azure as the example cloud provider. While the general procedure applies to other clouds, you will need to adapt the Azure-specific tools and configurations for your specific provider.
Prerequisites
- A running Kubernetes cluster with Stardog and ZooKeeper pods.
- A Kubernetes secret for the password of the Stardog user performing the backup/restore (stardog-password).
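If you don't already have that secret, one way to create it is sketched below. This is only an example: it assumes the password is stored under the key adminpw (the key the backup pod in step 3 reads) and that your namespace is stored in the $NAMESPACE environment variable, as in the later steps:
# Create the stardog-password secret with the adminpw key the backup pod references.
kubectl create secret generic stardog-password \
  --from-literal=adminpw='<your Stardog password>' \
  -n "$NAMESPACE"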
Steps to perform the backup
(Note: If you are backing up to an S3 bucket, you only need steps 3 and 4 from this article.)
1. Create a VolumeSnapshotClass so you can take a snapshot of your data.
You can use the following manifest, azure-snapshot-class.yaml:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: azure-disk-snapshot-class
driver: disk.csi.azure.com
deletionPolicy: Retain
You can apply this manifest with kubectl apply -f azure-snapshot-class.yaml. This action is idempotent: you can run it multiple times, and nothing will change if the class has already been created.
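You can confirm the class exists with:
kubectl get volumesnapshotclass azure-disk-snapshot-class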
2. Create a PersistentVolumeClaim (PVC) to hold the backup.
This PVC will be called stardog-backup-output. You will need to specify the size of the backup under spec.resources.requests.storage (shown as the placeholder ${backup_size} below).
You can use the following manifest, stardog-backup-pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: stardog-backup-output
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: ${backup_size}
  storageClassName: azurefile
You can apply this manifest with kubectl apply -f stardog-backup-pvc.yaml.
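You can check that the claim was created with the following (assuming it was applied in the namespace stored in $NAMESPACE, as in the later steps):
kubectl get pvc stardog-backup-output -n "$NAMESPACE"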
3. Create a pod to run the Stardog backup.
This pod will perform a native Stardog server backup that will be sent to our stardog-backup-output PVC and snapshotted.
The pod assumes you have environment variables for your Stardog username, Stardog password (stored in a Kubernetes secret), and the internal Kubernetes DNS name for the Stardog service. This isn't the address Stardog is running on within its pod (in other words, it's not http://localhost:5820); it's typically in the form http://{RELEASE_NAME}-stardog:5820, e.g., http://dev-sd-stardog:5820.
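If you aren't sure of this DNS name, one way to find it is to list the services in your namespace and look for the one ending in -stardog:
kubectl get svc -n "$NAMESPACE"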
You can use the following manifest, stardog-backup-runner.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: stardog-backup-runner
spec:
  securityContext:
    runAsUser: 20000
    runAsGroup: 20000
    fsGroup: 20000
  containers:
    - name: stardog
      image: stardog/stardog:latest
      command: ["/bin/sh", "-c"]
      env:
        - name: STARDOG_PASSWORD
          valueFrom:
            secretKeyRef:
              name: stardog-password
              key: adminpw
      args:
        - |
          echo "[INFO] Starting remote backup..."
          /opt/stardog/bin/stardog-admin --server http://${stardog_server}:5820 \
            server backup \
            -p "${STARDOG_PASSWORD}" \
            -u ${STARDOG_USERNAME} \
            -- /var/opt/stardog/.backup
          echo "[INFO] Backup complete!" && \
          tail -f /dev/null
      volumeMounts:
        - name: backup
          mountPath: /backup
  volumes:
    - name: backup
      persistentVolumeClaim:
        claimName: stardog-backup-output
  restartPolicy: Never
After filling in the ${stardog_server} and ${STARDOG_USERNAME} placeholders, you can apply this manifest with kubectl apply -f stardog-backup-runner.yaml.
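A minimal sketch of one way to fill in those placeholders and apply the manifest in a single step, assuming both variables are exported in your shell and GNU envsubst is available:
# Hypothetical values; adjust for your release name and backup user.
export stardog_server=dev-sd-stardog
export STARDOG_USERNAME=admin
# Substitute only these two variables (STARDOG_PASSWORD is resolved inside the pod)
# and apply the rendered manifest.
envsubst '${stardog_server} ${STARDOG_USERNAME}' < stardog-backup-runner.yaml \
  | kubectl apply -n "$NAMESPACE" -f -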
If you are backing up to an S3 bucket:
- You don't need to mount the stardog-backup-output volume.
- Your backup command should be stardog-admin server backup s3:///bucket-name/path-in-bucket?region=us-east-1\&AWS_ACCESS_KEY_ID=<your key id>\&AWS_SECRET_ACCESS_KEY=<your key secret> (a sketch follows this list).
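For reference, a sketch of how the runner pod's args block might look for an S3 backup, with the volumeMounts and volumes sections removed per the first bullet. The bucket name, path, region, and credentials are placeholders, and quoting the URL avoids having to escape the & characters:
      args:
        - |
          echo "[INFO] Starting remote backup..."
          /opt/stardog/bin/stardog-admin --server http://${stardog_server}:5820 \
            server backup \
            -p "${STARDOG_PASSWORD}" \
            -u ${STARDOG_USERNAME} \
            -- "s3:///bucket-name/path-in-bucket?region=us-east-1&AWS_ACCESS_KEY_ID=<your key id>&AWS_SECRET_ACCESS_KEY=<your key secret>"
          echo "[INFO] Backup complete!" && \
          tail -f /dev/null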
4. Wait for the backup pod to complete.
You can monitor its progress by searching the pod logs for the message "Backup complete!" like so:
LOG_OUTPUT=$(kubectl logs stardog-backup-runner -n "$NAMESPACE")
if echo "$LOG_OUTPUT" | grep -q "Backup complete!"; then
  echo "Backup completed successfully."
fi
This block can be integrated into a for loop that periodically checks for backup completion.
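For example, a minimal sketch of such a loop (the 30-second interval and 60-check limit are arbitrary):
# Poll the runner pod's logs until the completion message appears,
# giving up after 60 checks (roughly 30 minutes).
for attempt in $(seq 1 60); do
  if kubectl logs stardog-backup-runner -n "$NAMESPACE" | grep -q "Backup complete!"; then
    echo "Backup completed successfully."
    break
  fi
  echo "Backup not finished yet (check $attempt of 60); retrying in 30s..."
  sleep 30
done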
5. Send the backup to stardog-backup-runner.
stardog-backup-runner sends the command to your Stardog pod to perform the backup, and the backup is stored on the PVC mounted to the Stardog pod. We need to send that backup to the PVC mounted to stardog-backup-runner (i.e., to stardog-backup-output), because we have to delete the PVCs mounted to our Stardog and ZooKeeper pods when we restore later.
We can't copy directly from one pod to the other (and we can't mount both pods to the same PVC), so the simplest way to do this is to copy the backup to our local machine and then copy it to stardog-backup-runner. We can do that with the following commands:
# Copy from Stardog pod to your machine first
kubectl exec -n "$NAMESPACE" ${NAMESPACE}-stardog-0 -- tar cf - -C /var/opt/stardog/.backup . \
  > ./local-backup.tar

# Now, copy from your machine to the backup runner pod
kubectl exec -n "$NAMESPACE" stardog-backup-runner -- mkdir -p /backup/stardog_backup
cat ./local-backup.tar \
  | kubectl exec -i -n "$NAMESPACE" stardog-backup-runner \
    -- tar --no-same-owner --no-same-permissions --touch -xf - -C /backup/stardog_backup

# Remove file from local machine
rm ./local-backup.tar
Note this assumes you have your namespace stored in the $NAMESPACE environment variable.
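Before taking the snapshot, you can verify that the backup landed on the runner pod's PVC:
kubectl exec -n "$NAMESPACE" stardog-backup-runner -- ls -l /backup/stardog_backup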
6. Create a snapshot of the backup output PVC.
The recommended name for the snapshot is stardog-backup-snapshot-[date and time of snapshot]. You can store that in an environment variable like so:
snapshot_name=stardog-backup-snapshot-$(date +%Y%m%d%H%M%S)
You can use the following manifest for the snapshot, ${snapshot_name}.yaml, which will snapshot stardog-backup-output:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: ${snapshot_name}
spec:
  volumeSnapshotClassName: azure-disk-snapshot-class
  source:
    persistentVolumeClaimName: stardog-backup-output
You can apply this manifest with kubectl apply -f ${snapshot_name}.yaml.
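Cutting the snapshot can take a little while. One way to check that it is ready is to read its readyToUse status field (assuming the snapshot was created in the namespace stored in $NAMESPACE):
# Prints "true" once the snapshot is ready to use.
kubectl get volumesnapshot "${snapshot_name}" -n "$NAMESPACE" -o jsonpath='{.status.readyToUse}'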
Once the snapshot has completed, you are done with the backup process, and you can delete the stardog-backup-runner pod (for example, with kubectl delete -f stardog-backup-runner.yaml).
Restoring
When you are ready to restore from the backup you have taken, see this companion article:
How to do server restore in Stardog Helm Charts.