This tutorial uses Microsoft Azure as the example cloud provider. While the general procedure applies to other clouds, you will need to adapt the Azure-specific tools and configurations for your specific provider.

Prerequisites
- A running Kubernetes cluster with Stardog and ZooKeeper pods.
- A Kubernetes secret for the password of the Stardog user performing the backup/restore (stardog-password).
Steps to perform the backup
(Note: If you are backing up to an S3 bucket, you only need steps 3 and 4 from this article.)
1. Create a VolumeSnapshotClass so you can take a snapshot of your data.
You can use the following manifest, azure-snapshot-class.yaml:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: azure-disk-snapshot-class
driver: disk.csi.azure.com
deletionPolicy: Retain
You can apply this manifest with kubectl apply -f azure-snapshot-class.yaml. Applying it is idempotent: you can run it multiple times, and nothing will change if the class already exists.
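If you want to confirm the class exists before moving on, a quick check is:

# List the snapshot class created above; an error here means it hasn't been created yet
kubectl get volumesnapshotclass azure-disk-snapshot-class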
2. Create a PersistentVolumeClaim (PVC) to hold the backup.
This PVC will be called stardog-backup-output. You will need to specify the size of the backup under spec.resources.requests.storage (the ${backup_size} placeholder in the manifest below).
You can use the following manifest, stardog-backup-pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: stardog-backup-output
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: ${backup_size}
  storageClassName: azurefile

You can apply this manifest with kubectl apply -f stardog-backup-pvc.yaml.
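Once applied, you can verify that the claim was created (depending on the storage class, it may show Pending until a pod mounts it). This assumes your namespace is stored in $NAMESPACE, as in the later steps:

# Check that the backup PVC exists and note its STATUS (Bound or Pending)
kubectl get pvc stardog-backup-output -n "$NAMESPACE"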
3. Create a pod to run the Stardog backup.
This pod will perform a native Stardog server backup, which will later be copied to our stardog-backup-output PVC (in step 5) and snapshotted.
The manifest contains placeholders for your Stardog username (${STARDOG_USERNAME}) and for the internal Kubernetes DNS name of the Stardog service (${stardog_server}); the password is injected from the stardog-password Kubernetes secret. The service address is not the address Stardog listens on inside its own pod. (In other words, it's not http://localhost:5820.) It's typically of the form http://{RELEASE_NAME}-stardog:5820, e.g., http://dev-sd-stardog:5820, so ${stardog_server} would be something like dev-sd-stardog.
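If you're unsure of the exact service name in your cluster, one way to find it (again assuming your namespace is stored in $NAMESPACE) is to list the services in the namespace:

# The Stardog service typically appears as {RELEASE_NAME}-stardog
kubectl get svc -n "$NAMESPACE"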
You can use the following manifest, stardog-backup-runner.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: stardog-backup-runner
spec:
  securityContext:
    runAsUser: 20000
    runAsGroup: 20000
    fsGroup: 20000
  containers:
    - name: stardog
      image: stardog/stardog:latest
      command: ["/bin/sh", "-c"]
      env:
        # The admin password is read from the stardog-password secret at runtime.
        - name: STARDOG_PASSWORD
          valueFrom:
            secretKeyRef:
              name: stardog-password
              key: adminpw
      args:
        - |
          echo "[INFO] Starting remote backup..."
          # ${stardog_server} and ${STARDOG_USERNAME} are placeholders to fill in before applying;
          # $STARDOG_PASSWORD is resolved inside the pod from the secret above.
          # The backup itself is written on the Stardog server pod at /var/opt/stardog/.backup
          # and copied to this pod's /backup volume in step 5.
          /opt/stardog/bin/stardog-admin --server http://${stardog_server}:5820 \
            server backup \
            -p "$STARDOG_PASSWORD" \
            -u ${STARDOG_USERNAME} \
            -- /var/opt/stardog/.backup
          echo "[INFO] Backup complete!" && \
          tail -f /dev/null
      volumeMounts:
        - name: backup
          mountPath: /backup
  volumes:
    - name: backup
      persistentVolumeClaim:
        claimName: stardog-backup-output
  restartPolicy: Never

You can apply this manifest with kubectl apply -f stardog-backup-runner.yaml.
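Because the manifest contains the ${stardog_server} and ${STARDOG_USERNAME} placeholders, one way to fill them in is GNU envsubst, restricted to just those two variables so the in-pod $STARDOG_PASSWORD reference is left untouched. This is only a sketch, assuming envsubst is installed and the example values below are adjusted for your deployment; you can also simply edit the file by hand:

# Export the values to substitute (example values; adjust for your deployment)
export stardog_server=dev-sd-stardog
export STARDOG_USERNAME=admin
# Substitute only these two placeholders and apply the result
envsubst '${stardog_server} ${STARDOG_USERNAME}' < stardog-backup-runner.yaml | kubectl apply -f -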
If you are backing up to an S3 bucket:
- You don't need to mount the stardog-backup-output volume.
- Your backup command should be stardog-admin server backup s3:///bucket-name/path-in-bucket?region=us-east-1\&AWS_ACCESS_KEY_ID=<your key id>\&AWS_SECRET_ACCESS_KEY=<your key secret>.
4. Wait for the backup pod to complete.
You can monitor its progress by searching the pod logs for the message "Backup complete!", like so:
LOG_OUTPUT=$(kubectl logs stardog-backup-runner -n "$NAMESPACE")
if echo "$LOG_OUTPUT" | grep -q "Backup complete!"; then
  echo "Backup completed successfully."
fi

This check can be integrated into a loop that periodically polls for backup completion, as shown below.
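For example, a simple polling loop along these lines (a sketch; the 30-attempt/10-second limits are arbitrary, and $NAMESPACE is assumed to be set) waits until the message appears or gives up after about five minutes:

# Poll the runner pod's logs until the completion message appears
for i in $(seq 1 30); do
  if kubectl logs stardog-backup-runner -n "$NAMESPACE" | grep -q "Backup complete!"; then
    echo "Backup completed successfully."
    break
  fi
  echo "Waiting for backup to complete (attempt $i)..."
  sleep 10
done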
5. Send the backup to stardog-backup-runner.
stardog-backup-runner sends the command to your Stardog pod to perform the backup, and the backup is stored on the PVC mounted to the Stardog pod. We need to send that backup to the PVC mounted to stardog-backup-runner (i.e., to stardog-backup-output), because we have to delete the PVCs mounted to our Stardog and ZooKeeper pods when we restore later.
We can't copy directly from one pod to the other (and we can't mount both pods to the same PVC), so the simplest way to do this is to copy the backup to our local machine and then copy it to stardog-backup-runner. We can do that with the following commands:
# Copy from Stardog pod to your machine first
kubectl exec -n "$NAMESPACE" ${NAMESPACE}-stardog-0 -- tar cf - -C /var/opt/stardog/.backup . \
> ./local-backup.tar
# Now, copy from your machine to the backup runner pod
kubectl exec -n "$NAMESPACE" stardog-backup-runner -- mkdir -p /backup/stardog_backup
cat ./local-backup.tar \
| kubectl exec -i -n "$NAMESPACE" stardog-backup-runner \
-- tar --no-same-owner --no-same-permissions --touch -xf - -C /backup/stardog_backup
# Remove file from local machine
rm ./local-backup.tar

Note this assumes you have your namespace stored in the $NAMESPACE environment variable and that your Stardog pod is named ${NAMESPACE}-stardog-0 (i.e., the Helm release name matches the namespace); adjust the pod name if yours differs.
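To confirm the copy landed on the runner's PVC, you can list the directory as a quick sanity check:

# The backup contents should now be visible under /backup/stardog_backup
kubectl exec -n "$NAMESPACE" stardog-backup-runner -- ls -l /backup/stardog_backup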
6. Create a snapshot of the backup output PVC.
The recommended name for the snapshot is stardog-backup-snapshot-[date and time of snapshot]. You can store that in an environment variable like so:
snapshot_name=stardog-backup-snapshot-$(date +%Y%m%d%H%M%S)
You can use the following manifest for the snapshot, ${snapshot_name}.yaml, which will snapshot stardog-backup-output:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: ${snapshot_name}
spec:
  volumeSnapshotClassName: azure-disk-snapshot-class
  source:
    persistentVolumeClaimName: stardog-backup-output

You can apply this manifest with kubectl apply -f ${snapshot_name}.yaml.
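You can check whether the snapshot has been cut before tearing anything down (assuming it was created in $NAMESPACE); its READYTOUSE status should report true:

# Shows READYTOUSE once the snapshot is ready
kubectl get volumesnapshot ${snapshot_name} -n "$NAMESPACE"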
Once the snapshot is ready, you are done with the backup process, and you can delete the stardog-backup-runner pod (e.g., kubectl delete -f stardog-backup-runner.yaml).
Restoring
When you are ready to restore from the backup you have taken, see this companion article:
How to do server restore in Stardog Helm Charts.