This tutorial uses Microsoft Azure as the example cloud provider. While the general procedure applies to other clouds, you will need to adapt the Azure-specific tools and configurations for your specific provider.
Prerequisites
- A running Kubernetes cluster with Stardog and ZooKeeper pods.
- A Kubernetes secret for the password of the Stardog user performing the backup/restore (stardog-password).
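If you don't already have that secret, one way to create it is sketched below. This is only an example: it assumes the password is stored under the key adminpw (the key the backup pod in step 3 reads) and that your namespace is stored in the $NAMESPACE environment variable, as in the later steps:
# Create the stardog-password secret with the adminpw key the backup pod references.
kubectl create secret generic stardog-password \
  --from-literal=adminpw='<your Stardog password>' \
  -n "$NAMESPACE"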
Steps to perform the backup
(Note: If you are backing up to an S3 bucket, you only need steps 3 and 4 from this article.)
1. Create a VolumeSnapshotClass so you can take a snapshot of your data.
You can use the following manifest, azure-snapshot-class.yaml:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: azure-disk-snapshot-class
driver: disk.csi.azure.com
deletionPolicy: Retain
You can apply this manifest with kubectl apply -f azure-snapshot-class.yaml. This action is idempotent: you can run it multiple times, and nothing will change if the class has already been created.
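You can confirm the class exists with:
kubectl get volumesnapshotclass azure-disk-snapshot-class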
2. Create a PersistentVolumeClaim (PVC) to hold the backup.
This PVC will be called stardog-backup-output. You will need to specify the size of the backup under spec.resources.requests.storage (shown as the placeholder ${backup_size} below).
You can use the following manifest, stardog-backup-pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: stardog-backup-output
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: ${backup_size}
  storageClassName: azurefile
You can apply this manifest with kubectl apply -f stardog-backup-pvc.yaml.
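You can check that the claim was created with the following (assuming it was applied in the namespace stored in $NAMESPACE, as in the later steps):
kubectl get pvc stardog-backup-output -n "$NAMESPACE"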
3. Create a pod to run the Stardog backup.
This pod will perform a native Stardog server backup that will be sent to our stardog-backup-output PVC and snapshotted.
The pod assumes you have environment variables for your Stardog username, Stardog password (stored in a Kubernetes secret), and the internal Kubernetes DNS name for the Stardog service. This isn't the address Stardog is running on within its pod (in other words, it's not http://localhost:5820); it's typically in the form http://{RELEASE_NAME}-stardog:5820, e.g., http://dev-sd-stardog:5820.
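If you aren't sure of this DNS name, one way to find it is to list the services in your namespace and look for the one ending in -stardog:
kubectl get svc -n "$NAMESPACE"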
You can use the following manifest, stardog-backup-runner.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: stardog-backup-runner
spec:
  securityContext:
    runAsUser: 20000
    runAsGroup: 20000
    fsGroup: 20000
  containers:
    - name: stardog
      image: stardog/stardog:latest
      command: ["/bin/sh", "-c"]
      env:
        - name: STARDOG_PASSWORD
          valueFrom:
            secretKeyRef:
              name: stardog-password
              key: adminpw
      args:
        - |
          echo "[INFO] Starting remote backup..."
          /opt/stardog/bin/stardog-admin --server http://${stardog_server}:5820 \
            server backup \
            -p "${STARDOG_PASSWORD}" \
            -u ${STARDOG_USERNAME} \
            -- /var/opt/stardog/.backup
          echo "[INFO] Backup complete!" && \
          tail -f /dev/null
      volumeMounts:
        - name: backup
          mountPath: /backup
  volumes:
    - name: backup
      persistentVolumeClaim:
        claimName: stardog-backup-output
  restartPolicy: Never
After filling in the ${stardog_server} and ${STARDOG_USERNAME} placeholders, you can apply this manifest with kubectl apply -f stardog-backup-runner.yaml.
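A minimal sketch of one way to fill in those placeholders and apply the manifest in a single step, assuming both variables are exported in your shell and GNU envsubst is available:
# Hypothetical values; adjust for your release name and backup user.
export stardog_server=dev-sd-stardog
export STARDOG_USERNAME=admin
# Substitute only these two variables (STARDOG_PASSWORD is resolved inside the pod)
# and apply the rendered manifest.
envsubst '${stardog_server} ${STARDOG_USERNAME}' < stardog-backup-runner.yaml \
  | kubectl apply -n "$NAMESPACE" -f -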
If you are backing up to an S3 bucket:
- You don't need to mount the stardog-backup-output volume.
- Your backup command should be stardog-admin server backup s3:///bucket-name/path-in-bucket?region=us-east-1\&AWS_ACCESS_KEY_ID=<your key id>\&AWS_SECRET_ACCESS_KEY=<your key secret> (a sketch follows this list).
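For reference, a sketch of how the runner pod's args block might look for an S3 backup, with the volumeMounts and volumes sections removed per the first bullet. The bucket name, path, region, and credentials are placeholders, and quoting the URL avoids having to escape the & characters:
      args:
        - |
          echo "[INFO] Starting remote backup..."
          /opt/stardog/bin/stardog-admin --server http://${stardog_server}:5820 \
            server backup \
            -p "${STARDOG_PASSWORD}" \
            -u ${STARDOG_USERNAME} \
            -- "s3:///bucket-name/path-in-bucket?region=us-east-1&AWS_ACCESS_KEY_ID=<your key id>&AWS_SECRET_ACCESS_KEY=<your key secret>"
          echo "[INFO] Backup complete!" && \
          tail -f /dev/null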
4. Wait for the backup pod to complete.
You can monitor its progress by searching the pod logs for the message "Backup complete!" like so:
LOG_OUTPUT=$(kubectl logs stardog-backup-runner -n "$NAMESPACE")
if echo "$LOG_OUTPUT" | grep -q "Backup complete!"; then
  echo "Backup completed successfully."
fi
This block can be integrated into a for loop that periodically checks for backup completion.
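For example, a minimal sketch of such a loop (the 30-second interval and 60-check limit are arbitrary):
# Poll the runner pod's logs until the completion message appears,
# giving up after 60 checks (roughly 30 minutes).
for attempt in $(seq 1 60); do
  if kubectl logs stardog-backup-runner -n "$NAMESPACE" | grep -q "Backup complete!"; then
    echo "Backup completed successfully."
    break
  fi
  echo "Backup not finished yet (check $attempt of 60); retrying in 30s..."
  sleep 30
done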
5. Send the backup to stardog-backup-runner.
stardog-backup-runner sends the command to your Stardog pod to perform the backup, and the backup is stored on the PVC mounted to the Stardog pod. We need to send that backup to the PVC mounted to stardog-backup-runner (i.e., to stardog-backup-output), because we have to delete the PVCs mounted to our Stardog and ZooKeeper pods when we restore later.
We can't copy directly from one pod to the other (and we can't mount both pods to the same PVC), so the simplest way to do this is to copy the backup to our local machine and then copy it to stardog-backup-runner. We can do that with the following commands:
# Copy from Stardog pod to your machine first
kubectl exec -n "$NAMESPACE" ${NAMESPACE}-stardog-0 -- tar cf - -C /var/opt/stardog/.backup . \
  > ./local-backup.tar

# Now, copy from your machine to the backup runner pod
kubectl exec -n "$NAMESPACE" stardog-backup-runner -- mkdir -p /backup/stardog_backup
cat ./local-backup.tar \
  | kubectl exec -i -n "$NAMESPACE" stardog-backup-runner \
    -- tar --no-same-owner --no-same-permissions --touch -xf - -C /backup/stardog_backup

# Remove file from local machine
rm ./local-backup.tar
Note this assumes you have your namespace stored in the $NAMESPACE environment variable.
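Before taking the snapshot, you can verify that the backup landed on the runner pod's PVC:
kubectl exec -n "$NAMESPACE" stardog-backup-runner -- ls -l /backup/stardog_backup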
6. Create a snapshot of the backup output PVC.
The recommended name for the snapshot is stardog-backup-snapshot-[date and time of snapshot]. You can store that in an environment variable like so:
snapshot_name=stardog-backup-snapshot-$(date +%Y%m%d%H%M%S)
You can use the following manifest for the snapshot, ${snapshot_name}.yaml, which will snapshot stardog-backup-output:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: ${snapshot_name}
spec:
  volumeSnapshotClassName: azure-disk-snapshot-class
  source:
    persistentVolumeClaimName: stardog-backup-output
You can apply this manifest with kubectl apply -f ${snapshot_name}.yaml.
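Cutting the snapshot can take a little while. One way to check that it is ready is to read its readyToUse status field (assuming the snapshot was created in the namespace stored in $NAMESPACE):
# Prints "true" once the snapshot is ready to use.
kubectl get volumesnapshot "${snapshot_name}" -n "$NAMESPACE" -o jsonpath='{.status.readyToUse}'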
Once the snapshot has completed, you are done with the backup process, and you can delete the stardog-backup-runner pod (for example, with kubectl delete -f stardog-backup-runner.yaml).
Restoring
When you are ready to restore from the backup you have taken, see this companion article:
How to do server restore in Stardog Helm Charts.