How to do server restore in Stardog Helm Charts

Created by Steve Place, Modified on Thu, Apr 10 at 9:18 AM by Steve Place

This tutorial uses Microsoft Azure as the example cloud provider. While the general procedure applies to other clouds, you will need to adapt the Azure-specific tools and configurations for your specific provider.

If you want to restore from S3, see this article.


Prerequisites

  • A running Kubernetes cluster with Stardog and ZooKeeper pods.
  • Kubernetes secrets for your Stardog license file and the password of the Stardog user performing the backup/restore (stardog-license and stardog-password).
  • You have previously created a Stardog server backup by following the companion to this article: How to do server backup in Stardog Helm Charts.
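
Before starting, you can quickly confirm that the required secrets exist in your namespace (a quick check, assuming the secret names listed above):

kubectl get secret stardog-license stardog-password -n <namespace>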


Steps to perform the restore


(You only need to perform steps 1 and 2 if you're restoring an existing helm release.)


1. Scale your Stardog and ZooKeeper clusters down to 0 pods.


You can do this by editing the following stanzas in your values.yaml file and running helm upgrade:

stardog:
  cluster:
    enabled: true
    replicaCount: 0
zookeeper:
  enabled: true
  replicaCount: 0
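
For example (a sketch, assuming your Helm release is named stardog and your chart reference is stardog/stardog; substitute your own release and chart names):

helm upgrade stardog stardog/stardog -n <namespace> -f values.yaml

# Wait for the Stardog and ZooKeeper pods to terminate
kubectl get pods -n <namespace> -w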

2. Delete the PVCs associated with Stardog and ZooKeeper.


Only do this after you've taken a server backup!


You can delete the PVCs with the following command (adjust the list of names if your cluster runs a different number of Stardog or ZooKeeper replicas):

kubectl delete pvc -n <namespace> data-<namespace>-stardog-0 data-<namespace>-stardog-1 \
    data-<namespace>-stardog-2 data-<namespace>-zookeeper-0 data-<namespace>-zookeeper-1 \
    data-<namespace>-zookeeper-2
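
You can confirm the PVCs are gone before moving on:

kubectl get pvc -n <namespace>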


3. Create a PVC from your backup snapshot.


This PVC will have the same name as the first PVC in your Stardog cluster, i.e., data-<namespace>-stardog-0. In the manifest below, stardog-restore-from-backup.yaml, you will need to include that name, the size of the PVC, and the name of the snapshot to restore from. The latter two were decided when creating the server backup.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${PVC_NAME}
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: managed-csi
  resources:
    requests:
      storage: ${pvc_size}
  dataSource:
    name: ${snapshot_name}
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io

Once the placeholders are filled in, you can apply this manifest with kubectl apply -f stardog-restore-from-backup.yaml.
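
If you prefer to fill in the placeholders from environment variables rather than editing the file by hand, a quick sketch (assuming GNU envsubst is available) looks like this:

export PVC_NAME=data-<namespace>-stardog-0
export pvc_size=<pvc-size>              # must match the size chosen when the backup was created
export snapshot_name=<snapshot-name>    # the VolumeSnapshot created during the server backup

envsubst < stardog-restore-from-backup.yaml | kubectl apply -n <namespace> -f -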


4. Create a pod to run the Stardog restore.


This pod mounts the PVC you created in the previous step along with the backup PVC, deletes everything in its STARDOG_HOME directory except the license key, and runs a native Stardog server restore.


The manifest assumes you have environment variables set for your Stardog username and for the name of the PVC created in the previous step; both are substituted into the manifest before it is applied. The Stardog password is read at runtime from the stardog-password Kubernetes secret listed in the prerequisites.


If you are using OAuth 2.0 and do not have the local password of an administrator, you can pass a token with the --token option instead of -u <username> -p <password> in the manifest.
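
In that case, the restore command inside the manifest would look roughly like this (a sketch; it assumes you make the token available to the pod, for example through an environment variable named STARDOG_TOKEN):

# Hypothetical token-based variant of the restore invocation
/opt/stardog/bin/stardog-admin server restore \
    --token "${STARDOG_TOKEN}" \
    -- /backup/stardog_backup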


You can use the following manifest, stardog-restore-runner.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: stardog-restore-runner
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 1000
    fsGroup: 1000
  containers:
  - name: stardog
    image: stardog/stardog:latest
    env:
      # The password is pulled at runtime from the stardog-password secret so it never
      # appears in the manifest. Adjust the key if your secret stores it under a different one.
      - name: STARDOG_PASSWORD
        valueFrom:
          secretKeyRef:
            name: stardog-password
            key: password
    command: ["/bin/sh", "-c"]
    args:
      - |
        echo "[INFO] Waiting for volume to be writable..."
        COUNT=0
        while ! touch /var/opt/stardog/.restore-check 2>/dev/null; do
          if [ "$COUNT" -ge 60 ]; then
            echo "[ERROR] Timed out waiting for volume"
            exit 1
          fi
          echo "[INFO] Volume not ready yet, retrying..."
          sleep 5
          COUNT=$((COUNT+1))
        done

        echo "[INFO] Volume is ready. Removing all files from STARDOG_HOME except the license key..."
        find /var/opt/stardog -mindepth 1 ! -name 'stardog-license-key.bin' -delete

        echo "[INFO] Starting restore from /backup/stardog_backup..."
        /opt/stardog/bin/stardog-admin server restore \
            -p "${STARDOG_PASSWORD}" \
            -u "${STARDOG_USERNAME}" \
            -- /backup/stardog_backup 2>&1 | tee /dev/stdout

        echo "[INFO] Restore complete!"
        tail -f /dev/null
    volumeMounts:
      - mountPath: /var/opt/stardog
        name: stardog-home
      - mountPath: /backup
        name: stardog-backup
      - mountPath: /var/opt/stardog/stardog-license-key.bin
        name: stardog-license
        subPath: stardog-license-key.bin
  volumes:
    - name: stardog-home
      persistentVolumeClaim:
        claimName: ${PVC_NAME}
    - name: stardog-backup
      persistentVolumeClaim:
        claimName: stardog-backup-output
    - name: stardog-license
      secret:
        secretName: stardog-license
Once the placeholders are filled in, you can apply this manifest with kubectl apply -f stardog-restore-runner.yaml.
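
If you would rather not edit the file by hand, one way to fill in ${STARDOG_USERNAME} and ${PVC_NAME} (a sketch, assuming both are exported in your shell and GNU envsubst is available) is to restrict envsubst to those two variables so the shell variables used inside the pod's script are left untouched:

export STARDOG_USERNAME=admin                 # example value; use your own
export PVC_NAME=data-<namespace>-stardog-0

envsubst '${STARDOG_USERNAME} ${PVC_NAME}' < stardog-restore-runner.yaml \
  | kubectl apply -n <namespace> -f -

# Confirm the pod has started
kubectl get pod stardog-restore-runner -n <namespace>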


5. Wait for the restore pod to complete.


You can monitor its progress by searching the pod logs for the message "Restore complete!", like so:

LOG_OUTPUT=$(kubectl logs stardog-restore-runner)
if echo "$LOG_OUTPUT" | grep -q "Restore complete!"; then
  break
fi

This block can be integrated into a for loop that periodically checks for restore completion.
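
For example, a minimal polling loop (a sketch that gives up after roughly 30 minutes) could look like this:

for i in $(seq 1 60); do
  LOG_OUTPUT=$(kubectl logs stardog-restore-runner)
  if echo "$LOG_OUTPUT" | grep -q "Restore complete!"; then
    echo "Restore finished."
    break
  fi
  echo "Restore still in progress; checking again in 30 seconds..."
  sleep 30
done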


Once completed, you can delete stardog-restore-runner.
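
For example:

kubectl delete pod stardog-restore-runner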


6. Scale your Stardog cluster back up.


Scale ZooKeeper back up to the number of pods it had previously (usually 3) and Stardog up to 1 pod. You can do that by editing your values.yaml file like so and running helm upgrade:

stardog:
  cluster:
    enabled: true
    replicaCount: 1
zookeeper:
  enabled: true
  replicaCount: 3

Once your first Stardog pod is up and running, repeat this process, adding one Stardog pod at a time until you're at your normal number of pods.
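
One way to add pods incrementally (a sketch, again assuming a release named stardog, a chart reference of stardog/stardog, and the values layout shown above) is to override the replica count on each upgrade instead of editing values.yaml every time:

# Add the second Stardog pod once the first is healthy
helm upgrade stardog stardog/stardog -n <namespace> -f values.yaml \
  --set stardog.cluster.replicaCount=2

# Then the third, and so on, up to your usual replica count
helm upgrade stardog stardog/stardog -n <namespace> -f values.yaml \
  --set stardog.cluster.replicaCount=3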


If you have a large data set, it may take a long time for the other nodes to replicate data from the first. Instead, you can deploy one stardog-restore-runner for each node and, once all of the restores are complete, scale your cluster up in one step.
