This tutorial covers server restore from an Amazon S3 bucket. If you are not using S3, see our general How to do server restore in Stardog Helm Charts article.
Prerequisites
- A running Kubernetes cluster with Stardog and ZooKeeper pods.
- Kubernetes secrets for your Stardog license file and the password of the Stardog user performing the backup/restore (stardog-license and stardog-password); an example of creating these secrets is shown below.
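If you have not created these secrets yet, commands like the following will do it. The license file path and the password key name used here are assumptions, so adjust them to match your deployment:

kubectl create secret generic stardog-license \
  --from-file=stardog-license-key.bin=./stardog-license-key.bin -n <namespace>
kubectl create secret generic stardog-password \
  --from-literal=password=<stardog-user-password> -n <namespace>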
Steps to perform the restore
(You only need to perform steps 1 and 2 if you're restoring an existing Helm release.)
1. Scale your Stardog and ZooKeeper clusters down to 0 pods.
kubectl scale statefulset <stardog> --replicas=0 -n <namespace>
kubectl scale statefulset <zookeeper> --replicas=0 -n <namespace>
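You can confirm the pods have terminated before moving on, for example:

kubectl get pods -n <namespace>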
Only do this after you've taken a server backup!
2. Delete the Stardog and ZooKeeper PVCs.
You can delete the PVCs with the following command:
kubectl delete pvc -n <namespace> data-<namespace>-stardog-0 data-<namespace>-stardog-1 \
  data-<namespace>-stardog-2 data-<namespace>-zookeeper-0 data-<namespace>-zookeeper-1 \
  data-<namespace>-zookeeper-2
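Afterwards, you can verify that no PVCs remain for the release, for example:

kubectl get pvc -n <namespace>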
3. Create a pod to run the Stardog restore.
This pod mounts the PVC, deletes everything in its STARDOG_HOME directory except the license key, and runs a native Stardog server restore.
The pod assumes you have environment variables for:
- your Stardog username
- your Stardog password (stored in a Kubernetes secret)
- If you are using OAuth 2.0 and you do not have the local password of an administrator, you can pass the token with the --token option rather than passing -u <username> -p <password> in the manifest (see the example after the manifest below).
- the name of the PVC (set in the previous step)
- the region your S3 backup is stored in
- your AWS access key ID
- your AWS secret access key
- the node id of the node you're restoring from
The node id is specific to S3 restores and can be found in the S3 backup path, which looks like this: <aws_bucket_name>/<user-optional-path-name>/<node-id>/<other stuff>
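If you are unsure of the node id, one way to find it is to list the prefixes under your backup path with the AWS CLI. The bucket and path below are placeholders:

aws s3 ls s3://<aws_bucket_name>/<user-optional-path-name>/

Each prefix returned corresponds to a node id you can pass to the restore command.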
You can use the following manifest, stardog-restore-runner.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: stardog-restore-runner
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 1000
    fsGroup: 1000
  containers:
    - name: stardog
      image: stardog/stardog:latest
      command: ["/bin/sh", "-c"]
      args:
        - |
          echo "[INFO] Waiting for volume to be writable..."
          COUNT=0
          while ! touch /var/opt/stardog/.restore-check 2>/dev/null; do
            if [[ $COUNT -ge 60 ]]; then
              echo "[ERROR] Timed out waiting for volume"
              exit 1
            fi
            echo "[INFO] Volume not ready yet, retrying..."
            sleep 5
            COUNT=$((COUNT+1))
          done
          echo "[INFO] Volume is ready. Removing all files from STARDOG_HOME except the license key..."
          find /var/opt/stardog -mindepth 1 ! -name 'stardog-license-key.bin' -delete
          echo "[INFO] Starting restore from S3..."
          /opt/stardog/bin/stardog-admin server restore \
            "s3:///mybucket/backup/prefix?region=${region}&AWS_ACCESS_KEY_ID=${accessKey}&AWS_SECRET_ACCESS_KEY=${secret}" \
            -i ${node_id} \
            -u ${STARDOG_USERNAME} \
            -p "${STARDOG_PASSWORD}"
          echo "[INFO] Restore complete!"
          tail -f /dev/null
      volumeMounts:
        - mountPath: /var/opt/stardog
          name: stardog-home
        - mountPath: /var/opt/stardog/stardog-license-key.bin
          name: stardog-license
          subPath: stardog-license-key.bin
  volumes:
    - name: stardog-home
      persistentVolumeClaim:
        claimName: ${PVC_NAME}
    - name: stardog-license
      secret:
        secretName: stardog-license
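If you are using OAuth 2.0 as described above, the stardog-admin invocation inside args could be adjusted along these lines. The ${STARDOG_TOKEN} placeholder is an assumption; supply your token however you manage it:

/opt/stardog/bin/stardog-admin server restore \
  "s3:///mybucket/backup/prefix?region=${region}&AWS_ACCESS_KEY_ID=${accessKey}&AWS_SECRET_ACCESS_KEY=${secret}" \
  -i ${node_id} \
  --token ${STARDOG_TOKEN}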
Once you have substituted your values for the ${...} placeholders (for example, with envsubst), you can apply the manifest with kubectl apply -f stardog-restore-runner.yaml.
4. Wait for the restore pod to complete.
You can monitor its progress by searching the pod logs for the message "Restore complete!" like so:

LOG_OUTPUT=$(kubectl logs stardog-restore-runner)
if echo "$LOG_OUTPUT" | grep -q "Restore complete!"; then
  break
fi
This block can be integrated into a for loop that periodically checks for restore completion.
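A minimal sketch of such a loop, assuming a 10-second check interval and a 60-attempt limit (both arbitrary):

for i in $(seq 1 60); do
  LOG_OUTPUT=$(kubectl logs stardog-restore-runner -n <namespace>)
  if echo "$LOG_OUTPUT" | grep -q "Restore complete!"; then
    echo "Restore finished."
    break
  fi
  sleep 10
done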
Once completed, you can delete stardog-restore-runner.
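For example:

kubectl delete pod stardog-restore-runner -n <namespace>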
5. Scale your Stardog cluster back up.
kubectl scale statefulset <zookeeper> --replicas=3 -n <namespace>
kubectl scale statefulset <stardog> --replicas=1 -n <namespace>
Once your first Stardog pod is up and running, repeat this process, adding one Stardog pod at a time until you're at your normal number of pods.
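For example, to add a second pod once the first one reports ready (the pod name and timeout below are assumptions based on the default StatefulSet naming):

kubectl wait --for=condition=ready pod <stardog>-0 -n <namespace> --timeout=10m
kubectl scale statefulset <stardog> --replicas=2 -n <namespace>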
If you have a large data set, it may take a long time for the other nodes to replicate data from the first. Instead, you can deploy one stardog-restore-runner pod per node, each pointing at that node's PVC and node id, and then scale the cluster back up once all the restores are complete.