A node can't obtain join lock

Created by Steve Place, Modified on Wed, 21 Feb 2024 at 05:12 PM by Steve Place

If your node can't join the cluster because it can't obtain the join lock, first check every node in the cluster for open transactions. (Run stardog-admin tx list as a superuser against all of your databases.) If there are any, let them finish. If you'd rather not wait, you can take a database offline to end its transactions. Once the transactions are gone, their locks in ZooKeeper should be gone too, and the node should be able to join.

If that doesn't work, run stardog-admin zk info to see if there are any open transaction locks or admin locks. Locks (specifically transaction locks, in this case) look something like this:

 (owner: 79719807413746344)
 (owner: 78610908345944116)
 (owner: 70450868276720099)
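
When you repeat this check several times, a lock count is easier to compare than the raw output. The following is a minimal sketch, not an official tool: count_zk_locks is a hypothetical helper name, and it assumes every lock appears on its own line in the "(owner: <digits>)" form shown above. If your stardog-admin zk info output is shaped differently, adjust the pattern.

```shell
# count_zk_locks (hypothetical helper): read text on stdin and print the
# number of lines that contain a lock entry shaped like "(owner: <digits>)".
# Assumption: each lock is printed on its own line, as in the sample above.
count_zk_locks() {
  # grep -c prints the count; it exits non-zero when the count is 0,
  # so "|| true" keeps the helper usable under "set -e".
  grep -cE '\(owner: [0-9]+\)' || true
}

# Example usage against a live cluster (not run here):
#   stardog-admin zk info | count_zk_locks
```

A count of 0 means no lines matched the pattern; it is still worth reading the full output once to confirm that the pattern matches what your version prints.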

If there are no open transactions but ZooKeeper is still showing locks, perform the following steps:

  1. Scale the cluster down to one node.
  2. Use stardog-admin cluster status to verify that all database transaction IDs match the transaction IDs in ZooKeeper.
    1. If they do, continue to step 3.
    2. If they don't, stop following these steps and open a support ticket.
  3. Restart that node.
  4. Use stardog-admin zk info to verify that all of the locks are gone.

If you still see the locks after restarting the node, run stardog-admin zk clear --lock all. Please be very careful with this command, as improper use can result in data loss. Do not use it unless you're in this exact situation. As written, this will clear all of the locks but none of your other ZooKeeper information. However, omitting the --lock flag will clear all of your ZooKeeper information (which you should only do as a last resort and after you've contacted Support).

Once the locks are cleared and your one node comes up, start your cluster in read-only mode (stardog-admin cluster readonly-start). After that, bring the remaining nodes up one at a time, verifying with stardog-admin cluster info that each node has joined before moving to the next one. Once all 3 nodes are back in the cluster, stop read-only mode (stardog-admin cluster readonly-stop).
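
If you script the "verify before moving to the next node" step, a small polling helper avoids starting the next node too early. This is only a sketch under stated assumptions: retry_until and node_has_joined are hypothetical names, and the check shown in the comments assumes a joined node's address appears in the stardog-admin cluster info output. Confirm that against your own output before relying on it.

```shell
# retry_until (hypothetical helper): run a check command up to MAX times,
# sleeping DELAY seconds between attempts. Returns 0 as soon as the check
# succeeds, or 1 if every attempt fails.
# Usage: retry_until MAX DELAY command [args...]
retry_until() {
  _ru_max=$1
  _ru_delay=$2
  shift 2
  _ru_try=1
  while [ "$_ru_try" -le "$_ru_max" ]; do
    if "$@"; then
      return 0
    fi
    _ru_try=$((_ru_try + 1))
    # Sleep only if another attempt is still to come.
    if [ "$_ru_try" -le "$_ru_max" ]; then
      sleep "$_ru_delay"
    fi
  done
  return 1
}

# Example usage (not run here). The check is an assumption: it treats a
# node as joined when its address shows up in the cluster info output.
#   node_has_joined() { stardog-admin cluster info | grep -q "$1"; }
#   retry_until 30 10 node_has_joined "node2:5820" || echo "node2 did not join"
```

If the helper gives up, stop and investigate that node rather than starting the next one.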
