Technical Alert #5 Fix: Agency Repair Instructions for v3.4.6 Clusters
As announced in our Technical Alert #5, a serious issue affects ArangoDB clusters running the now-revoked version 3.4.6.
If you upgraded your cluster to v3.4.6 already, please take immediate action to prevent data loss.
Below you will find instructions for repairing the Agency state, which otherwise causes the Agency to unintentionally drop collections created after the upgrade. The instructions cover the following deployment scenarios:
- Manually created clusters (bare-metal)
- ArangoDB Starter Deployments
- Kubernetes Clusters (kube-arangodb)
An upgrade to ArangoDB v3.4.6-1 is required to solve the problem permanently.
If you did not upgrade to v3.4.6 or if you use a different deployment mode (Single Server, Active Failover, Master/Slave, DC2DC) then no action is necessary.
Repair Manually Created Clusters
The following procedure describes how to correct the Agency state in bare-metal clusters, that is, clusters set up without the ArangoDB Starter and without the ArangoDB Kubernetes Operator kube-arangodb.
The repair script has to be executed before the upgrade to 3.4.6-1 or higher is done. Otherwise all collections which have been created with 3.4.6 could be lost.
1. Get the repair script:
Download the following file and unzip it to get the repair.js script:
tech05-repair-script.zip
2. Run the repair script with ArangoShell:
arangosh --server.endpoint AGENT_ENDPOINT --server.ask-jwt-secret --javascript.execute ./repair.js
where AGENT_ENDPOINT is replaced with the endpoint of an agent, starting with the agency leader. Enter the JWT secret when prompted.
Do this for all agents, one after another. Running it on the leader alone is not sufficient; it must run on the leader and on each of the followers!
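The per-agent repetition can be scripted. The following is only a minimal sketch, not an official tool: the function name repair_all_agents, the DRY_RUN switch and the example endpoints are hypothetical, and it assumes repair.js is in the current directory. It runs the same arangosh command as above for each endpoint in the order given, so the leader's endpoint should be listed first.

```shell
# Sketch: run repair.js against every agent, leader first.
# Set DRY_RUN=1 to only print the commands instead of running them.

repair_all_agents() {
    # "$@" holds the agent endpoints; list the leader's endpoint first.
    for endpoint in "$@"; do
        echo "Repairing agent at $endpoint"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "arangosh --server.endpoint $endpoint --server.ask-jwt-secret --javascript.execute ./repair.js"
        else
            # arangosh prompts for the JWT secret on every run.
            arangosh --server.endpoint "$endpoint" \
                --server.ask-jwt-secret \
                --javascript.execute ./repair.js || return 1
        fi
    done
}

# Example with hypothetical endpoints (leader first):
# repair_all_agents tcp://agent1.example.com:8531 tcp://agent2.example.com:8531 tcp://agent3.example.com:8531
```

Running it once with DRY_RUN=1 is a cheap way to check the order and the endpoints before touching the Agency. The function stops at the first failing agent so that an error is not hidden by later runs.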
3. Upgrade ArangoDB:
Upgrade to ArangoDB v3.4.6-1 or higher. One can safely upgrade either in a rolling fashion or by shutting down everything and firing up the new version. See the documentation for details.
Do not create any new collections between the execution of the repair script and the upgrade, or you may lose these collections!
Repair ArangoDB Starter Deployments
ArangoDB Starter version 0.14.5 has an automatic upgrade procedure to correct the Agency state.
Get the ArangoDB package in version 3.4.6-1, which ships with Starter version 0.14.5, and perform the upgrade. See the documentation for details.
Repair Kubernetes Clusters
The following procedure describes how to correct the Agency state in Kubernetes clusters operated by the ArangoDB Kubernetes Operator kube-arangodb.
This procedure has to be done on a v3.4.6 deployment before the upgrade to 3.4.6-1 or higher is done. Otherwise all collections which have been created with 3.4.6 could be lost.
1. Find JWT secret:
kubectl get secret DEPLOYMENT_NAME-jwt -o json | jq -r .data.token | base64 -d | tee /tmp/secret
where DEPLOYMENT_NAME is replaced by the name of the deployment, for example “arangodb”. This prints the JWT secret and also saves it to /tmp/secret.
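The base64 -d step is needed because Kubernetes stores the values of a secret's data fields base64-encoded. As a small illustration that does not touch the cluster, the round trip below encodes and decodes a made-up value; the string hypothetical-jwt-secret is only a placeholder, not a real token.

```shell
# Sketch: the same decoding the pipeline above performs, on a placeholder value.
encoded=$(printf '%s' 'hypothetical-jwt-secret' | base64)

# This is what "base64 -d" does to the .data.token field.
decoded=$(printf '%s' "$encoded" | base64 -d)

echo "stored in the secret: $encoded"
echo "after decoding:       $decoded"
```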
2. Get the repair script:
Download the following file and unzip it to get the repair.js script:
tech05-repair-script.zip
3. Run the repair script:
kubectl port-forward AGENT_PODNAME 9000:8529
where AGENT_PODNAME is replaced by the name of one of the agent pods, starting with the agency leader. In another shell window, with repair.js in the current directory, run the following:
echo JWT_SECRET_FROM_ABOVE | arangosh --server.endpoint ssl://localhost:9000 --server.ask-jwt-secret --javascript.execute ./repair.js
where JWT_SECRET_FROM_ABOVE is the JWT secret obtained in step 1.
Do this for all agent pods, one after another. Running it on the leader alone is not sufficient; it must run on the leader and on each of the followers!
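The two-window procedure can also be combined into one loop. The following is a rough sketch under several assumptions: the function name repair_agent_pods, the DRY_RUN switch and the fixed five-second wait are hypothetical choices, the JWT secret is assumed to have been saved to a file such as /tmp/secret in step 1, and repair.js is assumed to be in the current directory. The kubectl and arangosh invocations themselves are the ones shown above.

```shell
# Sketch: repair every agent pod in turn through a temporary port-forward.
# Set DRY_RUN=1 to only print the commands instead of running them.

repair_agent_pods() {
    secret_file="$1"
    shift
    # Remaining arguments are agent pod names; list the leader's pod first.
    for pod in "$@"; do
        echo "Repairing agent pod $pod"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "kubectl port-forward $pod 9000:8529"
            echo "arangosh --server.endpoint ssl://localhost:9000 --server.ask-jwt-secret --javascript.execute ./repair.js"
            continue
        fi
        # Open the tunnel in the background instead of a second shell window.
        kubectl port-forward "$pod" 9000:8529 &
        forward_pid=$!
        sleep 5  # crude wait for the tunnel to come up; adjust as needed
        echo "$(cat "$secret_file")" | arangosh \
            --server.endpoint ssl://localhost:9000 \
            --server.ask-jwt-secret \
            --javascript.execute ./repair.js
        status=$?
        # Close the tunnel before the next pod so that port 9000 is free again.
        kill "$forward_pid"
        wait "$forward_pid" 2>/dev/null
        [ "$status" -eq 0 ] || return "$status"
    done
}

# Example with hypothetical pod names (leader first):
# repair_agent_pods /tmp/secret arangodb-agnt-aaaa arangodb-agnt-bbbb arangodb-agnt-cccc
```

Only one port-forward is open at a time, which keeps the local port 9000 unambiguous and mirrors the "one agent pod after another" requirement.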
4. Kill the leader agent once:
kubectl delete pod LEADER_AGENT_PODNAME
where LEADER_AGENT_PODNAME is the name of the pod of the agency leader (found during step 3).
5. Upgrade the server binaries:
Upgrade to ArangoDB v3.4.6-1 or higher. See the documentation for details.
Do not create any new collections between the execution of the repair script and the upgrade, or you may lose these collections!