Internal inconsistencies in the cluster configuration database can sometimes prevent adding or deleting database servers in a ScaleArc cluster. When trying to add or delete a database server from the cluster on the primary server in an HA pair, ScaleArc gives an error stating that it was unable to update the setting on the secondary server, even though the action was performed on the primary node in the HA pair.
This article describes how to recover a ScaleArc HA environment experiencing this error.
- ScaleArc HA configuration
- Custom install on Red Hat Enterprise Linux (RHEL) 7.5 Server
When these internal inconsistencies exist in the cluster configuration database, checking the HA Settings shows both servers in the HA pair as secondary servers, and database servers can no longer be added to or deleted from the ScaleArc cluster.
Follow these steps to resolve this problem:
- Navigate to SETTINGS > HA Settings and click the 'Restart' button to restart the HA services, then confirm that the HA service restarts successfully and correctly identifies the Primary and Secondary instances in the HA pair.
- If both instances still show as Secondary, connect to the one that was Primary before the issue occurred and click the 'Force to be Primary' button; the instance should then become Primary. Note that when an instance changes HA status this way, you are automatically logged out and redirected to the login page to log in again.
- If the previous steps do not resolve the HA status inconsistency, check whether RHEL package updates have upgraded any of the RHEL packages from the default versions installed by ScaleArc to newer versions from the RHEL repository. ScaleArc has tight dependencies on specific versions of various packages, and any inadvertent updates to those dependencies may cause the ScaleArc appliance to function incorrectly.
- Check for recent package updates by running the yum history info <id> command.
- Roll back any identified package updates suspected to have caused the issue by running the yum history undo <id> command.
- The HA status should return to normal on both nodes in the HA pair. Reboot both instances to ensure the nodes remain in a healthy state.
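The package-audit and rollback steps above can be sketched as a shell session on the RHEL host. This is a minimal illustration, assuming yum has transaction history available; transaction ID 42 is a placeholder for the ID you identify from the listing:

```shell
# List recent yum transactions to spot updates applied after the ScaleArc install
yum history list | head -n 20

# Inspect a specific transaction to see exactly which packages it changed
# (replace 42 with the real transaction ID from the listing above)
yum history info 42

# Roll back the suspect transaction, restoring the previous package versions
yum history undo 42 -y
```

After the undo completes, reboot the node and verify the HA status before repeating on the peer.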
It should now be possible to add and delete database servers and perform manual failover tests to validate the HA configuration.