Resolving Witness server unreachable HA alert

Overview

This article describes the cause and resolution of the ScaleArc health monitor alert that reports the witness server as unreachable: HA Alert: Witness server unreachable.

This alert can be generated in MSSQL HA environments that rely on a ScaleArc Cluster as the fencing option, or where the SSH server configured as the witness server experiences a network outage.

 

Environment

  • MSSQL Server with an Always On cluster
  • HA enabled with 'ScaleArc Cluster' configured as the fencing option
  • HA enabled with 'SSH Server' configured as the fencing option

 

Solution

The 'witness server unreachable' alert is expected to appear only when the cluster is down in an HA setup that relies on a ScaleArc cluster to resolve split-brain situations.

If the alert continues to appear while the cluster is up, ensure that both ScaleArc nodes have full access to the SQL Servers. HA creates a small database whose name is prefixed with SA_ (e.g. SA_024571ec_6b8), and a table in that database, which it continuously updates and queries to track the HA status.

This database should be made part of the Always On availability group in SQL Server. Refer to this external article for detailed instructions on Creating AlwaysOn Availability Groups in SQL Server.
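One way to check the membership is to query the SQL Server catalog views from any host that has the sqlcmd client installed. The following is a minimal sketch, not a ScaleArc tool: the function name ag_for_database, the server name, and the login are hypothetical, and the password is assumed to be supplied through the SQLCMDPASSWORD environment variable.

```shell
# Hypothetical helper: print the availability group (if any) that contains a database.
# Assumes sqlcmd is installed and SQLCMDPASSWORD is exported for the given login.
ag_for_database() {
    local server="$1" user="$2" db="$3"

    # Only accept names that look like the HA database (SA_ prefix, word characters),
    # because the name is embedded in the T-SQL text below.
    case "$db" in
        *[!A-Za-z0-9_]*) echo "unexpected character in database name: $db" >&2; return 2 ;;
        SA_*) ;;
        *) echo "expected a name starting with SA_: $db" >&2; return 2 ;;
    esac

    # -h -1 suppresses column headers, -W trims trailing spaces.
    sqlcmd -S "$server" -U "$user" -h -1 -W -Q "SET NOCOUNT ON;
SELECT ag.name
FROM sys.availability_databases_cluster AS adc
JOIN sys.availability_groups AS ag ON ag.group_id = adc.group_id
WHERE adc.database_name = N'$db';"
}
```

For example, ag_for_database sqlnode1.example.com monitor_login SA_024571ec_6b8 prints the availability group name when the database is a member. Empty output suggests the database has not been added to any availability group on that instance.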

Tip: The Always On listener should be reachable from ScaleArc if the SQL Browser service is running on the database servers. In an Always On cluster, ScaleArc needs connectivity to the listener so that it can track the cluster status at all times. When removing a database server, it is strongly recommended to remove it from the ScaleArc cluster first and only then from the Always On cluster. When adding a server, add it to the Always On cluster first; ScaleArc then automatically reports that a new server was added and prompts you to add it to the cluster.

ScaleArc uses the HA cluster name to name this database when HA fencing is configured to use a ScaleArc Cluster, which is the default and recommended option. You can find the HA cluster name by running the following command in an SSH session on the Primary or Secondary node:

# pcs status | grep name
Cluster name: SA_aeaa3fae_30e
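
If the name is needed in a script, for example to look for the matching SA_* database, the value can be extracted from the same output. A small sketch follows; the function name extract_cluster_name is hypothetical, and it assumes the "Cluster name:" line format shown above.

```shell
# Hypothetical helper: read `pcs status`-style text on stdin and print only the cluster name.
extract_cluster_name() {
    # Drop the "Cluster name:" label, trim trailing whitespace, keep the first match.
    sed -n 's/^[[:space:]]*Cluster name:[[:space:]]*//p' \
        | sed 's/[[:space:]]*$//' \
        | head -n 1
}
```

On a ScaleArc node this would be used as: pcs status | extract_cluster_name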

Alternatively, to achieve HA independent of the cluster status, you can configure one of the other two supported fencing options, i.e. an External database or SSH fencing, as documented in Set Up High Availability.

[Image: Fencing_options2.png]

Testing

The 'witness server unreachable' alert should go away after you add the SA_* database to the Always On availability group in MSSQL, or after you configure either of the other two supported fencing options.

Note: To avoid these alerts and ensure proper, quick HA failover when SSH Server is selected for fencing, review the network communication between the ScaleArc instances and the SSH server used as the witness server for both stability and availability.
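
As a rough first check of that network path, the witness server's SSH port can be probed repeatedly from each ScaleArc instance. This is only a sketch under assumptions: the function name probe_witness and the host name in the example are hypothetical, the witness is assumed to listen on TCP port 22 unless another port is given, and the installed nc (netcat) is assumed to support the -z and -w options.

```shell
# Hypothetical helper: probe a host's SSH port several times and report failed attempts.
# Arguments: host [attempts=5] [port=22] [seconds between probes=1]
probe_witness() {
    local host="$1" attempts="${2:-5}" port="${3:-22}" interval="${4:-1}"
    local failures=0 i=0

    while [ "$i" -lt "$attempts" ]; do
        # -z: connect without sending data; -w 3: give up after 3 seconds.
        if ! nc -z -w 3 "$host" "$port" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
        sleep "$interval"
    done

    echo "$failures/$attempts probes failed"
    # Succeed only when every probe connected.
    [ "$failures" -eq 0 ]
}
```

For example, probe_witness witness.example.com 10 runs ten probes one second apart. Any failed probe is a hint to look at the network between that ScaleArc instance and the witness server; a clean run does not by itself prove the link is stable over longer periods.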
