Open AlwaysOn Management from the context menu of any Site or Group level node as applicable. The AlwaysOn tab is available within SentryOne for any monitored SQL Server instance hosting an availability group replica.
The AlwaysOn Management page is divided into the following three areas:
|AlwaysOn Management Page area||Description||Image|
|Overview area||Offers several unique actionable views of your AlwaysOn environment, including easily digestible at-a-glance status information.|
|History area||Displays both historical charts and AlwaysOn health events from your environment.|
|Details Grid view area||Displays both high level and detailed level metrics concerning your environment.|
Customize visual aspects of the AlwaysOn views around your unique environment, including the default thresholds for Log and Recovery Queues. For more information, see the Customization section.
AlwaysOn Management includes full alerting capability surrounding your AlwaysOn environment. Fully customizable Conditions alert you on both Health and Failover status. For more information, see the AlwaysOn Alerting topic.
When you monitor the SQL Server instance hosting the Primary Replica of an availability group with SQL Sentry, the entire topology of that availability group displays as part of the Overview area. Unwatched instances hosting replicas display with a gray background.
Note: To enable complete AlwaysOn monitoring and alerting, watch each replica in the availability group with SQL Sentry. Status information is unavailable for unwatched replicas, including historical metrics, charts, and data pertaining to any associated Send or Recovery Queues.
To monitor a server, right-click the desired node, and then select Watch from the context menu.
The Overview area contains several layouts for managing your AlwaysOn environment from several perspectives, including the Windows Server Failover Cluster level, the Availability Group level, the SQL Server instance level, and the Replica level.
Change the Layout Style by selecting options from the drop-down list located in the top corner of the Overview area.
Note: Status information that's common to several views is found in the Status Information section.
Each of the Overview diagrams contains a unique representation of individual availability replicas/WSFCs that are part of your managed AlwaysOn environment. Regardless of the selected view, these nodes and the pipes connecting them all share a common set of visual status information.
Context Menu Options
|Context Menu Option||Description||Image|
|Watch||Watch the SQL Server instance hosting the replica with SentryOne Performance Analysis.|
Nodes convey status and configuration information about the replica, including Queue Health, Connection mode, and overall Health Status.
|Instance Name||On applicable views, the node contains a label with the SQL Server instance name.|
|Availability Replica Role||Each node is labeled indicating the role of the availability replica, as AG Primary, AG Secondary, or FCI Secondary.|
|Health Status||The health status or availability group state is represented by the left bar of the node. Green represents a healthy availability group state. Pink represents an unhealthy availability group state. The Health Status bar is also be pink if connectivity is lost between the SentryOne monitoring service and the SQL Server instance.|
|Queue Indicator||The Log Send Queue / Recovery Queue status is visually represented by the bar on the right. As the respective queue grows, the color changes from a dark green, to light green, to orange. Once the Queue KB Threshold passes, the color changes to red.|
|Failover Mode||The Failover mode of the availability replica is indicated in the left status box.|
|Automatic Failover mode||Represented with a play icon.|
|Manual mode||Doesn't contain an icon.|
|Client Connection mode||The client connection mode of the availability replica is represented in the center status box. If the availability replica is a readable secondary it's represented with a small glyph.|
|Failover Cluster Instances||If the node is a Failover Cluster Instance (FCI), it's represented in the right status box.|
The availability mode of the availability replica is represented as follows:
|Asynchronous-commit mode||Represented by two black lines drawn on the pipe between availability replicas.|
|Synchronous-commit mode||Represented by the absence of these two black lines drawn on the pipe.|
Additional Information: For information about troubleshooting data movement latency between synchronous-commit AlwaysOn availability groups, see the following MSDN article.
Pipe and Color Specifics
The visual characteristics of the pipes connecting the nodes convey status information about the health relationship between nodes. Pipe saturation is an indicator of the amount of information moving between replicas. Pipe color is dependent on the recovery queue of the secondary.
The pipe becomes saturated as the amount of data being transferred increases. More notably, the pipe becomes saturated as the KB To Replica/sec value approaches the KB To Replica/sec threshold.
When opening the AlwaysOn tab, the KB To Replica/sec threshold value is dynamic and is the highest global value seen between all availability groups and replicas for the current session by default.
Customize the KB To Replica/sec threshold with the AlwaysOn.AvailabilityReplicaTransferThreshold table in the SentryOne database. For more information, see the Customization section.
The pipe color changes as the recovery queue grows; the color changes from a dark green, to light green, to orange, and finally once the Recovery Queue KB Threshold is passed the color changes to red.
By default, the Recovery Queue KB Threshold value is one MB or 1024 KB. Customize the Recovery Queue KB Threshold with the AlwaysOn.AvailabilityReplica table in the SentryOne database. For more information, see the Customization section.
The History area contains tabs for Charting, Error Log, and Replica State Changes. Information in each tab displays based on the selected time range. In Live mode, each tab includes the last 10 minutes of collected data. Updating the time range on the tool bar changes this displayed range. Each of the History tabs is context aware and displays information based on the node or row selected in the Overview or Grid view area. Selecting on the diagram background displays log information for all nodes and replicas.
AlwaysOn Health Collection
The Error Log and Replica State Changes tabs are initially disabled because AlwaysOn health collection is disabled by default. This prevents innocuous entries associated with Extended Events from being repeatedly logged to the SQL Server Error Log without your knowledge or approval. For more information, see the following update for a link to CU 6 for SQL Server 2012 SP1, that contains a fix for this issue.
Note: To enable health collection, select Enable on the Error Log or Replica State Changes tab.
Important: Enabling automated AlwaysOn health (error and state change info) collection causes innocuous entries associated with Extended Events to be repeatedly logged to the SQL Server Error Log, as described in Microsoft Connect item 783036. We are working closely with Microsoft to correct this unfortunate behavior. Until it's corrected, the entry Using 'dbghelp.dll' version '4.0.5', is sent to the Error Log on all SQL Servers with Availability Groups monitored by SentryOne, at the rate specified by the collection interval.
Additional Information: Cumulative Update package 6 for SQL Server 2012 SP1 addresses this issue. Once the update is applied, innocuous entries associated with Extended Events are no longer logged to the SQL Server Error Log.
Adjust the enabled status and interval from SentryOne Monitoring Service Settings (Navigator > Configuration > Global Settings >SentryOne Monitoring Service Settings > Advanced tab).
Various charts related to your managed AlwaysOn environment are available in the expandable History area of the AlwaysOn Management tab. The Charting area is context aware and displays charts based on the selected node or row.
For example, from the WSFC Node Group/Matrix layout, select a Primary Availability Replica, to display charts for the KB to Replicas/sec and Log Send Queue KB metrics.
Selecting a Secondary Availability Replica displays charts with the KB from Replicas/sec and Recovery Queue KB metrics.
Selecting a WSFC in the WSFC Members layout displays relevant metrics for each respective Availability Group hosted on the WSFC, including KB To Replicas/sec and KB from Replicas/sec metrics.
The Error Log allows you to view any AlwaysOn Health events that are classified as errors in your managed environment. This log is equivalent to viewing AlwaysOn Error Health Events accessed from the native AlwaysOn Dashboard, but offers a number of advantages.
The Error Log is context aware and displays only errors related to the selected node or row. In Live mode, this includes the last 10 minutes of collected data. Updating the time range on the toolbar changes this displayed range, allowing you to view a historical subset of errors for any server.
Replica State Changes
The Replica State Changes tab displays AlwaysOn Health events that are classified as Replica State Changes. This log is equivalent to viewing AlwaysOn availability_replica_state_change health events accessed from the native AlwaysOn Dashboard, but offers several advantages.
The Replica State Changes tab is context aware and displays only state changes related to the selected node or row. In Live mode, this includes the last 10 minutes of collected data. Updating the time range on the toolbar changes this displayed range, allowing you to view a historical subset of state changes for any server.
SentryOne AlwaysOn Management includes the ability to configure customizable alerts specific to the status and health of your monitored Availability Groups.
The following Failsafe Conditions are available:
- SQL Availability Group Failover
- SQL Availability Replica Healthy
- SQL Availability Replica Unhealthy
Each of these conditions exposes several Condition Filters that help you customize your desired alerts.
The following Advisory Conditions are available:
- Availability Replicas Hosted on Same Virtual Host
- Availability Replicas With Disks in Same Datastore
- High Database Replica Send or Recovery Queue
- High Redo Completion Time
See the Advisory Conditions Pack article for details on each.
- Select the desired node for the level you'd like to configure the action for in the Navigator pane (View > Navigator), and then select Failsafe Conditions from the drop-down list in the Conditions pane (View > Conditions).
- Select Add to open the Select Action window, and then expand the applicable object and condition.
- Use the check box(es) to select which actions should be taken in response to this condition being met, and then select OK to save your settings.
For more information about actions that can be taken when a condition is met, see the Actions topic.
Customizing Pipe Saturation
When opening the AlwaysOn tab, the KB To Replica/sec threshold value is dynamic and is the highest global value seen between all availability groups and replicas for the current session by default. Customize the KB To Replica/sec Threshold with AlwaysOn.AvailabilityReplicaTransferThreshold table in the SentryOne database.
Note: The AlwaysOn.AvailabilityReplicaTransferThreshold table acts as a mapping table allowing you to define a specific value/threshold for the maximum allowed bytes (KB to Replica/sec Threshold). By default, the table doesn't have any records, and in this state the threshold is dynamic.
Set a new KB To Replica/sec threshold value specific for all replicas involved in a WSFC node – WSFC node relationship, by completing the following:
- Add a record with the PrimaryNodeName, SecondaryNodeName, and MaxBytesSentToSecondaryPerSec values defined. The WSFC node names are found in the AlwaysOn.ClusterNode and AlwaysOn.AvailabilityReplica tables.
INSERT INTO [SentryOne].[AlwaysOn].[AvailabilityReplicaTransferThreshold] (PrimaryNodeName, SecondaryNodeName, MaxBytesSentToSecondaryPerSec) VALUES ('MyPrimaryNode','MySecondaryNode', 1024);
Success: The KB To Replica/sec threshold value is now 1 KB for the defined replicas.
Set a new global KB To Replica/sec Threshold value by completing the following steps:
- Add a record with just a MaxBytesSentToSecondaryPerSec value. For example, to set a global value of 5 KB.
- Set a global value of 5 KB.
INSERT INTO [SentryOne].[AlwaysOn].[AvailabilityReplicaTransferThreshold] (MaxBytesSentToSecondaryPerSec) VALUES (5120);
Success: The new global KB To Replica/sec threshold value is now 5 KB.
Set a new KB To Replica/sec Threshold value specific to a Primary Replica - Secondary Replica relationship, by completing the following steps:
- Add a record with the PrimaryNodeName, SecondaryNodeName, and MaxBytesSentToSecondaryPerSec values defined.
- Set a value specific to a Relationship.
INSERT INTO [SentryOne].[AlwaysOn].[AvailabilityReplicaTransferThreshold] (PrimaryNodeName, SecondaryNodeName, MaxBytesSentToSecondaryPerSec) VALUES ('MyPrimary','MySecondary', 5120);
Success: The KB To Replica/sec threshold value for the specific MyPrimary and MySecondary is now 5 KB.The default global value is used for all other replicas.
Set a new KB To Replica/sec Threshold value for a specific Primary Replica, to be used with all of its Secondary Replicas, by completing the following steps:
- Add a record with just the PrimaryNodeName and MaxBytesSentToSecondaryPerSec values defined.
- Set a value specific to a Primary Replica to be used with all of its Secondaries.
INSERT INTO [SentryOne].[AlwaysOn].[AvailabilityReplicaTransferThreshold] (PrimaryNodeName, SecondaryNodeName, MaxBytesSentToSecondaryPerSec) VALUES ('MyPrimary', Null, 5120);
Success: The KB To Replica/sec Threshold value for all Secondary Replicas in a relationship with the MyPrimary replica is now 5 KB.
Note: Add multiple records to the AlwaysOn.AvailabilityReplicaTransferThreshold table, allowing you to easily define KB To Replica/sec thresholds specific to your environment.
Customizing Pipe and Queue Indicator Colors
By default, there's a global Recovery Queue and Log Send Queue KB Threshold value of 1 MB or 1024 KB. These values can be customized with the AlwaysOn.AvailabilityReplica table in the SentryOne database using the LogSendQueueKBMax and RecoveryQueueKBMax columns.
Each Replica in your monitored environment has an entry in the AlwaysOn.AvailabilityReplica table. By default, the LogSendQueueKBMax and RecoveryQueueKBMax columns are null. When this is the case, the default global value (1 MB) is used for each respective queue. Set these columns to your desired value for each of your replicas.
Update the Log Send Queue KB Threshold value for the replica named MyPrimary to 10 MBs.
Set a new Log Send Queue KB Threshold value for my replica MyPrimary
UPDATE [SentryOne].[AlwaysOn].[AvailabilityReplica] SET LogSendQueueKBMax = 10240 WHERE ReplicaServerName = 'MyPrimary';
Distributed Availability Groups
Feature Availability: Support for Distributed Availability Groups is currently in a preview status. It is available in version 2021.1 to those who are part of the feature preview program for this release.
Note: See the information above in this article for details on icons and the data flow pipeline as the general features for Availability Groups within the AlwaysOn tab remain the same for Distributed Availability Groups.
DAG-01 Example View
The following Distributed Availability Group setup is used in the SQL Sentry AlwaysOn tab examples below:
DAG-01 has a primary availability group (AG-02) and a secondary availability group (AG-01). Within those availability groups, there's a primary to secondary replica relationship (e.g. Node 008 primary flowing to Node 007 secondary).
Looking at AG-01, AG-02 is listed under the Distributed AG column:
In this view, AG-02 is labeled with the name of the Distributed AG (DAG-01). The data pipeline animates the data flow between the Distributed AG (Node 008) and the primary replica of AG-01 (Node 005).
Note: Rows represent availability groups (e.g. AG-01) and columns represent replicas within that availability group.
When selecting AG-02, the relationship between the two availability groups displays in the opposite direction. The data flow pipeline goes from WSFC Node 008 to the distributed node (005). The links continue to work between the distributed nodes and their associated replicas as described for Availability Groups in this article.
Note: Hover over the pipeline for more information about the connection (e.g. data flow metrics and direction).