Advisory Conditions Pack

Advisory Conditions allow you to create advanced performance alerting, enhanced change detection, and advisory rule sets. When building Advisory Conditions, use any combination of Ands, Ors, and nesting levels, to compare the values retrieved from these multiple sources. 

SentryOne provides a base set of conditions that you can download from inside the client. This set of conditions give you a starting point for monitoring and can serve as a reference when creating your own conditions. 

Availability Replicas Hosted on Same Virtual Host

This condition evaluates to true when multiple Availability Replicas from the same Availability Group are hosted on the same VMware/Hyper-V host.

In the event of an issue with the host server, there could be an outage that nullifies the Always On Availability Group configuration within the virtual machines hosted on the VMware Infrastructure.

For more information, see the Always On Availability Group and VMS article. 

--// AG Replicas on the same VM host
--// Applies to : VMware & Hyper-V
--// ConditionKey column breakdown as follows;
--// Host : {VMwareHost Name} | WSFC : {Windows Cluster Name} | AG : {AG Name} - {AG Replicas on Host}
SELECT
  'Host : ' + QUOTENAME(HS1.Name) + ' | WSFC : ' + CR.ClusterName + ' | AG : ' +  AG1.Name + ' - ' +
   STUFF((SELECT N', ' + QUOTENAME(NodeName) + '(' + CASE Role WHEN 1 THEN 'P' ELSE 'S' END + ')'
    FROM AlwaysOn.AvailabilityReplica AS AR2
    INNER JOIN VM.VirtualMachine AS VM2
       ON AR2.NodeName = VM2.Name
    WHERE AR2.GroupID = AR1.GroupID
      AND VM2.HostSystemID = VM1.HostSystemID
    ORDER BY Role, NodeName
      FOR XML PATH(N''), TYPE).value(N'.[1]', N'nvarchar(max)'), 1, 2, N'') as ConditionKey
 ,COUNT(VM1.Name) AS ReplicaCount
FROM VM.HostSystem HS1
INNER JOIN VM.VirtualMachine AS VM1
 ON HS1.ID = VM1.HostSystemID
INNER JOIN AlwaysOn.AvailabilityReplica AR1
 ON AR1.NodeName = VM1.Name
INNER JOIN AlwaysOn.AvailabilityGroup AS AG1
 ON AG1.GroupID = AR1.GroupID
INNER JOIN AlwaysOn.ClusterReference CR
 ON AR1.EventSourceConnectionID = CR.EventSourceConnectionID
GROUP BY
  CR.ClusterName
 ,AR1.GroupID
 ,AG1.Name
 ,VM1.HostSystemID
 ,HS1.Name
HAVING COUNT(VM1.Name) > 1
ORDER BY CR.ClusterName, AG1.Name, HS1.Name
;

SentryOne Availability Replicas Hosted on Same Virtual Host

Availability Replicas With Disks in Same Datastore 

This condition evaluates to true when multiple Availability Replicas from the same Availability Group have their VMDKs in the same VMware Datastore. 

If there's an issue with the datastore for the VMware server, this could result in an outage that nullifies the Always On Availability Group configuration within the virtual machines hosted on the VMware Infrastructure. 

For more information, see the Always On Availability Groups and VMS article. 

--// AG nodes with storage in the same data store.
--// Applies to : VMware
--// ConditionKey column breakdown as follows;
--// Data Store : {Data Store Name} | WSFC : {Windows Cluster Name} | AG : {AG Name} | {Replica:Disk}
select 'Data Store : ' + quotename(ds.name) + N' | WSFC : ' + cr.ClusterName 
  + ' | AG : ' + ag.name + ' | ' +
 stuff((select ', ' + ar2.nodename + '('
     + case [Role] WHEN 1 THEN 'P' ELSE 'S' END 
     +'):{'+ right(vd.[filename],(charindex('/',reverse(vd.[FileName]))-1)) +'}'
   from vm.virtualMachineVirtualDisk as vd
   join vm.datastore as ds2 
    on replace(replace(left(vd.filename,charindex(']', vd.filename)),'[',''),']','') = ds2.name
   join vm.virtualMachine as vm2 on vd.virtualMachineId = vm2.id
   join AlwaysOn.AvailabilityReplica as ar2 on vm2.name = ar2.NodeName
   where ar2.GroupId = ag.GroupId
    and ds2.name = ds.name
   order by vm2.name
   for xml path(''), type).value(N'.[1]',N'nvarchar(max)'),1,2,N'') as conditionKey,
   count(vmdk.[filename]) as DiskOverlapCount
from vm.datastore as ds
join vm.virtualMachineVirtualDisk as vmdk
 on replace(replace(left(vmdk.[filename],charindex(']', vmdk.[filename])),'[',''),']','') = ds.name
join vm.virtualmachine as vm on vmdk.virtualMachineId = vm.id
join AlwaysOn.AvailabilityReplica as ar on ar.nodename = vm.name
join AlwaysOn.AvailabilityGroup as ag on ar.GroupId = ag.GroupId
join AlwaysOn.ClusterReference as cr on ar.EventSourceConnectionID = cr.EventSourceConnectionID
group by cr.ClusterName, ag.Name, ds.Name, ag.GroupId
order by cr.ClusterName, ag.Name, ds.Name
;

SentryOne Availability Replicas With Disks in Same Datastore

Check Constraints Not Trusted

This condition evaluates to true when it finds sys.check_constraints.is_not_trusted values on objects that it expects to be trusted.

When check constraints are not trusted, SQL Server may be unable to use them for query plans and optimization, affecting the performance of your queries. Check constraints that were disabled, perhaps for bulk loading of data, and then enabled, may not be trusted. The table must be altered for those check constraints to be trusted again.

DECLARE @sql nvarchar(max);
SET @sql = N'';
SELECT @sql = @sql + N'UNION ALL 
  SELECT DBName = N''' + name + ''' COLLATE Latin1_General_BIN, 
  CCsNotTrusted =  
  (
    SELECT COUNT(*) AS CCsNotTrusted
      FROM ' + QUOTENAME(name) + '.sys.check_constraints AS c'
      + N'
      WHERE c.is_not_trusted = 1 
        AND c.is_not_for_replication = 0 
        AND c.is_disabled = 0
  )
  ' FROM sys.databases 
WHERE database_id > 4 AND state = 0;
SET @sql = N'SELECT DBName, CCsNotTrusted FROM 
(' + STUFF(@sql, 1, 10, N'') 
   + N') AS x WHERE CCsNotTrusted > 0;';
EXEC sys.sp_executesql @sql;

SentryOne Check Constraints Not Trusted

Check Data File Size

Verifies that a data file is at least 5120 MB. This condition is used to build other conditions.

SentryOne Check Data File Size

Check Error Log Size

Checks the size of the error log. This condition is used in other custom conditions that query the error log to prevent performance issues associated with larger log files.

 DECLARE @currentlogid int
DECLARE @createdate datetime
DECLARE @currfilesize int
 
CREATE TABLE #err_log_tmp(ArchiveNo int, CreateDate nvarchar(128), Size int)
 
INSERT #err_log_tmp exec master.dbo.sp_enumerrorlogs
 
SELECT TOP 1
      @currentlogid = er.ArchiveNo,
      @createdate = CONVERT(datetime, er.CreateDate, 101),
      @currfilesize = er.Size
FROM #err_log_tmp er
ORDER BY [ArchiveNo] ASC
 
DROP TABLE #err_log_tmp
 
SELECT @currfilesize;

SentryOne Check Error Log Size

Check Transaction Log File Size

Verifies that the transaction log file is at least 1024 MB. This condition is used to build other conditions. SentryOne Check Transaction Log File Size

Common Criteria Compliance Enabled

Queries sys.configurations to determine if common criteria compliance is enabled.

Common criteria compliance is an advanced option in the SQL Server Enterprise and Datacenter editions that can have a dramatic impact on server performance. The value field indicates that the setting is enabled; however, the value_in_use field indicates whether it has been activated. Activating common criteria compliance requires a server restart.

For more information, see the Common Criteria Compliance article. 

SELECT value
FROM sys.configurations 
WHERE name = N'common criteria compliance enabled'

SentryOne Common Criteria Compliance Enabled

Cost Threshold of Parallelism Changed

Queries sys.configurations to detect whenever the server-wide Cost Threshold of Parallelism configuration setting changes on a monitored server, by comparing the last value retrieved to the current value.

SELECT value_in_use
FROM sys.configurations 
WHERE name = N'cost threshold for parallelism'

SentryOne Cost Threshold of Parallelism Changed

CPU Schedulers Failed to Create Worker

This condition evaluates to true when there are schedulers that couldn't create a new worker, most likely due to memory constraints. 

 For more information, see the sys.dm_os_schedulers article. 

SELECT COUNT(*) AS 'Failed Workers'
FROM sys.dm_os_schedulers WITH (NOLOCK)
WHERE failed_to_create_worker = 1;

SentryOne CPU Schedulers Failed to Create Worker

CPU Schedulers Hot Added

This condition evaluates to true when there are schedulers that have been added due to a hot add CPU event.

For more information, see the sys.dm_os_schedulers article.

SELECT COUNT(*) AS 'Hot Added'
FROM sys.dm_os_schedulers WITH (NOLOCK)
WHERE [status] = N'HOT_ADDED';

SentryOne CPU Schedulers Hot Added

CPU Schedulers Visible Offline Status

This condition evaluates to true when there are CPU cores that are visible to SQL Server, but are offline.

SQL Server licensing might prevent a server from using all available cores. Not only can this limit performance, but it can cause additional harm by unbalancing NUMA nodes.

For more information, see the Balancing Your Available SQL Server Core Licenses and Performance Problems with SQL Server 2012 articles. 

SELECT COUNT(*) AS 'Offline Schedulers'
FROM sys.dm_os_schedulers WITH (NOLOCK)
WHERE [status] = N'VISIBLE OFFLINE'
and scheduler_id < 255;

SentryOne CPU Schedulers Visible Offline Status

CPU Unused Schedulers

This condition evaluates to true when CPU schedulers are disabled (is_online = zero).

This indicates that a CPU is offline and SQL Server can't use it for processing (e.g. queries, batches, etc.). A disabled CPU could be caused by affinity masking or licensing issues, and can impact performance.

For more information, see the CPU Schedulers Visible Offline Status section, previously.

SELECT COUNT(*) AS 'Unused Schedulers'
FROM sys.dm_os_schedulers WITH (NOLOCK)
WHERE [is_online] = 0
and scheduler_id < 255;

SentryOne CPU Unused Schedulers

Data File Growth

This condition evaluates to true if any data file (> five GB) is larger than it was during the last evaluation of this condition. SentryOne Data File Growth

Data File Shrink

Evaluates to true if any data file is smaller than it was during the last evaluation of this condition. Shrinking data files is rarely recommended.

For more information, see the Why You Should Not Shrink Your Data Files article. 

SentryOne Data File Shrink

Database Files Count Change

This condition evaluates to true when the number of database files changes. It checks the count of database files in master.sys.master_files, regardless of file type (e.g. ROWS, LOG, etc.) A highlight appears on the backups chart of the dashboard to show that backup jobs might need to be created or removed depending on the scenario.

select count(*) from sys.master_files;

SentryOne Database Files Count Change

Foreign Keys Not Trusted

This condition evaluates to true when it finds sys.foreign_keys.is_not_trusted values on objects that it expects to be trusted. 

When foreign keys are not trusted, SQL Server may be unable to use them for query plans and optimization, affecting the performance of your queries. Foreign keys that were disabled, perhaps for bulk loading of data, and then enabled, may not be trusted. The table must be altered for those foreign keys to be trusted again.

DECLARE @sql nvarchar(max);
SET @sql = N'';
SELECT @sql = @sql + N'UNION ALL 
  SELECT DBName = N''' + name + ''' COLLATE Latin1_General_BIN, 
  FKsNotTrusted =  
  (
    SELECT COUNT(*) AS FKsNotTrusted
      FROM ' + QUOTENAME(name) + '.sys.foreign_keys AS f'
   + N'
      WHERE f.is_not_trusted = 1 
        AND f.is_not_for_replication = 0 
        AND f.is_disabled = 0
  )
  ' FROM sys.databases 
WHERE database_id > 4 AND state = 0;

SET @sql = N'SELECT DBName, FKsNotTrusted FROM 
(' + STUFF(@sql, 1, 10, N'') 
   + N') AS x WHERE FKsNotTrusted > 0;';

EXEC sys.sp_executesql @sql;

SentryOne Foreign Keys Not Trusted

High Active User Sessions

This condition is triggered if the count of user sessions with recent activity goes over a specified threshold.

High active sessions by itself doesn't universally correlate with performance or other problems, but on some systems it can precede an overload state, and/or be an indicator of malicious denial of service (DoS) activity. As such, this condition should be enabled selectively and the threshold adjusted appropriately.

select count(*) 
from sys.dm_exec_sessions 
where is_user_process = 1 and
last_request_start_time > DATEADD(minute, -1, GETDATE())

SentryOne High Active User Sessions

High Ad Hoc Query Plans

This condition checks for plan cache bloat that may be alleviated by enabling optimize for ad hoc workloads.
 

If optimize for ad hoc workloads is False, a high percentage of plan cache is dedicated to single use plans, and page life expectancy is low, then this condition evaluates to True.
 
A high percentage of plan cache in this condition is defined as greater than 10 percent when the server has less than or equal to 64 GB of memory and greater than five percent when it has greater than 64 GB of memory.

For more information, see the Plan Cache and Optimizing for Ad Hoc Workloads article. 

SELECT sum(cast((CASE WHEN usecounts = 1 THEN size_in_bytes ELSE 0 END) as decimal(19,3)))/1024/1024 
FROM sys.dm_exec_cached_plans

SentryOne High Ad Hoc Query Plans

High Average Work Queue Count

This condition evaluates to true when sys.dm_os_schedulers.work_queue_count has an average value greater than one.

A high average work queue count can indicate that max worker threads for the server should be increased.

For more information, see the Max Worker Threads and When You Should Change It article. 

SELECT AVG (work_queue_count)
FROM sys.dm_os_schedulers WITH (NOLOCK)
WHERE [status] = N'VISIBLE ONLINE';

SentryOne High Average Work Queue Count

High Avg Wait Time per User Session

Wait time measures how long queries are waiting for CPU, disk, memory, or other resources, across all sessions. High average wait time per second by itself may not be indicative of a problem, because it doesn't consider the activity level of the system. For example, 5,000 ms of average wait time/sec would be poor on a system with only 30 active users, but excellent on a system with 3,000 active users.

For this reason, this condition divides average wait time/sec (total) by the number of active user sessions to calculate the average wait time per session. If this value is over 50 ms, users may be experiencing noticeable delays.

This value shouldn't be considered a universally hard threshold, but more of a guideline, and may need to be adjusted for different systems. 

For more information, see the Wait Statistics article and the Waits category on SQLPerformance.com

Note:  SQL Sentry filters out innocuous waits, so the wait time used here is lower than if collected directly from SQL Server, and more accurately reflects user-impacting waits.

select count(*) 
from sys.dm_exec_sessions 
where is_user_process = 1 and
last_request_start_time > DATEADD(minute, -1, GETDATE())

SentryOne High Avg Wait Time per User Session

High Compiles

Query plan compiles should generally be < 15 percent of batches per second. Higher values indicate plan reuse is low and will often correlate with high cpu since plan compilation can be a cpu-intensive operation. High compiles may correlate with low plan cache hit ratios, and can be an indicator of memory pressure because there may not be enough room to keep all plans in cache.

If you see consistently high compiles, run a Quick Trace and sort the results by Cache Misses, then expand details to view actual compiling statements (SP:CacheMiss events, highlighted) along with the reason (SubClass) and procedure (Object). 

For more information, see the Batch Compilation, Recompilation, and Plan Caching Issues and Caching Mechanisms articles. SentryOne High Compiles

High Compiles + High CPU

Query plan compiles can be a CPU-intensive operation. If total CPU is high for an extended period and compiles are also high, they may be causing or at least contributing to the high CPU. 

For more information on troubleshooting, see the High Compiles conditions section, previous. SentryOne High Compiles + High CPU

High Context Switches

Context switches represent the combined rate at which all processors on the computer are switched from one thread to another. Consistently high values over 7,500 per logical processor can mean that the server is spending too much time switching threads instead of actively running threads. SentryOne High Context Switches

High Context Switches - Warning

Context switches represent the combined rate at which all processors on the computer are switched from one thread to another. Consistently high values over 5,000 per logical processor can mean that the server is spending too much time switching threads instead of actively running threads.  SentryOne High Context Switches - Warning

High CPU

Sustained CPU utilization greater than 90 percent may indicate a CPU bottleneck.    

Use the Performance Advisor for the Windows Processes tab to ascertain the processes causing the high CPU. On dedicated SQL Server machines most CPU should be associated with the SQL Server process(es). If the SQL Server is causing the high CPU, use Top SQL and/or QuickTrace to determine which sessions and queries are consuming the most CPU.  

Note:  If you launch QuickTrace by highlighting a range on the CPU Usage chart, the results are pre-sorted by CPU percent descending, so the greatest consumers are on top.

SentryOne High CPU

High CPU for Non-SQL Server Process

On dedicated SQL Servers, most of the CPU utilization should generally be related to the SQL Server process (sqlservr.exe). This condition detects when CPU utilization is high, and at least 25 percent is related to some process other than SQL Server.  SentryOne High CPU for Non-SQL Server Process

High CPU for Single Core

Sustained CPU utilization greater than 90 percent on a single core may indicate a CPU bottleneck. The CPU Total usage value is checked against 100/Number of cores used by the SQL Server instance.

Use the Performance Advisor for the Windows Processes tab to ascertain the processes causing the high CPU. On dedicated SQL Server machines, most CPU should be associated with the SQL Server process(es). If the SQL Server is causing the high CPU, use Top SQL and/or QuickTrace to determine which sessions and queries are consuming the most CPU.

Note:  If you launch QuickTrace by highlighting a range on the CPU Usage chart, the results are pre-sorted by CPU percent descending, so the greatest consumers are on top.

SELECT  100.0/COUNT(*) 
FROM SYS.DM_OS_SCHEDULERS 
WHERE STATUS = 'VISIBLE ONLINE' AND IS_ONLINE = 1;

SentryOne High CPU for Single Core

High CPU Signal Waits

This condition evaluates to true when total signal waits are above 20 percent. High signal waits are often indicative of CPU pressure, but the threshold may vary by system. 

For more information, see the A DMV a Day and Troubleshooting SQL Server Wait Stats articles. 

SELECT CAST(100.0 * SUM(signal_wait_time_ms)/ SUM (wait_time_ms)AS NUMERIC(20,2)) 
FROM sys.dm_os_wait_stats WITH (NOLOCK);

SentryOne High CPU Signal Waits

High Database Replica Send or Recovery Queue

Large log send or recovery queues can be an indicator of a system or network bottleneck. This condition is triggered if either queue goes over 3 MB. 

Use the Performance Advisor Always On Management interface and dashboard to troubleshoot.

For more information, see the AlwaysOn Architecture Guide article.
SentryOne High Database Replica Send or Recovery Queue

High Disk Waits and Latency

This condition detects if high disk latency is potentially causing high disk wait time in SQL Server. 

To troubleshoot, view the Performance Advisor Disk Activity tab to locate bottlenecks in the disk system or specific database files. On the dashboard, highlight a range with high latency and Jump To > Top SQL  to determine which queries may be contributing to and/or suffering from high disk waits.    

Disk latency is the only disk measurement for which there are generally accepted ranges that represent good and bad performance from a SQL Server perspective. The following ranges can be used as a general guideline to determine whether disk latency is acceptable:    

  • Less than 10 ms—Fast*
  • Between 10 ms - 20 ms—Acceptable
  • Between 20 ms - 50 ms—Slow
  • Greater than 50 ms—Critical

*For transaction log writes, between 0 ms and 2 ms is desirable.   SentryOne High Disk Waits and Latency

High Mirroring Send or Redo Queue

Large log send or redo queues can be an indicator of a system or network bottleneck. This condition is triggered if either queue goes over three MB.    

Use the Performance Advisor Dashboard to troubleshoot.  SentryOne High Mirroring Send or Redo Queue

High Number of Failed Logins

This condition evaluates to true when there are more than 10 failed logins in the last two minutes. The first query ensures that the error log isn't more than two MB (for performance consideration) and the second query checks for the number of failed logins.

A high number of failed login attempts may indicate that an unauthorized user is trying to access the system. SQL Server security properties must have login auditing enabled for this condition to function.

In SQL Server Management Studio, go to Server Properties > Security to check on these settings. The login auditing option must be set to Failed logins only or Both failed and successful logins.

For more information, see the Server Properties article. 

DECLARE @currentlogid int
DECLARE @createdate datetime
DECLARE @currfilesize int
 
CREATE TABLE #err_log_tmp(ArchiveNo int, CreateDate nvarchar(128), Size int)
 
INSERT #err_log_tmp exec master.dbo.sp_enumerrorlogs
 
SELECT TOP 1
@currentlogid = er.ArchiveNo,
@createdate = CONVERT(datetime, er.CreateDate, 101),
@currfilesize = er.Size
FROM #err_log_tmp er
ORDER BY
[ArchiveNo] ASC
 
DROP TABLE #err_log_tmp
 
SELECT @currfilesize;
CREATE TABLE #log
  (
   logdate DATETIME,
   info    VARCHAR (25) ,
   data    VARCHAR (200)
  );
 
  INSERT INTO #log
  EXECUTE sp_readerrorlog 0, 1, 'Login failed';
 
  SELECT count(*) AS occurences
  FROM   #log
  WHERE  logdate > dateadd(minute, -2, getdate());
 
  DROP TABLE #log;

SentryOne High Number of Failed Logins

High Pending Disk IO Count

This condition evaluates to True when sys.dm_os_schedulers.pending_disk_io_count has an average value greater than zero. A high pending disk I/O count may be indicative of an I/O bottleneck.

For more information, see the sys.dm_os_schedulers and Easy Ways to Detect I/O Pressure in SQL Server articles. 

SELECT AVG (pending_disk_io_count)
FROM sys.dm_os_schedulers WITH (NOLOCK)
WHERE [status] = N'VISIBLE ONLINE';

SentryOne High Pending Disk IO Count

High Performance Power Plan not Enabled

If the High Performance power plan isn't enabled, Windows may be throttling CPU performance. This in turn can have a significant negative impact on both Windows and SQL Server performance, so it's highly recommended that a dedicated SQL Server always use the High Performance plan.

This condition uses a WMI query to determine whether the High Performance plan is active, and it triggers if it's not.

For more information, see the Slow Performance on Windows Server when Using the Balanced Power Plan and Ensuring Maximum CPU Performance via HTML articles. 

Note:  It's possible to use a custom power plan that doesn't throttle CPU, although this is not a common practice. If the WMI query returns the error, The program is blocked by group policy, it may mean that your network admin has applied a power plan from the group policy as described in the following post. If so, check with them to ensure that the plan doesn't throttle CPU.

SentryOne High Performance Power Plan not Enabled

High Recompiles

Query plan recompiles should generally be less than 15 percent of initial compiles, and often correlates with high cpu, since plan compilation can be a cpu-intensive operation. They may be caused by statistics updates, schema changes, etc., and may correlate with low plan cache hit ratios. 

If you see consistently high recompiles, run a Quick Trace and sort the results by recompiles, then expand details to view actual recompiling statements (sp:recompile events, highlighted) along with the reason (subclass) and procedure (object).

For more information, see the Batch Compilation, Recompilation, and Plan Caching Issues and Caching Mechanisms articles. SentryOne High Recompiles

High Redo Completion Time

This condition divides the log_send_rate and redo_rate values found in the sys.dm_hadr_database_replica_states DMV to determine the average redo completion time.

Log Send Rate is the rate at which the logs are being sent to secondary databases in an availability group, and Redo Rate is the rate in which the log records are being redone on the secondary databases.

By default, this condition evaluates to true if the average redo completion time is greater than five mins (in secs). In addition, this condition is designed to return rows only when it is executed against the primary replica. 

For more information, see the Availability Group Replica Sync article.

SELECT [AG Name, Replica, & Database] = AG.name
          + N' [' + AR.replica_server_name + N']'
          + N': (' + DB.database_name + N')',
        [Average Redo Completion Time (Sec)]
          = COALESCE( RS.log_send_rate / NULLIF (RS.redo_rate, 0),0)
 FROM sys.dm_hadr_database_replica_states AS RS
 INNER JOIN sys.availability_databases_cluster AS DB
   ON RS.group_id = DB.group_id
   AND RS.group_database_id = DB.group_database_id
 INNER JOIN sys.availability_groups AS AG
   ON AG.group_id = RS.group_id
 INNER JOIN sys.availability_replicas AS AR
   ON RS.group_id = AR.group_id
   AND RS.replica_id = AR.replica_id
      INNER JOIN sys.dm_hadr_availability_group_states AS AGS
   ON AGS.group_id = AG.group_id 
    INNER JOIN sys.dm_hadr_availability_replica_states AS ars
    ON ar.replica_id = ars.replica_id
    WHERE ars.role_desc = 'SECONDARY'
    AND AGS.primary_replica = @@SERVERNAME
 ORDER BY AG.name, AR.replica_server_name, DB.database_name;

SentryOne High Redo Completion Time

High Runnable Tasks Count

This condition evaluates to true when sys.dm_os_schedulers.runnable_tasks_count has an average value greater than 10. A high runnable tasks count is indicative of CPU pressure.

For more information, see the A DMV a Day article. 

SELECT AVG (runnable_tasks_count)
FROM sys.dm_os_schedulers WITH (NOLOCK)
WHERE [status] = N'VISIBLE ONLINE';

SentryOne High Runnable Tasks Count

High VLF Count

SQL Server database transaction logs contain multiple virtual log files or VLFs. The number of VLFs is dictated by the initial size and auto-growth size for the transaction log. Too many VLFs can lead to increased backup/recovery times and possible performance problems. 

See the Performance Advisor  Disk Space tab for VLF counts for all log files, as well as configured auto-growth size and how many VLFs result from each auto-growth.

If VLFs are too high, the count can be reset by shrinking and resizing the log. For more information, see the 8 Steps and Transaction Log VLFs articles

SentryOne High VLF Count

Hyper-V High vCPU Wait Time

The vCPU Wait Time is a SQL Sentry virtual counter. It's calculated by multiplying CPU wait time per dispatch (the average time, in nanoseconds, spent waiting for a virtual processor to be dispatched onto a logical processor) by the number of dispatches per second onto the logical processor.

SentryOne Hyper-V High vCPU Wait Time

Incorrect Compatibility Level

This condition evaluates to true when a database in the SentryOne repository has a compatibility level that doesn't match that of the master database (current compatibility level).

Compatibility mode allows an older database to run on a newer version of SQL Server at the expense of not being able to run newer features. While some databases need to use an older compatibility mode, not all of them do.

If there are databases that must run in compatibility mode, make provisions to exclude them to reduce false positive values.

SELECT 
CAST (pasd.CompatabilityLevel AS NVARCHAR(3)) + ' ' + esc.ServerName  + ': (' + pasd.Name + ')' as [Current Level, Server, Database],
der.MaxCompatabilityLevel
FROM [dbo].[PerformanceAnalysisSqlDatabase] pasd
INNER JOIN [dbo].[EventSourceConnection] esc
 ON esc.ID = pasd.EventSourceConnectionID
INNER JOIN 
 (SELECT EventSourceConnectionID, CompatabilityLevel [MaxCompatabilityLevel]
 FROM [dbo].[PerformanceAnalysisSqlDatabase]  
 WHERE DatabaseID = 1) der
 ON der.EventSourceConnectionID = pasd.EventSourceConnectionID
WHERE pasd.DatabaseID = 2
AND esc.IsPerformanceAnalysisEnabled = 1
AND pasd.CompatabilityLevel = der.MaxCompatabilityLevel;

SentryOne Incorrect Compatibility Level

Large Windows File Cache

The Windows file (or system) cache stores in memory file data that is read from or written to disk. Since SQL Server manages its own memory, this cache typically doesn't grow large on a dedicated SQL Server. However, if some other process is causing it to unexpectedly grow large it can lead to memory pressure for SQL Server. 

In the case of Analysis Services, database files may be loaded into and served from the Windows file cache, even if the associated file data doesn't exist in the SSAS internal caches. For this reason, monitoring the file cache is important to ensure that physical memory is being used effectively, and that memory contention doesn't occur between the SSAS process, the file cache, and other processes on the server, including SQL Server.

SentryOne Large Windows File Cache

Log File Growth

This conditions evaluates to true if any transaction log, of at least 1024 MB in size, is larger than it was during the last evaluation of this condition.

Frequent log file growth can lead to the creation of too many Virtual Log Files (VLFs). You can prevent this by optimizing the initial size and auto-growth size for the transaction log.

For more information, see the High VLF Count custom condition.

SentryOne Log File Growth

Long Running Open Transactions

This condition evaluates to True if any running transactions have been open for at least 90 seconds. The query returns a list containing the query text and value (in seconds) of the longest running open transactions.

For more information, see the Script Open Transactions With Text and Plans article.

SELECT COALESCE(MAX([Transaction Length]),0) [LongestRunningTransaction]
FROM (
SELECT
    [s_tst].[session_id],
    [s_es].[login_name] AS [Login Name],
    DB_NAME (s_tdt.database_id) AS [Database],
    [s_tdt].[database_transaction_begin_time] AS [Begin Time],
 DATEDIFF(ss, [s_tdt].[database_transaction_begin_time], GETDATE()) [Transaction Length],
    [s_tdt].[database_transaction_log_bytes_used] AS [Log Bytes],
    [s_tdt].[database_transaction_log_bytes_reserved] AS [Log Rsvd],
 CASE [s_tdt].database_transaction_type
  WHEN 1 THEN 'Read/write transaction'
  WHEN 2 THEN 'Read-only transaction'
  WHEN 3 THEN 'System transaction'
 END [Transaction Type],
 CASE [s_tdt].database_transaction_state
  WHEN 1 THEN 'The transaction has not been initialized.'
  WHEN 3 THEN 'The transaction has been initialized but has not generated any log records.'
  WHEN 4 THEN 'The transaction has generated log records.'
  WHEN 5 THEN 'The transaction has been prepared.'
  WHEN 10 THEN 'The transaction has been committed.'
  WHEN 11 THEN 'The transaction has been rolled back.'
  WHEN 12 THEN 'The transaction is being committed. In this state the log record is being generated, but it has not been materialized or persisted.'
 END [Transaction State],
 [s_est].text AS [Last T-SQL Text]
FROM sys.dm_tran_database_transactions [s_tdt]
INNER JOIN sys.dm_tran_session_transactions [s_tst]
 ON [s_tst].[transaction_id] = [s_tdt].[transaction_id]
INNER JOIN sys.[dm_exec_sessions] [s_es]
 ON [s_es].[session_id] = [s_tst].[session_id]
INNER JOIN sys.dm_exec_connections [s_ec]
 ON [s_ec].[session_id] = [s_tst].[session_id]
CROSS APPLY sys.dm_exec_sql_text ([s_ec].[most_recent_sql_handle]) AS [s_est]
WHERE [s_tst].is_user_transaction = 1
) RunningTransactions

SentryOne Long Running Open Transactions

Low Available Windows Memory

Most Windows servers require at least 100 MB of available memory to ensure proper function and avoid expensive disk paging operations. This condition alerts if available memory drops beneath this threshold.

SentryOne Low Available Windows Memory

Low Page Life Expectancy

Page Life Expectancy (PLE) is the average lifespan in seconds of a data page in buffer, and is one of the best indicators of memory pressure. The larger the buffer cache size, the higher it should be. The formula* used in this condition takes the size of the buffer pool into account when determining whether the current PLE value is problematic: 

PLE < (DataCacheSizeInGB/4GB * 300)  

Each NUMA node has its own PLE value, so some nodes may be suffering from memory pressure while others are not. This condition checks all NUMA nodes via the Any instance that automatically iterates all nodes and applies the formula in a synchronized fashion across all counters. 

The first check in the condition ensures that the buffer size is at least two GB (default) before processing the formula, to avoid false positives caused by small buffers. 

For more information, see the Finding What Queries in the Plan Cache use a Specific Index and Page Life Expectancy Isn't What you Think articles. 

SentryOne Low Page Life Expectancy

Network Bottleneck - Outbound

The Output Queue Length is the size of the network output packet queue, in packets. A sustained value of less than three may indicate a network bottleneck. This condition detects a queue length of less than three on any network adapter. 

For more information, see the Wait Statistics article and the Waits category on SQLPerformance.com

SentryOne Network Bottleneck - Outbound

NUMA Disabled

This condition evaluates to True when NUMA is disabled. For more information, see the A DMV a Day article. 

SELECT COUNT(DISTINCT parent_node_id)
FROM sys.dm_os_schedulers WITH (NOLOCK)
WHERE parent_node_id <> 32;

SentryOne NUMA Disabled

Optimize for Ad Hoc Workloads Changed

Detects whenever the optimize for ad hoc workloads server config setting changes by comparing the last value retrieved to the current value.

SELECT value_in_use
FROM sys.configurations
WHERE name = N'optimize for ad hoc workloads';                       

SentryOne Optimize for Ad Hoc Workloads Changed

Page Verification Check

This condition evaluates to true when it locates databases in the SentryOne repository with an inadequate level of page protection. From release 2005 databases should be using Checksum. Databases upgraded or scripted from an earlier version may still be using torn protection. The result set format is as follows: 

Key (server name: (Database), value (one if incorrect).

For more information, see the Alter Database Set Options article. 

To change a single database on a server, run the following code after changing the database name:

USE [master]
GO
ALTER DATABASE [AdventureWorks] SET PAGE_VERIFY CHECKSUM  WITH NO_WAIT;
GO

SentryOne Page Verification Check

To generate the commands to change all databases on a server, run the following code:

SELECT N'ALTER DATABASE [' + db.name + N'] SET PAGE_VERIFY CHECKSUM  WITH NO_WAIT;'
FROM sys.databases AS db 
WHERE db.page_verify_option_desc <> N'CHECKSUM';
SELECT
esc.ServerName  + ': (' + pasd.Name + ')',
 CASE
 WHEN pasd.CompatibilityLevel <= 80 AND pasd.PageVerifyOption <> 1 THEN 1
 WHEN pasd.CompatibilityLevel >= 90 AND pasd.PageVerifyOption <> 2 THEN 1
 ELSE 0
 END [IsBad],
pasd.CompatibilityLevel,
pasd.PageVerifyOption
FROM [dbo].[PerformanceAnalysisSqlDatabase] pasd
INNER JOIN [dbo].[EventSourceConnection] esc
 ON esc.ID = pasd.EventSourceConnectionID
WHERE pasd.DatabaseID <> 2
AND esc.IsPerformanceAnalysisEnabled = 1
AND
(CASE
 WHEN pasd.CompatibilityLevel <= 80 AND pasd.PageVerifyOption <> 1 THEN 1
 WHEN pasd.CompatibilityLevel >= 90 AND pasd.PageVerifyOption <> 2 THEN 1
 ELSE 0
END) = 1;

Server MAXDOP Changed

Detects when the server-wide max degree of parallelism (MAXDOP) server config setting changes on a server with more than one processor by comparing the last value retrieved to the current value.

SELECT value_in_use
FROM sys.configurations
WHERE name = N'max degree of parallelism';

SentryOne Server MAXDOP Changed

Service Broker - Activation Error Occurred 

Checks for Service Broker procedures exiting due to errors.

SentryOne Service Broker Activation Error

Service Broker - Login Connection Error

Checks sys.dm_broker_connections for a login_state of 13, which indicates a connection error.

SentryOne Service Broker Login Connection

Service Broker - 'Poison Message Handling' Disabled

Checks sys.service_queues to see if is_poison_message_handling_enabled is false.

SentryOne Service Broker Poison Message

Service Broker - Queues with Issues 

Checks sys.service_queues for entries that have an activation_procedure, but is_activation_enabled, is_receive_enabled, or is_enqueue_enabled is disabled.

SentryOne Service Broker Queues with Issues

Service Broker - Task Limit Reached/Sec

Checks for Task Limit Reached/sec Total counter >= 1.

SentryOne Service Broker Task Limit

Service Broker - Transmission Queue Errors 

Counts the number of messages in sys.transmission_queue where is_conversation_error is true.

SentryOne Service Broker Transmission Queue Errors

Sleeping Sessions With Old Open Transactions

Detects when there are sleeping sessions with open transactions older than 10 minutes by default. Such sessions can cause blocking, and can prevent the transaction log from clearing, leading to excessive log file growth, and space exhaustion. When snapshot isolation is used, they can prevent version cleanup from occurring in tempdb.

The start time, session_id, host, application and database are returned for the oldest five transactions by default. The query itself only returns transactions older than five minutes by default, to avoid bringing back unnecessary results on systems with many short-running transactions.

SELECT TxDesc = '[' + CONVERT(varchar, es.last_request_start_time, 120) + '] (' + CAST(es.session_id AS varchar(6)) + ') ' + host_name + ':' + program_name + ' [' + DB_NAME(dt.database_id) + ']'
    , OpenMinutes = DATEDIFF(minute, es.last_request_start_time, GETDATE())
FROM sys.dm_exec_sessions es
JOIN sys.dm_tran_session_transactions st
  ON es.session_id = st.session_id
JOIN sys.dm_tran_database_transactions dt
  ON dt.transaction_id = st.transaction_id
WHERE dt.database_id <> 32767
  AND status = 'sleeping'
  AND es.last_request_start_time < DATEADD(MINUTE, -5, GETDATE())
ORDER BY es.last_request_start_time

SentryOne Sleeping Sessions With Old Open Transactions

SQL File Auto-growth Disabled

This condition queries the SentryOne database to detect when auto-growth has been disabled for any data or transaction log file.

Although auto-growth should be considered a contingency for unexpected file growth vs. the primary means of managing growth, disabling it can lead to eventual free space exhaustion that halts all DML activity. 

For more information, see the Consideration for the Autogrow and Autoshrink settings article. 

SELECT
  DBFileName = EC.ObjectName + '.' + SF.Name
 ,SF.Growth
FROM EventSourceConnection EC
JOIN PerformanceAnalysisSqlFile SF
  ON EC.ID = SF.EventSourceConnectionID
WHERE EC.IsPerformanceAnalysisEnabled = 1

SentryOne SQL File Auto-growth Disabled

SQL File Auto-growth Exceeds Free Space

Queries the SentryOne database to detect when the next auto-growth exceeds either available free disk space or the max file size. The key column represents the server and file name, and the value column represents the space debt in MB.

The space debt is the minimum of the free space (PerformanceAnalysisDeviceLogicalDisk.FreeSpace) minus the growth size setting (PerformanceAnalysisSqlFile.Growth) or the maximum size (PerformanceAnalysisSqlFile.MaxSize) minus the current size plus growth size.

SELECT 
  DBFileName
 ,SpaceDebtMB = MIN(SpaceDebtMB)
FROM
 (
  SELECT
    DBFileName = EC.ObjectName + '.' + SF.Name
   ,SpaceDebtMB = (LD.FreeSpace / 1024.0 / 1024.0) - (SF.Growth / 128)
  FROM EventSourceConnection EC
  JOIN PerformanceAnalysisSqlFile SF
    ON EC.ID = SF.EventSourceConnectionID
  JOIN PerformanceAnalysisDeviceLogicalDisk LD
    ON LD.DeviceID = EC.DeviceID
   AND LD.Name = LEFT(SF.FileName, LEN(LD.Name))
  WHERE SF.Growth > (LD.FreeSpace / 8192)
   AND EC.IsPerformanceAnalysisEnabled = 1
  UNION
  SELECT
    DBFileName = EC.ObjectName + '.' + SF.Name
   ,SpaceDebtMB = (SF.MaxSize - (SF.Size + SF.Growth)) / 128
  FROM EventSourceConnection EC
  JOIN PerformanceAnalysisSqlFile SF
    ON EC.ID = SF.EventSourceConnectionID
  JOIN PerformanceAnalysisDeviceLogicalDisk LD
    ON LD.DeviceID = EC.DeviceID
   AND LD.Name = LEFT(SF.FileName, LEN(LD.Name))
  WHERE SF.MaxSize = -1
   AND (SF.Growth + SF.Size) > SF.MaxSize
   AND EC.IsPerformanceAnalysisEnabled = 1
  ) DBFiles
GROUP BY DBFileName

SentryOne SQL File Auto-growth Exceeds Free Space

SentryOne Monitoring Service Offline

Returns the SentryOne monitoring service that's been offline the longest in any site with actively watched connections. This condition only works when two or more monitoring services are used.

To avoid triggering this condition when intentionally stopping a service, temporarily move it into a site with no watched connections using Site Configuration.

SELECT TOP 1 dbo.ManagementEngine.ServerName
FROM
    (
        SELECT Device.ID
             , Device.SiteID
        FROM dbo.Device
        WHERE IsPerformanceAnalysisEnabled = 1
    UNION
        SELECT Device.ID
             , Device.SiteID
        FROM dbo.EventSourceConnection
        INNER JOIN dbo.Device
           ON dbo.EventSourceConnection.DeviceID = dbo.Device.ID
        WHERE EventSourceConnection.IsWatched = 1
           OR EventSourceConnection.IsPerformanceAnalysisEnabled = 1
    ) WatchedDevices
INNER JOIN dbo.ManagementEngine
   ON WatchedDevices.SiteID = dbo.ManagementEngine.SiteID
WHERE dbo.ManagementEngine.HeartbeatDateTime < DATEADD(minute, -3, GETUTCDATE())
   OR dbo.ManagementEngine.HeartbeatDateTime IS NULL
ORDER BY ISNULL(dbo.ManagementEngine.HeartbeatDateTime, dbo.ManagementEngine.LastInitializationDateTime)

SentryOne SQL Sentry Monitoring Service Offline

SQL Server Memory Exhaustion

Memory Grants Pending is the number of processes waiting for a query workspace memory grant. This value should be zero, but it can go above zero in cases of severe memory pressure.  

When it does, RESOURCE_SEMAPHORE waits are greater than zero, since this wait is a measure of the time that queries had to wait for memory grants. This type is visible in the Waits chart tooltips for the Memory class and category.

SentryOne SQL Server Memory Exhaustion

SQL Server Process Paged to Disk

Under severe memory pressure, Windows can page the SQL Server process to disk, which can dramatically impact SQL Server performance. This custom condition queries the ring buffer for the latest working set utilization value for the SQL Server process, and it alerts if it drops beneath 75 percent. If no record exists, 100(%) is returned, which is the optimal state.  

For more information, see the Reduce Paging of Buffer Pool Memory, Lock Pages in Memory, and SQL Server Debates articles. 

Note:  If the  Lock pages in memory  user right has been assigned to the SQL Server service user account, it will prevent Windows from paging the SQL Server process.

SELECT MIN(MemUtilization)
FROM
(
	SELECT
		ISNULL(rec.x.value('(MemoryRecord/MemoryUtilization)[1]','tinyint'), 100) As MemUtilization
	FROM (
			SELECT TOP 1 CAST(orb.record AS XML) AS xmlRec
			FROM sys.dm_os_ring_buffers AS orb
			CROSS JOIN sys.dm_os_sys_info AS osi
			WHERE orb.ring_buffer_type = 'RING_BUFFER_RESOURCE_MONITOR'
			  AND DATEADD(second, -((osi.cpu_ticks/(osi.cpu_ticks/osi.ms_ticks) - orb.timestamp) / 1000), GETDATE()) > DATEADD(minute, -120, GETDATE())
			ORDER BY timestamp DESC
		) rb
	CROSS APPLY rb.xmlRec.nodes('Record') rec(x)
 UNION
	SELECT 100
) MU

SentryOne SQL Server Process Paged to Disk

SQL Server Process Physical Memory Low

This condition evaluates to true when sys.dm_os_process_memory.process_physical_memory_low is true. It indicates that the process is responding to low physical memory.

For more information, see the sys.dm_os_process_memory article. 

SELECT process_physical_memory_low
FROM sys.dm_os_process_memory WITH (NOLOCK);

SentryOne SQL Server Process Physical Memory Low

SQL Server Process Virtual Memory Low

This condition evaluates to true when sys.dm_os_process_memory.process_virtual_memory_low is true. It indicates that more virtual memory is needed for the SQL Server process. 

For more information, see the sys.dm_os_process_memory article. 

SELECT process_virtual_memory_low
FROM sys.dm_os_process_memory WITH (NOLOCK);

SentryOne SQL Server Process Virtual Memory Low

SSAS Formula Engine Query Pool Job Queuing

The query pool refers to Formula Engine activity for queries. If you're seeing consistently high queue lengths, but not high CPU utilization, adjust your MaxThreads and/or CoordinatorExecutionMode properties for your SSAS instance.

For more information, see section 6.11 of the SSAS 2008 Performance Guide.

Remember that the Formula Engine is single threaded, so increasing the query pool setting may not improve performance of any one query, but it may improve the performance in handling multiple simultaneous requests.

SentryOne SSAS Formula Engine Query Pool Job Queuing

SSAS High Memory Limit Exceeded

SSAS uses memory limit settings to determine how it allocates and manages its internal memory. Memory\LowMemoryLimit defaults to 65% of the total available physical memory on the machine (75% on AS2005), and Memory\TotalMemoryLimit (also sometimes called the High Memory Limit) defaults to 80 percent. This is the total amount of memory that the SSAS process itself (msmdsrv.exe) can consume.

Once memory usage hits the low limit, the memory cleaner threads start moving data out of memory. If memory hits the total limit, the cleaner goes into crisis mode. It spawns additional threads and gets more aggressive about memory cleanup, and this can dramatically impact performance. 

For more information, see the Memory Limits article. 

SentryOne SSAS High Memory Limit Exceeded

SSAS Low Memory Limit Exceeded

SSAS uses memory limit settings to determine how it allocates and manages its internal memory.  Memory\LowMemoryLimit defaults to 65 percent of the total available physical memory on the machine (75 percent on AS2005), and Memory\TotalMemoryLimit (also sometimes called the High Memory Limit) defaults to 80 percent. This is the total amount of memory that the SSAS process itself (msmdsrv.exe) can consume.

Once memory usage hits the low limit, the memory cleaner threads starts moving data out of memory.  If memory hits the total limit, the cleaner goes into crisis mode. It spawns additional threads and gets more aggressive about memory cleanup, and this can dramatically impact performance. 

For more information, see the Memory Limits article. 

SentryOne SSAS Low Memory Limit Exceeded

SSAS Storage Engine IO Job Queuing

The IOProcess thread pool separates reads from other activities. If the I/O job queue length is consistently above zero, you may be experiencing an I/O bottleneck.

For more information, see the Analysis Services MOLAP Guide for SQL Server 2012 and 2014.SentryOne SSAS Storage Engine IO Job Queuing

SSAS Storage Engine Processing Pool Job Queuing

Depending on your version of SSAS, queuing of jobs in this pool can be related to all Storage Engine activity (SSAS 2005 to 2008R2), or strictly processing activity in SSAS 2012 and above.

For more information on optimizing this activity for your version of SSAS, see the appropriate Microsoft SSAS Performance Guide.

SentryOne SSAS Storage Engine Processing Pool Job Queuing

SSAS Sustained Cache Evictions

If Cache Evictions/sec or Memory : KB shrunk/sec are consistently above zero, you may have memory pressure on the SSAS instance. This is often seen when SSAS memory usage exceeds configured limits.

SentryOne SSAS Sustained Cache Evictions

SSAS Sustained Connection Failures

A sustained value above zero indicates an inability for users to successfully connect to SSAS. This could be related to overburdened resources on the server.

SentryOne SSAS Sustained Connection Failures

Suspect Pages - Active Corrupt Pages

This condition queries the dbo.suspect_pages table located in the MSDB database. If there are any pages with a status of one, two, or three, it evaluates to true, advising that there is potential corruption in databases on the server being queried.

For more information, see the Monitoring for Suspect Pages article. 

SELECT db_name(database_id) as databaseName, count(*) as corruptPages
FROM dbo.suspect_pages
WHERE event_type in (1,2,3)
GROUP BY database_id;

SentryOne Suspect Pages - Active Corrupt Pages

Suspect Pages - High Row Count

This condition counts the number of rows present in the dbo.suspect_pages table in the MSDB database. When the total count of rows in the table is greater than or equal to 900, it evaluates to true. This is because the dbo.suspect_pages table is only allowed a maximum of 1,000 rows before it fills, at which point any new suspect pages that are detected don't have information logged about them. 

For more information, see the Monitoring for Suspect Pages article. 

SELECT count(*) as suspectPageCount
FROM dbo.suspect_pages;

SentryOne Suspect Pages - High Row Count

Suspect Pages - Increase in Fixed Pages

This condition queries dbo.suspect_pages in the MSDB database, checking the number of fixed pages (event types 4, 5, and 7) per database. This condition evaluates to true if the number of fixed pages has increased since the last check.

If this condition is true, it can mean that DBCC CHECKDB has corrected or deallocated pages. If Enterprise Edition is in use and the database is in either an Availability Group or Mirroring session, it could indicate that pages were fixed in Automatic Page Repair, alerting you to a potential corruption issue you might not have otherwise observed. 

For more information, see the Monitoring for Suspect Pages article. 

SELECT db_name(database_id) as databaseName, count(*) as fixedPages
FROM dbo.suspect_pages
WHERE event_type in (4,5,7)
GROUP BY database_id;

SentryOne Suspect Pages - Increase in Fixed Pages

Tempdb Data Files

This condition evaluates to true when there are more data files than logical cores in use by SQL Server, or there are more than four logical cores in use and fewer than four tempdb data files.

If the number of logical processors is less than or equal to eight, use the same number of data files as logical processors. If the number of logical processors is less than eight, use eight data files, and then if contention continues, increase the number of data files by multiples of four (up to the number of logical processors) until the contention is reduced to acceptable levels or make changes to the workload/code. 

For more information, see the Recommendations to Reduce Allocation Contention, Correctly Adding Data Files, Trace Flag Usage, and TEMPDB – Files and Trace Flags and Updates articles. 

DECLARE @TempDBFiles TinyInt
DECLARE @CPUCount TinyIntSELECT @CPUCount = COUNT(*)
FROM sys.dm_os_schedulers
WHERE STATUS = N'VISIBLE ONLINE' AND IS_ONLINE = 1;SELECT @TempDBFiles = COUNT(*)
FROM master.sys.master_files
WHERE database_id = 2
AND file_id <> 2;SELECT
 CASE
  WHEN @TempDBFiles = @CPUCount THEN 0
  /* If there are more data files than CPU Cores then this may cause overhead */
  WHEN @TempDBFiles > @CPUCount THEN 1
  /* If there are 4 or more cores and less than 4 TempDB Data files there could be contention */
  WHEN @CPUCount >= 4 AND @TempDBFiles < 4 THEN 2
  ELSE 0
 END [Contention];

SentryOne Tempdb Data Files

Tempdb Large Version Store 

Checks for a tempdb version store that is greater than 100,000 KB.

SentryOne Tempdb Large Version Store

Tempdb Low Unallocated Page Count

This condition evaluates to true when less than 10 percent when tempdb's page count is unallocated. This may indicate that tempdb is running out of space and will soon experience autogrowth or hit a size limit.

For more information, see the sys.dm_db_file_space_usage article. 

SELECT (CONVERT(DECIMAL(18,0),(SUM(unallocated_extent_page_count)))/(SUM(total_page_count))*100) as percent_used 
FROM tempdb.sys.dm_db_file_space_usage;

SentryOne Tempdb Low Unallocated Page Count

Tempdb Unequal File Size

The benefits of having multiple tempdb files can be lost if one of those files grows larger than the other files. This condition evaluates to true if your tempdb files are not the same size. The query counts the distinct tempdb file sizes from sys.master_files.

For more information, see the SQL Server tempdb Fixes and Adding Data Files articles. 

SELECT count(distinct size) as unique_sizes 
FROM sys.master_files
WHERE database_id = 2
AND type_desc <> 'LOG';

SentryOne Tempdb Unequal File Size

Tempdb/CPU Configuration Warning

Check that you have one tempdb file per CPU core; up to eight files. As with most SQL Server topics, an it depends caveat accompanies tempdb recommendations; however, starting with SQL Server 2016, Microsoft creates a default number of eight tempdb files. It uses the number of cores if there are fewer than eight cores or just one file it is SQL Server Express.

The first query counts the visible online CPUs from sys.dm_os_schedulers, and the second query counts the number of tempdb files from sys.master_files. This condition evaluates to True when the count of CPUs isn't equal to the count of tempdb files.

For more information, see the Install SQL Server from the Command Prompt and tempdb Should Always Have one Data File per Processor Core articles. 

Note:  As a general rule, if the number of logical processors is less than or equal to 8, use the same number of data files as logical processors. If the number of logical processors is greater than 8, use 8 data files and then if contention continues, increase the number of data files by multiples of 4 (up to the number of logical processors) until the contention is reduced to acceptable levels or make changes to the workload/code.

SELECT count(cpu_id) as cpu_count
FROM sys.dm_os_schedulers 
WHERE status = 'VISIBLE ONLINE';
SELECT count(Name) as tempdb_files
FROM sys.master_files 
WHERE database_id = 2
AND type_desc <> 'LOG';

SentryOne Tempdb/CPU Configuration Warning

Tempdb/CPU Configuration Warning > 8 CPUs

If you have many cores, set the number of tempdb files as a quarter to half the number of cores. This condition verifies that you have at least the quarter ratio. This means if you have 32 cores and eight files, it evaluates to false, but if there are more than 32 cores and only the eight default tempdb files, then it evaluates to true.

The first query counts the visible online CPUs from sys.dm_os_schedulers and the second query counts the number of tempdb files from sys.master_files.

Customize this based on contention or performance issues you have seen in your environments. Microsoft recommends adding files in multiples of four, so you would likely want to go from eight to 12, 16, 20, etc. 

For more information, see the Tempdb Fixes and Tempdb Should Always Have one Data File per Processor Core articles. 

SELECT count(cpu_id) as cpu_count
FROM sys.dm_os_schedulers 
WHERE status = 'VISIBLE ONLINE';
SELECT count(Name) as tempdb_files
FROM sys.master_files 
WHERE database_id = 2
AND type_desc <> 'LOG';

SentryOne Tempdb/CPU Configuration Warning

Tintri High Datastore Throttle Latency 

Checks for Tintri Datastore : Throttle Latency (ms) > 10.  SentryOne Tintri High Datastore

Tintri High vDisk Throttle Latency 

Checks for Tintri Virtual Disk : Throttle Latency (ms) Any > 10.

SentryOne Tintri High vDisk

Tintri High VM Throttle Latency 

Checks for Tintri Virtual Machine : Throttle Latency (ms) Any > 10.  SentryOne Tintri High VM Throttle

Trace Flags Number Turned On Changed

This condition evaluates to true when the number of trace flags set to ON (status = one) has changed. The query analyzes the results of DBCC TRACESTATUS WITH NO_INFOMSGS to determine the status of the trace flags.

CREATE TABLE [#TraceFlagsOn] 
      ([TraceFlag] INT, [Status] INT, [Global] INT, [Session] INT);

INSERT INTO #TraceFlagsOn EXEC ('DBCC TRACESTATUS WITH NO_INFOMSGS');

SELECT Count(*) as TraceFlagsStatusOn
FROM #TraceFlagsOn
WHERE #TraceFlagsOn.Status = 1;

DROP TABLE [#TraceFlagsOn];

SentryOne Trace Flags Number Turned On Changed

Trace Flags Total Number Changed

This condition evaluates to True when the number of trace flags enabled on the system has changed. The query analyzes the results of DBCC TRACESTATUS WITH NO_INFOMSGS to determine the status of the trace flags.

CREATE TABLE [#TraceFlags] 
      ([TraceFlag] INT, [Status] INT, [Global] INT, [Session] INT);
 
INSERT INTO #TraceFlags EXEC ('DBCC TRACESTATUS WITH NO_INFOMSGS');
 
SELECT Count(*) as TraceFlagsTotal
FROM #TraceFlags;
 
DROP TABLE [#TraceFlags];

SentryOne Trace Flags Total Number Changed

VMware High Ballooning

This VMware-specific condition evaluates to true when a significant amount of memory has been reclaimed by the host machine from the guest virtual machine(s). 

For more information, see the vSphere Memory Management article. 

SentryOne VMware High Ballooning

VMware High Co-Stop %

This condition evaluates to true when the VMware Co-Stop time value is greater than three4 percent. This is the time that a virtual machine is ready to run, but unable to run due to co-scheduling constraints. May be indicative of too many vCPU resources. 

For more information, see the CPU Counters article. 

SentryOne VMware High Co-Stop %

VMware High Ready Time % per vCPU

This VMware-specific condition evaluates to True when Ready Time % per vCPU is critically high.

When a vCPU is ready to do work but is waiting for the hypervisor to schedule that work on one or more physical CPUs, the vCPU accumulates Ready Time. This is often caused by having a high ratio of vCPUs to physical CPUs on the host machine or having VMs of significantly different sizes (by vCPU count) on the same host. 

For more information, see the CPU Ready Time in VMware article. 

SentryOne VMware High Ready Time Per vCPU

VMware High Ready Time % per vCPU - Warning

This VMware-specific condition evaluates to True when Ready Time % per vCPU is within the warning threshold.

When a vCPU is ready to do work but is waiting for the hypervisor to schedule that work on one or more physical CPUs, the vCPU accumulates Ready Time. This is often caused by having a high ratio of vCPUs to physical CPUs on the host machine or having VMs of significantly different sizes (by vCPU count) on the same host. 

For more information, see the CPU Ready Time in VMware article. 

SentryOne VMware High Ready Time % per vCPU - Warning

Windows Low Memory Notification

This condition queries the SQL Server ring buffer for recent occurrences of system-level RESOURCE_MEMPHYSICAL_LOW alerts. These occur when Windows signals that physical memory is low and can cause the SQL Server to release its memory in response. 

For more information, see the Ring Buffer Resource Monitor and Diagnose Memory Issues in SQL Server blog posts. 

SELECT COUNT(*) AS AlertCount
FROM (
    SELECT CAST(orb.record AS XML) AS xmlRec
    FROM sys.dm_os_ring_buffers AS orb
    CROSS JOIN sys.dm_os_sys_info AS osi
    WHERE orb.ring_buffer_type = 'RING_BUFFER_RESOURCE_MONITOR'
   AND DATEADD(second, -((osi.cpu_ticks/(osi.cpu_ticks/osi.ms_ticks) - orb.timestamp) / 1000), GETDATE()) > DATEADD(minute, -6, GETDATE())
  ) rb
CROSS APPLY rb.xmlRec.nodes('Record') rec(x)
WHERE rec.x.value('(ResourceMonitor/IndicatorsSystem)[1]','tinyint') = 2

SentryOne Windows Low Memory Notification

Windows Volume Exhaustion Date Changed

Download: The Windows Volume Exhaustion Date Changed advisory condition is available for download here.

Additional Information: Introducing Storage Forecasting with SQL Server Machine Learning Services

This Advisory Condition evaluates to True if any forecasted exhaustion dates for any drives are within one year of the current date and have had that date change to be at least 30 days earlier than previously forecast.

This can help identify situations where a new database has been deployed, or a workload changed, possibly leading to accelerated database growth. That might lead to a much earlier exhaustion date, without you being caught by surprise.

Windows Volume Exhaustion Date Changed

Windows Volume Forecasted Exhaustion Within 90 Days

Download: The Windows Volume Forecasted Exhaustion Within 90 Days advisory condition is available for download here.

Additional Information: Introducing Storage Forecasting with SQL Server Machine Learning Services

This Advisory Condition checks the Resource Exhaustion Dates (RED's) for all forecasted volumes and evaluates to True if any occur within the next 90 days.

The value can be extended past 90 days, but note the RED's will only be provided if they are calculated within the forecast range, which is 180 days by default.

This condition returns up to 100 records, which is the maximum configurable for any Advisory Condition.

Windows Volume Forecasted Exhaustion Within 90 Days