Documentation forSQL Sentry

SQL Sentry Advisory Conditions Pack

SQL Sentry provides a base set of conditions that you can download from inside the client. This set of conditions gives you a starting point for monitoring and can serve as a reference when creating your own conditions. See the Advisory Conditions article for information on cloning and customizing any of the conditions below to your specific environment and needs, as well as building new ones. Additional conditions are also available on GitHub.

Available Advisory Conditions

Availability Replicas Hosted on Same Virtual Host

This condition evaluates to true when multiple Availability Replicas from the same Availability Group are hosted on the same VMware/Hyper-V host.

In the event of an issue with the host server, there could be an outage that nullifies the Always On Availability Group configuration within the virtual machines hosted on the VMware Infrastructure.

For more information, see the Always On Availability Group and VMS article. 

--// AG Replicas on the same VM host
--// Applies to : VMware & Hyper-V
--// ConditionKey column breakdown as follows;
--// Host : {VMwareHost Name} | WSFC : {Windows Cluster Name} | AG : {AG Name} - {AG Replicas on Host}
SELECT
  'Host : ' + QUOTENAME(HS1.Name) + ' | WSFC : ' + CR.ClusterName + ' | AG : ' +  AG1.Name + ' - ' +
   STUFF((SELECT N', ' + QUOTENAME(NodeName) + '(' + CASE Role WHEN 1 THEN 'P' ELSE 'S' END + ')'
    FROM AlwaysOn.AvailabilityReplica AS AR2
    INNER JOIN VM.VirtualMachine AS VM2
       ON AR2.NodeName = VM2.Name
    WHERE AR2.GroupID = AR1.GroupID
      AND VM2.HostSystemID = VM1.HostSystemID
    ORDER BY Role, NodeName
      FOR XML PATH(N''), TYPE).value(N'.[1]', N'nvarchar(max)'), 1, 2, N'') as ConditionKey
 ,COUNT(VM1.Name) AS ReplicaCount
FROM VM.HostSystem HS1
INNER JOIN VM.VirtualMachine AS VM1
 ON HS1.ID = VM1.HostSystemID
INNER JOIN AlwaysOn.AvailabilityReplica AR1
 ON AR1.NodeName = VM1.Name
INNER JOIN AlwaysOn.AvailabilityGroup AS AG1
 ON AG1.GroupID = AR1.GroupID
INNER JOIN AlwaysOn.ClusterReference CR
 ON AR1.EventSourceConnectionID = CR.EventSourceConnectionID
GROUP BY
  CR.ClusterName
 ,AR1.GroupID
 ,AG1.Name
 ,VM1.HostSystemID
 ,HS1.Name
HAVING COUNT(VM1.Name) > 1
ORDER BY CR.ClusterName, AG1.Name, HS1.Name
;

Availability Replicas Hosted on Same Virtual Host

Availability Replicas With Disks in Same Datastore

This condition evaluates to true when multiple Availability Replicas from the same Availability Group have their VMDKs in the same VMware Datastore. 

If there's an issue with the datastore for the VMware server, this could result in an outage that nullifies the Always On Availability Group configuration within the virtual machines hosted on the VMware Infrastructure. 

For more information, see the Always On Availability Groups and VMS article. 

--// AG nodes with storage in the same data store.
--// Applies to : VMware
--// ConditionKey column breakdown as follows;
--// Data Store : {Data Store Name} | WSFC : {Windows Cluster Name} | AG : {AG Name} | {Replica:Disk}
select 'Data Store : ' + quotename(ds.name) + N' | WSFC : ' + cr.ClusterName 
  + ' | AG : ' + ag.name + ' | ' +
 stuff((select ', ' + ar2.nodename + '('
     + case [Role] WHEN 1 THEN 'P' ELSE 'S' END 
     +'):{'+ right(vd.[filename],(charindex('/',reverse(vd.[FileName]))-1)) +'}'
   from vm.virtualMachineVirtualDisk as vd
   join vm.datastore as ds2 
    on replace(replace(left(vd.filename,charindex(']', vd.filename)),'[',''),']','') = ds2.name
   join vm.virtualMachine as vm2 on vd.virtualMachineId = vm2.id
   join AlwaysOn.AvailabilityReplica as ar2 on vm2.name = ar2.NodeName
   where ar2.GroupId = ag.GroupId
    and ds2.name = ds.name
   order by vm2.name
   for xml path(''), type).value(N'.[1]',N'nvarchar(max)'),1,2,N'') as conditionKey,
   count(vmdk.[filename]) as DiskOverlapCount
from vm.datastore as ds
join vm.virtualMachineVirtualDisk as vmdk
 on replace(replace(left(vmdk.[filename],charindex(']', vmdk.[filename])),'[',''),']','') = ds.name
join vm.virtualmachine as vm on vmdk.virtualMachineId = vm.id
join AlwaysOn.AvailabilityReplica as ar on ar.nodename = vm.name
join AlwaysOn.AvailabilityGroup as ag on ar.GroupId = ag.GroupId
join AlwaysOn.ClusterReference as cr on ar.EventSourceConnectionID = cr.EventSourceConnectionID
group by cr.ClusterName, ag.Name, ds.Name, ag.GroupId
order by cr.ClusterName, ag.Name, ds.Name
;

Availability Replicas With Disks in Same Datastore

Check Constraints Not Trusted

This condition evaluates to true when it finds sys.check_constraints.is_not_trusted values on objects that it expects to be trusted.

When check constraints are not trusted, SQL Server may be unable to use them for query plans and optimization, affecting the performance of your queries. Check constraints that were disabled, perhaps for bulk loading of data, and then enabled, may not be trusted. The table must be altered for those check constraints to be trusted again.

DECLARE @sql nvarchar(max);
SET @sql = N'';
SELECT @sql = @sql + N'UNION ALL 
  SELECT DBName = N''' + name + ''' COLLATE Latin1_General_BIN, 
  CCsNotTrusted =  
  (
    SELECT COUNT(*) AS CCsNotTrusted
      FROM ' + QUOTENAME(name) + '.sys.check_constraints AS c'
      + N'
      WHERE c.is_not_trusted = 1 
        AND c.is_not_for_replication = 0 
        AND c.is_disabled = 0
  )
  ' FROM sys.databases 
WHERE database_id > 4 AND state = 0;
SET @sql = N'SELECT DBName, CCsNotTrusted FROM 
(' + STUFF(@sql, 1, 10, N'') 
   + N') AS x WHERE CCsNotTrusted > 0;';
EXEC sys.sp_executesql @sql;

Check Constraints Not Trusted

Check Data File Size

Verifies that a data file is at least 5120 MB. This condition is used to build other conditions.

Check Data File Size

Check Error Log Size

Checks the size of the error log. This condition is used in other custom conditions that query the error log to prevent performance issues associated with larger log files.

 DECLARE @currentlogid int
DECLARE @createdate datetime
DECLARE @currfilesize int
 
CREATE TABLE #err_log_tmp(ArchiveNo int, CreateDate nvarchar(128), Size int)
 
INSERT #err_log_tmp exec master.dbo.sp_enumerrorlogs
 
SELECT TOP 1
      @currentlogid = er.ArchiveNo,
      @createdate = CONVERT(datetime, er.CreateDate, 101),
      @currfilesize = er.Size
FROM #err_log_tmp er
ORDER BY [ArchiveNo] ASC
 
DROP TABLE #err_log_tmp
 
SELECT @currfilesize;

Check Error Log Size

Check Transaction Log File Size

Verifies that the transaction log file is at least 1024 MB. This condition is used to build other conditions. Check Transaction Log File Size

Common Criteria Compliance Enabled

Queries sys.configurations to determine if common criteria compliance is enabled.

Common criteria compliance is an advanced option in the SQL Server Enterprise and Datacenter editions that can have a dramatic impact on server performance. The value field indicates that the setting is enabled; however, the value_in_use field indicates whether it has been activated. Activating common criteria compliance requires a server restart.

For more information, see the Common Criteria Compliance article. 

SELECT value
FROM sys.configurations 
WHERE name = N'common criteria compliance enabled'

Common Criteria Compliance Enabled

Cost Threshold of Parallelism Changed

Queries sys.configurations to detect whenever the server-wide Cost Threshold of Parallelism configuration setting changes on a monitored server, by comparing the last value retrieved to the current value.

SELECT value_in_use
FROM sys.configurations 
WHERE name = N'cost threshold for parallelism'

Cost Threshold of Parallelism Changed

CPU Schedulers Failed to Create Worker

This condition evaluates to true when there are schedulers that couldn't create a new worker, most likely due to memory constraints. 

 For more information, see the sys.dm_os_schedulers article. 

SELECT COUNT(*) AS 'Failed Workers'
FROM sys.dm_os_schedulers WITH (NOLOCK)
WHERE failed_to_create_worker = 1;

CPU Schedulers Failed to Create Worker

CPU Schedulers Hot Added

This condition evaluates to true when there are schedulers that have been added due to a hot add CPU event.

For more information, see the sys.dm_os_schedulers article.

SELECT COUNT(*) AS 'Hot Added'
FROM sys.dm_os_schedulers WITH (NOLOCK)
WHERE [status] = N'HOT_ADDED';

CPU Schedulers Hot Added

CPU Schedulers Visible Offline Status

This condition evaluates to true when there are CPU cores that are visible to SQL Server, but are offline.

SQL Server licensing might prevent a server from using all available cores. Not only can this limit performance, but it can cause additional harm by unbalancing NUMA nodes.

For more information, see the Balancing Your Available SQL Server Core Licenses and Performance Problems with SQL Server 2012 articles. 

SELECT COUNT(*) AS 'Offline Schedulers'
FROM sys.dm_os_schedulers WITH (NOLOCK)
WHERE [status] = N'VISIBLE OFFLINE'
and scheduler_id < 255;

CPU Schedulers Visible Offline Status

CPU Unused Schedulers

This condition evaluates to true when CPU schedulers are disabled (is_online = zero).

This indicates that a CPU is offline and SQL Server can't use it for processing (e.g. queries, batches, etc.). A disabled CPU could be caused by affinity masking or licensing issues, and can impact performance.

For more information, see the CPU Schedulers Visible Offline Status section, previously.

SELECT COUNT(*) AS 'Unused Schedulers'
FROM sys.dm_os_schedulers WITH (NOLOCK)
WHERE [is_online] = 0
and scheduler_id < 255;

CPU Unused Schedulers

Data File Growth

This condition evaluates to true if any data file (> five GB) is larger than it was during the last evaluation of this condition. 

Data File Growth

Data File Shrink

Evaluates to true if any data file is smaller than it was during the last evaluation of this condition. Shrinking data files is rarely recommended.

For more information, see the Why You Should Not Shrink Your Data Files article. 

Data File Shrink

Database Files Count Change

This condition evaluates to true when the number of database files changes. It checks the count of database files in master.sys.master_files, regardless of file type (e.g. ROWS, LOG, etc.) A highlight appears on the backups chart of the dashboard to show that backup jobs might need to be created or removed depending on the scenario.

select count(*) from sys.master_files;

Database Files Count Change

Foreign Keys Not Trusted

This condition evaluates to true when it finds sys.foreign_keys.is_not_trusted values on objects that it expects to be trusted. 

When foreign keys are not trusted, SQL Server may be unable to use them for query plans and optimization, affecting the performance of your queries. Foreign keys that were disabled, perhaps for bulk loading of data, and then enabled, may not be trusted. The table must be altered for those foreign keys to be trusted again.

DECLARE @sql nvarchar(max);
SET @sql = N'';
SELECT @sql = @sql + N'UNION ALL 
  SELECT DBName = N''' + name + ''' COLLATE Latin1_General_BIN, 
  FKsNotTrusted =  
  (
    SELECT COUNT(*) AS FKsNotTrusted
      FROM ' + QUOTENAME(name) + '.sys.foreign_keys AS f'
   + N'
      WHERE f.is_not_trusted = 1 
        AND f.is_not_for_replication = 0 
        AND f.is_disabled = 0
  )
  ' FROM sys.databases 
WHERE database_id > 4 AND state = 0;
SET @sql = N'SELECT DBName, FKsNotTrusted FROM 
(' + STUFF(@sql, 1, 10, N'') 
   + N') AS x WHERE FKsNotTrusted > 0;';
EXEC sys.sp_executesql @sql;

Foreign Keys Not Trusted

High Active User Sessions

This condition is triggered if the count of user sessions with recent activity goes over a specified threshold.

High active sessions by itself doesn't universally correlate with performance or other problems, but on some systems it can precede an overload state, and/or be an indicator of malicious denial of service (DoS) activity. As such, this condition should be enabled selectively and the threshold adjusted appropriately.

select count(*) 
from sys.dm_exec_sessions 
where is_user_process = 1 and
last_request_start_time > DATEADD(minute, -1, GETDATE())

High Active User Sessions

High Ad Hoc Query Plans

This condition checks for plan cache bloat that may be alleviated by enabling optimize for ad hoc workloads.
 

If optimize for ad hoc workloads is False, a high percentage of plan cache is dedicated to single use plans, and page life expectancy is low, then this condition evaluates to True.
 
A high percentage of plan cache in this condition is defined as greater than 10 percent when the server has less than or equal to 64 GB of memory and greater than five percent when it has greater than 64 GB of memory.

For more information, see the Plan Cache and Optimizing for Ad Hoc Workloads article. 

SELECT sum(cast((CASE WHEN usecounts = 1 THEN size_in_bytes ELSE 0 END) as decimal(19,3)))/1024/1024 
FROM sys.dm_exec_cached_plans

High Ad Hoc Query Plans

High Average Work Queue Count

This condition evaluates to true when sys.dm_os_schedulers.work_queue_count has an average value greater than one.

A high average work queue count can indicate that max worker threads for the server should be increased.

For more information, see the Max Worker Threads and When You Should Change It article. 

SELECT AVG (work_queue_count)
FROM sys.dm_os_schedulers WITH (NOLOCK)
WHERE [status] = N'VISIBLE ONLINE';

High Average Work Queue Count

High Avg Wait Time per User Session

Wait time measures how long queries are waiting for CPU, disk, memory, or other resources, across all sessions. High average wait time per second by itself may not be indicative of a problem, because it doesn't consider the activity level of the system. For example, 5,000 ms of average wait time/sec would be poor on a system with only 30 active users, but excellent on a system with 3,000 active users.

For this reason, this condition divides average wait time/sec (total) by the number of active user sessions to calculate the average wait time per session. If this value is over 50 ms, users may be experiencing noticeable delays.

This value shouldn't be considered a universally hard threshold, but more of a guideline, and may need to be adjusted for different systems. 

For more information, see the Wait Statistics article and the Waits category on SQLPerformance.com

Note:  SQL Sentry filters out innocuous waits, so the wait time used here is lower than if collected directly from SQL Server, and more accurately reflects user-impacting waits.
select count(*) 
from sys.dm_exec_sessions 
where is_user_process = 1 and
last_request_start_time > DATEADD(minute, -1, GETDATE())

High Avg Wait Time per User Session

High Compiles

Query plan compiles should generally be < 15 percent of batches per second. Higher values indicate plan reuse is low and will often correlate with high cpu since plan compilation can be a cpu-intensive operation. High compiles may correlate with low plan cache hit ratios, and can be an indicator of memory pressure because there may not be enough room to keep all plans in cache.

If you see consistently high compiles, run a Quick Trace and sort the results by Cache Misses, then expand details to view actual compiling statements (SP:CacheMiss events, highlighted) along with the reason (SubClass) and procedure (Object). 

For more information, see the Batch Compilation, Recompilation, and Plan Caching Issues and Caching Mechanisms articles. High Compiles

High Compiles + High CPU

Query plan compiles can be a CPU-intensive operation. If total CPU is high for an extended period and compiles are also high, they may be causing or at least contributing to the high CPU. 

For more information on troubleshooting, see the High Compiles conditions section, previous. High Compiles + High CPU

High Context Switches

Context switches represent the combined rate at which all processors on the computer are switched from one thread to another. Consistently high values over 7,500 per logical processor can mean that the server is spending too much time switching threads instead of actively running threads. High Context Switches

High Context Switches - Warning

Context switches represent the combined rate at which all processors on the computer are switched from one thread to another. Consistently high values over 5,000 per logical processor can mean that the server is spending too much time switching threads instead of actively running threads.  High Context Switches - Warning

High CPU

Sustained CPU utilization greater than 90 percent may indicate a CPU bottleneck.    

Use Windows Processes tab in Performance Analysis to ascertain the processes causing the high CPU. On dedicated SQL Server machines most CPU should be associated with the SQL Server process(es). If the SQL Server is causing the high CPU, use Top SQL and/or QuickTrace to determine which sessions and queries are consuming the most CPU.  

Note:  If you launch QuickTrace by highlighting a range on the CPU Usage chart, the results are pre-sorted by CPU percent descending, so the greatest consumers are on top.

High CPU

High CPU for Non-SQL Server Process

On dedicated SQL Servers, most of the CPU utilization should generally be related to the SQL Server process (sqlservr.exe). This condition detects when CPU utilization is high, and at least 25 percent is related to some process other than SQL Server.  
High CPU for Non-SQL Server Process

High CPU for Single Core

Sustained CPU utilization greater than 90 percent on a single core may indicate a CPU bottleneck. The CPU Total usage value is checked against 100/Number of cores used by the SQL Server instance.

Use the Windows Processes tab in Performance Analysis to ascertain the processes causing the high CPU. On dedicated SQL Server machines, most CPU should be associated with the SQL Server process(es). If the SQL Server is causing the high CPU, use Top SQL and/or QuickTrace to determine which sessions and queries are consuming the most CPU.

Note:  If you launch QuickTrace by highlighting a range on the CPU Usage chart, the results are pre-sorted by CPU percent descending, so the greatest consumers are on top.
SELECT  100.0/COUNT(*) 
FROM SYS.DM_OS_SCHEDULERS 
WHERE STATUS = 'VISIBLE ONLINE' AND IS_ONLINE = 1;

High CPU for Single Core

High CPU Signal Waits

This condition evaluates to true when total signal waits are above 20 percent. High signal waits are often indicative of CPU pressure, but the threshold may vary by system. 

For more information, see the A DMV a Day and Troubleshooting SQL Server Wait Stats articles. 

SELECT CAST(100.0 * SUM(signal_wait_time_ms)/ SUM (wait_time_ms)AS NUMERIC(20,2)) 
FROM sys.dm_os_wait_stats WITH (NOLOCK);

High CPU Signal Waits

High Database Replica Send or Recovery Queue

Large log send or recovery queues can be an indicator of a system or network bottleneck. This condition is triggered if either queue goes over 3 MB. 

Use the Always On Management interface in the Performance Analysis dashboard to troubleshoot.

For more information, see the AlwaysOn Architecture Guide article.
High Database Replica Send or Recovery Queue

High Disk Waits and Latency

This condition detects if high disk latency is potentially causing high disk wait time in SQL Server. 

To troubleshoot, view the Performance Analysis Disk Activity tab to locate bottlenecks in the disk system or specific database files. On the dashboard, highlight a range with high latency and Jump To > Top SQL  to determine which queries may be contributing to and/or suffering from high disk waits.    

Disk latency is the only disk measurement for which there are generally accepted ranges that represent good and bad performance from a SQL Server perspective. The following ranges can be used as a general guideline to determine whether disk latency is acceptable:    

  • Less than 10 ms—Fast*
  • Between 10 ms - 20 ms—Acceptable
  • Between 20 ms - 50 ms—Slow
  • Greater than 50 ms—Critical

*For transaction log writes, between 0 ms and 2 ms is desirable.   
High Disk Waits and Latency

High Mirroring Send or Redo Queue

Large log send or redo queues can be an indicator of a system or network bottleneck. This condition is triggered if either queue goes over three MB.    

Use the Performance Analysis Dashboard to troubleshoot.
 High Mirroring Send or Redo Queue

High Number of Failed Logins

This condition evaluates to true when there are more than 10 failed logins in the last two minutes. The first query ensures that the error log isn't more than two MB (for performance consideration) and the second query checks for the number of failed logins.

A high number of failed login attempts may indicate that an unauthorized user is trying to access the system. SQL Server security properties must have login auditing enabled for this condition to function.

In SQL Server Management Studio, go to Server Properties > Security to check on these settings. The login auditing option must be set to Failed logins only or Both failed and successful logins.

For more information, see the Server Properties article. 

DECLARE @currentlogid int
DECLARE @createdate datetime
DECLARE @currfilesize int
 
CREATE TABLE #err_log_tmp(ArchiveNo int, CreateDate nvarchar(128), Size int)
 
INSERT #err_log_tmp exec master.dbo.sp_enumerrorlogs
 
SELECT TOP 1
@currentlogid = er.ArchiveNo,
@createdate = CONVERT(datetime, er.CreateDate, 101),
@currfilesize = er.Size
FROM #err_log_tmp er
ORDER BY
[ArchiveNo] ASC
 
DROP TABLE #err_log_tmp
 
SELECT @currfilesize;
CREATE TABLE #log
  (
   logdate DATETIME,
   info    VARCHAR (25) ,
   data    VARCHAR (200)
  );
 
  INSERT INTO #log
  EXECUTE sp_readerrorlog 0, 1, 'Login failed';
 
  SELECT count(*) AS occurences
  FROM   #log
  WHERE  logdate > dateadd(minute, -2, getdate());
 
  DROP TABLE #log;

High Number of Failed Logins

High Pending Disk IO Count

This condition evaluates to True when sys.dm_os_schedulers.pending_disk_io_count has an average value greater than zero. A high pending disk I/O count may be indicative of an I/O bottleneck.

For more information, see the sys.dm_os_schedulers and Easy Ways to Detect I/O Pressure in SQL Server articles. 

SELECT AVG (pending_disk_io_count)
FROM sys.dm_os_schedulers WITH (NOLOCK)
WHERE [status] = N'VISIBLE ONLINE';

High Pending Disk IO Count

High Performance Power Plan not Enabled

If the High Performance power plan isn't enabled, Windows may be throttling CPU performance. This in turn can have a significant negative impact on both Windows and SQL Server performance, so it's highly recommended that a dedicated SQL Server always use the High Performance plan.

This condition uses a WMI query to determine whether the High Performance plan is active, and it triggers if it's not.

For more information, see the Slow Performance on Windows Server when Using the Balanced Power Plan and Ensuring Maximum CPU Performance via HTML articles. 

Note:  It's possible to use a custom power plan that doesn't throttle CPU, although this is not a common practice. If the WMI query returns the error, The program is blocked by group policy, it may mean that your network admin has applied a power plan from the group policy as described in the following post. If so, check with them to ensure that the plan doesn't throttle CPU.

High Performance Power Plan not Enabled

High Recompiles

Query plan recompiles should generally be less than 15 percent of initial compiles, and often correlates with high cpu, since plan compilation can be a cpu-intensive operation. They may be caused by statistics updates, schema changes, etc., and may correlate with low plan cache hit ratios. 

If you see consistently high recompiles, run a Quick Trace and sort the results by recompiles, then expand details to view actual recompiling statements (sp:recompile events, highlighted) along with the reason (subclass) and procedure (object).

For more information, see the Batch Compilation, Recompilation, and Plan Caching Issues and Caching Mechanisms articles. High Recompiles

High Redo Completion Time

This condition divides the log_send_rate and redo_rate values found in the sys.dm_hadr_database_replica_states DMV to determine the average redo completion time.

Log Send Rate is the rate at which the logs are being sent to secondary databases in an availability group, and Redo Rate is the rate in which the log records are being redone on the secondary databases.

By default, this condition evaluates to true if the average redo completion time is greater than five mins (in secs). In addition, this condition is designed to return rows only when it is executed against the primary replica. 

For more information, see the Availability Group Replica Sync article.

SELECT [AG Name, Replica, & Database] = AG.name
          + N' [' + AR.replica_server_name + N']'
          + N': (' + DB.database_name + N')',
        [Average Redo Completion Time (Sec)]
          = COALESCE( RS.log_send_rate / NULLIF (RS.redo_rate, 0),0)
 FROM sys.dm_hadr_database_replica_states AS RS
 INNER JOIN sys.availability_databases_cluster AS DB
   ON RS.group_id = DB.group_id
   AND RS.group_database_id = DB.group_database_id
 INNER JOIN sys.availability_groups AS AG
   ON AG.group_id = RS.group_id
 INNER JOIN sys.availability_replicas AS AR
   ON RS.group_id = AR.group_id
   AND RS.replica_id = AR.replica_id
      INNER JOIN sys.dm_hadr_availability_group_states AS AGS
   ON AGS.group_id = AG.group_id 
    INNER JOIN sys.dm_hadr_availability_replica_states AS ars
    ON ar.replica_id = ars.replica_id
    WHERE ars.role_desc = 'SECONDARY'
    AND AGS.primary_replica = @@SERVERNAME
 ORDER BY AG.name, AR.replica_server_name, DB.database_name;

High Redo Completion Time

High Runnable Tasks Count

This condition evaluates to true when sys.dm_os_schedulers.runnable_tasks_count has an average value greater than 10. A high runnable tasks count is indicative of CPU pressure.

For more information, see the A DMV a Day article. 

SELECT AVG (runnable_tasks_count)
FROM sys.dm_os_schedulers WITH (NOLOCK)
WHERE [status] = N'VISIBLE ONLINE';

High Runnable Tasks Count

High VLF Count

SQL Server database transaction logs contain multiple virtual log files or VLFs. The number of VLFs is dictated by the initial size and auto-growth size for the transaction log. Too many VLFs can lead to increased backup/recovery times and possible performance problems. 

See the Performance Analysis Disk Space tab for VLF counts for all log files, as well as configured auto-growth size and how many VLFs result from each auto-growth.

If VLFs are too high, the count can be reset by shrinking and resizing the log. For more information, see the 8 Steps and Transaction Log VLFs articles

High VLF Count

Hyper-V High vCPU Wait Time

The vCPU Wait Time is a SQL Sentry virtual counter. It's calculated by multiplying CPU wait time per dispatch (the average time, in nanoseconds, spent waiting for a virtual processor to be dispatched onto a logical processor) by the number of dispatches per second onto the logical processor.

Hyper-V High vCPU Wait Time

Incorrect Compatibility Level

This condition evaluates to true when a database in the SQL Sentry database has a compatibility level that doesn't match that of the master database (current compatibility level).

Compatibility mode allows an older database to run on a newer version of SQL Server at the expense of not being able to run newer features. While some databases need to use an older compatibility mode, not all of them do.

If there are databases that must run in compatibility mode, make provisions to exclude them to reduce false positive values.

SELECT 
CAST (pasd.CompatabilityLevel AS NVARCHAR(3)) + ' ' + esc.ServerName  + ': (' + pasd.Name + ')' as [Current Level, Server, Database],
der.MaxCompatabilityLevel
FROM [dbo].[PerformanceAnalysisSqlDatabase] pasd
INNER JOIN [dbo].[EventSourceConnection] esc
 ON esc.ID = pasd.EventSourceConnectionID
INNER JOIN 
 (SELECT EventSourceConnectionID, CompatabilityLevel [MaxCompatabilityLevel]
 FROM [dbo].[PerformanceAnalysisSqlDatabase]  
 WHERE DatabaseID = 1) der
 ON der.EventSourceConnectionID = pasd.EventSourceConnectionID
WHERE pasd.DatabaseID = 2
AND esc.IsPerformanceAnalysisEnabled = 1
AND pasd.CompatabilityLevel = der.MaxCompatabilityLevel;

Incorrect Compatibility Level

Large Windows File Cache

The Windows file (or system) cache stores in memory file data that is read from or written to disk. Since SQL Server manages its own memory, this cache typically doesn't grow large on a dedicated SQL Server. However, if some other process is causing it to unexpectedly grow large it can lead to memory pressure for SQL Server. 

In the case of Analysis Services, database files may be loaded into and served from the Windows file cache, even if the associated file data doesn't exist in the SSAS internal caches. For this reason, monitoring the file cache is important to ensure that physical memory is being used effectively, and that memory contention doesn't occur between the SSAS process, the file cache, and other processes on the server, including SQL Server.

Large Windows File Cache

Log File Growth

This conditions evaluates to true if any transaction log, of at least 1024 MB in size, is larger than it was during the last evaluation of this condition.

Frequent log file growth can lead to the creation of too many Virtual Log Files (VLFs). You can prevent this by optimizing the initial size and auto-growth size for the transaction log.

For more information, see the High VLF Count custom condition.

Log File Growth

Long Running Open Transactions

This condition evaluates to True if any running transactions have been open for at least 90 seconds. The query returns a list containing the query text and value (in seconds) of the longest running open transactions.

Additional Information: This is a modified version of a query in Paul Randal’s blog. For more information, see the Script Open Transactions With Text and Plans article.
SELECT [Last T-SQL Text], [Transaction Length]
FROM (
SELECT
    [s_tst].[session_id],
    [s_es].[login_name] AS [Login Name],
    DB_NAME (s_tdt.database_id) AS [Database],
    [s_tdt].[database_transaction_begin_time] AS [Begin Time],
 DATEDIFF(ss, [s_tdt].[database_transaction_begin_time], GETDATE()) [Transaction Length],
    [s_tdt].[database_transaction_log_bytes_used] AS [Log Bytes],
    [s_tdt].[database_transaction_log_bytes_reserved] AS [Log Rsvd],
 CASE [s_tdt].database_transaction_type
  WHEN 1 THEN 'Read/write transaction'
  WHEN 2 THEN 'Read-only transaction'
  WHEN 3 THEN 'System transaction'
 END [Transaction Type],
 CASE [s_tdt].database_transaction_state
  WHEN 1 THEN 'The transaction has not been initialized.'
  WHEN 3 THEN 'The transaction has been initialized but has not generated any log records.'
  WHEN 4 THEN 'The transaction has generated log records.'
  WHEN 5 THEN 'The transaction has been prepared.'
  WHEN 10 THEN 'The transaction has been committed.'
  WHEN 11 THEN 'The transaction has been rolled back.'
  WHEN 12 THEN 'The transaction is being committed. In this state the log record is being generated, but it has not been materialized or persisted.'
 END [Transaction State],
 [s_est].text AS [Last T-SQL Text]FROM sys.dm_tran_database_transactions [s_tdt]
INNER JOIN sys.dm_tran_session_transactions [s_tst]
 ON [s_tst].[transaction_id] = [s_tdt].[transaction_id]
INNER JOIN sys.[dm_exec_sessions] [s_es]
 ON [s_es].[session_id] = [s_tst].[session_id]
INNER JOIN sys.dm_exec_connections [s_ec]
 ON [s_ec].[session_id] = [s_tst].[session_id]
CROSS APPLY sys.dm_exec_sql_text ([s_ec].[most_recent_sql_handle]) AS [s_est]
WHERE [s_tst].is_user_transaction = 1
) RunningTransactions
WHERE [Transaction Length] IS NOT NULL
ORDER BY [Transaction Length] DESC
(' + STUFF(@sql, 1, 10, N'')
   + N') AS x WHERE CCsNotTrusted > 0;';
EXEC sys.sp_executesql @sql;

Long Running Open Transactions

Low Available Windows Memory

Most Windows servers require at least 100 MB of available memory to ensure proper function and avoid expensive disk paging operations. This condition alerts if available memory drops beneath this threshold.

Low Available Windows Memory

Low Page Life Expectancy

Page Life Expectancy (PLE) is the average lifespan in seconds of a data page in buffer, and is one of the best indicators of memory pressure. The larger the buffer cache size, the higher it should be. The formula* used in this condition takes the size of the buffer pool into account when determining whether the current PLE value is problematic: 

PLE < (DataCacheSizeInGB/4GB * 300)  

Each NUMA node has its own PLE value, so some nodes may be suffering from memory pressure while others are not. This condition checks all NUMA nodes via the Any instance that automatically iterates all nodes and applies the formula in a synchronized fashion across all counters. 

The first check in the condition ensures that the buffer size is at least two GB (default) before processing the formula, to avoid false positives caused by small buffers. 

For more information, see the Finding What Queries in the Plan Cache use a Specific Index and Page Life Expectancy Isn't What you Think articles. 

Low Page Life Expectancy

Network Bottleneck - Outbound

The Output Queue Length is the size of the network output packet queue, in packets. A sustained value of more than three may indicate a network bottleneck. This condition detects a queue length of more than three on any network adapter. 

For more information, see the Wait Statistics article and the Waits category on SQLPerformance.com

Network Bottleneck - Outbound

NUMA Disabled

This condition evaluates to True when NUMA is disabled. For more information, see the A DMV a Day article. 

SELECT COUNT(DISTINCT parent_node_id)
FROM sys.dm_os_schedulers WITH (NOLOCK)
WHERE parent_node_id <> 32;

NUMA Disabled

Optimize for Ad Hoc Workloads Changed

Detects whenever the optimize for ad hoc workloads server config setting changes by comparing the last value retrieved to the current value.

SELECT value_in_use
FROM sys.configurations
WHERE name = N'optimize for ad hoc workloads';                       

Optimize for Ad Hoc Workloads Changed

Page Verification Check

This condition evaluates to true when it locates databases in the SQL Sentry database with an inadequate level of page protection. From release 2005 databases should be using Checksum. Databases upgraded or scripted from an earlier version may still be using torn protection. The result set format is as follows: 

Key (server name: (Database), value (one if incorrect).

For more information, see the Alter Database Set Options article. 

To change a single database on a server, run the following code after changing the database name:

USE [master]
GO
ALTER DATABASE [AdventureWorks] SET PAGE_VERIFY CHECKSUM  WITH NO_WAIT;
GO

Page Verification Check

To generate the commands to change all databases on a server, run the following code:

SELECT N'ALTER DATABASE [' + db.name + N'] SET PAGE_VERIFY CHECKSUM  WITH NO_WAIT;'
FROM sys.databases AS db 
WHERE db.page_verify_option_desc <> N'CHECKSUM';
SELECT
esc.ServerName  + ': (' + pasd.Name + ')',
 CASE
 WHEN pasd.CompatibilityLevel <= 80 AND pasd.PageVerifyOption <> 1 THEN 1
 WHEN pasd.CompatibilityLevel >= 90 AND pasd.PageVerifyOption <> 2 THEN 1
 ELSE 0
 END [IsBad],
pasd.CompatibilityLevel,
pasd.PageVerifyOption
FROM [dbo].[PerformanceAnalysisSqlDatabase] pasd
INNER JOIN [dbo].[EventSourceConnection] esc
 ON esc.ID = pasd.EventSourceConnectionID
WHERE pasd.DatabaseID <> 2
AND esc.IsPerformanceAnalysisEnabled = 1
AND
(CASE
 WHEN pasd.CompatibilityLevel <= 80 AND pasd.PageVerifyOption <> 1 THEN 1
 WHEN pasd.CompatibilityLevel >= 90 AND pasd.PageVerifyOption <> 2 THEN 1
 ELSE 0
END) = 1;

Server MAXDOP Changed

Detects when the server-wide max degree of parallelism (MAXDOP) server config setting changes on a server with more than one processor by comparing the last value retrieved to the current value.

SELECT value_in_use
FROM sys.configurations
WHERE name = N'max degree of parallelism';

Server MAXDOP Changed

Service Broker - Activation Error Occurred 

Checks for Service Broker procedures exiting due to errors.

Service Broker Activation Error

Service Broker - Login Connection Error

Checks sys.dm_broker_connections for a login_state of 13, which indicates a connection error.

Service Broker Login Connection

Service Broker - 'Poison Message Handling' Disabled

Checks sys.service_queues to see if is_poison_message_handling_enabled is false.

Service Broker Poison Message

Service Broker - Queues with Issues 

Checks sys.service_queues for entries that have an activation_procedure, but is_activation_enabled, is_receive_enabled, or is_enqueue_enabled is disabled.

Service Broker Queues with Issues

Service Broker - Task Limit Reached/Sec

Checks for Task Limit Reached/sec Total counter >= 1.

Service Broker Task Limit

Service Broker - Transmission Queue Errors 

Counts the number of messages in sys.transmission_queue where is_conversation_error is true.

Service Broker Transmission Queue Errors

Sleeping Sessions With Old Open Transactions

Detects when there are sleeping sessions with open transactions older than 10 minutes by default. Such sessions can cause blocking, and can prevent the transaction log from clearing, leading to excessive log file growth, and space exhaustion. When snapshot isolation is used, they can prevent version cleanup from occurring in tempdb.

The start time, session_id, host, application and database are returned for the oldest five transactions by default. The query itself only returns transactions older than five minutes by default, to avoid bringing back unnecessary results on systems with many short-running transactions.

SELECT TxDesc = '[' + CONVERT(varchar, es.last_request_start_time, 120) + '] (' + CAST(es.session_id AS varchar(6)) + ') ' + host_name + ':' + program_name + ' [' + DB_NAME(dt.database_id) + ']'
    , OpenMinutes = DATEDIFF(minute, es.last_request_start_time, GETDATE())
FROM sys.dm_exec_sessions es
JOIN sys.dm_tran_session_transactions st
  ON es.session_id = st.session_id
JOIN sys.dm_tran_database_transactions dt
  ON dt.transaction_id = st.transaction_id
WHERE dt.database_id <> 32767
  AND status = 'sleeping'
  AND es.last_request_start_time < DATEADD(MINUTE, -5, GETDATE())
ORDER BY es.last_request_start_time

Sleeping Sessions With Old Open Transactions

SQL File Auto-growth Disabled

This condition queries the SQL Sentry database to detect when auto-growth has been disabled for any data or transaction log file.

Although auto-growth should be considered a contingency for unexpected file growth vs. the primary means of managing growth, disabling it can lead to eventual free space exhaustion that halts all DML activity. 

For more information, see the Consideration for the Autogrow and Autoshrink settings article. 

SELECT
  DBFileName = EC.ObjectName + '.' + SF.Name
 ,SF.Growth
FROM EventSourceConnection EC
JOIN PerformanceAnalysisSqlFile SF
  ON EC.ID = SF.EventSourceConnectionID
WHERE EC.IsPerformanceAnalysisEnabled = 1

SQL File Auto-growth Disabled

SQL File Auto-growth Exceeds Free Space

Queries the SQL Sentry database to detect when the next auto-growth exceeds either available free disk space or the max file size. The key column represents the server and file name, and the value column represents the space debt in MB.

The space debt is the minimum of the free space (PerformanceAnalysisDeviceLogicalDisk.FreeSpace) minus the growth size setting (PerformanceAnalysisSqlFile.Growth) or the maximum size (PerformanceAnalysisSqlFile.MaxSize) minus the current size plus growth size.

SELECT 
  DBFileName
 ,SpaceDebtMB = MIN(SpaceDebtMB)
FROM
 (
  SELECT
    DBFileName = EC.ObjectName + '.' + SF.Name
   ,SpaceDebtMB = (LD.FreeSpace / 1024.0 / 1024.0) - (SF.Growth / 128)
  FROM EventSourceConnection EC
  JOIN PerformanceAnalysisSqlFile SF
    ON EC.ID = SF.EventSourceConnectionID
  JOIN PerformanceAnalysisDeviceLogicalDisk LD
    ON LD.DeviceID = EC.DeviceID
   AND LD.Name = LEFT(SF.FileName, LEN(LD.Name))
  WHERE SF.Growth > (LD.FreeSpace / 8192)
   AND EC.IsPerformanceAnalysisEnabled = 1
  UNION
  SELECT
    DBFileName = EC.ObjectName + '.' + SF.Name
   ,SpaceDebtMB = (SF.MaxSize - (SF.Size + SF.Growth)) / 128
  FROM EventSourceConnection EC
  JOIN PerformanceAnalysisSqlFile SF
    ON EC.ID = SF.EventSourceConnectionID
  JOIN PerformanceAnalysisDeviceLogicalDisk LD
    ON LD.DeviceID = EC.DeviceID
   AND LD.Name = LEFT(SF.FileName, LEN(LD.Name))
  WHERE SF.MaxSize = -1
   AND (SF.Growth + SF.Size) > SF.MaxSize
   AND EC.IsPerformanceAnalysisEnabled = 1
  ) DBFiles
GROUP BY DBFileName

SQL File Auto-growth Exceeds Free Space

SQL Sentry Monitoring Service Offline

Returns the SQL Sentry (or SentryOne) monitoring service that's been offline the longest in any site with actively watched connections. This condition only works when two or more monitoring services are used.

To avoid triggering this condition when intentionally stopping a service, temporarily move it into a site with no watched connections using Site Configuration.

SELECT TOP 1 dbo.ManagementEngine.ServerName
FROM
    (
        SELECT Device.ID
             , Device.SiteID
        FROM dbo.Device
        WHERE IsPerformanceAnalysisEnabled = 1
    UNION
        SELECT Device.ID
             , Device.SiteID
        FROM dbo.EventSourceConnection
        INNER JOIN dbo.Device
           ON dbo.EventSourceConnection.DeviceID = dbo.Device.ID
        WHERE EventSourceConnection.IsWatched = 1
           OR EventSourceConnection.IsPerformanceAnalysisEnabled = 1
    ) WatchedDevices
INNER JOIN dbo.ManagementEngine
   ON WatchedDevices.SiteID = dbo.ManagementEngine.SiteID
WHERE dbo.ManagementEngine.HeartbeatDateTime < DATEADD(minute, -3, GETUTCDATE())
   OR dbo.ManagementEngine.HeartbeatDateTime IS NULL
ORDER BY ISNULL(dbo.ManagementEngine.HeartbeatDateTime, dbo.ManagementEngine.LastInitializationDateTime)

SQL Sentry Monitoring Service Offline

SQL Server Memory Exhaustion

Memory Grants Pending is the number of processes waiting for a query workspace memory grant. This value should be zero, but it can go above zero in cases of severe memory pressure.  

When it does, RESOURCE_SEMAPHORE waits are greater than zero, since this wait is a measure of the time that queries had to wait for memory grants. This type is visible in the Waits chart tooltips for the Memory class and category.

SQL Server Memory Exhaustion

SQL Server Process Paged to Disk

Under severe memory pressure, Windows can page the SQL Server process to disk, which can dramatically impact SQL Server performance. This custom condition queries the ring buffer for the latest working set utilization value for the SQL Server process, and it alerts if it drops beneath 75 percent. If no record exists, 100(%) is returned, which is the optimal state.  

For more information, see the Reduce Paging of Buffer Pool Memory, Lock Pages in Memory, and SQL Server Debates articles. 

Note:  If the  Lock pages in memory  user right has been assigned to the SQL Server service user account, it will prevent Windows from paging the SQL Server process.

SELECT MIN(MemUtilization)
FROM
(
	SELECT
		ISNULL(rec.x.value('(MemoryRecord/MemoryUtilization)[1]','tinyint'), 100) As MemUtilization
	FROM (
			SELECT TOP 1 CAST(orb.record AS XML) AS xmlRec
			FROM sys.dm_os_ring_buffers AS orb
			CROSS JOIN sys.dm_os_sys_info AS osi
			WHERE orb.ring_buffer_type = 'RING_BUFFER_RESOURCE_MONITOR'
			  AND DATEADD(second, -((osi.cpu_ticks/(osi.cpu_ticks/osi.ms_ticks) - orb.timestamp) / 1000), GETDATE()) > DATEADD(minute, -120, GETDATE())
			ORDER BY timestamp DESC
		) rb
	CROSS APPLY rb.xmlRec.nodes('Record') rec(x)
 UNION
	SELECT 100
) MU

SQL Server Process Paged to Disk

SQL Server Process Physical Memory Low

This condition evaluates to true when sys.dm_os_process_memory.process_physical_memory_low is true. It indicates that the process is responding to low physical memory.

For more information, see the sys.dm_os_process_memory article. 

SELECT process_physical_memory_low
FROM sys.dm_os_process_memory WITH (NOLOCK);

SQL Server Process Physical Memory Low

SQL Server Process Virtual Memory Low

This condition evaluates to true when sys.dm_os_process_memory.process_virtual_memory_low is true. It indicates that more virtual memory is needed for the SQL Server process. 

For more information, see the sys.dm_os_process_memory article. 

SELECT process_virtual_memory_low
FROM sys.dm_os_process_memory WITH (NOLOCK);

SQL Server Process Virtual Memory Low

Suspect Pages - Active Corrupt Pages

This condition queries the dbo.suspect_pages table located in the MSDB database. If there are any pages with a status of one, two, or three, it evaluates to true, advising that there is potential corruption in databases on the server being queried.

For more information, see the Monitoring for Suspect Pages article. 

SELECT db_name(database_id) as databaseName, count(*) as corruptPages
FROM dbo.suspect_pages
WHERE event_type in (1,2,3)
GROUP BY database_id;

Suspect Pages - Active Corrupt Pages

Suspect Pages - High Row Count

This condition counts the number of rows present in the dbo.suspect_pages table in the MSDB database. When the total count of rows in the table is greater than or equal to 900, it evaluates to true. This is because the dbo.suspect_pages table is only allowed a maximum of 1,000 rows before it fills, at which point any new suspect pages that are detected don't have information logged about them. 

For more information, see the Monitoring for Suspect Pages article. 

SELECT count(*) as suspectPageCount
FROM dbo.suspect_pages;

Suspect Pages - High Row Count

Suspect Pages - Increase in Fixed Pages

This condition queries dbo.suspect_pages in the MSDB database, checking the number of fixed pages (event types 4, 5, and 7) per database. This condition evaluates to true if the number of fixed pages has increased since the last check.

If this condition is true, it can mean that DBCC CHECKDB has corrected or deallocated pages. If Enterprise Edition is in use and the database is in either an Availability Group or Mirroring session, it could indicate that pages were fixed in Automatic Page Repair, alerting you to a potential corruption issue you might not have otherwise observed. 

For more information, see the Monitoring for Suspect Pages article. 

SELECT db_name(database_id) as databaseName, count(*) as fixedPages
FROM dbo.suspect_pages
WHERE event_type in (4,5,7)
GROUP BY database_id;

Suspect Pages - Increase in Fixed Pages

Tempdb Data Files

This condition evaluates to true when there are more data files than logical cores in use by SQL Server, or there are more than four logical cores in use and fewer than four tempdb data files.

If the number of logical processors is less than or equal to eight, use the same number of data files as logical processors. If the number of logical processors is less than eight, use eight data files, and then if contention continues, increase the number of data files by multiples of four (up to the number of logical processors) until the contention is reduced to acceptable levels or make changes to the workload/code. 

For more information, see the Recommendations to Reduce Allocation Contention, Correctly Adding Data Files, Trace Flag Usage, and TEMPDB – Files and Trace Flags and Updates articles. 

DECLARE @TempDBFiles TinyInt
DECLARE @CPUCount TinyIntSELECT @CPUCount = COUNT(*)
FROM sys.dm_os_schedulers
WHERE STATUS = N'VISIBLE ONLINE' AND IS_ONLINE = 1;SELECT @TempDBFiles = COUNT(*)
FROM master.sys.master_files
WHERE database_id = 2
AND file_id <> 2;SELECT
 CASE
  WHEN @TempDBFiles = @CPUCount THEN 0
  /* If there are more data files than CPU Cores then this may cause overhead */
  WHEN @TempDBFiles > @CPUCount THEN 1
  /* If there are 4 or more cores and less than 4 TempDB Data files there could be contention */
  WHEN @CPUCount >= 4 AND @TempDBFiles < 4 THEN 2
  ELSE 0
 END [Contention];

Tempdb Data Files

Tempdb Large Version Store 

Checks for a tempdb version store that is greater than 100,000 KB.

Tempdb Large Version Store

Tempdb Low Unallocated Page Count

This condition evaluates to true when less than 10 percent when tempdb's page count is unallocated. This may indicate that tempdb is running out of space and will soon experience autogrowth or hit a size limit.

For more information, see the sys.dm_db_file_space_usage article. 

SELECT (CONVERT(DECIMAL(18,0),(SUM(unallocated_extent_page_count)))/(SUM(total_page_count))*100) as percent_used 
FROM tempdb.sys.dm_db_file_space_usage;

Tempdb Low Unallocated Page Count

Tempdb Unequal File Size

The benefits of having multiple tempdb files can be lost if one of those files grows larger than the other files. This condition evaluates to true if your tempdb files are not the same size. The query counts the distinct tempdb file sizes from sys.master_files.

For more information, see the SQL Server tempdb Fixes and Adding Data Files articles. 

SELECT count(distinct size) as unique_sizes 
FROM sys.master_files
WHERE database_id = 2
AND type_desc <> 'LOG';

Tempdb Unequal File Size

Tempdb/CPU Configuration Warning

Check that you have one tempdb file per CPU core; up to eight files. As with most SQL Server topics, an it depends caveat accompanies tempdb recommendations; however, starting with SQL Server 2016, Microsoft creates a default number of eight tempdb files. It uses the number of cores if there are fewer than eight cores or just one file it is SQL Server Express.

The first query counts the visible online CPUs from sys.dm_os_schedulers, and the second query counts the number of tempdb files from sys.master_files. This condition evaluates to True when the count of CPUs isn't equal to the count of tempdb files.

For more information, see the Install SQL Server from the Command Prompt and tempdb Should Always Have one Data File per Processor Core articles. 

Note:  As a general rule, if the number of logical processors is less than or equal to 8, use the same number of data files as logical processors. If the number of logical processors is greater than 8, use 8 data files and then if contention continues, increase the number of data files by multiples of 4 (up to the number of logical processors) until the contention is reduced to acceptable levels or make changes to the workload/code.
SELECT count(cpu_id) as cpu_count
FROM sys.dm_os_schedulers 
WHERE status = 'VISIBLE ONLINE';
SELECT count(Name) as tempdb_files
FROM sys.master_files 
WHERE database_id = 2
AND type_desc <> 'LOG';

Tempdb/CPU Configuration Warning

Tempdb/CPU Configuration Warning > 8 CPUs

If you have many cores, set the number of tempdb files as a quarter to half the number of cores. This condition verifies that you have at least the quarter ratio. This means if you have 32 cores and eight files, it evaluates to false, but if there are more than 32 cores and only the eight default tempdb files, then it evaluates to true.

The first query counts the visible online CPUs from sys.dm_os_schedulers and the second query counts the number of tempdb files from sys.master_files.

Customize this based on contention or performance issues you have seen in your environments. Microsoft recommends adding files in multiples of four, so you would likely want to go from eight to 12, 16, 20, etc. 

For more information, see the Tempdb Fixes and Tempdb Should Always Have one Data File per Processor Core articles. 

SELECT count(cpu_id) as cpu_count
FROM sys.dm_os_schedulers 
WHERE status = 'VISIBLE ONLINE';
SELECT count(Name) as tempdb_files
FROM sys.master_files 
WHERE database_id = 2
AND type_desc <> 'LOG';

Tempdb/CPU Configuration Warning

Tintri High Datastore Throttle Latency 

Checks for Tintri Datastore : Throttle Latency (ms) > 10.  Tintri High Datastore

Tintri High vDisk Throttle Latency 

Checks for Tintri Virtual Disk : Throttle Latency (ms) Any > 10.

Tintri High vDisk

Tintri High VM Throttle Latency 

Checks for Tintri Virtual Machine : Throttle Latency (ms) Any > 10.  Tintri High VM Throttle

Trace Flags Number Turned On Changed

This condition evaluates to true when the number of trace flags set to ON (status = one) has changed. The query analyzes the results of DBCC TRACESTATUS WITH NO_INFOMSGS to determine the status of the trace flags.

CREATE TABLE [#TraceFlagsOn] 
      ([TraceFlag] INT, [Status] INT, [Global] INT, [Session] INT);
INSERT INTO #TraceFlagsOn EXEC ('DBCC TRACESTATUS WITH NO_INFOMSGS');
SELECT Count(*) as TraceFlagsStatusOn
FROM #TraceFlagsOn
WHERE #TraceFlagsOn.Status = 1;
DROP TABLE [#TraceFlagsOn];

Trace Flags Number Turned On Changed

Trace Flags Total Number Changed

This condition evaluates to True when the number of trace flags enabled on the system has changed. The query analyzes the results of DBCC TRACESTATUS WITH NO_INFOMSGS to determine the status of the trace flags.

CREATE TABLE [#TraceFlags] 
      ([TraceFlag] INT, [Status] INT, [Global] INT, [Session] INT);
 
INSERT INTO #TraceFlags EXEC ('DBCC TRACESTATUS WITH NO_INFOMSGS');
 
SELECT Count(*) as TraceFlagsTotal
FROM #TraceFlags;
 
DROP TABLE [#TraceFlags];

Trace Flags Total Number Changed

VMware High Ballooning

This VMware-specific condition evaluates to true when a significant amount of memory has been reclaimed by the host machine from the guest virtual machine(s). 

For more information, see the vSphere Memory Management article. 

VMware High Ballooning

VMware High Co-Stop %

This condition evaluates to true when the VMware Co-Stop time value is greater than three percent. This is the time that a virtual machine is ready to run, but unable to run due to co-scheduling constraints. May be indicative of too many vCPU resources. 

For more information, see the CPU Counters article. 

VMware High Co-Stop %

VMware High Ready Time % per vCPU

This VMware-specific condition evaluates to True when Ready Time % per vCPU is critically high.

When a vCPU is ready to do work but is waiting for the hypervisor to schedule that work on one or more physical CPUs, the vCPU accumulates Ready Time. This is often caused by having a high ratio of vCPUs to physical CPUs on the host machine or having VMs of significantly different sizes (by vCPU count) on the same host. 

For more information, see the CPU Ready Time in VMware article. 

VMware High Ready Time Per vCPU

VMware High Ready Time % per vCPU - Warning

This VMware-specific condition evaluates to True when Ready Time % per vCPU is within the warning threshold.

When a vCPU is ready to do work but is waiting for the hypervisor to schedule that work on one or more physical CPUs, the vCPU accumulates Ready Time. This is often caused by having a high ratio of vCPUs to physical CPUs on the host machine or having VMs of significantly different sizes (by vCPU count) on the same host. 

For more information, see the CPU Ready Time in VMware article. 

VMware High Ready Time % per vCPU - Warning

Windows Low Memory Notification

This condition queries the SQL Server ring buffer for recent occurrences of system-level RESOURCE_MEMPHYSICAL_LOW alerts. These occur when Windows signals that physical memory is low and can cause the SQL Server to release its memory in response. 

For more information, see the Ring Buffer Resource Monitor and Diagnose Memory Issues in SQL Server blog posts. 

SELECT COUNT(*) AS AlertCount
FROM (
    SELECT CAST(orb.record AS XML) AS xmlRec
    FROM sys.dm_os_ring_buffers AS orb
    CROSS JOIN sys.dm_os_sys_info AS osi
    WHERE orb.ring_buffer_type = 'RING_BUFFER_RESOURCE_MONITOR'
   AND DATEADD(second, -((osi.cpu_ticks/(osi.cpu_ticks/osi.ms_ticks) - orb.timestamp) / 1000), GETDATE()) > DATEADD(minute, -6, GETDATE())
  ) rb
CROSS APPLY rb.xmlRec.nodes('Record') rec(x)
WHERE rec.x.value('(ResourceMonitor/IndicatorsSystem)[1]','tinyint') = 2

Windows Low Memory Notification