Documentation for SQL Sentry

Windows & Hyper-V Performance Metrics

Overview

This article describes the Windows performance metrics displayed by the Performance Analysis Dashboard and Performance Analysis Overview in SQL Sentry, and explains how to interpret individual metric values and combinations of values across different Windows metrics.

Note:  For Mode: S = Sample and H = History.

Windows Metrics

Section Metric Description
Network Total network utilization The total combined percentage utilization for all adapters on the server.
Mode: S, H
Type:Percent
May correlate: CPU Usage: Kernel time
Network Utilization by SQL Server/SSAS instance The percentage utilization for each instance is overlaid on top of the total percentage utilization, enabling you to see how much of the network traffic is related to SQL Server, SSAS, or other processes on the server.

If SQL Server-related network activity is high, use Quick Trace to determine which processes may be responsible.
Mode: S, H
Type:Percent
Network Utilization by network adapter Select the adapters radio group item to view activity for up to four network adapters on the server. This enables you to spot adapter-specific saturation issues.
Mode: S
Type:Mb/sec
May correlate: CPU Usage: Kernel time
Network Network output queue length The length of the network output packet queue, in packets. This is the total value across all adapters on the machine. A sustained value greater than 3 may indicate a network bottleneck.
Mode: S
Type:Last value
Warning range:> 3
Critical range:> 3 for more than approx. 20 sec
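As a rough illustration of the warning and critical ranges above, the following Python sketch classifies a series of output queue length samples. The function name, the 10-second sample interval, and the choice to treat a run lasting at least 20 seconds as critical are assumptions made for this example; they are not SQL Sentry settings or APIs.

```python
from typing import Sequence


def classify_output_queue(samples: Sequence[float],
                          sample_interval_sec: float = 10.0,
                          threshold: float = 3.0,
                          sustain_sec: float = 20.0) -> str:
    """Return 'ok', 'warning', or 'critical' based on the newest samples.

    Assumptions (not taken from SQL Sentry): samples are evenly spaced,
    ordered oldest to newest, and "approx. 20 sec" is read as ">= 20 sec".
    """
    # Count the newest consecutive samples that exceed the threshold.
    run = 0
    for value in reversed(samples):
        if value > threshold:
            run += 1
        else:
            break
    if run == 0:
        return "ok"
    # A run of n samples spans (n - 1) sample intervals.
    elapsed = (run - 1) * sample_interval_sec
    return "critical" if elapsed >= sustain_sec else "warning"
```

For example, with 10-second samples, a single reading of 4 would be a warning, while three consecutive readings above 3 would be treated as critical.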
CPU Usage
Total processor time The total processor time percentage across all processors on the server. A sustained value greater than 80 percent generally indicates a CPU bottleneck.
Mode: S
Type:Percent
CPU Usage
User time by CPU The user time percentage for each individual CPU. Represented by the color green. If user time for individual CPUs is at or near 100 percent for a sustained period, SQL Server or other applications may not be parallelizing effectively.
Mode: S, H
Type:Percent
Warning range:> 85 percent
Critical range:> 95 percent
CPU Usage
Kernel time by CPU The kernel (or privileged) time percentage for each individual CPU. Represented by the color red. Kernel time should typically be < 40 percent total, and < 25 percent of user time. If kernel time for individual CPUs is high for a sustained period, a driver may be responsible. For example, a network driver may cause high kernel time, and it may be isolated to specific CPUs, in which case it may correlate with network activity.
Mode: S
Type:Percent
May correlate: Network activity
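The two kernel time guidelines above can be sketched as a simple check in Python. The function name and the returned messages are hypothetical, and skipping the ratio test when user time is zero is an assumption made to avoid flagging an idle CPU.

```python
from typing import List


def kernel_time_flags(kernel_pct: float, user_pct: float) -> List[str]:
    """Return the guidelines that a kernel/user time pair does not meet."""
    flags = []
    # Guideline 1: kernel time should typically be < 40 percent total.
    if kernel_pct >= 40.0:
        flags.append("kernel time is 40 percent or more")
    # Guideline 2: kernel time should typically be < 25 percent of user time.
    # Assumption: the ratio is not meaningful when there is no user time.
    if user_pct > 0 and kernel_pct >= 0.25 * user_pct:
        flags.append("kernel time is 25 percent or more of user time")
    return flags
```

For instance, 20 percent kernel time alongside 60 percent user time passes the first guideline but not the second, because 25 percent of 60 is 15.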
CPU Usage
Processor time by SQL Server/SSAS process
CPU time related to SQL Server and SSAS processes is overlaid on top of the total CPU series. There is only one SSAS chart series and it represents all SSAS instances on the server combined.

Use Top SQL and Quick Trace to determine which queries are utilizing the most CPU resources.
Mode: S, H
Type:Percent
May correlate: SQL Server:CPU wait time
Queries:
  • Parallelism
  • Hash joins
  • Sort operations
  • Index rebuilds & defrag
  • Consistency checks
SSAS:
CPU Usage
Percent Guest Runtime*

For the Hyper-V Hypervisor Logical Processor, the average percentage of time guest code is running on a logical processor.

Additional Information: See Also: Performance Management: Monitoring CPU Resources
Mode: S, H
Type:Percent
CPU Usage
vCPU Wait Time*

The percentage time the guest virtual machine spent waiting for a host kernel resource.

Additional Information: See Also: Hyper-V Equivalent to VMware CPU Ready Time
Mode: S, H
Type:Percent
CPU Usage
Context switches The combined rate at which all processors on the computer are switched from one thread to another. Consistently high values can mean that the server is spending too much time switching threads instead of actively running threads.
Mode: S
Type:Avg/sec
Warning range:> 5,000 times # of processors
Critical range:> 7,500 times # of processors
May correlate: CPU Usage: Kernel time
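Because the context switch ranges scale with the number of processors, it can help to work out the absolute thresholds for a given server. The Python sketch below does that arithmetic; the function name and return values are hypothetical and only illustrate the ranges listed above.

```python
def context_switch_status(avg_per_sec: float, processors: int) -> str:
    """Classify an average context switches/sec value for a server."""
    if processors < 1:
        raise ValueError("processors must be at least 1")
    warning_limit = 5_000 * processors   # Warning range: > 5,000 x processors
    critical_limit = 7_500 * processors  # Critical range: > 7,500 x processors
    if avg_per_sec > critical_limit:
        return "critical"
    if avg_per_sec > warning_limit:
        return "warning"
    return "ok"
```

On an 8-processor server, for example, the warning range starts above 40,000 context switches per second and the critical range above 60,000.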
CPU Usage
Processor queue length The number of threads in the processor queue. If this value goes over two per processor, and CPU usage for SQL Server is high, it can be indicative of high compiles/recompiles, or a high rate of key lookups, which can often be addressed by covering indexes. Use Quick Trace to troubleshoot.
Mode: S
Type:Last value
Critical range:> two times the # of processors
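The processor queue length range is also relative to the processor count. A minimal Python sketch of that comparison follows; the function name is hypothetical, and the example only restates the critical range above.

```python
def queue_length_per_processor(queue_length: int, processors: int) -> float:
    """Return the number of queued threads per processor."""
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return queue_length / processors


def queue_length_is_critical(queue_length: int, processors: int) -> bool:
    """True when the queue exceeds two threads per processor."""
    return queue_length_per_processor(queue_length, processors) > 2
```

A queue length of 20 on an 8-processor server works out to 2.5 threads per processor, which falls in the critical range; a queue length of 16 on the same server does not.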
System Memory
Memory by SQL Server/SSAS instance
The amount of physical memory used by each SQL Server and SSAS instance. Important for determining whether available memory is being used effectively, and whether there's memory contention between multiple instances on the same server.
Mode: S, H
Type:MB
System Memory
Other processes memory The amount of physical memory used by all processes on the server other than SQL Server or SSAS. If a SQL Server or SSAS instance isn't being watched by SQL Sentry, the memory it uses is included here.
Mode: S, H
Type:MB
System Memory
File Cache memory The amount of physical memory currently allocated to the system file cache.

SSAS -- SSAS database files may be loaded into and served from the file cache, even if the associated file data doesn't exist in the SSAS internal caches. Monitoring the file cache is important to ensure that physical memory is being used effectively, and that memory contention doesn't occur between the SSAS process, the file cache, and other processes on the server.

SQL Server -- This isn't an important metric to monitor for dedicated SQL Servers.
Mode: S, H
Type:MB
System Memory
Ballooned memory** Memory ballooning allows a physical host to recapture unused memory on a virtual machine, and use it elsewhere. The portion of memory being ballooned is determined in different ways depending on the hypervisor.


Mode: S, H
Type:MB
System Memory
Available memory The amount of physical memory not in use by any processes. If this value is consistently very low, it can be a strong indicator of memory pressure. When available memory gets too low, Windows signals applications like SQL Server of the low memory condition, and this can cause SQL Server to page its own data to disk. It's generally a good idea to have a few hundred MB available, to avoid this signaling, and so Windows can make use of it as needed.
Mode: S, H
Type:MB
Critical range:<100MB
System Memory
Read faults Pages read from disk into memory by Windows. This metric includes both page file and file reads, but not SQL Server file reads. Hard faults can be caused by various activities, including:

  • Copying files, locally or across the network
  • Anti-virus or other security-related scans
  • File indexing activity by Windows or other apps including SQL Server full text indexing
  • SQL Server backups written to the local disk system.
If high hard faults correlate with high network activity, it may be related to copying files across the network, such as with Windows backup software. If the backup software has a local agent or service, use the Process ActivityViewer to monitor page faults/sec (sort descending) to confirm. If there's no local agent, it can be more difficult to determine the source of the activity; however, the best place to start is typically the Sessions list under Windows Target Management.

SQL Server -- Windows hard faults are not caused by SQL Server itself because it manages its own memory and memory/disk access. Heavy Windows paging can impact SQL Server performance by contending with normal SQL Server disk activity. The only scenario where SQL Server may contribute to hard faults is when its process memory (working set) is being paged to disk by the operating system. This can happen under severe memory pressure when SQL Server doesn't release its own memory quickly enough. This can have a dramatic impact on performance, especially on 64-bit systems where a restart of SQL Server may be required to recover.

SSAS -- Windows paging often correlates strongly with SSAS file IO activity. Because SSAS databases consist of many files on the disk system and SSAS doesn't manage its own memory/disk access as SQL Server does, reads from and writes to the SSAS files show up as hard page faults.
Mode: S, H
Type:Avg pages/sec
May correlate: Network activity
SSAS:File IO activity
System Memory
Write faults Pages written from memory to disk by Windows. If you see high write faults in combination with an increase in page file usage, it may mean that the SQL Server or SSAS process is being paged to disk. For SQL Server, this can be confirmed by querying the sys.dm_os_ring_buffers DMV for RESOURCE_MEMPHYSICAL_LOW events with MemoryUsage < 100.

See Page file usage and Read faults for more details.
Mode: S, H
Type:Avg pages/sec
System Memory
Page file usage The percentage of page (or swap) file currently in use. Windows uses the page file to store pages of memory used by processes that are not contained in other files. If the file is adequately sized and usage percentage is high, it's a likely indicator of memory pressure.

SQL Server -- If SQL Server doesn't respond quickly enough to a low physical memory condition, memory for the SQL Server process (working set) can be swapped to the page file. This can have a dramatic impact on performance, especially on 64-bit systems where a restart of SQL Server may be required to recover. If AWE or the Lock pages in memory option is in use, Windows can't send SQL Server process memory to the page file. This option is recommended on 64-bit systems.

SSAS -- SSAS doesn't acknowledge or respond to Windows low memory conditions, so it's more likely that its process memory can be sent to the page file when there is memory pressure. SSAS performance can suffer when this happens.
Mode: S
Type:Percent
Disk IO Read latency by physical disk The average time in milliseconds each physical disk read is taking. Logical disks on the same physical disks are grouped together, because they share the same set of disk spindles and latency between them will always be the same.

Disk latency is the only disk measurement for which there are generally accepted ranges that represent good and bad performance from a SQL Server perspective. Disk queue metrics, for example, are not accurate for many SAN systems, and there are also no universally agreed upon good and bad ranges for SQL Server. The following ranges can be used as a general guideline to determine whether disk latency is acceptable:

  • Less than 10ms - Fast *
  • Between 10ms - 20ms - Acceptable
  • Between 20ms - 50ms - Slow
  • Greater than 50ms - Critical
* For transaction log writes, between 0ms and 2ms is desirable.
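The latency guideline above can be expressed as a small lookup, sketched here in Python. The function names are hypothetical, and because the listed ranges share their boundary values (20ms and 50ms), assigning each boundary to the better of the two categories is an assumption made for this example.

```python
def classify_disk_latency(latency_ms: float) -> str:
    """Map an average read or write latency in ms to the guideline ranges."""
    if latency_ms < 0:
        raise ValueError("latency cannot be negative")
    if latency_ms < 10:
        return "Fast"
    if latency_ms <= 20:   # assumption: exactly 20ms counts as Acceptable
        return "Acceptable"
    if latency_ms <= 50:   # assumption: exactly 50ms counts as Slow
        return "Slow"
    return "Critical"


def log_write_is_desirable(latency_ms: float) -> bool:
    """Transaction log writes: between 0ms and 2ms is desirable."""
    return 0 <= latency_ms <= 2
```

For example, an average read latency of 35ms would be classified as Slow, and a 1.5ms transaction log write falls within the desirable range.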

Note:  The Disk Activity tab shows you the actual layout of controllers, disks, and database files, and detailed latency/volume metrics for each. Whereas the Dashboard gives you at-a-glance and historical trending information for latency, the Disk Activity tab should be used for more in-depth troubleshooting of high latency issues.

Mode: S, H
Type:Avg ms/Read
Disk IO Write latency by physical disk The average time in milliseconds each physical disk write is taking. See Disk IO: Read latency above for details.
Mode: S, H
Type:Avg ms/Write

Hyper-V Host Metrics

Section Metric Description
Network (VM) Utilization by Virtual Machine Guest The data transfer rate in Mb per second for each guest, enabling you to see how much of the network traffic is related to each individual guest machine.
Mode: S, H
Type:Mbps
CPU Usage (VM) Usage time by Guest The usage time percentage for each individual guest machine. Represented by the color selected for each individual Virtual Machine guest (consistent over the dashboard section).
Mode: S, H
Type:Percent
CPU Usage (VM) vCPU Root usage time The usage percentage time for the Hypervisor root virtual processor.

Additional Information: See Also: BizTalk Server 2006 R2 Retired Technical documentation 
Mode: S, H
Type:Percent
CPU Usage (VM) Percent Guest Runtime For the Hyper-V Hypervisor Logical Processor, the average percentage of time guest code is running on a logical processor.

Additional Information: See Also: Performance Management: Monitoring CPU Resources
Mode: S, H
Type:Percent
CPU Usage (VM) vCPU Wait Time The percentage time the guest virtual machine spent waiting for a host kernel resource.

Additional Information: See Also: Hyper-V Equivalent to VMware CPU Ready Time
Mode: S, H
Type:Percent
System Memory (VM) Memory by Virtual Machine Guest The amount of physical memory used by each guest on the host machine.
Mode: S, H
Type:MB
Disk IO (VM) IOPS (Read) The number of read operations per second across all virtual machines belonging to the host. In Sample mode read operations are broken down by individual virtual machine. Use the drop-down menu to select an individual guest machine.
Mode: S, H
Type:Reads/sec
Disk IO (VM) IOPS (Write) The number of write operations per second across all virtual machines belonging to the host. In Sample mode write operations are broken down by individual virtual machine. Use the drop-down menu to select an individual guest machine.
Mode: S, H
Type:Writes/sec
Disk IO (VM) Read Throughput The average MB per second read across all virtual machines belonging to the host. Use the drop-down menu to select an individual guest machine.
Mode: S, H
Type:MB/sec
Disk IO (VM) Write Throughput The average MB per second written across all virtual machines belonging to the host. Use the drop-down menu to select an individual guest machine.
Mode: S, H
Type:MB/sec

* Metric will ONLY be visible for targets that are Hyper-V Hosts
** Metric will ONLY be visible for targets that are VM guests whose host is also watched