Microsoft SQL Server
Plugin: go.d.plugin Module: mssql
Overview
This collector monitors the health and performance of Microsoft SQL Server instances.
It collects metrics from:
- Performance counters (buffer manager, memory manager, SQL statistics)
- Dynamic management views (DMVs) for wait statistics, locks, and sessions
- Per-database transaction and lock statistics
- SQL Server Agent job status
It connects to the SQL Server instance via TCP using the go-mssqldb driver and executes queries against:
- sys.dm_os_performance_counters - Performance counter values
- sys.dm_exec_sessions - Connection information
- sys.dm_os_wait_stats - Wait statistics
- sys.dm_tran_locks - Lock information
- sys.dm_io_virtual_file_stats - I/O stall (latency) statistics
- sys.dm_os_process_memory - SQL Server process memory
- sys.dm_os_sys_memory - OS physical memory and page file
- sys.master_files - Database file sizes
- msdb.dbo.sysjobs - SQL Agent job status
This collector is supported on all platforms.
This collector supports collecting metrics from multiple instances of this integration, including remote instances.
The monitoring user requires the VIEW SERVER STATE permission to access DMVs.
For SQL Agent job monitoring (queried during collector startup), access to
msdb.dbo.sysjobs is required.
Default Behavior
Auto-Detection
By default, it tries to connect to SQL Server on localhost:1433 without authentication. You must configure proper credentials for monitoring.
Limits
The default configuration for this integration does not impose any limits on data collection.
Performance Impact
The collector executes lightweight queries against system views. Most queries complete in milliseconds and have minimal impact on server performance.
Metrics
Metrics grouped by scope.
The scope defines the instance that the metric belongs to. An instance is uniquely identified by a set of labels.
Per Microsoft SQL Server instance
These metrics refer to the entire SQL Server instance.
This scope has no labels.
Metrics:
| Metric | Dimensions | Unit | SQL Server 2016+ | Azure SQL Database |
|---|---|---|---|---|
| mssql.user_connections | user | connections | • | • |
| mssql.session_connections | user, internal | connections | • | • |
| mssql.blocked_processes | blocked | processes | • | • |
| mssql.batch_requests | batch | requests/s | • | • |
| mssql.compilations | compilations | compilations/s | • | • |
| mssql.recompilations | recompilations | recompilations/s | • | • |
| mssql.auto_param_attempts | total, safe, failed | attempts/s | • | • |
| mssql.sql_errors | errors | errors/s | • | • |
| mssql.buffer_cache_hit_ratio | hit_ratio | percentage | • | • |
| mssql.buffer_page_life_expectancy | life_expectancy | seconds | • | • |
| mssql.buffer_page_iops | read, written | pages/s | • | • |
| mssql.buffer_checkpoint_pages | flushed | pages/s | • | • |
| mssql.buffer_page_lookups | lookups | lookups/s | • | • |
| mssql.buffer_lazy_writes | lazy_writes | writes/s | • | • |
| mssql.memory_total | memory | bytes | • | • |
| mssql.memory_connection | memory | bytes | • | • |
| mssql.memory_pending_grants | pending | processes | • | • |
| mssql.memory_external_benefit | benefit | benefit | • | • |
| mssql.page_splits | page | splits/s | • | • |
| mssql.process_memory_resident | resident | bytes | • | • |
| mssql.process_memory_virtual | virtual | bytes | • | • |
| mssql.process_memory_utilization | utilization | percentage | • | • |
| mssql.process_page_faults | page_faults | faults | • | • |
| mssql.os_memory | used, available | bytes | • | • |
| mssql.os_pagefile | used, available | bytes | • | • |
Per database
These metrics refer to individual databases.
Labels:
| Label | Description |
|---|---|
| database | Database name |
Metrics:
| Metric | Dimensions | Unit | SQL Server 2016+ | Azure SQL Database |
|---|---|---|---|---|
| mssql.database_active_transactions | active | transactions | • | • |
| mssql.database_transactions | transactions | transactions/s | • | • |
| mssql.database_write_transactions | write | transactions/s | • | • |
| mssql.database_log_flushes | flushes | flushes/s | • | • |
| mssql.database_log_flushed | flushed | bytes/s | • | • |
| mssql.database_log_growths | growths | growths | • | • |
| mssql.database_io_stall | read, write | ms | • | • |
| mssql.database_data_file_size | size | bytes | • | • |
| mssql.database_backup_restore_throughput | throughput | bytes/s | • | • |
| mssql.database_state | online, restoring, recovering, pending, suspect, emergency, offline | state | • | • |
| mssql.database_read_only | read_only, read_write | status | • | • |
Per lock stats
These metrics refer to lock statistics by lock resource type (from performance counters).
Labels:
| Label | Description |
|---|---|
| resource | Lock resource type (Database, File, Object, Page, Key, Extent, RID, HoBT, etc.) |
Metrics:
| Metric | Dimensions | Unit | SQL Server 2016+ | Azure SQL Database |
|---|---|---|---|---|
| mssql.lock_stats_deadlocks | deadlocks | deadlocks/s | • | • |
| mssql.lock_stats_waits | waits | waits/s | • | • |
| mssql.lock_stats_timeouts | timeouts | timeouts/s | • | • |
| mssql.lock_stats_requests | requests | requests/s | • | • |
Per lock resource
These metrics refer to lock resource types (from sys.dm_tran_locks).
Labels:
| Label | Description |
|---|---|
| resource | Lock resource type (Database, File, Object, Page, Key, etc.) |
Metrics:
| Metric | Dimensions | Unit | SQL Server 2016+ | Azure SQL Database |
|---|---|---|---|---|
| mssql.locks_by_resource | locks | locks | • | • |
Per wait type
These metrics refer to individual wait types (from sys.dm_os_wait_stats).
Labels:
| Label | Description |
|---|---|
| wait_type | Wait type name |
| wait_category | Wait category (CPU, Lock, Latch, Buffer IO, etc.) |
Metrics:
| Metric | Dimensions | Unit | SQL Server 2016+ | Azure SQL Database |
|---|---|---|---|---|
| mssql.wait_total_time | duration | ms | • | • |
| mssql.wait_resource_time | duration | ms | • | • |
| mssql.wait_signal_time | duration | ms | • | • |
| mssql.wait_max_time | max_time | ms | • | • |
| mssql.wait_count | waits | waits/s | • | • |
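The counters in sys.dm_os_wait_stats are cumulative, and its wait_time_ms column already includes signal_wait_time_ms. As a rough illustration only (not the collector's actual code), the Python sketch below shows how per-interval figures and the resource-only portion of the wait time could be derived from two snapshots of one wait type. The WaitSnapshot type, the interval_delta helper, and the sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WaitSnapshot:
    """One row of sys.dm_os_wait_stats for a single wait type (cumulative values)."""
    waiting_tasks_count: int
    wait_time_ms: int          # total wait time, includes signal wait time
    signal_wait_time_ms: int   # time spent runnable, waiting for a CPU


def interval_delta(prev: WaitSnapshot, curr: WaitSnapshot, seconds: float) -> dict:
    """Derive per-interval wait figures from two cumulative snapshots."""
    total = curr.wait_time_ms - prev.wait_time_ms
    signal = curr.signal_wait_time_ms - prev.signal_wait_time_ms
    return {
        "total_ms": total,
        "signal_ms": signal,
        "resource_ms": total - signal,  # time spent waiting on the resource itself
        "waits_per_s": (curr.waiting_tasks_count - prev.waiting_tasks_count) / seconds,
    }


# Hypothetical sample values taken 10 seconds apart.
prev = WaitSnapshot(waiting_tasks_count=1000, wait_time_ms=50_000, signal_wait_time_ms=5_000)
curr = WaitSnapshot(waiting_tasks_count=1050, wait_time_ms=50_800, signal_wait_time_ms=5_100)
delta = interval_delta(prev, curr, seconds=10)
print(delta)
```

A high signal share relative to the total usually points at CPU pressure, while a high resource share points at the waited-on resource (locks, I/O, latches).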
Per job
These metrics refer to SQL Server Agent jobs.
Labels:
| Label | Description |
|---|---|
| job_name | Job name |
Metrics:
| Metric | Dimensions | Unit | SQL Server 2016+ | Azure SQL Database |
|---|---|---|---|---|
| mssql.job_status | enabled, disabled | status | • | • |
Per replication
These metrics refer to SQL Server replication publications.
Labels:
| Label | Description |
|---|---|
| publisher_db | Publisher database name |
| publication | Publication name |
Metrics:
| Metric | Dimensions | Unit | SQL Server 2016+ | Azure SQL Database |
|---|---|---|---|---|
| mssql.replication_status | started, succeeded, in_progress, idle, retrying, failed | status | • | • |
| mssql.replication_warning | expiration, latency, merge_expiration, merge_slow_duration, merge_fast_duration, merge_fast_speed, merge_slow_speed | flags | • | • |
| mssql.replication_latency | average, best, worst | seconds | • | • |
| mssql.replication_subscriptions | total, agents_running | subscriptions | • | • |
Functions
This collector exposes real-time functions for interactive troubleshooting in the Top tab.
Top Queries
Retrieves aggregated SQL query performance metrics from Microsoft SQL Server Query Store runtime statistics.
This function queries sys.query_store_runtime_stats and related views across all databases with Query Store enabled, aggregating execution statistics by query hash. It provides comprehensive timing, I/O, memory, and parallelism metrics.
Use cases:
- Identify slow or resource-intensive queries consuming excessive CPU time or memory
- Analyze I/O patterns (logical reads, physical reads, writes) to detect bottlenecks
- Monitor parallelism (DOP) and tempdb usage for capacity planning
Query text is truncated at 4096 characters for display purposes. Columns are detected dynamically based on the SQL Server version (some metrics are only available in 2016+ or 2017+).
| Aspect | Description |
|---|---|
| Name | Mssql:top-queries |
| Require Cloud | yes |
| Performance | Executes dynamic SQL to aggregate Query Store data across all enabled databases: • Execution time depends on Query Store workload and number of monitored databases • Default limit of 500 rows balances completeness with performance |
| Security | Query text may contain unmasked literal values including potentially sensitive data: • Personal information in WHERE clauses or INSERT values • Business data and internal identifiers • Access should be restricted to authorized personnel only |
| Availability | Available when: • The collector has successfully connected to SQL Server • Query Store is enabled on at least one user database • Returns HTTP 503 if collector is still initializing • Returns HTTP 500 if the query fails • Returns HTTP 504 if the query times out |
Prerequisites
Enable Query Store
Query Store must be enabled on each database you want to monitor.
- Verify Query Store is enabled on your databases:

      SELECT name, is_query_store_on
      FROM sys.databases
      WHERE name NOT IN ('master', 'tempdb', 'model', 'msdb');

- Enable Query Store on databases where it is disabled:

      ALTER DATABASE [YourDatabaseName] SET QUERY_STORE = ON;

- Enable the function in Netdata collector config:

      jobs:
        - name: local
          dsn: "sqlserver://user:pass@localhost:1433"
          query_store_function_enabled: true

Notes:
- Query Store is available in SQL Server 2016+ and Azure SQL Database
- Requires ALTER DATABASE permission to enable Query Store
- System databases (master, tempdb, model, msdb) are excluded from queries
Parameters
| Parameter | Type | Description | Required | Default | Options |
|---|---|---|---|---|---|
| Filter By | select | Select the primary sort column. The available options depend on your SQL Server version and include metrics like total execution time, number of calls, CPU time, logical I/O, memory grants, and more. Default is Total Time to focus on most resource-intensive queries. | yes | totalTime | |
Returns
Aggregated query execution statistics from Query Store runtime views, providing comprehensive performance analysis across all monitored databases. Each row represents a unique query pattern (normalized query hash) with cumulative metrics across all its executions.
| Column | Type | Unit | Visibility | Description |
|---|---|---|---|---|
| Query Hash | string | | hidden | Unique hash identifier for the normalized query pattern. Queries with identical structure but different literal values share the same hash. |
| Query | string | | | The SQL query text, including literal values, truncated at 4096 characters. Use this to identify the actual SQL being executed and spot parameterized queries or injection risks. |
| Database | string | | | Database name where the query was executed. Essential for multi-database analysis to identify which database is experiencing query load. |
| Calls | integer | | | Total number of times this query pattern has been executed. High values indicate frequently run queries that may impact server performance significantly. |
| Error Attribution | string | | | Status of error detail attribution for this query. Values: enabled, no_data, not_enabled, not_supported. |
| Error Number | integer | | | Most recent error number observed for this query (when error attribution is enabled). |
| Error State | integer | | hidden | SQL Server error state for the most recent error (when error attribution is enabled). |
| Error Message | string | | | Most recent error message for this query (when error attribution is enabled). |
| Hash Match Joins | integer | | | Count of Hash Match join operators across all stored plans for this query. |
| Merge Joins | integer | | | Count of Merge Join operators across all stored plans for this query. |
| Nested Loops | integer | | | Count of Nested Loops operators across all stored plans for this query. |
| Sorts | integer | | | Count of Sort operators across all stored plans for this query. |
| Total Time | duration | milliseconds | | Cumulative execution time across all query executions. This is a key metric for identifying the most resource-intensive queries in terms of total server time consumption. |
| Avg Time | duration | milliseconds | | Average execution time per query run, calculated as a weighted average when the execution count is greater than zero. Compare with Total Time to determine whether individual executions or high frequency drive resource usage. |
| Last Time | duration | milliseconds | hidden | Execution time of the most recent execution for this query pattern. Useful for identifying recent performance changes or individual outlier executions. |
| Min Time | duration | milliseconds | hidden | Minimum execution time observed. Helps identify variability in query performance and spot potential optimization opportunities for outliers. |
| Max Time | duration | milliseconds | hidden | Maximum execution time observed. Large gaps between Min Time and Max Time may indicate performance instability due to parameter sniffing, data skew, or lock contention. |
| StdDev Time | duration | milliseconds | hidden | Standard deviation of execution time. High values indicate inconsistent query performance, making capacity planning difficult and suggesting need for query optimization or consistent indexing. |
| Avg CPU | duration | milliseconds | | Average CPU time consumed per query execution. High values indicate CPU-intensive operations that may include complex calculations, string manipulations, or excessive function calls. Available in SQL Server 2016+. |
| Last CPU | duration | milliseconds | hidden | CPU time of the most recent execution. Useful for identifying recent changes in query patterns and resource usage. |
| Min CPU | duration | milliseconds | hidden | Minimum CPU time observed. Helps identify variability in CPU consumption and spot efficient vs. inefficient query executions. |
| Max CPU | duration | milliseconds | hidden | Maximum CPU time observed. Spikes may indicate complex queries, large result sets, or parallelism issues. |
| StdDev CPU | duration | milliseconds | hidden | Standard deviation of CPU time. High variability suggests inconsistent performance due to varying data volumes, plan cache hit rates, or changing execution contexts. |
| Avg Logical Reads | float | | | Average number of logical read operations (8KB pages) per execution. High values indicate queries scanning large amounts of data through indexes or table scans. Monitor for I/O subsystem impact. |
| Last Logical Reads | integer | | hidden | Logical reads from the most recent execution. Useful for identifying immediate query patterns and recent performance changes. |
| Min Logical Reads | integer | | hidden | Minimum logical reads observed. Helps identify data access patterns and spot outliers. |
| Max Logical Reads | integer | | hidden | Maximum logical reads observed. Very high values may indicate full table scans, missing indexes, or inefficient join operations requiring excessive data access. |
| StdDev Logical Reads | float | | hidden | Standard deviation of logical reads. High variability suggests inconsistent access patterns, potentially indicating performance issues with certain queries or data volumes. |
| Avg Logical Writes | float | | | Average number of logical write operations per execution. High values indicate heavy write workloads that may benefit from batching or optimization. |
| Last Logical Writes | integer | | hidden | Logical writes from the most recent execution. Helps track recent write activity and identify immediate performance impact. |
| Min Logical Writes | integer | | hidden | Minimum logical writes observed. Helps identify read-heavy vs. write-heavy query patterns and data access characteristics. |
| Max Logical Writes | integer | | hidden | Maximum logical writes observed. Spikes may indicate bulk insert/update operations, large transactions, or data migration activities. |
| StdDev Logical Writes | float | | hidden | Standard deviation of logical writes. High values indicate write performance variability, potentially suggesting inconsistent transaction sizes or periodic bulk operations. |
| Avg Physical Reads | float | | | Average number of physical read operations from storage per execution. High values indicate queries requiring substantial disk I/O for data retrieval, potentially due to full table scans or missing covering indexes. |
| Last Physical Reads | integer | | hidden | Physical reads from the most recent execution. Useful for identifying immediate I/O patterns and recent storage subsystem pressure. |
| Min Physical Reads | integer | | hidden | Minimum physical reads observed. Helps baseline I/O patterns and identify read-intensive query scenarios. |
| Max Physical Reads | integer | | hidden | Maximum physical reads observed. Extremely high values may indicate storage subsystem bottlenecks, full table scans without covering indexes, or queries processing very large data volumes. |
| StdDev Physical Reads | float | | hidden | Standard deviation of physical reads. High variability suggests inconsistent disk access patterns, potentially indicating intermittent I/O performance issues or storage contention. |
| Avg CLR Time | duration | milliseconds | | Average CLR (Common Language Runtime) time per execution. High values indicate managed code (stored procedures, functions, triggers) with heavy computations, garbage collection pressure, or inefficient memory allocations. Available in SQL Server 2016+. |
| Last CLR Time | duration | milliseconds | hidden | CLR time of the most recent execution. Useful for identifying recent managed code performance changes and detecting inefficient code deployments. |
| Min CLR Time | duration | milliseconds | hidden | Minimum CLR time observed. Helps identify efficient managed code executions and spot expensive CLR operations. |
| Max CLR Time | duration | milliseconds | hidden | Maximum CLR time observed. Spikes may indicate complex managed code operations, large object allocations, or expensive .NET framework method calls. |
| StdDev CLR Time | duration | milliseconds | hidden | Standard deviation of CLR time. High variability suggests inconsistent managed code execution patterns, potentially varying by execution parameters, data volumes, or different code paths being taken. |
| Avg DOP | float | | | Average Degree of Parallelism (DOP) per query. Higher values indicate queries utilizing more CPU cores through parallelism, potentially consuming significant server resources. Values above 1 indicate intra-query parallelism; values of 1 indicate serial execution. |
| Last DOP | integer | | hidden | DOP of the most recent execution. Helps track recent parallelism patterns and identify changes in query execution behavior. |
| Min DOP | integer | | hidden | Minimum DOP observed. Values of 1 indicate serial execution; values above 1 indicate that every recorded execution ran in parallel. |
| Max DOP | integer | | hidden | Maximum DOP observed. Very high values (>4) may indicate aggressive parallelism consuming excessive resources and potentially affecting concurrent workloads. Available in SQL Server 2016+. |
| StdDev DOP | float | | hidden | Standard deviation of DOP. High variability suggests inconsistent parallelism patterns across executions, potentially indicating performance variability based on data characteristics or query complexity. |
| Avg Memory (8KB pages) | float | | | Average memory grant (in 8KB pages) per execution. High values indicate memory-intensive queries that may benefit from index optimization, reduced result sets, or query tuning to reduce working memory usage. |
| Last Memory (8KB pages) | integer | | hidden | Memory grant from the most recent execution. Useful for identifying recent memory pressure and tracking immediate impact of resource-intensive queries. |
| Min Memory (8KB pages) | integer | | hidden | Minimum memory grant observed. Helps identify memory-efficient queries and baseline memory requirements for common operations. |
| Max Memory (8KB pages) | integer | | hidden | Maximum memory grant observed. Spikes may indicate queries with large sort operations, hash joins, temporary table creation, or excessive parameter lengths consuming working memory. |
| StdDev Memory | float | | hidden | Standard deviation of memory grants. High variability suggests inconsistent memory usage patterns, potentially varying by execution parameters, result set sizes, or different code paths being executed. |
| Avg Rows | float | | | Average number of rows processed per query execution. High values indicate queries returning large result sets that may consume significant network bandwidth, memory for result buffers, and client application resources. |
| Last Rows | integer | | hidden | Row count from the most recent execution. Helps identify recent query patterns and track immediate data processing requirements. |
| Min Rows | integer | | hidden | Minimum rows observed. Helps identify data access patterns and spot outliers in result set sizes. |
| Max Rows | integer | | hidden | Maximum rows observed. Extremely high values may indicate full table scans without WHERE clauses, missing or inefficient filters, or data export operations. |
| StdDev Rows | float | | hidden | Standard deviation of rows processed. High variability suggests inconsistent result set sizes, potentially due to varying query filters, parameterized inputs, or different data distributions across executions. |
| Avg Log Bytes | float | | | Average transaction log bytes written per query execution (SQL Server 2017+). High values indicate write-intensive operations (INSERT/UPDATE/DELETE), large transactions, or bulk modifications. This measures write-ahead log (WAL) activity, not diagnostic logging. |
| Last Log Bytes | integer | | hidden | Transaction log bytes from the most recent execution. Useful for tracking recent write activity. |
| Min Log Bytes | integer | | hidden | Minimum transaction log bytes observed. Helps identify write-efficient queries and baseline requirements. |
| Max Log Bytes | integer | | hidden | Maximum transaction log bytes observed. Spikes may indicate bulk operations, large transactions, or queries affecting many rows. |
| StdDev Log Bytes | float | | hidden | Standard deviation of transaction log bytes. High variability suggests inconsistent write patterns, potentially varying by the number of rows affected or transaction sizes. |
| Avg TempDB (8KB pages) | float | | | Average tempdb space usage (in 8KB pages) per execution. High values indicate queries that create or use large temporary objects, work tables, or sort operations, or that spill heavily to tempdb. High tempdb usage can lead to disk I/O contention and overall performance degradation. |
| Last TempDB (8KB pages) | integer | | hidden | Tempdb space from the most recent execution. Useful for identifying recent tempdb pressure and tracking immediate disk I/O impact of resource-intensive queries. |
| Min TempDB (8KB pages) | integer | | hidden | Minimum tempdb space observed. Helps identify tempdb-efficient queries and baseline temporary object requirements for common operations. |
| Max TempDB (8KB pages) | integer | | hidden | Maximum tempdb space observed. Spikes may indicate queries with large sort operations, hash joins, index spool usage, or temporary table creation consuming substantial tempdb space. Can lead to tempdb autogrow and disk space issues. |
| StdDev TempDB | float | | hidden | Standard deviation of tempdb space usage. High variability suggests inconsistent temporary object usage patterns, potentially varying by query complexity, parameter types, or different data access patterns affecting temporary object creation. |
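The memory grant and tempdb columns are reported in 8KB pages rather than bytes. A minimal Python sketch of the conversion, assuming you have copied a value out of the table by hand; the pages_to_mib helper and the sample reading are hypothetical and not part of Netdata.

```python
PAGE_SIZE_BYTES = 8 * 1024  # SQL Server pages are 8 KB


def pages_to_mib(pages: float) -> float:
    """Convert a count of 8 KB pages to mebibytes."""
    return pages * PAGE_SIZE_BYTES / (1024 * 1024)


# Hypothetical "Avg Memory (8KB pages)" reading of 12800 pages.
print(pages_to_mib(12800))  # 100.0
```

As a rule of thumb, 128 pages correspond to 1 MiB.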
Deadlock Info
Retrieves the most recent deadlock event from SQL Server's system_health Extended Events ring buffer (xml_deadlock_report).
The deadlock graph XML is parsed to attribute the deadlock to the participating processes and their query text, lock mode, lock status, and wait resource.
Use cases:
- Identify which process was chosen as the deadlock victim
- Inspect the waiting resource and lock mode involved in the deadlock
- Correlate deadlocks with recent application changes or deployments
Query text and wait resource strings are truncated at 4096 characters for display purposes.
| Aspect | Description |
|---|---|
| Name | Mssql:deadlock-info |
| Require Cloud | yes |
| Performance | Executes on-demand queries against the system_health ring buffer: • Not part of regular metric collection • Overhead is limited to function execution time and XML parsing |
| Security | Query text and wait resource strings may include unmasked literal values including sensitive data (PII/secrets): • SQL literals such as emails, IDs, or tokens • Schema and table names that may be sensitive in some environments • Restrict dashboard access to authorized personnel only |
| Availability | Available when: • The collector has successfully connected to SQL Server • deadlock_info_function_enabled is true • The account has VIEW SERVER STATE permission • Returns HTTP 200 with empty data when no deadlock is found • Returns HTTP 403 when permission is missing • Returns HTTP 500 if the query fails • Returns HTTP 561 when the deadlock graph cannot be parsed • Returns HTTP 503 if the collector is still initializing or the function is disabled • Returns HTTP 504 if the query times out |
Prerequisites
- Ensure the account has the required permission:

      GRANT VIEW SERVER STATE TO [netdata];

- Enable the function in Netdata collector config:

      jobs:
        - name: local
          dsn: "sqlserver://user:pass@localhost:1433"
          deadlock_info_function_enabled: true

- Verify the deadlock source is accessible:

      SELECT name
      FROM sys.dm_xe_sessions
      WHERE name = 'system_health';
Parameters
This function has no parameters.
Returns
Parsed deadlock participants from the latest detected deadlock event. Each row represents one process involved in the deadlock.
| Column | Type | Unit | Visibility | Description |
|---|---|---|---|---|
| Row ID | string | | hidden | Unique row identifier composed of deadlock ID and process ID. |
| Deadlock ID | string | | | Identifier for the deadlock event, derived from the deadlock timestamp to group participating processes. |
| Timestamp | timestamp | | | Timestamp of the deadlock event from the ring buffer when available; otherwise the function execution time. |
| Process ID | string | | | Deadlock graph process identifier for the process involved in the deadlock. |
| SPID | integer | | | SQL Server session ID (SPID) for the process when available. |
| ECID | integer | | | Execution context ID (ECID) for parallel execution contexts when available. |
| Victim | string | | | "true" when the process was chosen as the deadlock victim and rolled back; otherwise "false". |
| Query | string | | | SQL query text for the process involved in the deadlock. Truncated to 4096 characters. |
| Lock Mode | string | | | Lock mode reported for the process within the deadlock graph (for example X or S). |
| Lock Status | string | | | Lock status for the process. WAITING indicates the process was waiting on a lock. |
| Wait Resource | string | | | Lock resource identifier from the deadlock graph showing what the process was waiting on. |
| Database | string | | | Database name mapped from the deadlock graph database ID when available. |
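To make the mapping from the deadlock graph to these columns more concrete, here is a small Python sketch that parses a heavily simplified xml_deadlock_report document. It is an illustration under assumptions, not the collector's parser: the sample XML is hand-written and omits most attributes of a real graph, and the parse_deadlock helper and its output keys are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified sample; real deadlock graphs carry many more attributes.
SAMPLE_DEADLOCK_XML = """
<deadlock>
  <victim-list>
    <victimProcess id="process2" />
  </victim-list>
  <process-list>
    <process id="process1" spid="55" ecid="0" lockMode="X"
             waitresource="KEY: 5:72057594043170816 (8194443284a0)">
      <inputbuf>UPDATE dbo.Orders SET status = 1 WHERE id = 10</inputbuf>
    </process>
    <process id="process2" spid="58" ecid="0" lockMode="S"
             waitresource="KEY: 5:72057594043236352 (61a06abd401c)">
      <inputbuf>SELECT * FROM dbo.Customers WHERE id = 7</inputbuf>
    </process>
  </process-list>
</deadlock>
"""

MAX_TEXT_LEN = 4096  # same truncation limit the function documents


def parse_deadlock(xml_text: str) -> list:
    """Return one dict per process that took part in the deadlock."""
    root = ET.fromstring(xml_text.strip())
    victims = {v.get("id") for v in root.findall("./victim-list/victimProcess")}
    rows = []
    for proc in root.findall("./process-list/process"):
        rows.append({
            "process_id": proc.get("id"),
            "spid": int(proc.get("spid")),
            "ecid": int(proc.get("ecid")),
            "victim": proc.get("id") in victims,
            "lock_mode": proc.get("lockMode"),
            "wait_resource": (proc.get("waitresource") or "")[:MAX_TEXT_LEN],
            "query": (proc.findtext("inputbuf") or "").strip()[:MAX_TEXT_LEN],
        })
    return rows


rows = parse_deadlock(SAMPLE_DEADLOCK_XML)
for row in rows:
    print(row["process_id"], row["spid"], row["lock_mode"], "victim" if row["victim"] else "survivor")
```

The victim is identified by matching each process id against the victim-list, which is why a row can only be marked as the victim when the graph parses cleanly.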
Error Info
Retrieves recent SQL errors from a user-managed Extended Events session that captures sqlserver.error_reported
with the sql_text and query_hash actions (query_hash enables reliable mapping to top-queries).
| Aspect | Description |
|---|---|
| Name | Mssql:error-info |
| Require Cloud | yes |
| Performance | Executes on-demand queries against the configured Extended Events ring buffer: • Not part of regular metric collection • Overhead is limited to function execution time and XML parsing |
| Security | Error messages and query text may include unmasked literal values including sensitive data (PII/secrets): • Restrict dashboard access to authorized personnel only |
| Availability | Available when: • The collector has successfully connected to SQL Server • error_info_function_enabled is true • The Extended Events session exists and has a ring_buffer target • The account has VIEW SERVER STATE permission • Returns HTTP 200 with empty data when no errors are found • Returns HTTP 403 when permission is missing • Returns HTTP 500 if the query fails • Returns HTTP 503 if the session is not enabled or the function is disabled • Returns HTTP 504 if the query times out |
Prerequisites
- Create an Extended Events session (admin-controlled) that captures sqlserver.error_reported with sql_text and query_hash:

      CREATE EVENT SESSION [netdata_errors] ON SERVER
      ADD EVENT sqlserver.error_reported(
          ACTION(sqlserver.sql_text, sqlserver.query_hash)
      )
      ADD TARGET package0.ring_buffer;
      GO
      ALTER EVENT SESSION [netdata_errors] ON SERVER STATE = START;

- Ensure the account has the required permission:

      GRANT VIEW SERVER STATE TO [netdata];

- Enable the function and (optionally) set the session name in Netdata config:

      jobs:
        - name: local
          dsn: "sqlserver://user:pass@localhost:1433"
          error_info_function_enabled: true
          error_info_session_name: netdata_errors
Parameters
This function has no parameters.
Returns
Recent error events from the configured Extended Events session.
| Column | Type | Unit | Visibility | Description |
|---|---|---|---|---|
| Timestamp | timestamp | | | Timestamp of the error event. |
| Error Number | integer | | | SQL Server error number. |
| Error State | integer | | | SQL Server error state. |
| Error Message | string | | | Error message text. |
| Query | string | | | SQL text captured with the error event. |
| Query Hash | string | | hidden | Query hash captured with the error event (used for mapping into top-queries). |
Alerts
There are no alerts configured by default for this integration.
Setup
You can configure the mssql collector in two ways:
| Method | Best for | How to |
|---|---|---|
| UI | Fast setup without editing files | Go to Nodes → Configure this node → Collectors → Jobs, search for mssql, then click + to add a job. |
| File | If you prefer configuring via file, or need to automate deployments (e.g., with Ansible) | Edit go.d/mssql.conf and add a job. |
UI configuration requires a paid Netdata Cloud plan.
Prerequisites
Create monitoring user
Create a SQL Server login with VIEW SERVER STATE permission:
-- Create login
CREATE LOGIN netdata_user WITH PASSWORD = 'YourStrongPassword!';
-- Grant VIEW SERVER STATE (required for DMVs)
GRANT VIEW SERVER STATE TO netdata_user;
-- Grant access to msdb for SQL Agent job monitoring (required)
USE msdb;
CREATE USER netdata_user FOR LOGIN netdata_user;
GRANT SELECT ON dbo.sysjobs TO netdata_user;
-- Optional: Grant access to distribution database for replication monitoring
-- (only if replication is configured)
USE distribution;
CREATE USER netdata_user FOR LOGIN netdata_user;
GRANT SELECT ON dbo.MSreplication_monitordata TO netdata_user;
GRANT SELECT ON dbo.MSpublications TO netdata_user;
GRANT SELECT ON dbo.MSsubscriptions TO netdata_user;
Required permissions:
- VIEW SERVER STATE - Access to dynamic management views
- SELECT on msdb.dbo.sysjobs - SQL Agent job status monitoring
Optional permissions:
- SELECT on distribution.dbo.MSreplication_monitordata - Replication monitoring
- SELECT on distribution.dbo.MSpublications - Publication information
- SELECT on distribution.dbo.MSsubscriptions - Subscription counts
Configuration
Options
The following options can be defined globally: update_every, autodetection_retry.
Config options
| Group | Option | Description | Default | Required |
|---|---|---|---|---|
| Collection | update_every | Data collection interval (seconds). | 10 | no |
| | autodetection_retry | Autodetection retry interval (seconds). Set 0 to disable. | 0 | no |
| Target | dsn | SQL Server DSN (Data Source Name). See DSN syntax. | sqlserver://localhost:1433 | yes |
| | timeout | Query timeout (seconds). | 5 | no |
| Query Store | query_store_function_enabled | Enable the Query Store function to expose top queries via Netdata Functions. WARNING: Query Store may contain unmasked literal values (customer names, emails, IDs). Only enable after ensuring proper access controls to the Netdata dashboard. | no | no |
| | query_store_time_window_days | Number of days of Query Store data to analyze. Set to 0 to include all available data. Smaller values improve query performance but show less history. | 7 | no |
| Virtual Node | vnode | Associates this data collection job with a Virtual Node. | | no |
via UI
Configure the mssql collector from the Netdata web interface:
- Go to Nodes.
- Select the node where you want the mssql data-collection job to run and click the ⚙ (Configure this node). That node will run the data collection.
- The Collectors → Jobs view opens by default.
- In the Search box, type mssql (or scroll the list) to locate the mssql collector.
- Click the + next to the mssql collector to add a new job.
- Fill in the job fields, then click Test to verify the configuration and Submit to save.
- Test runs the job with the provided settings and shows whether data can be collected.
- If it fails, an error message appears with details (for example, connection refused, timeout, or command execution errors), so you can adjust and retest.
via File
The configuration file name for this integration is go.d/mssql.conf.
The file format is YAML. Generally, the structure is:
update_every: 1
autodetection_retry: 0
jobs:
  - name: some_name1
  - name: some_name2
You can edit the configuration file using the edit-config script from the
Netdata config directory.
cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
sudo ./edit-config go.d/mssql.conf
Examples
Basic configuration
Connect to local SQL Server with SQL authentication.
Config
jobs:
  - name: local
    dsn: "sqlserver://netdata_user:password@localhost:1433"
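The placeholder password above contains only URL-safe characters. Because the DSN is a URL, a password containing reserved characters such as @, : or / generally has to be percent-encoded before it is placed in the DSN. The following is a minimal sketch in POSIX shell, not part of the collector: urlencode and build_dsn are hypothetical helper names, and the sketch assumes ASCII-only credentials.

```sh
# Percent-encode every character outside the URL "unreserved" set.
# Assumption: the input is plain ASCII.
urlencode() (
  LC_ALL=C
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}        # everything after the first character
    c=${s%"$rest"}     # the first character itself
    s=$rest
    case $c in
      [A-Za-z0-9._~-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;
    esac
  done
  printf '%s\n' "$out"
)

# Assemble a URL-style DSN: $1=user $2=password $3=host $4=port
build_dsn() {
  printf 'sqlserver://%s:%s@%s:%s\n' \
    "$(urlencode "$1")" "$(urlencode "$2")" "$3" "$4"
}
```

For example, `build_dsn netdata_user 'p@ss:w/rd!' localhost 1433` prints `sqlserver://netdata_user:p%40ss%3Aw%2Frd%21@localhost:1433`, which can then be pasted into the dsn option.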
Windows Authentication
Connect using Windows integrated authentication.
Config
jobs:
  - name: local
    dsn: "sqlserver://localhost:1433?trusted_connection=yes"
Named instance
Connect to a named SQL Server instance.
Config
jobs:
  - name: named_instance
    dsn: "sqlserver://netdata_user:password@localhost/INSTANCENAME"
Remote server
Connect to a remote SQL Server.
Config
jobs:
  - name: remote
    dsn: "sqlserver://netdata_user:password@192.168.1.100:1433"
Multi-instance
Note: When you define multiple jobs, their names must be unique.
Monitor multiple SQL Server instances.
Config
jobs:
  - name: production
    dsn: "sqlserver://netdata_user:password@prod-sql:1433"
  - name: development
    dsn: "sqlserver://netdata_user:password@dev-sql:1433"
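Since job names must be unique, it can help to scan the configuration file for repeated names before restarting Netdata. The sketch below is one possible check in POSIX shell; duplicate_job_names is a hypothetical helper, and it assumes each job name is unquoted and written on its own `- name:` line, as in the examples here.

```sh
# Print every job name that occurs more than once in the given file.
# $1 = path to the configuration file (for example go.d/mssql.conf)
duplicate_job_names() {
  sed -n 's/^[[:space:]]*-[[:space:]]*name:[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p' "$1" \
    | sort \
    | uniq -d
}

# Usage (prints nothing when all names are unique):
#   duplicate_job_names /etc/netdata/go.d/mssql.conf
```

This is plain text matching rather than YAML parsing, so it is only a quick sanity check.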
With Query Store function
Enable the Query Store function to view top queries in the Netdata dashboard.
Warning: Query Store may contain unmasked literal values (PII). Only enable in environments with proper access controls.
Config
jobs:
  - name: local
    dsn: "sqlserver://netdata_user:password@localhost:1433"
    query_store_function_enabled: true
    query_store_time_window_days: 7
Troubleshooting
Debug Mode
Important: Debug mode is not supported for data collection jobs created via the UI using the Dyncfg feature.
To troubleshoot issues with the mssql collector, run the go.d.plugin with the debug option enabled. The output
should give you clues as to why the collector isn't working.
- Navigate to the plugins.d directory, usually at /usr/libexec/netdata/plugins.d/. If that's not the case on your system, open netdata.conf and look for the plugins setting under [directories].
cd /usr/libexec/netdata/plugins.d/
- Switch to the netdata user.
sudo -u netdata -s
- Run the go.d.plugin to debug the collector:
./go.d.plugin -d -m mssql
To debug a specific job:
./go.d.plugin -d -m mssql -j jobName
Getting Logs
If you're encountering problems with the mssql collector, follow these steps to retrieve logs and identify potential issues:
- Run the command specific to your system (systemd, non-systemd, or Docker container).
- Examine the output for any warnings or error messages that might indicate issues. These messages should provide clues about the root cause of the problem.
System with systemd
Use the following command to view logs generated since the last Netdata service restart:
journalctl _SYSTEMD_INVOCATION_ID="$(systemctl show --value --property=InvocationID netdata)" --namespace=netdata --grep mssql
System without systemd
Locate the collector log file, typically at /var/log/netdata/collector.log, and use grep to filter for the collector's name:
grep mssql /var/log/netdata/collector.log
Note: This method shows logs from all restarts. Focus on the latest entries for troubleshooting current issues.
Docker Container
If your Netdata runs in a Docker container named "netdata" (replace if different), use this command:
docker logs netdata 2>&1 | grep mssql
Connection refused
Ensure SQL Server is running and accepting TCP connections on the configured port. Check that the SQL Server Browser service is running if using named instances.
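A quick way to tell a network problem from a SQL Server problem is to test whether the TCP port accepts connections from the Netdata host. The sketch below is one way to do that; check_tcp is a hypothetical helper, and it assumes that bash (for its /dev/tcp feature) and the coreutils timeout command are installed.

```sh
# Return 0 if a TCP connection to $1 (host) on $2 (port) opens within 3 seconds.
check_tcp() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Usage:
#   if check_tcp localhost 1433; then
#     echo "port reachable"
#   else
#     echo "port not reachable"
#   fi
```

A successful check only shows that something is listening on the port. It does not confirm that the listener is SQL Server or that the login will succeed.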
Login failed
Verify the username and password in the DSN are correct. Ensure SQL Server is configured for mixed mode authentication if using SQL logins.
Permission denied
The monitoring user needs VIEW SERVER STATE permission.
Grant it with: GRANT VIEW SERVER STATE TO netdata_user;
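To confirm whether the permission is already in place, you can ask SQL Server directly while logged in as the monitoring user. The sketch below assumes the sqlcmd client is installed and that the password is supplied through the SQLCMDPASSWORD environment variable; view_server_state_query and check_view_server_state are hypothetical helper names.

```sh
# Query that returns 1 if the current login holds VIEW SERVER STATE, 0 otherwise.
view_server_state_query() {
  printf '%s\n' "SET NOCOUNT ON; SELECT HAS_PERMS_BY_NAME(NULL, NULL, 'VIEW SERVER STATE');"
}

# Run the query with sqlcmd.
# $1 = server in sqlcmd form (for example localhost,1433)  $2 = login name
check_view_server_state() {
  sqlcmd -S "$1" -U "$2" -h -1 -W -Q "$(view_server_state_query)"
}

# Usage:
#   SQLCMDPASSWORD='YourStrongPassword!' check_view_server_state localhost,1433 netdata_user
```

If the command prints 0, run the GRANT statement above as an administrator and check again.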