设为首页 收藏本站
查看: 3561|回复: 0

[经验分享] SQL Server健康检查

[复制链接]
累计签到:1 天
连续签到:1 天
发表于 2014-7-18 10:35:59 | 显示全部楼层 |阅读模式


SQL Server健康检查的重要性

对一台新的或不熟悉的服务器,你首先要做的是收集尽可能多有关Server的信息,从硬件和存储子系统的细节,到OS,再到SQL Server实例本身。你需要知道要处理什么事情----配置是否差,存储子系统完全不充分的过时Server,或者,希望一些东西更佳。对于把精力放在正确管理和优化数据库Server,这份信息是一个关键的开始点。作为一个数据库专业人士,实在没有理由不清楚每台数据库Server的硬件及配置信息。

这个章节会通过一套查询带你做SQL Server健康检查,解释其含义及如何理解结果。这些查询从硬件和实例层次上开始,然后让你能够钻取到一个特定数据库,去采集更多特定信息。

很多DBA收集这类信息的一个拦路虎是官僚主义。系统管理员或系统工程师提供并管理实际的数据库Server硬件和OS,SAN管理员常常负责存储子系统。这些人员中,常有职称高于你,而对SQL Server基本不懂的人,成为你收集有关数据库Server重要信息的一大障碍。他们或许会把信息采集看成是入侵他们的领域,因此不愿意配合你。

不管任何官僚主义还是组织障碍,你仍然可以通过技术在SSMS中收集大部分你需要的信息来做比较全面的SQL Server健康检查。其中最有用、易用的技术是使用DM和DMF来收集服务器、实例及数据库有关健康检查的信息。

运行DMV和DMF查询

要运行大多数的DMV和DMF查询,你需要对SQL Server实例有VIEW SERVER STATE权限。

首先你要想准确地找出SQL Server的版本、版次及建立,还想知道是x64还是x86、OS是什么。很简单,直接可以通过如下非DMV查询得到:


    SELECT @@VERSION AS [SQL Server and OS Version Info];  

知道这些信息后,有助于了解SQL Server的可用功能。例如,企业版有数据压缩,而标准版或BI版的SQL Server 2012就不能使用数据压缩。

SQL Server 创建
微软定期发布SQL Server的累积更新(CU-Cumulative Update),每个更新有10~40个热补丁,以集成的安装包发布。

下面这个查询,能够获取数据库Server上有关OS、语言等更多信息:


    SELECT windows_release, windows_service_pack_level,windows_sku, os_language_version FROM sys.dm_os_windows_info WITH (NOLOCK) OPTION (RECOMPILE);  

下面的查询会告诉你有多少逻辑处理器、处理器的超线程比率、有多少物力CPU及多大物理内存。


    -- Hardware information from SQL Server 2012 (new virtual_machine_type_desc)(Cannot distinguish between HT and multi-core)  
    SELECT cpu_count AS [Logical CPU Count], hyperthread_ratio AS [HyperthreadRatio],cpu_count/hyperthread_ratio AS [Physical CPU Count],  
    physical_memory_kb/1024 AS [Physical Memory (MB)],affinity_type_desc, virtual_machine_type_desc, sqlserver_start_time  
    FROM sys.dm_os_sys_info WITH (NOLOCK) OPTION (RECOMPILE);  
    -- Gives you some good basic hardware information about your database server  

下面的查询会读取SQL Server错误日志,以获取厂商及数据库Server的型号(model number):


    -- Get System Manufacturer and model number from SQL Server Error log. This query might take a few seconds  
    -- if you have not recycled your error log recently  
    EXEC xp_readerrorlog 0,1,"Manufacturer";  
    -- This can help you determine the capabilities and capacities of your database server  

知道厂商和型号,你就能找到重要的信息,如有多少处理器插槽、有多少内存插槽,多少及何种类型的PCI-E扩展槽,还会告诉你处理器类型(Intel还是AMD)和处理器是那一代。

下面的查询会返回处理器的描述及来自Windows Registry的额定时钟频率:


    -- Get processor description from Windows Registry  
    EXEC xp_instance_regread 'HKEY_LOCAL_MACHINE','HARDWARE\DESCRIPTION\System\CentralProcessor\0','ProcessorNameString';  
    -- Gives you the model number and rated clock speed of your processor(s)  

知道额定时钟频率非常重要,因为你的处理器可能因为电源管理,不是每次都全速运行。Windows Server 2008(R2)使用Balanced Windows Power Plan。这意味着当处理器不在高负荷的情况下,会降低时钟速率来减少节省电力。当处理器突遇高负载时,它会增加时钟速率以达到全速,但这不会立即发生从而对查询性能有负面影响。当使用默认的Balanced Windows Power Plan,而不是High Performance电源计划,通常OLTP负载下只有20~25%的速率。要避免这样的问题,你首先要确认你的数据库Server使用High Performance电源计划,而不是Balanced电源计划。这个设置可以动态改变,不需要重启Windows。其次,使用cpuid.com提供的CPU-Z免费工具确定实际的时钟速率。如果你使用高性能电源计划且处理器仍然美欧全速运行,那么你需要进入系统BIOS去改变电源管理设置,要么OS控制,要么完全禁用。

下面的查询会返回安装哪些SQL Server服务及其如何配置的信息:


    -- SQL Server Services information from SQL Server 2012  
    SELECT servicename, startup_type_desc, status_desc,last_startup_time, service_account, is_clustered, cluster_nodename  
    FROM sys.dm_server_services WITH (NOLOCK) OPTION (RECOMPILE);  
    -- Gives you information about your installed SQL Server Services, whether they are clustered, and which node owns the cluster resources  

下面的查询获取SQL Server错误日志相关的信息,事先知道错误日志的路径等有助于在必要时去查阅:


    -- Shows you where the SQL Server error log is located and how it is configured  
    SELECT is_enabled, [path], max_size, max_files  
    FROM sys.dm_os_server_diagnostics_log_configurations WITH (NOLOCK) OPTION (RECOMPILE);  
    -- Knowing this information is important for troubleshooting purposes  

下面的查询会返回是否你的数据库Server使用了Windows Clustering:


    -- Get information about your OS cluster  
    --(if your database server is in a cluster)  
    SELECT VerboseLogging, SqlDumperDumpFlags, SqlDumperDumpPath,SqlDumperDumpTimeOut, FailureConditionLevel, HealthCheckTimeout  
    FROM sys.dm_os_cluster_properties WITH (NOLOCK) OPTION (RECOMPILE);  
    -- You will see no results if your instance is not clustered  

如果你使用群集,你可以使用下面的查询来获取有关群集节点的有用信息:


    -- Get information about your cluster nodes and their status (if your database server is in a cluster)  
    SELECT NodeName, status_description, is_current_owner  
    FROM sys.dm_os_cluster_nodes WITH (NOLOCK) OPTION (RECOMPILE);  
    -- Knowing which node owns the cluster resources is critical Especially when you are installing Windows or SQL Server updates  

下面的查询收集有关SQL Server实例配置的额外信息,你可以使用sp_configure来改变属性:


    -- Get configuration values for instance  
    SELECT name, value, value_in_use, [description] FROM sys.configurations WITH (NOLOCK) ORDER BY name OPTION (RECOMPILE);  
    -- Focus on backup compression default  
    -- clr enabled (only enable if it is needed)  
    -- lightweight pooling (should be zero)  
    -- max degree of parallelism  
    -- max server memory (MB) (set to an appropriate value)  
    -- optimize for ad hoc workloads (should be 1)  
    -- priority boost (should be zero)  

下面的查询返回网络配置相关的一点信息,对于处理网络、防火墙相关的问题有帮助:


    -- Get information about TCP Listener for SQL Server  
    SELECT listener_id, ip_address, is_ipv4, port, type_desc, state_desc, start_time  
    FROM sys.dm_tcp_listener_states WITH (NOLOCK) OPTION (RECOMPILE);  
    -- Helpful for network and connectivity troubleshooting  

下面的查询返回来自Windows注册表SQL Server相关的信息:


    -- SQL Server Registry information  
    SELECT registry_key, value_name, value_data  
    FROM sys.dm_server_registry WITH (NOLOCK) OPTION (RECOMPILE);  
    -- This lets you safely read some SQL Server related information from the Windows Registry  

下面的查询返回是否SQL Server实例产生内存转储:


    -- Get information on location, time and size of any memory dumps from SQL Server  
    SELECT [filename], creation_time, size_in_bytes  
    FROM sys.dm_server_memory_dumps WITH (NOLOCK) OPTION (RECOMPILE);  
    -- This will not return any rows if you have not had any memory dumps (which is a good thing)  

下面的查询返回SQL Server实例有多少在运行的数据库,它们位于哪里:


    -- File Names and Paths for Tempdb and all user databases in instance  
    SELECT DB_NAME([database_id])AS [Database Name],  
    [file_id], name, physical_name, type_desc, state_desc,  
    CONVERT( bigint, size/128.0) AS [Total Size in MB]  
    FROM sys.master_files WITH (NOLOCK)  
    WHERE [database_id] > 4  
    AND [database_id] <> 32767  
    OR [database_id] = 2  
    ORDER BY DB_NAME([database_id]) OPTION (RECOMPILE);  
    -- Things to look at:  
    -- Are data files and log files on different drives?  
    -- Is everything on the C: drive?  
    -- Is TempDB on dedicated drives?  
    -- Are there multiple data files?  

下面的查询返回实例上所有数据库的一些关键属性:


    -- Recovery model, log reuse wait description, log file size, log usage size  
    -- and compatibility level for all databases on instance  
    SELECT db.[name] AS [Database Name], db.recovery_model_desc AS [Recovery Model],  
    db.log_reuse_wait_desc AS [Log Reuse Wait Description],  
    ls.cntr_value AS [Log Size (KB)], lu.cntr_value AS [Log Used (KB)],  
    CAST(CAST(lu.cntr_value AS FLOAT) / CAST(ls.cntr_value AS FLOAT)AS DECIMAL(18,2)) *  
    100 AS  
    [Log Used %], db.[compatibility_level] AS [DB Compatibility Level],  
    db.page_verify_option_desc AS [Page Verify Option], db.is_auto_create_stats_on,  
    db.is_auto_update_stats_on, db.is_auto_update_stats_async_on,  
    db.is_parameterization_forced,  
    db.snapshot_isolation_state_desc, db.is_read_committed_snapshot_on,  
    is_auto_shrink_on, is_auto_close_on  
    FROM sys.databases AS db WITH (NOLOCK)  
    INNER JOIN sys.dm_os_performance_counters AS lu WITH (NOLOCK)  
    ON db.name = lu.instance_name  
    INNER JOIN sys.dm_os_performance_counters AS ls WITH (NOLOCK)  
    ON db.name = ls.instance_name  
    WHERE lu.counter_name LIKE N'Log File(s) Used Size (KB)%'  
    AND ls.counter_name LIKE N'Log File(s) Size (KB)%'  
    AND ls.cntr_value > 0 OPTION (RECOMPILE);  
    -- Things to look at:  
    -- How many databases are on the instance?  
    -- What recovery models are they using?  
    -- What is the log reuse wait description?  
    -- How full are the transaction logs ?  
    -- What compatibility level are they on?  

下面的查询返回哪个数据库文件有最大的I/O延迟:


    -- Calculates average stalls per read, per write,  
    -- and per total input/output for each database file.  
    SELECT DB_NAME(fs.database_id) AS [Database Name], mf.physical_name,  
    io_stall_read_ms, num_of_reads,  
    CAST(io_stall_read_ms/(1.0 + num_of_reads) AS NUMERIC(10,1)) AS  
    [avg_read_stall_ms],io_stall_write_ms,  
    num_of_writes,CAST(io_stall_write_ms/(1.0+num_of_writes) AS NUMERIC(10,1)) AS  
    [avg_write_stall_ms],  
    io_stall_read_ms + io_stall_write_ms AS [io_stalls], num_of_reads + num_of_writes  
    AS [total_io],  
    CAST((io_stall_read_ms + io_stall_write_ms)/(1.0 + num_of_reads + num_of_writes) AS  
    NUMERIC(10,1))  
    AS [avg_io_stall_ms]  
    FROM sys.dm_io_virtual_file_stats(null,null) AS fs  
    INNER JOIN sys.master_files AS mf WITH (NOLOCK)  
    ON fs.database_id = mf.database_id  
    AND fs.[file_id] = mf.[file_id]  
    ORDER BY avg_io_stall_ms DESC OPTION (RECOMPILE);  
    -- Helps determine which database files on  
    -- the entire instance have the most I/O bottlenecks  

如果遇到I/O瓶颈,你可以使用Windows Performance Monitor查看逻辑磁盘的Avg Disk Sec/Write和 Avg Disk Sec/Read等计数器。根据存储的类型,通过增加更多的spindle、改变RAID控制器高速缓存策略或改变RAID级别或许可以改善I/O性能。如果可能的话,你也可以考虑移除一些数据文件到其他磁盘。

下面的查询返回占用最多内存的用户数据库:


    -- Get total buffer usage by database for current instance  
    SELECT DB_NAME(database_id) AS [Database Name],  
    COUNT(*) * 8/1024.0 AS [Cached Size (MB)]  
    FROM sys.dm_os_buffer_descriptors WITH (NOLOCK)  
    WHERE database_id > 4 -- system databases  
    AND database_id <> 32767 -- ResourceDB  
    GROUP BY DB_NAME(database_id)  
    ORDER BY [Cached Size (MB)] DESC OPTION (RECOMPILE);  
    -- Tells you how much memory (in the buffer pool)  
    -- is being used by each database on the instance  

下面的查询返回使用最多处理器时间的用户数据库:


    -- Get CPU utilization by database  
    WITH DB_CPU_Stats  
    AS  
    (SELECT DatabaseID, DB_Name(DatabaseID) AS [DatabaseName],  
    SUM(total_worker_time) AS [CPU_Time_Ms]  
    FROM sys.dm_exec_query_stats AS qs  
    CROSS APPLY (SELECT CONVERT(int, value) AS [DatabaseID]  
    FROM sys.dm_exec_plan_attributes(qs.plan_handle)  
    WHERE attribute = N'dbid') AS F_DB  
    GROUP BY DatabaseID)  
    SELECT ROW_NUMBER() OVER(ORDER BY [CPU_Time_Ms] DESC) AS [row_num],  
    DatabaseName, [CPU_Time_Ms],  
    CAST([CPU_Time_Ms] * 1.0 / SUM([CPU_Time_Ms])  
    OVER() * 100.0 AS DECIMAL(5, 2)) AS [CPUPercent]  
    FROM DB_CPU_Stats  
    WHERE DatabaseID > 4 -- system databases  
    AND DatabaseID <> 32767 -- ResourceDB  
    ORDER BY row_num OPTION (RECOMPILE);  
    -- Helps determine which database is using the most CPU resources on the instance  

下面的查询返回SQL Server上次重启或使用DBCC SQLPERF ('sys.dm_os_wait_stats', CLEAR)以来,最高的累计等待统计:


    -- Isolate top waits for server instance since last restart or statistics clear  
    WITH Waits AS  
    (SELECT wait_type, wait_time_ms / 1000. AS wait_time_s,  
    100. * wait_time_ms / SUM(wait_time_ms) OVER() AS pct,  
    ROW_NUMBER() OVER(ORDER BY wait_time_ms DESC) AS rn  
    FROM sys.dm_os_wait_stats WITH (NOLOCK)  
    WHERE wait_type NOT IN (N'CLR_SEMAPHORE',N'LAZYWRITER_SLEEP',N'RESOURCE_QUEUE',  
    N'SLEEP_TASK',N'SLEEP_SYSTEMTASK',N'SQLTRACE_BUFFER_FLUSH',N'WAITFOR',  
    N'LOGMGR_QUEUE',N'CHECKPOINT_QUEUE', N'REQUEST_FOR_DEADLOCK_SEARCH',  
    N'XE_TIMER_EVENT',N'BROKER_TO_FLUSH',N'BROKER_TASK_STOP',N'CLR_MANUAL_EVENT',  
    N'CLR_AUTO_EVENT',N'DISPATCHER_QUEUE_SEMAPHORE', N'FT_IFTS_SCHEDULER_IDLE_WAIT',  
    N'XE_DISPATCHER_WAIT', N'XE_DISPATCHER_JOIN', N'SQLTRACE_INCREMENTAL_FLUSH_SLEEP',  
    N'ONDEMAND_TASK_QUEUE', N'BROKER_EVENTHANDLER', N'SLEEP_BPOOL_FLUSH',  
    N'DIRTY_PAGE_POLL', N'HADR_FILESTREAM_IOMGR_IOCOMPLETION',  
    N'SP_SERVER_DIAGNOSTICS_SLEEP'))  
    SELECT W1.wait_type,  
    CAST(W1.wait_time_s AS DECIMAL(12, 2)) AS wait_time_s,  
    CAST(W1.pct AS DECIMAL(12, 2)) AS pct,  
    CAST(SUM(W2.pct) AS DECIMAL(12, 2)) AS running_pct  
    FROM Waits AS W1  
    INNER JOIN Waits AS W2  
    ON W2.rn <= W1.rn  
    GROUP BY W1.rn, W1.wait_type, W1.wait_time_s, W1.pct  
    HAVING SUM(W2.pct) - W1.pct < 99 OPTION (RECOMPILE); -- percentage threshold  
    -- Clear Wait Stats  
    -- DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR);  

博客http://blogs.msdn.com/b/psssql/a ... ype-repository.aspx里介绍了很多等待类型。

下面的查询返回实例上累积的信号(CPU)等待:


    -- Signal Waits for instance  
    SELECT CAST(100.0 * SUM(signal_wait_time_ms) / SUM (wait_time_ms) AS NUMERIC(20,2))  
    AS [%signal (cpu) waits],  
    CAST(100.0 * SUM(wait_time_ms - signal_wait_time_ms) / SUM (wait_time_ms) AS  
    NUMERIC(20,2)) AS [%resource waits]  
    FROM sys.dm_os_wait_stats WITH (NOLOCK) OPTION (RECOMPILE);  
    -- Signal Waits above 15-20% is usually a sign of CPU pressure  

信号等待是CPU相关的等待。通常信号等待在15~20%就表示CPU压力。

下面的查询返回最常连接数据库的登录信息:


    -- Get logins that are connected and how many sessions they have  
    SELECT login_name, COUNT(session_id) AS [session_count]  
    FROM sys.dm_exec_sessions WITH (NOLOCK)  
    GROUP BY login_name  
    ORDER BY COUNT(session_id) DESC OPTION (RECOMPILE);  
    -- This can help characterize your workload and  
    -- determine whether you are seeing a normal level of activity  

下面的查询返回当前的任务急pending的I/O计数信息,返回的3个值越低越好:


    -- Get Average Task Counts (run multiple times)  
    SELECT AVG(current_tasks_count) AS [Avg Task Count],  
    AVG(runnable_tasks_count) AS [Avg Runnable Task Count],  
    AVG(pending_disk_io_count) AS [Avg Pending DiskIO Count]  
    FROM sys.dm_os_schedulers WITH (NOLOCK)  
    WHERE scheduler_id < 255 OPTION (RECOMPILE);  
    -- Sustained values above 10 suggest further investigation in that area  
    -- High Avg Task Counts are often caused by blocking or other resource contention  
    -- High Avg Runnable Task Counts are a good sign of CPU pressure  
    -- High Avg Pending DiskIO Counts are a sign of disk pressure  

下面的查询返回过去256分钟CPU使用的历史状况,1分钟一个间隔:


    -- Get CPU Utilization History for last 256 minutes (in one minute intervals)  
    -- This version works with SQL Server 2008 and above  
    DECLARE @ts_now bigint = (SELECT cpu_ticks/(cpu_ticks/ms_ticks)  
    FROM sys.dm_os_sys_info WITH (NOLOCK));  
    SELECT TOP(256) SQLProcessUtilization AS [SQL Server Process CPU Utilization],  
    SystemIdle AS [System Idle Process],  
    100 - SystemIdle - SQLProcessUtilization  
    AS [Other Process CPU Utilization],  
    DATEADD(ms, -1 * (@ts_now - [timestamp]),  
    GETDATE()) AS [Event Time]  
    FROM (SELECT record.value('(./Record/@id)[1]', 'int') AS record_id,  
    record.value('(./Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'int')  
    AS[SystemIdle],record.value('(./Record/SchedulerMonitorEvent/SystemHealth/  
    ProcessUtilization)[1]','int')  
    AS [SQLProcessUtilization], [timestamp]  
    FROM (SELECT [timestamp], CONVERT(xml, record) AS [record]  
    FROM sys.dm_os_ring_buffers WITH (NOLOCK)  
    WHERE ring_buffer_type = N'RING_BUFFER_SCHEDULER_MONITOR'  
    AND record LIKE N'%<SystemHealth>%') AS x  
    ) AS y  
    ORDER BY record_id DESC OPTION (RECOMPILE);  
    -- Look at the trend over the entire period.  
    -- Also look at high sustained Other Process CPU Utilization values  

如果Other Process CPU Utilization持续超过5%,你就应该查看什么在使用CPU。

下面的查询返回OS级别物理内存的状况:


    -- Good basic information about OS memory amounts and state  
    SELECT total_physical_memory_kb, available_physical_memory_kb,  
    total_page_file_kb, available_page_file_kb,  
    system_memory_state_desc  
    FROM sys.dm_os_sys_memory WITH (NOLOCK) OPTION (RECOMPILE);  
    -- You want to see "Available physical memory is high"  
    -- This indicates that you are not under external memory pressure  

下面的查询返回SQL Server的内存使用情况:


    -- SQL Server Process Address space info  
    --(shows whether locked pages is enabled, among other things)  
    SELECT physical_memory_in_use_kb,locked_page_allocations_kb,  
    page_fault_count, memory_utilization_percentage,  
    available_commit_limit_kb, process_physical_memory_low,  
    process_virtual_memory_low  
    FROM sys.dm_os_process_memory WITH (NOLOCK) OPTION (RECOMPILE);  
    -- You want to see 0 for process_physical_memory_low  
    -- You want to see 0 for process_virtual_memory_low  
    -- This indicates that you are not under internal memory pressure  

查看SQL Server是否处于内存压力的一个经典做法是查看Page Life Expectancy (PLE),PLE越高越好(MS推荐300为可接受):


    -- Page Life Expectancy (PLE) value for default instance  
    SELECT cntr_value AS [Page Life Expectancy]  
    FROM sys.dm_os_performance_counters WITH (NOLOCK)  
    WHERE [object_name] LIKE N'%Buffer Manager%' -- Handles named instances  
    AND counter_name = N'Page life expectancy' OPTION (RECOMPILE);  
    -- PLE is one way to measure memory pressure.  
    -- Higher PLE is better. Watch the trend, not the absolute value.  

下面的查询返回Memory Grants Outstanding:


    -- Memory Grants Outstanding value for default instance  
    SELECT cntr_value AS [Memory Grants Outstanding]  
    FROM sys.dm_os_performance_counters WITH (NOLOCK)  
    WHERE [object_name] LIKE N'%Memory Manager%' -- Handles named instances  
    AND counter_name = N'Memory Grants Outstanding' OPTION (RECOMPILE);  
    -- Memory Grants Outstanding above zero  
    -- for a sustained period is a secondary indicator of memory pressure  

下面的查询返回Memory Grants Pending:


    -- Memory Grants Pending value for default instance  
    SELECT cntr_value AS [Memory Grants Pending]  
    FROM sys.dm_os_performance_counters WITH (NOLOCK)  
    WHERE [object_name] LIKE N'%Memory Manager%' -- Handles named instances  
    AND counter_name = N'Memory Grants Pending' OPTION (RECOMPILE);  
    -- Memory Grants Pending above zero  
    -- for a sustained period is an extremely strong indicator of memory pressure  

当你从上面3个查询中看到任何内存压力,要进一步查看整体的内存使用状况:


    -- Memory Clerk Usage for instance  
    -- Look for high value for CACHESTORE_SQLCP (Ad-hoc query plans)  
    SELECT TOP(10) [type] AS [Memory Clerk Type],  
    SUM(pages_kb) AS [SPA Mem, Kb]  
    FROM sys.dm_os_memory_clerks WITH (NOLOCK)  
    GROUP BY [type]  
    ORDER BY SUM(pages_kb) DESC OPTION (RECOMPILE);  
    -- CACHESTORE_SQLCP SQL Plans  
    -- These are cached SQL statements or batches that  
    -- aren't in stored procedures, functions and triggers  
    --  
    -- CACHESTORE_OBJCP Object Plans  
    -- These are compiled plans for  
    -- stored procedures, functions and triggers  
    --  
    -- CACHESTORE_PHDR Algebrizer Trees  
    -- An algebrizer tree is the parsed SQL text  
    -- that resolves the table and column names  

如果你看到CACHESTORE_SQLCP memory clerk使用很多内存,那么你可以确定在Procedure缓存里是否有很多只用一次且占用了大量内存的ad hoc查询计划:


    -- Find single-use, ad-hoc queries that are bloating the plan cache  
    SELECT TOP(20) [text] AS [QueryText], cp.size_in_bytes  
    FROM sys.dm_exec_cached_plans AS cp WITH (NOLOCK)  
    CROSS APPLY sys.dm_exec_sql_text(plan_handle)  
    WHERE cp.cacheobjtype = N'Compiled Plan'  
    AND cp.objtype = N'Adhoc'  
    AND cp.usecounts = 1  
    ORDER BY cp.size_in_bytes DESC OPTION (RECOMPILE);  
    -- Gives you the text and size of single-use ad-hoc queries that  
    -- waste space in the plan cache  
    -- Enabling 'optimize for ad hoc workloads' for the instance  
    -- can help (SQL Server 2008 and above only)  
    -- Enabling forced parameterization for the database can help, but test first!  

数据库级别的查询

切换到特定用户数据库:


    -- Database specific queries ******************************************************  
    -- **** Switch to a user database *****  
    USE YourDatabaseName;  
    GO  

查询数据库大小:


    -- Individual File Sizes and space available for current database  
    SELECT name AS [File Name], physical_name AS [Physical Name], size/128.0 AS [Total  
    Size in MB],  
    size/128.0 - CAST(FILEPROPERTY(name, 'SpaceUsed') AS int)/128.0 AS [Available Space  
    In MB], [file_id]  
    FROM sys.database_files WITH (NOLOCK) OPTION (RECOMPILE);  
    -- Look at how large and how full the files are and where they are located  
    -- Make sure the transaction log is not full!!  

查看事务日志大小及空间使用:


    -- Get transaction log size and space information for the current database  
    SELECT DB_NAME(database_id) AS [Database Name], database_id,  
    CAST((total_log_size_in_bytes/1048576.0) AS DECIMAL(10,1))  
    AS [Total_log_size(MB)],  
    CAST((used_log_space_in_bytes/1048576.0) AS DECIMAL(10,1))  
    AS [Used_log_space(MB)],  
    CAST(used_log_space_in_percent AS DECIMAL(10,1)) AS [Used_log_space(%)]  
    FROM sys.dm_db_log_space_usage WITH (NOLOCK) OPTION (RECOMPILE);  
    -- Another way to look at transaction log file size and space  

按文件搜集I/O统计:


    -- I/O Statistics by file for the current database  
    SELECT DB_NAME(DB_ID()) AS [Database Name],[file_id], num_of_reads, num_of_writes,  
    io_stall_read_ms, io_stall_write_ms,  
    CAST(100. * io_stall_read_ms/(io_stall_read_ms + io_stall_write_ms)  
    AS DECIMAL(10,1)) AS [IO Stall Reads Pct],  
    CAST(100. * io_stall_write_ms/(io_stall_write_ms + io_stall_read_ms)  
    AS DECIMAL(10,1)) AS [IO Stall Writes Pct],  
    (num_of_reads + num_of_writes) AS [Writes + Reads], num_of_bytes_read,  
    num_of_bytes_written,  
    CAST(100. * num_of_reads/(num_of_reads + num_of_writes) AS DECIMAL(10,1))  
    AS [# Reads Pct],  
    CAST(100. * num_of_writes/(num_of_reads + num_of_writes) AS DECIMAL(10,1))  
    AS [# Write Pct],  
    CAST(100. * num_of_bytes_read/(num_of_bytes_read + num_of_bytes_written)  
    AS DECIMAL(10,1)) AS [Read Bytes Pct],  
    CAST(100. * num_of_bytes_written/(num_of_bytes_read + num_of_bytes_written)  
    AS DECIMAL(10,1)) AS [Written Bytes Pct]  
    FROM sys.dm_io_virtual_file_stats(DB_ID(), NULL) OPTION (RECOMPILE);  
    -- This helps you characterize your workload better from an I/O perspective  

查看事务日志Virtual Log File (VLF)计数:


    -- Get VLF count for transaction log for the current database,  
    -- number of rows equals the VLF count. Lower is better!  
    DBCC LOGINFO;  
    -- High VLF counts can affect write performance  
    -- and they can make database restore and recovery take much longer  

事务日志中如果有大量VLF,就会影响写入事务日志的性能,更重要的是影响恢复数据库的时间。
查看特定数据库上的查询活动:


    -- Top cached queries by Execution Count (SQL Server 2012)  
    SELECT qs.execution_count, qs.total_rows, qs.last_rows, qs.min_rows, qs.max_rows,  
    qs.last_elapsed_time, qs.min_elapsed_time, qs.max_elapsed_time,  
    SUBSTRING(qt.TEXT,qs.statement_start_offset/2 +1,  
    (CASE WHEN qs.statement_end_offset = -1  
    THEN LEN(CONVERT(NVARCHAR(MAX), qt.TEXT)) * 2  
    ELSE qs.statement_end_offset END - qs.statement_start_offset)/2)  
    AS query_text  
    FROM sys.dm_exec_query_stats AS qs WITH (NOLOCK)  
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS qt  
    ORDER BY qs.execution_count DESC OPTION (RECOMPILE);  
    -- Uses several new rows returned columns  
    -- to help troubleshoot performance problems  

查看缓存的存储过程:


    -- Top Cached SPs By Execution Count (SQL Server 2012)  
    SELECT TOP(250) p.name AS [SP Name], qs.execution_count,  
    ISNULL(qs.execution_count/DATEDIFF(Second, qs.cached_time, GETDATE()), 0)  
    AS [Calls/Second],  
    qs.total_worker_time/qs.execution_count AS [AvgWorkerTime],  
    qs.total_worker_time AS [TotalWorkerTime],qs.total_elapsed_time,  
    qs.total_elapsed_time/qs.execution_count AS [avg_elapsed_time],  
    qs.cached_time  
    FROM sys.procedures AS p WITH (NOLOCK)  
    INNER JOIN sys.dm_exec_procedure_stats AS qs WITH (NOLOCK)  
    ON p.[object_id] = qs.[object_id]  
    WHERE qs.database_id = DB_ID()  
    ORDER BY qs.execution_count DESC OPTION (RECOMPILE);  
    -- Tells you which cached stored procedures are called the most often  
    -- This helps you characterize and baseline your workload  

查看存储过程在缓存里待多长时间(Cached_time):


    -- Top Cached SPs By Avg Elapsed Time (SQL Server 2012)  
    SELECT TOP(25) p.name AS [SP Name], qs.total_elapsed_time/qs.execution_count  
    AS [avg_elapsed_time], qs.total_elapsed_time, qs.execution_count,  
    ISNULL(qs.execution_count/DATEDIFF(Second, qs.cached_time,  
    GETDATE()), 0) AS [Calls/Second],  
    qs.total_worker_time/qs.execution_count AS [AvgWorkerTime],  
    qs.total_worker_time AS [TotalWorkerTime], qs.cached_time  
    FROM sys.procedures AS p WITH (NOLOCK)  
    INNER JOIN sys.dm_exec_procedure_stats AS qs WITH (NOLOCK)  
    ON p.[object_id] = qs.[object_id]  
    WHERE qs.database_id = DB_ID()  
    ORDER BY avg_elapsed_time DESC OPTION (RECOMPILE);  
    -- This helps you find long-running cached stored procedures that  
    -- may be easy to optimize with standard query tuning techniques  

从整体CPU角度查看最耗时的存储过程:


    -- Top Cached SPs By Total Worker time (SQL Server 2012).  
    -- Worker time relates to CPU cost  
    SELECT TOP(25) p.name AS [SP Name], qs.total_worker_time AS [TotalWorkerTime],  
    qs.total_worker_time/qs.execution_count AS [AvgWorkerTime], qs.execution_count,  
    ISNULL(qs.execution_count/DATEDIFF(Second, qs.cached_time, GETDATE()), 0)  
    AS [Calls/Second],qs.total_elapsed_time, qs.total_elapsed_time/qs.execution_count  
    AS [avg_elapsed_time], qs.cached_time  
    FROM sys.procedures AS p WITH (NOLOCK)  
    INNER JOIN sys.dm_exec_procedure_stats AS qs WITH (NOLOCK)  
    ON p.[object_id] = qs.[object_id]  
    WHERE qs.database_id = DB_ID()  
    ORDER BY qs.total_worker_time DESC OPTION (RECOMPILE);  
    -- This helps you find the most expensive cached  
    -- stored procedures from a CPU perspective  
    -- You should look at this if you see signs of CPU pressure  

从逻辑度的角度查看缓存的存储过程相关的信息:


    -- Top Cached SPs By Total Logical Reads (SQL Server 2012).  
    -- Logical reads relate to memory pressure  
    SELECT TOP(25) p.name AS [SP Name], qs.total_logical_reads  
    AS [TotalLogicalReads], qs.total_logical_reads/qs.execution_count  
    AS [AvgLogicalReads],qs.execution_count,  
    ISNULL(qs.execution_count/DATEDIFF(Second, qs.cached_time, GETDATE()), 0)  
    AS [Calls/Second], qs.total_elapsed_time,qs.total_elapsed_time/qs.execution_count  
    AS [avg_elapsed_time], qs.cached_time  
    FROM sys.procedures AS p WITH (NOLOCK)  
    INNER JOIN sys.dm_exec_procedure_stats AS qs WITH (NOLOCK)  
    ON p.[object_id] = qs.[object_id]  
    WHERE qs.database_id = DB_ID()  
    ORDER BY qs.total_logical_reads DESC OPTION (RECOMPILE);  
    -- This helps you find the most expensive cached  
    -- stored procedures from a memory perspective  
    -- You should look at this if you see signs of memory pressure  

从物理度的角度查看最耗时的存储过程:


    -- Top Cached SPs By Total Physical Reads (SQL Server 2012).  
    -- Physical reads relate to disk I/O pressure  
    SELECT TOP(25) p.name AS [SP Name],qs.total_physical_reads  
    AS [TotalPhysicalReads],qs.total_physical_reads/qs.execution_count  
    AS [AvgPhysicalReads], qs.execution_count, qs.total_logical_reads,  
    qs.total_elapsed_time, qs.total_elapsed_time/qs.execution_count  
    AS [avg_elapsed_time], qs.cached_time  
    FROM sys.procedures AS p WITH (NOLOCK)  
    INNER JOIN sys.dm_exec_procedure_stats AS qs WITH (NOLOCK)  
    ON p.[object_id] = qs.[object_id]  
    WHERE qs.database_id = DB_ID()  
    AND qs.total_physical_reads > 0  
    ORDER BY qs.total_physical_reads DESC,  
    qs.total_logical_reads DESC OPTION (RECOMPILE);  
    -- This helps you find the most expensive cached  
    -- stored procedures from a read I/O perspective  
    -- You should look at this if you see signs of I/O pressure or of memory pressure  

从逻辑写来看最耗时的缓存存储过程:


    -- Top Cached SPs By Total Logical Writes (SQL Server 2012).  
    -- Logical writes relate to both memory and disk I/O pressure  
    SELECT TOP(25) p.name AS [SP Name], qs.total_logical_writes  
    AS [TotalLogicalWrites], qs.total_logical_writes/qs.execution_count  
    AS [AvgLogicalWrites], qs.execution_count,  
    ISNULL(qs.execution_count/DATEDIFF(Second, qs.cached_time, GETDATE()), 0)  
    AS [Calls/Second],qs.total_elapsed_time, qs.total_elapsed_time/qs.execution_count  
    AS [avg_elapsed_time], qs.cached_time  
    FROM sys.procedures AS p WITH (NOLOCK)  
    INNER JOIN sys.dm_exec_procedure_stats AS qs WITH (NOLOCK)  
    ON p.[object_id] = qs.[object_id]  
    WHERE qs.database_id = DB_ID()  
    ORDER BY qs.total_logical_writes DESC OPTION (RECOMPILE);  
    -- This helps you find the most expensive cached  
    -- stored procedures from a write I/O perspective  
    -- You should look at this if you see signs of I/O pressure or of memory pressure  

从平均I/O来看缓存的存储过程中最耗资源的语句:


    -- Lists the top statements by average input/output  
    -- usage for the current database  
    SELECT TOP(50) OBJECT_NAME(qt.objectid) AS [SP Name],  
    (qs.total_logical_reads + qs.total_logical_writes) /qs.execution_count  
    AS [Avg IO],SUBSTRING(qt.[text],qs.statement_start_offset/2,  
    (CASE  
    WHEN qs.statement_end_offset = -1  
    THEN LEN(CONVERT(nvarchar(max), qt.[text])) * 2  
    ELSE qs.statement_end_offset  
    END - qs.statement_start_offset)/2) AS [Query Text]  
    FROM sys.dm_exec_query_stats AS qs WITH (NOLOCK)  
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS qt  
    WHERE qt.[dbid] = DB_ID()  
    ORDER BY [Avg IO] DESC OPTION (RECOMPILE);  
    -- Helps you find the most expensive statements for I/O by SP  

查看写比读更多的非聚集索引:


    -- Possible Bad NC Indexes (writes > reads)  
    SELECT OBJECT_NAME(s.[object_id]) AS [Table Name], i.name AS [Index Name],  
    i.index_id,user_updates AS [Total Writes],  
    user_seeks + user_scans + user_lookups AS [Total Reads],  
    user_updates - (user_seeks + user_scans + user_lookups) AS [Difference]  
    FROM sys.dm_db_index_usage_stats AS s WITH (NOLOCK)  
    INNER JOIN sys.indexes AS i WITH (NOLOCK)  
    ON s.[object_id] = i.[object_id]  
    AND i.index_id = s.index_id  
    WHERE OBJECTPROPERTY(s.[object_id],'IsUserTable') = 1  
    AND s.database_id = DB_ID()  
    AND user_updates > (user_seeks + user_scans + user_lookups)  
    AND i.index_id > 1  
    ORDER BY [Difference] DESC, [Total Writes] DESC, [Total Reads] ASC OPTION  
    (RECOMPILE);  
    -- Look for indexes with high numbers of writes  
    -- and zero or very low numbers of reads  
    -- Consider your complete workload  
    -- Investigate further before dropping an index!  

查看缺失的索引信息:


    -- Missing Indexes current database by Index Advantage  
    SELECT user_seeks * avg_total_user_cost * (avg_user_impact * 0.01)  
    AS [index_advantage],  
    migs.last_user_seek, mid.[statement] AS [Database.Schema.Table],  
    mid.equality_columns, mid.inequality_columns, mid.included_columns,  
    migs.unique_compiles, migs.user_seeks, migs.avg_total_user_cost,  
    migs.avg_user_impact  
    FROM sys.dm_db_missing_index_group_stats AS migs WITH (NOLOCK)  
    INNER JOIN sys.dm_db_missing_index_groups AS mig WITH (NOLOCK)  
    ON migs.group_handle = mig.index_group_handle  
    INNER JOIN sys.dm_db_missing_index_details AS mid WITH (NOLOCK)  
    ON mig.index_handle = mid.index_handle  
    WHERE mid.database_id = DB_ID() -- Remove this to see for entire instance  
    ORDER BY index_advantage DESC OPTION (RECOMPILE);  
    -- Look at last user seek time, number of user seeks  
    -- to help determine source and importance  
    -- SQL Server is overly eager to add included columns, so beware  
    -- Do not just blindly add indexes that show up from this query!!!  

查看缓存的执行计划中缺失索引警告:


    -- Find missing index warnings for cached plans in the current database  
    -- Note: This query could take some time on a busy instance  
    SELECT TOP(25) OBJECT_NAME(objectid) AS [ObjectName],query_plan,  
    cp.objtype, cp.usecounts  
    FROM sys.dm_exec_cached_plans AS cp WITH (NOLOCK)  
    CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp  
    WHERE CAST(query_plan AS NVARCHAR(MAX)) LIKE N'%MissingIndex%'  
    AND dbid = DB_ID()  
    ORDER BY cp.usecounts DESC OPTION (RECOMPILE);  
    -- Helps you connect missing indexes to specific stored procedures or queries  
    -- This can help you decide whether to add them or not  

查看SQL Server缓冲池中占用最多空间的表和索引:


    -- Breaks down buffers used by current database  
    -- by object (table, index) in the buffer cache  
    SELECT OBJECT_NAME(p.[object_id]) AS [ObjectName],  
    p.index_id, COUNT(*)/128 AS [Buffer size(MB)], COUNT(*) AS [BufferCount],  
    p.data_compression_desc AS [CompressionType]  
    FROM sys.allocation_units AS a WITH (NOLOCK)  
    INNER JOIN sys.dm_os_buffer_descriptors AS b WITH (NOLOCK)  
    ON a.allocation_unit_id = b.allocation_unit_id  
    INNER JOIN sys.partitions AS p WITH (NOLOCK)  
    ON a.container_id = p.hobt_id  
    WHERE b.database_id = CONVERT(int,DB_ID())  
    AND p.[object_id] > 100  
    GROUP BY p.[object_id], p.index_id, p.data_compression_desc  
    ORDER BY [BufferCount] DESC OPTION (RECOMPILE);  
    -- Tells you what tables and indexes are  
    -- using the most memory in the buffer cache  

查看数据库中所有表的大小及数据压缩状态:


    -- Get Table names, row counts, and compression status  
    -- for the clustered index or heap  
    SELECT OBJECT_NAME(object_id) AS [ObjectName],  
    SUM(Rows) AS [RowCount], data_compression_desc AS [CompressionType]  
    FROM sys.partitions WITH (NOLOCK)  
    WHERE index_id < 2 --ignore the partitions from the non-clustered index if any  
    AND OBJECT_NAME(object_id) NOT LIKE N'sys%'  
    AND OBJECT_NAME(object_id) NOT LIKE N'queue_%'  
    AND OBJECT_NAME(object_id) NOT LIKE N'filestream_tombstone%'  
    AND OBJECT_NAME(object_id) NOT LIKE N'fulltext%'  
    AND OBJECT_NAME(object_id) NOT LIKE N'ifts_comp_fragment%'  
    AND OBJECT_NAME(object_id) NOT LIKE N'filetable_updates%'  
    GROUP BY object_id, data_compression_desc  
    ORDER BY SUM(Rows) DESC OPTION (RECOMPILE);  
    -- Gives you an idea of table sizes, and possible data compression opportunities  

查看数据库中所有索引最后一次统计更新的时间:


    -- When were Statistics last updated on all indexes?  
    SELECT o.name, i.name AS [Index Name],STATS_DATE(i.[object_id],  
    i.index_id) AS [Statistics Date], s.auto_created,  
    s.no_recompute, s.user_created, st.row_count  
    FROM sys.objects AS o WITH (NOLOCK)  
    INNER JOIN sys.indexes AS i WITH (NOLOCK)  
    ON o.[object_id] = i.[object_id]  
    INNER JOIN sys.stats AS s WITH (NOLOCK)  
    ON i.[object_id] = s.[object_id]  
    AND i.index_id = s.stats_id  
    INNER JOIN sys.dm_db_partition_stats AS st WITH (NOLOCK)  
    ON o.[object_id] = st.[object_id]  
    AND i.[index_id] = st.[index_id]  
    WHERE o.[type] = 'U'  
    ORDER BY STATS_DATE(i.[object_id], i.index_id) ASC OPTION (RECOMPILE);  
    -- Helps discover possible problems with out-of-date statistics  
    -- Also gives you an idea which indexes are most active  

查看当前数据库中碎片最多的索引:


    -- Get fragmentation info for all indexes  
    -- above a certain size in the current database  
    -- Note: This could take some time on a very large database  
    SELECT DB_NAME(database_id) AS [Database Name],  
    OBJECT_NAME(ps.OBJECT_ID) AS [Object Name],  
    i.name AS [Index Name], ps.index_id, index_type_desc,  
    avg_fragmentation_in_percent, fragment_count, page_count  
    FROM sys.dm_db_index_physical_stats(DB_ID(),NULL, NULL, NULL ,'LIMITED') AS ps  
    INNER JOIN sys.indexes AS i WITH (NOLOCK)  
    ON ps.[object_id] = i.[object_id]  
    AND ps.index_id = i.index_id  
    WHERE database_id = DB_ID()  
    AND page_count > 500  
    ORDER BY avg_fragmentation_in_percent DESC OPTION (RECOMPILE);  
    -- Helps determine whether you have fragmentation in your relational indexes  
    -- and how effective your index maintenance strategy is  

如果你发现超过10%碎片的索引,你就需要决定是重组还是重建它们。重组通常是Online操作,能在任何时间停止。重建可以是Online或Offline操作。在SQL Server 2012中,你可以Online重建聚集索引,而不必考虑表包含什么类型的数据。收缩数据文件非常消耗资源。不要犯常见的错误定期去重建所有索引,这非常浪费资源。你可以在Internet上找一些好的索引维护脚本。Ola Hallengren开发了一个非常好的脚本,你可以从 http://ola.hallengren.com获取。





运维网声明 1、欢迎大家加入本站运维交流群:群②:261659950 群⑤:202807635 群⑦870801961 群⑧679858003
2、本站所有主题由该帖子作者发表,该帖子作者与运维网享有帖子相关版权
3、所有作品的著作权均归原作者享有,请您和我们一样尊重他人的著作权等合法权益。如果您对作品感到满意,请购买正版
4、禁止制作、复制、发布和传播具有反动、淫秽、色情、暴力、凶杀等内容的信息,一经发现立即删除。若您因此触犯法律,一切后果自负,我们对此不承担任何责任
5、所有资源均系网友上传或者通过网络收集,我们仅提供一个展示、介绍、观摩学习的平台,我们不对其内容的准确性、可靠性、正当性、安全性、合法性等负责,亦不承担任何法律责任
6、所有作品仅供您个人学习、研究或欣赏,不得用于商业或者其他用途,否则,一切后果均由您自己承担,我们对此不承担任何法律责任
7、如涉及侵犯版权等问题,请您及时通知我们,我们将立即采取措施予以解决
8、联系人Email:admin@iyunv.com 网址:www.yunweiku.com

所有资源均系网友上传或者通过网络收集,我们仅提供一个展示、介绍、观摩学习的平台,我们不对其承担任何法律责任,如涉及侵犯版权等问题,请您及时通知我们,我们将立即处理,联系人Email:kefu@iyunv.com,QQ:1061981298 本贴地址:https://www.yunweiku.com/thread-22296-1-1.html 上篇帖子: 使用Extended Events诊断SQL Server 2012----查看由Extended Event捕获的数据 下篇帖子: sql server管理和性能 健康
您需要登录后才可以回帖 登录 | 立即注册

本版积分规则

扫码加入运维网微信交流群X

扫码加入运维网微信交流群

扫描二维码加入运维网微信交流群,最新一手资源尽在官方微信交流群!快快加入我们吧...

扫描微信二维码查看详情

客服E-mail:kefu@iyunv.com 客服QQ:1061981298


QQ群⑦:运维网交流群⑦ QQ群⑧:运维网交流群⑧ k8s群:运维网kubernetes交流群


提醒:禁止发布任何违反国家法律、法规的言论与图片等内容;本站内容均来自个人观点与网络等信息,非本站认同之观点.


本站大部分资源是网友从网上搜集分享而来,其版权均归原作者及其网站所有,我们尊重他人的合法权益,如有内容侵犯您的合法权益,请及时与我们联系进行核实删除!



合作伙伴: 青云cloud

快速回复 返回顶部 返回列表