If you’re like me, you’ve seen people posting information found through debugging SQL Server and thought “Yeah, I could be that cool”. If you want to see call stacks but failed to get anywhere with the debugger then this post might be for you. It contains step-by-step instructions for viewing SQL Server call stacks by creating a minidump.
What is a Minidump?
I will borrow a definition from Thomas Kejser’s blog post:
What is a dump? It is a file containing a snapshot of the running process – and parts or all of the memory space of that process. The snapshot also contains the call stack of every thread the process has created.
That last sentence is relevant to our interests. We can use minidumps to generate small files that contain SQL Server call stacks. Note that if you aren’t careful you can end up writing the full contents of memory to a dump file. I believe that writing the file is a single-threaded process, so this can take hours and hundreds of GBs on a real server. Also SQL Server is frozen while the dump is happening, so don’t do it in production.
Dumps are most useful when you want to examine a long running, very simple query or a query that seems to be stuck at a certain point. They aren’t useful to see a full call stack of a non-simple query or to see something that happens just once during execution, like a specific task during query compilation. All that you can get is a snapshot and the snapshot may not contain the information that you’re looking for.
There are multiple ways to generate and read dump files. For this blog post I’ll be using sqldumper.exe and WinDbg.
First go to the directory that contains sqldumper.exe. On my machine for SQL Server 2016 the directory is C:\Program Files\Microsoft SQL Server\130\Shared . Open an admin command prompt and point it to that directory. Here are the arguments that we need to look at call stacks:
C:\Program Files\Microsoft SQL Server\130\Shared>sqldumper Usage: sqldumper [ProcessID [ThreadId [Flags[:MiniDumpFlags] [SqlInfoPtr [DumpDir
There are many ways to get the ProcessID for SQL Server. One way is to run the following SQL query:
At the time of writing this post I have a process ID of 2364.
ThreadId is set to 0 you’ll get information about all threads, including system threads that you might not be interested in. Sometimes you can get everything that you’re interested in by looking at a single thread for serial queries, or at a single thread if a parallel query appears to be throwing too much work at that thread. There’s probably a better way to write this query, but the query that I use to find the thread that I’m interested in is below:
SELECT th.os_thread_id, wta.wait_type FROM sys.dm_os_threads th INNER JOIN sys.dm_os_workers w ON th.worker_address = w.worker_address INNER JOIN sys.dm_os_tasks tk ON w.task_address = tk.task_address INNER JOIN sys.dm_os_waiting_tasks wta ON wta.waiting_task_address = tk.task_address where tk.session_id = 56;
wait_type is there to give a clue about which threads are interesting. More columns can be added to the query as needed.
Flags parameter controls which information is written to the dump file. There are many options and some of them write the full contents of memory to disk. The most useful ones that I’ve found are
0x0120 (dump minimal information about all threads) and
0x0100 (dump information about a single specified thread).
I always set
SqlInfoPtr to 0 and don’t care to know what it does.
DumpDir is where the dump files are written to. Point it to your preferred place for leaving dumps, like Erik’s doorstep.
Getting the Call Stack
If your dump is successful then you’ll end up with a .mdmp file. This can be opened with your favorite debugging program. I use WinDbg because there are instructions for it and I generally don’t know what I’m doing. Open the program and go to File -> Open Crash Dump. Open your file and type the magic command:
After a short while you’ll see call stack information:
As far as I can tell, you have to close and reopen the program in order to open a new crash dump file. This is unfortunate, but it makes viewing the call stack that much sweeter.
Let’s start with a simple example. How do call stacks differ when inserting into a heap vs a clustered index? For the source data I’ll put 2 GB of pages into a clustered index:
DROP TABLE IF EXISTS dbo.source_ci; create table dbo.source_ci ( ID BIGINT NOT NULL, FILLER VARCHAR(7777) NOT NULL, PRIMARY KEY (ID) ); INSERT INTO source_ci WITH (TABLOCK) SELECT TOP (250000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) , REPLICATE('Z', 7777) FROM master..spt_values t1 CROSS JOIN master..spt_values t2;
Here’s the definition for the target table with a clustered index:
DROP TABLE IF EXISTS dbo.target_ci; create table dbo.target_ci ( ID BIGINT NOT NULL, FILLER VARCHAR(7777) NOT NULL, PRIMARY KEY (ID) );
And here’s the query that I want to get a dump of:
INSERT INTO dbo.target_ci WITH (TABLOCK) SELECT * FROM dbo.source_ci WITH (TABLOCK);
I previously wrote the source data to
source_ci in order to make the insert query into
target_ci as simple as possible. I want to maximize the chances that the call stack associated with the step of inserting rows into the table is present in the dump file. If the query to select the data for the insert is too complicated then I might not get what I’m looking for when I take a snapshot.
I started the insert, identified the thread of interest (8324), and dumped information for the single thread:
C:\Program Files\Microsoft SQL Server\130\Shared>sqldumper 2364 8324 0x0100 0 c:\sql_dumps Parsed parameters: ProcessID = 2364 ThreadId = 8324 Flags = 0x100 MiniDumpFlags = 0x1160 SqlInfoPtr = 0x0000000000000000 DumpDir = c:\sql_dumps ExceptionRecordPtr = 0x0000000000000000 ContextPtr = 0x0000000000000000 ExtraFile = <NULL> PatternForExtraFiles = <NULL> InstanceName = <NULL> ServiceName = <NULL> Remote process didn't specify a dump file name Target suspended Callback type 11 not used Callback type 15 not used Callback type 7 not used MiniDump completed: c:\sql_dumps\SQLDmpr0024.mdmp Total Buffer pool data pages filtered out: 0 KB Total Hekaton data pages filtered out: 0 KB Total Free memory (from non top level allocators) filtered out: 0 KB Total top level free memory filtered out: 0 KB Total Log pool memory filtered out: 0 KB Location of module 'dbghelp.dll' : 'C:\Program Files\Microsoft SQL Server\130\Shared\dbghelp.dll' File version of module 'C:\Program Files\Microsoft SQL Server\130\Shared\dbghelp.dll' : '6.12:2.633' Product version of module 'C:\Program Files\Microsoft SQL Server\130\Shared\dbghelp.dll' : '6.12:2.633' Location of module 'sqldumper.exe' : 'C:\Program Files\Microsoft SQL Server\130\Shared\SqlDumper.exe' File version of module 'C:\Program Files\Microsoft SQL Server\130\Shared\SqlDumper.exe' : '2015.130:1601.5' Product version of module 'C:\Program Files\Microsoft SQL Server\130\Shared\SqlDumper.exe' : '13.0:1601.5' Watson Invoke: No
I did the same thing with a heap target table (with a
MAXDOP 1 hint), and diffed the call stacks:
Some of the function names are the same, which makes a lot of sense. We’re reading from the same source table. Of course there are differences as well. For example, for the heap we see
sqlmin!CHeapBuild::InsertRow+0x151 and for the clustered index we see
sqlmin!CIndBuild::InsertRow+0xd84. That’s pretty neat.
Stuck Query Example
For the next example I’ll use my favorite query that never finishes. First we need to create a few tables:
DROP TABLE IF EXISTS dbo.TestDriver; CREATE TABLE dbo.TestDriver ( n integer NOT NULL, n2 integer NOT NULL ); -- 100k rows INSERT dbo.TestDriver WITH (TABLOCK) (n, n2) SELECT TOP (100000) CHECKSUM(sv1.number, NEWID()), CHECKSUM(sv1.number, NEWID()) FROM master.dbo.spt_values AS SV1 CROSS JOIN master.dbo.spt_values AS SV2 OPTION (MAXDOP 1); DROP TABLE IF EXISTS dbo.TestCCI; CREATE TABLE dbo.TestCCI ( n integer NOT NULL, INDEX ccsi CLUSTERED COLUMNSTORE ); -- 10 M rows INSERT dbo.TestCCI WITH (TABLOCK) (n) SELECT TOP (10 * 1000 * 1000) CHECKSUM(sv1.number, NEWID()) FROM master.dbo.spt_values AS SV1 CROSS JOIN master.dbo.spt_values AS SV2 CROSS JOIN master.dbo.spt_values AS SV3 OPTION (MAXDOP 1);
The following query seemingly runs forever:
SELECT CA.x FROM ( SELECT TOP (1) n2 FROM dbo.TestDriver ORDER BY n ASC, n2 DESC ) AS T1 (id2) CROSS APPLY ( SELECT COUNT_BIG(*) FROM dbo.Test AS T2 WHERE T2.n <= T1.id2 ) AS CA (x);
Viewing a call stack could be helpful once you’ve exhausted the usual ways of trying to figure out why the query isn’t finishing. If I run the query to get thread information I see that
os_thread_id 4176 has a wait type of
CXPACKET (zzzzz) and
os_thread_id 3076 has a wait type of
HTBUILD. Time to take a dump:
sqldumper 2364 3076 0x0100 0 c:\sql_dumps
After running the magic command:
ntdll!NtSignalAndWaitForSingleObject+0x14 KERNELBASE!SetHandleCount+0x1f850 sqldk!SOS_Scheduler::Switch+0x106 sqldk!SOS_Scheduler::SuspendNonPreemptive+0xd3 sqlmin!EventInternal<SuspendQueueSLock>::Wait+0x1e7 sqlmin!CSyncPoint::WaitAtNthGate+0x1ac sqlmin!CSyncPoint::Wait+0x13e sqlmin!CBpSpillProcessor::Main+0xf8 sqlmin!CBpQScanHashAggNew::BpGetNextBatch+0x52 sqlmin!CQScanBatchHelper::GetRow+0x97 sqlmin!CQScanNLJoinTrivialNew::GetRow+0x12c sqlmin!CQScanProfileNew::GetRowImp<0>+0x11d sqlmin!CQScanXProducerNew::GetRowHelper+0x63 sqlmin!CQScanXProducerNew::GetRow+0x15 sqlmin!FnProducerOpen+0x5b sqlmin!FnProducerThread+0x7a9 sqlmin!SubprocEntrypoint+0x10ab sqldk!SOS_Task::Param::Execute+0x231 sqldk!SOS_Scheduler::RunTask+0xaa sqldk!SOS_Scheduler::ProcessTasks+0x3cd sqldk!SchedulerManager::WorkerEntryPoint+0x2a1 sqldk!SystemThread::RunWorker+0x8f sqldk!SystemThreadDispatcher::ProcessWorker+0x2de sqldk!SchedulerManager::ThreadEntryPoint+0x1d8 kernel32!BaseThreadInitThunk+0x14 ntdll!RtlUserThreadStart+0x21
Now you can write even more detailed connect items that will never get fixed!
Now you can impress your friends with call stacks, as long as you have easily impressed friends. Thanks for reading!