|
sql
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Intermittent and undesired SQL execution performance behavior
Please help! Have a trully bizarre (at least to me) performance case that cannot put a finger on and need prompt resolution advice as its been troubling me for weeks on. The sproc is attached. At first I suspected its the nested views, so created a flat sproc code where the views were all "AS my*" aliases. The issue still comes back. The PROBLEM: - One sproc that accepts bigint (a userID) at times performs as expected: about 2500 logical disk reads; CPU and Duration well under 1sec. - At irregular times during the day (or night), the same sproc takes 5-7min and logical data reads reach over 11 million!! I am not able to reproduce this even when copy/paste the same RPC:Completed call into SQL Mgmt Studio and run it immediately after seeing the anomalous behavior. Note: The issue is intermittently observed even when same userID is passed. Example of random userId: these sometimes perform fine, sometimes i see the 11+ million reads. "exec GetUserWorkflowCaseWorklist @userId=33" or "exec GetUserWorkflowCaseWorklist @userId=52" I will provide SQL Profiler Traces of 'good' and 'bad' performing SProcs along with execution plans additionally. The SYSTEM: Windows Server 2008 x32-bit 2 x QuadCore CPUs (not HT as Task Manager shows 8 CPU windows) 4GB RAM SQL: all system mdf/ldf and user mdf files are on C:\ disk. One exception, the db's ldf file (whose sproc is mis-behaving) is on disk D:\. (see Results03.txt) All static values seem default SQL2005: - SQL2005 SP2 STD - Recovery model = Simple - Compat. Level = SQL90 - data growth = 100MB, autogrow - Log growth 10%, autogrow - Nightly maint. plan run (all user databases, all default values, in following order): a) Check DB Integrity b) Rebuild INdexes c) Reorganize Indexes d) Shrink Database e) Update Stats f) Cleanup History Don't want to lead the thread and the discussions, but i cannot determine where are those 11M reads spent on; the major difference I see b/w the two execution plans is the use of Paralellism in the good exec plan. Could this be the cause and if so, why would SQL Engine choose NOT to use parallelism at times but decide to go with a single thread executing it all? Thank you much in advance, -mg [attached file: Results03.txt] [attached file: Corrected_GetUserWorkflowCaseWorklist_Query.sql] [addendum]
Attached are both of the Execution Plans... I will try running in SSMS the "SET ARITHABORT OFF" statement before the offending sproc execution to see if I could repro the issue. Even if I could repro, this still leaves me with question of why is this happening and how to avoid this 'randomness' I'm seeing. Any help is appreciated, -mg Show quoteHide quote "Martin Gasevski" <martinDOTgasevskiATautodeskDOTcom> wrote in message [attached file: GOOD vs. BAD.zip]news:%23oEGRSjWJHA.556@TK2MSFTNGP06.phx.gbl... > All, > > Please help! Have a trully bizarre (at least to me) performance case that > cannot put a finger on and need prompt resolution advice as its been > troubling me for weeks on. > > The sproc is attached. At first I suspected its the nested views, so > created a flat sproc code where the views were all "AS my*" aliases. The > issue still comes back. > > The PROBLEM: > - One sproc that accepts bigint (a userID) at times performs as expected: > about 2500 logical disk reads; CPU and Duration well under 1sec. > - At irregular times during the day (or night), the same sproc takes > 5-7min > and logical data reads reach over 11 million!! I am not able to reproduce > this even when copy/paste the same RPC:Completed call into SQL Mgmt Studio > and run it immediately after seeing the anomalous behavior. > > Note: The issue is intermittently observed even when same userID is > passed. > Example of random userId: these sometimes perform fine, sometimes i see > the > 11+ million reads. > "exec GetUserWorkflowCaseWorklist @userId=33" or "exec > GetUserWorkflowCaseWorklist @userId=52" > > I will provide SQL Profiler Traces of 'good' and 'bad' performing SProcs > along > with execution plans additionally. > > The SYSTEM: > Windows Server 2008 x32-bit > 2 x QuadCore CPUs (not HT as Task Manager shows 8 CPU windows) > 4GB RAM > SQL: all system mdf/ldf and user mdf files are on C:\ disk. One > exception, > the db's ldf file (whose sproc is mis-behaving) is on disk D:\. > > > (see Results03.txt) All static values seem default SQL2005: > - SQL2005 SP2 STD > - Recovery model = Simple > - Compat. Level = SQL90 > - data growth = 100MB, autogrow > - Log growth 10%, autogrow > - Nightly maint. plan run (all user databases, all default values, in > following order): > a) Check DB Integrity > b) Rebuild INdexes > c) Reorganize Indexes > d) Shrink Database > e) Update Stats > f) Cleanup History > > Don't want to lead the thread and the discussions, but i cannot determine > where are those 11M reads spent on; the major difference I see b/w the two > execution plans is the use of Paralellism in the good exec plan. Could > this > be the cause and if so, why would SQL Engine choose NOT to use parallelism > at times but decide to go with a single thread executing it all? > > Thank you much in advance, > -mg > > > > Martin
You can try to recompile the sp every time it runs. Also SQL Server 2005 introduced OPTION(RECOMPILE) on statement level. Run the following GetUserWorkflowCaseWorklist @userId=33" SET SHOWPLAN_XML ON GetUserWorkflowCaseWorklist @userId=52" Near the bottom of the XML document, you should see the following: <ParameterList> <ColumnReference Column="@p" ParameterCompiledValue="(33)" ParameterRuntimeValue="(52")" /> </ParameterList> Seeing the different values used for compilation and execution can lead you to suspect an issue with parameter sniffing. Show quoteHide quote "Martin Gasevski" <martinDOTgasevskiATautodeskDOTcom> wrote in message news:O56mhfjWJHA.4184@TK2MSFTNGP06.phx.gbl... > [addendum] > > Attached are both of the Execution Plans... > > I will try running in SSMS the "SET ARITHABORT OFF" statement before the > offending sproc execution to see if I could repro the issue. Even if I > could repro, this still leaves me with question of why is this happening > and how to avoid this 'randomness' I'm seeing. > > Any help is appreciated, > -mg > > "Martin Gasevski" <martinDOTgasevskiATautodeskDOTcom> wrote in message > news:%23oEGRSjWJHA.556@TK2MSFTNGP06.phx.gbl... >> All, >> >> Please help! Have a trully bizarre (at least to me) performance case >> that >> cannot put a finger on and need prompt resolution advice as its been >> troubling me for weeks on. >> >> The sproc is attached. At first I suspected its the nested views, so >> created a flat sproc code where the views were all "AS my*" aliases. The >> issue still comes back. >> >> The PROBLEM: >> - One sproc that accepts bigint (a userID) at times performs as expected: >> about 2500 logical disk reads; CPU and Duration well under 1sec. >> - At irregular times during the day (or night), the same sproc takes >> 5-7min >> and logical data reads reach over 11 million!! I am not able to >> reproduce >> this even when copy/paste the same RPC:Completed call into SQL Mgmt >> Studio >> and run it immediately after seeing the anomalous behavior. >> >> Note: The issue is intermittently observed even when same userID is >> passed. >> Example of random userId: these sometimes perform fine, sometimes i see >> the >> 11+ million reads. >> "exec GetUserWorkflowCaseWorklist @userId=33" or "exec >> GetUserWorkflowCaseWorklist @userId=52" >> >> I will provide SQL Profiler Traces of 'good' and 'bad' performing SProcs >> along >> with execution plans additionally. >> >> The SYSTEM: >> Windows Server 2008 x32-bit >> 2 x QuadCore CPUs (not HT as Task Manager shows 8 CPU windows) >> 4GB RAM >> SQL: all system mdf/ldf and user mdf files are on C:\ disk. One >> exception, >> the db's ldf file (whose sproc is mis-behaving) is on disk D:\. >> >> >> (see Results03.txt) All static values seem default SQL2005: >> - SQL2005 SP2 STD >> - Recovery model = Simple >> - Compat. Level = SQL90 >> - data growth = 100MB, autogrow >> - Log growth 10%, autogrow >> - Nightly maint. plan run (all user databases, all default values, in >> following order): >> a) Check DB Integrity >> b) Rebuild INdexes >> c) Reorganize Indexes >> d) Shrink Database >> e) Update Stats >> f) Cleanup History >> >> Don't want to lead the thread and the discussions, but i cannot determine >> where are those 11M reads spent on; the major difference I see b/w the >> two >> execution plans is the use of Paralellism in the good exec plan. Could >> this >> be the cause and if so, why would SQL Engine choose NOT to use >> parallelism >> at times but decide to go with a single thread executing it all? >> >> Thank you much in advance, >> -mg >> >> >> >> > > >
Other interesting topics
ORDER BY @OrderBy problem
Testing Environment + TestData + QA Setup Restoring single filegroup Migrating SQL2000 Databases to SQL2005 where to find the errors generated by stored procedures SSMS Sometimes shows keys - sometimes not 64 bit standard edition doesn't use page file Transaction Log Backup Doesn't Truncate Log Change autogrowth for a log file SQL 2005 Maintenance Plans |
|||||||||||||||||||||||