Home All Groups Group Topic Archive Search About

expectations for SUM query on 50+ million rows



Author
16 Mar 2006 4:41 PM
Jeff Turner
what's a reasonable expectation for a query like this run against a
table with 50+ million rows with SQL 2005 Standard:

SELECT
    NAME, SUM(Tickets) as NumTickets
FROM
    MYTABLE
WHERE
    TicketDate BETWEEN @StartDate and @EndDate
GROUP BY
    NAME
ORDER BY
    NumTickets DESC

let's assume my table is properly indexed, and the server machine is a
dual xeon 2.8 with 2GB of memory, and plain jane SATA drives in RAID-1.

also assume the result rows (between start and enddate) could be
anywhere from 5k to 500k rows, and there's minimal load on the server.

any takers? I'm looking for a ballpark of how well SQL should be able
to handle this...and things to consider to speed this query up.

thanks!

Author
16 Mar 2006 4:48 PM
SQL
do you need sum or count?

take a look at this count is 2 but sum is 3

select count(tickets),sum(tickets)
from (select 1 as tickets union all
select 2) x


http://sqlservercode.blogspot.com/
Are all your drivers up to date? click for free checkup

Author
16 Mar 2006 9:25 PM
David Browne
Show quote Hide quote
"Jeff Turner" <zig***@gmail.com> wrote in message
news:1142527278.244991.109940@e56g2000cwe.googlegroups.com...
> what's a reasonable expectation for a query like this run against a
> table with 50+ million rows with SQL 2005 Standard:
>
> SELECT
>    NAME, SUM(Tickets) as NumTickets
> FROM
>    MYTABLE
> WHERE
>    TicketDate BETWEEN @StartDate and @EndDate
> GROUP BY
>    NAME
> ORDER BY
>    NumTickets DESC
>
> let's assume my table is properly indexed, and the server machine is a
> dual xeon 2.8 with 2GB of memory, and plain jane SATA drives in RAID-1.
>
> also assume the result rows (between start and enddate) could be
> anywhere from 5k to 500k rows, and there's minimal load on the server.
>
> any takers? I'm looking for a ballpark of how well SQL should be able
> to handle this...and things to consider to speed this query up.
>


Test it.

David
Author
17 Mar 2006 1:50 AM
Doug
how many dates are there?
is there an index on ticketdate?
is there a composite, non-clustered index on ticketdate,name,tickets  ?

i'd try changing the where clause to
WHERE
    TicketDate >= @StartDate and
    TicketDate <= @EndDate

Just a personal deal - sometimes between isn't quite obnoxious enough
on the hint to the compiler.

how many bytes is the average row?  if the average row is 1k, then
gonna take a long time. if average row is 50 bytes, then i'll say 1
second per 10k rows, given the perfect index.
Author
17 Mar 2006 10:34 PM
Hugo Kornelis
On 16 Mar 2006 08:41:18 -0800, Jeff  Turner wrote:

>what's a reasonable expectation for a query like this run against a
>table with 50+ million rows with SQL 2005 Standard:
(snip query and hardware description)

Hi Jeff,

I have no idea - you'll have to test it.

So why do I bother to reply? Becuase you also ask for ...

>...and things to consider to speed this query up.

Try the following two indexes. Hard to predict which one of them will be
the best.

CREATE NONCLUSTERED INDEX SomeName
    ON MyTable (Name, TicketDate, Tickets)

*** or ***

CREATE NONCLUSTERED INDEX SomeName
    ON MyTable (TicketDate, Name, Tickets)


--
Hugo Kornelis, SQL Server MVP

Bookmark and Share