Home All Groups Group Topic Archive Search About

Partition Table Split Range Performance



Author
9 Dec 2008 9:56 PM
Lijun Zhang
I have a table with over 300 millions records. The total space used by this
table is around 120 GB. The table is used to records some activities. One
column is the activity date. We only keep 2 years of history data in this
table, and the data is purged on a weekly base. I am trying to partition the
table on one of our development server now. I used the following partition
function and scheme to partition the table on the activity date column:

CREATE PARTITION FUNCTION PF_Activity(datetime)
AS RANGE RIGHT FOR VALUES ('2007-07-01', '2008-01-01', '2008-07-01')

CREATE PARTITION SCHEME PS_Activity AS PARTITION PF_Activity
ALL TO ([MyFileGroup])

All the indexes is aligned exception the unique index on the idenity column.
After the table is partitioned, I run the following to create a new empty
partition to prepare to hold the newly inserted data. It completed within a
few seconds.

ALTER PARTITION SCHEME PS_Activity
NEXT USED [MyFileGroup]

ALTER PARTITION FUNCTION PF_Activity()
SPLIT RANGE ('2009-01-01')

I want to make sure that merging an empty partition will be fast too. So I
ran the following to create an empty partition on the oldest data:

ALTER PARTITION FUNCTION PF_Activity()
SPLIT RANGE ('2005-01-01')

The minimum activity date on this table is 2006-12-01. So I suppose SQL
Server will realize the newly splited partition will be empty, and the
statement will finish soon.

But to my surprise, the statement has been running over 5 hours and has not
yet complete. It looks like SQL Server is moving the whole table around.

Does anybody have the same experience with me? Or is this a known problem
with SQL Server? I mean people sometimes do need to add a empty partition on
the other side, right? You can not guarantee the data is only growing in one
direction, right?

Thanks for any input.

Lijun

Author
10 Dec 2008 2:34 AM
Dan Guzman
> CREATE PARTITION FUNCTION PF_Activity(datetime)
> AS RANGE RIGHT FOR VALUES ('2007-07-01', '2008-01-01', '2008-07-01')
> snip <
> ALTER PARTITION FUNCTION PF_Activity()
> SPLIT RANGE ('2005-01-01')
>
> The minimum activity date on this table is 2006-12-01. So I suppose SQL
> Server will realize the newly splited partition will be empty, and the
> statement will finish soon.
>
> But to my surprise, the statement has been running over 5 hours and has
> not yet complete. It looks like SQL Server is moving the whole table
> around.

Be aware that the new partition is the one that contains the boundary value
when you SPLIT a partition.  Due to the RANGE RIGHT specification, all the
data >= '2005-01-01' and < '2007-07-01' is moved into the new partition
while data < '2005-01-01' (no rows in this case) remains in the original
partition.  Also the partition with boundaries '2005-01-01' and <
'2007-07-01' is placed on the NEXT USED filegroup of the partition scheme.

> Does anybody have the same experience with me? Or is this a known problem
> with SQL Server? I mean people sometimes do need to add a empty partition
> on the other side, right? You can not guarantee the data is only growing
> in one direction, right?

The key is to efficient SPLIT and MERGE performance is planning. You can
still split a non-empty partition efficiently if no data needs to be moved
into the new partition and a useful index on the partitioning column exists
for SQL Server to determine that no data movement is necessary.  The same
consideration apply to MERGE except that the removed partition is the one
that contains the boundary and data are moved from that partition into the
adjacent retained one.

I generally suggest you keep the first and last partitions empty to simply
things for SPLIT and MERGE.

--
Hope this helps.

Dan Guzman
SQL Server MVP
http://weblogs.sqlteam.com/dang/
Are all your drivers up to date? click for free checkup

Author
10 Dec 2008 9:40 PM
Lijun Zhang
Thanks for your help. I assume after the partition, the future index
rebuild/defragmentation will most likely apply to the partition that contain
the oldest data, where the data will be purged from, and the partition that
contain the newest data, where data will be added to. I am testing it now.
Wish SQL Server can add a TRUNCATE command to truncate a partion only, not
just a whole table.

Lijun
Author
11 Dec 2008 4:03 AM
Dan Guzman
> Thanks for your help. I assume after the partition, the future index
> rebuild/defragmentation will most likely apply to the partition that
> contain the oldest data, where the data will be purged from, and the
> partition that contain the newest data, where data will be added to. I am
> testing it now.

One of the benefits of partitioning is that you can selectively rebuild or
reorganize index partitions.  This allows you to avoid touching partitions
that are not modified.  You should always update stats after a SWITCH since
there is only one set of statistics for the entire table.

> Wish SQL Server can add a TRUNCATE command to truncate a partion only, not
> just a whole table.

You can do this indirectly with a staging table.  SWITCH the partition into
an empty staging table and then truncate the staging table.  If you have
multiple filegroups, you can partition the staging table using the same
partition scheme as the primary table.

--
Hope this helps.

Dan Guzman
SQL Server MVP
http://weblogs.sqlteam.com/dang/

Show quoteHide quote
"Lijun Zhang" <sonyzhan***@yahoo.com> wrote in message
news:uCDj%23$wWJHA.868@TK2MSFTNGP06.phx.gbl...
> Thanks for your help. I assume after the partition, the future index
> rebuild/defragmentation will most likely apply to the partition that
> contain the oldest data, where the data will be purged from, and the
> partition that contain the newest data, where data will be added to. I am
> testing it now. Wish SQL Server can add a TRUNCATE command to truncate a
> partion only, not just a whole table.
>
> Lijun
>

Bookmark and Share