Home All Groups Group Topic Archive Search About


Author
2 Jul 2009 4:40 PM
Derek Howard
I have 100+ fields of data in a single record relating to an individual
person. The data is unique to that person and has a one to one realtionship
to him. I have been told that that is to many fields and that it should be
broken up into other tables. That is is not normailzed. This does not make
sense to me. It would require me to read several table to get the data for
this one person, requiring more code and multiple select statements and
multiple update stements to edit.
Is there a recommended limit for the number of columns? If so, why?

Author
2 Jul 2009 5:53 PM
Tom Cooper
Short answer:  It depends.

Long answer:  Although having lots of columns in a table is often an
indication that that table has not been properly normalized, it is possible
for a table that is properly normalized to have many columns.  And I would
certainly never say, "If a table has more than X columns, then it is not
normalized".  Or say  "If a table has more than X columns, then that is too
many columns" (unless, of course, X is more than the SQL Server limit (1,024
for regular tables, 30,000 for wide tables) <grin>).

But if the table is properly normalized, then normalization is obviously not
a reason to split the table into multiple tables.  Sometimes, however, you
can get performance improvements if you split the table into multiple
tables.  If, for example, you have a table with 100 columns each averaging
20 bytes, then each row averages 2000 bytes and you can only get 3 or 4 rows
per 8K page.  So if you have lots of rows and a query that does a table
scan, that will take lots of I/O.

But often data access follows something like the 80/20 rule.  Maybe you have
20 columns where 80% of the time, you only want data from one or more of
those columns.  And only 20% of the time do you want data from one or more
of the other 80 columns.  If you split this into two tables, then the
smaller table will have rows averaging 400 bytes, so you may be able to get
about 15 - 20 rows in each 8K page.  So scanning that table will be
approxmately 5 times as fast as scanning the original table.  And if rows
from that table are cached in memory, they take only 1/5 of the memory it
takes to cache rows from the original table.

Of course, 20% of the time you need columns from the other table as well and
then you will have to join the two tables and there will be a cost to that.
Also, your table is no longer fully normalized and that will make coding
your maintaince more difficult and more subject to error.

So you need to decide if the performance improvement is worth it in your
case.

I would ask the person(s) telling you this why they believe in this case
100+ columns is too many.  Perhaps they have a good answer that will
convince you  Perhaps you can convince them.  Perhaps it's your manager, who
says, "Because I said so."

So I would probably make sure the table was properly normalized, consider
whether spliting the table would be worth it in performance when the system
is implemented with the number of rows and expected transaction volumn when
the system is fully implemented, and any inputs from people who are
knowledgable and/or whom you report to, and then do what seemed
best/feasible.

As I said, it depends.

Tom

Show quoteHide quote
"Derek Howard" <DerekHow***@discussions.microsoft.com> wrote in message
news:5F2E4A10-1AE5-4F76-8783-48D096CD61E4@microsoft.com...
>I have 100+ fields of data in a single record relating to an individual
> person. The data is unique to that person and has a one to one
> realtionship
> to him. I have been told that that is to many fields and that it should be
> broken up into other tables. That is is not normailzed. This does not make
> sense to me. It would require me to read several table to get the data for
> this one person, requiring more code and multiple select statements and
> multiple update stements to edit.
> Is there a recommended limit for the number of columns? If so, why?
Are all your drivers up to date? click for free checkup

Author
2 Jul 2009 11:03 PM
Bob Simms
"Derek Howard" <DerekHow***@discussions.microsoft.com> wrote in message
news:5F2E4A10-1AE5-4F76-8783-48D096CD61E4@microsoft.com...
> I have 100+ fields of data in a single record relating to an individual
> person. The data is unique to that person and has a one to one
> realtionship
> to him. I have been told that that is to many fields and that it should be
> broken up into other tables. That is is not normailzed. This does not make
> sense to me. It would require me to read several table to get the data for
> this one person, requiring more code and multiple select statements and
> multiple update stements to edit.
> Is there a recommended limit for the number of columns? If so, why?

In addition to Tom's answer, I'd look at the null values as well.  If some
of the columns have a lot of null values, they are still taking up space
(unless they are sparse columns)  In that case it might make sense to either
convert them to sparse columns if you have SQL 2008, merge them into a
single XML column (if SQL 2005+) or move them into other tables.

Bookmark and Share