Sunday, 26 October 2014
     
AUSQLUG Forums Home   Search   Login   Register   Member List  
SQL Server 2000 Forums  > SQL Server - Programming  > Rowcount revisited  
 
Display using:  
Previous Thread :: Next Thread 
 Author Thread: Rowcount revisited
ace is not online. Last active: 27/11/2011 11:16:50 PM ace
Top 25 Poster
Joined: 24 Jun 2004
Total Posts: 200
 
Rowcount revisited
Posted: 23 May 2005 03:03 PM
Hi all,

I was wondering as to what is best way to have an accurate rowcount of a table in SQL Server. I can't really do a SELECT COUNT(*) as we're talking about massive amounts of data in this case.

In fact under circumstances does it goes out of sync on SQL Server 2000?

All INSERTs are logged, and if you look at the log via DBCC LOG it seems to modify the [rows] column in [sysindexes] from memory. Likewise with DELETEs.

So it would be for "non-logged" operations like TRUNCATE TABLE, although strictly speaking there is no such thing in SQL 2000. Except that depending on your database recovery model different things are logged, so maybe that influences this. Although it shouldn't.

So does anyone know under what circumstances it all gets out of sync?

Otherwise the best approach I can think of is to write INSERT and DELETE TRIGGERs on all table which increment and decrement a table with the table names and rows.

If I can guarantee that I only have single row inserts for example on a database in simple recovery is the sysindexes guaranteed to be accurate? Thus the quesion about the circumstances above.

Anyway, has anyone come across and technical literature on why [sysindexes].[rows] get out of sync with SQL Server 2000?

Cheers,

Victor
philcart is not online. Last active: 7/01/2013 4:04:01 PM philcart
philcart.blogspot.com
Top 25 Poster
Forum Moderator
Joined: 20 Jul 2004
Total Posts: 345
 
Re: Rowcount revisited
Posted: 23 May 2005 03:25 PM
Hmm ... what amount of data do you term "massive amounts"? I've done a COUNT([PK Column]) over a few million rows without much problems. Of course doesn't a COUNT(*) read in all fields in the row? Which would in turn comsume more resources. I also recall reading somewhere that this can also lead to rows being excluded due to nullability.

Generally when I'm looking for an accurate rowcount and I can't do a COUNT, I run DBCC UPDATEUSAGE on the specific table/index and then query sysindexes.

I think the "technical literature" that you're looking for will be in some dark hole somewhere in Microsoft. Not good marketing practice to advertise product failings ;)



Thanks
Phill

Colt 45 - the original point and click interface ;)
http://philcart.blogspot.com/
ace is not online. Last active: 27/11/2011 11:16:50 PM ace
Top 25 Poster
Joined: 24 Jun 2004
Total Posts: 200
 
Re: Rowcount revisited
Posted: 23 May 2005 04:36 PM
Hmmm...

We're talking about 100 tables or so, a third in excess of 10 million records, some over 500 million. And growing... About 20GB per quarter.

It's basically to see if 2 databases are in sync. Real-time system. Legacy one has particularly inefficient data types, bad indexing and very wide tables. PK is typically over 50 bytes in size. The legacy database is approaching 300GB. Only get's better huh?

So it all collectively adds up.

DBCCUPDATE USAGE won't help as that is an expensive operation and this real-time system has data fed into it every 5 minutes.

My problem is that I think that all these SQL Gurus are just repeating the mantra of you can't trust [SYSINDEXES].[ROWS] from SQL 6 days, but no-one has tested this. Perhaps it can be relied upon in certain circumstances...

With my testing below it seems that [SYSINDEXES].[ROWS] shold be accurate if you only ever have T-SQL INSERTs, DELETEs (UPDATEs are irrelevant) and even TRUNCATEs:

create database victor

go
use victor
go
create table t1 (c1 int)
checkpoint
insert t1 values (1)
dbcc log (victor,3) -- Note LOP_DELTA_SYSIND record
select count(*) FROM t1 union select rows from sysindexes
where name = 't1'
-- Same!
checkpoint
insert t1 values (2)
dbcc log (victor,3) -- Note LOP_DELTA_SYSIND record
select count(*) FROM t1 union select rows from sysindexes
where name = 't1'
-- Same!
checkpoint
delete t1 where c1 = 2
dbcc log (victor,3) -- Note LOP_DELTA_SYSIND record
select count(*) FROM t1 union select rows from sysindexes
where name = 't1'
-- Same!
checkpoint
update t1 set c1 = 10 where c1 = 1
dbcc log (victor,3) -- Note LOP_DELTA_SYSIND record
select count(*) FROM t1 union select rows from sysindexes
where name = 't1'
-- Same!

-- What about a nested insert?
checkpoint
insert t1 select * from t1
dbcc log (victor,3) -- Note LOP_DELTA_SYSIND record
select count(*) FROM t1 union select rows from sysindexes
where name = 't1'
-- Same!

-- What about a truncate?
checkpoint
truncate table t1
dbcc log (victor,3) -- Note LOP_DELTA_SYSIND record
select count(*) FROM t1 union select rows from sysindexes
where name = 't1'
-- Even truncate seems to works as you'd expect as the
operation is no logged

drop database victor


Since all operations in SQL 2000 are logged, it looks like every T_SQL INSERT and DELETE is logged an modifies [SYSINDEXES] on the way...

The only thing I can think of is your BCP / BULK INSERTs where (I imagine as I have not tested this) the T/L might just record the new extents/pages allocated, or something like that...

So that's why I'd like to know that if I can guarantee that my database only ever gets INSERTs (not even DELETEs, unless it's to correct a mistake which shouldn't happen) and minimal UPDATEs (which should have no effect), no BCP / BULK INSERT / DTS operations, is
[SYSINDEXES].[ROWS] accurate?

All I am after is at least one (or two, but not all) scenario where in the above circumstance where I can not rely upon it. As is will than straight away negate my hypothesis that it [SYSINDEXES].[ROWS] is accurate.

Perhaps under certain circumstance it can be...


Cheers,

Victor

As an aside a COUNT(*) does not read all the fields in a row, that is the worst possible thing it can do ;o) You're right about nullability with aggregate functions but COUNT(*) is THE exception.

nam.dang06 is not online. Last active: 20/09/2005 2:45:46 PM nam.dang06
Top 50 Poster
Joined: 10 Mar 2005
Total Posts: 8
 
Re: Rowcount revisited
Posted: 24 May 2005 12:49 PM
Hi Victor,

I have done some massive inserts and deletes and find that rowcounts from sysindexes sometimes do not match with select count(*) from X.

This is what I " think" that happens...

MSsql server do update statistics automatically as we know. However it only do update stats when server is not busy (if busy it will take low priority).

The way I've got around it:
1. come back the next day (if server not busy), you may find sysindexes get updated.
2. I force it to run update stats on table and this will update on sysindexes.


Update stats can take a very long to run on big tables

I would like hear your opinion on this as I'm not too sure if this is an ideal thing to do.

Cheers,
Nam


ace is not online. Last active: 27/11/2011 11:16:50 PM ace
Top 25 Poster
Joined: 24 Jun 2004
Total Posts: 200
 
Re: Rowcount revisited
Posted: 24 May 2005 02:50 PM
Hi Nam,

Just quickly, how do you the inserts? Single T-SQL INSERTs or soemthing else? Are there also DELETEs, etc?

UPDATE STATISTICS is triggered by the number of rows / percentage of rows changed depending on the cardinality of the table. So it's quite simple with when it's triggered by the Optimiser. Don't think it has anything to do with how busy your system is. I think.

Problem is that with this system the data is updated every 5 minutes, so even if I UPDATE STATISTICS at 3 AM, it might beout of sync by lunchtime, which is unaccpetable as this is a real-time market analysis system in a trading environment.

I am investigating this further by myself and through other factors, will be curious to see what eventuates.

Will post anything useful of interesting...

Thanxs for your comments,

Victor
JSBBS is not online. Last active: 23/08/2010 4:54:20 AM JSBBS
www.jsbbs.com.au
Top 50 Poster
Joined: 04 Mar 2005
Total Posts: 6
 
Re: Rowcount revisited
Posted: 01 Jun 2005 10:02 AM
On a previous large scale db, I ran across the same issue. After a lot of digging around MS and various SQL sites, I discovered someone figured out the the system tables could be used to retrieve the recordcount rather than querying the table in question.

Try this stored procedure...

Create Procedure dbo.pr_RecordCount
(
@pTableName nvarchar(128)
)
As

Set nocount on

select rows as Count
from sysobjects o inner join sysindexes i on o.id = i.id
where i.indid < 2
and o.name = @pTableName
return

GO
nam.dang06 is not online. Last active: 20/09/2005 2:45:46 PM nam.dang06
Top 50 Poster
Joined: 10 Mar 2005
Total Posts: 8
 
Re: Rowcount revisited
Posted: 03 Jun 2005 01:23 PM
This stored proc will work ONLY if stats is up to date.
If statistics is not being updated then the results from the proc will be out of synch with the actual select count(*) from table_name.
prfbrown is not online. Last active: 22/02/2007 9:22:27 PM prfbrown
Top 50 Poster
Joined: 09 Aug 2005
Total Posts: 5
 
Re: Rowcount revisited
Posted: 10 Aug 2005 06:03 PM
You may also want to routinely write the rowcounts for your tables into a rowcount history table seeing as though you have them being gathered. Retrospective analysis of rowcount change is sometimes useful for planning. The TSQL code that gathers the rowcount can also gather a number of other parameters (eg: space allocations for table data and indexes) and therefore you are able to extrapolate trends towards capacity planning.

I have some further information about rowcounts on the web here:
http://www.mountainman.com.au/software/Winluck/row_counter.html

Previous Thread :: Next Thread 
Page 1 of 1
 
AUSQLUG Forums  > SQL Server 2000 Forums  > SQL Server - Programming  > Rowcount revisited