Fog Creek Software
Discussion Board

Database Analysis Tool

I've been tasked by my manager with finding a particular kind of tool for analysing records in our database. He says "I know that a tool like this exists".

I'm not so sure.

But maybe you guys know more than I do.

The tool should be able to connect to a database and send it a query. When the records are returned from the query, with, say, 1000 records, the tool should be able to tell me how records 1 - 500 differ from records 501 - 1000.

Output might look something like this:

Records 1 - 500:
............TYPICAL VALUES: 25% 60% 15%

Records 501 - 1000:
............TYPICAL VALUES: 18% 40% 42%

Or something like that.

Maybe it provides statistics (mean, media, stdev) about how the two sets of records differ from eachother (in their numeric columns).

I could write software that would do these things, but my boss says he thinks that this type of data profiling software already exists.

Do you know of anything?

Benji Smith
Wednesday, January 21, 2004

Well, I don't know anything about a specific tool like that, but there are surely many packages for just analyzing data, say from a spreadsheet. I believe that Excel can hook into a database somehow, maybe ODBC (been a while since I've touched Excel. A long while). Once you have that, you should be pretty open as far as what you can do. You then wouldn't need a database specific package.

Mike Swieton
Wednesday, January 21, 2004

Ummm... it sounds vaguely like you want an OLAP cube.

Most major RDBMS' have analysis packages (some charge for them, some don't).

[disclaimer: I work for Microsoft]
Worst case, you could always use SQL server - if you've got a different engine running, you can DTS the data in and run analysis services (included) on it.
FWIW, you can get a 120-day demo for free to see if it does what you need.

CAUTION: This is by no means a weekend project. Analysis services on *any* platform are often a field unto themselves.

Best of luck!

Wednesday, January 21, 2004

You may be able to write your own tool if you cannot find a suitable commercial replacement.  You could use something like VB/ADO or whatever language suits your fancy.  It seems from your description of the desired output that it wouldn't be overly difficult.

My $0.02.

Dave B.
Wednesday, January 21, 2004

Doh, I'm sorry, didn't read your whole post.

Dave B.
Wednesday, January 21, 2004

Some kind of data mining tool might do it. Or you could write a perl one-liner :^)

Tom H
Wednesday, January 21, 2004


What you want is a pivot table. It's built right into Microsoft Excel.

Data > PivotTable.

I say "Eek" because this is the most valuable feature in Excel that the fewest people know about.

Joel Spolsky
Wednesday, January 21, 2004

Yep and it pretty much removed the whole point of  an entire Lotus product, damn what was its name... began with an 'I'

I think I was one of a tiny few that actually bought it.

Simon Lucy
Wednesday, January 21, 2004

Improv, I think.

Simon Lucy
Wednesday, January 21, 2004


Wednesday, January 21, 2004

Anyone know of some really good documentation (book, web, whatever) for Pivot Tables?  I don't mean 'how to use' but more advanced stuff.  Most things are easy to do, but some things are very well hidden.  I think what I want is a "pivot table cookbook" from O'Reilly!

Thursday, January 22, 2004

Well, there is a T-SQL Cookbook from O'Reilly that covers pivot tables. This would be for SQL Server though.

Thursday, January 22, 2004

*  Recent Topics

*  Fog Creek Home