Fog Creek Software
Discussion Board




Script Timeout searching full text of bugz

How would you go about fixing the FogBugz bug of a script timeout while searching a very small bug database consisting of just two thousand bug entries?

The specific search item is for the word "Scripting".

Andy Finkenstadt
Friday, November 21, 2003

Are you using Access?

Access doesn't have full-text search, so its not really possible using Access (we allow you to do it, but it uses a LIKE query which is just slow as mollasses.. MySQL and MSSQL do have full text searching).

Michael H. Pryor (fogcreek)
Friday, November 21, 2003

We're using SQL Server on a small, 1 gigabyte ram 2.4 ghz pentium 4 server, remote from the production (and test, and staging) web servers.  It times out on all machines.

Andy Finkenstadt
Monday, November 24, 2003

Which version of FogBUGZ?

Michael H. Pryor (fogcreek)
Monday, November 24, 2003

3.19.  I can email the HTML output, if that helps.  It's a fogbugz page with "Script timout" in the middle, followe by the fogbugz trailer.

Andy Finkenstadt
Monday, November 24, 2003

The html page won't help unfortunately.

Which version of MySQL?

Michael H. Pryor (fogcreek)
Monday, November 24, 2003

Oops, emailed by accident.

We use the Microsoft SQL Server 2000 version of MySQL.

Andy Finkenstadt
Monday, November 24, 2003

I don't know why I thought you were using MySQL...

Go into enterprise manager and check on your full text catalog... how big is it?  is it populated?

Michael H. Pryor (fogcreek)
Monday, November 24, 2003

"The catalog size is 7MB and it was last populated 11/25/2003 12:02:08 am"

Is that what you were looking for?

Andy Finkenstadt
Tuesday, November 25, 2003

"there are currently 21716 items and it is currently idle."  Whatever that means. :)

I'm an Oracle guru, we are using SQL Server 2000 to avoid the effort of a (technically unsupported) port of FogBugz.

Andy Finkenstadt
Tuesday, November 25, 2003

In CBugList.asp, at the bottom of the function "GetSQLSearchFull", you will see

        Set GetSQLList = NewCmd( s & sOrderBy & sSuffix )

Add
Response.write s & sOrderBy & sSuffix
REsponse.end

Do your search, it will display the query its executing.
Take that into query analyzer and run it...
See if you can take parts of it out to speed up the query or if you can profile the query with Profiler to find out what the bottleneck is.
If it runs fine, then the bottleneck is the network somehow... otherwise the queryanalyzer should lead you to the problem.

Michael H. Pryor (fogcreek)
Tuesday, November 25, 2003

Adding:

    NewCmd.CommandTimeout = 480

in function NewCmd() alleviated the issue of performance for our specific content of bugs.  It returned the SIXTEEN (16!!) bugs that happened to contain the word 'Scripting'.  The query analyzer was giving back an error message "[Microsoft][ODBC SQL Server Driver]COUNT field incorrect or syntax error", related, I"m sure to my inexperience in ever using SQL server 2000 "for real".

Commentary on the SQL code generator: 

is there something wrong with using bind parameters instead of inline substitution?  If I ever do a port to Oracle, I will be blowing chunks in my shared parsed SQL area with all the various textual substitutions dragging down the potential performance of my database.

And is there some wierd reason why there's a 6-way nested INNER JOIN in the FROM TABLE clause, instead of using the more usual WHERE clause joins?  There's no reason that I can think of not to, at least, not use nesting preferring, instead, to use (syntax correction necessary)

SELECT columns
FROM bugTable
  INNER JOIN projectTable ON bug.Project = project.PK
  INNER JOIN blahTable on bug.Blah = blah.PK
  INNER JOIN etc.

Andy Finkenstadt
Tuesday, November 25, 2003

Did you figure out what was causing the slowdown?  We're using MS SQL server (along with a bunch of our customers) and searching full-text bugs using 3.1.9 doesn't take any unusual amount of time.

Send me your query via email and I'll take a look...

Also, unfortunately I can't parse this :)
"is there something wrong with using *bind parameters instead of inline substitution*?  If I ever do a port to Oracle, I will be blowing chunks in my shared parsed SQL area with all the various textual substitutions dragging down the potential performance of my database."
Can you clarify what you mean?  I don't know what you mean by bind params vs. inline sub. and 'shared parsed sql area'...

Michael H. Pryor (fogcreek)
Tuesday, November 25, 2003

I did not figure out what is causing the slowdown.  My suspicion lies in the TOP 50 clauses, or in the ixBug in ( SELECT TOP 50... where text like '%Scripting%') clause.  We have several thousand open bugs (most of them, duplicate), and it was a big surprise to discover that there were only a few bugs that dealt with that specific search term.  Searching on MY name, on the other hand, comes back in just a couple seconds, so I suspect it has something to do with the actual data content.

Next post...

Andy Finkenstadt
Tuesday, November 25, 2003

Bind parameters vs. Inline substitution

Consider these two SQL statements:

" SELECT * FROM TABLE WHERE COLUMN = ? "

" SELECT * FROM TABLE WHERE COLUMN = 1 "

The former requires a cmd.AddParameter call to "bind" the ? parameter into the query, the latter does not.  Usually I use named parameters preceeded by a colon, rather than positional parameters denoted by the ODBC idiom '?'.  So:

" SELECT * FROM TABLE WHERE COLUMN = :parm"
: cmd.AddParameter "parm", adType, adDirection, value

Why even bother?

Consider: what if a parameter is a string instead of a number, do you really want to have to take the trouble to remember to remember to change quote marks & other SQL special characters before building the resulting SQL command string, or would you rather just attach the parameter as-is without needing to worry about how the parameter's value would affect the SQL?

Forgetting to do the text substitution in just one place opens up your code to "SQL Injection" attacks, or to outright syntax errors that can yield exposure of sensitive data in the resulting error message, or worse.  (Some releases of ODBC had escape characters to permit command execution via command.com!  FORMAT C: /S /V /Y to the rescue.)

Next post...

Andy Finkenstadt
Tuesday, November 25, 2003

Parsing versus Execution.

Most databases, especially Oracle and SQL Server, have an ability to cache the query execution plans for statements that are identical in both syntax (letters & numbers) and semantics (which table does it really refer to).

The most expensive part of the statement execution process for "short" queries is the parse phase.  During Parsing, the database checks for valid SQL syntax, converting them into tokens and checking against a syntax tree.  Then, the database checks for object name resolution for the current executing user, against whatever objects they have access to, via their private synonyms, the public synonyms, their own schema's tables, and their default schema's table.  Then, after confirming the user has valid access to everything, it determines a "plan" to actually execute the query.

Now that it has a plan, it executes it.

Caching of the results of the expensive parse operation involves matching identically formed queries, matching identical user context, and then retrieving the stored execution plan and executing it.  The match portion of this is performed by comparing the LENGTH of the SQL statement and its corresponding HASH to an in-memory HASH/LENGTH pair.  If they are identical, then the database checks for the user contexts available, finds the right one, and retrieves the plan.

Consider a SQL statement to retrieve bug information for bug #1.

"SELECT * from Bug where PK = 1"

Now, do bug #2:

"SELECT * from Bug where PK = 2"

Now do bug #3:

"SELECT * from Bug where PK = 3"

You can see where this is going.  Each of those SQL statement strings has a different hash value, and so the database is unable to re-use previous work, and has to perform the full syntax analysis and execution planning every time.  Unless someone happens to retrieve the same bug while a previous copy is still around in the cache.

Now use parameter binding:

"SELECT * from Bug where PK = :bugNumber"
: cmd.AddParameter "bugNumber", adLong, adIn, 1

"SELECT * from Bug where PK = :bugNumber"
: cmd.AddParameter "bugNumber", adLong, adIn, 2

"SELECT * from Bug where PK = :bugNumber"
: cmd.AddParameter "bugNumber", adLong, adIn, 3

The database can notice that the SQL statements are identical, even to case and whitespace.  It discovers that the same user context is being used (highly likely in Fogbugz).  It retrieves the execution plan, and voi la, it is done.

There's a limited amount of cache RAM for the execution plans, and the RAM is shared with a bunch of other things.  This is the shared (across all users) "parsed SQL area" inside of the database.

Next Post...

Andy Finkenstadt
Tuesday, November 25, 2003

Try this:

Query 1:
SELECT DISTINCT TOP 50 Bug.ixBug as ixBug FROM (Bug INNER JOIN BugEvent ON Bug.ixBug = BugEvent.ixBug)  WHERE Contains(BugEvent.s, 'searchterm' )

Query 2:
SELECT TOP 50 ixBug FROM Bug where (sTitle LIKE 'searchterm' OR Bug.ixBug = -1)

Query 3:
SELECT TOP 50 ixBug  FROM Bug INNER JOIN BugEvent on Bug.ixBug = BugEvent.ixBug WHERE (ixBug IN
( ** QUERY1 HERE **
  UNION
  **QUERY2  HERE ** ))

Also, we do use bound parameters almost in every single place in fogbugz, for exactly the reasons you said (security, ease of quoting, cacheing by the database). Do a grep on cmd.Parameters.  The only place we don't use it is in the special quoting for Contains in Query1 where we take care of the parsing ourselves because the cmd object in ADO didn't like a ? in the statement there (and sometimes for internal data parameters that we know are safe (like ixBug))

Michael H. Pryor (fogcreek)
Tuesday, November 25, 2003

I acknowledge that there is some usefulness in building SQL statements from the group up and passing them along to various other routines that add on "WHERE" clauses.  In order for those other routines to be generic, they can't really "know" what other parameters need to be .add'ed from the preceeding stages without being told (and pathalogically coupling together routines that need not be). 

In addition, I can see that the INNER JOINS are a necessary evil, if one does not want to have to pass along a SQL statement and its WHERE clause, to know whether, in a routine that is grabbing onto a 6-way join, it needs to append its own where clause filters with "AND" or "WHERE".

On the other hand, that last problem is easily gotten around by putting in a dummy "WHERE 1=1" after the FROM clause, thus making ALL add-ons to a where clause become "AND".

Sorry for writing a book.

Andy Finkenstadt
Tuesday, November 25, 2003

I could use a pointer to the SQL Server equivalent of "SQL*Plus" command line SQL execution tool.  I'll try out those queries tomorrow morning, or tonight if I can trick VNC into letting me into the database server from home.

Andy Finkenstadt
Tuesday, November 25, 2003

Start->Programs->Query Analyzer... or you could use Enterprise Manager...

I don't know what the command line tool is.. Hey, its windows! Whats a command line ;)

Michael H. Pryor
Tuesday, November 25, 2003

*  Recent Topics

*  Fog Creek Home