Fog Creek Software
Discussion Board




String Concatenation

My team codes our GIS applications primarily in VB, with one developer who can drop into C++ for the math-intensive portions.  We also use XML extensively as the data transport mechanism between the GUI components and the database components.  We use the DOM to parse XML, but tend to build the XML with conventional string manipulation.

As Joel's recent article "Back to Basics" pointed out, the "&" operator in VB can be extremely slow.  We have built a class specifically for string concatenation.  In some instances it makes nearly three orders of magnitude difference in performance, and almost always two.

But to the point of my question: does anyone know whether there is a significant difference between

    r = a & b & c & d

and

    r = a & b
    r = r & c
    r = r & d

I was just curious whether the VB compiler recognizes the multiple concatenations and handles them as one optimized operation, or whether internally it performs the same three inefficient concatenations as the second example.

Thanks

Ted Macy
Saturday, December 22, 2001

This ASP white paper makes it look like there is a difference, but I'm not sure how reliable it is...

http://www.valtara.com/csc123/WhitePaper/ASPBestPractices.htm#11

Look at #6 under ASP Coding Style

Michael Pryor
Saturday, December 22, 2001

If all you care about is concatenation speed, then just make a new string type that is implemented as a linked list of fragments. Then concatenation is just setting a pointer. (Each fragment may need a reference count for memory management, and if you want to do anything else with these strings you'll have to convert them back to regular strings, though that is only a single O(n) operation.)
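A minimal sketch of that idea in plain C, under some loud assumptions: the names FragString, frag_append, frag_flatten and frag_free are made up for illustration, fragments borrow the caller's text instead of copying it, and the reference counting mentioned above is left out entirely.

```c
#include <stdlib.h>
#include <string.h>

/* One piece of a larger logical string. The text is borrowed, not copied,
   so the caller must keep it alive (no reference counting in this sketch). */
typedef struct Fragment {
    const char *text;
    size_t len;
    struct Fragment *next;
} Fragment;

/* A string held as a singly linked list of fragments (hypothetical type). */
typedef struct {
    Fragment *head;
    Fragment *tail;
    size_t total_len;
} FragString;

/* Append one fragment. No existing text is copied: only a small node is
   allocated and a pointer is set. Returns 0 on success, -1 on failure. */
int frag_append(FragString *s, const char *text, size_t len)
{
    Fragment *f = malloc(sizeof *f);
    if (f == NULL)
        return -1;
    f->text = text;
    f->len = len;
    f->next = NULL;
    if (s->tail != NULL)
        s->tail->next = f;
    else
        s->head = f;
    s->tail = f;
    s->total_len += len;
    return 0;
}

/* Convert back to a regular NUL-terminated string: one allocation and one
   O(n) pass over the fragments. The caller frees the result. */
char *frag_flatten(const FragString *s)
{
    char *out = malloc(s->total_len + 1);
    char *p = out;
    const Fragment *f;
    if (out == NULL)
        return NULL;
    for (f = s->head; f != NULL; f = f->next) {
        memcpy(p, f->text, f->len);
        p += f->len;
    }
    *p = '\0';
    return out;
}

/* Release the list nodes (not the borrowed text). */
void frag_free(FragString *s)
{
    Fragment *f = s->head;
    while (f != NULL) {
        Fragment *next = f->next;
        free(f);
        f = next;
    }
    s->head = s->tail = NULL;
    s->total_len = 0;
}
```

Starting from an empty FragString, appending "<point>", "12" and "</point>" and then calling frag_flatten gives the ordinary string "<point>12</point>". The trade-off is the one described above: appends are cheap, but anything other than appending needs the flatten step first.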

Dan Maas
Saturday, December 22, 2001

Do some easy tests and you find all sorts of interesting things.

In VB6, compiling to native code,

  r = a & b & c

is slightly SLOWER than

  r = a
  r = r & b
  r = r & c

Also, r = a & (b & c) is MUCH faster than r = a & b & c when a is a large string.

John Wiseman
Saturday, December 22, 2001

"Also, r = a & (b & c) is MUCH faster than r = a & b & c when a is a large string. "

And much slower if c is a large string compared to a and b ;)
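Both observations fall out of a simple copy-counting model, sketched here in C. The assumption (mine, not something measured in VB) is that every & allocates a fresh result and copies both operands into it; the function names are made up for illustration.

```c
#include <stddef.h>

/* Characters copied to build x & y, assuming each concatenation
   allocates a new string and copies both operands into it. */
size_t concat_cost(size_t len_x, size_t len_y)
{
    return len_x + len_y;
}

/* r = a & b & c, evaluated left to right as (a & b) & c.
   Total copied: 2a + 2b + c, so a is copied twice. */
size_t cost_left_grouped(size_t a, size_t b, size_t c)
{
    return concat_cost(a, b) + concat_cost(a + b, c);
}

/* r = a & (b & c).
   Total copied: a + 2b + 2c, so c is copied twice. */
size_t cost_right_grouped(size_t a, size_t b, size_t c)
{
    return concat_cost(b, c) + concat_cost(a, b + c);
}
```

With a of 1,000,000 characters and b and c of 10 each, the left grouping copies 2,000,030 characters and the right grouping 1,000,040. Swap the sizes of a and c and the totals swap too. The model only counts copying, so it does not claim to explain the full size of the measured difference.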

Jan Derk
Tuesday, December 25, 2001

If you are going to do string concatenation in any volume, I recommend you write a DLL or OCX to handle it for you. I made a class with a single function that takes a Variant containing an array of BSTRs and returns a single BSTR. The performance win comes from precalculating the length needed for the output string and allocating it ONCE, thus:

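/* pbstr is assumed to point at the BSTR elements of the incoming array;
   lLBound and lUBound are its lower and upper bounds. */
total_length = 0;
chars_so_far = 0;
cElements = lUBound - lLBound + 1;

/* First pass: add up the character counts. */
for (i = 0; i < cElements; i++)
{
  total_length += SysStringLen(pbstr[i]);
}

/* Allocate the output string once. */
output = SysAllocStringLen(NULL, total_length);
if (output == NULL)
  return E_OUTOFMEMORY;  /* assumes the enclosing method returns an HRESULT */

/* Second pass: copy each piece to its offset in the output. */
for (i = 0; i < cElements; i++)
{
  this_len_bytes = SysStringByteLen(pbstr[i]);
  if (this_len_bytes > 0)  /* skip empty or NULL BSTRs */
    memcpy(output + chars_so_far, pbstr[i], this_len_bytes);
  chars_so_far += SysStringLen(pbstr[i]);
}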

This was the only real performance bottleneck in my VB app. Doing my string concatenation this way made a huge difference in performance.

Jeff Paulsen
Tuesday, December 25, 2001

You can do it in VB by preallocating the string and using the Mid statement to write each piece of text into place.
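The same preallocate-and-place pattern, sketched in C rather than VB so it can be compiled and checked anywhere. Everything here is illustrative: Buffer, buffer_init and buffer_put are made-up names, and the doubling growth policy is an assumption added for the case where the initial guess is too small.

```c
#include <stdlib.h>
#include <string.h>

/* A preallocated text buffer with a write position (hypothetical type). */
typedef struct {
    char *data;       /* capacity + 1 bytes, always NUL-terminated */
    size_t used;      /* characters written so far */
    size_t capacity;  /* characters available before growing */
} Buffer;

/* Allocate the buffer once, up front. Returns 0 on success, -1 on failure. */
int buffer_init(Buffer *b, size_t capacity)
{
    b->data = malloc(capacity + 1);
    if (b->data == NULL)
        return -1;
    b->used = 0;
    b->capacity = capacity;
    b->data[0] = '\0';
    return 0;
}

/* Place text at the current write position, the role Mid plays in VB.
   Grows by doubling only when the preallocated space runs out. */
int buffer_put(Buffer *b, const char *text)
{
    size_t len = strlen(text);
    if (len > b->capacity - b->used) {
        size_t new_cap = b->capacity ? b->capacity : 16;
        char *p;
        while (len > new_cap - b->used)
            new_cap *= 2;
        p = realloc(b->data, new_cap + 1);
        if (p == NULL)
            return -1;
        b->data = p;
        b->capacity = new_cap;
    }
    memcpy(b->data + b->used, text, len);
    b->used += len;
    b->data[b->used] = '\0';
    return 0;
}
```

If the initial capacity is a good estimate, no reallocation happens and each piece is copied exactly once. In the VB version the equivalent final step would be trimming the preallocated string down to the length actually used.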

Neila
Tuesday, January 01, 2002

Optimising string concatenation in VB is a well-worn issue that has been covered to death in many forums (including http://peach.ease.lsoft.com/scripts/wa.exe?A0=visbas-l). No need to discuss it here. ;-)

Matthew Wills
Tuesday, January 01, 2002

You do the math in C(++) and the string concats in VB. But did you know that VB (6?) is relatively good and efficient at math (only 10-15% slower than VC) and VERY BAD at string concats? What VB does is copy the original strings to a new string with EVERY concat you do! This is a very slow and inefficient process. Of course it could be many times faster by simply reconnecting string pointers. For this reason the environment I normally work in has a dedicated (and proprietary) COM string concat module, which is written in C and performs string concats up to 1000 times faster than VB!

I think you should know this!

Rients Dijkstra
Friday, August 01, 2003
