Fog Creek Software
Discussion Board




Legacy Visual Basic 6 Efficiency Question

After finishing a VB.NET / J2EE project, I have been assigned to a poorly designed legacy VB6 application.

Instead of creating specific, strongly typed collections and classes, the application uses a lot of late binding, with generic object variables and generic collections into which many different object types are thrown. Variants are used all over the place!

I was wondering if anyone has any measurements showing how much such late-binding techniques, and the absence of strongly typed variables and objects, slow an application down.

I'd like to have some specific information to back up my point and justify a rewrite of those classes.

KenB
Tuesday, March 25, 2003

I'd do your own benchmark. It shouldn't take much time to create a simple object with some functions and a test app. I would create four variations of the same function:

testTypedByVal() - takes Typed arguments ByVal with a typed return.
testVariantByVal() - takes Variant arguments ByVal with a Variant return.
testTypedByRef() - takes Typed arguments ByRef with a typed return.
testVariantByRef() - takes Variant arguments ByRef with a Variant return.

Each function should make assignments and perform a test on the values.

Then, write the test app that records the times to run each case X number of times.
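Something like this, perhaps (the class name CSpeedTest and the function bodies are just for illustration):

' In the class module. Typed version: types are known at compile time.
Public Function testTypedByVal(ByVal x As Long, ByVal s As String) As Long
    testTypedByVal = x * 2 + Len(s)
End Function

' Variant version: every operation pays a run-time type check.
Public Function testVariantByVal(ByVal x As Variant, ByVal s As Variant) As Variant
    testVariantByVal = x * 2 + Len(s)
End Function

' In the test app: time each case over many iterations.
Sub RunTest()
    Dim obj As New CSpeedTest
    Dim t As Single, i As Long, r As Long
    t = Timer
    For i = 1 To 1000000
        r = obj.testTypedByVal(i, "abc")
    Next
    Debug.Print "Typed ByVal:   "; Timer - t; " sec"
    t = Timer
    For i = 1 To 1000000
        r = obj.testVariantByVal(i, "abc")
    Next
    Debug.Print "Variant ByVal: "; Timer - t; " sec"
End Sub

The ByRef pair would be the same with ByRef in the signatures. Given Timer's coarse resolution, you want the loop counts in the millions.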

Nick
Tuesday, March 25, 2003

The problem here is that late binding and the use of Variants are usually not a problem in terms of performance when used correctly.

For example, no one with a brain uses Word automation, or Outlook automation, with early binding. That is subject to so much reference breakage that it would get all the developers fired on the spot.

So, let's assume that late binding, in the case of launching Word to fill a template, is 1000 times slower. Wow! 1000 times slower! The problem here is that the execution time of the VB code is insignificant in that operation anyway!

While VB can easily execute 10 million instructions per second, at 1000 times slower you drop to 10,000 instructions per second when using late binding. So a single late-bound call costs about 1/10,000 of a second, and the user will NEVER notice that. And we only take that performance hit for the ONE late-bound instruction; the rest of the code for that object will run at the same speed as early binding in most cases.

On the other hand, in a data-processing loop that calls a routine which uses late binding, it certainly can affect performance in a big way. However, even then one can usually create an instance of the object BEFORE the loop, and then run the loop. In that case, once again, late binding does NOT really cost much from a performance point of view.
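To put that concretely (Word automation purely as an illustration; only the CreateObject is the expensive part being hoisted):

' Early bound: needs a project reference to the Word object library,
' and that reference breaks when the installed Word version differs.
' Dim wd As Word.Application
' Set wd = New Word.Application

' Late bound: resolved at run time, so there is no reference to break.
Dim wd As Object, doc As Object, i As Long
Set wd = CreateObject("Word.Application")

' Pay the expensive object creation ONCE, before the loop. Each call in
' the loop still goes through IDispatch, but creation does not repeat.
Set doc = wd.Documents.Add
For i = 1 To 100
    doc.Content.InsertAfter "line " & i & vbCrLf
Next

wd.Quit False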

This same thing applies to the use of Variants. I actually use Variants a lot when working with JET, since null values cannot be stored in a typed variable. So, for a lot of database work, I use Variants.
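For example (assuming an Access/DAO context; the table and field names are made up):

Dim rs As DAO.Recordset
Set rs = CurrentDb.OpenRecordset("SELECT MiddleName FROM tblCustomers")

Dim sName As String
Dim vName As Variant

' sName = rs!MiddleName    ' run-time error 94 (Invalid use of Null) when Null
vName = rs!MiddleName      ' a Variant holds the Null without complaint

If IsNull(vName) Then vName = ""   ' deal with the Null when it suits you

rs.Close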

So, the mere fact that the application has lots of Variants and late binding does NOT mean that those have anything to do with its poor performance.

Poor performance IS almost ALWAYS an issue of design.

I remember some time ago there was a discussion here about the speed of VB compared to C++. I commented that for loops and calculations the performance of VB was equal to, or in fact better than, C++. It was hilarious to see uninformed people jump in with a Sieve of Eratosthenes, or loops that shuffled values around in an array, for me to try. In virtually every example posted, the VB ran faster on my PC than the C++ did. So many people think that for simple loops and such C is faster than VB…and it is NOT!! Of course, many people don't think with their brains, and function at an emotional level (i.e., C++ is cool, and VB is not!!).

Most view VB as slow and C++ as fast. That is not usually the case. The main reason a C++ application runs faster is DESIGNS that take advantage of the machine's architecture (designs, not loop speed!!). There are some compiler optimizations that SKIP code, and that also helps C++. But when you HAVE to run the loop, VB gives C++ an equal run.

So, while late binding and the use of Variants are MUCH slower from an execution point of view, the cost is insignificant when they are used correctly. Simply removing late binding and removing Variants may not improve performance IN ANY NOTICEABLE way.

I mean, for a form that loads and runs only a few hundred lines of code at load time, the use of late binding or Variants is NOT going to affect load time in any noticeable way.

That discussion, where all the C++ people who don't understand this important fact came up with examples for me to try, can be found at:

http://discuss.fogcreek.com/joelonsoftware/default.asp?cmd=show&ixPost=7598

Actually, I am sure someone can come up with a loop that does run faster in C++, but in every example posted in that thread, VB ran faster for me!

Albert D. Kallal
Edmonton, Alberta Canada
Kallal@msn.com

Albert D. Kallal
Tuesday, March 25, 2003

Albert,

Some of us don't have VB to benchmark against. So I'm curious: what is the speed of your computer? I have various CPUs lying around and would like to try the C loop myself and test against your mark.

Thanks,

Nat Ersoz
Tuesday, March 25, 2003

Never mind. As the previous thread mentioned, both are compiled languages and the compiler back end is probably identical, so performance on identical constructs should be identical.

Nat Ersoz
Tuesday, March 25, 2003

Ken,

I'd recommend asking this question at:

http://peach.ease.lsoft.com/scripts/wa.exe?A0=visbas-l

Thanks

Matthew
Tuesday, March 25, 2003

Let me preface this by saying that I agree with most of what Albert has to say. Microsoft did a study concluding that 95% of the operating time of an average GUI app is taken up by the user (filling in text boxes, selecting controls, etc.); only 5% is the app itself. So even if you make a significant improvement in code efficiency, the overall impact may be negligible.

That said, you still may want to refactor the code. Reading and maintaining VB code with lots of Variants is a pain in the ass.

I took 20 minutes out of my busy day (sitting on the couch watching war coverage) and did a little test myself.  I created a simple VB COM component with the 4 test methods listed above. Each method performed the same actions: assigned the parameters (a Boolean, a Date, a Double, an Integer, a Long, and a String) to variables, called another function to perform simple actions on the variables, then returned a Boolean.

The test app had 8 separate loops: 4 calling each method through an early bound object and 4 through a late bound object. The core of each loop looked roughly like this (early bound shown; the class name and argument values are illustrative, and the late bound version declares obj As Object and uses CreateObject):

Dim obj As TestComponent
Dim i As Long
For i = 1 To 100000
    Set obj = New TestComponent                        ' create object
    obj.testTypedByVal True, Now, 1.5, 2, 3, "abc"     ' call one of the 4 methods
    Set obj = Nothing                                  ' destroy object
Next

The results were as follows:
Start time: 4:02:23 PM
100000 Loops of Early, Typed, ByRef = 3.00 sec
100000 Loops of Early, Typed, ByVal = 3.00 sec
100000 Loops of Early, Variant, ByRef = 5.00 sec
100000 Loops of Early, Variant, ByVal = 6.00 sec
100000 Loops of Late, Typed, ByRef = 5.00 sec
100000 Loops of Late, Typed, ByVal = 6.00 sec
100000 Loops of Late, Variant, ByRef = 8.00 sec
100000 Loops of Late, Variant, ByVal = 8.00 sec
End time: 4:03:07 PM

What does this mean? Well, given how many loops it took to produce even a few seconds' difference in run time, not much in this case. But there are several relevant points. First, I am running a 2.2 GHz machine w/ 512 MB RAM. My background is in manufacturing, where it isn't uncommon to find a lot of older machines (200 MHz, 32 MB RAM). Second, the data set for my test app was very small. An application that did a lot of number crunching would, of course, have different results. (On the other hand, if a developer created an instance of an object 100000 times in an app, I'd shoot him.)

In the end, if you really want to justify a refactoring project, I would copy a small subset of the real code into a test app, then present the before and after results for both performance and readability.

Nick
Tuesday, March 25, 2003

I was recently engaged to do a performance analysis on a huge VB6 project. It is huge in both code size and execution time. The compiled application basically exports data from a database to a publishing format. If we call the format a book, then the book is made of documents. The export of one document takes about 4 minutes on my laptop. Multiply that by hundreds of documents, and this gets really slow.

In my country we have a term that roughly translates to "the law of giant numbers" (sorry, I don't know the English term). What it means is that a reasonably small number accumulates into an impossibly large one when applied to a whole population, like the pay raise our teachers and nurses want every year. That is also what I discovered in this code: when I dug deep down into the function calls, each call executed instantly! But even a millisecond per function means a lot when there are 18,390 of them: at 1 ms each, that is over 18 seconds. For one document, that is.

I discovered two valuable optimizations: move function calls out of loops, and avoid native VB string handling. A third one, which I didn't have time to investigate, is memory usage: if your application needs to use virtual memory, it will be no faster than the I/O to your hard disk.
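Roughly like this (the function name and the sizes are just illustrative):

' Hypothetical stand-in for a call that is costly but loop-invariant.
Private Function GetTaxRate() As Double
    GetTaxRate = 0.07
End Function

Sub Demo()
    ' 1. Hoisting: call the invariant function once, not every iteration.
    Dim amounts(1 To 100000) As Double
    Dim rate As Double, total As Double, i As Long
    rate = GetTaxRate()
    For i = 1 To 100000
        total = total + rate * amounts(i)
    Next

    ' 2a. Slow string building: each & recopies the whole string, so the
    '     loop is quadratic in the final length.
    Dim sSlow As String
    For i = 1 To 20000
        sSlow = sSlow & "x"
    Next

    ' 2b. Faster: allocate once with Space$, overwrite in place with Mid$.
    Dim sFast As String, pos As Long
    sFast = Space$(20000)
    pos = 1
    For i = 1 To 20000
        Mid$(sFast, pos, 1) = "x"
        pos = pos + 1
    Next
End Sub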

Thomas Eyde
Thursday, March 27, 2003
