Fog Creek Software
Discussion Board




C Program Optimization

HI. I am working on a product that has some performance issues. The whole thing is in C and Windows SDK. Apart from algorithms, I'd like to get resources on optimizing C programs. Like, I've heard about this struct padding - that the struct size changes depending on how we declare the members. Loop unrolling, etc. Any books?

Need help
Thursday, December 19, 2002

Buy and learn to use a profiler.  Numega and Rational both make good ones.  Books won't do you any good until you identify the slow parts, and once you see what is taking so long, it will likely be obvious how to speed it up.  It depends on your problem domain, but the answer is rarely "unroll the loop".  YMMV.

Brian
Thursday, December 19, 2002

Intel VTune is an excellent graphical profiler for Windows. I would only worry about C optimizations for hotspots after algorithmic optimization.

http://www.intel.com/software/products/vtune/

runtime
Thursday, December 19, 2002

VTune also has a "code coach" feature that will offer specific suggestions for your code. And VTune knows MUCH more about useful Intel processor quirks than any book! :-)

runtime
Thursday, December 19, 2002

Windows applications are driven by the message queue and the responsiveness of the application in returning from the original message call.

If you have significant code running off some WM_ message then the application will seem to be slow to the user and may block other applications.

Profilers will spot that to a degree but you'll find that the main message handler will appear to swamp eveything else and so introduce noise into the data.

So my first approach would be to walk the code from the message handler and determine how deep or complex that code was.  If I identify a piece that hogs the queue then I'd either break it up or spawn threads to continue the processing.

Simon Lucy
Friday, December 20, 2002

The Programming Pearls books by Jon Bentley have good general information about optimization.  I don't think they deal specifically with C, but the lessons from these books could be applied anywhere.

Bruce Perry
Friday, December 20, 2002

I don't mean to take too much from the original topic.

I noticed that in speaking of VTune, a couple of people referred to Intel optimizations. When we say "Intel", do we mean x86 or Intel processors in particular? I'd hate to think that VTune would be of no use if I'm running and AMD proc.

Cheers,
BDKR

BDKR
Friday, December 20, 2002

Hi Simon,
You are right. I used the VC++ profiler and it shows the Callback function as taking the most time. But I want to know about the other functions that actually do the work.
All of them are spanwed off as separate threads. Any way I could work around?

Need help
Friday, December 20, 2002

"Hi Simon,
You are right."

Let me just put that on a loop so I can hear it over and over....

Generally  the most expensive message to process is WM_PAINT because it happens so often and you tend to either invalidate a small rectangle and paint the whole window or want to catch up on other stuff at the same time.

As a heuristic I'd walk down the tree of function calls from the message handler and assign a weight to each function based on its number of lines, number of conditionals and number of external function calls.  I might use an actual number for the weight and write it down, more likely I'd hold it in my head as a hmm factor.

If a route through the tree generated a very long hmmmm in my head I'd work on splicing it off as a separate thread with a return back to the message handler.

All this work might leave some of the message handler code in a mess.  If it does, your message handler was broken anyway.  The only code that should be in a message handler is a switch to whatever routines handle the individual messages.

Only do what you absolutely have to do at any point and return as soon as that's done, or you've scheduled to do it at some other point in time.

If you have a complicated window display don't repaint the whole window on every WM_PAINT, match the rectangle to the objects you hold and just repaint/render those.

If you have code which doesn't interact with the display or the user at all, spawn a thread for it and return as soon as you have.  If you really have a complicated computational engine going create another controlling thread for all that code and have your message handler send it the messages.  Then your application will seem like lightning but will probably occupy exactly the same number of cpu cycles.

A complicated windows app (and this applies to any window messaging system not just MS), is really a multi-tier system identifying the interfaces between the tiers will help optimise each tier.

"Hi Simon,
You are right."

Oooo that's so nice, press that button again.

Simon Lucy
Saturday, December 21, 2002

Oh I had another thought, one of the results of spinning lots of threads is that at some point they have to be synchronised, so you might find that if you have some variable, process or event blocked on synchronisation that actually merging a thread could make the code more responsive.

If that's the case then its going to take a fair amount of just sitting and thinking to rework how the threads interoperate.  It might be useful to block out the processing in swim lanes on several sheets of paper and see if there's any commonality or cycling going on.

Simon Lucy
Saturday, December 21, 2002

spawn threads? unfortunately, spawning threads is often a bad way to go due to the overhead. sometimes it is very useful, but again it's the profiler (or some other sort of analysis, which is harder but might be valid over a larger dataset) which will suggest where you need to spawn threads.

what's taking time in your app? do you know or can you guess? CPU processing? disk reading? network operations? painting? some of these things work well in threads, some work well using asynchrounous APIs, maybe you're just blocking waiting for a network response.

printf profiling can be productive. put some sort of timer on each windows message and see how long each one takes. (if each goes to a function, the profiler may be able to do this for you.)

mb
Sunday, December 22, 2002

*  Recent Topics

*  Fog Creek Home