Fog Creek Software
Discussion Board

Soft Page Faults

I have an interesting performance issue I'm trying to understand and ideally work around.  Since there seems to be a lot of expertise represented here it seems worth a shot...

The simplified case happens with two instances of an app running under W2K that share a large memory buffer (say 800MB; the machine has 1GB of memory).

Test code touches one byte in each page of shared memory. 
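
For anyone who wants to reproduce the shape of this test, here is a minimal sketch in Python. It is not the original test code: it uses an anonymous mapping from Python's `mmap` module rather than a named Win32 file mapping shared between two processes, the 64MB size is arbitrary, and the names `touch_pages` and `timed_touch` are made up for illustration.

```python
import mmap
import time

def touch_pages(buf, page_size=mmap.PAGESIZE):
    """Read one byte from every page of buf; return the number of pages touched."""
    touched = 0
    checksum = 0
    for offset in range(0, len(buf), page_size):
        checksum += buf[offset]   # reading a single byte is enough to reference the page
        touched += 1
    return touched

def timed_touch(buf):
    """Return (pages touched, elapsed milliseconds) for one pass over buf."""
    start = time.perf_counter()
    pages = touch_pages(buf)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return pages, elapsed_ms

if __name__ == "__main__":
    size = 64 * 1024 * 1024           # assumption: much smaller than the 800MB in the post
    buf = mmap.mmap(-1, size)         # anonymous mapping, not a named shared section
    for run in (1, 2, 3):
        pages, ms = timed_touch(buf)
        print(f"run {run}: touched {pages} pages in {ms:.1f} ms")
    buf.close()
```

Comparing the first pass with later passes gives the same first-run versus repeat-run split as the timings in this post, though the absolute numbers will not be comparable.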

The first process to hit it takes ~500ms on the first run and 15ms on subsequent runs.

The second process takes ~2250ms on the first run and between 375 and 500ms on subsequent runs.

It doesn't matter which process created the file mapping objects, just which hits them first.

perfmon shows that the second process takes a lot of page faults every time the test is run.  The first process only shows page faults on the first run.  These are not hard faults: nothing is going to disk (Pages Input/sec is basically 0), so they must be soft page faults.

I've tried various numbers in SetProcessWorkingSetSize but nothing seems to make a difference, and perfmon shows the working set size to be constant.

I know the test sounds nonsensical, but I'd still like to understand the limit I am running into.  Does anyone have any ideas as to the cause of the page faults?


Rob Walker
Thursday, January 02, 2003

I can't comment on the Windows issues here, but I can make a few remarks about the memory hierarchy.  Modern systems have L1 cache, L2 cache and main memory.  Plus, there's a TLB which caches the mapping from virtual addresses to physical addresses.

L1 cache access takes a few cycles, L2 cache access takes about 10 times as long, and main memory access takes a couple hundred cycles (this obviously varies from processor to processor, and changes every year).  A TLB miss takes basically forever.

Multi-processor machines need to arbitrate access to memory; this takes time, too.  Collectively, these effects can have a massive effect on performance.

Now, none of this explains why one process can consistently access the memory faster than another process--this is probably a Windows issue.

Eric Kidd
Thursday, January 02, 2003

If practical, it might be a good idea to try running this test under Linux or another OS, just to see whether it's an OS issue or a hardware issue.
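
As a rough idea of what such a cross-check could look like on Linux, here is a sketch that counts soft (minor) page faults around a pass over a buffer. It relies on Python's `resource` module, which is Unix-only, and again uses an anonymous `mmap` rather than memory shared between two processes; the helper names `minor_faults` and `faults_while_touching` are hypothetical. The counts depend on kernel settings such as transparent huge pages, so treat them as indicative only.

```python
import mmap
import resource  # Unix-only; this module does not exist on Windows

def minor_faults():
    """Soft (minor) page faults taken by this process so far."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_minflt

def faults_while_touching(buf, page_size=mmap.PAGESIZE):
    """Write one byte per page of buf and return the minor faults taken meanwhile."""
    before = minor_faults()
    for offset in range(0, len(buf), page_size):
        buf[offset] = 1           # write one byte in each page
    return minor_faults() - before

if __name__ == "__main__":
    buf = mmap.mmap(-1, 64 * 1024 * 1024)   # assumption: 64MB anonymous mapping
    for run in (1, 2):
        print(f"run {run}: {faults_while_touching(buf)} soft faults")
    buf.close()
```

If the repeat pass still shows a large fault count on another OS, that would point away from a Windows-specific limit.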

I assume the two processes are running through the buffer serially, one after the other, rather than alternating each byte or something?

Dan Maas
Friday, January 03, 2003

:: I assume the two processes are running through the
:: buffer serially, one after the other, rather than
:: alternating each byte or something

Yes.  They are not overlapping in their access to the buffer at all. But unfortunately it is not very practical to run under another OS (we're a Windows-only shop).

If I look at the numbers more closely, the number of page faults the second process sees on the second test is (repeatedly) suspiciously close to the number expected on the first test minus 65536.

So maybe a process can only have 65,536 (64K) shared pages in its working set at once?
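
To put rough numbers on that guess, here is the arithmetic as a short Python sketch. It assumes 4KB pages (the usual x86 page size) and takes "say 800MB" as exactly 800 x 1024 x 1024 bytes; both are assumptions, and the real buffer size may differ.

```python
PAGE_SIZE = 4 * 1024             # assumed: 4KB pages (x86)
BUFFER_BYTES = 800 * 1024 ** 2   # assumed: "say 800MB" taken literally
CAP_PAGES = 65536                # the suspected per-process limit

total_pages = BUFFER_BYTES // PAGE_SIZE        # faults expected on a first pass
refaulting_pages = total_pages - CAP_PAGES     # faults on a repeat pass if the cap is real
cap_megabytes = CAP_PAGES * PAGE_SIZE // 1024 ** 2

print(total_pages)        # 204800
print(refaulting_pages)   # 139264
print(cap_megabytes)      # 256
```

Under those assumptions, a 65536-page cap would correspond to 256MB of the buffer staying resident, with the remaining pages faulting again on every pass.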

Rob Walker
Friday, January 03, 2003
