Fog Creek Software
Discussion Board




Optimize this:

Interview question, that I'm stewing over.  He sez to me, write an endian inverter.  I say OK, here:

unsigned long func( void *ptr ) {
unsigned char *p = ptr, b[4];
b[3] = p[0];
b[2] = p[1];
b[1] = p[2];
b[0] = p[3];
return *(unsigned long*)&b[0];
}

He sez, too many locals, compiler cannot optimize the register space.  Optimize it.

I say:
unsigned long func2( void *ptr )
{
unsigned long *p = ptr, b;

b  = (*p >> 24) & 0x000000ff;
b |= (*p >>  8) & 0x0000ff00;
b |= (*p <<  8) & 0x00ff0000;
b |= (*p << 24) & 0xff000000;

return b;
}

He sez: "wrong, I didn't say ptr was 32 bit aligned".

AND NOW I have a Royal Queen Bee up my ass about this.  I ran, ideas through gcc to x86 and PPC assembler to see what can be optimized.  I don't think the first version can be optimized just based on the asm output.  Here it is for x86:

func:
    pushl    %ebp
    movl    %esp, %ebp
    subl    $8, %esp               
    movl    8(%ebp), %eax
    movl    %eax, -4(%ebp)
    movl    -4(%ebp), %eax
    movb    (%eax), %al
    movb    %al, -5(%ebp)
    movl    -4(%ebp), %eax
    incl    %eax
    movb    (%eax), %al
    movb    %al, -6(%ebp)
    movl    -4(%ebp), %eax
    addl    $2, %eax
    movb    (%eax), %al
    movb    %al, -7(%ebp)
    movl    -4(%ebp), %eax
    addl    $3, %eax
    movb    (%eax), %al
    movb    %al, -8(%ebp)
    movl    -8(%ebp), %eax
    leave
    ret


I just don't think it can be done any better, but what the hell do I know?

wish I was an asm expert
Thursday, June 24, 2004

"He sez"?

.
Thursday, June 24, 2004

Apparently he writes his English in assembler too.

Kyralessa
Thursday, June 24, 2004

I'd probably just htonl or ntohl depending on the platform and be done with it.

christopher baus.net
Thursday, June 24, 2004

How about a simple....

#define CONV(x,y,z)  ((unsigned long)(*((unsigned char *)x+y)))<<z

unsigned long func2( void *ptr )
{
    return CONV(ptr,0,0) +CONV(ptr,1,8) + CONV(ptr,2,16)+CONV(ptr,3,24) ;
}

Code Monkey
Thursday, June 24, 2004

'Sup Fool.

Simple? Maybe.  But the original poster's more even simpler code would also run faster.  The guy who interviewed him was a dumb bitch that needs a gut punching.

Mr. Fancypants
Thursday, June 24, 2004

When I read posts like this, I wonder how anybody manages to learn all this.

If someone asked me this in an interview I would have no idea. I would be wanting someone to say "write a 'hello world' program in C++ please" that is the level I am at....

I think good job for even knowing that, and the interviewer sounds like a jerk.,

Aussie chick
Thursday, June 24, 2004

"I just don't think it can be done any better, but what the hell do I know?"

Of course you can (5 instrutions, the others are init/finish stuff) ! =)

func:
push ebp
mov ebp, esp

mov ebx, [ebp]

push byte ptr [ebx+3]
push byte ptr [ebx+2]
push byte ptr [ebx+1]
push byte ptr [ebx+0]

pop dword ptr [ebx]

pop ebp
ret

Nix
Thursday, June 24, 2004

My current project is embedded software using a DSP from motorola.

The strange thing is changing my mind from an assembler to another.

PC:
mov destination, source

DSP:
mov source, destination

Pretty easy to put bugs ;)

Nix
Friday, June 25, 2004

Cosmetic change. I did the change 'in place'. To return the value, you need to change:
pop dword ptr [ebx]
to:
pop eax

Nix
Friday, June 25, 2004

This is some code that was posted on another forum I frequent and so I take no credit for it.
__declspec(naked) unsigned short __fastcall ByteSwapWORD(unsigned short)
{
  __asm {
    xchg cl, ch
    mov ax, cx
  }
}

__declspec(naked) unsigned long __fastcall ByteSwapDWORD(unsigned long)
{
  __asm {
    bswap ecx
    mov eax, ecx
  }
}

Zorm
Friday, June 25, 2004

*Tremendously* boring question.


Friday, June 25, 2004

Thanks Nix,

That's what I was looking for.

me again
Friday, June 25, 2004

Sup Fools,

Its good that people spend so much time thinking of things like this, since we all know that byte swapping tends to be the bottleneck in most applications!!

Mr. Fancypants
Friday, June 25, 2004

Hopefully the interviewer didin't expect you to come up the right answer or else. Questions like these are to judge how you think, how you approach a problem. If you happen to  solve it, make the mental leap during the interview, then great! I'ts more proof your smart. Otherwise, it doesn't mean your a dufus, just a slightly bigger risk of a being a dufus, all other things being equal.

If you got real flustered in the interview and insisted that it was impossible to code it to run faster, then you failed. He wants to see you thinking super hard when faced with a super hard problem.

If you puzzled and flustered and demured then you probably did ok. He'll see some humility, mostly he see a person sort of paralazed on the spot. This is the natural reaction, most people realize it. It probably won't cost you the job, assuming you did a good job on the other portions of the interview (like producing working code).

If you quickly accepted it's possible there is a faster way, and took delight in working on the problem (maybe while working on it you said something like "this is a fun problem" out loud), then when it was over if you said something like "I'm going to keep thinking about that problem. It's bugging me." We'll in this case you shown a real interest in the same puzzles that interest him which only helps you (he created the puzzle himself and he's), and you've shown a desire to tackle those puzzles even though you aren't being paid or rated for it. Good companies love that in employees. Of course if you appear phony it could backfire.

Of course everything I written assumes the guy is a good interviewer at a good company. If not then your worst fears might be true, he might not hire your because you didn't come up with the exact right answer he was looking for. Yikes!

ronk!
Friday, June 25, 2004

This is begging for an XOR trick.

Jason McCullough
Friday, June 25, 2004

OK, I'll byte ;-)

It's been a while, but how about something like:

func:    ; Most compiler can pass args in registers nowadays
            ; So arg is passed in eax

rol ax,8          ; swap low order bytes
rol eax,16      ; swap high and low order words
rol ax,8          ; swap former high order bytes

ret

Eric Hop
Friday, June 25, 2004

There's a specialized 486+ opcode for this purpose, called "BSWAP", so you can do it with a single instruction.

http://www.google.com/search?hl=en&ie=UTF-8&q=bswap

Christopher Wells
Friday, June 25, 2004

Nice aproach Eric !

To the interviewer showing the rol or the stack code is better than the bswap. The formers show that you understand how the computer works.

Nix
Friday, June 25, 2004

>>>>>>>>>>>>>>>
When I read posts like this, I wonder how anybody manages to learn all this.
>>>>>>>>>>>>>>>

Well, I'd guess that the type of C++/C (possibly asm) some of these guys do on a daily basis (for their job or possibly open source projects) gives them this type of experience.

For example, embedded software development would give you different types of skills than the MFC/C++ (PDA development for the PocketPC) I've been doing for the past three years.  When your using CString all the time, you get rusty on these other things.

It doesn't make them smarter than you, it just shows the type of experience they've obtained.

William Campbell
Friday, June 25, 2004

> He sez: "wrong, I didn't say ptr was 32 bit aligned".

I am confused over this.
He definitely said it was 32 bits. What does the
aligned part have to do with anything? If have
any address a can get the next 4 bytes.

son of parnas
Friday, June 25, 2004

He was probably looking for a "magic answer", which many interviewers are in such a case. Perhaps he was probably looking for the in-place swap technique.

For example

void swap_values(unsigned char *p)
{
    p[0] = p[0] ^ p[3];
    p[3] = p[0] ^ p[3];
    p[0] = p[0] ^ p[3];
    p[1] = p[1] ^ p[2];
    p[2] = p[1] ^ p[2];
    p[1] = p[1] ^ p[2];
}

unsigned long val = 0x1234567;
swap_values((unsigned char *)&val);

Dennis Forbes
Friday, June 25, 2004

I don't understand either. Whether it's aligned or not, a pointer fetches 4 bytes in a row. What's his problem?

Maybe he's not so smart, or maybe he was testing you.

About the "rol" thing, pity C doesn't have a "rol" instruction!

>> "this is a fun problem" out loud

Note to self: always say "this is a fun problem" out loud during interviews.

And closing braces before writing any code. ;P

Alex
Friday, June 25, 2004

I get the 32 bit pointer bit over alignment, if you're on 68K[1]  family processor *p will fail unless is it's dword aligned.
I suspect the answer he wanted was:

unsigned long func(void *ptr)
{
unsigned char *p=ptr;
unsigned long retval;
retval=*(p++);
retval=(retval<<8)+*(p++);
retval=(retval<<8)+*(p++);
return(retval<<8)+*(p++);
}

which IIRC is the best you can do on a lot of processors, (x86 family excepted)
Note: You cannot merge the p++ lines together onto one line. If you read the C spec carefully the ++ bits takes effect whenever the C compiler feels like it. So
twodigit=(*(p++)-'0')+(*(p++)-'0');
where char *p="12";
may return either 11 or 12.
(Note C99 may have changed this)

[1] This is from memory and I also seem to recall that Alphas had the same problem and MS had insert an exception handler into NT to allow ported x86 code to run.

Peter Ibbotson
Friday, June 25, 2004

Peter, why use the increment then, at all ?
(apart from the fact that the final increment is unnecessary)

Instead, why not just do:

return (((((p[0] << 8) + p[1]) << 8) + p[2]) << 8) + p[3];

Eric Hop
Friday, June 25, 2004

isn't that much the same as the very first solution, just using shift instead of byte assignment?

mb
Friday, June 25, 2004

No, it's not.

The compiler will be able to optimize this much better because of the relation between the bytes.

In the first example the compiler would have to be very clever indeed to figure out that there IS a relation between those loose bytes it is supposed to process.

Here's the code a decent optimizing compiler would probably be able to generate:

func:  ; pointer passed in eax
   
push ebx
mov ebx,eax
xor eax,eax
mov al,[ebx]
shl eax,8
mov al,[ebx+1]
shl eax,8
mov al,[ebx+2]
shl eax,8
mov al,[ebx+3]
pop ebx

ret

Eric Hop
Friday, June 25, 2004

Still far worse than what hand-rolled code would be, though. Plus there is a push and a pop...

xchg ah, al
ror eax, 16
xchg ah, al

OK, so we're all showing off... :)

Alex
Friday, June 25, 2004

Don't ask me why, but I did not think of Peter I's method - clear as it seemed.  Playing with -O1,2,3 options on the gcc compiler, it makes similar code from my 1st example and his, but I will grant that Peter's does give the compiler better clues for dealing with the data and I think is what was being asked for.

Oh well, I was not invited on site for further consultation.  Lots of other fish in the sea however.

OP
Friday, June 25, 2004

> *p will fail unless is it's dword aligned.'

Then how can you have an array access? Isn't
that a dereference?

son of parnas
Friday, June 25, 2004

i think the difference is that I was thinking memory, you were thinking registers. memory may have byte addressing, while registers usually don't (but of course are way faster).

mb
Friday, June 25, 2004

Here's how I would've answered the question:

Me: "Can we assume the code is for 32-bit Windows?"
Interviewer: "Yes."
Me: "Call htonl()."

Any interviewer worth his salt should recognize that the best solution to a problem is one that already exists and is debugged, especially when it's free (like htonl() is).

Brad Wilson (dotnetguy.techieswithcats.com)
Saturday, June 26, 2004

Oh, and for the original poster:

"He sez, too many locals, compiler cannot optimize the register space.  Optimize it."

Your second version had the exact same number of locals: one pointer, and one 32-bit integer.

Brad Wilson (dotnetguy.techieswithcats.com)
Saturday, June 26, 2004

"Call htonl()"

Of course the point to questions like these is to see if the interviewee can figure out a problem on his own.  (Granted, the interviewer in this case sounds like a bit of a jerk, but still...)

J.
Monday, June 28, 2004

*  Recent Topics

*  Fog Creek Home