Fog Creek Software
Discussion Board




Code Question~

Is there a subtle bug in this code due to size_t being compared with an int?  size_t cannot be negative.

Would it be better to declare all variables as long?

// Reverse a string

char *str_rev(char *p_str)
{
    char tmp, *p = p_str;
    size_t front = 0, back = 0;
    int test;

    assert(p_str != NULL);

    while (*p++ != '\0') { back++; }
    
    if (back == 0 || back == 1) { return p_str; }
    
    back--;
            
    do {
        tmp = p_str[front];
        p_str[front] = p_str[back];
        p_str[back] = tmp;
        front++; back--;        
        test = back - front;
    } while (test > 0);
    
    return p_str;
}


Sunday, January 19, 2003

1. I don't see anything that would be classified as a bug since your while( test > 0 ) could just as accurately be written while( test != 0 ).

2. However, if you'd like to code pretty, then use ssize_t which is the signed version of size_t.

3. I think its better to prefer size_t or ssize_t when dealing with posix functions like read and write which return ssize_t.  When using lseek, somtimes the typedef off_t is used - conform to the API norms when possible.  Then, if you compile against a 64 bit file system, the ssize_t typdef should take care of you.

Nat Ersoz
Sunday, January 19, 2003

Hey, y'know, on second thought...  to code pretty, I would just use int in this function - since you're not invoking any API where ssize_t is used.  And, the difference of pointers is int.

Nat Ersoz
Sunday, January 19, 2003

I'd be inclined to get rid of front, back, and test as integers, and just work in terms of two pointers into the string.

Andrew Simmons
Sunday, January 19, 2003

Thanks for your responses.  I now have a couple of ideas on how to make the current code "solid."

Working in terms of pointers is my goal.  This code represents a step in the refactoring process.

It is necessary to have the comparison (test > 0) as when you have strings of odd and even lengths (back - front) will equal 0 for odd lengths and -1 for even lengths because back eventually surpasses front.


Sunday, January 19, 2003

Yeh, it doesn't work:

    char t [] = "dennis sinned";

    char * p = str_rev (t);

    printf (p);

prints "dennis sinned"

those who know me have no need of my name
Sunday, January 19, 2003

Lol!


Sunday, January 19, 2003

"It is necessary to have the comparison (test > 0) as when you have strings of odd and even lengths (back - front) will equal 0 for odd lengths and -1 for even lengths because back eventually surpasses front. "

Surely odd or even doesn't matter if you write the loop correctly. What's wrong with something like the following (not tested)?

char *str_rev(char *p_str)
{
    char tmp, *front = p_str, *back = p_str;

    assert(p_str != NULL);

    while (*back)
        back++;
    back--;
   
    while (back > front) {
        tmp = *back;
        *back = *front;
        *front = tmp;
        back--;
        front++;
    }
    return p_str;
}

Andrew Simmons
Sunday, January 19, 2003

Your are correct Andrew, that is a "refined" version of strrev using pointers etc.  I made the statement in reference to my code and my own way of thinking about the problem.  My original post is only asking if (size_t - size_t = int) has any subtle bugs.  As far as I can tell it does'nt. Also can't seem to get ssize_t in VC++.  Will keep working on it.


Sunday, January 19, 2003

"My original post is only asking if (size_t - size_t = int) has any subtle bugs. "

Sorry, I misunderstood. Are you actually experiencing any problems with this? On my system, VC++ v 6, size_t is a typedef for unsigned int, and I wouldn't expect any problems with the subtraction. I've not heard of ssize_t, but I believe that ptrdiff_t is the signed size corresponding to size_t. In VC++6 this is a typedef for int.

Andrew Simmons
Sunday, January 19, 2003

Sorry, ssize_t is defined in POSIX and the typedef can be found in unistd.h.  I assumed this was a POSIX question since size_t was mentioned in the first place.  Anyhow, probably best to just use int since its the diff of 2 pointers.

Nat Ersoz
Sunday, January 19, 2003

There are theoretical bugs in the code due to the conversion of unsigned ints to signed. The only one I can see is for strings of length greater than INT_MAX, where the 'while' test would fail the first time through.

However I was caught by:

vector<int> v;
// assign contents to v
int i;
for (i=0;i<(v.size()-1);++i) {
  // replace all but the last character with X
  v[i]='X';
}

David Clayworth
Monday, January 20, 2003

Both C and C++ have the "strrev(t) " function. Why are you re-inventing the wheel?

Ichabod Crane
Monday, January 20, 2003

Maybe they're a student trying to complete some homework (hence the anonymity) ;-)

Better than being unemployed...
Monday, January 20, 2003

Hmm - I don't have a C standard in front of me, but I'm pretty sure that
(size_t) 6 - (size_t)7 == (size_t) SIZE_T_MAX, which is pretty big (and positive).

That leaves you with (int) SIZE_T_MAX; the value of this expression is implementation defined.

But I'm not sure, which indicates to me that a clearer implementation should be chosen. ;->

Danil
Monday, January 20, 2003

Yea, I'm a student trying to implement a strrev routine.  The assignment is to implement the routine without looking at the hundreds of available resources to find code for the pointer implementation.  (Thanks Andrew. hehe jk) We then have to point out and document deficiencies and limitations in our code. Fun Fun Funny Fun Fun So Fun. NOT.


Monday, January 20, 2003

Mr or Ms un-named student, shame on you!

Actually, I think David's right that there is a problem. The result of size_t - size_t will presumably be unsigned, which then gets assigned to a signed int. According to K&R this is fine if the result is in the range of a signed int, and is implementation-defined otherwise, which it will be if back < front.

Andrew Simmons
Monday, January 20, 2003

Yea, that's why, before handing it in, I changed the while condition to:

while (test != 0 && test != UINT_MAX);

and the data type of "test" to size_t;

According to Microsoft docs the maximum size_t = UINT_MAX = 0xFFFFFFFF = 4294967295 which coincidentally? = unsigned long.  Actually prolly has to do with the fact that i'm running on a 32 bit machine and the compiler is inherintely 32-bit.  So when you subtract size_t from size_t the result simply rolls over to UINT_MAX.  Welcome back to the basics eh?  This code is prolly not portable either unless UINT_MAX is used to define max size_t on different platforms.  I could not find some of the constants/defines mentioned here, such as ssize_t and SIZE_T_MAX in VC++ 6.0.  Although ssize_t is defined under windows in windows.h. (I'm using console mode.)Now even when using pointers, the pointer can only theoretically access 4G of memory max on a 32 bit machine (Not sure on this, 1 segment of 4GB... umm).  So i'd say my routine is sufficient enough to do the job just as well as Andrews, it's just not as pretty becuase it does'nt use pointers.  It also might be slower, but given an optimizing compiler... Well I'd guess you'd have to look at the assembly language generated.  My guess would be that incrementing DS:SI and ES:DI might be faster than adding an offset to them, would have to look at the instruction timings on that one.  And there might be assembly instructions for swapping data.  I'm not a asm guru. Anyway, just disregard my mumbling.


Monday, January 20, 2003

I don't understand why you can't just say

while (back > front);

and just be done with it.  If it takes 20 intelligent people collectively an hour to figure out if your code has a subtle bug in it or not, there are worse problems with your code than whether "6U - 7U == -1" or not.

Foolish Jordan
Tuesday, January 21, 2003

"I don't understand why you can't just say

while (back > front);

and just be done with it.  If it takes 20 intelligent people collectively an hour to figure out if your code has a subtle bug in it or not, there are worse problems with your code than whether "6U - 7U == -1" or not."

Well Foolish you can write the while loop like that.  BUT the point of the assignment was to take the first incarnation of the strrev routine that you come up with and point out bugs. This was MY first coding of the routine.  Notice, It was MY first coding, not YOURS, but MINE.  It is written from MY perspective.  What it boils down to is that any programmer, including myself, realizes that you only need to compare if back is still greater than front and that these two variables can be used as pointers into the string and you would end up with what Andrew wrote.  But that was not the point of the assignment.  The routine that I have written is in fact very stable and so far has worked under VC++ and GNU.  Hopefully will be able to test it under a 64 bit compiler soon. There are no other "problems", as you say, with my code.  In fact, I challenge you to point out any bug you find and of course prove that the bug exists.  Your statement are foolish.

Here's my version that I handed in:

char *str_rev(char *p_str)
{
    char tmp, *p = p_str;
    size_t front = 0, back = 0, test;

    assert(p_str != NULL);

    while (*p++ != '\0') { back++; }
    
    if (back == 0 || back == 1) { return p_str; }
    
    back--;
    
    do {
        tmp = p_str[front];
        p_str[front] = p_str[back];
        p_str[back] = tmp;
        front++; back--;        
        test = back - front;        
    } while (test != 0 && test != UINT_MAX);
    
    return p_str;
}

Here's the next revision:

char *str_rev1(char *p_str)
{
    char tmp, *p = p_str;
    size_t front = 0, back = 0;

    assert(p_str != NULL);

    while (*p++ != '\0') { back++; }
    
    if (back == 0 || back == 1) { return p_str; }
    
    back--;
    
    do {
        tmp = p_str[front];
        p_str[front] = p_str[back];
        p_str[back] = tmp;
        front++; back--;        
    } while (back > front);
    
    return p_str;
}

And the final revision:

char *str_rev2(char *p_str)
{
    char tmp, *front = p_str, *back = p_str;
    
    assert(p_str != NULL);

    while (*back) { back++; }
    
    back--;
    
    while (back > front) {
        tmp = *front;
        *front = *back;
        *back = tmp;
        front++; back--;    
    }
    
    return p_str;
}

So there you have it.  The "evolution" of the str_rev routine.


Tuesday, January 21, 2003

The last version looks pretty good, but

    while (*back) { back++; }
   
    back--;
   
will invoke "undefined behavior" if the string is empty.

Your earlier versions had a separate test to see if the string was empty. Here, you would be decrementing a pointer below the point where it started. This is not allowed in C.

-Ray G-
Thursday, January 23, 2003

need help lost code on car stereo but can't reset anyone know how i can do this or gat a free new code from

kerry martin
Thursday, February 05, 2004

*  Recent Topics

*  Fog Creek Home