Fog Creek Software
Discussion Board




how to increase range of a random number generator

Hi,

Can someone help me with this question?
Given a random number generator for range 1 to 5. How could you use this to create a generator for range 1 to 7? Also, you could call the generator any number of time.
Hint: Doing it for 1 to 25 is easier!

-thanx
Vik

Vik
Saturday, April 05, 2003

If you don't care about performance loss (and probably skew.. though it's not a problem).. you could do a 1-5 twice, index it for values 1-5 of 1-3 and the next 1-5 into the 4-7

Li-fan Chen
Sunday, April 06, 2003

I don't understand how that solution would work.  How do you decide if you want to use the number that you found in the range 1-3 or the one in the range 4-7?  You could use the random number generator again and index it to 1 or 2, but you still have the problem that you are skewing the random number generator towards 4-7.

Here is my solution:

1) Use the random number generator to get numbers A and B between 1 and 5.
2) Subtract 1 from both A and B to get numbers between 0 and 4.
3) Combine digits A and B together to create a base-5 number.
4) When converted to decimal, there are 25 equally likely values that vary between 0 and 24.
5) We can then map these 25 values into 7.  Assign values 0-2 to 1, 3-5 to 2, ..., 18-20 to 7.
6) If the value is between 21 and 24, then repeat this entire procedure.

Jared
Sunday, April 06, 2003

Well, part of it depends on what you mean by "random".  If rand5() is our 0-4 random number generator, an obvious solution is:
int rand7(){return (rand5()+rand5()) % 7;} /* this is essentially the solution of the first reply */
So how badly biased is this?  I think results 0-2 are all equally likely (an effect of the mod) but 3-6 are all more likely, with 5 the most likely (done in head, not on paper, so maybe I'm mistaken).
We can get an even distribution, but it requires arbitrarily many calls to rand5() in the worst case.  We can do something like: (in C)
int rand7(){
    int res;
    do{
        res = 5*rand5() + rand5();
    }while (res > 7);
    return res;
}
You can speed it up by avoiding the second (*) call to rand5() when the first one returns a value too large (essentially a shortcut of the comparison), but you can't avoid the fundamental problemof requiring arbitrarily many calls to rand5() (**) .
Follow-up question:  what is the expected running time of rand7() in terms of calls to rand5()?  If we make the obvious generalization to m and n, what is the expected running time in terms of m and n?

(*) I mean the one that appears in the code second.  The compiler is free to order the actual calls however it wants.
(**) I'm not sure I know how to formally prove this, but I'm certain it's true.

Brian
Sunday, April 06, 2003

I should note that my solution is essentially the same as Jared's, except that he makes more efficient use of the first call to rand5().

Brian
Sunday, April 06, 2003

I should also note that I could have made the optimization suggested in Jared's solution, but didn't because it is tricky to do in the generalized solution, and is useless in the common case where you try to generate a number based on calls to a rand2() (i.e. just getting random bits).

Brian
Sunday, April 06, 2003

Jared I think you overcompicate things  - keep the digits the same and don't convert between bases.

Call rand5 4 times and you get a number:

              ____
1111 <= abcd  <= 5555

those are 4445 numbers = 7 * 635.

Thus you can divide the spectrum evenly.

tekusme
Tuesday, April 08, 2003

umm...  most of those 4445 numbers are not possible outcomes (like 3999).  It should be obvious that there are 5*5*5*5 = 625 possible outcomes.

Even though it doesn't work, it's an interesting approach.

Brian
Tuesday, April 08, 2003

tekusme gave me an idea though...

Run rand5 7 times..

11111222223333344444555556666677777

(this gets you 35 effective bits of strong PRNG randomness assuming no bias / no skew)

line up the bits and chop it up into 5 pieces of rand7s...

11111112222222333333344444445555555

If you intend to use a lot of rands, just tip the bit maki sushi chef.

Li-fan Chen
Thursday, April 10, 2003

Let's say you got a problem where you decided to map rand5 to the first 5 bits of rand7... well you can do another rand 5 and inefficiently map to the last 2 bits pretty easily.

For example:

Last two bits initialized to: 00

First wasteful 01010 can bit XOR (one after the other) toggle 0,
another wasteful 10111 can bit XOR (one after the other) toggle 0.

This is three rand5 .. pretty wasteful.. but in a protocol this is usually used to filled up the last block of some randomness provider protocol out of N-1 block.. When N is large this is no damage to performance.

Because the last two bit is just as random as the first  5 bits and they come from separate randoms you might be able to get away with conjecturing that there are no bias over all. It's how you do the mapping that prevents the skew.

Li-fan Chen
Thursday, April 10, 2003

Correction, to eliminate even more skew.. don't initialize to 0.. just XOR bit by bit all subsequent bits of the wasteful rand5 with the first bit.

Li-fan Chen
Thursday, April 10, 2003

"Run rand5 7 times..

11111222223333344444555556666677777

(this gets you 35 effective bits of strong PRNG randomness assuming no bias / no skew) "

No, it doesn't.  A call to rand5 gives you a number 0-4: log(5) bits of randomness.

Brian
Thursday, April 10, 2003

I am a freak, please ignore me.

To clean up bias though I found a simple article someone showed me.

http://www.random.org/essay.html

I quoth:

The skew correction algorithm used is based on transition mapping. Bits are read two at a time, and if there is a transition between values (the bits are 01 or 10) one of them - say the first - is passed on as random. If there is no transition (the bits are 00 or 11), the bits are discarded and the next two are read. This simple algorithm was originally due to Von Neumann and completely eliminates any bias towards 0 or 1 in the data. It is only one of several ways of performing skew correction, though, and has a number of drawbacks. First, it takes an indeterminate number of input bits. Second, it is quite inefficient, resulting in the loss of 75% of the data, even when the bit stream is already unbiased. RFC1750 discusses skew correction in general and lists this method as well as three others. Paul Crowley discusses skew correction for use with Geiger counters and also maintains a page with implementations of a number of skew correction algorithms.

Li-fan Chen
Friday, April 11, 2003

Use mod

((Rnd * Rnd) mod 7) + 1

The Shadow
Thursday, April 24, 2003

The skew correction algorithm is not relevant.  Unless you care to explain how it answers the original question.

Brian
Thursday, June 19, 2003

*  Recent Topics

*  Fog Creek Home