You're wrong. Doing range compression does not remove the bias, it simply redistributes it. And it is slower. So using modulus is superior.
EDIT:
Since you probably won't believe me, here's a simplified example to illustrate:
Assumptions:
A) rand() produces a finite number of outputs 'X'
B) we want to translate rand()'s output to a smaller subset 'Y'
C) We assume rand()'s output is even distribution... ie: all numbers [0..X) have an equal chance of appearing (even though this is implementation dependent and may not always be true). If C is not true than rand() is fundamentally biased and there is nothing you can do to remove the bias - so for purposes of this example we assume C is true.
To keep the example simple, let's do some runs with the following numbers:
X = 10
Y = 3
that is, rand() produces [0..10) and we want to generate [0..3)
Illustration:
1 2 3 4 5
|
// compressive approach:
a = (int)( (float)rand() * 3 / 10.0f );
// mod approach
b = rand() % 3;
| |
Since X is small, we can simply map out how X values will translate to Y values for each approach:
1 2 3 4 5 6 7 8 9 10 11 12
|
rand output | mod result | compress result
====================================================
0 0 0
1 1 0
2 2 0
3 0 0
4 1 1
5 2 1
6 0 1
7 1 2
8 2 2
9 0 2
| |
Notice that in each approach (even the compression method), 0 appears more frequently than 1 or 2. So both approaches are biased towards 0.
EDIT 2:
It comes down to a very simple principle. You are translating X possible outputs into Y possible outputs. So unless X is evenly divisible by Y, there is going to be bias no matter what kind of transformation you do.
X % Y
numbers are going to be biased.
plug in whatever numbers for X and Y that you like:
X=54
Y=10
4 results are going to have bias... each item in Y can be generated fairly 5 times, but then there are 4 additional possible outputs of X that cause the bias.
AFAIK the only way to completely avoid this bias is to choose a number 'Z' with which:
X % Z is zero
Z >= Y
Then use Z to scale down from rand, and "reroll" if the result is >= Y
So to carry with my previous example:
X = 10
Y = 3
Z = 5
1 2 3 4 5 6 7 8 9
|
// with bias:
int result = rand() % 3;
// without bias
int result;
do
{
result = rand() % 5;
}while( result >= 3 );
| |
But talk about slow.
Also...
EVEN THAT may not be without bias. Since you are effectively discarding numbers in X's set, you are breaking assumption C above. (unless Y never changes. If Y and Z remain constant throughout the life of the RNG, then this would truly generate numbers without bias).
EDIT 3:
Lastly note that this bias exists of all digital rngs, since they all have a fixed number of possible outputs. Even the really complex ones. Even floating point ones. If the rng is digital, it has a finite number of possible outputs -- and translating those to a different range of finite outputs is going to have bias unless it's evenly divisible.
Mathematical fact.