Custom Frequency RNG

This just occurred to me and I thought I should share.

Suppose I have a function maybe_one(x) which returns...

-> 1 with probability x
-> 0 with probability 1-x

Now, suppose I want my RNG to give me...

-> 1 with probability 60%
-> 2 with probability 30%
-> 3 with probability 10%

This -> 1 + maybe_one(40%) * ( 1 + maybe_one(25%) ) will do the trick.

Now, suppose I want my RNG to give me...

-> 1 with probability 40%
-> 2 with probability 30%
-> 3 with probability 18%
-> 4 with probability 12%

This -> 1 + maybe_one(60%) * ( 1 + maybe_one(50%) * ( 1 + maybe_one(40%) ) ) will work.

I believe you begin to see the pattern, but where do 40%, 25%, 60%, 50% and 40% come from?

Let's take a closer look at the first example...

I want the expression to evaluate to 1 + 0 60% of the time.
This means that the first maybe_one call should return 1
40% of the time. Ok, that was easy. Now, suppose I have
1 + maybe_one(40%) * ( 1 + maybe_one(P) ). The probability
I get a 2 from this is 40% * (1-P). And I want this to be 30%.

0.4 * (1-P) = 0.3 -> ... -> P = 0.25

Try to walk through the second example step by step.

Oh, and here is a small demonstration program:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
#include <iostream>
#include <vector>
#include <cstdlib>
#include <ctime>

struct CFRNG
{
    std::vector<double> cfreq;

    CFRNG() { srand(time(0)); }

    int CFOne(double p) const { return (rand()*1./(RAND_MAX+1)) < p; }

    int CFRand() const
    {
        if (cfreq.empty()) return 0;

        int ret=1;

        double lsum=0, hsum=cfreq[0];

        for (int i=0; i<cfreq.size()-1; i++)
        {
            lsum+=cfreq[i];
            hsum+=cfreq[i+1];

            ret*=CFOne(lsum/hsum);
            ret++;
        }

        return ret-1;
    }

    CFRNG & operator << (double f) { cfreq.push_back(f); return *this; }
};

void test(const CFRNG & rng, int count)
{
    std::vector<int> freq(rng.cfreq.size(),0);

    for (int i=0; i<count; i++)
        freq[rng.CFRand()]++;

    for (int i=0; i<freq.size(); i++)
        std::cout << i << " -> "
            << int(freq[i]*10000./count+0.005)/100.
            << "%" << std::endl;
}

int main()
{
    CFRNG rng;

    rng << 0.01 << 0.05 << 0.15
        << 0.19 << 0.24 << 0.36;

    test(rng, 25000);

    return 0;
}

EDIT: Haha, looks like there are better options:

http://www.boost.org/doc/libs/1_46_1/doc/html/boost_random/tutorial.html

EDIT2: Mmm... Peeking at boost code made me realize this can be done much better:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
#include <iostream>
#include <numeric>
#include <vector>
#include <cstdlib>
#include <ctime>

struct CFRNG
{
    std::vector<double> cfreq;
    std::vector<double> ccfreq;

    bool ccfreq_ready;

    CFRNG() { srand(time(0)); ccfreq_ready=false;}

    int CFRand()
    {
        if (cfreq.empty()) return 0;
        if (!ccfreq_ready) build_ccfreq();

        double p=(rand()*1./(RAND_MAX+1));

        for (int i=0; i<ccfreq.size(); i++)
            if (ccfreq[i]>p) return i;
    }

    CFRNG & operator << (double f)
    {
        cfreq.push_back(f); ccfreq_ready=false; return *this;
    }

    void build_ccfreq()
    {
        ccfreq.resize(cfreq.size());

        std::partial_sum(cfreq.rbegin(),cfreq.rend(),ccfreq.begin());

        ccfreq_ready=true;
    }
};

void test(CFRNG & rng, int count)
{
    std::vector<int> freq(rng.cfreq.size(),0);

    for (int i=0; i<count; i++)
        freq[rng.CFRand()]++;

    for (int i=0; i<freq.size(); i++)
        std::cout << i << " -> "
            << int(freq[i]*10000./count+0.005)/100.
            << "%" << std::endl;
}

int main()
{
    CFRNG rng;

    rng << 0.01 << 0.05 << 0.15
        << 0.19 << 0.24 << 0.36;

    test(rng, 25000);

    return 0;
}
Last edited on
Topic archived. No new replies allowed.