Random probability

I have this pseudo-code (NOT C/C++):

k = 0
b = 0
for i = 1 to 64
   r = rnd()
   if r > .96 then b = b + 1
   if r > .98 then k = k + 3
   else if r > .95 then k = k + 2
next i


where rnd() provides a random floating point number between 0 and 1 inclusive.

Can this be translated into C++ using the <random> header and a distribution in just 1 line of code (without a loop)? The result needn't be exactly the same, just similar. I guess the code above is based upon probability somehow, but that isn't an area of maths I've ever studied beyond the absolute basics.

Obviously the minimum for b is 0 and the maximum is 64, and for k the minimum is 0 and the maximum is 192. But I think the code favours lower numbers due to probability? I've never used a distribution other than uniform, which isn't applicable in this case.

Any thoughts?
Maybe a piecewise distribution can be used, but you need to break it into the appropriate cases.

[0, 0.95] --> nothing
(0.95, 0.96] --> k = k + 2
(0.96, 0.98] --> (b = b + 1, k = k + 2)
(0.98, 1.0] --> (b = b + 1, k = k + 3)

I think that's correct?
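
One way to check the breakdown is to code it as a small lookup and compare it with a literal transcription of the original loop body. This is only a sketch: the function names `increments` and `original_step` are made up for illustration and do not come from the thread.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper: returns the {b, k} increments for a single draw r,
// following the four intervals listed above.
std::pair<int, int> increments(double r)
{
    assert(r >= 0.0 && r <= 1.0);   // rnd() is stated to return a value in [0, 1]
    if (r > 0.98) return {1, 3};    // (0.98, 1.0]
    if (r > 0.96) return {1, 2};    // (0.96, 0.98]
    if (r > 0.95) return {0, 2};    // (0.95, 0.96]
    return {0, 0};                  // [0, 0.95]
}

// Hypothetical helper: the original loop body, transcribed from the pseudo-code,
// applied to a single draw starting from b = 0 and k = 0.
std::pair<int, int> original_step(double r)
{
    int b = 0, k = 0;
    if (r > 0.96) b = b + 1;
    if (r > 0.98) k = k + 3;
    else if (r > 0.95) k = k + 2;
    return {b, k};
}
```

Both functions should agree for any r in [0, 1], which suggests the four cases cover the loop body completely.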

The problem is, I don't see anything in the standard library that allows you to call into a lambda for certain input values. But due to the magic of templates... there is probably a way to fake it into calling lambdas that capture b or k.

Edit: But the template for the piecewise distribution seems to only intend a "RealType" to be used, so it might not be allowed?
No, this cannot be converted into 1 line of C++.
You can make it shorter at the cost of readability.
You can probably eliminate the loop** by approximating b and k randomly instead of looping, which has nothing to do with C++ and everything to do with math.

** The accuracy or usability of this approach may or may not be tolerable, depending on what it's being used for.

Let's take a look. You have 64 numbers or thereabouts (it is unclear to me whether 1 and 64 are both included in your pseudo-code). It is probably clear if you remember what ( and ] mean, but I don't, and I didn't look it up because the exact value isn't important.

2% of 64 is 1.28, so typically 0-2 of the values (very rarely 3-5 or so; technically it could be every value) will add 3 to k.
0.96 to 0.98 is also 2% of the numbers, so typically another 0-2 values will add 2 to k.
0.95 to 0.96 is 1% of the values, so as above 0-1 of the values will add 2 to k, and rarely a few more.

You can use a floating-point random distribution with a mean and standard deviation to approximate this trailing edge of 0-3 values plus a small chance of extras. It will take about 10 lines of code to set up your <random> tools and another 3 or 4 to compute K and B, but the loop will be gone.
I don't know how to get the exact standard deviation and other parameters that match the original scientifically, but you can eyeball it and get really close, or you can do the math if this is "serious" work.

I would prefer you to try it first before I offer code for this.
It will look like:
set up the distribution(s) and generators for <random>
get a random value for K (3 times, one for each term). B should be computed from K's terms, not drawn on its own: B is the count of values feeding K, minus the count from one of K's terms (the 0.95-0.96 one). K should be validated so that the sum of your terms is not greater than its max value; if it is, which should be a 1 in a gazillion roll, set it to that max value (not sure what that is, kinitial + 64 * 3 I think?)
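
A minimal sketch of that outline, with one substitution named plainly: instead of a normal approximation tuned by eye, it draws the three counts with `std::binomial_distribution`, each conditioned on the draws still unassigned, which should reproduce the loop's distribution rather than approximate it. The names `Tally` and `sample_tally` are hypothetical, and the probabilities 0.02, 0.02 and 0.01 are read off the thresholds in the pseudo-code.

```cpp
#include <cassert>
#include <random>

// Hypothetical result type for illustration.
struct Tally { int k; int b; };

// Hypothetical helper: draws k and b for n iterations without looping over them.
template <class Gen>
Tally sample_tally(int n, Gen &gen)
{
    assert(n >= 0);
    using Binom = std::binomial_distribution<int>;

    // Number of draws with r > 0.98 (probability 0.02 each): these add 3 to k and 1 to b.
    int n3 = Binom(n, 0.02)(gen);

    // Of the remaining draws, those in (0.96, 0.98]: conditional probability 0.02 / 0.98.
    // These add 2 to k and 1 to b.
    int n2b = Binom(n - n3, 0.02 / 0.98)(gen);

    // Of what is left, those in (0.95, 0.96]: conditional probability 0.01 / 0.96.
    // These add 2 to k only.
    int n2 = Binom(n - n3 - n2b, 0.01 / 0.96)(gen);

    return { 3 * n3 + 2 * (n2b + n2), n3 + n2b };
}
```

Called as `sample_tally(64, gen)` with a `std::mt19937 gen`, this uses three random draws instead of 64. Because each count is drawn only from the draws not yet assigned, the three counts can never sum to more than n, so no clamping of k to its maximum is needed in this variant.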

This seems like a lot of math to eliminate a loop over 64 iterations, which should take a nearly unmeasurable amount of CPU time. I wouldn't bother until your sample size were at least a million, assuming modern desktop CPU power.
Thanks. As it seems it can't be done easily using one of the in-built distributions, I'll just keep the code (as C++). Speed isn't the issue as it's currently only evaluated once per program run. Having C++ code like this just 'looks not right' to me...
Actually, it can be done using the multinomial distribution - see
https://en.wikipedia.org/wiki/Multinomial_distribution

You need a multinomial distribution generator and there isn't one in the standard library, but there is one in the GNU GSL library:
https://www.gnu.org/software/gsl/doc/html/randist.html#the-multinomial-distribution

In the code below I have coded my own multinomial distribution generator using only the routines in <random> and the algorithm on the Wikipedia page. However, it is not very clever and it will almost certainly be slower than the original loop! Effectively, it is the sum over n calls to a discrete probability distribution, so you could use the library function for that if you prefer.

Note that, to compare the two methods for accuracy at least, I have had to increase the loop count from 64 to 1000000 so that I can get some repeatability. As a long term average you will get
k ~ 0.12N
b ~ 0.04N
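
These averages can be checked from the per-iteration probabilities, assuming rnd() is uniform on [0, 1]: a draw adds 3 to k with probability 0.02, adds 2 to k with probability 0.02 + 0.01, and adds 1 to b with probability 0.02 + 0.02.

```latex
\begin{align*}
E[k] &= N\,(3 \cdot 0.02 + 2 \cdot 0.02 + 2 \cdot 0.01) = 0.12\,N \\
E[b] &= N\,(0.02 + 0.02) = 0.04\,N
\end{align*}
```

For the original N = 64 that gives averages of about 7.68 for k and 2.56 for b.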

#include <iostream>
#include <vector>
#include <random>
#include <ctime>
using namespace std;

mt19937 gen( time( 0 ) );
uniform_real_distribution<double> U( 0.0, 1.0 );

//----------------------------------------------------------------------

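// Draw n samples from the categories with probabilities p (assumed to sum to 1)
// and return how many samples fell into each category.
vector<int> multinomial( int n, const vector<double> &p )
{
   vector<int> v( p.size(), 0 );
   const int last = (int)p.size() - 1;         // index of the final category

   for ( int i = 0; i < n; i++ )
   {
      double X = U( gen );
      int j = 0;
      // Walk the cumulative probabilities until they reach X;
      // the final category absorbs any rounding shortfall in the sum.
      for ( double sump = p[0]; sump < X && j < last; j++ ) sump += p[j+1];
      v[j]++;
   }

   return v;
}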

//----------------------------------------------------------------------

int main()
{
   int k, b;
   int N = 1000000;
// int N = 64;

   cout << "Using multinomial distribution ...\n";
   vector<int> v = multinomial( N, { 0.95, 0.01, 0.02, 0.02 } );
   k = 2 * ( v[1] + v[2] ) + 3 * v[3];
   b = v[2] + v[3];
   cout << "k = " << k << "      b = " << b << '\n';


   cout << "Using original loop ...\n";
   k = b = 0;
   for ( int i = 1; i <= N; i++ )
   {
      double r = U( gen );
      if ( r > 0.96 ) b++;
      if      ( r > 0.98 ) k += 3;
      else if ( r > 0.95 ) k += 2;
   }
   cout << "k = " << k << "      b = " << b << '\n';
}

//---------------------------------------------------------------------- 


Using multinomial distribution ...
k = 120986      b = 40354
Using original loop ...
k = 120632      b = 40091




If you want to use discrete_distribution instead then you can replace my multinomial routine by
vector<int> multinomial( int n, const vector<double> &p )
{
   vector<int> v( p.size(), 0 );
   discrete_distribution<int> D( p.begin(), p.end() );
   for ( int i = 0; i < n; i++ ) v[D(gen)]++;
   return v;
}

but for some reason this seems to be significantly slower.