Basically, the problem simulates the following:
There is an urn with 50 green balls and 50 red balls.
I am allowed to pick balls from the urn, without replacement, with the following rules: For every red ball picked, I lose a dollar, for every green ball picked, I gain a dollar.
I can stop picking whenever I want. Worst case scenario is I pick all 100, and net 0.
The question is to come up with an optimal stopping strategy, and create a program to compute the expected value of the strategy.
My strategy is to continue picking balls, while the expected value of picking another ball is positive.
That is, the stopping rule is DYNAMIC.
In Latex, here's the recursive formula:
http://imgur.com/fnzYk
Thus:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51
|
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
double ExpectedValue(double, double);
double max(double, double);
main() {
double g = 50;
double r = 50;
double EV = ExpectedValue(g, r);
printf ("%f\n\n", EV);
system("PAUSE");
}
double ExpectedValue(double g, double r){
double p = (g / (g + r));
double q = 1 - p;
if (g == 0)
return r;
if (r == 0)
return 0;
double E_gr = max ((p * ExpectedValue (g - 1, r)) + (q * ExpectedValue (g, r - 1)), (r - g));
return E_gr;
}
double max(double a, double b){
if (a > b)
return a;
else return b;
}
| |
I let it run for 30 minutes, and got no answer. For small values of g and r, a solution is computed very quickly. What am I doing wrong?
Any help is much appreciated!