We consider the multi-armed bandit problem. We show that when the state space is finite, the computation of the dynamic allocation indices can be handled by linear programming methods.
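As an illustration of how such an index can be written as a linear program, the sketch below uses the restart-in-state characterization due to Katehakis and Veinott; it is one known formulation and may differ from the one used in this paper. The notation is assumed, not taken from the abstract: a single arm with finite state space $S$, one-step rewards $r_j$, transition probabilities $p_{jk}$, and discount factor $\beta \in (0,1)$. For a fixed state $i$, the variables $v_j$ represent the optimal value of a problem in which, at every step, one may either continue from the current state $j$ or restart from state $i$.

```latex
% Sketch under assumed notation (S, r_j, p_{jk}, \beta are not from the abstract).
% Restart-in-state LP for a fixed state i; the variables v_j are unrestricted in sign.
\begin{align*}
\text{minimize}   \quad & \sum_{j \in S} v_j \\
\text{subject to} \quad & v_j \ge r_j + \beta \sum_{k \in S} p_{jk}\, v_k, && j \in S
                          \quad \text{(continue from state } j\text{)} \\
                        & v_j \ge r_i + \beta \sum_{k \in S} p_{ik}\, v_k, && j \in S
                          \quad \text{(restart from state } i\text{)}
\end{align*}
% With v^* the optimal solution, the dynamic allocation index of state i is
\[
  m(i) = (1 - \beta)\, v_i^* .
\]
```

Under this formulation, one such program is solved per state $i$, each with $|S|$ variables and $2|S|$ constraints.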
This paper presents three algorithms for solving linear programming problems in which some or all of the objective function coefficients are specified in terms of intervals. Which algorithm is ...