Small-sample performance of Bernoulli two-armed bandit Bayesian strategies

Institution: 1. Department of Economics, University of Warwick, National Institute of Economic and Social Research, United Kingdom; 2. Department of Economics, UCLA, United States

Abstract: This paper examines the small-sample performance of several strategies for Bernoulli two-armed bandit problems with independent arms. First, we investigate strategies based on a one-armed bandit threshold value (an index analogous to the 'Gittins index') and on upper confidence bounds for θ_i, the unknown success probability of arm i. Using backward induction and the Bayesian viewpoint, we observe that these strategies improve on the myopic strategy and come much closer to optimal in terms of total expected reward, even though for very small samples the myopic worth itself is already close to optimal. Second, we find that the myopic strategy and the strategy based on the one-armed threshold value dominate the Bayesian optimal strategy over a region of the parameter space that can have large probability under the assumed prior. Finally, we show through examples how this affects robustness: small misspecifications of the prior can lead to the myopic strategy performing better than the optimal strategy in terms of Bayes worth.
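The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `bayes_worth`, the independent Beta priors with integer parameters, and the tie-breaking rule (arm 1 on ties) are all assumptions made for illustration. The sketch computes the Bayes worth (total expected reward under the prior) of the optimal strategy by backward induction, and that of the myopic strategy, which always pulls the arm with the larger posterior mean.

```python
from fractions import Fraction
from functools import lru_cache


def bayes_worth(n, prior=(1, 1, 1, 1), myopic=False):
    """Expected total reward over n pulls of a two-armed Bernoulli bandit.

    prior = (a1, b1, a2, b2): integer parameters of independent
    Beta(a1, b1) and Beta(a2, b2) priors (an assumption of this sketch).
    myopic=False gives the Bayes-optimal worth via backward induction;
    myopic=True evaluates the strategy that pulls the arm with the
    larger posterior mean (ties go to arm 1, also an assumption).
    """

    @lru_cache(maxsize=None)
    def value(a1, b1, a2, b2, t):
        if t == 0:  # no pulls left
            return Fraction(0)
        m1 = Fraction(a1, a1 + b1)  # posterior mean of arm 1
        m2 = Fraction(a2, a2 + b2)  # posterior mean of arm 2
        # Expected reward-to-go if arm 1 (resp. arm 2) is pulled now.
        q1 = (m1 * (1 + value(a1 + 1, b1, a2, b2, t - 1))
              + (1 - m1) * value(a1, b1 + 1, a2, b2, t - 1))
        q2 = (m2 * (1 + value(a1, b1, a2 + 1, b2, t - 1))
              + (1 - m2) * value(a1, b1, a2, b2 + 1, t - 1))
        if myopic:
            return q1 if m1 >= m2 else q2
        return max(q1, q2)

    return value(*prior, n)


if __name__ == "__main__":
    for n in (2, 5, 10):
        opt = bayes_worth(n)
        myo = bayes_worth(n, myopic=True)
        print(n, float(opt), float(myo), float(opt - myo))
```

Exact rational arithmetic is used so that small gaps between the two worths are not obscured by rounding; for larger horizons floating point would be faster. The index-based and upper-confidence-bound strategies mentioned in the abstract are not implemented here.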

This article is indexed in ScienceDirect and other databases.