### Abstract

We consider the problem of a Bayesian researcher who wants to optimally design an investigation into some unknown parameter of interest when multiple data sources (or experimental designs), possibly differing in cost per sample, are available. Because expected losses are typically impractical to compute or simulate, we consider an approximate solution that holds for large sample sizes (that is, when the researcher’s budget is large). We show that the asymptotically optimal design uses data sources in proportions that maximize the Fisher information per dollar. Importantly, this criterion generically does not depend on the researcher’s loss function, and is therefore applicable in a wide variety of settings. Our results provide a basis for a Bayesian approach to experiment design that escapes many of the inflexibilities of the usual frequentist approach. Specifically, we allow for comparisons of data sources that may have very disparate data-generating processes and would therefore be difficult to compare under classical approaches. Furthermore, our results directly account not only for the statistical properties of a data source but also for its cost per sample.
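To make the "Fisher information per dollar" criterion concrete, here is a minimal sketch of the simplest case we can think of. It is an illustration under strong assumptions, not the paper's general method: the parameter is a scalar normal mean, each source yields independent draws with a known noise standard deviation, and the source names, noise levels, costs, and budget are all hypothetical. In this scalar case total information is linear in the sample counts, so the whole budget goes to the source with the highest information per dollar. Multi-parameter problems, where information is a matrix, are not covered by the sketch.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """A hypothetical data source for a scalar normal mean."""
    name: str
    noise_sd: float  # standard deviation of one observation (assumed known)
    cost: float      # cost per sample, in dollars


def info_per_dollar(s: Source) -> float:
    # One draw from N(theta, sd^2) carries Fisher information 1 / sd^2
    # about theta; dividing by the cost gives information per dollar.
    return 1.0 / (s.noise_sd ** 2 * s.cost)


def best_source(sources):
    # Scalar case only: pick the source with the most information per dollar.
    return max(sources, key=info_per_dollar)


# Hypothetical numbers, chosen only for illustration.
sources = [
    Source("survey", noise_sd=2.0, cost=1.0),  # noisy but cheap
    Source("lab", noise_sd=1.0, cost=5.0),     # precise but expensive
]

budget = 1000.0
best = best_source(sources)
n = int(budget // best.cost)           # samples affordable from the best source
total_info = n / best.noise_sd ** 2    # information adds across independent draws

for s in sources:
    print(f"{s.name}: {info_per_dollar(s):.2f} information per dollar")
print(f"Spend the budget on '{best.name}': n = {n}, total information = {total_info}")
```

With these made-up numbers the cheaper, noisier source wins (0.25 versus 0.20 units of information per dollar), which shows how cost per sample can outweigh per-sample precision.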

I’m an economic theory grad student at UW–Madison. I am currently (Spring 2021) on the job market.