Speeding Up MCMC by Efficient Data Subsampling

B-Tier
Journal: Journal of the American Statistical Association
Year: 2019
Volume: 114
Issue: 526
Pages: 831-843

Authors (4)

Matias Quiroz (not in RePEc)
Robert Kohn (not in RePEc)
Mattias Villani (Statistiska institutionen, Sto...)
Minh-Ngoc Tran (not in RePEc)

Score contribution per author:

0.503 = (α = 2.01 / 4 authors) × 1.0 (B-tier multiplier)

α: calibrated so average coauthorship-adjusted count equals average raw count
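
The arithmetic behind the per-author score can be sketched as follows. The function name `score_per_author` is hypothetical, and the inputs are the rounded values shown on this page, so the last digit may differ slightly from the displayed 0.503.

```python
def score_per_author(alpha: float, n_authors: int, tier_multiplier: float) -> float:
    """Coauthorship-adjusted score: alpha is split evenly across authors,
    then scaled by the journal-tier multiplier."""
    return alpha / n_authors * tier_multiplier


# Values taken from this record: alpha = 2.01, 4 authors, B-tier multiplier 1.0.
score = score_per_author(alpha=2.01, n_authors=4, tier_multiplier=1.0)
```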

Abstract

We propose subsampling Markov chain Monte Carlo (MCMC), an MCMC framework where the likelihood function for n observations is estimated from a random subset of m observations. We introduce a highly efficient unbiased estimator of the log-likelihood based on control variates, such that the computing cost is much smaller than that of the full log-likelihood in standard MCMC. The likelihood estimate is bias-corrected and used in two dependent pseudo-marginal algorithms to sample from a perturbed posterior, for which we derive the asymptotic error with respect to n and m, respectively. We propose a practical estimator of the error and show that the error is negligible even for a very small m in our applications. We demonstrate that subsampling MCMC is substantially more efficient than standard MCMC in terms of sampling efficiency for a given computational budget, and that it outperforms other subsampling methods for MCMC proposed in the literature. Supplementary materials for this article are available online.
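
The following is a minimal sketch of a control-variate (difference) estimator of the log-likelihood of the kind the abstract describes, not the authors' implementation. It assumes a toy one-parameter logistic regression, second-order Taylor expansions around a fixed point `theta_star` as control variates, subsampling with replacement, and a bias correction of the form `l_hat - var_hat / 2`. The model and all names (`loglik_terms`, `TaylorControlVariates`, `estimate_loglik`) are illustrative assumptions.

```python
import numpy as np


def loglik_terms(theta, x, y):
    """Per-observation log-likelihood terms l_k(theta) of a toy logistic model."""
    eta = x * theta
    return y * eta - np.logaddexp(0.0, eta)  # log(1 + exp(eta)), computed stably


class TaylorControlVariates:
    """Second-order Taylor expansions q_k(theta) of l_k(theta) around theta_star."""

    def __init__(self, theta_star, x, y):
        p = 1.0 / (1.0 + np.exp(-x * theta_star))
        self.theta_star = theta_star
        self.l0 = loglik_terms(theta_star, x, y)  # l_k(theta_star)
        self.g = x * (y - p)                       # first derivatives at theta_star
        self.h = -(x ** 2) * p * (1.0 - p)         # second derivatives at theta_star
        # Full-data sums are computed once, so sum_k q_k(theta) is O(1) afterwards.
        self.sums = (self.l0.sum(), self.g.sum(), self.h.sum())

    def q(self, theta, idx):
        d = theta - self.theta_star
        return self.l0[idx] + self.g[idx] * d + 0.5 * self.h[idx] * d ** 2

    def q_sum(self, theta):
        d = theta - self.theta_star
        s0, s1, s2 = self.sums
        return s0 + s1 * d + 0.5 * s2 * d ** 2


def estimate_loglik(theta, x, y, cv, m, rng):
    """Unbiased estimate of sum_k l_k(theta) from m subsampled observations,
    together with an estimate of the estimator's variance."""
    n = x.size
    idx = rng.integers(0, n, size=m)  # m indices drawn with replacement
    diff = loglik_terms(theta, x[idx], y[idx]) - cv.q(theta, idx)
    l_hat = cv.q_sum(theta) + n * diff.mean()
    var_hat = n ** 2 * diff.var(ddof=1) / m
    return l_hat, var_hat


# Synthetic data (assumed for illustration only).
rng = np.random.default_rng(0)
n, m = 10_000, 100
x = rng.normal(size=n)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-x * 1.0)))

cv = TaylorControlVariates(theta_star=0.9, x=x, y=y)
l_hat, var_hat = estimate_loglik(1.0, x, y, cv, m, rng)
log_lik_bias_corrected = l_hat - 0.5 * var_hat  # log of bias-corrected likelihood estimate
full = loglik_terms(1.0, x, y).sum()            # O(n) reference value for comparison
```

Each call to `estimate_loglik` evaluates only m of the n likelihood terms, and the control variates keep the variance small when `theta` is close to `theta_star`. In a pseudo-marginal sampler, `log_lik_bias_corrected` would stand in for the full log-likelihood in the acceptance ratio.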

Technical Details

RePEc Handle: repec:taf:jnlasa:v:114:y:2019:i:526:p:831-843
Journal Field: Econometrics
Author Count: 4
Added to Database: 2026-01-29