Sanvesh Srivastava - Colloquium Speaker
Flexible hierarchical Bayesian modeling of massive data is challenging because posterior computations scale poorly with the sample size. Scalable Bayesian methods based on the divide-and-conquer technique provide a general approach to tractable Bayesian inference in massive data settings. These methods consist of three steps. First, the data are divided into smaller, computationally tractable subsets. Second, posterior samples are obtained in parallel across all the subsets. Third, the posterior samples from all the subsets are combined to yield an approximation of the full data posterior distribution, which is used for inference and prediction. Sampling in the second step is more efficient than sampling from the full data posterior because each subset is much smaller than the full data. Since the combination step takes negligible time relative to sampling, posterior computations can be scaled to massive data by dividing the full data into a sufficiently large number of subsets. Existing divide-and-conquer methods differ mainly in their combination schemes. Our focus is on the WASP method, which combines subset posterior distributions through their barycenter in a Wasserstein space of probability measures. We demonstrate the application of WASP to linear mixed-effects models and conclude with an application to Kriging using stationary and non-stationary Gaussian process priors.
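The three steps above can be sketched on a toy problem. Everything in this sketch is illustrative and not taken from the talk: it assumes a normal mean model with known unit variance and a flat prior, raises each subset likelihood to the power of the number of subsets (a stochastic-approximation adjustment so subset posteriors have roughly the spread of the full data posterior), and uses the fact that for a one-dimensional parameter the 2-Wasserstein barycenter is the distribution whose quantile function is the average of the subset posterior quantile functions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting (illustrative): unknown mean, known unit variance, flat prior.
n, k = 10_000, 10                      # full sample size, number of subsets
data = rng.normal(loc=2.0, scale=1.0, size=n)

# Step 1: divide the data into k computationally tractable subsets.
subsets = np.array_split(data, k)

def subset_posterior_draws(x, k, n_draws=5_000):
    # Raise the subset likelihood to the power k, so each subset posterior
    # has roughly the spread of the full-data posterior.  For a N(mu, 1)
    # likelihood and a flat prior, this gives N(xbar, 1 / (k * len(x))).
    return rng.normal(x.mean(), np.sqrt(1.0 / (k * len(x))), size=n_draws)

# Step 2: sample every subset posterior (in practice, in parallel
# across machines, with no communication between subsets).
draws = [subset_posterior_draws(x, k) for x in subsets]

# Step 3: combine via the Wasserstein barycenter.  In one dimension the
# barycenter's quantile function is the average of the subset posteriors'
# quantile functions.
grid = np.linspace(0.001, 0.999, 999)
wasp_quantiles = np.mean([np.quantile(d, grid) for d in draws], axis=0)

# The full-data posterior here is N(xbar, 1/n); the barycenter should be
# a close approximation to it.
full_draws = rng.normal(data.mean(), np.sqrt(1.0 / n), size=5_000)
gap = np.max(np.abs(wasp_quantiles - np.quantile(full_draws, grid)))
```

The quantile-averaging shortcut applies only to one-dimensional functionals of the parameter; for general posteriors the barycenter must be estimated numerically from the empirical subset posteriors, for example by linear programming.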
This talk is based on joint work with David B. Dunson (Duke University), Rajarshi Guhaniyogi (UC Santa Cruz), Cheng Li (National University of Singapore), and Terrance Savitsky (U.S. Bureau of Labor Statistics). Most of the talk will be based on the manuscript titled “Scalable Bayes via Barycenter in Wasserstein Space,” available at https://arxiv.org/abs/1508.05880.