Research Papers

Computationally Efficient Variational Approximations for Bayesian Inverse Problems

Panagiotis Tsilifis

Department of Mathematics,
University of Southern California,
Los Angeles, CA 90089-2532
e-mail: tsilifis@usc.edu

Ilias Bilionis

Assistant Professor
School of Mechanical Engineering,
Purdue University,
West Lafayette, IN 47906-2088
e-mail: ibilion@purdue.edu

Ioannis Katsounaros

Leiden Institute of Chemistry,
Leiden University,
Einsteinweg 55, P.O. Box 9502,
Leiden 2300 RA, The Netherlands
e-mail: katsounaros@anl.gov

Nicholas Zabaras

Warwick Centre for Predictive Modeling,
University of Warwick,
Coventry CV4 7AL, UK
e-mail: nzabaras@gmail.com

Manuscript received September 15, 2015; final manuscript received July 5, 2016; published online July 26, 2016. Editor: Ashley F. Emery.

J. Verif. Valid. Uncert 1(3), 031004 (Jul 26, 2016) (13 pages) Paper No: VVUQ-15-1041; doi: 10.1115/1.4034102

The major drawback of the Bayesian approach to model calibration is the computational burden of characterizing the posterior distribution of the unknown model parameters, which arises because typical Markov chain Monte Carlo (MCMC) samplers require thousands of forward model evaluations. In this work, we develop a variational Bayesian approach to model calibration that uses an information-theoretic criterion to recast the posterior problem as an optimization problem. Specifically, we parameterize the posterior using the family of Gaussian mixtures and seek to minimize the information loss incurred by replacing the true posterior with an approximate one. Our approach is of particular importance in underdetermined problems with expensive forward models, in which neither the classical approach of minimizing a potentially regularized misfit function nor MCMC is a viable option. We test our methodology on two surrogate-free examples and show that it dramatically outperforms MCMC methods.
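
As a rough illustration of recasting posterior inference as optimization, the sketch below fits a single Gaussian (a mixture with L = 1 component) to the posterior of a toy linear-Gaussian calibration problem by gradient ascent on the evidence lower bound, which is equivalent to minimizing KL(q‖p). Everything in it is an assumption made for illustration: the forward model, the data, the noise level, the three-point Gauss-Hermite quadrature, the plain gradient ascent, and all function names. The paper's own approximation of the objective and its optimizer are not described in this excerpt, and the KL direction is the usual variational choice rather than something stated above.

```python
import math

# Toy setup (all values are illustrative assumptions, not from the paper).
A = 1.0                 # linear forward model f(x) = A * x
SIGMA = 0.5             # known noise standard deviation
DATA = [0.9, 1.1, 1.3]  # synthetic observations

# 3-point Gauss-Hermite rule for expectations under N(0, 1);
# exact for polynomials up to degree 5.
NODES = [-math.sqrt(3.0), 0.0, math.sqrt(3.0)]
WEIGHTS = [1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0]


def grad_log_joint(x):
    """d/dx of the log prior N(0, 1) plus the Gaussian log likelihood."""
    misfit = sum(y - A * x for y in DATA)
    return -x + A * misfit / SIGMA**2


def fit_gaussian(steps=2000, lr=0.02):
    """Gradient ascent on the ELBO over q = N(mu, exp(rho)^2)."""
    mu, rho = 0.0, 0.0
    for _ in range(steps):
        s = math.exp(rho)
        g_mu, g_rho = 0.0, 1.0  # the 1.0 is d(entropy of q)/d(rho)
        for z, w in zip(NODES, WEIGHTS):
            g = grad_log_joint(mu + s * z)  # reparameterized point x = mu + s*z
            g_mu += w * g
            g_rho += w * g * s * z
        mu += lr * g_mu
        rho += lr * g_rho
    return mu, math.exp(rho)


def exact_posterior():
    """Closed-form posterior mean and std of the linear-Gaussian toy problem."""
    precision = 1.0 + len(DATA) * A**2 / SIGMA**2
    mean = (A * sum(DATA) / SIGMA**2) / precision
    return mean, 1.0 / math.sqrt(precision)
```

Because the toy posterior is itself Gaussian, `fit_gaussian()` should recover the mean and standard deviation returned by `exact_posterior()`, which gives a simple correctness check. For a nonlinear forward model the fitted Gaussian would only approximate the posterior, and each gradient evaluation would cost one forward (and derivative) solve, which is where the savings over MCMC would have to come from.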

Copyright © 2016 by ASME




Fig. 1

Reaction kinetic model: Simplified reaction scheme for the reduction of nitrate

Fig. 2

Reaction kinetic model: Comparison of the variational posterior (VAR (L = 1)) of the scaled kinetic rate constants κi as well as of the likelihood noise σ to the MCMC (MALA) histograms of the same quantities. The prior probability density of each quantity is shown as a dashed line.

Fig. 3

Reaction kinetic model: The “•,” “∇,” “,” “◇,” and “⬠” signs indicate the scaled experimental measurements for NO3−, NO2−, N2, N2O, and NH3, respectively. The lines and the shaded areas around them correspond to the median and 95% credible intervals of the scaled concentration, ui, as a function of the scaled time τ. The left plot shows the results obtained by approximating the posterior of the parameters via the variational approach. The right plot shows the results obtained via MCMC (MALA).

Fig. 4

Concentration map of the contaminant at the four time instances when the measurements are taken: (a) t = 0.075, (b) t = 0.15, (c) t = 0.225, and (d) t = 0.3

Fig. 5

Contaminant source identification, first case: Comparison of the variational posterior (VAR (L = 1)) of the source location ξ=(ξ1,ξ2) as well as of the likelihood noise σ to the MCMC (MALA) histograms of the same quantities. The true value of each quantity is marked by a vertical, dashed line.

Fig. 6

Contaminant source identification, second case: Comparison of the variational posterior (VAR (L = 1)) of the source location ξ=(ξ1,ξ2) as well as of the likelihood noise σ to the MCMC (MALA) histograms of the same quantities. The true value of each quantity is marked by a vertical, dashed line.


