Bayesian Statistics
Hierarchical models and uncertainty quantification for algorithm benchmarking
The problem
Computational chemistry benchmarks typically report point estimates across a handful of test cases. These numbers hide the variability between problems and give no indication of whether observed differences between algorithms are statistically meaningful.
Hierarchical Bayesian approach
We use brms (Bayesian Regression Models using Stan) to fit hierarchical models in which each test case is treated as a draw from a population of problems (Goswami 2025). The model estimates both the average performance of each algorithm and the between-problem variance, producing full posterior distributions over algorithm rankings.
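The actual fitting is done with brms and Stan, but the hierarchical structure itself is simple enough to sketch directly. Below is a minimal, self-contained Python Gibbs sampler for the normal-normal hierarchy (per-problem effect theta_j drawn from a population with mean mu and between-problem sd tau); the data, noise level, and prior choices are illustrative assumptions, not the paper's actual model or the brms API.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: one performance difference per problem (e.g. a log
# step-count ratio between two optimizers), observed with known noise sigma.
J = 50                                    # number of test problems
true_mu, true_tau = 0.3, 0.5              # population mean / between-problem sd
theta_true = rng.normal(true_mu, true_tau, J)
sigma = 0.2                               # within-problem measurement sd
y = rng.normal(theta_true, sigma)         # observed per-problem estimates

# Gibbs sampler for:  y_j ~ N(theta_j, sigma^2),  theta_j ~ N(mu, tau^2)
# with a flat prior on mu and a weak inverse-gamma prior on tau^2.
n_iter, burn = 4000, 1000
a0, b0 = 1.0, 1.0                         # IG(a0, b0) prior on tau^2
mu, tau2 = 0.0, 1.0
draws_mu = []
for it in range(n_iter):
    # theta_j | rest: precision-weighted combination of data and population mean
    prec = 1.0 / sigma**2 + 1.0 / tau2
    m = (y / sigma**2 + mu / tau2) / prec
    theta = rng.normal(m, np.sqrt(1.0 / prec))
    # mu | rest: normal around the mean of the theta_j
    mu = rng.normal(theta.mean(), np.sqrt(tau2 / J))
    # tau^2 | rest: inverse-gamma (sampled as 1 / Gamma)
    ss = np.sum((theta - mu) ** 2)
    tau2 = 1.0 / rng.gamma(a0 + J / 2.0, 1.0 / (b0 + 0.5 * ss))
    if it >= burn:
        draws_mu.append(mu)

draws_mu = np.array(draws_mu)
print(f"posterior mean of mu: {draws_mu.mean():.2f}")
print(f"P(mu > 0): {(draws_mu > 0).mean():.2f}")
```

The point of the hierarchy is visible in the theta update: each problem's estimate is shrunk toward the population mean in proportion to how noisy it is, and the posterior for mu carries uncertainty from both levels.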
Applied to dimer method rotation optimizers (conjugate gradient vs. L-BFGS) across 500 molecular systems, the model separates performance differences that are real from those attributable to noise in problem selection.
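A probabilistic ranking can be read directly off posterior draws: the probability that one optimizer outperforms the other is simply the fraction of joint posterior draws in which it wins. The snippet below illustrates this with fabricated stand-in draws (the numbers and optimizer means are invented); in practice the arrays would come from the fitted hierarchical model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in posterior draws of each optimizer's population-mean cost
# (e.g. mean log gradient calls per converged search). These values are
# fabricated for illustration, not results from the actual benchmark.
draws = {
    "CG":     rng.normal(3.10, 0.05, 8000),
    "L-BFGS": rng.normal(2.95, 0.06, 8000),
}

# Probability that L-BFGS has the lower population-mean cost than CG.
p_lbfgs_better = np.mean(draws["L-BFGS"] < draws["CG"])
print(f"P(L-BFGS better than CG) = {p_lbfgs_better:.3f}")
```

Because this is a full posterior comparison rather than a point-estimate difference, a claim like "L-BFGS is faster" comes with an explicit probability instead of an unqualified ranking.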
Transferability
The methodology applies beyond saddle point searches. Any computational benchmark in which algorithms are compared across a set of test problems can reuse the same hierarchical structure to attach honest uncertainty to performance claims.