Can a Bayesian Oracle Prevent Harm from an Agent? (Bengio et al. 2024) — AI Alignment Forum