Optimally Combining Probe Monitors and Black Box Monitors
Link to our arXiv paper "Combining Cost-Constrained Runtime Monitors for AI Safety" here: https://arxiv.org/abs/2507.15886. Code can be found here.

Executive Summary

* Monitoring AIs at runtime can help us detect and stop harmful actions. For cost reasons, we often want to use cheap monitors like probes to monitor all of...