Lewis Hammond

Research Director @ Cooperative AI Foundation
DPhil Candidate @ University of Oxford
Affiliate @ Future of Humanity Institute and Centre for the Governance of AI

223

Lewis Hammond — AI Alignment Forum

Secret Collusion: Will We Know When to Unplug AI?

TL;DR: We introduce the first comprehensive theoretical framework for understanding and mitigating secret collusion among advanced AI agents, along with CASE, a novel model evaluation framework. CASE assesses the cryptographic and steganographic capabilities of agents, while exploring the emergence of secret collusion in real-world-like multi-agent settings. Whereas current AI models...

Sep 16, 2024•66

Causality: A Brief Introduction

Post 2 of Towards Causal Foundations of Safe AGI, see also Post 1 Introduction. By Lewis Hammond, Tom Everitt, Jon Richens, Francis Rhys Ward, Ryan Carey, Sebastian Benthall, and James Fox, representing the Causal Incentives Working Group. Thanks also to Alexis Bellot, Toby Shevlane, and Aliya Ahmad. Causal models are...

Jun 20, 2023•49

Introduction to Towards Causal Foundations of Safe AGI

By Tom Everitt, Lewis Hammond, Rhys Ward, Ryan Carey, James Fox, Sebastian Benthall, Matt MacDermott and Shreshth Malik representing the Causal Incentives Working Group. Thanks also to Toby Shevlane, MH Tessler, Aliya Ahmad, Zac Kenton, Maria Loks-Thompson, and Alexis Bellot. Over the next few years, society, organisations, and individuals will...

Jun 12, 2023•74