AI ALIGNMENT FORUM Tags

Reinforcement Learning

• Applied to Speedrun ruiner research idea by Luke H Miles 6d ago
• Applied to The theory of Proximal Policy Optimisation implementations by salman.mohammadi 9d ago
• Applied to Measuring Learned Optimization in Small Transformer Models by Jonathan Bostock 12d ago
• Applied to Skepticism About DeepMind's "Grandmaster-Level" Chess Without Search by Arjun Panickssery 2mo ago
• Applied to Krueger Lab AI Safety Internship 2024 by Joey Bream 3mo ago
• Applied to Interpreting the Learning of Deceit by Roger Dearnaley 4mo ago
• Applied to Refinement of Active Inference agency ontology by Roman Leventov 4mo ago
• Applied to Utility ≠ Reward by Oliver Sourbut 4mo ago
• Applied to Planning in LLMs: Insights from AlphaGo by jco 5mo ago
• Applied to Reinforcement Learning using Layered Morphology (RLLM) by Miguel de Guzman 5mo ago
• Applied to AISC project: SatisfIA – AI that satisfies without overdoing it by Jobst Heitzig 5mo ago
• Applied to We have promising alignment plans with low taxes by Seth Herd 5mo ago
• Applied to Wireheading and misalignment by composition on NetHack by pierlucadoro 6mo ago
• Applied to VLM-RM: Specifying Rewards with Natural Language by ChengCheng 6mo ago