Transformer Debugger
Transformer Debugger (TDB) is a tool developed by OpenAI's Superalignment team with the goal of supporting investigations into circuits underlying specific behaviors of small language models. The tool combines automated interpretability techniques with sparse autoencoders. TDB enables rapid exploration before needing to write code, with the ability to intervene in...
Mar 12, 202426