Jordan Taylor
I'm a research scientist at the UK AI Security Institute (AISI), working on white-box control, sandbagging, low-incrimination control, training-based mitigations, and model organisms.
Previously: worked on lie-detector probes and black-box monitors, and trained sandbagging model organisms to stress-test those probes and monitors.
Before this I was...