Clément Dumas

I'm a CS master's student at ENS Paris-Saclay. I want to pursue a career in AI safety research.

Comments

Let's assume the prompt template is  Q [true/false] [banana/shred]

If I understand correctly, they don't claim   learned has_banana but  learned has_banana. Moreover evaluating  for  gives:

Therefore, we can learn a  that is a banana classifier