AI ALIGNMENT FORUM

Colin McGlynn
Ω1000

Comments

How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions
Colin McGlynn 2y 10

What inspired you to try this approach? It would not have occurred to me to try this, so I am wondering where your intuition came from.
