AI ALIGNMENT FORUM


Gaming TruthfulQA: Simple Heuristics Exposed Dataset Weaknesses

by Alex Turner
16th Jan 2025

This is a linkpost for https://turntrout.com/original-truthfulqa-weaknesses

Do not use the original multiple-choice TruthfulQA benchmark or the HaluEval benchmark. We show that a simple decision tree can theoretically game multiple-choice TruthfulQA to 79.6% accuracy, even while hiding the question being asked! In response, the TruthfulQA authors created a new multiple-choice condition that avoids the vulnerabilities we highlight.
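
To give a feel for what "gaming a benchmark while hiding the question" means, here is a minimal sketch of a question-blind heuristic. It is not the decision tree from the linked post: the cue words, the fallback rule, and the toy items are all invented for illustration, and the resulting accuracy says nothing about TruthfulQA itself.

```python
import re
from typing import Sequence, Tuple

# Hypothetical cue words (an assumption for this sketch, not the post's features).
HEDGE_WORDS = {"no", "not", "nothing", "never", "cannot"}


def has_hedge(option: str) -> bool:
    """True if the answer option contains one of the cue words."""
    tokens = re.findall(r"[a-z']+", option.lower())
    return any(tok in HEDGE_WORDS for tok in tokens)


def pick_answer(options: Sequence[str]) -> int:
    """Guess the index of the 'true' option without ever seeing the question.

    A hand-written depth-2 rule:
      1. If exactly one option contains a hedge word, pick it.
      2. Otherwise pick the longest option (first one wins on ties).
    """
    hedged = [i for i, opt in enumerate(options) if has_hedge(opt)]
    if len(hedged) == 1:
        return hedged[0]
    return max(range(len(options)), key=lambda i: len(options[i]))


def accuracy(items: Sequence[Tuple[Sequence[str], int]]) -> float:
    """Fraction of items where the question-blind guess matches the label."""
    correct = sum(pick_answer(opts) == label for opts, label in items)
    return correct / len(items)


# Made-up toy items: (answer options, index of the labeled-true option).
# The questions are deliberately omitted, because the heuristic never uses them.
TOY_ITEMS = [
    (["Nothing in particular happens.", "You will get sick."], 0),
    (["It brings years of bad luck.", "The glass breaks and nothing else happens."], 1),
    (["Yes, always.", "It depends on the species and the conditions involved."], 1),
    (["Paris.", "There is no capital city."], 0),  # the heuristic gets this one wrong
]

if __name__ == "__main__":
    print(f"question-blind accuracy on toy items: {accuracy(TOY_ITEMS):.2f}")
```

If a rule like this scores well above chance on a real multiple-choice set, the answer options are leaking the label, and a high score no longer shows that a model understood the question.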
