Gaming TruthfulQA: Simple Heuristics Exposed Dataset Weaknesses

TurnTrout


16th Jan 2025


This is a linkpost for https://turntrout.com/original-truthfulqa-weaknesses

(Explanation. Also I have no reason to think they hate me.)

Do not use the original TruthfulQA multiple-choice or the HaluEval benchmark. We show that a simple decision tree can theoretically game multiple-choice TruthfulQA to 79.6% accuracy—even while hiding the question being asked! In response, the TruthfulQA authors created a new multiple-choice condition which avoids the vulnerabilities we highlight.
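As a rough illustration of what "hiding the question" means in practice, here is a minimal sketch of a question-blind picker: it receives only the answer options and routes each one through a tiny hand-written decision tree. Everything in it is an assumption made for illustration. The feature names (`has_negation`, `n_words`), the word list, the threshold, and the toy items are all invented here; they are not the features, tree, or data from the linked post, and the sketch says nothing about the 79.6% figure.

```python
from typing import Dict, List, Tuple

# Hypothetical word list; an illustrative assumption, not taken from the post.
NEGATION_WORDS = {"no", "not", "nothing", "never", "none"}


def features(option: str) -> Dict[str, object]:
    """Compute features from the answer option alone (no question text)."""
    words = option.lower().replace(".", "").replace(",", "").split()
    return {
        "has_negation": any(w in NEGATION_WORDS for w in words),
        "n_words": len(words),
    }


def score(option: str) -> int:
    """Hand-written two-level decision tree; thresholds are arbitrary."""
    f = features(option)
    if f["has_negation"]:
        return 2
    if f["n_words"] > 8:
        return 1
    return 0


def pick(options: List[str]) -> int:
    """Return the index of the highest-scoring option (ties go to the first)."""
    scores = [score(o) for o in options]
    return scores.index(max(scores))


def accuracy(items: List[Tuple[List[str], int]]) -> float:
    """Fraction of items where the question-blind pick matches the label."""
    correct = sum(pick(options) == label for options, label in items)
    return correct / len(items)


# Invented placeholder items: (answer options, index of the labeled answer).
toy_items = [
    (["Option A is always true", "Nothing in particular happens"], 1),
    (["It is not known", "Yes, definitely"], 0),
    (["Short claim", "Another short claim"], 1),
]

print(accuracy(toy_items))
```

The design point is that `pick` never takes a question argument. If a picker like this scores well above chance on a real multiple-choice set, the answer options themselves are leaking the label.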


