
AI ALIGNMENT FORUM


Honesty — AI Alignment Forum

Honesty

Edited by Yoav Ravid, Multicore. Last updated 3rd Mar 2021.

Honesty means telling the truth and not being deceptive.

External Links:
Against Lie Inflation by Scott Alexander

Related Pages: Meta-Honesty, Deception.


Posts tagged Honesty


35 Truthful LMs as a warm-up for aligned AGI

4y

10

1

33 Paper: Teaching GPT3 to express uncertainty in words

4y

0

1

29 How "honest" is GPT-3?

abramdemski, gwern

6y

4

1

18 How do new models from OpenAI, DeepMind and Anthropic perform on TruthfulQA?

4y

1

1

4mo

0

1

-21 Lying is Cowardice, not Strategy

Connor Leahy, Gabriel Alfour

3y

21

0

93 How "Discovering Latent Knowledge in Language Models Without Supervision" Fits Into a Broader Alignment Scheme

3y

20

1

53 Contrast Pairs Drive the Empirical Performance of Contrast Consistent Search (CCS)

3y

1

1

23 Truthful AI: Developing and governing AI that does not lie

Owain_Evans, owencb, Lukas Finnveden

5y

9

0

18 Tall Tales at Different Scales: Evaluating Scaling Trends For Deception In Language Models

Felix Hofstätter, Francis Rhys Ward, HarrietW, LAThomson, Ollie J, Patrik Bartak, Sam F. Brown

2y

0

0

14 Truth is Universal: Robust Detection of Lies in LLMs

Lennart Buerger

2y

2

0

3 Ground-Truth Label Imbalance Impairs the Performance of Contrast-Consistent Search (and Other Contrast-Pair-Based Unsupervised Methods)

Tom Angsten, Ami Hays

3y

0
