AI ALIGNMENT FORUM

Honesty

Edited by Yoav Ravid and Multicore. Last updated 3rd Mar 2021.

Honesty means telling the truth and not being deceptive.

External Links:
Against Lie Inflation by Scott Alexander

Related Pages: Meta-Honesty, Deception.

Posts tagged Honesty
35 karma · Truthful LMs as a warm-up for aligned AGI · Jacob_Hilton · 4y · 10 comments
33 karma · Paper: Teaching GPT3 to express uncertainty in words · Owain_Evans · 3y · 0 comments
29 karma · [Question] How "honest" is GPT-3? · abramdemski, gwern · 5y · 4 comments
18 karma · How do new models from OpenAI, DeepMind and Anthropic perform on TruthfulQA? · Owain_Evans · 4y · 1 comment
-21 karma · Lying is Cowardice, not Strategy · Connor Leahy, Gabriel Alfour · 2y · 21 comments
93 karma · How "Discovering Latent Knowledge in Language Models Without Supervision" Fits Into a Broader Alignment Scheme · Collin · 3y · 20 comments
53 karma · Contrast Pairs Drive the Empirical Performance of Contrast Consistent Search (CCS) · Scott Emmons · 2y · 1 comment
23 karma · Truthful AI: Developing and governing AI that does not lie · Owain_Evans, owencb, Lukas Finnveden · 4y · 9 comments
18 karma · Tall Tales at Different Scales: Evaluating Scaling Trends For Deception In Language Models · Felix Hofstätter, Francis Rhys Ward, HarrietW, LAThomson, Ollie J, Patrik Bartak, Sam F. Brown · 2y · 0 comments
14 karma · Truth is Universal: Robust Detection of Lies in LLMs · Lennart Buerger · 1y · 2 comments
3 karma · Ground-Truth Label Imbalance Impairs the Performance of Contrast-Consistent Search (and Other Contrast-Pair-Based Unsupervised Methods) · Tom Angsten, Ami Hays · 2y · 0 comments