AI ALIGNMENT FORUM

Hopenope

Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety
Hopenope, 3mo

Is optimizing the CoT to look nice a big concern? There are other ways to present a nice-looking CoT without optimizing for it. The frontrunners also have some incentives not to show the real CoT. Additionally, there is a good chance that people will prefer a well-structured summary of the CoT, written by a small LLM, once the reasoning becomes very long and convoluted.

Hopenope's Shortform
10mo