AI ALIGNMENT FORUM

Satvik Golechha

I research intelligence and its emergence and expression in neural networks, to ensure advanced AI is safe and beneficial.

Current interests: neural network interpretability, alignment/safety, unsupervised learning, and deep learning theory. 

For more, check out my scholar profile and personal website.

Posts


Wikitag Contributions

No wikitag contributions to display.

Comments

No comments found.
82 · Auditing language models for hidden objectives · 6mo · 3