AI ALIGNMENT FORUM

Marc Carauleanu

AI Safety Researcher @AE Studio 

Currently researching self-other overlap — a neglected prior for cooperation and honesty, inspired by the cognitive neuroscience of altruism — in state-of-the-art ML models.

Previously SRF @ SERI '21, MLSS & Student Researcher @ CAIS '22, and LTFF grantee.

LinkedIn 

Posts



Comments

Should we rely on the speed prior for safety?
Marc Carauleanu · 4y

Any n-bit hash function will produce collisions once the number of elements in the hash table exceeds the 2^n possible hash values. Adding new elements beyond that point therefore requires rehashing with a wider hash to avoid collisions, which gives the GLUT logarithmic time complexity in the limit. Meta-learning can likewise achieve constant time complexity for an arbitrarily large number of tasks, but not in the limit, assuming a finite neural network.
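The pigeonhole argument above can be sketched in a few lines of Python: a toy n-bit hash (a truncated SHA-256 digest, chosen here purely for illustration — the comment does not specify a hash function) has at most 2^n distinct outputs, so any set of more than 2^n distinct keys is guaranteed to contain a collision.

```python
import hashlib


def hash_n_bits(key: str, n: int = 8) -> int:
    """Toy n-bit hash: truncate a SHA-256 digest to n bits."""
    digest = int.from_bytes(hashlib.sha256(key.encode()).digest(), "big")
    return digest % (1 << n)


def has_collision(keys, n: int = 8) -> bool:
    """Return True if any two keys share the same n-bit hash."""
    seen = set()
    for k in keys:
        h = hash_n_bits(k, n)
        if h in seen:
            return True
        seen.add(h)
    return False


# 2**8 + 1 = 257 distinct keys cannot all map to distinct 8-bit hashes,
# so a collision is guaranteed by the pigeonhole principle.
keys = [f"key-{i}" for i in range(257)]
assert has_collision(keys, n=8)
```

Avoiding such collisions forces the hash width n to grow with the table size, which is where the logarithmic factor comes from.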

29 · Mistral Large 2 (123B) seems to exhibit alignment faking · 6mo · 0 comments
32 · Reducing LLM deception at scale with self-other overlap fine-tuning · 6mo · 9 comments
59 · Self-Other Overlap: A Neglected Approach to AI Alignment · 1y · 7 comments
10 · Should we rely on the speed prior for safety? · 4y · 4 comments