AI ALIGNMENT FORUM
Marc Carauleanu

AI Safety Researcher @AE Studio 

Currently researching self-other overlap, a neglected prior for cooperation and honesty inspired by the cognitive neuroscience of altruism, in state-of-the-art ML models.

Previously: SRF @ SERI '21, MLSS & Student Researcher @ CAIS '22, and LTFF grantee.

LinkedIn 

Comments

Should we rely on the speed prior for safety?
Marc Carauleanu · 4y

Any n-bit hash function will produce collisions once the number of elements exceeds 2^n, the number of distinct hashes representable in n bits. Adding new elements beyond that point requires rehashing with a wider hash to avoid collisions, which gives the GLUT logarithmic time complexity in the limit. Meta-learning can likewise achieve constant time complexity for an arbitrarily large number of tasks, but not in the limit, assuming a finite neural network.
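The pigeonhole argument above can be sketched in a few lines. This is a minimal illustration, not anything from the post: it uses a truncated SHA-256 as a stand-in for an arbitrary n-bit hash, and the key names (`key-0`, `key-1`, ...) are hypothetical inputs; any family of distinct keys would do.

```python
import hashlib

def n_bit_hash(key: str, n: int) -> int:
    """Truncate SHA-256 to n bits, standing in for any n-bit hash."""
    digest = int.from_bytes(hashlib.sha256(key.encode()).digest(), "big")
    return digest % (2 ** n)

def first_collision(n: int):
    """Insert distinct keys until two share an n-bit hash.

    By pigeonhole, a collision is guaranteed within 2**n + 1 inserts,
    so the loop always terminates.
    """
    seen = {}  # hash value -> key that produced it
    i = 0
    while True:
        key = f"key-{i}"
        h = n_bit_hash(key, n)
        if h in seen:
            return seen[h], key, i + 1  # colliding pair, inserts used
        seen[h] = key
        i += 1

a, b, inserted = first_collision(8)
# With an 8-bit hash, the colliding pair must appear within 257 inserts.
```

Once the table is full, avoiding collisions means re-hashing every stored element into a wider hash, and the required hash width grows with log of the element count, which is where the logarithmic factor comes from.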

Posts

Mistral Large 2 (123B) seems to exhibit alignment faking (29 points, 7mo, 0 comments)
Reducing LLM deception at scale with self-other overlap fine-tuning (32 points, 7mo, 9 comments)
Self-Other Overlap: A Neglected Approach to AI Alignment (59 points, 1y, 7 comments)
Should we rely on the speed prior for safety? (10 points, 4y, 4 comments)