AI ALIGNMENT FORUM

152
Qria
Ω-1000

Programmer. Likes math.

That's about it.

Comments

Sorted by
Newest
No wikitag contributions to display.
No posts to display.
0Qria's Shortform
4y
0
Information Loss --> Basin flatness
Qria 3y -10

Does this framework also explain the grokking phenomenon?

I haven't fully understood your hypothesis yet, beyond the idea that the behaviour gradient is useful for measuring something related to inductive bias, but the above paper seems to touch on a similar topic (generalization) with similar methods (experiments on fully known toy examples such as SO5).
