AI ALIGNMENT FORUM

Qria
Ω-1000

Programmer. Likes math.

That's about it.

Posts

Wikitag Contributions

Comments

0Qria's Shortform
4y
0
Information Loss --> Basin flatness
Qria3y-10

Does this framework also explain the grokking phenomenon?

I haven't fully understood your hypothesis yet, beyond the point that the behaviour gradient is useful for measuring something related to inductive bias. Still, the paper above seems to touch on a similar topic (generalization) with similar methods (experiments on fully known toy examples such as SO5).
