Understanding “Deep Double Descent”
Preetum Nakkiran · 6y

Hi, I'm one of the authors:

Regarding gradient descent vs. pseudoinverse:

As we mention in the note at the top of the Colab, for the linear regression objective, running gradient descent to convergence (from 0 initialization) is *equivalent* to applying the pseudoinverse -- they both find the minimum-norm interpolating solution.

So, we use the pseudoinverse just for computational efficiency; you would get equivalent results by actually running GD.
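
Here's a minimal numerical sketch of the equivalence (a toy example of my own, not the paper's Colab): on an underdetermined least-squares problem, GD from zero initialization converges to the same minimum-norm interpolating solution that the Moore-Penrose pseudoinverse computes directly.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 100                     # fewer samples than parameters: many interpolating solutions
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

# Pseudoinverse (minimum-norm) solution.
w_pinv = np.linalg.pinv(X) @ y

# Gradient descent on 0.5 * ||Xw - y||^2, starting from w = 0.
w = np.zeros(d)
lr = 1.0 / np.linalg.norm(X, 2) ** 2   # step size 1/L, L = largest eigenvalue of X^T X
for _ in range(10_000):
    w -= lr * X.T @ (X @ w - y)

print(np.allclose(w, w_pinv, atol=1e-6))   # True: GD found the min-norm solution
```

The key fact is that GD iterates started at 0 stay in the row space of X, and the unique interpolating solution in the row space is the min-norm one.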

You're also right that the choice of basis matters: it changes the "implicit bias" of GD in linear regression, analogous to how the choice of architecture changes the implicit bias of GD on neural nets.
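
Here's a toy illustration of that point (again my own construction, not from the paper): rescaling the feature basis changes which interpolating solution has minimum norm, so min-norm fitting (equivalently, GD from zero) returns a different function in each basis even though both fit the training data exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 30
x_train = rng.uniform(-1, 1, n)
y_train = np.sin(np.pi * x_train)

def features(x, scale):
    # Sinusoidal basis; scale < 1 shrinks the high-frequency features,
    # so the min-norm solution leans on low frequencies instead.
    k = np.arange(1, d + 1)
    return np.sin(np.outer(x, k)) * scale ** k

x_test = np.linspace(-1, 1, 200)
for scale in (1.0, 0.5):
    Phi = features(x_train, scale)
    w = np.linalg.pinv(Phi) @ y_train          # min-norm interpolating weights
    pred = features(x_test, scale) @ w
    train_err = np.max(np.abs(Phi @ w - y_train))
    print(f"scale={scale}: train error {train_err:.1e}, pred std {np.std(pred):.3f}")
```

Both fits interpolate the training points exactly, but they are different functions off the training set: the rescaled basis prefers smoother (lower-frequency) interpolants, in the same way a different architecture prefers different functions.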

-- Preetum
