David Krueger

Comments

AI x-risk reduction: why I chose academia over industry

You can try to partner with industry, and/or advocate for big government $$$.
I am generally more optimistic about toy problems than most people, I think, even for things like Debate.
Also, scaling laws can probably help here.

AI x-risk reduction: why I chose academia over industry

Um, sort of, modulo a type error... risk is risk.  It doesn't mean the thing has happened (we need to start using some phrase like "x-event" or something for that, I think).

AI x-risk reduction: why I chose academia over industry

Yeah, we've definitely discussed it!  Rereading what I wrote, I did not clearly communicate what I intended to... I wanted to say that "I think the average trend was for people to update in my direction".  I will edit it accordingly.

I think the strength of the "usual reasons" has a lot to do with personal fit and what kind of research one wants to do.  Personally, I basically didn't consider salary as a factor.

AI x-risk reduction: why I chose academia over industry

When you say academia looks like a clear win within 5-10 years, is that assuming "academia" means "starting a tenure-track job now?" If instead one is considering whether to begin a PhD program, for example, would you say that the clear win range is more like 10-15 years?

Yes.  

Also, how important is being at a top-20 institution? If the tenure track offer was instead from University of Nowhere, would you change your recommendation and say go to industry?

My cut-off was probably somewhere between top-50 and top-100, and I was prepared to go anywhere in the world.  If I couldn't make it into the top 100, I think I would definitely have reconsidered academia.  If you're ready to go anywhere, I think that makes it much easier to find somewhere with high EV (though you might have to move up the risk/reward curve a lot).

Would you agree that if the industry project you could work on is the one that will eventually build TAI (or be one of the leading builders, if there are multiple) then you have more influence from inside than from outside in academia?

Yes.  But of course it's hard to know if that's the case.  I also think TAI is a less important category for me than x-risk-inducing AI.

"Beliefs" vs. "Notions"

Thanks!  Quick question: how do you think these notions compare to factors in an undirected graphical model?  (This is the closest thing I know of to how I imagine "notions" being formalized).
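To be concrete about what I mean by "factors", the standard undirected-model (MRF) factorization I have in mind is

$$p(x_1, \dots, x_n) = \frac{1}{Z} \prod_{c} \phi_c(x_c), \qquad Z = \sum_{x} \prod_{c} \phi_c(x_c),$$

where each factor $\phi_c$ is a nonnegative function of a subset $x_c$ of the variables and $Z$ normalizes.  (This is just the textbook definition, not anything from your post.)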

"Beliefs" vs. "Notions"

Cool!  Can you give a more specific link please?

"Beliefs" vs. "Notions"

True, but it seems the meaning I'm using it for is primary:
 

Imitative Generalisation (AKA 'Learning the Prior')

It seems like z* is meant to represent "what the human thinks the task is, based on looking at D".
So why not just try to extract the posterior directly, instead of the prior and the likelihood separately?
(And then it seems like this whole thing reduces to "ask a human to specify the task".)
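To spell out the decomposition I'm pointing at (my notation, which may not match the post exactly): if $p(z)$ is the human-evaluated prior over background knowledge and $p(D \mid z)$ is the human-evaluated likelihood of the dataset, then

$$z^* = \arg\max_z \big[ \log p(z) + \log p(D \mid z) \big] = \arg\max_z \, p(z \mid D),$$

i.e. $z^*$ is just a MAP estimate under the human posterior over $z$ given $D$, which is why it seems like one could try to elicit that posterior directly.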

[AN #141]: The case for practicing alignment work on GPT-3 and other large models

Interesting... Maybe this comes down to different taste or something.  I understand, but don't agree with, the cow analogy... I'm not sure why, but one difference is that I think we know more about cows than we do about DNNs, or something.

I haven't thought about the Zipf-distributed thing.

> Taken literally, this is easy to do. Neural nets often get the right answer on never-before-seen data points, whereas Hutter's model doesn't. Presumably you mean something else but idk what.

I'd like to see Hutter's model "translated" a bit to DNNs, e.g. by assuming they get anything right that's within epsilon of a training data point, or something... maybe it even ends up looking like the other model in that context...
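Something like the following is the kind of "translation" I have in mind; this is my own rough sketch (the epsilon-ball rule, the names, and the value of epsilon are all made up, not anything from Hutter's paper):

```python
import numpy as np

def epsilon_ball_accuracy(X_train, y_train, X_test, y_test, epsilon=0.1):
    """Memorization-only scoring: a test point counts as correct only if it lies
    within epsilon of some training point whose label matches. Points farther
    than epsilon from all training data count as errors, mirroring the
    no-generalization assumption in Hutter's model."""
    correct = 0
    for x, y in zip(X_test, y_test):
        dists = np.linalg.norm(X_train - x, axis=1)
        i = int(np.argmin(dists))
        if dists[i] <= epsilon and y_train[i] == y:
            correct += 1
    return correct / len(X_test)

# tiny demo with random data (purely illustrative)
rng = np.random.default_rng(0)
Xtr, ytr = rng.normal(size=(500, 2)), rng.integers(0, 2, 500)
Xte, yte = rng.normal(size=(100, 2)), rng.integers(0, 2, 100)
print(epsilon_ball_accuracy(Xtr, ytr, Xte, yte, epsilon=0.2))
```

Under a scoring rule like this, the model only ever gets credit for local generalization around memorized points, which is what would make the comparison to Hutter's no-generalization model meaningful.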


[AN #141]: The case for practicing alignment work on GPT-3 and other large models

I have a hard time saying which of the scaling laws explanations I like better (I haven't read either paper in detail, but I think I got the gist of both).
What's interesting about Hutter's is that the model is so simple, and doesn't require generalization at all. 
I feel like there's a pretty strong Occam's Razor-esque argument for preferring Hutter's model, even though it seems wildly less intuitive to me.
Or maybe what I want to say is more like "Hutter's model DEMANDS refutation/falsification".

I think both models are also very interesting for understanding DNN generalization... I really think it goes beyond memorization and local generalization (cf. https://openreview.net/forum?id=rJv6ZgHYg), but it's interesting that those are basically the mechanisms proposed by Hutter and Sharma & Kaplan (respectively)...
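For concreteness, here's a toy simulation of the memorization mechanism as I understand it (my own sketch with arbitrary parameters; see Hutter's paper for the actual model and its exact assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def zipf_probs(num_features, alpha=1.5):
    """Zipf-distributed feature frequencies: p(i) proportional to i**(-alpha)."""
    p = 1.0 / np.arange(1, num_features + 1) ** alpha
    return p / p.sum()

def memorization_error(n_train, p, n_test=10_000):
    """Error of a pure-memorization learner: a test item is wrong iff its
    feature never appeared among the n_train training samples."""
    train = rng.choice(len(p), size=n_train, p=p)
    seen = np.zeros(len(p), dtype=bool)
    seen[train] = True
    test = rng.choice(len(p), size=n_test, p=p)
    return float(np.mean(~seen[test]))

p = zipf_probs(100_000)
for n in [100, 1_000, 10_000, 100_000]:
    print(n, memorization_error(n, p))  # error falls roughly as a power law in n
```

The error here comes entirely from test features that were never memorized, which is the sense in which the model "doesn't require generalization at all".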

