PhD student in theoretical computer science (distributed computing) in France. Currently transitioning to AI Safety and fundamental ML work.
I'd prefer if people didn't share it widely in a low-bandwidth way (e.g., just posting key graphics on Facebook or Twitter) since the conclusions don't reflect Open Phil's "institutional view" yet, and there may well be some errors in the report.
Isn't that in contradiction with posting it to LW (by crossposting)? I mean, it's freely accessible to everyone, so anyone who wants to share it can find it.
In practice I find that anything I say tends to lose its nuance as it spreads, so I've moved towards saying fewer things that require nuance. If I said "X might be a good resource to learn from but I don't really know", I would only be a little surprised to hear a complaint in the future of the form "I deeply read X for two months because Rohin recommended it, but I still can't understand this deep RL paper".
Hum, I did not think about that. It makes more sense to me now why you don't want to point people towards specific things. I still believe the result will be net positive if the right caveats are in place (then it's the other person's fault for misinterpreting your comment), but that's indeed assuming that the resource/concept is good/important and you're confident in that.
The solution is clear: someone needs to create an Evan bot that will comment on every AF post related to mesa-optimization, providing the right pointers to the paper.
Thanks for the in-depth answer!
I do share your opinion on the Sutton and Barto, which is the only book I read from your list (except a bit of the Russell and Norvig, but not the RL chapter). Notably, I took a lot of time to study the action-value methods, only to realise later that a lot of recent work focuses instead on policy-gradient methods (even if actor-critic methods do use action values).
From your answer and Rohin's, I gather that we lack a good resource on Deep RL, at least of the kind useful for AI Safety researchers. It makes me even more curious about the kind of knowledge that would be covered in such a resource.
Here's an obvious next step for people: google for resources on RL, ask others for recommendations on RL, try out some of the resources and see which one works best for you, and then choose one resource and dive deep into it, potentially repeat until you understand new RL papers by reading.
Agreed. Which is exactly why I asked you for recommendations. I don't think you're the only one someone interested in RL should ask for recommendations (I already asked other people, and knew some resources before all this), but as one of the (apparently few) members of the AF with the relevant skills in RL, it seemed that you might offer good advice on the topic.
About self-learning, I'm pretty sure people around here are good on this count. But knowing how to self-learn doesn't mean knowing what to self-learn. Hence the pointers.
I also don't buy that pointing out a problem is only effective if you have a concrete solution in mind. MIRI argues that it is a problem that we don't know how to align powerful AI systems, but doesn't seem to have any concrete solutions. Do you think this disqualifies MIRI from talking about AI risk and asking people to work on solving it?
No, I don't think you should only point to a problem with a concrete solution in hand. But solving a research problem (what MIRI's case is about) is not the same as learning a well-established field of computer science (what this discussion is about). In the latter case, you ask people to learn things that already exist, not to invent them. And I do believe that showing some concrete things that might be relevant (as I repeated in each comment, not an exhaustive list) would make the injunction more effective.
That being said, it's perfectly okay if you don't want to propose anything. I'm just confused because it seems low effort for you, net positive, and the kind of "ask people for recommendations" that you preach in the previous comment. Maybe we disagree on one of these points?
The handyman might not give basic advice, but if he didn't have any advice, I would assume that he doesn't want to help.
I'm really confused by your answers. You have a long comment criticizing the lack of basic RL knowledge of the AF community, and when I ask you for pointers, you say that you don't want to give any, and that people should just learn the background knowledge. So should every member of the AF stop what they're doing right now to spend 5 years doing a PhD in RL before being able to post here?
If the goal of your comment was to push people to learn things you think they should know, pointing towards some stuff (not an exhaustive list) is the bare minimum for that to be effective. If you don't, I can't see many people investing the time to learn enough RL so that by osmosis they can understand a point you're making.
If you don't have a resource, then do you have a list of pointers to what people should learn? For example the policy gradient theorem and the REINFORCE trick. It will probably not be exhaustive, I'm just trying to make your call to learn more RL theory more actionable to people here.
If there was a vote for the best comment thread of 2020, that would probably be it for me.
What would be a good resource to level up on RL theory? Is the Sutton and Barto good enough, or do you have something else in mind?
But what if they reach AGI during their speed-up? The smoothing at a later time assumes that we'll hit diminishing returns before AGI, which is not what we are seeing at the moment.