AI will change the world, but won’t take it over by playing “3-dimensional chess”.
By Boaz Barak and Ben Edelman [Cross-posted on Windows on Theory blog; See also Boaz’s posts on longtermism and AGI via scaling as well as other “philosophizing” posts.] [Disclaimer: Predictions are very hard, especially about the future. In fact, this is one of the points of this essay. Hence, while...
Thank you! I think what we see right now is that as the horizon grows, we need more and more "tricks" to make end-to-end learning work, to the extent that it might not really be end to end anymore. So while supervised learning is very successful, and seems quite robust to choices of architecture, loss function, etc., in RL we need to be much more careful, and things often won't work "out of the box" in a purely end-to-end fashion.
I think the question is how performance scales with horizon. If the returns diminish rapidly while the cost of training increases rapidly (as might well be the case, due to diminishing gradient signals and much smaller availability of data), then the "sweet spot" of what is economical to train could remain at a reasonably short horizon (far shorter than the planning needed to take over the world) for a long time.
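To make the "sweet spot" intuition concrete, here is a toy numerical sketch. The functional forms below are purely illustrative assumptions (logarithmic returns in the horizon, quadratic training cost), not measurements or claims from the post; the point is only that under diminishing returns and rapidly growing cost, the economical horizon can be quite short.

```python
import math

# Illustrative assumptions only: performance grows slowly (log) with
# horizon, while training cost grows quadratically (standing in for
# weaker gradient signal and scarcer data at long horizons).

def performance(h):
    # assumed diminishing returns in the horizon h
    return math.log(1 + h)

def cost(h):
    # assumed rapidly increasing training cost
    return 0.01 * h ** 2

def net_value(h):
    # what is "economical" to train: value minus cost
    return performance(h) - cost(h)

# search horizons 1..10_000 for the economical sweet spot
best = max(range(1, 10_001), key=net_value)
print(best)
```

Under these toy curves the optimum sits at a single-digit horizon even though we searched up to 10,000: the cost term swamps the slowly growing returns. Different assumed exponents and constants move the sweet spot, but as long as cost outpaces returns, it stays far short of the search range.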