Andy_McKenzie

Personal website: https://andrewtmckenzie.com/


Thanks! I agree that all sorts of AI alignment essays are interesting and seemingly useful. My question was more about how to measure the net rate of AI safety research progress, but your expert inside view of how insights are accumulating is a reasonable metric. I also agree that growing acceptance of TAI x-risk as a real thing in the ML community is useful, and that this situation seems to be generally improving, though I am slightly worried about the risk of overshooting, as Scott Alexander describes.

Regarding (2), my question is why algorithmic progress leading to serious growth in AI capabilities would be so discontinuous. I agree that RL is much better in humans than in machines, but I doubt that replicating this in machines would require just one or a few algorithmic advances. Instead, my guess, based on previous technology growth stories I've read about, is that AI algorithmic progress is likely to come from the accumulation of many small improvements over time.

Good essay! Two questions if you have a moment: 

1. Can you flesh out your view of how the community is making "slow but steady progress right now on getting ready"? In my view, much of the AI safety community seems to be doing things with unclear safety value, like (a) coordinating a pause in model training, which seems likely to make things less safe if implemented (by creating algorithmic and hardware overhangs), or (b) converting to capabilities work (quite common; it seems like an occupational hazard for someone with initially "pure" AI safety values). Of course, I don't mean to be disparaging, as plenty of AI safety work does seem useful qua safety to me, like making more precise estimates of takeoff speeds or doing cybersecurity work. I was just surprised by that statement and am curious how you are tracking progress here.

2. It seems like you think there are some key algorithmic insights that, once "unlocked", will lead to dramatically faster AI development. This suggests that not many people are working on algorithmic insights. But that doesn't seem quite right to me -- isn't that a huge group of researchers, many of whom have historically been anti-scaling? Or maybe you think there are core insights available, but the field hasn't yet had (enough of) its Einsteins or von Neumanns? Basically, I'm trying to get a sense of why you seem to have very fast takeoff speed estimates conditional on certain algorithmic progress. But maybe I'm not understanding your worldview, and/or maybe it's too infohazardous to discuss.