Research Scientist at DeepMind. Creator of the Alignment Newsletter. http://rohinshah.com/
Update: I think you should apply now and mention somewhere that you'd prefer to be interviewed in 3 months because in those 3 months you will be doing <whatever it is you're planning to do> and it will help with interviewing.
I don't have a strong opinion on whether it is good to support remote work. I agree we lose out on a lot of potential talent, but we also gain productivity benefits from in-person collaboration.
However, this is a DeepMind-wide policy and I'm definitely not sold enough on the importance of supporting remote work to try and push for an exception here.
Looking into it, I'll try to get you a better answer soon. My current best guess is that you should apply 3 months from now. This runs an increased risk that we'll have filled all our positions / closed our applications, but it also improves your chances of making it through because you'll know more things and be better prepared for the interviews.
(Among other things I'm looking into: would it be reasonable to apply now and mention that you'd prefer to be interviewed in 3 months?)
Almost certainly, e.g. this one meets those criteria and I'm pretty sure costs < 1/3 of total comp (before taxes), though I don't actually know what typical total comp is. You would find significantly cheaper places if you were willing to compromise on commute, since DeepMind is right in the center of London.
Unfortunately not, though as Frederik points out below, if your concern is about getting a visa, that's relatively easy to do. DeepMind will provide assistance with the process. I went through it myself and it was relatively painless; it probably took 5-10 hours of my time total (including e.g. travel to and from the appointment where they collected biometric data).
Should be fixed now!
That's what future research is for!
I agree the lack of off-switchability is bad for safety margins (that was part of the intuition driving my last point).
I think it's more concerning in cases where you're getting all of your info from goal-oriented behaviour and solving the inverse planning problem.
I agree Boltzmann rationality (over the action space of, say, "muscle movements") is going to be pretty bad, but any realistic version of this is going to include a bunch of sources of info including "things that humans say", and the human can just tell you that hyperslavery is really bad. Obviously you can't trust everything that humans say, but it seems plausible that if we spent a bunch of time figuring out a good observation model, that would then lead to okay outcomes.
(Ideally you'd figure out how you were getting AGI capabilities, and then leverage those capabilities towards the task of "getting a good observation model" while you still have the ability to turn off the model. It's hard to say exactly what that would look like since I don't have a great sense of how you get AGI capabilities under the non-ML story.)
I recently had occasion to write up quick thoughts about the role of assistance games (CIRL) in AI alignment, and how it relates to the problem of fully updated deference. I thought I'd crosspost here as a reference.
(I made some of these points before in my summary of Human Compatible.)
Specifically, if for example you vary between two loss functions in some training environment, L1 and L2, that variation is called “modular” if somewhere in design space, that is, the space formed by all possible combinations of parameter values your network can take, you can find a network N1 that “does well” on L1, and a network N2 that “does well” on L2, and these networks have the same values for all their parameters, except for those in a single submodule.
It's often the case that you can implement the desired function with, say, 10% of the parameters that you actually have. So every pair of L1 and L2 would be called "modular", by changing the 10% of parameters that actually do anything, and leaving the other 90% the same. Possible fixes:
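(An aside to make the objection concrete, not one of the fixes: here is a minimal sketch, assuming a toy one-hidden-layer ReLU network in which only one hidden unit does any work. The network shape, the two tasks, and all the names below are made up for illustration; none of this comes from the original post. Under the quoted definition, the pair of losses counts as "modular" simply because the two zero-loss networks differ only in the 10% of parameters that matter.)

```python
def relu(z):
    return z if z > 0.0 else 0.0

def forward(params, x):
    # params is a list of (w, b, v) triples, one per hidden unit:
    # output = sum over units of v * relu(w * x + b)
    return sum(v * relu(w * x + b) for (w, b, v) in params)

def mse(params, target_fn, xs):
    # Mean squared error of the network against a target function.
    return sum((forward(params, x) - target_fn(x)) ** 2 for x in xs) / len(xs)

def make_network(active_unit, n_hidden=10):
    # Hypothetical setup: unit 0 is the only unit that does anything.
    # The other 9 units (90% of the parameters) are all zeros, so inert.
    inert = [(0.0, 0.0, 0.0)] * (n_hidden - 1)
    return [active_unit] + inert

xs = [0.0, 1.0, 2.0, 3.0]
task_1 = lambda x: 2.0 * x  # stands in for "doing well on L1"
task_2 = lambda x: 5.0 * x  # stands in for "doing well on L2"

n1 = make_network((1.0, 0.0, 2.0))  # solves task 1 exactly
n2 = make_network((1.0, 0.0, 5.0))  # solves task 2 exactly

# Indices of hidden units whose parameters differ between N1 and N2.
differing_units = [i for i, (a, b) in enumerate(zip(n1, n2)) if a != b]

print(mse(n1, task_1, xs), mse(n2, task_2, xs), differing_units)
```

Both networks get zero loss on their own task and differ only in unit 0, so the definition is satisfied even though nothing about the two tasks was chosen to be modular.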