We absolutely do need to "race to build a Friendly AI before someone builds an unFriendly AI". Yes, we should also try to ban Unfriendly AI, but there is no contradiction between the two. Plans are allowed (and even encouraged) to involve multiple parallel efforts and disjunctive paths to success.
Disagree: the fact that there needs to be a friendly AI before an unfriendly AI doesn't mean building it should be Plan A, or that we should race to do it. It's the same mistake OpenAI made when they let their mission drift from "ensure that artificial general intelligence benefits all of humanity" to being the ones who build an AGI that benefits all of humanity.
Plan A means it would deserve more resources than any other path, such as influencing people by various means to build FAI instead of UFAI.
Also mistakes, from my point of view anyway.
The Interwebs seem to indicate that that's only if you give it a laser spot to aim at, not with just GPS.
Good catch.
Agree that grenade-sized munitions won't damage buildings. I think the conversation is drifting between FPVs and other kinds of drones, and also between various settings, so I'll just state my beliefs.
Nor would I. In WWII, bombers didn't even know where they were, but we now have GPS, which lets Excalibur guided artillery shells achieve roughly 1 m CEP. And the US, and possibly China, can use Starlink and other constellations for localization even when GPS is jammed. I would guess 20 m is easily doable with good sensing and dumb bombs, which would at least hit a building.
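As a rough illustration of what those figures mean (my own sketch, not anything from the thread): CEP is the radius around the aim point that contains half the impacts, so for a circular Gaussian error a 20 m CEP corresponds to about 17 m of per-axis dispersion.

```python
# Illustrative only: CEP (circular error probable) is the radius around the aim
# point containing half the impacts. Quick Monte Carlo with an assumed circular
# Gaussian error per axis; the 17 m figure below is a made-up example.
import numpy as np

rng = np.random.default_rng(0)

def cep(sigma_m: float, n: int = 100_000) -> float:
    """Median radial miss distance for 2D Gaussian errors with per-axis std sigma_m."""
    misses = rng.normal(0.0, sigma_m, size=(n, 2))
    return float(np.median(np.linalg.norm(misses, axis=1)))

# For a circular Gaussian, CEP ≈ 1.1774 * sigma, so ~17 m per-axis error
# gives roughly a 20 m CEP, i.e. half the drops land within 20 m of the aim point.
print(cep(17.0))  # ≈ 20 m
```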
IMO it is too soon to tell whether drone defense will hold up to counter-countermeasures.
I agree that Israel will probably be less affected than larger, poorer countries, but given that drones have probably killed over 200,000 people in Ukraine, even a small percentage of that would be a problem for Israel.
The data behind that graph is pretty low-quality because the agents we used were inconsistent and Claude 3-level models could barely solve any tasks. Epoch has better data for SWE-bench Verified, which I converted to time horizon here and found to also be doubling roughly every 4 months. Their elicitation is probably not as good for OpenAI models as for Anthropic models, but both are increasing at similar rates.
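For concreteness, here's a minimal sketch of one way to do that kind of conversion, using made-up data rather than the actual Epoch or METR numbers: fit P(success) against log task length for each model, take the length where predicted success crosses 50% as the time horizon, then fit an exponential trend to the horizons to get a doubling time.

```python
# Minimal sketch, not the actual Epoch/METR pipeline: convert per-task benchmark
# results into a 50%-success time horizon, then estimate its doubling time.
# All data below is invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

def time_horizon_50(task_minutes, solved):
    """Fit P(success) vs. log2(task length) and return the length where it crosses 50%."""
    X = np.log2(np.asarray(task_minutes, dtype=float)).reshape(-1, 1)
    clf = LogisticRegression().fit(X, np.asarray(solved))
    # The sigmoid crosses 0.5 where intercept + coef * log2(t) = 0.
    return 2.0 ** (-clf.intercept_[0] / clf.coef_[0][0])

# Hypothetical results: release date (fractional year) -> (task lengths in minutes, solved?)
models = {
    2023.25: ([2, 4, 8, 15, 30, 60, 120], [1, 1, 0, 1, 0, 0, 0]),
    2024.25: ([2, 4, 8, 15, 30, 60, 120], [1, 1, 1, 1, 0, 1, 0]),
    2025.25: ([2, 4, 8, 15, 30, 60, 120], [1, 1, 1, 1, 1, 1, 0]),
}
dates = np.array(list(models))
horizons = np.array([time_horizon_50(t, s) for t, s in models.values()])

# Exponential trend: log2(horizon) grows linearly in time; slope = doublings per year.
slope, _ = np.polyfit(dates, np.log2(horizons), 1)
print(f"time horizons (min): {np.round(horizons, 1)}, doubling time ≈ {12 / slope:.1f} months")
```

With real data you'd pool more tasks per model and worry about elicitation differences, but the shape of the calculation is the same.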
How does your work differ from "Forking Paths in Neural Text Generation" (Bigelow et al., ICLR 2025)?
I talked to the AI Futures team in person and shared roughly these thoughts:
One possible way things could go is that models behave like human drug addicts: they don't crave reward until they can manipulate it easily and directly, but as soon as they can, they lose all their other motivations and values and essentially become misaligned. In this world we might get
Agree that your research didn't make this mistake, and MIRI didn't make all the same mistakes as OpenAI. I was responding in the context of Wei Dai's OP about the early AI safety field. At that time, MIRI was absolutely being uncooperative: their research was closed, they didn't trust anyone else to build ASI, and their plan would end in a pivotal act that would probably disempower some world governments and possibly end with them taking over the world. Plus, they descended from an org whose goal was to build ASI before Eliezer realized alignment should be the focus. Critch complained as late as 2022 that if there were two copies of MIRI, they wouldn't even cooperate with each other.
It's great that we have the FLI statement now. Maybe if MIRI had put more work into governance, we could have gotten it a year or two earlier, but it took until Hendrycks got involved for the public statements to start.