(opinions are my own)
I think this is a good review. Some points that resonated with me:
1. "The concepts of systemic safety, monitoring, robustness, and alignment seem rather fuzzy." I don't think the difference between objective and capabilities robustness is discussed but this distinction seems important. Also, I agree that Truthful AI could easily go into monitoring.
2. "Lack of concrete threat models." At the beginning of the course, there are a few broad arguments for why AI might be dangerous but not a lot of concrete failure modes. Adding more failure...
This is because longer runs will be outcompeted by runs that start later and therefore use better hardware and better algorithms.
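For intuition, here's a toy version of that argument (my own sketch, not from the post; it assumes effective compute per unit time grows exponentially at rate $g$ and that a run is stuck with the efficiency available at its launch date). Compare two runs that finish at the same date $T$, one launched at $s$ and one launched $\Delta$ later:

$$e^{g(s+\Delta)}(T-s-\Delta) > e^{gs}(T-s) \iff e^{g\Delta}(L-\Delta) > L, \qquad L \equiv T-s.$$

Taking logs and expanding for small $\Delta$ gives $g\Delta > \Delta/L$, i.e. the later run ends up with more effective compute whenever $L > 1/g$. So if effective compute per unit time doubles yearly ($g = \ln 2$), any run planned to last longer than about $1/g \approx 1.4$ years would be outcompeted by a slightly later start.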
Wouldn't companies port their partially-trained models to new hardware? I guess the assumption here is that when more compute is available, actors will want to train larger models. I don't think this is obviously true because:
1. Data may be the bigger bottleneck. There was some discussion of this here. Past a certain point, making models larger helps much less than training them on more data.
2. If training runs ...
Claim 1: there is an AI system that (1) performs well ... (2) generalizes far outside of its training distribution.
Don't humans provide an existence proof of this? The point about there being a 'core' of general intelligence seems unnecessary.
Safety and value alignment are generally toxic words, currently. Safety is becoming more normalized due to its associations with uncertainty, adversarial robustness, and reliability, which are thought respectable. Discussions of superintelligence are often derided as “not serious,” “not grounded,” or “science fiction.”
Here's a relevant question in the 2016 survey of AI researchers:
These numbers seem to conflict with what you said, but maybe I'm misinterpreting you. If there is a conflict here, do you think that if this survey were done again, the...
From what I understand, Dan plans to add more object-level arguments soon.