Richard Hollerith, 15 miles north of San Francisco. hruvulum@gmail.com



G Gordon Worley III's Shortform

Can you explain where there is an error term in AlphaGo, or where an error term might appear in a hypothetical model similar to AlphaGo trained much longer with many more parameters and much more compute?

G Gordon Worley III's Shortform

At least one person here disagrees with you on Goodharting. (I do.)

You've written before on this site, if I recall correctly, that Eliezer's 2004 CEV proposal is unworkable because of Goodharting. I am granting myself the luxury of not bothering to look up your previous statement because you can correct me if my recollection is wrong.

I believe that the CEV proposal would probably be achievable by humans if they had enough time and enough resources (money, talent, protection from meddling), and that if it is not achievable, that is for reasons other than Goodhart's law.

(Sadly, an unaligned superintelligence is much easier for humans living in 2022 to create than a CEV-aligned superintelligence is, so we are probably all going to die IMHO.)

Perhaps before discussing the CEV proposal we should discuss a simpler question, namely, whether you believe that Goodharting inevitably ruins the plans of any group setting out intentionally to create a superintelligent paperclip maximizer.
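To make sure we mean the same thing by "Goodharting," here is a toy sketch of the usual statistical core of the phenomenon (my framing, not anything from this thread): an optimizer that can only see a noisy proxy for its true objective, and that selects hard on the proxy, systematically overestimates the true value it obtains. All names and numbers here are illustrative assumptions.

```python
import random

random.seed(0)  # fixed seed so the illustration is reproducible

# Each of 10,000 candidate actions has a true value; the optimizer only
# observes a noisy proxy of that value.
true_values = [random.gauss(0, 1) for _ in range(10_000)]
proxies = [v + random.gauss(0, 1) for v in true_values]

# Optimize hard on the proxy: pick the candidate with the best proxy score.
best = max(range(len(proxies)), key=proxies.__getitem__)

# The winner was selected partly for having lucky noise, so its proxy
# score overstates its true value (regression toward the mean).
print("proxy score of chosen action:", proxies[best])
print("true value of chosen action: ", true_values[best])
```

Whether this divergence between proxy and target is merely a tax on optimization or something that inevitably ruins plans like building a paperclip maximizer is, I take it, the crux of our disagreement.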

Another simple goal we might discuss is that of a superintelligence (SI) that shoves as much matter as possible into a black hole, or an SI that "shuts itself off" within 3 months of its launch, where "shuts itself off" means it stops trying to survive or to affect reality in any way.