Daniel Dewey

Thanks for the post, I found it helpful! The "competent catastrophes" direction sounds particularly interesting.

This is extremely cool -- thank you, Peter and Owen! I haven't read most of it yet, let alone the papers, but I have high hopes that this will be a useful resource for me.

Thanks for the post! FWIW, I found this quote particularly useful:

Well, on my reading of history, that means that all sorts of crazy things will be happening, analogous to the colonialist conquests and their accompanying reshaping of the world economy, before GWP growth noticeably accelerates!

The fact that it showed up right before an eye-catching image probably helped :)

This may be out-of-scope for the writeup, but I would love to get more detail on how this might be an important problem for IDA.

Thanks for the writeup! This Google Doc (linked near "raised this general problem" above) appears to be private: https://docs.google.com/document/u/1/d/1vJhrol4t4OwDLK8R8jLjZb8pbUg85ELWlgjBqcoS6gs/edit

This seems like a useful lens -- thanks for taking the time to post it!

Thanks for writing this -- I think it's a helpful kind of reflection for people to do!

Ah, gotcha. I'll think about those points -- I don't have a good response. (Actually adding "think about"+(link to this discussion) to my todo list.)

It seems to me that in order to be able to make rigorous arguments about systems that are potentially subject to value drift, we have to understand metaphilosophy at a deep level.

Do you have a current best guess at an architecture that will be most amenable to us applying metaphilosophical insights to avoid value drift?

These objections are all reasonable, and 3 is especially interesting to me -- it seems like the biggest objection to the structure of the argument I gave. Thanks.

I'm afraid that the point I was trying to make didn't come across, or that I'm not understanding how your response bears on it. Basically, I thought the post was prematurely assuming that schemes like Paul's are not amenable to any kind of argument for confidence, and we will only ever be able to say "well, I ran out of ideas for how to break it", so I wanted to sketch an argument structure to explain why I thought we might be able to make positive arguments for safety.

Do you think it's unlikely that we'll be able to make positive arguments for the safety of schemes like Paul's? If so, I'd be really interested in why -- apologies if you've already tried to explain this and I just haven't figured that out.
