What is this, "A Series of Unfortunate Logical Events"? I laughed quite a bit, and enjoyed walking through the issues in self-knowledge that the löbstacle poses.
Curated, in part for this episode, and also as a celebration of the whole series. I've listened to 6 out of the 9, and I've learned a great deal about people's work and their motivations for it. This episode in particular was excellent because I finally learned what a finite factored set was – your example of the Cartesian plane was really helpful! Which is a credit to your communication skills.
Basically every episode has been worthwhile and valuable for me. It's been easy to sit down with a researcher and hear them explain their research, and Daniel alway...
I'm glad to hear that the podcast is useful for people :)
That’s an inspiring narrative that rings true to me, I’m sure I will think on that framing more. Thank you.
Assuming that the discounted value of a monopoly in this IP is reasonably close to Alice’s cost of training, e.g. 1x-3x, competition between Alpha and Beta only shrinks the available profits by half, and Beta expects to acquire between 10%-50% of the market,
Basic econ q here: I think that 2 competitors can often cut the profits by much more than half, because they can always undercut each other until they hit the cost of production. Especially if you're going from 1 seller to 2, I think that can shift a market from monopoly to not-a-monopoly, so I think it might be a lot less valuable.
Still, obviously likely to be worth it to the second company, so I totally expect the competition to happen.
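To make the undercutting intuition concrete, here is a minimal sketch of Bertrand-style price competition. Everything in it is an illustrative assumption rather than something from the post: the linear demand curve, the cost of 20, the step size, and the function names are all made up, and it assumes identical products with no capacity limits.

```python
# Toy Bertrand-style model: two sellers of an identical good undercut each
# other until the price reaches the cost of production.
# All numbers and names are hypothetical, chosen only for illustration.

def demand(price, intercept=100.0, slope=1.0):
    """Assumed linear demand: quantity sold at a given price."""
    return max(intercept - slope * price, 0.0)

def monopoly_price(cost, intercept=100.0, slope=1.0):
    """Profit-maximizing price for a single seller facing the linear demand."""
    return (intercept / slope + cost) / 2

def undercut_until_cost(start_price, cost, step=1.0):
    """Sellers take turns shaving `step` off the price, never going below cost."""
    price = start_price
    while price - step >= cost:
        price -= step
    return price

def total_profit(price, cost):
    """Industry-wide profit at a given price."""
    return (price - cost) * demand(price)

COST = 20.0
p_mono = monopoly_price(COST)
pi_mono = total_profit(p_mono, COST)

p_duo = undercut_until_cost(p_mono, COST)
pi_duo = total_profit(p_duo, COST)

print(f"monopoly: price={p_mono}, total profit={pi_mono}")
print(f"duopoly:  price={p_duo}, total profit={pi_duo}")
```

In this toy setup the second entrant wipes out the whole margin rather than half of it. Real markets usually land somewhere in between (differentiated products, capacity limits, tacit coordination), so treat this as the extreme case of the argument, not a forecast.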
Curated. This is a fascinating framework that (to the best of my understanding) makes substantive improvements on the Pearlian paradigm. It's also really exciting that you found a new simple sequence.
Re: the writeup, it's explained very clearly, and the interspersed Q&A is a very nice touch. I like that the talk factorizes.
I really appreciate the research exploration you do around ideas of agency and am very happy to celebrate the writeups like this when you produce them.
The original lesswrong 1.0 had the following header at the top of each page, pointing at a certain concept of map/territory resemblance:
I don't remember the image you show. I looked it up, I don't see this header on the wayback machine. I see a map atop this post in 2009 and then not too long after it becomes the grey texture that stayed until LW 2.0. Where did you get your image from?
Yeah, we can have a try and see whether it ends up being worth publishing.
Nice. I'll get a transcript made on Rev.com and share it with you for edits.
Curated. This is a pretty compelling research line and seems to me like it has the potential to help us a great deal in understanding how to interface with, understand, and align machine intelligence systems. It's also the compilation of a bunch of good writing+work from you that I'd like to celebrate, and it's something of a mission statement for the ongoing work.
I generally love all the images and like the way it adds a bunch of prior ideas together.
Curated. Solid attempt to formalize the core problem, and solid comment section from lots of people.
I recall once seeing someone say with 99.9% probability that the sun would still rise 100 million years from now, citing information about the life-cycle of stars like our sun. Someone else pointed out that this was clearly wrong, that by default the sun would be taken apart for fuel on that time scale, by us or some AI, and that this was a lesson in people's predictions about the future being highly inaccurate.
But also, "the thing that means there won't be a sun sometime soon" is one of the things I'm pointing to when talking about "general intelligence". This post reminded me of that.
(If both parties are interested in that debate I’m more than happy to organize it in whatever medium and do any work like record+transcripts or book an in-person event space.)
The stuff about ‘alien’ knowledge sounds really fascinating, and I’d be excited about write-ups. All my concrete intuitions here come from reading Distill.Pub papers.
Huh, am surprised. Guess I might’ve predicted Boston. Curious if it’s because of the culture, the environment, or what.
Most people, or most people you know.
And “should” = given their own goals.
I’m asking what you think people might be wrong about. And very slightly hoping for product recommendations :)
I want to know this question, but for the ‘peak’ alignment researcher.
If you could magically move most of the US rationality and x-risk and EA community to a city in the US that isn't the Bay, and you had to pick somewhere, where would you move them to?
If I'm allowed to think about it first then I'd do that. If I'm not, then I'd regret never having thought about it; probably Seattle would be my best guess.
And on an absolute level, is the world much more or less prepared for AGI than it was 15 years ago?
Follow-up: How much, if at all, did the broader x-risk community change it?
Why did nobody in the world run challenge trials for the covid vaccine and save us a year of economic damage?
Wild speculation, not an expert. I'd love to hear from anyone who actually knows what's going on.
I think it's overoptimistic that human challenge trials would save a year, though it does seem like they could plausibly have saved weeks or months if done in the most effective form. (And in combination with other human trials and moderate additional spending I'd definitely believe 6-12 months of acceleration was possible.)
In terms of why so few human experiments have happened in general, I think it's largely because of strong norms designed to protect ex...
Which rationalist virtue do you identify with the strongest currently? Which one would you like to get stronger at?
Paul, if you did an episode of AXRP, which two other AXRP episodes do you expect your podcast would be between, in terms of quality? For this question, collapse all aspects of quality into a scalar.
Do you have any specific plans for your life in a post-singularity world?
I expect that many humans will continue to participate in a process of collectively clarifying what we want and how to govern the universe. I wouldn't be surprised if that involves a lot of life-kind-of-like-normal that gradually improves in a cautious way we endorse rather than some kind of table-flip (e.g. I would honestly not be surprised if post-singularity we still end up raising another generation because there's no other form of "delegation" that we feel more confident about). And of course in such a world I expect to just continue to spe...
Who do you admire?
What were your main updates from the past few months?
Who is right between Eliezer and Robin in the AI FOOM debate?
I mostly found myself agreeing more with Robin, in that e.g. I believe previous technical change is mostly a good reference class, and that Eliezer's AI-specific arguments are mostly kind of weak. (I liked the image, I think from that debate, of a blacksmith emerging into the town square with his mighty industry and making all bow before him.)
That said, I think Robin's quantitative estimates/forecasts are pretty off and usually not very justified, and I think he puts too much stock on an outside view extrapolation from past transitions rather than looking at t...
What should people be spending more money on?
What important truth do very few people in your community/network agree with you on?
Unfortunately (fortunately?) I don't feel like I have access to any secret truths. Most idiosyncratic things I believe are pretty tentative, and I hang out with a lot of folks who are pretty open to the kinds of weird ideas that might have ended up feeling like Paul-specific secret truths if I hung with a more normal crowd.
It feels like my biggest disagreement with people around me is something like: to what extent is it likely to be possible to develop an algorithm that really looks on paper like it should just work for aligning powerful ML systems....
Let me ask the question Daniel Filan is too polite to ask: would you like to be interviewed on your research for an episode of the AXRP podcast?
That's not the AXRP question I'm too polite to ask.
What is the main mistake you've made in your research, that you were wrong about?
Positive framing: what's been the biggest learning moment in the course of your work?
Basically every time I've shied away from a solution because it feels like cheating, or like it doesn't count / address the real spirit of the problem, I've regretted it. Often it turns out it really doesn't count, but knowing exactly why (and working on the problem with no holds barred) has been really important for me.
The most important case was dismissing imitation learning back in 2012-2014, together with basically giving up outright on all ML approaches, which I only recognized as a problem when I was writing up why those approaches were doomed more carefully and why imitation learning was a non-solution.
What work are you most proud of?
Slightly different: what blog post are you most proud of?
I don't have an easy way of slicing my work up / think that it depends on how you slice it. Broadly I think the two candidates are (i) making RL from human feedback more practical and getting people excited about it at OpenAI, (ii) the theoretical sequence from approval-directed agents and informed oversight to iterated amplification to getting a clear picture of the limits of iterated amplification and setting out on my current research project. Some steps of that were really hard for me at the time though basically all of them now feel obvious.
My favorit...
Who's the best critic of your alignment research? What have they been right about?
What was your biggest update about the world from living through the coronavirus pandemic?
Follow-up: does it change any of your feelings about how civilization will handle AGI?
I found our COVID response pretty "par for the course" in terms of how well we handle novel challenges. That was a significant negative update for me because I had a moderate probability on us collectively pulling out some more exceptional adaptiveness/competence when an issue was imposing massive economic costs and had a bunch of people's attention on it. I now have somewhat more probability on AI dooms that play out slowly where everyone is watching and yelling loudly about it but it's just really tough to do something that really improves the situation (and correspondingly more total probability on doom). I haven't really sat down and processed this update or reflected on exactly how big it should be.
What's a direction you'd like to see the rationality community grow stronger in over the coming 5-10 years?
More true beliefs (including especially about large numbers of messy details rather than a few central claims that can receive a lot of attention).
Do you know what sorts of people you're looking to hire? How much do you expect ARC to grow over the coming years, and what will the employees be doing? I can imagine it being a fairly small group of like 3 researchers and a few understudies, I can also imagine it growing to 30 people like MIRI. Which one of these is it closer to?
I'd like to hire a few people (maybe 2 researchers median?) in 2021. I think my default "things are going pretty well" story involves doubling something like every 1-2 years for a while. Where that caps out / slows down a lot depends on how the field shapes out and how broad our activities are. I would be surprised if I wanted to stop growing at <10 people just based on the stuff I really know I want to do.
The very first hires will probably be people who want to work on the kind of theory I do, since right now that's what I'm feeling most excited about ...
What are the main ways you've become stronger and smarter over the past 5 years? This isn't a question about new object-level beliefs so much as ways-of-thinking or approaches to the world that have changed for you.
I'm not interested in the strongest argument from your perspective (i.e. the steelman), but I am interested how much you think you can pass the ITT for Eliezer's perspective on the alignment problem — what shape the problem is, why it's hard, and how to make progress. Can you give a sense of the parts of his ITT you think you've got?
I think I could do pretty well (it's plausible to me that I'm the favorite in any head-to-head match with someone who isn't a current MIRI employee? probably not but I'm at least close). There are definitely some places I still get surprised and don't expect to do that well, e.g. I was recently surprised by one of Eliezer's positions regarding the relative difficulty of some kinds of reasoning tasks for near-future language models (and I expect there are similar surprises in domains that are less close to near-term predictions). I don't really know how to split it into parts for the purpose of saying what I've got or not.
What is your top feature request for LessWrong.com?
Favorite SSC / ACX post?
Other than by doing your own research, from where or whom do you tend to get valuable research insights?
What works of fiction / literature have had the strongest impact on you? Or perhaps, that are responsible for the biggest difference in your vector relative to everyone else's vector?
(e.g. lots of people were substantially impacted by the Lord of the Rings, but perhaps something else had a big impact on you that led you in a different direction from all those people)
(that said, LotR is a fine answer)
Did you get much from reading the sequences? What was one of the things you found most interesting or valuable personally in them?
I enjoyed Leave a Line of Retreat. It's a very concrete and simple procedure that I actually still use pretty often and I've benefited a lot just from knowing about. Other than that I think I found a bunch of the posts interesting and entertaining. (Looking back now the post is a bit bombastic, I suspect all the sequences are, but I don't really mind.)
You're gonna get back to thesis writing quickly, it's a very short form.
Best of skill to you.
Curated. Felt to me like a valuable step in this conversation, and analyzed some details helpfully to me. Thanks for writing it.
Great post, I’m glad this is written up nicely.
One section was especially interesting to me:
If the credences you assign to your beliefs obey the logical induction criterion, then you will get such-and-such benefits.
In the case of logical induction, the benefits are things like coherence, convergence, timeliness, and unbiasedness. But different from probability theory, these concepts are operationalized as properties of the evolution of your credences over time, rather than as properties of your credences at any particular point in time.
I ...
This reminds me that it's hard for me to say where "I" am, in both space and time.
I read a story recently (which I'm going to butcher because I don't remember the URL), about a great scientist who pulled a joke: after he died, his wife had a seance or used a ouija board or something, which told her to look at the first sentence of the 50th page of his book, and the first sentence was "<The author> loved to find creative ways to communicate with people."
After people die, their belongings and home often contain an essence of 'them'. I think that some p...
Curated. This was perhaps the most detailed yet informative story I've read about how failure will go down. As you say at the start it's making several key assumptions, it's not your 'mainline' failure story. Thx for making the assumptions explicit, and discussing how to vary them at the end. I'd like to see more people write stories written under different assumptions.
The sorts of stories Eliezer has told in the past have focused on 10-1000x faster takeoffs than discussed here, so those stories are less extended (you kinda just wake up one day then everyo...
At this point, my plan is to try to consolidate what I think are the main confusions in the comments of this post, into one or more new concepts to form the topic of a new post.
Sounds great! I was thinking myself about setting aside some time to write a summary of this comment section (as I see it).
I've felt like the problem of counterfactuals is "mostly settled" for about a year, but I don't think I've really communicated this online.
Wow that's exciting! Very interesting that you think that.
The rules say we must use consequentialism, but good people are deontologists, and virtue ethics is what actually works.
—Eliezer Yudkowsky, Twitter