DanielFilan's Comments

Realism about rationality

OK, I think I understand you now.

Overall I think these relatively-imprecise theories let you build things "one level above", which I think your examples fit into. My claim is that it's very hard to use them to build things "2+ levels above".

I think that I sort of agree if 'levels above' means levels of abstraction, where one system uses an abstraction of another and requires the mesa-system to satisfy some properties. In this case, the more layers of abstraction you have, the more requirements you're demanding which can independently break, which exponentially reduces the chance that you'll have no failure.

But also, to the extent that your theory is mathematisable and comes with 'error bars', you have a shot at coming up with a theory of abstractions that is robust to failure of your base-level theory. So some transistors on my computer can fail, evidencing the imprecision of the simple theory of logic gates, but my computer can still work fine because the abstractions on top of logic gates accounted for some amount of failure of logic gates. Similarly, even if you have some uncorrelated failures of individual economic rationality, you can still potentially have a pretty good model of a market. I'd say that the lesson is that the more levels of abstraction you have to go up, the more difficult it is to make each level robust to failures of the previous level, and as such the more you'd prefer the initial levels be 'exact'.
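To make the arithmetic behind the last two paragraphs concrete, here is a minimal sketch. It assumes that each level holds independently with the same probability, and it uses a 2-of-3 majority vote as a stand-in for an abstraction that "accounts for some amount of failure" of the level below. The function names, the value 0.99, and the majority-vote scheme are all illustrative assumptions, not anything claimed in the thread.

```python
def chance_no_failure(p_layer: float, n_layers: int) -> float:
    """Probability that every one of n independent layers holds,
    when each layer holds with probability p_layer (assumed equal)."""
    return p_layer ** n_layers


def majority_of_three(p: float) -> float:
    """Reliability of a 2-of-3 majority vote over three independent
    components that each work with probability p (hypothetical scheme)."""
    return p**3 + 3 * p**2 * (1 - p)


if __name__ == "__main__":
    p = 0.99  # assumed per-layer reliability, chosen only for illustration
    for n in (1, 2, 5, 10):
        plain = chance_no_failure(p, n)
        robust = chance_no_failure(majority_of_three(p), n)
        print(f"{n:2d} levels: plain={plain:.4f}  with redundancy={robust:.4f}")
```

Under these assumptions the chance of no failure shrinks geometrically with the number of levels, and building redundancy into each level slows that decay without removing it.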

"real AGI systems" are "2+ levels above" the sorts of theories that MIRI works on.

I'd say that they're some number of levels above (of abstraction) and also levels below (of implementation). So for an unrealistic example, if you develop logical induction decision theory, you have your theory of logical induction, then you depend on that theory to have your decision theory (first level of abstraction), and then you depend on your decision theory to have multiple LIDT agents behave well together (second level of abstraction). Separately, you need to actually implement your logical inductor by some machine learning algorithm (first level of implementation), which is going to depend on numpy and floating point arithmetic and such (second and third (?) levels of implementation), which depends on computing hardware and firmware (I don't know how many levels of implementation that is).

When I read a MIRI paper, it typically seems to me that the theories discussed are pretty abstract, and as such there are more levels below than above. The levels below seem mostly unproblematic (except for machine learning, which in the form of deep learning is often under-theorised). They are also mathematised enough that I'm optimistic about upwards abstraction having the possibility of robustness. There are some exceptions (e.g. the mesa-optimisers paper), but they seem like they're on the path to greater mathematisability.

MIRI's theories will always be the relatively-imprecise theories that can't scale to "2+ levels above"

I'm not sure about this, but I disagree with the version that replaces 'MIRI's theories' with 'mathematical theories of embedded rationality', basically for the reasons that Vanessa discusses.

Realism about rationality

I'm confused how my examples don't count as 'building on' the relevant theories - it sure seems like people reasoned in the relevant theories and then built things in the real world based on the results of that reasoning, and if that's true (and if the things in the real world actually successfully fulfilled their purpose), then I'd think that spending time and effort developing the relevant theories was worth it. This argument has some weak points (the US government is not highly reliable at preserving liberty, very few individual businesses are highly reliable at delivering their products, the theories of management and liberalism were informed by a lot of experimentation), but you seem to be pointing at something else.

Realism about rationality

I think it was important to have something like this post exist. However, I now think it's not fit for purpose. In this discussion thread, rohinmshah, abramdemski and I end up spilling a lot of ink about a disagreement that ended up being at least partially because we took 'realism about rationality' to mean different things. rohinmshah thought that irrealism would mean that the theory of rationality was about as real as the theory of liberalism, abramdemski thought that irrealism would mean that the theory of rationality would be about as real as the theory of population genetics, and I leaned towards rohinmshah's position but also thought that it referred to something more akin to a mood than a proposition. I think that a better post would distinguish these three types of 'realism' and their consequences. However, I'm glad that this post sparked enough conversation for the better post to become real.

Realism about rationality

My underlying model is that when you talk about something so "real" that you can make extremely precise predictions about it, you can create towers of abstractions upon it, without worrying that they might leak. You can't do this with "non-real" things.

For what it's worth, I think I disagree with this even when "non-real" means "as real as the theory of liberalism". One example is companies - my understanding is that people have fake theories about how companies should be arranged, that these theories can be better or worse (and evaluated as such without looking at how their implementations turn out), that one can maybe learn these theories in business school, and that implementing them creates more valuable companies (at least in expectation). At the very least, my understanding is that providing management advice to companies in developing countries significantly raises their productivity, and I found this study to support this half-baked memory.

(next paragraph is super political, but it's important to my point)

I live in what I honestly, straightforwardly believe is the greatest country in the world (where greatness doesn't exactly mean 'moral goodness' but does imply the ability to support moral goodness - think some combination of wealth and geo-strategic dominance), whose government was founded after a long series of discussions about how best to use the state to secure individual liberty. If I think about other wealthy countries, it seems to me that ones whose governments built upon this tradition of the interaction between liberty and governance are over-represented (e.g. Switzerland, Singapore, Hong Kong). The theory of liberalism wasn't complete or real enough to build a perfect government, or even a government reliable enough to keep to its founding principles (see complaints American constitutionalists have about how things are done today), but it was something that could be built upon.

At any rate, I think it's the case that the things that can be built off of these fake theories aren't reliable enough to satisfy a strict Yudkowsky-style security mindset. But I do think it's possible to productively build off of them.

Realism about rationality

So, yeah, I'm asking you about something which you haven't claimed is a crux of a disagreement which you and I are having, but, I am asking about it because I seem to have a disagreement with you about (a) whether rationality realism is true (pending clarification of what the term means to each of us), and (b) whether rationality realism should make a big difference for several positions you listed.

For what it's worth, from my perspective, two months ago I said I fell into a certain pattern of thinking, then raemon put me in the position of saying what that was a crux for, then I was asked to elaborate about why a specific facet of the distinction was cruxy, and also the pattern of thinking morphed into something more analogous to a proposition. So I'm happy to elaborate on consequences of 'rationality realism' in my mind (such as they are - the term seems vague enough that I'm a 'rationality realism' anti-realist and so don't want to lean too heavily on the concept) in order to further a discussion, but in the context of an exchange that was initially framed as a debate I'd like to be clear about what commitments I am and am not making.

Anyway, glad to clarify that we have a big disagreement about how 'real' a theory of rationality should be, which probably resolves to a medium-sized disagreement about how 'real' rationality and/or its best theory actually is.

Realism about rationality

Meta/summary: I think we're talking past each other, and hope that this comment clarifies things.

How critical is it that rationality is as real as electromagnetism, rather than as real as reproductive fitness? I think the latter seems much more plausible, but I also don't see why the distinction should be so cruxy...

Reproductive fitness implies something that's quite mathematizable, but with relatively "fake" models

I was thinking of the difference between the theory of electromagnetism vs the idea that there's a reproductive fitness function, but that it's very hard to realistically mathematise or actually determine what it is. The difference between the theory of electromagnetism and mathematical theories of population genetics (which are quite mathematisable but again deal with 'fake' models and inputs, and which I guess is more like what you mean?) is smaller, and if pressed I'm unsure which theory rationality will end up closer to.

Separately, I feel weird having people ask me about why things are 'cruxy' when I didn't initially say that they were and without the context of an underlying disagreement that we're hashing out. Like, either there's some misunderstanding going on, or you're asking me to check all the consequences of a belief that I have compared to a different belief that I could have, which is hard for me to do.

I am curious why you expect electromagnetism-esque levels of mathematical modeling. Even AIXI describes a heavy dependence on programming language. Any theory of bounded rationality which doesn't ignore poly-time differences (ie, anything "closer to the ground" than logical induction) has to be hardware-dependent as well.

I confess to being quite troubled by AIXI's language-dependence and the difficulty in getting around it. I do hope that there are ways of mathematically specifying the amount of computation available to a system more precisely than "polynomial in some input", which should be some input to a good theory of bounded rationality.

If I didn't believe the above,

What alternative world are you imagining, though?

I think I was imagining an alternative world where useful theories of rationality could only be about as precise as theories of liberalism, or current theories about why England had an industrial revolution when it did, and no other country did instead.

Realism about rationality

I think the mathematical theory of natural selection + the theory of DNA / genes were probably very influential in both medicine and biology, because they make very precise predictions and the real world is a very good fit for the models they propose. (That is, they are "real", in the sense that "real" is meant in the OP.)

In contrast, I think the general insight of "each part of these organisms has been designed by a local hill-climbing process to maximise reproduction" would not have been very influential in either medicine or biology, had it not been accompanied by the math.

But surely you wouldn't get the mathematics of natural selection without the general insight, and so I think the general insight deserves to get a bunch of the credit. And both the mathematics of natural selection and the general insight seem pretty tied up to the notion of 'reproductive fitness'.

Realism about rationality

Ah, I didn't quite realise you meant to talk about "human understanding of the theory of evolution" rather than evolution itself. I still suspect that the theory of evolution is so fundamental to our understanding of biology, and our understanding of biology so useful to humanity, that if human understanding of evolution doesn't contribute much to human welfare it's just because most applications deal with pretty long time-scales.

(Also I don't get why this discussion is treating evolution as 'non-real': stuff like the Price equation seems pretty formal to me. To me it seems like a pretty mathematisable theory with some hard-to-specify inputs like fitness.)
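For reference, the Price equation mentioned above is usually written in the following form. The symbols are the conventional ones and do not appear in the thread: $w_i$ is the fitness of individual (or type) $i$, $z_i$ is its trait value, $\bar{w}$ is mean fitness, and $\Delta z_i$ is the change in trait value between parent and offspring.

```latex
\Delta \bar{z}
  = \frac{\operatorname{Cov}(w_i, z_i)}{\bar{w}}
  + \frac{\operatorname{E}(w_i \, \Delta z_i)}{\bar{w}}
```

The first term captures selection and the second captures transmission. The equation itself is exact; the hard-to-specify part is the fitness values $w_i$ that go into it.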

Realism about rationality
  • I believe in some form of rationality realism: that is, that there's a neat mathematical theory of ideal rationality that's in practice relevant for how to build rational agents and be rational. I expect there to be a theory of bounded rationality about as mathematically specifiable and neat as electromagnetism (which after all in the real world requires a bunch of materials science to tell you about the permittivity of things).
  • If I didn't believe the above, I'd be less interested in things like AIXI and reflective oracles. In general, the above tells you quite a bit about my 'worldview' related to AI.
  • Searching for beliefs I hold for which 'rationality realism' is crucial by imagining what I'd conclude if I learned that 'rationality irrealism' was more right:
    • I'd be more interested in empirical understanding of deep learning and less interested in an understanding of learning theory.
    • I'd be less interested in probabilistic forecasting of things.
    • I'd want to find some higher-level thing that was more 'real'/mathematically characterisable, and study that instead.
    • I'd be less optimistic about the prospects for an 'ideal' decision and reasoning theory.
  • My research depends on the belief that rational agents in the real world are likely to have some kind of ordered internal structure that is comprehensible to people. This belief is informed by rationality realism but distinct from it.

Realism about rationality

In contrast, I struggle to name a way that evolution affects an everyday person

I'm not sure what exactly you mean, but examples that come to mind:

  • Crops and domestic animals that have been artificially selected for various qualities.
  • The medical community encouraging people to not use antibiotics unnecessarily.
  • [Inheritance but not selection] The fact that your kids will probably turn out like you without specific intervention on your part to make that happen.