Planned summary for the Alignment Newsletter:
Probability theory can tell us how we ought to build agents that have knowledge (start with a prior, and perform Bayesian updates as evidence comes in). However, this is not the only way to create knowledge: for example, humans are not ideal Bayesian reasoners. As part of our quest to <@_describe_ existing agents@>(@Theory of Ideal Agents, or of Existing Agents?@), could we have a theory of knowledge that specifies when a particular physical region within a closed system is “creating knowledge”? We want a theory that <@works in the Game of Life@>(@Agency in Conway’s Game of Life@) as well as in the real world.
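As a minimal illustration of the Bayesian picture mentioned above (the coin-flip setup and all names below are my own illustrative choices, not part of the sequence), an agent that starts with a prior over hypotheses and updates on incoming evidence might look like this:

```python
# Minimal Bayesian-update sketch: an agent tracking the bias of a coin.
# The discretised hypothesis grid and the coin example are illustrative choices.

hypotheses = [i / 10 for i in range(11)]          # candidate values of P(heads)
prior = [1 / len(hypotheses)] * len(hypotheses)   # start with a uniform prior

def update(belief, heads):
    """One Bayesian update: posterior is proportional to likelihood times prior."""
    likelihoods = [h if heads else 1 - h for h in hypotheses]
    unnormalised = [l * b for l, b in zip(likelihoods, belief)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]

belief = prior
for observation in [True, True, False, True]:     # evidence coming in over time
    belief = update(belief, observation)

# Report the posterior probability of each hypothesis after the evidence.
for h, b in zip(hypotheses, belief):
    print(f"P(heads)={h:.1f}: posterior {b:.3f}")
```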
This sequence investigates this question from the perspective of defining the accumulation of knowledge as increasing correspondence between [a map and the territory](https://en.wikipedia.org/wiki/Map%E2%80%93territory_relation), and concludes that such definitions are not tenable. In particular, it considers four possibilities, and demonstrates counterexamples to all of them:
1. Direct map-territory resemblance: Here, we say that knowledge accumulates in a physical region of space (the “map”) if that region of space looks more like the full system (the “territory”) over time.
Problem: This definition fails to account for cases of knowledge where the map is represented in a very different way that doesn’t resemble the territory, such as when a map is represented by a sequence of zeros and ones in a computer.
2. Map-territory mutual information: Instead of looking at direct resemblance, we can ask whether there is increasing mutual information between the supposed map and the territory it is meant to represent. (A toy calculation of this quantity is sketched after this list.)
Problem: In the real world, nearly _every_ region of space will have high mutual information with the rest of the world. For example, by this definition, a rock accumulates lots of knowledge as photons incident on its face affect the properties of specific electrons in the rock, giving it lots of information about those photons.
3. Mutual information of an abstraction layer: An abstraction layer is a grouping of low-level configurations into high-level configurations such that transitions between high-level configurations are predictable without knowing the low-level configurations. For example, the zeros and ones in a computer are the high-level configurations of a digital abstraction layer over low-level physics. Knowledge accumulates in a region of space if that space has a digital abstraction layer, and the high-level configurations of the map have increasing mutual information with the low-level configurations of the territory.
Problem: A video camera that constantly records would accumulate much more knowledge by this definition than a human, even though the human is much more able to construct models and act on them.
4. Precipitation of action: The problem with our previous definitions is that they don’t require the knowledge to be _useful_. So perhaps we can instead say that knowledge is accumulating when it is being used to take action. To make this mechanistic, we say that knowledge accumulates when an entity’s actions become more fine-tuned to a specific environment configuration over time. (Intuitively, the entity learned more about the environment, and so could condition its actions on that knowledge, which it previously could not do.)
Problem: This definition requires the knowledge to actually be used in order to count as knowledge. However, if someone makes a map of a coastline, but that map is never used (perhaps it is quickly destroyed), it seems wrong to say that no knowledge was accumulating during the map-making process.
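To make definition 2 concrete (see the pointer in that item), here is a toy empirical mutual-information calculation; the rock/photon framing of the example is mine and is only meant to show why the definition over-counts:

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Empirical mutual information (in bits) between two discrete variables,
    estimated from a list of (map_state, territory_state) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    map_counts = Counter(m for m, _ in pairs)
    territory_counts = Counter(t for _, t in pairs)
    mi = 0.0
    for (m, t), c in joint.items():
        p_joint = c / n
        p_independent = (map_counts[m] / n) * (territory_counts[t] / n)
        mi += p_joint * math.log2(p_joint / p_independent)
    return mi

# A "rock" whose surface state simply copies incoming photon states has maximal
# mutual information with that part of the territory, despite understanding nothing.
territory = [0, 1, 1, 0, 1, 0, 0, 1]
rock_surface = list(territory)          # a perfect physical imprint, no model-building
print(mutual_information(list(zip(rock_surface, territory))))   # prints 1.0
```

A video camera's recording of the territory would score just as highly, which is the spirit of the counterexamples to definitions 2 and 3.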
I think that part of the problem is that talking about knowledge requires adopting an interpretative frame. We can only really say whether a collection of particles represents some particular knowledge from within such a frame, although it would be possible to determine the frame of minimum complexity that interprets a system as representing certain facts. In practice, though, whether or not a particular piece of storage contains knowledge will depend on the interpretative frames in the environment, although we need to remember that interpretative frames can emulate other interpretative frames, e.g. a human experimenting with multiple codes in order to decode a message.
Regarding the topic of partial knowledge, it seems that the importance of various facts will vary wildly from context to context, and also depending on the goal. I'm somewhat skeptical that goal-independent knowledge will have a nice definition.
Well yes, I agree that knowledge exists with respect to a goal, but is there really no objective difference between an alien artifact inscribed with deep facts about the structure of the universe, set up in such a way that it can be decoded by any intelligent species that might find it, and an ordinary chunk of rock arriving from outer space?
Well, taking the simpler case of exactly reproducing a certain string, you could find the length of the shortest program that produces the string, as in Kolmogorov complexity, and use that as a measure of complexity.
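Kolmogorov complexity itself is uncomputable, so any concrete version of this would need a proxy; a common stand-in (my choice here, not something proposed above) is compressed length:

```python
import zlib

def complexity_proxy(s):
    """Crude, computable upper bound on the Kolmogorov complexity of s:
    the length in bytes of its zlib-compressed encoding."""
    return len(zlib.compress(s.encode("utf-8"), 9))

print(complexity_proxy("ab" * 500))   # 1000 characters, but very regular, so compresses well
print(complexity_proxy("deep facts about the structure of the universe"))
```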
A slightly more useful way of modelling things might be to have a collection of different strings, each assigned a number of points representing its level of importance. We could then produce a metric combining the Kolmogorov complexity of a decoder with the sum of the points for the desired strings it produces (with its outputs concatenated using a predefined separator); for example, we might take the quotient of the two.
One immediate issue with this is that some of the strings may contain overlapping information, and we'd still need a way of assigning importances to the strings in the first place. Perhaps a simpler case would be one where the strings represent patterns in a stream, encoded as Turing machines that can output sets of symbols (rather than single symbols) representing the possible symbols at each location. The number of points a machine provides would then be equal to how much of the stream it allows you to predict. (This would still require producing a representation of the universe in which the amount of the stream predicted is roughly equivalent to how useful the predictions are.)
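For what it's worth, here is one way the quotient idea above could be sketched out, again using compressed length as a stand-in for Kolmogorov complexity; the separator, the scoring rule, and all names are my own illustrative assumptions rather than a worked-out proposal:

```python
import zlib

SEPARATOR = "|"   # the predefined separator mentioned above (arbitrary choice)

def decoder_complexity(decoder_source):
    """Stand-in for the Kolmogorov complexity of the decoder: compressed length."""
    return len(zlib.compress(decoder_source.encode("utf-8"), 9))

def knowledge_score(decoder_source, decoder_output, importance):
    """Toy version of the proposed quotient: the points earned for the desired
    strings found in the decoder's separator-joined output, divided by the
    complexity of the decoder itself."""
    produced = set(decoder_output.split(SEPARATOR))
    points = sum(p for s, p in importance.items() if s in produced)
    return points / decoder_complexity(decoder_source)

# Hypothetical usage: a decoder (represented here only by placeholder source text)
# that recovers two of the three strings we care about.
importance = {"tide tables": 3.0, "coastline shape": 5.0, "star chart": 2.0}
decoder_source = "def decode(artifact): ..."       # placeholder, not real code
decoder_output = SEPARATOR.join(["coastline shape", "tide tables"])
print(knowledge_score(decoder_source, decoder_output, importance))
```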
Any thoughts on this general approach?