DanielFilan

Comments

Introduction To The Infra-Bayesianism Sequence

Therefore, there is a real sense in which its hypothesis class includes things as difficult to compute as it is. That being said, my guess is that halting oracles would indeed let you compute more than just the lower semi-computable functions, and it's also true that being able to run Solomonoff induction would also let you build a halting oracle.

I guess the way to reconcile this is to think that there's a difference between what you can lower semi-compute, and what you could compute if you could compute lower semi-computable things? But it's been a while since I had a good understanding of this type of thing.

Introduction To The Infra-Bayesianism Sequence

much like how halting oracles (which you need to run Solomonoff Induction) are nowhere in the hypotheses which Solomonoff considers

The Solomonoff prior is a mixture over semi-measures[*] that are lower semi-computable: that is, you can compute increasingly good approximations of the semi-measure from below that converge eventually to the actual semi-measure, but at finite time you don't know how close you are to the right answer. The Solomonoff prior itself is also a lower semi-computable semi-measure. Therefore, there is a real sense in which its hypothesis class includes things as difficult to compute as it is. That being said, my guess is that halting oracles would indeed let you compute more than just the lower semi-computable functions, and it's also true that being able to run Solomonoff induction would also let you build a halting oracle.

[*] Semi-measures are like probability distributions, except that they can have 'missing density': the probability of a 0 and then a 0, plus the probability of a 0 and then a 1, is only required to be less than or equal to the probability of a 0, even though there aren't any other options in the space for what happens next.
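To make 'lower semi-computable' concrete, here is a toy, self-contained Python sketch (my own illustration, not anything from the sequence): a small hard-coded dictionary stands in for a universal prefix machine, with each toy 'program' having a fixed output and a halting time. At stage t we credit the weight 2^(-length) of every program we have already seen halt with an output extending the given prefix, so the approximation only ever increases and converges to the true semi-measure, but comes with no error bound at any finite stage.

```python
# Toy stand-in for a universal prefix machine: each 'program' (a bitstring,
# from a prefix-free set) has a fixed output and a number of steps before it
# halts. A real construction would dovetail over all programs; the dictionary
# plays that role here.
TOY_MACHINE = {
    "0":    ("00", 3),
    "10":   ("01", 7),
    "110":  ("0",  2),
    "1110": ("01", 500),  # slow program: only gets credited at late stages
}

def lower_approximation(x: str, stage: int) -> float:
    """Stage-`stage` lower bound on the semi-measure of the prefix `x`:
    the total weight 2**-len(p) of programs already seen to halt with an
    output extending `x`. Monotone in `stage` and convergent to the true
    value, but with no error bound at any finite stage."""
    return sum(
        2.0 ** -len(program)
        for program, (output, halt_time) in TOY_MACHINE.items()
        if halt_time <= stage and output.startswith(x)
    )

# The lower bounds for the prefix "0" only ever climb as we simulate longer:
for stage in (1, 5, 10, 1000):
    print(stage, lower_approximation("0", stage))
```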

Matt Botvinick on the spontaneous emergence of learning algorithms

it's not clear that the handyman would have remembered to give the advice "turn clockwise to loosen, and counterclockwise to tighten"

It's the other way around, right?

ricraz's Shortform

If you get an 'external' randomness oracle, then you could define the utility function pretty simply in terms of the outputs of the oracle.

If the agent has a pseudo-random number generator (PRNG) inside it, then I suppose I agree that you aren't going to be able to give it a utility function that has the standard set of convergent instrumental goals, and PRNGs can be pretty short. (Well, some search algorithms are probably shorter, but I bet they have higher Kt complexity, which is probably a better measure for agents.)
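To illustrate both halves of that, here is a toy Python sketch (mine, purely illustrative): the 'external' randomness oracle is modelled as a fixed bit stream, the utility function just rewards copying that stream (so it is defined directly in terms of the oracle's outputs), and the PRNG is a standard linear congruential generator, included mainly to show how few lines such a generator takes.

```python
def lcg(seed: int):
    """Minimal linear congruential PRNG (constants from Numerical Recipes)."""
    state = seed
    while True:
        state = (1664525 * state + 1013904223) % 2**32
        yield state & 1  # emit the low bit as one pseudo-random bit

def utility(oracle_bits, agent_bits) -> float:
    """Utility defined directly in terms of the oracle's outputs:
    the fraction of steps on which the agent copied the oracle's bit."""
    matches = sum(o == a for o, a in zip(oracle_bits, agent_bits))
    return matches / len(oracle_bits)

oracle = [0, 1, 1, 0, 1]              # bits from the 'external' randomness oracle
prng = lcg(seed=42)
agent = [next(prng) for _ in oracle]  # an agent that just acts pseudo-randomly
print(utility(oracle, agent))         # an agent that copies the oracle scores 1.0
```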

ricraz's Shortform

And so this generates arbitrarily simple agents whose observed behaviour can only be described as maximising a utility function for arbitrarily complex utility functions (depending on how long you run them).

I object to the claim that agents that act randomly can be made "arbitrarily simple". Randomness is basically definitionally complicated!

The ground of optimization

But Filan would surely agree on this point and his question is more specific: he is asking whether the liver is an optimizer.

FYI, it seems pretty clear to me that a liver should be considered an optimiser: as an organ in the human body, it performs various tasks mostly reliably, achieves homeostasis, etc. The question I was rhetorically asking was whether it is an optimiser of one's income, and the answer (I claim) is 'no'.

How should potential AI alignment researchers gauge whether the field is right for them?

I'd say a pretty good way is to try out AI alignment research as best you can, and see if you like it. This is probably best done by being an intern at some research group, but sadly these spots are limited. Perhaps one could factor it into "do I enjoy AI research at all", which is easier to gain experience in, and "am I interested in research questions in AI alignment", which you can hopefully determine through reading AI alignment research papers and introspecting on how much you care about the contents.

Will OpenAI's work unintentionally increase existential risks related to AI?

FWIW, I thought the original question text was slightly better, since I didn't read it as aggressive, and it didn't needlessly build in the explicit assumption that everyone at OpenAI is trying to avoid increasing existential risk. Furthermore, it seems clear to me that an organisation can be increasing existential risk without everybody at that organisation being a moral monster, since most organisations are heterogeneous.

In general, I think one should be able to ask questions of the form "is actor X causing harm Y" on LessWrong, and furthermore that people should not thereby assume that the questioner thinks that actor X is evil. I also think that some people are moral monsters and/or evil, and the way to figure out whether or not that's true is to ask questions of this form.

DanielFilan's Shortform Feed

As far as I can tell, people typically use the orthogonality thesis to argue that smart agents could have any motivations. But the orthogonality thesis is stronger than that, and its extra content is false - there are some goals that are too complicated for a dumb agent to have, because the agent couldn't understand those goals. I think people should instead directly defend the claim that smart agents could have arbitrary goals.

What if memes are common in highly capable minds?

My understanding of meme theory is that it considers the setting where memes mutate, reproduce, and are under selection pressure. This basically requires you to think that there's some population pool where the memes are spreading. So, one way to think about it might be to ask what memetic environment your AI systems are in.

  • Are human memes a good fit for AI agents? You might think that a physics simulator is not going to be a good fit for most human memes (except perhaps for memes like "representation theory is a good way to think about quantum operators"), because your physics simulator is structured differently from most human minds, and doesn't have the initial memes that our memes are co-adapted with. That being said, GPT-8 might be very receptive to human memes, since human memes shape a lot of what characters humans type on the internet, which is exactly what it would be trained to model.
  • How large is the AI population? If there's just one smart AI overlord and then a bunch of MS Excel-level clever computers, the AI overlord is probably not exchanging memes with the spreadsheets. However, if there's a large number of smart AI systems that work in basically the same manner, you might think that that forms the relevant "meme pool", and the resulting memes are going to be different from human memes (if the smart AI systems are cognitively different from humans), and as a result perhaps harder to predict. You could also imagine there being lots of AI system communities where communication is easy within each community but difficult between communities due to architectural differences.