

Counterexamples to some ELK proposals

It seems like the frame of some of the critique is that humans are the authority on human values and want to ensure that the AI doesn't escape that authority in illegible ways. To me the frame is more like this: we know that the sensors we have are only goodhartedly entangled with the things we care about, and we would ourselves prefer the less goodharted hypothetical sensors if we knew how to construct them. We'd also want the AI to be inhabiting the same frame as us since, to take a page from Mad Investor Chaos, we don't know how lies will propagate through an alien architecture.

I don't know how 'find less goodharted sensors' is instantiated on natural hardware, or how toy versions might be implemented algorithmically, but it seems like it would be worth trying to figure out. In a conversation, John mentioned a type of architecture that is forced through an information bottleneck to find a minimal representation of the space. That seemed like a similar direction.

Morality is Scary

You may not be interested in mutually exclusive compression schemas, but mutually exclusive compression schemas are interested in you. One nice thing is that, given that the schemas use an arbitrary key to handshake with, there is hope that they can all be convinced to get on the same arbitrary key without loss of useful structure.

Biology-Inspired AGI Timelines: The Trick That Never Works

Spoiler tags are borked the way I'm using them.

Anyway, another place to try your hand at calibration:

Humbali: No. You're expressing absolute certainty in your underlying epistemology and your entire probability distribution

No he isn't. Why?

Humbali is asking Eliezer to double-count evidence. Consilience is hard if you don't do your homework on the provenance of heuristics and instead just naively count up outputs from people who also didn't do their homework.

Or in other words: "Do not cite the deep evidence to me, I was there when it was written"

And another place to take a whack at:

I'm not sure how to lead you into the place where you can dismiss that thought with confidence.

The particular cited example of statusy aliens seems like extreme hypothesis privileging, which often arises from reference class tennis.

My take on higher-order game theory

Tangential, but did you ever happen to read "Statistical physics of human cooperation"?

Optimization Concepts in the Game of Life

Defining a distance function between two patterns might yield some interesting stuff and allow some porting in of existing math from information theory. There is also the dynamic case (converging and diverging distances) between different patterns. This seems like it could play into robustness, e.g., the sensitivity of patterns to flipping from a convergent to a divergent state.
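A minimal sketch of what this could look like, assuming Hamming distance (the number of cells that differ) as the distance function and a wrap-around grid. Both are my choices for illustration, not anything from the post, and the function names `step`, `hamming`, and `distance_trajectory` are made up here. The trajectory of distances over time is one crude way to see whether two patterns are converging or diverging.

```python
import numpy as np

def step(grid):
    """Advance a 0/1 grid by one Game of Life step on a torus (edges wrap)."""
    # Count the eight neighbors of every cell by summing shifted copies.
    neighbors = sum(
        np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    # A cell is alive next step if it has 3 neighbors, or is alive with 2.
    alive = (neighbors == 3) | ((grid == 1) & (neighbors == 2))
    return alive.astype(np.uint8)

def hamming(a, b):
    """Assumed distance function: number of cells where the grids differ."""
    return int(np.sum(a != b))

def distance_trajectory(a, b, steps):
    """Distance between two patterns at t = 0, 1, ..., steps."""
    distances = [hamming(a, b)]
    for _ in range(steps):
        a, b = step(a), step(b)
        distances.append(hamming(a, b))
    return distances

# Example: a blinker, and the same blinker with one extra live cell.
blinker = np.zeros((5, 5), dtype=np.uint8)
blinker[2, 1:4] = 1
perturbed = blinker.copy()
perturbed[0, 0] = 1
trajectory = distance_trajectory(blinker, perturbed, steps=10)
```

A trajectory that falls to zero means the perturbation was absorbed; one that grows means the patterns are diverging. Other distances (for example, ones that are invariant to translation) would probably be more interesting, but this is the simplest place to start.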

Analogies and General Priors on Intelligence

I understand; I thought it was worth commenting on anyway.

Analogies and General Priors on Intelligence

the small size of the human genome suggests that brain design is simple

It bounds it, yes, but the bound can be quite high due to offloading much of the compression to the environment.

Draft report on AI timelines

Is a sensitivity analysis of the model separated out anywhere? I might just be missing it.
