AI ALIGNMENT FORUM

romeostevensit

Posts (sorted by new)

9 · Towards an Intentional Research Agenda · 6y · 5
1 · romeostevensit's Shortform · 6y · 0

Wikitag Contributions

No wikitag contributions to display.

Comments (sorted by newest)

Towards a scale-free theory of intelligent agency
romeostevensit · 5mo · 20

Found this interesting and useful. The big update for me is that 'I cut, you choose' is basically the property that most (all?) good self-therapy modalities use, afaict: the part or part-coalition running the therapy procedure can offer but not force things, since its frames subtly bias the process.

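To gesture at the property concretely (a toy sketch of my own; the 'cutter'/'chooser' framing and the valuation functions stand in for the proposing part and the rest of the system, and aren't from the post): the proposer shapes the options, but because the other side picks, the proposer does best by making offers that survive a frame it doesn't control.

```python
# Toy 'I cut, you choose' sketch. The cutter proposes a split of a shared
# resource on [0, 1]; the chooser takes whichever piece it values more.
# The cutter can offer but not force, so its best strategy is a split it
# considers even, which protects the chooser from the cutter's biased frame.

def cut(cutter_value, n_steps=100):
    """Cutter picks the split point that makes the two pieces as close to
    equal as possible under its own valuation."""
    best_x, best_gap = 0.5, float("inf")
    for i in range(1, n_steps):
        x = i / n_steps
        gap = abs(cutter_value(0.0, x) - cutter_value(x, 1.0))
        if gap < best_gap:
            best_x, best_gap = x, gap
    return best_x

def choose(chooser_value, x):
    """Chooser takes whichever piece it values more."""
    return "left" if chooser_value(0.0, x) >= chooser_value(x, 1.0) else "right"

# Made-up valuations over sub-intervals [a, b] of the resource.
cutter_value = lambda a, b: b ** 2 - a ** 2   # cutter weights the right end more
chooser_value = lambda a, b: b - a            # chooser values the resource uniformly

split = cut(cutter_value)
print("cutter offers a split at", split)
print("chooser takes the", choose(chooser_value, split), "piece")
```
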
How AI Takeover Might Happen in 2 Years
romeostevensit · 7mo · 22

For people who want weirder takes, I would recommend Egan's 'Unstable Orbits in the Space of Lies'.

gwern's Shortform
romeostevensit · 9mo · 10

People inexplicably seem to favor extremely bad leaders --> people seem to inexplicably favor bad AIs.

The Obliqueness Thesis
romeostevensit · 1y · 10

You mention 'warp' when talking about cross-ontology mapping, which seems like your best summary of a complicated intuition. I'd be curious to hear more (I recognize this might not be practical). My own intuition surfaced 'introducing degrees of freedom', à la the indeterminacy of translation.

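To make 'introducing degrees of freedom' slightly more concrete (a toy example of my own; the lamp/room ontologies are made up): mapping a state of a coarse ontology into a finer one typically leaves several fine-grained states consistent with everything the coarse ontology can say, and those leftover choices are exactly the slack a warp has to absorb.

```python
# Toy cross-ontology mapping. The coarse ontology only tracks whether a room
# is 'bright' or 'dark'; the fine ontology tracks individual lamps. Going
# fine -> coarse is a function, but going coarse -> fine introduces degrees
# of freedom: many fine states fit the same coarse description.

from itertools import product

LAMPS = ["lamp_a", "lamp_b", "lamp_c"]

def coarse_of(fine_state):
    """Fine -> coarse: determined, no freedom in this direction."""
    return "bright" if any(fine_state.values()) else "dark"

def fine_candidates(coarse_state):
    """Coarse -> fine: a relation, not a function. The choice among the
    candidates is the extra degree of freedom the translation introduces."""
    all_states = [dict(zip(LAMPS, bits)) for bits in product([False, True], repeat=len(LAMPS))]
    return [s for s in all_states if coarse_of(s) == coarse_state]

print(len(fine_candidates("bright")), "fine states are consistent with 'bright'")
print(len(fine_candidates("dark")), "fine state is consistent with 'dark'")
```
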
The Checklist: What Succeeding at AI Safety Will Involve
romeostevensit · 1y · 10

https://github.com/JohnLaTwC/Shared/blob/master/Defenders%20think%20in%20lists.%20Attackers%20think%20in%20graphs.%20As%20long%20as%20this%20is%20true%2C%20attackers%20win.md

Fabien's Shortform
romeostevensit · 1y · 10

Is there a short summary of the 'rejecting Knightian uncertainty' bit?

Value systematization: how values become coherent (and misaligned)
romeostevensit · 2y · 10

Sample complexity reduction is one of our main moment-to-moment activities, but humans seem to apply it across bigger bridges, and this is probably part of transfer learning. One of the things we can apply sample complexity reduction to is the 'self' object, the idea of a coherent agent across differing decision points. The tradeoff between local and global loss seems to regulate this, and humans don't seem uniform on this dimension: foxes care more about local loss, hedgehogs more about global loss. Most moral philosophies seem like appeals to different possible higher-order symmetries. I don't think this is the crux of the issue, since human compressions of these things will probably turn out to be pretty easy to do with tons of cognitive horsepower; the dimensionality of our value embedding is probably not that high. My guess is that the crux is getting a system to care about distress in the first place, and then to balance local and global distress.

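As a toy version of the local/global tradeoff I mean (the mixing weight alpha, the loss numbers, and the fox/hedgehog settings are illustrative, not from the post): a single parameter interpolates between caring only about the loss at the current decision point and caring about the loss pooled across all decision points.

```python
# Toy interpolation between local and global loss. alpha near 1 weights the
# loss at each decision point on its own (fox-like); alpha near 0 weights the
# pooled loss across the whole agent's decision points (hedgehog-like).

def mixed_losses(local_losses, alpha):
    """Return the per-decision objective: a convex mix of each decision's
    own loss and the average loss over all decisions."""
    global_loss = sum(local_losses) / len(local_losses)
    return [alpha * loss + (1 - alpha) * global_loss for loss in local_losses]

decision_losses = [0.9, 0.1, 0.5, 0.2]  # made-up per-decision losses

print("fox-like      (alpha=0.9):", mixed_losses(decision_losses, 0.9))
print("hedgehog-like (alpha=0.1):", mixed_losses(decision_losses, 0.1))
```
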
Meta Questions about Metaphilosophy
romeostevensit · 2y · 10

Also, I wrote this a while back: https://www.lesswrong.com/posts/caSv2sqB2bgMybvr9/exploring-tacit-linked-premises-with-gpt

Meta Questions about Metaphilosophy
romeostevensit · 2y · 30

When I look at metaphilosophy, the main places I go looking are places with large confusion deltas: where, who, and why did someone become dramatically less philosophically confused about something, turning unfalsifiable questions into technical problems? Kuhn was too caught up in the social dynamics to want to do this from the perspective of pure ideas. A few things to point to:

  1. Wittgenstein noticed that many philosophical problems attempt to intervene at the wrong level of abstraction and posited that awareness of abstraction as a mental event might help
  2. Korzybski noticed that many philosophical problems attempt to intervene at the wrong level of abstraction and posited that awareness of abstraction as a mental event might help
  3. David Marr noticed that many philosophical and technical problems attempt to intervene at the wrong level of you get the idea
  4. Hassabis cites Marr as helpful in deconfusing AI problems.
  5. Eliezer's Technical Explanation of Technical Explanation doesn't use the term 'compression' and seems the worse for it, using many, many words to describe things that compression would render easier to reason about, afaict (see the toy sketch after this list).
  6. Hanson, in The Elephant in the Brain, posits that if we mysteriously don't make progress on something that seems crucial, maybe we have strong motivations for not making progress on it.

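A toy sketch of what I mean by compression doing the work (the coin-flip data and the two models are made up): an explanation that concentrates probability mass on what actually happens assigns the observations a shorter code length, so comparing code lengths in bits substitutes for a lot of prose about what makes an explanation technical.

```python
import math

# Compare two 'explanations' by the code length (in bits) they assign to the
# observed data: -log2(probability of each observation). The model that
# concentrates probability mass on what actually happened compresses the
# data more, i.e. it is the better technical explanation in this toy sense.

def code_length_bits(model, observations):
    """Total bits needed to encode the observations under the model."""
    return sum(-math.log2(model[outcome]) for outcome in observations)

observations = ["heads"] * 8 + ["tails"] * 2      # made-up data

vague_model = {"heads": 0.5, "tails": 0.5}        # hedges, rules nothing out
sharp_model = {"heads": 0.8, "tails": 0.2}        # commits, matches the data

print("vague model:", round(code_length_bits(vague_model, observations), 2), "bits")
print("sharp model:", round(code_length_bits(sharp_model, observations), 2), "bits")
```
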
Question: what happens to people when they gain consciousness of abstraction? My first-pass attempt at an answer is that they become a lot less interested in philosophy.

Question: if someone had quietly made progress on metaphilosophy, how would we know? My first guess is that we would only know if their solution scaled well, or caused something to scale well.

The Lightcone Theorem: A Better Foundation For Natural Abstraction?
romeostevensit · 2y · 40

Is there a good primer somewhere on how causal models interact with the standard model of physics?
