The AI alignment properties of agents which would be interesting to a range of principals trying to solve AI alignment. For example:

  • Does the AI "care" about reality, or just about its sensory observations?
  • Does the AI properly navigate ontological shifts?
  • Does the AI reason about itself as embedded in its environment?
Posts tagged General Alignment Properties