Raphael Roche

496

155

1y

Schelling Goodness, and Shared Morality as a Goal

Raphael Roche3mo36

Interesting reflection. This is just an anecdotal aside with no major link to the moral discussion, but having been a Parisian for most of my life, my first intuition for a meeting point wasn't the Eiffel Tower, but the square in front of Notre-Dame (le parvis).

Indeed, several cultural elements converge on this solution for a true-blue Parisian: it's the historic heart of Paris, a highly symbolic spot, and, by convention, 'Point Zero' for all roads in France (there's even a well-known ground marker there). It is also very close to Châtelet-Les Halles,... (read more)

Frontier Models are Capable of In-context Scheming

Raphael Roche1y*00

We may filter training data and improve RLHF, but in the end, game theory (that is to say, maths) implies that scheming can be a rational strategy, and in some cases the best one. Humans do not scheme just because they are bad, but because it can be a rational choice to do so. I don't think LLMs do it exclusively because it is what humans do in the training data; any advanced model would eventually arrive at such strategies because they are the most rational choice in context. They infer patterns from the training data and rational behavior is cer... (read more)
