DanielVarga

Comments

What’s up with LLMs representing XORs of arbitrary features?

I’ll say that a model linearly represents a binary feature f if there is a linear probe out of the model’s latent space which is accurate for classifying f.

If a model linearly represents features a and b, then it automatically linearly represents $a \land b$ and $a \lor b$.

I think I misunderstand your definition. Let feature a be represented by $x_1 > 0.5$, and let feature b be represented by $x_2 > 0.5$, where the $x_i$ are i.i.d. uniform on $[0, 1]$. Isn't that a counterexample to $a \land b$ being linearly representable?
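To make the counterexample concrete, here is a minimal sketch in plain Python. It samples points uniformly from the unit square, labels them with a AND b as defined above, and measures the accuracy of one hand-picked linear probe. The helper names (`sample_points`, `and_label`, `probe_accuracy`) and the particular probe $x_1 + x_2 > 4/3$ are my own illustrative choices, not anything from the post, and I am not claiming this probe is the best linear one.

```python
import random


def sample_points(n, seed=0):
    """Draw n points with x_1, x_2 i.i.d. uniform on [0, 1]."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n)]


def and_label(x1, x2):
    """Feature a is x_1 > 0.5, feature b is x_2 > 0.5; return a AND b."""
    return x1 > 0.5 and x2 > 0.5


def probe_accuracy(points, w1, w2, threshold):
    """Accuracy of the linear probe w1*x_1 + w2*x_2 > threshold on a AND b."""
    correct = sum(
        ((w1 * x1 + w2 * x2) > threshold) == and_label(x1, x2)
        for x1, x2 in points
    )
    return correct / len(points)


points = sample_points(100_000)

# Hypothetical probe chosen by hand: x_1 + x_2 > 4/3.
acc = probe_accuracy(points, 1.0, 1.0, 4.0 / 3.0)
print(f"accuracy of probe x_1 + x_2 > 4/3: {acc:.3f}")
```

For this probe the error region has area $1/12$, so the accuracy should come out near 0.917 rather than 1. No linear probe can be exact here: $(1, 0.49)$ and $(0.49, 1)$ are both negative for $a \land b$, but their midpoint $(0.745, 0.745)$ is positive, and a half-plane containing two points also contains their midpoint. So whether this counts as "linearly represented" depends on how accurate the probe has to be.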
