Distillation Robustifies Unlearning
Current “unlearning” methods only suppress capabilities instead of truly unlearning the capabilities. But if you distill an unlearned model into a randomly initialized model, the resulting network is actually robust to relearning. We show why this works, how well it works, and how to trade off compute for robustness. Unlearn-and-Distill...

Thanks for the thoughtful reply!
This is the idea that at some point in scaling up an organization you could lose efficiency due to needing more/better management, more communication (meetings) needed and longer communication processes, "bloat" in general. I'm not claiming it’s likely to happen with AI, just another possible reason for increasing marginal cost with scale.
... (read 353 more words →)