Addie Foote — AI Alignment Forum

Four ways learning Econ makes people dumber re: future AI

Thanks for the thoughtful reply!

I understand “resource scarcity” but I’m confused by “coordination problems”. Can you give an example? (Sorry if that’s a stupid question.)

This is the idea that at some point in scaling up an organization, you could lose efficiency because you need more or better management, more communication (meetings), and longer communication processes: "bloat" in general. I'm not claiming it's likely to happen with AI; it's just another possible reason for increasing marginal cost with scale.

Resource scarcity seems unlikely to bite here, at least not for long. If some product is very profitable to create, and one of its components has a shortage, then people (or AGIs) will find ways to redesign around that component.

Key resources that come to mind would be electricity and chips (and the materials to produce these). I don't know how elastic production is in these industries, but the reason I expect it to be a barrier is that you're constrained by the slowest factor. For huge transformations or redesigns of significant parts of the current AI pipeline, like using a different kind of computation, I think there's probably lots of serial work that has to be done to make it work. I agree the problems are solvable, but the question shifts from "how much demand will there be for cheap AGI" to "how fast can resources be scaled up".

I wasn’t complaining about economists who say “the consequences of real AGI would be [crazy stuff], but I don’t expect real AGI in [time period T / ever]”. That’s fine!
Instead I was mainly complaining about the economists who have not even considered that real AGI is even a possible thing at all. Instead it’s just a big blind spot for them.

Yeah, I definitely agree.

And I don’t think this is independent of their economics training (although non-economists are obviously capable of having this blind spot too).
Instead, I think that (A) “such-and-such is just not a thing that happens in economies in the real world” and (B) “real AGI is even a conceivable possibility” are contradictory. And I think that economists are so steeped in (A) that they consider it to be a reductio ad absurdum for (B), whereas the correct response is the opposite ((B) disproves (A)).

I see how this could happen, but I'm not convinced this effect is actually happening. As you mention, many people have this blind spot, so my crux is that it isn't unique to economists. Some people claim AGI is already here (and evidently have a different definition of AGI). Most non-AI people who are worried about AI seem worried that it will take their job, not all jobs. Some people are willing to accept at face value the premise that AGI (as we define it) will exist, but it seems to me that most people outside of AI who question the premise at all end up not taking it seriously.

Four ways learning Econ makes people dumber re: future AI

Addie Foote2mo20

I agree that economists make some implicit assumptions about what AGI will look like that should be more explicit. But I disagree with several points in this post.

On equilibrium: A market equilibrates when supply and demand are balanced at the current price point. At any given instant this can happen for a market even with AGI (sellers increase the price until buyers are not willing to buy). Being at an equilibrium doesn't imply that supply, demand, and price won't change over time. Economists are very familiar with growth and various kinds of dynamic equilibria.
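To make the "equilibrium at each instant, but changing over time" point concrete, here is a minimal sketch. It assumes a toy linear market (demand Q = a - b*P, supply Q = c + d*P) with made-up numbers; the outward supply shift is only a stand-in for something like cheaper AGI labor, not a model of it.

```python
def equilibrium(a, b, c, d):
    """Market-clearing point for demand Q = a - b*P and supply Q = c + d*P.

    All four parameters are hypothetical; returns (price, quantity).
    """
    price = (a - c) / (b + d)   # set demand equal to supply and solve for P
    quantity = a - b * price    # plug the price back into the demand curve
    return price, quantity


# Assumed supply intercepts for four consecutive periods (supply shifts out).
supply_shifts = [10, 30, 50, 70]

path = [equilibrium(a=100, b=2, c=c, d=3) for c in supply_shifts]

for period, (price, quantity) in enumerate(path):
    print(f"period {period}: price={price:.1f}, quantity={quantity:.1f}")
```

In every period the market clears, yet the clearing price falls and the quantity rises from one period to the next, which is all that "dynamic equilibrium" needs to mean here.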

Equilibria aside, it is an interesting point that AGI combines aspects of both labor and capital in novel ways. Being able to both replicate and work in autonomous ways could create very interesting feedback loops.

Still, there could be limits and negative feedback to the feedback loops you point out. The ideas that labor adds value and that costs go down with scale are usually true but not universal. Things like resource scarcity or coordination problems can cause increasing marginal cost with scale. If there are very powerful AGI and very fast takeoffs, I expect resource scarcity to be a constraint.

I agree that AGI could break usual intuitions about capital and labor. However, I don’t think this is misleading economists. I think economists don’t consider AGI launching coups or pursuing jobs/entrepreneurship independently because they don’t expect it to have those capabilities or dispositions, not because they conflate it with inanimate capital. Even in the linked post, Tyler Cowen says that “I don’t think the economics of AI are well-defined by either “an increase in labor supply,” “an increase in TFP,” or “an increase in capital,” though it is some of each of those.”

Lastly, I fully agree that GDP doesn’t capture everything of value: even now it completely misses value from free resources like Wikipedia and unpaid labor like housework, and it can underestimate the value of new technology. Still, if AGI transforms many industries, as it would likely need to in order to transform the world, real GDP would capture this.

All in all, I don’t think economics principles are misleading. Maybe Econ thinking will have to be expanded to deal with AGI. But right now, the difference between economists and LessWrongers comes down to what capabilities they expect AGI to have.

Distillation Robustifies Unlearning

Addie Foote4mo20

I see what you mean. I would have guessed that the unlearned model's behavior is meaningfully different from "produce noise on harmful else original". My guess is that the "noise if harmful" part is accurate, but the small differences in logits on non-harmful data are quite important. We didn't run experiments on this. It would be an interesting empirical question to answer!

Also, there could be some variation on how true this is between different unlearning methods. We did find that RMU+distillation was less robust in the arithmetic setting than the other initial unlearning methods.

Fwiw, I'm not sure that RMU is a better unlearning method than simpler alternatives. I think it might just appear better on WMDP because the WMDP datasets are very messy and don't isolate the capability well, which could be done better with a cleaned dataset. Then, the performance on the evaluation relies on unnecessary generalization.

Distillation Robustifies Unlearning

Addie Foote4mo26

Yeah, I totally agree that targeted noise is a promising direction! However, I wouldn't take the exact % of pretraining compute that we found as a hard number, but rather as a comparison between the different noise levels. I would guess labs may have better distillation techniques that could speed it up. It also seems possible that you could distill into a smaller model faster and still recover performance. This would require modifying the UNDO initialization (e.g. initializing the student as a noised version of the smaller model rather than the teacher), but it still seems possible. Also, in some cases labs already do distillation, and in these cases it would have a smaller added cost.
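As a rough illustration of what "a noised version of a model" could mean, here is a minimal sketch. It assumes the noising is an interpolation between the starting weights and random noise, controlled by a mixing weight `alpha` and a noise scale `beta`; the function name, the flat list of weights, and the Gaussian noise are all assumptions for illustration, not the exact procedure from the post.

```python
import random


def noised_init(weights, alpha, beta=1.0, seed=0):
    """Return (1 - alpha) * w + alpha * beta * noise for each weight w.

    alpha = 0 keeps the starting weights unchanged; alpha = 1 discards
    them entirely and leaves only (scaled) Gaussian noise.
    `weights` is a hypothetical flat list standing in for model parameters.
    """
    rng = random.Random(seed)
    return [(1 - alpha) * w + alpha * beta * rng.gauss(0.0, 1.0) for w in weights]


# Hypothetical weights of the model we start from (teacher or smaller model).
start_weights = [0.8, -1.2, 0.3, 2.0]

for alpha in (0.0, 0.5, 1.0):
    print(alpha, noised_init(start_weights, alpha))
```

Under this assumed form, swapping the teacher for a smaller model only changes what `start_weights` holds; the noise level being compared is `alpha`.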

Distillation Robustifies Unlearning

Addie Foote4mo20

Thanks for the comment!
I agree that exploring targeted noise is a very promising direction and could substantially speed up the method! Could you elaborate on what you mean about unlearning techniques during pretraining?

I don't think datafiltering+distillation is analogous to unlearning+distillation. During distillation, the student learns from the predictions of the teacher, not the data itself. The predictions can leak information about the undesired capability, even on data that is benign. In a preliminary experiment, we found that datafiltering+distillation was ineffective in a TinyStories setting, but more effective in the language setting (see this comment). It's possible that real-world applications differ from the former setting. Maybe the context in which information about the forget capabilities is revealed is always different/identifiable, in which case datafiltering+distillation would be effective, but I expect this isn't usually the case.

As a concrete example, let's say we want to unlearn the following fact:
The company x data center is in location y.
We filter all of the sentences that give information about the datacenter in location y, but there still is a benign sentence that says:
The company x data center is in location z.
Given the teacher model knows about data centers in location y and z, the teacher will have high probabilities on logits y and z, and the student will learn about both data centers.
Maybe there's a way to have a classifier that predicts whether the teacher model will reveal any information about the forget capability, but it seems a bit complicated by the fact that you can't just look at the top logit.
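The data center example can be sketched numerically. This is a toy illustration, not one of our experiments: the three-token vocabulary, the teacher probabilities, and the training loop are all made up. It compares a student fit to the teacher's soft predictions on the benign sentence against a student fit only to the benign label z.

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical next-token vocabulary after
# "The company x data center is in location ..."
VOCAB = ["y", "z", "other"]

# Assumed teacher prediction on the benign sentence (true next token: z).
# The teacher knows both facts, so it also puts a lot of mass on y.
teacher_probs = [0.45, 0.50, 0.05]


def fit_student(targets, steps=500, lr=0.5):
    """Gradient descent on cross-entropy between student and target distribution."""
    logits = [0.0] * len(targets)
    for _ in range(steps):
        probs = softmax(logits)
        # Gradient of cross-entropy w.r.t. the logits is (student - target).
        logits = [l - lr * (p - t) for l, p, t in zip(logits, probs, targets)]
    return softmax(logits)


# Distillation: the student imitates the teacher's full distribution.
student_distilled = fit_student(teacher_probs)

# Training on the filtered data itself: the only label ever seen is z.
one_hot_z = [1.0 if tok == "z" else 0.0 for tok in VOCAB]
student_on_data = fit_student(one_hot_z)

print("P(y) after distillation:", round(student_distilled[0], 3))
print("P(y) after training on filtered data:", round(student_on_data[0], 3))
```

The distilled student ends up assigning y roughly the teacher's probability even though no sentence about location y was ever in its training data, whereas the student trained on the filtered labels drives P(y) toward zero.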

I do think unlearning+distillation is conceptually analogous to datafiltering+pretraining. However, I think there are practical differences, including the following:

With Unlearn and Distill, it's easier/cheaper to accurately control end behavior
- You can do many tries at the initial unlearning until it is satisfactory and expect the distilled student to behave like the teacher.
- With datafiltering+pretraining, you don't get to see how the model will perform until it's trained.
  - You can do many tries of training a classifier, but it's unclear what the ideal classifier would be.
  - It may be possible to learn undesired capabilities from a combination of seemingly benign data.
The costs probably differ
- With datafiltering+pretraining, you can probably use a smaller model as a classifier (or even just heuristics) so you remove the cost of distilling but add the cost of applying this classifier to the pretraining corpus.
- In practice, I'm not sure how expensive distillation is compared to pretraining.
- Distillation may already be a part of the pipeline in order to get a smaller, faster model, so unlearning beforehand may not add much extra cost.
