All of DanielFilan's Comments + Replies

Knowledge is not just mutual information

Seems like the solution should perhaps be that you should only take 'the system' to be the 'controllable' physical variables, or those variables that are relevant for 'consequential' behaviour? Hopefully, if one can provide good definitions for these, it will provide a foundation for saying what the abstractions should be that let us distinguish between 'high-level' and 'low-level' behaviour.

Challenge: know everything that the best go bot knows about go

Ah, understood. I think this is basically covered by talking about what the go bot knows at various points in time, a la this comment - it seems pretty sensible to me to talk about knowledge as a property of the actual computation rather than the algorithm as a whole. But from your response there it seems that you think that this sense isn't really well-defined.

Richard Ngo: I'm not sure what you mean by "actual computation rather than the algorithm as a whole". I thought that I was talking about the knowledge of the trained model which actually does the "computation" of which move to play, and you were talking about the knowledge of the algorithm as a whole (i.e. the trained model plus the optimising bot).
Challenge: know everything that the best go bot knows about go

Actually, hmm. My thoughts are not really in equilibrium here.

AXRP Episode 7 - Side Effects with Victoria Krakovna

Not sure what the actual sentence you wanted to write was. "are not absolutely necessary" maybe?

You're quite right, let me fix that.

DanielFilan: And also thanks for your kind words :)
Challenge: know everything that the best go bot knows about go

(Also: such a rewrite would be a combination of 'what I really meant' and 'what the comments made me realize I should have really meant')

Challenge: know everything that the best go bot knows about go

OK, the parenthetical helped me understand where you're coming from. I think a re-write of this post should (in part) make clear that I think a massive heroic effort would be necessary to make this happen, but sometimes massive heroic efforts work, and I have no special private info that makes it seem more plausible than it looks a priori.

DanielFilan: Actually, hmm. My thoughts are not really in equilibrium here.
DanielFilan: (Also: such a rewrite would be a combination of 'what I really meant' and 'what the comments made me realize I should have really meant')
Challenge: know everything that the best go bot knows about go

In the parent, is your objection that the trained AlphaZero-like model plausibly knows nothing at all?

Richard Ngo: The trained AlphaZero model knows lots of things about Go, in a comparable way to how a dog knows lots of things about running. But the algorithm that gives rise to that model can know arbitrarily few things. (After all, the laws of physics gave rise to us, but they know nothing at all.)
Challenge: know everything that the best go bot knows about go

Suppose you have a computer program that gets two neural networks, simulates a game of go between them, determines the winner, and uses the outcome to modify the neural networks. It seems to me that this program has a model of the 'go world', i.e. a simulator, and from that model you can fairly easily extract the rules and winning condition. Do you think that this is a model but not a mental model, or that it's too exact to count as a model, or something else?
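(A minimal sketch of the kind of program described here; the helper bodies are placeholders, and the point is just that the rules and win condition live inside the simulator, which is why one could extract them from the program:)

```python
import random

def simulate_game(net_a, net_b):
    """Placeholder go simulator: this is the program's model of the 'go world'.
    In a real system the rules and the win condition would be encoded here."""
    return random.choice(["a", "b"])  # stand-in for actually playing out a game

def update(net, won):
    """Placeholder for modifying a network's parameters based on the outcome."""
    return net

def train_by_self_play(net_a, net_b, n_games=1000):
    for _ in range(n_games):
        winner = simulate_game(net_a, net_b)
        net_a = update(net_a, won=(winner == "a"))
        net_b = update(net_b, won=(winner == "b"))
    return net_a, net_b
```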

Richard Ngo: I'd say that this is too simple and programmatic to be usefully described as a mental model. The amount of structure encoded in the computer program you describe is very small, compared with the amount of structure encoded in the neural networks themselves. (I agree that you can have arbitrarily simple models of very simple phenomena, but those aren't the types of models I'm interested in here. I care about models which have some level of flexibility and generality, otherwise you can come up with dumb counterexamples like rocks "knowing" the laws of physics.)

As another analogy: would you say that the quicksort algorithm "knows" how to sort lists? I wouldn't, because you can instead just say that the quicksort algorithm sorts lists, which conveys more information (because it avoids anthropomorphic implications). Similarly, the program you describe builds networks that are good at Go, and does so by making use of the rules of Go, but can't do the sort of additional processing with respect to those rules which would make me want to talk about its knowledge of Go.
Challenge: know everything that the best go bot knows about go

I think there's some communication failure where people are very skeptical of this for reasons that they think are obvious given what they're saying, but which are not obvious to me. Can people tell me which subset of the below claims they agree with, if any? Also if you come up with slight variants that you agree with that would be appreciated.

  1. It is approximately impossible to succeed at this challenge.
  2. It is possible to be confident that advanced AGI systems will not pose an existential threat without being able to succeed at this challenge.
  3. It is not
…
Adam Shimi: My take is:

  • I think making this post was a good idea. I'm personally interested in deconfusing the topic of universality (which should basically capture what "learning everything the model knows" means), and you brought up a good "simple" example to try to build intuition on.
  • What I would call your mistake is mostly an 8, but a bit of the related ones (so 3 and 4?). Phrasing it as "can we do that" is a mistake in my opinion because the topic is very confused (as shown by the comments). On the other hand, I think asking the question of what it would mean is a very exciting problem. It also gives a more concrete form to the problem of deconfusing universality, which is important AFAIK to Paul's approaches to alignment.
Challenge: know everything that the best go bot knows about go

I'd also be happy with an inexact description of what the bot will do in response to specified strategies that captured all the relevant details.

Challenge: know everything that the best go bot knows about go

I think that it isn't clear what constitutes "fully understanding" an algorithm.

That seems right.

Another obstacle to full understanding is memory. Suppose your go bot has memorized a huge list of "if you are in such and such situation move here" type rules.

I think there's reason to believe that SGD doesn't do exactly this (nets that memorize random data have different learning curves than normal nets iirc?), and better reason to think it's possible to train a top go bot that doesn't do this.
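(A rough sketch of the kind of random-label comparison alluded to here, in the spirit of the "rethinking generalization" experiments; the dataset, architecture, and iteration counts are illustrative choices rather than the setup from any particular paper:)

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
y_random = np.random.default_rng(0).permutation(y)  # destroy the input-label relationship

for name, labels in [("true labels", y), ("random labels", y_random)]:
    net = MLPClassifier(hidden_layer_sizes=(256,), max_iter=300, random_state=0)
    net.fit(X, labels)
    # loss_curve_ holds the per-iteration training loss; fitting random labels
    # (pure memorization) typically descends more slowly and with a different shape.
    print(name, [round(loss, 3) for loss in net.loss_curve_[::30]])
```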

There is not in general a way to compute what an algorith

…
DanielFilan: I'd also be happy with an inexact description of what the bot will do in response to specified strategies that captured all the relevant details.
Challenge: know everything that the best go bot knows about go

Hmmm. It does seem like I should probably rewrite this post. But to clarify things in the meantime:

  • it's not obvious to me that this is a realistic target, and I'd be surprised if it took fewer than 10 person-years to achieve.
  • I do think the knowledge should 'cover' all the athlete's ingrained instincts in your example, but I think the propositions are allowed to look like "it's a good idea to do x in case y".
Richard Ngo: Perhaps I should instead have said: it'd be good to explain to people why this might be a useful/realistic target. Because if you need propositions that cover all the instincts, then it seems like you're basically asking for people to revive GOFAI.

(I'm being unusually critical of your post because it seems that a number of safety research agendas lately have become very reliant on highly optimistic expectations about progress on interpretability, so I want to make sure that people are forced to defend that assumption rather than starting an information cascade.)
Challenge: know everything that the best go bot knows about go

On that definition, how does one train an AlphaZero-like algorithm without knowing the rules of the game and win condition?

Richard Ngo: The human knows the rules and the win condition. The optimisation algorithm doesn't, for the same reason that evolution doesn't "know" what dying is: neither are the types of entities to which you should ascribe knowledge.
Challenge: know everything that the best go bot knows about go

Perhaps the bot knows different things at different times and your job is to figure out (a) what it always knows and (b) a way to quickly find out everything it knows at a certain point in time.

I think at this point you've pushed the word "know" to a point where it's not very well-defined; I'd encourage you to try to restate the original post while tabooing that word.

This seems particularly valuable because there are some versions of "know" for which the goal of knowing everything a complex model knows seems wildly unmanageable (for example, trying to convert a human athlete's ingrained instincts into a set of propositions). So before people start trying to do what you suggested, it'd be good to explain why it's actually a realistic target.

Challenge: know everything that the best go bot knows about go

Also it certainly knows the rules of go and the win condition.

Richard Ngo: As an additional reason for the importance of tabooing "know", note that I disagree with all three of your claims about what the model "knows" in this comment and its parent. (The definition of "know" I'm using is something like "knowing X means possessing a mental model which corresponds fairly well to reality, from which X can be fairly easily extracted".)
Challenge: know everything that the best go bot knows about go

But once you let it do more computation, then it doesn't have to know anything at all, right? Like, maybe the best go bot is, "Train an AlphaZero-like algorithm for a million years, and then use it to play."

I would say that bot knows what the trained AlphaZero-like model knows.

DanielFilan: Also it certainly knows the rules of go and the win condition.
Challenge: know everything that the best go bot knows about go

Maybe it nearly suffices to get a go professional to know everything about go that the bot does? I bet they could.

Adam Shimi: What does that mean though? If you give the go professional a massive transcript of the bot's knowledge, it's probably unusable. I think what the go professional gives you is the knowledge of where to look/what to ask for/what to search.
Challenge: know everything that the best go bot knows about go

[D]oes understanding the go bot in your sense imply that you could play an even game against it?

I imagine so. One complication is that it can do more computation than you.

ESRogs: But once you let it do more computation, then it doesn't have to know anything at all, right? Like, maybe the best go bot is, "Train an AlphaZero-like algorithm for a million years, and then use it to play." I know more about go than that bot starts out knowing, but less than it will know after it does computation. I wonder if, when you use the word "know", you mean some kind of distilled, compressed, easily explained knowledge?
Challenge: know everything that the best go bot knows about go

You could plausibly play an even game against a go bot without knowing everything it knows.

weathersystems: Sure. But the question is can you know everything it knows and not be as good as it? That is, does understanding the go bot in your sense imply that you could play an even game against it?
Mundane solutions to exotic problems

FYI: I would find it useful if you said somewhere what 'epistemic competitiveness' means and linked to it when using the term.

Adam Shimi: I assume the right pointer is ascription universality [https://ai-alignment.com/towards-formalizing-universality-409ab893a456].
AMA: Paul Christiano, alignment researcher

I guess I feel like we're in a domain where some people were like "we have concretely-specifiable tasks, intelligence is good, what if we figured out how to create artificial intelligence to do those tasks", which is the sort of thing that someone trying to do good for the world would do, but had some serious chance of being very bad for the world. So in that domain, it seems to me that we should keep our eyes out for things that might be really bad for the world, because all the things in that domain are kind of similar.

That being said, I agree that the possi…

Paul Christiano: I think it's good to sometimes meditate on whether you are making the world worse (and get others' advice), and I'd more often recommend it for crowds other than EA and certainly wouldn't discourage people from doing it sometimes.

I'm sympathetic to arguments that you should be super paranoid in domains like biosecurity since it honestly does seem asymmetrically easier to make things worse rather than better. But when people talk about it in the context of e.g. AI or policy interventions or gathering better knowledge about the world that might also have some negative side-effects, I often feel like there's little chance that predictable negative effects they are imagining loom large in the cost-benefit unless the whole thing is predictably pointless. Which isn't a reason not to consider those effects, just a push-back against the conclusion (and a heuristic push-back against the state of affairs where people are paralyzed by the possibility of negative consequences based on kind of tentative arguments).

For advancing or deploying AI I generally have an attitude like "Even if actively trying to push the field forward full-time I'd be a small part of that effort, whereas I'm a much larger fraction of the stuff-that-we-would-be-sad-about-not-happening-if-the-field-went-faster, and I'm not trying to push the field forward," so while I'm on board with being particularly attentive to harms if you're in a field you think can easily cause massive harms, in this case I feel pretty comfortable about the expected cost-benefit unless alignment work isn't really helping much (in which case I have more important reasons not to work on it). I would feel differently about this if pushing AI faster was net bad on e.g. some common-sense perspective on which alignment was not very helpful, but I feel like I've engaged enough with those perspectives to be mostly not having it.
AMA: Paul Christiano, alignment researcher

What's the largest cardinal whose existence you feel comfortable with assuming as an axiom?

Paul Christiano: I'm pretty comfortable working with strong axioms. But in terms of "would actually blow my mind if it turned out not to be consistent," I guess alpha-inaccessible cardinals for any concrete alpha? Beyond that I don't really know enough set theory to have my mind blown.
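(For reference, one common convention for the inaccessibility hierarchy being gestured at here; definitions vary slightly between sources:)

```latex
% One common convention (definitions differ slightly between sources):
\begin{align*}
\kappa \text{ is } 0\text{-inaccessible} &\iff \kappa \text{ is strongly inaccessible (uncountable, regular, strong limit)},\\
\kappa \text{ is } (\alpha+1)\text{-inaccessible} &\iff \kappa \text{ is inaccessible and the } \alpha\text{-inaccessibles below } \kappa \text{ are unbounded in } \kappa,\\
\kappa \text{ is } \lambda\text{-inaccessible (}\lambda \text{ limit)} &\iff \kappa \text{ is } \alpha\text{-inaccessible for every } \alpha < \lambda.
\end{align*}
```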
AMA: Paul Christiano, alignment researcher

How many hours per week should the average AI alignment researcher spend on improving their rationality? How should they spend those hours?

I probably wouldn't set aside hours for improving rationality (/ am not exactly sure what it would entail). Seems generally good to go out of your way to do things right, to reflect on lessons learned from the things you did, to be willing to do (and slightly overinvest in) things that are currently hard in order to get better, and so on. Maybe I'd say that like 5-10% of time should be explicitly set aside for activities that just don't really move you forward (like post-mortems or reflecting on how things are going in a way that's clearly not going to pay…

I want to know the answer to this question, but for the 'peak' alignment researcher.

AMA: Paul Christiano, alignment researcher

What's the optimal ratio of researchers to support staff in an AI alignment research organization?

Paul Christiano: I guess it depends a lot on what the organization is doing and how exactly we classify "support staff." For my part I'm reasonably enthusiastic about eventually hiring people who are engaged in research but whose main role is more like clarifying, communicating, engaging with outside world, prioritizing, etc., and I could imagine doing like 25-50% as much of that kind of work as we do of frontier-pushing? I don't know whether you'd classify those people as researchers (though I probably wouldn't call it "support" since that seems to kind of minimize the work).

Once you are relying on lots of computers, that's a whole different category of work and I'm not sure what the right way of organizing that is or what we'd call support.

In terms of things like fundraising, accounting, supporting hiring processes, making payroll and benefits, budgeting, leasing and maintaining office space, dealing with the IRS, discharging legal obligations of employers, immigration, purchasing food, etc.... I'd guess it's very similar to other research organizations with similar salaries. I'm very ignorant about all of this stuff (I expect to learn a lot about it) but I'd guess that depending on details it ends up being 10-20% of staff. But it could go way lower if you outsource a lot to external vendors rather than in-house. (And if you organize a lot of events then that kind of work could just grow basically without bound and in that case I'd again wonder if "support" is the right word.)
AMA: Paul Christiano, alignment researcher

What's your favourite mathematical object? What's your least favourite mathematical object?

Paul Christiano: Favorite: Irit Dinur's PCP for constraint satisfaction [http://www.wisdom.weizmann.ac.il/~dinuri/mypapers/combpcp.pdf]. What a proof system. If you want to be more pure, and consider the mathematical objects that are found rather than built, maybe the monster group [https://en.wikipedia.org/wiki/Monster_group]? (As a layperson I can't appreciate the full extent of what's going on, and like most people I only really know about it second-hand, but its existence seems like a crazy and beautiful fact about the world.)

Least favorite: I don't know, maybe Chaitin's constant?
AMA: Paul Christiano, alignment researcher

Should more AI alignment researchers run AMAs?

Paul Christiano: Dunno, would be nice to figure out how useful this AMA was for other people. My guess is that they should at some rate/scale (in combination with other approaches like going on a podcast or writing papers or writing informal blog posts), and the question is how much communication like that to do in an absolute sense and how much should be AMAs vs other things.

Maybe I'd guess that typically like 1% of public communication should be something like an AMA, and that something like 5-10% of researcher time should be public communication (though as mentioned in another comment you might have some specialization there which would cut it down, though I think that the AMA format is less likely to be split off, though that might be an argument for doing less AMA-like stuff and more stuff that gets split off...). So that would suggest like 0.05-0.1% of time on AMA-like activities. If the typical one takes a full-time-day-equivalent, then that's like doing one every 2 years, which I guess would be way more AMAs than we have. This AMA is more like a full-time day so maybe every 4 years? That feels a bit like an overestimate, but overall I'd guess that it would be good on the margin for there to be more alignment researcher AMAs. (But I'm not sure if AMAs are the best AMA-like thing.)

In general I think that talking with other researchers and practitioners 1:1 is way more valuable than broadcast communication.
AMA: Paul Christiano, alignment researcher

Should more AI alignment research be communicated in book form? Relatedly, what medium of research communication is most under-utilized by the AI alignment community?

I think it would be good to get more arguments and ideas pinned down, explained carefully, collected in one place. I think books may be a reasonable format for that, though man they take a long time to write.

I don't know what medium is most under-utilized.

AMA: Paul Christiano, alignment researcher

That's not the AXRP question I'm too polite to ask.

Ben Pace: Paul, if you did an episode of AXRP, which two other AXRP episodes do you expect your podcast would be between, in terms of quality? For this question, collapse all aspects of quality into a scalar.
AMA: Paul Christiano, alignment researcher

Should marginal CHAI PhD graduates who are dispositionally indifferent between the two options try to become a professor or do research outside of universities?

Not sure. If you don't want to train students, seems to me like you should be outside of a university. If you do want to train students it's less clear and maybe depends on what you want to do (and given that students vary in what they are looking for, this is probably locally self-correcting if too many people go one way or the other). I'd certainly lean away from university for the kinds of work that I want to do, or for the kinds of things that involve aligning large ML systems (which benefit from some connection to customers and resources).

AMA: Paul Christiano, alignment researcher

What mechanisms could effective altruists adopt to improve the way AI alignment research is funded?

Long run I'd prefer something like altruistic equity / certificates of impact. But frankly I don't think we have hard enough funding coordination problems that it's going to be worth figuring that kind of thing out.

(And like every other community we are free-riders---I think that most of the value of experimenting with such systems would accrue to other people who can copy you if successful, and we are just too focused on helping with AI alignment to contribute to that kind of altruistic public good. If only someone would be willing to purchase …

AMA: Paul Christiano, alignment researcher

Why aren't impact certificates a bigger deal?

Paul Christiano: Change is slow and hard and usually driven by organic changes rather than clever ideas, and I expect it to be the same here. In terms of why the idea is actually just not that big a deal, I think the big thing is that altruistic projects often do benefit hugely from not needing to do explicit credit attribution. So that's a real cost. (It's also a cost for for-profit businesses, leading to lots of acrimony and bargaining losses.) They also aren't quite consistent with moral public goods [https://www.lesswrong.com/posts/pqKwra9rRYYMvySHc/moral-public-goods] / donation-matching, which might be handled better by a messy status quo, and I think that's a long-term problem though probably not as big as the other issues.
AMA: Paul Christiano, alignment researcher

How many ideas of the same size as "maybe a piecewise linear non-linearity would work better than a sigmoid for not having vanishing gradients" are we away from knowing how to build human-level AI technology?

I think it's >50% chance that ideas like ReLUs or soft attention are best thought of as multiplicative improvements on top of hardware progress (as are many other ideas like auxiliary objectives, objectives that better capture relevant tasks, infrastructure for training more efficiently, dense datasets, etc.), because the basic approach of "optimize for a task that requires cognitive competence" will eventually yield human-level competence. In that sense I think the answer is probably 0.

Maybe my median number of OOMs left before human-level intelligence, …

AMA: Paul Christiano, alignment researcher

How many ideas of the same size as "maybe we could use inverse reinforcement learning to learn human values" are we away from knowing how to knowably and reliably build human-level AI technology that wouldn't cause something comparably bad as human extinction?

A lot of this is going to come down to estimates of the denominator. 

(I mostly just think that you might as well just ask people "Is this good?" rather than trying to use a more sophisticated form of IRL---in particular I don't think that realistic versions of IRL will successfully address the cases where people err in answering the "is it good?" question, that directly asking is more straightforward in many important ways, and that we should mostly just try to directly empower people to give better answers to such questions.)

Anyway, with that caveat …

AMA: Paul Christiano, alignment researcher

How many new blogs do you anticipate creating in the next 5 years?

Paul Christiano: I've created 3 blogs in the last 10 years and 1 blog in the preceding 5 years. It seems like 1-2 is a good guess. (A lot depends on whether there ends up being an ARC blog or it just inherits ai-alignment.com.)
AMA: Paul Christiano, alignment researcher

If a 17-year-old wanted to become the next Paul Christiano, what should they do?

AMA: Paul Christiano, alignment researcher

What is the Paul Christiano production function?

AMA: Paul Christiano, alignment researcher

How will we know when it's not worth getting more people to work on reducing existential risk from AI?

Paul Christiano: We'll do the cost-benefit analysis and over time it will look like a good career for a smaller and smaller fraction of people (until eventually basically everyone for whom it looks like a good idea is already doing it). That could kind of qualitatively look like "something else is more important," or "things kind of seem under control and it's getting crowded," or "there's no longer enough money to fund scaleup."

Of those, I expect "something else is more important" to be the first to go (though it depends a bit on how broadly you interpret "from AI," if anything related to the singularity / radically accelerating growth is classified as "from AI" then it may be a core part of the EA careers shtick kind of indefinitely, with most of the action in which of the many crazy new aspects of the world people are engaging with).
AMA: Paul Christiano, alignment researcher

What's the most important thing that AI alignment researchers have learned in the past 10 years? Also, that question but excluding things you came up with.

"Thing" is tricky. Maybe something like the set of intuitions and arguments we have around learned optimizers, i.e. the basic argument that ML will likely produce a system that is "trying" to do something, and that it can end up performing well on the training distribution regardless of what it is "trying" to do (and this is easier the more capable and knowledgeable it is). I don't think we really know much about what's going on here, but I do think it's an important failure to be aware of and at least folks are looking for it now. So I do think that if it... (read more)

AMA: Paul Christiano, alignment researcher

What is the most common wrong research-relevant intuition among AI alignment researchers?

Does the lottery ticket hypothesis suggest the scaling hypothesis?

Ah, to be clear, I am entirely basing my comments off of reading the abstracts (and skimming the multi-prize paper with an eye one develops after having been an ML PhD student for mumbles indistinctly years).

Does the lottery ticket hypothesis suggest the scaling hypothesis?

Oh here's where I think things went wrong:

Part of why I think the two tickets are the same is that the at-initialization ticket is found by taking the after-training ticket and rewinding it to the beginning!

This is true in the original LTH paper, but there the "at-initialization ticket" doesn't actually perform well: it's just easy to train to high performance.

In the multi-prize LTH paper, it is the case that the "at-initialization ticket" performs well, but they don't find it by winding back the weights of a trained pruned network.
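(For contrast, here is a schematic of the original LTH "find the ticket by rewinding" procedure; the train/accuracy functions are hypothetical stand-ins, and only the prune-then-rewind logic reflects the procedure being discussed:)

```python
import numpy as np

def train(weights, mask):
    """Hypothetical stand-in for SGD training; keeps pruned entries at zero."""
    return (weights + np.random.randn(*weights.shape)) * mask

def accuracy(weights):
    """Hypothetical stand-in for evaluating the masked network."""
    return float(np.mean(np.abs(weights)))

def find_winning_ticket(init_weights, prune_fraction=0.8):
    mask = np.ones_like(init_weights)
    trained = train(init_weights, mask)

    # Prune the smallest-magnitude weights of the *trained* network...
    threshold = np.quantile(np.abs(trained), prune_fraction)
    mask = (np.abs(trained) > threshold).astype(float)

    # ...then rewind the surviving weights to their values at initialization.
    ticket_at_init = init_weights * mask

    # Original-LTH claim: `ticket_at_init` is not yet a good network by itself,
    # but it retrains to high accuracy despite its sparsity.
    print("untrained ticket:", accuracy(ticket_at_init))
    print("retrained ticket:", accuracy(train(ticket_at_init, mask)))
    return ticket_at_init, mask

find_winning_ticket(np.random.randn(32, 32))
```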

If you got multi-pri…

Daniel Kokotajlo: OH this indeed changes everything (about what I had been thinking), thank you! I shall have to puzzle over these ideas some more then, and probably read the multi-prize paper more closely (I only skimmed it earlier).
Does the lottery ticket hypothesis suggest the scaling hypothesis?

I guess I'm imagining that 'by default', your distribution over which optimum SGD reaches should be basically uniform, and you need a convincing story to end up believing that it reliably gets to one specific optimum.

So for them not to be the same, the training process would need to kill the first ticket and then build a new ticket on exactly the same spot!

Yes, that's exactly what I think happens. Training takes a long time, and I expect the weights in a 'ticket' to change based on the weights of the rest of the network (since those other weights have …

Does the lottery ticket hypothesis suggest the scaling hypothesis?

I expect that there are probably a bunch of different neural networks that perform well at a given task. We sort of know this because you can train a dense neural network to high accuracy, and also prune it to get a definitely-different neural network that also has high accuracy. Is it the case that these sparse architectures are small enough that there's only one optimum? Maybe, but IDK why I'd expect that.

Daniel Kokotajlo: Whoa, the thing you are arguing against is not at all what I had been saying -- but maybe it was implied by what I was saying and I just didn't realize it? I totally agree that there are many optima, not just one. Maybe we are talking past each other?

(Part of why I think the two tickets are the same is that the at-initialization ticket is found by taking the after-training ticket and rewinding it to the beginning! So for them not to be the same, the training process would need to kill the first ticket and then build a new ticket on exactly the same spot!)
Does the lottery ticket hypothesis suggest the scaling hypothesis?

Re: reinforcing the winning tickets: Isn't that implied? If it's not implied, would you not agree that it is happening?

I don't think it's implied, and I'm not confident that it's happening. There are lots of neural networks!

Daniel Kokotajlo: Hmmm, ok. Can you say more about why? Isn't the simplest explanation that the two tickets are the same?
Does the lottery ticket hypothesis suggest the scaling hypothesis?

None of those quotes claim that training just reinforces the 'winning tickets'. Also those are referred to as the "strong" or "multi-ticket" LTH.

Daniel Kokotajlo: Yeah, fair enough. I should amend the title of the question.

Re: reinforcing the winning tickets: Isn't that implied? If it's not implied, would you not agree that it is happening? Plausibly, if there is a ticket at the beginning that does well at the task, and a ticket at the end that does well at the task, it's reasonable to think that it's the same ticket? Idk, I'm open to alternative suggestions now that you mention it...
Evan Hubinger: *multi-prize
Does the lottery ticket hypothesis suggest the scaling hypothesis?

Yep, I agree that this question does not accurately describe the lottery ticket hypothesis.

Daniel Kokotajlo: The original paper doesn't demonstrate this but later papers do, or at least claim to. Here are several papers with quotes:

https://arxiv.org/abs/2103.09377: "In this paper, we propose (and prove) a stronger Multi-Prize Lottery Ticket Hypothesis: A sufficiently over-parameterized neural network with random weights contains several subnetworks (winning tickets) that (a) have comparable accuracy to a dense target network with learned weights (prize 1), (b) do not require any further training to achieve prize 1 (prize 2), and (c) is robust to extreme forms of quantization (i.e., binary weights and/or activation) (prize 3)."

https://arxiv.org/abs/2006.12156: "An even stronger conjecture has been proven recently: Every sufficiently overparameterized network contains a subnetwork that, at random initialization, but without training, achieves comparable accuracy to the trained large network."

https://arxiv.org/abs/2006.07990: "The strong lottery ticket hypothesis (LTH) postulates that one can approximate any target neural network by only pruning the weights of a sufficiently over-parameterized random network. A recent work by Malach et al. establishes the first theoretical analysis for the strong LTH: one can provably approximate a neural network of width d and depth l, by pruning a random one that is a factor O(d^4 l^2) wider and twice as deep. This polynomial over-parameterization requirement is at odds with recent experimental research that achieves good approximation with networks that are a small factor wider than the target. In this work, we close the gap and offer an exponential improvement to the over-parameterization requirement for the existence of lottery tickets. We show that any target network of width d and depth l can be approximated by pruning a random network that is a factor O(log(dl)) wider and twice as deep."
[AN #147]: An overview of the interpretability landscape

Additionally, in an intuitive sense, pruning a network seems as though it could be defined in terms of clusterability notions, which limits my enthusiasm for that result.

I see what you mean, but there exist things called expander graphs which are very sparse (i.e. very pruned) but minimally clusterable. Now, these don't have a topology compatible with being a neural network, but they are proofs of concept that you can prune without being clusterable. For more evidence, note that our pruned networks are more clusterable than if you permuted the weights randomly, that is, more clusterable than random pruned networks.
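(An illustrative check of the expander point, assuming networkx/scipy are available; the clusterability proxy used here, algebraic connectivity, is a convenient stand-in rather than the n-cut metric from the clusterability work:)

```python
import networkx as nx

# A sparse random regular graph is (with high probability) an expander:
# very few edges, yet no good way to cut it into clusters.
expander = nx.random_regular_graph(d=4, n=200, seed=0)

# By contrast, two dense blobs joined by a single bridge cluster trivially.
two_blobs = nx.disjoint_union(nx.gnp_random_graph(100, 0.1, seed=1),
                              nx.gnp_random_graph(100, 0.1, seed=2))
two_blobs.add_edge(0, 150)

for name, g in [("expander", expander), ("two blobs", two_blobs)]:
    # Higher algebraic connectivity ~ harder to partition into clusters.
    print(name, round(nx.algebraic_connectivity(g), 4))
```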
