All of Michaël Trazzi's Comments + Replies

Why Copilot Accelerates Timelines

Well, I agree that if the two worlds I had in mind were 1) foom without real AI progress beforehand, and 2) continuous progress, then seeing more continuous progress from increased investments should indeed update me towards 2).

The key parameter here is the substitutability between capital and labor: to what extent is human labor the bottleneck, rather than capital? From the substitutability equations you can infer different growth trajectories. (For a paper / video on this, see the last paragraph here.)
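
To make that parameter explicit (the equation below is my addition, not something spelled out above): the standard CES production function with elasticity of substitution \sigma between capital K and labor L is

    Y = \left[ \alpha K^{(\sigma-1)/\sigma} + (1-\alpha) L^{(\sigma-1)/\sigma} \right]^{\sigma/(\sigma-1)}.

With \sigma < 1, the scarcer factor (here human labor) bottlenecks output even as capital accumulates; with \sigma > 1, capital can substitute for labor and accumulation alone can sustain growth.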

The world in which dalle... (read more)

Why Copilot Accelerates Timelines

Thanks for the pointer. Any specific section / sub-section I should look into?

anonymousaisafety (24d):
Section 1, section 10, and section 11 cover the scenario of R&D automation via AI/ML systems that drive more productive R&D automation, resulting in a positive feedback loop, without requiring the typical "self-improving agent" -- it's the R&D system (people + AI/ML products) as a whole that is self-improving, not the individual AI/ML systems. I highly recommend reading the entire report though. It was released in 2019 and I think it was brushed aside a little bit too easily. The past 3 years have (in my mind) provided sufficient evidence of things that CAIS directly predicted would happen, e.g. all of the AI/ML systems we've developed recently that have reached super-human competency on tasks despite a lack of generalized learning or capabilities or "self-improvement" or other recognizably "general intelligence" / "agent"-like behavior. In 2019, we did not have Copilot, or DALL-E, or DALL-E-2, or AlphaFold, or DeepMind Ithaca, or GPT-3 -- etc. I talk about this a little bit in my comment here [https://www.lesswrong.com/posts/GkXKvkLAcTm5ackCq/intuitions-about-solving-hard-problems?commentId=s4BF79hCfjoHTJg9f] .
Why Copilot Accelerates Timelines

I agree that we are already in this regime. In the section "AI Helping Humans with AI" I tried to make more precise at what threshold we would see a substantial change in how humans interact with AI to build more advanced AI systems. Essentially, it will be when most people use those tools most of the time (on a daily basis, say) and observe substantial productivity gains (like using some oracle to make a lot of progress on a problem they are stuck on, or Copilot auto-completing a lot of their lines of code without having to man... (read more)

Why Copilot Accelerates Timelines

Some arguments for why that might be the case:

-- the more useful it is, the more people use it, the more telemetry data the model has access to

-- while scaling laws do not exhibit diminishing returns from scaling (the generic power-law form is sketched after this list), most of the development time would be spent on things like infrastructure, data collection and training, rather than on aiming for additional performance

-- the higher the performance, the more people get interested in the field and the more research there is publicly accessible to improve performance by just implementing what is in the literature (Note: ... (read more)
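
For reference, the generic power-law form of those scaling laws (Kaplan et al. 2020; form only, constants omitted) is

    L(N) \approx (N_c / N)^{\alpha_N},

i.e. test loss keeps falling smoothly as a power law in model size N (and analogously in data and compute) over the ranges studied, with no observed plateau.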

Matthew Barnett's Shortform

fast takeoff folks believe that we will only need a minimal seed AI that is capable of rewriting its source code, and recursively self-improving into superintelligence

Speaking only for myself, the minimal seed AI is a strawman of why I believe in "fast takeoff". In the list of benchmarks you mentioned in your bet, I think APPS is one of the most important.

I think the "self-improving" part will come from the system "AI Researchers + code synthesis model" with a direct feedback loop (modulo enough hardware), cf. here. That's the self-improving superintellige... (read more)

Forecasting Thread: AI Timelines

Yes, something like: given the growth in programmer-hours going into scaling between July 2020 and Jan 2022, and how much progress there has been on hardware for such training (I don't know the right metric for this, but probably something to do with FLOP and parallelization), the extrapolation to 2025 (either linear or exponential) would give the 4 OOMs you mentioned.
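
To make the extrapolation concrete, here is a toy sketch (the growth factor and the metric are placeholders, not numbers from this thread):

import math

# Hypothetical illustration: suppose the chosen metric (e.g. programmer-hours
# going into scaling) grew 10x between July 2020 and Jan 2022 (18 months).
growth_factor = 10.0                    # placeholder, not a measured value
period_months = 18
months_to_2025 = 12 * (2025 - 2020.5)   # July 2020 -> Jan 2025 = 54 months

# Exponential extrapolation: the same multiplicative growth every 18 months.
total_factor = growth_factor ** (months_to_2025 / period_months)
print(f"Implied growth by 2025: {total_factor:.0f}x "
      f"(~{math.log10(total_factor):.1f} OOMs)")

With these made-up numbers the answer is ~3 OOMs; whether the real metric supports 4 OOMs is exactly the open question here.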

Forecasting Thread: AI Timelines
You have to do lots of software engineering and for 4+ OOMs you literally need to build more chip fabs to produce more chips.

I have probably missed many considerations you have mentioned elsewhere, but in terms of software engineering, how do you think the "software production rate" for scaling up large models evolved from 2020 to late 2021? I don't see why we couldn't get 4 OOMs between 2020 and 2025.

If we just take the example of large LMs, we went from essentially 1-10 publicly known models in 2020 to 10-100 in 2021 (cf. China, Korea, Microsoft, DM, etc.), and ... (read more)

Daniel Kokotajlo (4mo):
If I understand you correctly, you are asking something like: How many programmer-hours of effort and/or how much money was being spent specifically on scaling up large models in 2020? What about in 2025? Is the latter plausibly 4 OOMs more than the former? (You need some sort of arbitrary cutoff for what counts as large. Let's say GPT-3 sized or bigger.) Yeah maybe, I don't know! I wish I did. It's totally plausible to me that it could be +4 OOMs in this metric by 2025. It's certainly been growing fast, and prior to GPT-3 there may not have been much of it at all.
Phil Trammell on Economic Growth Under Transformative AI

Among other things, Phil's literature review studies to what extent human labor will be a bottleneck for economic growth as AI substitutes for labor. I agree with you that AI-coding-AIs would have weird effects... but do you agree with the point that it won't be enough to sustain growth, or are you thinking about other paths where certain bottlenecks might not really be a problem?

Charlie Steiner (7mo):
I think that humans would still be necessary for human society for a reasonable amount of time (months or more) if things go well. If things don't go well, we're toast, which is a pretty big deviation from the economic model. But even if things go well, I think the presence of things like superhuman persuasion lead to a breakdown of assumptions behind normal economic behavior in humans, even in that period where human labor is still a cost-effective input to the (now superhumanly-planned) economy.
The Codex Skeptic FAQ

I created a class initializing the attributes you mentioned, and when adding your docstring to your function signature it gave me exactly the answer you were looking for. Note that it was all on the first try, and that I did not think at all about the initialization for components, marginalized or observed: I simply auto-completed.

from typing import Set

# Class initialized by hand; the docstring below is the prompt that Codex
# auto-completed from.
class Distribution:
    def __init__(self):
        self.components = []
        self.marginalized = None
        self.observed = None

    def unobserved(self) -> Set[str]:
        """Returns a set of all unobserved random variable names inside
        ... (read more)
The Codex Skeptic FAQ

Wait, did they forbid you from using it at all during work time, or did they forbid using its outputs for IT issues? Surely, using Codex for inspiration, given a natural language prompt, and looking at what functions it calls does not seem to infringe any copyright rules?

  • 1) If you start with your own variable names, it would auto-complete with those, maybe using something it learned online. Would that count as plagiarism in your sense? How would that differ from copy-pasting from Stack Overflow and changing the variable names? (I'm not an expert in SO copyright terms
... (read more)
The Codex Skeptic FAQ

The problem with arguing against that claim is that nobody knows whether transformers/scaling language models are sufficient for full code automation. To take your nootropics example, an analogy would be if nootropics were legal and had no negative side effects, with a single company giving "beta access" (for now) to a new nootropic in unlimited amounts at no cost to a market of tens of millions of users, if the data from using this nootropic were collected by the company to improve the product, and if there actually were 100k peer-reviewed publications p... (read more)

The Codex Skeptic FAQ

I buy that "generated code" will not add anything to the training set, and that Copilot doesn't help with having good data or (directly) better algorithms. However, the feedback loop I am pointing at is what happens when you accept suggestions on Copilot: I think it is learning from human feedback on which solutions people select. If the model is "finetuned" on a specific dev's coding style, I would expect Codex to suggest even better code (because of the high quality of the finetuning data) to someone at OAI than to me or you.
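
To make that feedback loop concrete, here is a minimal sketch (hypothetical names and pipeline; I don't know how Copilot's telemetry or finetuning actually works): acceptance is treated as implicit human feedback and accepted suggestions become finetuning pairs.

from dataclasses import dataclass
from typing import Dict, List

@dataclass
class SuggestionEvent:
    prompt: str       # code context shown to the model
    completion: str   # code it suggested
    accepted: bool    # did the developer keep the suggestion?

def build_finetuning_pairs(events: List[SuggestionEvent]) -> List[Dict[str, str]]:
    """Keep only accepted suggestions as (prompt, completion) training pairs,
    i.e. treat acceptance as implicit feedback on solution quality."""
    return [
        {"prompt": e.prompt, "completion": e.completion}
        for e in events
        if e.accepted
    ]

# A later finetuning run on these pairs would bias the model toward the kind
# of code that developers actually accept (and, if grouped per user or per
# codebase, toward that user's style).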

How much of this is 'quality of code' vs. 'quality of
... (read more)
Frequent arguments about alignment

Thanks for the post, it's a great idea to have both arguments.

My personal preference would be to have both arguments be the same length, to properly compare their strength (here the skeptic gets one paragraph while the advocate gets 3-6x more), and not always in the same order (skeptic then advocate), but also advocate -> skeptic, or even skeptic -> advocate -> skeptic -> ..., so it does not appear as though one side is the "haven't thought about it much" view.

Big picture of phasic dopamine

Right, I just googled Marblestone, so you're approaching it from the dopamine side and not the acetylcholine side. Without debating about words, their neuroscience paper is still at least trying to model the phasic dopamine signal as some RPE and the prefrontal network as an LSTM (IIRC), which is not acetylcholine-based. I haven't read this post & the one linked in detail; I'll comment again when I do, thanks!

Big picture of phasic dopamine

Awesome post! I happen to have also tried to distill the links between RPE and phasic dopamine in the "Prefrontal Cortex as a Meta-RL System" section of this blog.

In particular I reference this paper on DL in the brain & this other one on RL in the brain. Also, I feel like Part 3 of the RL book, on the links between RL and neuroscience, is a great resource for this.
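
For readers who want the equation behind "RPE" (standard TD-learning notation, not quoted from either blog post): phasic dopamine is hypothesized to encode the temporal-difference error

    \delta_t = r_t + \gamma V(s_{t+1}) - V(s_t),

the gap between the reward received plus the discounted value of the next state and the value predicted for the current state; dopamine bursts correspond to positive \delta_t and dips to negative \delta_t.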

Steve Byrnes (1y):
Thanks! If you Ctrl-F the post you'll find my little paragraph on how my take differs from Marblestone, Wayne, Kording 2016. I haven't found "meta-RL" to be a helpful way to frame either the bandit thing or the follow-up paper relating it to the brain, more-or-less for reasons here [https://www.lesswrong.com/posts/Wnqua6eQkewL3bqsF/matt-botvinick-on-the-spontaneous-emergence-of-learning?commentId=pYpPnAKrz64ptyRid]: the normal RL / POMDP expectation is that actions have to depend on previous observations (think of playing an Atari game), and I guess we can call that "learning", but then we have to say that a large fraction of every RL paper ever is actually a meta-RL paper. More importantly, I just don't find that thinking in those terms leads me to a better understanding of anything, but whatever, YMMV. I don't agree with everything in the RL book chapter but it's still interesting, thanks for the link.
Matt Botvinick on the spontaneous emergence of learning algorithms

Funnily enough, I wrote a blog post distilling what I learned from reproducing the experiments of that 2018 Nature paper, adding some animations and diagrams. I especially look at the two-step task and the Harlow task (the one with monkeys looking at a screen), and also try to explain some brain things (e.g. how DA interacts with the PFN) at the end.

OpenAI announces GPT-3

An HN comment, unsure about the meta-learning generalization claims, says that OpenAI has a "serious duty [...] to frame their results more carefully".

Ultra-simplified research agenda

Having printed and read the full version, I found this ultra-simplified version to be a useful summary.

Happy to read a (not-so-)simplified version (like 20-30 paragraphs).

Does that summarize your comment?

1. Proposals should make superintelligences less likely to fight you by using some conceptual insight true in most cases.
2. With CIRL, this insight is "we want the AI to actively cooperate with humans", so there's real value from it being formalized in a paper.
3. In the counterfactual paper, there's the insight "what if the AI thinks it's not on but still learns".
For the last bit, I have two interpretations:
4.a. However, it's unclear that this design avoids all manipulative behaviour
... (read more)
Alex Turner (3y):
It's more like 4a. The line of thinking seems useful, but I'm not sure that it lands.

The zero reward is in the paper. I agree that skipping would solve the problem. From talking to Stuart, my impression is that he thinks the zero reward would be equivalent to skipping as a way of specifying "no learning", or would just slow down learning. My disagreement is that I think it can confuse learning to the point of not learning the right thing.

Why not do a combination of pre-training and online learning, where you do enough during the training phase to get a useful predictor, and then use online learning to deal with subsequent distributional shifts?
... (read more)

The string is read with probability 1 - ε

Yes, if we choose the utility function to make it a CDT agent optimizing for the reward over one step (so a particular case of act-based agents), then it won't care about future versions of itself, nor want to escape.

I agree with the intuition of shutting down to make it episodic, but I am still confused about the causal relationship between "having the rule to shut down the system" and "having a current timestep maximizer". For it to really be a "current timestep maximizer", it needs to be in some kind of reward/utility function. Beca... (read more)

The Asymptotically Unambitious AGI thread helped me clarify my thoughts, thanks. I agree that an optimal CDT agent won't think about future versions, and I don't see any optimization pressure towards an escape message, nor towards disproportionately common "escape message" regions.

However, it still assumes we have access to this magic oracle that optimizes for 1_E · R, where E is the event where humans don't see the answer, 1_E its indicator function, and R the counterfactual reward (given by the automatic machine). If humans were able to build ... (read more)
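
In symbols (my notation; the exact formulation is in the counterfactual oracle setup being discussed), the objective assumed for that oracle is roughly

    \mathbb{E}[ \mathbb{1}_E \cdot R ],

where E is the erasure event in which humans never see the answer, \mathbb{1}_E its indicator function, and R the counterfactual reward computed by the automatic machine.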

Wei Dai (3y):
Why do we have to give the oracle a zero reward for the non-erasure episodes? Why not just skip the learning/update step for those episodes? Why not do a combination of pre-training and online learning, where you do enough during the training phase to get a useful predictor, and then use online learning to deal with subsequent distributional shifts? Although I guess that probably isn't really original either. What seems original is that during any episode where learning will take place, don't let humans (or any other system that might be insecure against the oracle) see the oracle's output until the episode is over.
Ryan Carey (3y):
The escape action being randomly called should not be a problem if it is a text string that is only read if r=1, and is ineffectual otherwise...
Corrigibility as Constrained Optimisation
Reply: The button is a communication link between the operator and the agent. In general, it is possible to construct an agent that shuts down even though it has received no such message from its operators, as well as an agent that does get a shutdown message but does not shut down. Shutdown is a state dependent on actions, and not a communication link.

This is very clear. "Communication link" made me understand that it didn't have a direct physical effect on the agent. If you want to make it even more intuitive you could do a diagram, but this explanatio... (read more)

Corrigibility as Constrained Optimisation

Layman questions:

1. I don't understand what you mean by "state" in "Suppose, however, that the AI lacked any capacity to press its shutdown button, or to indirectly control its state". Do you include its utility function in its state? Or just the observations it receives from the environment? What context/framework are you using?

2. Could you define U_S and U_N? From the Corrigibility paper, U_S appears to be a utility function favoring shutdown, and U_N is a potentially flawed utility function, a first stab at specifying their own ... (read more)
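
For context, the combined utility function in the Corrigibility paper roughly takes the following switching form (my paraphrase of Soares et al. 2015; see the paper for the exact correction term):

    U(a_1, o, a_2) =
    \begin{cases}
      U_N(a_1, o, a_2) & \text{if } o \neq \text{press} \\
      U_S(a_1, o, a_2) + f(a_1) & \text{if } o = \text{press}
    \end{cases}

where f is chosen so that the agent is indifferent to whether the shutdown button gets pressed.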

Henrik Åslund (3y):
Thank you so much for your comments, Michaël! The post has been updated on most of them. Here are some more specific replies.

1. I don't understand what you mean by "state" in "Suppose, however, that the AI lacked any capacity to press its shutdown button, or to indirectly control its state". Do you include its utility function in its state? Or just the observations it receives from the environment? What context/framework are you using?
Reply: "State" refers to the state of the button, i.e., whether it is in an on state or an off state. It is now clarified.

2. Could you define U_S and U_N? From the Corrigibility paper, U_S appears to be a utility function favoring shutdown, and U_N is a potentially flawed utility function, a first stab at specifying their own goals. Was that what you meant? I think it's useful to define it in the introduction.
Reply: U_{N} is assumed rather than defined, but it is now clarified.

3. I don't understand how an agent that "[lacks] any capacity to press its shutdown button" could have any shutdown ability. It seems like a contradiction, unless you mean "any capacity to directly press its shutdown button".
Reply: The button is a communication link between the operator and the agent. In general, it is possible to construct an agent that shuts down even though it has received no such message from its operators, as well as an agent that does get a shutdown message but does not shut down. Shutdown is a state dependent on actions, and not a communication link. Hopefully, this clarifies that they are uncorrelated. I think it's clear enough in the post already, but if you have some suggestion on how to clarify it even more, I'd gladly hear it!

4. What's the "default value function" and the "normal utility function" in "Optimisation incentive"? Is it clearly defined in the literature?
Reply: It is now clarified.

5. "Worse still... for any action..." -> if you choose b as some action with bad corrigibility property, it seems reasonable
Alignment Research Field Guide

Hey Abram (and the MIRI research team)!

This post resonates with me on so many levels. I vividly remember the Human-Aligned AI Summer School where you were a "receiver" and Vlad was a "transmitter" when talking about "optimizers". Your "document" especially resonates with my experience running an AI Safety Meetup (Paris AI Safety).

In January 2019, I organized a Meetup about "Deep RL from human preferences". Essentially, the resources were sorted by difficulty, so you could discuss the 80k podcast, the open A... (read more)