Oliver Habryka

Coding day in and out on LessWrong 2.0. You can reach me at habryka@lesswrong.com

Wiki Contributions


Relevant piece of data: https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/?fbclid=IwAR3KTBnxC_y7n0TkrCdcd63oBuwnu6wyXcDtb2lijk3G-p9wdgD9el8KzQ4 

Feb 1 (Reuters) - ChatGPT, the popular chatbot from OpenAI, is estimated to have reached 100 million monthly active users in January, just two months after launch, making it the fastest-growing consumer application in history, according to a UBS study on Wednesday.

The report, citing data from analytics firm Similarweb, said an average of about 13 million unique visitors had used ChatGPT per day in January, more than double the levels of December.

"In 20 years following the internet space, we cannot recall a faster ramp in a consumer internet app," UBS analysts wrote in the note.

I had some decent probability on this outcome but I have now doubled my previous estimate of the impact of Chat-GPT, since I didn't expect something this radical ("the single fastest growing consumer product in history").

I didn't realize how broadly you were defining AI investment. If you want to say that e.g ChatGPT increased investment by $10B out of $200-500B, so like +2-5%, I'm probably happy to agree (and I also think it had other accelerating effects beyond that).

Makes sense, sorry for the miscommunication. I really didn't feel like I was making a particularly controversial claim with the $10B, so was confused why it seemed so unreasonable to you. 

I do think those $10B are going to be substantially more harmful for timelines than other money in AI, because I do think a good chunk of that money will much more directly aim at AGI than most other investment. I don't know what my multiplier here for effect should be, but my guess is something around 3-5x in expectation (I've historically randomly guessed that AI applications are 10x less timelines-accelerating per dollar than full-throated AGI-research, but I sure have huge uncertainty about that number). 

That, plus me thinking there is a long tail with lower probability where Chat-GPT made a huge difference in race dynamics, and thinking that this marginal increase in investment does probably translate into increases in total investment, made me think this was going to shorten timelines in-expectation by something closer to 8-16 weeks, which isn't enormously far away from yours, though still a good bit higher. 

And yeah, I do think the thing I am most worried about with Chat-GPT in addition to just shortening timelines is increasing the number of actors in the space, which also has indirect effects on timelines. A world where both Microsoft and Google are doubling down on AI is probably also a world where AI regulation has a much harder time taking off. Microsoft and Google at large also strike me as much less careful actors than the existing leaders of AGI labs which have so far had a lot of independence (which to be clear, is less of an endorsement of current AGI labs, and more of a statement about very large moral-maze like institutions with tons of momentum). In-general the dynamics of Google and Microsoft racing towards AGI sure is among my least favorite takeoff dynamics in terms of being able to somehow navigate things cautiously. 

One thing worth pointing out in defense of your original estimate is that variance should add up to 100%, not effect sizes, so e.g. if the standard deviation is $100B then you could have 100 things each explaining ($10B)^2 of variance (and hence each responsible for +-$10B effect sizes after the fact).

Oh, yeah, good point. I was indeed thinking of the math a bit wrong here. I will think a bit about how this adjusts my estimates, though I think I was intuitively taking this into account.

How much total investment do you think there is in AI in 2023?

My guess is total investment was around the $200B - $500B range, with about $100B of that into new startups and organizations, and around $100-$400B of that in organizations like Google and Microsoft outside of acquisitions. I have pretty high uncertainty on the upper end here, since I don't know what fraction of Google's revenue gets reinvested again into AI, how much Tesla is investing in AI, how much various governments are investing, etc.

How much variance do you think there is in the level of 2023 investment in AI? (Or maybe whatever other change you think is equivalent.)

Variance between different years depending on market condition and how much products take off seems like on the order of 50% to me. Like, different years have pretty hugely differing levels of investment.

My guess is about 50% of that variance is dependent on different products taking off, how much traction AI is getting in various places, and things like Chat-GPT existing vs. not existing. 

So this gives around $50B - $125B of variance to be explained by product-adjacent things like Chat-GPT.

How much influence are you giving to GPT-3, GPT-3.5, GPT-4? How much to the existence of OpenAI? How much to the existence of Google? How much to Jasper? How much to good GPUs?

Existence of OpenAI is hard to disentangle from the rest. I would currently guess that in terms of total investment, GPT-2 -> GPT-3 made a bigger difference than GPT-3.5 -> Chat-GPT, but both made a much larger difference than GPT-3 -> GPT-3.5. 

I don't think Jasper made a huge difference, since its userbase is much smaller than Chat-GPT, and also evidently the hype from it has been much lower. 

Good GPUs feels kind of orthogonal. We can look at each product that makes up my 50% of the variance to be explained and see how useful/necessary good GPUs were for its development, and my sense is for Chat-GPT at least the effect of good GPUs were relatively minor since I don't think the training to move from GPT-3.5 to Chat-GPT was very compute intensive.

I would feel fine saying expected improvements in GPUs are responsible for 25% of the 50% variance (i.e. 17.5%) if you chase things back all the way, though that again feels like it isn't trying to add up to 100% with the impact from "Chat-GPT". I do think it's trying to add up to 100% with the impact from "RLHF's effect on Chat-GPT", which I claimed was at least 50% of the impact of Chat-GPT in-particular. 

In any case, in order to make my case for $10B using these numbers I would have to argue that between 20% and 8% of the product-dependent variance in annual investment into AI is downstream of Chat-GPT, and indeed that still seems approximately right to me after crunching the numbers. It's by far the biggest AI product of the last few years, it is directly credited with sparking an arms race between Google and Microsoft, and indeed even something as large as 40% wouldn't seem totally crazy to me, since these kinds of things tend to be heavy-tailed, so if you select on the single biggest thing, there is a decent chance you underestimate its effect.

I think it's unlikely that the reception of ChatGPT increased OpenAI's valuation by $10B, much less investment in OpenAI, even before thinking about replaceability.

Note that I never said this, so I am not sure what you are responding to. I said Chat-GPT increases investment in AI by $10B, not that it increased investment into specifically OpenAI. Companies generally don't have perfect mottes. Most of that increase in investment is probably in internal Google allocation and in increased investment into the overall AI industry.

I think the effect would have been very similar if it had been trained via supervised learning on good dialogs

I don't currently think this is the case, and seems like the likely crux. In general it seems that RLHF is substantially more flexible in what kind of target task it allows you to train for, which is the whole reason for why you are working on it, and at least my model of the difficulty of generating good training data for supervised learning here is that it would have been a much greater pain, and would have been much harder to control in various fine-grained ways (including preventing the AI from saying controversial things), which had been the biggest problem with previous chat bot attempts.

For ChatGPT in particular, I think it was built by John Schulman's team

I find a comparison with John Schulman here unimpressive if you want to argue progress on this was overdetermined, given the safety motivation by John, and my best guess being that if you had argued forcefully that RLHF was pushing on commercialization bottlenecks, that John would have indeed not worked on it.

Seeing RLHF teams in other organizations not directly downstream of your organizational involvement, or not quite directly entangled with your opinion, would make a bigger difference here.

I feel like the implicit model of the world you are using here is going to have effect sizes adding up to much more than the actual variance at stake

I don't think so, and have been trying to be quite careful about this. Chat-GPT is just by far the most successful AI product to date, with by far the biggest global impact on AI investment and the most hype. I think $10B being downstream of that isn't that crazy. The product has a user base not that different from other $10B products, and a growth rate to put basically all of them to shame, so I don't think a $10B effect from Chat-GPT seems that unreasonable. There is only so much variance to go around, but Chat-GPT is absolutely massive in its impact.

RLHF is just not that important to the bottom line right now. Imitation learning works nearly as well, other hacky techniques can do quite a lot to fix obvious problems, and the whole issue is mostly second order for the current bottom line.

I am very confused why you think this, just right after the success of Chat-GPT, where approximately the only difference from GPT-3 was the presence of RLHF. 

My current best guess is that Chat-GPT alone, via sparking an arms-race between Google and Microsoft, and by increasing OpenAIs valuation, should be modeled as the equivalent of something on the order of $10B of investment into AI capabilities research, completely in addition to the gains from GPT-3. 

And my guess is most of that success is attributable to the work on RLHF, since that was really the only substantial difference between Chat-GPT and GPT-3. We also should not think this was overdetermined since 1.5 years passed since the release of GPT-3 and the release of Chat-GPT (with some updates to GPT-3 in the meantime, but my guess is no major ones), and no other research lab focused on capabilities had set up their own RLHF pipeline (except Anthropic, which I don't think makes sense to use as a datapoint here, since it's in substantial parts the same employees). 

I have been trying to engage with the actual details here, and indeed have had a bunch of arguments with people over the last 2 years where I have been explicitly saying that RLHF is pushing on commercialization bottlenecks based on those details, and people believing this was not the case was the primary crux on whether RLHF was good or bad in those conversations. 

The crux was importantly not that other people would do the same work anyways, since people at the same time also argued that their work on RLHF was counterfactually relevant and that it's pretty plausible or likely that the work would otherwise not happen. I've had a few of these conversations with you as well (though in aggregate not a lot) and your take at the time was (IIRC) that it seemed quite unlikely that RLHF would have as big of an effect as it did have in the case of Chat-GPT (mostly via an efficiency argument that if that was the case, more capabilities-oriented people would work on it, and since they weren't it likely isn't a commercialization bottleneck), and so I do feel a bit like I want to call you out on that, though I might also be misremembering the details (some of this was online, so might be worth going back through our comment histories).

I think this is my second-favorite post in the MIRI dialogues (for my overall review see here). 

I think this post was valuable to me in a much more object-level way. I think this post was the first post that actually just went really concrete on the current landscape of efforts int he domain of AI Notkilleveryonism and talked concretely about what seems feasible for different actors to achieve, and what isn't, in a way that parsed for me, and didn't feel either like something obviously political, or delusional. 

I didn't find the part about different paradigms of compute very valuable though, and my guess is it should be cut from edited versions of this article. 

I feel like this post is the best current thing to link to for understanding the point of coherence arguments in AI Alignment, which I think are really crucial, and even in 2023 I still see lots of people make bad arguments either overextending the validity of coherence arguments, or dismissing coherence arguments completely in an unproductive way.

I wrote up a bunch of my high-level views on the MIRI dialogues in this review, so let me say some things that are more specific to this post. 

Since the dialogues are written, I keep coming back to the question of the degree to which consequentialism is a natural abstraction that will show up in AI systems we train, and while this dialogue had some frustrating parts where communication didn't go perfectly, I still think it has some of the best intuition pumps for how to think about consequentialism in AI systems. 

The other part I liked the most were actually the points about epistemology. I think in-particular the point about "look, most correct theories will not make amazing novel predictions, instead they will just unify existing sets of predictions that we've made for shallower reasons" is one where the explanation clicked for me better than for previous explanations that covered similar topics. 

This was quite a while ago, probably over 2 years, though I do feel like I remember it quite distinctly. I guess my model of you has updated somewhat here over the years, and now is more interested in heads-down work.

Load More