See also their Llama 2 website here: https://ai.meta.com/llama, and their research paper here: https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/
From their blog post:
Takeaways
- Today, we’re introducing the availability of Llama 2, the next generation of our open source large language model.
- Llama 2 is free for research and commercial use.
- Microsoft and Meta are expanding their longstanding partnership, with Microsoft as the preferred partner for Llama 2.
- We’re opening access to Llama 2 with the support of a broad set of companies and people across tech, academia, and policy who also believe in an open innovation approach to today’s AI technologies.
Compared to the first Llama, Llama 2 is trained on 2T tokens instead of 1.4T, has 2x the context length (4096 instead of 2048), uses Grouped Query Attention (for the larger models), and performs better across the board, with performance generally exceeding code-davinci-002 on benchmarks.
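For anyone unfamiliar with Grouped Query Attention, here is a rough sketch of the idea and of why it matters at a 4096-token context: several query heads share one key/value head, so the key/value cache kept around during generation shrinks by the same factor. The layer count, head counts, head dimension, and 16-bit cache values below are illustrative assumptions chosen to resemble a 70B-scale model, not numbers taken from this post, and the function names are made up for the example.

```python
def kv_group_for_head(q_head: int, n_q_heads: int, n_kv_heads: int) -> int:
    """Return the index of the shared key/value head that a query head reads from."""
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    group_size = n_q_heads // n_kv_heads  # query heads per shared KV head
    return q_head // group_size


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_value: int = 2) -> int:
    """Size of the key/value cache for one sequence, in bytes."""
    # Factor of 2: one cached tensor for keys and one for values.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


# Illustrative (assumed) configuration, not taken from the post.
N_LAYERS, N_Q_HEADS, N_KV_HEADS, HEAD_DIM, CONTEXT = 80, 64, 8, 128, 4096

# Standard multi-head attention: every query head has its own KV head.
mha_bytes = kv_cache_bytes(N_LAYERS, N_Q_HEADS, HEAD_DIM, CONTEXT)
# Grouped query attention: 64 query heads share 8 KV heads.
gqa_bytes = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, CONTEXT)

print(f"MHA cache: {mha_bytes / 2**30:.2f} GiB")  # 10.00 GiB
print(f"GQA cache: {gqa_bytes / 2**30:.2f} GiB")  # 1.25 GiB
print([kv_group_for_head(h, N_Q_HEADS, N_KV_HEADS) for h in (0, 7, 8, 63)])  # [0, 0, 1, 7]
```

Under these assumptions the cache per sequence drops by 8x, which is a large part of why doubling the context length is affordable at inference time.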
They also release both a normal base model (Llama 2) and an RLHF'ed chat model (Llama 2-chat). Interestingly, they're only releasing the 7B/13B/70B models, and not the 34B model, "due to a lack of time to sufficiently red team".
More importantly, they're both releasing it on Microsoft Azure and making it available for commercial use. The form for requesting access is very straightforward and does not require stating what you're using it for. (EDIT: they gave me access ~20 minutes after submitting the form, seems pretty straightforward.)
Note that their license is not, technically, always free for commercial use; it contains the following clauses:
[1.] v. You will not use the Llama Materials or any output or results of the Llama Materials to improve any other large language model (excluding Llama 2 or derivative works thereof).
2. Additional Commercial Terms. If, on the Llama 2 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights.
See also the Llama 2 Acceptable Use Policy (which seems pretty standard).
Llama 2 is not open source.
(a few days after this comment, here's a concurring opinion from the Open Source Initiative - as close to authoritative as you can get)
While their custom licence permits some commercial uses, it is not an OSI-approved licence, and because it violates the Open Source Definition, it never will be. Specifically, the Llama 2 licence violates:
So, why does this matter?
As an open-source maintainer and PSF Fellow, I have no objection to the existence of commercially licensed software. I use much of it, and have sold commercial licenses for software that I've written too. However, people - and especially megacorps - misrepresenting their closed-off projects as open source is an infuriating form of parasitism on a reputation painstakingly built over decades.
The restriction on model training makes Llama 2 much less useful for AI safety research, but it incurs just as much direct (roughly all via misuse IMO) and acceleration risk as an open-source release.
Using a custom license adds substantial legal risk for prospective commercial users, especially given the very broad restrictions imposed by the acceptable use policy. This reduces the economic upside enormously relative to standard open terms, and leaves Meta's competitors particularly at risk of lawsuits if they attempt to use Llama 2.
To summarize, Meta gets a better cost/benefit tradeoff by using a custom, non-open-source license, especially if people incorrectly perceive it as open source; everyone else is worse off; and it seems to me like they're deliberately misrepresenting what they've done for their own gain. This really, really annoys me.
When someone describes Llama 2 as "open source", please correct them: Meta is offering a limited commercial license which discriminates against specific users and bans many valuable use-cases, including in alignment research.
Huh, that's very useful context, thanks! Seems like pretty sad behaviour.
Thanks a lot for the context!
Out of curiosity, why does the model training restriction make it much less useful for safety research?
Example projects you're not allowed to do, if they involve other model families:
In many cases I expect that individuals will go ahead and do this anyway, much like the license of Llama 1 was flagrantly violated all over the place, but remember that it's differentially risky for any organisation which Meta might like to legally harass.
Any idea what's happening with the 34B model? Why might it be so much less "safe" than the bigger and smaller versions? And what about the base version of the 34B--are they not releasing that? But the base version isn't supposed to be "safe" anyway...
Relevant rumors / comments:
Seems like we can continue to scale tokens and get returns model performance well after 2T tokens. : r/LocalLLaMA (reddit.com)
LLaMA 2 is here : r/LocalLLaMA (reddit.com)