LLaMA

Llama
Developer	Meta AI
Initial release	February 24, 2023
Current version	3.2 (September 25, 2024; stable release)
Repository	github.com/meta-llama/llama3
Programming language	Python
Type	Large language model; GPT; foundation model
License	Meta Llama 3.2 Community License
Website	llama.meta.com

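LLaMA (Large Language Model Meta AI) is a family of large language models that Meta AI first released in February 2023. The first release comprised models ranging from 7 billion to 65 billion parameters. LLaMA's developers reported that the 13-billion-parameter model outperformed the much larger, 175-billion-parameter GPT-3 on most NLP benchmarks, and that the largest LLaMA model was competitive with state-of-the-art models such as PaLM and Chinchilla^[3]. Whereas other powerful large language models were generally accessible only through limited APIs, Meta released LLaMA's model weights to researchers under a noncommercial license^[4]^[5]^[6]. In July 2023, Meta released Llama 2, a model it described as open source and that can be used in commercial applications^[7].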

Llama 2

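In July 2023, Meta, the parent company of Facebook, released Llama 2, an open-source large language model (LLM) intended to challenge the restrictive practices of its big-tech competitors. Meta released the code and model weights behind Llama 2 free of charge so that researchers around the world could build on and improve the technology. Meta's chief executive, Mark Zuckerberg, has been outspoken about the importance of open-source software for stimulating innovation.^[8]^[7]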

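Meta trained and released Llama 2 in three model sizes: 7, 13, and 70 billion parameters. The model architecture is largely unchanged from that of LLaMA 1, but 40% more data was used to train the foundation models. The accompanying preprint also mentions a 34B-parameter model that might be released in the future once it meets safety targets.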

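Llama 2 includes both foundation models and models fine-tuned for dialogue, called Llama 2-Chat. In a further departure from LLaMA 1, all models are released with weights and are free for many commercial use cases. However, because of some remaining restrictions, the description of Llama as open source has been disputed by the Open Source Initiative (known for maintaining the Open Source Definition).^[9]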

Code Llama

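In August 2023, following its releases of AI models for generating text, translating languages, and creating audio, Meta open-sourced Code Llama, a machine learning system that can generate and explain code in natural language (specifically English). It is free for both commercial and research use.^[10]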

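Code Llama is fine-tuned from the Llama 2 foundation models and comes in three variants: a base version, a Python version, and an instruction-following version. Like GitHub Copilot and Amazon CodeWhisperer, as well as open-source AI code generators such as StarCoder, StableCode, and PolyCoder, Code Llama can complete code and debug existing code across a range of programming languages, including Python, C++, Java, PHP, TypeScript, C#, and Bash.^[11]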

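To train Code Llama, Meta used the same dataset it used to train Llama 2, a mix of publicly available sources from around the web, but had the model "emphasize" the subset of the training data that contained code. In essence, Code Llama was given more time than its "parent" model, Llama 2, to learn the relationship between code and natural language. At release, the Code Llama models ranged in size from 7 billion to 34 billion parameters, and each was trained on 500 billion tokens of code and code-related data. Several of the Code Llama models can insert code into existing code, and all of them can accept around 100,000 tokens of code as input, while at least one, the 7-billion-parameter model, can run on a single GPU. (The others require more powerful hardware.) Meta claimed that the 34-billion-parameter model was the best-performing of any open-source code generator to date, and the largest by parameter count.^[11]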

Llama 3

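On April 18, 2024, Meta released Llama 3 in two model sizes: 8B and 70B parameters.^[12] The models were pretrained on approximately 15 trillion tokens of text gathered from "publicly available sources", and the instruct models were fine-tuned on "publicly available instruction datasets, as well as over 10M human-annotated examples". Meta planned to release multimodal models, models capable of conversing in multiple languages, and models with larger context windows.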

On July 23, 2024, the series was incrementally updated to Llama 3.1, which comes in three model sizes: 8B, 70B, and 405B parameters.^[12]

Meta AI's testing showed that Llama 3 70B beat Gemini Pro 1.5 and Claude 3 Sonnet on most benchmarks.^[13]^[14]

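Model comparison

In the training cost column, only the cost of the largest model is given. For example, "21,000" is the training cost of Llama 2 69B, in units of petaFLOP-days. Note that 1 petaFLOP-day = 1 petaFLOP/sec × 1 day = 8.64E19 FLOP.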
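
The unit conversion above can be sketched in a few lines of Python. This is only an illustrative helper: the names `FLOP_PER_PETAFLOP_DAY` and `petaflop_days_to_flop` are hypothetical and do not come from any Llama codebase, and the figures passed in are the training costs quoted in the comparison table.

```python
# Unit conversion for the training-cost column (petaFLOP-days -> FLOP).
# All names here are illustrative assumptions, not part of any Meta library.

PETAFLOP_PER_SECOND = 10**15      # 1 petaFLOP/s = 1e15 floating-point operations per second
SECONDS_PER_DAY = 24 * 60 * 60    # 86,400 seconds in a day

# 1 petaFLOP-day = 1e15 FLOP/s * 86,400 s = 8.64e19 FLOP
FLOP_PER_PETAFLOP_DAY = PETAFLOP_PER_SECOND * SECONDS_PER_DAY


def petaflop_days_to_flop(petaflop_days: int) -> int:
    """Convert a training cost in petaFLOP-days to a total FLOP count."""
    return petaflop_days * FLOP_PER_PETAFLOP_DAY


if __name__ == "__main__":
    # Training costs as listed in the table (largest model of each family).
    table_costs = [("LLaMA", 6_300), ("Llama 2", 21_000), ("Llama 3.1", 440_000)]
    for name, cost in table_costs:
        print(f"{name}: {petaflop_days_to_flop(cost):.3e} FLOP")
```

For instance, the 21,000 petaFLOP-days listed for Llama 2 correspond to roughly 1.81 × 10^24 FLOP. Integer arithmetic is used so that the conversion factor is exact.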

Name	Release date	Parameters	Training cost (petaFLOP-day)	Context length (tokens)	Corpus size (tokens)	Commercial use?
LLaMA	2023-02-24	6.7B 13B 32.5B 65.2B	6,300^[15]	2048	1–1.4T	No
Llama 2	2023-07-18	6.7B 13B 69B	21,000^[16]	4096	2T	Yes
Code Llama	2023-08-24	6.7B 13B 33.7B 69B		4096	2T
Llama 3	2024-04-18	8B 70.6B	100,000^[17]^[18]	8192	15T
Llama 3.1	2024-07-23	8B 70.6B 405B	440,000^[19]^[20]	128,000	15T
Llama 3.2	2024-09-25	1B 3B 11B 90B		128,000

Architecture and training

Datasets

On April 17, 2023, Together launched a project named RedPajama to reproduce and distribute an open-source version of the LLaMA dataset.^[21]^[22]

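Reception

Wired described the 8B-parameter version of Llama 3 as "surprisingly capable" given its size.^[23]

When Meta integrated Llama into Facebook, reactions were mixed, with some users left confused after Meta AI told a parents' group that it had a child.^[24]

According to the transcript of its Q4 2023 earnings call, Meta adopted an open-weights strategy to improve model safety and iteration speed, increase adoption among developers and researchers, and become the industry standard. Llama 5, 6, and 7 are planned for the future.^[25]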

See also

References

[1] Llama 3.2: Revolutionizing edge AI and vision with open, customizable models. September 25, 2024. Retrieved 2024-09-26.
[2] llama3/LICENSE at main · meta-llama/llama3. GitHub. Retrieved 2024-05-25. Archived from the original on 2024-05-24.
[3] Touvron, Hugo; Lavril, Thibaut; Izacard, Gautier; Martinet, Xavier; Lachaux, Marie-Anne; Lacroix, Timothée; Rozière, Baptiste; Goyal, Naman; Hambro, Eric; Azhar, Faisal; Rodriguez, Aurelien; Joulin, Armand; Grave, Edouard; Lample, Guillaume. LLaMA: Open and Efficient Foundation Language Models. 2023. arXiv:2302.13971 [cs.CL].
[4] Introducing LLaMA: A foundational, 65-billion-parameter large language model. Meta AI. 24 February 2023. Retrieved 2023-06-14. Archived from the original on 2023-03-03.
[5] Vincent, James. Meta's powerful AI language model has leaked online — what happens now?. The Verge. 8 March 2023. Retrieved 2023-06-14. Archived from the original on 2023-11-03.
[6] 差一步称霸AI：历史进程中的扎克伯格. 远川研究所, 澎湃. Retrieved 2023-06-28. Archived from the original on 2023-06-28. (In Chinese.)
[7] Meta launches Llama 2, a source-available AI model that allows commercial applications. Retrieved 2023-07-21. Archived from the original on 2023-11-07.
[8] LLaMA 2: How to access and use Meta’s versatile open-source chatbot right now. Retrieved 2023-07-20. Archived from the original on 2023-11-03.
[9] Maffulli, Stefano. Meta’s LLaMa 2 license is not Open Source. Voices of Open Source. 2023-07-20. Retrieved 2023-08-29. Archived from the original on 2023-10-10.
[10] Code Llama: Open Foundation Models for Code. https://ai.meta.com/research/publications/code-llama-open-foundation-models-for-code/ (archived copy at the Internet Archive).
[11] Wiggers, Kyle. Meta releases Code Llama, a code-generating AI model. August 24, 2023. https://techcrunch.com/2023/08/24/meta-releases-code-llama-a-code-generating-ai-model/ (archived copy at the Internet Archive).
[12] Introducing Meta Llama 3: The most capable openly available LLM to date. ai.meta.com. April 18, 2024. Retrieved 2024-04-21.
[13] Wiggers, Kyle. Meta releases Llama 3, claims it's among the best open models available. TechCrunch. 18 April 2024.
[14] Mann, Tobias. Meta debuts third-generation Llama large language model. www.theregister.com.
[15] The Falcon has landed in the Hugging Face ecosystem. huggingface.co. Retrieved 2023-06-20.
[16] llama/MODEL_CARD.md at main · meta-llama/llama. GitHub. Retrieved 2024-05-28.
[17] Andrej Karpathy (Apr 18, 2024), The model card has some more interesting info too.
[18] llama3/MODEL_CARD.md at main · meta-llama/llama3. GitHub. Retrieved 2024-05-28.
[19] "The Llama 3 Herd of Models" (July 23, 2024). Llama Team, AI @ Meta.
[20] llama-models/models/llama3_1/MODEL_CARD.md at main · meta-llama/llama-models. GitHub. Retrieved 2024-07-23.
[21] RedPajama-Data: An Open Source Recipe to Reproduce LLaMA training dataset. GitHub. Together. Retrieved 4 May 2023. Archived from the original on 2023-11-07.
[22] RedPajama-Data-1T. Hugging Face. Together. Retrieved 4 May 2023. Archived from the original on 2023-11-03.
[23] Knight, Will. Meta's Open Source Llama 3 Is Already Nipping at OpenAI's Heels. Wired.
[24] Meta's amped-up AI agents confusing Facebook users. ABC News. 19 April 2024.
[25] https://s21.q4cdn.com/399680738/files/doc_financials/2023/q4/META-Q4-2023-Earnings-Call-Transcript.pdf

External links

LLaMA2 Chatbot (archived copy at the Internet Archive)
Perplexity LLaMA2 Chatbot (archived copy at the Internet Archive)


