LLaMA

Llama
Developer	Meta AI
Initial release	24 February 2023
Current version	3.2 (25 September 2024; stable release)
Source repository	github.com/meta-llama/llama3
Programming language	Python
Type	Large language model; GPT; foundation model
License	Meta Llama 3.2 Community License
Website	llama.meta.com

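LLaMA (Large Language Model Meta AI) is a large language model released by Meta AI in February 2023. It was trained at several sizes, ranging from 7 billion to 65 billion parameters. LLaMA's developers reported that the 13-billion-parameter model outperformed the much larger GPT-3, which has 175 billion parameters, on most NLP benchmarks, and that the largest LLaMA model was competitive with state-of-the-art models such as PaLM and Chinchilla^[3]. Whereas other powerful large language models were generally accessible only through limited APIs, Meta released LLaMA's model weights to researchers under a non-commercial license^[4]^[5]^[6]. In July 2023, Meta released Llama 2, a source-available AI model that can be used in commercial applications^[7].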

Llama 2

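In July 2023, Meta, the parent company of Facebook, released Llama 2, an open-source large language model (LLM) intended to challenge the restrictive practices of its big-tech competitors. Meta made the code and model weights behind Llama 2 available free of charge, so that researchers around the world can use and improve the technology. Meta's chief executive, Mark Zuckerberg, has been outspoken about the importance of open-source software for stimulating innovation.^[8]^[7]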

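Meta trained and released Llama 2 in three model sizes: 7, 13 and 70 billion parameters. The model architecture is largely unchanged from LLaMA 1, but the base models were trained on 40% more data. The accompanying preprint also mentions a 34B-parameter model that might be released in the future once it meets safety targets.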

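Llama 2 includes base models and models fine-tuned for dialogue, called Llama 2-Chat. In a further departure from LLaMA 1, all models are released with weights and are free for many commercial use cases. However, because of some remaining restrictions, the description of Llama as open source has been disputed by the Open Source Initiative, the organization that maintains the Open Source Definition.^[9]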

Code Llama

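In August 2023, following its releases of AI models for generating text, translating languages and creating audio, Meta open-sourced Code Llama, a machine learning system that can generate and explain code in natural language, specifically English. It is free for both commercial and research use.^[10]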

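Code Llama is fine-tuned from the Llama 2 base models and comes in three variants: a base version, a Python version, and an instruction-following version. Like GitHub Copilot and Amazon CodeWhisperer, as well as open-source AI code generators such as StarCoder, StableCode and PolyCoder, Code Llama can complete code and debug existing code across a range of programming languages, including Python, C, Java, PHP, Typescript, C# and Bash.^[11]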

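To train Code Llama, Meta used the same dataset it used for Llama 2: a mix of publicly available sources from the web. The model, however, "emphasized" the subset of the training data that contains code. In essence, Code Llama was given more time than its "parent" model, Llama 2, to learn the relationship between code and natural language. The Code Llama models range in size from 7 billion to 34 billion parameters, and each was trained on 500 billion tokens of code and code-related data. Several of the Code Llama models can insert code into existing code, and all of them can accept roughly 100,000 tokens of code as input. At least one, the 7-billion-parameter model, can run on a single GPU; the others require more powerful hardware. Meta claims that the 34-billion-parameter model is the best-performing open-source code generator to date, and also the largest by parameter count.^[11]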

Llama 3

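On 18 April 2024, Meta released Llama 3 in two model sizes: 8B and 70B parameters.^[12] The models were pretrained on roughly 15 trillion tokens of text gathered from "publicly available sources", and the instruct models were fine-tuned on "publicly available instruction datasets, as well as over 10 million human-annotated examples". Meta plans to release multimodal models, models that can converse in multiple languages, and models with larger context windows.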

An incremental update, Llama 3.1, followed on 23 July 2024 in three model sizes: 8B, 70B and 405B parameters.^[12]

According to Meta AI's own testing, Llama 3 70B beat Gemini and Claude on most benchmarks.^[13]^[14]

Model comparison

The training-cost column lists only the cost of the largest model in each family. For example, "21,000" is the training cost of Llama 2 69B, in petaFLOP-days. Here, 1 petaFLOP-day = 1 petaFLOP/s × 1 day = 8.64E19 FLOP.
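The unit conversion above can be checked with a few lines of Python. This is only an illustrative sketch: the dictionary copies the training-cost figures from the comparison table, and the names `FLOP_PER_PETAFLOP_DAY`, `TRAINING_COST_PFLOP_DAYS` and `petaflop_days_to_flop` are made up for this example rather than taken from any Meta tooling.

```python
# One petaFLOP-day: 1e15 FLOP per second sustained for one day (86,400 s).
PETAFLOP_PER_SECOND = 10**15
SECONDS_PER_DAY = 24 * 60 * 60
FLOP_PER_PETAFLOP_DAY = PETAFLOP_PER_SECOND * SECONDS_PER_DAY  # 8.64e19

# Training cost of the largest model in each family, in petaFLOP-days.
# Values are copied from the comparison table; rows without a figure are omitted.
TRAINING_COST_PFLOP_DAYS = {
    "LLaMA": 6_300,
    "Llama 2": 21_000,
    "Llama 3": 100_000,
    "Llama 3.1": 440_000,
}


def petaflop_days_to_flop(pf_days: int) -> int:
    """Convert a cost in petaFLOP-days to a total FLOP count."""
    return pf_days * FLOP_PER_PETAFLOP_DAY


if __name__ == "__main__":
    for name, cost in TRAINING_COST_PFLOP_DAYS.items():
        print(f"{name}: {petaflop_days_to_flop(cost):.3e} FLOP")
```

Under this conversion, the 21,000 petaFLOP-days listed for Llama 2 correspond to about 1.81 × 10^24 FLOP.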

Name	Release date	Parameters	Training cost (petaFLOP-days)	Context length (tokens)	Corpus size (tokens)	Commercial use permitted?
LLaMA	2023-02-24	6.7B, 13B, 32.5B, 65.2B	6,300^[15]	2048	1–1.4T	No
Llama 2	2023-07-18	6.7B, 13B, 69B	21,000^[16]	4096	2T	Yes
Code Llama	2023-08-24	6.7B, 13B, 33.7B, 69B		4096	2T
Llama 3	2024-04-18	8B, 70.6B	100,000^[17]^[18]	8192	15T
Llama 3.1	2024-07-23	8B, 70.6B, 405B	440,000^[19]^[20]	128,000	15T
Llama 3.2	2024-09-25	1B, 3B, 11B, 90B		128,000

Architecture and training

Datasets

On 17 April 2023, Together launched a project named RedPajama, hosted on GitHub, to reproduce and distribute an open-source version of the LLaMA dataset.^[21]^[22]

Reception

Wired magazine described the 8B-parameter version of Llama 3 as "surprisingly capable" given its size.^[23]

Reactions were mixed after Meta integrated Llama into Facebook; some users were confused when Meta AI told a parents' group that it had a child.^[24]

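According to the transcript of its fourth-quarter 2023 earnings call, Meta adopted an open-weights strategy to improve model safety and iteration speed, to increase adoption among developers and researchers, and to become the industry standard. Llama 5, 6 and 7 are planned for the future.^[25]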

References

[1] Llama 3.2: Revolutionizing edge AI and vision with open, customizable models. 2024-09-25 [retrieved 2024-09-26].
[2] llama3/LICENSE at main · meta-llama/llama3. GitHub. [retrieved 2024-05-25]. (Archived from the original on 2024-05-24.)
[3] Touvron, Hugo; Lavril, Thibaut; Izacard, Gautier; Martinet, Xavier; Lachaux, Marie-Anne; Lacroix, Timothée; Rozière, Baptiste; Goyal, Naman; Hambro, Eric; Azhar, Faisal; Rodriguez, Aurelien; Joulin, Armand; Grave, Edouard; Lample, Guillaume. LLaMA: Open and Efficient Foundation Language Models. 2023. arXiv:2302.13971 [cs.CL].
[4] Introducing LLaMA: A foundational, 65-billion-parameter large language model. Meta AI. 2023-02-24 [retrieved 2023-06-14]. (Archived from the original on 2023-03-03.)
[5] Vincent, James. Meta's powerful AI language model has leaked online — what happens now?. The Verge. 2023-03-08 [retrieved 2023-06-14]. (Archived from the original on 2023-11-03.)
[6] 差一步称霸AI：历史进程中的扎克伯格. 远川研究所, 澎湃. [retrieved 2023-06-28]. (Archived from the original on 2023-06-28.)
[7] Meta launches Llama 2, a source-available AI model that allows commercial applications. [retrieved 2023-07-21]. (Archived from the original on 2023-11-07.)
[8] LLaMA 2: How to access and use Meta’s versatile open-source chatbot right now. [retrieved 2023-07-20]. (Archived from the original on 2023-11-03.)
[9] Maffulli, Stefano. Meta’s LLaMa 2 license is not Open Source. Voices of Open Source. 2023-07-20 [retrieved 2023-08-29]. (Archived from the original on 2023-10-10.)
[10] Code Llama: Open Foundation Models for Code. https://ai.meta.com/research/publications/code-llama-open-foundation-models-for-code/ (Archived at the Internet Archive.)
[11] Wiggers, Kyle. Meta releases Code Llama, a code-generating AI model. 2023-08-24. https://techcrunch.com/2023/08/24/meta-releases-code-llama-a-code-generating-ai-model/ (Archived at the Internet Archive.)
[12] Introducing Meta Llama 3: The most capable openly available LLM to date. ai.meta.com. 2024-04-18 [retrieved 2024-04-21].
[13] Wiggers, Kyle. Meta releases Llama 3, claims it's among the best open models available. TechCrunch. 2024-04-18.
[14] Mann, Tobias. Meta debuts third-generation Llama large language model. www.theregister.com.
[15] The Falcon has landed in the Hugging Face ecosystem. huggingface.co. [retrieved 2023-06-20].
[16] llama/MODEL_CARD.md at main · meta-llama/llama. GitHub. [retrieved 2024-05-28].
[17] Karpathy, Andrej. The model card has some more interesting info too. 2024-04-18.
[18] llama3/MODEL_CARD.md at main · meta-llama/llama3. GitHub. [retrieved 2024-05-28].
[19] Llama Team, AI @ Meta. The Llama 3 Herd of Models. 2024-07-23.
[20] llama-models/models/llama3_1/MODEL_CARD.md at main · meta-llama/llama-models. GitHub. [retrieved 2024-07-23].
[21] RedPajama-Data: An Open Source Recipe to Reproduce LLaMA training dataset. GitHub. Together. [retrieved 2023-05-04]. (Archived from the original on 2023-11-07.)
[22] RedPajama-Data-1T. Hugging Face. Together. [retrieved 2023-05-04]. (Archived from the original on 2023-11-03.)
[23] Knight, Will. Meta's Open Source Llama 3 Is Already Nipping at OpenAI's Heels. Wired.
[24] Meta's amped-up AI agents confusing Facebook users. ABC News. 2024-04-19.
[25] https://s21.q4cdn.com/399680738/files/doc_financials/2023/q4/META-Q4-2023-Earnings-Call-Transcript.pdf

External links

LLaMA2 Chatbot (archived at the Internet Archive)
Perplexity LLaMA2 Chatbot (archived at the Internet Archive)
