You're Probably Breaking the Llama Community License

Original link: https://notes.victor.earth/youre-probably-breaking-the-llama-community-license/

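The Llama 3.3 Community License Agreement contains several terms that are often overlooked, even though Llama is marketed as "open source". The agreement applies to anyone who distributes Llama or builds on it, regardless of where they obtained the model. One key point: if you distribute Llama, or a product or service that contains Llama, you must prominently display "Built with Llama" on a website, user interface, or product documentation. In addition, any fine-tuned Llama model must include "Llama" at the beginning of its name. The "Acceptable Use Policy" includes a requirement to "appropriately disclose to end users any known dangers of your AI system", which may require you to disclose biases or inaccuracies you have found. Enforcement of these terms appears lax, but their presence in the license suggests Meta may consider them important. Users should read the full license to ensure compliance and assess risk, especially those using Llama in public-facing products or services.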

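This Hacker News thread discusses whether the Llama Community License is truly "open" and whether users actually comply with it. lolinder argues that the "weights vs. source" debate misses a bigger issue: even with complete source code, the license would make Llama only "source available", not fully open source or free software. NoahZuniga points out a possible license violation on build.nvidia.com, since prominent attribution is missing. Wrs asks whether model weights are even copyrightable given the lack of human authorship, which would mean that using the model does not automatically constitute agreement to the license. Litigation would be costly. Ronsor agrees, arguing that model weights are not copyrightable and that the license mainly protects Meta from liability. The discussion highlights the ambiguity and potential violations surrounding the Llama license and the open-source status of model weights. The author's original submission did not gain traction.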

  • Original text

    If you're distributing or redistributing an LLM that is under the "Llama 3.3 Community License Agreement", you might be breaking at least one of the terms you've explicitly or implicitly agreed to.
    All of the Llama models and their derivatives (fine-tunes etc.) are covered by a Llama Community License.

    Disclaimer: I (the author) am not a lawyer and might not even be in the same country as you, and all of this depends heavily on the jurisdiction. This post also doesn't try to outline what will happen if you break the agreement, only what claims Meta makes. They might not be able to make these claims against you at all. It also doesn't apply if you already have your own agreement with Meta/Llama that supersedes the public license, nor if you're just a user of any of the Llama models.

    The feeling I get from the ML/AI community right now is that almost no one actually reads and follows the various license agreements they agree to when they use models like Meta's Llama.

    Llama is marketed as an "open source" model, yet Meta itself also calls it "proprietary" in the license text, and the license has a lot of conditions that aren't compatible with open source. If you trusted the marketing and have an existing understanding of "open source", it's possible you've been making assumptions about the license that aren't true in reality.

    I'll try to go through some of the points from the license that I think are the most likely to have been missed, in an effort to hopefully spread at least some new information that I haven't seen become common knowledge in the community, since many still refer to the Llama family of models as "open source".

    The full license can be found here: https://www.llama.com/llama3_3/license
    (At the time of writing, revision Llama 3.3 Version Release Date: December 6, 2024, read via browser on March 27, 2025).

    If you're curious about how the Llama Community License and its Use Policy have changed between versions, I've published a separate article with a quick summary over here: How Llama’s Licenses Have Evolved Over Time

    Did I actually agree to any license at all?

    By clicking “I Accept” below or by using or distributing any portion or element of the Llama Materials, you agree to be bound by this Agreement.

    The first thing to keep in mind is that it doesn't matter whether you've been forced to click a button/checkbox or not: Meta considers the license to cover you if you build on top of Llama, regardless of whether you got it from Hugging Face (requires signing the agreement), Ollama (no signing required) or torrents (obviously no signing required). The license also covers you if you redistribute Llama in any way, so it would apply to Ollama too, for example.

    Which brings to mind, what are "Llama Materials"? From the license:

    “Llama Materials” means, collectively, Meta’s proprietary Llama 3.3 and Documentation (and any portion thereof) made available under this Agreement.

    So the license covers both the "proprietary" model/weights themselves, and all the documentation at llama.com/docs. Easy enough.

    Interestingly enough, Meta calls Llama "proprietary" here, in contrast to llama.com, which claims the model is Open Source ("Llama is the leading open source model family"), but that's a post for the future, so let's not dive into that right now.

    Why am I required to show "Built with Llama" prominently on my product/service?

    Only applies to the Llama 3 License and later. The Llama 2 License did not have this requirement at all.
    A summary of the changes across Llama License versions can be found here

    The next section is something I think almost everyone seems to have missed:

    i. If you distribute or make available the Llama Materials (or any derivative works thereof), or a product or service (including another AI model) that contains any of them, you shall (A) provide a copy of this Agreement with any such Llama Materials; and (B) prominently display “Built with Llama” on a related website, user interface, blogpost, about page, or product documentation. If you use the Llama Materials or any outputs or results of the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include “Llama” at the beginning of any such AI model name.

    Breaking it down further,

    If you distribute or make available the Llama Materials [...] you shall [...] (B) prominently display “Built with Llama” on a related website, user interface, blogpost, about page, or product documentation.

    Meaning, if you distribute the Llama model, its weights, or any derivatives, you need to display "Built with Llama" prominently somewhere. So in the case of Ollama, for example, https://ollama.com/library/llama3.3 would need to display "Built with Llama" somewhere on that page.

    Another example is openrouter.ai, which doesn't display "Built with Llama" on any page either, even though they allow you to use Llama models or their derivatives.

    This also applies to anyone who "make available [...] a product or service [...] that contains any of them", which basically means anyone building products or companies with Llama in a way that lets others use it. It covers most of the websites using Llama, or any models derived from the Llama architecture, weights, fine-tunes and so on.
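
    As a rough illustration of what checking your own site for this notice could look like, here is a minimal sketch. The function name has_llama_attribution and the choice to ignore case and extra whitespace are my assumptions, not anything the license specifies. It only checks that the phrase is present in some text you pass in; it says nothing about whether the display is "prominent" in the sense Meta intends.

```python
import re

# The phrase quoted in the license text.
REQUIRED_NOTICE = "Built with Llama"


def has_llama_attribution(page_text: str) -> bool:
    """Return True if the notice appears in the text.

    Assumption: matching ignores case and collapses runs of whitespace,
    so a notice broken across lines in a page footer still counts.
    """
    normalized = re.sub(r"\s+", " ", page_text).lower()
    return REQUIRED_NOTICE.lower() in normalized


# Hypothetical page snippets, for illustration only.
print(has_llama_attribution("My chatbot.\nBuilt   with Llama"))
print(has_llama_attribution("My chatbot, powered by AI"))
```

    You could run something like this against the rendered text of your about page or product documentation as part of a release checklist.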

    The lack of enforcement about this term makes it seem like Meta/Facebook isn't interested in forcing people to display this message, yet it still persists in the terms and conditions you agree to when distributing/using their models or any derivatives.

    One example where this requirement wasn't violated, is on build.nvidia.com:

    One example where Built With Llama requirement is being met

    Why can't I use whatever model name I want for my own fine-tunes of Llama?

    The end of the previous section goes:

    [...] If you use the Llama Materials or any outputs or results of the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include “Llama” at the beginning of any such AI model name.

    This means that every model that comes from the Llama "heritage" should be named "llama-" followed by your own model name. As far as I know, not a lot of models follow this at all, and it gives Meta an additional point to go after if there are some derivative models they don't like.

    Some examples of models that are breaking this part of the license:

    • Hermes 3 - Llama-3.2 3B - Is fine-tuned from Llama by their own admission, yet "Llama" isn't included at the beginning of the name.
    • DeepSeek-R1-Distill-Llama-8B - This model is a "distillation" of DeepSeek's R1, using Llama as the target to fine-tune with data from R1. It should also have the Llama part at the beginning of the name, according to the license at least.

    There are countless other examples; many can be found via a trivial search on Hugging Face. Every model on that search page that is derived from Llama should have "Llama" at the beginning of its name in order not to violate the license its authors agreed to.

    NVIDIA's "Nemotron" model is an example of a company following the license properly, as the full name of that model is "Llama-3.3-Nemotron-Super-49B-v1".
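
    The naming rule is simple enough to express as a quick check. The sketch below is only my reading of "include “Llama” at the beginning of any such AI model name": the function name starts_with_llama is hypothetical, and treating the match as case-insensitive and stripping an "org/" prefix from a repository-style ID are assumptions on my part, not something the license spells out.

```python
def starts_with_llama(model_name: str) -> bool:
    """Return True if the model name begins with "Llama".

    Assumptions: the comparison is case-insensitive, and for
    repository-style IDs like "org/name" only the part after the
    last slash counts as the model name.
    """
    short_name = model_name.rsplit("/", 1)[-1].strip()
    return short_name.lower().startswith("llama")


# Names taken from the examples above, plus one hypothetical repo ID.
for name in [
    "Hermes 3 - Llama-3.2 3B",
    "DeepSeek-R1-Distill-Llama-8B",
    "Llama-3.3-Nemotron-Super-49B-v1",
    "example-org/Llama-3.3-MyTune",  # hypothetical
]:
    print(name, "->", starts_with_llama(name))
```

    Note that a name which merely contains "Llama" somewhere in the middle does not pass, which matches how I read the license text.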

    "Acceptable Use Policy" & Disclosing the dangers of your AI System

    Finally we come to the "Acceptable Use Policy" which can be found here:
    https://www.llama.com/llama3_3/use-policy

    Most of it makes sense: they want to prevent Llama from being used for violence, fraud, intentional misinformation and similar obviously bad cases. But some of the things you promise not to use Llama for are less clear.

    1. Fail to appropriately disclose to end users any known dangers of your AI system

    This would mean that you need to show some sort of informational text before or while someone uses Llama, describing whatever dangers of your AI system you know of (if any).

    Basically, if you've tested your own product that uses Llama, and you've noticed any biases, factual inaccuracies or other potential harm that could happen because of the usage, you have to disclose this to the users. It's not 100% clear what the "appropriate disclosure" requirement means.

    This might obligate you to disclose to your users any known biases, factual inaccuracies, or potential harm that you’re aware of regarding the model’s output. For a commercial or public-facing project, you’d need to figure out how to meet that “appropriate disclosure” requirement, e.g., disclaimers or user notices about potential misinformation from the model.

    I'm not 100% clear on what exactly this means, which means they could come up with ad hoc reasons to explain why your project passes or fails this particular requirement.
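
    One way a product could approach such a user notice is to keep a list of issues found during its own testing and render them where users will see them. This is purely a sketch: the function name build_disclosure_notice, the heading text and the list format are all hypothetical, and nothing here is claimed to satisfy what Meta considers "appropriate" disclosure.

```python
def build_disclosure_notice(known_issues: list[str]) -> str:
    """Render known issues as a plain-text notice for end users.

    Returns an empty string when there is nothing known to disclose.
    The heading wording is an illustrative assumption.
    """
    if not known_issues:
        return ""
    lines = ["Known limitations of this AI system:"]
    lines += [f"- {issue}" for issue in known_issues]
    return "\n".join(lines)


# Hypothetical findings from your own testing.
print(build_disclosure_notice([
    "May state incorrect facts with confidence",
    "May reflect biases present in its training data",
]))
```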

    Conclusions

    I'm not entirely sure what to make of this. Companies that distribute these models are either aware of the terms but choose to ignore them (risking whatever Meta might do in the future), or they are not aware of the terms at all, meaning they're rehosting content whose full license they don't know, or, finally, they have their own agreement with Meta which supersedes the license the public uses.

    All of these options are kind of icky: because Llama is marketed as an Open Source LLM, some might assume they can use and distribute it as an Open Source model, but the license everyone agrees to doesn't actually allow that.

    If the big sites distributing Llama have special agreements with Meta that allow them to skip the terms the rest of the community is under, that also misrepresents how open source Llama is in practice versus the impression the Llama marketing wants to give.

    If you're currently using Llama, self-hosted or through any third party, in products or services you allow the public to use, and you haven't yet read through the full license, this should hopefully be a wake-up call to do so and to actively choose whether you want to follow it or not.

    Keep in mind that while Meta doesn't seem to actively go after companies right now, the fact that these terms still exist in the license text means they're something Meta is still considering; otherwise they wouldn't be there in the first place.

    As mentioned in the very beginning, I'm not a lawyer but a software developer, so none of this is legal advice. Everything here is based on my own interpretation of the license text as written by Meta and read by me.
