Add the Bevy Org's AI Policy #2204

james7132 · 2025-08-12T03:03:57Z

Added a new Policies section to the Contributing Guide and incorporated the above draft under that section. The triage reference has also been updated to make guidance for S-Nominated-to-Close aligned with the new AI policy.

alice-i-cecile

Good. I think this lays out a clear position, and explains why we feel this is necessary without excessive moralizing or encouraging us to police problems that cannot be detected.

content/learn/contribute/policies/ai.md

content/learn/contribute/reference/triage.md

content/learn/contribute/policies/ai.md

james7132 · 2025-08-12T19:42:08Z

I did try a render of this locally and the footnote rendering we have on docs pages leaves quite a bit to be desired. Might want to follow this PR up with a style update to make it look nicer.

jyn514 · 2025-08-12T20:43:15Z

content/learn/contribute/policies/ai.md

+The unsolicited use of automated systems to communicate issues, bugs, or security vulnerabilities
+about Bevy Organization projects under the guise of a human is considered unacceptable
+and a Code of Conduct violation. Any individual contributor, operator of automated systems,


does this apply to communication that is clearly marked as AI-generated?

If unsolicited, yes. I think we should cut "under the guise of".

The "under the guise of" was explicitly included to avoid capturing automated systems that do not pretend to be humans. There are systems like GitGuardian that does a useful service but makes zero effort to hide that it's a crawler bot.

Under the proposed situation, so long as the fuzzer results are presented either by a human, or by a clearly demarked bot account, it shouldn't be an issue. It's only when it files issues or PRs as if it were a human, and then subsequently wastes valuable volunteer time with the expectations that said bot would continue to engage as if it were a human where it becomes a major problem.

jyn514 · 2025-08-12T20:46:22Z

content/learn/contribute/policies/ai.md

+
+## AI Generated Communications
+
+The unsolicited use of automated systems to communicate issues, bugs, or security vulnerabilities


"automated systems" is not a good phrase here IMO, it excludes things like fuzzers and linters. i would explicitly say "generative AI" or "LLMs" throughout.

(the other two places you say "automated systems" are further qualified and probably don't need changes, but this one is unqualified.)

Interestingly I think that automated systems is fine here: the key term is "unsolicited".

oh interesting - so you wouldn't want e.g. an academic lab to post the output of a fancy dynamic fuzzer they built? that's reasonable i suppose.

Not without asking us first, no :) We'd say yes, but asking first is much more polite!

This reasoning clashes with what @james7132 says above:

The "under the guise of" was explicitly included to avoid capturing automated systems that do not pretend to be humans. There are systems like GitGuardian that does a useful service but makes zero effort to hide that it's a crawler bot.

Under the proposed situation, so long as the fuzzer results are presented either by a human, or by a clearly demarked bot account, it shouldn't be an issue. It's only when it files issues or PRs as if it were a human, and then subsequently wastes valuable volunteer time with the expectations that said bot would continue to engage as if it were a human where it becomes a major problem.

looks like James would find it okay to have a lab post the results of their academic fuzzer without asking as long as they clearly state what's happening?

Yeah, I slightly disagree with James here, but it's not critical. Automated bots that are clearly marked as such and are trying to be helpful are both rare and not very obnoxious.

content/learn/contribute/policies/ai.md

ariofrio · 2025-08-15T02:15:49Z

Disclaimer: My job is to write software that incorporates LLMs, with the assistance of LLMs, but not for an LLM provider. These are just my personal opinions though.

Policy is Well-Scoped

I recently read Asahi Linux's Generative AI policy, and I think this policy does a good job of avoiding the pitfalls that IMHO they fell into:

It doesn't attempt to predict or assess the limitations of LLMs as they develop into the future.
It doesn't make unsupported philosophical claims about what "thought" or "reasoning" means.
As @alice-i-cecile mentioned, it avoids excessive moralizing.

Reasoning Seems Incorrect

I very much agree with ~~@cart's~~ @chorman0773 concerns about the current draft's reasoning for forbidding non-trivial AI-generated output from contributions.

If the output is not copyrightable, then it is under the public domain (source) and therefore can be freely included in any work. The scenario mentioned in the current draft is not actually a problem.
The US Copyright Office's second report seems to rule out copyrightability by the model's developer, "Copyright does not extend to purely AI-generated material" (source). LLM providers like OpenAI and Anthropic make doubly sure by assigning any rights they might have in the output to the user.
The actual problem only occurs if the output is copyrightable, but under a pre-existing copyright—that of the original author(s).

Comparing Risk Levels

One thing that's interesting about the comparison with Asahi's policy is that they make the (compelling-to-me) case that because their problem space is so esoteric and highly specific, LLMs may be more likely to reproduce the scarce, copyrighted training data that relates to the publicly undocumented hardware they write software for.

FWIW, Asahi's reasoning seems to apply somewhat less readily to Bevy. Still, the three reports on AI by the US Copyright Office do indeed raise concerns for any non-trivial use of LLMs to author copyrighted content, concerns that they do not resolve.

Unclear Cases

I think the current draft is a bit unclear on whether the following use cases would be forbidden or allowed:

Write most of a feature, all models and interfaces/signatures, and have an LLM fill in the gaps and resolve compiler errors by running the compiler iteratively ("compiler-driven development" with AI).
Have an LLM write a first draft of unit tests, and then review and rewrite as needed.

These feel like they fall somewhere between "autocomplete" and "generating entire function blocks". They're not equivalent to pre-LLM functionality, but neither are they Jesus-take-the-wheel-style vibe coding. I think the intention was to forbid these cases, though my own (perhaps less risk-averse) assessment would lean the other way, so it might be worth being a bit clearer. These cases would also be really hard to detect.

Futility?

Finally, I'd be surprised if non-trivial LLM-generated code hasn't already made it into Bevy or any widely-contributed-to open source project. Not sure what to do about that, but I guess if the intent is to err on the side of caution, a good-faith attempt to forbid the practice would be all that's practical to do.

alice-i-cecile

The recent feedback about "why is this a problem" being legally suspect has convinced me: that needs to be resolved before we can move forward.

alice-i-cecile · 2025-08-15T03:53:16Z

I very much agree with @cart's concerns about the current draft's reasoning for forbidding non-trivial AI-generated output from contributions.

I don't remember such comments? I agree with your concern, but did you mean @chorman0773?

content/learn/contribute/policies/ai.md

ariofrio · 2025-08-15T14:53:06Z

I very much agree with @cart's concerns about the current draft's reasoning for forbidding non-trivial AI-generated output from contributions.

I don't remember such comments? I agree with your concern, but did you mean @chorman0773?

Oops yeah that’s who I meant. 😅 #2204 (comment).

alice-i-cecile · 2025-08-23T05:18:12Z

Some other interesting discussion of policies around AI-generated code in the Linux Kernel.

mockersf · 2025-09-09T20:27:02Z

By using GitHub, our contributors are already agreeing to:

GitHub terms of service, which mentions "Whenever you add Content to a repository containing notice of a license, you license that Content under the same terms, and you agree that you have the right to license that Content under those terms"
GitHub code of conduct mentions of ai generated content

Those two covers already most of what I expect, not sure an additional document is needed?

james7132 · 2025-09-24T23:57:41Z

Those two covers already most of what I expect, not sure an additional document is needed?

I think we do need specific guidance on what contributors, org members, and maintainers are expected to do in the case we do see both contributions and issues from them, in addition to the policy and guidelines GitHub provides. Specifically providing the guidance to add S-Nominated-to-Close and how should maintainers should generally approach such situations.

Co-authored-by: François Mockers <[email protected]>

Co-authored-by: Alice Cecile <[email protected]>

Co-authored-by: Niklas Eicker <[email protected]>

Co-authored-by: Alice Cecile <[email protected]>

janhohenheim

I have some nits with the exact legalese (see my unresolved comments), but I believe this is justified, reasonable, well-written, and useful :)

Edit: yay, my nits were addressed!

…r entities

…eing settled

james7132 requested review from superdump, cart, alice-i-cecile and mockersf August 12, 2025 03:03

james7132 added C-Content X-Controversial There is active debate or serious implications around merging this PR A-Contributing-Guide labels Aug 12, 2025

alice-i-cecile approved these changes Aug 12, 2025

View reviewed changes

mockersf reviewed Aug 12, 2025

View reviewed changes