EDIT: for anyone else with similar goals / questions as mine, this thread would suggest that the archlinux distro is not appropriate for meeting these goals (at present), and the community - at least as represented in this thread - is even quite hostile to the goal. But Gentoo and NetBSD both have clear policies against LLM-generated code and may be good alternatives. Fedora reportedly also has policies on disclosing any AI-generated content. Original post below.
I'm curious whether there are any policies on "AI-generated" code among arch devs, particularly whether there are guidelines about including / not including such content in main repo packages.
I am not (presently, in this thread) looking for discussion on the merits or concerns of such inclusion. But for transparency, my personal goal is to avoid any "AI" generated code or software from "AI-assisted" coding. I feel this goal could be analogous to some users preferring a purely free / libre system; for that goal we'd note that arch linux would not likely be a good choice, but Parabola could be a suitable alternative.
For those with the goal of avoiding "AI" content in their software, are there any similar options / recommendations for using arch linux, or alternative distros that do have rigorous policies on the topic?
---
Note I put "AI" in quotes for multiple reasons which are beyond the scope of this topic. Semantic arguments about that term are out of scope for this discussion. Feel free to replace any use of "AI" in this post with "LLM" or any / all specific examples of the LLM-backed coding assistant tools.
Last edited by Trilby (Today 17:31:45)
"UNIX is simple and coherent" - Dennis Ritchie; "GNU's Not Unix" - Richard Stallman
Offline
People really need to understand this thing isn't A.I.; it's just a glorified search engine. We definitely need to steer clear of fully generated code with no human in the middle to check it.
Offline
You mean the packaged software would be AI-free, not any archlinux projects?
Any promise of 100% made by real human fools would have to be an extension of such a promise made for every packaged project, which doesn't sound realistic (even if purely going by blind trust, you're not getting such commitments from each and every small project)
Any policy of "we don't tolerate LLM generated code" (ie. flag for removal) would get you into a tough spot when some critical project accepts LLM generated code (arch doesn't apply downstream patches, so the only fix would be a fork or complete substitution or feature loss) and you might have to make contentious calls about what's considered LLM-generated (eg. when some dev uses an LLM to write some bulky code and then manually checks and fixes it - is that still LLM generated?)
My *advice* on the matter is to be as agnostic as possible and have more faith in evolution.
If a project gets taken over by an AI the (reasonable) concern is that it will turn to shit and then people will just stop using it because it's shit as they would if it turned to shit for other reasons.
If otoh you have some high quality tool that does an excellent job and is actively maintained and has a responsive community - do you really want to drop it because there's somewhere some LLM involved and therefore it lost its "purity"?
Online
0% Ai makes no sense nowadays!
The real issue is the proliferation of mini-projects where over 80% of the code is AI-generated; however, nothing compels us to use them—in my view, they are essentially proofs of concept. It is difficult to assign any meaningful copyright to such projects.
The Linux policy: https://docs.kernel.org/process/coding-assistants.html
Are you planning to exclude kernels with Arch Linux?
Not because of "purity" but because of security.
The same problem has always existed with an average developer.
Last edited by papajoke (Today 15:31:24)
LTS - Fish - Kde - intel M100 - 16Go RAM - ssd
Offline
do you really want to drop it because there's somewhere some LLM involved and therefore it lost its "purity"?
Not because of "purity" but because of security. Countless projects with LLM-generated code have been found to be beyond-irresponsible with user data, in many cases shipping it off to data brokers. When some slop coder gets an output from the LLM that "works" and then ships it off to users without understanding what the code actually does, the potential for harm is immeasurable. But the harms of LLM use go far far beyond the impact on the product / user.
Any policy of "we don't tolerate LLM generated code" (ie. flag for removal) would get you into a tough spot
The same could be said of any policy against closed-source binary blobs. That doesn't mean it can't be done and hasn't been done. There are several software projects that have very clear and strictly-enforced prohibitions against any use of AI coding agents or other LLM code-generation tools.
0% Ai makes no sense nowadays!
Perhaps not to you. Some people would argue that politicians not molesting children makes no sense nowadays. That doesn't mean we just accept what is popular as being okay.
nothing compels us to use them
Of course nothing compels us to use software we don't want to use. Nothing compels anyone to use proprietary binary blobs either. That doesn't mean a project like Parabola has no place. The place of a project like Parabola is not that it is the only way to avoid being "compelled" to use closed-source material, but rather that it is a curated set of content that is designed to be only free / libre. Why is such a project for people who want to be free of LLM content so odd to you?
I'm not suggesting anyone is or should be compelled to do anything. I'm asking about resources for users to make informed decisions.
But it seems this community is all-in on the LLM bullshit. So I'm all out.
And before someone tries further moron-splaining: I've taught machine learning classes for graduate students at MIT. I know precisely how these technologies work. I am ethically opposed to their use. I don't have any interest in preaching about my ethics - but this community has always been supportive of people having their own individual goals for their software use.
Last edited by Trilby (Today 15:34:04)
Offline
It would be nice if anything that is AI generated had some note of this. The trouble, of course, as with everything, is that the ethical devs would do this, the unethical would hide it, and we would need some system to check.
Rlu: 222126
Online
FYI, apparently both Gentoo and NetBSD have implemented a ban on AI-generated content: https://www.tomshardware.com/software/l … rated-code
So for everyone saying this is impossible - you clearly are disconnected from reality, perhaps from too much chat bot usage.
Offline
Trilby, focusing on your question, some non-authoritative information from observation/interaction:
So far I have not encountered any official policy on that. By that I don’t mean only one prominently advertised, but even one that would be simply circulated around.
Among contributors there is some concern about the negative impact of such content. This may not reflect the position of all contributors. This concern doesn’t mean that any use of “AI” algorithms is seen as automatically unwanted. But certainly, if Arch had to publish an official policy now, you would see in it a strong push-back against many uses.
Do you know how to tell replies in this thread aren’t from LLMs? Any LLM would follow this simple prompt: “I am not looking for discussion on the merits or concerns of such inclusion.” Humans failed to do so.
Offline
So for everyone saying this is impossible - you clearly are disconnected from reality, perhaps from too much chat bot usage.
Sure it's possible, just switch to a different kernel. Simple enough, right? Oh, but of course, you back it up with an article from over 2 years ago that has little to do with anything, so there's that.
Disconnected from reality is an apt phrase here, but the other posters aren't the ones guilty of it.
Last edited by Scimmia (Today 16:08:12)
Offline
Thanks mpan for an on-topic and useful reply.
Scimmia, you think an article from two years ago is no longer accurate? It's quite easy to confirm that these policies are still in place for both Gentoo and NetBSD. So I'm not sure what you're even getting at. But in any case, since you don't seem to want to contribute productively, but rather troll on the one thing I said this thread should not be about, please feel free to leave this thread and take your bullshit elsewhere.
Last edited by Trilby (Today 16:35:19)
Offline
NetBSD's development is completely and totally different from a Linux distro, so no, no relevance there, and Gentoo's policy has nothing whatsoever to do with packaging AI generated code, so again, no relevance.
And now you've added "archlinux distro is not appropriate for meeting these goals". Have you really not realized yet that using linux at all cannot meet your goals? Disconnected from reality indeed.
Last edited by Scimmia (Today 16:46:04)
Offline
The reason you give is security...
To me, that's actually a weak argument, because no developer is perfect.
The real issue with open source is this:
> This is not about the quality but of the legality of AI contributions. If a model was trained on copyrighted code and/or code under a non-permissive license that is incompatible with the target project, then it may not be legal to include its output.
This is a huge issue. So yes, if that's your concern, then your topic is absolutely welcome.
My goal is to avoid any code or software generated by AI or created with AI-assisted programming.
The request needs to be clearly defined.
I'm not necessarily opposed to that. But do you really make no distinction between using AI and "vibe coding"? What is the minimum acceptable use? 1%? 0%?
Nowadays, after writing a script, I often ask my AI to review it. If it points out two or three minor issues, then yes, I used AI — but as a tool, not as a general-purpose code generator. In the end, it may have contributed only five lines of code. Should I be excluded?
In the past, developers used to copy code from a famous programming forum...
Recently, I wrote a small GUI application with more than 100 functions. For one of them, I had AI generate the code. It was just a visual effect that adds nothing essential to the application itself. I didn't want to spend an entire day on something so trivial. Should I be excluded? In terms of security, there is certainly zero risk; however, I cannot answer regarding the license.
For translations, I asked AI to generate 50 sentences in five languages. Should I be excluded?
I also asked AI to generate a man page. Should I be excluded?
Same for writing tests?
Last edited by papajoke (Today 17:20:35)
Offline
There is currently no policy on this topic for the overall project, although some subprojects (i.e. buildbtw) have their own policies.
A few people are currently working towards an RFC on the topic; the exact wording & guidance are not yet clear.
Offline
Papajoke, I should not have mentioned the security point. I have quite a long list of reasons for my views - but I never wanted this thread to be about those reasons as good people could have different views on those reasons. The goal of the thread is to learn whether there were policies in place or if other distros / OSs had such policies.
The "percent" of generated content that I'd be okay with would be hard for me to answer, but it is also not actually relevant to the present thread. My own threshold for comfort is beside the point. The point is whether, or to what degree, the archlinux project / devs have guidelines or policies that they (aim to) adhere to on this topic.
Thanks gromit - that's the kind of answer I was seeking (it's not the answer I was hoping for, but it is an informative response to the question). I look forward to seeing any such RFC and where it goes.
Last edited by Trilby (Today 17:30:17)
Offline
distros that do have rigorous policies on the topic?
Since Gentoo and NetBSD got the spotlight in this thread: Fedora does allow AI-assisted contributions, but requires a disclosure tag (Assisted-by:) and makes the human submitter take full accountability/responsibility for the code
https://docs.fedoraproject.org/en-US/co … on-policy/
and Debian hasn't decided on or adopted a "rule" yet https://people.debian.org/~lucas/debian … esolution.
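For anyone who wants to check for such a disclosure themselves, here is a rough sketch, not anything official. It assumes the Assisted-by: tag shows up as its own line in the commit message, git-trailer style; the exact format, the function name `assisted_by` and the sample commit message are all made up for illustration.

```python
import re

# Assumption: the disclosure is a line of the form "Assisted-by: <tool>"
# in the commit message. The exact format is hypothetical, not taken
# from any distro's policy text.
TRAILER = re.compile(r"^Assisted-by:[ \t]*(\S.*)$", re.IGNORECASE | re.MULTILINE)


def assisted_by(message: str) -> list[str]:
    """Return the tool names found on Assisted-by lines of a commit message."""
    return [name.strip() for name in TRAILER.findall(message)]


# Made-up commit message, for illustration only.
example = """Fix off-by-one in parser

Assisted-by: ExampleAssistant
Signed-off-by: Jane Doe <jane@example.org>
"""

print(assisted_by(example))
```

Of course this only finds what a submitter chose to disclose; it cannot detect undisclosed use.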
Offline
I do think a lot of the misunderstandings in this thread could be cleared up if you mention what your scope is here. If it's about upstream code, then this is nigh impossible to guarantee without a huge effort to patch things. If it's just about developments within/for Arch's tooling, then I'd see such a policy as a more reasonable and more enforceable endeavor.
Offline
When some slop coder gets an output from the LLM that "works" then ships it off to users without understanding what the code actually does, the potential for harm is immeasurable.
You're arguing the results, not the tools.
If a project gets taken over by an AI the (reasonable) concern is that it will turn to shit and then people will just stop using it because it's shit as they would if it turned to shit for other reasons.
But it seems this community is all-in on the LLM bullshit.
https://bbs.archlinux.org/viewtopic.php?id=313959
There's a difference between being all into something and pointing out obvious problems w/ the practical implementation of suggestions.
FYI, apparently both Gentoo and NetBSD have implemented a ban on AI-generated content
https://wiki.gentoo.org/wiki/Project:Council/AI_policy
This policy affects Gentoo contributions and the official Gentoo projects. It does not prohibit adding packages for AI-related software or software that is being developed with the help of such tools upstream.
Edit, @V1del
my personal goal is to avoid any "AI" generated code or software from "AI-assisted" coding. I feel this goal could be analogous to some users preferring a purely free / libre system
Last edited by seth (Today 17:47:57)
Online
This might be a little out of scope of the discussion, but the Linux Kernel Developers recently published their stance on Coding Assistants, which is basically that the human submitter is accountable for what the Coding Assistants produce.
Claire is fine.
Problems? I have dysgraphia, so clear and concise please.
My public GPG key for package signing
My x86_64 package repository
Offline