
#1 Yesterday 10:27:31

mskrd
Member
Registered: Yesterday
Posts: 6

IsMyArchFree - a simple programs that tell if your packages are free

Hi everyone, i created a program called "IsMyArchFree" that lists all your packages and flags their licenses as "free", "permissive", "proprietary" or "custom".

It currently identifies all packages on my system correctly, but that might not be the case on yours. If anyone could contribute by posting the program's output in this thread or in the GitHub repository, that would be very nice!

https://github.com/mskrd/ismyarchfree

Offline

#2 Yesterday 12:13:44

cryptearth
Member
Registered: 2024-02-03
Posts: 1,325

Re: IsMyArchFree - a simple programs that tell if your packages are free

is there a reason you implemented this in C rather than a bash script if you just use

fp = popen("pacman -Qi | awk -F ': ' '/^Name/ {name=$2} /^License/ {print name \":\" $2 }'", "r");

anyway? - which, btw, looks very wrong as the pipe is something the shell provides - so if you want to go proper you should first just use pacman, capture its output, call awk in its own call and feed it what you got from pacman - your way is just overcomplicated shell scripting packaged in C

also:

#define MAX_LINES 1000
#define MAX_LENGTH 512

int main(void) {
    FILE *fp;
    char line[MAX_LENGTH];
    char packages[MAX_LINES][MAX_LENGTH];

why go with fixed length arrays? use a dynamic length list

also also: it fails when not run with en_US - which you don't set anywhere

also also also:

 printf("\033[0;32m%d

requires to be run in a shell which understands this - which breaks over a serial connection using screen which is monochrome instead of color

overall: broken by design

Last edited by cryptearth (Yesterday 12:19:18)

Online

#3 Yesterday 12:50:23

Trilby
Inspector Parrot
Registered: 2011-11-29
Posts: 30,338
Website

Re: IsMyArchFree - a simple programs that tell if your packages are free

What's your distinction between "free" and "permissive"?  Typically when those are contrasted, "free" would mean copyleft.  But only one of your "free" licenses is copyleft, the other three are more permissive than several of your "permissive" licenses.

Side note, I bet my sed implementation is a lot faster than your "C" code:

#!/bin/sh

sed -n '
/%NAME%/ {
	n
	s/$/:/
	h
}
/%LICENSE%/ {
	:loop
		n
		/^$/ {
			b end
		}
		s/\(GPL\|WTFPL\|CC0\|CC-PDDC\)/\tfree/
		s/\(MIT\|Apache\|OFL\|BSD\|MPL\|SIL\|HPND\|ISC\|X11\|APACHE\)/\tpermissive/
		s/\(PerlArtistic\|PSF\|Zlib\|PostgreSQL\|wxWindows\|zlib\)/\tpermissive/
		s/\(sleepycat\|BSL\)/\trestrictive/
		s/\(chrome\)/\tproprietary/
		H
		b loop
	:end
	g
	/\tfree/ { s/:.*/\tfree/; }
	/\tpermissive/ { s/:.*/\tpermissive/; }
	/\trestrictive/ { s/:.*/\trestrictive/; }
	/\tproprietary/ { s/:.*/\tproprietary/; }
	s/:.*/\tcustom/
	p
}' \
/var/lib/pacman/local/*/desc

This doesn't generate counts, but can easily be piped into sort and uniq to get counts, e.g., call this script "licenses" then `licenses | sort -k2 | uniq -f 1 -c` to show the counts.
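
The same local-database files can also be read from C without spawning pacman or awk. A rough sketch, assuming the `desc` layout the sed script above relies on (a `%NAME%` or `%LICENSE%` header line, then its values, then a blank line). `read_desc` is a made-up helper name, and it keeps only the first license line:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read one pacman local-database "desc" stream and copy the package name and
 * its first license line into the caller's buffers (hypothetical helper).
 * Returns 1 if both were found, 0 otherwise. */
int read_desc(FILE *fp, char *name, size_t name_sz, char *license, size_t lic_sz)
{
    char line[1024];   /* assumption: name and license lines are shorter */
    enum { NONE, NAME, LICENSE } section = NONE;
    int have_name = 0, have_license = 0;

    assert(fp != NULL && name != NULL && license != NULL);
    assert(name_sz > 0 && lic_sz > 0);
    name[0] = license[0] = '\0';

    while (fgets(line, sizeof line, fp) != NULL) {
        line[strcspn(line, "\n")] = '\0';        /* strip the newline */
        if (line[0] == '\0') {
            section = NONE;                      /* blank line ends a section */
        } else if (strcmp(line, "%NAME%") == 0) {
            section = NAME;
        } else if (strcmp(line, "%LICENSE%") == 0) {
            section = LICENSE;
        } else if (line[0] == '%') {
            section = NONE;                      /* some other field header */
        } else if (section == NAME && !have_name) {
            snprintf(name, name_sz, "%s", line);
            have_name = 1;
        } else if (section == LICENSE && !have_license) {
            snprintf(license, lic_sz, "%s", line);
            have_license = 1;
        }
    }
    return have_name && have_license;
}
```

Since the field headers in these files are not translated, this approach does not depend on the locale.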

Last edited by Trilby (Yesterday 13:23:58)


"UNIX is simple and coherent" - Dennis Ritchie; "GNU's Not Unix" - Richard Stallman

Online

#4 Yesterday 13:59:16

mskrd
Member
Registered: Yesterday
Posts: 6

Re: IsMyArchFree - a simple programs that tell if your packages are free

cryptearth wrote:

is there a reason you implemented this in C rather than a bash script if you just use

Yeah i mean i could make it simpler and lighter, but ~150 lines of C is not that big of a deal for a simple and useful program, is it?

cryptearth wrote:

why go with fixed length arrays? use a dynamic length list

Well, line length won't be more than 512 so there is no reason to make it dynamic. I agree that i should've allocated the array dynamically based on the number of packages instead of using a fixed MAX_LINES.


cryptearth wrote:

requires to be run in a shell which understands this - which breaks over a serial connection using screen which is monochrome instead of color

I only use Bash, do other shells not understand this? Also i don't think anyone uses monochrome screens.

cryptearth wrote:

overall: broken by design

Respectfully, i made this program in like 1 hour and shared it so that anyone who finds it useful can use it. I did not strive for the fewest lines of code possible or for support of every arch system in the world. If it doesn't work on someone's machine, they can easily make their changes in the source code, compile it in less than a second and run it. About it breaking if the locale isn't en_US: i did not quite understand why that would be the case, so if you can fix it i would appreciate that! Thanks for your feedback

Offline

#5 Yesterday 14:06:00

mskrd
Member
Registered: Yesterday
Posts: 6

Re: IsMyArchFree - a simple programs that tell if your packages are free

Trilby wrote:

What's your distinction between "free" and "permissive"?  Typically when those are contrasted, "free" would mean copyleft.  But only one of your "free" licenses is copyleft, the other three are more permissive than several of your "permissive" licenses.

I used ChatGPT to categorize them (not a good idea apparently), I'm not really into licenses so i did not know about all these differences. I will fix that, Thanks for your feedback!

Offline

#6 Yesterday 14:39:22

ReDress
Member
From: Nairobi
Registered: 2024-11-30
Posts: 113

Re: IsMyArchFree - a simple programs that tell if your packages are free

cryptearth wrote:

is there a reason you implemented this in C rather than a bash script if you just use

fp = popen("pacman -Qi | awk -F ': ' '/^Name/ {name=$2} /^License/ {print name \":\" $2 }'", "r");

anyway? - which, btw, looks very wrong as the pipe is something the shell provides - so if you want to go proper you should first just use pacman, capture its output, call awk in its own call and feed it what you got from pacman - your way is just overcomplicated shell scripting packaged in C

also:

#define MAX_LINES 1000
#define MAX_LENGTH 512

int main(void) {
    FILE *fp;
    char line[MAX_LENGTH];
    char packages[MAX_LINES][MAX_LENGTH];

why go with fixed length arrays? use a dynamic length list

also also: it fails when not run with en_US - which you don't set anywhere

overall: broken by design

"Broken by design" - yeah, could be, though your argument so much screams of ...."*I* find it easier to do this in a shell script so why don't/won't *you* do it in a shell script"...

Last edited by ReDress (Yesterday 14:40:00)

Offline

#7 Yesterday 16:10:00

cryptearth
Member
Registered: 2024-02-03
Posts: 1,325

Re: IsMyArchFree - a simple programs that tell if your packages are free

mskrd wrote:
cryptearth wrote:

is there a reason you implemented this in C rather than a bash script if you just use

Yeah i mean i could make it simpler and lighter, but ~150 lines of C is not that big of a deal for a simple and useful program, is it?

ReDress wrote:

"Broken by design" - yeah, could be, though your argument so much screams of ...."*I* find it easier to do this in a shell script so why don't/won't *you* do it in a shell script"...

@ReDress
first - don't full quote - thanks
@both
you COMPLETELY missed my point of "compiled executable binary vs shell code" - so here we go:

first of all:

fp = popen("pacman -Qi | awk -F ': ' '/^Name/ {name=$2} /^License/ {print name \":\" $2 }'", "r");

from man popen:

The  popen()  function  opens a process by creating a pipe, forking, and invoking the shell.  Since a pipe is by definition unidirectional, the type argument may specify only reading or writing, not both; the resulting stream is correspondingly read-only or write-only.
The command argument is a pointer to a null-terminated string containing a shell command line.  This command is passed to /bin/sh using the -c flag; interpretation, if any, is performed by the shell.

so we already use shell code here which has to be executed by a shell, and use some convenience helper inside a compiled binary instead of executing that shell code itself in the shell we use to call the binary - NOPE! - not only does it rely on a proper shell configuration - it actually requires a shell in the first place - which, fun fact, is not something mandatory for a working linux system - a linux system can properly run without any shell - you just can't interact with it "from a local terminal" but need other means like a remote web interface - but a shell is not a hard requirement

my point is NOT about "oh this is easier in shell script than in a compiled language" - it's about "why write a wrapper in a compiled language around it in the first place"?
bash and other shells are already powerful enough to do that sort of thing using either built-in stuff or standard tools like grep, sed, awk and so on
the whole code is reinventing the wheel rather than use the existing one - it's not just not better - it's even worse
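
For reference, a C program that really wants to capture a command's output without involving /bin/sh can create the pipe itself and exec the command directly. A minimal sketch using POSIX `pipe`, `fork` and `execvp`; `spawn_read` is a hypothetical helper name, not something from the OP's code:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start argv[0] directly (no shell) and return a stream connected to its
 * standard output, or NULL on failure. The child's pid is stored in
 * *pid_out so the caller can waitpid() for it after closing the stream. */
FILE *spawn_read(char *const argv[], pid_t *pid_out)
{
    int fds[2];
    pid_t pid;
    FILE *fp;

    assert(argv != NULL && argv[0] != NULL && pid_out != NULL);
    if (pipe(fds) == -1)
        return NULL;

    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return NULL;
    }
    if (pid == 0) {                      /* child: stdout -> pipe, then exec */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        execvp(argv[0], argv);
        _exit(127);                      /* only reached if exec failed */
    }

    close(fds[1]);                       /* parent keeps only the read end */
    *pid_out = pid;
    fp = fdopen(fds[0], "r");
    if (fp == NULL)
        close(fds[0]);
    return fp;
}
```

With an argument vector such as `{ "pacman", "-Qi", NULL }`, the filtering that awk did would then have to happen in the C code reading the stream.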

mskrd wrote:
cryptearth wrote:

why go with fixed length arrays? use a dynamic length list

Well, line length won't be more than 512 so there is no reason to make it dynamic. I agree that i should've allocated the array dynamically based on the number of packages instead of using a fixed MAX_LINES.

you shouldn't use anything fixed based on assumptions on your system - especially when you mean to distribute that code for others to use
I have a triple-head setup - stretching Konsole over all 3 screens gives lines of 700 chars - and some output of pacman even wraps around THAT still! so clearly your assumption of "nah, there won't be any output of that command that ever will exceed 512 chars / line" is proven just wrong

mskrd wrote:
cryptearth wrote:

requires to be run in a shell which understands this - which breaks over a serial connection using screen which is monochrome instead of color

I only use Bash, do other shells not understand this? Also i don't think anyone uses monochrome screens.

again - you just ASSUME based on YOUR setup - don't do so if you distribute your code!
also I don't talk about a physical monochrome display - but rather the helper software screen: https://archlinux.org/packages/extra/x86_64/screen/
and yes - there ARE regular use cases for actual physical serial connections: for anyone using single-board-computers like raspberry pi or development kits or microcontrollers: most of development is done over actual rs-232 serial because it's quite costly to set up a full network stack to provide telnet or even ssh - and although screen over a serial connection does in fact understand escape sequences, it's still just displayed in simple monochrome white without even shades of grey
so in order to take advantage of colored output - yes, a terminal which understands and can display that is required - and just because you don't use it doesn't mean there aren't others who rely on much simpler stuff than a colored shell in a local terminal
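
One common way to soften this is to emit color codes only when the output looks like a capable terminal, and to give the user an opt-out. A sketch under stated assumptions: `want_color` and `print_count` are made-up names, and honoring the `NO_COLOR` variable is an informal convention rather than anything pacman or the OP's program defines. It cannot detect the screen-over-serial case described above, since that still is a terminal, which is exactly why the opt-out matters:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Decide whether to emit ANSI color codes on file descriptor fd: only when
 * it is a terminal, TERM is set and not "dumb", and NO_COLOR is unset or
 * empty. */
int want_color(int fd)
{
    const char *no_color = getenv("NO_COLOR");
    const char *term = getenv("TERM");

    if (no_color != NULL && no_color[0] != '\0')
        return 0;
    if (term == NULL || strcmp(term, "dumb") == 0)
        return 0;
    return isatty(fd) ? 1 : 0;
}

/* Print a count in green when color is enabled, plain otherwise. */
void print_count(FILE *out, int count, int color)
{
    assert(out != NULL);
    if (color)
        fprintf(out, "\033[0;32m%d\033[0m\n", count);
    else
        fprintf(out, "%d\n", count);
}
```

A caller would use it as `print_count(stdout, n, want_color(STDOUT_FILENO));`, so piping the output into a file or another program yields plain text.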

mskrd wrote:
cryptearth wrote:

overall: broken by design

Respectfully, i made this program in like 1 hour and shared it so that anyone who finds it useful can use it. I did not strive for the fewest lines of code possible or for support of every arch system in the world. If it doesn't work on someone's machine, they can easily make their changes in the source code, compile it in less than a second and run it. About it breaking if the locale isn't en_US: i did not quite understand why that would be the case, so if you can fix it i would appreciate that! Thanks for your feedback

again - you just assume based on your setup, because it so happens that whatever locale you have set results in pacman spitting out the word "Licenses"
in german this word is translated to Lizenzen - which already breaks your code as it will never find any line containing the english word - resulting in just 0
thus, the simplest way is to just precede the call with LC_ALL=C - but a tool written for distribution is supposed to handle that itself - so if you don't care about i18n then at least add this LC_ALL=C to your shell abuse
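
If the program keeps using popen, handling the locale itself could be as small as this. A sketch, assuming it is acceptable to change `LC_ALL` for the whole process; `popen_c_locale` is a hypothetical wrapper name:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Open a read pipe to cmd with LC_ALL=C in the environment, so that the
 * child processes print untranslated field names. Note that this changes
 * the environment of the calling process as well. */
FILE *popen_c_locale(const char *cmd)
{
    assert(cmd != NULL);
    if (setenv("LC_ALL", "C", 1) != 0)
        return NULL;
    return popen(cmd, "r");
}
```

Prefixing the command string itself with `LC_ALL=C ` has the same effect for the child only, since popen hands the string to /bin/sh.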

mskrd wrote:
Trilby wrote:

...

I used ChatGPT ...

and THIS is where I stopped reading and thinking and just went "kthxbai"

Online

#8 Yesterday 16:36:15

seth
Member
Registered: 2012-09-03
Posts: 61,543

Re: IsMyArchFree - a simple programs that tell if your packages are free

grep -A1 '%LICENSE%'  /var/lib/pacman/local/*/desc > /tmp/package.licenses
printf "RMS pulse: 60 bpm\t"; grep -Ec '(GPL|WTFPL|CC0|CC-PDDC)' /tmp/package.licenses
printf "RMS pulse: 100 bpm\t"; grep -Ec '(MIT|Apache|OFL|BSD|MPL|SIL|HPND|ISC|X11|APACHE|PerlArtistic|PSF|Zlib|PostgreSQL|wxWindows|zlib)' /tmp/package.licenses
printf "RMS pulse: 180 bpm\t"; grep -Ec '(sleepycat|BSL)' /tmp/package.licenses
printf "RMS pulse: 0 bpm\t"; grep -Ec '(chrome)' /tmp/package.licenses
rm /tmp/package.licenses

Edit, @ReDress - abusing C to more or less just wrap a system call is poor style. You make people type a command into their shell that then just types another command into the shell.
That's not a matter of preference. If you want to use a shell script, don't wrap that into a binary (nb. this isn't even a compiled shell script)

Last edited by seth (Yesterday 16:42:11)

Offline

#9 Yesterday 17:21:45

ReDress
Member
From: Nairobi
Registered: 2024-11-30
Posts: 113

Re: IsMyArchFree - a simple programs that tell if your packages are free

seth wrote:

Edit, @ReDress - abusing C to more or less just wrap a system call is poor style. You make people type a command into their shell that then just types another command into the shell.
That's not a matter of preference. If you want to use a shell script, don't wrap that into a binary (nb. this isn't even a compiled shell script)

I've not looked at OP's program and would agree that that is poor style, though...

For me, my shell scripting isn't that good. Not to suggest that my C coding is great or any better, but a lot of times it's easier for me to write a C "script" than to fumble around the internet trying to write a shell script. I'm still trying to improve my shell scripting though :-)

Offline

#10 Yesterday 17:27:58

Head_on_a_Stick
Member
From: The Wirral
Registered: 2014-02-20
Posts: 8,806
Website

Re: IsMyArchFree - a simple programs that tell if your packages are free

There is an Arch version of Debian's check-dfsg-status package but it retains the unfortunate name of its predecessor: https://aur.archlinux.org/packages/vrms-arch-git

Python is better than bash & C, right? :P

EDIT: or sed.

Last edited by Head_on_a_Stick (Yesterday 17:28:24)


Jin, Jîyan, Azadî

Offline

#11 Yesterday 17:43:16

ReDress
Member
From: Nairobi
Registered: 2024-11-30
Posts: 113

Re: IsMyArchFree - a simple programs that tell if your packages are free

Maybe Perl is the ultimate overlord of text processing. Who knows >.>

Offline

#12 Yesterday 17:59:22

Trilby
Inspector Parrot
Registered: 2011-11-29
Posts: 30,338
Website

Re: IsMyArchFree - a simple programs that tell if your packages are free

Who knows?  Anyone who has written perl code suspects that it is not.  Anyone who has read perl code knows it is not.

Although I suppose I should be careful as I posted a sed solution.  Personally I do find sed to be one of the most aesthetically pleasing languages - I really love it.  But I recognize that this is probably a bit like people adoring their pet when they have one of those hairless dogs with crooked teeth.

Last edited by Trilby (Yesterday 18:00:25)


"UNIX is simple and coherent" - Dennis Ritchie; "GNU's Not Unix" - Richard Stallman

Online

#13 Yesterday 18:03:15

ReDress
Member
From: Nairobi
Registered: 2024-11-30
Posts: 113

Re: IsMyArchFree - a simple programs that tell if your packages are free

Trilby wrote:

Who knows?  Anyone who has written perl code suspects that it is not.  Anyone who has read perl code knows it is not.

It just so happens that quite a few Linux kernel tools that handle text seem to be written in Perl. Just thought that might be the reason why.

Offline

#14 Yesterday 19:22:57

Head_on_a_Stick
Member
From: The Wirral
Registered: 2014-02-20
Posts: 8,806
Website

Re: IsMyArchFree - a simple programs that tell if your packages are free

And in respect of the licences themselves, the OP should be aware of the SPDX identifiers, which are used in (most) Arch packages these days. The provided link shows which licences are approved by the FSF & OSI.


Jin, Jîyan, Azadî

Offline

#15 Yesterday 20:07:43

mskrd
Member
Registered: Yesterday
Posts: 6

Re: IsMyArchFree - a simple programs that tell if your packages are free

cryptearth wrote:

and THIS is where I stopped reading and thinking and just went "kthxbai"

Yeah bro, the code that i wrote in 1 hour to create a small useful program is indeed not perfect. I totally should've thought of people who run Linux without a shell or on a raspberry pi. Why don't we go further? I probably should've included voice control so people who run Linux without fingers can use it too. Now a question for you: do you understand what the point of GitHub is? If i recall correctly, it's for people to build projects together, not to publish something complete and forget about it. If this program offended you that much, then go and demonstrate your divine programming knowledge by creating a pull request, so that finally all the people who run Linux without a shell can enjoy my program.

Offline

#16 Yesterday 20:43:30

seth
Member
Registered: 2012-09-03
Posts: 61,543

Re: IsMyArchFree - a simple programs that tell if your packages are free

Nobody runs linux w/o a shell and if they did, your program would not work. That's kinda the point cryptearth was trying to convey.
You (or chatgpt) wrote some C code that just wraps a "pacman | awk" chain, which raises the question why one would use the C wrapper to execute a shell script.
(I assume the answer is that you asked chatgpt for a C program to do this, chatgpt knew that one could query pacman and filter it with awk, but you asked for C code and because it has no mind of its own, chatgpt would answer your question rather than questioning the task or its own approach - this is how you end up with glue on a pizza)

I see where you might feel under attack by the responses, but we're all nerds and therefore empathically challenged.
Your take-aways should be
1. don't use LLMs to get solutions, only inspirations (if at all) - they're good at selling being smart, but they're way dumber than you.
2. pointless wrappers that only lower portability and flexibility but introduce the opportunity for OOB memory violations are bad, if you want to write this in C/++, see /usr/include/alpm.h
3. poking around to learn something is never a waste of time - iff you actually end up learning from your mistakes.

The pacman output is localized, having awk search for "Name" and "License" isn't gonna work on non-english systems.
github today is probably for MS to train their LLM, it used to be a public git server - collaboration is optional: git is a version control system, not a social media platform.

Offline

#17 Yesterday 20:59:33

mpan
Member
Registered: 2012-08-01
Posts: 1,377
Website

Re: IsMyArchFree - a simple programs that tell if your packages are free

mskrd: thanks for the contribution. :)

I assume that you’re learning C programming and this is one of your experiments. I’ll comment accordingly.

Regarding lines array:
You only need to process one line at a time. So there is no need to store all of them. This poses a problem not only because of the number of lines, but also because you make a huge local variable (here 1000 × 512 bytes, about 500 KiB). In all C compilers⁽¹⁾ those variables are allocated on the stack. The call stack is meant to only hold small amounts of information: usually a dozen little variables per function. Large data goes to the heap, which in all C compilers⁽¹⁾ means dynamically allocated memory. In this program and with a small number of lines you can get away with that, but be aware that in general this will fail.
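
As an illustration of the one-line-at-a-time approach, POSIX `getline` reads a line of any length into a heap buffer that it grows as needed, so neither `MAX_LINES` nor `MAX_LENGTH` is required. A minimal sketch; `count_matching_lines` is a made-up name, and the substring test is only there to give the loop something to do:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Count the lines of fp that contain needle, holding only one line in
 * memory at a time. getline() allocates and resizes the buffer itself. */
long count_matching_lines(FILE *fp, const char *needle)
{
    char *line = NULL;   /* owned by getline(), released with free() */
    size_t cap = 0;
    long count = 0;

    assert(fp != NULL && needle != NULL);
    while (getline(&line, &cap, fp) != -1) {
        if (strstr(line, needle) != NULL)
            count++;
    }
    free(line);
    return count;
}
```

The stream could be the result of popen or an ordinary file; the function does not care which.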

Regarding localization:
Your code assumes that `pacman -Qi` is going to print output containing the words “Name” and “License” to identify the relevant lines. It does so only in English locales. In other locales the words are going to be different. You may force pacman to use the “C locale,” which is both English and virtually always available: set the environment variable `LANG` before invoking `pacman`: `LANG=C pacman -Qi …` (if `LC_ALL` or `LC_MESSAGES` is set on the system, it takes precedence over `LANG`, so `LC_ALL=C` is the more robust choice). However, you may avoid the problem altogether by using expac. Parsing output meant for humans is generally a bad idea, as it’s unreliable. `expac` allows you to set the format yourself, to one that suits stable machine parsing. Even better would be to interface with libalpm directly, but that would require a lot more work too.

The use of `strncpy`:
In line 21 you call strncpy. I guess your goal was to avoid exceeding the buffer. But you didn’t, and that is a mistake people often make by misunderstanding the purpose of `strncpy`. `strncpy` is such an ancient relic that most don’t even know what purpose it served, so a new use was incorrectly attributed to it.

`strncpy` is a function used to work with fixed-length records, a format used back in the dawn of computing. Strings always had the same length and unused characters were filled with 0 bytes. Not to be confused with c-strings, which use a 0 byte as a single terminator. And that confusion is critical here, because fixed-length strings didn’t use the terminator. If a field was 8 bytes in length, and the text was 8 characters, it didn’t get a 0 byte at the end. `strncpy` behavior reflects exactly that. It fills unused bytes with 0s (which is a waste of time nowadays), but will not add a terminator where the string is too long (which produces an invalid c-string). This leads to CWE-120, which is what you wanted to avoid in the first place.

The lack of historical context makes people believe it is used to provide some kind of safety. It does not. While it may be used in a safe manner, that will not be in any way better than calling `memmove`. So just forget about `strncpy`. You have to correctly calculate the size of the buffer, period.
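
To make the size calculation concrete, here is one way a bounded copy can be written without `strncpy`. A sketch only: `copy_string` is a hypothetical helper, and truncating (rather than rejecting) an over-long string is a design choice the caller has to be happy with:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst, which has room for dst_sz bytes, and always leave a
 * valid NUL-terminated string behind.
 * Returns 1 if the whole string fit, 0 if it had to be truncated. */
int copy_string(char *dst, size_t dst_sz, const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL && dst_sz > 0);
    len = strlen(src);
    if (len >= dst_sz) {
        memcpy(dst, src, dst_sz - 1);   /* keep one byte for the terminator */
        dst[dst_sz - 1] = '\0';
        return 0;
    }
    memcpy(dst, src, len + 1);          /* len + 1 includes the terminator */
    return 1;
}
```

The return value lets the caller notice truncation instead of silently working with a cut-off package name or license.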

Regarding matching:
All tests for licenses use strstr. `strstr` just searches for matching bytes. It doesn’t respect word boundaries. So for example `strstr("forbidden", "bid")` returns a match. Are you sure no license identifier is ever contained in another license identifier? In particular custom/LicenseRef ones, which may contain arbitrary strings?
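
A whole-token comparison avoids that class of false matches. The sketch below assumes that identifiers in a license string are separated by whitespace or parentheses; `has_license_token` is a made-up name, and `LicenseRef-SUBMIT` in the usage note is an invented identifier used purely for illustration:

```c
#include <assert.h>
#include <string.h>

/* Return 1 if id occurs in expr as a complete token, where tokens are
 * separated by whitespace or parentheses; return 0 otherwise. */
int has_license_token(const char *expr, const char *id)
{
    static const char delims[] = " \t\n()";
    size_t id_len;

    assert(expr != NULL && id != NULL);
    id_len = strlen(id);
    if (id_len == 0)
        return 0;

    while (*expr != '\0') {
        size_t tok_len;

        expr += strspn(expr, delims);      /* skip separators */
        tok_len = strcspn(expr, delims);   /* length of the next token */
        if (tok_len == id_len && strncmp(expr, id, id_len) == 0)
            return 1;
        expr += tok_len;
    }
    return 0;
}
```

With this, `has_license_token("LicenseRef-SUBMIT", "MIT")` is 0 although `strstr` finds a match, and `MIT-0` is no longer mistaken for `MIT`.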

But this is not where the problem ends. Ignoring the above, the code is written with the assumption that licenses are listed in a simple form. But Arch uses SPDX-License-Identifier, so they may combine identifiers in all kinds of ways with OR, AND and parentheses. I don’t expect you to parse full SPDX-License-Identifier expressions at this stage. But at least you should try to detect that you’re not dealing with a simple identifier. See cups license or libgcrypt license for examples.
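
Detecting a compound expression is much easier than parsing one. A rough sketch, assuming the operators appear as the upper-case words `AND`, `OR` and `WITH` surrounded by whitespace; `is_compound_license` is a hypothetical name and this is not a validating SPDX parser:

```c
#include <assert.h>
#include <string.h>

/* Return 1 if expr looks like a compound license expression, i.e. it
 * contains parentheses or one of the operator words AND, OR, WITH as a
 * separate token. Return 0 for a single plain identifier. */
int is_compound_license(const char *expr)
{
    static const char *const ops[] = { "AND", "OR", "WITH" };
    static const char space[] = " \t\n";

    assert(expr != NULL);
    if (strpbrk(expr, "()") != NULL)
        return 1;

    while (*expr != '\0') {
        size_t len;

        expr += strspn(expr, space);       /* skip whitespace */
        len = strcspn(expr, space);        /* length of the next token */
        for (size_t i = 0; i < sizeof ops / sizeof ops[0]; i++) {
            if (len == strlen(ops[i]) && strncmp(expr, ops[i], len) == 0)
                return 1;
        }
        expr += len;
    }
    return 0;
}
```

A program could then classify simple identifiers as before and report compound ones separately instead of guessing.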

Regarding ChatGPT usage:
Don’t be offended by getting a particularly strong negative reaction for using ChatGPT. By using ChatGPT, you placed yourself in a very unpleasant position. At the same time: you introduced a bug and, perhaps unknowingly, you did something perceived as silly. That alone may make people merely laugh. But you did that standing in the cross-fire of a major battle. Couldn’t end well, could it? :P

ChatGPT, and other large language models, receive very deceptive marketing. Put briefly, consider them random dice throwing out words, fine-tuned to make the output resemble natural human writing. This doesn’t make them useless or inherently “wrong.” It just means that, like any other algorithm, they have only particular uses. They’re all associated with linguistic patterns.⁽²⁾ Acting as knowledge databases is not one of them.

Regarding the use of C:
The assumption about you learning C prevents me from commenting on the language choice. But, if you had other options to choose, C is far from good. Both regarding the problem you wish to solve and the solution envisioned, and regarding your own effort compared to benefits you got.

____
⁽¹⁾ All I can recall. If anybody knows exceptions, post them.
⁽²⁾ Note that “linguistic” and “pattern” in this case go far beyond a naïve understanding of human languages, and the patterns they recognize are not necessarily recognizable to the human brain. Since the corpus of language data they were trained on is correlated with actual knowledge and statements rooted in it, this includes patterns resembling knowledge. But they don’t carry understanding in the human sense of understanding or reasoning.

Last edited by mpan (Yesterday 21:05:18)


Sometimes I seem a bit harsh — don’t get offended too easily!

Offline

#18 Yesterday 21:03:10

mskrd
Member
Registered: Yesterday
Posts: 6

Re: IsMyArchFree - a simple programs that tell if your packages are free

seth wrote:

I see where you might feel under attack by the responses, but we're all nerds and therefore empathically challenged.
Your take-aways should be
1. don't use LLMs to get solutions, only inspirations (if at all) - they're good at selling being smart, but they're way dumber than you.
2. pointless wrappers that only lower portability and flexibility but introduce the opportunity for OOB memory violations are bad, if you want to write this in C/++, see /usr/include/alpm.h
3. poking around to learn something is never a waste of time - iff you actually end up learning from your mistakes.

I'm not sure where you got the idea that ChatGPT wrote this program, but no, i wrote it. I only used ChatGPT to add some licenses because it's faster than googling. I wrote the program in C because that's the language i'm most comfortable working with, not because i'm good at it (as you all already understood). A simple thought - i need to get the user's packages and their licenses and then process that data to show if they're free or not - made me come up with this solution, which is far from optimal. I agree with everything @cryptearth said, but completely rejecting someone's idea because he did not make it in a way that would satisfy every system is just shallow. At least you are respectful about it. I actually showed this program to a friend who is much more experienced than me and he said "Cool idea - but the code is shit." That's how you should criticize someone.

Offline

#19 Yesterday 21:16:59

mskrd
Member
Registered: Yesterday
Posts: 6

Re: IsMyArchFree - a simple programs that tell if your packages are free

mpan wrote:

mskrd: thanks for the contribution. :)

Thanks for your feedback. You correctly assumed that i'm learning C. Sometimes the lack of useful functions and the necessity to deal with memory just force you to use shortcuts, which i'm not ashamed to use since this is an idea, not a great piece of software that everyone should use. I actually did not know VRMS existed; if i had, i wouldn't have made this program at all. Some people just think that being into IT means you write good code, which is not true. Have a good day!

Offline
