#1 2024-01-27 15:59:37

xanrer
Member
Registered: 2022-01-04
Posts: 38

Can I download all the Archman pages and train an AI with it

Hello. Man pages are really useful, but you first need to know the name of the command. So far I haven't had any problems with that, but I thought it would be better to have a local AI that finds the right command for beginners. However, I couldn't find a way to download all of the Arch man pages. I know they are listed here: https://man.archlinux.org/listing but I need all of the files so that I can convert them to a format that LLaMA Factory can read and then train my AI on them. How can I download all of these man pages? Is there a repo? Also, if you have suggestions about the process, please let me know.

#2 2024-01-27 17:06:59

Lone_Wolf
Administrator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 12,926

Re: Can I download all the Archman pages and train an AI with it

The first question, in my opinion, should be: does the license for the content allow using it as input for machine learning?

That content is taken directly from the packages built by Arch Linux.
I am not a lawyer, but that likely means the license under which upstream publishes the code used to create the packages dictates what can and can't be done with them.

Given the large number of different licenses, determining whether using them in ML is allowed won't be easy.

There are, however, commands that make finding the correct man pages easier.

apropos is one of them; check its man page.
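
To illustrate the apropos suggestion, here is a minimal Python sketch that runs apropos for a keyword and splits each result into name, section, and summary. It assumes the man-db style output format ("name (section) - summary") and that apropos is on the PATH; the function names parse_apropos_line and search_man_pages are made up for this example and are not part of any existing tool.

```python
import re
import shutil
import subprocess

# Assumed output shape (man-db style):
#   "ls (1)               - list directory contents"
LINE_RE = re.compile(
    r"^(?P<name>.+?)\s*\((?P<section>[^)]+)\)\s+-\s+(?P<summary>.*)$"
)


def parse_apropos_line(line):
    """Split one apropos output line into (name, section, summary).

    Returns None when the line does not match the assumed format.
    """
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return match.group("name"), match.group("section"), match.group("summary")


def search_man_pages(keyword):
    """Run `apropos keyword` and return a list of parsed matches.

    The list is empty when apropos finds nothing or prints lines
    that do not match the assumed format.
    """
    if shutil.which("apropos") is None:
        raise RuntimeError("apropos was not found on PATH")
    # check=False: apropos exits non-zero when nothing matches,
    # which is treated here as "no results" rather than an error.
    result = subprocess.run(
        ["apropos", keyword],
        capture_output=True,
        text=True,
        check=False,
    )
    parsed = (parse_apropos_line(line) for line in result.stdout.splitlines())
    return [entry for entry in parsed if entry is not None]
```

Calling search_man_pages("partition") would return a list of (name, section, summary) tuples for every installed man page whose name or short description mentions that keyword. Running man -k with the same keyword on the command line performs the same search.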

Last edited by Lone_Wolf (2024-01-27 17:07:47)

#3 2024-01-27 17:12:48

xanrer
Member
Registered: 2022-01-04
Posts: 38

Re: Can I download all the Archman pages and train an AI with it

Lone_Wolf wrote:

The first question, in my opinion, should be: does the license for the content allow using it as input for machine learning?

That content is taken directly from the packages built by Arch Linux.
I am not a lawyer, but that likely means the license under which upstream publishes the code used to create the packages dictates what can and can't be done with them.

Given the large number of different licenses, determining whether using them in ML is allowed won't be easy.

There are, however, commands that make finding the correct man pages easier.

apropos is one of them; check its man page.

Hmm, good point. Since I was going to make this AI open source, I thought it wouldn't be a problem, but I don't want to violate any licenses.

I gotta do some research...