
#1 2020-12-10 00:46:15

tonnz
Member
Registered: 2015-08-02
Posts: 38

Can Arch's handling of Python updates be improved?

Anyone who has dealt with python is familiar with its compatibility woes, so i won't be laying out examples. For a specific case of how frustrating python compatibility can be, check out the dustbin section of the forum*. I'm sure long-time contributors and moderators are already used to seeing support threads about python issues coinciding with major python updates.

Arch handles multiple JVM implementations and versions very well. Sadly, that same approach can't be used with Python, mainly because Arch ships multiple python libraries as packages. Maintaining multiple versions of python (like maintaining multiple jvm implementations) would also require maintaining multiple versions of each and every python library that is available as a package in arch's repositories, which is entirely impractical.

Providing common python libraries as packages is, i assume, a mitigation against breakage associated with keeping python always up-to-date in the repo, because the python libraries arch provides are built against the python package arch also provides. This has two issues:
1 - It is not humanly possible to provide every python library as a package, with whatever methods or patches to make it work with the latest python release. Anyone relying on python libraries outside of Arch's repositories, which i assume would be every engineer, data scientist, professional or tinkerer of any kind, will experience breakage after a major python update.
2 - Because the most common python libraries are provided as packages, anyone experiencing breakage with the latest python release cannot just downgrade the python package. Downgrading packages being "unsupported" aside, one would also have to downgrade every python-dash-someLibrary package, of which there are countless.

In short, providing common python libraries as packages in order to more easily and reliably keep python itself up-to-date in the repo makes sense in theory. In reality, it just cements the breakage that still occurs for anyone using python libraries which aren't provided by arch as packages, which will always be almost all python libraries out there.

I'm not a maintainer, i'm already overwhelmed by the handful of pkgbuilds i "maintain" for my personal use, so i can't say which alternative approach would be best. But i do have a vague suggestion as an optional entry point for discussion:

DKMS-style: Banish python libraries that are also available using pip from the repos, replace them by packages of the same name that invoke pip in a "system-standard" virtualenv in the .install script. Given multiple python-versions present on the system, that script would also install versions of the libraries specific for all installed python versions via pip, very much like DKMS manages module compilations/installations/removals for multiple installed kernels. With python-libraries gone from the repos, maintaining multiple python-releases "archlinux-java"-style becomes very feasible.
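To make the proposal concrete, here is a rough sketch of what such an install hook might compute. It is only an illustration under stated assumptions: the directory `/var/lib/python-venvs`, the function names, and the candidate interpreter names are all hypothetical, and nothing like this exists in Arch today. The sketch only plans the pip commands and does not run them.

```python
import shutil
from pathlib import Path
from typing import Dict, Iterable, List

# Hypothetical location for the per-interpreter "system-standard" environments.
VENV_ROOT = Path("/var/lib/python-venvs")


def installed_interpreters(candidates: Iterable[str]) -> Dict[str, str]:
    """Map each candidate interpreter name found on PATH to its absolute path."""
    found = {}
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            found[name] = path
    return found


def plan_installs(library: str, interpreter_names: Iterable[str]) -> List[List[str]]:
    """Return one pip command per interpreter; nothing is executed here."""
    commands = []
    for name in interpreter_names:
        # Each interpreter gets its own environment under VENV_ROOT (assumed layout).
        env_python = VENV_ROOT / name / "bin" / "python"
        commands.append([str(env_python), "-m", "pip", "install", library])
    return commands


if __name__ == "__main__":
    present = installed_interpreters(["python3.8", "python3.9"])
    for command in plan_installs("requests", present):
        print(" ".join(command))
```

A real hook would also have to create and remove the environments and handle pip failures, which is where most of the complexity would sit.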




(*This is a repost without the filler because unbeknownst to me this forum is a sanitary place with no room for less-technical chit-chat. I was not ranting bird-with-a-hat, just laying out my most recent experience with Arch's python handling in a wannabe-humorous self-therapeutic way, even trying to make it interesting for others. I do appreciate the feedback, though)

Last edited by tonnz (2020-12-10 00:57:48)

Offline

#2 2020-12-10 00:59:47

Slithery
Administrator
From: Norfolk, UK
Registered: 2013-12-01
Posts: 5,776

Re: Can Arch's handling of Python updates be improved?

I've been using Arch for 10+ years now and never once had any breakage due to a Python update. AFAIK this is only an issue if you use pip to install stuff as root.

If your workflow involves using different versions of Python then just use a venv. As Arch is a rolling-release distro only the latest stable version is supported, for anything else you're on your own.
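For anyone unfamiliar with venvs, a minimal sketch using the standard-library `venv` module follows. The helper name `create_project_env` and the directory name are illustrative assumptions. pip is left out by default here (`with_pip=False`) so the sketch does not depend on `ensurepip` being available.

```python
import sys
import tempfile
import venv
from pathlib import Path


def create_project_env(env_dir: Path, with_pip: bool = False) -> Path:
    """Create a virtual environment at env_dir and return the path to its interpreter."""
    venv.EnvBuilder(with_pip=with_pip, clear=True).create(env_dir)
    # The interpreter lives in bin/ on Linux and Scripts/ on Windows.
    bin_dir = "Scripts" if sys.platform == "win32" else "bin"
    exe = "python.exe" if sys.platform == "win32" else "python"
    return env_dir / bin_dir / exe


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        interpreter = create_project_env(Path(tmp) / "demo-env")
        print(interpreter)
```

Anything installed with that environment's interpreter stays inside the environment directory and does not touch the pacman-managed site-packages.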


No, it didn't "fix" anything. It just shifted the brokeness one space to the right. - jasonwryan
Closing -- for deletion; Banning -- for muppetry. - jasonwryan

aur - dotfiles

Offline

#3 2020-12-10 01:12:46

tonnz
Member
Registered: 2015-08-02
Posts: 38

Re: Can Arch's handling of Python updates be improved?

Slithery wrote:

I've been using Arch for 10+ years now and never once had any breakage due to a Python update. AFAIK this is only an issue if you use pip to install stuff as root.

As a concrete example: tensorflow-rocm, which isn't in the official repos, does not support Python 3.9 even in the most recent version provided by pip (the same goes for the regular tensorflow package). Tensorflow itself (i don't know about the one in the Arch repos because i specifically need a ROCm backend) will not officially support Python 3.9 until at least version 2.5, let alone tensorflow-rocm.
On principle, i never install anything using pip as root. It is just a matter of fact that not all of the thousands of python libraries are immediately updated to the latest Python version, if ever, making Arch a less-than-ideal choice for people dealing with python.


Slithery wrote:

As Arch is a rolling-release distro only the latest stable version is supported, for anything else you're on your own.

Then what is the rationale for providing multiple versions of the OpenJDK jvm? I suppose because it is a necessity. I would argue the same applies to python. "pip install --user tensorflow-rocm" fails on Arch and there is no simple workaround that i am aware of, and while this is just a singular example which will sort itself out in the future, it isn't isolated.

Last edited by tonnz (2020-12-10 01:13:25)

Offline

#4 2020-12-10 01:23:20

eschwartz
Fellow
Registered: 2014-08-08
Posts: 4,097

Re: Can Arch's handling of Python updates be improved?

tonnz wrote:

Arch handles multiple JVM implementations and versions very well.

Our handling is cursed, and depending on your default java, random applications might not work due to being incompatible with that version.

Java the ecosystem is also cursed due to bundling its dependency jars in every program.

We'd be thrilled to drop everything but for the latest Java. The situation is more comparable to python 2 vs. python 3, only with more major versions for Java.
(Yes we would like to drop python 2 also.)

tonnz wrote:

Sadly, that same approach can't be used with Python, mainly because Arch ships multiple python libraries as packages. Maintaining multiple versions of python (like maintaining multiple jvm implementations) would also require maintaining multiple versions of each and every python library that is available as a package in arch's repositories, which is entirely impractical.

Providing common python libraries as packages is, i assume, a mitigation against breakage associated with keeping python always up-to-date in the repo, because the python libraries arch provides are built against the python package arch also provides. This has two issues:

No... the point of providing common python libraries as packages is due to them being system dependencies for common python applications which are also packages. Python is a popular programming language, it is used for end-user applications in addition to data science. Shockingly enough.

tonnz wrote:

1 - It is not humanly possible to provide every python library as a package, with whatever methods or patches to make it work with the latest python release. Anyone relying on python libraries outside of Arch's repositories, which i assume would be every engineer, data scientist, professional or tinkerer of any kind, will experience breakage after major python updates.

It's not "humanly" possible?

Anyways, lots of the things needed are in the repos.

Tensorflow is in the repos.

tonnz wrote:

2 - Because the most common python libraries are provided as packages, anyone experiencing breakage with the latest python release cannot just downgrade the python package. Downgrading packages being "unsupported" aside, one would also have to downgrade every python-dash-someLibrary package, of which there are countless.

So don't. Install https://aur.archlinux.org/packages/python38 from the AUR. Your desires are possible already.

tonnz wrote:

In short, providing common python libraries as packages in order to more easily and reliably keep python itself up-to-date in the repo makes sense in theory. In reality, it just cements the breakage that still occurs for anyone using python libraries which aren't provided by arch as packages, which will always be almost all python libraries out there.

I'm not a maintainer, i'm already overwhelmed by the handful of pkgbuilds i "maintain" for my personal use, so i can't say which alternative approach would be best. But i do have a vague suggestion as an optional entry point for discussion:

DKMS-style: Banish python libraries that are also available using pip from the repos, replace them by packages of the same name that invoke pip in a "system-standard" virtualenv in the .install script. Given multiple python-versions present on the system, that script would also install versions of the libraries specific for all installed python versions via pip, very much like DKMS manages module compilations/installations/removals for multiple installed kernels. With python-libraries gone from the repos, maintaining multiple python-releases "archlinux-java"-style becomes very feasible.

That's not how even java works. dkms is very special. It uses source code in /usr/src, not "download from the internet with pip". And in the dkms case, the point is to specifically support every possible kernel in the AUR, because if you have a kernel module, you must compile it for any kernel you might boot.

There is NO SUCH THING as "I only need this kernel module for the latest stable kernel, but not for -ck or -zen or -lts".
There is very much such a thing as "I only need this python module for python 3.9".

"in a "system-standard" virtualenv" -- is this not what the system standard non-virtualenv is?

Anyway. Point is, if you want multiple versions of python you can use pip for your libraries in a virtualenv. No need to completely reinvent distro packaging from the ground up.

The only thing you need is the python 3.8 interpreter.

Which is in the AUR, because it's an old minor release we don't need.

Unlike python 2, which is a dependency for many things. (Unfortunately.)
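The venv-plus-older-interpreter route described above can be sketched as follows. This assumes the AUR package installs an executable named `python3.8` on PATH. The helper names and the `tf-env` directory are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def venv_command(interpreter: str, env_dir: Path) -> List[str]:
    """Build the command that asks `interpreter` to create a venv at env_dir."""
    return [interpreter, "-m", "venv", str(env_dir)]


def create_env_for(interpreter_name: str, env_dir: Path) -> Optional[Path]:
    """Create a venv with the named interpreter, or return None if it is not on PATH."""
    interpreter = shutil.which(interpreter_name)
    if interpreter is None:
        return None
    subprocess.run(venv_command(interpreter, env_dir), check=True)
    return env_dir


if __name__ == "__main__":
    # Only prints the command; call create_env_for() to actually build the venv.
    print(venv_command("python3.8", Path("tf-env")))
```

Libraries installed with pip inside that environment are then built against 3.8, independent of whatever version the repo `python` package currently provides.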

Last edited by eschwartz (2020-12-10 01:24:57)


Managing AUR repos The Right Way -- aurpublish (now a standalone tool)

Offline

#5 2020-12-10 02:02:37

tonnz
Member
Registered: 2015-08-02
Posts: 38

Re: Can Arch's handling of Python updates be improved?

eschwartz wrote:

So don't. Install https://aur.archlinux.org/packages/python38 from the AUR. Your desires are possible already.

Perfect, thank you. I thought about building an older python version but naively didn't think someone had already provided a PKGBUILD. Thinking about it, i'm probably better off using such specific AUR versions with venvs for projects forever, without updating, until it breaks, and not use the repo-provided packaged libraries for my projects at all. Much less hassle, "update anxiety", and more control.



I appreciate the detailed comment and explanation and can agree with your reasoning.

The reason i praised Arch's handling of JVMs was not just because of the multiple versions one can have installed at the same time, but also for handling multiple different implementations of JVMs.
But i do still think some more sophisticated handling of python implementations, if not versions, might make sense in the future, especially with things like graalpython around the corner. But with venvs it's questionable if that would ever be worthwhile, as anyone who wants to use a specific implementation can just create a venv for it. With pip already existing i thought it would make sense for Arch's handling of python dependencies to just defer them to pip, at which point handling different implementations of python would also become an option. I know it's asking too much, but if one could set "archlinux-python set cpython" or "graalpython", that would be neat.

Last edited by tonnz (2020-12-10 02:10:24)

Offline

#6 2020-12-10 02:06:00

eschwartz
Fellow
Registered: 2014-08-08
Posts: 4,097

Re: Can Arch's handling of Python updates be improved?

https://wiki.archlinux.org/index.php/Us … ternatives

archlinux-python sounds terrible. I'd prefer to get rid of archlinux-java.



Offline

#7 2020-12-10 02:19:21

tonnz
Member
Registered: 2015-08-02
Posts: 38

Re: Can Arch's handling of Python updates be improved?

eschwartz wrote:

https://wiki.archlinux.org/index.php/Us … ternatives

archlinux-python sounds terrible. I'd prefer to get rid of archlinux-java.


That would achieve the same thing in a much more elegant and more general way and i wonder why it isn't the standard already.

It wouldn't easily work for python as there's still the issue of python-dash-someLibrary packages in the repo that expect a specific implementation and version of the python interpreter. Deferring python dependencies to pip using some script that handles different interpreter implementations (and issues a warning if a particular installed interpreter doesn't support a dependency) would not only make Arch more modular but also make much maintenance work obsolete, because python-dash-someLibrary packages would at that point no longer include the actual library itself.
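The warning step mentioned in parentheses could look roughly like this. It is a sketch only: the function name, the interpreter table, and the version range are made-up illustrations, and a real script would have to read the supported range from the library's own metadata.

```python
import warnings
from typing import Dict, List, Tuple

Version = Tuple[int, int]


def supported_interpreters(
    interpreters: Dict[str, Version],
    min_version: Version,
    max_version: Version,
) -> List[str]:
    """Return names of interpreters within [min_version, max_version]; warn about the rest."""
    usable = []
    for name, version in sorted(interpreters.items()):
        if min_version <= version <= max_version:
            usable.append(name)
        else:
            warnings.warn(
                f"{name} ({version[0]}.{version[1]}) is outside the supported range"
            )
    return usable


if __name__ == "__main__":
    # Illustrative data: two installed interpreters, a library supporting 3.6 to 3.8.
    installed = {"python3.8": (3, 8), "python3.9": (3, 9)}
    print(supported_interpreters(installed, (3, 6), (3, 8)))
```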

Last edited by tonnz (2020-12-10 02:19:42)

Offline

#8 2020-12-10 02:38:37

eschwartz
Fellow
Registered: 2014-08-08
Posts: 4,097

Re: Can Arch's handling of Python updates be improved?

But then if you're using pip anyway, you might as well use pip yourself instead of creating distro packages that don't package anything, thus defeating the purpose of pacman.

The point here is to choose defaults for "python", not "python3.9".

Last edited by eschwartz (2020-12-10 02:39:44)



Offline

#9 2020-12-10 02:49:26

tonnz
Member
Registered: 2015-08-02
Posts: 38

Re: Can Arch's handling of Python updates be improved?

eschwartz wrote:

But then if you're using pip anyway, you might as well use pip yourself instead of creating distro packages that don't package anything, thus defeating the purpose of pacman.

The point here is to choose defaults for "python", not "python3.9".

For people programming with python it would make no difference, i agree. For people who indirectly use python as a dependency of other packages, it would give them control over which interpreter/jit is used on their system. Allowing the use of any version of any interpreter from whichever source as the "system default" is just a bonus, at anyone's own risk. I do think that alone would not really warrant a drastic change to the way Arch provides everything related to python, but given that the package search of "python-" returns 1542 results it might still be interesting to drastically reduce maintenance work.

Edit:
Doing it via non-packages that just invoke a script that uses pip on install/update/removal isn't the best way of dealing with this, just the first that came to my mind that doesn't involve patching pacman itself.

Last edited by tonnz (2020-12-10 03:19:20)

Offline

#10 2020-12-10 02:53:30

eschwartz
Fellow
Registered: 2014-08-08
Posts: 4,097

Re: Can Arch's handling of Python updates be improved?

Dropping 1542 packages from the repos would definitely reduce maintenance work. But I like having software.



Offline
