
#1 2020-05-09 22:28:30

zan
Member
Registered: 2020-05-08
Posts: 6

Minimal pacman mirrorlist maintenance

I apologize in advance if such simple automations are not meant to be posted here and I will delete the post if requested.

I've been using this simple approach to keep /etc/pacman.d/mirrorlist updated with only https mirrors from specific countries. You can easily switch to rsync or http, or skip the protocol check altogether, and add or remove country codes as you see fit.

sudo pacman -S curl jq pacman-contrib

curl fetches the mirror status JSON, jq parses it and selects the mirrors we're interested in, and pacman-contrib provides the rankmirrors utility.

#!/bin/sh
# /usr/local/bin/pacman-mirrorlist

# Exit immediately if a command exits with a non-zero exit status.
set -e

# Colors
NR='\033[0m'
N3='\033[0;32m'
N1='\033[0;31m'

# Fail-check: make sure you have root permissions.
if [ ! -w /etc/pacman.d/mirrorlist ]; then
   printf '%b\n' "$N1::$N3 Error:$N1 missing required root permissions.$NR"
   exit 1
fi

# Mirrorlist status from the last 24 hours.
URL='https://www.archlinux.org/mirrors/status/json/'

# Return only secure mirrors from selected countries.
FILTER='.urls | [.[] | select(.protocol == "https")]
              | [.[] | select(.country_code == "CH", .country_code == "AT",
                              .country_code == "DK", .country_code == "FI",
                              .country_code == "IS", .country_code == "LU",
                              .country_code == "NL", .country_code == "NO",
                              .country_code == "SI")]
              | .[] | "## \(.country)\nServer = \(.url)$repo/os/$arch"'

# Fetch and filter the mirrors, then rank them by connection and opening speed.
curl -qs "$URL" | jq -r "$FILTER" | rankmirrors -v - > /tmp/mirrorlist

# Backup previous mirrorlist and move over the new one.
mv /etc/pacman.d/mirrorlist /etc/pacman.d/mirrorlist.previous
mv /tmp/mirrorlist /etc/pacman.d/mirrorlist

# Remove mirrorlist.pacnew created during pacman-mirrorlist upgrade.
[ -f /etc/pacman.d/mirrorlist.pacnew ] && rm /etc/pacman.d/mirrorlist.pacnew

# Exit with successful status to satisfy the pacman hook.
exit 0

Store the script in /usr/local/bin/pacman-mirrorlist.

To automate the updating I went with a simple pacman hook that runs the script on pacman-mirrorlist package upgrades.

[Trigger]
Operation = Upgrade
Type = Package
Target = pacman-mirrorlist

[Action]
Description = Updating pacman mirrorlist and removing mirrorlist.pacnew...
When = PostTransaction
Exec = /bin/sh -c "/usr/local/bin/pacman-mirrorlist"

Store the hook in /etc/pacman.d/hooks/mirrorlist.hook and that's it. The next time the pacman-mirrorlist package is upgraded, your mirrorlist will be refreshed.
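As a rough sketch of the "change the protocol or country codes" idea from the start of this post, the filter could take both as arguments instead of hard-coding them. The function name filter_mirrors and its two parameters are made up for this example; the JSON fields (urls, protocol, country_code, country, url) are the ones the script above already relies on.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original script.
# Reads the mirror status JSON on stdin and prints mirrorlist entries.
#   $1 = protocol to keep (e.g. https, http, rsync)
#   $2 = comma-separated country codes (e.g. CH,AT,DK)
filter_mirrors() {
    # The jq program is single-quoted, so $repo and $arch reach the
    # output literally, which is what pacman expects in a mirrorlist.
    jq -r --arg proto "$1" --arg countries "$2" '
        ($countries | split(",")) as $cc
        | .urls[]
        | select(.protocol == $proto)
        | select(.country_code as $c | any($cc[]; . == $c))
        | "## \(.country)\nServer = \(.url)$repo/os/$arch"'
}
```

It would slot into the existing pipeline roughly as: curl -qs "$URL" | filter_mirrors https 'CH,AT,NL' | rankmirrors -v -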

Last edited by zan (2020-05-10 12:24:07)


#2 2020-05-10 04:18:48

Awebb
Member
Registered: 2010-05-06
Posts: 5,428

Re: Minimal pacman mirrorlist maintenance

1. https://www.archlinux.org/packages/comm … reflector/

2. Exec = /bin/sh -c "/home/$(logname)/.local/bin/pacman-mirrorlist"

Put that in /usr/local/bin instead. Imagine some wild process appeared and changed the content of that file to, say, dd'ing your disk with zeroes.


#3 2020-05-10 04:47:54

eschwartz
Trusted User/Bug Wrangler
Registered: 2014-08-08
Posts: 3,309

Re: Minimal pacman mirrorlist maintenance

Using logname there can also do some hilarious things under various circumstances where it isn't sure who you are and simply returns "logname: no login name" on stderr. But yeah, the larger issue is permitting totally untrusted code to be run under situations where the pacman binary itself is trusted.


Managing AUR repos The Right Way -- aurpublish (now a standalone tool)


#4 2020-05-10 11:49:46

zan
Member
Registered: 2020-05-08
Posts: 6

Re: Minimal pacman mirrorlist maintenance

Thank you for the feedback!

I tried it out before I went with this approach, but I didn't need 90% of the "bloat" in there and prefer to glance over ~30 lines (instead of 900+) of clear and easy to read instructions. Helps me understand faster what it does half a year later. Do you see any glaring holes with this kind of thinking?

Awebb wrote:

2. Exec = /bin/sh -c "/home/$(logname)/.local/bin/pacman-mirrorlist"

Put that in /usr/local/bin instead. Imagine some wild process appeared and changed the content of that file to, say, dd'ing your disk with zeroes.

eschwartz wrote:

Using logname there can also do some hilarious things under various circumstances where it isn't sure who you are and simply returns "logname: no login name" on stderr. But yeah, the larger issue is permitting totally untrusted code to be run under situations where the pacman binary itself is trusted.

If I understood you correctly, it's not a problem to have scripts that don't require root permissions in ~/.local/bin and added to the PATH. However, if they require root permissions, it's a huge security hole to have them easily changed by your user prior to being run by root?


#5 2020-05-10 12:11:02

Awebb
Member
Registered: 2010-05-06
Posts: 5,428

Re: Minimal pacman mirrorlist maintenance

The systemd unit runs as root and changes a file in /etc, by executing code from the user home. That's the problem.
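To make that concrete, here is a minimal sketch of a check one could run against whatever path root ends up executing. The function name check_root_target is made up, and stat -c assumes GNU coreutils (which is what Arch ships). It only looks at the file itself, not at its parent directories, so treat it as a starting point rather than a full audit.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if FILE is owned by OWNER_UID
# (default 0, i.e. root) and is not writable by group or others.
check_root_target() {
    file=$1
    owner_uid=${2:-0}

    [ -f "$file" ] || { echo "$file: not a regular file" >&2; return 1; }

    # GNU stat: %u = numeric owner uid, %a = permission bits in octal.
    uid=$(stat -c '%u' "$file")
    mode=$(stat -c '%a' "$file")

    if [ "$uid" != "$owner_uid" ]; then
        echo "$file: owned by uid $uid, expected $owner_uid" >&2
        return 1
    fi

    # Last octal digit = others, second to last = group.
    # The digits 2, 3, 6 and 7 all include the write bit.
    case $mode in
        *[2367] | *[2367]?)
            echo "$file: writable by group or others (mode $mode)" >&2
            return 1 ;;
    esac
    return 0
}
```

Something like check_root_target /usr/local/bin/pacman-mirrorlist should pass on a normally installed script, while a copy under ~/.local/bin fails the ownership test because the user owns it.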


#6 2020-05-10 13:48:09

Trilby
Inspector Parrot
Registered: 2011-11-29
Posts: 23,441

Re: Minimal pacman mirrorlist maintenance

zan wrote:

I tried [reflector] out before I went with this approach, but I didn't need 90% of the "bloat" in there and prefer to glance over ~30 lines (instead of 900+) of clear and easy to read instructions. Helps me understand faster what it does half a year later. Do you see any glaring holes with this kind of thinking?

I agree - notwithstanding that "bloat" has more of a negative connotation than I'd personally put on reflector, it's a great tool I used for quite some time but eventually decided I wanted a much simpler approach which I posted here.

But you are conflating simplifying the code with some design choices that are being questioned here.  It is a false dichotomy to suggest that you either need a thousand lines of code providing all sorts of features or a few lines of code that are error-prone.  In fact, it's often quite the opposite.  No one here critiqued the fact that you wrote concise and targeted code - they just noted the places where it could go wrong.  Write concise, simple code that doesn't go wrong.

Edit: for clarity, my short code has one definite failing: there is no specified timeout, so if connecting to a mirror stalls, the whole script stalls.  But this is a loss of function failure, not a security-impacting failure.  It's always good to consider what might go wrong if your code fails and assess how much of a concern that should be.  That's what some of the above posts are highlighting: when everything goes right with your code, it should work great; but if/when things go wrong, the results could be very very bad.
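The same kind of stall can presumably hit any script that fetches from the network without a limit, including the curl call in the first post. Below is a small sketch of bounding it. The names run_with_deadline and fetch_status are invented for this example and the second values are arbitrary; --connect-timeout, --max-time and --fail are standard curl options, and timeout comes from GNU coreutils.

```shell
#!/bin/sh
# Hypothetical helper: run a command but give up after SECS seconds.
# GNU timeout exits with status 124 when the deadline is hit.
run_with_deadline() {
    secs=$1
    shift
    timeout "$secs" "$@"
}

# Hypothetical helper: fetch the mirror status JSON with bounded waits.
# -f makes curl exit non-zero on HTTP errors instead of passing the
# error page on to the next stage.
fetch_status() {
    curl -qsf --connect-timeout 5 --max-time 20 \
        'https://www.archlinux.org/mirrors/status/json/'
}
```

The ranking step could be wrapped the same way, for example run_with_deadline 60 rankmirrors -v -, as long as the caller checks the exit status and discards partial output when the deadline was hit.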

Last edited by Trilby (2020-05-10 13:51:21)


"UNIX is simple and coherent..." - Dennis Ritchie, "GNU's Not UNIX" -  Richard Stallman


#7 2020-05-10 14:17:21

zan
Member
Registered: 2020-05-08
Posts: 6

Re: Minimal pacman mirrorlist maintenance

Awebb wrote:

The systemd unit runs as root and changes a file in /etc, by executing code from the user home. That's the problem.

Thank you for your patience in explaining this to my thick head. I understand exactly what you and eschwartz meant now.

Trilby wrote:

I agree - notwithstanding that "bloat" has more of a negative connotation than I'd personally put on reflector, it's a great tool I used for quite some time but eventually decided I wanted a much simpler approach which I posted here.

The best thought I had was to put bloat into "" as it's not really bloat, as you've pointed out, only functionality that is not required for my specific use-case. It was not meant in any way as a knock on reflector. I apologize if it came across that way; it's a great tool provided free of charge.

Trilby wrote:

But you are conflating simplifying the code with some design choices that are being questioned here.  It is a false dichotomy to suggest that you either need a thousand lines of code providing all sorts of features or a few lines of code that are error-prone.  In fact, it's often quite the opposite.  No one here critiqued the fact that you wrote concise and targeted code - they just noted the places where it could go wrong.  Write concise, simple code that doesn't go wrong.

I don't think I've shown any hostility towards the problems pointed out as I've corrected my post as soon as they were raised and asked to help me better understand them, which the posters thankfully did. I think I understand where I screwed up my communication.

zan wrote:

and prefer to glance over ~30 lines (instead of 900+) of clear and easy to read instructions. Helps me understand faster what it does half a year later. Do you see any glaring holes with this kind of thinking?

I meant this as a sort of knock on myself. I've written many scripts whose purpose I couldn't easily work out half a year (or less) later, so I'm avoiding that at all costs. The glaring holes question should've been phrased better by adding "in this use-case" so the intent is clear. I didn't mean it negatively.

Trilby wrote:

Edit: for clarity, my short code has one definite failing: there is no specified timeout, so if connecting to a mirror stalls, the whole script stalls.  But this is a loss of function failure, not a security-impacting failure.  It's always good to consider what might go wrong if your code fails and assess how much of a concern that should be.  That's what some of the above posts are highlighting: when everything goes right with your code, it should work great; but if/when things go wrong, the results could be very very bad.

I didn't consider that I'm introducing a wildcard by having the systemd unit execute code that could be changed to literally anything in the user's home directory as I've written the script prior to learning about pacman hooks. In fact I wasn't even aware of this security problem until it was explained to me above. I've checked I'm not doing something as stupid in any of the other scripts as well.

I've only considered what could go wrong with the script itself, not the context it'll get executed from. It exits as soon as any command within it doesn't exit successfully, so if anything in fetching + parsing goes wrong the original files won't get touched and everything will stay as it was prior to running it.
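One thing worth double-checking in that reasoning: in plain POSIX sh, set -e only reacts to the exit status of the last command in a pipeline, so a failing curl or jq at the front of curl | jq | rankmirrors would not by itself stop the script. A minimal sketch of a guard follows, assuming it is enough to require at least one Server line before touching the real file. The name install_mirrorlist is made up for this example.

```shell
#!/bin/sh
# Demonstration of the caveat: the pipeline's status is that of `cat`,
# so set -e does not fire and the echo still runs.
#   sh -c 'set -e; false | cat; echo "still running"'

# Hypothetical helper: replace TARGET with NEW only if NEW contains at
# least one "Server = " line; the old TARGET is kept as TARGET.previous.
install_mirrorlist() {
    new=$1
    target=$2

    if ! grep -q '^Server = ' "$new"; then
        echo "refusing to install $new: no Server lines found" >&2
        return 1
    fi

    # Copy rather than move, so TARGET never disappears between steps.
    if [ -f "$target" ]; then
        cp -p "$target" "$target.previous"
    fi
    mv "$new" "$target"
}
```

In the script this would replace the two mv lines, called as install_mirrorlist "$tmpfile" /etc/pacman.d/mirrorlist, where tmpfile is ideally created with mktemp rather than a fixed name under /tmp.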

Do you see any other issues that it could run into? I'm really asking and not being sarcastic (I hope my replies are not coming off as that). I want to learn if there's anything I'm doing wrong so it can be corrected and made as bulletproof as possible.

Last edited by zan (2020-05-10 15:06:08)


#8 2020-05-10 14:37:40

eschwartz
Trusted User/Bug Wrangler
Registered: 2014-08-08
Posts: 3,309

Re: Minimal pacman mirrorlist maintenance

Trilby, I interpreted Awebb's mentioning of reflector very differently. I blindly assumed it was mentioned for the same reason I'd mention it: not because one should prefer to use it, but because one should be aware that it exists before writing your own.

Sure, not everyone wants or needs to use reflector, and there are a few different reasons why that might be. Another might be "it's the only thing on my server which would depend on a python interpreter, so I would rather use something that isn't written in python".
Alternatively, one might want the fun experience of writing their own version.

And those are all perfectly fine. But on the off chance that reflector *would* be a satisfactory solution, I would still want to mention "hey, there's already a tool for that, so unless you have a particular reason to prefer writing your own, you might want to take the easy route of reusing the existing one."



#9 2020-05-10 15:22:40

Trilby
Inspector Parrot
Registered: 2011-11-29
Posts: 23,441

Re: Minimal pacman mirrorlist maintenance

This thread did inspire me to try rewriting my tool in a shell script using jq and it works well, but the python version linked above is consistently faster.

EDIT: That said, I was able to "multithread" the speed checks in the shell version easily, resulting in a huge savings in running time.  But then I checked and found that this resulted in horribly inaccurate speed results due to the network bottleneck at my end (likely resulting in the first URLs tested getting the lion's share of the bandwidth and getting much "faster" scores).  So with single-threaded versions, Python is notably quicker.

Last edited by Trilby (2020-05-10 15:32:10)



#10 2020-05-10 17:57:20

Awebb
Member
Registered: 2010-05-06
Posts: 5,428

Re: Minimal pacman mirrorlist maintenance

I suggested reflector, because it's a well known tool and basically the default recommendation, in case OP didn't know.

