
#1 2016-11-25 09:22:30

doragasu
Member
Registered: 2012-03-03
Posts: 152

[SOLVED] Pacman returns 403 error even though curl can access files

I have a problem with pacman that might be related to this one I have with yaourt.
I'm behind a corporate proxy, and I have already configured the http(s)_proxy environment variables. In my setup, when pacman tries to retrieve files, curl returns error 403. But if I manually use curl to retrieve the files (e.g. core.db) from the same location, curl completes without a problem. I have traced both the pacman and curl calls with strace; the results are below.

pacman call (note that I have trimmed the strace output to show only the first attempt to download the database):

# strace -v -e trace=network -o pacman.txt pacman -v -Sy
Root      : /
Conf File : /etc/pacman.conf
DB Path   : /var/lib/pacman/
Cache Dirs: /var/cache/pacman/pkg/  
Hook Dirs : /usr/share/libalpm/hooks/  /etc/pacman.d/hooks/  
Lock File : /var/lib/pacman/db.lck
Log File  : /var/log/pacman.log
GPG Dir   : /etc/pacman.d/gnupg/
Targets   : None
:: Synchronizing package databases...
error: failed retrieving file 'core.db' from osl.ugr.es : The requested URL returned error: 403
error: failed to update core (unexpected error)
error: failed retrieving file 'extra.db' from osl.ugr.es : The requested URL returned error: 403
error: failed to update extra (unexpected error)
error: failed retrieving file 'community.db' from osl.ugr.es : The requested URL returned error: 403
error: failed to update community (unexpected error)
error: failed retrieving file 'multilib.db' from osl.ugr.es : The requested URL returned error: 403
error: failed to update multilib (unexpected error)
error: failed to synchronize any databases
error: failed to init transaction (unexpected error)
# cat pacman.txt 
socket(AF_INET6, SOCK_DGRAM, IPPROTO_IP) = 5
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 6
setsockopt(6, SOL_TCP, TCP_NODELAY, [1], 4) = 0
setsockopt(6, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
setsockopt(6, SOL_TCP, TCP_KEEPIDLE, [60], 4) = 0
setsockopt(6, SOL_TCP, TCP_KEEPINTVL, [60], 4) = 0
connect(6, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("172.18.18.80")}, 16) = -1 EINPROGRESS (Operation now in progress)
getsockopt(6, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
getpeername(6, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("172.18.18.80")}, [128->16]) = 0
getsockname(6, {sa_family=AF_INET, sin_port=htons(40378), sin_addr=inet_addr("10.0.2.15")}, [128->16]) = 0
sendto(6, "GET http://osl.ugr.es/archlinux/"..., 234, MSG_NOSIGNAL, NULL, 0) = 234
recvfrom(6, "HTTP/1.1 403 Forbidden\r\nServer: "..., 16384, 0, NULL, NULL) = 1275

And the curl invocation:

# strace -v -e trace=network -o curl.txt curl http://osl.ugr.es/archlinux/core/os/x86_64/core.db > core.db
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  119k  100  119k    0     0  1569k      0 --:--:-- --:--:-- --:--:-- 1598k
# cat curl.txt 
socket(AF_INET6, SOCK_DGRAM, IPPROTO_IP) = 3
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
setsockopt(3, SOL_TCP, TCP_NODELAY, [1], 4) = 0
setsockopt(3, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
setsockopt(3, SOL_TCP, TCP_KEEPIDLE, [60], 4) = 0
setsockopt(3, SOL_TCP, TCP_KEEPINTVL, [60], 4) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("172.18.18.80")}, 16) = -1 EINPROGRESS (Operation now in progress)
getsockopt(3, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
getpeername(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("172.18.18.80")}, [128->16]) = 0
getsockname(3, {sa_family=AF_INET, sin_port=htons(40404), sin_addr=inet_addr("10.0.2.15")}, [128->16]) = 0
sendto(3, "GET http://osl.ugr.es/archlinux/"..., 153, MSG_NOSIGNAL, NULL, 0) = 153
recvfrom(3, "HTTP/1.1 200 OK\r\nDate: Fri, 25 N"..., 16384, 0, NULL, NULL) = 16384
recvfrom(3, "\213\7\276Td\227\367\211U\3441)\337\247V\362\\R\3612\26\365\345\261\377\321b\363vegY"..., 16384, 0, NULL, NULL) = 16384
recvfrom(3, "\232\326\361\10\213,\320\365z\5g\212\246\370\254\20q\233\36\20\251)P\16\32P@\22\24\353\20\317"..., 16384, 0, NULL, NULL) = 16384
recvfrom(3, "\351\276\6\0\333$\246\200M\273\20;\f\206<\206\266K\222\216\t9\322\265\35\322\242\200i\333\34\315"..., 16384, 0, NULL, NULL) = 16384
recvfrom(3, "\377\213A\340w\343?\4?\256?\205o\361\337W\221{\374\367\225*|\33\4~>\375\276\t\364z"..., 16384, 0, NULL, NULL) = 16384
recvfrom(3, "\267{M\326G\373\364R\212R\277\1\6\263\315\376\300ec\337\\\231\"@<}\216\n\212\323tN"..., 16384, 0, NULL, NULL) = 16384
recvfrom(3, "H\0PPD\1\213\344\301\21\310\"U\355\10\t\362\30PC\261J\341^\352\363\262E(\245\"\300"..., 16384, 0, NULL, NULL) = 16384
recvfrom(3, "\231j\252>\313\316C\272[\267'-\252V\353\2616\17\202\2516\2349\212;\362;\225=\344z\376"..., 8440, 0, NULL, NULL) = 8440
+++ exited with 0 +++

It looks like the only important difference between the two traces is the GET request (the sendto() syscall). Unfortunately I cannot see the complete request (I have been browsing the strace man page but couldn't find how to avoid the abbreviated parameters; the '-v' switch does not help), but the request made by pacman is a lot longer (234 bytes vs. 153 bytes).

This problem has been causing me lots of headaches; any suggestion is welcome!

Last edited by doragasu (2016-11-28 11:55:43)


#2 2016-11-25 10:37:06

mpan
Member
Registered: 2012-08-01
Posts: 1,206


You do remember that environment variables are not passed via sudo unless you explicitly ask it to do so?
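A quick way to check this, sketched under the assumption that the proxy variables are the usual http_proxy/https_proxy pair (show_proxy_env is a hypothetical helper name, not something shipped with sudo or pacman):

```shell
# Hypothetical helper: list the proxy-related variables visible to this shell.
show_proxy_env() {
    env | grep -i '_proxy=' || echo "no proxy variables set"
}

show_proxy_env

# Compare with what a process started through sudo sees:
#   sudo env | grep -i '_proxy='
# Ask sudo to preserve the caller's environment (the sudoers policy may refuse):
#   sudo -E pacman -Sy
```

If the second listing is empty while the first is not, the variables are being dropped by sudo.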

If this isn’t the source of problems, try strace with -e write=all.
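As a sketch of what that invocation could look like, combined with the -s option that raises strace's default 32-character string limit (trace_requests is a hypothetical wrapper name, and 4096 is an arbitrary size chosen for illustration):

```shell
# Hypothetical wrapper: trace network syscalls of a command and dump
# everything it writes, without abbreviating strings.
#   $1      file that receives the strace output
#   $2...   the command to trace, with its arguments
trace_requests() {
    out=$1
    shift
    strace -s 4096 -e trace=network -e write=all -o "$out" "$@"
}

# Example (run as root, as in the first post):
#   trace_requests pacman.txt pacman -v -Sy
```

The "-e write=all" dump shows the data as hex plus ASCII, so the full HTTP request headers should be readable in the output file.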


Sometimes I seem a bit harsh — don’t get offended too easily!


#3 2016-11-25 12:09:12

doragasu
Member
Registered: 2012-03-03
Posts: 152


Thanks for the help. I had already thought about environment variables; as you can see in the first post, both pacman and curl are run as root ('#' prompt).

The "-e write=all" option was really helpful for spotting the differences between the GET requests. The pacman HTTP request:

GET http://osl.ugr.es/archlinux/core/os/x86_64/core.db HTTP/1.1
Host: osl.ugr.es
User-Agent: pacman/5.0.1 (Linux x86_64) libalpm/10.0.1
Accept: */*
Proxy-Connection: Keep-Alive
If-Modified-Since: Wed, 23 Nov 2016 16:55:16 GMT

The curl HTTP request:

GET http://osl.ugr.es/archlinux/core/os/x86_64/core.db HTTP/1.1
Host: osl.ugr.es
User-Agent: curl/7.51.0
Accept: */*
Proxy-Connection: Keep-Alive

The problem must be caused either by the User-Agent or by the If-Modified-Since header. So the question is: is there a way to change the User-Agent, or to drop the If-Modified-Since header?
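The comparison can also be done mechanically. The sketch below saves the two requests quoted in this post to temporary files and diffs them; the file handling is only for illustration, and the header text is copied from the traces as-is.

```shell
# Save the two requests captured with strace into temporary files.
pacman_req=$(mktemp)
curl_req=$(mktemp)

cat > "$pacman_req" <<'EOF'
GET http://osl.ugr.es/archlinux/core/os/x86_64/core.db HTTP/1.1
Host: osl.ugr.es
User-Agent: pacman/5.0.1 (Linux x86_64) libalpm/10.0.1
Accept: */*
Proxy-Connection: Keep-Alive
If-Modified-Since: Wed, 23 Nov 2016 16:55:16 GMT
EOF

cat > "$curl_req" <<'EOF'
GET http://osl.ugr.es/archlinux/core/os/x86_64/core.db HTTP/1.1
Host: osl.ugr.es
User-Agent: curl/7.51.0
Accept: */*
Proxy-Connection: Keep-Alive
EOF

# diff exits with status 1 when the files differ, so "|| true" keeps the
# script going under "set -e".
header_diff=$(diff "$pacman_req" "$curl_req" || true)
printf '%s\n' "$header_diff"

rm -f "$pacman_req" "$curl_req"
```

Lines prefixed with "<" appear only in the pacman request and lines prefixed with ">" only in the curl request, which narrows the suspects to the two headers named above.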


#4 2016-11-25 12:29:53

doragasu
Member
Registered: 2012-03-03
Posts: 152


It looks like the User-Agent is what causes the proxy to reject the requests. I tried curl's -A switch to make it send the pacman User-Agent, so the HTTP request is:

GET http://osl.ugr.es/archlinux/core/os/x86_64/core.db HTTP/1.1
Host: osl.ugr.es
User-Agent: pacman/5.0.1 (Linux x86_64) libalpm/10.0.1
Accept: */*
Proxy-Connection: Keep-Alive

And blam! Error 403 :(

So it looks like I need pacman/libalpm to lie about the User-Agent... Is there any way of doing this without having to patch the sources?
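The curl test described in this post can be reproduced with something like the following sketch. fetch_as is a hypothetical helper name; the User-Agent string and URL in the usage comment are the ones from the traces in this thread.

```shell
# Hypothetical helper: download a URL while sending a chosen User-Agent.
#   $1  User-Agent string to send
#   $2  URL to fetch
#   $3  local output file
# --fail makes curl exit non-zero on HTTP errors such as 403 instead of
# saving the error page.
fetch_as() {
    curl --fail --silent --show-error --user-agent "$1" --output "$3" "$2"
}

# Example:
#   fetch_as 'pacman/5.0.1 (Linux x86_64) libalpm/10.0.1' \
#       http://osl.ugr.es/archlinux/core/os/x86_64/core.db core.db
```

Running it once with the pacman string and once with the curl string isolates the User-Agent as the only variable, since neither request carries If-Modified-Since.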


#5 2016-11-25 21:08:07

Grzechooo
Member
Registered: 2016-11-25
Posts: 1


I would take a look at the XferCommand in pacman.conf, I think it can do what you need (replace the built-in downloader with one that can use any User-Agent):

XferCommand = /path/to/command %u
If set, an external program will be used to download all remote files. All instances of %u will be replaced with the download URL. If present, instances of %o will be replaced with the local filename, plus a “.part” extension, which allows programs like wget to do file resumes properly.
This option is useful for users who experience problems with built-in HTTP/FTP support, or need the more advanced proxy support that comes with utilities like wget.
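For illustration, an XferCommand line built on curl might look like the fragment below. This is a sketch, not a copy of the shipped pacman.conf: the flags are ordinary curl options (-L follows redirects, "-C -" resumes partial downloads, -f fails on HTTP errors, -o names the output file), and the example lines in your own pacman.conf may differ between pacman versions.

```
# /etc/pacman.conf, [options] section
XferCommand = /usr/bin/curl -L -C - -f -o %o %u
```

Because an external curl sends its own User-Agent (curl/7.51.0 in the trace earlier in the thread), this should get past a proxy that rejects only the pacman one.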


#6 2016-11-28 11:55:16

doragasu
Member
Registered: 2012-03-03
Posts: 152


That worked like a charm. I hadn't noticed before, but the configuration file even comes with two examples (one using curl, the other using wget).

Thanks!

