
#1 2024-05-27 16:21:19

t-sourcemaker
Member
Registered: 2015-02-01
Posts: 13

Download entire mailing list

I tried to download the entire mailing list arch-general@lists.archlinux.org
from https://lists.archlinux.org/archives/li … linux.org/.

When I download the entire archive (mbox), the archive is incomplete. Why?

Offline

#2 2024-05-28 07:49:47

Awebb
Member
Registered: 2010-05-06
Posts: 6,688

Re: Download entire mailing list

I can confirm the phenomenon. The compressed file is a little above 3 MiB. Perhaps there is some file size limit in HyperKitty or in whatever else is in the background that serves the .gz file.

EDIT: I have also downloaded the most recent 30 days. The last (corrupted) entry in the "full" mbox predates the most recent entry in the new file. It all ends somewhere in 2009.

Last edited by Awebb (2024-05-28 07:54:12)

Offline

#3 2024-05-28 10:27:44

Lone_Wolf
Administrator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 15,085

Re: Download entire mailing list

aur-general downloads close to 18 MB but is also corrupted.


Disliking systemd intensely, but not satisfied with alternatives so focusing on taming systemd.

clean chroot building not flexible enough ?
Try clean chroot manager by graysky

Offline

#4 2024-05-28 12:35:58

t-sourcemaker
Member
Registered: 2015-02-01
Posts: 13

Re: Download entire mailing list

I also tried to download all archives per month and then merge.
The merged archive is then also incomplete.
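For reference, a minimal sketch of such a merge step, assuming the monthly files were saved as `*.mbox.gz` in one directory. The `merge_monthly` helper and the file layout are made up for illustration, not what was actually run here. A month that fails to decompress is reported instead of silently skipped, which would show where the merged archive comes up short.

```python
import gzip
import zlib
from pathlib import Path


def merge_monthly(src_dir, out_path):
    """Concatenate every *.mbox.gz in src_dir into one plain mbox file.

    Returns the names of the files that could not be fully decompressed.
    (Hypothetical helper, for illustration only.)
    """
    bad = []
    with open(out_path, "wb") as out:
        # Sorted order keeps months chronological for YYYY-MM file names.
        for gz_path in sorted(Path(src_dir).glob("*.mbox.gz")):
            try:
                data = gzip.decompress(gz_path.read_bytes())
            except (EOFError, OSError, zlib.error):
                bad.append(gz_path.name)
                continue
            out.write(data)
            # Make sure the next file's "From " line starts on its own line.
            if data and not data.endswith(b"\n"):
                out.write(b"\n")
    return bad
```

Calling `merge_monthly("downloads", "merged.mbox")` would then return the list of months that need a second look.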

Offline

#5 2024-05-28 18:26:21

gromit
Administrator
From: Germany
Registered: 2024-02-10
Posts: 1,536
Website

Re: Download entire mailing list

What is missing from the archive and how do you test/open it? I just tried with the full archive of aur-general & the last 30 days and opened it with neomutt and it seemed fine.

Offline

#6 2024-05-28 19:17:41

Lone_Wolf
Administrator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 15,085

Re: Download entire mailing list

In my case it was a noob mistake: I tried to open the downloaded gz archive in Thunderbird and saw no messages.

After uncompressing the archive and opening the *.mbox 39k messages were present.
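The same check can be scripted. This is only a sketch using Python's standard `gzip` and `mailbox` modules; the `gunzip_and_count` name and the paths are placeholders.

```python
import gzip
import mailbox
import shutil


def gunzip_and_count(gz_path, mbox_path):
    """Decompress gz_path to mbox_path and return the number of messages."""
    with gzip.open(gz_path, "rb") as src, open(mbox_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    # create=False: fail instead of creating an empty mailbox by accident.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        return len(box)
    finally:
        box.close()
```

If the gz file is truncated, `gzip.open` raises an `EOFError` partway through, so a count is only returned for an archive that decompresses completely.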



Offline

#7 2024-05-29 07:29:39

t-sourcemaker
Member
Registered: 2015-02-01
Posts: 13

Re: Download entire mailing list

Arch-general:
The latest mail included in the mbox archive is from 28.08.2009 06:29.
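One way to find that cut-off without scrolling through the file is to scan the `Date` headers. A rough sketch with the standard `mailbox` module follows; `latest_message_date` is a made-up name, and treating dates without a usable timezone as UTC is an assumption of this sketch.

```python
import mailbox
from datetime import timezone
from email.utils import parsedate_to_datetime


def latest_message_date(mbox_path):
    """Return the newest parseable Date header in the mbox, or None."""
    latest = None
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            raw = msg.get("Date")
            if raw is None:
                continue  # message without a Date header
            try:
                when = parsedate_to_datetime(str(raw))
            except (TypeError, ValueError):
                continue  # malformed Date header
            if when.tzinfo is None:
                # Assumption: treat timezone-less dates as UTC so that
                # all values can be compared with each other.
                when = when.replace(tzinfo=timezone.utc)
            if latest is None or when > latest:
                latest = when
    finally:
        box.close()
    return latest
```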

Offline

#8 2024-05-29 09:32:30

Awebb
Member
Registered: 2010-05-06
Posts: 6,688

Re: Download entire mailing list

This is what I do:
0. Firefox
1. Click on the link provided by t-sourcemaker.
2. Click on Download -> Entire archive (mbox)
3. Extract the file with Ark and open the mbox file with vscode or vim.
-> Last entry is from 28.08.2009 06:29

Alternatively, if I copy the link to the mbox file, download it with wget and try to extract it with gunzip, gunzip gives me "gzip: arch-general@lists.archlinux.org.mbox.gz: unexpected end of file" and does not output a mailbox archive at all.
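The "unexpected end of file" message means the gzip stream stops before its end-of-stream marker, i.e. the download itself is cut short. A small sketch of the same test in Python, using only the standard library (`gzip_is_complete` is a hypothetical helper name):

```python
import gzip
import zlib


def gzip_is_complete(path, chunk_size=1 << 20):
    """Return True if the whole gzip stream at path decompresses cleanly."""
    try:
        with gzip.open(path, "rb") as fh:
            # Read to the end in chunks; a truncated stream raises EOFError.
            while fh.read(chunk_size):
                pass
    except (EOFError, OSError, zlib.error):
        return False
    return True
```

This only says whether the file is intact, not why the server stopped sending.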

Last edited by Awebb (2024-05-29 09:33:26)

Offline

#9 2024-05-29 09:46:58

Lone_Wolf
Administrator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 15,085

Re: Download entire mailing list

Similar behaviour to Awebb's: downloaded with Firefox, and gzip / gunzip complain about EOF.
Ark can extract the file, thunderbird can read it but only sees messages from 2009 and older.



Offline

#10 2024-05-29 10:22:59

progandy
Member
Registered: 2012-05-17
Posts: 5,317

Re: Download entire mailing list

Something in year 2009 crashes the archiver. If you start your archive on 2010-01-01, then it works. I tried 2009-09 and 2009-10 first and they both failed.

https://lists.archlinux.org/archives/list/arch-general@lists.archlinux.org/export/arch-general@lists.archlinux.org.mbox.gz?start=2010-01-01&end=2024-05-01
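For trying other ranges, the link above can be assembled programmatically. This is a sketch that simply mirrors the URL pattern of that link; `export_url` and `BASE` are names invented here, and whether `end` is inclusive is not something I have verified.

```python
from urllib.parse import quote, urlencode

# Prefix taken from the link above.
BASE = "https://lists.archlinux.org/archives/list"


def export_url(list_address, start, end):
    """Build an export URL for messages between start and end.

    start and end are ISO dates (YYYY-MM-DD), as in the link above.
    """
    name = quote(list_address, safe="@")
    query = urlencode({"start": start, "end": end})
    return f"{BASE}/{name}/export/{name}.mbox.gz?{query}"
```

`export_url("arch-general@lists.archlinux.org", "2010-01-01", "2024-05-01")` reproduces the link above exactly.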

Edit: It looks like the problem lies in October 2009: downloading the monthly archives results in a 10-byte file for that month (named with 2009-11):

 $ ll -h arch-general@lists.archlinux.org-20*
-rw-r--r-- 1 progandy progandy 167K 29. Mai 12:24 arch-general@lists.archlinux.org-2009-09.mbox.gz
-rw-r--r-- 1 progandy progandy   10 29. Mai 12:24 arch-general@lists.archlinux.org-2009-11.mbox.gz
-rw-r--r-- 1 progandy progandy  22K 29. Mai 12:25 arch-general@lists.archlinux.org-2009-12.mbox.gz
-rw-r--r-- 1 progandy progandy 283K 29. Mai 12:25 arch-general@lists.archlinux.org-2010-01.mbox.gz

Last edited by progandy (2024-05-29 10:27:51)


| alias CUTF='LANG=en_XX.UTF-8@POSIX ' | alias ENGLISH='LANG=C.UTF-8 ' |

Offline

#11 2024-05-29 11:34:15

gromit
Administrator
From: Germany
Registered: 2024-02-10
Posts: 1,536
Website

Re: Download entire mailing list

One terrible bash script later, it seems like it's the messages in the following date ranges:

bad: 2009-11-19-2009-11-20.mbox.gz
bad: 2009-11-10-2009-11-11.mbox.gz
bad: 2009-11-04-2009-11-05.mbox.gz
bad: 2009-10-09-2009-10-10.mbox.gz
bad: 2009-10-05-2009-10-06.mbox.gz
bad: 2009-10-01-2009-10-02.mbox.gz
bad: 2009-09-23-2009-09-24.mbox.gz
bad: 2009-09-22-2009-09-23.mbox.gz
bad: 2009-09-21-2009-09-22.mbox.gz
bad: 2009-09-10-2009-09-11.mbox.gz
bad: 2009-09-07-2009-09-08.mbox.gz
bad: 2009-09-04-2009-09-05.mbox.gz
bad: 2009-09-03-2009-09-04.mbox.gz
bad: 2009-08-31-2009-09-01.mbox.gz
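The bash script itself is not posted here. The general idea of probing one-day windows could look roughly like the sketch below, written in Python rather than bash. `day_windows`, `find_bad_windows` and the `is_ok` callback are all hypothetical; `is_ok` stands for whatever check is used per window, e.g. downloading that range and testing whether the gzip stream is complete.

```python
from datetime import date, timedelta


def day_windows(first, last):
    """Yield (start, end) ISO date pairs, one day wide, from first up to last."""
    day = first
    while day < last:
        nxt = day + timedelta(days=1)
        yield day.isoformat(), nxt.isoformat()
        day = nxt


def find_bad_windows(first, last, is_ok):
    """Return 'START-END.mbox.gz' labels for windows where is_ok(start, end) is false.

    is_ok is a caller-supplied check (assumption: it returns True when the
    export for that window downloads and decompresses without error).
    """
    return [
        f"{start}-{end}.mbox.gz"
        for start, end in day_windows(first, last)
        if not is_ok(start, end)
    ]
```

Run over 2009-08-01 to 2010-01-01 with a real download check, this would yield labels in the same format as the list above, in chronological order.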

Offline

#12 2024-05-30 08:14:50

Awebb
Member
Registered: 2010-05-06
Posts: 6,688

Re: Download entire mailing list

Where would one report problems with the mailing lists (besides the mailing lists), if we operated under the usual assumption that devs don't read every thread on the bbs?

Offline

#13 2024-05-30 11:08:31

gromit
Administrator
From: Germany
Registered: 2024-02-10
Posts: 1,536
Website

Re: Download entire mailing list

Awebb wrote:

Where would one report problems with the mailing lists (besides the mailing lists)

The infrastructure repository would be a good place. We also have a mailing list, but it's mostly unused.

Awebb wrote:

if we operated under the usual assumption that devs don't read every thread on the bbs?

Devs are not the right entity in this case; the DevOps team takes care of servers and general Arch infrastructure.
Now in this case I got linked to this thread, so someone from the team is already having a look ;)

Offline

#14 2024-05-30 15:03:38

Awebb
Member
Registered: 2010-05-06
Posts: 6,688

Re: Download entire mailing list

Thanks for caring :)

Offline

#15 2024-05-30 16:40:01

gromit
Administrator
From: Germany
Registered: 2024-02-10
Posts: 1,536
Website

Re: Download entire mailing list

Offline
