#1 2023-08-07 12:53:29

tlaguz
Member
Registered: 2023-08-07
Posts: 4

Arch root on ZFS, Samsung 980 PRO 2TB, files randomly disappear

Hello,

I've been experiencing an issue where files are randomly deleted from my system, particularly affecting my Git repositories and directories synchronized with Nextcloud. This problem has persisted across different setups, including ZFS on Arch Linux and Btrfs on Fedora, both on the same Samsung 980 PRO 2TB drive (in different laptops).

Here's what I've done to nail the issue down so far:

* I've set up monitoring on Nextcloud to receive notifications whenever a file is deleted.
* I've configured auditd to monitor the entire /home dataset.
* I've updated the drive firmware to the latest version.

Here's what I've investigated:

* The files don't appear to be deleted by any specific process, as there are no corresponding entries in the audit log when files disappear.
* `zpool status` always reports that everything is ok. Autotrim is disabled. Sync is standard.
* The issue only occurs after a cold boot, never after a reboot.
* I've noticed that during shutdown, the /var/log dataset fails to unmount, although the log indicates that filesystems and block devices are synced.

Here's the shutdown sequence from the log (newest entries first):

    Aug 02 23:08:40 quark systemd-journald[772]: Journal stopped
    Aug 02 23:08:40 quark systemd-journald[772]: Received SIGTERM from PID 1 (systemd-shutdow).
    Aug 02 23:08:40 quark systemd-shutdown[1]: Sending SIGTERM to remaining processes...
    Aug 02 23:08:35 quark systemd-shutdown[1]: Syncing filesystems and block devices.
    Aug 02 23:08:35 quark systemd[1]: Shutting down.
    Aug 02 23:08:35 quark systemd[1]: Reached target System Power Off.
    Aug 02 23:08:35 quark systemd[1]: Finished System Power Off.
    Aug 02 23:08:35 quark systemd[1]: systemd-poweroff.service: Deactivated successfully.
    Aug 02 23:08:35 quark systemd[1]: Reached target Late Shutdown Services.
    Aug 02 23:08:35 quark systemd[1]: Reached target System Shutdown.
    Aug 02 23:08:35 quark systemd[1]: Stopped Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling.
    Aug 02 23:08:35 quark systemd[1]: lvm2-monitor.service: Deactivated successfully.
    Aug 02 23:08:35 quark systemd[1]: Reached target Unmount All Filesystems.
    ...
    Aug 02 23:08:35 quark systemd[1]: Failed unmounting var-log.mount.
    Aug 02 23:08:35 quark systemd[1]: var-log.mount: Mount process exited, code=exited, status=32/n/a

What can I do to investigate this further while staying on Arch and ZFS?

Is it possible that the drive is not syncing properly?

Funnily enough, this issue was the reason I switched from Fedora/Btrfs to Arch/ZFS. In the meantime I also changed laptops, from a Lenovo to a Dell, but the drive moved with me.

#2 2023-08-07 15:37:50

twelveeighty
Member
From: Alberta, Canada
Registered: 2011-09-04
Posts: 1,096

Please explain how you know that files are deleted from your system. It's the missing part of your post: who or what is telling you the files are removed? Also: is there anything common to the files that are disappearing, for example: are they dotfiles?

#3 2023-08-07 17:19:28

tlaguz
Member
Registered: 2023-08-07
Posts: 4

Regarding the Nextcloud ones: I can see that the file is removed on the server and moved to the trash. In the server access logs I can see that my computer sent a DELETE request just after booting.

Git repositories: either files tracked by Git get removed, or the repository is broken due to missing files in the .git directory.

twelveeighty wrote:

Also: is there anything common to the files that are disappearing, for example: are they dotfiles?

Every file I noticed missing was in my home directory. For example, today I noticed that some PDF files were missing from my documents folder. Logs on the Nextcloud server:

    root@vm-nextcloud:~# less /var/log/apache2/access.log | grep DEL
    172.31.31.16 - tlaguz [07/Aug/2023:08:35:35 +0200] "DELETE /remote.php/dav/files/tlaguz/Dokumenty/Umowy/Raty%20MediaExpert2/Formularz_informacyjny.pdf HTTP/1.0" 204 527 "-" "Mozilla/5.0 (Linux) mirall/3.9.1git (Nextcloud, arch-6.4.7-arch1-1 ClientArchitecture: x86_64 OsArchitecture: x86_64)"
    172.31.31.16 - tlaguz [07/Aug/2023:08:35:35 +0200] "DELETE /remote.php/dav/files/tlaguz/Dokumenty/Umowy/Raty%20MediaExpert1/Formularz_informacyjny.pdf HTTP/1.0" 204 527 "-" "Mozilla/5.0 (Linux) mirall/3.9.1git (Nextcloud, arch-6.4.7-arch1-1 ClientArchitecture: x86_64 OsArchitecture: x86_64)"
    ...

I haven't deleted them and there is no entry in audit.log which would explain their disappearance.
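
For anyone repeating this check, the access log can also be filtered with a short script that keeps only DELETE requests and decodes the URL-encoded paths. This is a minimal Python sketch, assuming the Apache combined log format seen in the lines above. The function name `find_deletes` is made up for illustration, and the GET line in the sample is hypothetical (it is not from the real log).

```python
import re
from urllib.parse import unquote

# Matches the leading part of an Apache combined-format access log line.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3})'
)


def find_deletes(lines):
    """Yield (time, user, decoded_path, status) for every DELETE request."""
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group("method") == "DELETE":
            yield (
                m.group("time"),
                m.group("user"),
                unquote(m.group("path")),  # "%20" becomes a space, etc.
                int(m.group("status")),
            )


SAMPLE = [
    # First line is taken from the log excerpt in this post.
    '172.31.31.16 - tlaguz [07/Aug/2023:08:35:35 +0200] "DELETE '
    '/remote.php/dav/files/tlaguz/Dokumenty/Umowy/Raty%20MediaExpert2/'
    'Formularz_informacyjny.pdf HTTP/1.0" 204 527 "-" "Mozilla/5.0 (Linux) '
    'mirall/3.9.1git (Nextcloud, arch-6.4.7-arch1-1 ClientArchitecture: '
    'x86_64 OsArchitecture: x86_64)"',
    # Hypothetical non-DELETE line, only here to show that it is skipped.
    '172.31.31.16 - tlaguz [07/Aug/2023:08:35:36 +0200] "GET '
    '/status.php HTTP/1.0" 200 100 "-" "example-agent"',
]

for when, user, path, status in find_deletes(SAMPLE):
    print(when, user, status, path)
```

Sorting the results by timestamp and comparing them with the boot time makes it easier to see whether the deletions cluster right after startup.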

Git repositories were also broken this time. If random files are being deleted, there is a high chance that Git repositories will be affected, since they contain a lot of files.

#4 2023-08-07 18:08:05

twelveeighty
Member
From: Alberta, Canada
Registered: 2011-09-04
Posts: 1,096

Have you tested doing several cold boots with the Nextcloud sync tool turned off? To be specific: turn off the Nextcloud sync daemon, reboot (making sure it's still off), then perform a "normal" workload of file activity followed by a cold boot. Repeat the workload/cold boot cycle until you're satisfied the problem is gone (or not). Since Git would still report if a file has disappeared, that would eliminate Nextcloud as the culprit.

Of course, before you do any type of further testing, take a full offline backup and make sure all your data is safe first.
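
To make the before/after comparison less dependent on noticing a missing file by chance, a file manifest can be recorded before shutdown and compared after the cold boot. This is a minimal Python sketch under stated assumptions: the function names and the manifest format (a JSON map of relative path to size) are invented for illustration, and it only detects missing paths, not changed contents.

```python
import json
import os


def snapshot(root):
    """Return {relative_path: size_in_bytes} for every non-directory entry under root."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.lstat(full)  # lstat: do not follow symlinks
            except OSError:
                continue  # entry vanished or is unreadable during the walk
            manifest[os.path.relpath(full, root)] = st.st_size
    return manifest


def missing_files(before, after):
    """Paths present in the earlier manifest but absent from the later one."""
    return sorted(set(before) - set(after))


def save(manifest, path):
    """Write a manifest to disk as JSON."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(manifest, fh)


def load(path):
    """Read a manifest previously written by save()."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

A possible workflow: call `save(snapshot(home_dir), manifest_path)` just before powering off, then after the cold boot run `missing_files(load(manifest_path), snapshot(home_dir))`. Both `home_dir` and `manifest_path` are placeholders; storing the manifest on a different device than the affected pool avoids it being lost along with the files.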

#5 2023-08-12 14:53:51

Arsimael
Member
Registered: 2021-05-09
Posts: 16

This is not a problem with Nextcloud.
I am experiencing this too.
I am using Arch with ZFS on my file server: 2x(4x10TB) in one large pool (two RAID5 sets of 4 disks each, combined as a RAID50).

https://pics.jhml.de/images/2023/08/12/ … d3ef6e.png

I am also seeing files disappear at random.
Sometimes it's a movie, or a whole set of MP3s.
I do NOT delete them. And luckily I have backups - but it's annoying since you never know what disappears and when.


moderator edit -- replaced oversized image with link.
Pasting pictures and code

Last edited by 2ManyDogs (2023-09-07 10:46:11)

#6 2023-09-07 10:42:25

tlaguz
Member
Registered: 2023-08-07
Posts: 4

Ok, so this has happened again. I see removed files in the Nextcloud trash bin, and my Git repositories have deleted files: some of the repositories are broken (git missing indices ...), and some show file deletions in `git status`.

After the last incident I ran `zfs set sync=always rpool/var/log`. Like last time, audit.log didn't register any file deletions.

I didn't have any unclean shutdowns or reboots.

@Arsimael what does your setup look like? Which kernel are you running, and do you shut down your server often? Is the OS installed on the jhml pool? I'm pretty sure that files disappear on shutdown in my case. I need to do some more testing.

Either I am going crazy or doing something wrong.

#7 2023-10-13 18:57:17

tlaguz
Member
Registered: 2023-08-07
Posts: 4

It happened again. This time it hit files in a repository I worked on yesterday, so I knew something must have happened a few hours ago.
I found it in audit.log. It is the Asseco proCertum SmartSign application, which stores its logs directly in the home directory.

    type=SYSCALL msg=audit(1697102735.910:38007): arch=c000003e syscall=87 success=yes exit=0 a0=7fa574002970 a1=7fa57fa543e8 a2=7fa574002970 a3=7fa5f577c50b items=2 ppid=56545 pid=56554 auid=1000 uid=1000 gid=1000 euid=1000 suid=1000 fsuid=1000 egid=1000 sgid=1000 fsgid=1000 tty=(none) ses=3 comm="Log4j2-TF-1-Rol" exe="/opt/proCertumSmartSign/jre/bin/java" key="home_changes"^]ARCH=x86_64 SYSCALL=unlink AUID="tlaguz" UID="tlaguz" GID="tlaguz" EUID="tlaguz" SUID="tlaguz" FSUID="tlaguz" EGID="tlaguz" SGID="tlaguz" FSGID="tlaguz"
    type=CWD msg=audit(1697102735.910:38007): cwd="/home/tlaguz/Desktop"
    type=PATH msg=audit(1697102735.910:38007): item=0 name="/home/tlaguz/Repozytoria/ewid-cli/docker/compose/" inode=201534 dev=00:25 mode=040755 ouid=1000 ogid=1000 rdev=00:00 nametype=PARENT cap_fp=0 cap_fi=0 cap_fe=0 cap_fver=0 cap_frootid=0^]OUID="tlaguz" OGID="tlaguz"
    type=PATH msg=audit(1697102735.910:38007): item=1 name="/home/tlaguz/Repozytoria/ewid-cli/docker/compose/docker-compose-intraewid.yml" inode=3137798 dev=00:25 mode=0100644 ouid=1000 ogid=1000 rdev=00:00 nametype=DELETE cap_fp=0 cap_fi=0 cap_fe=0 cap_fver=0 cap_frootid=0^]OUID="tlaguz" OGID="tlaguz"

Log4j tries to rotate its logs and deletes some random files every time the application starts. I reproduced it after the discovery, so this is definitely what was happening all along.
I lost tens of hours tracking this down and almost lost my sanity.
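
For others chasing a similar problem, audit records like the ones above can be grouped by executable, so that an unexpected program deleting files stands out. This is a minimal Python sketch, assuming the raw audit.log layout shown above, where a SYSCALL record and its PATH records share the same `msg=audit(timestamp:serial)` id. The function names are made up, the sample records are shortened from this post, and hex-encoded `name=` values or lines with a `node=` prefix are not handled.

```python
import re
from collections import defaultdict

# Record type plus the (timestamp, serial) pair that ties records of one event together.
EVENT_RE = re.compile(r'^type=(\w+) msg=audit\((\d+\.\d+):(\d+)\)')
# key=value pairs; values are either double-quoted or a run of non-space characters.
FIELD_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_fields(line):
    """Return the key=value pairs of one audit record as a dict (quotes stripped)."""
    return {key: value.strip('"') for key, value in FIELD_RE.findall(line)}


def deletions_by_exe(lines):
    """Map each executable to the paths it deleted (PATH records with nametype=DELETE)."""
    exe_of = {}                  # event id -> executable from the SYSCALL record
    deleted = defaultdict(list)  # event id -> deleted paths from PATH records
    for line in lines:
        m = EVENT_RE.match(line)
        if not m:
            continue
        rtype, timestamp, serial = m.groups()
        event = (timestamp, serial)
        fields = parse_fields(line)
        if rtype == "SYSCALL":
            exe_of[event] = fields.get("exe", "?")
        elif rtype == "PATH" and fields.get("nametype") == "DELETE":
            deleted[event].append(fields.get("name", "?"))
    result = defaultdict(list)
    for event, paths in deleted.items():
        result[exe_of.get(event, "?")].extend(paths)
    return dict(result)


# Shortened versions of the records quoted in this post.
SAMPLE = [
    'type=SYSCALL msg=audit(1697102735.910:38007): arch=c000003e syscall=87 '
    'success=yes exit=0 pid=56554 uid=1000 comm="Log4j2-TF-1-Rol" '
    'exe="/opt/proCertumSmartSign/jre/bin/java" key="home_changes"',
    'type=PATH msg=audit(1697102735.910:38007): item=0 '
    'name="/home/tlaguz/Repozytoria/ewid-cli/docker/compose/" nametype=PARENT',
    'type=PATH msg=audit(1697102735.910:38007): item=1 '
    'name="/home/tlaguz/Repozytoria/ewid-cli/docker/compose/docker-compose-intraewid.yml" '
    'nametype=DELETE',
]

for exe, paths in deletions_by_exe(SAMPLE).items():
    print(exe)
    for path in paths:
        print("   ", path)
```

Keying on `nametype=DELETE` rather than on a syscall number keeps the sketch independent of the architecture-specific syscall table.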

#8 2024-01-16 13:55:10

Arsimael
Member
Registered: 2021-05-09
Posts: 16

I had to expand my ZFS raidz1 with another set of disks and rebalanced my files.
I also had to 'update' the ZFS pool to the current version (I started my trunk with version 1.x).
Since then, I have not experienced any disappearing files. Maybe the outdated ZFS trunk was the issue.

#9 2024-01-16 14:43:31

WorMzy
Forum Moderator
From: Scotland
Registered: 2010-06-16
Posts: 11,896

Mod note: moving to AUR Issues

