You are not logged in.
Pages: 1
Hi guys,
I'm using Archlinux on amd64, I mount a share via SMB (NAS in my LAN) and I run rtorrent in a docker container which saves temporary and completed download in that share. After a while I always get the following reconnection messages on dmesg:
[Thu Mar 14 12:25:51 2019] CIFS VFS: Server 172.16.4.4 has not responded in 120 seconds. Reconnecting...During a session where many torrents were downloading I got a kernel crash (only CIFS module crashed) and from the kernel messages I can see the following logs:
[Thu Mar 14 12:17:28 2019] CIFS VFS: Server 172.16.4.4 has not responded in 120 seconds. Reconnecting...
[Thu Mar 14 12:19:50 2019] CIFS VFS: Server 172.16.4.4 has not responded in 120 seconds. Reconnecting...
[Thu Mar 14 12:20:00 2019] audit: type=1006 audit(1552562401.044:32417): pid=31322 uid=0 old-auid=4294967295 auid=0 tty=(none) old-ses=4294967295 ses=2416 res=1
[Thu Mar 14 12:23:17 2019] CIFS VFS: Server 172.16.4.4 has not responded in 120 seconds. Reconnecting...
[Thu Mar 14 12:25:51 2019] CIFS VFS: Server 172.16.4.4 has not responded in 120 seconds. Reconnecting...
[Thu Mar 14 12:25:51 2019] general protection fault: 0000 [#1] PREEMPT SMP NOPTI
[Thu Mar 14 12:25:51 2019] CPU: 3 PID: 5793 Comm: cifsd Not tainted 4.20.12-arch1-1-ARCH #1
[Thu Mar 14 12:25:51 2019] Hardware name: PC Engines apu2/apu2, BIOS v4.8.0.7 12/03/2018
[Thu Mar 14 12:25:51 2019] RIP: 0010:cifs_reconnect+0x273/0x820 [cifs]
[Thu Mar 14 12:25:51 2019] Code: 00 01 74 0d f6 05 2e b3 0a 00 04 0f 85 b1 03 00 00 48 8b 7c 24 08 48 8b 07 49 89 c5 48 39 ef 75 05 eb 2a 49 89 c5 48 8b 57 08 <48> 89 50 08 48 89 02 48 8b 47 40 48 89 3f 48 89 7f 08 e8 c6 35 d3
[Thu Mar 14 12:25:51 2019] RSP: 0018:ffffa13e012b7d48 EFLAGS: 00010287
[Thu Mar 14 12:25:51 2019] RAX: d3152ce0362ab108 RBX: ffff9eec481b4800 RCX: 0000000000000000
[Thu Mar 14 12:25:51 2019] RDX: ffffa13e012b7d50 RSI: 0000000000000246 RDI: ffff9eed2691da80
[Thu Mar 14 12:25:51 2019] RBP: ffffa13e012b7d50 R08: ffff9eec4bdcbc80 R09: 0000000000000000
[Thu Mar 14 12:25:51 2019] R10: 0000000080200800 R11: 0000000000000000 R12: ffff9eec481b49c0
[Thu Mar 14 12:25:51 2019] R13: d3152ce0362ab108 R14: ffffffffa7672d74 R15: 00000000fffffff5
[Thu Mar 14 12:25:51 2019] FS:  0000000000000000(0000) GS:ffff9eed2ab80000(0000) knlGS:0000000000000000
[Thu Mar 14 12:25:51 2019] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Thu Mar 14 12:25:51 2019] CR2: 000000c00015b010 CR3: 0000000103cd0000 CR4: 00000000000406e0
[Thu Mar 14 12:25:51 2019] Call Trace:
[Thu Mar 14 12:25:51 2019]  cifs_readv_from_socket+0x14c/0x240 [cifs]
[Thu Mar 14 12:25:51 2019]  cifs_read_from_socket+0x4a/0x70 [cifs]
[Thu Mar 14 12:25:51 2019]  ? _raw_spin_lock+0x13/0x30
[Thu Mar 14 12:25:51 2019]  ? cifs_small_buf_get+0x16/0x20 [cifs]
[Thu Mar 14 12:25:51 2019]  cifs_demultiplex_thread+0xd0/0xb40 [cifs]
[Thu Mar 14 12:25:51 2019]  ? cifs_handle_standard+0x190/0x190 [cifs]
[Thu Mar 14 12:25:51 2019]  kthread+0x112/0x130
[Thu Mar 14 12:25:51 2019]  ? kthread_park+0x80/0x80
[Thu Mar 14 12:25:51 2019]  ret_from_fork+0x22/0x40
[Thu Mar 14 12:25:51 2019] Modules linked in: arc4 ecb md4 cmac nls_utf8 cifs ccm dns_resolver fscache xt_nat xt_tcpudp macvlan ipt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat_ipv4 xt_addrtype iptable_filter xt_conntrack nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter bridge cfg80211 rfkill 8021q garp mrp stp llc ext4 crc16 mbcache jbd2 fscrypto amd64_edac_mod edac_mce_amd kvm_amd kvm leds_apu pcspkr irqbypass fam15h_power k10temp sp5100_tco i2c_piix4 evdev pcc_cpufreq mac_hid acpi_cpufreq ip_tables x_tables btrfs libcrc32c crc32c_generic xor raid6_pq algif_skcipher af_alg dm_crypt dm_mod sd_mod crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel ahci libahci aes_x86_64 crypto_simd libata cryptd xhci_pci sdhci_pci glue_helper cqhci ccp xhci_hcd sdhci ehci_pci scsi_mod mmc_core rng_core ehci_hcd igb i2c_algo_bit dca
[Thu Mar 14 12:25:51 2019] ---[ end trace 6a92928d3aa86067 ]---
[Thu Mar 14 12:25:51 2019] RIP: 0010:cifs_reconnect+0x273/0x820 [cifs]
[Thu Mar 14 12:25:51 2019] Code: 00 01 74 0d f6 05 2e b3 0a 00 04 0f 85 b1 03 00 00 48 8b 7c 24 08 48 8b 07 49 89 c5 48 39 ef 75 05 eb 2a 49 89 c5 48 8b 57 08 <48> 89 50 08 48 89 02 48 8b 47 40 48 89 3f 48 89 7f 08 e8 c6 35 d3
[Thu Mar 14 12:25:51 2019] RSP: 0018:ffffa13e012b7d48 EFLAGS: 00010287
[Thu Mar 14 12:25:51 2019] RAX: d3152ce0362ab108 RBX: ffff9eec481b4800 RCX: 0000000000000000
[Thu Mar 14 12:25:51 2019] RDX: ffffa13e012b7d50 RSI: 0000000000000246 RDI: ffff9eed2691da80
[Thu Mar 14 12:25:51 2019] RBP: ffffa13e012b7d50 R08: ffff9eec4bdcbc80 R09: 0000000000000000
[Thu Mar 14 12:25:51 2019] R10: 0000000080200800 R11: 0000000000000000 R12: ffff9eec481b49c0
[Thu Mar 14 12:25:51 2019] R13: d3152ce0362ab108 R14: ffffffffa7672d74 R15: 00000000fffffff5
[Thu Mar 14 12:25:51 2019] FS:  0000000000000000(0000) GS:ffff9eed2aa80000(0000) knlGS:0000000000000000
[Thu Mar 14 12:25:51 2019] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Thu Mar 14 12:25:51 2019] CR2: 0000561ad1dc5000 CR3: 00000001279a6000 CR4: 00000000000406e0
[Thu Mar 14 12:30:00 2019] audit: type=1006 audit(1552563001.204:32418): pid=32292 uid=0 old-auid=4294967295 auid=0 tty=(none) old-ses=4294967295 ses=2417 res=1
[Thu Mar 14 12:39:00 2019] audit: type=1006 audit(1552563541.351:32419): pid=789 uid=0 old-auid=4294967295 auid=0 tty=(none) old-ses=4294967295 ses=2418 res=1
[Thu Mar 14 12:40:00 2019] audit: type=1006 audit(1552563601.718:32420): pid=932 uid=0 old-auid=4294967295 auid=0 tty=(none) old-ses=4294967295 ses=2419 res=1
[Thu Mar 14 12:50:01 2019] audit: type=1006 audit(1552564201.841:32421): pid=1894 uid=0 old-auid=4294967295 auid=0 tty=(none) old-ses=4294967295 ses=2420 res=1
[Thu Mar 14 13:00:01 2019] audit: type=1006 audit(1552564801.971:32422): pid=2908 uid=0 old-auid=4294967295 auid=0 tty=(none) old-ses=4294967295 ses=2421 res=1I honestly don't know how to troubleshoot this issue, can anybody give me some tip/hin/help???
Thank you!
Offline

How are you connecting ( wifi, ethernet, bluetooth ) to the NAS ?
Does the NAS still respond to other devices when this happens ?
Does the problem also happen if you run rtorrent directly ?
Disliking systemd intensely, but not satisfied with alternatives so focusing on taming systemd.
clean chroot building not flexible enough ?
 Try clean chroot manager by graysky
Offline
How are you connecting ( wifi, ethernet, bluetooth ) to the NAS ?
Does the NAS still respond to other devices when this happens ?
Does the problem also happen if you run rtorrent directly ?
1) I'm connected via ethernet.
2) Yes, the NAS still proper works and don't experience any issue on other devices which are using the NAS at the same moment.
3) I haven't tried to run rtorrent outside the container.
Do you know if there are any limits with the Linux kernel about the maximum number of simultaneously open files thorough a SMB connection, or limits related to the bandwidth?
Offline

The smb daemon still works fine and responds to other devices while your torrent client stalls?
Does the NAS still echo pings from the torrenting machine while your torrent client stalls?
Did you check the smb log on the NAS for coincidental messages?
What are your cifs mount options? Did you try other cache strategies or directio? (man mount.cifs)
Offline
The smb daemon still works fine and responds to other devices while your torrent client stalls?
Yes, still works properly.
Does the NAS still echo pings from the torrenting machine while your torrent client stalls?
Yes, it can reply to icmp requests.
Did you check the smb log on the NAS for coincidental messages?
Yes, I've checked on freenas logs around that timeframe, but nothing came out.
What are your cifs mount options? Did you try other cache strategies or directio? (man mount.cifs)
The following are the mount options I used when the crash happened:
 sudo mount -t cifs -o credentials=/home/username/cifs_creds,uid=666,gid=666,rw,nounix,iocharset=utf8,file_mode=0755,dir_mode=0755 //freenas/Kodi/rtorrent/download /mnt/downloadBut I also got the same issue when I have ran the container via docker-compose file, and at that time I was using this mount options.
I'm going to check cache options/directio. I will update you about if it will fix my problem. Meanwhile, thank you.!
Offline
Some more information about my configuration:
$ mount.cifs -V
mount.cifs version: 6.8
$ modinfo cifs
filename:       /lib/modules/4.20.12-arch1-1-ARCH/kernel/fs/cifs/cifs.ko.xz
softdep:        pre: ccm
softdep:        pre: aead2
softdep:        pre: sha512
softdep:        pre: sha256
softdep:        pre: cmac
softdep:        pre: aes
softdep:        pre: nls
softdep:        pre: md5
softdep:        pre: md4
softdep:        pre: hmac
softdep:        pre: ecb
softdep:        pre: des
softdep:        pre: arc4
version:        2.14
description:    VFS to access SMB3 servers e.g. Samba, Macs, Azure and Windows (and also older servers complying with the SNIA CIFS Specification)
license:        GPL
author:         Steve French
alias:          smb3
alias:          fs-smb3
alias:          fs-cifs
srcversion:     478490380C190576AD4519F
depends:        fscache,dns_resolver
retpoline:      Y
intree:         Y
name:           cifs
vermagic:       4.20.12-arch1-1-ARCH SMP preempt mod_unload
sig_id:         PKCS#7
signer:         Build time autogenerated kernel key
sig_key:        0D:33:E4:8C:87:0E:14:9A:39:8C:26:AC:9C:E1:4C:A4:89:52:A3:D3
sig_hashalgo:   sha512
signature:      59:69:65:33:9B:91:0E:57:26:A9:F1:3E:BF:D2:3A:FA:6E:07:A2:41:
                B9:78:1E:28:69:E2:56:A9:9F:4A:4E:08:AC:5D:51:C0:B5:89:B2:34:
                5D:1B:8B:9F:14:22:DD:95:4A:64:FB:AA:77:CD:3A:B7:A4:9D:46:6C:
                1B:7A:58:0C:69:3D:AC:28:BC:4E:52:0B:F9:D6:5A:84:09:8E:C9:21:
                D3:73:A6:7B:9D:B3:9B:5F:0E:C7:EF:44:FE:9A:BA:AE:92:7D:31:E2:
                3B:2A:11:7F:59:35:05:92:53:EB:F0:B1:81:D7:1F:95:A3:06:B6:EA:
                8C:18:BE:95:FC:CE:40:8A:64:89:B6:F3:55:06:2E:5B:C0:2A:54:B5:
                A7:B4:F0:A5:E3:51:FF:51:52:DE:E4:D8:A3:28:69:0B:5A:44:85:D7:
                47:38:2B:C7:2B:7B:04:9E:99:4A:69:5F:3E:1E:B0:FA:B9:05:8A:C6:
                4E:A1:BF:5D:A3:C8:E0:1A:8E:D0:D5:4D:7B:CD:EA:9D:86:0F:CE:1D:
                F3:2D:C9:A6:53:93:EB:0A:D9:AE:49:D0:C2:E7:44:06:48:F0:54:C9:
                AE:7D:8C:E5:33:86:F2:53:0E:AE:D7:49:85:B9:3F:F7:EB:6A:07:21:
                B5:C5:DF:DC:D8:D5:D3:AF:BA:0D:7D:B4:33:49:8F:A1:03:E7:E7:B9:
                49:EF:D7:12:52:0B:86:10:D1:D4:58:75:F7:8B:65:A6:22:A8:48:D7:
                37:B5:24:C9:20:56:AD:79:F4:4B:8B:6B:02:78:47:AD:34:B1:06:0E:
                1A:92:CB:2A:B5:C0:AE:CE:65:F8:A8:81:FC:03:DD:5E:DC:67:90:16:
                BC:9D:8B:A6:FE:33:3C:A3:D8:D2:D7:A9:1C:25:F3:70:48:91:2D:88:
                8F:E9:DF:37:D5:37:AB:0F:54:7E:5C:6E:53:CC:97:EB:48:18:C4:38:
                9D:0B:0F:C3:3E:41:3F:FE:BD:7D:2F:DC:B7:9E:2A:D9:54:83:C5:AA:
                36:70:F2:08:F6:75:B4:3A:4B:17:7C:B8:8D:5E:AC:86:B9:D7:5B:E6:
                D8:1E:6F:89:69:6B:47:37:90:55:9B:CE:87:F0:ED:C3:2B:9E:EA:D4:
                15:F9:AE:F2:B5:45:01:06:B2:F9:79:58:C5:F8:81:1C:93:75:27:FF:
                73:4F:47:3D:69:9E:64:DF:33:42:1C:3C:F5:C1:A5:63:89:64:1C:A8:
                7C:03:0B:99:63:98:AB:69:51:84:E1:AA:FF:35:FB:64:A9:AD:C4:3A:
                0A:98:ED:F9:74:1D:EF:52:6A:57:AA:08:D1:F5:6F:2F:CA:84:59:90:
                27:12:74:17:BF:B7:94:19:F8:C1:30:BB
parm:           CIFSMaxBufSize:Network buffer size (not including header) for CIFS requests. Default: 16384 Range: 8192 to 130048 (uint)
parm:           cifs_min_rcv:Network buffers in pool. Default: 4 Range: 1 to 64 (uint)
parm:           cifs_min_small:Small network buffers in pool. Default: 30 Range: 2 to 256 (uint)
parm:           cifs_max_pending:Simultaneous requests to server for CIFS/SMB1 dialect (N/A for SMB3) Default: 32767 Range: 2 to 32767. (uint)
parm:           enable_oplocks:Enable or disable oplocks. Default: y/Y/1 (bool)
parm:           disable_legacy_dialects:To improve security it may be helpful to restrict the ability to override the default dialects (SMB2.1, SMB3 and SMB3.02) on mount with old dialects (CIFS/SMB1 and SMB2) since vers=1.0 (CIFS/SMB1) and vers=2.0 are weaker and less secure. Default: n/N/0 (bool)Offline

So it would seem the server does only break smb connections to that particular client - yet doesn't log anything…
You could try to raise the "log level" on the server and, in case the caching strategy has no impact, limiting the protocol eg. "vers=1.0" (see man "mount.cifs")
Offline
Hi Seth,
Did you try other cache strategies or directio? (man mount.cifs)
I followed you suggestion and I tried different cache options (none, strict and loose), I honestly didn't know about it. But unfortunately it didn't fix the issue.
Anyway I left cache=none option enabled, it should improve the performance when I will have fixed this problem.
So it would seem the server does only break smb connections to that particular client - yet doesn't log anything…
You could try to raise the "log level" on the server and, in case the caching strategy has no impact, limiting the protocol eg. "vers=1.0" (see man "mount.cifs")
Even changing the smb version didn't help. But I increased the verbosity on the server, and I actually got the proof that the issue resides on the server side.
I'm using Freenas 11.2-U2.1 with Samba 4.9.4. It seems that when too many reads are performed from a client, samba is not able to properly handle all the IO... then it closes the client connection.
While I was experiencing the issue, I saw the following error on the server:
[2019/03/16 13:38:23.047039,  1] ../source3/smbd/smb2_server.c:3918(smbd_smb2_io_handler)
  ignore NBT[0x49] msg
[2019/03/16 13:38:23.087477,  1] ../source3/smbd/smb2_server.c:3918(smbd_smb2_io_handler)
  ignore NBT[0x2A] msg
[2019/03/16 13:38:23.123200,  1] ../source3/smbd/smb2_server.c:3918(smbd_smb2_io_handler)
  ignore NBT[0xAF] msg
[2019/03/16 13:38:23.175463,  1] ../source3/smbd/smb2_server.c:3918(smbd_smb2_io_handler)
  ignore NBT[0x0F] msg
[2019/03/16 13:38:23.282583,  1] ../source3/smbd/smb2_server.c:3918(smbd_smb2_io_handler)
  ignore NBT[0x7E] msg
[2019/03/16 13:40:10.572218,  2] ../source3/smbd/close.c:802(close_normal_file)Looking about it on internet I only found this old (patched) issue that seemed kind of similar, to me. Here its relative tracked bug.
What do you think about it? Any idea on what I can do or other tests to make?
I want try first limiting rtorrent to a really small number of simultaneously opened files, then just limiting the upload (reads) bandwidth. Hopefully understand what exactly triggers the problem.
I will update here with results of next tests.
Offline

[2019/03/16 13:40:10.572218,  2] ../source3/smbd/close.c:802(close_normal_file)Suggests a file is being closed, but the status (asn some pot. error) would be in the next line (which is missing in this excerpt)
I'm not sure this is a "reading from the server" issue (do you have more log output to support that theory?), but you could turn the torrent client into strict leeching/seeding mode in order to determine whether only one direction is affected (and you might have to try *and load* both)
Another thing to try on the server would be usign sync IO
aio write size = 0
aio read size = 0I'm also not sure that skipping the cache will be prone to improve things…
Offline
Since the issue is FreeNAS related (more specifically Samba related) I opened a thread on the FreeNAS' community too.
Just want to link the two threads together.
Offline
[2019/03/16 13:40:10.572218, 2] ../source3/smbd/close.c:802(close_normal_file)Suggests a file is being closed, but the status (asn some pot. error) would be in the next line (which is missing in this excerpt)
I'm not sure this is a "reading from the server" issue (do you have more log output to support that theory?), but you could turn the torrent client into strict leeching/seeding mode in order to determine whether only one direction is affected (and you might have to try *and load* both)
Another thing to try on the server would be usign sync IO
aio write size = 0 aio read size = 0I'm also not sure that skipping the cache will be prone to improve things…
Hi Seth,
I tried sync IO on the server, but the issue persists.
I captured more logs from the server, using loglevel:full, and I posted it here. Do you see anything useful on these logs?
While on the client I got the usual kernel message
[Tue Mar 19 12:45:34 2019] CIFS VFS: Server 172.16.4.4 has not responded in 120 seconds. Reconnecting...Offline

The smbd_smb2_request_error_ex messages seems suspicious enough, but for some reason the log only contains the code line, not the actual message.
Does samba log into the systemd journal?
Offline
The smbd_smb2_request_error_ex messages seems suspicious enough, but for some reason the log only contains the code line, not the actual message.
Pardon, my bad! Here the complete logs:
[2019/03/19 12:41:48.773138,  3] ../source3/smbd/smb2_read.c:421(smb2_read_complete)
[2019/03/19 12:41:48.904591,  3] ../source3/smbd/smb2_read.c:421(smb2_read_complete)
[2019/03/19 12:41:49.259247,  3] ../source3/smbd/smb2_read.c:421(smb2_read_complete)
[2019/03/19 12:41:49.355925,  3] ../source3/smbd/smb2_server.c:3195(smbd_smb2_request_error_ex)
  smbd_smb2_request_error_ex: smbd_smb2_request_error_ex: idx[1] status[NT_STATUS_OBJECT_NAME_NOT_FOUND] || at ../source3/smbd/smb2_create.c:296
[2019/03/19 12:41:49.355965,  3] ../source3/smbd/smb2_server.c:3195(smbd_smb2_request_error_ex)
  smbd_smb2_request_error_ex: smbd_smb2_request_error_ex: idx[5] status[NT_STATUS_FILE_CLOSED] || at ../source3/smbd/smb2_server.c:2599
[2019/03/19 12:41:49.355981,  3] ../source3/smbd/smb2_server.c:3195(smbd_smb2_request_error_ex)
  smbd_smb2_request_error_ex: smbd_smb2_request_error_ex: idx[9] status[NT_STATUS_FILE_CLOSED] || at ../source3/smbd/smb2_server.c:2599
[2019/03/19 12:41:49.358626,  2] ../source3/smbd/open.c:1447(open_file)
[2019/03/19 12:41:49.363722,  3] ../source3/smbd/trans2.c:3465(smbd_do_qfsinfo)
  smbd_do_qfsinfo: level = 1007
[2019/03/19 12:41:49.364591,  3] ../source3/lib/sysquotas.c:488(sys_get_quota)
  sys_get_vfs_quota() failed for mntpath[.] bdev[(null)] qtype[1] id[-1]: Operation not supported
[2019/03/19 12:41:49.364618,  3] ../source3/lib/sysquotas.c:488(sys_get_quota)
  sys_get_vfs_quota() failed for mntpath[.] bdev[(null)] qtype[3] id[-1]: Operation not supported
[2019/03/19 12:41:49.443969,  3] ../source3/smbd/smb2_read.c:421(smb2_read_complete)
[2019/03/19 12:41:50.350426,  3] ../source3/smbd/smb2_read.c:421(smb2_read_complete)
[2019/03/19 12:41:50.626486,  3] ../source3/smbd/smb2_read.c:421(smb2_read_complete)Does samba log into the systemd journal?
I'm using FreeNAS, and it does not use systemd as init manager.
Offline
Pages: 1