I have an NFS share pointing to tmpfs on my workstation. The tmpfs space is essentially a RAM disk (the machine has 128G of DDR4 and I am allocating 100G of that). I am finding that reads/writes to the space over NFS are slow compared to reads/writes on the local machine and am wondering why. For example, extracting a 3G tarball takes over 10 minutes over NFS vs 10-12 seconds on the local machine.
-Both machines are behind a 2.5G switch and have 2.5G NICs
-iperf tests to/from these machines saturate the connection (2325 Mbits/sec)
My /etc/nfs.conf is stock with the exception of threads=128
Here is my /etc/exports:
/srv/nfs 10.9.8.0/24(ro,no_subtree_check,async,no_wdelay,fsid=0)
/srv/nfs/scratch 10.9.8.0/24(rw,no_subtree_check,async,no_wdelay,no_root_squash)
On the client:
# mountstats /scratch-on-quaduple
Stats for 10.9.8.101:/scratch mounted on /scratch-on-quaduple:
NFS mount options: rw,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,acregmin=3,acregmax=60,acdirmin=30,acdirmax=60,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.9.8.106,local_lock=none
NFS mount age: 0:08:55
NFS server capabilities: caps=0xfffbc0bf,wtmult=512,dtsize=1048576,bsize=0,namlen=255
NFSv4 capability flags: bm0=0xfdffbfff,bm1=0x40fdbe3e,bm2=0x60803,acl=0x3,sessions,pnfs=notconfigured,lease_time=90,lease_expired=0
NFS security flavor: 1 pseudoflavor: 0
NFS byte counts:
applications read 12583598539 bytes via read(2)
applications wrote 6613910165 bytes via write(2)
applications read 0 bytes via O_DIRECT read(2)
applications wrote 0 bytes via O_DIRECT write(2)
client read 6762333260 bytes via NFS READ
client wrote 6613917308 bytes via NFS WRITE
RPC statistics:
4577638 RPC requests sent, 4577637 RPC replies received (0 XIDs not found)
average backlog queue length: 0
REMOVE:
1263477 ops (27%)
avg bytes sent per op: 217 avg bytes received per op: 116
backlog wait: 0.001273 RTT: 0.083600 total execute time: 0.087952 (milliseconds)
LOOKUP:
711023 ops (15%) 388426 errors (54%)
avg bytes sent per op: 223 avg bytes received per op: 184
backlog wait: 0.001420 RTT: 0.072269 total execute time: 0.077062 (milliseconds)
SETATTR:
691574 ops (15%)
avg bytes sent per op: 232 avg bytes received per op: 259
backlog wait: 0.001982 RTT: 0.077809 total execute time: 0.083622 (milliseconds)
GETATTR:
499691 ops (10%)
avg bytes sent per op: 183 avg bytes received per op: 239
backlog wait: 0.001781 RTT: 0.073567 total execute time: 0.079315 (milliseconds)
WRITE:
348171 ops (7%)
avg bytes sent per op: 19209 avg bytes received per op: 183
backlog wait: 0.665469 RTT: 0.254079 total execute time: 0.921378 (milliseconds)
CLOSE:
345473 ops (7%)
avg bytes sent per op: 199 avg bytes received per op: 175
backlog wait: 0.004191 RTT: 0.076159 total execute time: 0.082093 (milliseconds)
OPEN:
345407 ops (7%) 74 errors (0%)
avg bytes sent per op: 318 avg bytes received per op: 367
backlog wait: 0.004534 RTT: 0.099488 total execute time: 0.106254 (milliseconds)
ACCESS:
159409 ops (3%)
avg bytes sent per op: 206 avg bytes received per op: 167
backlog wait: 0.001330 RTT: 0.072750 total execute time: 0.077235 (milliseconds)
READDIR:
142300 ops (3%)
avg bytes sent per op: 227 avg bytes received per op: 1737
backlog wait: 0.001504 RTT: 0.131026 total execute time: 0.136444 (milliseconds)
CREATE:
41842 ops (0%) 7 errors (0%)
avg bytes sent per op: 231 avg bytes received per op: 327
backlog wait: 0.001697 RTT: 0.082190 total execute time: 0.087687 (milliseconds)
READ:
26471 ops (0%)
avg bytes sent per op: 199 avg bytes received per op: 255565
backlog wait: 0.006384 RTT: 1.231952 total execute time: 1.249065 (milliseconds)
SYMLINK:
1469 ops (0%)
avg bytes sent per op: 262 avg bytes received per op: 328
backlog wait: 0.002042 RTT: 0.085773 total execute time: 0.092580 (milliseconds)
COMMIT:
626 ops (0%)
avg bytes sent per op: 175 avg bytes received per op: 104
backlog wait: 0.003195 RTT: 0.089457 total execute time: 0.097444 (milliseconds)
DELEGRETURN:
406 ops (0%)
avg bytes sent per op: 197 avg bytes received per op: 166
backlog wait: 0.822660 RTT: 0.251232 total execute time: 1.078818 (milliseconds)
OPEN_NOATTR:
202 ops (0%)
avg bytes sent per op: 245 avg bytes received per op: 351
backlog wait: 0.004950 RTT: 0.103960 total execute time: 0.113861 (milliseconds)
READLINK:
59 ops (0%)
avg bytes sent per op: 161 avg bytes received per op: 118
backlog wait: 0.000000 RTT: 0.135593 total execute time: 0.152542 (milliseconds)
RENAME:
12 ops (0%)
avg bytes sent per op: 263 avg bytes received per op: 152
backlog wait: 0.000000 RTT: 0.166667 total execute time: 0.250000 (milliseconds)
STATFS:
3 ops (0%)
avg bytes sent per op: 182 avg bytes received per op: 160
backlog wait: 0.000000 RTT: 0.000000 total execute time: 0.000000 (milliseconds)
SERVER_CAPS:
2 ops (0%)
avg bytes sent per op: 196 avg bytes received per op: 172
backlog wait: 0.000000 RTT: 0.000000 total execute time: 0.000000 (milliseconds)
EXCHANGE_ID:
2 ops (0%)
avg bytes sent per op: 244 avg bytes received per op: 108
backlog wait: 0.000000 RTT: 0.000000 total execute time: 0.000000 (milliseconds)
NULL:
1 ops (0%)
avg bytes sent per op: 44 avg bytes received per op: 24
backlog wait: 0.000000 RTT: 0.000000 total execute time: 0.000000 (milliseconds)
FSINFO:
1 ops (0%)
avg bytes sent per op: 196 avg bytes received per op: 168
backlog wait: 0.000000 RTT: 0.000000 total execute time: 0.000000 (milliseconds)
LOCK:
1 ops (0%)
avg bytes sent per op: 276 avg bytes received per op: 112
backlog wait: 0.000000 RTT: 0.000000 total execute time: 0.000000 (milliseconds)
PATHCONF:
1 ops (0%)
avg bytes sent per op: 188 avg bytes received per op: 116
backlog wait: 0.000000 RTT: 0.000000 total execute time: 0.000000 (milliseconds)
CREATE_SESSION:
1 ops (0%)
avg bytes sent per op: 192 avg bytes received per op: 124
backlog wait: 0.000000 RTT: 0.000000 total execute time: 0.000000 (milliseconds)
RECLAIM_COMPLETE:
1 ops (0%)
avg bytes sent per op: 124 avg bytes received per op: 88
backlog wait: 0.000000 RTT: 17.000000 total execute time: 17.000000 (milliseconds)
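One way to read the per-op numbers above is to multiply each op count by its average total execute time. The following is a rough sketch, not a diagnosis: it assumes the RPCs were issued mostly one after another (an assumption, since WRITEs in particular can overlap), and the counts cover the whole 0:08:55 mount age rather than a single extraction. The counts and timings are copied from the mountstats output; ops with negligible counts are left out.

```python
# (op count, avg total execute time in ms), copied from the mountstats output
OPS = {
    "REMOVE": (1263477, 0.087952),
    "LOOKUP": (711023, 0.077062),
    "SETATTR": (691574, 0.083622),
    "GETATTR": (499691, 0.079315),
    "WRITE": (348171, 0.921378),
    "CLOSE": (345473, 0.082093),
    "OPEN": (345407, 0.106254),
    "ACCESS": (159409, 0.077235),
    "READDIR": (142300, 0.136444),
    "CREATE": (41842, 0.087687),
    "READ": (26471, 1.249065),
}


def serialized_seconds(ops):
    """Seconds spent if every RPC ran strictly one at a time (an assumption)."""
    return sum(count * ms for count, ms in ops.values()) / 1000.0


# Everything except the two data-carrying ops counts as metadata here.
metadata_ops = {k: v for k, v in OPS.items() if k not in ("READ", "WRITE")}

total = serialized_seconds(OPS)
metadata = serialized_seconds(metadata_ops)

print(f"all listed ops, serialized: {total:.0f} s")
print(f"metadata ops only:          {metadata:.0f} s")
```

If the metadata figure comes out as a large share of the mount age, that would point at per-file round trips rather than raw bandwidth, but the serialization assumption makes this an upper-bound style estimate only.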
Last edited by graysky (2024-08-30 20:07:58)
CPU-optimized Linux-ck packages @ Repo-ck • AUR packages • Zsh and other configs
Does copying a file from a local tmpfs onto the remote operate w/ expected performance?
If not, what if using nfs-cp instead of the nfs mount?
You probably just end up copying the entire tarball over the network, extracting it locally (possibly running into swap, because you may be dealing w/ >30GB of data) and copying the extracted result back over the network.
If the tarball is 10:1 compressed that's 33GB of traffic, not 3 (still only a meager ~450Mb/s).
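The arithmetic behind that estimate can be sketched as follows. The 10:1 ratio is the hypothetical from the post, not a measured value, and the sketch assumes decimal units (1 GB = 1000 MB) and the roughly 10 minute extraction time from the first post.

```python
def extraction_traffic(tarball_gb, ratio, minutes):
    """Return (total GB moved, average Mbit/s) for extracting over NFS.

    Assumes the tarball is read over the network once and the extracted
    data is written back once; `ratio` is a hypothetical compression ratio.
    """
    extracted_gb = tarball_gb * ratio
    total_gb = tarball_gb + extracted_gb
    mbit_per_s = total_gb * 8 * 1000 / (minutes * 60)
    return total_gb, mbit_per_s


total_gb, rate = extraction_traffic(tarball_gb=3, ratio=10, minutes=10)
print(f"{total_gb} GB moved, about {rate:.0f} Mbit/s on average")
```

With these inputs the average rate stays well below the 2325 Mbits/sec that iperf reported, which is the point being made: even in the pessimistic case the link itself is not the limit.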
You could try whether "lookupcache=none,noac" improves performance in your scenario.
That is a good test. All tests are n=1 using rsync, flushing caches with echo 3 > /proc/sys/vm/drop_caches between runs.
Here is copying an 8.6 G file from local SSD to local tmpfs:
9,177,828,779 100% 759.25MB/s 0:00:11 (xfr#1, to-chk=0/1)
I mounted the NFS share at /temp on the local machine via mount 10.9.8.101:/scratch /temp and repeated the rsync transfer, which was slower:
9,177,828,779 100% 528.22MB/s 0:00:16 (xfr#1, to-chk=0/1)
Now copying that same file from the remote host's tmpfs (DDR3) to the nfs share:
9,177,828,779 100% 431.70MB/s 0:00:20 (xfr#1, to-chk=0/1)
So there are differences but these do not track with the slowness I'm experiencing on the tarball extraction or writing smaller files.
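For comparison with the iperf figure, the three rsync results can be converted to Gbit/s from the byte count and elapsed time. This is only a sketch: rsync reports elapsed time in whole seconds, so the rates are approximate, and the labels are shorthand for the three tests described above.

```python
LINK_GBIT = 2.325            # iperf result quoted in the first post
SIZE_BYTES = 9_177_828_779   # file size reported by rsync

# Elapsed seconds for each of the three rsync runs
RUNS = {
    "local SSD -> local tmpfs": 11,
    "NFS share mounted on the local machine": 16,
    "remote tmpfs -> NFS share": 20,
}


def gbit_per_s(size_bytes, seconds):
    """Average rate in Gbit/s (decimal units)."""
    return size_bytes * 8 / seconds / 1e9


for label, seconds in RUNS.items():
    rate = gbit_per_s(SIZE_BYTES, seconds)
    note = " (above the measured link rate)" if rate > LINK_GBIT else ""
    print(f"{label}: {rate:.2f} Gbit/s{note}")
```

If the over-the-network run comes out above what iperf measured, part of the data may still have been sitting in the client's page cache when rsync reported completion. That is a guess, not something confirmed here.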
Last edited by graysky (2024-08-31 14:02:23)
Esp. the last test seems similar to the use case in question, decompressing a tarball - have you
a) measured the amount of data that's actually transferred for that task (rx/tx bytes)
b) checked whether you get under (local) memory pressure with the decompression
c) tried to disable the nfs caches?
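For (a), a minimal sketch of measuring rx/tx bytes around a command, assuming a Linux client where the kernel exposes per-interface counters under /sys/class/net/<iface>/statistics. The interface name and the command in the usage note are placeholders, not taken from the thread, and other traffic on the interface during the run will be counted too.

```python
import subprocess
from pathlib import Path


def read_counters(iface, root="/sys/class/net"):
    """Read the kernel's cumulative rx/tx byte counters for one interface."""
    stats = Path(root) / iface / "statistics"
    return {name: int((stats / name).read_text())
            for name in ("rx_bytes", "tx_bytes")}


def traffic_delta(before, after):
    """Bytes received/sent between two counter snapshots."""
    return {name: after[name] - before[name] for name in before}


def measure(iface, argv, root="/sys/class/net"):
    """Run a command and return the bytes that crossed `iface` meanwhile."""
    before = read_counters(iface, root)
    subprocess.run(argv, check=True)
    after = read_counters(iface, root)
    return traffic_delta(before, after)
```

Usage would look something like `measure("enp5s0", ["tar", "xf", "archive.tar", "-C", "/scratch-on-quaduple"])`, where `enp5s0` and `archive.tar` are hypothetical names. For (b), watching `free -h` or `vmstat 1` in another terminal while the extraction runs is probably enough.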