
#1 2024-08-30 19:32:32

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,672

Increase NFS server speed/current setup is clunky

I have an NFS share pointing to tmpfs on my workstation.  The tmpfs space is essentially a RAM disk (the machine has 128G of DDR4 and I am allocating 100G of that).  I am finding that reads/writes to the space over NFS are slow compared to reads/writes on the local machine, and I am wondering why.  For example, extracting a 3 G tarball takes over 10 minutes over NFS vs 10-12 seconds on the local machine.

-Both machines are behind a 2.5G switch and have 2.5G NICs
-iperf tests to/from these machines saturate the connection (2325 Mbits/sec)

My /etc/nfs.conf is stock with the exception of threads=128
Here is my /etc/exports:

/srv/nfs          10.9.8.0/24(ro,no_subtree_check,async,no_wdelay,fsid=0)
/srv/nfs/scratch  10.9.8.0/24(rw,no_subtree_check,async,no_wdelay,no_root_squash)

On the client:

# mountstats /scratch-on-quaduple
Stats for 10.9.8.101:/scratch mounted on /scratch-on-quaduple:

  NFS mount options: rw,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,acregmin=3,acregmax=60,acdirmin=30,acdirmax=60,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.9.8.106,local_lock=none
  NFS mount age: 0:08:55
  NFS server capabilities: caps=0xfffbc0bf,wtmult=512,dtsize=1048576,bsize=0,namlen=255
  NFSv4 capability flags: bm0=0xfdffbfff,bm1=0x40fdbe3e,bm2=0x60803,acl=0x3,sessions,pnfs=notconfigured,lease_time=90,lease_expired=0
  NFS security flavor: 1  pseudoflavor: 0

NFS byte counts:
  applications read 12583598539 bytes via read(2)
  applications wrote 6613910165 bytes via write(2)
  applications read 0 bytes via O_DIRECT read(2)
  applications wrote 0 bytes via O_DIRECT write(2)
  client read 6762333260 bytes via NFS READ
  client wrote 6613917308 bytes via NFS WRITE
RPC statistics:
  4577638 RPC requests sent, 4577637 RPC replies received (0 XIDs not found)
  average backlog queue length: 0

REMOVE:
	1263477 ops (27%) 
	avg bytes sent per op: 217	avg bytes received per op: 116
	backlog wait: 0.001273 	RTT: 0.083600 	total execute time: 0.087952 (milliseconds)
LOOKUP:
	711023 ops (15%) 	388426 errors (54%)
	avg bytes sent per op: 223	avg bytes received per op: 184
	backlog wait: 0.001420 	RTT: 0.072269 	total execute time: 0.077062 (milliseconds)
SETATTR:
	691574 ops (15%) 
	avg bytes sent per op: 232	avg bytes received per op: 259
	backlog wait: 0.001982 	RTT: 0.077809 	total execute time: 0.083622 (milliseconds)
GETATTR:
	499691 ops (10%) 
	avg bytes sent per op: 183	avg bytes received per op: 239
	backlog wait: 0.001781 	RTT: 0.073567 	total execute time: 0.079315 (milliseconds)
WRITE:
	348171 ops (7%) 
	avg bytes sent per op: 19209	avg bytes received per op: 183
	backlog wait: 0.665469 	RTT: 0.254079 	total execute time: 0.921378 (milliseconds)
CLOSE:
	345473 ops (7%) 
	avg bytes sent per op: 199	avg bytes received per op: 175
	backlog wait: 0.004191 	RTT: 0.076159 	total execute time: 0.082093 (milliseconds)
OPEN:
	345407 ops (7%) 	74 errors (0%)
	avg bytes sent per op: 318	avg bytes received per op: 367
	backlog wait: 0.004534 	RTT: 0.099488 	total execute time: 0.106254 (milliseconds)
ACCESS:
	159409 ops (3%) 
	avg bytes sent per op: 206	avg bytes received per op: 167
	backlog wait: 0.001330 	RTT: 0.072750 	total execute time: 0.077235 (milliseconds)
READDIR:
	142300 ops (3%) 
	avg bytes sent per op: 227	avg bytes received per op: 1737
	backlog wait: 0.001504 	RTT: 0.131026 	total execute time: 0.136444 (milliseconds)
CREATE:
	41842 ops (0%) 	7 errors (0%)
	avg bytes sent per op: 231	avg bytes received per op: 327
	backlog wait: 0.001697 	RTT: 0.082190 	total execute time: 0.087687 (milliseconds)
READ:
	26471 ops (0%) 
	avg bytes sent per op: 199	avg bytes received per op: 255565
	backlog wait: 0.006384 	RTT: 1.231952 	total execute time: 1.249065 (milliseconds)
SYMLINK:
	1469 ops (0%) 
	avg bytes sent per op: 262	avg bytes received per op: 328
	backlog wait: 0.002042 	RTT: 0.085773 	total execute time: 0.092580 (milliseconds)
COMMIT:
	626 ops (0%) 
	avg bytes sent per op: 175	avg bytes received per op: 104
	backlog wait: 0.003195 	RTT: 0.089457 	total execute time: 0.097444 (milliseconds)
DELEGRETURN:
	406 ops (0%) 
	avg bytes sent per op: 197	avg bytes received per op: 166
	backlog wait: 0.822660 	RTT: 0.251232 	total execute time: 1.078818 (milliseconds)
OPEN_NOATTR:
	202 ops (0%) 
	avg bytes sent per op: 245	avg bytes received per op: 351
	backlog wait: 0.004950 	RTT: 0.103960 	total execute time: 0.113861 (milliseconds)
READLINK:
	59 ops (0%) 
	avg bytes sent per op: 161	avg bytes received per op: 118
	backlog wait: 0.000000 	RTT: 0.135593 	total execute time: 0.152542 (milliseconds)
RENAME:
	12 ops (0%) 
	avg bytes sent per op: 263	avg bytes received per op: 152
	backlog wait: 0.000000 	RTT: 0.166667 	total execute time: 0.250000 (milliseconds)
STATFS:
	3 ops (0%) 
	avg bytes sent per op: 182	avg bytes received per op: 160
	backlog wait: 0.000000 	RTT: 0.000000 	total execute time: 0.000000 (milliseconds)
SERVER_CAPS:
	2 ops (0%) 
	avg bytes sent per op: 196	avg bytes received per op: 172
	backlog wait: 0.000000 	RTT: 0.000000 	total execute time: 0.000000 (milliseconds)
EXCHANGE_ID:
	2 ops (0%) 
	avg bytes sent per op: 244	avg bytes received per op: 108
	backlog wait: 0.000000 	RTT: 0.000000 	total execute time: 0.000000 (milliseconds)
NULL:
	1 ops (0%) 
	avg bytes sent per op: 44	avg bytes received per op: 24
	backlog wait: 0.000000 	RTT: 0.000000 	total execute time: 0.000000 (milliseconds)
FSINFO:
	1 ops (0%) 
	avg bytes sent per op: 196	avg bytes received per op: 168
	backlog wait: 0.000000 	RTT: 0.000000 	total execute time: 0.000000 (milliseconds)
LOCK:
	1 ops (0%) 
	avg bytes sent per op: 276	avg bytes received per op: 112
	backlog wait: 0.000000 	RTT: 0.000000 	total execute time: 0.000000 (milliseconds)
PATHCONF:
	1 ops (0%) 
	avg bytes sent per op: 188	avg bytes received per op: 116
	backlog wait: 0.000000 	RTT: 0.000000 	total execute time: 0.000000 (milliseconds)
CREATE_SESSION:
	1 ops (0%) 
	avg bytes sent per op: 192	avg bytes received per op: 124
	backlog wait: 0.000000 	RTT: 0.000000 	total execute time: 0.000000 (milliseconds)
RECLAIM_COMPLETE:
	1 ops (0%) 
	avg bytes sent per op: 124	avg bytes received per op: 88
	backlog wait: 0.000000 	RTT: 17.000000 	total execute time: 17.000000 (milliseconds)
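
One rough way to read the per-op table above is to multiply each op count by its average total execute time. The sketch below does that for the eleven busiest ops, with the numbers copied from the output. It is only an estimate: RPCs can overlap, so the summed time is not wall-clock time, and the counters cover the whole mount age rather than a single extraction. The function names are made up for this example.

```python
# (op count, average "total execute time" in ms), copied from the mountstats output above.
OPS = {
    "REMOVE":  (1263477, 0.087952),
    "LOOKUP":  (711023, 0.077062),
    "SETATTR": (691574, 0.083622),
    "GETATTR": (499691, 0.079315),
    "WRITE":   (348171, 0.921378),
    "CLOSE":   (345473, 0.082093),
    "OPEN":    (345407, 0.106254),
    "ACCESS":  (159409, 0.077235),
    "READDIR": (142300, 0.136444),
    "CREATE":  (41842, 0.087687),
    "READ":    (26471, 1.249065),
}


def cumulative_seconds(ops):
    """Summed RPC time per op in seconds: count * average execute time."""
    return {name: count * ms / 1000.0 for name, (count, ms) in ops.items()}


def metadata_share(ops, data_ops=("READ", "WRITE")):
    """Fraction of the listed RPC calls that are not READ or WRITE."""
    total = sum(count for count, _ in ops.values())
    data = sum(count for name, (count, _) in ops.items() if name in data_ops)
    return (total - data) / total


# Print the ops sorted by summed time, largest first.
for name, secs in sorted(cumulative_seconds(OPS).items(), key=lambda kv: -kv[1]):
    print(f"{name:8s} {secs:8.1f} s")
print(f"non-READ/WRITE share of calls: {metadata_share(OPS):.1%}")
```

Going by call count, the large majority of the listed RPCs are metadata operations rather than READ or WRITE.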

Last edited by graysky (2024-08-30 20:07:58)


CPU-optimized Linux-ck packages @ Repo-ck  • AUR packages  • Zsh and other configs


#2 2024-08-30 20:59:49

seth
Member
Registered: 2012-09-03
Posts: 60,390

Re: Increase NFS server speed/current setup is clunky

Does copying a file from a local tmpfs onto the remote operate w/ expected performance?
If not, what if using nfs-cp instead of the nfs mount?

You probably just end up copying the entire tarball over the network, extracting it locally (where you might run into swap usage, because you're maybe dealing w/ >30GB of data) and copying the extracted result back over the network.
If the tarball is 10:1 compressed, that's 33GB of traffic, not 3 (still only a meager 450Mb/s).
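
The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch: the 10:1 ratio is the hypothetical figure from the post, sizes are treated as decimal GB, and the elapsed time is taken as exactly 10 minutes.

```python
def avg_mbit_per_s(total_bytes, seconds):
    """Average rate in megabits per second (decimal, 10**6 bits)."""
    return total_bytes * 8 / seconds / 1e6


tarball_gb = 3   # compressed size from the first post
ratio = 10       # hypothetical 10:1 compression ratio
minutes = 10     # "over 10 minutes" from the first post

# Tarball read over the network plus extracted data written back.
traffic_bytes = (tarball_gb + tarball_gb * ratio) * 1e9
print(f"{traffic_bytes / 1e9:.0f} GB in {minutes} min = "
      f"{avg_mbit_per_s(traffic_bytes, minutes * 60):.0f} Mbit/s")
```

Under those assumptions the average comes out at about 440 Mbit/s, in line with the rough 450Mb/s figure.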

You could try whether "lookupcache=none,noac" improves performance in your scenario.


#3 2024-08-31 12:42:47

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,672

Re: Increase NFS server speed/current setup is clunky

That is a good test.  All tests are n=1 using rsync, with caches flushed via echo 3 > /proc/sys/vm/drop_caches between tests.

Here is copying an 8.6 G file from local SSD to local tmpfs:
9,177,828,779 100%  759.25MB/s    0:00:11 (xfr#1, to-chk=0/1)

I mounted the nfs share to /temp on the local machine via mount 10.9.8.101:/scratch /temp and repeated the rsync transfer, which was slower:
9,177,828,779 100%  528.22MB/s    0:00:16 (xfr#1, to-chk=0/1)

Now copying that same file from the remote host's tmpfs (DDR3) to the nfs share:
9,177,828,779 100%  431.70MB/s    0:00:20 (xfr#1, to-chk=0/1)

So there are differences but these do not track with the slowness I'm experiencing on the tarball extraction or writing smaller files.
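
To put the three rsync rates on the same scale as the iperf figure from the first post, they can be converted to Mbit/s. A small sketch, assuming rsync's "MB/s" is 1024-based (pass `unit_bytes=10**6` to check the decimal reading instead); the labels are shorthand for the three tests above, and only the third appears to cross the network.

```python
MIB = 1024 * 1024     # assumption: rsync's "MB/s" is 1024-based
IPERF_MBIT = 2325     # iperf result from the first post

# rsync-reported rates, in the order of the three tests above.
RSYNC_RATES = {
    "ssd -> local tmpfs": 759.25,
    "local nfs mount": 528.22,
    "remote tmpfs -> nfs": 431.70,
}


def to_mbit_per_s(rate, unit_bytes=MIB):
    """Convert a rate in (unit_bytes per second) to decimal megabits per second."""
    return rate * unit_bytes * 8 / 1e6


for label, rate in RSYNC_RATES.items():
    mbit = to_mbit_per_s(rate)
    print(f"{label:22s} {mbit:8.0f} Mbit/s  ({mbit / IPERF_MBIT:.2f}x iperf)")
```

If the third figure comes out above the measured link rate under either reading of the unit, client-side caching may be included in what rsync reports. That is only a guess and would need checking against the interface counters.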

Last edited by graysky (2024-08-31 14:02:23)



#4 2024-08-31 17:19:20

seth
Member
Registered: 2012-09-03
Posts: 60,390

Re: Increase NFS server speed/current setup is clunky

Esp. the last test seems similar to the use case in question, decompressing a tarball. Have you
a) measured the amount of data that's actually transferred for that task (rx/tx bytes)
b) checked whether you get under (local) memory pressure with the decompression
c) tried to disable the nfs caches?
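
For point a), one possible way to measure the transferred bytes is to read the kernel's per-interface counters before and after the extraction. The sketch below reads `rx_bytes` and `tx_bytes` from `/sys/class/net/<iface>/statistics/`; the function names and the interface name `eth0` in the usage comment are placeholders, and other traffic on the same interface will be counted too.

```python
from pathlib import Path


def read_counters(iface, base="/sys/class/net"):
    """Return the current rx/tx byte counters of a network interface."""
    stats = Path(base) / iface / "statistics"
    return {key: int((stats / key).read_text()) for key in ("rx_bytes", "tx_bytes")}


def delta(before, after):
    """Bytes received/sent between two counter snapshots."""
    return {key: after[key] - before[key] for key in before}


# Usage (interface name is a placeholder, pick yours from `ip link`):
#   before = read_counters("eth0")
#   ... run the tarball extraction on the NFS mount ...
#   after = read_counters("eth0")
#   print(delta(before, after))
```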
