
#1 2024-01-20 10:45:25

intgr
Member
Registered: 2009-10-02
Posts: 47

Low write speeds with new Kingston KC3000 SSD

TL;DR

I just got a new Kingston KC3000 2TB NVMe SSD for my computer. Reviews list it as among the fastest SSDs available, with sustained write speeds of 6500 MB/s [1].

My machine only supports PCIe 3.0 [2] so I should be reaching ~3500 MB/s read/write. But with KDiskMark and O_DIRECT disabled (very few apps use O_DIRECT), write speeds are capped at 1100 MB/s (https://imgur.com/tsNQbmM).

I'm at a loss here. Is the Linux I/O stack just so hopelessly bad that it cannot saturate an SSD at PCIe 3 rates without O_DIRECT?

Benchmarks

Read speed does reach those levels, but real-world write speed is nowhere near them.

According to UDisks benchmark, write speed only goes up to ~520 MB/s: https://imgur.com/CpmRPB1. At one point I even saw ~60 MB/s write at the beginning of the test, but it recovered mid-way in the test: https://imgur.com/eBBcN1M

Testing with "dd" via an ext4 file system gets better results, but still falls very short of expectations:

# dd if=/dev/zero of=./tempfile bs=1M count=8192 conv=fdatasync
8192+0 records in
8192+0 records out
8589934592 bytes (8,6 GB, 8,0 GiB) copied, 6,67135 s, 1,3 GB/s
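
As a sanity check on the figure dd reports, the throughput can be recomputed from the byte count and the elapsed time. This is only a small sketch: the function name `throughput` is made up for illustration, and the elapsed time is written with a decimal point instead of the locale's comma.

```shell
#!/bin/sh
# Recompute throughput from a byte count and an elapsed time in seconds.
# Prints decimal MB/s (the unit dd reports) and binary MiB/s.
throughput() {
    awk -v bytes="$1" -v secs="$2" 'BEGIN {
        printf "%.0f MB/s, %.0f MiB/s\n", bytes / secs / 1e6, bytes / secs / 1048576
    }'
}

# Values from the dd run above (6,67135 s written as 6.67135).
throughput 8589934592 6.67135
```

This prints "1288 MB/s, 1228 MiB/s", which matches the rounded 1,3 GB/s in the dd output.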

Finally I found KDiskMark (which uses "fio" as its backend). With its default settings (O_DIRECT ON), I can see the promised write speeds of up to 3500 MB/s: https://imgur.com/hnWPDaF \o/
But if I disable O_DIRECT, the result is devastating: speed is capped at 1100 MB/s: https://imgur.com/tsNQbmM

O_DIRECT makes this a purely synthetic benchmark; besides a few databases, almost no applications use O_DIRECT.

What I tried

These are the things I tried. The benchmarks above were run after these changes:

  • I disabled PCIe/NVMe power management with "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off" on the kernel command line.

  • I reformatted the new SSD with 4096 byte sectors.

  • Could not find any firmware updates for this drive.

This seems to have helped somewhat. Before these changes, the "dd" test gave me only ~700-800 MB/s. But the UDisks results did not change.
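
To confirm that settings like these are actually in effect, something along the following lines could be used. It is only a sketch: `has_param` is a hypothetical helper, and `nvme0n1` is a placeholder device name that will differ per machine.

```shell
#!/bin/sh
# has_param CMDLINE PARAM: succeed if PARAM appears as a whole word in CMDLINE.
has_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Placeholder device name (an assumption); override with DEV=... as needed.
DEV="${DEV:-nvme0n1}"

# Check the running kernel's command line for both power-management options.
if [ -r /proc/cmdline ]; then
    cmdline=$(cat /proc/cmdline)
    for p in nvme_core.default_ps_max_latency_us=0 pcie_aspm=off; do
        if has_param "$cmdline" "$p"; then
            echo "$p: set"
        else
            echo "$p: NOT set"
        fi
    done
fi

# Logical sector size as seen by the block layer (4096 after the reformat).
f="/sys/block/$DEV/queue/logical_block_size"
if [ -r "$f" ]; then
    echo "$DEV logical block size: $(cat "$f")"
fi
```

The script only reads from /proc and /sys, so it is safe to run as a regular user.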

But this is still deeply unsatisfactory. Even with my years-old, worn-down Intel 660p SSD, I can reach write speeds of 1,2 GB/s with the same "dd" test (!!).

Similar reports

Googling around, I found a bunch of similar reports from users, where read speeds are close to PCIe limits, but write speeds are around 600-700 MB/s:

[1] This graph shows ~6500 MB/s sequential write, sustained for nearly 110 seconds: https://cdn.mos.cms.futurecdn.net/YzH49 … 0.png.webp full review https://www.tomshardware.com/reviews/ki … renegade/2
[2] My motherboard is ASUS ROG STRIX B450-E GAMING. There is plenty of cooling, I'm using the motherboard-provided SSD heatsink in addition to the plate included with SSD.

Last edited by intgr (2024-02-12 17:50:39)

Offline

#2 2024-01-20 18:35:16

xerxes_
Member
Registered: 2018-04-29
Posts: 923

Re: Low write speeds with new Kingston KC3000 SSD

And how about the hdparm benchmark: 'hdparm -Tt /dev/your_drive/partition_name' or 'hdparm -Tt --direct /dev/nvme0n1' ?

Offline

#3 2024-01-21 16:10:10

intgr
Member
Registered: 2009-10-02
Posts: 47

Re: Low write speeds with new Kingston KC3000 SSD

hdparm is only a read benchmark; the main issue here is write speed. But sure, why not...

# hdparm -Tt --direct /dev/nvme1n1p2
/dev/nvme1n1p2:
 Timing O_DIRECT cached reads:   5846 MB in  2.00 seconds = 2924.68 MB/sec
 Timing O_DIRECT disk reads: 9710 MB in  3.00 seconds = 3236.27 MB/sec
# hdparm -Tt /dev/nvme1n1p2
/dev/nvme1n1p2:
 Timing cached reads:   25468 MB in  1.99 seconds = 12770.07 MB/sec
 Timing buffered disk reads: 7106 MB in  3.00 seconds = 2368.08 MB/sec

Offline

#4 2025-06-06 16:53:27

ethanol9859
Member
Registered: 2025-06-06
Posts: 1

Re: Low write speeds with new Kingston KC3000 SSD

@intgr Any luck?

Offline

#5 2025-06-08 00:40:16

Brocellous
Member
Registered: 2017-11-27
Posts: 155

Re: Low write speeds with new Kingston KC3000 SSD

dd is not an io benchmark... Why do you think the linked plot is representative of application performance, but your numbers are not? You haven't attempted to recreate the test you're referencing, so it's not that surprising to see different numbers.

The author of the review could have provided more detail, but they did helpfully give their plot an informative title, so we can see that the test was SEQ Write 1MB QD 32, and was apparently recorded with a tool called iometer. Fio is the de facto io benchmarking tool on Linux; it's very flexible, and it will be easier to find comparable results if you use it. So, with fio, use bs=1M, iodepth=32. Then, looking at iometer, it appears to use one io thread per cpu, so you should also use numjobs=$(nproc). Finally, it's probably no surprise that iometer uses libaio with O_DIRECT on Linux:

#ifdef IOMTR_SETTING_LINUX_LIBAIO
		((struct File *)disk_file)->fd =
		    open(file_name, O_DIRECT | O_RDWR | O_CREAT | O_LARGEFILE | open_flag, S_IRUSR | S_IWUSR);

and the equivalent FILE_FLAG_NO_BUFFERING on Windows:

#elif defined(IOMTR_OS_WIN32) || defined(IOMTR_OS_WIN64)
		// Ignore errors that occur if trying to open a floppy or CD-ROM with
		// nothing in the drive.
		SetErrorMode(SEM_FAILCRITICALERRORS);
		disk_file = CreateFile(file_name, GENERIC_READ | GENERIC_WRITE, FILE_SHARE_READ |
				       FILE_SHARE_WRITE, NULL, OPEN_ALWAYS,
				       FILE_FLAG_NO_BUFFERING | FILE_FLAG_OVERLAPPED, NULL);
		SetErrorMode(0);

So to reproduce the test result you will also want to use ioengine=libaio and direct=1. The reviewer did record that the test host used an Intel 12900K, a 16-core, 24-thread cpu, so the total number of in-flight ios could be 32*24=768. This is a world of difference from dd: the queue depth is absolutely mandatory, and you cannot expect to reach high-performance speeds using single-threaded synchronous IO. You've already observed that the write speed is limited by the PCIe bus on your host, which is the expected result.
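
Putting those parameters together, the review's test could be approximated with an fio invocation along these lines. This is a sketch, not the reviewer's actual setup: the job name, the target path `./fio-testfile` and the 8G size are assumptions, and only bs, iodepth, numjobs, ioengine and direct come from the discussion above.

```shell
#!/bin/sh
# Print an fio command line approximating "SEQ Write 1MB QD 32".
# $1 = target file (hypothetical path), $2 = number of jobs (default: nproc).
review_fio_cmd() {
    target="${1:-./fio-testfile}"
    jobs="${2:-$(nproc)}"
    echo "fio --name=seqwrite --filename=$target --rw=write --size=8G" \
         "--bs=1M --iodepth=32 --numjobs=$jobs" \
         "--ioengine=libaio --direct=1 --group_reporting"
}

# Upper bound on I/Os in flight: iodepth * numjobs.
inflight() {
    echo $(( $1 * $2 ))
}

review_fio_cmd ./fio-testfile 24
inflight 32 24    # 32 * 24 = 768, the figure for the review host
```

The script only prints the command, so nothing is written until the printed line is run by hand on a file system with enough free space. All jobs share one target file in this sketch.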

For the record, KDiskMark's default ioengine, libaio, doesn't actually support async buffered io on Linux, so if you want to test buffered io, you probably want to use the posixaio or io_uring engines instead, though just using more threads may be sufficient and is probably representative of typical application behavior anyway.
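
For the buffered case, here is a sketch of the two approaches mentioned: an async engine that can handle buffered io, or simply more workers doing synchronous writes. The job name, path and size are assumptions. end_fsync=1 is added so that data still sitting in the page cache is included in the timing, similar to conv=fdatasync with dd.

```shell
#!/bin/sh
# Print an fio command line for buffered (direct=0) sequential writes.
# $1 = ioengine: io_uring or posixaio for async, psync for plain pwrite() calls
# $2 = number of jobs (fio forks one worker per job by default)
buffered_fio_cmd() {
    engine="${1:-io_uring}"
    jobs="${2:-1}"
    # iodepth only matters for async engines; psync always runs at depth 1.
    case "$engine" in
        psync) depth=1 ;;
        *)     depth=32 ;;
    esac
    echo "fio --name=bufwrite --filename=./fio-testfile --rw=write --size=8G" \
         "--bs=1M --ioengine=$engine --iodepth=$depth --numjobs=$jobs" \
         "--direct=0 --end_fsync=1 --group_reporting"
}

buffered_fio_cmd io_uring 1   # async buffered io from a single job
buffered_fio_cmd psync 4      # several workers doing synchronous writes
```

As written this only prints the two command lines; comparing their results against the direct=1 run would show how much of the gap comes from buffering rather than from queue depth.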

Offline

#6 2025-06-08 02:37:34

dakota
Member
Registered: 2016-05-20
Posts: 397

Re: Low write speeds with new Kingston KC3000 SSD

Brocellous wrote:

dd is not an io benchmark

Why not?

It is listed in the Benchmarking page in the wiki. I use it to compare real-world read/write speeds of usb thumbdrives.

Cheers,


"Before Enlightenment chop wood, carry water. After Enlightenment chop wood, carry water." -- Zen proverb

Offline

#7 2025-06-08 07:18:51

Brocellous
Member
Registered: 2017-11-27
Posts: 155

Re: Low write speeds with new Kingston KC3000 SSD

Professor Oak wrote:

There's a time and a place for everything, but not now.

Sure, dd could help you to measure your usb transfer speed. It's not ideal, but it could work.

But OP is basically asking "what's the biggest number I can get out of this modern nvme", and dd is unsuitable for that. OP seems to think that only dd is representative of application behavior on linux, but for some reason thinks the linked test is representative of application behavior on other platforms.

The majority of applications are served perfectly well by the page cache without any aio, and get good performance from buffered io. For those that don't, only the most basic software will just call synchronous writes in a tight loop like dd. Plenty of applications will use dedicated io threads or whatever style of aio their event loop provides. The real-world performance of real applications that actually have to care about io will lie somewhere in between.

Offline

#8 2025-06-08 15:25:23

dakota
Member
Registered: 2016-05-20
Posts: 397

Re: Low write speeds with new Kingston KC3000 SSD

Got it. Thanks for the explanation.

Cheers,


"Before Enlightenment chop wood, carry water. After Enlightenment chop wood, carry water." -- Zen proverb

Offline

#9 2025-06-12 16:22:01

intgr
Member
Registered: 2009-10-02
Posts: 47

Re: Low write speeds with new Kingston KC3000 SSD

Questions about my premise
Brocellous wrote:

dd is not an io benchmark... Why do you think the linked plot is representative of application performance

I don't run database workloads on my machine.

Apart from databases, the vast majority of applications use single-threaded buffered I/O. No O_DIRECT, no async IO, no io_uring. dd is exactly the tool that simulates this workload and allows benchmarking it.

Brocellous wrote:

the queue depth is absolutely mandatory, you cannot expect to reach high-performance speeds using single threaded synchronous IO.

No. I can and do expect good performance using single-threaded synchronous IO. This workload is not bottlenecked in the dd tool itself; it's bottlenecked in the way Linux handles buffered writes. Linux is in full control of how the buffers are flushed, and there is room for improvement. (For example: if exploiting full I/O performance requires splitting large writes into smaller requests to increase queue depth, Linux can and should do that. But I don't think that is the bottleneck observed above.)
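
One way to watch how the kernel flushes buffered writes while a test runs is to sample the Dirty and Writeback counters from /proc/meminfo. A rough sketch, where `dirty_kb` is a hypothetical helper name:

```shell
#!/bin/sh
# Read /proc/meminfo-formatted text on stdin and print the Dirty and
# Writeback counters (both reported in kB) on one line.
dirty_kb() {
    awk '$1 == "Dirty:"     { d = $2 }
         $1 == "Writeback:" { w = $2 }
         END { printf "Dirty=%d kB Writeback=%d kB\n", d, w }'
}

# Sample once now; wrap this in a loop with "sleep 1" while a write test runs.
if [ -r /proc/meminfo ]; then
    dirty_kb < /proc/meminfo
fi

# Thresholds (percent of available memory) at which background writeback
# starts and at which writers themselves get throttled.
for f in /proc/sys/vm/dirty_background_ratio /proc/sys/vm/dirty_ratio; do
    if [ -r "$f" ]; then
        echo "$f = $(cat "$f")"
    fi
done
```

If Dirty keeps growing while Writeback stays small during a dd run, that would suggest the flusher side, not the writer, is the limiting factor.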

If you need more evidence that this can be improved, see this report about statements from Chris Mason, btrfs developer:

Amir Goldstein asked how severe the current bottleneck with single-threaded writeback is. Gupta said that he did not have any numbers on that. Chris Mason said that in his testing it largely depends on whether large folios are available; the easiest way to see performance problems with single-threaded writeback is to turn off large folios for XFS. In some simple testing, he could get around 800MB per second on XFS with large folios disabled before the kernel writeback thread was saturated. With large folios enabled, that number goes to around 2.4GB per second.

It's just the reality that Linux leaves a lot of potential performance on the table, but see my next comment about how things are improving.

Last edited by intgr (2025-06-12 16:36:24)

Offline

#10 2025-06-12 16:28:13

intgr
Member
Registered: 2009-10-02
Posts: 47

Re: Low write speeds with new Kingston KC3000 SSD

Things have improved already!
ethanol9859 wrote:

@intgr Any luck?

Yes, there is good news: things have already improved in this regard.

Last edited by intgr (2025-06-12 16:36:44)

Offline

#11 2025-06-12 21:06:31

Brocellous
Member
Registered: 2017-11-27
Posts: 155

Re: Low write speeds with new Kingston KC3000 SSD

I didn't say it can't get better, or that more optimization isn't possible. I'm pointing out that it's a poor performance case for a _reason_.

In the OP, you said your test was unsatisfactory because it didn't reach the lofty number you thought it should based on the linked review article, and concluded that the "linux I/O stack is hopelessly bad". My quotes are presented to demonstrate that all the tricks you claim make the KDiskMark results a "purely synthetic benchmark", unrepresentative of real workloads, are also applied in the test from the review image you posted, because of course they are. Your mistake is thinking that dd is comparable to the reviewer's test, or that the reviewer's test is representative of application behavior on other platforms.

Offline
