Discussion:
[Bug 209571] ZFS and NVMe performing poorly. TRIM requests stall I/O
b***@freebsd.org
2016-05-20 08:02:50 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209571

Mark Linimon <***@FreeBSD.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Assignee|freebsd-***@FreeBSD.org |freebsd-***@FreeBSD.org
b***@freebsd.org
2017-11-24 00:27:10 UTC

David NewHamlet <***@gmail.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@gmail.com

--- Comment #2 from David NewHamlet <***@gmail.com> ---
I believe this issue is still present on FreeBSD 12-CURRENT. dmesg shows:

nvme0: async event occurred (log page id=0x2)
nvme0: temperature above threshold

and iostat -x 5 shows:
extended device statistics
device     r/s   w/s   kr/s    kw/s  ms/r  ms/w  ms/o  ms/t  qlen  %b
nvd0         0     1    0.6   153.6  2384  3287     0  2664    80  99

I am glad to test any patch for this in my test environment.

ref:
https://lists.freebsd.org/pipermail/freebsd-stable/2015-January/081621.html

FreeBSD david-al.localdomain 12.0-CURRENT FreeBSD 12.0-CURRENT #11
5ea1f67b4ad(master)-dirty: Fri Nov 24 09:14:41 NZDT 2017
***@n550jk.localdomain:/tank/cross/obj/12.0-amd64.amd64/sys/INITZ-NODEBUG
amd64

01:00.0 Non-Volatile memory controller: Intel Corporation Device f1a5 (rev 03)

nvme0: <Generic NVMe Device> mem 0xdf200000-0xdf203fff irq 16 at device 0.0 on
pci1

nvd0: <INTEL SSDPEKKW512G7> NVMe namespace
b***@freebsd.org
2017-11-28 22:27:29 UTC

Jim Phillips <***@ks.uiuc.edu> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@ks.uiuc.edu

--- Comment #3 from Jim Phillips <***@ks.uiuc.edu> ---
I'm going to post this as a separate issue, but will add it here since it is
related. We were seeing non-NVMe SSDs on ZFS timing out on TRIM operations and
have been able to greatly reduce the incidence by setting (for all drives)
sysctl kern.cam.da.0.delete_max=536870912 based on the advice here:
https://lists.freebsd.org/pipermail/freebsd-scsi/2015-July/006777.html
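
[The sysctl above is per-device, so making the workaround persistent means one
/etc/sysctl.conf entry per daN unit. A hedged sketch: the unit numbers da0-da2
are illustrative for a hypothetical three-drive box, and 536870912 is the
512 MB cap quoted in the comment above.]

    # /etc/sysctl.conf
    # Cap how large a collapsed BIO_DELETE (TRIM) request may grow,
    # per device.
    kern.cam.da.0.delete_max=536870912
    kern.cam.da.1.delete_max=536870912
    kern.cam.da.2.delete_max=536870912
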
b***@freebsd.org
2017-11-28 22:31:44 UTC

Warner Losh <***@FreeBSD.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@FreeBSD.org

--- Comment #4 from Warner Losh <***@FreeBSD.org> ---
nvd and nda have opposite problems from da/ada. ada/da collapse as many
BIO_DELETE commands as possible into a single TRIM sent to the device; that's
why limiting helps, since huge TRIMs take a long time.

nvd/nda, by contrast, do no TRIM collapsing at all, so they flood the device
with TRIM requests that starve read/write requests.
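
[To make the collapsing idea concrete, here is a minimal, hypothetical sketch
of TRIM collapsing over a FreeBSD bio queue. This is not the actual ada/da
code: the queue handling is simplified, the 512 MB cap mirrors the delete_max
value from comment #3, and a real driver would have to remember each merged
bio and biodone() it once the single merged TRIM completes.]

    /* Hypothetical sketch of TRIM collapsing; not the real ada/da code. */
    #include <sys/param.h>
    #include <sys/bio.h>

    #define TRIM_MAX_BYTES  (512 * 1024 * 1024) /* cap, like delete_max */

    static struct bio *
    trim_collapse(struct bio_queue_head *q)
    {
        struct bio *first, *next;
        off_t end;

        first = bioq_first(q);
        if (first == NULL || first->bio_cmd != BIO_DELETE)
            return (first);
        bioq_remove(q, first);
        end = first->bio_offset + first->bio_length;

        /* Merge immediately contiguous deletes, up to the cap. */
        while ((next = bioq_first(q)) != NULL &&
            next->bio_cmd == BIO_DELETE && next->bio_offset == end &&
            first->bio_length + next->bio_length <= TRIM_MAX_BYTES) {
            bioq_remove(q, next);
            first->bio_length += next->bio_length;
            end += next->bio_length;
            /* A real driver must track 'next' and complete it with
             * biodone() when the merged TRIM finishes; elided here. */
        }
        return (first);
    }
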
b***@freebsd.org
2018-02-14 16:39:53 UTC

Nick Evans <***@talkpoint.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@talkpoint.com

--- Comment #5 from Nick Evans <***@talkpoint.com> ---
Seeing the same on 11.0-RELEASE-p6 using a Samsung 960 EVO NVMe. I've stopped
my (extremely slow) copy to the drive, and this background traffic continues:

       tty             nvd0              da0               da1              cpu
 tin  tout   KB/t  tps  MB/s   KB/t  tps  MB/s   KB/t  tps  MB/s   us ni sy in id
   0   150  32.00  456 14.25   0.00    0  0.00   0.00    0  0.00    0  0  0  0 100
   0   493  32.00  455 14.22   0.00    0  0.00   0.00    0  0.00    0  0  0  0 100
   0   151  32.00  454 14.18   0.00    0  0.00   0.00    0  0.00    0  0  0  0 100
   0   164  32.00  455 14.21   0.00    0  0.00   0.00    0  0.00    0  0  0  0 100

I've seen this on another system with Intel 750 PCIe cards, but if I remember
right it seemed to handle the background TRIM better; it's been a while since
I tested it, though. Both systems are test boxes, so I can test any changes
needed and can move them to whatever base system versions are required.
b***@freebsd.org
2018-02-14 18:18:50 UTC

Conrad Meyer <***@freebsd.org> changed:

What      Removed                            Added
----------------------------------------------------------------------------
CC                                           ***@freebsd.org
Summary   ZFS and NVMe performing poorly.    NVMe performing poorly.
          TRIM requests stall I/O activity   TRIM requests stall I/O activity

--- Comment #6 from Conrad Meyer <***@freebsd.org> ---
I see the exact same problem with TRIM, a Samsung 960 EVO, and UFS.

Part of the problem is the consumer controller (960 EVO) doing a bad job with
TRIM.

Part of the problem is that nvd does not coalesce TRIMs. Part of the problem
is that cam_iosched separates out and prioritizes TRIM over all other I/O; see
cam_iosched_next_bio(). (Separating TRIMs out is useful for coalescing, but we
don't actually coalesce yet.)

I observed iostat showing non-zero queue depths and long I/O latencies with
TRIM enabled (960 EVO). With TRIM disabled, queue depth was never more than 1,
and I/O latency fell drastically under the same workload.
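
[For readers who haven't looked at cam_iosched, the starvation cem describes
boils down to a dispatch order like the following. This is a paraphrase with
hypothetical names (iosched_sc, next_trim, next_normal_io), not the actual
cam_iosched_next_bio() source.]

    /* Paraphrased dispatch order; the helpers below are hypothetical. */
    #include <sys/param.h>
    #include <sys/bio.h>

    struct iosched_sc;                                      /* hypothetical */
    static struct bio *next_trim(struct iosched_sc *);      /* hypothetical */
    static struct bio *next_normal_io(struct iosched_sc *); /* hypothetical */

    static struct bio *
    iosched_next_bio_sketch(struct iosched_sc *sc)
    {
        struct bio *bp;

        /* TRIMs are considered before anything else... */
        if ((bp = next_trim(sc)) != NULL)
            return (bp);
        /* ...so a steady stream of TRIMs starves reads and writes. */
        return (next_normal_io(sc));
    }
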
b***@freebsd.org
2018-02-14 22:02:37 UTC

--- Comment #7 from Nick Evans <***@talkpoint.com> ---
So would some immediate relief be as simple as reordering cam_iosched_next_bio
to prioritize normal I/O over TRIM, at least for now?
b***@freebsd.org
2018-02-14 23:34:42 UTC

--- Comment #8 from Warner Losh <***@FreeBSD.org> ---
Only if you are using nda, which most people aren't. It didn't exist in 10 or
11 (well, until last week), and it's not turned on by default in 12.

So talking about cam_iosched and nvd in the same breath misunderstands the
problem. nvd doesn't collapse TRIMs, and that's the problem there. It doesn't
use CAM at all, so no tweak to cam_iosched is going to help, because that code
is completely unused. nda, on the other hand, has both the TRIM-collapsing
issue and the priority inversion issue cem points out. I've been tweaking the
code to reduce TRIM priority as well as collapse TRIMs, but that won't help
nvd. I have no plans to do anything with nvd.

So if you are using nvd, the whole problem is the lack of TRIM collapsing. If
you are using nda, that problem is compounded by the I/O scheduler. I should
have a fix for nda by the end of next week.
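
[Which driver handles NVMe namespaces is a boot-time choice, so the scheduler
work above only matters if nda is attached. A hedged example, assuming a
kernel that includes nda(4), e.g. recent 12-CURRENT:]

    # /boot/loader.conf
    # Attach NVMe namespaces via the CAM-based nda(4) driver instead of
    # nvd(4), so cam_iosched actually sits in the I/O path.
    hw.nvme.use_nvd="0"
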
b***@freebsd.org
2019-02-18 16:58:51 UTC

***@gmail.com changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@gmail.com

--- Comment #9 from ***@gmail.com ---
Is it possible the following commit addresses this issue?

https://svnweb.freebsd.org/base?view=revision&sortby=date&revision=343586
b***@freebsd.org
2019-02-18 20:09:52 UTC

--- Comment #10 from Warner Losh <***@FreeBSD.org> ---
(In reply to ncrogers from comment #9)
I don't think it will. At least if this person's TRIM issues are the same as
the ones I've investigated for Netflix, the issue is too many TRIMs starving
other I/O, by taking up I/O slots, by triggering performance issues in the
driver, or both.

It may help, but I don't think the FLUSH is the issue with TRIM.
b***@freebsd.org
2019-02-18 20:22:13 UTC

--- Comment #11 from ***@gmail.com ---
(In reply to Warner Losh from comment #10)
Interesting. Does changing vfs.zfs.vdev.trim_max_pending or other sysctls
improve this problem? I guess I am looking for some advice, as I am also
experiencing issues with some NVMe + ZFS systems hanging under 12.0-RELEASE.
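
[For the record, the legacy (pre-OpenZFS) ZFS in 11.x/12.0 exposes a handful
of TRIM knobs. A hedged sketch: the values below are illustrative starting
points, not tested recommendations for this bug.]

    # Runtime sysctl: cap how many TRIMs ZFS keeps pending per vdev.
    sysctl vfs.zfs.vdev.trim_max_pending=64

    # Loader tunable (/boot/loader.conf), last resort: disable ZFS TRIM.
    vfs.zfs.trim.enabled=0
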
b***@freebsd.org
2019-02-18 21:37:52 UTC

Peter Eriksson <***@liu.se> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@liu.se

--- Comment #12 from Peter Eriksson <***@liu.se> ---
Just another data point:

We are seeing strange behaviour with some PCIe SSDs (Intel DC P3520) which,
randomly or after a while, seem to hang (go dead) - so dead that they
disappear from the PCI bus and require a power off / wait until the caps have
drained / power on to revive them. But this happens on 11.2, and some other
folks here have seen similar problems on Linux, so I suspect this is not
really OS-related.

We've also had problems with the SATA version (Intel S3520), where they
occasionally freeze up on us (and go "silent" on the SATA/SAS port). Often, if
we wait long enough (days / a week or so), they will revive themselves and
start working again...
b***@freebsd.org
2019-02-19 10:30:24 UTC

--- Comment #13 from Borja Marcos <***@sarenet.es> ---
(In reply to Peter Eriksson from comment #12)

But that would suggest a firmware problem? In my experience with NVMe drives,
the situation improved a lot with a firmware update. Although of course the
lack of TRIM collapsing is a problem.

I still think that discarding TRIM operations in serious congestion
situations may be the lesser evil. After all, what are people doing when
facing this issue? Disabling TRIM. It's better to do some TRIMs than none at
all.

Certainly you cannot discard read/write operations the way you would discard
packets on a network to enforce a bandwidth limit, and buffering them doesn't
help either when applications need to commit data to storage.
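
[A minimal sketch of that discard-under-congestion idea. The key assumption is
that BIO_DELETE is advisory, so completing a TRIM without ever issuing it to
the device is safe, merely suboptimal for the drive's garbage collection. The
softc, the threshold, and the enqueue_io() helper are all hypothetical, not
code from any FreeBSD driver.]

    /* Hypothetical sketch; the softc and helpers below are made up. */
    #include <sys/param.h>
    #include <sys/bio.h>

    #define CONGESTION_MARK 64          /* illustrative threshold, not tuned */

    struct sketch_softc {               /* hypothetical per-device state */
        int outstanding_io;             /* in-flight reads/writes */
    };

    static void enqueue_io(struct sketch_softc *, struct bio *); /* hypothetical */

    static void
    maybe_drop_trim(struct sketch_softc *sc, struct bio *bp)
    {
        if (bp->bio_cmd == BIO_DELETE &&
            sc->outstanding_io > CONGESTION_MARK) {
            /* TRIM is advisory: complete it successfully without
             * issuing it, instead of stalling reads and writes. */
            biodone(bp);
            return;
        }
        enqueue_io(sc, bp);             /* normal dispatch path */
    }
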
b***@freebsd.org
2025-02-21 13:12:19 UTC

Mark Linimon <***@FreeBSD.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Assignee|***@FreeBSD.org |***@FreeBSD.org
Resolution|--- |Overcome By Events
Status|New |Closed

--- Comment #14 from Mark Linimon <***@FreeBSD.org> ---
^Triage: I'm sorry that this PR did not get addressed in a timely fashion.

By now, the version that it was created against is long out of support.
Please re-open if it is still a problem on a supported version.