#1 2016-08-30 19:10:28

knopki
Member
Registered: 2016-08-30
Posts: 4

[Solved] Alienware 15 r2: PCI devices disappearing after resume

Hello, dear community! Arch newcomer here, asking for help.

My setup

Alienware 15 R2 with latest BIOS (1.3.6), Skylake i7-6700HQ and Samsung 950 Pro NVMe.

00:00.0 Host bridge: Intel Corporation Skylake Host Bridge/DRAM Registers (rev 07)
00:01.0 PCI bridge: Intel Corporation Skylake PCIe Controller (x16) (rev 07)
00:02.0 VGA compatible controller: Intel Corporation HD Graphics 530 (rev 06)
00:04.0 Signal processing controller: Intel Corporation Skylake Processor Thermal Subsystem (rev 07)
00:14.0 USB controller: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller (rev 31)
00:14.2 Signal processing controller: Intel Corporation Sunrise Point-H Thermal subsystem (rev 31)
00:16.0 Communication controller: Intel Corporation Sunrise Point-H CSME HECI #1 (rev 31)
00:17.0 SATA controller: Intel Corporation Sunrise Point-H SATA Controller [AHCI mode] (rev 31)
00:1c.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #1 (rev f1)
00:1c.4 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #5 (rev f1)
00:1c.5 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #6 (rev f1)
00:1c.6 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #7 (rev f1)
00:1d.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #9 (rev f1)
00:1f.0 ISA bridge: Intel Corporation Sunrise Point-H LPC Controller (rev 31)
00:1f.2 Memory controller: Intel Corporation Sunrise Point-H PMC (rev 31)
00:1f.3 Audio device: Intel Corporation Sunrise Point-H HD Audio (rev 31)
00:1f.4 SMBus: Intel Corporation Sunrise Point-H SMBus (rev 31)
01:00.0 3D controller: NVIDIA Corporation GM204M [GeForce GTX 970M] (rev ff)
3b:00.0 Ethernet controller: Qualcomm Atheros Killer E2400 Gigabit Ethernet Controller (rev 10)
3c:00.0 Network controller: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter (rev 32)
3d:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS5227 PCI Express Card Reader (rev 01)
3e:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)

nvme0 - ssd
sda - hdd

/boot on nvme0n1p1
swap on lvm on luks on nvme0n1p3
/root and /boot btrfs subvolumes on lvm on luks on nvme0n1p3
/home btrfs subvolume on lvm on luks on bcache (sda6 + nvme0n1p4)
complex, oh...

NAME                        FSTYPE      LABEL      UUID                                   MOUNTPOINT
sda                                                                                       
├─sda1                                                                                    
├─sda2                      vfat        ESP        39B4-82EE                              
├─sda3                      ntfs        OS         EE106F0129C7BF33                       
├─sda4                      ntfs        WINRETOOLS CA21D5BD3C4F0B5D                       
├─sda5                      ntfs        Image      E2C9C8F05DC95E5F                       
└─sda6                      bcache                 17540d53-7b5f-4998-bff7-b4b09f8c7f13   
  └─bcache0                 crypto_LUKS            ead98797-7022-4160-829f-8380d5c7bd94   
    └─luks-bcache           LVM2_member            uMoCp8-DpCf-L5Ic-jxBq-xx5w-A0He-xT9q59 
      └─cached-cached_btrfs btrfs       cached     e6bc23e5-4270-469c-8dee-c56dbb6132d5   /home
nvme0n1                                                                                   
├─nvme0n1p1                 vfat                   F783-E5D0                              /boot
├─nvme0n1p2                 ext2                   0cfcccf4-4c29-4b85-8c61-0c0ddaf444b6   
├─nvme0n1p3                 crypto_LUKS            6b364a9a-e599-4c63-96cd-c4e95062c2ac   
│ └─luks-ssd                LVM2_member            ZeqGr5-EHP0-poFq-65Ui-6Krw-C4HG-Nou2X4 
│   ├─ssd-swap              swap                   11eb4918-7d7c-4f7a-b1c7-dd912a823637   [SWAP]
│   └─ssd-root              btrfs       root       f1bb09de-4090-4c93-9743-4bc748ceb258   /root
└─nvme0n1p4                 bcache                 bf7969b5-c292-4a86-9075-81422c72cce4   

Kernel: 4.7.2-1-ARCH
Boot options: cryptdevice=PARTUUID=97eb0135-251f-4167-b43a-e6c917b19353:luks-ssd:allow-discards root=UUID=f1bb09de-4090-4c93-9743-4bc748ceb258 resume=UUID=11eb4918-7d7c-4f7a-b1c7-dd912a823637 rd.luks.options=discard rw initcall_debug log_buf_len=16M

The X server is disabled entirely to rule out any problems with graphics drivers.


Problem

The system resumes successfully after systemctl suspend/hibernate/hybrid-sleep, but the nvme0 device disappears (the underlying device of my root fs!), so all I get is tons of I/O errors about the root fs and bcache.
After resume, /dev/nvme0 and /sys/bus/pci/devices/0000:3e:00.0 no longer exist. I can run echo 1 > /sys/bus/pci/rescan and the device is found again, but as nvme1, which doesn't help me.

What I tried already:
echo 0 > /sys/power/pm_async - doesn't help
echo freeze > /sys/power/state - no problem
echo freezer > /sys/power/pm_test && echo mem > /sys/power/state - no problem
echo devices > /sys/power/pm_test && echo mem > /sys/power/state - no problem
echo platform > /sys/power/pm_test && echo mem > /sys/power/state - same problem, the nvme disappears
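
The pm_test steps above can be wrapped in a small helper so each level is exercised the same way. This is only a sketch: it assumes a kernel built with CONFIG_PM_DEBUG (otherwise /sys/power/pm_test does not exist) and must run as root. The SYS_POWER variable and the function names are hypothetical, added so the helper can be dry-run against a scratch directory.

```shell
# Sketch: run one partial suspend cycle at a given pm_test level.
# SYS_POWER is a hypothetical override for dry runs; it defaults to the real sysfs path.
SYS_POWER="${SYS_POWER:-/sys/power}"

pm_test_cycle() {
    level="$1"   # one of: freezer devices platform processors core
    # Tell the kernel how far into the suspend sequence it should go.
    echo "$level" > "$SYS_POWER/pm_test" || return 1
    # Start the suspend; in test mode the kernel turns back by itself after a few seconds.
    echo mem > "$SYS_POWER/state" || return 1
}

pm_test_reset() {
    # Switch test mode off so the next suspend is a real one.
    echo none > "$SYS_POWER/pm_test"
}
```

For example, running `pm_test_cycle devices`, then `pm_test_cycle platform`, then `pm_test_reset`, and checking lspci after each step, shows the first level at which the devices go missing.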

journalctl -f > log.txt started before suspend and killed after resume: http://knopki.github.io/tmp/nvme0problem/log.txt
journalctl -f on display 1 second after resume: http://imgur.com/a/JVo5O
journalctl -f on display 10 seconds after resume: http://imgur.com/a/v2DbK

Output of analyze_suspend.py from https://01.org/suspendresume :
mem_dmesg.txt http://knopki.github.io/tmp/nvme0proble … _dmesg.txt
mem_ftrace.txt http://knopki.github.io/tmp/nvme0proble … ftrace.txt
mem.html http://knopki.github.io/tmp/nvme0problem/alien_mem.html



I'll appreciate any help. Thank you!

Last edited by knopki (2017-02-09 01:34:05)

#2 2016-08-31 09:50:46

knopki
Member
Registered: 2016-08-30
Posts: 4

Re: [Solved] Alienware 15 r2: PCI devices disappearing after resume

Found that journalctl -f shows nothing once the disk is gone, so I changed journald storage to volatile. New logs:
journalctl --system -f http://knopki.github.io/tmp/nvme0problem/journal.txt
dmesg -w http://knopki.github.io/tmp/nvme0problem/dmesg.txt

I see that on resume sda is starting, but nothing about nvme0.
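
For anyone reproducing this: one way to switch journald to volatile storage is a drop-in file, so the journal is kept in RAM and survives the disk vanishing. A minimal sketch, assuming the standard drop-in directory; the file name volatile.conf and the function name are arbitrary choices for illustration.

```shell
# Sketch: write a journald drop-in that keeps the journal in memory only.
# The target directory can be overridden (hypothetical parameter) for a dry run.
write_volatile_journal_dropin() {
    dir="${1:-/etc/systemd/journald.conf.d}"
    mkdir -p "$dir" || return 1
    # Storage=volatile makes journald store logs under /run/log/journal only.
    printf '[Journal]\nStorage=volatile\n' > "$dir/volatile.conf"
}
```

Run `write_volatile_journal_dropin` as root and then `systemctl restart systemd-journald` for the setting to take effect.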

#3 2016-08-31 18:25:02

knopki
Member
Registered: 2016-08-30
Posts: 4

Re: [Solved] Alienware 15 r2: PCI devices disappearing after resume

OK, I played with the Arch live system and discovered that this may not be related to NVMe at all.
I have about 12 seconds after resume during which I can read and write nvme0n1 without problems.
I also discovered that it's not only the NVMe that disappears, but all of these PCI devices:

3b:00.0 Ethernet controller: Qualcomm Atheros Killer E2400 Gigabit Ethernet Controller (rev 10)
3c:00.0 Network controller: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter (rev 32)
3d:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS5227 PCI Express Card Reader (rev 01)
3e:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller (rev 01)
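
A list like the one above can be produced by comparing two lspci snapshots. The sketch below compares by slot address only (the first field), since revision fields can change with power state. The function name and the snapshot file names in the usage note are hypothetical.

```shell
# Sketch: print devices that appear in the "before" snapshot but not in the "after" one.
missing_pci_devices() {
    before="$1"
    after="$2"
    # First pass (the "after" file): remember every slot address.
    # Second pass (the "before" file): print lines whose slot was not seen.
    awk 'FILENAME == ARGV[1] { seen[$1] = 1; next } !($1 in seen)' "$after" "$before"
}
```

Usage: save `lspci > before.txt` before suspending and `lspci > after.txt` after resume, in a location that does not live on the NVMe (a tmpfs, for instance), then run `missing_pci_devices before.txt after.txt`.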

I think this bug is related: https://bugs.launchpad.net/ubuntu/+sour … ug/1568703
Same symptoms, same NVMe and Wi-Fi.

Maybe this one is related too: https://bugzilla.kernel.org/show_bug.cgi?id=112121

Last edited by knopki (2016-08-31 19:04:47)

#4 2017-02-09 01:34:35

knopki
Member
Registered: 2016-08-30
Posts: 4

Re: [Solved] Alienware 15 r2: PCI devices disappearing after resume

Temporary workaround: acpiphp.disable=1 boot parameter

#5 2017-05-17 10:46:11

ccorail
Member
Registered: 2017-05-17
Posts: 2

Re: [Solved] Alienware 15 r2: PCI devices disappearing after resume

Hello, I have the same problem, also on an Alienware 15 R2: the same devices disappear after suspend/resume.
But adding acpihp.disable=1 does not make any difference.
Any help would be greatly appreciated.
/proc/cmdline

initrd=\initramfs-linux.img root=UUID=bb489c51-507f-4246-8f41-7375ab7fbb22 rw acpihp.disable=1

#6 2017-05-17 15:23:41

tom.ty89
Member
Registered: 2012-11-15
Posts: 897

Re: [Solved] Alienware 15 r2: PCI devices disappearing after resume

It's acpiphp, not acpihp : \
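
A misspelled parameter is easy to miss because it still shows up in /proc/cmdline without any complaint. A small check for the exact spelling can catch that; this is a sketch, and the function name and the optional file argument are hypothetical.

```shell
# Sketch: succeed only if the kernel command line contains exactly the given parameter.
has_kernel_param() {
    param="$1"
    cmdline_file="${2:-/proc/cmdline}"
    # Put each parameter on its own line, then require an exact whole-line match.
    tr ' ' '\n' < "$cmdline_file" | grep -qxF -- "$param"
}
```

For example, `has_kernel_param acpiphp.disable=1 && echo "workaround active"` prints nothing when the option was typed as acpihp.disable=1.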

#7 2017-05-18 09:12:45

ccorail
Member
Registered: 2017-05-17
Posts: 2

Re: [Solved] Alienware 15 r2: PCI devices disappearing after resume

Thanks, I've tried dozens of times over the last few months with the wrong option... :D
It works now.

#8 2018-02-12 11:16:26

FireBurn
Member
Registered: 2018-02-12
Posts: 1

Re: [Solved] Alienware 15 r2: PCI devices disappearing after resume

A patch (https://patchwork.kernel.org/patch/10212201/) is going to fix this issue properly. It should be backported to older kernels too, so hopefully your NVMe drives won't require any workarounds going forward. It should also fix USB-C detection.
