#1 2020-05-25 03:58:17

gannon
Member
Registered: 2019-06-29
Posts: 29

[SOLVED] Dump HDA codec communication with qemu using VFIO passthrough

Edit: Marked as solved.

Hello,

I'm dual-booting Arch (kernel 5.6.3 [1]) on a 2019 Samsung Notebook 9 Pro (np930mbe-k04us). The speakers do not work in Arch, but they work great in Windows 10. I filed a bug report at bugzilla.kernel.org [2], where the sound subsystem maintainer recommended that I dump the codec communication of the Windows driver using qemu. I'm new to VFIO passthrough, so I quickly got stuck.

While trying to resolve the issue I tried

  • PCI passthrough via OVMF [3], but I got stuck at

     dmesg | grep -i vfio 

    which did not show output.

  • Following Samuel Pitoiset's 2015 instructions [4].

  • Following Joshua Stein's 2018 instructions [5].

  • Following Connor McAdams's 2018 instructions [6] - I got the furthest with these.

First, I compiled qemu with tracing enabled (see [7] for full configure line)

./configure \
  --enable-trace-backends=log \
  --target-list=x86_64-softmmu

Second, I created a Win10 img, downloaded a Win10 iso, then booted qemu and clicked through the Win10 installer

qemu-img create -f qcow2 virtual-machine.img 20G
qemu-system-x86_64 -enable-kvm -hda virtual-machine.img -cdrom Win10_1909_English_x64.iso -m 4G -smp 4

Then I found the PCI group for the HDA controller

00:1f.0 ISA bridge [0601]: Intel Corporation Cannon Point-LP LPC Controller [8086:9d84] (rev 11)
00:1f.3 Multimedia audio controller [0401]: Intel Corporation Cannon Point-LP High Definition Audio Controller [8086:9dc8] (rev 11)
00:1f.4 SMBus [0c05]: Intel Corporation Cannon Point-LP SMBus Controller [8086:9da3] (rev 11)
00:1f.5 Serial bus controller [0c80]: Intel Corporation Cannon Point-LP SPI Controller [8086:9da4] (rev 11)
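
A minimal sketch of how the group membership can be read straight from sysfs, assuming the usual layout where each PCI device directory has an iommu_group/devices entry. The function name iommu_group_members and its sysfs-root argument are made up for illustration; the argument exists only so the function can be pointed at a test tree instead of /sys.

```shell
# List every PCI device that shares an IOMMU group with the given device.
# $1: sysfs root (/sys on a live system; a parameter so a fake tree can be used)
# $2: full device address, e.g. 0000:00:1f.3
iommu_group_members() {
    group_dir="$1/bus/pci/devices/$2/iommu_group/devices"
    if [ ! -d "$group_dir" ]; then
        echo "no IOMMU group found for $2" >&2
        return 1
    fi
    # One device address per line.
    ls -1 "$group_dir"
}
```

On a live system this would be called as iommu_group_members /sys 0000:00:1f.3. Every device it prints has to be detached from its host driver before any member of the group can be passed through.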

I set my commandline to

GRUB_CMDLINE_LINUX_DEFAULT="text pci-stub.ids=8086:9d84,8086:9dc8,8086:9da3,8086:9da4 iommu=pt intel_iommu=on"

grub-mkconfig -o /boot/grub/grub.cfg
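
After rebooting, it may be worth confirming that the parameters actually reached the kernel. A small sketch, assuming the command line is available as a single line of space-separated parameters in a file such as /proc/cmdline; the helper name has_kernel_param is hypothetical.

```shell
# Succeed if the kernel command line contains exactly the given parameter.
# $1: file holding the command line (/proc/cmdline on a live system)
# $2: parameter to look for, e.g. intel_iommu=on
has_kernel_param() {
    # Split on spaces, then require a whole-line, fixed-string match so that
    # "iommu=on" does not accidentally match "intel_iommu=on".
    tr ' ' '\n' < "$1" | grep -qxF -- "$2"
}
```

For example, has_kernel_param /proc/cmdline intel_iommu=on && echo "IOMMU requested" would print the message only if the option is present.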

I rebooted my laptop and then created the vfio devices

sudo modprobe pci-stub
sudo modprobe vfio-pci

for X in 0 3 4 5; do echo 0000:00:1f.$X | sudo tee /sys/bus/pci/devices/0000:00:1f.$X/driver/unbind; done
for WXYZ in 9d84 9dc8 9da3 9da4; do echo 0x8086 0x$WXYZ | sudo tee /sys/bus/pci/drivers/vfio-pci/new_id; done
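
To check that the rebind took effect, the driver symlink in sysfs can be inspected. A rough sketch, assuming the standard sysfs layout where a bound device has a driver symlink pointing at its driver directory; bound_driver is a hypothetical helper name.

```shell
# Print the name of the driver bound to a PCI device, or "none".
# $1: sysfs root (/sys on a live system)
# $2: full device address, e.g. 0000:00:1f.3
bound_driver() {
    drv_link="$1/bus/pci/devices/$2/driver"
    if [ -L "$drv_link" ]; then
        # The symlink target ends in the driver name, e.g. .../drivers/vfio-pci
        basename "$(readlink "$drv_link")"
    else
        echo none
    fi
}
```

After the steps above, bound_driver /sys 0000:00:1f.3 would be expected to print vfio-pci if the binding worked.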

I booted my Win10 img in qemu

sudo qemu-system-x86_64 \
    -M q35 -m 2G -cpu host,kvm=off \
    -enable-kvm \
    -device vfio-pci,host=00:1f.3,multifunction=on,x-no-mmap \
    -hda virtual-machine.img \
    -trace events=events.txt

where events.txt contained

-vfio_region_read
vfio_region_write
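
As far as I understand the QEMU tracing documentation, a line in the events file that starts with '-' disables that event and any other line enables it, so the file above turns on write tracing only. A small sketch that prints how such a file would be read under that assumption; the helper name summarize_events_file is hypothetical, and it ignores glob patterns and any ordering effects.

```shell
# Print "enabled NAME" or "disabled NAME" for each line of a trace events file.
# $1: path to the events file
summarize_events_file() {
    while IFS= read -r line; do
        case "$line" in
            '')  ;;                                        # skip empty lines
            -*)  printf 'disabled %s\n' "${line#-}" ;;     # leading '-' disables
            *)   printf 'enabled %s\n' "$line" ;;
        esac
    done < "$1"
}
```

Running summarize_events_file events.txt on the file above should report vfio_region_read as disabled and vfio_region_write as enabled.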

In the terminal I expected to see communication between the Windows driver and the codec. Instead, I see

qemu-system-x86_64: vfio: Cannot reset device 0000:00:1f.3, no available reset mechanism.

I ran the one-liner in [8], which showed that the device does not support resetting. I read in various places that resetting allows the host to recover the device when the VM shuts down, which is not something that I need: it's fine if the VM crashes on shutdown. I just want to see the driver initializing the codec.

Even though I can't see the driver communicating with the codec, I can log in and run Win10 applications in the VM window, but there's no sound. The speaker icon on the right side of the bottom bar has a red X, and clicking it pops up a window with a loading bar and the messages "Detecting problems" and "Checking audio device driver". After a few seconds the loading bar is replaced by the message "Troubleshooting couldn't identify the problem".

Is Linux touching the device, causing the VM to attempt a reset on boot? Could I prevent Linux from doing this? Am I out of luck here? Any help is appreciated!

[1] I'm running 5.6.3-arch1-1-custom, the official Arch 5.6.3 kernel plus this patch that I wrote with roinincoder to fix the headphone audio https://github.com/torvalds/linux/commi … 86432e08b4. I compiled the kernel myself.
[2] https://bugzilla.kernel.org/show_bug.cgi?id=207423
[3] https://wiki.archlinux.org/index.php/PC … h_via_OVMF
[4] https://hakzsam.wordpress.com/2015/02/21/471
[5] https://jcs.org/2018/11/12/vfio
[6] https://github.com/Conmanx360/QemuHDADu … he-program
[7] Full configure line. This is from the AUR qemu-git package's PKGBUILD file, which I modified to prepend --enable-trace-backends=log --target-list=x86_64-softmmu before running makepkg.

configure \
  --enable-trace-backends=log \
  --target-list=x86_64-softmmu \
  --prefix=/usr \
  --sysconfdir=/etc \
  --localstatedir=/var \
  --libexecdir=/usr/lib/qemu \
  --extra-ldflags="$LDFLAGS" \
  --smbd=/usr/bin/smbd \
  --enable-modules \
  --enable-sdl \
  --disable-werror \
  --enable-vhost-user \
  --enable-slirp=system \
  --enable-xfsctl \
  --audio-drv-list="pa alsa sdl"

[8] https://wiki.archlinux.org/index.php/PC … _resetting

Last edited by gannon (2020-06-05 20:08:04)

#2 2020-05-26 06:27:59

gannon
Member
Registered: 2019-06-29
Posts: 29

Here's the output from the 1-liner on the Arch wiki showing which devices can and cannot be reset [1]

IOMMU group 7
	00:16.0 Communication controller [0780]: Intel Corporation Cannon Point-LP MEI Controller #1 [8086:9de0] (rev 11)
IOMMU group 15
	02:04.0 PCI bridge [0604]: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] [8086:15d3] (rev 02)
IOMMU group 5
	00:14.0 USB controller [0c03]: Intel Corporation Cannon Point-LP USB 3.1 xHCI Controller [8086:9ded] (rev 11)
	00:14.2 RAM memory [0500]: Intel Corporation Cannon Point-LP Shared SRAM [8086:9def] (rev 11)
[RESET]	00:14.3 Network controller [0280]: Intel Corporation Cannon Point-LP CNVi [Wireless-AC] [8086:9df0] (rev 11)
IOMMU group 13
	02:01.0 PCI bridge [0604]: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] [8086:15d3] (rev 02)
IOMMU group 3
	00:12.0 Signal processing controller [1180]: Intel Corporation Cannon Point-LP Thermal Controller [8086:9df9] (rev 11)
IOMMU group 11
[RESET]	01:00.0 PCI bridge [0604]: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] [8086:15d3] (rev 02)
IOMMU group 1
[RESET]	00:02.0 VGA compatible controller [0300]: Intel Corporation UHD Graphics 620 (Whiskey Lake) [8086:3ea0] (rev 02)
IOMMU group 8
[RESET]	00:1c.0 PCI bridge [0604]: Intel Corporation Cannon Point-LP PCI Express Root Port #5 [8086:9dbc] (rev f1)
IOMMU group 16
[RESET]	6c:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD Black 2018/PC SN520 NVMe SSD [15b7:5003] (rev 01)
IOMMU group 6
	00:15.0 Serial bus controller [0c80]: Intel Corporation Cannon Point-LP Serial IO I2C Controller #0 [8086:9de8] (rev 11)
	00:15.2 Serial bus controller [0c80]: Intel Corporation Device [8086:9dea] (rev 11)
IOMMU group 14
	02:02.0 PCI bridge [0604]: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] [8086:15d3] (rev 02)
[RESET]	37:00.0 USB controller [0c03]: Intel Corporation JHL6540 Thunderbolt 3 USB Controller (C step) [Alpine Ridge 4C 2016] [8086:15d4] (rev 02)
IOMMU group 4
	00:13.0 Serial controller [0700]: Intel Corporation Cannon Point-LP Integrated Sensor Hub [8086:9dfc] (rev 11)
IOMMU group 12
[RESET]	02:00.0 PCI bridge [0604]: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] [8086:15d3] (rev 02)
[RESET]	03:00.0 System peripheral [0880]: Intel Corporation JHL6540 Thunderbolt 3 NHI (C step) [Alpine Ridge 4C 2016] [8086:15d2] (rev 02)
IOMMU group 2
	00:04.0 Signal processing controller [1180]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Thermal Subsystem [8086:1903] (rev 0c)
IOMMU group 10
	00:1f.0 ISA bridge [0601]: Intel Corporation Cannon Point-LP LPC Controller [8086:9d84] (rev 11)
	00:1f.3 Multimedia audio controller [0401]: Intel Corporation Cannon Point-LP High Definition Audio Controller [8086:9dc8] (rev 11)
	00:1f.4 SMBus [0c05]: Intel Corporation Cannon Point-LP SMBus Controller [8086:9da3] (rev 11)
	00:1f.5 Serial bus controller [0c80]: Intel Corporation Cannon Point-LP SPI Controller [8086:9da4] (rev 11)
IOMMU group 0
	00:00.0 Host bridge [0600]: Intel Corporation Coffee Lake HOST and DRAM Controller [8086:3e34] (rev 0c)
IOMMU group 9
[RESET]	00:1d.0 PCI bridge [0604]: Intel Corporation Cannon Point-LP PCI Express Root Port #9 [8086:9db0] (rev f1)
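
The [RESET] marker above comes down to a simple check: as far as I can tell, the wiki one-liner tests whether the device's sysfs directory contains a reset file. A minimal sketch of that check for a single device, under that assumption; can_reset is a hypothetical helper name, and the sysfs root is a parameter only so a fake tree can be used for testing.

```shell
# Succeed if the kernel exposes a reset mechanism for the given PCI device.
# $1: sysfs root (/sys on a live system)
# $2: full device address, e.g. 0000:00:1f.3
can_reset() {
    [ -e "$1/bus/pci/devices/$2/reset" ]
}
```

Going by the listing above, can_reset /sys 0000:37:00.0 would succeed on this laptop, while can_reset /sys 0000:00:1f.3 would fail.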

I wasn't sure whether my failure to get a trace from the Win10 driver was because the HDA controller (00:1f.3 8086:9dc8) is not resettable or because of user error. To rule out user error, I tried passing through a resettable device instead. After trying many devices I finally succeeded with the USB controller (37:00.0 8086:15d4). Specifically, I set my cmdline to

/etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="pci-stub.ids=8086:15d3,8086:15d4 intel_iommu=on iommu=pt"
grub-mkconfig -o /boot/grub/grub.cfg

Then I ran

echo 0000:37:00.0  | sudo tee /sys/bus/pci/devices/0000:37:00.0/driver/unbind
echo 0x8086 0x15d4 | sudo tee /sys/bus/pci/drivers/vfio-pci/new_id

I started qemu with

sudo qemu-system-x86_64 \
    -enable-kvm \
    -hda virtual-machine.img \
    -m 4G -smp 4 \
    -device vfio-pci,host=0000:37:00.0,x-no-mmap=true \
    -trace events=events.txt \
    -monitor stdio

Immediately I saw a glorious trace!

2375@1590472340.771395:vfio_region_write  (0000:37:00.0:region0+0x80, 0x2, 4)
2375@1590472340.771420:vfio_region_write  (0000:37:00.0:region0+0xb8, 0x40, 4)
2375@1590472340.771425:vfio_region_write  (0000:37:00.0:region0+0xb0, 0xbffdfb00, 4)
2375@1590472340.771429:vfio_region_write  (0000:37:00.0:region0+0xb4, 0x0, 4)
2375@1590472340.771432:vfio_region_write  (0000:37:00.0:region0+0x98, 0xbffdf901, 4)
2375@1590472340.771437:vfio_region_write  (0000:37:00.0:region0+0x9c, 0x0, 4)
2375@1590472340.771443:vfio_region_write  (0000:37:00.0:region0+0x2028, 0x1, 4)
2375@1590472340.771450:vfio_region_write  (0000:37:00.0:region0+0x2038, 0xbffdf700, 4)
2375@1590472340.771455:vfio_region_write  (0000:37:00.0:region0+0x203c, 0x0, 4)
2375@1590472340.771459:vfio_region_write  (0000:37:00.0:region0+0x2030, 0xbffdfac0, 4)
2375@1590472340.771463:vfio_region_write  (0000:37:00.0:region0+0x2034, 0x0, 4)
2375@1590472351.161765:vfio_region_write  (0000:37:00.0:region0+0x846f, 0x1, 1)
2375@1590472351.161792:vfio_region_write  (0000:37:00.0:region0+0x8470, 0x0, 4)
2375@1590472351.161806:vfio_region_write  (0000:37:00.0:region0+0x80, 0x2, 4)
2375@1590472351.189481:vfio_region_write  (0000:37:00.0:region0+0x2028, 0x0, 4)
2375@1590472351.190300:vfio_region_write  (0000:37:00.0:region0+0x2028, 0x8, 4)
2375@1590472351.190331:vfio_region_write  (0000:37:00.0:region0+0x2038, 0x111b16008, 8)
2375@1590472351.190338:vfio_region_write  (0000:37:00.0:region0+0x2030, 0x105f3d000, 8)
2375@1590472351.190344:vfio_region_write  (0000:37:00.0:region0+0xb8, 0x40, 4)
2375@1590472351.190351:vfio_region_write  (0000:37:00.0:region0+0xb0, 0x111b1e000, 8)
2375@1590472351.190361:vfio_region_write  (0000:37:00.0:region0+0x98, 0x105f3d401, 8)
2375@1590472351.190569:vfio_region_write  (0000:37:00.0:region0+0x2024, 0xc8, 4)
2375@1590472351.190586:vfio_region_write  (0000:37:00.0:region0+0x2020, 0x2, 4)
2375@1590472351.190600:vfio_region_write  (0000:37:00.0:region0+0x94, 0x2, 4)
2375@1590472351.190637:vfio_region_write  (0000:37:00.0:region0+0x80, 0x4005, 4)
2375@1590472351.212897:vfio_region_write  (0000:37:00.0:region0+0x480, 0x200, 4)
2375@1590472351.212948:vfio_region_write  (0000:37:00.0:region0+0x490, 0x200, 4)
2375@1590472351.393158:vfio_region_write  (0000:37:00.0:region0+0x4a0, 0xe000200, 4)
2375@1590472351.393320:vfio_region_write  (0000:37:00.0:region0+0x4b0, 0xe000200, 4)
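
Each trace line has the form pid@timestamp:vfio_region_write (device:regionN+offset, value, size). A rough sketch for turning such lines into space-separated columns for further analysis, assuming exactly the line format shown above; parse_vfio_writes is a hypothetical helper name, and lines that do not match are silently dropped.

```shell
# Read QEMU trace output on stdin and print, for each vfio_region_write line:
#   timestamp device region offset value size
parse_vfio_writes() {
    sed -n -E 's/^[0-9]+@([0-9.]+):vfio_region_write +\(([0-9a-f:.]+):region([0-9]+)\+(0x[0-9a-f]+), (0x[0-9a-f]+), ([0-9]+)\)$/\1 \2 \3 \4 \5 \6/p'
}
```

With the trace saved to a file (trace.log is a made-up name), parse_vfio_writes < trace.log | awk '$4 == "0x80"' would keep only the writes to offset 0x80.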

How do I overcome the inability to reset the HDA controller?

I found some interesting information from Alex Williamson, who is the vfio maintainer in both the kernel and qemu [2]. Alex Williamson also authors a vfio blog [3]. In 2018, Alex made an interesting comment in the vfio subreddit thread where someone had a similar problem [4]

As usual, some misleading info in the Arch wiki. Gosh, I wish we could power cycle devices, that would resolve so many problems, and it would be a type of reset. It also doesn't explain why GPUs don't support reset, but don't generate this warning. Devices with this warning may not provide reproducible results, are more likely not to work at all, and if you care about potentially leaking information between host and guest, they're more susceptible to that. We try to reset devices before starting the VM and on each VM reboot to put the device in a known, consistent, reproducible state. If we can't do that, then you get the device in whatever state the last driver left it. You rely on the ability of the next driver to get the device to put it into the working state that it wants. Effectively, if it works for you, great, but I'm not going to consider it a bug if it doesn't, the hardware isn't designed for the use case.

According to Alex, qemu tries "to reset devices before starting the VM and on each VM reboot to put the device in a known, consistent, reproducible state." So maybe qemu refuses to pass through the HDA controller because its attempt to reset the controller fails. Does qemu have a do-not-reset-devices option? Or am I stuck?

[1] https://wiki.archlinux.org/index.php/PC … _resetting
[2] https://www.youtube.com/watch?v=WFkdTFTOTpA
[3] http://vfio.blogspot.com/
[4] https://www.reddit.com/r/VFIO/comments/ … m/dx94uj4/

Edit #1: I forgot to include module loading in the code section above

sudo modprobe pci-stub
sudo modprobe vfio-pci

echo 0000:37:00.0  | sudo tee /sys/bus/pci/devices/0000:37:00.0/driver/unbind
echo 0x8086 0x15d4 | sudo tee /sys/bus/pci/drivers/vfio-pci/new_id

Edit #2: I also forgot to mention that I am using mkinitcpio to ensure vfio-pci is loaded first

/etc/mkinitcpio.conf
MODULES=(vfio_pci vfio vfio_iommu_type1 vfio_virqfd)
mkinitcpio -p linux

Last edited by gannon (2020-05-26 06:49:00)

#3 2020-05-29 06:39:42

gannon
Member
Registered: 2019-06-29
Posts: 29

I forgot to install the Windows driver! I'm so embarrassed. Can you tell how bad I am at Windows? XD

Samsung distributes drivers through an application called SamsungUpdate. I downloaded SamsungUpdate from the Windows Store, but it crashes when I use it. Specifically, it runs for ~20 seconds then I see a popup that says

An additional service package must be installed for Samsung Update to
work properly. Do you want to download the installation file now?

I click OK and it crashes, i.e. the SamsungUpdate window disappears. Occasionally the Terms of Use will appear before the popup, but it is always followed by the popup, which is always followed by a crash.

#4 2020-06-05 20:06:43

gannon
Member
Registered: 2019-06-29
Posts: 29

We did it! All thanks to user attackzero from #archlinux, who realized that SamsungUpdate writes an XML log containing download links when it updates drivers. attackzero ran SamsungUpdate, asked for the audio driver for np930mbe, and gave me the download link:

orcaservice.samsungmobile.com/FileDownloader.aspx?FILENAME=BASW-A1542A6N.ZIP

All I had to do was boot Windows in qemu, download the file, unzip it, and double-click setup to run the installer. Suddenly I saw the traces. Additionally, the speakers now work when I play a video in the VM. I'm marking this SOLVED. Thanks attackzero!

(If anyone is interested in the vfio trace, please come join us on the kernel bugzilla, where we would be happy to have your help making sense of it. https://bugzilla.kernel.org/show_bug.cgi?id=207423 )
