#51 2022-08-20 19:49:10

Watnuss
Member
Registered: 2013-03-04
Posts: 53

Re: Lock up on upgrade kernel 5.18.16-arch1-1 to 5.19.1-arch2-1

another good one:

$ uname -r
5.18.0-rc1-1-00037-g76f61e1e89b3

#52 2022-08-20 20:08:13

loqs
Member
Registered: 2014-03-06
Posts: 18,001

Re: Lock up on upgrade kernel 5.18.16-arch1-1 to 5.19.1-arch2-1

git bisect good
Bisecting: 0 revisions left to test after this (roughly 1 step)
[30612045e69d088f1effd748048ebb0e282984ec] x86/sev: Use firmware-validated CPUID for SEV-SNP guests

https://drive.google.com/file/d/1WQia1c … sp=sharing linux-5.18rc1.r39.g30612045e69d-1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1s1gnCQ … sp=sharing linux-headers-5.18rc1.r39.g30612045e69d-1-x86_64.pkg.tar.zst

#53 2022-08-20 20:15:34

Watnuss
Member
Registered: 2013-03-04
Posts: 53

Re: Lock up on upgrade kernel 5.18.16-arch1-1 to 5.19.1-arch2-1

bad commit

#54 2022-08-20 20:33:13

loqs
Member
Registered: 2014-03-06
Posts: 18,001

Re: Lock up on upgrade kernel 5.18.16-arch1-1 to 5.19.1-arch2-1

git bisect bad
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[b190a043c49af4587f5e157053f909192820522a] x86/sev: Add SEV-SNP feature detection/setup

https://drive.google.com/file/d/18-rY45 … sp=sharing linux-5.18rc1.r38.gb190a043c49a-1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1ByEqEY … sp=sharing linux-headers-5.18rc1.r38.gb190a043c49a-1-x86_64.pkg.tar.zst

#55 2022-08-20 20:56:31

Watnuss
Member
Registered: 2013-03-04
Posts: 53

Re: Lock up on upgrade kernel 5.18.16-arch1-1 to 5.19.1-arch2-1

again bad commit

#56 2022-08-20 20:57:10

loqs
Member
Registered: 2014-03-06
Posts: 18,001

Re: Lock up on upgrade kernel 5.18.16-arch1-1 to 5.19.1-arch2-1

git bisect bad
b190a043c49af4587f5e157053f909192820522a is the first bad commit
commit b190a043c49af4587f5e157053f909192820522a
Author: Michael Roth <michael.roth@amd.com>
Date:   Thu Feb 24 10:56:18 2022 -0600

    x86/sev: Add SEV-SNP feature detection/setup
    
    Initial/preliminary detection of SEV-SNP is done via the Confidential
    Computing blob. Check for it prior to the normal SEV/SME feature
    initialization, and add some sanity checks to confirm it agrees with
    SEV-SNP CPUID/MSR bits.
    
    Signed-off-by: Michael Roth <michael.roth@amd.com>
    Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
    Signed-off-by: Borislav Petkov <bp@suse.de>
    Link: https://lore.kernel.org/r/20220307213356.2797205-39-brijesh.singh@amd.com

 arch/x86/boot/compressed/sev.c     | 27 ----------------
 arch/x86/include/asm/sev.h         |  2 ++
 arch/x86/kernel/sev-shared.c       | 27 ++++++++++++++++
 arch/x86/kernel/sev.c              | 64 ++++++++++++++++++++++++++++++++++++++
 arch/x86/mm/mem_encrypt_identity.c |  8 +++++
 5 files changed, 101 insertions(+), 27 deletions(-)

Edit:
See if SNP is being detected (this test patch skips snp_init() and forces snp to false):

diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index f415498d3175..4844c5b0221d 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -513,7 +513,7 @@ void __init sme_enable(struct boot_params *bp)
 	bool snp;
 	u64 msr;
 
-	snp = snp_init(bp);
+	snp = false;
 
 	/* Check for the SME/SEV support leaf */
 	eax = 0x80000000;

https://drive.google.com/file/d/1b0Kmp4 … sp=sharing linux-5.19.2.arch1-2.1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1er9Ili … sp=sharing linux-headers-5.19.2.arch1-2.1-x86_64.pkg.tar.zst

Last edited by loqs (2022-08-20 21:29:41)

#57 2022-08-20 21:48:29

Artlav
Member
Registered: 2016-07-11
Posts: 36

Re: Lock up on upgrade kernel 5.18.16-arch1-1 to 5.19.1-arch2-1

Got the same problem (with GRUB). The kernel refuses to do anything but lock up or boot loop, regardless of parameters: earlyprintk, console=ttyS0, nothing helps.

Also, I was able to reproduce it in QEMU with OVMF EFI, so this is not hardware-specific.

sudo qemu-system-x86_64 \
-nodefaults \
-enable-kvm \
-m 2048 \
-cpu host \
-smp cores=2,threads=1,sockets=1 \
-machine q35,vmport=off,kernel_irqchip=on \
-drive if=pflash,format=raw,readonly=on,file=ovmf_code.fd \
-drive if=pflash,format=raw,file=ovmf_vars-1024x768.fd \
-smbios type=2 \
-netdev user,id=net0,hostfwd=tcp::5002-:22,hostfwd=tcp::5902-:5900 \
-device e1000,netdev=net0,mac=00:25:4B:00:00:02 \
-vnc 127.0.0.1:1 -vga std \
-cdrom archlinux-x86_64.iso \
-drive file=/dev/sdb,if=none,format=raw,aio=native,cache=none,id=hd0 \
-device virtio-scsi-pci,id=scsi -device scsi-block,drive=hd0,bus=scsi.0 \
-serial stdio

Using the LTS kernel helps, but this is critical since it can kill remote servers with no recourse.

Last edited by Artlav (2022-08-20 21:49:31)

#58 2022-08-20 22:01:36

Watnuss
Member
Registered: 2013-03-04
Posts: 53

Re: Lock up on upgrade kernel 5.18.16-arch1-1 to 5.19.1-arch2-1

booting with your change works:

$ uname -r
5.19.2-arch1-2.1

Last edited by Watnuss (2022-08-20 22:01:59)

#59 2022-08-20 22:32:04

loqs
Member
Registered: 2014-03-06
Posts: 18,001

Re: Lock up on upgrade kernel 5.18.16-arch1-1 to 5.19.1-arch2-1

Please open a bug on https://bugzilla.kernel.org (Product=Platform Specific Hardware, Component=x86-64) and include the git bisect log:

git bisect log
git bisect start
# status: waiting for both good and bad commits
# good: [4b0986a3613c92f4ec1bdc7f60ec66fea135991f] Linux 5.18
git bisect good 4b0986a3613c92f4ec1bdc7f60ec66fea135991f
# status: waiting for bad commit, 1 good commit known
# bad: [3d7cb6b04c3f3115719235cc6866b10326de34cd] Linux 5.19
git bisect bad 3d7cb6b04c3f3115719235cc6866b10326de34cd
# bad: [c011dd537ffe47462051930413fed07dbdc80313] Merge tag 'arm-soc-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
git bisect bad c011dd537ffe47462051930413fed07dbdc80313
# bad: [7e062cda7d90543ac8c7700fc7c5527d0c0f22ad] Merge tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
git bisect bad 7e062cda7d90543ac8c7700fc7c5527d0c0f22ad
# bad: [3842007b1a33589d57f67eac479b132b77767514] Merge tag 'zonefs-5.19-rc1-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs
git bisect bad 3842007b1a33589d57f67eac479b132b77767514
# bad: [22922deae13fc8d3769790c2eb388e9afce9771d] Merge tag 'objtool-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 22922deae13fc8d3769790c2eb388e9afce9771d
# good: [03e1ccd45fa70904e43ddceda140854d22b7e871] Merge tag 'x86-irq-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 03e1ccd45fa70904e43ddceda140854d22b7e871
# bad: [d61306047533eb6f63a7bd51dfa7f868503bf0ba] Merge tag 'for-linus-5.19-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
git bisect bad d61306047533eb6f63a7bd51dfa7f868503bf0ba
# bad: [1de564b8c1a6f9f8bf3a106daa0be9f2cba7d045] Merge tag 'x86_build_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 1de564b8c1a6f9f8bf3a106daa0be9f2cba7d045
# bad: [eb39e37d5cebdf0f63ee2a315fc23b035d81b4b0] Merge tag 'x86_sev_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad eb39e37d5cebdf0f63ee2a315fc23b035d81b4b0
# bad: [ba37a1438aeb540cc48722d629f4b2e7e4398466] x86/sev: Add a sev= cmdline option
git bisect bad ba37a1438aeb540cc48722d629f4b2e7e4398466
# good: [9704c07bf9f7682a83aec4e66f2d9154dbd8577f] x86/kernel: Validate ROM memory before accessing when SEV-SNP is active
git bisect good 9704c07bf9f7682a83aec4e66f2d9154dbd8577f
# good: [b66370db9a90b3fa4c4a1a732af3e7e38d6d4c7c] KVM: x86: Move lookup of indexed CPUID leafs to helper
git bisect good b66370db9a90b3fa4c4a1a732af3e7e38d6d4c7c
# good: [5f211f4fc49622473667e6983bb57beab755f6f6] x86/compressed: Use firmware-validated CPUID leaves for SEV-SNP guests
git bisect good 5f211f4fc49622473667e6983bb57beab755f6f6
# good: [76f61e1e89b32f3e5d639f1b57413a919066da06] x86/compressed/64: Add identity mapping for Confidential Computing blob
git bisect good 76f61e1e89b32f3e5d639f1b57413a919066da06
# bad: [30612045e69d088f1effd748048ebb0e282984ec] x86/sev: Use firmware-validated CPUID for SEV-SNP guests
git bisect bad 30612045e69d088f1effd748048ebb0e282984ec
# bad: [b190a043c49af4587f5e157053f909192820522a] x86/sev: Add SEV-SNP feature detection/setup
git bisect bad b190a043c49af4587f5e157053f909192820522a
# first bad commit: [b190a043c49af4587f5e157053f909192820522a] x86/sev: Add SEV-SNP feature detection/setup

Also include details of your system's hardware and bootloader. Add to the CC list: Michael Roth <michael.roth@amd.com>, Brijesh Singh <brijesh.singh@amd.com>, Borislav Petkov <bp@suse.de>, Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>

#60 2022-08-20 22:37:28

Watnuss
Member
Registered: 2013-03-04
Posts: 53

Re: Lock up on upgrade kernel 5.18.16-arch1-1 to 5.19.1-arch2-1

Will do. Thanks a lot for your help :)

Edit: The bug report: https://bugzilla.kernel.org/show_bug.cgi?id=216387

Last edited by Watnuss (2022-08-20 23:03:49)

#61 2022-08-20 23:41:14

Artlav
Member
Registered: 2016-07-11
Posts: 36

Re: Lock up on upgrade kernel 5.18.16-arch1-1 to 5.19.1-arch2-1

I've put together a QEMU image and script to reproduce it: https://orbides.org/etc/qemu_519_arch_bug.zip (21 MB)
It is a simple image made of EFI GRUB and a kernel file, nothing else, no parameters.

Just put the vmlinuz file into the image root, and run.
Good ones print out a kernel panic, bad ones boot loop.

#62 2022-08-21 06:46:47

seth
Member
Registered: 2012-09-03
Posts: 58,238

Re: Lock up on upgrade kernel 5.18.16-arch1-1 to 5.19.1-arch2-1

Assuming BIOS systems are not affected (exclusively) and EFI systems will not freak out about efi_find_vendor_table (as the code was in/is used by the ACPI) and just guessing that some junk is randomly found there:

https://github.com/torvalds/linux/blob/ … sev.c#L406
"return NULL;" instead of sev_es_terminate'ing SEV?

#63 2022-08-21 07:39:48

loqs
Member
Registered: 2014-03-06
Posts: 18,001

Re: Lock up on upgrade kernel 5.18.16-arch1-1 to 5.19.1-arch2-1

seth wrote:

"return NULL;" instead of sev_es_terminate'ing SEV?

So let it be written...

diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 63dc626627a0..6ad2ef39aa7c 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -2058,6 +2058,7 @@ static __init struct cc_blob_sev_info *find_cc_blob(struct boot_params *bp)
 
 	/* Boot kernel would have passed the CC blob via boot_params. */
 	if (bp->cc_blob_address) {
+		pr_info("bp->cc_blob_address %px.\n",bp->cc_blob_address);
 		cc_info = (struct cc_blob_sev_info *)(unsigned long)bp->cc_blob_address;
 		goto found_cc_info;
 	}
@@ -2068,12 +2069,13 @@ static __init struct cc_blob_sev_info *find_cc_blob(struct boot_params *bp)
 	 * setup_data instead.
 	 */
 	cc_info = find_cc_blob_setup_data(bp);
+	pr_info("cc_info %px.\n",cc_info);
 	if (!cc_info)
 		return NULL;
 
 found_cc_info:
 	if (cc_info->magic != CC_BLOB_SEV_HDR_MAGIC)
-		snp_abort();
+		return NULL;
 
 	return cc_info;
 }

https://drive.google.com/file/d/17pkeVg … sp=sharing linux-5.19.2.arch1-2.3-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1oB8Vvz … sp=sharing linux-headers-5.19.2.arch1-2.3-x86_64.pkg.tar.zst

Last edited by loqs (2022-08-21 07:40:07)

#64 2022-08-21 08:29:47

Watnuss
Member
Registered: 2013-03-04
Posts: 53

Re: Lock up on upgrade kernel 5.18.16-arch1-1 to 5.19.1-arch2-1

Tested the latest build: getting a boot loop again (AMD/UEFI system).

#65 2022-08-21 14:06:02

seth
Member
Registered: 2012-09-03
Posts: 58,238

Re: Lock up on upgrade kernel 5.18.16-arch1-1 to 5.19.1-arch2-1

Do you get some kernel messages?
If in doubt, add "boot_delay=1000 earlyprintk=vga,keep" to the kernel parameters.
(I'm not sure whether this calls the compressed or the uncompressed implementation. @loqs, did you run the patched kernel?)

#66 2022-08-21 14:37:42

Artlav
Member
Registered: 2016-07-11
Posts: 36

Re: Lock up on upgrade kernel 5.18.16-arch1-1 to 5.19.1-arch2-1

No messages whatsoever, on screen or on the serial console.
debug ignore_loglevel earlyprintk=efi,keep console=ttyS0

#67 2022-08-21 15:16:08

seth
Member
Registered: 2012-09-03
Posts: 58,238

Re: Lock up on upgrade kernel 5.18.16-arch1-1 to 5.19.1-arch2-1

Without boot_delay they might just flush too fast?

#68 2022-08-21 15:26:49

Watnuss
Member
Registered: 2013-03-04
Posts: 53

Re: Lock up on upgrade kernel 5.18.16-arch1-1 to 5.19.1-arch2-1

Yeah. The boot loop starts before the boot_delay takes place. Also no output visible.

#69 2022-08-22 19:23:16

loqs
Member
Registered: 2014-03-06
Posts: 18,001

Re: Lock up on upgrade kernel 5.18.16-arch1-1 to 5.19.1-arch2-1

Test patch from https://bugzilla.kernel.org/show_bug.cgi?id=216387#c3
https://drive.google.com/file/d/1CM4WRZ … sp=sharing linux-5.19.2.arch1-2.4-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1BKvrgH … sp=sharing linux-headers-5.19.2.arch1-2.4-x86_64.pkg.tar.zst
Edit:
@seth is the padding block https://git.savannah.gnu.org/cgit/grub. … nux.h#n234

	__u32 ext_ramdisk_image;			/* 0x0c0 */
	__u32 ext_ramdisk_size;				/* 0x0c4 */
	__u32 ext_cmd_line_ptr;				/* 0x0c8 */
	__u8  _pad4[112];				/* 0x0cc */
	__u32 cc_blob_address;				/* 0x13c */

which includes cc_blob_address?
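
As a quick sanity check on the excerpt above, here is a minimal sketch that uses only the offsets and sizes quoted there (the `field_end` helper is made up for illustration). `_pad4` starts at 0x0cc and is 112 bytes long, so it should end exactly where `cc_blob_address` begins.

```shell
#!/bin/sh
# Offsets and sizes copied from the struct excerpt quoted above.
pad4_offset=0x0cc      # start of _pad4
pad4_size=112          # _pad4[112]
cc_blob_offset=0x13c   # start of cc_blob_address

# field_end OFFSET SIZE: print the first offset after a field, in hex.
field_end() {
    printf '0x%x\n' "$(( $1 + $2 ))"
}

pad4_end=$(field_end "$pad4_offset" "$pad4_size")
echo "_pad4 ends at $pad4_end"

if [ "$(( $pad4_end ))" -eq "$(( $cc_blob_offset ))" ]; then
    echo "cc_blob_address starts immediately after _pad4"
else
    echo "there is a gap or overlap between _pad4 and cc_blob_address"
fi
```

So in this layout the field sits directly behind the padding; a header that declares a longer padding block at 0x0cc would cover the same four bytes without naming them.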
Edit2:
syslinux seems to be missing quite a few parameters
https://repo.or.cz/syslinux.git/blob/HE … ain.c#l472
Edit3:
systemd-boot zeroes the whole structure https://github.com/systemd/systemd/blob … x86.c#L162 before filling in the fields it uses.
GRUB zeroes the whole structure as well: https://git.savannah.gnu.org/cgit/grub. … nux.c#n768
syslinux zeroes the structure as well: https://repo.or.cz/syslinux.git/blob/HE … in.c#l1141 What is the line after that doing?

Last edited by loqs (2022-08-22 20:42:38)

#70 2022-08-23 08:51:18

daren_k
Member
Registered: 2020-02-13
Posts: 37

Re: Lock up on upgrade kernel 5.18.16-arch1-1 to 5.19.1-arch2-1

I have a system with 128 GB of RAM, so booting takes a bit longer while the ramfs is set up.
The last output I see, for a few seconds before it hard-reboots, is the attempt to load the initial ramfs.

Both linux 5.19.3.arch1-1 and linux-lts 5.15.62-1 don't boot for me with the default flags anymore.

Booting with the fallback initramfs worked out for me with linux 5.19.3.arch1-1. Is some utility maybe botching the initramfs images?

#71 2022-08-23 09:36:44

loqs
Member
Registered: 2014-03-06
Posts: 18,001

Re: Lock up on upgrade kernel 5.18.16-arch1-1 to 5.19.1-arch2-1

@daren_k have you confirmed it is the same issue using the test kernel from post #69 or building the kernel yourself with the patch from the link applied?

#72 2022-08-23 10:09:46

daren_k
Member
Registered: 2020-02-13
Posts: 37

Re: Lock up on upgrade kernel 5.18.16-arch1-1 to 5.19.1-arch2-1

No, just the arch repo kernels.

I wanted to share my workaround, as I didn't see the fallback initramfs mentioned anywhere in this thread.

#73 2022-08-23 10:10:33

nevatar
Member
Registered: 2022-08-21
Posts: 2

Re: Lock up on upgrade kernel 5.18.16-arch1-1 to 5.19.1-arch2-1

Hi there,

I think Daren has a different problem; the bug discussed here is definitely a kernel issue. 5.18.16 and the LTS kernel boot fine here, while 5.19.x does not with exactly the same configuration.

#74 2022-08-23 12:00:57

seth
Member
Registered: 2012-09-03
Posts: 58,238

Re: Lock up on upgrade kernel 5.18.16-arch1-1 to 5.19.1-arch2-1

daren_k wrote:

Booting with the fallback initramfs worked out for me with linux 5.19.3.arch1-1. Is some utility maybe botching the initramfs images?

When you do so, what's the output of "uname -a"?
Do you actually use syslinux to boot?

#75 2022-08-23 12:10:06

jancici
Member
From: svk
Registered: 2011-12-04
Posts: 192

Re: Lock up on upgrade kernel 5.18.16-arch1-1 to 5.19.1-arch2-1

I can boot with the kernel from post #69
(I described my problem earlier here: https://bbs.archlinux.org/viewtopic.php?id=278895)

uname -a
Linux guido 5.19.2-arch1-2.4 #1 SMP PREEMPT_DYNAMIC Mon, 22 Aug 2022 19:06:59 +0000 x86_64 GNU/Linux

I am using syslinux, and this is the command line:
cat /proc/cmdline
BOOT_IMAGE=../vmlinuz-linux root=/dev/nvme0n1p2 rw debug ignore_loglevel earlyprintk=vga initrd=../intel-ucode.img,../initramfs-linux.img

Do I understand correctly that syslinux looks like the problematic part?

Last edited by jancici (2022-08-23 12:10:54)
