All of my Arch VMs running under XCP-ng become inoperable within a minute of boot. Exact same symptoms as reported here mid-March:
System unstable, kernel ring buffer flooded with "BUG: Bad page state in process swapper/0"
https://bugs.launchpad.net/ubuntu/+sour … ug/2056706
Unfortunately, my environment is a swarm of low-memory VMs, and I can't get any further than viewing the journal before the system becomes completely unresponsive. Hence no bug report from me: I have zero data. Can anyone provide more info for the Ubuntu thread or help submit a bug?
Can confirm no issues with otherwise identical images running 6.7.9-arch1-1.
Same issue using an updated XCP-ng 8.2.1 on multiple physical hosts.
journal / dmesg is flooded with "BUG: Bad page state in process <some process>".
Also no issues with 6.7.9-arch1-1.
Same result with or without ucode, and with different versions of xe-guest utils, including the latest that comes with XCP-ng 8.2.1.
I can spin up an Arch VM with a large (up to 64G) amount of RAM if needed and capture output if someone lets me know what I need to do.
We really need a bisect to pinpoint the bug. I got journal logs from an affected server, but they don't show anything more than a random process leaking memory, as described in the bug.
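A bisect needs a repeatable good/bad verdict for every kernel that gets built and booted. The following is only a minimal sketch of such a check, assuming the kernel log of the test boot has been saved to a text file; the script name `check_boot.py`, the file name `boot.log`, and the `classify` helper are hypothetical. It relies on the exit-code convention of `git bisect run`: 0 means good, 125 means skip, and other codes from 1 to 127 mean bad.

```python
import re
import sys

# Matches the report line quoted in this thread, e.g.
# "BUG: Bad page state in process swapper/0"
BAD_PAGE = re.compile(r"BUG: Bad page state in process")

# Exit codes understood by `git bisect run`
GOOD, BAD, SKIP = 0, 1, 125


def classify(log_text: str) -> int:
    """Return a `git bisect run` exit code for one boot's kernel log."""
    if not log_text.strip():
        # No log captured (e.g. the VM never came up): this commit cannot be judged.
        return SKIP
    return BAD if BAD_PAGE.search(log_text) else GOOD


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: dmesg > boot.log; python check_boot.py boot.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        sys.exit(classify(fh.read()))
```

Building each candidate kernel and booting the VM with it still has to be done around this check; the sketch only turns the captured log into a verdict.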
https://git.kernel.org/pub/scm/linux/ke … 198822c6cb looks related; does backporting that commit fix the issue?
Tested this - doesn't fix it.
Maybe related to this: https://lwn.net/Articles/949277/
For reference, the journal logs for this look like:
[..]
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: BUG: Bad page state in process systemd-network pfn:02e27
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: page:000000006941032e refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x2e27
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: flags: 0x1ffff0000000000(node=0|zone=1|lastcpupid=0xffff)
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: page_type: 0xffffffff()
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: raw: 01ffff0000000000 dead000000000040 ffff93c2c3798800 0000000000000000
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: raw: 0000000000000000 0000000000000001 00000000ffffffff 0000000000000000
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: page dumped because: page_pool leak
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: Modules linked in: iptable_mangle iptable_raw iptable_security wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha ip6_udp_tunnel udp_tunnel cfg80211 rfkill ip6table_filter ip6_tables iptable_filter intel_rapl_msr intel_rapl_common intel_uncore_frequency_common intel_pmc_core intel_vsec pmt_telemetry pmt_class crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic gf128mul ghash_clmulni_intel sha512_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd xen_netfront pcspkr fuse loop dm_mod nfnetlink ip_tables x_tables btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq xen_blkfront crc32c_intel sha256_ssse3
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: CPU: 0 PID: 294 Comm: systemd-network Tainted: G B 6.8.1-arch1-1 #1 52f97d9bb37be6168651745a1a9f8f7240d21ce5
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: Call Trace:
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: <IRQ>
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: dump_stack_lvl+0x47/0x60
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: bad_page+0x71/0x100
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: free_unref_page_prepare+0x236/0x390
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: free_unref_page+0x34/0x180
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: __pskb_pull_tail+0x3ff/0x4a0
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: xennet_poll+0x909/0xa40 [xen_netfront 12c02fdcf84c692965d9cd6ca5a6ff0a530b4ce9]
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: __napi_poll+0x28/0x1b0
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: net_rx_action+0x2b5/0x370
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: ? handle_irq_desc+0x3e/0x60
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: __do_softirq+0xc9/0x2c8
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: do_softirq.part.0+0x3d/0x60
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: </IRQ>
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: <TASK>
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: __local_bh_enable_ip+0x68/0x70
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: xennet_open+0x5e/0x120 [xen_netfront 12c02fdcf84c692965d9cd6ca5a6ff0a530b4ce9]
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: __dev_open+0xfa/0x1b0
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: __dev_change_flags+0x1c3/0x240
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: dev_change_flags+0x26/0x70
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: do_setlink+0x375/0x12d0
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: ? __nla_validate_parse+0x61/0xd50
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: ? get_page_from_freelist+0x1919/0x1a80
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: ? update_load_avg+0x7e/0x7e0
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: rtnl_setlink+0x11f/0x1d0
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: ? __mod_memcg_lruvec_state+0x97/0x110
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: ? ep_poll_callback+0x245/0x2a0
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: ? security_capable+0x41/0x70
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: rtnetlink_rcv_msg+0x14f/0x3c0
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: ? inode_sub_bytes+0x22/0x80
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: ? __pfx_rtnetlink_rcv_msg+0x10/0x10
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: netlink_rcv_skb+0x58/0x110
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: netlink_unicast+0x1a3/0x290
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: netlink_sendmsg+0x223/0x490
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: __sys_sendto+0x1dc/0x1f0
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: __x64_sys_sendto+0x24/0x30
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: do_syscall_64+0x86/0x170
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: ? xen_clocksource_get_cycles+0x1c/0x40
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: ? ktime_get_ts64+0x47/0xe0
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: ? syscall_exit_to_user_mode_prepare+0x178/0x1a0
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: ? syscall_exit_to_user_mode+0x80/0x230
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: ? do_syscall_64+0x96/0x170
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: ? do_syscall_64+0x96/0x170
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: ? do_syscall_64+0x96/0x170
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: ? do_syscall_64+0x96/0x170
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: ? exc_page_fault+0x7f/0x180
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: entry_SYSCALL_64_after_hwframe+0x6e/0x76
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: RIP: 0033:0x7ea793928f0c
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: Code: 9a c8 f7 ff 44 8b 4c 24 2c 4c 8b 44 24 20 89 c5 44 8b 54 24 28 48 8b 54 24 18 b8 2c 00 00 00 48 8b 74 24 10 8b 7c 24 08 0f 05 <48> 3d 00 f0 ff ff 77 34 89 ef 48 89 44 24 08 e8 e0 c8 f7 ff 48 8b
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: RSP: 002b:00007ffe63d2ce80 EFLAGS: 00000293 ORIG_RAX: 000000000000002c
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: RAX: ffffffffffffffda RBX: 0000598c90b2bc90 RCX: 00007ea793928f0c
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: RDX: 0000000000000020 RSI: 0000598c90b36970 RDI: 0000000000000003
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: RBP: 0000000000000000 R08: 00007ffe63d2cec0 R09: 0000000000000080
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: R10: 0000000000000000 R11: 0000000000000293 R12: 0000598c90b4c0d8
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: R13: 0000598c90b2bc90 R14: 0000000000000000 R15: 0000598c90b4c090
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: </TASK>
[..]
Last edited by inglor (2024-03-26 13:47:29)
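For anyone gathering data for a report, here is a rough sketch that tallies these reports from saved journal text. The field layout is taken from the excerpt above; the `summarize` helper and the two patterns are illustrative assumptions, not part of any existing tool.

```python
import re
from collections import Counter

# Patterns follow the journal excerpt in this thread.
REPORT = re.compile(
    r"BUG: Bad page state in process (?P<comm>.+?)\s+pfn:(?P<pfn>[0-9a-f]+)"
)
REASON = re.compile(r"page dumped because: (?P<reason>.+)")


def summarize(lines):
    """Count bad-page reports per process name and per dump reason."""
    by_process, by_reason = Counter(), Counter()
    for line in lines:
        if (m := REPORT.search(line)):
            by_process[m["comm"]] += 1
        elif (m := REASON.search(line)):
            by_reason[m["reason"].strip()] += 1
    return by_process, by_reason


sample = [
    "Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: BUG: Bad page state in process systemd-network pfn:02e27",
    "Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: page dumped because: page_pool leak",
]
processes, reasons = summarize(sample)
print(processes.most_common(5))
print(reasons.most_common(5))
```

If the counts show the same dump reason for every process, that supports the earlier observation that the affected process is random and the cause lies elsewhere.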
Tested 6.8.2-arch1 and I experience the same issue.
I tried reporting it upstream:
https://bugzilla.kernel.org/show_bug.cgi?id=218654
Fixed with https://patchwork.kernel.org/project/xe … @firesoul/, which is scheduled for 6.9.
I've released 6.8.2-arch2 with it.
No more "BUG / bad page" messages during quick testing with 6.8.2-arch2-1 on multiple XCP-ng VMs.
Thank you!