#1 2024-03-23 02:55:29

wretchedbanana
Member
Registered: 2023-09-15
Posts: 4

6.8.1 + Xen/XCP-ng crumble under memory pressure

All of my Arch VMs running under XCP-ng become inoperable within a minute of boot. Exact same symptoms as reported here mid-March:

System unstable, kernel ring buffer flooded with "BUG: Bad page state in process swapper/0"
https://bugs.launchpad.net/ubuntu/+sour … ug/2056706

Unfortunately my environment is a swarm of low-memory VMs, and I can't get any further than viewing the journal before the system becomes completely unresponsive. Hence no bug report from me: I have zero data. Can anyone provide more info for the Ubuntu thread or help submit a bug?

#2 2024-03-23 03:03:03

wretchedbanana
Member
Registered: 2023-09-15
Posts: 4

Re: 6.8.1 + Xen/XCP-ng crumble under memory pressure

Can confirm no issues with otherwise identical images running 6.7.9-arch1-1.

#3 2024-03-25 20:21:36

SlashQuit
Member
Registered: 2013-06-05
Posts: 4

Re: 6.8.1 + Xen/XCP-ng crumble under memory pressure

Same issue using an updated XCP-ng 8.2.1 on multiple physical hosts.
The journal / dmesg is flooded with "BUG: Bad page state in process <some process>".
Also no issues with 6.7.9-arch1-1.

The result is the same with or without ucode, and with different versions of the xe-guest utils, including the latest that ships with XCP-ng 8.2.1.

I can spin up an Arch VM with a large (up to 64G) amount of RAM if needed and capture output if someone lets me know what I need to do.

#4 2024-03-25 21:43:45

inglor
Package Maintainer (PM)
Registered: 2008-07-22
Posts: 88

Re: 6.8.1 + Xen/XCP-ng crumble under memory pressure

SlashQuit wrote:

I can spin up an Arch VM with a large (up to 64G) amount of RAM if needed and capture output if someone lets me know what I need to do.

We really need a bisect to pinpoint the bug. I got journal logs from the affected server, but they don't show anything more than a random process leaking memory, as per the bug.
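
For anyone unfamiliar with bisecting: it is a binary search over the commits between a known-good and a known-bad kernel (per the reports above, 6.7.9 works and 6.8.1 does not). The Python sketch below is only a toy model of that search, not the real `git bisect` tool. The commit list, the culprit index, and the `is_bad` callback are hypothetical stand-ins for building each candidate kernel, booting it in a Xen guest, and checking dmesg.

```python
from __future__ import annotations

from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> tuple[str, int]:
    """Return (first bad commit, number of tests run).

    Assumes commits are ordered oldest to newest, commits[0] is known good,
    commits[-1] is known bad, and every commit after the culprit is also bad.
    """
    good, bad = 0, len(commits) - 1  # indices of the known-good / known-bad ends
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1
        if is_bad(commits[mid]):  # in practice: build, boot, look for "Bad page state"
            bad = mid
        else:
            good = mid
    return commits[bad], tests


if __name__ == "__main__":
    # Hypothetical history: 15,000 commits, regression introduced at index 9321.
    history = [f"c{i}" for i in range(15000)]
    culprit, tests = first_bad(history, lambda c: int(c[1:]) >= 9321)
    print(culprit, tests)
```

Each test halves the remaining range, so even a range of that (made-up) size needs at most 14 build-and-boot cycles. That is tedious, but feasible for anyone who can reproduce the problem reliably.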

#5 2024-03-25 23:41:39

heftig
Developer
From: Germany
Registered: 2010-04-19
Posts: 159

Re: 6.8.1 + Xen/XCP-ng crumble under memory pressure

https://git.kernel.org/pub/scm/linux/ke … 198822c6cb looks related; does backporting that commit fix the issue?

#6 2024-03-26 10:17:44

inglor
Package Maintainer (PM)
Registered: 2008-07-22
Posts: 88

Re: 6.8.1 + Xen/XCP-ng crumble under memory pressure

heftig wrote:

https://git.kernel.org/pub/scm/linux/ke … 198822c6cb looks related; does backporting that commit fix the issue?

Tested this; it doesn't fix it. :(

#7 2024-03-26 13:41:36

inglor
Package Maintainer (PM)
Registered: 2008-07-22
Posts: 88

Re: 6.8.1 + Xen/XCP-ng crumble under memory pressure

This may be related: https://lwn.net/Articles/949277/

#8 2024-03-26 13:44:03

inglor
Package Maintainer (PM)
Registered: 2008-07-22
Posts: 88

Re: 6.8.1 + Xen/XCP-ng crumble under memory pressure

For reference, the journal logs for this look like:

[..]
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: BUG: Bad page state in process systemd-network  pfn:02e27
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: page:000000006941032e refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x2e27
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: flags: 0x1ffff0000000000(node=0|zone=1|lastcpupid=0xffff)
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: page_type: 0xffffffff()
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: raw: 01ffff0000000000 dead000000000040 ffff93c2c3798800 0000000000000000
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: raw: 0000000000000000 0000000000000001 00000000ffffffff 0000000000000000
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: page dumped because: page_pool leak
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: Modules linked in: iptable_mangle iptable_raw iptable_security wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha ip6_udp_tunnel udp_tunnel cfg80211 rfkill ip6table_filter ip6_tables iptable_filter intel_rapl_msr intel_rapl_common intel_uncore_frequency_common intel_pmc_core intel_vsec pmt_telemetry pmt_class crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic gf128mul ghash_clmulni_intel sha512_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd xen_netfront pcspkr fuse loop dm_mod nfnetlink ip_tables x_tables btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq xen_blkfront crc32c_intel sha256_ssse3
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: CPU: 0 PID: 294 Comm: systemd-network Tainted: G    B              6.8.1-arch1-1 #1 52f97d9bb37be6168651745a1a9f8f7240d21ce5
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: Call Trace:
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  <IRQ>
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  dump_stack_lvl+0x47/0x60
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  bad_page+0x71/0x100
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  free_unref_page_prepare+0x236/0x390
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  free_unref_page+0x34/0x180
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  __pskb_pull_tail+0x3ff/0x4a0
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  xennet_poll+0x909/0xa40 [xen_netfront 12c02fdcf84c692965d9cd6ca5a6ff0a530b4ce9]
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  __napi_poll+0x28/0x1b0
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  net_rx_action+0x2b5/0x370
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  ? handle_irq_desc+0x3e/0x60
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  __do_softirq+0xc9/0x2c8
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  do_softirq.part.0+0x3d/0x60
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  </IRQ>
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  <TASK>
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  __local_bh_enable_ip+0x68/0x70
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  xennet_open+0x5e/0x120 [xen_netfront 12c02fdcf84c692965d9cd6ca5a6ff0a530b4ce9]
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  __dev_open+0xfa/0x1b0
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  __dev_change_flags+0x1c3/0x240
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  dev_change_flags+0x26/0x70
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  do_setlink+0x375/0x12d0
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  ? __nla_validate_parse+0x61/0xd50
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  ? get_page_from_freelist+0x1919/0x1a80
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  ? update_load_avg+0x7e/0x7e0
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  rtnl_setlink+0x11f/0x1d0
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  ? __mod_memcg_lruvec_state+0x97/0x110
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  ? ep_poll_callback+0x245/0x2a0
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  ? security_capable+0x41/0x70
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  rtnetlink_rcv_msg+0x14f/0x3c0
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  ? inode_sub_bytes+0x22/0x80
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  netlink_rcv_skb+0x58/0x110
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  netlink_unicast+0x1a3/0x290
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  netlink_sendmsg+0x223/0x490
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  __sys_sendto+0x1dc/0x1f0
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  __x64_sys_sendto+0x24/0x30
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  do_syscall_64+0x86/0x170
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  ? xen_clocksource_get_cycles+0x1c/0x40
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  ? ktime_get_ts64+0x47/0xe0
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  ? syscall_exit_to_user_mode_prepare+0x178/0x1a0
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  ? syscall_exit_to_user_mode+0x80/0x230
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  ? do_syscall_64+0x96/0x170
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  ? do_syscall_64+0x96/0x170
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  ? do_syscall_64+0x96/0x170
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  ? do_syscall_64+0x96/0x170
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  ? exc_page_fault+0x7f/0x180
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  entry_SYSCALL_64_after_hwframe+0x6e/0x76
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: RIP: 0033:0x7ea793928f0c
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: Code: 9a c8 f7 ff 44 8b 4c 24 2c 4c 8b 44 24 20 89 c5 44 8b 54 24 28 48 8b 54 24 18 b8 2c 00 00 00 48 8b 74 24 10 8b 7c 24 08 0f 05 <48> 3d 00 f0 ff ff 77 34 89 ef 48 89 44 24 08 e8 e0 c8 f7 ff 48 8b
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: RSP: 002b:00007ffe63d2ce80 EFLAGS: 00000293 ORIG_RAX: 000000000000002c
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: RAX: ffffffffffffffda RBX: 0000598c90b2bc90 RCX: 00007ea793928f0c
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: RDX: 0000000000000020 RSI: 0000598c90b36970 RDI: 0000000000000003
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: RBP: 0000000000000000 R08: 00007ffe63d2cec0 R09: 0000000000000080
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: R10: 0000000000000000 R11: 0000000000000293 R12: 0000598c90b4c0d8
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel: R13: 0000598c90b2bc90 R14: 0000000000000000 R15: 0000598c90b4c090
Mar 24 16:04:22 london.mirror.pkgbuild.com kernel:  </TASK>
[..]
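
Because the journal fills with these reports so quickly, it can help to summarize them before attaching logs to a bug report. The Python sketch below is a hypothetical helper: the function name, regexes, and script name are illustrative and not part of any existing tool. It assumes the kernel log line format shown above, and counts "Bad page state" reports per process together with the stated "page dumped because" reason.

```python
import re
import sys
from collections import Counter

# Matches e.g. "kernel: BUG: Bad page state in process systemd-network  pfn:02e27"
BAD_PAGE = re.compile(r"BUG: Bad page state in process (?P<proc>.+?)\s+pfn:(?P<pfn>[0-9a-f]+)")
# Matches e.g. "kernel: page dumped because: page_pool leak"
REASON = re.compile(r"page dumped because: (?P<reason>.+)$")


def summarize(lines):
    """Count bad-page reports per process name and per dump reason."""
    procs, reasons = Counter(), Counter()
    for line in lines:
        line = line.rstrip("\n")
        if m := BAD_PAGE.search(line):
            procs[m["proc"]] += 1
        elif m := REASON.search(line):
            reasons[m["reason"]] += 1
    return procs, reasons


if __name__ == "__main__":
    # Assumed usage: journalctl -k -b | python summarize_bad_pages.py
    procs, reasons = summarize(sys.stdin)
    for name, count in procs.most_common():
        print(f"{count:6d}  {name}")
    for reason, count in reasons.most_common():
        print(f"{count:6d}  reason: {reason}")
```

On the excerpt above this would report one hit for systemd-network with the reason "page_pool leak". Note that the trace itself passes through xennet_poll in xen_netfront.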

Last edited by inglor (2024-03-26 13:47:29)

#9 2024-03-28 12:19:11

inglor
Package Maintainer (PM)
Registered: 2008-07-22
Posts: 88

Re: 6.8.1 + Xen/XCP-ng crumble under memory pressure

Tested 6.8.2-arch1 and I experience the same issue.

#10 2024-03-28 13:08:14

inglor
Package Maintainer (PM)
Registered: 2008-07-22
Posts: 88

Re: 6.8.1 + Xen/XCP-ng crumble under memory pressure

#11 2024-03-28 17:02:07

inglor
Package Maintainer (PM)
Registered: 2008-07-22
Posts: 88

Re: 6.8.1 + Xen/XCP-ng crumble under memory pressure

#12 2024-03-28 19:07:26

heftig
Developer
From: Germany
Registered: 2010-04-19
Posts: 159

Re: 6.8.1 + Xen/XCP-ng crumble under memory pressure

I've released 6.8.2-arch2 with it.

#13 2024-03-29 20:48:33

SlashQuit
Member
Registered: 2013-06-05
Posts: 4

Re: 6.8.1 + Xen/XCP-ng crumble under memory pressure

No more "BUG / bad page" messages during quick testing with 6.8.2-arch2-1 on multiple XCP-ng VMs.
Thank you!
