#1 2017-02-04 15:55:24

mensinda
Member
Registered: 2017-02-04
Posts: 5

Getting "BUG: Bad page map in process <random process>" in journalctl

Hi,

After my window manager crashed for the third time in two weeks, I looked in my log files and found this:

Nov 24 19:46:55 Mense-1 kernel: BUG: Bad page map in process awesome  pte:80000001e0d45045 pmd:1e3f6b067
Nov 24 19:46:55 Mense-1 kernel: page:ffffea0007835140 count:3 mapcount:-254 mapping:ffff880210d22411 index:0x1d6d
Nov 24 19:46:55 Mense-1 kernel: flags: 0x17fff0000040068(uptodate|lru|active|swapbacked)
Nov 24 19:46:55 Mense-1 kernel: page dumped because: bad pte
Nov 24 19:46:55 Mense-1 kernel: page->mem_cgroup:ffff88040d011c00
Nov 24 19:46:55 Mense-1 kernel: addr:0000000001d6d000 vm_flags:00100073 anon_vma:ffff8802d7d6d460 mapping:          (null) index:1d6d
Nov 24 19:46:55 Mense-1 kernel: file:          (null) fault:          (null) mmap:          (null) readpage:          (null)
Nov 24 19:46:55 Mense-1 kernel: CPU: 7 PID: 13960 Comm: sh Tainted: P    B      O    4.8.8-2-ARCH #1
Nov 24 19:46:55 Mense-1 kernel: Hardware name: Gigabyte Technology Co., Ltd. H97-HD3/H97-HD3, BIOS F9c 03/03/2016
Nov 24 19:46:55 Mense-1 kernel:  0000000000000286 00000000f5fafe58 ffff8801e3483a98 ffffffff812fde10
Nov 24 19:46:55 Mense-1 kernel:  0000000001d6d000 ffff8801fd22b840 ffff8801e3483ae8 ffffffff811ada3f
Nov 24 19:46:55 Mense-1 kernel:  ffff8801e3483ab8 ffffffff8119e2fd ffff8801e3483ae8 0000000001d6d000
Nov 24 19:46:55 Mense-1 kernel: Call Trace:
Nov 24 19:46:55 Mense-1 kernel:  [<ffffffff812fde10>] dump_stack+0x63/0x83
Nov 24 19:46:55 Mense-1 kernel:  [<ffffffff811ada3f>] print_bad_pte+0x1df/0x2a0
Nov 24 19:46:55 Mense-1 kernel:  [<ffffffff8119e2fd>] ? __dec_node_page_state+0x1d/0x20
Nov 24 19:46:55 Mense-1 kernel:  [<ffffffff811b08ba>] unmap_page_range+0x7ea/0x960
Nov 24 19:46:55 Mense-1 kernel:  [<ffffffff811b0aad>] unmap_single_vma+0x7d/0xe0
Nov 24 19:46:55 Mense-1 kernel:  [<ffffffff811b0e01>] unmap_vmas+0x51/0xa0
Nov 24 19:46:55 Mense-1 kernel:  [<ffffffff811b98f7>] exit_mmap+0xa7/0x170
Nov 24 19:46:55 Mense-1 kernel:  [<ffffffff81079bcd>] mmput+0x4d/0x100
Nov 24 19:46:55 Mense-1 kernel:  [<ffffffff812105ef>] flush_old_exec+0x54f/0x620
Nov 24 19:46:55 Mense-1 kernel:  [<ffffffff8126619f>] load_elf_binary+0x3af/0x16f0
Nov 24 19:46:55 Mense-1 kernel:  [<ffffffff81263bba>] ? load_misc_binary+0x31a/0x460
Nov 24 19:46:55 Mense-1 kernel:  [<ffffffff81205dc4>] ? __check_object_size+0x54/0x1d6
Nov 24 19:46:55 Mense-1 kernel:  [<ffffffff81210921>] search_binary_handler+0xa1/0x200
Nov 24 19:46:55 Mense-1 kernel:  [<ffffffff812113f7>] do_execveat_common.isra.15+0x587/0x760
Nov 24 19:46:55 Mense-1 kernel:  [<ffffffff8121186a>] SyS_execve+0x3a/0x50
Nov 24 19:46:55 Mense-1 kernel:  [<ffffffff81003b97>] do_syscall_64+0x57/0xb0
Nov 24 19:46:55 Mense-1 kernel:  [<ffffffff815f7f21>] entry_SYSCALL64_slow_path+0x25/0x25

This error is reported about once a second for up to 2 hours, starting after the system has been up for a random amount of time.

According to journalctl, the error first occurred on Nov 24, 2016; however, my oldest entries only go back to Nov 16.
The next occurrence of the error message was on Nov 30.

The error seems to be independent of the process (bash, zsh, cc1 and python2 were also affected). The only
value that stays constant (even after months) is

page->mem_cgroup:ffff88040d011c00
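To check how the reports spread across processes, and whether the mem_cgroup pointer really is the only constant, the journal can be tallied with a short script. This is only a rough sketch: the function name summarize and both regular expressions are my own assumptions based on the line format shown above, and it assumes process names contain no spaces.

```python
import re
from collections import Counter

# Assumed line formats, taken from the log excerpt above:
#   "... kernel: BUG: Bad page map in process awesome  pte:... pmd:..."
#   "... kernel: page->mem_cgroup:ffff88040d011c00"
BAD_PAGE_MAP = re.compile(r"BUG: Bad page map in process (\S+)\s+pte:")
MEM_CGROUP = re.compile(r"page->mem_cgroup:([0-9a-f]+)")


def summarize(lines):
    """Count reports per process and collect distinct mem_cgroup pointers."""
    per_process = Counter()
    cgroups = set()
    for line in lines:
        match = BAD_PAGE_MAP.search(line)
        if match:
            per_process[match.group(1)] += 1
            continue
        match = MEM_CGROUP.search(line)
        if match:
            cgroups.add(match.group(1))
    return per_process, cgroups


# Two lines copied from the excerpt above, used as sample input.
sample = [
    "Nov 24 19:46:55 Mense-1 kernel: BUG: Bad page map in process awesome"
    "  pte:80000001e0d45045 pmd:1e3f6b067",
    "Nov 24 19:46:55 Mense-1 kernel: page->mem_cgroup:ffff88040d011c00",
]
per_process, cgroups = summarize(sample)
print(per_process.most_common())
print(sorted(cgroups))
```

For the full journal, the same function can be fed sys.stdin instead of the sample list, with the output of journalctl piped into the script. If the second list contains more than one pointer, the mem_cgroup value is not constant after all.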

I cannot reliably reproduce this bug, but it seems more likely to occur when more RAM is in use.

The error occurs with both linux and linux-lts, and with the swap partition either enabled or disabled.
I am not using zram or zswap.

Also, memtest86 does not report any errors.

My System:
  CPU: Intel(R) Xeon(R) CPU E3-1231 v3
  VGA: Nvidia GTX 750ti
  RAM: 16GB DDR3 1600 MHz
  Motherboard: Gigabyte Technology Co., Ltd. H97-HD3

Does anyone have any idea what is going on?

#2 2017-02-06 13:22:15

R00KIE
Forum Fellow
From: Between a computer and a chair
Registered: 2008-09-14
Posts: 4,734

Re: Getting "BUG: Bad page map in process <random process>" in journalctl

My first guess would be bad RAM, but you say that memtest reports no problems. Are you overclocking anything in your system or setting tighter timings than the defaults? Is the CPU being cooled properly? Did you make sure that the CPU's microcode is up to date?


R00KIE
Tm90aGluZyB0byBzZWUgaGVyZSwgbW92ZSBhbG9uZy4K
