#1 2023-12-17 23:21:45

npit
Member
Registered: 2016-07-01
Posts: 17

[SOLVED] rdrand errors on some applications after upgrade to 5800x

Hello,
I switched from a Ryzen 3600 to a 5800X on an MSI B550-A PRO and am now getting rdrand-related errors. For example, starting ktorrent and deluge fails with:

Uncaught exception: random_device: rdrand failed
Unable to start deluged: random_device: rdrand failed

Edit: this also causes sddm to crash on startup (switching to another tty and running `startx` works fine). The journalctl log shows:

                                               Stack trace of thread 573:
                                               #0  0x00007f1cc1eac83c n/a (libc.so.6 + 0x8e83c)
                                               #1  0x00007f1cc1e5c668 raise (libc.so.6 + 0x3e668)
                                               #2  0x00007f1cc1e444b8 abort (libc.so.6 + 0x264b8)
                                               #3  0x00007f1cc209ca6f _ZN9__gnu_cxx27__verbose_terminate_handlerEv (libstdc++.so.6 + 0x9ca6f)
                                               #4  0x00007f1cc20b011c _ZN10__cxxabiv111__terminateEPFvvE (libstdc++.so.6 + 0xb011c)
                                               #5  0x00007f1cc20b0189 _ZSt9terminatev (libstdc++.so.6 + 0xb0189)
                                               #6  0x00007f1cc20b03ed __cxa_throw (libstdc++.so.6 + 0xb03ed)
                                               #7  0x00007f1cc20a02cd _ZSt21__throw_runtime_errorPKc (libstdc++.so.6 + 0xa02cd)
                                               #8  0x00007f1cc20e0352 __x86_rdrand (libstdc++.so.6 + 0xe0352)
                                               #9  0x00007f1cc20e04ef __x86_rdseed (libstdc++.so.6 + 0xe04ef)
                                               #10 0x00005602039f7c5c n/a (sddm + 0x30c5c)
                                               #11 0x0000560203a0d36a n/a (sddm + 0x4636a)
                                               #12 0x0000560203a0e32f n/a (sddm + 0x4732f)
                                               #13 0x0000560203a0e603 n/a (sddm + 0x47603)
                                               #14 0x00007f1cc26d1097 n/a (libQt5Core.so.5 + 0x2d1097)
                                               #15 0x0000560203a0925f n/a (sddm + 0x4225f)
                                               #16 0x00007f1cc26d1097 n/a (libQt5Core.so.5 + 0x2d1097)
                                               #17 0x00007f1cc31bcad4 _ZN23QDBusPendingCallWatcher8finishedEPS_ (libQt5DBus.so.5 + 0x57ad4)
                                               #18 0x00007f1cc26c3bd4 _ZN7QObject5eventEP6QEvent (libQt5Core.so.5 + 0x2c3bd4)
                                               #19 0x00007f1cc269c14c _ZN16QCoreApplication15notifyInternal2EP7QObjectP6QEvent (libQt5Core.so.5 + 0x29c14c)
                                               #20 0x00007f1cc26a10cb _ZN23QCoreApplicationPrivate16sendPostedEventsEP7QObjectiP11QThreadData (libQt5Core.so.5 + 0x2a10cb)
                                               #21 0x00007f1cc26e7138 n/a (libQt5Core.so.5 + 0x2e7138)
                                               #22 0x00007f1cc110df69 n/a (libglib-2.0.so.0 + 0x59f69)
                                               #23 0x00007f1cc116c367 n/a (libglib-2.0.so.0 + 0xb8367)
                                               #24 0x00007f1cc110c162 g_main_context_iteration (libglib-2.0.so.0 + 0x58162)
                                               #25 0x00007f1cc26eaf7c _ZN20QEventDispatcherGlib13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE (libQt5Core.so.5 + 0x2eaf7c)
                                               #26 0x00007f1cc269ae74 _ZN10QEventLoop4execE6QFlagsINS_17ProcessEventsFlagEE (libQt5Core.so.5 + 0x29ae74)
                                               #27 0x00007f1cc269c313 _ZN16QCoreApplication4execEv (libQt5Core.so.5 + 0x29c313)
                                               #28 0x00005602039e043e n/a (sddm + 0x1943e)
                                               #29 0x00007f1cc1e45cd0 n/a (libc.so.6 + 0x27cd0)
                                               #30 0x00007f1cc1e45d8a __libc_start_main (libc.so.6 + 0x27d8a)
                                               #31 0x00005602039e2a25 n/a (sddm + 0x1ba25)

Checking the CPU flags with `lscpu | grep -i rdrand` outputs nothing.
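
A note on that empty output: as far as I know, lscpu takes its flag list from what the kernel exposes in /proc/cpuinfo, so it may disagree with tools that query CPUID directly. Below is a minimal Python sketch of the same check with an exact match on the flag name. The helper names `cpu_flags` and `has_flag` are made up for illustration.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set:
    """Return the flag names from the first 'flags' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    # No 'flags' line found (e.g. non-x86 system or empty input).
    return set()


def has_flag(cpuinfo_text: str, flag: str) -> bool:
    """Exact-match lookup of a single flag name."""
    return flag in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    text = path.read_text() if path.exists() else ""
    for flag in ("rdrand", "rdseed"):
        state = "listed" if has_flag(text, flag) else "not listed"
        print(f"{flag}: {state}")
```

A flag that is "not listed" here only says the kernel does not advertise it; it does not by itself show whether the instruction works.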

Running this RDRAND tester repo outputs:

================================================================================
RDRAND Tester v20210328 x86_64
--------------------------------------------------------------------------------
Compiled on Dec 18 2023
Compiled with GNU Compiler Collection (GCC) 13.2.1 20230801
================================================================================

Running on AMD Ryzen 7 5800X 8-Core Processor             
This CPU supports the following instructions:
 RDRAND: Supported
 RDSEED: Supported

Testing RDRAND...

The RDRAND instruction of this CPU appears to be NOT working.

Questions:
- Does the info above mean that there is a manufacturing defect? If so, should I expect warranty coverage?
- How would I fix or work around this? Should I boot with rdrand disabled, or is there another recommended way (e.g. some firmware update is mentioned here; do they mean updating the BIOS or some processor microcode layer)?

I'd appreciate any recommendations and links to documentation / resources for how to apply some solution.

Thanks.

Last edited by npit (2023-12-22 09:58:21)

#2 2023-12-18 07:34:57

Head_on_a_Stick
Member
From: The Wirral
Registered: 2014-02-20
Posts: 9,003

npit wrote:

some processor microcode layer

Is the amd-ucode package installed and is the bootloader configured to load the µcode? Check the "Microcode" page on the ArchWiki for details.

Updating the firmware ("BIOS") is probably also a good idea.


Jin, Jîyan, Azadî

#3 2023-12-18 21:31:46

npit
Member
Registered: 2016-07-01
Posts: 17

Head_on_a_Stick wrote:
npit wrote:

some processor microcode layer

Is the amd-ucode package installed and is the bootloader configured to load the µcode? Check the "Microcode" page on the ArchWiki for details.

Updating the firmware ("BIOS") is probably also a good idea.

I applied the microcode update via grub as per the wiki, but I'm still getting:

Dec 18 23:10:43 machine kernel: Speculative Return Stack Overflow: IBPB-extending microcode not applied!
Dec 18 23:10:43 machine kernel: Speculative Return Stack Overflow: Vulnerable: Safe RET, no microcode
Dec 18 23:10:43 machine kernel: microcode: CPU0: patch_level=0x00000000
Dec 18 23:10:43 machine kernel: microcode: CPU1: patch_level=0x00000000
Dec 18 23:10:43 machine kernel: microcode: CPU3: patch_level=0x00000000
Dec 18 23:10:43 machine kernel: microcode: CPU2: patch_level=0x00000000
Dec 18 23:10:43 machine kernel: microcode: CPU4: patch_level=0x00000000
Dec 18 23:10:43 machine kernel: microcode: CPU8: patch_level=0x00000000
Dec 18 23:10:43 machine kernel: microcode: CPU6: patch_level=0x00000000
Dec 18 23:10:43 machine kernel: microcode: CPU7: patch_level=0x00000000
Dec 18 23:10:43 machine kernel: microcode: CPU9: patch_level=0x00000000
Dec 18 23:10:43 machine kernel: microcode: CPU10: patch_level=0x00000000
Dec 18 23:10:43 machine kernel: microcode: CPU11: patch_level=0x00000000
Dec 18 23:10:43 machine kernel: microcode: CPU5: patch_level=0x00000000
Dec 18 23:10:43 machine kernel: microcode: CPU12: patch_level=0x00000000
Dec 18 23:10:43 machine kernel: microcode: CPU13: patch_level=0x00000000
Dec 18 23:10:43 machine kernel: microcode: CPU15: patch_level=0x00000000
Dec 18 23:10:43 machine kernel: microcode: CPU14: patch_level=0x00000000
Dec 18 23:10:43 machine kernel: microcode: Microcode Update Driver: v2.2.
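
For anyone wanting to check a log like the one above without reading every line, here is a rough Python sketch that pulls the per-CPU `patch_level` values out of kernel log text. The function names are hypothetical, the regular expression only assumes the line format shown above, and treating a zero patch level as "no microcode revision reported" is my assumption.

```python
import re

# Matches lines such as "... kernel: microcode: CPU0: patch_level=0x00000000"
PATCH_RE = re.compile(r"microcode: CPU(\d+): patch_level=(0x[0-9a-fA-F]+)")


def patch_levels(log_text: str) -> dict:
    """Map CPU number -> patch level (as int) for every matching log line."""
    levels = {}
    for match in PATCH_RE.finditer(log_text):
        levels[int(match.group(1))] = int(match.group(2), 16)
    return levels


def cpus_without_patch(log_text: str) -> list:
    """Return the sorted CPU numbers whose reported patch level is zero."""
    return sorted(cpu for cpu, level in patch_levels(log_text).items() if level == 0)


if __name__ == "__main__":
    # Two lines copied from the log above, used as sample input.
    sample = (
        "Dec 18 23:10:43 machine kernel: microcode: CPU0: patch_level=0x00000000\n"
        "Dec 18 23:10:43 machine kernel: microcode: CPU1: patch_level=0x00000000\n"
    )
    print("CPUs reporting patch_level 0:", cpus_without_patch(sample))
```

To run it on a real system, the sample string could be replaced with the saved output of `journalctl -k -b` or `dmesg`.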

I also tried booting with the "nordrand" kernel parameter via grub, but it appears to have no effect whatsoever.

I will try updating the BIOS next, which is the last option left as far as I can see.

#4 2023-12-18 21:47:33

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 75,387

"random.trust_cpu=off" ?

#5 2023-12-18 22:39:31

loqs
Member
Registered: 2014-03-06
Posts: 18,875

Please post the full output of dmesg. I am wondering whether the kernel has detected that rdrand is broken: https://github.com/torvalds/linux/blob/ … rand.c#L47
Edit:

Intel® 64 and IA-32 Architectures Software Developer’s Manual wrote:

In order for the hardware design to meet its security goals, the random number generator continuously tests itself and the random data it is generating. Runtime failures in the random number generator circuitry or statistically anomalous data occurring by chance will be detected by the self test hardware and flag the resulting data as being bad. In such extremely rare cases, the RDRAND instruction will return no data instead of bad data.
Under heavy load, with multiple cores executing RDRAND in parallel, it is possible, though unlikely, for the demand of random numbers by software processes/threads to exceed the rate at which the random number generator hardware can supply them. This will lead to the RDRAND instruction returning no data transitorily. The RDRAND instruction indicates the occurrence of this rare situation by clearing the CF flag.
The RDRAND instruction returns with the carry flag set (CF = 1) to indicate valid data is returned. It is recommended that software using the RDRAND instruction to get random numbers retry for a limited number of iterations while RDRAND returns CF=0 and complete when valid data is returned, indicated with CF=1. This will deal with transitory underflows. A retry limit should be employed to prevent a hard failure in the RNG (expected to be extremely rare) leading to a busy loop in software.
The intrinsic primitive for RDRAND is defined to address software’s need for the common cases (CF = 1) and the rare situations (CF = 0). The intrinsic primitive returns a value that reflects the value of the carry flag returned by the underlying RDRAND instruction.

If AMD followed Intel's spec, then a permanent CF = 0 means a hardware failure of the RNG.
Edit2:
https://github.com/torvalds/linux/commi … 7a949f6f24
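
The retry guidance in the quoted manual text can be modelled in a few lines. Python cannot execute RDRAND itself, so in this sketch a plain callable returning `(carry_flag, value)` stands in for the instruction. The names `rdrand_with_retry`, `flaky_source` and `dead_source` and the limit of 10 attempts are illustrative assumptions, not taken from any library.

```python
from typing import Callable, Optional, Tuple

# One call models one RDRAND execution: (CF set?, value returned).
RdrandStep = Callable[[], Tuple[bool, int]]


def rdrand_with_retry(step: RdrandStep, max_retries: int = 10) -> Optional[int]:
    """Call step() up to max_retries times; return the first valid value.

    Returns None if CF was never set, i.e. a suspected hard failure
    rather than a transient underflow.
    """
    for _ in range(max_retries):
        carry_flag, value = step()
        if carry_flag:
            return value
    return None


def flaky_source(failures: int, value: int) -> RdrandStep:
    """Stand-in that reports CF=0 for the first `failures` calls, then succeeds."""
    remaining = failures

    def step() -> Tuple[bool, int]:
        nonlocal remaining
        if remaining > 0:
            remaining -= 1
            return False, 0
        return True, value

    return step


def dead_source() -> Tuple[bool, int]:
    """Stand-in for an RNG that never sets CF (permanent CF = 0)."""
    return False, 0


if __name__ == "__main__":
    print("transient underflow:", rdrand_with_retry(flaky_source(2, 1234)))
    print("permanent CF=0:", rdrand_with_retry(dead_source))
```

With a source that never sets CF, the function gives up and returns None. That is the case a caller has to treat as a hard failure, and judging by the stack trace in the first post, libstdc++ handles it by throwing the `random_device: rdrand failed` error.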

Last edited by loqs (2023-12-18 23:43:55)

#6 2023-12-22 09:57:44

npit
Member
Registered: 2016-07-01
Posts: 17

seth wrote:

"random.trust_cpu=off" ?

Right, that might have worked as a stopgap.
It seems nordrand did nothing because it is deprecated; the up-to-date parameter list is in the wiki.
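
For future readers: a quick way to confirm that a parameter added via grub at least reached the running kernel is to look at /proc/cmdline. The Python sketch below does that with simple whitespace splitting (it assumes no quoted values containing spaces), and the function name `kernel_params` is made up. Note that a parameter showing up here only means the bootloader passed it, not that the kernel still honours it.

```python
from pathlib import Path
from typing import Dict, Optional


def kernel_params(cmdline: str) -> Dict[str, Optional[str]]:
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params: Dict[str, Optional[str]] = {}
    for token in cmdline.split():
        # Split on the first '=' only, so values like root=UUID=... stay intact.
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


if __name__ == "__main__":
    path = Path("/proc/cmdline")
    params = kernel_params(path.read_text() if path.exists() else "")
    for name in ("nordrand", "random.trust_cpu"):
        if name in params:
            print(f"{name}: present, value={params[name]!r}")
        else:
            print(f"{name}: not on the command line")
```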

In the end I did a (long overdue) BIOS update, which fixed the issue.
