I'm not sure your doing that can help anybody besides you.
It's just a temporary "solution" to get Arch running reliably without disabling power management.
Last edited by mich41 (2015-09-01 05:42:34)
Offline
Fair enough. I'd rather have a fully functional Fedora install than a kludged-together Frankenstein of an Arch install that I have to worry about accidentally breaking. I just figure it might be worth testing to isolate the causes for sure.
Avatar by Ditey: https://twitter.com/phrobitey
Offline
I have the same laptop as kris7t, with the 5700HQ CPU. Sometimes I get random crashes where the system just stops responding and waits 10 seconds before blinking the Caps Lock LED to signal a kernel panic. I've never seen any record of them in the logs or dmesg. My main suspect these days is the integrated graphics chip. My next step is setting up a kernel for crash dumps. I need my laptop to be reliable under high load, and I too would prefer to use Arch if possible. If I find something, I'll share it.
Creeds matter very little… The optimist proclaims that we live in the best of all possible worlds; and the pessimist fears this is true. So I elect for neither label. - James Branch Cabell
Offline
The issue is most certainly the Intel Broadwell CPU. Check the full post and the Phoronix articles mentioned previously.
Offline
Have any of you done the Intel microcode update?
Offline
I had, before I gave up and switched to Fedora on the machine. It didn't help.
Offline
nashamri wrote:The issue is most certainly the Intel Broadwell CPU. Check the full post and the Phoronix articles mentioned previously.
I guess mine was related to SNA, because after switching to UXA I'm mostly stable now. Uptime is 22 hours, which covered a load-sensitive workday with 4-5 suspend/resume cycles and VirtualBox running. I don't know if this is a fluke or not. I'll still try different stress tests and multicore compiles to catch a panic.
What is your CPU governor, by the way? I remember some crashes with ondemand, and learned it's not supported with SpeedStep anymore.
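For anyone who wants to try the same SNA-to-UXA switch, here is a minimal sketch. It assumes the xf86-video-intel driver is in use. The Identifier string is arbitrary, and the target path in the usage note is an assumption based on the usual xorg.conf.d layout, not something taken from this thread.

```shell
#!/bin/sh
# Print an Xorg config fragment that asks the intel driver to use UXA
# instead of SNA. The Identifier value is an arbitrary label (assumption).
print_uxa_conf() {
    cat <<'EOF'
Section "Device"
    Identifier "Intel Graphics"
    Driver     "intel"
    Option     "AccelMethod" "uxa"
EndSection
EOF
}
```

Possible usage, run as root and followed by a restart of X: `print_uxa_conf > /etc/X11/xorg.conf.d/20-intel.conf`. Deleting the file again should bring back the driver's default acceleration method.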
Offline
nashamri wrote:The issue is most certainly the Intel Broadwell CPU. Check the full post and the Phoronix articles mentioned previously.
I guess mine was related to SNA, because after switching to UXA I'm mostly stable now. Uptime is 22 hours, which covered a load-sensitive workday with 4-5 suspend/resume cycles and VirtualBox running. I don't know if this is a fluke or not. I'll still try different stress tests and multicore compiles to catch a panic.
What is your CPU governor, by the way? I remember some crashes with ondemand, and learned it's not supported with SpeedStep anymore.
I hadn't changed it, so it was the default Intel governor for recent chips, which is where the problem code is. And it's pretty much the only game in town for CPU power management on these chips, hence the issues.
Offline
I suggest you check:
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
I'm using TLP for power management on my laptop, which switches between the powersave and performance governors based on AC/BAT status.
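The check above only reads cpu0. A small sketch that prints the governor for every CPU might look like this; the function name and the optional sysfs-root argument are my own additions, the latter only so the function can be pointed at a test directory.

```shell
#!/bin/sh
# Print "cpuN governor" for every CPU that exposes a cpufreq policy.
# $1 = sysfs root (optional, defaults to /sys).
list_governors() (
    root="${1:-/sys}"
    for f in "$root"/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor; do
        # Skip an unmatched glob and CPUs without a readable cpufreq policy.
        [ -r "$f" ] || continue
        cpu="${f%/cpufreq/scaling_governor}"
        printf '%s %s\n' "${cpu##*/}" "$(cat "$f")"
    done
)
```

Calling `list_governors` with no arguments reads the live system. If it prints nothing, the files are missing or unreadable, and a tool such as cpupower is the fallback.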
Offline
GourdCaptain wrote:I hadn't changed it, so it was the default Intel governor for recent chips, which is where the problem code is. And it's pretty much the only game in town for CPU power management on these chips, hence the issues.
I suggest you check:
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
I'm using TLP for power management on my laptop, which switches between the powersave and performance governors based on AC/BAT status.
Huh, the different governor might be why you're not having the same issues, but the intel_pstate driver basically disregards most governor settings and if I remember correctly, doesn't show the exact results to the OS. Interestingly, Fedora doesn't have that file available to read, so I had to do it with cpupower.
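To see whether intel_pstate is actually the driver in charge, something like the following sketch could help. It reads the standard scaling_driver sysfs file and falls back to cpupower when that file is not readable. The function name and the optional sysfs-root argument are assumptions of mine.

```shell
#!/bin/sh
# Print the cpufreq driver for cpu0, e.g. "intel_pstate" or "acpi-cpufreq".
# $1 = sysfs root (optional, defaults to /sys).
scaling_driver() (
    root="${1:-/sys}"
    f="$root/devices/system/cpu/cpu0/cpufreq/scaling_driver"
    if [ -r "$f" ]; then
        cat "$f"
    elif command -v cpupower >/dev/null 2>&1; then
        # Fallback when the sysfs file is unavailable; note that cpupower
        # prints a short report rather than the bare driver name.
        cpupower frequency-info --driver
    else
        echo "unknown"
    fi
)
```

If this prints intel_pstate, only the powersave and performance governors are expected to be offered, which fits what TLP was switching between.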
Offline
Didn't know that. I just see the effects of whatever I set on tlp config (freq drops to 800 if I switch to powersave etc.).
Offline
GourdCaptain wrote:Huh, the different governor might be why you're not having the same issues, but the intel_pstate driver basically disregards most governor settings and if I remember correctly, doesn't show the exact results to the OS. Interestingly, Fedora doesn't have that file available to read, so I had to do it with cpupower.
Didn't know that. I just see the effects of whatever I set on tlp config (freq drops to 800 if I switch to powersave etc.).
Yeah. But recent Intel chips basically do their own power management, mostly at the hardware level, regardless of what the OS tells them to do, and fudge the numbers a bit so the OS thinks everything's going as it wants. (I may not be absolutely accurate, but I read the kernel documentation for this pstate driver a while back.)
Offline
And to my utter surprise, I just found out you can cause the crash by running an affected kernel IN VIRTUALBOX on Fedora 22. Whoa.
Offline
The issue is still there with linux 4.2-3 from the testing repo.
Offline
Disabling Intel SpeedStep in the BIOS, disabling fast boot and secure boot, and running the 4.2.0 kernel seems to be stable, with no issues so far. I will report here if I encounter anything.
Offline
Hi! I didn't find this forum thread until now, and I'm really relieved my hardware isn't broken. I also have an i7-5775C. It's not overclocked, and cooling can't be the problem either: I was able to compile a new kernel with -j8 without a crash, and then it crashed during installation. My system was absolutely unusable until I added `cpufreq.governor=performance` to the kernel command line. It still freezes, but much less frequently. I've also tried the kernels in core and in testing, which didn't change anything.
I also tried https://wiki.archlinux.org/index.php/In … re-M_chips and switching to UXA, neither of which helped. If I had more free space on my SSD I would install Fedora immediately, but I think my semi-stable Arch is the easiest thing to use at the moment. I'll check whether my EFI has an option to disable SpeedStep.
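After editing the bootloader entry it is easy to boot without the new parameter by accident, so here is a small sketch for checking what the running kernel was actually given. It only confirms the parameter was passed on the command line, not that the kernel honoured it. The function name and the optional file argument are assumptions of mine.

```shell
#!/bin/sh
# Print the value of a key=value kernel parameter; exit status 1 if absent.
# $1 = parameter name, $2 = file to read (optional, defaults to /proc/cmdline).
cmdline_param() (
    key="$1"
    file="${2:-/proc/cmdline}"
    set -f                      # no glob expansion while splitting into words
    for word in $(cat "$file"); do
        case "$word" in
            "$key"=*) printf '%s\n' "${word#*=}"; exit 0 ;;
        esac
    done
    exit 1
)
```

For example, `cmdline_param cpufreq.governor` should print `performance` on a boot where the setting described above was applied.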
Offline
You can try the Fedora/Arch Frankenstein
Offline
Okay, I'm now on Linux 4.1.6-201.fc22.x86_64 ^^
Let's see how this goes. So far I've tried a few things that made it freeze sometimes before, and also I haven't started with the performance cpufreq governor. Seems to be working!
Offline
Didn't help... First time I booted today, the system immediately froze. Then I re-enabled cpufreq.governor=performance for the Fedora kernel as well, and the system froze again when I tried building a package from AUR.
It seems the most reliable way of freezing my system is running gcc. Earlier, the system also froze when I tried to compile my own C++ program. It happened every single time, so I'm really lucky that project isn't something I'm actively working on at the moment.
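Since a hard freeze leaves nothing in dmesg, one way to narrow things down is to record each attempt on disk before it starts. This is only a sketch of that idea. The function name, log format, and the test.c file in the usage note are hypothetical.

```shell
#!/bin/sh
# Run a command repeatedly, logging each attempt before it starts.
# $1 = number of runs, $2 = log file, remaining arguments = command to run.
repro_loop() (
    runs="$1"; log="$2"; shift 2
    i=1
    while [ "$i" -le "$runs" ]; do
        printf 'run %s start\n' "$i" >> "$log"
        sync                    # flush the log so it survives a hard freeze
        "$@" || printf 'run %s failed\n' "$i" >> "$log"
        i=$((i + 1))
    done
)
```

Possible usage: `repro_loop 50 "$HOME/freeze.log" gcc -O2 -c test.c -o /dev/null`. After a forced reboot, the last "start" line in the log shows how many runs it took.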
Offline
Huh. We're kind of at the drawing board again on why Fedora's stable on the same hardware for me - I've run 24+ hour chains of encode jobs without crashes, lengthy compiles, and such.
Offline
FWIW, I just booted into Fedora, chrooted into Arch from there and attempted to build the same AUR package (glade-git) again. During configure, the system froze, just like on Arch.
So... the Fedora kernel, at least for me, is *not* more stable than the Arch kernel, and if Fedora is more stable at all, it must be because of other low-level things, e.g.... uhhh, a different version of coreutils or something?!
EDIT: gcc might actually be a thing to cause this. I'll try manually going through what makepkg does, on Fedora.
Last edited by jonessen (2015-09-13 17:00:30)
Offline
Well, who would have guessed... Compiling glade on Fedora with its gcc works like a charm!
As Arch very often also crashed during the compilation phase, not only during configure (which runs gcc for some tests as well), this particular issue, where the system runs fine for a long time and then freezes later when in use, seems very likely to be caused by gcc 5.2.
Of course, we still have no clue why Fedora runs well with the powersave cpufreq governor and Arch doesn't... The first thing that comes to my mind is autostart programs / the DE. I'm using i3; what are you using as a DE?
Last edited by jonessen (2015-09-13 18:30:24)
Offline
Some time ago, Haswell cores turned out to contain a silicon bug which broke software using the hardware lock elision functionality. I wonder if it's possible that HLE or some other exotic instructions used by Arch and Ubuntu glibc, but not by Fedora glibc, are messing up these CPUs.
As for HLE, it can be disabled at runtime by setting
export GLIBC_PTHREAD_MUTEX=none
export GLIBC_PTHREAD_RWLOCK=none
before launching an affected program. I'm not sure if there is any way to disable it globally besides recompiling glibc without --enable-lock-elision.
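To limit the experiment to a single program instead of the whole shell session, the two variables can be scoped to one command. The sketch below does that and also checks whether the CPU advertises the hle/rtm flags at all. Both function names are my own, and whether a given glibc build honours these variables is an assumption carried over from the suggestion above.

```shell
#!/bin/sh
# Run one command with the two variables set for that command only.
without_elision() {
    env GLIBC_PTHREAD_MUTEX=none GLIBC_PTHREAD_RWLOCK=none "$@"
}

# Print the TSX-related CPU flags (hle, rtm) that are present, one per line.
# Exit status is non-zero if neither flag is listed.
# $1 = cpuinfo file (optional, defaults to /proc/cpuinfo).
tsx_flags() {
    grep -m1 '^flags' "${1:-/proc/cpuinfo}" | tr ' ' '\n' | grep -x -e hle -e rtm
}
```

For example, `without_elision makepkg -s` would run one build with the variables set while leaving the rest of the session untouched. If `tsx_flags` prints nothing, lock elision is unlikely to be involved on that machine.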
Last edited by mich41 (2015-09-13 20:59:46)
Offline
I tried disabling HLE; that didn't help. Also, after the freeze I got from trying that out, my system froze right on boot again, with the normal Arch kernel + cpufreq.governor=performance. I booted the Fedora kernel afterwards...
Anyway, I think some exotic instruction triggering this is the best theory we've got so far.
Has anybody created a bug on the kernel bugtracker? I gave it a shot about two weeks ago, but it was probably the wrong category and didn't have much useful information, so I closed it myself after nobody had replied a few days later.
Offline
jonessen wrote:Well, who would have guessed... Compiling glade on Fedora with its gcc works like a charm!
As Arch very often also crashed during the compilation phase, not only during configure (which runs gcc for some tests as well), this particular issue, where the system runs fine for a long time and then freezes later when in use, seems very likely to be caused by gcc 5.2.
Of course, we still have no clue why Fedora runs well with the powersave cpufreq governor and Arch doesn't... The first thing that comes to my mind is autostart programs / the DE. I'm using i3; what are you using as a DE?
I'm using XFCE from the Fedora XFCE spin.
Has anybody created a bug on the kernel bugtracker? I gave it a shot about two weeks ago, but it was probably the wrong category and didn't have much useful information, so I closed it myself after nobody had replied a few days later.
I posted on the bug I'm guessing you made earlier in the thread, but I don't think any other bug tracker entries have been created.
Offline