After an upgrade of the system and kernel (custom kernel 4.15), I can no longer boot into my system: black screen. Using the stock Arch kernel 4.14.15 works, though.
I thought this might be related to the harfbuzz problem, but even after downgrading harfbuzz, the problem remains.
Does anyone know what may cause this?
thanks
gummo
Last edited by gen2arch (2018-02-03 05:12:36)
gen2arch wrote: Does anyone know what may cause this?
Yeah: your custom kernel. As you've told us absolutely nothing about what you've customized in that kernel, there's really nothing else that could be done here.
Please modify your title to indicate the problem is with your own kernel.
EDIT: I just realized you downgraded to a different version of the stock kernel. The obvious test would be to check whether the stock build of 4.15 works or not.
Last edited by Trilby (2018-02-02 12:26:54)
"UNIX is simple and coherent..." - Dennis Ritchie, "GNU's Not UNIX" - Richard Stallman
gen2arch wrote: (custom kernel 4.15)
I have a custom kernel 4.15 too, with Plasma 5 (5.12 beta, but that's not important), and everything works fine; the other kernels (linux, linux-pf, linux-zen) work fine as well. So the problem is probably in how you built your custom kernel.
gen2arch wrote: Does anyone know what may cause this?
Yeah: your custom kernel. As you've told us absolutely nothing about what you've customized in that kernel, there's really nothing else that could be done here.
Please modify your title to indicate the problem is with your own kernel.
EDIT: I just realized you downgraded to a different version of the stock kernel. The obvious test would be to check whether the stock build of 4.15 works or not.
I check out the kernel via asp (asp export linux); I apply graysky's patch (https://github.com/graysky2/kernel_gcc_patch) for better processor support, and I use the modprobed-db mechanism and the localmodconfig build target to build the kernel.
Nothing in this setup has changed recently, so the breakage is somehow related to 4.15.
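For reference, the workflow described above can be sketched as shell commands (a rough sketch only; the patch filename and the modprobed.db path are assumptions, so adjust to your local setup):

```shell
# Rough sketch of the custom-kernel workflow from the post above.
# Guarded so it is safe to run on systems without the Arch tooling.
if command -v asp >/dev/null 2>&1; then
    asp export linux || true    # check out the stock linux package files
fi
# Inside the unpacked kernel source tree one would then, roughly:
#   patch -p1 < enable_additional_cpu_optimizations_for_gcc.patch   # graysky's patch
#   make LSMOD="$HOME/.config/modprobed.db" localmodconfig          # trim config via modprobed-db
```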
All this also seems to be related to nvidia(-dkms) and gcc, as noted in other threads that seem to hit a similar problem; see
https://bbs.archlinux.org/viewtopic.php?id=234112 and
https://bbs.archlinux.org/viewtopic.php?id=234067
I somehow managed to get my custom kernel running again by enabling the testing repo, which pulled in a newer nvidia driver (390.25).
Unfortunately I cannot give more precise info on what the original problem was or why it works now.
thanks
gummo
Did you check the DKMS output from the build of nvidia 387.34 on 4.15? I would have expected it to fail without a patch.
You should be able to see in the journal from the bad boots that the nvidia modules were not loaded. If you installed the kernel using pacman, I would also expect the DKMS failure to be recorded there.
You would have needed something like the following to make 387.34 work with 4.15:
diff --git a/kernel/nvidia-modeset/nvidia-modeset-linux.c b/kernel/nvidia-modeset/nvidia-modeset-linux.c
index edeb152..cd0ce2b 100644
--- a/kernel/nvidia-modeset/nvidia-modeset-linux.c
+++ b/kernel/nvidia-modeset/nvidia-modeset-linux.c
@@ -21,6 +21,7 @@
#include <linux/random.h>
#include <linux/file.h>
#include <linux/list.h>
+#include <linux/version.h>
#include "nvstatus.h"
@@ -566,9 +567,17 @@ static void nvkms_queue_work(nv_kthread_q_t *q, nv_kthread_q_item_t *q_item)
WARN_ON(!ret);
}
+#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 15, 0)
static void nvkms_timer_callback(unsigned long arg)
+#else
+static void nvkms_timer_callback(struct timer_list * t)
+#endif
{
+#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 15, 0)
struct nvkms_timer_t *timer = (struct nvkms_timer_t *) arg;
+#else
+ struct nvkms_timer_t *timer = from_timer(timer, t, kernel_timer);
+#endif
/* In softirq context, so schedule nvkms_kthread_q_callback(). */
nvkms_queue_work(&nvkms_kthread_q, &timer->nv_kthread_q_item);
@@ -606,10 +615,16 @@ nvkms_init_timer(struct nvkms_timer_t *timer, nvkms_timer_proc_t *proc,
timer->kernel_timer_created = NV_FALSE;
nvkms_queue_work(&nvkms_kthread_q, &timer->nv_kthread_q_item);
} else {
+#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 15, 0)
init_timer(&timer->kernel_timer);
+#else
+ timer_setup(&timer->kernel_timer,nvkms_timer_callback,0);
+#endif
timer->kernel_timer_created = NV_TRUE;
+#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 15, 0)
timer->kernel_timer.function = nvkms_timer_callback;
timer->kernel_timer.data = (unsigned long) timer;
+#endif
mod_timer(&timer->kernel_timer, jiffies + NVKMS_USECS_TO_JIFFIES(usec));
}
spin_unlock_irqrestore(&nvkms_timers.lock, flags);
diff --git a/kernel/nvidia/nv.c b/kernel/nvidia/nv.c
index ad5091b..a469bf9 100644
--- a/kernel/nvidia/nv.c
+++ b/kernel/nvidia/nv.c
@@ -320,7 +320,11 @@ static irqreturn_t nvidia_isr (int, void *, struct pt_regs *);
#else
static irqreturn_t nvidia_isr (int, void *);
#endif
+#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 15, 0)
static void nvidia_rc_timer (unsigned long);
+#else
+static void nvidia_rc_timer (struct timer_list *);
+#endif
static int nvidia_ctl_open (struct inode *, struct file *);
static int nvidia_ctl_close (struct inode *, struct file *);
@@ -2472,10 +2476,18 @@ nvidia_isr_bh_unlocked(
static void
nvidia_rc_timer(
+#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 15, 0)
unsigned long data
+#else
+ struct timer_list * t
+#endif
)
{
+#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 15, 0)
nv_linux_state_t *nvl = (nv_linux_state_t *) data;
+#else
+ nv_linux_state_t *nvl = from_timer(nvl, t, rc_timer);
+#endif
nv_state_t *nv = NV_STATE_PTR(nvl);
nvidia_stack_t *sp = nvl->sp[NV_DEV_STACK_TIMER];
@@ -3386,9 +3398,13 @@ int NV_API_CALL nv_start_rc_timer(
return -1;
nv_printf(NV_DBG_INFO, "NVRM: initializing rc timer\n");
+#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 15, 0)
init_timer(&nvl->rc_timer);
nvl->rc_timer.function = nvidia_rc_timer;
nvl->rc_timer.data = (unsigned long) nvl;
+#else
+ timer_setup(&nvl->rc_timer,nvidia_rc_timer,0);
+#endif
nv->rc_timer_enabled = 1;
mod_timer(&nvl->rc_timer, jiffies + HZ); /* set our timeout for 1 second */
nv_printf(NV_DBG_INFO, "NVRM: rc timer initialized\n");
Last edited by loqs (2018-02-03 08:44:37)
loqs wrote: Did you check the DKMS output from the build of nvidia 387.34 on 4.15? I would have expected it to fail without a patch. You would have needed something like the following to make 387.34 work with 4.15: [patch snipped]
Thanks loqs! That is really valuable info!
Good to know that the combination of 387.34 and kernel 4.15 was in fact doomed to fail; this narrows down the problem.
There was in fact an error message saying (I paraphrase): "Error! There is no instance of nvidia XX for kernel XX located in the DKMS tree." But in the log file that dkms told me to look at, the error was said to be something completely different, namely a compiler mismatch between gcc 7.2.1 (used to compile the running kernel) and gcc 7.3 (used for the actual DKMS compilation of the modules).
But I am almost 100% sure that this was not the case, as I also compiled 4.15 with gcc 7.3!
On the other hand, in spite of this error, the kernel seems to have been built nevertheless.
And I'm not sure whether this compiler mismatch was the original cause of the whole problem.
thanks
gummo
EDIT: You are right: looking at the bad boots via journalctl --list-boots and journalctl -b, I see that the nvidia module wasn't even loaded.
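For anyone following along, the journal inspection mentioned in the EDIT looks roughly like this (a sketch; requires systemd, and boot -1 is just an example of a bad boot's index):

```shell
# List recorded boots (0 = current, -1 = previous, ...), then search the
# kernel messages of a previous boot for nvidia module activity.
if command -v journalctl >/dev/null 2>&1; then
    journalctl --list-boots || true
    journalctl -b -1 -k 2>/dev/null | grep -i nvidia || true
fi
```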
Last edited by gen2arch (2018-02-03 12:50:26)
Were multiple kernels installed at the time, such as linux-lts 4.14.16? In such a case, the 4.15 module build fails and writes a log, and then the 4.14.16 module build fails and overwrites the previous log.
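One way to check this (a sketch; the nvidia/387.34 version is taken from this thread, and the kernel version placeholder is an assumption for your system):

```shell
# Show the DKMS state of every module/kernel pair, then rebuild against one
# specific kernel so its build log is not clobbered by a later kernel's run.
if command -v dkms >/dev/null 2>&1; then
    dkms status || true
    # dkms build nvidia/387.34 -k <4.15-kernel-version>   # usually needs root
fi
```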
loqs wrote: Were multiple kernels installed at the time, such as linux-lts 4.14.16? In such a case, the 4.15 module build fails and writes a log, and then the 4.14.16 module build fails and overwrites the previous log.
I see.
Yes, absolutely: I have three kernels installed at the same time: my custom kernel, the stock Arch kernel, and the stock Arch LTS kernel.
thanks
gummo
I have the same issue with nouveau, after upgrading to 4.15 this morning.
How could it have the same cause, when nouveau is a built-in (in-tree) module? If it were the same problem, then the same solution would resolve it.