#1 2017-11-15 16:03:51

awc
Member
Registered: 2017-09-20
Posts: 9

[SOLVED] Vanilla Kernel stopped booting throughout 4.13

Hello archlinux community,

A month or two ago, when 4.13 was introduced, my vanilla kernel stopped booting. I switched to lts but for several reasons (the desire to learn, the incoming 4.14 lts) I'd like to get to the bottom of this issue.

Specifically, when I try to boot the vanilla kernel, GRUB prints that the initial ramdisk is being loaded, then the screen flashes two or three times (light, no text) and goes blank.

These are my current kernel params:
GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=off modprobe.blacklist=ehci_pci modprobe.blacklist=i915 threadirqs ipv6.disable_ipv6=1 i8042.nopnp i8042.unlock"

Per the recommendation of a very helpful user on the irc channel, I removed all modules from my initramfs.

nomodeset and break=premount freeze the screen at "loading initial ramdisk" for both my vanilla kernel and the lts kernel. The same thing happened when I followed the wiki on kernel panics.

I am running arch on a Dell Chromebook 13 with intel hardware.

And as a side note, I got help on the irc channel earlier today and the users I interacted with were extremely helpful and patient with me. We reached a bit of a dead end after break=premount seemed to fail and the user suggested that it might be an issue with static drivers in the kernel.

Any advice on what I can try, or further resources on the topic, would be appreciated.

Last edited by awc (2018-03-04 02:57:39)

Offline

#2 2017-11-15 16:13:21

progandy
Member
Registered: 2012-05-17
Posts: 5,184

Re: [SOLVED] Vanilla Kernel stopped booting throughout 4.13

It is unlikely, but 4.14 may already contain a fix for your problem. You can try installing it with linux-mainline from the AUR or with the precompiled package by miffe.

Last edited by progandy (2017-11-15 16:37:32)



Offline

#3 2017-11-15 16:26:09

loqs
Member
Registered: 2014-03-06
Posts: 17,192

Re: [SOLVED] Vanilla Kernel stopped booting throughout 4.13

You might get more output with loglevel=7 and earlyprintk=vga or earlyprintk=efi, depending on the system type.
If that and 4.14 both fail, I would suggest bisecting the kernel between 4.12 and 4.13: https://bbs.archlinux.org/viewtopic.php … 5#p1747625
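
Before spending hours of kernel builds on a bisect, the mechanics can be rehearsed on a throwaway repository. The following is only a toy sketch: the repository, the file names and the "broken" marker are invented for illustration, and `git bisect run` automates a check that on the real kernel has to be done by booting each build by hand.

```shell
# Toy demonstration of git bisect on a throwaway repository (not the kernel tree).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Bisect Demo"
git config user.email "demo@example.invalid"

# Eight commits; the fifth one introduces the pretend regression.
for i in 1 2 3 4 5 6 7 8; do
    if [ "$i" -eq 5 ]; then echo "broken" >> state.txt; fi
    echo "change $i" >> log.txt
    git add -A
    git commit -q -m "commit $i"
done

good=$(git rev-list --max-parents=0 HEAD)    # very first commit, known good
git bisect start HEAD "$good" >/dev/null     # bad revision first, then good
# The test command must exit 0 for a good revision and 1 for a bad one.
git bisect run sh -c '! grep -q broken state.txt 2>/dev/null' >/dev/null
first_bad=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "first bad: $first_bad"
```

On the kernel tree the equivalent start would be `git bisect start v4.13 v4.12`, followed by `git bisect good` or `git bisect bad` after each test boot until git names a single commit.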

Offline

#4 2017-11-15 17:04:15

seth
Member
Registered: 2012-09-03
Posts: 49,974

Re: [SOLVED] Vanilla Kernel stopped booting throughout 4.13

A very basic question: are you sure this is not just an I/O issue and that the boot actually fails (i.e. the system doesn't, for example, answer pings)?

Online

#5 2017-11-16 00:31:34

GenkiSky
Member
From: This account is henceforth dis
Registered: 2017-04-04
Posts: 82

Re: [SOLVED] Vanilla Kernel stopped booting throughout 4.13

To add to seth's comment, what is your GRUB_TIMEOUT in your grub config? If it's -1 or very long, try spamming enter when the screen starts flashing. The screen issues make this sound like a graphics issue, not a boot issue. Also, are the kernel params you listed what you need to boot on the lts kernel, or just the result of debugging attempts?

Offline

#6 2017-11-19 14:23:03

awc
Member
Registered: 2017-09-20
Posts: 9

Re: [SOLVED] Vanilla Kernel stopped booting throughout 4.13

Thank you all for your responses. I'm currently trying to figure out how bisecting works and how best to do it but I wanted to briefly respond to everyone.

progandy: I installed and compiled linux-mainline from AUR, no change in symptoms.

loqs: loglevel=7 and earlyprintk=vga did not produce the early output we're looking for. Same flashing and blank screen. I'm working on the bisect now.

seth: I have no idea if this is an I/O issue. I don't have any tty visible to manually ping the system if that's what you're asking. Can you explain how to tell the difference between I/O and boot issues?

GenkiSky: My GRUB_TIMEOUT is 5, which I think was the default setting. I'm realizing now that I tried spamming enter but I didn't change my timeout first. I'll give it a go at this point! And the kernel params I listed are almost entirely the result of debugging efforts. I think the only one I've had since installation is modprobe.blacklist=ehci_pci per the wiki on installing arch on chromebooks.

Offline

#7 2017-11-19 15:32:20

seth
Member
Registered: 2012-09-03
Posts: 49,974

Re: [SOLVED] Vanilla Kernel stopped booting throughout 4.13

The idea is to literally "ping" the system from another machine.
The mere inability to see output (and maybe even to send input to the system, though you didn't indicate that) does not imply the kernel didn't boot. It could just be a broken scanout buffer.
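
A minimal sketch of that check, meant to be run from a second machine on the same network. The helper name `check_host`, the `PING` override and the address in the usage comment are placeholders for illustration, not anything from this thread. Note that a silent host is not proof of a failed boot, since the network may simply not have come up.

```shell
# Hypothetical helper: report whether a host answers ICMP echo requests.
# PING may be overridden (e.g. for testing); it defaults to the system ping.
check_host() {
    host=$1
    # -c 3: send three probes; -W 2: wait up to 2 s for each reply (iputils ping).
    if ${PING:-ping} -c 3 -W 2 "$host" >/dev/null 2>&1; then
        echo "$host answers pings: the kernel is up, only the display is failing"
    else
        echo "$host does not answer: no evidence yet that the boot completed"
    fi
}

# Usage (placeholder address, substitute the Chromebook's real one):
#   check_host 192.168.1.50
```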

Online

#8 2017-11-19 19:37:28

loqs
Member
Registered: 2014-03-06
Posts: 17,192

Re: [SOLVED] Vanilla Kernel stopped booting throughout 4.13

Another basic test, if you do not have another system to ping the affected one from: is there any record in the journal from these failed boots, and does pressing the power button cause the system to shut down?
You could also try linux 4.14, which is now in testing. The links are given here because the package page has no link for the signature file: https://mirror.rackspace.com/archlinux/ … pkg.tar.xz  https://mirror.rackspace.com/archlinux/ … tar.xz.sig Place both files in the same directory to use the signature.

Offline

#9 2018-02-01 22:03:42

awc
Member
Registered: 2017-09-20
Posts: 9

Re: [SOLVED] Vanilla Kernel stopped booting throughout 4.13

This is an update from my previous post on the problem:

Briefly, after loading the initial ramdisk, the screen flashes two times, goes dark and requires a hard shutdown. My computer is a Dell Chromebook 13 with Intel Hardware. I am running the LTS without issue. My previous attempts to troubleshoot the situation included removing all modules from the initramfs, nomodeset and break=premount, and a series of kernel params {current params are intel_iommu=off modprobe.blacklist=ehci_pci modprobe.blacklist=i915 threadirqs ipv6.disable_ipv6=1 i8042.nopnp i8042.unlock}.

Anyways, to the new stuff:

After bisecting the kernel between 4.12 and 4.13 I found this commit to be a trouble spot -
drm/i915: Implement Link Rate fallback on Link training failure.

1) I don't trust this as a true problem point because it was committed in 4.11, before my issue arrived, and it was in fact later reverted in the kernel. I performed the bisect three times with different starting commits and it kept ending up here.

2) If this is indeed correct, I'm not sure what to make of it and if this is something that I can resolve in kernel params or if I need to create and apply patches in order to use the latest kernels.

And some forum etiquette:
I decided to post a new topic because I saw elsewhere that the mods prefer a rolling cycle of posts as opposed to possible necrobumping. If you need me to address the previous post in any way, please say so.
And to those who helped earlier, apologies for responding so late. Computers are not my day job and this little bit has...taken a while :P

Last edited by awc (2018-02-01 22:04:18)

Offline

#10 2018-02-01 22:15:04

loqs
Member
Registered: 2014-03-06
Posts: 17,192

Re: [SOLVED] Vanilla Kernel stopped booting throughout 4.13

Is it still an issue on 4.14.15 and 4.15?  When you performed the git bisect was the i915 module blacklisted?
As you were the topic starter of the previous thread, it is still unresolved, and you are still working on it, I would not have thought that continuing to use that thread counted as a necrobump.
Edit:
To clarify the order of the git commits:
233ce881dd91fb13eb6b09deefae33168e6ead4c was the original commit.
afc1ebf4562a14b8a981a0de2a3aa063dbd4c5b2 was the revert due to test failures.
9301397a63b3bf1090dffe846c6f1c8efa032236 was the reapplication of the commit after the test failures were explained.
Edit2:
9301397a63b3bf1090dffe846c6f1c8efa032236 does not revert cleanly from 4.15 or 4.14, but it does from 4.13. Does reverting that single commit on 4.13 fix the issue?

Last edited by loqs (2018-02-01 22:27:00)

Offline

#11 2018-02-01 22:29:45

WorMzy
Forum Moderator
From: Scotland
Registered: 2010-06-16
Posts: 11,786
Website

Re: [SOLVED] Vanilla Kernel stopped booting throughout 4.13

Mod note: merging with the existing thread.

awc wrote: "I decided to post a new topic because I saw elsewhere that the mods prefer a rolling cycle of posts as opposed to possible necrobumping. If you need me to address the previous post in any way, please say so."

Thank you for being considerate, but I wouldn't count this as necrobumping. You still have the same problem, and are carrying on from where you left off, so creating a new topic is unnecessary.

Last edited by WorMzy (2018-02-01 22:33:39)



Online

#12 2018-02-02 05:02:23

awc
Member
Registered: 2017-09-20
Posts: 9

Re: [SOLVED] Vanilla Kernel stopped booting throughout 4.13

Thank you WorMzy for merging the threads.

loqs: 4.14.15 does not work. I'm compiling linux-mainline again now to test 4.15. I have had i915 blacklisted throughout the bisect process.

Great news though, 4.13 works after reverting the 930...236 commit! The culprit is found.

Offline

#13 2018-02-02 12:51:02

loqs
Member
Registered: 2014-03-06
Posts: 17,192

Re: [SOLVED] Vanilla Kernel stopped booting throughout 4.13

This patch is the revert for 4.15 of 9301397a63b3bf1090dffe846c6f1c8efa032236 plus 713946d16f45ad0509434970ae6ff71529faab4b which relied upon it.

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 50f8443641b8..3ca480039a6d 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -15231,23 +15231,6 @@ void intel_connector_unregister(struct drm_connector *connector)
 	intel_panel_destroy_backlight(connector);
 }
 
-static void intel_hpd_poll_fini(struct drm_device *dev)
-{
-	struct intel_connector *connector;
-	struct drm_connector_list_iter conn_iter;
-
-	/* First disable polling... */
-	drm_kms_helper_poll_fini(dev);
-
-	/* Then kill the work that may have been queued by hpd. */
-	drm_connector_list_iter_begin(dev, &conn_iter);
-	for_each_intel_connector_iter(connector, &conn_iter) {
-		if (connector->modeset_retry_work.func)
-			cancel_work_sync(&connector->modeset_retry_work);
-	}
-	drm_connector_list_iter_end(&conn_iter);
-}
-
 void intel_modeset_cleanup(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
@@ -15268,7 +15251,7 @@ void intel_modeset_cleanup(struct drm_device *dev)
 	 * Due to the hpd irq storm handling the hotplug work can re-arm the
 	 * poll handlers. Hence disable polling after hpd handling is shut down.
 	 */
-	intel_hpd_poll_fini(dev);
+	drm_kms_helper_poll_fini(dev);
 
 	/* poll work can call into fbdev, hence clean that up afterwards */
 	intel_fbdev_fini(dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 158438bb0389..bc9b8bc651e0 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -5973,29 +5973,6 @@ intel_dp_init_connector_port_info(struct intel_digital_port *intel_dig_port)
 	}
 }
 
-static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
-{
-	struct intel_connector *intel_connector;
-	struct drm_connector *connector;
-
-	intel_connector = container_of(work, typeof(*intel_connector),
-				       modeset_retry_work);
-	connector = &intel_connector->base;
-	DRM_DEBUG_KMS("[CONNECTOR:%d:%s]\n", connector->base.id,
-		      connector->name);
-
-	/* Grab the locks before changing connector property*/
-	mutex_lock(&connector->dev->mode_config.mutex);
-	/* Set connector link status to BAD and send a Uevent to notify
-	 * userspace to do a modeset.
-	 */
-	drm_mode_connector_set_link_status_property(connector,
-						    DRM_MODE_LINK_STATUS_BAD);
-	mutex_unlock(&connector->dev->mode_config.mutex);
-	/* Send Hotplug uevent so userspace can reprobe */
-	drm_kms_helper_hotplug_event(connector->dev);
-}
-
 bool
 intel_dp_init_connector(struct intel_digital_port *intel_dig_port,
 			struct intel_connector *intel_connector)
@@ -6008,10 +5985,6 @@ intel_dp_init_connector(struct intel_digital_port *intel_dig_port,
 	enum port port = intel_dig_port->port;
 	int type;
 
-	/* Initialize the work for modeset in case of link train failure */
-	INIT_WORK(&intel_connector->modeset_retry_work,
-		  intel_dp_modeset_retry_work_fn);
-
 	if (WARN(intel_dig_port->max_lanes < 1,
 		 "Not enough lanes (%d) for DP on port %c\n",
 		 intel_dig_port->max_lanes, port_name(port)))
diff --git a/drivers/gpu/drm/i915/intel_dp_link_training.c b/drivers/gpu/drm/i915/intel_dp_link_training.c
index 05907fa8a553..694ad0ffb523 100644
--- a/drivers/gpu/drm/i915/intel_dp_link_training.c
+++ b/drivers/gpu/drm/i915/intel_dp_link_training.c
@@ -314,28 +314,6 @@ void intel_dp_stop_link_train(struct intel_dp *intel_dp)
 void
 intel_dp_start_link_train(struct intel_dp *intel_dp)
 {
-	struct intel_connector *intel_connector = intel_dp->attached_connector;
-
-	if (!intel_dp_link_training_clock_recovery(intel_dp))
-		goto failure_handling;
-	if (!intel_dp_link_training_channel_equalization(intel_dp))
-		goto failure_handling;
-
-	DRM_DEBUG_KMS("[CONNECTOR:%d:%s] Link Training Passed at Link Rate = %d, Lane count = %d",
-		      intel_connector->base.base.id,
-		      intel_connector->base.name,
-		      intel_dp->link_rate, intel_dp->lane_count);
-	return;
-
- failure_handling:
-	DRM_DEBUG_KMS("[CONNECTOR:%d:%s] Link Training failed at link rate = %d, lane count = %d",
-		      intel_connector->base.base.id,
-		      intel_connector->base.name,
-		      intel_dp->link_rate, intel_dp->lane_count);
-	if (!intel_dp_get_link_train_fallback_values(intel_dp,
-						     intel_dp->link_rate,
-						     intel_dp->lane_count))
-		/* Schedule a Hotplug Uevent to userspace to start modeset */
-		schedule_work(&intel_connector->modeset_retry_work);
-	return;
+	intel_dp_link_training_clock_recovery(intel_dp);
+	intel_dp_link_training_channel_equalization(intel_dp);
 }
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 5d77f75a9f9c..5bd66f2f8d2c 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -327,9 +327,6 @@ struct intel_connector {
 	void *port; /* store this opaque as its illegal to dereference it */
 
 	struct intel_dp *mst_port;
-
-	/* Work struct to schedule a uevent on link train failure */
-	struct work_struct modeset_retry_work;
 };
 
 struct intel_digital_connector_state {

I suggest you open an upstream bug report with your findings.

Offline

#14 2018-03-03 00:29:44

awc
Member
Registered: 2017-09-20
Posts: 9

Re: [SOLVED] Vanilla Kernel stopped booting throughout 4.13

I'm having trouble getting the patch to apply. I'm using the wiki.

First I git cloned the linux-mainline package into my general AUR folder ~/build/ and built it with makepkg -si. Then, in the directory containing the PKGBUILD, I copied your patch into a file I called linux-mainline.patch and edited source in the PKGBUILD as such:

source=(
  "git+https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git#tag=$_tag"
  'linux-mainline.patch'  # a patch for chromebook functionality
  'config'         # the main kernel config file
  '60-linux.hook'  # pacman hook for depmod
  '90-linux.hook'  # pacman hook for initramfs regeneration
  'linux.preset'   # standard config files for mkinitcpio ramdisk
)

Then I ran updpkgsums and edited prepare() as such:

prepare() {
  cd "${srcdir}/linux"

  patch -p1 -i linux-mainline.patch

  cp -Tf ../config .config

Any attempt to makepkg from there just says that it's already made and installed.

A few notes:
I'm leaving it at v4.15.1 right now just because I'm not sure how you made the patch and I just want proof that it works. I'll git pull to the latest version as soon as this proof of concept is done / I know the patch will work going forward.
My linux-mainline directory does not have a src directory as the wiki seems to assume it should. Am I missing something in the git clone / makepkg set-up?
I've tried fiddling with the value of the -p parameter in patch, to no avail.

Sorry if this seems like hand-holding. I really appreciate your help and time here. Thank you.

Offline

#15 2018-03-03 00:42:08

loqs
Member
Registered: 2014-03-06
Posts: 17,192

Re: [SOLVED] Vanilla Kernel stopped booting throughout 4.13

$ makepkg
==> ERROR: The package group has already been built. (use -f to overwrite)

As the error notes, use makepkg -f. If the error is something else, please post the makepkg command you are using and the output it generates.
Edit:
If the output then changes to patch asking whether it should "Assume -R", you need to clean the src directory:

makepkg -Cf

Last edited by loqs (2018-03-03 00:47:57)

Offline

#16 2018-03-03 01:03:00

awc
Member
Registered: 2017-09-20
Posts: 9

Re: [SOLVED] Vanilla Kernel stopped booting throughout 4.13

That wasn't my error but  the -f option allowed the patch to be recognized. I'm using makepkg -fsie and I still have patch -p1 set in the PKGBUILD. The error now reads:

==> Making package: linux-mainline 4.15-1 (Fri Mar  2 18:55:23 CST 2018)
==> Checking runtime dependencies...
==> Checking buildtime dependencies...
==> WARNING: Using existing $srcdir/ tree
==> Removing existing $pkgdir/ directory...
==> Starting build()...
scripts/kconfig/conf  --silentoldconfig Kconfig
***
*** Configuration file ".config" not found!
***
*** Please run some configurator (e.g. "make oldconfig" or
*** "make menuconfig" or "make xconfig").
***
make[2]: *** [scripts/kconfig/Makefile:40: silentoldconfig] Error 1
make[1]: *** [Makefile:504: silentoldconfig] Error 2
make: *** No rule to make target 'include/config/auto.conf', needed by 'include/config/kernel.release'.  Stop.
==> ERROR: A failure occurred in build().
    Aborting...

The build() function in PKGBUILD reads:

build() {
  cd "${srcdir}/linux"

  make ${MAKEFLAGS} LOCALVERSION= bzImage modules
}

Offline

#17 2018-03-03 09:30:04

loqs
Member
Registered: 2014-03-06
Posts: 17,192

Re: [SOLVED] Vanilla Kernel stopped booting throughout 4.13

Please post the PKGBUILD and the output of `ls src/linux/.config`
Edit:
The prepare() function is not called when you use makepkg -e, which is why the patch was not being applied. Something else seems to have gone wrong as well, since .config now appears to be missing.

Last edited by loqs (2018-03-03 09:42:44)

Offline

#18 2018-03-03 13:22:08

awc
Member
Registered: 2017-09-20
Posts: 9

Re: [SOLVED] Vanilla Kernel stopped booting throughout 4.13

Here is the PKGBUILD

And there is no directory at src/linux/.config
Could it have to do with the ordering of commands in my prepare() function? I have the patch preceding a command to copy a .config over.

Offline

#19 2018-03-03 13:25:23

loqs
Member
Registered: 2014-03-06
Posts: 17,192

Re: [SOLVED] Vanilla Kernel stopped booting throughout 4.13

The PKGBUILD looks fine, but somehow src/linux/.config has been deleted. To clean the srcdir, force a rebuild and install the result, try:

makepkg -Cfi

Offline

#20 2018-03-03 13:38:26

awc
Member
Registered: 2017-09-20
Posts: 9

Re: [SOLVED] Vanilla Kernel stopped booting throughout 4.13

There seems to be an issue in prepare(). I just tried adjusting the patch -p value again in case it wasn't properly pointing to the patch file, but the output remained the same. Here's my output from makepkg -Cfi

==> Making package: linux-mainline 4.15-1 (Sat Mar  3 07:33:32 CST 2018)
==> Checking runtime dependencies...
==> Checking buildtime dependencies...
==> Retrieving sources...
  -> Updating linux git repo...
Fetching origin
  -> Found linux-mainline.patch
  -> Found config
  -> Found 60-linux.hook
  -> Found 90-linux.hook
  -> Found linux.preset
==> Validating source files with md5sums...
    linux ... Skipped
    linux-mainline.patch ... Passed
    config ... Passed
    60-linux.hook ... Passed
    90-linux.hook ... Passed
    linux.preset ... Passed
==> Removing existing $srcdir/ directory...
==> Extracting sources...
  -> Creating working copy of linux git repo...
Cloning into 'linux'...
done.
Checking out files: 100% (62911/62911), done.
Checking out files: 100% (11943/11943), done.
Switched to a new branch 'makepkg'
==> Starting prepare()...
patch: **** Can't open patch file linux-mainline.patch : No such file or directory
==> ERROR: A failure occurred in prepare().
    Aborting...

Offline

#21 2018-03-03 13:44:55

loqs
Member
Registered: 2014-03-06
Posts: 17,192

Re: [SOLVED] Vanilla Kernel stopped booting throughout 4.13

The patch is in src (${srcdir}), one directory up from src/linux, which the prepare function has just changed into. So point -i at it explicitly:

patch -p1 -i "${srcdir}"/linux-mainline.patch

or

patch -p1 -i ../linux-mainline.patch
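
The distinction may be easier to see in isolation: -p only strips leading components from the file names inside the patch, while the argument to -i is resolved against the current directory. A small self-contained sketch, using a temporary directory as a stand-in for ${srcdir} and an invented one-line file rather than the real kernel sources:

```shell
# Stand-in for makepkg's ${srcdir}; everything below lives in a temp directory.
srcdir=$(mktemp -d)
mkdir -p "$srcdir/linux/drivers"
printf 'old line\n' > "$srcdir/linux/drivers/demo.c"

# A -p1 style patch: each path carries one leading component (a/, b/)
# that patch strips before looking for the file under the current directory.
cat > "$srcdir/linux-mainline.patch" <<'EOF'
--- a/drivers/demo.c
+++ b/drivers/demo.c
@@ -1 +1 @@
-old line
+new line
EOF

cd "$srcdir/linux"                                # what prepare() does first
patch -p1 -i "${srcdir}/linux-mainline.patch"     # -i needs the path from here
cat drivers/demo.c
```

Running `patch -p1 -i linux-mainline.patch` from inside `$srcdir/linux` instead fails with the same "Can't open patch file" message as above, whatever value -p has.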

Edit:
Have you reported the issue upstream?

Last edited by loqs (2018-03-03 14:47:35)

Offline

#22 2018-03-04 02:57:02

awc
Member
Registered: 2017-09-20
Posts: 9

Re: [SOLVED] Vanilla Kernel stopped booting throughout 4.13

And 4.15-0 works! Thank you again for helping out here. I'll submit that bug report and change this thread to a /solved/.

Can you point me to a resource for how you generated that revert patch just in case I need to do so again for kernels down the line?

Offline

#23 2018-03-04 11:10:59

loqs
Member
Registered: 2014-03-06
Posts: 17,192

Re: [SOLVED] Vanilla Kernel stopped booting throughout 4.13

This is using https://git.kernel.org/pub/scm/linux/ke … stable.git with v4.15.4 as a random example

$ git status
HEAD detached from v4.15.4
$ git revert -n 713946d16f45ad0509434970ae6ff71529faab4b
Performing inexact rename detection: 100% (432820/432820), done.
$ git revert -n 9301397a63b3bf1090dffe846c6f1c8efa032236
Performing inexact rename detection: 100% (9228852/9228852), done.
error: could not revert 9301397a63b3... drm/i915: Implement Link Rate fallback on Link training failure
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add <paths>' or 'git rm <paths>'
$ git mergetool --tool=meld 
Merging:
drivers/gpu/drm/i915/intel_dp_link_training.c

Normal merge conflict for 'drivers/gpu/drm/i915/intel_dp_link_training.c':
  {local}: modified file
  {remote}: modified file
$ git diff v4.15.4

The first commit reverted cleanly. The second one required manual merging using mergetool (meld is my preferred diff tool; git supports many others).
Then diff against the tag of the kernel you want to apply the patch to.
As you have a bisection, though, I really would report it upstream so that the issue can be fixed and you do not have to keep patching the kernel.
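
The same sequence can be tried end to end on a throwaway repository, including saving the diff to a file and checking that it applies to a pristine checkout of the tag. This is a toy sketch: the repository, the tag name v1.0-demo and the file contents are made up, and the revert here is conflict-free, so the mergetool step is not needed.

```shell
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Revert Demo"
git config user.email "demo@example.invalid"

printf 'link training\n' > driver.c
git add driver.c
git commit -q -m "base driver"

printf 'fallback logic\n' >> driver.c
git commit -q -a -m "add fallback"
unwanted=$(git rev-parse HEAD)           # the commit to be reverted

printf 'notes\n' > README
git add README
git commit -q -m "unrelated work"
git tag v1.0-demo                        # stands in for a release tag

git revert -n "$unwanted"                # revert into the work tree, no commit
git diff v1.0-demo > "$work/revert.patch"

# Verify: return to the pristine tag and check that the patch applies.
git reset -q --hard v1.0-demo
git apply --check "$work/revert.patch" && echo "patch applies cleanly"
```

The resulting revert.patch has the same shape as the patch posted earlier in the thread and can be listed in source=() of the PKGBUILD in the same way.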

Offline
