#1 2014-05-29 18:29:56

Unia
Member
From: Stockholm, Sweden
Registered: 2010-03-30
Posts: 2,486
Website

Unfindable Illegal Instruction

Hello everyone,

I'm back with another error I can't seem to solve by myself, even with help from my friends. I had a few free days, so I tried to fix the cairo problems I faced earlier when trying to incorporate it into DWM. I decided to start from scratch, which went relatively easily this time. However, it seems I either missed something or did something wrong, because upon execution I get a segfault when I compile normally or an illegal instruction when I compile with -g.

The thing is, if I create a diff between my cairo code as it is now and as it was a few weeks ago, I don't see anything different that could cause this. Furthermore, if I check out the older code and run that, it works. I have pinpointed the issue to the creation of the cairo xlib surface on line 22 in drw.c, but I have also verified (via printf) that this function runs to completion before the crash. Actually, it gets through the whole setup() function (starting on line 1608 in dwm.c) and even draws the bar at least once before bailing out on me. Also, after a fresh boot it usually works fine, but a second attempt triggers the illegal instruction.

The code base I started out from (which is in my master branch on GitHub, see below) works fine, but again a diff between these two code bases does not show anything strange. I tried to compile cairo with symbol lookup enabled (because gdb only shows ???, as does valgrind), but I need a package I cannot find, and trying to build cairo with tracing fails during compilation.

I'm completely baffled by this and I have no clue how to continue. I know I could just pick up the old code, but then I won't learn anything ;)

The cairo code can be found inside the cairo branch: https://github.com/Unia/dwm/tree/cairo
The "normal" code can be found inside the master branch: https://github.com/Unia/dwm
The diff files between my old and my new cairo code can be found here: https://gist.github.com/Unia/24e26092b416c0bb79de
The diff files between my master and cairo branch can be found here: https://gist.github.com/Unia/7033ffc4a47be5d0cdda

Thanks in advance for any help!


If you can't sit by a cozy fire with your code in hand enjoying its simplicity and clarity, it needs more work. --Carlos Torres

Offline

#2 2014-05-30 06:49:17

GloW_on_dub
Member
Registered: 2013-03-13
Posts: 388

Re: Unfindable Illegal Instruction

Maybe valgrind could help you.

Offline

#3 2014-05-30 13:28:36

Unia
Member
From: Stockholm, Sweden
Registered: 2010-03-30
Posts: 2,486
Website

Re: Unfindable Illegal Instruction

Could you explain how? As I said in my post, both gdb and valgrind only show three question marks, because symbol lookup is disabled in cairo. I've tried to compile cairo with it turned on, but I need a dependency I cannot satisfy.


Offline

#4 2014-05-30 14:46:59

GloW_on_dub
Member
Registered: 2013-03-13
Posts: 388

Re: Unfindable Illegal Instruction

Oh, sorry, I had not seen the valgrind part.
Of course you would need to compile cairo in debug mode to get all the information, but does valgrind give you nothing other than "???"?
Also, with gdb you should be able to get a backtrace, even without cairo debugging symbols.

What happens if you compile your program with -g, run it under gdb and show the backtrace when it crashes?

Offline

#5 2014-05-30 16:27:28

Unia
Member
From: Stockholm, Sweden
Registered: 2010-03-30
Posts: 2,486
Website

Re: Unfindable Illegal Instruction

I compile my cairo dwm with -g, start Xephyr like this:

Xephyr -ac -br -noreset -screen 800x600 :1 &

Then I start dwm like so:

DISPLAY=:1.0 gdb ./dwm

And I let gdb do its thing by typing "run". When it crashes, I type "bt" and get the following output:

Program received signal SIGILL, Illegal instruction.
0x000000000040ab50 in ?? ()
(gdb) bt
#0  0x000000000040ab50 in ?? ()
#1  0x0000000000407a7b in ?? ()
#2  0x000000000040a6a2 in ?? ()
#3  0x00007ffff683a000 in __libc_start_main () from /usr/lib/libc.so.6
#4  0x0000000000402629 in ?? ()

Valgrind's output can be seen below:

┌─jente @ dwm (cairo) 18:23:10 
└─╼ DISPLAY=:1.0 valgrind --tool=memcheck ./dwm
==2581== Memcheck, a memory error detector
==2581== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==2581== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==2581== Command: ./dwm
==2581== 
vex amd64->IR: unhandled instruction bytes: 0x2F 0x68 0x6F 0x6D 0x65 0x2F 0x6A 0x65
vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
==2581== valgrind: Unrecognised instruction at address 0x40ab50.
==2581==    at 0x40AB50: ??? (in /home/jente/code/dwm/dwm)
==2581==    by 0x40A6A1: ??? (in /home/jente/code/dwm/dwm)
==2581==    by 0x6066FFF: (below main) (in /usr/lib/libc-2.19.so)
==2581== Your program just tried to execute an instruction that Valgrind
==2581== did not recognise.  There are two possible reasons for this.
==2581== 1. Your program has a bug and erroneously jumped to a non-code
==2581==    location.  If you are running Memcheck and you just saw a
==2581==    warning about a bad jump, it's probably your program's fault.
==2581== 2. The instruction is legitimate but Valgrind doesn't handle it,
==2581==    i.e. it's Valgrind's fault.  If you think this is the case or
==2581==    you are not sure, please let us know and we'll try to fix it.
==2581== Either way, Valgrind will now raise a SIGILL signal which will
==2581== probably kill your program.
==2581== 
==2581== Process terminating with default action of signal 4 (SIGILL)
==2581==  Illegal opcode at address 0x40AB50
==2581==    at 0x40AB50: ??? (in /home/jente/code/dwm/dwm)
==2581==    by 0x40A6A1: ??? (in /home/jente/code/dwm/dwm)
==2581==    by 0x6066FFF: (below main) (in /usr/lib/libc-2.19.so)
==2581== 
==2581== HEAP SUMMARY:
==2581==     in use at exit: 772,434 bytes in 17,669 blocks
==2581==   total heap usage: 26,609 allocs, 8,940 frees, 3,683,274 bytes allocated
==2581== 
==2581== LEAK SUMMARY:
==2581==    definitely lost: 3,584 bytes in 8 blocks
==2581==    indirectly lost: 7,069 bytes in 270 blocks
==2581==      possibly lost: 5,304 bytes in 99 blocks
==2581==    still reachable: 755,797 bytes in 17,288 blocks
==2581==         suppressed: 0 bytes in 0 blocks
==2581== Rerun with --leak-check=full to see details of leaked memory
==2581== 
==2581== For counts of detected and suppressed errors, rerun with: -v
==2581== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 1 from 1)
Illegal instruction (core dumped)

As you can see, both gdb and valgrind give a clue as to what is going on but not where it is happening. At least, not to me. I'm not that experienced with gdb and valgrind though, so if there are any other commands I can use please enlighten me.
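
One detail in the Valgrind report may be worth a second look: the eight "unhandled instruction bytes" all fall in the printable ASCII range. A minimal sketch for decoding them follows; `bytes_to_text` and `vex_bytes` are names made up for this example and are not part of Valgrind or dwm.

```c
#include <stddef.h>

/* The eight "unhandled instruction bytes" copied from the Valgrind report. */
const unsigned char vex_bytes[8] = {
    0x2F, 0x68, 0x6F, 0x6D, 0x65, 0x2F, 0x6A, 0x65
};

/*
 * Render n bytes as text into out (capacity outsz, terminator included).
 * Bytes outside the printable ASCII range are shown as '.'.
 * Returns out so the call can be used directly in an expression.
 */
char *bytes_to_text(const unsigned char *bytes, size_t n,
                    char *out, size_t outsz)
{
    size_t i;

    if (outsz == 0)
        return out;
    for (i = 0; i < n && i + 1 < outsz; i++)
        out[i] = (bytes[i] >= 0x20 && bytes[i] < 0x7F) ? (char)bytes[i] : '.';
    out[i] = '\0';
    return out;
}
```

Calling `bytes_to_text(vex_bytes, sizeof vex_bytes, buf, sizeof buf)` with a 16-byte `buf` yields the string `/home/je`, which looks like the start of a path such as the `/home/jente/...` one in the report. That would fit Valgrind's first explanation (a jump into a non-code location), though it does not by itself say where the bad jump comes from.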


Offline

#6 2014-05-30 19:22:35

GloW_on_dub
Member
Registered: 2013-03-13
Posts: 388

Re: Unfindable Illegal Instruction

Ouch, I've never had an illegal instruction before. We are indeed clueless here.
If I have some time I will try to look into it.

Offline

#7 2014-05-30 19:25:11

Unia
Member
From: Stockholm, Sweden
Registered: 2010-03-30
Posts: 2,486
Website

Re: Unfindable Illegal Instruction

Thanks for your time :)


Offline

#8 2014-05-30 21:19:52

SahibBommelig
Member
From: Germany
Registered: 2010-05-28
Posts: 80

Re: Unfindable Illegal Instruction

Just very vague guesses based on that valgrind output:

  • Buffer overflows might trigger such an error. So look out for places where you might exceed your buffer on the stack.

  • Tried with a different compiler like clang?

  • Try good ol' printf() debugging if everything fails.
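
If you go the printf() route, one thing that can mislead is buffering: text still sitting in a stdio buffer may never appear if the process dies first. Below is a small sketch that writes to a stream and flushes after every message. `trace_to` and `TRACE` are hypothetical helpers for illustration, not something dwm provides.

```c
#include <stdio.h>

/*
 * Write "file:line: msg\n" to out and flush immediately, so the message
 * is already out of the stdio buffer if the program crashes right after.
 * Returns the number of characters written (fprintf's return value).
 */
int trace_to(FILE *out, const char *file, int line, const char *msg)
{
    int n = fprintf(out, "%s:%d: %s\n", file, line, msg);

    fflush(out);
    return n;
}

/* Convenience wrapper that fills in the current file and line. */
#define TRACE(msg) trace_to(stderr, __FILE__, __LINE__, (msg))
```

Dropping `TRACE("after drw_create");` between suspicious calls then shows the last point that was definitely reached, rather than the last point whose output happened to be flushed.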

Offline

#9 2014-05-30 21:50:07

Unia
Member
From: Stockholm, Sweden
Registered: 2010-03-30
Posts: 2,486
Website

Re: Unfindable Illegal Instruction

SahibBommelig wrote:

Just very vague guesses based on that valgrind output:

  • Buffer overflows might trigger such an error. So look out for places where you might exceed your buffer on the stack.

The thing is, if I compare my current cairo code to the old one (which still works), I don't see anything that can trigger a buffer overflow. The same goes for comparing my current cairo code to my current non-cairo code, but I guess having another look won't hurt.

SahibBommelig wrote:
  • Tried with a different compiler like clang?

Just for debugging, or for actual use? If the latter, I would rather find out what is causing this..

SahibBommelig wrote:
  • Try good ol' printf() debugging if everything fails.

I have done that and by doing so, verified the drw_create method is actually run through perfectly fine. The whole setup method is run through fine and even the drw_rect/drw_text methods get run through fine. Actually, the whole bar is drawn once or twice before it bails on me. The thing is, if I create a normal cairo image surface instead of an xlib surface (in drw_create), it does not segfault so I assume that is where the fault is.


Offline

#10 2014-05-30 22:06:56

derhamster
Member
Registered: 2012-07-08
Posts: 86

Re: Unfindable Illegal Instruction

I usually get illegal instructions when I forget a return statement in a function (with a return type != void :D). Maybe you changed some code paths that could lead to a case like that?
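
A minimal sketch of what I mean, with the buggy shape kept in a comment so the snippet itself compiles cleanly. `find_index` is a made-up example function, not code from dwm. GCC reports this kind of thing through -Wreturn-type, which -Wall turns on for C, so it is worth checking that the build actually uses -Wall.

```c
#include <stddef.h>

/*
 * Buggy pattern (commented out on purpose):
 *
 *     int find_index(const int *v, size_t n, int key)
 *     {
 *         size_t i;
 *         for (i = 0; i < n; i++)
 *             if (v[i] == key)
 *                 return (int)i;
 *     }   // falls off the end when key is absent; using the result is undefined
 *
 * Fixed version: every path returns a value.
 */
int find_index(const int *v, size_t n, int key)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (v[i] == key)
            return (int)i;
    return -1; /* explicit "not found" result */
}
```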

Offline

#11 2014-05-30 22:11:38

Unia
Member
From: Stockholm, Sweden
Registered: 2010-03-30
Posts: 2,486
Website

Re: Unfindable Illegal Instruction

Shouldn't the compiler warn on that? It's worth another check though, so will do!


Offline

#12 2014-05-30 22:28:24

SahibBommelig
Member
From: Germany
Registered: 2010-05-28
Posts: 80

Re: Unfindable Illegal Instruction

Unia wrote:
SahibBommelig wrote:
  • Tried with a different compiler like clang?

Just for debugging, or for actual use? If the latter, I would rather find out what is causing this..

For debugging only. Clang often generates warnings that gcc does not. Also, we could rule out the very unlikely case of a compiler problem.

Unia wrote:
I have done that and by doing so, verified the drw_create method is actually run through perfectly fine. The whole setup method is run through fine and even the drw_rect/drw_text methods get run through fine. Actually, the whole bar is drawn once or twice before it bails on me. The thing is, if I create a normal cairo image surface instead of an xlib surface (in drw_create), it does not segfault so I assume that is where the fault is.

That's actually symptomatic of buffer overflows.
From a very quick glance at drw.c:

https://github.com/Unia/dwm/blob/315d7d … drw.c#L151

Do you make sure you have a null byte at the end? Why not use strncpy instead of memcpy?
The problem might only be exposed now, the solution might not be visible in the diff.
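
To be fair, strncpy alone does not guarantee a terminator either: if the source is at least as long as the limit, no null byte is written. A sketch of a bounded copy that always terminates is below. `copy_terminated` is a hypothetical helper written for this post, not a function from dwm or libc.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy at most dstsz - 1 bytes of src into dst and always terminate dst.
 * Returns the number of bytes copied, not counting the terminator.
 */
size_t copy_terminated(char *dst, size_t dstsz, const char *src)
{
    size_t len;

    if (dstsz == 0)
        return 0;          /* no room even for the terminator */
    len = strlen(src);
    if (len > dstsz - 1)
        len = dstsz - 1;   /* truncate to what fits */
    memcpy(dst, src, len);
    dst[len] = '\0';
    return len;
}
```

With a 4-byte buffer, copying "status" stores "sta" and returns 3, so the caller can tell truncation happened by comparing the result with strlen of the source.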

Offline

#13 2014-05-30 22:37:31

Unia
Member
From: Stockholm, Sweden
Registered: 2010-03-30
Posts: 2,486
Website

Re: Unfindable Illegal Instruction

^ I will have a go with clang, then. That memcpy is in vanilla dwm and is not one of my changes. However, if I completely comment the drawbar function in dwm.c so that no drw_text/drw_rect are being called, it still segfaults.


Offline

#14 2014-05-30 22:56:37

Unia
Member
From: Stockholm, Sweden
Registered: 2010-03-30
Posts: 2,486
Website

Re: Unfindable Illegal Instruction

Compiling with clang did not report any faults regarding this crash. It pointed out another silly (old) fault, though, but I verified it's still not working.

EDIT: I can comment out all of the functions in drw.c (except for drw_create) and the segfault will still occur. If I don't make use of drw_create, the segfault does not occur. This means that the segfault still occurs without any of the actual drawing code.

EDIT2: Again, with just the drw_create function and all the others commented, it works if I use cairo_image_surface instead of cairo_xlib_surface.

Last edited by Unia (2014-05-30 23:25:13)


Offline

#15 2014-06-01 21:52:28

Unia
Member
From: Stockholm, Sweden
Registered: 2010-03-30
Posts: 2,486
Website

Re: Unfindable Illegal Instruction

I have likely found the culprit, but don't ask me why that is it... I took my old cairo code and started (re)implementing the changes I had made onto my current master and then my changes from there onto my current cairo code. After every few changes I tested the code and by doing so, I found out the Illegal Instruction turns up when I put the code related to dmenumon back in. This code is two lines long and can be found inside the spawn function in dwm.c.

Why or how this is related to drawing and cairo in particular is a mystery to me, but with that code commented it works like a charm in the few test runs I have given it.
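
For readers who don't know that code: the dmenumon pattern, as far as I recall it from upstream dwm, is a one-digit buffer that spawn() overwrites with the selected monitor's number before dmenu is launched. Below is a simplified sketch of that pattern. The `Monitor` struct, `set_dmenumon`, and the shortened `dmenucmd` are stand-ins made up for illustration, modelled on upstream dwm rather than copied from the cairo branch.

```c
#include <stddef.h>

/* Simplified stand-in for dwm's Monitor type (assumption for this sketch). */
typedef struct {
    int num; /* index of the monitor */
} Monitor;

/*
 * Writable two-byte array: one digit plus the terminator.
 * Declaring this as `char *dmenumon = "0";` instead would make the
 * write in set_dmenumon() modify a string literal, which is undefined.
 */
char dmenumon[2] = "0";

/* Shortened command line; the third entry points at the buffer above. */
const char *dmenucmd[] = { "dmenu_run", "-m", dmenumon, NULL };

/* Store the selected monitor number as a single digit in dmenumon. */
void set_dmenumon(const Monitor *selmon)
{
    dmenumon[0] = (char)('0' + selmon->num);
}
```

Whether anything along these lines applies to my code is a guess, not a diagnosis; the sketch only shows what those two lines are meant to do.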

EDIT: Now it is just -Os causing a segfault somewhere, I have yet to figure out where that is. On the other hand, I could just not use it.

Last edited by Unia (2014-06-01 22:07:20)


Offline
