I really need the bt from gdb. I do not have an Arch box to try the core on. I rigged the snapshot with a bunch of aborts to see if I can find something bad in the switchws code, but if you can trigger it by switching layouts then one of the issues is elsewhere. Looking at the previous trace, it looks like it jumped to NULL at some point, causing all kinds of ruckus. It sure looks like a corruption of the layout jump table.
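For anyone following along, here is a rough sketch of what a "jump to NULL" through a layout table means and how a guard would catch it. Everything in it (demo_layouts, demo_call_layout, the layout names) is hypothetical and only for illustration; it is not the actual scrotwm table or fix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout callback type; not scrotwm's real signature. */
typedef int (*demo_layout_fn)(int nwins);

struct demo_layout {
    const char     *name;
    demo_layout_fn  stack;  /* calling this while NULL jumps to address 0 */
};

static int demo_vertical(int nwins)   { return nwins; }
static int demo_horizontal(int nwins) { return -nwins; }

/* Illustrative table; the last entry simulates a corrupted slot. */
static struct demo_layout demo_layouts[] = {
    { "vertical",   demo_vertical },
    { "horizontal", demo_horizontal },
    { "broken",     NULL },
};

#define DEMO_NLAYOUTS (sizeof(demo_layouts) / sizeof(demo_layouts[0]))

/*
 * Call layout idx defensively. Returns 0 and stores the callback's
 * result in *out on success, or -1 if idx is out of range or the
 * table entry is NULL.
 */
static int
demo_call_layout(size_t idx, int nwins, int *out)
{
    assert(out != NULL);
    if (idx >= DEMO_NLAYOUTS || demo_layouts[idx].stack == NULL)
        return -1;
    *out = demo_layouts[idx].stack(nwins);
    return 0;
}
```

A check like this would not fix the corruption, but it would turn a wild jump into an error that can be logged or aborted on at a known place.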
Since I do not see any of these issues I have to assume it is Linux specific. What arch (amd64, i386, sparc64 etc)?
Do you use overly aggressive optimizations? Don't forget that gcc is a turd above -O2.
aha! roxterm on openbsd also screws the pooch. I'll look into that one. Please get me that gdb trace for the crash.
This diff will probably get mangled but let's try:
RCS file: /u/marco/scrotvs/scrotwm/scrotwm.c,v
retrieving revision 1.235
diff -u -p -u -p -r1.235 scrotwm.c
--- scrotwm.c 8 Oct 2009 19:21:24 -0000 1.235
+++ scrotwm.c 9 Oct 2009 02:33:28 -0000
@@ -1923,6 +1923,7 @@ stack_master(struct workspace *ws, struc
mask = CWX | CWY | CWWidth | CWHeight | CWBorderWidth;
XConfigureWindow(display, win->id, mask, &wc);
+ XSync(display, True);
last_h = win_g.h;
This "fixes" it for me. If it works I'll make it a proper callback. Please try and report back.
edit: whoops, I did not see your patch while typing stuff.
I've been building with -march=i686 -mtune=generic -O2.
For the roxterm bug:
(gdb) bt
#0 0xb7fea424 in __kernel_vsyscall ()
#1 0xb7e2d26d in ___newselect_nocancel () from /lib/libc.so.6
#2 0x080526d3 in main ()
I have not been able to reproduce the null jump.
For the empty swap-main crash, core and bt:
Reading symbols from /usr/bin/scrotwm...(no debugging symbols found)...done.
Attaching to program: /usr/bin/scrotwm, process 31990
(snip)
0xb7fa0424 in __kernel_vsyscall ()
(gdb) c
Continuing.
(pressed M+space on empty screen)
Program received signal SIGSEGV, Segmentation fault.
0x0804d2f8 in swapwin ()
(gdb) generate-core-file
Saved corefile core.31990
(gdb) bt
#0 0x0804d2f8 in swapwin ()
#1 0x0804e095 in keypress ()
#2 0x0805255b in main ()
(gdb)
What's really odd is that the swap-main bug ONLY occurs on desktop 1.
Last edited by keenerd (2009-10-09 02:49:50)
Any reason why gdb isn't telling us the line number?
email me so that I can send you to a server to come have a chat with me. This forum is just too slow to knock this out.
In cvs I committed a fix for one crash that I think is the one you saw. In the meantime I discovered some more race conditions courtesy of Xlib. I am chasing those now but please do try the cvs version and report back.
Ok I think I fixed these issues in 0.9.12. Please give it a twirl and let me know if it is still busted on Linux.
marco_p, does scrotwm work properly with Twinview? I currently use Xmonad, but would love to give scrotwm a whirl.
If it uses xrandr it'll work just fine.
Yay, the swap-main bug was fixed.
Did some more testing with the roxterm issue. Still not fixed. It is also a little worse than I thought. The bug happens with many VTE terminals: roxterm, lxterm, terminal (xfce), sakura. The only immune VTE terminal is the optional term embedded into Geany.
Also, it occurred during an operation I had not seen before. While spawning terminals, the resize might blank one or more.
I saw roxterm sometimes not getting redrawn however it didn't hang. It seems that the XConfigureWindow command doesn't get to xserver quickly enough when it is followed by XMapRaised. I am still thinking about this.
I never said it hung. I've said the exact opposite, several times.
I was just confirming your observation. I am at a loss for this actually; still trying a few things.
Ok I dropped 0.9.13. This addresses the issues I saw in roxterm on OpenBSD. You guys have a try and let me know if it fixes it for you too.
Sweet. 90% there.
The terminal blanking is now pretty rare. It only happens in the vertical layout, and then only in the minor half of the screen. I hit cycle-layout 100 times and only saw 11 glitches. With three layouts in the cycle, only about 33 of those presses landed on the vertical layout, so the vertical layout still glitches roughly 33% of the time.
not sure how much more I can slow down scrotwm to let it catch up....
I'll add a few more messages in there and see what it does. Now I need to fix the borders again after this diff. Please keep testing and looking for crashes and stuff.
Ok I think I got it licked. I am virtually sure I fixed the crashes you guys were seeing and roxterm now works 100% of the time for me. Subtle bug; took only a few days to figure out :-(
Please try 0.9.14.
I've been experiencing segfaults when switching between workspaces with 0.9.13 (can't really reproduce it reliably, seems to happen when switching back to workspace one)--perhaps you fixed this with the latest stuff in CVS. Here's a backtrace, let me know if you need more info:
Program received signal SIGSEGV, Segmentation fault.
0x0804bf6b in focus_win (win=0x8cd6ce8) at scrotwm.c:1288
1288 scrotwm.c: No such file or directory.
in scrotwm.c
#0 0x0804bf6b in focus_win (win=0x8cd6ce8) at scrotwm.c:1288
#1 0x0804c271 in switchws (r=0x8cb7f70, args=0x805779c) at scrotwm.c:1379
#2 0x08051912 in keypress (e=0xbfddbe7c) at scrotwm.c:3609
#3 0x080535f6 in main (argc=1, argv=0xbfddd024) at scrotwm.c:4401
I think I got that fixed in 0.9.14. Please try it; I'd love to see that issue die :-)
So far so good with 0.9.14.
One thing: when I was compiling this latest snapshot, gcc complained about an undefined reference to TAILQ_END and bailed. Turns out TAILQ_END is not defined in sys/queue.h on Linux as it apparently is on BSD. I stuck the definition for TAILQ_END in util.h (probably not the best method) and it then compiled fine.
VTE bug fixed! I guess TAILQ_END is something for the package maintainer to worry about? Until then, and just to make it easier for other people to find:
#define TAILQ_END(head) NULL
Throw that line in the beginning of scrotwm.c and it'll compile.
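A slightly more careful variant, offered as a sketch rather than the packaged fix, wraps the define in #ifndef so it does nothing on systems whose sys/queue.h already provides TAILQ_END. The struct and function names below (demo_win, demo_win_list, demo_count) are made up for illustration and are not from scrotwm.c.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/*
 * OpenBSD's <sys/queue.h> defines TAILQ_END(head) as NULL; the glibc
 * header discussed above does not. Only define it when it is missing.
 */
#ifndef TAILQ_END
#define TAILQ_END(head) NULL
#endif

/* Hypothetical window record standing in for the real struct. */
struct demo_win {
    int                  id;
    TAILQ_ENTRY(demo_win) entry;
};

TAILQ_HEAD(demo_win_list, demo_win);

/* Count the entries by walking the list until the end sentinel. */
static int
demo_count(struct demo_win_list *wl)
{
    struct demo_win *w;
    int              n = 0;

    assert(wl != NULL);
    for (w = TAILQ_FIRST(wl); w != TAILQ_END(wl); w = TAILQ_NEXT(w, entry))
        n++;
    return n;
}
```

With the guard in place the same source should build on both BSD and Linux without a redefinition warning.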
Last edited by keenerd (2009-10-13 09:00:53)
huh? Wherever you got that TAILQ macro file, it is busted. I don't mind carrying the OpenBSD one in the snaps, but this seems more like a fundamental issue with whatever development environment you have on Linux.
I think I will start yatwm .
/usr/include/sys/queue.h is owned by glibc 2.10.1-4
What version of glibc are you using? There are no defines with "END" in the name in our queue.h. There are a lot of "== NULL"s around.
Ulrich Drepper continues to fuck up glibc. I committed a retarded workaround to work around his bullshit. I am so sick of his crap!
I don't use glibc if I can help it and this is one of the obvious reasons.