#1 2011-11-09 18:45:19

jjacky
Member
Registered: 2011-11-09
Posts: 347

System got messed up... somehow

Hello everyone,

So this isn't really how I had imagined my first post would turn out, but here we go.

Little background

I am new to Arch, and to Linux in general. For a while now I've wondered and thought about switching, but never did. Then, a couple of months ago, I decided I should, and have been working on it since. For a few weeks now, I have been using Arch inside a VM (Win7 host).

Things were going fine, I've been learning & loving it so far, but then!

Don't die on me

Yesterday, at one point I opened a terminal (urxvt) and wanted to do a simple thing:

$ sudo tail /var/log/nginx/foo.errors.log

but instead of being prompted for my password, I got a few error messages, alongside a bunch of wrong characters that were apparently meant to be a filename or something, then sudo complained about not being able to stat something... unfortunately I didn't keep those error messages; my bad.

Anyways, then I noticed something happening - something that had actually happened before. That is, a few days back, I noticed that when moving a terminal window on my screen, there were some drawing issues. Moving the window from top-left to bottom-right, there were "parts" of it still left at the top-left, the window wasn't fully/properly drawn at the bottom-right, and there were some bits in between. (It has happened a couple of times actually.)

Very odd (also, I thought that was the kind of thing that didn't happen with a compositing WM, and I'm running XFWM4 (XFCE) with the compositor enabled... but I might be wrong? I don't really know about this...) and it eventually led to VirtualBox crashing. I'll admit I didn't really investigate it much, not sure why. I guess I thought/hoped it might just be an issue with the VM and, maybe, the fact that I run it with two screens (I have 2 monitors) both in fullscreen mode, and some video driver issue might have caused this. Not that it would be okay, but...

To the point

Back to yesterday, when the same display issues were happening again. Except now, in (unrelated?) news, I couldn't sudo either. I eventually tried to reboot, but when clicking "Log out" in XFCE nothing would happen. I would have run "sudo reboot", only sudo didn't work. For the sake of it I tried "reboot", knowing it wouldn't work, except it failed for a different reason than I expected: command not found.

So, cold reboot.

But I could not (re-)boot, because a lot more was missing, things like hostname or runlevel and more! This was getting freaky, so I quickly got salt and drew a circle around me, just in case.

Anyways, because I happen to have a little script running every night using rsync to backup pretty much the whole system, I figured it was time for a restore. And because it's a VM, it's quite easy to create a brand new disk, remove the old one, replace it with the new one, and go: I booted from the ISO and used rsync again to restore everything as it was in the morning.

I am here, so this seems to have worked.

History repeats itself

So thankfully I had a backup (always backup, people!) and I have a running system again, with only little data loss (a few hours of work). That's good. But I have no idea what happened (or why), so I'm thinking it might/will happen again. Which is no good. Surely I'll still be doing backups, but that's far from an answer.

Now, I still have the old disk (i.e. the old .vdi file) so I did mount it (ro), to see. And this is what I get (upon mounting):

[  571.990980] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[  872.586681] EXT4-fs (dm-3): error count: 35
[  872.586702] EXT4-fs (dm-3): initial error at 1320758976: htree_dirblock_to_tree:587: inode 819: block 8970
[  872.586724] EXT4-fs (dm-3): last error at 1320760632: ext4_iget:5019: inode 15105
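
For anyone reading along: the two large numbers in the "initial error at" and "last error at" lines look like Unix timestamps. A minimal sketch for turning them into readable dates, assuming they are seconds since the epoch (the helper name epoch_to_utc is made up for this example):

```python
from datetime import datetime, timezone

# Values copied from the EXT4-fs "initial error at" / "last error at" lines.
# Assumption: they are seconds since the Unix epoch.
INITIAL_ERROR = 1320758976
LAST_ERROR = 1320760632


def epoch_to_utc(seconds):
    """Format an epoch timestamp as a UTC date string."""
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


if __name__ == "__main__":
    print("initial error:", epoch_to_utc(INITIAL_ERROR))
    print("last error:   ", epoch_to_utc(LAST_ERROR))
    print("span (seconds):", LAST_ERROR - INITIAL_ERROR)
```

Under that assumption both errors fall on 2011-11-08 (13:29:36 and 13:57:12 UTC), which would narrow down when the filesystem first noticed trouble.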

Also, if I now do a "ls /mnt/old/sbin" I find this in the log:

[ 5449.028115] EXT4-fs error (device dm-3): ext4_iget:5019: inode #15105: comm ls: bad extended attribute block 3021702897

and the actual result/listing is prefixed with this:

ls: cannot access /mnt/old/sbin/sulogin: Input/output error
ls: cannot access /mnt/old/sbin/bootlogd: Input/output error
ls: cannot access /mnt/old/sbin/runlevel: Input/output error
ls: cannot access /mnt/old/sbin/killall5: Input/output error
ls: cannot access /mnt/old/sbin/fstab-decode: Input/output error
ls: cannot access /mnt/old/sbin/shutdown: Input/output error
ls: cannot access /mnt/old/sbin/halt: Input/output error

which leads me to think the disk is screwed up. Only, this isn't a real disk, but a virtual one. So, I'm thinking that hardware failure is then unlikely(*), meaning that it's probably something else, something still running/going on in my system, something that will screw things up again.

(*) Besides, I haven't had any problems with the physical disk where the .vdi is stored (which is also where Windows is installed), the SMART info says it's fine (and I even copied that VDI file to another disk without problems).
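
To get a fuller picture than one "ls" at a time, something like the sketch below could list every entry in a directory that cannot be stat'ed, together with the errno. This is only an illustration: the function name find_unreadable and its stat_fn parameter are invented here, and /mnt/old/sbin is just the mount point used above.

```python
import os


def find_unreadable(directory, stat_fn=os.lstat):
    """Return (path, errno) pairs for entries whose stat call fails."""
    failures = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        try:
            stat_fn(path)  # lstat: do not follow symlinks
        except OSError as exc:
            failures.append((path, exc.errno))
    return failures


if __name__ == "__main__":
    # Hypothetical usage against the read-only mount of the old disk.
    for path, err in find_unreadable("/mnt/old/sbin"):
        print(path, os.strerror(err))
```

Run against each directory of the damaged mount, this would show whether the bad entries are limited to /sbin or spread across the filesystem. Note that os.listdir itself can raise OSError if the directory block is unreadable.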

So, is there anything I said that seems/is (completely) wrong? Does anyone have any idea as to what could be the cause? Or what I can do to try and find out?


Bonus

Completely offtopic, but looking at /var/log/boot I see it ends like so:

Tue Nov  8 16:12:18 2011: :: Starting vbox-service    [BKGD] :: Starting network    [B
Tue Nov  8 16:12:19 2011: :: Starting D-BUS system messagebus    [BUSY]    [DONE] 
Tue Nov  8 16:12:20 2011: 

..is this normal?

To be clear: when I said I have a backup, I have one from this morning, but also one from earlier this month, and the log already ended that way then. So this has nothing to do with the problems I described here. Still, it looks odd, plus there are a few more things (daemons) that are started (BKGD) after that... so I just wonder if this is normal or not.


Alright, that's it. Any help is appreciated.
Thanks,
-jacky

#2 2011-11-10 18:33:56

stqn
Member
Registered: 2010-03-19
Posts: 1,191

Re: System got messed up... somehow

I think your post is too long :). Try to stick to the point and people will be more inclined to help you because it requires less work on their part. Also use a less vague title!

You say sudo doesn't work but you don't post the error messages. They would be helpful.

The drawing problems are most probably unrelated and are probably caused by the VMWare driver; I'd try disabling compositing.

"Log out" not working in Xfce: search the forum. Several people have had this problem very recently.
