I've been having issues with my root partition for the last few weeks. Every time I turn off my machine or reboot, I get the warning "umount: device busy" before the machine turns off or reboots. During boot, at "checking filesystems", I then get the message "superblock: FIXED write time in the future" or something similar, and when it checks/scans the partition (after every 35th or 37th boot) it fails. I have no idea what the problem is or how to fix it.
Regards
André
Offline
I may not have the solution but I may have a few leads.
The "umount device busy" sounds like some program isn't shutting down correctly. I'd suspect a music player app or anything using a CD drive or USB storage devices. Take a look at this: http://ocaoimh.ie/how-to-umount-when-th … e-is-busy/
I had similar "write time in the future" messages when I first installed Arch because the time stamps on some files were wrong. My hardware clock and my system clock had been set to two different times.
Possibly, if you're setting the hardware clock to local time, there's an incorrect setting in rc.conf. The correct line should read:
HARDWARECLOCK="localtime"
not
HARDWARECLOCK="local"
Before you shut down or reboot, you might want to check the system clock with the 'date' command and the hardware clock with 'hwclock -r'.
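A rough sketch of that rc.conf check, in case it saves some squinting. The function name check_hardwareclock is made up for illustration, and it assumes the only accepted values for HARDWARECLOCK are "UTC" and "localtime", which is my understanding of rc.conf and may not hold for every release.

```shell
# Hypothetical helper: report the HARDWARECLOCK setting found in an rc.conf-style file.
# Assumption: the only accepted values are "UTC" and "localtime".
check_hardwareclock() {
    conf="${1:-/etc/rc.conf}"
    # Take the last uncommented HARDWARECLOCK= line, drop any trailing comment,
    # and strip the surrounding quotes.
    value=$(sed -n 's/^[[:space:]]*HARDWARECLOCK=\([^[:space:]#]*\).*/\1/p' "$conf" \
        | tail -n 1 | tr -d "\"'")
    case "$value" in
        UTC|localtime) echo "ok: $value" ;;
        *) echo "suspect: '$value'"; return 1 ;;
    esac
}

# Usage (hwclock needs root):
#   check_hardwareclock /etc/rc.conf
#   date; hwclock -r
```

If the function prints "suspect", fix the line in rc.conf first and then compare the two clocks again.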
Offline
I checked the clock and rc.conf and everything is correct. I took a look at the link you gave, and I get this when I look at my root partition:
[af@andre ~]$ fuser -m /dev/mapper/isw_baaggideei_Volume01
/dev/mapper/isw_baaggideei_Volume01: 3573re 3613rce 3614rce 3620re 3621re 3623re 3651re 3654re 3657re 3659re 3664re 3665re 3668re 3669re 3670re 3674re 3693re 3698re 3704re 3707re 3710re 3720re 3733re 3734re 3735re 3736re 3745re 3750re 3752re 3755rce 3764re 3766re 3779rce 3789rce 3794rce 5321re 5323re 9403re 9571re 9572re 10830re 10832re 10835re 10836re 10842re 10843re 10844re
Regards
André
Offline
OK, there are too many processes there to sort through easily. You shouldn't see that many processes as an ordinary user if you are not running X. It may be normal if X is running, but I don't think it's right. We should also have used the verbose option, '-v', along with the '-m' option. The '-m' option shows all processes accessing files on that file system; '-v' gives us human-readable output.
I have separate /, /boot, /var and /home partitions. Root is /dev/sda6. From a Linux console I get
$ fuser -mv /dev/sda6
                     USER        PID ACCESS COMMAND
/dev/sda6:           root     kernel mount  /
                     casey      2676 .r.e.  bash
Running openbox, tint2, urxvt, firefox and leafpad, I get
$ fuser -mv /dev/sda6
                     USER        PID ACCESS COMMAND
/dev/sda6:           root     kernel mount  /
                     casey      2676 .r.e.  bash
                     casey      3391 fr.e.  startx
                     casey      3407 .r.e.  xinit
                     casey      3414 .r.e.  ck-launch-sessi
                     casey      3433 .r.e.  openbox
                     casey      3440 .r.e.  tint2
                     casey      3477 Fr.e.  firefox
                     casey      3481 .rce.  dbus-daemon
                     casey      3482 .rce.  dbus-launch
                     casey      3484 Frce.  gconfd-2
                     casey      3506 .r.e.  urxvt
                     casey      3507 .r.e.  bash
                     casey      3511 .r.e.  leafpad
Under the 'ACCESS' heading, 'f' means there's an open file, 'F' means there's a file open for writing. If /var or /home are on your root disk, the list will be longer. The effs don't show unless '-v' is used.
If you run fuser with root privileges, you get a much longer list, but I don't think you'll need to do that. I think you want to look for processes with the uppercase 'F' and kill those processes. Then you have to find out why they are not shutting down properly.
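To pick the uppercase-'F' rows out of a long listing, something like the following might help. It is only a sketch: writers_from_fuser is a name I made up, and it assumes the four-column USER/PID/ACCESS/COMMAND layout shown above.

```shell
# Hypothetical helper: read `fuser -mv` output on stdin and print "PID COMMAND"
# for every process whose ACCESS column contains an uppercase F (open for writing).
writers_from_fuser() {
    awk '
        # The first data row is prefixed with the device name, e.g. "/dev/sda6:"; drop it.
        $1 ~ /:$/ { sub(/^[^ \t]+[ \t]+/, "") }
        # Keep rows with a numeric PID (skips the header and the "kernel mount" row).
        $2 ~ /^[0-9]+$/ && $3 ~ /F/ { print $2, $4 }
    '
}

# Usage: as far as I can tell fuser prints its verbose listing on stderr,
# so merge it into stdout before piping:
#   fuser -mv /dev/sda6 2>&1 | writers_from_fuser
```

Run against the second listing above, this should leave only the firefox and gconfd-2 rows.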
If you install lsof and then run 'lsof /dev/sda6', it will list all the open files on /dev/sda6. Then, to find the process that has /ttt/iii/some_file open, you simply enter 'lsof /ttt/iii/some_file'. To kill the process that has /ttt/iii/some_file open, enter
kill -HUP `lsof -t /ttt/iii/some_file` #Those are backticks
Edit: I'm learning here, too. I'm trying these commands just before I write them.
Last edited by thisoldman (2009-09-22 02:42:32)
Offline
Well, things have gotten a bit worse. When I boot, my machine fails at the filesystem check. More specifically, the check of my root partition fails and I am dropped into a rescue console. I still get the warning about the superblock being fixed, but now the partition has errors. When I reboot the machine, it checks the partition again, which now passes (no FIXED superblock), and the machine boots just fine. After checking the partition it just reports 1.2% non-contiguous files. How can I fix my root partition? I was thinking about reinstalling my machine, moving to ext4 and using grub2, but how does this work on fake-raid (RAID 0, two 500 GB drives)? Please help!
Regards
André
Offline
To me, it sounds like a drive failure could happen at any time. Better make backups if you haven't done so already.
Offline
Ah, crap. I really hope that I'm not dealing with a hardware failure but a failure of the RAID. But when I boot my machine my RAID shows both my drives as working (green), so I can't imagine that this is a hardware failure.
Regards
André
Last edited by fettouhi (2009-09-23 11:14:27)
Offline
If it's a RAID error, I'd be last on your list to ask for help.
Offline
That's why I'm hoping that someone who is running fake-raid can give me some suggestions on how to fix my root partition. Like I said earlier, the machine boots, but I have to force a check on the root partition, because the filesystem check in the Arch Linux boot process fails and I get the "superblock FIXED" warning.
Offline
After running
e2fsck -f /dev/mapper/isw_baaggideei_Volume01
on my partition after [checking filesystems] failed, it seems the problem has something to do with the superblock and the write time. At boot it reports that the last write time was 21:22:45 and the new time is 21:22:49, and when I hit Control-D to reboot I get the warning that it can't unmount the root partition. I can't figure out what is keeping the filesystem from unmounting. The only things that have been loaded are the modules in rc.conf.
Offline
The Arch boot sequence is described on the wiki page http://wiki.archlinux.org/index.php/Boot
Early in rc.sysinit, the realtime-clock driver is loaded and there are several places in that script where the hwclock is adjusted.
If it's just a system and hardware clock synchronizing fault, deleting /var/lib/hwclock/adjtime may work to cure it. See post #18 on this thread http://bbs.archlinux.org/viewtopic.php?id=79543.
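If you try that, it may be safer to move the file aside rather than delete it outright. A minimal sketch, with a made-up function name (set_aside_adjtime) and the default path taken from the suggestion above:

```shell
# Hypothetical helper: move an adjtime file aside instead of deleting it,
# so it can be restored if the change makes no difference.
set_aside_adjtime() {
    file="${1:-/var/lib/hwclock/adjtime}"
    if [ -f "$file" ]; then
        mv -- "$file" "$file.bak" && echo "moved $file to $file.bak"
    else
        echo "nothing to do: $file not found"
    fi
}

# Usage (as root), then rewrite the hardware clock from the system clock:
#   set_aside_adjtime
#   hwclock --systohc --localtime   # use --utc if rc.conf says HARDWARECLOCK="UTC"
```

Keeping the .bak copy means the old drift data can be put back if the boot messages don't change.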
Would clock disagreement upset system file journaling? Is that what's causing the "device is busy" message?
Offline
I don't think that clock disagreement would produce the "device is busy" message, so I still can't figure out what is causing it. I'll try your suggestion.
Regards
André
Offline
I FINALLY figured out what was causing the "device is busy" thing. It is samba, for some reason, and I don't know why. I can see that when samba shuts down, nscd is also stopped. My machine is running on DHCP. Can anyone explain to me why samba is doing this? I haven't touched my samba settings in many, many months. I did, though, switch from a static IP to DHCP when we made changes to our DHCP server. Could that be the cause of it?
Regards
André
Offline
After looking more closely, the problem isn't samba; it is nscd that is causing the problem at shutdown and preventing the root partition from being unmounted, which in turn gives the write time error at the next boot. Why is it doing that?
Regards
André
Offline
I had these same symptoms (root not unmounting, dropping to the recovery console on the next boot and then "fixing" the fs on the boot after that). No RAID, though.
I believe it started after I adjusted the system time with date -s.
I just tried this solution and it worked for me:
If it's just a system and hardware clock synchronizing fault, deleting /var/lib/hwclock/adjtime may work to cure it. See post #18 on this thread http://bbs.archlinux.org/viewtopic.php?id=79543.
Xyne wrote:
"We've got Pacman. Wacka wacka, bitches!"
Offline
I take that back. The problem returned on the next reboot.
Offline
I started having this issue a few days ago, no RAID either, no SAMBA.
It happens when the computer is not shut down correctly (i.e. after a power failure).
But before it was just "playing back the journal", whereas now it fails.
Maybe it's time to file a bug-report. I will first try what was suggested here, and then file a bug-report if nobody has done it before.
Last edited by john_schaf (2009-10-02 08:17:16)
Offline
Please see this bug report, initiated by André Fettouhi, http://bugs.archlinux.org/task/16368. No solution or workaround has been posted yet.
Offline
After looking more closely, the problem isn't samba; it is nscd that is causing the problem at shutdown and preventing the root partition from being unmounted, which in turn gives the write time error at the next boot. Why is it doing that?
Regards
André
Hmm, I have exactly the same problem here. It appears that for me, the wicd daemon starts nscd, and if I don't manually kill nscd before rebooting, it causes this problem. Strangely, when I shut down without manually killing nscd, there is a "Done" instead of a "Fail", and the "device is busy" problem occurs soon after...
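For anyone who wants to script that manual workaround, here is a rough sketch. The function name stop_if_running is made up, and it assumes pgrep and pkill (from procps) are installed; it is a stopgap, not a fix for whatever leaves nscd running.

```shell
# Hypothetical helper: stop a daemon by exact process name if it is running.
# Assumes pgrep/pkill from procps are available.
stop_if_running() {
    name="$1"
    if pgrep -x "$name" >/dev/null 2>&1; then
        pkill -x "$name" && echo "stopped $name"
    else
        echo "$name is not running"
    fi
}

# Usage (as root) just before shutting down or rebooting:
#   stop_if_running nscd
```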
Any society that would give up a little liberty to gain a little security will deserve neither and lose both.
-Benjamin Franklin
The reasonable man adapts himself to the world; the unreasonable one persists in trying to adapt the world to himself. Therefore all progress depends on the unreasonable man.
-George Bernard Shaw
Offline