I run the NFS server on my Arch box, exporting to a Debian box and an Arch laptop. Whenever the Arch server reboots lately, the shares don't automatically remount properly on the clients. They're listed in mtab, but if I ls or otherwise try to access the directories, I get a permission denied error. I have to manually unmount the shares and then remount them to be able to see/use them.
The reason I think this is an Arch package problem is that I set up a share on the Debian box to share with Arch, and that worked perfectly. When I rebooted Debian as the server, the shares were automatically remounted on the Arch client, and when I rebooted the Arch client, the shares were again mounted properly on reboot.
It's possible I'm doing something wrong with permissions, but it seems unlikely, because:
1) everything was working fine for a long time until recently, when I started noticing this behavior;
2) all the permissions on the shared directory are identical to the ones on the Arch shared directory, all usernames and UIDs are the same, same groups and GIDs, etc.;
3) the shares mount perfectly well manually from the command line; and
4) I set up the Debian share/exports, etc. in about 2 minutes with no problem at all, while I've been dealing with this problem on Arch for 2 days now, changing options and going over everything multiple times until my head is spinning.
It just seems unlikely that the configuration is wrong, although I guess anything is possible. I can provide all the permissions/group info, fstab info, /etc/exports info, etc. if anyone wants to take a closer look.
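For reference, a setup like this one might look roughly as follows (the hostnames, paths, and subnet here are placeholders, not my literal config):

/etc/exports on the server:
# export a directory read-write to the local subnet (sync avoids data loss on crashes)
/srv/share 192.168.0.0/24(rw,sync,no_subtree_check)

/etc/fstab on the client:
# hard,intr: block and retry during a server outage, but allow interrupting with ctrl-c
archbox:/srv/share  /mnt/share  nfs  rw,hard,intr  0  0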
So until this is sorted, I wondered if anyone else is having this problem, or if anyone has ideas about something I might be overlooking. Again, everything *seems* to be set up right, but maybe there's some Arch-specific thing I'm missing. Thanks.
Hi there,
I also have an NFS server where I store everything on my network. All the PCs are running Arch. :P
I use autofs to automatically mount NFS shares from my Arch server, as well as USB and other stuff. It works fine for me; I have rebooted my server several times and there is no problem at all.
I don't use a static mount in the fstab file. I use autofs, and this is the entry in my autofs configuration file:
hdb -rw,soft,intr,rsize=8192,wsize=8192,noatime serverfarm:/professional/hdb
When you mount NFS, it's worth spending some time playing with the hard, soft, intr, and timeout parameters used to mount it.
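In case it helps anyone trying this, that map entry plugs into autofs roughly like so (file locations vary by distro; treat this as a sketch, not my exact setup):

/etc/autofs/auto.master:
# mount points under /mnt/net are managed by the map file below,
# and are unmounted again after 60 seconds idle
/mnt/net  /etc/autofs/auto.nfs  --timeout=60

/etc/autofs/auto.nfs (the line from above):
hdb -rw,soft,intr,rsize=8192,wsize=8192,noatime serverfarm:/professional/hdb

Then accessing /mnt/net/hdb triggers the mount on demand.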
Thanks; at least that's confirmation that it works under some conditions.
I think hard and intr are the defaults, and I've always used those and it's worked fine. Something else must have changed, but I just can't find it.
So it seems something is wrong somewhere after all, and it's probably not a bug; I'm going to have to keep digging. I guess the question is why it mounts fine from the command line, but comes back "permission denied" even to root when the server reboots. Could this have something to do with that weird "nobody" account?
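(For what it's worth, the "nobody" user is usually just root squashing at work: by default the server maps requests from the client's root to nobody. A quick way to rule that out, for testing only since it weakens security, would be an export option like this, with a hypothetical path and subnet:

# no_root_squash disables the root -> nobody mapping for this export
/srv/share 192.168.0.0/24(rw,sync,no_root_squash)

If the permission errors persist with that, squashing isn't the culprit.)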
~bump~
Can anyone who doesn't use autofs confirm that their client NFS shares remount automatically after an Arch server reboot (edit: without a "permission denied" error when accessing the share)?
Doesn't remount here.
> Doesn't remount here.
Okay, now we're getting somewhere. Can you post your server's /etc/exports and your client's /etc/fstab entries for the share?
Another thing I'm just noticing: if I restart the server daemon (/etc/rc.d/nfsd restart), the client gets a "stale NFS file handle" error. I have to unmount and remount the share, and if I then restart the server again, same error. Must be a kernel or NFS bug somewhere, I guess. AFAIK nothing is writing to the share, because it's freshly mounted.
This is really driving me crazy.
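For anyone who wants to reproduce it, the sequence is simply (hypothetical names):

# on the client: mounting works fine
mount -t nfs archbox:/srv/share /mnt/share
# on the server: restart the daemon
/etc/rc.d/nfsd restart
# back on the client: this now fails with "Stale NFS file handle"
ls /mnt/share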
It happens to me as well; I just assumed that was the way it was supposed to work, though. I'm fairly sure it worked the same way under Ubuntu, but that was 6+ months ago.
NFS works just fine for me without any problem. I just mount NFS with soft and timeout=15, as in my earlier post.
I think you should set the timeout to a few seconds, so that if the server is down your autofs will not keep trying to connect, and when your server comes up again autofs can remount it.
> NFS works just fine for me without any problem. I just mount NFS with soft and timeout=15, as in my earlier post.
> I think you should set the timeout to a few seconds, so that if the server is down your autofs will not keep trying to connect, and when your server comes up again autofs can remount it.
But you're using autofs; it should work fine without autofs, which is really for when you have a lot of connections. One client mounts only 2 shares from this server, another 4 or 5, and that small a number of shares shouldn't require autofs. I'd like to find out why it isn't working properly on its own rather than just dropping it for a workaround.
I also definitely prefer not to use "soft":
"soft: If a file request fails, the NFS client will report an error to the process on the client machine requesting the file access. Some programs can handle this with composure, most won't. We do not recommend using this setting; it is a recipe for corrupted files and lost data. You should especially not use this for mail disks --- if you value your mail, that is."
http://nfs.sourceforge.net/nfs-howto/ar01s04.html
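In fstab terms the difference looks like this (hypothetical share; note timeo is in tenths of a second):

# hard (the default): the client retries forever; apps block but no I/O is lost
archbox:/srv/share  /mnt/share  nfs  rw,hard,intr  0  0
# soft: the client gives up after retrans attempts and returns an I/O error to the app
archbox:/srv/share  /mnt/share  nfs  rw,soft,timeo=15,retrans=3  0  0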
ruckus: it's most definitely not supposed to work that way. If a share is mounted on the client and the server reboots or the daemon restarts, the shares are supposed to "remount" (or rather, carry on with the mount they already had) when the server is up again. Managing NFS would be a nightmare without that kind of functionality.
Besides, don't forget that it's not actually a problem with the shares not mounting. It's a problem with them being "permission denied" even to root when they're remounted, or giving a "stale file handle" error when the daemon restarts, which AFAIK is not normal behavior at all.
As I said in the first post, I set up my Debian box to share the other way, with Arch as the client, and the shares mounted on Arch are reestablished after the Debian box reboots. That's the way it should work, but it's not acting right on Arch for some reason.
Until recently I had been using NFS to share files locally between 2 Gentoo boxes via fstab, as you describe. Either machine could reboot and the shares would reconnect correctly. Yesterday I set my computer up with Arch (love it, btw) and re-set up the shares with the same config (the Arch box is now the server); everything connected fine. I haven't tested rebooting while files are open on the server yet (which worked fine with the 2 Gentoo boxes); I'll try that out later and post the results.
OK, I've confirmed the same behavior as you when Arch is running as the NFS server. No changes to the clients, and the server config is the same as on Gentoo. The issue is with the Arch NFS server package, it seems to me. I'll investigate some more.
Add the following to /etc/hosts.allow on your Arch box, and try again, just for S's and G's.
ALL: ALL
Comment it out (or remove it) when you are done testing.
"Be conservative in what you send; be liberal in what you accept." -- Postel's Law
"tacos" -- Cactus' Law
"t̥͍͎̪̪͗a̴̻̩͈͚ͨc̠o̩̙͈ͫͅs͙͎̙͊ ͔͇̫̜t͎̳̀a̜̞̗ͩc̗͍͚o̲̯̿s̖̣̤̙͌ ̖̜̈ț̰̫͓ạ̪͖̳c̲͎͕̰̯̃̈o͉ͅs̪ͪ ̜̻̖̜͕" -- -̖͚̫̙̓-̺̠͇ͤ̃ ̜̪̜ͯZ͔̗̭̞ͪA̝͈̙͖̩L͉̠̺͓G̙̞̦͖O̳̗͍
Offline
hosts.allow is already set up fine.
The problem is with the exportfs command: under Gentoo, if you are restarting, it doesn't bother doing an exportfs -ua.
# When restarting the NFS server, running "exportfs -ua" probably
# isn't what the user wants. Running it causes all entries listed
# in xtab to be removed from the kernel export tables, and the
# xtab file is cleared. This effectively shuts down all NFS
# activity, leaving all clients holding stale NFS filehandles,
# *even* when the NFS server has restarted.
#
# That's what you would want if you were shutting down the NFS
# server for good, or for a long period of time, but not when the
# NFS server will be running again in short order. In this case,
# then "exportfs -r" will reread the xtab, and all the current
# clients will be able to resume NFS activity, *without* needing
# to umount/(re)mount the filesystem.
if [ "$restarting" = no ]; then
ebegin "Unexporting NFS directories"
# Exportfs likes to hang if networking isn't working.
# If that's the case, then try to kill it so the
# shutdown process can continue.
$exportfs -ua 1>&2 &
waitfor_exportfs $!
eend $? "Error unexporting NFS directories"
fi
So it should be possible to adapt this to the Arch way, I guess.
OK, I modified my nfsd script to check for $RUNLEVEL = 0 and only bother doing exportfs -ua then. But something is clearing /var/lib/nfs/xtab anyway... Does killing the daemon in the script cause this? Sorry, I'm not 100% familiar with the Arch way of doing things yet.
Here are my changes:
# kill -9 $RQUOTAD_PID &> /dev/null
# rm /var/run/rpc.rquotad.pid
#fi
if [ "$RUNLEVEL" = "0" ]; then
/usr/sbin/exportfs -au
fi
rm_daemon $DAEMON_NAME
stat_done
;;
reload)
$0 stop
sleep 2
/usr/sbin/exportfs -au
sleep 2
$0 start
;;
Taking out the -ua on restart stops NFS purging the exports. So now the script doesn't purge on restarts, but on startup it still won't re-add the exports that should be there.
/var/lib/nfs/xtab after /etc/rc.d/nfsd stop still contains mount info.
/var/lib/nfs/xtab after /etc/rc.d/nfsd start is empty.
/proc/fs/nfs/exports after /etc/rc.d/nfsd start is empty.
All the entries in /etc/exports, /var/lib/nfs/rmtab, and etab are correct.
Something is not right somewhere; I'm no expert, so I think this is probably beyond me.
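For anyone following along, the relevant state can be checked after each step with something like:

cat /var/lib/nfs/xtab       # what exportfs has recorded as exported
cat /proc/fs/nfs/exports    # what the kernel export table actually contains
/usr/sbin/exportfs -v       # verbose listing of the current exports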
OK, out of pure frustration I just grabbed the Gentoo init script, installed start-stop-daemon, modified the script to run under #!/bin/bash, stuck it in /etc/rc.d, and rebooted, and everything works: I can reboot my computer and the clients reconnect, or restart the daemon and they reconnect.
Here's the script I'm using:
#!/bin/bash
# Copyright 1999-2005 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2
# $Header: /var/cvsroot/gentoo-x86/net-fs/nfs-utils/files/nfs,v 1.14 2007/03/24 10:14:43 vapier Exp $
#---------------------------------------------------------------------------
# This script starts/stops the following
# rpc.statd if necessary (also checked by init.d/nfsmount)
# rpc.rquotad if exists (from quota package)
# rpc.nfsd
# rpc.mountd
#---------------------------------------------------------------------------
# NB: Config is in /etc/conf.d/nfs
opts="reload"
# This variable is used for controlling whether or not to run exportfs -ua;
# see stop() for more information
restarting=no
# The binary locations
exportfs=/usr/sbin/exportfs
gssd=/usr/sbin/rpc.gssd
idmapd=/usr/sbin/rpc.idmapd
mountd=/usr/sbin/rpc.mountd
nfsd=/usr/sbin/rpc.nfsd
rquotad=/usr/sbin/rpc.rquotad
statd=/usr/sbin/rpc.statd
svcgssd=/usr/sbin/rpc.svcgssd
mkdir_nfsdirs() {
local d
for d in /var/lib/nfs/{rpc_pipefs,v4recovery,v4root} ; do
[[ ! -d ${d} ]] && mkdir -p "${d}"
done
}
mount_pipefs() {
if grep -q rpc_pipefs /proc/filesystems ; then
if ! grep -q "rpc_pipefs /var/lib/nfs/rpc_pipefs" /proc/mounts ; then
mount -t rpc_pipefs rpc_pipefs /var/lib/nfs/rpc_pipefs
fi
fi
}
umount_pipefs() {
if [[ ${restarting} == "no" ]] ; then
if grep -q "rpc_pipefs /var/lib/nfs/rpc_pipefs" /proc/mounts ; then
umount /var/lib/nfs/rpc_pipefs
fi
fi
}
start_gssd() {
[[ ! -x ${gssd} || ! -x ${svcgssd} ]] && return 0
local ret1 ret2
${gssd} ${RPCGSSDDOPTS}
ret1=$?
${svcgssd} ${RPCSVCGSSDDOPTS}
ret2=$?
return $((${ret1} + ${ret2}))
}
stop_gssd() {
[[ ! -x ${gssd} || ! -x ${svcgssd} ]] && return 0
local ret1 ret2
start-stop-daemon --stop --quiet --exec ${gssd}
ret1=$?
start-stop-daemon --stop --quiet --exec ${svcgssd}
ret2=$?
return $((${ret1} + ${ret2}))
}
start_idmapd() {
[[ ! -x ${idmapd} ]] && return 0
${idmapd} ${RPCIDMAPDOPTS}
}
stop_idmapd() {
[[ ! -x ${idmapd} ]] && return 0
local ret
start-stop-daemon --stop --quiet --exec ${idmapd}
ret=$?
umount_pipefs
return ${ret}
}
start_statd() {
# Don't start rpc.statd if already started by init.d/nfsmount
killall -0 rpc.statd &>/dev/null && return 0
start-stop-daemon --start --quiet --exec \
$statd -- $RPCSTATDOPTS 1>&2
}
stop_statd() {
# Don't stop rpc.statd if it's in use by init.d/nfsmount.
mount -t nfs | grep -q . && return 0
# Make sure it's actually running
killall -0 rpc.statd &>/dev/null || return 0
# Okay, all tests passed, stop rpc.statd
start-stop-daemon --stop --quiet --exec $statd 1>&2
}
waitfor_exportfs() {
local pid=$1
( sleep ${EXPORTFSTIMEOUT:-30}; kill -9 $pid &>/dev/null ) &
wait $1
}
case "$1" in
start)
# Make sure nfs support is loaded in the kernel #64709
if [[ -e /proc/modules ]] && ! grep -qs nfsd /proc/filesystems ; then
modprobe nfsd &> /dev/null
fi
# This is the new "kernel 2.6 way" to handle the exports file
if grep -qs nfsd /proc/filesystems ; then
if ! grep -qs "^nfsd[[:space:]]/proc/fs/nfsd[[:space:]]" /proc/mounts ; then
mount -t nfsd nfsd /proc/fs/nfsd
fi
fi
# now that nfsd is mounted inside /proc, we can safely start mountd later
mkdir_nfsdirs
mount_pipefs
start_idmapd
start_gssd
start_statd
# Exportfs likes to hang if networking isn't working.
# If that's the case, then try to kill it so the
# bootup process can continue.
if grep -q '^/' /etc/exports &>/dev/null; then
$exportfs -r 1>&2 &
waitfor_exportfs $!
fi
if [ -x $rquotad ]; then
start-stop-daemon --start --quiet --exec \
$rquotad -- $RPCRQUOTADOPTS 1>&2
fi
start-stop-daemon --start --quiet --exec \
$nfsd --name nfsd -- $RPCNFSDCOUNT 1>&2
# Start mountd
start-stop-daemon --start --quiet --exec \
$mountd -- $RPCMOUNTDOPTS 1>&2
;;
stop)
# Don't check NFSSERVER variable since it might have changed,
# instead use --oknodo to smooth things over
start-stop-daemon --stop --quiet --oknodo \
--exec $mountd 1>&2
# nfsd sets its process name to [nfsd] so don't look for $nfsd
start-stop-daemon --stop --quiet --oknodo \
--name nfsd --user root --signal 2 1>&2
if [ -x $rquotad ]; then
start-stop-daemon --stop --quiet --oknodo \
--exec $rquotad 1>&2
fi
# When restarting the NFS server, running "exportfs -ua" probably
# isn't what the user wants. Running it causes all entries listed
# in xtab to be removed from the kernel export tables, and the
# xtab file is cleared. This effectively shuts down all NFS
# activity, leaving all clients holding stale NFS filehandles,
# *even* when the NFS server has restarted.
#
# That's what you would want if you were shutting down the NFS
# server for good, or for a long period of time, but not when the
# NFS server will be running again in short order. In this case,
# then "exportfs -r" will reread the xtab, and all the current
# clients will be able to resume NFS activity, *without* needing
# to umount/(re)mount the filesystem.
if [ "$restarting" = no ]; then
# Exportfs likes to hang if networking isn't working.
# If that's the case, then try to kill it so the
# shutdown process can continue.
$exportfs -ua 1>&2 &
waitfor_exportfs $!
fi
stop_statd
stop_gssd
stop_idmapd
umount_pipefs
;;
reload)
# Exportfs likes to hang if networking isn't working.
# If that's the case, then try to kill it so the
# bootup process can continue.
$exportfs -r 1>&2 &
waitfor_exportfs $!
;;
restart)
# See long comment in stop() regarding "restarting" and exportfs -ua.
# NB: svc_stop/svc_start come from Gentoo's runscript framework and do
# not exist under plain bash; to use this target here, refactor the
# stop and start cases into shell functions (calling "$0 stop; $0 start"
# instead would lose the restarting=yes flag).
restarting=yes
svc_stop
svc_start
;;
*)
echo "usage: $0 {start|stop|restart}"
esac
exit 0
Is this the answer?
# This is the new "kernel 2.6 way" to handle the exports file
if grep -qs nfsd /proc/filesystems ; then
if ! grep -qs "^nfsd[[:space:]]/proc/fs/nfsd[[:space:]]" /proc/mounts ; then
mount -t nfsd nfsd /proc/fs/nfsd
fi
fi
That seems to be the only real difference.
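That would fit the symptoms: with the kernel 2.6 interface, exportfs and mountd appear to feed the kernel export table through the nfsd filesystem, so if /proc/fs/nfsd isn't mounted when exportfs -r runs at startup, the exports never reach the kernel, which matches the empty /proc/fs/nfs/exports seen above. Presumably the same mount could also be done from fstab (untested sketch):

# keep the nfsd control filesystem mounted for exportfs/mountd
nfsd  /proc/fs/nfsd  nfsd  rw  0  0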
Sorry for so many posts; yes, that fixes it.
Here is the modified Arch script.
#!/bin/bash
# source application-specific settings
[ -f /etc/conf.d/nfs ] && . /etc/conf.d/nfs
. /etc/rc.conf
. /etc/rc.d/functions
DAEMON_NAME=nfsd
#RQUOTAD_PID=`pidof -o %PPID /usr/sbin/rpc.rquotad`
NFSD_PID=`pidof -o %PPID nfsd`
MOUNTD_PID=`pidof -o %PPID /usr/sbin/rpc.mountd`
case "$1" in
start)
stat_busy "Starting $DAEMON_NAME"
# This is the new "kernel 2.6 way" to handle the exports file
if grep -qs nfsd /proc/filesystems ; then
if ! grep -qs "^nfsd[[:space:]]/proc/fs/nfsd[[:space:]]" /proc/mounts ; then
mount -t nfsd nfsd /proc/fs/nfsd
fi
fi
/usr/sbin/exportfs -r
if [ ! -f /var/run/daemons/portmap ]; then
echo "ERROR: portmap is not running"
stat_fail
exit 1
fi
if [ ! -f /var/run/daemons/nfslock ]; then
stat_fail
echo "ERROR: nfslock is not running"
exit 1
fi
#[ -z "$RQUOTAD_PID" ] && /usr/sbin/rpc.rquotad $RQUOTAD_OPTS
#if [ $? -gt 0 ]; then
# stat_fail
# exit 1
#else
# echo `pidof -o %PPID /usr/sbin/rpc.rquotad` > /var/run/rpc.rquotad.pid
#fi
[ -z "$NFSD_PID" ] && /usr/sbin/rpc.nfsd $NFSD_OPTS
if [ $? -gt 0 ]; then
stat_fail
exit 1
else
echo `pidof -o %PPID nfsd` > /var/run/rpc.nfsd.pid
fi
[ -z "$MOUNTD_PID" ] && /usr/sbin/rpc.mountd $MOUNTD_OPTS
if [ $? -gt 0 ]; then
stat_fail
exit 1
else
echo `pidof -o %PPID /usr/sbin/rpc.mountd` > /var/run/rpc.mountd.pid
fi
add_daemon $DAEMON_NAME
stat_done
;;
stop)
stat_busy "Stopping $DAEMON_NAME"
[ ! -z "$MOUNTD_PID" ] && kill $MOUNTD_PID &> /dev/null
if [ $? -gt 0 ]; then
stat_fail
exit 1
else
rm /var/run/rpc.mountd.pid &> /dev/null
fi
sleep 1
[ ! -z "$NFSD_PID" ] && kill $NFSD_PID &> /dev/null
if [ $? -gt 0 ]; then
stat_fail
exit 1
else
kill -9 $NFSD_PID &> /dev/null
rm /var/run/rpc.nfsd.pid &> /dev/null
fi
#[ ! -z "$RQUOTAD_PID" ] && kill $RQUOTAD_PID &> /dev/null
#if [ $? -gt 0 ]; then
# stat_fail
# exit 1
#else
# kill -9 $RQUOTAD_PID &> /dev/null
# rm /var/run/rpc.rquotad.pid
#fi
if [ "$RUNLEVEL" = "0" ]; then
/usr/sbin/exportfs -au
fi
rm_daemon $DAEMON_NAME
stat_done
;;
reload)
# Note: exportfs -au would unexport everything; -r re-syncs the kernel
# export table with /etc/exports, which is what a reload should do
/usr/sbin/exportfs -r
;;
restart)
$0 stop
sleep 2
$0 start
;;
*)
echo "usage: $0 {start|stop|restart}"
esac
exit 0
> Sorry for so many posts; yes, that fixes it.
That fixes it for me, too! Shares reconnect properly after rebooting and after restarting nfsd. Something so simple, and yet I probably never would have found it. Great work!
I guess this script should now be incorporated into the Arch package. Should we file a bug report to notify the appropriate devs, or is this your department, cactus?
Please file a bug report.
> I guess this script should now be incorporated into the Arch package. Should we file a bug report to notify the appropriate devs, or is this your department, cactus?
File a bug report, with a patch of the fix, if you could.
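If you haven't made a patch before, something like this produces one the devs can apply (filenames hypothetical):

# unified diff of the stock script against the fixed one
diff -u nfsd.orig nfsd > nfsd-exports.patch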
"Be conservative in what you send; be liberal in what you accept." -- Postel's Law
"tacos" -- Cactus' Law
"t̥͍͎̪̪͗a̴̻̩͈͚ͨc̠o̩̙͈ͫͅs͙͎̙͊ ͔͇̫̜t͎̳̀a̜̞̗ͩc̗͍͚o̲̯̿s̖̣̤̙͌ ̖̜̈ț̰̫͓ạ̪͖̳c̲͎͕̰̯̃̈o͉ͅs̪ͪ ̜̻̖̜͕" -- -̖͚̫̙̓-̺̠͇ͤ̃ ̜̪̜ͯZ͔̗̭̞ͪA̝͈̙͖̩L͉̠̺͓G̙̞̦͖O̳̗͍
Offline
I set up my shares through the KDE Control Center. There are lots of shared directories listed in it, but there are no entries at all in /etc/exports, and when I add something to /etc/exports it shows up in the KDE Control Center as a duplicate line...
Where does KDE save my rules?
Good work, shazeal; greatly appreciated. I'll be putting the 1.1.0 release in testing shortly, and I was about to start investigating this.
I've picked up your bug report #7368, and the package will be in testing later today.