#1 2011-05-02 20:22:54

frojnd
Member
Registered: 2008-09-20
Posts: 125

Server hangs after some time. Can't find the reason why.

Hi there!

This has been happening for 4+ months now: after some time the server hangs. I can't ssh in or use the keyboard, so I have to hard reset! I have performed several hardware tests:
- memory check
- hard disk check
- CPU test (core temps were at most 65C at 100% CPU load for more than 45 minutes...)
- replaced the power supply

And yet it keeps hanging. I've noticed in munin that memory usage, and also swap usage, is suspiciously high before the hangs happen:

Memory usage - by year
memoryyear.png

Memory usage - by month
memorymonth.png

Memory usage - by week
memoryweek.png

Memory usage - by day
memoryday.png
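
The munin graphs show that memory and swap fill up, but not which process is responsible. A small script along these lines could be run from cron and appended to a file that survives the hard reset. This is only a rough sketch, not something from the setup described above: it assumes a Linux /proc filesystem, and the names parse_meminfo, swap_used_kb, top_rss and snapshot are made up for illustration.

```python
import os
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB or counts)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def swap_used_kb(info):
    """Swap in use, in kB, from a parsed meminfo dict."""
    return info.get("SwapTotal", 0) - info.get("SwapFree", 0)


def top_rss(n=5, proc="/proc"):
    """Return up to n (rss_kb, pid, name) tuples, largest resident set first."""
    rows = []
    for pid in os.listdir(proc):
        if not pid.isdigit():
            continue
        name, rss = "?", 0
        try:
            with open(os.path.join(proc, pid, "status")) as f:
                for line in f:
                    if line.startswith("Name:"):
                        name = line.split(":", 1)[1].strip()
                    elif line.startswith("VmRSS:"):
                        rss = int(line.split()[1])
        except (OSError, ValueError):
            continue  # process exited (or was unreadable) while we looked at it
        rows.append((rss, int(pid), name))
    rows.sort(reverse=True)
    return rows[:n]


def snapshot(n=5):
    """Build a one-line summary: timestamp, free memory, used swap, top processes."""
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    procs = ", ".join("%s[%d]=%dkB" % (name, pid, rss) for rss, pid, name in top_rss(n))
    return "%s MemFree=%dkB SwapUsed=%dkB top: %s" % (
        time.strftime("%Y-%m-%d %H:%M:%S"),
        info.get("MemFree", 0),
        swap_used_kb(info),
        procs,
    )
```

Calling print(snapshot()) once a minute and redirecting the output to a log file should leave a record of the biggest memory consumers right up to the hang, assuming the machine is still able to run cron jobs at that point.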

#2 2011-05-05 22:09:30

frojnd
Member
Registered: 2008-09-20
Posts: 125

Re: Server hangs after some time. Can't find the reason why.

Update: some new entries from /var/log/errors.log:

May  5 23:22:08 localhost smbd[7914]:   read_fd_with_timeout: client 0.0.0.0 read error = Connection reset by peer.
May  5 23:22:08 localhost smbd[7915]: [2011/05/05 23:22:08.530207,  0] lib/util_sock.c:474(read_fd_with_timeout)
May  5 23:22:08 localhost smbd[7915]: [2011/05/05 23:22:08.530355,  0] lib/util_sock.c:1441(get_peer_addr_internal)
May  5 23:22:08 localhost smbd[7915]:   getpeername failed. Error was Transport endpoint is not connected
May  5 23:22:08 localhost smbd[7915]:   read_fd_with_timeout: client 0.0.0.0 read error = Connection reset by peer.
May  5 23:27:05 localhost kernel: pci 0000:00:00.0: BAR 0: address space collision on of device [0xf0000000-0xf7ffffff]
May  5 23:27:19 localhost smbd[3785]: [2011/05/05 23:27:19.756229,  0] param/loadparm.c:7504(lp_set_enum_parm)
May  5 23:27:19 localhost smbd[3785]:   WARNING: Ignoring invalid value 'none' for parameter 'printing'
May  5 23:27:19 localhost smbd[3785]: [2011/05/05 23:27:19.790512,  0] param/loadparm.c:7504(lp_set_enum_parm)
May  5 23:27:19 localhost smbd[3785]:   WARNING: Ignoring invalid value 'none' for parameter 'printing'
May  5 23:27:19 localhost smbd[3785]: [2011/05/05 23:27:19.796415,  0] param/loadparm.c:6883(service_ok)
May  5 23:27:19 localhost smbd[3785]:   WARNING: No path in service public - making it unavailable!
May  5 23:27:20 localhost nmbd[3812]: [2011/05/05 23:27:20.186518,  0] param/loadparm.c:7504(lp_set_enum_parm)
May  5 23:27:20 localhost nmbd[3812]:   WARNING: Ignoring invalid value 'none' for parameter 'printing'
May  5 23:27:20 localhost nmbd[3812]: [2011/05/05 23:27:20.195667,  0] param/loadparm.c:7504(lp_set_enum_parm)
May  5 23:27:20 localhost nmbd[3812]:   WARNING: Ignoring invalid value 'none' for parameter 'printing'
May  5 23:27:20 localhost smbd[3810]: [2011/05/05 23:27:20.238079,  0] smbd/server.c:500(smbd_open_one_socket)
May  5 23:27:20 localhost smbd[3810]:   smbd_open_once_socket: open_socket_in: Address already in use
May  5 23:27:43 localhost nmbd[3815]: [2011/05/05 23:27:43.370538,  0] nmbd/nmbd_become_lmb.c:395(become_local_master_stage2)
May  5 23:27:43 localhost nmbd[3815]:   *****
May  5 23:27:43 localhost nmbd[3815]:   
May  5 23:27:43 localhost nmbd[3815]:   Samba name server ARDOMA is now a local master browser for workgroup LINUX on subnet 192.168.1.101
May  5 23:27:43 localhost nmbd[3815]:   
May  5 23:27:43 localhost nmbd[3815]:   *****
May  5 23:36:58 localhost smbd[4355]: [2011/05/05 23:36:58.635887,  0] lib/util_sock.c:474(read_fd_with_timeout)
May  5 23:36:58 localhost smbd[4355]: [2011/05/05 23:36:58.644063,  0] lib/util_sock.c:1441(get_peer_addr_internal)
May  5 23:36:58 localhost smbd[4355]:   getpeername failed. Error was Transport endpoint is not connected
May  5 23:36:58 localhost smbd[4355]:   read_fd_with_timeout: client 0.0.0.0 read error = Connection reset by peer.
May  5 23:36:59 localhost smbd[4356]: [2011/05/05 23:36:59.202813,  0] lib/util_sock.c:474(read_fd_with_timeout)
May  5 23:36:59 localhost smbd[4356]: [2011/05/05 23:36:59.202998,  0] lib/util_sock.c:1441(get_peer_addr_internal)
May  5 23:36:59 localhost smbd[4356]:   getpeername failed. Error was Transport endpoint is not connected
May  5 23:36:59 localhost smbd[4356]:   read_fd_with_timeout: client 0.0.0.0 read error = Connection reset by peer.
May  5 23:42:04 localhost smbd[4858]: [2011/05/05 23:42:04.025940,  0] lib/util_sock.c:474(read_fd_with_timeout)
May  5 23:42:04 localhost smbd[4858]: [2011/05/05 23:42:04.026083,  0] lib/util_sock.c:1441(get_peer_addr_internal)
May  5 23:42:04 localhost smbd[4858]:   getpeername failed. Error was Transport endpoint is not connected
May  5 23:42:04 localhost smbd[4858]:   read_fd_with_timeout: client 0.0.0.0 read error = Connection reset by peer.
May  5 23:42:04 localhost smbd[4859]: [2011/05/05 23:42:04.485853,  0] lib/util_sock.c:474(read_fd_with_timeout)
May  5 23:42:04 localhost smbd[4859]: [2011/05/05 23:42:04.485982,  0] lib/util_sock.c:1441(get_peer_addr_internal)
May  5 23:42:04 localhost smbd[4859]:   getpeername failed. Error was Transport endpoint is not connected
May  5 23:42:04 localhost smbd[4859]:   read_fd_with_timeout: client 0.0.0.0 read error = Connection reset by peer.
May  5 23:47:09 localhost smbd[5364]: [2011/05/05 23:47:09.348653,  0] lib/util_sock.c:474(read_fd_with_timeout)
May  5 23:47:09 localhost smbd[5364]: [2011/05/05 23:47:09.348808,  0] lib/util_sock.c:1441(get_peer_addr_internal)
May  5 23:47:09 localhost smbd[5364]:   getpeername failed. Error was Transport endpoint is not connected
May  5 23:47:09 localhost smbd[5364]:   read_fd_with_timeout: client 0.0.0.0 read error = Connection reset by peer.
May  5 23:47:09 localhost smbd[5365]: [2011/05/05 23:47:09.862611,  0] lib/util_sock.c:474(read_fd_with_timeout)
May  5 23:47:09 localhost smbd[5365]: [2011/05/05 23:47:09.862700,  0] lib/util_sock.c:1441(get_peer_addr_internal)
May  5 23:47:09 localhost smbd[5365]:   getpeername failed. Error was Transport endpoint is not connected
May  5 23:47:09 localhost smbd[5365]:   read_fd_with_timeout: client 0.0.0.0 read error = Connection reset by peer.
May  5 23:52:10 localhost smbd[5872]: [2011/05/05 23:52:10.784172,  0] lib/util_sock.c:474(read_fd_with_timeout)
May  5 23:52:10 localhost smbd[5872]: [2011/05/05 23:52:10.784304,  0] lib/util_sock.c:1441(get_peer_addr_internal)
May  5 23:52:10 localhost smbd[5872]:   getpeername failed. Error was Transport endpoint is not connected
May  5 23:52:10 localhost smbd[5872]:   read_fd_with_timeout: client 0.0.0.0 read error = Connection reset by peer.
May  5 23:52:11 localhost smbd[5873]: [2011/05/05 23:52:11.273104,  0] lib/util_sock.c:474(read_fd_with_timeout)
May  5 23:52:11 localhost smbd[5873]: [2011/05/05 23:52:11.273233,  0] lib/util_sock.c:1441(get_peer_addr_internal)
May  5 23:52:11 localhost smbd[5873]:   getpeername failed. Error was Transport endpoint is not connected
May  5 23:52:11 localhost smbd[5873]:   read_fd_with_timeout: client 0.0.0.0 read error = Connection reset by peer.
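
Most of the lines above are the same few Samba messages repeating roughly every five minutes, so counting entries per daemon and message makes the rare ones (such as the single kernel BAR 0 line) easier to spot. A rough sketch in Python, assuming the classic syslog layout shown above; the regular expressions and the summarize name are my own and do not come from any existing tool.

```python
import re
from collections import Counter

# "May  5 23:22:08 localhost smbd[7915]: message" -> host, program, message
LINE = re.compile(r"^\w{3}\s+\d+\s[\d:]{8}\s(\S+)\s([^\s:\[]+)(?:\[\d+\])?:\s*(.*)$")
# Samba's own "[2011/05/05 23:22:08.530207,  0] " prefix inside the message
SAMBA_HEADER = re.compile(r"^\[\d{4}/\d\d/\d\d [\d:.]+,\s+\d+\]\s*")


def summarize(lines):
    """Count log entries per (program, message), ignoring timestamps and PIDs."""
    counts = Counter()
    for line in lines:
        m = LINE.match(line.rstrip("\n"))
        if not m:
            continue  # not in the assumed syslog layout
        _host, prog, msg = m.groups()
        msg = SAMBA_HEADER.sub("", msg).strip()
        if msg:
            counts[(prog, msg)] += 1
    return counts
```

For example, passing an open file object for the log to summarize and then iterating over reversed(counts.most_common()) would list the least frequent messages first.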

#3 2011-05-12 09:48:19

frojnd
Member
Registered: 2008-09-20
Posts: 125

Re: Server hangs after some time. Can't find the reason why.

Yeah, I am going to install FreeBSD over Arch Linux and see how it goes... no real solutions on the forums...
