[solved] smartd ignoring smartd.conf and always sending mails?

palmaway · 2014-10-10 10:26:53

Hi all!

I have just configured the smartd daemon, but I am having trouble limiting when it sends mail. I want smartd to send an e-mail (in fact I execute a script instead) only when pre-fail or usage attributes reach their threshold, or when the health check fails. Basically, the options -p -f -H. This is my smartd.conf:

/dev/sda -H -p -f -s S/../../7/01 -m root -M exec /usr/local/bin/smartnotify
/dev/sdb -H -p -f -s S/../../7/02 -m root -M exec /usr/local/bin/smartnotify

However, an "8 Currently unreadable (pending) sectors" warning for /dev/sda keeps coming up and triggers the mail (well, the smartnotify script). Why is that? From my understanding of the smartd.conf manual, the -p option should only report on attributes of the "Pre-fail" type, while -f should notify when a non-"Pre-fail" attribute crosses its threshold, not whenever it changes (that should be -u, right?). Then why does smartd trigger the notification even though the pending sectors attribute (ID 197) is far from its threshold? Below is the (partial) output of smartctl -a:

Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   062    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   100   100   040    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0007   186   186   033    Pre-fail  Always       -       2
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       797
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   040    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0012   091   091   000    Old_age   Always       -       3955
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       796
191 G-Sense_Error_Rate      0x000a   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       33
193 Load_Cycle_Count        0x0012   045   045   000    Old_age   Always       -       556861
194 Temperature_Celsius     0x0002   206   206   000    Old_age   Always       -       29 (Min/Max 14/47)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       8
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0
223 Load_Retry_Count        0x000a   100   100   000    Old_age   Always       -       0
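
To make the "far from the threshold" point concrete, here is a minimal sketch that reads attribute rows like the ones above and prints the margin between VALUE and THRESH for each. The function name `smart_margins` is made up for illustration, and the rule used (failing when the threshold is non-zero and VALUE <= THRESH) is my assumption about how a threshold failure is defined, so treat it as a sanity check rather than what smartd does internally.

```shell
#!/bin/sh
# Reads `smartctl -A` style attribute rows on stdin and prints, per attribute:
# ID, name, type, margin (VALUE - THRESH), raw value and a status word.
# Assumption: an attribute counts as failing when THRESH > 0 and VALUE <= THRESH.
smart_margins() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 10 {
        value = $4 + 0      # normalized VALUE column
        thresh = $6 + 0     # THRESH column
        status = (thresh > 0 && value <= thresh) ? "FAILING" : "ok"
        printf "%s %s %s margin=%d raw=%s %s\n", $1, $2, $7, value - thresh, $10, status
    }'
}

# Example with two rows copied from the table above.
# On a live system this would be: smartctl -A /dev/sda | smart_margins
smart_margins <<'EOF'
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       8
EOF
```

For the two sample rows this prints a margin of 95 for attribute 5 and 100 for attribute 197, both "ok", so by this measure neither is anywhere near a threshold failure even though the raw pending sector count is 8.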

Thanks a lot for any help, this is driving me crazy!

P.S. I know having bad sectors is not a good thing, so please don't just tell me to change the drive, that's not my question!

Last edited by palmaway (2014-10-12 00:35:06)

Spider.007 · 2014-10-11 10:25:42

What does `journalctl -u smartd.service` say? When smartd starts, it tells you whether it parsed the config file successfully; for example:

smartd[342]: Opened configuration file /etc/smartd.conf
smartd[342]: Drive: DEVICESCAN, implied '-a' Directive on line 23 of file /etc/smartd.conf
smartd[342]: Configuration file /etc/smartd.conf was parsed, found DEVICESCAN, scanning devices

palmaway · 2014-10-11 23:42:40

The configuration seems to be fine... The log says "Configuration file /etc/smartd.conf parsed.", but smartd still triggers the notification for some reason:

ott 12 01:40:08 liberty systemd[1]: Starting Self Monitoring and Reporting Technology (SMART) Daemon...
ott 12 01:40:08 liberty systemd[1]: Started Self Monitoring and Reporting Technology (SMART) Daemon.
ott 12 01:40:08 liberty smartd[10336]: smartd 6.3 2014-07-26 r3976 [x86_64-linux-3.16.4-1-ARCH] (local build)
ott 12 01:40:08 liberty smartd[10336]: Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org
ott 12 01:40:08 liberty smartd[10336]: Opened configuration file /etc/smartd.conf
ott 12 01:40:08 liberty smartd[10336]: Configuration file /etc/smartd.conf parsed.
ott 12 01:40:08 liberty smartd[10336]: Device: /dev/sda, type changed from 'scsi' to 'sat'
ott 12 01:40:08 liberty smartd[10336]: Device: /dev/sda [SAT], opened
ott 12 01:40:08 liberty smartd[10336]: Device: /dev/sda [SAT], Hitachi HTS727575A9E364, S/N:J3740084H9K3PE, WWN:5-000cca-68cd26f1b, FW:JF4OA0D0, 750 GB
ott 12 01:40:08 liberty smartd[10336]: Device: /dev/sda [SAT], found in smartd database: Hitachi/HGST Travelstar 7K750
ott 12 01:40:08 liberty smartd[10336]: Device: /dev/sda [SAT], is SMART capable. Adding to "monitor" list.
ott 12 01:40:08 liberty smartd[10336]: Device: /dev/sdb, type changed from 'scsi' to 'sat'
ott 12 01:40:08 liberty smartd[10336]: Device: /dev/sdb [SAT], opened
ott 12 01:40:08 liberty smartd[10336]: Device: /dev/sdb [SAT], SanDisk SSD i100 16GB, S/N:123600107147, WWN:5-001b44-7d2b8f28b, FW:11.56.04, 16.0 GB
ott 12 01:40:08 liberty smartd[10336]: Device: /dev/sdb [SAT], found in smartd database: SanDisk based SSDs
ott 12 01:40:08 liberty smartd[10336]: Device: /dev/sdb [SAT], can't monitor Current_Pending_Sector count - no Attribute 197
ott 12 01:40:08 liberty smartd[10336]: Device: /dev/sdb [SAT], can't monitor Offline_Uncorrectable count - no Attribute 198
ott 12 01:40:08 liberty smartd[10336]: Device: /dev/sdb [SAT], is SMART capable. Adding to "monitor" list.
ott 12 01:40:08 liberty smartd[10336]: Monitoring 2 ATA and 0 SCSI devices
ott 12 01:40:08 liberty smartd[10336]: Device: /dev/sda [SAT], 8 Currently unreadable (pending) sectors
ott 12 01:40:08 liberty smartd[10336]: Sending warning via /usr/local/bin/smartnotify to root ...
ott 12 01:40:08 liberty smartd[10336]: Warning via /usr/local/bin/smartnotify to root: successful

Last edited by palmaway (2014-10-11 23:43:24)

palmaway · 2014-10-11 23:53:38

Just a quick note: I modified the configuration by adding -d ata to get rid of the "type changed from 'scsi' to 'sat'" message in the log. No change in behavior...

palmaway · 2014-10-12 00:33:53

I found out that the smartd daemon thinks attribute 197 is pre-fail, even though the hard drive manufacturer disagrees (see the table above). I wonder why... This means that both -H and -f would trigger the notification. I solved it by using

/dev/sda -d ata -H -p -f -C 197+ -s S/../../7/01 -m root -M exec /usr/local/bin/smartnotify

which warns me only if the number of unreadable (pending) sectors increases.
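
For anyone who wants to see which check actually fired: according to the smartd.conf man page, smartd exports environment variables such as SMARTD_FAILTYPE, SMARTD_DEVICE and SMARTD_MESSAGE to the script given with -M exec. Below is a minimal sketch of how a notify script could use them. The function name `should_notify` and the ignore list are hypothetical and are not taken from my actual smartnotify script.

```shell
#!/bin/sh
# Hypothetical filter for a '-M exec' script.
# SMARTD_FAILTYPE, SMARTD_DEVICE and SMARTD_MESSAGE are set by smartd when it runs the script.
# Returns 0 (notify) unless the failure type is on the ignore list.
should_notify() {
    case "$1" in
        # Assumed ignore list: pending and offline-uncorrectable sector warnings.
        CurrentPendingSector|OfflineUncorrectableSector) return 1 ;;
        *) return 0 ;;
    esac
}

if should_notify "${SMARTD_FAILTYPE:-}"; then
    # Placeholder for the real notification; this sketch only prints the reason.
    printf '%s [%s]: %s\n' "${SMARTD_DEVICE:-unknown}" "${SMARTD_FAILTYPE:-none}" "${SMARTD_MESSAGE:-no message}"
fi
```

Printing SMARTD_FAILTYPE this way shows whether a warning came from the health check, an attribute check or the pending sector check. Filtering in the script is only an alternative to the -C 197+ directive above, and it hides the warning completely instead of reporting increases.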

Last edited by palmaway (2014-10-12 00:36:26)
