Hello,
For about a month my system's performance has been considerably worse. This seems to be correlated with jbd2 eating up system resources during intense disk I/O. Whenever the system accesses the disk heavily (e.g. during pacman -Syu), it becomes unresponsive, with graphical programs freezing and similar problems.
Investigating with iotop and top, I found that jbd2 is probably what is eating up the resources and making the system slow and unresponsive.
I suspect this could be a kernel bug, as I found other people with similar issues:
http://forums.opensuse.org/english/get- … rface.html
Does anyone know something about this? Ideas, suggestions, fixes?
Thanks,
Fabio Varesano
Wait for linux 3.3 or try it from [testing].
It should bring some memory management improvements.
Just upgraded to Linux 3.3 .. exactly the same problem. Any help?
Posting your fstab and the output of 'mount' would give some information. Are the partitions near full capacity, say 70% or more? Has fsck been run on the partitions?
I probably can't help much but I know more information is needed.
OK, no problem. fsck is run each time the system asks me to (about once a month).
[fabio@gamma ~]$ mount
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
sys on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
/dev on /dev type devtmpfs (rw,nosuid,relatime,size=1978776k,nr_inodes=494694,mode=755)
run on /run type tmpfs (rw,nosuid,nodev,relatime,mode=755)
/dev/sda5 on / type ext3 (rw,relatime,user_xattr,acl,barrier=1,nodelalloc,data=ordered)
devpts on /dev/pts type devpts (rw,relatime,mode=600,ptmxmode=000)
shm on /dev/shm type tmpfs (rw,relatime)
/dev/sda7 on /home type ext3 (rw,relatime,user_xattr,acl,barrier=1,nodelalloc,data=ordered)
/dev/sda2 on /mnt/win_c type fuseblk (rw,nosuid,nodev,noexec,relatime,user_id=0,group_id=0,default_permissions,allow_other,blksize=4096)
/dev/sda8 on /mnt/misc type ext3 (rw,relatime,user_xattr,acl,barrier=1,nodelalloc,data=ordered)
gvfs-fuse-daemon on /home/fabio/.gvfs type fuse.gvfs-fuse-daemon (rw,nosuid,nodev,relatime,user_id=1001,group_id=100)
[fabio@gamma ~]$ cat /etc/fstab
#
# /etc/fstab: static file system information
#
# <file system> <dir> <type> <options> <dump> <pass>
devpts /dev/pts devpts defaults 0 0
shm /dev/shm tmpfs defaults 0 0
/dev/cdrom /mnt/cdrom iso9660 ro,user,noauto,unhide 0 0
/dev/dvd /mnt/dvd udf ro,user,noauto,unhide 0 0
/dev/sda5 / ext3 defaults 0 1
/dev/sda6 swap swap defaults 0 0
/dev/sda7 /home ext3 defaults,user_xattr 0 1
/dev/sdb1 /mnt/pen vfat noauto,owner,user,uid=1001,gid=100 1 0
/dev/sda2 /mnt/win_c ntfs-3g user,uid=1001,gid=100,fmask=0113,dmask=0002 0 0
/dev/sda8 /mnt/misc ext3 defaults 0 1
/dev/sdb1 /mnt/black_win_e ext3 defaults 0 0
/dev/sdb2 /mnt/sea_win_c ntfs-3g noauto,user,uid=1001,gid=100,fmask=0113,dmask=0002 0 0
/dev/sdb1 /mnt/sea_home ext3 noauto 0 0
[fabio@gamma ~]$ df -h
Filesystem Size Used Avail Use% Mounted on
rootfs 46G 18G 26G 42% /
/dev 1.9G 0 1.9G 0% /dev
run 1.9G 316K 1.9G 1% /run
/dev/sda5 46G 18G 26G 42% /
shm 1.9G 0 1.9G 0% /dev/shm
/dev/sda7 275G 39G 223G 15% /home
/dev/sda2 145G 78G 68G 54% /mnt/win_c
/dev/sda8 205G 43G 152G 23% /mnt/misc
[root@gamma ~]# fdisk -l
Disk /dev/sda: 750.2 GB, 750156374016 bytes
255 heads, 63 sectors/track, 91201 cylinders, total 1465149168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x9f2b3dfe
Device Boot Start End Blocks Id System
/dev/sda1 * 2048 2457599 1227776 7 HPFS/NTFS/exFAT
/dev/sda2 2457600 305979391 151760896 7 HPFS/NTFS/exFAT
/dev/sda3 305979392 1465144064 579582336+ 5 Extended
/dev/sda5 305979455 403638526 48829536 83 Linux
Partition 5 does not start on physical sector boundary.
/dev/sda6 403638590 442708606 19535008+ 82 Linux swap / Solaris
Partition 6 does not start on physical sector boundary.
/dev/sda7 442708670 1028647351 292969341 83 Linux
Partition 7 does not start on physical sector boundary.
/dev/sda8 1028647415 1465144064 218248325 83 Linux
Partition 8 does not start on physical sector boundary.
[root@gamma ~]# smartctl -a /dev/sda
smartctl 5.42 2011-10-20 r3458 [x86_64-linux-3.3.1-1-ARCH] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Device Model: Hitachi HTS727575A9E364
Serial Number: J3740084GEGM6E
LU WWN Device Id: 5 000cca 68cc61fb4
Firmware Version: JF4OA0D0
User Capacity: 750,156,374,016 bytes [750 GB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 6
Local Time is: Tue Apr 10 10:02:46 2012 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 45) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 148) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 062 Pre-fail Always - 0
2 Throughput_Performance 0x0005 100 100 040 Pre-fail Offline - 0
3 Spin_Up_Time 0x0007 187 187 033 Pre-fail Always - 2
4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 225
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 100 100 040 Pre-fail Offline - 0
9 Power_On_Hours 0x0012 099 099 000 Old_age Always - 631
10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 225
191 G-Sense_Error_Rate 0x000a 100 100 000 Old_age Always - 1
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 9895941
193 Load_Cycle_Count 0x0012 095 095 000 Old_age Always - 53298
194 Temperature_Celsius 0x0002 153 153 000 Old_age Always - 39 (Min/Max 8/48)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0
223 Load_Retry_Count 0x000a 100 100 000 Old_age Always - 0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 496 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
[root@gamma ~]# hdparm -iI /dev/sda
/dev/sda:
Model=Hitachi HTS727575A9E364, FwRev=JF4OA0D0, SerialNo=J3740084GEGM6E
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
BuffType=DualPortCache, BuffSize=16384kB, MaxMultSect=16, MultSect=16
CurCHS=65535/1/63, CurSects=4128705, LBA=yes, LBAsects=1465149168
IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
AdvancedPM=yes: mode=0x80 (128) WriteCache=enabled
Drive conforms to: unknown: ATA/ATAPI-2,3,4,5,6,7
* signifies the current active mode
ATA device, with non-removable media
Model Number: Hitachi HTS727575A9E364
Serial Number: J3740084GEGM6E
Firmware Revision: JF4OA0D0
Transport: Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6; Revision: ATA8-AST T13 Project D1697 Revision 0b
Standards:
Used: unknown (minor revision code 0x0028)
Supported: 8 7 6 5
Likely used: 8
Configuration:
Logical max current
cylinders 16383 65535
heads 16 1
sectors/track 63 63
--
CHS current addressable sectors: 4128705
LBA user addressable sectors: 268435455
LBA48 user addressable sectors: 1465149168
Logical Sector size: 512 bytes
Physical Sector size: 4096 bytes
Logical Sector-0 offset: 0 bytes
device size with M = 1024*1024: 715404 MBytes
device size with M = 1000*1000: 750156 MBytes (750 GB)
cache/buffer size = 16384 KBytes (type=DualPortCache)
Form Factor: 2.5 inch
Nominal Media Rotation Rate: 7200
Capabilities:
LBA, IORDY(can be disabled)
Queue depth: 32
Standby timer values: spec'd by Standard, no device specific minimum
R/W multiple sector transfer: Max = 16 Current = 16
Advanced power management level: 128
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4
Cycle time: no flow control=120ns IORDY flow control=120ns
Commands/features:
Enabled Supported:
* SMART feature set
Security Mode feature set
* Power Management feature set
* Write cache
* Look-ahead
* Host Protected Area feature set
* WRITE_BUFFER command
* READ_BUFFER command
* NOP cmd
* DOWNLOAD_MICROCODE
* Advanced Power Management feature set
Power-Up In Standby feature set
* SET_FEATURES required to spinup after power up
SET_MAX security extension
* 48-bit Address feature set
* Device Configuration Overlay feature set
* Mandatory FLUSH_CACHE
* FLUSH_CACHE_EXT
* SMART error logging
* SMART self-test
* General Purpose Logging feature set
* WRITE_{DMA|MULTIPLE}_FUA_EXT
* 64-bit World wide name
* IDLE_IMMEDIATE with UNLOAD
* {READ,WRITE}_DMA_EXT_GPL commands
* Segmented DOWNLOAD_MICROCODE
* Gen1 signaling speed (1.5Gb/s)
* Gen2 signaling speed (3.0Gb/s)
* Native Command Queueing (NCQ)
* Host-initiated interface power management
* Phy event counters
* NCQ priority information
Non-Zero buffer offsets in DMA Setup FIS
* DMA Setup Auto-Activate optimization
Device-initiated interface power management
In-order data delivery
* Software settings preservation
* SMART Command Transport (SCT) feature set
* SCT LBA Segment Access (AC2)
* SCT Error Recovery Control (AC3)
* SCT Features Control (AC4)
* SCT Data Tables (AC5)
Security:
Master password revision code = 65534
supported
not enabled
not locked
frozen
not expired: security count
supported: enhanced erase
148min for SECURITY ERASE UNIT. 150min for ENHANCED SECURITY ERASE UNIT.
Logical Unit WWN Device Identifier: 5000cca68cc61fb4
NAA : 5
IEEE OUI : 000cca
Unique ID : 68cc61fb4
Checksum: correct
Last edited by fax8 (2012-04-10 08:08:03)
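The fdisk output above warns that partitions 5 to 8 do not start on a physical sector boundary. As a minimal sketch of what that warning means, the start sectors from that output can be checked against the 512-byte logical and 4096-byte physical sector sizes the drive reports. The helper name `is_aligned` is hypothetical, and nothing in this thread establishes that misalignment is the cause of the slowdown; the snippet only reproduces the arithmetic behind the warning.

```python
# Sector sizes as reported by fdisk/smartctl in the post above.
LOGICAL_SECTOR = 512    # bytes
PHYSICAL_SECTOR = 4096  # bytes

# Start sectors copied from the fdisk -l output above.
START_SECTORS = {
    "sda1": 2048,
    "sda2": 2457600,
    "sda5": 305979455,
    "sda6": 403638590,
    "sda7": 442708670,
    "sda8": 1028647415,
}


def is_aligned(start_sector):
    """True if the partition's first byte falls on a physical sector boundary."""
    return (start_sector * LOGICAL_SECTOR) % PHYSICAL_SECTOR == 0


# Collect the partitions whose start offset is not a multiple of 4096 bytes.
misaligned = [name for name, start in START_SECTORS.items() if not is_aligned(start)]
print(misaligned)
```

The partitions this flags are the same ones fdisk complains about; the two NTFS partitions start on multiples of 8 logical sectors and pass the check.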
You should check the wiki article on fstab. Unless you are running a server or a multi-user PC, there is little reason to update file access times; see https://wiki.archlinux.org/index.php/Fs … me_options. I use the mount option 'noatime' for my ext4 partitions. This should reduce the number of filesystem journal updates (jbd2) caused by access-time writes.
Also, only your root partition should have a pass value of 1. The others should have a pass value of 0 (don't check with fsck) or 2 (lowers the priority for fsck).
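To make the pass-value advice concrete, here is a small sketch that scans fstab-style text and lists non-root filesystems whose sixth field (pass) is 1. The sample lines are copied from the fstab posted earlier in the thread; the function name `pass_one_non_root` is made up for illustration.

```python
# Lines copied from the fstab posted earlier in this thread.
FSTAB_TEXT = """\
/dev/sda5 / ext3 defaults 0 1
/dev/sda6 swap swap defaults 0 0
/dev/sda7 /home ext3 defaults,user_xattr 0 1
/dev/sda8 /mnt/misc ext3 defaults 0 1
"""


def pass_one_non_root(fstab_text):
    """Return mount points other than / whose fsck pass field is 1."""
    flagged = []
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 6:
            continue  # dump/pass fields omitted; nothing to check
        mount_point, fsck_pass = fields[1], fields[5]
        if mount_point != "/" and fsck_pass == "1":
            flagged.append(mount_point)
    return flagged


flagged = pass_one_non_root(FSTAB_TEXT)
print(flagged)
```

For the posted fstab this points at /home and /mnt/misc, which would be set to 2 (or 0) under the advice above.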
According to the wiki page you shared and my mount output above, I'm already using the relatime option, which only updates the access time when it is older than the file's last modification or status change (or more than a day old). Since my system's performance changed without any change in configuration, and I'm seeing slowness and freezes even on simple disk reads (e.g. KDE startup), I don't think this is the problem...
Last edited by fax8 (2012-04-10 09:35:15)
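For readers comparing relatime with noatime, the following is a minimal sketch of the relatime decision as it is generally described for Linux 2.6.30 and later: the access time is rewritten only if it is not newer than the modification or change time, or if it is at least a day old. The function name and the use of plain integer timestamps are assumptions made for illustration, not kernel code.

```python
DAY_SECONDS = 24 * 60 * 60


def relatime_needs_update(atime, mtime, ctime, now):
    """Sketch of the relatime rule; all arguments are Unix timestamps in seconds."""
    # File was modified at or after the last recorded access.
    if mtime >= atime:
        return True
    # File status (inode) changed at or after the last recorded access.
    if ctime >= atime:
        return True
    # Recorded access time is at least 24 hours old.
    if now - atime >= DAY_SECONDS:
        return True
    return False


# A read shortly after a previous read of an unmodified file: no atime write.
print(relatime_needs_update(atime=1000, mtime=500, ctime=500, now=1060))
```

Under this rule, repeated reads of unchanged files trigger at most one access-time write per day per file, which is why relatime already removes most of the atime traffic that noatime would eliminate entirely.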
I wish we could tell which updates might have caused your problem.
You could try using the ext3 mount option 'commit=300'. This would increase the interval for the journal synchronization to once every five minutes instead of the default setting of once every five seconds. The new setting could cause some corruption if the computer were to crash within that sync interval.
Other mount options to try are 'data=writeback' and 'data=journal'. See 'man mount' for the details. I have read one account where the user saw better disk IO throughput with 'data=journal' over the default 'ordered' and also over 'writeback'.
I know these are not cures but they may relieve some of the symptoms.
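Before changing any of these options it helps to see what a filesystem is currently mounted with. The sketch below parses one line of `mount` output (copied from the post above) and pulls out the journaling-related options. The function name is hypothetical, and treating a missing `commit=` as the five-second default is an assumption based on the default mentioned in this post.

```python
# One line copied from the 'mount' output posted earlier in this thread.
MOUNT_LINE = "/dev/sda5 on / type ext3 (rw,relatime,user_xattr,acl,barrier=1,nodelalloc,data=ordered)"


def parse_mount_line(line):
    """Split a 'mount' output line into device, mount point, type and options."""
    device, rest = line.split(" on ", 1)
    mount_point, rest = rest.split(" type ", 1)
    fstype, opts = rest.split(" (", 1)
    options = {}
    for opt in opts.rstrip(")").split(","):
        key, _, value = opt.partition("=")
        # Flag-style options such as 'relatime' have no value; store True.
        options[key] = value or True
    return device, mount_point, fstype, options


device, mount_point, fstype, options = parse_mount_line(MOUNT_LINE)
data_mode = options.get("data", "ordered")
# Assumption: no explicit commit= means the default 5 second interval.
commit_seconds = int(options.get("commit", 5))
print(device, mount_point, fstype, data_mode, commit_seconds)
```

For the root filesystem in this thread this reports data=ordered with no explicit commit interval, so 'commit=300' or a different data mode would be an actual change from the current settings.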
Thanks for your help... I'll try your suggestions.. I'll also try switching to a different filesystem than ext3/ext4 since I'm thinking that this has something to do with the filesystem.
I decided not to loosen the journaling settings as thisoldman suggested above and took the drastic solution instead: switching to another filesystem. I'm using ReiserFS 3.6 now and my system is as fast as before. There is definitely something wrong here with ext3/ext4... very strange that nobody else is complaining about this.
Actually I have exactly the same issue. I can hear my HDD writing every 3 seconds.
I checked with iotop, and I saw jbd2 was the culprit.
I was using 3.3.3 this morning, but I think this is related to the 3.3 kernel. I don't remember having these spin-ups on the 3.2 kernel.
After updating to 3.3.4 my HDD seems better, but when I use Chromium (cache directory on tmpfs), the disk still writes very often (again due to jbd2).
I think switching to another filesystem is overkill, but I understand it's very annoying to hear the HDD every few seconds...
Here is my fstab:
#
# /etc/fstab: static file system information
#
# <file system> <dir> <type> <options> <dump> <pass>
#tmpfs /tmp tmpfs nodev,nosuid 0 0
tmpfs /tmp tmpfs defaults,nodev,nosuid,noexec,size=1G 0 0
UUID=a0167e80-a201-42b8-b8d5-9670d4ea0f25 / ext4 defaults,noatime 0 1
UUID=99e7ba10-6170-49ce-bf13-0b7410cffed1 swap swap defaults 0 0
UUID=7dc4b17d-5d5e-41c5-ae19-b43213de8341 /boot ext2 defaults 0 2
UUID=7a851c5e-74ff-4d5f-9420-4d5fadc23010 /home ext4 defaults,noatime 0 2
UUID=9b8e18ca-67e3-4986-9919-f9d7dae49cb9 /mnt/public ext4 defaults,noatime 0 2
UUID=24786e90-8af6-4b43-a1fc-55c25b79e592 /mnt/storage ext4 defaults,noatime 0 2
UUID=138D-6D32 /media/USB vfat user,noauto,noatime,flush 0 0
EDIT: 3.3.4 changes nothing. It's getting worse.
Last edited by Ypnose (2012-04-30 15:04:17)
@Ypnose hi there! Nice not to be the only one having this problem. Let us know if switching to another filesystem fixes it for you.
I have an open kernel bug at https://bugzilla.kernel.org/show_bug.cgi?id=42895. Maybe you want to stop by there and let the kernel guys know that something's wrong?
I'm getting the same problem with ext4. It is most noticeable with Chrome at startup, when creating new tabs, and after resuming from suspend (also with the cache directory on tmpfs).
It's much slower than before. I feel I have to reformat and reinstall to solve this.