#1 2011-01-17 07:02:02

vkumar
Member
Registered: 2008-10-06
Posts: 166

[Solved] Frustratingly low CPU usage

I need to parse several large XML files and insert their contents into a sqlite database. The code works correctly, but uses 32 minutes of "real" time and only 2 minutes of "user" time.
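A gap between "real" and "user" time that large usually means the process is waiting rather than computing. The sketch below is illustrative only, not the actual script: `measure` and `waiting_job` are hypothetical names, and `time.sleep` stands in for whatever the process is blocked on. Note that `time.process_time()` reports user plus system CPU time, so it only approximates the "user" figure printed by `time`.

```python
import time

def measure(work):
    """Return (wall_seconds, cpu_seconds) spent running work()."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    work()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu

def waiting_job():
    # Stand-in for a job that mostly waits (e.g. on disk); sleeping burns no CPU.
    time.sleep(0.2)

wall, cpu = measure(waiting_job)
print(f"real={wall:.2f}s cpu={cpu:.2f}s")
```

If the real script shows the same pattern (wall time far above CPU time), the bottleneck is outside the CPU.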

Another developer on our project ran my script in 3 minutes of "real" time. Adding insult to injury, he was working on an old 32-bit Ubuntu distro *hosted* by VirtualBox on a laptop running Windows 7. Both of our laptops are 64-bit machines. Though his CPU cache size is 6KB and mine is 2KB, my laptop runs at 2 GHz while his runs at 1.6 GHz. edit: (I meant 6MB and 2MB, oops)

My script never uses more than 3.0% of the CPU on my laptop. However, it hits 70 - 75% CPU usage on the other guy's machine. (Stats reported by htop.)

My theory is that there's a disk I/O bottleneck on my machine. It could be slow to complete fsync()s (which SQLite may require to ensure data integrity). At any rate, I've disabled laptop-mode and reniced the process to make it very greedy, but it still takes 30 minutes to run! Absurd.
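One way to test the fsync theory without Django is to compare committing after every row against committing once at the end, using the stdlib `sqlite3` module. This is a rough sketch under assumptions: the `item` table and the `insert_rows` helper are made up for illustration, and it relies on SQLite's default settings, where each commit syncs the database file to disk.

```python
import os
import sqlite3
import tempfile
import time

def insert_rows(db_path, rows, commit_each):
    """Insert rows, committing after every row or once at the end."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS item (id INTEGER PRIMARY KEY, name TEXT)")
    conn.commit()
    start = time.perf_counter()
    for name in rows:
        conn.execute("INSERT INTO item (name) VALUES (?)", (name,))
        if commit_each:
            conn.commit()  # with default settings, each commit syncs to disk
    conn.commit()
    elapsed = time.perf_counter() - start
    count = conn.execute("SELECT COUNT(*) FROM item").fetchone()[0]
    conn.close()
    return elapsed, count

with tempfile.TemporaryDirectory() as tmp:
    rows = [f"row{i}" for i in range(200)]
    slow, n1 = insert_rows(os.path.join(tmp, "a.db"), rows, commit_each=True)
    fast, n2 = insert_rows(os.path.join(tmp, "b.db"), rows, commit_each=False)
    print(f"per-row commits: {slow:.3f}s, single commit: {fast:.3f}s")
```

If the per-row variant is dramatically slower on one machine but not on another, that points at sync latency rather than at parsing or CPU speed.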

I'm using the latest x86-64 packages on my laptop. Any help appreciated.

edit: This is all being done through the Django ORM.

Last edited by vkumar (2011-01-17 07:35:53)


div curl F = 0

Offline

#2 2011-01-17 07:08:07

karol
Archivist
Registered: 2009-05-06
Posts: 25,440

Re: [Solved] Frustratingly low CPU usage

"Though his CPU cache size is 6KB and mine is 2KB" - are you sure you got that right?

Create a RAM disk and copy your XML files onto it to check whether it's disk I/O related. iotop may help too.

Offline

#3 2011-01-17 07:33:40

vkumar
Member
Registered: 2008-10-06
Posts: 166

Re: [Solved] Frustratingly low CPU usage

Excellent suggestion. I changed my script to do the transaction on a ramfs and then copy the finished database to disk.

It runs in under a minute now.

Wow.

edit:
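$ sudo mkdir /tmp/ram/
$ sudo mount -t ramfs -o size=64M ramfs /tmp/ram/   # note: ramfs does not enforce size=; tmpfs does
$ sudo chmod 777 /tmp/ram/   # after mounting, so it applies to the ramfs root, not the hidden directory
<do work>
$ sudo umount /tmp/ram/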
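For reference, the "do work" step could look roughly like the following. This is a minimal sketch, not the actual script: `build_then_copy`, the `item` table and the file names are hypothetical, and temporary directories stand in for the RAM-backed mount (`/tmp/ram/` above) and the final on-disk location.

```python
import os
import shutil
import sqlite3
import tempfile

def build_then_copy(ram_dir, final_path, rows):
    """Build the database under ram_dir, then copy the finished file to final_path."""
    work_path = os.path.join(ram_dir, "work.db")
    conn = sqlite3.connect(work_path)
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO item (name) VALUES (?)", [(r,) for r in rows])
    conn.commit()
    conn.close()  # close before copying so the file on disk is complete
    shutil.copy(work_path, final_path)
    os.remove(work_path)
    return final_path

# Temporary directories stand in for the RAM mount and the real destination.
with tempfile.TemporaryDirectory() as ram, tempfile.TemporaryDirectory() as disk:
    final = build_then_copy(ram, os.path.join(disk, "final.db"), ["a", "b", "c"])
    check = sqlite3.connect(final)
    print(check.execute("SELECT COUNT(*) FROM item").fetchone()[0])
    check.close()
```

Anything written to the RAM-backed directory is lost on unmount or power loss, so the copy at the end is the only durable step.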

Last edited by vkumar (2011-01-17 07:35:37)


div curl F = 0

Offline

#4 2011-01-17 07:35:45

litemotiv
Forum Fellow
Registered: 2008-08-01
Posts: 5,026

Re: [Solved] Frustratingly low CPU usage

There have been several kernel I/O regressions concerning SQLite performance with ext3/4, sometimes impacting performance by a factor of 10 or greater. Not sure if this is hitting you, but it might be something to look at.


ᶘ ᵒᴥᵒᶅ

Offline

#5 2011-01-17 07:37:24

vkumar
Member
Registered: 2008-10-06
Posts: 166

Re: [Solved] Frustratingly low CPU usage

I'll look into this. Using ramfs just shifts the problem; it doesn't actually solve it ;)


div curl F = 0

Offline

#6 2011-01-17 11:51:24

Spacenick
Member
From: Germany
Registered: 2010-04-02
Posts: 168

Re: [Solved] Frustratingly low CPU usage

I think the "problem" here is one recently reported by Phoronix. VirtualBox does not correctly pass through the fsyncs issued inside the VM (unlike e.g. KVM), so some workloads, such as SQLite, run a lot faster in the guest than on the host. The Ubuntu guest in VirtualBox issues fsyncs just like your host Linux does, but unlike yours they don't result in an actual sync to disk, which makes it a lot faster.
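A quick way to compare the two machines on this point is to time a handful of small writes followed by `os.fsync()`. This is only a rough sketch: `mean_fsync_seconds` is a made-up helper, the 4096-byte block and the round count are arbitrary, and the result depends on which filesystem the chosen directory lives on.

```python
import os
import tempfile
import time

def mean_fsync_seconds(directory, rounds=20):
    """Write a small block and fsync it `rounds` times; return the mean latency."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(rounds):
            os.write(fd, b"x" * 4096)
            os.fsync(fd)
        return (time.perf_counter() - start) / rounds
    finally:
        os.close(fd)
        os.remove(path)

latency = mean_fsync_seconds(tempfile.gettempdir())
print(f"{latency * 1000:.2f} ms per write+fsync")
```

A guest whose fsyncs never reach the physical disk would be expected to report a much lower figure than a host that really waits for the disk, but this is a sanity check rather than a benchmark.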

Last edited by Spacenick (2011-01-17 11:52:25)

Offline
