Python Script Performance Q

hjl3 · 2011-06-02 02:16:39

I've been learning Python by going through problems on Project Euler, and my script for Problem 14 took longer to execute then I thought it should. I then found another example for the problem, that was very similar, but took much less to execute. I modified both to make them as similar as possible, and the only difference is that the slow script stores 2 values in two int variables, and the fast one stores them in a tuple.

Here are the scripts...

#!/usr/bin/env python2

# This script takes ~4 seconds to complete.

known = {}
def countSteps(num):
    steps = 0
    while num > 1:
        if num not in known:
            if num % 2 == 0:
                num /= 2
            else:
                num = 3*num + 1
            steps += 1
        else:
            return steps + known[num]
    return steps

highest = (0,0)

for n in xrange(1,1000000,1):
    s = countSteps(n)
    if s > highest[1]:
        highest=(n,s)
    known.update({n:s})

print "Answer:"
print "%d : %d" % (highest[0], highest[1])

This one is much slower...

#!/usr/bin/env python2

# This script takes ~50 seconds to complete.

known = {}
def countSteps(num):
    steps = 0
    while num > 1:
        if num not in known:
            if num % 2 == 0:
                num /= 2
            else:
                num = 3*num + 1
            steps += 1
        else:
            return steps + known[num]
    return steps

high_count = 0
high_n = 0

for n in xrange(1,1000000,1):
    s = countSteps(n)
    if s > high_count:
        high_count = s
        high_n = n
    known.update({high_n:high_count})

print "Answer:"
print "%d : $d" % (high_n, high_count)

Can anyone explain why one performs so much better than the other? Both of them yield a correct result.

Fadel · 2011-06-02 02:39:16

#!/usr/bin/env python2

# This script takes ~50 (?) seconds to complete.

known = {}
def countSteps(num):
    steps = 0
    while num > 1:
        if num not in known:
            if num % 2 == 0:
                num /= 2
            else:
                num = 3*num + 1
            steps += 1
        else:
            return steps + known[num]
    return steps

high_count = 0
high_n = 0

for n in xrange(1,1000000,1):
    s = countSteps(n)
    if s > high_count:
        high_count = s
        high_n = n
    known.update({n:s})

print "Answer:"
print "%d : %d" % (high_n, high_count)

I used this version as the 'slow' one. 'Fast' one is unmodified.

$ time python slow.py 
Answer:
837799 : 524

real0m7.117s
user0m6.510s
sys0m0.070s

$ time python fast.py 
Answer:
837799 : 524

real0m7.149s
user0m6.553s
sys0m0.067s

I don't know python, so I would guess it's probably the typo on the last line. (?)

hjl3 · 2011-06-02 10:26:58

Ooops, yeah that was a typo.

However, I don't understand why it ran so fast for you... that slow script takes ~50 seconds to execute for me:

harry@ projecteuler > cat test-slow.py 
#!/usr/bin/env python2

known = {}
def countSteps(num):
    steps = 0

    while num > 1:
        if num not in known:
            if num % 2 == 0:
                num /= 2
            else:
                num = 3*num + 1
            steps += 1
        else:
            return steps + known[num]
    return steps

high_count = 0
high_n = 0

for n in xrange(1,1000000,1):
    s = countSteps(n)

    if s > high_count:
        high_count = s
        high_n = n

    known.update({high_n:high_count})

print "Answer:"
print "%d : %d" % (high_n, high_count)
harry@ projecteuler > time python2 test-slow.py 
Answer:
837799 : 524

real    0m49.408s
user    0m49.337s
sys    0m0.013s

I tested it on a few different boxes, with the same results.

Stebalien · 2011-06-02 16:00:22

You should be using python's timeit class for this; it's much more accurate. On my computer, the two scripts were took approximately the same time to complete.

hjl3 · 2011-06-02 16:37:07

The difference in execution time between the two for me is ~45 seconds. This enough of a difference that I'm sure just 'time' will suffice. I woudln't be as confused if it was just a coupel seconds. I've executed the script on 4 seperate boxes (all different distros/versions of python2), and they all yeild the same result for me...

On some advise, I ran both with cProfile:

> python2 -m cProfile test-slow.py
Answer:
837799 : 524
         2000000 function calls in 52.688 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    3.322    3.322   52.688   52.688 test-slow.py:3(<module>)
   999999   47.892    0.000   47.892    0.000 test-slow.py:4(countSteps)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
   999999    1.473    0.000    1.473    0.000 {method 'update' of 'dict' objects}

> python2 -m cProfile test.py
Answer:
837799 : 524
         2000000 function calls in 9.367 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    3.281    3.281    9.367    9.367 test.py:3(<module>)
   999999    4.510    0.000    4.510    0.000 test.py:4(countSteps)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
   999999    1.577    0.000    1.577    0.000 {method 'update' of 'dict' objects}

The difference shown is in the execution time of the countSteps() function, but the function in both scripts is exactly the same, and executed wiith the same exact variables...

austin.rbn · 2011-06-03 06:06:12

So, when you change the line "known.update({high_n:high_count})" in the slow script to "known.update({n:s})", then it executes as quickly as the "fast" one. This is because, if you only store the lengths of those chains that qualify for being "high_count", you don't have a record of those shorter chains which are nonetheless important sub-chains for many of the starting values you need to calculate.

So, basically, a smaller table means less optimization. A fast execution time requires that you store the length of every chain you calculate, so that you can look it up later, for the cases in which it is a sub-chain later on. Remember that the length of Collatz chains is pretty unintuitive. A larger starting value does not imply a longer chain.

Last edited by austin.rbn (2011-06-03 10:47:26)

Arch Linux

#1 2011-06-02 02:16:39

Python Script Performance Q

#2 2011-06-02 02:39:16

Re: Python Script Performance Q

#3 2011-06-02 10:26:58

Re: Python Script Performance Q

#4 2011-06-02 16:00:22

Re: Python Script Performance Q

#5 2011-06-02 16:37:07

Re: Python Script Performance Q

#6 2011-06-03 06:06:12

Re: Python Script Performance Q

Board footer