Need help rewriting a recursive function in python

syamajala · 2013-06-21 03:10:08

http://paste.pound-python.org/show/34142/

I have a function that I wrote recursively in python, it seems to work fine for small inputs, but not the big inputs i need to run it on. The function is basically implementing distributing and FOILing. When I try to run it in on the list a, the python process just gets killed after a while.

syamajala · 2013-06-21 06:23:15

According to kernprof, I've been able to do a little better by changing some of the for loops to list comprehensions, but I still can't expand the a list in the link posted before...

def expand_help(l, r, s):                                                                                                                                                     
                                                                                                                                                                              
    """                                                                                                                                                                       
    Basically what we want to do is walk through a list like                                                                                                                  
    l = [(0, 2), (0, 1), (1, 2)] and turn it into something like                                                                                                              
    r = [(-1, [(0, 2)]), (-1, [(0, 1), (1, 2)])], so there are a                                                                                                              
    few different cases.                                                                                                                                                      
                                                                                                                                                                              
    if l[0] is a list then we should expand that, if not then it must be                                                                                                      
    (1, [(0, 2)]) by definition of phi_b_ext.                                                                                                                                 
                                                                                                                                                                              
    if l[1] is a list, expand that, then if l[2] is a list as well expand                                                                                                     
    it and foil, otherwise multiply each element of l[1] by l[2],                                                                                                             
                                                                                                                                                                              
    similarly, if l[2] is a list, multiply each element of l[2] by l[1],                                                                                                      
    otherwise just multiply l[1] and l[2]                                                                                                                                     
    """                                                                                                                                                                       
                                                                                                                                                                              
    if type(l[0]) == list:                                                                                                                                                    
        r.extend(expand_help(l[0], [], -1*s))                                                                                                                                 
    else:                                                                                                                                                                     
        r.append((-1*s, [l[0]]))                                                                                                                                              
                                                                                                                                                                              
    if type(l[1]) == list:                                                                                                                                                    
        a = expand_help(l[1], [], -1*s)                                                                                                                                       
        if type(l[2]) == list:                                                                                                                                                
            b = expand_help(l[2], [], s)                                                                                                                                      
            t = [(i*p, j+q) for i, j in a for p, q in b]                                                                                                                      
        else:                                                                                                                                                                 
            t = [(i, j+[l[2]]) for i, j in a]                                                                                                                                 
    else:                                                                                                                                                                     
        if type(l[2]) == list:                                                                                                                                                
            a = expand_help(l[2], [], s)                                                                                                                                      
            t = []                                                                                                                                                            
            for i in a:                                                                                                                                                       
                x, y = i                                                                                                                                                      
                if type(y) == list:                                                                                                                                           
                    y.insert(0, l[1])                                                                                                                                         
                    t.append((-1*s*x, y))                                                                                                                                     
                else:                                                                                                                                                         
                    t.append((-1*s*x, [l[1], y]))                                                                                                                             
        else:                                                                                                                                                                 
            t = [(-1*s, [l[1], l[2]])]                                                                                                                                        
                                                                                                                                                                              
    r.extend(t)                                                                                                                                                               
    return r

Trent · 2013-06-21 11:22:57

This appears to be an unusually complex data structure definition and I'm not sure what it is supposed to represent. What's the equivalency between [ (0,2), (0,1), (1,2) ] and [ (-1, [ (0,2) ]), (-1, [ (0,1), (1,2) ]) ] ? What do each of the parts mean?

syamajala · 2013-06-21 14:27:25

There is a (n+1)x(n+1) that looks like this

Ga =     [as0, as1, as2]
         [a1s, a11, a12]
         [a2s, a21, a22]

We then want to compose a function phi_sk (where k goes between 1 and n-1) with either itself or some other function phi_sk, n times and apply that function to each as* and a*s element. Doing that we get lists like

a = [(1, 0), (2, 1), [(0, 1), (1, 2), (2, 0)]]

The tuples in the list represent the indices into the matrix Ga. By definition of phi_sk, we'll either get a single element (in this case for example phi_s1(a2s) = a1s), or we'll get 3 elements (phi_s1(as1) = -as2 - as1*a12) that will always look like -1*(some element) - (the product of two others). So, I represented the equation -as2 - as1*a12 by the list [(0, 2), (0, 1), (1, 2)].

So, now say we've applied all the phi_sk's to all the as* and a*s elements n times. We'll get back lists like [(1, 0), (2, 1), [(0, 1), (1, 2), (2, 0)]] and even much more complicated ones like this [[(2, 0), (2, 1), (1, 0)], (2, 1), [(1, 0), (1, 2), [(2, 0), (2, 1), (1, 0)]]] What we want do now is build another matrix Phi_L_B where the ij entry are the coffeicients of the ajs terms of the composition of the phi_sk's on a_is. In order to get those coefficients, we need to take a list like [(0, 2), (0, 1), (1, 2)] and associate a sign with each element and figure out the products and sign of the products.

For example looking at this [(1, 0), (2, 1), [(0, 1), (1, 2), (2, 0)]] this represents -a1s - a21*(-as1 - a12*a2s))
if we expand that out we get -a1s + a21*as1 + a21*a12*a2s. Its important to note that the structure we are working over DOES NOT have a commutative multiplication so we cannot change the order. So, I choose to represent the expanded out version in the following way, [(-1, [(1, 0)]), (1, [(2, 1), (0, 1)]), (1, [(2, 1), (1, 2), (2, 0)])] where in a tuple, the first element is the sign, and the second element is a list of elements we multiple together.

TLDR

[(1, 0), (2, 1), [(0, 1), (1, 2), (2, 0)]] are the indices into the matrix Ga and represents -a1s - a21*(-as1 - a12*a2s)) we want to expand it into this -a1s + a21*as1 + a21*a12*a2s, but we need to keep track of certain things like whether we are subtracting or adding an element and which elements we are multiplying together, so, we use a list like [(-1, [(1, 0)]), (1, [(2, 1), (0, 1)]), (1, [(2, 1), (1, 2), (2, 0)])] where in a tuple, the first element is the sign, and the second element is a list of elements we multiple together.

Trent · 2013-06-21 21:41:51

syamajala wrote:

[(1, 0), (2, 1), [(0, 1), (1, 2), (2, 0)]] are the indices into the matrix Ga and represents -a1s - a21*(-as1 - a12*a2s)) we want to expand it into this -a1s + a21*as1 + a21*a12*a2s, but we need to keep track of certain things like whether we are subtracting or adding an element and which elements we are multiplying together, so, we use a list like [(-1, [(1, 0)]), (1, [(2, 1), (0, 1)]), (1, [(2, 1), (1, 2), (2, 0)])] where in a tuple, the first element is the sign, and the second element is a list of elements we multiple together.

That's just too much nesting, both for Python and for mortals. Plus the meaning of each data structure depends heavily on both its contents and context. Only a mathematician would come up with something like this...

Okay, okay, here's what I'm thinking: you have two options. Preferably, you would rewrite your phi_sN functions so that they simplify as they go, and always return a simple (non-recursive) data structure like a list-of-terms; that would probably cost much less than trying to simplify the entire massive expression after the fact.

The other option is to store the whole mess into a tree-like data structure, and simplify it in-place using postorder traversal. Which is not entirely different from what you're doing now, but you need to store it in such a manner that the context doesn't determine what you do with a subexpression (i.e. is this tuple part of a list, or is this list the 2nd member of a tuple, etc.) -- make the data structure more symmetric, and you'll be a good step closer to taming that complexity. It has to be postorder and in-place to keep your memory usage under control.

Actually, if you can modify your existing algorithm to do the simplification in-place (rather than appending to a new data structure) that might help enough to make it work. I'm not really sure that's possible though. In any case, I doubt I'll be much more help.

syamajala · 2013-06-22 04:14:10

I think I have another idea that might work. A generator that returns the most deeply nested element of a list. We expand that, then return the next most deeply nested element. Expand that and so on.

Also, I simplified what expand returns. Its now [(-1, (0, 2)), (-1, (0, 1, 1, 2))] instead of a list of tuples for the second entry in the tuple. Now, when iterating through we just take 2 elements a time instead of 1. I might possibly be able to simplify it more and just have it be a list of tuples. [(-1, 0, 2), (-1, 0, 1, 1, 2)]

tavianator · 2013-06-22 17:42:48

So the main question is: why do you need this function? What is it for?

fledermann · 2013-06-24 12:07:11

I did not understand neither the purpose nor the data structure completely, but it sounds like something which might better be implemented with map() and numpy's matrix. Did you look into these?
From my experiences you need recursion very very rarely in python.

Trent · 2013-06-27 00:19:30

So... I did a little more digging, you nerd sniped me for sure. Here's what I came up with for the original problem, but don't get excited until you read the caveats below:

http://paste.pound-python.org/show/34425/

The classes are just containers for variables and terms ("term" being a sequence of variables multiplied together in order, plus a numeric coefficient). I wrote them to ease the complexity of the nesting and to make it easy to print out an expression for debugging. If you need the result in something other than a list-of-terms, it shouldn't be too difficult to write a conversion function. Edit: Custom classes removed; interpret() now returns a sequence of dicts, which represent terms. Closer to the original problem and less extraneous fat.

The interpret() function works on the recursive data structure you defined (e.g. [(1, 0), (2, 1), [(0, 1), (1, 2), (2, 0)]] ) and returns a generator. The generator yields terms, one after the other. Like terms are not combined, because to do so you would have to keep track of all previous terms, which is not possible for obvious reasons. I found that it was surprisingly hard to write this in a way that didn't bleed a lot of memory, but I think it works now; I can write a loop to churn through the generator output and leave it running for some time without the process being killed or locking up my desktop.

However, that brings me to the other caveat: the example input at the end of the code you originally posted? It's really big. Really, really, really big. The expansion contains millions of terms. In the last 10 minutes my computer has generated 32 million terms and I have no way of knowing how close it is to "finished". I imagine the time required to do so could be much, much longer than you could conceivably wait. If my gut instincts are right, rewriting it in C wouldn't help you much; you need to find another approach that doesn't involve enumerating every term in the expression. Perhaps there is some shortcut you can take when generating the expression lists.

To give you an idea of the scale of the problem, the following relatively short input has an expansion 33 terms long:

[(1, 0), [(1, 4), (1, 2), [[(2, 4), (2, 1), (1, 4)], (2, 1), (1, 4)]], [[(4, 0), (4, 1), (1, 0)], [[(4, 2), (4, 1), (1, 2)], (4, 1), (1, 2)], [(2, 0), (2, 1), (1, 0)]]]

The next input I tried, which is about 4 times as long, has an expansion 737 terms long, and the next input, which is about twice as long as that, has an expansion 64,864 terms long and takes a few seconds to calculate. The entire expression is over 3 times longer than the 65k-term expression. There's no rigid correlation between input size and number of terms, which just makes it hard to reason about performance. I tried to figure a way to calculate how big the expansion would be, but then I realized I'm not getting paid for this.

tl;dr -- here's some code that works better for shorter inputs and doesn't lock up for longer ones, but I can't guarantee it will finish churning the test data this millenium. 70 million terms and no end in sight. Hope you didn't need the whole expression at once because that just ain't going to happen.

Last edited by Trent (2013-06-29 00:47:33)

syamajala · 2013-06-27 04:26:05

So, I've spent the past few days playing with the other idea i had, ie starting from deepest nested list first, but I couldn't get the new expand function to pass all the tests.

Taviantor, I need the function to build the Phi_L_B matrix I described in my third post. If you mean more generally what do I need this function for, basically, I'm trying to reimplement the algorithms described in this paper, http://arxiv.org/pdf/math/0302099.pdf The guy who wrote that paper wrote a mathematica program to do the computations but only 1 person in our group has mathematica and he said that the program was written for version 5 but the latest version is 9 so he had some trouble getting it to work.

Anyway, it turns out that there is another method described in the paper for generating the Phi_L_B matrix using a "so called" chain rule. I've also been playing with implementing that.

Another thing we realized is that there is a way to linearize, which we are interested in. Anytime we have a product like a_ij*a_kl we can replace that by -2*a_ij*-2*a_kl - 4. I haven't used this fact yet in my code.

Trent, I'll take a look at your code tomorrow when I am less tired. Also, I gave cython a try on my "optimized" expand function in my second post, but all that happened was that it became faster at expanding small terms, according to the profiling I did.

syamajala · 2013-06-27 04:30:02

Also, I should mention, that as for the example I posted being really big. In terms of the overall picture of the things we are trying to do thats the second non-trivial example. Things are only going to get worse after that.

jakobcreutzfeldt · 2013-06-27 07:07:05

Are you absolutely tied to Python? This seems like a task that screams for Scheme or Common Lisp. Two words: tail-call recursion. Python has a recursion limit of 1000 frames.

Trent · 2013-06-27 10:18:48

Update: I left it to churn overnight and it finished at 567623460 terms. This is the last term:

-1a_3_1*a_1_2*a_2_1*a_1_4*a_4_1*a_1_2*a_2_1*a_1_2*a_2_1*a_1_4*a_4_1*a_1_2*a_2_1*a_1_3*a_3_1*a_1_2*a_2_1*a_1_4*a_4_1*a_1_2*a_2_1*a_1_4*a_4_1*a_1_2*a_2_1*a_1_3*a_3_1*a_1_2*a_2_1*a_1_0

So it's definitely big, but not unmanageable. The main source of slowdown is probably Python, and if you rewrote it in a more appropriate language you might have been able to generate the sequence in considerably less time (but still less than instantaneous -- say, 30 minutes instead of 3 hours). On the other hand, if you're ok with leaving it to crunch numbers overnight, maybe Python's speed isn't a problem here. Still, if you're planning to deal with much bigger expressions, you will soon find yourself leaving it over the weekend, or longer.

Like I said before, it does generate sets of like terms that need to be combined, so you'll need to account for that in the code that makes the matrix Phi_L_B. I haven't read the paper and you didn't explain exactly how that matrix is calculated; I'm assuming the matrix you're going for is much smaller than the entire expression, which is why generators work with you (since you don't need the entire expression at once).

@jakob: I was thinking the same thing when I started, but having finished I'm not sure tail-call optimization can help here. The recursion depth isn't that great -- only around 25 -- and I'm not even sure it's possible to do TC optimization in this case; the logic is just too complex. However, a Lisp would definitely be a more natural choice for this kind of problem.

Mr.Elendig · 2013-06-28 10:41:24

pypy and not using recursion, ???, profits!

Trent · 2013-06-28 22:50:15

syamajala wrote:

Another thing we realized is that there is a way to linearize, which we are interested in. Anytime we have a product like a_ij*a_kl we can replace that by -2*a_ij*-2*a_kl - 4. I haven't used this fact yet in my code.

Although it reduces the problem to something that's no longer interesting, it seems apparent to me that you can use this fact to rewrite phi_sk so that every invocation, no matter how deeply nested, always returns a data structure of fixed size. If that's possible, you'd certainly save a lot of time by doing so in the original calculations rather than by calculating a huge tree of lists and then running it through an algorithm to simplify it.

It's also possible I've misunderstood something, so if you have any follow up questions feel free to ask.

Trent · 2013-06-29 00:50:21

I was feeling bad about the quality of the code in my earlier post, so I went back and made some changes for clarity and documentation. You probably don't need it anymore, but if somebody finds this thread in the future maybe it will be less confusing this way.

syamajala · 2013-06-30 05:51:39

I made a few changes and all appears to be good now. I modified the phi_sk functions to return stuff like [1, [2, 1]] for phi_sk(1, 1, 2) for example or [[-1, [2, 0]], [-1, [2, 1, 1, 0]]] for phi_sk(1, 1, 0) for example, and I modified the function that calls the phi_sk's to simplify as it goes. It appears to be returning the correct results for the example computation done in the paper, and it seems to be able to do computations for the bigger braid we were looking at, but I don't know yet if they are right or not. However, someone was able to get the old mathematica code to run so we will have something to compare to now.

The code is available on github in case anyone is interested. It requires sage to run though. https://github.com/syamajala/KnotContac … /braids.py

Arch Linux

#1 2013-06-21 03:10:08

Need help rewriting a recursive function in python

#2 2013-06-21 06:23:15

Re: Need help rewriting a recursive function in python

#3 2013-06-21 11:22:57

Re: Need help rewriting a recursive function in python

#4 2013-06-21 14:27:25

Re: Need help rewriting a recursive function in python

#5 2013-06-21 21:41:51

Re: Need help rewriting a recursive function in python

#6 2013-06-22 04:14:10

Re: Need help rewriting a recursive function in python

#7 2013-06-22 17:42:48

Re: Need help rewriting a recursive function in python

#8 2013-06-24 12:07:11

Re: Need help rewriting a recursive function in python

#9 2013-06-27 00:19:30

Re: Need help rewriting a recursive function in python

#10 2013-06-27 04:26:05

Re: Need help rewriting a recursive function in python

#11 2013-06-27 04:30:02

Re: Need help rewriting a recursive function in python

#12 2013-06-27 07:07:05

Re: Need help rewriting a recursive function in python

#13 2013-06-27 10:18:48

Re: Need help rewriting a recursive function in python

#14 2013-06-28 10:41:24

Re: Need help rewriting a recursive function in python

#15 2013-06-28 22:50:15

Re: Need help rewriting a recursive function in python

#16 2013-06-29 00:50:21

Re: Need help rewriting a recursive function in python

#17 2013-06-30 05:51:39

Re: Need help rewriting a recursive function in python

Board footer