You are not logged in.

#1 2010-04-21 00:42:33

xelados
Member
Registered: 2007-06-02
Posts: 314
Website

Doing things The Python Way™

As a proof of concept or comparison, I ported a quick and dirty journal script from PHP to Python. As a part of my learning process, I want to get feedback from others on how I can improve my code (or comments). If you guys have any good articles I can read on improving my practices, I'd like it.

The code is as follows:

main.py

#! /usr/bin/python
# coding: utf-8

#  pyjournal - the dumb journal software
#  by Daniel Campbell <danny@sporkbox.us>
#  Version 0.2 (2010-04-20)

#             DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
#                     Version 2, December 2004
#
#  Copyright (C) 2004 Sam Hocevar <sam@hocevar.net>
#
#  Everyone is permitted to copy and distribute verbatim or modified
#  copies of this license document, and changing it is allowed as long
#  as the name is changed.
#
#             DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
#    TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
#
#   0. You just DO WHAT THE FUCK YOU WANT TO.
#
#  This program is free software. It comes without any warranty, to
#  the extent permitted by applicable law. You can redistribute it
#  and/or modify it under the terms of the Do What The Fuck You Want
#  To Public License, Version 2, as published by Sam Hocevar. See
#  http://sam.zoy.org/wtfpl/COPYING for more details.

import web
import os
import re
import markdown
import datetime

urls = (
        "/", "index",
        "/(.*)", "article" # Anything that's given in the URL is assumed to be the title of an article
        )

article_path = os.getcwdu() + "/entries/"

def fetch_articles(path):
    """Fetches a list of .mdown files in the specified directory and returns a list sorted by file mtime, descending. Each entry in the list follows a format of <mtime><filename>, e.g. '1234567890.0Hello World'"""
    list =[]
    for item in os.listdir(path):
        if re.search(".mdown$", unicode(item)):
            modtime = str(os.stat(path + item).st_mtime)
            list.append(u"" + modtime + item[:-6])
    list.sort()
    list.reverse()
    return list

template = web.template.frender('template.html')

class index:
    def GET(self):
        content = u""
        fileset = fetch_articles(article_path)
        for entry in fileset:
            # To make things easier, I want to slice the timestamp from the list item.
            # That way, I don't have to create two separate lists and combine them into a dict.
            modtime = entry[0:12]
            timestamp = datetime.date.fromtimestamp(float(modtime))
            name = entry[12:]
            content = content + "* [" + name + "](" + name + "), " + str(timestamp) + "\n"
        return template('', markdown.markdown(content))

class article:
    def GET(self, title):
        if len(title[0:]) > 0 and os.path.isfile(article_path + title + ".mdown"):
            source = open(article_path + title + ".mdown")
            content = markdown.markdown(source.read().decode('utf8'))
        else:
            content = u"Error: Article does not exist."

        return template(title, content)

app = web.application(urls, globals(), autoreload=True)

if __name__ == "__main__":
    app.run()

template.html

$def with (title=None, content=None)
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
$if title:
    <title>pyjournal: viewing $title</title>
$else:
    <title>pyjournal: article index</title>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <meta name="Author" content="Daniel Campbell">
</head>
<body>
$if title:
    <h1>pyjournal → $title</h1>
$else:
    <h1>pyjournal</h1>

$if content:
    $if title:
        <p class="bar"><a href="./">← Article Index</a></p>
    <div id="content">
    $:content
    </div>
    $if title:
        <p class="bar"><a href="./">← Article Index</a></p>
<p id="footer">pyjournal is a half-assed hack by <a href="http://sporkbox.us" title="Sporkbox Productions">Daniel Campbell</a> released under the <a href="http://sam.zoy.org/wtfpl" title="Do What the Fuck You Want To Public License">WTFPL</a>. Treat it as such.</p>
</body>
</html>

Offline

#2 2010-04-21 13:39:34

Chimera
Member
From: Run Level 7
Registered: 2010-03-28
Posts: 43

Re: Doing things The Python Way™

I was hoping someone more qualified would have responded by now, but...
I think you're coding looks fabulous. Better than mine anyways. hmm

I would add more one-line comments for the more confusing lines (especially the regex--not everyone can read them)
Very nice code and documentation though.

Offline

#3 2010-04-21 14:36:21

dauerbaustelle
Member
Registered: 2010-02-07
Posts: 62

Re: Doing things The Python Way™

http://paste.pocoo.org/show/204515/

* Do not override builtin names (`list = blah`)
* Constants IN_UPPERCASE_WITH_UNDESCORES
* Keep as much logic as possible out of Views
* DON'T EVER USE `globals()`.

The most important tip: If you see yourself using strings as data container, you are most probably doing something wrong.

Offline

#4 2010-04-21 22:18:20

AlecSchueler
Member
Registered: 2009-05-30
Posts: 22

Re: Doing things The Python Way™

These are some things I noted when I read over your code:

* use __version__ etc instead of having that information in a comment
* avoid global variables, or at least distinguish them (e.g. uppercase)
* put # -*- coding: UTF-8 -*- after your shebang to force encoding
* don't go over 80 chars in a line
* use `for modtime, name in fileset` and have fetch_articles return a list of tuples -- don't use strings to make new datatypes
* line 64 - would be clearer using .format (see Chimera's post)
* line 69 - why `title[0:]` instead of `title`?

Offline

#5 2010-04-22 04:14:38

xelados
Member
Registered: 2007-06-02
Posts: 314
Website

Re: Doing things The Python Way™

dauerbaustelle wrote:

http://paste.pocoo.org/show/204515/

* Do not override builtin names (`list = blah`)
* Constants IN_UPPERCASE_WITH_UNDESCORES
* Keep as much logic as possible out of Views
* DON'T EVER USE `globals()`.

The most important tip: If you see yourself using strings as data container, you are most probably doing something wrong.

There are a lot of things I see in your edit that I've never seen before; where can I go to learn more about these and how to use them? Thanks a lot for the edit; I'll continue studying it and see what I can learn from it.

EDIT: Seems like there are some unicode errors in your edit...

Last edited by xelados (2010-04-22 04:59:26)

Offline

#6 2010-04-22 09:41:22

dauerbaustelle
Member
Registered: 2010-02-07
Posts: 62

Re: Doing things The Python Way™

xelados, http://docs.python.org. :-)

Yes, I forgot utf-8 decoding and reversing/sorting in my edit.

Offline

#7 2010-04-22 09:43:32

dauerbaustelle
Member
Registered: 2010-02-07
Posts: 62

Re: Doing things The Python Way™

AlecSchueler, he *has* `coding: utf-8` right after the shebang. No need to complain smile

Last edited by dauerbaustelle (2010-04-22 09:43:46)

Offline

#8 2010-04-23 22:41:05

raf_kig
Member
Registered: 2008-11-28
Posts: 143

Re: Doing things The Python Way™

xelados wrote:
    """Fetches a list of .mdown files in the specified directory and returns a list sorted by file mtime, descending. Each entry in the list follows a format of <mtime><filename>, e.g. '1234567890.0Hello World'"""
    list =[]
    for item in os.listdir(path):
        if item.endswith('.mdown'): #why use re here?
            modtime = datetime.date.fromtimestamp(os.stat(os.path.join(path,item)).st_mtime)
            list.append((modtime, unicode(item[:-6])))
    list.sort(cmp = lambda (a,x),(b,y) : x > y) #use a custom compare function that sorts by the first tuple element without needing to reverse the list
    return list

template = web.template.frender('template.html')

class index:
    def GET(self):
        content = u""
        fileset = fetch_articles(article_path)
        for timestamp, content in fileset:
            name = entry[12:]
            content = "%s* [%s](%s)%s\n" % (content, name, name, timestamp) #use string formatting
        return template('', markdown.markdown(content))

Just a few suggestions. But dauerbaustelles code is really worth a look if you want to see how things can be done in a more pythonic way.

Regards,

Raf

Last edited by raf_kig (2010-04-23 22:45:14)

Offline

#9 2010-04-26 02:08:19

Zoole
Member
Registered: 2008-07-27
Posts: 6

Re: Doing things The Python Way™

In fetch_articles, instead of sorting the list and then reversing it, you can do it in one go like this:
list.sort(reverse=True)

This page has some more info on sorting techniques in Python, I found it quite useful smile
http://wiki.python.org/moin/HowTo/Sorting

Offline

#10 2010-04-26 10:56:03

xelados
Member
Registered: 2007-06-02
Posts: 314
Website

Re: Doing things The Python Way™

Thanks for the tip, Zoole.

dauer: What is "yield" doing in your edit? I consulted the docs but it was not clear. There wasn't a good example for me to follow and learn from.

Offline

#11 2010-04-26 16:47:29

dauerbaustelle
Member
Registered: 2010-02-07
Posts: 62

Re: Doing things The Python Way™

xelados: A function that uses 'yield' statements implicitly is an generator, e.g. an  iterable that calculates its items not at once, but only if they're needed. I think range vs. xrange is a good example for this (Python < 3.0):

In [2]: range(5)
Out[2]: [0, 1, 2, 3, 4]

In [3]: type(range(5))
Out[3]: <type 'list'>

In [4]: xrange(5)
Out[4]: xrange(5)

In [5]: type(xrange(5))
Out[5]: <type 'xrange'>

One could implement `range` like this (of course this implementation is far from being complete):

In [8]: def myrange(n):
   ...:     result = []
   ...:     i = 0
   ...:     while n:
   ...:         result.append(i)
   ...:         i += 1
   ...:         n -= 1
   ...:     return result
   ...: 

In [10]: 

In [11]: myrange(5)
Out[11]: [0, 1, 2, 3, 4]

As you can see I create an empty list and add items to it. At the end I return that list. You can imagine how much memory that needs if n is very high: at least n * sizeof(integer).

Using xrange, you'll never create any list but only 'yield' the current item while iterating over the generator. An implementation of `xrange` could be like this:

In [12]: def myxrange(n):
    i = 0
    while n:
        yield i
        i += 1
        n -= 1
   ....:         
   ....:         

In [19]: myxrange(5)
Out[19]: <generator object myxrange at 0x9aaa1bc>

In [20]: type(myxrange(5))
Out[20]: <type 'generator'>

You see, no lists here. Saves memory and processing time and so on.

[Never use range. :-)]

Offline

Board footer

Powered by FluxBB