Last night (aka this morning) I wrote that I would venture into the realm of making my own Python web framework, without any restrictions on making a complete (as in you-should-use-this-for-your-business) or Pythonic solution. As promised, here's my first attempt at some URL mapping mechanism. But first...

Thanks to Ian Bicking for pointing out Routes. It gave me that slight flutter of "ooh, what if someone's already done that?" For better or worse, it is quite different (and quite nice).

Adrian also corrected me on the origins of Django's components. I actually have spent some time looking at Django, and even read Adrian's presentation, which must be even better with sound & video. So either I just didn't comprehend the history of Django, or it isn't immediately obvious from looking at it that it is a single unit (I think a little of both, yes that is a slight jab, sorry).

Onto some code...

Some things to keep in mind. This is only my first attempt at this particular problem domain; as I said, I'm prepared to experiment. This was the first thing that came to mind and I was pleased when I got a working prototype in about 20 minutes.

Common Problem 1: URL Mapping

Some problems I listed with current URL mapping solutions the other day are regular expression usage, big jumps in complexity for deep URLs, and lots of space between where URLs are defined and where their respective methods are defined.

Hacky Solution 1: Division Overloading for Path Components (a la path), Which Can be Types!

This offers simple validation, keeps the URL and its corresponding method right next to each other (it's silly, but that's what I like the most), makes complex situations a bit easier, and you can call your methods and classes whatever you want.

Simple examples, assuming SQLObject classes for our model:

Edit: This may be painful to read without syntax highlighting, so here are some similar examples in a .py file.

class Blog:

    # Example: /index

    @url('index')
    def homePage(self):
        return Entry.select(orderBy='-created')[:10]


    entries = Path('entries')

    # Example: /entries?query=test&format=rss

    @url(entries, query=str, format=oneOf('html', 'rss'))
    def findEntries(self, query=None, format='html'):
        if query:
            return Entry.search(body=query)
        else:
            return Entry.select()


    # Example: /entries/5

    @url(entries/Entry.id)
    def showEntry(self, entry):
       print "Retrieving entry", entry.id
       return entry

Okay, there's something neat. That entry argument in showEntry won't be the entry ID from the URL, it will be the entry itself with that ID! When a URL component is an SQLObject alternateID field, the corresponding row is automatically retrieved. Let's go back to that findEntries example now that we know it plays nice with SQLObject...

    @url(entries, query=Entry.body)
    def findEntries(self, query):
        for entry in query:
            print "Found entry: %s." % entry.title

What's going on here? Entry.body isn't an alternateID, so it just uses selectBy instead, so query will be a SelectResults sequence. However, this is NOT the same as before, since it will only return exact matches, and there's no way to get the actual string parameter that was requested. Still, you can imagine this working well for, say, a phone book, where you only want to find exact first or last name matches. Continuing on...

    # Example: /entries/5/comments?format=rss

    @url(entries/Entry.id/'comments', format=oneOf('html', 'rss'))
    def listComments(self, entry, format='html'):
        return entry.comments

    # Example: /entries/5/comments/10

    @url(entries/Entry.id/'comments'/int)
    def showComment(self, entry, commentIndex):
        # Show an individual comment, maybe for thread view?
        return entry.comments[commentIndex]

As you can see, unknowns in the URL's path are mapped to positional arguments, while parameters are mapped to keyword arguments. Check out the use of the actual int type as a URL component in that last example. Any type will work, as long as you trust its conversion from string to be an accurate representation (bool('False') is a good example of where this fails).

Now for something a bit more complex, like file browsing in arbitrarily deep folders.

    static = Path('static')

    @url(static/[str]/str)
    def staticFile(self, subdirs, filename):
        import os
        fullPath = os.abspath('/'.join(subdirs + [filename]))
        if not os.path.isdir(fullPath):
            return file(fullPath).read()

    # Example: /static/images/anim/glow.gif
    # -> self.staticFile(['images', 'anim'], 'glow.gif')

Archives are often browsed by date, where the URL components can specify a varying date by year, month, and day. So we want to support anywhere from 0 to 3 consecutive integer arguments (0 showing the whole archive, 1 being the year, etc.)

    @url(entries/'archive'/[int])
    def listByDate(self, date):
        date = date[:3]
        date.extend([None] * (3 - len(date))) # Pad to length 3 with None
        year, month, day = date
        return Entry.selectBy(year=year, month=month, day=day)

Enclosing a component in a list means that it will match any number of that component, and the positional argument will be returned as a list. This is starting to look a bit like regular expressions, except prettier (to me, at least), and the values are converted to their respective types.

There are more things, like using tuples for enumerated values (oneOf is actually just for readability), and possibly using | like in regular expressions, but you get the idea.

This fits my brain nicely: "at this URL, return this function..."

Issues, besides the obvious: there's some repeating going on, but I think this happens in other systems with similar features. For example, you obviously have to retype the path prefixes every time if you're using regular expressions, and if you're not, there's the (typing) overhead of nesting classes (or at least making a bunch of them). And if you include validation, that's more re-typing of parameters (telling the validator which parameter should use which validator).

What about using multiple URLs for a function and vice versa? Well, the neat thing is that the url decorator sets a url property on the method itself, with all the relevant information. Redirecting is made easier with a custom Redirect exception (like in CherryPy), just give it the method, it can determine what URL to redirect to from there:

    @url(entries/'new'):
    def newEntryForm(self):
        return "<form>...</form>"

    @url(entries/'save', title=str, body=str)
    def makeEntry(self, title, body):
        entry = Entry(title=title, body=body)
        raise Redirect(showEntry, entry)

Right now this "simple" solution would actually work for all my current projects. I'm guessing anyone interested in this can probably make a good guess at how it's implemented, so I'll only post about that if someone asks. URL dispatching works fine (this isn't just a make-it-work-like-this proposal); I have a very simple function that takes a URL just like what a web server would receive and calls the appropriate method.

So, web people: could this technique actually survive? What would need to be changed?