Entries for November 2006

Talk: Geocoding at Case

Tomorrow (November 28th at 6 PM) is ACM's Nerd Cultural Dinner at Case. The dinner is located in Thwing Ballroom and you should probably reserve a ticket by e-mailing acm-officers@case.edu.

After dinner is served, there will be several presentations. The first (at 6:30 PM) is by yours truly and is about Geocoding at Case. They gave me 15 minutes to talk, and I'll be following this approximate outline:

Geocoding at Case

  • What is geocoding?
  • Why is it useful?
  • Using the Case geocoder web service
  • A simple example
  • Development details
    • Location finding: Case Wiki, Google Maps, ...
    • Location parsing: Matching buildings, areas, addresses, ...
  • How you can contribute to its development and improvement

Hope to see you there!

Simple CAS 1.0 Authentication for Django

Back when I expressed interest in making the web presentation bounty based solely on client-side code, Simon (bounty master and Filer admin) expressed his wish to keep the two services decoupled (so I shouldn't rely on Filer for slideshow storage). While I still want to have a save-to-Filer feature, I decided that I should just go ahead and get the web presentation system up and running before worrying about a client-side-only version. So I started a Django project.

Anyway, the result is that I got CAS 1.0 working alongside the Django authentication system, which means I can take advantage of built-in features like permissions and messages with CAS-authenticated users.

If anyone else is interested in using CAS authentication with Django, you can download the code I'm using. Here's a brief usage guide:

  • Set SERVICE_URL in cas/__init__.py to the location of your CAS service. For example, Case's is https://login.case.edu/cas/.
  • Set DEFAULT_REDIRECT_URL in cas/__init__.py. Normally the user will be sent back to their HTTP_REFERER (the page that requested login) after authentication. But if the user requests /accounts/login/ directly (or there is no HTTP_REFERER), they will be sent to DEFAULT_REDIRECT_URL.
  • Enable the login and logout views by adding these to your URLconf (customize the URLs if you want):
    (r'^accounts/login/$', 'your_site.cas.views.login'),
    (r'^accounts/logout/$', 'your_site.cas.views.logout'),
    
  • Add the backend in settings.py:
    AUTHENTICATION_BACKENDS = (
        'your_site.cas.backends.CASBackend',
    )
    
  • Make sure at least the following apps are installed:
    INSTALLED_APPS = (
        'django.contrib.auth',
        'django.contrib.sessions',
        'your_site.cas',
    )
    
  • Finally, if you have a way to populate the user's name and e-mail address fields from their username, put it in cas/backends.py (see the comments). For example, I have LDAP code there.
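The redirect rule from the second bullet can be sketched roughly as follows. This is not the code from the download: `pick_redirect` and its parameters are hypothetical names, and treating a referer that points at the login page itself as missing is an assumption.

```python
from urllib.parse import urlparse


def pick_redirect(referer, default_url, login_path="/accounts/login/"):
    """Choose where to send the user after a successful login."""
    # No referer at all: the login URL was probably requested directly.
    if not referer:
        return default_url
    # Assumption: a referer that is the login page itself is not a useful target.
    if urlparse(referer).path == login_path:
        return default_url
    return referer
```

Calling `pick_redirect(request.META.get("HTTP_REFERER"), DEFAULT_REDIRECT_URL)` from a view would be one way to use it.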

P.S.: This just implements the minimum required for CAS authentication. Features like gateway, renew, and proxies are not supported.
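For anyone curious what "the minimum" amounts to, here is a minimal sketch of the CAS 1.0 round trip in modern Python 3. It is not the code from the download. The function names are hypothetical, and the sketch assumes the service URL ends with a slash, as in the Case example above. CAS 1.0 answers a `validate` request with two lines: `yes` and the username on success, or `no` and an empty line on failure.

```python
from urllib.parse import urlencode, urljoin


def build_login_url(service_url, return_url):
    """URL to send the browser to so the user can log in via CAS."""
    return urljoin(service_url, "login") + "?" + urlencode({"service": return_url})


def build_validate_url(service_url, return_url, ticket):
    """URL the web app fetches to check the ticket CAS handed back."""
    query = urlencode({"service": return_url, "ticket": ticket})
    return urljoin(service_url, "validate") + "?" + query


def parse_validate_response(body):
    """Return the username from a CAS 1.0 validate body, or None on failure."""
    lines = body.splitlines()
    if len(lines) >= 2 and lines[0].strip() == "yes":
        return lines[1].strip() or None
    return None
```

An authentication backend would fetch the validate URL, pass the body to `parse_validate_response`, and look up or create the matching Django user when a username comes back.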

An alpha version of the presentation system should be online to play with later this week.

Python Databases Workshop, Post-Mortem

So apparently doing a workshop is orders of magnitude harder than doing a talk. I say this because the Making Databases Fun with Python workshop I hosted tonight went horribly awry due to technical failures.

First off, if you're looking for workshop notes, here's the file I used to guide myself through the workshop. If you're looking for a screencast and audio, which I was all prepared to record: sorry, I totally forgot to set up both of them. Anyway, they wouldn't have been very useful because everything went wrong.

First I'll explain the setup I used for the workshop, then I'll theorize about what went wrong. The computer lab I used was stocked with over 20 Windows machines. Since I didn't have privileges to install Python and all the necessary libraries, I simply ran PuTTY on all of them and connected to a FreeBSD server I had set up for this purpose.

Everyone used PuTTY to log onto the same account on the FreeBSD machine, which had this zsh login script:

python2.4 login.py

And login.py contained:

import os
import IPython

print """
    Hi! Welcome to my talk, Making Databases Fun with Python.
    This login process will get you set up with your own sandbox
    folder and Python interpreter. No password is required.
"""

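# Keep asking until we get a non-empty username.
username = None
while not username:
    username = raw_input("Enter your Case network username: ").strip().lower()

# Each user gets a sandbox directory named after their username.
try:
    os.mkdir(username)
except OSError:
    # Assume the directory already exists from an earlier login.
    print "Sandbox found."
else:
    print "Sandbox created."

print "Entering sandbox..."
os.chdir(username)
print "Starting Python!"
# Restart the shell whenever it exits.
while True:
    IPython.Shell.start().mainloop()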

Okay, this script doesn't handle edge cases, but it covers the basics and it worked fine. Each person enters their unique ID, and the script puts them in an IPython shell in their own directory.
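A slightly more defensive take on the sandbox step of the script above might look like the sketch below. It is written for modern Python 3 rather than the Python 2.4 used at the workshop. The helper names and the username pattern (lowercase letters and digits, at most 16 characters) are assumptions. The aim is to reject input such as `../etc` and to let real errors, like a permissions problem, surface instead of being reported as "Sandbox found."

```python
import os
import re

# Assumption: network usernames are short runs of lowercase letters and digits.
USERNAME_RE = re.compile(r"[a-z0-9]{1,16}")


def clean_username(raw):
    """Normalize input and reject anything that could escape the sandbox root."""
    name = raw.strip().lower()
    if not USERNAME_RE.fullmatch(name):
        raise ValueError("invalid username: %r" % raw)
    return name


def enter_sandbox(root, raw_username):
    """Create the user's directory under root if needed and return its path."""
    path = os.path.join(root, clean_username(raw_username))
    existed = os.path.isdir(path)
    # exist_ok only silences "already exists"; other OSErrors still raise.
    os.makedirs(path, exist_ok=True)
    print("Sandbox found." if existed else "Sandbox created.")
    return path
```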

The first thing that went wrong: after almost 20 people had logged in, SSH on the remote machine stopped accepting logins. My first thought was that this was a FreeBSD security feature. Luckily, this left only one person without an IPython shell.

Not long after getting into the DBM example, people started having import problems. Despite everyone being connected to the same machine, running the same installation of Python, some people would get errors trying to import anydbm and some wouldn't. I got around this by just telling some people to import dbhash or gdbm instead, which worked fine.
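The workaround of trying a different backend by hand can be automated. The sketch below uses the Python 3 module names (`dbm.gnu`, `dbm.ndbm`, `dbm.dumb`) instead of the Python 2 `anydbm`, `dbhash`, and `gdbm` modules used at the workshop. `first_available_backend` is a hypothetical helper, not part of the standard library.

```python
import importlib

# Python 3 names, in order of preference; dbm.dumb is pure Python.
CANDIDATES = ("dbm.gnu", "dbm.ndbm", "dbm.dumb")


def first_available_backend(candidates=CANDIDATES):
    """Return the first dbm backend module that imports cleanly."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # backend not built on this machine; try the next one
    raise ImportError("no dbm backend could be imported")
```

This only papers over a missing backend. It would not explain why the same import succeeded for some users and failed for others on one machine.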

Things really went haywire when we got to the DBAPI portion of the workshop. Everyone could import pysqlite2 okay, but most people started getting errors when trying to make a connection object. Some people got back a valid connection object, while others got an exception claiming the database file could not be read. Keep in mind, these people were all in their own sandbox working with their own database file. I figured it might be a thread-safety issue, but why would that be a problem if they're all using their own database?

The rest of the talk covering SQLAlchemy wasn't much better. People kept on getting import errors or just weird exceptions claiming to not be able to create or read the database file. It was a total disaster. Sometimes people encountered failures, then simply tried again later and it worked, which really sounds like a threading issue.

So, Python and FreeBSD folks, what do you think? Did SSH lock out my logins eventually, even though there were practically no failed login attempts? Was my FreeBSD using an unreasonably low file handle limit? Was having everyone import all these modules around the same time bumping up against the GIL? Or simply thread-safety issues in some modules? Keep in mind, there were about 20 people at the talk, just sitting mostly idle in an IPython shell. That's all it takes to fail at loading anydbm?
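One way to start checking the limit theory is to ask the operating system what the shared account is allowed. Since everyone logged in as the same user, any per-user cap would apply to the whole room at once. The sketch below (Python 3, Unix only) reads the open-file and process limits with the standard `resource` module. `describe_limits` is a hypothetical helper, and whether either limit was actually the culprit is not established here.

```python
import resource


def describe_limits():
    """Return (soft, hard) limits that could cap a shared login account."""
    names = {
        "open files": resource.RLIMIT_NOFILE,
        "processes": resource.RLIMIT_NPROC,
    }
    return {label: resource.getrlimit(which) for label, which in names.items()}


if __name__ == "__main__":
    for label, (soft, hard) in describe_limits().items():
        print("%-10s soft=%s hard=%s" % (label, soft, hard))
```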

Besides the database file creation errors, at some point different exceptions were saying they couldn't find lib/libc or zlib. Also, while troubleshooting and reloading IPython, sometimes it just said 'IPython not found' and dumped me into a regular Python shell. But trying again loaded IPython just fine. Was this a result of having IPython running as the same user many times at once?

Was it my fault? Was it FreeBSD's fault? Was it Python's fault?

Workshop: Making Databases Fun with Python

Reminder! This is today!

Did you ever notice how writing SQL is not very fun?

This Monday (November 20th) on behalf of Case Project Club, I will be hosting a workshop for those interested in Python and databases. The talk will be at 7:01 PM (sharp) until 8:30 PM in the Olin 303 classroom/computer lab. I'll have Python all set up for everyone to play with and follow along. Pizza and drinks will be provided!

Python is a powerful dynamic programming language suitable for many tasks, including data analysis for research, web programming, and just plain fun. Even if you don't know Python, there won't be any crazy wizardry going on during the workshop, so you should be able to pick up the basics very quickly.

Some contents of the talk will include:

  • Simple data/object persistence, for when SQL is overkill.

  • The dbapi, a standardized interface for talking to databases with Python.

  • An overview of object-relational mappers that will let you harness the power of relational databases without writing a single line of SQL (and easily swap out SQL backends).

  • Construction of a database application during the workshop everyone can play with, made with Django's object-relational mapper (or perhaps SQLAlchemy).
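To give a taste of the first two topics, here is a minimal sketch using only the modern Python 3 standard library. These are not the workshop notes. The `dbm.dumb` and `sqlite3` modules stand in for the key/value and DB-API parts, and the file names, table, and sample values are made up for illustration.

```python
import dbm.dumb
import os
import sqlite3
import tempfile


def persistence_demo(folder):
    """Store one value with dbm, then one row through the DB-API."""
    # Simple key/value persistence: no schema, no SQL.
    with dbm.dumb.open(os.path.join(folder, "notes"), "c") as db:
        db["greeting"] = "hello"
        greeting = db["greeting"].decode("utf-8")  # values come back as bytes

    # DB-API pattern: connect, get a cursor, execute, fetch.
    conn = sqlite3.connect(os.path.join(folder, "talk.db"))
    try:
        cur = conn.cursor()
        cur.execute("CREATE TABLE attendee (name TEXT)")
        cur.execute("INSERT INTO attendee (name) VALUES (?)", ("abc123",))
        conn.commit()
        cur.execute("SELECT name FROM attendee")
        rows = cur.fetchall()
    finally:
        conn.close()
    return greeting, rows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        print(persistence_demo(folder))
```

The same connect, cursor, execute, fetch pattern carries over to other DB-API drivers, though the parameter placeholder style (`?` here) varies between them.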

Again, no prior knowledge of Python or any of the related libraries is required.

Hope to see you there!
