Automating Case Wiki Tasks
posted by brian at 10:41 PM
A while ago Chris added a login method to the CAS module in CaseClasses. It returns a mechanize Browser object so that you can programmatically surf the web as if you had logged into CAS in a real web browser.
CaseClasses also has a Codes module that has the abbreviated codes for majors, departments, and buildings. I combined these two features to tackle the Building codes project on the Case Wiki.
P.S.: There is a MediaWiki API that would normally be used to do this kind of stuff, but according to Greg, editing is not fully functional yet.
Think you could add a lot to the wiki with some automated task? Here's how it was done.
First, you'll need mechanize and CaseClasses:
$ sudo easy_install mechanize
$ sudo easy_install http://opensource.case.edu/svn/CaseClasses/python/trunk
Now log into CAS with mechanize:
import Case
from getpass import getpass
username = 'bmb12'
password = getpass() # Enter a password without echoing
cas = Case.CAS()
browser = cas.login(username, password)
You can now open any page with browser and interact with it as a logged-in Case user. So let's go to the Case Wiki and log in (set_handle_robots(False) stops mechanize from refusing pages that robots.txt disallows):
browser.set_handle_robots(False)
browser.open("http://wiki.case.edu")
browser.follow_link(text_regex='Log In')
Editing can be done like so:
browser.open("http://wiki.case.edu/User:Brian.Beck")
browser.follow_link(text='Edit this page')
browser.select_form(name='editform')
browser['wpTextbox1'] += " Also, this guy sucks!"
browser.submit()
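One thing to watch: running that snippet twice appends the sentence twice. Here is a minimal sketch of a guard, assuming you only want the text on the page once. The helper name append_once is made up for this example and is not part of mechanize or CaseClasses.

```python
def append_once(source, addition):
    """Return source with addition appended, unless addition is already in it."""
    if addition in source:
        return source  # already there, leave the page text alone
    return source + addition


# Hypothetical use with the edit form shown above:
# browser['wpTextbox1'] = append_once(browser['wpTextbox1'], " Also, this guy sucks!")
```

Since it only touches strings, you can try it out without logging in anywhere.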
Automating the building code edits was done like so:
for code, name in Case.Codes.buildings.iteritems():
    url = "http://wiki.case.edu/%s" % name.replace(' ', '_')
    try:
        browser.open(url)
    except:
        print "Didn't find %r." % name
    else:
        browser.follow_link(text='Edit this page')
        browser.select_form(name='editform')
        source = browser['wpTextbox1']
        add_text = "The building code for %s is [[building code:=%s]].\r\n"
        add_text %= (name, code)
        if 'code:=' not in source:
            insert_at = source.find('{{Building')
            if insert_at != -1:
                new_source = source[:insert_at] + add_text + source[insert_at:]
            else:
                new_source = source + add_text
            browser['wpTextbox1'] = new_source
            browser.submit()
            print "Added building code for %r." % name
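If you want to check the text-insertion logic before letting it loose on the wiki, it can be pulled out into a plain function that never touches the network. This is only a sketch of the same rules as the loop above: skip pages that already have a 'code:=' annotation, insert before the {{Building template if there is one, otherwise append. The function name add_building_code and the "Example Hall" / "EXH" values are made up for illustration.

```python
def add_building_code(source, name, code):
    """Return the new page source, or None if the page is already annotated."""
    if 'code:=' in source:
        return None  # page already has a building code, nothing to do
    add_text = "The building code for %s is [[building code:=%s]].\r\n" % (name, code)
    insert_at = source.find('{{Building')
    if insert_at != -1:
        # Put the sentence right before the {{Building ...}} template.
        return source[:insert_at] + add_text + source[insert_at:]
    # No template on the page, so append at the end.
    return source + add_text


# Made-up page text and building, just to see what comes out:
sample = "Some intro text.\n{{Building}}"
print(add_building_code(sample, "Example Hall", "EXH"))
```

Inside the loop you would then set browser['wpTextbox1'] and submit only when the function returns something other than None.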
Happy automating!
Update: The same has now been done for the Street addresses project. Check out the discussion to see how.
Comments
I've often thought that if you submit HTTP Auth credentials to a mod_cas protected resource, it should let you through to the resource.
Greg talked me out of it, though. I can't remember his take on it, but I remember it won me over fairly easily.
But now that I am thinking about it again, I think that would be the correct thing to do.
Jeremy,
Well, wouldn't the service you're logging into have your password that way (which it would forward to CAS)? It might make automated login a little more complicated, but it seems like giving credentials directly to the service is one thing CAS aims to resolve.
I'm thinking more in terms of programmatically manipulating resources.
Use case #1 is a user with a browser going to a mod_cas protected resource http://host.case.edu/foo to create the resource http://host.case.edu/foo/bar. Before they are allowed in, they are redirected to CAS, their credentials are verified, and they're allowed access to /foo to create /foo/bar.
Use case #2 is that http://host.case.edu/foo is also an Atom Publishing Protocol (APP) endpoint. So the user wants to use their APP client to create /foo/bar (APP leverages HTTP Auth). For this to happen, either a) the APP client has to know how to do CAS (and Pubcookie, and WebAuth, and CWL, and Bluestem, etc.), or b) mod_cas has to fall back to standard HTTP Auth for interoperability and to be a good, standards-compliant HTTP citizen.