Posts for 2011

If you can't talk, you can't program

Thanks to Danielle Thesis for pointing out this excellent article on the advantage of cities in growing social skills.

Where the Skills Are

Human progress, to a large degree, has depended on the continual expansion of social networks, which enable faster sharing and shaping of ideas. And humanity’s greatest social innovation remains the city. As our cities grow larger, the synapses that connect them—people with exceptional social skills—are becoming ever more essential to economic growth.

As a software engineering manager at a university I often lament that we have no technical challenges, only relationship challenges.  While not entirely true, it's true enough, and more often than not the big obstacles we run into have more to do with our ability to communicate and relate on a project.

Software is no different from many careers in that it's a victim of automation.  Gone are the days when a programmer could pull down 60 or 70k a year writing login pages for their institution, and that leaves us in a place where software engineers are focused on the real technical challenges of a project.  More challenging still is that project owners usually only come to a software engineer when software that solves their problem doesn't already exist.  So what we most often end up with are people talking to one another with varying degrees of certainty about what they want, often only knowing it's different but unable to explain exactly how.  In these situations, social skills are arguably more important than programming skills, or at the very least more immediately important, because if you can't bootstrap a conversation you'll never be able to get enough information on what needs to be done.

In these situations it's also incredibly important to maintain relationships over ...

(Read More)
A Stance with Dragons (Spoilers)

Taking a bit of a break to finally put down some thoughts after finishing George RR Martin's "A Dance with Dragons".  What follows will be copious spoilers and a discussion of the book, so read no further if you don't want to hear mention of what went on.

Spoiler Alert.... you have been warned...

Overall I enjoyed the book but it lacked something for me and convinced me even further that the author has lost the thread of the whole thing.  I think the technical aspects of his writing are top notch and the whole thing came off as an incredibly polished use of language, but overall the story lacked the animus of the other books.  Perhaps this can be attributed to a normal second act lull but it's hard to feel that is an adequate summary.  I didn't feel there was much in the way of character development, aside from Theon anyway, and I struggled with that not being very interesting.

Having had far too many years since seeing Jon, Tyrion or Danny I thought their story was wholly inadequate and rather flaccid.  Danny sitting around lamenting the difficulties of leadership and her aloneness, Tyrion lamenting his lot in life and being enamored with his own cleverness, Jon alone with his burden of duty and not being understood by others.  I feel like we've done all that and in some cases I thought we were past that, at least I know my interest was, so it was hard not to feel there was a retrograde in the character arcs here, which made the book feel a bit meandering.

That said I'd rather meander with these characters than most others, but still the overall lack of momentum increased the feeling that started to creep in ...

(Read More)
Social Panic Attacks and Technology

The Wall Street Journal had an interesting blog post exploring why some technologies cause moral outrage or panic and some don't.

Technology and Moral Panic

Why is it that some technologies cause moral panic and others don’t? Why was the introduction of electricity seen as a terrible thing, while nobody cared much about the fountain pen?

It is a curious topic and working in technology in Libraries, I encounter some form of this panic on almost a daily basis.  The other day I tweeted a link to Eric Hellman's piece on the changing needs of library data and one of the first comments I got was an accusation of wanting to remove all humans from the library process.   I find I repeat some form of that conversation over and over again on almost a daily basis.  

Given that experience I'm not sure whether I just have a niche culture that reacts differently or whether I disagree with the WSJ author as to what generates panic about technology.

More than our relation to space, time and people, I think it's about how big a contrast something is to the script we've written for ourselves.  Most people achieve a perspective on their life that assures them what they do day to day is valuable, and I think most go so far as to convince themselves of some form of moral "rightness" to not only what they do but how they do it.  It makes sense that we do that and it's a valuable tool for getting through any particular day and getting up the next morning.

I'd argue that it's the fundamental discomfort with changing not just a routine, but the thinking that has built up to make us accept that routine that ...

(Read More)
Maintaining an Open Relationship with Google

Google's mission is to "Organize the World's Information" and they do a rather smashing job of it as long as they alone are doing it.  Even though they do a better job than most at staying open, there is still a significant risk when putting all your eggs in the Google basket and few options for backing out.  Particularly with the rise of Google+, Google Music beta and other such services, going all in on Google could prove a big liability for individuals and companies in terms of being able to shift to new or better services as they emerge or just re-establish ownership over their own content.  Every business would do well to act with caution in opting for the convenience of any service, as that convenience could too easily transform into abdicating the ownership of your content.  I would suggest that losing control of your content in a world where ideas and content are a commodity is the same as losing control of your life or business.

With a little forethought, however, some convenient ways exist to both leverage the services offered by Google and remain managers of your information.  The Data Liberation Front launched its "Google Takeout" service, which allows the harvesting and export of your information from various Google services into open formats.  Open formats are the key to keeping your content flexible and mobile, and in a world where 5 years is an entire era of information management practices, this is critical to surviving and thriving.  The group is starting with export features related to Google services but plans to expand its reach to other services as well.  I assume (hope) this means Facebook and Yahoo! based services but only time will tell.

Even with groups like ...

(Read More)
Game of Thrones Redux

Even knowing what was coming in the series, the end left me as pleased and stunned as I was when I first started reading the books.  The characters were visualized and brought to life in amazing ways and I was blown away at how lean and mean the storytelling was.  The series highlights just what a powerful contribution a great set of writers and directors can make to a series.  I'm not one of those fans who feels unhappy because their favorite character wasn't featured enough.  The breadth of the story itself meant that there was just no way every character could be touched on and they obviously had some hard decisions about who to feature and how.  Given the realities of what they had to film I think the obviously painful choices were good ones.

Even given the lean nature of the storytelling there is a long list of standout characters for me in the series already.  Danny and Drogo, Tyrion, Jon and Jamie were just brilliant and it's only in Game of Thrones that that would seem like a short list of stand-out characters.  Pulling that off in any series would be amazing, here it's just a miracle.

Something else the series really highlighted for me was just how telling the story on screen drove the need for different decisions than telling it in the book.  Most obvious to me is the fact that in print there is no background; everything is foreground in writing.  You don't accidentally see a sentence in a novel, so every detail, character and event is right there in the front of your mind.  The series instead was unafraid to let many things play out in the background, which gave readers of the book a special treat ...

(Read More)
New Rise of the Apes European Trailer

There's a new trailer out for the Planet of the Apes reboot called "Rise of the Apes" and it really looks fantastic.  It's amazing how much story they can convey in a simple trailer and I have hopes the movie has a lot of depth in it.  There seems to be a lot of potential for commentary about the nature of intelligence and self, the respect for life and compassion or what a lack of it brings.  Seeing the "acting" of Caesar with the John Lithgow character, who seems to have Alzheimer's, brought a tear to my eye as you watch the chimp express compassion for the human who is obviously struggling.  This could be the start of a great movie franchise, at least I hope it is.

Django Libraries for XML and eXist DB

We often use XML at Academic Libraries and decided to build a central set of libraries to ease the work of connecting our XML and repository based projects to the Django framework.  We'll be continuing to build these libraries out and recently released the code as open source projects on GitHub.

EULxml provides XPath parsing features in Python and mappings from XML documents to pythonic objects, as well as features to bind Django Forms to simple XML objects.  The code is available on GitHub, with some documentation and examples up on Read the Docs.

EULexistdb provides connections and XQuery capability to eXist DB and Django QuerySet-like objects for rich interaction between Django and XML data stored in eXist DB.  Combined with the XML Django Forms from EULxml (on which it depends) it has enabled us to do a lot with our Library collection.  This library is also available on GitHub and has some documentation and examples up on Read the Docs.

We're excited at the possibilities of leveraging the power of Django with our XML databases and repositories.   We're open sourcing it in hopes others may find it useful and may want to contribute to the libraries as well.

They Shoot URLs Don't They?

I've had a rather lengthy and interesting blogging life these last nine years and stopped over the last few for a number of reasons.  A backlog of 2400+ posts, however, has given me a rather interesting dataset to test when it comes to URL persistence, and as I'm going over old posts I find an example of URL persistence that seems very backwards to me.

I used to run a funny little site for Gamespy called Paragon City Hall, just a community based site for an as-then-unreleased game called City of Heroes, and posted about that site back in 2002.

In another post around the same time I reference my depression over a news story from Reuters that made me want to kill myself.  Over dramatic yes, but hey, it was 9 years ago so give me a break.

The amazing thing to me is that the Reuters story results in a dead link, nothing, no forward, no search suggestion, nothing.  The Paragon City Hall link however STILL WORKS, even though I shut the site down 8 years ago and it's unlinked by the network.

What kind of world do we live in when RPGPlanet has better URL persistence than Reuters?

Although I came to this realization later than I should have in my career, content on the web is a Social Contract.  Tim Berners-Lee made this case quite eloquently in an article I cite quite frequently, and while I might not expect Gamespy to understand it, of all agencies Reuters should get it.  They should have gotten it perhaps before even TBL posted anything about it in 1998.

A particular fear creeps over me when an agency like Reuters is letting links expire like that, and it offends me as an adopted digital librarian.   (They found me ...

(Read More)
Django AuthenticationForm For User Login

Django already makes it insanely easy to log a user in and out via its generic views.  Engineers will often want to create their own login view to provide some flexibility, say an Ajax login or other spin on standard login.  A number of examples are given in Django for that as well, and as with most of the framework this is a snap too.  A convenient feature of Django that doesn't make it into many of the examples I've seen is the AuthenticationForm, a ready-made Django form with the associated logic to render a login form, validate input, throw errors if users do things like forget to supply a password, and do the basic authentication check.

The form provides that all for you and all you really need to do in your view is read the user submitted data, validate the form and take the final step of logging the user in.

This is just one form in a group of 7 or so that provide all kinds of convenience features like Password Changes and User Registration.  Not only do they provide a developer with very easy access to common functions but they can be extended or subclassed like any other Python class to add or override functionality.

Here's an example of a simple view method using the AuthenticationForm.  Something of a 'gotcha' for developers who normally use Django forms is that the POST values are passed as the second argument to the form.  The request object can be passed as the first argument, but that is normally only done to check for authentication cookies.  See the source for more info on the form.

from django.contrib.auth import login
from django.contrib.auth.forms import AuthenticationForm
from django.shortcuts import render
from django.http import HttpResponseRedirect
from django.core.urlresolvers ...
(Read More)
Django Template Tag for Gravatar Images

Gravatar images seem to be growing in popularity across a number of sites and the service already makes it incredibly simple to grab a profile picture there via URL.  The Gravatar site itself has a number of examples on how to grab an image off of the service, as well as more detailed examples of grabbing more information.

They do provide examples for grabbing an image via Python and even a Django example which renders the image as a template node.  For displays like this I generally prefer an inclusion tag since I can render the image in a template rather than having to build it each time on my own.

The template tag itself is just:

import hashlib
import urllib

from django import template
from django.conf import settings

# Provide Default settings so users only need to provide them in settings.py if they want to override.
GRAVATAR_BASEURL = getattr(settings, "GRAVATAR_BASEURL", "http://www.gravatar.com/avatar/")
GRAVATAR_DEFAULT_IMAGE = getattr(settings, "GRAVATAR_DEFAULT_IMAGE", "")
GRAVATAR_SIZE = getattr(settings, "GRAVATAR_SIZE", 40)

register = template.Library()

def gravatar_url(email, size):
    """
    Builds a Gravatar Image URL based on the provided email.

    :param email: Email address to query for a gravatar image.
    :param size:  Size to request and render the image in pixels.
    """

    attrs = {
        'd': GRAVATAR_DEFAULT_IMAGE,
        's': size
    }

    # Gravatar hashes the trimmed, lower-cased email address.
    email_hash = hashlib.md5(email.strip().lower()).hexdigest()
    url = "%s%s/?%s" % (GRAVATAR_BASEURL, email_hash, urllib.urlencode(attrs))

    return {'gravatar': {'url': url, 'size': size}}

@register.inclusion_tag('account/snippets/gravitar.xhtml')
def gravatar_for_email(email, size=GRAVATAR_SIZE):
    """
    Renders a gravatar image for user with the specified email via a template.

    {% gravatar_for_email "user@email.com" 40 %}

    :param email:  String representing the users email.
    :param size: Size of gravatar to use in pixels.  OPTIONAL
    
    """
    email = "%s" % email
    size = int(size)
    return gravatar_url(email, size)
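
The URL construction on its own can also be tried outside of Django. What follows is only a minimal sketch written for Python 3 and the standard library; the function name `build_gravatar_url` and the hard-coded base URL and defaults are assumptions for illustration rather than part of the tag above.

```python
import hashlib
from urllib.parse import urlencode

# Assumed default; the tag above reads this from settings.GRAVATAR_BASEURL.
GRAVATAR_BASEURL = "http://www.gravatar.com/avatar/"


def build_gravatar_url(email, size=40, default_image=""):
    """Return a Gravatar image URL for the given email (hypothetical helper)."""
    # hashlib needs bytes on Python 3, so encode the normalised address.
    email_hash = hashlib.md5(email.strip().lower().encode("utf-8")).hexdigest()
    query = urlencode({"d": default_image, "s": size})
    return "%s%s/?%s" % (GRAVATAR_BASEURL, email_hash, query)


# Whitespace and letter case do not change the resulting URL.
url = build_gravatar_url(" User@Email.com ", size=80)
print(url)
```

The inclusion tag above remains the piece that actually renders the image in a template.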

This approach also has the advantage of being extendable and it's easy enough to build additional ...

(Read More)
Gaming Wiki Back Online

I brought my Gaming Wiki back online on the site here after several months of being down. I apologize for that and I don't have any better excuse than not really taking the time to do it.  I had some difficulties with my previous web host and all I was ever able to get was a *.tar.gz download of the wiki database, and the service would time out every time I tried to download a gzipped directory of the MediaWiki install itself.  I kept hoping I'd find a backup copy on a CD somewhere but no joy.

So it languished in 404 hell for a bit while I came to terms with the fact that I'd have to take the DB dump from an unknown older version of MediaWiki and try to upgrade it to work with a modern download.  

All in all I have to give it to the MediaWiki folks, I was essentially able to just run the update scripts and only had to make a few settings changes that took a bit of looking up.  More or less though the whole thing came back up.

Because I couldn't get a backup of the files stored in my wiki, though, some of the file links and thumbnails won't work until I come up with some plan to rebuild them.

Thanks to everyone for your patience.

SemTech 2011 Redux

The SemTech 2011 Conference delivered a lot to the attendees and I thought I'd jot down a few of my thoughts and note some highlights as the conference draws to a close.

By and large I have to say that the technology has definitely arrived and we're capable of some exciting advances in linking data and having the web begin to fulfill some of the promises of being a real knowledge base.  I just hope Skynet appreciates all the work we're all doing on its behalf when it finally becomes self aware.

What had the structured data crowd buzzing the most was last week's announcement of Microdata format support by Google, Microsoft and Yahoo at schema.org.  Annoyance aside at Microsoft trying to put up schema.org as if it was some small independent standards board, the Microdata format seems just fine to me.  Essentially a competitor to RDFa, it targets easy markup of information in a web page and is a bit leaner and easier than the current RDFa 1.0 standard.  The crowd here being a bit biased toward RDFa, there wasn't a lot of positive talk about Schema.org, but I find I can't really care too much one way or another.  What we need to develop is a community of practice, and the technology should be secondary to that as long as it's not a barrier.  To me Microdata and RDFa are both fine standards, and the only logical argument I would make to prefer one over another is that Schema.org's aim is to mark up information for better searching while RDFa is aimed at marking up knowledge.  It may seem a subtle difference but misaligned motivations like this can be the cause of ...

(Read More)
San Fran Pic-So

The conference ended by mid-day here so I decided to take an open-top bus tour around San Francisco.  Great experience, and you could get on and off all day, so I got to see more of the city in 4 hours than I have on most of my previous trips.  Posted pics to Picasa Web and linking below.

Dead Simple Python Calls to Open Calais API

I was amazed at how easy Open Calais makes it for anyone to make calls to its API via REST and return suggested tags and entity recognition for any text.  Native Python libraries urllib(2) and httplib provide some effective methods for connecting and making simple REST calls to the Calais Web Services API, but the httplib2 library makes it easier still.

Start off by installing httplib2 via pip

pip install httplib2

From there you just need to get an API key at the Calais site, set some headers, define a bit of text you want to pass to the API for tagging and entity recognition and then reap the benefit.

You can see this in the simple code snippet below...

import httplib2
import json

# Some local values needed for the call
LOCAL_API_KEY = 'PUT_YOUR_KEY_HERE' # Acquire this by registering at the Calais site
CALAIS_TAG_API = 'http://api.opencalais.com/tag/rs/enrich'

# Some sample text from a news story to pass to Calais for analysis
test_body = """
Some huge announcements were made at Apple's Worldwide Developer's Conference Monday, including the new mobile operating system iOS 5, PC software OS X Lion, and the unveiling of the iCloud.
"""

# Header information needed by Calais.
# For more info see http://www.opencalais.com/documentation/calais-web-service-api/api-invocation/rest
headers = {
    'x-calais-licenseID': LOCAL_API_KEY,
    'content-type': 'text/raw',
    'accept': 'application/json',
}

# Create your http object
http = httplib2.Http()
# Make the http post request, passing the body and headers as needed.
response, content = http.request(CALAIS_TAG_API, 'POST', headers=headers, body=test_body)

jcontent = json.loads(content) # Parse the json return into a python dict
print(json.dumps(jcontent, indent=4)) # Pretty print the resulting dictionary returned.
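
Once the response has been parsed into a dictionary, pulling names out of it is a short loop. The sketch below is written for Python 3 and assumes each entry in the parsed response is itself a dictionary carrying `_typeGroup` and `name` keys; that layout, the helper name `names_by_group` and the hand-written sample data are all assumptions for illustration, so check them against a real response before relying on them.

```python
def names_by_group(parsed_response, type_group):
    """Collect the 'name' of every entry whose '_typeGroup' matches type_group."""
    names = []
    for entry in parsed_response.values():
        # Skip anything that is not a typed record, such as document metadata.
        if isinstance(entry, dict) and entry.get('_typeGroup') == type_group:
            names.append(entry.get('name'))
    return sorted(name for name in names if name)


# Hand-written stand-in for a parsed response; this is not real Calais output.
sample = {
    'doc': {'info': {}},
    'hypothetical-id-1': {'_typeGroup': 'entities', 'name': 'Apple'},
    'hypothetical-id-2': {'_typeGroup': 'entities', 'name': 'iOS 5'},
    'hypothetical-id-3': {'_typeGroup': 'socialTag', 'name': 'Cloud computing'},
}

print(names_by_group(sample, 'entities'))
```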

The server itself parses the body sent as part of the http request and returns a json string with the results in this example because ...

(Read More)
SemTech 2011 - O'Reilly on RDF in eBooks

Instead of a flood of tweets I thought I'd go a bit old school and do some live blogging from the SemTech 2011 session Discovering and Using RDF for Books at O'Reilly Media this morning.   My own interest in this session is how we might apply this to texts coming from our local repository and in particular related to our Yellowbacks Project which we hope to enhance soon.  We also have a body of texts sitting on our servers in TEI format and we haven't landed on a way to comfortably leverage that in our infrastructure.  My own comments here appear in parenthesis (like so).

O'Reilly took their first stab at modeling information about their books in straight XML in a bit of a "tag soup" approach. This proved way too heavyweight for them and they ended up being late in delivering products because of the time it took to modify and extend their XML approach.  They then moved onto ONIX as an internal format, but it was old and writing XPath was a bit nightmarish because of the standards drift involved and other reasons.  In the end it was just not extensible and not friendly toward being agile.  That led them to take a stab at creating their own schema, which also proved too heavyweight and slow.  Alas they washed up on the shores of Dublin Core, specifically with DC Terms, and this introduced them to the world of RDF.

The extensibility of RDF starting with DC seemed pretty cool and useful to them and they kept adding FOAF, BIBLIO and more.  More useful for the company, the problem at the end of the day was they were still thinking in XML terms.  (Implying they should have been thinking in RDF and triples terms instead ...

(Read More)
Some Antics for the Week

I'm off this week to the SemTech 2011 conference in San Francisco so content may be a bit light.  I hope to have some interesting things to say when I come back, once I clear away the fog of depression from being unhappy with my information architecture, service architecture and 'no doubt' feeling like my content is worthless because it isn't backed by OWL.

Sigh.

On a side note, I could use a bit of a break from "I've invented the intelligent web" quotes from just about every CEO or manager I meet here.  I wish them luck of course and I love to see competition in the field, but this is a tech conference (mostly), not really a marketing one, so people are going to need more information than that.

On another note, meeting up with some folks from Yale, Mayo Clinic and The Library of Congress has proven very interesting. They're doing some great stuff that I think has some application for us.  I'm looking forward to getting back.

Setting up LAMP on Ubuntu 11.04 (Natty) Desktop Edition

The default download for Ubuntu 11.04 (Natty) is the Desktop edition, which doesn't come with the LAMP server stack installed by default.  Fortunately setting up LAMP is almost as easy as installing Ubuntu itself these days.  You could install each package separately via apt-get but the most convenient method is to use the Tasksel package to install and configure LAMP all at once.

Install it from the terminal by typing:

$ sudo apt-get install tasksel

and when complete launch it in terminal by typing:

$ sudo tasksel

The basic terminal GUI will allow you to select from a number of package installs.  Use the arrow keys to move through the list and find the LAMP Server entry.  Hit spacebar to select it, then Tab to highlight [ok], and hit return.  The installer should walk you through setup options as needed.

There are a few things you may want to look at after the install.  The Apache install defaults to allowing directory indexing, and many people turn this off.

MySQL enthusiasts may also wish to install phpMyAdmin, which is not installed as part of the LAMP package.  For that just type...

$ sudo apt-get install phpmyadmin

And there you go!

X-Men: First Class Review

Overall I thought X-Men: First Class was just kind of okay.  Several fun moments separated by some overstylized and heavy handed storytelling.   I thought I'd share a few spoiler free thoughts.

I really have to blame the directors and producers for this as the cast is pretty top notch, but most of the performances were over the top.  Even so each one of them managed to squeeze in a couple of real or fun moments for their characters.  The stars had to be the Dorm Room X-Men by far, who were fun throughout most of the movie.  McAvoy was consistently good as Xavier but Magneto was just as consistently cardboard and dry, while Sebastian Shaw came off as a massive parody of an already over the top character from the comics.

Overall it felt like the movie was trying to conjure up a Bond movie feel but more often than not seemed a little too Austin Powers to me.

What was consistently good were the special effects and at times they were downright fantastic.  Banshee's flying scenes were really great; it felt more like watching someone snap back up from a bungee jump and didn't feel CGI at all.  Jennifer Lawrence is great, but the slavish devotion to tying her character into the later movies made Mystique seem a bit cartoonish next to the other younger X-Men.  I could have used more Mystique/Beast throughout and less Xavier/Magneto bro-mance.

Overall I have to reiterate that I thought the movie had a few good moments but overall was a miss.  After movies like Thor, Iron Man 1/2, and the latest Hulk it seemed Marvel had wised up and put movies like Wolverine behind them; this movie just seemed a step backward for the studio ...

(Read More)
Wordpress to Django: Designing Compatible URLs in urls.py

As I mentioned in my previous post, there are a few fairly easy strategies for maintaining the stable URLs for your content when migrating from WordPress to a local Django driven blog.

Django allows you a high level of control over URL formats so it's fairly simple to design them to be compatible with WordPress URLs.  Additionally WordPress has been around long enough that the standard URL re-write formats follow suggested best practices for content, so bringing your Django URLs in alignment with that is not only useful for migrating content but good practice overall.

That said the two most common formats for URLs in WordPress are:

http://<domain>/<4 digit year>/<1 or 2 digit month>/<1 or 2 digit day>/<slug>/

so for example the URL for the previous post linked above is...

http://www.flagonwiththedragon.com/2011/06/01/wordpress-to-django-strategies-dealing-with-WordPress-querystring-urls/

The next most common format for URLs is similar and differs mostly in how months are abbreviated:

http://<domain>/<4 digit year>/<3 char month>/<1 or 2 digit day>/<slug>/

So an example of the same URL above in this format would be...

http://www.flagonwiththedragon.com/2011/jun/01/wordpress-to-django-strategies-dealing-with-WordPress-querystring-urls/
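
Both formats can be checked with plain regular expressions before wiring anything into Django. This is only a rough sketch using the standard `re` module; the group names (`year`, `month`, `day`, `slug`) and the helper `parse_post_path` are assumptions made for illustration, and the paths are matched without a leading slash, which is how Django hands them to URL patterns.

```python
import re

# Group names (year, month, day, slug) are assumptions chosen for this sketch.
NUMERIC_MONTH = re.compile(
    r'^(?P<year>\d{4})/(?P<month>\d{1,2})/(?P<day>\d{1,2})/(?P<slug>[0-9A-Za-z-]+)/$'
)
ABBREVIATED_MONTH = re.compile(
    r'^(?P<year>\d{4})/(?P<month>\w{3})/(?P<day>\d{1,2})/(?P<slug>[0-9A-Za-z-]+)/$'
)


def parse_post_path(path):
    """Return the captured URL parts as a dict, or None if neither format matches."""
    for pattern in (NUMERIC_MONTH, ABBREVIATED_MONTH):
        match = pattern.match(path)
        if match:
            return match.groupdict()
    return None


SLUG = 'wordpress-to-django-strategies-dealing-with-WordPress-querystring-urls'
numeric = parse_post_path('2011/06/01/%s/' % SLUG)
abbreviated = parse_post_path('2011/jun/01/%s/' % SLUG)
print(numeric['month'], abbreviated['month'])
```

Named groups matter here because Django passes them to the view as keyword arguments, so the same view signature can serve both URL styles.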

Designing urls.py in Django to accommodate this is simply:

    # URL format where the month is a three-character abbreviation.
    # The group names were lost from the original listing; year, month, day and slug follow the URL parts described above.
    url(r'^(?P<year>\d{4})/(?P<month>\w{3})/(?P<day>\d{1,2})/(?P<slug>[0-9A-Za-z-]+)/$', 'post_detail_alt'),
    url(r'^(?P<year>\d{4})/(?P<month>\w{3})/(?P<day>\d{1,2})/$', 'post_day_alt'),
    url(r'^(?P<year>\d{4})/(?P<month>\w{3})/$', 'post_month_alt'),
    # URL format where month is either one or two digits.
    url(r'^(?P<year>\d{4})/(?P<month>\d{1,2})/(?P<day>\d{1,2})/(?P<slug>[0-9A-Za-z-]+)/$', 'post_detail', name='post-detail'),
    url(r'^(?P<year>\d{4})/(?P<month>\d{1,2})/(?P<day>\d{1,2})/$', 'post_day', name='list-day'),
    url(r'^(?P<year>\d ...
(Read More)
Using the Django-Pagination app in Django 1.3

Like many Djangonauts I use django-pagination as my primary means of paging results on list pages.  Between the original Google project page (1.0.5 is the last downloadable version), what appears to be the same project migrated to GitHub, and the PyPI site for the project (1.0.7 is the pip install version), things can get confusing.

I'm probably the only person confused by this but it appears that the GitHub site is the most up to date and it appears to be in sync with the pip install version.  Life signs overall are dubious on the project though, with no updates having come since early 2010 and several (what seem to be) reasonable pull requests sitting in the project queue.

One gotcha I wanted to point out in the PyPI readme file is the directions for TEMPLATE_CONTEXT_PROCESSORS:

According to the project documentation, in settings.py you should set:

TEMPLATE_CONTEXT_PROCESSORS = (
   "django.core.context_processors.auth",
   "django.core.context_processors.debug",
   "django.core.context_processors.i18n",
   "django.core.context_processors.media",
   "django.core.context_processors.request"
)

This can cause some problems in Django 1.3 however since the default TEMPLATE_CONTEXT_PROCESSORS have changed, in particular to support the new features for serving static media. 

So for pagination to work while keeping the default template context processors, you should instead set:

TEMPLATE_CONTEXT_PROCESSORS = (
    "django.contrib.auth.context_processors.auth",
    "django.core.context_processors.debug",
    "django.core.context_processors.i18n",
    "django.core.context_processors.media",
    "django.core.context_processors.static",
    "django.contrib.messages.context_processors.messages",
    "django.core.context_processors.request",
)

Alternatively, if you just want to extend the default template context processors with the one django-pagination needs, you could import the defaults into settings.py and append to them:

from django.conf.global_settings import TEMPLATE_CONTEXT_PROCESSORS
TEMPLATE_CONTEXT_PROCESSORS += ("django.core.context_processors.request",)