Archive for the 'Open Source' Category

Sphinx Search with PostgreSQL

While I don’t plan on moving away from Apache Solr for my searching needs any time soon, Jeremy Zawodny’s post on Sphinx at craigslist made me want to take a closer look. Sphinx works with MySQL, PostgreSQL, and XML input as data sources, but MySQL seems to be the best documented. I’m a PostgreSQL guy so I ran in to a few hiccups along the way. These instructions, based on instructions on the Sphinx wiki, got me up and running on Ubuntu Server 8.10.

Install build toolchain:

$ sudo aptitude install build-essential checkinstall

Install Postgres:

$ sudo aptitude install postgresql postgresql-client \
postgresql-client-common postgresql-contrib \
postgresql-server-dev-8.3

Get Sphinx source:

$ wget http://www.sphinxsearch.com/downloads/sphinx-0.9.8.1.tar.gz
$ tar xzvf sphinx-0.9.8.1.tar.gz
$ cd sphinx-0.9.8.1

Configure and make:

$ ./configure --without-mysql --with-pgsql \
--with-pgsql-includes=/usr/include/postgresql/ \
--with-pgsql-lib=/usr/lib/postgresql/8.3/lib/
$ make

Run checkinstall:

$ mkdir /usr/local/var
$ sudo checkinstall

Sphinx is now installed in /usr/local. Check out /usr/local/etc/ for configuration info.

Create something to index:

$ createdb -U postgres test
$ psql -U postgres test
test=# create table test (id integer primary key not null, text text);
test=# insert into test (text) values ('Hello, World!');
test=# insert into test (text) values ('This is a test.');
test=# insert into test (text) values ('I have another thing to test.');
test=# -- A user with a password is required.
test=# create user foo with password 'bar';
test=# alter table test owner to foo;
test=# \q

Configure sphinx (replace nano with your editor of choice):

$ cd /usr/local/etc
$ sudo cp sphinx-min.conf.dist sphinx.conf
$ sudo nano sphinx.conf

These values worked for me. I left configuration for indexer and searchd unchanged:

source src1
{
  type = pgsql
  sql_host = localhost
  sql_user = foo
  sql_pass = bar
  sql_db = test
  sql_port = 5432
  sql_query = select id, text from test
  sql_query_info = SELECT * from test WHERE id=$id
}

index test1
{
  source = src1
  path = /var/data/test1
  docinfo = extern
  charset_type = utf-8
}

Reindex:

$ sudo mkdir /var/data
$ sudo indexer --all

Run searchd:

$ sudo searchd

Play:

$ search world

Sphinx 0.9.8.1-release (r1533)
Copyright (c) 2001-2008, Andrew Aksyonoff

using config file '/usr/local/etc/sphinx.conf'...
index 'test1': query 'world ': returned 1 matches of 1 total in 0.000 sec

displaying matches:
1. document=1, weight=1

words:
1. 'world': 1 documents, 1 hits

Use Python:

cd sphinx-0.9.8.1/api
python
>>> import sphinxapi, pprint
>>> c = sphinxapi.SphinxClient()
>>> q = c.Query('world')
>>> pprint.pprint(q)
{'attrs': [],
 'error': '',
 'fields': ['text'],
 'matches': [{'attrs': {}, 'id': 1, 'weight': 1}],
 'status': 0,
 'time': '0.000',
 'total': 1,
 'total_found': 1,
 'warning': '',
 'words': [{'docs': 1, 'hits': 1, 'word': 'world'}]}

If you add new data and want to reindex, make sure you use the –rotate flag:

sudo indexer --rotate --all

This is an extremely quick and dirty installation designed to give me a sandbox
to play with. For production use you would want to run as a non-privileged user
and would probably want to have an /etc/init.d script for searchd or run it
behind supervised. If you’re looking to experiment with Sphinx and MySQL,
there should be plenty of documentation out there to get you started.

Arduino: Transforming the DIY UAV Community

It’s been pretty awesome watching the homebrew UAV community discover and embrace Arduino. Back in January community leader Chris Anderson discovered and fell in love with Arduino. Today he posted information and the board design for an Arduino-powered UAV platform. Because everything is open, it’s very easy to combine functionality from other boards in order to reduce the cost:

The decision to port the Basic Stamp autopilot to Arduino turned out to be an unexpected opportunity to make something really cool. I’ve taken Jordi’s open source RC multiplexer/failsafe board, and mashed it up with an Arduino clone to create “ArduPilot”, perhaps the cheapest autopilot in the world. ($110! That’s one-third the price of Paparazzi)

As with their other projects, the UAV schematics, board design, and Arduino control software will be released before they’re done. It’s quite awesome to realize just how cheap the Arduino-based autopilot is:

That’s a $110 autopilot, thanks to the open source hardware. By comparison, the Basic Stamp version of this, with processor, development board and failsafe board, would run you $300, and it’s not as powerful

I’ve been quite impressed by how quickly the Arduino autopilot has gotten off the ground (pun only slightly intended). The decision to port the existing Basic Stamp code to Arduino was made just over a week ago. While I haven’t seen the control code, it looks like the team are well on their way.

I love it when geek topics collide, and this is about as good as it gets. I’ll be keeping a close eye on the ArduPilot, and I can’t wait to see it in the skies.

Kansas covered by OpenStreetMap!

I was checking up on the TIGER/Line import to OpenStreetMaps earlier today and I was pleased to see that Kansas is already partially processed! I had emailed Dave the other day and he had queued Kansas up, but I was pleasantly suprised to see it partially processed already. Douglas County is already completely imported and Tiles@Home is currently rendering it up. Parts of Lawrence have already been rendered and cam be seen using the osmarender layer. Here’s 23rd and Iowa:

Lawrence in OpenStreetMap!

Andrew Turner had turned me on to OpenStreetMap over beers at Free State and at GIS Day @ KU even though I’ve been reading about it for some time now. So far it seems like an amazing community and I’ve been enjoying digging in to the API and XML format and various open source tools like osmarender, JSOM, and OpenLayers.

After getting psyched up at GIS Day I’ve been playing with some other geo tools, but more on that later.

Maemo blows me away again

Nokia N810 Internet Tablet

My wife and I just bought a house and I’ve realized that there isn’t any room in the budget for major gadget purposes, so I’ve been trying not to get too excited about things coming down the road. It’s not suprising that I’ve been following the recently announced Nokia N810 Internet Tablet in a much more detached manner than usual.

That is until I saw Ari Jaaksi holding a prototype in his hand. Holy crap that thing is significantly smaller than the N800 and packs quite a punch. The slide-out keyboard is killer, GPS is a no-brainer these days and is included, the browser is Mozilla-based, the UI got a refresh… I could go on for days.

The other thing I really like about the new tablet is that the Maemo platform is moving to be even more open than it was before (which was about as open as the lawyers at Nokia would allow). The quite good but closed source Opera web browser has been replaced by one that is Mozilla-based. This is yet another major component that is now open instead of closed. The major closed-source components (if I’m remembering correctly) are now limited to the DSP, various binary drivers that Nokia licenses directly, and the handwriting recognition software. That’s definitely a smaller list than it was before, and I applaud Nokia’s efforts in opening up as much as possible. It’s also worth noting that the Ubuntu Mobile project is basing a lot of its work on the work that Nokia has done with Maemo (most notable Matchbox and the Hildon UI).

So yeah, I’m now paying much closer attention to this new device that I was doing my best to ignore. Job well done.

There is an Erlang community, it’s just smaller than you’re used to

I thought I’d pick a nit with something that Ted Leung mentioned in his response to Sam Ruby’s response to Russ’ original post about Java needing an overhaul.

Particularly this comment about the Erlang community:

This is actually 2 problems. There’s the issue with the libraries, and there’s the issue with the community that did/didn’t produce the libraries. We don’t just need a technology, we need a community. Hmm, Erlang lab, anyone?

I’d like to assert that there is a vibrant Erlang community, it’s just smaller than you’re used to, and might be a little harder to find than most.

I’ve been lurking, participating, and sharing what I’ve learned in #erlang on irc.freenode.net since the Erlang movie blew my mind. I started tinkering with Erlang and the PDF for Programming Erlang came out a few days later. I still consider myself an Erlang noob. I still ask questions with obvious answers, spend hours trying to do something that should have taken 5 minutes, and sometimes don’t do things in the most Erlangish manner.

But the Erlang community has been great to me. Whenever I ask a question in #erlang, I almost always get the answer I was looking for, whether it’s a pointer to a specific part of the Erlang docs, a snippet of code, an opinion on the best way to do something, or a link to a blog post that has the answer. I’ve been around long enough that I can offer the same to people who are just picking up Programming Erlang and are running in to the same things I was a month or two back.

If you don’t think there’s an Erlang community, please come by #erlang and spend some time. They’ve been one of the most helpful communities I’ve ever been a part of. Activity is currently skewed more towards the European timezones, but as the community grows and more people across the world pick up Erlang that’s changing.

There’s also a pretty huge community outside of IRC. I’ve been subscribed to erlang-questions, the most active Erlang mailing list for a month or two too. The solution to one of my problems with bit syntax was asked and answered before I even knew that’s what I wanted. I’ve also learned a lot of things about Erlang that I wouldn’t have otherwise from the mailing list.

The community doesn’t stop there. Head over to trapexit.org and check out the Wiki and the forums. When you’re done there, check out the packages available at CEAN (the smaller Erlang counterpart of CPAN). There are lots of libraries included here if that’s what you’re looking for. Anything in CEAN can be installed from within the Erlang shell.

If you’re looking for libraries, don’t forget to check out Erlang’s module documentation. It’s far from Python’s batteries included, but there’s more there than it gets credit for. Aside from a distributed database there are TCP and UDP socket libraries, an http client and server, an XML library and support for SNMP and SSH. You’ll find many more protocol implementations at CEAN while the building blocks reside in Erlang’s standard library. Other places to look for sample code or libraries is Jungerl, a loosely knit collection of useful (but sometimes aging) libraries and applications, or Google Code.

While many third party Erlang libraries feel like they’re at the 0.1 stage (and many are simply because their authors are new to Erlang), don’t forget about the polished apps and libraries. I’m specifically thinking of ejabberd, RabbitMQ, YAWS, and Wings3D to name a few. Also worth a specific mention is ErlyWeb and the several libraries that it is built on top of.

So yes: the Erlang community is quite small. Think Python 10-12 years ago or Ruby before the Rails. But don’t pretend that it doesn’t exist, because while tiny, it’s vibrant and extremely helpful.

Erlang bit syntax and network programming

I’ve been playing with Erlang over nights and weekends off and on for a few months now. I’m loving it for several reasons. First off, it’s completely different than any other programming language I’ve worked with. It makes me think rather than take things for granted. I’m intrigued by concurrency abilities and its immutable no-shared-state mentality. I do go through periods of amazing productivity followed by hours if not days of figuring out a simple task. Luckily the folks in #erlang on irc.freenode.net have been extremely patient with myself and the hundreds of other newcomers and have been extremely helpful. I’m also finding that those long pain periods are happening less frequently the longer I stick with it.

One of the things that truly blew me away about Erlang (after the original Erlang Now! moment) is its bit syntax. The bit syntax as documented at erlang.org and covered in Programming Erlang is extremely powerful. Some of the examples in Joe’s book such as parsing an MP3 file or an IPv4 datagram hint at the power and conciseness of binary matching and Erlang’s bit syntax. I wanted to highlight a few more that have impressed me while I was working on some network socket programming in Erlang.

There are several mind-boggling examples in Applications, Implementation and Performance Evaluation of Bit Stream Programming in Erlang (PDF) by Per Gustafsson and Konstantinos Sagonas. Here are two functions for uuencoding and uudecoding a binary:

uuencode(BitStr) ->
<< (X+32):8 || <<X:6>> <= BitStr >>.
uudecode(Text) ->
<< (X-32):6 || <<X:8>> <= Text >>.

UUencoding and UUdecoding isn’t particularly hard, but I’ve never seen an implementation so concise. I’ve also found that Erlang’s bit syntax makes socket programming extremely easy. The gen_tcp library makes connection to TCP sockets easy, and Erlang’s bit syntax makes creating requests and processing responses dead simple too.

Here’s an example from qrbgerl, a quick project of mine that receives random numbers from the Quantum Random Bit Generator service. The only documentation I needed to use the protocol was the Python client and the C++ client. Having access to an existing Python client helped me bridge the “how might I do this in Python?” and “how might I do this in Erlang?” gaps, but I ended up referring to the canonical C++ implementation quite a bit too.

I start out by opening a socket and sending a request. Here’s the binary representation of the request I’m sending:

list_to_binary([<<0:8,ContentLength:16,UsernameLength:8>>, Username,
<<PasswordLength:8>>, Password, <<?REQUEST_SIZE:32>>]),

This creates a binary packet that complies exactly with what the QRBG service expects. The first thing that it expects is a 0 represented in 8 bits. Then it wants the length of the username plus the length of the password plus 6 (ContentLength above) represented in 16 bits. Then we have the username represented as 8 bit characters/integers, followed by the length of the password represented in 8 bits, followed by the password (again as 8 bit characters/integers). Finally we represent the size of the request in 32 bits. In this case the macro ?REQUEST_SIZE is 4096.

While that’s a little tricky, once we’ve sent the request, we can use Erlang’s pattern matching and bit syntax to process the response:

<<Response:8, Reason:8, Length:32, Data:Length/binary,
_Rest/binary>> = Bin,

We’re matching several things here. The response code is the first 8 bits of the response. In a successful response we’ll get a 0. The next 8 bits represent the reason code, again 0 in this case. The next 32 bits will represent the length of the data we’re being sent back. It should be 4096 bytes of data, but we can’t be sure. Next we’re using the length of the data that we just determined to match that length of data as a binary. Finally we match anything else after the data and discard it. This is crucial because binaries are often padded at the beginning or end of the stream. In this case there’s some padding at the end that we need to match but can safely discard.

Now that we have 4096 bytes of random bits, let’s do something with them! I’ve mirrored the C++ and Python APIs as well as I could, but because of Erlang’s no shared state it’s going to look a little different. Let’s match a 32 bit integer from the random data that we’ve obtained:

<<Int:32/integer-signed, Rest/binary>> = Bin,

We’re matching the first 32 bits of our binary stream to a signed integer. We’re also matching the rest of the data in binary form so that we can reuse it later. Here’s that data extraction in action:

5> {Int, RestData} = qrbg:extract_int(Data).
{-427507221,
 <<0,254,163,8,239,180,51,164,169,160,170,248,94,132,220,79,234,4,117,
   248,174,59,167,49,165,170,154,...>>}
6> Int.
-427507221

I’ve been quite happy with my experimentation with Erlang, but I’m definitely still learning some basic syntax and have only begun to play with concurrency. If the above examples confuse you, it might help to view them in context or take a look at the project on google code. I have also released an ISBN-10 and ISBN-13 validation and conversion library for Erlang which was a project I used to teach myself some Erlang basics. I definitely have some polishing to do with the QRBG client, but isbn.erl has full documentation and some 44 tests.

Reason 3,287 why I hate setuptools

root@monkey:~/inst/simplejson# python setup.py install
The required version of setuptools (>=0.6c6) is not available, and
can't be installed while this script is running. Please install
 a more recent version first.

(Currently using setuptools 0.6c3 
(/usr/lib/python2.4/site-packages/setuptools-0.6c3-py2.4.egg))

isbn.erl: My first Erlang module

For a few weeks now I’ve been tinkering with, learning about, and falling in love with Erlang. I’ve been noticing the buzz about Erlang over the past few months, but two things won me over: the Erlang video and how amazingly simple and elegant concurrency and message passing is.

For the past few weeks I’ve been reading the Erlang documentation, Joe Armstrong’s new book, reading the trapexit wiki, and lurking in #erlang on irc.freenode.net. If you’re wading in the erlang waters, I highly suggest just about every link in this wonderful roundup of beginners erlang links.

After a few weeks of reading and tinkering in the shell I decided that it was time to come up with a quick project to hone my Erlang skills. I had tinkered with ISBN validation and conversion while writing a small django application to catalog the books on the bookshelf, so I thought that was a good place to start. The Wikipedia page and my collection of ISBN links provided me with more than enough guidance on validation, check digit generation, and conversion.

I found it very easy to build this module from the ground up: start with ISBN-10 check digit generation, then use that to build an ISBN-10 validator. I did a similar thing with ISBN-13, writing the check digit generator and then the ISBN-13 validator. From there I was able to build on all four public functions to write an ISBN-10 to ISBN-13 converter as well as an ISBN-13 to ISBN-10 converter (when that is a possibility). In the process of this very simple module I ended up learning about and applying accumulators, guards, the use of case, and lots of general Erlang knowledge.

Here’s a peek at how the module works. After downloading it via google code or checking out the latest version via Subversion, the module needs to be compiled. This is easily accomplished from the Erlang shell (once you have Erlang installed of course):

mcroydon$ erl
Erlang (BEAM) emulator version 5.5.4 [source] [async-threads:0] [kernel-poll:false]

Eshell V5.5.4  (abort with ^G)
1> c('isbn.erl').
{ok,isbn}

After that we can use any of the exported functions:

2> isbn:validate_13([9,7,8,1,9,3,4,3,5,6,0,0,5]).
true

Let’s trace the execution of a simple function, check_digit_10/1. The function expects a list of 9 numbers (the first 9 numbers of an ISBN-10) and returns the check digit as either an integer or the character 'X'. The first thing that the function does is check to see if we’ve actually passed it a 9-item list:

check_digit_10(Isbn) when length(Isbn) /= 9 ->
    throw(wrongLength);

This is accomplished with a simple guard (when length(Isbn) /- 9). If that guard isn’t triggered we move on to the next function:

check_digit_10(Isbn) ->
    check_digit_10(Isbn, 0).

This forwards our list of 9 numbers on to check_digit_10/2 (a function with the same name that takes two arguments. We’ll see in a minute that the 0 I’m passing in will be used as an accumulator. The next function does most of the heavy lifting for us:

check_digit_10([H|T], Total) ->
    check_digit_10(T, Total + (H * (length(T) + 2)));

This function takes the list, splits it in to the first item (H) and the rest of the list (T). We add to the total as specified in ISBN-10 and then call check_digit_10/2 again with the rest of the list. This tail-recursive approach seems odd at first if you’re coming from most any object oriented language, but Erlang’s function nature, strong list handling, and tail-recursive ways feel natural very quickly. After we’ve recursed through the entire list, it’s time to return the check digit (or 11 minus the total modulus 11). There’s a special case if we get a result of 10 to use the character ‘X’ instead:

check_digit_10([], Total) when 11 - (Total rem 11) =:= 10 ->
    'X';

Finally we return the result for the common case given an empty list:

check_digit_10([], Total) ->
    11 - (Total rem 11).

Feel free to browse around the rest of the module and use it if you feel it might be useful to you. I’ve posted the full source to my isbn module at my isbnerl Google Code project, including the module itself and the module’s EDoc documentation. It is released under the new BSD license in hopes that it might be useful to you.

I learned quite a bit creating this simple module, and if you’re learning Erlang I suggest you pick something not too big, not too small, and something that you are interested in to get your feet wet. Now that I have the basics of sequential Erlang programming down, I think the next step is to make the same move from tinkering to doing something useful with concurrent Erlang.

Darwin Calendar Server Status Update

Wilfredo Sánchez Vega posted an update on the status of Darwin Calendar Server to the calenderserver-dev mailing list this afternoon with a status update on the project. Check the full post for details, but here’s the takeaway:

We think the server’s in pretty solid shape right now. Certainly there are still some bugs that have to be fixed before we can roll up a “1.0″ release, but the core functionality is all in place now, and we think it’s fairly useable in its current state.

There is also a preview release in the works, so keep a close eye on the mailing list and the Darwin Calendar Server site.

Properly serving a 404 with lighttpd’s server.error-handler-404

The other day I was looking in to why Django’s verify_exists validator wasn’t working on a URLField. It turns out that the 404 handler that we were using to generate thumbnails with lighttpd on the media server was serving up 404 error pages with HTTP/1.1 200 OK as the status code. Django’s validator was seeing the 200 OK and (rightfully) not raising a validation error, even though there was nothing at that location.

It took more than the usual amount of digging to find the solution to this, so I’m making sure to write about it so that I can google for it again when it happens to me next year. The server.error-hadler-404 documentation metnions that to send a 404 response you will need to use server.errorfile-prefix. That doesn’t help me a lot since I needed to retain the dynamic 404 handler.

Amusingly enough, the solution is quite clear once you dig in to the source code. I dug through connections.c and found that if I sent back a Status: line, that would be forwarded on by lighttpd to the client (exactly what I wanted!)

Here’s the solution (in Python) but it should translate to any language that you’re using as a 404 handler:

print "Status: 404"
print "Content-type: text/html"
print
print # (X)HTML error response goes here

I hope this helps, as I’m sure it will help me again some time in the future.

Packaging Python Imaging Library for maemo 3.0 (bora) and the N800

I found myself wanting to do some image manipulation in Python with the Python Imaging Library on my N800. Unfortunately PIL isn’t available in the standard repositories. Not to worry, after reading the Debian Maintainers’ Guide I packaged up python2.5-imaging_1.1.6-1_armel.deb, built against python2.5 and maemo 3.0 (bora), and installs perfectly on my N800:

Nokia-N800-51:/media/mmc1# dpkg -i python2.5-imaging_1.1.6-1_armel.deb
Selecting previously deselected package python2.5-imaging.
(Reading database ... 13815 files and directories currently installed.)
Unpacking python2.5-imaging (from python2.5-imaging_1.1.6-1_armel.deb) ...
Setting up python2.5-imaging (1.1.6-1) ...
Nokia-N800-51:/media/mmc1# python2.5
Python 2.5 (r25:9277, Jan 23 2007, 15:56:37)
[GCC 3.4.4 (release) (CodeSourcery ARM 2005q3-2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from PIL import Image
>>>

Feel free to take a look at the directory listing for full source, diffs, etc. You might also want to check out debian/rules to see where to use setup.py in the makefile to build and install a Python package. If anyone wants a build of this for maemo-2.0/770 please let me know. It would just take a little time to set up a maemo-2.0 scratchbox.

Nokia N800 and camera.py

Nokia N800 and camera.py

I received my Nokia N800 yesterday and have been enjoying how zippy it is compared to the 770 (which has been getting faster with every firmware upgrade).. I got a chance to play with the browser while waiting for my wife at the airport and have been poking around to see how Bora is different than Mistral and Scirocco.

One of the bigger physical differences between the 770 an N800 is the onboard camera. I haven’t set up Nokia Internet Call Invitation yet but I looked around for some camera code samples and was pleasantly suprised. Luckily Nokia is one step ahead of me with camera.py, an example program to show what the camera sees on the screen. It looks like some bindings are missing so saving an image from the camera is off limits at the moment but this is a great start.

To run the above example on your N800, grab camera.py, install Python and osso-xterm, and run camera.py from the console.

It’s time to dust off the Python Maemo tutorial and get my feet wet.

Update: I’ve also been playing with the c version of the camera example code and have managed to get it running and taking pictures after building it in scratchbox and running it with run-standalone.sh ./camera.

From GPX to PostGIS

Now that I have a RELAX NG Compact schema of GPX, it’s time to figure out how to get my data in to PostGIS. so I can do geospatial queries. I installed PostGIS on my Ubuntu box with these instructions. If I recall correctly, the latest version didn’t work but the previous release did.

GPX makes a great GPS data interchange format, particularly because you can convert just about anything to GPX with GPSBabel. I’m not aware of anything that can convert straight from GPX to PostGIS, but it is a relatively straightforward two-part process.

The first order of business is converting our GPX file to an ESRI Shapefile. This is easiest done by using gpx2shp, a utility written in C that is also available in Ubuntu. Once gpx2shp is installed, run it on your GPX file: gpx2shp filename.gpx.

Once we have a shape file, we can use a utility included with PostGIS to import the shapefile in to postgres. The program is called shp2pgsql, and can be used as such: shp2pgsql -a shapefile.shp tablename. I tend to prefer the -a option which appends data to the table (as long as it’s the same basic type of the existing data). shp2pgsql generates SQL that you can then pipe in to psql as such: shp2pgsql -a shapefile.shp tablename | psql databasename.

Once the data is in Postgres (don’t forget to spatially-enable your database, you can query it in ways that I’m still figuring out. A basic query like this will return a list of lat/lon pairs that corresponds to my GPS track: select asText(the_geom) from tablename;. I’m llooking forward to wrapping my head around enough jargon to do bounding box selects, finding things nearby a specific point, etc.

The examples in the PostGIS documentation seem pretty good, I just haven’t figured out how to apply it to the data I have. I feel like mastering GIS jargon is one of the biggest hurdles to understanding PostGIS better. Well, that and a masters degree in GIS wouldn’t hurt.

All I want to do is convert my schema!

I’m working on a django in which I want to store GPS track information in GPX format. The bests way to store that in django is with an XMLField. An XMLField is basically just a TextField with validation via a RELAX NG Compact schema.

There is a schema for GPX. Great! The schema is an XSD though, but that’s okay, it’s a schema for XML so it should be pretty easy to just convert that to RELAX NG compact, right?

Wrong.

I pulled out my handy dandy schema swiss army knife, Trang but was shocked to find out that while it can handle Relax NG (both verbose and compact), DTD, and an XML file as input and even XSD as an output, there was just no way that I was going to be able to coax it to read an XSD. Trang is one of those things (much like Jing that I rely on pretty heavily that hasn’t been updated in years. That scares me a bit, but I keep on using ‘em.

With Trang out of the picture, I struck out with various google searches (which doesn’t happen very often). the conversion section of the RELAX NG website. The first thing that struck my eye was the Sun RELAX NG Converter. Hey, Sun’s got it all figured out. I clicked the link and was somewhat confused when I ended up at their main XML page. I scanned around and even searched the site but was unable to find any useful mention of their converter. A quick google search for sun “relax ng converter” yielded nothing but people talking about how cool it was and a bunch of confused people (just like me) wondering where they could get it.

At this point I was grasping at straws so I pulled up The Internet Archive version of the extinct Sun RELAX NG Converter page. That tipped me off to the fact that I really needed to start tracking down rngconf.jar. A google search turned up several Xdoclet and Maven cvs repositories. I grabbed a copy of the jar but it wouldn’t work without something called Sun Multi-Schema XML Validator.

That’s the phrase that pays, folks.

A search for Sun “Multi-Schema XML Validator” brought me to the java.net project page and included a prominent link to nightly builds of the multi-schema validator as well as nightly builds of rngconv. These nightly builds are a few months old, but I’m not going to pick nits at this point.

After downloading msv.zip and rngconv.zip and making sure all the jars were in the same directory I had the tools I needed to convert the XSD in hand to RELAX NG Compact. First I converted the XSD to RELAX NG Verbose with the following command: java -jar rngconv.jar gpx.xsd > gpxverbose.rng. That yielded the following RELAX NG (very) Verbose schema. Once I had that I could fall back to trusty Trang to do the rest: trang -I rng -O rnc gpxverbose.rng gpx.rng. It errored out on any(lax:##other) so I removed that bit and tried again. After a lot more work than should have been required, I had my RELAX NG Compact schema for GPX.

My experience in finding the right tools to convert XSD to RELAX NG was so absurd that I had to write it up, if only to remind myself where to look when I need to do this again in two years.

Oh the CalDAV Possibilities

While checking up on the Darwin Calendar Server wiki the other day I noticed something I had missed last week: CalDAVTester. It is an exhaustive suite of tests written in Python with XML config files to verify that a CalDAV server implementation is properly implementing the spec.  This suite of tests is going to prove very useful as more servers and clients implement the CalDAV spec.

Right now the biggest problem with CalDAV is a lack of clients and servers.  That will change over the next 6-8 months as clients and servers are refined, released and rolled out.  Hopefully the CalConnect group and an exhaustive suite of tests will help keep interop a high priority.

Darwin Calendar Server

As soon as Gruber pointed out Darwin Calendar Server I felt like I had to check it out. I’ve played with Darwin Streaming Server in the past and love me some Webkit. I was pleasantly suprised to find that Darwin Calendar Server runs on top of Python and Twisted.

So away I went. I checked out the source and began to poke around. I managed to check out the source before the README was added so I did a fair amount of head scratching and wheel spinning, but it turns out that getting up and running is pretty easy: ./run -s

That sets up the server, downloading and building some prereqs as it goes. I already had some prereqs installed system wide so I can’t guarantee that this works, but I’m pretty sure that it has worked for others. I should take a second to qualify that I’m running OS X 10.4 with Python 2.4 installed. From there I copied over the sample config file (cp ./conf/repository-static.xml ./conf/repository-dev.xml) and immediately started troubleshooting SSL errors. First I installed PyOpenSSL and created a self-signed certificate. That yielded a brand new error: OpenSSL.SSL.Error: [('PEM routines', 'PEM_read_bio', 'no start line'), ('SSL routines', 'SSL_CTX_use_PrivateKey_file', 'PEM lib')]

After doing that and getting some guidance from the folks in #collaboration on freenode I decided to hack away at the plist and disable SSL for now (change SSLEnable to false instead of true). From there I could run the server (./run) and bring up a directory listing my pointing to 127.0.0.1:8008.

Darwin Calendar Server Chandler Setup

From there I subscribed to the example payday calendar and the holiday calendar. It appears that iCal won’t do two-way CalDAV until Leopard, but in the meantime I was able to successfully set up and test Chandler.

This is some absolutely amazing tech in its infancy. I can’t wait to see where this goes and I’m excited that it’s built with tools that I’m familiar with (Python, Twisted, SQLite, iCal). It seems to me like this open source app is but the tip of the iceberg of collaboration features that will be baked in to OS X 10.5 desktop and server.  I would also kill for a mobile device that spoke CalDAV natively so that I can replace my duct taped google calendar to iCal to iSync to 6682 workflow.

PyS60 1.3.8 Released

Python for S60 version 1.3.8, released specifically for S60 3rd Edition is now available for download. See the release notes for more information. Special thanks to Jukka and everyone else for pushing this release out the door just before Finland shuts down for the summer.

Xgl

XGL thumbnail

A week or so ago I managed to get Xgl working on a box at home using these instructions wtih an aging Nvidia card. I previously had issues getting it to work with an ATI card, but I think that had more to do with flgx and my rather old Radeon than anything else.

I was completely blown away when I started up XGL and enabled compiz for the first time. It brought an extra level of polish to the already amazing Dapper Drake. By default it enables several effects and features. Some are more whizzy than useful, but the alt-tab preview pane and expose-like features are quite useful. Less useful but still pretty are true transparency and the waving effect that happens when you drag a window.

Unfortunately running Xgl and compiz meant that things like emacs didn’t run at all, and other apps like Evolution behaved unpredictably.

That’s to be expected though. Xgl is still very much unstable and bleeding edge. Even if it’s not usable (for me) day in and day out, I think it’s a glimpse in to the future of desktop linux. The packages are available in the Universe repository for Dapper, but there are big warnings everywhere about their experimental nature.

I really hope that over the next six months Xgl matures and that by the time Edgy Eft is born Xgl will be ready for prime time, if not in the default install.

Backing Up Flickr Photos with Amazon S3

I love that I now have an Amazon S3 billing page that reads like a really cheap phone or water bill. I think that they’re silently changing the game (again) without telling anyone else. I really like the implications of this magepiebrain post and decided to start using S3 “for real” myself last night.

The first ingredient was James Clarke’s flickr.py. Getting a list of my photos is pretty simple:

import flickr
me = flickr.people_findByUsername("postneo")
photos = flickr.people_getPublicPhotos(me.id)

The second ingredient for getting the job done was a pythonic wrapper around the Amazon example python libraries by Mitch Garnaat called BitBucket. Because it builds on the example libraries, there’s very little error checking, so be careful. Check out Mitch’s site for some example BitBucket usage, it’s pretty slick.

Once I was familiar with both libraries, I put together a little script that finds all of my photos and uploads the original quality image to S3, using the flickr photo ID as the key. Here’s the complete code for flickrbackup.py, all 25 lines of it.

After uploading 160 or so photos to Amazon, I owe them about a penny.

Getting photos back out is really easy too:

>>> import BitBucket
>>> bucket = BitBucket.BitBucket("postneo-flickr")
>>> bucket.keys()
>>> bits = bucket[u'116201243']
>>> bits.to_file("photo.jpg")

My Nokia 770 Is On The Way!

I just got “the email” saying that my Nokia 770 developer device was ready to rock. It’s been ordered up, more when the device arrives!

Update: It’s here, more later.