James Tauber's Blog 2004




Favourite Posts of 2004

I mentioned in Blog Hits by Age that I would, as others have done recently, list my favourite entries from my blog this year.

Here are the ones that come to mind. Some generated some good discussion in the blogosphere at the time; others disappointingly didn't generate any response at all.

In no particular order...

Conference Reporting

Questions and Observations

Programming Ideas

Little Python Scripts

Typed Citations Meme

Aggregation versus Hosting Meme

XML versus RDF Meme

by : Created on Dec. 30, 2004 : Last modified Feb. 8, 2005 : (permalink)


Poincare Project: Open Coverings and Compactness

If you pick a collection of open sets whose union is the space's entire set, then that collection is called an open covering of the space.

For example, consider the set {a, b} with topology { {}, {a}, {b}, {a, b} }. One open covering would be:

{ {a}, {b} }

Another would be

{ {a}, {a, b} }

Clearly it is possible to cover any finite topological space with a finite number of open sets.

It is also possible to cover any infinite topological space with a finite number of open sets. Because X is an open set in any topology on X, a collection consisting of just X itself is an open covering.

If an open covering has a finite subset which still manages to cover the entire set, the covering is said to have a finite subcovering.

Some topological spaces have the property that every open covering has a finite subcovering. Such a space is said to be compact.

Compactness is a topological property. Recall that this means if a topological space is compact, any topological spaces homeomorphic to it will also be compact (and also that a homeomorphism can't exist between a compact topological space and one that is not compact).
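The definitions above can be checked by brute force on the two-point example. This is only a sketch: the function names are made up for illustration, and because the space is finite every covering is already finite, so it demonstrates the definitions rather than anything deep about compactness.

```python
from itertools import combinations


def is_open_covering(space, topology, covering):
    """True if every member of the covering is open and their union is the whole space."""
    all_open = all(member in topology for member in covering)
    union = frozenset().union(*covering) if covering else frozenset()
    return all_open and union == space


def finite_subcoverings(space, covering):
    """Yield every non-empty subset of the covering that still covers the space."""
    members = list(covering)
    for size in range(1, len(members) + 1):
        for subset in combinations(members, size):
            if frozenset().union(*subset) == space:
                yield set(subset)


# The example from the text: the set {a, b} with topology { {}, {a}, {b}, {a, b} }.
space = frozenset({"a", "b"})
topology = {frozenset(), frozenset({"a"}), frozenset({"b"}), space}

covering_1 = {frozenset({"a"}), frozenset({"b"})}
covering_2 = {frozenset({"a"}), space}

print(is_open_covering(space, topology, covering_1))
print(is_open_covering(space, topology, covering_2))
print(list(finite_subcoverings(space, covering_2)))
```

The second covering has a subcovering consisting of just {a, b}, which is the same observation as the one about X covering itself.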

UPDATE: next post

by : Created on Dec. 30, 2004 : Last modified Aug. 20, 2005 : (permalink)


Film Project Update: The Long Journey Home

Sending the DVDs to festivals has, so far, gone smoothly. The same can't be said for the 20 Tom sent back to me (recall the mastering had been done in Australia but the duplication in the US).

Tom went to the UPS shop on Saturday 18th December. Once the package was in the system, they were claiming an arrival estimate of Wednesday 22nd December. This seemed optimistic at the time. I thought 23rd was a possibility. Working backwards, that would mean it would have to arrive in Sydney on 22nd and hence leave California by late on 20th.

The package, however, was not even picked up from the UPS shop until 6pm on Monday 20th. This meant it didn't fly out of New Hampshire until 10pm that night. At that point I knew it probably wouldn't make it by Christmas. The arrival estimate, however, was still showing 22nd.

It arrived in Ontario, California at 7.10am on 21st and within a few hours, had been seized by customs (or whatever "PKG DELAY-ADD'L SECURITY CHECK BY GOV'T OR OTHER AGENCY- BEYOND UPS CONTROL" means).

Finally at 3.44am on 23rd December, another hub scan was done. The arrival estimate was still showing as 22nd but at least there was a chance it was going to make it on a flight pretty soon and make it into the country by Christmas at least.

But, alas, no new scans. It didn't make it on a flight on 23rd or the 24th. By 27th there was still no new scan. Then on 28th December, ten days after the package had been sent, there was another hub scan done at Ontario, California. It still hadn't left the US!

Then, a few hours later, another dreaded: "PKG DELAY-ADD'L SECURITY CHECK BY GOV'T OR OTHER AGENCY- BEYOND UPS CONTROL"

I can only guess that the customs officials just really like the film. Hey guys, keep a couple of copies, just send the rest on please!

UPS is still showing the arrival estimate as...you guessed it...22nd December.

by : Created on Dec. 29, 2004 : Last modified Feb. 8, 2005 : (permalink)


Upgrade Apologies

I've upgraded this site to Leonardo 0.4.0rc3. Apologies to feed readers for the numerous atom entries whose modification dates got changed as a result.

by : Created on Dec. 29, 2004 : Last modified Feb. 8, 2005 : (permalink)


More On LinkRanks Ups and Downs

Recently, I observed large jumps in the PubSub LinkRanks for jtauber.com and attributed them to influential sites coming in and out of the 10-day window PubSub uses.

However, in a comment to Trevor Cook's entry on the jumps, the PubSub CEO responded:

The reason for the sudden shift is that we increased the granularity of how we measure linkranks. Specifically, we added individual blogs from the various hosting services for the first time (e.g. livejournal.com/johndoe) - that has suddenly shifted everyone's ranking. Bob Wyman, our CTO, dropped 30,000 places (much to his chagrin). Check out his blog for more details - http://bobwyman.pubsub.com

While it has obviously affected some blogs in the downward direction, I've been sub-50,000 ever since.

LinkRanks PSI

by : Created on Dec. 29, 2004 : Last modified Feb. 8, 2005 : (permalink)


Blog Hits By Age

I was going to give a Top Ten Blog Entries By Number of Hits listing but I suspected it would not necessarily be that insightful under the hypothesis that hit numbers are partly a function of the age of the entry.

So I took the number of hits for each entry and graphed it against the age of the entry in days:

There definitely appears to be a linear baseline which the entries "rise above". To make this clearer, I graphed the hits per day against age:

Notice that the two entries from 250-300 days ago lower in significance while the entry from 50 days ago rises considerably. Which entries were these?

The older two are Eclipse is the next Emacs and Eclipse GEF. Both those get a lot of their referrals from Google searches.

The entry from 50 days ago is, funnily enough, another Eclipse GEF-related post, Six Snapshots of a Simple Eclipse GEF Application. Note that that entry is linked to from one of the older ones.

So, what effect does using average hits per day instead of just hits have on a Top Ten Blog Entries?

Here is a list of the top 10 just by hits:

And here is a list of the top 10 by hits per day (ignoring the last couple of days):

Is the second list more representative? I think so. It includes some extra entries (in bold) that were popular (judging by incoming links and del.icio.us citations) but didn't make the first list because they hadn't been around for as long.
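The comparison between the two rankings is easy to sketch in Python. The entry titles and numbers below are invented purely for illustration (the real figures are not reproduced here), and the two-day cutoff is one reading of "ignoring the last couple of days".

```python
def hits_per_day(entry):
    """Average hits per day over the lifetime of the entry."""
    return entry["hits"] / entry["age_days"]


def top_entries(entries, key, top=10):
    """Return the `top` entries with the highest value of `key`."""
    return sorted(entries, key=key, reverse=True)[:top]


# Hypothetical data, not the blog's actual statistics.
entries = [
    {"title": "old popular entry", "hits": 3000, "age_days": 300},
    {"title": "recent popular entry", "hits": 1500, "age_days": 50},
    {"title": "old quiet entry", "hits": 600, "age_days": 280},
    {"title": "brand new entry", "hits": 40, "age_days": 1},
]

# Ignore the last couple of days, where the per-day average is too noisy.
old_enough = [e for e in entries if e["age_days"] > 2]

by_hits = [e["title"] for e in top_entries(old_enough, key=lambda e: e["hits"])]
by_rate = [e["title"] for e in top_entries(old_enough, key=hits_per_day)]

print(by_hits)
print(by_rate)
```

With these made-up numbers the recent entry moves ahead of the older one once age is taken into account, which is the effect described above.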

How does any of this match up with what I consider my own favourite entries? I'll save that for another entry.

by : Created on Dec. 23, 2004 : Last modified Feb. 8, 2005 : (permalink)


TeX for Leonardo

Looking at Wikitex (via Simon Willison) has convinced me more than ever that I want support for TeX in Leonardo.

Hopefully 0.5 will have the framework (if not the actual implementations) to support a range of underlying document formats including TeX, XHTML and Word.

by : Created on Dec. 23, 2004 : Last modified Feb. 8, 2005 : (permalink)


Poincare Project: The Standard Topology for Ordered Sets

One common way of defining a topology is to take a set, add some structure to that set, define a collection of subsets that meet some criteria in that structure and then use that collection as a basis for the open sets.

Although we didn't have the vocabulary to accurately describe it in those terms, that's what we did previously with the topology of a metric space. A metric space, recall, adds to a set the structure of a distance function. From this, we can define the collection of open balls. This collection can then form the basis for the other open sets in a topology.

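Here is another example. Take a set X and add to it the structure of a total ordering. A total ordering is a relationship < such that

for any a and b in X with a ≠ b, either a < b or b < a
there is no a in X with a < a
if a < b and b < c then a < c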

In other words, a set with a total ordering is a set whose elements can be sorted.

Now define an open interval (a, b) to be the subset of X consisting of every element x such that a < x and x < b.

The open intervals form the basis for a topology. So a total ordering on a set defines a particular topology. While other topologies are possible, the one based on the open intervals is referred to as the standard topology for the ordering or the order topology.
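As a small illustration, here is a sketch of open intervals on a finite ordered set, using Python's built-in < on integers as the total ordering. The set X and the function name are assumptions made for the example.

```python
def open_interval(ordered_set, a, b):
    """Every x in the set with a < x and x < b."""
    return {x for x in ordered_set if a < x and x < b}


X = {1, 2, 3, 4, 5}

print(open_interval(X, 1, 4))  # the elements strictly between 1 and 4
print(open_interval(X, 2, 3))  # nothing lies strictly between 2 and 3

# The intersection of two open intervals is again an open interval,
# which is part of what lets them serve as a basis.
overlap = open_interval(X, 1, 4) & open_interval(X, 2, 5)
print(overlap == open_interval(X, 2, 4))
```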

The real numbers, being a totally ordered set, have an order topology. While other topologies can be defined on the real numbers (as long as the rules for open sets are followed), the order topology is the most natural and consistent with one's intuitions about how the real numbers work.

UPDATE: next post

by : Created on Dec. 23, 2004 : Last modified Feb. 8, 2005 : (permalink)


Flickr and DataLibre

Darren Barefoot has come around on Flickr after earlier making the very DataLibre comment "I’ve yet to be convinced that the best place for my online photos isn’t on my own site."

He says it's the convenience that's won him over. Any feature in particular, Darren?

I certainly have found it easier to put photos up on Flickr than on jtauber.com, but that's just because of the current state of Leonardo. There's no reason why, in the future, Leonardo couldn't provide things like Windows Publishing Wizard support and iPhoto integration to make it just as easy to get stuff up on my own website.

But even then, I might still consider using Flickr. As I've mentioned before, I'm interested in separating aggregation and hosting, not eliminating aggregation. I should be able to take advantage of Flickr's aggregation by pointing them to my self-hosted photos.

by : Created on Dec. 22, 2004 : Last modified Feb. 8, 2005 : (permalink)


Happy Birthday Konrad Tauber

Today is my father's 56th birthday.

He opened up both the world of computers and the world of business to me. He gave me endless opportunities while always leaving the path up to me. He also taught me that business is about people.

I love you dad. Happy Birthday!

by : Created on Dec. 22, 2004 : Last modified Feb. 8, 2005 : (permalink)


LinkRanks Ups and Downs

PubSub LinkRanks seem to be very sensitive to very recent activity, which means one's rank can jump around a lot. I'm guessing this is particularly true at the long tail, where just one link can leapfrog you over hundreds of thousands of fellow bloggers.

Yesterday I was 938,610, today I am 89,060. I've been sub-100,000 before but I also spend time around the 1,000,000 mark if I haven't been linked to in the last week or so.

Oddly, Trevor Cook and others are reporting their rank has dropped recently. Perhaps some highly weighted bloggers just dropped out of the time-weighted window of referrers for their sites.

Or maybe PubSub have changed their algorithm. They say they are still refining it.

Incidentally, I'll use the recommended PSI so PubSub know I'm talking about them.

by : Created on Dec. 22, 2004 : Last modified Feb. 8, 2005 : (permalink)


Branching in Subversion

I'm just about to release Leonardo 0.4.0 so I thought I'd better learn how to branch in Subversion. Turned out to be embarrassingly easy:

svn copy trunk branches/0.4

assuming you've got the entire tree checked out (otherwise it can be done almost as easily with URLs).

But it did get me thinking. Previously I've talked about replacing the structure recommended by the O'Reilly Subversion book

/branches
/tags
/trunk

with more explicit indications of what I use tags for:

/branches
/checkpoints
/milestones
/releases
/trunk

with further structure possible under the first four directories before getting to the actual source code.

Well, if I understand correctly, there is nothing special about the /trunk directory. I'm not even sure Subversion really has a notion of a trunk. So why not only have branches?

In other words, instead of keeping the latest development under /trunk and maintenance branches under /branches, why not have a branch for the current development version alongside the branches for maintenance. Something like:

/branches/0.4
/branches/0.5

where (in Leonardo's current state), next-version development takes place under /branches/0.5 and maintenance on 0.4 is done under /branches/0.4

Unless I'm missing something, this seems a clean way of organising things that is native Subversion. The original suggestion given by the Subversion book really makes sense only if you're coming from CVS.

Again, unless I'm missing something :-)

UPDATE (2004-12-23): Justin Johnson noted in email:

The reason for using trunk is so that developers can continue working on the latest release without having to setup a new working copy everytime the project releases. For example, I were working on 0.4 and then 0.4 released and we created a 0.5 branch, I'd have to clobber my working copy and create a new one. But if I were looking at the trunk, I would be guaranteed that it always points to the latest release that is still in development. It may seem like a minor point, but when you have a lot of developers and when the size of the project is significant, it makes a huge difference.

This is a good point. I did consider the issue of "knowing which is the development branch" and that actually made me wonder about having aliases in Subversion.

However, in my own experience, for commercial software development at least, the developers (even on big projects) all know exactly what version is the latest development version and it is an important "event" in the engineering organization when a new branch is made.

I can see that, for distributed open source development, particularly if the cycles are short, a clearly designated trunk becomes more important, though.

by : Created on Dec. 22, 2004 : Last modified Feb. 8, 2005 : (permalink)


XML Elements versus Attributes

Ned Batchelder discusses the old question of elements versus attributes in XML. As I've been answering that question for over seven years in various places, I thought I'd put down my viewpoint here.

Firstly, there are distinctions based on performance or API usability. Those distinctions are so implementation-specific, I don't think they are very interesting; certainly not to someone doing schema design.

Secondly, there are distinctions based on a particular schema language. Different schema languages have different levels of expressiveness so it's important to distinguish the characteristics of elements and attributes inherent to XML from those that are true only because of the particular choice of schema language. One important takeaway here is that a schema is only part of the description of a markup language. In my experience there are always constraints placed on a language beyond what the schema (in any schema language) can say.

Thirdly, there are distinctions inherent to the XML syntax itself; things like the lack of attribute order or the inability to have further XML structure within an attribute value.

But when all those three are considered, there is still a fundamental "style" question around attributes and elements and here is where a lot of people really find themselves asking the elements versus attributes question.

My take on that is that the distinction is more meaningful the more markup-oriented your XML is and more fuzzy the more data-oriented your XML is.

If you are using XML to serialise objects, then the distinction is blurry and it largely comes down to convention and things like the third type of distinction above. In such cases, an element-only approach might make perfect sense, especially if you are using a schema language that can express characteristics that, in DTDs, attributes had over elements, like default values or insignificant ordering.

But if you are truly doing markup, in other words annotating text (particularly a pre-existing text) then the distinction between attributes and elements becomes much clearer and the reason why attributes exist in XML (and SGML) is far more obvious. The key is that attribute values are considered part of the markup, rather than part of the content. So the clearer the distinction is between markup and content, the clearer it will be between using attributes or child elements.

Imagine that you want to describe Max as a black cat. From a data structure representation point of view, there's no semantic distinction between:

<cat>
  <name>Max</name>
  <colour>Black</colour>
</cat>

or

<cat name="Max" colour="Black"/>

and so decisions about whether to use elements or attributes tend to boil down to (a) whether order matters; (b) whether values can have internal structure; (c) compactness or whatever.

However, if you are doing document markup, things are a little different. In the document markup case, you have some existing text that you annotate. So you start with a word "Max" in your document and you want to mark that up with a generic identifier and any additional properties you want to give that word (or referent). You might end up with something like:

<cat colour="Black">Max</cat>

Making colour a child element rather than an attribute wouldn't make sense from a document markup perspective. In document markup there is a much clearer distinction between content and markup. "Max" is content. "Black" is markup. If you made "colour" a child element with "Black" as content then "Black" would change from being markup to content. Makes no difference in data structure representation but it does in document markup.

From a data structure representation point of view, this attribute/element distinction is so blurred that it is entirely possible to do away with attributes in representations (and sometimes less confusing to do so). This is even more the case where you have schema languages that allow expression of the fact that element order (in a particular context) is not significant.

But in pure document markup applications, where attributes are just indicating characteristic qualities of an element's content, they have a clearer role.
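The difference can be seen with Python's standard xml.etree.ElementTree module. This is just a sketch; the helper function names are invented for the example. As data, the element and attribute forms yield the same record, but as markup, moving colour into a child element changes the text content of the document.

```python
import xml.etree.ElementTree as ET

as_elements = ET.fromstring("<cat><name>Max</name><colour>Black</colour></cat>")
as_attributes = ET.fromstring('<cat name="Max" colour="Black"/>')
as_markup = ET.fromstring('<cat colour="Black">Max</cat>')
as_markup_with_child = ET.fromstring("<cat>Max<colour>Black</colour></cat>")


def record_from_elements(element):
    """Data-oriented reading: each child element is a field."""
    return {child.tag: child.text for child in element}


def record_from_attributes(element):
    """Data-oriented reading: each attribute is a field."""
    return dict(element.attrib)


def text_content(element):
    """Document-oriented reading: the text left when the markup is stripped."""
    return "".join(element.itertext())


# Same record either way when the XML is just a serialised data structure.
print(record_from_elements(as_elements) == record_from_attributes(as_attributes))

# In document markup, "Black" as an attribute stays out of the content...
print(text_content(as_markup))
# ...but as a child element it becomes part of the content.
print(text_content(as_markup_with_child))
```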

by : Created on Dec. 21, 2004 : Last modified Feb. 8, 2005 : (permalink)


Alexa Does DataLibre Right (Almost)

I was fiddling around with Amazon.com's Alexa and discovered they provide a very DataLibre-style way of updating one's site information:

To update your contact info, you may place an info.txt file containing your contact info in the root of your site for Alexa to fetch.

Right-click this link: info.txt. And save it to your computer.
Copy the info.txt file from your computer to the root of your site.
Verify that the info.txt file is there with your browser. (Go to http://www.jtauber.com/info.txt.)
Once you have verified that the file is there, tell us to fetch it by clicking this link: Go Fetch

Well done Amazon! Now if Bloglines did it with OPML, LinkedIn with FOAF, Freshmeat with DOAP, etc...

UPDATE (2004-12-22): Gary Fleming thinks info.txt is a bad idea. I agree with him. While I still like the DataLibre aspect of what Alexa does, Gary's entry persuaded me that requiring a fixed path "/info.txt" is the wrong way to do it. I should have been able to give Alexa my own URI. DataLibre means owning your own URI space too. Thanks Gary for making me realise that!

by : Created on Dec. 21, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Project Update: Ten More Festivals

Just submitted Alibi Phone Network to ten more festivals: Phoenix FF, Palm Beach IFF, Newport Beach FF, Atlanta FF, Beverly Hills FF, San Fernando Valley IFF, Independent FF of Boston, Malibu IFF, Seattle IFF and IFP/Los Angeles FF.

by : Created on Dec. 21, 2004 : Last modified Feb. 8, 2005 : (permalink)


New Mac for Audio and Video

For years, I've dreamed of having a computer dedicated to video and audio editing. It's always been hard to do because the moment I get a fast new machine with lots of memory and disk space, I want to move over to using it for everything. But I'm resolved this time to "keep it pure".

I got a PowerMac dual 2.0GHz G5 (on principle, I always buy the second-fastest processor available on the thinking that the state-of-the-art is overpriced for the people who will pay anything to get the best) with 2.5GB RAM, 2x250 HDDs and a GeForce 6800 GT card. I had earlier bought a 23" Cinema HD screen which I was running off my 12" Powerbook but now it belongs to the PowerMac.

(Actually, losing the 23" screen is going to be the toughest part of "staying pure" as I'm now back to 12" for things like Leonardo and MorphGNT. I might have to share the screen - that's not cheating is it? Do they make KVMs that work with Cinema HD screens?)

I spent a good part of today doing OS updates and installing Apple's Production Suite (Final Cut Pro HD, Motion and DVD Studio Pro). The machine came with OS X 10.3.4 which didn't have support for the 6800 card so I had to put a different graphics card in, upgrade to 10.3.7 and then put the 6800 back in.

The Production Suite install went smoothly. When it came to ProTools LE 6.1, things didn't go so well.

Until now, I've been running ProTools off my Windows machine. I'd forgotten just how much of a pain it was getting ProTools to work last time. ProTools is very picky about hardware and OS. I think I finally got it to work on Windows by upgrading my HDD drivers.

Anyway, I wasn't expecting any problems with my new Mac. But lo and behold, when I started up ProTools for the first time on the Mac, I got an error message (actually it was error code 1). A quick Google result on the DigiDesign discussion board indicated that error 1 meant that ProTools didn't like the OS version.

The next major version of ProTools is due soon so I wonder if that will work. Hopefully in the meantime there is a minor release that works on OS X 10.3.7. Going to investigate now...

UPDATE (2004-12-18): Looks like upgrading to ProTools LE 6.4 did the trick.

by : Created on Dec. 18, 2004 : Last modified Feb. 8, 2005 : (permalink)


Nominations Open for 2005 Australian Blog Awards

see http://kekoc.com/wp/archives/2004/12/14/2005-australian-blog-awards-nominations/

by : Created on Dec. 17, 2004 : Last modified Feb. 8, 2005 : (permalink)


Priority, Severity and Roundup

I'm a big fan of roundup as a bug tracking system. It does, however, come with an odd list of default priorities:

One thing I don't like about it is that it conflates priority and severity. I think it's useful in a bug tracking system to distinguish priority and severity. While the two are often related, it is possible to have a high-priority low-severity bug (e.g. embarrassing typo in UI the day before an important customer meeting) and a low-priority high-severity bug (e.g. software crashes on an unsupported OS).

Severity, in my view, is about the impact on what the user is trying to do. Severity is fairly easy for the submitter to judge. Priority, on the other hand, is more of a triaging issue that needs to take into account a number of factors the submitter might not be privy to. So priority is best assigned in some separate review session. That is not to say the submitter can't be involved in that review, just that others need to be involved too, so priority can't generally be judged at the time of submission.

Here is a list I came up with a few years ago for the severity of bugs:

Any alternative lists people have used and found useful?

Note that features aren't included here. I'm not sure that features should be treated as a level of priority or severity. I like the approach of them being a completely different issue type. I also think there's value in having a "task" type which covers things that aren't features or bugs but nevertheless benefit from being tracked. The only problem I see with different types is that, as a developer you really want to see all your issues at once, whether they be features, bugs or tasks. It isn't clear to me how one would do that in roundup.
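One way to picture the model described above is a single issue record with a type field and separate severity and priority fields. The sketch below is plain Python and not roundup's schema or API; the three-level scale and all the names are placeholders invented for the example, not the severity list mentioned earlier.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum
from typing import Optional


class Kind(Enum):
    BUG = "bug"
    FEATURE = "feature"
    TASK = "task"


class Level(IntEnum):
    # Placeholder scale, purely for illustration.
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class Issue:
    title: str
    kind: Kind
    assignee: str
    severity: Optional[Level] = None  # judged by the submitter, bugs only
    priority: Optional[Level] = None  # assigned later, at triage


def worklist(issues, assignee):
    """All of one developer's issues, whatever their kind: triaged first, highest priority on top."""
    mine = [i for i in issues if i.assignee == assignee]
    triaged = sorted((i for i in mine if i.priority is not None),
                     key=lambda i: i.priority, reverse=True)
    untriaged = [i for i in mine if i.priority is None]
    return triaged + untriaged


issues = [
    Issue("Crash on unsupported OS", Kind.BUG, "dev", severity=Level.HIGH, priority=Level.LOW),
    Issue("Typo in UI before customer meeting", Kind.BUG, "dev", severity=Level.LOW, priority=Level.HIGH),
    Issue("Update build scripts", Kind.TASK, "dev"),
    Issue("Export to PDF", Kind.FEATURE, "someone else", priority=Level.MEDIUM),
]

for issue in worklist(issues, "dev"):
    print(issue.kind.value, issue.title)
```

Keeping severity and priority as separate optional fields means a submitter can record severity immediately while priority stays empty until the review session.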

UPDATE (2005-01-03) : Now see More on Priority and Severity

by : Created on Dec. 17, 2004 : Last modified Feb. 8, 2005 : (permalink)


Leonardo Release Candidate

The first release candidate for Leonardo 0.4 is available at http://jtauber.com/2004/12/leonardo-0.4.0-rc1.tgz. Let me know if you encounter any problems. If all goes well, Leonardo 0.4 will be out by the end of the year.

by : Created on Dec. 16, 2004 : Last modified Feb. 8, 2005 : (permalink)


Why Couldn't They Have Had Blogs in 1986

I was reminiscing with my parents this evening about my first year of high school, which I did by correspondence because we were living in Brunei at the time. My mum reminded me that the thing I hated most was having to write a journal for English.

My teacher didn't care what I wrote, as long as I wrote something. But I always found it difficult, perhaps because the act of writing something down on paper and posting it off to my teacher in Australia made it all seem so formal.

How much easier it would have been if blogs had existed back in 1986!

by : Created on Dec. 15, 2004 : Last modified Feb. 8, 2005 : (permalink)


It Took Me A Lot Longer

Scoble mentions that today is the fourth anniversary of his blog and he credits Dave Winer as one of the people that talked him into it.

Thinking back, four years ago was the EDevCon conference in New Orleans that I gave a Web Services keynote at. Scoble was the organizer. I also met Dave Winer there for the first time (and Brent Simmons). Dave has a picture to prove it (that's me with the Slashdot fleece :-)

Whatever Dave said to Scoble to talk him into blogging, he mustn't have said to me, but I got there eventually.

by : Created on Dec. 15, 2004 : Last modified Feb. 8, 2005 : (permalink)


Architecture of the World Wide Web, Volume One

The Architecture of the World Wide Web, Volume One has become a W3C Recommendation.

Congratulations to the W3C TAG. This is a great piece of work (even if the title does sound like a Mel Brooks movie) and provides an invaluable foundation for the design of Web-based systems.

Where Leonardo has failed to embody the terminology, principles or best practices of this document, I consider that to be a bug in Leonardo.

by : Created on Dec. 15, 2004 : Last modified Feb. 8, 2005 : (permalink)


Thoughts on GNT-NET Parallel Glossing Project

Zack Hubert mentions that I'm thinking about using the NET Bible for a collaborative parallel glossing project.

Here is how it might work:

The user is presented with the Greek text and the NET text.

Consider Luke 1.1. The Greek reads:

Ἐπειδήπερ πολλοὶ ἐπεχείρησαν ἀνατάξασθαι διήγησιν περὶ τῶν πεπληροφορημένων ἐν ἡμῖν πραγμάτων,

The NET reads

Now many have undertaken to compile an account of the things that have been fulfilled among us,

It should be possible to select any number of words in the Greek and any number of words from the NET and assert that they correspond (or link) to one another. There is no need to link between the entire verse of Greek and the entire verse of the NET because that link has already been made automatically.

Say the user selects Ἐπειδήπερ. They should then be shown the part-of-speech and parse information for the word (in this case C) as well as the lexical form, ἐπειδήπερ. The user should also be shown all previous glosses for ἐπειδήπερ in other contexts.

The user is then instructed to select the word or words that directly translate ἐπειδήπερ. In this case, the user selects Now and submits.

The user need not progress in order. Say the next thing they select is the word πραγμάτων. As before, they are shown the part-of-speech and parse information (N-GPN) and the lexical form, πρᾶγμα. Again the user is shown previous glosses. These glosses should include those specifically for πραγμάτων as well as other forms of πρᾶγμα, perhaps displayed differently.

The user then selects things and submits.

It should be possible to select multiple Greek words and link them to just one word from NET. It should also be possible to select one Greek word and link it to multiple words in the NET. Many-to-many links should also be possible. For example, a user could select περὶ τῶν πεπληροφορημένων ἐν ἡμῖν πραγμάτων and of the things that have been fulfilled among us and submit that linkage.

It is also possible that some words won’t link to anything.

Many-to-many linkages should be encouraged where the particular sense of a word is entirely determined by its use in a sequence (such as an idiom).

Users should be discouraged from doing many-to-many linkages where the sequence isn't a grammatical unit such as a phrase. For example, a user shouldn't submit a link between περὶ τῶν and of the. This clearly can't be enforced.

Users should be required to log in before they can submit linkages. Each linkage will be stored with the email address of the person that made the linkage.

While users may be encouraged to work on particular verses, they should be free to go to whatever verses interest them. Duplicate effort is not a problem and provides redundancy. The data can be checked later for inconsistencies.
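A linkage as described above could be stored as two sets of word positions plus the submitter's email address. The following is only a sketch of one possible data structure: the class and function names, the whitespace tokenisation and the example.com address are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Luke 1.1, split on whitespace, with the trailing commas dropped.
GREEK = ("Ἐπειδήπερ πολλοὶ ἐπεχείρησαν ἀνατάξασθαι διήγησιν "
         "περὶ τῶν πεπληροφορημένων ἐν ἡμῖν πραγμάτων").split()
NET = ("Now many have undertaken to compile an account of the things "
       "that have been fulfilled among us").split()


@dataclass(frozen=True)
class Linkage:
    verse: str
    greek_positions: tuple  # indexes into the Greek words of the verse
    net_positions: tuple    # indexes into the NET words of the verse
    submitted_by: str       # email address of the logged-in user


def words(link):
    """Return the linked Greek words and NET words as two strings."""
    greek = " ".join(GREEK[i] for i in link.greek_positions)
    english = " ".join(NET[i] for i in link.net_positions)
    return greek, english


def unlinked_greek(links):
    """Positions of Greek words that no linkage covers yet."""
    covered = {i for link in links for i in link.greek_positions}
    return [i for i in range(len(GREEK)) if i not in covered]


user = "user@example.com"  # hypothetical address
links = [
    Linkage("Luke 1.1", (0,), (0,), user),      # one-to-one
    Linkage("Luke 1.1", (10,), (10,), user),    # one-to-one, out of order
    Linkage("Luke 1.1", tuple(range(5, 11)), tuple(range(8, 17)), user),  # many-to-many
]

for link in links:
    print(words(link))
print(unlinked_greek(links))
```

Because each linkage is an independent record, overlapping or duplicate submissions from different users can simply be stored side by side and compared later.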

by : Created on Dec. 14, 2004 : Last modified Feb. 8, 2005 : (permalink)


MorphGNT v5.05 Available

by : Created on Dec. 14, 2004 : Last modified Feb. 8, 2005 : (permalink)


Best Use of MorphGNT So Far

Zack Hubert has taken my MorphGNT and built a GNT Browser that blew me away! It displays the text in the browser; hover on a word and the lemma and parsing are shown in a pop-up; click on the word and you get a graph of word occurrence by book with the ability to list all occurrences.

I've toyed with web interfaces to the MorphGNT for years but nothing even remotely as slick as this.

by : Created on Dec. 14, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Project Update: DVDs and More Festivals

We've just submitted Alibi Phone Network to five more festivals: Newport, Sedona, Vail, OC and Sonoma Valley.

It was our first submission using professionally duplicated DVDs rather than making copies ourselves. We got a batch of 100 done, of which I expect around 50 to be submitted to festivals.

by : Created on Dec. 14, 2004 : Last modified Feb. 8, 2005 : (permalink)


Ground Loop

The last few days I've been reorganising my home office / recording studio (unfortunately, they are still the same thing).

When I plugged my Korg Triton LE into my Digidesign Digi002 I noticed the distinctive hum of a ground loop. I've never had to deal with a ground loop before. Basically they occur when one device's path of least resistance to the ground is through the audio cable. The result is a low hum at AC frequency (50Hz in Australia).

So I hopped on to the excellent home-recording mailing list to ask what I should do.

Rodrigue Amyot came to my rescue with some things to try. The first possible problem we identified was that the Korg's power cable is only a two-pin (what were they thinking!)

Another possibility Rod raised was mixing balanced and unbalanced devices. I don't know what the Korg is (my Roland keyboard definitely has balanced outputs) and I don't know what the Digi002 takes although I would guess balanced. My cabling assumes both are balanced.

Unplugging the power to the Korg still left the hum which suggested it wasn't a power ground loop problem after all.

Still working on the problem. Audio electrics is fun.

by : Created on Dec. 13, 2004 : Last modified Feb. 8, 2005 : (permalink)


Blog Goals or Lack Thereof

Dorothea Salo in Caveat Lector comments on how odd it seemed being asked how her blog was going. I think I would react the same way.

Ask how my music's going, or my filmmaking, or my morphological analysis of the Greek New Testament and I'd be able to tell you. They are projects, or at least interests manifesting as specific projects. Even the Poincare Project is foremost about me taking notes on my way to understanding the (possible) proof of the Poincare Conjecture. The use of the blog for those notes is largely incidental to that goal.

Blogging in and of itself isn't a project for me. I think that's largely because I don't have goals for it. Sure I track referrer logs and webstats, etc. Sure I get a thrill when Mark Liberman likes an idea of mine or Doc Searls doesn't. But they aren't accomplishments tracked against some schedule. I don't have monthly Scoble linkblogging targets.

Not that there's anything wrong with that. But for me, like Dorothea, blogging is scribbling. Occasionally making announcements, but mostly just scribbling.

by : Created on Dec. 13, 2004 : Last modified Feb. 8, 2005 : (permalink)


On the Red Couch

No, I'm not appearing on Scoble's Red Couch (I wouldn't say no, though) but Nelson James will be on this red couch next Sunday.

That's right, the pop duo I'm in has been invited back (always a good sign) to perform on local chat show, The Couch, for their Christmas special.

UPDATE (2004-12-14): Unfortunately, there is a conflict with a play that Nelson is in and so we've had to cancel our television appearance. However, we should be appearing some time in the new year.

by : Created on Dec. 12, 2004 : Last modified Feb. 8, 2005 : (permalink)


More on Typed Citations

I've written before about the idea of typed citations.

Mark Liberman (who I might have studied under if I'd gone ahead with my PhD application to UPenn) comments on the idea of typed citations with some excellent thoughts. One thing that I realised, reading Mark's post: I probably wasn't clear that I was envisaging a controlled vocabulary, much like XFN has.

The notion of typed citations relates to trackbacks, a topic I've also talked about before. Bryan Lawrence (who has recently become my main sounding board in the development of Leonardo) asks about semantics in trackbacks. He is talking about typing the source object rather than relationship but the two are related. In RDF terms, one is a class, the other is a property. I would love to see both able to be expressed in a trackback.

by : Created on Dec. 12, 2004 : Last modified Feb. 8, 2005 : (permalink)


Poincare Project: A Basis for a Topology

Because of the requirement that unions and finite intersections of open sets must also be open sets, you don't need to specify every open set in order to define a topology. You can characterise a topology by describing a certain class of open sets from which the other open sets can be calculated.

Such a class is called a basis for the topology.

Because members of the basis are themselves open sets, once we have a basis we can generate all the other open sets by taking unions.

A random selection of subsets of X isn't always going to give us a basis for a topology on X any more than it gives us a topology, so what restrictions exist on a basis to ensure it can generate a topology?

Clearly every element in the set X must appear in at least one basis open set. Otherwise that element would miss out on being in any open sets (and we know that, by definition, X itself must be open).

There is one more requirement, however, that must be met. Consider X = {a, b, c}. The open sets {a, b}, {b, c} cannot form a basis because if {a, b} and {b, c} are open then the intersection {b} must be. But {b} cannot be open because it isn't the union of basis open sets.

To avoid this, we have the additional requirement on a basis as follows:

if x is in the intersection of two basis open sets then x must also be in a third basis open set which is a subset of the intersection.

This, along with the requirement that every element must appear in at least one basis open set is sufficient to ensure that one has a basis for a topology.
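For finite sets, the two requirements can be checked mechanically. The following is only a sketch: the function names `is_basis` and `generate_topology` are my own for illustration, and the brute-force approach is only sensible for very small sets.

```python
from itertools import combinations


def is_basis(X, basis):
    """Check the two basis requirements for a finite set X."""
    basis = [frozenset(b) for b in basis]
    # Requirement 1: every element of X appears in at least one basis set
    # (and no basis set strays outside X).
    if set().union(*basis) != set(X):
        return False
    # Requirement 2: every point in the intersection of two basis sets lies
    # in some basis set that is itself a subset of that intersection.
    for b1, b2 in combinations(basis, 2):
        common = b1 & b2
        for x in common:
            if not any(x in b3 and b3 <= common for b3 in basis):
                return False
    return True


def generate_topology(basis):
    """Return every union of basis sets (the empty union is the empty set)."""
    basis = [frozenset(b) for b in basis]
    opens = set()
    for r in range(len(basis) + 1):
        for combo in combinations(basis, r):
            opens.add(frozenset().union(*combo))
    return opens


X = {"a", "b", "c"}
print(is_basis(X, [{"a", "b"}, {"b", "c"}]))         # the failing example above
print(is_basis(X, [{"a", "b"}, {"b", "c"}, {"b"}]))  # adding {b} repairs it
```

Adding {b} to the collection is the smallest change that satisfies the second requirement for the example above.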

UPDATE: next post

by : Created on Dec. 10, 2004 : Last modified Feb. 8, 2005 : (permalink)


Shift to Aggregator Use

I noticed some interesting numbers in my website logs that suggest a significant shift towards aggregator use when reading this blog.

In October, there were 772 unique IP hits to the full-text atom feed. In November, that number was 941. That's a more than 20% increase.

However, October saw 3228 unique IP hits to blog pages compared with only 2600 in November. A just under 20% decrease.
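The arithmetic behind those two percentages, as a quick check (the helper name `percent_change` is just for illustration):

```python
def percent_change(before, after):
    """Relative change from before to after, as a percentage."""
    return (after - before) / before * 100


feed = percent_change(772, 941)     # unique IP hits to the atom feed
pages = percent_change(3228, 2600)  # unique IP hits to blog pages
print(f"feed: {feed:+.1f}%  pages: {pages:+.1f}%")
# prints "feed: +21.9%  pages: -19.5%"
```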

Now this might not have been caused by a shift from people reading in a browser to people reading in an aggregator but it does seem plausible, even likely.

by : Created on Dec. 9, 2004 : Last modified Feb. 8, 2005 : (permalink)


MorphGNT v5.04 and Beyond

I've released a new version of my MorphGNT.

Details of the changes are on the MorphGNT page but they all stem from a simple query performed via a Python script: in cases where there is no parse-code (i.e. the word is essentially uninflected), is the text form the same as the lexical form (other than accentuation)?
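A query of that kind might look roughly like the sketch below. This is not the actual script: the row layout (parse code, text form, lexical form) and the convention that an empty or all-dash parse code means "uninflected" are assumptions made for the example. It also strips every combining mark, so it ignores breathings as well as accents, which is broader than the rule as stated.

```python
import unicodedata


def strip_diacritics(word):
    """Remove all combining marks (accents, breathings, etc.) from a word."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def mismatches(rows):
    """Yield (text, lemma) for uninflected words whose forms differ.

    rows is an iterable of (parse_code, text_form, lexical_form) tuples;
    this layout is hypothetical.
    """
    for parse_code, text, lemma in rows:
        uninflected = not parse_code.strip("-")
        if uninflected and strip_diacritics(text) != strip_diacritics(lemma):
            yield text, lemma


rows = [
    ("", "καὶ", "καί"),           # differs only in accent: not reported
    ("", "ἀλλ᾽", "ἀλλά"),         # elided form: reported
    ("3SAAI", "εἶπεν", "λέγω"),   # has a parse code: skipped
]
for text, lemma in mismatches(rows):
    print(text, lemma)
```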

In some cases this rule means that new lexical forms need to be provided to allow for spelling variation, rather than the lexical form normalising spelling. This is an editorial decision I've made that makes more sense in the larger picture of where I'm going with the MorphGNT.

The corrections I'm making to the CCAT database are really just a side-effect of my efforts to build an original database of New Testament Greek morphology. I'll say more about it as it develops but the idea is that surface forms, lexical forms, spelling variations, roots, stems, suppletion, morpho-phonological rules, etc. will all be catalogued with relationships between them expressed as a directed labelled graph.

Eventually, the MorphGNT will reference into this graph rather than merely give the lemma. There'll be a partial ordering of nodes in the graph (expressed by a subset of arc types) and so references will be to the node that is as general as can explain the specific surface form.

by : Created on Dec. 9, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Project Update: Two More Festivals Without a Box

Just completed the submissions for Ann Arbor and Aspen. For these festivals I was able to use the phenomenally useful site WITHOUTABOX.

WITHOUTABOX lets you enter the information about your film once and submit electronically (everything but the film itself but that's coming) to each festival. If you're submitting to more than a couple of festivals, this is an incredible time saver. Not only that but the site provides a calendar showing upcoming festival deadlines filtered by whether your film is eligible for the festival or not.

They have support for submission to hundreds of festivals (including some pretty big ones) and seem to be adding more all the time. They also have a larger database of known festivals that aren't part of the WITHOUTABOX submission system (yet) so you can still track their deadlines too.

by : Created on Dec. 8, 2004 : Last modified Feb. 8, 2005 : (permalink)


Integrating Subversion and Roundup

I'm using Subversion for Leonardo and have recently started using Roundup for issue tracking.

I'd like to have some level of integration between the two. The sort of thing I was initially thinking of was being able to associate an issue with a revision and vice versa.

The Roundup wiki gives an example of making something like Version:37 in an issue message automatically get turned into a link to the version control system (or something like ViewCVS).
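The text substitution itself is simple. The sketch below is plain Python and deliberately doesn't use any Roundup API; the `VIEWER_URL` pattern is a made-up placeholder for wherever the repository browser lives, and a real version would also need to HTML-escape the rest of the message.

```python
import re

# Hypothetical URL pattern for a repository browser; not a real address.
VIEWER_URL = "http://example.org/viewcvs?rev={rev}"


def link_revisions(message):
    """Turn references like Version:37 into HTML links."""
    def repl(match):
        url = VIEWER_URL.format(rev=match.group(1))
        return '<a href="{}">{}</a>'.format(url, match.group(0))

    return re.sub(r"\bVersion:(\d+)\b", repl, message)


print(link_revisions("Fixed in Version:37."))
```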

Because Roundup is extensible in the object types it manages, one could presumably go a step further and have a class called "change" and extend Subversion to, every time a commit is done, create a new change object for it in Roundup including the commit message.

References to issues could then be made in commit messages (and the link automatically made). Furthermore, Roundup would facilitate chatting about revisions. Revisions could be classified by topic, assigned to people for review, etc.

by : Created on Dec. 8, 2004 : Last modified Feb. 8, 2005 : (permalink)


MorphGNT v5.03 available

More corrections now and more coming soon.

Version 5.03 contains a major correction to the lemma PRO; a correction to MYRA; some spelling distinctions ENEKEN/ENEKA, BETHSAIDA(N), GOLGOTHA(N); and case corrections in proper names GERASENOS, STEFANOS, FOROS, TREIS, TABERNE, DIABLOS.

See MorphGNT.

by : Created on Dec. 7, 2004 : Last modified Feb. 8, 2005 : (permalink)


Poincare Project: Connectedness, Closed Sets and Topological Properties

Some topological spaces have the property that they can be decomposed into two disjoint non-empty open sets. In other words, there exist two non-empty open sets whose intersection is empty but whose union is the entire space. Take our ball of clay and cut it in half.

Such a topological space is said to be disconnected. Topological spaces for which this is not true are said to be connected.

Another way of defining the same notion of connectedness is via the notion of closed sets. (The existence of open sets suggested there would be something called closed sets right?)

A closed set of a topological space is simply one whose complement is open. In other words, if you have an open set, then the set of points not in that open set is a closed set. One interesting property of this definition is it allows a set to be both open and closed at the same time. If a set and its complement are both open, then both sets are also closed.

Because, by definition, the empty set and the set of all points in a topological space are open sets, they are also closed sets. And here is where we come to the definition of connectedness based on closed sets.

A topological space is connected if and only if the only two sets that are both open and closed are the empty set and the set of all points. If any other sets are both open and closed then the topological space must be disconnected.

It is fairly easy to see why this is true. If two disjoint non-empty open sets A and B have a union which is the entire space then A and B are each other's complements. Therefore A must be closed (because B is open) and B must be closed (because A is open). Therefore A and B are both open and closed.
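On a finite space the closed-set definition translates directly into a check. This is a sketch with illustrative names (`clopen_sets`, `is_connected`), representing a topology as a collection of sets:

```python
def clopen_sets(X, topology):
    """Return the sets that are both open and closed."""
    X = frozenset(X)
    opens = {frozenset(s) for s in topology}
    # A set is closed when its complement is open.
    return {s for s in opens if X - s in opens}


def is_connected(X, topology):
    """Connected iff the only clopen sets are the empty set and X itself."""
    X = frozenset(X)
    return clopen_sets(X, topology) <= {frozenset(), X}


X = {"a", "b"}
discrete = [set(), {"a"}, {"b"}, {"a", "b"}]
print(is_connected(X, discrete))                 # {a} and {b} are clopen
print(is_connected(X, [set(), {"a"}, {"a", "b"}]))
```

The first topology splits into the two open sets {a} and {b}, so it is disconnected; in the second, {a} is open but its complement {b} is not, so nothing other than the empty set and the whole space is clopen.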

Connectedness is said to be a topological property because it is based purely on the open sets and no additional structure. Because topological properties are based only on the open sets, they are preserved by a homeomorphism. All homeomorphisms preserve all topological properties. So if a space is connected, then any space homeomorphic to it will also be connected. An important corollary is that you can never find a homeomorphism between a connected space and a disconnected one, or between any two spaces that have differing topological properties.

In the example of cutting our ball of clay in half, the before and after are not homeomorphic because the before is connected and the after is disconnected. Again, we've ripped apart points that were once in lots of open sets together so that now the only open set they share is the topological space as a whole.

UPDATE: next post

by : Created on Dec. 7, 2004 : Last modified Feb. 8, 2005 : (permalink)


Next Film After Alibi

After we finished principal photography on Alibi Phone Network, I suggested our next short film should expand in one of the following three dimensions:

Tom has been working on a great script that I definitely want to produce—the problem is it expands on Alibi in all three dimensions simultaneously: 40 mins versus 14; really deserves HD rather than MiniDV; massive increase in cast/crew/prop/location requirements. To do well, it would take 5 times as long a shoot and 10-20 times the budget of Alibi and, particularly given my lack of experience on HD, just too much of a risk.

So today I suggested to Tom that we think about an intermediate project. One that is around 20-25 minutes, shot on HD but not requiring much more beyond Alibi in terms of cast/crew size, number of locations, etc.

I have an idea I came up with in 2001 that would probably fit well. Watch this space!

by : Created on Dec. 7, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Project Update: First Festival Submission Arrived

The Alibi Phone Network DVD arrived at SXSW. Next up: Ann Arbor and Aspen.

by : Created on Dec. 6, 2004 : Last modified Feb. 8, 2005 : (permalink)


MorphGNT v5.02 Available

Some breathing corrections on rho-initial words.

by : Created on Dec. 5, 2004 : Last modified Feb. 8, 2005 : (permalink)


Structured Tag Naming in Subversion

I've recently started using Subversion for versioning the Leonardo code base. While I've admired the design of Subversion since before 1.0, I'd never really had an opportunity to use it on a project.

One of the things I've done with the Leonardo repository is followed the suggestion of the O'Reilly Subversion book in having three top-level directories:

However, it's just occurred to me that, because tags are just copies with their own directory path, I could add some structure to my tags. Because I normally use tags for either checkpoints, milestones or releases, my top-level directories could be:

Even within things like /releases I could have structure such as

I'm thinking aloud but it seems like a reasonable practice to follow. Anyone done anything similar?

by : Created on Dec. 4, 2004 : Last modified Feb. 8, 2005 : (permalink)


Poincare Project: Homeomorphisms

Previously we talked about bijections as a way of pairing up all the elements of two sets. Often this is done to express that one set is equivalent to another.

Once you have structure on the set, it isn't enough to just have a bijection. The elements of the two sets must be paired up in a way that maintains the structure before the two structured sets can be said to be equivalent.

Two topological spaces are equivalent if the bijection maintains the open sets. In other words, if the bijection maps open sets to open sets, and its inverse does the same, then our two spaces are topologically equivalent.

Another word for topologically equivalent is homeomorphic (note the 'e') and the topology-preserving mapping is called a homeomorphism.
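For finite spaces, the test can be written down directly. In this sketch the mapping is a dict that is assumed to already be a bijection (the function doesn't verify that), and the name `is_homeomorphism` is just for illustration. Requiring the images of the open sets of X to be exactly the open sets of Y covers both directions: open sets go to open sets, and every open set of Y comes from an open set of X.

```python
def is_homeomorphism(f, topology_x, topology_y):
    """Check that a bijection f (a dict from points of X to points of Y)
    carries the open sets of X exactly onto the open sets of Y."""
    opens_x = {frozenset(s) for s in topology_x}
    opens_y = {frozenset(s) for s in topology_y}
    images = {frozenset(f[p] for p in s) for s in opens_x}
    return images == opens_y


topology_x = [set(), {"a"}, {"a", "b"}]
topology_y = [set(), {2}, {1, 2}]
print(is_homeomorphism({"a": 2, "b": 1}, topology_x, topology_y))
print(is_homeomorphism({"a": 1, "b": 2}, topology_x, topology_y))
```

The first pairing sends the open set {a} to the open set {2}; the second sends it to {1}, which is not open, so it fails.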

A topological space is the most general space that has a notion of continuity, so two spaces that differ in terms of other structures (like distance between their points) might still be homeomorphic if continuity is preserved. One way to think about this is moulding a ball of clay...

Imagine taking a ball of clay and squashing it flat. If you think of the clay as a metric space, you've clearly changed the space quite a bit because distances between pairs of points are no longer the same. However, you haven't changed the topology. The open sets are still open sets in the squashed version. Squashing the clay is a homeomorphism. If you'd drawn a continuous line on your ball it would still be continuous after the squashing. Squashing hasn't ripped two points apart from one another.

But, now consider pushing your thumb through the clay to mould it into a doughnut-shape. To make the hole, you had to rip points apart from one another. This has altered the open sets. Two points that might have been very close (and hence in some very small open sets together) might now only share very large open sets in common. Because the topology is not preserved, the mapping from ball to doughnut is not a homeomorphism.

A topologist would say that the ball of clay is not homeomorphic to the doughnut-shaped clay.

We've reached an important milestone because the Poincare Conjecture has to do with whether one particular type of topological space is always homeomorphic to another particular type.

UPDATE: next post

by : Created on Nov. 29, 2004 : Last modified July 1, 2005 : (permalink)


Film Project Update: Mailing the DVD

With a climax worthy of a film, I got the DVD of Alibi Phone Network sent off to Tom for duplication and festival submission.

I had arranged to visit a friend in the afternoon and my original plan was to spend the morning doing the DVD burning, mail it off and then go visit the friend. However, the burning took longer than I planned and so I decided I'd go to the post office after I'd paid the visit.

Somewhere between when I left to go to the friend's house and when I got back home, I misplaced my wallet so I had no money to pay for the shipping. I rang my mum (who lives ten minutes away) and asked if I could come over and borrow some money. (Oh how many times in my thirty-one years my mum has come to my rescue!)

I got the money, rushed to the post office just before closing time and, as I put the parcel on the counter, the lady said "can I see some ID?". I'd forgotten that international shipping requires ID. And my drivers licence was...you guessed it...in my wallet.

So I raced back home, found my passport, raced back to the post office and got the DVDs sent off within minutes of closing.

by : Created on Nov. 29, 2004 : Last modified Feb. 8, 2005 : (permalink)


Leonardo Mailing List Available

As completion of 0.4 nears, I've set up a mailing list for users and potential contributors. You can join it at:

http://mail.pyworks.org/listinfo/leonardo

UPDATE (2004-12-12): I've edited this page to reflect the new address.

by : Created on Nov. 28, 2004 : Last modified Feb. 8, 2005 : (permalink)


Global Warming and Eskimo Words for Robin

If I see another blog entry that spreads the false meme that, due to global warming, Eskimos are now seeing species they don't have words for, I'm going to scream.

It's just bad linguistics.

Geoffrey Pullum does a much better job than I could of debunking this. Pullum was also the guy who debunked the "Eskimos have hundreds of words for snow" meme many years ago.

by : Created on Nov. 28, 2004 : Last modified Feb. 8, 2005 : (permalink)


Programmed Vocabulary Learning as a Travelling Salesman Problem

For a while I've been interested in how you could select the order in which vocabulary is learnt in order to maximise one's ability to read a particular corpus of sentences. Or more generally, imagine you have a set of things you want to learn and each item has prerequisites drawn from a large set with items sharing a lot of common prerequisites.

As an abstract example, imagine you want to be able to read the "sentences":

{"a b", "b a", "h a b", "d a b e c", "d a g f"}

where we assume you must first learn each "word". Further assuming that all sentences are equally valuable to learn, how would you order the learning of words to maximise what you know at any given point in time?

One approach would be to learn the prerequisites in order of their frequency. So you might learn in an order like

<a, b, d, c, e, f, g, h>

However, had we put h before d, we could have had an overall learning programme that, although equal in length by the end, enabled the learner, at the half-way mark, to understand three sentences instead of just two.

To investigate this further, I needed a way to score a particular learning programme and decided that one reasonable way to do so would be to sum, across each step, the fraction of the overall set of sentences understandable at that point.
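Here is a sketch of that scoring rule applied to the contrived example. It is a reconstruction from the description rather than the original script, and it treats a sentence as understandable once every one of its words has been learnt.

```python
def score(order, sentences):
    """Sum, over each learning step, the fraction of sentences readable."""
    known = set()
    total = 0.0
    for word in order:
        known.add(word)
        readable = sum(1 for s in sentences if set(s.split()) <= known)
        total += readable / len(sentences)
    return total


sentences = ["a b", "b a", "h a b", "d a b e c", "d a g f"]
print(score("a b d c e f g h".split(), sentences))  # frequency order
print(score("a b h d c e f g".split(), sentences))  # h moved before d
```

Counting by hand, the frequency ordering makes 0, 2, 2, 2, 3, 3, 4, 5 sentences readable across its eight steps (21/5 = 4.2), while moving h forward gives 0, 2, 3, 3, 3, 4, 4, 5 (24/5 = 4.8).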

I then needed an algorithm that would find the ordering that would maximise this score.

After the quick realisation that the number of possible learning programmes was factorial in the number of words, it dawned on me that this was essentially a travelling salesman problem.

So my sister Jenni and I wrote a Python script that implements a simulated annealing approach to the TSP. We then applied it to the above contrived example. Sure enough, it found a solution that was better than a straight prerequisite frequency ordering.

I then decided to try applying it to a small extract of the Greek New Testament (which, of course, I have in electronic form, already stemmed). So I ran it on the first chapter of John's Gospel. 198 words and 51 verses. A straight frequency ordering on this text achieves a score of 48 so that was the score to beat.

My first attempt didn't even come close to that. What a disappointment! Jenni and I wondered if it was just the initial parameters to the annealing model. So we increased the number of iterations at a given temperature to 50 and lowered the final temperature to 0.001 (keeping the initial temperature at 1 and the alpha at 0.9).

Success!! It found a solution that scored 82.94. The first verse readable (after 27 words) was John 1.34. John 1.20 was then readable after just 2 more words and John 1.4 after another 7.

I decided to try different parameters. With 100 iterations per temp, a final temp of 0.0001 and a few hours, it achieved a score of 91.59 (and was still increasing at the time). This time the first verse readable was John 1.24, after only 8 words; then John 1.4 after another 9; John 1.10 after 4; and both John 1.1 and John 1.6 after another 4 and John 1.2 just 1 word after that.
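For anyone wanting to experiment, here is a minimal sketch of an annealing loop with the parameters mentioned above (initial temperature, final temperature, alpha, iterations per temperature). It is not the script Jenni and I wrote. The neighbour move (swapping two random positions) and the acceptance rule (the usual exp(delta / temperature) test) are assumptions, as are the function names and the `seed` argument.

```python
import math
import random


def score(order, sentences):
    """Sum, over each learning step, the fraction of sentences readable."""
    known = set()
    total = 0.0
    for word in order:
        known.add(word)
        total += sum(1 for s in sentences if set(s.split()) <= known) / len(sentences)
    return total


def anneal(words, sentences, t_initial=1.0, t_final=0.001, alpha=0.9,
           iterations=50, seed=0):
    """Search for a high-scoring word order by simulated annealing."""
    rng = random.Random(seed)
    current = list(words)
    current_score = score(current, sentences)
    best, best_score = current[:], current_score
    t = t_initial
    while t > t_final:
        for _ in range(iterations):
            # Neighbour: swap two randomly chosen positions.
            i, j = rng.sample(range(len(current)), 2)
            candidate = current[:]
            candidate[i], candidate[j] = candidate[j], candidate[i]
            candidate_score = score(candidate, sentences)
            delta = candidate_score - current_score
            # Always accept improvements; accept a worse order with a
            # probability that shrinks as the temperature drops.
            if delta >= 0 or rng.random() < math.exp(delta / t):
                current, current_score = candidate, candidate_score
                if current_score > best_score:
                    best, best_score = current[:], current_score
        t *= alpha  # cool down
    return best, best_score


sentences = ["a b", "b a", "h a b", "d a b e c", "d a g f"]
order, value = anneal("a b d c e f g h".split(), sentences)
print(order, value)
```

Because the best ordering seen so far is kept separately, the result is never worse than the starting order, whatever the random choices.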

Overall a very promising approach. I doubt it's anything new but it was fun discovering the approach ourselves rather than just reading about it in some textbook. The example I tested it on was vocabulary learning, but it could apply to anything that can similarly be modelled as items to learn with prerequisites drawn from a large, shared set.

The next step (besides more optimised code and even more long-running parameters) would be to try to work out how to model layered prerequisites - i.e. where prerequisites themselves have prerequisites - to any number of levels. I haven't thought yet how (or even whether) that boils down (no pun intended) to a simulated annealing solution to the TSP.

UPDATE (2005-08-03): Now see Using Simulated Annealing to Order Goal Prerequisites.

by : Created on Nov. 26, 2004 : Last modified Aug. 3, 2005 : (permalink)


Film Project Update: Final Cut Done

Okay, I didn't get to it last weekend but today I finally managed to do an edit of Alibi Phone Network that cut around the line we didn't like as well as fix a bunch of other little things.

The latter included some sound level normalization and removing a sigh noise that didn't fit because the audio was from a different take than the video and in the video you couldn't see any sighing.

There are a bunch of places where I used audio from a different take than the visuals. Mostly it's during an over-the-shoulder shot during a dialog. The clearest dialog is usually recorded from the person facing the camera, so when the person with their back to the camera is speaking, it's generally better to try to use the audio from the take when they were facing the camera themselves. Syncing is generally not too difficult because you rarely see their lips so you just have to sync to their general head movement.

Sometimes, though, you mix takes when the person is facing the camera (if the audio is much clearer on a take that is different from the one with the best performance visually) and that's what I did that resulted in the sigh. To fix it, I literally cut out one second and replaced it with a second of "silence" from another part of the take. You have to replace it with something to get the sound of the room.

The whole concept of using audio from one take with visuals from another would never have occurred to me had it not been for a remark Bryan Singer makes in the commentary to The Usual Suspects (the first director commentary I ever owned—and on video, long before I owned a DVD player). The commentary on The Usual Suspects was probably the single best lesson in filmmaking I've ever had.

So, I think the film is pretty much done. Now to send a DVD to Tom who's arranged duplication for festival submission.

by : Created on Nov. 26, 2004 : Last modified Feb. 8, 2005 : (permalink)


Thank You Blog Readers

This blog is nine months old today.

Every couple of days, I find a new person that has added me to their blog roll. I can't tell you what a nice feeling it is knowing that, not only do people read your blog, but they are willing to admit to it publicly :-)

I still worry that my journeyman of some lack of focus...err...breadth of topics means that each post is completely irrelevant to 90% of readers—the filmmakers tracking the progress of Alibi Phone Network likely don't care if a school dance pairing is a bijection or not.

But I think I'll still just continue to blog about things that interest me and things that I'm working on. After all, pretty much every single topic I've written on has put me in contact with some interesting person that I've learnt and am continuing to learn new things from.

So thanks for reading!

by : Created on Nov. 26, 2004 : Last modified Feb. 8, 2005 : (permalink)


Google Scholar and Typed Citations

A couple of days ago I found out about Google Scholar which enables searching of scholarly publications. What would make this even more useful is if they combined it with a more comprehensive citation index.

Thinking about citation indices got me wondering, though: what if citation indices were annotated with the relationship between the newer publication and what it was citing? You could have relationships like "quotes", "summarises", "provides further evidence for", "argues against", "answers question posed by", and so on.
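As a thought experiment, a typed citation is just a labelled edge between two publications. The sketch below uses the relationships listed above as a small controlled vocabulary; the class and function names, and the example identifiers, are invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    """A controlled vocabulary of citation types (illustrative only)."""
    QUOTES = "quotes"
    SUMMARISES = "summarises"
    SUPPORTS = "provides further evidence for"
    ARGUES_AGAINST = "argues against"
    ANSWERS = "answers question posed by"


@dataclass(frozen=True)
class Citation:
    source: str         # the newer publication
    relation: Relation  # how it relates to what it cites
    target: str         # the publication being cited


def citing(citations, target, relation):
    """Return the sources that cite target with the given relation."""
    return [c.source for c in citations
            if c.target == target and c.relation is relation]


index = [
    Citation("paper-B", Relation.SUPPORTS, "paper-A"),
    Citation("paper-C", Relation.ARGUES_AGAINST, "paper-A"),
    Citation("paper-D", Relation.ARGUES_AGAINST, "paper-A"),
]
print(citing(index, "paper-A", Relation.ARGUES_AGAINST))
```

This is the kind of richer search a typed index would allow: not just "who cites A?" but "who argues against A?".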

The granularity of many articles might not be right for this to really work given that one might argue for one part of an article and argue against another.

But it's theoretically appealing from the point of view of the richer searches you could do.

Continuing to think aloud: I wonder if it might be more practical in blogs. People could link to this entry with annotations like "agree", "agree with additional ideas", "agree with caveats", "seen something like this already", "really dumb idea with reasons stated".

Kind of an XFN for memes.

by : Created on Nov. 21, 2004 : Last modified Feb. 8, 2005 : (permalink)


MorphGNT v5.01 Available

Found an accent and breathing problem in both the text and lemma for ABEL, ANNA and ANNAS which is now corrected.

by : Created on Nov. 21, 2004 : Last modified Nov. 18, 2007 : (permalink)


Film Project Update: Final Cut Looming

A few people have been asking where I'm at with the film. I'm planning on completing the final cut this weekend ready for festival submission starting in December.

There's a line in the film none of us like and I'm working on trying to cut it out. Not sure if it will work yet. If it does, you'll have to wait until the commentary on the DVD to find out what was changed :-) (unless you're one of a handful of friends and family who've already seen the film and will probably pick it right away).

by : Created on Nov. 20, 2004 : Last modified Feb. 8, 2005 : (permalink)


Poincare Project: Further Thoughts on Topologies and Open Sets

A question raised via email by Dave Long (one of my partners-in-crime on Cleese) has prompted these thoughts.

There is an inherent circularity in thinking of topologies as collections of open sets because it is the topology that defines what an open set is to start with. There's nothing inherent in an open set that makes it "open" apart from the fact it is a member of the topology.

In sets with more structure that enable you to define openness in terms of that additional structure, openness still comes down to the choice of topology that the additional structure is implying.

For example, if you choose a distance function for a metric space, you've implicitly chosen the topology. So while the open sets can be explicitly defined by the distance function in that case, the very choice of the function assumes a particular underlying topology.

UPDATE: next post

by : Created on Nov. 20, 2004 : Last modified Feb. 8, 2005 : (permalink)


Conversation Categories

Don Park writes about an idea he calls "Conversation Categories". The idea is having a discussion on a particular topic with each participant writing in their own blog but categorising their entry as belonging to the particular conversation. An aggregator could then pick up all the pieces of the conversation.

It's discussion datalibre-style and something I'd love to implement in Leonardo.

It actually fits nicely with some of my previous ideas around trackbacks and categories, maybe even using wikipedia for URIs.

del.icio.us has got to fit in somewhere there too!

by : Created on Nov. 19, 2004 : Last modified Feb. 8, 2005 : (permalink)


Birthday Thoughts

Today is my birthday and I spent some of it thinking about what I've achieved over the last year and what I want to achieve in the next.

I think the two things I'm most pleased about in the last year are how the short film Alibi Phone Network turned out and how this blog is turning out.

Some of the things I'd like to see happen in the next year:

It will be fun to revisit this list in 365 days time to see how I've done :-)

by : Created on Nov. 19, 2004 : Last modified Feb. 8, 2005 : (permalink)


Poincare Project: Injections, Surjections and Bijections

Imagine a school dance. There is a set of boys and a set of girls. When the music starts, each boy picks a girl to dance with.

Think of this as a mapping from a boy to a girl, or from an element in the set of boys to an element in the set of girls.

The mapping is said to be injective (or one-to-one) if each boy picks a different girl. If two boys try to dance with the same girl, the mapping isn't injective.

The mapping is said to be surjective (or onto) if no girls are left without a partner. If there is a girl not dancing, the mapping isn't surjective.

If the mapping is both injective and surjective it is said to be bijective.

If the mapping is bijective, you can immediately tell that there are the same number of boys and girls: each boy is dancing with one and only one girl and no girls are left without a boy to dance with.

The existence of a bijection can be used to demonstrate that two sets have the same number of elements or, in the case of infinite sets, have the same cardinality.
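For finite sets the three definitions are easy to state in code. In this sketch the dance is a dict from each boy to the girl he picks (so every boy has exactly one partner by construction); the names are made up for the example.

```python
def is_injective(mapping):
    """No two boys picked the same girl."""
    partners = list(mapping.values())
    return len(partners) == len(set(partners))


def is_surjective(mapping, girls):
    """No girl is left without a partner."""
    return set(girls) <= set(mapping.values())


def is_bijective(mapping, girls):
    return is_injective(mapping) and is_surjective(mapping, girls)


girls = {"Ann", "Beth", "Cath"}
dance = {"Al": "Ann", "Bob": "Beth", "Col": "Cath"}
print(is_bijective(dance, girls))

crowded = {"Al": "Ann", "Bob": "Ann", "Col": "Cath"}
print(is_injective(crowded), is_surjective(crowded, girls))
```

In the second dance two boys pick Ann and Beth is left out, so the mapping is neither injective nor surjective.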

Bijections are also very important in establishing the equivalence between two structured sets (for example between two topological spaces) as we shall see in the near future.

UPDATE: next post

by : Created on Nov. 17, 2004 : Last modified Feb. 8, 2005 : (permalink)


The Road to DataLibre

Steve Mallett has paid me a huge compliment calling my site the "closest DataLibre site I've seen" although I'm somewhat embarrassed because I'm still a long way from where I want to be.

I'm still thrilled Steve likes where I'm going, though. DataLibre is one of the two main drivers (the other being REST) in how I'm implementing Leonardo. In fact, I'm considering describing Leonardo as "a RESTful DataLibre server written in Python".

I received my November copy of HBR today and there was a Forethought article entitled "I Am My Own Database" by Richard T. Watson which is pretty much talking about DataLibre. He describes what is referred to in the article as "customer-managed interaction" or CMI:

Under CMI, when a consumer buys merchandise online, he receives an electronic file that describes his purchases and that can be automatically imported into a database he's installed on his home PC. If he wants to record purchases made earlier or offline, the consumer can obtain an electronic list of common products, like books, and CDs, from the Library of Congress or commercial sources such as the Internet service Gracenote. He also registers an opinion of each purchase by using rating software incorporated into the database. The database remains in the consumer's control at all times, so if he decides that the Led Zeppelin period of his life has irretrievably passed, he can simply change his ratings of Led Zeppelin CDs he's purchased from all sources.

Finally, while writing this entry, it occurred to me that readers of the datalibre-discuss mailing list might be interested in the Forethought article. In true DataLibre fashion, I'll post this entry (along with the permalink) to the list. One feature I want to implement in Leonardo is that kind of "trackback to an email address" feature.

by : Created on Nov. 17, 2004 : Last modified Feb. 8, 2005 : (permalink)


Belated Thoughts on Blogs and Wikis

When I read Tim Bray's suggestion that blogs and wikis couldn't be more different in their essential nature, I knew I wanted to say something on the matter. Well, I've finally got around to it.

Bottom line is I agree with Tim. This may surprise some readers given I've talked before about this site being a wiki/blog hybrid and I describe Leonardo as a wiki/blog server. But here's why I don't consider it a contradiction...

Firstly, purely from the perspective of implementing the content management, there can be similarities—that's what I meant when I talked about wiki/blog hybrids. But Tim was talking about essential nature, not implementation details.

There are really a number of facets to the wiki nature. Four that immediately come to mind:

I'd like to suggest that you can have varying mixes of these and, depending on which mix you have, blogs seem further apart from or closer to wikis.

The important characteristic for something like Wikipedia is the first one. While the rest are still true to varying extents, they aren't what's interesting about Wikipedia. Martin Fowler's bliki, on the other hand, clearly doesn't have the first characteristic. However, it is strongly driven by the fourth and I think it is this facet that really makes his blog wiki-like.

I call my site (and any site served by Leonardo) a "personal wiki" in that it shares characteristics two, three and, to a small degree, four. Plenty of blog software supports in-browser editing. For someone that associates in-browser content editing with wikis, that blog software is wiki-like.

Would two and three alone really be enough to be considered a wiki, though? If not, then wikis and blogs start to diverge. Perhaps people that think blogs and wikis are similar are focusing on two and three. The more you consider four an important characteristic of wikis, the less wiki-like blogs seem—unless they are written like Martin Fowler's. The first characteristic is the one that really sets wikis and blogs apart.

There is no doubt that both wikis and blogs are social. But they are a different kind of social. Blogs are conversations (at least collectively). Wikis (when focusing on characteristic one) are collaborations. Conversations and collaborations are not the same thing. Both are useful—but they are not the same thing.

by : Created on Nov. 15, 2004 : Last modified Feb. 8, 2005 : (permalink)


New Mic, New Song

A couple of days ago, Nelson received the Rode NT-1A microphone he had earlier ordered. Today was the first opportunity we had to record with it.

We spent most of the afternoon recording Noise which was the song we performed on television a few weeks ago. I got Nelson to record multiple takes with differing tonal qualities and when mixed together subtly, it proved to be very effective. We managed to get some great vocal harmonies tracked too and using a gentler, breathier tone in the harmonies worked very well against the main vocal line.

In the last hour of our session, we laid down the initial tracks of a song Star in Vegas we wrote many months ago (on opposite sides of the globe) but had never recorded. Besides using the new mic, it was the first time we'd recorded Nelson's electro-acoustic guitar. We did the entire song in two single-take passes. The first was me on keyboard bass and Nelson on electro-acoustic guitar (DIed straight into the Digi002). Second was me improvising a simple piano line while Nelson sang.

A few mistakes (especially in my piano improv) but the overall recording had a magical quality that I'm too scared to try to mess with. So I'm going to have to be very careful with re-recording the problem areas and keep corrections to a minimum. Nothing beats the magic you get in a first take.

All I did to the vocals was add a simple 'verb. I hardly think it needs anything else. I doubt I'll EQ it. The Rode is a beautiful sounding mic, perfect for Nelson's voice.

by : Created on Nov. 14, 2004 : Last modified Feb. 8, 2005 : (permalink)


MorphGNT v5.00 Available

At wildly varying intensities over the last ten years, I've worked on correcting the UPenn CCAT Morphological Parsed Greek New Testament as a side-effect of larger linguistic analyses I've undertaken. The last big burst of activity was in 2002 when I resumed work on my own morphological analysis (starting with the nouns).

The last couple of weekends, I've been working on preparing a new release of the corrected MorphGNT file, the first in probably seven or so years.

Prompted by a post to the b-greek mailing list, I've now made that release. MorphGNT v5.00 is now available at MorphGNT.

by : Created on Nov. 14, 2004 : Last modified Feb. 8, 2005 : (permalink)


Canon Multi-Unfunctional on OS X

I just bought a Canon MP390 multi-function printer/scanner/copier/fax and stupidly assumed (without checking) that it would work on OS X. Apparently none of Canon's multi-function units support OS X (although oddly Google reveals that Apple had a Hot Deal on them through B&H Photo at one point). All the other Canon products I've used do support OS X and it appears all their standalone printers and scanners do. I'm not sure what it is about their multi-function units.

Not sure yet whether to return it or to just use it on my Windows box hoping that Canon will soon release an OS X driver.

It's the first time since I bought my PowerBook four months ago that something I've wanted to use hasn't worked with OS X.

I'm seriously bummed.

by : Created on Nov. 12, 2004 : Last modified Feb. 8, 2005 : (permalink)


Dragon Optical Illusion

Doing the rounds in the blogosphere is a cool optical illusion based on the looks-convex-but-is-really-concave trick. I printed it out and made my own, as shown below:

Cheered me up after my Canon disappointment.

by : Created on Nov. 12, 2004 : Last modified Feb. 8, 2005 : (permalink)


Poincare Project: Topologies and Topological Spaces

We saw in Open Sets that open subsets of a set X always follow the rules:

If you pick a collection of subsets of X that follows the four rules above, that collection is said to be a topology on X. Furthermore, a set along with a choice of topology on that set is called a topological space.

The use of the word choice is an important one. A given set will (unless it is empty or a singleton) allow multiple valid topologies. It is the choice of topology that gives a topological space its characteristics rather than the set itself.

Consider a simple set {a, b}. The smallest possible topology would be:

{ {}, {a, b} }

In other words, the empty set and the set itself are the only two open sets. This meets the definition of a topology and, in fact, for any set will be the smallest possible topology.

Another valid topology on {a, b} would be:

{ {}, {a}, {b}, {a, b} }

In other words, all subsets are open. This also meets the definition of a topology. For any set the topology which defines all subsets to be open will be the largest possible topology.

There are two other possible topologies that can be defined on the set {a, b}

{ {}, {a}, {a, b} }

and

{ {}, {b}, {a, b} }

Step through the four rules to convince yourself that these are valid topologies for {a, b}.
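For a finite set, that step-through can be mechanized. Here is a minimal Python sketch, assuming the four rules are the usual ones (the empty set is open, X itself is open, any union of open sets is open, any finite intersection of open sets is open). The function name `is_topology` is my own invention.

```python
from itertools import combinations


def is_topology(x, collection):
    """Check the four rules for a candidate topology on a finite set x."""
    x = frozenset(x)
    opens = {frozenset(s) for s in collection}

    # Every candidate open set must actually be a subset of x.
    if not all(s <= x for s in opens):
        return False

    # The empty set and the whole set must be open.
    if frozenset() not in opens or x not in opens:
        return False

    # For a finite collection, closure under pairwise unions and
    # intersections implies closure under all unions and intersections.
    for a, b in combinations(opens, 2):
        if a | b not in opens or a & b not in opens:
            return False
    return True


candidates = [
    [set(), {"a", "b"}],
    [set(), {"a"}, {"b"}, {"a", "b"}],
    [set(), {"a"}, {"a", "b"}],
    [set(), {"b"}, {"a", "b"}],
]
results = [is_topology({"a", "b"}, c) for c in candidates]
print(results)  # [True, True, True, True]
```

The pairwise shortcut only works because the collection is finite. On an infinite set, the "any union" rule is genuinely stronger than the pairwise one.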

Note that, although this example has involved a small, finite set, everything here applies to infinite sets too. It is possible to define, for example, different topologies on the set of real numbers. One such topology is the one whose open sets are the unions of open intervals. This is by far the most intuitive topology on the reals but by no means the only one.

UPDATE: next post

by : Created on Nov. 11, 2004 : Last modified Aug. 10, 2007 : (permalink)


Delicious Library

Not quite so highly anticipated as Halo 2, but there's been a fair amount of hype around the release of the book/CD/DVD cataloging software for OS X, Delicious Library from Delicious Monster. Yesterday, I downloaded a copy.

It certainly looks cool, presenting your library on a graphic of shelves using cover photos downloaded from Amazon.com. Bar codes can be scanned using an iSight, but I already had a bar code scanner so can use that. I had a number of text files with an ISBN-per-line of all my existing books and I was able to import that file and Delicious Library went off and downloaded all the catalog information (including front cover photo) from Amazon. It even makes use of Amazon to list similar items when you select a book.

I had already put together a catalog of my own using Mark Pilgrim's PyAmazon library but Delicious Library just looks nicer than anything I could have built. The only issues I've found so far:

But all-in-all, it's worth checking out if you run OS X, even if just to see how cool it looks showing you the covers of all your books.

by : Created on Nov. 9, 2004 : Last modified Feb. 8, 2005 : (permalink)


The Key to Successful Technical Discussions

The key to successful technical discussions is precise, unambiguous terminology.

Last night I had a phone meeting with the mValent senior technical team in Boston discussing the design of one component of the next major release of our software. The meeting was focused and progressed us forward tremendously in a common understanding of how the component was going to work. The key, I believe, was a clear vocabulary of terms that I insisted everyone use.

Previously, I'd been talking with another colleague about a physical representation versus a logical representation in our system. The problem was, "logical" was terribly overloaded and sometimes was used to mean just part of what was being called the logical representation. I suggested we use a new term without the word "logical". Because it is the representation surfaced to the user, I proposed "surface representation". So then we had a physical representation and a surface representation. But when talking about the surface representation, we were getting tangled up because different object hierarchies within the surface representation had different characteristics. So I gave each hierarchy type a number. Then we could talk about H1, H2, H3, H4.

So last night, I started the meeting defining what the surface representation was and what H1 thru H4 meant. From then on, the discussion was crystal clear. At one point it looked like there were variations of H4. So we decided to refer to them as H4a and H4b. That way we could talk about the characteristics of H4 in general as well as drill down into the differences between H4a and H4b.

(And yes, jokes were made about the names sounding like US visas).

So often I've found that technical meetings become burdensome when people are arguing about what they think is the same thing but are really two different things. Or are talking about two different things that are really the same. Having unambiguous terms (even if they are silly things like H4b) is a tremendous help in discussions.

Most people at mValent think I'm precise about my terms because of my linguistics background. I'm not sure it's just that. I think my interest in linguistics is correlative rather than causal. Linguistics, like many sciences, is about categorizing phenomena. I think at the core, it's the categorizing that I love. That's clear in my previous post on thinking like a pure mathematician. I've always been fascinated by taxonomies.

I've also had ten years involvement in the standards-writing world including a significant amount of implementation of standards. That alone gives one an appreciation for precise, unambiguous terminology.

Next time you're arguing in a technical meeting and the other person just doesn't seem to "get it", take a step back and both agree on a set of terms to use. It really does work wonders.

by : Created on Nov. 5, 2004 : Last modified Feb. 8, 2005 : (permalink)


The Art of the Dust Jacket

I recently bought a copy of Guy Kawasaki's The Art of the Start. Great book so far, but one of the first things I noticed was the comment on the back inside flap:

The front jacket was created by Adam Tucker, winner of a design contest sponsored by Guy Kawasaki. Please take off the book jacket to see some of the other entries from his fans on the reverse side.

That's right. The inside of the dust jacket features 70-odd submissions for cover designs. Each of them is completely different. Makes you realise not only how much variation there can be in a book cover but also just how different one's perception of a book can be depending on the cover.

They say you shouldn't judge a book by its cover. But have 70 alternative covers suggested to you and you pretty soon decide which make you want to buy the book and which don't.

For what it's worth, I think Guy picked the right cover in the end.

by : Created on Nov. 5, 2004 : Last modified Feb. 8, 2005 : (permalink)


Six Snapshots of a Simple Eclipse GEF Application

Back in March, I talked a little about my initial attempts writing an Eclipse Graphical Editor Framework (GEF) application. I wanted, then, to write a tutorial that essentially walked the reader through the various stages of the development of my first application. I even suggested some kind of versioned literate programming approach to writing the tutorial and the code at the same time.

I haven't had time since then to make any progress, but I did get the GEF application to the stage where I had put together a snapshot at each of six milestones. A few people have written to me over the last six months asking the status of my tutorial and I've sent them my six snapshots as a starting point.

It makes sense for me to just offer them here.

You can download a ZIP file with the six snapshots at http://jtauber.com/2004/gef/gef.zip.

Hopefully they are still useful, even without a surrounding tutorial.

by : Created on Nov. 2, 2004 : Last modified Feb. 8, 2005 : (permalink)


Poincare Project: Open Sets

In Open Balls and Continuity, I said:

Imagine that you don't know the distance function of either the domain or co-domain of the function but someone who does has precalculated all the open balls for you. Of course, for most metric spaces, there would be an infinite number of these, but the key point here is that you only need to know what the open balls are to test continuity. You don't need to know the distance function.

So imagine that a friend has given you a set along with all of the subsets that are open balls. From this alone, you can establish whether a function on the set is continuous.

We can simplify the definition of continuity (and other concepts) by introducing a more general subset than the open ball called an open set. Even though we will initially define open sets in terms of open balls, we can simply provide the open sets without reference to open balls, much like your friend provided the open balls without reference to the metric.

A subset of a set X is called an open set if it is the union of open balls of X.

Now continuity can be defined in terms of open sets (and this definition can be proven to be the same as that using open balls). A function from X to Y is continuous if and only if, for each open set in Y, the inverse image in X is also an open set.

So, instead of giving you the open balls, your friend could give the open sets. And from that, you'd be able to establish whether a function was continuous.
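As a sketch of that test for finite sets, here is some Python that checks continuity purely from the open sets. The helper names `preimage` and `is_continuous`, and the example spaces, are my own and are only for illustration.

```python
def preimage(f, domain, subset):
    """All points of the domain that f maps into subset."""
    return frozenset(p for p in domain if f(p) in subset)


def is_continuous(f, x, opens_x, opens_y):
    """f: X -> Y is continuous iff the preimage of every open set of Y is open in X."""
    opens_x = {frozenset(s) for s in opens_x}
    return all(preimage(f, x, v) in opens_x for v in opens_y)


X = {"a", "b"}
# Open sets handed over by the "friend"; no distance function in sight.
coarse = [set(), {"a", "b"}]               # only the empty set and X are open
fine = [set(), {"a"}, {"b"}, {"a", "b"}]   # every subset is open
opens_Y = [set(), {1}, {1, 2}]             # open sets of Y = {1, 2}

f = {"a": 1, "b": 2}.get

print(is_continuous(f, X, fine, opens_Y))    # True
print(is_continuous(f, X, coarse, opens_Y))  # False
```

The same function f is continuous or not depending only on which subsets are declared open, which is the point: the open sets carry all the information the test needs.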

Now you might be wondering: in the absence of the original metric, how would we know whether the collection of open sets was really a collection of open sets and not just some random selection of subsets? (Perhaps you don't trust your friend.) Well, it turns out open sets have some interesting properties:

What is so significant about this is that a collection of subsets of X is a collection of open subsets of X if (and only if) it has the four properties above. It doesn't matter if it came from a distance function or open balls or some random selection of subsets. As long as the four properties above hold, the subsets are open subsets and can be used to demonstrate continuity (along with many other things).

NOTE: It is important that the union can be of any collection whereas the intersection can only be of a finite collection.

For a while, I thought the second two rules might be redundant and derivable from the first but a number of people (including Michael Walter and Richard Plagge) have clarified it for me. Michael points out the case where X = {1, 2} and the candidate collection of sets is {{1}}. This meets the first two rules but not the second two. Richard gives the example X = {a, b, c, d} with the candidate collection {{a}, {a, b}, {a, b, c}}. Again, the first two rules are met but the second two clearly do not follow.

UPDATE: next post

by : Created on Nov. 1, 2004 : Last modified Aug. 10, 2007 : (permalink)


Too Taxing for the Servers

Online submissions of Australian federal income tax returns were due today. (Normally they are due 31st October, but given it was a Sunday, they made it Monday 1st November).

I completed my return last night and went to lodge it and it came back with an Error 1200 and told me to call the e-Tax technical hotline. By this stage it was about 2am so I decided to wait until it was working hours on the east coast (give them a chance to come in to work and see that something was broken).

So early this morning, I tried lodging again. Error 1200. So I called the hotline number. Engaged. Lodge again. Error 1200. Hotline still giving busy signal.

After a few hours, I stopped getting a busy signal and got put on hold (listening to piano works on what sounds like a radio slightly tuned off the station). While on hold, I tried lodging a few more times. Still getting an Error 1200.

Finally got through to someone. I decided I wouldn't be mad at the guy—he's probably received hundreds of "Error 1200" calls already today. I didn't even need to tell him the issue - he just asked right away if I was getting an Error 1200.

Turned out some servers had been down since 5pm Sunday. He told me they should be back up by this evening but that the Australian Tax Office had extended lodgement until Wednesday for e-Tax filers.

So I just tried lodging my tax at 11.55pm this evening. No Error 1200. But it told me I'd already lodged!

by : Created on Nov. 1, 2004 : Last modified Feb. 8, 2005 : (permalink)


ReadySET JotSpot

A few months ago, I came across ReadySET, a collection of XHTML document templates for software engineering processes. ReadySET has templates for things like project plans, use cases, QA test plans, design documents, status reports and release notes. The collection is structured such that you basically end up with a template for an entire engineering intranet.

When I first came across ReadySET, I immediately thought of four technology improvements that would make it even more useful:

My thinking at the time was some kind of custom wiki-like application that served up the ReadySET documents wiki-style with the additional functionality outlined above.

Now I'm thinking the idea might be perfect for JotSpot. I still haven't found the time to dig deep into the beta but it might be a great project to try.

by : Created on Oct. 29, 2004 : Last modified Feb. 8, 2005 : (permalink)


Poincare Project: Open Balls and Continuity

Previously, we introduced the notion of a metric space.

Once you have a metric space (that is, a set of points and a function that specifies how distant you consider any two points) then you can start to develop notions of continuity and limits that form the basis for analysis.

For a metric space (X, d), let us call all the points less than r away from a point a the open ball of radius r at point a. In other words, B(X, d, a, r) = {x in X | d(a,x) < r}.
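For a finite set of points, that definition translates almost directly into Python. This is only a sketch: the function name `open_ball` and the example metric (absolute difference on a handful of integers) are my own choices.

```python
def open_ball(points, d, a, r):
    """B(X, d, a, r): every x in X with d(a, x) < r."""
    return {x for x in points if d(a, x) < r}


X = range(-5, 6)  # the integers -5..5 as a small example set


def d(p, q):
    """The usual distance on the number line."""
    return abs(p - q)


print(sorted(open_ball(X, d, 0, 3)))  # [-2, -1, 0, 1, 2]
```

Note the strict inequality: points exactly r away from a are not in the ball.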

We can then define continuity by saying that a function f between the metric spaces (X1, d1) and (X2, d2) is continuous at point a in X1 iff, given a positive r2, there is a positive r1 such that f(B(X1, d1, a, r1)) is a subset of B(X2, d2, f(a), r2). In other words, f is continuous at point a if you can always provide an open ball at a that maps to points within an arbitrarily small radius open ball at f(a).

Once you think of continuity in terms of open balls, you're able to do something interesting. Imagine that you don't know the distance function of either the domain or co-domain of the function but someone who does has precalculated all the open balls for you. Of course, for most metric spaces, there would be an infinite number of these, but the key point here is that you only need to know what the open balls are to test continuity. You don't need to know the distance function.

Let's call a set of points with the open balls precalculated an open ball space. Clearly it is easy to turn any metric space into an open ball space. You can't go the other way, however, as we've thrown out what the actual radius of each open ball is.

But we're now able to talk about continuity with a more general set structure than a metric space. There are many other notions that can be introduced on an open ball space, some of which we'll get to on our journey through the poincare conjecture.

In the next poincare project entry, however, we will take one more step of abstraction and get to the very core concept of topology itself.

UPDATE: next post

by : Created on Oct. 28, 2004 : Last modified Aug. 10, 2007 : (permalink)


Permission to Linkblog

Jason Clarke points out that Scoble's linkblog is a great idea but it's broken in that the full content of the entries he references is not included in the linkblog.

I remember when Scoble started his linkblog, he did include full content but I recall he got into some trouble with some people that didn't like their content being reproduced in full.

The problem is that, like Scoble, I have a strong preference for blogs with full content feeds. The fact I have a titles-only feed for my own blog is a historical artifact of a time when the aggregator I was reading downloaded the link if the feed entry had no content.

Jason subsequently talks about giving Scoble permission to reproduce content. Well, Scoble, feel free to include the full content of my entries when you linkblog them.

As Jason points out, it would be nice if, rather than Scoble having to track the permissions, each entry included them in a machine readable form (i.e. a CC URI). "Blog this" features in aggregators could easily make use of this to default whether to include the content or not in linkblogs.

UPDATE (2004-11-26): Scoble has announced he's going back to full text link blogging. I, for one, am glad!

by : Created on Oct. 28, 2004 : Last modified Feb. 8, 2005 : (permalink)


Perth Blognite

Congratulations to Bret, Richard, Graeme and the other organizers for a very successful Perth Blognite. Great diversity of speakers and the talks were just the right length with, in most cases, time for a good Q&A after each.

Highlights for me were Veronica Bowden's humorous and insightful look at personal blogging, Bret Treasure's inspiring Why Perth Should Blog, Richard Giles's informative guide to corporate blogging (which, as was mentioned a number of times, is just as relevant to organizations of other types) and Robert Corr's analysis of the good and bad in political blogging and the application of the Herman-Chomsky propaganda model to the blogosphere.

Interestingly, while Robert maintained a remarkable bi-partisan inclusiveness when discussing the positive effects of political blogging, his selective examples of the negatives betrayed his left-wing bias, which, to his credit, he fully disclosed from the outset.

All in all, a fantastic night.

by : Created on Oct. 27, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Project Update: Score Done

I've finished scoring Alibi Phone Network and I'm extremely pleased with it. Hopefully the director, Tom Bennett, will be too (although, as producer, I have final cut anyway :-)

It's all solo piano. The main title sequence is a rearrangement of the first 8 bars of a piece for violin and piano I started working on a few years ago. The piece was going around in my head during most of the editing and the fit seemed just right.

The opening music is then broken up into different fragments representing each of the three main characters with variations depicting various moods. There's also a motif representing the alibi phone network itself.

The end title music is based on a couple of motives from the melody of the main title music but now interpreted much more sadly (this isn't a happy ending).

I also did the remaining sound effects, and added the voice of Tom Bennett in an audio cameo as the Maitre D' at Chez Quis.

Rendering now, ready to burn to DVD and send it express courier to Tom and my co-producer, James Marcus.

by : Created on Oct. 24, 2004 : Last modified Feb. 8, 2005 : (permalink)


Amazon Recommendations and Self-Hosting

While reading blogs this afternoon, I found out about Guy Kawasaki's new book The Art of the Start. So I went to Amazon.com to order it and, lo and behold, Amazon was already recommending it to me on the home page as a new release I might like.

I confess to being somewhat of an Amazon recommendation addict. I'm highly motivated to inform Amazon of all my book purchases (the 30% I don't buy from Amazon but instead buy from the Dymocks at the mall near my house in Perth, the Barnes and Noble near the office in Boston or at various airport bookstores I hang out at frequently).

But as you know, I'm also interested in hosting my own data (see aggregation versus hosting). I'm always on the lookout for ways I can take back my data, host it on jtauber.com and provide it to aggregators rather than have to host it with them.

So that's got me wondering about the books-I-own being stored at Amazon. It's somewhat of a duplication, because I have a barcode scanner and maintain my own book catalog (with data, incidentally, retrieved from Amazon web services). I've not checked yet, but I wonder if Amazon will let me import the books I own so I can maintain the authoritative list and, in the words of datalibre, "Write Once, Read Everywhere."

Of course, that then leads me to Amazon wishlists. Could I self-host my wishlist without losing the huge value-add of Amazon keeping the wishlist updated based on what others have ordered for me?

by : Created on Oct. 23, 2004 : Last modified Feb. 8, 2005 : (permalink)


Blogging and the Personal Brand

Recently, Doc Searls made the observation that the companies known for their brand don't have nearly as many bloggers.

I was all ready to embrace this meme that blogging and branding were opposing when I stopped and thought—hang on, Tom Peters blogs. Tom, more than any other person taught me the power of the personal brand.

Then it dawned on me. Blogging builds your personal brand. Perhaps people that (are good at or want to) build their personal brand don't sit well in companies that have a strong corporate brand.

UPDATE (2004-10-26): Well it looks like Todd Sattersten agrees with me but Doc Searls doesn't. Doc dislikes the notion of a "personal brand" but I agree with everything else he says in his comment so maybe he doesn't mean the same thing that Tom Peters does.

by : Created on Oct. 23, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Project Update: Second Rough Cut Done

I've completed a second rough cut of Alibi Phone Network. I found that doing a left-right flip of shots removed the crossing-the-line issues. It introduces some continuity problems but they are far more subtle than the effect of crossing-the-line.

After completing the rough cut, I showed it to my family. I'm very proud of the film - from the script to the performances to the look to the pacing.

Remaining tasks are:

After that we'll start submitting to festivals.

by : Created on Oct. 23, 2004 : Last modified Feb. 8, 2005 : (permalink)


Apple Build Numbers

Can anyone explain to me the build numbering system Apple uses for OS X?

I'm currently running 7M34 (more commonly known as 10.3.5)

There is a list of build numbers here that confirms that the first number is the major release:

But when does the following letter get increased? Even the final number doesn't seem likely to go up after every build judging by how close some of the numbers are between successive minor releases.

Any readers know, or at least care to speculate?

UPDATE (2004-11-02): The Tiger developer preview that was just announced is 8A294. So they are still on the 'A' series of 10.4. I wonder when they start 'B'.
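Whatever the rules for incrementing turn out to be, the format itself is easy to pull apart. Here is a small Python sketch, assuming every build string is digits, then one capital letter, then digits (which is all the examples here show; real builds may not always fit). `parse_build` is a made-up name.

```python
import re

# Assumed shape: major number, series letter, build count (e.g. 7M34).
BUILD = re.compile(r"^(\d+)([A-Z])(\d+)$")


def parse_build(build):
    """Split a build string like '7M34' into (major, letter, number)."""
    m = BUILD.match(build)
    if m is None:
        raise ValueError(f"unrecognised build string: {build!r}")
    major, letter, number = m.groups()
    return int(major), letter, int(number)


print(parse_build("7M34"))   # (7, 'M', 34)
print(parse_build("8A294"))  # (8, 'A', 294)
```

Sorting the parsed tuples would at least put builds in a plausible order, even without knowing what triggers a new letter.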

by : Created on Oct. 23, 2004 : Last modified Feb. 8, 2005 : (permalink)


Token Absentee

Technorati picked up this compliment from a local Perth blogger regarding Wednesday's Perth Blogger Meetup:

"Our token newbie was James who seemed like a more then worthy addition to the Meetup." — (source)

Only thing is: I wasn't there! Although I'd RSVPed, I found out at the last minute that a friend was performing in the King and I that night so I went to that instead.

Was someone pretending to be me?

UPDATE: David's response: "Whoops! Sorry dude. That would explain why they were calling him Mark."

by : Created on Oct. 21, 2004 : Last modified Feb. 8, 2005 : (permalink)


Nelson James TV appearance

The pop duo I produce and form one half of, Nelson James, is performing on the local community television station, Access 31, today as the musical act on the chat show, The Couch. If you're in Perth, check us out at 5.30pm.

It will be, not only our first television performance, but our first public performance together.

by : Created on Oct. 16, 2004 : Last modified Feb. 8, 2005 : (permalink)


JotSpot

Most of the bloggers I read who were at WEB 2.0 have already blogged about JotSpot. I wasn't at WEB 2.0 but now it's my turn.

In short, I'm very excited about JotSpot which is a web development platform built on top of a wiki engine. As you may have guessed already, I'm a big fan of wikis. I've also long been interested in placing real-time structured data in documents (it's one of the things that excited me about SGML in the first place).

I almost think of JotSpot as a team version of where I want to go with Leonardo. And whereas I'm focused largely on personal info hosting/publishing in Leonardo, JotSpot looks like it could be a great platform for building aggregators.

Beyond the information that's available at JotSpot you might want to check out Jon Udell's flash demo of JotSpot which made me feel like I was getting my own personal demo from JotSpot's co-founders.

I'd already been following Joe Kraus's blog, Bnoopy before JotSpot was announced. I hope he can find the time to get back to blogging more soon.

I just got in the beta program so I'll report more on JotSpot in the future.

by : Created on Oct. 14, 2004 : Last modified Feb. 8, 2005 : (permalink)


Poincare Project: Metric Spaces

A surface is more than just a set of points. Points on a surface have a notion of closeness that doesn't exist with a set unless we add some structure.

One way we can introduce the idea of closeness is to introduce the idea of the distance between points. That is, a function d that gives us a number for any pair of points.

To be a distance function, our function must meet some additional requirements:

A distance function is often called a metric. A set of points with a distance function is called a metric space.

A metric space clearly has a notion of closeness. A point y is closer to x than z is if d(y,x)<d(z,x).
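On a finite set of points you can check a candidate distance function by brute force. The Python below is a sketch that assumes the requirements are the standard metric axioms: distances are never negative, d(x,y) is zero exactly when x and y are the same point, d(x,y) equals d(y,x), and the triangle inequality d(x,z) <= d(x,y) + d(y,z) holds. `is_metric` is a name I made up.

```python
from itertools import product


def is_metric(points, d):
    """Brute-force check of the standard metric axioms on a finite set."""
    points = list(points)
    for x, y in product(points, repeat=2):
        if d(x, y) < 0:
            return False  # distances are never negative
        if (d(x, y) == 0) != (x == y):
            return False  # zero distance exactly when x == y
        if d(x, y) != d(y, x):
            return False  # symmetry
    for x, y, z in product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z):
            return False  # triangle inequality
    return True


pts = [0, 1, 3, 7]
print(is_metric(pts, lambda p, q: abs(p - q)))    # True
print(is_metric(pts, lambda p, q: (p - q) ** 2))  # False
```

The squared difference fails on the triangle inequality: going from 0 to 3 directly "costs" 9, but going via 1 costs only 1 + 4.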

UPDATE: next post

by : Created on Oct. 14, 2004 : Last modified Aug. 10, 2007 : (permalink)


Aliases and Symlinks

Things are going well so far using a filesystem as a PIM. One odd observation, though, about OS X.

If I create a symlink from a shell, it comes up as an alias in Finder. But if I create an alias in Finder, it doesn't look like a symlink from the shell (it's an empty file but there's no display of the target).

by : Created on Oct. 12, 2004 : Last modified Feb. 8, 2005 : (permalink)


Juggling

A friend of my sister was doing a juggling demonstration for me on Friday night. The linguist and mathematician in me immediately asked if there was a notation for juggling. The coder in me then asked if there was software that took that notation and generated an animation. The answer in both cases was in the affirmative.

A quick Google search today revealed Juggling Lab at SourceForge. Very cool.

by : Created on Oct. 10, 2004 : Last modified Feb. 8, 2005 : (permalink)


Using a Filesystem as a PIM

I've talked before about using an outliner as a PIM. Since getting my PowerBook I've tried out a number of OS X outliners with a fair degree of success.

Lately, however, I've wondered if I could pretty much achieve what I want just using the filesystem. I can set up a directory structure that mirrors the outline. I can use symlinks for items that belong in multiple categories. It's dead easy to put notes, bookmarks and arbitrary files under any item. Sharing of parts of my PIM info just becomes a file serving issue and I can version with a revision control system.

It should be relatively easy to integrate email as long as I can treat individual mails as files I can link to.

I can even set up a 43-folder tickler system and have a cron job that moves (or symlinks) files daily and monthly into an INBOX directory.
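A rough sketch of what that cron job might run. The directory layout (days/01 to days/31, months/01 to months/12, and INBOX under one root) and the function name are assumptions made for the example, not something I've settled on.

```python
import datetime
import shutil
from pathlib import Path

def run_tickler(root, today):
    """Move today's tickler items into INBOX and return the names moved.

    Assumed layout (hypothetical): root/days/01..31, root/months/01..12, root/INBOX.
    The day folder is emptied every day; the month folder only on the 1st.
    """
    root = Path(root)
    inbox = root / "INBOX"
    inbox.mkdir(parents=True, exist_ok=True)

    folders = [root / "days" / f"{today.day:02d}"]
    if today.day == 1:
        folders.append(root / "months" / f"{today.month:02d}")

    moved = []
    for folder in folders:
        if not folder.is_dir():
            continue
        for item in sorted(folder.iterdir()):
            # Note: an item with the same name already in INBOX may be overwritten.
            shutil.move(str(item), str(inbox / item.name))
            moved.append(item.name)
    return moved
```

A daily cron entry would then call something like run_tickler(root, datetime.date.today()). Symlinking instead of moving would just mean swapping shutil.move for os.symlink.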

by : Created on Oct. 10, 2004 : Last modified Feb. 8, 2005 : (permalink)


Back Home

I'm back home in Perth. Feels like I never left. Things to do today (in no particular order):

by : Created on Oct. 8, 2004 : Last modified Feb. 8, 2005 : (permalink)


MarsEdit and Externally Editing Leonardo

One of the things that will be made considerably easier after the Leonardo rearchitecture is pluggable support for alternative approaches to editing. In a way, the RESTful approach I've taken has always made the use of external blog editors possible - I've just never tried it. Of course, I'm not sure the blog editors out there are that RESTful. They seem to largely use XML-RPC rather than just PUTing and/or POSTing.
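To make the contrast concrete, here is a small sketch that builds (but does not send) the two kinds of request. The URIs and the XML-RPC method name are hypothetical, chosen only for the example; they are not Leonardo's or any blog editor's actual interface.

```python
import urllib.request
import xmlrpc.client

ENTRY_URI = "http://example.com/blog/2004/10/05/marsedit"  # hypothetical entry URI
ENDPOINT = "http://example.com/xmlrpc"                      # hypothetical RPC endpoint
body = "Draft text of the entry.".encode("utf-8")

# RESTful style: the entry's own URI is the target and the HTTP verb says what to do.
put_request = urllib.request.Request(
    ENTRY_URI,
    data=body,
    method="PUT",
    headers={"Content-Type": "text/plain; charset=utf-8"},
)

# XML-RPC style: every call is a POST to one endpoint; the operation and the
# entry being edited are named inside the XML payload instead.
payload = xmlrpc.client.dumps(
    ("2004/10/05/marsedit", body.decode("utf-8")),
    methodname="example.saveEntry",  # hypothetical method name
)
rpc_request = urllib.request.Request(
    ENDPOINT,
    data=payload.encode("utf-8"),
    method="POST",
    headers={"Content-Type": "text/xml"},
)

print(put_request.get_method(), put_request.full_url)
print(rpc_request.get_method(), rpc_request.full_url)
```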

I'm keen to try out MarsEdit, although I haven't investigated just how RESTful it can get. Part of the appeal of MarsEdit is posts like this one from Brent. I'd love it if more commercial software developers posted this kind of thing on their blogs.

by : Created on Oct. 5, 2004 : Last modified Feb. 8, 2005 : (permalink)


Return Home Delayed

Due to a SNAFU in my re-ticketing, I wasn't able to leave Boston yesterday despite having a confirmed reservation. I'm scheduled to now leave on Monday and Qantas this morning confirmed that I now have a valid e-ticket to do so.

After being here three months, what's another three days?

by : Created on Oct. 2, 2004 : Last modified Feb. 8, 2005 : (permalink)


Wikipedia as a URI Lookup Service

Often when working with RDF or trying to be generally Web-like, one needs a URI to identify something. It's easy enough to come up with your own URI but it would be incredibly useful to be able to see what URIs others have used for the same concept or entity.

Say, for example, that I wanted to express an interest in Linguistics or that I subscribe to American Cinematographer. What URIs would I use?

At first I started pondering a service that would allow you to search for URIs by a human-readable description of the concept/entity. URIs could be submitted to such a service and where duplicate URIs existed, users could assert a "same-as" relationship between them. It could also be possible for preferred URIs to be voted on or, where appropriate, an authoritative URI claimed so people could normalize their references.

But thinking about it more, I wonder if Wikipedia might play a role here. A large proportion of the concepts and entities that a disparate group of people might want to refer to are probably described in Wikipedia (or it's easy to make them so). So an obvious URI to start with would be the link to Wikipedia.

In the cases where a better (perhaps more official) URI exists for a particular concept or entity, then Wikipedia itself could specify the authoritative URI.
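As a sketch of how simple the lookup could be: turn a human-readable title into a Wikipedia link, then prefer an authoritative URI if one has been recorded. The function names and the AUTHORITATIVE table are hypothetical, and the example.org entry in it is invented purely to show the override.

```python
from urllib.parse import quote

WIKIPEDIA_BASE = "http://en.wikipedia.org/wiki/"

# Hypothetical table of "better" URIs, keyed by Wikipedia URI.
AUTHORITATIVE = {
    WIKIPEDIA_BASE + "American_Cinematographer": "http://example.org/american-cinematographer",
}

def wikipedia_uri(title):
    """Build a Wikipedia article URI from a title (spaces become underscores)."""
    return WIKIPEDIA_BASE + quote(title.strip().replace(" ", "_"))

def preferred_uri(title, authoritative=AUTHORITATIVE):
    """Return the authoritative URI if one is recorded, else the Wikipedia URI."""
    uri = wikipedia_uri(title)
    return authoritative.get(uri, uri)

print(preferred_uri("Linguistics"))
print(preferred_uri("American Cinematographer"))
```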

What do people think?

by : Created on Oct. 2, 2004 : Last modified Feb. 8, 2005 : (permalink)


Bloglines RESTful Web Services

Bloglines, the website via which I do my blog reading, has recently introduced some web services interfaces.

Nice, simple, RESTful interfaces. Well done Bloglines!

I have one enhancement request so far: let me PUT my blogroll OPML as well as GET it. Or (even better) let me give you the URI of my blogroll OPML and you can poll it (regularly and/or on demand).

by : Created on Sept. 29, 2004 : Last modified Feb. 8, 2005 : (permalink)


Local Blogs and Bloggers

Most of the blogs I read and the bloggers I know are based in the US so it's refreshing to discover blogs and blog-related events based a little closer to home.

I recently found out about the Australian Blogging Conference that SplaTT is trying to organize. Count me in!

I also found out about blognite in my home town of Perth as well as the Perth Blogs Wiki.

by : Created on Sept. 28, 2004 : Last modified Feb. 8, 2005 : (permalink)


Registered for SxSW 2005

I've just registered for SxSW 2005. Any readers of this blog who are planning on attending, let me know.

by : Created on Sept. 27, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Project Update: A Little More Editing

This evening I finally had a spare moment to do some more editing. Before I knew it, five hours had passed. So I'm now half-way through the second edit of the film. I tightened things up a bit; got rid of a couple of shots and shortened some others that I felt dragged the pace. I also fleshed out a couple of scenes that previously only used the master shot; put in some opening titles just to test out the feel and did some minor colour correction between some adjacent shots that had a jarring difference in the blue cast.

I head back home to Australia on Friday—my goal is to get the second edit done before I go so I can show Tom and James. It's going to be tight, especially with everything else I need to get done before I go.

by : Created on Sept. 27, 2004 : Last modified Feb. 8, 2005 : (permalink)


PIMs and DLAs

I'm not sure that I've articulated here before the strong relationship that I believe exists between personal information managers (PIMs) and digital lifestyle aggregators (DLAs). To some extent, it's pretty obvious but I think some interesting things emerge from thinking about it.

Here are some random thoughts:

Numerous object types that were once the domain of PIMs are now being shared and aggregated: address books, calendars, etc. Things like music and photos should be included there too—although we don't tend to call the tools for managing them PIMs, they are.

The address book in my PIM should be seamlessly integrated with my published FOAF which should be seamlessly integrated with sites like LinkedIn. The address book should actually be the hub of a lot of stuff. Why not manage who can see my Flickr photos via my address book?

The merging of PIMs and DLAs means that we are increasingly creating our own personal intranets and extranets.

Communication is moving outside of email—discussions are taking place via blogs, photos are being exchanged via Flickr rather than email attachments, documents are being collaborated on via wikis. Atom/RSS feeds already change what I need email for—that should be taken to a whole other level.

One component of PIMs I'd like to integrate more with my website as well as aggregator sites I use is the whole project/task/todo aspect. I should be able to tie email, blog entries, photos, documents together into projects. "Tags" should span sites. Topics should have URIs.

I should be able to expose the status of certain projects/tasks to interested parties via a website (with a feed of course). Certain people (my boss, the team lead of an open source project I'm working on, a user of software I've written) should be able to submit requests via my website and have them integrated with my PIM.

UPDATE (2004-09-25): On that last point: they should also be able to just blog their request and ping me via a trackback. No reason why I have to own their request.

by : Created on Sept. 25, 2004 : Last modified Feb. 8, 2005 : (permalink)


More on Aggregation Versus Hosting

Previously on this blog, I've called for a separation of hosting from aggregation. I want to be able to maintain authoritative data on one site and have other sites use it for their aggregation.

When I read Ted Leung's entry Microcontent personality disorder and Steve Mallett's comments on it, my immediate thought was that they could both have what they want if we could separate where we host our data from where it is aggregated and made "social".

Marc Canter (whose work around Digital Lifestyle Aggregators is definitely worth following) responds to Steve Mallett. Marc is spot on that people have their information all over the place. But I still believe that if systems are built to support a separation between hosting and aggregation, they'll support both the distribution of primary data and the kind of "self-hosting" that a certain segment like Steve and myself want.

Bottom line is all combinations of centralized/decentralized hosting/aggregation should be possible.

It's not that hard to do. Sites that aggregate just need to provide a mechanism where users can point to their data hosted somewhere else rather than have to re-enter their data in multiple aggregators. Aggregators then keep customers based on the value of their aggregation, not the lock-in of being the hosts of people's valuable data. People who want hosting for their pictures, blogs, etc can use hosting services to do it. But their choice of hosting service should not impact their participating in aggregation and the social aspects of micro-content that follow.

UPDATE (2004-09-27): see also Jon Udell's post Next-generation infoware

by : Created on Sept. 25, 2004 : Last modified Feb. 8, 2005 : (permalink)


Poincare Project: Adding Structure to Sets

Most pure mathematics takes as a starting point a set of objects. If things stopped there, we would be dealing just with set theory; but we branch into other areas of pure mathematics by adding structure to the set.

Defining such a structure may involve calling out particular elements of the set or particular subsets whose elements have some particular relationship to one another; or it may involve some mapping from one element in the set to another or an operation that takes two elements of the set and produces a third.

Calling out particular subsets is the basis for the branch known as topology. If the choice of subsets meets certain criteria (which we'll get to shortly), the set (along with the called-out subsets) is called a topological space.

Defining an operation that takes two elements of the set and produces a third that is also in the set (think of adding two numbers or concatenating two strings) is the basis for the branch known as group theory. If the operation meets certain criteria (which we'll also get to shortly), the set together with the operation is called a group.
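As a preview, here is a brute-force sketch for small finite sets, assuming the usual textbook criteria: the result of the operation stays in the set, the operation is associative, there is an identity element, and every element has an inverse. The function name and the examples are just for illustration.

```python
from itertools import product

def is_group(elements, op):
    """Brute-force check of the standard group axioms on a finite set."""
    elements = list(elements)

    # Closure: combining any two elements gives an element of the set.
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False

    # Associativity: (a.b).c == a.(b.c) for every triple.
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elements, repeat=3)):
        return False

    # Identity: some e with e.a == a.e == a for every a.
    identities = [e for e in elements
                  if all(op(e, a) == a and op(a, e) == a for a in elements)]
    if not identities:
        return False
    e = identities[0]

    # Inverses: every a has some b with a.b == b.a == e.
    return all(any(op(a, b) == e and op(b, a) == e for b in elements)
               for a in elements)

print(is_group(range(5), lambda a, b: (a + b) % 5))     # addition mod 5
print(is_group(range(5), lambda a, b: (a * b) % 5))     # multiplication mod 5, with 0
print(is_group(range(1, 5), lambda a, b: (a * b) % 5))  # multiplication mod 5, without 0
```

Multiplication mod 5 on {0,...,4} fails only because 0 has no inverse; drop 0 and the remaining four elements do form a group.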

Some structures involve reference to one or more additional sets (such as the set of real numbers). For example, one might define an operation that takes two elements of a set and gives a number that can be thought of as the "distance" between the two elements. As long as that operation follows certain rules (such as the distance between two distinct elements always being positive and the distance between an element and itself always being 0) then the operation is called a metric and the set together with the operation is called a metric space.

In the next few entries in this project, we'll take a look at the criteria necessary for a set + operation to be a group and for a set + collection of subsets to be a topological space.

UPDATE: next post

by : Created on Sept. 21, 2004 : Last modified Aug. 10, 2007 : (permalink)


Poincare Project: Thinking Like a Pure Mathematician

Before we are at a point where we can discuss the Poincare Conjecture itself, we need to learn some general topology and group theory. But before we lay that foundation, I think it is worth taking a moment to establish the mode of thinking we must enter.

Marcus Aurelius exhorts us to ask "what is the nature of the whole, and what is my nature, and how this is related to that, and what kind of a part it is of what kind of a whole?" Now Aurelius is talking about human nature (and see Hannibal Lecter's use of the quotation in The Silence of the Lambs) but the quotation encapsulates the fundamental questions asked by pure mathematicians, not of humans, but of abstract objects such as numbers and shapes.

Imagine you're looking at an apple and you notice certain characteristics it possesses. Which of those characteristics are specific to that particular apple? Which are specific to all apples of that particular variety? Of apples in general? Or of all fruit? Of food? Organic objects? Physical objects?

In mathematics in general, and in the early days of this Poincare Project in particular, we will often be asking questions like: what is the most general object that exhibits this characteristic? What is the distinguishing characteristic of this object compared with others we're dealing with?

Get your mind in a mode to ask those kinds of questions and we'll be ready to introduce topology.

UPDATE: next post

by : Created on Sept. 19, 2004 : Last modified Aug. 10, 2007 : (permalink)


Categories Coming Soon

I'm well aware that readers of this blog can't all be interested in all of filmmaking, blogging technology, XML, pure mathematics, Eclipse, music, productivity software, RDF, Python and software development.

I've been planning on categories in Leonardo for a while and I'm pleased to say they'll come shortly after the current Leonardo rewrite.

I'm still thinking about the best way to approach them. My current thoughts are to merge the notion of a category being a feed that you post to with the notion of a category being a resource that you annotate. In other words, categorization equals feed-posting equals trackback.

More on this soon.

by : Created on Sept. 19, 2004 : Last modified Feb. 8, 2005 : (permalink)


43 Folders: The Latest Addition to My Blogroll

I recently added 43 Folders to my blogroll and it's rapidly becoming one of my favourites.

It's a blog about "life hacks" from an OS-X-using fan of David Allen's Getting Things Done.

As longer-time readers of this blog know, I'm a big fan of the GTD approach to personal productivity too. I also recently bought a PowerBook (although I haven't ruled out getting a Tablet PC — take note Scoble!)

by : Created on Sept. 19, 2004 : Last modified Feb. 8, 2005 : (permalink)


Leonardo Rewrite

I'm about halfway through a complete rewrite of Leonardo (which I'll soon release as 0.4.0). I mostly wanted to rework the backend to support annotations to entries such as comments and trackbacks (see Blogs, Annotations, Comments and Trackbacks for why I think they are the same thing). I also wanted to allow for alternative backends such as Subversion.

While I was at it, I thought I may as well improve the frontend dispatching to make things more modular.

To avoid "second system syndrome", I'm treating this entirely as a refactoring and not adding any new features until I'm done.

by : Created on Sept. 18, 2004 : Last modified Feb. 8, 2005 : (permalink)


Favourite Pieces

A friend recently asked me my 10 favourite pieces of music. The first eight to pop into my head (grouped by period and composer) were:

If I thought about it more, they might not really all be in my top 10 but they'd all be close. In a larger list I'd include some Ligeti, Stravinsky, Copland and Glass.

by : Created on Sept. 17, 2004 : Last modified Feb. 8, 2005 : (permalink)


Journey to the Poincare Conjecture

I'm going to try something over the next few months (years?) that I hope will be of interest to at least some readers of this blog.

I've taken an interest recently in the Poincare Conjecture, one of the most famous outstanding problems in pure mathematics.

Although I understand the conjecture itself, recent developments in proving it involve some areas of mathematics that are well beyond my current level of understanding.

I plan to change that. I want to brush up on my algebraic topology, work my way up to Thurston's work on classifying 3-manifolds and Hamilton's work on Ricci Flow and eventually end up at a point where I can read, understand and maybe even enjoy Perelman's recent papers which may very well turn out to finally prove Poincare's Conjecture.

My intention is to blog what I learn as I go. It's an opportunity for me to better formulate my own understanding, improve the mathematical capabilities of Leonardo and maybe even help someone else learn a thing or two.

It's a change from my series of film project updates, but hopefully a fruitful one.

UPDATE: next post

by : Created on Sept. 17, 2004 : Last modified Aug. 10, 2007 : (permalink)


The Inverse Law of Bug Complexity

"The harder a bug is to track down, the simpler the fix tends to be."

UPDATE (2004-09-18): A couple of people have asked me if this is an original quote. Yes it is. I've never heard it stated before but I have personally observed it many many times.

by : Created on Sept. 16, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Makes Burlington Union

The local paper, the Burlington Union, featured us on the front page with a couple of great photos and a nice article all about the film.

If I get permission to distribute the article, it'll be on http://www.alibiphonenetwork.com/ and I'll mention it here as well.

UPDATE (2004-09-11): The article is now online.

by : Created on Sept. 10, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Project Update: Editing Has Begun

This afternoon I transferred our almost four hours of rough footage to disk and this evening I did a rough cut that came in at just over 13 minutes. No major issues—just one scene with boom handling noise in every take and another scene in which the lone take features a clearly visible boom operator in the corner.

But other than that, the final film works well for me. Will be interesting to watch it tomorrow after I'm a little more detached.

Oh, and the film now has a website: http://alibiphonenetwork.com/

Don't look for the film itself online any time soon, though. Most festivals have a rule that makes films made available on the Internet ineligible for entry.

UPDATE (2004-09-05): I actually think there are some continuity problems caused by "crossing the line" in a couple of scenes. Will be interesting to see if I can rescue them in editing to avoid having to reshoot.

by : Created on Sept. 4, 2004 : Last modified Feb. 8, 2005 : (permalink)


Prank of the Day

I hooked up a filter to CVS to run the check-in mail through the Jive filter for one particular user.

Definitely recommended!

by : Created on Sept. 3, 2004 : Last modified Feb. 8, 2005 : (permalink)


Fare Basis Only On Printed Ticket

I need to extend my stay in the US for a couple more weeks so I called Qantas from work today.

"What's the fare basis printed on your ticket, sir?" "Umm, I don't have my ticket with me at the moment, can you look it up?" "No, it's only on the ticket, sir".

Why is an important piece of information like the fare basis for the ticket only on the printed ticket and not in some Qantas computer?

I recall this isn't the first time that Qantas telephone sales have had no idea what the conditions of my ticket were.

Which reminds me: a couple of years ago I was travelling domestically in the US and they wouldn't let me on the plane because no ticket had been issued for me. "But here's my reservation confirmation", I said and pointed out that I even had a seat allocation. "Yes, you have a reservation on this flight, sir, but not a ticket". Silly me! What was I thinking?

UPDATE (2004-09-03): I now have to physically send my ticket to Arizona so Qantas can physically send me back a new one.

by : Created on Sept. 2, 2004 : Last modified Feb. 8, 2005 : (permalink)


Note to Titles-Only Atom Feed Subscribers

I've noticed that the number of subscribers to the titles-only feed is decreasing (with a corresponding increase in the full-text feed subscribers).

It just occurred to me that there are probably some titles-only feed subscribers that don't know that a full-text feed is available.

Oh, and now that production on Alibi Phone Network is over, expect me to get back to work on Leonardo.

by : Created on Aug. 30, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Project Update: Second and Final Day of Shooting

Well, I'm glad we were ahead yesterday because we had a near disaster today.

We got off to a great start at a house we were fortunate enough to be able to use but we didn't get done until an hour after we'd told the owners of the house we'd be finished.

There was a mad rush to pack up everything and get it in James Marcus's car to take back to the office for the remaining scenes. However, after it had all been loaded up, Tom tried to start the car and the key broke. Without a spare, this would have been distressing at the best of times. But when it was the main equipment transportation for the entire shoot, the whole day was in jeopardy.

We managed, however, to move all the equipment to other cars and share rides to get back only a few hours later than planned.

The final shots were a mad race against the sun but we got everything done.

So principal photography is over. I'm actually quite sad—we had a great group of people and, stress aside, it was a great deal of fun.

I now need to sleep for 20 hours straight. After I've recovered and have caught up on some of the other aspects of my life that have been neglected during the making of this film, I'll be starting on the editing.

by : Created on Aug. 29, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Project Update: The First Day of Shooting

We ended at 6.30pm today, exactly 12 hours after the main cast/crew call. Things went exceptionally well. The actors were awesome and we had a phenomenal crew, some of whom were friends of friends and who turned out to be excellent. We even got an extra scene shot so we're looking good for tomorrow.

There were only two problems that arose. One was the buzz of insects that sometimes held up a scene in a garden. The other was a very humorous sequence of goofs where every elevator except the one our actress was in front of answered the call and opened. To the old adage of "never work with animals or children" I now add "or elevators or insects".

I am absolutely exhausted. But we survived our first day.

by : Created on Aug. 28, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Project Update: First Call

I'm writing this on location at 5.30am. The majority of the cast and crew will be arriving in the next hour. I'm printing the latest script updates from Tom and once James Marcus (who'll be 1st AD) comes in we'll go through the day's schedule.

Actors came in last night for a final rehearsal (also a first rehearsal with the two male characters together). My combined Script Supervisor / 2nd Assistant Camera pulled out last night. Otherwise things are going well.

by : Created on Aug. 28, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Project Update: On the Eve of Principal Photography

The equipment has arrived. The shooting schedule and shot list have been finalised. We have all our props. James Marcus has had a hair cut. I've done my laundry.

Two more hours until the actors come for a rehearsal. Fifteen hours until the first call for principal photography.

by : Created on Aug. 27, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Project Update: Good News

Yesterday we got confirmation on the final location. Then today I got the letter from SAG approving our application for an Experimental Film Agreement.

Funnily enough, when Tom, James and I grabbed some Chinese food for dinner, my fortune cookie said "good news will come via mail".

by : Created on Aug. 24, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Project Update: Second Crew Rehearsal

On Sunday we did another full day of crew rehearsal. Came up with a great interior crane shot with the jib. In the afternoon, two of the actors came by and we did a bunch of scenes with them including a thirty-foot tracking shot in a parking lot. We've since asked our rental place if we can get more track for the principal photography so we're set to get sixty feet.

The actors we've cast are great. This should be a great shoot.

by : Created on Aug. 24, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Project Update: First Crew Rehearsal

Today was our first full-equipment crew rehearsal.

Tom, James and I, along with Production Assistants and Stand-ins Travis Bennett and Virender Dogra filmed two scenes: one a tracking shot and the other a dialog scene.

Immediate observations: firstly, the combination of 24p and cine-like gamma on the Panasonic DVX-100A is stunning. Secondly, use of a track dolly adds incredible production value.

All three together led to the most professional, film-looking shot I've ever done in my time as an amateur digital cinematographer.

After dinner, James and I went through the script and did a shot-by-shot breakdown. For most of the dialog scenes, we've planned a fairly standard combination of master shot plus tight and wide shots in both directions for good coverage in five shots per dialog scene. For some of the more action-oriented scenes we have some more fine-grained shots planned.

As both the cinematographer and editor, I found myself having to be strict about separating what we wanted to film for coverage from how I would eventually edit it. Sufficient coverage will be crucial.

by : Created on Aug. 22, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Project Update: One Week To Go

This weekend I've rented the full set of equipment we'll be using. This will give us the opportunity to get familiar with it, both in general and in the specific context of our locations.

I've rented:

We have verbal agreements from all three cast members. I'm almost finalised with SAG (the person I've been dealing with there has been excellent). We have all but one location secured.

Biggest risks at the moment are: securing last location, bad weather, actor/crew illness.

by : Created on Aug. 20, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Project Update: The Auditions

At 3pm on Tuesday, I went to pick up the camera and tripod and FedEx the documents to SAG.

At 5.30pm we went to the Marriott to set up the room.

From 6pm to 8.30pm we had numerous people come in and read for us.

The first thing I learnt was that headshots do not tell you much about whether a person would fit a character or not. There was at least one person that I was so sure would be a fit for one character based on their headshot but who ended up being a much better fit for a completely different character once I saw their facial expressions.

Second thing I learnt was that actors will read any given line so differently from each other that it doesn't even seem like the same line.

We got some excellent performances. Actors deliver lines so much better than non-actors. It was really amazing to see. And our SAG member - he had such a presence that when he walked in the room, I felt like I was meeting someone famous (super nice guy, though!)

During the audition, we ordered some food and when the waitress walked in, I was speechless because she looked exactly like what I imagined the female character would look like. So I got James to go ask her to audition. She couldn't then and there and we didn't really want to risk losing our first choice from the auditions and take a chance on someone who'd never acted. I think if we were more experienced at directing actors, we might have given it a go. I still want to give her the opportunity to be involved in the film, though, as she seemed excited about the whole thing.

But the actress we chose for the principal female role was a great actress, even though she didn't have the look we were initially thinking, so I think we made the right choice. If the waitress had turned out to be an actor, it would have been a cool "discovery" story, though.

So, basically, we have our three choices for the principal characters now. Two have verbally accepted, just waiting on the third.

Tomorrow I have to check with SAG whether I can get the agreement I need with them.

by : Created on Aug. 18, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Project Update: Auditions Tomorrow

Well, we've organized equipment rental and auditions should be over in 24 hours' time. I'm about to finalise the schedule and budget to fax over to SAG. The actress we all liked the best from the headshots can't make the film which is a huge disappointment to Tom, James and me. I'm hoping someone else comes in and blows us away. I've asked the SAG actor we all like for the male lead if he can help out with casting the female part.

I said last update that I'd reveal the name of the film in this update.

Our film is called The Alibi Phone Network.

by : Created on Aug. 16, 2004 : Last modified Feb. 8, 2005 : (permalink)


OPML Sharing and Polling Security

Prompted by Scoble, I uploaded my OPML to Dave Winer's OPML sharing site.

You should too!

I was just about to comment that it would be nice—along the lines I was suggesting in Aggregation Versus Hosting—if you could just provide a URI for your OPML and have the site pull it in on a regular basis.

Well it turns out you can. Thank you Dave!

Now if Bloglines would take the same URI (via polling plus the ability to force a reload) I'd be even happier.

Making a resource available for polling rather than uploading it to a variety of sites raises some additional security issues. What if I wanted to make my resource available to aggregator.example.com but no one else? One possibility would be to submit the URI along with a username and password that aggregator.example.com could use with Basic Auth to retrieve the URI. Alternatively, and more securely, aggregator.example.com could publish a public key and I could configure my site to encrypt the resource using that public key whenever aggregator.example.com requested it. I wonder if either would fly.
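A sketch of the first option from the aggregator's side: fetching the resource with the username and password I submitted alongside the URI. The URI and credentials are placeholders, the function name is made up, and the request is only built, not sent.

```python
import base64
import urllib.request

def basic_auth_request(uri, username, password):
    """Build a GET request carrying an HTTP Basic Auth header."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(uri, headers={"Authorization": "Basic " + token})

# Placeholder values: what I would hand to aggregator.example.com.
request = basic_auth_request("http://example.org/blogroll.opml", "aggregator", "s3cret")
print(request.get_header("Authorization"))
```

Note that the header is only base64-encoded, not encrypted, so over plain HTTP anyone who can see the request can recover the password. That is the sense in which the public key approach would be more secure.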

by : Created on Aug. 14, 2004 : Last modified Feb. 8, 2005 : (permalink)


Maximizing the Differences

OpenFlow mostly agreed with my post on XML and RDF but took issue with me on one point that I think was a misunderstanding.

When I said "the default serialization of RDF as XML should not be the principal way RDF is interchanged", I wasn't against serializing as XML. I meant that a generic RDF to XML serialization isn't necessarily going to result in the optimal XML schema.

XML should be the way RDF is expressed but I don't think a single generic mapping (or even a small handful of them) is going to give you nearly as nice XML as you would get if you tweaked the mapping for the particular ontology.

OpenFlow suggests "A canonical way of expressing RDF would probably go a long way in minimizing the differences (and flame wars) between RDF and XML" but I don't want to minimize the differences between RDF and XML because I think they serve very different purposes. I'm trying to reduce the overlap in order to minimize the flame wars.

One advantage of the RDF ontology + mapping + XML schema approach over the RDF ontology + generic XML serialization is that people who don't like the generic RDF/XML serialization don't have to use it; they can invent their own XML schema.
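To illustrate the ontology + mapping + XML schema idea, here is a toy sketch: a few triples using FOAF properties, a hand-tweaked mapping from property URIs to element names, and a function that emits XML shaped by that mapping. The subject URI, the element names, the MAPPING table and the function are all invented for the example.

```python
import xml.etree.ElementTree as ET

FOAF = "http://xmlns.com/foaf/0.1/"
ME = "http://example.org/people/jt"  # hypothetical subject URI

# The RDF side: plain (subject, predicate, object) triples.
triples = [
    (ME, FOAF + "name", "James"),
    (ME, FOAF + "interest", "http://en.wikipedia.org/wiki/Linguistics"),
]

# The mapping: for this ontology, which element each property becomes and
# whether its value goes in the element text or in an href attribute.
MAPPING = {
    FOAF + "name": ("name", "text"),
    FOAF + "interest": ("interest", "href"),
}

def to_custom_xml(subject, triples, mapping, root_name="person"):
    """Serialize one subject's triples using an ontology-specific mapping."""
    root = ET.Element(root_name, {"id": subject})
    for s, p, o in triples:
        if s != subject or p not in mapping:
            continue  # properties without a mapping are simply left out
        element_name, kind = mapping[p]
        child = ET.SubElement(root, element_name)
        if kind == "href":
            child.set("href", o)
        else:
            child.text = o
    return ET.tostring(root, encoding="unicode")

xml_text = to_custom_xml(ME, triples, MAPPING)
print(xml_text)
```

Someone who wanted different XML would only change the mapping; the triples stay the same. Running the mapping in reverse is what lets a "non-RDF" schema be read back as RDF.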

Furthermore, we RDFers don't have to lament over every "non-RDF" XML schema developed. We can hope that the developers of individual XML schemas would provide a mapping to an RDF ontology, but if they don't, someone else always can.

by : Created on Aug. 14, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Project Update: Two Weeks Until Principal Photography

A couple of people have asked for an update on the film. Things are going well, although I made the decision to push back the schedule a little.

Auditions are taking place next Tuesday. We've rented a meeting room at the Boston Marriott Burlington. Most of the locations have been chosen although we're still scouting a couple more. Arranging the equipment rental is a little behind as is getting workers' compensation insurance. The latter means we're going to be cutting it fine for the SAG agreement.

But overall I'm confident. Principal photography is scheduled for 28th and 29th August. In my next film project update, I'll probably announce the film's working title.

by : Created on Aug. 13, 2004 : Last modified Feb. 8, 2005 : (permalink)


My New PowerBook

I just bought an Apple PowerBook. So all you Mac users out there, email me your tips on the "must-have" apps/tools I need to download/buy and I'll post them here after a little while.

UPDATE (2004-09-18):

What people recommended (in no particular order).

Thanks to Simon Willison, Robert Fleming, Michael Twomey, Bill Anderson, Chris Adams, Ashley Aitken:

Launchbar (3 recommendations)

http://www.obdev.at/products/launchbar/

"if you prefer keyboard over mouse"

"having a fast search engine for apps, documents, contacts, etc. really changes the way you work."

"quick access to applications/URLs/email addresses"

WorldClock

http://www.mabasoft.com/downloads/worldClockDeluxeX.html

"for international travellers/workers like yourself"

DragThing

http://www.dragthing.com/

"a more powerful version of the Dock for power users"

iTerm (4 recommendations)

http://iterm.sourceforge.net

"a much better alternative to Apple's Terminal.app."

"replacement for terminal"

"(though I've temporarily reverted to Terminal.app for some stuff as I'm getting bad load issues with iTerm). For me it's terminal with tabs."

SubEthaEdit (3 recommendations)

http://www.codingmonkeys.de/subethaedit/

"a fantastic text editor, and that's even before you start playing with the collaboration features."

"is collaborative editing as it should be. I know I've moaned about it in the past, but I still like it, and it's great for group editing."

QuickSilver (3 recommendations)

http://quicksilver.blacktree.com/

"makes the Dock obsolete - an awesome way of launching programs and quickly finding files and contacts."

"having a fast search engine for apps, documents, contacts, etc. really changes the way you work."

"is a must (at least until spotlight arrives). Ctrl + space + first few letters of app's name = great way to launch stuff for people used to keeping an xterm around to launch apps."

VLC (2 recommendations)

http://www.videolan.org

"an open source video player, plays almost any format you throw at it (including DivX) and unlike QuickTime allows you to play things in full screen mode."

"Video file player"

NetNewsWire (4 recommendations)

http://www.ranchero.com/netnewswire

"an excellent RSS / Atom news aggregator."

"RSS client"

"THE news aggregator, just buy it (I know there is a lite, but just buy it)."

FireFox (2 recommendations)

http://www.mozilla.org/products/firefox/

"Safari is good, but I find things like gmail work better on firefox. And type ahead find is hard to beat when you are addicted to it."

OmniWeb

http://www.omnigroup.com

"it's a very user-oriented browser which relies on the Safari rendering engine."

Salling Clicker

http://www.salling.com

"if you have a Bluetooth phone you can use this as a remote control for your computer (iTunes, PowerPoint, etc.) and to execute arbitrary commands based on state changes - pause the DVD player when you get a call, lock the screen or adjust IM status when you walk away from the computer, etc."

BluePhoneMenu

http://www.reelintelligence.com/BluePhoneMenu/

"completes things from a Bluetooth perspective - it manages your SMS history, maintains a call log, displays your addressbook entry for incoming calls along with options to reply w/SMS or punt the call to voice mail."

uControl (2 recommendations)

http://gnufoo.org/ucontrol/

"this lets you remap keys and enable scroll-wheel emulation on a trackpad (I use function+trackpad for scrolling all the time)."

"remap keys/enable "scroll-wheel" like support w/ trackpad. I use in conjunction with sidetrack"

Sidetrack

http://www.ragingmenace.com/software/sidetrack/index.html

"take full control of your trackpad"

QuickTime Broadcaster

http://www.apple.com/quicktime/products/broadcaster/

"allows you to stream / save arbitrary audio and video streams. I've used it for recording and streaming talks."

RsyncX

http://www.macosxlabs.org/rsyncx/rsyncx.html

"a resource-fork-aware version of rsync with a graphical front-end. I give it to portable users for backups since SSH works anywhere on the Internet and it's usable over slow connections."

TinkerTool

http://www.bresink.de/osx/TinkerTool.html

"a front-end for a bunch of system customizations."

SSHPassKey

http://www.codefab.com/unsupported/

"an Aqua ssh-askpass equivalent. If you script SSH but don't use an SSH agent this will prompt you for your password."

Little Snitch

http://www.obdev.at

"control outbound tcp connections"

MenuMeters (2 recommendations)

http://www.ragingmenace.com/software/menumeters/

"a set of CPU, memory, disk, and network monitoring tools for MacOS X. (sit in toolbar)"

"is great for spotting runaway apps and generally getting a feel for your mac's status."  

Geektool

http://projects.tynsoe.org/en/geektool/

"is a geek's must have, it prints console messages (you know, the stuff which winds up in the syslog) in a discrete manner to your desktop. Great for spotting apps which are throwing wobblers (helped me pin down an odd issue with Control Center)."

Desktop Manager

http://wsmanager.sourceforge.net/

"gives you virtual desktops. I just have the menu bar desktops and drop the desktop panel myself."

Ecto

http://ecto.kung-foo.tv/

"is an excellent blogging tool (I'm using it right now)."

Emacs (2 recommendations)

http://www.mindlube.com/products/emacs/

"I almost didn't get my new job 'cos they thought I used vi (allegedly). Emacs is very nice once you've activated the pc-mode stuff, then you have sane selection and navigation."

GnuPG

http://www.gnupg.org/

Thunderbird

by : Created on Aug. 13, 2004 : Last modified Feb. 8, 2005 : (permalink)


Aggregation Versus Hosting

Last week, in DOAP and the Next Advogato, I touched on my desire for sites that allow me to use my own website as the authoritative source of information while at the same time publishing that information and aggregating it with that from others for the purpose of searching, rating, etc.

In Some 'Web as platform' noodling, Kottke ponders the opposite: putting different components of your digital life out on different websites: using one site for your photos, another for your playlists, another for your calendar, etc.

But this is exactly what I don't want to do. I'm happy to publish my blog and open source project info on Advogato but I'd like it to act purely as an aggregator of my RSS/Atom and DOAP feed, not the authoritative source. I'm happy to use LinkedIn but I'd like LinkedIn to poll my FOAF - not have LinkedIn be the authoritative source. I'd like to be able to publish events from my calendar to Upcoming.org rather than enter them in Upcoming.org and use their feed if I want to use my own info on my site.

I think there's a lot of value in pure aggregation sites. I like the value-add that reading blogs in Bloglines provides in terms of references and recommendations. But I'm not interested in pulling my blogroll from them, I'd rather push it to them (or have them pull it from me).

The value of a LinkedIn or Upcoming.org is in the aggregation, not using them as an authoring tool or repository for one's own data. They should focus on competing on the value-add of their aggregation. I don't see any disadvantage for them in opening up the input mechanism to pull the source information from external authoritative feeds (or support the information being pushed to them).

I'm not ruling out the need for information hosting services. But I think aggregators and hosting services are different beasts and separating them provides many advantages to both providers and consumers of information.

UPDATE (2004-09-25): Now see More on Aggregation Versus Hosting

by : Created on Aug. 11, 2004 : Last modified Feb. 8, 2005 : (permalink)


Hackers and Painters

I started reading Paul Graham's Hackers and Painters this weekend, in somewhat of a random order. I've read five essays so far and have thoroughly enjoyed every single one of them. It is not so much that they are introducing me to new ideas as expressing in an interesting way ideas I've had for a while. So I found myself not so much saying "I'd never thought of that before" as "I'd never thought of putting it that way before".

Clearly, I'm coming at the book already sharing many of Paul Graham's views. I'm not sure the book would convince people who hold alternative views to change their mind. This is evidenced by the strikingly bi-modal distribution of ratings on Amazon.com:

But certainly I'm recommending the book to non-hacker friends as well as to friends I already know share a lot of the same perspective.

by : Created on Aug. 9, 2004 : Last modified Feb. 8, 2005 : (permalink)


More on XML and RDF

The 'Document' in Document-Oriented Messaging is another great post from mnot on why XML (and the Infoset and XML Schema) are good for surface syntax but not data modeling.

Norm Walsh argues in Is RDF/XML Good for Anything? that the RDF/XML serialization might be good for RDF "core dumps" but not for authoring data.

An earlier mnot post prompted me to write on the XML Infoset, XML Schemas, RDF and RDF Schemas. I think it combines a lot of what mnot is saying with a lot of what Norm is saying.

That doesn't necessarily mean either Norm or Mark would agree with me but I continue to believe that:

I therefore believe that when one develops a vocabulary (or "application" in the SGML sense of the term) it should include:

UPDATE (2004/08/06): I need to work out where XMI fits in all of this.

by : Created on Aug. 6, 2004 : Last modified Feb. 8, 2005 : (permalink)


DOAP and the next Advogato

I've recently been reading about Edd Dumbill's Description of a Project (DOAP) project.

Machine-readable descriptions of software projects are something I've dabbled in since 1998 when I started XMLSOFTWARE.COM. Around that time I worked a little bit with Lars Marius Garshol's XML Software Autoupdate (XSA). Microsoft had Open Software Description (OSD), although OSD was designed more for describing component dependencies whereas XSA was designed for software directories to be able to poll to get updates from developers (a use-case DOAP would be suited to as well).

Given that a lot of this site is about open source software projects of mine, I'll probably add DOAP support to Leonardo at some stage, probably around the same time I add FOAF support. But I have the same questions about the relationship between DOAP and, say, Freshmeat or Advogato as I do between FOAF and the Orkuts and Linkedins of the world. Namely: how can I use my own website as the authoritative source of my own FOAF and DOAP information while at the same time making that information available in directories for searching, rating, etc.?

The RDF-nature of both FOAF and DOAP means that what is really needed is a general mechanism that does this for any RDF, although FOAF and DOAP specific support would make a great start.

I'm thinking for starters of a version of Advogato where you just specify the URIs for your FOAF, DOAP and RSS.

by : Created on Aug. 1, 2004 : Last modified Feb. 8, 2005 : (permalink)


Great Hackers, Python, Java, Eclipse and Chandler

In his latest essay, Great Hackers, Paul Graham suggests that one mark of a great programmer is that, given the choice, they'll program in Python rather than Java. He points out that Google requires Python experience, even when hiring for Java positions, because it attracts better candidates.

Almost all of my "recreational" programming over the last five years has been in Python. I'm not saying that makes me one of Graham's Great Hackers, but it is certainly the case that, given the choice, I write in Python and not in Java. In fact, the only time I write in Java outside of work is when I'm writing plugins for Eclipse. The Eclipse platform is seductive enough that it keeps me writing GUI apps in Java, even when I have the choice.

The implication is, though, that if there were a Python equivalent of Eclipse (the platform, not just the IDE), I could cease to write in Java altogether whenever I had the choice.

The first step would be getting something like wxPython to the level of SWT+JFace. Next would be a standard plugin framework.

The last couple of months I've wondered if Chandler might be the best avenue to achieve this. It already has a pluggable "parcel" framework and if it is going to compete as a PIM, it's going to have to drive wxPython to the level of SWT+JFace.

I still stand by my prediction about Eclipse being the next Emacs but I'd love to see a Python equivalent, whether it's Chandler or not.

by : Created on Aug. 1, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Project Update: Script Updates, SAG, Equipment and Schedules

Things are so busy at work, I haven't had much of a chance to blog. I want to get back to talking about things besides the film, but for now, here's another update.

Tom and I are still tweaking the script. We had a great session on Saturday where we closed up a plot hole that was bothering me.

SAG returned my call yesterday and there seems to be no problem getting an Experimental Film Agreement in place. This excites me a lot. All of a sudden this feels like a real film.

I got a quote for equipment rental. Fortunately weekend rental counts as one day so that's keeping the costs down. I'm going to shoot on the Panasonic DVX100A, which can do 24p. I briefly toyed with going HD but not only is that 8x the price, it adds a bunch of additional requirements and complications to post-production. If it had been released in time, I probably would have shot on a Canon XL-2 given that I have an XL-1 at home that I'm used to.

In researching cameras, I came across a great self-published book called "Shooting Digital" from Marcus van Bavel of DVFilm, a company that does DV-to-Film transfers. The book is specifically for people shooting with digital cameras who intend (or in my case, hope one day to need) to transfer to 35mm film.

This weekend I also got around to trying out the Sun Frog film scheduling software I've mentioned before. I have to say, it's a very nice tool. I'll write a more detailed review shortly but it certainly makes management of breakdown items and schedules as well as the generation of reports very easy. And it has a nice modern interface which alone makes it feel so much more professional.

by : Created on July 28, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Project Update: Casting

On Wednesday, a casting call notice I'd submitted earlier to the NE Film site was posted. Almost immediately I started getting responses.

As of right now, I've received on the order of twenty responses. From a headshot perspective, we've got great fits for each of the three principal roles. However, my favourite for the lead is a SAG member. I've seen a demo reel and he is really good. I emailed him and told him the issue. The result: he encouraged me to call the Boston SAG office to try to get a SAG experimental contract expedited.

So if all goes well, I might be a SAG signatory for this film!

by : Created on July 22, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Project Update: Script Almost Done and Casting About to Begin

Tom did a full draft of the screenplay on Friday which we discussed at length over the phone on Saturday and Sunday. There were a lot of great things he added but I felt some key features of the original treatment were lost. Plus, both of us felt it was going to be more difficult to cast.

So this evening, Tom did another draft and sent it to me. It has some minor issues which we talked about on the phone again, but I think we pretty much have our script. And we thought of a great reference to Ferris Bueller's Day Off that fans of that film will immediately recognize.

The big challenge is now going to be casting. There are three principal parts: two males and one female. All three are professionals in their late twenties to early thirties. All interested non-union actors in the Boston-area are encouraged to send headshots and resumes to me (see contact information). Availability is required over two weekends mid-to-late August. I would have tried for a SAGIndie Experimental Film Agreement but we don't have time. Next film, I'll definitely apply.

by : Created on July 19, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Project Update: Introducing Tom Bennett

Last weekend, when I should have been well into writing the script, I was still struggling on the treatment. I managed to get a rough sketch done but was worried I wasn't going to get an entire script written by 18th July which was my target.

Then on Tuesday, I was having lunch with some colleagues including our new director of professional services at mValent, Tom Bennett. Tom mentioned that he writes screenplays.

I told him about the project and yesterday sent him my treatment. That night he went home and wrote a brilliant first five pages—far better than I could have done.

So he's on board. I should have a first draft from him by the end of the week which means we can start scheduling, casting and location scouting.

by : Created on July 15, 2004 : Last modified Feb. 8, 2005 : (permalink)


37 is a Psychologically Random Number

About eight years ago, I started noticing the number 37 appearing disproportionately in television and movies. It soon became a "cult" number for my girlfriend and myself—we'd ring up each other any time we saw a "37" somewhere (that was my excuse, anyway).

We soon came to distinguish three types of instances of 37:

The third, we referred to as "true 37s". It really is remarkable how many times 37 is the number people will come up with when they are randomly picking a number but want to sound specific (as in "I can think of 37 things wrong with this proposal").

I recently found:

which lists many instances of 37 (of which only some are "true 37s").

And my sister then found the following:

On that last link, make sure you follow the link on psychologically random numbers. It's where I picked up the term for this phenomenon that has fascinated me for almost a decade.

by : Created on July 9, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Project Update: Week Two

It's not written up as a treatment yet, but James Marcus and I pretty much have the story worked out (mostly during trips in his car). The Ferris Bueller connection probably isn't going to work but maybe that can be saved for another film.

My goal is to write up the treatment this weekend and maybe sketch out some of the key dialog. I might try out some script editing software too, which reminds me: I've wanted to do a script editing plugin for Eclipse pretty much since 2.0.

by : Created on July 2, 2004 : Last modified Feb. 8, 2005 : (permalink)


My First Eclipse RCP Application

After I told a colleague about the Calkin-Wilf tree (see Enumerating the Rationals in Python), he suggested a program that would generate the tree graphically.

I decided it was a good excuse to try writing my first Eclipse 3.0 rich-client platform (RCP) application. Given the children of each node in the Calkin-Wilf tree are derivable solely from the node itself, it just took a little tree content provider and the RCP skeleton pretty much copied straight from the example app referenced in the online doc.
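The application itself was Java on the Eclipse platform, but the rule the content provider relies on is easy to sketch in Python. This assumes the usual Calkin-Wilf construction, in which a node a/b has left child a/(a+b) and right child (a+b)/b; the function names are mine and are not taken from the RCP code.

```python
def children(numerator, denominator):
    """Return the (left, right) children of numerator/denominator."""
    left = (numerator, numerator + denominator)
    right = (numerator + denominator, denominator)
    return left, right


def level(depth):
    """List the nodes at a given depth, left to right, starting from 1/1."""
    nodes = [(1, 1)]
    for _ in range(depth):
        nodes = [child for n, d in nodes for child in children(n, d)]
    return nodes
```

Because each node's children depend on nothing but the node, a tree viewer never needs to hold the (infinite) tree in memory; it only computes children when a node is expanded.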

It worked out pretty well. Writing the application itself was straightforward; packaging it up with the necessary components from the RCP platform was not. It took a lot of trial and error to get the minimal set of additional plugins and to bootstrap the execution of the application outside of the IDE.

What I would like to see (and I'm sure will be in 3.1) are some improvements to the plugin development environment specifically for RCP; namely, a wizard for creating the base skeleton (plugin, app class and advisor class) and a wizard for packaging the RCP app for standalone execution.

by : Created on July 2, 2004 : Last modified Feb. 8, 2005 : (permalink)


Enumerating the Rationals in Python

I've written a short script that enumerates the positive rationals without duplicates:

The script involves a simple generator wrapping a single-line iterative expression.

I've long been aware of the method of enumerating the positive rationals by walking the diagonals of the infinite matrix (with the numerator increasing across columns and denominator increasing down rows) but this results in duplicates (an infinite number for each rational number, in fact).
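For comparison, here is a rough sketch of the diagonal walk. The direction in which each diagonal is traversed is an arbitrary choice on my part; the duplicates appear either way.

```python
from fractions import Fraction
from itertools import islice


def diagonal_pairs():
    """Yield (numerator, denominator) pairs one anti-diagonal at a time."""
    total = 2  # numerator + denominator is constant along a diagonal
    while True:
        for numerator in range(1, total):
            yield numerator, total - numerator
        total += 1


pairs = list(islice(diagonal_pairs(), 6))
values = [Fraction(n, d) for n, d in pairs]
```

Among the first six pairs, (1, 1) and (2, 2) already denote the same rational, so any use of this walk has to filter out unreduced fractions.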

The approach taken in my script is based, instead, on walking a Calkin-Wilf tree. I became aware of this approach from a recent paper by Jeremy Gibbons, David Lester and Richard Bird which I found out about from the Lambda the Ultimate blog.
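What follows is a reconstruction of the idea rather than the script itself: a minimal generator, assuming the successor rule from that paper, under which the rational after x is 1 / (2*floor(x) - x + 1). Written with integer numerator n and denominator d, that becomes d / (n + d - 2*(n mod d)).

```python
from fractions import Fraction
from itertools import islice


def rationals():
    """Yield each positive rational exactly once, in Calkin-Wilf order."""
    n, d = 1, 1
    while True:
        yield Fraction(n, d)
        # Successor of n/d in a breadth-first walk of the Calkin-Wilf tree.
        n, d = d, n + d - 2 * (n % d)


first_eight = list(islice(rationals(), 8))
```

The first eight values are 1, 1/2, 2, 1/3, 3/2, 2/3, 3 and 1/4, which matches reading the tree level by level.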

by : Created on July 1, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Project Update: Week One

Work's been busy and between that, exercising and Pimsleur Italian, I haven't had much time to think about the film. I was getting worried that I wouldn't come up with a concept and a script in time. Then last Friday, I read an article describing a new meme that sounded like it would make an interesting film. I don't want to say much more about the idea itself at this stage but I'm liking it more and more. Should provide a great foundation for some comedy and some conflict. Ferris Bueller's Day Off was on last night and, in an interesting way, there are some similarities. Will be interesting to see how the idea develops.

Unlike a lot of ideas I get for short films, I know how this one should end. That's half the battle, so hopefully the script writing will go smoothly. I only have three weeks.

by : Created on June 27, 2004 : Last modified Feb. 8, 2005 : (permalink)


VCs and Films

At the same time mValent was going for its first round of VC funding, I was also reading a lot about producing films. I saw many interesting parallels and even at one point conceived of a short film (tentatively called The Pitch) that juxtaposed the two worlds.

Joshua Newman, former hi-tech entrepreneur, VC and now film producer has a great blog entry on the development of independent films.

I'd like to follow in Josh's footsteps one day, which is humbling given that, at 24, he is six years younger than me.

by : Created on June 27, 2004 : Last modified Feb. 8, 2005 : (permalink)


Film Scheduling Software

I've been looking to use some scheduling software for the film project and I recently came across Sun Frog. Not only does the software look interesting but I love the company's approach. They have a blog and a discussion group, a cafe press store, a competition for the best production of the sample script that ships with the product and finally, you can rent the software, which I think is great for film productions.

I'm going to try out the trial version for the next few weeks and, if I like it, I'll rent it for the duration of the production.

You may be wondering why I'd bother using film scheduling software for what is a short film with a skeleton crew. The simple answer is that I'm fascinated by production management (I even have an Amazon.com List of production management books) and I want to manage this production as if it were a full feature production.

by : Created on June 27, 2004 : Last modified Feb. 8, 2005 : (permalink)


Twelve Week Film Project

Speaking of film projects...

I'm about to embark on an extended trip to the US; extended enough that it will be longer than any stretch I've spent in Australia since moving back.

This means I have to bring more of my hobbies with me. This time, I'm bringing filmmaking and plan to make a short film from scratch while there.

I'll update readers of this blog as I go along but the current plan, working with my good friend James Marcus, is to spend weeks 1-4 on development; 5-6 on pre-production; 7-8 on production and 9-12 on post.

Stay tuned!

by : Created on June 16, 2004 : Last modified Feb. 8, 2005 : (permalink)


Tintin Script in First Draft

Ever since I found out about Spielberg's plans to make a trilogy of films based on Tintin, I've run marlinspike.org, a site dedicated to news on the film project. In the first real news in a while, it seems that a draft script for the first film is complete.

by : Created on June 16, 2004 : Last modified Feb. 8, 2005 : (permalink)


Why Are There Three Primary Colours?

One question that has long puzzled me (although not enough to motivate me to find an answer until now) is:

why are there three primary colours?

Another possible statement of the problem might be: why is the space of colours three dimensional?

Once when I posed this question to a friend they suggested the reason was that the human eye has three cones. But that could be the result of rather than the reason for the three dimensional colour space.

One possible answer is that it isn't three dimensional, it's infinite dimensional but three gives a reasonably good approximation and the marginal utility of adding more dimensions drops off quickly. This reminds me of music where 12 notes gives a decent approximation to the harmonic series—much better than seven notes which is the next best under 12 and enough that few have been motivated to go to 19 notes where the next improvement happens.

After all, isn't white light a combination of all frequencies, not just three?

If you take a sine wave with the frequency of red and one with the frequency of green and add them, you get a wave whose periodicity resembles that of yellow. But it isn't the same as a pure sine wave of that frequency. This in itself suggests that additive colours are just approximations.
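A quick numerical check of that claim, using the identity sin A + sin B = 2 sin((A+B)/2) cos((A-B)/2). The two frequencies below are arbitrary placeholder values in arbitrary units, not measured frequencies of red and green light.

```python
import math

F_RED, F_GREEN = 4.3, 5.6  # hypothetical frequencies, arbitrary units


def summed(t):
    """Sum of the two pure sine waves at time t."""
    return math.sin(2 * math.pi * F_RED * t) + math.sin(2 * math.pi * F_GREEN * t)


def product_form(t):
    """The same signal as a mean-frequency wave times a slow envelope."""
    mean = (F_RED + F_GREEN) / 2
    half_difference = (F_GREEN - F_RED) / 2
    return (2 * math.sin(2 * math.pi * mean * t)
            * math.cos(2 * math.pi * half_difference * t))


# Largest disagreement between the two forms over 5000 sample points.
max_gap = max(abs(summed(i / 1000) - product_form(i / 1000)) for i in range(5000))
```

So the sum oscillates at the in-between (mean) frequency, but its amplitude swells and fades at the difference frequency, which is exactly why it is not a pure sine wave of the in-between colour.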

Newton recognized that there were colours that didn't appear in the spectrum but were achievable through combining spectral colours. What I'm not clear about is whether such combinations require three components. Will a combination of two spectral colours only give you an approximation of another spectral colour? Do you need a third component to get non-spectral colours? Does adding a fourth component give you better approximations but with a far reduced marginal utility?

Anyone care to shed some light? (pun intended)

UPDATE (2004/06/07): now see Update On The Primary Colours.

by : Created on June 7, 2004 : Last modified Feb. 8, 2005 : (permalink)


Update On The Primary Colours

Already got a response to my primary colours question from T.J. Jankun-Kelly.

Some great points:

TJK also cited http://hyperphysics.phy-astr.gsu.edu/hbase/vision/colper.html which is a nice summary of colour science and makes a point that is key for me:

"It is found that many different combinations of light wavelengths can produce the same perception of color."

So there is a definite distinction between the dimensionality of colour perception and the actual space of light. I feel more confident now in asserting that the actual colour space of light is infinite dimensional but that it is projected onto a three dimensional non-linear space of perception.
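A toy model of that projection, as a sketch only. It assumes a spectrum sampled in just four wavelength bins and three cone types with made-up sensitivity weights, and it models only a linear weighting step, ignoring whatever non-linearity comes afterwards. None of the numbers are real physiological data.

```python
# Hypothetical cone sensitivities over four wavelength bins (made-up weights).
CONES = {
    "L": [0, 1, 2, 1],
    "M": [1, 2, 1, 0],
    "S": [2, 1, 0, 0],
}


def response(spectrum):
    """Project a sampled spectrum onto the three cone responses."""
    return tuple(
        sum(weight * power for weight, power in zip(weights, spectrum))
        for weights in CONES.values()
    )


# Two physically different spectra (power per wavelength bin).
spectrum_a = [1.0, 4.0, 1.0, 6.0]
spectrum_b = [1.5, 3.0, 2.5, 4.0]
```

Both spectra produce the response (12, 10, 6). They differ by a multiple of (1, -2, 3, -4), a direction to which all three of these hypothetical cones are blind, so in this model they would be perceived as the same colour. With more bins than cones, such invisible directions always exist.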

So does that mean that the number of cones is the reason for the dimensionality of the (perceptual) colour space?

UPDATE (2004/06/08): TJK says yes. The number of cones is the reason for the dimensionality of the perceptual colour space. Chickens, which have 12 cones, would have a 12-dimensional perceptual colour space. Makes me think of a name one could use for an article on this topic: "If Munsell Were A Chicken".

by : Created on June 7, 2004 : Last modified Feb. 8, 2005 : (permalink)


Alive and Well

It's been a while since I've blogged, due to three factors: busy in US with mValent including an office move to Burlington; travelling back to Australia; being sick for the last week.

Expect a bunch of stuff soon.

by : Created on June 4, 2004 : Last modified Feb. 8, 2005 : (permalink)


Da Vinci's Notebooks and RSS Reading Plans

Catching up on blog reading, I discovered Matt Webb's day-by-day feed of the Notebooks of Leonardo Da Vinci. Besides the actual feed itself, which I've now subscribed to, Matt nicely solves the problem of how to provide an RSS-based day-by-day reading plan that allows the reader to start at any time.

Rather than a single feed, it appears Matt offers a new feed each day. If you start the reading plan on the 4th of June, for example, you use the feed at http://interconnected.org/home/more/davinci/2004-06-04.rss whereas people starting in a week's time will presumably use a feed at http://interconnected.org/home/more/davinci/2004-06-11.rss

The main page provides a dynamically generated link to the feed to use if you're starting today.
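A sketch of how such a scheme might work. The URL pattern is the one visible in the two example feeds above; the rule for how many instalments a given feed contains is purely my guess at the server side, and the function names are made up.

```python
from datetime import date

BASE = "http://interconnected.org/home/more/davinci/"


def feed_url(start):
    """Feed address for a reader who begins the plan on `start`."""
    return f"{BASE}{start.isoformat()}.rss"


def instalments_due(start, today):
    """Assumed rule: one instalment per day, counting the start day itself."""
    return max(0, (today - start).days + 1)
```

Under that assumption the feed needs no per-reader state at all: the start date in the URL plus today's date is enough to decide which instalments to serve.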

It's a very simple idea but a great way of implementing a reading plan. I'll probably implement it in my own Leonardo.

by : Created on June 4, 2004 : Last modified Feb. 8, 2005 : (permalink)


Naked Objects in Sparta

When I first discovered Naked Objects, I thought about a Python implementation that used RDF to provide the schema. What I wasn't sure about was how best to mix use of Python objects for the instances with RDF for the schema.

The answer could be mnot's sparta.

Sparta provides a simple Python-object-to-RDF binding. It uses eikeon's rdflib, which is based on the code eikeon and I wrote for Redfoot.

Sparta itself doesn't make use of the schema module in rdflib but a Naked Objects implementation could be built on top of Sparta + the schema module.

The schema module lets you do things like ask what properties are allowed on a given resource and what values those properties can take. (Actually, it doesn't appear to do the latter anymore - I'll have to ask eikeon why that dropped out of our original code). The generator-based version of the schema module is certainly a lot cleaner than the original code eikeon and I wrote three years ago. One of us should write up a side-by-side comparison of before and after.
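To illustrate the kind of question I mean, here is a sketch that deliberately does not use rdflib or its schema module. It holds triples as plain Python tuples, treats rdfs:domain as "this property is allowed on instances of this class", and ignores subclass inference. The ex: names are invented for the example.

```python
DOMAIN, RANGE, TYPE = "rdfs:domain", "rdfs:range", "rdf:type"

# Hypothetical schema and instance data as (subject, predicate, object).
TRIPLES = {
    ("ex:title", DOMAIN, "ex:Project"),
    ("ex:title", RANGE, "rdfs:Literal"),
    ("ex:maintainer", DOMAIN, "ex:Project"),
    ("ex:maintainer", RANGE, "ex:Person"),
    ("ex:name", DOMAIN, "ex:Person"),
    ("ex:leonardo", TYPE, "ex:Project"),
}


def types_of(resource):
    """Classes the resource is declared to be an instance of."""
    return {o for s, p, o in TRIPLES if s == resource and p == TYPE}


def properties_for(resource):
    """Properties whose domain includes one of the resource's classes."""
    classes = types_of(resource)
    return sorted(s for s, p, o in TRIPLES if p == DOMAIN and o in classes)


def ranges_of(prop):
    """Classes that values of the property may belong to."""
    return sorted(o for s, p, o in TRIPLES if s == prop and p == RANGE)
```

A generic UI would call something like properties_for to decide which fields to show for a resource and something like ranges_of to decide what kind of editor each field needs.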

So another little project to add to the list: a Naked-Objects-like generic UI on top of sparta, using the schema module in rdflib to control property domains and ranges. The result would be pretty close to a rich-client version of the original generic viewer in Redfoot.

by : Created on May 15, 2004 : Last modified Feb. 8, 2005 : (permalink)


Using the Leo Outliner as a PIM

Because what little time I've had to work on "recreational programming" has been spent on Leonardo and the tree-based instant messenger, I haven't made any progress on some of my tree-based personal information manager ideas.

So I thought it would be interesting to just start using an outliner to keep track of my calendar, projects, weekly work targets, etc. So far it's working pretty well.

I'm using Leo (not to be confused with Leonardo) which is a Python/Tk-based outliner that also supports literate programming. I first heard about Leo a few years ago from Joe Orr whom I met via the now-defunct Alliance for Advanced Real Estate Transaction Technology (AARTT).

One feature Leo has which is extremely important (although by no means unique amongst outliners) is the ability to place a node under multiple parents. I'll write a separate entry soon about why I think hierarchies based on only the containment relationship are an unnecessary limitation (and how hierarchies should be done).

I've got some top-level nodes for goals, interests, responsibilities and projects. I also have a top-level node for my calendar.

Using multiple parents, I can link a project under one or more goals, interests and responsibilities. An event such as a meeting can go under a specific day node in the calendar as well as under the relevant project, responsibility, etc.

I even have a node for each week's "status" - I can link tasks, meetings, etc under the status node for the week and generate a weekly status report by using Leo's export-flattened-outline-to-file feature.
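The structure is simple enough to sketch. This is not Leo's API or file format, just a minimal Python model of the idea, with made-up headlines: the same node object sits in the child lists of several parents, and a status report is a flattened walk of one subtree.

```python
class Node:
    """An outline node that can appear under any number of parents."""

    def __init__(self, headline):
        self.headline = headline
        self.children = []

    def add(self, child):
        self.children.append(child)
        return child


def flatten(node, depth=0):
    """Return the subtree as indented lines, one per node."""
    lines = ["  " * depth + node.headline]
    for child in node.children:
        lines.extend(flatten(child, depth + 1))
    return lines


calendar = Node("Calendar")
day = calendar.add(Node("2004-05-17"))
projects = Node("Projects")
leonardo = projects.add(Node("Leonardo"))
status = Node("Status: week of 2004-05-17")

# One meeting, linked under a day, a project and the weekly status node.
meeting = Node("Release planning meeting")
for parent in (day, leonardo, status):
    parent.add(meeting)

report = "\n".join(flatten(status))
```

Because the meeting is one shared object rather than three copies, a note added to it from the calendar is also there when I look at it from the project or the status report.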

Leo has a text editor pane that allows text to be associated with any node in the outline (this is where you'd write your literate programs when using Leo for that purpose). This pane is great for taking notes on tasks, meetings, etc.

There are a couple of features that would really take Leo to the next level for these sorts of things. I'm not sure if it would make sense to extend Leo or to write a Leo-like outliner sans the literate programming features.

What I'd really like is the ability to have custom views of a node, not just the text pane. The text pane could be replaced by a tabbed control and the text editor would be just one of the tabs. Other tabs could contain things like the result of applying an XSLT stylesheet to the node and its descendants. This would be great for things like my Calendar node, which doesn't really lend itself to a tree view. It would also be very useful for aggregated views where I want to see the text for a node and all its descendants in a single view.

An XSLT stylesheet generating read-only HTML would be a huge step forward. You could go even further by having custom controls on these tabs that allow manipulation of the data. Finally, you could introduce a way of expressing properties (via something like Notation-3 in the text view) and make the hierarchy derived.

by : Created on May 15, 2004 : Last modified Feb. 8, 2005 : (permalink)


Leonardo 0.3.0 Released

A new version of Leonardo, the code behind this site, is available.

Changes:

If you are upgrading, you'll just need to replace the lib directory and add the three new settings (the ones beginning DRAFT) to your config file.

by : Created on May 15, 2004 : Last modified Feb. 8, 2005 : (permalink)


XML Infoset and XML Schemas versus RDF and RDF Schemas

In my other mnot response today, I wrote the response here and just referenced it in the comments section of mnot's entry. In this case, I responded in full in the comments section and then decided afterwards to include it here as well. More evidence that comments and trackbacks are the same thing.

I wrote the following in response to mnot's Informational Properties of Infosets:

It could be my document-centric bias (like many members of the original WG, I came from a publishing and text processing background) but, for the most part, I've viewed XML as surface syntax (and by extension, XML schemas as grammars for surface syntax and the Infoset as modelling surface syntactic information).

RDF/RDFS has always seemed to me to be a much better data modelling language. The problem I've always had with the syntax of RDF is that it is neither a fixed serialization of the data model nor a generic mapping to-and-from arbitrary XML. It is rather a middle-ground where some common XML patterns are supported but not the generic case.

I've always argued that RDF should support a mapping to any XML surface syntax. Back when I was writing PyTREX (never updated to support RELAX NG, unfortunately) I was hoping to annotate the TREX grammar (which I saw as being about surface syntax) with a mapping to RDF (which I saw as the right way to express the underlying data model of the document). This plus something like Sparta would then be the XML data binding.

The Infoset is priceless for modelling the surface syntax. For everything else there's RDF.

by : Created on May 15, 2004 : Last modified Feb. 8, 2005 : (permalink)


Inheriting Doc in Javadoc and Eclipse

There are three ways in Javadoc that you can reference interface doc from implementations.

For a while I've favoured alternative A over alternative B because of the "inherited doc in Eclipse" issue. But there are still disadvantages of this approach.

I recently discovered the {@inheritdoc} tag in Javadoc: alternative C. It's attractive, but it's a tough call because even though the only issue with it is the lack of inherited doc in Eclipse, that's a pretty big issue for me.

Right-click "Open Super Implementation" allows navigation up the tree to see the superclass or interface's doc but it still doesn't help with tool tips for calls to the implementation.

Alternative A: non-Javadoc @see reference

Pros:

Cons:

Alternative B: Javadoc @see reference

Pros:

Cons:

Alternative C: Javadoc {@inheritdoc}

Pros:

Cons:

by : Created on May 11, 2004 : Last modified Feb. 8, 2005 : (permalink)


The Haskell Road to Literate Tutorials

Previously, I've talked a little about using literate programming for writing tutorials.

I just found The Haskell Road to Logic, Maths and Programming (via Lambda the Ultimate) in which they state that "the full source code of all programs is integrated in the book; in fact, each chapter can be viewed as a literate program in Haskell."

I haven't yet established if the book was actually written as a literate program or whether they are just saying it is like one. No indication, either, of how they express the evolution of the code in the source to the book.

But speculation aside and quite apart from the literate programming aspects of the book, it looks like a very nice book on foundational mathematics.

I think there is a certain flavour that programming brings to pure mathematics. The way I think of topics like abstract algebra is heavily influenced by both object-oriented and functional programming (see, for example, my observations about currying tensors.)

by : Created on May 10, 2004 : Last modified Feb. 8, 2005 : (permalink)


Leonardo 0.2.1 Released

A quick bug-fix release for Leonardo.

Changes:

by : Created on May 6, 2004 : Last modified Feb. 8, 2005 : (permalink)


Buying from Amazon within Eclipse

My Own Shopping Cart is a plugin for Eclipse that lets you browse and shop at Amazon.com. Not only a great application of Amazon.com's API but also more evidence that by 2005, there will be people that never leave Eclipse to do their work.

by : Created on May 6, 2004 : Last modified Feb. 8, 2005 : (permalink)


Leonardo 0.2.0 Released

I've released a new version of Leonardo, the wiki/blog publishing system that is used to produce this site.

Changes:

by : Created on May 5, 2004 : Last modified Feb. 8, 2005 : (permalink)


The Bible and the Semantic Web

For many years I've been thinking about the application of Semantic Web technology to studying (and presenting the results of the study of) the Bible. However, I never really thought about the application of Bible study (and the tools and techniques developed for it) to the Semantic Web. Then I came across this great blog entry, discussing the latter.

On the former, there is a wonderful site SemanticBible that I hope I can contribute to in some way.

I also really need to get back to my morphological analysis. I haven't thought about it for a while, but I need to come up with URIs for each lemma and word form. I could even grandfather in Strong's numbers and G/K numbers.

by : Created on May 4, 2004 : Last modified Feb. 8, 2005 : (permalink)


Digital Life Colophon

Mark Pilgrim has a nice entry on his "essentials"—the software he can't do without.

I like reading posts like this. They're like a colophon for your digital life.

Someone should come up with an RDF schema for this sort of information.

Doesn't just have to be software - could include what cell phone, what portable music player, etc.

by : Created on May 1, 2004 : Last modified Feb. 8, 2005 : (permalink)


Google Auction in Python

After reading the possible auction-based share allocation algorithm in Google's S-1 filing, I thought I'd try to implement it in Python.

The result is available at http://jtauber.com/2004/04/30/google_auction.py.
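
For readers who want a feel for the idea without the script, here is a minimal sketch of a generic uniform-price auction with pro-rata scaling. It is an assumption about the general shape of such an algorithm, not a reproduction of the S-1's wording or of google_auction.py; `allocate` and the sample bids are hypothetical.

```python
def allocate(bids, shares_offered):
    """Uniform-price allocation (illustrative only).

    bids: list of (bidder, shares_wanted, price_per_share).
    Returns (clearing_price, {bidder: shares_allocated}).
    """
    # Walk down from the highest bid price until demand covers the offering.
    clearing_price = None
    demand = 0
    for price in sorted({p for _, _, p in bids}, reverse=True):
        demand = sum(n for _, n, p in bids if p >= price)
        clearing_price = price
        if demand >= shares_offered:
            break

    if clearing_price is None:  # no bids at all
        return None, {}

    allocation = {}
    for bidder, wanted, price in bids:
        if price < clearing_price:
            continue  # unsuccessful bid
        if demand > shares_offered:
            # Oversubscribed: scale every successful bid by the same fraction.
            # Integer division rounds down, so a few shares may stay unallocated.
            granted = wanted * shares_offered // demand
        else:
            granted = wanted
        allocation[bidder] = allocation.get(bidder, 0) + granted
    return clearing_price, allocation


bids = [
    ("alice", 100, 50.0),
    ("bob", 200, 45.0),
    ("carol", 300, 40.0),
    ("dave", 100, 35.0),
]
price, allocation = allocate(bids, 400)
print(price, allocation)
```

Every successful bidder pays the same clearing price, whatever they bid. What to do with the shares left over from rounding is a policy choice this sketch leaves open.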

by : Created on April 30, 2004 : Last modified Feb. 8, 2005 : (permalink)


Blogs, Annotations, Comments and Trackbacks

Danny Ayers makes the link between blogging and annotation. I've been thinking about this sort of thing from a different (although related) viewpoint.

Lately I've been thinking about implementing comments and/or trackback in Leonardo. I personally think they are essentially the same thing and that there is an obvious relationship with web page annotation.

The way trackback works is you ping a blog with information about a reference that has been made about an entry in that blog. The information can include an excerpt of what was said.

The trackback implementations I've seen tend to give a trackback URI for an entry that is different from the URI for the entry itself. A more RESTful approach (and one I plan to implement in Leonardo) is to have the trackback URI be the URI of the entry. So you POST to the blog entry to trackback.

I had already considered POSTing to the blog entry as the mechanism for comments and that is when it first struck me that comments and trackbacks are really the same thing. The fields that you POST would be slightly different, but the mechanism should be the same.
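
A minimal sketch of "same mechanism, slightly different fields". The requests are built but never sent. The trackback field names (`url`, `title`, `excerpt`, `blog_name`) follow the TrackBack convention; the comment fields, the `build_post` helper and the example URIs are hypothetical, and nothing here is Leonardo's actual API.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request


def build_post(entry_uri, fields):
    """Build (but do not send) a form-encoded POST aimed at the entry itself."""
    body = urlencode(fields).encode("utf-8")
    return Request(
        entry_uri,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


ENTRY = "http://example.org/blog/2004/04/29/some_entry"  # hypothetical entry URI

# A comment: hypothetical field names.
comment = build_post(ENTRY, {"author": "A Reader", "text": "Nice point."})

# A trackback: same target URI, different fields.
trackback = build_post(ENTRY, {
    "url": "http://example.net/my-response",
    "title": "My response",
    "excerpt": "Comments and trackbacks look like the same thing.",
    "blog_name": "Example Blog",
})

for request in (comment, trackback):
    field_names = sorted(parse_qs(request.data.decode("utf-8")))
    print(request.get_method(), request.full_url, field_names)
```

The server can tell the two apart purely by which fields are present, so no separate trackback URI is needed.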

Which leads me to web page (or, more generally, resource) annotation. There is no reason why the resource you post to should be restricted to being a blog entry. In fact, in Leonardo, there is nothing special about a blog entry—implementing trackback/comments for blog entries would enable the same capability for any page on my site.

Finally, in all these cases the mechanism involves POSTing the comment/trackback/annotation to the source itself but there is no reason why such information couldn't also be detached and expressed in RDF for the purposes of annotation servers, Technorati-style sites, etc.

I'd like to see a spec that supports this approach. I don't think it should be Atom's initial goal but I would like Atom to at least be compatible with the kind of unification I'm describing.

UPDATE (2004/05/06): see Joe Gregorio's response.

by : Created on April 29, 2004 : Last modified Feb. 8, 2005 : (permalink)


Email Stats

The last week, I've kept all incoming mail including spam to get a rough breakdown of how much email I get and of what category it is.

Here are the preliminary results:

leaving the remaining 1.5% (124 messages) which is the core email I likely need to act on, reply to and which I will file in my "Keep" folder (currently close to 10,000 emails) once it's no longer actionable.

by : Created on April 28, 2004 : Last modified Feb. 8, 2005 : (permalink)


New Amazon Purchases

At some stage I might add a "currently reading" box on this site. Until then, here's what just arrived from Amazon:

by : Created on April 27, 2004 : Last modified Feb. 8, 2005 : (permalink)


Bubblets After Bray

Following after Tim Bray's linking to the Technorati Cosmos for each post, I've done the same. Not that I get nearly the incoming links that Tim does (two orders of magnitude less, in fact).

Simon Phipps makes some excellent comments on the limitations of this approach and has the beginnings of a nice taxonomy of comments.

UPDATE (2004/05/05): Decided to remove the bubblets - I just don't get enough incoming links yet :-). It was a good exercise in making Leonardo a little more extensible, though, and other users of Leonardo can easily add them back in.

by : Created on April 23, 2004 : Last modified Feb. 8, 2005 : (permalink)


Introducing Leonardo

I've had a few requests for the Python code this site runs on so, over the weekend, I cleaned it up a little ready for a release. I then had to come up with a name. I'm over my "must include 'Py' in the name of every Python project" phase and wanted something that evoked the notion of a technologist's or scientist's notebook. It didn't take me long to come up with "Leonardo". And no sign of a name clash on either Freshmeat or Sourceforge. So, "Leonardo" it is.

I'm not sure it's ready for prime-time (although it's been running jtauber.com for almost a year) but if you have a lot of patience, you are more than welcome to give it a try and I'd love your feedback.

Code is available from the Leonardo page. I'll incrementally put documentation there too - likely in response to user questions as they come in.

by : Created on April 21, 2004 : Last modified Feb. 8, 2005 : (permalink)


Elements of Linear Spaces

Since Tim Bray kindly announced my entry into the blogosphere, I've found that questions I pose here get answered by wonderful people I've never met before.

It's worked with technology questions so let's try it with a question of mathematical terminology that has been bothering me recently.

The question is simply: If one wishes to refer to vector spaces by the alternative "linear spaces", what should elements of that structure be referred to as, if not "vectors"?

I want to avoid using the term "vector" for generic elements of a linear space because, when talking about things like one-forms and bivectors, I'd like to use the term "vector" in its narrower sense.

Most texts I've looked at give "linear space" as an alternative name for "vector space" but none provide an alternative to "vector".

Any ideas?

by : Created on April 20, 2004 : Last modified Feb. 8, 2005 : (permalink)


More Feeds Wanted

One of the first things I noticed when I started using a news aggregator is how much it changed my web surfing routine. I used to get up in the morning and open my bookmarks for daily reading: news sites like Slashdot and blogs like ongoing. With an aggregator I read a lot more but it's a lot more efficient because I don't need to visit a site that hasn't been updated. I've greatly reduced the number of "information inboxes" I need to routinely check. That's a good thing and David Allen agrees with me. I even read Dilbert via a feed.

There are still a small number of sites that are part of my old web surfing routine. I wish these had RSS/Atom feeds. One such site is my bank. I make a point of logging into online banking regularly to check that nothing funny is going on. Just last week I had the first case of fraudulent use of my credit card. I noticed a bunch of purchases being made in North Hollywood (where I was six weeks ago). It's a real pain to cancel a card and get a new one. But I digress. My point is that I'd like to get a feed of my transactions rather than having to explicitly go to the bank's site. (I remember financial aggregators like OnMoney.com were big a few years ago. I wonder if they'd offer an RSS feed if they were around today).

Another feed I'd like to see: source control check-in logs. Someone must have done a CVS to RSS bridge.

Event Log Monitors with RSS has been done.

UPDATE (2004/04/17): Tim Bray actually gives the credit card transaction example in his eWeek interview

UPDATE (2004/04/23): Mel Riffe pointed me to Fisheye from the same people that make the excellent coverage tool Clover. Think of Fisheye as ViewCVS on steroids. Looks very cool - and it provides an RSS feed!

UPDATE (2004/04/23): Aaron Straup Cope pointed me to cvs2rss.

UPDATE (2004/04/23): Norm Walsh has written cvslog2atom.

by : Created on April 16, 2004 : Last modified Feb. 8, 2005 : (permalink)


Digital Lifestyle Aggregation

If it isn't obvious already, I'm deeply interested in the convergence between email, IM, blogs, calendars, contact lists, music playlists, photo collections, etc. It was the original driver for Redfoot and, in fact, drove a lot of my passion for SGML (and then XML) in the mid-nineties.

I've just come across Marc Canter's use of the term "Digital Lifestyle Aggregation". I like it.

I still think RDF or something RDF-like is core: at least the concepts of URIs, out-of-band relationships and relationship types as first class objects.

by : Created on April 15, 2004 : Last modified Feb. 8, 2005 : (permalink)


Amazon and Google

Word has just gotten out in the blogosphere about Amazon.com's A9 search engine. This, and my recent rediscovery of Alexa, has got me thinking a lot about the similarities and differences between Amazon.com and Google and also synergies both inter-company and intra-company.

Random Thoughts:

Amazon and Google are two of my favourite companies. It will be fascinating to see what happens over the next few years.

UPDATE (2004/04/16): Found (via Scoble) an interesting Amazon what-if: Amazoning the News.

by : Created on April 15, 2004 : Last modified Feb. 8, 2005 : (permalink)


IM in the Matrix

According to this Gamespot story, communication in the forthcoming MMORPG Matrix Online will use AIM. This means that people in the game will be able to communicate with people outside the game and vice-versa.

What makes this particularly appealing, I think, is that unlike other MMOGs, Matrix Online isn't 100% escapism. When you are in the Matrix, you are still the real you. You aren't playing a character completely separate from the real you. So it's entirely plausible within the game world that some friend outside could IM you and you could reply "I'm in the Matrix at the moment. Wanna join me or should I come out?"

Very cool concept.

by : Created on April 13, 2004 : Last modified Feb. 8, 2005 : (permalink)


Tree-based Instant Messaging

My sister Jenni and I have a lot of IM sessions with very rich structure: lots of tangents and a real need to maintain a stack so as not to miss anything. Even when I'm IMing with other people, there are frequently multiple threads going on at a time and it is sometimes difficult to follow which response goes with which thread.

For a while, Jenni and I have been talking about writing a tree-based instant messaging client - a real-time threaded discussion client.

This weekend, we were able to come up with a usable prototype using Python, wxPython and Jabber.

Stay tuned for more information as development continues.

UPDATE (2004/04/16): Michael Lawley has just told me about http://tickertape.org where "there's a whole family of IM clients supporting threaded discussion."

by : Created on April 11, 2004 : Last modified Feb. 8, 2005 : (permalink)


Bayesian Classification for Blog Reading Prioritization

Mouthful of a title, I know.

During my reading-USENET-via-nn days, I envisaged a news reader that would learn from what I selected and what I didn't select to read and would sort the articles according to how likely it thought I would want to read them.

I didn't know about Bayesian Classification at the time. Now that I do, it seems the perfect technique to use.

I wonder if a similar technique would be useful in prioritizing the reading of blog entries. Admittedly, the signal-to-noise ratio on the blogs I read is considerably higher than USENET but the quantity of blogs I now read makes it potentially useful.
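
A minimal sketch of what that might look like: a two-class naive Bayes classifier over words, trained on entries that were read versus skipped, then used to sort new entries. The class name, the labels and the sample texts are all hypothetical, and a real reader would need far more training data than this.

```python
import math
import re
from collections import Counter


class ReadingPrioritizer:
    """Two-class naive Bayes over words, with add-one smoothing."""

    LABELS = ("read", "skipped")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = {label: 0 for label in self.LABELS}

    @staticmethod
    def tokens(text):
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        self.word_counts[label].update(self.tokens(text))
        self.doc_counts[label] += 1

    def _log_score(self, text, label, vocab_size):
        total_docs = sum(self.doc_counts.values())
        # Smoothed prior so an untrained class never yields log(0).
        logp = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
        total_words = sum(self.word_counts[label].values())
        for word in self.tokens(text):
            count = self.word_counts[label][word]
            # The extra +1 in the denominator reserves mass for unseen words.
            logp += math.log((count + 1) / (total_words + vocab_size + 1))
        return logp

    def log_odds(self, text):
        """Positive when the text looks more like 'read' than 'skipped'."""
        vocab = set(self.word_counts["read"]) | set(self.word_counts["skipped"])
        return (self._log_score(text, "read", len(vocab))
                - self._log_score(text, "skipped", len(vocab)))

    def prioritize(self, entries):
        """Entries sorted from most to least likely to be worth reading."""
        return sorted(entries, key=self.log_odds, reverse=True)


model = ReadingPrioritizer()
model.train("python generators and rdf schemas", "read")
model.train("new python release with rdf support", "read")
model.train("celebrity gossip and football scores", "skipped")
model.train("football transfer gossip roundup", "skipped")

ranked = model.prioritize(["football gossip this week", "rdf schemas in python"])
print(ranked)
```

Training would happen implicitly: every entry opened in the aggregator is a "read" example and every entry scrolled past is a "skipped" one.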

by : Created on April 10, 2004 : Last modified March 28, 2005 : (permalink)


USENET and Blog Reading Strategies

I was a fairly active USENET reader in the early-to-mid nineties. For a while I used tin as my news reader but once the quantity of groups I was regularly reading reached a critical mass, I found the approach of nn more suitable. In the former, I would navigate to a particular newsgroup and if I saw any articles of interest, I'd navigate into each, one-by-one. In the latter, I'd scan the list of articles across all newsgroups and tag those that looked interesting and only then would start to read them.

Recently I've heard seasoned blog readers talking about their blog reading strategies in very similar terms to the way things were done with nn. I would say my current blog reading is more tin-like but I am starting to reach that point where I may have to switch to an nn-like reading strategy.

by : Created on April 10, 2004 : Last modified Feb. 8, 2005 : (permalink)


Wiki/Blog Hybrid

As much as possible I've tried to make this site a hybrid of blog and (privately-editable) wiki. In fact, blog entries are just wiki pages whose location in the URL space of my site means they get picked up by both my Atom feed generator and the "by day", "by month", "by year" and "all" blog entries listings. The site's current API (as RESTful as the lack of PUT support in browsers allows me to be) doesn't distinguish wiki page from blog entry.

As a blog entry is a wiki page in my homegrown system it raised questions in my mind about the extent to which a wiki page is blog-entry-like. This in turn gave me the idea of making an Atom feed consisting of an entry for each page on my site. This "site map" feed isn't a change log, it summarizes the actual site itself.
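
A minimal sketch of such a "site map" feed, with one Atom entry per page. It is not Leonardo's feed generator: the `site_map_feed` function, the page dictionaries and the example paths are hypothetical, and it uses the namespace from the later Atom 1.0 specification (RFC 4287), which postdates this entry.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"  # Atom 1.0 namespace (an assumption)


def _atom(tag):
    """Qualified tag name in the Atom namespace."""
    return f"{{{ATOM_NS}}}{tag}"


def site_map_feed(site_title, site_url, updated, pages):
    """Build an Atom feed with one entry per page.

    pages: list of dicts with hypothetical keys "path", "title", "modified".
    """
    ET.register_namespace("", ATOM_NS)
    feed = ET.Element(_atom("feed"))
    ET.SubElement(feed, _atom("title")).text = site_title
    ET.SubElement(feed, _atom("id")).text = site_url
    ET.SubElement(feed, _atom("updated")).text = updated

    for page in pages:
        url = site_url.rstrip("/") + page["path"]
        entry = ET.SubElement(feed, _atom("entry"))
        ET.SubElement(entry, _atom("title")).text = page["title"]
        ET.SubElement(entry, _atom("id")).text = url
        ET.SubElement(entry, _atom("updated")).text = page["modified"]
        ET.SubElement(entry, _atom("link"), href=url)

    return ET.tostring(feed, encoding="unicode")


pages = [
    {"path": "/blog/example_entry", "title": "An Example Entry",
     "modified": "2004-04-09T00:00:00Z"},
    {"path": "/example_page", "title": "An Example Page",
     "modified": "2004-05-15T00:00:00Z"},
]
xml_text = site_map_feed("Example Site", "http://example.org/",
                         "2004-05-15T00:00:00Z", pages)
print(xml_text)
```

Note that nothing in the function cares whether a page is a blog entry or a wiki page, which is the point: the feed describes the site as it is, not a log of changes to it.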

by : Created on April 9, 2004 : Last modified Feb. 8, 2005 : (permalink)


Channel 9 and Bill Hill

About a month ago Scoble blogged some notes from an interview with Bill Hill, the type guru at Microsoft responsible for, amongst other things, ClearType.

Yesterday, I checked out Microsoft's new Channel 9. I've found two great video interviews with Bill Hill so far. They are definitely worth listening to. He's like a geek Billy Connolly without the swearing.

by : Created on April 7, 2004 : Last modified Feb. 8, 2005 : (permalink)


Eclipse is the next Emacs

November last year, on the FoRK mailing list, I declared that "Eclipse is the new Emacs" and predicted that "by 2005, there will be people that never leave Eclipse to do their work."

Since then I've been toying with the idea of a full-blown PIM based on Eclipse: email, rss aggregation, calendar, todo and maybe even instant messaging. The nature of Eclipse is such that these need not all come from the same developers.

Yesterday, I discovered (via this presentation) that the Haystack (RDF-based PIM) project at MIT is moving to Eclipse.

Today I found a wiki-style note-taker (like VoodooPad, I guess) that runs on Eclipse.

I think my prediction is definitely shaping up to come true and it's quite possible I'll be one of those people.

UPDATE (2004/05/06): More evidence: Buying from Amazon within Eclipse

by : Created on April 5, 2004 : Last modified Feb. 8, 2005 : (permalink)


libferris

libferris looks like a very nice project along the lines of Plan X.

by : Created on April 3, 2004 : Last modified Feb. 8, 2005 : (permalink)


Amazon.com - Your Store

If "James's Store" is really my store - why can't I just claim everything in it?

by : Created on April 3, 2004 : Last modified Feb. 8, 2005 : (permalink)


Ant and Little Languages

James Duncan Davidson has a nice article on his choice to use XML for Ant scripts:

http://x180.net/Articles/Java/AntAndXML.html

His comment that "I never intended for the file format to become a scripting language" and "If I knew then what I knew now, I would have tried using a real scripting language" reinforced the argument I've made with friends and colleagues for years that you almost always end up needing a full-blown language in the end so you are much better off just starting with something like Python rather than inventing a domain-specific language.

I've seen a similar argument made (and myself made it) as a counter-counter argument against Tcl: Anti-Tcl says "Tcl isn't a full-blown language". Pro-Tcl says "Tcl isn't intended to be; it's for the little jobs that don't need a full-blown language". Anti-Tcl counters with "but what starts out as a little job almost always grows to a bigger one". I've used that argument against Perl too.

I'm wondering if the notion "you almost always end up needing a full-blown language in the end so you are much better off just starting with an existing full-blown language rather than using a little language or inventing a domain-specific one" has a name? Has someone claimed it as their Law yet?

by : Created on April 2, 2004 : Last modified Feb. 8, 2005 : (permalink)


Naked Objects

One of the key concepts behind Redfoot and, more recently, the TrIM project I'm working on with my sister, Jenni, is the notion that with a rich enough object schema, UI can be provided by the framework and objects can be manipulated and relationships expressed via that generic UI.

Via the session schedule for the Boston No Fluff Just Stuff conference, I found out about Naked Objects.

The session description reads:

"What if you never had to write a user interface again? What if you could simply expose your business objects directly to the end user? How would this affect your productivity? The way you work? The flexibility of your applications? Is this even possible? Sometimes, yes. This talk describes a style of application development, Naked Objects, where you write just the business objects, and a framework lets your users interact directly with these objects."

I tracked down the Naked Objects website and it turns out there is a book, which definitely looks worth getting.

A quick perusal suggests that the approach (at least as it is implemented) relies on classes written in some specific OO language rather than something like RDF. I think I'd prefer the flexibility and interoperability that declarative object schemas would provide and there's no reason why the Naked Object approach couldn't use RDF (or perhaps XMI?) with logic written in something like Python.

by : Created on March 31, 2004 : Last modified Feb. 8, 2005 : (permalink)


Geometric Algebra and Maxwell's Equations

My latest Amazon.com shipment included what is shaping up to be my favourite book in the area of mathematical physics: Doran and Lasenby's "Geometric Algebra for Physicists".

The elegance of geometric algebra is clearly evident in the fact that Maxwell's equations become a single equation in this algebra. I recall my delight when I discovered that a tensor treatment of Maxwell's equations resulted in two equations instead of four. This takes that to the next level.
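
For reference, the four-to-two-to-one progression looks roughly like this. The forms below assume natural units (c = 1, Heaviside-Lorentz), and signs and factors vary between texts with the metric and field conventions chosen, so treat this as a sketch rather than as the notation of any particular book.

```latex
% Vector calculus: four equations
\nabla \cdot \mathbf{E} = \rho, \qquad
\nabla \times \mathbf{B} - \frac{\partial \mathbf{E}}{\partial t} = \mathbf{J}, \qquad
\nabla \cdot \mathbf{B} = 0, \qquad
\nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = 0

% Tensors: two equations, with F^{\mu\nu} the field-strength tensor
\partial_\mu F^{\mu\nu} = J^\nu, \qquad
\partial_{[\lambda} F_{\mu\nu]} = 0

% Geometric algebra (spacetime algebra): one equation,
% with F = \mathbf{E} + I\mathbf{B} and I the pseudoscalar
\nabla F = J
```

The single equation splits back into the tensor pair by grade: the vector part gives \nabla \cdot F = J and the trivector part gives \nabla \wedge F = 0.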

It got me thinking that it would be fun to put together a document that took the reader on a tour of vector calculus, tensors and geometric algebra, in each case using Maxwell's equations as the common thread.

by : Created on March 27, 2004 : Last modified Feb. 8, 2005 : (permalink)


GA Tutorials

If you are interested in learning more about geometric algebra, there are some interactive tutorials at http://www.science.uva.nl/ga/tutorials/ which are worth checking out.

The tutorial adapted from a talk on GA at GDC2003 is excellent! The GA Viewer software that it uses is a nice piece of work too.

by : Created on March 27, 2004 : Last modified Feb. 8, 2005 : (permalink)


WebDAV Support

Yesterday, I was talking to Jenni about what we could use as a server for shared calendaring (she runs Mac OS X; I've been checking out eventSherpa). Jenni mentioned that iCal supports WebDAV which got me thinking: I should add WebDAV support to this site.

by : Created on March 25, 2004 : Last modified Feb. 8, 2005 : (permalink)


Goodbye Origin

Tonight I had the pleasure (and honour) of attending the closing down party / wake for Origin Systems.

It was recently announced that Electronic Arts was closing down their Austin office (i.e. Origin) and relocating the team to the Bay area.

Literally hundreds of ex-Origin employees (and a bunch of outsiders like myself) gathered at Lord British's huge lake-side property (complete with wooden fort and pirate ship) for a night celebrating the remarkable achievement that was Origin Systems.

Origin produced the games that had the biggest influence on me growing up. Hearing Garriott and others recount stories made me wish I'd somehow been a part of it.

by : Created on March 20, 2004 : Last modified Feb. 8, 2005 : (permalink)


Great Music / Great Food

This evening I caught up with my good friend and mValent co-founder, Duane Tharp who is visiting Austin. Along with his French friends Stefan and Raoul, we went down to the Town Lake Stage at Auditorium Shores.

First we heard Toots and the Maytals - probably the best reggae I've ever heard. I'll confess that until Raoul explained it to me, I was unaware of just how significant Toots is in the world of Reggae. I might have to buy an album now.

Next was Joss Stone who completely blew me away. I can't express it any better than this quote from the SxSW website: "Joss Stone may well be best old-school, roof-raising, Southern-style soul music to appear in the 21st century: a claim made all the more remarkable when you consider Joss Stone is a 16 year-old girl from England." All I can add is: wow!

After listening to Joss for a while, we went to Austin's best Italian restaurant Vespaio. Thanks to Claude and Alan for fantastic food and wine.

After the concert and food I was unsure if I had the energy to head off to catch the special secret act closing the Barsuk Records show at The Parish. Fortunately, The Parish was very close to the hotel so I decided to go, despite feeling exhausted. The secret act was, after all, my favourite band of all time: They Might Be Giants. (Thanks to my sister Leonie for letting me know they were performing!)

When I arrived at The Parish I got my first ever taste of VIP treatment by bouncers. There was a huge line outside (word had gotten around that the secret act was TMBG) and at first I was worried I would never get in. But as I approached the bouncer to confirm that the line was indeed for the Barsuk Records show, he saw my platinum SxSW attendee badge and let me in ahead of the line.

I slowly made my way to the front of the stage, about twenty minutes before the two Johns, two Dans and a new drummer (Mike?) came on stage. They played a bunch of stuff from their upcoming EP and album. The Linnell-sung songs all had his characteristic style (yet more ascending scales in the melody) but they were just as catchy as ever. "Experimental Film" might just be my favourite song du jour. Also fantastic was "Memo to Human Resources".

Other highlights were a phenomenal two-minute classical guitar intro to Istanbul from lead guitarist Dan; Particle Man, with some very funky bass from other Dan; Birdhouse, which is still probably my favourite song of all time; and the first live performance I've heard of Fingertips which they actually pulled off.

Probably the best TMBG concert I've been to and a perfect ending to an awesome night.

by : Created on March 19, 2004 : Last modified Feb. 8, 2005 : (permalink)


David Allen's Blog

Well, Scoble did it. David Allen now has a blog.

Welcome to the blogosphere, David!

by : Created on March 18, 2004 : Last modified Feb. 8, 2005 : (permalink)


SxSW Dinner

At about 6pm I was unsure whether to bother going to the official SxSW Music dinner but I'm so glad I did.

The event was considerably smaller than I expected -- I guess SxSW attendees aren't used to paying for expensive conference dinners. There were only five people on my table, including myself, but it was a much better environment to talk to people than I've experienced for probably the whole of SxSW (although the Austin Game Developers' Happy Hour was pretty good).

Renee Sebastian is a Pop/R&B artist based in San Francisco. She's released stuff on her own label and it sounds pretty good to my ears!

Kimberly Guise and Steven Erdman run GO Records based in New York and are currently working on an album featuring Brit James Hunter. If Steve's passion is anything to go by, the James Hunter album should be awesome.

I'll definitely be watching both Renee's and Kimberly and Steven's efforts. You might want to do yourself a favour (to quote Molly Meldrum) and check them out.

They were certainly a wonderful bunch of people to spend an evening talking with.

by : Created on March 17, 2004 : Last modified Feb. 8, 2005 : (permalink)


Versioned Literate Aspect-Oriented Programming

As I've been developing a GEF application, I've been thinking about turning it into a tutorial. I immediately wondered if it might be a great project for literate programming.

I could write a web and then tangle it to generate the GEF application and weave it to get the tutorial. But as features are incrementally added to the application over the course of the tutorial, conventional literate programming might not be enough. At the very least, some kind of versioning would need to be included.

But then it occurred to me that it's perhaps best thought of not just as a versioning issue but as an aspect-oriented one. For example, if step four of the tutorial is adding undo support then that would involve not only new classes but the insertion of code at points in existing methods.

I wonder if such a system exists?
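
To make the idea a bit more concrete, here is a rough sketch of how it might look. Each tutorial step contributes new chunks plus lines to be woven in at named points in earlier chunks, and tangling "as of step N" applies every step up to N. Everything here (the Step class, the tangle function, the "#@point" marker syntax) is invented for illustration and isn't taken from any existing literate programming or aspect-oriented tool.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One tutorial step: new chunks plus insertions into earlier chunks."""
    title: str
    chunks: dict = field(default_factory=dict)      # chunk name -> list of lines
    insertions: dict = field(default_factory=dict)  # point name -> list of lines


def tangle(steps, upto):
    """Build the source as it stands after the first `upto` steps."""
    chunks = {}
    insertions = {}
    for step in steps[:upto]:
        for name, lines in step.chunks.items():
            chunks[name] = list(lines)
        for point, lines in step.insertions.items():
            insertions.setdefault(point, []).extend(lines)

    output = []
    for lines in chunks.values():
        for line in lines:
            stripped = line.strip()
            if stripped.startswith("#@point "):
                # Replace the marker with whatever the applied steps wove in
                # at this point, keeping the marker's indentation.
                indent = line[: len(line) - len(line.lstrip())]
                point = stripped[len("#@point "):]
                output.extend(indent + extra for extra in insertions.get(point, []))
            else:
                output.append(line)
    return "\n".join(output)


steps = [
    Step(
        title="Step 1: move a shape",
        chunks={
            "move": [
                "def move(shape, dx, dy):",
                "    #@point before-move",
                "    shape.x += dx",
                "    shape.y += dy",
            ]
        },
    ),
    Step(
        title="Step 4: add undo support",
        chunks={"undo_stack": ["undo_stack = []"]},
        insertions={"before-move": ["undo_stack.append((shape, shape.x, shape.y))"]},
    ),
]

print(tangle(steps, upto=1))
print("---")
print(tangle(steps, upto=2))
```

Tangling with upto=1 gives the plain move function; with upto=2 the undo bookkeeping appears inside the existing method without the first step's chunk having been edited. A weave function could walk the same steps to produce the tutorial text.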

by : Created on March 15, 2004 : Last modified Feb. 8, 2005 : (permalink)


Onfolio

Via Scoble, I've discovered Onfolio and it looks awesome.

At one level, Onfolio is very similar to where my sister, Jenni, and I are headed with a little Python tool we're writing, code-named TrIM. Both essentially allow files, links and fragments of information to be managed in an organized collection.

Onfolio has nice integration with IE, which makes it great for web-based information gathering (although, like TrIM, it does support dragging-and-dropping of arbitrary files). It also has a nice publishing mechanism (including the ability to publish information collections as RSS feeds).

by : Created on March 14, 2004 : Last modified Feb. 8, 2005 : (permalink)


RSS Feed Coming Soon

I've become more and more addicted to a small handful of blogs lately (I'll put together a blog roll soon) and so finally switched to using an aggregator.

It now makes sense for me to implement an RSS feed for my own blog so other people using an aggregator aren't shut out. I've been GEF-hacking while not attending SxSW sessions and events but I may need to pry myself away from Eclipse to add RSS support to the homegrown Python code that generates this site.

UPDATE : An experimental Atom feed is available at http://jtauber.com/atom/.
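
For anyone wondering what adding a feed to a homegrown Python site generator might involve, here is a minimal sketch using only the standard library. It is not the code behind this site: the build_atom_feed function, the entry dictionaries and the example.com URLs are all assumptions made for illustration. It also uses the Atom 1.0 namespace, which is another assumption; an experimental feed from early 2004 would probably have used an earlier draft of the format.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"  # Atom 1.0 namespace (an assumption)


def atom(tag):
    """Qualify a tag name with the Atom namespace."""
    return f"{{{ATOM_NS}}}{tag}"


def build_atom_feed(title, feed_url, author, updated, entries):
    """Return a small Atom feed as a string.

    `entries` is a list of dicts with "title", "url" and "updated" keys,
    a made-up structure rather than any real site's data model.
    """
    ET.register_namespace("", ATOM_NS)  # serialise Atom as the default namespace
    feed = ET.Element(atom("feed"))
    ET.SubElement(feed, atom("title")).text = title
    ET.SubElement(feed, atom("id")).text = feed_url
    ET.SubElement(feed, atom("updated")).text = updated
    ET.SubElement(feed, atom("link"), rel="self", href=feed_url)
    author_el = ET.SubElement(feed, atom("author"))
    ET.SubElement(author_el, atom("name")).text = author

    for item in entries:
        entry = ET.SubElement(feed, atom("entry"))
        ET.SubElement(entry, atom("title")).text = item["title"]
        ET.SubElement(entry, atom("id")).text = item["url"]
        ET.SubElement(entry, atom("updated")).text = item["updated"]
        ET.SubElement(entry, atom("link"), href=item["url"])
    return ET.tostring(feed, encoding="unicode")


feed_xml = build_atom_feed(
    title="Example Blog",
    feed_url="http://example.com/atom/",
    author="Example Author",
    updated="2004-03-14T00:00:00Z",
    entries=[
        {
            "title": "RSS Feed Coming Soon",
            "url": "http://example.com/blog/rss-feed-coming-soon",
            "updated": "2004-03-14T00:00:00Z",
        }
    ],
)
print(feed_xml)
```

Since the site is already generated by Python, the entries list could in principle come straight from whatever structure holds the blog posts.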

by : Created on March 14, 2004 : Last modified Feb. 8, 2005 : (permalink)


SxSW: Day Two

Decided to take it easy during the day. Glad I did because I didn't get back from the short film screening until 2.30am tonight.

Went to the "I Am Stamos" party and caught up with Alex Eastburg (Writer/Producer), Rob Meltzer (Writer/Director) and Karl Preusser (Composer) all of whom I met at the Film opening party last night. Also met Robert Peters, John Stamos and Rebecca Romijn-Stamos.

The party included performance art that I could only describe as Chinese acrobatics meets lesbian French maids (that'll bring some interesting Google searches to this site!)

After the party, went back to the hotel and did some more GEF programming before heading off again for the midnight screening.

There were nine films in total ranging in length from four minutes to seventeen. The overall theme was clearly the absurd. Most of the films got at least a laugh out of me - some were absolutely hilarious. My two favourites were I Am Stamos and Walkentalk (the latter an absolute must-see for fans of Christopher Walken). Interestingly, they were the two shot on film. Also worth seeing (although not quite as good as "I Am Stamos" and "Walkentalk") is The Frank International Film Festival, a mock video diary of a visit to the most exclusive Film Festival in the world.

by : Created on March 13, 2004 : Last modified Feb. 8, 2005 : (permalink)


Free Markets and Ecosystem Simulation

Last night, Richard said that a lot of the world simulation aspects of Ultima Online had to be taken out because players didn't behave as expected. I have another theory.

The example he gave was that they originally set things up so that surrounding a village were sheep and deer and further out were wolves and dragons that ate those sheep and deer. The idea was that if players killed off the population of sheep and deer too quickly, the wolves and dragons would have to venture closer to the village and the village would cry out for a hero to kill the wolves/dragons. Hence the system itself would create the need for kill-the-dragon-type hero quests.

Unfortunately, players killed everything and so the system never had time to restore balance and so the concept had to be removed. Richard said there were lots of examples of this.

I wonder if the concept might have worked had human populations been susceptible to the same forces. It's a classic example where a free-market-like system won't lead to equilibrium because there are constraints which prevent the system from operating on each participant.

The moment you set up some 'protection' for one participant in the system, the natural balance will be lost.

So if you want to simulate plants and animals in an ecosystem realistically, you've got to include humans and allow for disease, famine and player death.
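
A toy model gives a feel for the argument. The sketch below is a textbook-style predator-prey loop in which the "hunters" stand in for players. In one run they are protected and their numbers never change. In the other they are subject to famine. Every constant is invented and none of this reflects how Ultima Online actually worked.

```python
def simulate(steps, hunters_can_starve, prey=1000.0, hunters=50.0):
    """Toy predator-prey model; every constant here is invented."""
    capacity = 1000.0   # most prey the land can support
    growth = 0.3        # prey regrowth rate per step
    kill_rate = 0.01    # fraction of the prey each hunter kills per step
    feed_gain = 0.02    # new hunters supported per prey caught
    starvation = 0.1    # fraction of hunters lost per step (offset by food caught)

    for _ in range(steps):
        caught = kill_rate * hunters * prey
        regrown = growth * prey * (1 - prey / capacity)
        prey = max(prey + regrown - caught, 0.0)
        if hunters_can_starve:
            # Hunters multiply when well fed and die off when prey is scarce.
            hunters = max(hunters + feed_gain * caught - starvation * hunters, 0.0)
    return prey, hunters


protected_prey, protected_hunters = simulate(300, hunters_can_starve=False)
mortal_prey, mortal_hunters = simulate(300, hunters_can_starve=True)

print(f"protected hunters: prey={protected_prey:.1f}, hunters={protected_hunters:.1f}")
print(f"mortal hunters:    prey={mortal_prey:.1f}, hunters={mortal_hunters:.1f}")
```

With these made-up numbers the protected hunters drive the prey population towards zero, while the run in which hunters can starve settles into a steady balance between the two.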

by : Created on March 13, 2004 : Last modified Feb. 8, 2005 : (permalink)


Richard Garriott and Warren Spector

I was thrilled when I recently found out that Richard Garriott would be speaking at SxSW. It's hard to overstate the impact that Richard's games had on me growing up. Not just the hours I spent playing them with my sister but also everything I learnt trying to hack them (especially IV and V) to see how they worked.

The panel was excellent and Richard and Warren Spector (who worked with Richard on Ultimas VI and VII and designed Deus Ex) make a wonderfully entertaining pair to listen to.

First a few anecdotes about the origins (no pun intended) of the Ultima series:

There were a lot of things that Warren and Richard agreed on. Both thought story was core. And both commented on the fact that the D&D sessions they most enjoyed were the ones where the rules were almost forgotten about and the game centred on storytelling.

(As an aside: I remember when I first started AD&D with school friends in 1985, we hardly even used dice at all. I remember one adventure I DMed on a bus with no dice or character sheets or handbooks - just storytelling.)

Interestingly, Warren (who worked at TSR for a while) said that TSR deliberately left aspects of AD&D underspecified to encourage players to augment the rules (and therefore get them more attached to the game).

Where Warren and Richard differed was on single-player versus multi-player and that largely seemed to stem from their different goals in audience size.

Warren pointed out that if the number of people that played the most successful computer game of all time was the audience number for a new TV show, the show would get cancelled after two episodes.

Warren clearly wants a bigger audience which, both he and Richard agree, means consoles.

Richard is happy with the smaller audience interested in MMOGs - largely because of the economics. The profit margins on successful MMOGs are much greater than those on single-player games and even greater than on console games. Apparently EA made about $100 million on $2.5 billion revenue last year. In contrast NCSoft made $50 million on $125 million revenue.
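
Taking those quoted figures at face value (they are as I heard them on the panel, not checked), the difference in margin is roughly 4% against 40%. A quick sketch of the arithmetic, with a made-up helper name:

```python
def profit_margin(profit, revenue):
    """Profit as a fraction of revenue."""
    return profit / revenue


# Figures in millions of US dollars, as quoted on the panel (unverified).
ea_margin = profit_margin(100, 2500)
ncsoft_margin = profit_margin(50, 125)

print(f"EA:     {ea_margin:.0%}")
print(f"NCSoft: {ncsoft_margin:.0%}")
```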

All the business comments Richard made had Warren staring in disbelief: 'Richard Garriott died five years ago and was replaced by his brother', a reference to Robert Garriott who ran the business side of Origin Systems.

At the end I spoke briefly to Richard and had the opportunity to thank him, not just for a great panel session, but for the last twenty years.

by : Created on March 12, 2004 : Last modified Feb. 8, 2005 : (permalink)


SxSW: Day One

Back in Boston, if I mentioned I was going to Austin, people would rave about what the weather would be like this time of year. So I arrive in Austin around noon and it's raining :-)

Checked in at the Hilton then crossed the road to the convention center to register. There were a pile of Australian Music Guides - the newspaper that Phil Tripp put together which includes a photo of yours truly - so I picked up a couple of extra copies for my family.

My Platinum pass gets me into the Film, Music and Interactive sections of the conference as well as the Film and Music festivals. The music side of things doesn't start for a few days so I only got the conference material and goodies bag for the Film and Interactive streams. It's already clear I'm going to have a hard time picking how to split my time.

Even though I have absolutely no room in my suitcase, I splashed out a little on SxSW clothing including a hemp jacket, sweatshirt and three t-shirts.

There are a bunch of films on each day - hard to pick which ones to see.

This evening I went to a panel session featuring Richard Garriott and Warren Spector. It deserves an entry on its own. Later on, I briefly popped into the opening party for the film stream - met the guys who made the short film "I Am Stamos". They have a party tomorrow night and then the film is being screened.

by : Created on March 12, 2004 : Last modified Feb. 8, 2005 : (permalink)


XSL by Wayback Example

Earlier today a colleague mentioned that he had stumbled across my (very) old "XSL Templates by Example" article. I was surprised it was still around as I haven't hosted it for years.

This evening a quick Google search resulted in a hard-to-read text version. So a visit to the Internet Archive Wayback Machine and I had the original HTML from an archived version of XMLSOFTWARE.COM.

It's completely out of date, but I've made it available at a new permanent home at http://jtauber.com/1999/03/03/xsl-by-example.html.

I'll see what other artifacts I can dig up from the attic of past websites.

by : Created on March 10, 2004 : Last modified Feb. 8, 2005 : (permalink)


Blogs and David Allen

A link from Tim Bray's blog led me to Robert Scoble's blog (which I must read more). Reading some entries near the one Tim referenced, I discovered Robert had recently attended a David Allen seminar.

David Allen's book is probably the best I've read on organization and productivity and Robert has become a fan. Robert also suggests David Allen should start writing a blog. I for one would definitely read it!

by : Created on March 9, 2004 : Last modified Feb. 8, 2005 : (permalink)


OSAF and David Allen

I got around to downloading Chandler 0.3 this evening. Looks like good progress has been made on the core although it will probably be a while before I really get to dig into it.

Reading Ted Leung's blog inside Chandler I noticed that Ted referenced Robert Scoble's entry about David Allen. He further added that he himself is a fan of David's approach.

There's even a page on the Chandler wiki that talks about the David Allen method as a usage pattern for Chandler.

by : Created on March 9, 2004 : Last modified Feb. 8, 2005 : (permalink)


Eclipse GEF

This weekend I tried out the Eclipse Graphical Editing Framework. First impressions are that it is a very rich, mature, extensible framework. Like Eclipse itself, there are enough decoupling and extensibility hooks that it takes a while longer to get a basic application up and running; but once you've reached that stage you can keep adding features in an extremely flexible way.

With its model, parts, figures, policies and commands, GEF makes MVC look tightly coupled. The two examples that come with GEF are excellent examples, although I found they were a little too advanced to base my first application on. At some stage, I'll post my first application with documentation as I think it might provide a useful stepping stone to a more complex example.

Look out for some future open-source projects from me based on GEF.

UPDATE (2004-11-03): Now see Six Snapshots of a Simple Eclipse GEF Application.

by : Created on March 7, 2004 : Last modified Feb. 8, 2005 : (permalink)


Oscar Party

I really need to get around to writing up what happened at TheOneRing.Net's Oscar Party.

by : Created on Feb. 29, 2004 : Last modified Feb. 8, 2005 : (permalink)


The Night Before Oscar

Things are pretty crazy around the Hollywood and Highland complex at the moment. I am about the only person without a security badge. Entry to the hotel was checked by an FBI agent.

Most of the people staying in the hotel are either out-of-town press or staff of the Academy involved in running the show.

This evening went to the hotel bar. Mostly Academy staff although Catherine O'Hara was there (I'm guessing she'll be performing the song from A Mighty Wind tomorrow).

Met a rep called Chuck Holbrook who insisted on introducing me as the sound designer on Lord of the Rings. Not sure if it was because he'd had too much to drink or if he was just doing what reps do :-) He was with a crew doing an Oscar segment for MTV - same guys that had done the last couple of music videos for 50 Cent. Chuck is going to Elton John's party tomorrow night. I wonder if it will be as wild as the LoTR party :-)

Drank too many apple martinis.

by : Created on Feb. 28, 2004 : Last modified Feb. 8, 2005 : (permalink)


Visible Artists

Was just awake enough and my hotel close enough to make it to the ACE Panel Invisible Art/Visible Artists. The panel consisted of the year's Academy Award nominees for Best Achievement in Editing; namely: Daniel Rezende (Cidade de Deus), Walter Murch (Cold Mountain), Jamie Selkirk (Lord of the Rings: Return of the King), Lee Smith (Master and Commander: The Far Side of the World) and William Goldenberg (Seabiscuit).

The Egyptian Theatre was filled with a mix of film buffs, film students and film professionals (the guy in front of me is currently working on a romantic comedy with Will Smith and Eva Mendes).

The panel members took it in turns to describe how they got into editing and what their approach generally is. They then showed clips from their respective films and talked about why they chose the clip as being representative of their work in the film.

The panel was fascinating, in large part due to the breadth of its participants. On the one hand you had Walter the veteran; arguably the best known and most eloquent. On the other hand you had Daniel the newcomer, who seemed so overwhelmed by his sudden exposure. The contrast mirrored that between their films: Cold Mountain was Miramax's most expensive solo-funded film to date and featured huge stars; Cidade de Deus was a low budget film shot using non-actors in the streets of Rio de Janeiro.

After the panel I spoke to a few of the participants. Didn't get a chance to speak to Walter (and, although I brought his book on my trip, forgot to bring it to the theatre!). Spoke to Lee Smith for a while and Jamie Selkirk for a little bit. Also met a friend of Lee's - another Australian called Matt who is currently working on I, Robot.

by : Created on Feb. 28, 2004 : Last modified Feb. 8, 2005 : (permalink)


WA Venture Capital Symposium

The symposium was definitely worth going to, even if just for the case studies, but there was some other good stuff too.

After the opening address was a panel discussion about investment in the WA context. The panel consisted of three VCs (CM Capital Investments, Foundation Capital and Innovation Capital) and two institutional investors who invest in VC funds (ING and Westscheme). More than anything it gave good insight into the "other" side of being a VC: the VC-institutional relationship. An interesting takeaway was that, for its population, Western Australia is under-represented in private equity but over-represented in public offerings (with many companies listing earlier than they should).

After morning tea was a case study of Immersive Technologies, makers of training simulators for equipment like mining trucks. The most amazing thing was they bootstrapped themselves from $100 and 2 people (Peter Salfinger and his brother) to profitability and 58 people before they got venture funding for expansion (from Equity Partners).

After Tim Mazzarol introduced what the UWA Graduate School of Management is doing with their Centre for Entrepreneurial Management and Innovation (CEMI), James Thompson of Quadrant Capital gave a talk on exit strategies. James gave a couple of really interesting examples of creative exit strategies. Also interesting was the breakdown of exits in 02/03: 28% trade sale; 28% IPO; 13% liquidation or write-off; 13% buyout by other shareholders; 18% other.

After lunch was another great case study of Worldwide Online Printing, another company that was large before going for its first round (again for expansion). Fascinating information about the print industry in Australia: there are over 6200 businesses in the industry and no player with more than 3% market share. WOP's model is a hub and spoke that centralises production equipment for economies of scale and low outlet setup cost. Another highlight was a very nice business modeling spreadsheet and outlet KPIs fed into a scoreboard.

This was followed by an insightful case study of the Kailis and France MBO and then, after afternoon tea, a humorous talk by Chris Golis (author of the excellent "Enterprise and Venture Capital") on what makes a good entrepreneur.

The cocktail party and buffet dinner afterwards was a good opportunity to meet a few new people and a couple of familiar faces (such as Running Code's Ashley Aitken whom I knew from Curtin University). It did get a bit tiring having to explain why mValent is in the US and I'm in Perth, though :-)

I'll definitely try to make future AVCAL events.

by : Created on Feb. 27, 2004 : Last modified Feb. 8, 2005 : (permalink)


Oscar Editors Seminar

I've just found out that ACE is running a seminar at American Cinematheque featuring all of this year's Oscar-nominated editors. It's on the morning I arrive in LA so, depending on how I feel (and smell) after flying for 24 hours, I'll try to make it. Must remember to pack my copy of Walter Murch's book to get signed.

by : Created on Feb. 26, 2004 : Last modified Feb. 8, 2005 : (permalink)


Last Minute Trip Preparations

Taking the red-eye to Melbourne on my way to the US usually means I have the day to get ready and pack. However, tomorrow the Australian Venture Capital Association is running an all-day symposium in Perth. I thought it would be a great opportunity for me to find out more about what's going on VC-wise at home. The event includes dinner so it will be a pretty mad rush from the Duxton Hotel, back home to pick up my bags and then on to the airport. I'll give an update on how the symposium went from either the lounge in Perth or in Melbourne.

Proofed the artwork for my "Independent Producer and Composer" business cards to take to SxSW. The design Snap Printing came up with worked out well. I ordered 500 so hopefully there's plenty of opportunity to hand them out at SxSW.

I've finished paying bills, am currently charging various devices such as phones, iPods, laptops and cameras and am just about to start packing my clothes. I don't see myself getting much sleep tonight, which normally wouldn't be a problem as it helps me sleep on the plane - but hopefully the VC symposium will serve up lots of coffee.

by : Created on Feb. 26, 2004 : Last modified Feb. 8, 2005 : (permalink)