(From an e-mail I wrote earlier today).
I thought I’d give you a better answer on why hAtom should be supported (by blogging tools). hAtom has the same potential benefit as any other (HTML-based) microformat: it will allow applications to deal with objects in a blog as natural chunks. For example, “reblogging” an entry (i.e. responding to someone else’s blog post), contacting the author of a post, adding the authors of a post to my address book, printing a blog post (without all the comments, blogrolls, etc.), and so forth. Search engines would be able to know exactly which microcontent chunk a piece of text is in, rather than just “it appeared on the front page of instapundit”. If people reblog using hAtom, it becomes easy to generate BLOCKQUOTE and Q elements with the correct CITE information, and we can start reliably linking in-place conversations on the web.
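To make the BLOCKQUOTE point concrete, here is a minimal sketch in Python of what a reblogging tool could emit once it knows an entry’s permalink. The function name `reblog_quote` and the example URL are made up for illustration. The only real HTML feature relied on is the `cite` attribute of BLOCKQUOTE.

```python
from html import escape


def reblog_quote(permalink: str, quoted_html: str) -> str:
    """Wrap an HTML excerpt in a BLOCKQUOTE that cites the entry's permalink.

    `quoted_html` is assumed to be markup already (e.g. copied out of the
    entry's content), so only the permalink is escaped for attribute use.
    """
    return '<blockquote cite="{}">{}</blockquote>'.format(
        escape(permalink, quote=True), quoted_html
    )


# Hypothetical usage: the permalink would come from the hAtom entry itself.
print(reblog_quote("http://example.org/2005/10/post?id=1&lang=en", "<p>Nice point.</p>"))
```

The hard part is not this function. It is reliably finding the permalink and the excerpt, which is the job hAtom would do.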
hAtom also has the benefit of being a natural “microcontent container”, so that other microformats can be composited on top of it in the future. I’m considering making my next project in the blog/microformat space a “blog archive” microformat; combined with hAtom, this will give tools and people the ability to walk through weblogs in a structured fashion.
I know there’s a hand-waving quality to this argument: these tools do not exist. However, the bootstrap for this is quite small: putting hAtom into a weblog template takes less than 30 minutes, and it’s easy to test and validate that it will continue to work. Once there’s hAtom content out there, JavaScript, Greasemonkey, and TIDY + XML parsers in many languages will make it easy to get the content out. We’re already seeing this happen with hCalendar and hCard.
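As a rough illustration of “getting the content out”, here is a sketch that uses Python’s standard `html.parser` rather than TIDY plus an XML parser. It assumes the class names from the hAtom draft as I understand it (`hentry`, `entry-title`, `entry-content`, and a permalink marked `rel="bookmark"`), and it assumes the tags inside each entry are properly nested. The class name `HAtomExtractor` is made up for this example.

```python
from html.parser import HTMLParser

# Elements that never have an end tag, so they must not affect nesting depth.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class HAtomExtractor(HTMLParser):
    """Collect title, permalink and text content of each hentry element."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.entries = []   # finished entries, in document order
        self._entry = None  # the entry currently being read, if any
        self._stack = []    # one field name (or None) per open element in the entry

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if self._entry is None:
            if "hentry" not in classes:
                return  # outside any entry: sidebars, blogrolls, etc.
            self._entry = {"title": "", "content": "", "permalink": None}
        field = None
        if "entry-title" in classes:
            field = "title"
        elif "entry-content" in classes:
            field = "content"
        rel = (attrs.get("rel") or "").split()
        if tag == "a" and "bookmark" in rel and self._entry["permalink"] is None:
            self._entry["permalink"] = attrs.get("href")
        self._stack.append(field)

    def handle_endtag(self, tag):
        if self._entry is None or tag in VOID:
            return
        self._stack.pop()
        if not self._stack:  # the hentry element itself just closed
            self._entry["title"] = self._entry["title"].strip()
            self._entry["content"] = self._entry["content"].strip()
            self.entries.append(self._entry)
            self._entry = None

    def handle_data(self, data):
        if self._entry is None:
            return
        # Text belongs to the innermost enclosing title/content element.
        for field in reversed(self._stack):
            if field:
                self._entry[field] += data
                break
```

Feeding a page to an instance with `feed()` and then reading its `entries` list gives one dictionary per entry. Real-world tag soup would break the nesting assumption, which is exactly where a forgiving parser or TIDY comes in.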
Interestingly, a lot of people on the microformats list think this is a way of combining syndication into HTML. I do not. Why? 1. size, 2. crud. First, RSS provides a channel for delivering the data that’s relevant for syndication; a weblog is… well, a weblog, and usually much bigger. Size does matter, and I don’t think hAtom will ever be used for syndication in more than a fringe number of cases. Secondly, if people can barely compose legal XML for RSS and OPML, what’s the hope for correctly formatted XHTML delivered content? IMHO: none. Thus, a tool either has to have a super-parser available like Mozilla’s (and not that many of the apps I’m talking about above will already have access to a parsed DOM in the browser!) or has to run the result through TIDY, which is CPU intensive and less than guaranteed to have a happy ending.
To briefly bring up the discussion I had with Randy [Morin] on “why not just store it as XML”, as per his resume example: last night I quoted the 2NF of DB development, i.e. don’t duplicate your data. If we didn’t have hAtom, how could you write the reblogging tools, the printing tools, etc.? Cross-reference through the RSS? I’ve tried doing that and it’s surprisingly difficult because, as an outside tool person, you have no guarantee that the URIs match up, that all the text is there, or what format it is in. So likewise with all microformats: if I have someone’s contact information as HTML (which is being eyeball-validated on a continual basis), why encode it as XML when I can just embed the semantic information in place?
Tagged: microformats, hAtom

