The Trouble With hCard and Microformats in General
If you haven’t already heard of Microformats, they are a set of simple, open data formats built upon existing and widely adopted standards.
Basically, they let you use a combination of class names to mark up data in your page along the same lines as existing data formats. For example, you’d mark up contact info on a page using class names based on the popular vCard format. The resulting markup would be an hCard. In essence, Microformats aren’t in the business of reinventing the wheel. They’re all about reusing existing patterns, and therein lies their genius.
<span class="tel">
<span class="type">home</span>:
<span class="value">+1.415.555.1212</span>
</span>
An example of hCard telephone markup from: http://microformats.org/wiki/hcard
There is, however, one glaring problem with Microformats, and I ran into it head-on when I was marking up a page in French. Since property values need to be in the clear (in this case, “home”), and those values need to follow an established format (in this case vCard), you can’t use any language other than English. Yep, you heard me: Microformats (at least hCard) are English only. So much for i18n.
<span class="tel">
<span class="type">Téléphone</span>:
<span class="value">+1.415.555.1212</span>
</span>
An example of an invalid hCard due to a property value in French
I think the defining difference between hCard and vCard is that the former needs to have its property values in cleartext, whereas I don’t think the latter does. In other words, you can exchange vCards containing English property values using Japanese applications, and since the vCard is simply a file, the Japanese app can open it up, read the properties in English and then display them with Japanese labels.
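To make the vCard side of that comparison concrete, here is a minimal Python sketch of what such an application might do. It assumes a vCard 3.0 style line such as TEL;TYPE=home:+1.415.555.1212; the function name and the label table are hypothetical, made up for illustration, and this is a naive split rather than a full vCard parser.

```python
# Hypothetical label table; a real application would ship its own localization data.
JAPANESE_LABELS = {"home": "自宅", "work": "勤務先"}


def localized_tel(vcard_line, labels):
    """Turn a line like 'TEL;TYPE=home:+1.415.555.1212' into 'label: number'."""
    # Everything before the first colon is the property name plus parameters.
    header, _, number = vcard_line.partition(":")
    name, *params = header.split(";")
    if name.upper() != "TEL":
        raise ValueError("not a TEL line")

    tel_type = ""
    for param in params:
        key, _, value = param.partition("=")
        if key.upper() == "TYPE":
            tel_type = value.lower()

    # The English keyword never reaches the reader; fall back to it only
    # when no translation is known.
    return f"{labels.get(tel_type, tel_type)}: {number}"


print(localized_tel("TEL;TYPE=home:+1.415.555.1212", JAPANESE_LABELS))
```

The English keyword stays inside the file and only the translated label is displayed, which is exactly the separation that hCard's in-the-clear values don't give you.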
I tried discussing the issue with the Microformats community, but the results were far from conclusive. I was told that the abbr element could be used to remedy this situation; however, the spec doesn’t discuss this in terms of i18n. Rather, the intended use of abbr is to differentiate between human- and machine-readable formats of the same content. For example, dates:
<abbr title="2008-04-15T00:00:00">April 15th, 2008</abbr>
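If the abbr suggestion were applied to the type value, a parser would have to prefer the title attribute over the visible text. The Python sketch below shows that idea using only the standard library's html.parser. It is an illustration, not how any real hCard parser works: the class and function names are hypothetical, and whether actual parsers honor abbr on type is an assumption here.

```python
from html.parser import HTMLParser


class TelTypeParser(HTMLParser):
    """Collects the machine-readable value of every class="type" element."""

    def __init__(self):
        super().__init__()
        self.types = []           # type values found so far
        self._capturing = False   # True while inside a type element without a title
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "type" not in classes:
            return
        if tag == "abbr" and attrs.get("title"):
            # abbr pattern: title carries the machine value, the text is for humans
            self.types.append(attrs["title"].lower())
        else:
            # No title to rely on, so the visible text is all the parser has
            self._capturing = True
            self._text = []

    def handle_data(self, data):
        if self._capturing:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Simplification: assumes the type element contains no nested tags
        if self._capturing:
            self.types.append("".join(self._text).strip().lower())
            self._capturing = False


def tel_types(markup):
    parser = TelTypeParser()
    parser.feed(markup)
    parser.close()
    return parser.types


french_plain = ('<span class="tel"><span class="type">Téléphone</span>: '
                '<span class="value">+1.415.555.1212</span></span>')
french_abbr = ('<span class="tel"><abbr class="type" title="home">Domicile</abbr>: '
               '<span class="value">+1.415.555.1212</span></span>')

print(tel_types(french_plain))  # the French word itself, which is not a vCard type
print(tel_types(french_abbr))   # the English keyword taken from the title attribute
```

With the abbr version, the page shows a French label while the parser still sees “home”. Whether that counts as a sanctioned i18n mechanism is the open question.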
If I’m way off base here or Microformats have evolved since to include i18n, please by all means let me know in the comments.

April 15th, 2008 at 8:19 am
I think the challenge comes down to the limitations of the markup. The advantage of a vCard is that it doesn’t adhere to any standards other than its own. An hCard tries to accomplish the same functionality, but is constrained to HTML standards.
I personally think that Microformats are a great idea, but they don’t really have enough adoption by vendors to make them worthwhile in many cases. Once browsers start natively recognizing and supporting Microformats, there will likely be some additional work put into localization and some of the other technical issues.
April 15th, 2008 at 9:21 am
The interesting thing here is that Microformats are less likely to receive browser endorsement if they can’t support i18n. What you’ll end up with is a company like Microsoft embracing and extending the format.
April 15th, 2008 at 10:33 am
Any ideas on how to do it better?
I personally don’t have a huge amount of experience with localization, and I have a feeling many of the people in the Microformats community might be in the same boat. I didn’t even realize that the formats wouldn’t work for other languages. I believe that the type can be any value, though I may be mistaken.
In all honesty, I wouldn’t mind if Microsoft embraced Microformats. It would be better than using a more proprietary format and it would give Microformats some visibility. If Microformats were natively supported in IE8, or even added to IE7 through a patch, that would likely make a lot more developers take notice.
April 15th, 2008 at 10:49 am
While I’m all for internationalization, is not HTML itself a bit too English-centric? A paragraph tag, for instance, derives the “P” from the English word for paragraph. So it is with DIV, for division, etc. We have to draw the line somewhere, if only for the sake of being able to support parsers. :)
April 15th, 2008 at 11:13 am
@Nathan: Yes, but you never see a “p” tag in clear text. It’s under the hood. Microformats require that “tel” or “home” be in the clear.
@Ian: I didn’t think much about i18n either till I got hit with a scenario that required it.
May 20th, 2008 at 3:01 am
I’ve written about the microformats misconceptions and the common criticisms here: http://www.csarven.ca/microformats-misconceptions
I don’t believe that the whole microformats concept is faulty or useless if localisation is an issue. Keep in mind that microformats is adopting *existing* standards (e.g., vCard) for most common patterns. The root of the problem is outside of microformats and it would be great if it could be solved easily.