Thursday, June 24, 2004

Fun with XMerge

A year or so ago, I worked with Aidan Butler on the OpenOffice.org project to improve the conversion of Star Office documents to DocBook.

My group had a requirement to use a "WYSIWYG" editor to create support content, and I have been (and probably always will be) insistent on using validated, structured markup (specifically Simplified DocBook) to describe the support content.

Star Office was our best chance at meeting both of these requirements because it stored the data in XML. The XMerge plugin is a set of XSL transformations that does a "best effort" conversion of Star Office documents to DocBook.

The problem is that Star Office is not a validating XML editor. It does store content and presentation information in XML, but it cannot force the content to conform to a specific schema or DTD.
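To make that gap concrete, here is a rough sketch of the kind of check a validating editor performs and Star Office does not. It is not real DTD validation: the ALLOWED_CHILDREN table is a hypothetical, heavily simplified stand-in for a content model (it is not the actual Simplified DocBook DTD), and the function name and sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified content model: parent -> allowed children.
# This is NOT the real Simplified DocBook DTD, only an illustration.
ALLOWED_CHILDREN = {
    "article": {"title", "section"},
    "section": {"title", "para", "section"},
    "title": set(),
    "para": {"emphasis"},
    "emphasis": set(),
}


def find_violations(xml_text):
    """Return a list of human-readable problems found in xml_text."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "article":
        problems.append(f"root element is <{root.tag}>, expected <article>")
    # Walk every element in document order and compare its children
    # with the toy content model above.
    for parent in root.iter():
        allowed = ALLOWED_CHILDREN.get(parent.tag)
        if allowed is None:
            problems.append(f"unknown element <{parent.tag}>")
            continue
        for child in parent:
            if child.tag not in allowed:
                problems.append(
                    f"<{child.tag}> is not allowed inside <{parent.tag}>"
                )
    return problems


# Made-up sample documents: one that fits the toy model, one that does not.
GOOD = (
    "<article><title>T</title><section><title>S</title>"
    "<para>Hi <emphasis>x</emphasis></para></section></article>"
)
BAD = "<article><para>stray</para><section><font>styled</font></section></article>"

if __name__ == "__main__":
    for label, doc in (("GOOD", GOOD), ("BAD", BAD)):
        print(label, find_violations(doc))
```

A validating editor runs this sort of check against the full DTD while you type and refuses the stray paragraph or the presentational element up front. With a WYSIWYG tool, the best you can do is run a check like this on the converted output and fix things afterwards.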

There are a few tricks you can do with locking sections in Star Office templates to force some desired structure before the content is converted, but the conversion may still produce invalid DocBook markup. Personally, I still prefer Arbortext Epic for working with DocBook markup.

I think WYSIWYG and semantic markup are diametrically opposed. One concentrates on "look and feel"; the other on what the content means. To me, it's like the Cheaper, Better, Faster argument: Pick two. You can't have them all. Writing for semantic markup requires a paradigm shift from the traditional desktop publishing that most people have been trained in. More on this in a future post...

Hopefully, the StarOffice/OpenOffice teams can introduce XML validation in future versions of the product. That may prove to be quite a challenge, though!

Thanks to bondolo for his interesting post that stirred this commentary...

See also: