2. Background information

There are already a number of new programming languages for XML processing (Comega, CDuce, etc.), so I need to explain why XSieve was needed and why I created it.

I work in the area of technical documentation. Structured texts, and therefore SGML and XML technologies, are very important there. Moreover, SGML itself has its roots in the needs of technical documentation.

There is a lot of structured data and many conversion tasks. To work quickly and well, we need a good general-purpose programming language that is targeted at SGML/XML processing. To work with SGML data, we used Balise, a programming language that combines a Pascal-like language with a SAX-like interface and template matching. Unfortunately, SGML and Balise died.

I searched for a replacement, but failed. Java, Perl and other languages have XML APIs, but writing XML processing code in such languages is like programming in assembler: it is a slow and error-prone process.

When XSLT appeared, I made a mistake. In addition to converting trees, I tried to use it for creating and formatting plain text. It was a bad experience, and I forgot about XSLT for a long time. But some time later, after we had written a system for extracting data from HTML pages, I realized that we had reinvented XSLT.

That was how I rediscovered XSLT. Soon I started to think that XSLT is one of the most brilliant developments in computer science in recent years. Unfortunately, I need a general-purpose programming language, and XSLT isn't one. It's possible to use XSLT extensions, but that means returning to Java and company.

Meanwhile, I found another need: using XML technologies to process non-XML data. In search of a solution, I came to the Lisp world. I don't want to discuss it in depth here. Instead, here are some links.

Lisp folks say that XML is a poor copy of S-expressions and that Lisp is a superior language. There is a grain of truth in their words. For example, the paper “SXSLT: Manipulation Language for XML” [.ps.gz] provides an example of recursively reorganizing punctuation. While the Scheme (a Lisp dialect) code is small, the XSLT solution is extremely unwieldy even for a simplified formulation of the problem.

But Lisp isn't the answer either. It has great potential for XML processing, but it lacks high-level libraries (for example, for DTD validation or XInclude processing).

To sum up: on the one hand, XSLT is great for XML processing, but it's not a general-purpose language. On the other hand, Lisp is a general-purpose language, and it's potentially great for XML processing. So why not join them together?

The result is XSieve, a combination of XSLT and Scheme, a Lisp dialect.

Finally, I'd like to say “thanks!” to Google. They accepted XSieve as a Google Summer of Code 2005 project. As a result, I was able to work on XSieve full-time and to grow it from an internal research prototype into a public product.