Localizing XML with XSL Transformations

I’m working on a project that uses XML documents to describe the layout of a UI. One of the requirements of this project is internationalization (i18n) and localization (l10n). In other words, the strings used in the UI need to be in the local language of the user. The naive approach is to make a separate layout for each language, but that is very likely to become a maintenance nightmare as the UI is updated. Instead, I came up with the following.

Suppose we have a simple XML document (contacts.xml) that describes a list of contacts:

<?xml version="1.0" encoding="utf-8"?>
<contacts>
    <contact>
        <name>Smith, John</name>
        <phone>555-1234</phone>
    </contact>
    <contact>
        <name>Doe, Jane</name>
        <phone>555-1235</phone>
    </contact>
</contacts>

We can use the following XSL document (contacts.xsl) to transform the above XML document into a simple HTML table. The stylesheet matches the contacts root element, writes the header row, and then loops over the contact elements to write one row for each.

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
    xmlns:fn="http://www.w3.org/2005/02/xpath-functions"
    exclude-result-prefixes="xsl fn">
    <xsl:output method="html" />
    <xsl:template match="/contacts">
        <table border="1">
            <tr>
                <th>Name</th>
                <th>Phone Number</th>
            </tr>
            <xsl:for-each select="contact">
                <tr>
                    <td><xsl:value-of select="name" /></td>
                    <td><xsl:value-of select="phone" /></td>
                </tr>
            </xsl:for-each>
        </table>
    </xsl:template>
</xsl:stylesheet>

We can add an xml-stylesheet processing instruction to contacts.xml so that a browser (or another XSLT-aware viewer) applies our XSL document automatically:

<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="contacts.xsl"?>
<contacts>
    <contact>
        <name>Smith, John</name>
        <phone>555-1234</phone>
    </contact>
    <contact>
        <name>Doe, Jane</name>
        <phone>555-1235</phone>
    </contact>
</contacts>

The output will now look something like this:

<table border="1">
    <tr>
        <th>Name</th>
        <th>Phone Number</th>
    </tr>
    <tr>
        <td>Smith, John</td>
        <td>555-1234</td>
    </tr>
    <tr>
        <td>Doe, Jane</td>
        <td>555-1235</td>
    </tr>
</table>

Our goal here is to replace the table headers, “Name” and “Phone Number”, with the appropriate translation for the specified locale. First, we need to write the language files. Each language file is kept in a sub-directory named res, and its file name is language_territory.xml. For example, es_MX.xml would be for Spanish as spoken in Mexico. Each string tag has an id attribute that is used to reference that string later; this value must be the same in every translation. Below are the English, Spanish, and French files.

<?xml version="1.0" encoding="utf-8"?>
<strings>
    <string id="name">Name</string>
    <string id="phone">Phone Number</string>
</strings>

 

<?xml version="1.0" encoding="utf-8"?>
<strings>
    <string id="name">Nombre</string>
    <string id="phone">Número de Teléfono</string>
</strings>

 

<?xml version="1.0" encoding="utf-8"?>
<strings>
    <string id="name">Nom</string>
    <string id="phone">Numéro de Téléphone</string>
</strings>

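Now we need an XSL document that retrieves the strings for the appropriate locale (localize.xsl). It accepts a top-level param named lang, which defaults to en_US and can be overridden when the XSLT processor is invoked. It also defines a variable named _ (à la GNU gettext), which is set to the string elements of the matching language file. Because the file name is passed to document() as a string, the relative path res/ is resolved against the location of the stylesheet, not the source document.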

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
    xmlns:fn="http://www.w3.org/2005/02/xpath-functions">
    <xsl:param name="lang" select="'en_US'" />
    <xsl:variable name="_" select="document(concat('res/', $lang, '.xml'))/strings/string" />
</xsl:stylesheet>

Finally, we have to modify contacts.xsl to include localize.xsl and look up the localized strings. We retrieve a string in the appropriate language through the variable defined in localize.xsl: $_ followed by the predicate [@id='name'] selects the string elements whose id is “name”, and xsl:value-of outputs the text of the first one.

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
    xmlns:fn="http://www.w3.org/2005/02/xpath-functions"
    exclude-result-prefixes="xsl fn">
    <xsl:include href="localize.xsl" />
    <xsl:output method="html" version="4.01" />
    <xsl:template match="/">
        <html>
            <head>
                <title>Contacts</title>
            </head>
            <body>
                <xsl:apply-templates />
            </body>
        </html>
    </xsl:template>
    <xsl:template match="contacts">
        <table border="1">
            <tr>
                <th><xsl:value-of select="$_[@id='name']" /></th>
                <th><xsl:value-of select="$_[@id='phone']" /></th>
            </tr>
            <xsl:apply-templates select="contact" />
        </table>
    </xsl:template>
    <xsl:template match="contact">
        <tr>
            <td><xsl:value-of select="name" /></td>
            <td><xsl:value-of select="phone" /></td>
        </tr>
    </xsl:template>
</xsl:stylesheet>

Known Limitations

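The lang param must match a language file name exactly; there is no fallback from a full locale to the language alone (for example, from es_MX to es). Each language file must be filled out completely; a string that is missing from a translation produces empty output rather than falling back to the default language. There is no support for plurals.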
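
The missing fallback to the default language could be approximated with a named template. What follows is only a sketch of a variant of localize.xsl, not a tested drop-in: the names tr and _default are hypothetical, and it assumes that res/en_US.xml exists and is complete.

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:param name="lang" select="'en_US'" />
    <!-- Strings for the requested locale, as before. -->
    <xsl:variable name="_" select="document(concat('res/', $lang, '.xml'))/strings/string" />
    <!-- Hypothetical addition: strings for the default locale. -->
    <xsl:variable name="_default" select="document('res/en_US.xml')/strings/string" />
    <!-- Hypothetical helper: write the string with the given id,
         using the default locale when the translation lacks it. -->
    <xsl:template name="tr">
        <xsl:param name="id" />
        <xsl:choose>
            <xsl:when test="$_[@id = $id]">
                <xsl:value-of select="$_[@id = $id]" />
            </xsl:when>
            <xsl:otherwise>
                <xsl:value-of select="$_default[@id = $id]" />
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>

With this variant, a header cell in contacts.xsl would call the template instead of reading $_ directly:

<th><xsl:call-template name="tr"><xsl:with-param name="id" select="'name'" /></xsl:call-template></th>

This only covers strings missing from an existing language file. If the file for the requested locale does not exist at all, XSLT 1.0 lets the processor either signal an error or return an empty node-set from document(), and the fallback applies only in the second case.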
