Translating DocBook Documents
A frequently asked question is: how can I easily translate DocBook XML documents without having to copy the structure of the document by hand? What should I do if my translators cannot learn DocBook XML and my DocBook files change frequently?
The gettext .po format grew out of the need to translate strings in the user interfaces of computer programs. In this article I will demonstrate how to use this format to translate DocBook XML files.
Installing Prerequisites
You need a Linux box with a recent development environment, in particular a CVS client, Python and the wxWidgets libraries.
Installing needed Programs
We need two programs for this translation. The first extracts the relevant strings from our DocBook documents and later merges the translated strings back into a new, translated, structured document. The second helps us translate the strings. The first program is called xml2po. It currently resides in the CVS repository of the GNOME project, which means we have to check it out from CVS and build it by hand.
After entering the first command you will be prompted for the password of the repository. An empty password (RETURN) will allow you to log in. The second command will retrieve the source code from the repository.
# cvs -d:pserver:anonymous@anoncvs.gnome.org:/cvs/gnome login
# cvs -d:pserver:anonymous@anoncvs.gnome.org:/cvs/gnome co gnome-doc-utils/xml2po
Change into the newly created directory and compile the sources.
# cd gnome-doc-utils/xml2po/
# ./autogen.sh --prefix=/usr && make
Install the program as superuser. After typing su, enter your root password.
# su
# make install
# exit
Now it is time to install the second program, poedit, a graphical editor for translation strings. The poedit download page contains information about downloading and installing it. Users of Gentoo Linux just have to type:
# su
# emerge poedit
# exit
A small example file
The following example file (example-en.xml) will be translated:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE set PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
<article>
<title>Example</title>
<para>
This is just an
<emphasis>unimportant</emphasis>
example.
</para>
</article>
Create the translation template
Now it's time to create the translation template.
# xml2po -o example.pot example-en.xml
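To give a feeling for what this extraction step does, here is a minimal Python sketch of the idea. It is not how xml2po is implemented: the helper names (inner_xml, extract_messages) and the choice of title and para as the only translatable elements are assumptions made for this illustration. The point it shows is that inline markup such as emphasis stays inside the message, so translators see one complete sentence.

```python
import xml.etree.ElementTree as ET

# Assumption for this sketch: only these elements carry translatable text.
BLOCK_TAGS = {"title", "para"}

def inner_xml(elem):
    """Return the element's content, inline tags included, whitespace collapsed."""
    parts = [elem.text or ""]
    for child in elem:
        # tostring() also serializes the text that follows the child (its tail).
        parts.append(ET.tostring(child, encoding="unicode"))
    return " ".join("".join(parts).split())

def extract_messages(xml_text):
    """Collect one message per block-level element, in document order."""
    root = ET.fromstring(xml_text)
    return [inner_xml(e) for e in root.iter() if e.tag in BLOCK_TAGS]

# The body of example-en.xml, without the XML declaration and DOCTYPE.
SAMPLE = """<article>
<title>Example</title>
<para>
This is just an
<emphasis>unimportant</emphasis>
example.
</para>
</article>"""

messages = extract_messages(SAMPLE)
for message in messages:
    print(message)
```

Each printed line corresponds to one string a translator would be asked to translate.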
Translate
The xml2po command leaves you with a translation template called example.pot. Open this file with poedit by typing poedit example.pot. The user interface displays a list of strings to translate. Click on each string in the list and enter the correct translation in the text area at the bottom.
After translating all strings, select File->Save As and enter de.po as the filename (replace de with the language code of your target language). Afterwards you can close poedit and proceed with merging the translations into a new XML document.
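For orientation, a .po file is plain text built from msgid/msgstr pairs. The Python sketch below renders two such pairs in that basic shape. The helper names (po_escape, po_entry) are made up for this illustration, and the message texts are inferred from the example document rather than copied from real xml2po or poedit output. A real catalog also carries a header entry and source comments, which are left out here.

```python
def po_escape(text):
    """Escape backslashes, double quotes and newlines for a PO string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

def po_entry(msgid, msgstr):
    """Render one source string and its translation as a PO entry."""
    return f'msgid "{po_escape(msgid)}"\nmsgstr "{po_escape(msgstr)}"\n'

# Assumed message texts for the example document (English source, German target).
entries = [
    ("Example", "Beispiel"),
    ("This is just an <emphasis>unimportant</emphasis> example.",
     "Dies ist nur ein <emphasis>unbedeutendes</emphasis> Beispiel."),
]

# Entries in a PO file are separated by a blank line.
catalog = "\n".join(po_entry(msgid, msgstr) for msgid, msgstr in entries)
print(catalog)
```

An entry whose msgstr is empty counts as untranslated, which is how a .pot template starts out.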
Create a new, translated XML file
This is done by calling xml2po once again.
# xml2po -p de.po example-en.xml > example-de.xml
This command reads the translated strings from the file de.po and the structure from the file example-en.xml. It merges them and prints the result, which the shell redirection writes to a file called example-de.xml. This is the content of this new file:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE set PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
<article lang="de">
<title>Beispiel</title>
<para>Dies ist nur ein <emphasis>unbedeutendes</emphasis> Beispiel.</para>
</article>
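The merge step can be pictured with a small Python sketch as well. Again, this only illustrates the idea and is not xml2po's actual code: the function names, the in-memory dictionary standing in for de.po, and the restriction to title and para elements are all assumptions. Each block-level element is looked up by its source text, its content is replaced with the translation, and the lang attribute is set on the root element, as in the output shown above.

```python
import xml.etree.ElementTree as ET

# Assumption for this sketch: only these elements carry translatable text.
BLOCK_TAGS = {"title", "para"}

def inner_xml(elem):
    """Return the element's content, inline tags included, whitespace collapsed."""
    parts = [elem.text or ""]
    for child in elem:
        parts.append(ET.tostring(child, encoding="unicode"))
    return " ".join("".join(parts).split())

def merge(xml_text, translations, lang):
    """Replace the content of each block element with its translation."""
    root = ET.fromstring(xml_text)
    # Take a snapshot of the elements first, because the tree is modified below.
    for elem in list(root.iter()):
        if elem.tag not in BLOCK_TAGS:
            continue
        msgstr = translations.get(inner_xml(elem))
        if not msgstr:
            continue  # untranslated strings keep the source text
        # Parse the translation so that inline markup becomes real elements.
        new = ET.fromstring(f"<{elem.tag}>{msgstr}</{elem.tag}>")
        for child in list(elem):
            elem.remove(child)
        elem.text = new.text
        elem.extend(list(new))
    root.set("lang", lang)
    return ET.tostring(root, encoding="unicode")

SAMPLE = """<article>
<title>Example</title>
<para>
This is just an
<emphasis>unimportant</emphasis>
example.
</para>
</article>"""

# Hypothetical stand-in for the contents of de.po.
GERMAN = {
    "Example": "Beispiel",
    "This is just an <emphasis>unimportant</emphasis> example.":
        "Dies ist nur ein <emphasis>unbedeutendes</emphasis> Beispiel.",
}

result = merge(SAMPLE, GERMAN, "de")
print(result)
```

Because the structure always comes from the source document, a change to example-en.xml only requires updating the affected strings, not rebuilding the translated file by hand.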
I am Product Manager for Collaboration and Digital Asset Management at