2.4. Translation of documentation

To use an operating system, users need both online and offline documentation.

In UNIX systems, online documentation is typically provided as manual pages (retrieved with the man command)[1]. Among desktop environments, both KDE and GNOME provide their own online help systems.

Offline documentation includes documents such as the Debian Installation Manual, the Debian Reference Guide, or the Securing Debian Manual. These manuals, developed by the Debian Documentation Project (DDP), are typically printed by users and are sometimes read from other (non-Debian) systems at the project's website. Some of the manuals are included in the official CD set, and most of them are also available as Debian packages.

Users who are not proficient in English will prefer documentation in a language they are familiar with. This is why documentation needs to be translated.

2.4.1. SGML/XML documentation

The DDP develops documentation for Debian written primarily in SGML. The first documentation produced by the project was written in a variant of SGML called debiandoc-sgml, for which tools were developed to convert the SGML documents into different formats (HTML for online viewing, PDF and PostScript for printing, and plain text). More recent documentation written by members of the project, as well as revisions of existing documentation (such as the Debian Installation Manual), is written in XML, more specifically DocBook XML, which is widely used by many free software projects[2].

The translation of documentation written in SGML or XML format, however, faces some initial issues:

In order to overcome these issues, the document writers, in cooperation with translators, have introduced significant changes to how documentation is written:

Currently, the Debian Documentation Project uses the CVS server at cvs.debian.org[3]. Documentation is compiled through a set of Makefiles and published on the official project web site, which updates its copy of the CVS repository and compiles all the documentation daily. The CVS server at cvs.debian.org holds most (but not all) of the available documentation. The most notable exception is the Debian Installer Manual, which is kept in the SVN repository of the Debian Installer project on Alioth (web access to the SVN repository is available). The development version of the Debian Installation Guide is available at the Debian Installer project's web pages, also on Alioth.

Translators either get access to the CVS repository or use the original documentation maintainer as a proxy to publish their work there. When a translation is added to an existing document, the maintainer typically needs to update the LANGS (or LANGUAGES) variable in the document's Makefile to tell the publication system to also build copies for that language. If the added translation builds, it should be available on the Debian website after the next daily build.
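As an illustration of the Makefile change described above, the following is a minimal sketch that appends a language code to a LANGS or LANGUAGES assignment. It is not part of the DDP tooling: the function name add_language is hypothetical, and the sketch assumes the variable is assigned on a single line with = or := (no backslash continuations), which may not hold for every document's Makefile.

```python
import re

# Matches a single-line assignment such as "LANGS = en fr" or "LANGUAGES := en".
# Assumption: no backslash line continuations and no trailing comment.
LANGS_LINE = re.compile(r"^(LANGS|LANGUAGES)([ \t]*:?=[ \t]*)(.*)$", re.MULTILINE)


def add_language(makefile_text, code):
    """Return makefile_text with `code` added to the LANGS/LANGUAGES variable.

    The text is returned unchanged if the language is already listed.
    Raises ValueError if no such assignment is found.
    """
    match = LANGS_LINE.search(makefile_text)
    if match is None:
        raise ValueError("no LANGS or LANGUAGES assignment found")
    languages = match.group(3).split()
    if code in languages:
        return makefile_text
    languages.append(code)
    new_line = match.group(1) + match.group(2) + " ".join(languages)
    return makefile_text[:match.start()] + new_line + makefile_text[match.end():]
```

For instance, calling add_language on a Makefile containing the line "LANGS = en fr" with the code "es" yields "LANGS = en fr es" and leaves the rest of the file untouched.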

To track changes to documentation and determine when a translation needs to be updated, document maintainers and translators use the doc-check tool. See Section 4.5.

2.4.2. Debian manpages translation

There are mainly two types of manpages provided within the Debian operating system: those packaged with upstream software (such as binutils or gcc) and those written for Debian tools provided in Debian-only packages. Translations of manpages of upstream software are typically provided either within the package itself or, in the case of the Linux manpages, within a manpages-XX package (where XX is a given language code).

2.4.2.1. Manpages packages

At the time of this writing, the following packages of translated manpages are available:

  • manpages-de and manpages-de-dev - German manpages

  • manpages-es and manpages-es-extra - Spanish man pages

  • manpages-fi - Finnish man pages

  • manpages-fr - French version of the manual pages

  • manpages-hu - Hungarian manpages

  • manpages-it - Italian man pages

  • manpages-ja and manpages-ja-dev - Japanese version of the manual pages

  • manpages-ko - Korean version of the manual pages

  • manpages-nl - Dutch manpages

  • manpages-pl - Polish man pages

  • manpages-pt and manpages-pt-dev - Portuguese Versions of the Manual Pages

  • manpages-ru - Russian translations of Linux manpages

  • manpages-tr - Turkish version of the manual pages

  • manpages-zh - Chinese manual pages

Those manpage packages contain, for the most part, the same manpages available in the manpages or manpages-dev packages although some packages contain extra manpages.

Users see the translated manpages through the internationalisation mechanisms of the man command which, when asked to present a manpage, checks whether a translation is available under /usr/share/man/XX (with XX being the language code of the user's environment[4]). Consequently, the Debian installation system installs both manpages and manpages-XX in a default installation if the user selects a specific language task for which manpages are available.
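The lookup described above can be sketched roughly as follows. This is an illustration only, not the actual implementation of man: the function names language_candidates and find_translated_page are hypothetical, only the LANG value is consulted, and real man implementations also honour other locale variables and search paths.

```python
from pathlib import Path


def language_candidates(lang_value):
    """Return the directory names to try for a LANG value such as 'es_ES.UTF-8'.

    Hypothetical helper: strips the codeset and modifier, then tries the full
    locale ('es_ES') before the bare language code ('es').
    """
    if not lang_value or lang_value in ("C", "POSIX"):
        return []
    locale = lang_value.split(".")[0].split("@")[0]
    candidates = [locale]
    language = locale.split("_")[0]
    if language != locale:
        candidates.append(language)
    return candidates


def find_translated_page(name, section, lang_value, man_root="/usr/share/man"):
    """Return the path of a translated page, or the untranslated one as fallback.

    Returns None if neither exists. Pages may be compressed, so any file
    whose name starts with '<name>.<section>' is accepted.
    """
    root = Path(man_root)
    search_dirs = [root / code for code in language_candidates(lang_value)]
    search_dirs.append(root)  # untranslated pages live directly under man_root
    for directory in search_dirs:
        section_dir = directory / f"man{section}"
        if not section_dir.is_dir():
            continue
        matches = sorted(section_dir.glob(f"{name}.{section}*"))
        if matches:
            return matches[0]
    return None
```

With this sketch, find_translated_page("ls", 1, "es_ES.UTF-8") looks in /usr/share/man/es_ES/man1, then /usr/share/man/es/man1, and finally in /usr/share/man/man1.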

2.4.2.2. Issues with manpage translations

One of the main issues with manpages, however, is that man has no provision for detecting when a translation is out of date. As a consequence, users reading translated manual pages might be reading outdated content that no longer matches the latest version of the program's man page.

The translation project also lacks a central web page where teams can see at a glance which manpages are available for translation, which are translated, untranslated or out of date, and who last translated each manpage.

2.4.2.3. Translation of Debian manpages

The translation of manpages specific to programs developed within the Debian project (such as the dpkg or apt tools) falls within the scope of the Debian translation teams.

Translations of manpages for Debian programs are included within the Debian package itself, which means that translators have to ask the Debian maintainer to include a translation once it is finished. Since Debian programs are typically managed through common revision control repositories available to Debian developers or contributors (either at cvs.debian.org or at alioth.debian.org), active translators of a Debian program will typically have access to those resources and will be able to commit directly to the area of the repository where the manpages are kept. It is worth noting that people active in translating a program commonly also translate its manpage, so that the translation of the program's messages and options is consistent with the manual page itself.

To coordinate the translation of manpages and make it possible to track changes, the translation teams introduced the manpages module in the DDP CVS area. This module includes several scripts to track manpage translations: check_trans.pl and compare_files.pl[5].

This module was introduced because translators did not have access to the CVS repositories of the programs for which the translations were going to be made available. Consequently, the original manpages themselves could not be modified to include a translation control header that would keep track of each modification. To keep track of translation status, the CVS module holds INFO with metadata about translated documents, including:

Manpage

Document's name in the CVS repository, which may differ from the one in the source package. This is used as the document ID.

Encoding

Document's encoding.

Location

Location of this document in the source package (this value is only set when the source package does contain this document).

Original

Original ID in the english/ directory.

Original-CVS-Revision

CVS revision number of the original document on which a translation is based.

Translator

Translator's name, used to send automatic notifications when a translated man page is outdated.

Original manpages are included in the english/ directory of that CVS module by the translation teams and need to be updated manually when the original file is updated. Based on the meta-data information and the CVS revisions available for manpages the scripts can track when a manpage is outdated and notify the translator in charge of it.
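The outdated check described above can be sketched as follows. This is not the code of check_trans.pl or compare_files.pl: the "Field: value" record layout, the function names, and the idea of passing the current revisions in as a dictionary are all assumptions made for illustration. Only the field names come from the list above.

```python
def parse_record(text):
    """Parse 'Field: value' lines into a dict (the layout is an assumption)."""
    record = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        record[field.strip()] = value.strip()
    return record


def revision_key(revision):
    """Turn a CVS revision such as '1.12' into (1, 12) for numeric comparison."""
    return tuple(int(part) for part in revision.split("."))


def outdated_translations(records, current_revisions):
    """Yield (manpage, translator, translated_rev, current_rev) for stale entries.

    current_revisions maps each Original ID to the latest CVS revision of the
    corresponding file in english/.
    """
    for record in records:
        original = record.get("Original")
        based_on = record.get("Original-CVS-Revision")
        latest = current_revisions.get(original)
        if not based_on or not latest:
            continue  # not enough information to decide
        if revision_key(based_on) < revision_key(latest):
            yield record.get("Manpage"), record.get("Translator"), based_on, latest
```

Revisions are compared as tuples of integers rather than as strings, since a plain string comparison would wrongly rank revision 1.9 above 1.10.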

Unfortunately, this mechanism, initially developed by the French translation team and used by other teams, is not being maintained. There have been no updates in the English manpages for two years and translations have not been updated there either.

FIXME: Describe use of po4a. The CVS-DDP module is not that much used anymore since most translators are now in the packages themselves...

2.4.3. Coordination of documentation translation

Initially, some of the translation projects (French and Spanish) introduced their own documentation translation management system[6] in order to coordinate the translation of the Debian Documentation Project published manuals. This management system was based on a flat database that included the available documents in the DDTP system and the status of translations. With the use of Perl scripts, this database was converted into HTML files that were published on the website so that the translation team could see which documents were being worked on and who was coordinating the translation.
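The idea of turning a flat database into a status page can be sketched as follows. The original systems used Perl scripts whose database format is not described here, so everything in this Python sketch is hypothetical: the pipe-separated "document|status|coordinator" layout, the function names, and the shape of the generated table.

```python
import html


def parse_flat_database(text):
    """Parse 'document|status|coordinator' lines into a list of dicts.

    Blank lines and lines starting with '#' are skipped. A line that does
    not have exactly three fields raises ValueError.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        document, status, coordinator = (field.strip() for field in line.split("|"))
        rows.append({"document": document, "status": status, "coordinator": coordinator})
    return rows


def render_html(rows):
    """Render the rows as a simple HTML table, escaping all values."""
    lines = ["<table>", "<tr><th>Document</th><th>Status</th><th>Coordinator</th></tr>"]
    for row in rows:
        cells = "".join(
            f"<td>{html.escape(row[key])}</td>"
            for key in ("document", "status", "coordinator")
        )
        lines.append(f"<tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)
```

A team could regenerate such a page whenever the database changes, so that the published status always reflects who is working on which document.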

This system was not integrated with the document database provided by the DDP itself, and the translation teams have, for a few years, used the translation robots (see Section 3.4) to coordinate the translation of documents themselves.

Notes

[1]

The GNU project prefers the use of info documentation but most upstream developers just provide manpages.

[2]

Including distributions such as Red Hat GNU/Linux, and projects such as the Linux Documentation Project.

[3]

The web interface can be accessed at http://cvs.debian.org/ddp/manuals.sgml/?root=debian-doc.

[4]

As defined through the LANG variable

[5]

There is an additional script, gen_db.pl, used to generate the wml files used for the translation coordination database in the web page area

[6]

The Spanish management system (the status database has not been updated since January 2003) is available at http://www.debian.org/international/spanish/ltcp/.