3.4. Translation robots

One of the most time-consuming tasks when coordinating translation projects is keeping track of what each member of the team is working on and of the precise status of each translation.

The best-known translation robot is the one used by the Translation Project. This is an e-mail service that takes care of PO file submissions for translations registered within the project. It checks whether files sent to it are receivable, that is, whether the translator has filled in the translation disclaimer [1]. The robot also calls msgfmt to check that the PO file is healthy, and it sends notices of updated PO files to the translation teams whenever translations need to be updated.
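The msgfmt health check mentioned above can be sketched as follows. This is a minimal illustration, not the Translation Project robot's actual code: the function names are hypothetical, and the choice of the --check option with the compiled output discarded to /dev/null is an assumption about how such a check might be run.

```python
import shutil
import subprocess


def msgfmt_command(po_path):
    """Build a msgfmt invocation that validates a PO file.

    --check enables msgfmt's format, header and domain checks; the
    compiled catalog is thrown away by writing it to /dev/null.
    (Assumed flags: the text only says that the robot "calls msgfmt".)
    """
    return ["msgfmt", "--check", "--output-file=/dev/null", po_path]


def po_file_is_healthy(po_path):
    """Return True if msgfmt exits successfully for the given PO file."""
    if shutil.which("msgfmt") is None:
        raise RuntimeError("msgfmt (GNU gettext) is not installed")
    result = subprocess.run(
        msgfmt_command(po_path), capture_output=True, text=True
    )
    return result.returncode == 0
```

A robot would typically run such a check on every attachment before accepting it, and reply to the sender with msgfmt's error output when the check fails.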

This translation robot, however, only tracks PO files, and only those of GNU projects registered with it. Because the requirements of the translation teams in Debian were different, these teams started writing translation robots to handle their own translation process. This work was started by members of the French team and then reused by other translation teams, including Spanish, Dutch, Brazilian Portuguese, and German.

The translation coordination robot is an e-mail robot that "listens" to the mails sent to the translation teams' mailing lists (debian-l10n-XXXX) and looks for a set of pseudo-urls in the messages' subjects. These pseudo-urls are composed of a translation status and a translation item (see Section 3.4.2). The robot uses this information to compose a list of the items being translated, the status of each translation and the translator in charge of it.

This is a useful tool for both translation coordinators and members of the translation teams. Any member can, at a glance, see the status of a translation and help if help is needed (for translations or reviews). It also helps people detect translations that are stalled (a translator stated that they were going to work on an item, but didn't actually finish it).
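Detecting stalled translations from the robot's status list can be sketched as below. The record layout, the example entries and the 30-day threshold are all assumptions made for illustration; the only fact taken from this section is that an item whose translator announced an intent to translate (the ITT state, see Section 3.4.2.1) and then went quiet is a candidate for follow-up.

```python
from datetime import datetime, timedelta


def find_stalled(items, now, max_age=timedelta(days=30)):
    """Return the items announced with ITT that have seen no update for
    longer than max_age (an assumed threshold, not a documented one)."""
    return [
        item
        for item in items
        if item["state"] == "ITT" and now - item["updated"] > max_age
    ]


# Hypothetical status entries, as a robot might store them.
status = [
    {"item": "po-debconf://foo/es.po", "state": "ITT",
     "translator": "alice", "updated": datetime(2024, 1, 5)},
    {"item": "po://bar/es.po", "state": "RFR",
     "translator": "bob", "updated": datetime(2024, 1, 2)},
    {"item": "man://baz/baz.1", "state": "ITT",
     "translator": "carol", "updated": datetime(2024, 3, 1)},
]

stalled = find_stalled(status, now=datetime(2024, 3, 10))
print([entry["item"] for entry in stalled])  # ['po-debconf://foo/es.po']
```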

Currently, there are several translation robots in place. First, there is a generic robot that handles different mailing lists by crawling the web site information with a set of scripts. This robot is used by most translation teams, although some teams use their own. The team-specific robots currently active are the Spanish translation coordination robot, the Dutch translation coordination robot and the Catalan translation coordination robot. Instead of crawling the web site, these robots use real e-mail addresses that are subscribed to the teams' mailing lists, and they handle messages in real time through procmail filters that feed this information into the translation robot's status database.

3.4.1. Translation stages

Translations evolve through the following stages:

When the original item (document, wml or PO file) is modified, the cycle starts again. In that case, however, the last translator who worked on it is considered upstream's point of contact ((s)he should be contacted whenever the translation needs to be updated). The last translator is also typically considered the maintainer of the translation for upstream, so there is no need to tell the translation team that (s)he will be updating it ((s)he is expected to do so, as (s)he is in charge). Moreover, when the changes to the original item are few, the translation team often does not need to do a full review of the new translation.

3.4.2. Pseudo-urls used by the robot

A pseudo-url is made of the following:


[<state>] <type>://<package>/<file>

These pseudo-urls are used in the mail Subject: to help the robot distinguish which mails need to be handled by it and which mails are part of the mailing list discussion.
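Recognising a pseudo-url in a Subject: header can be sketched with a regular expression. This is an illustrative parser written for this section, not the code of any of the robots: the pattern, the function name and the example subjects are assumptions, and a real robot may be stricter or more lenient (for instance about which state keys it accepts).

```python
import re

# Item types listed in this section; "webwml" is the deprecated
# spelling of "wml" and is normalised below.
VALID_TYPES = {"po-debconf", "debian-installer", "po", "man", "wml", "webwml"}

# [<state>] <type>://<package>/<file>
PSEUDO_URL = re.compile(
    r"\[(?P<state>[^\]]+)\]\s+"   # state key in square brackets
    r"(?P<type>[a-z-]+)://"       # item type
    r"(?P<package>[^/\s]+)/"      # package name (no slash, no spaces)
    r"(?P<file>\S+)"              # file identifier, may contain a path
)


def parse_subject(subject):
    """Return the pseudo-url fields found in a subject line, or None if
    the mail is ordinary list discussion."""
    match = PSEUDO_URL.search(subject)
    if match is None:
        return None
    fields = match.groupdict()
    if fields["type"] not in VALID_TYPES:
        return None
    if fields["type"] == "webwml":
        fields["type"] = "wml"
    return fields


# Hypothetical subject lines.
print(parse_subject("[RFR] po-debconf://exim4/es.po"))
print(parse_subject("Re: [ITT] wml://www.debian.org/News/2005/index.wml"))
print(parse_subject("Question about the glossary"))
```

Using search rather than a full match lets the sketch tolerate prefixes such as "Re:" that mail clients add to replies.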

The contents of the pseudo-url are:

state

The state the translation is in. For more information, see Section 3.4.2.1.

type

The type of item being translated. The translation coordination robot accepts the following types in the pseudo-url to indicate a translatable item: po-debconf, debian-installer, po, man or wml (webwml is deprecated, wml should be used instead).

package

The name of the package the document comes from. www.debian.org is used for the wml files in the CVS repository of the Debian web site.

file

The filename of the document. It can contain other information, such as the path to the file or the section of a manpage, so that no other document in the same package is referred to by the same name.

The structure of the name depends on the chosen type. In principle it is just an identifier, but it is strongly recommended to follow these rules:

3.4.2.1. Translation states handled by the robot

The translation coordination robot can track which stage the translation of any item is in through the use of the following keys in the "state" part of the pseudo-url:

TAF

("Travail À Faire", French for "work to do"; "taf" also means "work" in French slang) is sent to indicate that there is a document that needs to be worked on;

ITT

(Intent To Translate) indicates that a translator is planning to work on a given translation. This helps prevent translators from duplicating each other's work;

RFR

(Request For Review) states that an initial translation is finished. The translator will attach the translation itself to the e-mail, for peer review. This key might be used more than once for the same item[2] if substantial changes have been made to the initial translation based on others' comments. Just like with CVS commit e-mails, translators expect a reply to these requests even if it's just to say "The translation is OK.";

ITR

(Intent To Review) a peer in the translation team notes that (s)he is working on a review of the translation and that it might take some time (typically because the translation is large, or because the reviewer will not have time available until a given date). This is used to prevent the original translator from considering the translation finished (and sending an LCFC, see below);

LCFC

(Last Chance For Comments) tells the team that the translator considers the translation to be finished and has included the comments from the review process. (S)he is giving peers a last chance to review before the translation is submitted upstream. Typically, it is sent when there are no ITRs, the discussion following the RFR has ended and three days have passed since the RFR was sent. Most translation teams don't allow translators to do this unless at least one member of the team has reviewed the translator's work;

BTS#<bug number>

(Bug Tracking System) tells the team that a bug has been opened to submit the translation to the maintainer. This is useful, since the translation robot can then automatically check whether the bug report is still open and update the translation status accordingly;

FIX#<bug number>

(bug FIXed) notes that an open bug has been fixed already (useful if the translation robot missed it being closed);

DONE

states that the translation has been finished and is now included upstream. This should be used in cases where no bug report is involved such as web site translations. Otherwise, the robot will handle DONE automatically by crawling the Bug Tracking System;

HOLD

puts a translation on hold. It is used when the original version has changed but the translation should not be updated yet, e.g. because the translator knows that other modifications will be made soon and does not want someone else to update the translation too quickly.
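The state keys above, and the usual rule for sending an LCFC, can be sketched as follows. This is a simplified model under stated assumptions: the function names and the history format are hypothetical, an ITR sent after the latest RFR is treated as still pending, and the condition that the discussion has ended is not modelled because it cannot be derived from the state keys alone.

```python
import re
from datetime import datetime, timedelta

SIMPLE_STATES = {"TAF", "ITT", "RFR", "ITR", "LCFC", "DONE", "HOLD"}
BUG_STATE = re.compile(r"^(BTS|FIX)#(\d+)$")


def split_state(state):
    """Split a state key into (name, bug_number); bug_number is None
    for states that do not reference a bug report."""
    if state in SIMPLE_STATES:
        return state, None
    match = BUG_STATE.match(state)
    if match:
        return match.group(1), int(match.group(2))
    raise ValueError(f"unknown state key: {state!r}")


def lcfc_allowed(history, now, wait=timedelta(days=3)):
    """Check the typical LCFC conditions for one item.

    history is a chronological list of (state_name, timestamp) tuples.
    Returns True when an RFR exists, no ITR followed the latest RFR,
    and at least `wait` has passed since that RFR.
    """
    rfr_times = [when for name, when in history if name == "RFR"]
    if not rfr_times:
        return False
    last_rfr = rfr_times[-1]
    pending_itr = any(
        name == "ITR" and when >= last_rfr for name, when in history
    )
    return not pending_itr and now - last_rfr >= wait


# Hypothetical history for a single item.
start = datetime(2024, 1, 1)
history = [("ITT", start), ("RFR", start + timedelta(days=1))]

print(split_state("BTS#123456"))                            # ('BTS', 123456)
print(lcfc_allowed(history, now=start + timedelta(days=5)))  # True
```

Teams that require at least one completed review before an LCFC would need an extra check that this sketch does not attempt.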

3.4.3. Example of translation robot usage

This is a typical example of the way the translation robots are currently used.

Throughout this process, the members of the team subscribed to the list, new members who were not subscribed when the process started, and the translation team coordinator all have full access to the translation status (and its history) through the web application of the translation team robot.

3.4.4. Future changes for translation robots

In the future, the different translation robots of the translation teams should be merged into one common database as part of Debian's infrastructure for translators. This would avoid having robots coded in different ways, and would remove the need to code new features (such as handling compressed translations or testing PO files with msgfmt) separately in each of the independent robots.

Notes

[1]

Translators of the Translation Project have to send a written disclaimer form by postal mail before their translations are accepted for inclusion in the distribution. For more information, read http://translation.sourceforge.net/HTML/disclaim.html

[2]

Some translation robots use RFR2 for subsequent reviews.

[3]

Some translation robots don't handle compressed files, but most will handle MIME attachments.