4.3. PO for anything (po4a)

po4a (PO for anything) is a set of tools originally developed by Martin Quinson with the goal of easing translations and, more interestingly, maintenance of translations, through the use of gettext tools in areas where they were not expected, like documentation.

The tools in the po4a package update PO files from the original files and generate translated files from these PO files.

The usual process consists of one initial step (converting existing translations to PO files) and two recurring steps (updating the PO files and regenerating the translated files).

4.3.1. Converting existing/translated documentation to PO

Converting documentation files to PO is the job of the po4a-gettextize command. When starting a new translation, po4a-gettextize will extract the translatable strings from the documentation file and write a POT file from it. When translated files exist, strings are grabbed from them and collected in a new PO file.

The po4a-gettextize tool simply pairs the Nth string of the translated file with the Nth string of the original file in the PO file it creates. It verifies that the original document and the translated document have the same structure, but it cannot tell whether a given translation is still up to date with its original string.

Because the result must then be reviewed by a translator, po4a-gettextize marks all extracted strings as "fuzzy".
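
Since every string extracted from an existing translation starts out fuzzy, it can help to track how many entries still await review. The following is a minimal sketch, assuming the PO file uses the standard gettext flag comment ("#, fuzzy"); the function name count_fuzzy is made up for this example.

```shell
# count_fuzzy FILE.po
# Prints the number of entries whose flag line contains "fuzzy".
# Hypothetical helper, not part of po4a or gettext.
count_fuzzy() {
    # grep -c prints the count; it exits 1 when the count is 0,
    # which is not an error for our purposes.
    grep -c '^#,.*fuzzy' "$1" || [ $? -eq 1 ]
}

# Example call (foo.po is a placeholder name):
#   count_fuzzy foo.po
```

The count should drop towards zero as the translator reviews each pairing and removes the fuzzy flag.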

4.3.1.1. Converting manpages to PO

Converting existing manpages to PO files can be done using the po4a-gettextize tool:


$ po4a-gettextize -f man -m the_manpage -p foo.po
  

Or, if a translation already exists:


$ po4a-gettextize -f man -m the_manpage -p foo.po -l existing_translation
  

PO files can be converted back to manpages with:


$ po4a-translate -f man -m the_original_manpage -p the_PO_file -l translated_manpage
  

Upstream authors can use the following Makefile to translate manpages by placing it in a po subdirectory under the documentation directory (which holds the manpages in troff format).
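
As a rough illustration of the two recurring steps such a setup has to perform, here is a shell sketch rather than a Makefile. It assumes the standalone po4a-updatepo and po4a-translate scripts are available in the installed po4a version, that it is run from the po subdirectory, and that PO files are named <ll>.po; the function name, the RUN variable and the output file naming are inventions for this example.

```shell
# update_manpage MASTER
# For every <ll>.po in the current directory: refresh it against the
# master manpage, then regenerate the translated page.
# Set RUN=echo to print the commands instead of executing them.
update_manpage() {
    master=$1
    for po in *.po; do
        [ -f "$po" ] || continue      # no PO files yet: nothing to do
        lang=${po%.po}
        # Step 1: merge changes from the master document into the PO file.
        ${RUN:-} po4a-updatepo -f man -m "$master" -p "$po"
        # Step 2: regenerate the translated manpage from the PO file.
        ${RUN:-} po4a-translate -f man -m "$master" -p "$po" \
            -l "$(basename "$master").$lang"
    done
}

# Example call (the path is a placeholder):
#   RUN=echo; update_manpage ../foo.1
```

Running it with RUN=echo first is a cheap way to check which files would be touched before anything is written.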

4.3.2. Maintaining translated documentation with po4a

The dataflow can be summarised as "master document --> PO files --> translations". Any changes to the master document will be reflected in the PO files, and all changes to the PO files (either manual or caused by the changes from the master document) will be reflected in translated documents.

The po4a tools allow defining a minimum translation ratio: below it, the translated document is no longer generated, or contains only the original strings. This avoids publishing documents with too few translated strings while still allowing complete translations to be published. As a consequence, a generated document may still contain some strings in the original language. Leaving outdated strings in the original language, rather than shipping a stale translation, is how the tools try to guarantee the accuracy of the translated document.
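
In po4a-translate this ratio is set with the -k (--keep) option, a percentage that defaults to 80 according to its manual page. The decision itself can be sketched as below; this is only an illustration under the assumption that the ratio is the share of translated strings, rounded down, and keeps_translation is a hypothetical name, not a po4a command.

```shell
# keeps_translation TRANSLATED TOTAL [THRESHOLD]
# Succeeds when TRANSLATED/TOTAL, as a whole percentage, reaches THRESHOLD.
# THRESHOLD defaults to 80, mirroring the documented default of -k.
keeps_translation() {
    translated=$1
    total=$2
    threshold=${3:-80}
    [ "$total" -gt 0 ] || return 1    # nothing to translate: do not keep
    [ $(( translated * 100 / total )) -ge "$threshold" ]
}

# Example call:
#   keeps_translation 45 50 && echo "document would be generated"
```

A project that prefers mixed-language pages over missing ones would pass a lower threshold; one that wants only near-complete translations would raise it.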

This behaviour is a big improvement for documents such as manual pages. An argument often made against using translated man pages is the lack of any guarantee that they are in sync with the original ones. The po4a tools weaken this argument, at the price of man pages that sometimes mix translated and original text.

4.3.3. Recommended po4a organisation for software

This section is mostly taken from the po4a documentation and describes the recommended organisation for software that uses po4a.

A standardised architecture of the source tree will help the translation teams when they try to detect the POTs that need to be updated. As a consequence, the following architecture is recommended:


  /
  /doc/
  /doc/en/
  /doc/po4a/
  /doc/po4a/add_<ll>/
  /doc/po4a/po4a.cfg
  /doc/po4a/po/
  /doc/po4a/po/<pkg>.pot
  /doc/po4a/po/<ll>.po
  /doc/<ll>/

Or, if you want to avoid a big POT and split it according to the packages, documents, formats, or subjects, you can use the following architecture:


  /
  /doc/
  /doc/en/
  /doc/po4a/
  /doc/po4a/add_<ll>/
  /doc/po4a/<pkg1>/po4a.cfg
  /doc/po4a/<pkg1>/po/
  /doc/po4a/<pkg1>/po/<pkg1>.pot
  /doc/po4a/<pkg1>/po/<ll>.po
  /doc/<ll>/
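
For the first, single-POT layout, a starting-point po4a.cfg might look like the one written by the sketch below. The package name mypkg, the language list "fr de" and the manpage name are placeholders, and the paths assume that po4a is run from the /doc/po4a/ directory; the bracketed directives follow the configuration syntax described in the po4a manual page.

```shell
# write_po4a_cfg PATH
# Writes a small po4a configuration file to PATH.
# All names inside the file are placeholders for this example.
write_po4a_cfg() {
    cat > "$1" <<'EOF'
[po4a_langs] fr de
[po4a_paths] po/mypkg.pot $lang:po/$lang.po
[type: man] ../en/mypkg.1 $lang:../$lang/mypkg.1
EOF
}

# Example call:
#   write_po4a_cfg po4a.cfg
```

The quoted heredoc delimiter keeps $lang literal, so that po4a, not the shell, expands it once per language.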

It is important to avoid a build failure when a translation cannot be generated (the PO is too outdated, an addendum cannot be applied, ...). The 'install' and 'dist' rules should therefore either use wildcards or test whether each translated file was actually generated.
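
One way to express such a guard is sketched below as a shell function that an 'install' rule could call. It is an illustration only: the function name, the restriction to section 1 manpages and the <ll>/man1 destination layout are assumptions made for this example.

```shell
# install_translated_manpages DOC_DIR MAN_DIR
# Copies DOC_DIR/<ll>/*.1 to MAN_DIR/<ll>/man1/ for every language that
# actually has generated pages, and silently skips the others.
install_translated_manpages() {
    doc_dir=$1
    man_dir=$2
    for page in "$doc_dir"/*/*.1; do
        # An unmatched wildcard stays literal; skip it instead of failing.
        [ -f "$page" ] || continue
        lang=$(basename "$(dirname "$page")")
        # The original (English) pages are not translations.
        [ "$lang" != en ] || continue
        mkdir -p "$man_dir/$lang/man1"
        cp "$page" "$man_dir/$lang/man1/"
    done
}

# Example call (both paths are placeholders):
#   install_translated_manpages doc "$DESTDIR/usr/share/man"
```

Because a missing translation simply produces no match, the rule succeeds whether po4a generated zero, some or all of the translated pages.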

When po4a is used upstream, it is recommended to run po4a in the 'dist' rule. This will update the POT and POs, and will generate the translated documents.

These translated documents can be distributed in the source archive if the maintainer doesn't want to add a build dependency on po4a. This requires adding an autoconf check on po4a and will allow updating the documentation if po4a is available locally. If po4a is not available, documents will be distributed without being synced with the original version, but the build process won't fail.
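
An autoconf check would normally live in configure.ac; the same idea is sketched here in plain shell so that it can be tried directly. The PO4A variable and the function name are assumptions for this example, and the po4a invocation reuses the --rm-backups option shown in the rules at the end of this section.

```shell
# PO4A may be overridden by the caller, much as configure scripts allow.
PO4A=${PO4A:-po4a}

# update_translations CONFIG
# Runs po4a on CONFIG when the tool is installed; otherwise leaves the
# shipped translations untouched and still reports success.
update_translations() {
    cfg=$1
    if command -v "$PO4A" >/dev/null 2>&1; then
        "$PO4A" --rm-backups "$cfg"
    else
        echo "po4a not found; shipping translated documents as they are" >&2
    fi
}

# Example call (the file name is a placeholder):
#   update_translations po4a.cfg
```

The missing-tool branch deliberately returns success, which matches the requirement that the build must not fail when po4a is unavailable.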

It is important to distribute the POT and POs in the source archive.

When po4a is used in a distribution such as Debian, one should ensure that the source of the package contains only up-to-date POT and POs. This means that po4a should be run in the 'clean' rule of the make process (typically in debian/rules for Debian packages):


clean:
       # Update the POT and POs
       cd <...>/po4a && po4a --no-translations --rm-backups <package>.cfg

build:
       # Generate the translations
       cd <...>/po4a && po4a --rm-backups <package>.cfg