INTERNATIONALIZATION AND LOCALIZATION OF SHELL SCRIPTS 3 - Internationalization process -------------------------------- This chapter is intended for developers and maintainers. The internationalization process comprises following tasks: 1. Prepare scripts for internationalization 2. Mark messages to be localized 3. Use 'xgettext' to produce a template catalog of messages (1) Prepare scripts for internationalization This task is needed for shell scripts which do not yet fulfill requirements for internationalization. *** Gettext's requirements for shell scripts. *** The list of requirements below is not complete: it includes only the main ones that I recommend the developer or maintainer to check, based on my experience. Gettext replaces at run time text strings output of: (1) an "echo" command or (2) a program (like 'dialog', for instance) with translated text strings (found in a messages catalog for the language set by $LANG or $LANGUAGE) if following conditions are fulfilled: - A MO file is available in the path computed from the TEXTDOMAIN environment variable as dir_name/locale/LC_MESSAGES/text_domain.mo. For instance, if TEXTDOMAIN=software and $LANG=de_DE.utf8, gettext will look for: dir_name/de_DE.utf8/LC_MESSAGES/software.mo dir_name can be set through the value of the TEXTDOMAINDIR environment variable, otherwise a default value is used. In Slackware Linux for instance, the default value is /usr/share/locale. - TEXTDOMAIN variable is exported before any *gettext command occurs. - gettext.sh, which provides the eval_gettext and eval_ngettext functions, is sourced before any occurrence of one of these functions. - A msgid string in the MO file matches exactly the argument of gettext (or eval_gettext if the text string includes a parameter expansion). - The corresponding msgstr string does not include a backslash followed by a white space. - The msgstr string begins and ends with a newline or not, as the msgid does. - If the text string includes a parameter expansion, eval_gettext is used instead of gettext. - "The variable names must consist solely of alphanumeric or underscore ASCII characters, not start with a digit and be nonempty; otherwise such a variable reference is ignored." (gettext manual) - Parameter expansions are escaped with a single backslash like this: \$parameter or \${parameter}, unless the eval_gettext command be inside a command substitution like this: `eval_gettext "..."`" or "`$(eval_gettext "...")". In the latter case, three backslashes is used like this: \\\$parameter or \\\${parameter}. - Only the forms $parameter and ${parameter} of parameter expansion are used inside an eval_gettext's argument (all other ones are forbidden). - Positional parameters, special parameters and command substitutions are *not* used inside a gettext's or eval_gettext's argument. As a practical consequence of the two last rules, it is advisable that all positional parameters, special parameters, command substitutions and not allowed forms of parameter substitutions be assigned upstream to named variables, then expanded in the text string argument of eval_gettext or eval_negettext. Tip: if a text string has been included as a msgid in a catalog of messages and is assigned to a named variable in a script, then the commands: "gettext $parameter" and "gettext ${parameter}" will output the translated string at run time, even though 'xgettext' would discard that command when parsing the script, because 'gettext' is used instead of 'eval_gettext'. This can be handy. In this case the parameter expansion should not be escaped. (2) Mark messages to be localized I recommend to mark messages: - arguments of a not redirected 'echo' command - arguments of redirected 'echo' commands whenever a further processing displays it on user's screen - arguments of other commands which displays the message, for instance the 'dialog' program On the contrary I recommend not to mark - comments intended for readers of the script, - text string whose value will be processed later, for instance as arguments of a 'case' compound command, or arguments of a 'dialog --menu' command. Sometimes the shell script writes other shell scripts. Then the developer or maintainer have to decide on a case by case basis what to mark depending on the intended scope of internationalization. (3) Use 'xgettext' to produce a template catalog of messages The choice to produce only one POT file for the software as a whole or to make one POT files per set of scripts have to be made, considering for instance which choice will minimize maintenance work, how localization's work can be organized, relative frequency of updates for the different sets of scripts which comprise the software, and the relevance of distinguishing groups of features like setup vs configuration vs package management. I'm inclined to produce only one POT file, but the choice is yours. If the software comprises of numerous scripts located in different places or included in several packages, it can be handy to collect a copy of all scripts in a single directory, and/or to register in a text file a list of all of them with their paths. The POT file will be generated using the 'xgettext' command (see the manual or 'xgettext --help' for details). I suggest to include following options in the command: -L Shell (of course!) --strict (to facilitate checks and management of the messages catalogs) -c (to include comments useful for the translators in the POT file) -n (to identify the source file and the line number of each message. This is the default.) Once the POT file is generated you could check that it includes entries for all *gettext invocation in shell script(s). Didier Spaier