NAME
       encoding - Manipulate encodings

SYNOPSIS
       encoding option ?arg arg ...?


INTRODUCTION
       Strings  in  Tcl  are encoded using 16-bit Unicode charac-
       ters.  Different operating system interfaces  or  applica-
       tions  may  generate  strings  in  other encodings such as
       Shift-JIS.  The encoding command helps to bridge  the  gap
       between Unicode and these other formats.


DESCRIPTION
       Performs  one  of  several  encoding  related  operations,
       depending on option.  The legal options are:

       encoding convertfrom ?encoding? data
              Convert data to Unicode from the  specified  encod-
              ing.   The characters in data are treated as binary
              data where the lower 8-bits of  each  character  is
              taken  as a single byte.  The resulting sequence of
              bytes is treated  as  a  string  in  the  specified
              encoding.   If  encoding is not specified, the cur-
              rent system encoding is used.

       encoding convertto ?encoding? string
              Convert string from Unicode to the specified encod-
              ing.  The result is a sequence of bytes that repre-
              sents the converted string.  Each byte is stored in
              the lower 8-bits of a Unicode character.  If encod-
              ing is not specified, the current  system  encoding
              is used.

       encoding names
              Returns  a  list containing the names of all of the
              encodings that are currently available.

       encoding system ?encoding?
              Set the system encoding to encoding. If encoding is
              omitted then the command returns the current system
              encoding.  The system encoding is used whenever Tcl
              passes strings to system calls.


EXAMPLE
       It  is  common practice to write script files using a text
       editor that produces output in the euc-jp encoding,  which
       represents   the  ASCII  characters  as  singe  bytes  and
       Japanese characters as two bytes.  This makes it  easy  to
       embed literal strings that correspond to non-ASCII charac-
       ters by simply typing the strings in place in the  script.
       However,  because  the  source  command always reads files
       using the current system encoding, Tcl  will  only  source
       such  files  correctly when the encoding used to write the
       file is the same.  This tends not to be true in an  inter-
       nationalized  setting.   For  example,  if such a file was
       sourced in North America (where the ISO8859-1 is  normally
       used),  each  byte in the file would be treated as a sepa-
       rate character that maps to the 00 page in  Unicode.   The
       resulting  Tcl  strings  will  not  contain  the  expected
       Japanese  characters.   Instead,  they  will   contain   a
       sequence  of  Latin-1  characters  that  correspond to the
       bytes of the original string.  The encoding command can be
       used  to convert this string to the expected Japanese Uni-
       code characters.  For example,
                set s [encoding convertfrom euc-jp "\xA4\xCF"]
       would return the Unicode string  "\u306F",  which  is  the
       Hiragana letter HA.


SEE ALSO
       Tcl_GetEncoding(3)


KEYWORDS
