plan9port

fork of plan9port with libvec, libstr and libsdb

tcs.1 (2576B)


.TH TCS 1
.SH NAME
tcs \- translate character sets
.SH SYNOPSIS
.B tcs
[
.B -slcv
]
[
.B -f
.I ics
]
[
.B -t
.I ocs
]
[
.I file ...
]
.SH DESCRIPTION
.I Tcs
interprets the named
.I file(s)
(standard input default) as a stream of characters from the
.I ics
character set or format, converts them to runes,
and then converts them into a stream of characters from the
.I ocs
character set or format on the standard output.
The default value for
.I ics
and
.I ocs
is
.BR utf ,
the
.SM UTF
encoding described in
.MR utf (7) .
The
.B -l
option lists the character sets known to
.IR tcs .
Processing continues in the face of conversion errors (the
.B -s
option prevents reporting of these errors).
The
.B -c
option forces the output to contain only correctly converted characters;
otherwise,
.B 0x80
characters will be substituted for
.SM UTF
encoding errors and
.B 0xFFFD
characters will be substituted for unknown characters.
.PP
The
.B -v
option generates various diagnostic and summary information on standard error,
or makes the
.B -l
output more verbose.
.PP
.I Tcs
recognizes an ever-changing list of character sets.
In particular, it supports a variety of Russian and Japanese encodings.
Some of the supported encodings are
.TF jis-kanji
.TP
.B utf
The Plan 9
.SM UTF
encoding, known by ISO as UTF-8
.TP
.B utf1
The deprecated original
.SM UTF
encoding from ISO 10646
.TP
.B ascii
7-bit ASCII
.TP
.B 8859-1
Latin-1 (Western European)
.TP
.B 8859-2
Latin-2 (Czech .. Slovak)
.TP
.B 8859-3
Latin-3 (Dutch .. Turkish)
.TP
.B 8859-4
Latin-4 (Scandinavian)
.TP
.B 8859-5
Part 5 (Cyrillic)
.TP
.B 8859-6
Part 6 (Arabic)
.TP
.B 8859-7
Part 7 (Greek)
.TP
.B 8859-8
Part 8 (Hebrew)
.TP
.B 8859-9
Latin-5 (Finnish .. Portuguese)
.TP
.B koi8
KOI-8 (GOST 19769-74)
.TP
.B jis-kanji
ISO 2022-JP
.TP
.B ujis
EUC-JX: JIS 0208
.TP
.B ms-kanji
Microsoft, or Shift-JIS
.TP
.B jis
(input only) guesses among ISO 2022-JP, EUC, or Shift-JIS
.TP
.B gb
Chinese national standard (GB2312-80)
.TP
.B big5
Big 5 (HKU version)
.TP
.B unicode
Unicode Standard 1.0
.TP
.B tis
Thai character set plus
.SM ASCII
(TIS 620-1986)
.TP
.B msdos
IBM PC: CP 437
.TP
.B atari
Atari-ST character set
.SH EXAMPLES
.TP
.B tcs -f 8859-1
Convert 8859-1 (Latin-1) characters into
.SM UTF
format.
.TP
.B tcs -s -f jis
Convert characters encoded in one of several JIS encodings into
.SM UTF
format.
Unknown Kanji will be converted into
.B 0xFFFD
characters.
.TP
.B tcs -lv
Print an up-to-date list of the supported character sets.
.SH SOURCE
.B \*9/src/cmd/tcs
.SH SEE ALSO
.IR ascii (1),
.IR rune (3),
.MR utf (7) .