.TH TCS 1
.SH NAME
tcs \- translate character sets
.SH SYNOPSIS
.B tcs
[
.B -slcv
]
[
.B -f
.I ics
]
[
.B -t
.I ocs
]
[
.I file ...
]
.SH DESCRIPTION
.I Tcs
interprets the named
.I file(s)
(standard input default) as a stream of characters from the
.I ics
character set or format, converts them to runes,
and then converts them into a stream of characters from the
.I ocs
character set or format on the standard output.
The default value for
.I ics
and
.I ocs
is
.BR utf ,
the
.SM UTF
encoding described in
.MR utf (7) .
The
.B -l
option lists the character sets known to
.IR tcs .
Processing continues in the face of conversion errors (the
.B -s
option prevents reporting of these errors).
The
.B -c
option forces the output to contain only correctly converted characters;
otherwise,
.B 0x80
characters will be substituted for
.SM UTF
encoding errors and
.B 0xFFFD
characters will be substituted for unknown characters.
.PP
The
.B -v
option generates various diagnostic and summary information on standard error,
or makes the
.B -l
output more verbose.
.PP
.I Tcs
recognizes an ever-changing list of character sets.
In particular, it supports a variety of Russian and Japanese encodings.
Some of the supported encodings are
.TF jis-kanji
.TP
.B utf
The Plan 9
.SM UTF
encoding, known by ISO as UTF-8
.TP
.B utf1
The deprecated original
.SM UTF
encoding from ISO 10646
.TP
.B ascii
7-bit ASCII
.TP
.B 8859-1
Latin-1 (Central European)
.TP
.B 8859-2
Latin-2 (Czech .. Slovak)
.TP
.B 8859-3
Latin-3 (Dutch .. Turkish)
.TP
.B 8859-4
Latin-4 (Scandinavian)
.TP
.B 8859-5
Part 5 (Cyrillic)
.TP
.B 8859-6
Part 6 (Arabic)
.TP
.B 8859-7
Part 7 (Greek)
.TP
.B 8859-8
Part 8 (Hebrew)
.TP
.B 8859-9
Latin-5 (Finnish .. Portuguese)
.TP
.B koi8
KOI-8 (GOST 19769-74)
.TP
.B jis-kanji
ISO 2022-JP
.TP
.B ujis
EUC-JX: JIS 0208
.TP
.B ms-kanji
Microsoft, or Shift-JIS
.TP
.B jis
(from only) guesses between ISO 2022-JP, EUC or Shift-JIS
.TP
.B gb
Chinese national standard (GB2312-80)
.TP
.B big5
Big 5 (HKU version)
.TP
.B unicode
Unicode Standard 1.0
.TP
.B tis
Thai character set plus
.SM ASCII
(TIS 620-1986)
.TP
.B msdos
IBM PC: CP 437
.TP
.B atari
Atari-ST character set
.SH EXAMPLES
.TP
.B tcs -f 8859-1
Convert 8859-1 (Latin-1) characters into
.SM UTF
format.
.TP
.B tcs -s -f jis
Convert characters encoded in one of several shift JIS encodings into
.SM UTF
format.
Unknown Kanji will be converted into
.B 0xFFFD
characters.
.TP
.B tcs -lv
Print an up-to-date list of the supported character sets.
.SH SOURCE
.B \*9/src/cmd/tcs
.SH SEE ALSO
.IR ascii (1),
.IR rune (3),
.MR utf (7) .