.TH RUNE 3
.SH NAME
runetochar, chartorune, runelen, runenlen, fullrune, utfecpy, utflen, utfnlen, utfrune, utfrrune, utfutf \- rune/UTF conversion
.SH SYNOPSIS
.ta \w'\fLchar*xx'u
.B #include <utf.h>
.PP
.B
int runetochar(char *s, Rune *r)
.PP
.B
int chartorune(Rune *r, char *s)
.PP
.B
int runelen(long r)
.PP
.B
int runenlen(Rune *r, int n)
.PP
.B
int fullrune(char *s, int n)
.PP
.B
char* utfecpy(char *s1, char *es1, char *s2)
.PP
.B
int utflen(char *s)
.PP
.B
int utfnlen(char *s, long n)
.PP
.B
char* utfrune(char *s, long c)
.PP
.B
char* utfrrune(char *s, long c)
.PP
.B
char* utfutf(char *s1, char *s2)
.SH DESCRIPTION
These routines convert between a
.SM UTF
byte stream and runes.
.PP
.I Runetochar
copies one rune at
.I r
to at most
.B UTFmax
bytes starting at
.I s
and returns the number of bytes copied.
.BR UTFmax ,
defined as
.B 3
in
.BR <libc.h> ,
is the maximum number of bytes required to represent a rune.
.PP
.I Chartorune
copies at most
.B UTFmax
bytes starting at
.I s
to one rune at
.I r
and returns the number of bytes copied.
If the input is not exactly in
.SM UTF
format,
.I chartorune
will convert to 0x80 and return 1.
.PP
.I Runelen
returns the number of bytes
required to convert
.I r
into
.SM UTF.
.PP
.I Runenlen
returns the number of bytes
required to convert the
.I n
runes pointed to by
.I r
into
.SM UTF.
.PP
.I Fullrune
returns 1 if the string
.I s
of length
.I n
is long enough to be decoded by
.I chartorune
and 0 otherwise.
This does not guarantee that the string
contains a legal
.SM UTF
encoding.
This routine is used by programs that
obtain input a byte at
a time and need to know when a full rune
has arrived.
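.PP
The byte-at-a-time use of
.I fullrune
described above can be sketched as follows.
This is a minimal, self-contained illustration, not the library source.
The names SketchRune, sketch_fullrune, sketch_chartorune, RuneFeeder, feeder_pop and decode_stream are hypothetical stand-ins, introduced so the example compiles without <utf.h>.
They assume the limits stated on this page: at most 3 bytes per rune, and malformed input decoded as 0x80 with a length of 1.

```c
#include <string.h>

typedef unsigned int SketchRune;            /* hypothetical stand-in for Rune */

enum { SketchUTFmax = 3, SketchBad = 0x80 }; /* values as stated on this page */

/* Stand-in for fullrune: is the sequence at s complete within n bytes? */
int sketch_fullrune(const char *s, int n)
{
    unsigned char c;

    if (n <= 0)
        return 0;
    c = (unsigned char)s[0];
    if (c < 0x80)
        return 1;           /* single byte */
    if (c < 0xE0)
        return n >= 2;      /* two-byte lead, or a stray byte */
    return n >= 3;          /* three-byte lead, or an illegal byte */
}

/* Stand-in for chartorune: decode one rune, or SketchBad with length 1. */
int sketch_chartorune(SketchRune *r, const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    SketchRune v;

    if (p[0] < 0x80) {
        *r = p[0];
        return 1;
    }
    if (p[0] >= 0xC0 && p[0] < 0xE0 && (p[1] & 0xC0) == 0x80) {
        v = ((SketchRune)(p[0] & 0x1F) << 6) | (p[1] & 0x3F);
        if (v >= 0x80) { *r = v; return 2; }     /* reject overlong forms */
    } else if (p[0] >= 0xE0 && p[0] < 0xF0
            && (p[1] & 0xC0) == 0x80 && (p[2] & 0xC0) == 0x80) {
        v = ((SketchRune)(p[0] & 0x0F) << 12)
          | ((SketchRune)(p[1] & 0x3F) << 6) | (p[2] & 0x3F);
        if (v >= 0x800) { *r = v; return 3; }    /* reject overlong forms */
    }
    *r = SketchBad;
    return 1;
}

/* Holds the bytes of a rune that has not fully arrived yet. */
typedef struct {
    char buf[SketchUTFmax];
    int n;
} RuneFeeder;

/* Pop one rune if a full sequence is buffered; 0 means more input is needed. */
int feeder_pop(RuneFeeder *f, SketchRune *out)
{
    int used;

    if (!sketch_fullrune(f->buf, f->n))
        return 0;
    used = sketch_chartorune(out, f->buf);
    memmove(f->buf, f->buf + used, (size_t)(f->n - used));
    f->n -= used;
    return 1;
}

/* Feed len bytes one at a time; store up to max runes in out, return the count. */
int decode_stream(const char *bytes, int len, SketchRune *out, int max)
{
    RuneFeeder f = { {0}, 0 };
    int i, count = 0;

    for (i = 0; i < len && count < max; i++) {
        f.buf[f.n++] = bytes[i];        /* one byte arrives */
        while (count < max && feeder_pop(&f, &out[count]))
            count++;                    /* drain every rune that is now complete */
    }
    return count;
}
```

The buffer is drained after every byte, so it never holds more than an incomplete prefix and cannot exceed 3 bytes.
A program linked against the real library would presumably call
.I fullrune
and
.I chartorune
in place of the stand-ins.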
.PP
The following routines are analogous to the
corresponding string routines with
.B utf
substituted for
.B str
and
.B rune
substituted for
.BR chr .
.PP
.I Utfecpy
copies UTF sequences until a null sequence has been copied, but writes no
sequences beyond
.IR es1 .
If any sequences are copied,
.I s1
is terminated by a null sequence, and a pointer to that sequence is returned.
Otherwise, the original
.I s1
is returned.
.PP
.I Utflen
returns the number of runes that
are represented by the
.SM UTF
string
.IR s .
.PP
.I Utfnlen
returns the number of complete runes that
are represented by the first
.I n
bytes of
.SM UTF
string
.IR s .
If the last few bytes of the string contain an incompletely coded rune,
.I utfnlen
will not count them; in this way, it differs from
.IR utflen ,
which includes every byte of the string.
.PP
.I Utfrune
.RI ( utfrrune )
returns a pointer to the first (last)
occurrence of rune
.I c
in the
.SM UTF
string
.IR s ,
or 0 if
.I c
does not occur in the string.
The NUL byte terminating a string is considered to
be part of the string
.IR s .
.PP
.I Utfutf
returns a pointer to the first occurrence of
the
.SM UTF
string
.I s2
as a
.SM UTF
substring of
.IR s1 ,
or 0 if there is none.
If
.I s2
is the null string,
.I utfutf
returns
.IR s1 .
.SH SOURCE
.B https://9fans.github.io/plan9port/unix
.SH SEE ALSO
.IR utf (7),
.IR tcs (1)
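.SH EXAMPLE
The difference between
.I utflen
and
.I utfnlen
described above can be illustrated with a small sketch.
It is not the library source: sketch_step, sketch_utflen and sketch_utfnlen are hypothetical stand-ins written so the example compiles without <utf.h>, and they assume the limits stated on this page (at most 3 bytes per rune, malformed input consumed one byte at a time).

```c
/* Bytes a chartorune-style decoder would consume at s, or 0 when the
   sequence is cut off after avail bytes. Malformed input counts as 1 byte. */
static int sketch_step(const unsigned char *s, long avail)
{
    if (avail <= 0)
        return 0;
    if (s[0] < 0x80)
        return 1;                       /* single byte */
    if (s[0] < 0xE0) {
        if (avail < 2)
            return 0;                   /* two-byte sequence cut short */
        return (s[0] >= 0xC2 && (s[1] & 0xC0) == 0x80) ? 2 : 1;
    }
    if (avail < 3)
        return 0;                       /* three-byte sequence cut short */
    if (s[0] < 0xF0 && (s[1] & 0xC0) == 0x80 && (s[2] & 0xC0) == 0x80
            && (s[0] > 0xE0 || s[1] >= 0xA0))
        return 3;
    return 1;
}

/* Stand-in for utfnlen: complete runes within the first n bytes of s. */
int sketch_utfnlen(const char *s, long n)
{
    const unsigned char *p = (const unsigned char *)s;
    const unsigned char *end = p + n;
    int count = 0, step;

    while (p < end && *p != '\0') {
        step = sketch_step(p, end - p);
        if (step == 0)
            break;                      /* incomplete trailing rune is not counted */
        p += step;
        count++;
    }
    return count;
}

/* Stand-in for utflen: runes in the whole NUL-terminated string s. */
int sketch_utflen(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    int count = 0;

    while (*p != '\0') {
        p += sketch_step(p, 3);         /* never 0 when avail is 3 */
        count++;
    }
    return count;
}
```

For the 6-byte string "a\xC3\xA9\xE2\x82\xAC", sketch_utflen counts 3 runes, while sketch_utfnlen limited to the first 5 bytes counts 2, because the final three-byte sequence is cut off.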