      1 .TH VENTI-FMT 8
      2 .SH NAME
      3 buildindex,
      4 checkarenas,
      5 checkindex,
      6 conf,
      7 fmtarenas,
      8 fmtbloom,
      9 fmtindex,
     10 fmtisect,
     11 syncindex \- prepare and maintain a venti server
     12 .SH SYNOPSIS
     13 .PP
     14 .B venti/fmtarenas
     15 [
     16 .B -4Z
     17 ]
     18 [
     19 .B -a
     20 .I arenasize
     21 ]
     22 [
     23 .B -b
     24 .I blocksize
     25 ]
     26 .I name
     27 .I file
     28 .PP
     29 .B venti/fmtisect
     30 [
     31 .B -1Z
     32 ]
     33 [
     34 .B -b
     35 .I bucketsize
     36 ]
     37 .I name
     38 .I file
     39 .PP
     40 .B venti/fmtbloom
     41 [
     42 .B -n
     43 .I nblocks
     44 |
     45 .B -N
     46 .I nhash
     47 ]
     48 [
     49 .B -s
     50 .I size
     51 ]
     52 .I file
     53 .PP
     54 .B venti/fmtindex
     55 [
     56 .B -a
     57 ]
     58 .I venti.conf
     59 .PP
     60 .B venti/conf
     61 [
     62 .B -w
     63 ]
     64 .I partition
     65 [
     66 .I configfile
     67 ]
     68 .if t .sp 0.5
     69 .PP
     70 .B venti/buildindex
     71 [
     72 .B -bd
     73 ] [
     74 .B -i
     75 .I isect
     76 ] ... [
     77 .B -M
     78 .I imemsize
     79 ]
     80 .I venti.conf
     81 .PP
     82 .B venti/checkindex
     83 [
     84 .B -f
     85 ]
     86 [
     87 .B -B
     88 .I blockcachesize
     89 ]
     90 .I venti.conf
     91 .I tmp
     92 .PP
     93 .B venti/checkarenas
     94 [
     95 .B -afv 
     96 ]
     97 .I file
     98 .SH DESCRIPTION
     99 These commands aid in the setup, maintenance, and debugging of
    100 venti servers.
    101 See
    102 .MR venti (7)
    103 for an overview of the venti system and
    104 .MR venti (8)
    105 for an overview of the data structures used by the venti server.
    106 .PP
    107 Sizes in the following commands are given in bytes.
    108 A size may be scaled by appending
    109 .LR k ,
    110 .LR m ,
    111 or
    112 .LR g
    113 to indicate kilobytes, megabytes, or gigabytes respectively.
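.PP
As a rough sketch of the suffix rule, the hypothetical shell helper
.I to_bytes
below expands a size to bytes.
It assumes binary multiples (k = 1024 bytes), which this page does not state,
and it accepts either letter case because the defaults on this page are written both ways
(8k, 512M).

```shell
# Convert a size such as 8k, 512M or 2g to a byte count.
# Assumption (not stated in this page): suffixes are binary multiples, k = 1024.
to_bytes() {
    case "$1" in
    *[kK]) echo $(( ${1%?} * 1024 )) ;;
    *[mM]) echo $(( ${1%?} * 1024 * 1024 )) ;;
    *[gG]) echo $(( ${1%?} * 1024 * 1024 * 1024 )) ;;
    *)     echo $(( $1 )) ;;
    esac
}

to_bytes 8k      # prints 8192
to_bytes 512M    # prints 536870912
```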
    114 .SS Formatting
    115 To prepare a server for its initial use, the arena partitions and
    116 the index sections must be formatted individually, with
    117 .I fmtarenas
    118 and
    119 .IR fmtisect .
    120 Then the 
    121 collection of index sections must be combined into a venti
    122 index with 
    123 .IR fmtindex .
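.PP
The order of these steps can be sketched as a dry run.
The script below only prints the commands unless RUN=1 is set.
The device paths, the section names, and the
.I run
and
.I format_server
helpers are all hypothetical, and
.I venti.conf
is assumed to already list the same partitions.

```shell
# Dry-run sketch: print the formatting commands in order instead of running them.
# Set RUN=1 to execute them (needs the venti tools and real partitions).
# Every path and name below is a made-up example.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

format_server() {
    run venti/fmtarenas arenas /dev/sdb1   # arenas are named arenas0, arenas1, ...
    run venti/fmtisect isect0 /dev/sdc1    # first index section
    run venti/fmtisect isect1 /dev/sdc2    # second index section, with its own unique name
    run venti/fmtindex venti.conf          # combine the sections into one index
}

format_server
```

Formatting destroys whatever the partitions held before, which is why the sketch defaults to printing.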
    124 .PP
    125 .I Fmtarenas
    126 formats the given
    127 .IR file ,
    128 typically a disk partition, into an arena partition.
    129 The arenas in the partition are given names of the form
    130 .IR name%d ,
    131 where
    132 .I %d
    133 is replaced with a sequential number starting at 0.
    134 .PP
    135 Options to 
    136 .I fmtarenas
    137 are:
    138 .TP
    139 .BI -a " arenasize
    140 The arenas are of
    141 .I arenasize
    142 bytes.  The default is
    143 .BR 512M ,
    144 which was selected to provide a balance
    145 between the number of arenas and the ability to copy an arena to external
    146 media such as recordable CDs and tapes.
    147 .TP
    148 .BI -b " blocksize
    149 The size, in bytes, for read and write operations to the file.
    150 The size is recorded in the file, and is used by applications that access the arenas.
    151 The default is
    152 .BR 8k .
    153 .TP
    154 .B -4
    155 Create a `version 4' arena partition for backwards compatibility with old servers.
    156 The default is version 5, used by the current venti server.
    157 .TP
    158 .B -Z
    159 Do not zero the data sections of the arenas.
    160 Using this option reduces the formatting time
    161 but should only be used when it is known that the file was already zeroed.
    162 (Version 4 only; version 5 sections are not zeroed and do not need to be.)
    163 .PD
    164 .PP
    165 .I Fmtisect
    166 formats the given
    167 .IR file ,
    168 typically a disk partition, as a venti index section with the specified
    169 .IR name .
    170 Each of the index sections in a venti configuration must have a unique name.
    171 .PP
    172 Options to 
    173 .I fmtisect
    174 are:
    175 .TP
    176 .BI -b " bucketsize
    177 The size of an index bucket, in bytes.
    178 All the index sections within an index must have the same bucket size.
    179 The default is
    180 .BR 8k .
    181 .TP
    182 .B -1
    183 Create a `version 1' index section for backwards compatibility with old servers.
    184 The default is version 2, used by the current venti server.
    185 .TP
    186 .B -Z
    187 Do not zero the index.
    188 Using this option reduces the formatting time
    189 but should only be used when it is known that the file was already zeroed.
    190 (Version 1 only; version 2 sections are not zeroed and do not need to be.)
    191 .PD
    192 .PP
    193 .I Fmtbloom
    194 formats the given
    195 .I file
    196 as a Bloom filter
    197 (see
    198 .MR venti (7) ).
    199 The options are:
    200 .TF "\fL-s\fI size"
    201 .PD
    202 .TP
    203 .BI -n " nblocks \fR| " -N " nhash
    204 The number of blocks expected to be indexed by the filter
    205 or the number of hash functions to use.
    206 If the
    207 .B -n
    208 option
    209 is given, it is used, along with the total size of the filter,
    210 to compute an appropriate
    211 .IR nhash .
    212 .TP
    213 .BI -s " size
    214 The size of the Bloom filter.  The default is the total size of the file.
    215 In either case,
    216 .I size
    217 is rounded down to a power of two.
    218 .PD
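.PP
The rounding applied to
.I size
can be illustrated with a small shell function.
The name
.I round_down_pow2
is hypothetical, and the sketch shows only the arithmetic, not how
.I fmtbloom
implements it.

```shell
# Largest power of two that does not exceed n (n must be at least 1).
round_down_pow2() {
    p=1
    while [ $((p * 2)) -le "$1" ]; do
        p=$((p * 2))
    done
    echo "$p"
}

round_down_pow2 100000000   # prints 67108864 (2^26)
```

In this example, a file of 100,000,000 bytes would hold a filter of 67,108,864 bytes, and the rest of the file would go unused.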
    219 .PP
    220 The
    221 .I file
    222 argument in the commands above can be of the form
    223 .IB file : lo - hi
    224 to specify a range of the file. 
    225 .I Lo
    226 and
    227 .I hi
    228 are specified in bytes but can have the usual
    229 .BR k ,
    230 .BR m ,
    231 or
    232 .B g
    233 suffixes.
    234 Either
    235 .I lo
    236 or
    237 .I hi
    238 may be omitted.
    239 This notation eliminates the need to
    240 partition raw disks on non-Plan 9 systems.
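.PP
To make the range notation concrete, the hypothetical helper
.I parse_range
below splits an argument into its three parts.
It assumes that an omitted bound leaves the hyphen in place
(file:lo- or file:-hi), which this page does not spell out.

```shell
# Split "file:lo-hi" into file, lo and hi; either bound may be empty.
# Assumption: an omitted bound still keeps the '-' (file:lo- or file:-hi).
parse_range() {
    case "$1" in
    *:*-*)
        file=${1%:*}        # everything before the last ':'
        range=${1##*:}      # everything after the last ':'
        lo=${range%%-*}     # text before the first '-'
        hi=${range#*-}      # text after the first '-'
        ;;
    *)
        file=$1; lo=; hi=   # no range given: the whole file
        ;;
    esac
    echo "file=$file lo=$lo hi=$hi"
}

parse_range /dev/sda:1g-3g   # prints file=/dev/sda lo=1g hi=3g
```

With ranges like this, one raw disk could hold, for example, an index section in its first gigabyte and arenas in the remainder.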
    241 .PP
    242 .I Fmtindex
    243 reads the configuration file
    244 .I venti.conf
    245 and initializes the index sections to form a usable index structure.
    246 The arena files and index sections must have previously been formatted
    247 using 
    248 .I fmtarenas
    249 and 
    250 .I fmtisect
    251 respectively.
    252 .PP
    253 The function of a venti index is to map a SHA1 fingerprint to a location
    254 in the data section of one of the arenas.  The index is composed of
    255 blocks, each of which contains the mapping for a fixed range of possible
    256 fingerprint values.
    257 .I Fmtindex
    258 determines the mapping between SHA1 values and the blocks
    259 of the collection of index sections.  Once this mapping has been determined,
    260 it cannot be changed without rebuilding the index. 
    261 The basic assumption in the current implementation is that the index
    262 structure is sufficiently empty that individual blocks of the index will rarely
    263 overflow.  The total size of the index should be about 2% to 10% of
    264 the total size of the arenas, but the exact percentage depends both on the
    265 index block size and the compressed size of blocks stored.
    266 See the discussion in
    267 .MR venti (8)
    268 for more.
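.PP
As a worked example of the sizing guideline, the hypothetical helper
.I index_size_range
prints the 2% and 10% bounds for a given total arena size in bytes.
It is plain arithmetic and takes no account of block size or compression.

```shell
# Print the 2% and 10% index size bounds for a total arena size in bytes.
index_size_range() {
    echo "$(( $1 * 2 / 100 )) $(( $1 * 10 / 100 ))"
}

# 100 binary gigabytes of arenas:
index_size_range $((100 * 1024 * 1024 * 1024))   # prints 2147483648 10737418240
```

So arenas totalling 100 gigabytes suggest an index of roughly 2 to 10 gigabytes.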
    269 .PP
    270 .I Fmtindex
    271 also computes a mapping between a linear address space and
    272 the data section of the collection of arenas.  The
    273 .B -a
    274 option can be used to add additional arenas to an index.
    275 To use this feature,
    276 add the new arenas to
    277 .I venti.conf
    278 after the existing arenas and then run
    279 .I fmtindex
    280 .BR -a .
    281 .PP
    282 A copy of the above mappings is stored in the header for each of the index sections.
    283 These copies enable
    284 .I buildindex
    285 to restore a single index section without rebuilding the entire index.
    286 .PP
    287 To make it easier to bootstrap servers, the configuration
    288 file can be stored in otherwise empty space
    289 at the beginning of any venti partition using
    290 .IR conf .
    291 A partition so branded with a configuration file can
    292 be used in place of a configuration file when invoking any
    293 of the venti commands.
    294 By default,
    295 .I conf
    296 prints the configuration stored in
    297 .IR partition .
    298 When invoked with the
    299 .B -w
    300 flag,
    301 .I conf
    302 reads a configuration file from 
    303 .I configfile
    304 (or else standard input)
    305 and stores it in
    306 .IR partition .
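.PP
A dry-run sketch of branding a partition follows.
It prints the commands unless RUN=1 is set.
The device paths and the
.I run
and
.I brand_partition
helpers are hypothetical.

```shell
# Dry-run sketch: store a configuration in a partition, read it back, then
# pass the partition where a configuration file is expected.
# Set RUN=1 to execute; all paths are made-up examples.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

brand_partition() {
    run venti/conf -w /dev/sdb1 venti.conf     # store venti.conf in the partition
    run venti/conf /dev/sdb1                   # print the stored configuration
    run venti/checkindex /dev/sdb1 /dev/sdd1   # partition used in place of venti.conf
}

brand_partition
```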
    307 .SS Checking and Rebuilding
    308 .PP
    309 .I Buildindex
    310 populates the index for the Venti system described in
    311 .IR venti.conf .
    312 The index must have previously been formatted using
    313 .IR fmtindex .
    314 This command is typically used to build a new index for a Venti
    315 system when the old index becomes too small, or to rebuild
    316 an index after media failure.
    317 Small errors in an index can usually be fixed with
    318 .IR checkindex ,
    319 but 
    320 .I checkindex
    321 requires a large temporary workspace and 
    322 .I buildindex
    323 does not.
    324 .PP
    325 Options to 
    326 .I buildindex
    327 are:
    328 .TF "\fL-M\fI imemsize"
    329 .PD
    330 .TP
    331 .B -b
    332 Reinitialise the Bloom filter, if any.
    333 .TP
    334 .B -d
    335 `Dumb' mode; run all three passes.
    336 .TP
    337 .BI -i " isect
    338 Only rebuild index section
    339 .IR isect ;
    340 may be repeated to rebuild multiple sections.
    341 The name
    342 .L none
    343 is special and just reads the arenas.
    344 .TP
    345 .BI -M " imemsize
    346 The amount of memory, in bytes, to use for caching raw disk accesses while running
    347 .IR buildindex .
    348 (This is not a property of the created index.)
    349 The usual suffixes apply.
    350 The default is 256M.
    351 .PD
    352 .PP
    353 .I Checkindex
    354 examines the Venti index described in
    355 .IR venti.conf .
    356 The program detects various error conditions including:
    357 blocks that are not indexed, index entries for blocks that do not exist,
    358 and duplicate index entries.
    359 If requested, an attempt can be made to fix errors that are found.
    360 .PP
    361 The
    362 .I tmp
    363 file, usually a disk partition, must be large enough to store a copy of the index.
    364 This temporary space is used to perform a merge sort of index entries
    365 generated by reading the arenas.
    366 .PP
    367 Options to 
    368 .I checkindex
    369 are:
    370 .TP
    371 .BI -B " blockcachesize
    372 The amount of memory, in bytes, to use for caching raw disk accesses while running
    373 .IR checkindex .
    374 The default is 8k.
    375 .TP
    376 .B -f
    377 Attempt to fix any errors that are found.
    378 .PD
    379 .PP
    380 .I Checkarenas
    381 examines the Venti arenas contained in the given
    382 .IR file .
    383 The program detects various error conditions, and optionally attempts
    384 to fix any errors that are found.
    385 .PP
    386 Options to 
    387 .I checkarenas
    388 are:
    389 .TP
    390 .B -a
    391 For each arena, scan the entire data section.
    392 If this option is omitted, only the end section of
    393 the arena is examined.
    394 .TP
    395 .B -f
    396 Attempt to fix any errors that are found.
    397 .TP
    398 .B -v
    399 Increase the verbosity of output.
    400 .PD
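.PP
One possible maintenance pass, using only the options described above, is sketched below as a dry run.
It prints the commands unless RUN=1 is set.
The paths, the sizes, the section name, and the
.I run
and
.I check_server
helpers are hypothetical.

```shell
# Dry-run sketch: check the arenas and the index, then rebuild one index section.
# Set RUN=1 to execute; all paths, sizes and names are made-up examples.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

check_server() {
    run venti/checkarenas -a /dev/sdb1                  # scan each arena's whole data section
    run venti/checkindex -B 64m venti.conf /dev/sdd1    # report index errors; /dev/sdd1 is scratch space
    run venti/buildindex -i isect0 -M 512m venti.conf   # rebuild only index section isect0
}

check_server
```

Neither check command is given
.B -f
here, so the sketch only reports errors and leaves any repair as a separate, deliberate step.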
    401 .SH SOURCE
    402 .B \*9/src/cmd/venti/srv
    403 .SH SEE ALSO
    404 .MR venti (7) ,
    405 .MR venti (8)
    406 .SH BUGS
    407 .I Buildindex
    408 should allow an individual index section to be rebuilt.