xz/src/xz/xz.1 - Issue 2869016: Add an unpatched version of xz, XZ Utils, to /trunk/deps/third_party

Side by Side Diff: xz/src/xz/xz.1

Issue 2869016: Add an unpatched version of xz, XZ Utils, to /trunk/deps/third_party (Closed) Base URL: svn://svn.chromium.org/chrome/trunk/deps/third_party/

Patch Set: Created 10 years, 6 months ago

Property Changes:

Added: svn:eol-style
+ LF

OLD	NEW
(Empty)
	1 '\" t

	2 .\"

	3 .\" Author: Lasse Collin

	4 .\"

	5 .\" This file has been put into the public domain.

	6 .\" You can do whatever you want with this file.

	7 .\"

	8 .TH XZ 1 "2010-06-15" "Tukaani" "XZ Utils"

	9 .SH NAME

	10 xz, unxz, xzcat, lzma, unlzma, lzcat \- Compress or decompress .xz and .lzma files

	11 .SH SYNOPSIS

	12 .B xz

	13 .RI [ option ]...

	14 .RI [ file ]...

	15 .PP

	16 .B unxz

	17 is equivalent to

	18 .BR "xz \-\-decompress" .

	19 .br

	20 .B xzcat

	21 is equivalent to

	22 .BR "xz \-\-decompress \-\-stdout" .

	23 .br

	24 .B lzma

	25 is equivalent to

	26 .BR "xz \-\-format=lzma" .

	27 .br

	28 .B unlzma

	29 is equivalent to

	30 .BR "xz \-\-format=lzma \-\-decompress" .

	31 .br

	32 .B lzcat

	33 is equivalent to

	34 .BR "xz \-\-format=lzma \-\-decompress \-\-stdout" .

	35 .PP

	36 When writing scripts that need to decompress files, it is recommended to

	37 always use the name

	38 .B xz

	39 with appropriate arguments

	40 .RB ( "xz \-d"

	41 or

	42 .BR "xz \-dc" )

	43 instead of the names

	44 .B unxz

	45 and

	46 .BR xzcat .

	47 .SH DESCRIPTION

	48 .B xz

	49 is a general-purpose data compression tool with command line syntax similar to

	50 .BR gzip (1)

	51 and

	52 .BR bzip2 (1).

	53 The native file format is the

	54 .B .xz

	55 format, but also the legacy

	56 .B .lzma

	57 format and raw compressed streams with no container format headers

	58 are supported.

	59 .PP

	60 .B xz

	61 compresses or decompresses each

	62 .I file

	63 according to the selected operation mode.

	64 If no

	65 .I files

	66 are given or

	67 .I file

	68 is

	69 .BR \- ,

	70 .B xz

	71 reads from standard input and writes the processed data to standard output.

	72 .B xz

	73 will refuse (display an error and skip the

	74 .IR file )

	75 to write compressed data to standard output if it is a terminal. Similarly,

	76 .B xz

	77 will refuse to read compressed data from standard input if it is a terminal.

	78 .PP

	79 Unless

	80 .B \-\-stdout

	81 is specified,

	82 .I files

	83 other than

	84 .B \-

	85 are written to a new file whose name is derived from the source

	86 .I file

	87 name:

	88 .IP \(bu 3

	89 When compressing, the suffix of the target file format

	90 .RB ( .xz

	91 or

	92 .BR .lzma )

	93 is appended to the source filename to get the target filename.

	94 .IP \(bu 3

	95 When decompressing, the

	96 .B .xz

	97 or

	98 .B .lzma

	99 suffix is removed from the filename to get the target filename.

	100 .B xz

	101 also recognizes the suffixes

	102 .B .txz

	103 and

	104 .BR .tlz ,

	105 and replaces them with the

	106 .B .tar

	107 suffix.
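The naming rules in the two bullets above can be sketched in a few lines of Python. This is only an illustration of the text, not code taken from xz: the function name `target_name`, its parameters, and the `None` return for an unrecognized suffix are all assumptions made for the example.

```python
# Suffix rules as stated in the text above. The table and the function
# name are hypothetical; xz's own implementation handles more cases.
DECOMPRESS_SUFFIXES = (
    (".xz", ""),
    (".lzma", ""),
    (".txz", ".tar"),
    (".tlz", ".tar"),
)


def target_name(source, decompress=False, fmt="xz"):
    """Return the derived target filename, or None if no rule applies."""
    if fmt not in ("xz", "lzma"):
        raise ValueError("fmt must be 'xz' or 'lzma'")
    if not decompress:
        # Compressing: append the suffix of the target file format.
        return source + "." + fmt
    # Decompressing: strip .xz/.lzma, or replace .txz/.tlz with .tar.
    for suffix, replacement in DECOMPRESS_SUFFIXES:
        if source.endswith(suffix):
            return source[: -len(suffix)] + replacement
    return None
```

For example, `target_name("backup.txz", decompress=True)` gives `backup.tar`, while a name without a known suffix yields `None`, which stands in for the warn-and-skip behavior described later in the page.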

	108 .PP

	109 If the target file already exists, an error is displayed and the

	110 .I file

	111 is skipped.

	112 .PP

	113 Unless writing to standard output,

	114 .B xz

	115 will display a warning and skip the

	116 .I file

	117 if any of the following applies:

	118 .IP \(bu 3

	119 .I File

	120 is not a regular file. Symbolic links are not followed, thus they

	121 are not considered to be regular files.

	122 .IP \(bu 3

	123 .I File

	124 has more than one hard link.

	125 .IP \(bu 3

	126 .I File

	127 has setuid, setgid, or sticky bit set.

	128 .IP \(bu 3

	129 The operation mode is set to compress, and the

	130 .I file

	131 already has a suffix of the target file format

	132 .RB ( .xz

	133 or

	134 .B .txz

	135 when compressing to the

	136 .B .xz

	137 format, and

	138 .B .lzma

	139 or

	140 .B .tlz

	141 when compressing to the

	142 .B .lzma

	143 format).

	144 .IP \(bu 3

	145 The operation mode is set to decompress, and the

	146 .I file

	147 doesn't have a suffix of any of the supported file formats

	148 .RB ( .xz ,

	149 .BR .txz ,

	150 .BR .lzma ,

	151 or

	152 .BR .tlz ).

	153 .PP

	154 After successfully compressing or decompressing the

	155 .IR file ,

	156 .B xz

	157 copies the owner, group, permissions, access time, and modification time

	158 from the source

	159 .I file

	160 to the target file. If copying the group fails, the permissions are modified

	161 so that the target file doesn't become accessible to users who didn't have

	162 permission to access the source

	163 .IR file .

	164 .B xz

	165 doesn't support copying other metadata like access control lists

	166 or extended attributes yet.

	167 .PP

	168 Once the target file has been successfully closed, the source

	169 .I file

	170 is removed unless

	171 .B \-\-keep

	172 was specified. The source

	173 .I file

	174 is never removed if the output is written to standard output.

	175 .PP

	176 Sending

	177 .B SIGINFO

	178 or

	179 .B SIGUSR1

	180 to the

	181 .B xz

	182 process makes it print progress information to standard error.

	183 This has only limited use since when standard error is a terminal, using

	184 .B \-\-verbose

	185 will display an automatically updating progress indicator.

	186 .SS "Memory usage"

	187 The memory usage of

	188 .B xz

	189 varies from a few hundred kilobytes to several gigabytes depending on

	190 the compression settings. The settings used when compressing a file

	191 affect also the memory usage of the decompressor. Typically the decompressor

	192 needs only 5\ % to 20\ % of the amount of RAM that the compressor needed when

	193 creating the file. Still, the worst-case memory usage of the decompressor

	194 is several gigabytes.

	195 .PP

	196 To prevent uncomfortable surprises caused by huge memory usage,

	197 .B xz

	198 has a built-in memory usage limiter. While some operating systems provide

	199 ways to limit the memory usage of processes, relying on it wasn't deemed

	200 to be flexible enough. The default limit depends on the total amount of

	201 physical RAM:

	202 .IP \(bu 3

	203 If 40\ % of RAM is at least 80 MiB, 40\ % of RAM is used as the limit.

	204 .IP \(bu 3

	205 If 80\ % of RAM is less than 80 MiB, 80\ % of RAM is used as the limit.

	206 .IP \(bu 3

	207 Otherwise 80 MiB is used as the limit.
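The three-way rule above is easy to misread, so here is a minimal Python sketch of it. The function name `default_memlimit` and the use of integer division for the percentages are assumptions of this example; the man page does not say how xz rounds.

```python
MIB = 1024 * 1024


def default_memlimit(total_ram):
    """Default memory usage limit in bytes for a given amount of RAM.

    Follows the three bullets in the text; rounding via integer
    division is an assumption of this sketch.
    """
    forty_percent = total_ram * 40 // 100
    eighty_percent = total_ram * 80 // 100
    if forty_percent >= 80 * MIB:
        return forty_percent
    if eighty_percent < 80 * MIB:
        return eighty_percent
    return 80 * MIB
```

Under this reading, a machine with 128 MiB of RAM gets the fixed 80 MiB limit, because 40 % is below 80 MiB but 80 % is not.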

	208 .PP

	209 When compressing, if the selected compression settings exceed the memory

	210 usage limit, the settings are automatically adjusted downwards and a notice

	211 about this is displayed. As an exception, if the memory usage limit is

	212 exceeded when compressing with

	213 .B \-\-format=raw

	214 or

	215 .BR \-\-no\-adjust ,

	216 an error is displayed and

	217 .B xz

	218 will exit with exit status

	219 .BR 1 .

	220 .PP

	221 If source

	222 .I file

	223 cannot be decompressed without exceeding the memory usage limit, an error

	224 message is displayed and the file is skipped. Note that compressed files

	225 may contain many blocks, which may have been compressed with different

	226 settings. Typically all blocks will have roughly the same memory requirements,

	227 but it is possible that a block later in the file will exceed the memory usage

	228 limit, and an error about too low memory usage limit gets displayed after some

	229 data has already been decompressed.

	230 .PP

	231 The absolute value of the active memory usage limit can be seen with

	232 .B \-\-info\-memory

	233 or near the bottom of the output of

	234 .BR \-\-long\-help .

	235 The default limit can be overridden with

	236 \fB\-\-memory=\fIlimit\fR.

	237 .SS Concatenation and padding with .xz files

	238 It is possible to concatenate

	239 .B .xz

	240 files as is.

	241 .B xz

	242 will decompress such files as if they were a single

	243 .B .xz

	244 file.
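Concatenation can be tried without the xz command line tool by using Python's standard `lzma` module, which wraps liblzma. The sketch below is an illustration under that assumption and says nothing about how the `xz` program itself is implemented; the helper name `concat_roundtrip` is made up for the example.

```python
import lzma


def concat_roundtrip(first, second):
    """Compress two byte strings separately, join the .xz streams as is,
    and decompress the result as if it were a single .xz file."""
    part1 = lzma.compress(first, format=lzma.FORMAT_XZ)
    part2 = lzma.compress(second, format=lzma.FORMAT_XZ)
    joined = part1 + part2  # plain byte concatenation, like cat a.xz b.xz
    return lzma.decompress(joined)
```

`lzma.decompress` is documented to decode all concatenated streams and return the joined output, which matches the behavior the man page describes for `xz`.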

	245 .PP

	246 It is possible to insert padding between the concatenated parts

	247 or after the last part. The padding must be null bytes and the size

	248 of the padding must be a multiple of four bytes. This can be useful

	249 if the .xz file is stored on a medium that stores file sizes

	250 e.g. as 512-byte blocks.

	251 .PP

	252 Concatenation and padding are not allowed with

	253 .B .lzma

	254 files or raw streams.

	255 .SH OPTIONS

	256 .SS "Integer suffixes and special values"

	257 In most places where an integer argument is expected, an optional suffix

	258 is supported to easily indicate large integers. There must be no space

	259 between the integer and the suffix.

	260 .TP

	261 .B KiB

	262 The integer is multiplied by 1,024 (2^10). Also

	263 .BR Ki ,

	264 .BR k ,

	265 .BR kB ,

	266 .BR K ,

	267 and

	268 .B KB

	269 are accepted as synonyms for

	270 .BR KiB .

	271 .TP

	272 .B MiB

	273 The integer is multiplied by 1,048,576 (2^20). Also

	274 .BR Mi ,

	275 .BR m ,

	276 .BR M ,

	277 and

	278 .B MB

	279 are accepted as synonyms for

	280 .BR MiB .

	281 .TP

	282 .B GiB

	283 The integer is multiplied by 1,073,741,824 (2^30). Also

	284 .BR Gi ,

	285 .BR g ,

	286 .BR G ,

	287 and

	288 .B GB

	289 are accepted as synonyms for

	290 .BR GiB .

	291 .PP

	292 A special value

	293 .B max

	294 can be used to indicate the maximum integer value supported by the option.
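A rough Python sketch of the suffix rules above follows. The function name `parse_size` and the default for `maximum` are hypothetical; in xz the value of `max` depends on the option being parsed.

```python
import re

# Suffix table built from the synonyms listed in the text above.
_MULTIPLIERS = {}
for _names, _power in (
    (("KiB", "Ki", "k", "kB", "K", "KB"), 10),
    (("MiB", "Mi", "m", "M", "MB"), 20),
    (("GiB", "Gi", "g", "G", "GB"), 30),
):
    for _name in _names:
        _MULTIPLIERS[_name] = 1 << _power


def parse_size(text, maximum=2**64 - 1):
    """Parse an integer with an optional suffix, or the special value max.

    The default for maximum is an assumption of this sketch.
    """
    if text == "max":
        return maximum
    # No space is allowed between the integer and the suffix.
    match = re.fullmatch(r"([0-9]+)([A-Za-z]*)", text)
    if match is None:
        raise ValueError("not an integer with optional suffix: %r" % text)
    digits, suffix = match.groups()
    if suffix and suffix not in _MULTIPLIERS:
        raise ValueError("unknown suffix: %r" % suffix)
    return int(digits) * _MULTIPLIERS.get(suffix, 1)
```

With this sketch, `parse_size("80MiB")` and `parse_size("80M")` both give 83886080, and `"80 MiB"` is rejected because of the space.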

	295 .SS "Operation mode"

	296 If multiple operation mode options are given, the last one takes effect.

	297 .TP

	298 .BR \-z ", " \-\-compress

	299 Compress. This is the default operation mode when no operation mode option

	300 is specified, and no other operation mode is implied from the command name

	301 (for example,

	302 .B unxz

	303 implies

	304 .BR \-\-decompress ).

	305 .TP

	306 .BR \-d ", " \-\-decompress ", " \-\-uncompress

	307 Decompress.

	308 .TP

	309 .BR \-t ", " \-\-test

	310 Test the integrity of compressed

	311 .IR files .

	312 No files are created or removed. This option is equivalent to

	313 .B "\-\-decompress \-\-stdout"

	314 except that the decompressed data is discarded instead of being

	315 written to standard output.

	316 .TP

	317 .BR \-l ", " \-\-list

	318 List information about compressed

	319 .IR files .

	320 No uncompressed output is produced, and no files are created or removed.

	321 In list mode, the program cannot read the compressed data from standard

	322 input or from other unseekable sources.

	323 .IP

	324 The default listing shows basic information about

	325 .IR files ,

	326 one file per line. To get more detailed information, use also the

	327 .B \-\-verbose

	328 option. For even more information, use

	329 .B \-\-verbose

	330 twice, but note that it may be slow, because getting all the extra

	331 information requires many seeks. The width of verbose output exceeds

	332 80 characters, so piping the output to e.g.

	333 .B "less\ \-S"

	334 may be convenient if the terminal isn't wide enough.

	335 .IP

	336 The exact output may vary between

	337 .B xz

	338 versions and different locales. To get machine-readable output,

	339 .B \-\-robot \-\-list

	340 should be used.

	341 .SS "Operation modifiers"

	342 .TP

	343 .BR \-k ", " \-\-keep

	344 Keep (don't delete) the input files.

	345 .TP

	346 .BR \-f ", " \-\-force

	347 This option has several effects:

	348 .RS

	349 .IP \(bu 3

	350 If the target file already exists, delete it before compressing or

	351 decompressing.

	352 .IP \(bu 3

	353 Compress or decompress even if the input is a symbolic link to a regular file,

	354 has more than one hard link, or has setuid, setgid, or sticky bit set.

	355 The setuid, setgid, and sticky bits are not copied to the target file.

	356 .IP \(bu 3

	357 If combined with

	358 .B \-\-decompress

	359 .BR \-\-stdout

	360 and

	361 .B xz

	362 doesn't recognize the type of the source file,

	363 .B xz

	364 will copy the source file as is to standard output. This allows using

	365 .B xzcat

	366 .B \-\-force

	367 like

	368 .BR cat (1)

	369 for files that have not been compressed with

	370 .BR xz .

	371 Note that in future,

	372 .B xz

	373 might support new compressed file formats, which may make

	374 .B xz

	375 decompress more types of files instead of copying them as is to

	376 standard output.

	377 .BI \-\-format= format

	378 can be used to restrict

	379 .B xz

	380 to decompress only a single file format.

	381 .RE

	382 .TP

	383 .BR \-c ", " \-\-stdout ", " \-\-to\-stdout

	384 Write the compressed or decompressed data to standard output instead of

	385 a file. This implies

	386 .BR \-\-keep .

	387 .TP

	388 .B \-\-no\-sparse

	389 Disable creation of sparse files. By default, if decompressing into

	390 a regular file,

	391 .B xz

	392 tries to make the file sparse if the decompressed data contains long

	393 sequences of binary zeros. It works also when writing to standard output

	394 as long as standard output is connected to a regular file, and certain

	395 additional conditions are met to make it safe. Creating sparse files may

	396 save disk space and speed up the decompression by reducing the amount of

	397 disk I/O.

	398 .TP

	399 \fB\-S\fR \fI.suf\fR, \fB\-\-suffix=\fI.suf

	400 When compressing, use

	401 .I .suf

	402 as the suffix for the target file instead of

	403 .B .xz

	404 or

	405 .BR .lzma .

	406 If not writing to standard output and the source file already has the suffix

	407 .IR .suf ,

	408 a warning is displayed and the file is skipped.

	409 .IP

	410 When decompressing, recognize also files with the suffix

	411 .I .suf

	412 in addition to files with the

	413 .BR .xz ,

	414 .BR .txz ,

	415 .BR .lzma ,

	416 or

	417 .B .tlz

	418 suffix. If the source file has the suffix

	419 .IR .suf ,

	420 the suffix is removed to get the target filename.

	421 .IP

	422 When compressing or decompressing raw streams

	423 .RB ( \-\-format=raw ),

	424 the suffix must always be specified unless writing to standard output,

	425 because there is no default suffix for raw streams.

	426 .TP

	427 \fB\-\-files\fR[\fB=\fIfile\fR]

	428 Read the filenames to process from

	429 .IR file ;

	430 if

	431 .I file

	432 is omitted, filenames are read from standard input. Filenames must be

	433 terminated with the newline character. A dash

	434 .RB ( \- )

	435 is taken as a regular filename; it doesn't mean standard input.

	436 If filenames are given also as command line arguments, they are

	437 processed before the filenames read from

	438 .IR file .

	439 .TP

	440 \fB\-\-files0\fR[\fB=\fIfile\fR]

	441 This is identical to \fB\-\-files\fR[\fB=\fIfile\fR] except that the

	442 filenames must be terminated with the null character.

	443 .SS "Basic file format and compression options"

	444 .TP

	445 \fB\-F\fR \fIformat\fR, \fB\-\-format=\fIformat

	446 Specify the file format to compress or decompress:

	447 .RS

	448 .IP \(bu 3

	449 .BR auto :

	450 This is the default. When compressing,

	451 .B auto

	452 is equivalent to

	453 .BR xz .

	454 When decompressing, the format of the input file is automatically detected.

	455 Note that raw streams (created with

	456 .BR \-\-format=raw )

	457 cannot be auto-detected.

	458 .IP \(bu 3

	459 .BR xz :

	460 Compress to the

	461 .B .xz

	462 file format, or accept only

	463 .B .xz

	464 files when decompressing.

	465 .IP \(bu 3

	466 .B lzma

	467 or

	468 .BR alone :

	469 Compress to the legacy

	470 .B .lzma

	471 file format, or accept only

	472 .B .lzma

	473 files when decompressing. The alternative name

	474 .B alone

	475 is provided for backwards compatibility with LZMA Utils.

	476 .IP \(bu 3

	477 .BR raw :

	478 Compress or uncompress a raw stream (no headers). This is meant for advanced

	479 users only. To decode raw streams, you need to set not only

	480 .B \-\-format=raw

	481 but also specify the filter chain, which would normally be stored in the

	482 container format headers.

	483 .RE
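The point about raw streams, that the decoder must be told the filter chain because no header carries it, can be illustrated with Python's standard `lzma` module (a liblzma binding). This is a sketch under that assumption, not a description of the `xz` program; `FILTERS` and `raw_roundtrip` are names invented for the example.

```python
import lzma

# The same chain has to be supplied for encoding and decoding, since a
# raw stream has no container headers to store it in.
FILTERS = [{"id": lzma.FILTER_LZMA2, "preset": 6}]


def raw_roundtrip(data):
    """Compress to a raw LZMA2 stream and decode it again."""
    raw = lzma.compress(data, format=lzma.FORMAT_RAW, filters=FILTERS)
    return lzma.decompress(raw, format=lzma.FORMAT_RAW, filters=FILTERS)
```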

	484 .TP

	485 \fB\-C\fR \fIcheck\fR, \fB\-\-check=\fIcheck

	486 Specify the type of the integrity check, which is calculated from the

	487 uncompressed data. This option has an effect only when compressing into the

	488 .B .xz

	489 format; the

	490 .B .lzma

	491 format doesn't support integrity checks.

	492 The integrity check (if any) is verified when the

	493 .B .xz

	494 file is decompressed.

	495 .IP

	496 Supported

	497 .I check

	498 types:

	499 .RS

	500 .IP \(bu 3

	501 .BR none :

	502 Don't calculate an integrity check at all. This is usually a bad idea. This

	503 can be useful when integrity of the data is verified by other means anyway.

	504 .IP \(bu 3

	505 .BR crc32 :

	506 Calculate CRC32 using the polynomial from IEEE-802.3 (Ethernet).

	507 .IP \(bu 3

	508 .BR crc64 :

	509 Calculate CRC64 using the polynomial from ECMA-182. This is the default, since

	510 it is slightly better than CRC32 at detecting damaged files and the speed

	511 difference is negligible.

	512 .IP \(bu 3

	513 .BR sha256 :

	514 Calculate SHA-256. This is somewhat slower than CRC32 and CRC64.

	515 .RE

	516 .IP

	517 Integrity of the

	518 .B .xz

	519 headers is always verified with CRC32. It is not possible to change or

	520 disable it.
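To see that the chosen check type is recorded in the .xz file and read back on decompression, one can use Python's standard `lzma` module. The sketch assumes that module is available and uses CRC32 because liblzma always supports it; the helper names are hypothetical.

```python
import lzma


def compress_with_check(data, check):
    """Compress to the .xz format with the given integrity check type."""
    return lzma.compress(data, format=lzma.FORMAT_XZ, check=check)


def check_used(blob):
    """Decode a .xz blob and report which check type it carried."""
    decoder = lzma.LZMADecompressor(format=lzma.FORMAT_XZ)
    decoder.decompress(blob)  # the check is verified while decoding
    return decoder.check
```

Here `lzma.CHECK_NONE` and `lzma.CHECK_CRC32` correspond to the `none` and `crc32` types listed above.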

	521 .TP

	522 .BR \-0 " ... " \-9

	523 Select compression preset. If a preset level is specified multiple times,

	524 the last one takes effect.

	525 .IP

	526 The compression preset levels can be divided roughly into three

	527 categories:

	528 .RS

	529 .IP "\fB\-0\fR ... \fB\-2"

	530 Fast presets with relatively low memory usage.

	531 .B \-1

	532 and

	533 .B \-2

	534 should give compression speed and ratios comparable to

	535 .B "bzip2 \-1"

	536 and

	537 .BR "bzip2 \-9" ,

	538 respectively.

	539 Currently

	540 .B \-0

	541 is not very good (not much faster than

	542 .B \-1

	543 but much worse compression). In future,

	544 .B \-0

	545 may indicate some fast algorithm instead of LZMA2.

	546 .IP "\fB\-3\fR ... \fB\-5"

	547 Good compression ratio with low to medium memory usage.

	548 These are significantly slower than levels 0\-2.

	549 .IP "\fB\-6\fR ... \fB\-9"

	550 Excellent compression with medium to high memory usage. These are also

	551 slower than the lower preset levels. The default is

	552 .BR \-6 .

	553 Unless you want to maximize the compression ratio, you probably don't want

	554 a higher preset level than

	555 .B \-7

	556 due to speed and memory usage.

	557 .RE

	558 .IP

	559 The exact compression settings (filter chain) used by each preset may

	560 vary between

	561 .B xz

	562 versions. The settings may also vary between files being compressed, if

	563 .B xz

	564 determines that modified settings will probably give better compression

	565 ratio without significantly affecting compression time or memory usage.

	566 .IP

	567 Because the settings may vary, the memory usage may vary too. The following

	568 table lists the maximum memory usage of each preset level, which won't be

	569 exceeded even in future versions of

	570 .BR xz .

	571 .IP

	572 .B "FIXME: The table below is just a rough idea."

	573 .RS

	574 .RS

	575 .TS

	576 tab(;);

	577 c c c

	578 n n n.

	579 Preset;Compression;Decompression

	580 \-0;6 MiB;1 MiB

	581 \-1;6 MiB;1 MiB

	582 \-2;10 MiB;1 MiB

	583 \-3;20 MiB;2 MiB

	584 \-4;30 MiB;3 MiB

	585 \-5;60 MiB;6 MiB

	586 \-6;100 MiB;10 MiB

	587 \-7;200 MiB;20 MiB

	588 \-8;400 MiB;40 MiB

	589 \-9;800 MiB;80 MiB

	590 .TE

	591 .RE

	592 .RE

	593 .IP

	594 When compressing,

	595 .B xz

	596 automatically adjusts the compression settings downwards if

	597 the memory usage limit would be exceeded, so it is safe to specify

	598 a high preset level even on systems that don't have lots of RAM.

	599 .TP

	600 .BR \-\-fast " and " \-\-best

	601 These are somewhat misleading aliases for

	602 .B \-0

	603 and

	604 .BR \-9 ,

	605 respectively.

	606 These are provided only for backwards compatibility with LZMA Utils.

	607 Avoid using these options.

	608 .IP

	609 Especially the name of

	610 .B \-\-best

	611 is misleading, because the definition of best depends on the input data,

	612 and people usually don't want the very best compression ratio anyway,

	613 because it would be very slow.

	614 .TP

	615 .BR \-e ", " \-\-extreme

	616 Modify the compression preset (\fB\-0\fR ... \fB\-9\fR) so that a little bit

	617 better compression ratio can be achieved without increasing memory usage

	618 of the compressor or decompressor (exception: compressor memory usage may

	619 increase a little with presets \fB\-0\fR ... \fB\-2\fR). The downside is that

	620 the compression time will increase dramatically (it can easily double).

	621 .TP

	622 .B \-\-no\-adjust

	623 Display an error and exit if the compression settings exceed the

	624 memory usage limit. The default is to adjust the settings downwards so

	625 that the memory usage limit is not exceeded. Automatic adjusting is

	626 always disabled when creating raw streams

	627 .RB ( \-\-format=raw ).

	628 .TP

	629 \fB\-M\fR \fIlimit\fR, \fB\-\-memory=\fIlimit

	630 Set the memory usage limit. If this option is specified multiple times,

	631 the last one takes effect. The

	632 .I limit

	633 can be specified in multiple ways:

	634 .RS

	635 .IP \(bu 3

	636 The

	637 .I limit

	638 can be an absolute value in bytes. Using an integer suffix like

	639 .B MiB

	640 can be useful. Example:

	641 .B "\-\-memory=80MiB"

	642 .IP \(bu 3

	643 The

	644 .I limit

	645 can be specified as a percentage of physical RAM. Example:

	646 .B "\-\-memory=70%"

	647 .IP \(bu 3

	648 The

	649 .I limit

	650 can be reset back to its default value by setting it to

	651 .BR 0 .

	652 See the section

	653 .B "Memory usage"

	654 for how the default limit is defined.

	655 .IP \(bu 3

	656 The memory usage limiting can be effectively disabled by setting

	657 .I limit

	658 to

	659 .BR max .

	660 This isn't recommended. It's usually better to use, for example,

	661 .BR \-\-memory=90% .

	662 .RE

	663 .IP

	664 The current

	665 .I limit

	666 can be seen near the bottom of the output of the

	667 .B \-\-long\-help

	668 option.

	669 .TP

	670 \fB\-T\fR \fIthreads\fR, \fB\-\-threads=\fIthreads

	671 Specify the maximum number of worker threads to use. The default is

	672 the number of available CPU cores. You can see the current value of

	673 .I threads

	674 near the end of the output of the

	675 .B \-\-long\-help

	676 option.

	677 .IP

	678 The actual number of worker threads can be less than

	679 .I threads

	680 if using more threads would exceed the memory usage limit.

	681 In addition to CPU-intensive worker threads,

	682 .B xz

	683 may use a few auxiliary threads, which don't use a lot of CPU time.

	684 .IP

	685 .B "Multithreaded compression and decompression are not implemented yet,"

	686 .B "so this option has no effect for now."

	687 .SS Custom compressor filter chains

	688 A custom filter chain allows specifying the compression settings in detail

	689 instead of relying on the settings associated to the preset levels.

	690 When a custom filter chain is specified, the compression preset level options

	691 (\fB\-0\fR ... \fB\-9\fR and \fB\-\-extreme\fR) are silently ignored.

	692 .PP

	693 A filter chain is comparable to piping on the UN*X command line.

	694 When compressing, the uncompressed input goes to the first filter, whose

	695 output goes to the next filter (if any). The output of the last filter

	696 gets written to the compressed file. The maximum number of filters in

	697 the chain is four, but typically a filter chain has only one or two filters.

	698 .PP

	699 Many filters have limitations on where they can be in the filter chain:

	700 some filters can work only as the last filter in the chain, some only

	701 as a non-last filter, and some work in any position in the chain. Depending

	702 on the filter, this limitation is either inherent to the filter design or

	703 exists to prevent security issues.

	704 .PP

	705 A custom filter chain is specified by using one or more filter options in

	706 the order they are wanted in the filter chain. That is, the order of filter

	707 options is significant! When decoding raw streams

	708 .RB ( \-\-format=raw ),

	709 the filter chain is specified in the same order as it was specified when

	710 compressing.

	711 .PP

	712 Filters take filter-specific

	713 .I options

	714 as a comma-separated list. Extra commas in

	715 .I options

	716 are ignored. Every option has a default value, so you need to

	717 specify only those you want to change.
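As an illustration of an ordered filter chain with per-filter options, the following sketch uses Python's standard `lzma` module, whose `filters` argument takes the chain as a list in the same first-to-last order. It is an analogy to the command line options, not their implementation; `CHAIN` and `chain_roundtrip` are made-up names, and the 1 MiB dictionary is an arbitrary choice.

```python
import lzma

# Order matters: the BCJ filter comes first, LZMA2 must be last.
# Options left out (for example lc, lp, pb) keep their defaults.
CHAIN = [
    {"id": lzma.FILTER_X86},
    {"id": lzma.FILTER_LZMA2, "preset": 6, "dict_size": 1 << 20},
]


def chain_roundtrip(data):
    """Compress with the custom chain into a .xz container and decode it.

    The .xz headers store the chain, so decoding needs no filter list.
    """
    blob = lzma.compress(data, format=lzma.FORMAT_XZ, filters=CHAIN)
    return lzma.decompress(blob)
```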

	718 .TP

	719 \fB\-\-lzma1\fR[\fB=\fIoptions\fR], \fB\-\-lzma2\fR[\fB=\fIoptions\fR]

	720 Add LZMA1 or LZMA2 filter to the filter chain. These filters can be used

	721 only as the last filter in the chain.

	722 .IP

	723 LZMA1 is a legacy filter, which is supported almost solely due to the legacy

	724 .B .lzma

	725 file format, which supports only LZMA1. LZMA2 is an updated

	726 version of LZMA1 to fix some practical issues of LZMA1. The

	727 .B .xz

	728 format uses LZMA2, and doesn't support LZMA1 at all. Compression speed and

	729 ratios of LZMA1 and LZMA2 are practically the same.

	730 .IP

	731 LZMA1 and LZMA2 share the same set of

	732 .IR options :

	733 .RS

	734 .TP

	735 .BI preset= preset

	736 Reset all LZMA1 or LZMA2

	737 .I options

	738 to

	739 .IR preset .

	740 .I Preset

	741 consists of an integer, which may be followed by single-letter preset

	742 modifiers. The integer can be from

	743 .B 0

	744 to

	745 .BR 9 ,

	746 matching the command line options \fB\-0\fR ... \fB\-9\fR.

	747 The only supported modifier is currently

	748 .BR e ,

	749 which matches

	750 .BR \-\-extreme .

	751 .IP

	752 The default

	753 .I preset

	754 is

	755 .BR 6 ,

	756 from which the default values for the rest of the LZMA1 or LZMA2

	757 .I options

	758 are taken.

	759 .TP

	760 .BI dict= size

	761 Dictionary (history buffer) size indicates how many bytes of the recently

	762 processed uncompressed data are kept in memory. One method to reduce the size of

	763 the uncompressed data is to store distance-length pairs, which

	764 indicate what data to repeat from the dictionary buffer. The bigger

	765 the dictionary, the better the compression ratio usually is,

	766 but dictionaries bigger than the uncompressed data are a waste of RAM.

	767 .IP

	768 Typical dictionary size is from 64 KiB to 64 MiB. The minimum is 4 KiB.

	769 The maximum for compression is currently 1.5 GiB. The decompressor already

	770 supports dictionaries up to one byte less than 4 GiB, which is the

	771 maximum for LZMA1 and LZMA2 stream formats.

	772 .IP

	773 Dictionary size has the biggest effect on compression ratio.

	774 Dictionary size and match finder together determine the memory usage of

	775 the LZMA1 or LZMA2 encoder. The same dictionary size is required

	776 for decompressing that was used when compressing, thus the memory usage of

	777 the decoder is determined by the dictionary size used when compressing.

	778 .TP

	779 .BI lc= lc

	780 Specify the number of literal context bits. The minimum is

	781 .B 0

	782 and the maximum is

	783 .BR 4 ;

	784 the default is

	785 .BR 3 .

	786 In addition, the sum of

	787 .I lc

	788 and

	789 .I lp

	790 must not exceed

	791 .BR 4 .

	792 .TP

	793 .BI lp= lp

	794 Specify the number of literal position bits. The minimum is

	795 .B 0

	796 and the maximum is

	797 .BR 4 ;

	798 the default is

	799 .BR 0 .

	800 .TP

	801 .BI pb= pb

	802 Specify the number of position bits. The minimum is

	803 .B 0

	804 and the maximum is

	805 .BR 4 ;

	806 the default is

	807 .BR 2 .

	808 .TP

	809 .BI mode= mode

	810 Compression

	811 .I mode

	812 specifies the function used to analyze the data produced by the match finder.

	813 Supported

	814 .I modes

	815 are

	816 .B fast

	817 and

	818 .BR normal .

	819 The default is

	820 .B fast

	821 for

	822 .I presets

	823 .BR 0 \- 2

	824 and

	825 .B normal

	826 for

	827 .I presets

	828 .BR 3 \- 9 .

	829 .TP

	830 .BI mf= mf

	831 Match finder has a major effect on encoder speed, memory usage, and

	832 compression ratio. Usually Hash Chain match finders are faster than

	833 Binary Tree match finders. Hash Chains are usually used together with

	834 .B mode=fast

	835 and Binary Trees with

	836 .BR mode=normal .

	837 The memory usage formulas are only rough estimates,

	838 which are closest to reality when

	839 .I dict

	840 is a power of two.

	841 .RS

	842 .TP

	843 .B hc3

	844 Hash Chain with 2- and 3-byte hashing

	845 .br

	846 Minimum value for

	847 .IR nice :

	848 3

	849 .br

	850 Memory usage:

	851 .I dict

	852 * 7.5 (if

	853 .I dict

	854 <= 16 MiB);

	855 .br

	856 .I dict

	857 * 5.5 + 64 MiB (if

	858 .I dict

	859 > 16 MiB)

	860 .TP

	861 .B hc4

	862 Hash Chain with 2-, 3-, and 4-byte hashing

	863 .br

	864 Minimum value for

	865 .IR nice :

	866 4

	867 .br

	868 Memory usage:

	869 .I dict

	870 * 7.5

	871 .TP

	872 .B bt2

	873 Binary Tree with 2-byte hashing

	874 .br

	875 Minimum value for

	876 .IR nice :

	877 2

	878 .br

	879 Memory usage:

	880 .I dict

	881 * 9.5

	882 .TP

	883 .B bt3

	884 Binary Tree with 2- and 3-byte hashing

	885 .br

	886 Minimum value for

	887 .IR nice :

	888 3

	889 .br

	890 Memory usage:

	891 .I dict

	892 * 11.5 (if

	893 .I dict

	894 <= 16 MiB);

	895 .br

	896 .I dict

	897 * 9.5 + 64 MiB (if

	898 .I dict

	899 > 16 MiB)

	900 .TP

	901 .B bt4

	902 Binary Tree with 2-, 3-, and 4-byte hashing

	903 .br

	904 Minimum value for

	905 .IR nice :

	906 4

	907 .br

	908 Memory usage:

	909 .I dict

	910 * 11.5

	911 .RE
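The memory usage formulas listed for the five match finders can be collected into one small function. This Python sketch only restates those rough estimates; the name `mf_memory` is hypothetical and the results are no more precise than the formulas themselves.

```python
MIB = 1 << 20


def mf_memory(mf, dict_size):
    """Rough encoder memory estimate in bytes, per the formulas above."""
    big = dict_size > 16 * MIB
    if mf == "hc3":
        return dict_size * 5.5 + 64 * MIB if big else dict_size * 7.5
    if mf == "hc4":
        return dict_size * 7.5
    if mf == "bt2":
        return dict_size * 9.5
    if mf == "bt3":
        return dict_size * 9.5 + 64 * MIB if big else dict_size * 11.5
    if mf == "bt4":
        return dict_size * 11.5
    raise ValueError("unknown match finder: %r" % mf)
```

For instance, `bt4` with an 8 MiB dictionary comes out at about 92 MiB, and `hc3` with a 32 MiB dictionary at about 240 MiB.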

	912 .TP

	913 .BI nice= nice

	914 Specify what is considered to be a nice length for a match. Once a match

	915 of at least

	916 .I nice

	917 bytes is found, the algorithm stops looking for possibly better matches.

	918 .IP

	919 .I nice

	920 can be 2\-273 bytes. Higher values tend to give better compression ratio

	921 at expense of speed. The default depends on the

	922 .I preset

	923 level.

	924 .TP

	925 .BI depth= depth

	926 Specify the maximum search depth in the match finder. The default is the

	927 special value

	928 .BR 0 ,

	929 which makes the compressor determine a reasonable

	930 .I depth

	931 from

	932 .I mf

	933 and

	934 .IR nice .

	935 .IP

	936 Using very high values for

	937 .I depth

	938 can make the encoder extremely slow with carefully crafted files.

	939 Avoid setting the

	940 .I depth

	941 over 1000 unless you are prepared to interrupt the compression in case it

	942 is taking too long.

	943 .RE

	944 .IP

	945 When decoding raw streams

	946 .RB ( \-\-format=raw ),

	947 LZMA2 needs only the value of

	948 .BR dict .

	949 LZMA1 needs also

	950 .BR lc ,

	951 .BR lp ,

	952 and

	953 .BR pb .
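The ranges given earlier for lc, lp, and pb, together with the rule that lc plus lp must not exceed 4, can be checked with a few lines of Python. The function name `validate_lzma_options` is made up for this sketch; the defaults are the ones stated in the option descriptions.

```python
def validate_lzma_options(lc=3, lp=0, pb=2):
    """Check lc, lp, and pb against the limits stated in the text.

    Returns the options as a dict, or raises ValueError.
    """
    for name, value in (("lc", lc), ("lp", lp), ("pb", pb)):
        if not 0 <= value <= 4:
            raise ValueError("%s must be between 0 and 4, got %r" % (name, value))
    if lc + lp > 4:
        raise ValueError("lc + lp must not exceed 4")
    return {"lc": lc, "lp": lp, "pb": pb}
```

So `lc=4` is only valid together with `lp=0`, even though each value on its own may go up to 4.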

	954 .TP

	955 \fB\-\-x86\fR[\fB=\fIoptions\fR]

	956 .TP

	957 \fB\-\-powerpc\fR[\fB=\fIoptions\fR]

	958 .TP

	959 \fB\-\-ia64\fR[\fB=\fIoptions\fR]

	960 .TP

	961 \fB\-\-arm\fR[\fB=\fIoptions\fR]

	962 .TP

	963 \fB\-\-armthumb\fR[\fB=\fIoptions\fR]

	964 .TP

	965 \fB\-\-sparc\fR[\fB=\fIoptions\fR]

	966 Add a branch/call/jump (BCJ) filter to the filter chain. These filters

	967 can be used only as a non-last filter in the filter chain.

	968 .IP

	969 A BCJ filter converts relative addresses in the machine code to their

	970 absolute counterparts. This doesn't change the size of the data, but

	971 it increases redundancy, which allows e.g. LZMA2 to get better

	972 compression ratio.

	973 .IP

	974 The BCJ filters are always reversible, so using a BCJ filter for the wrong

	975 type of data doesn't cause any data loss. However, applying a BCJ filter

	976 to the wrong type of data is a bad idea, because it tends to make the

	977 compression ratio worse.

	978 .IP

	979 Different instruction sets have different alignment:

	980 .RS

	981 .RS

	982 .TS

	983 tab(;);

	984 l n l

	985 l n l.

	986 Filter;Alignment;Notes

	987 x86;1;32-bit and 64-bit x86

	988 PowerPC;4;Big endian only

	989 ARM;4;Little endian only

	990 ARM-Thumb;2;Little endian only

	991 IA-64;16;Big or little endian

	992 SPARC;4;Big or little endian

	993 .TE

	994 .RE

	995 .RE

	996 .IP

	997 Since the BCJ-filtered data is usually compressed with LZMA2, the compression

	998 ratio may be improved slightly if the LZMA2 options are set to match the

	999 alignment of the selected BCJ filter. For example, with the IA-64 filter,

	1000 it's good to set

	1001 .B pb=4

	1002 with LZMA2 (2^4=16). The x86 filter is an exception; it's usually good to

	1003 stick to LZMA2's default four-byte alignment when compressing x86 executables.

	1004 .IP

	1005 All BCJ filters support the same

	1006 .IR options :

	1007 .RS

	1008 .TP

	1009 .BI start= offset

	1010 Specify the start

	1011 .I offset

	1012 that is used when converting between relative and absolute addresses.

	1013 The

	1014 .I offset

	1015 must be a multiple of the alignment of the filter (see the table above).

	1016 The default is zero. In practice, the default is good; specifying

	1017 a custom

	1018 .I offset

	1019 is almost never useful.

	1020 .IP

	1021 Specifying a non-zero start

	1022 .I offset

	1023 is probably useful only if the executable has multiple sections, and there

	1024 are many cross-section jumps or calls. Applying a BCJ filter separately for

	1025 each section with proper start offset and then compressing the result as

	1026 a single chunk may give some improvement in compression ratio compared

	1027 to applying the BCJ filter with the default

	1028 .I offset

	1029 for the whole executable.

	1030 .RE

	1031 .TP

	1032 \fB\-\-delta\fR[\fB=\fIoptions\fR]

	1033 Add the Delta filter to the filter chain. The Delta filter

	1034 can be used only as a non-last filter in the filter chain.

	1035 .IP

	1036 Currently only simple byte-wise delta calculation is supported. It can

	1037 be useful when compressing e.g. uncompressed bitmap images or uncompressed

	1038 PCM audio. However, special purpose algorithms may give significantly better

	1039 results than Delta + LZMA2. This is true especially with audio, which

	1040 compresses faster and better e.g. with FLAC.

	1041 .IP

	1042 Supported

	1043 .IR options :

	1044 .RS

	1045 .TP

	1046 .BI dist= distance

	1047 Specify the

	1048 .I distance

	1049 of the delta calculation as bytes.

	1050 .I distance

	1051 must be 1\-256. The default is 1.

	1052 .IP

	1053 For example, with

	1054 .B dist=2

	1055 and eight-byte input A1 B1 A2 B3 A3 B5 A4 B7, the output will be

	1056 A1 B1 01 02 01 02 01 02.

	1057 .RE
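
The byte-wise delta calculation described for dist= can be sketched in Python. This is a minimal illustration under stated assumptions, not the liblzma implementation: the function names delta_encode and delta_decode are hypothetical, and the first dist bytes are assumed to pass through unchanged, which is what the example above (A1 B1 staying as-is) suggests.

```python
def delta_encode(data: bytes, dist: int = 1) -> bytes:
    """Replace each byte with its difference (mod 256) from the byte
    `dist` positions earlier. The first `dist` bytes are kept as-is.
    Hypothetical helper for illustration only."""
    if not 1 <= dist <= 256:
        raise ValueError("dist must be 1-256")
    out = bytearray(data)
    for i in range(dist, len(data)):
        out[i] = (data[i] - data[i - dist]) & 0xFF
    return bytes(out)


def delta_decode(data: bytes, dist: int = 1) -> bytes:
    """Inverse of delta_encode: add back the already-restored byte
    `dist` positions earlier."""
    if not 1 <= dist <= 256:
        raise ValueError("dist must be 1-256")
    out = bytearray(data)
    for i in range(dist, len(out)):
        out[i] = (out[i] + out[i - dist]) & 0xFF
    return bytes(out)


if __name__ == "__main__":
    src = bytes.fromhex("A1B1A2B3A3B5A4B7")
    enc = delta_encode(src, dist=2)
    print(enc.hex(" ").upper())
    print(delta_decode(enc, dist=2) == src)
```

With dist=2, two interleaved byte sequences that each grow by a constant step turn into a repeating pattern, which is the kind of redundancy a following LZMA2 filter can exploit.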

	1058 .SS "Other options"

	1059 .TP

	1060 .BR \-q ", " \-\-quiet

	1061 Suppress warnings and notices. Specify this twice to suppress errors too.

	1062 This option has no effect on the exit status. That is, even if a warning

	1063 was suppressed, the exit status to indicate a warning is still used.

	1064 .TP

	1065 .BR \-v ", " \-\-verbose

	1066 Be verbose. If standard error is connected to a terminal,

	1067 .B xz

	1068 will display a progress indicator.

	1069 Specifying

	1070 .B \-\-verbose

	1071 twice will give even more verbose output (useful mostly for debugging).

	1072 .IP

	1073 The progress indicator shows the following information:

	1074 .RS

	1075 .IP \(bu 3

	1076 Completion percentage is shown if the size of the input file is known.

	1077 That is, percentage cannot be shown in pipes.

	1078 .IP \(bu 3

	1079 Amount of compressed data produced (compressing) or consumed (decompressing).

	1080 .IP \(bu 3

	1081 Amount of uncompressed data consumed (compressing) or produced

	1082 (decompressing).

	1083 .IP \(bu 3

	1084 Compression ratio, which is calculated by dividing the amount of

	1085 compressed data processed so far by the amount of uncompressed data

	1086 processed so far.

	1087 .IP \(bu 3

	1088 Compression or decompression speed. This is measured as the amount of

	1089 uncompressed data consumed (compression) or produced (decompression)

	1090 per second. It is shown once a few seconds have passed since

	1091 .B xz

	1092 started processing the file.

	1093 .IP \(bu 3

	1094 Elapsed time or estimated time remaining.

	1095 Elapsed time is displayed in the format M:SS or H:MM:SS.

	1096 The estimated remaining time is displayed in a less precise format

	1097 which never has colons, for example, 2 min 30 s. The estimate can

	1098 be shown only when the size of the input file is known and a couple of

	1099 seconds have already passed since

	1100 .B xz

	1101 started processing the file.

	1102 .RE

	1103 .IP

	1104 When standard error is not a terminal,

	1105 .B \-\-verbose

	1106 will make

	1107 .B xz

	1108 print the filename, compressed size, uncompressed size, and compression ratio

	1109 on a single line to standard error after

	1110 compressing or decompressing the file. If the operation took at least a few

	1111 seconds, the speed and elapsed time are printed too. If the operation

	1112 didn't finish, for example due to user interruption, the completion

	1113 percentage is also printed if the size of the input file is known.

	1114 .TP

	1115 .BR \-Q ", " \-\-no\-warn

	1116 Don't set the exit status to

	1117 .B 2

	1118 even if a condition worth a warning was detected. This option doesn't affect

	1119 the verbosity level, thus both

	1120 .B \-\-quiet

	1121 and

	1122 .B \-\-no\-warn

	1123 have to be used to not display warnings and to not alter the exit status.

	1124 .TP

	1125 .B \-\-robot

	1126 Print messages in a machine-parsable format. This is intended to ease

	1127 writing frontends that want to use

	1128 .B xz

	1129 instead of liblzma, which may be the case with various scripts. The output

	1130 with this option enabled is meant to be stable across

	1131 .B xz

	1132 releases. See the section

	1133 .B "ROBOT MODE"

	1134 for details.

	1135 .TP

	1136 .BR \-\-info\-memory

	1137 Display the current memory usage limit in human-readable format on

	1138 a single line, and exit successfully. To see how much RAM

	1139 .B xz

	1140 thinks your system has, use

	1141 .BR "\-\-memory=100% \-\-info\-memory" .

	1142 .TP

	1143 .BR \-h ", " \-\-help

	1144 Display a help message describing the most commonly used options,

	1145 and exit successfully.

	1146 .TP

	1147 .BR \-H ", " \-\-long\-help

	1148 Display a help message describing all features of

	1149 .BR xz ,

	1150 and exit successfully.

	1151 .TP

	1152 .BR \-V ", " \-\-version

	1153 Display the version number of

	1154 .B xz

	1155 and liblzma in human readable format. To get machine-parsable output, specify

	1156 .B \-\-robot

	1157 before

	1158 .BR \-\-version .

	1159 .SH ROBOT MODE

	1160 The robot mode is activated with the

	1161 .B \-\-robot

	1162 option. It makes the output of

	1163 .B xz

	1164 easier to parse by other programs. Currently

	1165 .B \-\-robot

	1166 is supported only together with

	1167 .BR \-\-version ,

	1168 .BR \-\-info\-memory ,

	1169 and

	1170 .BR \-\-list .

	1171 It will be supported for normal compression and decompression in the future.

	1172 .PP

	1173 .SS Version

	1174 .B "xz \-\-robot \-\-version"

	1175 will print the version number of

	1176 .B xz

	1177 and liblzma in the following format:

	1178 .PP

	1179 .BI XZ_VERSION= XYYYZZZS

	1180 .br

	1181 .BI LIBLZMA_VERSION= XYYYZZZS

	1182 .TP

	1183 .I X

	1184 Major version.

	1185 .TP

	1186 .I YYY

	1187 Minor version. Even numbers are stable.

	1188 Odd numbers are alpha or beta versions.

	1189 .TP

	1190 .I ZZZ

	1191 Patch level for stable releases or just a counter for development releases.

	1192 .TP

	1193 .I S

	1194 Stability.

	1195 .B 0

	1196 is alpha,

	1197 .B 1

	1198 is beta, and

	1199 .B 2

	1200 is stable.

	1201 .I S

	1202 should always be

	1203 .B 2

	1204 when

	1205 .I YYY

	1206 is even.

	1207 .PP

	1208 .I XYYYZZZS

	1209 are the same on both lines if

	1210 .B xz

	1211 and liblzma are from the same XZ Utils release.

	1212 .PP

	1213 Examples: 4.999.9beta is

	1214 .B 49990091

	1215 and

	1216 5.0.0 is

	1217 .BR 50000002 .
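
A script consuming this output might decode the XYYYZZZS value as sketched below. This is an illustration under stated assumptions rather than an official parser: the function names decode_version and parse_robot_version are hypothetical, and the digits are split from the right so that the sketch would still work if the major version ever needed more than one digit.

```python
STABILITY = {"0": "alpha", "1": "beta", "2": "stable"}


def decode_version(value: str):
    """Split an XYYYZZZS string into (major, minor, patch, stability).
    Hypothetical helper; fields are taken from the right-hand side."""
    if len(value) < 8 or not (value.isascii() and value.isdigit()):
        raise ValueError("expected at least eight decimal digits: %r" % value)
    stability = STABILITY.get(value[-1])
    if stability is None:
        raise ValueError("unknown stability digit: %r" % value[-1])
    major = int(value[:-7])
    minor = int(value[-7:-4])
    patch = int(value[-4:-1])
    return major, minor, patch, stability


def parse_robot_version(output: str):
    """Parse the two KEY=VALUE lines printed by `xz --robot --version`."""
    versions = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep and key in ("XZ_VERSION", "LIBLZMA_VERSION"):
            versions[key] = decode_version(value.strip())
    return versions


if __name__ == "__main__":
    sample = "XZ_VERSION=49990091\nLIBLZMA_VERSION=50000002\n"
    for name, version in parse_robot_version(sample).items():
        print(name, version)
```

Comparing the two decoded tuples is one way to detect that xz and liblzma come from different XZ Utils releases.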

	1218 .SS Memory limit information

	1219 .B "xz \-\-robot \-\-info\-memory"

	1220 prints the current memory usage limit as bytes on a single line.

	1221 To get the total amount of installed RAM, use

	1222 .BR "xz \-\-robot \-\-memory=100% \-\-info\-memory" .

	1223 .SS List mode

	1224 .B "xz \-\-robot \-\-list"

	1225 uses tab-separated output. The first column of every line has a string

	1226 that indicates the type of the information found on that line:

	1227 .TP

	1228 .B name

	1229 This is always the first line when starting to list a file. The second

	1230 column on the line is the filename.

	1231 .TP

	1232 .B file

	1233 This line contains overall information about the

	1234 .B .xz

	1235 file. This line is always printed after the

	1236 .B name

	1237 line.

	1238 .TP

	1239 .B stream

	1240 This line type is used only when

	1241 .B \-\-verbose

	1242 was specified. There are as many

	1243 .B stream

	1244 lines as there are streams in the

	1245 .B .xz

	1246 file.

	1247 .TP

	1248 .B block

	1249 This line type is used only when

	1250 .B \-\-verbose

	1251 was specified. There are as many

	1252 .B block

	1253 lines as there are blocks in the

	1254 .B .xz

	1255 file. The

	1256 .B block

	1257 lines are shown after all the

	1258 .B stream

	1259 lines; different line types are not interleaved.

	1260 .TP

	1261 .B summary

	1262 This line type is used only when

	1263 .B \-\-verbose

	1264 was specified twice. This line is printed after all

	1265 .B block

	1266 lines. Like the

	1267 .B file

	1268 line, the

	1269 .B summary

	1270 line contains overall information about the

	1271 .B .xz

	1272 file.

	1273 .TP

	1274 .B totals

	1275 This line is always the very last line of the list output. It shows

	1276 the total counts and sizes.

	1277 .PP

	1278 The columns of the

	1279 .B file

	1280 lines:

	1281 .RS

	1282 .IP 2. 4

	1283 Number of streams in the file

	1284 .IP 3. 4

	1285 Total number of blocks in the stream(s)

	1286 .IP 4. 4

	1287 Compressed size of the file

	1288 .IP 5. 4

	1289 Uncompressed size of the file

	1290 .IP 6. 4

	1291 Compression ratio, for example

	1292 .BR 0.123 .

	1293 If ratio is over 9.999, three dashes

	1294 .RB ( \-\-\- )

	1295 are displayed instead of the ratio.

	1296 .IP 7. 4

	1297 Comma-separated list of integrity check names. The following strings are

	1298 used for the known check types:

	1299 .BR None ,

	1300 .BR CRC32 ,

	1301 .BR CRC64 ,

	1302 and

	1303 .BR SHA\-256 .

	1304 For unknown check types,

	1305 .BI Unknown\- N

	1306 is used, where

	1307 .I N

	1308 is the Check ID as a decimal number (one or two digits).

	1309 .IP 8. 4

	1310 Total size of stream padding in the file

	1311 .RE

	1312 .PP

	1313 The columns of the

	1314 .B stream

	1315 lines:

	1316 .RS

	1317 .IP 2. 4

	1318 Stream number (the first stream is 1)

	1319 .IP 3. 4

	1320 Number of blocks in the stream

	1321 .IP 4. 4

	1322 Compressed start offset

	1323 .IP 5. 4

	1324 Uncompressed start offset

	1325 .IP 6. 4

	1326 Compressed size (does not include stream padding)

	1327 .IP 7. 4

	1328 Uncompressed size

	1329 .IP 8. 4

	1330 Compression ratio

	1331 .IP 9. 4

	1332 Name of the integrity check

	1333 .IP 10. 4

	1334 Size of stream padding

	1335 .RE

	1336 .PP

	1337 The columns of the

	1338 .B block

	1339 lines:

	1340 .RS

	1341 .IP 2. 4

	1342 Number of the stream containing this block

	1343 .IP 3. 4

	1344 Block number relative to the beginning of the stream (the first block is 1)

	1345 .IP 4. 4

	1346 Block number relative to the beginning of the file

	1347 .IP 5. 4

	1348 Compressed start offset relative to the beginning of the file

	1349 .IP 6. 4

	1350 Uncompressed start offset relative to the beginning of the file

	1351 .IP 7. 4

	1352 Total compressed size of the block (includes headers)

	1353 .IP 8. 4

	1354 Uncompressed size

	1355 .IP 9. 4

	1356 Compression ratio

	1357 .IP 10. 4

	1358 Name of the integrity check

	1359 .RE

	1360 .PP

	1361 If

	1362 .B \-\-verbose

	1363 was specified twice, additional columns are included on the

	1364 .B block

	1365 lines. These are not displayed with a single

	1366 .BR \-\-verbose ,

	1367 because getting this information requires many seeks and can thus be slow:

	1368 .RS

	1369 .IP 11. 4

	1370 Value of the integrity check in hexadecimal

	1371 .IP 12. 4

	1372 Block header size

	1373 .IP 13. 4

	1374 Block flags:

	1375 .B c

	1376 indicates that compressed size is present, and

	1377 .B u

	1378 indicates that uncompressed size is present.

	1379 If the flag is not set, a dash

	1380 .RB ( \- )

	1381 is shown instead to keep the string length fixed. New flags may be added

	1382 to the end of the string in the future.

	1383 .IP 14. 4

	1384 Size of the actual compressed data in the block (this excludes

	1385 the block header, block padding, and check fields)

	1386 .IP 15. 4

	1387 Amount of memory (as bytes) required to decompress this block with this

	1388 .B xz

	1389 version

	1390 .IP 16. 4

	1391 Filter chain. Note that most of the options used at compression time cannot

	1392 be known, because only the options that are needed for decompression are

	1393 stored in the

	1394 .B .xz

	1395 headers.

	1396 .RE

	1397 .PP

	1398 The columns of the

	1399 .B totals

	1400 line:

	1401 .RS

	1402 .IP 2. 4

	1403 Number of streams

	1404 .IP 3. 4

	1405 Number of blocks

	1406 .IP 4. 4

	1407 Compressed size

	1408 .IP 5. 4

	1409 Uncompressed size

	1410 .IP 6. 4

	1411 Average compression ratio

	1412 .IP 7. 4

	1413 Comma-separated list of integrity check names that were present in the files

	1414 .IP 8. 4

	1415 Stream padding size

	1416 .IP 9. 4

	1417 Number of files. This is here to keep the order of the earlier columns

	1418 the same as on

	1419 .B file

	1420 lines.

	1421 .RE

	1422 .PP

	1423 If

	1424 .B \-\-verbose

	1425 was specified twice, additional columns are included on the

	1426 .B totals

	1427 line:

	1428 .RS

	1429 .IP 10. 4

	1430 Maximum amount of memory (as bytes) required to decompress the files

	1431 with this

	1432 .B xz

	1433 version

	1434 .IP 11. 4

	1435 .B yes

	1436 or

	1437 .B no

	1438 indicating if all block headers have both compressed size and

	1439 uncompressed size stored in them

	1440 .RE

	1441 .PP

	1442 Future versions may add new line types and new columns can be added to

	1443 the existing line types, but the existing columns won't be changed.
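
The column layout above lends itself to a tolerant parser. The following Python sketch handles only the name, file, and totals lines. It assumes the column meanings listed above, ignores unknown line types and extra trailing columns so that it keeps working if future versions add more, and uses a made-up sample instead of real xz output. The names parse_robot_list, FILE_COLUMNS, and TOTALS_COLUMNS are hypothetical.

```python
FILE_COLUMNS = ("streams", "blocks", "compressed_size", "uncompressed_size",
                "ratio", "checks", "stream_padding")
TOTALS_COLUMNS = FILE_COLUMNS + ("files",)
INT_COLUMNS = {"streams", "blocks", "compressed_size", "uncompressed_size",
               "stream_padding", "files"}


def _convert(names, fields):
    """Pair column names with raw fields; zip() drops any extra columns."""
    record = {}
    for name, raw in zip(names, fields):
        if name in INT_COLUMNS:
            record[name] = int(raw)
        elif name == "ratio":
            # Three dashes are shown when the ratio is over 9.999.
            record[name] = None if raw == "---" else float(raw)
        elif name == "checks":
            record[name] = raw.split(",")
    return record


def parse_robot_list(text):
    """Return (files, totals) parsed from tab-separated list output."""
    files, totals, current = [], None, None
    for line in text.splitlines():
        kind, _, rest = line.partition("\t")
        if kind == "name":
            current = {"name": rest}
            files.append(current)
        elif kind == "file" and current is not None:
            current.update(_convert(FILE_COLUMNS, rest.split("\t")))
        elif kind == "totals":
            totals = _convert(TOTALS_COLUMNS, rest.split("\t"))
        # stream, block, summary, and unknown line types are skipped here.
    return files, totals


if __name__ == "__main__":
    # Made-up sample in the documented layout, not captured from xz.
    sample = ("name\tfoo.xz\n"
              "file\t1\t1\t120\t1000\t0.120\tCRC64\t0\n"
              "totals\t1\t1\t120\t1000\t0.120\tCRC64\t0\t1\n")
    files, totals = parse_robot_list(sample)
    print(files[0]["name"], files[0]["compressed_size"])
    print(totals["uncompressed_size"] - totals["compressed_size"])
```

Reading columns by position and discarding anything past the known ones follows from the statement that existing columns won't change while new ones may be appended.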

	1444 .SH "EXIT STATUS"

	1445 .TP

	1446 .B 0

	1447 All is good.

	1448 .TP

	1449 .B 1

	1450 An error occurred.

	1451 .TP

	1452 .B 2

	1453 Something worth a warning occurred, but no actual errors occurred.

	1454 .PP

	1455 Notices (not warnings or errors) printed on standard error don't affect

	1456 the exit status.

	1457 .SH ENVIRONMENT

	1458 .TP

	1459 .B XZ_OPT

	1460 A space-separated list of options is parsed from

	1461 .B XZ_OPT

	1462 before parsing the options given on the command line. Note that only

	1463 options are parsed from

	1464 .BR XZ_OPT ;

	1465 all non-options are silently ignored. Parsing is done with

	1466 .BR getopt_long (3)

	1467 which is used also for the command line arguments.

	1468 .SH "LZMA UTILS COMPATIBILITY"

	1469 The command line syntax of

	1470 .B xz

	1471 is practically a superset of

	1472 .BR lzma ,

	1473 .BR unlzma ,

	1474 and

	1475 .BR lzcat

	1476 as found in LZMA Utils 4.32.x. In most cases, it is possible to replace

	1477 LZMA Utils with XZ Utils without breaking existing scripts. There are some

	1478 incompatibilities though, which may sometimes cause problems.

	1479 .SS "Compression preset levels"

	1480 The numbering of the compression level presets is not identical in

	1481 .B xz

	1482 and LZMA Utils.

	1483 The most important difference is how dictionary sizes are mapped to different

	1484 presets. Dictionary size is roughly equal to the decompressor memory usage.

	1485 .RS

	1486 .TS

	1487 tab(;);

	1488 c c c

	1489 c n n.

	1490 Level;xz;LZMA Utils

	1491 \-1;64 KiB;64 KiB

	1492 \-2;512 KiB;1 MiB

	1493 \-3;1 MiB;512 KiB

	1494 \-4;2 MiB;1 MiB

	1495 \-5;4 MiB;2 MiB

	1496 \-6;8 MiB;4 MiB

	1497 \-7;16 MiB;8 MiB

	1498 \-8;32 MiB;16 MiB

	1499 \-9;64 MiB;32 MiB

	1500 .TE

	1501 .RE

	1502 .PP

	1503 The dictionary size differences affect the compressor memory usage too,

	1504 but there are some other differences between LZMA Utils and XZ Utils, which

	1505 make the difference even bigger:

	1506 .RS

	1507 .TS

	1508 tab(;);

	1509 c c c

	1510 c n n.

	1511 Level;xz;LZMA Utils 4.32.x

	1512 \-1;2 MiB;2 MiB

	1513 \-2;5 MiB;12 MiB

	1514 \-3;13 MiB;12 MiB

	1515 \-4;25 MiB;16 MiB

	1516 \-5;48 MiB;26 MiB

	1517 \-6;94 MiB;45 MiB

	1518 \-7;186 MiB;83 MiB

	1519 \-8;370 MiB;159 MiB

	1520 \-9;674 MiB;311 MiB

	1521 .TE

	1522 .RE

	1523 .PP

	1524 The default preset level in LZMA Utils is

	1525 .B \-7

	1526 while in XZ Utils it is

	1527 .BR \-6 ,

	1528 so both use an 8 MiB dictionary by default.

	1529 .SS "Streamed vs. non-streamed .lzma files"

	1530 Uncompressed size of the file can be stored in the

	1531 .B .lzma

	1532 header. LZMA Utils does that when compressing regular files.

	1533 The alternative is to mark that uncompressed size is unknown and

	1534 use end of payload marker to indicate where the decompressor should stop.

	1535 LZMA Utils uses this method when uncompressed size isn't known, which is

	1536 the case for example in pipes.

	1537 .PP

	1538 .B xz

	1539 supports decompressing

	1540 .B .lzma

	1541 files with or without end of payload marker, but all

	1542 .B .lzma

	1543 files created by

	1544 .B xz

	1545 will use end of payload marker and have uncompressed size marked as unknown

	1546 in the

	1547 .B .lzma

	1548 header. This may be a problem in some (uncommon) situations. For example, a

	1549 .B .lzma

	1550 decompressor in an embedded device might work only with files that have known

	1551 uncompressed size. If you hit this problem, you need to use LZMA Utils or

	1552 LZMA SDK to create

	1553 .B .lzma

	1554 files with known uncompressed size.

	1555 .SS "Unsupported .lzma files"

	1556 The

	1557 .B .lzma

	1558 format allows

	1559 .I lc

	1560 values up to 8, and

	1561 .I lp

	1562 values up to 4. LZMA Utils can decompress files with any

	1563 .I lc

	1564 and

	1565 .IR lp ,

	1566 but always creates files with

	1567 .B lc=3

	1568 and

	1569 .BR lp=0 .

	1570 Creating files with other

	1571 .I lc

	1572 and

	1573 .I lp

	1574 is possible with

	1575 .B xz

	1576 and with LZMA SDK.

	1577 .PP

	1578 The implementation of the LZMA1 filter in liblzma requires

	1579 that the sum of

	1580 .I lc

	1581 and

	1582 .I lp

	1583 must not exceed 4. Thus,

	1584 .B .lzma

	1585 files that exceed this limitation cannot be decompressed with

	1586 .BR xz .

	1587 .PP

	1588 LZMA Utils creates only

	1589 .B .lzma

	1590 files which have dictionary size of

	1591 .RI "2^" n

	1592 (a power of 2), but accepts files with any dictionary size.

	1593 liblzma accepts only

	1594 .B .lzma

	1595 files which have dictionary size of

	1596 .RI "2^" n

	1597 or

	1598 .RI "2^" n " + 2^(" n "\-1)."

	1599 This is to decrease false positives when detecting

	1600 .B .lzma

	1601 files.

	1602 .PP

	1603 These limitations shouldn't be a problem in practice, since practically all

	1604 .B .lzma

	1605 files have been compressed with settings that liblzma will accept.
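
The two restrictions described in this section can be written down as simple predicates. The Python sketch below encodes only the rules as stated in the text (lc up to 8, lp up to 4, lc + lp at most 4, and a dictionary size of 2^n or 2^n + 2^(n-1)). It is not the check that liblzma performs internally, and the function names are hypothetical.

```python
def lc_lp_accepted(lc: int, lp: int) -> bool:
    """True if lc/lp are valid for the .lzma format (lc <= 8, lp <= 4)
    and also satisfy the stated liblzma restriction lc + lp <= 4."""
    return 0 <= lc <= 8 and 0 <= lp <= 4 and lc + lp <= 4


def _is_power_of_two(value: int) -> bool:
    return value > 0 and value & (value - 1) == 0


def dict_size_accepted(size: int) -> bool:
    """True if size is 2^n or 2^n + 2^(n-1).
    Since 2^n + 2^(n-1) equals 3 * 2^(n-1), the second form is a
    multiple of three whose quotient is a power of two."""
    if _is_power_of_two(size):
        return True
    quotient, remainder = divmod(size, 3)
    return remainder == 0 and _is_power_of_two(quotient)


if __name__ == "__main__":
    print(lc_lp_accepted(3, 0))                       # LZMA Utils defaults
    print(lc_lp_accepted(8, 4))                       # allowed by the format only
    print(dict_size_accepted(1 << 23))                # 8 MiB
    print(dict_size_accepted((1 << 23) + (1 << 22)))  # 12 MiB
    print(dict_size_accepted(1000000))
```

A tool that produces .lzma files with LZMA SDK could run such checks before writing, if the files are meant to be readable by xz.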

	1606 .SS "Trailing garbage"

	1607 When decompressing, LZMA Utils silently ignore everything after the first

	1608 .B .lzma

	1609 stream. In most situations, this is a bug. This also means that LZMA Utils

	1610 don't support decompressing concatenated

	1611 .B .lzma

	1612 files.

	1613 .PP

	1614 If there is data left after the first

	1615 .B .lzma

	1616 stream,

	1617 .B xz

	1618 considers the file to be corrupt. This may break obscure scripts which have

	1619 assumed that trailing garbage is ignored.

	1620 .SH NOTES

	1621 .SS Compressed output may vary

	1622 The exact compressed output produced from the same uncompressed input file

	1623 may vary between XZ Utils versions even if compression options are identical.

	1624 This is because the encoder can be improved (faster or better compression)

	1625 without affecting the file format. The output can vary even between different

	1626 builds of the same XZ Utils version, if different build options are used.

	1627 .PP

	1628 The above means that implementing

	1629 .B \-\-rsyncable

	1630 to create rsyncable

	1631 .B .xz

	1632 files is not going to happen without freezing a part of the encoder

	1633 implementation, which can then be used with

	1634 .BR \-\-rsyncable .

	1635 .SS Embedded .xz decompressors

	1636 Embedded

	1637 .B .xz

	1638 decompressor implementations like XZ Embedded don't necessarily support files

	1639 created with

	1640 .I check

	1641 types other than

	1642 .B none

	1643 and

	1644 .BR crc32 .

	1645 Since the default is \fB\-\-check=\fIcrc64\fR, you must use

	1646 .B \-\-check=none

	1647 or

	1648 .B \-\-check=crc32

	1649 when creating files for embedded systems.

	1650 .PP

	1651 Outside embedded systems, all

	1652 .B .xz

	1653 format decompressors support all the

	1654 .I check

	1655 types, or at least are able to decompress the file without verifying the

	1656 integrity check if the particular

	1657 .I check

	1658 is not supported.

	1659 .PP

	1660 XZ Embedded supports BCJ filters, but only with the default start offset.

	1661 .SH EXAMPLES

	1662 .SS Basics

	1663 A mix of compressed and uncompressed files can be decompressed

	1664 to standard output with a single command:

	1665 .IP

	1666 .B "xz \-dcf a.txt b.txt.xz c.txt d.txt.xz > abcd.txt"

	1667 .SS Parallel compression of many files

	1668 On GNU and *BSD,

	1669 .BR find (1)

	1670 and

	1671 .BR xargs (1)

	1672 can be used to parallelize compression of many files:

	1673 .PP

	1674 .IP

	1675 .B "find . \-type f \e! \-name '*.xz' \-print0 \| xargs \-0r \-P4 \-n16 xz"

	1676 .PP

	1677 The

	1678 .B \-P

	1679 option sets the number of parallel

	1680 .B xz

	1681 processes. The best value for the

	1682 .B \-n

	1683 option depends on how many files there are to be compressed.

	1684 If there are only a couple of files, the value should probably be

	1685 .BR 1 ;

	1686 with tens of thousands of files,

	1687 .B 100

	1688 or even more may be appropriate to reduce the number of

	1689 .B xz

	1690 processes that

	1691 .BR xargs (1)

	1692 will eventually create.

	1693 .SS Robot mode examples

	1694 Calculating how many bytes have been saved in total after compressing

	1695 multiple files:

	1696 .IP

	1697 .B "xz \-\-robot \-\-list *.xz \| awk '/^totals/{print $5\-$4}'"

	1698 .SH "SEE ALSO"

	1699 .BR xzdec (1),

	1700 .BR gzip (1),

	1701 .BR bzip2 (1)

	1702 .PP

	1703 XZ Utils: <http://tukaani.org/xz/>

	1704 .br

	1705 XZ Embedded: <http://tukaani.org/xz/embedded.html>

	1706 .br

	1707 LZMA SDK: <http://7-zip.org/sdk.html>
