xz/src/xz/xz.1 - Issue 2869016: Add an unpatched version of xz, XZ Utils, to /trunk/deps/third_party

Unified Diff: xz/src/xz/xz.1

Issue 2869016: Add an unpatched version of xz, XZ Utils, to /trunk/deps/third_party (Closed) Base URL: svn://svn.chromium.org/chrome/trunk/deps/third_party/

Patch Set: Created 10 years, 6 months ago

Index: xz/src/xz/xz.1

===================================================================

--- xz/src/xz/xz.1 (revision 0)

+++ xz/src/xz/xz.1 (revision 0)

@@ -0,0 +1,1707 @@

+'\" t

+.\"

+.\" Author: Lasse Collin

+.\"

+.\" This file has been put into the public domain.

+.\" You can do whatever you want with this file.

+.\"

+.TH XZ 1 "2010-06-15" "Tukaani" "XZ Utils"

+.SH NAME

+xz, unxz, xzcat, lzma, unlzma, lzcat \- Compress or decompress .xz and .lzma files

+.SH SYNOPSIS

+.B xz

+.RI [ option ]...

+.RI [ file ]...

+.PP

+.B unxz

+is equivalent to

+.BR "xz \-\-decompress" .

+.br

+.B xzcat

+is equivalent to

+.BR "xz \-\-decompress \-\-stdout" .

+.br

+.B lzma

+is equivalent to

+.BR "xz \-\-format=lzma" .

+.br

+.B unlzma

+is equivalent to

+.BR "xz \-\-format=lzma \-\-decompress" .

+.br

+.B lzcat

+is equivalent to

+.BR "xz \-\-format=lzma \-\-decompress \-\-stdout" .

+.PP

+When writing scripts that need to decompress files, it is recommended to

+always use the name

+.B xz

+with appropriate arguments

+.RB ( "xz \-d"

+or

+.BR "xz \-dc" )

+instead of the names

+.B unxz

+and

+.BR xzcat .

+.SH DESCRIPTION

+.B xz

+is a general-purpose data compression tool with command line syntax similar to

+.BR gzip (1)

+and

+.BR bzip2 (1).

+The native file format is the

+.B .xz

+format, but also the legacy

+.B .lzma

+format and raw compressed streams with no container format headers

+are supported.

+.PP

+.B xz

+compresses or decompresses each

+.I file

+according to the selected operation mode.

+If no

+.I files

+are given or

+.I file

+is

+.BR \- ,

+.B xz

+reads from standard input and writes the processed data to standard output.

+.B xz

+will refuse (display an error and skip the

+.IR file )

+to write compressed data to standard output if it is a terminal. Similarly,

+.B xz

+will refuse to read compressed data from standard input if it is a terminal.

+.PP

+Unless

+.B \-\-stdout

+is specified,

+.I files

+other than

+.B \-

+are written to a new file whose name is derived from the source

+.I file

+name:

+.IP \(bu 3

+When compressing, the suffix of the target file format

+.RB ( .xz

+or

+.BR .lzma )

+is appended to the source filename to get the target filename.

+.IP \(bu 3

+When decompressing, the

+.B .xz

+or

+.B .lzma

+suffix is removed from the filename to get the target filename.

+.B xz

+also recognizes the suffixes

+.B .txz

+and

+.BR .tlz ,

+and replaces them with the

+.B .tar

+suffix.
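The naming rules in the two bullets above can be sketched in Python. This is only an illustration of the text, not the code xz uses; the function name `target_name` and its parameters are hypothetical.

```python
def target_name(source, decompress=False, fmt="xz"):
    """Derive the target filename from the source filename.

    Returns None when decompressing a name without a recognized suffix.
    Hypothetical helper: it mirrors the man page text, not xz internals.
    """
    if not decompress:
        # Compressing: append the suffix of the target format (.xz or .lzma).
        return source + "." + fmt
    # Decompressing: .xz/.lzma are removed, .txz/.tlz become .tar.
    for suffix, replacement in ((".xz", ""), (".lzma", ""),
                                (".txz", ".tar"), (".tlz", ".tar")):
        if source.endswith(suffix):
            return source[:-len(suffix)] + replacement
    return None
```

For example, `target_name("backup.txz", decompress=True)` gives `backup.tar` under these rules.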

+.PP

+If the target file already exists, an error is displayed and the

+.I file

+is skipped.

+.PP

+Unless writing to standard output,

+.B xz

+will display a warning and skip the

+.I file

+if any of the following applies:

+.IP \(bu 3

+.I File

+is not a regular file. Symbolic links are not followed, thus they

+are not considered to be regular files.

+.IP \(bu 3

+.I File

+has more than one hard link.

+.IP \(bu 3

+.I File

+has setuid, setgid, or sticky bit set.

+.IP \(bu 3

+The operation mode is set to compress, and the

+.I file

+already has a suffix of the target file format

+.RB ( .xz

+or

+.B .txz

+when compressing to the

+.B .xz

+format, and

+.B .lzma

+or

+.B .tlz

+when compressing to the

+.B .lzma

+format).

+.IP \(bu 3

+The operation mode is set to decompress, and the

+.I file

+doesn't have a suffix of any of the supported file formats

+.RB ( .xz ,

+.BR .txz ,

+.BR .lzma ,

+or

+.BR .tlz ).

+.PP

+After successfully compressing or decompressing the

+.IR file ,

+.B xz

+copies the owner, group, permissions, access time, and modification time

+from the source

+.I file

+to the target file. If copying the group fails, the permissions are modified

+so that the target file doesn't become accessible to users who didn't have

+permission to access the source

+.IR file .

+.B xz

+doesn't support copying other metadata like access control lists

+or extended attributes yet.

+.PP

+Once the target file has been successfully closed, the source

+.I file

+is removed unless

+.B \-\-keep

+was specified. The source

+.I file

+is never removed if the output is written to standard output.

+.PP

+Sending

+.B SIGINFO

+or

+.B SIGUSR1

+to the

+.B xz

+process makes it print progress information to standard error.

+This has only limited use since when standard error is a terminal, using

+.B \-\-verbose

+will display an automatically updating progress indicator.

+.SS "Memory usage"

+The memory usage of

+.B xz

+varies from a few hundred kilobytes to several gigabytes depending on

+the compression settings. The settings used when compressing a file

+also affect the memory usage of the decompressor. Typically the decompressor

+needs only 5\ % to 20\ % of the amount of RAM that the compressor needed when

+creating the file. Still, the worst-case memory usage of the decompressor

+is several gigabytes.

+.PP

+To prevent uncomfortable surprises caused by huge memory usage,

+.B xz

+has a built-in memory usage limiter. While some operating systems provide

+ways to limit the memory usage of processes, relying on them wasn't deemed

+to be flexible enough. The default limit depends on the total amount of

+physical RAM:

+.IP \(bu 3

+If 40\ % of RAM is at least 80 MiB, 40\ % of RAM is used as the limit.

+.IP \(bu 3

+If 80\ % of RAM is less than 80 MiB, 80\ % of RAM is used as the limit.

+.IP \(bu 3

+Otherwise 80 MiB is used as the limit.
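The three rules for the default limit can be written as a small Python function. This is a sketch of the rules as stated in the text; the name `default_memory_limit` and the use of integer rounding are assumptions, and the real implementation may round differently.

```python
MIB = 1024 * 1024


def default_memory_limit(total_ram):
    """Default memory usage limit in bytes for total_ram bytes of physical RAM.

    Hypothetical helper following the three rules in the man page text.
    """
    forty_percent = total_ram * 40 // 100
    eighty_percent = total_ram * 80 // 100
    if forty_percent >= 80 * MIB:
        return forty_percent      # plenty of RAM: use 40 %
    if eighty_percent < 80 * MIB:
        return eighty_percent     # very little RAM: use 80 %
    return 80 * MIB               # in between: fixed 80 MiB
```

With these rules, a machine with 150 MiB of RAM gets the fixed 80 MiB limit, because 40 % is below 80 MiB while 80 % is above it.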

+.PP

+When compressing, if the selected compression settings exceed the memory

+usage limit, the settings are automatically adjusted downwards and a notice

+about this is displayed. As an exception, if the memory usage limit is

+exceeded when compressing with

+.B \-\-format=raw

+or

+.BR \-\-no\-adjust ,

+an error is displayed and

+.B xz

+will exit with exit status

+.BR 1 .

+.PP

+If source

+.I file

+cannot be decompressed without exceeding the memory usage limit, an error

+message is displayed and the file is skipped. Note that compressed files

+may contain many blocks, which may have been compressed with different

+settings. Typically all blocks will have roughly the same memory requirements,

+but it is possible that a block later in the file will exceed the memory usage

+limit, and an error about too low memory usage limit gets displayed after some

+data has already been decompressed.

+.PP

+The absolute value of the active memory usage limit can be seen with

+.B \-\-info\-memory

+or near the bottom of the output of

+.BR \-\-long\-help .

+The default limit can be overridden with

+\fB\-\-memory=\fIlimit\fR.

+.SS Concatenation and padding with .xz files

+It is possible to concatenate

+.B .xz

+files as is.

+.B xz

+will decompress such files as if they were a single

+.B .xz

+file.

+.PP

+It is possible to insert padding between the concatenated parts

+or after the last part. The padding must be null bytes and the size

+of the padding must be a multiple of four bytes. This can be useful

+if the .xz file is stored on a medium that stores file sizes

+e.g. as 512-byte blocks.

+.PP

+Concatenation and padding are not allowed with

+.B .lzma

+files or raw streams.

+.SH OPTIONS

+.SS "Integer suffixes and special values"

+In most places where an integer argument is expected, an optional suffix

+is supported to easily indicate large integers. There must be no space

+between the integer and the suffix.

+.TP

+.B KiB

+The integer is multiplied by 1,024 (2^10). Also

+.BR Ki ,

+.BR k ,

+.BR kB ,

+.BR K ,

+and

+.B KB

+are accepted as synonyms for

+.BR KiB .

+.TP

+.B MiB

+The integer is multiplied by 1,048,576 (2^20). Also

+.BR Mi ,

+.BR m ,

+.BR M ,

+and

+.B MB

+are accepted as synonyms for

+.BR MiB .

+.TP

+.B GiB

+The integer is multiplied by 1,073,741,824 (2^30). Also

+.BR Gi ,

+.BR g ,

+.BR G ,

+and

+.B GB

+are accepted as synonyms for

+.BR GiB .

+.PP

+A special value

+.B max

+can be used to indicate the maximum integer value supported by the option.
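The suffix handling described in this subsection can be illustrated with a short Python parser. It is a sketch under stated assumptions: `parse_size` and its `max_value` parameter are hypothetical names, and the error handling is a guess rather than the behavior of xz.

```python
import string

# Suffix synonyms exactly as listed in the man page text.
_MULTIPLIERS = {}
for _names, _power in ((("KiB", "Ki", "k", "kB", "K", "KB"), 10),
                       (("MiB", "Mi", "m", "M", "MB"), 20),
                       (("GiB", "Gi", "g", "G", "GB"), 30)):
    for _name in _names:
        _MULTIPLIERS[_name] = 2 ** _power


def parse_size(text, max_value):
    """Parse an integer with an optional suffix, or the special value 'max'.

    max_value stands in for the largest value the option supports
    (an assumption made for this sketch).
    """
    if text == "max":
        return max_value
    digits = text.rstrip(string.ascii_letters)
    suffix = text[len(digits):]
    # No space is allowed between the integer and the suffix.
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("not an integer: %r" % text)
    if suffix and suffix not in _MULTIPLIERS:
        raise ValueError("unknown suffix: %r" % suffix)
    return int(digits) * _MULTIPLIERS.get(suffix, 1)
```

Here `parse_size("80MiB", limit)` yields 83,886,080, while `"80 MiB"` is rejected because of the space.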

+.SS "Operation mode"

+If multiple operation mode options are given, the last one takes effect.

+.TP

+.BR \-z ", " \-\-compress

+Compress. This is the default operation mode when no operation mode option

+is specified, and no other operation mode is implied from the command name

+(for example,

+.B unxz

+implies

+.BR \-\-decompress ).

+.TP

+.BR \-d ", " \-\-decompress ", " \-\-uncompress

+Decompress.

+.TP

+.BR \-t ", " \-\-test

+Test the integrity of compressed

+.IR files .

+No files are created or removed. This option is equivalent to

+.B "\-\-decompress \-\-stdout"

+except that the decompressed data is discarded instead of being

+written to standard output.

+.TP

+.BR \-l ", " \-\-list

+List information about compressed

+.IR files .

+No uncompressed output is produced, and no files are created or removed.

+In list mode, the program cannot read the compressed data from standard

+input or from other unseekable sources.

+.IP

+The default listing shows basic information about

+.IR files ,

+one file per line. To get more detailed information, use also the

+.B \-\-verbose

+option. For even more information, use

+.B \-\-verbose

+twice, but note that it may be slow, because getting all the extra

+information requires many seeks. The width of verbose output exceeds

+80 characters, so piping the output to e.g.

+.B "less\ \-S"

+may be convenient if the terminal isn't wide enough.

+.IP

+The exact output may vary between

+.B xz

+versions and different locales. To get machine-readable output,

+.B \-\-robot \-\-list

+should be used.

+.SS "Operation modifiers"

+.TP

+.BR \-k ", " \-\-keep

+Keep (don't delete) the input files.

+.TP

+.BR \-f ", " \-\-force

+This option has several effects:

+.RS

+.IP \(bu 3

+If the target file already exists, delete it before compressing or

+decompressing.

+.IP \(bu 3

+Compress or decompress even if the input is a symbolic link to a regular file,

+has more than one hard link, or has setuid, setgid, or sticky bit set.

+The setuid, setgid, and sticky bits are not copied to the target file.

+.IP \(bu 3

+If combined with

+.B \-\-decompress

+.BR \-\-stdout

+and

+.B xz

+doesn't recognize the type of the source file,

+.B xz

+will copy the source file as is to standard output. This allows using

+.B xzcat

+.B \-\-force

+like

+.BR cat (1)

+for files that have not been compressed with

+.BR xz .

+Note that in the future,

+.B xz

+might support new compressed file formats, which may make

+.B xz

+decompress more types of files instead of copying them as is to

+standard output.

+.BI \-\-format= format

+can be used to restrict

+.B xz

+to decompress only a single file format.

+.RE

+.TP

+.BR \-c ", " \-\-stdout ", " \-\-to\-stdout

+Write the compressed or decompressed data to standard output instead of

+a file. This implies

+.BR \-\-keep .

+.TP

+.B \-\-no\-sparse

+Disable creation of sparse files. By default, if decompressing into

+a regular file,

+.B xz

+tries to make the file sparse if the decompressed data contains long

+sequences of binary zeros. It works also when writing to standard output

+as long as standard output is connected to a regular file, and certain

+additional conditions are met to make it safe. Creating sparse files may

+save disk space and speed up the decompression by reducing the amount of

+disk I/O.

+.TP

+\fB\-S\fR \fI.suf\fR, \fB\-\-suffix=\fI.suf

+When compressing, use

+.I .suf

+as the suffix for the target file instead of

+.B .xz

+or

+.BR .lzma .

+If not writing to standard output and the source file already has the suffix

+.IR .suf ,

+a warning is displayed and the file is skipped.

+.IP

+When decompressing, recognize also files with the suffix

+.I .suf

+in addition to files with the

+.BR .xz ,

+.BR .txz ,

+.BR .lzma ,

+or

+.B .tlz

+suffix. If the source file has the suffix

+.IR .suf ,

+the suffix is removed to get the target filename.

+.IP

+When compressing or decompressing raw streams

+.RB ( \-\-format=raw ),

+the suffix must always be specified unless writing to standard output,

+because there is no default suffix for raw streams.

+.TP

+\fB\-\-files\fR[\fB=\fIfile\fR]

+Read the filenames to process from

+.IR file ;

+if

+.I file

+is omitted, filenames are read from standard input. Filenames must be

+terminated with the newline character. A dash

+.RB ( \- )

+is taken as a regular filename; it doesn't mean standard input.

+If filenames are given also as command line arguments, they are

+processed before the filenames read from

+.IR file .

+.TP

+\fB\-\-files0\fR[\fB=\fIfile\fR]

+This is identical to \fB\-\-files\fR[\fB=\fIfile\fR] except that the

+filenames must be terminated with the null character.

+.SS "Basic file format and compression options"

+.TP

+\fB\-F\fR \fIformat\fR, \fB\-\-format=\fIformat

+Specify the file format to compress or decompress:

+.RS

+.IP \(bu 3

+.BR auto :

+This is the default. When compressing,

+.B auto

+is equivalent to

+.BR xz .

+When decompressing, the format of the input file is automatically detected.

+Note that raw streams (created with

+.BR \-\-format=raw )

+cannot be auto-detected.

+.IP \(bu 3

+.BR xz :

+Compress to the

+.B .xz

+file format, or accept only

+.B .xz

+files when decompressing.

+.IP \(bu 3

+.B lzma

+or

+.BR alone :

+Compress to the legacy

+.B .lzma

+file format, or accept only

+.B .lzma

+files when decompressing. The alternative name

+.B alone

+is provided for backwards compatibility with LZMA Utils.

+.IP \(bu 3

+.BR raw :

+Compress or uncompress a raw stream (no headers). This is meant for advanced

+users only. To decode raw streams, you need to set not only

+.B \-\-format=raw

+but also specify the filter chain, which would normally be stored in the

+container format headers.

+.RE

+.TP

+\fB\-C\fR \fIcheck\fR, \fB\-\-check=\fIcheck

+Specify the type of the integrity check, which is calculated from the

+uncompressed data. This option has an effect only when compressing into the

+.B .xz

+format; the

+.B .lzma

+format doesn't support integrity checks.

+The integrity check (if any) is verified when the

+.B .xz

+file is decompressed.

+.IP

+Supported

+.I check

+types:

+.RS

+.IP \(bu 3

+.BR none :

+Don't calculate an integrity check at all. This is usually a bad idea. This

+can be useful when integrity of the data is verified by other means anyway.

+.IP \(bu 3

+.BR crc32 :

+Calculate CRC32 using the polynomial from IEEE-802.3 (Ethernet).

+.IP \(bu 3

+.BR crc64 :

+Calculate CRC64 using the polynomial from ECMA-182. This is the default, since

+it is slightly better than CRC32 at detecting damaged files and the speed

+difference is negligible.

+.IP \(bu 3

+.BR sha256 :

+Calculate SHA-256. This is somewhat slower than CRC32 and CRC64.

+.RE

+.IP

+Integrity of the

+.B .xz

+headers is always verified with CRC32. It is not possible to change or

+disable it.

+.TP

+.BR \-0 " ... " \-9

+Select compression preset. If a preset level is specified multiple times,

+the last one takes effect.

+.IP

+The compression preset levels can be categorised roughly into three

+categories:

+.RS

+.IP "\fB\-0\fR ... \fB\-2"

+Fast presets with relatively low memory usage.

+.B \-1

+and

+.B \-2

+should give compression speed and ratios comparable to

+.B "bzip2 \-1"

+and

+.BR "bzip2 \-9" ,

+respectively.

+Currently

+.B \-0

+is not very good (not much faster than

+.B \-1

+but much worse compression). In the future,

+.B \-0

+may indicate some fast algorithm instead of LZMA2.

+.IP "\fB\-3\fR ... \fB\-5"

+Good compression ratio with low to medium memory usage.

+These are significantly slower than levels 0\-2.

+.IP "\fB\-6\fR ... \fB\-9"

+Excellent compression with medium to high memory usage. These are also

+slower than the lower preset levels. The default is

+.BR \-6 .

+Unless you want to maximize the compression ratio, you probably don't want

+a higher preset level than

+.B \-7

+due to speed and memory usage.

+.RE

+.IP

+The exact compression settings (filter chain) used by each preset may

+vary between

+.B xz

+versions. The settings may also vary between files being compressed, if

+.B xz

+determines that modified settings will probably give better compression

+ratio without significantly affecting compression time or memory usage.

+.IP

+Because the settings may vary, the memory usage may vary too. The following

+table lists the maximum memory usage of each preset level, which won't be

+exceeded even in future versions of

+.BR xz .

+.IP

+.B "FIXME: The table below is just a rough idea."

+.RS

+.TS

+tab(;);

+c c c

+n n n.

+Preset;Compression;Decompression

+\-0;6 MiB;1 MiB

+\-1;6 MiB;1 MiB

+\-2;10 MiB;1 MiB

+\-3;20 MiB;2 MiB

+\-4;30 MiB;3 MiB

+\-5;60 MiB;6 MiB

+\-6;100 MiB;10 MiB

+\-7;200 MiB;20 MiB

+\-8;400 MiB;40 MiB

+\-9;800 MiB;80 MiB

+.TE

+.RE

+.IP

+When compressing,

+.B xz

+automatically adjusts the compression settings downwards if

+the memory usage limit would be exceeded, so it is safe to specify

+a high preset level even on systems that don't have lots of RAM.

+.TP

+.BR \-\-fast " and " \-\-best

+These are somewhat misleading aliases for

+.B \-0

+and

+.BR \-9 ,

+respectively.

+These are provided only for backwards compatibility with LZMA Utils.

+Avoid using these options.

+.IP

+Especially the name of

+.B \-\-best

+is misleading, because the definition of best depends on the input data,

+and because people usually don't want the very best compression ratio anyway,

+since it would be very slow.

+.TP

+.BR \-e ", " \-\-extreme

+Modify the compression preset (\fB\-0\fR ... \fB\-9\fR) so that a little bit

+better compression ratio can be achieved without increasing memory usage

+of the compressor or decompressor (exception: compressor memory usage may

+increase a little with presets \fB\-0\fR ... \fB\-2\fR). The downside is that

+the compression time will increase dramatically (it can easily double).

+.TP

+.B \-\-no\-adjust

+Display an error and exit if the compression settings exceed

+the memory usage limit. The default is to adjust the settings downwards so

+that the memory usage limit is not exceeded. Automatic adjusting is

+always disabled when creating raw streams

+.RB ( \-\-format=raw ).

+.TP

+\fB\-M\fR \fIlimit\fR, \fB\-\-memory=\fIlimit

+Set the memory usage limit. If this option is specified multiple times,

+the last one takes effect. The

+.I limit

+can be specified in multiple ways:

+.RS

+.IP \(bu 3

+The

+.I limit

+can be an absolute value in bytes. Using an integer suffix like

+.B MiB

+can be useful. Example:

+.B "\-\-memory=80MiB"

+.IP \(bu 3

+The

+.I limit

+can be specified as a percentage of physical RAM. Example:

+.B "\-\-memory=70%"

+.IP \(bu 3

+The

+.I limit

+can be reset back to its default value by setting it to

+.BR 0 .

+See the section

+.B "Memory usage"

+for how the default limit is defined.

+.IP \(bu 3

+The memory usage limiting can be effectively disabled by setting

+.I limit

+to

+.BR max .

+This isn't recommended. It's usually better to use, for example,

+.BR \-\-memory=90% .

+.RE

+.IP

+The current

+.I limit

+can be seen near the bottom of the output of the

+.B \-\-long\-help

+option.

+.TP

+\fB\-T\fR \fIthreads\fR, \fB\-\-threads=\fIthreads

+Specify the maximum number of worker threads to use. The default is

+the number of available CPU cores. You can see the current value of

+.I threads

+near the end of the output of the

+.B \-\-long\-help

+option.

+.IP

+The actual number of worker threads can be less than

+.I threads

+if using more threads would exceed the memory usage limit.

+In addition to CPU-intensive worker threads,

+.B xz

+may use a few auxiliary threads, which don't use a lot of CPU time.

+.IP

+.B "Multithreaded compression and decompression are not implemented yet,"

+.B "so this option has no effect for now."

+.SS Custom compressor filter chains

+A custom filter chain allows specifying the compression settings in detail

+instead of relying on the settings associated with the preset levels.

+When a custom filter chain is specified, the compression preset level options

+(\fB\-0\fR ... \fB\-9\fR and \fB\-\-extreme\fR) are silently ignored.

+.PP

+A filter chain is comparable to piping on the UN*X command line.

+When compressing, the uncompressed input goes to the first filter, whose

+output goes to the next filter (if any). The output of the last filter

+gets written to the compressed file. The maximum number of filters in

+the chain is four, but typically a filter chain has only one or two filters.

+.PP

+Many filters have limitations on where they can be in the filter chain:

+some filters can work only as the last filter in the chain, some only

+as a non-last filter, and some work in any position in the chain. Depending

+on the filter, this limitation is either inherent to the filter design or

+exists to prevent security issues.

+.PP

+A custom filter chain is specified by using one or more filter options in

+the order they are wanted in the filter chain. That is, the order of filter

+options is significant! When decoding raw streams

+.RB ( \-\-format=raw ),

+the filter chain is specified in the same order as it was specified when

+compressing.

+.PP

+Filters take filter-specific

+.I options

+as a comma-separated list. Extra commas in

+.I options

+are ignored. Every option has a default value, so you need to

+specify only those you want to change.
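The chain constraints can be sketched as a small Python check. It encodes only what this page states: at most four filters, LZMA1 and LZMA2 only as the last filter, and the BCJ and Delta filters only in a non-last position. The function name `check_chain` and the message strings are hypothetical, and xz may enforce further rules not modelled here.

```python
LAST_ONLY = {"lzma1", "lzma2"}
NON_LAST_ONLY = {"x86", "powerpc", "ia64", "arm", "armthumb", "sparc", "delta"}


def check_chain(filters):
    """Return a list of problems; an empty list means these checks passed.

    filters holds filter names in command-line order, for example
    ["x86", "lzma2"]. Hypothetical helper, not part of xz.
    """
    problems = []
    if len(filters) > 4:
        problems.append("more than four filters")
    for position, name in enumerate(filters):
        is_last = position == len(filters) - 1
        if name not in LAST_ONLY and name not in NON_LAST_ONLY:
            problems.append("unknown filter: " + name)
        elif name in LAST_ONLY and not is_last:
            problems.append(name + " must be the last filter")
        elif name in NON_LAST_ONLY and is_last:
            problems.append(name + " cannot be the last filter")
    return problems
```

For instance, `check_chain(["x86", "lzma2"])` reports no problems, while reversing the order reports one problem per filter.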

+.TP

+\fB\-\-lzma1\fR[\fB=\fIoptions\fR], \fB\-\-lzma2\fR[\fB=\fIoptions\fR]

+Add LZMA1 or LZMA2 filter to the filter chain. These filters can be used

+only as the last filter in the chain.

+.IP

+LZMA1 is a legacy filter, which is supported almost solely due to the legacy

+.B .lzma

+file format, which supports only LZMA1. LZMA2 is an updated

+version of LZMA1 to fix some practical issues of LZMA1. The

+.B .xz

+format uses LZMA2, and doesn't support LZMA1 at all. Compression speed and

+ratios of LZMA1 and LZMA2 are practically the same.

+.IP

+LZMA1 and LZMA2 share the same set of

+.IR options :

+.RS

+.TP

+.BI preset= preset

+Reset all LZMA1 or LZMA2

+.I options

+to

+.IR preset .

+.I Preset

+consists of an integer, which may be followed by single-letter preset

+modifiers. The integer can be from

+.B 0

+to

+.BR 9 ,

+matching the command line options \fB\-0\fR ... \fB\-9\fR.

+The only supported modifier is currently

+.BR e ,

+which matches

+.BR \-\-extreme .

+.IP

+The default

+.I preset

+is

+.BR 6 ,

+from which the default values for the rest of the LZMA1 or LZMA2

+.I options

+are taken.

+.TP

+.BI dict= size

+Dictionary (history buffer) size indicates how many bytes of the recently

+processed uncompressed data is kept in memory. One method to reduce size of

+the uncompressed data is to store distance-length pairs, which

+indicate what data to repeat from the dictionary buffer. The bigger

+the dictionary, the better the compression ratio usually is,

+but dictionaries bigger than the uncompressed data are a waste of RAM.

+.IP

+Typical dictionary size is from 64 KiB to 64 MiB. The minimum is 4 KiB.

+The maximum for compression is currently 1.5 GiB. The decompressor already

+supports dictionaries up to one byte less than 4 GiB, which is the

+maximum for LZMA1 and LZMA2 stream formats.

+.IP

+Dictionary size has the biggest effect on compression ratio.

+Dictionary size and match finder together determine the memory usage of

+the LZMA1 or LZMA2 encoder. The same dictionary size is required

+for decompressing that was used when compressing, thus the memory usage of

+the decoder is determined by the dictionary size used when compressing.

+.TP

+.BI lc= lc

+Specify the number of literal context bits. The minimum is

+.B 0

+and the maximum is

+.BR 4 ;

+the default is

+.BR 3 .

+In addition, the sum of

+.I lc

+and

+.I lp

+must not exceed

+.BR 4 .

+.TP

+.BI lp= lp

+Specify the number of literal position bits. The minimum is

+.B 0

+and the maximum is

+.BR 4 ;

+the default is

+.BR 0 .

+.TP

+.BI pb= pb

+Specify the number of position bits. The minimum is

+.B 0

+and the maximum is

+.BR 4 ;

+the default is

+.BR 2 .

+.TP

+.BI mode= mode

+Compression

+.I mode

+specifies the function used to analyze the data produced by the match finder.

+Supported

+.I modes

+are

+.B fast

+and

+.BR normal .

+The default is

+.B fast

+for

+.I presets

+.BR 0 \- 2

+and

+.B normal

+for

+.I presets

+.BR 3 \- 9 .

+.TP

+.BI mf= mf

+Match finder has a major effect on encoder speed, memory usage, and

+compression ratio. Usually Hash Chain match finders are faster than

+Binary Tree match finders. Hash Chains are usually used together with

+.B mode=fast

+and Binary Trees with

+.BR mode=normal .

+The memory usage formulas are only rough estimates,

+which are closest to reality when

+.I dict

+is a power of two.

+.RS

+.TP

+.B hc3

+Hash Chain with 2- and 3-byte hashing

+.br

+Minimum value for

+.IR nice :

+.br

+Memory usage:

+.I dict

+* 7.5 (if

+.I dict

+<= 16 MiB);

+.br

+.I dict

+* 5.5 + 64 MiB (if

+.I dict

+> 16 MiB)

+.TP

+.B hc4

+Hash Chain with 2-, 3-, and 4-byte hashing

+.br

+Minimum value for

+.IR nice :

+.br

+Memory usage:

+.I dict

+* 7.5

+.TP

+.B bt2

+Binary Tree with 2-byte hashing

+.br

+Minimum value for

+.IR nice :

+.br

+Memory usage:

+.I dict

+* 9.5

+.TP

+.B bt3

+Binary Tree with 2- and 3-byte hashing

+.br

+Minimum value for

+.IR nice :

+.br

+Memory usage:

+.I dict

+* 11.5 (if

+.I dict

+<= 16 MiB);

+.br

+.I dict

+* 9.5 + 64 MiB (if

+.I dict

+> 16 MiB)

+.TP

+.B bt4

+Binary Tree with 2-, 3-, and 4-byte hashing

+.br

+Minimum value for

+.IR nice :

+.br

+Memory usage:

+.I dict

+* 11.5

+.RE
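The rough memory usage formulas listed for the match finders can be collected into one Python function. This is only a transcription of the estimates above; the name `encoder_memory_estimate` is hypothetical, and as the text notes, the results are closest to reality when the dictionary size is a power of two.

```python
MIB = 1024 * 1024


def encoder_memory_estimate(mf, dict_size):
    """Rough encoder memory usage in bytes for a match finder and dict size.

    Transcribes the estimates from the man page; hypothetical helper.
    """
    large = dict_size > 16 * MIB
    if mf == "hc3":
        estimate = dict_size * 5.5 + 64 * MIB if large else dict_size * 7.5
    elif mf == "hc4":
        estimate = dict_size * 7.5
    elif mf == "bt2":
        estimate = dict_size * 9.5
    elif mf == "bt3":
        estimate = dict_size * 9.5 + 64 * MIB if large else dict_size * 11.5
    elif mf == "bt4":
        estimate = dict_size * 11.5
    else:
        raise ValueError("unknown match finder: %r" % mf)
    return int(estimate)
```

As an example, `bt4` with an 8 MiB dictionary comes out at about 92 MiB.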

+.TP

+.BI nice= nice

+Specify what is considered to be a nice length for a match. Once a match

+of at least

+.I nice

+bytes is found, the algorithm stops looking for possibly better matches.

+.IP

+.I nice

+can be 2\-273 bytes. Higher values tend to give better compression ratio

+at the expense of speed. The default depends on the

+.I preset

+level.

+.TP

+.BI depth= depth

+Specify the maximum search depth in the match finder. The default is the

+special value

+.BR 0 ,

+which makes the compressor determine a reasonable

+.I depth

+from

+.I mf

+and

+.IR nice .

+.IP

+Using very high values for

+.I depth

+can make the encoder extremely slow with carefully crafted files.

+Avoid setting the

+.I depth

+over 1000 unless you are prepared to interrupt the compression in case it

+is taking too long.

+.RE

+.IP

+When decoding raw streams

+.RB ( \-\-format=raw ),

+LZMA2 needs only the value of

+.BR dict .

+LZMA1 needs also

+.BR lc ,

+.BR lp ,

+and

+.BR pb .

+.TP

+\fB\-\-x86\fR[\fB=\fIoptions\fR]

+.TP

+\fB\-\-powerpc\fR[\fB=\fIoptions\fR]

+.TP

+\fB\-\-ia64\fR[\fB=\fIoptions\fR]

+.TP

+\fB\-\-arm\fR[\fB=\fIoptions\fR]

+.TP

+\fB\-\-armthumb\fR[\fB=\fIoptions\fR]

+.TP

+\fB\-\-sparc\fR[\fB=\fIoptions\fR]

+Add a branch/call/jump (BCJ) filter to the filter chain. These filters

+can be used only as a non-last filter in the filter chain.

+.IP

+A BCJ filter converts relative addresses in the machine code to their

+absolute counterparts. This doesn't change the size of the data, but

+it increases redundancy, which allows e.g. LZMA2 to get better

+compression ratio.

+.IP

+The BCJ filters are always reversible, so using a BCJ filter for the wrong

+type of data doesn't cause any data loss. However, applying a BCJ filter

+to the wrong type of data is a bad idea, because it tends to make the

+compression ratio worse.

+.IP

+Different instruction sets have different alignment:

+.RS

+.TS

+tab(;);

+l n l

+l n l.

+Filter;Alignment;Notes

+x86;1;32-bit and 64-bit x86

+PowerPC;4;Big endian only

+ARM;4;Little endian only

+ARM-Thumb;2;Little endian only

+IA-64;16;Big or little endian

+SPARC;4;Big or little endian

+.TE

+.RE

+.IP

+Since the BCJ-filtered data is usually compressed with LZMA2, the compression

+ratio may be improved slightly if the LZMA2 options are set to match the

+alignment of the selected BCJ filter. For example, with the IA-64 filter,

+it's good to set

+.B pb=4

+with LZMA2 (2^4=16). The x86 filter is an exception; it's usually good to

+stick to LZMA2's default four-byte alignment when compressing x86 executables.

+.IP

+All BCJ filters support the same

+.IR options :

+.RS

+.TP

+.BI start= offset

+Specify the start

+.I offset

+that is used when converting between relative and absolute addresses.

+The

+.I offset

+must be a multiple of the alignment of the filter (see the table above).

+The default is zero. In practice, the default is good; specifying

+a custom

+.I offset

+is almost never useful.

+.IP

+Specifying a non-zero start

+.I offset

+is probably useful only if the executable has multiple sections, and there

+are many cross-section jumps or calls. Applying a BCJ filter separately for

+each section with proper start offset and then compressing the result as

+a single chunk may give some improvement in compression ratio compared

+to applying the BCJ filter with the default

+.I offset

+for the whole executable.

+.RE

+.TP

+\fB\-\-delta\fR[\fB=\fIoptions\fR]

+Add the Delta filter to the filter chain. The Delta filter

+can be used only as a non-last filter in the filter chain.

+.IP

+Currently only simple byte-wise delta calculation is supported. It can

+be useful when compressing e.g. uncompressed bitmap images or uncompressed

+PCM audio. However, special purpose algorithms may give significantly better

+results than Delta + LZMA2. This is true especially with audio, which

+compresses faster and better e.g. with FLAC.

+.IP

+Supported

+.IR options :

+.RS

+.TP

+.BI dist= distance

+Specify the

+.I distance

+of the delta calculation as bytes.

+.I distance

+must be 1\-256. The default is 1.

+.IP

+For example, with

+.B dist=2

+and eight-byte input A1 B1 A2 B3 A3 B5 A4 B7, the output will be

+A1 B1 01 02 01 02 01 02.

+.RE
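The byte-wise delta calculation in the dist=2 example can be reproduced with a few lines of Python. This is a sketch, not the liblzma code: the function names are hypothetical, and it assumes that bytes before the start of the data count as zero, which is consistent with the first two bytes passing through unchanged in the example.

```python
def delta_encode(data, dist=1):
    """Replace each byte with its difference (mod 256) from the byte dist earlier."""
    if not 1 <= dist <= 256:
        raise ValueError("dist must be 1-256")
    out = bytearray(len(data))
    for i, byte in enumerate(data):
        previous = data[i - dist] if i >= dist else 0  # assumed zero history
        out[i] = (byte - previous) & 0xFF
    return bytes(out)


def delta_decode(data, dist=1):
    """Inverse of delta_encode: add back the already-decoded byte dist earlier."""
    if not 1 <= dist <= 256:
        raise ValueError("dist must be 1-256")
    out = bytearray(len(data))
    for i, byte in enumerate(data):
        previous = out[i - dist] if i >= dist else 0
        out[i] = (byte + previous) & 0xFF
    return bytes(out)
```

Encoding the bytes A1 B1 A2 B3 A3 B5 A4 B7 with `dist=2` gives A1 B1 01 02 01 02 01 02, matching the example, and decoding restores the input.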

+.SS "Other options"

+.TP

+.BR \-q ", " \-\-quiet

+Suppress warnings and notices. Specify this twice to suppress errors too.

+This option has no effect on the exit status. That is, even if a warning

+was suppressed, the exit status to indicate a warning is still used.

+.TP

+.BR \-v ", " \-\-verbose

+Be verbose. If standard error is connected to a terminal,

+.B xz

+will display a progress indicator.

+Specifying

+.B \-\-verbose

+twice will give even more verbose output (useful mostly for debugging).

+.IP

+The progress indicator shows the following information:

+.RS

+.IP \(bu 3

+Completion percentage is shown if the size of the input file is known.

+That is, percentage cannot be shown in pipes.

+.IP \(bu 3

+Amount of compressed data produced (compressing) or consumed (decompressing).

+.IP \(bu 3

+Amount of uncompressed data consumed (compressing) or produced

+(decompressing).

+.IP \(bu 3

+Compression ratio, which is calculated by dividing the amount of

+compressed data processed so far by the amount of uncompressed data

+processed so far.

+.IP \(bu 3

+Compression or decompression speed. This is measured as the amount of

+uncompressed data consumed (compression) or produced (decompression)

+per second. It is shown once a few seconds have passed since

+.B xz

+started processing the file.

+.IP \(bu 3

+Elapsed time or estimated time remaining.

+Elapsed time is displayed in the format M:SS or H:MM:SS.

+The estimated remaining time is displayed in a less precise format

+which never has colons, for example, 2 min 30 s. The estimate can

+be shown only when the size of the input file is known and a couple of

+seconds have already passed since

+.B xz

+started processing the file.

+.RE

+.IP

+When standard error is not a terminal,

+.B \-\-verbose

+will make

+.B xz

+print the filename, compressed size, uncompressed size, and compression ratio

+on a single line to standard error after compressing or decompressing the

+file. The speed and elapsed time are included only when the operation took

+at least a few seconds. If the operation

+didn't finish, for example due to user interruption, also the completion

+percentage is printed if the size of the input file is known.
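Two of the quantities described above for the progress indicator can be sketched in Python. The helper names are hypothetical and the code is not taken from xz; it only assumes the definitions given in the text: the ratio is compressed size divided by uncompressed size, and elapsed time is shown as M:SS or H:MM:SS.

```python
def compression_ratio(compressed: int, uncompressed: int) -> float:
    """Compressed bytes processed so far divided by uncompressed bytes so far."""
    if uncompressed <= 0:
        raise ValueError("no uncompressed data processed yet")
    return compressed / uncompressed


def format_elapsed(seconds: int) -> str:
    """Render whole seconds as M:SS, or H:MM:SS once an hour has passed."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


print(f"{compression_ratio(123, 1000):.3f}")  # 0.123
print(format_elapsed(75))                     # 1:15
print(format_elapsed(3725))                   # 1:02:05
```

A smaller ratio therefore means better compression, since the divisor is the uncompressed size.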

+.TP

+.BR \-Q ", " \-\-no\-warn

+Don't set the exit status to

+.B 2

+even if a condition worth a warning was detected. This option doesn't affect

+the verbosity level, thus both

+.B \-\-quiet

+and

+.B \-\-no\-warn

+have to be used to not display warnings and to not alter the exit status.

+.TP

+.B \-\-robot

+Print messages in a machine-parsable format. This is intended to ease

+writing frontends that want to use

+.B xz

+instead of liblzma, which may be the case with various scripts. The output

+with this option enabled is meant to be stable across

+.B xz

+releases. See the section

+.B "ROBOT MODE"

+for details.

+.TP

+.B \-\-info\-memory

+Display the current memory usage limit in human-readable format on

+a single line, and exit successfully. To see how much RAM

+.B xz

+thinks your system has, use

+.BR "\-\-memory=100% \-\-info\-memory" .

+.TP

+.BR \-h ", " \-\-help

+Display a help message describing the most commonly used options,

+and exit successfully.

+.TP

+.BR \-H ", " \-\-long\-help

+Display a help message describing all features of

+.BR xz ,

+and exit successfully.

+.TP

+.BR \-V ", " \-\-version

+Display the version number of

+.B xz

+and liblzma in human-readable format. To get machine-parsable output, specify

+.B \-\-robot

+before

+.BR \-\-version .

+.SH ROBOT MODE

+The robot mode is activated with the

+.B \-\-robot

+option. It makes the output of

+.B xz

+easier to parse by other programs. Currently

+.B \-\-robot

+is supported only together with

+.BR \-\-version ,

+.BR \-\-info\-memory ,

+and

+.BR \-\-list .

+It will be supported for normal compression and decompression in the future.

+.PP

+.SS Version

+.B "xz \-\-robot \-\-version"

+will print the version number of

+.B xz

+and liblzma in the following format:

+.PP

+.BI XZ_VERSION= XYYYZZZS

+.br

+.BI LIBLZMA_VERSION= XYYYZZZS

+.TP

+.I X

+Major version.

+.TP

+.I YYY

+Minor version. Even numbers are stable.

+Odd numbers are alpha or beta versions.

+.TP

+.I ZZZ

+Patch level for stable releases or just a counter for development releases.

+.TP

+.I S

+Stability.

+.B 0

+is alpha,

+.B 1

+is beta, and

+.B 2

+is stable.

+.I S

+should always be

+.B 2

+when

+.I YYY

+is even.

+.PP

+.I XYYYZZZS

+are the same on both lines if

+.B xz

+and liblzma are from the same XZ Utils release.

+.PP

+Examples: 4.999.9beta is

+.B 49990091

+and

+5.0.0 is

+.BR 50000002 .
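The XYYYZZZS layout can be decoded with a short Python sketch. The function name and the returned tuple shape are hypothetical; the field widths and stability codes are the ones listed above, and the sketch assumes a single-digit major version as that layout implies.

```python
STABILITY = {"0": "alpha", "1": "beta", "2": "stable"}


def parse_robot_version(value: str) -> tuple[int, int, int, str]:
    """Split an XYYYZZZS string into (major, minor, patch, stability)."""
    if len(value) != 8 or not value.isdigit():
        raise ValueError(f"expected eight digits, got {value!r}")
    if value[7] not in STABILITY:
        raise ValueError(f"unknown stability digit {value[7]!r}")
    major = int(value[0])      # X
    minor = int(value[1:4])    # YYY
    patch = int(value[4:7])    # ZZZ
    return major, minor, patch, STABILITY[value[7]]


def parse_version_line(line: str) -> tuple[str, tuple[int, int, int, str]]:
    """Parse one line such as 'XZ_VERSION=50000002'."""
    key, _, value = line.strip().partition("=")
    return key, parse_robot_version(value)


print(parse_robot_version("49990091"))            # (4, 999, 9, 'beta')
print(parse_version_line("XZ_VERSION=50000002"))  # ('XZ_VERSION', (5, 0, 0, 'stable'))
```

Comparing the tuples for the XZ_VERSION and LIBLZMA_VERSION lines is one way to confirm that xz and liblzma come from the same release.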

+.SS Memory limit information

+.B "xz \-\-robot \-\-info\-memory"

+prints the current memory usage limit as bytes on a single line.

+To get the total amount of installed RAM, use

+.BR "xz \-\-robot \-\-memory=100% \-\-info\-memory" .

+.SS List mode

+.B "xz \-\-robot \-\-list"

+uses tab-separated output. The first column of every line has a string

+that indicates the type of the information found on that line:

+.TP

+.B name

+This is always the first line when starting to list a file. The second

+column on the line is the filename.

+.TP

+.B file

+This line contains overall information about the

+.B .xz

+file. This line is always printed after the

+.B name

+line.

+.TP

+.B stream

+This line type is used only when

+.B \-\-verbose

+was specified. There are as many

+.B stream

+lines as there are streams in the

+.B .xz

+file.

+.TP

+.B block

+This line type is used only when

+.B \-\-verbose

+was specified. There are as many

+.B block

+lines as there are blocks in the

+.B .xz

+file. The

+.B block

+lines are shown after all the

+.B stream

+lines; different line types are not interleaved.

+.TP

+.B summary

+This line type is used only when

+.B \-\-verbose

+was specified twice. This line is printed after all

+.B block

+lines. Like the

+.B file

+line, the

+.B summary

+line contains overall information about the

+.B .xz

+file.

+.TP

+.B totals

+This line is always the very last line of the list output. It shows

+the total counts and sizes.

+.PP

+The columns of the

+.B file

+lines:

+.RS

+.IP 2. 4

+Number of streams in the file

+.IP 3. 4

+Total number of blocks in the stream(s)

+.IP 4. 4

+Compressed size of the file

+.IP 5. 4

+Uncompressed size of the file

+.IP 6. 4

+Compression ratio, for example

+.BR 0.123 .

+If the ratio is over 9.999, three dashes

+.RB ( \-\-\- )

+are displayed instead of the ratio.

+.IP 7. 4

+Comma-separated list of integrity check names. The following strings are

+used for the known check types:

+.BR None ,

+.BR CRC32 ,

+.BR CRC64 ,

+and

+.BR SHA\-256 .

+For unknown check types,

+.BI Unknown\- N

+is used, where

+.I N

+is the Check ID as a decimal number (one or two digits).

+.IP 8. 4

+Total size of stream padding in the file

+.RE

+.PP

+The columns of the

+.B stream

+lines:

+.RS

+.IP 2. 4

+Stream number (the first stream is 1)

+.IP 3. 4

+Number of blocks in the stream

+.IP 4. 4

+Compressed start offset

+.IP 5. 4

+Uncompressed start offset

+.IP 6. 4

+Compressed size (does not include stream padding)

+.IP 7. 4

+Uncompressed size

+.IP 8. 4

+Compression ratio

+.IP 9. 4

+Name of the integrity check

+.IP 10. 4

+Size of stream padding

+.RE

+.PP

+The columns of the

+.B block

+lines:

+.RS

+.IP 2. 4

+Number of the stream containing this block

+.IP 3. 4

+Block number relative to the beginning of the stream (the first block is 1)

+.IP 4. 4

+Block number relative to the beginning of the file

+.IP 5. 4

+Compressed start offset relative to the beginning of the file

+.IP 6. 4

+Uncompressed start offset relative to the beginning of the file

+.IP 7. 4

+Total compressed size of the block (includes headers)

+.IP 8. 4

+Uncompressed size

+.IP 9. 4

+Compression ratio

+.IP 10. 4

+Name of the integrity check

+.RE

+.PP

+If

+.B \-\-verbose

+was specified twice, additional columns are included on the

+.B block

+lines. These are not displayed with a single

+.BR \-\-verbose ,

+because getting this information requires many seeks and can thus be slow:

+.RS

+.IP 11. 4

+Value of the integrity check in hexadecimal

+.IP 12. 4

+Block header size

+.IP 13. 4

+Block flags:

+.B c

+indicates that compressed size is present, and

+.B u

+indicates that uncompressed size is present.

+If the flag is not set, a dash

+.RB ( \- )

+is shown instead to keep the string length fixed. New flags may be added

+to the end of the string in the future.

+.IP 14. 4

+Size of the actual compressed data in the block (this excludes

+the block header, block padding, and check fields)

+.IP 15. 4

+Amount of memory (as bytes) required to decompress this block with this

+.B xz

+version

+.IP 16. 4

+Filter chain. Note that most of the options used at compression time cannot

+be known, because only the options that are needed for decompression are

+stored in the

+.B .xz

+headers.

+.RE

+.PP

+The columns of the

+.B totals

+line:

+.RS

+.IP 2. 4

+Number of streams

+.IP 3. 4

+Number of blocks

+.IP 4. 4

+Compressed size

+.IP 5. 4

+Uncompressed size

+.IP 6. 4

+Average compression ratio

+.IP 7. 4

+Comma-separated list of integrity check names that were present in the files

+.IP 8. 4

+Stream padding size

+.IP 9. 4

+Number of files. This is here to keep the order of the earlier columns

+the same as on

+.B file

+lines.

+.RE

+.PP

+If

+.B \-\-verbose

+was specified twice, additional columns are included on the

+.B totals

+line:

+.RS

+.IP 10. 4

+Maximum amount of memory (as bytes) required to decompress the files

+with this

+.B xz

+version

+.IP 11. 4

+.B yes

+or

+.B no

+indicating if all block headers have both compressed size and

+uncompressed size stored in them

+.RE

+.PP

+Future versions may add new line types and new columns can be added to

+the existing line types, but the existing columns won't be changed.
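The column layout described above can be consumed with a small Python sketch. The function names are hypothetical, and the sample text is made up for illustration rather than captured from a real xz run; only the tab-separated layout and the column numbers come from the description above. Unknown line types are kept as-is, since new line types and columns may be added in the future.

```python
def parse_robot_list(output: str) -> list[tuple[str, list[str]]]:
    """Split robot-mode list output into (line type, remaining columns)."""
    rows = []
    for line in output.splitlines():
        if not line:
            continue
        kind, *columns = line.split("\t")
        rows.append((kind, columns))
    return rows


def saved_bytes(output: str) -> int:
    """Uncompressed minus compressed size, read from the totals line."""
    for kind, columns in parse_robot_list(output):
        if kind == "totals":
            compressed = int(columns[2])    # column 4 of the line
            uncompressed = int(columns[3])  # column 5 of the line
            return uncompressed - compressed
    raise ValueError("no totals line found")


# Hypothetical output for a single file; values are invented.
SAMPLE = (
    "name\ta.txt.xz\n"
    "file\t1\t1\t1000\t4000\t0.250\tCRC64\t0\n"
    "totals\t1\t1\t1000\t4000\t0.250\tCRC64\t0\t1\n"
)

print(saved_bytes(SAMPLE))  # 3000
```

Indexing by position and ignoring extra trailing columns keeps such a parser compatible with the stated promise that existing columns will not change.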

+.SH "EXIT STATUS"

+.TP

+.B 0

+All is good.

+.TP

+.B 1

+An error occurred.

+.TP

+.B 2

+Something worth a warning occurred, but no actual errors occurred.

+.PP

+Notices (not warnings or errors) printed on standard error don't affect

+the exit status.
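A calling script might interpret these exit statuses along the following lines. This is a hedged sketch: the function names and the tolerate_warnings parameter are hypothetical, and only the meanings of 0, 1, and 2 come from the list above.

```python
EXIT_MEANINGS = {
    0: "ok",       # all is good
    1: "error",    # an error occurred
    2: "warning",  # something worth a warning, but no actual errors
}


def classify_exit_status(status: int) -> str:
    """Map an xz exit status to a short label; anything else is 'unknown'."""
    return EXIT_MEANINGS.get(status, "unknown")


def succeeded(status: int, tolerate_warnings: bool = True) -> bool:
    """Treat status 2 as success when the caller accepts warnings."""
    return status == 0 or (tolerate_warnings and status == 2)


print(classify_exit_status(2))                # warning
print(succeeded(2))                           # True
print(succeeded(2, tolerate_warnings=False))  # False
```

A script that never wants to see status 2 can instead pass the \-\-no\-warn option described earlier, at which point a plain zero test is enough.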

+.SH ENVIRONMENT

+.TP

+.B XZ_OPT

+A space-separated list of options is parsed from

+.B XZ_OPT

+before parsing the options given on the command line. Note that only

+options are parsed from

+.BR XZ_OPT ;

+all non-options are silently ignored. Parsing is done with

+.BR getopt_long (3)

+which is used also for the command line arguments.

+.SH "LZMA UTILS COMPATIBILITY"

+The command line syntax of

+.B xz

+is practically a superset of

+.BR lzma ,

+.BR unlzma ,

+and

+.B lzcat

+as found in LZMA Utils 4.32.x. In most cases, it is possible to replace

+LZMA Utils with XZ Utils without breaking existing scripts. There are some

+incompatibilities though, which may sometimes cause problems.

+.SS "Compression preset levels"

+The numbering of the compression level presets is not identical in

+.B xz

+and LZMA Utils.

+The most important difference is how dictionary sizes are mapped to different

+presets. Dictionary size is roughly equal to the decompressor memory usage.

+.RS

+.TS

+tab(;);

+c c c

+c n n.

+Level;xz;LZMA Utils

+\-1;64 KiB;64 KiB

+\-2;512 KiB;1 MiB

+\-3;1 MiB;512 KiB

+\-4;2 MiB;1 MiB

+\-5;4 MiB;2 MiB

+\-6;8 MiB;4 MiB

+\-7;16 MiB;8 MiB

+\-8;32 MiB;16 MiB

+\-9;64 MiB;32 MiB

+.TE

+.RE

+.PP

+The dictionary size differences affect the compressor memory usage too,

+but there are some other differences between LZMA Utils and XZ Utils, which

+make the difference even bigger:

+.RS

+.TS

+tab(;);

+c c c

+c n n.

+Level;xz;LZMA Utils 4.32.x

+\-1;2 MiB;2 MiB

+\-2;5 MiB;12 MiB

+\-3;13 MiB;12 MiB

+\-4;25 MiB;16 MiB

+\-5;48 MiB;26 MiB

+\-6;94 MiB;45 MiB

+\-7;186 MiB;83 MiB

+\-8;370 MiB;159 MiB

+\-9;674 MiB;311 MiB

+.TE

+.RE

+.PP

+The default preset level in LZMA Utils is

+.B \-7

+while in XZ Utils it is

+.BR \-6 ,

+so both use 8 MiB dictionary by default.
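When porting a script from LZMA Utils, the dictionary-size table above can be turned into a lookup. The following Python sketch copies the values from that table; the helper name is hypothetical, and it considers dictionary size only, not the other encoder differences that affect memory usage.

```python
KIB = 1024
MIB = 1024 * KIB

# Dictionary size per preset level, copied from the first table above.
XZ_DICT = {1: 64 * KIB, 2: 512 * KIB, 3: 1 * MIB, 4: 2 * MIB, 5: 4 * MIB,
           6: 8 * MIB, 7: 16 * MIB, 8: 32 * MIB, 9: 64 * MIB}
LZMA_UTILS_DICT = {1: 64 * KIB, 2: 1 * MIB, 3: 512 * KIB, 4: 1 * MIB,
                   5: 2 * MIB, 6: 4 * MIB, 7: 8 * MIB, 8: 16 * MIB,
                   9: 32 * MIB}


def xz_level_for(lzma_utils_level: int) -> int:
    """Lowest xz preset whose dictionary is at least as large."""
    wanted = LZMA_UTILS_DICT[lzma_utils_level]
    return min(level for level, size in XZ_DICT.items() if size >= wanted)


print(xz_level_for(7))  # 6
print(xz_level_for(2))  # 3
```

The first result matches the statement that the two defaults (\-7 in LZMA Utils, \-6 in XZ Utils) both use an 8 MiB dictionary.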

+.SS "Streamed vs. non-streamed .lzma files"

+Uncompressed size of the file can be stored in the

+.B .lzma

+header. LZMA Utils does that when compressing regular files.

+The alternative is to mark that uncompressed size is unknown and

+use end of payload marker to indicate where the decompressor should stop.

+LZMA Utils uses this method when uncompressed size isn't known, which is

+the case for example in pipes.

+.PP

+.B xz

+supports decompressing

+.B .lzma

+files with or without end of payload marker, but all

+.B .lzma

+files created by

+.B xz

+will use end of payload marker and have uncompressed size marked as unknown

+in the

+.B .lzma

+header. This may be a problem in some (uncommon) situations. For example, a

+.B .lzma

+decompressor in an embedded device might work only with files that have known

+uncompressed size. If you hit this problem, you need to use LZMA Utils or

+LZMA SDK to create

+.B .lzma

+files with known uncompressed size.

+.SS "Unsupported .lzma files"

+The

+.B .lzma

+format allows

+.I lc

+values up to 8, and

+.I lp

+values up to 4. LZMA Utils can decompress files with any

+.I lc

+and

+.IR lp ,

+but always creates files with

+.B lc=3

+and

+.BR lp=0 .

+Creating files with other

+.I lc

+and

+.I lp

+is possible with

+.B xz

+and with LZMA SDK.

+.PP

+The implementation of the LZMA1 filter in liblzma requires

+that the sum of

+.I lc

+and

+.I lp

+must not exceed 4. Thus,

+.B .lzma

+files that exceed this limit cannot be decompressed with

+.BR xz .

+.PP

+LZMA Utils creates only

+.B .lzma

+files which have dictionary size of

+.RI "2^" n

+(a power of 2), but accepts files with any dictionary size.

+liblzma accepts only

+.B .lzma

+files which have dictionary size of

+.RI "2^" n

+or

+.RI "2^" n " + 2^(" n "\-1)."

+This is to decrease false positives when detecting

+.B .lzma

+files.

+.PP

+These limitations shouldn't be a problem in practice, since practically all

+.B .lzma

+files have been compressed with settings that liblzma will accept.
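The limits described in this subsection can be expressed as two predicates. This Python sketch follows the wording above rather than the liblzma source, and the function names are hypothetical. It uses the fact that 2^n + 2^(n-1) equals 3 * 2^(n-1).

```python
def is_power_of_two(value: int) -> bool:
    return value > 0 and (value & (value - 1)) == 0


def dict_size_accepted(size: int) -> bool:
    """True for 2^n, or for 2^n + 2^(n-1), which is 3 * 2^(n-1)."""
    if is_power_of_two(size):
        return True
    return size > 0 and size % 3 == 0 and is_power_of_two(size // 3)


def lc_lp_accepted(lc: int, lp: int) -> bool:
    """The format allows lc <= 8 and lp <= 4; liblzma needs lc + lp <= 4."""
    return 0 <= lc <= 8 and 0 <= lp <= 4 and lc + lp <= 4


print(dict_size_accepted(1 << 23))               # True  (8 MiB)
print(dict_size_accepted((1 << 23) + (1 << 22))) # True  (12 MiB)
print(dict_size_accepted(10_000_000))            # False
print(lc_lp_accepted(3, 0))                      # True  (LZMA Utils default)
print(lc_lp_accepted(4, 1))                      # False
```

Because LZMA Utils always writes lc=3, lp=0 and a power-of-two dictionary, files it created pass both checks.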

+.SS "Trailing garbage"

+When decompressing, LZMA Utils silently ignores everything after the first

+.B .lzma

+stream. In most situations, this is a bug. This also means that LZMA Utils

+doesn't support decompressing concatenated

+.B .lzma

+files.

+.PP

+If there is data left after the first

+.B .lzma

+stream,

+.B xz

+considers the file to be corrupt. This may break obscure scripts which have

+assumed that trailing garbage is ignored.

+.SH NOTES

+.SS Compressed output may vary

+The exact compressed output produced from the same uncompressed input file

+may vary between XZ Utils versions even if compression options are identical.

+This is because the encoder can be improved (faster or better compression)

+without affecting the file format. The output can vary even between different

+builds of the same XZ Utils version, if different build options are used.

+.PP

+The above means that implementing

+.B \-\-rsyncable

+to create rsyncable

+.B .xz

+files is not going to happen without freezing a part of the encoder

+implementation, which can then be used with

+.BR \-\-rsyncable .

+.SS Embedded .xz decompressors

+Embedded

+.B .xz

+decompressor implementations like XZ Embedded don't necessarily support files

+created with

+.I check

+types other than

+.B none

+and

+.BR crc32 .

+Since the default is \fB\-\-check=\fIcrc64\fR, you must use

+.B \-\-check=none

+or

+.B \-\-check=crc32

+when creating files for embedded systems.

+.PP

+Outside embedded systems, all

+.B .xz

+format decompressors support all the

+.I check

+types, or at least are able to decompress the file without verifying the

+integrity check if the particular

+.I check

+is not supported.

+.PP

+XZ Embedded supports BCJ filters, but only with the default start offset.

+.SH EXAMPLES

+.SS Basics

+A mix of compressed and uncompressed files can be decompressed

+to standard output with a single command:

+.IP

+.B "xz \-dcf a.txt b.txt.xz c.txt d.txt.xz > abcd.txt"

+.SS Parallel compression of many files

+On GNU and *BSD,

+.BR find (1)

+and

+.BR xargs (1)

+can be used to parallelize compression of many files:

+.PP

+.IP

+.B "find . \-type f \e! \-name '*.xz' \-print0 | xargs \-0r \-P4 \-n16 xz"

+.PP

+The

+.B \-P

+option sets the number of parallel

+.B xz

+processes. The best value for the

+.B \-n

+option depends on how many files there are to be compressed.

+If there are only a couple of files, the value should probably be

+.BR 1 ;

+with tens of thousands of files,

+.B 100

+or even more may be appropriate to reduce the number of

+.B xz

+processes that

+.BR xargs (1)

+will eventually create.
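The trade-off behind the value of \-n can be estimated with a short Python sketch. The helper names are hypothetical, and the sketch assumes xargs groups files purely by count; a real xargs may also start a new command earlier when the command line would become too long.

```python
import math


def batches(files: list[str], per_call: int) -> list[list[str]]:
    """Group files the way a count-only 'xargs -n per_call' would."""
    if per_call < 1:
        raise ValueError("per_call must be at least 1")
    return [files[i:i + per_call] for i in range(0, len(files), per_call)]


def xz_invocations(num_files: int, per_call: int) -> int:
    """Number of xz processes created in total for num_files files."""
    return math.ceil(num_files / per_call)


files = [f"file{i}.txt" for i in range(40)]
print(len(batches(files, 16)))      # 3
print(xz_invocations(40_000, 100))  # 400
print(xz_invocations(3, 1))         # 3
```

With only a few files, a large batch size would hand them all to a single xz process and leave the other parallel slots idle, which is why a value of 1 is suggested in that case.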

+.SS Robot mode examples

+Calculating how many bytes have been saved in total after compressing

+multiple files:

+.IP

+.B "xz \-\-robot \-\-list *.xz | awk '/^totals/{print $5\-$4}'"

+.SH "SEE ALSO"

+.BR xzdec (1),

+.BR gzip (1),

+.BR bzip2 (1)

+.PP

+XZ Utils: <http://tukaani.org/xz/>

+.br

+XZ Embedded: <http://tukaani.org/xz/embedded.html>

+.br

+LZMA SDK: <http://7-zip.org/sdk.html>

Property changes on: xz/src/xz/xz.1

___________________________________________________________________

Added: svn:eol-style

+ LF
