doc/faq.txt - Issue 7109015: Update XZ Utils to 5.0.3 (in deps)

Side by Side Diff: doc/faq.txt

Issue 7109015: Update XZ Utils to 5.0.3 (in deps) (Closed) Base URL: svn://svn.chromium.org/chrome/trunk/deps/third_party/xz/

Patch Set: Created 9 years, 6 months ago


OLD	NEW
1	1

2 XZ Utils FAQ	2 XZ Utils FAQ

3 ============	3 ============

4	4

5 Q: What do the letters XZ mean?	5 Q: What do the letters XZ mean?

6	6

7 A: Nothing. They are just two letters, which come from the file format	7 A: Nothing. They are just two letters, which come from the file format

8 suffix .xz. The .xz suffix was selected, because it seemed to be	8 suffix .xz. The .xz suffix was selected, because it seemed to be

9 pretty much unused. It is no deeper meaning.	9 pretty much unused. It has no deeper meaning.

10	10

11	11

12 Q: What are LZMA and LZMA2?	12 Q: What are LZMA and LZMA2?

13	13

14 A: LZMA stands for Lempel-Ziv-Markov chain-Algorithm. It is the name	14 A: LZMA stands for Lempel-Ziv-Markov chain-Algorithm. It is the name

15 of the compression algorithm designed by Igor Pavlov for 7-Zip.	15 of the compression algorithm designed by Igor Pavlov for 7-Zip.

16 LZMA is based on LZ77 and range encoding.	16 LZMA is based on LZ77 and range encoding.

17	17

18 LZMA2 is an updated version of the original LZMA to fix a couple of	18 LZMA2 is an updated version of the original LZMA to fix a couple of

19 practical issues. In context of XZ Utils, LZMA is called LZMA1 to	19 practical issues. In context of XZ Utils, LZMA is called LZMA1 to

20 emphasize that LZMA is not the same thing as LZMA2. LZMA2 is the	20 emphasize that LZMA is not the same thing as LZMA2. LZMA2 is the

21 primary compression algorithm in the .xz file format.	21 primary compression algorithm in the .xz file format.

22	22

23	23

24 Q: There are many LZMA related projects. How does XZ Utils relate to them?	24 Q: There are many LZMA related projects. How does XZ Utils relate to them?

25	25

26 A: 7-Zip and LZMA SDK are the original projects. LZMA SDK is roughly	26 A: 7-Zip and LZMA SDK are the original projects. LZMA SDK is roughly

27 a subset of the 7-Zip source tree.	27 a subset of the 7-Zip source tree.

28	28

29 p7zip is 7-Zip's command line tools ported to POSIX-like systems.	29 p7zip is 7-Zip's command line tools ported to POSIX-like systems.

30	30

31 LZMA Utils provide a gzip-like lzma tool for POSIX-like systems.	31 LZMA Utils provide a gzip-like lzma tool for POSIX-like systems.

32 LZMA Utils are based on LZMA SDK. XZ Utils are the successor to	32 LZMA Utils are based on LZMA SDK. XZ Utils are the successor to

33 LZMA Utils.	33 LZMA Utils.

34	34

35 There are several other projects using LZMA. Most are more or less	35 There are several other projects using LZMA. Most are more or less

36 based on LZMA SDK.	36 based on LZMA SDK. See <http://7-zip.org/links.html>.

	37

	38

	39 Q: Why is liblzma named liblzma if its primary file format is .xz?

	40 Shouldn't it be e.g. libxz?

	41

	42 A: When the designing of the .xz format began, the idea was to replace

	43 the .lzma format and use the same .lzma suffix. It would have been

	44 quite OK to reuse the suffix when there were very few .lzma files

	45 around. However, the old .lzma format became popular before the

	46 new format was finished. The new format was renamed to .xz but the

	47 name of liblzma wasn't changed.

37	48

38	49

39 Q: Do XZ Utils support the .7z format?	50 Q: Do XZ Utils support the .7z format?

40	51

41 A: No. Use 7-Zip (Windows) or p7zip (POSIX-like systems) to handle .7z	52 A: No. Use 7-Zip (Windows) or p7zip (POSIX-like systems) to handle .7z

42 files.	53 files.

43	54

44	55

45 Q: I have many .tar.7z files. Can I convert them to .tar.xz without	56 Q: I have many .tar.7z files. Can I convert them to .tar.xz without

46 spending hours recompressing the data?	57 spending hours recompressing the data?

(...skipping 42 matching lines...)
89	100

90 Q: Where can I find documentation about the file format and algorithms?	101 Q: Where can I find documentation about the file format and algorithms?

91	102

92 A: The .xz format is documented in xz-file-format.txt. It is a container	103 A: The .xz format is documented in xz-file-format.txt. It is a container

93 format only, and doesn't include descriptions of any non-trivial	104 format only, and doesn't include descriptions of any non-trivial

94 filters.	105 filters.

95	106

96 Documenting LZMA and LZMA2 is planned, but for now, there is no other	107 Documenting LZMA and LZMA2 is planned, but for now, there is no other

97 documentation than the source code. Before you begin, you should know	108 documentation than the source code. Before you begin, you should know

98 the basics of LZ77 and range coding algorithms. LZMA is based on LZ77,	109 the basics of LZ77 and range coding algorithms. LZMA is based on LZ77,

99 but LZMA is a lot more complex. Range coding is used to compress	110 but LZMA is a lot more complex. Range coding is used to compress

100 the final bitstream like Huffman coding is used in Deflate.	111 the final bitstream like Huffman coding is used in Deflate.

101	112

102	113

103 Q: I cannot find BCJ and BCJ2 filters. Don't they exist in liblzma?	114 Q: I cannot find BCJ and BCJ2 filters. Don't they exist in liblzma?

104	115

105 A: BCJ filter is called "x86" in liblzma. BCJ2 is not included,	116 A: BCJ filter is called "x86" in liblzma. BCJ2 is not included,

106 because it requires using more than one encoded output stream.	117 because it requires using more than one encoded output stream.

	118 A streamable version of BCJ2-style filtering is planned.

	119

	120

	121 Q: I need to use a script that runs "xz -9". On a system with 256 MiB

	122 of RAM, xz says that it cannot allocate memory. Can I make the

	123 script work without modifying it?

	124

	125 A: Set a default memory usage limit for compression. You can do it e.g.

	126 in a shell initialization script such as ~/.bashrc or /etc/profile:

	127

	128 XZ_DEFAULTS=--memlimit-compress=150MiB

	129 export XZ_DEFAULTS

	130

	131 xz will then scale the compression settings down so that the given

	132 memory usage limit is not reached. This way xz shouldn't run out

	133 of memory.

	134

	135 Check also that memory-related resource limits are high enough.

	136 On most systems, "ulimit -a" will show the current resource limits.

	137

	138

	139 Q: How do I create files that can be decompressed with XZ Embedded?

	140

	141 A: See the documentation in XZ Embedded. In short, something like

	142 this is a good start:

	143

	144 xz --check=crc32 --lzma2=preset=6e,dict=64KiB

	145

	146 Or if a BCJ filter is needed too, e.g. if compressing

	147 a kernel image for PowerPC:

	148

	149 xz --check=crc32 --powerpc --lzma2=preset=6e,dict=64KiB

	150

	151 Adjust dictionary size to get a good compromise between

	152 compression ratio and decompressor memory usage. Note that

	153 in single-call decompression mode of XZ Embedded, a big

	154 dictionary doesn't increase memory usage.
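
For readers who produce such files from a program rather than from the command line, the settings above can be approximated with Python's standard lzma module. This is only a sketch: the function name compress_for_embedded is hypothetical, and the filter dictionary is assumed to correspond to --check=crc32 --lzma2=preset=6e,dict=64KiB. It has not been verified against an XZ Embedded decoder.

```python
import lzma

# Assumed equivalent of: xz --check=crc32 --lzma2=preset=6e,dict=64KiB
# (preset 6 with the "extreme" flag, dictionary overridden to 64 KiB).
EMBEDDED_FILTERS = [
    {
        "id": lzma.FILTER_LZMA2,
        "preset": 6 | lzma.PRESET_EXTREME,
        "dict_size": 64 * 1024,
    }
]


def compress_for_embedded(data: bytes) -> bytes:
    """Hypothetical helper: build a .xz stream with CRC32 and a small dictionary."""
    return lzma.compress(
        data,
        format=lzma.FORMAT_XZ,
        check=lzma.CHECK_CRC32,
        filters=EMBEDDED_FILTERS,
    )


if __name__ == "__main__":
    payload = b"example kernel image bytes " * 1000
    packed = compress_for_embedded(payload)
    # Round-trip through the regular decoder as a sanity check.
    assert lzma.decompress(packed) == payload
```

As in the command-line case, the dictionary size is the knob to adjust when trading compression ratio against decompressor memory usage.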

	155

	156

	157 Q: Will xz support threaded compression?

	158

	159 A: It is planned and has been taken into account when designing

	160 the .xz file format. Eventually there will probably be three types

	161 of threading, each method having its own advantages and disadvantages.

	162

	163 The simplest method is splitting the uncompressed data into blocks

	164 and compressing them in parallel independent from each other.

	165 Since the blocks are compressed independently, they can also be

	166 decompressed independently. Together with the index feature in .xz,

	167 this allows using threads to create .xz files for random-access

	168 reading. This also makes threaded decompression possible, although

	169 it is not clear if threaded decompression will ever be implemented.

	170

	171 The independent blocks method has a couple of disadvantages too. It

	172 will compress worse than a single-block method. Often the difference

	173 is not too big (maybe 1-2 %) but sometimes it can be too big. Also,

	174 the memory usage of the compressor increases linearly when adding

	175 threads.

	176

	177 Match finder parallelization is another threading method. It has

	178 been in 7-Zip for ages. It doesn't affect compression ratio or

	179 memory usage significantly. Among the three threading methods, only

	180 this is useful when compressing small files (files that are not

	181 significantly bigger than the dictionary). Unfortunately this method

	182 scales only to about two CPU cores.

	183

	184 The third method is pigz-style threading (I use that name, because

	185 pigz <http://www.zlib.net/pigz/> uses that method). It doesn't

	186 affect compression ratio significantly and scales to many cores.

	187 The memory usage scales linearly when threads are added. It isn't

	188 significant with pigz, because Deflate uses only 32 KiB dictionary,

	189 but with LZMA2 the memory usage will increase dramatically just like

	190 with the independent blocks method. There is also a constant

	191 computational overhead, which may make pigz-method a bit dull on

	192 dual-core compared to the parallel match finder method, but with more

	193 cores the overhead is not a big deal anymore.

	194

	195 Combining the threading methods will be possible and also useful.

	196 E.g. combining match finder parallelization with pigz-style threading

	197 can cut the memory usage by 50 %.

	198

	199 It is possible that the single-threaded method will be modified to

	200 create files identical to the pigz-style method. We'll see once

	201 pigz-style threading has been implemented in liblzma.
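
The independent blocks method described in this answer can be imitated outside liblzma. The sketch below uses Python's standard lzma module and a thread pool. It is an illustration, not how xz implements or will implement threading: the function name compress_independent is hypothetical, and each chunk becomes a separate concatenated .xz stream rather than a block inside one stream, so no shared index is produced.

```python
import lzma
from concurrent.futures import ThreadPoolExecutor


def compress_independent(data: bytes, chunk_size: int, workers: int = 4) -> bytes:
    """Hypothetical helper: compress fixed-size chunks in parallel.

    Every chunk is compressed on its own, so no chunk can refer back to
    data in an earlier chunk. That is why this approach tends to
    compress somewhat worse than a single stream.
    """
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    if not chunks:
        chunks = [b""]
    # Each worker holds its own encoder state, so memory usage grows
    # with the number of chunks being compressed at the same time.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lzma.compress, chunks))
    # Concatenated .xz streams are still a valid input for decoders
    # that accept multiple streams.
    return b"".join(parts)


if __name__ == "__main__":
    original = bytes(range(256)) * 4096
    packed = compress_independent(original, chunk_size=1 << 16)
    assert lzma.decompress(packed) == original
```

Because the pieces are independent, a reader could also decompress them separately, which is the property that makes random-access reading and threaded decompression possible.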

107	202

108	203

109 Q: How do I build a program that needs liblzmadec (lzmadec.h)?	204 Q: How do I build a program that needs liblzmadec (lzmadec.h)?

110	205

111 A: liblzmadec is part of LZMA Utils. XZ Utils has liblzma, but no	206 A: liblzmadec is part of LZMA Utils. XZ Utils has liblzma, but no

112 liblzmadec. The code using liblzmadec should be ported to use	207 liblzmadec. The code using liblzmadec should be ported to use

113 liblzma instead. If you cannot or don't want to do that, download	208 liblzma instead. If you cannot or don't want to do that, download

114 LZMA Utils from <http://tukaani.org/lzma/>.	209 LZMA Utils from <http://tukaani.org/lzma/>.

115	210

116	211

117 Q: The default build of liblzma is too big. How can I make it smaller?	212 Q: The default build of liblzma is too big. How can I make it smaller?

118	213

119 A: Give --enable-small to the configure script. Use also appropriate	214 A: Give --enable-small to the configure script. Use also appropriate

120 --enable or --disable options to include only those filter encoders	215 --enable or --disable options to include only those filter encoders

121 and decoders and integrity checks that you actually need. Use	216 and decoders and integrity checks that you actually need. Use

122 CFLAGS=-Os (with GCC) or equivalent to tell your compiler to optimize	217 CFLAGS=-Os (with GCC) or equivalent to tell your compiler to optimize

123 for size. See INSTALL for information about configure options.	218 for size. See INSTALL for information about configure options.

124	219

125 If the result is still too big, take a look at XZ Embedded. It is	220 If the result is still too big, take a look at XZ Embedded. It is

126 a separate project, which provides a limited but significantly	221 a separate project, which provides a limited but significantly

127 smaller XZ decoder implementation than XZ Utils.	222 smaller XZ decoder implementation than XZ Utils. You can find it

	223 at <http://tukaani.org/xz/embedded.html>.

128	224
