OLD | NEW |
1 | 1 |
2 XZ Utils FAQ | 2 XZ Utils FAQ |
3 ============ | 3 ============ |
4 | 4 |
5 Q: What do the letters XZ mean? | 5 Q: What do the letters XZ mean? |
6 | 6 |
7 A: Nothing. They are just two letters, which come from the file format | 7 A: Nothing. They are just two letters, which come from the file format |
8 suffix .xz. The .xz suffix was selected, because it seemed to be | 8 suffix .xz. The .xz suffix was selected, because it seemed to be |
9 pretty much unused. It is no deeper meaning. | 9 pretty much unused. It has no deeper meaning. |
10 | 10 |
11 | 11 |
12 Q: What are LZMA and LZMA2? | 12 Q: What are LZMA and LZMA2? |
13 | 13 |
14 A: LZMA stands for Lempel-Ziv-Markov chain-Algorithm. It is the name | 14 A: LZMA stands for Lempel-Ziv-Markov chain-Algorithm. It is the name |
15 of the compression algorithm designed by Igor Pavlov for 7-Zip. | 15 of the compression algorithm designed by Igor Pavlov for 7-Zip. |
16 LZMA is based on LZ77 and range encoding. | 16 LZMA is based on LZ77 and range encoding. |
17 | 17 |
18 LZMA2 is an updated version of the original LZMA to fix a couple of | 18 LZMA2 is an updated version of the original LZMA to fix a couple of |
19 practical issues. In context of XZ Utils, LZMA is called LZMA1 to | 19 practical issues. In context of XZ Utils, LZMA is called LZMA1 to |
20 emphasize that LZMA is not the same thing as LZMA2. LZMA2 is the | 20 emphasize that LZMA is not the same thing as LZMA2. LZMA2 is the |
21 primary compression algorithm in the .xz file format. | 21 primary compression algorithm in the .xz file format. |
22 | 22 |
23 | 23 |
24 Q: There are many LZMA related projects. How does XZ Utils relate to them? | 24 Q: There are many LZMA related projects. How does XZ Utils relate to them? |
25 | 25 |
26 A: 7-Zip and LZMA SDK are the original projects. LZMA SDK is roughly | 26 A: 7-Zip and LZMA SDK are the original projects. LZMA SDK is roughly |
27 a subset of the 7-Zip source tree. | 27 a subset of the 7-Zip source tree. |
28 | 28 |
29 p7zip is 7-Zip's command line tools ported to POSIX-like systems. | 29 p7zip is 7-Zip's command line tools ported to POSIX-like systems. |
30 | 30 |
31 LZMA Utils provide a gzip-like lzma tool for POSIX-like systems. | 31 LZMA Utils provide a gzip-like lzma tool for POSIX-like systems. |
32 LZMA Utils are based on LZMA SDK. XZ Utils are the successor to | 32 LZMA Utils are based on LZMA SDK. XZ Utils are the successor to |
33 LZMA Utils. | 33 LZMA Utils. |
34 | 34 |
35 There are several other projects using LZMA. Most are more or less | 35 There are several other projects using LZMA. Most are more or less |
36 based on LZMA SDK. | 36 based on LZMA SDK. See <http://7-zip.org/links.html>. |
| 37 |
| 38 |
| 39 Q: Why is liblzma named liblzma if its primary file format is .xz? |
| 40 Shouldn't it be e.g. libxz? |
| 41 |
| 42 A: When the designing of the .xz format began, the idea was to replace |
| 43 the .lzma format and use the same .lzma suffix. It would have been |
| 44 quite OK to reuse the suffix when there were very few .lzma files |
| 45 around. However, the old .lzma format became popular before the |
| 46 new format was finished. The new format was renamed to .xz but the |
| 47 name of liblzma wasn't changed. |
37 | 48 |
38 | 49 |
39 Q: Do XZ Utils support the .7z format? | 50 Q: Do XZ Utils support the .7z format? |
40 | 51 |
41 A: No. Use 7-Zip (Windows) or p7zip (POSIX-like systems) to handle .7z | 52 A: No. Use 7-Zip (Windows) or p7zip (POSIX-like systems) to handle .7z |
42 files. | 53 files. |
43 | 54 |
44 | 55 |
45 Q: I have many .tar.7z files. Can I convert them to .tar.xz without | 56 Q: I have many .tar.7z files. Can I convert them to .tar.xz without |
46 spending hours recompressing the data? | 57 spending hours recompressing the data? |
(...skipping 42 matching lines...)
89 | 100 |
90 Q: Where can I find documentation about the file format and algorithms? | 101 Q: Where can I find documentation about the file format and algorithms? |
91 | 102 |
92 A: The .xz format is documented in xz-file-format.txt. It is a container | 103 A: The .xz format is documented in xz-file-format.txt. It is a container |
93 format only, and doesn't include descriptions of any non-trivial | 104 format only, and doesn't include descriptions of any non-trivial |
94 filters. | 105 filters. |
95 | 106 |
96 Documenting LZMA and LZMA2 is planned, but for now, there is no other | 107 Documenting LZMA and LZMA2 is planned, but for now, there is no other |
97 documentation than the source code. Before you begin, you should know | 108 documentation than the source code. Before you begin, you should know |
98 the basics of LZ77 and range coding algorithms. LZMA is based on LZ77, | 109 the basics of LZ77 and range coding algorithms. LZMA is based on LZ77, |
99 but LZMA is *a lot* more complex. Range coding is used to compress | 110 but LZMA is a lot more complex. Range coding is used to compress |
100 the final bitstream like Huffman coding is used in Deflate. | 111 the final bitstream like Huffman coding is used in Deflate. |
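
For readers who want a feel for the LZ77 half before opening the source, the idea can be sketched as a toy in Python. This is not LZMA and not how liblzma finds matches: the token layout, the function names, and the window and min_match parameters are all assumptions made for illustration, and the range coding stage is left out entirely.

```python
def lz77_encode(data: bytes, window: int = 4096, min_match: int = 3):
    """Toy LZ77: return tokens ('lit', byte) or ('match', distance, length)."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Brute-force search of the sliding window (real encoders use
        # hash chains or binary trees instead).
        for j in range(max(0, i - window), i):
            length = 0
            # A match may run past position i and overlap the data it
            # is describing.
            while i + length < len(data) and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= min_match:
            tokens.append(("match", best_dist, best_len))
            i += best_len
        else:
            tokens.append(("lit", data[i]))
            i += 1
    return tokens


def lz77_decode(tokens) -> bytes:
    """Rebuild the original bytes from the toy token list."""
    out = bytearray()
    for tok in tokens:
        if tok[0] == "lit":
            out.append(tok[1])
        else:
            _, dist, length = tok
            # Copy byte by byte so that overlapping matches work.
            for _ in range(length):
                out.append(out[-dist])
    return bytes(out)
```

LZMA replaces the plain token list with a much richer set of symbols and feeds them to a range coder, which is where most of the extra complexity mentioned above comes from.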
101 | 112 |
102 | 113 |
103 Q: I cannot find BCJ and BCJ2 filters. Don't they exist in liblzma? | 114 Q: I cannot find BCJ and BCJ2 filters. Don't they exist in liblzma? |
104 | 115 |
105 A: BCJ filter is called "x86" in liblzma. BCJ2 is not included, | 116 A: BCJ filter is called "x86" in liblzma. BCJ2 is not included, |
106 because it requires using more than one encoded output stream. | 117 because it requires using more than one encoded output stream. |
| 118 A streamable version of BCJ2-style filtering is planned. |
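
As a rough illustration of the naming, Python's standard lzma module (a binding to liblzma) exposes the same filter as FILTER_X86. The sketch below puts it in front of LZMA2 in a custom filter chain. The function name and the choice of preset 6 are assumptions for the example, not something this FAQ prescribes.

```python
import lzma


def compress_with_x86_bcj(data: bytes) -> bytes:
    """Compress to .xz using the x86 BCJ filter followed by LZMA2."""
    filters = [
        {"id": lzma.FILTER_X86},                 # what 7-Zip calls BCJ
        {"id": lzma.FILTER_LZMA2, "preset": 6},  # must be last in the chain
    ]
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)
```

The filter chain is stored in the .xz file, so a plain lzma.decompress() call (or xz -d) can read the result without being told which filters were used.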
| 119 |
| 120 |
| 121 Q: I need to use a script that runs "xz -9". On a system with 256 MiB |
| 122 of RAM, xz says that it cannot allocate memory. Can I make the |
| 123 script work without modifying it? |
| 124 |
| 125 A: Set a default memory usage limit for compression. You can do it e.g. |
| 126 in a shell initialization script such as ~/.bashrc or /etc/profile: |
| 127 |
| 128 XZ_DEFAULTS=--memlimit-compress=150MiB |
| 129 export XZ_DEFAULTS |
| 130 |
| 131 xz will then scale the compression settings down so that the given |
| 132 memory usage limit is not reached. This way xz shouldn't run out |
| 133 of memory. |
| 134 |
| 135 Check also that memory-related resource limits are high enough. |
| 136 On most systems, "ulimit -a" will show the current resource limits. |
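
If the script is started from another program instead of a login shell, the same variable can be set for the child process only. A minimal sketch in Python follows. The helper names and the script path argument are hypothetical, and it assumes that xz splits XZ_DEFAULTS on whitespace, so an existing value is kept and the limit is appended.

```python
import os
import subprocess


def env_with_xz_memlimit(limit="150MiB", base_env=None):
    """Return a copy of the environment with a compression memory limit."""
    env = dict(os.environ if base_env is None else base_env)
    option = f"--memlimit-compress={limit}"
    existing = env.get("XZ_DEFAULTS", "").strip()
    # Keep any options that are already configured; ours goes last.
    env["XZ_DEFAULTS"] = f"{existing} {option}".strip()
    return env


def run_unmodified_script(script_path, limit="150MiB"):
    """Run the script as-is; every xz it starts inherits XZ_DEFAULTS."""
    return subprocess.run([script_path], env=env_with_xz_memlimit(limit), check=True)
```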
| 137 |
| 138 |
| 139 Q: How do I create files that can be decompressed with XZ Embedded? |
| 140 |
| 141 A: See the documentation in XZ Embedded. In short, something like |
| 142 this is a good start: |
| 143 |
| 144 xz --check=crc32 --lzma2=preset=6e,dict=64KiB |
| 145 |
| 146 Or if a BCJ filter is needed too, e.g. if compressing |
| 147 a kernel image for PowerPC: |
| 148 |
| 149 xz --check=crc32 --powerpc --lzma2=preset=6e,dict=64KiB |
| 150 |
| 151 Adjust dictionary size to get a good compromise between |
| 152 compression ratio and decompressor memory usage. Note that |
| 153 in single-call decompression mode of XZ Embedded, a big |
| 154 dictionary doesn't increase memory usage. |
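
When the file is produced from a program instead of the xz command line, roughly the same settings can be expressed through a liblzma binding. The sketch below uses Python's standard lzma module. The function name and its parameters are assumptions for illustration, and the filter chain is only meant to mirror the command lines above; the XZ Embedded documentation remains the authority on what the decoder accepts.

```python
import lzma


def compress_for_xz_embedded(data: bytes, dict_size: int = 64 * 1024,
                             bcj_filter=None) -> bytes:
    """Compress with CRC32 and a small LZMA2 dictionary (preset 6e)."""
    filters = []
    if bcj_filter is not None:
        # For example lzma.FILTER_POWERPC, like --powerpc on the command line.
        filters.append({"id": bcj_filter})
    filters.append({
        "id": lzma.FILTER_LZMA2,
        "preset": 6 | lzma.PRESET_EXTREME,  # "6e"
        "dict_size": dict_size,             # dict=64KiB by default
    })
    return lzma.compress(data, format=lzma.FORMAT_XZ,
                         check=lzma.CHECK_CRC32, filters=filters)
```

Calling compress_for_xz_embedded(image, bcj_filter=lzma.FILTER_POWERPC) corresponds to the second command line; raising dict_size trades decompressor memory for compression ratio, as described above.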
| 155 |
| 156 |
| 157 Q: Will xz support threaded compression? |
| 158 |
| 159 A: It is planned and has been taken into account when designing |
| 160 the .xz file format. Eventually there will probably be three types |
| 161 of threading, each method having its own advantages and disadvantages. |
| 162 |
| 163 The simplest method is splitting the uncompressed data into blocks |
| 164 and compressing them in parallel independent from each other. |
| 165 Since the blocks are compressed independently, they can also be |
| 166 decompressed independently. Together with the index feature in .xz, |
| 167 this allows using threads to create .xz files for random-access |
| 168 reading. This also makes threaded decompression possible, although |
| 169 it is not clear if threaded decompression will ever be implemented. |
| 170 |
| 171 The independent blocks method has a couple of disadvantages too. It |
| 172 will compress worse than a single-block method. Often the difference |
| 173 is not too big (maybe 1-2 %) but sometimes it can be too big. Also, |
| 174 the memory usage of the compressor increases linearly when adding |
| 175 threads. |
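
The independent blocks idea can be approximated today with any liblzma binding. The sketch below uses Python's standard lzma module, with one important difference that should be read as an assumption of the example: each chunk becomes its own complete .xz stream and the streams are concatenated, instead of being blocks inside a single stream with a shared index. The function name, chunk size and worker count are hypothetical.

```python
import lzma
from concurrent.futures import ThreadPoolExecutor


def compress_independent_chunks(data: bytes, chunk_size: int = 1 << 20,
                                workers: int = 4, preset: int = 6) -> bytes:
    """Compress fixed-size chunks independently and concatenate the results."""
    chunks = [data[i:i + chunk_size]
              for i in range(0, len(data), chunk_size)] or [b""]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so the output order is stable.
        # Every worker holds its own encoder state, which is why memory
        # usage grows with the number of threads.
        parts = list(pool.map(lambda c: lzma.compress(c, preset=preset), chunks))
    return b"".join(parts)
```

The output decompresses with a plain lzma.decompress() call, which accepts concatenated streams. Because no chunk can refer back to data in an earlier chunk, the result is usually somewhat larger than a single-stream file, which is the ratio penalty mentioned above.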
| 176 |
| 177 Match finder parallelization is another threading method. It has |
| 178 been in 7-Zip for ages. It doesn't affect compression ratio or |
| 179 memory usage significantly. Among the three threading methods, only |
| 180 this is useful when compressing small files (files that are not |
| 181 significantly bigger than the dictionary). Unfortunately this method |
| 182 scales only to about two CPU cores. |
| 183 |
| 184 The third method is pigz-style threading (I use that name, because |
| 185 pigz <http://www.zlib.net/pigz/> uses that method). It doesn't |
| 186 affect compression ratio significantly and scales to many cores. |
| 187 The memory usage scales linearly when threads are added. It isn't |
| 188 significant with pigz, because Deflate uses only 32 KiB dictionary, |
| 189 but with LZMA2 the memory usage will increase dramatically just like |
| 190 with the independent blocks method. There is also a constant |
| 191 computational overhead, which may make pigz-method a bit dull on |
| 192 dual-core compared to the parallel match finder method, but with more |
| 193 cores the overhead is not a big deal anymore. |
| 194 |
| 195 Combining the threading methods will be possible and also useful. |
| 196 E.g. combining match finder parallelization with pigz-style threading |
| 197 can cut the memory usage by 50 %. |
| 198 |
| 199 It is possible that the single-threaded method will be modified to |
| 200 create files identical to the pigz-style method. We'll see once |
| 201 pigz-style threading has been implemented in liblzma. |
107 | 202 |
108 | 203 |
109 Q: How do I build a program that needs liblzmadec (lzmadec.h)? | 204 Q: How do I build a program that needs liblzmadec (lzmadec.h)? |
110 | 205 |
111 A: liblzmadec is part of LZMA Utils. XZ Utils has liblzma, but no | 206 A: liblzmadec is part of LZMA Utils. XZ Utils has liblzma, but no |
112 liblzmadec. The code using liblzmadec should be ported to use | 207 liblzmadec. The code using liblzmadec should be ported to use |
113 liblzma instead. If you cannot or don't want to do that, download | 208 liblzma instead. If you cannot or don't want to do that, download |
114 LZMA Utils from <http://tukaani.org/lzma/>. | 209 LZMA Utils from <http://tukaani.org/lzma/>. |
115 | 210 |
116 | 211 |
117 Q: The default build of liblzma is too big. How can I make it smaller? | 212 Q: The default build of liblzma is too big. How can I make it smaller? |
118 | 213 |
119 A: Give --enable-small to the configure script. Use also appropriate | 214 A: Give --enable-small to the configure script. Use also appropriate |
120 --enable or --disable options to include only those filter encoders | 215 --enable or --disable options to include only those filter encoders |
121 and decoders and integrity checks that you actually need. Use | 216 and decoders and integrity checks that you actually need. Use |
122 CFLAGS=-Os (with GCC) or equivalent to tell your compiler to optimize | 217 CFLAGS=-Os (with GCC) or equivalent to tell your compiler to optimize |
123 for size. See INSTALL for information about configure options. | 218 for size. See INSTALL for information about configure options. |
124 | 219 |
125 If the result is still too big, take a look at XZ Embedded. It is | 220 If the result is still too big, take a look at XZ Embedded. It is |
126 a separate project, which provides a limited but significantly | 221 a separate project, which provides a limited but significantly |
127 smaller XZ decoder implementation than XZ Utils. | 222 smaller XZ decoder implementation than XZ Utils. You can find it |
| 223 at <http://tukaani.org/xz/embedded.html>. |
128 | 224 |