Index: xz/doc/lzma-file-format.txt |
=================================================================== |
--- xz/doc/lzma-file-format.txt (revision 0) |
+++ xz/doc/lzma-file-format.txt (revision 0) |
@@ -0,0 +1,166 @@ |
+ |
+The .lzma File Format |
+===================== |
+ |
+ 0. Preface |
+ 0.1. Notices and Acknowledgements |
+ 0.2. Changes |
+ 1. File Format |
+ 1.1. Header |
+ 1.1.1. Properties |
+ 1.1.2. Dictionary Size |
+ 1.1.3. Uncompressed Size |
+ 1.2. LZMA Compressed Data |
+ 2. References |
+ |
+ |
+0. Preface |
+ |
+ This document describes the .lzma file format, which is |
+ sometimes also called LZMA_Alone format. It is a legacy file |
+ format, which is being or has been replaced by the .xz format. |
+ The MIME type of the .lzma format is `application/x-lzma'. |
+ |
+ The most commonly used software to handle .lzma files are |
+ LZMA SDK, LZMA Utils, 7-Zip, and XZ Utils. This document |
+ describes some of the differences between these implementations |
+ and gives hints what subset of the .lzma format is the most |
+ portable. |
+ |
+ |
+0.1. Notices and Acknowledgements |
+ |
+ This file format was designed by Igor Pavlov for use in |
+ LZMA SDK. This document was written by Lasse Collin |
+ <lasse.collin@tukaani.org> using the documentation found |
+ from the LZMA SDK. |
+ |
+ This document has been put into the public domain. |
+ |
+ |
+0.2. Changes |
+ |
+ Last modified: 2009-05-01 11:15+0300 |
+ |
+ |
+1. File Format |
+ |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+==========================+ |
+ | Header | LZMA Compressed Data | |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+==========================+ |
+ |
+ The .lzma format file consist of 13-byte Header followed by |
+ the LZMA Compressed Data. |
+ |
+ Unlike the .gz, .bz2, and .xz formats, it is not possible to |
+ concatenate multiple .lzma files as is and expect the |
+ decompression tool to decode the resulting file as if it were |
+ a single .lzma file. |
+ |
+ For example, the command line tools from LZMA Utils and |
+ LZMA SDK silently ignore all the data after the first .lzma |
+ stream. In contrast, the command line tool from XZ Utils |
+ considers the .lzma file to be corrupt if there is data after |
+ the first .lzma stream. |
+ |
+ |
+1.1. Header |
+ |
+ +------------+----+----+----+----+--+--+--+--+--+--+--+--+ |
+ | Properties | Dictionary Size | Uncompressed Size | |
+ +------------+----+----+----+----+--+--+--+--+--+--+--+--+ |
+ |
+ |
+1.1.1. Properties |
+ |
+ The Properties field contains three properties. An abbreviation |
+ is given in parentheses, followed by the value range of the |
+ property. The field consists of |
+ |
+ 1) the number of literal context bits (lc, [0, 8]); |
+ 2) the number of literal position bits (lp, [0, 4]); and |
+ 3) the number of position bits (pb, [0, 4]). |
+ |
+ The properties are encoded using the following formula: |
+ |
+ Properties = (pb * 5 + lp) * 9 + lc |
+ |
+ The following C code illustrates a straightforward way to |
+ decode the Properties field: |
+ |
+ uint8_t lc, lp, pb; |
+ uint8_t prop = get_lzma_properties(); |
+ if (prop > (4 * 5 + 4) * 9 + 8) |
+ return LZMA_PROPERTIES_ERROR; |
+ |
+ pb = prop / (9 * 5); |
+ prop -= pb * 9 * 5; |
+ lp = prop / 9; |
+ lc = prop - lp * 9; |
+ |
+ XZ Utils has an additional requirement: lc + lp <= 4. Files |
+ which don't follow this requirement cannot be decompressed |
+ with XZ Utils. Usually this isn't a problem since the most |
+ common lc/lp/pb values are 3/0/2. It is the only lc/lp/pb |
+ combination that the files created by LZMA Utils can have, |
+ but LZMA Utils can decompress files with any lc/lp/pb. |
+ |
+ |
+1.1.2. Dictionary Size |
+ |
+ Dictionary Size is stored as an unsigned 32-bit little endian |
+ integer. Any 32-bit value is possible, but for maximum |
+ portability, only sizes of 2^n and 2^n + 2^(n-1) should be |
+ used. |
+ |
+ LZMA Utils creates only files with dictionary size 2^n, |
+ 16 <= n <= 25. LZMA Utils can decompress files with any |
+ dictionary size. |
+ |
+ XZ Utils creates and decompresses .lzma files only with |
+ dictionary sizes 2^n and 2^n + 2^(n-1). If some other |
+ dictionary size is specified when compressing, the value |
+ stored in the Dictionary Size field is a rounded up, but the |
+ specified value is still used in the actual compression code. |
+ |
+ |
+1.1.3. Uncompressed Size |
+ |
+ Uncompressed Size is stored as unsigned 64-bit little endian |
+ integer. A special value of 0xFFFF_FFFF_FFFF_FFFF indicates |
+ that Uncompressed Size is unknown. End of Payload Marker (*) |
+ is used if and only if Uncompressed Size is unknown. |
+ |
+ XZ Utils rejects files whose Uncompressed Size field specifies |
+ a known size that is 256 GiB or more. This is to reject false |
+ positives when trying to guess if the input file is in the |
+ .lzma format. When Uncompressed Size is unknown, there is no |
+ limit for the uncompressed size of the file. |
+ |
+ (*) Some tools use the term End of Stream (EOS) marker |
+ instead of End of Payload Marker. |
+ |
+ |
+1.2. LZMA Compressed Data |
+ |
+ Detailed description of the format of this field is out of |
+ scope of this document. |
+ |
+ |
+2. References |
+ |
+ LZMA SDK - The original LZMA implementation |
+ http://7-zip.org/sdk.html |
+ |
+ 7-Zip |
+ http://7-zip.org/ |
+ |
+ LZMA Utils - LZMA adapted to POSIX-like systems |
+ http://tukaani.org/lzma/ |
+ |
+ XZ Utils - The next generation of LZMA Utils |
+ http://tukaani.org/xz/ |
+ |
+ The .xz file format - The successor of the the .lzma format |
+ http://tukaani.org/xz/xz-file-format.txt |
+ |
Property changes on: xz/doc/lzma-file-format.txt |
___________________________________________________________________ |
Added: svn:eol-style |
+ LF |