Index: third_party/lzma_sdk/lzma.txt |
=================================================================== |
--- third_party/lzma_sdk/lzma.txt (revision 0) |
+++ third_party/lzma_sdk/lzma.txt (revision 0) |
@@ -0,0 +1,598 @@ |
+LZMA SDK 9.20 |
+------------- |
+ |
+LZMA SDK provides the documentation, samples, header files, libraries, |
+and tools you need to develop applications that use LZMA compression. |
+ |
+LZMA is the default and general compression method of the 7z format |
+in the 7-Zip compression program (www.7-zip.org). LZMA provides a high |
+compression ratio and very fast decompression. |
+ |
+LZMA is an improved version of the well-known LZ77 compression algorithm. |
+It was improved to maximize the compression ratio while |
+keeping high decompression speed and low memory requirements for |
+decompressing. |
+ |
+ |
+ |
+LICENSE |
+------- |
+ |
+LZMA SDK is written and placed in the public domain by Igor Pavlov. |
+ |
+Some code in LZMA SDK is based on public domain code from other developers: |
+ 1) PPMd var.H (2001): Dmitry Shkarin |
+ 2) SHA-256: Wei Dai (Crypto++ library) |
+ |
+ |
+LZMA SDK Contents |
+----------------- |
+ |
+LZMA SDK includes: |
+ |
+ - ANSI-C/C++/C#/Java source code for LZMA compressing and decompressing |
+ - Compiled file->file LZMA compressing/decompressing program for Windows system |
+ |
+ |
+UNIX/Linux version |
+------------------ |
+To compile C++ version of file->file LZMA encoding, go to directory |
+CPP/7zip/Bundles/LzmaCon |
+and call make to recompile it: |
+ make -f makefile.gcc clean all |
+ |
+In some UNIX/Linux versions you must compile LZMA with static libraries. |
+To compile with static libraries, you can use |
+LIB = -lm -static |
+ |
+ |
+Files |
+--------------------- |
+lzma.txt - LZMA SDK description (this file) |
+7zFormat.txt - 7z Format description |
+7zC.txt - 7z ANSI-C Decoder description |
+methods.txt - Compression method IDs for .7z |
+lzma.exe - Compiled file->file LZMA encoder/decoder for Windows |
+7zr.exe - 7-Zip with 7z/lzma/xz support. |
+history.txt - history of the LZMA SDK |
+ |
+ |
+Source code structure |
+--------------------- |
+ |
+C/ - C files |
+ 7zCrc*.* - CRC code |
+ Alloc.* - Memory allocation functions |
+ Bra*.* - Filters for x86, IA-64, ARM, ARM-Thumb, PowerPC and SPARC code |
+ LzFind.* - Match finder for LZ (LZMA) encoders |
+ LzFindMt.* - Match finder for LZ (LZMA) encoders for multithreading encoding |
+ LzHash.h - Additional file for LZ match finder |
+ LzmaDec.* - LZMA decoding |
+ LzmaEnc.* - LZMA encoding |
+ LzmaLib.* - LZMA Library for DLL calling |
+ Types.h - Basic types for other .c files |
+ Threads.* - The code for multithreading. |
+ |
+ LzmaLib - LZMA Library (.DLL for Windows) |
+ |
+ LzmaUtil - LZMA Utility (file->file LZMA encoder/decoder). |
+ |
+ Archive - files related to archiving |
+ 7z - 7z ANSI-C Decoder |
+ |
+CPP/ -- CPP files |
+ |
+ Common - common files for C++ projects |
+ Windows - common files for Windows related code |
+ |
+ 7zip - files related to 7-Zip Project |
+ |
+ Common - common files for 7-Zip |
+ |
+ Compress - files related to compression/decompression |
+ |
+ Archive - files related to archiving |
+ |
+ Common - common files for archive handling |
+ 7z - 7z C++ Encoder/Decoder |
+ |
+ Bundles - Modules that are bundles of other modules |
+ |
+ Alone7z - 7zr.exe: Standalone version of 7z.exe that supports only 7z/LZMA/BCJ/BCJ2 |
+ LzmaCon - lzma.exe: LZMA compression/decompression |
+ Format7zR - 7zr.dll: Reduced version of 7za.dll: extracting/compressing to 7z/LZMA/BCJ/BCJ2 |
+ Format7zExtractR - 7zxr.dll: Reduced version of 7zxa.dll: extracting from 7z/LZMA/BCJ/BCJ2. |
+ |
+ UI - User Interface files |
+ |
+ Client7z - Test application for 7za.dll, 7zr.dll, 7zxr.dll |
+ Common - Common UI files |
+ Console - Code for console archiver |
+ |
+ |
+ |
+CS/ - C# files |
+ 7zip |
+ Common - some common files for 7-Zip |
+ Compress - files related to compression/decompression |
+ LZ - files related to LZ (Lempel-Ziv) compression algorithm |
+ LZMA - LZMA compression/decompression |
+ LzmaAlone - file->file LZMA compression/decompression |
+ RangeCoder - Range Coder (special code of compression/decompression) |
+ |
+Java/ - Java files |
+ SevenZip |
+ Compression - files related to compression/decompression |
+ LZ - files related to LZ (Lempel-Ziv) compression algorithm |
+ LZMA - LZMA compression/decompression |
+ RangeCoder - Range Coder (special code of compression/decompression) |
+ |
+ |
+C/C++ source code of LZMA SDK is part of 7-Zip project. |
+7-Zip source code can be downloaded from 7-Zip's SourceForge page: |
+ |
+ http://sourceforge.net/projects/sevenzip/ |
+ |
+ |
+ |
+LZMA features |
+------------- |
+ - Variable dictionary size (up to 1 GB) |
+ - Estimated compressing speed: about 2 MB/s on 2 GHz CPU |
+ - Estimated decompressing speed: |
+ - 20-30 MB/s on 2 GHz Core 2 or AMD Athlon 64 |
+ - 1-2 MB/s on 200 MHz ARM, MIPS, PowerPC or other simple RISC |
+ - Small memory requirements for decompressing (16 KB + DictionarySize) |
+ - Small code size for decompressing: 5-8 KB |
+ |
+LZMA decoder uses only integer operations and can be |
+implemented on any modern 32-bit CPU (or on a 16-bit CPU with some conditions). |
+ |
+Some critical operations that affect the speed of LZMA decompression: |
+ 1) 32*16 bit integer multiply |
+ 2) Mispredicted branches (penalty mostly depends on pipeline length) |
+ 3) 32-bit shift and arithmetic operations |
+ |
+The speed of LZMA decompression depends mostly on CPU speed. |
+Memory speed matters little. But if your CPU has a small data cache, |
+the overall weight of memory speed will slightly increase. |
+ |
+ |
+How To Use |
+---------- |
+ |
+Using LZMA encoder/decoder executable |
+-------------------------------------- |
+ |
+Usage: LZMA <e|d> inputFile outputFile [<switches>...] |
+ |
+ e: encode file |
+ |
+ d: decode file |
+ |
+ b: Benchmark. There are two tests: compressing and decompressing |
+ with LZMA method. Benchmark shows rating in MIPS (million |
+ instructions per second). Rating value is calculated from |
+ measured speed and it is normalized with Intel's Core 2 results. |
+ Also Benchmark checks possible hardware errors (RAM |
+ errors in most cases). Benchmark uses these settings: |
+ (-a1, -d21, -fb32, -mfbt4). You can change only -d parameter. |
+ Also you can change the number of iterations. Example for 30 iterations: |
+ LZMA b 30 |
+ Default number of iterations is 10. |
+ |
+<Switches> |
+ |
+ |
+ -a{N}: set compression mode 0 = fast, 1 = normal |
+ default: 1 (normal) |
+ |
+ -d{N}: set dictionary size - [0, 30], default: 23 (8MB) |
+ The maximum value for dictionary size is 1 GB = 2^30 bytes. |
+ Dictionary size is calculated as DictionarySize = 2^N bytes. |
+ To decompress a file compressed by the LZMA method with dictionary |
+ size D = 2^N, you need about D bytes of memory (RAM). |
+ |
+ -fb{N}: set number of fast bytes - [5, 273], default: 128 |
+ Usually big number gives a little bit better compression ratio |
+ and slower compression process. |
+ |
+ -lc{N}: set number of literal context bits - [0, 8], default: 3 |
+ Sometimes lc=4 gives gain for big files. |
+ |
+ -lp{N}: set number of literal pos bits - [0, 4], default: 0 |
+ lp switch is intended for periodical data when period is |
+ equal 2^N. For example, for 32-bit (4 bytes) |
+ periodical data you can use lp=2. Often it's better to set lc0, |
+ if you change lp switch. |
+ |
+ -pb{N}: set number of pos bits - [0, 4], default: 2 |
+ pb switch is intended for periodical data |
+ when period is equal 2^N. |
+ |
+ -mf{MF_ID}: set Match Finder. Default: bt4. |
+ Algorithms from the hc* group don't provide a good compression |
+ ratio, but they often work pretty fast in combination with |
+ fast mode (-a0). |
+ |
+ Memory requirements depend on dictionary size |
+ (parameter "d" in table below). |
+ |
+ MF_ID Memory Description |
+ |
+ bt2 d * 9.5 + 4MB Binary Tree with 2 bytes hashing. |
+ bt3 d * 11.5 + 4MB Binary Tree with 3 bytes hashing. |
+ bt4 d * 11.5 + 4MB Binary Tree with 4 bytes hashing. |
+ hc4 d * 7.5 + 4MB Hash Chain with 4 bytes hashing. |
+ |
+ -eos: write End Of Stream marker. By default LZMA doesn't write |
+ eos marker, since LZMA decoder knows uncompressed size |
+ stored in .lzma file header. |
+ |
+ -si: Read data from stdin (it will write End Of Stream marker). |
+ -so: Write data to stdout |
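The -d{N} rule and the match finder memory table above can be turned into a quick estimate. The following is a minimal sketch, not part of the SDK: the function names dict_size_from_switch and encoder_memory_estimate are hypothetical, and the factors are copied from the table (stored doubled to stay in integer arithmetic).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Dictionary size for switch -d{N}: 2^N bytes, with N in [0, 30]. */
uint64_t dict_size_from_switch(unsigned n)
{
    assert(n <= 30);
    return (uint64_t)1 << n;
}

/* Approximate encoder memory in bytes, using the table's rule
   "d * factor + 4MB". Returns 0 for an unknown match finder ID.
   Hypothetical helper for illustration only. */
uint64_t encoder_memory_estimate(const char *mf_id, uint64_t d)
{
    const uint64_t base = (uint64_t)4 << 20;  /* the "+ 4MB" term */
    unsigned twice_factor;                    /* factor * 2, avoids floating point */

    assert(mf_id != NULL);
    if (strcmp(mf_id, "bt2") == 0)
        twice_factor = 19;                    /* 9.5 */
    else if (strcmp(mf_id, "bt3") == 0 || strcmp(mf_id, "bt4") == 0)
        twice_factor = 23;                    /* 11.5 */
    else if (strcmp(mf_id, "hc4") == 0)
        twice_factor = 15;                    /* 7.5 */
    else
        return 0;

    return d * twice_factor / 2 + base;
}
```

With the default settings (-d23, bt4) this gives 8 MB * 11.5 + 4 MB = 96 MB.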
+ |
+ |
+Examples: |
+ |
+1) LZMA e file.bin file.lzma -d16 -lc0 |
+ |
+compresses file.bin to file.lzma with 64 KB dictionary (2^16=64K) |
+and 0 literal context bits. -lc0 reduces the memory requirements |
+for decompression. |
+ |
+ |
+2) LZMA e file.bin file.lzma -lc0 -lp2 |
+ |
+compresses file.bin to file.lzma with settings suitable |
+for 32-bit periodical data (for example, ARM or MIPS code). |
+ |
+3) LZMA d file.lzma file.bin |
+ |
+decompresses file.lzma to file.bin. |
+ |
+ |
+Compression ratio hints |
+----------------------- |
+ |
+Recommendations |
+--------------- |
+ |
+To increase the compression ratio for LZMA compression, it's desirable |
+to have aligned data (if possible) and to arrange the data so that |
+code is grouped in one place and data is grouped in another |
+(this is better than mixing them: code, data, code, |
+data, ...). |
+ |
+ |
+Filters |
+------- |
+You can increase the compression ratio for some data types by using |
+special filters before compressing. For example, it's possible to |
+increase the compression ratio by 5-10% for code for these CPU ISAs: |
+x86, IA-64, ARM, ARM-Thumb, PowerPC, SPARC. |
+ |
+You can find C source code of such filters in C/Bra*.* files. |
+ |
+You can check the compression ratio gain of these filters with the following |
+7-Zip commands (example for ARM code): |
+No filter: |
+ 7z a a1.7z a.bin -m0=lzma |
+ |
+With filter for little-endian ARM code: |
+ 7z a a2.7z a.bin -m0=arm -m1=lzma |
+ |
+It works as follows: |
+Compressing = Filter_encoding + LZMA_encoding |
+Decompressing = LZMA_decoding + Filter_decoding |
+ |
+Compressing and decompressing speed of such filters is very high, |
+so it will not increase decompressing time too much. |
+Moreover, it reduces decompression time for LZMA_decoding, |
+since compression ratio with filtering is higher. |
+ |
+These filters convert CALL (calling procedure) instructions |
+from relative offsets to absolute addresses, so such data becomes more |
+compressible. |
+ |
+For some ISAs (for example, for MIPS) such a filter gives no gain. |
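The CALL conversion described above can be illustrated with a deliberately simplified sketch for x86. This is NOT the code in C/Bra86.c, which applies extra checks to avoid converting bytes that only look like CALL instructions; the function name call_filter_x86_sketch is hypothetical. The sketch treats every 0xE8 opcode as a CALL with a 32-bit little-endian relative operand and, for simplicity, takes the buffer offset as the code address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void store_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* encoding != 0: relative offset -> absolute address (before LZMA encoding).
   encoding == 0: absolute address -> relative offset (after LZMA decoding).
   Operand bytes are skipped in both directions, so both passes visit the
   same opcode positions and the transform is exactly reversible. */
void call_filter_x86_sketch(uint8_t *buf, size_t size, int encoding)
{
    size_t i = 0;

    assert(buf != NULL || size == 0);
    while (i + 5 <= size) {
        uint32_t next_ip, v;

        if (buf[i] != 0xE8) {           /* not a CALL rel32 opcode */
            i++;
            continue;
        }
        next_ip = (uint32_t)(i + 5);    /* address of the following instruction */
        v = load_le32(buf + i + 1);
        v = encoding ? v + next_ip : v - next_ip;
        store_le32(buf + i + 1, v);
        i += 5;                         /* skip opcode and operand */
    }
}
```

Two calls to the same procedure from different places have different relative offsets but the same absolute address, so after filtering they become identical byte sequences that LZMA can match.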
+ |
+ |
+LZMA compressed file format |
+--------------------------- |
+Offset Size Description |
+ 0 1 Special LZMA properties (lc, lp, pb in encoded form) |
+ 1 4 Dictionary size (little endian) |
+ 5 8 Uncompressed size (little endian). -1 means unknown size |
+ 13 Compressed data |
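The 13-byte header layout above can be parsed as in the following minimal sketch. The type LzmaHeaderSketch and the function parse_lzma_header_sketch are hypothetical names, not SDK API. The table only says the first byte holds lc, lp and pb "in encoded form"; the sketch assumes the usual packing props = (pb * 5 + lp) * 9 + lc, which is how the SDK decoder is understood to unpack it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    unsigned lc, lp, pb;
    uint32_t dict_size;          /* bytes, from offset 1 (little endian) */
    uint64_t uncompressed_size;  /* from offset 5; UINT64_MAX (-1) means unknown */
} LzmaHeaderSketch;

/* Returns 0 on success, -1 if the properties byte is out of range. */
int parse_lzma_header_sketch(const uint8_t header[13], LzmaHeaderSketch *out)
{
    unsigned d = header[0];
    int i;

    assert(out != NULL);
    if (d >= 9 * 5 * 5)
        return -1;

    /* Assumed packing: props = (pb * 5 + lp) * 9 + lc */
    out->lc = d % 9;
    d /= 9;
    out->lp = d % 5;
    out->pb = d / 5;

    out->dict_size = 0;
    for (i = 0; i < 4; i++)
        out->dict_size |= (uint32_t)header[1 + i] << (8 * i);

    out->uncompressed_size = 0;
    for (i = 0; i < 8; i++)
        out->uncompressed_size |= (uint64_t)header[5 + i] << (8 * i);

    return 0;
}
```

Under that assumption, the default settings (lc=3, lp=0, pb=2) give a first byte of 0x5D.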
+ |
+ |
+ANSI-C LZMA Decoder |
+~~~~~~~~~~~~~~~~~~~ |
+ |
+Please note that interfaces for ANSI-C code were changed in LZMA SDK 4.58. |
+If you want to use old interfaces you can download a previous version of LZMA SDK |
+from the sourceforge.net site. |
+ |
+To use ANSI-C LZMA Decoder you need the following files: |
+1) LzmaDec.h + LzmaDec.c + Types.h |
+LzmaUtil/LzmaUtil.c is an example application that uses these files. |
+ |
+ |
+Memory requirements for LZMA decoding |
+------------------------------------- |
+ |
+Stack usage of LZMA decoding function for local variables is not |
+larger than 200-400 bytes. |
+ |
+LZMA Decoder uses dictionary buffer and internal state structure. |
+Internal state structure consumes |
+ state_size = (4 + (1.5 << (lc + lp))) KB |
+by default (lc=3, lp=0), state_size = 16 KB. |
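The state_size formula above can be evaluated as in this minimal sketch. The function names are hypothetical, and the result is the documentation's rounded estimate (4 KB + 1.5 KB * 2^(lc + lp)), not the exact size the decoder allocates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* state_size = (4 + (1.5 << (lc + lp))) KB, expressed in bytes.
   1.5 KB = 1536 bytes, so the shift stays in integer arithmetic. */
size_t lzma_state_size_estimate(unsigned lc, unsigned lp)
{
    assert(lc <= 8 && lp <= 4);
    return (size_t)4 * 1024 + ((size_t)1536 << (lc + lp));
}

/* Rough decoder memory: internal state plus the dictionary buffer. */
uint64_t lzma_decoder_memory_estimate(unsigned lc, unsigned lp, uint64_t dict_size)
{
    return (uint64_t)lzma_state_size_estimate(lc, lp) + dict_size;
}
```

For the defaults (lc=3, lp=0) the estimate is 4 KB + 12 KB = 16 KB, matching the figure above; with lc=0 it drops to 5.5 KB.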
+ |
+ |
+How To decompress data |
+---------------------- |
+ |
+LZMA Decoder (ANSI-C version) now supports 2 interfaces: |
+1) Single-call Decompressing |
+2) Multi-call State Decompressing (zlib-like interface) |
+ |
+You must use external allocator: |
+Example: |
+void *SzAlloc(void *p, size_t size) { p = p; return malloc(size); } |
+void SzFree(void *p, void *address) { p = p; free(address); } |
+ISzAlloc alloc = { SzAlloc, SzFree }; |
+ |
+You can use the p = p; statement to suppress compiler warnings about the unused parameter. |
+ |
+ |
+Single-call Decompressing |
+------------------------- |
+When to use: RAM->RAM decompressing |
+Compile files: LzmaDec.h + LzmaDec.c + Types.h |
+Compile defines: no defines |
+Memory Requirements: |
+ - Input buffer: compressed size |
+ - Output buffer: uncompressed size |
+ - LZMA Internal Structures: state_size (16 KB for default settings) |
+ |
+Interface: |
+ int LzmaDecode(Byte *dest, SizeT *destLen, const Byte *src, SizeT *srcLen, |
+ const Byte *propData, unsigned propSize, ELzmaFinishMode finishMode, |
+ ELzmaStatus *status, ISzAlloc *alloc); |
+ In: |
+ dest - output data |
+ destLen - output data size |
+ src - input data |
+ srcLen - input data size |
+ propData - LZMA properties (5 bytes) |
+ propSize - size of propData buffer (5 bytes) |
+ finishMode - It has meaning only if the decoding reaches output limit (*destLen). |
+ LZMA_FINISH_ANY - Decode just destLen bytes. |
+ LZMA_FINISH_END - Stream must be finished after (*destLen). |
+ You can use LZMA_FINISH_END, when you know that |
+ current output buffer covers last bytes of stream. |
+ alloc - Memory allocator. |
+ |
+ Out: |
+ destLen - processed output size |
+ srcLen - processed input size |
+ |
+ Output: |
+ SZ_OK |
+ status: |
+ LZMA_STATUS_FINISHED_WITH_MARK |
+ LZMA_STATUS_NOT_FINISHED |
+ LZMA_STATUS_MAYBE_FINISHED_WITHOUT_MARK |
+ SZ_ERROR_DATA - Data error |
+ SZ_ERROR_MEM - Memory allocation error |
+ SZ_ERROR_UNSUPPORTED - Unsupported properties |
+ SZ_ERROR_INPUT_EOF - It needs more bytes in input buffer (src). |
+ |
+ If LZMA decoder sees end_marker before reaching output limit, it returns OK result, |
+ and output value of destLen will be less than output buffer size limit. |
+ |
+ You can use multiple checks to test data integrity after full decompression: |
+ 1) Check Result and "status" variable. |
+ 2) Check that output(destLen) = uncompressedSize, if you know real uncompressedSize. |
+ 3) Check that output(srcLen) = compressedSize, if you know real compressedSize. |
+ You must use the correct finish mode in that case. |
+ |
+ |
+Multi-call State Decompressing (zlib-like interface) |
+---------------------------------------------------- |
+ |
+When to use: file->file decompressing |
+Compile files: LzmaDec.h + LzmaDec.c + Types.h |
+ |
+Memory Requirements: |
+ - Buffer for input stream: any size (for example, 16 KB) |
+ - Buffer for output stream: any size (for example, 16 KB) |
+ - LZMA Internal Structures: state_size (16 KB for default settings) |
+ - LZMA dictionary (dictionary size is encoded in LZMA properties header) |
+ |
+1) Read LZMA properties (5 bytes) and uncompressed size (8 bytes, little-endian) to header: |
+ unsigned char header[LZMA_PROPS_SIZE + 8]; |
+ ReadFile(inFile, header, sizeof(header)); |
+ |
+2) Allocate CLzmaDec structures (state + dictionary) using LZMA properties |
+ |
+ CLzmaDec state; |
+ LzmaDec_Constr(&state); |
+ res = LzmaDec_Allocate(&state, header, LZMA_PROPS_SIZE, &g_Alloc); |
+ if (res != SZ_OK) |
+ return res; |
+ |
+3) Init LzmaDec structure before any new LZMA stream, then call LzmaDec_DecodeToBuf in a loop |
+ |
+ LzmaDec_Init(&state); |
+ for (;;) |
+ { |
+ ... |
+ int res = LzmaDec_DecodeToBuf(CLzmaDec *p, Byte *dest, SizeT *destLen, |
+ const Byte *src, SizeT *srcLen, ELzmaFinishMode finishMode); |
+ ... |
+ } |
+ |
+ |
+4) Free all allocated structures |
+ LzmaDec_Free(&state, &g_Alloc); |
+ |
+For full code example, look at C/LzmaUtil/LzmaUtil.c code. |
+ |
+ |
+How To compress data |
+-------------------- |
+ |
+Compile files: LzmaEnc.h + LzmaEnc.c + Types.h + |
+LzFind.c + LzFind.h + LzFindMt.c + LzFindMt.h + LzHash.h |
+ |
+Memory Requirements: |
+ - (dictSize * 11.5 + 6 MB) + state_size |
+ |
+Lzma Encoder can use two memory allocators: |
+1) alloc - for small arrays. |
+2) allocBig - for big arrays. |
+ |
+For example, you can use Large RAM Pages (2 MB) in allocBig allocator for |
+better compression speed. Note that Windows has bad implementation for |
+Large RAM Pages. |
+It's OK to use same allocator for alloc and allocBig. |
+ |
+ |
+Single-call Compression with callbacks |
+-------------------------------------- |
+ |
+Check C/LzmaUtil/LzmaUtil.c as an example. |
+ |
+When to use: file->file compressing |
+ |
+1) you must implement callback structures for interfaces: |
+ISeqInStream |
+ISeqOutStream |
+ICompressProgress |
+ISzAlloc |
+ |
+static void *SzAlloc(void *p, size_t size) { p = p; return MyAlloc(size); } |
+static void SzFree(void *p, void *address) { p = p; MyFree(address); } |
+static ISzAlloc g_Alloc = { SzAlloc, SzFree }; |
+ |
+ CFileSeqInStream inStream; |
+ CFileSeqOutStream outStream; |
+ |
+ inStream.funcTable.Read = MyRead; |
+ inStream.file = inFile; |
+ outStream.funcTable.Write = MyWrite; |
+ outStream.file = outFile; |
+ |
+ |
+2) Create CLzmaEncHandle object; |
+ |
+ CLzmaEncHandle enc; |
+ |
+ enc = LzmaEnc_Create(&g_Alloc); |
+ if (enc == 0) |
+ return SZ_ERROR_MEM; |
+ |
+ |
+3) initialize CLzmaEncProps properties; |
+ |
+ LzmaEncProps_Init(&props); |
+ |
+ Then you can change some properties in that structure. |
+ |
+4) Send LZMA properties to LZMA Encoder |
+ |
+ res = LzmaEnc_SetProps(enc, &props); |
+ |
+5) Write encoded properties to header |
+ |
+ Byte header[LZMA_PROPS_SIZE + 8]; |
+ size_t headerSize = LZMA_PROPS_SIZE; |
+ UInt64 fileSize; |
+ int i; |
+ |
+ res = LzmaEnc_WriteProperties(enc, header, &headerSize); |
+ fileSize = MyGetFileLength(inFile); |
+ for (i = 0; i < 8; i++) |
+ header[headerSize++] = (Byte)(fileSize >> (8 * i)); |
+ MyWriteFileAndCheck(outFile, header, headerSize); |
+ |
+6) Call encoding function: |
+ res = LzmaEnc_Encode(enc, &outStream.funcTable, &inStream.funcTable, |
+ NULL, &g_Alloc, &g_Alloc); |
+ |
+7) Destroy LZMA Encoder Object |
+ LzmaEnc_Destroy(enc, &g_Alloc, &g_Alloc); |
+ |
+ |
+If a callback function returns an error code, LzmaEnc_Encode also returns that code, |
+or it can return a code like SZ_ERROR_READ, SZ_ERROR_WRITE or SZ_ERROR_PROGRESS. |
+ |
+ |
+Single-call RAM->RAM Compression |
+-------------------------------- |
+ |
+Single-call RAM->RAM Compression is similar to Compression with callbacks, |
+but you provide pointers to buffers instead of pointers to stream callbacks: |
+ |
+SRes LzmaEncode(Byte *dest, SizeT *destLen, const Byte *src, SizeT srcLen, |
+ CLzmaEncProps *props, Byte *propsEncoded, SizeT *propsSize, int writeEndMark, |
+ ICompressProgress *progress, ISzAlloc *alloc, ISzAlloc *allocBig); |
+ |
+Return code: |
+ SZ_OK - OK |
+ SZ_ERROR_MEM - Memory allocation error |
+ SZ_ERROR_PARAM - Incorrect parameter |
+ SZ_ERROR_OUTPUT_EOF - output buffer overflow |
+ SZ_ERROR_THREAD - errors in multithreading functions (only for Mt version) |
+ |
+ |
+ |
+Defines |
+------- |
+ |
+_LZMA_SIZE_OPT - Enable some optimizations in LZMA Decoder to get smaller executable code. |
+ |
+_LZMA_PROB32 - It can increase the speed on some 32-bit CPUs, but memory usage for |
+ some structures will be doubled in that case. |
+ |
+_LZMA_UINT32_IS_ULONG - Define it if int is 16-bit on your compiler and long is 32-bit. |
+ |
+_LZMA_NO_SYSTEM_SIZE_T - Define it if you don't want to use size_t type. |
+ |
+ |
+_7ZIP_PPMD_SUPPPORT - Define it if you don't want to support PPMD method in ANSI-C .7z decoder. |
+ |
+ |
+C++ LZMA Encoder/Decoder |
+~~~~~~~~~~~~~~~~~~~~~~~~ |
+C++ LZMA code uses COM-like interfaces. So if you want to use it, |
+you can study the basics of COM/OLE. |
+C++ LZMA code is just a wrapper over ANSI-C code. |
+ |
+ |
+C++ Notes |
+~~~~~~~~~~~~~~~~~~~~~~~~ |
+If you use some C++ code folders in 7-Zip (for example, C++ code for .7z handling), |
+you must check that you correctly work with "new" operator. |
+7-Zip can be compiled with MSVC 6.0 that doesn't throw "exception" from "new" operator. |
+So 7-Zip uses "CPP\Common\NewHandler.cpp" that redefines "new" operator: |
+operator new(size_t size) |
+{ |
+ void *p = ::malloc(size); |
+ if (p == 0) |
+ throw CNewException(); |
+ return p; |
+} |
+If you use a version of MSVC that throws an exception from the "new" operator, you can compile without |
+"NewHandler.cpp", so the standard exception will be used. Actually some code of |
+7-Zip catches any exception in internal code and converts it to an HRESULT code. |
+So you don't need to catch CNewException if you call COM interfaces of 7-Zip. |
+ |
+--- |
+ |
+http://www.7-zip.org |
+http://www.7-zip.org/sdk.html |
+http://www.7-zip.org/support.html |
Property changes on: third_party\lzma_sdk\lzma.txt |
___________________________________________________________________ |
Added: svn:executable |
+ * |