The Gz module contains functions to compress and uncompress strings using the same algorithm as the program gzip. Compressing can be done in streaming mode or all at once.
The Gz module consists of two classes: Gz.deflate and Gz.inflate. Gz.deflate is used to pack data and Gz.inflate is used to unpack data. (Think "inflatable boat".)
Note that this module is only available if the gzip library was available when Pike was compiled.
Note that although these functions use the same algorithm as gzip, they do not use the exact same format, so you cannot directly unzip gzipped files with these routines. Support for this will be added in the future.
inherit _Gz : _Gz
constant Gz.DEFAULT_STRATEGY
The default strategy as selected in the zlib library.
constant Gz.FILTERED
This strategy is intended for data created by a filter or predictor and will put more emphasis on Huffman encoding and less on LZ string matching. This is between DEFAULT_STRATEGY and HUFFMAN_ONLY.
constant Gz.FIXED
In this mode dynamic huffman codes are disabled, allowing for a simpler decoder for special applications. This mode is not available in all zlib versions.
constant Gz.HUFFMAN_ONLY
This strategy will turn off string matching completely, only doing Huffman encoding. Window size doesn't matter in this mode and the data can be decompressed with a zero size window.
constant Gz.RLE
This strategy is even closer to the HUFFMAN_ONLY in that it only looks at the latest byte in the window, i.e. a window size of 1 byte is sufficient for decompression. This mode is not available in all zlib versions.
int adler32(string(8bit) data, void|int(0..) start_value)
This function calculates the Adler-32 checksum.
string(8bit)|zero check_header(Stdio.Stream|void f, Stdio.Buffer|string(8bit)|void buf)
Check whether a file has a valid gzip header.
f: File to check.
buf: Prefix of f.
Returns the content of f after the gzip header if a header was found. Returns 0 (zero) if there was no header.
string(8bit) compress(string(8bit)|String.Buffer|System.Memory|Stdio.Buffer data, void|bool raw, void|int(0..9) level, void|int strategy, void|int(8..15) window_size)
Encodes and returns the input data according to the deflate
format defined in RFC 1951.
data: The data to be encoded.
raw: If set, the data is encoded without the header and footer defined in RFC 1950. One example use is the ZIP container format.
level: Indicates the level of effort spent to make the data compress well. Zero means no packing, 2-3 is considered 'fast', 8 is default and higher is considered 'slow' but gives better packing.
strategy: The strategy to be used when compressing the data. One of the following:
| DEFAULT_STRATEGY | The default strategy as selected in the zlib library. |
| FILTERED | This strategy is intended for data created by a filter or predictor and will put more emphasis on Huffman encoding and less on LZ string matching. This is between DEFAULT_STRATEGY and HUFFMAN_ONLY. |
| RLE | This strategy is even closer to HUFFMAN_ONLY in that it only looks at the latest byte in the window, i.e. a window size of 1 byte is sufficient for decompression. This mode is not available in all zlib versions. |
| HUFFMAN_ONLY | This strategy will turn off string matching completely, only doing Huffman encoding. Window size doesn't matter in this mode and the data can be decompressed with a zero size window. |
| FIXED | In this mode dynamic Huffman codes are disabled, allowing for a simpler decoder for special applications. This mode is not available in all zlib versions. |
window_size: Defines the size of the LZ77 window from 256 bytes to 32768 bytes, expressed as 2^x.
See also: deflate, inflate, uncompress
int crc32(string(8bit) data, void|int(0..) start_value)
This function calculates the standard ISO3309 Cyclic Redundancy Check.
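As a minimal sketch of how start_value can be used, the hypothetical helper chunked_crc32 below feeds a string to Gz.crc32 in pieces, passing each result back in as the next start_value. It assumes that a start_value of 0 is equivalent to omitting the argument; the mask to 32 bits is only a defensive measure in case a build returns a signed value.

```pike
// Hypothetical helper: checksum data that arrives in several pieces.
int chunked_crc32(array(string) chunks)
{
  int crc = 0;  // assumed to be equivalent to omitting start_value
  foreach (chunks, string chunk)
    crc = Gz.crc32(chunk, crc) & 0xffffffff;
  return crc;
}

int main()
{
  // One-shot checksum and chunked checksum of the same data.
  write("%08x\n", Gz.crc32("123456789") & 0xffffffff);
  write("%08x\n", chunked_crc32(({ "1234", "56789" })));
  return 0;
}
```

Both lines should print the same value, since the chunked call only continues the running checksum.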
int make_header(Stdio.Stream|Stdio.Buffer f)
Write a gzip header to a file or buffer.
f: File or buffer to write a gzip header to.
Returns 1 on success and 0 (zero) on failure.
string(8bit) uncompress(string(8bit)|String.Buffer|System.Memory|Stdio.Buffer data, void|bool raw)
Uncompresses the data and returns it. The raw parameter tells the decoder that the input data lacks the header and footer defined in RFC 1950.
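The following sketch shows a round trip through compress and uncompress, once with the RFC 1950 wrapper and once in raw mode. The helper name round_trip and the sample text are made up for illustration.

```pike
// Hypothetical helper: compress and immediately uncompress again.
// The same raw flag must be used in both directions.
string round_trip(string data, int raw)
{
  string packed = Gz.compress(data, raw);
  return Gz.uncompress(packed, raw);
}

int main()
{
  string original = "An example string, repeated. " * 20;

  write("original: %d bytes\n", sizeof(original));
  write("wrapped:  %d bytes\n", sizeof(Gz.compress(original)));
  write("raw:      %d bytes\n", sizeof(Gz.compress(original, 1)));
  write("round trip ok: %d\n",
        round_trip(original, 0) == original &&
        round_trip(original, 1) == original);
  return 0;
}
```

The raw output is expected to be slightly smaller than the wrapped output, because the header and footer are left out.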
inherit ._file : _file
Allows the user to open a gzip archive and read and write its contents in an uncompressed form, emulating the Stdio.File interface.
An important limitation on this class is that it may only be used for reading or writing, not both at the same time. Please also note that if you want to reopen a file for reading after a write, you must close the file before calling open or strange effects might be the result.
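A small sketch of the write-then-read pattern described above: the file is closed after writing and a fresh object is opened for reading. The helper name and the scratch path are hypothetical, and the example assumes the path is writable.

```pike
// Hypothetical helper: write data to a gzip file, then read it back.
string write_and_read_back(string path, string data)
{
  Gz.File writer = Gz.File(path, "wb");
  writer->write(data);
  writer->close();  // close before the file is opened for reading

  Gz.File reader = Gz.File(path, "rb");
  string text = reader->read();  // no argument: read the whole file
  reader->close();
  return text;
}

int main()
{
  // The path is an assumption; pick any writable location.
  write(write_and_read_back("/tmp/gz_file_example.gz",
                            "first line\nsecond line\n"));
  return 0;
}
```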
Gz.File Gz.File(void|string|int|Stdio.Stream file, void|string mode)
file: Filename or file descriptor of the gzip file to open, or an already open Stream.
mode: Mode for the file. Defaults to "rb".
See also: open, Stdio.File
String.SplitIterator|Stdio.LineIterator line_iterator(int|void trim)
Returns an iterator that will loop over the lines in this file. If trim is true, all '\r' characters will be removed from the input.
int open(string|int|Stdio.Stream file, void|string mode)
file: Filename or file descriptor of the gzip file to open, or an already open Stream.
mode: Mode for the file. Defaults to "rb". May be one of the following:
| "rb" | read mode |
| "wb" | write mode |
| "ab" | append mode |
For the "wb" and "ab" modes, additional parameters may be specified. Please see the zlib manual for more info.
Returns non-zero if successful.
int|string read(void|int length)
Reads data from the file. If no argument is given, the whole file is read.
function(:string) read_function(int nbytes)
Returns a function that when called will call read with
nbytes as argument. Can be used to get various callback
functions, eg for the fourth argument to
String.SplitIterator.
Low-level implementation of read/write support for GZip files
int close()
Closes the file.
Returns 1 if successful.
Gz._file Gz._file(void|string|Stdio.Stream gzFile, void|string mode)
Opens a gzip file for reading.
bool eof()
Returns 1 if EOF has been reached.
int open(string|int|Stdio.Stream file, void|string mode)
Opens a file for I/O.
file: The filename or an open file descriptor or Stream for the gzip file to use.
mode: Mode for the file operations. Defaults to read only. The following mode characters are unique to Gz.File.
| Values 0 to 9 set the compression level from no compression to maximum available compression. Defaults to 6. |
| Sets the compression strategy to |
| Sets the compression strategy to |
If the object already has been opened, it will first be closed.
int|string read(int len)
Reads len (uncompressed) bytes from the file. If read is unsuccessful, 0 is returned.
int seek(int pos, void|int type)
Seeks within the file.
pos: Position relative to the search type.
type: One of the following:
SEEK_SET: set the current position in the file to pos.
SEEK_CUR: the new position is the current position plus pos.
SEEK_END is not supported.
Returns the new position, or a negative number if the seek failed.
int setparams(void|int(0..9) level, void|int strategy, void|int(8..15) window_size)
Sets the encoding level, strategy and window_size.
See also: Gz.deflate
int tell()
Returns the current position within the file.
int write(string data)
Writes the data to the file.
Returns the number of bytes written to the file.
This class interfaces with the compression routines in the libz library.
This class is only available if libz was available and found when Pike was compiled.
See also: Gz.inflate(), Gz.compress(), Gz.uncompress()
Gz.deflate clone()
Clones the deflate object. Typically used to test compression of new content using the same exact state.
Gz.deflate Gz.deflate(int(-9..9)|void level, int|void strategy, int(8..15)|void window_size)
Gz.deflate Gz.deflate(mapping options)
This function can also be used to re-initialize a Gz.deflate object so it can be re-used.
If a mapping is passed as the only argument, it will accept the
parameters described below as indices, and additionally it accepts
a string as dictionary.
level: Indicates the level of effort spent to make the data compress well. Zero means no packing, 2-3 is considered 'fast', 6 is default and higher is considered 'slow' but gives better packing.
If the argument is negative, no headers will be emitted. This is needed to produce ZIP files, for example. The negative value is then negated, and handled as a positive value.
strategy: The strategy to be used when compressing the data. One of the following:
| DEFAULT_STRATEGY | The default strategy as selected in the zlib library. |
| FILTERED | This strategy is intended for data created by a filter or predictor and will put more emphasis on Huffman encoding and less on LZ string matching. This is between DEFAULT_STRATEGY and HUFFMAN_ONLY. |
| RLE | This strategy is even closer to HUFFMAN_ONLY in that it only looks at the latest byte in the window, i.e. a window size of 1 byte is sufficient for decompression. This mode is not available in all zlib versions. |
| HUFFMAN_ONLY | This strategy will turn off string matching completely, only doing Huffman encoding. Window size doesn't matter in this mode and the data can be decompressed with a zero size window. |
| FIXED | In this mode dynamic Huffman codes are disabled, allowing for a simpler decoder for special applications. This mode is not available in all zlib versions. |
window_size: Defines the size of the LZ77 window from 256 bytes to 32768 bytes, expressed as 2^x.
string(8bit) deflate(string(8bit)|String.Buffer|System.Memory|Stdio.Buffer data, int|void flush)
This function performs gzip style compression on a string data and
returns the packed data. Streaming can be done by calling this
function several times and concatenating the returned data.
The optional argument flush should be one of the following:
| Gz.NO_FLUSH | Only data that doesn't fit in the internal buffers is returned. |
| Gz.PARTIAL_FLUSH | All input is packed and returned. |
| Gz.SYNC_FLUSH | All input is packed and returned. |
| Gz.FINISH | All input is packed and an 'end of data' marker is appended (default). |
See also: Gz.inflate->inflate()
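Streaming compression as described above could look roughly like the sketch below. The helper name pack_chunks is hypothetical, and the flush constants Gz.NO_FLUSH and Gz.FINISH are used on the assumption that they name the first and last flush modes listed above.

```pike
// Hypothetical helper: compress a sequence of chunks as one stream.
string pack_chunks(array(string) chunks)
{
  Gz.deflate packer = Gz.deflate(6);
  string packed = "";
  foreach (chunks, string chunk)
    packed += packer->deflate(chunk, Gz.NO_FLUSH);  // may return ""
  // Flush what is still buffered and append the end of data marker.
  return packed + packer->deflate("", Gz.FINISH);
}

int main()
{
  string packed = pack_chunks(({ "chunk one, ", "chunk two, ", "chunk three" }));
  write("%s\n", Gz.inflate()->inflate(packed));
  return 0;
}
```

Concatenating the return values matters: with Gz.NO_FLUSH most calls return little or nothing, and the bulk of the output arrives with the final call.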
This class interfaces with the uncompression routines in the libz library.
This program is only available if libz was available and found when Pike was compiled.
See also: deflate, compress, uncompress
Gz.inflate Gz.inflate(int|void window_size)
Gz.inflate Gz.inflate(mapping options)
If called with a mapping as only argument, create accepts
the entries window_size (described below) and
dictionary, which is a string to be set as dictionary.
The window_size value is passed down to inflateInit2 in zlib.
If the argument is negative, no header checks are done, and no verification of the data will be done either. This is needed for uncompressing ZIP-files, as an example. The negative value is then negated, and handled as a positive value.
Positive arguments set the maximum window size as a power of 2: 8 (the minimum) gives a 256-byte window, and 15 (the maximum, and the default) gives a 32 KB window. Setting this to anything except 15 is rather pointless in Pike.
It can be used to limit the amount of memory that is used to uncompress files, but 32 KB is not all that much in the great scheme of things.
To decompress files compressed with level 9 compression, a 32 KB window size is needed. Level 1 compression only requires a 256-byte window.
If the options version is used you can specify your own dictionary in addition to the window size.
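To illustrate the negative-argument convention, the sketch below produces headerless deflate data with a negative level and unpacks it with a negative window size. The helper name raw_round_trip is hypothetical, and Gz.FINISH is assumed to be the flush constant that ends the stream.

```pike
// Hypothetical helper: headerless (raw) deflate data, as used in ZIP files.
string raw_round_trip(string data)
{
  // Negative level: level 9, but no header or footer is emitted.
  string raw = Gz.deflate(-9)->deflate(data, Gz.FINISH);
  // Negative window size: 32 KB window, no header checks.
  return Gz.inflate(-15)->inflate(raw);
}

int main()
{
  string data = "raw deflate example " * 10;
  write("round trip ok: %d\n", raw_round_trip(data) == data);
  return 0;
}
```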
string(8bit) end_of_stream()
This function returns 0 if the end of stream marker has not yet been encountered, or a string (possibly empty) containing any extra data received following the end of stream marker if the marker has been encountered. If the extra data is not needed, the result of this function can be treated as a logical value.
string(8bit) inflate(string(8bit)|String.Buffer|System.Memory|Stdio.Buffer data)
This function performs gzip style decompression. It can inflate a whole file at once or in blocks.
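// whole file
write(Gz.inflate()->inflate(Stdio.stdin->read(0x7fffffff)));

// streaming (blocks)
function inflate = Gz.inflate()->inflate;
string s;
// read() returns an empty string at end of file, so test the length too
while ((s = Stdio.stdin->read(8192)) && sizeof(s))
  write(inflate(s));
See also: Gz.deflate->deflate(), Gz.uncompress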
The Bz2 module contains functions to compress and uncompress strings using the same algorithm as the program bzip2. Compressing and decompressing can be done in streaming mode feeding the compress and decompress objects with arbitrarily large pieces of data.
The Bz2 module consists of three classes: Bz2.Deflate, Bz2.Inflate and Bz2.File. Bz2.Deflate is used to compress data and Bz2.Inflate is used to uncompress data. Bz2.File is used to handle bzip2 files.
Note that this module is only available if libbzip2 was available when Pike was compiled.
Note that although the functions in Inflate and Deflate use the same algorithm as bzip2, they do not use the exact same format, so you cannot directly create or unpack bzip2 files using those functions. That is why there is a third class for files.
inherit "___Bz2" : Bz2
Bz2.Deflate is a builtin program written in C. It interfaces with the packing routines in the bzlib library.
This program is only available if bzlib was available and found when Pike was compiled.
See also: Bz2.Inflate()
Bz2.Deflate Bz2.Deflate(int(1..9)|void block_size)
If given, block_size should be a number from 1 to 9 indicating the block size used when doing compression. The actual block size will be 100000 times this number. Low numbers are considered 'fast', higher numbers are considered 'slow' but give better packing. The parameter is set to 9 if it is omitted.
This function can also be used to re-initialize a Bz2.Deflate object so it can be re-used.
string deflate(string data, int(0..2)|void flush_mode)
This function performs bzip2 style compression on a string
data and returns the packed data. Streaming can be done by
calling this function several times and concatenating the
returned data.
The optional argument flush_mode should be one of the
following:
| Runs Bz2.Deflate->feed() |
| Runs Bz2.Deflate->read() |
| Runs Bz2.Deflate->finish() |
See also: Bz2.Inflate->inflate()
void feed(string data)
This function feeds the data to the internal buffers of the Deflate object. All data is buffered until a read or a finish is done.
See also: Bz2.Deflate->read(), Bz2.Deflate->finish()
string finish(string data)
This method feeds the data to the internal buffers of the Deflate object. Then it compresses all buffered data, adds an end of data marker to it, returns the compressed data as a string, and reinitializes the Deflate object.
See also: Bz2.Deflate->feed(), Bz2.Deflate->read()
string read(string data)
This function feeds the data to the internal buffers of the Deflate object. Then it compresses all buffered data and returns the compressed data as a string.
See also: Bz2.Deflate->feed(), Bz2.Deflate->finish()
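The feed and finish methods can be combined as in the sketch below, which buffers several chunks and then emits one compressed string. The helper name bz2_pack is hypothetical; passing an empty string to finish is assumed to be acceptable when no more data remains.

```pike
// Hypothetical helper: buffer the chunks, then compress them in one go.
string bz2_pack(array(string) chunks)
{
  Bz2.Deflate packer = Bz2.Deflate(9);
  foreach (chunks, string chunk)
    packer->feed(chunk);      // only buffers, returns nothing
  return packer->finish("");  // compresses and adds the end of data marker
}

int main()
{
  string packed = bz2_pack(({ "some data, ", "more data" }));
  write("%s\n", Bz2.Inflate()->inflate(packed));
  return 0;
}
```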
Low-level implementation of read/write support for Bzip2 files
This class is currently not available on Windows.
inherit Bz2::File : File
bool close()
Closes the file.
Bz2.File Bz2.File()
Bz2.File Bz2.File(string filename, void|string mode)
Creates a Bz2.File object
bool eof()
Returns 1 if EOF has been reached, 0 otherwise.
String.SplitIterator|Stdio.LineIterator line_iterator(int|void trim)
Returns an iterator that will loop over the lines in this file. If trim is true, all '\r' characters will be removed from the input.
bool open(string file, void|string mode)
Opens a file for I/O.
file: The name of the file to be opened.
mode: Mode for the file operations. Can be either "r" (read) or "w" (write). Read is the default.
string read(int len)
Reads len (uncompressed) bytes from the file. If len is omitted the whole file is read. If read is unsuccessful, 0 is returned.
function(:string) read_function(int nbytes)
Returns a function that when called will call read with
nbytes as argument. Can be used to get various callback
functions, eg for the fourth argument to
String.SplitIterator.
bool read_open(string file)
Opens a file for reading.
file: The name of the file to be opened.
int write(string data)
Writes the data to the file.
Returns the number of bytes written to the file.
bool write_open(string file)
Opens a file for writing.
file: The name of the file to be opened.
Bz2.Inflate is a builtin program written in C. It interfaces with the unpacking routines in the bzlib library.
This program is only available if bzlib was available and found when Pike was compiled.
See also: Deflate
Bz2.Inflate Bz2.Inflate()
string inflate(string data)
This function performs bzip2 style decompression. It can do decompression with arbitrarily large pieces of data. When fed with data, it decompresses as much as it can and buffers the rest.
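// Feed the compressed data to the Inflate object in 10-byte pieces.
while (i < sizeof(compressed_data)) {
  foo = compressed_data[i..i+9];
  uncompressed_concatenated_data += inflate_object->inflate(foo);
  i += 10;
}
See also: Bz2.Deflate->deflate()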
Implementation of the HPACK (RFC 7541) header packing standard.
This is the header packing system that is used in HTTP/2 (RFC 7540).
inherit "___HPack" : "___HPack"
constant int HPack.DEFAULT_HEADER_TABLE_SIZE
This is the default static maximum size of the dynamic header table.
This constant is taken from RFC 7540 section 6.5.2.
constant HPack.static_header_tab
Table of static headers. RFC 7541 appendix A, Table 1.
Note that this table is indexed starting on 0 (zero),
while the corresponding table in RFC 7541 starts on 1
(one).
protected mapping(string(8bit):int|mapping(string(8bit):int)) HPack.static_header_index
Index for static_header_tab.
Note that the indices are offset by 1 (one).
This variable should be regarded as a constant.
This variable is used to initialize the header index in the Context.
See also: static_header_tab, Context()->header_index
protected mapping(string(8bit):int|mapping(string(8bit):int)) create_index(array(array(string(8bit))) tab)
Helper function used to create the static_header_index.
string(8bit) huffman_decode(string(8bit) str)
Decodes the string str encoded with the static Huffman code specified in RFC 7541 appendix B.
str: String to decode.
Returns the decoded string.
See also: huffman_encode()
string(8bit) huffman_encode(string(8bit) str)
Encodes the string str with the static Huffman code specified in RFC 7541 appendix B.
str: String to encode.
Returns the encoded string.
See also: huffman_decode()
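A quick way to see the two functions working together is a round trip, sketched below. The helper name and the sample host name are only for illustration.

```pike
// Hypothetical helper: check that decoding undoes encoding.
int huffman_round_trip_ok(string plain)
{
  string packed = HPack.huffman_encode(plain);
  return HPack.huffman_decode(packed) == plain;
}

int main()
{
  string plain = "www.example.com";
  write("%d bytes plain, %d bytes encoded, round trip ok: %d\n",
        sizeof(plain), sizeof(HPack.huffman_encode(plain)),
        huffman_round_trip_ok(plain));
  return 0;
}
```

For typical lower-case header text the encoded form is shorter than the input, which is why the string encoder can choose between the two forms.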
protected void update_index(mapping(string(8bit):int|mapping(string(8bit):int)) index, int i, array(string(8bit)) key)
Update the specified encoder lookup index.
index: Lookup index to add an entry to.
key: Lookup key to add.
i: Value to store in the index for the key.
Flags for Context()->encode_header() et al.
constant HPack.HEADER_INDEXED
Indexed header.
constant HPack.HEADER_INDEXED_MASK
Bitmask for indexing mode.
constant HPack.HEADER_NEVER_INDEXED
Never indexed header.
constant HPack.HEADER_NOT_INDEXED
Unindexed header.
Context for an HPack encoder or decoder.
This class implements the majority of RFC 7541.
Functions of interest are typically encode() and decode().
protected array(array(string(8bit))) HPack.Context.dynamic_headers
Table of currently available dynamically defined headers.
New entries are appended last, and the first dynamic_prefix
elements are not used.
See also: header_index, add_header()
protected int HPack.Context.dynamic_max_size
Current upper size limit in bytes for dynamic_headers.
See also: set_dynamic_size()
protected int HPack.Context.dynamic_prefix
Index of the first available header in dynamic_headers.
protected int HPack.Context.dynamic_size
Current size in bytes of dynamic_headers.
protected mapping(string(8bit):int|mapping(string(8bit):int)) HPack.Context.header_index
Index into dynamic_headers and static_headers.
| Indexed on the header name in lower-case. The value is one of:
|
The index values in turn are coded as follows:
| Index into |
| Not used. |
| Inverted ( |
See also: dynamic_headers, static_header_tab, add_header()
protected int HPack.Context.static_max_size
Static upper size limit in bytes for dynamic_headers.
See also: create(), set_dynamic_size()
int(0)|int(62) add_header(string(8bit) header, string(8bit) value)
Add a header to the table of known headers and to the header index.
header: Name of header to add.
value: Value of the header.
Returns 0 (zero) if the header was too large to store.
Returns the encoding key for the header on success. This is always sizeof(static_header_tab) + 1 (i.e. 62), as new headers are prepended to the dynamic header table.
Adding a header may cause old headers to be evicted from the table.
See also: get_indexed_header()
HPack.Context HPack.Context(int|void protocol_dynamic_max_size)
Create a new HPack Context.
static_max_size: This is the static maximum size in bytes (as calculated by RFC 7541 section 4.1) of the dynamic header table. It defaults to DEFAULT_HEADER_TABLE_SIZE, and is the upper limit for set_dynamic_size().
See also: set_dynamic_size()
array(array(string(8bit)|HPackFlags)) decode(Stdio.Buffer buf)
Decode a HPack header block.
buf: Input buffer.
Returns an array of headers. Cf. decode_header().
See also: decode_header(), encode()
array(string(8bit)|HPackFlags) decode_header(Stdio.Buffer buf)
Decode a single HPack header.
buf: Input buffer.
Returns UNDEFINED on empty buffer.
Returns an array with a header and value otherwise:
| Array | |
| Name of the header. Under normal circumstances this is always lower-case, but no check is currently performed. |
| Value of the header. |
| Optional encoding flags. Only set for fields having
|
The elements in the array are in the same order and compatible
with the arguments to encode_header().
Throws on encoding errors.
The returned array MUST NOT be modified.
In future implementations the result array may get extended with a flag field.
The in-band signalling of encoding table sizes is handled internally.
See also: decode(), encode_header()
void encode(array(array(string(8bit)|HPackFlags)) headers, Stdio.Buffer buf)
Encode a full set of headers.
headers: An array of ({ header, value })-tuples.
buf: Output buffer.
See also: encode_header(), decode()
variant string(8bit) encode(array(array(string(8bit))) headers)
Convenience variant of encode().
headers: An array of ({ header, value })-tuples.
Returns the corresponding HPack encoding.
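The sketch below encodes a small header set with one Context and decodes it with another, mirroring the two ends of a connection. The helper name hpack_round_trip and the sample headers are hypothetical; only the first two elements of each decoded entry are used, since an entry may also carry encoding flags.

```pike
// Hypothetical helper: encode with one context, decode with another.
array(array) hpack_round_trip(array(array(string(8bit))) headers)
{
  HPack.Context encoder = HPack.Context();
  HPack.Context decoder = HPack.Context();
  string(8bit) block = encoder->encode(headers);
  return decoder->decode(Stdio.Buffer(block));
}

int main()
{
  array(array(string(8bit))) headers = ({
    ({ ":method", "GET" }),
    ({ "x-example", "demo" }),
  });
  foreach (hpack_round_trip(headers), array header)
    write("%s: %s\n", header[0], header[1]);
  return 0;
}
```

Separate contexts are used because each side of a connection maintains its own dynamic header table.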
void encode_header(Stdio.Buffer buf, string(8bit) header, string(8bit) value, HPackFlags|void flags)
Encode a single HPack header.
buf: Output buffer.
header: Name of header. This should under normal circumstances be a lower-case string, but this is currently not checked.
value: Header value.
flags: Optional encoding flags.
See also: encode(), decode_header()
protected void evict_dynamic_headers()
Evict dynamic headers until dynamic_size goes below
dynamic_max_size.
array(string(8bit)) get_indexed_header(int(1..) index)
Lookup a known header.
index: Encoding key for the header to retrieve.
Returns UNDEFINED on unknown header.
Returns an array with a header and value otherwise:
| Array | |
| Name of the header. Under normal circumstances this is always lower-case, but no check is currently performed. |
| Value of the header. |
See also: add_header()
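As a sketch of how add_header() and get_indexed_header() relate, the example below stores one header and looks it up again by the returned encoding key. The helper name and the header "x-example" are hypothetical.

```pike
// Hypothetical helper: add a header and fetch it back by its encoding key.
array(string) add_and_lookup(string(8bit) name, string(8bit) value)
{
  HPack.Context ctx = HPack.Context();
  int key = ctx->add_header(name, value);
  if (!key) return 0;  // the header was too large to store
  return ctx->get_indexed_header(key);
}

int main()
{
  write("%O\n", add_and_lookup("x-example", "demo"));
  return 0;
}
```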
protected void put_int(Stdio.Buffer buf, int(8bit) bits, int(8bit) mask, int value)
Encode an integer with the HPack integer encoding.
buf: Output buffer.
bits: Bits that should always be set in the first byte of output.
mask: Bitmask for the value part of the first byte of output.
value: Integer value to encode.
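The integer encoding itself is defined in RFC 7541 section 5.1. The standalone sketch below follows that definition using the same bits and mask parameters; it is not the module's put_int, the name hpack_int is hypothetical, and it returns the bytes as an array instead of writing to a Stdio.Buffer.

```pike
// Hypothetical stand-alone version of the RFC 7541 section 5.1 encoding.
array(int) hpack_int(int bits, int mask, int value)
{
  // Small values fit in the prefix bits of the first byte.
  if (value < mask) return ({ bits | value });

  // Otherwise fill the prefix and emit the rest in 7-bit groups,
  // least significant group first, with the high bit as continuation flag.
  array(int) out = ({ bits | mask });
  value -= mask;
  while (value >= 128) {
    out += ({ (value & 127) | 128 });
    value >>= 7;
  }
  return out + ({ value });
}

int main()
{
  // 5-bit prefix (mask 31): 10 fits in one byte, 1337 needs three.
  write("%O\n", hpack_int(0, 31, 10));
  write("%O\n", hpack_int(0, 31, 1337));
  return 0;
}
```

With a 5-bit prefix, 1337 encodes as the bytes 31, 154, 10, which matches the worked example in the RFC.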
protected void put_string(Stdio.Buffer buf, string(8bit) str)
Encode a string with the HPack string encoding.
buf: Output buffer.
str: String to output.
The encoder will huffman_encode() the string if that renders a shorter encoding than the verbatim string.
void set_dynamic_size(Stdio.Buffer buf, int(0..) new_max_size)
Set the dynamic maximum size of the dynamic header lookup table.
buf: Output buffer.
new_max_size: New dynamic maximum size in bytes (as calculated by RFC 7541 section 4.1).
This function can be used to clear the dynamic header table by setting the size to zero.
Also note that new_max_size has an upper bound that is limited by static_max_size.
See also: encode_header(), encode(), create()