Diffstat (limited to 'winsup/bz2lib/manual_3.html')
-rw-r--r-- | winsup/bz2lib/manual_3.html | 1773 |
1 files changed, 0 insertions, 1773 deletions
diff --git a/winsup/bz2lib/manual_3.html b/winsup/bz2lib/manual_3.html deleted file mode 100644 index a8fa7e682..000000000 --- a/winsup/bz2lib/manual_3.html +++ /dev/null @@ -1,1773 +0,0 @@ -<HTML> -<HEAD> -<!-- This HTML file has been created by texi2html 1.54 - from manual.texi on 23 March 2000 --> - -<TITLE>bzip2 and libbzip2 - Programming with libbzip2</TITLE> -<link href="manual_4.html" rel=Next> -<link href="manual_2.html" rel=Previous> -<link href="manual_toc.html" rel=ToC> - -</HEAD> -<BODY> -<p>Go to the <A HREF="manual_1.html">first</A>, <A HREF="manual_2.html">previous</A>, <A HREF="manual_4.html">next</A>, <A HREF="manual_4.html">last</A> section, <A HREF="manual_toc.html">table of contents</A>. -<P><HR><P> - - -<H1><A NAME="SEC12" HREF="manual_toc.html#TOC12">Programming with <CODE>libbzip2</CODE></A></H1> - -<P> -This chapter describes the programming interface to <CODE>libbzip2</CODE>. - -</P> -<P> -For general background information, particularly about memory -use and performance aspects, you'd be well advised to read Chapter 2 -as well. - -</P> - - -<H2><A NAME="SEC13" HREF="manual_toc.html#TOC13">Top-level structure</A></H2> - -<P> -<CODE>libbzip2</CODE> is a flexible library for compressing and decompressing -data in the <CODE>bzip2</CODE> data format. Although packaged as a single -entity, it helps to regard the library as three separate parts: the -low-level interface, the high-level interface, and some utility -functions. - -</P> -<P> -The structure of <CODE>libbzip2</CODE>'s interfaces is similar to -that of Jean-loup Gailly's and Mark Adler's excellent <CODE>zlib</CODE> -library. - -</P> -<P> -All externally visible symbols have names beginning <CODE>BZ2_</CODE>. -This is new in version 1.0. The intention is to minimise pollution -of the namespaces of library clients. 
- -</P> - - -<H3><A NAME="SEC14" HREF="manual_toc.html#TOC14">Low-level summary</A></H3> - -<P> -This interface provides services for compressing and decompressing -data in memory. There's no provision for dealing with files, streams -or any other I/O mechanisms, just straight memory-to-memory work. -In fact, this part of the library can be compiled without inclusion -of <CODE>stdio.h</CODE>, which may be helpful for embedded applications. - -</P> -<P> -The low-level part of the library has no global variables and -is therefore thread-safe. - -</P> -<P> -Six routines make up the low level interface: -<CODE>BZ2_bzCompressInit</CODE>, <CODE>BZ2_bzCompress</CODE>, and <BR> <CODE>BZ2_bzCompressEnd</CODE> -for compression, -and a corresponding trio <CODE>BZ2_bzDecompressInit</CODE>, <BR> <CODE>BZ2_bzDecompress</CODE> -and <CODE>BZ2_bzDecompressEnd</CODE> for decompression. -The <CODE>*Init</CODE> functions allocate -memory for compression/decompression and do other -initialisations, whilst the <CODE>*End</CODE> functions close down operations -and release memory. - -</P> -<P> -The real work is done by <CODE>BZ2_bzCompress</CODE> and <CODE>BZ2_bzDecompress</CODE>. -These compress and decompress data from a user-supplied input buffer -to a user-supplied output buffer. These buffers can be any size; -arbitrary quantities of data are handled by making repeated calls -to these functions. This is a flexible mechanism allowing a -consumer-pull style of activity, or producer-push, or a mixture of -both. - -</P> - - - -<H3><A NAME="SEC15" HREF="manual_toc.html#TOC15">High-level summary</A></H3> - -<P> -This interface provides some handy wrappers around the low-level -interface to facilitate reading and writing <CODE>bzip2</CODE> format -files (<CODE>.bz2</CODE> files). 
The routines provide hooks to facilitate -reading files in which the <CODE>bzip2</CODE> data stream is embedded -within some larger-scale file structure, or where there are -multiple <CODE>bzip2</CODE> data streams concatenated end-to-end. - -</P> -<P> -For reading files, <CODE>BZ2_bzReadOpen</CODE>, <CODE>BZ2_bzRead</CODE>, -<CODE>BZ2_bzReadClose</CODE> and <BR> <CODE>BZ2_bzReadGetUnused</CODE> are supplied. For -writing files, <CODE>BZ2_bzWriteOpen</CODE>, <CODE>BZ2_bzWrite</CODE> and -<CODE>BZ2_bzWriteClose</CODE> are available. - -</P> -<P> -As with the low-level library, no global variables are used -so the library is per se thread-safe. However, if I/O errors -occur whilst reading or writing the underlying compressed files, -you may have to consult <CODE>errno</CODE> to determine the cause of -the error. In that case, you'd need a C library which correctly -supports <CODE>errno</CODE> in a multithreaded environment. - -</P> -<P> -To make the library a little simpler and more portable, -<CODE>BZ2_bzReadOpen</CODE> and <CODE>BZ2_bzWriteOpen</CODE> require you to pass them file -handles (<CODE>FILE*</CODE>s) which have previously been opened for reading or -writing respectively. That avoids portability problems associated with -file operations and file attributes, whilst not being much of an -imposition on the programmer. - -</P> - - - -<H3><A NAME="SEC16" HREF="manual_toc.html#TOC16">Utility functions summary</A></H3> -<P> -For very simple needs, <CODE>BZ2_bzBuffToBuffCompress</CODE> and -<CODE>BZ2_bzBuffToBuffDecompress</CODE> are provided. These compress -data in memory from one buffer to another buffer in a single -function call. You should assess whether these functions -fulfill your memory-to-memory compression/decompression -requirements before investing effort in understanding the more -general but more complex low-level interface. 
- -</P> -<P> -Yoshioka Tsuneo (<CODE>QWF00133@niftyserve.or.jp</CODE> / -<CODE>tsuneo-y@is.aist-nara.ac.jp</CODE>) has contributed some functions to -give better <CODE>zlib</CODE> compatibility. These functions are -<CODE>BZ2_bzopen</CODE>, <CODE>BZ2_bzread</CODE>, <CODE>BZ2_bzwrite</CODE>, <CODE>BZ2_bzflush</CODE>, -<CODE>BZ2_bzclose</CODE>, -<CODE>BZ2_bzerror</CODE> and <CODE>BZ2_bzlibVersion</CODE>. You may find these functions -more convenient for simple file reading and writing, than those in the -high-level interface. These functions are not (yet) officially part of -the library, and are minimally documented here. If they break, you -get to keep all the pieces. I hope to document them properly when time -permits. - -</P> -<P> -Yoshioka also contributed modifications to allow the library to be -built as a Windows DLL. - -</P> - - - -<H2><A NAME="SEC17" HREF="manual_toc.html#TOC17">Error handling</A></H2> - -<P> -The library is designed to recover cleanly in all situations, including -the worst-case situation of decompressing random data. I'm not -100% sure that it can always do this, so you might want to add -a signal handler to catch segmentation violations during decompression -if you are feeling especially paranoid. I would be interested in -hearing more about the robustness of the library to corrupted -compressed data. - -</P> -<P> -Version 1.0 is much more robust in this respect than -0.9.0 or 0.9.5. Investigations with Checker (a tool for -detecting problems with memory management, similar to Purify) -indicate that, at least for the few files I tested, all single-bit -errors in the decompressed data are caught properly, with no -segmentation faults, no reads of uninitialised data and no -out of range reads or writes. So it's certainly much improved, -although I wouldn't claim it to be totally bombproof. - -</P> -<P> -The file <CODE>bzlib.h</CODE> contains all definitions needed to use -the library. 
In particular, you should definitely not include -<CODE>bzlib_private.h</CODE>. - -</P> -<P> -In <CODE>bzlib.h</CODE>, the various return values are defined. The following -list is not intended as an exhaustive description of the circumstances -in which a given value may be returned -- those descriptions are given -later. Rather, it is intended to convey the rough meaning of each -return value. The first five values are normal and do not -denote an error situation. -<DL COMPACT> - -<DT><CODE>BZ_OK</CODE> -<DD> -The requested action was completed successfully. -<DT><CODE>BZ_RUN_OK</CODE> -<DD> -<DT><CODE>BZ_FLUSH_OK</CODE> -<DD> -<DT><CODE>BZ_FINISH_OK</CODE> -<DD> -In <CODE>BZ2_bzCompress</CODE>, the requested flush/finish/nothing-special action -was completed successfully. -<DT><CODE>BZ_STREAM_END</CODE> -<DD> -Compression of data was completed, or the logical stream end was -detected during decompression. -</DL> - -<P> -The following return values indicate an error of some kind. -<DL COMPACT> - -<DT><CODE>BZ_CONFIG_ERROR</CODE> -<DD> -Indicates that the library has been improperly compiled on your -platform -- a major configuration error. Specifically, it means -that <CODE>sizeof(char)</CODE>, <CODE>sizeof(short)</CODE> and <CODE>sizeof(int)</CODE> -are not 1, 2 and 4 respectively, as they should be. Note that the -library should still work properly on 64-bit platforms which follow -the LP64 programming model -- that is, where <CODE>sizeof(long)</CODE> -and <CODE>sizeof(void*)</CODE> are 8. Under LP64, <CODE>sizeof(int)</CODE> is -still 4, so <CODE>libbzip2</CODE>, which doesn't use the <CODE>long</CODE> type, -is OK. -<DT><CODE>BZ_SEQUENCE_ERROR</CODE> -<DD> -When using the library, it is important to call the functions in the -correct sequence and with data structures (buffers etc) in the correct -states. <CODE>libbzip2</CODE> checks as much as it can to ensure this is -happening, and returns <CODE>BZ_SEQUENCE_ERROR</CODE> if not. 
Code which -complies precisely with the function semantics, as detailed below, -should never receive this value; such an event denotes buggy code -which you should investigate. -<DT><CODE>BZ_PARAM_ERROR</CODE> -<DD> -Returned when a parameter to a function call is out of range -or otherwise manifestly incorrect. As with <CODE>BZ_SEQUENCE_ERROR</CODE>, -this denotes a bug in the client code. The distinction between -<CODE>BZ_PARAM_ERROR</CODE> and <CODE>BZ_SEQUENCE_ERROR</CODE> is a bit hazy, but still worth -making. -<DT><CODE>BZ_MEM_ERROR</CODE> -<DD> -Returned when a request to allocate memory failed. Note that the -quantity of memory needed to decompress a stream cannot be determined -until the stream's header has been read. So <CODE>BZ2_bzDecompress</CODE> and -<CODE>BZ2_bzRead</CODE> may return <CODE>BZ_MEM_ERROR</CODE> even though some of -the compressed data has been read. The same is not true for -compression; once <CODE>BZ2_bzCompressInit</CODE> or <CODE>BZ2_bzWriteOpen</CODE> have -successfully completed, <CODE>BZ_MEM_ERROR</CODE> cannot occur. -<DT><CODE>BZ_DATA_ERROR</CODE> -<DD> -Returned when a data integrity error is detected during decompression. -Most importantly, this means when stored and computed CRCs for the -data do not match. This value is also returned upon detection of any -other anomaly in the compressed data. -<DT><CODE>BZ_DATA_ERROR_MAGIC</CODE> -<DD> -As a special case of <CODE>BZ_DATA_ERROR</CODE>, it is sometimes useful to -know when the compressed stream does not start with the correct -magic bytes (<CODE>'B' 'Z' 'h'</CODE>). -<DT><CODE>BZ_IO_ERROR</CODE> -<DD> -Returned by <CODE>BZ2_bzRead</CODE> and <CODE>BZ2_bzWrite</CODE> when there is an error -reading or writing in the compressed file, and by <CODE>BZ2_bzReadOpen</CODE> -and <CODE>BZ2_bzWriteOpen</CODE> for attempts to use a file for which the -error indicator (viz, <CODE>ferror(f)</CODE>) is set. 
-On receipt of <CODE>BZ_IO_ERROR</CODE>, the caller should consult -<CODE>errno</CODE> and/or <CODE>perror</CODE> to acquire operating-system -specific information about the problem. -<DT><CODE>BZ_UNEXPECTED_EOF</CODE> -<DD> -Returned by <CODE>BZ2_bzRead</CODE> when the compressed file finishes -before the logical end of stream is detected. -<DT><CODE>BZ_OUTBUFF_FULL</CODE> -<DD> -Returned by <CODE>BZ2_bzBuffToBuffCompress</CODE> and -<CODE>BZ2_bzBuffToBuffDecompress</CODE> to indicate that the output data -will not fit into the output buffer provided. -</DL> - - - -<H2><A NAME="SEC18" HREF="manual_toc.html#TOC18">Low-level interface</A></H2> - - - -<H3><A NAME="SEC19" HREF="manual_toc.html#TOC19"><CODE>BZ2_bzCompressInit</CODE></A></H3> - -<PRE> -typedef - struct { - char *next_in; - unsigned int avail_in; - unsigned int total_in_lo32; - unsigned int total_in_hi32; - - char *next_out; - unsigned int avail_out; - unsigned int total_out_lo32; - unsigned int total_out_hi32; - - void *state; - - void *(*bzalloc)(void *,int,int); - void (*bzfree)(void *,void *); - void *opaque; - } - bz_stream; - -int BZ2_bzCompressInit ( bz_stream *strm, - int blockSize100k, - int verbosity, - int workFactor ); - -</PRE> - -<P> -Prepares for compression. The <CODE>bz_stream</CODE> structure -holds all data pertaining to the compression activity. -A <CODE>bz_stream</CODE> structure should be allocated and initialised -prior to the call. -The fields of <CODE>bz_stream</CODE> -comprise the entirety of the user-visible data. <CODE>state</CODE> -is a pointer to the private data structures required for compression. - -</P> -<P> -Custom memory allocators are supported, via fields <CODE>bzalloc</CODE>, -<CODE>bzfree</CODE>, -and <CODE>opaque</CODE>. The value -<CODE>opaque</CODE> is passed as the first argument to -all calls to <CODE>bzalloc</CODE> and <CODE>bzfree</CODE>, but is -otherwise ignored by the library. 
-The call <CODE>bzalloc ( opaque, n, m )</CODE> is expected to return a -pointer <CODE>p</CODE> to -<CODE>n * m</CODE> bytes of memory, and <CODE>bzfree ( opaque, p )</CODE> -should free -that memory. - -</P> -<P> -If you don't want to use a custom memory allocator, set <CODE>bzalloc</CODE>, -<CODE>bzfree</CODE> and -<CODE>opaque</CODE> to <CODE>NULL</CODE>, -and the library will then use the standard <CODE>malloc</CODE>/<CODE>free</CODE> -routines. - -</P> -<P> -Before calling <CODE>BZ2_bzCompressInit</CODE>, fields <CODE>bzalloc</CODE>, -<CODE>bzfree</CODE> and <CODE>opaque</CODE> should -be filled appropriately, as just described. Upon return, the internal -state will have been allocated and initialised, and <CODE>total_in_lo32</CODE>, -<CODE>total_in_hi32</CODE>, <CODE>total_out_lo32</CODE> and -<CODE>total_out_hi32</CODE> will have been set to zero. -These four fields are used by the library -to inform the caller of the total amount of data passed into and out of -the library, respectively. You should not try to change them. -As of version 1.0, 64-bit counts are maintained, even on 32-bit -platforms, using the <CODE>_hi32</CODE> fields to store the upper 32 bits -of the count. So, for example, the total amount of data in -is <CODE>(total_in_hi32 << 32) + total_in_lo32</CODE>. - -</P> -<P> -Parameter <CODE>blockSize100k</CODE> specifies the block size to be used for -compression. It should be a value between 1 and 9 inclusive, and the -actual block size used is 100000 x this figure. 9 gives the best -compression but takes most memory. - -</P> -<P> -Parameter <CODE>verbosity</CODE> should be set to a number between 0 and 4 -inclusive. 0 is silent, and greater numbers give increasingly verbose -monitoring/debugging output. If the library has been compiled with -<CODE>-DBZ_NO_STDIO</CODE>, no such output will appear for any verbosity -setting. 
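</P>
<P>
A note on the 64-bit counters described above. Read literally as C, the expression
<CODE>(total_in_hi32 << 32) + total_in_lo32</CODE> shifts an <CODE>unsigned int</CODE>
by its full width, which is undefined behaviour on platforms where <CODE>unsigned int</CODE>
is 32 bits wide. The following is a minimal sketch of one way to combine the two halves safely;
the helper name <CODE>combine_count</CODE> is hypothetical and is not part of the library.
</P>

```c
#include <stdint.h>

/* Combine the two 32-bit halves of a byte count into one 64-bit value.
   The high half is widened to 64 bits BEFORE the shift, so the shift
   is well defined. combine_count is an illustrative name only. */
static uint64_t combine_count(unsigned int hi32, unsigned int lo32)
{
    return ((uint64_t)hi32 << 32) | (uint64_t)lo32;
}
```

<P>
For a stream <CODE>strm</CODE>, the total input so far would then be
<CODE>combine_count(strm.total_in_hi32, strm.total_in_lo32)</CODE>, and likewise for the output counters.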
- -</P> -<P> -Parameter <CODE>workFactor</CODE> controls how the compression phase behaves -when presented with worst case, highly repetitive, input data. If -compression runs into difficulties caused by repetitive data, the -library switches from the standard sorting algorithm to a fallback -algorithm. The fallback is slower than the standard algorithm by -perhaps a factor of three, but always behaves reasonably, no matter how -bad the input. - -</P> -<P> -Lower values of <CODE>workFactor</CODE> reduce the amount of effort the -standard algorithm will expend before resorting to the fallback. You -should set this parameter carefully; too low, and many inputs will be -handled by the fallback algorithm and so compress rather slowly, too -high, and your average-to-worst case compression times can become very -large. The default value of 30 gives reasonable behaviour over a wide -range of circumstances. - -</P> -<P> -Allowable values range from 0 to 250 inclusive. 0 is a special case, -equivalent to using the default value of 30. - -</P> -<P> -Note that the compressed output generated is the same regardless of -whether or not the fallback algorithm is used. - -</P> -<P> -Be aware also that this parameter may disappear entirely in future -versions of the library. In principle it should be possible to devise a -good way to automatically choose which algorithm to use. Such a -mechanism would render the parameter obsolete. 
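</P>
<P>
The documented ranges for the three tuning parameters of <CODE>BZ2_bzCompressInit</CODE>
can be checked by the caller before the call is made. The following is a minimal sketch only:
the helper names are hypothetical, they are not functions of the library, and the library
performs its own checks regardless.
</P>

```c
/* Illustrative helpers; the ranges are the ones documented above. */
enum { DEFAULT_WORK_FACTOR = 30 };

/* Return the work factor that would effectively apply: 0 maps to the
   documented default of 30. Returns -1 if outside the range 0..250. */
static int effective_work_factor(int workFactor)
{
    if (workFactor < 0 || workFactor > 250) return -1;
    return workFactor == 0 ? DEFAULT_WORK_FACTOR : workFactor;
}

/* Return 1 if all three arguments lie in their documented ranges,
   0 otherwise. */
static int compress_params_ok(int blockSize100k, int verbosity, int workFactor)
{
    return blockSize100k >= 1 && blockSize100k <= 9
        && verbosity >= 0 && verbosity <= 4
        && effective_work_factor(workFactor) != -1;
}
```

<P>
A caller might use <CODE>compress_params_ok</CODE> to reject bad user-supplied settings early,
rather than waiting for <CODE>BZ_PARAM_ERROR</CODE> from the library.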
- -</P> -<P> -Possible return values: - -<PRE> - <CODE>BZ_CONFIG_ERROR</CODE> - if the library has been mis-compiled - <CODE>BZ_PARAM_ERROR</CODE> - if <CODE>strm</CODE> is <CODE>NULL</CODE> - or <CODE>blockSize100k</CODE> < 1 or <CODE>blockSize100k</CODE> > 9 - or <CODE>verbosity</CODE> < 0 or <CODE>verbosity</CODE> > 4 - or <CODE>workFactor</CODE> < 0 or <CODE>workFactor</CODE> > 250 - <CODE>BZ_MEM_ERROR</CODE> - if not enough memory is available - <CODE>BZ_OK</CODE> - otherwise -</PRE> - -<P> -Allowable next actions: - -<PRE> - <CODE>BZ2_bzCompress</CODE> - if <CODE>BZ_OK</CODE> is returned - no specific action needed in case of error -</PRE> - - - -<H3><A NAME="SEC20" HREF="manual_toc.html#TOC20"><CODE>BZ2_bzCompress</CODE></A></H3> - -<PRE> - int BZ2_bzCompress ( bz_stream *strm, int action ); -</PRE> - -<P> -Provides more input and/or output buffer space for the library. The -caller maintains input and output buffers, and calls <CODE>BZ2_bzCompress</CODE> to -transfer data between them. - -</P> -<P> -Before each call to <CODE>BZ2_bzCompress</CODE>, <CODE>next_in</CODE> should point at -the data to be compressed, and <CODE>avail_in</CODE> should indicate how many -bytes the library may read. <CODE>BZ2_bzCompress</CODE> updates <CODE>next_in</CODE>, -<CODE>avail_in</CODE> and <CODE>total_in</CODE> to reflect the number of bytes it -has read. - -</P> -<P> -Similarly, <CODE>next_out</CODE> should point to a buffer in which the -compressed data is to be placed, with <CODE>avail_out</CODE> indicating how -much output space is available. <CODE>BZ2_bzCompress</CODE> updates -<CODE>next_out</CODE>, <CODE>avail_out</CODE> and <CODE>total_out</CODE> to reflect the -number of bytes output. - -</P> -<P> -You may provide and remove as little or as much data as you like on each -call of <CODE>BZ2_bzCompress</CODE>. In the limit, it is acceptable to supply and -remove data one byte at a time, although this would be terribly -inefficient. 
You should always ensure that at least one byte of output -space is available at each call. - -</P> -<P> -A second purpose of <CODE>BZ2_bzCompress</CODE> is to request a change of mode of the -compressed stream. - -</P> -<P> -Conceptually, a compressed stream can be in one of four states: IDLE, -RUNNING, FLUSHING and FINISHING. Before initialisation -(<CODE>BZ2_bzCompressInit</CODE>) and after termination (<CODE>BZ2_bzCompressEnd</CODE>), a -stream is regarded as IDLE. - -</P> -<P> -Upon initialisation (<CODE>BZ2_bzCompressInit</CODE>), the stream is placed in the -RUNNING state. Subsequent calls to <CODE>BZ2_bzCompress</CODE> should pass -<CODE>BZ_RUN</CODE> as the requested action; other actions are illegal and -will result in <CODE>BZ_SEQUENCE_ERROR</CODE>. - -</P> -<P> -At some point, the calling program will have provided all the input data -it wants to. It will then want to finish up -- in effect, asking the -library to process any data it might have buffered internally. In this -state, <CODE>BZ2_bzCompress</CODE> will no longer attempt to read data from -<CODE>next_in</CODE>, but it will want to write data to <CODE>next_out</CODE>. -Because the output buffer supplied by the user can be arbitrarily small, -the finishing-up operation cannot necessarily be done with a single call -of <CODE>BZ2_bzCompress</CODE>. - -</P> -<P> -Instead, the calling program passes <CODE>BZ_FINISH</CODE> as an action to -<CODE>BZ2_bzCompress</CODE>. This changes the stream's state to FINISHING. Any -remaining input (ie, <CODE>next_in[0 .. avail_in-1]</CODE>) is compressed and -transferred to the output buffer. To do this, <CODE>BZ2_bzCompress</CODE> must be -called repeatedly until all the output has been consumed. At that -point, <CODE>BZ2_bzCompress</CODE> returns <CODE>BZ_STREAM_END</CODE>, and the stream's -state is set back to IDLE. <CODE>BZ2_bzCompressEnd</CODE> should then be -called. 
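</P>
<P>
The state transitions described in this section, including the flush action covered further on,
can be modelled as a small pure function. The following is a sketch of the documented behaviour
only, not of the library's implementation; all type, constant and function names are hypothetical
and deliberately differ from the identifiers in <CODE>bzlib.h</CODE>.
</P>

```c
/* Hypothetical names modelling the documented states, actions and
   return values; these are NOT the bzlib.h identifiers. */
typedef enum { ST_IDLE, ST_RUNNING, ST_FLUSHING, ST_FINISHING } stream_state;
typedef enum { ACT_RUN, ACT_FLUSH, ACT_FINISH } stream_action;
typedef enum {
    RES_RUN_OK, RES_FLUSH_OK, RES_FINISH_OK, RES_STREAM_END, RES_SEQUENCE_ERROR
} stream_result;

/* Advance the model by one call. `drained` is nonzero when all pending
   input has been used up and all compressed output has been removed. */
static stream_result model_step(stream_state *st, stream_action act, int drained)
{
    switch (*st) {
    case ST_RUNNING:
        switch (act) {
        case ACT_RUN:    return RES_RUN_OK;
        case ACT_FLUSH:  *st = ST_FLUSHING;  return RES_FLUSH_OK;
        case ACT_FINISH: *st = ST_FINISHING; return RES_FINISH_OK;
        }
        break;
    case ST_FLUSHING:
        if (act != ACT_FLUSH) return RES_SEQUENCE_ERROR;
        if (drained) { *st = ST_RUNNING; return RES_RUN_OK; }
        return RES_FLUSH_OK;
    case ST_FINISHING:
        if (act != ACT_FINISH) return RES_SEQUENCE_ERROR;
        if (drained) { *st = ST_IDLE; return RES_STREAM_END; }
        return RES_FINISH_OK;
    case ST_IDLE:
        break; /* every action is illegal before init or after end */
    }
    return RES_SEQUENCE_ERROR;
}
```

<P>
In this model a caller that starts in the running state and keeps passing the finish action
sees the finish result until the stream is drained, and the stream-end result once it is.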
- -</P> -<P> -Just to make sure the calling program does not cheat, the library makes -a note of <CODE>avail_in</CODE> at the time of the first call to -<CODE>BZ2_bzCompress</CODE> which has <CODE>BZ_FINISH</CODE> as an action (ie, at the -time the program has announced its intention to not supply any more -input). By comparing this value with that of <CODE>avail_in</CODE> over -subsequent calls to <CODE>BZ2_bzCompress</CODE>, the library can detect any -attempts to slip in more data to compress. Any calls for which this is -detected will return <CODE>BZ_SEQUENCE_ERROR</CODE>. This indicates a -programming mistake which should be corrected. - -</P> -<P> -Instead of asking to finish, the calling program may ask -<CODE>BZ2_bzCompress</CODE> to take all the remaining input, compress it and -terminate the current (Burrows-Wheeler) compression block. This could -be useful for error control purposes. The mechanism is analogous to -that for finishing: call <CODE>BZ2_bzCompress</CODE> with an action of -<CODE>BZ_FLUSH</CODE>, remove output data, and persist with the -<CODE>BZ_FLUSH</CODE> action until the value <CODE>BZ_RUN_OK</CODE> is returned. As -with finishing, <CODE>BZ2_bzCompress</CODE> detects any attempt to provide more -input data once the flush has begun. - -</P> -<P> -Once the flush is complete, the stream returns to the normal RUNNING -state. - -</P> -<P> -This all sounds pretty complex, but isn't really. Here's a table -which shows which actions are allowable in each state, what action -will be taken, what the next state is, and what the non-error return -values are. Note that you can't explicitly ask what state the -stream is in, but nor do you need to -- it can be inferred from the -values returned by <CODE>BZ2_bzCompress</CODE>. - -<PRE> -IDLE/<CODE>any</CODE> - Illegal. IDLE state only exists after <CODE>BZ2_bzCompressEnd</CODE> or - before <CODE>BZ2_bzCompressInit</CODE>. 
- Return value = <CODE>BZ_SEQUENCE_ERROR</CODE> - -RUNNING/<CODE>BZ_RUN</CODE> - Compress from <CODE>next_in</CODE> to <CODE>next_out</CODE> as much as possible. - Next state = RUNNING - Return value = <CODE>BZ_RUN_OK</CODE> - -RUNNING/<CODE>BZ_FLUSH</CODE> - Remember current value of <CODE>next_in</CODE>. Compress from <CODE>next_in</CODE> - to <CODE>next_out</CODE> as much as possible, but do not accept any more input. - Next state = FLUSHING - Return value = <CODE>BZ_FLUSH_OK</CODE> - -RUNNING/<CODE>BZ_FINISH</CODE> - Remember current value of <CODE>next_in</CODE>. Compress from <CODE>next_in</CODE> - to <CODE>next_out</CODE> as much as possible, but do not accept any more input. - Next state = FINISHING - Return value = <CODE>BZ_FINISH_OK</CODE> - -FLUSHING/<CODE>BZ_FLUSH</CODE> - Compress from <CODE>next_in</CODE> to <CODE>next_out</CODE> as much as possible, - but do not accept any more input. - If all the existing input has been used up and all compressed - output has been removed - Next state = RUNNING; Return value = <CODE>BZ_RUN_OK</CODE> - else - Next state = FLUSHING; Return value = <CODE>BZ_FLUSH_OK</CODE> - -FLUSHING/other - Illegal. - Return value = <CODE>BZ_SEQUENCE_ERROR</CODE> - -FINISHING/<CODE>BZ_FINISH</CODE> - Compress from <CODE>next_in</CODE> to <CODE>next_out</CODE> as much as possible, - but do not accept any more input. - If all the existing input has been used up and all compressed - output has been removed - Next state = IDLE; Return value = <CODE>BZ_STREAM_END</CODE> - else - Next state = FINISHING; Return value = <CODE>BZ_FINISH_OK</CODE> - -FINISHING/other - Illegal. - Return value = <CODE>BZ_SEQUENCE_ERROR</CODE> -</PRE> - -<P> -That still looks complicated? Well, fair enough. The usual sequence -of calls for compressing a load of data is: - -<UL> -<LI>Get started with <CODE>BZ2_bzCompressInit</CODE>. 
- -<LI>Shovel data in and shlurp out its compressed form using zero or more - -calls of <CODE>BZ2_bzCompress</CODE> with action = <CODE>BZ_RUN</CODE>. -<LI>Finish up. - -Repeatedly call <CODE>BZ2_bzCompress</CODE> with action = <CODE>BZ_FINISH</CODE>, -copying out the compressed output, until <CODE>BZ_STREAM_END</CODE> is returned. -<LI>Close up and go home. Call <CODE>BZ2_bzCompressEnd</CODE>. - -</UL> - -<P> -If the data you want to compress fits into your input buffer all -at once, you can skip the calls of <CODE>BZ2_bzCompress ( ..., BZ_RUN )</CODE> and -just do the <CODE>BZ2_bzCompress ( ..., BZ_FINISH )</CODE> calls. - -</P> -<P> -All required memory is allocated by <CODE>BZ2_bzCompressInit</CODE>. The -compression library can accept any data at all (obviously). So you -shouldn't get any error return values from the <CODE>BZ2_bzCompress</CODE> calls. -If you do, they will be <CODE>BZ_SEQUENCE_ERROR</CODE>, and indicate a bug in -your programming. - -</P> -<P> -Trivial other possible return values: - -<PRE> - <CODE>BZ_PARAM_ERROR</CODE> - if <CODE>strm</CODE> is <CODE>NULL</CODE>, or <CODE>strm->state</CODE> is <CODE>NULL</CODE> -</PRE> - - - -<H3><A NAME="SEC21" HREF="manual_toc.html#TOC21"><CODE>BZ2_bzCompressEnd</CODE></A></H3> - -<PRE> -int BZ2_bzCompressEnd ( bz_stream *strm ); -</PRE> - -<P> -Releases all memory associated with a compression stream. - -</P> -<P> -Possible return values: - -<PRE> - <CODE>BZ_PARAM_ERROR</CODE> if <CODE>strm</CODE> is <CODE>NULL</CODE> or <CODE>strm->state</CODE> is <CODE>NULL</CODE> - <CODE>BZ_OK</CODE> otherwise -</PRE> - - - -<H3><A NAME="SEC22" HREF="manual_toc.html#TOC22"><CODE>BZ2_bzDecompressInit</CODE></A></H3> - -<PRE> -int BZ2_bzDecompressInit ( bz_stream *strm, int verbosity, int small ); -</PRE> - -<P> -Prepares for decompression. As with <CODE>BZ2_bzCompressInit</CODE>, a -<CODE>bz_stream</CODE> record should be allocated and initialised before the -call. 
Fields <CODE>bzalloc</CODE>, <CODE>bzfree</CODE> and <CODE>opaque</CODE> should be -set if a custom memory allocator is required, or made <CODE>NULL</CODE> for -the normal <CODE>malloc</CODE>/<CODE>free</CODE> routines. Upon return, the internal -state will have been initialised, and <CODE>total_in</CODE> and -<CODE>total_out</CODE> will be zero. - -</P> -<P> -For the meaning of parameter <CODE>verbosity</CODE>, see <CODE>BZ2_bzCompressInit</CODE>. - -</P> -<P> -If <CODE>small</CODE> is nonzero, the library will use an alternative -decompression algorithm which uses less memory but at the cost of -decompressing more slowly (roughly speaking, half the speed, but the -maximum memory requirement drops to around 2300k). See Chapter 2 for -more information on memory management. - -</P> -<P> -Note that the amount of memory needed to decompress -a stream cannot be determined until the stream's header has been read, -so even if <CODE>BZ2_bzDecompressInit</CODE> succeeds, a subsequent -<CODE>BZ2_bzDecompress</CODE> could fail with <CODE>BZ_MEM_ERROR</CODE>. - -</P> -<P> -Possible return values: - -<PRE> - <CODE>BZ_CONFIG_ERROR</CODE> - if the library has been mis-compiled - <CODE>BZ_PARAM_ERROR</CODE> - if <CODE>(small != 0 && small != 1)</CODE> - or <CODE>(verbosity < 0 || verbosity > 4)</CODE> - <CODE>BZ_MEM_ERROR</CODE> - if insufficient memory is available -</PRE> - -<P> -Allowable next actions: - -<PRE> - <CODE>BZ2_bzDecompress</CODE> - if <CODE>BZ_OK</CODE> was returned - no specific action required in case of error -</PRE> - -<P> - - -</P> - - -<H3><A NAME="SEC23" HREF="manual_toc.html#TOC23"><CODE>BZ2_bzDecompress</CODE></A></H3> - -<PRE> -int BZ2_bzDecompress ( bz_stream *strm ); -</PRE> - -<P> -Provides more input and/or output buffer space for the library. The -caller maintains input and output buffers, and uses <CODE>BZ2_bzDecompress</CODE> -to transfer data between them. 
- -</P> -<P> -Before each call to <CODE>BZ2_bzDecompress</CODE>, <CODE>next_in</CODE> -should point at the compressed data, -and <CODE>avail_in</CODE> should indicate how many bytes the library -may read. <CODE>BZ2_bzDecompress</CODE> updates <CODE>next_in</CODE>, <CODE>avail_in</CODE> -and <CODE>total_in</CODE> -to reflect the number of bytes it has read. - -</P> -<P> -Similarly, <CODE>next_out</CODE> should point to a buffer in which the uncompressed -output is to be placed, with <CODE>avail_out</CODE> indicating how much output space -is available. <CODE>BZ2_bzDecompress</CODE> updates <CODE>next_out</CODE>, -<CODE>avail_out</CODE> and <CODE>total_out</CODE> to reflect -the number of bytes output. - -</P> -<P> -You may provide and remove as little or as much data as you like on -each call of <CODE>BZ2_bzDecompress</CODE>. -In the limit, it is acceptable to -supply and remove data one byte at a time, although this would be -terribly inefficient. You should always ensure that at least one -byte of output space is available at each call. - -</P> -<P> -Use of <CODE>BZ2_bzDecompress</CODE> is simpler than <CODE>BZ2_bzCompress</CODE>. - -</P> -<P> -You should provide input and remove output as described above, and -repeatedly call <CODE>BZ2_bzDecompress</CODE> until <CODE>BZ_STREAM_END</CODE> is -returned. Appearance of <CODE>BZ_STREAM_END</CODE> denotes that -<CODE>BZ2_bzDecompress</CODE> has detected the logical end of the compressed -stream. <CODE>BZ2_bzDecompress</CODE> will not produce <CODE>BZ_STREAM_END</CODE> until -all output data has been placed into the output buffer, so once -<CODE>BZ_STREAM_END</CODE> appears, you are guaranteed to have available all -the decompressed output, and <CODE>BZ2_bzDecompressEnd</CODE> can safely be -called. - -</P> -<P> -In case of an error return value, you should call <CODE>BZ2_bzDecompressEnd</CODE> -to clean up and release memory. 
- -</P> -<P> -Possible return values: - -<PRE> - <CODE>BZ_PARAM_ERROR</CODE> - if <CODE>strm</CODE> is <CODE>NULL</CODE> or <CODE>strm->state</CODE> is <CODE>NULL</CODE> - or <CODE>strm->avail_out < 1</CODE> - <CODE>BZ_DATA_ERROR</CODE> - if a data integrity error is detected in the compressed stream - <CODE>BZ_DATA_ERROR_MAGIC</CODE> - if the compressed stream doesn't begin with the right magic bytes - <CODE>BZ_MEM_ERROR</CODE> - if there wasn't enough memory available - <CODE>BZ_STREAM_END</CODE> - if the logical end of the data stream was detected and all - output has been consumed, eg <CODE>strm->avail_out > 0</CODE> - <CODE>BZ_OK</CODE> - otherwise -</PRE> - -<P> -Allowable next actions: - -<PRE> - <CODE>BZ2_bzDecompress</CODE> - if <CODE>BZ_OK</CODE> was returned - <CODE>BZ2_bzDecompressEnd</CODE> - otherwise -</PRE> - - - -<H3><A NAME="SEC24" HREF="manual_toc.html#TOC24"><CODE>BZ2_bzDecompressEnd</CODE></A></H3> - -<PRE> -int BZ2_bzDecompressEnd ( bz_stream *strm ); -</PRE> - -<P> -Releases all memory associated with a decompression stream. - -</P> -<P> -Possible return values: - -<PRE> - <CODE>BZ_PARAM_ERROR</CODE> - if <CODE>strm</CODE> is <CODE>NULL</CODE> or <CODE>strm->state</CODE> is <CODE>NULL</CODE> - <CODE>BZ_OK</CODE> - otherwise -</PRE> - -<P> -Allowable next actions: - -<PRE> - None. -</PRE> - - - -<H2><A NAME="SEC25" HREF="manual_toc.html#TOC25">High-level interface</A></H2> - -<P> -This interface provides functions for reading and writing -<CODE>bzip2</CODE> format files. First, some general points. - -</P> - -<UL> -<LI>All of the functions take an <CODE>int*</CODE> first argument, - - <CODE>bzerror</CODE>. - After each call, <CODE>bzerror</CODE> should be consulted first to determine - the outcome of the call. If <CODE>bzerror</CODE> is <CODE>BZ_OK</CODE>, - the call completed - successfully, and only then should the return value of the function - (if any) be consulted. 
If <CODE>bzerror</CODE> is <CODE>BZ_IO_ERROR</CODE>, - there was an error - reading/writing the underlying compressed file, and you should - then consult <CODE>errno</CODE>/<CODE>perror</CODE> to determine the - cause of the difficulty. - <CODE>bzerror</CODE> may also be set to various other values; precise details are - given on a per-function basis below. -<LI>If <CODE>bzerror</CODE> indicates an error - - (ie, anything except <CODE>BZ_OK</CODE> and <CODE>BZ_STREAM_END</CODE>), - you should immediately call <CODE>BZ2_bzReadClose</CODE> (or <CODE>BZ2_bzWriteClose</CODE>, - depending on whether you are attempting to read or to write) - to free up all resources associated - with the stream. Once an error has been indicated, behaviour of all calls - except <CODE>BZ2_bzReadClose</CODE> (<CODE>BZ2_bzWriteClose</CODE>) is undefined. - The implication is that (1) <CODE>bzerror</CODE> should - be checked after each call, and (2) if <CODE>bzerror</CODE> indicates an error, - <CODE>BZ2_bzReadClose</CODE> (<CODE>BZ2_bzWriteClose</CODE>) should then be called to clean up. -<LI>The <CODE>FILE*</CODE> arguments passed to - - <CODE>BZ2_bzReadOpen</CODE>/<CODE>BZ2_bzWriteOpen</CODE> - should be set to binary mode. - Most Unix systems will do this by default, but other platforms, - including Windows and Mac, will not. If you omit this, you may - encounter problems when moving code to new platforms. -<LI>Memory allocation requests are handled by - - <CODE>malloc</CODE>/<CODE>free</CODE>. - At present - there is no facility for user-defined memory allocators in the file I/O - functions (could easily be added, though). -</UL> - - - -<H3><A NAME="SEC26" HREF="manual_toc.html#TOC26"><CODE>BZ2_bzReadOpen</CODE></A></H3> - -<PRE> - typedef void BZFILE; - - BZFILE *BZ2_bzReadOpen ( int *bzerror, FILE *f, - int verbosity, int small, - void *unused, int nUnused ); -</PRE> - -<P> -Prepare to read compressed data from file handle <CODE>f</CODE>.
<CODE>f</CODE> -should refer to a file which has been opened for reading, and for which -the error indicator (<CODE>ferror(f)</CODE>) is not set. If <CODE>small</CODE> is 1, -the library will try to decompress using less memory, at the expense of -speed. - -</P> -<P> -For reasons explained below, <CODE>BZ2_bzRead</CODE> will decompress the -<CODE>nUnused</CODE> bytes starting at <CODE>unused</CODE>, before starting to read -from the file <CODE>f</CODE>. At most <CODE>BZ_MAX_UNUSED</CODE> bytes may be -supplied like this. If this facility is not required, you should pass -<CODE>NULL</CODE> and <CODE>0</CODE> for <CODE>unused</CODE> and <CODE>nUnused</CODE> -respectively. - -</P> -<P> -For the meaning of parameters <CODE>small</CODE> and <CODE>verbosity</CODE>, -see <CODE>BZ2_bzDecompressInit</CODE>. - -</P> -<P> -The amount of memory needed to decompress a file cannot be determined -until the file's header has been read. So it is possible that -<CODE>BZ2_bzReadOpen</CODE> returns <CODE>BZ_OK</CODE> but a subsequent call of -<CODE>BZ2_bzRead</CODE> will return <CODE>BZ_MEM_ERROR</CODE>. - -</P> -<P> -Possible assignments to <CODE>bzerror</CODE>: - -<PRE> - <CODE>BZ_CONFIG_ERROR</CODE> - if the library has been mis-compiled - <CODE>BZ_PARAM_ERROR</CODE> - if <CODE>f</CODE> is <CODE>NULL</CODE> - or <CODE>small</CODE> is neither <CODE>0</CODE> nor <CODE>1</CODE> - or <CODE>(unused == NULL && nUnused != 0)</CODE> - or <CODE>(unused != NULL && !(0 <= nUnused <= BZ_MAX_UNUSED))</CODE> - <CODE>BZ_IO_ERROR</CODE> - if <CODE>ferror(f)</CODE> is nonzero - <CODE>BZ_MEM_ERROR</CODE> - if insufficient memory is available - <CODE>BZ_OK</CODE> - otherwise.
-</PRE> - -<P> -Possible return values: - -<PRE> - Pointer to an abstract <CODE>BZFILE</CODE> - if <CODE>bzerror</CODE> is <CODE>BZ_OK</CODE> - <CODE>NULL</CODE> - otherwise -</PRE> - -<P> -Allowable next actions: - -<PRE> - <CODE>BZ2_bzRead</CODE> - if <CODE>bzerror</CODE> is <CODE>BZ_OK</CODE> - <CODE>BZ2_bzClose</CODE> - otherwise -</PRE> - - - -<H3><A NAME="SEC27" HREF="manual_toc.html#TOC27"><CODE>BZ2_bzRead</CODE></A></H3> - -<PRE> - int BZ2_bzRead ( int *bzerror, BZFILE *b, void *buf, int len ); -</PRE> - -<P> -Reads up to <CODE>len</CODE> (uncompressed) bytes from the compressed file -<CODE>b</CODE> into -the buffer <CODE>buf</CODE>. If the read was successful, -<CODE>bzerror</CODE> is set to <CODE>BZ_OK</CODE> -and the number of bytes read is returned. If the logical end-of-stream -was detected, <CODE>bzerror</CODE> will be set to <CODE>BZ_STREAM_END</CODE>, -and the number -of bytes read is returned. All other <CODE>bzerror</CODE> values denote an error. - -</P> -<P> -<CODE>BZ2_bzRead</CODE> will supply <CODE>len</CODE> bytes, -unless the logical stream end is detected -or an error occurs. Because of this, it is possible to detect the -stream end by observing when the number of bytes returned is -less than the number -requested. Nevertheless, this is regarded as inadvisable; you should -instead check <CODE>bzerror</CODE> after every call and watch out for -<CODE>BZ_STREAM_END</CODE>. - -</P> -<P> -Internally, <CODE>BZ2_bzRead</CODE> copies data from the compressed file in chunks -of size <CODE>BZ_MAX_UNUSED</CODE> bytes -before decompressing it. If the file contains more bytes than strictly -needed to reach the logical end-of-stream, <CODE>BZ2_bzRead</CODE> will almost certainly -read some of the trailing data before signalling <CODE>BZ_SEQUENCE_END</CODE>. -To collect the read but unused data once <CODE>BZ_SEQUENCE_END</CODE> has -appeared, call <CODE>BZ2_bzReadGetUnused</CODE> immediately before <CODE>BZ2_bzReadClose</CODE>. 
- -</P> -<P> -Possible assignments to <CODE>bzerror</CODE>: - -<PRE> - <CODE>BZ_PARAM_ERROR</CODE> - if <CODE>b</CODE> is <CODE>NULL</CODE> or <CODE>buf</CODE> is <CODE>NULL</CODE> or <CODE>len < 0</CODE> - <CODE>BZ_SEQUENCE_ERROR</CODE> - if <CODE>b</CODE> was opened with <CODE>BZ2_bzWriteOpen</CODE> - <CODE>BZ_IO_ERROR</CODE> - if there is an error reading from the compressed file - <CODE>BZ_UNEXPECTED_EOF</CODE> - if the compressed file ended before the logical end-of-stream was detected - <CODE>BZ_DATA_ERROR</CODE> - if a data integrity error was detected in the compressed stream - <CODE>BZ_DATA_ERROR_MAGIC</CODE> - if the stream does not begin with the requisite header bytes (ie, is not - a <CODE>bzip2</CODE> data file). This is really a special case of <CODE>BZ_DATA_ERROR</CODE>. - <CODE>BZ_MEM_ERROR</CODE> - if insufficient memory was available - <CODE>BZ_STREAM_END</CODE> - if the logical end of stream was detected. - <CODE>BZ_OK</CODE> - otherwise. -</PRE> - -<P> -Possible return values: - -<PRE> - number of bytes read - if <CODE>bzerror</CODE> is <CODE>BZ_OK</CODE> or <CODE>BZ_STREAM_END</CODE> - undefined - otherwise -</PRE> - -<P> -Allowable next actions: - -<PRE> - collect data from <CODE>buf</CODE>, then <CODE>BZ2_bzRead</CODE> or <CODE>BZ2_bzReadClose</CODE> - if <CODE>bzerror</CODE> is <CODE>BZ_OK</CODE> - collect data from <CODE>buf</CODE>, then <CODE>BZ2_bzReadClose</CODE> or <CODE>BZ2_bzReadGetUnused</CODE> - if <CODE>bzerror</CODE> is <CODE>BZ_STREAM_END</CODE> - <CODE>BZ2_bzReadClose</CODE> - otherwise -</PRE> - - - -<H3><A NAME="SEC28" HREF="manual_toc.html#TOC28"><CODE>BZ2_bzReadGetUnused</CODE></A></H3> - -<PRE> - void BZ2_bzReadGetUnused ( int* bzerror, BZFILE *b, - void** unused, int* nUnused ); -</PRE> - -<P> -Returns data which was read from the compressed file but was not needed -to get to the logical end-of-stream. <CODE>*unused</CODE> is set to the address -of the data, and <CODE>*nUnused</CODE> to the number of bytes.
<CODE>*nUnused</CODE> will -be set to a value between <CODE>0</CODE> and <CODE>BZ_MAX_UNUSED</CODE> inclusive. - -</P> -<P> -This function may only be called once <CODE>BZ2_bzRead</CODE> has signalled -<CODE>BZ_STREAM_END</CODE> but before <CODE>BZ2_bzReadClose</CODE>. - -</P> -<P> -Possible assignments to <CODE>bzerror</CODE>: - -<PRE> - <CODE>BZ_PARAM_ERROR</CODE> - if <CODE>b</CODE> is <CODE>NULL</CODE> - or <CODE>unused</CODE> is <CODE>NULL</CODE> or <CODE>nUnused</CODE> is <CODE>NULL</CODE> - <CODE>BZ_SEQUENCE_ERROR</CODE> - if <CODE>BZ_STREAM_END</CODE> has not been signalled - or if <CODE>b</CODE> was opened with <CODE>BZ2_bzWriteOpen</CODE> - <CODE>BZ_OK</CODE> - otherwise -</PRE> - -<P> -Allowable next actions: - -<PRE> - <CODE>BZ2_bzReadClose</CODE> -</PRE> - - - -<H3><A NAME="SEC29" HREF="manual_toc.html#TOC29"><CODE>BZ2_bzReadClose</CODE></A></H3> - -<PRE> - void BZ2_bzReadClose ( int *bzerror, BZFILE *b ); -</PRE> - -<P> -Releases all memory pertaining to the compressed file <CODE>b</CODE>. -<CODE>BZ2_bzReadClose</CODE> does not call <CODE>fclose</CODE> on the underlying file -handle, so you should do that yourself if appropriate. -<CODE>BZ2_bzReadClose</CODE> should be called to clean up after all error -situations. - -</P> -<P> -Possible assignments to <CODE>bzerror</CODE>: - -<PRE> - <CODE>BZ_SEQUENCE_ERROR</CODE> - if <CODE>b</CODE> was opened with <CODE>BZ2_bzWriteOpen</CODE> - <CODE>BZ_OK</CODE> - otherwise -</PRE> - -<P> -Allowable next actions: - -<PRE> - none -</PRE> - - - -<H3><A NAME="SEC30" HREF="manual_toc.html#TOC30"><CODE>BZ2_bzWriteOpen</CODE></A></H3> - -<PRE> - BZFILE *BZ2_bzWriteOpen ( int *bzerror, FILE *f, - int blockSize100k, int verbosity, - int workFactor ); -</PRE> - -<P> -Prepare to write compressed data to file handle <CODE>f</CODE>. -<CODE>f</CODE> should refer to -a file which has been opened for writing, and for which the error -indicator (<CODE>ferror(f)</CODE>) is not set.
- -</P> -<P> -For the meaning of parameters <CODE>blockSize100k</CODE>, -<CODE>verbosity</CODE> and <CODE>workFactor</CODE>, see -<BR> <CODE>BZ2_bzCompressInit</CODE>. - -</P> -<P> -All required memory is allocated at this stage, so if the call -completes successfully, <CODE>BZ_MEM_ERROR</CODE> cannot be signalled by a -subsequent call to <CODE>BZ2_bzWrite</CODE>. - -</P> -<P> -Possible assignments to <CODE>bzerror</CODE>: - -<PRE> - <CODE>BZ_CONFIG_ERROR</CODE> - if the library has been mis-compiled - <CODE>BZ_PARAM_ERROR</CODE> - if <CODE>f</CODE> is <CODE>NULL</CODE> - or <CODE>blockSize100k < 1</CODE> or <CODE>blockSize100k > 9</CODE> - <CODE>BZ_IO_ERROR</CODE> - if <CODE>ferror(f)</CODE> is nonzero - <CODE>BZ_MEM_ERROR</CODE> - if insufficient memory is available - <CODE>BZ_OK</CODE> - otherwise -</PRE> - -<P> -Possible return values: - -<PRE> - Pointer to an abstract <CODE>BZFILE</CODE> - if <CODE>bzerror</CODE> is <CODE>BZ_OK</CODE> - <CODE>NULL</CODE> - otherwise -</PRE> - -<P> -Allowable next actions: - -<PRE> - <CODE>BZ2_bzWrite</CODE> - if <CODE>bzerror</CODE> is <CODE>BZ_OK</CODE> - (you could go directly to <CODE>BZ2_bzWriteClose</CODE>, but this would be pretty pointless) - <CODE>BZ2_bzWriteClose</CODE> - otherwise -</PRE> - - - -<H3><A NAME="SEC31" HREF="manual_toc.html#TOC31"><CODE>BZ2_bzWrite</CODE></A></H3> - -<PRE> - void BZ2_bzWrite ( int *bzerror, BZFILE *b, void *buf, int len ); -</PRE> - -<P> -Absorbs <CODE>len</CODE> bytes from the buffer <CODE>buf</CODE>, eventually to be -compressed and written to the file. - -</P> -<P> -Possible assignments to <CODE>bzerror</CODE>: - -<PRE> - <CODE>BZ_PARAM_ERROR</CODE> - if <CODE>b</CODE> is <CODE>NULL</CODE> or <CODE>buf</CODE> is <CODE>NULL</CODE> or <CODE>len < 0</CODE> - <CODE>BZ_SEQUENCE_ERROR</CODE> - if <CODE>b</CODE> was opened with <CODE>BZ2_bzReadOpen</CODE> - <CODE>BZ_IO_ERROR</CODE> - if there is an error writing the compressed file.
- <CODE>BZ_OK</CODE> - otherwise -</PRE> - - - -<H3><A NAME="SEC32" HREF="manual_toc.html#TOC32"><CODE>BZ2_bzWriteClose</CODE></A></H3> - -<PRE> - void BZ2_bzWriteClose ( int *bzerror, BZFILE* b, - int abandon, - unsigned int* nbytes_in, - unsigned int* nbytes_out ); - - void BZ2_bzWriteClose64 ( int *bzerror, BZFILE* b, - int abandon, - unsigned int* nbytes_in_lo32, - unsigned int* nbytes_in_hi32, - unsigned int* nbytes_out_lo32, - unsigned int* nbytes_out_hi32 ); -</PRE> - -<P> -Compresses and flushes to the compressed file all data so far supplied -by <CODE>BZ2_bzWrite</CODE>. The logical end-of-stream markers are also written, so -subsequent calls to <CODE>BZ2_bzWrite</CODE> are illegal. All memory associated -with the compressed file <CODE>b</CODE> is released. -<CODE>fflush</CODE> is called on the -compressed file, but it is not <CODE>fclose</CODE>'d. - -</P> -<P> -If <CODE>BZ2_bzWriteClose</CODE> is called to clean up after an error, the only -action is to release the memory. The library records the error codes -issued by previous calls, so this situation will be detected -automatically. There is no attempt to complete the compression -operation, nor to <CODE>fflush</CODE> the compressed file. You can force this -behaviour to happen even in the case of no error, by passing a nonzero -value to <CODE>abandon</CODE>. - -</P> -<P> -If <CODE>nbytes_in</CODE> is non-null, <CODE>*nbytes_in</CODE> will be set to be the -total volume of uncompressed data handled. Similarly, <CODE>nbytes_out</CODE> -will be set to the total volume of compressed data written. For -compatibility with older versions of the library, <CODE>BZ2_bzWriteClose</CODE> -only yields the lower 32 bits of these counts. Use -<CODE>BZ2_bzWriteClose64</CODE> if you want the full 64 bit counts. These -two functions are otherwise absolutely identical.
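-
-</P>
-<P>
-As a minimal sketch of reassembling those counts, the helper below joins the
-low and high 32-bit halves reported by <CODE>BZ2_bzWriteClose64</CODE> into one
-64-bit total. The name <CODE>join_count64</CODE> is hypothetical and is not part
-of <CODE>libbzip2</CODE>; the sketch also assumes a C99 compiler providing
-<CODE>stdint.h</CODE>.
-
-<PRE>
-#include <stdint.h>
-
-/* Hypothetical helper, not a libbzip2 function: combine the two
-   32-bit halves of a byte count into a single 64-bit value. */
-static uint64_t join_count64 ( unsigned int lo32, unsigned int hi32 )
-{
-   return ((uint64_t)hi32 << 32) | (uint64_t)lo32;
-}
-</PRE>
-
-<P>
-For instance, passing the values stored through <CODE>nbytes_in_lo32</CODE> and
-<CODE>nbytes_in_hi32</CODE> would give the total volume of uncompressed data.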
- -</P> - -<P> -Possible assignments to <CODE>bzerror</CODE>: - -<PRE> - <CODE>BZ_SEQUENCE_ERROR</CODE> - if <CODE>b</CODE> was opened with <CODE>BZ2_bzReadOpen</CODE> - <CODE>BZ_IO_ERROR</CODE> - if there is an error writing the compressed file - <CODE>BZ_OK</CODE> - otherwise -</PRE> - - - -<H3><A NAME="SEC33" HREF="manual_toc.html#TOC33">Handling embedded compressed data streams</A></H3> - -<P> -The high-level library facilitates use of -<CODE>bzip2</CODE> data streams which form some part of a surrounding, larger -data stream. - -<UL> -<LI>For writing, the library takes an open file handle, writes - -compressed data to it, <CODE>fflush</CODE>es it but does not <CODE>fclose</CODE> it. -The calling application can write its own data before and after the -compressed data stream, using that same file handle. -<LI>Reading is more complex, and the facilities are not as general - -as they could be since generality is hard to reconcile with efficiency. -<CODE>BZ2_bzRead</CODE> reads from the compressed file in blocks of size -<CODE>BZ_MAX_UNUSED</CODE> bytes, and in doing so probably will overshoot -the logical end of compressed stream. -To recover this data once decompression has -ended, call <CODE>BZ2_bzReadGetUnused</CODE> after the last call of <CODE>BZ2_bzRead</CODE> -(the one returning <CODE>BZ_STREAM_END</CODE>) but before calling -<CODE>BZ2_bzReadClose</CODE>. -</UL> - -<P> -This mechanism makes it easy to decompress multiple <CODE>bzip2</CODE> -streams placed end-to-end. At the end of one stream, when <CODE>BZ2_bzRead</CODE> -returns <CODE>BZ_STREAM_END</CODE>, call <CODE>BZ2_bzReadGetUnused</CODE> to collect the -unused data (copy it into your own buffer somewhere). -That data forms the start of the next compressed stream. -To start uncompressing that next stream, call <CODE>BZ2_bzReadOpen</CODE> again, -feeding in the unused data via the <CODE>unused</CODE>/<CODE>nUnused</CODE> -parameters.
-Keep doing this until <CODE>BZ_STREAM_END</CODE> return coincides with the -physical end of file (<CODE>feof(f)</CODE>). In this situation -<CODE>BZ2_bzReadGetUnused</CODE> -will of course return no data. - -</P> -<P> -This should give some feel for how the high-level interface can be used. -If you require extra flexibility, you'll have to bite the bullet and get -to grips with the low-level interface. - -</P> - - -<H3><A NAME="SEC34" HREF="manual_toc.html#TOC34">Standard file-reading/writing code</A></H3> -<P> -Here's how you'd write data to a compressed file: - -<PRE> -FILE* f; -BZFILE* b; -int nBuf; -char buf[ /* whatever size you like */ ]; -int bzerror; - -f = fopen ( "myfile.bz2", "wb" ); /* binary mode, as advised above */ -if (!f) { - /* handle error */ -} -/* blockSize100k = 9, verbosity = 0, workFactor = 30 */ -b = BZ2_bzWriteOpen ( &bzerror, f, 9, 0, 30 ); -if (bzerror != BZ_OK) { - BZ2_bzWriteClose ( &bzerror, b, 0, NULL, NULL ); - /* handle error */ -} - -while ( /* condition */ ) { - /* get data to write into buf, and set nBuf appropriately */ - BZ2_bzWrite ( &bzerror, b, buf, nBuf ); - if (bzerror == BZ_IO_ERROR) { - BZ2_bzWriteClose ( &bzerror, b, 0, NULL, NULL ); - /* handle error */ - } -} - -BZ2_bzWriteClose ( &bzerror, b, 0, NULL, NULL ); -if (bzerror == BZ_IO_ERROR) { - /* handle error */ -} -</PRE> - -<P> -And to read from a compressed file: - -<PRE> -FILE* f; -BZFILE* b; -int nBuf; -char buf[ /* whatever size you like */ ]; -int bzerror; - -f = fopen ( "myfile.bz2", "rb" ); /* binary mode, as advised above */ -if (!f) { - /* handle error */ -} -/* verbosity = 0, small = 0, no unused data */ -b = BZ2_bzReadOpen ( &bzerror, f, 0, 0, NULL, 0 ); -if (bzerror != BZ_OK) { - BZ2_bzReadClose ( &bzerror, b ); - /* handle error */ -} - -bzerror = BZ_OK; -while (bzerror == BZ_OK && /* arbitrary other conditions */) { - nBuf = BZ2_bzRead ( &bzerror, b, buf, /* size of buf */ ); - if (bzerror == BZ_OK) { - /* do something with buf[0 ..
nBuf-1] */ - } -} -if (bzerror != BZ_STREAM_END) { - BZ2_bzReadClose ( &bzerror, b ); - /* handle error */ -} else { - BZ2_bzReadClose ( &bzerror, b ); -} -</PRE> - - - -<H2><A NAME="SEC35" HREF="manual_toc.html#TOC35">Utility functions</A></H2> - - -<H3><A NAME="SEC36" HREF="manual_toc.html#TOC36"><CODE>BZ2_bzBuffToBuffCompress</CODE></A></H3> - -<PRE> - int BZ2_bzBuffToBuffCompress( char* dest, - unsigned int* destLen, - char* source, - unsigned int sourceLen, - int blockSize100k, - int verbosity, - int workFactor ); -</PRE> - -<P> -Attempts to compress the data in <CODE>source[0 .. sourceLen-1]</CODE> -into the destination buffer, <CODE>dest[0 .. *destLen-1]</CODE>. -If the destination buffer is big enough, <CODE>*destLen</CODE> is -set to the size of the compressed data, and <CODE>BZ_OK</CODE> is -returned. If the compressed data won't fit, <CODE>*destLen</CODE> -is unchanged, and <CODE>BZ_OUTBUFF_FULL</CODE> is returned. - -</P> -<P> -Compression in this manner is a one-shot event, done with a single call -to this function. The resulting compressed data is a complete -<CODE>bzip2</CODE> format data stream. There is no mechanism for making -additional calls to provide extra input data. If you want that kind of -mechanism, use the low-level interface. - -</P> -<P> -For the meaning of parameters <CODE>blockSize100k</CODE>, <CODE>verbosity</CODE> -and <CODE>workFactor</CODE>, <BR> see <CODE>BZ2_bzCompressInit</CODE>. - -</P> -<P> -To guarantee that the compressed data will fit in its buffer, allocate -an output buffer of size 1% larger than the uncompressed data, plus -six hundred extra bytes. - -</P> -<P> -<CODE>BZ2_bzBuffToBuffCompress</CODE> will not write data at or -beyond <CODE>dest[*destLen]</CODE>, even in case of buffer overflow.
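-
-</P>
-<P>
-The sizing rule above (1% larger than the uncompressed data, plus six hundred
-bytes) can be sketched as a small helper. The name <CODE>dest_bound</CODE> is
-hypothetical and not part of <CODE>libbzip2</CODE>; the sketch rounds the 1% up
-and assumes the result still fits in an <CODE>unsigned int</CODE>.
-
-<PRE>
-/* Hypothetical helper, not a libbzip2 function: destination size
-   following the rule above. Assumes no unsigned overflow. */
-static unsigned int dest_bound ( unsigned int sourceLen )
-{
-   /* 1% of sourceLen, rounded up to a whole byte */
-   unsigned int onePercent = sourceLen / 100 + (sourceLen % 100 != 0);
-   return sourceLen + onePercent + 600;
-}
-</PRE>
-
-<P>
-The result would serve both as the allocation size for <CODE>dest</CODE> and as
-the initial value of <CODE>*destLen</CODE>.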
- -</P> -<P> -Possible return values: - -<PRE> - <CODE>BZ_CONFIG_ERROR</CODE> - if the library has been mis-compiled - <CODE>BZ_PARAM_ERROR</CODE> - if <CODE>dest</CODE> is <CODE>NULL</CODE> or <CODE>destLen</CODE> is <CODE>NULL</CODE> - or <CODE>blockSize100k < 1</CODE> or <CODE>blockSize100k > 9</CODE> - or <CODE>verbosity < 0</CODE> or <CODE>verbosity > 4</CODE> - or <CODE>workFactor < 0</CODE> or <CODE>workFactor > 250</CODE> - <CODE>BZ_MEM_ERROR</CODE> - if insufficient memory is available - <CODE>BZ_OUTBUFF_FULL</CODE> - if the size of the compressed data exceeds <CODE>*destLen</CODE> - <CODE>BZ_OK</CODE> - otherwise -</PRE> - - - -<H3><A NAME="SEC37" HREF="manual_toc.html#TOC37"><CODE>BZ2_bzBuffToBuffDecompress</CODE></A></H3> - -<PRE> - int BZ2_bzBuffToBuffDecompress ( char* dest, - unsigned int* destLen, - char* source, - unsigned int sourceLen, - int small, - int verbosity ); -</PRE> - -<P> -Attempts to decompress the data in <CODE>source[0 .. sourceLen-1]</CODE> -into the destination buffer, <CODE>dest[0 .. *destLen-1]</CODE>. -If the destination buffer is big enough, <CODE>*destLen</CODE> is -set to the size of the uncompressed data, and <CODE>BZ_OK</CODE> is -returned. If the uncompressed data won't fit, <CODE>*destLen</CODE> -is unchanged, and <CODE>BZ_OUTBUFF_FULL</CODE> is returned. - -</P> -<P> -<CODE>source</CODE> is assumed to hold a complete <CODE>bzip2</CODE> format -data stream. <BR> <CODE>BZ2_bzBuffToBuffDecompress</CODE> tries to decompress -the entirety of the stream into the output buffer. - -</P> -<P> -For the meaning of parameters <CODE>small</CODE> and <CODE>verbosity</CODE>, -see <CODE>BZ2_bzDecompressInit</CODE>. - -</P> -<P> -Because the compression ratio of the compressed data cannot be known in -advance, there is no easy way to guarantee that the output buffer will -be big enough. You may of course make arrangements in your code to -record the size of the uncompressed data, but such a mechanism is beyond -the scope of this library.
- -</P> -<P> -<CODE>BZ2_bzBuffToBuffDecompress</CODE> will not write data at or -beyond <CODE>dest[*destLen]</CODE>, even in case of buffer overflow. - -</P> -<P> -Possible return values: - -<PRE> - <CODE>BZ_CONFIG_ERROR</CODE> - if the library has been mis-compiled - <CODE>BZ_PARAM_ERROR</CODE> - if <CODE>dest</CODE> is <CODE>NULL</CODE> or <CODE>destLen</CODE> is <CODE>NULL</CODE> - or <CODE>small != 0 && small != 1</CODE> - or <CODE>verbosity < 0</CODE> or <CODE>verbosity > 4</CODE> - <CODE>BZ_MEM_ERROR</CODE> - if insufficient memory is available - <CODE>BZ_OUTBUFF_FULL</CODE> - if the size of the uncompressed data exceeds <CODE>*destLen</CODE> - <CODE>BZ_DATA_ERROR</CODE> - if a data integrity error was detected in the compressed data - <CODE>BZ_DATA_ERROR_MAGIC</CODE> - if the compressed data doesn't begin with the right magic bytes - <CODE>BZ_UNEXPECTED_EOF</CODE> - if the compressed data ends unexpectedly - <CODE>BZ_OK</CODE> - otherwise -</PRE> - - - -<H2><A NAME="SEC38" HREF="manual_toc.html#TOC38"><CODE>zlib</CODE> compatibility functions</A></H2> -<P> -Yoshioka Tsuneo has contributed some functions to -give better <CODE>zlib</CODE> compatibility. These functions are -<CODE>BZ2_bzopen</CODE>, <CODE>BZ2_bzdopen</CODE>, <CODE>BZ2_bzread</CODE>, <CODE>BZ2_bzwrite</CODE>, <CODE>BZ2_bzflush</CODE>, -<CODE>BZ2_bzclose</CODE>, -<CODE>BZ2_bzerror</CODE> and <CODE>BZ2_bzlibVersion</CODE>. -These functions are not (yet) officially part of -the library. If they break, you get to keep all the pieces. -Nevertheless, I think they work ok. - -<PRE> -typedef void BZFILE; - -const char * BZ2_bzlibVersion ( void ); -</PRE> - -<P> -Returns a string indicating the library version. - -<PRE> -BZFILE * BZ2_bzopen ( const char *path, const char *mode ); -BZFILE * BZ2_bzdopen ( int fd, const char *mode ); -</PRE> - -<P> -Opens a <CODE>.bz2</CODE> file for reading or writing, using either its name -or a pre-existing file descriptor. -Analogous to <CODE>fopen</CODE> and <CODE>fdopen</CODE>.
- -<PRE> -int BZ2_bzread ( BZFILE* b, void* buf, int len ); -int BZ2_bzwrite ( BZFILE* b, void* buf, int len ); -</PRE> - -<P> -Reads/writes data from/to a previously opened <CODE>BZFILE</CODE>. -Analogous to <CODE>fread</CODE> and <CODE>fwrite</CODE>. - -<PRE> -int BZ2_bzflush ( BZFILE* b ); -void BZ2_bzclose ( BZFILE* b ); -</PRE> - -<P> -Flushes/closes a <CODE>BZFILE</CODE>. <CODE>BZ2_bzflush</CODE> doesn't actually do -anything. Analogous to <CODE>fflush</CODE> and <CODE>fclose</CODE>. - -</P> - -<PRE> -const char * BZ2_bzerror ( BZFILE *b, int *errnum ); -</PRE> - -<P> -Returns a string describing the most recent error status of -<CODE>b</CODE>, and also sets <CODE>*errnum</CODE> to its numerical value. - -</P> - - - -<H2><A NAME="SEC39" HREF="manual_toc.html#TOC39">Using the library in a <CODE>stdio</CODE>-free environment</A></H2> - - - -<H3><A NAME="SEC40" HREF="manual_toc.html#TOC40">Getting rid of <CODE>stdio</CODE></A></H3> - -<P> -In a deeply embedded application, you might want to use just -the memory-to-memory functions. You can do this conveniently -by compiling the library with preprocessor symbol <CODE>BZ_NO_STDIO</CODE> -defined. Doing this gives you a library containing only the following -eight functions: - -</P> -<P> -<CODE>BZ2_bzCompressInit</CODE>, <CODE>BZ2_bzCompress</CODE>, <CODE>BZ2_bzCompressEnd</CODE> <BR> -<CODE>BZ2_bzDecompressInit</CODE>, <CODE>BZ2_bzDecompress</CODE>, <CODE>BZ2_bzDecompressEnd</CODE> <BR> -<CODE>BZ2_bzBuffToBuffCompress</CODE>, <CODE>BZ2_bzBuffToBuffDecompress</CODE> - -</P> -<P> -When compiled like this, all functions will ignore <CODE>verbosity</CODE> -settings. - -</P> - - -<H3><A NAME="SEC41" HREF="manual_toc.html#TOC41">Critical error handling</A></H3> -<P> -<CODE>libbzip2</CODE> contains a number of internal assertion checks which -should, needless to say, never be activated.
Nevertheless, if an -assertion should fail, behaviour depends on whether or not the library -was compiled with <CODE>BZ_NO_STDIO</CODE> set. - -</P> -<P> -For a normal compile, an assertion failure yields the message - -<PRE> - bzip2/libbzip2: internal error number N. - This is a bug in bzip2/libbzip2, 1.0 of 21-Mar-2000. - Please report it to me at: jseward@acm.org. If this happened - when you were using some program which uses libbzip2 as a - component, you should also report this bug to the author(s) - of that program. Please make an effort to report this bug; - timely and accurate bug reports eventually lead to higher - quality software. Thanks. Julian Seward, 21 March 2000. -</PRE> - -<P> -where <CODE>N</CODE> is some error code number. <CODE>exit(3)</CODE> -is then called. - -</P> -<P> -For a <CODE>stdio</CODE>-free library, assertion failures result -in a call to a function declared as: - -<PRE> - extern void bz_internal_error ( int errcode ); -</PRE> - -<P> -The relevant code is passed as a parameter. You should supply -such a function. - -</P> -<P> -In either case, once an assertion failure has occurred, any -<CODE>bz_stream</CODE> records involved can be regarded as invalid. -You should not attempt to resume normal operation with them. - -</P> -<P> -You may, of course, change critical error handling to suit -your needs. As I said above, critical errors indicate bugs -in the library and should not occur. All "normal" error -situations are indicated via error return codes from functions, -and can be recovered from. - -</P> - - - -<H2><A NAME="SEC42" HREF="manual_toc.html#TOC42">Making a Windows DLL</A></H2> -<P> -Everything related to Windows has been contributed by Yoshioka Tsuneo -<BR> (<CODE>QWF00133@niftyserve.or.jp</CODE> / -<CODE>tsuneo-y@is.aist-nara.ac.jp</CODE>), so you should send your queries to -him (but perhaps Cc: me, <CODE>jseward@acm.org</CODE>). 
- -</P> -<P> -My vague understanding of what to do is: using Visual C++ 5.0, -open the project file <CODE>libbz2.dsp</CODE>, and build. That's all. - -</P> -<P> -If you can't -open the project file for some reason, make a new one, naming these files: -<CODE>blocksort.c</CODE>, <CODE>bzlib.c</CODE>, <CODE>compress.c</CODE>, -<CODE>crctable.c</CODE>, <CODE>decompress.c</CODE>, <CODE>huffman.c</CODE>, <BR> -<CODE>randtable.c</CODE> and <CODE>libbz2.def</CODE>. You will also need -to name the header files <CODE>bzlib.h</CODE> and <CODE>bzlib_private.h</CODE>. - -</P> -<P> -If you don't use VC++, you may need to define the preprocessor symbol -<CODE>_WIN32</CODE>. - -</P> -<P> -Finally, <CODE>dlltest.c</CODE> is a sample program using the DLL. It has a -project file, <CODE>dlltest.dsp</CODE>. - -</P> -<P> -If you just want a makefile for Visual C, have a look at -<CODE>makefile.msc</CODE>. - -</P> -<P> -Be aware that if you compile <CODE>bzip2</CODE> itself on Win32, you must set -<CODE>BZ_UNIX</CODE> to 0 and <CODE>BZ_LCCWIN32</CODE> to 1, in the file -<CODE>bzip2.c</CODE>, before compiling. Otherwise the resulting binary won't -work correctly. - -</P> -<P> -I haven't tried any of this stuff myself, but it all looks plausible. - -</P> - -<P><HR><P> -<p>Go to the <A HREF="manual_1.html">first</A>, <A HREF="manual_2.html">previous</A>, <A HREF="manual_4.html">next</A>, <A HREF="manual_4.html">last</A> section, <A HREF="manual_toc.html">table of contents</A>. -</BODY> -</HTML>