github.com/amachronic/microtar.git
author Aidan MacDonald <amachronic@protonmail.com> 2021-11-05 17:41:58 +0300
committer Aidan MacDonald <amachronic@protonmail.com> 2021-11-05 18:21:52 +0300
commit 4868c7d4de5114757cc56de32a577851c0f68377
tree 3f31bfd8bb344adf996df9ada27c1c7410261790
parent 2dd008245d42504c12c27d9731f2e1180162fe21
Update the README
-rw-r--r-- README.md 307
1 files changed, 224 insertions, 83 deletions
diff --git a/README.md b/README.md
index 1a663d5..8d8a02b 100644
--- a/README.md
+++ b/README.md
@@ -1,122 +1,263 @@
# microtar
-A lightweight tar library written in ANSI C
+A lightweight tar library written in ANSI C.
-## Modifications from upstream
+This version is a fork of [rxi's microtar](https://github.com/rxi/microtar)
+with bugfixes and API changes aimed at improving usability while staying
+in keeping with the minimal design of the original library.
-[Upstream](https://github.com/rxi/microtar) has numerous bugs and gotchas,
-which I fixed in order to improve the overall robustness of the library.
+## License
+
+This library is free software; you can redistribute it and/or modify it under
+the terms of the MIT license. See [LICENSE](LICENSE) for details.
+
+
+## Supported format variants
+
+No effort has been put into handling every tar format variant. Basically
+what is accepted is the "old-style" format, which appears to work well
+enough to access basic archives created by GNU `tar`.
-A summary of my changes, in no particular order:
-- Fix possible sscanf beyond the bounds of the input buffer
-- Fix possible buffer overruns due to strcpy on untrusted input
-- Fix incorrect octal formatting by sprintf and possible output overrruns
-- Catch read/writes which are too big and handle them gracefully
-- Handle over-long names in `mtar_write_file_header` / `mtar_write_dir_header`
-- Ensure strings in `mtar_header_t` are always null-terminated
-- Save and load group information so we don't lose information
-- Move `mtar_open()` to `microtar-stdio.c` so `microtar.c` can be used in
- a freestanding environment
-- Allow control of stack usage by moving temporary variables into `mtar_t`,
- so the caller can decide whether to use the stack or heap
+## Basic usage
-An up-to-date copy of this modified version can be found
-[here](https://github.com/amachronic/microtar).
+The library consists of two files, `microtar.c` and `microtar.h`, which only
+depend on a tiny part of the standard C library and can be easily incorporated
+into a host project's build system.
+The core library does not include any I/O hooks as these are supposed to be
+provided by the host application. If the C library's `fopen` and friends are
+good enough, you can use `microtar-stdio.c`.
-## Basic Usage
-The library consists of `microtar.c` and `microtar.h`. These two files can be
-dropped into an existing project and compiled along with it.
+### Initialization
+
+Initialization is very simple. Everything the library needs is contained in
+the `mtar_t` struct; there is no memory allocation and no global state. It is
+enough to zero-initialize an `mtar_t` object to put it into a "closed" state.
+You can use `mtar_is_open()` to query whether the archive is open or not.
+
+An archive can be opened for reading _or_ writing, but not both. You have to
+specify which access mode you're using when you create the archive.
-#### Reading
```c
mtar_t tar;
-mtar_header_t h;
-char *p;
+mtar_init(&tar, MTAR_READ, my_io_ops, my_stream);
+```
-/* Open archive for reading */
-mtar_open(&tar, "test.tar", "r");
+Or if using `microtar-stdio.c`:
-/* Print all file names and sizes */
-while ( (mtar_read_header(&tar, &h)) != MTAR_ENULLRECORD ) {
- printf("%s (%d bytes)\n", h.name, h.size);
- mtar_next(&tar);
+```c
+int error = mtar_open(&tar, "file.tar", "rb");
+if(error) {
+ /* do something about it */
}
+```
-/* Load and print contents of file "test.txt" */
-mtar_find(&tar, "test.txt", &h);
-p = calloc(1, h.size + 1);
-mtar_read_data(&tar, p, h.size);
-printf("%s", p);
-free(p);
+Note that `mtar_init()` is called for you in this case and the access mode is
+deduced from the mode flags.
-/* Close archive */
-mtar_close(&tar);
-```
-#### Writing
+### Iterating and locating files
+
+If you opened an archive for reading, you'll likely want to iterate over
+all the files. Here's the long way of doing it:
+
```c
mtar_t tar;
-const char *str1 = "Hello world";
-const char *str2 = "Goodbye world";
+int err;
+
+/* Go to the start of the archive... Not necessary if you've
+ * just opened the archive and are already at the beginning.
+ * (And of course you normally want to check the return value.) */
+mtar_rewind(&tar);
+
+/* Iterate over the archive members */
+while((err = mtar_next(&tar)) == MTAR_ESUCCESS) {
+ /* Get a pointer to the current file header. It will
+ * remain valid until you move to another record with
+ * mtar_next() or call mtar_rewind() */
+ const mtar_header_t* header = mtar_get_header(&tar);
+
+ printf("%s (%u bytes)\n", header->name, (unsigned)header->size);
+}
+
+if(err != MTAR_ENULLRECORD) {
+ /* ENULLRECORD means we hit end of file; any
+ * other return value is an actual error. */
+}
+```
+
+There's a useful shortcut for this type of iteration which removes the
+loop boilerplate, replacing it with another kind of boilerplate that may
+be more palatable in some cases.
-/* Open archive for writing */
-mtar_open(&tar, "test.tar", "w");
+```c
+/* Will be called for each archive member visited by mtar_foreach().
+ * The member's header is passed in as an argument so you don't need
+ * to fetch it manually with mtar_get_header(). You can freely read
+ * data (if present) and seek around. There is no special cleanup
+ * required and it is not necessary to read to the end of the stream.
+ *
+ * The callback should return zero (= MTAR_ESUCCESS) to continue the
+ * iteration or return nonzero to abort. On abort, the value returned
+ * by the callback will be returned from mtar_foreach(). Since it may
+ * also return normal microtar error codes, it is suggested to use a
+ * positive value or pass the result via 'arg'.
+ */
+int foreach_cb(mtar_t* tar, const mtar_header_t* header, void* arg)
+{
+ // ...
+ return 0;
+}
-/* Write strings to files `test1.txt` and `test2.txt` */
-mtar_write_file_header(&tar, "test1.txt", strlen(str1));
-mtar_write_data(&tar, str1, strlen(str1));
-mtar_write_file_header(&tar, "test2.txt", strlen(str2));
-mtar_write_data(&tar, str2, strlen(str2));
+int main(void)
+{
+ mtar_t tar;
-/* Finalize -- this needs to be the last thing done before closing */
-mtar_finalize(&tar);
+ // ...
-/* Close archive */
-mtar_close(&tar);
+ int ret = mtar_foreach(&tar, foreach_cb, NULL);
+ if(ret < 0) {
+ /* Microtar error codes are negative and may be returned if
+ * there is a problem with the iteration. */
+ } else if(ret == MTAR_ESUCCESS) {
+ /* If the iteration reaches the end of the archive without
+ * errors, the return code is MTAR_ESUCCESS. */
+ } else if(ret > 0) {
+ /* Positive values might be returned by the callback to
+ * signal some condition was met; they'll never be returned
+ * by microtar */
+ }
+}
```
+The other thing you're likely to do is look for a specific file:
+
+```c
+/* Seek to a specific member in the archive */
+int err = mtar_find(&tar, "foo.txt");
+if(err == MTAR_ESUCCESS) {
+ /* File was found -- read the header with mtar_get_header() */
+} else if(err == MTAR_ENOTFOUND) {
+ /* File wasn't in the archive */
+} else {
+ /* Some error occurred */
+}
+```
+
+Note this isn't terribly efficient since it scans the entire archive
+looking for the file.
+
+
+### Reading file data
+
+Once pointed at a file via `mtar_next()` or `mtar_find()` you can read the
+data with a simple POSIX-like API.
+
+- `mtar_read_data(tar, buf, count)` reads up to `count` bytes into `buf`,
+ returning the actual number of bytes read, or a negative error value.
+ If at EOF, this returns zero.
+
+- `mtar_seek_data(tar, offset, whence)` works exactly like `fseek()` with
+ `whence` being one of `SEEK_SET`, `SEEK_CUR`, or `SEEK_END` and `offset`
+ indicating a point relative to the beginning, current position, or end
+ of the file. Returns zero on success, or a negative error code.
+
+- `mtar_eof_data(tar)` returns nonzero if the end of the file has been
+ reached. It is possible to seek backward to clear this condition.
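Because `mtar_read_data()` is described as returning up to `count` bytes, callers that need an exact amount will typically loop. The sketch below illustrates that loop in isolation and does not call microtar: `read_fn`, `read_fully`, `fake_src_t`, and `fake_read` are hypothetical names, and `fake_read` only imitates the documented return convention (bytes read, zero at EOF, negative on error).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a reader with mtar_read_data()-like semantics:
 * returns bytes read (possibly fewer than count), 0 at EOF, negative on error. */
typedef int (*read_fn)(void* ctx, void* buf, unsigned count);

/* Keep reading until `count` bytes arrive, EOF is hit, or an error occurs.
 * Returns the total number of bytes read, or the negative error code. */
static int read_fully(read_fn fn, void* ctx, void* buf, unsigned count)
{
    unsigned total = 0;

    assert(fn != NULL && buf != NULL);

    while(total < count) {
        int n = fn(ctx, (char*)buf + total, count - total);
        if(n < 0)
            return n;   /* propagate the error unchanged */
        if(n == 0)
            break;      /* EOF: return what we have so far */
        total += (unsigned)n;
    }

    return (int)total;
}

/* Toy source that hands out at most 3 bytes per call to force short reads. */
typedef struct {
    const char* data;
    unsigned size;
    unsigned pos;
} fake_src_t;

static int fake_read(void* ctx, void* buf, unsigned count)
{
    fake_src_t* s = ctx;
    unsigned left = s->size - s->pos;
    unsigned n = count < left ? count : left;

    if(n > 3)
        n = 3;

    memcpy(buf, s->data + s->pos, n);
    s->pos += n;
    return (int)n;
}
```

With the real library, a thin wrapper that forwards to `mtar_read_data(tar, buf, count)` could presumably take the place of `fake_read`.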
+
+
+### Writing archives
+
+If you have opened an archive for writing, your options are a bit more
+limited than with reading as you need to generate the whole archive in
+a single pass. Seeking around and rewriting previously written data is
+not allowed. Support for this wouldn't be hard to add, but it was not
+included in the interest of simplicity.
+
+The main functions are:
+
+- `mtar_write_header(tar, header)` writes out a new record. The amount
+ of data that follows is dictated by `header->size` and you will have
+ to write it out before moving to the next record.
+
+- `mtar_write_data(tar, buf, count)` will write up to `count` bytes from
+ `buf` into the current record. Returns the number of bytes actually
+ written or a negative error code. If you provide too much data, a short
+ count is returned.
+
+- `mtar_end_record(tar)` will end the current record. It will complain
+ if you did not write the amount of data specified in the header.
+
+- `mtar_finalize(tar)` is called after you have written all records to
+ the archive. It writes out some null records which mark the end of the
+ archive, so you cannot write any more records after calling this.
+
+It isn't necessary to call `mtar_end_record()` explicitly since it will
+be called automatically by `mtar_write_header()` and `mtar_finalize()`.
+Similarly, `mtar_finalize()` is implicitly called by `mtar_close()` if
+you don't do so yourself.
+
+Also note that `mtar_close()` can fail independently if there was a problem
+flushing buffered data to disk, so its return value should always be checked.
+
## Error handling
-All functions which return an `int` will return `MTAR_ESUCCESS` if the operation
-is successful. If an error occurs an error value less-than-zero will be
-returned; this value can be passed to the function `mtar_strerror()` to get its
-corresponding error string.
+Most functions that return `int` return an error code from `enum mtar_error`.
+Zero is success and all other error codes are negative. `mtar_strerror()` can
+return a string describing the error code.
-## Wrapping a stream
-If you want to read or write from something other than a file, the `mtar_t`
-struct can be manually initialized with your own callback functions and a
-`stream` pointer.
+A couple of functions use a different return value convention:
-All callback functions are passed a pointer to the `mtar_t` struct as their
-first argument. They should return `MTAR_ESUCCESS` if the operation succeeds
-without an error, or an integer below zero if an error occurs.
+- `mtar_foreach()` may return error codes or an arbitrary nonzero value provided
+ by the callback.
+- `mtar_read_data()` and `mtar_write_data()` return the number of bytes read
+ or written, or a negative error code. In particular, zero means that no bytes
+ were read or written.
+- `mtar_get_header()` may return `NULL` if there is no valid header.
+ It is only possible to see a null pointer if misusing the API or after
+ a previous error so checking for this is usually not necessary.
-After the `stream` field has been set, all required callbacks have been set and
-all unused fields have been zeroset the `mtar_t` struct can be safely used with
-the microtar functions. `mtar_open` *should not* be called if the `mtar_t`
-struct was initialized manually.
+There is essentially no support for error recovery. After an error you can
+only do two things reliably: close the archive with `mtar_close()` or try
+rewinding to the beginning with `mtar_rewind()`.
-#### Reading
-The following callbacks should be set for reading an archive from a stream:
-Name | Arguments | Description
---------|------------------------------------------|---------------------------
-`read` | `mtar_t *tar, void *data, unsigned size` | Read data from the stream
-`seek` | `mtar_t *tar, unsigned pos` | Set the position indicator
-`close` | `mtar_t *tar` | Close the stream
+## I/O hooks
-#### Writing
-The following callbacks should be set for writing an archive to a stream:
+You can provide your own I/O hooks in a `mtar_ops_t` struct. The same ops
+struct can be shared among multiple `mtar_t` objects but each object gets
+its own `void* stream` pointer.
-Name | Arguments | Description
---------|------------------------------------------------|---------------------
-`write` | `mtar_t *tar, const void *data, unsigned size` | Write data to the stream
+Name | Arguments | Required
+--------|-------------------------------------------|------------
+`read` | `void* stream, void* data, unsigned size` | If reading
+`write` | `void* stream, void* data, unsigned size` | If writing
+`seek` | `void* stream, unsigned pos` | If reading
+`close` | `void* stream` | Always
+`read` and `write` should transfer the number of bytes indicated
+and return the number of bytes actually read or written, or a negative
+`enum mtar_error` code on error.
-## License
-This library is free software; you can redistribute it and/or modify it under
-the terms of the MIT license. See [LICENSE](LICENSE) for details.
+`seek` must have semantics like `lseek(..., pos, SEEK_SET)`; that is,
+the position is an absolute byte offset in the stream. Seeking is not
+optional for read support, but the library only performs backward
+seeks under two circumstances:
+
+- `mtar_rewind()` seeks to position 0.
+- `mtar_seek_data()` may seek backward if the user requests it.
+
+Therefore, you will be able to get away with a limited forward-only
+seek function if you're able to read everything in a single pass and use
+the API carefully. Note `mtar_find()` and `mtar_foreach()` will call
+`mtar_rewind()`.
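As an illustration of the forward-only case, here is a minimal sketch of a `seek` hook for a stream that cannot really seek (a pipe, say). It emulates forward seeks by reading and discarding bytes, and rejects backward seeks. Everything here is an assumption made for the example: the `fwd_stream_t` wrapper, the `int` return types, and the `FWD_*` placeholder codes, which stand in for whatever `enum mtar_error` values a real hook would return.

```c
#include <assert.h>
#include <stdio.h>

/* Placeholder error codes for this sketch only; a real hook would
 * return negative values from enum mtar_error instead. */
#define FWD_ESEEKFAIL (-1)
#define FWD_EREADFAIL (-2)

/* Hypothetical stream wrapper: a FILE* that may not be seekable,
 * plus our own count of how many bytes have been consumed. */
typedef struct {
    FILE* fp;
    unsigned pos;
} fwd_stream_t;

/* Read hook: returns bytes read, 0 at EOF, negative on error. */
static int fwd_read(void* stream, void* data, unsigned size)
{
    fwd_stream_t* s = stream;
    size_t n;

    assert(s != NULL && s->fp != NULL);

    n = fread(data, 1, size, s->fp);
    if(n < size && ferror(s->fp))
        return FWD_EREADFAIL;

    s->pos += (unsigned)n;
    return (int)n;
}

/* Seek hook: `pos` is an absolute offset. Forward seeks are emulated
 * by discarding data; backward seeks are refused. */
static int fwd_seek(void* stream, unsigned pos)
{
    fwd_stream_t* s = stream;
    char scratch[256];
    int n;

    if(pos < s->pos)
        return FWD_ESEEKFAIL;   /* backward seek: not supported */

    while(s->pos < pos) {
        unsigned want = pos - s->pos;
        if(want > sizeof(scratch))
            want = sizeof(scratch);

        n = fwd_read(s, scratch, want);   /* advances s->pos */
        if(n <= 0)
            return FWD_ESEEKFAIL;         /* error or premature EOF */
    }

    return 0;
}
```

With a hook like this, anything that triggers `mtar_rewind()` after data has been consumed would fail, which is the trade-off described above.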
+
+`close` is called by `mtar_close()` to clean up the stream. Note the
+library assumes that the stream handle is cleaned up by `close` even
+if an error occurs.
+
+`seek` and `close` should return an `enum mtar_error` code: either
+`MTAR_ESUCCESS` or a negative value on error.
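To make the hook contract concrete, here is a minimal sketch of read-side hooks backed by an in-memory buffer. It is written against the argument lists in the table above and nothing else: the `int` return types, the `mem_stream_t` struct, and the `MEM_ESEEKFAIL` placeholder are assumptions for the example, and the sketch deliberately does not include `microtar.h` or construct an `mtar_ops_t`.

```c
#include <assert.h>
#include <string.h>

/* Placeholder for this sketch only; a real hook would return a
 * negative enum mtar_error value. */
#define MEM_ESEEKFAIL (-1)

/* Hypothetical stream state: a read-only byte buffer and a cursor. */
typedef struct {
    const unsigned char* data;
    unsigned size;
    unsigned pos;
    int closed;
} mem_stream_t;

/* Read hook: copies up to `size` bytes and returns the number copied.
 * Returns 0 once the cursor has reached the end of the buffer. */
static int mem_read(void* stream, void* data, unsigned size)
{
    mem_stream_t* s = stream;
    unsigned left;

    assert(s != NULL && s->pos <= s->size);

    left = s->size - s->pos;
    if(size > left)
        size = left;

    memcpy(data, s->data + s->pos, size);
    s->pos += size;
    return (int)size;
}

/* Seek hook: `pos` is an absolute byte offset, as with
 * lseek(..., pos, SEEK_SET). Both directions are supported. */
static int mem_seek(void* stream, unsigned pos)
{
    mem_stream_t* s = stream;

    if(pos > s->size)
        return MEM_ESEEKFAIL;

    s->pos = pos;
    return 0;
}

/* Close hook: nothing to free here, so just mark the stream closed. */
static int mem_close(void* stream)
{
    mem_stream_t* s = stream;
    s->closed = 1;
    return 0;
}
```

These functions would presumably be assigned to the `read`, `seek`, and `close` members of an `mtar_ops_t`, with a pointer to the `mem_stream_t` passed as the stream argument of `mtar_init()`.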