diff options
Diffstat (limited to 'Documentation/technical')
-rw-r--r-- | Documentation/technical/api-hashmap.txt | 54 | ||||
-rw-r--r-- | Documentation/technical/api-strbuf.txt | 10 | ||||
-rw-r--r-- | Documentation/technical/http-protocol.txt | 2 | ||||
-rw-r--r-- | Documentation/technical/index-format.txt | 35 |
4 files changed, 91 insertions, 10 deletions
diff --git a/Documentation/technical/api-hashmap.txt b/Documentation/technical/api-hashmap.txt index b977ae8bbb..ad7a5bddd2 100644 --- a/Documentation/technical/api-hashmap.txt +++ b/Documentation/technical/api-hashmap.txt @@ -8,11 +8,19 @@ Data Structures `struct hashmap`:: - The hash table structure. + The hash table structure. Members can be used as follows, but should + not be modified directly: + -The `size` member keeps track of the total number of entries. The `cmpfn` -member is a function used to compare two entries for equality. The `table` and -`tablesize` members store the hash table and its size, respectively. +The `size` member keeps track of the total number of entries (0 means the +hashmap is empty). ++ +`tablesize` is the allocated size of the hash table. A non-0 value indicates +that the hashmap is initialized. It may also be useful for statistical purposes +(i.e. `size / tablesize` is the current load factor). ++ +`cmpfn` stores the comparison function specified in `hashmap_init()`. In +advanced scenarios, it may be useful to change this, e.g. to switch between +case-sensitive and case-insensitive lookup. `struct hashmap_entry`:: @@ -58,6 +66,15 @@ Functions + `strihash` and `memihash` are case insensitive versions. +`unsigned int sha1hash(const unsigned char *sha1)`:: + + Converts a cryptographic hash (e.g. SHA-1) into an int-sized hash code + for use in hash tables. Cryptographic hashes are supposed to have + uniform distribution, so in contrast to `memhash()`, this just copies + the first `sizeof(int)` bytes without shuffling any bits. Note that + the results will be different on big-endian and little-endian + platforms, so they should not be stored or transferred over the net. + `void hashmap_init(struct hashmap *map, hashmap_cmp_fn equals_function, size_t initial_size)`:: Initializes a hashmap structure. @@ -101,6 +118,20 @@ hashmap_entry) that has at least been initialized with the proper hash code If an entry with matching hash code is found, `key` and `keydata` are passed to `hashmap_cmp_fn` to decide whether the entry matches the key. +`void *hashmap_get_from_hash(const struct hashmap *map, unsigned int hash, const void *keydata)`:: + + Returns the hashmap entry for the specified hash code and key data, + or NULL if not found. ++ +`map` is the hashmap structure. ++ +`hash` is the hash code of the entry to look up. ++ +If an entry with matching hash code is found, `keydata` is passed to +`hashmap_cmp_fn` to decide whether the entry matches the key. The +`entry_or_key` parameter points to a bogus hashmap_entry structure that +should not be used in the comparison. + `void *hashmap_get_next(const struct hashmap *map, const void *entry)`:: Returns the next equal hashmap entry, or NULL if not found. This can be @@ -162,6 +193,21 @@ more entries. `hashmap_iter_first` is a combination of both (i.e. initializes the iterator and returns the first entry, if any). +`const char *strintern(const char *string)`:: +`const void *memintern(const void *data, size_t len)`:: + + Returns the unique, interned version of the specified string or data, + similar to the `String.intern` API in Java and .NET, respectively. + Interned strings remain valid for the entire lifetime of the process. ++ +Can be used as `[x]strdup()` or `xmemdupz` replacement, except that interned +strings / data must not be modified or freed. ++ +Interned strings are best used for short strings with high probability of +duplicates. ++ +Uses a hashmap to store the pool of interned strings. + Usage example ------------- diff --git a/Documentation/technical/api-strbuf.txt b/Documentation/technical/api-strbuf.txt index 077a7096a4..f9c06a7573 100644 --- a/Documentation/technical/api-strbuf.txt +++ b/Documentation/technical/api-strbuf.txt @@ -7,10 +7,10 @@ use the mem* functions than a str* one (memchr vs. strchr e.g.). Though, one has to be careful about the fact that str* functions often stop on NULs and that strbufs may have embedded NULs. -An strbuf is NUL terminated for convenience, but no function in the +A strbuf is NUL terminated for convenience, but no function in the strbuf API actually relies on the string being free of NULs. -strbufs has some invariants that are very important to keep in mind: +strbufs have some invariants that are very important to keep in mind: . The `buf` member is never NULL, so it can be used in any usual C string operations safely. strbuf's _have_ to be initialized either by @@ -56,8 +56,8 @@ Data structures * `struct strbuf` This is the string buffer structure. The `len` member can be used to -determine the current length of the string, and `buf` member provides access to -the string itself. +determine the current length of the string, and `buf` member provides +access to the string itself. Functions --------- @@ -202,7 +202,7 @@ strbuf_addstr(sb, "immediate string"); `strbuf_addbuf`:: - Copy the contents of an other buffer at the end of the current one. + Copy the contents of another buffer at the end of the current one. `strbuf_adddup`:: diff --git a/Documentation/technical/http-protocol.txt b/Documentation/technical/http-protocol.txt index 59be59b0eb..229f845dfa 100644 --- a/Documentation/technical/http-protocol.txt +++ b/Documentation/technical/http-protocol.txt @@ -60,7 +60,7 @@ Because Git repositories are accessed by standard path components server administrators MAY use directory based permissions within their HTTP server to control repository access. -Clients SHOULD support Basic authentication as described by RFC 2616. +Clients SHOULD support Basic authentication as described by RFC 2617. Servers SHOULD support Basic authentication by relying upon the HTTP server placed in front of the Git server software. diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt index f352a9b22e..fe6f31667d 100644 --- a/Documentation/technical/index-format.txt +++ b/Documentation/technical/index-format.txt @@ -129,6 +129,9 @@ Git index format (Version 4) In version 4, the padding after the pathname does not exist. + Interpretation of index entries in split index mode is completely + different. See below for details. + == Extensions === Cached tree @@ -198,3 +201,35 @@ Git index format - At most three 160-bit object names of the entry in stages from 1 to 3 (nothing is written for a missing stage). +=== Split index + + In split index mode, the majority of index entries could be stored + in a separate file. This extension records the changes to be made on + top of that to produce the final index. + + The signature for this extension is { 'l', 'i, 'n', 'k' }. + + The extension consists of: + + - 160-bit SHA-1 of the shared index file. The shared index file path + is $GIT_DIR/sharedindex.<SHA-1>. If all 160 bits are zero, the + index does not require a shared index file. + + - An ewah-encoded delete bitmap, each bit represents an entry in the + shared index. If a bit is set, its corresponding entry in the + shared index will be removed from the final index. Note, because + a delete operation changes index entry positions, but we do need + original positions in replace phase, it's best to just mark + entries for removal, then do a mass deletion after replacement. + + - An ewah-encoded replace bitmap, each bit represents an entry in + the shared index. If a bit is set, its corresponding entry in the + shared index will be replaced with an entry in this index + file. All replaced entries are stored in sorted order in this + index. The first "1" bit in the replace bitmap corresponds to the + first index entry, the second "1" bit to the second entry and so + on. Replaced entries may have empty path names to save space. + + The remaining index entries after replaced ones will be added to the + final index. These added entries are also sorted by entry namme then + stage. |