4 files changed, 91 insertions, 10 deletions
diff --git a/Documentation/technical/api-hashmap.txt b/Documentation/technical/api-hashmap.txt
index b977ae8bbb..ad7a5bddd2 100644
--- a/Documentation/technical/api-hashmap.txt
+++ b/Documentation/technical/api-hashmap.txt
@@ -8,11 +8,19 @@ Data Structures
 
 `struct hashmap`::
 
-	The hash table structure.
+	The hash table structure. Members can be used as follows, but should
+	not be modified directly:
 +
-The `size` member keeps track of the total number of entries. The `cmpfn`
-member is a function used to compare two entries for equality. The `table` and
-`tablesize` members store the hash table and its size, respectively.
+The `size` member keeps track of the total number of entries (0 means the
+hashmap is empty).
++
+`tablesize` is the allocated size of the hash table. A non-0 value indicates
+that the hashmap is initialized. It may also be useful for statistical purposes
+(i.e. `size / tablesize` is the current load factor).
++
+`cmpfn` stores the comparison function specified in `hashmap_init()`. In
+advanced scenarios, it may be useful to change this, e.g. to switch between
+case-sensitive and case-insensitive lookup.
 
 `struct hashmap_entry`::
 
@@ -58,6 +66,15 @@ Functions
 +
 `strihash` and `memihash` are case insensitive versions.
 
+`unsigned int sha1hash(const unsigned char *sha1)`::
+
+	Converts a cryptographic hash (e.g. SHA-1) into an int-sized hash code
+	for use in hash tables. Cryptographic hashes are supposed to have
+	uniform distribution, so in contrast to `memhash()`, this just copies
+	the first `sizeof(int)` bytes without shuffling any bits. Note that
+	the results will be different on big-endian and little-endian
+	platforms, so they should not be stored or transferred over the net.
+
 `void hashmap_init(struct hashmap *map, hashmap_cmp_fn equals_function, size_t initial_size)`::
 
 	Initializes a hashmap structure.
@@ -101,6 +118,20 @@ hashmap_entry) that has at least been initialized with the proper hash code
 If an entry with matching hash code is found, `key` and `keydata` are passed
 to `hashmap_cmp_fn` to decide whether the entry matches the key.
 
+`void *hashmap_get_from_hash(const struct hashmap *map, unsigned int hash, const void *keydata)`::
+
+	Returns the hashmap entry for the specified hash code and key data,
+	or NULL if not found.
++
+`map` is the hashmap structure.
++
+`hash` is the hash code of the entry to look up.
++
+If an entry with matching hash code is found, `keydata` is passed to
+`hashmap_cmp_fn` to decide whether the entry matches the key. The
+`entry_or_key` parameter points to a bogus hashmap_entry structure that
+should not be used in the comparison.
+
 `void *hashmap_get_next(const struct hashmap *map, const void *entry)`::
 
 	Returns the next equal hashmap entry, or NULL if not found. This can be
@@ -162,6 +193,21 @@ more entries.
 `hashmap_iter_first` is a combination of both (i.e. initializes the iterator
 and returns the first entry, if any).
 
+`const char *strintern(const char *string)`::
+`const void *memintern(const void *data, size_t len)`::
+
+	Returns the unique, interned version of the specified string or data,
+	similar to the `String.intern` API in Java and .NET, respectively.
+	Interned strings remain valid for the entire lifetime of the process.
++
+Can be used as `[x]strdup()` or `xmemdupz` replacement, except that interned
+strings / data must not be modified or freed.
++
+Interned strings are best used for short strings with high probability of
+duplicates.
++
+Uses a hashmap to store the pool of interned strings.
+
 Usage example
 -------------
 
diff --git a/Documentation/technical/api-strbuf.txt b/Documentation/technical/api-strbuf.txt
index 077a7096a4..f9c06a7573 100644
--- a/Documentation/technical/api-strbuf.txt
+++ b/Documentation/technical/api-strbuf.txt
@@ -7,10 +7,10 @@ use the mem* functions than a str* one (memchr vs. strchr e.g.).
 Though, one has to be careful about the fact that str* functions often
 stop on NULs and that strbufs may have embedded NULs.
 
-An strbuf is NUL terminated for convenience, but no function in the
+A strbuf is NUL terminated for convenience, but no function in the
 strbuf API actually relies on the string being free of NULs.
 
-strbufs has some invariants that are very important to keep in mind:
+strbufs have some invariants that are very important to keep in mind:
 
 . The `buf` member is never NULL, so it can be used in any usual C
 string operations safely. strbuf's _have_ to be initialized either by
@@ -56,8 +56,8 @@ Data structures
 * `struct strbuf`
 
 This is the string buffer structure. The `len` member can be used to
-determine the current length of the string, and `buf` member provides access to
-the string itself.
+determine the current length of the string, and `buf` member provides
+access to the string itself.
 
 Functions
 ---------
@@ -202,7 +202,7 @@ strbuf_addstr(sb, "immediate string");
 
 `strbuf_addbuf`::
 
-	Copy the contents of an other buffer at the end of the current one.
+	Copy the contents of another buffer at the end of the current one.
 
 `strbuf_adddup`::
 
diff --git a/Documentation/technical/http-protocol.txt b/Documentation/technical/http-protocol.txt
index 59be59b0eb..229f845dfa 100644
--- a/Documentation/technical/http-protocol.txt
+++ b/Documentation/technical/http-protocol.txt
@@ -60,7 +60,7 @@ Because Git repositories are accessed by standard path components
 server administrators MAY use directory based permissions within
 their HTTP server to control repository access.
 
-Clients SHOULD support Basic authentication as described by RFC 2616.
+Clients SHOULD support Basic authentication as described by RFC 2617.
 Servers SHOULD support Basic authentication by relying upon the
 HTTP server placed in front of the Git server software.
 
diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
index f352a9b22e..fe6f31667d 100644
--- a/Documentation/technical/index-format.txt
+++ b/Documentation/technical/index-format.txt
@@ -129,6 +129,9 @@ Git index format
   (Version 4) In version 4, the padding after the pathname does not
   exist.
 
+  Interpretation of index entries in split index mode is completely
+  different. See below for details.
+
 == Extensions
 
 === Cached tree
@@ -198,3 +201,35 @@ Git index format
   - At most three 160-bit object names of the entry in stages from 1 to 3
     (nothing is written for a missing stage).
 
+=== Split index
+
+  In split index mode, the majority of index entries could be stored
+  in a separate file. This extension records the changes to be made on
+  top of that to produce the final index.
+
+  The signature for this extension is { 'l', 'i, 'n', 'k' }.
+
+  The extension consists of:
+
+  - 160-bit SHA-1 of the shared index file. The shared index file path
+    is $GIT_DIR/sharedindex.<SHA-1>. If all 160 bits are zero, the
+    index does not require a shared index file.
+
+  - An ewah-encoded delete bitmap, each bit represents an entry in the
+    shared index. If a bit is set, its corresponding entry in the
+    shared index will be removed from the final index.  Note, because
+    a delete operation changes index entry positions, but we do need
+    original positions in replace phase, it's best to just mark
+    entries for removal, then do a mass deletion after replacement.
+
+  - An ewah-encoded replace bitmap, each bit represents an entry in
+    the shared index. If a bit is set, its corresponding entry in the
+    shared index will be replaced with an entry in this index
+    file. All replaced entries are stored in sorted order in this
+    index. The first "1" bit in the replace bitmap corresponds to the
+    first index entry, the second "1" bit to the second entry and so
+    on. Replaced entries may have empty path names to save space.
+
+  The remaining index entries after replaced ones will be added to the
+  final index. These added entries are also sorted by entry namme then
+  stage.