gitlab.com/gitlab-org/gitaly.git
author Jacob Vosmaer <jacob@gitlab.com> 2019-10-16 17:43:34 +0300
committer Jacob Vosmaer <jacob@gitlab.com> 2019-10-16 17:43:34 +0300
commit 427e0f5aa924bc310173a19cdda5b52b94b38283
tree 776382f21189e2c2a69baccc9083f8265ec50f12
parent 0abe0dd4194621aec898fafda0f98403afe0799e
parent a20c568602552814b954273ad41e97f46b46d2f8

Merge branch 'jv-doc-git-source' into 'master'

Add tips for reading Git source code

See merge request gitlab-org/gitaly!1551

-rw-r--r-- doc/README.md 1
-rw-r--r-- doc/reading_git_source.md 146
2 files changed, 147 insertions, 0 deletions
diff --git a/doc/README.md b/doc/README.md
index c0f957e00..d558b867e 100644
--- a/doc/README.md
+++ b/doc/README.md
@@ -30,6 +30,7 @@ For configuration please read [praefects configuration documentation](doc/config
- [Delta Islands](delta_islands.md)
- [Disk-based Cache](design_diskcache.md)
+- [Tips for reading Git source code](reading_git_source.md)
#### Proposals
diff --git a/doc/reading_git_source.md b/doc/reading_git_source.md
new file mode 100644
index 000000000..5c285a06a
--- /dev/null
+++ b/doc/reading_git_source.md
@@ -0,0 +1,146 @@
+# Tips for reading Git source code
+
+Although Git has good documentation, sometimes you just need to read the
+code to understand how it works. This document collects some tips on how
+to approach [Git's source code](https://gitlab.com/gitlab-org/git).
+
+## Audience
+
+This is written for Gitaly developers and GitLab troubleshooters (SREs, support engineers).
+
+## Look at the right version
+
+If you want to understand Git's behavior by reading the source, make
+sure you are reading the right source. Find out the Git version of the
+system you're investigating and select or check out the appropriate tag
+in Git.
+
+## Use a viewer with code search
+
+Online code search is usually not that great compared with code search
+in an offline text editor, or on the terminal with `git grep`.
+
+## Look at the tests
+
+If you want to know something that is not clear from the documentation,
+sometimes the answer is in the tests. These can be found in the
+[`t/` subdirectory](https://gitlab.com/gitlab-org/git/tree/master/t).
+
+In [`t/helper`](https://gitlab.com/gitlab-org/git/tree/master/t/helper)
+you can find C executables that expose some Git internal functions that
+you normally cannot call directly.
+
+The tests themselves are written in shell script. Instructions for
+running them are in
+[`t/README`](https://gitlab.com/gitlab-org/git/blob/master/t/README).
+However, often you don't have to run a test in order to understand what
+it does.
+
+If you're interested in the workings of a particular Git command, try
+searching the `t/` directory for it.
+
+## Look at the technical documentation
+
+There is a lot of [technical
+documentation](https://gitlab.com/gitlab-org/git/tree/master/Documentation/technical)
+in the Git source. If you want to know more about file formats, internal Git APIs or
+network protocols, this is a good place to start.
+
+## Code organization
+
+The Git subcommands we use to interact with Git are mostly (all?) found
+in the `builtin/`
+[directory](https://gitlab.com/gitlab-org/git/tree/master/builtin). For
+example, `git log` is
+[`builtin/log.c`](https://gitlab.com/gitlab-org/git/blob/master/builtin/log.c).
+
+The `.c` files at the top level of the Git repository contain code that
+is shared across sub-commands. For example,
+[`config.c`](https://gitlab.com/gitlab-org/git/blob/master/config.c)
+contains code related to getting and setting Git configuration values.
+Contrast this with `builtin/config.c`, which is the sub-command code for
+`git config`.
+
+When doing a code search for an error message, you sometimes get false
+matches in the `po/` directory, which contains localizations. You may
+want to ignore those or filter them out of your search.
+
+If you are trying to make sense of what some internal Git function does,
+you can read its definition somewhere in a `*.c` file in the root. There
+may also be some extra explanation in the corresponding `*.h` (header)
+file; the header files define the API of the corresponding `*.c` file.
+
+## Sub-command source files
+
+### Not all sub-commands are written in C
+
+At the top level of the repository, you will find `*.sh` and `*.perl`
+files that implement some of Git's sub-commands. For example,
+[`git-bisect.sh`](https://gitlab.com/gitlab-org/git/blob/v2.22.0/git-bisect.sh).
+
+### Main function
+
+If you're used to reading Ruby or Go, the `builtin/*.c` files could be a
+little disorienting. This is because the function call graph is ordered
+with leaf functions at the top, and the main entrypoint will be at the
+bottom. This allows the Git source code to have fewer (or no) forward
+declarations of functions.
+
+So if you want to do a top-down walk of a Git sub-command, expect to
+find the main entry point at the bottom of the corresponding
+`builtin/*.c` file. The entry point for e.g. `git blame` will be called
+[`cmd_blame` in
+`builtin/blame.c`](https://gitlab.com/gitlab-org/git/blob/v2.22.0/builtin/blame.c#L778).
+Recall that hyphens are not allowed in function names, so the entry
+point for `git upload-pack` is `cmd_upload_pack`.
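+
+As a rough sketch of that layout, the made-up file below puts the leaf
+helper first and the entry point last, so no forward declaration is
+needed. It is not taken from Git: `git say-hello`, `cmd_say_hello` and
+`count_args` are hypothetical names, and the three-parameter signature
+is modelled on the v2.22.0 builtins, so check the version you are
+reading.
+
```C
#include <stddef.h>

/* Leaf helper: defined first so the functions below can call it. */
static int count_args(const char **argv)
{
	int n = 0;

	/* argv is assumed to be NULL-terminated, like main()'s argv */
	while (argv[n])
		n++;
	return n;
}

/*
 * Entry point for the hypothetical `git say-hello`, at the bottom of
 * the file. The hyphen in the command name becomes an underscore.
 */
int cmd_say_hello(int argc, const char **argv, const char *prefix)
{
	(void)prefix; /* unused in this sketch */

	if (argc != count_args(argv))
		return 1;
	return 0;
}
```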
+
+Some functions are not where you expect them. For example,
+`cmd_format_patch` is in `builtin/log.c`. Use code search!
+
+### Global state
+
+The way we write Ruby and Go at GitLab, it is common to bundle and hide
+state in classes (Ruby) or structs (Go). Global state is rare.
+
+Things are different in Git. Builtin commands often use `static`
+(i.e. file-scoped) global state. This reduces the number of arguments
+that have to be passed to functions, just like having state in a Ruby
+class does.
+
+You usually find the global variables at the top of the file.
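+
+A minimal sketch of the pattern, with invented names (`cmd_example`,
+`reverse`, `lines_printed` and the `--reverse` option are hypothetical,
+not from Git): the entry point sets the file-scoped variables once, and
+the helper reads them without receiving them as arguments.
+
```C
#include <string.h>

/* File-scoped state, as usually found near the top of a builtin. */
static int reverse;       /* set from the made-up --reverse option */
static int lines_printed; /* running counter shared by the helpers */

/* No options struct is passed in: the helper reads the globals directly. */
static int next_index(int i, int total)
{
	lines_printed++;
	return reverse ? total - 1 - i : i;
}

/* Hypothetical entry point: sets the state once, then calls the helper. */
int cmd_example(int argc, const char **argv)
{
	int i, last = -1;

	reverse = argc > 1 && !strcmp(argv[1], "--reverse");
	lines_printed = 0;
	for (i = 0; i < 3; i++)
		last = next_index(i, 3);
	return last; /* index produced by the final call */
}
```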
+
+## C trivia
+
+If you don't use C every day some things about it might be surprising.
+
+### Implicit use of "zero means false"
+
+In Ruby, you will never write `if some_number` because if `some_number`
+is a variable containing a number, that `if` is equivalent to `if true`.
+In Go, you are not allowed by the compiler to write `if someNumber {`.
+
+However, in C, it is valid to write `if (some_number)`: this is equivalent
+to `if (some_number != 0)`. Whether to rely on that is a matter of style, and
+in Git, you will see that `if (some_number)` is common.
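+
+The two spellings can be compared side by side in the sketch below. The
+function names are made up for illustration. The `!strcmp(...)` form at
+the end is a related idiom not discussed above: `strcmp` returns `0`
+when two strings are equal, so negating it reads as "the strings match".
+
```C
#include <string.h>

/* Implicit form: any non-zero value counts as true. */
int is_set_implicit(int some_number)
{
	if (some_number)
		return 1;
	return 0;
}

/* The same test spelled out. */
int is_set_explicit(int some_number)
{
	if (some_number != 0)
		return 1;
	return 0;
}

/* strcmp() returns 0 on a match, so !strcmp() is 1 for equal strings. */
int is_head(const char *refname)
{
	return !strcmp(refname, "HEAD");
}
```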
+
+A variation of this has to do with zero-terminated data structures such
+as classic C strings, and linked lists. The loop below will visit each
+character in the string. Note that the test condition of the loop, `*s`,
+will be `0` at the end of the string, and the loop will break.
+
+```C
+for (s = "some string"; *s; s++)
+```
+
+You will see the same pattern with linked lists, where the test
+condition is the pointer to the current element.
+
+```C
for (x = my_list; x; x = x->next)
+```
+
+This becomes even more cryptic if you are dealing with a `while` loop.
+
+```C
+while (x)
+```