Diffstat (limited to 'doc/topics/git')
-rw-r--r-- | doc/topics/git/index.md | 6
-rw-r--r-- | doc/topics/git/migrate_to_git_lfs/index.md | 174
-rw-r--r-- | doc/topics/git/numerous_undo_possibilities_in_git/index.md | 92
-rw-r--r-- | doc/topics/git/partial_clone.md | 147
-rw-r--r-- | doc/topics/git/troubleshooting_git.md | 10
-rw-r--r-- | doc/topics/git/useful_git_commands.md | 210
6 files changed, 587 insertions, 52 deletions
diff --git a/doc/topics/git/index.md b/doc/topics/git/index.md index cdcd8215b23..6a539b526f3 100644 --- a/doc/topics/git/index.md +++ b/doc/topics/git/index.md @@ -48,6 +48,7 @@ The following are resources about version control concepts: The following resources may help you become more efficient at using Git: +- [Useful Git commands](useful_git_commands.md) collected by the GitLab support team. - [Git Tips & Tricks](https://about.gitlab.com/2016/12/08/git-tips-and-tricks/) - [Eight Tips to help you work better with Git](https://about.gitlab.com/2015/02/19/8-tips-to-help-you-work-better-with-git/) @@ -71,6 +72,7 @@ The following are advanced topics for those who want to get the most out of Git: - [Custom Git Hooks](../../administration/custom_hooks.md) - [Git Attributes](../../user/project/git_attributes.md) - Git Submodules: [Using Git submodules with GitLab CI](../../ci/git_submodules.md#using-git-submodules-with-gitlab-ci) +- [Partial Clone](partial_clone.md) ## API @@ -82,6 +84,8 @@ Git-related queries from GitLab. 
 The following relate to Git Large File Storage:
 
 - [Getting Started with Git LFS](https://about.gitlab.com/2017/01/30/getting-started-with-git-lfs-tutorial/)
-- [GitLab Git LFS documentation](../../workflow/lfs/manage_large_binaries_with_git_lfs.md)
+- [Migrate an existing Git repo with Git LFS](migrate_to_git_lfs/index.md)
+- [GitLab Git LFS user documentation](../../workflow/lfs/manage_large_binaries_with_git_lfs.md)
+- [GitLab Git LFS admin documentation](../../workflow/lfs/lfs_administration.md)
 - [Git-Annex to Git-LFS migration guide](../../workflow/lfs/migrate_from_git_annex_to_git_lfs.md)
 - [Towards a production quality open source Git LFS server](https://about.gitlab.com/2015/08/13/towards-a-production-quality-open-source-git-lfs-server/)
diff --git a/doc/topics/git/migrate_to_git_lfs/index.md b/doc/topics/git/migrate_to_git_lfs/index.md
new file mode 100644
index 00000000000..c879e404997
--- /dev/null
+++ b/doc/topics/git/migrate_to_git_lfs/index.md
@@ -0,0 +1,174 @@
+---
+type: tutorial, concepts
+description: "How to migrate an existing Git repository to Git LFS with BFG."
+last_updated: 2019-07-11
+---
+
+# Migrate a Git repo into Git LFS with BFG
+
+Using Git LFS can help you to reduce the size of your Git
+repository and improve its performance.
+
+However, simply adding the
+large files that are already in your repository to Git LFS
+will not actually reduce the size of your repository because
+the files are still referenced by previous commits.
+
+With the method described in this document, you first migrate
+to Git LFS with [BFG](https://rtyley.github.io/bfg-repo-cleaner/)
+through a mirror repo, then clean up the repository's history,
+and lastly create LFS tracking rules to prevent new binary files
+from being added.
+
+This tutorial was inspired by the guide
+[Use BFG to migrate a repo to Git LFS](https://confluence.atlassian.com/bitbucket/use-bfg-to-migrate-a-repo-to-git-lfs-834233484.html).
+For more information on Git LFS, see the [references](#references)
+below.
+
+CAUTION: **Warning:**
+The method described in this guide rewrites Git history. Make
+sure to back up your repo before beginning and use it at your
+own risk.
+
+## Requirements
+
+Before beginning, make sure:
+
+- You have enough LFS storage for the files you want to convert.
+ Storage is required for the entire history of all files.
+- All the team members you share the repository with have pushed all changes.
+ Branches based on the repository before applying this method cannot be merged.
+ Branches based on the repo before applying this method cannot be merged.
+
+To follow this tutorial, you'll need:
+
+- Maintainer permissions to the existing Git repository
+ you'd like to migrate to LFS with access through the command line.
+- [Git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git)
+ and [Java Runtime Environment](https://www.java.com/en/download/manual.jsp)
+ (Java 7 or above) installed locally.
+- BFG installed locally:
+
+ ```bash
+ brew install bfg
+ ```
+
+- Git LFS installed locally:
+
+ ```bash
+ brew install git-lfs
+ ```
+
+NOTE: **Note:**
+This guide was tested on macOS Mojave.
+
+## Steps
+
+Consider an example upstream project, `git@gitlab.com:gitlab-tests/test-git-lfs-repo-migration.git`.
+
+1. Back up your repository:
+
+ Create a copy of your repository so that you can
+ recover it in case something goes wrong.
+
+1. Clone `--mirror` the repo:
+
+ Cloning with the mirror flag will create a bare repository.
+ This ensures you get all the branches within the repo.
+
+ It creates a directory called `<repo-name>.git`
+ (in our example, `test-git-lfs-repo-migration.git`),
+ mirroring the upstream project:
+
+ ```bash
+ git clone --mirror git@gitlab.com:gitlab-tests/test-git-lfs-repo-migration.git
+ ```
+
+1. Convert the Git history with BFG:
+
+ ```bash
+ bfg --convert-to-git-lfs "*.{png,mp4,jpg,gif}" --no-blob-protection test-git-lfs-repo-migration.git
+ ```
+
+ BFG scans the entire history, looks for any files with
+ those extensions, and converts them to LFS pointers.
+
+1. Clean up the repository:
+
+ ```bash
+ # cd path/to/mirror/repo:
+ cd test-git-lfs-repo-migration.git
+ # clean up the repo:
+ git reflog expire --expire=now --all && git gc --prune=now --aggressive
+ ```
+
+ You can also look at how to further [clean the repo](../../../user/project/repository/reducing_the_repo_size_using_git.md),
+ but it's not necessary for the purposes of this guide.
+
+1. Install Git LFS in the mirror repository:
+
+ ```bash
+ git lfs install
+ ```
+
+1. [Unprotect the default branch](../../../user/project/protected_branches.md),
+ so that we can force-push the rewritten repository:
+
+ 1. Navigate to your project's **Settings > Repository** and
+ expand **Protected Branches**.
+ 1. Scroll down to locate the protected branches and click
+ **Unprotect** for the default branch.
+
+1. Force-push to GitLab:
+
+ ```bash
+ git push --force
+ ```
+
+1. Track the files you want with LFS:
+
+ ```bash
+ # cd path/to/upstream/repo:
+ cd test-git-lfs-repo-migration
+ # You may need to reset your local copy with upstream's `master` after force-pushing from the mirror:
+ git reset --hard origin/master
+ # Track the files with LFS:
+ git lfs track "*.gif" "*.png" "*.jpg" "*.psd" "*.mp4" ".gitattributes" "img/"
+ ```
+
+ Now all the existing files you converted, as well as the new
+ ones you add, will be properly tracked with LFS.
+
+1. [Re-protect the default branch](../../../user/project/protected_branches.md):
+
+ 1. Navigate to your project's **Settings > Repository** and
+ expand **Protected Branches**.
+ 1. Select the default branch from the **Branch** dropdown menu,
+ and set up the
+ **Allowed to push** and **Allowed to merge** rules.
+ 1. Click **Protect**.
+ +<!-- ## Troubleshooting + +Include any troubleshooting steps that you can foresee. If you know beforehand what issues +one might have when setting this up, or when something is changed, or on upgrading, it's +important to describe those, too. Think of things that may go wrong and include them here. +This is important to minimize requests for support, and to avoid doc comments with +questions that you know someone might ask. + +Each scenario can be a third-level heading, e.g. `### Getting error message X`. +If you have none to add when creating a doc, leave this section in place +but commented out to help encourage others to add to it in the future. --> + +## References + +- [Getting Started with Git LFS](https://about.gitlab.com/2017/01/30/getting-started-with-git-lfs-tutorial/) +- [Migrate from Git Annex to Git LFS](../../../workflow/lfs/migrate_from_git_annex_to_git_lfs.md) +- [GitLab's Git LFS user documentation](../../../workflow/lfs/manage_large_binaries_with_git_lfs.md) +- [GitLab's Git LFS administrator documentation](../../../workflow/lfs/lfs_administration.md) +- Alternative method to [migrate an existing repo to Git LFS](https://github.com/git-lfs/git-lfs/wiki/Tutorial#migrating-existing-repository-data-to-lfs) + +<!-- +Test project: +https://gitlab.com/gitlab-tests/test-git-lfs-repo-migration +--> diff --git a/doc/topics/git/numerous_undo_possibilities_in_git/index.md b/doc/topics/git/numerous_undo_possibilities_in_git/index.md index 84201e11831..5cae532bf54 100644 --- a/doc/topics/git/numerous_undo_possibilities_in_git/index.md +++ b/doc/topics/git/numerous_undo_possibilities_in_git/index.md @@ -110,21 +110,21 @@ At this point there are 3 options to undo the local changes you have: - Discard all local changes, but save them for possible re-use [later](#quickly-save-local-changes): - ```shell - git stash - ``` + ```shell + git stash + ``` - Discarding local changes (permanently) to a file: - ```shell - git checkout -- <file> - ``` + ```shell + git 
checkout -- <file> + ``` - Discard all local changes to all files permanently: - ```shell - git reset --hard - ``` + ```shell + git reset --hard + ``` Before executing `git reset --hard`, keep in mind that there is also a way to just temporary store the changes without committing them using `git stash`. @@ -182,27 +182,27 @@ Now you have 4 options to undo your changes: - Unstage the file to current commit (HEAD): - ```shell - git reset HEAD <file> - ``` + ```shell + git reset HEAD <file> + ``` - Unstage everything - retain changes: - ```shell - git reset - ``` + ```shell + git reset + ``` - Discard all local changes, but save them for [later](#quickly-save-local-changes): - ```shell - git stash - ``` + ```shell + git stash + ``` - Discard everything permanently: - ```shell - git reset --hard - ``` + ```shell + git reset --hard + ``` ## Committed local changes @@ -240,21 +240,21 @@ In our example we will end up with commit `B`, that introduced bug/error. We hav - Undo (swap additions and deletions) changes introduced by commit `B`: - ```shell - git revert commit-B-id - ``` + ```shell + git revert commit-B-id + ``` - Undo changes on a single file or directory from commit `B`, but retain them in the staged state: - ```shell - git checkout commit-B-id <file> - ``` + ```shell + git checkout commit-B-id <file> + ``` - Undo changes on a single file or directory from commit `B`, but retain them in the unstaged state: - ```shell - git reset commit-B-id <file> - ``` + ```shell + git reset commit-B-id <file> + ``` - There is one command we also must not forget: **creating a new branch** from the point where changes are not applicable or where the development has hit a @@ -270,14 +270,14 @@ In our example we will end up with commit `B`, that introduced bug/error. We hav you can [cherry-pick](../../../user/project/merge_requests/cherry_pick_changes.md#cherry-picking-a-commit) that commit into a new merge request. 
- ![Create a new branch to avoid clashing](img/branching.png) + ![Create a new branch to avoid clashing](img/branching.png) - ```shell - git checkout commit-B-id - git checkout -b new-path-of-feature - # Create <commit F> - git commit -a - ``` + ```shell + git checkout commit-B-id + git checkout -b new-path-of-feature + # Create <commit F> + git commit -a + ``` ### With history modification @@ -297,9 +297,9 @@ delete commit `B`. - Rebase the range from current commit D to A: - ```shell - git rebase -i A - ``` + ```shell + git rebase -i A + ``` - Command opens your favorite editor where you write `drop` in front of commit `B`, but you leave default `pick` with all other commits. Save and exit the @@ -310,9 +310,9 @@ In case you want to modify something introduced in commit `B`. - Rebase the range from current commit D to A: - ```shell - git rebase -i A - ``` + ```shell + git rebase -i A + ``` - Command opens your favorite text editor where you write `edit` in front of commit `B`, but leave default `pick` with all other commits. Save and exit the editor to @@ -320,9 +320,9 @@ In case you want to modify something introduced in commit `B`. - Now do your edits and commit changes: - ```shell - git commit -a - ``` + ```shell + git commit -a + ``` You can find some more examples in [below section where we explain how to modify history](#how-modifying-history-is-done) diff --git a/doc/topics/git/partial_clone.md b/doc/topics/git/partial_clone.md new file mode 100644 index 00000000000..ea4223355d8 --- /dev/null +++ b/doc/topics/git/partial_clone.md @@ -0,0 +1,147 @@ +# Partial Clone for Large Repositories + +CAUTION: **Alpha:** +Partial Clone is an experimental feature, and will significantly increase +Gitaly resource utilization when performing a partial clone, and decrease +performance of subsequent fetch operations. + +As Git repositories become very large, usability decreases as performance +decreases. 
One major challenge is cloning the repository, because Git will
+download the entire repository including every commit and every version of
+every object. This can be slow to transfer, and can require large amounts of disk
+space.
+
+Historically, performing a **shallow clone**
+([`--depth`](https://www.git-scm.com/docs/git-clone#Documentation/git-clone.txt---depthltdepthgt))
+has been the only way to reduce the amount of data transferred when cloning
+a Git repository. This does not, however, allow filtering by sub-tree, which is
+important for monolithic repositories containing many projects, or by object
+size, which would prevent unnecessarily large objects from being downloaded.
+
+[Partial clone](https://github.com/git/git/blob/master/Documentation/technical/partial-clone.txt)
+is a performance optimization that "allows Git to function without having a
+complete copy of the repository. The goal of this work is to allow Git better
+handle extremely large repositories."
+
+Specifically, using partial clone, it should be possible for Git to natively
+support:
+
+- large objects, instead of using [Git LFS](https://git-lfs.github.com/)
+- enormous repositories
+
+Briefly, partial clone works by:
+
+- excluding objects from being transferred when cloning or fetching a
+ repository using a new `--filter` flag
+- downloading missing objects on demand
+
+Follow [Git for enormous repositories](https://gitlab.com/groups/gitlab-org/-/epics/773) for roadmap and updates.
+
+## Enabling partial clone
+
+GitLab 12.1 uses Git 2.21.0, which has an arbitrary file access security
+vulnerability when `uploadpack.allowFilter` is enabled, so this option should not be
+enabled in production environments.
+
+A feature flag is planned to enable `uploadpack.allowFilter` and
+`uploadpack.allowAnySHA1InWant` once the version of Git used by GitLab has been
+updated to Git 2.22.0.
+
+Follow [this issue](https://gitlab.com/gitlab-org/gitaly/issues/1553) for
+updates.
+
+## Excluding objects by size
+
+Partial Clone allows large objects to be stored directly in the Git repository,
+and be excluded from clones as desired by the user. This eliminates the
+error-prone process of deciding which objects should be stored in LFS or not. Using
+partial clone, all files – large or small – may be treated the same.
+
+With the `uploadpack.allowFilter` and `uploadpack.allowAnySHA1InWant` options
+enabled on the Git server:
+
+```bash
+# clone the repo, excluding blobs larger than 1 megabyte
+git clone --filter=blob:limit=1m <url>
+
+# in the checkout step of the clone, and any subsequent operations
+# any blobs that are needed will be downloaded on demand
+git checkout feature-branch
+```
+
+## Excluding objects by path
+
+Partial Clone allows clones to be filtered by path using a format similar to a
+`.gitignore` file stored inside the repository.
+
+With the `uploadpack.allowFilter` and `uploadpack.allowAnySHA1InWant` options
+enabled on the Git server:
+
+1. **Create a filter spec.** For example, consider a monolithic repository with
+ many applications, each in a different subdirectory in the root. Create a file
+ `shiny-app/.gitfilterspec` using the GitLab web interface:
+
+ ```.gitignore
+ # Only the paths listed in the file will be downloaded when performing a
+ # partial clone using `--filter=sparse:oid=shiny-app/.gitfilterspec`
+
+ # Explicitly include filterspec needed to configure sparse checkout with
+ # git config --local core.sparsecheckout true
+ # git show master:shiny-app/.gitfilterspec >> .git/info/sparse-checkout
+ shiny-app/.gitfilterspec
+
+ # Shiny App
+ shiny-app/
+
+ # Dependencies
+ shimmery-app/
+ shared-component-a/
+ shared-component-b/
+ ```
+
+1. **Create a new Git repository and fetch.** Support for `--filter=sparse:oid`
+ using the clone command is incomplete, so we will emulate the clone command
+ by hand, using `git init` and `git fetch`. Follow
+ [gitaly#1769](https://gitlab.com/gitlab-org/gitaly/issues/1769) for updates.
+
+ ```bash
+ # Create a new directory for the Git repository
+ mkdir jumbo-repo && cd jumbo-repo
+
+ # Initialize a new Git repository
+ git init
+
+ # Add the remote
+ git remote add origin git@gitlab.com:example/jumbo-repo
+
+ # Enable partial clone support for the remote
+ git config --local extensions.partialClone origin
+
+ # Fetch the filtered set of objects using the filterspec stored on the
+ # server. WARNING: this step is slow!
+ git fetch --filter=sparse:oid=master:shiny-app/.gitfilterspec origin
+
+ # Optional: observe there are missing objects that we have not fetched
+ git rev-list --all --quiet --objects --missing=print | wc -l
+ ```
+
+ CAUTION: **IDE and Shell integrations:**
+ Git integrations with `bash`, `zsh`, etc., and editors that automatically
+ show Git status information often run `git fetch`, which will fetch the
+ entire repository. You may need to disable or reconfigure these
+ integrations.
+
+1. **Sparse checkout** must be enabled and configured to prevent objects from
+ other paths being downloaded automatically when checking out branches. Follow
+ [gitaly#1765](https://gitlab.com/gitlab-org/gitaly/issues/1765) for updates.
+
+ ```bash
+ # Enable sparse checkout
+ git config --local core.sparsecheckout true
+
+ # Configure sparse checkout
+ git show master:shiny-app/.gitfilterspec >> .git/info/sparse-checkout
+
+ # Check out master
+ git checkout master
+ ```
diff --git a/doc/topics/git/troubleshooting_git.md b/doc/topics/git/troubleshooting_git.md
index 417d91bf834..11284da30af 100644
--- a/doc/topics/git/troubleshooting_git.md
+++ b/doc/topics/git/troubleshooting_git.md
@@ -51,11 +51,11 @@ Configuring *both* the client and the server is unnecessary.
- On UNIX, edit `~/.ssh/config` (create the file if it doesn’t exist) and add or edit: - ```text - Host your-gitlab-instance-url.com - ServerAliveInterval 60 - ServerAliveCountMax 5 - ``` + ```text + Host your-gitlab-instance-url.com + ServerAliveInterval 60 + ServerAliveCountMax 5 + ``` - On Windows, if you are using PuTTY, go to your session properties, then navigate to "Connection" and under "Sending of null packets to keep
diff --git a/doc/topics/git/useful_git_commands.md b/doc/topics/git/useful_git_commands.md
new file mode 100644
index 00000000000..030e62f485a
--- /dev/null
+++ b/doc/topics/git/useful_git_commands.md
@@ -0,0 +1,210 @@
+---
+type: reference
+---
+
+# Useful Git commands
+
+Here are some useful Git commands collected by the GitLab support team. You may not
+need to use them often, but they can come in handy when needed.
+
+## Remotes
+
+### Add another URL to a remote, so both remotes get updated on each push
+
+```sh
+git remote set-url --add <remote_name> <remote_url>
+```
+
+## Staging and reverting changes
+
+### Remove last commit and leave the changes staged
+
+```sh
+git reset --soft HEAD^
+```
+
+### Unstage a certain number of commits from HEAD
+
+To unstage 3 commits, for example, run:
+
+```sh
+git reset HEAD~3
+```
+
+### Unstage changes to a certain file from HEAD
+
+```sh
+git reset <filename>
+```
+
+### Revert a file to HEAD state and remove changes
+
+There are two options to revert changes to a file:
+
+- `git checkout <filename>`
+- `git checkout HEAD -- <filename>`
+
+### Undo a previous commit by creating a new replacement commit
+
+```sh
+git revert <commit-sha>
+```
+
+### Create a new message for last commit
+
+```sh
+git commit --amend
+```
+
+### Add a file to the last commit
+
+```sh
+git add <filename>
+git commit --amend
+```
+
+Append `--no-edit` to the `commit` command if you do not want to edit the commit
+message.
+
+## Stashing
+
+### Stash changes
+
+```sh
+git stash save
+```
+
+The default behavior of `stash` is to save, so you can also use just:
+
+```sh
+git stash
+```
+
+### Unstash your changes
+
+```sh
+git stash apply
+```
+
+### Discard your stashed changes
+
+```sh
+git stash drop
+```
+
+### Apply and drop your stashed changes
+
+```sh
+git stash pop
+```
+
+## Refs and Log
+
+### Use reflog to show the log of reference changes to HEAD
+
+```sh
+git reflog
+```
+
+### Check the Git history of a file
+
+The basic command to check the Git history of a file:
+
+```sh
+git log <file>
+```
+
+If you get this error message:
+
+```text
+fatal: ambiguous argument <file_name>: unknown revision or path not in the working tree.
+Use '--' to separate paths from revisions, like this:
+```
+
+Use this to check the Git history of the file:
+
+```sh
+git log -- <file>
+```
+
+### Find the tags that contain a particular SHA
+
+```sh
+git tag --contains <sha>
+```
+
+### Check the content of each change to a file
+
+```sh
+gitk <file>
+```
+
+### Check the content of each change to a file, following it past file renames
+
+```sh
+gitk --follow <file>
+```
+
+## Debugging
+
+### Use a custom SSH key for a Git command
+
+```sh
+GIT_SSH_COMMAND="ssh -i ~/.ssh/gitlabadmin" git <command>
+```
+
+### Debug cloning
+
+With SSH:
+
+```sh
+GIT_SSH_COMMAND="ssh -vvv" git clone <git@url>
+```
+
+With HTTPS:
+
+```sh
+GIT_TRACE_PACKET=1 GIT_TRACE=2 GIT_CURL_VERBOSE=1 git clone <url>
+```
+
+## Rebasing
+
+### Rebase your branch onto master
+
+The `-i` flag stands for 'interactive':
+
+```sh
+git rebase -i master
+```
+
+### Continue the rebase if paused
+
+```sh
+git rebase --continue
+```
+
+### Use git rerere
+
+To _reuse_ recorded solutions to the same problems when repeated:
+
+```sh
+git rerere
+```
+
+To enable `rerere` functionality:
+
+```sh
+git config --global rerere.enabled true
+```
+
+<!-- ## Troubleshooting
+
+Include any troubleshooting steps that you can foresee. If you know beforehand what issues
+one might have when setting this up, or when something is changed, or on upgrading, it's
+important to describe those, too. Think of things that may go wrong and include them here.
+This is important to minimize requests for support, and to avoid doc comments with
+questions that you know someone might ask.
+
+Each scenario can be a third-level heading, e.g. `### Getting error message X`.
+If you have none to add when creating a doc, leave this section in place
+but commented out to help encourage others to add to it in the future. -->
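
The reset commands listed in `doc/topics/git/useful_git_commands.md` can be checked in a throwaway repository before relying on them. The following is a minimal sketch and is not part of the commit: the directory variable `demo_dir`, the file `notes.txt`, the identity values, and the commit messages are hypothetical names chosen for illustration, and it assumes `git` and `mktemp` are available on the machine.

```sh
#!/bin/sh
# Sketch: compare `git reset --soft` with the default (mixed) reset.
set -eu

# Work in a temporary directory that is removed on exit (hypothetical name).
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT
cd "$demo_dir"

git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"

# Create four commits so that the third ancestor of HEAD exists.
for n in 1 2 3 4; do
  echo "line $n" >> notes.txt
  git add notes.txt
  git commit -q -m "commit $n"
done

# --soft moves HEAD back one commit; the change stays in the index (staged).
git reset -q --soft HEAD^
soft_status=$(git status --porcelain)
echo "after --soft HEAD^: $soft_status"

# Re-commit, then use the default mixed mode to go back three commits;
# the changes stay in the working tree but are no longer staged.
git commit -q -m "commit 4 again"
git reset -q HEAD~3
mixed_status=$(git status --porcelain)
echo "after mixed HEAD~3: $mixed_status"

# Count the commits still reachable from HEAD.
remaining=$(git rev-list --count HEAD)
echo "commits left: $remaining"
```

In `git status --porcelain` output, the first status column reports the index and the second reports the working tree, which is how the two reset modes can be told apart. Note that `HEAD~3` names the third ancestor along the first-parent line, whereas `HEAD^3` names the third parent of a merge commit and fails on an ordinary commit.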