From fee46e300ce5b473bb232fa5156fd1414e715807 Mon Sep 17 00:00:00 2001 From: Jacob Vosmaer Date: Tue, 13 Aug 2019 16:32:29 +0000 Subject: Move gitaly-proto to gitaly/proto --- proto/README.md | 324 ++++++++++++++++++++++++++++++++++++++++++++++++++- proto/README.orig.md | 319 -------------------------------------------------- proto/RELEASE.md | 30 ----- proto/REVISION | 1 - proto/SOURCE | 1 - proto/VERSION | 1 - 6 files changed, 319 insertions(+), 357 deletions(-) delete mode 100644 proto/README.orig.md delete mode 100644 proto/RELEASE.md delete mode 100644 proto/REVISION delete mode 100644 proto/SOURCE delete mode 100644 proto/VERSION (limited to 'proto') diff --git a/proto/README.md b/proto/README.md index c398d0c49..0b24bfcb7 100644 --- a/proto/README.md +++ b/proto/README.md @@ -1,7 +1,321 @@ -# Vendored copy of gitaly-proto +# Protobuf specifications and client libraries for Gitaly -Vendored from gitlab.com/gitlab-org/gitaly-proto at a1e8074ed11a176e964c505dc30d153cbe8fed4d. +> This directory was previously hosted at https://gitlab.com/gitlab-org/gitaly-proto. As of Gitaly 1.58.0 and gitaly-proto 1.39.0, all further proto changes will be made here, in `gitaly/proto`. -Migration in progress, see -https://gitlab.com/gitlab-org/gitaly/issues/1761. Do not edit files in -this directory, your changes will be ignored and overwritten. +Gitaly is part of GitLab. It is a [server +application](https://gitlab.com/gitlab-org/gitaly) that uses its own +gRPC protocol to communicate with its clients. This repository +contains the protocol definition and automatically generated wrapper +code for Go and Ruby. + +The .proto files define the remote procedure calls for interacting +with Gitaly. We keep auto-generated client libraries for Ruby and Go +in their respective subdirectories. The list of RPCs can be +[found here](https://gitlab-org.gitlab.io/gitaly-proto/). + +Run `make proto` from the root of the repository to regenerate the client +libraries after updating .proto files. + +See +[developers.google.com](https://developers.google.com/protocol-buffers/docs/proto3) +for documentation of the 'proto3' Protocol buffer specification +language. + +## Issues + +We have disabled the issue tracker of the gitaly-proto project. Please use the +[Gitaly issue tracker](https://gitlab.com/gitlab-org/gitaly/issues). + +## gRPC/Protobuf concepts + +The core Protobuf concepts we use are rpc, service and message. We use +these to define the Gitaly **protocol**. + +- **rpc** a function that can be called from the client and that gets + executed on the server. Belongs to a service. Can have one of four + request/response signatures: message/message (example: get metadata for + commit xxx), message/stream (example: get contents of blob xxx), + stream/message (example: create new blob with contents xxx), + stream/stream (example: git SSH session). +- **service** a logical group of RPC's. +- **message** like a JSON object except it has pre-defined types. +- **stream** an unbounded sequence of messages. In the Ruby clients + this looks like an Enumerator. + +gRPC provides an implementation framework based on these Protobuf concepts. + +- A gRPC **server** implements one or more services behind a network + listener. Example: the Gitaly server application. +- The gRPC toolchain automatically generates **client libraries** that + handle serialization and connection management. Example: the Go + client package and Ruby gem in this repository. +- gRPC **clients** use the client libraries to make remote procedure + calls. These clients must decide what network address to reach their + gRPC servers on and handle connection reuse: it is possible to + spread different gRPC services over multiple connections to the same + gRPC server. +- Officially a gRPC connection is called a **channel**. In the Go gRPC + library these channels are called **client connections** because + 'channel' is already a concept in Go itself. In Ruby a gRPC channel + is an instance of GRPC::Core::Channel. We use the word 'connection' + in this document. The underlying transport of gRPC, HTTP/2, allows + multiple remote procedure calls to happen at the same time on a + single connection to a gRPC server. In principle, a multi-threaded + gRPC client needs only one connection to a gRPC server. + +## Design decisions + +1. In Gitaly's case there is one server application + https://gitlab.com/gitlab-org/gitaly which implements all services + in the protocol. +1. In default GitLab installations each Gitaly client interacts with + exactly 1 Gitaly server, on the same host, via a Unix domain socket. + In a larger installation each Gitaly client will interact with many + different Gitaly servers (one per GitLab storage shard) via TCP + connections. +1. Gitaly uses + [grpc.Errorf](https://godoc.org/google.golang.org/grpc#Errorf) to + return meaningful + [errors](https://godoc.org/google.golang.org/grpc/codes#Code) to its + clients. +1. Each RPC `FooBar` has its own `FooBarRequest` and `FooBarResponse` + message types. Try to keep the structure of these messages as flat as + possible. Only add abstractions when they have a practical benefit. +1. We never make backwards incompatible changes to an RPC that is + already implemented on either the client side or server side. + Instead we just create a new RPC call and start a deprecation + procedure (see below) for the old one. +1. It is encouraged to put comments (starting with `//`) in .proto files. + Please put comments on their own lines. This will cause them to be + treated as documentation by the protoc compiler. +1. When choosing an RPC name don't use the service name as context. + Good: `service CommitService { rpc CommitExists }`. Bad: + `service CommitService { rpc Exists }`. + +### RPC naming conventions + +Gitaly-Proto has RPCs that are resource based, for example when querying for a +commit. Another class of RPCs are operations, where the result might be empty +or one of the RPC error codes but the fact that the operation took place is +of importance. + +For all RPCs, start the name with a verb, followed by an entity, and if required +followed by a further specification. For example: +- GetCommit +- RepackRepositoryIncremental +- CreateRepositoryFromBundle + +For resource RPCs the verbs in use are limited to: Get, List, Create, Update, +Delete, or Is. Where both Get and List as verbs denote these operations have no side +effects. These verbs differ in terms of the expected number of results the query +yields. Get queries are limited to one result, and are expected to return one +result to the client. List queries have zero or more results, and generally will +create a gRPC stream for their results. When the `Is` verb is used, this RPC +is expected to return a boolean, or an error. For example: `IsRepositoryEmpty`. + + +When an operation based RPC is defined, the verb should map to the first verb in +the Git command it represents. Example; FetchRemote. + +Note that the current interface defined in this repository does not yet abide +fully to these conventions. Newly defined RPCs should, though, so eventually +gitaly-proto converges to a common standard. + +### Common field names and types + +As a general principle, remember that Git does not enforce encodings on +most data inside repositories, so we can rarely assume data to be a +Protobuf "string" (which implies UTF-8). + +1. `bytes revision`: for fields that accept any of branch names / tag + names / commit ID's. Uses `bytes` to be encoding agnostic. +2. `string commit_id`: for fields that accept a commit ID. +3. `bytes ref`: for fields that accept a refname. +4. `bytes path`: for paths inside Git repositories, i.e., inside Git + `tree` objects. +5. `string relative_path`: for paths on disk on a Gitaly server, + created by "us" (GitLab the application) instead of the user, we + want to use UTF-8, or better, ASCII. + +### Stream patterns + +These are some patterns we already use, or want to use going forward. + +#### Stream response of many small items + +``` +rpc FooBar(FooBarRequest) returns (stream FooBarResponse); + +message FooBarResponse { + message Item { + // ... + } + repeated Item items = 1; +} +``` + +A typical example of an "Item" would be a commit. To avoid the penalty +of network IO for each Item we return, we batch them together. You can +think of this as a kind of buffered IO at the level of the Item +messages. In Go, to ease the bookkeeping you can use +[gitlab.com/gitlab-org/gitaly/internal/helper/chunker](https://godoc.org/gitlab.com/gitlab-org/gitaly/internal/helper/chunker). + +#### Single large item split over multiple messages + +``` +rpc FooBar(FooBarRequest) returns (stream FooBarResponse); + +message FooBarResponse { + message Header { + // ... + } + + oneof payload { + Header header = 1; + bytes data = 2; + } +} +``` + +A typical example of a large item would be the contents of a Git blob. +The header might contain the blob OID and the blob size. Only the first +message in the response stream has `header` set, all others have `data` +but no `header`. + +In the particular case where you're sending back raw binary data from +Go, you can use +[gitlab.com/gitlab-org/gitaly/streamio](https://godoc.org/gitlab.com/gitlab-org/gitaly/streamio) +to turn your gRPC response stream into an `io.Writer`. + +> Note that a number of existing RPC's do not use this pattern exactly; +> they don't use `oneof`. In practice this creates ambiguity (does the +> first message contain non-empty `data`?) and encourages complex +> optimization in the server implementation (trying to squeeze data into +> the first response message). Using `oneof` avoids this ambiguity. + +#### Many large items split over multiple messages + +``` +rpc FooBar(FooBarRequest) returns (stream FooBarResponse); + +message FooBarResponse { + message Header { + // ... + } + + oneof payload { + Header header = 1; + bytes data = 2; + } +} +``` + +This looks the same as the "single large item" case above, except +whenever a new large item begins, we send a new message with a non-empty +`header` field. + +#### Footers + +If the RPC requires it we can also send a footer using `oneof`. But by +default, we prefer headers. + +### RPC Annotations + +In preparation for Gitaly HA, we are now requiring all RPC's to be annotated +with an appropriate designation. All methods must contain one of the following lines: + +- `option (op_type).op = ACCESSOR;` + - Designates an RPC as being read-only (i.e. side effect free) +- `option (op_type).op = MUTATOR;` + - Designates that an RPC modifies the repository + +Failing to designate an RPC correctly will result in a CI error. For example: + +`--gitaly_out: server.proto: Method ServerInfo missing op_type option` + +Additionally, all mutator RPC's require additional annotations to clearly +indicate what is being modified: + +- When an RPC modifies a server-wide resource, the scope should specify `SERVER`. +- When an RPC modifies a specific repository, the scope should specify `REPOSITORY`. + - Additionally, every RPC with `REPOSITORY` scope, should also specify the target repository. + +The target repository represents the location or address of the repository +being modified by the operation. This is needed by Praefect (Gitaly HA) in +order to properly schedule replications to keep repository replicas up to date. + +The target repository annotation specifies where the target repository can be +found in the message. The annotation looks similar to an IP address, but +variable in length (e.g. "1", "1.1", "1.1.1"). Each dot delimited field +represents the field number of how to traverse the protobuf request message to +find the target repository. The target repository **must** be of protobuf +message type `gitaly.Repository`. + +See our examples of [valid](go/internal/linter/testdata/valid.proto) and +[invalid](go/internal/linter/testdata/invalid.proto) proto annotations. + +### Go Package + +If adding new protobuf files, make sure to correctly set the `go_package` option +near the top of the file: + +`option go_package = "gitlab.com/gitlab-org/gitaly-proto/go/gitalypb";` + +This allows other protobuf files to locate and import the Go generated stubs. If +you forget to add a `go_package` option, you may receive an error similar to: + +`blob.proto is missing the go_package option` + +## Contributing + +The CI at https://gitlab.com/gitlab-org/gitaly-proto regenerates the +client libraries to guard against the mistake of updating the .proto +files but not the client libraries. This check uses `git diff` to look +for changes. Some of the code in the Go client libraries is sensitive +to implementation details of the Go standard library (specifically, +the output of gzip). **Use the same Go version as .gitlab-ci.yml (Go +1.11)** when generating new client libraries for a merge request. + +[DCO + License](CONTRIBUTING.md) + +### Build process + +After you change or add a .proto file you need to re-generate the Go +and Ruby libraries before committing your change. + +``` +# Re-generate Go and Ruby libraries +make generate +``` + +## How to deprecate an RPC call + +See [DEPRECATION.md](DEPRECATION.md). + +## Release + +This will tag and release the gitaly-proto library, including +pushing the gem to rubygems.org + +``` +make release version=X.Y.Z +``` + + +## How to manually push the gem + +If the release script fails the gem may not be pushed. This is how you can do that after the fact: + +```shell +# Use a sub-shell to limit scope of 'set -e' +( + set -e + + # Replace X.Y.Z with the version you are pushing + GEM_VERSION=X.Y.Z + + git checkout v$GEM_VERSION + gem build gitaly.gemspec + gem push gitaly-$GEM_VERSION.gem +) +``` diff --git a/proto/README.orig.md b/proto/README.orig.md deleted file mode 100644 index 67dae178b..000000000 --- a/proto/README.orig.md +++ /dev/null @@ -1,319 +0,0 @@ -# Protobuf specifications and client libraries for Gitaly - -Gitaly is part of GitLab. It is a [server -application](https://gitlab.com/gitlab-org/gitaly) that uses its own -gRPC protocol to communicate with its clients. This repository -contains the protocol definition and automatically generated wrapper -code for Go and Ruby. - -The .proto files define the remote procedure calls for interacting -with Gitaly. We keep auto-generated client libraries for Ruby and Go -in their respective subdirectories. The list of RPCs can be -[found here](https://gitlab-org.gitlab.io/gitaly-proto/). - -Run `make` from the root of the repository to regenerate the client -libraries after updating .proto files. - -See -[developers.google.com](https://developers.google.com/protocol-buffers/docs/proto3) -for documentation of the 'proto3' Protocol buffer specification -language. - -## Issues - -We have disabled the issue tracker of the gitaly-proto project. Please use the -[Gitaly issue tracker](https://gitlab.com/gitlab-org/gitaly/issues). - -## gRPC/Protobuf concepts - -The core Protobuf concepts we use are rpc, service and message. We use -these to define the Gitaly **protocol**. - -- **rpc** a function that can be called from the client and that gets - executed on the server. Belongs to a service. Can have one of four - request/response signatures: message/message (example: get metadata for - commit xxx), message/stream (example: get contents of blob xxx), - stream/message (example: create new blob with contents xxx), - stream/stream (example: git SSH session). -- **service** a logical group of RPC's. -- **message** like a JSON object except it has pre-defined types. -- **stream** an unbounded sequence of messages. In the Ruby clients - this looks like an Enumerator. - -gRPC provides an implementation framework based on these Protobuf concepts. - -- A gRPC **server** implements one or more services behind a network - listener. Example: the Gitaly server application. -- The gRPC toolchain automatically generates **client libraries** that - handle serialization and connection management. Example: the Go - client package and Ruby gem in this repository. -- gRPC **clients** use the client libraries to make remote procedure - calls. These clients must decide what network address to reach their - gRPC servers on and handle connection reuse: it is possible to - spread different gRPC services over multiple connections to the same - gRPC server. -- Officially a gRPC connection is called a **channel**. In the Go gRPC - library these channels are called **client connections** because - 'channel' is already a concept in Go itself. In Ruby a gRPC channel - is an instance of GRPC::Core::Channel. We use the word 'connection' - in this document. The underlying transport of gRPC, HTTP/2, allows - multiple remote procedure calls to happen at the same time on a - single connection to a gRPC server. In principle, a multi-threaded - gRPC client needs only one connection to a gRPC server. - -## Design decisions - -1. In Gitaly's case there is one server application - https://gitlab.com/gitlab-org/gitaly which implements all services - in the protocol. -1. In default GitLab installations each Gitaly client interacts with - exactly 1 Gitaly server, on the same host, via a Unix domain socket. - In a larger installation each Gitaly client will interact with many - different Gitaly servers (one per GitLab storage shard) via TCP - connections. -1. Gitaly uses - [grpc.Errorf](https://godoc.org/google.golang.org/grpc#Errorf) to - return meaningful - [errors](https://godoc.org/google.golang.org/grpc/codes#Code) to its - clients. -1. Each RPC `FooBar` has its own `FooBarRequest` and `FooBarResponse` - message types. Try to keep the structure of these messages as flat as - possible. Only add abstractions when they have a practical benefit. -1. We never make backwards incompatible changes to an RPC that is - already implemented on either the client side or server side. - Instead we just create a new RPC call and start a deprecation - procedure (see below) for the old one. -1. It is encouraged to put comments (starting with `//`) in .proto files. - Please put comments on their own lines. This will cause them to be - treated as documentation by the protoc compiler. -1. When choosing an RPC name don't use the service name as context. - Good: `service CommitService { rpc CommitExists }`. Bad: - `service CommitService { rpc Exists }`. - -### RPC naming conventions - -Gitaly-Proto has RPCs that are resource based, for example when querying for a -commit. Another class of RPCs are operations, where the result might be empty -or one of the RPC error codes but the fact that the operation took place is -of importance. - -For all RPCs, start the name with a verb, followed by an entity, and if required -followed by a further specification. For example: -- GetCommit -- RepackRepositoryIncremental -- CreateRepositoryFromBundle - -For resource RPCs the verbs in use are limited to: Get, List, Create, Update, -Delete, or Is. Where both Get and List as verbs denote these operations have no side -effects. These verbs differ in terms of the expected number of results the query -yields. Get queries are limited to one result, and are expected to return one -result to the client. List queries have zero or more results, and generally will -create a gRPC stream for their results. When the `Is` verb is used, this RPC -is expected to return a boolean, or an error. For example: `IsRepositoryEmpty`. - - -When an operation based RPC is defined, the verb should map to the first verb in -the Git command it represents. Example; FetchRemote. - -Note that the current interface defined in this repository does not yet abide -fully to these conventions. Newly defined RPCs should, though, so eventually -gitaly-proto converges to a common standard. - -### Common field names and types - -As a general principle, remember that Git does not enforce encodings on -most data inside repositories, so we can rarely assume data to be a -Protobuf "string" (which implies UTF-8). - -1. `bytes revision`: for fields that accept any of branch names / tag - names / commit ID's. Uses `bytes` to be encoding agnostic. -2. `string commit_id`: for fields that accept a commit ID. -3. `bytes ref`: for fields that accept a refname. -4. `bytes path`: for paths inside Git repositories, i.e., inside Git - `tree` objects. -5. `string relative_path`: for paths on disk on a Gitaly server, - created by "us" (GitLab the application) instead of the user, we - want to use UTF-8, or better, ASCII. - -### Stream patterns - -These are some patterns we already use, or want to use going forward. - -#### Stream response of many small items - -``` -rpc FooBar(FooBarRequest) returns (stream FooBarResponse); - -message FooBarResponse { - message Item { - // ... - } - repeated Item items = 1; -} -``` - -A typical example of an "Item" would be a commit. To avoid the penalty -of network IO for each Item we return, we batch them together. You can -think of this as a kind of buffered IO at the level of the Item -messages. In Go, to ease the bookkeeping you can use -[gitlab.com/gitlab-org/gitaly/internal/helper/chunker](https://godoc.org/gitlab.com/gitlab-org/gitaly/internal/helper/chunker). - -#### Single large item split over multiple messages - -``` -rpc FooBar(FooBarRequest) returns (stream FooBarResponse); - -message FooBarResponse { - message Header { - // ... - } - - oneof payload { - Header header = 1; - bytes data = 2; - } -} -``` - -A typical example of a large item would be the contents of a Git blob. -The header might contain the blob OID and the blob size. Only the first -message in the response stream has `header` set, all others have `data` -but no `header`. - -In the particular case where you're sending back raw binary data from -Go, you can use -[gitlab.com/gitlab-org/gitaly/streamio](https://godoc.org/gitlab.com/gitlab-org/gitaly/streamio) -to turn your gRPC response stream into an `io.Writer`. - -> Note that a number of existing RPC's do not use this pattern exactly; -> they don't use `oneof`. In practice this creates ambiguity (does the -> first message contain non-empty `data`?) and encourages complex -> optimization in the server implementation (trying to squeeze data into -> the first response message). Using `oneof` avoids this ambiguity. - -#### Many large items split over multiple messages - -``` -rpc FooBar(FooBarRequest) returns (stream FooBarResponse); - -message FooBarResponse { - message Header { - // ... - } - - oneof payload { - Header header = 1; - bytes data = 2; - } -} -``` - -This looks the same as the "single large item" case above, except -whenever a new large item begins, we send a new message with a non-empty -`header` field. - -#### Footers - -If the RPC requires it we can also send a footer using `oneof`. But by -default, we prefer headers. - -### RPC Annotations - -In preparation for Gitaly HA, we are now requiring all RPC's to be annotated -with an appropriate designation. All methods must contain one of the following lines: - -- `option (op_type).op = ACCESSOR;` - - Designates an RPC as being read-only (i.e. side effect free) -- `option (op_type).op = MUTATOR;` - - Designates that an RPC modifies the repository - -Failing to designate an RPC correctly will result in a CI error. For example: - -`--gitaly_out: server.proto: Method ServerInfo missing op_type option` - -Additionally, all mutator RPC's require additional annotations to clearly -indicate what is being modified: - -- When an RPC modifies a server-wide resource, the scope should specify `SERVER`. -- When an RPC modifies a specific repository, the scope should specify `REPOSITORY`. - - Additionally, every RPC with `REPOSITORY` scope, should also specify the target repository. - -The target repository represents the location or address of the repository -being modified by the operation. This is needed by Praefect (Gitaly HA) in -order to properly schedule replications to keep repository replicas up to date. - -The target repository annotation specifies where the target repository can be -found in the message. The annotation looks similar to an IP address, but -variable in length (e.g. "1", "1.1", "1.1.1"). Each dot delimited field -represents the field number of how to traverse the protobuf request message to -find the target repository. The target repository **must** be of protobuf -message type `gitaly.Repository`. - -See our examples of [valid](go/internal/linter/testdata/valid.proto) and -[invalid](go/internal/linter/testdata/invalid.proto) proto annotations. - -### Go Package - -If adding new protobuf files, make sure to correctly set the `go_package` option -near the top of the file: - -`option go_package = "gitlab.com/gitlab-org/gitaly-proto/go/gitalypb";` - -This allows other protobuf files to locate and import the Go generated stubs. If -you forget to add a `go_package` option, you may receive an error similar to: - -`blob.proto is missing the go_package option` - -## Contributing - -The CI at https://gitlab.com/gitlab-org/gitaly-proto regenerates the -client libraries to guard against the mistake of updating the .proto -files but not the client libraries. This check uses `git diff` to look -for changes. Some of the code in the Go client libraries is sensitive -to implementation details of the Go standard library (specifically, -the output of gzip). **Use the same Go version as .gitlab-ci.yml (Go -1.11)** when generating new client libraries for a merge request. - -[DCO + License](CONTRIBUTING.md) - -### Build process - -After you change or add a .proto file you need to re-generate the Go -and Ruby libraries before committing your change. - -``` -# Re-generate Go and Ruby libraries -make generate -``` - -## How to deprecate an RPC call - -See [DEPRECATION.md](DEPRECATION.md). - -## Release - -This will tag and release the gitaly-proto library, including -pushing the gem to rubygems.org - -``` -make release version=X.Y.Z -``` - - -## How to manually push the gem - -If the release script fails the gem may not be pushed. This is how you can do that after the fact: - -```shell -# Use a sub-shell to limit scope of 'set -e' -( - set -e - - # Replace X.Y.Z with the version you are pushing - GEM_VERSION=X.Y.Z - - git checkout v$GEM_VERSION - gem build gitaly.gemspec - gem push gitaly-$GEM_VERSION.gem -) -``` diff --git a/proto/RELEASE.md b/proto/RELEASE.md deleted file mode 100644 index db92f239e..000000000 --- a/proto/RELEASE.md +++ /dev/null @@ -1,30 +0,0 @@ -# Gitaly-proto release process - -## Requirements - -- Ruby -- Bundler -- Go 1.10 - -## 1. Install dependencies - -If you have done a release before this may not be needed. - -``` -_support/install-protoc -``` - -## 2. Release - -This will: - -- do a final consistency check -- create a version bump commit -- create a tag -- build the gem -- ask for confirmation -- push the gem and the tag out to the world - -``` -_support/release 1.2.3 -``` diff --git a/proto/REVISION b/proto/REVISION deleted file mode 100644 index 365cbf9b1..000000000 --- a/proto/REVISION +++ /dev/null @@ -1 +0,0 @@ -v1.39.0 \ No newline at end of file diff --git a/proto/SOURCE b/proto/SOURCE deleted file mode 100644 index 700b61c39..000000000 --- a/proto/SOURCE +++ /dev/null @@ -1 +0,0 @@ -gitlab.com/gitlab-org/gitaly-proto \ No newline at end of file diff --git a/proto/VERSION b/proto/VERSION deleted file mode 100644 index 5edffce6d..000000000 --- a/proto/VERSION +++ /dev/null @@ -1 +0,0 @@ -1.39.0 -- cgit v1.2.3