Welcome to mirror list, hosted at ThFree Co, Russian Federation.

gitlab.com/gitlab-org/gitaly.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-12-19hook: Add the `ProcReceiveHook` serverKarthik Nayak
With the creation of `ProcReceiveHook` under the `HookManager`, we can now create the `ProcReceiveHook` server. The server simply propagates the data to/from the `HookMananger.ProcReceiveHook` to the client.
2023-12-08hook: Introduce `ProcReceiveHandler` and `ProcReceiveRegistry`Karthik Nayak
Introduce the `ProcReceiveHandler` which provides the mechanism for RPCs which invoked git-receive-pack(1) to interact with the proc-receive hook. It provides access to a list of references that a transaction is attempting to update, and functions to accept or reject individual updates. Also, introduce `ProcReceiveRegistry`, a registry which provides the proc-receive handlers against a provided transaction ID. The registry allows RPCs which perform commands that execute git-receive-pack(1) to hook into the proc-receive handler. The RPC must register itself with the registry by calling RegisterWaiter(), this provides a channel where the handler will be provided along with a registry cleanup function.
2023-10-31Garbage collect value log entriesSami Hiltunen
BadgerDB stores the keys in the LSM tree and the in a log. The LSM tree is compacted in the background automatically. The client is responsible for triggering the value log's garbage collection to prune out unneeded values. As we'll soon wire transactions to Gitaly, we should start garbage collecting the value log as well. This commit makes PartitionManager run a garbage collection goroutine for each storage's database. The goroutine is started when the database is opened and stopped when the PartitionManager is closed. Garbage collection is ran for each database once per minute. The garbage collection works by rewriting the value log while omitting the pruned values. While the garbage collection runs once a minute, it's configured to only rewrite the log file if it can free more than 50% of the space. This aims to strike a balance between the load from rewriting the log files and reclaiming space. Each RunValueLogGC run garbage collects only a single file. If a file was garbage collected, we run the function again immediately to check for further files that may need garbage collection The garbage collection interval and the discard ratio may need to be changed in the future. We'll make them configurable in later changes.
2023-10-31Inject database opener into PartitionManagerSami Hiltunen
PartitionManager will soon gain a goroutine that runs garbage collection on the Badger database. In order to test that change, we need to be able to mock out badger in the test. This commit injects a database opener into PartitionManager so we can mock out the database in the tests.
2023-10-18Configure transaction middleware in test serverSami Hiltunen
This commit configures the transaction middleware in Gitaly's test server when GITALY_TEST_WAL environment variable is set. With the middleware in place, tests that are using the test server are automaticaclly ran with the transaction management enabled. Mutators which perform changes that are not automatically recorded in the transaction are updated to record the changes to have the tests pass. This includes RemoveRepository to record the deletion, UserSquash to record the unreference commit object, SetCustomHooks to record the new hooks and default branch updates. Some tests were adapted to the changed behavior with transactions. The behavior changes into the following categories: 1. Lock conflicts don't occur anymore with the transactions running in isolated snapshots. Such tests were disabled if transactions are enabled. 2. Concurrent reference update errors no longer manifest during the RPC handler's execution as the handlers are isolated from concurrently performed changes. Tests testing this behavior were updated to expect the reference conflict error from the transaction commit. 3. Objects are not written into the repository if the transaction doesn't commit. Tests which are asserting objects were migrated after the 'pre-receive' hook succeeded have been updated to assume that no objects make it into the repository if the transaction doesn't commit. 4. The handlers receive a rewritten repository that points to the snapshot. Tests which are asserting the relative path or the quarantine variables have been updated to not do so. Tests which are utilizing object pools have been disabled as object pools are not yet supported by the transaction manager.
2023-10-18Introduce a transaction managing middlewareSami Hiltunen
This commit implements gRPC middleware for running most repository scoped RPCs using transactions. The middleware reads the target repository from the request and starts a transaction for it. It then calls the handler with the request's repository rewritten to point to the transaction's snapshot repository. If the handler returns successfully the transaction is committed. If the handler returns an error, the transaction is rolled back. The transaction's ID is included in the context. This way it can be passed through to the hooks which can then retrieve the transaction and stage the reference updates made. The transaction itself is also included in the context. This way RPCs can stage changes that are not captured automatically like the reference updates are. This includes for example default branch updates, custom hook updates and others. Almost all accessors are ran transactionally. Most mutators are as well, and the ones that aren't are either deprecated or not yet supported by transactions. Maintenance RPCs aren't supported by transactions and will need to be integrated in the actual transaction manager. Unsupported mutators and maintenance RPCs are ran non-transactionally rather than erroring out to keep the tests passing. Accessors with quarantine configuration set are passed through. Such requests should only be sent from Rails' access checks. The parent request that called into the access checks would already be running in a transaction so another one isn't started. Currently the access don't target the snapshot repository yet. Mutators with quarantine configured are rejected as such requests shouldn't arrive.
2023-10-18Wire TransactionMiddleware to GitalyServerFactorySami Hiltunen
We'll soon have middleware to handle transaction management for RPC in a general manner. This commit wires GitalyServerFactory to optionally configure a middleware for doing so in the right location.
2023-10-11git/housekeeping: Replace use of ctxlogrus with injected loggerPatrick Steinhardt
The ctxlogrus package is going away with the replacement being log fields extracted from the context via `log.DebugContext()` et al. Refactor the code to stop using ctxlogrus by injecting a logger and using the new context-based logging methods.
2023-10-11git/localrepo: Replace use of ctxlogrus with injected loggerPatrick Steinhardt
The ctxlogrus package is going away with the replacement being log fields extracted from the context via `log.DebugContext()` et al. Refactor the code to stop using ctxlogrus by injecting a logger and using the new context-based logging methods.
2023-10-10transaction: Replace use of ctxlogrus with injected loggerPatrick Steinhardt
The ctxlogrus package is going away with the replacement being log fields extracted from the context via `log.DebugContext()` et al. Refactor the code to stop using ctxlogrus by injecting a logger and using the new context-based logging methods.
2023-10-10gitaly/hook: Replace use of ctxlogrus with injected loggerPatrick Steinhardt
The ctxlogrus package is going away with the replacement being log fields extracted from the context via `log.DebugContext()` et al. Refactor the code to stop using ctxlogrus by injecting a logger and using the new context-based logging methods.
2023-10-05updateref: Replace use of ctxlogrus with injected loggerPatrick Steinhardt
The ctxlogrus package is going away with the replacement being log fields extracted from the context via `log.DebugContext()` et al. Refactor the code to stop using ctxlogrus by injecting a logger and using the new context-based logging methods.
2023-10-03Merge branch 'pks-log-replace-ctxlogrus-pt1' into 'master'James Liu
log: Replace ctxlogrus (pt.1) See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6429 Merged-by: James Liu <jliu@gitlab.com> Approved-by: James Fargher <jfargher@gitlab.com> Approved-by: James Liu <jliu@gitlab.com> Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
2023-09-29global: Inject logger into service serversPatrick Steinhardt
Many callsites extract the logger via the passed-in context by using the `ctxlogrus` package. The ability to add loggers to the context via this package is going away upstream though in favor of a new set of logging functions that accept a context as input parameter themselves. We're thus going to introduce an equivalent interface, but that will require our service servers to have a logger available to them. Prepare for this change by injecting the logger into both Gitaly and Praefect servers.
2023-09-28limiter: Let limithandler.WithConcurrencyLimiters return created limitsQuang-Minh Nguyen
limithandler.WithConcurrencyLimiters returns a function to setup per-RPC concurrency limiter. This function is called when Gitaly process creates a limiter interceptor. It creates a lot of things and notably adaptive limit structs. Currenty, those limits are not connected to the adaptive calculator. In some later commits, we need to pull those limits out and plug into the calculator so that the calculator can adjust those limits. This commit adjusts the signature of that function and let it return the configured limits.
2023-09-26Wire TransactionRegistry in HookManagerSami Hiltunen
The reference transaction hook will need access to the transaction in order to stage the reference updates performed during a transaction. This commit wires TransactionRegistry into HookManager to give the hook access to the transaction.
2023-09-21limiter: Add adaptive limit to the concurrency limiterQuang-Minh Nguyen
This commit adds the adaptive limit to limiter.ConcurrencyLimiter. It changes the signature of the initializer to receive an AdaptiveLimit object and converts the queues in the limiter to dynamic pools. At this point, as the calculator has not been integrated (yet), the input limit never changes. It always returns a persistent limit, which is a perfect replacement for the integer static limit. In the future, when the calculator kicks in, the dynamic pools in the concurrency limiter scale up and down accordingly when the current limit is changed by the calculator.
2023-09-15log: Wrap the logrus loggerPatrick Steinhardt
Introduce a wrapper for the logrus logging infrastructure. This is a first step towards abstracting away the actual implementation of our logs behind a type that we can iterate on and will thus pave the way towards adoption of the `slog` package. Note that this is only one step of this transition. The interface does not yet roundtrip to itself, e.g. calling `log.Logger.WithError()` will result in a `logrus.Entry`, not in a `log.Logger`. This will be handled at a later point.
2023-09-13limiter: Replace limiter.ConcurrencyLimiter's ticker by context timeoutQuang-Minh Nguyen
`limiter.ConcurrencyLimiter` initializer receives a function that creates a new ticker. This ticker is used to notify the timeout event of an operation waiting in the queue for too long. This usage pattern is a bit far-stretched. Using a timeout context is more elegant. The only reason we used that pattern was for testing purposes, which was test-friendly. Thankfully, in a prior commit, a test helper for simulating context timeout events was introduced. This commit replaces the ticker generator by context timeout.
2023-09-07global: Stop using public `client` packagePatrick Steinhardt
Refactor all callsites of our public `client` package such that they instead use our internal packages. This ensures that we can stay agile and refactor our internal APIs without breaking users of our public API.
2023-09-05global: Drop dependency on the Git2go executorPatrick Steinhardt
None of our services require the Git2go executor anymore, but we still inject it into the various services. Remove it.
2023-09-05cmd/gitaly-git2go: Use log writer instead of splicing log outputsPatrick Steinhardt
When spawning gitaly-git2go processes, we redirect their stderr to print to the writer used by our global logging mechanism. This should in the general case be stderr of the Gitaly process, which means that any logs would get spliced into the main logging stream generated by Gitaly. The mechanism isn't great though, and we're about to get rid of the global logging instance in favor of the injected ones. Given that this is the last instance where we still rely on the global logger's internal workings, and that the gitaly-git2go command is going away in a release or two anyway, there is thus a strong urge to adapt it to unblock our remaining logging refactorings. And that's exactly what we do now. Instead of reaching into the global logging instance to retrieve its writer, we simply construct a writer from the logger and pass that to the newly created process. Now this has the downside that we're unable to control the log level, and furthermore it will cause the log messages themselves to be encoded in JSON again. But as said, this is only a temporary measure that we can get rid of in the near future by removing the complete gitaly-git2go command. This change removes the last instance where we rely on our logger to be a `logrus.Logger` and is also the last user of `log.Default()`.
2023-08-31Print Gitaly and Praefect logs on test failureSami Hiltunen
Gitaly and Praefect logs are currently redirected to files. To debug test failures, one has to open the file separately and download the log artifacts from CI. Instead of redirecting the logs to a different files, use the recording logger and output the logs automatically when a test case fails.
2023-08-18cache/diskcache: Inject logger instead of using global onePatrick Steinhardt
We're using the global logging instance provided by the logrus package, which is discouraged. Inject a logger such that we can use a properly configured logger instead.
2023-08-17Revert introduction of adaptive pack-objects limiterPatrick Steinhardt
In 023c2a7d0 (Merge branch 'qmnguyen0711/add-adaptive-limit-to-pack-objects-limit' into 'master', 2023-08-09), we have introduced new code to add an adaptive limit to pack-objects. This new limiter is seemingly creating a memory leak that causes issues in production systems. Revert this merge request. Changelog: fixed
2023-08-15Correct context.Context and testing.T order in testsQuang-Minh Nguyen
This commit fixes some incorrect parameter ordering in tests. We always set testing.T before context.Context.
2023-08-14Merge branch 'wc/repo-count-dedup' into 'master'Justin Tobler
counter: Key metric on storage path instead of name See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6188 Merged-by: Justin Tobler <jtobler@gitlab.com> Approved-by: Steve Xuereb <sxuereb@gitlab.com> Approved-by: Quang-Minh Nguyen <qmnguyen@gitlab.com> Approved-by: Justin Tobler <jtobler@gitlab.com> Reviewed-by: Steve Xuereb <sxuereb@gitlab.com> Reviewed-by: Will Chandler <wchandler@gitlab.com> Reviewed-by: Quang-Minh Nguyen <qmnguyen@gitlab.com> Co-authored-by: Will Chandler <wchandler@gitlab.com>
2023-08-12Merge branch 'pks-log-refactor-configuration' into 'master'Justin Tobler
log: Refactor configuration to be more self-contained and explicit See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6201 Merged-by: Justin Tobler <jtobler@gitlab.com> Approved-by: Justin Tobler <jtobler@gitlab.com> Approved-by: Will Chandler <wchandler@gitlab.com> Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
2023-08-11counter: Use storage path instead of nameWill Chandler
On dotcom each Gitaly server is configured with all storages know to rails, with each using the same path. In practice only on storage will be sent to the host, but this allows us to keep the config identical across all Gitaly servers. Currently `RepositoryCounter` will walk each storage configured on startup, which is causing each Gitaly server to generate <STORAGE_CT>-1 duplicate values and creating very high cardinality on the `gitaly_total_repository_count` metric. To resolve this, we switch the metric key from the storage name over to its path, which will always be unique on a given server. We now store a map of storage name to paths as inserts now require us to convert the storage name to its path. Changlog: fixed
2023-08-09limiter: Add adaptive limit to the concurrency limiterQuang-Minh Nguyen
This commit adds the adaptive limit to limiter.ConcurrencyLimiter. It changes the signature of the initializer to receive an AdaptiveLimit object and converts the queues in the limiter to dynamic pools. At this point, as the calculator has not been integrated (yet), the input limit never changes. It always returns a persistent limit, which is a perfect replacement for the integer static limit. In the future, when the calculator kicks in, the dynamic pools in the concurrency limiter scale up and down accordingly when the current limit is changed by the calculator.
2023-08-09limiter: Add context to limithandler.WithConcurrencyLimitersQuang-Minh Nguyen
This commit adds context to limithandler.WithConcurrencyLimiters. This addition makes it align with limithandler.WithRateLimiters's signature. This context is used to control the cancellation of some internal goroutines in some next commits.
2023-08-09log: Move logging configuration into "internal/log"Patrick Steinhardt
The logging configuration is reused by both Gitaly and Praefect but hosted in Gitaly-specific code. Let's move it into "internal/log" so that it becomes clearer that this code is generic.
2023-07-28counter: Wire up RepositoryCounter to repoutilWill Chandler
All repository creation and removal is done via `repoutil`, so hooking `RepositoryCounter` in here will allow us to track changes without dealing with the idiosyncracies of the many RPCs that may create repositories. Add a `RepositoryCounter` as a parameter to the `repoutil.Create` and `repoutil.Remove` functions and wire it up to necessary services. The exception to this is the `RemoveAll` RPC which deletes an entire storage, we use the counter directly here.
2023-07-27Unify closing method and field namesSami Hiltunen
PartitionManager and TransactionManager are using deviating names to signal scenarios when the closing process has been initiated and when it has completed. This commit unifies the names in both types and moves to use `close` for the method to initiate closing, `closing` for the channel that signifies the process has been initiated, and `closed` for the final end state when the closing has finished.
2023-07-25Implement helper for conditional testing with WALSami Hiltunen
We are soon plugging in the transaction management and running all tests with it enabled. While there mostly shouldn't be any behavioral differences, there may be some cases where the tests are testing scenarios that shouldn't really happen with transactions or asserting specific error messages. This commit introduces a helper that can be used to conditionally change the test behavior depending on whether or not transaction management is enabled.
2023-07-18Merge branch 'smh-move-updater-with-hooks' into 'master'Patrick Steinhardt
Move UpdaterWithHooks from git to hook package See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/6044 Merged-by: Patrick Steinhardt <psteinhardt@gitlab.com> Approved-by: James Fargher <jfargher@gitlab.com> Approved-by: Patrick Steinhardt <psteinhardt@gitlab.com> Co-authored-by: Sami Hiltunen <shiltunen@gitlab.com>
2023-07-13operations: Pass in SigingKeyJohn Cai
In order to make use of the SigningKey in the gitaly config, we need to pass it through to the operations service via service dependencies.
2023-07-11Move UpdaterWithHooks from git to hook packageSami Hiltunen
UpdaterWithHooks wraps 'git update-ref' with executing the hooks by calling Gitaly's HookManager. While the Updater itself is just a plain 'update-ref' wrapper, UpdaterWithHooks ties more into Gitaly's domain types. This is prone to creating cyclic dependencies as anything that needs to use the 'update-ref' wrapper will also import the hook package from Gitaly. This creates a problem for TransactionManager as we are about to integrate some of the transaction logic in the hooks. Fix the cyclic dependency by moving UpdaterWithHooks into its own subpackage below the main 'hook' package.
2023-07-11Extract core limiter out of limithandler middlewareQuang-Minh Nguyen
Recently, all limiter logic locate in limithandler middleware package (internal/grpc/middleware/limithandler). Technically speaking, this package is not the right location. The limiters include Per-RPC concurrency limiter, Per-RPC rate limiter, and pack-objects limiter. The pack-objects limiter is not a part of gRPC middleware. We plan to create some more limiters and advance them with adaptive limiters. This commit extracts the core limiter functionalities out of the limithandler package. The new location for them is at internal/limiter. The ones that belong to gRPC middleware stay intact.
2023-06-23Remove pack-objects hook's concurrency trackerQuang-Minh Nguyen
hook.ConcurrencyTracker is used for logging the concurrency usage of the pack-objects hook. It maintains a map of counters. Each counter is increased or decreased before and after a pack-objects command runs. Apart from logging, it also exposes some Prometheus metrics. The original purpose of these logs and metrics was to monitor the concurrency segmented by different kinds of actors (user/repository/remote_ip). That information was helpful to pick a strategy for the concurrency limiter. After we picked one and removed the rest, they become redundant. The limiters expose similar and somewhat richer logs and metrics. They can replace pack-objects concurrency tracker perfectly. Besides, they align with per-RPC limiters and future limiters, if any. This commit removes the pack-objects hook's concurrency tracker and its usages.
2023-06-19service: Add server-side backup dependencies to gitalyJames Fargher
Now that there are configs, we can add the corresponding dependencies. Ideally we would have added backup.Manager except that manager still accepts backupID which is request scoped. This will be future refactor work.
2023-06-14gitaly: Move write-ahead log logic into separate packagePatrick Steinhardt
The write-ahead log logic is currently contained in the rather generic `internal/gitaly` package. Since its introduction it has grown in scope though and has a more clearly defined purpose now, namely to manage all access to the Git repositories of a particular storage. Furthermore, the current naming schema creates some confusion due to the ever so tiny difference between `internal/gitaly/transaction_manager.go` betwen the write-ahead log's `internal/gitaly/transaction_manager.go` and `internal/gitaly/trannsaction/manager.go` part of our transactional voting logic. Move the implementation into a separate `storagemgr` package to more clearly define the purpose of the files.
2023-05-30gitaly: Add `RepositoryManager` to `TransactionManager`Karthik Nayak
In the `TransactionManager` we want to cleanup stale lock files if we run into them (but not when housekeeping is running). To do this we need the `housekeeping.RepositoryManager`, let's add this field to the TransactionManager. The following commit[s] will utilize the field.
2023-05-25Inject git.CommandFactory to TransactionManagerSami Hiltunen
This commit injects a git.CommandFactory instance into TransactionManager. This will later be needed to spawn git commands that don't run in a given repository.
2023-05-23Wire PartitionManager into Gitaly's test serverSami Hiltunen
PartitionManager is the entry point for the WAL logic. It handles routing transaction to the correct partitions and the lifecycle of TransactionManagers responsible for those partitions. As a first step into integrating it into our tests, wire the PartitionManager into Gitaly's test server. As we eventually want run the tests with both WAL enabled and disabled, the PartitionManager is configured only if 'GITALY_TEST_WAL' environment variable is set. All of the users of the WAL will have to check whether or not it is set. This remains the same when we eventually make it possible to enable the WAL logic through the configuration. If WAL is enabled, we'll configure and dependency inject the PartitionManager. If WAL is disabled, we'll leave it nil. testPackObjectsConcurrency was changed to use a unique config for each test case so each of the tests have a unique storages for the PartitionManager to initialize state in. For now, none of the RPCs are using the PartitionManager yet. Regardless, it is enabled in the test when the env is set and its initialization logic runs and is tested. We'll begin gradually integrating the WAL logic in to the RPCs one by one.
2023-05-19Merge branch 'pks-praefect-server-factory-flake' into 'master'Will Chandler
praefect: Fix flaky server factory test Closes #5137 See merge request https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5796 Merged-by: Will Chandler <wchandler@gitlab.com> Approved-by: Justin Tobler <jtobler@gitlab.com> Approved-by: Will Chandler <wchandler@gitlab.com> Co-authored-by: Patrick Steinhardt <psteinhardt@gitlab.com>
2023-05-18Decouple production code from test protoMikhail Mazurskiy
WithInterceptedMetadata() uses testproto.ErrorMetadata, which is in a test-only package. This function is only used in tests so move it to testhelper and interceptors into testserver
2023-05-17gitaly/client: Move into `internal/grpc` packagePatrick Steinhardt
Our internal client package is hosted in `internal/gitaly/client` even though it isn't Gitaly-specific at all. Move it into our new gRPC specific package to avoid confusion.
2023-05-16middleware: Move package into `internal/grpc/`Patrick Steinhardt
We've grown quite a bunch of packages that provide various different functionality in the context of gRPC. But it is not easy to identify them as there is no common hierarchy for gRPC-related functionality. We're thus introducing a new `internal/grpc` package that will host various bits and pieces related to gRPC. Move the `internal/middleware` package into this new package.
2023-05-16backchannel: Move package into `internal/grpc/`Patrick Steinhardt
We've grown quite a bunch of packages that provide various different functionality in the context of gRPC. But it is not easy to identify them as there is no common hierarchy for gRPC-related functionality. We're thus introducing a new `internal/grpc` package that will host various bits and pieces related to gRPC. Move the `internal/backchannel` package into this new package.