Age | Commit message (Collapse) | Author |
|
Various functions return error which we do not check. Fix these by
calling `require.NoError()` to catch any unexpected issues.
|
|
GitLab's tests use Praefect with the DisabledElector. Recently an
MR was merged to enable all Gitaly and Praefect feature flags by
default when the tests are being run. The pipeline was failing as the
DisabledElector did not support connection multiplexing while the
feature flags were enabled. Fix this by passing the multiplexed connections
in to the DisabledElector as well.
|
|
Inject backchannel dependencies to components
See merge request gitlab-org/gitaly!3339
|
|
This commit injects the multiplexing handshaker from Praefect's main
to the dialing locations. This allows us to later plug in a backchannel
server easily. This commit has no changes to the functionality itself.
|
|
The PerRepositoryRouter expects database records to exist for each
accessed repository. This is due to two reasons:
1. With each repository having its own primary, we don't have a
   virtual storage's primary anymore to fall back on when we don't
   know where the replicas exist.
2. With variable replication factor, we really need to know where to
send the request as not every storage contains all repositories
anymore.
This behavior caused problems for Rails, as Rails sends us requests
about repositories which do not exist. This could be due to various
race conditions or the repository might still be present in Rails'
caches. Rails is catching not found errors returned and handles them
gracefully.
Prior to this commit, the PerRepositoryRouter was returning 'no suitable
node to serve request' errors when we didn't have a node we could serve
the request from. This case was also hit when the repository had no
database records. To differentiate between the repository existing but all
host nodes being inaccessible and Rails asking for a non-existent repository,
this commit returns a proper status code to indicate the repository did not
exist.
This worked with virtual storage scoped primaries due to the fallback-to-
primary behavior when no replicas were found in the database. The primary
would then return NotFound when trying to access a non-existent repository on
the disk. This behavior is still maintained, and the case where no database
records exist is ignored when virtual storage scoped primaries are used.
|
|
After a couple of releases with the reads distribution feature
enabled, we are removing the feature flag completely.
The special test cases are deleted as the feature is now the default
and there is no way to change it.
The change also fixes some comments around reads distribution.
Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/3383
|
|
This commit removes DirectStorageProvider as it is just a proxy to
call GetConsistentStorages on the RepositoryStore.
|
|
Until now, we've been completely agnostic about which accessor RPC we're
routing: they simply all use reads distribution. This works just fine
for most of the RPCs, but leads to inconsistent results for some others
where it's really hard to fix.
Two such RPCs are `RepositorySize` and `GetObjectDirectorySize`: both of
them depend on the on-disk state of the repository. This is nothing we
have full control of: first, we do not assure that replicas always get
packed at the same point in time. And second, even if we did, we cannot
guarantee that git always ends up with the same set of packfiles. As a
result, the likelihood that replicas would report different sizes is
exceedingly high.
We could probably solve this by implementing some custom logic which
simply routes the RPC to all replicas and then takes the mean of all
returned sizes. But this is needlessly complex, and would require us to
put a lot of RPC-specific behaviour into Praefect. This commit instead
introduces a best-effort strategy of simply always routing both RPCs to
the primary. This may still lead to discrepancies when the primary node
is flapping. But that shouldn't happen too frequently, and when it does
we probably have other problems than repo sizes.
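The primary-only routing for these two RPCs can be sketched as a lookup on the RPC method name. The full method paths below are assumptions for illustration; only the RPC names come from the commit:

```go
package main

import "fmt"

// RPCs whose results depend on non-deterministic on-disk state (packfile
// layout); these are force-routed to the primary as a best-effort strategy.
// The full method paths are illustrative assumptions.
var primaryOnlyAccessors = map[string]bool{
	"/gitaly.RepositoryService/RepositorySize":         true,
	"/gitaly.RepositoryService/GetObjectDirectorySize": true,
}

// pickNode returns the node an accessor RPC is routed to: the primary for
// size RPCs, otherwise some up-to-date candidate (here simply the first).
func pickNode(fullMethod, primary string, upToDate []string) string {
	if primaryOnlyAccessors[fullMethod] || len(upToDate) == 0 {
		return primary
	}
	return upToDate[0]
}

func main() {
	fmt.Println(pickNode("/gitaly.RepositoryService/RepositorySize", "p", []string{"s1"})) // p
	fmt.Println(pickNode("/other.Service/Accessor", "p", []string{"s1"}))                  // s1
}
```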
|
|
TestCoordinator_grpcErrorHandling has a data race due to sharing
test server state between different test cases. If the primary node
returns an error, the proxy cancels all secondary streams. This cancel
is purely client side though: the proxy does not wait for the server
handlers to actually terminate. This causes a race where the primary's
error is returned while some secondary servers are still executing the
handler. The test goroutine then runs assertions and starts setting up
the next test case. The WaitGroup and the errByNode are shared between
the test cases, so the loop setting up the WaitGroup and errByNode then
races with the secondary server handler still executing from the previous
test case.
This commit fixes the race by setting up a new server for each test case
to avoid sharing the WaitGroup and errByNode between the test cases.
|
|
It's currently not possible to use our git DSL in the testhelper package
because of an import cycle between the testhelper and git package. To
fix this import cycle, we thus move test-related git helpers into the
gittest package.
This commit moves the repository helpers. As we're already touching all
sites which use these helpers anyway, it also aligns functions to have
more consistent naming.
|
|
TestCoordinator_grpcErrorHandling has a race in the
"primary error gets forwarded" error case. The mock server acks
a call as done before setting the flag indicating the node was called.
This causes a race where the primary might return an error before the
secondary calls have been recorded as having arrived, making the
test flake: the primary error is returned and the assertions
run before the secondary calls have been recorded.
This commit fixes the issue by recording the call as having arrived
prior to decrementing the counter of the WaitGroup.
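The fixed ordering can be sketched with a small, self-contained example (illustrative names, not the test's actual code): the shared "node was called" state is written before `wg.Done()`, so once `Wait` returns, every call is guaranteed to be recorded.

```go
package main

import (
	"fmt"
	"sync"
)

// callAll spawns one goroutine per node, mimicking the fixed mock server:
// the shared state is written under a mutex *before* wg.Done(), so the
// goroutine blocked in wg.Wait() always observes every recorded call.
func callAll(nodes []string) map[string]bool {
	var (
		wg     sync.WaitGroup
		mu     sync.Mutex
		called = make(map[string]bool)
	)
	for _, node := range nodes {
		wg.Add(1)
		go func(node string) {
			mu.Lock()
			called[node] = true // record the call first...
			mu.Unlock()
			wg.Done() // ...only then release the waiting test goroutine
		}(node)
	}
	wg.Wait()
	return called
}

func main() {
	fmt.Println(len(callAll([]string{"primary", "secondary-1", "secondary-2"}))) // 3
}
```

Doing it the other way around (`wg.Done()` before recording) is exactly the flake described above: `Wait` may return while some recordings are still pending.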
|
|
When proxying streams in a transaction, we do frame-wise proxying to
both primary and all secondaries at once. This is done by kicking off a
bunch of Goroutines to forward from client to primary/secondaries, and
from primary to client. This brings up some interesting questions about
error handling though. It's clear that we should abort the
proxying operation if the client errors out, as the channel is now
essentially broken; and it's also clear that, if proxying to the primary
fails, we need to cancel the stream. But right now, we also cancel
streams if proxying to the secondaries fails.
This last failure mode is quite debatable, and in fact it can cause
weird bugs. This is mostly because while we wait for all secondaries to
terminate, even if any of them errors out, the same isn't true for the
primary: if at least one secondary has returned an error, we'll happily
cancel the stream once proxying to the secondaries has concluded, even
if proxying to the primary is still ongoing.
As a mitigation, this commit thus starts to ignore any errors caused by
proxying to secondaries -- we already store their errors in a local map
anyway and consider them when creating replication jobs. It's not an
ideal fix though: when enough secondaries fail, then we know that
transactions cannot reach quorum anymore and should thus be able to just
abort the transaction. If necessary, we can do this in a follow-up fix.
|
|
Historically, we didn't have any access to error codes of proxying
destinations if proxying to more than one node at once. This is mostly
because the proxy code just conflated all errors into one, which made
proper error handling next to impossible. As a result, the transactional
finalizer cannot and thus doesn't take into account whether any of the
nodes fails, which may lead to spotty coverage when creating replication
jobs.
Fix this shortcoming by setting up error handlers for all proxying
destinations. Like this, we can easily grab any errors they encounter
and put them into a local map, which we can then forward to the
transaction finalizer. Now there are three cases to consider:
- The primary raised an error. In that case, we cannot help it and
need to create replication jobs to all nodes.
- A secondary raised an error. In that case, we need to consider it
as outdated and need to create a replication job for that specific
node.
- Neither of them failed, in which case the node is considered to be
up-to-date.
Note that currently we still pass the error through as-is even for the
secondaries. We should instead ignore errors on secondaries, but this is
left for a subsequent commit.
|
|
The current implementation of determining updated and outdated
secondaries for a transactional RPC is suboptimal as there are several
conditions where a node is not considered to be outdated. As a result,
we wouldn't schedule replication jobs in these scenarios and potentially
the state would be inconsistent across nodes.
This commit addresses most of these shortcomings:
- When no subtransactions were created, replication targets which
didn't take part in the transaction weren't considered outdated.
- If the primary failed the vote and didn't reach quorum, then no
replication jobs were created. We cannot judge this case though, as
there might have been on-disk changes which we must replicate.
- If we failed to get the transaction state, then we simply errored
and didn't finalize the transaction at all. While this does
indicate a real error, we must still finalize the transaction and
err on the safe side by creating replication jobs.
One shortcoming that isn't yet addressed is if any of the nodes fails
the RPC. This change is left for a subsequent commit so that we do the
best we can with the info we've currently got available.
|
|
The transactional request finalizer is quite hard to follow. This has
led to multiple issues with the current implementation where it doesn't
correctly handle some nodes and won't create replication jobs when
there's an error.
As a preparation for refactorings, this commit pulls out the code to
determine outdated and up-to-date nodes into its own standalone
function and adds a bunch of tests.
No functional change is expected from this commit.
|
|
This commit implements a bunch of tests to assert that we correctly
proxy gRPC errors from the primary while ignoring any errors from
secondaries. One of the tests is currently broken and will be fixed in a
subsequent commit.
The tests use the MockManager, which was missing two functions to get
nodes. These have also been mocked such that tests succeed.
|
|
This commit adds support for default replication factor in
Praefect's coordinator and hooks up the configuration option with
the rest of the code.
When a default replication factor is configured for a virtual storage
and repository specific primaries are enabled, the coordinator stores
the nodes participating in the repository's creation as the assigned
nodes.
|
|
When variable replication factor is enabled, the repository's host
nodes should be selected when the repository is being created. The
selected assignments should then be persisted so they can be used
for future routing decisions. This commit adds support for storing
assignments in RepositoryStore's CreateRepository.
The functionality is behind a boolean switch as assignments are not
supported when repository specific primaries are not used. A separate
switch from the one for storing the primary is used in order to support
repository specific primaries without variable replication factor. This
allows for independently rolling out the features to production later.
|
|
This commit adds support for selecting host nodes randomly on
repository creation in PerRepositoryRouter. New repositories get
random secondaries assigned as host nodes until the default
replication factor is met. If the virtual storage has no default
replication factor configured, every configured node, except the
randomly picked primary, is included as secondaries. This is
fallback behavior to match Praefect's behavior prior to variable
replication factor.
|
|
Adds Shuffle method to Random interface for shuffling a slice. While
not in use yet, the method is required for selecting random host nodes
when a repository is being created.
NewLockedRandom implements a mutex-protected wrapper for Random.
Previously it was implemented with a function wrapper. As we need more
than one method now, this commit changes the implementation into a
struct and adds the Shuffle method on it.
The interface is also implemented by mockRandom, which is used for mocking
out a Random during tests. mockRandom also gains the Shuffle function so
it implements the updated interface.
|
|
With reads distribution being enabled, requests would typically be
routed to a random up-to-date node. There are situations though where not
even up-to-date nodes would be able to serve a request, e.g. when object
quarantine is in use.
This commit implements force-routing to the primary for the node-manager
router. It's implemented via a new gRPC metadata header
"gitaly-route-repository-accessor-to-primary": if set,
repository-scoped accessors will always get routed to the primary.
|
|
This commit wires up repository creation routing to Praefect's
coordinator. If the RPC creates a repository, the routing is handled
in a different manner from other repository scoped mutators.
This fixes repository creation when repository specific primaries are
enabled. Prior to this, repository creation did not work as
PerRepositoryRouter needs to know where to route the mutator request,
and as there was no record of the repository, the router didn't know
where to route it. The router now picks a random healthy node
to act as the primary and sets the other nodes as replication targets.
It then stores the primary in the database in the request finalizers.
|
|
Repository creation when using repository specific primaries does not
work as the creation goes through the usual repository mutator route
which fails due to the repository not having a primary. As a preliminary
step towards fixing that, this commit adds an option that can be used
to store a repository's primary node in the database when creating its
record.
|
|
The `testhelper.GetTemporaryGitalySocketFileName()` function currently
returns an error. Given that all callers use `require.NoError()` on that
error, let's convert it to instead receive a `testing.TB` and not return
an error.
|
|
In c1064ad30 (Merge branch 'datastore_create_repo' into 'master',
2021-01-04) I merged a commit which ran afoul of the new checks in
20ef26439 (golangci: Enable error and documentation linters,
2020-12-17).
I think the CI was passing when I looked at the MR and it was based
off an earlier master, but I'm not 100% sure. In any case, this fixes
it.
See also my 149d56118 (golangci: fix linting issues on "master",
2020-12-24) for previous fights with the linter.
|
|
praefect: intercept CreateRepository* RPCs to populate database
See merge request gitlab-org/gitaly!2873
|
|
On each read/write operation praefect needs to know which
gitaly node is the primary. For mutator operations it is used to
determine from which node the response will be returned to
the client. For read operations it is used to redirect the request
to, or as a fallback option for reads distribution in case it
is enabled. The default strategy for determining the primary is
'sql', which means the primary is tracked inside the Postgres
database and praefect issues a select statement against it each
time it needs to determine the current primary. This creates a high
load on the database when there are too many read operations (the
outcome of the performance testing).
To resolve this problem we change the logic of retrieving
the set of up-to-date storages to return all storages including
the primary. With this in place we don't need to know the current
primary and can use any storage that has the latest generation of the
repository to serve the requests. As this information is cached
by the in-memory cache, praefect won't create a high load on the
database anymore.
This change also makes the IsLatestGeneration check for the primary
node redundant, as it won't be present in the set of consistent
storages if its generation is not the latest one.
Fix linting issues
Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/3337
|
|
|
|
As the strategy for interacting with the repository has
changed, there is no more need to provide the RepositoryStore
to Mgr as a dependency, as it is unused.
This commit removes RepositoryStore from the input parameters
of the constructor function and from the fields of the Mgr
struct.
|
|
This reverts commit 09c6d25de370446ac855a8241d8f821ed3f1ceec
|
|
On each read/write operation praefect needs to know which
gitaly node is the primary. For mutator operations it is used to
determine from which node the response will be returned to
the client. For read operations it is used to redirect the request
to, or as a fallback option for reads distribution in case it
is enabled. The default strategy for determining the primary is
'sql', which means the primary is tracked inside the Postgres
database and praefect issues a select statement against it each
time it needs to determine the current primary. This creates a high
load on the database when there are too many read operations (the
outcome of the performance testing).
To resolve this problem we change the logic of retrieving the set of
up-to-date storages to return all storages including the primary.
So now we don't need to know the current primary and can use any
storage that has the latest generation of the repository to serve
the requests. As this information is cached by the in-memory
cache, praefect won't create a high load on the database anymore.
This change also makes the IsLatestGeneration check for the primary
node redundant, as it won't be present in the set of consistent
storages if its generation is not the latest one.
Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/3337
|
|
This reverts commit ca8e2de58193775a62599b3a4de682c511046033 and
restores the feature flag for enabling/disabling reference transactions.
The main reason to revert the change is that there are currently still
some problems with TLS setups and injected Praefect server information,
where Gitaly may not be able to verify authenticity of Praefect servers
and thus fail to establish a connection.
This also removes the environment variable to disable transactions,
which was only added as a short-term fix to make GitLab setups work in
such environments again.
|
|
The coordinator tests have two tests which verify routing logic for
RPCs which don't use transactions. This is currently implemented via
`FetchIntoObjectPool`, but we're in fact about to change it to
conditionally use transactions. As this would break tests, let's switch
to a different RPC which we know won't use transactions.
|
|
To prepare for removing the MemoryRepositoryStore, this commit
removes its uses in tests. Mostly the tests need something that works,
which is when DisableRepositoryStore is used. When a test is testing
a particular scenario with the RepositoryStore, a mock is provided
instead. Ideally we'd use the Postgres implementation in these cases
but hooking it in requires some additional work as the test setup
overwrites home directory which breaks the discovery of GDK's Postgres.
|
|
This replaces the in-memory store in Praefect's coordinator's
TestStreamDirectorMutator_Transaction tests with Postgres implementation
of the store. The test case exercises the whole mutator flow and tests
assumptions related to it. Due to this, mocking gets more involved.
As the test case covers a lot of ground, it makes sense to use the
real implementation here and perform integration testing with it.
|
|
Logic-wise this commit is intended to achieve the same as reverting
ca8e2de58193775a62599b3a4de682c511046033
(featureflag: Remove reference transaction feature flag), just without
conflicts and with a more concentrated diff.
The use case is that a customer turned the feature flag for reference
transactions off, which wasn't honoured anymore. This change allows for
disabling transactions again.
The root problem lies in the fact that Gitaly needs to
connect with one Praefect that coordinates the current transaction.
Given this is an IP based connection, not to a domain name, this
violates some security policies.
|
|
The `nodes.Manager` interface's method `GetShard()` doesn't currently
receive a context as parameter, even though some of its implementations
perform non-trivial and potentially blocking work like querying the
database. This commit thus adds this parameter and adjusts all current
callsites.
|
|
This is the next step in including the cached storages provider in order
to support reads distribution across Gitalies. On each invocation it
queries the passed-in dependency and combines the result with the existing
primary. The resulting list is used by the manager to decide where a
request should be routed for processing. In a follow-up MR it will be
extended with an expiration cache to reduce load on the database, as
accessing it on each read operation is not efficient.
Part of: https://gitlab.com/gitlab-org/gitaly/-/issues/3053
|
|
In order to ease the migration to reference transactions and reduce the
risk, initially all transactions used a primary-wins strategy: as long
as the primary node performed a vote, the transaction was guaranteed to
succeed. This was thus only a transitional feature flag, and given that
we've been running without it for many weeks now without any issues, we
can safely remove it.
|
|
When a transaction was gracefully stopped, it indicates that any node
taking part has hit an expected error case, e.g. the primary has
executed hooks which declined a push. The coordinator can't and
shouldn't do anything about it as this is an expected error inside of
the business logic which the Gitaly nodes should know to handle
correctly.
Change the logic to reflect this by silently returning in case the
primary's transaction has been stopped.
|
|
The 'repositories' table's records are used to keep track of repositories
that should exist and act as repository specific locks for accessing the
'storage_repositories' table.
The repository specific locking was implemented by incrementing the
generation column before modifying the entries in the 'storage_repositories'
table. Furthermore, the generation column is used as an easy way to
access the latest generation number. These uses prevent adding other
columns to the table, as inserting a record forces one to also set a
generation number from unrelated code. To make it viable to add other
repository specific columns to the table, this commit stops using the
'repositories.generation' as an easy way to get the current expected
generation and instead gets it by getting the maximum value from the
'storage_repositories' table. To make it possible to add other columns
without affecting the generation, NOT NULL constraint is dropped from
the column and queries have been updated to handle the case.
The column is still kept around and updated in the writes to keep
backwards compatibility with earlier versions. The column can be dropped
later once every repository is guaranteed to have a record in the table
by replacing the increment operation with a select for update.
Behavior should be semantically the same. The only difference is in
GetConsistentSecondaries, which was only checking that the secondaries match
the primary's generation rather than that they are on the latest generation.
This was changed as we can't guarantee consistency of the repository unless
it is on the very latest generation.
|
|
Routing logic is currently not pluggable as it is part of the coordinator
and depends on the NodeManager. In preparation for variable replication
factor and per-repository primaries, this commit extracts the routing logic
into a separate component which allows for plugging in alternative
implementations. There should be no behavior difference, except for eagerly
loading consistent secondaries on repository scoped mutators regardless of
whether transactions are enabled or not. This should be fine though as
transactions are enabled by default.
|
|
This commit removes server scoped request handling from the coordinator
as there are no server scoped RPCs Praefect should direct to Gitaly
nodes.
|
|
Hooks up the error tracker in the node manager so it checks if a certain
backend node has reached a threshold of errors. If it has, then it will
be deemed unhealthy.
|
|
We currently have two ways we set up transactions in the coordinator:
- With the primary-wins feature flag, the primary will always
succeed no matter what the secondaries vote.
- Without the flag, all nodes need to agree.
The first strategy really is only to ensure smoother transitioning
towards using reference transactions and is going away at some point in
time. The second voting strategy is the one that's going to stay, but
right now it's really pessimistic because it will always require all
nodes to agree on the same thing.
Let's refine this strategy and move it to the next level, which is a
majority-wins approach: instead of requiring all nodes to agree, we will
now require a majority of nodes to agree. In this case, we define
majority as follows:
- Always include the primary node. This is a hard requirement, as
the primary node is the one who's executing hooks. If it failed
and the transaction succeeded, we'd have changes persisted to disk
but none of the hooks were run, which may lead to inconsistent
state.
- Require at least half of the secondaries to agree, rounded up. As
a special case, if there's only a single secondary, then it's not
required to agree with the primary.
This would result in the following typical scenarios:
- 1 primary, 0 secondaries: Primary always wins.
- 1 primary, 1 secondary: Primary always wins.
- 1 primary, 2 secondaries: Primary and a single secondary required.
- 1 primary, 3 secondaries: Primary and two secondaries required.
- 1 primary, 4 secondaries: Primary and two secondaries required.
This commit implements the above by changing the default transaction
setup in case the primary-wins feature flag is disabled and adds/adjusts
some of our tests to match this new strategy.
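The vote-counting rule above can be expressed as a small function. A minimal sketch with an assumed name; it reproduces exactly the scenario table from the commit message:

```go
package main

import "fmt"

// requiredVotes returns how many nodes (primary included) must agree:
// the primary always, plus at least half of the secondaries rounded up,
// except that a single secondary is not required to agree.
func requiredVotes(secondaries int) int {
	if secondaries <= 1 {
		return 1 // primary alone decides
	}
	return 1 + (secondaries+1)/2 // primary + ceil(secondaries/2)
}

func main() {
	for _, secondaries := range []int{0, 1, 2, 3, 4} {
		fmt.Printf("1 primary, %d secondaries: %d node(s) required\n",
			secondaries, requiredVotes(secondaries))
	}
}
```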
|
|
Right now, setup of metrics used in the transaction manager is split
across multiple locations. This makes the process of adding new metrics
more involved than it needs to be and is a source of bugs in case any of
those locations is not updated.
Improve the situation by moving setup of metrics into the transaction
manager. Metrics are exposed by implementing the Collector interface and
registering the transaction manager itself as a metric.
|
|
Replication should consider the state of the target nodes and
omit dequeueing and processing of replication events for unhealthy
nodes. Processing replication events for unhealthy nodes could
lead to outdated repositories: all attempts to process the
event would fail and the event would be marked 'dead', and the
repository would remain outdated until the next replication event
happens to it. By checking the health state before processing
the event we have a better chance of applying events without
issues, reducing potential data loss.
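The health check before dequeueing can be sketched as a simple filter. Type and function names are illustrative assumptions:

```go
package main

import "fmt"

// replicationEvent is a minimal stand-in for a queued replication job.
type replicationEvent struct {
	ID         int
	TargetNode string
}

// dequeueable keeps only events whose target node is currently healthy,
// so events for unhealthy nodes are neither dequeued nor burned through
// their retries and marked 'dead'.
func dequeueable(events []replicationEvent, healthy map[string]bool) []replicationEvent {
	var out []replicationEvent
	for _, e := range events {
		if healthy[e.TargetNode] {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	events := []replicationEvent{
		{ID: 1, TargetNode: "gitaly-1"},
		{ID: 2, TargetNode: "gitaly-2"},
		{ID: 3, TargetNode: "gitaly-1"},
	}
	healthy := map[string]bool{"gitaly-1": true}
	fmt.Println(len(dequeueable(events, healthy))) // 2: gitaly-2 is skipped
}
```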
Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/2922
|
|
The local elector doesn't handle retrieval of the primary node properly,
as it is not started by the Mgr and there are no health checks
executed for each node. In such a case each node is considered
unhealthy as it has 0 successful health checks.
The disabled elector is used in case failover is disabled.
It returns the first passed-in node as the primary and all the others
as secondaries. It also returns `ErrPrimaryNotHealthy`, like the other
electors, in case the primary is not healthy and can't serve requests.
The health check request starts regardless of failover for any type of
elector. Each node exposes its actual health status with the `IsHealthy`
method.
Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/3011
|
|
It has been decided to go with an all-features-enabled-by-default
behaviour for tests. This should help us identify potential problems
with feature flag combinations enabled in production. To verify an
implementation without a feature flag enabled, the flag should be
disabled explicitly in the test.
Part of: https://gitlab.com/gitlab-org/gitaly/-/issues/2702
|
|
Retrieve up-to-date storages that can serve read operations for the
repository in order to distribute reads across all healthy storages
of the virtual storage.
Closes: https://gitlab.com/gitlab-org/gitaly/-/issues/2944
|