doc/development/database/transaction_guidelines.md


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143

---
stage: Data Stores
group: Database
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
---

# Transaction guidelines

This document gives a few examples of the usage of database transactions in application code.

For further reference, check PostgreSQL documentation about [transactions](https://www.postgresql.org/docs/current/tutorial-transactions.html).

## Database decomposition and sharding

The [sharding group](https://about.gitlab.com/handbook/engineering/development/enablement/sharding/) plans
to split the main GitLab database and move some of the database tables to other database servers.

We start decomposing the `ci_*`-related database tables first. To maintain the current application
development experience, we add tooling and static analyzers to the codebase to ensure correct
data access and data modification methods. By using the correct form for defining database transactions,
we can save significant refactoring work in the future.

## The transaction block

The `ActiveRecord` library provides a convenient way to group database statements into a transaction:

```ruby
issue = Issue.find(10)
project = issue.project

ApplicationRecord.transaction do
  issue.update!(title: 'updated title')
  project.update!(last_update_at: Time.now)
end
```

This transaction involves two database tables. In case of an error, each `UPDATE`
statement rolls back to the previous consistent state.

NOTE:
Avoid referencing the `ActiveRecord::Base` class and use `ApplicationRecord` instead.

## Transaction and database locks

When a transaction block is opened, the database tries to acquire the necessary
locks on the resources. The type of locks depend on the actual database statements.

Consider a concurrent update scenario where the following code is executed at the
same time from two different processes:

```ruby
issue = Issue.find(10)
project = issue.project

ApplicationRecord.transaction do
  issue.update!(title: 'updated title')
  project.update!(last_update_at: Time.now)
end
```

The database tries to acquire the `FOR UPDATE` lock for the referenced `issue` and
`project` records. In our case, we have two competing transactions for these locks,
and only one of them successfully acquires them. The other transaction has
to wait in the lock queue until the first transaction finishes. The execution of the
second transaction is blocked at this point.

## Transaction speed

To prevent lock contention and maintain stable application performance, the transaction
block should finish as fast as possible. When a transaction acquires locks, it holds
on to them until the transaction finishes.

Apart from application performance, long-running transactions can also affect application
upgrade processes by blocking database migrations.

### Dangerous example: third-party API calls

Consider the following example:

```ruby
member = Member.find(5)

Member.transaction do
  member.update!(notification_email_sent: true)

  member.send_notification_email
end
```

Here, we ensure that the `notification_email_sent` column is updated only when the
`send_notification_email` method succeeds. The `send_notification_email` method
executes a network request to an email sending service. If the underlying infrastructure
does not specify timeouts or the network call takes too long time, the database transaction
stays open.

Ideally, a transaction should only contain database statements.

Avoid doing in a `transaction` block:

- External network requests such as:
  - Triggering Sidekiq jobs.
  - Sending emails.
  - HTTP API calls.
  - Running database statements using a different connection.
- File system operations.
- Long, CPU intensive computation.
- Calling `sleep(n)`.

## Explicit model referencing

If a transaction modifies records from the same database table, we advise to use the
`Model.transaction` block:

```ruby
build_1 = Ci::Build.find(1)
build_2 = Ci::Build.find(2)

Ci::Build.transaction do
  build_1.touch
  build_2.touch
end
```

The transaction above uses the same database connection for the transaction as the models
in the `transaction` block. In a multi-database environment the following example is dangerous:

```ruby
# `ci_builds` table is located on another database
class Ci::Build < CiDatabase
end

build_1 = Ci::Build.find(1)
build_2 = Ci::Build.find(2)

ApplicationRecord.transaction do
  build_1.touch
  build_2.touch
end
```

The `ApplicationRecord` class uses a different database connection than the `Ci::Build` records.
The two statements in the transaction block are not part of the transaction and are
rolled back in case something goes wrong. They act as 3rd part calls.