Welcome to mirror list, hosted at ThFree Co, Russian Federation.

database_lab.md « database « development « doc - gitlab.com/gitlab-org/gitlab-foss.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
blob: 7edb8ab4de52ee8ab6ca32edbfa752a795ae3ecd (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
---
stage: Data Stores
group: Database
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments
---

# Database Lab and Postgres.ai

Internal users at GitLab have access to the Database Lab Engine (DLE) and
[postgres.ai](https://console.postgres.ai/) for testing performance of database queries
on replicated production data. Unlike a typical read-only production replica, in the DLE you can
also create, update, and delete rows. You can also test the performance of
schema changes, like additional indexes or columns, in an isolated copy of production data.

## Database Lab quick start

1. [Visit the console](https://console.postgres.ai/).
1. Select **Sign in with Google**. (Not GitLab, as you need Google SSO to connect with our project.)
1. After you sign in, select the GitLab organization and then visit "Ask Joe" in the sidebar.
1. Select the database you're testing against:
   - Most queries for the GitLab project run against `gitlab-production-tunnel-pg12`.
   - If the query is for a CI table, select `gitlab-production-ci`.
   - If the query is for the container registry, select `gitlab-production-registry`.
1. Type `explain <Query Text>` in the chat box to get a plan.

## Access Database Lab Engine

Access to the DLE is helpful for:

- Database reviewers and maintainers.
- Engineers who work on merge requests that have large effects on databases.

To access the DLE's services, you can:

- Perform query testing in the Postgres.ai web console.
  Employees access both services with their GitLab Google account. Query testing
  provides `EXPLAIN` (analyze, buffers) plans for queries executed there.
- Migration testing by triggering a job as a part of a merge request.
- Direct `psql` access to DLE instead of a production replica. Available to authorized users only.
  To request `psql` access, file an [access request](https://about.gitlab.com/handbook/business-technology/end-user-services/onboarding-access-requests/access-requests/#individual-or-bulk-access-request).

For more assistance, use the `#database` Slack channel.

NOTE:
If you need only temporary access to a production replica, instead of a Database Lab
clone, follow the runbook procedure for connecting to the
[database console with Teleport](https://gitlab.com/gitlab-com/runbooks/-/blob/master/docs/teleport/Connect_to_Database_Console_via_Teleport.md).
This procedure is similar to [Rails console access with Teleport](https://gitlab.com/gitlab-com/runbooks/-/blob/master/docs/teleport/Connect_to_Rails_Console_via_Teleport.md#how-to-use-teleport-to-connect-to-rails-console).

### Query testing

You can access Database Lab's query analysis features either:

- In [the Postgres.ai web console](https://console.postgres.ai/GitLab/joe-instances).
  Shows only the commands you run.

#### Generate query plans

Query plans are an essential part of the database review process. These plans
enable us to decide quickly if a given query can be performant on GitLab.com.
Running the `explain` command generates an `explain` plan and a link to the Postgres.ai
console with more query analysis. For example, running `EXPLAIN SELECT * FROM application_settings`
does the following:

1. Runs `explain (analyze, buffers) select * from application_settings;` against a database clone.
1. Responds with timing and buffer details from the run.
1. Provides a [detailed, shareable report on the results](https://console.postgres.ai/shared/24d543c9-893b-4ff6-8deb-a8f902f85a53).

#### Making schema changes

Sometimes when testing queries, a contributor may realize that the query needs an index
or other schema change to make added queries more performant. To test the query, run the `exec` command.
For example, running this command:

```sql
exec CREATE INDEX on application_settings USING btree (instance_administration_project_id)
```

creates the specified index on the table. You can [test queries](#generate-query-plans) leveraging
the new index. `exec` does not return any results, only the time required to execute the query.

#### Reset the clone

After many changes, such as after a destructive query or an ineffective index,
you must start over. To reset your designated clone, run `reset`.

#### Checking indexes

Use Database Lab to check the status of an index with the meta-command `\d <index_name>`.

Caveats:

- Indexes are created in both the `main` and `ci` databases, so be sure to use the instance
  that matches the table's `gitlab_schema`. For example, if the index is added to
  [`ci_builds`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/db/docs/ci_builds.yml#L14),
  use `gitlab-production-ci`.
- Database Lab typically has a small delay of a few hours. If more up-to-date information
  is required, you can instead request access to a replica [via Teleport](https://gitlab.com/gitlab-com/runbooks/-/blob/master/docs/teleport/Connect_to_Database_Console_via_Teleport.md)

For example: `\d index_design_management_designs_on_project_id` produces:

```plaintext
Index "public.index_design_management_designs_on_project_id"
   Column   |  Type   | Key? | Definition
------------+---------+------+------------
 project_id | integer | yes  | project_id
btree, for table "public.design_management_designs"
```

In the case of an invalid index, the output ends with `invalid`, like:

```plaintext
Index "public.index_design_management_designs_on_project_id"
   Column   |  Type   | Key? | Definition
------------+---------+------+------------
 project_id | integer | yes  | project_id
btree, for table "public.design_management_designs", invalid
```

If the index doesn't exist, JoeBot throws an error like:

```plaintext
ERROR: psql error: psql:/tmp/psql-query-932227396:1: error: Did not find any relation named "no_index".
```

### Migration testing

For information on testing migrations, review our
[database migration testing documentation](database_migration_pipeline.md).

### Access the console with `psql`

NOTE:
You must have `AllFeaturesUser` [`psql` access](#access-database-lab-engine) to access the console with `psql`.

#### Simplified access through `pgai` Ruby gem

[@mbobin](https://gitlab.com/mbobin) created the [`pgai` Ruby Gem](https://gitlab.com/mbobin/pgai/#pgai) that
greatly simplifies access to a database clone, with support for:

- Access to all database clones listed in the [Postgres.ai instances page](https://console.postgres.ai/gitlab/instances);
- Multiple `psql` sessions on the same clone.

If you have `AllFeaturesUser` [`psql` access](#access-database-lab-engine), you can follow the steps below to configure
the `pgai` Gem:

1. To get started, you need to gather some values from the [Postgres.ai instances page](https://console.postgres.ai/gitlab/instances):

   1. Go to the instance that you want to configure and the on right side of the screen.
   1. Under **Connection**, select **Connect**. The menu might be collapsed.

      A dialog with everything that's needed for configuration appears, using this format:

      ```shell
      dblab init --url "http://127.0.0.1:1234" --token TOKEN --environment-id <environment-id>
      ```

      ```shell
      ssh -NTML 1234:localhost:<environment-port> <postgresai-user>@<postgresai-proxy> -i ~/.ssh/id_rsa
      ```

1. Add the following snippet to your SSH configuration file at `~/.ssh/config`, replacing the variable values:

   ```plaintext
   Host pgai-proxy
     HostName <postgresai-proxy>
     User <postgresai-user>
     IdentityFile ~/.ssh/id_ed25519
   ```

1. Run the following command so you can accept the server key fingerprint:

   ```shell
   ssh pgai-proxy
   ```

1. Run the following commands:

   ```shell
   gem install pgai

   # Grab an access token: https://console.postgres.ai/gitlab/tokens
   # GITLAB_USER is your GitLab handle
   pgai config --dbname=gitlabhq_dblab --prefix=$GITLAB_USER --proxy=pgai-proxy

   # Grab the respective port values from https://console.postgres.ai/gitlab/instances
   # for the instances you'll be using (in this case, for the `main` database instance)
   pgai env add --alias main --id <environment-id> --port <environment-port>
   ```

1. Once this one-time configuration is done, you can use `pgai connect` to connect to a particular database. For
   instance, to connect to the `main` database:

   ```shell
   pgai connect main
   ```

1. Once done with the clone, you can destroy it:

   ```shell
   pgai destroy main
   ```

#### Manual access through the Postgres.ai instances page

Team members with [`psql` access](#access-database-lab-engine), can gain direct access
to a clone via `psql`. Access to `psql` enables you to see data, not just metadata.

To connect to a clone using `psql`:

1. Create a clone from the [desired instance](https://console.postgres.ai/gitlab/instances/).
   1. Provide a **Clone ID**: Something that uniquely identifies your clone, such as `yourname-testing-gitlabissue`.
   1. Provide a **Database username** and **Database password**: Connects `psql` to your clone.
   1. Select **Enable deletion protection** if you want to preserve your clone. Avoid selecting this option.
      Clones are removed after 12 hours.
1. In the **Clone details** page of the Postgres.ai web interface, copy and run
   the command to start SSH port forwarding for the clone.
1. In the **Clone details** page of the Postgres.ai web interface, copy and run the `psql` connection string.
   Use the password provided at setup and set the `dbname` to `gitlabhq_dblab` (or check what databases are available by using `psql -l` with the same query string but `dbname=postgres`).

After you connect, use clone like you would any `psql` console in production, but with
the added benefit and safety of an isolated writeable environment.