---
stage: ModelOps
group: MLOps
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://handbook.gitlab.com/handbook/product/ux/technical-writing/#assignments
---

# MLflow client compatibility **(FREE ALL EXPERIMENT)**

> [Introduced](https://gitlab.com/groups/gitlab-org/-/epics/8560) in GitLab 15.11 as an [Experiment](../../../../policy/experiment-beta-support.md#experiment) release [with a flag](../../../../administration/feature_flags.md) named `ml_experiment_tracking`. Disabled by default.

NOTE:
Model registry and model experiment tracking are [Experiments](../../../../policy/experiment-beta-support.md).
Provide feedback [for model experiment tracking](https://gitlab.com/gitlab-org/gitlab/-/issues/381660). Provide feedback for [model registry](https://gitlab.com/gitlab-org/gitlab/-/epics/9423).

[MLflow](https://mlflow.org/) is a popular open source tool for machine learning experiment tracking.
GitLab [Model experiment tracking](index.md) and GitLab
[Model registry](../model_registry/index.md) are compatible with the MLflow client. The setup requires minimal changes to existing code.

GitLab plays the role of an MLflow server. Running `mlflow server` is not necessary.

## Enable MLflow client integration

Prerequisites:

- A [personal](../../../../user/profile/personal_access_tokens.md), [project](../../../../user/project/settings/project_access_tokens.md), or [group](../../../../user/group/settings/group_access_tokens.md) access token with at least the Developer role and the `api` permission.
- The project ID. To find the project ID:
  1. On the left sidebar, select **Search or go to** and find your project.
  1. Select **Settings > General**.

To use MLflow client compatibility from a local environment:

1. Set the tracking URI and token environment variables on the host that runs the code.
   This can be your local environment, CI pipeline, or remote host. For example:

   ```shell
   export MLFLOW_TRACKING_URI="<your gitlab endpoint>/api/v4/projects/<your project id>/ml/mlflow"
   export MLFLOW_TRACKING_TOKEN="<your_access_token>"
   ```

1. If the training code contains the call to `mlflow.set_tracking_uri()`, remove it.

In the model registry, you can copy the tracking URI from the overflow menu in the top right
by selecting the vertical ellipsis (**{ellipsis_v}**).
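
If you prefer to configure the client from Python instead of the shell, the same two variables can be set in `os.environ` before any MLflow call is made. The following is a minimal sketch; the helper names `build_tracking_uri` and `configure_mlflow_env` are hypothetical and not part of GitLab or MLflow.

```python
import os


def build_tracking_uri(gitlab_endpoint: str, project_id: int) -> str:
    """Return the MLflow tracking URI for a GitLab project."""
    # Strip a trailing slash so the result never contains "//api".
    base = gitlab_endpoint.rstrip("/")
    return f"{base}/api/v4/projects/{project_id}/ml/mlflow"


def configure_mlflow_env(gitlab_endpoint: str, project_id: int, token: str) -> None:
    """Set the two environment variables described in the steps above."""
    os.environ["MLFLOW_TRACKING_URI"] = build_tracking_uri(gitlab_endpoint, project_id)
    os.environ["MLFLOW_TRACKING_TOKEN"] = token
```

In practice, read the token from a secret store or an existing environment variable rather than hardcoding it in the training code.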

## Model experiments

When running the training code, you can use the MLflow client to create experiments, runs,
models, and model versions, and to log parameters, metrics, metadata, and artifacts on GitLab.

After experiments are logged, they are listed under `/<your project>/-/ml/experiments`.

Runs are registered as candidates, which can be explored by selecting an experiment, model, or model version.
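
As an illustration, a training script might log one run like this. This is a sketch, not a prescribed pattern: the function name `log_example_run` is hypothetical, and the `mlflow` module is passed in as an argument only so that the function can be exercised with a stand-in object. It assumes the tracking URI and token are already configured.

```python
def log_example_run(mlflow, experiment_name, run_name, params, metrics):
    """Log one run (candidate) with its parameters and metrics.

    `mlflow` is the imported MLflow module.
    """
    # Select the experiment by name. The MLflow client requests its
    # creation when no experiment with that name exists.
    mlflow.set_experiment(experiment_name)
    with mlflow.start_run(run_name=run_name):
        mlflow.log_params(params)    # dict of parameter name -> value
        mlflow.log_metrics(metrics)  # dict of metric name -> numeric value
```

For example, after `import mlflow`, call `log_example_run(mlflow, "my-experiment", "Candidate 1", {"lr": 0.01}, {"accuracy": 0.93})`. The experiment and run names here are placeholders.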

### Associating a candidate to a CI/CD job

> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/119454) in GitLab 16.1.

If your training code is being run from a CI/CD job, GitLab can use that information to enhance
candidate metadata. To associate a candidate to a CI/CD job:

1. In the [Project CI variables](../../../../ci/variables/index.md), include the following variables:
    - `MLFLOW_TRACKING_URI`: `"<your gitlab endpoint>/api/v4/projects/<your project id>/ml/mlflow"`
    - `MLFLOW_TRACKING_TOKEN`: `<your_access_token>`

1. In your training code within the run execution context, add the following code snippet:

    ```python
    import os

    import mlflow

    index = 1  # Placeholder: replace with your own run counter or label

    with mlflow.start_run(run_name=f"Candidate {index}"):
      # Your training code

      # Start of snippet to be included
      if os.getenv('GITLAB_CI'):
        mlflow.set_tag('gitlab.CI_JOB_ID', os.getenv('CI_JOB_ID'))
      # End of snippet to be included
    ```
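
If the same script also runs outside of CI/CD, the check can be wrapped in a small helper that returns the tag only when both variables are present. The following is a sketch under that assumption; `ci_job_tags` is a hypothetical name.

```python
import os


def ci_job_tags(environ=os.environ):
    """Return the tags that link a candidate to the current CI/CD job.

    Returns an empty dict outside of GitLab CI/CD or when the job ID is missing.
    """
    if not environ.get("GITLAB_CI"):
        return {}
    job_id = environ.get("CI_JOB_ID")
    return {"gitlab.CI_JOB_ID": job_id} if job_id else {}
```

Inside the run context, you could then pass the result to `mlflow.set_tags()` when it is not empty.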

## Model registry

You can also manage models and model versions by using the MLflow
client. Models are registered under `/<your project>/-/ml/models`.

### Models

#### Creating a model

```python
from mlflow.tracking import MlflowClient

client = MlflowClient()
model_name = '<your_model_name>'
description = 'Model description'
model = client.create_registered_model(model_name, description=description)
```

**Notes**

- `create_registered_model` argument `tags` is ignored.
- `name` must be unique within the project.
- `name` cannot be the name of an existing experiment.

#### Fetching a model

```python
client = MlflowClient()
model_name = '<your_model_name>'
model = client.get_registered_model(model_name)
```

#### Updating a model

```python
client = MlflowClient()
model_name = '<your_model_name>'
description = 'New description'
client.update_registered_model(model_name, description=description)
```

#### Deleting a model

```python
client = MlflowClient()
model_name = '<your_model_name>'
client.delete_registered_model(model_name)
```

### Logging candidates to a model

Every model has an associated experiment with the same name. To log a candidate/run to the model,
use the experiment with the name of the model:

```python
client = MlflowClient()
model_name = '<your_model_name>'
exp = client.get_experiment_by_name(model_name)
run = client.create_run(exp.experiment_id)
```
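
`get_experiment_by_name` returns `None` when no experiment matches, so a misspelled model name would otherwise fail with an `AttributeError` on `exp.experiment_id`. A sketch of a guarded variant follows; the function name `create_candidate_for_model` is hypothetical, and `client` is assumed to be an `MlflowClient` instance.

```python
def create_candidate_for_model(client, model_name):
    """Create a run in the experiment that shares the model's name."""
    experiment = client.get_experiment_by_name(model_name)
    if experiment is None:
        # No experiment (and therefore no model) with this name was found.
        raise LookupError(f"No experiment found for model {model_name!r}")
    return client.create_run(experiment.experiment_id)
```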

### Model version

#### Creating a model version

```python
client = MlflowClient()
model_name = '<your_model_name>'
description = 'Model version description'
model_version = client.create_model_version(model_name, source="", description=description)
```

If the version is not specified, GitLab auto-increments it from the latest uploaded
version. To set the version yourself, pass it in the `gitlab.version` tag when you create
the model version. The version must follow the [SemVer](https://semver.org/) format.

```python
client = MlflowClient()
model_name = '<your_model_name>'
version = '<your_version>'
description = 'Model version description'
tags = {"gitlab.version": version}
client.create_model_version(model_name, source="", description=description, tags=tags)
```
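
Because the version must follow SemVer, it can help to validate the string locally before sending the request. The sketch below uses the regular expression suggested on semver.org; the helper name `version_tags` is hypothetical, and GitLab might apply stricter rules than this check (for example, for pre-release identifiers).

```python
import re

# Pattern suggested on semver.org for SemVer 2.0.0 version strings.
SEMVER_PATTERN = re.compile(
    r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)"
    r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?"
    r"(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?"
)


def version_tags(version: str) -> dict:
    """Return the tags for `create_model_version`, rejecting non-SemVer input."""
    if SEMVER_PATTERN.fullmatch(version) is None:
        raise ValueError(f"{version!r} is not a valid SemVer version")
    return {"gitlab.version": version}
```

The returned dictionary can be passed as the `tags` argument of `create_model_version`.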

**Notes**

- Argument `run_id` is ignored. Every model version behaves as a candidate/run. Creating a model version from a run is not yet supported.
- Argument `source` is ignored. GitLab will create a package location for the model version files.
- Argument `run_link` is ignored.
- Argument `await_creation_for` is ignored.

#### Updating a model version

```python
client = MlflowClient()
model_name = '<your_model_name>'
version = '<your_version>'
description = 'New description'
client.update_model_version(model_name, version, description=description)
```

#### Fetching a model version

```python
client = MlflowClient()
model_name = '<your_model_name>'
version = '<your_version>'
client.get_model_version(model_name, version)
```

#### Getting latest versions of a model

```python
client = MlflowClient()
model_name = '<your_model_name>'
client.get_latest_versions(model_name)
```

**Notes**

- Argument `stages` is ignored.
- Versions are ordered by last created.

#### Logging metrics and parameters to a model version

Every model version is also a candidate/run, so you can log parameters
and metrics to it. You can find the run ID on the model version page in GitLab,
or by using the MLflow client:

```python
client = MlflowClient()
model_name = '<your_model_name>'
version = '<your_version>'
model_version = client.get_model_version(model_name, version)
run_id = model_version.run_id

# Your training code

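# Replace the placeholders. Metric values must be numeric.
client.log_metric(run_id, '<metric_name>', '<metric_value>')
client.log_param(run_id, '<param_name>', '<param_value>')
# metric_list, param_list, and tag_list are lists of Metric, Param, and RunTag
# entities from mlflow.entities.
client.log_batch(run_id, metric_list, param_list, tag_list)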
```
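
When the results of a training step are already held in dictionaries, a thin wrapper over `log_param` and `log_metric` keeps the calls in one place. This is a sketch only; `log_results` is a hypothetical helper, and `client` is assumed to be an `MlflowClient` instance.

```python
def log_results(client, run_id, params, metrics):
    """Log each parameter and metric to the run behind a model version."""
    for name, value in params.items():
        client.log_param(run_id, name, value)
    for name, value in metrics.items():
        # Metric values are numeric in MLflow, so convert before logging.
        client.log_metric(run_id, name, float(value))
```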

#### Logging artifacts to a model version

GitLab creates a package that can be used by the MLflow client to upload files.

```python
client = MlflowClient()
model_name = '<your_model_name>'
version = '<your_version>'
model_version = client.get_model_version(model_name, version)
run_id = model_version.run_id

# Your training code

client.log_artifact(run_id, '<local/path/to/file.txt>', artifact_path="")
client.log_figure(run_id, figure, artifact_file="my_plot.png")
client.log_dict(run_id, my_dict, artifact_file="my_dict.json")
client.log_image(run_id, image, artifact_file="image.png")
```

Artifacts are then available under `/<your project>/-/ml/models/<model_id>/versions/<version_id>`.
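
Because `log_artifact` does not support directories and `artifact_path` must be empty (see the supported methods table), one possible workaround is to upload the files of a directory one by one. The sketch below does this for the files directly inside a directory and skips subdirectories; `log_files_individually` is a hypothetical helper name.

```python
from pathlib import Path


def log_files_individually(client, run_id, directory):
    """Upload every regular file directly inside `directory`, one call per file.

    Returns the names of the uploaded files.
    """
    uploaded = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():  # Subdirectories are skipped, not uploaded.
            client.log_artifact(run_id, str(path), artifact_path="")
            uploaded.append(path.name)
    return uploaded
```

All files end up at the same level, so files with the same name in different source directories would collide.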

#### Linking a model version to a CI/CD job

Similar to candidates, it is also possible to link a model version to a CI/CD job:

```python
import os

from mlflow.tracking import MlflowClient

client = MlflowClient()
model_name = '<your_model_name>'
version = '<your_version>'
model_version = client.get_model_version(model_name, version)
run_id = model_version.run_id

# Your training code

# GITLAB_CI is set only when the code runs in a GitLab CI/CD job
if os.getenv('GITLAB_CI'):
    client.set_tag(run_id, 'gitlab.CI_JOB_ID', os.getenv('CI_JOB_ID'))
```

## Supported MLflow client methods and caveats

GitLab supports the following MLflow client methods. Other methods might work but have not been
tested. For more information, see the [MLflow documentation](https://www.mlflow.org/docs/1.28.0/python_api/mlflow.html). The `MlflowClient` counterparts
of the methods below are also supported, with the same caveats.

| Method                   | Supported        | Version added  | Comments                                                                            |
|--------------------------|------------------|----------------|-------------------------------------------------------------------------------------|
| `get_experiment`         | Yes              | 15.11          |                                                                                     |
| `get_experiment_by_name` | Yes              | 15.11          |                                                                                     |
| `set_experiment`         | Yes              | 15.11          |                                                                                     |
| `get_run`                | Yes              | 15.11          |                                                                                     |
| `start_run`              | Yes              | 15.11          | (16.3) If a name is not provided, the candidate receives a random nickname.         |
| `search_runs`            | Yes              | 15.11          | (16.4) `experiment_ids` supports only a single experiment ID with order by column or metric. |
| `log_artifact`           | Yes with caveat  | 15.11          | (15.11) `artifact_path` must be empty. Does not support directories.                |
| `log_artifacts`          | Yes with caveat  | 15.11          | (15.11) `artifact_path` must be empty. Does not support directories.                |
| `log_batch`              | Yes              | 15.11          |                                                                                     |
| `log_metric`             | Yes              | 15.11          |                                                                                     |
| `log_metrics`            | Yes              | 15.11          |                                                                                     |
| `log_param`              | Yes              | 15.11          |                                                                                     |
| `log_params`             | Yes              | 15.11          |                                                                                     |
| `log_figure`             | Yes              | 15.11          |                                                                                     |
| `log_image`              | Yes              | 15.11          |                                                                                     |
| `log_text`               | Yes with caveat  | 15.11          | (15.11) Does not support directories.                                               |
| `log_dict`               | Yes with caveat  | 15.11          | (15.11) Does not support directories.                                               |
| `set_tag`                | Yes              | 15.11          |                                                                                     |
| `set_tags`               | Yes              | 15.11          |                                                                                     |
| `set_terminated`         | Yes              | 15.11          |                                                                                     |
| `end_run`                | Yes              | 15.11          |                                                                                     |
| `update_run`             | Yes              | 15.11          |                                                                                     |
| `log_model`              | Partial          | 15.11          | (15.11) Saves the artifacts, but not the model data. `artifact_path` must be empty. |

Other `MlflowClient` methods:

| Method                    | Supported        | Version added | Comments                                         |
|---------------------------|------------------|---------------|--------------------------------------------------|
| `create_registered_model` | Yes with caveats | 16.8          | [See notes](#creating-a-model)                   |
| `get_registered_model`    | Yes              | 16.8          |                                                  |
| `delete_registered_model` | Yes              | 16.8          |                                                  |
| `update_registered_model` | Yes              | 16.8          |                                                  |
| `create_model_version`    | Yes with caveats | 16.8          | [See notes](#creating-a-model-version)           |
| `get_model_version`       | Yes              | 16.8          |                                                  |
| `get_latest_versions`     | Yes with caveats | 16.8          | [See notes](#getting-latest-versions-of-a-model) |
| `update_model_version`    | Yes              | 16.8          |                                                  |

## Limitations

- GitLab supports the API as defined in MLflow version 2.7.1.
- MLflow client methods not listed above are not supported.
- During creation of experiments and runs, ExperimentTags are stored, even though they are not displayed.