---
stage: none
group: unassigned
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments
---

# AI Architecture (Experiment)

GitLab has created a common set of tools to support our product groups and their utilization of AI. Our goals with this common architecture are:

1. Increase the velocity of feature teams by providing a set of high-quality, ready-to-use tools
1. Allow us to switch underlying technologies quickly and easily

AI is moving very quickly, and we need to be able to keep pace with changes in the area. We have built an [abstraction layer](../../ee/development/ai_features.md) to do this, allowing us to take a more "pluggable" approach to the underlying models, data stores, and other technologies.
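
To illustrate the idea, a pluggable client might expose a single completion interface while hiding which provider serves the request. The following is a hypothetical sketch, not the actual abstraction layer API; the class and method names are invented for illustration only.

```ruby
# Hypothetical sketch only -- not the real GitLab abstraction layer.
# Feature code depends on one `complete` method; the provider behind it
# can be swapped without changing callers.
module AiCompletion
  # Stub backends standing in for real provider clients (for example,
  # OpenAI or Vertex AI SDK wrappers).
  class OpenAiBackend
    def complete(prompt)
      "OpenAI response to: #{prompt}"
    end
  end

  class VertexAiBackend
    def complete(prompt)
      "Vertex AI response to: #{prompt}"
    end
  end

  class Client
    BACKENDS = { open_ai: OpenAiBackend, vertex_ai: VertexAiBackend }.freeze

    def initialize(provider: :vertex_ai)
      @backend = BACKENDS.fetch(provider).new
    end

    # Callers never touch a provider SDK directly.
    def complete(prompt)
      @backend.complete(prompt)
    end
  end
end

# Switching the underlying model provider is a one-line change for callers.
puts AiCompletion::Client.new(provider: :open_ai).complete("Summarize this issue")
```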

The following diagram shows a simplified view of how the different components in GitLab interact. The abstraction layer helps avoid duplicating the code that calls the REST APIs in the `AI API` block.

```plantuml
@startuml
skin rose

package "Code Suggestions" {
  node "Model Gateway"
  node "Triton Inference Server" as Triton
}

package "Code Suggestions Models"  as CSM {
  node "codegen"
  node "PaLM"
}

package "Suggested Reviewers" {
  node "Model Gateway (SR)"
  node "Extractor"
  node "Serving Model"
}

package "AI API" as AIF {
  node "OpenAI"
  node "Vertex AI"
}

package GitLab {
  node "Web IDE"

  package "Web" {
    node "REST API"
    node "GraphQL"
  }

  package "Jobs" {
    node "Sidekiq"
  }
}

package Databases {
  node "Vector Database"
  node "PostgreSQL"
}

node "VSCode"

"Model Gateway" --> Triton
Triton --> CSM
GitLab --> Databases
VSCode --> "Model Gateway"
"Web IDE" --> "Model Gateway"
"Web IDE" --> "GraphQL"
"Web IDE" --> "REST API"
"Model Gateway" -[#blue]--> "REST API": user authorized?

"Sidekiq" --> AIF
Web --> AIF

"Model Gateway (SR)" --> "REST API"
"Model Gateway (SR)" --> "Serving Model"
"Extractor" --> "GraphQL"
"Sidekiq" --> "Model Gateway (SR)"

@enduml
```

## SaaS-based AI abstraction layer

GitLab currently operates a cloud-hosted AI architecture. We are exploring how self-managed instances integrate with it.

There are two primary reasons for this: the best AI models are cloud-based, as they often depend on specialized hardware designed for this purpose, and operating self-managed infrastructure capable of running AI at scale with appropriate performance is a significant undertaking. We are actively [tracking self-managed customers interested in AI](https://gitlab.com/gitlab-org/gitlab/-/issues/409183).

## Supported technologies

As part of the AI working group, we have been investigating and vetting various technologies. Below is a list of the tools that have been reviewed and approved for use within the GitLab application.

It is possible to use other models or technologies; however, they must go through a review process before use. Use the [AI Project Proposal template](https://gitlab.com/gitlab-org/gitlab/-/issues/new?issuable_template=AI%20Project%20Proposal) to propose your idea and include the new tools required to support it.

### Models

The following models have been approved for use:

- [OpenAI models](https://platform.openai.com/docs/models)
- Google's [Vertex AI](https://cloud.google.com/vertex-ai) and [model garden](https://cloud.google.com/model-garden)
- [AI Code Suggestions](https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/-/tree/main)
- [Suggested reviewer](https://gitlab.com/gitlab-org/modelops/applied-ml/applied-ml-updates/-/issues/10)

### Vector stores

The following vector stores have been approved for use:

- [`pgvector`](https://github.com/pgvector/pgvector) is a PostgreSQL extension that adds support for storing vector embeddings and performing approximate nearest neighbor (ANN) searches; a usage sketch follows.
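
As a brief, hedged illustration of how `pgvector` is used, raw SQL can be executed through an ActiveRecord connection. The `doc_embeddings` table, its columns, and the example vectors below are hypothetical and not part of the GitLab schema.

```ruby
# Hypothetical example: table and column names are illustrative only.
conn = ActiveRecord::Base.connection

# Enable the extension and create a table with a 3-dimensional vector column.
conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
conn.execute(<<~SQL)
  CREATE TABLE doc_embeddings (
    id bigserial PRIMARY KEY,
    content text,
    embedding vector(3)
  )
SQL

# Store an embedding, then fetch the nearest neighbors by L2 distance (`<->`).
conn.execute("INSERT INTO doc_embeddings (content, embedding) VALUES ('example', '[0.1, 0.2, 0.3]')")
conn.execute("SELECT id, content FROM doc_embeddings ORDER BY embedding <-> '[0.1, 0.2, 0.25]' LIMIT 5")
```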

### Indexing Update

We currently use a sequential scan, which provides perfect recall. We are considering adding an index if we can ensure that it still produces accurate results, as noted in the `pgvector` indexing [documentation](https://github.com/pgvector/pgvector#indexing).

Given that the table contains thousands of entries, indexing with these updated settings would likely improve search speed while maintaining high accuracy. However, more testing may be needed to verify the optimal configuration for this dataset size before deploying to production.
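
For reference, an IVFFlat index on a `pgvector` column is typically created as follows. This is a generic, hedged example that reuses the hypothetical `doc_embeddings` table from above; the actual index definition is part of the draft MR mentioned below.

```ruby
# Hypothetical example: creates an IVFFlat index for cosine distance with
# 100 lists. More lists make queries faster but can reduce recall for a
# fixed number of probes.
ActiveRecord::Base.connection.execute(<<~SQL)
  CREATE INDEX ON doc_embeddings
  USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100)
SQL
```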

A [draft MR](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/122035) has been created to update the index.

The index function has been updated to improve search quality. This was tested locally by setting the `ivfflat.probes` value to `10` with the following command, which runs the SQL from the Rails console:

```ruby
Embedding::TanukiBotMvc.connection.execute("SET ivfflat.probes = 10")
```

Setting a higher `probes` value improves recall at the cost of query speed, as per the neighbor [documentation](https://github.com/ankane/neighbor#indexing).

For optimal `probes` and `lists` values (a worked example follows this list):

- Use `lists` equal to `rows / 1000` for tables with up to 1 million rows, and `sqrt(rows)` for larger datasets.
- For `probes`, start with `lists / 10` for tables with up to 1 million rows, and `sqrt(lists)` for larger datasets.
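
A small worked example of these heuristics, using arbitrary, illustrative row counts:

```ruby
# Illustrative only: derive suggested `lists` and `probes` values from a
# row count, following the heuristics above.
def ivfflat_settings(rows)
  if rows <= 1_000_000
    lists = [rows / 1000, 1].max
    probes = [lists / 10, 1].max
  else
    lists = Math.sqrt(rows).round
    probes = Math.sqrt(lists).round
  end

  { lists: lists, probes: probes }
end

ivfflat_settings(10_000)    # => { lists: 10, probes: 1 }
ivfflat_settings(5_000_000) # => { lists: 2236, probes: 47 }
```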