gitlab.com/gitlab-org/gitlab-foss.git
Diffstat (limited to 'doc/development/ai_features/glossary.md')
-rw-r--r-- doc/development/ai_features/glossary.md | 9
1 files changed, 9 insertions, 0 deletions
diff --git a/doc/development/ai_features/glossary.md b/doc/development/ai_features/glossary.md
index be856639b83..6c3966a054a 100644
--- a/doc/development/ai_features/glossary.md
+++ b/doc/development/ai_features/glossary.md
@@ -39,6 +39,15 @@ to AI that you think could benefit from being in this list, add it!
piece of information, which helps to clarify its meaning and implications.
For GitLab Duo Chat, context is the attributes of the Issue or Epic being
referenced in a user question.
+- **Embeddings**: In the context of machine learning and large language models,
+ embeddings refer to a technique used to represent words, phrases, or even
+ entire documents as dense numerical vectors in a continuous vector space.
+ At GitLab, [we use Vertex AI's Embeddings API](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/129930)
+ to create a vector representation of GitLab documentation. These
+ embeddings are stored in the `vertex_gitlab_docs` database table in the
+ `embeddings` database. The embeddings search is done in Postgres using the
+ `vector` extension. The Vertex embeddings database is updated daily, based on the
+ latest version of the GitLab documentation, by running `Llm::Embedding::GitlabDocumentation::CreateEmptyEmbeddingsRecordsWorker` as a cron job.
- **Golden Questions**: a small subset of the types of questions we think a user
should be able to ask GitLab Duo Chat. Used to generate data for Chat evaluation.
[Questions for Chat Beta](https://gitlab.com/groups/gitlab-org/-/epics/10550#what-the-user-can-ask).
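
The embeddings search described in the added entry can be pictured with a minimal sketch. It is not GitLab's implementation: the vectors are made-up three-dimensional toys rather than Vertex AI output, the document paths and the `nearest` helper are hypothetical, and cosine similarity is an assumed metric (the diff only says the search runs in Postgres through the `vector` extension).

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, docs, k=1):
    """Return the names of the k documents most similar to the query vector."""
    ranked = sorted(
        docs.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Hypothetical documentation pages with toy embeddings (assumption: real
# embeddings have hundreds of dimensions and come from an embeddings API).
DOCS = {
    "ci/pipelines.md": [0.9, 0.1, 0.0],
    "user/merge_requests.md": [0.1, 0.9, 0.1],
    "administration/backup.md": [0.0, 0.2, 0.9],
}

# Toy embedding of a user question that is mostly about pipelines.
QUERY = [0.8, 0.2, 0.1]

if __name__ == "__main__":
    print(nearest(QUERY, DOCS, k=2))
```

In a database-backed setup the same ranking would be expressed as an `ORDER BY` on a vector distance in SQL rather than computed in application code. The sketch only shows why storing documents as dense vectors makes a "closest match" lookup possible.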