Welcome to mirror list, hosted at ThFree Co, Russian Federation.

gitlab.com/gitlab-org/gitlab-foss.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
Diffstat (limited to 'doc/development/elasticsearch.md')
-rw-r--r--doc/development/elasticsearch.md22
1 files changed, 15 insertions, 7 deletions
diff --git a/doc/development/elasticsearch.md b/doc/development/elasticsearch.md
index 9f54386f1af..90debab3b5c 100644
--- a/doc/development/elasticsearch.md
+++ b/doc/development/elasticsearch.md
@@ -60,7 +60,7 @@ The `whitespace` tokenizer was selected in order to have more control over how t
Please see the `code` filter for an explanation on how tokens are split.
-NOTE: **Known Issues**:
+NOTE: **Note:**
Currently the [Elasticsearch code_analyzer doesn't account for all code cases](../integration/elasticsearch.md#known-issues).
#### `code_search_analyzer`
@@ -111,11 +111,8 @@ Patterns:
- `'"((?:\\"|[^"]|\\")*)"'`: captures terms inside quotes, removing the quotes
- `"'((?:\\'|[^']|\\')*)'"`: same as above, for single-quotes
- `'\.([^.]+)(?=\.|\s|\Z)'`: separate terms with periods in-between
-- `'\/?([^\/]+)(?=\/|\b)'`: separate path terms `like/this/one`
-
-#### `edgeNGram_filter`
-
-Uses an [Edge NGram token filter](https://www.elastic.co/guide/en/elasticsearch/reference/5.5/analysis-edgengram-tokenfilter.html) to allow inputs with only parts of a token to find the token. For example it would turn `glasses` into permutations starting with `gl` and ending with `glasses`, which would allow a search for "`glass`" to find the original token `glasses`
+- `'([\p{L}_.-]+)'`: some common chars in file names to keep the whole filename intact (eg. `my_file-ñame.txt`)
+- `'([\p{L}\d_]+)'`: letters, numbers and underscores are the most common tokens in programming. Always capture them greedily regardless of context.
## Gotchas
@@ -160,7 +157,8 @@ The global configurations per version are now in the `Elastic::(Version)::Config
### Creating new version of schema
-NOTE: **Note:** this is not applicable yet as multiple indices functionality is not fully implemented.
+NOTE: **Note:**
+This is not applicable yet as multiple indices functionality is not fully implemented.
Folders like `ee/lib/elastic/v12p1` contain snapshots of search logic from different versions. To keep a continuous Git history, the latest version lives under `ee/lib/elastic/latest`, but its classes are aliased under an actual version (e.g. `ee/lib/elastic/v12p3`). When referencing these classes, never use the `Latest` namespace directly, but use the actual version (e.g. `V12p3`).
@@ -222,6 +220,16 @@ be used both locally in development and on any deployed GitLab instance to
diagnose poor search performance. This will show the exact queries being made,
which is useful to diagnose why a search might be slow.
+### Correlation ID and X-Opaque-Id
+
+Our [correlation
+ID](./distributed_tracing.md#developer-guidelines-for-working-with-correlation-ids)
+is forwarded by all requests from Rails to Elasticsearch as the
+[`X-Opaque-Id`](https://www.elastic.co/guide/en/elasticsearch/reference/current/tasks.html#_identifying_running_tasks)
+header which allows us to track any
+[tasks](https://www.elastic.co/guide/en/elasticsearch/reference/current/tasks.html)
+in the cluster back the request in GitLab.
+
## Troubleshooting
### Getting `flood stage disk watermark [95%] exceeded`