Welcome to mirror list, hosted at ThFree Co, Russian Federation.

gitlab.com/gitlab-org/gitlab-foss.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
Diffstat (limited to 'doc/development/namespaces.md')
-rw-r--r--doc/development/namespaces.md302
1 files changed, 302 insertions, 0 deletions
diff --git a/doc/development/namespaces.md b/doc/development/namespaces.md
new file mode 100644
index 00000000000..e25b0f57f08
--- /dev/null
+++ b/doc/development/namespaces.md
@@ -0,0 +1,302 @@
+---
+stage: none
+group: unassigned
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments
+---
+
+# Namespaces
+
+Namespaces are containers for projects and associated resources. A `Namespace` is instantiated through its subclasses of `Group`, `ProjectNamespace`, and `UserNamespace`.
+
+```mermaid
+graph TD
+ Namespace -.- Group
+ Namespace -.- ProjectNamespace
+ Namespace -.- UserNamespace
+```
+
+A `User` has one `UserNamespace`, and can be a member of many `Namespaces`.
+
+```mermaid
+graph TD
+ Namespace -.- Group
+ Namespace -.- ProjectNamespace
+ Namespace -.- UserNamespace
+
+ User -- has one --- UserNamespace
+ Namespace --- Member --- User
+```
+
+`Group` exists in a recursive hierarchical relationship. `Groups` have many `ProjectNamespaces` which parent one `Project`.
+
+```mermaid
+graph TD
+ Group -- has many --- ProjectNamespace -- has one --- Project
+ Group -- has many --- Group
+```
+
+## Querying namespaces
+
+There is a set of methods provided to query the namespace hierarchy. The methods produce standard Rails `ActiveRecord::Relation` objects.
+The methods can be split into two similar halves. One set of methods operate on a Namespace object, while the other set operate as composable Namespace scopes.
+
+By their nature, the object methods will operate within a single `Namespace` hierarchy, while the scopes can span hierarchies.
+
+The following is a non-exhaustive list of methods to query `Namespace` hierarchies.
+
+### Root namespaces
+
+The root is the top most `Namespace` in the hierarchy. A root has a `nil` `parent_id`.
+
+```mermaid
+graph TD
+ classDef active fill:#f00,color:#fff
+ classDef sel fill:#00f,color:#fff
+
+ A --- A.A --- A.A.A
+ A.A --- A.A.B
+ A --- A.B --- A.B.A
+ A.B --- A.B.B
+
+ class A.A.B active
+ class A sel
+```
+
+```ruby
+Namespace.where(...).roots
+
+namespace_object.root_ancestor
+```
+
+### Descendant namespaces
+
+The descendants of a namespace are its children, their children, and so on.
+
+```mermaid
+graph TD
+ classDef active fill:#f00,color:#fff
+ classDef sel fill:#00f,color:#fff
+
+ A --- A.A --- A.A.A
+ A.A --- A.A.B
+ A --- A.B --- A.B.A
+ A.B --- A.B.B
+
+ class A.A active
+ class A.A.A,A.A.B sel
+```
+
+We can return ourself and our descendants through `self_and_descendants`.
+
+```ruby
+Namespace.where(...).self_and_descendants
+
+namespace_object.self_and_descendants
+```
+
+We can return only our descendants excluding ourselves:
+
+```ruby
+Namespace.where(...).self_and_descendants(include_self: false)
+
+namespace_object.descendants
+```
+
+We could not name the scope method `.descendants` because we would override the `Object` method of the same name.
+
+It can be more efficient to return the descendant IDs instead of the whole record:
+
+```ruby
+Namespace.where(...).self_and_descendant_ids
+Namespace.where(...).self_and_descendant_ids(include_self: false)
+
+namespace_object.self_and_descendant_ids
+namespace_object.descendant_ids
+```
+
+### Ancestor namespaces
+
+The ancestors of a namespace are its children, their children, and so on.
+
+```mermaid
+graph TD
+ classDef active fill:#f00,color:#fff
+ classDef sel fill:#00f,color:#fff
+
+ A --- A.A --- A.A.A
+ A.A --- A.A.B
+ A --- A.B --- A.B.A
+ A.B --- A.B.B
+
+ class A.A active
+ class A sel
+```
+
+We can return ourself and our ancestors through `self_and_ancestors`.
+
+```ruby
+Namespace.where(...).self_and_ancestors
+
+namespace_object.self_and_ancestors
+```
+
+We can return only our ancestors excluding ourselves:
+
+```ruby
+Namespace.where(...).self_and_ancestors(include_self: false)
+
+namespace_object.ancestors
+```
+
+We could not name the scope method `.ancestors` because we would override the `Module` method of the same name.
+
+It can be more efficient to return the ancestor ids instead of the whole record:
+
+```ruby
+Namespace.where(...).self_and_ancstor_ids
+Namespace.where(...).self_and_ancestor_ids(include_self: false)
+
+namespace_object.self_and_ancestor_ids
+namespace_object.ancestor_ids
+```
+
+### Hierarchies
+
+A Namespace hierarchy is a `Namespace`, its ancestors, and its descendants.
+
+```mermaid
+graph TD
+ classDef active fill:#f00,color:#fff
+ classDef sel fill:#00f,color:#fff
+
+ A --- A.A --- A.A.A
+ A.A --- A.A.B
+ A --- A.B --- A.B.A
+ A.B --- A.B.B
+
+ class A.A active
+ class A,A.A.A,A.A.B sel
+```
+
+We can query a namespace hierarchy:
+
+```ruby
+Namespace.where(...).self_and_hierarchy
+
+namespace_object.self_and_hierarchy
+```
+
+### Recursive queries
+
+The queries above are known as the linear queries because they use the `namespaces.traversal_ids` column to perform standard SQL queries instead of recursive CTE queries.
+
+A set of legacy recursive queries are also accessible if needed:
+
+```ruby
+Namespace.where(...).recursive_self_and_descendants
+Namespace.where(...).recursive_self_and_descendants(include_self: false)
+Namespace.where(...).recursive_self_and_descendant_ids
+Namespace.where(...).recursive_self_and_descendant_ids(include_self: false)
+Namespace.where(...).recursive_self_and_ancestors
+Namespace.where(...).recursive_self_and_ancestors(include_self: false)
+Namespace.where(...).recursive_self_and_ancstor_ids
+Namespace.where(...).recursive_self_and_ancestor_ids(include_self: false)
+Namespace.where(...).recursive_self_and_hierarchy
+
+namespace_object.recursive_root_ancestor
+namespace_object.recursive_self_and_descendants
+namespace_object.recursive_descendants
+namespace_object.recursive_self_and_descendant_ids
+namespace_object.recursive_descendant_ids
+namespace_object.recursive_self_and_ancestors
+namespace_object.recursive_ancestors
+namespace_object.recursive_self_and_ancestor_ids
+namespace_object.recursive_ancestor_ids
+namespace_object.recursive_self_and_hierarchy
+```
+
+## Namespace query implementation
+
+The linear queries are executed using the `namespaces.traversal_ids` array column. Each array represents an ordered set of `Namespace` IDs from the root `Namespace` to the current `Namespace`.
+
+Given the scenario:
+
+```mermaid
+graph TD
+ classDef active fill:#f00,color:#fff
+ classDef sel fill:#00f,color:#fff
+
+ A --- A.A --- A.A.A
+ A.A --- A.A.B
+ A --- A.B --- A.B.A
+ A.B --- A.B.B
+
+ class A.A.B active
+```
+
+The `traversal_ids` for `Namespace` `A.A.B` would be `[A, A.A, A.A.B]`.
+
+The `traversal_ids` have some useful properties to keep in mind if working in this area:
+
+- The root of every `Namespace` is provided by `traversal_ids[1]`. Note that PostgreSQL array indexes begin at `1`.
+- The ID of the current `Namespace` is provided by `traversal_ids[array_length(traversal_ids, 1)]`.
+- The `Namespace` ancestors are represented by the `traversal_ids`.
+- A `Namespace`'s `traversal_ids` are a subset of their descendants `traversal_ids`. A `Namespace` with `traversal_ids = [1,2,3]` will have descendants that all start with `[1,2,3,...]`.
+- PostgreSQL arrays are ordered such that `[1] < [1,1] < [2]`.
+
+Using these properties we find the `root` and `ancestors` are already provided for by `traversal_ids`.
+
+With the object descendant queries we lean on the `@>` array operator which will test inclusion of an array inside another array.
+The `@>` operator has been found to be quite slow as the search space grows. Another method is used for scope queries which tend to have larger search spaces.
+With scope queries we combine comparison operators with the array ordering property.
+
+All descendants of a `Namespace` with `traversal_ids = [1,2,3]` have `traversal_ids` that are greater than `[1,2,3]` but less than `[1,2,4]`.
+In this example `[1,2,3]` and `[1,2,4]` are siblings, and `[1,2,4]` is the next sibling after `[1,2,3]`. A SQL function is provided to find the next sibling of a `traversal_ids` called `next_traversal_ids_sibling`.
+
+```sql
+gitlabhq_development=# select next_traversal_ids_sibling(ARRAY[1,2,3]);
+ next_traversal_ids_sibling
+----------------------------
+ {1,2,4}
+(1 row)
+```
+
+We then build descendant linear query scopes using comparison operators:
+
+```sql
+WHERE namespaces.traversal_ids > ARRAY[1,2,3]
+ AND namespaces.traversal_ids < next_traversal_ids_sibling(ARRAY[1,2,3])
+```
+
+### Superset
+
+`Namespace` queries are prone to returning duplicate results. For example, consider a query to find descendants of `A` and `A.A`:
+
+```mermaid
+graph TD
+ classDef active fill:#f00,color:#fff
+ classDef sel fill:#00f,color:#fff
+
+ A --- A.A --- A.A.A
+ A.A --- A.A.B
+ A --- A.B --- A.B.A
+ A.B --- A.B.B
+
+ class A,A.A active
+ class A.A.A,A.A.B,A.B,A.B.A,A.B.B sel
+```
+
+```ruby
+namespaces = Namespace.where(name: ['A', 'A.A'])
+
+namespaces.self_and_descendants
+
+=> A.A, A.A.A, A.A.B, A.B, A.B.A, A.B.B
+```
+
+Searching for the descendants of both `A` and `A.A` is unnecessary because `A.A` is already a descendant of `A`.
+In extreme cases this can create excessive I/O leading to poor performance.
+
+Redundant `Namespaces` are eliminated from a query if a `Namespace` `ID` in the `traversal_ids` attribute matches an `ID` belonging to another `Namespace` in the set of `Namespaces` being queried.
+A match of this condition signifies that an ancestor exists in the set of `Namespaces` being queried, and the current `Namespace` is therefore redundant.
+This optimization will result in much better performance of edge cases that would otherwise be very slow.