gitlab.com/gitlab-org/gitlab-foss.git
Diffstat (limited to 'doc/development')
-rw-r--r-- doc/development/performance.md | 78
-rw-r--r-- doc/development/testing_guide/end_to_end/rspec_metadata_tests.md | 1
-rw-r--r-- doc/development/testing_guide/frontend_testing.md | 20
3 files changed, 94 insertions, 5 deletions
diff --git a/doc/development/performance.md b/doc/development/performance.md
index 72eb85c623b..a7ae283432f 100644
--- a/doc/development/performance.md
+++ b/doc/development/performance.md
@@ -415,6 +415,84 @@ test += " world"
When adding new Ruby files, please check that you can add the above header,
as omitting it may lead to style check failures.
+## Reading from files and other data sources
+
+Ruby offers several convenience functions that deal with file contents specifically
+or I/O streams in general. Functions such as `IO.read` and `IO.readlines` make
+it easy to read data into memory, but they can be inefficient when the
+data grows large. Because these functions read the entire contents of a data
+source into memory, memory use will grow by _at least_ the size of the data source.
+In the case of `readlines`, it will grow even further, due to extra bookkeeping
+the Ruby VM has to perform to represent each line.
+
+Consider the following program, which reads a text file that is 750MB on disk:
+
+```ruby
+File.readlines('large_file.txt').each do |line|
+  puts line
+end
+```
+
+Here is a process memory reading taken while the program was running, showing
+that the entire file was indeed kept in memory (RSS reported in kilobytes):
+
+```shell
+$ ps -o rss -p <pid>
+
+RSS
+783436
+```
+
+And here is an excerpt of what the garbage collector was doing:
+
+```ruby
+pp GC.stat
+
+{
+ :heap_live_slots=>2346848,
+ :malloc_increase_bytes=>30895288,
+ ...
+}
+```
+
+We can see that `heap_live_slots` (the number of reachable objects) jumped to ~2.3M,
+which is roughly two orders of magnitude more than when reading the file line by
+line. Not only did raw memory usage increase; the garbage collector (GC) also changed
+its behavior in anticipation of future memory use. We can see that `malloc_increase_bytes` jumped
+to ~30MB, compared to just ~4kB for a "fresh" Ruby program. This figure specifies how
+much additional heap space the Ruby GC will claim from the operating system the next time it runs out of memory.
+Not only did we occupy more memory, we also changed the behavior of the application
+to increase memory use at a faster rate.
+
+The `IO.read` function exhibits similar behavior, with the difference that no extra memory will
+be allocated for each line object.
+
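The difference in retained objects can be observed on a small scale. The following is a minimal sketch, not a benchmark: the `retained_slots` helper and the 100,000-line temporary file are assumptions made for illustration, and the exact numbers vary by Ruby version and platform.

```ruby
require 'tempfile'

# Build a throwaway file with 100,000 short lines (illustrative size only).
sample = Tempfile.new('sample')
100_000.times { |i| sample.puts("line #{i}") }
sample.close

# Hypothetical helper: runs the block, forces a GC pass, and reports how many
# additional heap slots are still live while the block's result is referenced.
def retained_slots
  GC.start
  before = GC.stat[:heap_live_slots]
  result = yield
  GC.start
  [GC.stat[:heap_live_slots] - before, result]
end

# Eager: every line stays reachable through the returned array.
eager_slots, lines = retained_slots { File.readlines(sample.path) }
# Streaming: each line becomes garbage as soon as it has been counted.
lazy_slots, count = retained_slots { File.foreach(sample.path).count }

puts "readlines retained #{eager_slots} slots for #{lines.size} lines"
puts "foreach retained #{lazy_slots} slots for #{count} lines"
```

The eager variant should report at least one retained slot per line, while the streaming variant should stay close to zero.
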
+### Recommendations
+
+Instead of reading data sources into memory in full, it is better to read them line by line.
+This is not always an option, for instance when you need to convert a YAML file
+into a Ruby `Hash`, but whenever you have data where each row represents some entity that
+can be processed and then discarded, you can use the following approaches.
+
+First, replace calls to `readlines.each` with either `each` or `each_line`.
+The `each_line` and `each` functions read the data source line by line without keeping
+already visited lines in memory:
+
+```ruby
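+# The block form of File.open closes the file handle when the block returns.
+File.open('file') { |file| file.each_line { |line| puts line } }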
+```
+
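+Alternatively, you can read individual lines explicitly using the `IO#gets` or `IO#readline` methods.
+Prefer `gets` as the loop condition: it returns `nil` at the end of the file, whereas `readline` raises `EOFError`:
+
+```ruby
+File.open('file') do |file|
+  while (line = file.gets)
+    # process line
+  end
+end
+```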
+
+This might be preferable if there is a condition that allows exiting the loop early, saving not
+just memory but also unnecessary time spent in CPU and I/O for processing lines you're not interested in.
+
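The early-exit case described above can be sketched as follows. The helper name `first_matching_line` and the sample log content are hypothetical, and `StringIO` stands in for a real file so that the example is self-contained.

```ruby
require 'stringio'

# Hypothetical helper: returns the first line matching `pattern`, or nil.
# `io` can be any IO-like object that responds to `gets` (File, StringIO, ...).
def first_matching_line(io, pattern)
  # IO#gets returns nil at the end of input, which ends the loop cleanly.
  while (line = io.gets)
    return line.chomp if line.match?(pattern)
  end
  nil
end

log = StringIO.new("INFO boot\nERROR disk full\nINFO retry\n")
puts first_matching_line(log, /\AERROR/) # prints "ERROR disk full"

# Lines after the match were never read, so they are still available.
puts log.gets # prints "INFO retry"
```

Because the loop returns as soon as a match is found, the remainder of the data source is neither read nor allocated.
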
## Anti-Patterns
This is a collection of [anti-patterns][anti-pattern] that should be avoided
diff --git a/doc/development/testing_guide/end_to_end/rspec_metadata_tests.md b/doc/development/testing_guide/end_to_end/rspec_metadata_tests.md
index 4f0e506a964..f8dc3366904 100644
--- a/doc/development/testing_guide/end_to_end/rspec_metadata_tests.md
+++ b/doc/development/testing_guide/end_to_end/rspec_metadata_tests.md
@@ -13,3 +13,4 @@ This is a partial list of the [RSpec metadata](https://relishapp.com/rspec/rspec
| `:quarantine` | The test has been [quarantined](https://about.gitlab.com/handbook/engineering/quality/guidelines/debugging-qa-test-failures/#quarantining-tests), will run in a separate job that only includes quarantined tests, and is allowed to fail. The test will be skipped in its regular job so that if it fails it will not hold up the pipeline. |
| `:reliable` | The test has been [promoted to a reliable test](https://about.gitlab.com/handbook/engineering/quality/guidelines/reliable-tests/#promoting-an-existing-test-to-reliable) meaning it passes consistently in all pipelines, including merge requests. |
| `:requires_admin` | The test requires an admin account. Tests with the tag are excluded when run against Canary and Production environments. |
+| `:runner` | The test depends on and will set up a GitLab Runner instance, typically to run a pipeline. |
diff --git a/doc/development/testing_guide/frontend_testing.md b/doc/development/testing_guide/frontend_testing.md
index 5ba63cc07fa..141d1db1bf6 100644
--- a/doc/development/testing_guide/frontend_testing.md
+++ b/doc/development/testing_guide/frontend_testing.md
@@ -547,16 +547,26 @@ The more challenging part are mocks, which can be used for functions or even dep
### Manual module mocks
-Jest supports [manual module mocks](https://jestjs.io/docs/en/manual-mocks) by placing a mock in a `__mocks__/` directory next to the source module. **Don't do this.** We want to keep all of our test-related code in one place (the `spec/` folder), and the logic that Jest uses to apply mocks from `__mocks__/` is rather inconsistent.
+Manual mocks are used to mock modules across the entire Jest environment. This is a very powerful testing tool that helps simplify
+unit testing by mocking out modules which cannot be easily consumed in our test environment.
-Instead, our test runner detects manual mocks from `spec/frontend/mocks/`. Any mock placed here is automatically picked up and injected whenever you import its source module.
+> NOTE: Do not use manual mocks if a mock should not be consistently applied (i.e. it's only needed by a few specs).
+> Instead, consider using `jest.mock` in the relevant spec file.
+
+#### Where should I put manual mocks?
+
+Jest supports [manual module mocks](https://jestjs.io/docs/en/manual-mocks) by placing a mock in a `__mocks__/` directory next to the source module
+(e.g. `app/assets/javascripts/ide/__mocks__`). **Don't do this.** We want to keep all of our test-related code in one place (the `spec/` folder).
+
+If a manual mock is needed for a `node_modules` package, please use the `spec/frontend/__mocks__` folder. Here's an example of
+a [Jest mock for the package `monaco-editor`](https://gitlab.com/gitlab-org/gitlab/blob/b7f914cddec9fc5971238cdf12766e79fa1629d7/spec/frontend/__mocks__/monaco-editor/index.js#L1).
+
+If a manual mock is needed for a CE module, please place it in `spec/frontend/mocks/ce`.
- Files in `spec/frontend/mocks/ce` will mock the corresponding CE module from `app/assets/javascripts`, mirroring the source module's path.
- Example: `spec/frontend/mocks/ce/lib/utils/axios_utils` will mock the module `~/lib/utils/axios_utils`.
-- Files in `spec/frontend/mocks/node` will mock NPM packages of the same name or path.
- We don't support mocking EE modules yet.
-
-If a mock is found for which a source module doesn't exist, the test suite will fail. 'Virtual' mocks, or mocks that don't have a 1-to-1 association with a source module, are not supported yet.
+- If a mock is found for which a source module doesn't exist, the test suite will fail. 'Virtual' mocks, or mocks that don't have a 1-to-1 association with a source module, are not supported yet.
### Writing a mock