diff --git a/doc/development/contributing/verify/index.md b/doc/development/contributing/verify/index.md
new file mode 100644
index 00000000000..a2bb0eca733
--- /dev/null
+++ b/doc/development/contributing/verify/index.md
@@ -0,0 +1,236 @@
+---
+type: reference, dev
+stage: none
+group: Verify
+---
+
+# Contribute to Verify stage codebase
+
+## What are we working on in Verify?
+
+The Verify stage is building a comprehensive Continuous Integration platform
+integrated into the GitLab product. Our goal is to empower our users to make
+great technical and business decisions by delivering a fast, reliable, and secure
+platform that verifies the assumptions our users make and checks them against
+the criteria defined in their CI/CD configuration. These checks can be unit tests,
+end-to-end tests, benchmarks, performance validation, code coverage enforcement,
+and so on.
+
+Feedback delivered by GitLab CI/CD makes it possible for our users to make
+well-informed decisions about the technological and business choices they need
+to make to succeed. Why is Continuous Integration a mission-critical product?
+
+GitLab CI/CD is our platform to deliver feedback to our users and customers.
+
+They write `.gitlab-ci.yml` configuration files to describe the questions they
+want answered. Each time someone pushes a commit or triggers a pipeline, we need
+to find answers to the important questions that have been asked in that CI/CD
+configuration.
+
+Failing to answer these questions, or worse, providing false answers, might
+result in a user making a wrong decision. Such wrong decisions can have severe
+consequences.
+
+## Core principles of our CI/CD platform
+
+Data produced by the platform should be:
+
+1. Accurate.
+1. Durable.
+1. Accessible.
+
+The platform itself should be:
+
+1. Reliable.
+1. Secure.
+1. Deterministic.
+1. Trustworthy.
+1. Fast.
+1. Simple.
+
+Since the inception of GitLab CI/CD, we have lived by these principles,
+and they serve us and our users well. Some examples of these principles are that:
+
+- The feedback delivered by GitLab CI/CD and data produced by the platform should be accurate.
+ If a job fails and we notify a user that it was successful, it can have severe negative consequences.
+- Feedback needs to be available when a user needs it, and data cannot disappear unexpectedly when engineers need it.
+- None of this matters if the platform is not secure and we
+are leaking credentials or secrets.
+- When a user provides a set of preconditions in the form of CI/CD configuration, the result should be deterministic each time a pipeline runs, because otherwise the platform might not be trustworthy.
+- If it is fast, simple to use, and has a great UX, it will serve our users well.
+
+## Building things in Verify
+
+### Measure before you optimize, and make data-informed decisions
+
+It is very difficult to optimize something that you cannot measure. How would you
+know if you succeeded, or how significant the success was? If you are working on
+a performance or reliability improvement, make sure that you measure things before
+you optimize them.
+
+The best way to measure something is usually to add a Prometheus metric. Counters,
+gauges, and histograms are great ways to quickly get approximate results.
+Unfortunately, this is not the best way to measure tail latency, because Prometheus
+metrics, especially histograms, are usually approximations.
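+
+For example, a counter plus a histogram is often enough to see how frequently a
+code path runs and roughly how long it takes. The sketch below uses the Python
+`prometheus_client` library purely as an illustration (GitLab itself is a Ruby
+on Rails application), and the metric and function names are hypothetical:
+
+```python
+from prometheus_client import Counter, Histogram
+
+# Hypothetical metric names, for illustration only.
+pipelines_processed = Counter(
+    'pipelines_processed_total',
+    'Number of pipelines processed',
+)
+pipeline_processing_duration = Histogram(
+    'pipeline_processing_duration_seconds',
+    'Time spent processing a single pipeline',
+    buckets=(0.1, 0.5, 1, 2, 5, 10),
+)
+
+def process_pipeline(pipeline):
+    # The histogram records how long the work took; the counter records
+    # how often it happened. Both are cheap to collect and easy to graph.
+    with pipeline_processing_duration.time():
+        do_the_work(pipeline)
+        pipelines_processed.inc()
+
+def do_the_work(pipeline):
+    pass  # placeholder for the code being measured
+```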
+
+If you have to measure tail latency, such as how slow something can get or how
+large a request payload might be, consider adding custom application logs, and
+always use structured logging.
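+
+A structured log line records the exact value for every event, so you can later
+query for the slowest requests or the largest payloads instead of relying on
+pre-aggregated histogram buckets. A minimal sketch, again in Python for
+illustration only and with hypothetical field names:
+
+```python
+import json
+import logging
+import time
+
+logger = logging.getLogger(__name__)
+logging.basicConfig(level=logging.INFO)
+
+def import_artifact(job_id, payload):
+    start = time.monotonic()
+    # ... do the actual work here ...
+    # One structured event per operation; every field is queryable later.
+    logger.info(json.dumps({
+        'message': 'artifact imported',
+        'job_id': job_id,
+        'payload_bytes': len(payload),
+        'duration_s': round(time.monotonic() - start, 3),
+    }))
+
+import_artifact(42, b'example payload')
+```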
+
+Profiling and flamegraphs are also useful for understanding what the code
+execution path truly looks like!
+
+### Strive for simple solutions, avoid clever solutions
+
+It is sometimes tempting to use a clever solution to deliver something more
+quickly. We want to avoid shipping clever code, because it is usually more
+difficult to understand and maintain in the long term. Instead, we want to
+focus on boring solutions that make it easier to evolve the codebase and keep the
+contribution barrier low. We want to find solutions that are as simple as
+possible.
+
+### Do not confuse boring solutions with easy solutions
+
+Boring solutions are sometimes confused with easy solutions. Very often the
+opposite is true. An easy solution might not be simple. For example, a complex
+new library can be included to add a small piece of functionality that could
+otherwise be implemented quickly. Including the library is easier than building
+that functionality yourself, but it brings a lot of complexity into the product.
+
+On the other hand, it is also possible to over-engineer a solution when a simple,
+well tested, and well maintained library is available. In that case using the
+library might make sense. We recognize that we are constantly balancing simple
+and easy solutions, and that finding the right balance is important.
+
+### "Simple" is not mutually exclusive with "flexible"
+
+Building simple things does not mean that more advanced and flexible solutions
+will not be available. A good example is the expanding complexity of
+`.gitlab-ci.yml` configuration. You can use a simple method to define an
+environment name:
+
+```yaml
+deploy:
+  environment: production
+  script: cap deploy
+```
+
+But the `environment` keyword can also be expanded into another level of
+configuration that offers more flexibility:
+
+```yaml
+deploy:
+  environment:
+    name: review/$CI_COMMIT_REF_SLUG
+    url: https://prod.example.com
+  script: cap deploy
+```
+
+This kind of approach shields new users from the complexities of the platform,
+but still allows them to go deeper if they need to. This approach can be
+applied to many other technical implementations.
+
+### Make things observable
+
+GitLab is a DevOps platform. We popularize DevOps because it helps companies
+be more efficient and achieve better results. One important component of
+DevOps culture is to take ownership over features and code that you are
+building. It is very difficult to do that when you don’t know how your features
+perform and behave in the production environment.
+
+This is why we want to make our features and code observable. They should be
+written in a way that lets an author understand how well or how poorly a
+feature behaves in the production environment. We usually accomplish that by
+introducing the proper mix of Prometheus metrics and application loggers.
+
+**TODO** document when to use Prometheus metrics, when to use loggers. Write a
+few sentences about histograms and counters. Write a few sentences highlighting
+importance of metrics when doing incremental rollouts.
+
+### Protect customer data
+
+Making the data produced by our CI/CD platform durable is important. We recognize
+that data generated in CI/CD by users and customers is valuable and we must
+protect it. This data matters not only because of the information it can contain;
+we also have compliance and auditing responsibilities.
+
+Therefore we must take extra care when we are writing migrations that
+permanently remove data from our database, or when we are defining new
+retention policies.
+
+As a general rule, when you are writing code that is supposed to remove
+data from the database, file system, or object storage, you should get an extra pair
+of eyes on your changes. When you are defining a new retention policy, you
+should double-check with PMs and EMs.
+
+### Get your changes reviewed
+
+When your merge request is ready for review, you must assign
+reviewers and then maintainers. Depending on the complexity of a change, you
+might want to involve the people who know the most about the codebase area you
+are changing. We have many domain experts in Verify, and it is absolutely
+acceptable to ask them to review your code when you are not certain whether a
+reviewer or maintainer assigned by the Reviewer Roulette has enough context
+about the change.
+
+The Reviewer Roulette offers useful suggestions, but because assigning the right
+reviewers is important, you should not follow its suggestions blindly every time.
+It might not make sense to assign someone who knows nothing about the area you
+are updating, because their feedback might be limited to code style and syntax.
+Depending on the complexity and impact of a change, assigning the right people
+to review your changes can be very important.
+
+If you don’t know who to assign, consult `git blame` or ask in the `#verify`
+Slack channel (GitLab team members only).
+
+### Incremental rollouts
+
+After your merge request is merged by a maintainer, it is time to release it to
+users and the wider community. We usually do this with feature flags.
+While not every merge request needs a feature flag, most merge
+requests in Verify should have feature flags. [**TODO** link to docs about what
+needs a feature flag and what doesn’t].
+
+If you already follow the advice on this page, you probably have a few metrics
+and perhaps a few loggers in place that make your new code observable in the
+production environment. You can now use them to roll out your changes
+incrementally!
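+
+As a minimal sketch of the idea (in Python for illustration only; GitLab itself
+is a Ruby on Rails application, and the flag, metric, and function names below
+are hypothetical), the new behavior stays behind a flag while a counter shows
+how often the new code path actually runs:
+
+```python
+from prometheus_client import Counter
+
+# Hypothetical metric name, for illustration only.
+new_parser_used = Counter(
+    'new_artifact_parser_used_total',
+    'Number of times the new artifact parser code path ran',
+)
+
+def legacy_parse(artifact):
+    return artifact.strip()
+
+def new_parse(artifact):
+    return artifact.strip().lower()
+
+def parse_artifact(artifact, feature_enabled):
+    # The feature flag decides which code path runs; the counter tells you
+    # how often the new path is exercised as the rollout progresses.
+    if feature_enabled:
+        new_parser_used.inc()
+        return new_parse(artifact)
+    return legacy_parse(artifact)
+```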
+
+A typical scenario involves enabling the feature in a few internal projects
+while observing your metrics or logs. Be aware that there might be a small
+delay before logs are ingested into Elasticsearch and become visible in Kibana.
+After you confirm the feature works well for internal projects, you can start
+an incremental rollout for other projects.
+
+Avoid using "percent of time" incremental rollouts. These are error-prone,
+especially when you are checking a feature flag in several places in the codebase
+and you have not memoized the result of the check in a single place.
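+
+The sketch below (again an illustration in Python, not the GitLab feature flag
+API) shows why: a "percent of time" check returns a different answer on every
+call, so two un-memoized checks in the same request can disagree, while
+memoizing the result once keeps every code path consistent:
+
+```python
+import random
+
+def percent_of_time_enabled(percentage=25):
+    # A "percent of time" flag rolls the dice on every call, so repeated
+    # checks within one request can return different answers.
+    return random.uniform(0, 100) < percentage
+
+class PipelineRequest:
+    def __init__(self):
+        # Memoize the flag once per request so that every code path that
+        # asks about the feature sees the same answer.
+        self.new_processing_enabled = percent_of_time_enabled()
+
+    def enqueue(self):
+        return 'new-queue' if self.new_processing_enabled else 'old-queue'
+
+    def process(self):
+        return 'new-path' if self.new_processing_enabled else 'old-path'
+```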
+
+### Do not cause our Universe to implode
+
+During one of the first GitLab Contributes events, we had a discussion about the importance
+of keeping CI/CD pipeline, stage, and job statuses accurate. We considered a hypothetical
+scenario relating to software being built by one of our [early customers](https://about.gitlab.com/blog/2016/11/23/gitlab-adoption-growing-at-cern/):
+
+> What happens if software deployed to the [Large Hadron Collider (LHC)](https://en.wikipedia.org/wiki/Large_Hadron_Collider)
+> breaks because of a bug in GitLab CI/CD that showed that a pipeline
+> passed, but this data was not accurate and the software deployed was actually
+> invalid? A problem like this could cause the LHC to malfunction, which
+> could generate a new particle that would then cause the universe to implode.
+
+That would be quite an undesirable outcome of a small bug in GitLab CI/CD status
+processing. Please take extra care when you are working on CI/CD statuses;
+we don’t want to implode our Universe!
+
+This is an extreme and unlikely scenario, but presenting data that is not accurate
+can potentially cause a myriad of problems through the
+[butterfly effect](https://en.wikipedia.org/wiki/Butterfly_effect).
+There are much more likely scenarios that
+can have disastrous consequences. GitLab CI/CD is being used by companies
+building medical, aviation, and automotive software. Continuous Integration is
+a mission-critical part of software engineering.
+
+When you are working on a subsystem for pipeline processing and transitioning
+CI/CD statuses, request an additional review from a domain expert and hold
+others accountable for doing the same.