Age | Commit message (Collapse) | Author |
|
This adds the non-root section and information about the parameter
--unprivileged to the man page.
Co-authored-by: Anna Singleton <annabeths111@gmail.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Anna Singleton <annabeths111@gmail.com>
|
|
A file's r/w/x changing between checkpoint and restore does
not necessarily imply that something is wrong. For example,
if a process opens a file having perms rw- for reading and
we change the perms to r--, the process can be restored and
will function as expected.
Therefore, this patch adds an option
--skip-file-rwx-check
to disable this check on restore. File validation is unaffected
and should still function as expected with respect to the content
of files.
Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
|
|
Add SIGTSTP signal dump and restore. Add a corresponding field
in the image, save it only if a task is in the stopped state.
Restore task state by sending desired stop signal if it is present
in the image. Fallback to SIGSTOP if it's absent.
Signed-off-by: Yuriy Vasiliev <yuriy.vasiliev@openvz.org>
|
|
Brought to you by
codespell -w
(using codespell v2.1.0).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
|
|
Modifications to support criu image streamer when using amdgpu_plugin.
When running with criu image streamer, fseek/lseek is not available so
we store the file size in the first 8-bytes of the actual file.
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
|
|
Store BO contents directly to file (1 per GPU) instead of using
protobuf.
Bug Fix:
Fixes an issue where we could not handle BOs bigger than 4GB because
protobuf has an internal limit of 4GB for the Bytes structure.
Performance Improvements:
This significantly reduces CR duration on multi-GPU systems as it allows
reading and writing to disk in parallel. During checkpoint, instead of
waiting for all the BO contents to be read from the one protobuf file,
we can now start writing the BO contents as soon as the first BO is read
from disk. During restore, we can start writing BO contents to disk
after the first BO from VRAM. This also reduces the peak amount of
system memory used as we only need to keep 1 BO content in memory per
GPU at a time instead of all the BO contents.
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
|
|
Add optional parameters to override default behavior during restore.
These parameters are passed in as environment variables before executing
CRIU.
List of parameters:
KFD_FW_VER_CHECK - disable firmware version check
KFD_SDMA_FW_VER_CHECK - disable SDMA firmware version check
KFD_CACHES_COUNT_CHECK - disable caches count check
KFD_NUM_GWS_CHECK - disable num_gws check
KFD_VRAM_SIZE_CHECK - disable VRAM size check
KFD_NUMA_CHECK - preserve NUMA regions
KFD_CAPABILITY_CHECK - disable capability check
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
|
|
The device topology on the restore node can be different from the
topology on the checkpointed node. The GPUs on the restore node may
have different gpu_ids, minor number. or some GPUs may have different
properties as checkpointed node. During restore, the CRIU plugin
determines the target GPUs to avoid restore failures caused by trying
to restore a process on a gpu that is different.
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
|
|
To support Checkpoint Restore with AMDGPUs for ROCm workloads, introduce
a new plugin to assist CRIU with the help of AMD KFD kernel driver. This
initial commit just provides the basic framework to build up further
capabilities. Like CRIU, the amdgpu plugin also uses protobuf to
serialize
and save the amdkfd data which is mostly VRAM contents with some
metadata.
We generate a data file "amdgpu-kfd-<id>.img" during the dump stage. On restore
this file is read and extracted to re-create various types of buffer
objects that belonged to the previously checkpointed process. Upon
restore the mmap page offset within a device file might change so we use
the new hook to update and adjust the mmap offsets for newly created
target process. This is needed for sys_mmap call in pie restorer phase.
Support for queues and events is added in future patches of this series.
With the current implementation (amdgpu_plugin), we support:
- Only compute workloads such (Non Gfx) are supported
- GPU visible inside a container
- AMD GPU Gfx 9 Family
- Pytorch Benchmarks such as BERT Base
amdgpu plugin dependes on libdrm and libdrm_amdgpu which are typically
installed with libdrm-dev package. We build amdgpu_plugin only when the
dependencies are met on the target system and when user intends to
install the amdgpu plugin and not by default with criu build.
Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Co-authored-by: David Yat Sin <david.yatsin@amd.com>
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
|
|
Signed-off-by: Ashutosh Mehra <asmehra@redhat.com>
|
|
The --timeout option was introduced in [1] to prevent criu dump from
being able to hang indefinitely and allow users to adjust the time limit
in seconds for collecting tasks during the dump operation.
[1] https://github.com/checkpoint-restore/criu/commit/d0ff730
Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
|
|
The expected behavior of --tcp-close option when dumpping is to close
all established tcp connections including connection that is once
established but now closed. This adds an explicit description about
that behavior.
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
|
|
Support for external net namespaces has been introduced with
commit c2b21fbf (criu: add support for external net namespaces).
Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
|
|
Python 2 has been deprecated since January 1, 2020 and linux distributions
already support Python 3. Thus, to simplify maintenance and packaging
we could support criu-ns as Python 3 only.
v2: Add a message for criu-ns installation
Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
|
|
This adds the option to choose the networking locking method.
CRIU currently uses iptables-restore cli for network locking/unlocking
but nftables support will be added later.
There have been reports from users that iptables-restore fails in some
way and an nftables based approach using libnftables could avoid this
external dependency.
v2: remove dependency details in man page for --network-lock.
v3: remove --network-lock from restore section in docs because it is
automatically detected from the inventory image now.
v4: add message that --network-lock will be ignored during restore
and value from dump will be used.
v5: run make indent
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
|
|
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
|
|
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
|
|
This change is motivated by checkpointing and restoring container in
Pods.
When restoring a container into a new Pod the SELinux label of the
existing Pod needs to be used and not the SELinux label saved during
checkpointing.
The option --lsm-profile already enables changing of process SELinux
labels on restore. If there are, however, tmpfs checkpointed they
will be mounted during restore with the same context as during
checkpointing. This can look like the following example:
context="system_u:object_r:container_file_t:s0:c82,c137"
On restore we want to change this context to match the mount label of
the Pod this container is restored into. Changing of the mount label
is now possible with the new option --mount-context:
criu restore --mount-context "system_u:object_r:container_file_t:s0:c204,c495"
This will lead to mount options being changed to
context="system_u:object_r:container_file_t:s0:c204,c495"
Now the restored container can access all the files in the container
again.
This has been tested in combination with runc and CRI-O.
Signed-off-by: Adrian Reber <areber@redhat.com>
|
|
The --join-ns option was introduced with commits:
https://github.com/checkpoint-restore/criu/commit/2cf17cd
https://github.com/checkpoint-restore/criu/commit/790ec46
Closes: #1054
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
|
|
Signed-off-by: Andrei Vagin <avagin@gmail.com>
|
|
file_validation_method field added to cr_options structure in
"criu/include/cr_options.h" along with the constants:
FILE_VALIDATION_FILE_SIZE
FILE_VALIDATION_BUILD_ID
FILE_VALIDATION_DEFAULT (Equal to FILE_VALIDATION_BUILD_ID)
Usage and description information is yet to be added
Usage:
--file-validation="filesize" (To use only the file size check)
--file-validation="buildid" (To try and use only the build-id check)
Signed-off-by: Ajay Bharadwaj <ajayrbharadwaj@gmail.com>
|
|
Signed-off-by: Adrian Reber <areber@redhat.com>
|
|
This adds the ability to stream images with criu-image-streamer
The workflow is the following:
1) criu-image-streamer is started, and starts listening on a UNIX
socket.
2) CRIU is started. img_streamer_init() is invoked, which connects to the
socket. During dump/restore operations, instead of using local disk to
open an image file, img_streamer_open() is called to provide a UNIX pipe
that is sent over the UNIX socket.
3) Once the operation is done, img_streamer_finish() is called, and the
UNIX socket is disconnected.
criu-image-streamer can be found at:
https://github.com/checkpoint-restore/criu-image-streamer
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
|
|
The file only includes other headers (which may be not needed).
If we aim for one-include-for-compel, we could instead paste all
subheaders into "compel.h".
Rather, I think it's worth to migrate to more fine-grained compel
headers than follow the strategy 'one header to rule them all'.
Further, the header creates problems for cross-compilation: it's
included in files, those are used by host-compel. Which rightfully
confuses compiler/linker as host's definitions for fpu regs/other
platform details get drained into host's compel.
Signed-off-by: Dmitry Safonov <dima@arista.com>
|
|
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
|
|
This option was introduced with:
https://github.com/checkpoint-restore/criu/commit/e2c38245c613df5e36dcf0253c7652f928e46abf
v2: (comment from Pavel Tikhomirov) --enable-fs does not fit with
--external dev[]:, see try_resolve_ext_mount, external dev mounts
only determined for FSTYPE__UNSUPPORTED.
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
|
|
Commit 0493724c8eda3 added support for using asciidoctor
(instead of asciidoc + xmlto) to generate man pages.
For some reason, asciidoctor does not deal well with some
complex formatting that we use for options such as --external,
leading to literal ’ and ' appearing in the man page instead
of italic formatting. For example:
> --inherit-fd fd[’N']:’resource'
(here both N and resource should be in italic).
Asciidoctor documentation (asciidoctor --help syntax) tells:
> == Text Formatting
>
> .Constrained (applied at word boundaries)
> *strong importance* (aka bold)
> _stress emphasis_ (aka italic)
> `monospaced` (aka typewriter text)
> "`double`" and '`single`' typographic quotes
> +passthrough text+ (substitutions disabled)
> `+literal text+` (monospaced with substitutions disabled)
>
> .Unconstrained (applied anywhere)
> **C**reate+**R**ead+**U**pdate+**D**elete
> fan__freakin__tastic
> ``mono``culture
so I had to carefully replace *bold* with **bold** and
'italic' with __italic__ to make it all work.
Tested with both terminal and postscript output, with both
asciidoctor and asciidoc+xmlto.
TODO: figure out how to fix examples (literal multi-line text),
since asciidoctor does not display it in monospaced font (this
is only true for postscript/pdf output so low priority).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
|
|
1. Add a/the articles where I see them missing
2. s/Forbid/disable/
3. s/crit/crit(1)/ as we're referring to a man page
4. Simplify some descriptions
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
|
|
In case asciidoc is installed and xmlto is not, make returns an error
but there's no diagnostics shown, since "xmlto: command not found"
goes to /dev/null.
Remove the redirect.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
|
|
The original/old guide probably doesn't work anymore:
- the patch isn't accessible;
- criu now depends on more libraries not only protobuf
Still, keep it as it might be helpful for someone.
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
|
|
These requirements have been described in
https://github.com/opencontainers/runc/blob/b133feae/libcontainer/container_linux.go#L1265
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
|
|
Two modes of pre-dump algorithm:
1) splicing memory by parasite
--pre-dump-mode=splice (default)
2) using process_vm_readv syscall
--pre-dump-mode=read
Signed-off-by: Abhishek Dubey <dubeyabhishek777@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
|
|
Instead of creating cgroup yard in CRIU, now we can create it externally
and pass it to CRIU. Useful if somebody doesn't want to grant
CAP_SYS_ADMIN to CRIU.
Signed-off-by: Michał Cłapiński <mclapinski@google.com>
|
|
Resolves #349
Signed-off-by: Harshavardhan Unnibhavi <hvubfoss@gmail.com>
|
|
This commit adds Transport Layer Security (TLS) support for remote
page-server connections.
The following command-line options are introduced with this commit:
--tls-cacert FILE Trust certificates signed only by this CA
--tls-cacrl FILE CA certificate revocation list
--tls-cert FILE TLS certificate
--tls-key FILE TLS private key
--tls Use TLS to secure remote connections
The default PKI locations are:
CA certificate /etc/pki/CA/cacert.pem
CA revocation list /etc/pki/CA/cacrl.pem
Client/server certificate /etc/pki/criu/cert.pem
Client/server private key /etc/pki/criu/private/key.pem
The files cacert.pem and cacrl.pem are optional. If they are not
present, and not explicitly specified with a command-line option,
CRIU will use only the system's trusted CAs to verify the remote
peer's identity. This implies that if a CA certificate is specified
using "--tls-cacert" only this CA will be used for verification.
If CA certificate (cacert.pem) is not present, certificate revocation
list (cacrl.pem) will be ignored.
Both (client and server) sides require a private key and certificate.
When the "--tls" option is specified, a TLS handshake (key exchange)
will be performed immediately after the remote TCP connection has been
accepted.
X.509 certificates can be generated as follows:
-------------------------%<-------------------------
# Generate CA key and certificate
echo -ne "ca\ncert_signing_key" > temp
certtool --generate-privkey > cakey.pem
certtool --generate-self-signed \
--template temp \
--load-privkey cakey.pem \
--outfile cacert.pem
# Generate server key and certificate
echo -ne "cn=$HOSTNAME\nencryption_key\nsigning_key" > temp
certtool --generate-privkey > key.pem
certtool --generate-certificate \
--template temp \
--load-privkey key.pem \
--load-ca-certificate cacert.pem \
--load-ca-privkey cakey.pem \
--outfile cert.pem
rm temp
mkdir -p /etc/pki/CA
mkdir -p /etc/pki/criu/private
mv cacert.pem /etc/pki/CA/
mv cert.pem /etc/pki/criu/
mv key.pem /etc/pki/criu/private
-------------------------%<-------------------------
Usage Example:
Page-server:
[src]# criu page-server -D <PATH> --port <PORT> --tls
[dst]# criu dump --page-server --address <SRC> --port <PORT> \
-t <PID> -D <PATH> --tls
Lazy migration:
[src]# criu dump --lazy-pages --port <PORT> -t <PID> -D <PATH> --tls
[dst]# criu lazy-pages --page-server --address <SRC> --port <PORT> \
-D <PATH> --tls
[dst]# criu restore -D <PATH> --lazy-pages
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
|
|
Since commit 6c572bee8f10 ("cgroup: Set "soft" mode by default") it
become impossible to set ignore mode at all. Provide a user option to do
that.
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
|
|
some notes for Android NDK cross compile.
Signed-off-by: Zhang Ning <ning.a.zhang@intel.com>
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
|
|
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
|
|
* "post-resume" was introduced with commit:
2ab599398ddbde3449f1b9d4d3f5152591854cff
cr-restore: "post-resume" hook introduced
This hook is called at the very end, when everything is restored and processes
were resumed.
Can be used for some actions, which require operation container, like
restarting of systemd autofs services.
* "post-setup-namespaces" was introduced with commit:
eec66f3d30f9ccd75a8f1fab6920c20933eecd64
criu [PATCH] post-setup-namespaces
Introduce post-setup-namespaces action script
It needed to have possibility to run cutom script after mount
namespace is configured
* "orphan-pts-master" was introduced with commit:
6afe523d97d59e6bf29621b8aa0e6a4332f710fc
tty: notify about orphan tty-s via rpc
Now Docker creates a pty pair from a container devpts to use is as console.
A slave tty is set as a control tty for the init process and bind-mounted
into /dev/console. The master tty is handled externelly.
Now CRIU can handle external resources, but here we have internal resources
which are used externaly.
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
|
|
Since asciidoc is based on Phyton 2, we want to move to alternative,
and a promising one is asciidoctor. This patch allows to use
asciidoctor for formatting man pages instead of asiidoc, by passing
a make option, USE_ASCIIDOCTOR=yes.
Although asciidoctor is almost compatible with asciidoc, it can
produce a man page directly from a text file without XML, which is
more efficiently. So in asciidoctor mode, we don't require xmlto.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
|
|
a2x is never used although its presence is checked mandatorily.
Let's remove this superfluous check and the unused entry.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
|
|
The option --lsm-profile was added with commit:
6af96c8404181e63d2424d1695fd7f8a42a291bf
lsm: add a --lsm-profile flag
In LXD, we use the container name in the LSM profile. If the container name
is changed on migrate (on the host side), we want to use a different LSM
profile name (a. la. --cgroup-root). This flag adds that support.
A usage example is available in
https://github.com/lxc/lxc/commit/13389b2963692a51162c703d8a64a79542b18949
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
|
|
The --tcp-close option was introduced with commit
2c37042821906f013634e305899cb25ff1a5a7b1
tcp: Add tcp-close option to restore connected TCP sockets in closed state
This options is applicable only for restore. Therefore, move the
documentation from 'dump' to 'restore'.
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
|
|
The --log-pid option was introduced with commit
fc7bedc50ab4ca08b1c96764a5bd1acbc1889fa6
crtools: make to be able to split messages by pid
This option is applicable only for restore. Therefore, move the
documentation from "Common options" to "restore".
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
|
|
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
|
|
Add hostname resolution in setup_tcp_client(). This change allows a
valid hostname to be provided as value for --address option when
connecting to page server.
This change is needed for the following path which removes
setup_TCP_client_socket() from img-proxy.c. In this function the
hostname resolution was implemented using gethostbyname()
However, here we use `getaddrinfo` instead because gethostbyname() is
marked obsolescent in POSIX.1-2001 and is removed in POSIX.1-2008
specifications.
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
|
|
For a long time we've been demanding from cpus to be compatible
on fpu frame level, but growing list of cpu features triggers
inability for restored programs to proceed after restore due
to specific intructions execution (such as avx2). Note the
fpu frame may carry same size but not on instruction level
where SIGILL may happen after the restore.
Thus lets require instruction mode to be set and verified by default.
Still one can drop this option via command line or rpc request.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
|
|
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Reviewed-by: Dmitry Safonov <0x7f454c46@gmaill.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
|
|
Most of the typos were found by codespell.
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
|
|
Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
|