author | David Yat Sin <david.yatsin@amd.com> | 2022-02-16 05:41:05 +0300 |
---|---|---|
committer | Andrei Vagin <avagin@gmail.com> | 2022-04-29 03:53:52 +0300 |
commit | 72905c9c9b829e29ae7fa90840b9eb4ba44d2a88 (patch) | |
tree | 7a09742108c1b5e09a12b4ada6119f52f1941a17 /Documentation | |
parent | 6e99fea2fa143f6f7aa27e24465838afa723f9ea (diff) |
criu/plugin: Remap GPUs on checkpoint restore
The device topology on the restore node can differ from the topology
on the checkpointed node. GPUs on the restore node may have different
gpu_ids or minor numbers, and some GPUs may have different properties
than those on the checkpointed node. During restore, the CRIU plugin
determines the target GPUs to avoid restore failures caused by trying
to restore a process onto a mismatched GPU.
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
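The remapping described in the commit message can be pictured as a matching problem: each checkpointed GPU needs a distinct, compatible GPU on the restore node. The following is only an illustrative sketch in Python, not the plugin's actual code (the plugin is written in C). The `Gpu` fields, the `compatible` rule, and the `remap_gpus` helper are all hypothetical, and the real plugin may compare different properties and use a different search strategy.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Gpu:
    # Hypothetical, simplified device description (assumption for this sketch).
    gpu_id: int
    minor: int
    gfx_version: str
    vram_bytes: int


def compatible(src: Gpu, dst: Gpu) -> bool:
    """Assumed rule: same architecture and at least as much VRAM."""
    return src.gfx_version == dst.gfx_version and dst.vram_bytes >= src.vram_bytes


def remap_gpus(src_gpus: List[Gpu], dst_gpus: List[Gpu]) -> Optional[Dict[int, int]]:
    """Map each checkpointed gpu_id to a distinct restore-node gpu_id.

    Returns None when no complete assignment exists.
    """
    mapping: Dict[int, int] = {}
    used = set()

    def assign(i: int) -> bool:
        # All checkpointed GPUs have a target: the assignment is complete.
        if i == len(src_gpus):
            return True
        src = src_gpus[i]
        for dst in dst_gpus:
            if dst.gpu_id in used or not compatible(src, dst):
                continue
            used.add(dst.gpu_id)
            mapping[src.gpu_id] = dst.gpu_id
            if assign(i + 1):
                return True
            # Undo this choice and try the next candidate (backtracking).
            used.discard(dst.gpu_id)
            del mapping[src.gpu_id]
        return False

    return mapping if assign(0) else None


# Made-up devices: gpu_ids and minor numbers differ between the two nodes.
checkpointed = [
    Gpu(0x1111, 128, "gfx906", 16 << 30),
    Gpu(0x2222, 129, "gfx908", 32 << 30),
]
restore_node = [
    Gpu(0xAAAA, 129, "gfx908", 32 << 30),
    Gpu(0xBBBB, 128, "gfx906", 32 << 30),
]
result = remap_gpus(checkpointed, restore_node)
```

In this sketch the mapping is keyed by `gpu_id` only. Any state that refers to the old `gpu_id` or minor number would then be rewritten through such a map before being restored.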
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/amdgpu_plugin.txt | 4 |
1 files changed, 2 insertions, 2 deletions
diff --git a/Documentation/amdgpu_plugin.txt b/Documentation/amdgpu_plugin.txt
index 4b731cf9a..8ba602cce 100644
--- a/Documentation/amdgpu_plugin.txt
+++ b/Documentation/amdgpu_plugin.txt
@@ -9,8 +9,8 @@ userspace for AMD GPUs.
 CURRENT SUPPORT
 ---------------
-Single GPU systems (Gfx9)
-Checkpoint / Restore on same system
+Single and Multi GPU systems (Gfx9)
+Checkpoint / Restore on different system
 Checkpoint / Restore inside a docker container
 Pytorch