github.com/marian-nmt/nccl.git
author Sylvain Jeaugey <sjeaugey@nvidia.com> 2020-09-05 00:35:05 +0300
committer Sylvain Jeaugey <sjeaugey@nvidia.com> 2020-11-17 22:08:52 +0300
commit 920dbe5b359fe5817b8ba874476ca4ba2dc5f1ef
tree da539cb823c9e11e4fa8e7e6de88dd4a662c7128 /src/graph/rings.cc
parent 084207e685c4587e7d0aa2f1f7f148d3e0e68da6
2.8.3-1

Optimization for Tree allreduce on A100.
Improve aggregation performance.
Use shared buffers for inter-node send/recv.
Add NVTX profiling hooks.
Accelerate alltoall connections by merging communication for all channels.
Add support for one hop communication through NVLink, for faster send/recv communication on cubemesh topologies like DGX-1.
Improve alltoall scheduling to better balance intra/inter node communication.
Increase send/recv parallelism by 8x, each warp sending or receiving to a different peer.
Net: move to v4.
Net: make flush operation asynchronous to accelerate alltoall.
Net: define maximum number of requests.
Fix hang when using LL128 protocol after 2^31 steps.
Fix #379 : topology injection failing when using less GPUs than described in the XML.
Fix #394 : protocol mismatch causing hangs or crashes when using one GPU per node.

Diffstat (limited to 'src/graph/rings.cc')
-rw-r--r-- src/graph/rings.cc 2
1 files changed, 1 insertions, 1 deletions
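
The only change to src/graph/rings.cc in this commit widens the `prefix` buffer in `ncclBuildRings` from 30 to 40 bytes. The commit message does not give a reason, and the `sprintf` calls visible in the hunk are commented out. As a rough illustration only, the sketch below shows one way to measure how many bytes a format string like the one in the hunk needs. `prefixBytesNeeded` is a hypothetical helper and is not part of NCCL.

```cpp
#include <cstdio>

// Hypothetical helper (not NCCL code): bytes needed, including the
// terminating NUL, to format the "Prev" prefix string seen in the hunk.
static int prefixBytesNeeded(int rank, int channel) {
  // With a null buffer and size 0, snprintf writes nothing and returns the
  // number of characters the complete output would contain (NUL excluded).
  int n = std::snprintf(nullptr, 0, "[%d] Channel %d Prev : ", rank, channel);
  return n < 0 ? -1 : n + 1;
}
```

For rank 0 and channel 0 this returns 22, and for rank 1023 and channel 31 it returns 26. The same call can be used to size a buffer before formatting, or the write itself can be bounded with `snprintf(prefix, sizeof(prefix), ...)`.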
diff --git a/src/graph/rings.cc b/src/graph/rings.cc
index 5aacbb5..53130d1 100644
--- a/src/graph/rings.cc
+++ b/src/graph/rings.cc
@@ -21,7 +21,7 @@ void dumpLine(int* values, int nranks, const char* prefix) {
 ncclResult_t ncclBuildRings(int nrings, int* rings, int rank, int nranks, int* prev, int* next) {
   for (int r=0; r<nrings; r++) {
-    char prefix[30];
+    char prefix[40];
     /*sprintf(prefix, "[%d] Channel %d Prev : ", rank, r);
     dumpLine(prev+r*nranks, nranks, prefix);
     sprintf(prefix, "[%d] Channel %d Next : ", rank, r);