github.com/marian-nmt/nccl.git
author: Sylvain Jeaugey <sjeaugey@nvidia.com>, 2019-11-20 01:57:39 +0300
committer: GitHub <noreply@github.com>, 2019-11-20 01:57:39 +0300
commit: 299c554dccf923230321ad7495946543f3e9b457
tree: 6a70b52080f0570fc87285b3b2300dbd2f2918ad (path: src/collectives/device/common_kernel.h)
parent: ccb1298148327bacb9b83452ed6ae0b29417e7e2
2.5.6-1 (#255)
Add LL128 Protocol.
Rewrite the topology detection and tree/ring creation (#179).
Improve tree performance by sending/receiving from different GPUs.
Add model-based tuning to switch between the different algorithms and protocols.
Rework P2P/SHM detection in containers (#155, #248).
Detect duplicated devices and return an error (#231).
Add tuning for GCP
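The "model-based tuning" item in the message above can be illustrated with a minimal sketch. It assumes a simple cost model in which predicted time is a fixed latency plus message size divided by bandwidth, and picks the algorithm/protocol pair with the lowest prediction. The `Candidate` struct, the function names, and every latency and bandwidth number are hypothetical and chosen only for illustration; they are not taken from the NCCL source or from this commit.

```cpp
#include <limits>
#include <string>
#include <vector>

// Hypothetical cost-model entry: one (algorithm, protocol) pair with an
// assumed fixed latency and sustained bandwidth. Illustrative values only.
struct Candidate {
  std::string algo;      // e.g. "tree" or "ring"
  std::string proto;     // e.g. "LL", "LL128", "Simple"
  double latency_us;     // fixed cost per operation, in microseconds
  double bandwidth_gbs;  // sustained bandwidth, in GB/s
};

// Predicted time in microseconds: fixed latency plus transfer time.
// 1 GB/s equals 1e3 bytes per microsecond.
double predictTimeUs(const Candidate& c, double bytes) {
  return c.latency_us + bytes / (c.bandwidth_gbs * 1e3);
}

// Return the index of the candidate with the lowest predicted time,
// or -1 if the list is empty.
int pickBest(const std::vector<Candidate>& cands, double bytes) {
  int best = -1;
  double bestTime = std::numeric_limits<double>::infinity();
  for (int i = 0; i < static_cast<int>(cands.size()); ++i) {
    double t = predictTimeUs(cands[i], bytes);
    if (t < bestTime) {
      bestTime = t;
      best = i;
    }
  }
  return best;
}

// Made-up table: low-latency/low-bandwidth through high-latency/high-bandwidth.
std::vector<Candidate> exampleCandidates() {
  return {
      {"tree", "LL", 5.0, 2.0},
      {"tree", "LL128", 8.0, 9.0},
      {"ring", "Simple", 30.0, 12.0},
  };
}
```

With this made-up table, `pickBest(exampleCandidates(), 1024.0)` selects the low-latency entry, while a 100 MB message selects the high-bandwidth one. That crossover is the kind of behavior a latency/bandwidth model is meant to capture.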
Diffstat (limited to 'src/collectives/device/common_kernel.h')
-rw-r--r-- src/collectives/device/common_kernel.h | 2
1 file changed, 0 insertions, 2 deletions
diff --git a/src/collectives/device/common_kernel.h b/src/collectives/device/common_kernel.h
index 435a598..aa1e936 100644
--- a/src/collectives/device/common_kernel.h
+++ b/src/collectives/device/common_kernel.h
@@ -263,8 +263,6 @@ __device__ __forceinline__ void ReduceCopyMulti(const int tid, const int nthread
}
}
-#define WARP_SIZE 32
-
template<class FUNC, typename T, int UNROLL, int MINSRCS, int MAXSRCS, int MINDSTS, int MAXDSTS>
__device__ __forceinline__ void ReduceCopy128bMulti( const int w, const int nw, const int t,
int nsrcs, const T* s[MAXSRCS], int ndsts, T* d[MAXDSTS],