Age | Commit message | Author | |
---|---|---|---|
2018-10-25 | 2.3.7-1 (tag: v2.3.7-1) | David Addison | |
Improved LL tuning for multi-node jobs. Improved bootstrap for large job scaling. Fixed a hang during bootstrap due to socket reuse. Added operation name to the COLL INFO logging. | |||
2018-10-14 | Fix nccl-tests all_reduce_perf path | Obihörnchen | |
It's `all_reduce_perf` not `allreduce_perf` | |||
2018-09-26 | 2.3.5-5 (tag: v2.3.5-5) | Sylvain Jeaugey | |
Add support for inter-node communication using sockets and InfiniBand/RoCE. Improve latency. Add support for aggregation. Improve LL/regular tuning. Remove tests as those are now at github.com/nvidia/nccl-tests. | |||
2017-11-29 | Merge pull request #119 from sclarkson/master | Sylvain Jeaugey | |
Fix tests: call cudaHostUnregister on the host pointer instead of the device pointer. | |||
2017-11-12 | fix tests on maxwell | sclarkson | |
2017-08-04 | Update README to link to NCCL2 | Sylvain Jeaugey | |
2017-08-04 | Update README to link to NCCL2 part 3 | Sylvain Jeaugey | |
2017-08-04 | Update README to link to NCCL2 #2 | Sylvain Jeaugey | |
2017-08-04 | Update README to link to NCCL2 | Sylvain Jeaugey | |
2017-06-14 | Add support for CUDA9 half semantics | Sylvain Jeaugey | |
2017-04-04 | Merge pull request #78 from ilya-biryukov/master | Sylvain Jeaugey | |
Fix compilation error when compiling with 'clang -x cuda'. | |||
2017-03-24 | Added Pascal nvcc flags, bumped version (tag: v1.3.4-1) | Boris Fomitchev | |
2017-03-16 | Fix compilation error when compiling with 'clang -x cuda'. | Ilya Biryukov | |
Functions vFetch and vStore are not found by ADL with clang, so they need to be declared before usage in ReduceCopy. | |||
2017-03-02 | Bumping version to 1.3.3 | Sylvain Jeaugey | |
2017-03-02 | Only enable peer access for ring neighbors. | Nathan Luehr | |
This enables support for systems with more than 9 GPUs attached to a single PCIe root complex. | |||
2017-03-02 | Fix copy/paste typo in error message | Sylvain Jeaugey | |
2017-03-02 | Fix crash in Reduce when non-root ranks have invalid recvbuff | Sylvain Jeaugey | |
2017-02-08 | Merge pull request #69 from cwhipkey/master | Sylvain Jeaugey | |
Qualify nullptr_t with std:: | |||
2017-02-08 | Qualify nullptr_t with std::. | Chad Whipkey | |
2016-12-08 | Fix 1.3.2 compilation | Sylvain Jeaugey | |
2016-12-06 | Adding missing file | Sylvain Jeaugey | |
2016-12-02 | 1.3.2 release | Sylvain Jeaugey | |
Broadcast tuning. Better checking of inputs. Copy/reduce code simplification. | |||
2016-12-02 | Replace min BW by average BW in tests | Sylvain Jeaugey | |
2016-11-28 | Merge pull request #54 from peterhj/peterhj-staticlib | Sylvain Jeaugey | |
Add a static library target "staticlib" to the Makefile. | |||
2016-11-24 | Add a static library target "staticlib" to the Makefile. | Peter Jin | |
Rename the static library "libnccl_static.a" to disambiguate from the dynamic libraries. | |||
2016-11-21 | Remove irrelevant output from ncclReduce Fortran tests | Kyle Fernandes, ne Jacobs | |
2016-11-21 | Add Copyright header to Fortran bindings source files | Kyle Fernandes, ne Jacobs | |
2016-11-18 | Add Fortran bindings | Kyle Fernandes, ne Jacobs | |
2016-10-13 | Bump to 1.3.1 | Sylvain Jeaugey | |
2016-10-13 | Fix primitives function prototype | Sylvain Jeaugey | |
2016-10-13 | NVML (libwrap) : import the needed definitions | Sylvain Jeaugey | |
2016-10-07 | Improved allreduce segmentation for small sizes | Sylvain Jeaugey | |
2016-09-22 | Add scan tests (tag: v1.3.0-1) | Sylvain Jeaugey | |
2016-09-22 | Make tests check for deltas and report bandwidth | Sylvain Jeaugey | |
2016-09-22 | Heavy code refactoring to remove a lot of code in collectives (~1000 lines). | Sylvain Jeaugey | |
Have all collectives use the same args, the same ring, and the same primitives for synchronization between threads with the same pattern. | |||
2016-09-22 | Add profiling API | Sylvain Jeaugey | |
2016-09-22 | Fix MPI test path | Sylvain Jeaugey | |
2016-09-15 | Merge pull request #41 from jia-kai/master | Sylvain Jeaugey | |
Some minor fixes for compile/usage | |||
2016-08-27 | Merge pull request #45 from NVIDIA/cw-update-copyright-year | Sylvain Jeaugey | |
Update LICENSE.txt | |||
2016-08-27 | Update LICENSE.txt | Cliff Woolley | |
2016-08-27 | Updated LICENCE.txt | Sylvain Jeaugey | |
2016-08-19 | pass devlist as const int* rather than int* in ncclCommInitAll | jiakai | |
2016-08-19 | link library with -lrt; otherwise there is undefined reference to shm_open | jiakai | |
2016-07-28 | Remove unneeded deb build script | Sylvain Jeaugey | |
2016-07-25 | Merge remote-tracking branch 'github/master' into public | Sylvain Jeaugey | |
2016-07-25 | Fixed redundant contexts in multi-process apps | Nathan Luehr | |
Change-Id: If787014450fd281304f0c7baf01d25963e40905d | |||
2016-07-07 | Improved Deb generation | Sylvain Jeaugey | |
2016-06-17 | Fix version number | Sylvain Jeaugey | |
2016-06-17 | Add a debug level to NCCL and CUDA versions at init | Sylvain Jeaugey | |
2016-06-16 | Increased version to 1.2.3 | Sylvain Jeaugey | |
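
The 2017-03-16 entry says that `vFetch` and `vStore` are not found by ADL with clang and must be declared before they are used in `ReduceCopy`. The sketch below illustrates that lookup rule only. It is not the NCCL source: the signatures are simplified, and `reduceCopySketch` is a hypothetical stand-in for `ReduceCopy`.

```cpp
#include <cassert>

// Simplified, hypothetical signatures (the real NCCL helpers differ).
// They are declared BEFORE the template that calls them. For built-in
// types such as float there is no associated namespace, so
// argument-dependent lookup at instantiation time finds nothing, and only
// declarations visible at the template's definition are considered.
inline float vFetch(const float* ptr) { return *ptr; }
inline void vStore(float* ptr, float val) { *ptr = val; }

// Hypothetical stand-in for ReduceCopy: writes a[i] + b[i] into dest[i].
template <typename T>
void reduceCopySketch(const T* a, const T* b, T* dest, int n) {
  for (int i = 0; i < n; ++i) {
    // Unqualified dependent calls: resolved against the declarations above.
    vStore(dest + i, vFetch(a + i) + vFetch(b + i));
  }
}
```

If the two helpers were moved below the template, a conforming compiler such as clang would be expected to reject the calls for `float` arguments, which matches the error the commit message describes.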