Age | Commit message | Author | |
---|---|---|---|
2016-12-17 | Revert "Bugfix of type in THCTensor macro." (revert-639-patch-1) | Soumith Chintala | |
2016-12-16 | Merge pull request #639 from popol1991/patch-1 | Soumith Chintala | |
Bugfix of type in THCTensor macro. | |||
2016-12-16 | Bugfix of type in THCTensor macro. | Gao Yingkai | |
A fix for issue #632. | |||
2016-12-16 | Merge pull request #637 from pavanky/test-fixes | Soumith Chintala | |
Fixing various tests | |||
2016-12-16 | Fixing various tests | Pavan Yalamanchili | |
- Increased the number of elements being used by distributions. - Fixed indexFill to generate a number that can be used by all types. | |||
2016-12-15 | fix wrong export directive for THCCachingHostAllocator (#633) | Eric Cosatto | |
2016-12-14 | Merge pull request #630 from apaszke/bernoulli | Soumith Chintala | |
Implement bernoulli with element-wise probabilities for all types | |||
2016-12-13 | Implement bernoulli with element-wise probabilities for all types | Adam Paszke | |
2016-12-12 | Merge pull request #628 from killeent/more-documentation | Soumith Chintala | |
TensorInfo related code documentation | |||
2016-12-12 | TensorInfo related code documentation | Trevor Killeen | |
2016-12-02 | Merge pull request #619 from colesbury/cached_pinned_memory_fix | Soumith Chintala | |
Process outstanding CUDA events in recordEvent | |||
2016-12-02 | Process outstanding CUDA events in recordEvent | Sam Gross | |
Without this, the cuda_events queue could grow continuously from calls to cudaMemcpyAsync but would never be processed if there were no new pinned memory allocations. For example: `t1 = cutorch.createCudaHostTensor(10); t2 = torch.CudaTensor(10); while true do t2:copyAsync(t1) end` | |||
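The fix described above can be sketched as follows. This is a hypothetical host-side illustration, not the actual cutorch code: `Event` stands in for a `cudaEvent_t`, and `done` mimics `cudaEventQuery`. The key point is that recordEvent drains completed events from the front of the queue before appending a new one, so the queue stays bounded even when no new pinned allocations occur.

```cpp
#include <deque>
#include <functional>
#include <utility>

// Hypothetical sketch of the fix: recordEvent both records a new event and
// drains already-completed events from the front of the queue.
struct Event {
    std::function<bool()> done;  // true once the async copy has finished
};

struct EventQueue {
    std::deque<Event> cuda_events;

    void recordEvent(Event e) {
        // Process outstanding events first (the bug fix): pop every event
        // that has completed, stopping at the first one still in flight.
        while (!cuda_events.empty() && cuda_events.front().done())
            cuda_events.pop_front();
        cuda_events.push_back(std::move(e));
    }
};
```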
2016-12-02 | Merge pull request #618 from colesbury/cached_pinned_memory | Soumith Chintala | |
Add caching allocator for pinned (page-locked) memory | |||
2016-12-02 | Add caching allocator for pinned (host) memory | Sam Gross | |
Adds a caching allocator for CUDA pinned (page-locked) memory. This avoids synchronization due to cudaFreeHost or cudaHostUnregister, at the expense of potentially higher host memory usage. Correctness is preserved by recording CUDA events after each cudaMemcpyAsync involving the pinned memory; an allocation is not reused until all events associated with it have completed. | |||
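The caching scheme described above can be sketched in a few lines. This is a simplified, hypothetical illustration, not the cutorch implementation: `::operator new` stands in for `cudaHostAlloc`, and event tracking is reduced to a per-block outstanding-event count. Freed blocks go to a size-keyed cache instead of being returned to the OS (avoiding the synchronizing `cudaFreeHost`), and a cached block is only handed out again once its events have completed.

```cpp
#include <cstddef>
#include <map>

// Hypothetical sketch of a caching pinned-memory allocator.
struct Block {
    std::size_t size;
    void* ptr;
    int outstanding_events = 0;  // events recorded after cudaMemcpyAsync
};

struct PinnedAllocator {
    std::multimap<std::size_t, Block*> free_blocks;  // cached, reusable blocks

    void* allocate(std::size_t size) {
        // Reuse the smallest cached block that fits and has no pending events.
        auto it = free_blocks.lower_bound(size);
        if (it != free_blocks.end() && it->second->outstanding_events == 0) {
            void* p = it->second->ptr;
            free_blocks.erase(it);
            return p;
        }
        return ::operator new(size);  // stands in for cudaHostAlloc
    }

    void free(Block* b) {
        // Do not release to the OS; cache for reuse (no cudaFreeHost sync).
        free_blocks.emplace(b->size, b);
    }
};
```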
2016-12-01 | Adds a CUDA "sleep" kernel | Sam Gross | |
Adds a CUDA "sleep" kernel which spins for the given number of iterations. This is useful for testing correct synchronization with streams. | |||
2016-11-28 | Merge pull request #614 from BTNC/win | Soumith Chintala | |
use local modified select_compute_arch.cmake for msvc | |||
2016-11-28 | use local modified select_compute_arch.cmake for msvc | Rui Guo | |
2016-11-26 | Merge pull request #613 from colesbury/lazy | Soumith Chintala | |
Lazily initialize CUDA devices (take 2) | |||
2016-11-26 | Lazily initialize CUDA devices | Sam Gross | |
Previously, cutorch would initialize every CUDA device and enable P2P access between all pairs. This slows down start-up, especially with 8 devices. Now, THCudaInit does not initialize any devices and P2P access is enabled lazily. Setting the random number generator seed also does not initialize the device until random numbers are actually used. | |||
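The lazy-initialization pattern described above can be sketched with a per-device once-flag. This is a hypothetical host-side illustration (the names `LazyCudaState` and `initDevice` are invented for the sketch); real code would create handles and streams and enable P2P access with peers inside the once-callback.

```cpp
#include <array>
#include <mutex>

// Hypothetical sketch of lazy device initialization: instead of touching
// every device at init time, each device is initialized on first use.
constexpr int kMaxDevices = 8;

struct LazyCudaState {
    std::array<std::once_flag, kMaxDevices> init_flags;
    std::array<bool, kMaxDevices> initialized{};  // observable for the sketch
    int init_calls = 0;

    void initDevice(int dev) {
        // Runs at most once per device, no matter how many threads call it.
        std::call_once(init_flags[dev], [&] {
            // Real code would cudaSetDevice(dev), create cuBLAS handles and
            // streams, and enable P2P access lazily as well.
            initialized[dev] = true;
            ++init_calls;
        });
    }
};
```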
2016-11-24 | Merge pull request #611 from torch/revert-610-lazy | Soumith Chintala | |
Revert "Lazily initialize CUDA devices" | |||
2016-11-24 | Revert "Lazily initialize CUDA devices" (revert-610-lazy) | Soumith Chintala | |
2016-11-24 | remove spurious prints in tests | soumith | |
2016-11-24 | Merge pull request #610 from colesbury/lazy | Soumith Chintala | |
Lazily initialize CUDA devices | |||
2016-11-24 | Implemented cudaMemGetInfo for caching allocator (#600) | Boris Fomitchev | |
* Implemented cudaMemGetInfo for caching allocator | |||
2016-11-23 | Lazily initialize CUDA devices | Sam Gross | |
Previously, cutorch would initialize every CUDA device and enable P2P access between all pairs. This slows down start-up, especially with 8 devices. Now, THCudaInit does not initialize any devices and P2P access is enabled lazily. Setting the random number generator seed also does not initialize the device until random numbers are actually used. | |||
2016-11-18 | Merge pull request #607 from killeent/half-guard | Soumith Chintala | |
guard random functions for half | |||
2016-11-18 | guard random functions for half | Trevor Killeen | |
2016-11-18 | Merge pull request #605 from gchanan/halfAddrAddmv | Soumith Chintala | |
Add half support for addmv and addr. | |||
2016-11-18 | Add half support for addmv and addr. | Gregory Chanan | |
2016-11-17 | Merge pull request #604 from killeent/memleak | Soumith Chintala | |
fix memory leak in (equal) | |||
2016-11-17 | fix memory leak in (equal) | Trevor Killeen | |
2016-11-17 | Merge pull request #603 from killeent/remainder | Soumith Chintala | |
Implement fmod, remainder, equal in Cutorch | |||
2016-11-17 | add support for equal in cutorch | Trevor Killeen | |
2016-11-17 | Merge pull request #602 from killeent/magma | Soumith Chintala | |
Magma functions to generic | |||
2016-11-16 | add support for fmod in cutorch | Trevor Killeen | |
2016-11-16 | add support for remainder in cutorch | Trevor Killeen | |
2016-11-16 | [cutorch mag2gen] more cleanup | Trevor Killeen | |
2016-11-16 | [cutorch mag2gen] some cleanup | Trevor Killeen | |
2016-11-16 | [cutorch mag2gen] move qr to generic | Trevor Killeen | |
2016-11-16 | [cutorch mag2gen] move potr* to generic | Trevor Killeen | |
2016-11-16 | [cutorch mag2gen] move inverse to generic | Trevor Killeen | |
2016-11-16 | [cutorch mag2gen] move svd to generic | Trevor Killeen | |
2016-11-16 | [cutorch mag2gen] move eig to generic | Trevor Killeen | |
2016-11-16 | [cutorch mag2gen] move symeig to generic | Trevor Killeen | |
2016-11-16 | [cutorch mag2gen] move gels to generic | Trevor Killeen | |
2016-11-16 | [cutorch mag2gen] code refactor to support generics; move gesv to generic | Trevor Killeen | |
2016-11-16 | [cutorch mag2gen] generic MAGMA memory allocator function | Trevor Killeen | |
2016-11-16 | [cutorch potr*] API parity for potr* functions in cutorch | Trevor Killeen | |
2016-11-15 | Merge pull request #601 from 1nadequacy/fix_baddbmm | Soumith Chintala | |
[cutorch] remove syncing point from baddbmm | |||
2016-11-15 | [cutorch] remove syncing point from baddbmm | Denis Yarats | |
This change removes HtoD copies inside baddbmm. These copies introduce a syncing point, which causes slowdowns in multi-GPU training. Test plan: run unit tests for baddbmm.
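The idea behind removing the HtoD copy can be sketched as follows. Batched GEMM interfaces take an array of per-matrix pointers; building that array on the host and copying it to the device forces a synchronizing transfer. When the batch matrices are laid out at a fixed stride, each pointer can instead be derived from the base address on the device itself. This is a hypothetical illustration (the function `batch_pointers` is invented); on the GPU the loop would run in a small kernel rather than on the host.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical sketch: derive per-batch matrix pointers from a base address
// and a fixed stride, instead of building the pointer array on the host and
// copying it to the device (a synchronizing cudaMemcpy).
std::vector<const float*> batch_pointers(const float* base,
                                         std::int64_t stride,
                                         int batches) {
    std::vector<const float*> ptrs(static_cast<std::size_t>(batches));
    for (int i = 0; i < batches; ++i)
        ptrs[static_cast<std::size_t>(i)] = base + i * stride;
    return ptrs;
}
```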