
github.com/torch/dok.git
author    Ronan Collobert <ronan@collobert.com>  2012-08-16 19:09:31 +0400
committer Ronan Collobert <ronan@collobert.com>  2012-08-16 19:09:31 +0400
commit    e7517eb5c506ab6ebb434d60918d7b9a1fd2291b (patch)
tree      907edb275de19f66f66591c8766ea3327806df25
parent    c004da2f3712e02f27ac0b687c1b78f03e480fcf (diff)

updated blas dok (related to OpenBLAS)

 -rw-r--r--  dokinstall/blas.dok | 34
 1 file changed, 18 insertions(+), 16 deletions(-)
diff --git a/dokinstall/blas.dok b/dokinstall/blas.dok
index 36e85ef..da34a77 100644
--- a/dokinstall/blas.dok
+++ b/dokinstall/blas.dok
@@ -21,8 +21,7 @@ released under a BSD-like license. Unfortunately, it is not maintained
anymore (at this time), but several forks have been released later. Our preference
goes to [[https://github.com/xianyi/OpenBLAS|OpenBLAS]].
-We provide below simple instructions to install OpenBLAS. Similar instructions apply for
-GotoBLAS.
+We provide below simple instructions to install OpenBLAS.
First get the latest OpenBLAS stable code:
<file>
@@ -44,18 +43,28 @@ pkg_add -r gcc46
On MacOS X, you should install a gfortran package provided on
[[http://gcc.gnu.org/wiki/GFortranBinaries|this GCC webpage]].
-You can now go into the OpenBlas directory, and just do ''make''. Read OpenBLAS manual for more details.
+You can now go into the OpenBLAS directory, and just do:
+<file>
+make NO_AFFINITY=1 USE_OPENMP=1
+</file>
+OpenBLAS uses processor affinity to go faster. However, in general, on a
+computer shared between several users, this causes processes to fight for
+the same CPU. We thus disable it here with the ''NO_AFFINITY'' flag. We
+also use the ''USE_OPENMP'' flag, such that OpenBLAS uses OpenMP and not
+pthreads. This is important to avoid some confusion in the number of
+threads, as Torch7 uses OpenMP. Read the OpenBLAS manual for more details.
+
You can use ''CC'' and ''FC'' variables to control the C and Fortran compilers.
On FreeBSD use 'gmake' instead of 'make'. You also have to specify the correct MD5 sum program.
You will probably want to use the following command line:
<file>
-gmake CC=gcc46 FC=gcc46 MD5SUM='md5 -q'
+gmake NO_AFFINITY=1 USE_OPENMP=1 CC=gcc46 FC=gcc46 MD5SUM='md5 -q'
</file>
On MacOS X, you will also have to specify the correct MD5SUM program:
<file>
-make MD5SUM='md5 -q'
+make NO_AFFINITY=1 USE_OPENMP=1 MD5SUM='md5 -q'
</file>
Be sure to specify MD5SUM correctly, otherwise OpenBLAS might not compile LAPACK properly.
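The platform-specific MD5SUM settings are needed because the tools differ between systems; a brief illustration (the temporary file below is only for demonstration):

```shell
# On Linux, md5sum prints "<hash>  <filename>":
printf 'hello' > /tmp/md5_demo.txt
md5sum /tmp/md5_demo.txt
# On MacOS X there is no md5sum; the BSD tool is 'md5', and 'md5 -q FILE'
# prints only the hash -- which is why MD5SUM='md5 -q' is passed to make above.
```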
@@ -74,7 +83,9 @@ Make sure that CMake can find your OpenBLAS library. This can be done with
<file>
export CMAKE_LIBRARY_PATH=/your_installation_path/lib
</file>
-before starting cmake command line.
+before starting cmake command line. On some platforms, the ''gfortran''
+library might also be not found. In this case, add the path to the
+''gfortran'' library into ''CMAKE_LIBRARY_PATH''.
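As a sketch of that workaround (the installation path is the placeholder from above, and the gcc query is one common way to locate libgfortran; adjust for your system):

```shell
# Ask gcc for the full path of its libgfortran, then keep its directory.
GFORTRAN_DIR=$(dirname "$(gcc -print-file-name=libgfortran.so)")
# Prepend the OpenBLAS lib dir (placeholder path from the text above).
export CMAKE_LIBRARY_PATH=/your_installation_path/lib:$GFORTRAN_DIR
echo "$CMAKE_LIBRARY_PATH"
```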
===== Installing Intel MKL =====
@@ -121,12 +132,10 @@ The locations to search for are generally as follows.
/usr/lib/gcc/x86_64-linux-gnu/
/usr/lib/gcc/x86_64-redhat-linux/4.4.4/
</file>
-These are a bit crytpic, but look around and find the path that contains libgfortran.so. And, use
-
+These are a bit cryptic, but look around and find the path that contains libgfortran.so. Then use
<file>
export CMAKE_LIBRARY_PATH=...
</file>
-
before calling cmake to build torch, this makes sure that OpenBLAS will be found.
@@ -146,7 +155,6 @@ Note again that the best choices are probably ''open'' or ''mkl''. For
consistency reasons, CMake will try to find the corresponding LAPACK
package (and does not allow mixing up different BLAS/LAPACK versions).
-
===== GotoBLAS/OpenBLAS and MKL threads =====
GotoBLAS/OpenBLAS and MKL are multi-threaded libraries.
@@ -156,12 +164,6 @@ export OMP_NUM_THREADS=N
</file>
where N is an integer.
-With OpenBLAS, you can use
-<file>
-export OPENBLAS_NUM_THREADS=N
-</file>
-or GOTO_NUM_THREADS, or OMP_NUM_THREADS.
-
Beware that running small problems on a large number of threads reduces
performance! Multi-threading should be enabled only for large-scale
computations.
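For example, on a shared multi-core machine one might cap a large Torch7 job at four threads (the script name below is hypothetical):

```shell
# Limit OpenMP (and hence OpenBLAS) to 4 threads for this session:
export OMP_NUM_THREADS=4
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
# then run the Torch7 program, e.g.:  th my_script.lua  (hypothetical name)
```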