Monday, September 18, 2017

Install tensorflow by building source

Problem occurs when using direct pip install:
make tensorflow c++ code
third_party/eigen3/unsupported/eigen/cxx11/tensor: no such file or directory

Fixed by compiling open source
Environment used
Ubuntu 16.04
gcc 5.4.0
Cuda 8.0
Cudnn 5
python 2.7
tensorflow 1.2.1

Oficial doc:
https://www.tensorflow.org/install/install_sources#PrepareLinux

Clone the TensorFlow repository

Start the process of building TensorFlow by cloning a TensorFlow repository.
To clone the latest TensorFlow repository, issue the following command:
$ git clone https://github.com/tensorflow/tensorflow
The preceding git clone command creates a subdirectory named tensorflow. After cloning, you may optionally build a specific branch (such as a release branch) by invoking the following commands:
$ cd tensorflow $ git checkout Branch # where Branch is the desired branch
For example, to work with the r1.0 release instead of the master release, issue the following command:
$ git checkout r1.2

Prepare environment for Linux

Install Bazel

If bazel is not installed on your system, install it now by following these directions.
Bug:
The latest bazel has problem to build, need to roll back to 0.5.2
Download from
https://github.com/bazelbuild/bazel/releases/download/0.5.2/bazel_0.5.2-linux-x86_64.deb

Install TensorFlow Python dependencies

To install these packages for Python 2.7, issue the following command:
$ sudo apt-get install python-numpy python-dev python-pip python-wheel
To install these packages for Python 3.n, issue the following command:
$ sudo apt-get install python3-numpy python3-dev python3-pip python3-wheel

Optional: install TensorFlow for GPU prerequisites

Finally, you must also install libcupti-dev by invoking the following command:

$ sudo apt-get install libcupti-dev

Next

After preparing the environment, you must now configure the installation.
$ cd tensorflow  # cd to the top-level directory created
$ ./configure
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python2.7
Found possible Python library paths:
  /usr/local/lib/python2.7/dist-packages
  /usr/lib/python2.7/dist-packages
Please input the desired Python library path to use.  Default is [/usr/lib/python2.7/dist-packages]

Using python library path: /usr/local/lib/python2.7/dist-packages
Do you wish to build TensorFlow with MKL support? [y/N]
No MKL support will be enabled for TensorFlow
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Do you wish to use jemalloc as the malloc implementation? [Y/n]
jemalloc enabled
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N]
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N]
No Hadoop File System support will be enabled for TensorFlow
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N]
No XLA support will be enabled for TensorFlow
Do you wish to build TensorFlow with VERBS support? [y/N]
No VERBS support will be enabled for TensorFlow
Do you wish to build TensorFlow with OpenCL support? [y/N]
No OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] Y
CUDA support will be enabled for TensorFlow
Do you want to use clang as CUDA compiler? [y/N]
nvcc will be used as CUDA compiler
Please specify the Cuda SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 8.0]: 8.0
Please specify the location where CUDA 8.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 6.0]: 5
Please specify the location where cuDNN 6 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "3.5,5.2"]: 3.0
Do you wish to build TensorFlow with MPI support? [y/N] 
MPI support will not be enabled for TensorFlow
Configuration finished
To build a pip package for TensorFlow with GPU support, invoke the following command:
$ bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" 
NOTE on gcc 5 or later: the binary pip packages available on the TensorFlow website are built with gcc 4, which uses the older ABI. To make your build compatible with the older ABI, you need to add --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" to your bazel build command.


The bazel build command builds a script named build_pip_package. Running this script as follows will build a .whl file within the /tmp/tensorflow_pkg directory:

$ bazel-bin/tensorflow/tools/pip_package/build_pip_package ~/tmp/tensorflow_pkg

Install via pip in virtual environment
virtualenv tensorflow
(tensorflow)$ source ~/tensorflow/bin/activate
(tensorflow) $ pip install /tmp/tensorflow_pkg/tensorflow-1.2.1-cp27-cp27mu-linux_x86_64.whl


Update 2018.01.16

Problems for

cuda 9.1
cudnn 7.0
tensorflow 1.5.0


When building using bazel ...


Describe the problem 1

While trying to compile the latest TensorFlow(cloned from 798fa36), such error will be raised:
ERROR: /home/ubuntu/tensorflow/tensorflow/contrib/seq2seq/BUILD:64:1: error while parsing .d file: /home/ubuntu/.cache/bazel/_bazel_ubuntu/ad1e09741bb4109fbc70ef8216b59ee2/execroot/org_tensorflow/bazel-out/k8-py3-opt/bin/tensorflow/contrib/seq2seq/_objs/python/ops/_beam_search_ops_gpu/tensorflow/contrib/seq2seq/kernels/beam_search_ops_gpu.cu.pic.d (No such file or directory)
In file included from external/eigen_archive/unsupported/Eigen/CXX11/Tensor:14:0,
                 from ./third_party/eigen3/unsupported/Eigen/CXX11/Tensor:1,
                 from ./tensorflow/contrib/seq2seq/kernels/beam_search_ops.h:19,
                 from tensorflow/contrib/seq2seq/kernels/beam_search_ops_gpu.cu.cc:20:
external/eigen_archive/unsupported/Eigen/CXX11/../../../Eigen/Core:59:34: fatal error: math_functions.hpp: No such file or directory
It turns out that in CUDA 9.1, math_functions.hpp is located at cuda/include/crt/math_functions.hpp, rather than cuda/include/math_functions.hpp (CUDA 9.0 does), which leads to this error.
ln -s /usr/local/cuda/include/crt/math_functions.hpp /usr/local/cuda/include/math_functions.hpp will fix this problem and complete the compiling process.

Reference


Note on gcc version >=5: gcc uses the new C++ ABI since version 5. The binary pip packages available on the TensorFlow website are built with gcc4 that uses the older ABI. If you compile your op library with gcc>=5, add -D_GLIBCXX_USE_CXX11_ABI=0 to the command line to make the library compatible with the older abi. Furthermore if you are using TensorFlow package created from source remember to add --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" as bazel command to compile the Python package.

Problem 2

no such package '@nasm//': java.io.IOException: Error downloading [https://mirror.bazel.build/www.nasm.us/pub/nasm/releasebuilds/2.12.02/nasm-2.12.02.tar.bz2, http://pkgs.fedoraproject.org/repo/pkgs/nasm/nasm-2.12.02.tar.bz2/d15843c3fb7db39af80571ee27ec6fad/nasm-2.12.02.tar.bz2]

Solution
https://github.com/tensorflow/tensorflow/issues/16862
The problem is that one of two mirrors for nasm is dead, and the second one is sort some reason problematic. Workaround would be to add one more mirror:
      urls = [
          "https://mirror.bazel.build/www.nasm.us/pub/nasm/releasebuilds/2.12.02/nasm-2.12.02.tar.bz2",  
          "http://www.nasm.us/pub/nasm/releasebuilds/2.12.02/nasm-2.12.02.tar.bz2",
          "http://pkgs.fedoraproject.org/repo/pkgs/nasm/nasm-2.12.02.tar.bz2/d15843c3fb7db39af80571ee27ec6fad/nasm-2.12.02.tar.bz2",
      ]
in
"https://mirror.bazel.build/www.nasm.us/pub/nasm/releasebuilds/2.12.02/nasm-2.12.02.tar.bz2",

Problem 3

'Numpy dangling symbolic links' when building from source
Solution

sudo pip install --no-cache-dir --upgrade --force-reinstall numpy

Update 2018.06.05

Problem for 
cuda 9.0
cudnn 7.0
tensorflow 1.7.0
bazel 0.14

when building using bazel

bazel-out/host/bin/_solib_local/_U_S_Stensorflow_Spython_Cgen_Ustring_Uops_Upy_Uwrappers_Ucc___Utensorflow/libtensorflow_framework.so: undefined reference to `cublasDsymm_v2@libcublas.so.9.0
........

Solution
1. check $LD_LIBRARY_PATH in ~/.bashrc 
2. check CUDA_PATH
The solution is to not use LD_LIBRARY_PATH but ldconfig:
sudo echo "/usr/local/cuda/lib64" > /etc/ld.so.conf.d/cuda.conf
sudo ldconfig

Updated on 2018.12.21


Reinstall bazel

https://docs.bazel.build/versions/master/install-ubuntu.html
rm ~/.cache/bazel -fr
rm -fr ~/.bazel ~/.bazelrc

Step 2: Download Bazel

Next, download the Bazel binary installer named bazel-<version>-installer-linux-x86_64.sh from the Bazel releases page on GitHub.

Step 3: Run the installer

Run the Bazel installer as follows:
chmod +x bazel-<version>-installer-linux-x86_64.sh
./bazel-<version>-installer-linux-x86_64.sh --user
The --user flag installs Bazel to the $HOME/bin directory on your system and sets the .bazelrc path to $HOME/.bazelrc. Use the --help command to see additional installation options.
Different version of bazel for different tensorflow

Linux

VersionPython versionCompilerBuild tools
tensorflow-1.12.02.7, 3.3-3.6GCC 4.8Bazel 0.15.0
tensorflow-1.11.02.7, 3.3-3.6GCC 4.8Bazel 0.15.0
tensorflow-1.10.02.7, 3.3-3.6GCC 4.8Bazel 0.15.0
tensorflow-1.9.02.7, 3.3-3.6GCC 4.8Bazel 0.11.0
tensorflow-1.8.02.7, 3.3-3.6GCC 4.8Bazel 0.10.0
tensorflow-1.7.02.7, 3.3-3.6GCC 4.8Bazel 0.10.0
tensorflow-1.6.02.7, 3.3-3.6GCC 4.8Bazel 0.9.0
tensorflow-1.5.02.7, 3.3-3.6GCC 4.8Bazel 0.8.0
tensorflow-1.4.02.7, 3.3-3.6GCC 4.8Bazel 0.5.4
tensorflow-1.3.02.7, 3.3-3.6GCC 4.8Bazel 0.4.5
tensorflow-1.2.02.7, 3.3-3.6GCC 4.8Bazel 0.4.5
tensorflow-1.1.02.7, 3.3-3.6GCC 4.8Bazel 0.4.2
tensorflow-1.0.02.7, 3.3-3.6GCC 4.8Bazel 0.4.2
VersionPython versionCompilerBuild toolscuDNNCUDA
tensorflow_gpu-1.12.02.7, 3.3-3.6GCC 4.8Bazel 0.15.079
tensorflow_gpu-1.11.02.7, 3.3-3.6GCC 4.8Bazel 0.15.079
tensorflow_gpu-1.10.02.7, 3.3-3.6GCC 4.8Bazel 0.15.079
tensorflow_gpu-1.9.02.7, 3.3-3.6GCC 4.8Bazel 0.11.079
tensorflow_gpu-1.8.02.7, 3.3-3.6GCC 4.8Bazel 0.10.079
tensorflow_gpu-1.7.02.7, 3.3-3.6GCC 4.8Bazel 0.9.079
tensorflow_gpu-1.6.02.7, 3.3-3.6GCC 4.8Bazel 0.9.079
tensorflow_gpu-1.5.02.7, 3.3-3.6GCC 4.8Bazel 0.8.079
tensorflow_gpu-1.4.02.7, 3.3-3.6GCC 4.8Bazel 0.5.468
tensorflow_gpu-1.3.02.7, 3.3-3.6GCC 4.8Bazel 0.4.568
tensorflow_gpu-1.2.02.7, 3.3-3.6GCC 4.8Bazel 0.4.55.18
tensorflow_gpu-1.1.02.7, 3.3-3.6GCC 4.8Bazel 0.4.25.18
tensorflow_gpu-1.0.02.7, 3.3-3.6GCC 4.8Bazel 0.4.25.18

Update 2020.01.19
Tested on (version matters!)
tensorflow 1.8
Cuda 10.0
cudnn 7.6
python 3.6
bazel 0.15.0

No comments:

Post a Comment