
[Experience sharing] Installing TensorFlow on Ubuntu 18.04 with an NVIDIA 2080


Posted on 2019-4-19 08:19:04
  Installing TensorFlow-GPU 1.13.1 on Ubuntu 18.04 with an NVIDIA 2080

Official documentation
  https://www.tensorflow.org/install
https://www.tensorflow.org/install/gpu

See also
  Installing TensorFlow-GPU on Ubuntu 16.04 with an NVIDIA 1080Ti

Installation environment


  • OS: Ubuntu 18.04.2 desktop
  • GPU: NVIDIA GeForce RTX 2080
  • GPU driver: NVIDIA-Linux-x86_64-410.72.run (the nvidia-smi output below reports 410.48, the version bundled with the CUDA installer)
  • CUDA: cuda_10.0.130_410.48_linux
  • cuDNN:

    • libcudnn7_7.4.2.24-1+cuda10.0_amd64.deb
    • libcudnn7-dev_7.4.2.24-1+cuda10.0_amd64.deb
    • libcudnn7-doc_7.4.2.24-1+cuda10.0_amd64.deb

  • tensorflow-gpu: 1.13.1
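  First, work out which versions of each component go together:
https://www.tensorflow.org/install/source
Under "Tested build configurations" --> Linux, the CPU and GPU tables list the matching versions for each release: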
[Screenshot: tested build configurations table]
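  The first thing to check is whether the TensorFlow release supports your CUDA version. Next, choose the CUDA version (9.0 or 10.0 is recommended), then pick the GPU driver and cuDNN to match that CUDA version.
Do not install the very latest release of each component; step back one or two stable versions and check that the components are compatible with each other.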
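
The selection rule above can be sketched as a small lookup. This is only an illustration: the table name and helper functions are made up for this example, and the two rows were copied by hand from the tested build configurations page, so re-check them there before relying on them.

```python
# Hypothetical lookup table; entries copied by hand from the
# "Tested build configurations" page and worth re-checking there.
TESTED_GPU_BUILDS = {
    "1.13.1": {"cuda": "10.0", "cudnn": "7.4"},
    "1.12.0": {"cuda": "9", "cudnn": "7"},
}


def required_versions(tf_version):
    """Return the CUDA/cuDNN pair listed for a tensorflow-gpu release, or None."""
    return TESTED_GPU_BUILDS.get(tf_version)


def is_compatible(tf_version, cuda_version):
    """True when the installed CUDA release matches the one listed for tf_version."""
    required = required_versions(tf_version)
    if required is None:
        return False  # unknown release: treat as unverified
    # Compare only as many components as the table specifies ("9" matches "9.0").
    want = required["cuda"].split(".")
    have = cuda_version.split(".")
    return have[: len(want)] == want


print(is_compatible("1.13.1", "10.0"))  # True
print(is_compatible("1.13.1", "9.0"))   # False
```

The second call is the mismatch case: tensorflow-gpu 1.13.1 on top of CUDA 9.0 is not a tested combination.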

Check the NVIDIA GPU driver

netc@gpu-2:~$ nvidia-smi
Mon Mar 25 23:16:33 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.48                 Driver Version: 410.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 2080    Off  | 00000000:03:00.0 Off |                  N/A |
| 24%   40C    P0     1W / 225W |      0MiB /  7949MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
netc@gpu-2:~$
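
If you want to check the driver version from a script rather than by eye, something like the following may help. It is a rough sketch that only parses the header line of the text output shown above; the function names are my own, and 410.48 is used as the threshold simply because it is the driver bundled with the CUDA 10.0 installer used in this post.

```python
import re


def parse_driver_version(smi_output):
    """Extract the 'Driver Version' field from nvidia-smi text output, or None."""
    match = re.search(r"Driver Version:\s*([0-9]+(?:\.[0-9]+)*)", smi_output)
    return match.group(1) if match else None


def driver_at_least(version, minimum):
    """Compare dotted version strings numerically, component by component."""
    def as_tuple(text):
        return tuple(int(part) for part in text.split("."))
    return as_tuple(version) >= as_tuple(minimum)


# Header line copied from the nvidia-smi output in this post.
sample = "| NVIDIA-SMI 410.48                 Driver Version: 410.48                    |"
version = parse_driver_version(sample)
print(version, driver_at_least(version, "410.48"))  # 410.48 True
```

On a real machine you would pass in the text captured from running nvidia-smi instead of the sample string.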
Install CUDA

netc@gpu-2:/data/tools/GeForce-RTX-2080$ sudo sh  cuda_10.0.130_410.48_linux.run
-----------------
Do you accept the previously read EULA?
accept/decline/quit: accept
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?
(y)es/(n)o/(q)uit: y
Do you want to install the OpenGL libraries?
(y)es/(n)o/(q)uit [ default is yes ]: n
Do you want to run nvidia-xconfig?
This will update the system X configuration file so that the NVIDIA X driver
is used. The pre-existing X configuration file will be backed up.
This option should not be used on systems that require a custom
X configuration, such as systems with multiple GPU vendors.
(y)es/(n)o/(q)uit [ default is no ]: n
Install the CUDA 10.0 Toolkit?
(y)es/(n)o/(q)uit: y
Enter Toolkit Location
[ default is /usr/local/cuda-10.0 ]:
Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y
Install the CUDA 10.0 Samples?
(y)es/(n)o/(q)uit: y
Enter CUDA Samples Location
[ default is /home/netc ]:
Installing the NVIDIA display driver...
Installing the CUDA Toolkit in /usr/local/cuda-10.0 ...
Installing the CUDA Samples in /home/netc ...
Copying samples to /home/netc/NVIDIA_CUDA-10.0_Samples now...
Finished copying samples.
===========
= Summary =
===========
Driver:   Installed
Toolkit:  Installed in /usr/local/cuda-10.0
Samples:  Installed in /home/netc
Please make sure that
-   PATH includes /usr/local/cuda-10.0/bin
-   LD_LIBRARY_PATH includes /usr/local/cuda-10.0/lib64, or, add /usr/local/cuda-10.0/lib64 to /etc/ld.so.conf and run ldconfig as root
To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin
To uninstall the NVIDIA Driver, run nvidia-uninstall
Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.0/doc/pdf for detailed information on setting up CUDA.
Logfile is /tmp/cuda_install_13131.log
netc@gpu-2:/data/tools/GeForce-RTX-2080$
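
The installer summary asks you to make sure PATH and LD_LIBRARY_PATH include the CUDA directories. A minimal sketch of that check follows, assuming the default /usr/local/cuda-10.0 location from the log above; the helper name is made up for this example. It does not look at /etc/ld.so.conf, and it will report the entries as missing if you added the /usr/local/cuda symlink paths instead.

```python
import os


def missing_cuda_paths(env, cuda_home="/usr/local/cuda-10.0"):
    """Return the entries the installer summary asks for that env lacks."""
    wanted = {
        "PATH": cuda_home + "/bin",
        "LD_LIBRARY_PATH": cuda_home + "/lib64",
    }
    missing = {}
    for var, directory in wanted.items():
        # Split the colon-separated list and ignore trailing slashes.
        entries = [e.rstrip("/") for e in env.get(var, "").split(":") if e]
        if directory not in entries:
            missing[var] = directory
    return missing


for var, directory in missing_cuda_paths(dict(os.environ)).items():
    # Suggest the export line to add to a shell profile such as ~/.bashrc.
    print("export {0}={1}:${0}".format(var, directory))
```

If nothing is printed, both variables already contain the expected directories.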
Check the CUDA version

netc@gpu-2:/data/tools/GeForce-RTX-2080$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
netc@gpu-2:/data/tools/GeForce-RTX-2080$
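
To pull the release number out of the nvcc -V output in a script, a small parser along these lines should do. It is a sketch that assumes the "release X.Y, VX.Y.Z" wording shown above; the function name is my own.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Return (release, build) parsed from `nvcc -V` text, or None if not found."""
    match = re.search(r"release\s+([0-9.]+),\s*V([0-9.]+)", nvcc_output)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Last line of the nvcc -V output in this post.
sample = "Cuda compilation tools, release 10.0, V10.0.130"
print(parse_nvcc_release(sample))  # ('10.0', '10.0.130')
```

The first element is the value to compare against the CUDA version that your tensorflow-gpu release expects.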
Upgrade pip3

netc@gpu-2:~/cudnn_samples_v7/mnistCUDNN$ sudo pip3 install --upgrade pip
Successfully installed pip-19.0.3
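Install tensorflow-gpu
  The command below does not pin a version; at the time it resolved to 1.13.1. To reproduce this setup, use tensorflow-gpu==1.13.1 as the package name.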

netc@gpu-2:~/cudnn_samples_v7/mnistCUDNN$ sudo pip3 install --index-url https://mirrors.aliyun.com/pypi/simple tensorflow-gpu
Successfully installed absl-py-0.7.1 astor-0.7.1 gast-0.2.2 grpcio-1.19.0 h5py-2.9.0 keras-applications-1.0.7 keras-preprocessing-1.0.9 markdown-3.0.1 mock-2.0.0 numpy-1.16.2 pbr-5.1.3 protobuf-3.7.0 tensorboard-1.13.1 tensorflow-estimator-1.13.0 tensorflow-gpu-1.13.1 termcolor-1.1.0 werkzeug-0.15.1
Verify

netc@gpu-2:~/cudnn_samples_v7/mnistCUDNN$ python3
Python 3.6.7 (default, Oct 22 2018, 11:32:17)
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
2019-03-25 23:32:23.967770: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-03-25 23:32:23.968691: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x2ce8960 executing computations on platform CUDA. Devices:
2019-03-25 23:32:23.968749: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): GeForce RTX 2080, Compute Capability 7.5
2019-03-25 23:32:23.992261: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2200065000 Hz
2019-03-25 23:32:23.994027: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x33acc10 executing computations on platform Host. Devices:
2019-03-25 23:32:23.994073: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): ,
2019-03-25 23:32:23.994507: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties:
name: GeForce RTX 2080 major: 7 minor: 5 memoryClockRate(GHz): 1.8
pciBusID: 0000:03:00.0
totalMemory: 7.76GiB freeMemory: 7.62GiB
2019-03-25 23:32:23.994558: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-03-25 23:32:23.995840: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-03-25 23:32:23.995878: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0
2019-03-25 23:32:23.995900: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N
2019-03-25 23:32:23.996310: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7413 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080, pci bus id: 0000:03:00.0, compute capability: 7.5)
>>> print(sess.run(hello))
b'Hello, TensorFlow!'
Error summary:

Error when running import tensorflow:

ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory
  Cause:
The TensorFlow version does not match the installed CUDA version; this tensorflow-gpu release requires CUDA 10.0.
Version mapping: https://tensorflow.google.cn/install/source
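
Before importing TensorFlow you can ask the dynamic loader directly whether it can find the libraries. The sketch below uses ctypes from the standard library; the helper name is made up, and the three library names are what I would expect for a CUDA 10.0 plus cuDNN 7 setup, so adjust them to your versions.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open library_name, else False."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


# Library names assumed for a CUDA 10.0 + cuDNN 7 setup; adjust to your versions.
for name in ("libcudart.so.10.0", "libcublas.so.10.0", "libcudnn.so.7"):
    print(name, "OK" if can_load(name) else "NOT FOUND")
```

A NOT FOUND for libcublas.so.10.0 corresponds to the ImportError above: either a different CUDA release is installed, or its lib64 directory is not on the loader path.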
  Check the CUDA version

cat /usr/local/cuda/version.txt
  Check the cuDNN version

cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
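
The grep above prints the three version macros; the following sketch joins them into one version string. It assumes the header uses the CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL defines, as cuDNN 7 does, and the function name is my own. Also note that if cuDNN was installed from the .deb packages, the header is probably under /usr/include rather than /usr/local/cuda/include.

```python
import re


def parse_cudnn_version(header_text):
    """Return 'MAJOR.MINOR.PATCHLEVEL' from cudnn.h contents, or None if absent."""
    parts = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#define\s+" + macro + r"\s+(\d+)"
        match = re.search(pattern, header_text, re.M)
        if match is None:
            return None
        parts.append(match.group(1))
    return ".".join(parts)


# Sample text shaped like the relevant lines of a cuDNN 7.4.2 header.
sample = """
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 4
#define CUDNN_PATCHLEVEL 2
"""
print(parse_cudnn_version(sample))  # 7.4.2
```

To use it on a real system, read the header file into a string and pass that in place of the sample.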


