Archive

Posts Tagged ‘ec2’

Modern CUDA + CuDNN Theano/Keras AMI on AWS

January 22nd, 2017 2 comments

Wow, what a jargon-filled post title. Basically, we do a lot of our deep learning currently on the AWS EC2 cloud – but to use the GPU there with all the goodies (up to CuDNN that supports modern Theano’s batch normalization) is a surprisingly arduous process which you basically need to do manually, with a lot of trial and error and googling and hacking. This is awful, mind-boggling and I hate that everyone has to go through this. So, to fix this bad situation, I just released a community AMI that:

  • …is based on Ubuntu 16.04 LTS (as opposed to 14.04)
  • …comes with CUDA + CuDNN drivers and toolkit already set up to work on g2.2xlarge instances
  • …has Theano and Keras preinstalled and preconfigured so that you can run the Keras ResNet model on a GPU right away (or anything else you desire)

To get started, just spin up a GPU (g2.2xlarge) instance from community AMI ami-f0bde196 (1604-cuda80-cudnn5110-theano-keras), ssh in as the ubuntu@ user and get going! No hassles. But of course, EC2 charges apply.


Edit (errata): Actually, there’s a bug – sorry about that! Out of the box, the nvidia kernel driver is not loaded properly on boot. I might update the AMI later, for now to fix it manually:

  1. Edit /etc/modprobe.d/blacklist.conf (using for example sudo nano) and append the line blacklist nouveau to the end of that file
  2. Run sudo update-initramfs -u
  3. Reboot. Now, everything should finally work.

This AMI was created like this:

  • The stock Ubuntu 16.04 LTS AMI
  • NVIDIA driver 367.57 (older drivers do not support CUDA 8.0, while this is the last driver version to support the K520 GRID GPU used in AWS)
  • To make the driver setup go through, the trick to install apt-get install linux-image-extra-`uname -r` per
  • CUDA 8.0 and CuDNN 8.0 set up from the official though unannounced NVIDIA Debian packages by replaying the nvidia-docker recipes
  • bashrc modified to include cuda in the path
  • Theano and Keras from latest Git as of writing this blogpost (feel free to git pull and reinstall), and some auxiliary python-related etc. packages
  • Theano configured to use GPU and Keras configured to use Theano (and the “th” image dim ordering rather than “tf” – this is currently non-default in Keras!)
  • Example Keras deep learning models, even an elephant.jpg! Just run python resnet50.py
  • Exercise: Install TensorFlow on the system as well, release your own AMI and post its id in the comments!
  • Tip: Use nvidia-docker based containers to package your deep learning software; combine it with docker-machine to easily provision GPU instances in AWS and execute your models as needed. Using this for development is a hassle, though.

Enjoy!

Categories: ailao, linux, software Tags: , , , , ,