Pasky’s Log

Modern CUDA + CuDNN Theano/Keras AMI on AWS

January 22nd, 2017

Wow, what a jargon-filled post title. Basically, we currently do a lot of our deep learning on the AWS EC2 cloud, but using the GPU there with all the goodies (up to a CuDNN that supports modern Theano’s batch normalization) is a surprisingly arduous process that you basically need to do manually, with a lot of trial and error, googling and hacking. This is awful and mind-boggling, and I hate that everyone has to go through it. So, to fix this bad situation, I just released a community AMI that:

…is based on Ubuntu 16.04 LTS (as opposed to 14.04)
…comes with CUDA + CuDNN drivers and toolkit already set up to work on g2.2xlarge instances
…has Theano and Keras preinstalled and preconfigured so that you can run the Keras ResNet model on a GPU right away (or anything else you desire)

To get started, just spin up a GPU (g2.2xlarge) instance from community AMI ami-f0bde196 (1604-cuda80-cudnn5110-theano-keras), ssh in as the ubuntu@ user and get going! No hassles. But of course, EC2 charges apply.
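
If you would rather launch the instance from a script than from the console, a minimal sketch with boto3 might look like this. The key pair name is a placeholder and the helper names are made up for illustration. It assumes boto3 is installed and your credentials point at a region where the AMI is visible (AMI ids are region-specific).

```python
def launch_params(key_name, ami_id="ami-f0bde196", instance_type="g2.2xlarge"):
    """Build the arguments for a single-instance EC2 launch request."""
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "KeyName": key_name,  # hypothetical: the name of your own EC2 key pair
        "MinCount": 1,
        "MaxCount": 1,
    }


def launch(key_name):
    """Start one GPU instance; note that this creates a billable resource."""
    import boto3  # imported lazily so launch_params() works without boto3

    ec2 = boto3.client("ec2")
    response = ec2.run_instances(**launch_params(key_name))
    return response["Instances"][0]["InstanceId"]
```

Calling `launch("my-key")` would return the new instance id. Remember to terminate the instance when you are done with it.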

Edit (errata): Actually, there’s a bug – sorry about that! Out of the box, the nvidia kernel driver is not loaded properly on boot. I might update the AMI later; for now, fix it manually:

Edit /etc/modprobe.d/blacklist.conf (using for example sudo nano) and append the line blacklist nouveau to the end of that file
Run sudo update-initramfs -u
Reboot. Now, everything should finally work.
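
The first step can also be scripted. Here is a small sketch that appends the blacklist line only if it is not already there, so it is safe to run twice. The function name is mine. You still need to run it as root, then run sudo update-initramfs -u and reboot yourself.

```python
from pathlib import Path

BLACKLIST_LINE = "blacklist nouveau"


def ensure_nouveau_blacklisted(conf_path="/etc/modprobe.d/blacklist.conf"):
    """Append the blacklist line if missing; return True if the file changed."""
    path = Path(conf_path)
    text = path.read_text() if path.exists() else ""
    # Compare whole stripped lines so a commented-out entry does not count.
    if any(line.strip() == BLACKLIST_LINE for line in text.splitlines()):
        return False
    if text and not text.endswith("\n"):
        text += "\n"  # keep the new entry on its own line
    path.write_text(text + BLACKLIST_LINE + "\n")
    return True
```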

This AMI was created like this:

The stock Ubuntu 16.04 LTS AMI
NVIDIA driver 367.57 (older drivers do not support CUDA 8.0, while this is the last driver version to support the K520 GRID GPU used in AWS)
To make the driver setup go through, the trick was to run apt-get install linux-image-extra-`uname -r` per
CUDA 8.0 and CuDNN 5.1 (5.1.10, as the AMI name says) set up from the official though unannounced NVIDIA Debian packages by replaying the nvidia-docker recipes
bashrc modified to include cuda in the path
Theano and Keras from the latest Git as of writing this blogpost (feel free to git pull and reinstall), plus some auxiliary Python-related packages
Theano configured to use GPU and Keras configured to use Theano (and the “th” image dim ordering rather than “tf” – this is currently non-default in Keras!)
Example Keras deep learning models, even an elephant.jpg! Just run python resnet50.py
Exercise: Install TensorFlow on the system as well, release your own AMI and post its id in the comments!
Tip: Use nvidia-docker based containers to package your deep learning software; combine it with docker-machine to easily provision GPU instances in AWS and execute your models as needed. Using this for development is a hassle, though.
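
For reference, the Theano and Keras settings mentioned in the list boil down to two small files in the home directory. The sketch below writes them with Python. The exact values (device = gpu, floatX = float32, the theano backend and the image_dim_ordering key that Keras used at the time of writing) are my assumption of a typical setup, not a copy of what is on the AMI, and the function name is made up.

```python
import json
from pathlib import Path

# Assumed Theano settings: run on the GPU with 32-bit floats.
THEANORC = "[global]\ndevice = gpu\nfloatX = float32\n"

# Assumed Keras settings for the Theano backend.
KERAS_JSON = {
    "backend": "theano",
    "image_dim_ordering": "th",  # channels-first; "tf" would be channels-last
    "floatx": "float32",
    "epsilon": 1e-07,
}


def write_configs(home):
    """Write .theanorc and .keras/keras.json under the given home directory."""
    home = Path(home)
    (home / ".theanorc").write_text(THEANORC)
    keras_dir = home / ".keras"
    keras_dir.mkdir(parents=True, exist_ok=True)
    keras_path = keras_dir / "keras.json"
    keras_path.write_text(json.dumps(KERAS_JSON, indent=4) + "\n")
    return keras_path
```

Note that write_configs(Path.home()) would overwrite any existing configuration, so try it on a scratch directory first.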

Enjoy!

Categories: ailao, linux, software Tags: aws, cuda, ec2, gpu, keras, theano

Comments (2)

Eugene

February 13th, 2018 at 00:22 | #1

Hi
Thanks a lot for your AMI! Just wanted to let you know there’s one more gotcha: you have to be running the 4.4.0-53-generic kernel. When I launched the AMI I was on 4.4.0-112-generic, so the bugfix (update-initramfs -u) didn’t work. I didn’t find out how to force a specific kernel on boot, so I just uninstalled 4.4.0-112 and rebooted:

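# list the installed kernel images
dpkg -l | grep linux-image
# remove the 4.4.0-112 image found above
sudo apt-get remove <the 4.4.0-112 linux-image package listed above>
sudo reboot
# make sure you’re on 4.4.0-53
uname -a
# edit /etc/modprobe.d/blacklist.conf and run sudo update-initramfs -u

AIOU Tutor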

June 29th, 2020 at 13:24 | #2

I am also facing this issue: when I launched the AMI I was on 4.4.0-112-generic, so the bugfix (update-initramfs -u) didn’t work on the 4.4.0-112 kernel. If anyone here finds a solution, please post it here.
