Using GPUs Smartly with TensorFlow


When you work in a lab and have to share GPUs with other people, you cannot afford to run jobs recklessly. You need to be deliberate about which GPUs each job takes and how much memory it uses. The TensorFlow docs give a fair amount of information on how to manage GPU usage, and it is not that difficult.

Change your code

Do not let things run wild with a bare session like this:

    with tf.Session(graph=graph) as session:

Instead, use something like:

    with tf.Session(graph=graph,
                    config=tf.ConfigProto(allow_soft_placement=True)) as session, \
            tf.device('/gpu:0'):

This is a bit more complicated than what you would use if you didn’t have any constraints. These few extra lines serve a few important purposes.

config=tf.ConfigProto(allow_soft_placement=True)

This part is important to avoid errors such as the following:

    tensorflow.python.framework.errors.InvalidArgumentError: Cannot assign a device to node 'Variable': Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available

Such errors can appear unexpectedly. One case where this happens is when an operation is pinned to the GPU but has no GPU kernel, or has to be co-located with a Variable that lives on the CPU. With allow_soft_placement=True, TensorFlow falls back to a device that does support the operation (typically the CPU) instead of raising the error.

Masking GPUs with CUDA

If you’re not careful, TensorFlow can behave like an obnoxious little rascal, grabbing nearly all the memory on every GPU it can see and depriving other processes completely. To prevent such ordeals, CUDA provides an elegant way to hide GPUs from a process.

You can set the CUDA_VISIBLE_DEVICES environment variable so that a process sees only certain GPUs, and TensorFlow does not accidentally poke its head everywhere.
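The variable can also be set from inside a Python script, provided this happens before TensorFlow first touches the GPU. Doing it before the import is the safe choice. The following is only a minimal sketch: `mask_gpus` is a hypothetical helper name, not part of TensorFlow or CUDA, and it assumes GPUs are identified by integer index.

```python
import os


def mask_gpus(gpu_ids):
    """Expose only the given physical GPU indices to CUDA programs.

    Hypothetical helper for illustration. Call it before TensorFlow
    initialises CUDA (safest: before importing TensorFlow at all).
    """
    value = ",".join(str(gpu_id) for gpu_id in gpu_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


mask_gpus([2])  # only physical GPU 2 stays visible to this process
# import tensorflow as tf   # import TensorFlow only after masking
```

Setting the variable in the script keeps the choice next to the code, while setting it on the command line lets you change GPUs without editing anything.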

Putting Everything Together

Now you know the concepts. To put them to work, change the session setup in your code as shown above, then run the following command, filling in the GPU IDs and your script’s name.
CUDA_VISIBLE_DEVICES= python .py

Remember: if you set CUDA_VISIBLE_DEVICES=2, then tf.device('/gpu:0') corresponds to GPU 2 (not GPU 0 or GPU 1). In other words, TensorFlow indexes only the GPUs it can see, starting from 0.
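The renumbering can be sketched in a few lines of plain Python. This is an illustration only: `physical_gpu` is a hypothetical helper, and it assumes CUDA_VISIBLE_DEVICES holds a comma-separated list of integer indices (not GPU UUIDs).

```python
def physical_gpu(visible_devices, logical_index):
    """Return the physical GPU id behind a TensorFlow device index.

    visible_devices: value of CUDA_VISIBLE_DEVICES, e.g. "2" or "1,3".
    logical_index:   the N in tf.device('/gpu:N').
    """
    ids = [int(part) for part in visible_devices.split(",") if part.strip()]
    return ids[logical_index]


print(physical_gpu("2", 0))    # 2
print(physical_gpu("1,3", 1))  # 3
```

With CUDA_VISIBLE_DEVICES=1,3, for instance, '/gpu:0' is physical GPU 1 and '/gpu:1' is physical GPU 3.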

Extra: Specifying how much Memory to use

This doc and the Stack Overflow post explain how to limit TensorFlow to a certain fraction of the GPU memory instead of letting it take everything.

The following example caps the process at 50% of the memory of each GPU it can see (here, the one addressed as '/gpu:0').

    gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.5)
    with tf.Session(graph=graph,
                    config=tf.ConfigProto(allow_soft_placement=True,
                                          gpu_options=gpu_options)) as session, \
            tf.device('/gpu:0'):
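If you think in terms of a memory budget rather than a fraction, the conversion is simple arithmetic. A rough sketch, where `memory_fraction` is a hypothetical helper and the 16384 MiB card size is an assumed value for illustration:

```python
def memory_fraction(budget_mib, total_mib):
    """Fraction of a GPU's memory that corresponds to budget_mib."""
    if total_mib <= 0:
        raise ValueError("total_mib must be positive")
    if not 0 < budget_mib <= total_mib:
        raise ValueError("budget_mib must be in (0, total_mib]")
    return budget_mib / total_mib


# Assumed numbers: a 4096 MiB budget on a card with 16384 MiB.
fraction = memory_fraction(budget_mib=4096, total_mib=16384)
print(fraction)  # 0.25
# gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=fraction)
```

Because the fraction applies to every visible GPU, this works best when the visible GPUs have the same amount of memory.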