sandbox.cuda.dnn – cuDNN
cuDNN is an NVIDIA library that provides
functionality used by deep neural networks, including optimized versions
of some operations such as convolution. cuDNN is not currently
bundled with CUDA 6.5; you must download and install it
yourself.
To install it, decompress the downloaded file and make the *.h and
*.so* files available to the compilation environment.
There are at least three possible ways of doing so:
- The easiest is to include them in your CUDA installation. Copy the
*.h files to CUDA_ROOT/include and the *.so* files to
CUDA_ROOT/lib64 (by default, CUDA_ROOT is /usr/local/cuda
on Linux).
- Alternatively, on Linux, you can set the environment variables
LD_LIBRARY_PATH, LIBRARY_PATH and CPATH to the directory
extracted from the download. If needed, separate multiple directories
with : as in the PATH environment variable.
- As a third option, also on Linux, you can copy the *.h files
to /usr/include and the *.so* files to /lib64.
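The environment-variable option can also be scripted. The following is a minimal sketch, not part of Theano: the helper names prepend_dir and cudnn_env and the directory /opt/cudnn are hypothetical. It builds a copy of an environment in which the extraction directory is prepended to each of the three variables, using the separator only when the variable already holds a value.

```python
import os


def prepend_dir(env, name, directory):
    """Return the value of env[name] with `directory` prepended.

    Directories are joined with os.pathsep (':' on Linux). The value is
    returned unchanged if `directory` is already listed.
    """
    current = env.get(name, "")
    if not current:
        return directory
    parts = current.split(os.pathsep)
    if directory in parts:
        return current
    return os.pathsep.join([directory] + parts)


def cudnn_env(env, cudnn_dir):
    """Return a copy of `env` with the three variables pointing at cudnn_dir."""
    new_env = dict(env)
    for name in ("LD_LIBRARY_PATH", "LIBRARY_PATH", "CPATH"):
        new_env[name] = prepend_dir(new_env, name, cudnn_dir)
    return new_env


# Hypothetical extraction directory, for illustration only.
example = cudnn_env({"CPATH": "/usr/local/include"}, "/opt/cudnn")
```

The dynamic loader reads LD_LIBRARY_PATH when a process starts, so a dictionary built this way is meant to be passed to a child process (for example through the env argument of subprocess.call) rather than applied to the running interpreter.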
By default, Theano detects whether it can use cuDNN and, if so, uses
it. If not, the Theano optimizations do not introduce cuDNN ops, so
Theano still works as long as you did not insert cuDNN ops manually.
To get an error when Theano cannot use cuDNN, use this Theano flag:
optimizer_including=cudnn.
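Theano flags can be passed through the THEANO_FLAGS environment variable as a comma-separated list of key=value pairs. As a rough sketch, the flag can be set from Python before Theano is first imported. The helper with_theano_flag is hypothetical, and it replaces any earlier value of the same key, which may not be what you want if optimizer_including already lists other optimizations.

```python
import os


def with_theano_flag(flags, key, value):
    """Return a THEANO_FLAGS string with key=value set.

    Any earlier setting of the same key is dropped; other flags are kept
    in their original order.
    """
    prefix = key + "="
    items = [f for f in flags.split(",") if f and not f.startswith(prefix)]
    items.append(prefix + value)
    return ",".join(items)


# This must run before the first `import theano` in the process,
# because Theano reads its flags at import time.
os.environ["THEANO_FLAGS"] = with_theano_flag(
    os.environ.get("THEANO_FLAGS", ""), "optimizer_including", "cudnn")
```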
Note
cuDNN v2 is now released. If you used any v2 release candidate, we
strongly suggest that you update to the final version. From now
on, we only support the final release.
cuDNN v2 is much faster than v1. We recommend that everybody
update to v2.
Note
Normally you should not call GPU Ops directly, but the CPU interface
currently does not expose all the options supported by the cuDNN ops, so
you may need to call them manually.
Note
The cuDNN R1 and R2 documentation states that reproducibility is not
guaranteed for the following two operations:
cudnnConvolutionBackwardFilter and cudnnConvolutionBackwardData.
These correspond to the gradient with respect to the weights and the
gradient with respect to the input of the convolution. They are also
sometimes used in the forward pass, when they give a speedup.
Note
There is a problem we do not understand yet when cuDNN paths
involve symbolic links, so avoid using symbolic links in those paths.
Functions
-
theano.sandbox.cuda.dnn.dnn_conv(img, kerns, border_mode='valid', subsample=(1, 1), conv_mode='conv', direction_hint=None, workmem=None)
GPU convolution using cuDNN from NVIDIA.
The memory layout to use is ‘bc01’, that is ‘batch’, ‘channel’,
‘first dim’, ‘second dim’ in that order.
Parameters: |
- img – images to do the convolution over
- kerns – convolution filters
- border_mode – one of ‘valid’, ‘full’; additionally, the padding size
could be directly specified by an integer or a pair of integers
- subsample – perform subsampling of the output (default: (1, 1))
- conv_mode – perform convolution (kernels flipped) or cross-correlation.
One of ‘conv’, ‘cross’. (default: ‘conv’)
- direction_hint – Used by graph optimizers to change algorithm choice.
By default, GpuDnnConv will be used to carry out the convolution.
If border_mode is ‘valid’, subsample is (1,1) and direction_hint is
‘bprop weights’, it will use GpuDnnConvGradW.
If border_mode is ‘full’, subsample is (1,1) and direction_hint is
not ‘forward!’, it will use GpuDnnConvGradI.
This parameter is used internally by graph optimizers and may be
removed at any time without a deprecation period. You have been warned.
- workmem – Specify the amount of working memory allowed.
More memory is usually faster. One of ‘none’, ‘small’ or
‘large’. (default is None which takes its value from
config.dnn.conv.workmem)
|
Warning: | The cuDNN library only works with GPUs that have a compute
capability of 3.0 or higher. This means that older GPUs will not
work with this Op.
|
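To illustrate the difference between the two conv_mode values, here is a plain-Python sketch on a single 2D channel with 'valid' borders. It is only a reference computation on nested lists, not how cuDNN or Theano implement the operation, and the function names are hypothetical.

```python
def correlate2d_valid(img, kern):
    """'cross' mode: slide the kernel over the image without flipping it."""
    ih, iw = len(img), len(img[0])
    kh, kw = len(kern), len(kern[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            row.append(sum(img[r + i][c + j] * kern[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out


def convolve2d_valid(img, kern):
    """'conv' mode: flip the kernel along both axes, then cross-correlate."""
    flipped = [row[::-1] for row in kern[::-1]]
    return correlate2d_valid(img, flipped)


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
kern = [[1, 0],
        [0, 2]]

cross_result = correlate2d_valid(img, kern)  # [[11, 14], [20, 23]]
conv_result = convolve2d_valid(img, kern)    # [[7, 10], [16, 19]]
```

The two modes give the same result only when the kernel is unchanged by flipping along both axes.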
-
theano.sandbox.cuda.dnn.dnn_pool(img, ws, stride=(1, 1), mode='max', pad=(0, 0))
GPU pooling using cuDNN from NVIDIA.
The memory layout to use is ‘bc01’, that is ‘batch’, ‘channel’,
‘first dim’, ‘second dim’ in that order.
Parameters: |
- img – images to do the pooling over
- ws – subsampling window size
- stride – subsampling stride (default: (1, 1))
- mode – one of ‘max’, ‘average’ (default: ‘max’)
- pad – (padX, padY) padding information.
padX is the size of the left and right borders,
padY is the size of the top and bottom borders.
|
Warning: | The cuDNN library only works with GPUs that have a compute
capability of 3.0 or higher. This means that older GPUs will not
work with this Op.
|
Note: | This Op implements the ignore_border=True behavior of max_pool_2d.
|
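As an illustration of what the pooling computes, the following plain-Python sketch pools one 2D channel with the ignore_border=True behavior: windows that would run past the edge are dropped. It assumes no padding and treats ws and stride as (rows, columns) pairs; both are simplifications for this example, and pool2d is a hypothetical name, not a Theano function.

```python
def pool2d(img, ws, stride=(1, 1), mode="max"):
    """Pool one 2D channel (nested lists), dropping incomplete windows."""
    ih, iw = len(img), len(img[0])
    wh, ww = ws
    sh, sw = stride
    out = []
    for r in range(0, ih - wh + 1, sh):
        row = []
        for c in range(0, iw - ww + 1, sw):
            window = [img[r + i][c + j]
                      for i in range(wh) for j in range(ww)]
            if mode == "max":
                row.append(max(window))
            elif mode == "average":
                row.append(sum(window) / float(len(window)))
            else:
                raise ValueError("mode must be 'max' or 'average'")
        out.append(row)
    return out


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]

# With a 2x2 window and stride 2, only the top-left window fits;
# the last row and column are ignored.
pooled_max = pool2d(img, (2, 2), stride=(2, 2), mode="max")
pooled_avg = pool2d(img, (2, 2), stride=(2, 2), mode="average")
```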
Convolution Ops
-
class theano.sandbox.cuda.dnn.GpuDnnConvDesc(border_mode, subsample=(1, 1), conv_mode='conv')
This Op builds a convolution descriptor for use in the other
convolution operations.
See the documentation of dnn_conv() for a description of the parameters.
-
class theano.sandbox.cuda.dnn.GpuDnnConv(workmem=None, inplace=False)
The forward convolution.
Parameters: |
- image –
- kernel –
- descr – the convolution descriptor
|
-
static get_out_shape(ishape, kshape, border_mode, subsample)
This function computes the output shape for a convolution with
the specified parameters. ishape and kshape can be symbolic
or scalar.
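For concrete integer shapes, the computation can be sketched with the usual convolution arithmetic. The function below is a hypothetical stand-in, not the library's code: it assumes 'bc01' shapes, that 'full' pads each side by the kernel size minus one, and that an integer or pair border_mode gives the padding directly. Unlike the real method, it does not handle symbolic shapes.

```python
def conv_out_shape(ishape, kshape, border_mode, subsample):
    """Output shape (batch, filters, rows, cols) of a 2D convolution.

    ishape is (batch, channels, rows, cols); kshape is
    (filters, channels, kernel rows, kernel cols).
    """
    b, _, h, w = ishape
    nb, _, kh, kw = kshape
    sh, sw = subsample
    if border_mode == "valid":
        padh, padw = 0, 0
    elif border_mode == "full":
        padh, padw = kh - 1, kw - 1
    elif isinstance(border_mode, int):
        padh = padw = border_mode
    else:
        padh, padw = border_mode
    return (b, nb,
            (h + 2 * padh - kh) // sh + 1,
            (w + 2 * padw - kw) // sw + 1)


# A batch of 8 three-channel 32x32 images and 16 filters of size 5x5.
valid_shape = conv_out_shape((8, 3, 32, 32), (16, 3, 5, 5), "valid", (1, 1))
full_shape = conv_out_shape((8, 3, 32, 32), (16, 3, 5, 5), "full", (1, 1))
```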
-
class theano.sandbox.cuda.dnn.GpuDnnConvGradW(inplace=False)
The convolution gradient with respect to the weights.
Parameters: |
- image –
- kernel –
- descr – the convolution descriptor
|
-
class theano.sandbox.cuda.dnn.GpuDnnConvGradI(inplace=False)
The convolution gradient with respect to the inputs.
Parameters: |
- image –
- kernel –
- descr – the convolution descriptor
|
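The direction_hint rules given for dnn_conv() decide which of the three Ops in this section carries out a convolution. The following plain-Python restatement of those rules is only a sketch: select_conv_op is a hypothetical helper that returns the Op name as a string and builds no graph, and, like direction_hint itself, the rules may change without notice.

```python
def select_conv_op(border_mode, subsample=(1, 1), direction_hint=None):
    """Return the name of the Op that the documented rules would pick."""
    unit_stride = tuple(subsample) == (1, 1)
    if (border_mode == "valid" and unit_stride
            and direction_hint == "bprop weights"):
        return "GpuDnnConvGradW"
    if (border_mode == "full" and unit_stride
            and direction_hint != "forward!"):
        return "GpuDnnConvGradI"
    # Default: the forward convolution Op.
    return "GpuDnnConv"


choice = select_conv_op("full", (1, 1), None)
```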
Pooling Ops
-
class theano.sandbox.cuda.dnn.GpuDnnPoolDesc(ws=(1, 1), stride=(1, 1), mode='max', pad=(0, 0))
This Op builds a pooling descriptor for use in the other
pooling operations.
Parameters: |
- ws – window size
- stride – (dx, dy)
- mode – ‘max’ or ‘average’
- pad – (padX, padY) padding information.
padX is the size of the left and right borders,
padY is the size of the top and bottom borders.
|
-
class theano.sandbox.cuda.dnn.GpuDnnPool
Pooling.
Parameters: |
- img – the image 4d tensor.
- desc – the pooling descriptor.
|
-
class theano.sandbox.cuda.dnn.GpuDnnPoolGrad
The pooling gradient.
Parameters: |
- inp – the input of the pooling.
- out – the output of the pooling in the forward pass.
- inp_grad – same size as out, but is the corresponding gradient information.
- desc – The pooling descriptor.
|
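To show how the three tensor inputs relate, here is a plain-Python sketch of a max pooling gradient on one 2D channel. It assumes no padding, (rows, columns) ordering for ws and stride, and that on ties the gradient goes to the first maximum in the window; cuDNN's tie-breaking may differ. max_pool_grad is a hypothetical name, not the Op's implementation.

```python
def max_pool_grad(inp, out, inp_grad, ws, stride):
    """Route each entry of inp_grad back to the input position that
    produced the corresponding maximum in `out`."""
    wh, ww = ws
    sh, sw = stride
    grad = [[0.0] * len(inp[0]) for _ in range(len(inp))]
    for orow in range(len(out)):
        for ocol in range(len(out[0])):
            r0, c0 = orow * sh, ocol * sw
            # Find the first position in the window equal to the pooled max.
            position = None
            for i in range(wh):
                for j in range(ww):
                    if position is None and inp[r0 + i][c0 + j] == out[orow][ocol]:
                        position = (r0 + i, c0 + j)
            if position is None:
                raise ValueError("out does not match inp for this window")
            # Overlapping windows accumulate into the same input cell.
            grad[position[0]][position[1]] += inp_grad[orow][ocol]
    return grad


inp = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
out = [[5, 6],
       [8, 9]]            # 2x2 max pooling of inp with stride 1
inp_grad = [[1.0, 2.0],
            [3.0, 4.0]]   # same size as out
grad = max_pool_grad(inp, out, inp_grad, (2, 2), (1, 1))
```

The result has the same shape as inp and is zero everywhere except at the positions that won a pooling window.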
Softmax Ops
-
class theano.sandbox.cuda.dnn.GpuDnnSoftmax(tensor_format, algo, mode)
Op for the cuDNN Softmax.
Parameters: |
- tensor_format – Whether the data format is ‘bc01’ or ‘b01c’.
- algo – ‘fast’ or ‘accurate’ indicating whether computations should be
optimized for speed or accuracy respectively.
- mode – ‘instance’ or ‘channel’ indicating whether the softmax should
be computed per image across ‘c01’ or per spatial location ‘01’ per
image across ‘c’.
|
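The two mode values can be illustrated on a single image stored as a nested list indexed [c][0][1]. This is a plain-Python sketch with hypothetical function names. It assumes that 'accurate' subtracts the maximum before exponentiating to avoid overflow and that 'fast' exponentiates the inputs directly, which is how the cuDNN documentation describes the two algorithms.

```python
import math


def softmax(values, algo="accurate"):
    """Softmax of a flat list; 'accurate' shifts by the max first."""
    shift = max(values) if algo == "accurate" else 0.0
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_c01(image, algo="accurate", mode="channel"):
    """Softmax over one image given as nested lists indexed [c][0][1]."""
    nc, nr, nk = len(image), len(image[0]), len(image[0][0])
    out = [[[0.0] * nk for _ in range(nr)] for _ in range(nc)]
    if mode == "instance":
        # One softmax across every element of the image ('c01').
        flat = softmax([image[c][r][k] for c in range(nc)
                        for r in range(nr) for k in range(nk)], algo)
        values = iter(flat)
        for c in range(nc):
            for r in range(nr):
                for k in range(nk):
                    out[c][r][k] = next(values)
    elif mode == "channel":
        # One softmax per spatial location, across the channels ('c').
        for r in range(nr):
            for k in range(nk):
                column = softmax([image[c][r][k] for c in range(nc)], algo)
                for c in range(nc):
                    out[c][r][k] = column[c]
    else:
        raise ValueError("mode must be 'instance' or 'channel'")
    return out


# Two channels, one row, two columns.
image = [[[0.0, 1.0]],
         [[0.0, 3.0]]]
per_location = softmax_c01(image, mode="channel")
per_image = softmax_c01(image, mode="instance")
```

In 'channel' mode the outputs at each spatial location sum to one; in 'instance' mode all outputs of the image sum to one.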
-
class theano.sandbox.cuda.dnn.GpuDnnSoftmaxGrad(tensor_format, algo, mode)
Op for the cuDNN SoftmaxGrad.
Parameters: |
- tensor_format – Whether the data format is ‘bc01’ or ‘b01c’.
- algo – ‘fast’ or ‘accurate’ indicating whether computations should be
optimized for speed or accuracy respectively.
- mode – ‘instance’ or ‘channel’ indicating whether the softmax should
be computed per image across ‘c01’ or per spatial location ‘01’ per
image across ‘c’.
|