
cuda-convnet
Note, July 18, 2014: I've released an update to cuda-convnet, called cuda-convnet2. The two main new features are faster training on Kepler-generation GPUs and support for multi-GPU training.
This is a fast C++/CUDA implementation of convolutional (or, more generally, feed-forward) neural networks. It can model arbitrary layer connectivity and network depth: any directed acyclic graph of layers will do. Training is done using the back-propagation algorithm.
A Fermi-generation GPU (GTX 4xx, GTX 5xx, or Tesla equivalent) is required.
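An architecture is declared in a plain-text layer-definition file; each section names a layer, its type, and its inputs, which is what makes arbitrary DAG connectivity possible. Below is a minimal sketch modeled on the CIFAR-10 examples, in the format described on LayerParams (layer names and parameter values are illustrative, not a tuned configuration):

    [data]
    type=data
    dataIdx=0

    [labels]
    type=data
    dataIdx=1

    [conv1]
    type=conv
    inputs=data
    channels=3
    filters=32
    padding=2
    stride=1
    filterSize=5
    neuron=relu
    initW=0.0001
    partialSum=4
    sharedBiases=1

    [pool1]
    type=pool
    pool=max
    inputs=conv1
    start=0
    sizeX=3
    stride=2
    outputsX=0
    channels=32

    [fc10]
    type=fc
    inputs=pool1
    outputs=10
    initW=0.01
    neuron=ident

    [probs]
    type=softmax
    inputs=fc10

    [logprob]
    type=cost.logreg
    inputs=labels,probs

See LayerParams for the full set of layer types and which parameters each one requires.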
Documentation
- Compiling -- how to check out and compile this code.
- Data -- what kind of data this net can train on.
- LayerParams -- how to specify an architecture for the net.
- NeuronTypes -- types of hidden unit nonlinearities.
- TrainingNet -- how to train the net.
- Options -- the command-line arguments that the net takes.
- ViewingNet -- how to look inside the checkpoints saved by the net.
- CheckingGradients -- how to numerically test the gradients for correctness.
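The gradient checker works by comparing each analytically computed gradient against a centered finite difference. A minimal sketch of the idea in Python (this illustrates the technique only; it is not this package's code):

    import numpy as np

    # The idea behind gradient checking: dC/dw_i is approximated by the
    # centered difference (C(w + eps*e_i) - C(w - eps*e_i)) / (2*eps)
    # and compared against the analytically computed gradient.
    def numeric_gradient(cost, w, eps=1e-4):
        grad = np.zeros_like(w)
        for i in range(w.size):
            orig = w.flat[i]
            w.flat[i] = orig + eps
            c_plus = cost(w)
            w.flat[i] = orig - eps
            c_minus = cost(w)
            w.flat[i] = orig
            grad.flat[i] = (c_plus - c_minus) / (2 * eps)
        return grad

    # Toy check: for C(w) = 0.5*||w||^2 the analytic gradient is w itself.
    w = np.random.randn(5)
    assert np.allclose(numeric_gradient(lambda v: 0.5 * np.sum(v * v), w),
                       w, atol=1e-5)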
Fast results
- 11% error on CIFAR-10 in 75 minutes, with image translations and horizontal reflections (def, params).
- 13% error on CIFAR-10 in 25 minutes, with image translations and horizontal reflections (def, params).
- 18% error on CIFAR-10 in 20 minutes, without any image translations/transformations/preprocessing (def, params).
- 26% error on CIFAR-10 in 80 seconds, without any image translations/transformations/preprocessing (def, params).
- See Methodology for details of training.
Filters learned by this net:
[image of learned filters]
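Each result above links to a layer-definition file (def) and a layer-parameter file (params). The params file sets per-layer training hyperparameters; a minimal sketch in the format described on LayerParams, where epsW/epsB are the learning rates for weights and biases, momW/momB the momentum terms, wc the L2 weight-decay coefficient, and coeff the cost coefficient (section names must match the def file; the values here are illustrative, not the ones used for the results above):

    [conv1]
    epsW=0.001
    epsB=0.002
    momW=0.9
    momB=0.9
    wc=0.004

    [fc10]
    epsW=0.001
    epsB=0.002
    momW=0.9
    momB=0.9
    wc=0.01

    [logprob]
    coeff=1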
Recent changes
- Jul 17, 2012
- Fixed a bug in the contrast-normalization backpropagation code that caused wrong gradients to be computed near image borders (thanks to Hannes Schulz).
- Mar 13, 2012
- Added a layer that performs response normalization across maps.
- Started modifying the code to support rectangular (i.e. non-square) images. The convolution code now supports rectangular images, but the rest of the code does not yet, so the package as a whole still requires square images.
- Feb 8, 2012
- Most layer types should now work well with minibatch sizes of 64 or 32.
- Fixed the --conserve-mem option so it can be combined with -f (i.e. its value can be changed after a net has been saved).
- See ChangeLog for older changes.
Features
Supported layer types:
- Layers with weights:
- Fully-connected
- Convolutional, including sparsely-connected convolutional
- Locally-connected, unshared
- Others:
- Local pooling (avg, max), including overlapping pooling regions
- Local response normalization
- Local contrast normalization
- Softmax
- Elementwise sum, elementwise max
- Gaussian blur + "bed of nails" subsampling
- Resize with bilinear filtering
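Since connectivity is an arbitrary DAG, layers can merge branches. A sketch of two fully-connected layers combined by an elementwise sum, in the same layer-definition format (layer names and values illustrative; see LayerParams for the exact parameter requirements):

    [branch1]
    type=fc
    inputs=data
    outputs=64
    initW=0.01

    [branch2]
    type=fc
    inputs=data
    outputs=64
    initW=0.01

    [combine]
    type=eltsum
    inputs=branch1,branch2
    coeffs=1,1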
Supported neuron activation functions:
- Logistic
- Hyperbolic tangent
- Rectified linear
- Others
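For reference, the first three have their standard definitions (NeuronTypes gives the full list and the exact names used in layer-definition files; the scaled tanh form is the parameterization this package documents):

    logistic:           f(x) = 1 / (1 + e^(-x))
    hyperbolic tangent: f(x) = a * tanh(b * x)
    rectified linear:   f(x) = max(0, x)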
Supported objectives:
- Logistic regression
- Sum-of-squares
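In their standard forms (consult LayerParams for the exact conventions these cost layers use):

    logistic regression: C = -log(p_c), the negative log-probability the net assigns to the correct class c
    sum-of-squares:      C = sum_i (y_i - t_i)^2, the squared error between output y and target t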
Other features:
- Efficient implementation of convolution in CUDA.
- Supports arbitrary stride sizes with no loss of efficiency (beyond that which comes from reducing the problem size).
- Implicitly pads your images with an arbitrary-sized border of zeros without using any extra memory.
- Supports block-random sparse connectivity at no performance cost (see LayerParams).
- Modular design makes it easy to add new layer types, neuron activation functions, or objectives should you need them.
- Mostly avoids use of temporary memory where it isn't strictly needed.
- Optimizes multiple objectives simultaneously.
- Saves checkpoints to disk as pickled Python objects, so you can write Python scripts to probe the mind of your neural net.
- Capable of training on one batch of data while simultaneously loading the next from disk (or pre-processing it in some way, if necessary).
- Numerically tests gradient computations for correctness.
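Because checkpoints are ordinary pickled dictionaries, probing one takes only a few lines of Python. A minimal sketch (the exact dictionary layout is documented on ViewingNet; this just reports what a checkpoint contains):

    import pickle
    import sys

    def probe_checkpoint(path):
        # Checkpoints are pickled Python dictionaries. They were written by
        # Python 2's cPickle; the encoding argument lets Python 3 read the
        # byte strings they contain.
        with open(path, 'rb') as f:
            ckpt = pickle.load(f, encoding='latin1')
        for key, value in ckpt.items():
            print(key, type(value))
        return ckpt

    if __name__ == '__main__':
        probe_checkpoint(sys.argv[1])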
Contact
- My university web page
- My email
Acknowledgements
- I am grateful to Ilya Sutskever for suggestions that led to many of the features in this package.
Project Information
The project was created on Jun 29, 2011.
- License: New BSD License
- 339 stars
- svn-based source control
Labels:
Machinelearning
Academic
Cuda
CPlusPlus
Python
neuralnet
convnet
convolution
GPU