What steps will reproduce the problem?
1. Use a big image, like 512 x 512.
2. Use lots of filters (like 64).
3. Have lots of color channels (again, 64?).
What is the expected output? What do you see instead?
I expect a big filtered image, but instead it crashes.
The blocks are defined such that blocks.y > 2^16, which exceeds the maximum grid size in y, so CUDA refuses to launch the kernel.
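A rough sketch of the arithmetic behind that failure, in Python. The launch geometry here is an assumption for illustration only (one block row per module and per group of filters, with a hypothetical `filters_per_block` of 16); the actual formula used by the kernel may differ.

```python
# Maximum grid size in y for compute capability 2.x and 3.x (2^16 - 1).
MAX_GRID_Y = 65535

def grid_y(num_modules, num_filters, filters_per_block):
    """Hypothetical launch geometry: one block row per (module, filter group)."""
    # Ceiling division so a partial filter group still gets a block.
    filter_groups = -(-num_filters // filters_per_block)
    return num_modules * filter_groups

# Assumed example: a stride-1 convolution producing one module per pixel
# of a 512 x 512 image, 64 filters, 16 filters handled per block.
blocks_y = grid_y(num_modules=512 * 512, num_filters=64, filters_per_block=16)
print(blocks_y, blocks_y > MAX_GRID_Y)
```

Under these assumptions the y dimension comes out well above the limit, which is consistent with the launch being refused.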
I'm not sure I understand how to set the number of modules when doing a normal convolution, but it seems that an outer loop is required. The trouble with an outer loop is that the data is arranged in such a way that it is impossible to apply just a fraction of the filters, or to process just some of each image. The data arrangement makes it natural to process just some of the image channels... but the color channels don't come into the blocking structure.
Basically... can I use this kernel to perform big convolutions?
Comment #1
Posted on Mar 10, 2012 by Massive Camel
Hi James,
You're right, the maximum grid size imposes a limit on the size of the convolution that can be performed in filterActs. I'm not really sure what to do about it yet, though. One hacky solution, if you really need to perform such a big convolution, is to split your filters into several sets, each set in its own matrix, and then call filterActs for each set. The target matrix can be the same for all calls (just a different offset into the same array). The gradient computation routines would have to be called once per set as well.
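A minimal sketch of the bookkeeping for this workaround, in Python. It assumes the target matrix has `num_filters * num_modules` rows with the rows of each filter stored contiguously; `split_filters` and `max_filters_per_call` are hypothetical names, not part of the library.

```python
def split_filters(num_filters, num_modules, max_filters_per_call):
    """Return (first_filter, count, target_row_offset) for each filter set."""
    sets = []
    for first in range(0, num_filters, max_filters_per_call):
        count = min(max_filters_per_call, num_filters - first)
        # Assumed layout: targets has num_filters * num_modules rows,
        # with the rows belonging to each filter stored contiguously.
        sets.append((first, count, first * num_modules))
    return sets

for first, count, offset in split_filters(64, 512 * 512, 16):
    # A real caller would invoke filterActs here on a matrix holding
    # filters first .. first + count - 1, writing into the shared
    # target array starting at row `offset`.
    print(first, count, offset)
```

Each tuple describes one filterActs call: which filters go into the separate matrix and where its output starts in the shared target.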
I'll probably come up with something better in the future but for now it's an unfixed bug.
Alex
Comment #2
Posted on Dec 26, 2013 by Happy Lion
CUDA compute capability 2.x and lower require the x, y and z dimensions of a grid to be smaller than 65536. With 3.x, however, the x dimension of a grid can be as large as 2^31-1, which is enough for larger pictures. The original function could be enhanced just by swapping the x and y dimensions.
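The effect of the swap can be checked with a small Python sketch. The grid shape below is a made-up example with an oversized y dimension, and `max_grid_dims` and `fits` are illustrative helpers that only cover compute capability 2.x and later.

```python
def max_grid_dims(cc_major):
    """Maximum grid size (x, y, z) for compute capability 2.x and later."""
    if cc_major < 2:
        raise ValueError("only compute capability 2.x and later are covered")
    # Only the x limit was raised in 3.x; y and z stay at 65535.
    max_x = 65535 if cc_major == 2 else 2**31 - 1
    return (max_x, 65535, 65535)

def fits(grid, cc_major):
    """True if every grid dimension is within the device limit."""
    return all(1 <= g <= m for g, m in zip(grid, max_grid_dims(cc_major)))

original = (4, 1048576, 1)  # hypothetical grid with an oversized y
swapped = (original[1], original[0], original[2])
print(fits(original, 3), fits(swapped, 3), fits(swapped, 2))
```

The swapped grid only fits on 3.x devices, and the kernel would also need to exchange its uses of blockIdx.x and blockIdx.y to match.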
Comment #3
Posted on Jan 7, 2014 by Massive Bird
The suggested fix sounds very promising. Any plans to swap x and y in the future?
Status: New
Labels:
Type-Defect
Priority-Medium