|
ReadFromAndWriteToBuffersAndImages
General information on how to write data to OpenCL memory objects (Buffers and 2D/3D Images
IntroductionThis page describes common properties and principles of buffers and images, but another page is dedicated to image-specific features : UsingImages. OpenCL Memory ObjectsOpenCL lets you create memory objects of three types : buffers (1D arrays), 2D images and 3D images. These memory objects can be stored in the host memory (typically, in RAM) or in the device memory (typically, in GRAM directly on the graphic card). Memory allocation : host vs. deviceFor host memory-stored memory objects, you can either ask OpenCL to allocate memory on its own or directly providing the pointer to some memory you allocated (and initialized) by yourself. When using a host-allocated memory object, OpenCL might let the kernels access directly to the data from the host or instead choose to cache some of this data, in read and/or write modes. What it means is that if you create a direct buffer of values and create an OpenCL buffer out of it, in "use host pointer" mode, you cannot expect your changes done on your direct buffer to be visible from OpenCL kernels using the OpenCL buffer. Before and after doing any change to or reading any data from a host pointer, you must call the CLBuffer.map / unmap methods. Lastly, OpenCL might let you read/write directly from an OpenCL memory object with the map/unmap mechanism. Mapping a memory object gives a direct pointer to the memory object's data (for instance using a DMA mechanism). However, mapping a memory object's data is only guaranteed to work in the case of host-allocated pointers, so it should be used with caution. Creating Memory Objects with JavaCLCreating BuffersWhile OpenCL defines buffers as arrays of bytes, JavaCL provides support for typed buffers with 1 to 1 mapping to NIO buffers. If you use a __global float* argument in a kernel of yours, you can either choose to use a CLBuffer<Byte> and do the float <-> byte conversion and offsets calculations on your own or directly use a CLBuffer<Float> that will take care of everything. Host-allocated pointers
Device-allocated pointersAllocate a zeroed-out buffer : int size = 100; CLBuffer<Float> buffer = context.createFloatBuffer(CLMem.Usage.InputOutput, size); Allocate a buffer as a copy of existing data : // Last 'true' argument asks for a copy of ptr to be made in the device memory. // The clBuffer will have no further link to ptr. CLBuffer<Float> clBuffer = context.createBuffer(CLMem.Usage.InputOutput, ptr, true); Reading from / Writing to a bufferCLContext context = ... ; // e.g. JavaCL.getBestContext() CLQueue queue = ... ; // e.g. context.createDefaultQueue() CLBuffer<Float> clBuffer = ... ; // e.g. context.createFloatBuffer(CLMem.Usage.InputOutput, size)
Mapping a buffer for read / write operationsMapping a buffer is only guaranteed to work for buffers allocated from host-pointers.
Important Note on EndiannessEach OpenCL device can have a different endianness. This makes it very hard to share the same buffers between two devices with mismatching endianness (for instance, take a little endian Intel Core device vs. a big endian Radeon 4850 GPU device on the same ATI Stream platform). CLContext.getByteOrder() can be called to check that the context has a consistent byte ordering (checks that all of its device's CLDevice.getByteOrder() methods return the same order, and returns null in case it's mismatching). All the CLBuffer read/map methods should return properly ordered NIO buffers or BridJ pointers (using the order of the queue's device). NIO Buffers / BridJ Pointers given to all CLBuffer write methods should be ordered with the correct byte order (see examples above). Garbage Collection vs. Manual ReleaseJavaCL maps OpenCL entities (allocated by the OpenCL driver, typically in the device memory) to Java objects (managed by the JVM's garbage collector). OpenCL entities are released when their Java objects counterparts are garbage collected. In many cases this can lead to serious issues : when the OpenCL driver runs out of memory, it does not tell Java to try and collect unused objects (which would release a few OpenCL entities in the process) and just fails, which makes JavaCL throw a CLException.MemObjectAllocationFailure or OutOfResources exception. To avoid that, one can manually release an unused buffer (or any JavaCL entity) by calling CLEntity.release() (CLEntity is a base class which is inherited by CLBuffer, CLImage2D, CLProgram, CLEvent... virtually all JavaCL classes of interest). JavaCL features a workaround for allocations that fail due to lack of OpenCL memory (it then calls the Java Garbage Collector and waits a while before retrying), but it is advised to call release() everytime possible (but only once on each object, of course). | |||||||||||||