Aparapi User's Guide.
Aparapi is an API used to express data parallel workloads in Java, together with a runtime system capable of running compatible workloads on a compatible GPU. Where your workload runs depends on
Aparapi depends on AMD’s OpenCL™ driver to execute on the GPU and therefore shares the same device, driver, and platform compatibility requirements as AMD APP SDK V2.5®.
The following fragment of Java code takes an input float array and populates an output array with the square of each element.

final float[] in = new float[8192]; // initialization of in[0..8191] omitted
final float[] out = new float[in.length];
for (int i = 0; i < in.length; i++) {
    out[i] = in[i] * in[i];
}

This code segment is an ideal data parallel candidate: each pass through the loop is independent of the others, so traversing the loop in any order produces the same result. To convert the above code to Aparapi we use an anonymous inner class (a common Java idiom) to express the data parallel nature of the sequential loop.

Kernel kernel = new Kernel(){
    @Override public void run(){
        int i = getGlobalId();
        out[i] = in[i] * in[i];
    }
};
kernel.execute(in.length);

Java developers should recognize the general pattern as similar to that used to launch a new Thread.

Thread thread = new Thread(new Runnable(){
    @Override public void run(){
        System.out.println("In another thread!");
    }
});
thread.start();
thread.join();

The Aparapi developer extends com.amd.aparapi.Kernel and overrides the public void Kernel.run() method. It is this Kernel.run() method that is executed in parallel.

The base class also exposes the Kernel.execute(range) method, which is used to initiate the execution of Kernel.run() over the range 0..n-1. Kernel.execute(range) blocks until execution has completed.

Any code within the overridden run() method of Kernel (and indeed any method reachable from it) is assumed to be data parallel, and it is the developer’s responsibility to ensure that it is. Aparapi can neither detect nor enforce this.

Within the executing kernel (on the GPU device or in the thread pool), the Kernel.getGlobalId() method identifies which element of the range 0..n-1 a particular execution represents.

Compiling an Aparapi application

Aparapi has only two compilation requirements:
$ javac -g -cp ${APARAPI_DIR}/aparapi.jar Squares.java

Aparapi requires this classfile debug information so that it can extract the name and scope of local variables for the generated OpenCL.

Running an Aparapi application

At runtime an Aparapi-enabled application requires aparapi.jar to be in the class path to be able to execute in a Java Thread Pool (no GPU offload).

$ java -cp ${APARAPI_DIR}/aparapi.jar;. Squares

The commands in this section use ; as the class path separator, as on Windows; on Linux, use : instead.

To take advantage of the GPU, the directory containing the platform-dependent Aparapi shared library is passed via the java.library.path property.

$ java -Djava.library.path=${APARAPI_DIR} -cp ${APARAPI_DIR}/aparapi.jar;. Squares

Aparapi detects whether the JNI shared library is available. If the library cannot be located, your code will be executed using a Java Thread Pool.

An application can detect whether a kernel was executed on the GPU or by a Java Thread Pool (JTP) by querying the execution mode after Kernel.execute(range) has returned. This is achieved using the Kernel.getExecutionMode() method.

Kernel kernel = new Kernel(){
    @Override public void run(){
        int i = getGlobalId();
        out[i] = in[i] * in[i];
    }
};
kernel.execute(in.length);
if (!kernel.getExecutionMode().equals(Kernel.EXECUTION_MODE.GPU)){
    System.out.println("Kernel did not execute on the GPU!");
}

To obtain a runtime report of the execution mode of all kernel executions, set the com.amd.aparapi.enableExecutionModeReporting property to true when the JVM is launched.

$ java -Djava.library.path=${APARAPI_DIR} -Dcom.amd.aparapi.enableExecutionModeReporting=true -cp ${APARAPI_DIR}/aparapi.jar;. Squares

Running the sample applications

Aparapi includes two sample applications in the /samples subdirectory of the binary distribution zip file.
The jar file for each sample is included (so you can run a sample without having to build it), as are both Linux® and Microsoft Windows® script files for launching the samples. You will need an appropriate GPU card, an OpenCL™-enabled Catalyst® driver, and a compatible Oracle Java 6 JRE for your platform.

To execute a sample:
The sample scripts pass the first argument (%1 or $1) to -Dcom.amd.aparapi.executionMode when the JVM is launched. This allows the sample to be tested in either GPU or JTP execution mode by passing the requested mode.

$ cd samples/mandel
$ bash ./mandel.sh GPU
<executes in GPU mode here>
$ bash ./mandel.sh JTP
<executes in JTP mode here>

Building the sample applications

To build a sample, install Oracle® JDK 6 and Apache Ant (at least 1.7.1).
$ cd samples/mandel
$ ant
$ bash ./mandel.sh GPU
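
The execution model described in this guide, in which run() is invoked once per global id and execute(range) blocks until every invocation has finished, can be illustrated without a GPU or aparapi.jar. The following is a minimal plain-Java sketch, assuming Java 8 or later. MiniKernel, SquaresSketch, and squares are hypothetical names invented for this illustration and are not part of Aparapi, and the strided thread-pool scheduling is an assumption, not a description of how Aparapi's Java Thread Pool mode is actually implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SquaresSketch {

    /** Hypothetical stand-in for a kernel base class; NOT com.amd.aparapi.Kernel. */
    abstract static class MiniKernel {
        // Each worker thread tracks the id of the element it is currently processing.
        private final ThreadLocal<Integer> globalId = new ThreadLocal<>();

        protected int getGlobalId() {
            return globalId.get();
        }

        /** The data parallel body; called once for every id in 0..range-1. */
        public abstract void run();

        /** Runs the body for every id in 0..range-1 and blocks until all calls finish. */
        public void execute(final int range) throws InterruptedException, ExecutionException {
            final int threads = Runtime.getRuntime().availableProcessors();
            ExecutorService pool = Executors.newFixedThreadPool(threads);
            try {
                List<Future<?>> pending = new ArrayList<>();
                for (int t = 0; t < threads; t++) {
                    final int first = t;
                    // Assumed strided split: worker t handles ids t, t+threads, t+2*threads, ...
                    pending.add(pool.submit(() -> {
                        for (int id = first; id < range; id += threads) {
                            globalId.set(id);
                            run();
                        }
                    }));
                }
                // Waiting on every future makes execute() blocking and surfaces failures.
                for (Future<?> f : pending) {
                    f.get();
                }
            } finally {
                pool.shutdown();
            }
        }
    }

    /** Squares each element of the input using the kernel-style pattern from this guide. */
    static float[] squares(final float[] in) throws InterruptedException, ExecutionException {
        final float[] out = new float[in.length];
        MiniKernel kernel = new MiniKernel() {
            @Override public void run() {
                int i = getGlobalId();
                out[i] = in[i] * in[i];
            }
        };
        kernel.execute(in.length);
        return out;
    }

    public static void main(String[] args) throws Exception {
        float[] in = new float[8192];
        for (int i = 0; i < in.length; i++) {
            in[i] = i;
        }
        float[] out = squares(in);
        System.out.println(out[3]); // prints 9.0
    }
}
```

Because each call to run() writes only out[i] for its own id, the result does not depend on how the ids are divided among threads, which is the property that makes the original loop a data parallel candidate.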