Hi,
over the past two days I have been updating the AMD ATI Stream SDK package, which is known as 'amdstream' in the AUR.
See http://aur.archlinux.org/packages.php?ID=21933
The big news about this new version (2.01) is the OpenCL support.
This package contains the includes, libs and CL compiler of AMD's OpenCL implementation.
AMD ships this SDK as one big folder that does not respect the usual Unix directory structure, so I had to make it install in /opt. However, the CAL and CL includes are copied into /usr/include/, and the libs and bins are symlinked into /usr/lib/ and /usr/bin/ respectively.
Hope this arrangement is OK.
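The arrangement described above could be sketched roughly as follows. This is only an illustration run against a throwaway mock tree, not the actual PKGBUILD: the SDK subdirectories (include/CL, lib/x86, bin/x86), the placeholder file names, and the variables work, sdk and pkgdir are all assumptions made for the example.

```shell
#!/bin/sh
set -eu

# Throwaway directories so the sketch touches nothing on the real system.
work=$(mktemp -d)
sdk="$work/sdk"        # stands in for the unpacked SDK (assumed layout)
pkgdir="$work/pkg"     # stands in for the packaging root

# Build a tiny mock SDK tree; the file names are placeholders.
mkdir -p "$sdk/include/CL" "$sdk/lib/x86" "$sdk/bin/x86"
: > "$sdk/include/CL/cl.h"
: > "$sdk/lib/x86/libOpenCL.so"
: > "$sdk/bin/x86/clc"

# 1. The whole SDK goes into /opt as one folder.
mkdir -p "$pkgdir/opt" "$pkgdir/usr/include" "$pkgdir/usr/lib" "$pkgdir/usr/bin"
cp -r "$sdk" "$pkgdir/opt/amdstream"

# 2. Includes are copied into /usr/include.
cp -r "$sdk/include/CL" "$pkgdir/usr/include/"

# 3. Libs and bins are symlinked, pointing at their final /opt location.
for f in "$sdk"/lib/x86/*; do
    ln -s "/opt/amdstream/lib/x86/$(basename "$f")" "$pkgdir/usr/lib/"
done
for f in "$sdk"/bin/x86/*; do
    ln -s "/opt/amdstream/bin/x86/$(basename "$f")" "$pkgdir/usr/bin/"
done

# Show what ended up under the staged /usr.
find "$pkgdir/usr" \( -type f -o -type l \) | sort
```

The symlinks point at absolute /opt paths, so they only resolve once the package is actually installed.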
The OpenCL GPU execution platform is not included in this package, as it is hardware-driver dependent and is provided by the latest catalyst package.
However, the amdstream package does not itself depend on catalyst: OpenCL builds and CPU execution will work without catalyst.
This is made possible by a small dependency library, 'libatical', which I have uploaded to the AUR as well and which catalyst also provides.
I'm currently testing amdstream on my machine. Problem is that I do not have ATI graphics on this pc so I can't test this with Catalyst.
I'd be glad if someone could test this with catalyst and report back. If you are interested, please do.
Also, it would be very interesting to test this on Nvidia hardware with Nvidia drivers. In pure theory, it should be possible to execute the amdstream OpenCL samples on an Nvidia GPU. If this proves possible in practice, it would be a great breakthrough in GPGPU platform agnosticism. Anyone with Nvidia graphics is encouraged to look into this.
If you test this package, on whatever hardware, the first thing you should try after installation is the 'CLInfo' sample,
which prints out all available OpenCL platforms and devices along with extensive information about each.
Also note that almost every sample from the SDK accepts the '--device <gpu|cpu>' option, which comes in handy if you have no OpenCL-enabled GPU or want to test the code on a device other than the default OpenCL device.
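For anyone wanting to try both device types in one go, a small wrapper like the following may help. It is only a sketch: run_on_devices is a made-up helper name, and the only thing taken from the SDK is the '--device <gpu|cpu>' option mentioned above.

```shell
#!/bin/sh
# run_on_devices SAMPLE [ARGS...]
# Runs one SDK sample twice, once per device type, passing any extra
# arguments through unchanged. (Hypothetical helper, not part of the SDK.)
run_on_devices() {
    sample=$1
    shift
    for dev in cpu gpu; do
        echo "== $sample --device $dev =="
        # A failure on one device (e.g. no OpenCL-enabled GPU) should not
        # prevent the other run, so it is only reported.
        "$sample" "$@" --device "$dev" || echo "(failed on $dev)"
    done
}
```

Usage, assuming the samples are installed under /opt/amdstream/samples/opencl/bin/x86: change into that directory and call `run_on_devices ./CLInfo`, or pass a sample's own options after its name.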
Note: I moved the old amdstream version to amdstream-old, as the new version is a completely different thing (except for the CAL part) rather than a mere update. So I am keeping the old version in case someone wants the Brook+ libs and runtimes (e.g. for backwards compatibility or whatever). (But it could just as well be deleted if no one wants it or the sources are discontinued.)
Ok, enjoy and hope the amdstream package will work fine for you...
Last edited by kralyk (2010-03-25 14:21:34)
It depends on libatical.
libatical uses catalyst 10.1 and conflicts with catalyst-test (10.4 which works with xorg 1.7).
So no, I cannot test it without downgrading catalyst which leads to downgrading xorg and probably the kernel (2.6.34-rc2 here).
฿ 18PRsqbZCrwPUrVnJe1BZvza7bwSDbpxZz
You DO NOT need to install libatical if you use catalyst or catalyst-test!
libatical is only meant for people who do not have catalyst installed.
Both catalyst and catalyst-test provide libatical as well as libgl.
So yes you can test it without having to install any more dependencies or downgrading anything...
Last edited by kralyk (2010-03-26 09:23:56)
Then maybe you should change
depends=('libatical' 'libgl')
to optdepends.
Because catalyst-test only provides
Provides : catalyst libgl
or maybe catalyst needs to be changed to provide libatical?
Whatever.
edit: ok, the catalyst-test that is NOW in the AUR provides it; when I installed it, it didn't...
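The confusion above comes down to how a name in depends gets satisfied: either by a package of that name or by an installed package that lists the name among its provides. A rough sketch of that check, using the lists quoted in this thread (missing_deps is a made-up function name, the exact provides list of the newer catalyst-test is an assumption, and version handling is ignored):

```shell
#!/bin/sh
# amdstream's PKGBUILD, as quoted above, has:
#   depends=('libatical' 'libgl')
depends="libatical libgl"

# missing_deps PROVIDES
# Prints each dependency that the given provides list does not cover;
# prints nothing if everything is satisfied.
missing_deps() {
    provides=" $1 "
    for dep in $depends; do
        case $provides in
            *" $dep "*) ;;          # covered by a provides entry
            *) echo "$dep" ;;       # nothing provides it
        esac
    done
}

# Older catalyst-test, per the output above: "Provides : catalyst libgl"
old_missing=$(missing_deps "catalyst libgl")
# Newer AUR catalyst-test, assumed here to add libatical to that list
new_missing=$(missing_deps "catalyst libgl libatical")

echo "old catalyst-test leaves unsatisfied: ${old_missing:-nothing}"
echo "new catalyst-test leaves unsatisfied: ${new_missing:-nothing}"
```

On an installed system, `pacman -Qi catalyst-test` shows the real "Provides" line to compare against.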
Seems to be good:
Device Type: CL_DEVICE_TYPE_GPU
Device ID: 4098
Max compute units: 8
Max work items dimensions: 3
Max work items[0]: 128
Max work items[1]: 128
Max work items[2]: 128
Max work group size: 128
Preferred vector width char: 16
Preferred vector width short: 8
Preferred vector width int: 4
Preferred vector width long: 2
Preferred vector width float: 4
Preferred vector width double: 0
Max clock frequency: 650Mhz
Address bits: 32
Max memeory allocation: 134217728
Image support: No
Max size of kernel argument: 1024
Alignment (bits) of base address: 4096
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: No
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: No
Round to +ve and infinity: No
IEEE754-2008 fused multiply-add: No
Cache type: None
Cache line size: 0
Cache size: 0
Global memory size: 134217728
Constant buffer size: 65536
Max number of constant args: 8
Local memory type: Global
Local memory size: 16384
Profiling timer resolution: 1
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: No
Queue properties:
Out-of-Order: No
Profiling : Yes
Platform ID: 0xb74f0488
Name: ATI RV730
Vendor: Advanced Micro Devices, Inc.
Driver version: CAL 1.4.596
Profile: FULL_PROFILE
Version: OpenCL 1.0 ATI-Stream-v2.0.1
Extensions: cl_khr_icd
HelloCL!
Getting Platform Information
Creating a context AMD platform
Getting device info
Loading and compiling CL source
Running CL program
Done
Passed!
HD 4650, Kernel 2.6.34-rc2, catalyst 10.4
Last edited by Cdh (2010-03-26 09:47:43)
Is there any OPENCL app on linux?
Excuse my poor English.
Offline
Cdh: yes, only the latest AUR catalyst/catalyst-test package provides libatical.
The libatical package is as new as this SDK. I should probably have mentioned that in my post...
Anyway, thanks for testing, I'm glad it works so far, good job ;-)
agapito: I don't think there are any user-ready OpenCL apps yet.
OpenCL implementations mostly appeared only last year, so it is a very young platform.
As far as I know there are only scientific and experimental OpenCL apps so far.
So this package is probably only good for developers and enthusiasts.
Last edited by kralyk (2010-03-26 12:42:54)
Today I updated everything again, and there are no dependency problems anymore.
Most of the examples actually run slower on the GPU than on the CPU. I guess it's because the GPU is not so good at sorting or matrix multiplication...
Mandelbrot really is faster on the GPU, I guess because many equations can be solved at the same time:
chris@chrispc /opt/amdstream/samples/opencl/bin/x86 % ./Mandelbrot -t -q --device gpu -x 4000
Executing kernel for 1 iterations
-------------------------------------------
Width Height Time(sec)
4096 4096 3.098
chris@chrispc /opt/amdstream/samples/opencl/bin/x86 % ./Mandelbrot -t -q --device cpu -x 4000
Executing kernel for 1 iterations
-------------------------------------------
Width Height Time(sec)
4096 4096 34.017
./Mandelbrot -t -q --device cpu -x 4000 37,57s user 18,12s system 163% cpu 34,042 total
chris@chrispc /opt/amdstream/samples/opencl/bin/x86 %
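Putting the two timings above side by side, the speedup is simply the CPU time divided by the GPU time. A quick way to compute it from the numbers in the output (a single run of a single sample, so only a rough indication):

```shell
#!/bin/sh
gpu_time=3.098    # seconds, from the --device gpu run above
cpu_time=34.017   # seconds, from the --device cpu run above

# awk does the floating-point division that plain sh arithmetic cannot;
# LC_ALL=C keeps the decimal point independent of the locale.
speedup=$(LC_ALL=C awk -v c="$cpu_time" -v g="$gpu_time" 'BEGIN { printf "%.1f", c / g }')
echo "GPU speedup over CPU: ${speedup}x"
```

That works out to roughly 11x for this Mandelbrot run.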
edit: gpu and cpu are very similar words
Last edited by Cdh (2010-03-27 12:12:10)