Phoronix · @phoronix
2780 followers · 2522 posts · Server noc.social

PoCL 4.0 @openclapi Implementation Released With @IntelGraphics oneAPI Level Zero Driver

phoronix.com/news/PoCL-4.0-Rel

Original tweet: twitter.com/phoronix/status/16

#oneAPI #pocl

Last updated 1 year ago

Moritz Lehmann · @ProjectPhysX
185 followers · 53 posts · Server mast.hpc.social

🧵9/9
The source code for the experimental @FluidX3D P2P is available in this branch on GitHub: github.com/ProjectPhysX/FluidX

The PR for PoCL with cudaMemcpy is available here: github.com/pocl/pocl/pull/1189

#github #pocl

Last updated 2 years ago

Moritz Lehmann · @ProjectPhysX
185 followers · 52 posts · Server mast.hpc.social

Credit and many thanks to Jan Solanti from Tampere University for visiting me at University of Bayreuth and testing this together with me, in his endeavour to implement/optimize PoCL-Remote.
Thanks to @ShmarvDogg for testing P2P mode on his 2x A770 16GB "bigboi" PC!
🧵8/9

#pocl

Last updated 2 years ago

Moritz Lehmann · @ProjectPhysX
185 followers · 47 posts · Server mast.hpc.social

When running FluidX3D with the CUDA backend of PoCL + P2P cudaMemcpy, performance is 40% faster compared to PCIe copy over CPU memory. PoCL's P2P backend is >3x faster than Nvidia's own OpenCL runtime here. This is the perf delta Nvidia are giving up on.
🧵3/9

#fluidx3d #cuda #pocl #opencl #nvidia

Last updated 2 years ago

Phoronix · @phoronix
1065 followers · 855 posts · Server noc.social

PoCL 3.1 Released - Improved SPIR-V For CPU & NVIDIA CUDA Drivers, WIP @VulkanAPI Driver

phoronix.com/news/PoCL-3.1-Rel

Original tweet: twitter.com/phoronix/status/15

#nvidia #SPIRV #pocl

Last updated 2 years ago

Phoronix · @phoronix
890 followers · 773 posts · Server noc.social

PoCL 3.1-RC1 Released With Improved SPIR-V Support For CPU & CUDA Drivers, @VulkanAPI WIP

-- This "portable OpenCL" implementation continues improving.

phoronix.com/news/PoCL-3.1-RC1

Original tweet: twitter.com/phoronix/status/15

#pocl

Last updated 2 years ago

claude · @mathr
288 followers · 2739 posts · Server post.lurk.org

OpenCL device fission / device partition works fine on CPU with PoCL (I can get it to use only 15 of my 16 cores/threads if I like), but on AMD GPU the call to clCreateSubDevices just returns "invalid value", which I guess means "not supported". I was hoping to leave 1 compute unit free so that my desktop environment wouldn't become completely unusable for the duration of the computations.

#gpu #amd #pocl #cpu #opencl

Last updated 5 years ago

claude · @mathr
288 followers · 2739 posts · Server post.lurk.org

howto get OpenCL 1.2 on Debian Buster with AMDGPU:

- make sure your system is up to date github.com/RadeonOpenCompute/R

- add the rocm apt repository github.com/RadeonOpenCompute/R

- install rocm-opencl-dev (using upstream kernel drivers) github.com/RadeonOpenCompute/R

- do NOT try to mess with anything dkms, it won't work

- purge mesa-opencl-icd and pocl-opencl-icd, they get in the way and stop the amdgpu icd from loading correctly(*)

The Mesa Clover implementation doesn't go as high as OpenCL version 1.2, and PoCL is CPU only, thus usually slower. There was a proprietary CPU-based OpenCL implementation that I found in some random backports repository once, but in my test it was very slow, and it got uninstalled during my tinkering.

tested on RX 580 GPU with Ryzen 2700X CPU, don't know about other hardware, maybe check for support online or just try it

I used Fractorium for testing, it needs OpenCL >= 1.2.

(*) works for me, your mileage may vary

#howto #opencl #debian #buster #amdgpu #mesa #clover #gpu #pocl #cpu #amd

Last updated 6 years ago
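
A rough way to check the purge step from the how-to above is to look at which ICD files are still registered with the OpenCL loader. The sketch below is only an illustration, not something from the original post: it assumes the loader reads `*.icd` files from `/etc/OpenCL/vendors` (the usual location on Linux, but your system may differ), and the function name `list_opencl_icds` is made up for this example.

```python
from pathlib import Path


def list_opencl_icds(vendors_dir="/etc/OpenCL/vendors"):
    """Map each registered .icd file name to the library it names.

    vendors_dir is an assumption: the usual ICD loader directory on Linux.
    """
    root = Path(vendors_dir)
    if not root.is_dir():
        # No vendors directory means no ICDs are registered there.
        return {}
    icds = {}
    for icd_file in sorted(root.glob("*.icd")):
        text = icd_file.read_text(errors="replace").strip()
        # An ICD file normally holds one line: the vendor library to load.
        icds[icd_file.name] = text.splitlines()[0].strip() if text else ""
    return icds


if __name__ == "__main__":
    for name, library in list_opencl_icds().items():
        print(f"{name}: {library}")
```

If entries for Mesa or PoCL still show up after purging mesa-opencl-icd and pocl-opencl-icd, those packages are probably still installed.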