Hello,
I'm trying to run some OpenCL programs, but they keep segfaulting. This is on FreeBSD 11.2
with an AMD Radeon RX580, the amdgpu driver and drm-next-kmod.
The GPU works fine with Xorg and OpenGL programs, but OpenCL only partially works: each
program below prints some output and then segfaults. I have no idea what's wrong. Here are some examples, starting with clinfo:
Code:
$ clinfo
Number of platforms 1
Platform Name Clover
Platform Vendor Mesa
Platform Version OpenCL 1.1 Mesa 18.1.3
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd
Platform Extensions function suffix MESA
Platform Name Clover
Number of devices 1
Device Name Radeon RX 580 Series (POLARIS10, DRM 3.8.0, 11.2-BETA2, LLVM 6.0.1)
Device Vendor AMD
Device Vendor ID 0x1002
Device Version OpenCL 1.1 Mesa 18.1.3
Driver Version 18.1.3
Device OpenCL C Version OpenCL C 1.1
Device Type GPU
Device Profile FULL_PROFILE
Max compute units 36
Max clock frequency 1430MHz
Max work item dimensions 3
Max work item sizes 256x256x256
Max work group size 256
Preferred work group size multiple 64
Preferred / native vector sizes
char 16 / 16
short 8 / 8
int 4 / 4
long 2 / 2
half 8 / 8 (cl_khr_fp16)
float 4 / 4
double 2 / 2 (cl_khr_fp64)
Half-precision Floating-point support (cl_khr_fp16)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero No
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Single-precision Floating-point support (core)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero No
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Address bits 64, Little-Endian
Global memory size 17091704832 (15.92GiB)
Error Correction support No
Max memory allocation 11964193382 (11.14GiB)
Unified memory for Host and Device No
Minimum alignment for any data type 128 bytes
Alignment of base address 32768 bits (4096 bytes)
Global Memory cache type None
Image support No
Local memory type Local
Local memory size 32768 (32KiB)
Max constant buffer size 2147483647 (2GiB)
Max number of constant args 16
Max size of kernel argument 1024
Queue properties
Out-of-order execution No
Profiling Yes
Profiling timer resolution 0ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
Device Available Yes
Compiler Available Yes
Device Extensions cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp64 cl_khr_fp16
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) Clover
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) Success [MESA]
clCreateContext(NULL, ...) [default] Success [MESA]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1)
Platform Name Clover
Device Name Radeon RX 580 Series (POLARIS10, DRM 3.8.0, 11.2-BETA2, LLVM 6.0.1)
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
Platform Name Clover
Device Name Radeon RX 580 Series (POLARIS10, DRM 3.8.0, 11.2-BETA2, LLVM 6.0.1)
ICD loader properties
ICD loader Name OpenCL ICD Loader
ICD loader Vendor OCL Icd free software
ICD loader Version 2.2.12
ICD loader Profile OpenCL 2.2
NOTE: your OpenCL library declares to support OpenCL 2.2,
but it seems to support up to OpenCL 2.1 only.
Segmentation fault
With benchmarks/clpeak:
Code:
$ clpeak
Platform: Clover
Device: Radeon RX 580 Series (POLARIS10, DRM 3.8.0, 11.2-BETA2, LLVM 6.0.1)
Driver version : 18.1.3 (FreeBSD)
Compute units : 36
Clock frequency : 1430 MHz
Global memory bandwidth (GBPS)
float : 216.07
float2 : 220.49
float4 : 223.63
float8 : 214.87
float16 : 124.73
Single-precision compute (GFLOPS)
float : 6273.42
float2 : 6324.87
float4 : 6348.79
float8 : 6309.87
float16 : 6249.02
half-precision compute (GFLOPS)
half : 6359.56
half2 : 6319.32
half4 : 6350.19
half8 : 6327.64
half16 : 6276.21
Double-precision compute (GFLOPS)
double : 409.68
double2 : 409.66
double4 : 409.06
double8 : 408.41
double16 : 406.55
Integer compute (GIOPS)
int : 1306.39
int2 : 1304.44
int4 : 1306.26
int8 : 1303.58
int16 : 1302.69
Transfer bandwidth (GBPS)
enqueueWriteBuffer : 5.18
enqueueReadBuffer : 5.59
enqueueMapBuffer(for read) : 9686.44
memcpy from mapped ptr : 5.93
enqueueUnmap(after write) : 9552.86
memcpy to mapped ptr : 5.17
Kernel launch latency : 167.84 us
Segmentation fault
Or with a simple program such as list_devices.cpp from Boost.Compute (https://github.com/boostorg/compute/blob/master/example/list_devices.cpp):
Code:
$ ./list_devices
Platform 'Clover'
GPU Device: Radeon RX 580 Series (POLARIS10, DRM 3.8.0, 11.2-BETA2, LLVM 6.0.1)
Segmentation fault
Or with xmrig-amd, the GPU-based variant of net-p2p/xmrig (https://github.com/xmrig/xmrig-amd):
Code:
$ ./xmrig-amd --config=config.json
* VERSIONS XMRig/2.7.3-beta libuv/1.22.0 OpenCL/2.0 clang/6.0.0
* CPU AMD Ryzen Threadripper 1950X 16-Core Processor x64 AES
* ALGO cryptonight, donate=5%
* POOL #1 pool.supportxmr.com:5555 variant 1
* POOL #2 pool.monero.hashvault.pro:3333 variant 1
* COMMANDS hashrate, pause, resume
[2018-07-20 18:23:52] compiling code and initializing GPUs. This will take a while...
[2018-07-20 18:23:52] No AMD OpenCL platform found. Possible driver issues or wrong vendor driver.
[2018-07-20 18:23:52] Selected OpenCL platform index -1 doesn't exist.
[2018-07-20 18:23:52] Failed to start threads
Segmentation fault
Not sure how to interpret this.
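One possible reading of the xmrig-amd messages, as a sketch only: clinfo above reports the platform vendor as "Mesa", so a program that picks its platform by looking for an AMD vendor string would find nothing and end up with index -1. I haven't checked xmrig-amd's source; the function name and the "Advanced Micro Devices" substring below are assumptions for illustration, not its actual code.

```python
# Hypothetical illustration, not xmrig-amd's real code: pick the first
# OpenCL platform whose vendor string contains a wanted substring.
def find_platform_index(vendors, wanted="Advanced Micro Devices"):
    """Return the index of the first matching vendor, or -1 if none match."""
    for index, vendor in enumerate(vendors):
        if wanted in vendor:
            return index
    return -1


# Vendor strings as reported by clinfo on this machine (only Clover/Mesa).
vendors_here = ["Mesa"]

index = find_platform_index(vendors_here)
if index < 0:
    print("No AMD OpenCL platform found.")
    print(f"Selected OpenCL platform index {index} doesn't exist.")
```

If that is what happens, the two error lines would be expected with Clover, but it would not by itself explain the segfault that follows.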
Installed ports:
clinfo-2.1.16.01.12
clpeak-1.0g20170524
drm-next-kmod-4.11.g20180619_1
gpu-firmware-kmod-g20180319_1
libdrm-2.4.92,1
ocl-icd-2.2.12
opencl-2.2
xf86-video-amdgpu-1.3.0_1
Loaded modules:
Code:
$ kldstat
Id Refs Address Size Name
1 109 0xffffffff80200000 2036448 kernel
2 1 0xffffffff82239000 af98 aesni.ko
3 1 0xffffffff82244000 1e0d8 geom_eli.ko
4 1 0xffffffff82266000 381080 zfs.ko
5 2 0xffffffff825e8000 a380 opensolaris.ko
6 1 0xffffffff825f3000 15da0 fuse.ko
7 1 0xffffffff82821000 155a50 amdgpu.ko
8 1 0xffffffff82977000 714f0 drm.ko
9 4 0xffffffff829e9000 edc8 linuxkpi.ko
10 3 0xffffffff829f8000 d470 linuxkpi_gplv2.ko
11 2 0xffffffff82a06000 6b8 debugfs.ko
12 1 0xffffffff82a07000 8148 amdgpu_polaris10_mc_bin.ko
13 1 0xffffffff82a10000 4400 amdgpu_polaris10_pfp_bin.ko
14 1 0xffffffff82a15000 4400 amdgpu_polaris10_me_bin.ko
15 1 0xffffffff82a1a000 2400 amdgpu_polaris10_ce_bin.ko
16 1 0xffffffff82a1d000 5f30 amdgpu_polaris10_rlc_bin.ko
17 1 0xffffffff82a23000 40400 amdgpu_polaris10_mec_bin.ko
18 1 0xffffffff82a64000 40400 amdgpu_polaris10_mec2_bin.ko
19 1 0xffffffff82aa5000 3318 amdgpu_polaris10_sdma_bin.ko
20 1 0xffffffff82aa9000 3320 amdgpu_polaris10_sdma1_bin.ko
21 1 0xffffffff82aad000 5bc00 amdgpu_polaris10_uvd_bin.ko
22 1 0xffffffff82b09000 28d20 amdgpu_polaris10_vce_bin.ko
23 1 0xffffffff82b32000 1fe18 amdgpu_polaris10_smc_bin.ko
24 1 0xffffffff82b52000 3698 ng_ubt.ko
25 5 0xffffffff82b56000 9a20 netgraph.ko
26 1 0xffffffff82b60000 8e78 ng_hci.ko
27 3 0xffffffff82b69000 95c ng_bluetooth.ko
28 1 0xffffffff82b6a000 2328 ums.ko
29 1 0xffffffff82b6d000 1780 uhid.ko
30 1 0xffffffff82b6f000 bc0e ng_l2cap.ko
31 1 0xffffffff82b7b000 176a8 ng_btsocket.ko
32 1 0xffffffff82b93000 1d40 ng_socket.ko
33 1 0xffffffff82b95000 1070 cpuctl.ko
34 1 0xffffffff82b97000 32d048 vmm.ko
35 1 0xffffffff82ec5000 a54 nmdm.ko
36 1 0xffffffff82ec6000 5fb8 if_bridge.ko
37 1 0xffffffff82ecc000 3b78 bridgestp.ko
38 1 0xffffffff82ed0000 24a0 if_tap.ko
39 1 0xffffffff82ed3000 11a0 amdtemp.ko
40 1 0xffffffff82ed5000 628 amdsmn.ko
Interesting dmesg output:
Code:
pid 11541 (xmrig-amd), uid 1001: exited on signal 11
pid 11545 (clinfo), uid 1001: exited on signal 11
pid 11551 (xmrig-amd), uid 0: exited on signal 11
pid 11550 (sudo), uid 0: exited on signal 11
pid 11585 (clinfo), uid 1001: exited on signal 11
drmn0: failed to get a new IB (-22)
[drm:amdgpu_gem_va_update_vm] Couldn't update BO_VA (-22)
pid 12492 (clpeak), uid 1001: exited on signal 11
pid 18970 (clinfo), uid 1001: exited on signal 11
pid 22560 (clinfo), uid 1001: exited on signal 11
drmn0: failed to get a new IB (-22)
[drm:amdgpu_gem_va_update_vm] Couldn't update BO_VA (-22)
pid 23240 (clpeak), uid 1001: exited on signal 11
pid 23287 (list_devices), uid 1001: exited on signal 11
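To summarise the crash lines, here is a small sketch that tallies the "exited on signal" entries per program. The sample text is copied from the output above; the function name and the regular expression are my own and only cover this one message format.

```python
import re
from collections import Counter

# Lines copied from the dmesg output above.
DMESG = """\
pid 11541 (xmrig-amd), uid 1001: exited on signal 11
pid 11545 (clinfo), uid 1001: exited on signal 11
pid 11551 (xmrig-amd), uid 0: exited on signal 11
pid 11550 (sudo), uid 0: exited on signal 11
pid 11585 (clinfo), uid 1001: exited on signal 11
drmn0: failed to get a new IB (-22)
[drm:amdgpu_gem_va_update_vm] Couldn't update BO_VA (-22)
pid 12492 (clpeak), uid 1001: exited on signal 11
pid 18970 (clinfo), uid 1001: exited on signal 11
pid 22560 (clinfo), uid 1001: exited on signal 11
drmn0: failed to get a new IB (-22)
[drm:amdgpu_gem_va_update_vm] Couldn't update BO_VA (-22)
pid 23240 (clpeak), uid 1001: exited on signal 11
pid 23287 (list_devices), uid 1001: exited on signal 11
"""

# Matches lines such as "pid 11545 (clinfo), uid 1001: exited on signal 11".
CRASH_RE = re.compile(
    r"^pid (\d+) \(([^)]+)\), uid (\d+): exited on signal (\d+)$"
)


def tally_crashes(text):
    """Count crash lines per (program name, signal number)."""
    counts = Counter()
    for line in text.splitlines():
        match = CRASH_RE.match(line.strip())
        if match:
            counts[(match.group(2), int(match.group(4)))] += 1
    return counts


for (name, signal), count in sorted(tally_crashes(DMESG).items()):
    print(f"{name}: {count} exit(s) on signal {signal}")
```

Every crash is signal 11 (SIGSEGV). In this log the two drm errors appear immediately before each clpeak crash and not before the others.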
Any help appreciated.
Thanks.