Segmentation Fault GPU

D

Deleted member 62636

Guest


Hello, whenever I run clinfo or clpeak I get this error:

clpeak:
Code:
Platform: Clover
  Device: AMD CEDAR (DRM 2.50.0 / 12.2-RELEASE-p3, LLVM 10.0.1)
    Driver version  : 20.2.3 (FreeBSD)
    Compute units   : 2
    Clock frequency : 650 MHz
Segmentation fault (core dumped)

clinfo:
Code:
...
ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.2.13
  ICD loader Profile                              OpenCL 3.0
Segmentation fault (core dumped)

I have installed graphics/drm-fbsd12.0-kmod
 

vigole

Daemon

Reaction score: 1,558
Messages: 1,392

OP
D

Deleted member 62636

Guest


I think this thread died... and the issue is still not solved.
 

astyle

Daemon

Reaction score: 634
Messages: 1,439

Trying to do the same thing on FreeBSD 13-RELEASE. It didn't work under Xorg or under Plasma Wayland. Even compiling the ports with EVERYTHING enabled didn't help. I compiled lang/clover, benchmarks/clpeak, devel/ocl-icd, devel/clinfo, graphics/drm-kmod, and deskutils/sysctlview. My card (Asus Radeon RX 550 4GB) appears to be visible, and I can even run radeontop, but running clinfo dumps core, and clpeak crashes the whole system so hard that I can't even SSH in and reboot.
 

astyle

Daemon


I actually recompiled the FreeBSD 13-RELEASE kernel, userland, and everything mentioned in the above post, except for sysctlview. After that, I posted my results of running clpeak, along with some analysis and a comparison to the competition. I took the liberty of crowing about my benchmark, and I was especially proud to have accomplished that under FreeBSD. Unfortunately, it looks like that post got deleted... it would be nice to have had a heads-up on my profile, or a PM.
 

astyle

Daemon


Probably no help:

Code:
% clinfo
Number of platforms                               0
% clpeak
clGetPlatformIDs (-1001)
no platforms found
%

I guess the hardware (HP EliteBook 8570p, recently <https://bsd-hardware.info/?probe=e90abb54c9>) doesn't support OpenCL.
Your output tells me that you need to run # kldload radeonkms.ko before running % clinfo or % clpeak. Then you'll have the whole train wreck to sort through: numbers for your card, crashes, etc.
 

grahamperrin

Son of Beastie

Reaction score: 822
Messages: 2,647

It's normally loaded. Today, for example:

Code:
% kldstat | grep -i radeon
53    1 0xffffffff83836000   150c70 radeonkms.ko
55    1 0xffffffff83997000     3258 radeon_TURKS_pfp_bin.ko
56    1 0xffffffff8399b000     3658 radeon_TURKS_me_bin.ko
57    1 0xffffffff8399f000     2cd8 radeon_BTC_rlc_bin.ko
58    1 0xffffffff839a2000     7ef8 radeon_TURKS_mc_bin.ko
59    1 0xffffffff839aa000     8138 radeon_TURKS_smc_bin.ko
60    1 0xffffffff839b3000    341f0 radeon_SUMO_uvd_bin.ko
%
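A check like the one above can be scripted. This is only a minimal sketch, assuming kldstat prints its usual five columns (Id, Refs, Address, Size, Name) as in the output shown; the helper name `loaded_modules` is made up for illustration.

```python
def loaded_modules(kldstat_output: str) -> set[str]:
    """Return the module names (last column) found in `kldstat` output."""
    names = set()
    for line in kldstat_output.splitlines():
        fields = line.split()
        # Skip blank lines, shell prompts, and the "Id Refs Address Size Name" header:
        # real entries have at least five fields and start with a numeric Id.
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        names.add(fields[-1])
    return names


# Two lines copied from the kldstat output in this post.
sample = """\
53    1 0xffffffff83836000   150c70 radeonkms.ko
55    1 0xffffffff83997000     3258 radeon_TURKS_pfp_bin.ko
"""
print("radeonkms.ko" in loaded_modules(sample))
```

On a live system the text could come from `subprocess.run(["kldstat"], capture_output=True, text=True).stdout` instead of the pasted sample.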
 

astyle

Daemon


Yeah, but when clinfo says 'Number of platforms: 0', that suggests a need to re-load the radeonkms module and re-run clinfo... what are your results after doing that?
 

Vull

Aspiring Daemon

Reaction score: 517
Messages: 821

I'm on 13.0-RELEASE-p2. Prior to looking at this thread, I had installed only drm-kmod and xf86-video-ati from packages. Everything I have is installed from packages, and I have nothing pertaining to my graphics modules in either /boot/loader.conf or /etc/rc.conf. This was enough to give me smooth running graphics and sufficient support to use the kde5 meta-package.

So out of curiosity about this thread, I installed clinfo and clpeak, also from packages, and I also got the "no platforms found" messages. Next I found this article, and followed its advice to install a bunch more packages, and then reboot: https://beneschtech.medium.com/freebsd-and-opencl-c7055e717586

Now I get all this output from clinfo:
Code:
root@kde5:~ # clinfo
Number of platforms                               1
  Platform Name                                   Portable Computing Language
  Platform Vendor                                 The pocl project
  Platform Version                                OpenCL 1.2 pocl 1.6, Release+Asserts, LLVM 11.0.1, RELOC, SLEEF, DISTRO, POCL_DEBUG
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             POCL

  Platform Name                                   Portable Computing Language
Number of devices                                 1
  Device Name                                     AMD A6-6310 APU with AMD Radeon R4 Graphics    
  Device Vendor                                   pocl
  Device Vendor ID                                0x6c636f70
  Device Version                                  OpenCL 1.2 pocl HSTR: pthread-x86_64-portbld-freebsd13.0-btver2
  Driver Version                                  1.6
  Device OpenCL C Version                         OpenCL C 1.2 pocl
  Device Type                                     CPU
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               4
  Max clock frequency                             1800MHz
  Device Partition                                (core)
    Max number of sub-devices                     4
    Supported partition types                     equally, by counts
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             4096x4096x4096
  Max work group size                             4096
  Preferred work group size multiple (kernel)     8
  Preferred / native vector sizes                 
    char                                                16 / 16      
    short                                               16 / 16      
    int                                                  8 / 8       
    long                                                 4 / 4       
    half                                                 0 / 0        (n/a)
    float                                                8 / 8       
    double                                               4 / 4        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  Yes
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              5429256192 (5.056GiB)
  Error Correction support                        No
  Max memory allocation                           2147483648 (2GiB)
  Unified memory for Host and Device              Yes
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        2097152 (2MiB)
  Global Memory cache line size                   64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max size for 1D images from buffer            134217728 pixels
    Max 1D or 2D image array size                 2048 images
    Max 2D image size                             8192x8192 pixels
    Max 3D image size                             2048x2048x2048 pixels
    Max number of read image args                 128
    Max number of write image args                128
  Local memory type                               Global
  Local memory size                               32768 (32KiB)
  Max number of constant args                     8
  Max constant buffer size                        32768 (32KiB)
  Max size of kernel argument                     1024
  Queue properties                                
    Out-of-order execution                        Yes
    Profiling                                     Yes
  Prefer user sync for interop                    Yes
  Profiling timer resolution                      1ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            Yes
  printf() buffer size                            16777216 (16MiB)
  Built-in kernels                                (n/a)
  Device Extensions                               cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_fp64 cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp64

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  Portable Computing Language
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [POCL]
  clCreateContext(NULL, ...) [default]            Success [POCL]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 Portable Computing Language
    Device Name                                   AMD A6-6310 APU with AMD Radeon R4 Graphics    
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  Success (1)
    Platform Name                                 Portable Computing Language
    Device Name                                   AMD A6-6310 APU with AMD Radeon R4 Graphics    
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 Portable Computing Language
    Device Name                                   AMD A6-6310 APU with AMD Radeon R4 Graphics    

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.2.13
  ICD loader Profile                              OpenCL 3.0
... and I got this from clpeak:
Code:
root@kde5:~ # clpeak

Platform: Portable Computing Language
  Device: AMD A6-6310 APU with AMD Radeon R4 Graphics    
    Driver version  : 1.6 (FreeBSD)
    Compute units   : 4
    Clock frequency : 1800 MHz
48 warnings generated.

    Global memory bandwidth (GBPS)
      float   : 4.55
      float2  : 5.63
      float4  : 3.44
      float8  : 5.05
      float16 : 6.32

    Single-precision compute (GFLOPS)
      float   : 2.03
      float2  : 0.19
      float4  : 7.79
      float8  : 13.55
      float16 : 1.50

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 1.73
      double2  : 3.13
      double4  : 6.03
      double8  : 0.80
      double16 : 0.98

    Integer compute (GIOPS)
      int   : 3.87
      int2  : 5.77
      int4  : 11.70
      int8  : 18.14
      int16 : 18.24

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer         : 2.32
      enqueueReadBuffer          : 2.40
      enqueueMapBuffer(for read) : 21304.40
        memcpy from mapped ptr   : 2.49
      enqueueUnmap(after write)  : 17516.18
        memcpy to mapped ptr     : 2.50

    Kernel launch latency : 16.06 us
The clinfo output was generated pretty rapidly, but it took several minutes of maxed-out CPU activity to generate the clpeak report.

Otherwise, my system and kde5 DE seem to run much as they did before, but they maybe seem a bit more labored and busy now. This was enough to satisfy my curiosity, and, in time, I'll probably rip these changes out, or reinstall everything from scratch to the point where it was before.
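One detail in the clinfo output above is worth pulling out: the only device listed has "Device Type CPU", and clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) reports no devices. So the clpeak figures here appear to describe the CPU cores running under pocl rather than the Radeon R4 GPU. The sketch below shows one way to extract that from saved clinfo text; the function name `device_types` is hypothetical, and it assumes the human-readable layout shown in this post.

```python
def device_types(clinfo_output: str) -> list[tuple[str, str]]:
    """Pair each 'Device Name' with the 'Device Type' line that follows it."""
    devices = []
    current_name = None
    for line in clinfo_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Device Name"):
            current_name = stripped[len("Device Name"):].strip()
        elif stripped.startswith("Device Type") and current_name is not None:
            devices.append((current_name, stripped[len("Device Type"):].strip()))
            current_name = None  # wait for the next device entry
    return devices


# Three lines copied from the clinfo output in this post.
sample = """\
  Device Name                                     AMD A6-6310 APU with AMD Radeon R4 Graphics
  Device Vendor                                   pocl
  Device Type                                     CPU
"""
for name, kind in device_types(sample):
    print(f"{kind}: {name}")
```

If a GPU were being driven through OpenCL, one would expect at least one entry whose type is GPU.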
 

astyle

Daemon


Is it not (simply) that I'm using old hardware that doesn't support OpenCL?
I don't think so...

Vull: Did you get the core dumps? I read the article you quote; my compiled setup pulled all that stuff in as deps. How do your numbers compare to what AMD advertises for the hardware? As an aside, it looks like your R4 is faster at starting kernels than my RX 550... my 'Kernel launch latency' was measured at ~60 microseconds! By comparison, the author of clpeak is quoting numbers for a recent NVIDIA GPU at 7.22 microseconds, which seems to be the norm for the brand... and I'm not promoting one brand over the other, just trying to be academic in my analysis.
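To put the three latency figures side by side, here is a small arithmetic sketch. The values are the ones quoted in this thread; the 60 us figure is approximate ("~60"), so the ratios are rough, and the 16.06 us figure came from a clpeak run whose clinfo output listed the pocl device as type CPU. The dictionary labels are my own shorthand.

```python
# Kernel launch latencies in microseconds, as quoted in this thread.
latency_us = {
    "A6-6310 via pocl": 16.06,
    "RX 550": 60.0,  # posted as "~60", so approximate
    "NVIDIA (clpeak author's figure)": 7.22,
}


def ratio(slower: str, faster: str) -> float:
    """How many times longer `slower` takes than `faster`, to one decimal."""
    return round(latency_us[slower] / latency_us[faster], 1)


print(ratio("RX 550", "A6-6310 via pocl"))
print(ratio("A6-6310 via pocl", "NVIDIA (clpeak author's figure)"))
print(ratio("RX 550", "NVIDIA (clpeak author's figure)"))
```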
 

Vull

Aspiring Daemon


No coredumps that I know of, astyle, other than the drkonqi.core I inevitably seem to get with plasma5 anyway. Haven't read the AMD hype for this Lenovo laptop yet, but I can get you the model number from the BIOS in a sec. I think I got kinda lucky when I bought this used laptop cheaply; it seems much more hardware-compatible than my previous ones.

I'll just say one more thing: I was having a lot of trouble with the graphics on this thing until I (a) started using the xf86-video-ati package, and (b) took all the module loading stuff out of my loader.conf and rc.conf files. Unlike previous versions, 13.0 seems to be doing a much better job of automatically loading the right kernel modules precisely when it needs them... it seems like maybe I had been loading them too soon? But I'm not sure, just guessing. Here is what the system has been loading for me automatically... I think it's pulling in ext2fs.ko because I'm mounting a scratchpad ext4 partition with /etc/fstab. But this list of 26 modules hasn't changed at all as a result of installing all the clinfo and clpeak dependency packages:
Code:
len@kde5:/usr/home/len $ kldstat
Id Refs Address                Size Name
1   85 0xffffffff80200000  1f11ef8 kernel
2    1 0xffffffff82730000    1ae78 ext2fs.ko
3    1 0xffffffff8274b000     3218 intpm.ko
4    1 0xffffffff8274f000     2180 smbus.ko
5    1 0xffffffff82752000     4b60 ng_ubt.ko
6    3 0xffffffff82757000     aac8 netgraph.ko
7    2 0xffffffff82762000     a238 ng_hci.ko
8    1 0xffffffff8276d000     25a8 ng_bluetooth.ko
9    1 0xffffffff82770000     2340 uhid.ko
10    1 0xffffffff82773000     4350 ums.ko
11    1 0xffffffff82778000     3380 usbhid.ko
12    1 0xffffffff8277c000     31f8 hidbus.ko
13    1 0xffffffff82780000    27040 ipfw.ko
14    1 0xffffffff827a8000   150c70 radeonkms.ko
15    2 0xffffffff828f9000    7f4c8 drm.ko
16    3 0xffffffff82979000     cbc8 linuxkpi_gplv2.ko
17    1 0xffffffff82986000     2328 lindebugfs.ko
18    1 0xffffffff82989000     e778 ttm.ko
19    1 0xffffffff82998000     4358 radeon_mullins_pfp_bin.ko
20    1 0xffffffff8299d000     4358 radeon_mullins_me_bin.ko
21    1 0xffffffff829a2000     4358 radeon_mullins_ce_bin.ko
22    1 0xffffffff829a7000     6358 radeon_mullins_mec_bin.ko
23    1 0xffffffff829ae000     49d8 radeon_mullins_rlc_bin.ko
24    1 0xffffffff829b3000     3240 radeon_mullins_sdma_bin.ko
25    1 0xffffffff829b7000    3ae08 radeon_bonaire_uvd_bin.ko
26    1 0xffffffff829f2000    15280 radeon_BONAIRE_vce_bin.ko
len@kde5:/usr/home/len $

Edited to add: It's a Lenovo model G50-45 laptop.
 

astyle

Daemon


Vull, I think you're lucky to not get coredumps from running clpeak and clinfo. Mine are 1.7 GB from clpeak, and 600MB from clinfo. AMD may have the hype, true, but it also has some numbers published that are frankly a baseline for brand-new cards. If the card in your possession is putting up much worse numbers than what's published by AMD, it may well be a sign that the card is dying, and needs to be replaced. My numbers happened to be in line with the AMD-published baseline. Are yours?

13-RELEASE does a MUCH better job (than earlier versions) at RAM management, so loading all those .ko modules is now a much more tightly run ship.
 

Vull

Aspiring Daemon


I don't really understand what it is you want me to compare or where to find the AMD published baseline. I tried searching for "amd baseline AMD A6-6310 APU with AMD Radeon R4 Graphics" but either it didn't get me there or I didn't know how to find it. xD
 

grahamperrin

Son of Beastie



Thanks!

From the article:

… install first (on 13.0): pkg install drm-fbsd13-kmod drm-kmod drm_info libdrm linux-c7-libdrm ocl-icd pocl clinfo …

pkg query -e '%a = 1' %o | grep -e libdrm -e ocl-icd | sort found three of the packages already installed automatically: devel/ocl-icd, graphics/libdrm and graphics/linux-c7-libdrm.

It's unnecessary to specify drm-fbsd13-kmod for installation. The version-specific kernel module will be installed by graphics/drm-kmod.

Also, manual installation of ocl-icd is unnecessary. <https://www.freshports.org/devel/clinfo/#requiredlib>

I found graphics/drm_info already installed manually (2021-03-22). Its required libraries, listed at <https://www.freshports.org/graphics/drm_info#requiredlib>, are a likely explanation for my automated installation of graphics/libdrm.

After installing lang/pocl: without a reboot of FreeBSD, clinfo found one platform, Portable Computing Language.
 

Vull

Aspiring Daemon


Welcome. That was the same list of packages I went through.
 

grahamperrin

Son of Beastie


… running clinfo dumps core, and clpeak crashes the whole system …

… clinfo output was generated pretty rapidly, but it took several minutes of maxed-out CPU activity to generate the clpeak report. …

The same here. No dump or crash, and (at a glance) use of all four CPUs was at its peak during the run of clpeak. Subsequent runs of clinfo take less than one second.

I deleted my first post (the one that was of no help).
 

grahamperrin

Son of Beastie


… trying to find a way to fix those

I got the segmentation fault in clinfo after installing lang/clover, so I deleted clover and devel/libclc.

Deletion is not a fix, but I have a clinfo.core

Code:
root@mowa219-gjp4-8570p:/var/cache/pkg # pkg install clover
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
Updating poudriere repository catalogue...
poudriere repository is up to date.
All repositories are up to date.
The following 2 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
        clover: 20.2.3 [FreeBSD]
        libclc: 0.4.0.20190527_2 [FreeBSD]

Number of packages to be installed: 2

The process will require 541 MiB more space.
37 MiB to be downloaded.

Proceed with this action? [y/N]: y
[1/2] Fetching clover-20.2.3.txz: 100%    2 MiB   1.9MB/s    00:01   
[2/2] Fetching libclc-0.4.0.20190527_2.txz: 100%   35 MiB   6.2MB/s    00:06   
Checking integrity... done (0 conflicting)
[1/2] Installing libclc-0.4.0.20190527_2...
[1/2] Extracting libclc-0.4.0.20190527_2: 100%
[2/2] Installing clover-20.2.3...
[2/2] Extracting clover-20.2.3: 100%
=====
Message from clover-20.2.3:

--
===>   NOTICE:

This port is deprecated; you may wish to reconsider installing it:

Uses EOL Python 2.7 via devel/libclc.

It is scheduled to be removed on or after 2021-06-23.
root@mowa219-gjp4-8570p:/var/cache/pkg # cd
root@mowa219-gjp4-8570p:~ # pkg delete -q -y clover libclc
root@mowa219-gjp4-8570p:~ # pkg clean
The following package files will be deleted:
        /var/cache/pkg/py38-pystemmer-2.0.1.txz
        /var/cache/pkg/py38-sphinxcontrib-serializinghtml-1.1.4~0e0a07a80e.txz
        /var/cache/pkg/py38-sphinxcontrib-applehelp-1.0.2~c0c4030628.txz
        /var/cache/pkg/py38-sphinxcontrib-jsmath-1.0.1~945a29fb9a.txz
        /var/cache/pkg/py38-sphinx-3.5.2,1~f6e564b89b.txz
        /var/cache/pkg/py38-alabaster-0.7.12~c4a2471468.txz
        /var/cache/pkg/py38-sphinxcontrib-qthelp-1.0.3~952cecf097.txz
        /var/cache/pkg/py38-sphinx-3.5.2,1.txz
        /var/cache/pkg/py38-sphinxcontrib-jsmath-1.0.1.txz
        /var/cache/pkg/libclc-0.4.0.20190527_2~4c8488ca8b.txz
        /var/cache/pkg/py38-docutils-0.17.1~d2464dd033.txz
        /var/cache/pkg/py38-alabaster-0.7.12.txz
        /var/cache/pkg/at-spi2-atk-2.34.2.txz
        /var/cache/pkg/py38-imagesize-1.2.0.txz
        /var/cache/pkg/at-spi2-atk-2.34.2~eb0a570d2a.txz
        /var/cache/pkg/pocl-1.6_1~743e65b139.txz
        /var/cache/pkg/py38-sphinxcontrib-devhelp-1.0.2~f880a76475.txz
        /var/cache/pkg/py38-sphinxcontrib-applehelp-1.0.2.txz
        /var/cache/pkg/py38-sphinxcontrib-htmlhelp-1.0.3.txz
        /var/cache/pkg/libclc-0.4.0.20190527_2.txz
        /var/cache/pkg/py38-imagesize-1.2.0~95d8b6d4e6.txz
        /var/cache/pkg/ninja-1.10.2,2.txz
        /var/cache/pkg/hwloc2-2.4.1~13c625dfd8.txz
        /var/cache/pkg/clover-20.2.3~682488371d.txz
        /var/cache/pkg/py38-sphinxcontrib-qthelp-1.0.3.txz
        /var/cache/pkg/ninja-1.10.2,2~46523e4f6b.txz
        /var/cache/pkg/py38-docutils-0.17.1.txz
        /var/cache/pkg/pocl-1.6_1.txz
        /var/cache/pkg/py38-snowballstemmer-2.1.0.txz
        /var/cache/pkg/py38-sphinxcontrib-htmlhelp-1.0.3~2f470d1619.txz
        /var/cache/pkg/py38-sphinxcontrib-serializinghtml-1.1.4.txz
        /var/cache/pkg/py38-pystemmer-2.0.1~1ad9d96237.txz
        /var/cache/pkg/py38-sphinxcontrib-devhelp-1.0.2.txz
        /var/cache/pkg/hwloc2-2.4.1.txz
        /var/cache/pkg/clover-20.2.3.txz
        /var/cache/pkg/py38-snowballstemmer-2.1.0~369098b10f.txz
The cleanup will free 49 MiB

Proceed with cleaning the cache? [y/N]: y
Deleting files: 100%
All done
root@mowa219-gjp4-8570p:~ # date ; uname -KUv
Sun Jun 13 01:02:44 BST 2021
FreeBSD 14.0-CURRENT #98 main-n247326-2349cda44fe: Sat Jun 12 08:19:48 BST 2021     root@mowa219-gjp4-8570p:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG  1400021 1400021
root@mowa219-gjp4-8570p:~ #
 

Vull

Aspiring Daemon

Reaction score: 517
Messages: 821

The same here. No dump or crash, and (at a glance) use of all four CPUs was at its peak during the run of clpeak. Subsequent runs of clinfo take less than one second.

I deleted my first post (the one that was of no help).
I'm running clpeak again now to time it, and hoping it doesn't smoke my board. It has passive cooling and the temperature here is about 93 °F, hah. When I ran it earlier, the system seemed slow afterwards, but those symptoms soon went away, so maybe it was just overheated and worn out? At any rate I'll probably give this whole thing up after this second run; this is really not my bailiwick.

It just finished, and took about 18 minutes for clpeak to run from start to finish.

Edited to add: results from 2nd run were very close to those from the first run:
Code:
root@kde5:~ # clpeak

Platform: Portable Computing Language
  Device: AMD A6-6310 APU with AMD Radeon R4 Graphics    
    Driver version  : 1.6 (FreeBSD)
    Compute units   : 4
    Clock frequency : 1800 MHz

    Global memory bandwidth (GBPS)
      float   : 4.59
      float2  : 5.59
      float4  : 3.46
      float8  : 5.02
      float16 : 6.28

    Single-precision compute (GFLOPS)
      float   : 2.02
      float2  : 0.20
      float4  : 7.79
      float8  : 11.98
      float16 : 1.38

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 1.64
      double2  : 3.45
      double4  : 6.08
      double8  : 0.79
      double16 : 0.96

    Integer compute (GIOPS)
      int   : 3.60
      int2  : 5.73
      int4  : 11.55
      int8  : 17.31
      int16 : 17.11

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer         : 1.93
      enqueueReadBuffer          : 2.07
      enqueueMapBuffer(for read) : 21389.28
        memcpy from mapped ptr   : 1.96
      enqueueUnmap(after write)  : 17179.87
        memcpy to mapped ptr     : 1.95

    Kernel launch latency : 16.40 us

root@kde5:~ #
 

astyle

Daemon

Reaction score: 634
Messages: 1,439

I don't really understand what it is you want me to compare or where to find the AMD published baseline. I tried searching for "amd baseline AMD A6-6310 APU with AMD Radeon R4 Graphics" but either it didn't get me there or I didn't know how to find it. xD
Found yours on amd.com: Radeon R4 with A6-6310 APU. I did have to use the search function on amd.com to get there. For comparison, here are the specs for my RX 550 4GB: the 1.2 TFLOPS listed there is in line with what clpeak reported for my single-precision compute, float (not float2 or float4).

The point was to show what baseline numbers look like - it won't always be labeled 'baseline', but the numbers on amd.com can be assumed to be baseline numbers - what you can expect from a brand-new card.
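
For what it's worth, a spec-sheet number like that can be sanity-checked with back-of-envelope arithmetic: peak single-precision GFLOPS is roughly shaders × 2 × clock in GHz. This is only a rough sketch. The 512 stream processors and 1183 MHz boost clock are my assumptions for an RX 550 (not taken from this thread, so check the spec page for your exact card), it assumes one fused multiply-add (2 FLOPs) per shader per clock, and `peak_gflops` is a made-up helper name.

```sh
#!/bin/sh
# peak_gflops SHADERS CLOCK_MHZ
# Hypothetical helper: theoretical FP32 peak in GFLOPS,
# assuming one FMA (counted as 2 FLOPs) per shader per clock.
peak_gflops() {
    awk -v s="$1" -v mhz="$2" 'BEGIN { printf "%.1f\n", s * 2 * mhz / 1000 }'
}

# Assumed RX 550 figures: 512 shaders, 1183 MHz boost clock.
peak_gflops 512 1183
```

With those assumed inputs it prints 1211.4, i.e. about 1.2 TFLOPS. Note that the clpeak run posted earlier in the thread appears to have gone through the Portable Computing Language platform on the CPU cores, so a GPU spec-sheet figure is probably not the right baseline for those numbers.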
 

grahamperrin

Son of Beastie

Reaction score: 822
Messages: 2,647

<https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240761#c5> clearly I don't know what I'm doing with gdb(1) :) – I so rarely need to touch it.

gdb file clinfo.core is saner but still, nothing useful (below) in the backtrace.

There's already a more useful backtrace at <https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240761#c4> so I'm not bothered by mine.

Code:
% sudo pkg install -q -y clover
grahamperrin's password:
=====
Message from clover-20.2.3:

--
===>   NOTICE:

This port is deprecated; you may wish to reconsider installing it:

Uses EOL Python 2.7 via devel/libclc.

It is scheduled to be removed on or after 2021-06-23.
% rm clinfo.core
% clinfo > /dev/null
Segmentation fault (core dumped)
% gdb file clinfo.core
GNU gdb (GDB) 10.2 [GDB v10.2 for FreeBSD]
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd14.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from file...
Reading symbols from /usr/lib/debug//usr/bin/file.debug...

warning: core file may not match specified executable file.
[New LWP 107531]
[New LWP 131359]
[New LWP 131360]
[New LWP 131361]
[New LWP 131362]
Core was generated by `clinfo'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00000008107d5f00 in ?? ()
[Current thread is 1 (LWP 107531)]
(gdb) bt
#0  0x00000008107d5f00 in ?? ()
#1  0x00000008003e0524 in ?? ()
#2  0x00007fffffffe388 in ?? ()
#3  0x0000000800b35a00 in ?? ()
#4  0x0000000000000000 in ?? ()
(gdb) q
% rm clinfo.core
% sudo pkg delete -q -y clover libclc && sudo pkg clean -q -y
%
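
A note on the transcript above: `gdb file clinfo.core` makes gdb treat `file` as the executable, which is why it reads symbols from /usr/bin/file and warns that the core may not match. gdb expects the program that crashed as the first argument and the core file as the second. A minimal sketch of that, assuming clinfo is in PATH; `gdb_bt_cmd` is a hypothetical helper name, and it only prints the command so it can be checked before running:

```sh
#!/bin/sh
# gdb_bt_cmd PROGRAM CORE
# Hypothetical helper (not part of gdb or FreeBSD): print a gdb command
# line that produces a backtrace from CORE using the right executable.
gdb_bt_cmd() {
    # Resolve PROGRAM through PATH to the binary that actually ran.
    prog=$(command -v "$1") || {
        echo "gdb_bt_cmd: $1 not found in PATH" >&2
        return 1
    }
    # Argument order is: gdb [options] EXECUTABLE COREFILE
    printf 'gdb --batch -ex bt %s %s\n' "$prog" "$2"
}

# Example (run the printed command by hand):
#   gdb_bt_cmd clinfo clinfo.core
```

Even with the right executable, frames inside the OpenCL libraries may still show as `??` unless debug symbols for them are installed.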
 

Vull

Aspiring Daemon

Reaction score: 517
Messages: 821

Found yours on amd.com: Radeon R4 with A6-6310 APU. I did have to use the search function on amd.com to get there. For comparison, here are the specs for my RX 550 4GB: the 1.2 TFLOPS listed there is in line with what clpeak reported for my single-precision compute, float (not float2 or float4).

The point was to show what baseline numbers look like - it won't always be labeled 'baseline', but the numbers on amd.com can be assumed to be baseline numbers - what you can expect from a brand-new card.
For my limited purposes it's way more than fast enough. I like this laptop.
 