Is OpenCL possible with the current Nvidia driver?

devel/ocl-icd says it will work with a non-free ICD, and I've seen people on Linux get nvidia.icd from... wherever (presumably some NVIDIA binary driver package)... so perhaps the mechanism is the same. I couldn't find any specific examples of this working, though.
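For reference, this is the kind of minimal check I'd try to see whether an ICD actually gets picked up (just a sketch, assuming the OpenCL headers and an ICD loader such as devel/ocl-icd are installed; built with something like cc list_platforms.c -o list_platforms -lOpenCL):

Code:
/* list_platforms.c -- minimal sketch: enumerate OpenCL platforms via the
 * ICD loader and print their names.  It only reports whatever ICDs the
 * loader can find; there is nothing NVIDIA-specific in it. */
#include <CL/cl.h>
#include <stdio.h>

int main(void)
{
    cl_uint n = 0;
    if (clGetPlatformIDs(0, NULL, &n) != CL_SUCCESS || n == 0) {
        puts("no OpenCL platforms found");
        return 1;
    }

    cl_platform_id ids[16];
    clGetPlatformIDs(n > 16 ? 16 : n, ids, NULL);
    for (cl_uint i = 0; i < n && i < 16; i++) {
        char name[256];
        clGetPlatformInfo(ids[i], CL_PLATFORM_NAME, sizeof(name), name, NULL);
        printf("platform %u: %s\n", i, name);
    }
    return 0;
}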
 
Nvidia's OpenCL is implemented through CUDA.
I guess that means no. Then I'll either use virtualization, or sell this Nvidia card and buy an AMD one. Using WebGL appears to be an alternative too, but I guess it is very slow.
 
WebGL has nothing to do with general-purpose GPU computing. Vulkan could be used for that purpose in theory, but you'll have to implement everything from scratch yourself, which is probably not what you are looking for.

I'm kind of curious what kind of small project you have in mind. You obviously didn't bother to do any research on the topic, so there's no reason for me to believe you actually need GPU anything.
 
Well, TensorFlow can run on the CPU too; I just have a spare GPU I haven't sold, so I thought I'd use that instead of stressing the CPU. It is a sequential pattern mining project: I want to compare classical algorithms like SPADE or PrefixSpan to machine learning.

Looks like you are not up to date either:

https://github.com/tensorflow/tfjs
A WebGL accelerated JavaScript library for training and deploying ML models.
 
Due to constant problems with my Radeon RX 580 I swapped it for an NVIDIA GTX 1060, using binary driver version 550, and all the problems are gone. Now I'm wondering how to use OpenCL/CUDA with that card; any hints welcome :-) The Graphics/OpenCL wiki (https://wiki.freebsd.org/Graphics/OpenCL) does not even mention nvidia.
 
I don't use OpenCL or CUDA directly myself, but
Code:
% pkg info -l x11/nvidia-driver | grep -n icd                                                                       
124:    /usr/local/share/vulkan/icd.d/nvidia_icd.json
% pkg info -l x11/linux-nvidia-libs | grep -n icd
2:    /compat/linux/etc/OpenCL/vendors/nvidia.icd
3:    /compat/linux/etc/vulkan/icd.d/nvidia_icd.json
%
could the above be of any help?
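If it's of any use: those .icd / .json files are what the OpenCL and Vulkan ICD loaders look up, so a quick sanity check could be (just a sketch; paths taken from the pkg listing above):

Code:
% ls /compat/linux/etc/OpenCL/vendors/
nvidia.icd
% ls /usr/local/share/vulkan/icd.d/
nvidia_icd.json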
 
I'm currently running version 565.77 of nvidia-driver by overriding the version.
The new beta branch driver, 570.86.16, turned out to require some work on the ports (X cannot find a screen with a simple override).

I'm planning to work on it, together with updates for the latest production branch driver, once I can find enough time.

The Linux version of 570.86.16 has more changes/additions, but as I'm not actually using x11/linux-nvidia-libs myself, I wouldn't be able to determine which files additionally need to be installed, and to which directory, especially the new JSON files. So, unlike usual, my next round of work won't include this part.
 
Yeah, I have Debian 12 debootstrapped and was trying to run some applications there on the binary nvidia-driver 550. It turns out Debian has all its libraries built for 535, so nothing works anymore after installing the nvidia packages. I had full 3D acceleration with AMDGPU. Also, the 535 driver from the NVIDIA website does not build on FreeBSD 14 :-)

I am a bit impressed that FreeBSD has a newer driver than Debian, although this makes the nvidia drivers self-incompatible between systems; maybe we should stick to the same versions across different systems to keep things compatible :-P
 
Also, the 535 driver from the NVIDIA website does not build on FreeBSD 14 :-)
Which minor version of 535? I've tried the ones listed below before.
  • 535.43.02
  • 535.54.03
  • 535.86.05
  • 535.98
  • 535.104.05
  • 535.113.01
  • 535.129.03
  • 535.146.02
  • 535.154.05
As seen in PR 282312, 550.127.05 of the driver contains a security fix, so using 535 should be discouraged.

If you need the 535 series of the driver anyway, do you also want graphics/nvidia-drm-[510|515|61]-kmod?
If so, since official support for it started with the 550 series of drivers, you need to obtain Austin's private distfiles corresponding to the version you want and follow his procedure. See the diff of the distinfo part in commit 71e92b26bd43763a7b82208625e628f043858fa7.

If you don't need the DRM part of the driver, you can override the version with something like the following in your /etc/make.conf.

Code:
NVIDIA_OVERRIDE_VERSION= 535.146.02

.if ${.CURDIR:M/usr/ports/x11/nvidia-driver} && defined(NVIDIA_OVERRIDE_VERSION)
  DISTVERSION=    ${NVIDIA_OVERRIDE_VERSION}
  NO_CHECKSUM=    YES
.endif

.if ${.CURDIR:M/usr/ports/x11/linux-nvidia-libs} && defined(NVIDIA_OVERRIDE_VERSION)
  DISTVERSION=    ${NVIDIA_OVERRIDE_VERSION}
  NO_CHECKSUM=    YES
.endif

Unfortunately, overriding versions like this does not work for graphics/nvidia-drm-[510|515|61]-kmod, even if the same logic is in /etc/make.conf for it. You also need to modify x11/nvidia-driver/Makefile.version to point to the wanted version; otherwise it picks the wrong version (the one pointed to in Makefile.version) and thus doesn't work. NO_CHECKSUM is still needed for the override, though.


I am a bit impressed that FreeBSD has a newer driver than Debian.
Maybe that is because of Debian's philosophy of disliking proprietary software.
But the FreeBSD port x11/nvidia-driver is already behind the official NVIDIA production branch too, as I work on it only when something is needed to support a new feature and/or beta branch of the drivers, and nobody else updates it to the latest version.

If you're OK with c7 on the Linuxulator, you can use x11/linux-nvidia-libs for Linux apps running on it. I'm not sure it works with rl9.

It has turned out that the latest beta, 570.86.16, requires some work, but I can't find enough time to investigate for now. And as I no longer use x11/linux-nvidia-libs myself, and the Linux version of the driver seems to have more changes than the FreeBSD version, x11/linux-nvidia-libs possibly won't work even after I file a PR.
 
Code:
% nv-sglrun clpeak
shim init

Platform: NVIDIA CUDA
  Device: NVIDIA GeForce GTX 1660
    Driver version  : 550.127.05 (FreeBSD)
    Compute units   : 22
    Clock frequency : 1830 MHz

    Global memory bandwidth (GBPS)
      float   : 153.98
      float2  : 160.37
      float4  : 164.72
      float8  : 160.68
      float16 : 159.13

    Single-precision compute (GFLOPS)
      float   : 5580.16
      float2  : 5519.13
      float4  : 5503.80
      float8  : 5451.26
      float16 : 5396.63

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 176.49
      double2  : 176.27
      double4  : 175.76
      double8  : 175.04
      double16 : 173.29

    Integer compute (GIOPS)
      int   : 4860.36
      int2  : 4770.71
      int4  : 4774.76
      int8  : 4799.89
      int16 : 4777.37

    Integer compute Fast 24bit (GIOPS)
      int   : 4628.80
      int2  : 4761.41
      int4  : 4770.80
      int8  : 4744.14
      int16 : 4692.37

    Integer char (8bit) compute (GIOPS)
      char   : 4041.64
      char2  : 3993.92
      char4  : 4049.56
      char8  : 4063.33
      char16 : 3389.27

    Integer short (16bit) compute (GIOPS)
      short   : 4011.38
      short2  : 3832.12
      short4  : 3959.47
      short8  : 3474.62
      short16 : 3339.47

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 12.55
      enqueueReadBuffer               : 12.75
      enqueueWriteBuffer non-blocking : 11.90
      enqueueReadBuffer non-blocking  : 12.14
      enqueueMapBuffer(for read)      : 3.88
        memcpy from mapped ptr        : 19.61
      enqueueUnmap(after write)       : 13.13
        memcpy to mapped ptr          : 19.86

    Kernel launch latency : 6.49 us
% pkg which -p nv-sglrun
/usr/local/bin/nv-sglrun was installed by package libc6-shim-20240512
% pkg info | grep nvidia
linux-nvidia-libs-550.127.05   NVidia graphics libraries and programs (Linux version)
nvidia-driver-550.127.05.1402000 NVidia graphics card binary drivers for hardware OpenGL rendering
nvidia-settings-535.146.02_1   Display Control Panel for X NVidia driver
nvidia-xconfig-525.116.04      Tool to manipulate X configuration files for the NVidia driver

AMD supports OpenCL / Clover.
Clover is entirely unusable: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19385.
 
If you are happy with "poor-man's" compute (generally misusing OpenGL for generic compute use-cases), then for desktop OpenGL (i.e. not WebGL/OpenGL ES) you can make use of compute shaders:

https://learnopengl.com/Guest-Articles/2022/Compute-Shaders/Introduction

Apparently TensorFlow does have support for this.

That said, I have not tested it personally. I am still happy with OpenGL 2.1+ "ultra-poor man's" compute (passing general data via a sampler).
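For the curious, a compute-shader dispatch looks roughly like this (again untested by me, just a sketch: it assumes an OpenGL 4.3+ context is already current and function pointers are loaded, e.g. glewInit() has already been called, and it simply doubles a buffer of floats on the GPU):

Code:
/* Minimal compute-shader sketch: double 1024 floats held in an SSBO.
 * Context creation and glewInit() are assumed to have happened already. */
#include <GL/glew.h>
#include <stdio.h>

static const char *src =
    "#version 430\n"
    "layout(local_size_x = 64) in;\n"
    "layout(std430, binding = 0) buffer Data { float v[]; };\n"
    "void main() { uint i = gl_GlobalInvocationID.x; v[i] *= 2.0; }\n";

void run_compute(void)
{
    /* Compile and link the compute shader into a program object */
    GLuint sh = glCreateShader(GL_COMPUTE_SHADER);
    glShaderSource(sh, 1, &src, NULL);
    glCompileShader(sh);
    GLuint prog = glCreateProgram();
    glAttachShader(prog, sh);
    glLinkProgram(prog);

    /* Upload input data into a shader storage buffer bound at binding = 0 */
    float data[1024];
    for (int i = 0; i < 1024; i++) data[i] = (float)i;
    GLuint ssbo;
    glGenBuffers(1, &ssbo);
    glBindBuffer(GL_SHADER_STORAGE_BUFFER, ssbo);
    glBufferData(GL_SHADER_STORAGE_BUFFER, sizeof(data), data, GL_DYNAMIC_COPY);
    glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, ssbo);

    /* Dispatch 1024 / 64 = 16 work groups */
    glUseProgram(prog);
    glDispatchCompute(1024 / 64, 1, 1);

    /* Make the shader writes visible to glGetBufferSubData, then read back */
    glMemoryBarrier(GL_BUFFER_UPDATE_BARRIER_BIT);
    glGetBufferSubData(GL_SHADER_STORAGE_BUFFER, 0, sizeof(data), data);
    printf("data[2] = %f\n", data[2]); /* expect 4.0 */
}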
 
Alright, with x11/nvidia-driver (550) installed and x11/linux-nvidia-libs built with DEFAULT_VERSIONS+=linux=rl9 set in /etc/make.conf, CUDA and OpenCL are now working with the Linux RL9 layer (the default port still uses C7). Programs that are supposed to run OpenCL/CUDA applications should be wrapped with nv-sglrun, which comes with the libc6-shim-20240512 package. For instance, nvidia-smi (bundled with the driver), which shows GPU capabilities, will not reveal CUDA until it is run as nv-sglrun nvidia-smi (something like padsp wrapping OSS audio over PulseAudio). I can now see that computations can be shortened from 6 weeks to 4 days on a relatively old NVIDIA GTX 1060 GPU :-) !!BIG THANK YOU!! :-)
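To recap the steps for anyone else trying this (a rough sketch only; package and port names as used in this thread, adjust versions to whatever is current):

Code:
# /etc/make.conf -- switch the Linuxulator userland to Rocky Linux 9
DEFAULT_VERSIONS+= linux=rl9

# driver plus the shim package that provides nv-sglrun
pkg install nvidia-driver libc6-shim

# build the Linux NVIDIA libs against the rl9 userland
cd /usr/ports/x11/linux-nvidia-libs && make install clean

# verify -- both need the wrapper to see CUDA/OpenCL
nv-sglrun nvidia-smi
nv-sglrun clpeak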

Code:
# nv-sglrun nvidia-smi
shim init
Tue Feb  4 11:16:02 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.127.05             Driver Version: 550.127.05     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1060 6GB    Off |   00000000:01:00.0  On |                  N/A |
| 29%   50C    P0             26W /  120W |    1220MiB /   6144MiB |      4%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+

Code:
# nv-sglrun clpeak
shim init

Platform: NVIDIA CUDA
  Device: NVIDIA GeForce GTX 1060 6GB
    Driver version  : 550.127.05 (FreeBSD)
    Compute units   : 10
    Clock frequency : 1708 MHz

    Global memory bandwidth (GBPS)
      float   : 138.25
      float2  : 142.37
      float4  : 147.13
      float8  : 147.19
      float16 : 98.19

    Single-precision compute (GFLOPS)
      float   : 4108.05
      float2  : 4312.85
      float4  : 4283.72
      float8  : 4249.30
      float16 : 4226.22

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 140.94
      double2  : 140.18
      double4  : 139.88
      double8  : 138.68
      double16 : 139.25

    Integer compute (GIOPS)
      int   : 1441.66
      int2  : 1421.13
      int4  : 1431.67
      int8  : 1316.01
      int16 : 1299.19

    Integer compute Fast 24bit (GIOPS)
      int   : 1428.16
      int2  : 1394.27
      int4  : 1415.95
      int8  : 1411.34
      int16 : 1388.43

    Integer char (8bit) compute (GIOPS)
      char   : 3871.67
      char2  : 4115.16
      char4  : 4133.75
      char8  : 4086.24
      char16 : 4042.09

    Integer short (16bit) compute (GIOPS)
      short   : 3798.35
      short2  : 3925.07
      short4  : 4028.50
      short8  : 4101.96
      short16 : 4036.17

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 5.14
      enqueueReadBuffer               : 5.82
      enqueueWriteBuffer non-blocking : 4.99
      enqueueReadBuffer non-blocking  : 5.46
      enqueueMapBuffer(for read)      : 5.86
        memcpy from mapped ptr        : 4.16
      enqueueUnmap(after write)       : 5.93
        memcpy to mapped ptr          : 4.18

    Kernel launch latency : 7.88 us
 
Some additional notes. (I've noted this in Comment 7 of PR 284537.)
USES= linux in a port's Makefile makes it depend on the default Linuxulator userland (currently still c7); it can be overridden to rl9 with USES= linux:rl9 on a per-port basis. This is up to the maintainer of each port.

For a system-wide configuration, you can specify DEFAULT_VERSIONS+= linux=rl9 in /etc/make.conf, as shown below. This is up to the admin of each computer.

What can be specified is defined in /usr/ports/Mk/bsd.default-versions.mk.
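For example, the two places this can be set (just a sketch of the lines mentioned above):

Code:
# system-wide, in /etc/make.conf (admin's choice)
DEFAULT_VERSIONS+= linux=rl9

# or per port, in the port's Makefile (maintainer's choice)
USES= linux:rl9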

Anyway, thanks to Comment 2 of PR 284537 by Dima Panov, it turned out that there is no need to adapt x11/linux-nvidia-libs for rl9.
 
Hi

Yes, I can get this Linux shim to work as well; VAAPI using libva-nvidia-driver also seems to work with Firefox.
This is done on a workstation with "pkg install"ed packages, no local compiling.

Code:
xxxxx@w680ace:~ $ nv-sglrun nvidia-smi
shim init
Tue Feb  4 23:30:36 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.127.05             Driver Version: 550.127.05     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4060 Ti     Off |   00000000:01:00.0  On |                  N/A |
|  0%   36C    P8              4W /  160W |     662MiB /   8188MiB |      1%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
ltu@w680ace:~ $ nv-sglrun clpeak
shim init

Platform: NVIDIA CUDA
  Device: NVIDIA GeForce RTX 4060 Ti
    Driver version  : 550.127.05 (FreeBSD)
    Compute units   : 34
    Clock frequency : 2685 MHz

    Global memory bandwidth (GBPS)
      float   : 250.84
      float2  : 258.27
      float4  : 262.82
      float8  : 265.79
      float16 : 268.13

    Single-precision compute (GFLOPS)
      float   : 23131.70
      float2  : 23055.74
      float4  : 22980.27
      float8  : 22821.29
      float16 : 22674.35

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 374.97
      double2  : 374.69
      double4  : 373.07
      double8  : 357.72
      double16 : 350.53

    Integer compute (GIOPS)
      int   : 11718.43
      int2  : 11761.22
      int4  : 11730.86
      int8  : 11276.48
      int16 : 10451.94

    Integer compute Fast 24bit (GIOPS)
      int   : 10994.94
      int2  : 11013.73
      int4  : 11461.28
      int8  : 11458.52
      int16 : 11211.60

    Integer char (8bit) compute (GIOPS)
      char   : 10321.13
      char2  : 10123.83
      char4  : 9848.45
      char8  : 8251.40
      char16 : 7871.05

    Integer short (16bit) compute (GIOPS)
      short   : 10279.86
      short2  : 9728.00
      short4  : 9967.57
      short8  : 8892.56
      short16 : 7616.51

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 8.65
      enqueueReadBuffer               : 8.45
      enqueueWriteBuffer non-blocking : 8.74
      enqueueReadBuffer non-blocking  : 9.83
      enqueueMapBuffer(for read)      : 11.74
        memcpy from mapped ptr        : 12.52
      enqueueUnmap(after write)       : 12.93
        memcpy to mapped ptr          : 12.82

    Kernel launch latency : 3.84 us


$ pkg info | grep nvidia
libva-nvidia-driver-0.0.13     NVDEC-based backend for VAAPI
linux-nvidia-libs-550.127.05   NVidia graphics libraries and programs (Linux version)
nvidia-driver-550.127.05.1401000 NVidia graphics card binary drivers for hardware OpenGL rendering
nvidia-drm-61-kmod-550.127.05.1401000_1 NVIDIA DRM Kernel Module
nvidia-drm-kmod-550.127.05     NVIDIA DRM Kernel Module
nvidia-settings-535.146.02_1   Display Control Panel for X NVidia driver
nvidia-xconfig-525.116.04      Tool to manipulate X configuration files for the NVidia driver
 
Hi everyone,
I've been trying this on my -current, and it works.
However, I can only compile OpenCL source code using Clang, and I need to use the nv-sglrun command to run the binaries.

It seems that Nvidia doesn't invest time or resources in creating a native FreeBSD port.
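In case it's useful, this is roughly what that looks like (a sketch; it assumes the OpenCL headers and the ocl-icd loader from ports are installed, and hello_cl.c is any small OpenCL host program):

Code:
% cc -I/usr/local/include -L/usr/local/lib -o hello_cl hello_cl.c -lOpenCL
% nv-sglrun ./hello_cl
shim init
...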
 
Some additional notes. (I've noted this in Comment 7 of PR 284537.)
USES= linux in a port's Makefile makes it depend on the default Linuxulator userland (currently still c7); it can be overridden to rl9 with USES= linux:rl9 on a per-port basis. This is up to the maintainer of each port.

For a system-wide configuration, you can specify DEFAULT_VERSIONS+= linux=rl9 in /etc/make.conf. This is up to the admin of each computer.

What can be specified is defined in /usr/ports/Mk/bsd.default-versions.mk.

Anyway, thanks to Comment 2 of PR 284537 by Dima Panov, it turned out that there is no need to adapt x11/linux-nvidia-libs for rl9.

I think the best way to offer x11/linux-nvidia-libs in packages would be to just add FLAVORS for c7 and rl9, so it builds for both c7 and rl9 and it's still a single port that's easy to maintain? :-)
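Something along these lines, perhaps (purely a hypothetical sketch of what the Makefile bits might look like, not an actual patch; whether it would be acceptable is up to the maintainers):

Code:
# hypothetical flavor setup for x11/linux-nvidia-libs
FLAVORS=	rl9 c7
FLAVOR?=	${FLAVORS:[1]}
USES=		linux:${FLAVOR}
# give the non-default flavor a distinct package name, e.g. linux-nvidia-libs-c7
.if ${FLAVOR} != ${FLAVORS:[1]}
PKGNAMESUFFIX=	-${FLAVOR}
.endif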
 
Hi everyone,
I've been trying this on my -current, and it works.
However, I can only compile OpenCL source code using Clang, and I need to use the nv-sglrun command to run the binaries.

Yes, exactly: nv-sglrun is a wrapper that enables an application to access the underlying hardware acceleration (something like padsp passing audio from OSS applications to PulseAudio).

It seems that Nvidia doesn't invest time or resources in creating a native FreeBSD port.

I was initially looking for that too; there's been no change in 15 years, so it's not gonna happen :-P It turns out FreeBSD's Linuxulator is already so good it can do this sort of trick :-) And from my initial experimentation it is very important that the NVIDIA driver and libs have the same version, so having all of this in ports keeps things in sync :-)
 
I think the best way to offer x11/linux-nvidia-libs in packages would be to just add FLAVORS for c7 and rl9, so it builds for both c7 and rl9 and it's still a single port that's easy to maintain? :-)
It would be quite trivial if a pkg name like rl9-linux-nvidia-libs or linux-nvidia-libs-rl9 is OK, but I'm not sure it would be accepted by the group maintainers (x11@); I'm not one of them.
Are there any ports doing such a thing? I assume none.

Note that, as far as I remember, ports named foo-c7-bar or foo-rl9-bar are parts of the upstream distributions (CentOS 7 or Rocky Linux 9), and linux-nvidia-libs is not part of them (its upstream is NVIDIA itself).
 