Leaving FreeBSD with a broken heart

freebsd-nvidia-cuda ... problems.
Can FreeBSD not use CUDA for compute at all with NVIDIA GPUs today?

FreeBSD might have been pretty cool back in the GPU coin-mining days (or at least when I was into it :p); a generally lighter base OS with better CPU performance would have been appealing for quick set-ups.
 
Why are we over 50 posts in this thread? OP decides FreeBSD doesn't work for them, ok fine. Don't use it. Why be dramatic about "leaving FreeBSD"?

I've never understood threads like this. If one decides that a Ford automobile does not meet their needs, does one announce "I'm leaving Ford for Chevy?" No, you just go buy a new car.
 
I could just as well say that I left all other car brands for BMW :P
 
As mentioned in this thread, the right combination of FreeBSD drivers and Linux libs lets you run Linux CUDA binaries in the Linuxulator.

Building CUDA programs in the Linuxulator is probably no problem.
This is why (even though I don't use it myself; I installed it just for confirmation) I keep x11/linux-nvidia-libs* in sync with the corresponding x11/nvidia-driver* whenever I file a PR or open a review to upgrade x11/nvidia-driver*, together with graphics/nvidia-drm*-kmod*, x11/nvidia-settings and x11/nvidia-xconfig.
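As a concrete illustration, a minimal sketch of that setup on the host (package names are the usual ones from the ports tree; the key point is that the native driver and the Linux libs must come from the same package set so their versions match):

```shell
# Enable the Linuxulator (64-bit Linux binary compatibility) and start it
sysrc linux_enable="YES"
service linux start

# Install the native kernel driver and the matching Linux userland
# libraries; installing both in one transaction keeps their versions
# in lockstep, which is what running Linux CUDA binaries depends on.
pkg install nvidia-driver linux-nvidia-libs

# Load the driver now and on every boot
kldload nvidia-modeset
sysrc kld_list+="nvidia-modeset"
```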
 
The only NVIDIA CUDA workload I could get to work was TensorFlow under Debian, plus some Ollama models on FreeBSD.
Strangely, everything else fails.
I have not been able to get scikit-learn or torch to use NVIDIA CUDA on any OS.
 
Good morning FreeBSD community!

I am traditionally a Linux user and moved to FreeBSD a little more than a year ago, and man.... I loved it! It was one of those moments when you ask yourself "how have I lived until now without it?".
But it is with a broken heart that I am planning to move back to Linux, as I currently have an urgent need to build up my ML/AI skills, and the lack of CUDA support from NVIDIA
has been an obstacle that is difficult to work around. I thought hard about dual boots and VMs, but my hardware has limited capacity to start with.

Anyhow! I may be leaving, but I joined this forum to stay close to FreeBSD, looking forward to making a comeback!

Cheers!
I currently have to use Windows. (We are moving, and initially will have just our laptops, and I am having problems with hardware support on the laptop.) But since I have moved almost completely to open-source apps, moving back and forth between FreeBSD and Windows is not so bad.

I have an MSI CUBI 5, a small, fanless computer, running FreeBSD. It will be my daily driver once we are settled in, and I use it from time to time to stay in practice.

I also want to use CUDA, in my case parallel processing with CUDA Fortran. For that I will need Linux, since it does not run on FreeBSD and, IIRC, does not run on Windows. That will be on my big, power-hungry Windows machine (the one with the NVIDIA GPU), which will be dual booting Linux.

That may be your answer, if you have the resources: a small FreeBSD computer for day to day stuff, and a Linux system with the NVIDIA GPU for ML stuff.
 
I'm a migrant from a Devuan Linux main installation. I've transitioned almost completely to FreeBSD, and on the FreeBSD side, everything mostly works without a hitch. I have already been using AMD GPUs because I know what kind of crap NVIDIA has been pulling with Linux, and surprise surprise, same story here.

This caught my interest because I recently installed FreeBSD on an old Dell Latitude D830. It's got an integrated legacy NVIDIA card.
I've been working my butt off trying to get my installation to the point where I can run Linux and Windows games, and ultimately my goal is to port or make a shim to run Monado ( the VR runtime ) on FreeBSD!

Broadly speaking, having an updated Mesa is pivotal to a lot of this, because the Mesa developers move fast, expanding support every month even for older graphics architectures and uplifting recent cards to newer and newer Vulkan and OpenGL versions. Great to see, but without updated ports, we can't use it.

I need to dip my toes into making proper pkgs, and Mesa is one of the first things to target.
On the linuxulator side of the house, I want to make a BSD-style ports tree to build Linux software, so that it can be maintained in version lockstep with BSD.

I need that in order to allow graphics drivers to cross the boundary between the Linuxulator environment and the host system, and I am now fairly sure that making a ports tree will actually be *less* work than wrangling an upstream distribution into a chroot, given that I keep having to recompile Mesa, LLVM and friends by hand in those chroots.

If there's a way to get CUDA on FreeBSD native, in the limited time I have to wrangle all this, hopefully I can squeeze it in. But I'm here, and I'm feeling it too!
 
On the linuxulator side of the house, I want to make a BSD-style ports tree to build Linux software, so that it can be maintained in version lockstep with BSD.
You may want to look into Gentoo's Portage, they do exactly that.

If there's a way to get CUDA on FreeBSD native, in the limited time I have to wrangle all this, hopefully I can squeeze it in. But I'm here, and I'm feeling it too!
Good luck! I did try getting CUDA to work over the last few weekends and ended up giving up, though I did not give it my best effort. In the thread I wrote about it, AlfredoLlaquet posted a Gist of instructions authored by someone who reports recent success; you may find it helpful.
 
You'll need x11/linux-nvidia-libs, devel/libepoll-shim, emulators/libc6-shim and maybe science/linux-ai-ml-env to use CUDA in the Linuxulator.
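For reference, a sketch of pulling those in with pkg(8), assuming the binary package names match the ports origins (science/linux-ai-ml-env is only needed if you want its bundled AI/ML userland):

```shell
# Linux NVIDIA userland plus the shims that let Linux binaries talk
# to the FreeBSD host driver
pkg install linux-nvidia-libs libepoll-shim libc6-shim linux-ai-ml-env

# nv-sglrun comes from libc6-shim; it preloads the shim so a Linux
# binary such as nvidia-smi can run against the host driver
which nv-sglrun
```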

With all of the above installed (using the -devel variants of the NVIDIA ports, in my case),
Code:
% nv-sglrun nvidia-smi   
/usr/local/lib/libc6-shim/libc6.so: shim init
Fri Mar 13 21:48:34 2026       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.142                Driver Version: 580.142        CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA RTX A400                Off |   00000000:01:00.0  On |                  N/A |
| 30%   56C    P5            N/A  /   50W |    1052MiB /   4094MiB |     21%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

And with benchmarks/clpeak installed additionally,
Code:
% nv-sglrun clpeak   
/usr/local/lib/libc6-shim/libc6.so: shim init

Platform: NVIDIA CUDA
  Device: NVIDIA RTX A400
    Driver version  : 580.142 (FreeBSD)
    Compute units   : 6
    Clock frequency : 1762 MHz

    Global memory bandwidth (GBPS)
      float   : 80.31
      float2  : 82.53
      float4  : 83.70
      float8  : 84.28
      float16 : 85.46

    Single-precision compute (GFLOPS)
      float   : 2477.12
      float2  : 2632.15
      float4  : 2486.82
      float8  : 2483.61
      float16 : 2519.08

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 42.73
      double2  : 41.98
      double4  : 42.12
      double8  : 40.04
      double16 : 36.32

    Integer compute (GIOPS)
      int   : 1310.25
      int2  : 1327.29
      int4  : 940.59
      int8  : 916.97
      int16 : 1111.14

    Integer compute Fast 24bit (GIOPS)
      int   : 1278.67
      int2  : 1222.75
      int4  : 902.45
      int8  : 904.22
      int16 : 1236.72

    Integer char (8bit) compute (GIOPS)
      char   : 1169.36
      char2  : 1267.38
      char4  : 1133.21
      char8  : 1008.05
      char16 : 934.48

    Integer short (16bit) compute (GIOPS)
      short   : 1180.83
      short2  : 1254.91
      short4  : 1148.93
      short8  : 1051.33
      short16 : 905.25

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 7.79
      enqueueReadBuffer               : 8.59
      enqueueWriteBuffer non-blocking : 7.01
      enqueueReadBuffer non-blocking  : 5.00
      enqueueMapBuffer(for read)      : 10.78
        memcpy from mapped ptr        : 6.40
      enqueueUnmap(after write)       : 12.79
        memcpy to mapped ptr          : 6.23

    Kernel launch latency : 12.79 us
 
Are you running nv-sglrun nvidia-smi and nv-sglrun clpeak from the host FreeBSD system, or chrooted in the Linux compat directory?
 
Yeah, maybe my first mistake during my attempt was trying to go with a Linux distro I was familiar with (I have never used any from the Red Hat family).
 