Solved: running the quarterly branch now, using NVIDIA 570.124.04 and CUDA 12.8.

Can you explain what worked and what didn't work for you?
When the system boots and has already loaded the 570.124 nvidia driver, it refuses to load the 570.144 nvidia-drm module and complains that the two modules are incompatible. You need to run "dmesg -a | less" to see this error message.
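
For anyone hitting the same thing, a quick way to spot that kind of mismatch (just a sketch; package names may differ slightly on your system) is to compare what the kernel actually loaded against what is installed:

Code:
# kldstat | grep -i nvidia
# dmesg -a | grep -i nvidia
# pkg info -x nvidia

If dmesg still reports the old version while pkg info shows the new one, the stale module from the previous driver is still loaded.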
 
It didn't happen to me. On 14.2 I just upgraded from 570.124 to 570.144 without any complaints from the system. What did I do? This easy procedure:

Code:
# cd /usr/ports/x11/linux-nvidia-libs
# nano Makefile
DISTVERSION?=   570.144

# make deinstall
# make makesum
# make
# make install

# cd /usr/ports/x11/nvidia-driver
# nano Makefile
DISTVERSION?=   570.144

# make deinstall
# make makesum
# make
# make install

REBOOT

marietto# dmesg -a | grep drm
[drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
[drm] Initialized nvidia-drm 0.0.0 20160202 for nvidia0 on minor 0

marietto# nvidia-smi
Thu May 15 17:17:35 2025      
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.144                Driver Version: 570.144        CUDA Version: N/A      |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1060 3GB    Off |   00000000:01:00.0  On |                  N/A |
| 56%   39C    P8              9W /  120W |     270MiB /   3072MiB |      1%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                        
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            5174      G   /usr/local/libexec/Xorg                 104MiB |
|    0   N/A  N/A            5324      G   firefox                                 161MiB |
+-----------------------------------------------------------------------------------------+

marietto# nv-sglrun nvidia-smi
/usr/local/lib/libc6-shim/libc6.so: shim init
Thu May 15 17:18:17 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.144                Driver Version: 570.144        CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1060 3GB    Off |   00000000:01:00.0  On |                  N/A |
| 56%   39C    P8              7W /  120W |     252MiB /   3072MiB |      3%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
 

Yes, that's what I should have done, but initially I did not replace the 570.124 nvidia-driver with the 570.144 version, so there was a version mismatch.
 
I'm closing this thread.

I don't think it's a good idea. The problem is unfixed, so "we" should keep going (at least I will), trying and trying until we find a solution to achieve the goal. That's our passion and our "mission", the power to serve.

Do you want to take a look at this thread?


especially where he says:

Same environment but setting visible devices to only 1 works fine (e.g. export CUDA_VISIBLE_DEVICES=0; python train.py …). Seems to error out in DDP? I can get past the original error by specifying up to 4 GPUs in CUDA_VISIBLE_DEVICES, but then I get a "CUDA error: an illegal memory access was encountered" error for 2 or more GPUs w/DDP.

Smoke test (another data point during debugging):
export CUDA_VISIBLE_DEVICES=0,1,2,3; python -c 'import torch; torch.cuda.is_available()'
works fine, but then adding more than 4 GPUs fails:
export CUDA_VISIBLE_DEVICES=0,1,2,3,4; python -c 'import torch; torch.cuda.is_available()'
/home/adaboost/miniconda3/envs/mustango/lib/python3.10/site-packages/torch/cuda/__init__.py:181: UserWarning: CUDA initialization: CUDA driver initialization failed, you might not have a CUDA gpu. (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:109.)
return torch._C._cuda_getDeviceCount() > 0

This is interesting:

Code:
export CUDA_VISIBLE_DEVICES=0,1,2,3; python -c 'import torch; torch.cuda.is_available()'
works fine

So, this could be the reason for the error, and point towards the solution...
 
I now believe that it's /boot/modules/dmabuf.ko that is somehow not getting rebuilt when rebuilding the two /usr/ports/graphics packages:

nvidia-drm-61-kmod
nvidia-drm-kmod

When installing from scratch, dmabuf.ko finds its way to /boot/modules,
but where does it come from?
 
% pkg which -o /boot/modules/dmabuf.ko
/boot/modules/dmabuf.ko was installed by package graphics/drm-61-kmod
Thanks. Yes, that's what it says on my machine too,
but for some reason reinstalling a new version of drm-61-kmod (the update from 570.124 to 570.144)
does not reinstall dmabuf.ko.
This happened on one of my machines that I have since reinstalled to 14.3-STABLE, so I can't test it again.
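
If anyone still has a machine showing the problem, a sketch of what could be tried (assuming, as the pkg which output above says, that dmabuf.ko belongs to graphics/drm-61-kmod) is to force-reinstall that package and check whether the file comes back:

Code:
# pkg which -o /boot/modules/dmabuf.ko
# pkg install -f drm-61-kmod
# ls -l /boot/modules/dmabuf.ko

or, from ports:

Code:
# cd /usr/ports/graphics/drm-61-kmod
# make deinstall reinstall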
 
I still haven't tested this argument:

export CUDA_VISIBLE_DEVICES=0,1,2,3;

but if it works, isn't adding that prefix alone enough, instead of making some heavy changes to a system that works great?
 
Not sure, as I never tried CUDA, but be careful about which shell you're configuring.
The syntax differs depending on the shell; the syntax you noted is for /bin/sh or any other POSIX-compliant sh.
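
For example (just to illustrate the difference, using the same variable discussed above):

Code:
# POSIX sh / bash:
export CUDA_VISIBLE_DEVICES=0,1,2,3

# csh / tcsh:
setenv CUDA_VISIBLE_DEVICES 0,1,2,3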
 
It does not work:
Code:
(pytorch) marietto# export CUDA_VISIBLE_DEVICES=0,1; LD_PRELOAD="/mnt/da0p2/CG/Tools/Stable-Diffusion/dummy-uvm.so" python3 -c 'import torch; torch.cuda.is_available()'

/home/marietto/miniconda3/envs/pytorch/lib/python3.12/site-packages/torch/cuda/__init__.py:181: UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 304: OS call failed or operation not supported on this OS (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:109.)
  return torch._C._cuda_getDeviceCount() > 0
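
One more thing that might be worth trying (just an idea, not tested here): run the same one-liner through nv-sglrun, the libc6-shim wrapper that already makes nvidia-smi report CUDA 12.8 above, to rule the shim in or out:

Code:
# export CUDA_VISIBLE_DEVICES=0,1
# nv-sglrun python3 -c 'import torch; print(torch.cuda.is_available())'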
 

A good starting point for being able to debug what's broken between pytorch, CUDA and the nvidia driver...


NVIDIA CUDA Support

If you want to compile pytorch with CUDA support, select a supported version of CUDA from our support matrix, then install the following:



Note: You could refer to the cuDNN Support Matrix for cuDNN versions with the various supported CUDA, CUDA driver and NVIDIA hardware


If you want to disable CUDA support, export the environment variable USE_CUDA=0. Other potentially useful environment variables may be found in setup.py.


If you are building for NVIDIA's Jetson platforms (Jetson Nano, TX1, TX2, AGX Xavier), instructions to install PyTorch for Jetson Nano are available here:

Source:

 
I downloaded the CUDA toolkit
cuda-repo-ubuntu2404-13-0-local_13.0.0-580.65.06-1_amd64.deb
and unpacked all packages into the same folder.

nvidia-smi fails when run as root:

# usr/bin/nvidia-smi
Failed to initialize NVML: GPU access blocked by the operating system

The latest nvidia-driver-580.76.05.1403505 is installed on 14.3 and the NVidia card works fine.

What might be causing this error, and how can I make it run?
 
Could these tutorials be useful for you?


 
Is CUDA support actually present in the FreeBSD NVidia driver for sure?

I have the latest NVidia driver and a not-too-old NVidia card that supports CUDA, but I am getting the error "Failed to initialize NVML: GPU access blocked by the operating system".

Perhaps some sysctl variable is needed?
 

The version of the nvidia driver installed inside the Linuxulator should be the same as the version of the nvidia driver installed on FreeBSD.
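
A quick way to check whether they match (a sketch; the exact package names on your system may differ slightly):

Code:
# pkg info -x '^nvidia-driver'
# pkg info -x '^linux-nvidia-libs'
# nv-sglrun nvidia-smi

The versions reported by pkg info and the Driver Version shown by the Linux nvidia-smi (run through the shim) should all be the same.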
 