NVIDIA Open GPU Kernel Modules Version
595.58.03
Please confirm this issue does not happen with the proprietary driver (of the same version). This issue tracker is only for bugs specific to the open kernel driver.
Operating System and Version
Ubuntu 26.04 LTS
Kernel Release
7.0.0-14-generic
Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels.
Hardware: GPU
3080 Ti
Describe the bug
Unable to unbind NVIDIA driver from unused GPU
I confirmed that it is unused with the following commands:
- lsof /dev/nvidia*
- fuser -av /dev/nvidia*
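As a cross-check on lsof and fuser, the same question can be asked of /proc directly. This is only a minimal sketch: `list_holders` is a hypothetical helper name, it needs root to see other users' file descriptors, and it only finds userspace file descriptors, not memory mappings or references held inside the kernel.

```shell
# list_holders PREFIX [PROC_ROOT]
# Print "PID TARGET" for every open file descriptor whose symlink target
# starts with PREFIX. PROC_ROOT defaults to /proc and exists only so the
# function can be pointed at a different tree.
list_holders() {
    local prefix="$1" proc_root="${2:-/proc}" fd target pid
    for fd in "$proc_root"/[0-9]*/fd/*; do
        # Unreadable entries (or an unmatched glob) are skipped silently.
        target="$(readlink "$fd" 2>/dev/null)" || continue
        case "$target" in
            "$prefix"*)
                pid="${fd#"$proc_root"/}"   # strip the leading PROC_ROOT/
                pid="${pid%%/*}"            # keep only the PID component
                printf '%s %s\n' "$pid" "$target"
                ;;
        esac
    done
}
```

Running `list_holders /dev/nvidia` as root should print nothing if no process has a device node open.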
To Reproduce
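# $gpu and $aud must already hold the PCI addresses of the GPU and its audio
# function (the values are not given in this report).
gpu_vd="$(cat /sys/bus/pci/devices/$gpu/vendor) $(cat /sys/bus/pci/devices/$gpu/device)"
aud_vd="$(cat /sys/bus/pci/devices/$aud/vendor) $(cat /sys/bus/pci/devices/$aud/device)"
# Unbind both functions from their current drivers.
echo "$gpu" | sudo tee "/sys/bus/pci/devices/$gpu/driver/unbind"
echo "$aud" | sudo tee "/sys/bus/pci/devices/$aud/driver/unbind"
# Register the vendor/device IDs with vfio-pci so that it claims the devices.
echo "$gpu_vd" | sudo tee /sys/bus/pci/drivers/vfio-pci/new_id
echo "$aud_vd" | sudo tee /sys/bus/pci/drivers/vfio-pci/new_id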
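To see which driver, if any, holds a device before and after the steps above, the `driver` symlink in sysfs can be read. This is a minimal sketch; `current_driver` is a hypothetical helper name, and the optional second argument exists only so the function can be pointed at a different sysfs root.

```shell
# current_driver PCI_ADDR [SYSFS_ROOT]
# Print the name of the driver bound to the PCI device, or "(none)" if the
# device has no driver symlink.
current_driver() {
    local dev="$1" root="${2:-/sys}" link
    link="$root/bus/pci/devices/$dev/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver's directory name.
        basename "$(readlink "$link")"
    else
        echo "(none)"
    fi
}
```

For example, `current_driver "$gpu"` would be expected to print `nvidia` before the unbind and `vfio-pci` once the rebind has succeeded.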
The hang occurs when opening some apps. Note that lsof and fuser both report no open file handles, whether from nvidia-persistenced, nvtop, or btop. The unbinding process then hangs and never exits.
nvidia_drm is not loaded, and none of the NVIDIA GPUs is driving a display (the desktop is KDE rather than GNOME on Ubuntu).
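When the hang happens, the stuck writers can be located by looking for tasks in uninterruptible sleep. This is a minimal sketch under the assumption that the hung `tee` processes show up in state `D`; `blocked_tasks` is a hypothetical helper name, and only the first word of each task name is printed.

```shell
# blocked_tasks [PROC_ROOT]
# Print "PID NAME" for every task whose status file reports state D
# (uninterruptible sleep). PROC_ROOT defaults to /proc.
blocked_tasks() {
    local proc_root="${1:-/proc}" status
    for status in "$proc_root"/[0-9]*/status; do
        # Tasks can exit between the glob and the read, so skip missing files.
        [ -r "$status" ] || continue
        awk '
            /^Name:/  { name = $2 }
            /^State:/ { state = $2 }
            /^Pid:/   { pid = $2 }
            END { if (state == "D") print pid, name }
        ' "$status"
    done
}
```

For each PID reported, reading `/proc/<pid>/stack` as root shows the kernel stack of that task on kernels that expose the file.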
Bug Incidence
Sometimes
nvidia-bug-report.log.gz
kernel: ? irqentry_exit+0x97/0x5a0
kernel: ? srso_alias_return_thunk+0x5/0xfbef5
kernel: ? arch_exit_to_user_mode_prepare.isra.0+0xd/0x100
kernel: ? srso_alias_return_thunk+0x5/0xfbef5
kernel: ? srso_alias_return_thunk+0x5/0xfbef5
kernel: ? handle_mm_fault+0x1c0/0x2e0
kernel: ? srso_alias_return_thunk+0x5/0xfbef5
kernel: ? count_memcg_events+0x103/0x250
kernel: ? srso_alias_return_thunk+0x5/0xfbef5
kernel: ? __handle_mm_fault+0x493/0x720
kernel: ? srso_alias_return_thunk+0x5/0xfbef5
kernel: ? do_syscall_64+0x150/0x5a0
kernel: ? srso_alias_return_thunk+0x5/0xfbef5
kernel: ? arch_exit_to_user_mode_prepare.isra.0+0xd/0xe0
kernel: ? srso_alias_return_thunk+0x5/0xfbef5
kernel: ? __audit_syscall_exit+0x36/0x120
kernel: ? srso_alias_return_thunk+0x5/0xfbef5
kernel: ? ksys_read+0xc6/0xf0
kernel: ? srso_alias_return_thunk+0x5/0xfbef5
kernel: ? lruvec_stat_mod_folio+0x8d/0x100
kernel: ? vfs_read+0x364/0x3a0
kernel: ? srso_alias_return_thunk+0x5/0xfbef5
kernel: ? rw_verify_area+0x57/0x180
kernel: ? srso_alias_return_thunk+0x5/0xfbef5
kernel: ? do_syscall_64+0x150/0x5a0
kernel: ? srso_alias_return_thunk+0x5/0xfbef5
kernel: ? arch_exit_to_user_mode_prepare.isra.0+0xd/0xe0
kernel: ? srso_alias_return_thunk+0x5/0xfbef5
kernel: ? __audit_syscall_exit+0x36/0x120
kernel: ? srso_alias_return_thunk+0x5/0xfbef5
kernel: ? ksys_write+0x71/0xf0
kernel: ? srso_alias_return_thunk+0x5/0xfbef5
kernel: ? vfs_write+0x25b/0x490
kernel: do_syscall_64+0x115/0x5a0
kernel: x64_sys_call+0x22f/0x2390
kernel: __x64_sys_write+0x19/0x30
kernel: ksys_write+0x71/0xf0
kernel: vfs_write+0x25b/0x490
kernel: kernfs_fop_write_iter+0x161/0x210
kernel: sysfs_kf_write+0x74/0x90
kernel: drv_attr_store+0x24/0x50
kernel: new_id_store+0xf4/0x1f0
kernel: pci_add_dynid+0xe6/0x110
kernel: driver_attach+0x1e/0x30
kernel: bus_for_each_dev+0x8a/0xe0
kernel: __driver_attach+0xe4/0x250
kernel: mutex_lock+0x3b/0x50
kernel: __mutex_lock_slowpath+0x13/0x20
kernel: ? __pfx___driver_attach+0x10/0x10
kernel: ? simple_strntoull+0x8c/0xa0
kernel: ? select_task_rq+0x91/0x100
kernel: __mutex_lock.constprop.0+0x550/0xaf0
kernel: schedule_preempt_disabled+0x15/0x30
kernel: schedule+0x27/0x90
kernel: __schedule+0x2b2/0x630
kernel: <TASK>
kernel: ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
kernel: ? exc_page_fault+0x94/0x1e0
kernel: ? srso_alias_return_thunk+0x5/0xfbef5
kernel: ? do_syscall_64+0x115/0x5a0
kernel: ? x64_sys_call+0x22f/0x2390
kernel: ? __x64_sys_write+0x19/0x30
kernel: ? ksys_write+0x71/0xf0
kernel: ? vfs_write+0x25b/0x490
kernel: ? kernfs_fop_write_iter+0x161/0x210
kernel: ? sysfs_kf_write+0x74/0x90
kernel: ? drv_attr_store+0x24/0x50
kernel: ? unbind_store+0xaf/0xc0
kernel: ? device_driver_detach+0x14/0x20
kernel: ? bus_find_device+0xb0/0xf0
kernel: ? device_release_driver_internal+0x1fb/0x260
kernel: ? device_remove+0x43/0x80
kernel: ? pci_device_remove+0x4b/0xc0
kernel: ? nv_pci_remove+0x52/0x80 [nvidia]
kernel: ? nv_pci_remove_helper+0x3e9/0x500 [nvidia]
kernel: ? os_delay+0xfb/0x250 [nvidia]
kernel: ? __pfx_process_timeout+0x10/0x10
kernel: ? schedule_timeout+0x88/0x110
kernel: ? srso_alias_return_thunk+0x5/0xfbef5
kernel: ? schedule+0x27/0x90
kernel: ? timer_delete_sync+0x5c/0xb0
kernel: __schedule+0x175/0x630
kernel: ? raw_spin_rq_lock_nested+0x21/0xa0
More Info
No response