#1 2023-09-08 06:36:10

thangalin
Member
Registered: 2023-09-08
Posts: 10

GPU fan won't idle

Hi folks,

The fan on my NVIDIA T1000 8GB idled for over a year, spinning up only on the occasions when I ran Lizzie and Leela to analyze Go games. After I stopped the software, the GPU fan would slip back into idle. Normally, I never hear any fan running. The case is an HDPLEX Fanless PC Chassis, so there's no CPU fan, only the GPU fan.

The GPU fan activity changed after I installed EasyDiffusion:

https://easydiffusion.github.io/docs/installation/

The GPU fan will no longer return to idle even after deleting the software, upgrading the system, resetting the settings (nvidia-settings --load-config-only) and rebooting. The GPU temperature rarely goes above 60 C. However, the GPU fan is now at 2500 RPM when idle. When inside the BIOS settings, the GPU fan skyrockets, which never happened before.

The lowest I can set the fan myself is 33%, which I understand NVIDIA has done deliberately to prevent users from accidentally pooching their GPU.

$ uname -a
Linux hostname 6.4.12-arch1-1 #1 SMP PREEMPT_DYNAMIC Thu, 24 Aug 2023 00:38:14 +0000 x86_64 GNU/Linux
$ nvidia-settings --version
nvidia-settings:  version 525.60.11

I've placed a copy of the NVIDIA bug report file at: https://easyupload.io/ddgj3f

Any ideas how I can diagnose the issue and return the system back to the way it was before running EasyDiffusion?

$ nvidia-smi
Thu Sep  7 23:36:36 2023       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA T1000 8GB               On  | 00000000:0A:00.0  On |                  N/A |
| 40%   56C    P8              N/A /  50W |    562MiB /  8192MiB |     14%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A       633      G   /usr/lib/Xorg                               415MiB |
|    0   N/A  N/A       810      G   xfwm4                                         2MiB |
|    0   N/A  N/A      1009      G   /usr/lib/thunderbird/thunderbird              8MiB |
|    0   N/A  N/A      1256      G   /usr/lib/firefox/firefox                    133MiB |
+---------------------------------------------------------------------------------------+
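
To watch only the relevant fields over time instead of the full table, one option is to poll nvidia-smi's CSV query output and flag samples where the card is idle and cool but the fan is still running fast. This is a minimal sketch, not a definitive diagnostic: the function name flag_idle_fan and the 35 % / 60 C thresholds are illustrative assumptions, not NVIDIA limits. The query fields fan.speed, temperature.gpu and pstate are standard nvidia-smi properties.

```shell
#!/bin/sh
# Flag samples where the GPU is in the low-power P8 state and cool,
# yet the fan is still above a chosen threshold.
# Input on stdin: CSV lines of the form "fan %, temp C, pstate".
# The 35 % and 60 C thresholds are assumptions for illustration only.
flag_idle_fan() {
    awk -F', *' -v max_fan=35 -v max_temp=60 '
        $3 == "P8" && $2 < max_temp && $1 > max_fan {
            printf "fan %s%% at %sC in %s\n", $1, $2, $3
        }'
}

# Example (needs the proprietary driver's nvidia-smi; samples every 5 s):
#   nvidia-smi --query-gpu=fan.speed,temperature.gpu,pstate \
#       --format=csv,noheader,nounits -l 5 | flag_idle_fan
```

With the values in the table above (40 %, 56 C, P8), the sample would be flagged, which matches the symptom: the card reports an idle power state while the fan stays well above its floor.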

Thank you!

Last edited by thangalin (2023-09-08 06:38:38)

#2 2023-09-08 07:57:49

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,101

The particular script doesn't seem to be root-run and "When inside the BIOS settings, the GPU fan skyrockets" isn't sth. you'd get out of the OS.
=> Likely coincidental

Was the previous behavior stock or did you play w/ https://wiki.archlinux.org/title/NVIDIA … nd_cooling ?
Is there a parallel OS (windows)?
This isn't a hybrid graphics system and you switched from the IGP to nvidia?

Does the fan shut down when you cool down the system externally (eg. w/ an external fan, many hairdryers have a mode for cold air)?

#3 2023-09-09 00:16:45

thangalin
Member
Registered: 2023-09-08
Posts: 10

Thanks for helping out!

seth wrote:

The particular script doesn't seem to be root-run

I ran it as root; the script won't run as non-root:

$ nvidia-bug-report.sh 
ERROR: Please run nvidia-bug-report.sh as root.

seth wrote:

and "When inside the BIOS settings, the GPU fan skyrockets" isn't sth. you'd get out of the OS.
=> Likely coincidental

Seems rather unlikely to be a coincidence? I've been rebooting this computer for years without the GPU fan going haywire on startup. It started happening on the first reboot after installing EasyDiffusion.

seth wrote:

Was the previous behavior stock

I didn't even know about the GPU fan settings until it started meowing for attention. Everything was stock and ultra-quiet up until after running EasyDiffusion. At that point, I mucked with a few settings to try and shush the fan.

seth wrote:

Is there a parallel OS (windows)?

No.

seth wrote:

This isn't a hybrid graphics system and you switched from the IGP to nvidia?

Not to my knowledge.

seth wrote:

Does the fan shut down when you cool down the system externally (eg. w/ an external fan, many hairdryers have a mode for cold air)?

I haven't tried that. There's no easy way to cool it down. An ice pack on the case, perhaps, but it probably wouldn't help the GPU. (I can't open the case, I don't have the right screwdriver.)

#4 2023-09-09 06:23:43

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,101

No, I meant the EasyDiffusion script.

thangalin wrote:

Seems rather unlikely to be a coincidence?

That's the very nature of a coincidence.

thangalin wrote:

Not to my knowledge.

lspci

thangalin wrote:

I can't open the case, I don't have the right screwdriver.

Get one? Or rather bits, but it looks like it's just an Allen key?

Did you update the UEFI?
Do the fans blow up if you boot some live distro (grml)?

#5 2023-09-09 17:28:01

thangalin
Member
Registered: 2023-09-08
Posts: 10

seth wrote:

No, I meant the EasyDiffusion script.

Yes, ran it as a regular user.

seth wrote:

lspci

00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Root Complex
00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
00:01.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge
00:01.3 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge
00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
00:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
00:03.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge
00:04.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
00:07.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
00:07.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B
00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 59)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5
00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6
00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7
01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983
02:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset USB 3.1 xHCI Controller (rev 02)
02:00.1 SATA controller: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset SATA Controller (rev 02)
02:00.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43b2 (rev 02)
03:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port (rev 02)
03:01.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port (rev 02)
03:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port (rev 02)
03:05.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port (rev 02)
03:06.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port (rev 02)
03:07.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port (rev 02)
08:00.0 Network controller: Intel Corporation Dual Band Wireless-AC 3168NGW [Stone Peak] (rev 10)
09:00.0 Ethernet controller: Intel Corporation I211 Gigabit Network Connection (rev 03)
0a:00.0 VGA compatible controller: NVIDIA Corporation TU117GL [T1000 8GB] (rev a1)
0a:00.1 Audio device: NVIDIA Corporation Device 10fa (rev a1)
0b:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Raven/Raven2 PCIe Dummy Function
0b:00.2 Encryption controller: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Platform Security Processor (PSP) 3.0 Device
0b:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) USB 3.0 Host Controller
0c:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Renoir PCIe Dummy Function
0c:00.2 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)
0c:00.3 Audio device: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) HD Audio Controller

seth wrote:

Get one? Or rather bits, but it looks like it's just an Allen key?

It's one of those star-shaped Allen keys, but a peculiar size. I always get a confused reaction when I bring the case into a shop.

seth wrote:

Did you update the UEFI?

There's no GPU fan setting in the UEFI. I did, afterwards, set the CPU fan to quiet mode. However, there's no CPU fan, so it shouldn't affect anything.

seth wrote:

Do the fans blow up if you boot some live distro (grml)?

Good idea, I'll give that a try and see what happens, thank you.

#6 2023-09-09 18:56:27

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,101

There's only one VGA device, no hybrid graphics.
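
The same check can be done without reading the whole listing by counting display-class entries in the lspci output. This is a rough sketch: the function name count_display_devices is made up for illustration, and it assumes the class names lspci normally prints ("VGA compatible controller", "3D controller", "Display controller"). A hybrid setup would typically show two such entries, for example an IGP plus a discrete card.

```shell
#!/bin/sh
# Count display-class devices in lspci output read from stdin.
# A result of 1 suggests a single GPU; 2 or more hints at hybrid graphics.
count_display_devices() {
    grep -c -E 'VGA compatible controller|3D controller|Display controller'
}

# Example:
#   lspci | count_display_devices
```

For the listing posted above, only the 0a:00.0 line matches, so the count would be 1.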

The "star-shaped Allen key" isn't just Torx??
https://en.wikipedia.org/wiki/Torx
