
#1 2018-04-27 22:50:33

oangelo
Member
Registered: 2013-02-10
Posts: 12

Nvidia with bad performance because of default thermal settings

I have a VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX 660 Ti] (rev a1). I have noticed that when playing some hardware-demanding games (on Wine, Cemu, RPCS3), after some time the games begin to slow down! It is not my processor, which is good (Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz, 12 threads), and it is not the memory (8 GB), since I checked htop while playing and it was not an issue. And my Nvidia driver is the latest one (extra/nvidia 390.48-9).

I have improved the performance just by choosing "Enable GPU Fan Settings" in Nvidia Settings and using a higher fan speed. WTF, the fan speed should be tuned automatically, not manually! I don't know if this is a problem with my specific GPU or if this happens in general. Let me know if you are also seeing similar behavior.

I do not know who is to blame, whether the kernel should have been controlling this or the Nvidia driver. But it is very disappointing to have to set this manually every time. I know I can make a script that changes this at the beginning of the session, but the constant noise of the fans drives me mad.
I also think this should be mentioned in the Nvidia article on the ArchWiki.
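For context: the "Enable GPU Fan Settings" switch in nvidia-settings is only exposed when manual fan control is unlocked with the proprietary driver's Coolbits option in the Xorg configuration. A sketch of the snippet (the file name below and the Identifier string are arbitrary choices; bit 2, i.e. value 4, is the fan-control bit):

```
# e.g. /etc/X11/xorg.conf.d/20-nvidia.conf
Section "Device"
    Identifier "Nvidia Card"
    Driver     "nvidia"
    # Bit 2 (value 4) unlocks manual fan control in nvidia-settings
    Option     "Coolbits" "4"
EndSection
```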

Last edited by oangelo (2018-04-27 22:54:04)


#2 2018-04-29 23:20:28

Ropid
Member
Registered: 2015-03-09
Posts: 1,069

Re: Nvidia with bad performance because of default thermal settings

What are the temperatures you are seeing? This could be the behavior of the graphics card's hardware itself (and its firmware). It will slow down after a certain temperature. I don't remember what that is for the GTX 600 generation. Perhaps something around 75 °C? I have this script here to track how the temperature of an Nvidia card changes over time:

#!/bin/bash
trap "echo; exit" INT # print newline after ctrl-c
previous_minute=99
while true; do
    time=$(date '+%T')
    minute=${time#*:}   # strip leading "HH:", leaving "MM:SS"
    minute=${minute%:*} # strip trailing ":SS", leaving the minute
    if [[ $previous_minute != $minute ]]; then
        previous_minute=$minute
        echo
        echo -n "$time - "
    fi
    gpu_temp=$(nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader --id=0)
    echo -n "${gpu_temp} "
    sleep 2
done

I name the file "followtemp-gpu". It's supposed to be used inside a terminal window in the background while you play a game. It will print temperatures every few seconds, with just one line of text being used per minute.

Make sure your case is well ventilated. If you didn't clean dust inside the case over the last year or so, do that. Perhaps think about adding a case fan. If you don't have enough case fans, the CPU and GPU cooler's fans will just cycle the air inside the case around, without getting enough fresh air.

I use a GTX 680 currently. I removed its cooler and reapplied thermal paste between the cooler's base plate and the GPU. Doing that dropped temperatures by 20 °C. It was previously something around 80 °C while playing games, and now it's usually below 60 °C.

I also have a script to track CPU temperatures with similar output:

#!/bin/bash
trap "echo; exit" INT # print newline after ctrl-c
previous_minute=99
while true; do
    time=$(date '+%T')
    minute=${time#*:}   # strip leading "HH:", leaving "MM:SS"
    minute=${minute%:*} # strip trailing ":SS", leaving the minute
    if [[ $previous_minute != $minute ]]; then
        previous_minute=$minute
        echo
        echo -n "$time - "
    fi
    echo -n "$(sensors | sed -nr 's/^Package id 0: *[-+]([0-9]+).*$/\1/p') "
    sleep 2
done

I name this script "followtemp". On my CPU here, it shows the temperature of the hottest CPU core.

The CPU behaves similarly to the graphics card in that it will slow down if the temperature reaches a certain point. According to Intel's website, that's 100 °C for your CPU, so very high. I can't really imagine a game hitting that level on a 6-core CPU.

Last edited by Ropid (2018-04-29 23:24:23)


#3 2018-05-01 23:58:03

oangelo
Member
Registered: 2013-02-10
Posts: 12

Re: Nvidia with bad performance because of default thermal settings

@Ropid, I did check my GPU and it is clean, but I changed the thermal paste anyway, just to make sure. In any case, when the GPU is near 70 °C it does not perform well and the FPS in games falls too much. What makes me mad is that the GPU fan does not seem to speed up to cool the GPU down; I have to manually set the fan to a higher speed to get decent FPS. This behavior makes no sense for the GPU. It should be like the CPU, which speeds up its cooler when it gets too hot and tries to maintain performance.
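To confirm that it really is thermal throttling (and not a power limit or something else), you can ask the driver itself which throttle flags are active. A small sketch: the helper below assumes the "Clocks Throttle Reasons" layout that `nvidia-smi -q -d PERFORMANCE` prints, where each reason ends in ": Active" or ": Not Active" (the exact labels may differ between driver versions):

```shell
#!/bin/bash
# Print only the throttle reasons that nvidia-smi reports as "Active".
active_throttle_reasons() {
    grep ': Active' | sed 's/ *: Active *$//; s/^ *//'
}

# usage: nvidia-smi -q -d PERFORMANCE | active_throttle_reasons
```

If a thermal slowdown line shows up as active while the FPS drops, that settles where the slowdown comes from.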

I don't know if this is done on purpose by Nvidia to force people with older GPUs to buy new ones, or if Nvidia still cares nothing for the Linux user and did not implement good thermal management in their drivers.

Or maybe it is my fault and I just don't know how to configure it. But I don't think so; decent thermal management should be the default.

Anyway, I think the best solution is to implement a PID controller that adjusts the fan speed to keep the GPU at a given temperature. This would make the GPU silent when idle and give decent performance when playing games, at the cost of a lot of fan noise. Does anybody know if something like this is already available?
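A PID loop like that can be sketched in bash on top of the same nvidia-smi and nvidia-settings calls discussed above. This is a hypothetical sketch, not a tested controller: the gains (kp/ki/kd), the 65 °C target, and the 30–100 % speed clamp are made-up values that would need tuning, and manual fan control has to be unlocked first (Coolbits).

```shell
#!/bin/bash
# Hypothetical PID fan controller sketch -- gains and setpoint are untested guesses.
target=65   # desired GPU temperature in degrees Celsius
kp=2 ki=0.1 kd=0.5
dt=2        # seconds between samples

# pid_step PREV_ERROR INTEGRAL MEASURED_TEMP
# Prints "output new_integral new_error"; awk does the floating-point math.
pid_step() {
    awk -v prev="$1" -v integ="$2" -v m="$3" \
        -v t="$target" -v kp="$kp" -v ki="$ki" -v kd="$kd" -v dt="$dt" '
        BEGIN {
            err = m - t                # positive when the GPU is too hot
            integ += err * dt
            deriv = (err - prev) / dt
            printf "%.2f %.2f %.2f", kp * err + ki * integ + kd * deriv, integ, err
        }'
}

# Not called here: run control_loop on a machine with the nvidia tools available.
control_loop() {
    local error=0 integral=0 speed=40 temp delta
    while true; do
        temp=$(nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader --id=0)
        read -r delta integral error <<< "$(pid_step "$error" "$integral" "$temp")"
        speed=$(awk -v s="$speed" -v d="$delta" \
            'BEGIN { s += d; if (s < 30) s = 30; if (s > 100) s = 100; printf "%d", s }')
        nvidia-settings -a "[gpu:0]/GPUFanControlState=1" \
                        -a "[fan:0]/GPUTargetFanSpeed=$speed" > /dev/null
        sleep "$dt"
    done
}
```

The integral term removes the steady-state error a plain proportional controller would leave, and the clamp keeps the fan from stopping entirely or being driven past 100 %.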


#4 2018-05-02 06:23:32

seth
Member
Registered: 2012-09-03
Posts: 51,162

Re: Nvidia with bad performance because of default thermal settings

70 °C isn't very much for a GPU, but it is quite a lot for a CPU (and too much for RAM or HDDs; SSDs can heat up even more).
Check the heat distribution (this is in particular a problem in notebooks, where some smartass used the same heatpipe for the CPU and the GPU, linking their temperatures), and check whether the CPU heats up and throttles in turn, causing the performance loss.

