#1 2025-09-19 20:43:25

Sudouken
Member
Registered: 2025-06-28
Posts: 5

[SOLVED] Ollama with CUDA Acceleration?

I don't know much about CUDA, but my understanding is that it can enable GPU acceleration for certain processes. When I found the ollama-cuda package, I thought it might help my Ollama models run faster, but out of the box this doesn't seem to be the case. I checked the Wiki, but it has only very basic information and doesn't touch on CUDA at all. I would like to be educated by someone who knows more than me on this topic. Are my assumptions wrong, or is configuration required? Forgive me, this is not urgent by any means; I just want to learn and understand my system better. Currently, running a 30b model (granted, it's likely pushing the limits of what my system can handle), I almost have time to make a coffee after sending a prompt (~2-3 minutes) lol. Thanks!

Laptop Specs
  • Kernel Version: 6.12.47-1-lts (64-bit)

  • Graphics Platform: Wayland

  • Processors: 12 × AMD Ryzen 5 5600H with Radeon Graphics

  • Memory: 32 GiB of RAM (27.3 GiB usable)

  • Graphics Processor 1: AMD Radeon Graphics

  • Graphics Processor 2: NVIDIA GeForce RTX 3050 Ti -- CUDA Version 13.0, 4096 MiB of VRAM (from `nvidia-smi`)

Last edited by Sudouken (2025-09-19 21:53:02)

#2 2025-09-19 21:10:42

lahwaacz
Wiki Admin
From: Czech Republic
Registered: 2012-05-29
Posts: 775

Re: [SOLVED] Ollama with CUDA Acceleration?

Your GPU most likely does not have enough memory to load the 30b model. There is some online calculator where you can input your specs: https://apxml.com/tools/vram-calculator
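The calculator aside, the back-of-envelope arithmetic is simple enough to sketch. This is a rough estimate, not how Ollama itself computes anything; the 4-bit default and the ~20% overhead factor are assumptions, and real usage also depends on context length and KV cache:

```python
# Rough VRAM estimate for running an LLM locally (back-of-envelope sketch).
def estimate_vram_gib(n_params_billion: float, bytes_per_param: float = 0.5) -> float:
    """bytes_per_param: 2.0 for fp16 weights, 0.5 for 4-bit quantization."""
    weights_bytes = n_params_billion * 1e9 * bytes_per_param
    overhead = 1.2  # ~20% extra for KV cache and buffers (assumption)
    return weights_bytes * overhead / 2**30

# A 30b model at 4-bit comes out around 17 GiB -- far beyond 4 GiB of VRAM.
print(round(estimate_vram_gib(30), 1))  # → 16.8
```

Even before overhead, 30 billion parameters at half a byte each is ~15 GB of weights alone, so a 4 GiB card cannot hold the model.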

#3 2025-09-19 21:39:19

Sudouken
Member
Registered: 2025-06-28
Posts: 5

Re: [SOLVED] Ollama with CUDA Acceleration?

It actually loads just fine, takes maybe 20 seconds. It's just rather slow to process and respond to prompts, maybe 2-3 minutes at most. I'd like to ensure that my GPU is being used so I can get the best performance possible, that's all. Currently, I think it's only using my CPU. Could it be that the GPU is not being used because it only has 4GB of on-board RAM, whereas my system has 32GB?
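One way to check is `ollama ps`, whose PROCESSOR column shows whether a loaded model is on the CPU, the GPU, or split between them. A small sketch that pulls that field out of the output; the sample below (model name, ID, sizes) is made up for illustration:

```python
# Sketch: extract the PROCESSOR field from `ollama ps` output. The column
# reports e.g. "100% CPU", "100% GPU", or a CPU/GPU split. Sample is fabricated.
sample = (
    "NAME        ID            SIZE      PROCESSOR    UNTIL\n"
    "my30b:q4    0123456789ab  18 GB     100% CPU     4 minutes from now\n"
)

def processor(ps_output: str) -> str:
    """Return the PROCESSOR field for the first loaded model."""
    header, row = ps_output.splitlines()[:2]
    start = header.index("PROCESSOR")
    end = header.index("UNTIL")
    return row[start:end].strip()

print(processor(sample))  # → 100% CPU
```

In practice you would just run `ollama ps` in a terminal while a prompt is being processed, and watch `nvidia-smi` to see whether VRAM usage jumps.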

#4 2025-09-19 21:44:19

lahwaacz
Wiki Admin
From: Czech Republic
Registered: 2012-05-29
Posts: 775

Re: [SOLVED] Ollama with CUDA Acceleration?

Sudouken wrote:

Currently, I think it's only using my CPU. Could it be that the GPU is not being used, because it only has 4GB on board RAM, whereas as my system has 32GB?

That's what I meant.
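For context, llama.cpp-based runtimes such as Ollama can offload just part of a model's layers to the GPU when the whole thing doesn't fit (the `num_gpu` model parameter). A rough sketch of that arithmetic; the layer count and quantized size below are assumptions for illustration, not measured values:

```python
# Sketch: how many transformer layers fit in VRAM if the quantized weights
# are split evenly per layer? (Assumption -- real layers vary in size.)
def layers_that_fit(vram_gib: float, n_layers: int, model_gib: float) -> int:
    per_layer_gib = model_gib / n_layers
    return min(n_layers, int(vram_gib // per_layer_gib))

# Assumed: a ~17 GiB 4-bit 30b model with 48 layers, on a 4 GiB GPU:
print(layers_that_fit(4.0, 48, 17.0))  # → 11
```

So at best only a small fraction of the model could sit on a 4 GiB card, with the rest running on the CPU, which is why the speedup is marginal.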

#5 2025-09-19 21:49:11

Sudouken
Member
Registered: 2025-06-28
Posts: 5

Re: [SOLVED] Ollama with CUDA Acceleration?

Well crap, that stinks. VRAM is WAY more expensive than system RAM. I guess I'm stuck using my CPU until I can afford a beefy desktop GPU suitable for running these larger models. Oh well, at least it works; better than nothing. I might as well uninstall the cuda package and get that 8GB of disk space back.

Thanks for the clarification.

#6 2025-09-19 22:20:47

WorMzy
Administrator
From: Scotland
Registered: 2010-06-16
Posts: 13,022
Website

Re: [SOLVED] Ollama with CUDA Acceleration?

Mod note: Not a Forum/Wiki discussion, moving to Applications & DEs.

