#1 2025-09-19 20:43:25

Sudouken
Member
Registered: 2025-06-28
Posts: 5

[SOLVED] Ollama with CUDA Acceleration?

I don't know much about CUDA, but my understanding is that it can enable GPU acceleration for certain processes. When I found the ollama-cuda package, I thought it might help my Ollama models run faster, but out of the box this doesn't seem to be the case. I checked the Wiki, but it has only very basic information and doesn't touch on CUDA at all. I would like to be educated by someone who knows more than me on this topic. Are my assumptions wrong, or is configuration required? Forgive me, this is not urgent by any means; I just want to learn and understand my system better. Currently, running a 30b model (granted, it's likely pushing the limits of what my system can handle), I almost have time to make a coffee after sending a prompt (~2-3 minutes) lol. Thanks!

Laptop Specs
  • Kernel Version: 6.12.47-1-lts (64-bit)

  • Graphics Platform: Wayland

  • Processors: 12 × AMD Ryzen 5 5600H with Radeon Graphics

  • Memory: 32 GiB of RAM (27.3 GiB usable)

  • Graphics Processor 1: AMD Radeon Graphics

  • Graphics Processor 2: NVIDIA GeForce RTX 3050 Ti -- CUDA Version 13.0, 4096 MiB of VRAM (from `nvidia-smi`)

Last edited by Sudouken (2025-09-19 21:53:02)

#2 2025-09-19 21:10:42

lahwaacz
Wiki Admin
From: Czech Republic
Registered: 2012-05-29
Posts: 775

Re: [SOLVED] Ollama with CUDA Acceleration?

Your GPU most likely does not have enough memory to load the 30b model. There is some online calculator where you can input your specs: https://apxml.com/tools/vram-calculator
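The calculator aside, the back-of-envelope arithmetic is simple enough to sketch. This is a rough estimate, not how Ollama itself computes anything; the 4-bit default and the ~20% overhead factor are assumptions, and real usage also depends on context length and KV cache:

```python
# Rough VRAM estimate for running an LLM locally (back-of-envelope sketch).
def estimate_vram_gib(n_params_billion: float, bytes_per_param: float = 0.5) -> float:
    """bytes_per_param: 2.0 for fp16 weights, 0.5 for 4-bit quantization."""
    weights_bytes = n_params_billion * 1e9 * bytes_per_param
    overhead = 1.2  # ~20% extra for KV cache and buffers (assumption)
    return weights_bytes * overhead / 2**30

# A 30b model at 4-bit comes out around 17 GiB -- far beyond 4 GiB of VRAM.
print(round(estimate_vram_gib(30), 1))  # → 16.8
```

Even before overhead, 30 billion parameters at half a byte each is ~15 GB of weights alone, so a 4 GiB card cannot hold the model.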

#3 2025-09-19 21:39:19

Sudouken
Member
Registered: 2025-06-28
Posts: 5

Re: [SOLVED] Ollama with CUDA Acceleration?

It actually loads just fine, takes maybe 20 seconds. It's just rather slow to process and respond to prompts, maybe 2-3 minutes at most. I'd like to ensure that my GPU is being used so I can get the best performance possible, that's all. Currently, I think it's only using my CPU. Could it be that the GPU is not being used because it only has 4GB of on-board RAM, whereas my system has 32GB?
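One way to check is `ollama ps`, whose PROCESSOR column shows whether a loaded model is on the CPU, the GPU, or split between them. A small sketch that pulls that field out of the output; the sample below (model name, ID, sizes) is made up for illustration:

```python
# Sketch: extract the PROCESSOR field from `ollama ps` output. The column
# reports e.g. "100% CPU", "100% GPU", or a CPU/GPU split. Sample is fabricated.
sample = (
    "NAME        ID            SIZE      PROCESSOR    UNTIL\n"
    "my30b:q4    0123456789ab  18 GB     100% CPU     4 minutes from now\n"
)

def processor(ps_output: str) -> str:
    """Return the PROCESSOR field for the first loaded model."""
    header, row = ps_output.splitlines()[:2]
    start = header.index("PROCESSOR")
    end = header.index("UNTIL")
    return row[start:end].strip()

print(processor(sample))  # → 100% CPU
```

In practice you would just run `ollama ps` in a terminal while a prompt is being processed, and watch `nvidia-smi` to see whether VRAM usage jumps.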

#4 2025-09-19 21:44:19

lahwaacz
Wiki Admin
From: Czech Republic
Registered: 2012-05-29
Posts: 775

Re: [SOLVED] Ollama with CUDA Acceleration?

Sudouken wrote:

Currently, I think it's only using my CPU. Could it be that the GPU is not being used, because it only has 4GB on board RAM, whereas as my system has 32GB?

That's what I meant.
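For context, llama.cpp-based runtimes such as Ollama can offload just part of a model's layers to the GPU when the whole thing doesn't fit (the `num_gpu` model parameter). A rough sketch of that arithmetic; the layer count and quantized size below are assumptions for illustration, not measured values:

```python
# Sketch: how many transformer layers fit in VRAM if the quantized weights
# are split evenly per layer? (Assumption -- real layers vary in size.)
def layers_that_fit(vram_gib: float, n_layers: int, model_gib: float) -> int:
    per_layer_gib = model_gib / n_layers
    return min(n_layers, int(vram_gib // per_layer_gib))

# Assumed: a ~17 GiB 4-bit 30b model with 48 layers, on a 4 GiB GPU:
print(layers_that_fit(4.0, 48, 17.0))  # → 11
```

So at best only a small fraction of the model could sit on a 4 GiB card, with the rest running on the CPU, which is why the speedup is marginal.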

#5 2025-09-19 21:49:11

Sudouken
Member
Registered: 2025-06-28
Posts: 5

Re: [SOLVED] Ollama with CUDA Acceleration?

Well crap, that stinks. VRAM is WAY more expensive than system RAM. I guess I'm stuck using my CPU until I can afford a beefy desktop GPU suitable for running these larger models. Oh well, at least it works; better than nothing. I might as well uninstall the cuda package and get that 8GB of disk space back.

Thanks for the clarification.

#6 2025-09-19 22:20:47

WorMzy
Administrator
From: Scotland
Registered: 2010-06-16
Posts: 13,022
Website

Re: [SOLVED] Ollama with CUDA Acceleration?

Mod note: Not a Forum/Wiki discussion, moving to Applications & DEs.

