I don't know much about CUDA, but my understanding is that it allows GPU acceleration for certain processes. When I found the ollama-cuda package, I thought it might help my Ollama models run faster, but out of the box this doesn't seem to be the case. I checked the Wiki, but it had only very basic information and didn't mention CUDA at all. I would like to be educated by someone who knows more than me on this topic. Are my assumptions wrong, or is configuration required? Forgive me. This is not urgent by any means; I just want to learn and understand my system better. Currently, running a 30b model (granted, it's likely pushing the limits of what my system can handle), I almost have time to make a coffee after sending a prompt (~2-3 minutes). Thanks!
Kernel Version: 6.12.47-1-lts (64-bit)
Graphics Platform: Wayland
Processors: 12 × AMD Ryzen 5 5600H with Radeon Graphics
Memory: 32 GiB of RAM (27.3 GiB usable)
Graphics Processor 1: AMD Radeon Graphics
Graphics Processor 2: NVIDIA GeForce RTX 3050 Ti -- with CUDA Version: 13.0 and 4096MiB (from `nvidia-smi`)
Last edited by Sudouken (2025-09-19 21:53:02)
Your GPU most likely does not have enough memory to load the 30b model. There is an online calculator where you can input your specs: https://apxml.com/tools/vram-calculator
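To make the size mismatch concrete, here is a rough back-of-the-envelope sketch. It assumes 4-bit quantized weights and a 20% overhead factor for context and runtime buffers; both numbers are my assumptions, not something taken from your setup, and the function names are just illustrative. The linked calculator will give a more careful figure.

```python
# Rough estimate of the memory a model needs, compared against available VRAM.
# Assumptions (hypothetical, adjust to your model): 4 bits per weight and
# a 1.2x overhead factor for KV cache and runtime buffers.

GIB = 1024 ** 3  # bytes per GiB


def estimate_model_gib(params_billions: float,
                       bits_per_weight: float,
                       overhead: float = 1.2) -> float:
    """Return an approximate memory footprint in GiB."""
    weight_bytes = params_billions * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / GIB


def fits_in_vram(required_gib: float, vram_gib: float) -> bool:
    """True if the whole model would fit on the GPU."""
    return required_gib <= vram_gib


if __name__ == "__main__":
    needed = estimate_model_gib(params_billions=30, bits_per_weight=4)
    vram = 4.0  # 4096 MiB, as reported by nvidia-smi in the first post
    print(f"Estimated footprint: {needed:.1f} GiB, VRAM: {vram:.1f} GiB")
    print("Fits entirely on GPU:", fits_in_vram(needed, vram))
```

Under these assumptions a 30b model needs several times the 4 GiB on the card, so most or all of it ends up in system RAM.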
It actually loads just fine; that takes maybe 20 seconds. It's just rather slow to process and respond to prompts, maybe 2-3 minutes at most. I'd like to ensure that my GPU is being used, so I can get the best performance possible, that's all. Currently, I think it's only using my CPU. Could it be that the GPU is not being used because it only has 4GB of onboard RAM, whereas my system has 32GB?
Currently, I think it's only using my CPU. Could it be that the GPU is not being used because it only has 4GB of onboard RAM, whereas my system has 32GB?
That's what I meant.
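If you want to confirm it rather than guess, one option is to look at GPU memory usage while the model is loaded. Below is a minimal sketch that wraps `nvidia-smi` with its CSV query options and parses the result; the helper names are made up for illustration, and it assumes `nvidia-smi` is on your PATH. If the used memory stays near zero while a prompt is running, the model is sitting on the CPU. As far as I know, `ollama ps` also reports a CPU/GPU split for loaded models.

```python
# Query NVIDIA GPU memory usage via nvidia-smi and parse the CSV output.
# Helper names are illustrative; assumes nvidia-smi is installed and on PATH.
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text: str) -> list:
    """Parse lines of 'name, used, total' (MiB) into a list of dicts."""
    gpus = []
    for line in csv_text.strip().splitlines():
        # Split from the right so a comma in the GPU name cannot break parsing.
        name, used, total = [field.strip() for field in line.rsplit(",", 2)]
        gpus.append({"name": name, "used_mib": int(used), "total_mib": int(total)})
    return gpus


def query_gpus() -> list:
    """Run nvidia-smi and return the parsed memory usage for each GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)


if __name__ == "__main__":
    try:
        for gpu in query_gpus():
            print(f"{gpu['name']}: {gpu['used_mib']} / {gpu['total_mib']} MiB used")
    except (FileNotFoundError, subprocess.CalledProcessError, ValueError) as exc:
        print("Could not query the GPU:", exc)
```

Run it once while idle and once while a prompt is being processed, then compare the two readings.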
Well crap, that stinks. VRAM is WAY more expensive than system RAM. I guess I'm stuck with using my CPU until I can afford to buy a beefy desktop GPU suitable for running these larger models. Oh well, at least it works; better than nothing. I might as well uninstall the cuda package and get that 8GB of disk space back.
Thanks for the clarification.
Mod note: Not a Forum/Wiki discussion, moving to Applications & DEs.