Ollama (Download and run large language models locally)

Ollama is an application that lets you run large language models
offline.

Optional dependencies such as CUDA or ROCm are detected automatically
during compilation of the ollama libraries, if present.

CUDA=ON: build with CUDA; default is CUDA=OFF.
  Build and runtime dependencies:
    development/cudatoolkit_13
    system/nvidia-driver or system/nvidia-legacy580-driver

ROCM=ON: build with ROCm; default is ROCM=OFF.
  Build and runtime dependencies:
    development/rocmtoolkit_7

LIB64=ON: honor LIBDIRSUFFIX; on x86_64 the ollama libs are installed
  to /usr/lib64/ollama. Default is LIB64=OFF.

VULKAN=OFF: build without Vulkan; this is the default. If you have
  vulkan-sdk >= 1.4, enable it with VULKAN=ON.

Building ollama requires network access and google-go-lang.

To verify the installation:

  $ nohup ollama serve &
  $ ollama --version
  $ ollama run gemma3:270m

See also:

  https://docs.ollama.com/faq
  https://docs.ollama.com/quickstart
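
The verification steps above start the server in the background and
then talk to it right away, which can fail if the server is not yet
listening. A minimal sketch of the same steps with a short wait loop
follows. The helper names retry and verify_ollama, the SERVE_LOG
variable, and the prompt text are illustrative assumptions, not part
of ollama or of this package.

```shell
#!/bin/sh
# Sketch only: helper names and SERVE_LOG are hypothetical.

# retry N CMD...: run CMD up to N times, one second apart.
# Returns 0 on the first success, 1 if every attempt fails.
retry() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep 1
        fi
    done
    return 1
}

# verify_ollama: start the server, wait until it answers, then run
# the same checks as the README.
verify_ollama() {
    if ! command -v ollama >/dev/null 2>&1; then
        echo "ollama not found in PATH" >&2
        return 1
    fi

    # Same as "nohup ollama serve &", with output sent to a log file.
    nohup ollama serve >"${SERVE_LOG:-/tmp/ollama-serve.log}" 2>&1 &

    # "ollama list" needs a running server, so it doubles as a
    # readiness probe.
    if ! retry 30 ollama list; then
        echo "ollama server did not become ready" >&2
        return 1
    fi

    ollama --version
    # Passing a prompt makes the run non-interactive.
    ollama run gemma3:270m "Say hello in one sentence."
}
```

To use it, source the file and call verify_ollama. The first run
downloads gemma3:270m, so it needs network access.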