Ollama (Download and run large language models locally)

Ollama is an application that lets you run large language models
locally, and offline once a model has been downloaded.

Optional dependencies such as CUDA or ROCm are detected automatically,
if present, when the ollama libraries are compiled.

CUDA=ON: build with CUDA support (default is CUDA=OFF). Build and
runtime dependencies:
  development/cudatoolkit_13
  system/nvidia-driver or system/nvidia-legacy580-driver

ROCM=ON: build with ROCm support (default is ROCM=OFF). Build and
runtime dependencies:
  development/rocmtoolkit_7

LIB64=ON: honor LIBDIRSUFFIX, so that on x86_64 the ollama libs are
installed to /usr/lib64/ollama (default is LIB64=OFF).

VULKAN=OFF: build without Vulkan support (this is the default). If
vulkan-sdk >= 1.4 is installed, enable Vulkan with VULKAN=ON.
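
The switches above are normally passed as environment variables on the
build command line. The sketch below only previews the values a build
would see. It assumes the usual ${VAR:-default} idiom, and
show_build_options is a hypothetical helper name, not part of the
build script:

```shell
#!/bin/sh
# Sketch only: print the switch values a build would see.
# The ${VAR:-OFF} fallbacks mirror the defaults listed above; the
# actual build script may handle these variables differently.
show_build_options() {
    echo "CUDA=${CUDA:-OFF} ROCM=${ROCM:-OFF} LIB64=${LIB64:-OFF} VULKAN=${VULKAN:-OFF}"
}
```

With the function sourced into the current shell, the same prefix
syntax then applies to the build itself (the script name
ollama.SlackBuild is an assumption):

$ CUDA=ON VULKAN=ON show_build_options
$ CUDA=ON VULKAN=ON ./ollama.SlackBuild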

Building ollama requires network access and google-go-lang.

To verify the installation:

$ nohup ollama serve &
$ ollama --version
$ ollama run gemma3:270m
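
Because "ollama serve" is started in the background, the commands that
follow it can run before the server accepts connections. A minimal
sketch of a readiness check, assuming curl is installed and the server
listens on its default address 127.0.0.1:11434. wait_for_ollama is a
hypothetical helper name, not part of ollama:

```shell
#!/bin/sh
# Poll the Ollama HTTP API until it answers, or give up.
#   $1: URL to probe (default: http://127.0.0.1:11434/api/version)
#   $2: maximum number of attempts (default: 30)
#   $3: seconds to sleep between attempts (default: 1)
# Returns 0 once the server responds, 1 if every attempt fails.
wait_for_ollama() {
    url=${1:-http://127.0.0.1:11434/api/version}
    tries=${2:-30}
    delay=${3:-1}
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -f: treat HTTP errors as failure, -s: silent,
        # --max-time: bound the duration of each probe
        if curl -fs --max-time 2 "$url" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

With the function sourced into the current shell, it could be used as:

$ nohup ollama serve &
$ wait_for_ollama && ollama run gemma3:270m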

See also:
https://docs.ollama.com/faq
https://docs.ollama.com/quickstart