Local model runtime CLI for pulling models, serving a local API with OpenAI-compatible endpoints, creating Modelfile-based variants, and launching supported integrations.
$ curl -fsSL https://ollama.com/install.sh | sh
Agent Compatibility
- JSON Output
- Agent Skill
- MCP Support
AI Analysis
Ollama is a local model runtime and control CLI for pulling models, running them locally or through Ollama Cloud, and exposing them over a local HTTP API. It also packages customized models and can launch supported coding tools against that runtime.
What It Enables
- Pull, run, stop, and inspect local or cloud-backed models from the shell, including interactive chat and embedding generation.
- Start a local Ollama server that exposes native and OpenAI-compatible JSON APIs for chat, embeddings, structured outputs, vision, and tool-calling requests.
- Create and import customized models from Modelfile, Safetensors, or GGUF assets, then launch supported tools like codex or claude against the local runtime.
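As a minimal sketch of the two API surfaces mentioned above, the snippet below builds (but does not send) request objects for the native /api/chat endpoint and the OpenAI-compatible /v1/chat/completions route; the model name, prompt, and default port 11434 are assumptions:

```python
import json
import urllib.request

OLLAMA = "http://localhost:11434"  # assumed default server address

def chat_request(model, prompt):
    """Build a request against Ollama's native /api/chat endpoint."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one JSON object instead of a stream
    }
    return urllib.request.Request(
        f"{OLLAMA}/api/chat",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )

def openai_compat_request(model, prompt):
    """Same conversation shape against the OpenAI-compatible /v1 route."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{OLLAMA}/v1/chat/completions",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )

req = chat_request("llama3.2", "Say hello in one word.")
print(req.full_url)  # http://localhost:11434/api/chat
```

Sending either request with urllib.request.urlopen against a running ollama serve instance returns a JSON response; the OpenAI-compatible route is what lets existing OpenAI client code point at the local runtime by swapping the base URL.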
Agent Fit
- Agents usually get the most value from ollama serve plus the JSON API, with the CLI handling model lifecycle, setup, and simple one-shot runs.
- ollama run accepts piped stdin, supports --format json, and embedding models print a JSON array, so it can participate in shell pipelines even without wrapping the HTTP API.
- Fit is mixed rather than fully deterministic: the default entrypoint is a TUI, model responses are still probabilistic, and unattended use depends on local hardware limits or Ollama account auth for cloud flows.
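The pipeline-style use of ollama run described above can be sketched as follows; the model name is a placeholder, and the subprocess call only fires when the CLI is actually on PATH:

```python
import json
import shutil
import subprocess

def build_cmd(model: str) -> list[str]:
    """Explicit subcommand invocation: no TUI, JSON-constrained output."""
    return ["ollama", "run", model, "--format", "json"]

def run_json(model: str, prompt: str) -> dict:
    """Pipe a prompt into `ollama run` on stdin and parse the JSON reply."""
    result = subprocess.run(
        build_cmd(model),
        input=prompt,          # equivalent to `echo prompt | ollama run ...`
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)

if shutil.which("ollama"):  # skip gracefully on machines without the CLI
    print(run_json("llama3.2", 'Reply with a JSON object {"ok": true}.'))
```

Because --format json constrains the model to emit valid JSON, the stdout can be parsed directly rather than scraped, which is what makes the CLI usable in unattended pipelines.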
Caveats
- Large local models are constrained by available CPU, GPU, memory, and disk; cloud models require sign-in or API-key setup.
- The bare ollama and ollama launch flows are human-oriented Bubble Tea menus, so automation should call explicit subcommands or the API directly.
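As one hedged example of calling the API rather than the menus, an inventory check against the native /api/tags endpoint (default port 11434 assumed) degrades to an empty list when no server is running:

```python
import json
import urllib.error
import urllib.request

def list_local_models(base="http://localhost:11434"):
    """Query /api/tags for installed models; return [] if no server answers."""
    try:
        with urllib.request.urlopen(f"{base}/api/tags", timeout=2) as resp:
            return [m["name"] for m in json.load(resp).get("models", [])]
    except (urllib.error.URLError, OSError):
        return []  # server down or unreachable: empty inventory, no crash

print(list_local_models())
```

A probe like this lets automation verify that ollama serve is up and that a required model is pulled before issuing chat requests, instead of relying on the interactive entrypoint.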