Ollama

Official · Ollama
Local model runtime CLI for pulling models, serving a local API with OpenAI-compatible endpoints, creating Modelfile-based variants, and launching supported integrations.

$ curl -fsSL https://ollama.com/install.sh | sh
Language: Go
Stars: 164,442
Category: AI & LLM Tools
Agent: Ready
Agent Compatibility: JSON Output
Agent Skill: MCP Support
AI Analysis

Ollama is a local model runtime and control CLI for pulling models, running them locally or through Ollama Cloud, and exposing them over a local HTTP API. It also builds customized model variants from Modelfile definitions and can launch supported coding tools against that runtime.

What It Enables
  • Pull, run, stop, and inspect local or cloud-backed models from the shell, including interactive chat and embedding generation.
  • Start a local Ollama server that exposes native and OpenAI-compatible JSON APIs for chat, embeddings, structured outputs, vision, and tool-calling requests.
  • Create and import customized models from Modelfile, Safetensors, or GGUF assets, then launch supported tools like codex or claude against the local runtime.
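The OpenAI-compatible side of the API can be exercised with nothing beyond the standard library. A minimal sketch, assuming a local ollama serve on the default port 11434 and an already-pulled model (llama3.2 is a placeholder name here):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default local server address


def chat_request(model: str, prompt: str, json_mode: bool = False) -> urllib.request.Request:
    """Build a POST request for Ollama's OpenAI-compatible chat endpoint."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    if json_mode:
        # JSON mode constrains the reply to a valid JSON object
        body["response_format"] = {"type": "json_object"}
    return urllib.request.Request(
        f"{OLLAMA_URL}/v1/chat/completions",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Sending requires a running `ollama serve` with the model pulled:
# with urllib.request.urlopen(chat_request("llama3.2", "Say hi")) as resp:
#     reply = json.load(resp)["choices"][0]["message"]["content"]
```

The same server also exposes the native /api/chat and /api/embed endpoints, so agents can pick whichever schema their tooling already speaks.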
Agent Fit
  • Agents usually get the most value from ollama serve plus the JSON API, with the CLI handling model lifecycle, setup, and simple one-shot runs.
  • ollama run accepts piped stdin, supports --format json, and embedding models print a JSON array, so it can participate in shell pipelines even without wrapping the HTTP API.
  • Fit is mixed rather than fully deterministic: the default entrypoint is a TUI, model responses are still probabilistic, and unattended use is constrained by local hardware limits or, for cloud flows, by Ollama account authentication.
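The pipeline use of ollama run can also be driven from a script rather than the shell. A hedged sketch, assuming the ollama CLI is on PATH and the named model is already pulled:

```python
import json
import subprocess


def run_cmd(model: str) -> list[str]:
    """Command line for a one-shot, JSON-constrained `ollama run`."""
    return ["ollama", "run", model, "--format", "json"]


def run_json(model: str, prompt: str) -> dict:
    """Pipe the prompt in on stdin and parse the JSON reply from stdout."""
    # Requires the ollama CLI and a local model; raises on a nonzero exit.
    proc = subprocess.run(
        run_cmd(model), input=prompt, capture_output=True, text=True, check=True
    )
    return json.loads(proc.stdout)
```

Because --format json forces a parseable reply, a wrapper like this can slot into a pipeline without touching the HTTP API at all.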
Caveats
  • Large local models are constrained by available CPU, GPU, memory, and disk; cloud models require sign-in or API-key setup.
  • The bare ollama and ollama launch flows are human-oriented Bubble Tea menus, so automation should call explicit subcommands or the API directly.