Ollama

Ollama is an open-source tool for running large language models locally. It provides a simple command-line interface and API for downloading, managing, and running LLMs on your own hardware.

Overview

Ollama packages models into a single executable with all dependencies included. It supports macOS, Linux, and Windows, and uses llama.cpp for efficient inference on both CPU and GPU. The tool is designed to make local LLM access as simple as possible.

Key Features

Simple CLI: Pull and run models with a single command (ollama run <model>)
Local REST API: Built-in HTTP API compatible with the OpenAI Chat Completions format
Model Library: Curated list of models available via ollama pull (Llama, Mistral, Gemma, Phi, etc.)
Hardware Acceleration: Automatic GPU detection and acceleration via CUDA, ROCm, and Metal
Modelfiles: Custom model definitions with parameters, system prompts, and license info
Cross-Platform: Native support for macOS (including Apple Silicon), Linux, and Windows

Licensing

Ollama is open source (MIT License). The Ollama application is free to use, modify, and distribute. Individual models have their own licenses as defined by their creators.

See Ollama GitHub for license details.

Official Resources

Website: https://ollama.com
GitHub: https://github.com/ollama/ollama
Model Library: https://ollama.com/library

Ollama

Overview

Key Features

Licensing

Official Resources

Related