Run powerful AI models completely offline on your own computer.
No cloud. No subscriptions. No data leaving your machine.
- 100% Offline - Your data never leaves your computer
- Modern Chat UI - Clean, responsive web interface
- Python Library - Simple API for scripting and automation
- Knowledge Base (RAG) - Chat with your documents
- USB Transfer - Move models between air-gapped systems
- Cross-Platform - Linux, macOS, and Windows support
Linux / macOS:
```bash
curl -fsSL https://raw.githubusercontent.com/takuphilchan/offgrid-llm/main/installers/desktop.sh | bash
```

Windows (PowerShell as Admin):

```powershell
irm https://raw.githubusercontent.com/takuphilchan/offgrid-llm/main/installers/desktop.ps1 | iex
```

Python library:

```bash
pip install offgrid
```
Screenshots: Chat Interface · Model Management
```python
import offgrid

# Connect to server
client = offgrid.Client()  # localhost:11611

# Or custom server
client = offgrid.Client(host="http://192.168.1.100:11611")

# Chat
response = client.chat("Hello!")
print(response)

# Specify model
response = client.chat("Hello!", model="Llama-3.2-3B-Instruct")

# Streaming
for chunk in client.chat("Tell me a story", stream=True):
    print(chunk, end="", flush=True)

# With options
response = client.chat(
    "Explain quantum computing",
    model="Llama-3.2-3B-Instruct",
    system="You are a physics teacher",
    temperature=0.7,
    max_tokens=500
)
```

```python
# List models
for model in client.list_models():
    print(model["id"])
```
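Calls like `client.chat` can fail transiently, for example while the local server is still starting up. One way to handle that is a small retry wrapper; this is a sketch of my own (`chat_with_retry` is a hypothetical helper, not part of the offgrid API, and the library's actual exception types may differ):

```python
import time

def chat_with_retry(client, prompt, retries=3, delay=1.0, **options):
    """Retry client.chat on transient failures.

    Hypothetical convenience wrapper, not part of the offgrid API.
    """
    last_err = None
    for attempt in range(retries):
        try:
            return client.chat(prompt, **options)
        except Exception as err:  # offgrid's exact exception types may differ
            last_err = err
            time.sleep(delay * (attempt + 1))  # simple linear backoff
    raise last_err
```

Because it only calls `client.chat`, it works with any keyword options shown above (`model=`, `stream=False`, etc.).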
```python
# Search HuggingFace
results = client.models.search("llama", ram=8)

# Download
client.models.download(
    "bartowski/Llama-3.2-3B-Instruct-GGUF",
    "Llama-3.2-3B-Instruct-Q4_K_M.gguf"
)

# Import/Export USB
client.models.import_usb("/media/usb")
client.models.export_usb("model-name", "/media/usb")
```

```python
# Add documents
client.kb.add("notes.txt")
client.kb.add("meeting", content="Meeting notes...")
client.kb.add_directory("./docs")

# Chat with context
response = client.chat("Summarize the meeting", use_kb=True)

# Search documents
results = client.kb.search("deadline")
```

```python
# Embed a single string
embedding = client.embed("Hello world")
```
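The vectors returned by `client.embed` are plain lists of floats, so they can be compared with cosine similarity for lightweight semantic search. A minimal pure-Python helper (a sketch of my own, not part of the offgrid API):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no meaningful direction
    return dot / (norm_a * norm_b)
```

For example, `cosine_similarity(client.embed("dog"), client.embed("puppy"))` should score higher than an unrelated pair of texts.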
A list input returns one embedding per string:

```python
embeddings = client.embed(["Hello", "World"])
```

After installing the desktop app:
Web Interface: http://localhost:11611/ui/
Command Line:
```bash
offgrid list                  # List models
offgrid search llama --ram 8  # Search HuggingFace
offgrid download-hf repo --file model.gguf
offgrid run model-name        # Interactive chat
offgrid serve                 # Start server
```

| RAM | Models |
|---|---|
| 4GB | 1B-3B parameters |
| 8GB | 7B parameters |
| 16GB+ | 13B+ parameters |
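The table above can be approximated with a rule of thumb: a 4-bit quantized GGUF (e.g. Q4_K_M) takes very roughly 0.5–0.7 bytes per parameter, plus headroom for context and the OS. A back-of-the-envelope sketch (my own heuristic, not an offgrid formula):

```python
def estimate_ram_gb(params_billions, bytes_per_param=0.6, overhead_gb=2.0):
    """Very rough RAM estimate (GB) for running a 4-bit quantized model.

    Heuristic only: real usage varies with quantization level,
    context length, and runtime overhead.
    """
    return params_billions * bytes_per_param + overhead_gb
```

With these defaults a 3B model comes out around 3.8 GB and a 7B around 6.2 GB, in line with the table.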
GPU optional. Supports NVIDIA (CUDA), AMD (ROCm), Apple Silicon (Metal), Vulkan.
| Guide | Description |
|---|---|
| Python Library | Full Python API reference |
| Quick Start | Get running in 5 minutes |
| CLI Reference | All commands |
| API Reference | REST API endpoints |
Docker: docs/DOCKER.md · Contributing: dev/CONTRIBUTING.md
MIT License - LICENSE
Built with llama.cpp

