OpenCodellama.cppQwenLocal-First
OpenCode with Local llama.cpp (Qwen 3.6)
Point OpenCode at a local llama.cpp server. Works with any OpenAI-compatible endpoint.
Create opencode.json in your project root:
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"llamacpp": {
"npm": "@ai-sdk/openai-compatible",
"name": "llama-server",
"options": {
"baseURL": "http://<your-server-ip>:11433/v1"
},
"models": {
"home-qwen": {
"name": "Home Qwen",
"limit": {
"context": 90000,
"output": 90000
}
}
}
}
},
"mcp": {
}
}
Key points:
baseURLpoints to your llama-server’s/v1endpoint (OpenAI-compatible API)- Context limit set to 90K to stay within the MTP + TurboQuant sweet spot on 16 GB VRAM
- The
mcpblock is where you’d add MCP servers (analytics, GitHub, etc.) - Model name is arbitrary — OpenCode sends requests to whatever model the server is running