Snippets | NJannasch.Dev

OpenCode, llama.cpp, Qwen, Local-First

OpenCode with Local llama.cpp (Qwen 3.6)

Connect OpenCode to a local llama.cpp server running Qwen 3.6 MTP. Zero API costs, 90K context, local-first AI coding.

May 23, 2026

llama.cpp, Gemma, MTP, RTX 5060 Ti

Run Gemma 4 26B-A4B with MTP speculative decoding using ik_llama.cpp. Separate drafter model, 133 t/s on an NVIDIA RTX 5060 Ti 16 GB.

May 22, 2026

llama.cpp, Qwen, MTP, RTX 5060 Ti

Run Qwen 3.6 35B-A3B with MTP speculative decoding on llama.cpp. 144 t/s on an NVIDIA RTX 5060 Ti 16 GB.

May 14, 2026

llama.cpp, Gemma, RTX 5060 Ti

llama-server config for Gemma 4 26B-A4B MoE with full 256K context on an NVIDIA RTX 5060 Ti 16 GB. The key: do NOT use --swa-full.

May 1, 2026

llama.cpp, Qwen, TurboQuant, RTX 5060 Ti

llama-server config for Qwen 3.6 35B-A3B with TurboQuant turbo3 KV cache. 400K context window on an RTX 5060 Ti 16 GB.

April 19, 2026

llama.cpp, CUDA, Build, RTX 5060 Ti

CMake build commands for llama.cpp with CUDA GPU acceleration. Works for mainline, ik_llama.cpp, and TurboQuant forks.

March 2, 2026

eBPF, Security, Go

Attach eBPF uprobes to SSL_write and SSL_read to capture HTTPS plaintext before it is encrypted and after it is decrypted. Includes the Node.js static OpenSSL fix.

February 6, 2026