How to Add Gemma 4 to OpenClaw — Fix the Missing Model Error (Free Local AI Agent)

Published: 28 May 2026
Channel: AnassKartit


OpenClaw Doesn't Show Gemma 4? Here's the Fix — Run a Free Local AI Coding Agent on Your Mac

OpenClaw is one of the most exciting open-source AI coding agent frameworks out there, but if you've tried to use Google's brand new Gemma 4 models with it, you've probably hit a wall — the model simply doesn't appear in the list. In this video, I walk you through the exact fix step by step, so you can get Gemma 4 running locally as a fully functional AI coding agent on your MacBook, completely free, no API keys, no subscriptions, no cloud dependency.

This is Part 2 of my Gemma 4 local benchmarking series, where I'm pushing these models to their limits on Apple Silicon. If you haven't seen Part 1 yet, I highly recommend watching it first (linked below) — it covers the initial benchmark results across all four Gemma 4 variants and sets the stage for what we're doing here.

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

🔧 THE PROBLEM

OpenClaw currently doesn't list Gemma 4 models in its built-in model catalog. When you try to select a local model, Gemma 4 is nowhere to be found. This is because the model is brand new and hasn't been added to OpenClaw's default registry yet. But the good news is: OpenClaw supports custom model configurations, and the fix takes less than 2 minutes.

✅ THE FIX (STEP BY STEP)

1. Install Unsloth (includes llama.cpp optimized for your hardware):
curl -fsSL https://unsloth.ai/install.sh | sh

2. Download the Gemma 4 26B model (Q3_K_XL quantization — best balance of quality and speed for 24GB RAM):
huggingface-cli download unsloth/gemma-4-26B-A4B-it-GGUF --include "*Q3_K_XL*"

3. Start the local inference server (replace model.gguf with the path to the GGUF file downloaded in step 2):
~/.unsloth/llama.cpp/llama-server -m model.gguf --port 8089 --jinja --reasoning off

4. Open your OpenClaw config file at ~/.openclaw/openclaw.json and add this custom provider block:

"llamacpp": {
  "api": "openai-completions",
  "baseUrl": "http://127.0.0.1:8089/v1",
  "apiKey": "dummy",
  "models": [
    {
      "id": "gemma4-26b",
      "name": "Gemma 4 26B (local)",
      "reasoning": false,
      "contextWindow": 131072
    }
  ]
}

5. Run your first agent command:
openclaw agent --local -m "hello"

That's it. You now have a local AI coding agent powered by Gemma 4 running entirely on your machine.
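
Steps 2 and 3 can be tied together with two small helpers. This is a minimal sketch, not part of OpenClaw or Unsloth: `find_gguf` and `wait_for_server` are hypothetical names, the cache path assumes the default Hugging Face location (`~/.cache/huggingface/hub`), and the readiness check assumes llama-server answers on its `/health` route at the port used above.

```shell
# Hypothetical helper: print the first GGUF file under a directory whose
# name contains the given quantization tag (for example Q3_K_XL).
find_gguf() {
  dir="$1"
  tag="$2"
  # -L follows symlinks, because the Hugging Face cache links snapshot
  # files to blobs stored elsewhere in the cache.
  find -L "$dir" -type f -name "*${tag}*.gguf" 2>/dev/null | sort | head -n 1
}

# Hypothetical helper: poll a URL once per second until it answers
# successfully or the attempts run out. Returns 0 on success, 1 otherwise.
wait_for_server() {
  url="$1"
  attempts="${2:-30}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if curl -fsS --max-time 2 "$url" >/dev/null 2>&1; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep 1
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage, assuming the download from step 2 went to the default cache:
# MODEL_PATH="$(find_gguf "$HOME/.cache/huggingface/hub" Q3_K_XL)"
# ~/.unsloth/llama.cpp/llama-server -m "$MODEL_PATH" --port 8089 --jinja --reasoning off &
# wait_for_server http://127.0.0.1:8089/health && openclaw agent --local -m "hello"
```

Waiting on the health route before launching the agent avoids a failed first request while the model is still loading into memory.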

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

⚡ PERFORMANCE

On my MacBook M4 Pro (24GB), the Gemma 4 26B model runs at approximately 49 tokens per second with the Q3_K_XL quantization. That's fast enough for a genuinely usable coding agent: not just a demo, but something you can work with day to day. The 128K context window (131,072 tokens, the contextWindow value in the config above) leaves room for long files and multi-file projects, as long as the whole prompt fits within that limit.
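
To put those numbers in perspective, here is a small arithmetic sketch. The helper names are hypothetical, and the time estimate covers token generation only: it ignores prompt processing, which can take much longer when the context is large.

```shell
# Context size in tokens: "128K" means 128 * 1024.
context_tokens() {
  echo $(( $1 * 1024 ))
}

# Rough generation time in whole seconds (rounded up) for a given
# number of output tokens at a given tokens-per-second rate.
gen_seconds() {
  tokens="$1"
  rate="$2"
  echo $(( ($tokens + $rate - 1) / $rate ))
}

context_tokens 128      # prints 131072
gen_seconds 1000 49     # prints 21
```

At the measured rate, a 1,000-token answer takes roughly 21 seconds to generate.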

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

💡 WHY THIS MATTERS

Cloud-based AI coding agents like Claude Code, Cursor, and GitHub Copilot are incredible tools, but they come with costs — both financial and in terms of privacy. Running a local agent means:

→ Zero API costs — no tokens to pay for, ever
→ Complete privacy — your code never leaves your machine
→ No rate limits — use it as much as you want
→ Offline capable — works without internet after setup
→ Full control — customize the model, quantization, and behavior

Gemma 4's Mixture-of-Experts architecture (26B total, ~4B active parameters) is what makes this possible on consumer hardware. You get the intelligence of a much larger model with the speed and memory footprint of a smaller one.
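
A back-of-the-envelope sketch of that trade-off follows. All 26B weights still have to sit in memory, but only about 4B are used per token. The helper names are hypothetical, and the 4 bits per weight is an assumed round figure for illustration, not the measured size of the Q3_K_XL file.

```shell
# Approximate weight memory in GB: parameters (in billions) * bits per weight / 8.
weights_gb() {
  echo $(( $1 * $2 / 8 ))
}

# Share of parameters active per token, as a whole percent (rounded down).
active_percent() {
  echo $(( $2 * 100 / $1 ))
}

weights_gb 26 4        # prints 13 (4 bits per weight is an assumption)
active_percent 26 4    # prints 15
```

Under that assumption the weights fit in 24GB with room left for the context cache, while each token touches only about 15% of the parameters.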

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

🔗 LINKS & RESOURCES

📝 Full written guide (dev.to):
https://dev.to/akartit/how-to-add-gem...

🎬 Part 1 — Gemma 4 Full Benchmark (all 4 models on M4 Pro):
• I Tested Every Gemma 4 Model Locally -- He...

🛠️ LocalCoder — my open-source Claude Code alternative that runs locally:
https://github.com/AnassKartit/localc...

🌐 OpenClaw: https://github.com/anthropics/claude-...
🦙 Unsloth: https://unsloth.ai
🤗 Gemma 4 GGUF models: https://huggingface.co/unsloth/gemma-...

📰 Blog: https://kartit.net
🐦 X/Twitter: https://x.com/AKartit

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

📋 CHAPTERS

Intro: The Problem with OpenClaw + Gemma 4
Installing Unsloth and llama.cpp
Downloading the Gemma 4 26B GGUF Model
Starting the Local Inference Server
Configuring OpenClaw (the actual fix)
First Agent Run and Demo
Performance Results (49 tok/s)
LocalCoder: My Claude Code Clone
Wrap-up and What's Next

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

🏷️ TAGS

gemma 4, openclaw, local llm, ai coding agent, macbook m4 pro, llama.cpp, unsloth, gemma 4 26b, open source ai, local ai, free coding agent, claude code alternative, apple silicon ai, gguf, hugging face, mixture of experts, localcoder, ai agent setup, openclaw fix, gemma 4 benchmark

#gemma4 #openclaw #localllm #ai #opensource #macbook #llamacpp #codingagent #localai #unsloth