The Architecture
Before you start clicking through Oracle Cloud's setup wizard, it helps to understand how the four components fit together. The diagram below shows where each piece lives and how they communicate.
Free Hermes Agent Stack — Data Flow
Notice that the heavy lifting — the actual LLM inference — happens on OpenRouter's servers, not on your VPS. Your free VPS only needs to run the lightweight Hermes Agent process (~500MB RAM at idle), a small SQLite database for memory, and any tools the agent invokes (browser, terminal, file ops). That's why a 24GB Oracle Cloud instance feels like overkill — and why this setup actually works on much smaller VPSes too.
About Qwen 3.6 Plus (Reality Check)
If you came here looking specifically for qwen/qwen3.6-plus:free, here's the honest situation as of April 10, 2026:
| Model | Free Status | As of Today |
|---|---|---|
| qwen/qwen3.6-plus:free | Deprecated | Free tier ended April 7, 2026 |
| qwen/qwen3.6-plus-preview:free | Uncertain | May still work, no guarantees |
| qwen/qwen3-coder:free | Free ✓ | Confirmed free, 262K context, tools |
| qwen/qwen3.6-plus (paid) | Paid only | $0.325/M in, $1.95/M out |
- It's currently free — no surprise paywall after 5 days
- Native tool calling — confirmed support for `tools` and `tool_choice` parameters (essential for Hermes Agent)
- 262K context — enough to load entire codebases or long conversation histories
- Optimized for agentic coding — Qwen explicitly designed it for "function calling, tool use, and long-context reasoning"
- 480B parameters with 35B active (MoE) — runs faster than dense models of similar quality
For the rest of this guide, we'll use Qwen3 Coder 480B. Everything else (VPS setup, Hermes installation, configuration) stays identical regardless of which model you choose — so if a better free model appears next week, you can swap with one command: hermes model.
The Best Free Models on OpenRouter (April 2026)
OpenRouter currently lists 27 free models. Most are not suitable for AI agents because they lack tool/function calling support. Below is the shortlist of free models that actually work with Hermes Agent — ranked by context window.
Free Models on OpenRouter — Context Window Comparison
Detailed Comparison (Tool Calling Confirmed)
| Model | Context | Tools | Best For |
|---|---|---|---|
| Qwen3 Coder 480B ⭐ | 262K | ✓ | Coding agents, complex reasoning |
| Nemotron 3 Super 120B | 262K | ✓ | Long-context agents, planning |
| Qwen3 Next 80B | 262K | ✓ | General-purpose, fast inference |
| GLM 4.5 Air (Z.ai) | 131K | ✓ | Conversational, multilingual |
| MiniMax M2.5 | 197K | ✓ | Multi-step task chains |
| Llama 3.3 70B | 66K | ✓ | Reliable fallback, well-documented |
Source: OpenRouter free models collection, verified April 10, 2026. Free models share capacity with paid tiers and may experience throttling during peak hours.
Step 1: Get Your Free VPS
We're using Oracle Cloud's Always Free tier. It's the only major cloud that gives you genuinely free compute with no time limit and no credit card expiration. You get up to 4 ARM OCPUs (Ampere A1), 24GB RAM, 200GB block storage, and 10TB outbound data transfer per month.
Oracle requires a credit card to sign up — but you won't be charged unless you explicitly upgrade to a paid plan. The Always Free resources stay free even after the 30-day trial credits expire. The verification charge is $0 (or a small temporary hold that's refunded).
1.1 Sign up for Oracle Cloud
- Go to oracle.com/cloud/free and click "Start for free"
- Choose your home region carefully — pick one with ARM capacity (US East Ashburn, Frankfurt, and London tend to have the best availability)
- Complete email verification and credit card validation
- Wait 5-10 minutes for account provisioning
1.2 Create an Always Free ARM instance
- From the Oracle Cloud dashboard, click Create a VM instance
- Name it
hermes-agent - Under Image and shape, click "Edit" and select Canonical Ubuntu 22.04
- Click "Change shape" → select Ampere → VM.Standard.A1.Flex
- Set OCPUs to 4 and Memory to 24 GB (the maximum free allocation)
- Add an SSH key (generate one with
ssh-keygenif you don't have one) - Click Create
This is Oracle's most common gotcha. The free ARM tier is heavily oversubscribed in popular regions. Your options:
- Wait 5-10 minutes and try again (capacity frees up constantly)
- Try a different availability domain (AD-1, AD-2, AD-3) within your region
- Use a community script like oracle-freetier-instance-creation to auto-retry
- Switch to a less popular region (Frankfurt and Switzerland often have capacity)
For a deeper VPS comparison and Oracle Cloud setup tips, see our Best VPS for AI Agents guide.
Step 2: Initial Server Setup
Once your instance is running, grab the public IP from the Oracle Cloud dashboard and SSH in. Run these commands in order — each one is essential.
2.1 SSH into your VPS
# Replace YOUR_IP with the public IP from Oracle Cloud
ssh ubuntu@YOUR_IP
# If you get a permissions error on your private key:
chmod 600 ~/.ssh/id_rsa2.2 Update the system & install Git
sudo apt update && sudo apt upgrade -y
sudo apt install -y git curl ca-certificates
sudo apt autoremove -y2.3 Open the firewall (Oracle-specific)
Oracle Cloud uses iptables by default and blocks most ports. You need to allow outbound HTTPS so Hermes can talk to OpenRouter:
# Allow outbound HTTPS (already allowed by default but verify)
sudo iptables -L OUTPUT -n | head -5
# If you plan to use Hermes via Telegram/Discord (no inbound port needed)
# you can skip the rest. If you want to expose a webhook or web UI later,
# open port 443 in BOTH places:
# 1. Oracle Cloud Console: Networking → VCN → Security List → Ingress Rules
# 2. The Ubuntu firewall:
sudo iptables -I INPUT -p tcp --dport 443 -j ACCEPT
sudo netfilter-persistent save2.4 (Optional but recommended) Create a non-root user
# Create user, grant sudo, copy SSH keys
sudo adduser hermes
sudo usermod -aG sudo hermes
sudo rsync --archive --chown=hermes:hermes ~/.ssh /home/hermes
sudo su - hermesRunning Hermes Agent as a non-root user limits the blast radius if anything goes wrong. Even though Hermes won't intentionally do anything destructive, AI agents can be manipulated via prompt injection — running as a non-root user is a basic safety net.
Step 3: Get Your Free OpenRouter API Key
OpenRouter is a unified API gateway for 200+ LLMs. It's OpenAI-compatible, which means anything that speaks the OpenAI API format (including Hermes Agent) works out of the box. Free tier signup takes 30 seconds.
- Go to openrouter.ai and sign in (Google, GitHub, or email)
- Navigate to Keys in the sidebar
- Click Create Key, name it
hermes-agent - Copy the key (starts with
sk-or-v1-...) - You don't need to add credits to use free models — but be aware of the rate limits
As of April 2026, free models on OpenRouter are subject to:
For most personal automation, 200 requests/day is plenty. If you need more, add $10 of credits and the limits increase significantly.
Step 4: Install Hermes Agent
Hermes Agent has the cleanest install of any AI agent framework I've tested. One command does everything — installs Python 3.11, Node.js, ripgrep, ffmpeg, uv (the Python package manager), and Hermes itself. The installer auto-detects your OS.
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bashThe installer takes 3-5 minutes on a fresh Oracle Cloud ARM instance. When it finishes, reload your shell:
source ~/.bashrc
# Verify the install
hermes --versionHermes Agent works on ARM (aarch64) without issues — Python wheels exist for all dependencies. However, if you see any "wheel not found" errors during install, run sudo apt install -y build-essential python3-dev first and retry.
Step 5: Configure With the Free Model
Now we wire Hermes to OpenRouter and tell it to use Qwen3 Coder 480B (free). There are two pieces: the API key (stored as a secret) and the model selection (stored in config).
5.1 Set the OpenRouter API key
hermes config set OPENROUTER_API_KEY sk-or-v1-YOUR_KEY_HERE5.2 Set Qwen3 Coder 480B as the active model
hermes model openrouter/qwen/qwen3-coder:freeOr if you prefer to edit the config file directly:
model:
provider: openrouter
model: qwen/qwen3-coder:free
# Optional: use a different model for vision tasks
auxiliary:
vision:
provider: openrouter
model: google/gemma-4-31b:freeIf you want to gamble on the preview variant being still free, try this instead. It may work, may not — OpenRouter doesn't guarantee preview models:
hermes model openrouter/qwen/qwen3.6-plus-preview:freeIf you get an error like "model has been deprecated," just switch back to qwen/qwen3-coder:free.
Step 6: Your First Agent Run
Time to see if everything works. Start with a simple chat to verify the model connection:
hermes chat -q "Hello! What model are you running on?"You should see a response within a few seconds. If you get a 401 error, your API key is wrong. If you get a 429, you've hit the rate limit (wait a minute). If you get a 404 with "model not found," the free tier may have ended — try a different model from the table above.
Test the agentic capabilities
Now let's test something that actually exercises Hermes's tool-calling and learning loop:
hermes chat -q "Check disk usage on this server, then write a summary to ~/disk-report.md"Hermes will: invoke the terminal tool to run df -h, parse the output, format a summary, write it to a file, and confirm completion. The first time it does this, it'll create a skill document in ~/.hermes/skills/. Next time you ask something similar, it'll reference that skill and run faster.
Step 7: Run Hermes as a Service (Optional)
If you want Hermes to run continuously — for scheduled tasks, Telegram/Discord bot mode, or always-on memory — set it up as a systemd service so it survives reboots.
[Unit]
Description=Hermes Agent
After=network.target
[Service]
Type=simple
User=hermes
WorkingDirectory=/home/hermes
ExecStart=/home/hermes/.local/bin/hermes serve
Restart=on-failure
RestartSec=10
Environment="HOME=/home/hermes"
[Install]
WantedBy=multi-user.targetsudo systemctl daemon-reload
sudo systemctl enable hermes
sudo systemctl start hermes
# Check it's running
sudo systemctl status hermes
# View logs
sudo journalctl -u hermes -fWith the systemd service enabled, Hermes will automatically start when the VPS reboots — including after Oracle Cloud's scheduled maintenance windows. You don't need to SSH in and manually restart anything.
Cost Reality Check
"$0/month" sounds too good to be true, so let's break down what you're actually getting for free vs. what the equivalent paid setup would cost.
| Component | This Guide (Free) | Equivalent Paid Setup |
|---|---|---|
| VPS (4 vCPU, 24GB RAM) | $0/mo (Oracle Always Free) | $24-40/mo (Hetzner CCX23) |
| 200GB storage | $0 (included) | $10-20/mo |
| 10TB outbound transfer | $0 (included) | $50-100/mo on AWS |
| LLM API (200 req/day) | $0 (free model rate limit) | ~$30-60/mo (Claude/GPT-4) |
| Hermes Agent license | $0 (MIT open source) | $0 (also free) |
| Monthly Total | $0 | $114-220 |
You're essentially getting ~$1,400-2,600 of value per year for free. The catch is rate limits and the inherent fragility of relying on free tiers — both providers can change terms with little notice.
Rate Limits & Gotchas
Free tiers come with constraints. Understanding them upfront prevents frustration when your agent suddenly stops working at 3 AM.
OpenRouter Free Model Rate Limits
20 requests/minute, 200 requests/day per account. Hermes's caching helps a lot, but a chatty bot hitting the LLM on every keystroke will exhaust this quickly. Solution: add $10 of credits to OpenRouter and the limits increase significantly.
Oracle Cloud Capacity Shortages
ARM instances are oversubscribed. You may get "Out of capacity" errors trying to create or restart instances. Always create a snapshot/backup of your config so you can quickly redeploy in a different region if needed.
Free Models Get Deprecated
Qwen 3.6 Plus had a 5-day free window before being moved to paid. Always have a backup model in mind. The free models router (openrouter/free) auto-routes to whatever's currently free, but the model behavior changes which can confuse Hermes's skill system.
Oracle's Idle Reclamation Policy
Oracle Cloud reclaims Always Free compute instances that have been idle for 7 consecutive days (CPU < 20%, network < 15KBps). To prevent this, run a small monitoring task or keep Hermes's scheduled jobs active.
Data Collection on Free Models
OpenRouter's free models often collect prompt and completion data to improve the underlying model. Don't send sensitive PII, credentials, or proprietary code through free tiers. For sensitive workloads, use Ollama locally on the same VPS instead.
Skip the Setup: Cognio Professional Deployment
The free setup is great for personal use and learning. For business-critical deployments where you need reliability, security hardening, and someone to call when things break — Cognio engineers deploy a production-ready Hermes Agent (or OpenClaw) instance on your infrastructure in 24-48 hours.