Is Qwen 3.6 Plus still free on OpenRouter?

No. The free tier for qwen/qwen3.6-plus:free was deprecated on April 7, 2026. The model is now paid-only at approximately $0.325/M input tokens and $1.95/M output tokens. The qwen/qwen3.6-plus-preview:free variant may still have limited free access — but it's unstable and may end at any time. For a reliable free experience right now, we recommend Qwen3 Coder 480B (qwen/qwen3-coder:free), which is currently confirmed free, has 262K context, and supports tool calling — making it actually better suited for AI agents than Qwen 3.6 Plus would have been.

Is this really $0/month or are there hidden costs?

Genuinely $0/month if you stay within Oracle Cloud's Always Free tier limits (4 ARM OCPUs, 24GB RAM, 200GB block storage, 10TB outbound transfer/month) and use only OpenRouter free models. The only paid components would be: (1) a custom domain name (~$10/year, optional), (2) if you exceed OpenRouter's free model rate limits and need to upgrade ($0.325/M tokens for paid Qwen 3.6 Plus), or (3) if you want a managed deployment with priority support (Cognio's $499 setup).

What are the rate limits on OpenRouter free models?

Free models on OpenRouter typically have a rate limit of 20 requests per minute and 200 requests per day per account. This is sufficient for personal use and learning but will hit limits if you build a high-traffic chatbot. To get higher limits, you can: (1) add credits to your OpenRouter account (the limits scale with your balance), (2) self-host a model via Ollama on your VPS, or (3) use a paid model. For most solo developers running Hermes Agent for personal automation, 200 requests/day is plenty.

Why Oracle Cloud and not Hetzner or DigitalOcean?

Oracle Cloud Always Free is the only major cloud that offers 4 ARM OCPUs and 24GB RAM completely free, forever, with no time limit. Hetzner is faster and easier but starts at $3.79/month. DigitalOcean has a $200 credit but it expires after 60 days. For a strictly $0/month setup, Oracle Cloud is the only viable option in 2026. The trade-off: Oracle's setup is more complex (capacity issues, ARM architecture), so expect 1-2 hours of setup time.

Will Hermes Agent actually run well on a free VPS?

Yes — Hermes Agent itself is lightweight (Python-based, ~500MB RAM idle). The 24GB RAM on Oracle Cloud's free ARM tier is overkill for the agent itself. The heavy lifting happens on OpenRouter's servers (the LLM inference), not your VPS. Your VPS just orchestrates tool calls, manages memory in SQLite, and handles the learning loop. Even a 2GB VPS would technically work, but the 24GB free tier gives you headroom for browser automation, document processing, and running other services alongside.

How does Hermes Agent's learning loop save me money on free models?

Hermes Agent's cache-aware architecture freezes the system prompt snapshot at session initialization and uses cached context windows for repeated calls. This dramatically reduces redundant token usage. Combined with the auto-generated skill system — where Hermes turns successful workflows into reusable skill documents — you naturally use fewer tokens over time on similar tasks. After 10-20 similar tasks, execution speed improves 2-3x while token usage drops, making free model rate limits much less of a constraint.

Can I use this setup for production workloads?

It depends on what you mean by production. For personal automation, side projects, learning, and proof-of-concepts: absolutely yes. For business-critical workloads: only if you accept the trade-offs. Free models can be deprecated with little notice (Qwen 3.6 Plus had a 5-day free window), Oracle's free tier has occasional capacity issues, and rate limits will throttle you under load. For mission-critical use, Cognio's $499 professional setup deploys a hardened instance with paid LLM credits and 30 days of priority support.

Hermes Agent on Free VPS + Free LLM: Complete 2026 Setup Guide

The Architecture

Before you start clicking through Oracle Cloud's setup wizard, it helps to understand how the four components fit together. The diagram below shows where each piece lives and how they communicate.

Free Hermes Agent Stack — Data Flow

You

SSH / Telegram / CLI

Oracle Cloud Always Free VPS

4 ARM OCPUs · 24GB RAM · Ubuntu 22.04

Hermes Agent (Python)

SQLite Memory

40+ Tools

Skill Loop

HTTPS / OpenAI API

OpenRouter API Gateway

Qwen3 Coder 480B (free)

262K context · MoE · Tool calling

Notice that the heavy lifting — the actual LLM inference — happens on OpenRouter's servers, not on your VPS. Your free VPS only needs to run the lightweight Hermes Agent process (~500MB RAM at idle), a small SQLite database for memory, and any tools the agent invokes (browser, terminal, file ops). That's why a 24GB Oracle Cloud instance feels like overkill — and why this setup actually works on much smaller VPSes too.

About Qwen 3.6 Plus (Reality Check)

If you came here looking specifically for qwen/qwen3.6-plus:free, here's the honest situation as of April 10, 2026:

Model	Free Status	As of Today
qwen/qwen3.6-plus:free	Deprecated	Free tier ended April 7, 2026
qwen/qwen3.6-plus-preview:free	Uncertain	May still work, no guarantees
qwen/qwen3-coder:free	Free ✓	Confirmed free, 262K context, tools
qwen/qwen3.6-plus (paid)	Paid only	$0.325/M in, $1.95/M out

Why Qwen3 Coder 480B is actually a better choice

It's currently free — no surprise paywall after 5 days
Native tool calling — confirmed support for `tools` and `tool_choice` parameters (essential for Hermes Agent)
262K context — enough to load entire codebases or long conversation histories
Optimized for agentic coding — Qwen explicitly designed it for "function calling, tool use, and long-context reasoning"
480B parameters with 35B active (MoE) — runs faster than dense models of similar quality

For the rest of this guide, we'll use Qwen3 Coder 480B. Everything else (VPS setup, Hermes installation, configuration) stays identical regardless of which model you choose — so if a better free model appears next week, you can swap with one command: hermes model.

The Best Free Models on OpenRouter (April 2026)

OpenRouter currently lists 27 free models. Most are not suitable for AI agents because they lack tool/function calling support. Below is the shortlist of free models that actually work with Hermes Agent — ranked by context window.

Free Models on OpenRouter — Context Window Comparison

Qwen3 Coder 480B262K

Nemotron 3 Super 120B262K

Qwen3 Next 80B262K

Gemma 4 31B262K

Trinity Large Preview131K

GLM 4.5 Air131K

GPT-OSS-120B131K

MiniMax M2.5197K

Llama 3.3 70B66K

Detailed Comparison (Tool Calling Confirmed)

Model	Context	Tools	Best For
Qwen3 Coder 480B ⭐	262K	✓	Coding agents, complex reasoning
Nemotron 3 Super 120B	262K	✓	Long-context agents, planning
Qwen3 Next 80B	262K	✓	General-purpose, fast inference
GLM 4.5 Air (Z.ai)	131K	✓	Conversational, multilingual
MiniMax M2.5	197K	✓	Multi-step task chains
Llama 3.3 70B	66K	✓	Reliable fallback, well-documented

Source: OpenRouter free models collection, verified April 10, 2026. Free models share capacity with paid tiers and may experience throttling during peak hours.

Step 1: Get Your Free VPS

We're using Oracle Cloud's Always Free tier. It's the only major cloud that gives you genuinely free compute with no time limit and no credit card expiration. You get up to 4 ARM OCPUs (Ampere A1), 24GB RAM, 200GB block storage, and 10TB outbound data transfer per month.

Heads up: Oracle Cloud requires a credit card for verification

Oracle requires a credit card to sign up — but you won't be charged unless you explicitly upgrade to a paid plan. The Always Free resources stay free even after the 30-day trial credits expire. The verification charge is $0 (or a small temporary hold that's refunded).

1.1 Sign up for Oracle Cloud

Go to oracle.com/cloud/free and click "Start for free"
Choose your home region carefully — pick one with ARM capacity (US East Ashburn, Frankfurt, and London tend to have the best availability)
Complete email verification and credit card validation
Wait 5-10 minutes for account provisioning

1.2 Create an Always Free ARM instance

From the Oracle Cloud dashboard, click Create a VM instance
Name it hermes-agent
Under Image and shape, click "Edit" and select Canonical Ubuntu 22.04
Click "Change shape" → select Ampere → VM.Standard.A1.Flex
Set OCPUs to 4 and Memory to 24 GB (the maximum free allocation)
Add an SSH key (generate one with ssh-keygen if you don't have one)
Click Create

If you see 'Out of capacity' errors

This is Oracle's most common gotcha. The free ARM tier is heavily oversubscribed in popular regions. Your options:

Wait 5-10 minutes and try again (capacity frees up constantly)
Try a different availability domain (AD-1, AD-2, AD-3) within your region
Use a community script like oracle-freetier-instance-creation to auto-retry
Switch to a less popular region (Frankfurt and Switzerland often have capacity)

For a deeper VPS comparison and Oracle Cloud setup tips, see our Best VPS for AI Agents guide.

Step 2: Initial Server Setup

Once your instance is running, grab the public IP from the Oracle Cloud dashboard and SSH in. Run these commands in order — each one is essential.

2.1 SSH into your VPS

Connect to the new instancebash

# Replace YOUR_IP with the public IP from Oracle Cloud
ssh ubuntu@YOUR_IP

# If you get a permissions error on your private key:
chmod 600 ~/.ssh/id_rsa

2.2 Update the system & install Git

Bring the system up to datebash

sudo apt update && sudo apt upgrade -y
sudo apt install -y git curl ca-certificates
sudo apt autoremove -y

2.3 Open the firewall (Oracle-specific)

Oracle Cloud uses iptables by default and blocks most ports. You need to allow outbound HTTPS so Hermes can talk to OpenRouter:

Allow outbound trafficbash

# Allow outbound HTTPS (already allowed by default but verify)
sudo iptables -L OUTPUT -n | head -5

# If you plan to use Hermes via Telegram/Discord (no inbound port needed)
# you can skip the rest. If you want to expose a webhook or web UI later,
# open port 443 in BOTH places:
#   1. Oracle Cloud Console: Networking → VCN → Security List → Ingress Rules
#   2. The Ubuntu firewall:
sudo iptables -I INPUT -p tcp --dport 443 -j ACCEPT
sudo netfilter-persistent save

2.4 (Optional but recommended) Create a non-root user

Set up a dedicated user for Hermesbash

# Create user, grant sudo, copy SSH keys
sudo adduser hermes
sudo usermod -aG sudo hermes
sudo rsync --archive --chown=hermes:hermes ~/.ssh /home/hermes
sudo su - hermes

Why a dedicated user matters

Running Hermes Agent as a non-root user limits the blast radius if anything goes wrong. Even though Hermes won't intentionally do anything destructive, AI agents can be manipulated via prompt injection — running as a non-root user is a basic safety net.

Step 3: Get Your Free OpenRouter API Key

OpenRouter is a unified API gateway for 200+ LLMs. It's OpenAI-compatible, which means anything that speaks the OpenAI API format (including Hermes Agent) works out of the box. Free tier signup takes 30 seconds.

Go to openrouter.ai and sign in (Google, GitHub, or email)
Navigate to Keys in the sidebar
Click Create Key, name it hermes-agent
Copy the key (starts with sk-or-v1-...)
You don't need to add credits to use free models — but be aware of the rate limits

OpenRouter Free Tier Rate Limits

As of April 2026, free models on OpenRouter are subject to:

requests/minute

200

requests/day

For most personal automation, 200 requests/day is plenty. If you need more, add $10 of credits and the limits increase significantly.

Step 4: Install Hermes Agent

Hermes Agent has the cleanest install of any AI agent framework I've tested. One command does everything — installs Python 3.11, Node.js, ripgrep, ffmpeg, uv (the Python package manager), and Hermes itself. The installer auto-detects your OS.

One-line installbash

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

The installer takes 3-5 minutes on a fresh Oracle Cloud ARM instance. When it finishes, reload your shell:

Reload shell to activate hermes commandbash

source ~/.bashrc

# Verify the install
hermes --version

ARM compatibility check

Hermes Agent works on ARM (aarch64) without issues — Python wheels exist for all dependencies. However, if you see any "wheel not found" errors during install, run sudo apt install -y build-essential python3-dev first and retry.

Step 5: Configure With the Free Model

Now we wire Hermes to OpenRouter and tell it to use Qwen3 Coder 480B (free). There are two pieces: the API key (stored as a secret) and the model selection (stored in config).

5.1 Set the OpenRouter API key

Save your key as a Hermes secretbash

hermes config set OPENROUTER_API_KEY sk-or-v1-YOUR_KEY_HERE

5.2 Set Qwen3 Coder 480B as the active model

Switch to Qwen3 Coder via OpenRouterbash

hermes model openrouter/qwen/qwen3-coder:free

Or if you prefer to edit the config file directly:

~/.hermes/config.yamlyaml

model:
  provider: openrouter
  model: qwen/qwen3-coder:free

# Optional: use a different model for vision tasks
auxiliary:
  vision:
    provider: openrouter
    model: google/gemma-4-31b:free

Want to try Qwen 3.6 Plus Preview anyway?

If you want to gamble on the preview variant being still free, try this instead. It may work, may not — OpenRouter doesn't guarantee preview models:

hermes model openrouter/qwen/qwen3.6-plus-preview:free

If you get an error like "model has been deprecated," just switch back to qwen/qwen3-coder:free.

Step 6: Your First Agent Run

Time to see if everything works. Start with a simple chat to verify the model connection:

Quick smoke testbash

hermes chat -q "Hello! What model are you running on?"

You should see a response within a few seconds. If you get a 401 error, your API key is wrong. If you get a 429, you've hit the rate limit (wait a minute). If you get a 404 with "model not found," the free tier may have ended — try a different model from the table above.

Test the agentic capabilities

Now let's test something that actually exercises Hermes's tool-calling and learning loop:

Real agent taskbash

hermes chat -q "Check disk usage on this server, then write a summary to ~/disk-report.md"

Hermes will: invoke the terminal tool to run df -h, parse the output, format a summary, write it to a file, and confirm completion. The first time it does this, it'll create a skill document in ~/.hermes/skills/. Next time you ask something similar, it'll reference that skill and run faster.

Step 7: Run Hermes as a Service (Optional)

If you want Hermes to run continuously — for scheduled tasks, Telegram/Discord bot mode, or always-on memory — set it up as a systemd service so it survives reboots.

/etc/systemd/system/hermes.serviceini

[Unit]
Description=Hermes Agent
After=network.target

[Service]
Type=simple
User=hermes
WorkingDirectory=/home/hermes
ExecStart=/home/hermes/.local/bin/hermes serve
Restart=on-failure
RestartSec=10
Environment="HOME=/home/hermes"

[Install]
WantedBy=multi-user.target

Enable and start the servicebash

sudo systemctl daemon-reload
sudo systemctl enable hermes
sudo systemctl start hermes

# Check it's running
sudo systemctl status hermes

# View logs
sudo journalctl -u hermes -f

Auto-restart on reboot

With the systemd service enabled, Hermes will automatically start when the VPS reboots — including after Oracle Cloud's scheduled maintenance windows. You don't need to SSH in and manually restart anything.

Cost Reality Check

"$0/month" sounds too good to be true, so let's break down what you're actually getting for free vs. what the equivalent paid setup would cost.

Component	This Guide (Free)	Equivalent Paid Setup
VPS (4 vCPU, 24GB RAM)	$0/mo (Oracle Always Free)	$24-40/mo (Hetzner CCX23)
200GB storage	$0 (included)	$10-20/mo
10TB outbound transfer	$0 (included)	$50-100/mo on AWS
LLM API (200 req/day)	$0 (free model rate limit)	~$30-60/mo (Claude/GPT-4)
Hermes Agent license	$0 (MIT open source)	$0 (also free)
Monthly Total	$0	$114-220

You're essentially getting ~$1,400-2,600 of value per year for free. The catch is rate limits and the inherent fragility of relying on free tiers — both providers can change terms with little notice.

Rate Limits & Gotchas

Free tiers come with constraints. Understanding them upfront prevents frustration when your agent suddenly stops working at 3 AM.

OpenRouter Free Model Rate Limits

20 requests/minute, 200 requests/day per account. Hermes's caching helps a lot, but a chatty bot hitting the LLM on every keystroke will exhaust this quickly. Solution: add $10 of credits to OpenRouter and the limits increase significantly.

Oracle Cloud Capacity Shortages

ARM instances are oversubscribed. You may get "Out of capacity" errors trying to create or restart instances. Always create a snapshot/backup of your config so you can quickly redeploy in a different region if needed.

Free Models Get Deprecated

Qwen 3.6 Plus had a 5-day free window before being moved to paid. Always have a backup model in mind. The free models router (openrouter/free) auto-routes to whatever's currently free, but the model behavior changes which can confuse Hermes's skill system.

Oracle's Idle Reclamation Policy

Oracle Cloud reclaims Always Free compute instances that have been idle for 7 consecutive days (CPU < 20%, network < 15KBps). To prevent this, run a small monitoring task or keep Hermes's scheduled jobs active.

Data Collection on Free Models

OpenRouter's free models often collect prompt and completion data to improve the underlying model. Don't send sensitive PII, credentials, or proprietary code through free tiers. For sensitive workloads, use Ollama locally on the same VPS instead.

Skip the Setup: Cognio Professional Deployment

The free setup is great for personal use and learning. For business-critical deployments where you need reliability, security hardening, and someone to call when things break — Cognio engineers deploy a production-ready Hermes Agent (or OpenClaw) instance on your infrastructure in 24-48 hours.

What's Included in the $499 Setup

VPS provisioning and OS hardening

Hermes Agent install with systemd service

Nginx reverse proxy with SSL certificates

Security audit & firewall configuration

API key management & secrets handling

Model selection & cost optimization

Telegram/Discord/Slack channel setup

Performance tuning for your workload

Auto-backup of skills & memory

30 days of priority support

$499

One-time setup fee

24-48 hrs

Deployment time

30 days

Priority support

Book Free Discovery Call Compare with OpenClaw

Hermes Agent on Free VPS + Free LLM:The Complete $0/Month Setup

Quick Answer: The $0/Month Stack

Table of Contents