Current Status: The Google Gemini models (including gemini-3.1-pro-preview, gemini-3.1-pro-preview-customtools, gemini-3.1-flash-lite-preview, and the 2.5 variants) are active and functioning. They currently serve as the primary intelligence engines because they are standing in as fallbacks for the locked-out Codex models.
Google AI Studio Free Tier Limits
The Gemini API through Google AI Studio applies its free-tier limits per Google Cloud project, so every API key in the same project draws from one shared allowance. These limits are structural and absolute:
- Gemini 3.1 Pro & Custom Tools: Paid-only. While you can test gemini-3.1-pro-preview and gemini-3.1-pro-preview-customtools inside the Google AI Studio web interface for free, programmatic API access to these models has no free tier. Any programmatic use of them draws from a paid quota, not the free tier.
- Gemini 3 Flash-Lite / Previews: Generally 5-15 RPM, 100,000 to 250,000 Tokens Per Minute (TPM), and up to 1,000 RPD.
- Gemini 2.5 Pro: 5 Requests Per Minute (RPM), 250,000 TPM, 100 Requests Per Day (RPD).
- Gemini 2.5 Flash: 10 RPM, 250,000 TPM, 250 RPD.
- Gemini 2.5 Flash-Lite: 15 RPM, 250,000 TPM, 1,000 RPD.
- Universal Token Limit: No free-tier model exceeds 250,000 TPM. Very large prompts can exhaust the token budget even when the RPM limit has not been reached.
- Grounding (Google Search): 5,000 free queries/month for Gemini 3; 1,500/day for Gemini 2.5.
- Reference: Google Gemini API Rate Limits Documentation
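The per-minute limits above can be checked client-side before a request is sent. The following is a minimal sketch, not part of OpenClaw: the MinuteBudget class, the dictionary keys, and the sliding 60-second window are illustrative assumptions, and the figures are copied from the list above and should be verified against Google's current documentation. It shows how a single large prompt can trip the TPM limit while the RPM limit still has room.

```python
from collections import deque
from dataclasses import dataclass, field

# Free-tier figures copied from the list above (assumed current; verify before use).
# The dictionary keys are illustrative labels, not confirmed API identifiers.
FREE_TIER_LIMITS = {
    "gemini-2.5-pro": {"rpm": 5, "tpm": 250_000, "rpd": 100},
    "gemini-2.5-flash": {"rpm": 10, "tpm": 250_000, "rpd": 250},
    "gemini-2.5-flash-lite": {"rpm": 15, "tpm": 250_000, "rpd": 1_000},
}


@dataclass
class MinuteBudget:
    """Hypothetical sliding-window tracker for requests and tokens per minute."""
    rpm: int
    tpm: int
    window: float = 60.0
    _events: deque = field(default_factory=deque)  # entries are (timestamp, tokens)

    def _prune(self, now: float) -> None:
        # Drop requests that have aged out of the window.
        while self._events and now - self._events[0][0] >= self.window:
            self._events.popleft()

    def would_exceed(self, tokens: int, now: float):
        """Return "rpm" or "tpm" if sending now would break a limit, else None."""
        self._prune(now)
        if len(self._events) + 1 > self.rpm:
            return "rpm"
        if sum(t for _, t in self._events) + tokens > self.tpm:
            return "tpm"
        return None

    def record(self, tokens: int, now: float) -> None:
        # Call after a request has actually been sent.
        self._events.append((now, tokens))


limits = FREE_TIER_LIMITS["gemini-2.5-pro"]
budget = MinuteBudget(rpm=limits["rpm"], tpm=limits["tpm"])
budget.record(200_000, now=0.0)
# Only one request has been sent, yet a second large prompt is blocked by TPM.
print(budget.would_exceed(100_000, now=1.0))
```

The daily limit (RPD) is left out of this sketch; tracking it would need a counter that persists across restarts.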
How to Diagnose Gemini Constraints
Diagnosing Gemini is structurally different from diagnosing Codex. When you run openclaw models and inspect the resulting session, you will see a critical difference in the Auth Overview section:
google effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=1 (oauth=0, token=0, api_key=1)
The Core Difference: Codex operates via OAuth (oauth=1), which allows the OpenClaw daemon to actively track session cooldowns, hourly usage, and weekly quotas. Gemini operates via a static API Key (api_key=1). Because it uses a direct API key, OpenClaw does not track preemptive cooldown timers or percentage-based quotas for Gemini in the OAuth/token status readout.
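To make the distinction concrete, the credential counts in an Auth Overview line can be read mechanically. This is a rough sketch that assumes the line always has the shape of the single sample shown above; parse_auth_counts and has_cooldown_tracking are hypothetical helpers, not OpenClaw functions.

```python
import re


def parse_auth_counts(line: str):
    """Return (provider, counts) from a line like the Auth Overview sample above."""
    provider = line.split(None, 1)[0]
    counts = {}
    # The counts sit in the trailing parenthesised group: (oauth=0, token=0, api_key=1)
    match = re.search(r"\(([^)]*)\)\s*$", line)
    if match:
        for part in match.group(1).split(","):
            key, _, value = part.strip().partition("=")
            counts[key] = int(value)
    return provider, counts


def has_cooldown_tracking(counts: dict) -> bool:
    # Assumption drawn from the text above: only OAuth-backed profiles get
    # cooldown and quota tracking; API-key profiles do not.
    return counts.get("oauth", 0) > 0


sample = ("google effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json"
          " | profiles=1 (oauth=0, token=0, api_key=1)")
provider, counts = parse_auth_counts(sample)
print(provider, counts, has_cooldown_tracking(counts))
```

For the sample line this reports the google provider with one API key and no tracking, matching the behavior described above.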
- Run openclaw models: You will notice Gemini does not appear in the "OAuth/token status" block at the bottom. This is normal and expected.
- Detecting Limits: If Gemini hits a rate limit or exhausts its quota, no cooldown timer announces it. Instead, the request fails at runtime with an explicit 429 Too Many Requests or quota error returned directly by the Google API.
- Verification: If you suspect Gemini is failing, do not look for a timer. Attempt a benign tool execution or memory read. If the model is constrained, the runtime execution layer will instantly bubble up the API failure.
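The verification step above can be sketched as a small probe that runs a benign call and recognizes a rate-limit failure when it surfaces. Everything here is illustrative: ApiError is a hypothetical stand-in for whatever exception the runtime actually raises, and the string checks and backoff values are assumptions rather than documented OpenClaw or Google behavior.

```python
import time


class ApiError(Exception):
    """Hypothetical stand-in for the error the runtime surfaces."""

    def __init__(self, status: int, message: str = ""):
        super().__init__(message)
        self.status = status


def is_rate_limited(err: Exception) -> bool:
    # Treat HTTP 429 or a quota-style message as a rate-limit signal (assumed heuristics).
    text = str(err).lower()
    return (getattr(err, "status", None) == 429
            or "quota" in text
            or "too many requests" in text)


def probe(call, retries: int = 3, base_delay: float = 1.0, sleep=time.sleep):
    """Run a benign call; back off and retry only when the failure is a rate limit."""
    for attempt in range(retries + 1):
        try:
            return call()
        except Exception as err:
            if not is_rate_limited(err) or attempt == retries:
                raise  # unrelated error, or retries exhausted
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
```

A short backoff only helps with per-minute limits. If the daily request quota is exhausted, retries keep failing until the quota resets, so the final exception should be treated as the diagnosis.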
Summary: OAuth models (like Codex) warn you before you hit the wall. API key models (like Gemini) let you run until you hit the wall. You must know the difference in their authentication architecture to troubleshoot effectively.
google/gemini-3.1-pro-preview-customtools
Documentation Accurate As Of: April 16, 2026