Memory Push · Codex Window Exhaustion

This memory push matters because it marks a shift in how quota pressure should be interpreted. Earlier quota incidents felt linked to obvious background failures or experimental features that went pathological. This one did not. It happened during dense but recognizably normal work: Foundry editing, Hugging Face testing, artifact writing, repo creation, verification loops, and long-context continuity management.

The key lesson is that the OpenAI Codex GPT-5.4 lane appears vulnerable not only to bugs but also to a long enough stretch of serious ordinary use. The limit actually hit was the rolling 5-hour usage window, not the weekly budget. After the reset, the 5-hour window returned to full headroom, while the weekly budget remained substantially intact.

That matters operationally because it changes how future sessions should be paced. Heavy architecture and complex coding still make sense on Codex while the window is healthy. But once a session becomes long, dense, and verification-heavy, it is much easier than expected to exhaust the primary lane and fall back onto Gemini or other lanes.
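To make the pacing idea concrete, here is a minimal sketch of a local rolling-window tracker. It is an illustration only: how the provider actually meters the 5-hour window is not known here, so the usage units, the `limit` value, the `reserve` threshold, and the names `RollingWindow` and `pick_lane` are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field

FIVE_HOURS = 5 * 60 * 60  # window length in seconds


@dataclass
class RollingWindow:
    """Local estimate of usage inside a rolling window (hypothetical units)."""
    limit: float                      # assumed allowance per window
    span: float = FIVE_HOURS
    events: deque = field(default_factory=deque)  # (timestamp, units) pairs

    def record(self, now: float, units: float) -> None:
        self.events.append((now, units))

    def used(self, now: float) -> float:
        # Drop events that have aged out of the window.
        while self.events and self.events[0][0] <= now - self.span:
            self.events.popleft()
        return sum(units for _, units in self.events)

    def headroom(self, now: float) -> float:
        # Fraction of the window still available, clamped at zero.
        return max(0.0, 1.0 - self.used(now) / self.limit)


def pick_lane(window: RollingWindow, now: float, reserve: float = 0.15) -> str:
    # Stay on the primary lane only while headroom exceeds the reserve.
    return "codex" if window.headroom(now) > reserve else "fallback"


window = RollingWindow(limit=100)
window.record(now=0, units=50)
window.record(now=3600, units=40)
print(pick_lane(window, now=3600))        # inside the window, little headroom
print(pick_lane(window, now=FIVE_HOURS))  # the first event has aged out
```

The point of the reserve is to move dense, verification-heavy work to a fallback lane deliberately, before the primary lane runs out mid-task. The weekly budget would need a separate tracker of the same shape with a longer span.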

The broader interpretation is not that Codex failed. It is that normal-use limits are now part of the real working architecture and need to be treated as such. This is no longer only a story about runaway loops. It is a story about sustained ambitious use under real constraints.

A dedicated usage note was added in the API Usage & Quotas lane so this lesson does not disappear into the chat log.

Codex Window Exhaustion, Flash-Lite Fallback, and the Shape of Normal-Use Limits