v0.7.0
Models Chat Specialists Settings Auth Security Performance Infrastructure

Billing usage, settings redesign, sturdier chat proxy

Managed-key billing now reserves spend up front, settings becomes a full-window page, and the chat proxy retries the right things and fails fast on the rest.

This release turns on usage-based billing for managed-key requests, makes Settings a full-window experience with a rebuilt connections panel, and tightens chat so brief upstream hiccups recover on their own while real failures stop fast. Specialists also pick up a long list of fixes — fewer unsolicited replies, working Stop on queued turns, and image attachments that survive thread replies and regenerations.

Managed-key billing and usage

Every managed-key chat request is now checked against your workspace wallet before it goes out. We estimate the cost up front, hold it against your balance, and reconcile the actual spend after the model responds. If your workspace is out of credit or has no active subscription, you get a clear “out of credit” message before the call starts instead of a confusing mid-stream failure. Pricing stays in sync with the OpenRouter catalog throughout the day, and top-ups show up in your balance within about a minute.
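The reserve-then-reconcile flow described above can be sketched roughly like this. This is a minimal illustration, not our actual billing code; the `Wallet` class, cent amounts, and method names are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Wallet:
    """Hypothetical sketch of the estimate -> hold -> reconcile flow."""
    balance_cents: int
    holds: dict = field(default_factory=dict)  # request_id -> held estimate

    def reserve(self, request_id: str, estimated_cents: int) -> None:
        # Check the estimate against what's actually available (balance minus
        # outstanding holds) so the request is refused up front rather than
        # failing mid-stream.
        available = self.balance_cents - sum(self.holds.values())
        if estimated_cents > available:
            raise RuntimeError("out of credit")
        self.holds[request_id] = estimated_cents

    def reconcile(self, request_id: str, actual_cents: int) -> None:
        # Release the hold and charge what the model actually cost, which may
        # differ from the up-front estimate.
        self.holds.pop(request_id, None)
        self.balance_cents -= actual_cents
```

The key property is that concurrent requests each hold their estimate, so a burst of in-flight calls cannot collectively overdraw the balance.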

Settings becomes a full-window page

On desktop, Settings no longer pops up as a dialog on top of your workspace; it takes over the window, with the same sidebar still in place, so opening Settings is a quick crossfade rather than a full re-layout. The X close button is replaced by a “Return to App” item. Inside, the connections area has been rebuilt around a dedicated page per provider: Cloudflare, Bedrock, OpenRouter, Hugging Face, OpenAI-compatible endpoints, and your built-in Zephyr key each get their own sign-in panel, allowed-models editor, routing-rules dialog, and role picker for marking a provider as the default or a fallback. “Make default” is back on configured rows, the workspace switcher popover has been redesigned with per-account add-workspace and log-out menus, and the sidebar no longer breaks when dragged narrower than its readable minimum.

Chat handles upstream blips better

Brief rate limits and server errors from upstream providers used to fail your request immediately, even when a short wait would have recovered. Chat now backs off and retries those automatically, respecting Retry-After hints, while real failures like bad requests or auth errors still fail fast. Foreground chat absorbs short rate-limit bursts; background work stops quickly and lets the rest of the app decide what to do next. Background sync also stops hammering an unhealthy service, checks back periodically, and resumes cleanly once it recovers. Expired billing tokens refresh themselves on the fly.
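The retry behavior above can be sketched as follows. This is an illustrative stand-in, not the shipped proxy code; the `send` callback, the status sets, and the backoff constants are assumptions for the example.

```python
import random
import time

RETRYABLE = {429, 500, 502, 503, 504}   # brief upstream blips: worth a retry
FATAL     = {400, 401, 403, 404}        # bad request / auth: fail fast

def call_with_retries(send, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Hypothetical sketch: retry transient upstream errors with exponential
    backoff and jitter, honor a Retry-After hint when the server gives one,
    and fail fast on errors a retry cannot fix.

    `send` returns (status, retry_after_seconds_or_None, body)."""
    for attempt in range(max_attempts):
        status, retry_after, body = send()
        if status < 400:
            return body
        if status in FATAL or status not in RETRYABLE:
            raise RuntimeError(f"upstream error {status}")
        if attempt == max_attempts - 1:
            break
        # Prefer the server's Retry-After hint; otherwise back off
        # exponentially with jitter so retries from many clients spread out.
        if retry_after is not None:
            delay = retry_after
        else:
            delay = base_delay * (2 ** attempt) * (1 + random.random())
        sleep(delay)
    raise RuntimeError("upstream still failing after retries")
```

Passing `sleep` as a parameter keeps the function testable; in production it defaults to `time.sleep`.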

When a Cloudflare Workers AI model is retired upstream, you now get a clear “provider not available” message with deprecation details instead of a confusing failure. Existing selections still resolve so old configurations and history stay readable, but the deprecated model no longer appears in pickers.
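The two-sided behavior above (hidden from pickers, still resolvable for history) can be illustrated with a toy catalog. The model IDs and structure here are invented for the example.

```python
# Hypothetical catalog: deprecation is a flag, not a deletion.
MODELS = {
    "cf/retired-model": {"deprecated": True, "note": "retired upstream"},
    "cf/current-model": {"deprecated": False},
}

def picker_options():
    # Deprecated models are hidden from new selections...
    return [m for m, info in MODELS.items() if not info["deprecated"]]

def resolve(model_id):
    # ...but existing selections still resolve, so old configurations and
    # chat history stay readable. Only a truly unknown model fails.
    info = MODELS.get(model_id)
    if info is None:
        raise KeyError("provider not available")
    return info
```

Keeping the entry in the catalog with a flag, rather than removing it, is what lets old threads render while new chats steer clear of the retired model.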

Specialists behave better

Specialists no longer reply when a message clearly mentions a person and not a specialist, and model routing falls back more confidently — fewer unsolicited replies in busy channels. Brand-new channels with no specialists connected show a single empty-channel notice instead of repeating it on every message. Stop works again on queued specialist turns and on direct send paths that previously couldn’t be cancelled before streaming began. Uploaded images now stay viewable across thread replies and regenerations, so a specialist sees the actual image instead of just the filename. Workflow keywords intercept matching messages before specialists see them, so a keyword no longer triggers a duplicate specialist reply.

Channel history is read-only for non-members

If you’re previewing a channel you haven’t joined, you can still read history but can’t take action on it. Retry, regenerate, react, delete, feedback, and thread reply are all disabled until you join. Auto-join on a verified company domain now also requires the email itself to be verified — unverified accounts can’t slip into a domain-allowlisted workspace.

Polish & fixes

  • OpenAI is now a first-class provider in our model catalog, alongside OpenRouter, Anthropic, Cloudflare Workers AI, and Bedrock.
  • Models offered by multiple providers (Cloudflare, Bedrock, OpenRouter) now appear once in the catalog instead of as duplicate rows.
  • Kimi K2.6 on Cloudflare Workers AI is now wired up to a runnable model identifier so it actually responds.
  • NVIDIA provider configurations no longer get rejected when they’re valid.
  • The default model for testing specialists moves to GPT-5.4 to avoid a Gemini error.
  • Chat streams tokens as they arrive when the provider supports it, instead of waiting for the full response before showing anything.
  • Chat message bubbles, hover rows, and reaction chips now match the design system — tighter spacing, consistent hover tint, and right-sized hover toolbar buttons.
  • Specialists show a clearer error when an unsupported model is selected, instead of a generic stream failure.
  • Frontend and backend errors now report to our error tracker with version and environment info, plus React error-boundary capture; sensitive fields are scrubbed before send.
  • Specialist Builder no longer crashes when uploading a folder of knowledge files.
  • macOS launches no longer crash with a missing Bluetooth permission prompt.
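The sensitive-field scrubbing mentioned in the error-reporting item above can be sketched like this. The key list and function name are illustrative, not our actual scrubber.

```python
# Hypothetical deny-list of field names that must never leave the client.
SENSITIVE_KEYS = {"password", "token", "api_key", "authorization", "cookie"}

def scrub(payload):
    """Recursively redact sensitive fields from an error report before it is
    sent to the tracker. Matching is case-insensitive on key names."""
    if isinstance(payload, dict):
        return {
            k: "[redacted]" if k.lower() in SENSITIVE_KEYS else scrub(v)
            for k, v in payload.items()
        }
    if isinstance(payload, list):
        return [scrub(v) for v in payload]
    return payload
```

Because the walk is recursive, secrets nested inside request contexts or breadcrumb lists get redacted too, not just top-level fields.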

Models