Auto oss vs cost efficient 50/50 A/B test by IsaiahWitzke · Pull Request #9355 · warpdotdev/warp

IsaiahWitzke · 2026-04-29T02:36:57Z

I want to run an A/B test in which 50% of free users default to auto (cost-efficient) and the other 50% default to auto (open-weights).

Tests

- When I open the app with a different WARP_DATA_DIR, the default flips between the two models about 50% of the time, as expected.
- My default persists through signup.

Adds a new client-side experiment (FreeTierDefaultModel) that buckets free-tier and logged-out users 50/50 into AutoEfficient (control) and AutoOpen (experiment) arms. Users in the AutoOpen arm see auto (open-weights) as the default model in the configure-oz onboarding picker; control users see the existing auto (cost-efficient) default.

Bucketing happens entirely client-side off the user's anonymous_id UUID, the same value that's already attached to every Rudder telemetry event as anonymousId. This means:

- Pre-signup, post-signup, and signed-out users get the same arm (anonymous_id is stable across signup, no transfer logic needed).
- Enrollment is captured automatically via the framework's ExperimentTriggered telemetry event.
- Pre/post-signup events stitch automatically in the warehouse via Rudder identity stitching on anonymousId.

Server-side, the only change required is allowing AutoOpen on the free tier (separate companion server PR).

Co-Authored-By: Oz <oz-agent@warp.dev>
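The client-side bucketing described in this commit could be sketched roughly as follows. This is an illustration only: the PR does not say which hash the experiment framework uses, so the FNV-1a hash, the `Arm` enum, the `arm_for` helper, and the experiment-name salt are all assumptions rather than Warp's actual implementation.

```rust
/// The two arms named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Arm {
    AutoEfficient, // control
    AutoOpen,      // experiment
}

/// 64-bit FNV-1a. An assumed stand-in: the real framework's hash is not
/// shown in this PR.
pub fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Hypothetical helper: deterministically map an anonymous_id to an arm.
pub fn arm_for(anonymous_id: &str) -> Arm {
    // Salting with the experiment name (an assumption) keeps this split
    // independent of any other experiment keyed on the same id.
    let key = format!("FreeTierDefaultModel:{}", anonymous_id);
    // Use the upper 32 bits, which are better mixed than FNV's low bits,
    // and reduce them to a 0..100 percentile.
    let percentile = (fnv1a_64(key.as_bytes()) >> 32) % 100;
    if percentile < 50 {
        Arm::AutoEfficient
    } else {
        Arm::AutoOpen
    }
}
```

Because the arm is a pure function of anonymous_id, calling `arm_for` with the same id before and after signup yields the same arm, which is the property the commit relies on to avoid transfer logic.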

The configure-oz onboarding picker renders before any Firebase user exists, so most pre-signup traffic shows as OnboardingAuthState::LoggedOut rather than FreeUser. Restricting to FreeUser only meant the override basically never fired during the actual onboarding flow. Allow both FreeUser and LoggedOut; still exclude PayingUser so we don't override the paid-tier default (AutoGenius).

Co-Authored-By: Oz <oz-agent@warp.dev>
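The eligibility rule at this point in the PR might look like the sketch below. The variant names come from the commit message; the assumption that these are the only three variants, and the helper name `is_eligible_for_override`, are illustrative and not taken from the diff.

```rust
/// Auth states named in the commit message. The real enum may carry more
/// variants or data; this three-variant shape is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OnboardingAuthState {
    LoggedOut,
    FreeUser,
    PayingUser,
}

/// Hypothetical gate: the override may apply to logged-out and free users,
/// but never to paying users, whose default stays server-controlled.
pub fn is_eligible_for_override(state: OnboardingAuthState) -> bool {
    matches!(
        state,
        OnboardingAuthState::LoggedOut | OnboardingAuthState::FreeUser
    )
}
```

Listing the allowed states explicitly means a newly added state would be excluded by default, which seems like the safer failure mode for an experiment override.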

oz-for-oss · 2026-04-29T02:38:05Z

@IsaiahWitzke

I'm starting a first review of this pull request.

You can follow along in the session on Warp.

I completed the review and posted feedback on this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

oz-for-oss

Overview

This PR adds a client-side 50/50 experiment that can default eligible onboarding users from the server-provided auto cost-efficient model to the auto open-weights model when that model is available. It wires the new experiment layer into the existing experiment framework and applies the override when constructing and refreshing onboarding models.

Concerns

No blocking correctness or security concerns found in the inlined diff.

Verdict

Found: 0 critical, 0 important, 0 suggestions

Approve

oz-for-oss

Overview

This PR adds a client-side FreeTierDefaultModel experiment that splits eligible onboarding users 50/50 between the existing server default model and the auto-open onboarding model, then applies that override when onboarding model choices are created or refreshed.

Concerns

No blocking correctness or security concerns found in the changed diff lines.

Verdict

Found: 0 critical, 0 important, 0 suggestions

Approve

Triggers a re-evaluation of build_onboarding_models + apply_free_tier_default_model_override inside the existing UserWorkspacesEvent::TeamsChanged handler, so when a user upgrades free → paid mid-onboarding the picker promptly drops the AutoOpen 'Recommended' pill (the override gate flips to PayingUser and the server's paid-tier default takes over).

Without this, after upgrade the only triggers for re-evaluating the override were the initial render and LLMPreferencesEvent::UpdatedAvailableLLMs, so a stale UserWorkspaces.billing_metadata could leave AutoOpen marked as the recommended default well after the user had upgraded.

Co-Authored-By: Oz <oz-agent@warp.dev>

Restructure apply_free_tier_default_model_override to bail out unless the server itself is currently recommending auto-efficient (the free-tier default). For any other recommendation (auto-genius for paid users, the Codex referral default, etc.) we respect what the server says.

This makes the server's recommendation the single source of truth for when the experiment applies, so:

- Post-upgrade, the moment LLMPreferences refreshes from the server (auto-genius default), the AutoOpen 'Recommended' pill goes away. No dependence on locally-stale UserWorkspaces.billing_metadata.
- Drop the redundant auth-state gate in should_default_to_auto_open; the server already encodes the eligibility check.
- Drop the TeamsChanged re-run hook in root_view.rs; the LLMPreferencesEvent::UpdatedAvailableLLMs path is sufficient.

Co-Authored-By: Oz <oz-agent@warp.dev>
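The bail-out logic this commit describes can be sketched as a small pure function. The `ModelId` and `Arm` types, the `effective_default` name, and its signature are hypothetical; the real apply_free_tier_default_model_override operates on the onboarding model list and is not reproduced here. The availability check reflects the reviewer's note that the override applies only when the open-weights model is available.

```rust
/// Simplified model identifiers (assumed; the real ids are not shown in the PR).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModelId {
    AutoEfficient,
    AutoOpen,
    AutoGenius,
}

/// Experiment arm for the current user.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Arm {
    Control,
    Experiment,
}

/// Hypothetical sketch: decide which model to mark as the default.
pub fn effective_default(
    server_default: ModelId,
    arm: Arm,
    available: &[ModelId],
) -> ModelId {
    // Bail out unless the server is recommending the free-tier default.
    // Any other recommendation (e.g. the paid-tier default) is respected.
    if server_default != ModelId::AutoEfficient {
        return server_default;
    }
    // Control users keep the server's recommendation.
    if arm != Arm::Experiment {
        return server_default;
    }
    // Only override when the open-weights model is actually offered.
    if !available.contains(&ModelId::AutoOpen) {
        return server_default;
    }
    ModelId::AutoOpen
}
```

Keying the decision on the server's recommendation rather than on local billing state means a refreshed model list is enough to withdraw the override after an upgrade, with no separate auth-state check.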

IsaiahWitzke and others added 2 commits April 28, 2026 22:12

IsaiahWitzke requested a review from tylerlam-warp April 29, 2026 02:36

cla-bot Bot added the cla-signed label Apr 29, 2026

oz-for-oss Bot reviewed Apr 29, 2026

tylerlam-warp approved these changes Apr 29, 2026

IsaiahWitzke changed the title from "Auto - oss 50/50 A/B test" to "Auto oss vs cost efficient 50/50 A/B test" Apr 29, 2026

IsaiahWitzke and others added 2 commits April 28, 2026 22:48

IsaiahWitzke merged commit d0f045c into master Apr 29, 2026
24 checks passed

IsaiahWitzke deleted the iw/oss-ab-test-client branch April 29, 2026 03:33

IsaiahWitzke merged 4 commits into master from iw/oss-ab-test-client
