Auto oss vs cost efficient 50/50 A/B test #9355
Adds a new client-side experiment (FreeTierDefaultModel) that buckets free-tier and logged-out users 50/50 into AutoEfficient (control) and AutoOpen (experiment) arms. Users in the AutoOpen arm see auto (open-weights) as the default model in the configure-oz onboarding picker; control users see the existing auto (cost-efficient) default.

Bucketing happens entirely client-side off the user's anonymous_id UUID, the same value that's already attached to every Rudder telemetry event as anonymousId. This means:

- Pre-signup, post-signup, and signed-out users get the same arm (anonymous_id is stable across signup, no transfer logic needed).
- Enrollment is captured automatically via the framework's ExperimentTriggered telemetry event.
- Pre/post-signup events stitch automatically in the warehouse via Rudder identity stitching on anonymousId.

Server-side, the only change required is allowing AutoOpen on the free tier (separate companion server PR).

Co-Authored-By: Oz <oz-agent@warp.dev>
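For readers unfamiliar with deterministic client-side bucketing, here is a minimal sketch of how a stable 50/50 split can be derived from an anonymous_id string. It is not the experiment framework's actual code: the `Arm` enum, the `assign_arm` function, the FNV-1a hash with a MurmurHash3-style finalizer, and the salting by experiment name are all assumptions made for illustration.

```rust
/// The two arms named in the PR description (enum name is hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Arm {
    AutoEfficient, // control
    AutoOpen,      // experiment
}

/// 64-bit FNV-1a. Assumed here only because it is tiny and yields the same
/// value on every run and platform; the real framework's hash is not shown
/// in this PR.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf29ce484222325;
    const PRIME: u64 = 0x100000001b3;
    bytes
        .iter()
        .fold(OFFSET_BASIS, |hash, &b| (hash ^ u64::from(b)).wrapping_mul(PRIME))
}

/// Finalizer in the style of MurmurHash3's 64-bit mix step, applied so the
/// bucket does not depend on FNV-1a's weakly mixed low bits.
fn mix64(mut h: u64) -> u64 {
    h ^= h >> 33;
    h = h.wrapping_mul(0xff51afd7ed558ccd);
    h ^= h >> 33;
    h = h.wrapping_mul(0xc4ceb9fe1a85ec53);
    h ^= h >> 33;
    h
}

/// Maps an anonymous_id to an arm. The same id always yields the same arm,
/// which is what keeps the assignment stable across signup.
pub fn assign_arm(experiment_name: &str, anonymous_id: &str) -> Arm {
    // Salting with the experiment name (an assumption) keeps bucket
    // assignments in different experiments from being correlated.
    let key = format!("{experiment_name}:{anonymous_id}");
    let bucket = mix64(fnv1a_64(key.as_bytes())) % 100;
    if bucket < 50 {
        Arm::AutoEfficient
    } else {
        Arm::AutoOpen
    }
}
```

Calling `assign_arm("FreeTierDefaultModel", id)` twice with the same id returns the same arm, so no state needs to be transferred when a logged-out user signs up.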
The configure-oz onboarding picker renders before any Firebase user exists, so most pre-signup traffic shows as OnboardingAuthState::LoggedOut rather than FreeUser. Restricting to FreeUser only meant the override basically never fired during the actual onboarding flow. Allow both FreeUser and LoggedOut; still exclude PayingUser so we don't override the paid-tier default (AutoGenius).

Co-Authored-By: Oz <oz-agent@warp.dev>
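The gate described in this commit can be sketched roughly as follows. The variant names come from the commit message; the enum's exact shape and the function name `is_eligible_for_override` are assumptions, and the real OnboardingAuthState may carry more variants or data.

```rust
/// Assumed shape of the auth state; only the three variants mentioned in
/// the commit message are modeled here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OnboardingAuthState {
    LoggedOut,
    FreeUser,
    PayingUser,
}

/// Hypothetical helper: free-tier and logged-out users may receive the
/// override, while paying users keep the paid-tier default.
pub fn is_eligible_for_override(auth_state: OnboardingAuthState) -> bool {
    matches!(
        auth_state,
        OnboardingAuthState::FreeUser | OnboardingAuthState::LoggedOut
    )
}
```

Because the match is an explicit allow-list, any state not named in it stays excluded by default.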
Overview
This PR adds a client-side 50/50 experiment that can default eligible onboarding users from the server-provided auto cost-efficient model to the auto open-weights model when that model is available. It wires the new experiment layer into the existing experiment framework and applies the override when constructing and refreshing onboarding models.
Concerns
- No blocking correctness or security concerns found in the inlined diff.
Verdict
Found: 0 critical, 0 important, 0 suggestions
Approve
Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).
Powered by Oz
Overview
This PR adds a client-side FreeTierDefaultModel experiment that splits eligible onboarding users 50/50 between the existing server default model and the auto-open onboarding model, then applies that override when onboarding model choices are created or refreshed.
Concerns
- No blocking correctness or security concerns found in the changed diff lines.
Verdict
Found: 0 critical, 0 important, 0 suggestions
Approve
Triggers a re-evaluation of build_onboarding_models + apply_free_tier_default_model_override inside the existing UserWorkspacesEvent::TeamsChanged handler, so when a user upgrades free → paid mid-onboarding the picker promptly drops the AutoOpen 'Recommended' pill (the override gate flips to PayingUser and the server's paid-tier default takes over).

Without this, after upgrade the only triggers for re-evaluating the override were the initial render and LLMPreferencesEvent::UpdatedAvailableLLMs, so a stale UserWorkspaces.billing_metadata could leave AutoOpen marked as the recommended default well after the user had upgraded.

Co-Authored-By: Oz <oz-agent@warp.dev>
Restructure apply_free_tier_default_model_override to bail out unless the server itself is currently recommending auto-efficient (the free-tier default). For any other recommendation (auto-genius for paid users, the Codex referral default, etc.) we respect what the server says.

This makes the server's recommendation the single source of truth for when the experiment applies, so:

- Post-upgrade, the moment LLMPreferences refreshes from the server (auto-genius default), the AutoOpen 'Recommended' pill goes away. No dependence on locally-stale UserWorkspaces.billing_metadata.
- Drop the redundant auth-state gate in should_default_to_auto_open; the server already encodes the eligibility check.
- Drop the TeamsChanged re-run hook in root_view.rs; the LLMPreferencesEvent::UpdatedAvailableLLMs path is sufficient.

Co-Authored-By: Oz <oz-agent@warp.dev>
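A rough sketch of the bail-out logic this commit describes, under stated assumptions: the `OnboardingModel` struct, the string ids "auto-efficient" and "auto-open", and the function name `apply_override_sketch` are hypothetical stand-ins, not the types or identifiers used in the actual diff.

```rust
/// Hypothetical stand-in for an entry in the onboarding picker.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct OnboardingModel {
    pub id: String,
    pub recommended: bool,
}

// Assumed identifiers; the real model ids are not shown in this PR page.
const AUTO_EFFICIENT_ID: &str = "auto-efficient";
const AUTO_OPEN_ID: &str = "auto-open";

/// Moves the 'Recommended' flag to auto-open only when the user is in the
/// AutoOpen arm, the server currently recommends auto-efficient, and
/// auto-open is actually offered. Returns true if the override was applied.
pub fn apply_override_sketch(models: &mut [OnboardingModel], in_auto_open_arm: bool) -> bool {
    if !in_auto_open_arm {
        return false;
    }
    // The server's recommendation is the single source of truth: any
    // recommendation other than auto-efficient is left untouched.
    let server_recommends_efficient = models
        .iter()
        .any(|m| m.recommended && m.id == AUTO_EFFICIENT_ID);
    if !server_recommends_efficient {
        return false;
    }
    // Nothing to override to if auto-open is not in the list.
    if !models.iter().any(|m| m.id == AUTO_OPEN_ID) {
        return false;
    }
    for m in models.iter_mut() {
        m.recommended = m.id == AUTO_OPEN_ID;
    }
    true
}
```

With this shape, a paid user whose server default is auto-genius falls through the second check, so no local billing state needs to be consulted.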
I want to run an A/B test where 50% of free users get defaulted to auto cost efficient, and the other 50% to auto open weights.

## Tests

- 50% of the time when I open the app with a different WARP_DATA_DIR I get the expected flip/flopping
- My default gets persisted throughout signup

---------

Co-authored-by: Oz <oz-agent@warp.dev>