feat: enable codex models #1666
base: main
Conversation
Greptile Summary: Enabled codex model support in Stagehand by adding special configuration for OpenAI's codex models. The changes are applied consistently across both the production client (packages/core/lib/v3/llm/aisdk.ts) and the evals wrapper (packages/evals/lib/AISdkClientWrapped.ts).
Review notes: Confidence score 4/5.
Sequence diagram:

```mermaid
sequenceDiagram
    participant Client
    participant AISdkClient
    participant ModelCheck
    participant generateObject
    Client->>AISdkClient: createChatCompletion(options)
    AISdkClient->>ModelCheck: Check model.modelId
    alt contains "gpt-5"
        ModelCheck->>ModelCheck: isGPT5 = true
        alt contains "codex"
            ModelCheck->>ModelCheck: isCodex = true
            ModelCheck->>ModelCheck: usesLowReasoningEffort = false
            Note over ModelCheck: Codex models excluded from low reasoning
        else contains "gpt-5.1" or "gpt-5.2"
            ModelCheck->>ModelCheck: isCodex = false
            ModelCheck->>ModelCheck: usesLowReasoningEffort = true
        else other gpt-5 variant
            ModelCheck->>ModelCheck: isCodex = false
            ModelCheck->>ModelCheck: usesLowReasoningEffort = false
        end
        ModelCheck->>generateObject: Call with providerOptions
        alt isCodex = true
            Note over generateObject: textVerbosity: "medium"<br/>reasoningEffort: "medium"
        else usesLowReasoningEffort = true
            Note over generateObject: textVerbosity: "low"<br/>reasoningEffort: "low"
        else default
            Note over generateObject: textVerbosity: "low"<br/>reasoningEffort: "minimal"
        end
    else not gpt-5
        ModelCheck->>generateObject: Call without providerOptions
        Note over generateObject: Uses model defaults
    end
    generateObject-->>AISdkClient: objectResponse
    AISdkClient-->>Client: Return result
```
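The classification in the diagram can be sketched as a small standalone helper. The flag names come from the diff; the function and type names below are illustrative, not Stagehand's actual API, which applies this logic inline in `createChatCompletion`:

```typescript
// Sketch of the modelId classification described in the sequence diagram.
// classifyModel and ProviderOptions are hypothetical names for illustration.
type ProviderOptions =
  | { openai: { textVerbosity: string; reasoningEffort: string } }
  | undefined;

function classifyModel(modelId: string): ProviderOptions {
  const isGPT5 = modelId.includes("gpt-5");
  if (!isGPT5) return undefined; // non-gpt-5 models keep provider defaults

  const isCodex = modelId.includes("codex");
  const usesLowReasoningEffort =
    !isCodex &&
    (modelId.includes("gpt-5.1") || modelId.includes("gpt-5.2"));

  if (isCodex) {
    // Codex models require medium verbosity and medium reasoning effort
    return { openai: { textVerbosity: "medium", reasoningEffort: "medium" } };
  }
  return {
    openai: {
      textVerbosity: "low",
      reasoningEffort: usesLowReasoningEffort ? "low" : "minimal",
    },
  };
}
```

The returned object maps directly onto the `providerOptions` argument the diagram shows being passed to `generateObject`.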
2 files reviewed, 2 comments
In `packages/core/lib/v3/llm/aisdk.ts` (lines 175-179):

```typescript
reasoningEffort: isCodex
  ? "medium"
  : usesLowReasoningEffort
    ? "low"
    : "minimal",
```
Logic bug: codex models that are gpt-5 (not gpt-5.1 or gpt-5.2) will get reasoningEffort: "minimal" instead of "medium". The ternary checks isCodex first, but if the model is gpt-5-codex, usesLowReasoningEffort is false (doesn't match gpt-5.1 or gpt-5.2), so it falls through to "minimal".
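As a standalone sanity check on the suggested ternary (flag names taken from the diff; this is not the actual Stagehand source), the `isCodex` branch must win regardless of the other flag:

```typescript
// Truth table for the suggested nested ternary from the review.
// Checking isCodex first guarantees codex models get "medium".
function reasoningEffort(
  isCodex: boolean,
  usesLowReasoningEffort: boolean,
): string {
  return isCodex ? "medium" : usesLowReasoningEffort ? "low" : "minimal";
}
```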
In `packages/evals/lib/AISdkClientWrapped.ts` (lines 152-156):

```typescript
reasoningEffort: isCodex
  ? "medium"
  : usesLowReasoningEffort
    ? "low"
    : "minimal",
```
Same logic bug as in aisdk.ts: codex models that are gpt-5 (not gpt-5.1 or gpt-5.2) will get reasoningEffort: "minimal" instead of "medium".
1 issue found across 2 files
Confidence score: 4/5
- The main risk is a new hardcoded model name check (`isCodex`) in `packages/core/lib/v3/llm/aisdk.ts`, which violates the rule against hardcoded LLM model name checks and could cause future model handling regressions.
- Severity is moderate (5/10) with good confidence, so this looks safe to merge with a small policy-compliance concern rather than a likely runtime break.
- Pay close attention to `packages/core/lib/v3/llm/aisdk.ts`: the hardcoded model name check violates the no-list rule.
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="packages/core/lib/v3/llm/aisdk.ts">
<violation number="1" location="packages/core/lib/v3/llm/aisdk.ts:133">
P2: Rule violated: **Ensure we never check against hardcoded lists of allowed LLM model names**
This adds a new hardcoded model name check (`isCodex`) which violates the rule against hardcoded LLM model name checks. The rule states newly added code should accept any model name and let the provider handle errors. The exception only applies to guarding *against* known-bad models, not special-casing to enable models.
Consider an alternative approach such as:
- Letting the API return an error for unsupported configurations, then handling it
- Using a model capabilities/metadata system
- Adding a configuration option for these parameters rather than inferring from model name</violation>
</file>
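The "model capabilities/metadata system" alternative the reviewer suggests could look roughly like the sketch below. All names here are hypothetical; the point is that per-family profiles live in one replaceable table (which could be loaded from configuration) rather than in scattered substring checks, and unknown models get no override so the provider's own defaults and errors apply:

```typescript
// Sketch: drive provider options from a capabilities table instead of
// inline substring checks on the model name. Names are illustrative.
interface ReasoningProfile {
  textVerbosity: "low" | "medium";
  reasoningEffort: "minimal" | "low" | "medium";
}

// Ordered: more specific patterns first. Unknown models fall through.
const REASONING_PROFILES: Array<[RegExp, ReasoningProfile]> = [
  [/codex/, { textVerbosity: "medium", reasoningEffort: "medium" }],
  [/gpt-5\.[12]/, { textVerbosity: "low", reasoningEffort: "low" }],
  [/gpt-5/, { textVerbosity: "low", reasoningEffort: "minimal" }],
];

function profileFor(modelId: string): ReasoningProfile | undefined {
  const hit = REASONING_PROFILES.find(([pattern]) => pattern.test(modelId));
  return hit?.[1];
}
```

Because the first matching entry wins, `gpt-5-codex` resolves to the codex profile even though it also matches the generic `gpt-5` pattern, avoiding the fall-through bug flagged above.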
Architecture diagram:

```mermaid
sequenceDiagram
    participant App as Stagehand Application
    participant Client as AISdkClient / Wrapped
    participant SDK as AI SDK (generateObject)
    participant OpenAI as OpenAI API
    App->>Client: Request LLM Extraction (modelId, response_model)
    Note over Client: Internal Model Classification
    Client->>Client: Check modelId for "gpt-5" and "codex"
    alt NEW: Model is Codex-based
        Note over Client: Set specific Codex requirements
        Client->>Client: Set textVerbosity: "medium"
        Client->>Client: Set reasoningEffort: "medium"
    else Model is Standard GPT-5
        Note over Client: Apply default GPT-5 constraints
        Client->>Client: Set textVerbosity: "low"
        alt modelId is 5.1 or 5.2
            Client->>Client: Set reasoningEffort: "low"
        else
            Client->>Client: Set reasoningEffort: "minimal"
        end
    end
    Client->>SDK: generateObject(prompt, schema, providerOptions)
    Note right of SDK: Includes OpenAI-specific<br/>textVerbosity & reasoningEffort
    SDK->>OpenAI: POST /v1/chat/completions
    OpenAI-->>SDK: JSON Response
    SDK-->>Client: Typed Object
    Client-->>App: Extraction Result
```
```typescript
(this.model.modelId.includes("gpt-5.1") ||
  this.model.modelId.includes("gpt-5.2")) &&
```
can these be consolidated into `includes("gpt-5.")`?
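For reference, what a single consolidated substring check would and would not match (a standalone check, independent of Stagehand's code):

```typescript
// Would includes("gpt-5.") cover both existing version checks?
const matchesDotVersion = (modelId: string): boolean =>
  modelId.includes("gpt-5.");
```

It covers `gpt-5.1` and `gpt-5.2` (and any future `gpt-5.x`), while plain `gpt-5` and `gpt-5-codex` stay excluded, which is the distinction the thread below confirms matters.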
i think so, will test if gpt-5-2025-08-07 also has the same reasoning requirements
gpt-5 reasoning requirement is different
Branch updated from a926fad to 5b5b831.
why
people want to try codex models in Stagehand (also see benchmarks)
what changed
Codex models need a text verbosity and reasoning effort of "medium" (not "low"), so we set these using a conditional operator.
test plan
Ran evals
Summary by cubic
Enables Codex models in Stagehand via the AI SDK by setting OpenAI provider options. Addresses STG-1324. Codex uses medium text verbosity and medium reasoning effort; gpt-5 non-Codex defaults stay the same.
Written for commit 5b5b831.