More e2e nodejs tests by friggeri · Pull Request #466 · github/copilot-sdk

friggeri · 2026-02-13T17:37:25Z

As titled

Copilot

Pull request overview

Adds additional Node.js E2E coverage by introducing new Vitest suites and the corresponding replay-proxy snapshots, extending coverage around tool results, streaming/event fidelity, lifecycle APIs, hooks, and built-in tool behaviors. It also updates the replaying CAPI proxy to support “request-only” snapshots used for timeout scenarios.

Changes:

Added multiple new Node.js E2E test suites (tool results, streaming fidelity, lifecycle, hooks, built-in tools, multi-turn, error resilience) and their snapshot YAMLs under test/snapshots/**.
Updated the replaying proxy to hang indefinitely when a request matches a “request-only” snapshot (for timeout tests).
Updated a handful of existing snapshots and skipped the Node compaction E2E suite.

Reviewed changes

Copilot reviewed 51 out of 51 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
test/harness/replayingCapiProxy.ts	Adds “request-only snapshot” detection to support timeout test scenarios.
nodejs/test/e2e/builtin_tools.test.ts	New E2E suite covering built-in tools (bash/view/edit/create/grep/glob).
nodejs/test/e2e/client_lifecycle.test.ts	New E2E coverage for last-session-id and lifecycle event emission.
nodejs/test/e2e/compaction.test.ts	Marks compaction suite skipped.
nodejs/test/e2e/error_resilience.test.ts	New E2E coverage for error behaviors (destroyed sessions, abort, resume invalid).
nodejs/test/e2e/event_fidelity.test.ts	New E2E coverage validating ordering and required fields on emitted events.
nodejs/test/e2e/hooks_extended.test.ts	New E2E coverage for extended session hooks.
nodejs/test/e2e/multi_turn.test.ts	New E2E coverage for multi-turn behavior using prior tool/file results.
nodejs/test/e2e/session_config.test.ts	New E2E coverage for workingDirectory, provider config, and attachments.
nodejs/test/e2e/session_lifecycle.test.ts	New E2E coverage for list/delete sessions and getMessages behavior.
nodejs/test/e2e/streaming_fidelity.test.ts	New E2E coverage ensuring deltas appear only when streaming is enabled.
nodejs/test/e2e/tool_results.test.ts	New E2E coverage for ToolResultObject handling and Zod param validation.
test/snapshots/ask_user/should_invoke_user_input_handler_when_model_uses_ask_user_tool.yaml	Updates expected assistant text for ask_user flow.
test/snapshots/builtin_tools/should_capture_exit_code_in_output.yaml	New snapshot for bash output/exit code behavior.
test/snapshots/builtin_tools/should_capture_stderr_output.yaml	New snapshot for bash stderr capture behavior.
test/snapshots/builtin_tools/should_create_a_new_file.yaml	New snapshot for create tool behavior.
test/snapshots/builtin_tools/should_edit_a_file_successfully.yaml	New snapshot for edit tool behavior.
test/snapshots/builtin_tools/should_find_files_by_pattern.yaml	New snapshot for glob tool behavior.
test/snapshots/builtin_tools/should_handle_nonexistent_file_gracefully.yaml	New snapshot for view error handling.
test/snapshots/builtin_tools/should_read_file_with_line_range.yaml	New snapshot for ranged view behavior.
test/snapshots/builtin_tools/should_search_for_patterns_in_files.yaml	New snapshot for grep behavior.
test/snapshots/client_lifecycle/should_emit_session_lifecycle_events.yaml	New snapshot for lifecycle-events test.
test/snapshots/client_lifecycle/should_return_last_session_id_after_sending_a_message.yaml	New snapshot for last-session-id test.
test/snapshots/compaction/should_trigger_compaction_with_low_threshold_and_emit_events.yaml	Updates snapshot content for compaction scenario.
test/snapshots/event_fidelity/should_emit_assistant_message_with_messageid.yaml	New snapshot for assistant messageId behavior.
test/snapshots/event_fidelity/should_emit_events_in_correct_order_for_tool_using_conversation.yaml	New snapshot for event ordering in tool-using flow.
test/snapshots/event_fidelity/should_emit_tool_execution_events_with_correct_fields.yaml	New snapshot for tool execution events.
test/snapshots/event_fidelity/should_include_valid_fields_on_all_events.yaml	New snapshot for general event field validity.
test/snapshots/hooks/should_invoke_both_pretooluse_and_posttooluse_hooks_for_a_single_tool_call.yaml	Updates expected assistant text for hook scenario.
test/snapshots/hooks/should_invoke_pretooluse_hook_when_model_runs_a_tool.yaml	Updates expected assistant text for hook scenario.
test/snapshots/hooks_extended/should_invoke_onerroroccurred_hook_when_error_occurs.yaml	New snapshot for extended hook coverage.
test/snapshots/hooks_extended/should_invoke_onsessionend_hook_when_session_is_destroyed.yaml	New snapshot for extended hook coverage.
test/snapshots/hooks_extended/should_invoke_onsessionstart_hook_on_new_session.yaml	New snapshot for extended hook coverage.
test/snapshots/hooks_extended/should_invoke_onuserpromptsubmitted_hook_when_sending_a_message.yaml	New snapshot for extended hook coverage.
test/snapshots/multi_turn/should_handle_file_creation_then_reading_across_turns.yaml	New snapshot for create+read multi-turn flow.
test/snapshots/multi_turn/should_use_tool_results_from_previous_turns.yaml	New snapshot for multi-turn state usage.
test/snapshots/permissions/should_deny_permission_when_handler_returns_denied.yaml	Updates expected assistant text in permissions flow.
test/snapshots/permissions/should_handle_permission_handler_errors_gracefully.yaml	Updates shell tool description text in permissions flow.
test/snapshots/session/should_abort_a_session.yaml	Updates expected assistant response in abort scenario.
test/snapshots/session_config/should_accept_message_attachments.yaml	New snapshot for attachments request content.
test/snapshots/session_config/should_use_workingdirectory_for_tool_execution.yaml	New snapshot validating workingDirectory paths.
test/snapshots/session_lifecycle/should_delete_session_permanently.yaml	New snapshot for delete-session lifecycle behavior.
test/snapshots/session_lifecycle/should_list_created_sessions_after_sending_a_message.yaml	New snapshot for session listing after activity.
test/snapshots/session_lifecycle/should_return_events_via_getmessages_after_conversation.yaml	New snapshot for getMessages lifecycle behavior.
test/snapshots/session_lifecycle/should_support_multiple_concurrent_sessions.yaml	New snapshot for concurrent sessions scenario.
test/snapshots/skills/should_load_and_apply_skill_from_skilldirectories.yaml	Updates expected assistant text for skills scenario.
test/snapshots/streaming_fidelity/should_not_produce_deltas_when_streaming_is_disabled.yaml	New snapshot for non-streaming case.
test/snapshots/streaming_fidelity/should_produce_delta_events_when_streaming_is_enabled.yaml	New snapshot for streaming-enabled case.
test/snapshots/tool_results/should_handle_structured_toolresultobject_from_custom_tool.yaml	New snapshot for structured ToolResultObject success case.
test/snapshots/tool_results/should_handle_tool_result_with_failure_resulttype.yaml	New snapshot for ToolResultObject failure case.
test/snapshots/tool_results/should_pass_validated_zod_parameters_to_tool_handler.yaml	New snapshot for Zod-validated tool params flow.

Comments suppressed due to low confidence (2)

nodejs/test/e2e/tool_results.test.ts:13

openAiEndpoint is destructured from createSdkTestContext() but never used. With the repo's ESLint config (@typescript-eslint/no-unused-vars), this will fail lint; remove it or rename to _openAiEndpoint if intentionally unused.

describe("Tool Results", async () => {
    const { copilotClient: client } = await createSdkTestContext();

nodejs/test/e2e/hooks_extended.test.ts:16

workDir is destructured from createSdkTestContext() but never used. This will fail ESLint @typescript-eslint/no-unused-vars; remove it or rename to _workDir if intentionally unused.

describe("Extended session hooks", async () => {
    const { copilotClient: client } = await createSdkTestContext();

Copilot · 2026-02-13T17:43:44Z

test/harness/replayingCapiProxy.ts

+            const headers = {
+              "content-type": "text/event-stream",
+              ...commonResponseHeaders,
+            };
+            options.onResponseStart(200, headers);
+            // Never call onResponseEnd - hang indefinitely for timeout tests
+            await new Promise(() => {});


The request-only snapshot branch hangs by awaiting a never-resolving Promise. This keeps an async stack alive indefinitely and is unnecessary to keep the HTTP response open. Consider returning immediately after onResponseStart (and simply never calling onResponseEnd) to simulate a hang without leaking a pending Promise; also consider matching the response content-type (SSE vs JSON) to the request's stream flag, as the non-streaming path uses application/json.

Suggested change

const headers = {

"content-type": "text/event-stream",

...commonResponseHeaders,

};

options.onResponseStart(200, headers);

// Never call onResponseEnd - hang indefinitely for timeout tests

await new Promise(() => {});

const streamingIsRequested =

options.body &&

(JSON.parse(options.body) as { stream?: boolean }).stream ===

true;

const headers = {

"content-type": streamingIsRequested

? "text/event-stream"

: "application/json",

...commonResponseHeaders,

};

options.onResponseStart(200, headers);

// Never call onResponseEnd - hang indefinitely for timeout tests.

// Returning here keeps the HTTP response open without leaking a pending Promise.

Copilot · 2026-02-13T17:43:45Z

nodejs/test/e2e/tool_results.test.ts

+        });
+
+        expect(assistantMessage).not.toBeNull();
+        expect(assistantMessage?.data.content).toBeTruthy();


This test is intended to validate failure handling (prompt: "If it fails, say 'service is down'"), but the assertions only check that some content exists. This can pass even if the SDK ignores resultType: "failure" entirely; assert that the assistant response includes the expected fallback text (e.g., "service is down").

Suggested change

expect(assistantMessage?.data.content).toBeTruthy();

const failureContent = assistantMessage?.data.content ?? "";

expect(failureContent).toMatch(/service is down/i);

Copilot · 2026-02-13T17:43:45Z

nodejs/test/e2e/client_lifecycle.test.ts

+    it("should return undefined for getLastSessionId with no sessions", async () => {
+        // On a fresh client this may return undefined or an older session ID
+        const lastSessionId = await client.getLastSessionId();
+        expect(() => lastSessionId).not.toThrow();


This assertion is a no-op: expect(() => lastSessionId).not.toThrow() cannot fail because it just returns a value. If the intent is to validate behavior when no sessions exist, assert on the return value (e.g., undefined) or explicitly document and check the allowed shapes (string | undefined).

Suggested change

expect(() => lastSessionId).not.toThrow();

expect(lastSessionId === undefined || typeof lastSessionId === "string").toBe(true);

Copilot · 2026-02-13T17:43:45Z

nodejs/test/e2e/compaction.test.ts

 import { createSdkTestContext } from "./harness/sdkTestContext.js";

-describe("Compaction", async () => {
+describe.skip("Compaction", async () => {


The entire Compaction E2E suite is now skipped via describe.skip, which disables coverage for compaction behavior. If this is due to flakiness, consider describe.skipIf(...) with a tracked condition, or leave the suite enabled and mark only the flaky test(s) as skipped with a TODO + issue link.

Suggested change

describe.skip("Compaction", async () => {

describe("Compaction", async () => {

github-actions · 2026-02-13T17:43:57Z

Cross-SDK Consistency Review ✅

I've reviewed this PR's additions of comprehensive E2E tests for the Node.js SDK against the Python, Go, and .NET implementations.

📊 Test Coverage Gap (Not a Consistency Issue)

This PR adds extensive test coverage for Node.js that doesn't yet exist in other SDKs:

✅ Built-in tools (bash, view, edit, create, glob, grep)
✅ Event ordering and fidelity
✅ Multi-turn conversation context
✅ Error resilience (destroyed sessions, double abort, etc.)
✅ Session configuration (working directory, provider config, attachments)
✅ Streaming delta events vs non-streaming mode
✅ Tool result objects (success/failure types)
✅ Extended session hooks (onSessionStart, onUserPromptSubmitted, etc.)

Good news: The underlying SDK features exist in all languages—this is purely a test coverage gap, not an API inconsistency. The new tests provide excellent patterns that could be ported to Python/Go/.NET test suites in future work.

🔍 One API Inconsistency Found

getLastSessionId() is missing in Python and Go SDKs:

✅ Node.js: client.getLastSessionId() (tested in client_lifecycle.test.ts)
✅ .NET: client.GetLastSessionIdAsync() (exists but not E2E tested)
❌ Python: Missing (no client.get_last_session_id())
❌ Go: Missing (no client.GetLastSessionId())

This is a minor utility method, but should be added to Python and Go for complete feature parity.

✅ Recommendation

For this PR: ✅ Approve—this adds valuable test coverage and doesn't introduce inconsistencies.

Follow-up work suggested:

Add get_last_session_id() to Python SDK and GetLastSessionId() to Go SDK
Port these comprehensive E2E test patterns to Python/Go/.NET for better cross-SDK validation (especially built-in tools, event fidelity, and error resilience tests)

Note: All other features tested here (streaming, list/delete sessions, extended hooks, tool results, etc.) already exist across all SDKs—this PR just demonstrates best practices for testing them thoroughly.

AI generated by SDK Consistency Review Agent

SteveSandersonMS · 2026-02-13T18:31:50Z

nodejs/test/e2e/client_lifecycle.test.ts

+            await session.sendAndWait({ prompt: "Say hello" });
+
+            // Wait for session data to flush to disk
+            await new Promise((r) => setTimeout(r, 500));


Can you make sure none of the tests rely on timings like this? Should poll until the desired condition is met.

SteveSandersonMS · 2026-02-13T18:33:25Z

@friggeri This looks very useful! Eventually we should make sure we have the same e2e tests across the other languages as well, but I'm personally fine with this PR doing it for Node first and then getting Copilot to copy the same set of tests to other languages as a later follow-up.

github-actions · 2026-02-27T03:05:52Z

Cross-SDK Test Coverage Analysis

This PR adds excellent 11 new E2E test files to the Node.js SDK, significantly improving test coverage. However, this creates a test parity gap with the other SDK implementations (Python, Go, and .NET).

New Test Coverage Added (Node.js only)

The following test scenarios are now only tested in the Node.js SDK:

Test File	What It Tests	Available in Other SDKs?
builtin_tools.test.ts	Built-in tools (bash, view, create, edit, grep, glob)	✅ Yes - APIs exist
client_lifecycle.test.ts	`getLastSessionId()`, session lifecycle events	⚠️ Partial - `getLastSessionId` missing from Python & Go
compaction.test.ts	Session compaction behavior	✅ Yes - APIs exist
error_resilience.test.ts	Error handling (`abort()`, destroyed sessions, `resumeSession()` errors)	✅ Yes - APIs exist
event_fidelity.test.ts	Event emission order, event field validation	✅ Yes - same events
hooks_extended.test.ts	Extended hook scenarios (`onSessionStart`, `onErrorOccurred`)	✅ Yes - hooks exist
multi_turn.test.ts	Multi-turn conversations with tool results	✅ Yes - APIs exist
session_config.test.ts	Session config (`workingDirectory`, etc.)	✅ Yes - APIs exist
session_lifecycle.test.ts	`listSessions()`, `deleteSession()`, `getMessages()`	✅ Yes - APIs exist
streaming_fidelity.test.ts	Streaming deltas, `streaming: true/false` behavior	✅ Yes - APIs exist
tool_results.test.ts	Structured tool result handling	✅ Yes - APIs exist

Key Findings

Most features are universal: Nearly all the functionality being tested exists in Python, Go, and .NET SDKs with equivalent APIs (accounting for language conventions like snake_case vs camelCase vs PascalCase).
API Parity Issue Identified:
- ❌ getLastSessionId() exists in Node.js and .NET but is missing from Python and Go
- This is tested in client_lifecycle.test.ts lines 20-22
Test coverage gap: While the features exist in all SDKs, the test coverage is now heavily weighted toward Node.js. Python, Go, and .NET don't have equivalent tests for:
- Built-in tool behavior validation
- Streaming delta event verification
- Error resilience scenarios
- Event ordering and field validation
- Session configuration edge cases
- Multi-turn tool interaction patterns

Recommendations

Option 1: Add equivalent tests to other SDKs (Preferred for feature parity)

Port these 11 test files to Python (python/e2e/), Go (go/internal/e2e/), and .NET (dotnet/test/)
Ensures consistent behavior verification across all SDK implementations
Uses existing shared snapshots in test/snapshots/ (which this PR also updates)

Option 2: Add missing APIs (For identified gaps)

Add get_last_session_id() to Python SDK (python/copilot/client.py)
Add GetLastSessionId() to Go SDK (go/client.go)

Option 3: Document this as Node.js-first testing (If intentional)

Update contributing docs to note that Node.js serves as the reference implementation for new test scenarios
Create tracking issues for porting tests to other SDKs

What This PR Does Well

✅ Comprehensive test coverage of critical scenarios
✅ Uses shared snapshot infrastructure (test/snapshots/)
✅ Tests universal SDK features that should work consistently
✅ Good test organization and clear test names

Suggested Next Steps

Since this is a test-only PR (no API changes), I recommend:

Merge this PR as-is to improve Node.js test coverage
Create follow-up issues to port these tests to Python, Go, and .NET
Consider adding getLastSessionId() to Python and Go for API parity

Would you like me to help create tracking issues for porting these tests to the other SDKs?

AI generated by SDK Consistency Review Agent

…romise leak - Add onPermissionRequest: approveAll to all createSession()/resumeSession() calls in new E2E test files (CI blocker) - Fix no-op assertion in client_lifecycle.test.ts (was testing lambda return) - Fix weak assertion in tool_results.test.ts (now checks 'service is down') - Fix resumeSession in error_resilience.test.ts (was missing required arg) - Fix hanging Promise leak in replayingCapiProxy.ts (return without await) - Add TODO comment on skipped Compaction test suite Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-02-27T03:19:42Z

Cross-SDK Consistency Review

I've reviewed PR #466 for cross-SDK consistency. This PR adds extensive E2E test coverage to the Node.js/TypeScript SDK.

✅ Positive: Strong existing test parity

Most of the scenarios being tested already have equivalent coverage in Python and .NET:

Streaming/delta events (Python ✅, .NET has test but currently skipped pending schema update)
Session lifecycle (list/delete/concurrent sessions) (Python ✅, .NET ✅)
Compaction (Python ✅, .NET ✅)
Hooks (Python ✅, .NET ✅)

📝 Consideration: Some new test scenarios

The following test files appear to add more granular or new test coverage that may not have direct equivalents in other SDKs:

builtin_tools.test.ts - Tests individual built-in tools (bash, view, edit, create_file, grep, glob) with specific scenarios like exit codes, stderr, line ranges, nonexistent files
event_fidelity.test.ts - Validates event ordering, field presence, tool execution event structure
error_resilience.test.ts - Tests error handling: sending to destroyed sessions, double abort, resuming non-existent sessions
multi_turn.test.ts - Tests context preservation across turns, file creation then reading
tool_results.test.ts - (Need to review content to assess)
session_config.test.ts - (Need to review content to assess)

🤔 Question for consideration

Would it be valuable to add equivalent tests to the Python, Go, and .NET SDKs for these more granular scenarios? This would help ensure:

All SDKs have comprehensive test coverage
Behavioral consistency is verified across languages
Regressions are caught consistently

Alternatively, if these tests are primarily validating CLI behavior (not SDK-specific logic), they may only need to exist in one SDK's test suite.

Note on .NET

The .NET SDK has a streaming delta test that's currently skipped (SessionTests.cs:237) pending a schema update for AssistantMessageDeltaEvent. This might be worth addressing.

Overall assessment: This PR maintains consistency well—it's adding tests that either already have equivalents in other SDKs, or are testing new scenarios that could potentially benefit all SDKs.

AI generated by SDK Consistency Review Agent

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-02-27T03:29:41Z

SDK Consistency Review - PR #466

This PR adds comprehensive e2e test coverage for the Node.js/TypeScript SDK. While this significantly improves Node.js test quality, it creates test coverage gaps across SDKs.

✅ Good: Tests for existing features

The following test categories cover features that already have tests in Python, Go, and .NET:

compaction - All SDKs have compaction tests
session - All SDKs have session tests
hooks_extended - All SDKs have basic hooks tests
builtin_tools, tool_results - All SDKs have tool tests

⚠️ Test Coverage Gaps Created

The following new test categories in this PR have no equivalents in Python, Go, or .NET SDKs:

1. error_resilience.test.ts (New)

Tests error handling scenarios:

Sending to destroyed sessions
Getting messages from destroyed sessions
Double abort handling
Invalid session operations

Recommendation: Consider adding equivalent error resilience tests to Python (python/e2e/test_error_resilience.py), Go (go/internal/e2e/error_resilience_test.go), and .NET (dotnet/test/ErrorResilienceTests.cs).

2. event_fidelity.test.ts (New)

Tests event ordering and structure:

Events emitted in correct order for tool-using conversations
Message IDs on assistant messages
Tool execution events with correct fields
Valid fields on all events

Recommendation: Python, Go, and .NET should have similar tests to ensure event contracts are consistent across SDKs.

3. streaming_fidelity.test.ts (Enhanced)

While all SDKs have basic streaming tests, this PR adds comprehensive streaming validation:

Delta events produced when streaming enabled
No deltas when streaming disabled
Streaming behavior differences

Status: Python/Go have basic streaming tests. .NET has a streaming test that's currently skipped due to schema issues (see dotnet/test/SessionTests.cs line ~120).

Recommendation: Enhance streaming tests in Python/Go to match this fidelity, and enable the .NET streaming test once schema is updated.

4. multi_turn.test.ts (New)

Dedicated multi-turn conversation tests.

Note: Python and Go have test_should_have_stateful_conversation in their session tests, but no dedicated multi-turn test file.

Recommendation: Consider whether multi-turn deserves dedicated test files in other SDKs.

5. session_lifecycle.test.ts & session_config.test.ts (New)

Dedicated tests for session lifecycle and configuration.

Recommendation: These may be appropriate as separate test files in other SDKs for better organization.

📋 Suggested Follow-up

To maintain cross-SDK test parity, consider:

Short term: File an issue to track adding equivalent error_resilience and event_fidelity tests to Python, Go, and .NET
Medium term: Enhance streaming tests in Python/Go, fix .NET streaming schema issue
Long term: Consider a test coverage matrix document that tracks which test scenarios exist in which SDKs

✅ Approval Status

This PR is good to merge as it improves Node.js test coverage. The consistency gaps are a backlog item, not a blocker for this PR.

Great work on the comprehensive test additions! 🎉

AI generated by SDK Consistency Review Agent

…pshot match) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

On Windows, glob returns files in alphabetical order (app.ts before index.ts) while macOS/Linux return them differently. Add a second conversation variant to handle both orderings. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-02-27T03:39:48Z

Cross-SDK Consistency Review ✅

I've reviewed PR #466 for cross-language SDK consistency. This PR adds comprehensive E2E test coverage for the Node.js/TypeScript SDK without introducing new features or modifying any SDK client code.

Summary

✅ No consistency issues detected. This PR only adds test files and test infrastructure for Node.js. It does not introduce new SDK features or modify public APIs.

Test Coverage Analysis

The PR adds the following new Node.js E2E tests:

builtin_tools.test.ts - Tests bash, view, edit, create_file, grep, glob tools
streaming_fidelity.test.ts - Tests streaming delta events
event_fidelity.test.ts - Tests event ordering and structure
session_lifecycle.test.ts - Tests listSessions(), deleteSession()
client_lifecycle.test.ts - Tests getLastSessionId(), session lifecycle events
tool_results.test.ts - Tests ToolResultObject handling
error_resilience.test.ts - Tests error handling edge cases
multi_turn.test.ts - Tests multi-turn conversations
session_config.test.ts - Tests session configuration options
hooks_extended.test.ts - Tests extended hook callbacks
session.test.ts, compaction.test.ts - Updates to existing tests

Cross-SDK Test Parity Check

I verified that the functionality being tested already exists and is tested in the other SDKs:

Test Category	Node.js	Python	Go	.NET
Streaming delta events	✅ (new)	✅ `test_session.py`	✅ `session_test.go`	✅ `SessionTests.cs`
Session lifecycle (list/delete)	✅ (new)	✅ `test_session.py`	✅ `session_test.go`	✅ `SessionTests.cs`
Event fidelity & ordering	✅ (new)	✅ (scattered across tests)	✅ (scattered across tests)	✅ (scattered across tests)
Error resilience	✅ (new)	✅ (scattered across tests)	✅ (scattered across tests)	✅ (scattered across tests)
Multi-turn conversations	✅ (new)	✅ `test_session.py`	✅ `session_test.go`	✅ `SessionTests.cs`
Tool execution hooks	✅ (extended)	✅ `test_hooks.py`	✅ `hooks_test.go`	✅ `HooksTests.cs`

Notes

Builtin tools tests: The new builtin_tools.test.ts tests CLI-provided tools (bash, view, edit, grep, glob). These are not SDK-specific features but rather test the underlying CLI tool integration. Similar focused tests don't exist in other SDKs, but this is acceptable as these tests verify CLI behavior rather than SDK API surface.
Test organization: Node.js is now organizing E2E tests more granularly (one file per feature area), while Python, Go, and .NET have consolidated test files. This is a testing style difference and doesn't affect functionality.
Extended hooks: The hooks_extended.test.ts tests onSessionStart, onUserPromptSubmitted, onSessionEnd, and onErrorOccurred hooks. These hooks appear to be available in all SDKs but may not be explicitly tested in separate files in Python/Go/.NET.

Recommendation

✅ Approve. This PR improves Node.js test coverage to match or exceed the test coverage in other SDKs. No SDK consistency issues were found.

AI generated by SDK Consistency Review Agent

More e2e nodejs tests

779fe12

friggeri requested a review from a team as a code owner February 13, 2026 17:37

Copilot AI review requested due to automatic review settings February 13, 2026 17:37

Copilot started reviewing on behalf of friggeri February 13, 2026 17:37 View session

fix lint/format

5f9c460

Copilot AI reviewed Feb 13, 2026

View reviewed changes

SteveSandersonMS reviewed Feb 13, 2026

View reviewed changes

Merge branch 'main' into frg/more-e2e-tests

2ee58d5

patniko and others added 2 commits February 26, 2026 19:13

Fix prettier formatting

4e20af1

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

fix: fix glob snapshot mismatch and session cleanup in e2e tests

7adc04e

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

patniko and others added 2 commits February 26, 2026 19:30

fix: skip bash stderr test on Windows (PowerShell wrapping breaks sna…

0f4fc97

…pshot match) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

patniko approved these changes Feb 27, 2026

View reviewed changes

patniko added this pull request to the merge queue Feb 27, 2026

Merged via the queue into main with commit 8e61e9b Feb 27, 2026
25 checks passed

patniko deleted the frg/more-e2e-tests branch February 27, 2026 03:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

More e2e nodejs tests#466

More e2e nodejs tests#466
patniko merged 8 commits intomainfrom
frg/more-e2e-tests

friggeri commented Feb 13, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Feb 13, 2026

Uh oh!

Copilot AI Feb 13, 2026

Uh oh!

Copilot AI Feb 13, 2026

Uh oh!

Copilot AI Feb 13, 2026

Uh oh!

github-actions bot commented Feb 13, 2026

Uh oh!

SteveSandersonMS Feb 13, 2026

Uh oh!

SteveSandersonMS commented Feb 13, 2026

Uh oh!

github-actions bot commented Feb 27, 2026

Uh oh!

github-actions bot commented Feb 27, 2026

Uh oh!

github-actions bot commented Feb 27, 2026

Uh oh!

Uh oh!

github-actions bot commented Feb 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

	expect(assistantMessage?.data.content).toBeTruthy();
	const failureContent = assistantMessage?.data.content ?? "";
	expect(failureContent).toMatch(/service is down/i);

	expect(() => lastSessionId).not.toThrow();
	expect(lastSessionId === undefined \|\| typeof lastSessionId === "string").toBe(true);

	describe.skip("Compaction", async () => {
	describe("Compaction", async () => {

Conversation

friggeri commented Feb 13, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Feb 13, 2026

Cross-SDK Consistency Review ✅

📊 Test Coverage Gap (Not a Consistency Issue)

🔍 One API Inconsistency Found

✅ Recommendation

Uh oh!

SteveSandersonMS Feb 13, 2026

Choose a reason for hiding this comment

Uh oh!

SteveSandersonMS commented Feb 13, 2026

Uh oh!

github-actions bot commented Feb 27, 2026

Cross-SDK Test Coverage Analysis

New Test Coverage Added (Node.js only)

Key Findings

Recommendations

What This PR Does Well

Suggested Next Steps

Uh oh!

github-actions bot commented Feb 27, 2026

Cross-SDK Consistency Review

✅ Positive: Strong existing test parity

📝 Consideration: Some new test scenarios

🤔 Question for consideration

Note on .NET

Uh oh!

github-actions bot commented Feb 27, 2026

SDK Consistency Review - PR #466

✅ Good: Tests for existing features

⚠️ Test Coverage Gaps Created

1. error_resilience.test.ts (New)

2. event_fidelity.test.ts (New)

3. streaming_fidelity.test.ts (Enhanced)

4. multi_turn.test.ts (New)

5. session_lifecycle.test.ts & session_config.test.ts (New)

📋 Suggested Follow-up

✅ Approval Status

Uh oh!

Uh oh!

github-actions bot commented Feb 27, 2026

Cross-SDK Consistency Review ✅

Summary

Test Coverage Analysis

Cross-SDK Test Parity Check

Notes

Recommendation

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants