fix: save trace as agent run suspends and resumes by mathurk · Pull Request #1350 · UiPath/uipath-python

mathurk · 2026-02-19T21:38:03Z

The Problem

When running evaluations locally on agents that use sub-agent tools (like "Backwards-String-Generator"), the evaluation process involves a suspend/resume cycle:

Initial run: The agent calls a sub-agent tool, which creates a remote job.
Resume run: A second CLI invocation picks up where the first left off, gets the job result, and completes the agent execution.

The trajectory evaluator (which grades how well the agent followed expected steps by examining the trace) was scoring 0 for these agents, even though the agents were running as expected.

Root Cause 1: Spans lost across process boundary

OpenTelemetry spans are stored in memory by the ExecutionSpanExporter. When the first process suspends and exits, all 32+ spans from that run are lost when self.span_exporter.clear(execution_id) runs. The second process starts fresh with zero spans, so the trajectory evaluator only sees the resume-phase spans and has no record of the agent's actual work.

Fix: We added span persistence to SQLite. On suspend, all collected spans are serialized to JSON and saved to the existing __uipath/state.db database (which was already used for storing resume triggers). On resume, the saved spans are loaded back and prepended to the new spans. This required writing _serialize_span() and _deserialize_span() helper functions to convert OpenTelemetry ReadableSpan objects to/from JSON-compatible dicts.
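For readers who want a picture of the save-on-suspend / load-on-resume flow, here is a minimal sketch. It is not the code in this PR: the table name eval_spans, the functions save_spans() and load_spans(), and the dict shape of a serialized span are all assumptions made for illustration, and an in-memory database stands in for __uipath/state.db.

```python
import json
import sqlite3
from typing import Any

# Hypothetical table name; the real schema in state.db may differ.
_CREATE = (
    "CREATE TABLE IF NOT EXISTS eval_spans ("
    "execution_id TEXT PRIMARY KEY, spans_json TEXT NOT NULL)"
)


def save_spans(
    conn: sqlite3.Connection, execution_id: str, spans: list[dict[str, Any]]
) -> None:
    """On suspend: store already-serialized spans as one JSON blob."""
    conn.execute(_CREATE)
    conn.execute(
        "INSERT OR REPLACE INTO eval_spans (execution_id, spans_json) VALUES (?, ?)",
        (execution_id, json.dumps(spans)),
    )
    conn.commit()


def load_spans(conn: sqlite3.Connection, execution_id: str) -> list[dict[str, Any]]:
    """On resume: return the saved spans, or an empty list if none exist."""
    conn.execute(_CREATE)
    row = conn.execute(
        "SELECT spans_json FROM eval_spans WHERE execution_id = ?",
        (execution_id,),
    ).fetchone()
    return json.loads(row[0]) if row else []


# First process: suspend and persist what was collected so far.
conn = sqlite3.connect(":memory:")  # stand-in for __uipath/state.db
first_run = [{"name": "tool_call", "span_id": "0x01", "attributes": {"execution.id": "e1"}}]
save_spans(conn, "e1", first_run)

# Second process: prepend the saved spans to the resume-phase spans.
resume_run = [{"name": "tool_result", "span_id": "0x02", "attributes": {"execution.id": "e1"}}]
all_spans = load_spans(conn, "e1") + resume_run
```

Storing one JSON blob per execution keeps the sketch short; a real implementation would also need to convert ReadableSpan fields (context, timestamps, status, events) into JSON-compatible values, which is the job of the _serialize_span() and _deserialize_span() helpers mentioned above.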

Result: Trajectory evaluator went from 0 → 50.

Root Cause 2: Resume-phase spans invisible to evaluator (exec_id=None)

Even with span persistence working, the trajectory evaluator scored 50 instead of 100. It could see the tool call (from saved first-run spans) but not the successful tool result (from resume-phase spans).

Every span needs an execution.id attribute to be collected by the ExecutionSpanExporter. This ID is propagated from parent spans to child spans by UiPathExecutionTraceProcessorMixin.on_start() in the tracing infrastructure. However, this propagation requires parent_span.is_recording() to return True.

On resume, the "Evaluation" span is restored as a NonRecordingSpan. NonRecordingSpan.is_recording() returns False, so execution.id propagation breaks at this boundary. Resume-phase spans never get execution.id, so the exporter silently drops them.

The tracing infrastructure code (uipath.core.tracing.processors) is in a separate installed package we can't modify. But the eval-specific ExecutionSpanProcessor (which extends it) is in our code.

Fix: Added a fallback in ExecutionSpanProcessor.on_start(): after the parent's propagation attempt, if execution.id is still missing, read it from the execution_id_context ContextVar (which is already set before the runtime executes). This ensures all spans during eval execution get tagged correctly, regardless of whether the parent span is recording.
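The fallback can be illustrated without OpenTelemetry installed. The sketch below uses stand-in classes: StubSpan, propagate_from_parent(), and on_start_with_fallback() are hypothetical names, and the real on_start() signature in ExecutionSpanProcessor is not reproduced here. Only the execution_id_context name and the "execution.id" attribute key come from the description above.

```python
from contextvars import ContextVar
from typing import Any, Optional

# Same name as the ContextVar described above; this definition is a stand-in.
execution_id_context: ContextVar[Optional[str]] = ContextVar(
    "execution_id_context", default=None
)


class StubSpan:
    """Minimal stand-in for a span: attributes plus a recording flag."""

    def __init__(self, recording: bool = True) -> None:
        self.attributes: dict[str, Any] = {}
        self._recording = recording

    def is_recording(self) -> bool:
        return self._recording

    def set_attribute(self, key: str, value: Any) -> None:
        self.attributes[key] = value


def propagate_from_parent(span: StubSpan, parent: Optional[StubSpan]) -> None:
    """Mimics the base behaviour: copy execution.id only from a recording parent."""
    if parent is not None and parent.is_recording():
        exec_id = parent.attributes.get("execution.id")
        if exec_id is not None:
            span.set_attribute("execution.id", exec_id)


def on_start_with_fallback(span: StubSpan, parent: Optional[StubSpan]) -> None:
    """Try parent propagation first, then fall back to the ContextVar."""
    propagate_from_parent(span, parent)
    if "execution.id" not in span.attributes:
        exec_id = execution_id_context.get()
        if exec_id is not None:
            span.set_attribute("execution.id", exec_id)


# Resume scenario: the restored parent is non-recording, so propagation fails,
# but the ContextVar set before the runtime executes still tags the child.
token = execution_id_context.set("eval-123")
restored_parent = StubSpan(recording=False)
child = StubSpan()
on_start_with_fallback(child, restored_parent)
execution_id_context.reset(token)
```

Because the fallback only runs when the attribute is still missing, spans that already received execution.id from a recording parent are left unchanged.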

Result: Trajectory evaluator went from 50 → 100.

Development Package

Use uipath pack --nolock to get the latest dev build from this PR (requires version range).
Add this package as a dependency in your pyproject.toml:

[project]
dependencies = [
  # Exact version:
  "uipath==2.8.47.dev1013504954",

  # Any version from PR
  "uipath>=2.8.47.dev1013500000,<2.8.47.dev1013510000"
]

[[tool.uv.index]]
name = "testpypi"
url = "https://test.pypi.org/simple/"
publish-url = "https://test.pypi.org/legacy/"
explicit = true

[tool.uv.sources]
uipath = { index = "testpypi" }

[tool.uv]
override-dependencies = [
    "uipath>=2.8.47.dev1013500000,<2.8.47.dev1013510000",
]

Chibionos

need tests

Chibionos · 2026-02-20T08:13:59Z

src/uipath/_cli/_evals/_runtime.py

            self._log_handlers.clear()


+def _serialize_span(span: ReadableSpan) -> dict[str, Any]:


move these to helpers, keep the file clean and add tests for all code added here.

Chibionos · 2026-02-20T18:51:20Z

src/uipath/_cli/_evals/_runtime.py

+            if self.context.resume:
+                saved_spans = await self._load_execution_spans(eval_item.id)
+                if saved_spans:
+                    spans = saved_spans + spans


check if there are duplicates
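One possible way to guard against duplicates when merging, shown as a sketch only: it assumes each span is a dict carrying trace_id and span_id keys, and merge_spans() is a hypothetical helper name, not something in this PR. With real ReadableSpan objects the key would come from the span's context instead.

```python
from typing import Any


def merge_spans(
    saved: list[dict[str, Any]], new: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    """Prepend saved spans to new ones, keeping the first copy of each span."""
    seen: set[tuple[Any, Any]] = set()
    merged: list[dict[str, Any]] = []
    for span in saved + new:
        key = (span["trace_id"], span["span_id"])
        if key in seen:
            continue  # same span already present, skip the duplicate
        seen.add(key)
        merged.append(span)
    return merged


saved_spans = [{"trace_id": "t1", "span_id": "s1", "name": "tool_call"}]
new_spans = [
    {"trace_id": "t1", "span_id": "s1", "name": "tool_call"},  # duplicate
    {"trace_id": "t1", "span_id": "s2", "name": "tool_result"},
]
merged = merge_spans(saved_spans, new_spans)
```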

github-actions bot added the test:uipath-langchain (Triggers tests in the uipath-langchain-python repository) and test:uipath-llamaindex (Triggers tests in the uipath-llamaindex-python repository) labels Feb 19, 2026

mathurk added the build:dev (Create a dev build from the PR) label Feb 19, 2026

Chibionos requested changes Feb 20, 2026

Chibionos reviewed Feb 20, 2026

Chibionos approved these changes Feb 20, 2026

mathurk added 5 commits February 20, 2026 15:25

fix: save trace as agent run suspends and resumes

8030f81

chore: lint

cd82890

chore: logger cleanup

11fd243

fix: helpers and testing

39c16d5

fix: load and save as helpers

5842582

mathurk force-pushed the fix/suspend_resume_spans branch from f3b089c to 5842582 February 20, 2026 20:25

mathurk merged commit 18493f0 into main Feb 20, 2026
95 checks passed

mathurk deleted the fix/suspend_resume_spans branch February 20, 2026 20:46
